One of the marquee accessibility features coming to iOS 17 this fall is something Apple is calling Personal Voice. As I reported in May, Personal Voice blends poignancy with functionality by giving people—and their families—who cope with certain conditions, such as ALS, the ability to record their voices using randomized prompts (up to 15 minutes in duration) from the system. As I wrote, Personal Voice is clearly recognition by Apple—who worked with organizations like the Team Gleason Foundation in developing the feature—that those who will inevitably lose their voice needn’t remain voiceless. Moreover, from a technical perspective, the fact Personal Voice is made possible by the neural networks in Apple’s systems-on-a-chip means the company’s vaunted custom silicon chops have yet one more benefit: accessibility.
It’s quintessential Apple, the interplay of hardware and software.
Voice technology matters to the team at biometrics company Aware. Founded in 1993, the 30-year-old company describes itself on its website as having the ability to help customers “with the right balance of security and friction for an experience that delights your users while ensuring compliance and protecting your organization.” Aware’s technologies are trusted by innumerable institutions, including over 150 law enforcement agencies and more than 20 financial institutions. Aware boasts more than 60 partners spanning over 20 countries around the world.
In an interview conducted in late June, chief technology officer Dr. Mohamed Lazzouni explained Aware doesn’t interact with the end user, but its technologies certainly do. At a high level, Dr. Lazzouni told me identity is no longer confined to person-to-person interaction. As technology’s capabilities have expanded and its power and influence have risen, biometric authentication has become a popular (and accessible) way to make sure people are who they say they are. Whether by voice or face or touch, or some combination thereof, the omnipresent nature of tech means identification is nowhere near the strictly in-person affair it has been in the past. Things nowadays are decidedly more complex.
Dr. Lazzouni has dedicated his career to building simple solutions.
“You can imagine the moment that identity is no longer intermediated via human-to-human transactions, and the machine gets involved with in the middle. Things immediately become very complex, because they in some way form the processes that used to be fundamentally relied upon to be done by a human being looking at an [identification] document or talking to a physical human being or asking them a few questions to vet or assert the authenticity of their identity,” he said. “When that responsibility gets discharged to a computer and you have to rely on the human interaction with that computer to assert and certify the authenticity of that identity, that complexity now falls to software.”
In a disability context, Dr. Lazzouni told me the team at Aware “absolutely” considers an end user’s ability (or not) to interact with a particular modality, saying in part “what we do with our programs is we approach this requirement with multiple dimensions and multiple approach vectors.” Further, he reiterated Aware’s domain is not in building consumer-facing applications. Rather, the company builds toolkits, or engines as Dr. Lazzouni called them, for application designers to have what they need to build the consumer-facing software.
“Think of us [at Aware] as being the toolkit that has all of the elements inside the toolbox,” Dr. Lazzouni said of the scope of Aware’s work. “Depending on whoever is building the application for what purpose, they need to build that application, they can come and use the right fit to drop it into their application and make it customer-facing.”
Asked about Aware and voice technologies, Dr. Lazzouni said the company is working on bringing what he described as “liveness detection” to voice recognition software. Liveness detection is a process whereby an algorithm securely detects whether a biometric sample, like someone’s voice, originated from a fake representation or an actual human. Liveness detection, Dr. Lazzouni said, is a crucial element in preventing voice spoofing. He cited so-called “vishing” (voice phishing) being commonly used by cybercriminals to impersonate real people. Senior citizens often are victims of such attacks, with Dr. Lazzouni telling me someone can rush to the bank to withdraw money for a purportedly sick loved one when, in reality, they fell prey to an AI-generated scam that impersonated someone’s voice without obtaining prior consent.
“Today, many different types of voice cloning companies are launching, and as this technology becomes more mainstream and available, further abuses and misuses are surely to emerge,” Dr. Lazzouni said. “As these technologies expand, liveness detection provides the most certain safeguard that a voice is coming from a real person. Many organizations want to offer voice recognition as a form of authentication because it is fast, frictionless, and secure, but they’re going to need additional layers of validation.”
Dr. Lazzouni called voice cloning “an exciting new frontier” with the potential to unleash many benefits onto society, especially in medicine and for people with certain speech disabilities. Harkening back to Personal Voice on iOS, Dr. Lazzouni said it’s possible to create a synthesized voice that can assist those with speech delays to communicate in a voice uniquely their own. Likewise, a person with throat cancer who needs their larynx removed, Dr. Lazzouni told me, could conceivably have their voice cloned prior to the surgery in order to have a synthesized voice that sounds appreciably akin to their old self. For all the good, however, Dr. Lazzouni cautioned that society must be vigilant about disruptive technologies. Many potential issues, including ethical, legal, and privacy concerns, can become significant if left unchecked. “Organizations that have invested in voice recognition as a form of biometric authentication would be well-advised to take extra measures to guard against these threats,” he said. “For example, in addition to liveness detection, voice and face are complementary biometrics so they can be used very effectively together.”
Looking towards the future, Dr. Lazzouni expressed general optimism about, and appreciation of, technology’s capacity for genuine good. “Through the process of its ubiquitous presence, the more you put this [technology] in the hands of people, the more human capital, with its ingenuity and desire to advance the platforms, [will] move them to future horizons that we cannot even fathom today that are likely to become possible,” he said. “5 years and 10 years and 15 years from now, let alone 25 and 50 years from now, is a thrilling prospect. The learning rate of people is getting exacerbated by the day. The learning age and creativity of people is getting younger by the day. This is an amazing thing, to be part of the human race to know that this is a possibility that exists.”
Tech like Personal Voice definitely signals ingenuity and advancement.