Dimitri Kanevsky has been researching the use of speech-recognition technology as an aid for people who are deaf or hard of hearing for more than three decades—for many years at IBM, and, since 2014, at Google. So when I met him at a recent Google event focused on the company’s accessibility efforts, I asked whether, back in the 1980s, he ever dreamed that the technology would get as good at understanding spoken words as it is in 2019.
“No, I expected it would become so good in five years,” he responded. “And after five years, in the next five years. And then the next five years.” It was only when he joined Google that the tech achieved the accuracy that he once thought would come fairly quickly.
If speech recognition hadn’t become really good, we might not have been able to talk about it so freely: Kanevsky has been deaf since the age of one. But as we discussed his work, he glanced at his phone, which gave him a transcription of what I was saying. It was so instantaneous that he began answering my questions as I was finishing asking them, as he might have if he—rather than his phone—were hearing my speech.
As you might have figured out by now, Kanevsky was using an Android app that he had a hand in developing. Field-tested at Gallaudet University, the famed school for deaf and hard-of-hearing people, the app is called Live Transcribe. It works with more than 70 languages and dialects and will be available through Google’s Play Store—at first as a limited beta—as well as preinstalled on Pixel 3 phones.
Live Transcribe aims to put real-time transcription in the pockets of people who need it, so that it’s available anytime and anywhere. That stands in contrast to old-school remote transcription services performed by human specialists, which require advance scheduling and carry substantial hourly fees—not a huge problem for something like a business meeting, but an obstacle if you just want to ask a question of a coworker, chat with a friend, or play with a grandchild.
Anyone who’s used Siri, Alexa, or the Google Assistant knows that computers have gotten dramatically better at accurately understanding speech in recent years. But there’s a big difference between understanding spoken commands and the sort of full-blown recognition that Live Transcribe does, where the goal is to correctly interpret anything one person might say to another. At Google’s event—where the company left Live Transcribe running, transcribing the presenters as they spoke—the app didn’t get every word right. But it was mostly spot on, and its mistakes were usually nitpicky stuff rather than glaring blunders. It also uses AI to grasp the context of phrases—so it knows the difference between “New Jersey” (the state) and “new jersey” (the shirt you just bought).
Live Transcribe doesn’t have many features, and that’s kind of the point. Google considered adding more functionality—such as the ability to save transcripts, which would certainly make it handy for folks like journalists—but ultimately decided to focus on optimizing it for the sole purpose of assisting people who are deaf or hard of hearing. It uses haptic feedback to alert a user that someone has started talking, and allows those who can’t or don’t want to speak to participate in a conversation by typing. In the interest of privacy, it doesn’t store past transcriptions in the cloud.
Along with Live Transcribe, Google is releasing an app called Sound Amplifier, which was announced at last year’s Google I/O conference and is now arriving in the Play Store and preinstalled on Pixel 3 phones. Rather than turning speech into text, it’s designed to help people hear better in challenging situations, from restaurants to study halls to airport lounges.
“You’re at a dinner party with your friends, and the environment is kind of loud,” said Google software engineer Ricardo Garcia by way of example during the accessibility event. “And sometimes you can hear the person right next to you just fine, but it’s difficult to hear someone across the table.” Sound Amplifier picks up sound using your Android phone’s microphones, dynamically processes it to boost quiet sounds and remove background noise, and lets you listen to the results through a pair of wired headphones. You can also adjust the audio in a variety of ways, such as fine-tuning it separately for your left and right ears.
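To make the idea of “boosting quiet sounds and removing background noise” concrete, here’s a minimal sketch of frame-based dynamic audio processing in that spirit. All thresholds, gain values, and the function name are illustrative assumptions; Google hasn’t published Sound Amplifier’s actual signal chain, which is certainly far more sophisticated.

```python
import numpy as np

def amplify(samples, rate=16000, frame_ms=20,
            noise_floor=0.01, boost_db=12.0):
    """Toy dynamic processor: attenuate frames at or below a noise
    floor, boost quiet (but non-noise) frames, and pass loud frames
    through unchanged. Parameters are illustrative, not Google's."""
    frame_len = int(rate * frame_ms / 1000)
    out = np.asarray(samples, dtype=np.float64).copy()
    boost = 10 ** (boost_db / 20)              # dB -> linear gain
    for start in range(0, len(out), frame_len):
        frame = out[start:start + frame_len]   # view into `out`
        rms = np.sqrt(np.mean(frame ** 2))     # frame loudness
        if rms < noise_floor:                  # background noise:
            frame *= 0.1                       #   gate it down
        elif rms < 0.1:                        # quiet speech:
            frame *= boost                     #   bring it up
        # loud frames pass through unchanged
    return np.clip(out, -1.0, 1.0)             # avoid digital clipping
```

Per-ear tuning like the app offers could be sketched by running a function like this over the left and right channels separately, each with its own gain settings.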
What Sound Amplifier does reminds me a bit of Here One, the earbuds from short-lived startup Doppler Labs that aimed to intelligently filter out audio distractions and let you tweak their settings to your personal preferences and specific environments. But while Doppler’s idea was to build the technology into a sleek, AirPod-esque $300 set of wireless buds, Google wants to make it accessible to anyone with a pair of headphones and a phone capable of running Android Pie.
Along with other journalists at Google’s event, I tried out Sound Amplifier by listening to it through a box that simulated hearing loss. But the day may well come when I don’t need a special box to gauge its effectiveness: One in three Americans over the age of 65 has hearing loss. Worldwide, says the World Health Organization, 466 million people are deaf or hard of hearing, a figure the WHO expects to grow to 900 million by 2050.
Google is fond of emphasizing that it likes to build things to reach large swaths of humanity—and rather than catering to a niche, these two new apps have a potential audience that’s large and only growing larger.