Tips For Mastering Voice Recognition On Your iPhone, Android, or Desktop

You don't need to talk like a robot or an English language professor to control your phone or computer, or to dictate messages to it. Here's how the speech-to-text software makers suggest you should speak.

Nobody talks on the phone anymore, but people are talking at their phones. Speech recognition is simply an expectation on smartphones these days, as well as in car navigation systems and your web browser. We’re all learning how to talk to machines, but we could be better at it. Here’s a pocket dictionary for dictating to your devices.

The Basics: Can It Hear You Now?

On any platform, check your settings and help guides to understand what your app or phone can actually do, since a speech app that completely misunderstands a request can be pretty infuriating. Android phones with Google’s Voice Search installed, for example, can make a phone call, compose email, send a text message, get directions, and pull up musical artists--but can’t launch applications from vocal commands. An iPhone, on its own, can only call contacts and play music when you hold down the home button, until you install the Dragon Dictation or Go app.

On any mobile phone, check that the microphone space is clear and free of obstructions. This isn’t so much an issue on iPhones, but Androids and other phones can have pinhole-style microphones that can get gummed up or partially covered by awkward cases.

Intermediate: Stop Slowing Down, But Think Before Speaking

You need not talk like a robot to be understood by one, says Vlad Sejnoha, chief technical officer at Nuance, maker of the Dragon speech-to-text software for Windows, Macs, iPhones and iPads. Dragon’s software learns speech styles and tics over time, and you want to aim for a natural speaking flow. Still, Sejnoha says, it helps to think out what you’re going to say before you say it.

“If you think about how we talk to friends, we make a lot of what we call disfluencies,” Sejnoha told Fast Company via email. “We slur our words, stop and start, interrupt ourselves. Our software can deal with a certain amount of that, but the less there is, the better (the learning) is.”

Digging around Android forums and elsewhere, I found a good number of testimonies from speech-to-text enthusiasts who saw better results from simply speaking at a normal clip. Google representative Nadja Blagojevic offered much the same advice for the search giant’s voice product in both Android and its Chrome browser: “Speak naturally and clearly, but don’t strain to enunciate too much or speak slowly.”

Training yourself to not slow down when you’re also trying to improve elocution takes some time, as Sejnoha said, but it eventually becomes a groove.

“There’s an element to just … becoming accustomed to generating text while speaking,” he said. “Some become very good at it, as part of their job. The rest of us are accustomed to generating text with a keyboard, which is a stop-and-go process … When our users relax, they realize they can speak in a reasonable way and get their message out faster.”

Advanced Dictation: Punctuation and Personalization

Another embarrassing thing you can stop doing is holding your phone or desktop microphone directly in front of your mouth. Even in a crowded bar or with loud music, Sejnoha said, you “might be surprised at how well (Dragon) does.” If it's a noise just outside your window, or from across the office, your Windows or Mac system itself might offer some help. In your Mac’s System Preferences, there’s a whole range of Speech options, sure; but check in the Input section of the Sound options, and you’ll find a check box to “Use ambient noise reduction,” which Google’s Blagojevic recommends for using the Chrome browser’s speech function. Windows offers similar mic control methods in its own Control Panel.

Wind, however, is a more problematic kind of noise. The best noise cancellation tools put in cars and hearing aids use multiple microphones to pinpoint the speaker and amplify their input, but your phone isn’t quite as refined an audio tool. If it’s a windy day, you might just have to type out what you want, provided you’re not driving.

Once you’re good at getting your words across, you’ll want to nail down punctuation, and maybe even emoticons--how else will you wield the passive-aggressive put-downs you’ve refined in years of typing? Punctuation is no different from words, as Dragon, Google, and most good voice recognition software will train itself on how you say "period," "comma," or even "smiley." Just be sure to actually go back and fix bad punctuation, as that’s often how the software learns.
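
For the curious, here is a toy illustration of the end result: spoken words like "period" and "comma" turning into symbols. It is only a sketch under loud assumptions. The word list and the function name `apply_spoken_punctuation` are made up for this example, and real products such as Dragon or Google's recognizer learn from your voice rather than from a fixed lookup table.

```python
# Hypothetical lookup table; real recognizers learn these from your speech.
SPOKEN_PUNCTUATION = {
    "period": ".",
    "comma": ",",
    "smiley": ":-)",
}

# Symbols that should attach to the word before them (no space in between).
ATTACHING = {".", ","}


def apply_spoken_punctuation(transcript: str) -> str:
    """Replace spoken punctuation words in a transcript with their symbols."""
    out = []
    for word in transcript.split():
        symbol = SPOKEN_PUNCTUATION.get(word.lower())
        if symbol is None:
            out.append(word)      # ordinary word, keep as-is
        elif symbol in ATTACHING and out:
            out[-1] += symbol     # glue "." or "," onto the previous word
        else:
            out.append(symbol)    # emoticons stay as separate tokens
    return " ".join(out)


print(apply_spoken_punctuation("see you at noon comma bring coffee period smiley"))
```

A fixed table like this would also mangle a sentence that legitimately uses the word "period," which is one reason the real software leans on context and on your corrections.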

Finally, if you’re using an Android phone, be sure that you’ve enabled Personalized Voice Recognition. That allows Google to keep key recordings of how you say things, and how you correct Google’s transcription, on Google’s servers. It benefits your speech-to-text on Android, in your browser, and, ultimately, in your self-driving car.

[Image: Flickr user rossbreadmore]

3 Comments

  • Lisalinn123

    Thank you for this informative article. I can't figure out how to enable Personalized Voice Recognition. Any hints?

  • Thingumybob

    What voice iteration/dictation boils down to is "thinking before you open your mouth". I am old enough to have used dictating machines (remember mini-cassettes?), which was a process that did require you to plan before you spoke, even if the secretary was able to use her initiative to clean up the spoken word. I reckon it was a [now defunct] technique that even today assists me in reducing a tendency to speak drivel!