When Google revealed some improvements to Android, we were excited about how speech recognition and synthesis seemed to be buried deeply in its code—full of promise. For various reasons, including fragmentation, it's never quite emerged as a game-changer. So now Apple's taking up the torch, and if rumors prove true it's not just adding speech tech to the iPhone... it's transforming the device into something new again, starting a whole new paradigm. If it works for Apple, expect others to follow.
Get ready to meet your smart personal assistant.
We've wondered for a while about how Apple was going to mix Nuance speech recognition tech with its iOS devices, and how the technology it acquired when it bought Siri (the firm behind an artificially intelligent assistant) would emerge. Now, courtesy of a hot tip from 9to5Mac, we know that Apple's "Assistant" is going to combine all of this tech into one powerful system that runs throughout the upcoming iOS 5, with some capabilities limited to the upcoming iPhone 5.
Examples of the system include talking to your phone to set up an alarm or reminder, requesting GPS directions using voice alone, sending text messages—basic interactivity in other words, but such that it renders the keyboard practically redundant. This is the sort of system that Google promised in Android.
And now we're seeing that Assistant has a whole other level: It also interfaces with WolframAlpha, Stephen Wolfram's "fact computer" that can intelligently understand data-specific questions and return meaningful suggestions. This means you could, ostensibly, ask your iPhone how many shopping days remain until Christmas, where the International Space Station is in its orbit at that moment, or how many Internet users are in China, and get almost instantaneous data fed back to you from WA's computational systems. Though this sounds neat, relatively simple and perhaps handy, it's important to realize how significant a move this would be, because with billions of data points in WA and a smart voice interface, the iPhone becomes almost like a computer from a previously unseen future. Or maybe a science fiction story. As an example, check out the video of an interaction Arthur C. Clarke imagined between programmer Dr. Chandra and his SAL 9000 computer; the clip is on YouTube.
Sources inside Apple are also suggesting that the way Assistant is coded means it can handle an almost conversational chat—essentially answering back to clarify data in the way SAL does in the clip above, perhaps to check which number for a contact to send an SMS to, or to verify which street name you're asking GPS directions to.
We also think that voice controls are going to be integrated throughout the OS, so in theory you could ask your iPhone for a piece of data, then initiate a Skype call to a contact to discuss your thoughts, then compose an email to another contact... all just by talking. That would make your smartphone a genuine, semi-intelligent artificial personal assistant. As well as changing how you use the device as a typical consumer, this has all sorts of implications for business users, particularly those who are habitually on the road. And remember that RIM is currently facing massive criticism for being fuddy-duddy and slow to adopt new technology in its enterprise-facing BlackBerry devices, with their keyboard-centric design. Picture what this capability may do for enterprise sales of the iPad 3 in early 2012.
According to some technical thinking, many of these capabilities will be reserved for the upcoming iPhone 5, because it's said to have much more memory available to the processor (1GB of RAM instead of 512MB), which is essential for handling the audio processing and data shunting that this voice-centric system will demand. But other news says that Nuance, the firm behind the crowdsourced voice recognition tech that Apple's adopting, is opening up developer access to its core speech recognition systems for free, meaning that many apps may quickly integrate speech controls. This may be possible even on older iPhones, because Nuance's system relies on shooting audio samples off to the cloud to process.
Apple is making good on decades of thought and promise, from its Knowledge Navigator concept and even the too-early-to-market Newton device. Meanwhile, where Apple dots fresh footprints on a new technology beach, others follow the trail. Google's Android OS has, according to recent analysis, grown so rapidly that over the last three months in the U.S. 56% of smartphones activated were Android-powered, twice Apple's stable 28% share. Google is sure to follow Apple's lead into speech recognition to keep its devices competitive, and Android-partnered firms like Samsung (which has a broad research history, including robotics) already have varying degrees of speech recognition experience.
All of which means that soon your smartphone will be much smarter than you thought, and you'll interact with it in ways that you never dreamed possible, much more than merely using it to speak to someone over the phone lines. Isn't it about time we renamed the things? "Phone" is becoming anachronistic.
Update: It looks like Nuance is also adding support for Windows Phone 7 app developers, which underlines how much of an explosion in voice recognition is going to hit smartphones in the very near future.
[Image: Flickr user br1dotcom]