Shortly after the iPhone 4S went on sale this fall, its marquee feature, the Siri voice assistant, sparked a heated debate in the tech world: Was Siri a great computing interface, or was it the greatest? To many observers, Apple’s software, which combines uncannily accurate voice recognition with an artificial-intelligence engine that’s capable of understanding complex English-language requests, marked the realization of our sci-fi dreams. Soon we’d be controlling everything by voice–cars, fridges, toasters, and, crucially, our TVs. Both Google TV and Microsoft’s Kinect let people control their televisions by voice; according to The New York Times, Apple’s long-rumored entry into manufacturing TVs might rest on the same gimmick.
Lost in Siri’s coronation is any hint of how strange the world would be if we all took up chattering with our machines. Siri-like interfaces are limited to private use; sure, you can try dictating an email on the subway or an airplane, but remember to wear a mouth guard–and maybe even full body armor. Even issuing simple commands–“Remind me to buy Metamucil,” “Play more Katy Perry”–could incite embarrassment. And then there’s the question of utility: Why tell your TV to turn up the volume when you could just hit a button on the clicker?
Voice control certainly has a place in the future of computing. But the rush to crown it as the Next Great Interface, despite its obvious shortcomings, illustrates a void in our conception of the future. As mobile machines keep getting more powerful, we’re running into a fundamental technological hurdle: We need to find new, better ways of managing our phones and tablets in order to unleash their capabilities. That’s why we hail every interface invention, from multitouch to motion control to voice, as the next holy grail. As promising as each of these technologies may be, none completely solves the mobile interface dilemma. On our portable devices, it’s still too difficult to input large piles of text, to work with complex graphics, or to otherwise manipulate data as we’re used to on our desktops. “In many ways, this is the grand challenge of computing,” says Chris Harrison, an interface researcher at Carnegie Mellon’s Human-Computer Interaction Institute. “We won’t use mobile devices to their true potential until we break through this interface bottleneck.”
Harrison and his colleagues, not to mention dozens of researchers elsewhere in academia and industry, are constantly trying to crack this problem. Some ideas are incremental improvements on current interfaces. For instance, one of Harrison’s projects, called TapSense, would let your smartphone determine how you tapped its screen in order to allow for more complex interactions; use your knuckle instead of your fingertip and your phone might interpret the gesture as a right-click. Other projects take much greater flights of fancy. OmniTouch imagines a mobile computer that doesn’t have a screen. Instead, you carry around a wearable projector that would create a touch-screen interface anywhere. Need to calculate a tip at lunch? Project a calculator on the table and punch at the projected keys. While Harrison has won commercial backing for a few of his projects–he produced OmniTouch while working for Microsoft Research–none of his ideas is likely to solve the interface dilemma.
Why? Because no single interface will be the panacea. Until we somehow run our computers through thought–and, hey, let’s not rule that out!–we’re unlikely to settle on a single input method for all devices. Computers of the future, like those today, will likely shift gracefully among a host of interfaces: voice in the car, a touch screen on the plane, a projected screen when you’re collaborating with others, and probably some zany innovation to write your novel while you’re just walking down the street. What will that new interface look like? I can’t tell you–but trust me, don’t ask Siri. It doesn’t know either.