IBM's SAPIR: The Smartphone Image-Recognition App That Recognizes Everything

Ever looked at a vacation photo and wondered "where the heck did I take this?" IBM's got an image recognition system that should solve the problem--and it points to the future of smartphone tech.

The name of this new technology--SAPIR (Search in Audio-Visual Content Using Peer-to-peer Information Retrieval)--tells you pretty much everything you need to know about it, even if it makes no sense at first. Here's how it works: You send a file to the app, which whisks the data off and interrogates a vast collection of cloud-sourced content to find similar items that have already been identified. The clever bit is that you don't necessarily need to apply any sort of keyword tag to your file beforehand. SAPIR is, in theory, smart enough to do the recognition automatically, in much the same way that we humans recognize objects from audio and visual cues.
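For the curious, here's a rough sketch of the general idea behind that kind of tag-free lookup: reduce each file to a list of numbers (a "feature vector"), then find the already-identified items whose numbers sit closest to yours. To be clear, this is not IBM's code, and nothing here describes how SAPIR actually does it. The feature vectors, the landmark labels, and the function names `cosine_similarity` and `suggest_labels` are all made up for illustration, and a real system would extract its features from the image or audio itself and search a far larger, distributed index.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def suggest_labels(query, index, k=2):
    """Return the labels of the k indexed items most similar to the query."""
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [label for label, _ in ranked[:k]]


# Hypothetical index of already-identified items: (label, feature vector).
# The three-number vectors are invented placeholders, not real image features.
INDEX = [
    ("Plaza Mayor, Madrid", [0.9, 0.1, 0.3]),
    ("Royal Palace, Madrid", [0.2, 0.8, 0.5]),
    ("Eiffel Tower, Paris", [0.1, 0.2, 0.9]),
]

# An untagged "photo", again represented by an invented feature vector.
query = [0.85, 0.15, 0.35]
print(suggest_labels(query, INDEX, k=1))
```

The point of the sketch is that the query carries no keyword at all: the label comes entirely from whichever known items it most resembles.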

To get a feel for how it works, check out the video of the system being used in Madrid.

You'll have noticed from that demo that SAPIR is in its early stages, and that its user interface is a little clunky. But the implications for the future of the system are rather incredible. As well as still photos, SAPIR can recognize video imagery and audio. In theory you could play it some music and learn the title (much as smartphone apps such as Midomi let you do) and perhaps even identify more esoteric sounds, like a Ferrari engine.

The system probably won't work as a stand-alone app in the future--at least not in a real-time mode like the one demoed in the video. Augmented reality systems do a better job of providing real-time, location-based info, thanks to position sensors and digital compasses--but they rely on a database of tagged points of interest. Combining SAPIR with AR could lead to all sorts of amazing auto-tagging possibilities, making the data layer in an AR system much more information-rich. And the image and audio recognition needn't just be used for landmarks. Imagine you see a person wearing some clothes you like and want to buy: the system could potentially identify the vendor for you and, of course, use some AR tricks to navigate you to the store.

This really could be the incredible future for smartphones, and it's probably not as far away as you think. If you take an imaginative leap into a future of ubiquitous head-mounted display tech, then AR combined with a SAPIR-like system would form the head-up display for real life that we've seen in countless sci-fi movies.

