A Reverse Search Engine For Images That Could Tell You What You’re Looking At

Visipedia is building a searchable visual encyclopedia, starting with birds.

[Image: Birds in flight via Shutterstock]

Say you’re on a hike, and you see a strange flower. You want to know what it is, so you text a photo to your smartest green-thumbed friends. They have no idea. Unless you feel like poring over plant guides, you might be out of luck.

That’s a problem that could be solved soon, with improving computer vision technology and an assist from humans with actual expertise. And not just for plants, but for birds and beetles, furniture and food, and other things you might want to classify, too.

Computers today are getting better and better at “seeing.” At one end of the spectrum, as on Facebook, programs can use biometrics to pick out an individual from a barely distinguishable mass of faces. At the other end, software is getting better at distinguishing between objects that are very different, like knowing that an elephant is not a piano, and at identifying major landmarks, such as the Eiffel Tower. But the kind of problem described above, such as identifying a specific species of bird from all the others, is truly hard for any computer to solve. Unlike thousands of birding fanatics across the world, a computer doesn’t have the expert knowledge to parse these fine distinctions within a group.

Visipedia is a long-running project that aims to create a visual encyclopedia, where the search input is an image rather than a word or phrase. Like Wikipedia, its power lies in crowdsourcing from a small user base for the benefit of the masses. “It’s an instance of humans and machines working together. Although it has some sophisticated machine learning, Visipedia is nothing without an engaged community,” says Serge Belongie, who is currently a computer scientist at Cornell Tech.


At the beginning, Belongie and his colleagues decided to focus on birds as the first test case for Visipedia: there are lots of obsessed birders, and the variation in avian color, shape, and plumage posed a true challenge. They worked with Cornell University’s Lab of Ornithology to create a database of bird species, and then asked the wider birding community that follows the lab to help identify images and tag different body parts. Today they have over 1 million crowdsourced image annotations. Over time, as the algorithms improved, Visipedia went from identifying birds with 20% accuracy to as high as 85% accuracy today, a level approaching the success rate of a human expert. Each search result, which identifies the bird, comes with additional information about it, as in any encyclopedia.

At a computer vision conference in June, Belongie plans to announce that the project is ready to expand beyond the avian realm. He envisions Visipedia as being useful in all sorts of domains. Plants and animals come to mind, but so does an app for foodies or one for architecture fanatics or adventurous shoppers. Cancer pathologists could even use better image classification to group new tumors.

A similar project, from researchers at Columbia University and the University of Maryland, is taking a different approach by developing standalone apps, like LeafSnap (a leaf field guide), DogSnap, and BirdSnap, for different users.

The Visipedia researchers don’t intend to build every such app themselves. Instead, they want to create the backend program with an API so that any community or company can create its own desktop or mobile apps. Belongie plans to keep Visipedia a nonprofit, where the images and information stay open-source, but also to license the technology to corporate users who don’t want to comply with those terms.

“I imagine it much more as something that appears in the fabric of different apps and sites. Imagine you’re a frequent Wikipedia user. Let’s say you really like mushrooms. One day you’re using it, and there’s a little search box on the page for mushrooms and it allows you to search [your own] image of mushroom. There’s not a big announcement. There’s no unified roll-out. It’s just there,” Belongie says.

Belongie isn’t giving up on the initial community of bird enthusiasts that supported Visipedia. Cornell may integrate the technology into its Merlin Bird ID app soon. “We don’t want to be pigeon-holed as bird people, but they have been a great way to just build out the system,” he says.

About the author

Jessica Leber is a staff editor and writer for Fast Company's Co.Exist. Previously, she was a business reporter for MIT’s Technology Review and an environmental reporter at ClimateWire.
