Facebook knows that imagery is one of the life forces sustaining the 1.86 billion people who regularly use the world’s largest communication tool, with billions upon billions of photos of babies, pets, vacations, and the like shared every year. And that’s why it’s vital for the company to figure out ways to surface the most relevant imagery when users scroll through their news feeds or search for things their friends or loved ones have shared.
Today, Facebook announced a series of artificial intelligence innovations that it thinks will improve users’ experience: technological breakthroughs that enable its AI systems to understand imagery at the pixel level.
The biggest benefits of the new AI work are twofold.
First is a set of a dozen new action classifications that can be used to describe what people are doing in photos to visually impaired users in a way that wasn’t possible before.
Second, the system could allow users to find photos shared by their friends or family members based on keywords even when those photos haven’t been tagged or annotated with any kind of text.
AI is a major effort for Facebook, something the company sees as a key to delivering the most relevant content across many, if not all, of its major services. It seeks to dominate in AI and machine learning, much as it does in social networking and instant messaging, and has assembled teams totaling more than 150 people devoted solely to the field. Facebook has also tripled its investment in processing power for AI and machine learning research in recent years, though it won’t say how much that investment is.
Naturally, Facebook isn’t alone in these efforts. Every major tech company is investing heavily in AI, as the technology is seen as the basis of the next era of computing. “It is the most important computing development in the last 20 years,” Jen-Hsun Huang, the CEO of Nvidia, told Fast Company last year, “and Facebook and others are going to have to race to make sure that AI’s a core competency.”
In a blog post about the new developments, Joaquin Candela, the head of Facebook’s Applied Machine Learning group, noted that online search, even for images, has traditionally required parsing text, and that images could generally only be found if they had been properly tagged or captioned.
“That’s changing because we’ve pushed computer vision to the next stage with the goal of understanding images at the pixel level,” Candela wrote. “This helps our systems do things like recognize what’s in an image, what type of scene it is, if it’s a well-known landmark, and so on. This, in turn, helps us better describe photos for the visually impaired and provide better search results for posts with images and videos.”
Facebook has focused on accessibility since 2011, aiming to improve the way visually or hearing-impaired users interact with the service. In 2015, the company began using AI to enrich blind users’ experiences. It built algorithms that could automatically translate certain photos and videos into spoken words, giving the visually impaired context about objects in posts that they’d never had before.
With that system, a screen reader could tell a user that an image of a sunset contained things like nature, outdoor, clouds, grass, a horizon, plants, or trees.
But now, the new image classification system could add actions to the context it can provide, things like “people walking,” “people riding horses,” “people dancing,” “people playing instruments,” and others.
The advance was made possible by building a machine-learning model, based on 130,000 hand-annotated photos of people, that can “seamlessly infer actions of people in photos,” Candela wrote.
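To make the idea concrete, here is a minimal sketch of how classifier output might be turned into a sentence for a screen reader. It is an illustration only, not Facebook’s implementation: the function name `describe_image`, the confidence threshold, the label cap, the example scores, and the wording of the output sentence are all assumptions.

```python
from typing import Dict


def describe_image(scores: Dict[str, float],
                   threshold: float = 0.8,
                   max_labels: int = 5) -> str:
    """Build a screen-reader sentence from hypothetical classifier scores.

    `scores` maps a concept or action label to a confidence in [0, 1].
    Only labels at or above `threshold` are spoken, most confident first.
    """
    confident = [(label, s) for label, s in scores.items() if s >= threshold]
    confident.sort(key=lambda pair: pair[1], reverse=True)
    labels = [label for label, _ in confident[:max_labels]]
    if not labels:
        return "No description available."
    return "Image may contain: " + ", ".join(labels) + "."


# Made-up scores for a single photo (not real model output).
example_scores = {
    "people riding horses": 0.93,
    "outdoor": 0.97,
    "grass": 0.85,
    "clouds": 0.40,
}
print(describe_image(example_scores))
```

In this sketch the threshold trades coverage for accuracy: a low-confidence label such as “clouds” is left out rather than read aloud as a guess.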
The second major innovation is a search system that uses image understanding to sift through masses of candidate photos and return the most relevant ones as quickly as possible. The upshot? It should be able to surface pictures of, say, a black shirt even when such images aren’t tagged.
This could be valuable for users who remember seeing a photo of someone or something but not where it came from, especially if the desired photo didn’t have tags or captions.
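One simple way to picture keyword search over untagged photos is an inverted index built from the concepts a model predicts for each image. The sketch below is a simplified illustration under stated assumptions, not a description of Facebook’s system: the photo IDs, concept names, scores, threshold, and the helper names `build_index` and `search` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_index(predictions: Dict[str, Dict[str, float]],
                threshold: float = 0.7) -> Dict[str, List[Tuple[float, str]]]:
    """Map each confidently predicted concept to (score, photo_id) pairs."""
    index = defaultdict(list)
    for photo_id, scores in predictions.items():
        for concept, score in scores.items():
            if score >= threshold:
                index[concept.lower()].append((score, photo_id))
    # Keep each posting list sorted so the most confident photos come first.
    for postings in index.values():
        postings.sort(reverse=True)
    return dict(index)


def search(index: Dict[str, List[Tuple[float, str]]], query: str) -> List[str]:
    """Return photo IDs whose predicted concepts match the query exactly."""
    return [photo_id for _, photo_id in index.get(query.strip().lower(), [])]


# Made-up model predictions; none of these photos carries a user-written tag.
predictions = {
    "photo_1": {"black shirt": 0.91, "outdoor": 0.88},
    "photo_2": {"black shirt": 0.75, "people dancing": 0.95},
    "photo_3": {"black shirt": 0.30, "sunset": 0.99},
}
index = build_index(predictions)
print(search(index, "Black shirt"))
```

Here the query matches only on what the model inferred from the pixels, which is why a photo with no caption can still be found. A production system would need far more than exact concept matching, such as ranking across many signals.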
Ultimately, while advances like these are impressive today, Facebook knows that it’s only in the early stages of applying AI to image search and accessibility.
“While these new developments are noteworthy,” Candela wrote, “we have a long and exciting road ahead and are just scratching the surface of what is possible.”