Inside Facebook’s MemNets And The Drive To Bring Advanced AI To Everyday Tasks

Facebook opens up about its proposed personal digital helper.

Facebook has long said its core mission is to connect everyone in the world. Now it wants to use artificial intelligence to automatically help those connected billions with everyday tasks.

Although there’s no immediate product on the horizon, Facebook thinks it will one day be able to offer its entire user base a personal digital helper that could understand, and respond accurately to, complex queries.

For some time now, the social networking giant has been talking about its nascent AI efforts. This spring, at its annual F8 developers conference, for example, it showed off the ability to query a computer system about facts derived from long Lord of the Rings text passages.

That system, which adds a form of “short-term memory to the convolutional neural networks that power our deep-learning systems,” as Facebook CTO Mike Schroepfer wrote in a blog post today, was an early demonstration of what the company calls Memory Networks, or MemNets. The goal is to help computers “understand language more like a human would–with context, instead of rote ones and zeroes memorization like a machine.”
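The core idea behind MemNets can be sketched in a few lines. The toy below is not Facebook's implementation (real Memory Networks learn their retrieval and response steps with neural networks); it only shows the basic pattern: store each sentence of a story as a memory, then answer a question by retrieving the most relevant one, here scored with simple word overlap as a stand-in for the learned relevance function.

```python
# Toy illustration of the memory-network idea, not Facebook's code:
# each story sentence is kept as a "memory", and a question is answered
# by retrieving the memory that best matches it.

def answer(story_sentences, question):
    """Return the stored sentence that shares the most words with the question."""
    q_words = set(question.lower().rstrip("?").split())
    best, best_score = None, -1
    # Ties go to later memories, mimicking a recency bias in short-term memory.
    for sentence in story_sentences:
        score = len(q_words & set(sentence.lower().rstrip(".").split()))
        if score >= best_score:
            best, best_score = sentence, score
    return best

story = [
    "Frodo took the ring.",
    "Frodo went to Mount Doom.",
    "Sam followed Frodo.",
]
print(answer(story, "Where did Frodo travel to?"))  # → Frodo went to Mount Doom.
```

A learned model would replace the word-overlap score with a trained matching function, which is what lets MemNets handle paraphrase and context rather than literal repetition.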

All of this, of course, is in the service of Facebook’s hunger to give its huge user base the most relevant information. That’s even more true as it seeks to use its connectivity initiative to bring Internet service to billions of people in developing nations who currently can’t get online. As those people come online, Facebook expects many of them to join the service, putting more pressure on its systems and lending ever more urgency to building infrastructure capable of serving what could easily be more than 2 billion regular users.

One of the goals, Schroepfer explained during a press roundtable at Facebook’s new Frank Gehry-designed headquarters in Menlo Park, California, last week, is building tools that let users ask questions about the contents of photos. The first result of that project is what the company calls Visual Q&A (VQA), a system it hopes could help blind or low-vision users better understand what’s in the photos in their news feeds.


Already, as part of its Accessibility initiative, Facebook has been working to give blind users more context about photos and videos. It has developed a system designed to automatically interpret visual contents for those users by spelling out the basics of a photo.

For example, while a sighted person would have no trouble understanding the impact of a photo of a stunning sunset over San Francisco Bay, a blind person would get nothing out of it without additional context. With Facebook’s AI technology, however, that user’s screen reader–a tool that converts text to speech–could tell them this: “This image may contain ‘nature, outdoor, cloud, grass, horizon, plant, [or a] tree.'”
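The hedged wording of that description (“may contain”) suggests a confidence threshold applied to the classifier’s tags. The sketch below is a guess at the shape of such a step, with made-up tag names, scores, and threshold; it is not Facebook’s actual pipeline.

```python
# Hypothetical sketch: turning image-classifier tags into screen-reader
# alt text. Tags and confidence scores are invented for illustration;
# only tags at or above the threshold make it into the sentence.

def describe(tags, threshold=0.8):
    """Build a 'This image may contain ...' sentence from confident tags."""
    kept = [name for name, score in sorted(tags.items(), key=lambda kv: -kv[1])
            if score >= threshold]
    if not kept:
        return "No description available."
    return "This image may contain: " + ", ".join(kept) + "."

tags = {"nature": 0.97, "outdoor": 0.95, "cloud": 0.91, "cat": 0.30}
print(describe(tags))  # → This image may contain: nature, outdoor, cloud.
```

Thresholding is what keeps low-confidence guesses (the stray “cat” at 30%) out of what a blind user hears, at the cost of sometimes saying nothing at all.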

With VQA, Facebook aims to take a much more ambitious step forward. Though the system is not yet available in any kind of product form, Schroepfer showed a video at the press roundtable in which users asked a digital assistant questions about a photo, and the automated helper offered accurate responses based on its confidence in its interpretation of the photo’s contents. It does so, he said, with “an order of magnitude reduction” in the kind of brute-force training that image-recognition systems have required in the past.

“You can imagine tons of useful applications for Facebook,” Schroepfer said, acknowledging that the company needs to do a better job of understanding what users want to see in their news feeds. If someone doesn’t like getting pictures of coffee art sent to them, he explained, the system could filter out such photos in the future.

Or users could ask the system questions about a photo: whether there’s a baby in it, where the baby is, and what it’s doing. In a video demonstration Schroepfer showed, the assistant correctly responded that yes, a photo did include a baby, that the baby was in a bathroom, and that it was having its teeth brushed.

In order for the tool to be as powerful as possible, it has to be trained to learn more about how the “real world” works, he said.


For example, if you present the system with an image of a stack of off-balance blocks, it should be able to determine if the blocks will stand or fall. Right now, Schroepfer said, Facebook’s system is able to accurately predict the answer about 90% of the time.

A much more challenging problem is the game of Go. Computers have long been able to beat the world’s best humans at chess, but Go has proven far more difficult. That’s mainly, Schroepfer said, because Go involves an extremely complex branching structure, in which even a small number of moves can yield more than a hundred thousand possible outcomes.
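The scale of that branching problem shows up in rough arithmetic. Chess offers roughly 35 legal moves per turn and Go roughly 250 (commonly cited averages), so the Go game tree explodes after only a couple of moves, which is why brute-force search breaks down:

```python
# Back-of-the-envelope arithmetic behind the Go branching problem.
# Branching factors are approximate, commonly cited averages.

def positions_after(branching_factor, moves):
    """Rough count of game-tree leaves after a given number of moves."""
    return branching_factor ** moves

for game, b in [("chess", 35), ("Go", 250)]:
    print(game, [positions_after(b, n) for n in (1, 2, 3)])
# Go after two moves: 250**2 = 62,500 positions; after three, over 15 million.
```

Two moves of Go already exceed the hundred-thousand-outcome scale the article describes once you include nearby move counts, and three moves pass 15 million, against fewer than 43,000 for chess.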

The best human Go players succeed because of their ability to see visual patterns on the board. So Facebook set out to program its AI to similarly recognize those visual patterns on a Go board, and within just months, the AI has progressed to being able to beat some of the best amateur Go players in the world, Schroepfer boasted.

But making a better AI Go player clearly isn’t Facebook’s end game. Rather, Schroepfer suggested, the true promise of Facebook’s AI efforts may well be in providing every one of its users with an always-on personal digital assistant. That’s something few of us can afford today, he noted, but which Facebook hopes to be able to offer to the “entire planet” in the not-too-distant future.

Such a digital assistant tool–which Facebook is already testing in conjunction with its human-powered M project–would ideally be able to offer answers to a wide variety of queries based on language, vision, planning, or prediction, Schroepfer said.

The more questions people ask of M, he said, the more it will learn. Over time, it should be able to respond automatically, without human help, to questions ranging from what the weather is going to be like to getting help with birthday cards or changing flights.


“The reason this is exciting is because it’s scalable,” Schroepfer said. “We could deploy [the assistant] for the entire world.”

Facebook’s M system is already connected to its MemNets project, he said, which helps it learn as users ask questions and human helpers find, and plug in, the answers. Ultimately, Facebook wants to make the system fully automatic, a commitment it uses as a recruiting tool for the best artificial intelligence talent.

“The promise I made to all the AI folks that joined us,” Schroepfer said, “is that this is the best place to get your work to a billion people.”


About the author

Daniel Terdiman is a San Francisco-based technology journalist with nearly 20 years of experience. A veteran of CNET and VentureBeat, Daniel has also written for Wired, The New York Times, Time, and many other publications.