Back in 1975, two young computer hackers named Steve Jobs and Steve Wozniak helped create a game called Breakout. Inspired by the popular paddle-and-ball game Pong, Breakout asked players to use a similar setup to smash bricks on a screen by bouncing a ball back and forth. While slightly derivative, the presence of the two future Apple cofounders at a critical stage in their careers underscores just how significant video games have been in the history of groundbreaking high-tech.
Forty years later, the same game is being used as part of another significant development in artificial intelligence. It’s also a sign of where large technology companies are taking what was formerly an obscure corner of academia.
In the London offices of DeepMind—an AI startup bought by Google for $400 million in early 2014—a computer controls the familiar Breakout paddle to send forth a volley of brick-smashing shots. It is the first time the computer has played the game, and it is not very good. By the 200th game, it is faring better—about as well as a good human player. By the 600th go-around, it has mastered the strategies of a seasoned player, and even manages to surprise DeepMind’s founder, Demis Hassabis.
“The idea is that these types of system are more humanlike in the way that they learn,” he says, referring to the game-playing software agent, which began with seven Atari 2600 games in an academic lab a few years ago. “We learn by experiencing the world around us, through our senses, and then our brains make models of the world that allow us to make decisions and plans about what to do. That’s exactly what we’re trying to design.”
Artificial intelligence is nothing new when it comes to video games, having first been used in basic checkers and chess games as far back as the 1950s. More recently, and famously, a computer called Deep Blue defeated champion Garry Kasparov at chess in 1997, while IBM’s Watson won the quiz show Jeopardy! in 2011.
DeepMind’s breakthrough is different, though. Unlike those examples, it hasn’t just mastered one game, but many: 49 different Atari video games, to be precise. While its performance on every title doesn’t match its Breakout mastery, the software demonstrates an ability to hone its skills over the course of multiple games. In other words, it learns—and in doing so hints at a holy grail of AI: general intelligence.
“This work is the first time that anyone has built a single general-learning system that can learn directly from experience to master a wide range of challenging tasks—in this case a set of Atari games—and perform at, or better than, a human,” says Hassabis. The next step is building bots that can maneuver in and win at 3-D games—and applying the lessons of this research to money-making applications like more-intelligent recommendations or autonomous vehicles.
A computer that spends hours mastering old video games might make it sound as though Google has succeeded in creating a slacker bot. But as Hassabis explains, the company is busy working out how the technology can be used to make its existing services—and certainly some new, as-yet-unannounced ones—better. “Our focus is on the core things that Google does, so search, phone assistant, machine translation, and things like that,” he says. “We’re looking at applying components of the research to the main Google systems.”
That would include a host of other applications too, from determining which ad to show you to which video to play next to, eventually, piloting your car home. Google is far from the only company to be interested in the field of deep learning, which has rapidly risen from an obscure subcategory of computer science to become one of the most-hyped fields in artificial intelligence. Microsoft, Twitter, Facebook, Apple, Baidu, and many others have fiercely competed for deep-learning researchers, of whom there are still relatively few. Before Google purchased DeepMind in 2014, Peter Norvig, a director of research at Google, told Technology Review that his company already employed “less than 50 percent but certainly more than 5 percent” of the world’s machine-learning experts; with its purchase of DeepMind, Google significantly deepened its AI bench, which already includes brains like Geoff Hinton, Sebastian Thrun, Fernando Pereira, and Ray Kurzweil.
In 2011, Stanford computer science professor Andrew Ng (now at Baidu) founded the Google Brain project, which proved capable of recognizing high-level concepts, such as cats, after watching YouTube videos—and without ever having been told what a “cat” is. Facebook, which employs machine-learning expert Yann LeCun, is using deep learning to better identify faces and objects in the millions of photos and videos uploaded to the social network every day. Recently a handful of Facebook scientists published a paper on what they call a “memory network” (imagine a neural network paired with a memory bank), with enormous implications for both machines’ ability to answer detailed questions and learn to carry out complex tasks, like language translation. Five days later, DeepMind released a paper on a similar approach it calls a “neural Turing machine”—a sign of the neck-and-neck race in which the two companies now find themselves.
Apple, for its part, has used AI to boost voice recognition in technologies like its Siri virtual assistant. And at Microsoft, the deep learning–based “Project Adam” envisions feats like being able to point your smartphone at a dog and have it immediately identify the exact breed.
And as more AI researchers head to Silicon Valley (or London), lured in part by sizable salaries, some have sounded an alarm for the integrity of academic work. “The barriers between Silicon Valley and academia are blurry and getting blurrier,” wrote two researchers, Sergey Feldman and Alex Rubinsteyn, after Mark Zuckerberg paid a visit to a machine-learning conference in 2013. The concern is that the world’s biggest data sets and computational resources will be locked behind corporate doors, not open to scientists. “However, if academia has any hope [of] maintaining an atmosphere of open inquiry (rather than just proprietary R&D), academics have to protect their culture,” they wrote. “Otherwise, the resulting decline in high-quality reproducible research will be a loss for everyone involved, and society at large.”
Hassabis has sought to smooth over the perceived conflicts between an Internet behemoth like Google and a research-focused outfit like DeepMind. In a blog post, Hassabis pointed beyond using AI for “real world” challenges (“Okay, Google, plan me a great backpacking trip through Europe!” for example) and toward the benefits it can offer science.
“We also hope this kind of domain general learning algorithm will give researchers new ways to make sense of complex large-scale data, creating the potential for exciting discoveries in fields such as climate science, physics, medicine and genomics,” he wrote. “And it may even help scientists better understand the process by which humans learn.”
To achieve its learning, DeepMind’s software agent combines two key approaches from the field of machine learning: deep learning and reinforcement learning. Deep learning uses layered “neural networks” to mimic the way the human brain takes in raw information and translates it into a kind of computational understanding. Instead of having to be programmed for every situation it might encounter, the software builds its own internal representations of raw data and learns through a form of trial and error, similar to the way a person would. In the process, deep learning is helping computers do much more with raw inputs, revolutionizing fields like speech recognition, computer vision, and natural-language processing.
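At its core, a neural network is just layers of weighted sums passed through nonlinear functions, turning raw numbers into scores for possible decisions. The toy sketch below is purely illustrative—a tiny two-layer network with random, untrained weights reading a four-pixel “screen”—not DeepMind’s actual architecture; training would adjust the weights based on experience.

```python
import math
import random

random.seed(0)

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sums squashed through tanh."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# A toy "screen" of 4 pixel intensities stands in for raw sensory input.
pixels = [0.0, 0.9, 0.1, 0.0]

# Randomly initialized weights; in a real system, learning tunes these.
w1 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
b1 = [0.0] * 3
w2 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
b2 = [0.0] * 2

hidden = layer(pixels, w1, b1)   # an intermediate internal representation
scores = layer(hidden, w2, b2)   # one score per candidate action
action = "left" if scores[0] > scores[1] else "right"
print(action)
```

The point of the sketch is the pipeline: raw input in, internal representation in the middle, action scores out—no hand-written rules anywhere.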
Reinforcement learning, meanwhile, refers to concepts that can help a machine figure out how to play games: learning that certain actions lead to rewards, while others do not. The trick is to be able to quickly compare the various possible actions and their expected rewards, based on previous experience, and make optimal decisions. That’s where the neural network comes in: By pairing the techniques of reinforcement learning with a neural network, DeepMind has fashioned a computer program that can speedily “learn” a game just by watching how its most recent move—left, right, punch, etc.—ticks up the points on the scoreboard.
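The reward-chasing half of this pairing can be sketched with classic tabular Q-learning on a made-up one-dimensional game (DeepMind’s system replaces the table with a deep neural network, but the update rule is the same idea): the agent nudges its estimate of each action’s value toward the reward it actually observed plus the discounted value of what came next.

```python
import random

random.seed(1)

# Toy game: positions 0..4 on a line; reaching position 4 scores a point.
N_STATES, ACTIONS = 5, ("left", "right")
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

def step(state, action):
    """Environment dynamics: move one square; reward only at the goal."""
    nxt = min(state + 1, N_STATES - 1) if action == "right" else max(state - 1, 0)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy: usually exploit the best-known action, sometimes explore.
        a = random.choice(ACTIONS) if random.random() < epsilon else \
            max(ACTIONS, key=lambda act: q[(s, act)])
        s2, r = step(s, a)
        # Q-learning update: move toward reward + discounted best future value.
        best_next = max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
        s = s2

# After training, the policy at the start state should be to head right.
policy = max(ACTIONS, key=lambda act: q[(0, act)])
print(policy)
```

Nothing here tells the agent that the goal is on the right; that knowledge emerges from repeated play and score feedback alone.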
“[This] work is opening the door to a very exciting direction in which deep learning is incorporated into reinforcement learning,” says Yoshua Bengio, a professor at the Department of Computer Science and Operations Research at the University of Montreal—home to one of the world’s biggest concentrations of deep-learning researchers. “Deep learning allows a computer to extract knowledge about the world, while reinforcement learning allows a computer to learn how to act according to that knowledge. Whereas deep learning already has many industrial applications, reinforcement learning, if cracked, would considerably expand the scope of applications.”
Unlike Deep Blue, which had to be instructed in the finer points of chess, DeepMind’s new technology doesn’t have to be preprogrammed for each game. Instead, it is given access only to the game’s control inputs and score, and is left to figure out how best to act. Not only does this mean it can learn and adapt on its own, but it also no longer requires that its programmers know more than it does about the game being learned.
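That spartan interface—buttons in, score out, no rulebook—can be mimicked with a hypothetical toy environment (the class and its hidden rule below are invented for illustration, not DeepMind’s code). A crude trial-and-error loop is enough to discover which control actually earns points:

```python
import random

random.seed(2)

class ToyGame:
    """A stand-in environment exposing only what the agent is given:
    the legal control inputs and the running score."""
    controls = ("left", "right", "fire")

    def __init__(self):
        self.score = 0

    def press(self, button):
        if button == "fire":  # hidden rule the agent is never told about
            self.score += 1
        return self.score

game = ToyGame()
gained = {b: 0.0 for b in game.controls}
tries = {b: 0 for b in game.controls}

# Trial and error: track which button press most reliably raises the score.
for _ in range(300):
    before = game.score
    b = random.choice(game.controls)
    after = game.press(b)
    tries[b] += 1
    gained[b] += after - before

best = max(game.controls, key=lambda b: gained[b] / max(tries[b], 1))
print(best)
```

The loop discovers that “fire” is the scoring move without anyone encoding the game’s rules—the same principle, at miniature scale, as learning 49 Atari games from controls and score alone.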
“The ultimate goal is to build smart, general-purpose machines, [although] we’re many decades off from doing that,” Hassabis says. “But I do think that this is the first significant rung of the ladder that we’re on. It’s the first step toward proving that a general learning system can work, and that it can work on a challenging task that even humans find difficult.”
In this spirit of innovation, DeepMind’s software agent is now graduating to newer, more complex games, like those on Super Nintendo and PCs, which could eventually include the likes of Civilization and Grand Theft Auto V.
“We are now moving towards 3-D games, where the challenge is much greater because you have got to navigate around a 3-D world, there’s a requirement to have long-term memory, and you have to process 3-D vision, which is much harder than 2-D vision,” Hassabis says. “I would say that this will happen within the next five-plus years. I’d be surprised if it takes longer than that.” Because these games deal with more ambiguity than classic games like Breakout and Space Invaders, the ability of a machine to learn how to play them better suggests how general-learning artificial intelligence could be used in the real world.
Given how much has happened in the past half-decade of deep-learning research, once that next step is completed, the possibilities from there are endless.