What happens inside a neural network’s “brain”? It’s an important and tricky question, since these densely packed models now touch people, laws, and society every day. The short answer, until recently? ¯\_(ツ)_/¯.
Google Brain, the company’s deep learning and AI arm, has published research in this area for years. Back in 2015, a blockbuster paper titled “Inceptionism” introduced the media, and much of the public, to the concept of neural networks by showing how they “dream,” in wild, hallucinogenic images of clouds shaped like fish and fantastical birds. The images gripped our imaginations and changed how we talk about AI. But the point for the researchers was to show off “simple techniques for peeking inside these networks.”
This week, three years later, two of the same Google Brain researchers and their colleagues published a new paper on their progress called “The Building Blocks of Interpretability.” It shows a new way of seeing inside the brains of machines. The researchers’ word for this is “interpretability”: the ability to understand how a model arrives at its answers. In short, they’ve built a series of interfaces for looking inside neural networks to see how they make decisions.
So how does a neural network see our world? Take a look at one example: a photo of a golden Labrador and a kitten frolicking that a neural network has labeled as a “Labrador retriever” and “tiger cat.” Using one of the researchers’ interfaces, you can see exactly how the neural net came to that decision, isolating certain areas of the photo with each neuron’s best guess about what it “sees” and how sure it is about that guess. (For instance, it was pretty sure about the dog’s “floppy ears” and the cat’s “pointy ears.”) As layers of neurons work together, the machine’s understanding evolves, starting with simple edges and building up to actual shapes and objects.
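That edge-to-object progression can be made concrete with a toy sketch. The snippet below is hypothetical and not from the paper: it uses plain NumPy and a hand-written Sobel-style kernel (real networks learn such filters from data) to show how an early convolutional layer “lights up” only where an edge sits in the image.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2-D convolution (cross-correlation, as deep learning uses it)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny 6x6 "image": dark left half, bright right half -> one vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Sobel-style vertical-edge detector, the kind of pattern early layers learn.
edge_kernel = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)

# The "activation map" is strong only in the columns straddling the edge.
activation = conv2d(image, edge_kernel)
print(activation)
```

Later layers would combine many such activation maps into detectors for textures, then parts like “floppy ears,” then whole objects; the interfaces in the paper let researchers inspect those intermediate activations directly.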
You’d be far better served by clicking around their demos yourself rather than listening to me try to explain them.
It’s the “why” of the research that’s so compelling. Right now, researchers can only assess neural networks based on their final decisions; we don’t really understand how they get to those decisions. That’s a big problem. If neural networks are participating in criminal justice, financial markets, national security, healthcare, and government, we need to understand why they make decisions, not just whether they got them right or wrong. Transparency and accountability, basic tenets of democracy, are lost when we don’t. The paper proposes an interface that lets researchers peer into their neural networks, almost like looking through a window. Eventually, they posit, these interfaces could help researchers shape the actual thought process going on inside these digital brains.
“Human feedback on the model’s decision-making process, facilitated by interpretability interfaces, could be a powerful solution to these problems,” the Google team writes. “It might allow us to train models not just to make the right decisions, but to make them for the right reasons.”
The researchers still have a long way to go, they hasten to add, and their approach isn’t without pitfalls (interfaces can still be tricked). It’s also worth noting that Google is far from the only organization trying to accomplish this goal. Just as the first graphical user interfaces let normal people look into the black box of early computers, these new interfaces might let society do something similar for AI.