# This Algorithm Predicts A Neighborhood’s Crime Rate Using Google Street View

## Can you find the nearest McDonald’s based on a single photo? This computer can.

[Tilt-shift looking down: Rob van Esch via Shutterstock]

Every day we make complex inferences based on our surroundings. Is that a safe street to walk down? Is the nearest McDonald’s to the left? We use a contextual understanding of, and judgments about, our environment to look beyond merely the “visual scene” and decide what stores and services we expect to find nearby, and even the likely economic climate of the neighborhood.

Now a computer can do the same thing by simply looking at a picture from Google Street View.

A deep learning project by researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) fed 8 million images from Google Street View into an algorithm. The result is a computer that, given a single image, can accurately estimate the distance to the nearest McDonald’s and predict the crime rate of the surrounding area.

This represents a change in the way we should think about image recognition. “A lot of the existing computer vision research to date has focused on what’s inside an image–for example, does a particular image contain a cat or part of a face?” says Aditya Khosla, a fourth-year computer science PhD student who worked on the project. “We wanted to look at what we can learn from the image through inferences.”

The study started with Khosla and his three colleagues picking eight cities from around the world: Boston, Chicago, Hong Kong, London, Los Angeles, New York, Paris, and San Francisco. Each city was divided into a grid of locations roughly 16 meters apart, and for each point, four images were captured showing the view to the north, south, east, and west.
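The sampling scheme described above can be sketched in a few lines. This is a minimal illustration, not the team’s actual pipeline: the function name, the bounding-box inputs, and the 16-meter spacing logic are assumptions, using the standard approximation that one degree of latitude is about 111,320 meters.

```python
import math

def grid_points(lat_min, lat_max, lon_min, lon_max, spacing_m=16.0):
    """Yield (lat, lon, heading) sampling points spaced roughly
    `spacing_m` meters apart, with four cardinal views per point."""
    lat_step = spacing_m / 111_320.0            # ~meters per degree of latitude
    points = []
    lat = lat_min
    while lat <= lat_max:
        # meters per degree of longitude shrink toward the poles
        lon_step = spacing_m / (111_320.0 * math.cos(math.radians(lat)))
        lon = lon_min
        while lon <= lon_max:
            for heading in (0, 90, 180, 270):   # north, east, south, west
                points.append((round(lat, 6), round(lon, 6), heading))
            lon += lon_step
        lat += lat_step
    return points

# A tiny bounding box (a few blocks) around downtown Boston
pts = grid_points(42.3550, 42.3554, -71.0600, -71.0596)
```

Each returned tuple corresponds to one Street View image request: a coordinate plus a compass heading, giving the four views per location the researchers describe.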

Next the team obtained the location of establishments of interest using Google Places. Despite having a range of possibilities to choose from, they settled on McDonald’s restaurants as their reference point of choice–largely because McDonald’s were found in all eight of the cities they had chosen.

“We wanted something that would be found everywhere but would also be slightly tough to guess the location of,” says Joseph Lim, a fellow PhD student who also worked on the project. “At one point we considered Nike stores, but these often tend to be located in shopping malls, which are typically in the center of a city. We wanted an added level of complexity.”

Aggregated crime data, meanwhile, was gathered from organizations like San Francisco CrimeSpotting. This allowed the construction of crime density maps, which could be used for training. Of the 8 million image samples from Google Street View, half were used for training the algorithm, and the other half for testing it.
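The two preparation steps mentioned here, binning incident reports into a density map and splitting the images 50/50, can be sketched as follows. This is an illustrative simplification under assumed inputs (lists of coordinates and image ids), not the researchers’ code; the cell size and seeded shuffle are arbitrary choices.

```python
import random
from collections import Counter

def crime_density(incidents, cell_deg=0.001):
    """Bin (lat, lon) crime incidents into a coarse grid; an image can
    then be labeled with the incident count of the cell it falls in."""
    counts = Counter()
    for lat, lon in incidents:
        cell = (round(lat / cell_deg), round(lon / cell_deg))
        counts[cell] += 1
    return counts

def split_half(image_ids, seed=0):
    """Shuffle image ids deterministically and split them 50/50
    into training and testing sets."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    mid = len(ids) // 2
    return ids[:mid], ids[mid:]

# Two incidents land in the same grid cell; one is elsewhere
density = crime_density([(42.0001, -71.0001), (42.0001, -71.0001), (42.5, -71.5)])
train, test = split_half(range(10))
```

The density map plays the role of a supervision signal: the label for each training image is looked up from the cell containing its location.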

Results have proven impressive. Using deep learning tools, the team was able to create an algorithm that recognized what it was looking at, and could use this to draw conclusions. While humans proved better at navigating to their nearest McDonald’s in the fewest possible steps, the algorithm consistently outperformed people when shown two photos and asked which scene was closer to a Big Mac.
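The pairwise test in which the algorithm beat humans reduces to a simple comparison: for each pair of photos, check whether the model ranks them in the same order as their true distances. A minimal scoring sketch, with assumed inputs (a distance-predicting function and ground-truth distances), might look like this:

```python
def pairwise_accuracy(predict_distance, pairs):
    """Fraction of photo pairs where the model correctly picks the
    image whose true distance to the nearest McDonald's is smaller.

    `pairs` holds ((img_a, true_a), (img_b, true_b)) tuples;
    `predict_distance` maps an image (or its id) to an estimate."""
    correct = 0
    for (img_a, true_a), (img_b, true_b) in pairs:
        model_pick = img_a if predict_distance(img_a) <= predict_distance(img_b) else img_b
        truth_pick = img_a if true_a <= true_b else img_b
        correct += (model_pick == truth_pick)
    return correct / len(pairs)

# Toy example: a lookup table stands in for the trained model
estimates = {"a": 12.0, "b": 40.0, "c": 2.0, "d": 3.0}
pairs = [(("a", 10.0), ("b", 50.0)), (("c", 5.0), ("d", 1.0))]
acc = pairwise_accuracy(estimates.get, pairs)
```

Note that this task is easier than absolute distance prediction: the model only needs a consistent relative ordering, which is one reason a machine can excel at it even when humans navigate better step by step.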

A demo of the human vs. algorithm experiment is available online.

“The opacity of the algorithm means that it’s hard for me to know exactly what the high-level descriptors are which suggest a McDonald’s is nearby,” Khosla says. “An example might be the ratio or number of taxicabs, though, which suggest that you are in a highly populated commercial part of a city–or if the algorithm detects an ocean in the image, which means we are likely on the outskirts of the city.”

However, Khosla admits that the project was more about kick-starting research than creating an optimized algorithm. “It’s a complex task for machine learning because of the abstraction involved,” he says. “What we’re trying to do is show that studying images should be about more than just analyzing what is visible. If the goal of artificial intelligence is to build machines that can mimic human intelligence, this level of abstraction is the obvious next step.”

The team also has some ideas about where research could go next–and how this could be scaled into a real-world project.