Date: 2016-10-25

Data-driven algorithms govern many aspects of life: university admissions, resume screening, and a person’s ability to get a car or home loan. Often, using data leads to more efficient allocation of resources and better outcomes for everyone. But algorithms can come with unintended consequences, and without care, their application can result in a society we don’t want.

Typically, we think of algorithms as being neutral and objective, but when software is written and trained by humans, it often encodes the biases and prejudices of the people who make and shape it. Ultimately, the biases built into algorithms can be racist and can marginalize groups of lower socioeconomic status. What’s truly worrying is that, unlike with people, the biases in algorithms are sometimes difficult to detect, undo, and fix.

No one sets out to create a racist model, but often bias creeps into algorithms inadvertently because of training data. To give a glaring example, Tay, the Microsoft chatbot, didn’t start out as offensive, but after interactions with malicious users, it parroted offensive content. Pokémon Go is another example where bias slipped in. Various reports have noted the tendency to see fewer Pokéstops in predominantly black neighborhoods than in white communities. The reason is that data for the game originally came from another location-based game called Ingress, which was more popular with white users who suggested points of interest.

Sometimes, the pathway to biased training data is even more circuitous, such as when Google Photos last year mistagged a photo of a young African-American couple with the label “gorilla.” A Google spokeswoman said the company was appalled by the mistake and was taking action to improve its automatic image labeling technology. Instead of “seeing” a face, these kinds of algorithms identify shapes, colors, and patterns in order to make educated guesses as to what the picture might actually be. However, it appears that Google had simply never tested its algorithm on people with darker skin tones.

All of these examples are relatively benign, but when algorithms are deployed to make decisions, they can have much more serious consequences. For instance, recidivism models, which try to predict the likelihood of people committing future crimes after their release, label people who live in poor neighborhoods as more likely to relapse into criminal behavior. As a result, these people are frequently sentenced to longer jail terms by the courts. In a 2014 speech, then-Attorney General Eric Holder criticized such “risk assessments.” Although they were crafted with good intentions, he warned that such assessments “may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society.”

Big data is increasingly being used to gauge the performance of workers. For example, software algorithms are being used to generate scores that evaluate teacher effectiveness. Teachers with the lowest scores are let go. Rating teachers is a laudable goal and could theoretically eliminate human bias in evaluation, but these algorithms have faced criticism because reducing human behavior to mathematical formulas is very hard. Student outcomes are affected by many factors, a number of which are outside the control of a teacher.

Using data to monitor business performance can have its advantages, but firms must recognize the limitations of machines and incorporate feedback to improve the algorithms over time.