Numbers can be a powerful way to illustrate a point, predict the future, and confirm a decision. They can also be wrong and misleading, says data expert John H. Johnson, author of Everydata: The Misinformation Hidden in the Little Data You Consume Every Day.
"People are afraid of data for two reasons," he says. "First, most people have never taken a basic stats class, and if you’ve never been trained, you might think numbers are scary. And second, we’re bombarded with information. From your smartphone to your Fitbit to your gas gauge, the average American consumes 34 pickup trucks of data every day. That can be overwhelming."
Without a clear understanding of data, it is easy to misinterpret the numbers or to accept someone else’s conclusions without question. Here are three common ways data can trip up you and your business:
Outliers and unequal data sets can impact averages, leading to a comparison of apples to oranges. For example, the average salary for a U.S. mayor is $62,000, while the average salary of a deputy mayor is $83,000. "That’s like Robin getting more money than Batman," says Johnson.
But averages can be misleading as they often hide variations. The mayor and deputy mayor data set is misleading because it doesn’t take into account the fact that almost every city has a mayor, and the vast majority make less than $100,000, says Johnson. On the other hand, only large cities have deputy mayors; in New York City, for example, there are four deputy mayors, each of whom makes more than $200,000, he adds.
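The mayor and deputy mayor puzzle can be reproduced with a handful of numbers. The sketch below uses invented salaries (they are assumptions for illustration, not real figures) in which every city has a mayor but only the two large cities have deputy mayors. It compares the overall averages, then repeats the comparison using only cities that have both roles.

```python
from statistics import mean

# Hypothetical salaries, invented purely for illustration (not real data).
# Every city has a mayor; only the two large cities have deputy mayors.
cities = [
    {"name": "Small A", "mayor": 40_000, "deputies": []},
    {"name": "Small B", "mayor": 45_000, "deputies": []},
    {"name": "Small C", "mayor": 50_000, "deputies": []},
    {"name": "Large D", "mayor": 150_000, "deputies": [90_000, 95_000]},
    {"name": "Large E", "mayor": 200_000, "deputies": [120_000, 125_000]},
]

# Apples to oranges: mayors from all cities vs. deputies from large cities only.
avg_mayor_all = mean([c["mayor"] for c in cities])
avg_deputy = mean([s for c in cities for s in c["deputies"]])

# Like for like: only mayors of cities that also have deputy mayors.
avg_mayor_with_deputies = mean([c["mayor"] for c in cities if c["deputies"]])

print(f"Average mayor, all cities:            ${avg_mayor_all:,.0f}")
print(f"Average deputy mayor:                 ${avg_deputy:,.0f}")
print(f"Average mayor, cities with deputies:  ${avg_mayor_with_deputies:,.0f}")
```

In this made-up sample the deputy mayors appear to out-earn the mayors overall, yet within the cities that actually have deputies the mayors earn more. The many low-paid small-city mayors pull the first average down.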
"It’s important to think about what’s under the surface," Johnson says. "By studying a sample that includes both cities with and without deputy mayors we end up with statistics that at first glance seem completely non-intuitive. An average is only as good as its underlying data."
A 1996 commercial told parents that four out of five pediatricians recommend Gerber baby food. "Sounds like a pretty solid endorsement, doesn’t it?" asks Johnson. "The actual number was just 12%. Gerber cherry-picked data to make their point, and ignored data points that contradicted it."
Instead of looking at pediatricians as a whole, Gerber looked only at those who recommended baby food and named a specific brand. "That’s very different," says Johnson.
The Federal Trade Commission thought so, too, and faulted Gerber for failing to clearly disclose the fact that it was only counting a selection of pediatricians.
To know whether data has been cherry-picked, Johnson says you have to know how much original data exists and ask questions about where the data came from. Read the fine print, and think about whether the data was selected in a specific way, and whether that method was biased toward a certain outcome.
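The gap between "four out of five" and 12% comes down to which group sits in the denominator. The short calculation below is a sketch under stated assumptions: the survey counts are hypothetical, chosen only so that the arithmetic lands on the two percentages quoted above, and they are not the real survey figures.

```python
# Hypothetical survey counts, chosen only so the arithmetic matches the two
# percentages quoted in the article. They are NOT the real survey figures.
surveyed = 1000              # all pediatricians asked
named_specific_brand = 150   # those who recommended baby food AND named a brand
named_gerber = 120           # those who named Gerber

# Same numerator, two different denominators.
share_of_subset = named_gerber / named_specific_brand
share_of_all = named_gerber / surveyed

print(f"Share of those who named a brand: {share_of_subset:.0%}")
print(f"Share of everyone surveyed:       {share_of_all:.0%}")
```

Asking "out of how many?" is the quickest check: the same count of endorsements looks like a landslide against the small group and like a minority against the full sample.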
A headline from a 2015 People.com article said, "Living Near a Starbucks Will Increase Your Home’s Value." Before you move closer to Starbucks, Johnson says you need to step back and understand what’s truly driving the data.
"It might be that Starbucks puts their stores in neighborhoods with more expensive homes," he says. "Or maybe Starbucks puts its stores in the center of towns and villages and home prices rise faster in those areas. We don’t know, and that’s the point."
While there may be a relationship between two variables, you can’t know if it’s meaningful if other variables have been omitted, says Johnson. Start by asking the question, "What else could explain this?" It’s possible that Starbucks really is making your home worth more, but it’s also possible that Starbucks is serving as a proxy for other things that raise its value.
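One way to see how an omitted variable can create a misleading relationship is to build a tiny example where the answer is known in advance. In the sketch below, all of the data is invented: home-price growth depends only on whether a neighborhood is in the town center, and a Starbucks is simply more likely to be located there. The names `neighborhoods` and `growth_gap` are hypothetical helpers for this illustration.

```python
from statistics import mean

# Invented data: (is_central, has_starbucks, price_growth_percent).
# By construction, growth depends ONLY on being central, never on Starbucks.
neighborhoods = [
    (True, True, 8), (True, True, 8), (True, True, 8), (True, False, 8),
    (False, True, 4), (False, False, 4), (False, False, 4), (False, False, 4),
]

def growth_gap(rows):
    """Average growth with a Starbucks minus average growth without one."""
    with_store = mean([g for _, s, g in rows if s])
    without_store = mean([g for _, s, g in rows if not s])
    return with_store - without_store

# Naive comparison that ignores location.
naive_gap = growth_gap(neighborhoods)

# The same comparison made separately within each kind of location.
central_gap = growth_gap([r for r in neighborhoods if r[0]])
outer_gap = growth_gap([r for r in neighborhoods if not r[0]])

print(f"Gap ignoring location:   {naive_gap} points")
print(f"Gap within town centers: {central_gap} points")
print(f"Gap outside centers:     {outer_gap} points")
```

In this constructed example the naive comparison shows a gap in favor of homes near a Starbucks, but the gap disappears once central and non-central neighborhoods are compared separately. Real data is rarely this clean, so the sketch illustrates the question to ask rather than an answer about Starbucks.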
When using data, Johnson says it’s important to look at the big picture. "Don’t be afraid of numbers, they’re just numbers," he says. "Stop, think and use your intuition. When you see a new number or phrase in a new study, look closely and ask questions. Does the person or organization have an agenda? How might that cause them to skew numbers? Statistical literacy is empowering."