The seasonal flu causes as many as 500,000 deaths worldwide every year, and tracking its spread is one of the best ways to improve prevention and reduce this mortality. Much effort has gone into finding ways to gather intelligence about the flu’s spread more quickly. The better the real-time information, the better hospitals and doctors can prepare by stocking vaccines and running public education campaigns.
To reduce the typical two-week time lag in the CDC’s flu monitoring, researchers have looked at everything from ER visits and drug store receipts to orange juice sales, Twitter trends, and even cancellation rates for OpenTable restaurant reservations. Google Flu Trends is the method that has made the most headlines: when people feel symptoms, they tend to Google them, sometimes days before heading to the doctor.
Experts and Google itself know there are flaws in the Flu Trends tool, especially when the flu is being covered heavily in the news or circulating at extremely high levels (so people who don’t have the flu might be searching related terms anyway). Now researchers from Boston Children’s Hospital and Harvard Medical School have come up with another untapped resource that uses the Web to better track the flu: traffic to Wikipedia articles. Based on their early research, they believe their method might work better than Google’s tool.
“Wikipedia is a hugely popular website,” says co-author David McIver. “A lot of people are very likely to come to it when they are looking for information. What’s sort of nice, compared to a straight search engine query, is that when someone actually visits the Wikipedia page, this is probably what the person is really searching for.”
Their results suggest that Wikipedia traffic can be more useful than Google searches for public health monitoring, and that it is available much sooner than the data the CDC gathers.
After compiling the day-by-day U.S. Wikipedia traffic for a wide range of pages associated with the flu, such as “Influenza treatment,” “fever,” and “flu season,” they found that Wikipedia usage could accurately predict the week of “peak” influenza activity 17% more often than Google Flu Trends between December 2007 and August 2013. One reason that their predictions may be more accurate than Google’s, says co-author John Brownstein, is that Wikipedia traffic is less likely than search traffic to be affected by media coverage of the flu.
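For readers curious what the first step of such an analysis might look like, here is a minimal Python sketch of rolling daily page-view counts up into weekly totals and picking out the busiest week. It is not the researchers’ model, which this article does not describe; the function names and the sample numbers are made up purely for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily_views):
    """Sum daily view counts into ISO-week buckets.

    daily_views: dict mapping a date to the combined views of
    flu-related pages on that day (a hypothetical input format).
    Returns a dict mapping (iso_year, iso_week) to total views.
    """
    totals = defaultdict(int)
    for day, views in daily_views.items():
        iso_year, iso_week, _ = day.isocalendar()
        totals[(iso_year, iso_week)] += views
    return dict(totals)


def peak_week(weekly):
    """Return the (iso_year, iso_week) key with the highest total."""
    return max(weekly, key=weekly.get)


# Invented sample data: three full weeks starting Monday, Jan. 7, 2013,
# with 100, 300, and 200 views per day respectively.
start = date(2013, 1, 7)
per_day = [100] * 7 + [300] * 7 + [200] * 7
sample = {start + timedelta(days=i): v for i, v in enumerate(per_day)}

weekly = weekly_totals(sample)
print(peak_week(weekly))
```

A real analysis would also need a statistical model linking these traffic counts to the CDC’s reported flu activity; the sketch covers only the bookkeeping.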
Another benefit of their method, the authors say, is that it is all open to the public. While public health experts view Google Flu Trends as a valuable tool, they are also frustrated that the algorithm and data are proprietary to Google. They can’t look at how it works, and so can’t easily identify and fix its flaws.
The researchers say the work is still at an early stage and needs more validation. Google brought a lot of attention to the use of non-traditional digital tools to speed up flu tracking, but ultimately more reliable and transparent options are needed. “We envision the possibility of having this system running in the background,” says Brownstein. “We could have an app or website where you could see in real time the statistics. We’re hoping someday it could be used in conjunction with existing methods.”
One day, the varied flows of our digital exhaust could be used to track diseases other than the flu. “Flu is a great use case because it’s widespread, and there’s not a lot of stigma associated with the flu. It’s really good for testing these kinds of methods,” he says.