Chlamydia, gonorrhea, and syphilis, three major sexually transmitted diseases (STDs), are on the rise in the U.S. for the first time since 2006, according to the U.S. Centers for Disease Control and Prevention (CDC). As past experience shows, when people get sick, they search online for answers. But it’s all too easy to draw flawed conclusions from that search data.
Google has now given four universities, including the University of Illinois and Columbia University, deep access to anonymized search data to better track the spread of STDs, according to an article by CNN and Kaiser Health News.
It’s not just about knowing what happened, but about making smart estimates of what will happen in the future. By correlating historical statistics on infections with the search terms used at the time, researchers can build models that predict new outbreaks when the same terms and high search volume surface again.
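To make the idea concrete, here is a minimal sketch of that kind of correlation in Python. It is not the researchers' actual method, which the article does not describe. It assumes the simplest possible setup: one weekly search-volume series and one series of reported cases, linked by a straight-line fit. The numbers and the function names (pearson, fit_line, predict_cases) are invented for illustration.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return slope, intercept


def predict_cases(search_volume, slope, intercept):
    """Estimate case counts from a new search-volume reading."""
    return slope * search_volume + intercept


# Hypothetical weekly data, made up for this example (not real CDC or Google figures).
weekly_search_volume = [120, 150, 210, 260, 310, 280]
weekly_reported_cases = [22, 31, 40, 55, 61, 58]

r = pearson(weekly_search_volume, weekly_reported_cases)
slope, intercept = fit_line(weekly_search_volume, weekly_reported_cases)

print(f"correlation: {r:.2f}")
print(f"estimated cases at volume 400: {predict_cases(400, slope, intercept):.1f}")
```

A fit like this is only as good as the assumption that search habits stay stable, so in practice it would need to be refit regularly and checked against official counts.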
Sound familiar? It is. Google developed its own system for disease tracking in 2008 with Flu Trends. The project correlated data from public health authorities like the CDC with Google search trends to intuit the spread of the disease, purportedly faster than official agencies could do with historical data reported afterwards. Google later created Dengue Trends to provide similar tracking for the fast-spreading mosquito-borne viral infection that threatens billions of people in tropical and subtropical regions.
Google shut down Flu Trends on August 20, 2015, after taking heat for the declining accuracy of its predictions. A 2014 study by Northeastern University, the University of Houston, and Harvard published in Science pointed to goofs like misreading the peak of the 2013 flu season by 140 percent, leading Professors David Lazer at Northeastern University and Ryan Kennedy at the University of Houston to write in a Wired article that Flu Trends “failed spectacularly.” The problem, they say, was that Google was too broad in the terms that it considered relevant to the flu, and that it didn’t update its algorithms as search habits and features changed over time. As they noted: “…while Google’s efforts in projecting the flu were well meaning, they were remarkably opaque in terms of method and data—making it dangerous to rely on Google Flu Trends for any decision-making.”
Google seems to have gotten the message, not only in shutting down its own tracking program but also in creating a program for outside researchers (who have more expertise) to apply for access to the detailed Google search data. Columbia University has already been participating in this program for flu tracking, as have Boston Children’s Hospital and the CDC’s influenza division. But Google data is only one of several sources that go into the mix.
With STD tracking, Google seems to be providing the same level of deep access as it has for flu data. This hints that the model could be extended to other ailments, with the caveat that no single data source should be relied on too heavily.