The violent events in Charlottesville, Virginia, earlier this month and President Trump’s failure to condemn them have renewed the national conversation around extremist groups feeling “emboldened” by the administration to commit acts of racism, hate, and violence. In the first 34 days after the election, the Southern Poverty Law Center counted 1,094 bias incidents around the nation, 37% of which directly referenced Donald Trump or his campaign slogans.
Because no comprehensive agency or database records hate crimes at the national level, advocacy organizations, news outlets, and individual journalists have launched multiple efforts to track them, including a collaborative reporting project led by ProPublica called Documenting Hate. The project is a national database of hate crimes, gathered from news reports and from individuals whose stories are reported and verified directly through the site. The idea is that collecting this data in one place will help journalists accurately report on these incidents, while also providing a constantly evolving snapshot of hate crimes in the U.S.
Google News Lab, a partner in the project, recently released a tool meant to further that mission: the Hate Crime Index. Created with data viz studio Pitch Interactive, the Index uses machine learning to automatically collect articles that cover racism, bigotry, and abuse.
The most impressive part of the tool, however, lies behind the interface. To create the visualization, Google News Lab and Pitch Interactive tapped the Google Cloud Natural Language API, which uses machine learning models to extract information about people, places, and events from text. (Similar machine learning technology powers Google Search and Google Assistant.)
Google News Lab data editor Simon Rogers explains it this way: The API bypasses the filters that Google News uses to surface the most relevant news and highlight certain news outlets. Instead, it draws on a raw feed and filters stories by keywords and sentiments that indicate they are about hate crimes. This allows the database to collect hyper-local stories that Google News filters might otherwise cull.
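For readers curious what filtering a raw feed by keywords and sentiment could look like in practice, here is a minimal Python sketch. It is not the Index's actual code: the keyword list, the `Story` type, the `-0.25` threshold, and the function names are all hypothetical, and each story is assumed to arrive with a sentiment score already attached, on a scale from -1.0 (negative) to 1.0 (positive).

```python
from dataclasses import dataclass

# Hypothetical keyword list; the real one used by the Index is not public.
HATE_KEYWORDS = {"hate crime", "racist", "bigotry", "swastika", "slur"}


@dataclass
class Story:
    title: str
    text: str
    sentiment: float  # assumed precomputed score, -1.0 (negative) to 1.0 (positive)


def looks_like_hate_story(story: Story, max_sentiment: float = -0.25) -> bool:
    """Keep a story only if it mentions a keyword AND has a negative tone."""
    body = f"{story.title} {story.text}".lower()
    has_keyword = any(keyword in body for keyword in HATE_KEYWORDS)
    return has_keyword and story.sentiment <= max_sentiment


def filter_feed(feed: list[Story]) -> list[Story]:
    """Return the stories from a raw feed that pass both checks."""
    return [story for story in feed if looks_like_hate_story(story)]


# Made-up sample feed, for illustration only.
sample_feed = [
    Story("Vandals paint swastika on synagogue", "Police are investigating.", -0.7),
    Story("Local bakery wins award", "Customers celebrate.", 0.8),
    Story("Panel discusses ending bigotry", "A hopeful community event.", 0.6),
]

matches = filter_feed(sample_feed)
```

With this made-up feed, only the first story passes: the second contains no keyword, and the third mentions "bigotry" but has a positive tone. Requiring both signals is one simple way to cut down on false positives from keyword matching alone.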
In Rogers’s view, it’s the job of data journalists, like his team at Google News Lab, to fill in those gaps and bring accurate and complete data to people. The Documenting Hate Index leverages machine learning and Google’s trove of search data to surface the first layer of data quickly and easily, so that journalists can use it to bring the increase in hate crimes since the election under greater scrutiny.
As Rogers puts it, there are plenty of amazing local reporters who are picking up on hate crime incidents in their region, but those reports never get seen elsewhere. The Index offers journalists the ability to connect isolated incidents to the bigger picture of what’s going on in our country today. “Anything that brings truth and facts and data to this issue is important,” he says.