The violent events in Charlottesville, Virginia, earlier this month and President Trump’s failure to condemn them have renewed the national conversation around extremist groups feeling “emboldened” by the administration to commit acts of racism, hate, and violence. In the first 34 days after the election, the Southern Poverty Law Center counted 1,094 bias incidents around the nation, 37% of which directly referenced Donald Trump or his campaign slogans.
Because there is no comprehensive agency or database that records hate crimes at the national level, advocacy organizations, news outlets, and individual journalists have launched multiple efforts to track them, including a collaborative reporting project led by ProPublica called Documenting Hate. The project is a national database of hate crimes, gathered from news reports and also from individuals whose stories are reported and verified directly through the site. The idea is that collecting this data in one place will help journalists accurately report on these incidents, while also providing a constantly evolving snapshot of hate crimes in the U.S.
Google News Lab, which is a partner in the project, recently released a tool meant to further that mission: the Hate Crime Index. Created with data visualization studio Pitch Interactive, the Index uses machine learning to automatically collect articles that cover racism, bigotry, and abuse.
Using the Index is fairly intuitive: You search for keywords by date, and the Index pulls up related stories from within that time frame. A sidebar along the left-hand side lists the keywords found in the corresponding articles, ranked by how often they appear. For example, if you pull up all the reports of hate crimes from last week, “Muslims,” “Pedro Cruz” (the cousin of a man in Florida killed by a gunman ranting homophobic slurs), and “Trump Advisor” appear large and at the top of the list, indicating they come up the most in news reports. The articles on the right-hand side are listed by date and linked to their sources. The Index gives a comprehensive view of what is happening all over the country at very local, granular levels.
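For readers curious how a frequency-ranked keyword list like that sidebar might be built, here is a minimal Python sketch. It is not the Index's actual code: the `rank_keywords` function, the article dictionaries, and the sample data are all hypothetical, made up purely for illustration, and it assumes each article already comes with a list of extracted keywords.

```python
from collections import Counter


def rank_keywords(articles):
    """Return (keyword, count) pairs, most frequent first.

    Each article is a dict with a hypothetical "keywords" list. A keyword
    is counted at most once per article, so the count is the number of
    articles that mention it.
    """
    counts = Counter()
    for article in articles:
        # set() removes repeats of the same keyword within one article.
        counts.update(set(article["keywords"]))
    # Sort by descending count, then alphabetically so ties are stable.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


# Illustrative sample data only; these are not real Index results.
sample_articles = [
    {"title": "Story A", "keywords": ["Muslims", "Trump Advisor"]},
    {"title": "Story B", "keywords": ["Muslims", "Pedro Cruz"]},
    {"title": "Story C", "keywords": ["Muslims", "Pedro Cruz", "Pedro Cruz"]},
]

ranked = rank_keywords(sample_articles)
for keyword, count in ranked:
    print(f"{keyword}: {count}")
```

In an interface, the counts could then drive the type size of each keyword, so the most frequently mentioned terms appear largest.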
The most impressive part of the tool, however, lies behind the interface. To create the visualization, Google News Lab and Pitch Interactive tapped the Google Cloud Natural Language API, which uses machine learning models to extract information about people, places, and events from text. (Similar machine learning technology powers Google Search and Google Assistant.)
Google News Lab data editor Simon Rogers explains it this way: The API bypasses the filters that Google News uses to surface the most relevant news and highlight certain news outlets. Instead, the API has access to a raw feed, then filters stories according to certain keywords and sentiments that indicate they are about hate crimes. This allows the database to collect hyper-local stories that might otherwise get culled by Google News filters.
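The filtering step Rogers describes can be pictured with a small Python sketch. This is an illustration under stated assumptions, not Google's pipeline: the `Story` class, the `WATCH_TERMS` list, the `-0.25` threshold, and the sample feed are all hypothetical. It assumes each story already carries a sentiment score between -1.0 and 1.0, which is the range the Cloud Natural Language API uses for document sentiment, with negative values indicating a negative tone.

```python
from dataclasses import dataclass


@dataclass
class Story:
    headline: str
    text: str
    sentiment: float  # assumed precomputed score in [-1.0, 1.0]


# Illustrative terms only; the real keyword list is not public in this article.
WATCH_TERMS = {"hate crime", "racist", "slur", "bigotry"}


def looks_relevant(story, max_sentiment=-0.25):
    """Keep a story only if it mentions a watch term AND reads as negative."""
    body = f"{story.headline} {story.text}".lower()
    has_term = any(term in body for term in WATCH_TERMS)
    return has_term and story.sentiment <= max_sentiment


def filter_feed(feed):
    """Return the stories from a raw feed that pass both checks."""
    return [story for story in feed if looks_relevant(story)]


# Made-up feed for demonstration.
feed = [
    Story("Local man charged with hate crime",
          "Police say the suspect shouted a slur.", -0.7),
    Story("Community festival celebrates diversity",
          "Organizers said the event countered bigotry.", 0.6),
    Story("Council votes on parking",
          "Meters will be replaced.", -0.4),
]

kept = filter_feed(feed)
for story in kept:
    print(story.headline)
```

Combining the two signals is the point of the sketch: a keyword match alone would also catch the upbeat festival story, and negative sentiment alone would catch the parking story.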
Essentially, the Index supplements the work that journalists and civil rights advocates are doing for the Documenting Hate project, which includes meatier reporting projects and data-sharing efforts with civil rights groups. There’s a serious lack of data around hate crimes: Although the FBI is required by law to collect hate crime data, local jurisdictions are not required to report incidents to the federal government. As a result, the FBI’s data is patchy, incomplete, and practically useless.
In Rogers’s view, it’s the job of data journalists, like his team at Google News Lab, to fill in those gaps and bring accurate and complete data to people. The Documenting Hate Index leverages machine learning and Google’s trove of search data to surface the first layer of data quickly and easily, so that journalists can use it to bring the increase in hate crimes since the election under greater scrutiny.
As Rogers puts it, there are plenty of amazing local reporters who are picking up on hate crime incidents in their region, but those reports never get seen elsewhere. The Index offers journalists the ability to connect isolated incidents to the bigger picture of what’s going on in our country today. “Anything that brings truth and facts and data to this issue is important,” he says.