Smartphone location data is a dream for marketers who want to know where you go and how long you spend there—and a privacy nightmare. But this kind of geolocation data could also be used to protect people’s voting rights on Election Day.
The newly founded nonprofit Center for New Data is now tracking voters at the polls using smartphone location data to help researchers understand how easy—or difficult—it is for people to vote in different places. Called the Observing Democracy project, the nonpartisan effort is making data on how far people have to travel to vote and how long they have to wait in line available in a privacy-friendly way so it can be used to craft election policies that ensure voting is accessible for everyone.
Election data has already fueled changes in various municipalities and states. A 66-page lawsuit filed by Fair Fight Action against the state of Georgia in the wake of Stacey Abrams’s narrow loss to Brian Kemp in the 2018 gubernatorial race relies heavily on data to back its assertions of unconstitutionally delayed and deferred voter registration, unfair challenges to absentee and provisional ballots, and unjustified purges of voter rolls—all hallmarks of voter suppression.
The promise of Observing Democracy is to make this type of impactful data available much more rapidly than ever before. Barely a month old, Observing Democracy isn’t wasting any time: Its all-volunteer staffers will be receiving data potentially as soon as Nov. 4 on voter wait times at polling locations, travel times to polling stations, and how frequently ballot drop-off boxes are visited, courtesy of location-data mining companies X-Mode Social and Veraset, which was spun off from SafeGraph.
Observing Democracy then melds that data with polling center locations from the Center for Public Integrity and voter information from L2 Political, the voting analytics company that supplies voter data to the Republican National Committee and Democratic National Committee. The last step is to de-identify the data before sharing the combined information with researchers and journalists whom the organization has vetted as nonpartisan.
All told, the organization estimates that it will be ingesting “billions” of rows of data representing more than 40 million devices, with Election Day coverage of one-in-five U.S. adults. To protect that vast trove of voter information, Observing Democracy is using tools from data-governance software and guidance provider Immuta, says Steven Davenport, co-executive director of the project.
“Raw device data provides the greatest opportunity as a researcher, but [the] greatest privacy risk. Part of what we do is make it as useful as possible with the least amount of risk,” Davenport says.
While studying polling location wait times might not pose privacy concerns, how far a voter has traveled to reach their polling location could be used to identify an individual.
“All incoming data is de-identified, and we have policies against reidentifying it,” Davenport says. “The protections we have against that include having as few people as possible working directly on the data.”
X-Mode said in an emailed statement that it obfuscates device user IDs and generalizes the devices via aggregation to make its data pseudonymous. Veraset, L2 Political, and the Center for Public Integrity did not return requests for comment by the time of publication.
Part of the challenge of de-identifying data so that it can’t be traced back to any individual is keeping it de-identified even when new data comes in, says Sophie Stalla-Bourdillon, a data governance and legal engineer at Immuta and expert adviser to the Center for New Data.
“We have to monitor the environment as we’re transforming the data,” she says. “This is part of why the Center for New Data is not listing any data publicly. Our intent is to publish research from the data without exposing any individuals to risk.”
To prevent accidental re-identification, the project won’t show researchers data for locations with fewer than 25 individual devices counted, such as polling stations serving jurisdictions with few voters. That’s a stricter standard of aggregate generalization than the one X-Mode applies to its incoming location data.
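The 25-device floor described above is a form of small-cell suppression. The following is a minimal sketch of that idea in Python, not the project's actual pipeline: the record layout (location ID, device ID, wait in minutes), the function name `aggregate_wait_times`, and the choice to report a mean are all assumptions made for illustration. Only the threshold of 25 comes from the article.

```python
from collections import defaultdict

# Threshold stated in the article; everything else here is hypothetical.
MIN_DEVICES = 25


def aggregate_wait_times(records, min_devices=MIN_DEVICES):
    """Summarize wait times per location, suppressing small locations.

    records: iterable of (location_id, device_id, wait_minutes) tuples.
    This layout is an assumption, not the project's real schema.
    """
    waits_by_location = defaultdict(dict)
    for location_id, device_id, wait_minutes in records:
        # Keep one value per device so repeated rows for the same
        # device do not inflate the device count for a location.
        waits_by_location[location_id][device_id] = wait_minutes

    summary = {}
    for location_id, waits in waits_by_location.items():
        if len(waits) < min_devices:
            # Fewer than the minimum number of distinct devices:
            # leave the location out of the output entirely.
            continue
        values = list(waits.values())
        summary[location_id] = {
            "devices": len(values),
            "mean_wait_minutes": sum(values) / len(values),
        }
    return summary
```

In this sketch a location with exactly 25 distinct devices is reported and one with 24 is dropped, which matches the "fewer than 25" wording. Only aggregate figures leave the function; individual device IDs never appear in the output.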
These safety measures are part of Observing Democracy’s ethical jiujitsu in using technology that critics often say is a fundamental part of the surveillance economy. Its real goal is to take the tracking technology that typically powers individualized advertising and flip it so that it serves pressing public needs. For instance, the nonprofit has also used smartphone location data to track COVID-19 infections after likely superspreader events, such as the Sturgis motorcycle rally in South Dakota and its smaller cousin, the Lake of the Ozarks’ BikeFest in Missouri, as part of its Covid Alliance project.
What can location data tell us about access to voting?
Davenport, co-executive director Ryan Naughton, and their team of high-powered advisers expect Observing Democracy’s data to result in serious policy analysis and hopefully real change.
The genesis of their desire to use more accurate data to influence policy lies partially in a research paper based on data from the 2016 election that wasn’t published until 2019. “Racial Disparities in Voting Wait Times: Evidence From Smartphone Data” used cell location data to show that residents of “entirely Black neighborhoods” waited 29% longer to vote, and were 74% more likely to spend more than 30 minutes at their polling place, than residents of “entirely white neighborhoods.” Two of the paper’s authors, Kareem Haggag and Devin Pope, are now among Observing Democracy’s 11 expert advisers.
By leveraging smartphone location data that’s already been collected, Observing Democracy should be able to provide legal and policy analysts with a far more robust set of data on a much faster timeline. That’s according to Nick Stephanopoulos, a Harvard Law School professor unaffiliated with Observing Democracy.
“Most surveys have a limited number of respondents who have self-reported, so their information is anecdotal,” says Stephanopoulos, who specializes in election law and has written extensively on districting, gerrymandering, campaign finance issues, and voting rights. “Even a big survey with tens of thousands of respondents is not going to cover all the polling places across the country. It just can’t compare to what Observing Democracy will be able to provide.”
Academics, political campaigns, courts, and voters could all benefit from knowing the costs of long wait times in line, Stephanopoulos says, noting, “I don’t believe plaintiffs have ever had data of this kind in litigation. Reports of very long lines in Ohio gave rise to sweeping litigation afterward, but evidence of long lines has never had data as granular and fancy as this.”
It might not come as a surprise that a project using voter location data is controversial before its first field test. Davenport says that even the use of the word democracy in the project name raised eyebrows from at least one expert he’s spoken with. Ultimately, however, he believes that the best way to combat the potential loss of legitimacy of U.S. elections, whether from foreign or domestic disinformation campaigns, is with expert analysis of the facts.
“Our organization was founded to take commercial data, enhance its privacy protections and security protections, and wrap it in a fabric of prestigious, nonpartisan uses for research and documenting history,” he says. “But it’s so hard to present this election data in a stance that has broad appeal when we’re all in our bubbles.”