Crimson Hexagon, a Boston data analytics company, raised some eyebrows last week when it announced that its access to the firehose of user data from Facebook and Instagram had been reinstated—after being suspended and investigated by the social media giant for alleged misuse of data for surveillance purposes.
The reinstatement, which began earlier this month, followed “several weeks of constructive discussion and information exchange,” said Dan Shore, Crimson’s chief financial officer. But the companies didn’t specify the results of the inquiry or explain why access was restored, raising more questions about how Facebook and other platforms police third parties like Cambridge Analytica and Crimson Hexagon.
Crimson boasts of having gathered the largest public repository of social media data, which it has used in work for clients that include Adidas and Anheuser-Busch InBev, as well as the Department of Homeland Security, the State Department, and clients in Turkey and Russia, including “a Russian nonprofit with ties to the Kremlin,” the Wall Street Journal reported in July.
Its work has drawn additional questions because it was co-founded in 2007 by Gary King, a prestigious Harvard social science professor who is also helping manage a Facebook-sponsored academic research project involving a massive amount of Facebook data. Professor King declined to comment for this story.
In May, Facebook said it had suspended 200 third-party apps for improper collection and misuse of its user data. The company has been investigating thousands of apps since explosive reporting documented an illicit data harvest involving tens of millions of Facebook users by Trump campaign contractor Cambridge Analytica. More suspensions are expected to be announced soon.
On July 20, after receiving questions from a Journal reporter, Facebook said it suspended Crimson’s API access and began an investigation into allegations that user data—data that had been properly collected—had been used for surveillance. Facebook was also investigating a claim made in the Journal report that Crimson Hexagon had inadvertently accessed some private Instagram data.
Crimson has also relied extensively on Twitter data, and has collected so many tweets (over a trillion since 2010) that Twitter relies on the company’s services to analyze its own network. In response to questions about Crimson Hexagon, a Twitter spokesperson reiterated the company’s policy: “We prohibit the use of our data products for surveillance purposes, or for any purpose that is inconsistent with our users’ expectations of privacy. Period. These rules apply to all users of our developer platform, not just government entities. We have invested heavily in our data compliance program over the last several years and we rigorously enforce our rules against violating developers, up to and including permanent suspension of access to Twitter data in any form. If we learn of any developer breaking our rules, we will investigate and take appropriate action.”
Even before the most recent firestorm over Silicon Valley’s work for government agencies, civil rights advocates had raised alarms about such software. In 2016, Facebook limited Geofeedia’s access to Facebook and Instagram data after an ACLU investigation found that the service, sometimes called “TweetDeck for cops,” was helping law enforcement fusion centers around the country conduct local surveillance of Black Lives Matter activists. The move came after a similar action by Twitter, which also banned the New York City company Dataminr from selling reports on Twitter data to government intelligence agencies. (Twitter has a stake in the firm.)
In March 2017, Facebook followed Twitter’s lead and added language to its Platform Policy that specifically prohibited developers from using Facebook data “to provide tools that are used for surveillance,” though it didn’t specify how it defined surveillance.
The policy puts Facebook in a tough position. Even if Crimson Hexagon were found to be in violation of Facebook’s policies, shutting down its access could be a death knell for a billion-dollar data industry, made up of startups and large defense contractors whose relationships with Facebook and other services are crucial to government and commercial clients.
“To lock down what is publicly available and ingested through an API would put social listening and analytics into a tailspin,” Kieley Taylor, managing partner and global head of social at advertising agency GroupM, told AdExchanger last month. “[T]he precedent of going deeper into the contracts when the data appears to be captured in the right way is troubling. It’s also unclear to me how this would be broadly enforced.”
How Facebook’s app investigations work
Facebook says that it typically has no information about its developers’ contracts with customers, or how those customers use Facebook data. But it requires developers to submit statements about how they plan to use the data they request, and it conducts both automated and manual reviews to monitor developers’ compliance. It also conducts broader investigations if it receives reports that a developer may be in violation of policy, at which point it may suspend an app.
Facebook’s app investigations are treated on a case-by-case basis, according to a person familiar with the company’s approach. Facebook may request legal certifications that data has not been improperly used, and may conduct on-site audits of computers and servers. In some cases, Facebook will discourage companies like Crimson Hexagon from working with certain entities whose use of data may raise concerns, or request that those companies wind down existing contracts, while maintaining an ongoing conversation to ensure that a company is in compliance.
Spokespeople for both companies would not comment on whether Facebook made such requests to Crimson Hexagon. Citing proprietary business and security concerns, a Facebook spokesperson declined to provide more details about the results of the investigation into Crimson’s contracts, and referred questions to Crimson Hexagon. Crimson Hexagon declined to comment beyond a blog post and referred questions to Facebook.
As with the contracts in question, Facebook’s legal certifications with developers like Crimson may also include confidentiality agreements. In June 2016, when Facebook wrote to Aleksandr Kogan, the developer at the center of the Cambridge Analytica scandal, demanding that he delete illicitly obtained data, Kogan was required to sign a non-disclosure agreement. (Facebook has said it has since released him and others in that case from any confidentiality agreements.)
Big questions about enforcement
In the company’s blog post, Shore wrote that several of Facebook’s questions “focused on a small number of our government customers, which represent less than 5% of our business.” The company has analyzed environmental protests in Turkey, according to its website, and provided software to the U.S. State Department’s digital anti-terrorism unit.
According to a person close to the firm, its contracts in Russia and Turkey were not current, and were not active at the time of the Journal story. “To our knowledge, no government customer has used the Crimson Hexagon platform for surveillance of any individual or group,” Shore wrote. In a previous blog post, the company said it “only collects publicly available social media data that anyone can access” and that it “does not collect private social media data.”
The company will step up its vetting by monitoring government customers “on an ongoing basis to ensure the public’s expectations of privacy are met,” Shore wrote. “As governments and government-sponsored organizations change how they use data, we too must change.”
Initially pitched as an antiterrorism tool, software like Crimson’s has spread quickly across all levels of law enforcement, from immigration agencies to police departments. In a March 2017 lawsuit against the Memphis Police Department, the ACLU of Tennessee alleged that the police were using “social media collator” software to spy on local Black Lives Matter protesters in violation of a consent decree. The ACLU said in a July filing that in addition to using tools like Geofeedia and NC4, police officers used a fake Facebook profile to spy on Black activists and to monitor “chatter” around protest activities. The police surveillance encompassed “any of the organizations that arose out of Ferguson” around the country, and was shared with police, school officials, the U.S. Department of Justice, the Department of Defense, utility companies, FedEx, and AutoZone.
Crimson Hexagon’s pledge to monitor its customers echoes demands by civil rights advocates for more oversight and transparency. “We need these companies to tell us how they’re doing without activists having to work day in and day out to remind them of their obligations to their users and to society,” Malkia A. Cyril, the founder and executive director of the Center for Media Justice, told Wired in March. “We need the will to come from within to take transparency one step further, document your enforcement, and tell us how enforcement is going through independent audits.”
Sandy Parakilas, a former Facebook operations manager focused on privacy and policy issues, questioned the existing protocols for protecting user data from surveillance. Facebook can do investigations and companies can conduct their own audits; government agencies dealing with U.S. personal data are subject to certain privacy guidelines. Still, he said, “there’s no external oversight of third party companies collecting data on U.S. citizens for analysis on behalf of the government. The ‘new safeguards’ are presumably internal to Crimson Hexagon, i.e., grading their own homework.”
Not that Cambridge
Like those of Cambridge Analytica and other data-focused firms, Crimson’s techniques have links with universities. Crimson Hexagon’s approach was initially developed in Cambridge, Massachusetts, at Harvard’s Institute for Quantitative Social Science, as the “hyper-accurate estimation, classification, and quantification of unstructured data,” then-CEO Scott Centurino told Fast Company in 2010. The company’s name derives from Harvard’s official color but also from Jorge Luis Borges’s short story “The Library of Babel,” in which the speaker describes a “Crimson Hexagon,” a special, possibly mythical room that’s key to decoding the astronomical library.
Cambridge Analytica had no official ties to academic institutions, but its techniques, like its name, were derived from research that began in Cambridge, England, at Cambridge University’s Psychometrics Centre. The now-defunct data company relied upon Kogan, then a university researcher, to collect data on millions of Facebook users. Facebook and British regulators have said they are investigating the university’s role.
Crimson co-founder Gary King, who holds a distinguished University Professorship, is also the co-founder and leader of Social Science One, an initiative created in partnership with Facebook in the wake of the Cambridge Analytica scandal. The project’s aim is to provide researchers with Facebook and Instagram data in order to improve research around disinformation and social media’s impact on elections.
The project’s researchers will have access to a “privacy-protected” petabyte of Facebook data as part of a new model King has helped develop to boost academic-industry partnerships. Findings will be published without review by Facebook, he says, and the project will receive no funding from the social network. “The data collected by private companies has vast potential to help social scientists understand and solve society’s greatest challenges,” King wrote in a blog post. “But until now that data has typically been unavailable for academic research.”
In an email, King said that while he had developed Crimson’s technology, he had no day-to-day involvement in Crimson Hexagon, and referred questions to a company spokesperson.
In his blog post, Crimson’s Shore wrote that the firm was “hopeful that our extended dialogue with Facebook will lead to a strengthened and deepened Facebook partnership that will help our customers draw increased value from public online information on both of our platforms.” In a brief statement, Facebook said, “We appreciate their cooperation and look forward to working with them in the future.”
Facebook is undergoing a massive push to reduce abuse of its platform and its data, trying to stem false news, concealed political ads, abusive microtargeting, and other frauds. And it continues to investigate thousands of apps for data abuses, said a spokesperson.
Facebook itself is also under investigation. The Federal Trade Commission, which placed Facebook under a 20-year privacy probation in 2011, is now again examining Facebook’s data practices, after a 2017 government-mandated audit found nothing wrong. And it’s facing lawsuits from users angry about how the platform had collected and distributed their data in ways that Cambridge Analytica and others exploited.
Facebook has moved to dismiss the lawsuits, and in an email, a Facebook spokesperson said that the “claims have no merit, and we will continue to defend ourselves vigorously.”
In papers filed Wednesday in a class-action lawsuit in California, Facebook sought to stop a request for information about its agreements with developers regarding user data. The company claimed that users have “no viable claim” to sue because they had consented to sharing their data. “Plaintiffs’ counsel has admitted that this is not a data breach case, in which users’ data was obtained against their will,” the company wrote.