The Syrian civil war keeps plunging to new depths of unimaginable cruelty. On February 18, the regime of Syrian president Bashar al-Assad launched an aerial assault on Eastern Ghouta, an opposition-held enclave on the outskirts of Damascus where 400,000 residents have been living under a government-imposed siege since 2013. An estimated 500 civilians were killed in under a week.
Much of what the world knows about Eastern Ghouta’s plight comes from residents uploading videos from the area. On February 19, Eliot Higgins, founder of Bellingcat and an analyst who pioneered the use of “open-source” evidence in conflict investigations, compiled a playlist of YouTube videos taken in the enclave that day. The videos leave little doubt as to the horrors being inflicted on the area’s civilian population. The 26-video playlist depicts a child’s body being loaded into an ambulance, people screaming as they confront the rubble of destroyed buildings, and fighter jets diving through the empty sky.
YouTube hosts 4 million videos related to Syria that have been uploaded since the outbreak of the war in 2011, according to Keith Hiatt, vice president of the human rights program at Benetech, a technology nonprofit. But YouTube, whose first video in 2005 depicts one of the company’s cofounders in front of an elephant enclosure at the San Diego Zoo, wasn’t designed to be the world’s largest repository of war footage. Over the summer of 2017, YouTube introduced a machine-learning-based algorithm to flag videos for terms of service (ToS) violations. The algorithm’s purpose was to expedite the removal of propaganda videos that extremist groups like ISIS had posted—but it flagged a large volume of activist content for removal, too. Within a few days, some 900 Syria-related channels, including those run by Higgins and Bellingcat, disappeared from the platform.
The subsequent outcry led to media coverage in the New York Times and the Intercept. Many videos (including Higgins’s and Bellingcat’s) were eventually restored, and the pace of removals slowed. But activists say that they didn’t entirely stop. Between September and December, well after the issue drew high-profile media attention, some 68 YouTube channels that the web video storage and curation project the Syrian Archive had been tracking were taken offline, comprising over 400,000 videos and bringing the total number of deleted channels to 216. Other human rights monitors noticed videos continuing to vanish from the site through the beginning of 2018. “We download YouTube videos and import them into our system,” Shabnam Mojtahedi, a legal and strategy analyst with the Syria Justice and Accountability Center, said back in January, “and there have been times when I’m reviewing a video and analyzing it for the legal content, and I’ll hit refresh or come back two minutes later, and it will be off YouTube.”
Since January, all but 55 of those 216 channels have been restored. That’s only a partial victory, according to Hadi al-Khatib, the Archive’s founder and director. He says upwards of 200,000 videos are still offline—and that’s just the ones his organization knows about. Due to the sheer volume of Syria-related videos currently on the platform, and YouTube’s unwillingness to share information about the deletions, the purge erased some unknown number of videos whose existence or deletion never caught the attention of those in a position to raise their case with the company.
“There are many more removed channels than the number that we have, but the problem is that we don’t know them,” says Khatib, who added that some channels will be “gone forever unless YouTube publishes a report of all removed channels, or gives us a list of them.” Many of the deletions have been cyclical: One prominent example is Sham News, a citizen journalism project responsible for uploading 245,000 videos. Khatib says the channel has been kicked off YouTube and restored at least four times after activists protested to the company. Channels aren’t always restored in full, and sometimes come back with as much as 20% of their videos missing.
Activists say that deletions are still continuing, albeit at a much slower rate than over the summer. Even with over 160 channels coming back, “the process of reinstating channels is taking a long time,” says Khatib.
How Does Its Algorithm Work? YouTube Won’t Tell You
It’s also far from clear how YouTube’s algorithm really works, how effective it is, or what it’s programmed to look for—the company refuses to share information about the algorithm’s error rate or the total number of deletions to which the program has contributed. YouTube also won’t describe in any useful detail the learning factors that the algorithm uses to improve its performance.
The ongoing deletions reveal issues that activists and tech companies will face for decades. With governments concerned about the online spread of extremism and tech platforms wary of being perceived as messaging conduits for bigots and terrorists, it’s likely that ToS enforcement will become stricter. The introduction of machine-learning algorithms to police content means that companies now have the tools and incentives to mass-delete content, some of which could turn out to be of historical, political, legal, or moral value if it were allowed to remain online.
For now, the history of the Syrian civil war is still at YouTube’s mercy, and it’s unknown how much of it has already disappeared. “Think about the newsreel footage that was shot in the concentration camps when the Allies liberated them,” says Hiatt, explaining that those contemporary news reports are still crucial for promoting modern-day awareness of the Holocaust. “Imagine if that footage was owned by a private company that thought it would not be a good business move to make it public and they deleted it. That’s the situation we’re in.”
Syrian Civil War Is The “First YouTube Conflict”
Google, which is YouTube’s parent company, knows how significant its platform has been during the war. “The Syrian civil war is in many ways the first YouTube conflict in the same way that Vietnam was the first television conflict,” Justin Kosslyn, the product manager for Jigsaw, formerly called Google Ideas, said during an interview on the sidelines of September’s Oslo Freedom Forum in New York, where Kosslyn had just spoken. “You have more hours of footage of the Syrian civil war on YouTube than there actually are hours of the war in real life.” In 2016, Jigsaw developed Montage, a Google Docs-like application that allows for collaborative analysis of online videos. Kosslyn said the project was undertaken with human rights-related investigations in mind.
The value of YouTube’s Syria videos is indisputable, especially since the regime and other armed actors have closed off much of the country to journalists and human rights observers. Higgins and his colleagues proved beyond all doubt that Assad’s forces gassed a suburb of Damascus in August 2013, and a U.N. organization is now in the early stages of assessing YouTube’s Syria footage for its future use in war crimes trials. In December 2016, the U.N. General Assembly voted to establish the International Impartial and Independent Mechanism (IIIM) to assist in war crimes prosecutions related to Syria. In connection with the IIIM, Hiatt and his team at Benetech are developing software that can search and organize the estimated 4 million videos related to the conflict. The IIIM will facilitate the use of the videos in court if alleged human rights abusers ever face trial.
Videos from the conflict could prove critical even in cases where they violate the site’s ToS—ISIS propaganda videos, for instance, help identify members of the organization and explain its internal hierarchies. “The difficulty in this type of work is that the information put out there on social media by the perpetrators of the violence can also be used to hold those perpetrators accountable,” Mojtahedi notes.
YouTube sometimes hints at how much the world doesn’t know about the war. Higgins says he’s “fairly certain” there was a sarin gas attack in Damascus during the summer of 2017 that largely escaped attention. “Even investigating one incident can take a week of work, even if you have two or three people on it, and there are five or six incidents a day at least that are worth investigating in Syria,” says Higgins.
YouTube’s preferred fix for conflict-related ToS issues is for uploaders to provide contextual detail about what their videos depict, in order to clearly establish the content’s value and distinguish it from extremist material. But the videos are often the work of people in war zones who have limited internet access, don’t speak English, and don’t know the specifics of web platforms’ ToS. It’s possible that many uploaders no longer have reliable internet access or aren’t alive to provide new context for their videos—and then there’s the difficulty of informing people in a war zone about new ToS guidelines, or bringing seven years of videos into compliance.
“Which news agencies need to go back over years of content in order to update everything and add details to everything?” asks Khatib. “This is impossible.” There are several cases in which the Syrian Archive can’t convince YouTube to restore a channel because the group can’t reach an uploader to show them how to add context that meets YouTube’s ToS standards. If an uploader is imprisoned or killed, their content is at added risk of being permanently lost if YouTube deletes it. YouTube doesn’t have a codified process for third-party requests to restore content, instead depending on ad hoc lines of communication with activists and analysts. The Syrian Archive also says it hasn’t found a way to download more than 100,000 videos per channel off the platform, meaning that external archives can only preserve a fraction of videos from vulnerable high-volume uploaders like Sham News.
For a time, human rights groups experimented with building their own apps, intended to document atrocities, that were in line with the needs and experiences of people living through war. But they couldn’t compete with YouTube’s popularity, usability, and sheer scale. “The human rights community that works in digital tech has come to understand that creating new apps and channels doesn’t make sense for anybody,” says Alexa Koenig, executive director of the Human Rights Center at Berkeley Law School. “YouTube controls that space, and they will for the foreseeable future.”
Tighter Restrictions Could Be On The Way
YouTube says 400 hours of video are uploaded to the site every minute, and new developments, such as Germany’s stringent new social media hate speech law, make the removal of extremist content especially urgent for the company. Tighter restrictions could be on the way: On March 1st, the European Commission issued a recommendation that sites remove terrorist material within one hour of it being flagged by law enforcement. Google is inevitably accountable to shareholders and regulators—not to ordinary Syrians, who are some of the most powerless people on earth.
In the kind of machine-learning-based enforcement that YouTube introduced last year, an algorithm rapidly cross-checks millions of videos against a database of “hashes,” or image signatures that indicate a video’s content. It’s then left up to Google’s guidelines enforcement team, which will involve about 10,000 people by the end of 2018, to decide whether those videos should stay online or not. Both activists and YouTube say that it’s only in very rare cases that the algorithm automatically deletes content before a human moderator sees it.
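The pipeline described above can be sketched in a few lines. All names here are hypothetical, and a real system like YouTube's uses perceptual content fingerprints that survive re-encoding rather than the exact byte hash used below for simplicity; the point is the shape of the workflow, in which matches are queued for human review rather than deleted automatically.

```python
import hashlib

def signature(video_bytes: bytes) -> str:
    # Stand-in for a content fingerprint. An exact byte hash only
    # catches identical re-uploads; production systems use perceptual
    # hashes that tolerate cropping, re-encoding, and watermarks.
    return hashlib.sha256(video_bytes).hexdigest()

def triage(uploads, known_hashes):
    """Flag uploads whose signature matches a known hash.

    Matches go into a human review queue; nothing is deleted here,
    mirroring the human-in-the-loop workflow described above.
    """
    review_queue = []
    for name, data in uploads:
        if signature(data) in known_hashes:
            review_queue.append(name)
    return review_queue
```

The key design point is that the algorithm's job is only to narrow millions of uploads down to a candidate queue; the keep-or-delete decision is meant to rest with the human enforcement team.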
But no one outside of Google knows what the YouTube algorithm is even looking for. “We don’t have a list of the factors that go into making decisions with the algorithm,” says Dia Kayyali, an advocacy coordinator with Witness, a nonprofit dedicated to video documentation of human rights-related events that has been in constant communication with YouTube over the Syria removals. “Nothing like that exists as far as I know. And that’s a problem.”
Fast Company asked YouTube about the algorithm’s performance, requesting data on the program’s error rate, the number of videos tagged, the number deleted as a result of the algorithm’s flags, and on how the rate of flaggings and deletions has changed over time. We also asked about which learning factors the algorithm incorporates in order to improve itself. The company declined to go into specifics.
“Last June we announced technological improvements to the tools our reviewers use in video takedowns and we are continuing to improve these,” YouTube said in a statement. “We are now using machine learning to flag violent extremism content for human review. With the massive volume of videos on our site, sometimes we make the wrong call. When it’s brought to our attention that a video or channel has been removed mistakenly, we act quickly to reinstate it.”
On February 2, Norah Puckett, a senior lawyer on Google’s litigation team, gave a presentation on the company’s ToS enforcement practices at a Santa Clara University conference. She spoke generally about ToS removals, saying that in 2016, YouTube was “getting something like 250,000 flags a day and removed through the various means that we enforce our policies something like upwards of 90 million videos.” These numbers, which predate the introduction of the machine-learning algorithm, don’t distinguish between human-generated flags and those produced by automated enforcement tools. Notably, 250,000 flags a day comes out to just over 91 million a year, suggesting a close correspondence between flaggings and removals.
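The comparison between Puckett's two figures is simple to verify: annualizing 250,000 flags a day lands within about 1.5% of the "upwards of 90 million" removals she cited.

```python
flags_per_day = 250_000
flags_per_year = flags_per_day * 365   # 91,250,000
removals_cited = 90_000_000            # "upwards of 90 million"

# The annualized flag count exceeds the cited removal count by only
# about 1.4%, which is why the two figures read as closely matching.
ratio = flags_per_year / removals_cited
```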
Without Basic Info, How Do You Start The Conversation?
Nate Cardozo, a senior staff attorney at the Electronic Frontier Foundation, believes YouTube hasn’t shared information that could help meaningfully analyze their algorithm’s performance. “I’m sure Google knows the error rate, but we don’t. Without that, it’s hard to even start this conversation,” says Cardozo. He thinks the entire process for removing videos is overly opaque. “We know next to nothing about how these takedowns work. We don’t know the total number, we don’t know the percentage that were tagged by algorithm vs by a human . . . We know nothing, and that’s not unique to YouTube by the way. That’s almost universal across the industry.”
Problems would linger even if the algorithm had an error rate of zero. The Syrian civil war is a complex conflict involving dozens of armed groups and a half-dozen state militaries, while Google’s Terms of Service are a thicket of legalese aimed at balancing corporate self-interest with an entire world’s worth of regulatory demands. Figuring out whether a given video from Syria violates a given tech company’s ToS would be a difficult feat for an educated human being to pull off. As long as the incentives call for deleting more content rather than less, ToS enforcement is likely to ensnare videos of potential value. Algorithms just expand the scope and speed of the process.
Google knows that it hosts material of immense importance that still pushes the boundaries of its ToS. It’s less clear that the company has grappled with the basic tension between automated enforcement and the preservation of human rights-related material and historical memory, something that might require more dramatic changes than just the restoration of a few deleted channels. “They understand that there is evidentiary value,” Dia Kayyali of Witness says of YouTube’s Syria videos. “But does that translate to the people who are writing the algorithms? Does it translate to the people designing the machine learning? I really don’t know. My guess is no.”