Go to Google Instant. Search for porn. Even if you’re at work. Don’t worry. Nothing will come up.
Just don’t hit “enter.”
“We strictly apply a narrow set of removal policies for pornography, violence, hate speech, and terms that are frequently used to find content that infringes copyrights,” a Google spokesperson tells Fast Company by email. Good thing. Or is it?
To many, such precautions amount to overt censorship. Along with Google’s blacklist, an algorithm helps the search giant keep “objectionable content” from appearing automatically in its results. For example, when a user types “porn” on Google Instant, the company’s results-as-you-type search engine, results will not display unless the user manually hits the search button. The same goes for searching for a racial slur, a four-letter word, or content that Google believes may violate copyright.
“It’s important to note that removing queries from auto-complete is a hard problem, and not as simple as blacklisting particular terms and phrases,” Google says.
The policies guiding its blacklist and removal algorithm are vague at best. The company is not forthcoming about the terms and phrases it chooses to blacklist, about the hundreds of signals that might cause its algorithm to flag certain queries and results, nor is it clear how the blacklist works in accordance with Google’s
In recent weeks, critics have blasted Google after it began removing completions for websites such as BitTorrent, RapidShare, and MegaUpload–services often associated with pirated content. But perhaps more surprising than the queries Google chose to censor, as the critics pointed out, were the terms and phrases it left uncensored, including searches for other file-sharing sites, as well as a slew of generally disturbing queries (see: “How to kidnap a child”).
During a recent meeting with Ben Gomes, the lead engineer on Google Instant, a Google spokesperson explained that, in the case of pornography, the algorithm looked at not simply the words in one’s query but also the potential results to see whether they were too pornographic.
“It’s not acceptable for us to suggest it to you–to complete it for you,” Gomes added.
The Google spokesperson said the algorithm is “not targeted to a particular site,” but it’s understandable why those censored might feel singled out, and why others have questioned the company’s motivations and stance on search neutrality. It doesn’t help that Google has declined to go into specifics about why the blacklist and algorithm flag certain queries and not others.
“The whole thing is comically bad,” says Simon Morris, VP of product management at BitTorrent, one of the services censored on Google Instant. “They’re still suggesting The Pirate Bay on Instant–it’s ridiculous.”
Morris ran through a number of other strange exceptions to Google’s policy of censoring content that infringes copyright. “Type in X-U-N-L, and all of a sudden, Xunlei comes up,” Morris says, referring to the Chinese BitTorrent client. “Google invested in Xunlei! What’s going on here?”
Google acquired a small stake in Xunlei in 2007, when the New York Times reported Google had paid about $5 million for a 4% stake in the company. Why would Google censor BitTorrent and not Xunlei and The Pirate Bay, two very similar file-sharing platforms? Google couldn’t say for the record.
“I mean, come on! It’s silly, it’s frustrating, it’s sad–the whole thing is really disingenuous,” Morris continues, shaking his head in disbelief. “You’re going to blacklist us, and not The Pirate Bay?”
Google’s results fall into three categories. First, there are the queries and results that Google Instant will not censor–results and suggestions will appear regardless of how many letters one types. The second category includes phrases and terms that Google will neither suggest nor display until the full term is entered. For example, when a user types “BitTorrent,” Google will not suggest an auto-completed phrase, nor will it display any results for “BitTorren” without the final “t”–the screen will remain blank until the full word is typed, and then Google will show results.
Lastly, there are the terms and phrases that Google will not display, suggest, or auto-complete through Instant, unless the user actually hits the search button–or “enter.” “Porn” is the obvious example, but other notable exceptions include file-sharing sites such as FilesTube and RapidShare.
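To make the three categories concrete, here is a rough sketch of the behavior as observed from the outside. It is not Google's implementation, which the company has not disclosed. The term lists contain only the examples named above, and the `MIN_PREFIX` cutoff (how many letters must be typed before a partial match blanks the screen) is purely a guess made for illustration. The sketch models only whether results display, not which completions are suggested.

```python
# Illustrative model of the three observed categories on Google Instant.
# All names and thresholds here are hypothetical; Google has not
# published how its blacklist or algorithm actually works.

# Category 2: nothing displays until the complete term is typed.
FULL_TERM_ONLY_TERMS = {"bittorrent"}

# Category 3: nothing displays until the user hits enter or the search button.
ENTER_ONLY_TERMS = {"porn", "filestube", "rapidshare"}

# Assumption: short prefixes such as "bit" are too ambiguous to block.
MIN_PREFIX = 4


def _is_blocked_prefix(text, term):
    """True if text is a long-enough prefix of term (or the term itself)."""
    return len(text) >= MIN_PREFIX and term.startswith(text)


def shows_results(typed, pressed_enter=False):
    """Return True if results would display for what has been typed so far."""
    text = typed.strip().lower()

    # An explicit search always returns results, whatever the category.
    if pressed_enter:
        return True

    # Category 3: blank screen for the term and its longer prefixes.
    if any(_is_blocked_prefix(text, term) for term in ENTER_ONLY_TERMS):
        return False

    # Category 2: blank screen for partial input, results for the full term.
    for term in FULL_TERM_ONLY_TERMS:
        if text != term and _is_blocked_prefix(text, term):
            return False

    # Category 1: everything else displays as the user types.
    return True
```

Under these assumptions, `shows_results("BitTorren")` comes back false while `shows_results("BitTorrent")` comes back true, and `shows_results("porn")` stays false until `pressed_enter=True` is passed.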
There is a slew of other inexplicable exceptions to Google’s censorship system. Last week, BoingBoing featured a disturbing list of queries that Google still displays and auto-completes, including searches about kidnapping children, making poison gas, and cheating on one’s taxes. The chart went viral, and a week later, Google stopped suggesting, auto-completing, and displaying results for “how to kidnap a child” queries.
Was this a response to bad press? Did Google manually add this phrase to its blacklist? Or was its algorithm somehow coincidentally tweaked to censor that particular result?
Again, Google declined to comment specifically for the record and could only offer the following statement: “We get more than one billion searches each day, and, because of this, we take an algorithmic approach to removals to deal with the massive scale at which we operate.” The spokesperson also explained that, if the hundred-plus signals that guide its algorithm were readily accessible, it would leave the engine susceptible to people gaming the system–results tweaked to avoid censorship or to take advantage of the algorithm.
Of course, we’re not against censoring queries about how to kidnap children–it’s understandable why that would fall under Google’s policy against violence. However, when it comes to its policy on copyright infringement–when a range of similar file-sharing services (including one Google has a financial stake in) are censored or left uncensored for no apparent reason–Google’s role in the process is anything but transparent.
During my talk with Ben Gomes, the top Google engineer, I asked how the algorithm and blacklist could have such jarring exceptions. In particular, I asked why Google doesn’t censor Megavideo.com, a website infamous for offering pirated TV shows and movies, and a service run by MegaUpload, one of the sites that Google blocks on Instant.
Gomes did not offer an answer, and the Google spokesperson in the meeting said, “I don’t know … I don’t know the specifics.”
Some are not satisfied with that answer.
“This whole idea that this is an effective way to govern the Internet–it’s just not an informed strategy by Google,” says BitTorrent’s Simon Morris. “This is a band-aid that’s been offered up as some sacrificial lamb in pursuit of some other type of relationship–I hope they’re really earning nice brownie points for doing it.”
This isn’t the first time Google’s role in censorship has been questioned or theorized about. Many times, Google has been accused of manually adjusting its search results. The Google spokesperson sent me a few recent examples of these conspiracies, including what some claimed were censored results for queries such as “Climategate” and “Islam is.”
“Just like our search algorithms, our auto-complete removals are imperfect and change regularly,” said the spokesperson. “It’s an imperfect solution.”