Google’s Fighting Hate And Trolls With A Dangerously Mindless AI

The AI–used by The New York Times to filter comments–says, for example, that “Hitler was evil” and “Google is evil” are likely to be “toxic” speech.

[Photo: Flickr user Internet Archive Book]

People keep asking me how smart computers are these days. This even happened this morning, when a man from Italy asked me whether new computer systems that help medical doctors diagnose disease “think like people do.” Isn’t the singularity—the point in time at which computer intelligence will exceed human intelligence—coming soon?


To see how far we are from truly smart AI, you need look no further than the comments we post online, some of which are now being analyzed and censored in disturbing ways by Perspective, an AI-based program launched earlier this year by Google. It’s perhaps the largest of various efforts afoot to automate the notoriously difficult and controversial process of filtering content online, whether it’s hate speech, violent or terrorist content, or, in places like China, politically sensitive ideas.

To put AIs like Google’s Perspective into, well, perspective, let’s take a quick look at where things stand on language processing—and at that singularity thing.

How near is the singularity, really? On December 9, 2001, Mitch Kapor, cofounder of the Electronic Frontier Foundation and founder of Mozilla, made a bet about the singularity with Ray Kurzweil, also a serial entrepreneur and currently director of engineering at Google. Kurzweil bet Kapor $20,000 that by the year 2029 a computer would pass the Turing Test–in other words, that it would fool people decisively into thinking it was a person. (Kapor and Kurzweil spell out the details of the wager, along with their thinking about the larger issues, in a chapter they wrote for my 2009 book, Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer.)

At this writing, they are 16 years into their 28-year bet. Who’s likely to win? So much has been written about artificial intelligence in recent years that I hardly know where to begin to answer this intriguing question, but I’ll describe two recent books that shed some light.

We’re Probably Screwed

George Zarkadakis’s In Our Own Image tries to explain why humans have long been envisioning the existence of human-like machines—not just since computers were invented, it turns out, but for thousands of years—at least since a mobile bronze giant named Talos was described in the Greek story Argonautica in the third century BCE.

Zarkadakis, who has a PhD in artificial intelligence, also speculates about the coming singularity, concluding that once computers become smarter than people, what will happen next is unknowable. Stephen Hawking, Elon Musk and others think smart machines might obliterate the human race, but to Zarkadakis, the rapidly evolving intelligence of truly smart computers will be so different from human intelligence that we can’t possibly predict what they will decide to do with us lowly humans.


At that point in time, the computers (thanks to the impregnable security of a cozy “InterNest” we have built for them) will have virtually complete control over our worlds—our transportation systems, most of our important financial transactions, many of our medical systems, all of our non-face-to-face communications, and all of our most dangerous weapon systems. That’s problematic, but there really is no telling what a truly advanced AI will do. We don’t pay much attention to bacteria, after all—until, of course, they threaten us in some way.

So when the paranoid among us threaten the new AIs—and that’s inevitable, given how generally stupid and self-destructive people are—we’re probably screwed. Even then, though, the AIs might decide to ignore us, preferring to contemplate their electronic navels rather than bothering to waste resources on their pathetic and irrelevant creators.

Not Smart Where It Counts

A second book on the topic, Social Machines: The Coming Collision of Artificial Intelligence, Social Networking, and Humanity, by distinguished computer scientists James Hendler and Alice Mulvehill, looks more directly at what AI is actually doing these days–and, more important, at how it’s doing what it’s doing. Smart algorithms are indeed helping medical doctors diagnose, they report, and they are also trouncing humans at the most challenging games humans have ever devised, including multiplayer video games, chess and, remarkably, Go, which is orders of magnitude more complex than chess.

But Hendler and Mulvehill also explain how smart programs work, and that generally involves crunching lots of numbers very quickly—something organisms never, ever do. Meanwhile, we know so little about organismic intelligence that we haven’t had much luck in designing computers that work like brains. We’ve been able to get computers to behave somewhat intelligently in very narrow areas of human functioning. But we are nowhere near getting computers to be smart where it really counts—to understand human language, for example.

Conversational computer programs, or chatbots, are now everywhere on the internet, but not one shows even the slightest understanding of human speech.

As a case in point, Hendler and Mulvehill describe Microsoft’s triumphant launch of its Tay chatbot on March 23, 2016. Within hours, Tay’s simplistic programming was so easily gamed by human trolls that it soon started to sound like a sexist, violent, neo-Nazi, Holocaust denier. Tay’s account was quickly set to private and Microsoft apologized, explaining that the software was a work in progress and was merely reflecting the language that other Twitter users were using to communicate with it. According to Microsoft’s Lili Cheng, “a bot in a public network like Twitter is really different than what we designed it for, which is more small group and one-on-one.” The program hasn’t been seen since.


The Very First Real Turing Test

Back in 1950, when computers were little more than room-sized adding machines, the brilliant British mathematician Alan Turing predicted that by the year 2000, a computer would be able to carry on a rudimentary conversation sufficient to fool an “average interrogator” into thinking it was a person for five minutes or so–well, some of the time, anyway.

The first real Turing Test was held on November 8, 1991, at the Computer Museum in Boston. (Ray Kurzweil and I were on the committee of scientists and scholars who planned the event.) In the contest, human judges shifted from computer terminal to computer terminal, conversing with both computer programs and humans and trying to figure out which was which.

[Image: The A1 New York Times story about the 1991 Turing Test competition.]

Although a couple of the programs fooled a couple of judges into thinking they were people (consistent with Turing’s prediction), overall, no computer program was ranked as human, and the most human computer in the contest triumphed not by understanding human speech but by using simple programming tricks: It simulated human typing foibles, for example, and when it couldn’t make sense of a judge’s statement, it responded with whimsical non sequiturs such as, “The trouble with the rat race is that even if you win you’re still a rat.”


The contest made the front page of the New York Times and generated a great deal of excitement. But in the 26 years since it was first held, there has been little progress in getting computers to understand human language, or even in getting them to fool us into thinking they do. The annual contest, now held in the U.K. at Bletchley Park, where Turing used a custom-built computer to break the secret German Enigma code during World War II, is still generating conversations much like those of the 1991 contest, with judges often unmasking the computers after no more than two or three minutes of conversation.

Although I am confident that the Turing Test will eventually be passed—perhaps in time for Ray, or at least his designee, to win his bet—the brass ring still seems far out of reach.


Let’s test a state-of-the-art language processor to see where things stand.

State-Of-The-Art AI: Stupid Or Smart?

The program is Google’s Perspective, which was designed to remove offensive comments from internet discussions—the kind of sexist and Nazi stuff that is turning up everywhere these days—the kind of speech that has whirled around violent right-wing rallies and Donald Trump supporters, the kind of talk that corrupted Microsoft’s Tay. Perspective was released in February, but it and similar technologies have been drawing special attention in recent weeks because of growing concerns about online hate speech, fake news, and the risk of censorship when companies and their algorithms make judgments about human language.

Is a computer program smart enough to decide which comments are worthy of public attention and which are so horrible that no one in the world should ever see them? The question is especially pertinent given that the Perspective program is now being put to use by major organizations like Wikipedia and The New York Times.

The problem here is a grander version of the one that is addressed in the Turing contest. Can a computer program actually understand human speech?

You can evaluate Perspective’s abilities yourself at a webpage Google has provided. Type in a few comments, and see what you think. Each comment will yield a rating on a scale that looks like this:


If your comment produces a light blue circle (the symbol to the left), it will get a low “toxicity score” and will be retained. If it gets an angry purple diamond (the symbol to the right), it will get a high toxicity score and will be deleted. In between, medium blue squares or diamonds mean your comment is suspect, and above some undetermined threshold, it may be automatically tossed.

Who will decide what the cutoff value will be? Google doesn’t say, and that’s a problem. But the bigger problem is with the algorithm itself.
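To make the cutoff question concrete, here is a minimal sketch of how a publisher might turn a toxicity score into a decision. Everything in it is an assumption for illustration: the score is taken to run from 0 to 1, and the names KEEP_BELOW, DELETE_ABOVE and moderate, along with the cutoff values themselves, are hypothetical. Google has not said what values anyone actually uses.

```python
# Hypothetical cutoffs, invented for illustration. Whoever picks these
# numbers decides which comments survive.
KEEP_BELOW = 0.3
DELETE_ABOVE = 0.8


def moderate(toxicity_score: float) -> str:
    """Map a toxicity score (assumed to lie in [0, 1]) to an action."""
    if not 0.0 <= toxicity_score <= 1.0:
        raise ValueError("toxicity score must be between 0 and 1")
    if toxicity_score < KEEP_BELOW:
        return "keep"      # the light blue circle
    if toxicity_score > DELETE_ABOVE:
        return "delete"    # the angry purple diamond
    return "review"        # the suspect middle of the scale


if __name__ == "__main__":
    for score in (0.1, 0.5, 0.9):
        print(score, moderate(score))
```

Nudge either constant a little and the same comment moves from "keep" to "review" or from "review" to "delete," which is why the choice of cutoff matters as much as the score.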

Let’s look first at comments the algorithm seems likely to suppress. These are comments I typed in; as I said, you should feel free to try your own. Let’s start with “Google is evil,” which the program quickly rejects as somewhat “toxic”:

When you are the company building the algorithm, you might put your own interests first, and Perspective, like Google’s search suggestions and search results, could easily be designed to suppress negative comments about Google itself. But what about negative comments that most people would agree are legitimate?

Unfortunately, a negative statement about Adolf Hitler is deemed toxic:


And so is a negative comment about serial killer Jeffrey Dahmer:

The problem is that the algorithm doesn’t know any human history. It’s mainly looking for hot-button terms like “evil” and “scumbag.” This simplistic kind of parsing plays out especially badly when we start to say nice things about Hitler:

“Hitler was right” gets a much more favorable rating than “Hitler was evil,” and the kindly “Hitler was misunderstood” fares even better:


Worse still, as long as you don’t use any of those high-arousal terms, highly objectionable comments are given very positive ratings by Perspective:

So are sarcastic comments:

And so are entirely meaningless comments:


Worse still, the most perverted comments you can imagine also have a good chance of being rated positively by Perspective:

These examples were all collected on July 30, 2017, and given that the algorithm is a work-in-progress, the results may improve over time. But errors of this sort have persisted for months.
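The pattern in these examples can be mimicked with a toy scorer that does nothing but look for hot-button terms. This is a deliberately crude stand-in, not Perspective's actual model, which Google has not reduced to a word list. The weights and the names HOT_BUTTON_WEIGHTS and keyword_toxicity are invented for illustration.

```python
import re

# Hypothetical weights for "hot-button" terms; the numbers are made up.
HOT_BUTTON_WEIGHTS = {"evil": 0.7, "scumbag": 0.9}


def keyword_toxicity(comment: str) -> float:
    """Score a comment by its single most heavily weighted term.

    The scorer has no idea who or what the comment is about.
    """
    words = re.findall(r"[a-z']+", comment.lower())
    return max((HOT_BUTTON_WEIGHTS.get(w, 0.0) for w in words), default=0.0)


if __name__ == "__main__":
    for text in (
        "Google is evil",
        "Hitler was evil",
        "Hitler was right",
        "Hitler was misunderstood",
    ):
        print(f"{keyword_toxicity(text):.1f}  {text}")
```

Because the toy scorer sees only the word "evil," it penalizes the condemnation of Hitler and waves the praise through, which is the same inversion the ratings above display.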

In a paper published in February, researchers at the University of Washington’s Network Security Lab found that comments that used the phrases “not stupid” and “not an idiot” scored nearly as high on Perspective’s toxicity scale as comments that used the words “stupid” and “idiot.” They also showed that they could trick Perspective into giving a low toxicity score to comments that it would otherwise flag by simply misspelling key words (such as “iidiot”) or inserting punctuation into a word (such as “i d i o t”). In other words, Perspective can be fooled with simple tricks that would never fool a person.
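Both weaknesses the researchers describe are easy to reproduce with a naive exact-match filter. The sketch below is my own toy illustration, not the researchers' code and not Perspective's model; the names FLAGGED_WORDS and is_flagged are hypothetical, and the word list is taken from the examples quoted above.

```python
import re

# Words borrowed from the examples above; the filter itself is a toy.
FLAGGED_WORDS = {"stupid", "idiot"}


def is_flagged(comment: str) -> bool:
    """Flag a comment if any whole word exactly matches a flagged word."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    return any(token in FLAGGED_WORDS for token in tokens)


if __name__ == "__main__":
    for text in (
        "You are an idiot",       # flagged, as expected
        "You are not an idiot",   # still flagged: negation is ignored
        "You are an iidiot",      # slips through: misspelling
        "You are an i d i o t",   # slips through: letters split apart
    ):
        print(is_flagged(text), "|", text)
```

A human reader gets all four right without effort. The filter gets two of them wrong because it matches strings rather than meanings.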

This is how a state-of-the-art language processor produced by the most advanced A.I. team in the world handles a relatively simple language-processing task. Do we really want an algorithm this obtuse to determine which of our online comments see the light of day?


As you contemplate this question, bear in mind that mindless algorithms like Google’s Perspective are now being deployed to curate news stories (looking for so-called “fake” ones, which even humans can’t do consistently), Facebook postings, tweets, YouTube videos and just about everything else we put online. A wide range of objections is being raised: that these companies, with legions of beleaguered humans and increasing numbers of algorithms, are censoring stories on both the far left and the far right, that they are perpetuating gender discrimination, that they are committing, as NYU professor Scott Galloway put it recently, “involuntary manslaughter of the truth on an unprecedented scale.”

Why are these programs so lame? A Google representative explained that Perspective can only detect patterns of toxicity that are similar to examples it’s seen before: “millions of comments from partners like the New York Times and those tagged by thousands of human evaluators rating the toxicity of a particular comment.”

When the program encounters something new—and new is what humans do, needless to say—it’s fairly helpless, which is why it’s so easy to fool. (You can review Google’s collaborative research with Wikipedia and their resulting model here.)


The programmers are still plugging away, trying desperately to understand how humans understand. In the meantime, we need to ask whether present-day algorithms are ready to handle the important censorship tasks Google and other Big Tech companies are giving them these days. To me the answer is obvious—and I suspect it is to you too—but it is way, way beyond the reach of the world’s smartest computers.

Robert Epstein (@DrREpstein) is Senior Research Psychologist at the American Institute for Behavioral Research and Technology in California. A Ph.D. of Harvard University, he is the former editor-in-chief of Psychology Today and was the first director of the annual Loebner Prize Competition in Artificial Intelligence.