This year marks 70 years since Alan Turing published his paper introducing the concept of the Turing Test in response to the question, “Can machines think?” The test’s goal was to determine if a machine can exhibit conversational behavior indistinguishable from a human. Turing predicted that by the year 2000, an average human would have less than a 70% chance of distinguishing an AI from a human in an imitation game where who is responding—a human or an AI—is hidden from the evaluator.

Why haven’t we as an industry been able to achieve that goal, 20 years past that mark? I believe the goal put forth by Turing is not a useful one for AI scientists like myself to work toward. The Turing Test is fraught with limitations, some of which Turing himself debated in his seminal paper. With AI now ubiquitously integrated into our phones, cars, and homes, it’s become increasingly obvious that people care much more that their interactions with machines be useful, seamless and transparent—and that the concept of machines being indistinguishable from a human is out of touch. Therefore, it is time to retire the lore that has served as an inspiration for seven decades, and set a new challenge that inspires researchers and practitioners equally.

The Turing Test and the popular imagination

In the years that followed its introduction, the Turing Test served as the AI north star for academia. The earliest chatbots of the ’60s and ’70s, ELIZA and PARRY, were centered around passing the test. As recently as 2014, the chatbot Eugene Goostman was declared to have passed the Turing Test by tricking 33% of the judges into believing it was human. However, as others have pointed out, the bar of fooling 30% of judges is arbitrary, and even then the victory felt outdated to some.

Still, the Turing Test continues to drive popular imagination. OpenAI’s Generative Pre-trained Transformer 3 (GPT-3) language model has set off headlines about its potential to beat the Turing Test. Similarly, I’m still asked by journalists, business leaders, and other observers, “When will Alexa pass the Turing Test?” Certainly, the Turing Test is one way to measure Alexa’s intelligence—but is it consequential and relevant to measure Alexa’s intelligence that way?

To answer that question, let’s go back to when Turing first laid out his thesis. In 1950, the first commercial computer had yet to be sold, groundwork for fiber-optic cables wouldn’t be published for another four years, and the field of AI hadn’t been formally established—that would come in 1956. We now have 100,000 times more computing power on our phones than Apollo 11, and together with cloud computing and high-bandwidth connectivity, AIs can now make decisions based on huge amounts of data within seconds.

While Turing’s original vision continues to be inspiring, interpreting his test as the ultimate mark of AI’s progress is limited by the era when it was introduced. For one, the Turing Test all but discounts AI’s machine-like attributes of fast computation and information lookup, which are among modern AI’s most effective features. The emphasis on tricking humans means that for an AI to pass Turing’s test, it has to inject pauses in responses to questions like, “do you know what the cube root of 3434756 is?” or, “how far is Seattle from Boston?” In reality, AI knows these answers instantaneously, and pausing to make its answers sound more human isn’t the best use of its skills. Moreover, the Turing Test doesn’t take into account AI’s increasing ability to use sensors to hear, see, and feel the outside world. Instead, it’s limited simply to text.

To make AI more useful today, these systems need to accomplish our everyday tasks efficiently. If you’re asking your AI assistant to turn off your garage lights, you aren’t looking to have a dialogue. Instead, you’d want it to fulfill that request and notify you with a simple acknowledgment, “ok” or “done.” Even when you engage in an extensive dialogue with an AI assistant on a trending topic or have a story read to your child, you’d still like to know it is an AI and not a human. In fact, “fooling” users by pretending to be human poses a real risk. Imagine the dystopian possibilities, which we’ve already begun to see with bots seeding misinformation and the emergence of deepfakes.