At Microsoft’s annual developer conference Monday, CEO Satya Nadella showed a video of a more conversational, and more functional, Cortana assistant.
This new Cortana seemed able to follow the conversation with a fictional user and tap into different knowledge domains to provide information and help solve problems. It moved appointments around on the user’s calendar. It sent driving directions to the user’s car. And the user was able to talk in a natural way, instead of remembering and speaking certain combinations of words designed to trigger certain one-off transactions, like “Siri, set an alarm for 5:30 a.m.”
The personal assistants we’re familiar with today—like Alexa and Siri—represent an early phase in the evolution of the technology. The Holy Grail is a personal assistant so conversational that it feels like talking to a human. Computer science still has a long way to go to get there, but watching all the steps forward is exciting.
The Cortana that Nadella showed off to developers was using natural language assistant technology created by Semantic Machines, a small company Microsoft acquired last year that is developing what it calls “conversational AI.” The company had lined up some big names in AI research, too, including UC Berkeley professor Dan Klein, Stanford professor Percy Liang, and former Apple chief speech scientist Larry Gillick.
The first thing you might notice about an assistant using the Semantic Machines AI is that it’s able to understand context. This allows the user to start a dialogue about a certain task without having to continue repeating the task at hand with every command or question. In the demo video, the user is trying to set up a lunch meeting. The assistant suggests a restaurant, and the user asks “is there outdoor seating?” The assistant knows it’s being asked for further information about the restaurant that it’s just suggested. If the user says “will the weather be good?” the assistant knows it’s being asked about the day of the lunch meeting. It might go to an app that it knows has a weather forecast.
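To make the idea of carrying context across turns concrete, here is a minimal sketch in Python. It is a hand-written toy and not Semantic Machines' method: the `DialogueContext` class, the `resolve` function, the restaurant name, and the output strings are all hypothetical names invented for illustration. It shows only how a follow-up question can be filled in from what the conversation has already established.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DialogueContext:
    """Hypothetical record of what the conversation has established so far."""
    restaurant: Optional[str] = None   # the restaurant the assistant last suggested
    meeting_day: Optional[str] = None  # the day of the lunch being planned


def resolve(utterance: str, ctx: DialogueContext) -> str:
    """Turn a follow-up question into a fully specified request using context.

    Assumption: simple keyword matching stands in for real language understanding.
    """
    text = utterance.lower()
    if "outdoor seating" in text and ctx.restaurant:
        # "Is there outdoor seating?" refers to the restaurant just suggested.
        return f"lookup outdoor_seating for restaurant={ctx.restaurant}"
    if "weather" in text and ctx.meeting_day:
        # "Will the weather be good?" refers to the day of the meeting.
        return f"lookup weather_forecast for day={ctx.meeting_day}"
    # Without enough context, the only safe move is to ask.
    return "ask_user_to_clarify"


ctx = DialogueContext(restaurant="Cafe Example", meeting_day="Thursday")
print(resolve("Is there outdoor seating?", ctx))
print(resolve("Will the weather be good?", ctx))
```

In this toy, the user never repeats the restaurant or the day; both come from the stored context. Every branch here is written by hand, which is the kind of rule-writing that becomes unmanageable as the number of possible follow-ups grows.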
The machine has to be ready
The point is that during the contextual dialogue between the human and the machine, a lot of things can come up. And the machine has to be ready.
Most existing natural language personal assistants use artificial intelligence to understand the meaning and intent of the user’s words but then are guided by hand-coded rules to know how to carry out the task. Dan Klein told me Semantic Machines applies machine learning to that second part too, allowing the AI to learn from a wide range of data (including the user’s past requests) to decide what service to perform and how.
“You can imagine how many thousands or millions of these things there could be at every possible branching point of a dialogue,” Klein told me. “And it’s just impossible to create a system by hand that can anticipate these things correctly.”
“This is why machine learning is the only possible way to do it,” Klein said. Klein, who was cofounder and chief scientist at Semantic Machines, is still a professor at Berkeley.
Context is not an entirely new concept in digital assistants, but the ability to navigate such a wide universe of possible user intentions pushes the envelope.
The secret sauce is in the way the assistant is trained, Klein told me. “. . . the training data that you need to make a system like this possible is extremely complicated to invent,” he said. “It’s not just a matter of, let’s say, feeding it millions of text messages that you found on the web or things like that. It’s a very specific kind of thing that we had to come up with in order to train a system like this.”
Microsoft has big plans for the Semantic Machines technology. The company says it will be worked into Cortana and will eventually power conversations across all of Microsoft’s products and services, such as Office.
Andrew Shuman, Microsoft’s corporate vice president for Cortana, believes a conversational and context-aware assistant might eventually change the experience of working with Microsoft apps.
“We want it to be less cognitive load, less feeling like I have to go to PowerPoint for this or Word for that, or Outlook for this and Teams for that, and more about personal preferences and intents,” he said.
Among consumers, Cortana is used less than Alexa, Google Assistant, and Siri. But the competition for digital assistant dominance in the workplace is far from settled.
“Although speech assistants have a long way to go before they gain acceptance with enterprises, Microsoft must continue to invest in Cortana as it will likely become the next user interface and front end to the Microsoft 365 experience in future,” says Nick McQuire, VP and head of enterprise and AI research at CCS Insight.
“Above all, Cortana must also counter the early lead Google is having in this area with Assistant which is now integrating into G Suite, Microsoft’s fiercest rival in cloud and productivity,” McQuire said.