Ask Joshua Montgomery what’s wrong with smart speakers like the Amazon Echo and Google Home, and you’ll likely get a cynical answer.
Montgomery is the CEO of Mycroft, which for the past few years has been building an open-source alternative to big tech’s voice assistants. He doesn’t trust any of those companies–not Google, nor Apple, nor Amazon–to protect people’s privacy or act in users’ best interests.
“When you look at the history of big tech, and what they’ve done with user data, it’s full of privacy issues. It’s full of user agency issues. It’s full of marketing. It’s full of self-enrichment at the expense of the user,” Montgomery says. “I don’t have any confidence at all that they’re not going to use those same techniques and tactics in their new technology.”
Last Thursday, Mycroft announced its first consumer product, the Mark II smart speaker. Like Amazon’s Echo and the Google Home, the Mark II will have a far-field microphone array for answering voice commands from across the room, along with a touch screen for visual responses, like the one on Amazon’s Echo Show. (The Mark I, which shipped last year, was more of a rough draft for hackers, built by hand from Raspberry Pi boards.)
But unlike big tech’s smart speakers, Mycroft won’t store any voice data on its servers unless you opt into a program that improves speech recognition for open-source voice projects. Users who do share their data can rescind it at any time. Mycroft has already raised more than double the $50,000 it was seeking to fund the device on Kickstarter and plans to ship by the end of this year.
While the goal of creating a privacy-first voice assistant is noble, maintaining tight control over your personal data always comes at a cost. In this case, Mycroft’s all-or-nothing approach to retaining voice data will make speech recognition more challenging, putting the company at an inherent disadvantage against companies that hoover up as much audio as they can.
Controlling The Flow
To be clear, Mycroft will treat its speaker’s always-listening element the same way as its larger rivals. The speaker will listen for a wake word using a processor on the device and won’t start uploading audio until it’s been triggered.
The key distinction is what happens to the audio after it heads to the cloud. Although Amazon and Google both let users review and delete their voice recordings, neither company offers a way to automatically delete data over time. And when users try to wipe data in bulk–an option only available through Google’s and Amazon’s websites, not their mobile voice assistant apps–both companies discourage it. A pop-up on Amazon’s site says deleting the data “may degrade your experience,” while Google shows a pop-up saying it can “make Google services more useful to you” if you keep those voice samples on file.
Nino Tasca, a senior product manager for Google Assistant, said in an email statement that the company uses individual voice samples to improve wake word recognition, determine which user is talking, and learn how users pronounce words and phrases.
Amazon did not answer specific questions, and instead provided an email statement describing in broad terms how the company uses voice data. “Alexa uses your voice recording to answer your questions, fulfill your requests, and improve your experience and our services,” the company says. “This includes training Alexa to interpret speech and language to help improve her ability to understand and respond to your requests.”
Neither response explains why users don’t at least have the option to automatically delete some voice data over time or anonymize their voice samples. Such practices are the norm for Apple, which only retains voice samples for six months and keeps them anonymous while training Siri’s speech recognition.
Not that any of those approaches satisfy Montgomery, who worries that all these companies will become laxer about privacy over time. He also wonders if data could be de-anonymized to satisfy business needs or uphold law enforcement requests–particularly in countries like China where companies have proven willing to sacrifice privacy in exchange for reaching more customers.
“Based on the past performance of these companies as they’ve deployed services, I think it would be foolish to anticipate them becoming more private over time,” he says. “The best way to make sure that your data is kept private is to make sure that your data is gone.”
Say That Again
The downside to Mycroft’s hardline stance on privacy is that it limits the company’s ability to improve speech recognition. Instead of training its AI from scratch, Mycroft is using open data sets from Mozilla, which come from volunteer voice recordings and other open sources such as transcribed TED talks. That amounts to a fraction of what Google and Amazon are gathering.
“Does big tech have a data advantage? Absolutely,” Montgomery says. Even so, he argues that with Mozilla’s reach and concerted efforts by the open-source community, it’s still possible to build a compelling alternative to mainstream voice assistants.
But even as Montgomery downplays the issue, Mozilla acknowledges that there are significant disadvantages to creating a voice assistant without massive amounts of personal voice data.
“Generally, for services like this, the bigger pool of data you have, the easier it is for you to develop a model that is really effective and really responsive,” says Marshall Erwin, Mozilla’s director of trust and security. “And that ultimately is why companies like Google and Amazon have been ahead of the game on this, because they have many different channels that can allow them to create large pools of data for this purpose.”
Erwin’s stance toward tech giants isn’t nearly as hostile as Montgomery’s. He applauds Google and Amazon for at least showing users their voice data and giving them the option to delete it all. In some ways, he says, that’s even better than Apple’s approach of hiding and anonymizing what it’s collected.
“I think transparency might be the best option if you really think the data needs to be collected and retained for some amount of time in the first place,” he says.
Erwin would still like to see those companies be more up-front about how long they hang on to voice data, provide retention limits for users who want them, and encourage people who care about privacy to use those management tools. But he also puts some of the burden on users, who should know that the data that Amazon and Google collect from their virtual assistants isn’t all that different from what they gather through users’ activity in other formats.
“If I buy detergent with my Alexa, ultimately, I’m creating a similar type of risk compared to when I go on Amazon and buy detergent,” Erwin says. “That’s just a choice that people need to be informed about, to know that these companies have vast pools of data about them.”
Open Voice’s Future
Montgomery points out that there’s more to Mycroft than just privacy. The company plans to make its speaker more customizable than an Echo or Google Home, for instance by letting users set up custom wake words. And while Apple, Google, and Amazon all use smart speakers to prioritize their own online services, Montgomery envisions Mycroft as a neutral party.
“As these technologies become a significant part of how we interact with technology, the question really becomes, ‘When I ask this device a question, am I getting the best answer for me, or am I getting the best answer for whatever company developed the tech?’” he says.
Still, Montgomery admits that Mycroft won’t appeal to everyone, at least not as a consumer product. But selling speakers directly to consumers isn’t Mycroft’s primary business model anyway. The real money is in offering open-source AI to businesses, which might be even more wary about turning over audio logs and user data to tech giants.
To that end, one of Mycroft’s backers is Jaguar Land Rover, which invested $110,000 in the company to explore voice control in automobiles. Microphone maker Shure is also an investor. While Mycroft doesn’t have any integration deals yet, Montgomery says he’s in talks with several companies that want to have their own branded voice assistants. They aren’t interested in ceding more control over their customers to Amazon or Google.
“They are absolutely 100% certain that that will end badly,” Montgomery says.