Rand Hindi, the CEO and cofounder of Paris-based AI startup Snips, believes our voice assistants are misleading us.
Amazon’s Alexa and Google Assistant, Hindi says, have trained consumers to think that every voice command must be processed and stored online, even if the task has no inherent connection to the internet. With Snips, he’s hoping to prove them wrong. For the past couple of years, Snips has been building an offline voice assistant for individual developers and enterprises that don’t want to depend on Big Tech. Now, the startup is working on its own consumer hardware, including a set of smart speakers and a base station for processing voice commands locally.
“We’re making a very strong bet on people’s willingness to trade, basically, a recognized brand for privacy,” Hindi says. “We’re hoping that privacy will be a major differentiator for our product. But of course nobody has done it before, because no one has been able to do it before.”
Hindi recognizes that the odds are stacked against Snips, an unknown newcomer, in its battle against some of the biggest companies in the world. But he’s also confident in the startup’s technology. And as people begin reckoning with tech giants’ immense power–made possible in part by the vast amount of data they collect on consumer behavior–Snips’ offline voice assistant could arrive at just the right time.
Voice without the cloud
Virtually every other voice assistant, Hindi says, was built around the assumption of limitless cloud storage and computing power, as delivered by giant data centers. To create an AI that works offline, Snips trimmed away features and capabilities that it didn’t believe were necessary.
With speech recognition, for instance, Hindi believes there are diminishing returns to the amount of data an AI relies on, so Snips stores less of it.
“What we noticed is that there’s actually a logarithmic increase in performance with the amount of data, which means that after a certain point, adding more data from other users doesn’t actually bring much more value,” he says. “It’s the 80-20 rule. You’re getting 80% of the precision with 20% of the data.”
To allow for natural language understanding, Snips devised a way to train its AI without keeping records of what users are saying. When developers want to add new skills, Snips can hire humans through services like Amazon Mechanical Turk to generate thousands of sample spoken commands, which it uses to train the AI. Hindi says this method allows for 90% accuracy without collecting any user data.
“What we ended up concluding, quite frankly, was that there is actually little value in trading privacy for training data, because you can get equivalent training data without actually getting real data from users,” he says.
Snips even rewrote elements of TensorFlow, Google’s popular open-source framework for machine learning models, to make it more lightweight.
“Throughout the entire stack of voice, we basically had to find a way to really squeeze every single piece of performance we could, to a point where today, we’re actually able to run an Alexa-like assistant on the Raspberry Pi.”
That’s a bit of an exaggeration. While Snips can technically run on super-basic computers like the Raspberry Pi, an all-purpose AI on par with Alexa or Google Assistant requires more computing power. And without sending commands through the cloud, controlling other devices such as coffee makers or televisions becomes trickier.
To that end, Snips designed its platform to be modular. Developers can offload individual elements such as wake word processing or natural language recognition onto another device, or they can use a more powerful computer to handle a broader range of voice commands. The idea is to have a system that becomes more useful and interconnected over time, without having to rely on the cloud. “What we are building at the moment is effectively this network of devices in your home that are completely interoperable,” Hindi says.
Snips could have avoided all these complications by using cloud computing and just deleting customers’ data instead of hoarding it, similar to what another startup called Mycroft is doing. But Hindi says that approach introduces ongoing costs for companies that want to license the technology. It’s also more susceptible to hacking or government intrusion.
“Anybody who claims to offer privacy while processing your voice in the cloud is, quite frankly, at best clueless and at worst lying,” he says.
Hardware and crypto
All of this has laid the groundwork for a foray into consumer hardware. In May, Snips announced a concept called the Snips Air, consisting of a smart speaker base station and satellite microphones that users can install throughout a home. Hindi says the AI technology is already in place, and Snips has more than 12,500 registered developers who’ve built more than 20,000 skills on the platform. Now the startup just has to build the hardware on which to run it all.
Here’s where things get even less conventional: Although Snips has raised $22 million in venture capital to date, it’s turning to cryptocurrency to fund its hardware. The startup will hold an initial coin offering, in which investors can purchase tokens with existing cryptocurrencies including Bitcoin and Ether. Third-party developers will then be able to sell their voice skills in exchange for tokens, using a blockchain to process the transactions.
As well as capitalizing on the cryptocurrency craze, Hindi says that the approach fits with Snips’ privacy focus. By processing skill purchases with a decentralized currency, Snips avoids having to process credit cards and learn who its customers are.
“To be honest, we tried to find another way, but the moment you have to handle credit cards, you basically have to know identity, and this was just a very, very simple solution,” he says. “I understand blockchain is still not a very mature technology, there’s still a lot of risk around it, and there’s still a lot of education to be done, but it does solve the privacy issue for us.”
That said, you’ll still be able to buy the hardware itself with a credit card–assuming it becomes an actual product. The Snips Air hardware is still in the concept stage–the images Snips has shown are just renderings–and the startup has no experience building consumer devices. There’s also no guarantee that the initial coin offering will generate the $30 million that Hindi says Snips would need to bring the product to fruition.
Still, Hindi says Snips isn’t just trying to flip its technology to a larger firm. The focus now is following the vision of a private voice assistant. And on that, he doesn’t see much alignment with tech giants.
“I think it’s very difficult, because the companies that are truly excited about voice are companies who are still stuck in this 1990s business model of data,” he says. “Today, it feels like we have a much, much bigger opportunity to be the private-by-design alternative to whatever exists out there.”