Siri's Voice Is Already In The 2013 Cadillac XTS

Apple's instantly recognizable virtual assistant Siri, or at least her twin sister with the exact same voice, is now driving. Just ask Cadillac. Here's what happens when sonic brands collide.

Last week, at the Classic Car Club in lower Manhattan, engineers from Cadillac showed off the automaker's latest rides. In a 2013 XTS, Mike Hichme, a top engineering manager at General Motors, plugged in his iPhone, tapped a button on the steering wheel, and commanded that a song be played.

Suddenly, the instantly recognizable voice of Siri came over the car's speakers, filling the vehicle with her comforting cadence. That was fast, I thought—much sooner than the 12 months GM had anticipated such integration taking. Had the partnership Apple only recently announced—to bring the company's famous virtual assistant to Cadillacs, BMWs, and Mercedes—already come to fruition? "No, that's not Siri," Hichme assured me. "But it's basically the same voice."

"Basically" might not be the right word. Siri and the voice used by Cadillac's Cue telematics system are exactly the same. That's because both companies, and many others, work with speech-recognition company Nuance Communications, the $1.4 billion juggernaut that dominates the space. For Apple, which has made Siri the centerpiece of its iPhone and iPad devices (not to mention its multimillion-dollar, celeb-fueled marketing campaign), the use of her voice by other companies could represent a major conflict of sonic branding.

When you hear Siri in an unexpected place—say, in a 2013 Cadillac XTS—the verbal doppelgänger can cause confusion, as if Mystique from X-Men adopted Siri's characteristics but had none of her memory: the same voice with a different brain.

 

Cadillac's Cue system uses the same basic software, but the technology powering it is not nearly as robust as that powering Siri. For Apple, which protects its brand and assets like a seasoned Hollywood publicist, this is likely disconcerting, especially since so many companies are now adopting a similar female voice. As Hichme later told me by email, "The Nuance text-to-speech voice sounds the same with Apple, Cadillac, Ford Sync, and I think even BMW and Audi—that is, the woman's voice reading the prompts sounds the same. Almost everyone who does voice recognition uses Nuance's technology."

Sound is an integral though often overlooked part of advertising. Forms of sonic branding have been around for decades (think of NBC's three xylophone notes, which evoke its decades of programming), but brands are increasingly investing in it. It's the reason, for example, you hear the same four notes at the end of every recent AT&T ad. But the idea involves more than slapping a jingle at the end of an ad. It's about attaching an emotional trigger to every place people encounter a brand to help tell its story. Siri has become the latest voice of Apple, a company with a long history of effective sonic branding (Mac startup sound, anyone?). For her part, Siri now appears in ads with Samuel L. Jackson and Zooey Deschanel; she headlines major company conferences; and she's rapidly becoming the chief way that users interact with their mobile devices. Plus, she has a name: Siri knows you, and you know Siri.

(Of course, as expected from locked-down Apple, Nuance is allowed only to say that it "licenses technology to Apple for a variety of its products," which neither confirms nor denies that Siri herself derives from its technology.)

Mike Thompson, executive VP and GM of Nuance Mobile, believes sound can create an "incredibly human relationship" between a customer and brand. "We're trying to develop a voice and an experience that matches their brand," Thompson says.

He ticks off the common company considerations. "A big decision is always gender: Should it be a male or a female? Beyond that, what are the voice qualities that you want? Do you want the person to be helpful? Informative? What kind of style do you want it to have?" he says. "Designers love this stuff, because it creates an emotional attachment and loyalty to the device. For any given company, which has multiple devices, it's possible that you can begin to tie a brand through that voice across your fleet of cars or your array of consumer electronics."

Because of the significance of this sonic brand association, Thompson explains, the number of companies interested in this area is "growing dramatically." Nuance now offers corporations a "huge array of voices and languages," and the ability for third parties to build custom voices, which Thompson calls a "really scientific and artistic process."

Thompson cites automakers as an example. The "in-car experience might need to have a voice that is both informative but casual: You don't really want to have someone who you're joking around with and having fun, because the car is a place that requires safety and focus and the complete avoidance of driver distraction. The experience in a European high-end automobile might be a little bit different than a $15,000 car targeted at 20-year-olds."

And the more often a particular voice ties directly to a particular brand, the more likely it is that a company will want to own that voice itself, like a popular restaurant laying claim to its signature dish. "As the awareness of any given voice grows publicly, then you're absolutely right, the potential for companies which want to own that, I guess, may go up, so that no one else can get it," Thompson says.

Nuance is now considering whether to actually sell voices to brands, as opposed to licensing them. But for now, it's safe to say that much of the corporate world is ready to listen to what Nuance has to offer.

[Image: Flickr user beast love]

2 Comments

  • Scott F

    How would using the same voice recognition technology result in the same synthesized voice? This makes no sense. I fantasize that one day tech writers will know at least a little bit about the technology about which they write.