At Pandora, Every Listener Is A Test Subject

In the eight years since Pandora launched, its music recommendation technology has evolved into a tightly woven blend of human and machine intelligence. We take a behind-the-scenes look at their big data experiments with chief data scientist Eric Bieschke.

Sarah Young listens to Pandora constantly. From the moment the 27-year-old hairstylist wakes up in the morning, she’s tuned into one of the service’s infinite personalized radio stations. When she’s finished getting ready, she flips her laptop shut and heads into work, where an iPad streams Pandora all day. On her way to and from the salon, she listens to the “’90s new wave” station on her phone. Lately, she’s noticed more repetition.


“One time, they played this Portishead song five times in three hours,” says Young. “Things seem to get more repetitive in the mid-afternoon.”

Young doesn’t know it, but she’s a lab rat. Alongside 70 million others, she scurries around in virtual space, her every move monitored by the omnipresent eye of Pandora’s massive and complex music intelligence algorithm. Just like in a real laboratory, scientists–in this case, data scientists–constantly tweak variables to see how she responds. To Young, the change in repetition felt subtle. Coincidental, even. It wasn’t. Changes like this are part of Pandora’s ongoing experimentation in how best to deliver music to its listeners.

At any given moment, the company’s data gurus and engineers are running dozens of experiments on its vast user base. Listeners are split into segments, each of which is exposed to a slightly different experience. One group will hear mostly familiar songs, while the other will discover more new music. Do listeners want more local artists? Do they tolerate live recordings? Acoustic versions? What difference does the time of day make? How about geography? How much weight should the thumbs-up button have? Twiddling dozens of virtual knobs, Pandora’s data scientists make small tweaks to the algorithm and watch what happens.
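To make the idea of splitting listeners into segments concrete, here is a minimal sketch of one common way to do it: hashing a listener ID together with an experiment name. This is an illustration only. The function name, the experiment name, and the segment labels are hypothetical, and nothing here is confirmed to be how Pandora assigns its own test groups.

```python
import hashlib

def assign_segment(listener_id, experiment, segments):
    """Deterministically map a listener to one segment of an experiment."""
    # Hashing the experiment name together with the listener ID means the
    # same listener can land in different segments of different experiments.
    key = f"{experiment}:{listener_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return segments[int(digest, 16) % len(segments)]

# Hypothetical experiment: one group hears familiar songs, the other more new music.
segments = ["more_familiar", "more_discovery"]
counts = {name: 0 for name in segments}
for i in range(10_000):
    counts[assign_segment(f"listener-{i}", "repetition-test", segments)] += 1

print(counts)  # the two segments should come out roughly equal in size
```

Because the assignment depends only on the hash, a listener stays in the same group for the life of the experiment without anyone having to store a lookup table.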


On Repeat

One of the biggest questions the company has is the same one that has long been critical to radio programmers: How frequently should you repeat a given song or artist?

“At the moment, we’re running about two dozen tests that attack repetition in different ways,” says Eric Bieschke, chief scientist and VP of Playlists at Pandora. “Some of them are about increasing exposure to new music. Some of them are about cycling songs in and out of a rotation so you’re actually hearing the same concentration of really good music, but it’s spaced out in such a way that people don’t perceive it as a repetitious experience.”

In Young’s case, the increase in repetition was palpable. That probably means that when it first occurred, she didn’t tap the “thumbs-down” or “skip” buttons any more frequently than usual. Or, she may have just been lumped into a larger segment of users based on other data about her listening behavior. From Pandora’s standpoint, the most important piece of data from the experiment is that Young didn’t leave. Or in the parlance of the company’s analytics team, the “return rate” was not affected. Day in and day out, she keeps coming back.
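As a rough sketch of what "the return rate was not affected" could mean in practice, the example below compares the return rate of a control segment with that of a test segment using a standard two-proportion z-test. The numbers are invented, and the choice of test is an assumption made for illustration. The article does not say how Pandora's analytics team evaluates its experiments.

```python
import math

def return_rate(returned, exposed):
    """Share of listeners in a segment who came back."""
    return returned / exposed

def two_proportion_z(returned_a, exposed_a, returned_b, exposed_b):
    """z-score for the difference in return rate between segments A and B."""
    p_a = return_rate(returned_a, exposed_a)
    p_b = return_rate(returned_b, exposed_b)
    pooled = (returned_a + returned_b) / (exposed_a + exposed_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / exposed_a + 1 / exposed_b))
    return (p_b - p_a) / std_err

# Invented figures: control segment vs. a segment that hears more repetition.
z = two_proportion_z(8_000, 10_000, 8_050, 10_000)

# |z| below roughly 1.96 means no significant difference at the 5% level.
print(f"z = {z:.2f}, affected = {abs(z) >= 1.96}")
```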


“Every week, we’re rolling out these experiments on real people in the real world,” says Bieschke. “We don’t have to spend a lot of time guessing what people love about music. We can actually just run experiments and find out.”

Over the years, Pandora has conducted thousands of these tests. Some of them last for months. Some run for a few weeks and are quickly abandoned. Some of them permanently impact the experience for a small subset of users, while others are fed back into the master algorithm that sits atop the company’s vast array of mini-recommendation engines.

Some of the repetition experiments, for example, wound up yielding insights clearcut enough to have a broad impact on the way Pandora works for everybody. That Young noticed more repeated songs during working hours was no coincidence. Pandora’s data has shown that while users are often hungry for new music, they’re less tolerant of discovery while they’re at work.


Perhaps it’s because Pandora serves as background music while they focus on important, productive tasks and they can’t spare the mental bandwidth to appreciate something unfamiliar. Whatever it is, the data has spoken and as a result, Pandora will tend to play more familiar, often repetitive playlists during work hours.

“The reason terrestrial radio repeats so much of the same darn music is because that is the thing getting people coming back to the radio station,” Bieschke says. Indeed, the rate at which users return to Pandora is one of the key metrics that every one of these experiments is aimed at boosting. As much as we all profess to hate the repetitive nature of radio, it apparently does the trick for the analytics team intently staring at the user retention needle. “It’s the most annoying thing about terrestrial radio, but it’s absolutely by design,” he says.

Things like repetitiveness, song duration, and the order of tracks are obvious things to measure, but Pandora’s experiments can go much deeper, thanks to its rich collection of musicological data. The team has found, for instance, that people who gravitate toward instrumental music are more receptive to discovering new things. Thus, people who put on a classical station or a station based on a Miles Davis song will hear more new music than people who listen to pop and rock stations. The distinction is so pronounced that stations based on instrumental hip-hop will yield more serendipitous moments of discovery than those based on lyric-heavy rap tracks.


In The Genes

Pandora Radio started with a simple premise: If you take academically trained music experts and ask them to tag songs with dozens of pieces of metadata, you can build a system with a deep understanding of music, right down to specific tonal qualities, instruments played, rhythmic nuances, and hundreds of other details. If you then feed that uniquely human knowledge into a computational algorithm, you can make previously impossible things happen. Someday, you might even be able to put radio DJs out of business.

“The Music Genome Project is absolutely a huge differentiator for people,” says Tim Westergren, who cofounded Pandora in 2000. “There are trained musicians that come in every day and sit with headphones, entering numbers. I think it’s the closest you can get to a friend recommending you music.”

What these certified musicologists do when they sit down at their stations each month is tag each song they listen to with hundreds of specific musical attributes, or “genes.” The result is a multi-point vector for every song in the database, with each track containing anywhere from 150 to 500 attributes, depending on the genre.


In a 2011 interview with Ars Technica, Pandora’s chief musicologist Nolan Gasser illustrated the process with an example:

One of our classical analysts will get up a recording of Scriabin’s “Third Piano Sonata” on their computer terminal, along with the full classical genome on the same screen. He or she will then go carefully through each of the four movements individually, so that the entire sonata will likely take well over an hour to fully analyze; but that’s okay, because the beauty of it is that once the work is analyzed, it’s complete in our database, and we don’t need to re-analyze the recording by Glenn Gould and compare it to that by Ashkenazy, for example.

Over the years, Pandora’s musicologists have applied this process to millions of songs across numerous genres. The resulting data is the secret sauce that tells Pandora what to play every time you start a new station or decide you don’t like a given track. According to Pandora, this blend of human and machine intelligence is what gives their recommendation engine a leg up over competing services.
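The idea of a song as a vector of "genes" can be sketched in a few lines. The example below uses four made-up attributes in place of the 150 to 500 the article describes, and it compares songs with cosine similarity, which is one common choice for vector data. The gene names, the scores, and the similarity measure are all assumptions made for illustration, not details of the Music Genome Project.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two attribute vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical genes: acoustic guitar, vocal harmony, syncopation, electronic texture.
catalog = {
    "seed_song":    [4.0, 3.5, 1.0, 0.5],
    "folk_track":   [4.5, 3.0, 1.5, 0.0],
    "techno_track": [0.0, 0.5, 3.0, 5.0],
}

def nearest(seed_name, catalog):
    """Return the catalog song whose gene vector is most similar to the seed."""
    seed = catalog[seed_name]
    others = {
        name: cosine_similarity(seed, genes)
        for name, genes in catalog.items()
        if name != seed_name
    }
    return max(others, key=others.get)

print(nearest("seed_song", catalog))  # folk_track
```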

The system was the brainchild of Westergren and Will Glaser, both of whom joined forces with Jon Kraft in 2000 to start Savage Beast Technologies. The company initially focused on offering music recommendation services to businesses like Best Buy and AOL, but nearly failed before repositioning itself to target consumers instead. The newly rebranded Pandora Media made a bet that it could not only find value in a large audience, but also in the infinite stream of data those people would willingly feed into its core product. Their hunch was correct. Pandora Radio went live in January 2005 and the listeners–as well as the data–started pouring in.


Mixing Human And Machine Intelligence

Today, the Music Genome Project is just one piece of Pandora’s music recommendation mega-engine. The waves of millions of users that have passed through the service over the years have all been collectively training the system, one play, skip, thumbs-up, or thumbs-down at a time. This behavior has generated what Bieschke refers to as Pandora’s second “gargantuan pool of data.”

“When we launched, there was no data from our listeners,” he says. “Inherently, the thing we were building from was just the music. Today, our listeners are creating data far faster than our internal team of musicians.”

The human-powered Music Genome Project still sits at the heart of what Pandora does, but the manual process of bulk-listening by ear has an inherent problem that user-generated data doesn’t: scalability. There’s just too much music in the world for a team of human beings to sit down and thoroughly describe all of it.


To offset this disadvantage, Pandora employs its own in-house machine listening technology, much like Google’s new All Access music subscription service does. But unlike Google and other newly risen competitors, Pandora can merge machine listening with nearly a decade of human intuition to create a deeper understanding of the music its service spins. As far as machine listening has come in the last few years, as Westergren puts it, the technology “has not even come close to the sophistication of the ear.”

Talk to anybody who specializes in online music discovery, and they’ll tell you the same thing: The best approach is not strictly algorithmic, nor is it solely based on human smarts. Instead, the most effective way to connect people with a series of songs they’re sure to love is by weaving together both approaches: machine learning techniques and good, old-fashioned human brains.

“We’re never purely doing the musician-musicologist-expert technique and we’re never doing the pure data scientist-machine learning-engineering technique,” explains Bieschke. “We pull all of these people together–engineers, data scientists, musicians, musicologists, curators–and put them all in the room and have them come at the problem from different directions. Sometimes the insight comes from people with music expertise and sometimes the insights come from the people looking at the pure data. Oftentimes, the true breakthroughs come when we’re crossing those two worlds.”


Pandora relies on what’s known as an ensemble-style recommendation system to power its music discovery. That is, it combines a number of different statistical methodologies to form one complex monster of an algorithm. More accurately, it operates as a collection of algorithms, fueled by dozens of mini-recommenders.

“We’ve got 54 different individual recommenders that take entirely separate statistical approaches to how to recommend music to specific people,” explains Bieschke. “The master algorithm sits on top, looks at all the other recommenders, and not only is it paying attention to what the collective wisdom is across all the individual statistical techniques, it’s also learning which specific techniques work for you. It looks and figures out that on your Daft Punk station, you really like recommenders A, B, and C and on your Kinks station, you like recommenders C, D, and F.”

While many of these statistical observations find their way into Pandora’s master algorithm, those that don’t are still of value to the system, Bieschke explains.


“There’s no damage to adding new techniques to our system,” he says. “As long as they’re the best algorithm for one person, it’s worth adding. The worst that happens is it works perfectly for that one person, and for everybody else it will just ignore it.”
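A toy version of the arrangement Bieschke describes, a blender that sits on top of several recommenders and learns per-station weights from feedback, might look like the sketch below. The class name, the two stand-in recommenders, the scores, and the multiplicative update rule are all hypothetical. The article does not disclose how Pandora's master algorithm actually weighs its 54 recommenders.

```python
from collections import defaultdict

class MasterBlender:
    """Toy ensemble: keeps a separate weight per recommender for each station."""

    def __init__(self, recommender_names, learning_rate=0.5):
        self.names = list(recommender_names)
        self.learning_rate = learning_rate
        # Every station starts out trusting all recommenders equally.
        self.weights = defaultdict(lambda: {name: 1.0 for name in self.names})

    def blend(self, station, scores):
        """scores maps recommender -> {song: score in [0, 1]}; returns the top song."""
        weights = self.weights[station]
        totals = defaultdict(float)
        for name, song_scores in scores.items():
            for song, score in song_scores.items():
                totals[song] += weights[name] * score
        return max(totals, key=totals.get)

    def feedback(self, station, scores, song, liked):
        """Boost recommenders that backed a liked song; shrink those behind a disliked one."""
        weights = self.weights[station]
        for name, song_scores in scores.items():
            score = song_scores.get(song, 0.0)
            signal = score if liked else -score
            weights[name] *= 1 + self.learning_rate * signal

# Two stand-in recommenders that disagree about songs "x" and "y".
scores = {"A": {"x": 0.9, "y": 0.2}, "B": {"x": 0.1, "y": 0.7}}
blender = MasterBlender(["A", "B"])

first = blender.blend("Daft Punk", scores)                 # "x": A's favorite wins at first
blender.feedback("Daft Punk", scores, first, liked=False)  # listener thumbs it down
second = blender.blend("Daft Punk", scores)                # "y": B is now trusted more here
other = blender.blend("Kinks", scores)                     # "x": other stations are unaffected
print(first, second, other)
```

A recommender that helps on only one station costs the others nothing in this setup, because its weight simply shrinks wherever its picks go unrewarded.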

As Users Grow Up, So Does Pandora

Aside from the Music Genome data, Pandora’s biggest asset is probably its age. As it approaches a decade of existence, the pool of data from which it draws insights only gets bigger. In fact, Bieschke says that the data explosion is so relentless that it “becomes a technology challenge to just figure out how to sift through in some sort of comprehensible way.”

To help manage the flood, the team relies on big data favorites like Hadoop and Hive. They also swear by a large-scale graph computation tool called GraphChi, although Bieschke is hesitant to divulge exactly how it’s used at Pandora.


As the service’s methodology evolves, so too do the listening habits of its millions of listeners. It wasn’t something the founders were necessarily aiming for when they launched, but the passage of time has handed Pandora another valuable insight: A long-term view of how people’s tastes evolve throughout their lives.

“If you look at what somebody has been listening to from 2005 to 2013, you can visualize their relationship with music throughout their lives,” Bieschke says. “Their high school years and then their college years and post-college years. With the data we’ve got, we can sort of single out individual people and see where their musical tastes are going and how they’re evolving over time.”

Amy Webb is one such listener. The digital strategy consultant and author has been listening to Pandora since it launched. At the time, she was a single, twentysomething journalist and adjunct college instructor with a penchant for George Michael and other relics from her 1980s youth. Today, her obsession with George Michael lives on, but her tastes have otherwise evolved, as have her life circumstances. She now runs her own business and has a family of her own.


A self-professed geek with a technical background, Webb is keenly aware of how Pandora learns from her behavior and has spent years actively training some of her favorite Quick Mix stations. Several of them are based on artists from the ’80s and ’90s like The Beastie Boys, R.E.M., and Alice in Chains, while others represent newer musical interests. For instance, she keeps a well-trained station based on a Mumford & Sons song, which serves as a child-friendly playlist for car ride sing-alongs with her husband and their young daughter. These are the types of changes that Pandora’s music intelligence machine will pick up on over time as it tries to understand who Webb is and which songs should be streamed to her. But even as new artists find their way into Webb’s life, her old favorites will keep popping up.

“People often return to their roots,” says Bieschke. “The things they listened to growing up become very important to them. All of these sorts of things about people’s lives moving forward and their musical tastes changing weren’t things we were thinking about when we launched, but it seems very obvious now that we’re looking at almost a decade of people’s listening behavior.”

If Pandora’s data scientists are tweaking her playlists based on long-term life changes, she hasn’t noticed. But like Sarah Young and other listeners, Webb has detected a palpable increase in repetition.

“I’ve trained a few of the stations pretty well,” says Webb. “But I’ve noticed over the last year or so that I’m not getting introduced to new artists or songs as much as in the past. I listen to a lot of Motown, and it seems as though the catalogue isn’t as expansive as it used to be.”

For Webb, Pandora frequently serves as background music at work, which is when Pandora’s data experiments show users to be the least receptive to unfamiliar things. It’s possible that, as a result of recent algorithmic tweaks, she’s getting more repeat songs at her desk during the day.

The Future: Why Smarter Devices Mean Smarter Radio

There’s another reason why Webb might be seeing more repetition. Time of day isn’t the only variable Bieschke and his team use to try different algorithms on users. Increasingly, they’re finding that the type of device a user listens on is an important indicator of how they’re listening, which informs what kinds of music they might want to hear.

“If I know you’re listening on an iPad or you’re listening on an Android phone or a Samsung device, or a Ford Sync car, there are huge amounts of information there about the type of music you want to hear,” Bieschke says. “There are real behavior changes for people based on their device and their environment. If they’re at home and they’re cooking and the device is in another room… If we play something bad enough, they’ll go in the other room and turn it down. It’s a very textured sort of data landscape. The thumb-down you get on a Samsung player is very different than the thumb-down you get at a desktop computer during working hours. They mean different things.”

For Bieschke, knowing the name of the device is just the beginning. He eagerly awaits a future in which more and more intelligence can be drawn from the hardware itself. Pandora’s mobile apps don’t use the accelerometer in smartphones to detect listening contexts (jogging, as opposed to driving, as opposed to sitting), but they’ve experimented with features like this during internal hackathons. Tests like these are extremely important for the company, because most of Pandora’s listening happens on mobile today, and the company has been proactive about integrating with a variety of connected devices beyond tablets and phones.

But even the accelerometer idea feels slightly old-fashioned in 2013. Far more exciting to Bieschke and many in his field is the prospect of a future in which one’s phone can broadcast ambient, contextual data that can be used for music discovery in a public setting. For example, imagine walking into a cafe that’s using Pandora to stream music. The proprietor’s tablet could detect the presence of other Pandora users in the room and adjust the playlist accordingly. If there are five such listeners in the cafe, Pandora could calculate a blend of the common Music Genome attributes found in each of their user profiles and pick songs that are statistically the most likely to please most of the people in the room. The results may not always be perfect, but they’d sure beat the whim of the FM radio DJ randomly selected by the guy who happened to be working behind the counter.
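The cafe scenario can be sketched as a small calculation: average the listeners' taste profiles, then pick the song that best matches the average. Everything in the example below is hypothetical, including the three attributes, the profile values, and the dot-product scoring. It illustrates the idea of a blend and does not describe any feature Pandora has built.

```python
def blend_profiles(profiles):
    """Average the attribute vectors of everyone in the room."""
    count = len(profiles)
    return [sum(column) / count for column in zip(*profiles)]

def match_score(song, taste):
    """Dot product: higher when the song is strong where the room's taste is strong."""
    return sum(s * t for s, t in zip(song, taste))

def pick_for_room(profiles, catalog):
    """Return the catalog song that best matches the blended taste of the room."""
    taste = blend_profiles(profiles)
    return max(catalog, key=lambda name: match_score(catalog[name], taste))

# Hypothetical attributes: acoustic, electronic, vocal-forward.
listeners_in_cafe = [
    [0.9, 0.1, 0.8],
    [0.7, 0.3, 0.6],
    [0.2, 0.9, 0.4],
]
catalog = {
    "folk_ballad": [1.0, 0.0, 1.0],
    "club_track":  [0.0, 1.0, 0.2],
    "indie_pop":   [0.6, 0.5, 0.7],
}

print(pick_for_room(listeners_in_cafe, catalog))  # folk_ballad
```

A plain average lets the majority outvote the lone electronic fan at the third table, which is the trade-off the article hints at when it says the results may not always be perfect.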

As phone hardware gets smarter, wearables become more fashionable, and sensors proliferate, the possibilities for music programming and discovery will know fewer and fewer limits. Such fantasies, Bieschke concedes, will face obstacles such as concerns over user privacy and, at least initially, the inherent limitations of processors and battery life. But as the tech continues to evolve, so too will privacy norms and consumer sentiment. To technologists like Bieschke, this hyper-connected future is not only inevitable, but preferable to the reality we know today. In the meantime, it’s fun to daydream about the possibilities–as long as the dream features a good soundtrack.

Interested in where radio is headed in the 21st Century? It’s a topic we’re tracking here at Co.Labs. Follow along via our story tracker about the future of radio.

[Image: Flickr user Art Bromage]


About the author

John Paul Titlow is a writer at Fast Company focused on music and technology, among other things. Find me here: Twitter: @johnpaul Instagram: @feralcatcolonist