You music lovers out there probably think we're living in a Golden Age. iTunes, Pandora, Rhapsody: music distribution and discovery couldn't get any better, right? With the proliferation of music sites and apps, we must have reached some sort of saturation point, the telos of digital music technology.
But spend a bit of time talking to Brian Whitman, cofounder of The Echo Nest, and you realize that we're really in a digital music Stone Age. Sure, we've come a long way, but there's still plenty we can't do--our recommendation engines are limited, as is our ability to sift information automatically from songs (to tell the sex of a singer just from his or her voice, for instance). The Echo Nest, a five-year-old company devoted to aggregating, indexing, using, and sharing vast troves of music data, just announced a collaboration with Columbia University's LabROSA (Laboratory for the Recognition and Organization of Speech and Audio) on something called the Million Song Dataset, which is free to use for non-commercial music research.
Let's begin with music recommendation. What's wrong with Pandora? Repetition. Even Pandora cofounder and Chief Strategy Officer Tim Westergren admitted recently that the company is working on the repetition. Pandora's site declares that there are 800,000 songs and counting in its database. Not a negligible number, by any means. But The Echo Nest has 30 million. "It's great for a top 40 radio experience," Whitman tells Fast Company, giving Pandora credit where due. "But if you want to dig deep down into a lot more music, you need some automated discovery platform." Whereas Pandora proudly employs people to manually go through music and classify it, The Echo Nest, says Whitman, "understands the world of music automatically." And not just how it sounds.
The Echo Nest crawls the web in search of music and writing about music; it also partners with major labels like Universal and aggregators like 7Digital. It then devours data about the music on both the "acoustic side" (tempo, key, and so on, which its system crunches in about 10 seconds per song) and the "cultural side" (what reviewers are saying about the music, for instance). Its crawlers work Google-style, ravenous for new musical information. If you tweet about the band you saw last night, "we have that in our databases within the hour," says Whitman.
What are the uses of data on 30 million songs? Broadly, there are two categories: commercial and academic.
On the commercial side, one of the most exciting uses of The Echo Nest is that its data will empower the next generation of app-makers to make apps we can't even imagine yet. About 150 apps have already been built on its platform. Highlights include last week's launch, "Pocket Hipster," in which a mustachioed jerk with an ironic bowtie, suspenders, and a fixed-gear bike scrutinizes your playlist, tells you how horrible it is, and recommends little-known gems. The Echo Nest has also partnered with MTV to power its Music Meter, which updates every 15 minutes or so with information about which bands are most talked about on the Internet. Music Meter launched in mobile app form just yesterday. The Echo Nest's massive database makes it better at understanding "the long tail of music," says Whitman, "stuff waiting out there to be discovered, but no one knows about it."
One of the first fun apps that came out using The Echo Nest's data was inspired by the Saturday Night Live sketch in which Christopher Walken urges Blue Oyster Cult's percussionist to go nuts on the cowbell.
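For the curious, here is a minimal sketch of how an app in this spirit might work. It assumes that some analysis step has already produced the beat times of a track; the function name `add_cowbell`, its parameters, and the list-of-samples audio representation are all hypothetical and chosen for illustration. This is not the actual MoreCowbell.dj code or The Echo Nest's API.

```python
def add_cowbell(track, cowbell, beat_times, sample_rate, gain=0.5):
    """Mix a short cowbell sample into a mono track at each beat.

    track, cowbell: lists of float samples in the range [-1.0, 1.0]
    beat_times: beat positions in seconds (assumed to come from a
                separate beat-analysis step, not computed here)
    sample_rate: samples per second
    gain: how loud the cowbell is relative to its original level
    """
    mixed = list(track)  # copy so the original track is left untouched
    for t in beat_times:
        start = int(round(t * sample_rate))
        for i, sample in enumerate(cowbell):
            pos = start + i
            if pos >= len(mixed):
                break  # cowbell runs past the end of the track
            # Add the cowbell and clamp to the valid sample range.
            mixed[pos] = max(-1.0, min(1.0, mixed[pos] + gain * sample))
    return mixed


if __name__ == "__main__":
    silence = [0.0] * 10
    hit = [1.0, 0.5]
    print(add_cowbell(silence, hit, beat_times=[0.0, 0.5], sample_rate=8))
```

The more cowbell you want, the higher you set `gain`; the clamp simply keeps the mix from exceeding the sample range.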
Didn't you think Jay-Z's "Hard Knock Life" was just missing a little ... something? Me too.
Make your own at MoreCowbell.dj
"It was a crazy thing I never would have imagined," says a bemused Whitman. "This was my dissertation work, and people are now making joke apps from it."
The other use category is academia, and that's where the free-to-use Million Song Dataset comes in. Researchers in, say, physics, share the same reality, so they can replicate each other's experiments and advance the science. But researchers in music information retrieval haven't had the same reality to share, so to speak--they haven't had a large shared data set. Until now. "This is me giving a gift to my graduate school doppelganger, 10 years younger than me," says Whitman, a PhD graduate of MIT's Media Lab.
The world of academic digital music research is one many of us haven't considered. Whitman identifies a few major research problems. Though the human ear can easily separate the sound of a guitar from the sound of drums from the sound of a voice, computers can't do that yet, making full transcription of songs a hugely labor-intensive task. A program that can listen to a song and transcribe each instrument's role would be a major leap forward. Others are trying to devise a program that could identify the year or decade a song was made, just from listening to it (sifting things like production values, whether the song's in mono or stereo, and so on). There are already programs that are very good at identifying the genre of a song.
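To make one of those cues concrete, here is a rough sketch of a mono-versus-stereo check, one of the signals mentioned above for dating a recording. It compares the left and right channels: if they are nearly identical, the track is effectively mono. The function names and the 0.01 threshold are assumptions for illustration, not a method described by Whitman or used by any particular research group.

```python
def stereo_width(left, right):
    """Return the share of signal energy in the 'side' (L minus R) channel.

    0.0 means the two channels are identical (effectively mono);
    larger values mean the channels differ more.
    """
    mid_energy = 0.0
    side_energy = 0.0
    for l, r in zip(left, right):
        mid = (l + r) / 2.0   # what the channels have in common
        side = (l - r) / 2.0  # what differs between them
        mid_energy += mid * mid
        side_energy += side * side
    total = mid_energy + side_energy
    return side_energy / total if total > 0 else 0.0


def looks_mono(left, right, threshold=0.01):
    """Guess whether a two-channel recording is really mono.

    The threshold is an arbitrary illustrative value.
    """
    return stereo_width(left, right) < threshold


if __name__ == "__main__":
    same = [0.1, -0.2, 0.3]
    print(looks_mono(same, same))              # identical channels
    print(looks_mono([1.0, 0.0], [0.0, 1.0]))  # very different channels
```

A real system would combine a cue like this with many others, since a mono mix alone says little about when a song was recorded.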
We can't know yet what the full fruits of Echo Nest's datasets will be. "Surprise us," says The Echo Nest's site, in a challenge to researchers and developers everywhere. In the meantime, crank up the cowbell.
[Image by Art Renewal Center Museum & Andrew Hur]