Discovery Engines: Policing The Riot Of Information Overload

Why can't anyone tame the social stream and just give us the good stuff?

Illustration by Debaser

Every minute of every day, the more than a half-billion members of Facebook collectively create almost 1 million photos, wall posts, status updates, and other bits of ephemera. The firehose at Twitter looks tame by comparison—the network sees more than 125,000 tweets a minute, only half of them about (or from) randy congressmen. Then there's YouTube, which recently announced that it receives more than 48 hours of video per minute. If you watched video every minute of your life, you'd get through 10 days' worth of YouTube uploads.

As impressive as these statistics obviously are, they're daunting, too, both for consumers and companies. Taming this torrent into something manageable and highly relevant is increasingly seen as the key for Twitter, YouTube, Facebook, and any other chaotic content network looking to realize monster revenue.

That explains why discovery is the word du jour in tech. It also explains why there's a flurry of activity to build a "discovery engine," the search engine's smart-ass cousin that tries to answer vague queries (like "funny video," one of the top searches on YouTube). The video hub is constantly adjusting its home page in order to present ever more "relevant" links to its clip-addled hordes. Facebook's news-feed algorithm, EdgeRank, has been subtly shifting in an effort to display only the updates the site thinks you'd like to see. And a number of startups and big-media companies have also taken up this challenge. Apps for the iPad such as Flipboard, Zite, and one from a New York Times Co. and Betaworks joint venture create snappy personalized magazines sourced from content found across the Web.

Some of these efforts have seen slight success. Fans who use YouTube through its Leanback interface (designed for TVs and leveraging its what-to-watch-next algorithm) stick with the site for 30 minutes a day, compared with 15 minutes for the average viewer. Although that's encouraging, it pales when compared with YouTube's goal of several hours of daily viewing.

And that's why the discovery engine remains a mythical beast. Today's personalization tools are built on several faulty premises. One is the presumption that we want a steady diet of what we just consumed. Just because you clicked on one post of Sarah Palin reinterpreting history doesn't mean you want to hear all she has to say.

Another is that we're interested in everything that our friends are. (I like my friends despite their inexplicable devotion to Mad Men.) In addition, the more connected we are, the more likely it is that even the best discovery engines will get overwhelmed. As online content expands exponentially, it becomes less likely that any one photo, video, news story, or status update will be to your liking. It also wouldn't hurt if these services could avoid overpromising their ability to tame your content feed.

Perhaps more bedeviling is getting past the assumption that we always want something "personalized." Last year, Flipboard acquired Ellerdale, a startup that specialized in "extracting relevancy" from social networks, in order to begin to predict what kinds of stories readers might want. But Arthur van Hoff, Ellerdale's cofounder and now Flipboard's chief technology officer, notes that the company is being very cautious about baking Ellerdale's prediction systems into Flipboard. At the moment, Flipboard presents stories mainly in response to a user's explicit desires (based on the feeds or topics you tell it you'd like to see). When predicting stories, the big worry, of course, is getting it wrong. "If Osama bin Laden dies and our personalization algorithm doesn't show you that news, it would be a bad experience," van Hoff says.

Don't expect the drive to personalize to go away, though. It seems built into the financial structure of the Web, notes the Washington Post Co.'s chief digital officer Vijay Ravindran, who this spring launched a personalized news site called Trove. Better personalization means more clicks, and all the while, you're seeing more ads. "It's sort of like drilling for oil," he says. "The oil companies don't exactly know where it is, but they're always going to be drilling because the value of finding that oil is so high."


  • Nikki Ralston

    It's not an either-or situation. Discovery engines need to be complete, holistic solutions that offer both passive (based on consumption analysis) and active (user-defined) personalization, meaningful social recommendations qualified by comparing the tastes of users (so you won't get Mad Men recommendations just because your friends like it), and intuitive ways to browse the catalog and discover unexpected gems.
    Video discovery is a special challenge, because the traditional way of cataloging titles by actor, director, and genre does not allow for this kind of rich discovery. That's why the catalog had to be reinvented from the ground up.
    Nikki Ralston

  • Katymcc

    We've been talking about personalization since the "olden days" when email was our main source of outreach to our target audiences.  It seems we have a hard time getting it right unless we ask people directly what information they're looking for.  It'll be interesting to see if discovery engines can finally crack the code. 

  • Chris Russo


    Great article... Being on the emergency alerting side of using mobile apps and social media, we can certainly see where valid information can easily get lost in the noise... maybe it's time to let the stakeholders honestly share their input as a filter to clean up the space... :) The last thing I want is a 10-30 second ad based on flurry crowding a space to sell me a flashlight because the power's out during a crisis... :)


  • Wendy Smith

    We ARE in overload! Still, I sometimes like having something brought to my attention that is personalized to my interests. But I also LOVE finding something really new to me from outside my existing zone of attention.