For proof, simply type the subject of a breaking news story, such as “French riots,” into a traditional search engine. The first items that pop up probably won’t reflect last November’s racial/religious riots in France. In a decent blog search engine, typing in the same term will pull up almost exclusively news stories and other material relevant to the recent turmoil.
Traditional search engines spider sites only once every day to every few weeks, so they don’t reflect the latest postings. They have only a partial database of all the blogs that exist, and an average of one new blog is being created every second. Lastly, most traditional search engines such as Google rank posts primarily based on the number of incoming links. However, in the blogosphere, the most relevant post may not have any incoming links, simply because it is so new.
The new blog search engine tools take the nuances of the blogosphere into account. They strive to correctly identify blogs and posts by their relevance, timeliness, and popularity. Eventually, more criteria will be added to their equations. As more and more websites incorporate blog-type functionality (frequent updating) and technology (RSS), figuring out how to search blogs will be more and more important.
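To make the contrast concrete, here is a minimal sketch of how a score that accounts for timeliness can differ from a pure link count. It is not the formula used by any of the engines discussed here; the `Post` class, the 24-hour half-life, and the sample posts are all assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    incoming_links: int
    age_hours: float


def link_only_score(post: Post) -> float:
    # Traditional approach: rank purely by the number of incoming links.
    return float(post.incoming_links)


def freshness_score(post: Post, half_life_hours: float = 24.0) -> float:
    # Hypothetical alternative: exponential decay, so a post loses half
    # of its weight every `half_life_hours`.
    recency = 0.5 ** (post.age_hours / half_life_hours)
    # The +1 lets a brand-new post with no links still earn a score.
    return (1 + post.incoming_links) * recency


# Made-up sample data: an old, well-linked post and a new, unlinked one.
posts = [
    Post("Year-old background piece", incoming_links=40, age_hours=24 * 365),
    Post("Breaking report", incoming_links=0, age_hours=2),
]

by_links = sorted(posts, key=link_only_score, reverse=True)
by_freshness = sorted(posts, key=freshness_score, reverse=True)

print("By links:    ", [p.title for p in by_links])
print("By freshness:", [p.title for p in by_freshness])
```

Under these assumptions the link-only ranking puts the year-old piece first, while the decayed score puts the two-hour-old post first even though nothing links to it yet.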
NYU college student intern Chris Duncan researched with us the efficacy of the major blog ranking engines: PubSub, Technorati, Bloglines, Feedster, IceRocket, and Google Blog Search. Our thanks to Chris for his extensive research and contributions to this article. We were primarily interested in using these tools to identify the most influential blogs, in connection with the marketing for our new book, “The Virtual Handshake.”
Here’s our quick take on the contenders:
|Top 5 Sites by Inlinking Sites for one day||Top 5 Sites by Outlinking Sites||Top 5 Sites by Link Rank (based on number of incoming links, plus other factors such as recency)|
As of September 18, 2005
PubSub doesn’t appear to incorporate any more sophisticated criteria than the raw number of links. PubSub’s LinkRank system also has its problems; on its top ten list are such obscure sites as VBulletin.org and blogs.tamtam.nl. For our purposes, PubSub has very limited functionality.
Technorati’s list of the most influential bloggers has many more relevant entries than PubSub’s. Its primary criterion is the same as PubSub’s: the number of sites that link to a particular site. The top 10 sites on PubSub’s in-link list approach 2,000 links; on Technorati, they are all above 8,000 links.
|# of Sites Linking to Top 5 Blogs|
As of September 18, 2005
By checking more relevant sites, Technorati does seem to capture the buzz better than Google or Alexa. Technorati is clearly better than PubSub in terms of analyzing in-link lists, although it still has some less-than-relevant entries, such as Yahoo! Messenger, on the list. Web developer and blogger Jason Kottke also says that Technorati has trouble counting links. A post of his received 159 trackbacks, but Technorati only listed it as having 93 sites linking to it.
Top 5 Sites by Subscription
Bloglines takes a different approach by looking at feed subscribers instead of links to a given web site. Their list looks like a good mix of PubSub and Technorati. It has sites such as BoingBoing, Gizmodo, and Engadget, which also appear on Technorati’s list, as well as the New York Times and BBC, which show up on PubSub. Bloglines also provides a simple description beneath each link, which helps you to scan a list of bloggers very rapidly.
The Bloglines system has a clear advantage over both of the other lists in that it is an “opt-in” list. The advantage of using feeds is that it more accurately measures the interests of the average blog reader (who doesn’t have an active blog), as opposed to the opinions of fellow influencers (who run blogs that can carry link weight). Bloglines is comparable to using TV’s Nielsen Ratings to evaluate a show’s popularity, as opposed to asking the PBS commentators what TV shows they most like. Of course, the downside is that the number of subscribing feeds is very susceptible to manipulation.
|Top 5 on Feedster 100||Top 5 on Feedster 500|
Feedster provides both an RSS feed aggregator and a search tool. It maintains two different top lists, which use different analytic techniques: the Feedster Top 500 and the Feedster Top 100. The Feedster Top 500 is much like the in-link lists that Technorati and PubSub have. We’re assuming (they don’t say) that the in-links counted by the Feedster Top 500 are taken from individual blog posts. This puts the top five sites at well over 20,000 links each; it takes 809 links to make the top 500 list. While the Top 500 is similar to the sites we’ve already discussed, the Top 100 is more useful. The Feedster Top 100 is like Bloglines; it looks at feed subscriptions to form its list. BoingBoing makes this top five, yet again. Otherwise, we see different sites but ones that we would expect to make the top five: Wired, Slashdot, and Dilbert. We’re not sure how the Google Weblog ended up on this list; it’s much too special-interest to attract numbers comparable to Dilbert.
The next site we looked at was IceRocket, backed by Dallas Mavericks owner Mark Cuban. It doesn’t have a list of the top bloggers, but it is a useful resource. At IceRocket, you can search for blog posts by topic or by URL (i.e., find out who is linking to it). This way, you can also compare trends for certain topics to see who is blogging about what over about a two-month period.
Link tracking is another useful tool provided by IceRocket. They provide a service that lets bloggers track the links to their own posts with a short line of code. This can also be done through the search feature by entering a URL. The aforementioned post by Jason Kottke comes up with 59 sites linking to it. You can really see whose posts are creating buzz around the internet, although Memeorandum, a news summary aggregator drawing from experts and pundits, insiders and outsiders, media professionals and amateur bloggers every 5 minutes, is a simpler one-stop solution to do that.
Google Blog Search
Google’s blog search engine is a more recent development than the others. Its advanced functions and speed are consistent with other Google products. It can search by blog title, author, and date (but not for any posts before March 2005). Unfortunately, it does not provide a general ranking list of the top X number of blogs. However, we can use the “link:” feature to see how many sites are linking to some of the more popular sites we’ve seen. We will also compare these figures to the same sites’ regular Google “link:” numbers so we can see how they differ.
(# blogs linking to this blog)
(# sites linking to this site)
Obviously, the numbers vary greatly. This can be attributed to the fact that Google Blog Search draws its results from a much smaller pool of sites, and possibly in part to the fact that these results date only from March 2005. In any case, Google Blog Search rankings and in-links will probably become important benchmarks, and contribute further to Google’s ever-more-impressive earnings.
There is obviously no perfect ranking system on the Internet–either for blogs or for web sites in general–and there probably never will be. Our recommendation is that PubSub try to focus its results on real blogs; too many of its results are commercial sites or blog utilities. Technorati needs to become more accurate in its counts of links. Bloglines and Feedster could be substantially improved if they took the next step and separated the blogs into categories. If your goal is to find the most influential bloggers, in a given category or overall, you’re best off using Technorati.
The ideal system would incorporate elements from all of the services that we’ve discussed–link tracking, in-linking, feed subscriptions, tags, viewer ratings–and then find the best way to weight them. It’s a (much-disputed) truism of biology that “ontogeny recapitulates phylogeny,” but it’s happening here. Blog search engines and ranking tools are recapitulating the evolution of Web search engines such as Google. When Google launched, many thought that the search engine game was already settled, but Google became very powerful because it provided a truly superior search experience. There is room for just as much innovation in the blog search world.
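As a rough illustration of what "finding the best way to weight them" could look like, the sketch below scales each signal to a common 0-to-1 range and then takes a weighted sum. None of the services discussed publishes such a formula; the function names, the blog names (`blog_a`, `blog_b`, `blog_c`), the counts, and the weights are all hypothetical.

```python
def normalize(values: dict[str, float]) -> dict[str, float]:
    # Scale each signal so its largest value becomes 1.0, making
    # in-link counts and subscriber counts comparable.
    top = max(values.values())
    return {blog: (v / top if top else 0.0) for blog, v in values.items()}


def combined_rank(signals: dict[str, dict[str, float]],
                  weights: dict[str, float]) -> list[tuple[str, float]]:
    # Weighted sum of normalized signals; a blog missing from one
    # signal simply contributes 0 for that signal.
    normalized = {name: normalize(vals) for name, vals in signals.items()}
    blogs = set().union(*(vals.keys() for vals in signals.values()))
    scores = {
        blog: sum(weights[name] * normalized[name].get(blog, 0.0)
                  for name in signals)
        for blog in blogs
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented numbers, for illustration only.
signals = {
    "in_links": {"blog_a": 8000, "blog_b": 2000, "blog_c": 500},
    "subscribers": {"blog_a": 1000, "blog_b": 9000, "blog_c": 3000},
}

equal = combined_rank(signals, {"in_links": 0.5, "subscribers": 0.5})
link_heavy = combined_rank(signals, {"in_links": 0.8, "subscribers": 0.2})

print("Equal weights:", [blog for blog, _ in equal])
print("Link-heavy:   ", [blog for blog, _ in link_heavy])
```

With these made-up figures, equal weights put the subscriber-heavy `blog_b` on top, while a link-heavy weighting favors `blog_a`. The ordering depends on the weights chosen, which is exactly why choosing them is the hard part.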