Princeton researchers have developed a new method that tracks the origins of influential ideas. The traditional way of measuring influence in academic journals is by citation count, and online search engines use links and traffic to determine which sites are most relevant to your search. The Princeton approach is more direct. By running computer algorithms on the actual text of a set of documents, and analyzing how their language changes over time, the researchers are able to track the origin and spread of new ideas.
The reasoning is that a paper that gains many citations, or a site that garners many links, might ultimately not lead to a real shift in thinking, despite its apparent popularity. The converse is also true: Papers and sites that evade widespread citation might still find channels of influence, perhaps simply by affecting the point of view of a handful of powerful people.
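To make the text-based approach concrete, here is a toy Python sketch. It is not the researchers' actual model, whose details this article does not give. It assumes a much cruder stand-in: a document counts as "influential" if its vocabulary resembles later documents more than earlier ones. The function names (`word_dist`, `cosine`, `influence_scores`) and the sample documents are hypothetical and exist only for illustration.

```python
from collections import Counter
import math


def word_dist(texts):
    """Relative word frequencies over a list of texts (whitespace tokens)."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse word-frequency dictionaries."""
    dot = sum(v * q.get(w, 0.0) for w, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def influence_scores(docs):
    """Score each (year, text) document by how much closer its language is
    to later documents than to earlier ones. Documents with nothing before
    or nothing after them are skipped. This is an assumed toy measure."""
    scores = {}
    for i, (year, text) in enumerate(docs):
        earlier = [t for y, t in docs if y < year]
        later = [t for y, t in docs if y > year]
        if not earlier or not later:
            continue
        own = word_dist([text])
        scores[i] = cosine(own, word_dist(later)) - cosine(own, word_dist(earlier))
    return scores


# Invented sample corpus: (year, text) pairs.
docs = [
    (1970, "grants fund basic physics research"),
    (1972, "foundation expands role funding applied science policy"),
    (1972, "grants fund basic physics research labs"),
    (1975, "foundation expands applied science policy funding"),
    (1976, "applied science policy and foundation role grow"),
]

scores = influence_scores(docs)
for i, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(docs[i][0], round(s, 3), docs[i][1])
```

In this invented corpus, the first 1972 document uses words that the later documents pick up, so it gets a positive score with no citation data involved. The second 1972 document echoes only the older vocabulary and gets a negative score.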
The researchers, David Blei and Sean Gerrish, tested their algorithm on several decades' worth of papers in three academic journals, then compared their results against the traditional citation-based measure of influence. The two methods agreed almost half the time. But in a telling case, the Blei-Gerrish method flagged as influential a 1972 column that accurately predicted an expanding role for the National Science Foundation, even though the column was not cited anywhere and would not have been deemed influential by traditional measures.
Though they tested their method on scholarly papers, the researchers say it should also be applicable to measuring influence in news stories, on the Internet and, in theory, anywhere there is text to be crunched.
"We are also exploring the idea that you can find patterns in how language changes over time," said Blei in a press release. "Once you've identified the shapes of those patterns, you might be able to recognize something important as it develops, to predict the next big idea before it's gotten big."