Is Twitter a Reliable Polling Machine? Not Yet, Says Carnegie Mellon

But the study could help Twitter figure out how to monetize its data.



Could Twitter become a cheaper replacement for polling firms such as YouGov in the future? A research team at Carnegie Mellon University in Pittsburgh has gone through a billion tweets in an attempt to discover whether sentiments posted on the social media site reflect opinion polls. Noah Smith, assistant professor of language technologies and machine learning at the School of Computer Science, says that, although the results were a little bumpy on a day-to-day basis, the opinions “smoothed” themselves out over a longer period.

But while Twitter is a great way to “take the temperature of the population” very quickly, Smith says it’s not quite ready to become a reliable mainstream polling machine. “The results are noisy, as are results of polls. Opinion pollsters have learned to compensate for these distortions, while we’re still trying to identify and understand the noise in our data. Given that, I’m excited that we get any signal at all from social media that correlates with the polls.”

Smith and his team combed 140-character messages with an economic or political bent from 2008 and 2009, and used text analysis techniques to determine whether they expressed positive or negative sentiments. The economic results were compared with the Index of Consumer Sentiment from Reuters and the Gallup Organization’s Economic Confidence Index. The political tweets (all this was before, during and after the presidential election, remember) were compared with Gallup’s daily presidential approval poll and a compilation of 46 different polls, all of which had been obtained by traditional methods (that is to say, telephone surveys).
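For readers who want a feel for how such a comparison might work, here is a minimal Python sketch. It is not the team's actual method: the word lists, the function names, and the choice of a trailing moving average are all assumptions made for illustration. It scores each day by the ratio of positive to negative words, smooths the daily scores, and measures how closely two series move together.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical word lists; the article does not say which lexicon the team used.
POSITIVE = {"good", "great", "hope", "strong"}
NEGATIVE = {"bad", "fear", "weak", "worse"}


def daily_sentiment(tweets):
    """Map each day to (positive word count) / (negative word count).

    `tweets` is an iterable of (day, text) pairs.
    """
    pos, neg = defaultdict(int), defaultdict(int)
    for day, text in tweets:
        for word in text.lower().split():
            if word in POSITIVE:
                pos[day] += 1
            elif word in NEGATIVE:
                neg[day] += 1
    # Days with no negative words are skipped to avoid dividing by zero.
    return {day: pos[day] / neg[day] for day in sorted(neg) if neg[day] > 0}


def moving_average(values, window):
    """Trailing mean over up to `window` of the most recent values."""
    return [
        mean(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

In this sketch, a smoothed sentiment series would be passed to `pearson` alongside a poll series covering the same days. A wider smoothing window trades day-to-day responsiveness for less noise, which is the effect Smith describes when he says the opinions “smoothed” themselves out over a longer period.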

While some of the Twitter sentiment results were spot-on in relation to consumer confidence and Obama’s approval ratings after he was sworn in, the data gleaned during the election told a slightly different story. Increased mentions of both the words Obama and McCain correlated with rises in Obama’s popularity. Democrats as early adopters, eh?

For Twitter to present itself as a viable opinion pollster, the firm would need to bring in software (including natural language processing tools, query-driven analysis, and use of demographic and time stamp data) to analyze the language used in tweets, and to introduce control groups, as polling firms already do. Doing so could, however, be a smart, easy way for Twitter to expand its offerings and monetize itself.

About the author

My writing career has taken me all round the houses over the past decade and a half--from grumpy teens and hungover rock bands in the U.K., where I was born, via celebrity interviews, health, tech and fashion in Madrid and Paris, before returning to London, where I now live. For the past five years I've been writing about technology and innovation for U.S