2014-02-19

At its best, Twitter is a place to find breaking news, thoughtful dialogue, and unexpected voices. At its worst, it’s a forum for knee-jerk reactions, trolls, and harassment. At both of these extremes, as in life, there’s bound to be some bad language along the way.

Just how much do people curse on Twitter? Who swears and what do they say?

Four mild-mannered researchers endeavored to find out, and the paper they produced–presented this week at the ACM Conference on Computer-Supported Cooperative Work & Social Computing–is a compendium of offensive language that rivals the depraved, cuss-filled brilliance of the recent film The Wolf of Wall Street.





After examining a random one-month sample of 51 million English-language tweets from 14 million distinct user accounts, they reached this conclusion: We curse a lot on Twitter, where our language is usually public, even more than we do in real life. More compelling still, they uncovered the context in which cursing happens: when, why, and who is cursing at whom.

“It’s a sizable fraction of the words we use. On average, one tweet out of 13 tweets will contain at least one cursing word,” says Wenbo Wang, a PhD researcher at Wright State University who led the study. “Because of social media, people don’t see each other. They can say things they wouldn’t say in the physical world.” Other studies have found that 0.5% to 0.7% of the words we speak in the physical world are curses; on Twitter, the researchers found the rate to be 1.15%. Or as the paper reads, and as Wang was too polite to repeat during our phone interview:

The most popular curse word is fuck, which covers 34.73% of all the curse word occurrences, followed by shit (15.04%), ass (14.48%), bitch (10.34%), nigga (9.68%), hell (4.46%), whore (1.82%), dick (1.67%), piss (1.53%), and pussy (1.16%).

The findings are interesting for anyone who uses Twitter, but for the team, all affiliated with the Ohio Center of Excellence in Knowledge-enabled Computing, the paper will feed into work with broader societal implications related to mental health, verbal abuse, online harassment, and gender differences in online communications.





“Social content is extremely rich,” says the center’s director, Amit P. Sheth. “The cursing issue is an expression of sentiment and emotion…it’s kind of a core issue of understanding the language.” The center is developing automated tools that could flag worrisome harassment on social media, especially among people of high school and college age, or identify depressive disorders or a disposition to violence. Filters for kids on social media are another potential application.