WikiLeaks is all over the news at the moment, having pulled the veil off thousands of U.S. diplomatic and Iraq War communications. As we've done before, we looked at the data in a different way: By peeking at the words themselves.
WikiLeaks' latest data treasure trove includes hundreds of thousands of communications between diplomats and politicians in the U.S. and overseas. The full archive has only been shared with a select group of people, but WikiLeaks is posting the whole shebang online, in slices, for anyone to see. We grabbed the first slice of data, pulled out the individual communications ("cables," as they're quaintly called) and ran them through a simple word cloud generator. The results are powerful.
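For the curious, the counting step behind any word cloud is easy to approximate. What follows is a minimal sketch, not the generator we actually used: the function name top_words and the short stop-word list are made up for illustration, and real tools ship far longer lists of words to ignore.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; real generators use longer ones).
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on", "for",
              "is", "are", "was", "that", "this", "with", "as", "by", "it"}

def top_words(text, n=10):
    """Return the n most frequent non-stop-words in text, lowercased."""
    # Split the text into lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Count every word that is not a stop word or a single stray character.
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 1)
    return counts.most_common(n)

sample = "Iran and the nuclear program. Iran talks with Russia on nuclear security."
print(top_words(sample, 3))
```

A word cloud then simply draws each word at a size proportional to its count, which is why the biggest words in the image are the ones that turn up most often in the cables.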
Guess what word is most on the minds of U.S. diplomats in 2010? Yup: Iran. "Iranian" is prominent in the mix too, as is "nuclear," which should explain the interest. Fascinatingly, "Turkey" is more prominent than "Afghanistan," possibly due to the country's key location in supporting U.S. and NATO operations in both Iraq and Afghanistan. "Saudi" and "Israel" make unsurprising appearances, along with the perennial diplomatic thorn of "North Korea." Diplomats are talking about "missiles," "security," "information," "programs," "technology" and "economic" matters. Is it shocking to see that "military" as a word is used far more than "peace"?
Of course, we're limited by the data that WikiLeaks has chosen to share, and by the particular documents it got its hands on in the first place. Even so, the second most notable word in the image is a bit of a surprise: Russia. The nation, even in its crumbled, post-superpower, post-Cold War state, still has such sway over global politics that its position on international moves against North Korea, Iran and so on means it pops up in diplomatic communications all the time.
But what of the other enormous WikiLeaks treasure trove? The Iraq War Diaries consist of a huge text file, many hundreds of megabytes in size, and are thus difficult to penetrate. So we grabbed a random fraction of the text, around 1% of the reports of daily goings-on in and around Iraq during the war there, and ran it through the same analysis process.
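Pulling a random 1% out of a file that size doesn't require loading the whole thing into memory. Here is a rough sketch of one way to do it, under two assumptions of ours: that each report sits on its own line, and that keeping each line independently with a 1% chance is close enough to "a random fraction." The function name sample_lines and the file name in the usage comment are hypothetical, and this is not necessarily how our sample was drawn.

```python
import random

def sample_lines(lines, fraction=0.01, seed=None):
    """Yield each line independently with probability `fraction`."""
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    for line in lines:
        # Keep this line roughly `fraction` of the time.
        if rng.random() < fraction:
            yield line

# Usage sketch: "war_diaries.txt" is a hypothetical file name.
# with open("war_diaries.txt", encoding="utf-8", errors="replace") as f:
#     subset = list(sample_lines(f, fraction=0.01, seed=42))
```

Because the file is read one line at a time, the approach works no matter how big the archive is. The trade-off is that the sample comes out at roughly 1% rather than exactly 1%.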
It's a different story, of course. The Diaries are more mechanical than florid diplomatic communiques, and are all about simple "reports" and "updates" of what went on. They're also much more tragic. The word that stands out most is "IED," or improvised explosive device. We know it's a threat to security forces in the troubled nation, but to see exactly how often it pops up in the reports is still eye-opening. It's the reason "KIA" and "WIA" (Killed in Action and Wounded in Action) and "casualties" pop up so often too. "Contractors" is the other stand-out word, if you think about it: We expect to see "tank" and "weapons" and "detonation," but "contractors" reveals how the war in Iraq has changed. It's proof positive that the conflict has moved on from being a purely military affair into one that involves private security forces.
To read more news on this, and similar stuff, keep up with my updates by following me, Kit Eaton, on Twitter.