Since 2010, the Library of Congress has been quietly archiving every public tweet. The project was run in partnership with Twitter and was intended to create a growing database of all the things–good and bad–that happened on the platform. As the Library of Congress’s Robert Dizard told Fast Company in 2013:
Our mission is to collect, preserve, and provide access to creative and historical record of America . . . We’re looking at Twitter from a research and scholarship perspective as providing a reflection of everyday life as well as showing the development and impact of significant events. You also have the record and recordings of individuals. Which are also valuable.
Now that’s about to change. Starting on December 31, the Library of Congress will continue to archive tweets, but not every one. It will, instead, collect them on a “very selective basis.” The Library writes, “Generally, the tweets collected and archived will be thematic and event-based, including events such as elections, or themes of ongoing national interest, e.g., public policy.”
From the get-go, the project was a behemoth. The Library of Congress was essentially vacuuming up every tweet, archiving it, and attempting to turn the collection into a publicly searchable destination. In 2013, the data already represented hundreds of terabytes. Even back then, creating this archive was an immense task, and as Twitter has grown and changed, it has become less and less feasible.
According to the U.S.’s oldest federal cultural institution, the decision not to archive every tweet was brought on by the platform’s growing volume. The Library of Congress cannot receive images, videos, and links, and tweets themselves are also longer than they used to be, as Twitter recently expanded the limit for tweets from 140 to 280 characters.
The Library described one of the reasons for the change as follows: “Twitter is expanding the size of tweets beyond what was originally described at the beginning of effort.”
And so, we can count Twitter’s recent decision to make tweets longer among the reasons the Library of Congress is no longer creating one of the coolest living databases. CGW