Do you want to archive Twitter? Good luck with that

    From the moment Elon Musk closed his Twitter deal, the network’s diehard users have taken steps to pay tribute to it. People have downloaded their own archives from Twitter. Others have started threads with screenshots of their all-time favorite tweets. And there’s an ongoing Google Doc cataloging Twitter trends and memes, a guide that could one day serve to decipher the app’s hieroglyphics.

    Whether Twitter goes out of business (as Musk himself has said is a possibility) or becomes an unnavigable stream of hate speech and deceptive parody accounts, the future of the network is unknown. But there are fears that Twitter’s wealth of content, valuable for its historical and political impact (as well as for a good laugh), could be lost. Twitter’s premise, the 140-character joke (now 280), doesn’t lend itself well to archiving. That’s partly because capturing a stream of content that grows by the thousands every minute is a technical nightmare, but it’s also because of an ethical concern: not all tweets are created equal. Some are fired off by world leaders inciting violence and others by individuals who would be unknown private citizens if not for their affinity for the bird app. Both types of tweets can go viral and have lasting consequences.

    “I think it’s really important to think carefully about the data you’re collecting,” said Miles McCain of PolitiTweet, a service that archives tweets from public figures and influential institutions. “If you try to archive anything and everything, you end up with a bunch of information that doesn’t really matter.”

    An attempt by the United States Library of Congress, which began documenting every public tweet in 2010, failed. In 2012, the library said it was archiving half a billion tweets every day. Over the same period, tweets evolved from short bits of text to posts that regularly include photos, videos, and live links. The library ended the Sisyphean project seven years after it began, saying it would archive only select accounts. A spokesperson for the library did not comment to WIRED before this story was published.

    Elisabeth Fondren, a journalism professor at St. John’s University in New York City, says the failure of that archiving project turned out to be a huge missed opportunity for preserving a rich dataset of political discourse and communication trends. The present moment has highlighted the need to archive social media and exposed the uncertainty of hosting a public plaza on the servers of a private company.

    “If it had been successful, we would have it now,” says Fondren. “It really undermines researchers’ attempts to assess the social impact of media on society.”

    Smaller third-party services have tried for years to archive more specific content. ProPublica maintains a list of politicians’ deleted tweets in its Politwoops database. PolitiTweet has a database that tracks 1,500 accounts. These track statements and news stories from key people in government and politics, but the projects are not intended to capture the mass discourse of online communication.

    Twitter was designed to capture the moment, and at first finding or viewing older tweets wasn’t easy and didn’t seem important. But by 2014, Twitter had improved its search for public tweets. The move helped investigators, but it also breathed new life into long-forgotten tweets that had been pushed down the timeline without much thought. The change proved problematic for some tweeters, such as those who began posting 140-character musings as teenagers but had since become college students or young professionals. Their tweets didn’t always age well, especially as an era of cancel culture arrived.