What Twitter's 200 Million Email Leak Really Means

Rosie Struve; Getty Images

Following reports in late 2022 that hackers sold data stolen from 400 million Twitter users, researchers now say a widely distributed trove of email addresses linked to about 200 million users is likely a refined version of the larger trove with duplicate entries removed. The social network has not yet commented on the massive exposure, but the cache of data clarifies the severity of the leak and who is most at risk from it.

From June 2021 to January 2022, there was a bug in a Twitter application programming interface, or API, that allowed attackers to submit contact information such as email addresses and receive the associated Twitter account, if any, in return. Before it was patched, attackers exploited the flaw to “scrape” data from the social network. And while the bug didn’t allow hackers to access passwords or other sensitive information like DMs, it did expose the connection between Twitter accounts, which are often pseudonymous, and the email addresses and phone numbers associated with them, potentially allowing users to be identified.

While it was live, the vulnerability was seemingly exploited by multiple actors to build different data sets. One that has been circulating on criminal forums since the summer lists the email addresses and phone numbers of about 5.4 million Twitter users. The huge, newly surfaced trove appears to contain only email addresses. Still, the widespread distribution of the data carries the risk of fueling phishing attacks, identity theft attempts, and other individual targeting.

Twitter has not responded to WIRED’s requests for comment. The company wrote about the API vulnerability in an August disclosure: “When we learned of this, we immediately investigated and resolved it. At that time, we had no indication that anyone had exploited the vulnerability.” Apparently, Twitter’s telemetry was insufficient to detect the malicious scraping.

Twitter is far from the first platform to expose data to mass scraping because of an API flaw, and in such scenarios it’s common for there to be confusion about how many different troves of data actually exist as a result of malicious exploitation. These incidents are still significant, though, because they add more connections and validation to the vast body of stolen data about users that already exists in the criminal ecosystem.

“Obviously there were several people who knew about this API vulnerability and several people who scraped it. Did different people scrape different things? How many troves are there? It doesn’t really matter,” says Troy Hunt, founder of the breach tracking site HaveIBeenPwned. Hunt has added the Twitter data set to HaveIBeenPwned and says it represents information on more than 200 million accounts. Ninety-eight percent of the email addresses had already been exposed in previous breaches recorded by HaveIBeenPwned. And Hunt says he has sent notification emails to nearly 1,064,000 of his service’s 4,400,000 email subscribers.

“It’s the first time I’ve sent a seven-figure email,” he says. “Almost a quarter of my entire corpus of subscribers is really significant. But because so much of it was already out there, I don’t think this is going to be a long-tailed incident in terms of impact. But it can de-anonymize people. What I’m more concerned about is those individuals who wanted to maintain their privacy.”

Twitter wrote in August that it shared these concerns about the possibility of users’ pseudonymous accounts being linked to their real identities as a result of the API vulnerability.

“If you operate a pseudonymous Twitter account, we understand the risks an incident like this can pose and we deeply regret that it happened,” the company wrote. “To keep your identity as concealed as possible, we recommend that you do not add a publicly known phone number or email address to your Twitter account.”

However, for users who hadn’t already linked their Twitter handles to burner email accounts at the time of the scraping, the advice comes too late. In August, the social network said it was notifying potentially affected individuals about the situation. The company has not said whether it will issue further notifications in light of the hundreds of millions of exposed records.

Ireland’s Data Protection Commission said last month that it was investigating the incident that exposed the email addresses and phone numbers of 5.4 million users. Twitter is also currently under investigation by the US Federal Trade Commission over whether the company violated a “consent decree” that required Twitter to improve its user privacy and data protection measures.

This story originally appeared on wired.com.
