New York Times says OpenAI may have deleted evidence for lawsuits

    Lawsuits are never exactly a lovefest, but the copyright battle between The New York Times and both OpenAI and Microsoft is becoming particularly contentious. This week, the Times claimed that OpenAI engineers had accidentally deleted data that the newspaper's team had spent more than 150 hours working on as potential evidence.

    OpenAI was able to recover much of the data, but the Times' legal team says the original file names and folder structure are still missing. According to a statement filed in court on Wednesday by Jennifer B. Maisel, an attorney for the newspaper, this means the information “cannot be used to determine where the News Plaintiffs' copied articles” may have been included in the artificial intelligence models from OpenAI.

    “We disagree with the characterizations made and will provide our response shortly,” OpenAI spokesperson Jason Deutrom told WIRED in a statement. The New York Times declined to comment.

    The Times filed a copyright lawsuit against OpenAI and Microsoft last year, claiming the companies had illegally used its articles to train artificial intelligence tools like ChatGPT. The case is one of several ongoing legal battles between AI companies and publishers, including a similar lawsuit filed by the Daily News that is being handled by some of the same lawyers.

    The Times' case is currently in discovery, meaning both sides are turning over requested documents and information that could become evidence. As part of the lawsuit, the court required OpenAI to show its training data to the Times, which is a big deal: OpenAI has never publicly revealed exactly what information was used to build its AI models. To comply, OpenAI created what the court calls a “sandbox” of two “virtual machines” for the Times' lawyers to browse. In her statement, Maisel said that OpenAI engineers had “erased” the data organized by the Times team on one of these machines.

    According to Maisel's filing, OpenAI acknowledged that the information had been removed and attempted to address the issue shortly after it was made aware of it earlier this month. But when the newspaper's lawyers looked at the “recovered” data, it was too disorganized, requiring them to “create their work from scratch with significant human and computer processing time,” several other Times lawyers said in a letter to the judge filed the same day as Maisel's declaration.

    The lawyers noted that they had “no reason to believe” the removal was “intentional.” In emails submitted into evidence along with Maisel's declaration, OpenAI attorney Tom Gorman called the data deletion a “glitch.”