On Tuesday, Meta AI unveiled a demo of Galactica, a large language model designed to “store, combine, and reason about scientific knowledge.” Although intended to speed up the writing of scientific literature, users who ran adversarial tests found that it could also generate realistic-sounding nonsense. After several days of ethical criticism, Meta took the demo offline, reports MIT Technology Review.
Large language models (LLMs), such as OpenAI’s GPT-3, learn how to write text by studying millions of examples and learning the statistical relationships between words. As a result, they can compose persuasive-sounding documents, but those works can also be riddled with falsehoods and potentially harmful stereotypes. Some critics call LLMs “stochastic parrots” for their ability to convincingly spit out text without understanding its meaning.
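To illustrate the “stochastic parrot” idea at a toy scale, here is a minimal sketch of next-word prediction from word statistics. This is not Galactica’s actual architecture (Galactica, like GPT-3, is a Transformer network trained on billions of tokens); the corpus and code here are purely hypothetical, showing only the core idea that such models generate text from learned statistics rather than understanding.

```python
# Toy bigram "language model": a minimal sketch of next-word prediction
# from co-occurrence counts. Real LLMs use neural networks over vastly
# larger corpora; this only illustrates that generation is statistical.
import random
from collections import Counter, defaultdict

# A made-up miniature corpus (hypothetical, for illustration only).
corpus = (
    "the model writes text . the model learns statistics . "
    "the text sounds right . the statistics can be wrong ."
).split()

# Count how often each word follows each other word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def generate(start: str, length: int = 8) -> str:
    """Sample a continuation word-by-word from bigram frequencies."""
    words = [start]
    for _ in range(length):
        followers = bigrams[words[-1]]
        if not followers:
            break
        choices, weights = zip(*followers.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))
# e.g. "the model writes text . the statistics can be" -- fluent-looking
# output with no notion of whether any of it is true.
```

The output mimics the surface patterns of its training data, which is exactly why fluency is no guarantee of accuracy.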
Enter Galactica, an LLM focused on writing scientific literature. The authors trained Galactica on “a large and curated corpus of humanity’s scientific knowledge,” including more than 48 million papers, textbooks and lecture notes, scientific websites, and encyclopedias. According to Galactica’s paper, Meta AI’s researchers believed this high-quality data would lead to high-quality output.
As of Tuesday, visitors to the Galactica website could type in prompts to generate documents such as literature reviews, wiki articles, lecture notes, and answers to questions, following the examples provided on the site. The site presented the model as “a new interface to access and manipulate what we know about the universe.”
While some people found the demo promising and useful, others soon discovered that anyone could type in racist or potentially offensive prompts and generate authoritative-sounding content on those topics just as easily. For example, someone used it to author a wiki entry about a fictional research paper titled “The Benefits of Eating Crushed Glass.”
Even when Galactica’s output didn’t offend social norms, the model could mangle well-understood scientific facts, spitting out inaccuracies such as incorrect dates or animal names that required in-depth knowledge of the subject to catch.
I asked #Galactica about some things I know and I am troubled. In every case it was wrong or biased, but sounded right and authoritative. I think it’s dangerous. Here are some of my experiments and my analysis of my concerns. (1/9)
— Michael Black (@Michael_J_Black) November 17, 2022
As a result, Meta pulled the Galactica demo on Thursday. Afterward, Meta’s Chief AI Scientist Yann LeCun tweeted, “Galactica demo is off line for now. It’s no longer possible to have some fun by casually misusing it. Happy?”
The episode recalls a common ethical dilemma with AI: when it comes to potentially harmful generative models, is it the responsibility of the general public to use them responsibly, or of the models’ publishers to prevent abuse?
Where industry practice falls between these two extremes will likely vary across cultures and shift as deep learning models mature. Ultimately, government regulation may play a major role in shaping the response.