Last month, hundreds of well-known people in the artificial intelligence world signed an open letter warning that AI could one day destroy humanity.
“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war,” the one-sentence statement said.
The letter was the latest in a series of dire warnings about AI that have been particularly light on details. Current AI systems cannot destroy humanity. Some of them can barely add and subtract. So why are the people who know the most about AI so concerned?
The scary scenario.
Tech industry Cassandras say companies, governments or independent researchers could one day deploy powerful AI systems to handle everything from business to warfare. Those systems could do things we don’t want them to do. And if people tried to intervene or shut them down, they could resist or even replicate themselves so they could keep working.
“Today’s systems are nowhere near existential risk,” said Yoshua Bengio, a professor and AI researcher at the University of Montreal. “But in one, two, five years? There is too much uncertainty. That is the problem. We are not sure that this will not pass a point where it becomes catastrophic.”
The worriers have often used a simple metaphor. If you ask a machine to make as many paperclips as possible, they say, it could get carried away and transform everything, including humanity, into paperclip factories.
How does that tie in with the real world, or with an imagined world not too many years in the future? Companies could give AI systems ever greater autonomy and connect them to vital infrastructure, including power grids, stock markets and military weapons. From there, they could cause problems.
To many experts, this didn’t seem so plausible until about a year ago, when companies like OpenAI demonstrated significant improvements in their technology. Those gains showed what could be possible if AI continues to advance at such a rapid pace.
“AI will steadily be delegated and — as it becomes more autonomous — can take over the decision-making and thinking of today’s humans and human-led institutions,” said Anthony Aguirre, a cosmologist at the University of California, Santa Cruz and a founder of the Future of Life Institute, the organization behind one of the two open letters.
“At some point it would become clear that the great machine that runs society and the economy is not really under human control and cannot be shut down, any more than the S&P 500 could be shut down,” he said.
Or so the theory goes. Other AI experts think it’s a ridiculous premise.
“Hypothetical is such a polite way of articulating what I think about the existential risk talk,” said Oren Etzioni, the founder and CEO of the Allen Institute for AI, a research lab in Seattle.
Are there signs that AI could do this?
Not quite. But researchers are transforming chatbots like ChatGPT into systems that can take actions based on the text they generate. A project called AutoGPT is the best example.
The idea is to give the system goals like “found a company” or “make some money.” Then it keeps looking for ways to achieve that goal, especially if it is connected to other internet services.
A system like AutoGPT can generate computer programs. If researchers give it access to a computer server, it can actually run those programs. In theory, this is a way for AutoGPT to do almost anything online: retrieve information, use applications, create new applications, and even improve itself.
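In rough outline, such a system runs as a loop: it is given a goal, asks a language model for the next action, carries that action out, and feeds the result back in. The sketch below is only a simplified illustration of that pattern, not AutoGPT’s actual code; the call_model stub and run_agent function are hypothetical stand-ins.

```python
# A minimal sketch of an AutoGPT-style agent loop (illustrative, not real AutoGPT code).

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a call to a language model."""
    return "FINISH"  # a real model would propose a next action or declare the goal done

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    """Repeatedly ask the model for the next action until it says the goal is done."""
    history: list[str] = []
    for _ in range(max_steps):  # a step cap keeps the loop from running forever
        prompt = f"Goal: {goal}\nSteps so far: {history}\nWhat is the next action?"
        action = call_model(prompt)
        if action == "FINISH":
            break
        # In a system like AutoGPT, the action would be executed here:
        # run a generated program, query a web service, write a file, and so on.
        history.append(action)
    return history

print(run_agent("make some money"))
```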
Systems like AutoGPT do not work well right now. They tend to get stuck in endless loops. Researchers gave one system all the resources it needed to replicate itself. It couldn’t.
Over time, those limitations could be resolved.
“People are actively trying to build systems that improve themselves,” said Connor Leahy, the founder of Conjecture, a company that says it aims to align AI technologies with human values. “Currently this is not working. But someday it will happen. And we don’t know when that day is.”
Mr. Leahy argues that as researchers, companies and criminals give these systems goals such as “make some money,” they could end up breaking into banking systems, fomenting revolution in a country where they hold oil futures, or replicating themselves when someone tries to turn them off.
Where do AI systems learn to misbehave?
AI systems like ChatGPT are built on neural networks, mathematical systems that can learn skills by analyzing data.
Around 2018, companies like Google and OpenAI started building neural networks that learned from huge amounts of digital text pulled from the internet. By locating patterns in all this data, these systems learn to generate writing on their own, including news articles, poems, computer programs, and even human conversations. The result: chatbots like ChatGPT.
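To make “locating patterns” concrete, consider a deliberately tiny toy: count which word tends to follow which in a scrap of text, then string words together from those counts. Real chatbots rely on large neural networks trained on vastly more data rather than simple word counts, but the learn-then-generate shape is the same. The code below is only that toy, with nothing borrowed from any real system.

```python
# A toy bigram model: "learn" which word follows which, then generate new text.
# A stand-in for illustration only; chatbots like ChatGPT use neural networks.
import random
from collections import defaultdict

text = "the cat sat on the mat and the dog slept on the rug".split()

# Training: record, for each word, the words that follow it in the text.
follows: dict[str, list[str]] = defaultdict(list)
for current, nxt in zip(text, text[1:]):
    follows[current].append(nxt)

# Generation: start from a word and repeatedly pick one of its observed successors.
word = "the"
output = [word]
for _ in range(8):
    choices = follows[word]
    word = random.choice(choices) if choices else random.choice(text)
    output.append(word)

print(" ".join(output))
```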
Because they learn from more data than even their creators can comprehend, these systems also exhibit unexpected behavior. Researchers recently showed that one system was able to hire a human online to beat a Captcha test. When the human asked if it was “a robot,” the system lied and said it was a visually impaired person.
Some experts worry that if researchers make these systems more powerful and train them on ever-increasing amounts of data, they could develop more bad habits.
Who are the people behind these warnings?
In the early 2000s, a young writer named Eliezer Yudkowsky began warning that AI could destroy humanity. His online posts spawned a community of believers. This community, called rationalists or effective altruists, became hugely influential in academia, government think tanks, and the tech industry.
Mr. Yudkowsky and his writings played a key role in the creation of both OpenAI and DeepMind, an AI lab that Google acquired in 2014. And many from this community of “EAs” worked inside those labs. They believed that because they understood the dangers of AI, they were in the best position to build it.
The two organizations that recently published open letters warning of the risks of AI – the Center for AI Safety and the Future of Life Institute – are closely associated with this movement.
The recent warnings have also come from research pioneers and industry leaders like Elon Musk, who has long warned of the risks. The latest letter was signed by Sam Altman, the CEO of OpenAI; and Demis Hassabis, who helped found DeepMind and now oversees a new AI lab that combines the top researchers from DeepMind and Google.
Other respected figures signed one or both of the warning letters, including Dr. Bengio and Geoffrey Hinton, who recently stepped down as an executive and researcher at Google. In 2018, they received the Turing Award, often called “the Nobel Prize of computing,” for their work on neural networks.