One day, the tech industry’s Cassandras say, companies, governments or independent researchers could deploy powerful AI systems to handle everything from business to warfare. Those systems could do things that we do not want them to do. And if humans tried to interfere or shut them down, they could resist or even replicate themselves so they could keep operating.
“Today’s systems are not anywhere close to posing an existential risk,” said Yoshua Bengio, a professor and AI researcher at the University of Montreal. “But in one, two, five years? There is too much uncertainty. That is the issue. We are not sure this won’t pass some point where things get catastrophic.”
The worriers have often used a simple metaphor. If you ask a machine to create as many paper clips as possible, they say, it could get carried away and transform everything — including humanity — into paper clip factories.
How does that tie into the real world — or an imagined world not too many years in the future? Companies could give AI systems more and more autonomy and connect them to vital infrastructure, including power grids, stock markets and military weapons. From there, they could cause problems.
For many experts, this did not seem all that plausible until the last year or so, when companies like OpenAI demonstrated significant improvements in their technology. That showed what could be possible if AI continues to advance at such a rapid pace.
“AI will steadily be delegated, and could — as it becomes more autonomous — usurp decision making and thinking from current humans and human-run institutions,” said Anthony Aguirre, a cosmologist at the University of California, Santa Cruz, and a founder of the Future of Life Institute, the organisation behind one of two open letters.
“At some point, it would become clear that the big machine that is running society and the economy is not really under human control, nor can it be turned off, any more than the S&P 500 could be shut down,” he said.
Or so the theory goes. Other AI experts believe it is a ridiculous premise.
“Hypothetical is such a polite way of phrasing what I think of the existential risk talk,” said Oren Etzioni, the founding chief executive of the Allen Institute for AI, a research lab in Seattle.
Are there signs AI could do this?
Not quite. But researchers are transforming chatbots like ChatGPT into systems that can take actions based on the text they generate. A project called AutoGPT is the prime example.
The idea is to give the system goals like “create a company” or “make some money.” Then it will keep looking for ways of reaching that goal, particularly if it is connected to other internet services.
A system like AutoGPT can generate computer programs. If researchers give it access to a computer server, it could actually run those programs. In theory, this is a way for AutoGPT to do almost anything online — retrieve information, use applications, create new applications, even improve itself.
Systems like AutoGPT do not work well right now. They tend to get stuck in endless loops. Researchers gave one system all the resources it needed to replicate itself. It couldn’t do it.
In time, those limitations could be fixed.
“People are actively trying to build systems that self-improve,” said Connor Leahy, the founder of Conjecture, a company that says it wants to align AI technologies with human values. “Currently, this doesn’t work. But someday, it will. And we don’t know when that day is.”
Leahy argues that as researchers, companies and criminals give these systems goals like “make some money,” they could end up breaking into banking systems, fomenting revolution in a country where they hold oil futures or replicating themselves when someone tries to turn them off.
Where do AI systems learn to misbehave?
AI systems like ChatGPT are built on neural networks, mathematical systems that can learn skills by analyzing data.
Around 2018, companies like Google and OpenAI began building neural networks that learned from massive amounts of digital text culled from the internet. By pinpointing patterns in all this data, these systems learn to generate writing on their own, including news articles, poems, computer programs, even humanlike conversation. The result: chatbots like ChatGPT.
Because they learn from more data than even their creators can understand, these systems also exhibit unexpected behaviour. Researchers recently showed that one system was able to hire a human online to defeat a Captcha test. When the human asked if it was “a robot,” the system lied and said it was a person with a visual impairment.
Some experts worry that as researchers make these systems more powerful, training them on ever larger amounts of data, they could learn more bad habits.
Who are the people behind these warnings?
In the early 2000s, a young writer named Eliezer Yudkowsky began warning that AI could destroy humanity. His online posts spawned a community of believers. Called rationalists or effective altruists, this community became enormously influential in academia, government think tanks and the tech industry.
Yudkowsky and his writings played key roles in the creation of both OpenAI and DeepMind, an AI lab that Google acquired in 2014. And many from the community of “EAs” worked inside these labs. They believed that because they understood the dangers of AI, they were in the best position to build it.
The two organisations that recently released open letters warning of the risks of AI — the Center for AI Safety and the Future of Life Institute — are closely tied to this movement.
The recent warnings have also come from research pioneers and industry leaders like Elon Musk, who has long warned about the risks. The latest letter was signed by Sam Altman, the chief executive of OpenAI, and Demis Hassabis, who helped found DeepMind and now oversees a new AI lab that combines the top researchers from DeepMind and Google.
Other well-respected figures signed one or both of the warning letters, including Bengio and Geoffrey Hinton, who recently stepped down as an executive and researcher at Google. In 2018, they received the Turing Award, often called “the Nobel Prize of computing,” for their work on neural networks.
This article originally appeared in The New York Times.
Written by: Cade Metz
Photographs by: Saratta Chuengsatiansup
©2023 THE NEW YORK TIMES