Ilya Sutskever, then chief scientist at OpenAI. Sutskever recently left the firm. He had been on the company’s board and had voted to oust CEO Sam Altman last year. Photo / The New York Times
A group of OpenAI insiders is blowing the whistle on what they say is a culture of recklessness and secrecy at the San Francisco artificial intelligence company, which is racing to build the most powerful AI systems ever created.
The group, which includes nine current and former OpenAI employees, has rallied in recent days around shared concerns that the company has not done enough to prevent its AI systems from becoming dangerous.
The members say OpenAI, which started as a nonprofit research lab and burst into public view with the 2022 release of ChatGPT, is putting a priority on profits and growth as it tries to build artificial general intelligence, or AGI, the industry term for a computer program capable of doing anything a human can.
They also claim that OpenAI has used hardball tactics to prevent workers from voicing their concerns about the technology, including restrictive nondisparagement agreements that departing employees were asked to sign.
“OpenAI is really excited about building AGI, and they are recklessly racing to be the first there,” said Daniel Kokotajlo, a former researcher in OpenAI’s governance division and one of the group’s organisers.
The group published an open letter Tuesday calling for leading AI companies, including OpenAI, to establish greater transparency and more protections for whistleblowers.
Other members include William Saunders, a research engineer who left OpenAI in February, and three other former OpenAI employees: Carroll Wainwright, Jacob Hilton and Daniel Ziegler. Several current OpenAI employees endorsed the letter anonymously because they feared retaliation from the company, Kokotajlo said. One current and one former employee of Google DeepMind, Google’s central AI lab, also signed.
A spokesperson for OpenAI, Lindsey Held, said in a statement: “We’re proud of our track record providing the most capable and safest AI systems and believe in our scientific approach to addressing risk. We agree that rigorous debate is crucial given the significance of this technology, and we’ll continue to engage with governments, civil society and other communities around the world.”
The campaign comes at a rough moment for OpenAI. It is still recovering from an attempted coup last year, when members of the company’s board voted to fire CEO Sam Altman over concerns about his candor. Altman was brought back days later, and the board was remade with new members.
The company also faces legal battles with content creators who have accused it of stealing copyrighted works to train its models. (The New York Times sued OpenAI and its partner, Microsoft, for copyright infringement last year.) And its recent unveiling of a hyper-realistic voice assistant was marred by a public spat with Hollywood actress Scarlett Johansson, who claimed that OpenAI had imitated her voice without permission.
But nothing has stuck like the charge that OpenAI has been too cavalier about safety.
Last month, two senior AI researchers - Ilya Sutskever and Jan Leike - left OpenAI under a cloud. Sutskever, who had been on OpenAI’s board and voted to fire Altman, had raised alarms about the potential risks of powerful AI systems. His departure was seen by some safety-minded employees as a setback.
So was the departure of Leike, who along with Sutskever had led OpenAI’s “superalignment” team, which focused on managing the risks of powerful AI models. In a series of public posts announcing his departure, Leike said he believed that “safety culture and processes have taken a back seat to shiny products.”
Neither Sutskever nor Leike signed the open letter written by former employees. But their exits galvanised other former OpenAI employees to speak out.
“When I signed up for OpenAI, I did not sign up for this attitude of ‘Let’s put things out into the world and see what happens and fix them afterward,’” Saunders said.
Some of the former employees have ties to effective altruism, a utilitarian-inspired movement that has become concerned in recent years with preventing existential threats from AI. Critics have accused the movement of promoting doomsday scenarios about the technology, such as the notion that an out-of-control AI system could take over and wipe out humanity.
Kokotajlo, 31, joined OpenAI in 2022 as a governance researcher and was asked to forecast AI progress. He was not, to put it mildly, optimistic.
In his previous job at an AI safety organisation, he predicted that AGI might arrive in 2050. But after seeing how quickly AI was improving, he shortened his timelines. Now he believes there is a 50 per cent chance that AGI will arrive by 2027 - in just three years.
He also believes that the probability that advanced AI will destroy or catastrophically harm humanity - a grim statistic often shortened to “p(doom)” in AI circles - is 70 per cent.
At OpenAI, Kokotajlo saw that even though the company had safety protocols in place - including a joint effort with Microsoft known as the “deployment safety board”, which was supposed to review new models for major risks before they were publicly released - they rarely seemed to slow anything down.
For example, he said, in 2022 Microsoft began quietly testing in India a new version of its Bing search engine that some OpenAI employees believed contained a then-unreleased version of GPT-4, OpenAI’s state-of-the-art large language model. Kokotajlo said he was told that Microsoft had not gotten the safety board’s approval before testing the new model, and after the board learned of the tests - via a series of reports that Bing was acting strangely toward users - it did nothing to stop Microsoft from rolling it out more broadly.
A Microsoft spokesperson, Frank Shaw, disputed those claims. He said the India tests hadn’t used GPT-4 or any OpenAI models. The first time Microsoft released technology based on GPT-4 was in early 2023, he said, and it was reviewed and approved by a predecessor to the safety board.
Eventually, Kokotajlo said, he became so worried that, last year, he told Altman that the company should “pivot to safety” and spend more time and resources guarding against AI’s risks rather than charging ahead to improve its models. He said Altman had claimed to agree with him, but that nothing much changed.
In April, he quit. In an email to his team, he said he was leaving because he had “lost confidence that OpenAI will behave responsibly” as its systems approach human-level intelligence.
“The world isn’t ready, and we aren’t ready,” Kokotajlo wrote. “And I’m concerned we are rushing forward regardless and rationalising our actions.”
OpenAI said last week that it had begun training a new flagship AI model, and that it was forming a new safety and security committee to explore the risks associated with the new model and other future technologies.
On his way out, Kokotajlo refused to sign OpenAI’s standard paperwork for departing employees, which included a strict nondisparagement clause barring them from saying negative things about the company, under threat of having their vested equity taken away.
Many employees could lose out on millions of dollars if they refused to sign. Kokotajlo’s vested equity was worth roughly US$1.7 million, he said, which amounted to the vast majority of his net worth, and he was prepared to forfeit all of it.
(A minor firestorm ensued last month after Vox reported news of these agreements. In response, OpenAI claimed that it had never clawed back vested equity from former employees, and would not do so. Altman said he was “genuinely embarrassed” not to have known about the agreements, and the company said it would remove nondisparagement clauses from its standard paperwork and release former employees from their agreements.)
In their open letter, Kokotajlo and the other former OpenAI employees call for an end to using nondisparagement and nondisclosure agreements at OpenAI and other AI companies.
“Broad confidentiality agreements block us from voicing our concerns, except to the very companies that may be failing to address these issues,” they write.
They also call for AI companies to “support a culture of open criticism” and establish a reporting process for employees to anonymously raise safety-related concerns.
They have retained a pro bono lawyer, Lawrence Lessig, the prominent legal scholar and activist. Lessig also advised Frances Haugen, a former Facebook employee who became a whistleblower and accused that company of putting profits ahead of safety.
In an interview, Lessig said that while traditional whistleblower protections typically applied to reports of illegal activity, it was important for employees of AI companies to be able to discuss risks and potential harms freely, given the technology’s importance.
“Employees are an important line of safety defence, and if they can’t speak freely without retribution, that channel’s going to be shut down,” he said.
Held, the OpenAI spokesperson, said the company had “avenues for employees to express their concerns”, including an anonymous integrity hotline.
Kokotajlo and his group are sceptical that self-regulation alone will be enough to prepare for a world with more powerful AI systems. So they are calling for lawmakers to regulate the industry, too.
“There needs to be some sort of democratically accountable, transparent governance structure in charge of this process,” Kokotajlo said. “Instead of just a couple of different private companies racing with each other, and keeping it all secret.”