
A high-profile former OpenAI policy researcher, Miles Brundage, took to social media on Wednesday to criticize OpenAI for "rewriting the history" of its deployment approach to potentially risky AI systems.
Earlier this week, OpenAI published a document outlining its current philosophy on AI safety and alignment, the process of designing AI systems that behave in desirable and explainable ways. In the document, OpenAI said that it sees the development of AGI, broadly defined as AI systems that can perform any task a human can, as a "continuous path" that requires "iteratively deploying and learning" from AI technologies.
"In a discontinuous world […] safety lessons come from treating the systems of today with outsized caution relative to their apparent power, [which] is the approach we took for [our AI model] GPT‑2," OpenAI wrote. "We now view the first AGI as just one point along a series of systems of increasing usefulness […] In the continuous world, the way to make the next system safe and beneficial is to learn from the current system."
But Brundage contends that GPT-2 did, in fact, warrant considerable caution at the time of its release, and that this was "100% consistent" with OpenAI's iterative deployment strategy today.
"OpenAI's release of GPT-2, which I was involved in, was 100% consistent [with and] foreshadowed OpenAI's current philosophy of iterative deployment," Brundage wrote in a post on X. "The model was released incrementally, with lessons shared at each step. Many security experts at the time thanked us for this caution."
Brundage, who joined OpenAI as a research scientist in 2018, was the company's head of policy research for several years. On OpenAI's "AGI readiness" team, he had a particular focus on the responsible deployment of language generation systems such as OpenAI's AI chatbot platform ChatGPT.
GPT-2, which OpenAI announced in 2019, was a progenitor of the AI systems powering ChatGPT. GPT-2 could answer questions about a topic, summarize articles, and generate text on a level sometimes indistinguishable from that of humans.
While GPT-2 and its outputs may look basic today, they were cutting-edge at the time. Citing the risk of malicious use, OpenAI initially refused to release GPT-2's source code, opting instead to give selected news outlets limited access to a demo.
The decision was met with mixed reviews from the AI industry. Many experts argued that the threat posed by GPT-2 had been exaggerated, and that there wasn't any evidence the model could be abused in the ways OpenAI described. AI-focused publication The Gradient went so far as to publish an open letter requesting that OpenAI release the model, arguing it was too technologically important to hold back.
OpenAI eventually did release a partial version of GPT-2 six months after the model's unveiling, followed by the full system several months after that. Brundage thinks this was the right approach.
"What part of [the GPT-2 release] was motivated by or premised on thinking of AGI as discontinuous? None of it," he said in a post on X. "What's the evidence this caution was 'disproportionate' ex ante? Ex post, it prob. would have been OK, but that doesn't mean it was responsible to YOLO it [sic] given info at the time."
Brundage fears that OpenAI's aim with the document is to set up a burden of proof where "concerns are alarmist" and "you need overwhelming evidence of imminent dangers to act on them." This, he argues, is a "very dangerous" mentality for advanced AI systems.
"If I were still working at OpenAI, I would be asking why this [document] was written the way it was, and what exactly OpenAI hopes to achieve by poo-pooing caution in such a lop-sided way," Brundage added.
OpenAI has historically been accused of prioritizing "shiny products" at the expense of safety, and of rushing product releases to beat rival companies to market. Last year, OpenAI dissolved its AGI readiness team, and a string of AI safety and policy researchers departed the company for rivals.
Competitive pressures have only ramped up. Chinese AI lab DeepSeek captured the world's attention with its openly available R1 model, which matched OpenAI's o1 "reasoning" model on a number of key benchmarks. OpenAI CEO Sam Altman has admitted that DeepSeek has narrowed OpenAI's technological lead, and said that OpenAI would "pull up some releases" to better compete.
There's a lot of money on the line. OpenAI loses billions annually, and the company has reportedly projected that its annual losses could triple to $14 billion by 2026. A faster product release cycle could benefit OpenAI's bottom line in the near term, but possibly at the expense of safety in the long term. Experts like Brundage question whether the trade-off is worth it.