
Should artists be paid for training data? An OpenAI vice president would not say

Should artists whose work was used to train generative AI like ChatGPT be compensated for their contributions? Peter Deng, vice president of consumer products at OpenAI, the creator of ChatGPT, was reluctant to give an answer when asked on the SXSW main stage this afternoon.

“That’s a great question,” he said when SignalFire partner (and former TechCrunch writer) Josh Constine, who interviewed Deng in a wide-ranging fireside chat, asked the question. Some in the crowd shouted “yes” in response, which Deng acknowledged. “I heard from the audience that yes. I hear it from the audience.”

That Deng dodged the question is not surprising. OpenAI is in a delicate legal position when it comes to the ways it uses data to train generative AI systems like the art-creation tool DALL-E 3, which is built into ChatGPT.

Systems like DALL-E 3 are trained on a huge number of examples (artwork, illustrations, photographs, etc.) that usually come from public sites and data sets on the web. OpenAI and other generative AI vendors argue that fair use, the legal doctrine that allows the use of copyrighted works to make a secondary creation as long as it is transformative, protects their practice of mining public data and using it for training without compensating or even crediting artists.

In fact, OpenAI recently argued that it would be impossible to create useful AI models without copyrighted material. “Training AI models using publicly available Internet materials is fair use, as supported by long-standing and widely accepted precedents,” the company writes in a January blog post. “We believe this principle is fair to creators, necessary for innovators, and critical to America’s competitiveness.”

The creators, unsurprisingly, disagree.

A class-action lawsuit brought by artists including Grzegorz Rutkowski, known for his work on Dungeons & Dragons and Magic: The Gathering, against OpenAI and several of its rivals, including Midjourney and DeviantArt, is making its way through the courts. The plaintiffs argue that tools like DALL-E 3 and Midjourney replicate artists’ styles without the artists’ explicit permission, allowing users to generate new works that resemble the artists’ originals and for which the artists receive no payment.

OpenAI has licensing agreements in place with some content providers, such as Shutterstock, and allows webmasters to block its web crawler from scraping their sites for training data. Additionally, like some of its rivals, OpenAI allows artists to “opt out” and remove their work from the data sets the company uses to train its image-generating models. (Some artists have described the opt-out tool, which requires submitting an individual copy of each image for removal along with a description, as onerous.)

Deng said he believes artists should have more agency in the creation and use of generative AI tools like DALL-E, but he isn’t sure exactly what form that might take.

“[A]rtists must be part of [the] ecosystem as much as possible,” Deng said. “I think if we can find a way to speed up the creation of art, we’ll really help the industry a little more… In a sense, every artist has been inspired by artists who came before them, and I wonder how much of that will be accelerated with this.”