Creators Demand Tech Giants To Pay For AI Training Data

Image by Cristofer Maximilian, from Unsplash

Governments are allowing AI developers to steal content – both creative and journalistic – for fear of upsetting the tech sector and damaging investment, a UK Parliamentary committee heard this week, as first reported by The Register.

In a Rush? Here are the Quick Facts!

  • UK MPs heard concerns over AI exploiting copyrighted content without compensation.
  • Composer Max Richter warned AI threatens musicians’ livelihoods and originality.
  • Publishers found 1,000 bots scraping data from 3,000 news websites for AI models.

Despite a tech industry figure insisting that the “original sin” of text and data mining had already occurred and that content creators and legislators should move on, a joint committee of MPs heard from publishers and a composer angered by the tech industry’s unchecked exploitation of copyrighted material.

The Culture, Media and Sport Committee and Science, Innovation and Technology Committee asked composer Max Richter how he would know if “bad-faith actors” were using his material to train AI models.

“There’s really nothing I can do,” he told MPs, as reported by The Register. “There are a couple of music AI models, and it’s perfectly easy to make them generate a piece of music that sounds uncannily like me.”

“That wouldn’t be possible unless it had hoovered up my stuff without asking me and without paying for it. That’s happening on a huge scale. It’s obviously happened to basically every artist whose work is on the internet,” Richter added.

Richter, whose work has been used in major film and television scores, warned that automated material would edge out human creators, impoverishing musicians. “You’re going to get a vanilla-ization of music culture,” he said, as reported by The Register.

“If we allow the erosion of copyright, which is really how value is created in the music sector, then we’re going to be in a position where there won’t be artists in the future,” he added.

Former Google staffer James Smith echoed this sentiment, saying, “The original sin, if you like, has happened.” He suggested governments should focus on supporting licensing as an alternative monetization model, reported The Register.

Matt Rogerson, director of global public policy at the Financial Times, disagreed, emphasizing that AI companies were actively scraping content without permission. “We can only deal with what we see in front of us,” he said, as reported by The Register.

A study found that 1,000 unique bots were scraping data from 3,000 publisher websites, likely for AI model training, according to The Register.

Sajeeda Merali, chief executive of the Professional Publishers Association, criticized the AI sector’s argument that transparency over data scraping was commercially sensitive. “Its real concern is that publishers would then ask for a fair value in exchange for that data,” she said, as reported by The Register.

The controversy over AI training data escalated in October when over 13,500 artists signed a petition to stop AI companies from scraping creative works without consent. Organized by composer and former AI executive Ed Newton-Rex, the petition was signed by public figures like Julianne Moore, Thom Yorke, and Kazuo Ishiguro.

“There are three key resources that generative AI companies need to build AI models: people, compute, and data. They spend vast sums on the first two – sometimes a million dollars per engineer, and up to a billion dollars per model. But they expect to take the third – training data – for free,” Newton-Rex said.

Tensions heightened further when a group of artists leaked access to OpenAI’s text-to-video tool, Sora, in protest. Calling themselves “Sora PR Puppets,” they provided free access to Sora’s API via Hugging Face, allowing users to generate video clips for three hours before OpenAI shut it down.

The protesters claimed OpenAI treated artists as “PR puppets,” exploiting unpaid labor for a $157 billion company. They released an open letter demanding fair compensation and invited artists to develop their own AI models.

With artists and publishers pushing back against AI’s unchecked use of their content, the debate over ethical AI training practices continues to intensify. The UK government faces mounting pressure to implement policies that protect creative industries without stifling technological advancement.
