In May, OpenAI said it was developing a tool to let creators specify how they want their works to be included — or excluded — from its AI training data. But 7 months later, this feature has yet to see the light of day.
Called Media Manager, the tool would “identify copyrighted text, images, audio and video,” OpenAI said at the time, to reflect creators’ preferences “across multiple sources.” It was intended to fend off some of the company’s harshest critics and potentially shield OpenAI from IP-related legal challenges.
But people familiar with the matter tell TechCrunch that the tool was rarely viewed as an important launch internally. “I don’t think it was a priority,” said a former OpenAI employee. “To be honest, I don’t remember anyone working on it.”
A person who coordinates work with OpenAI but is not employed by the company told TechCrunch in December that they had discussed the tool with OpenAI in the past, but that there had been no recent updates. (These people declined to be publicly identified discussing confidential business matters.)
And Fred von Lohmann, a member of OpenAI’s legal team who was working on Media Manager, moved into a part-time consulting role in October. OpenAI PR confirmed von Lohmann’s move to TechCrunch via email.
OpenAI has yet to provide an update on Media Manager’s progress, and the company missed a self-imposed deadline to deploy the tool by 2025.
IP problems
AI models like OpenAI’s learn patterns in data sets in order to make predictions, such as that a person biting into a burger will leave a bite mark. This allows models to learn how the world works, to some extent, by observing it. ChatGPT can write persuasive emails and essays, while Sora, OpenAI’s video generator, can create relatively realistic visuals.
The ability to draw on examples of writing, film, and more to generate new works makes AI incredibly powerful. But it is also regurgitative. When prompted in a certain way, the models – most of which have been trained on countless web pages, videos and images – produce near-duplicates of that data, which, despite being “publicly available”, was never meant to be used this way.
For example, Sora can generate clips featuring the TikTok logo and popular video game characters. The New York Times has had ChatGPT quote its articles verbatim (OpenAI blamed the behavior on a “hack”).
This has understandably upset creators whose works have been included in AI training without their permission. Many have lawyered up.
OpenAI is fighting lawsuits filed by artists, writers, YouTubers, computer scientists, and news organizations, all of whom claim the startup trained on their works illegally. Plaintiffs include authors Sarah Silverman and Ta-Nehisi Coates, visual artists, and media conglomerates such as The New York Times and Radio-Canada, to name a few.
OpenAI has pursued licensing agreements with select partners, but not all creators find the terms attractive.
OpenAI offers creators several ad hoc ways to “opt out” of its AI training. Last September, the company launched a submission form to allow artists to flag their work for removal from its upcoming training batches. And OpenAI has long allowed webmasters to block its web crawlers from collecting data across their domains.
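That last mechanism relies on the standard Robots Exclusion Protocol: per OpenAI’s published crawler documentation, a site owner who wants to keep the company’s training crawler out entirely can add an entry like this to the site’s robots.txt (a minimal example; `GPTBot` is the user-agent token OpenAI publishes for that crawler):

```
User-agent: GPTBot
Disallow: /
```

Compliance is voluntary on the crawler’s part, and the rule only affects future crawls; it does nothing about data already collected.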
But creators have criticized these methods as haphazard and inadequate. There are no specific opt-out mechanisms for written works, videos or audio recordings. And the opt-out form for images requires submitting a copy of each image to be removed along with a description, a painstaking process.
Media Manager was pitched as a complete overhaul – and expansion – of OpenAI’s opt-out solutions today.
In its announcement post in May, OpenAI said Media Manager would use “cutting-edge machine learning research” to enable content creators and owners to “show (OpenAI) what they own.” OpenAI, which claimed it was collaborating with regulators while developing the tool, said it hoped Media Manager would “set a standard across the AI industry”.
OpenAI has never publicly mentioned Media Manager since then.
A spokesperson told TechCrunch that the tool was “still in development” as of August, but did not respond to a request for comment in mid-December.
OpenAI hasn’t given any indication of when Media Manager might launch — or even what features and capabilities it might launch with.
Fair use
Assuming Media Manager does arrive at some point, experts aren’t convinced it will ease creators’ concerns — or do much to resolve legal issues surrounding the use of AI and IP.
Adrian Cyhan, an IP lawyer at Stubbs Alderton & Markiles, noted that Media Manager as described is an ambitious undertaking. Even platforms as big as YouTube and TikTok struggle with content ID at scale. Could OpenAI really do better?
“Ensuring compliance with legally required creator protections and potential compensation claims under review presents challenges,” Cyhan told TechCrunch, “especially given the rapidly evolving and potentially disparate legal landscape in national and local jurisdictions.”
Ed Newton-Rex, founder of Fairly Trained, a non-profit organization that certifies that AI companies respect creators’ rights, believes Media Manager would unfairly shift the burden of controlling AI training onto creators; by not using it, they may be giving tacit approval for their works to be used. “Most creators will never hear about it, let alone use it,” he told TechCrunch. “But it will nevertheless be used to defend the mass exploitation of creative work against the wishes of the creators.”
Mike Borella, co-chair of MBHB’s AI practice group, noted that opt-out systems don’t always account for transformations that can be made to an artwork, such as an image that has been scaled down. They also may not address the pervasive scenario of third-party platforms hosting copies of creators’ content, added Joshua Weigensberg, an IP and media attorney at Pryor Cashman.
“Creators and copyright owners do not control, and often do not even know, where their works appear online,” Weigensberg said. “Even if a creator tells each AI platform that they are opting out of training, those companies may continue to train on copies of their work available on third-party websites and services.”
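The transformation problem Borella describes is easy to demonstrate. A hypothetical sketch (not OpenAI’s method, and a deliberately toy image format): an exact content hash changes completely when an image is merely downscaled, whereas a perceptual hash – here a simple average hash over brightness blocks – survives the resize. Any matching system built on exact fingerprints would miss the resized copy.

```python
import hashlib

def average_hash(pixels, size=4):
    # Downsample a grayscale image (list of rows of 0-255 values) to a
    # size x size grid by block averaging, then emit 1 bit per cell:
    # brighter than the overall mean -> 1, else 0.
    h, w = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        for c in range(size):
            block = [pixels[y][x]
                     for y in range(r * h // size, (r + 1) * h // size)
                     for x in range(c * w // size, (c + 1) * w // size)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return tuple(int(v > mean) for v in cells)

def downscale(pixels, factor=2):
    # Naive 2x downscale by averaging non-overlapping 2x2 blocks.
    return [[(pixels[y][x] + pixels[y][x + 1] +
              pixels[y + 1][x] + pixels[y + 1][x + 1]) // 4
             for x in range(0, len(pixels[0]), factor)]
            for y in range(0, len(pixels), factor)]

# A toy 8x8 "image": bright left half, dark right half.
img = [[200] * 4 + [30] * 4 for _ in range(8)]
small = downscale(img)

# Exact fingerprint: SHA-256 over the raw pixel bytes.
exact = lambda p: hashlib.sha256(bytes(v for row in p for v in row)).hexdigest()

print(exact(img) == exact(small))                # False: exact hashes break on resize
print(average_hash(img) == average_hash(small))  # True: the perceptual hash survives it
```

Real systems (YouTube’s Content ID among them) lean on far more robust fingerprinting, but the underlying trade-off is the same: the fuzzier the match must be, the harder it is to run reliably at scale.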
Media Manager may not even be particularly favorable to OpenAI, at least from a legal point of view. Evan Everist, a partner at Dorsey & Whitney who specializes in copyright law, said that while OpenAI could use the tool to show a judge that it is toning down its training on IP-protected content, Media Manager would likely not protect the company from damages if found to have infringed.
“Copyright owners have no obligation to come out and preemptively tell others not to infringe their works before that infringement occurs,” Everist said. “The basics of copyright law still apply – i.e., don’t take and copy other people’s stuff without permission. This feature may be more about PR, positioning OpenAI as an ethical user of content.”
A reckoning
In the absence of Media Manager, OpenAI has implemented filters – albeit imperfect ones – to prevent its models from regurgitating training examples. And in the lawsuits it is fighting, the company continues to claim fair use protections, asserting that its models create transformative works, not plagiarism.
OpenAI may prevail in its copyright disputes.
Courts could rule that the company’s AI has a “transformative purpose”, following the precedent set roughly a decade ago in the publishing industry’s lawsuit against Google. In that case, a court ruled that Google’s copying of millions of books for Google Books, a type of digital archive, was permissible.
OpenAI has publicly said that it would be “impossible” to train competitive AI models without using copyrighted material – authorized or not. “Limiting training data to public domain books and drawings created more than a century ago may produce an interesting experiment, but will not provide AI systems that meet the needs of today’s citizens,” the company wrote in a submission in January to the UK House of Lords.
If the courts ultimately declare OpenAI the winner, Media Manager wouldn’t serve much of a legal purpose. OpenAI seems willing to make that bet — or to reconsider its opt-out strategy.