Yes I’m aware that Nvidia and Adobe have announced they will license training data. I don’t know what those agreements will look like, but I can’t imagine that they make any sense in terms of traditional licensing arrangements. Rather, I’m guessing they just brute forced things to build goodwill among artist communities and perhaps to distinguish themselves from other AI companies. I sincerely doubt these arrangements will help artists though, and I fear these licensing conversations will distract from better conversations on how to balance interests. To explain my thoughts on this, I first have to start from the beginning.
AI training is very complicated, but can be explained pretty simply using the lie-to-children model. There’s an older video by CGP Grey (and footnote video) that does a beautiful job, but I will try to summarize. AI training is the solution to the problem of trying to make a computer good at something complicated. Let’s say you want to teach a computer to play chess (or Go). The simplest, least efficient way to start is to tell the computer “if the human’s first move is this pawn, then move that pawn” and so on. Writing the program like this would take forever, and the computer would be bad at repeated games of chess because a human could quickly figure out what the computer is doing. So the programmer has to find a more efficient set of rules that also has better success at adapting to and beating human players. But there is a limit: the program keeps getting more complex, and great chess players will still find the difficulty lacking. A big part of the problem is that humans and computers just think differently, and it’s really hard to write a program to get a computer to act like humans do when it comes to things that humans are really good at.
Fortunately, there are several methods of getting computers to figure it out on their own. While these methods function differently, they generally involve giving the computer a goal, a lot of data, a way to test itself, and a tremendous amount of processing power. The computer keeps iterating until it passes the test, and the human keeps tweaking the goal, the data, and the test based on what kind of results they are getting. After a while this learning starts to resemble how humans learn, but done in a way that only computers can do. The result of this learning is often called a “black box”, because we don’t know exactly how the computer uses the model it creates to solve the problems we give it.
AI image generation training, at its simplest, is giving the model an image and text pair and telling it that some of the words describe a style and some of the words describe objects within the picture. The goal set for the model is to understand and be able to reproduce the style or the objects (or both) based on the words. Give it a picture of The Starry Night and a description, and the model will start to learn the style concepts based on the words “Van Gogh” and “post-impressionism.” It will also start to understand the objects of city, moon, stars, and how they would look at night. It takes a lot of images to train towards a functional understanding of these concepts, and after each image is ingested into the model it’s basically trash. It isn’t stored in the model, only the concepts are. And those concepts should ideally not be tied to a single image (see overfitting).
This brute force learning is not that different from how humans learn art. A human has to learn the practical techniques of how to make art, but they also should be looking at other artists’ work. When I was learning pottery, there was a point where my instructor said now that you can produce a pot it’s time to figure out your style. That involved looking at lots of pictures of pottery. For computers, that’s the only step that really matters. Teaching a human to create art in this way would be like locking someone in a room with every Monet and a set of paints and not letting them out until they have produced impressionist art. Importantly, current AI simply reproduces learned styles. It would not create post-impressionism in protest if given the same task.
The war between impressionism and post-impressionism
Licensing can be a pretty complex process in which copyright law, a lot of lawyers, mutual business interests, and the threat of lawsuit get together to produce a contract that everyone thinks should have been better for them. There is also usually art involved.
To continue oversimplifying things, let’s just say that every time a song is streamed someone, somewhere, moves a bead on an abacus. At the end of the month an amount is paid based on whatever the abacus says. An ad campaign uses a photo? Another abacus bead moves based on where it’s displayed, how many people see it, etc. A song gets used in a tire commercial? More abacuses. The important thing is that an artist’s work is getting reproduced, in whole or in part, in a quantifiable way. That quantity relates back to agreed-to terms and the artist is paid based on those terms.
Let’s ignore the fact that AI training is likely fair use. Let’s ignore that the audience for the works is a computer. Let’s ignore that the works are only used to teach the computer concepts. Let’s ignore the fact that those works are never (ideally) reproduced or displayed to users of the AI. Even ignoring all that, there is still the problem that the number of times each work is used is one (ideally). Stable Diffusion’s first release ingested 2.3 billion images. So in abacus terms each work moves one lonely bead to make up 1/2.3 billionth (and counting) of a share in a theoretical license fee.
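The abacus math can be sketched in a few lines of Python. The 2.3 billion figure comes from the paragraph above. The license pool is purely hypothetical, a number I am inventing to show the scale of the dilution, and the even per-work split is likewise an assumption rather than a description of any real agreement.

```python
from fractions import Fraction

# From the text: images ingested for Stable Diffusion's first release.
TRAINING_IMAGES = 2_300_000_000

# Hypothetical: an invented annual license pool, for illustration only.
HYPOTHETICAL_POOL_DOLLARS = 100_000_000


def share(works_by_artist=1, total_works=TRAINING_IMAGES):
    """Fraction of the pool owed if every work counts as one bead."""
    return Fraction(works_by_artist, total_works)


def payout(pool_dollars, works_by_artist=1, total_works=TRAINING_IMAGES):
    """Dollar amount for an artist under an even per-work split."""
    return float(pool_dollars * share(works_by_artist, total_works))


one_work = payout(HYPOTHETICAL_POOL_DOLLARS)
thousand_works = payout(HYPOTHETICAL_POOL_DOLLARS, works_by_artist=1000)

print(f"one work:   ${one_work:.4f}")
print(f"1000 works: ${thousand_works:.2f}")
```

Even with a nine-figure pool, one work earns a few cents, and an artist with a thousand works in the training set earns roughly the price of a dinner.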
The next problem is what the share is in. The Copyright Office has so far said that AI generated elements of a work can’t be copyrighted. Grimes recently tweeted that she would split royalties 50/50 on songs using her voice. But royalties are based on licensing, which is based on copyright. Theoretically streaming companies and other distributors could pay royalties on a copyright-less song, but would they? And would listeners pay for a song in the public domain? Maybe. The more likely answer is that works produced by AI have no value on their own in any meaningful sense, which leaves nothing for artists to get a piece of.
OK, but what about giving artists a share in some right for their contribution to the model? I don’t think this works for a few reasons, but there are two main theories behind this proposal that we have to take in turn. The first theory is related to the open question of what claims an artist has for a song that sounds like them, but isn’t based on any song in their catalog. Answering that question the wrong way could matter for music generally. Creating a rule that gives an artist rights in songs that sound like them, but otherwise aren’t infringing, would mean Led Zeppelin would have a claim over Greta Van Fleet’s catalog. We currently don’t give rights for even heavy influence, or for impersonation. The same is true in other arts as well.
The second theory is based on the fact that AI can be used to infringe copyright in a traditional sense. While output is ideally original, sometimes overfitting occurs and the model will output works that are close enough to works it trained on to be considered infringing under current copyright law. This raises many parallels to the Betamax case, where the Supreme Court ruled that distributing recordable media and recording devices did not result in contributory infringement even if they were used for infringement. Congress reacted to this ruling with the Audio Home Recording Act, which created a generic royalty for each device or recordable media sold. I don’t think that’s the answer here because overfitting is a bug, and AI developers are (and should be) working towards fixing that bug.
Either way there isn’t even a good way to divvy up what money might be available. AI training requires a lot of data, so any particular artist’s share would be extremely dilute. One might argue that shares should be paid out based on importance to the model, but that is likely impossible to figure out in any meaningful way. AI functions as a black box, and it’s hard to quantify how much of each work is in any particular concept it invokes when responding to a prompt. That holds true even when artists’ names are used in prompts. Yes the artist might have greater influence in the output (as intended), but all the other works were still necessary to teach the model what a cat is and what a bicycle is and how a cat might ride a bicycle through Dali’s The Persistence of Memory.
Regardless, none of this solves the problem artists are facing: that they are competing with a computer that is faster, cheaper, dumber, weirder, more chaotic, and less open to feedback. A residual of five cents a month is not going to fix that, even if the artist can collect. For example, Stability AI has released its Stable Diffusion model as open source, and many people (myself included) run it on their own computers for free. There’s no way to put the genie back in the bottle.
None of this dismisses the fact that there should be conversations on how to navigate this new space in a way that preserves art and artists. The simplest answer might be that artists are smart and talented people and they are already figuring it out. A new survey shows that 60% of musicians are using AI to create. Special effects artists Corridor Crew have figured out a workflow to basically do AI rotoscoping. Professional photographer Aaron Nace (owner of Phlearn) has a video teaching how to integrate AI into creative photography projects. Disney animator Aaron Blaise has a video encouraging artists to embrace new AI technology. Artists will adapt to using AI as a tool and will produce better work than people who don’t understand art and have no idea what they are doing. And their use of AI will likely be a small enough part of their works that they will still be able to rely on copyright if that is their model.
The next simplest answer is that denying copyright to low-effort uses of AI is probably the best way to protect artists long term. One of the biggest threats artists face from AI is that powerful studios, labels, etc., will use AI to cut them out of the process. If that work can’t be copyrighted, because no human contributed anything of artistic merit, then the copyright industry won’t be able to turn AI into a cash cow by cutting its artist costs.
Finally, yes maybe we should have a conversation about whether there is a line in training models heavily towards the style of living artists. But these kinds of threats are way more present among “hobbyist” communities (NSFW warning for like half the content here) that play with the technology for their own interests. Larger AI developers all seem to be training away from that towards more general models that are easier to use and based on simple prompts.
Matthew Lane is a Senior Director at InSight Public Affairs. Originally posted to Substack. Republished here with permission.