Copyright was not made for a world with AI, so the first case brought to trial will be crucial.
The frightening fact that an AI can now write a book doesn’t so much challenge the legal status quo in the publishing industry as zap it to bits with a laser gun.
Until now copyright law has served publishing companies well. It establishes their exclusive right to publish a particular book, either as the rights owner or the licensee. The entire business model is based on the principle that copying is not allowed, so a return can be made on an investment.
Copyright law works just fine to prevent outright counterfeits and acts of human plagiary, as that is what it was designed to do. But it’s all out of ideas when it comes to computer-generated works which, while technically new, rely entirely on the digital ransacking of source material. It lacks provision to protect rights owners in this situation.
Copyright only really does what it says on the tin. It’s the right to make copies of original works, be they books, songs, films or visual art works. To make copies, a publisher needs either to own the copyright in a text or to hold a licence from the author. Copyright is easy to comprehend at this level, because the copies being made are literally just that – actual facsimiles of the original text.
However, things get very complicated very quickly when only a part of the work is copied. The law falls back on one of its favourite definitions to handle this, and states that copyright is infringed if a copy is made of the whole work or any “substantial” part of it.
This elusive and infuriating term essentially leaves the question of copyright infringement open to almost limitless interpretation. Courts have grappled with this over the years in all kinds of cases, and the only really meaningful limitation they have come up with is that “substantial” is a qualitative rather than quantitative determination. When it comes to a literary work this has meant that a small section of a novel, even as little as a few sentences, can be a substantial part if it can be shown to be particularly meaningful.
What is very clear though is that copyright does not protect the ideas vested in a text, or something like an authorship style. In this sense copyright is not about copying, rather it is about making a copy, which is an important distinction. There has to be an actual part of the original work in the copy for there to be infringement.
So, with an AI-generated text, the question of infringement would come down to whether or not it had actually made a copy of the whole or a substantial part of an original work. There are two basic ways it might do this that existing copyright law anticipates.
The first, if provable, would be a silver bullet: if an electronic copy of a work is recorded anywhere, then there is infringement. If an AI had to make and store a digital copy of an original work in order to rely on it to generate text, then there’s infringement right there. Whether this actually happens, or more importantly can be proved, is much harder to answer.
The second way an AI might use a substantial part of an original work is if blocks of text were reproduced wholesale, and that would depend on just how the AI was programmed to behave. We have seen AIs do something very similar with illustrations, where entire parts of earlier images have been copied and placed in a collage with others.
However, leaving aside these conventional infringement situations, the much larger question is whether any AI-generated work should be considered copyright infringing due to the automated extent to which it must rely on earlier works.
Artificial intelligence is not capable of genuine originality in the human sense, and relies instead on processing existing texts to generate something similar. An AI could potentially produce an original book in the distinctive style of a known author, but it would do so by slavishly relying on all kinds of data vested in their existing works. Textual patterns, sentence construction, common words and expressions, and even narrative structures are recognised by the AI and then reproduced. Likewise, an AI could produce a non-fiction book on any given subject, once again drawing the content from published texts.
So the actual legal question boils down to whether or not an AI-generated book comprising original text, but which was constructed from data derived from an earlier book, contains a “substantial” part of that earlier book. On the basis of the current interpretation of copyright law the answer would be no, simply because the exact sequence of words was different. However, the emergence of the first AI authors has made that seem like an outdated approach.
If the AI-generated book could never exist without the original text, perhaps copyright law should be contorted so the answer becomes yes. It comes down to appreciating that there is value in the data intrinsically embedded in an earlier text, which prior to AI was never even considered.
For now we’ll have to wait until a major AI copyright case reaches trial, and a judge seizes the opportunity to clarify the situation. Until then the publishing industry has to protect itself as best it can, by recognising AI learning as a new stand-alone use of a copyright work, so it can be defined and apportioned in publishing contracts. It would also take a very brave publisher to be the first to actually publish a provocative AI-generated book which might elicit an angry legal response. It will happen, and probably very soon, and my advice to any publisher would be to let someone else stick their neck out first.