ChatGPT and other AI tools have exploded in popularity and are increasingly sophisticated—but can they ever fully replicate the human touch?
I avoided the temptation to ask ChatGPT to write this piece, but I couldn’t resist asking it to do some creative writing to see for myself what sort of literary chops it has. The prompt “write a short story featuring a small goat and a complicated marriage” produced a genuinely amusing tale, with dialogue and characterisation. I was shocked by its fluency and coherence.
What are the implications of this technology for the publishing business? Will the art of researching and writing books be assisted or replaced by clever code? Will literary translators be made redundant? Can I outsource writing book pitches and finding good comp titles? Will editors and scouts begin to rely on AI reports on manuscripts under consideration?
With one major exception, I am leaning towards “no” in answer to all these questions because the natural language models currently being developed are trained on pre-existing material (including, regressively, their own output and that of other models). While they are very good at producing plausible-looking content, they are incapable of genuine invention. The term “Artificial Intelligence” used to describe a natural language model is misleading because it encourages us to think of the system as having some sort of self-awareness. It does not. That is a feature of AGI (Artificial General Intelligence), which remains a distant goal, well beyond the current reach of tech companies. The outputs of ChatGPT and Bard are the blended product of vast data sets and extremely clever modelling of the way that language works—but these systems have no ability to test whether their output is factually correct or meaningful, let alone beautiful. There are many ways in which these tools could be useful to a writer or a publisher, including summarising text and structuring arguments. But for now, at least, they are only tools, with significant flaws and limitations. They are not engines of invention.
There are intriguing questions relating to copyright law and the ownership of the texts that natural language models produce. I was relieved to find that the Terms of Use for OpenAI, which created ChatGPT and GPT-4, assign ownership of the copyright in their models’ output to the user. This seems fair. If the user enters an original text as a prompt (in which they own the copyright as a matter of course), then the response from the chatbot should be their property, too.
The legal situation becomes more complex if a user copy-pastes all, or part, of a copyright work into the prompt. When I tried this with the first stanza of “The Waste Land”, the bot correctly identified and attributed the poem, and gave a coherent commentary on the text. But OpenAI’s systems can’t recognise every copyright work (not least those published after 2021, which is the horizon for ChatGPT’s dataset “memory” of the world). The potential for parasitic, derivative or even infringing “new” texts being created via prompts featuring in-copyright material is real. ChatGPT happily produced a 400-word poem when I prompted it to write one “featuring substantial elements of ‘The Waste Land’ by T S Eliot”. The result dutifully included gobbets of the original, regurgitated in doggerel worthy of William McGonagall: “The sun beats down on this barren land/Where dead trees offer no shade/and the dry stones offer no helping hand/The Waste Land has us all enslaved.”
There are murkier issues here. What use does OpenAI make of in-copyright books in order to train its natural language models? Has it acquired a licence to use any? I have not been able to find definitive answers to these questions, though I can guess what they are. After substantial lobbying from the tech industry, the UK government recently proposed an exception to copyright law that would allow unlimited, free data mining of copyright material, including for commercial purposes. Happily, the creative industries pushed back, and it seems a more balanced policy approach will be taken, one that recognises the importance of licensing in data mining activities and the rights of creators.
Translation software, on the other hand, poses a more significant threat to translators’ livelihoods in the nearer term than chatbots do to authors’ and publishers’ incomes. I can foresee a dismal future in which the act of book translation is free, instantaneous and trivially easy for a reader to deploy. In some ways it’s here already, although I don’t think many people would tolerate the experience of reading a whole novel that has been run through DeepL or Google Translate.
In terms of the cognitive skills required, literary translation is as challenging a task for a software system as writing the original text, and only AGI will be able to do it as well as a human. But we can’t be sure to what extent readers will prize the human touch. Recently I heard a rumour that one global publishing group is already using translation software to help it publish books for which it has acquired the rights in other languages. This prompted me to update our agency’s boilerplate translation licence to require the publisher to appoint a suitably qualified human translator to carry out the work. I wonder how long it will be before a publisher tells me that this requirement is unacceptable.
Sam Edenborough is rights director at Greyhound Literary