Confused about what AI means for you as a writer right now? Here are the facts.
As recently reported in The Bookseller, developments in artificial intelligence (AI) prompted the Society of Authors to reach out to AI firms, advocating for fair compensation for authors whose work is being used to train models. This is a critical issue, and like much about AI, it is complex and often misunderstood.
We, as authors, cannot possibly stay on top of every "techie" advancement, but it is worth understanding the basics: what AI really is, how it works, and what it means for our rights.
AI has been around for years and is deployed in countless ways we take for granted. When Amazon recommends products to you, that is AI at work. When you use tools like Microsoft Editor or Grammarly, you’re benefiting from AI.
Such systems are built on sophisticated models (like "neural networks" or "random forests") that help computers perform tasks that would normally require human intelligence, such as recognising patterns and making decisions, in settings ranging from product recommendations to crime-solving.
The buzz today is about generative AI, a genuine step change in capability. These models can create entirely new content (text, music, images) seemingly out of thin air. And that is a worry.
Large Language Models (LLMs), the technology behind tools like ChatGPT, are at the core of this advance, able to generate entire paragraphs, summarise complex topics and even mimic writing styles. What may look like magic is still only maths: auto-predict on steroids, calculating which word is statistically most likely to come next in a sentence, based on the astronomical amount of text the model was trained on.
And it can be wildly wrong. It is important to remember that LLMs are not knowledge engines; they do not "know" facts. They are simply incredibly adept at generating human-like text.
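For the curious, here is a minimal sketch in Python of what "auto-predict on steroids" means: a toy model that counts which word follows which in a scrap of text, then always picks the likeliest next word. The sample text and every name in it are invented purely for illustration; real LLMs use neural networks trained on billions of documents, but the statistical principle is the same.

```python
# A toy illustration of next-word prediction by frequency,
# the way predictive text works. Not how any real LLM is built.
from collections import Counter, defaultdict

training_text = (
    "the cat sat on the mat . the cat slept on the sofa . "
    "the dog sat on the rug ."
)

# Count which word follows which in the training text (a "bigram" model).
follows = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the training text."""
    candidates = follows[word]
    return candidates.most_common(1)[0][0] if candidates else "."

# Generate a sentence one word at a time, always taking the likeliest word.
sentence = ["the"]
for _ in range(5):
    sentence.append(predict_next(sentence[-1]))
print(" ".join(sentence))  # prints: "the cat sat on the cat"
```

Notice that even this toy produces fluent-looking nonsense ("the cat sat on the cat"): confident, grammatical output is not the same thing as knowledge.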
Tools like Google’s Gemini and Microsoft’s Copilot are now blending AI with search engines to provide more accurate information, and can even cite their sources. But for now, when using these tools for research, it is best to ask where their confident assertions came from and to verify them independently.
The process of training LLMs involves feeding them enormous amounts of text. This is where the controversy begins. Many lawsuits have emerged from artists and intellectual property owners who object to their work being used without consent to train models. While this might seem like a straightforward copyright issue, the reality is more nuanced.
Under current copyright laws, there is a potentially valid argument that LLM developers are making "fair use" of copyrighted materials because the models do not reproduce text verbatim; they are merely learning, just as we do when we read our genre’s blurbs on Amazon to improve our own pitch. However, the bigger issue is not whether what they create breaches copyright, but whether the training texts were obtained legally.
Modern books are not generally available for free on the web, nor are they in the public domain. Yet, somehow, AI models have accessed them. This has led to lawsuits alleging that some models were trained on pirated copies of books. In response, AI companies are starting to strike licensing deals with publishers, such as Condé Nast, to put their training data on a legitimate footing.
Another concern for authors is what happens when they use generative AI themselves. If you are using a free version of a model like ChatGPT, Claude or Gemini, your input may be used to further train the model (the same is true for Grammarly, and so on). This does not mean the tool will regurgitate your exact sentences for someone else, but it does mean that your input contributes to its learning process, which some people object to on principle.
However, paid and enterprise versions typically operate as a closed environment: what you input generally isn’t used to train the model. That is why corporations handling sensitive or proprietary information tend to subscribe.
A final concern is what happens when agents or editors use AI models to process manuscripts. If they have a paid subscription, using AI to analyse a manuscript is not fundamentally different from sharing it with a human colleague: the knowledge generated by the AI stays within the agency, much like human memory.
However, if an agency or publisher uses a free AI model, this raises significant ethical concerns, because the manuscript may be absorbed into the model’s training data; in effect, the author’s copyrighted work would be distributed without consent.
As the recent HarperCollins deal with an AI audio company shows, publishers are eager to embrace innovative technologies. But with AI developing so rapidly, it is crucial that they and all trade associations—for authors, agents and publishers—remain vigilant, and develop ethical policies and guardrails to protect the industry’s most valuable asset: the writers.