HarperCollins has confirmed a licensing agreement with an undisclosed AI company that will allow limited use of select non-fiction backlist titles for training AI models. The deal includes authors published by HarperCollins UK, which has been actively reaching out to authors and their agents to explain the arrangement.
Authors who opt into the programme will receive $2,500 for each book, as will the publisher. The license is limited to three years, and the AI company has agreed to restrictions on model output and to act on any copyright breaches.
Many publishers have signed AI contracts with large tech companies over the past year, including several academic publishers such as Taylor & Francis, Wiley and Oxford University Press. Concerns have been raised about the lack of an opt-out for their writers, with Cambridge University Press notable for seeking permission from 20,000 of its authors for the use of their content.
Publishers, including trade publishers, make a distinction between the unlicensed use of their copyrights and deals that they might strike with AI companies. Harper’s approach is consistent with the position set out by both the UK Publishers Association and the Society of Authors. The US Authors Guild has also issued a statement approving of the approach. Penguin Random House recently changed the wording on its copyright pages to help protect authors’ intellectual property from being used to train large language models (LLMs) and other artificial intelligence (AI) tools.
Nevertheless, others, including some authors, take a more fundamental position on the unfettered training of AI and its future implications.
The development was first revealed on Bluesky by writer Daniel Kibblesmith, who referenced an email sent by the agency which represented his 2017 picture book, Santa’s Husband, illustrated by A P Quach, requesting his permission to allow the book to train the model. The email reads: “You are receiving this memo because we have been informed by HarperCollins that they would like permission to include your book in an overall deal that they are making with a large tech company to use a broad swath of nonfiction books for the purpose of providing content for the training of an AI language learning model [LLMs].
“You are likely aware, as we all are, that there are controversies surrounding the use of copyrighted material in the training of AI models. Much of the controversy comes from the fact that many companies seem to be doing so without acknowledging or compensating the original creators. And of course, there is concern that these AI models may one day makes us all obsolete.”
The email also said that: “HarperCollins has been required to keep this company’s identity confidential,” as part of the agreement. Kibblesmith described the situation as “abominable” and declined the offer.
The deal includes authors published by HarperCollins UK, but the company declined to comment. A spokesperson for HarperCollins US told The Bookseller: “HarperCollins has reached an agreement with an artificial intelligence technology company to allow limited use of select nonfiction backlist titles for training AI models to improve model quality and performance. While we believe this deal is attractive, we respect the various views of our authors, and they have the choice to opt in to the agreement or to pass on the opportunity.
“HarperCollins has a long history of innovation and experimentation with new business models. Part of our role is to present authors with opportunities for their consideration while simultaneously protecting the underlying value of their works and our shared revenue and royalty streams. This agreement, with its limited scope and clear guardrails around model output that respects author’s rights, does that.”
HarperCollins told The Bookseller last summer that it had not sold any material for AI research: “We have not sold any access for AI research. If we were to reach an agreement to do so, we would provide authors the option of whether or not to participate.”