Protecting Literature from Encroaching AI Influence by Dr. Timothy Smith
- 2 days ago

Recently, in a very public move, Hachette Book Group canceled the planned release of the horror novel Shy Girl by Mia Ballard and, at the same time, discontinued the novel's UK edition. The cancellation followed a growing public outcry over claims that artificial intelligence had generated the novel. According to the publisher, “Hachette remains committed to original creative expression and storytelling.” (nytimes.com) Ms. Ballard denies that she used AI to write the book and says that the editor she hired to work on it did. The dropping of Shy Girl marks a significant moment for publishing, as no other publisher is known to have dropped a title over the accusation that AI generated its content. Hachette claims that its thorough analysis of the text determined that 70% of the book came from a machine.
In recent years, following the global release of large language models (LLMs), the massive injection of machine-written content into the internet, from blogs to news sites, has made it hard for teachers, publishers, and readers to tell human writing from machine-generated writing. Because LLMs work by predicting which words are likely to come next, given the words that came before, their output often becomes repetitive and predictable. Text forensics aimed at distinguishing machine-generated from human-generated text examines word choice using a measure called “perplexity.” Perplexity measures how surprised a language model is by each word, given the words that precede it. For example, a typical word combination such as “the rain fell steadily from the grey sky” would receive a low perplexity score because of its common use. However, a phrase such as “the rain fell intermittently from the blue sky” would get a much higher perplexity score due to its unusual phrasing. Machine-generated writing tends to have a low perplexity score, which can make the text a bit dull to read. Additionally, text forensics looks for a quality called “burstiness.” Burstiness refers to how much a text mixes short and long sentences. Human writing tends to have more burstiness, while machine writing tends to have sentences of similar length with less variation. A commercial tool for AI text detection called GPTZero uses perplexity, burstiness, and their interplay to detect AI-generated text. (gptzero.me) Other tools look for typos, which AI seldom makes, as well as punctuation frequency, in which human writing shows higher variability.
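To make the two measures concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how GPTZero or any other commercial detector works: real tools score text with a large neural language model, while this toy version substitutes a simple word-pair (bigram) model with add-one smoothing trained on a few sentences. Burstiness is approximated here as the spread of sentence lengths divided by their average. The function names and the tiny reference corpus are hypothetical and chosen only for the example.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_bigram_model(corpus):
    """Count words and adjacent word pairs in a reference corpus (toy model)."""
    tokens = tokenize(corpus)
    return {
        "vocab": set(tokens),
        "context": Counter(tokens[:-1]),  # how often each word is followed by another
        "bigram": Counter(zip(tokens, tokens[1:])),
    }


def perplexity(text, model):
    """Perplexity of `text` under the add-one-smoothed bigram model.

    Lower values mean the word sequence is less surprising to the model.
    """
    tokens = tokenize(text)
    if len(tokens) < 2:
        raise ValueError("need at least two words to score")
    vocab_size = len(model["vocab"]) + 1  # +1 reserves probability for unseen words
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        count = model["bigram"][(prev, word)]
        p = (count + 1) / (model["context"][prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))


def burstiness(text):
    """Variation in sentence length: standard deviation divided by the mean."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(tokenize(s)) for s in sentences]
    if len(lengths) < 2 or mean(lengths) == 0:
        return 0.0
    return pstdev(lengths) / mean(lengths)


# Hypothetical reference corpus, invented for this example.
reference = (
    "The rain fell steadily from the grey sky. "
    "The rain fell steadily on the grey hills. "
    "The sky was grey and the rain fell."
)
model = train_bigram_model(reference)

print(perplexity("the rain fell steadily from the grey sky", model))
print(perplexity("the rain fell intermittently from the blue sky", model))
print(burstiness("Stop. The rain fell for hours over the empty harbour. Nobody came."))
```

Under this toy model, the familiar phrase scores a lower perplexity than the unusual one because every one of its word pairs already appears in the reference text. Likewise, a passage whose sentences all have the same length gets a burstiness of zero, while a passage that mixes one-word and ten-word sentences scores higher.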
Given the currently detectable differences between machine-generated and human-generated writing, a larger question emerges about the human experience of reading and encountering written stories. Does human-written text contain more subtle patterns over larger stretches of text than a machine can currently create? LLMs continue to improve, processing larger and larger windows of text, which allows the models to capture longer, more subtle patterns. The appeal of the perplexity and burstiness of human writing may spring from the same impulse that drove the Arts and Crafts movement, born in Great Britain in the second half of the 1800s in response to the Industrial Revolution. The Industrial Revolution began around the mid-1700s with the introduction of machine-powered factories, which represented a shift away from hand tools. Critics of the Industrial Revolution felt that machine-made products were soulless and degrading. The writer and painter William Blake, an intense early critic of the Industrial Revolution, described the factories in England as “dark Satanic Mills” in the poem popularly known as Jerusalem. The critics felt that factory workers would lose a sense of pride in their work as machines took over what they used to do with their hands. Moreover, they valued the small flaws and variations in handmade objects, which give them a unique quality that uniform machine-made objects lack.
Perhaps the same appreciation for the subtleties, nuances, and patterns that the Arts and Crafts movement celebrated in handmade objects now animates the reactions readers have expressed to human writing in contrast to machine writing. Admittedly, most writing does not reach the level of artistic expression needed to qualify as literature. Yet even everyday writing in emails, reports, and blog posts likely communicates differently from machine-generated writing. Hachette pulled Shy Girl to preserve the artistic integrity of its publishing house and to stand clearly behind human writing. The writer and her expression remain prized, perhaps in part because human expression through words contains patterns that go beyond the mere likelihood that certain words will appear near each other in a sentence.

Dr. Smith’s career in scientific and information research spans the areas of bioinformatics, artificial intelligence, toxicology, and chemistry. He has published a number of peer-reviewed scientific papers. He has spent the past seventeen years developing advanced analytics, machine learning, and knowledge management tools to enable research and support high-level decision making. Tim earned his Ph.D. in toxicology at Cornell University and a Bachelor of Science in chemistry at the University of Washington.
His book is available on Amazon in paperback and Kindle formats.

