CAN YOU REALLY DETERMINE if a novel will be a bestseller with algorithms? You know, those mysterious formulas that Google uses to decide which ads pop up and Facebook employs to force-feed you ad links that you can’t resist clicking.
If you want to get your hands on whoever came up with such black magic, it’s too late. The term is derived from the name of a Persian mathematician, Muhammad ibn Musa al-Khwarizmi, and he died around the year 850.
The Dream of a Bestseller
Having a bestseller is not what consumes the minds of most authors like me. We just dream of someone besides our wife or mother saying they read our story and liked it.
“I have a great book. It’s called Stantasyland. Except I don’t have the money to buy a million copies to put it on the bestsellers list.” ~ Stanley Victor Paskavich
More resourceful, independent authors like Kerry Nietz come up with a cool title like Amish Vampires in Space and somehow see it show up on The Tonight Show, with Jimmy Fallon making jokes about the title and cover art. But even though it is well written, AVS may or may not ever become a bestseller. I’m pulling for you, Kerry.
Enter the Algorithms?
Don’t worry, I haven’t forgotten about the algorithm thing. A group of computer scientists from Stony Brook University in New York has created a technique (Statistical Stylometry) that employs its algorithms to examine the use of words and grammar.
“Predicting the success of literary works poses a massive dilemma for publishers and aspiring writers alike.” ~ Assistant Professor Yejin Choi, Association for Computational Linguistics.
The Stony Brook computer wizards claim that their Statistical Stylometry can predict if a book will be a commercially successful bestseller with 84 percent accuracy. How? Their algorithm examines novels for:
Factors found in successful books:
- Heavy use of conjunctions
- Large numbers of nouns and adjectives
- Verbs that describe thought processes, e.g. “recognized” or “remembered”.
Factors found in less successful books:
- Too many verbs and adverbs
- Explicit description of actions and emotions, e.g. “wanted”, “took” or “promised”
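To make those factors a little less mysterious, here is a rough Python sketch of the kind of word counting involved. It is not the Stony Brook method, which is not spelled out here. The word lists and the style_profile function are my own hypothetical stand-ins, built from the example words above plus a handful of common conjunctions. A real system would need a proper part-of-speech tagger to count nouns, adjectives and adverbs.

```python
import re
from collections import Counter

# Hypothetical, hand-picked word lists for illustration only.
# "for" and "so" are not always conjunctions; a real tagger would check context.
CONJUNCTIONS = {"and", "but", "or", "nor", "for", "yet", "so"}
THOUGHT_VERBS = {"recognized", "remembered"}
ACTION_VERBS = {"wanted", "took", "promised"}


def style_profile(text):
    """Return the share of all words that fall in each illustrative category."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {"conjunctions": 0.0, "thought_verbs": 0.0, "action_verbs": 0.0}

    counts = Counter(words)

    def share(vocab):
        # Fraction of the text made up of words from the given list.
        return sum(counts[w] for w in vocab) / total

    return {
        "conjunctions": share(CONJUNCTIONS),
        "thought_verbs": share(THOUGHT_VERBS),
        "action_verbs": share(ACTION_VERBS),
    }


sample = "She remembered the house and recognized the door, but he wanted to leave."
profile = style_profile(sample)
print(profile)
```

Numbers like these would only be the raw ingredients. Turning them into a prediction means training a classifier on books already labeled as hits or flops, and that is where the data set starts to matter.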
Just a Couple of Problems
An algorithm is only as good as its creators’ assumptions and the data set it uses.
The Stony Brook research is based on the most downloaded free books from Project Gutenberg plus a few from Amazon. Project Gutenberg contains 40,000 titles available as free downloads. Those are either not copyrighted or in the public domain. That means these are books that are at least 70 years old or were not considered sales-worthy enough to bother copyrighting. In case you haven’t noticed, writing styles have changed considerably since World War II.
Then there are those books that are now considered classics but were abject failures when first published. Consider these novels I had as required reading in high school.
Brave New World – Aldous Huxley gave the world this reverse dystopian vision of the future in 1932. Literary critics hated it, the public didn’t want it, and only a few thousand copies sold in its first printing. By the time of his death on November 22, 1963 (does that date sound familiar?), Brave New World had become a classic and was required reading in thousands of schools.
Lord of the Flies – William Golding’s haunting piece of social commentary sold fewer than 3,000 copies and then promptly went out of print. It was years later that Lord of the Flies found a new publisher and a world of readers, and Golding became a Nobel Prize laureate.
The Catcher in the Rye – Now considered the seminal piece of fiction for the Baby Boomer generation, J.D. Salinger’s work was initially lampooned as “too long, disappointing, amateurish, predictable and boring.” The public ignored the critics, and Salinger’s work has sold 65 million copies since then.
The Lord of the Rings Trilogy – Long before it gave birth to a cinema juggernaut, J. R. R. Tolkien’s Lord of the Rings was hardly considered a success. The New York Times dubbed it “death to literature itself.” The New Republic described the book and its characters as “anemic, and lacking in fiber”.
Can anyone predict what book will be a bestseller? The truth is, the whole bestseller thing often has little to do with how well written a novel is. Shelf space at Barnes & Noble or Books-A-Million is auctioned off like prime shelf space in a grocery store, and publishers blast out thousands of free copies to pad the New York Times and USA Today bestseller lists.
This is not sour grapes. I’m sure those mysterious algorithms can ferret out a probable commercial success. But, over the long haul, it’s readers who determine if a story has legs. How many bestsellers have you read that left you reaching for something else? On the other hand, how many great stories will never be on the shelf closest to the bookstore door?