Random Thursday: More data-crunching in the meaning (or lack thereof) of book rankings

Thursday, May 5, 2016 - 08:00

Today is even more random than usual, as any possibility of applying my brain to a new creative topic is toast. It looks like I'm finally circling down to being able to close the investigation that I've been working on for the last two months. And I set up my new laptop last night and have yet to do a complete check that the programs and transferred data all work, so I don't dare do anything on either the old or new machine that involves changing files yet. But fortunately I can do some more nattering on the question of what sorts of useful meaning can be squeezed from Amazon and Goodreads book ranking data.

I've had a delightful amount of engagement on this topic, and somewhat unusually it's come from a lot of different intersections of my online life. People are popping over from the LesFic mailing lists to comment, and Tuesday's blog got a mention on File 770, which resulted in pointers to some other similar data-crunching that ties in to the topic.

[Note: these two posts were originally posted on LiveJournal then migrated to Dreamwidth. Comments on the May 3, 2016 post can be found in DW here, and those on the DW version of this post can be found here.]

Correlation of Rating Volume and Sales

A commenter on File 770 pointed me to a fascinating analysis by SFF author Mark Lawrence that looks at the correlation between the number (but not magnitude) of Goodreads ratings and book sales. He used his own sales data plus data provided from colleagues writing in similar genres who were willing to provide numbers. (Note that book contracts sometimes explicitly prohibit authors from discussing income or sales numbers in public.) Lawrence's conclusions are stated as:

[T]here is a pretty close relationship between the number of Goodreads ratings a book has and its total sales. PROVIDING that the books are of a similar age and the same genre.

(A rough squint at his graph suggests that the correlation for his specific dataset works out to around $7-8 per Goodreads rating.)

Correlation of Goodreads Rating Volume and Average Rating

(This originally appeared in slightly different form as a comment I made at File 770.)

Having poked around a bit on Goodreads, I’ve pulled a data set from a list entitled “Best Historical Fantasy” (the genre chosen to be thematically comparable to one of the sets I pulled from Amazon) and I took the top 100 titles from that list. Goodreads lists (in contrast to the sales-based Amazon Top 100 lists) are created by reader/reviewers and books are added and up-voted by other reader/reviewers. Internal ranking in the list is dependent on some combination of the average rating and the number of different people recommending the book for the list.

Overall, the results confirmed my analysis from Amazon with a couple of interesting details.

The first interesting detail is that the Goodreads correlation (at least for this dataset) remains roughly linear throughout the ranking scale, rather than losing linearity toward the bottom of the scale. My knee-jerk hypothesis is that this relates to the ability to leave a ranking on Goodreads without leaving a review. Amazon's requirement of a text review may suppress the number of lower rankings there, because someone who doesn't care for a book may be less motivated to say so if they have to do it in a text review rather than a one-click star ranking.

The second interesting detail is something that most people who pay attention already know anecdotally, which is that Goodreads rankings have a wider spread. This may be simply a consequence of the "suppression of lower rankings" effect on Amazon noted above. (To be clear: I'm not saying that Amazon actively suppresses lower rankings, only that the structure of the interface may have that as an effect.)

The lowest ranking in the Amazon Historical Fantasy Top 100 was 4.0, while Goodreads rankings (though obviously not for the same exact books) went down to 3.1. In addition to the suppression effect mentioned above, the suggested interpretations for Goodreads ratings are also shifted down from Amazon ratings. If you go by the suggested interpretations, an Amazon 4* is a Goodreads 3*. Amazon rating distributions are more likely to descend from a maximum at 5 stars, while Goodreads distributions--even for popular books--are more likely to have a maximum at 4 stars, indicating an overall shift of the curve. More on this below.

The third interesting observation is more particular. As I was entering the rating/# of reviews data, I started noticing a particular name standing out as an outlier. Not an outlier in terms of matching the average-rating/no.-of-ratings trend, but a set of books, all by the same author, that fell high in the list (based on the number of people voting them for the list) while having surprisingly few ratings. So although I wasn’t tracking author or title in general, I flagged every time that author came up in the list: a total of 11 times (out of 100 books). When the books were sorted in descending order of average rating, that author occupied 7 of the top 10 slots (average ratings of 4.7 or more) and completely followed the overall correlation of having relatively few reviews. (In a dataset where a handful of titles had rating entries in the 6 figures, all but one of those 7 had been rated fewer than 100 times.) Having taken a look at the author’s bibliography on Wikipedia (and I’m not going to name names because that’s not the point), the best hypothesis seems to be that this is someone with a small but extremely dedicated readership. That dedication extended to ensuring the author’s position on the list by voting for the books there, but it also reflects the general conclusion that the average rating must be understood in the context of the overall number of ratings.

(A commenter on File 770 suggested an alternative interpretation to "small group of enthusiastic fans", noting that some Goodreads authors engage in list-vote trading to move each other's books higher in lists. I have no speculations on this particular author's situation.)

Correlation Between Average Amazon Rating and Average Goodreads Rating for Popular Books

While it's an easy anecdotal observation that average Amazon ratings tend to be higher than average Goodreads ratings, I wanted to confirm this impression with data. So I started with Amazon's Top 100 Historical Fantasy books, looking only at Kindle sales this time (for consistency), and stripping out omnibus editions (to avoid redundancy). I used Amazon data as the starting point because I figured sales figures were harder to game than Goodreads lists.

Because I was going to have to search on each book individually in Goodreads, I only took the top 40 from the Amazon list, which trimmed down to 33 when omnibus editions and not-yet-released titles were excluded. I then calculated the Amazon:Goodreads ratio, as well as the absolute difference. I played around with these results in several ways, plotting them as a curve distribution and running a mean and standard deviation.

At a very rough approximation, the absolute difference in average ratings (Amazon minus Goodreads) seems to follow a normal curve: average = 0.4 stars, standard deviation = 0.2 stars. In fact, roughly 90% of the data fell within one standard deviation of the mean.

The ratio of average ratings (Amazon divided by Goodreads) isn't quite as pretty a distribution, but has an average ratio of 1.09 with a standard deviation of 0.06. Here only 80% of the data fell within one StDev.
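For anyone who wants to repeat this comparison on their own list, here is a minimal sketch of the arithmetic, assuming you have collected the Amazon and Goodreads averages for each book by hand. The rating pairs are made-up placeholders (not my 33 books), and the `summarize` helper is just a name I've invented for illustration.

```python
from statistics import mean, stdev

# Hypothetical (Amazon average, Goodreads average) pairs.
# Placeholder values only; these are NOT the books from the post.
ratings = [(4.5, 4.1), (4.3, 3.9), (4.6, 4.3), (4.2, 3.7), (4.4, 4.0)]


def summarize(values):
    """Return (mean, sample standard deviation, share within one SD of the mean)."""
    m = mean(values)
    sd = stdev(values)
    within = sum(1 for v in values if abs(v - m) <= sd) / len(values)
    return m, sd, within


# Absolute difference and ratio for each book.
differences = [amazon - goodreads for amazon, goodreads in ratings]
ratios = [amazon / goodreads for amazon, goodreads in ratings]

diff_mean, diff_sd, diff_within = summarize(differences)
ratio_mean, ratio_sd, ratio_within = summarize(ratios)

print(f"difference: mean={diff_mean:.2f}, sd={diff_sd:.2f}, within 1 SD={diff_within:.0%}")
print(f"ratio:      mean={ratio_mean:.2f}, sd={ratio_sd:.2f}, within 1 SD={ratio_within:.0%}")
```

For comparison, a true normal distribution puts about 68% of its values within one standard deviation of the mean, so a share well above that suggests data bunched more tightly than a bell curve (or a standard deviation that was rounded up).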

I leave it to a better statistician than I to say something meaningful. I expect that differences in average rating may be much more variable for books with smaller distributions, due to the larger effects of individual choice (both in what rating to give and in which rating site to participate in).

Only one book in the set had a higher average ranking in Goodreads than in Amazon. It's a pre-release listing and has relatively few reviews, which may account for its position as an outlier. Its position on the Amazon Top 100 list seems to be due entirely to the fact that it's a 47 North publication (i.e., an Amazon imprint) and "sales" are probably artificially inflated by internal promotion activities, such as making it available to "book club" arrangements (based on comments in the reviews). Some of the text reviews on Amazon are...um...harsh.

Some Random Overall Conclusions

Mark Lawrence’s sales correlation is very interesting. And it doesn’t necessarily mean that the average rating number by itself isn’t meaningful. I suspect that comparing the combined average rating + number of ratings data to the overall trend line for the relevant genre will tell you more than the absolute average rating alone, in terms of whether the reading community likes or doesn’t like the book. But all of that may be irrelevant to sales.
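A rough sketch of what "comparing to the trend line" could look like in practice: fit a straight line to average rating against the number of ratings for a genre sample, then see how far a given book sits above or below it. Putting the number of ratings on a log scale is my assumption here, as are the placeholder data points and the `fit_line` and `residual` names; none of this is the actual analysis behind the numbers above.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical genre sample: (number of ratings, average rating).
# Placeholder values chosen only to illustrate the method.
genre = [(10, 4.8), (100, 4.5), (1_000, 4.2), (10_000, 3.9), (100_000, 3.6)]

# Assumption: the trend is fitted against log10 of the rating count.
log_counts = [math.log10(count) for count, _ in genre]
averages = [avg for _, avg in genre]
slope, intercept = fit_line(log_counts, averages)


def residual(count, avg):
    """How far a book's average sits above (+) or below (-) the genre trend."""
    return avg - (slope * math.log10(count) + intercept)


print(f"trend: average = {slope:.2f} * log10(ratings) + {intercept:.2f}")
print(f"book with 50 ratings averaging 4.4: residual {residual(50, 4.4):+.2f}")
```

A positive residual would mean the book is rated better than its number of ratings predicts for that genre sample, and a negative one the reverse.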

Just for fun, what does all this say about my books?

Daughter of Mystery has an average Amazon rating 0.5 stars higher than its Goodreads average. Plotting the cumulative average Amazon rating against total ratings over time, I'd say the average rating more or less settled into a relatively stable number around the 15-20 review mark, but is still highly subject to single inputs. The first 5 reviews were all 5-star.
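The cumulative average described above is simple to compute if you note down the star ratings in the order they were posted. A minimal sketch, assuming such a list: only the opening run of five 5-star reviews comes from this post, the later values are placeholders, and `running_average` is a hypothetical helper name.

```python
from itertools import accumulate

# Star ratings in posting order. The first five 5-star reviews match the post;
# everything after that is a made-up placeholder.
stars = [5, 5, 5, 5, 5, 4, 3, 5, 4, 4]


def running_average(values):
    """Cumulative mean after each successive rating."""
    return [total / n for n, total in enumerate(accumulate(values), start=1)]


for n, avg in enumerate(running_average(stars), start=1):
    print(f"after {n:2d} ratings: {avg:.2f}")
```

This also shows why a small total is "highly subject to single inputs": with n existing ratings, one new rating r moves the average by (r - current average) / (n + 1), so the same review shifts things far less at 200 ratings than at 20.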

The Mystic Marriage has an average Amazon rating 0.7 stars higher than its Goodreads average. The average Amazon rating doesn't seem to have stabilized yet and the first 6 reviews were all 5-star. This book is still at the stage where the ratings are artificially high due to low numbers of readers. (Excuse me while I go sob in a corner for a while.)

Both books fall pretty solidly on the normal trendline for Lesbian Romance, but don't have enough data points to compare meaningfully to the Top 100 Historical Fantasy data. As for sales data...let's just say, "Inadequate data for meaningful analysis." (The next related phenomenon I examine may be changes in rating patterns for series books. My knee-jerk hypothesis is that later books in a series will tend to have increasingly higher average ratings because readers will typically continue with the series only if they liked it.)

It's very important to keep in mind that these comparisons and correlations are only meaningful when looking at sets of books with similar potential distribution. A lesbian romance with 200 Amazon ratings isn't comparable to a blockbuster best-seller with 200 Amazon ratings. The latter is only starting to scratch the surface of its most dedicated readership while the former may well have already saturated the market.

In conclusion, much as it pains me to admit it, I do myself no favors in begging my readers and fans to leave reviews if they weren't already inclined to do so. Strongarming existing readers into leaving ratings/reviews does not necessarily generate new readers. It certainly doesn't directly generate additional income. And to the extent that looking at the average rating + # of ratings provides useful data, artificially inflating one's average rating by solicitation to existing fans isn't meaningful.

Instead, the only useful thing to do is beg people to encourage other people to read my books. And the rating stats will fall out of that on their own.

Major category:

Thinking

Tags:

reviews