Normalizing Ratings

16 Symmetry 8 5/2/2025, 12:39:45 AM hopefullyintersting.blogspot.com ↗

Comments (8)

nlh · 45m ago
Similarly - one of my biggest complaints about almost every rating system in production is how just absolutely lazy they are. And by that, I mean everyone seems to think "the object's collective rating is an average of all the individual ratings" is good enough. It's not.

Take any given Yelp / Google / Amazon page and you'll see some distribution like this:

User 1: "5 stars. Everything was great!"

User 2: "5 stars. I'd go here again!"

User 3: "1 star. The food was delicious but the waiter was so rude!!!one11!! They forgot it was my cousin's sister's mother's birthday and they didn't kiss my hand when I sat down!! I love the food here but they need to fire that one waiter!!"

Yelp: 3.7 stars average rating.

One thing I always liked about FourSquare was that they did NOT use this lazy method. Their score was actually intelligent - it checked things like how often someone would return, how much time they spent there, etc. and weighted a review accordingly.

theendisney · 32m ago
With averages: to have 5 stars you need a hundred 5-star ratings for each one-star rating.
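To put a rough number on that, here is a small sketch. It assumes the displayed score is a plain mean that shows as "5.0" once it reaches 4.95 (one decimal) or as "5.00" once it reaches 4.995 (two decimals); real sites may round or weight differently, and the function name is made up for illustration.

```python
from fractions import Fraction

def min_fives_per_one_star(threshold=Fraction(495, 100)):
    """Smallest n such that n five-star ratings plus a single
    one-star rating have a plain mean of at least `threshold`.

    Fractions are used so the comparison is exact (no float rounding).
    """
    n = 0
    # Mean of n fives and one 1 is (5n + 1) / (n + 1).
    while Fraction(5 * n + 1, n + 1) < threshold:
        n += 1
    return n

print(min_fives_per_one_star())                       # 79
print(min_fives_per_one_star(Fraction(4995, 1000)))   # 799
```

So under these assumptions "a hundred" is the right order of magnitude for a one-decimal display, and it gets much worse with two decimals.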

If one normalized the ratings, a business's score could change without the business doing anything. A former customer may start giving good ratings elsewhere, making yours worse, or give poor ones, improving yours.

Maybe the relevance of old ratings should decline.
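One way the "old ratings should decline" idea could look is an exponentially decayed average. This is only a sketch: the half-life of 180 days, the function name, and the (stars, day) input format are all assumptions picked for illustration, not anything a real site is known to use.

```python
def decayed_average(ratings, now, half_life_days=180.0):
    """Weighted mean where a rating's weight halves every
    `half_life_days` days.

    ratings: iterable of (stars, day) pairs, with `day` and `now`
             measured in days on the same clock (hypothetical format).
    Returns None if there are no ratings.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for stars, day in ratings:
        age = now - day
        weight = 0.5 ** (age / half_life_days)  # 1.0 for a brand-new rating
        weighted_sum += weight * stars
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None

# A 1-star from 180 days ago counts half as much as a 5-star from today.
print(decayed_average([(1, 0), (5, 180)], now=180))
```

With a plain mean those two ratings give 3.0; with the decay the recent 5-star pulls the score up to about 3.67.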

kayson · 20m ago
The normalization doesn't have to be "live". You could apply the factor at time of rating and then not change it.

tibbar · 5m ago
One of my favorite algorithms for this is Expectation Maximization [0].

You would start by estimating each driver's rating as the average of their ratings - and then estimate the bias of each rider by comparing the average rating they give to the estimated score of their drivers. Then you repeat the process iteratively until you see both scores (driver rating and rider bias) converge.

[0] https://en.wikipedia.org/wiki/Expectation%E2%80%93maximizati...
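A minimal sketch of the iteration described above. It assumes an additive model (observed stars = driver score + rider bias) and simply alternates averages, so it is closer to alternating least squares than to full EM with a probabilistic model. The function name and the (rider, driver, stars) input format are hypothetical.

```python
from collections import defaultdict

def fit_scores_and_biases(ratings, max_iter=100, tol=1e-6):
    """ratings: list of (rider, driver, stars) tuples.
    Returns (driver_score, rider_bias) dicts."""
    by_driver = defaultdict(list)
    by_rider = defaultdict(list)
    for rider, driver, stars in ratings:
        by_driver[driver].append((rider, stars))
        by_rider[rider].append((driver, stars))

    # Step 0: each driver's score starts as the plain mean of their ratings.
    score = {d: sum(s for _, s in rs) / len(rs) for d, rs in by_driver.items()}
    bias = {r: 0.0 for r in by_rider}

    for _ in range(max_iter):
        # Rider bias: how far above/below the current driver scores they rate.
        new_bias = {r: sum(s - score[d] for d, s in ds) / len(ds)
                    for r, ds in by_rider.items()}
        # Driver score: mean of ratings after removing each rider's bias.
        new_score = {d: sum(s - new_bias[r] for r, s in rs) / len(rs)
                     for d, rs in by_driver.items()}
        delta = max(max(abs(new_score[d] - score[d]) for d in score),
                    max(abs(new_bias[r] - bias[r]) for r in bias))
        score, bias = new_score, new_bias
        if delta < tol:  # both sets of estimates have stopped moving
            break
    return score, bias

# "g" rates generously, "h" harshly; driver A was only rated by "h".
data = [("g", "B", 5), ("g", "C", 4), ("h", "A", 3.5), ("h", "C", 2)]
score, bias = fit_scores_and_biases(data)
print(score, bias)
```

In this toy data the raw means put A (3.5) well below B (5.0), but once the harsh rider's bias is accounted for A comes out ahead of B. Note that the scores are only pinned down up to a constant shift traded against the biases, so a real system would probably want to anchor or regularize them.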

Retr0id · 30m ago
> I'm genuinely mystified why its not applied anywhere I can see.

I wonder if companies are afraid of being accused of "cooking the books", especially in contexts where the individual ratings are visible.

If I saw a product with 3x 5-star reviews and 1x 3-star review, I'd be suspicious if the overall rating was still a perfect 5 stars.

xnx · 31m ago
I don't understand why letter grades aren't more popular for rating things in the US.

"A+" "B" "C-" "F", etc. feel a lot more intuitive than how stars are used.

technetist · 1m ago
I think that ultimately you run into the same issue.

In US education you are taught that you need to get an A. Anything below a C gets you on the equivalent of a “Performance Improvement Plan” in the corporate world. And B is… well… B.

So with that rating ingrained, people would probably feel bad about rating their ride-share driver a C when they did what was expected. And it wouldn’t stop companies from pushing for A ratings.

Even elsewhere like the food industry where they do have letter ratings, A is the norm with anything lower being an outlier.

Perhaps for this to work, it would need a complete systemic shift where C truly is the average and A and F are the outliers. In school C would need to be “did the student do the assignment.” And A would need to be “the student did the assignment, and then some.”

NegativeK · 8m ago
We'd still get the same pressure to give an A+ to every interaction unless things were fucked.

I used to rate three stars for what "performs as expected" until I realized that it's punishing good products. Switching to A-F would result in the same behavior, except it'd be Uber drivers trying to make a living instead of noxious parents declaring that their kid deserves an A.