Examples of scoring dysfunction

@joet - this:


That’s the view you get when you click on the brewer and click to sort by the visible score. That’s bad. Display the raw average if you like - with ticks if you like - but it’s a bad idea to use it for anything. Because it’s not robust. Because it can be rorted (to use an Australianism).

Thanks to both @aww and joet for showing up in my grumpy thread.

A word about variance: it is definitely not the case that a good scoring system should produce low variance. There is no “true” score that the average is “discovering”: some beers are polarizing. Indeed, a good way to find beers that you are likely to enjoy but that are (for you) under-rated is to look for those with a big spread of scores in styles you like (you can still do this using the old stats page for a beer). You also used to be able to jump to the last page or two of scores to see why people didn’t like a beer, but this ability has been removed.
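To make the point about spread concrete, here is a minimal sketch. The two beers and their score lists are invented for illustration, and using the standard deviation as the measure of "spread" is my assumption, not anything the old stats page is known to do. Both beers have the same average, but only one is polarizing:

```python
from statistics import mean, pstdev

def spread_summary(scores):
    """Return (mean, population standard deviation) for one beer's scores."""
    return mean(scores), pstdev(scores)

# Hypothetical score lists, invented purely for illustration.
beers = {
    "consensus beer": [3.4, 3.5, 3.5, 3.6, 3.5],
    "polarizing beer": [1.8, 4.6, 2.0, 4.5, 4.6],
}

# Sort so the beers with the widest spread of scores come first.
ranked = sorted(beers, key=lambda name: spread_summary(beers[name])[1], reverse=True)
for name in ranked:
    avg, sd = spread_summary(beers[name])
    print(f"{name}: mean {avg:.2f}, spread {sd:.2f}")
```

The average alone cannot tell these two apart. The spread can, and that is the information you need when hunting for beers that are under-rated for your own taste.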


So this is not the Portland-based Cascade but some other brewery?


Well, there is significant utility to the measure, which is why Amazon uses it extensively. It is simply normally paired with rate counts as a sort of caveat when applicable. We definitely understand the user need for unweighted means, which is why we’ve developed their presence. This feedback is very valuable though. Maybe it’s important to change the presentation in a way that more closely associates that average with the number of ratings, especially wherever there is no 100-point scale score?

I think the brewery in question is the Australian one…

Yes, Cascade in Hobart, Tasmania. Operating continuously since 1824.

Picturesque place. I did the brewery tour in the early '90s, just after it had been acquired by CUB (which is now in the process of being detached from ABInBev). At the end of the tour, the guide revealed that the brewery had sold the rights to the brands separately to the facility, so in order to allow us to sample Cascade’s products she had to nip across the road to the bottle shop and buy a couple of six packs at retail prices.

CUB has never quite known what to do with Cascade - the brewery is small, Taswegians are highly north-south parochial and freight to the big island is expensive. So when ABInBev merged with SABMiller, it looked like an opportunity. They brewed a local version of Goose Island IPA at Cascade (which I noted in my rerate of Goose IPA shortly before I stopped rerating here).

This is what I meant in my review of Redline IPA when I said “…but this is brewed in Tasmania due to the hydra-headed nature of… geese.” I guess that’s a correction.

Nothing got done about this. Inflated invisible ticks - including by employees of the brewery where the beer was brewed - are allowed to considerably influence the ill-advised version of the score that’s now emphasised.

Another example from today: I’m looking at a retailer’s “newly arrived” page and see they have London Beer Factory / Wild Beer Bubba. Haven’t had anything from that brewery, so I look up the beer here. Its score shows as 2.98 - not promising for a DIPA that Purvis wants to sell me at A$12.99. But I am now aware that RateBeer’s scores are a mess, so I look further.

There are 3 reviews and one additional invisible (to me) tick. All three reviewers have been here a while and have more than 2000 ratings (2270, 9264, 3554). They give scores of 3.5, 3.6 and 3.8 (mean 3.633). The invisible ticker, therefore, has scored this beer 1.0. Did they hate it? Are they one of those people who happily used ticks as a way of marking that they had acquired but not tasted the beer? Are they - this time - an employee of a rival brewer? I dunno, but their tick sufficed to bring the most visible score for this beer on the site from 3.63 to 2.98. That’s the score you see when you look at the brewer’s list of beers:

People submit good information to RateBeer. I understand you want to cater to people who can’t be bothered to do that but I don’t understand why you want to allow them to ruin the information content of the site.
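The Bubba arithmetic above can be checked in a few lines. This is only a sketch using the numbers quoted in the post. The median at the end is my own illustration of a more robust statistic, not something RateBeer is known to compute:

```python
from statistics import mean, median

visible_reviews = [3.5, 3.6, 3.8]   # the three review scores quoted above
displayed_mean = 2.98               # the score shown on the brewer's list
n_total = len(visible_reviews) + 1  # three reviews plus one invisible tick

# Solve displayed_mean = (sum(visible) + tick) / n_total for the tick.
implied_tick = displayed_mean * n_total - sum(visible_reviews)
print(f"implied tick: {implied_tick:.2f}")

# Assume the tick was exactly 1.0 (the rounded value) and compare statistics.
all_scores = visible_reviews + [1.0]
print(f"mean without tick: {mean(visible_reviews):.3f}")
print(f"mean with tick:    {mean(all_scores):.3f}")
print(f"median with tick:  {median(all_scores):.2f}")
```

One tick out of four drags the raw mean from about 3.63 to about 2.98, while the median of the same four scores stays at 3.55. That is what “not robust” means in practice.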

To repeat: it’s not possible to distinguish between failed development and vandalism.


The website is in such a state that most products don’t have enough full reviews to have a representative weighted average anymore (so basically, all new products suck for a year or so here unless the brewer is very popular and their products manage to get around 20 ratings… if they are lucky)… so they decided to let all ratings count (ticks and reviews) for the displayed real average score…

Eck, I could accept that (even if I think that most/all perfect 1 or perfect 5 ticks are errors or cellars) but right now, all ratings are counting for the real average score and they are not even following the basic QUALITY OF OUR SCORES ( https://www.ratebeer.com/our-scores ) guidelines…
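For anyone wondering why a weighted average makes new products look poor until they collect around 20 ratings, here is a generic sketch of a shrinkage-style weighted mean. To be clear, this is NOT RateBeer’s actual formula: the function name `shrunk_mean`, the prior mean of 3.0 and the prior weight of 10 are all assumptions I made up to show the general behaviour.

```python
def shrunk_mean(scores, prior_mean=3.0, prior_weight=10):
    """Pull the raw mean toward prior_mean; the pull fades as ratings accumulate.

    prior_mean and prior_weight are hypothetical values chosen for illustration.
    """
    n = len(scores)
    return (prior_weight * prior_mean + sum(scores)) / (prior_weight + n)

# The same beer, always rated 4.0, with a growing number of ratings.
for n in (3, 20, 200):
    scores = [4.0] * n
    print(f"{n:>3} ratings: raw mean 4.00, weighted {shrunk_mean(scores):.2f}")
```

With only a handful of ratings the weighted figure sits close to the prior, however good the beer is. Showing the raw mean instead avoids that problem, but it removes the protection against a single stray tick.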

@services @joet


Just as an FYI, I can see ticks and this is somebody who has 3826 1* ticks. Clearly not vandalism, but somebody who doesn’t realise their ticks impact any scores and uses them for something other than rating.


Indeed. Also someone who cannot be blamed at all.
- users doing that were basically forced to do that because alternatives were removed and other alternative suggestions ignored / marked as not being a priority
- users doing that didn’t expect ticks would count
- allowing ticks to count was done without due consideration of consequences and despite multiple warnings


I suppose I’ve learned how to ‘read’ RB scores so I don’t take much notice. Bought a beer today with a score of 20 (and a style score of 5) because those scores don’t really mean all that much, do they? Unfortunately a lot of good beers will be made to look unattractive to those unable to decipher the broken system.

As the number of available beers continues to increase, and our membership dwindles, there will be ever fewer ratings per beer. The UK is reasonably well-represented on RateBeer but even so there are loads of beers with around 5-10 rates, even from well-loved breweries.

For a while now I have been checking the beer score on UT and rating it here… scores here are based on so few users that they either mean nothing or are at great risk of having an anomalous rating that scraps the score…


If I am really stuck I’ll usually look at friends’ rates; even if there are only one or two, it gives me an idea of what I might be in for.
