Exactly five years have passed since we launched Wine Review Online, making this seem like a good time for a bit of stock-taking. Of the various things we’ve done while developing the site, the one that has drawn the most flak has been the adoption of the 100-point scale for evaluating wines. Scoring wines on this scale was new to me five years ago, and it was new to almost all of us who write for WRO. I took this step with some apprehension and found the transition quite uncomfortable, but after employing the 100-point scale continuously for years, I’ve concluded that most objections to it are rooted in prejudice, misconception or flawed reasoning.
When I left The Washington Post five years ago to join forces with Robert Whitley in building the world’s leading web-only wine review site, both of us were clearly aware that we couldn’t accomplish our objective without using the 100-point scale. It took months for us to develop the entire concept for the site, but only about 30 seconds to adopt a scoring system.
In my moments of exhausted reflection during the building process, however, I wondered if the 100-point scale was an evil–albeit a necessary evil.
The possible evils of the scale were questionable, but the necessity was clear. Given the norms of high-level wine criticism in 2005, as well as the expectations of consumers and the practices of the wine trade, the scale was indispensable if we wanted our reviews to be widely read and used promotionally by the trade. Words alone would not satisfy enough consumers or trade members to get our name widely known without spending a gazillion dollars on advertising–regardless of how thoughtful or evocative our words might be.
So, as a business proposition, the only real question was, “What sort of scoring system should we adopt: 100 points, or 5 stars, or 3 puffs, or the 20-point system first used by U.C. Davis?”
Looking back, I’m quite certain that we made the right choice, and not merely because of commercial considerations regarding WRO’s advancement. I’ve come to see that most of the oft-heard objections to the 100-point scale are overblown or mistaken, that its advantages vastly outweigh any disadvantages, and that no superior system has yet been developed.
Nevertheless, the 100-point scale remains quite controversial, so an explicit defense is in order. To give critics of the scale their due, I’ll make my case in response to some of the most commonly voiced criticisms:
The 100-point scale imperfectly conveys the reviewer’s experience of the wine to the reader.
True enough, but considering the 100-point scale rationally and in context starts from the recognition that any means of communicating a reviewer’s perception and evaluation of a wine will always be imperfect. After all, the task at hand involves transforming a subjective sensory experience of a complex object from the taster’s palate into words or numbers or symbols, and then getting the reader to form an impression of the thing being described or scored that is closely akin to the reviewer’s palate experience.
If you ask a linguist or a psychologist whether it is possible to conduct such a process without distortion, you’ll soon see a pair of rolling eyeballs.
Regardless of how the results of wine evaluation are conveyed, there will always be some distortion and misunderstanding, some “noise” in the communication circuit. The only meaningful question here is whether a particular mode of conveying one’s sense of a wine conveys more or less information than others, and with more or less distortion.
If considered apart from a written review, points wouldn’t measure up very well by that standard. However, at Wine Review Online points aren’t used in isolation, but are limited to use as a summary measure of the reviewer’s overall evaluation of a wine. And in that limited usage, points on a 100-point scale transmit more information more precisely than puffs or points on a truncated 5-point or 20-point scale, as I’ll argue below.
Words alone should be sufficient to convey a reviewer’s sense of a wine, without reducing a complex beverage to a number.
By disposition as well as training, I’m a word guy rather than a number guy, but I’m certain that I can help a reader far more by using both to review a wine than by relying on either alone.
It is true that numbers lack the warmth and nuance that words can convey. But it is also true that words often lack precision. Words don’t stand in a clear and uniform relation to one another in the way that numbers do, and there is a much greater risk of miscommunication between minds when using words than numbers.
Besides, arguing for words as opposed to numbers in wine reviewing essentially amounts to arguing for a false choice, at least where Wine Review Online is concerned, since we virtually always offer a written description along with any score that we print. Our reviewers generally go beyond explaining what a wine tastes like to assess its overall character, address what makes it distinctive (geographically or technically or stylistically), and often indicate what sorts of foods might best be paired with it.
Scores are not a substitute for any of this information. Rather, they are a summary of the reviewer’s overall evaluation, and since they are numerical, they tell the reader precisely how the reviewer regards the wine in overall terms by comparison to other wines that he or she has reviewed. In this respect they perform a function that words simply cannot fulfill as adequately, and it is one that can be extremely valuable to readers.
Numerical scores on a 100-point scale are shams that falsely suggest an evaluative precision that isn’t possible.
I understand where this objection comes from. It arises from concern over the possible misunderstanding by readers of point scores as a sum of “things” (or units of quality) in a wine, rather than as a measure of the reviewer’s overall admiration for the wine by comparison to other wines.
Do some consumers misinterpret point scores, holding a false idea of their correspondence to an “objective” reality? Yes, of course they do. Does it therefore make sense to do away with point scores? No, it doesn’t. If we threw overboard everything that is misunderstood by certain people, we’d be done with science and literature and art and almost all enterprises involving any sophistication at all.
Since point scores properly understood measure a relationship between a reviewer and a wine, the issue of whether they are falsely precise actually boils down to a question of how capably and consistently the reviewer registers his or her overall evaluations.
A score is only as precise as the reviewer assigning it. There’s no getting around that, and I don’t mind acknowledging it. However, if it is true that the 100-point scale can be a sham in the hands of a reviewer who is careless or lacking in seasoning or talent, it is also true that point scores can be quite precise if the reviewer is careful, seasoned and talented.
These are three distinct virtues, and they’re all crucial in determining the worth of the point score at the end of a review. Sure, it is true that wines can be scored on a whim, just as it is true that people open their mouths and give voice to mere opinions quite frequently. But is it true that, just because many people opine thoughtlessly, nobody ever really reflects on things and seeks genuine understanding before making a statement?
My experience as a wine educator, a wine competition judge, and an editor of other wine writers suggests strongly that certain human beings can assign point scores consistently and accurately. I do not claim that there are many human beings who can do this, but I do indeed claim that there are some. They are able to accomplish the feat because they 1) taste methodically, 2) work from a foundation of vast experience, and 3) employ a developed talent for discerning subtle differences between wines and assessing those differences critically.
The indisputable fact that few people can assign numerical scores precisely is no basis for dismissing numerical scores. It is, however, a basis for skepticism regarding the proliferation of wine blogs and reviews in an era when any bozo with $500 can put up a website.
What’s my advice for dealing with this? Know thy reviewer. Take a very close look at the “About Us” page on every website (including this one). And compare your own sense of wines that you taste carefully against the reviews and scores you read, in order to find out who is a reliable guide, who tastes carefully but doesn’t share your taste, and who is just a quack.
Scores on a 100-point scale are subjective.
Indeed they are, and everyone should be aware of that. However, “subjective” is not the same as “meaningless” or “inherently inaccurate.” Reviewers who summarize their words with point scores do not need to be consistent with one another to perform a useful service for readers. They do, however, need to be consistent with themselves, honoring the one essential rule of competent scoring: every wine must be judged against every other wine on the same 100-point scale, with comparable criteria applied uniformly.
Point scoring can be subjective but still meaningful provided that reviewers are consistent with themselves. Being “consistent with themselves” allows for the likelihood that some reviewers will inevitably be more favorably impressed by big, broad wines, whereas others will prefer brighter, more “linear” ones. That is not necessarily a problem if two conditions are met: First, both reviewers must be internally consistent in their preferences and their scoring. And second, the publication that prints their reviews must always indicate the reviewer’s name alongside the point score.
At Wine Review Online, our contributors are established, experienced experts, and we invariably indicate the name of the person behind the points. We never employ a “tasting panel,” and we don’t publish lists of un-attributed reviews that are simply ascribed to WRO. Some of us are more generous with points than others, and some lean toward New World wines while others are more impressed by European styles. Which is fine, provided once again that you heed the maxim: Know thy reviewer.
By the way, if it bugs you a little that you need to monitor your reviewer, it is worth remembering that you’d still need to do this even if no score from a 100-point scale appeared at the end of the reviews. Points are subjective, but so too are words and puffs, which do not write themselves.
A three-puff or five-star scoring system conveys gradations of quality more realistically than the 100-point scale.
I saw the strongest version of this point that I’ve ever encountered in a note from a friend just last week. To paraphrase him closely, he contended that the other products or artifacts or experiences that critics review are rarely evaluated on a 100-point scale, and for good reason. He hasn’t seen restaurants or movies or consumer products rated with point scores, and he would regard reviewing Citizen Kane at 96 but Casablanca at 92 as “foolishness,” to use his word. No one, though, would laugh at a reviewer who said that both of those movies deserve five stars. Similarly, it seems silly to rate a KitchenAid dishwasher at 89 but a Kenmore at only 87, whereas it seems sensible to give five stars to one and only four to the other. That narrower scale, which is used quite widely to review things other than wine, draws distinctions but does not pretend to the precision that the 100-point scale falsely claims.
This argument looks compelling at first glance, but it cracks under scrutiny when applied to specific examples from wine reviewing.
I agree that a five-point scale is more modest and easier to employ than a 100-point scale, but I disagree that it would be more useful in practical terms for wine reviewing. Such a scale is simply too compressed to convey, in a way that is maximally helpful to consumers, the gradations in quality that separate one fine wine from another.
A few examples serve to demonstrate this point. For starters, consider the fact that if the five “First Growth” Bordeaux from 2000 were all rated on a five-star scale, they’d all get five stars. And they’d deserve them. But which one should you buy? Those identical five-star scores aren’t worth a dime for differentiating between the wines. The verbiage in most reviews won’t help much either, since when tasted young they all show notes of black fruits, vanilla and woodsmoke, and the descriptions read very much like one another. Most people would therefore be left to decide which wine to buy based on nothing deeper than label design or selling price. Which is to say that the reviewer who was armed with only five stars evaluated the wines realistically–but unhelpfully.
If we pursue a related example and drop down from First Growth wines to consider the other 56 classified growth Bordeaux wines from very strong vintages like 2000 or 2005 or 2009, I can predict what will happen to our five-star reviewer:
1) First, he will produce a giant logjam of four-star ratings because almost none of these wines are as strong as the First Growths but almost all clearly deserve more than three stars, which is a 60% rating or a D- in school terms.
2) Second, he will start begging us to let him use half-stars so that he can wriggle out from under his overly compressed scale and come closer to doing justice to the wines (wines like Léoville Barton and Gruaud Larose that deserve to be elevated above the logjam and given four-and-a-half stars, or underachievers like Marquis de Terme that call for something like three-and-a-half).
3) Third, if he’s really careful and conscientious about distinguishing between the wines to help his readers, he’s going to find that he’s still too clustered in his ratings, and is going to start to wonder whether we’ll let him use quarter-stars, or the U.C. Davis scale, which would give him a few additional gradations. Or perhaps he’ll just admit that Robert M. Parker, Jr. isn’t quite a fool after all.
Five-star or three-puff ratings are workable for dishwashers because there are relatively few of them, their features are readily apparent, and their performance can be measured straightforwardly by reference to decibels or water spots left on glasses. They also work for movies because Citizen Kane and Casablanca are roughly in the same league, and we presume that we have plenty of time to see both of them. However, most of us don’t have plenty of money to buy all of the 2000 First Growths, and their features aren’t readily apparent like dishwasher knobs or cycle options. We can only afford one of them–if we are lucky–and we find it meaningful and useful to have a reputable expert like Parker consider their concentration, integration, complexity and capacity for development and tell us in summary that Mouton is excellent but still just 97+ by comparison to the 100 points he awards to Margaux.
The 100-point scale is a misnomer in the first place, since even the worst wine gets 50 free points, and since wines scoring fewer than 85 points never get reviewed.
This objection is mostly true, but also mostly without weight.
Yes, wines get 50 points on the 100-point scale simply for being wines. The same is true in classroom situations (which I know a little about, since I’m a professor as well as a wine writer). My students get 50 points for starters, simply because I respect them as human beings. If their work is too lazy or dim to earn another 10 points above that, they’re still human beings–just human beings with an “F.”
Everybody seems to understand that that’s how the grading scale works. Some aren’t wild about the outcome, but everybody gets the idea. This is one of the reasons why the 100-point scale has become so popular. It has its quirks, like the free 50 points, but everybody gets it, whereas two-and-a-half puffs aren’t nearly so clear.
Yes, the great majority (though hardly all) of the wines that we review on WRO are scored at 85 points or higher. I grant that, in practical terms, we really use something like a 15-point scale. But so what? This 15-point range is clearly more capable of doing justice to large numbers of fine wines than a five-point or three-puff system, as I believe I’ve shown. And the same is true of the 20-point U.C. Davis scale, which in practice is really a 6- or 7-point scale (at least until its users start resorting to half-points, which they need to do to achieve the discrimination permitted by the 100-point scale that they sometimes decry when speaking from the other side of their mouths).
Attaching a numerical score to a thing of beauty like a fine wine is inherently crass.
That which is crass lies largely in the eye of the beholder. As an aesthetic judgment, this proposition is essentially beyond rational argument, and if you maintain this view, you are welcome to it. However, I want to observe something about it that isn’t often noticed: It emits a whiff of condescension, and the condescending attitude at its source sometimes stems more from the speaker’s personal situation than from his or her aesthetic sophistication.
A very large percentage of the point-bashers whom I know are either wine writers or wine trade members. As a rule, both are lavishly supplied with wine, and rarely confront the hand-wringing decisions required of consumers who have only a limited budget for buying quality wines.
The writer who disdains point scores doesn’t need to choose between the five different Pinots released by an Oregon winery this year. He can taste them all because, guess what? The FedEx driver dropped them at the doorstep yesterday, free of charge! He can note that this one is reminiscent of black cherries whereas that one is more suggestive of red cherries, with a little less toastiness in the finish. Why should he summarize his impressions with points? Comparisons are odious to the aesthete–especially numerical comparisons–and mere consumers should be content to study his descriptors.
Yet one thing the past quarter-century of wine writing in North America shows for sure is that vast numbers of consumers are not content with reading descriptors. They can only afford to purchase one of those five Pinots, and they aren’t merely interested in what a writer thinks they taste like (though that is important), but exactly how highly he evaluates them relative to one another, and to other Pinots from other wineries.
I didn’t like having to score wines with points when I started doing it for WRO. It was much easier and more comfortable to simply describe the samples that I tasted for The Washington Post and then list them in the paper in order of preference. Having to assign a score at the end of a review required that I come to an unequivocal decision at the end of the evaluation process, and that was hard to do. Moreover, having to come up with a specific score also left me much more exposed to criticism by those whose evaluations differed from mine.
Which makes me wonder: Of those writers who look down their noses at point scores, how many would simply rather not put their asses on the line with a clear-cut score?
Similarly, a large contingent of point-bashers consists of wine trade members, especially retail salespersons and sommeliers. Awash in wine that is free or deeply discounted, they are also emancipated from much of the hand-wringing that budget-conscious consumers must engage in. Moreover, the need for their services is undercut to some degree by the belief among consumers that point scores are more reliable than advice in restaurants or retail stores, which may be tainted by financial self-interest, and which is rarely based on comparative peer-group tastings as inclusive as those conducted by major wine writers.
Let me be clear here: Many sommeliers and retail wine salespeople taste more extensively and discerningly than some wine writers. More broadly, I do not ascribe dismissive attitudes toward the 100-point scale entirely or even mostly to narrow personal interest or to the good fortune of having lots of free wine. But I definitely ascribe some of them to these influences.
Point scores wrench wines out of the contexts that contribute to consumer enjoyment of them, ignoring mood, season, or partnership with food.
This is an unfair criticism because contextual information and usage advice are tasks for the words in a review, not the number at the end of it.
The verbiage in an excellent review should communicate something about the character of the wine that will help you decide whether it suits your mood or your food or the season. If you are in an energetic mood while sitting in front of a plate of oysters on a hot night, you’d be better off with lean, racy Chardonnay from Chablis than a big, buttery one from Australia. The words from well-written reviews should tell you enough about these two wines to enable you to pull the right bottle out of your basement. The role of the numerical score at the end of the reviews is simply to provide you with some guidance when deciding which Chablis to stash in the basement in the first place.
The 100-point scoring system privileges big wines and imposes a glass ceiling on scores for lean, elegant wines.
When that is true, it is a reflection of the taste peculiarities of the reviewer much more than of any structural bias in the 100-point scale itself. More broadly, this phenomenon (which I grant is not merely imagined by critics of the 100-point scale) is reflective of the prominence of The Wine Advocate and The Wine Spectator since the 1980s. Some palates prefer super-ripe wines with gobs of this and that, and give lots of points to them. Other writers who prefer lean, elegant wines are perfectly free to give high scores to them–provided that they are willing to dissent explicitly from the wine fashions of the moment.
Wine Review Online’s core mission is to enhance and diversify wine criticism by offering a platform to outstanding writers, permitting them to speak their minds and score their wines while free of any “Party Line” regarding wine styles. If one of them tastes a Beaujolais that he thinks merits a score at a level usually reserved for Grand Cru Burgundy, he is free to assign it. None of our contributors has ever been told to back off of a high score for a steely Sauvignon Blanc or a lean Blanc de Blancs Champagne. And none of them will ever be told to do any such thing.
* * *
Questions or comments? Write to me at [email protected]