GRAND JURY EUROPEEN
THE STATISTICIAN'S POINT OF VIEW
Bernard Burtschy, Professor of Statistics, a member of the Jury and one of the
winners of the Euro Cave-European Grand Jury Trophy, presents a statistical
analysis of the official results and the conclusions he draws from it.
------------------------------------
Classifying wines, vintages, even wine-tasters, with a view to quantification,
is a process which naturally interests the statistician, a man preoccupied by
figures. The process is all the more interesting because using a set of
collective scores to draw up a classification of Châteaux demands a technique
that is not without its hazards.
Which scoring system?
France in particular, and Europe in general, use scales from 1 to 20 or from
1 to 10. The American university tradition, adopted in particular by Robert
Parker, operates on a scale of 100. This scale has its own rules, which are not
always understood. Scores range from 50 to 100 in theory, but in practice it is
rare to find a score lower than 72 (the effective zero). The maximum, 100, is
very rare too. The gap is therefore from 25 to 30 points.
The jury was asked to score the wines on a scale of 100 so that the scores
would be comparable to the American system. A detailed analysis of the
wine-tasters' notes showed that they scored in three different ways:
The first group scored according to the rules of the American system. Their
scores were between 70 and 100, i.e. a gap of 30 points.
A second group used the scale that they were probably most familiar with, from
0 to 10 or from 0 to 20. Then they multiplied by a factor of 5 or 10. Their
scores showed a range of between 20 and 100, i.e. a gap of 80 points.
The third group combined the two systems and scored between 50 and 100, i.e. a
gap of 50 points.
The aggregation of individual scores
The classifications were compiled in the traditional manner by adding up the
individual scores. A rough classification was thus obtained, quantifiable in
terms of points. This type of classification, which is easy to understand,
presupposes perfect homogeneity of scoring between the tasters. Even if they
all adopted the same scale, this homogeneity would not exist, because it is
rare for two tasters to score with the same distribution of points.
Is this important? Yes, because the influence of the taster on the final
classification depends on his system of scoring. A taster who gives the same
score to every wine has no influence on the classification. The more diverse
his scores, the greater his influence will be.
The consequences are immediate. A taster whose scores span a range of 60 points
will have twice the influence of a taster whose scores span 30 points, and will
thus count for two.
The "rough classification" gives greater weight to a taster who uses a wide
range of scores.
A taster with widely dispersed scores will have the maximum influence.
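The effect can be illustrated with a minimal Python sketch. The wines, the two
tasters and all their scores are hypothetical, chosen only so that one taster
spans 30 points and the other 60; the function name raw_classification is also
an assumption made for this illustration.

```python
# Hypothetical scores (illustrative only): two tasters, three wines.
narrow = {"A": 100, "B": 85, "C": 70}   # 30-point range, prefers wine A
wide = {"A": 40, "B": 70, "C": 100}     # 60-point range, prefers wine C

def raw_classification(tasters):
    """Sum each wine's raw scores and sort the wines from best to worst."""
    totals = {}
    for scores in tasters:
        for wine, score in scores.items():
            totals[wine] = totals.get(wine, 0) + score
    order = sorted(totals, key=totals.get, reverse=True)
    return order, totals

order, totals = raw_classification([narrow, wide])
print(order, totals)
```

The two tasters hold exactly opposite opinions, yet the summed classification
reproduces the order of the taster with the 60-point range: his wider spread
outweighs the other taster's preferences.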
It is obviously possible to try to standardize the tasters' evaluations by
imposing identical systems of scoring. Experience, often gained in product
tests, shows that this standardization is never perfect. With wines, it is
virtually impossible because it is so difficult to reflect all the nuances of a
wine in one score. Just as one has to concede that there are several languages
in Europe, one must also admit that there are several ways of scoring.
Standardized classifications
There are some more or less sophisticated ways of reducing the exaggerated
influence of a particular taster on the overall classification. The simplest
method is to take not a wine's score, but its position in the classification.
Each taster identifies a first wine and a last one, with perhaps some ties.
The sum of the positions offers a more accurate result than the "rough
classification". The positioning method is nevertheless unsatisfactory in one
respect compared to the "rough classification": if a wine is far ahead of (or
far behind) the others, the positioning method does not take this into account.
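The positioning method can be sketched as follows. This is an illustration
under stated assumptions, not the Jury's actual procedure: the scores are
hypothetical, ties are given the average of the positions they occupy (one
common convention, not specified in the text), and the function names are
invented for the example.

```python
def ranks(scores):
    """Position of each wine for one taster (1 = best).

    Tied wines share the average of the positions they occupy.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    result = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of wines tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_position = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            result[ordered[k]] = average_position
        i = j + 1
    return result

def rank_sum(tasters):
    """Add up each wine's positions over all tasters (lower is better)."""
    totals = {}
    for scores in tasters:
        for wine, position in ranks(scores).items():
            totals[wine] = totals.get(wine, 0) + position
    return totals

# Hypothetical scores (illustrative only).
narrow = {"A": 100, "B": 85, "C": 70}
wide = {"A": 40, "B": 70, "C": 100}
print(rank_sum([narrow, wide]))
```

Here the two opposed tasters carry equal weight, so every wine ends with the
same rank sum. The limitation mentioned above is also visible: a taster who
scores 99, 60, 59 produces exactly the same positions as one who scores
61, 60, 59.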
Statisticians prefer to standardize the scores of each taster by rescaling them
to the same mean and the same dispersion. The contribution of each taster to
the classification is thus exactly the same. From a rigorously scientific point
of view, the only useful classification is one that has been standardized.
Thereafter, nothing prevents one from weighting each taster, depending for
example on the number of wines identified "blind".
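A minimal sketch of this standardization, assuming the usual z-score (subtract
the taster's mean, divide by his standard deviation); the text does not say
which measure of dispersion is used, so the population standard deviation here
is an assumption, as are the scores, the weights and the function names.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Rescale one taster's scores to mean 0 and standard deviation 1."""
    values = list(scores.values())
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # Identical scores for every wine: the taster expresses no preference.
        return {wine: 0.0 for wine in scores}
    return {wine: (score - centre) / spread for wine, score in scores.items()}

def standardized_classification(tasters, weights=None):
    """Sum standardized scores, optionally weighting each taster."""
    if weights is None:
        weights = [1.0] * len(tasters)
    totals = {}
    for scores, weight in zip(tasters, weights):
        for wine, z in standardize(scores).items():
            totals[wine] = totals.get(wine, 0.0) + weight * z
    return totals

# Hypothetical scores (illustrative only).
narrow = {"A": 100, "B": 85, "C": 70}
wide = {"A": 40, "B": 70, "C": 100}
equal = standardized_classification([narrow, wide])
weighted = standardized_classification([narrow, wide], weights=[2.0, 1.0])
print(equal)
print(weighted)
```

With equal weights the two opposed tasters cancel each other out, whatever
range each one used. With the hypothetical weights 2 and 1, the first taster's
preference for wine A prevails, which is the kind of deliberate weighting the
text describes.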
Group tasting versus individual tasting
The virtues of individual tasting are well known - as are the faults. The
individual taster has his own particular taste and it is one that the public
can refer to. On the other hand, for lack of comparison with others, nothing
enables one to detect a possible substandard performance by a taster confronted
with a spoilt wine. Collective tasting, by diluting the influence
of each taster in the group, renders the tasting both more reliable and less
personal. Apart from the question of principles or ideologies for or against
each type of tasting method, modern statistical methods enable the faults of
collective tastings to be remedied.
A taster, who judges a series of wines, is rightly judged in turn by his
tasting. One needs a lot of time to discover a particular taster's method of
tasting. The only way to judge in any kind of formal way is to position his
tasting in relation to his judgment of other wines or in relation to other
tasters tasting the same wines. Unfortunately, the context in which the tasting
takes place also changes.
The bottles are never the same, nor is the occasion, and so on. And the
comparison is very personal (which sometimes allows one to save one's
reputation).
Collective tasting, if it is properly processed statistically, enables an
immediate comparison to be made of the differing profiles of each of the
tasters. The potential for analysis is incomparably rich. Having twenty tasters
taste the same bottles on the same day enables one to draw up twenty parallel
classifications, and to reveal the similarities and the divergences.
The analysis of the similarities and the divergences is itself better than all
the classifications in the world.
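One simple way to look at such similarities and divergences is to correlate
the scores of each pair of tasters. The sketch below uses the Pearson
correlation coefficient as an assumed measure (the text does not name a
method); the taster names and scores are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length score lists.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)

# Hypothetical tasters scoring the same three wines (illustrative only).
wines = ["A", "B", "C"]
tasters = {
    "anne": {"A": 90, "B": 80, "C": 70},
    "bruno": {"A": 40, "B": 70, "C": 100},
    "carla": {"A": 95, "B": 85, "C": 75},
}

def taster_profiles(tasters, wines):
    """Correlation between every pair of tasters, keyed by the pair of names."""
    profiles = {}
    for first, second in combinations(tasters, 2):
        x = [tasters[first][w] for w in wines]
        y = [tasters[second][w] for w in wines]
        profiles[(first, second)] = pearson(x, y)
    return profiles

profiles = taster_profiles(tasters, wines)
for pair, r in profiles.items():
    print(pair, round(r, 2))
```

A coefficient near 1 indicates two tasters who order the wines alike even if
they use different ranges; a coefficient near -1 indicates opposite tastes.
With twenty tasters the same function yields the full table of pairwise
comparisons.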
Apart from anything else, the methods involved enable a reliable classification
with nuances. Two wines could have the same score for different reasons: one
for its smoothness and the other for its austere elegance. The confrontation
between tasters' judgments, some in favor of the smoothness, others
appreciating the suave elegance, enables one to situate a wine in a much surer
way than with an individual tasting. An analysis of contrasts remains the
surest method of analyzing.
This approach to tasting has serious consequences for the composition of
juries. Many think that a jury must be homogeneous, that each taster must be a
"clone" of the ideal taster, the best in the world. This is to ignore
the enormous variety of wines, their complexities and the diversity of tastes.
On the contrary, wine-tasting juries must reflect the tastes of consumers.
Professional tasters have often enough been accused of living on another
planet, totally cut off from reality. Collective tasting, properly treated from
a statistical point of view with modern processing methods, enables one to
reveal this diversity of choices. Let's not deprive ourselves of that...
© The Wine Institute of Las Vegas