When Clinical Description Becomes Statistical Prediction

n outperforms informal,
subjective aggregation much of the time. However, these
data have little bearing on the question of whether, or
under what conditions, clinicians can make reliable and
valid observations and inferences at a level of generality
relevant to practice or useful as data to be aggregated
statistically. An emerging body of research suggests that
clinical observations, just like lay observations, can be
quantified using standard psychometric procedures, so that
clinical description becomes statistical prediction.

The style and sequence of the [book] reflect my own ambivalence
and real puzzlement, and I have deliberately left the document in
this discursive form to retain the flavor of the mental conflict that
besets most of us who do clinical work but try to be scientists.
(Meehl, 1954, p. vi)

In 1954, Paul Meehl published his classic book on
Clinical Versus Statistical Prediction. Clinical prediction
referred to the use of an individual (an expert; in
psychology, a clinician) to predict an event. Statistical
prediction referred to the use of an actuarial formula to
predict the same event. In the prototypical study reviewed
by Meehl, the clinical expert had access to all of the
information used to create the competing formula (and
sometimes additional data). The clinician could combine
the information in any way he or she saw fit, making use of
clinical skill, intuition, and theoretical knowledge. In
contrast, the mathematical equation had no flexibility.

In the vast majority of cases, the formula turned out to
be at least as good a predictor as the clinician. Meehl's
understanding of this finding was that the clinician
combined the variables in an idiosyncratic manner, whereas the
formula combined them in the way that past history had
shown to be most predictive. In statistical terms, the
clinician was an imperfect, unreliable generator of regression
weights (see Goldberg, 1991).

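The idea of a judge as an unreliable generator of regression weights can be made concrete with a small simulation. The sketch below is purely illustrative and is not an analysis from Meehl (1954) or from any study cited here: the two cues, the "true" weights, the noise levels, and all function names are assumptions chosen for the example. A formula estimates its weights once from past cases by least squares, whereas a simulated judge uses the same cues with weights that are correct on average but wobble from case to case.

```python
import random

# Assumed "true" cue weights and noise levels; purely illustrative values.
TRUE_WEIGHTS = (0.6, 0.3)
OUTCOME_NOISE_SD = 0.7   # part of the outcome that no one can predict
JUDGE_WEIGHT_SD = 0.5    # case-to-case wobble in the judge's weights


def simulate_cases(n, rng):
    """Draw n cases as (cue1, cue2, outcome) with standard-normal cues."""
    cases = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y = (TRUE_WEIGHTS[0] * x1 + TRUE_WEIGHTS[1] * x2
             + rng.gauss(0, OUTCOME_NOISE_SD))
        cases.append((x1, x2, y))
    return cases


def fit_weights(cases):
    """Least-squares weights for two cues (no intercept; all means are ~0)."""
    s11 = sum(x1 * x1 for x1, _, _ in cases)
    s22 = sum(x2 * x2 for _, x2, _ in cases)
    s12 = sum(x1 * x2 for x1, x2, _ in cases)
    s1y = sum(x1 * y for x1, _, y in cases)
    s2y = sum(x2 * y for _, x2, y in cases)
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def compare(seed=0):
    """Return (formula validity, judge validity) on fresh test cases."""
    rng = random.Random(seed)
    past_cases = simulate_cases(500, rng)    # history the formula learns from
    new_cases = simulate_cases(2000, rng)    # cases both methods must predict
    w1, w2 = fit_weights(past_cases)
    outcomes = [y for _, _, y in new_cases]
    formula = [w1 * x1 + w2 * x2 for x1, x2, _ in new_cases]
    # The judge's weights are right on average but differ on every case.
    judge = [
        (TRUE_WEIGHTS[0] + rng.gauss(0, JUDGE_WEIGHT_SD)) * x1
        + (TRUE_WEIGHTS[1] + rng.gauss(0, JUDGE_WEIGHT_SD)) * x2
        for x1, x2, _ in new_cases
    ]
    return pearson(formula, outcomes), pearson(judge, outcomes)


formula_r, judge_r = compare()
print(f"formula validity r = {formula_r:.2f}")
print(f"judge validity   r = {judge_r:.2f}")
```

Under these assumptions the simulated judge sees exactly the same information as the formula and is unbiased on average; the lower validity comes only from inconsistency in how the cues are weighted. The sketch says nothing about the quality of the observations themselves.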
Meehl's book touched off a decades-long debate about
the reliability and validity of clinical judgment. The hard
scientists savored the victory of statistics over clinical
intuition; the soft psychologists railed against the
devaluation of clinical expertise. The terms of the debate (and
the attendant affect) seem little different today. Although
psychologists have revisited the question of clinical versus
statistical prediction many times since Meehl's book (e.g.,
Dawes, Faust, & Meehl, 1989; Holt, 1958; Sarbin, 1962;
Sawyer, 1966), the weight of the evidence remains the
same as it was in 1954: In the vast majority of studies, a
good formula matches or trumps an intuitive clinical
soothsayer (Grove, Zald, Lebow, Snitz, & Nelson, 2000).

In framing the clinical–statistical debate, Meehl
(1954) used the term clinical to refer to a method of
aggregating data (informal, unstructured vs. statistical,
actuarial). We believe, however, that the debate since Meehl
has often confounded the method of aggregation (unstructured
judgment vs. statistical aggregation using algorithms
refined over successive iterations) with the nature of the
observer (clinician expert vs. lay). Meehl was clear in
defining clinical as a mode of data aggregation (and his
collaborators have largely adhered to that definition; e.g.,
Dawes et al., 1989; Grove & Meehl, 1996; Grove et al.,
2000). However, in broader psychological discourse,
clinical has come to be used more broadly (and in accord with
its standard English definition) to denote the judgments,
inferences, observations, and practices of clinicians. The
confusion of these two meanings of clinical has led to a
widespread belief that empirical data have shown that the
observations, thought processes, and beliefs of clinicians
are seriously flawed (e.g., Tavris, 2003).

Consider the following excerpt from Meehl's obituary,
published in the APS Observer: "Meehl's reputation
spread with his 1954 book . . . in which he showed that
statistical formulas were better than, or at least equal to,
clinicians at predicting such things as what sort of
treatment would best benefit a mentally ill person" (American
Psychological Society Observer, 2003, p. 13; emphasis
added). This statement is particularly problematic given
that Meehl himself practiced psychoanalysis, despite his
awareness of its inadequate evidentiary basis in replicable
scientific studies (Meehl, 1978; Meehl, personal
communication, 2002). Similar sentiments can be seen across the
landscape in contemporary clinical psychology, as in the
shift to clinical scientist models of clinical psychology
training that minimize the importance of clinical
experience for understanding clinical phenomena (e.g., McFall,
1991); models of treatment that minimize the role of
clinical judgment on the grounds that such judgment is
inherently inferior, over the long run, to interventions prescribed
in a well-validated manual (see Westen, Novotny, &
Thompson-Brenner, 2004); and models of assessment and
diagnosis that advocate that clinicians replace their
standard diagnostic practices with structured interviews that
inquire about each diagnostic criterion for each disorder in
the fourth edition of the Diagnostic and Statistical Manual
of Mental Disorders (DSM–IV; American Psychiatric
Association, 1994; Basco et al., 2000; Segal, Corcoran, &
Coughlin, 2002; Wood, Garb, Lilienfeld, & Nezworski,
2002). Underlying all of these contemporary incarnations
of the clinician–researcher tension that has existed since the
rise of clinical psychology (see, e.g., McReynolds, 1987) is
the view that clinical observations, judgments, procedures,
methods of inquiry, and theoretical and technical
predilections (to use Meehl's [1960, p. 19] term, the "cognitive
activity of the clinician") cannot be trusted.

Preparation of this article was supported in part by National Institute of
Mental Health Grants MH62377 and MH62378 to Drew Westen. We
thank William Grove, Scott Lilienfeld, Keith Rayner, and George Stricker
for their comments on an earlier version of this article.
Correspondence concerning this article should be addressed to Drew
Westen, Departments of Psychology and Psychiatry, Emory University,
532 North Kilgo Circle, Atlanta, GA 30322 or to Joel Weinberger, Derner
Institute, Adelphi University, Box 701, Garden City, NY 11530. E-mail:
dwesten@emory.edu or weinberg@panther.adelphi.edu
October 2004, American Psychologist, Vol. 59, No. 7, 595–613
Copyright 2004 by the American Psychological Association 0003-066X/04/$12.00
DOI: 10.1037/0003-066X.59.7.595

Our goal in this article is to revisit the
clinical–statistical debate and, in the process, to rethink the question of
what clinicians can and cannot do. We suggest that Meehl's
arguments against informal aggregation stand 50 years
later, but they have no bearing on whether, or under what
circumstances, clinicians can make reliable and valid
observations and inferences. We first address the dual
meanings of the term clinical and examine the conditions under
which the two types of clinical judgment are likely to be
useful in prediction. We then review an emerging body of
research on the quantification of clinical observation that
considers what happens when we unconfound the two
meanings, crossing clinical observation with statistical
prediction. We conclude by reconsidering a paradox with
which Meehl struggled throughout his career, a paradox
that (in his words, cited above [Meehl, 1954, p. vi]) "besets
most of us who do clinical work but try to be scientists," of
how to reconcile idiographic (and potentially idiosyncratic)
clinical judgment in a given hour with nomothetic science.
We suggest that the clinical–statistical distinction
constitutes as much a continuum as a dichotomy, and that every
application of nomothetic, probabilistic statements to a
given case (whether that case is a patient, a study to be
designed or interpreted, or a body of literature) inherently
involves clinical modes of aggregation.

Before proceeding, we should briefly note the potential
meanings of the other word in the phrase clinical
prediction, namely prediction. Cognitive processes can be
arrayed on a continuum, from lower level processes, such
as sensation and perception (which nevertheless involve
substantial top-down processing), through processes
denoted by terms such as inference, judgment, and decision
making. Clinical observation includes substantial elements
of perception and low-level categorization (e.g., the patient
cries a lot or has a history of arrests) that require minimal
inference. It also, however, includes substantial elements of
judgment or inference (e.g., the patient is emotionally
labile or is sensitive to rejection), which are not dissimilar
in kind from the inferences required of lay observers when
self-reporting symptoms or personality traits (e.g., "My
mood is very changeable" or "I often worry about being
rejected by people important to me"). As we argue below,
there is good reason to believe that clinicians can make
reliable judgments at this level of abstraction, which we
denote here by the terms observations, inferences, and
judgments. We restrict the term prediction to the way it is
usually operationalized in research on clinical and
statistical prediction, to refer to broader generalizations or
prognostications (whether about past, concurrent, or future
events), such as whether the patient is likely to have a
history of sexual abuse or to make a successful suicide
attempt in the next 2 years.

Two Meanings of Clinical
In an article on the Compar