Inpatient diagnostic assessments: 1. Accuracy of structured vs. unstructured interviews☆
Introduction
DSM-I (1952) and DSM-II (1968) defined syndromes more as psychological reactions than as biopsychosocial conditions and solicited clinicians’ subjective evaluations (‘clinical judgment’) as much as their objective assessments. Consequently, many clinicians focused more on patients’ ‘problems’ arising from their psychodynamic status than on mental disorders generated by biopsychosocial conditions; the treatment of choice was usually psychotherapy (Whitehorn, 1944, Sullivan, 1970). Prior to 1980, ‘the classification system and specialty of psychiatry [were] often held as less ‘medical’ or scientific than other branches of medicine. Extensive efforts to correct this perception resulted in a paradigmatic shift from hermeneutic [interpretive…theory based] to empirically based approaches, and the development of a nosology intended to increase diagnostic reliability and facilitate research efforts. These changes are embodied in…DSM-III [1980] and DSM-IV [1994]’ (Bogenschutz and Nurnberg, 2000, p. 824).
The Traditional Diagnostic Assessment (TDA), the unstructured interview that is the standard of practice for that task throughout psychiatry (Othmer and Othmer, 1994, Hales et al., 1995, Kaplan and Sadock, 1998), has evolved alongside DSM. To increase diagnostic validity and reliability, clinicians since 1980 have expanded the Mental Status Examination (MSE) and the history of present and past illnesses, and are using more structured formats: ‘Most clinicians are now imbued not only with the content of the DSM-IV criteria sets but also with a different method of interviewing patients and eliciting psychopathology. Compared with pre-DSM-III days, clinical evaluations are now much more likely to be more semi-structured and less open-ended. Clinicians are much more likely to ask specific questions to elicit the [i.e. MSE and history of illness; emphasis added] necessary to make a DSM diagnosis...no evaluation is complete unless the diagnostic questions are asked’ (Frances et al., 1995, p. 66).
Accurate diagnosis is pragmatically as well as theoretically important, because ‘treatment plans are often based on …diagnostic type. The advent of disease-specific treatment protocols has heightened the necessity for accurate diagnostic procedures’ (Basco et al., 2000).
Textbooks seem not to express concerns or list references regarding the validity and reliability of the TDA (Othmer and Othmer, 1994, Hales et al., 1995, Kaplan and Sadock, 1998). Core Readings in Psychiatry (Sacks et al., 1995), the APA bibliography, has 11 references about diagnostic validity, but none about TDA.
Researchers have reported the diagnostic variance of the TDA (Williams et al., 1992, McGorry et al., 1995, Mojtabai and Nicholson, 1995, Hill et al., 1996, van Praag, 1997), but these findings have not led to significant changes in clinical practice.
Although clinicians use the TDA with unquestioning faith, researchers mostly avoid using it as an exclusive diagnostic method in clinical trials. Researchers also tend to avoid using TDAs exclusively in their work to validate syndromes for DSM-IV (Widiger et al., 1994, Widiger et al., 1996).
Following DSM-III's introduction in 1980, researchers began to evaluate how clinicians used it. Lipton and Simon (1985, p. 370) found that ‘documentation of DSM-III criteria for assigned chart diagnoses was not present in 80% of the 131 charts reviewed’. Skodol et al. (1984) found that 75% of incorrect diagnoses made with DSM-III resulted from incorrect application of criteria. Robinson et al. (1985) found that a university-affiliated faculty misidentified 13–48% of DSM-III criteria for major depression. Greist (1998) examined how clinicians followed diagnostic rules and found error rates between 10 and 37%. Garb (1998) reviewed research on clinical judgment and found that clinicians often failed to adhere to diagnostic criteria. Such reports justified the development of methods to improve diagnostic accuracy of the TDA.
In 1980, the corresponding author undertook to ‘computerize’ DSM. This effort took its present form in 1994 with the advent of the Computer-Assisted Diagnostic Interview (CADI), a structured interview that directs the clinician to evaluate all relevant criteria in DSM-IV algorithms by sequentially displaying questions and answers on the computer screen. Clinicians enter their assessments into the computer, and the program matches them with DSM-IV algorithms to make diagnoses. CADI has been in beta testing since 1996 and has been reported previously at national meetings (Miller, 1996, Miller, 1998, Miller, 1999, Miller, 2000a, Miller, 2000b).
The MSE underlies DSM diagnostics, so the corresponding author restructured it to use Key Criteria as MSE items. Key Criteria are those criteria (listed first in DSM algorithms) that must be evaluated before the linked diagnosis can be ruled in or ruled out. Table 1 shows 13 diagnostic groups and their 25 linked Key Criteria. If any Key Criterion is positive, the linked diagnosis must be evaluated completely. If all Key Criteria for a linked diagnosis are negative, that diagnosis can be ruled out. Thus, CADI operates in the same way as ‘Decision Trees for Differential Diagnosis’ (DSM-IV, Appendix A).
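The Key Criteria rule described above can be sketched in a few lines of Python. This is a minimal illustration, not CADI's actual implementation: the group names, criterion names, and the `screen` function are hypothetical placeholders and do not reproduce the contents of Table 1. The sketch assumes every Key Criterion has been answered before any diagnosis is ruled in or out.

```python
from typing import Dict, List, Tuple

# Hypothetical excerpt: group and criterion names are illustrative only
# and are NOT taken from Table 1 of the article.
KEY_CRITERIA: Dict[str, List[str]] = {
    "major_depressive_episode": ["depressed_mood", "loss_of_interest"],
    "manic_episode": ["elevated_or_irritable_mood"],
    "psychotic_disorder": ["delusions", "hallucinations"],
}


def screen(answers: Dict[str, bool]) -> Tuple[List[str], List[str]]:
    """Split diagnostic groups into (evaluate completely, ruled out).

    A group must be evaluated completely if any of its Key Criteria is
    positive; it is ruled out only if all of its Key Criteria are negative.
    """
    to_evaluate: List[str] = []
    ruled_out: List[str] = []
    for group, criteria in KEY_CRITERIA.items():
        # Every Key Criterion must be assessed before ruling in or out.
        missing = [c for c in criteria if c not in answers]
        if missing:
            raise ValueError(f"unanswered Key Criteria for {group}: {missing}")
        if any(answers[c] for c in criteria):
            to_evaluate.append(group)
        else:
            ruled_out.append(group)
    return to_evaluate, ruled_out


if __name__ == "__main__":
    example = {
        "depressed_mood": False,
        "loss_of_interest": True,
        "elevated_or_irritable_mood": False,
        "delusions": False,
        "hallucinations": False,
    }
    print(screen(example))
```

Raising an error on an unanswered criterion, rather than treating it as negative, mirrors the requirement that Key Criteria be evaluated before a linked diagnosis is ruled out.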
The CADI and the paper-and-pencil Structured Clinical Interview for DSM-IV (SCID) both assess DSM algorithms. They differ in two ways: the CADI has an MSE (based on Key Criteria), while the SCID does not; and the CADI makes diagnoses programmatically, while the paper-and-pencil SCID requires the clinician to choose the diagnoses.
Lieff (1987), Taintor et al. (1997), and Blacker (2000) have described different computerized diagnostic interviews, a topic outside the compass of this article.
The purposes of the study were twofold: (1) to compare structured vs. unstructured methods for making psychiatric diagnosis; (2) to assess the use of computer assistance for structured diagnosis and the validity and reliability of the CADI.
We used the paper-and-pencil format of the SCID, because it is the format most widely used, and because it enabled the comparison of three distinct methods — unstructured paper-and-pencil (TDA), structured paper-and-pencil (SCID-CV), and structured computer-assisted (CADI).
Subjects
The 56 subjects were recruited from psychiatric inpatients at a university-affiliated publicly funded hospital. Qualifications included minimum age 18, enough English fluency and cognitive function to comprehend written and oral descriptions of the study and to sign informed consent, and sufficient verbal fluency and memory to answer questions in the CADI and SCID-CV interviews. The first patient admitted each week was screened. If patient #1 did not meet criteria, we screened patient #2, etc.
Primary diagnosis
Frequencies found by the different methods are in columns A–D of Table 3 (see bottom two lines for rates of agreement compared with Consensus). TDA had agreement=53.8% and kappa=0.4325; SCID-CV had agreement=85.7% and kappa=0.8189; CADI had agreement=85.7% and kappa=0.8147. Although the SCID and the CADI had the same overall agreement (48/56), a comparison of columns B and C shows that they had minor differences in eight of 10 individual groups. Using Fleiss's (1973) standards for kappa, the
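For readers who want to see how an agreement rate and a kappa value of the kind reported above are obtained, the following is a minimal sketch. It assumes the unweighted Cohen's kappa for two sets of categorical labels (a method's diagnosis vs. the Consensus diagnosis); the source does not state which kappa variant was used. The function name and the toy labels are hypothetical and are not study data.

```python
from collections import Counter
from typing import Sequence, Tuple


def agreement_and_kappa(a: Sequence[str], b: Sequence[str]) -> Tuple[float, float]:
    """Return (observed agreement, unweighted Cohen's kappa) for two label lists."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: share of subjects given the same label by both.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of marginal proportions, summed over labels.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return p_o, (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Toy labels for illustration only (not data from the study).
    method = ["mood", "mood", "psychotic", "psychotic"]
    consensus = ["mood", "mood", "psychotic", "mood"]
    print(agreement_and_kappa(method, consensus))

    # The overall agreement of 48/56 quoted in the text, as a percentage.
    print(round(100 * 48 / 56, 1))  # 85.7
```

Kappa corrects the raw agreement rate for the agreement expected by chance given each method's marginal label frequencies, which is why two methods with the same overall agreement can have slightly different kappa values.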
Regionalism
Los Angeles County, with its urban population of 13 million and its high racial-ethnic mix, is not representative of the USA.
Subjects
Subjects may represent psychiatric inpatients from the public sector (Pollack et al., 1996), but not all patients from all sectors.
Clinicians
Clinicians who did the TDAs may represent clinicians in publicly funded academic hospitals, but not all psychiatrists. Given these limitations, results are provisional.
Diagnostic complications
Comorbidity confounds the diagnostic process, and our subjects had
Acknowledgements
This study was supported by Eli Lilly and Company.
References (42)
- et al. Factors affecting reliability and confidence of DSM-III-R psychosis-related diagnosis. Psychiatry Research (2001)
- Psychiatric diagnosis: are clinicians still necessary. Comprehensive Psychiatry (1983)
- et al. Training and quality assurance with the Structured Clinical Interview for DSM-IV (SCID-I/PO). Psychiatry Research (1998)
- et al. Psychiatric diagnosis in clinical practice: is comorbidity being missed? Comprehensive Psychiatry (1999)
- et al. Methods to improve diagnostic accuracy in a community mental health setting. American Journal of Psychiatry (2000)
- Psychiatric rating scales
- et al. Classification of mental disorders
- et al. Implementing evidence-based practices in routine mental health service settings. Psychiatric Services (2001)
- Statistical Methods for Rates and Proportions (1973)
- et al. DSM-IV Guidebook (1995)
- Studying the Clinician: Judgment Research and Psychological Assessment
- Defining best practices: a prescription for greater autonomy. Psychiatric Services
- The computer as clinician assistant: assessment made simple. Psychiatric Services
- Problem of diagnosis in post-mortem brain studies of schizophrenia. American Journal of Psychiatry
- Computer Applications in Psychiatry
- Psychiatric diagnosis in a state hospital: Manhattan State revisited. Hospital and Community Psychiatry
- Low-tech autopsies in the era of high-tech medicine: continued value for quality assurance and patient safety. Journal of the American Medical Association
- Spurious precision: procedural validity of diagnostic assessment in psychotic disorders. American Journal of Psychiatry
- The procedural validity of retrospective case note diagnosis. Australian and New Zealand Journal of Psychiatry
☆ Similar findings with an overlapping sample were presented at the Annual Meeting, American Psychiatric Association, 6 May 2000, in Chicago, Illinois.