Delphi as a method to establish consensus for diagnostic criteria

https://doi.org/10.1016/S0895-4356(03)00211-7

Abstract

Background and objectives

To achieve consensus among a panel of experts on the best criteria for the clinical diagnosis of carpal tunnel syndrome (CTS).

Method

Experts rated the diagnostic importance of items from the clinical history and physical examination for CTS. The ratings were expressed on a 10-cm visual analog scale. The average and standard deviation of the scores for each item were returned to the panelists. The panel members evaluated the items a second time with knowledge of the group responses from the first round. The scores were standardized to minimize scaling variations and, after the second round, the items were ranked in order of importance assigned by the group. Cronbach's α was used as a measure of homogeneity for the rankings. Increasing homogeneity was considered to be an indication of consensus among the panelists.
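The computations described above can be illustrated concretely. The following minimal Python sketch uses hypothetical ratings rather than data from the study, and it assumes that "standardized" means z-scoring each panelist's scores; Cronbach's α is computed treating the panelists as the components of the scale and the clinical items as cases.

```python
# Minimal sketch (not the authors' code): Cronbach's alpha over a panel's
# ratings, with rows = clinical items and columns = panelists.
# Ratings are hypothetical 0-10 visual analog scores.
import numpy as np

def standardize_by_panelist(ratings):
    """z-score each panelist's column, one plausible reading of the paper's
    'standardized to minimize scaling variations'."""
    mean = ratings.mean(axis=0, keepdims=True)
    sd = ratings.std(axis=0, ddof=1, keepdims=True)
    return (ratings - mean) / sd

def cronbach_alpha(ratings):
    """Cronbach's alpha, treating each panelist as one component of the scale."""
    k = ratings.shape[1]                          # number of panelists
    component_vars = ratings.var(axis=0, ddof=1)  # variance of each panelist's scores
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of the summed scores
    return (k / (k - 1)) * (1 - component_vars.sum() / total_var)

# Hypothetical example: 8 clinical items rated by 5 panelists.
rng = np.random.default_rng(0)
ratings = np.clip(rng.normal(loc=np.linspace(2, 9, 8)[:, None], scale=1.0, size=(8, 5)), 0, 10)
z = standardize_by_panelist(ratings)
ranking = np.argsort(-z.mean(axis=1))             # items ranked by mean standardized score
print(round(cronbach_alpha(z), 2), ranking)
```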

Results

Cronbach's α increased from 0.86 after the first round to 0.91 after the second iteration. Panelists who were relative outliers on the first round demonstrated a much higher correlation with the entire group after the second round.

Conclusions

Delphi is an effective method of establishing consensus for certain clinical questions. Cronbach's α was a useful statistic for measuring the extent of consensus among the panel members. Delphi was chosen from the possible methods of group process because of its inherent feasibility. The absence of a need for the panelists to meet in person removed any constraint on the geographic location of the panel members. In addition, the anonymous nature of Delphi was thought to be a key factor in avoiding a result that might be skewed by one or more persuasive panelists. Both of these characteristics were felt to be particularly important to the topic on which consensus was sought, the clinical diagnostic criteria for CTS. The movement in the opinions of some of the panelists toward the group position appeared to result from the feedback of information describing the group opinion.

Introduction

Delphi is a well-recognized group process in the social sciences [1], [2], [3], [4], [5], [6], [7], [8]. Although prior studies have used this method to establish appropriateness criteria for treatment [9], [10], [11], [12], [13], [14], [15], [16], it has received less attention as a tool to establish consensus among health care professionals on diagnosis.

Delphi is a completely anonymous process in which the participants never meet; otherwise, it resembles the nominal group technique in structure. Ideas are expressed to the participants in the context of a mailed questionnaire. Responses to the items in the questionnaire are collated and analyzed. Items may be dropped or added in a second round, in which the group responses to the first round are reported to the participants. New responses to the items are recorded, and iterations of the process are repeated until a consensus appears to have been reached. The determination that a consensus has been achieved requires an operational definition that is appropriate to the issue under consideration.
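As an illustration only, the iterative structure just described can be written as a short loop. The simulated re-rating behaviour (panelists shifting partway toward the group mean) and the stopping rule (the average per-item standard deviation no longer shrinking) are hypothetical stand-ins for whatever operational definition of consensus an investigator adopts; nothing here is taken from the paper.

```python
# Hypothetical sketch of the Delphi loop as an algorithm.
import numpy as np

rng = np.random.default_rng(1)
n_items, n_panelists = 8, 10
ratings = np.clip(rng.normal(5, 2, size=(n_items, n_panelists)), 0, 10)

prev_spread = np.inf
for round_no in range(1, 6):
    group_mean = ratings.mean(axis=1, keepdims=True)   # feedback returned to the panel
    spread = ratings.std(axis=1, ddof=1).mean()        # average per-item standard deviation
    print(f"round {round_no}: mean item SD = {spread:.2f}")
    if prev_spread - spread < 0.05:                    # operational consensus: spread stopped shrinking
        break
    prev_spread = spread
    # simulated next-round behaviour: each panelist moves 30% toward the group mean
    ratings = ratings + 0.3 * (group_mean - ratings) + rng.normal(0, 0.2, ratings.shape)
```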

Delphi is particularly attractive for achieving consensus among health care professionals. First, the absence of an obligation to meet in person greatly improves the feasibility of Delphi and lowers its cost significantly. Second, and perhaps more importantly, constraints on either the size or composition of the group are less likely. Participants may be recruited from diverse geographic locations and clinical backgrounds. Third, the reliability of group consensus for the issue being examined improves as the number of panelists is increased [2]. A panel size appropriate to the issue under consideration is easier to achieve because of the inherent feasibility of the Delphi process. Finally, the anonymous nature of the exercise ensures that a single influential participant will not have a disproportionate impact on the outcome of the group, as can occur with other group processes.

The Delphi process has been criticized as being subject to bias because the investigator limits the scope of the issue evaluated by the panelists. Thus, because the breadth of the issue under consideration is at least partially controlled by the investigator, any consensus that emerges may be somewhat distorted [5]. The Delphi method has also been criticized for the fact that the panelists never meet. Other group processes depend on the interaction between the participants as a source of novel insight into an issue. Because no discussion takes place in Delphi, any consensus that the group appears to have developed can only derive from information provided to it by the investigator. Where there is discussion among panelists, as in other types of group process, the consensus reached may differ significantly from that expected before the group process was conducted [4]. Finally, criteria for determining that group consensus has been achieved have not been established.

Carpal tunnel syndrome (CTS) is a diagnosis commonly made in industrialized societies. The prevalence of CTS reportedly varies with the geographic location of the population under study [17], [18], [19]. There are also variations in the reported prevalence of the condition between different industries [20], [21], [22], [23], [24], [25], [26], [27]. Potential explanations for these variations include intrinsic differences among the populations, different exposures, and variations in the diagnostic criteria used to identify CTS. One of the more likely explanations is the differing case definitions for this condition among reports, which vary in their nature, emphasis, and stringency.

The clinical evaluation remains an important aspect of the diagnostic process for CTS. Electrodiagnostic studies are often considered to be a gold standard diagnostic test for CTS [28], [29], [30]. In other words, electrodiagnostic tests are frequently taken to represent a demonstration of the essential lesion for the clinical condition called CTS. There is no consensus on this issue, and the assumption that these tests represent the essential lesion in CTS may be flawed for the following reasons. First, electrodiagnostic testing does not have perfect sensitivity or specificity, and the tests may be normal despite clinically significant nerve compression [31], [32]. Second, the standard interpretation of electrodiagnostic data assumes a normal distribution of nerve conduction velocities and arbitrarily designates velocities more than two standard deviations from the mean as abnormal. This results in the misclassification of some asymptomatic individuals as being affected by CTS. In addition, the literature reports that the assumption of a normal distribution of nerve conduction velocity may not be reasonable [33]. Third, the cut point for defining an abnormality of nerve conduction velocity varies in the literature [30], [34], [35]. There is no established consensus on the electrical evidence for CTS, and the sensitivity and specificity of sensory nerve conduction measurements are affected by the threshold chosen to define CTS. This would not be expected of a gold standard test in the usual connotation of the term.
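The second criticism above is essentially arithmetic. A brief sketch, assuming for illustration that nerve conduction velocity really were normally distributed in asymptomatic people, shows the fraction that a fixed two-standard-deviation cutoff would label abnormal by construction:

```python
# Illustrative arithmetic only, not data from the paper: the share of a truly
# normal asymptomatic population that a 2-SD cutoff labels "abnormal".
from math import erf, sqrt

def normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# one-sided cutoff: velocities more than 2 SD below the mean
print(f"{normal_cdf(-2.0):.1%} of asymptomatic individuals fall below mean - 2 SD")
# two-sided cutoff: more than 2 SD from the mean in either direction
print(f"{2 * normal_cdf(-2.0):.1%} fall beyond 2 SD in either direction")
```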

The use of electrodiagnostic tests is not universal among experts who treat CTS [36]. Like any diagnostic test, electrodiagnostic evaluations should be interpreted within a clinical context. Ideally, they should be used in a Bayesian manner to modify the pretest probability of CTS established clinically. Thus, there is a need to standardize the clinical criteria for the diagnosis of CTS.
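A minimal sketch of the Bayesian use of a test result suggested above, using likelihood ratios; the sensitivity, specificity, and pretest probability are hypothetical illustrative values, not figures from the paper.

```python
# Illustrative Bayesian updating of a clinical (pretest) probability of CTS.
def posttest_probability(pretest, sensitivity, specificity, positive=True):
    """post-odds = pre-odds * likelihood ratio of the observed result."""
    pre_odds = pretest / (1 - pretest)
    lr = sensitivity / (1 - specificity) if positive else (1 - sensitivity) / specificity
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

# e.g. a clinician estimates a 60% pretest probability from history and examination;
# an electrodiagnostic study with assumed sensitivity 0.85 and specificity 0.95:
print(f"{posttest_probability(0.60, 0.85, 0.95, positive=True):.0%}")   # positive study -> ~96%
print(f"{posttest_probability(0.60, 0.85, 0.95, positive=False):.0%}")  # negative study -> ~19%
```

With these assumed values, a positive study raises a 60% clinical suspicion to roughly 96%, while a negative study lowers it to roughly 19%, which is the sense in which the test modifies rather than replaces the clinical assessment.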

Clinical experience at our center has indicated that CTS is a condition that is diagnosed and treated by a broad spectrum of specialist and primary care clinicians. These diagnosticians bring varying experiences to the task of diagnosing CTS, experiences that have been obtained within an intellectual framework or paradigm specific to a particular clinical specialty. These unique clinical experiences could form the basis for diagnostic criteria that are not uniform among diagnosticians from different training backgrounds. The absence of uniform diagnostic criteria makes the study of potentially important factors, like industrial exposures, difficult or even impossible. Thus, CTS is a common clinical condition diagnosed using widely varying clinical criteria.

Diagnostic criteria for any condition must be both valid and clinically sensible. Establishing a consensus among clinical experts on what the criteria should comprise does not ensure validity. As clinical experience evolves, the opinions of experts may also change, together with their diagnostic practices. The development of methods for seeking consensus must consider this and be flexible so that the criteria can be re-examined and revised at intervals. Obtaining agreement among clinical opinion makers should be seen as a starting point for establishing criteria that are likely to have significant clinical sensibility and that can be tested to evaluate validity. The key issue is the development of a consensus so that a gold standard diagnostic criterion can be established.

The objective of this study was to determine whether the Delphi method could be used to achieve consensus among influential expert clinicians representing all of the involved clinical disciplines for diagnostic criteria for CTS. An additional goal was to measure agreement within the panel using Cronbach's α as a measure of the internal consistency of the group.


Selection of the panelists

Panelists were recruited from clinical disciplines involved in the diagnosis and treatment of CTS including neurology, neurosurgery, rheumatology, occupational health, plastic surgery, and orthopedic surgery. We attempted to identify experts defined in at least one of two ways. First, some panelists were leaders in their clinical fields as evidenced by their roles as opinion makers within national organizations such as the American Society for Surgery of the Hand and the American Society for

Results

Cronbach's α for the first round of the Delphi process was 0.86. The individual panelist-group correlations ranged between 0.23 and 0.73 (Table 2), suggesting that some panelists were relative outliers. Two of the three panelists with the lowest correlation with the entire group were from the same clinical specialty, rheumatology. The three panelists with the highest correlations were orthopedic hand surgeons.
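One plausible way to compute the panelist-group correlations reported here is to correlate each panelist's item scores with the mean score of the remaining panelists; whether the panelist was included in or excluded from the group mean is not stated in this excerpt, so the sketch below (on hypothetical data) is an assumption.

```python
# Hypothetical sketch: each panelist's correlation with the rest of the group.
import numpy as np

rng = np.random.default_rng(2)
ratings = rng.normal(np.linspace(2, 9, 12)[:, None], 1.5, size=(12, 6))  # items x panelists

for j in range(ratings.shape[1]):
    rest_mean = np.delete(ratings, j, axis=1).mean(axis=1)   # group mean excluding panelist j
    r = np.corrcoef(ratings[:, j], rest_mean)[0, 1]
    print(f"panelist {j + 1}: r = {r:.2f}")
```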

The items were ranked in descending order according to the average score assigned by the

Discussion

The concept of consensus within a group is easily understood, but the best way to measure this phenomenon is unclear. Furthermore, criteria that indicate a consensus has been achieved will vary with the setting in which agreement is sought and the method being utilized.

In general, consensus within the group should be reflected in decreases in the variance of the responses. Consensus among groups has often been quantified using group means and standard deviations [16]. In our study, the standard

Acknowledgements

Dr. Wright is supported as the R.B. Salter Chair of Surgical Research and as an Investigator of the Canadian Institutes of Health Research. This work was supported by Physicians' Services Incorporated Foundation Grant 97-52.

References (40)

  • N.C. Dalkey et al. An experimental application of the Delphi method to the use of experts. Manage Sci (1963)
  • N.C. Dalkey. The Delphi method: an experimental study of group opinion (1969)
  • N. Dalkey et al. The Delphi Method III. Use of self-ratings to improve group estimates (1969)
  • A. Fink et al. Consensus methods: characteristics and guidelines for use. Am J Public Health (1984)
  • F.J. Romm et al. Developing criteria for quality of assessment: effect of the Delphi technique. Health Serv Res (1979)
  • A.D. Weinberg et al. The Delphi technique as a method for determining the continuing education needs of physicians regarding coronary artery disease. J Assoc Hosp Med Educ (1977)
  • J. Zinn et al. The use of the Delphi panel for consensus development on indicators of laboratory performance. Clin Lab Manage Rev (1999)
  • M.R. Couper. The Delphi technique: characteristics and sequence model. ANS Adv Nurs Sci (1984)
  • R.P. Gale et al. Delphi-panel analysis of appropriateness of high-dose chemotherapy and blood cell or bone marrow autotransplants in diffuse large-cell lymphoma. Leuk Lymphoma (1998)
  • B.J. Hillman et al. Improving diagnostic accuracy: a comparison of interactive and Delphi consultations. Invest Radiol (1977)