Original articles
Identification of clinically important changes in health status using receiver operating characteristic curves

https://doi.org/10.1016/S0895-4356(99)00140-7

Abstract

Identification of criterion standards for clinically important changes for groups of patients requires that judgments of the degree of change that represents a clinically important change are consistent among patients. We demonstrate the use of receiver operating characteristic (ROC) curves to test if patients' judgments of clinically important changes are consistent. Twenty-three patients with systemic lupus erythematosus (SLE) were examined prospectively every 2 weeks for up to 40 weeks. At each assessment, each patient rated the activity of their SLE on a visual analog scale, rated whether their SLE was more active, less active, or unchanged over each 2-week interval, and rated the importance of any change in SLE activity. One of three physician examiners completed similar assessments. Each measured change in the patient global assessment was categorized according to the patient's judgment of whether no change in SLE activity was noted or whether the patient thought their SLE was more or less active during the interval. ROC curves were constructed from these data. Areas under the ROC curve that were significantly greater than 0.5 were considered evidence for consistent ratings among patients of important changes in SLE activity. Patient assessments of change were available for 383 of 392 2-week intervals (97.7%). Of these, patients reported no change in SLE activity in 200 intervals, improvement in 72 intervals, and worsening in 111 intervals. Intervals of improvement could be distinguished from intervals of no change by changes in the patient global assessments [ROC area = 0.68; 95% confidence interval (CI) 0.60, 0.76]. The cutpoint with the greatest sensitivity and specificity for any improvement was a decrease of 5 points or more (on a 0–100 scale) in patient global assessment. 
Intervals of worsening could also be distinguished from intervals of no change (ROC area = 0.80; 95% CI 0.74, 0.85), and the best cutpoint was an increase of 5 points or more in the patient global assessment. Group criteria for major improvement or worsening and for relative changes in the patient global assessment could also be determined, as could criteria for important changes in physician global assessments. By testing the consistency of patients' judgments of important changes, ROC curves provide a means to determine if group criteria for clinically important change can be established.

Introduction

In clinical practice, implicit criteria for treatment benefits are routinely used. Patients and physicians continually judge whether the benefit gained from a particular therapy is sufficient to outweigh its cost and any adverse effects that may accompany its use. However, rarely are the criteria for clinically important changes made explicit, and rarely are these criteria based on responses in specific clinical measures. Moreover, judgment of the magnitude of change that represents an important change is unique for each patient.

Recently, investigators have sought to establish explicit criteria for clinically important changes in operationalized clinical measures for groups of patients [1, 2, 3, 4, 5, 6, 7]. Knowing the magnitude of change that constitutes a clinically important change in a measure would greatly facilitate the planning and interpretation of tests of new treatments and comparisons of different treatments. Efforts to date have assumed that patients are consistent in their judgments of the magnitude of change in clinical status that represents a clinically important change. However, criteria for important changes may vary widely among patients. Because patients' valuations about the degree of treatment benefit that represents an important improvement are personal and individual, the overlap between changes considered by some to be important and by others to be negligible may be so great that it may not be possible to establish criteria for changes that most patients would consider important.

One approach to determining the consistency of patients' judgments of clinically important changes is to compute the sensitivity and specificity of a range of measured changes in health status for clinically important changes. Receiver operating characteristic (ROC) curves constructed from such data would indicate the accuracy with which measured changes corresponded to judgments of important changes in health status [8, 9]. An area under the ROC curve of 0.5 would indicate that, over all magnitudes of change, changes in the health status measure were no better at identifying an important change than a random guess. This result would obtain if there were little consistency among patients in the magnitude of change they considered important. An area under the ROC curve that was significantly greater than 0.5 would indicate that changes in a health status measure corresponded accurately with patients' judgments of important change, and that important changes could consistently be distinguished from unimportant changes. Criteria for important changes that were based on the judgments of a group of patients could then be determined.
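As an illustration of the construction described above, the following minimal Python sketch computes sensitivity and false-positive rate at each candidate cutpoint and then the area under the resulting ROC curve. It is not the authors' analysis code: the function names and the toy data are hypothetical, and the rank-based (Mann-Whitney) area formula is one standard way to obtain the area, chosen here as an assumption.

```python
def split_by_judgment(changes, important):
    """Separate measured changes by the patient's judgment (True = important change)."""
    pos = [c for c, y in zip(changes, important) if y]
    neg = [c for c, y in zip(changes, important) if not y]
    return pos, neg


def roc_points(changes, important):
    """One (false-positive rate, sensitivity) pair per cutpoint, strictest cutpoint first."""
    pos, neg = split_by_judgment(changes, important)
    points = [(0.0, 0.0)]
    for cut in sorted(set(changes), reverse=True):
        # A change of at least `cut` is classified as "important".
        sensitivity = sum(c >= cut for c in pos) / len(pos)
        false_positive_rate = sum(c >= cut for c in neg) / len(neg)
        points.append((false_positive_rate, sensitivity))
    return points


def roc_area(changes, important):
    """ROC area as the probability that an 'important' interval shows a larger
    measured change than an 'unimportant' one (ties count one half)."""
    pos, neg = split_by_judgment(changes, important)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def trapezoid_area(points):
    """Area under the piecewise-linear curve through the ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data: measured change per interval and the paired judgment.
changes = [5, 8, 10, 12, 3, 0, 2, 1, 6, 4]
important = [True] * 5 + [False] * 5
print(roc_area(changes, important))
print(trapezoid_area(roc_points(changes, important)))
```

The two area computations agree on these toy values. When the measured changes of the two groups are distributed identically, the area is 0.5, which corresponds to the no-consistency case described above.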

In this study, we demonstrate the use of ROC curves to test whether patients' judgments of important changes are consistent. The sample included 23 patients with systemic lupus erythematosus (SLE) who were assessed repeatedly in a prospective study of changes in SLE activity over time. At each assessment, patients rated the activity of their SLE, and judged whether their SLE had improved, worsened, or was unchanged from the previous assessment. ROC curves were constructed from the association of measured changes in patients' global assessments of SLE activity and their judgments of whether any change had occurred. We examined ROC curves for improvements and worsenings separately, and also examined ROC curves for major improvement or worsening, relative changes, and changes in physician global assessments. We hypothesized that patients' judgments would be consistent (i.e., have ROC areas significantly greater than 0.5), and that their judgments for major improvement or worsening would be more consistent (i.e., have larger ROC areas) than their judgments for improvement or worsening of any magnitude. We also hypothesized that, because patients would likely be very concerned about and sensitive to any decrement in health, the worsening in health recognized as important would be smaller than the improvement in health necessary to be recognized as important. Therefore, we anticipated that the criterion standards for improvement would be larger than the criterion standards for worsening.
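A criterion standard of this kind can be read off the ROC data as the cutpoint that best separates changed from unchanged intervals. The sketch below picks the cutpoint that maximizes the sum of sensitivity and specificity. That selection rule is an assumption made for illustration, since the text available here does not state the exact rule used, and the function name and toy values are hypothetical rather than study data.

```python
def best_cutpoint(changes, changed):
    """Return (cutpoint, sensitivity, specificity) maximizing sensitivity + specificity.

    changes: magnitude of measured change per interval, in the direction of interest
    changed: parallel booleans, True when the patient reported a change
    """
    pos = [c for c, y in zip(changes, changed) if y]
    neg = [c for c, y in zip(changes, changed) if not y]
    best = None
    for cut in sorted(set(changes)):
        # Intervals with a measured change of at least `cut` are called "changed".
        sensitivity = sum(c >= cut for c in pos) / len(pos)
        specificity = sum(c < cut for c in neg) / len(neg)
        total = sensitivity + specificity
        if best is None or total > best[0]:
            best = (total, cut, sensitivity, specificity)
    return best[1:]


# Hypothetical toy data: point decreases on a 0-100 scale for intervals the
# patient rated as improved (first five) and as unchanged (last five).
decreases = [5, 8, 10, 12, 6, 0, 2, 1, 7, 4]
improved = [True] * 5 + [False] * 5
cut, sens, spec = best_cutpoint(decreases, improved)
print(cut, sens, spec)
```

Running the same function separately on worsening intervals (using increases instead of decreases) would give the second cutpoint, so the two criteria can be compared as the hypothesis above requires.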

Patients and study protocol

Twenty-three patients were enrolled in a time-intensive prospective observational study of changes in SLE activity over time. Participants were recruited from the Stanford University Medical Center or VA Palo Alto rheumatology clinics, or from the practices of local rheumatologists. All participants met the revised criteria for the classification of SLE [10], were age 18 years or older, spoke and read English, and had clinical evidence of some degree of SLE activity as determined by their

Results

Nineteen women and four men participated in the study. Their mean (± SD) age was 44.8 ± 14.4 years, and the mean duration of their SLE was 3.5 ± 4.8 years. The mean patient global assessment (40 ± 21) and mean physician global assessment (28 ± 18) at study entry indicated mild to moderate SLE activity. Nineteen participants completed 20 assessments each, and one participant each completed 17, 8, 7, and 3 assessments (total 415 assessments). Information on patient global assessments and

Discussion

While criteria for important change may be readily defined for individual patients, determining criteria that are applicable to groups of patients is much more complicated. Heterogeneity among patients in their judgments of the degree of change in health that represents an important change is an important obstacle to defining group criteria for clinically important changes [11]. Our study demonstrates that ROC curves can be used to determine the consistency with which patients rate the

Acknowledgements

Supported in part by NIH grant AR20610 to the Stanford University Multipurpose Arthritis Center.
