Elsevier

Social Science & Medicine

Volume 58, Issue 4, February 2004, Pages 799-809
Comparing directly measured standard gamble scores to HUI2 and HUI3 utility scores: group- and individual-level comparisons

https://doi.org/10.1016/S0277-9536(03)00254-5

Abstract

Directly measured standard gamble (SG) utility scores reflect the respondent's assessment and valuation of their own health status. Scores from the health utilities index (HUI) are based on self-assessed health status but valued using community preferences obtained using the SG. Our objectives were to determine whether mean directly measured utility scores agree with mean HUI mark 2 (HUI2) and mean HUI mark 3 (HUI3) scores, whether individual directly measured utility scores agree with HUI2 and HUI3 scores, and whether HUI2 and HUI3 scores agree with each other. Questionnaires based on the HUI2 and HUI3 health-status classification systems were administered by interviewers to 140 teenage survivors of extremely low birthweight (ELBW) and 124 control group teens. Respondents were asked to think about their own usual health states using six dimensions from HUI2 and to value that state using the SG. Mean SG scores were compared with mean HUI2 and mean HUI3 scores using paired sample t-tests. Mean HUI2 scores were compared with mean HUI3 scores. Agreement among scores was assessed using the intra-class correlation coefficient (ICC). The effect of severity of health-state morbidity on agreement was assessed using three approaches. ELBW cohort mean (standard deviation) SG, HUI2, and HUI3 scores were 0.90 (0.20), 0.89 (0.14), and 0.80 (0.22), respectively. Results for controls were 0.93 (0.11), 0.95 (0.09), and 0.89 (0.13). Mean SG and HUI2 scores did not differ; mean SG and HUI3 scores did differ; mean HUI2 and HUI3 scores also differed. At the individual level for ELBW, the ICCs between SG and HUI2, SG and HUI3, and HUI2 and HUI3 scores were 0.13, 0.28, and 0.64, respectively. For controls the ICCs were 0.14, 0.24, and 0.56. HUI2 scores appear to match directly measured utility scores reasonably well at the group level. HUI2 and HUI3 scores differ systematically. At the individual level, however, HUI2 and HUI3 scores are poor substitutes for directly measured scores.

Introduction

It has become increasingly important to assess the health-related quality of life (HRQL) of health outcomes and the process by which those outcomes are achieved. HRQL information is crucial in the evaluation of healthcare interventions. Some forms of HRQL assessment support the economic evaluation of healthcare services using cost-utility and cost-effectiveness analyses. HRQL is also pivotal in describing the burden of illness for particular conditions and diseases, information that can help shape priorities both for interventions and for research.

Preference-based measures are one important family of HRQL measures. (Two other major classes of HRQL measures are generic profile measures (such as the Rand-36; Hays, 1998; Hays & Morales, 2001) and specific measures (such as the Pediatric Asthma Quality of Life Questionnaire; Juniper et al., 1996).) Preference-based measures provide scores that value health states. The conventional scale assigns dead a score of 0.00 and perfect health a score of 1.00, and thus permits the integration of mortality and morbidity.

There are two major categories of preference-based measures: direct and multi-attribute (or indirect). In the direct approach, respondents are asked to value health states. The health states evaluated in the direct approach may be hypothetical health states or the respondent's subjectively defined current health state (SDCS). In the latter case, the respondent reflects on their own state of health and then values it. In the multi-attribute approach, respondents complete a questionnaire based on a health-status classification system that is a component of a multi-attribute system. Prominent multi-attribute systems include the health utilities index (HUI; Feeny, Torrance, & Furlong, 1996; Furlong, Feeny, Torrance, & Barr, 2001; Torrance, Furlong, & Feeny, 2002), quality of well-being (QWB; Patrick, Bush, & Chen, 1973) scale, and EuroQol (EQ-5D; Essink-Bot, Stouthard, & Bonsel, 1993; Rabin & de Charro, 2001).

In the multi-attribute approach, the respondent provides descriptive information about their health status. The score for that health state is then derived using a scoring function for the system. Scoring functions are typically estimated using preference scores from community samples. In contrast, in the direct approach, the respondent's own values are reflected in the score. Thus, there are at least two potential sources of differences in scores derived from a multi-attribute system for a person's health state and the direct standard gamble (SG) score provided by that person at the same point in time. First, the respondent's conceptualization of health status may include dimensions not included in that particular multi-attribute system. Second, scores from the multi-attribute approach embody community preferences (typically mean scores) and ignore the heterogeneity in preferences among respondents that is reflected in directly measured scores.
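The distinction drawn above can be illustrated with a minimal sketch. It assumes a multiplicative scoring function, which is the general shape of multi-attribute utility functions such as those of the HUI, but every attribute name, level multiplier, and constant below is a hypothetical placeholder and not a published HUI2 or HUI3 coefficient. The point is only that two respondents classified into the same health state receive the same community-based score, whatever their own preferences might be.

```python
from math import prod

# HYPOTHETICAL multipliers per attribute level (level 1 = no impairment).
# Illustrative placeholders only; not published HUI2 or HUI3 coefficients.
MULTIPLIERS = {
    "mobility": {1: 1.00, 2: 0.95, 3: 0.85},
    "emotion": {1: 1.00, 2: 0.90, 3: 0.80},
    "pain": {1: 1.00, 2: 0.95, 3: 0.75},
}
SCALE = 1.05  # hypothetical rescaling constant
SHIFT = 0.05  # hypothetical offset; SCALE - SHIFT = 1, so full health scores 1.00


def multi_attribute_score(levels):
    """Score a classified health state with the community-based function.

    `levels` maps each attribute name to the level reported by the respondent.
    """
    product = prod(MULTIPLIERS[attr][lvl] for attr, lvl in levels.items())
    return SCALE * product - SHIFT


# Two respondents who report the same levels get the same score, even though
# their directly measured SG scores for that state could differ.
state = {"mobility": 2, "emotion": 1, "pain": 2}
respondent_a = multi_attribute_score(state)
respondent_b = multi_attribute_score(state)
```

A directly measured SG score, by contrast, has no such function behind it: it is elicited from each respondent and so can vary between people who describe their health identically.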

In practice, the direct approach typically involves the use of well-trained and supervised professional interviewers, in-person interviews, and imposes cognitive demands on respondents. Interviews are supported by detailed study-specific scripts and the use of props. In contrast, the multi-attribute approach involves the administration of questionnaires that take less time to complete than direct preference interviews and impose fewer demands on respondents.

How do scores from the direct and multi-attribute approaches compare? Do scores agree at the group level? Do scores agree at the individual level? Does the degree of agreement vary with the severity of the health state? Would the results of a cost-utility analysis be affected by the choice of direct versus multi-attribute scores? Do scores derived from one multi-attribute system agree with scores derived from another? Given that several prominent guidelines for economic evaluation recommend the use of community preferences (Canadian Coordinating Office for Health Technology Assessment (CCOHTA), 1997; Gold, Siegel, Russell, & Weinstein, 1996), it is important to compare the direct scores provided by patients with scores based on community preferences. This paper provides comparisons of directly measured SG scores and scores derived from the HUI mark 2 (HUI2) and HUI mark 3 (HUI3) systems. SG and HUI2 scores are compared at both the group and the individual level; similar comparisons are made between SG and HUI3 scores. HUI2 and HUI3 are widely used in clinical and population health studies and in economic evaluations of healthcare services. It is therefore important to compare scores from these two systems to assist users in the interpretation of results based on either system. Do results based on HUI2 differ from those based on HUI3?
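The group-level and individual-level questions posed above correspond to two different statistics. The sketch below is illustrative only: it computes a paired-sample t statistic for the difference in means and an intra-class correlation for per-respondent agreement. The ICC form used here, ICC(2,1) (two-way random effects, absolute agreement, single measure), is an assumption, since the text available here does not state which form the authors used. The function names and the small data set are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired-sample t statistic and degrees of freedom for two score lists."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return mean(d) / (stdev(d) / sqrt(n)), n - 1


def icc_agreement(x, y):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ASSUMPTION: this ICC form is chosen for illustration only.
    """
    n, k = len(x), 2  # n respondents, k = 2 scoring methods
    rows = list(zip(x, y))
    grand = mean(list(x) + list(y))
    row_means = [mean(r) for r in rows]
    col_means = [mean(x), mean(y)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-respondent mean square
    msc = ss_cols / (k - 1)              # between-method mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# HYPOTHETICAL scores for six respondents (not data from the study).
sg_scores = [0.95, 0.80, 1.00, 0.70, 0.90, 0.85]
hui_scores = [0.85, 0.90, 0.92, 0.88, 0.80, 0.95]

t_stat, df = paired_t(sg_scores, hui_scores)
icc = icc_agreement(sg_scores, hui_scores)
```

The two statistics answer different questions: the t statistic can be small because the group means are close even when the ICC is low, that is, when individual scores from the two methods disagree.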

Respondents

Two cohorts of respondents were interviewed. Both have been described previously (Saigal et al., 1994a, 1994b) and are described only briefly here. The first was a cohort of survivors of extremely low birthweight (ELBW; <1000 g), born to residents of a geographically defined region (Central West) of Ontario between 1977 and 1982. The second group, the control subjects, were born at term, recruited at age 8, and matched to the ELBW cohort

Respondents

ELBW: 150 of the 169 ELBW survivors agreed to participate in the study (Saigal et al., 1996). Nine of these were severely impaired and did not participate in the interviews. In addition to these nine, data were incomplete for an autistic child who participated in the interviews. Data available for the analyses reported in this paper are therefore for 140 ELBW survivors, 83% of the survivors in the cohort. Given that data are missing for 10 individuals with severe or substantial impairments, the

Conclusions

At the group level, mean HUI2 and SG scores agree; the two are virtually interchangeable. At the individual level, however, there is little agreement.

At the group level, mean HUI3 scores are lower than mean SG scores. As with HUI2, agreement at the individual level is poor.

HUI3 scores are systematically lower than HUI2 scores. There are several reasons for this. First, because states worse than dead were handled differently in HUI3 preference measurements used for estimating

Acknowledgements

Financial support: The original study was supported by the Ontario Ministry of Health Grant 04447. The analyses reported in the paper were supported by grants from the Alberta Heritage Foundation for Medical Research (#199909) and the Institute of Health Economics. The funding agencies played no role in the design, interpretation, or analysis of the project and have not reviewed or approved of this manuscript.

The authors acknowledge the input of Elizabeth Burrows, Barbara Stoskopf, Lorraine

References (40)

  • Feeny, D., & Torrance, G. W. (1989). Incorporating utility-based quality-of-life assessments in clinical trials: Two...
  • Feeny, D. H., Torrance, G. W., & Furlong, W. J. (1996). Health utilities index. In: B. Spilker (Ed.), Quality of life...
  • Feeny, D., et al. (1992). A comprehensive multiattribute system for classifying the health status of survivors of childhood cancer. Journal of Clinical Oncology.
  • Feeny, D., et al. (2002). Multi-attribute and single-attribute utility functions for the health utilities index mark 3 system. Medical Care.
  • Furlong, W. J., et al. (2001). The health utilities index (HUI) system for assessing health-related quality of life in clinical studies. Annals of Medicine.
  • Furlong, W., Feeny, D., Torrance, G. W., Barr, R., & Horsman, J. (1990). Guide to design and development of...
  • Furlong, W., Feeny, D., Torrance, G. W., Goldsmith, C., DePauw, S., Zhu, Z., Denton, M., & Boyle, M. (1998)....
  • Gabriel, S. E., et al. (1999). Health-related quality of life in economic evaluations for osteoporosis: Whose values should we use? Medical Decision Making.
  • Gold, M. R., Siegel, J. E., Russell, L. B., & Weinstein, M. C. (Eds.). (1996). Cost-effectiveness in health and...
  • Grootendorst, P., et al. (2000). Health utilities index mark 3: Evidence of construct validity for stroke and arthritis in a population health survey. Medical Care.