Original Article
The Swedish SF-36 Health Survey III. Evaluation of Criterion-Based Validity: Results from Normative Population

https://doi.org/10.1016/S0895-4356(98)00102-4

Abstract

Assumptions about the variation in SF-36 scale scores were tested in relation to external criteria in 8930 respondents comprising the Swedish norming population. Physical health scales were strongly associated with age, while small differences were found for the Mental Health scale across age groups. Females reported poorer health than males, particularly between ages 30 and 40 and over age 70. Worse health profiles were associated with social risk factors (unemployment, divorce, etc.). The disability pension rate was strongly related to reduced Physical Functioning and increased Bodily Pain. The use of medical care was reflected in general health scores (i.e., the lower the scale score, the higher the care consumption). Self-reported physical and psychological symptoms were selectively related to SF-36 scales. All SF-36 scales, except Mental Health, were more strongly related to ratings of health satisfaction than to global quality of life. Combinations of the SF-36 well-being scales explained a substantial part of the variance of these ratings. In conclusion, the performed criterion-validity tests support the cross-cultural stability of the SF-36.

Introduction

Multinational applications of health-related quality of life questionnaires have increased rapidly during the past two decades 1, 2, 3, 4, 5. This rapid dissemination of surveys across countries is not, however, matched by a well-established set of standards for evaluating the quality of adaptations and their equivalence to the original forms 4, 6. Beyond the translation process and evaluation of data quality, scaling assumptions, and reliability, cross-cultural validation requires validity tests across patient and population groups to create norms and interpretation guidelines in each country. The International Quality of Life Assessment (IQOLA) Project represents a significant step toward this end 7, 8, 9. Previous documentation of the original SF-36 10, 11, 12, 13 and successful applications in the UK 14, 15, 16 set the stage for non-English language evaluations. As part of this work, the translation procedure and psychometric properties of the Swedish SF-36 were evaluated 17, 18.

Psychometric validation of the Swedish SF-36 has shown that results for data completeness, scaling assumptions, reliability, and construct validity compare well with the results of the U.S. evaluation 17, 18. In line with U.S. results, data quality was less favorable for the elderly, the less educated, and the unemployed. Most subgroups completed all items comprising the eight SF-36 scales. Age was the main predictor of data completeness in the Swedish sample (cf. age, education, and poverty in the U.S. study [11]). Item-internal consistency was high across all subgroups and scales. No item-scale correlation fell below the desired level of 0.40. Scaling success rates, which directly reflect the construct validity of the SF-36 in Sweden, were consistently high across subgroups and scales, except in the oldest age groups. Internal consistency reliabilities (Cronbach’s alpha) met the desired levels for group comparisons (>0.70) across all subgroups and scales, except Social Functioning and Role-Emotional in the young sample. Further evidence of construct validity of the Swedish SF-36 has been documented. The hypothesized interrelations of SF-36 scale scores and the two-dimensional structure of health (physical and mental) were successfully replicated in Swedish populations 18, 19. Normative data from Sweden, broken down by sociodemographic variables, were consistent with those of the United States and the UK 12, 15. Swedish normative health profiles followed the shape of those of the United States, the UK, and several other IQOLA countries 9, 17, 20. Also, the discriminatory capacity of the instrument was documented (i.e., differing health profiles were obtained for groups assumed to differ in health status 18, 19). Clinical validation of the Swedish SF-36 was conducted in middle- and old-age cohorts [19], replicating techniques used in the United States [13].
Health profiles of mixed and separate groups of clinical cases could be interpreted in accordance with clinical expectations and previous assumptions. Thus, the expected sensitivity of the Swedish SF-36 to clinical manifestations was supported. A manual that includes Swedish norms and interpretation guidelines is available [18] and the Swedish medical community was informed [21].
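The two psychometric criteria cited above, an item-scale correlation of at least 0.40 and a Cronbach's alpha above 0.70, can be illustrated with a minimal sketch. The item responses below are hypothetical toy values, not data from the Swedish norming population, and the function names are illustrative. The sketch assumes that item-scale correlations are corrected for overlap, i.e., each item is correlated with the sum of the remaining items in its scale.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: list of k lists, each holding one item's scores
    for the same n respondents (same respondent order).
    """
    k = len(items)
    # Scale total per respondent (sum across the k items).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / (ss_x * ss_y) ** 0.5


def corrected_item_scale_correlations(items):
    """Correlate each item with the sum of the other items in its scale."""
    totals = [sum(scores) for scores in zip(*items)]
    correlations = []
    for column in items:
        rest = [total - score for total, score in zip(totals, column)]
        correlations.append(pearson(column, rest))
    return correlations


# Hypothetical responses: 3 items, 6 respondents (illustration only).
toy_items = [
    [1, 2, 3, 4, 5, 3],
    [2, 2, 3, 5, 5, 3],
    [1, 3, 3, 4, 4, 2],
]

alpha = cronbach_alpha(toy_items)
correlations = corrected_item_scale_correlations(toy_items)

print(f"alpha = {alpha:.2f}, meets >0.70: {alpha > 0.70}")
print("item-scale correlations >= 0.40:", [r >= 0.40 for r in correlations])
```

Sample variances (n - 1 denominator) are used for both the items and the scale total; any consistent choice of denominator yields the same alpha.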

Instrument validation implies accumulation of many different types of evidence that indicate to what degree an instrument measures what it is intended to measure. The validity of a self-report measure may also be judged by looking at its association with conceptually related variables. Using the Swedish normative database (n = 8930), this article addresses certain assumptions as to the criterion-related validity of the Swedish SF-36.

General Design and Normative Population Sample

For standardization of national norms, seven population studies were performed in different communities: urban-suburban, middle-size and small town, and rural areas. The design, procedure and basic sociodemographic statistics for the general Swedish population studies from 1991–92 have been presented elsewhere [17]. Response rates varied from around 60% (small town sample) to almost 75% (total county), with a mean rate of 68%. Representativeness was achieved across gender, age, socioeconomic

First Assumption: The Prevalence of Limitations in Physical Functioning Is Dependent on Age

The percentage of men and women, respectively, with the best possible score on the Physical Functioning (PF) scale is shown in Table 1 by age group. The expected decline with age was confirmed. Below retirement age (<65 years), around 10% more men than women reported no limitations, whereas at older ages the proportion with no limitations was similar for both genders.
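The quantity tabulated for this assumption, the share of respondents with the best possible PF score in each age-by-gender cell, can be sketched as follows. The records, the age-group labels, and the function name are hypothetical and serve only to illustrate the calculation; they are not values from Table 1. The sketch assumes the standard 0 to 100 scoring of SF-36 scales, on which 100 is the best possible score.

```python
from collections import defaultdict

PF_BEST_SCORE = 100  # assumed 0-100 scoring; 100 = no reported limitations


def percent_at_best_score(records):
    """Percentage of respondents scoring PF_BEST_SCORE per (age group, gender).

    records: iterable of (age_group, gender, pf_score) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # [n at best score, n total]
    for age_group, gender, pf_score in records:
        cell = counts[(age_group, gender)]
        cell[1] += 1
        if pf_score == PF_BEST_SCORE:
            cell[0] += 1
    return {key: 100.0 * best / total for key, (best, total) in counts.items()}


# Hypothetical records (illustration only, not survey data).
toy_records = [
    ("25-34", "men", 100), ("25-34", "men", 100),
    ("25-34", "men", 95), ("25-34", "men", 100),
    ("25-34", "women", 100), ("25-34", "women", 80),
    ("25-34", "women", 100), ("25-34", "women", 90),
    ("65-74", "men", 70), ("65-74", "men", 100),
    ("65-74", "men", 55), ("65-74", "men", 60),
    ("65-74", "women", 100), ("65-74", "women", 60),
    ("65-74", "women", 45), ("65-74", "women", 85),
]

shares = percent_at_best_score(toy_records)
for (age_group, gender), share in sorted(shares.items()):
    print(f"{age_group} {gender}: {share:.0f}% with best possible PF score")
```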

Second Assumption: SF-36 Scales Reflecting Physical Health Are Age-Dependent; Mental Health Scales Are Not

Table 2 provides basic information on the eight SF-36 scales by age group and gender. In support of the assumption, physical health declined at a faster rate with

Discussion

During recent decades, developments in the psychometric theory of multi-item scale and component analyses have enriched the testing procedures used to establish reliability and validity of questionnaire data [25]. Within the IQOLA Project, the methods and tests of conceptual equivalence and cross-cultural relevance of the SF-36 Health Survey have been successful [26]. This work also stressed the need for normative studies to confirm feasibility and allow interpretations across different

Acknowledgements

This work was supported by grants from The Swedish Council for Social Research (90-0111:2), The Medical Faculty, Göteborg University, Local Community Medicine Authorities, The Swedish Association of Local Authorities, Glaxo Sweden, and by grants to the International Quality of Life Assessment (IQOLA) Project from Glaxo Wellcome, Inc. and Schering-Plough Corporation.

The authors gratefully acknowledge: Björn Wettergren, MD, Community Medicine Unit, Bohuslandstinget; Ingemar Norling, PhD,

References (30)

  • N.K. Aaronson et al. International quality of life assessment (IQOLA) project. Quality Life Res (1992)
  • J.E. Ware et al. The SF-36 Health Survey: Development and use in mental health research and the IQOLA Project. Int J Ment Health (1993)
  • J.E. Ware et al. The SF-36 Health Status Survey: I. Conceptual framework and item selection. Med Care (1992)
  • A. McHorney et al. The MOS 36-Item Short-Form Health Survey (SF-36): III. Tests of data quality, scaling assumptions and reliability across diverse patient groups. Med Care (1994)
  • J.E. Ware et al. SF-36 Health Survey Manual and Interpretation Guide (1993)