Abstract
To the extent that outcomes of health assessment instruments are to be used interchangeably, the summary scores based on these outcomes need to be equated or otherwise made comparable. If the summary scores of different instruments are not equated, inferences based on them could be flawed. Ideally, summary scores would be comparable because of careful instrument design; in practice, that rarely happens, and statistical intervention is usually needed. This article addresses key questions associated with linking the summary scores of health outcomes. What is meant by outcome linking and equating? How does equating differ from other types of linking? What common data collection designs are used to capture data for outcome linking? What are some of the standard statistical procedures used to link outcomes directly, and what assumptions do they make? What role does item response theory (IRT) play in linking outcomes, and what assumptions do IRT methods make? The article distinguishes between direct statistical adjustments of summary score distributions and indirect procedures based on psychometric models of items or questions.
Acknowledgements
Neil J. Dorans is a Distinguished Presidential Appointee in the Center for Statistical Theory and Practice of the Research and Development Division at Educational Testing Service. The opinions expressed in this paper are his alone and do not represent the opinions of Educational Testing Service. The comments of colleagues at ETS, Dr. C-H Chang of Northwestern University, Dr. Ron D. Hays of UCLA and Dr. Bryce Reeve of the National Institutes of Health were helpful in preparing this paper, as were those of three reviewers.
Cite this article
Dorans, N.J. Linking scores from multiple health outcome instruments. Qual Life Res 16 (Suppl 1), 85–94 (2007). https://doi.org/10.1007/s11136-006-9155-3
DOI: https://doi.org/10.1007/s11136-006-9155-3