Quality Assurance Methods for Performance-Based Assessments

Advances in Health Sciences Education

Abstract

Performance assessments are subject to many potential error sources. For performance-based assessments, including standardized patient (SP) examinations, these error sources, if left unchecked, can compromise the validity and reliability of scores. Quality assurance (QA) measures, both quantitative and qualitative, can be used to ensure that candidate scores are accurate and reasonably free from measurement error. The purpose of this paper is to outline several QA strategies that can be used to identify potential content- and score-related problems with SP assessments. These approaches include case analyses and various comparisons of primary and observer scores. Specific examples from the ECFMG Clinical Skills Assessment (CSA®) are used to educate the reader concerning appropriate statistical methods and legitimate data interpretations. The results presented in this investigation highlight the need for well-defined training regimes, regular feedback to those involved in rating/scoring performances, and detailed statistical analyses of all scores.
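As an illustration of what a comparison of primary and observer scores might look like, the following is a minimal sketch, assuming dichotomous (done / not done) checklist items recorded independently by the SP and by a second observer. It computes raw percent agreement and Cohen's kappa, two common agreement statistics; the abstract does not state which statistics the paper itself uses, so this should not be read as the authors' method. The function names and the toy data are hypothetical.

```python
from collections import Counter


def percent_agreement(primary, observer):
    """Fraction of checklist items on which the two raters recorded the same value."""
    if len(primary) != len(observer) or not primary:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(primary, observer))
    return matches / len(primary)


def cohens_kappa(primary, observer):
    """Chance-corrected agreement between two raters on categorical scores."""
    n = len(primary)
    p_observed = percent_agreement(primary, observer)

    # Expected agreement if the raters were independent, from each rater's marginals.
    primary_counts = Counter(primary)
    observer_counts = Counter(observer)
    categories = set(primary) | set(observer)
    p_expected = sum(primary_counts[c] * observer_counts[c] for c in categories) / (n * n)

    if p_expected == 1.0:
        # Both raters used one identical category throughout; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 1 = item credited, 0 = item not credited.
primary_scores = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # SP (primary rater)
observer_scores = [1, 1, 0, 0, 0, 1, 1, 1, 1, 1]  # second observer

agreement = percent_agreement(primary_scores, observer_scores)
kappa = cohens_kappa(primary_scores, observer_scores)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Percent agreement is easy to interpret but ignores agreement expected by chance, which is why a chance-corrected index such as kappa is often reported alongside it.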

Author information

Corresponding author

Correspondence to John R. Boulet.

About this article

Cite this article

Boulet, J.R., McKinley, D.W., Whelan, G.P. et al. Quality Assurance Methods for Performance-Based Assessments. Adv Health Sci Educ Theory Pract 8, 27–47 (2003). https://doi.org/10.1023/A:1022639521218
