Quality Assurance Methods for Performance-Based Assessments

Boulet, John R.; McKinley, Danette W.; Whelan, Gerald P.; Hambleton, Ronald K.

doi:10.1023/A:1022639521218


Published: March 2003

Advances in Health Sciences Education, Volume 8, pages 27–47 (2003)



Abstract

Performance assessments are subject to many potential error sources. For performance-based assessments, including standardized patient (SP) examinations, these error sources, if left unchecked, can compromise the validity and reliability of scores. Quality assurance (QA) measures, both quantitative and qualitative, can be used to ensure that candidate scores are accurate and reasonably free from measurement error. The purpose of this paper is to outline several QA strategies that can be used to identify potential content- and score-related problems with SP assessments. These approaches include case analyses and various comparisons of primary and observer scores. Specific examples from the ECFMG Clinical Skills Assessment (CSA®) are used to educate the reader concerning appropriate statistical methods and legitimate data interpretations. The results presented in this investigation highlight the need for well-defined training regimes, regular feedback to those involved in rating/scoring performances, and detailed statistical analyses of all scores.



Author information

Authors and Affiliations

Research and Evaluation, Educational Commission for Foreign Medical Graduates (ECFMG), 3624 Market Street, Philadelphia, PA, 19104-2685, USA
John R. Boulet, Danette W. McKinley & Gerald P. Whelan
University of Massachusetts, MA, USA
Ronald K. Hambleton


Corresponding author

Correspondence to John R. Boulet.


About this article

Cite this article

Boulet, J.R., McKinley, D.W., Whelan, G.P. et al. Quality Assurance Methods for Performance-Based Assessments. Adv Health Sci Educ Theory Pract 8, 27–47 (2003). https://doi.org/10.1023/A:1022639521218


Issue Date: March 2003
DOI: https://doi.org/10.1023/A:1022639521218
