Shining a spotlight on scoring in the OSCE: Checklists and item weighting

Med Teach. 2020 Sep;42(9):1037-1042. doi: 10.1080/0142159X.2020.1781072. Epub 2020 Jul 1.

Abstract

Introduction: There has been a long-running debate about the validity of item-based checklist scoring of performance assessments such as OSCEs. In recent years, the conception of a checklist has developed from its dichotomous inception into a more 'key-features' and/or chunked approach, where 'items' have the potential to be weighted differently, but the literature does not always reflect these broader conceptions.

Methods: We consider theoretical, design and (clinically trained) assessor issues related to differential item weighting in checklist scoring of OSCE stations. Using empirical evidence, this work also compares candidate decisions and the psychometric quality of different item-weighting approaches (i.e. a simple 'unweighted' scheme versus a differentially weighted one).

Results: The choice of weighting scheme affects approximately 30% of the key borderline group of candidates, and 3% of candidates overall. We also find that measures of overall assessment quality are slightly better under the differentially weighted scoring system.

Discussion and conclusion: Differentially weighted modern checklists can contribute to valid assessment outcomes and bring a range of additional benefits to the assessment. The weighting of particular items should be treated as a key design consideration during station development and must align with clinical assessors' expectations of the relative importance of sub-tasks.
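The unweighted-versus-weighted comparison described in the Methods can be illustrated with a minimal sketch. The item scores, weights, and pass mark below are hypothetical and chosen only to show how a borderline candidate's pass/fail decision can flip between schemes; they are not the paper's data or scoring algorithm.

```python
def unweighted_score(items):
    """Simple scheme: every checklist item counts equally."""
    return sum(items) / len(items)

def weighted_score(items, weights):
    """Differential scheme: clinically critical items carry larger weights."""
    return sum(w * s for w, s in zip(weights, items)) / sum(weights)

# Hypothetical candidate: misses one high-importance item (index 0)
# but completes the remaining low-importance ones.
items = [0, 1, 1, 1, 1]      # 0 = not done, 1 = done
weights = [4, 1, 1, 1, 1]    # item 0 judged clinically critical

pass_mark = 0.7              # illustrative station pass mark
u = unweighted_score(items)          # 4/5 = 0.80 -> pass
w = weighted_score(items, weights)   # 4/8 = 0.50 -> fail
print(f"unweighted={u:.2f} ({'pass' if u >= pass_mark else 'fail'})")
print(f"weighted={w:.2f} ({'pass' if w >= pass_mark else 'fail'})")
```

The same checklist performance passes under equal weighting but fails once the missed key-feature item dominates the total, which is the kind of decision change the Results report for the borderline group.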

Keywords: OSCE scoring; assessment quality; checklist design; item weighting.

MeSH terms

  • Checklist*
  • Clinical Competence
  • Educational Measurement*
  • Humans
  • Judgment
  • Psychometrics
  • Reproducibility of Results