Setting defensible standards in small cohort OSCEs: Understanding better when borderline regression can 'work'

Med Teach. 2020 Mar;42(3):306-315. doi: 10.1080/0142159X.2019.1681388. Epub 2019 Oct 26.

Abstract

Introduction: Borderline regression (BRM) is considered problematic in small-cohort OSCEs (e.g. n < 50), with institutions often relying instead on item-centred standard-setting approaches, which can be resource intensive and lack defensibility in performance tests.

Methods: Through an analysis of post-hoc station- and test-level metrics, we investigate the application of BRM in three small-cohort OSCE contexts: the exam for international medical graduates wanting to practise in the UK, senior sequential undergraduate exams, and physician associate exams in a large UK medical school.

Results: We find that BRM provides robust metrics and concomitantly defensible cut scores in the majority of stations (5%, 14%, and 12% of stations problematic, respectively, across the three contexts). Where problems occur, this is generally because the relationship between global grades and checklist scores is too weak to give confidence in the standard BRM sets for those stations.

Conclusion: This work challenges previous assumptions about the application of BRM in small test cohorts. Where there is sufficient spread of ability, BRM will generally provide defensible standards, assuming careful design of station-level scoring instruments. Where BRM standard-setting problems do occur, extant station cut scores are the preferred substitute.
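For readers unfamiliar with the mechanics, the BRM cut score for a station is conventionally obtained by regressing candidates' checklist scores on examiners' global grades and reading the fitted line off at the "borderline" grade. The sketch below illustrates this; the grade scale, the choice of borderline grade, and the candidate data are illustrative assumptions, not values from the paper.

```python
def brm_cut_score(scores, grades, borderline_grade=2):
    """Borderline regression method (BRM) sketch: ordinary least-squares
    fit of station checklist score on global grade, evaluated at the
    borderline grade to yield the station cut score."""
    n = len(scores)
    mean_g = sum(grades) / n
    mean_s = sum(scores) / n
    # Slope and intercept of the least-squares line score = a + b * grade
    cov = sum((g - mean_g) * (s - mean_s) for g, s in zip(grades, scores))
    var = sum((g - mean_g) ** 2 for g in grades)
    slope = cov / var
    intercept = mean_s - slope * mean_g
    return intercept + slope * borderline_grade

# Hypothetical small-cohort station (n = 10); grades on a 1-5 scale
# where 1 = clear fail and 2 = borderline.
grades = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
scores = [8, 11, 12, 14, 15, 16, 17, 18, 20, 21]

cut = brm_cut_score(scores, grades, borderline_grade=2)  # ≈ 11.54
```

The paper's station-level diagnostics hinge on how strong the grade-score relationship is: when the regression slope is flat or the fit is poor (low R²), the predicted score at the borderline grade, and hence the cut score, is unreliable, which is the failure mode described above.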

MeSH terms

  • Clinical Competence
  • Cohort Studies
  • Education, Medical, Undergraduate*
  • Educational Measurement*
  • Humans
  • Reproducibility of Results
  • Schools, Medical