Managing extremes of assessor judgment within the OSCE

Med Teach. 2017 Jan;39(1):58-66. doi: 10.1080/0142159X.2016.1230189. Epub 2016 Sep 27.

Abstract

Context: There is a growing body of research investigating assessor judgments in complex performance environments such as OSCE examinations. Post hoc analysis can be employed to identify some elements of "unwanted" assessor variance. However, the impact of individual, apparently "extreme" assessors on OSCE quality, assessment outcomes and pass/fail decisions has not been previously explored. This paper uses a range of "case studies" as examples to illustrate the impact that "extreme" examiners can have in OSCEs, and offers pragmatic suggestions for successfully alleviating such problems.

Method and results: We used real OSCE assessment data from a number of examinations in which, at station level, a single examiner assesses student performance using a global grade and a key-features checklist. Three exemplar case studies in which initial post hoc analysis indicated problematic individual assessor behavior are considered and discussed in detail, highlighting the impact of both individual examiner behavior and station design on subsequent judgments.
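The kind of post hoc screening described above can be illustrated with a minimal sketch. This is not the authors' actual analysis; it assumes a simple flagging rule (examiner mean score more than a chosen number of standard deviations from the mean of all examiner means), and the data, examiner labels, and threshold are illustrative only.

```python
# Hypothetical post hoc check for "extreme" assessors: flag examiners
# whose mean awarded score lies far from the cohort of examiner means.
# The threshold of 1.5 SD is an arbitrary assumption for illustration.
from statistics import mean, pstdev

def flag_extreme_examiners(scores_by_examiner, z_threshold=1.5):
    """Return examiners whose mean awarded score deviates more than
    z_threshold standard deviations from the mean of examiner means."""
    examiner_means = {ex: mean(s) for ex, s in scores_by_examiner.items()}
    overall = mean(examiner_means.values())
    spread = pstdev(examiner_means.values())
    if spread == 0:
        return []
    return [ex for ex, m in examiner_means.items()
            if abs(m - overall) / spread > z_threshold]

# Illustrative use: three examiners awarding similar grades and one
# apparent outlier ("hawk") awarding much lower ones.
scores = {"A": [7, 8, 7], "B": [7, 7, 8], "C": [8, 7, 7], "D": [2, 3, 2]}
print(flag_extreme_examiners(scores))
```

As the paper's conclusion cautions, a flag from a rule like this is only a prompt for closer scrutiny (of the examiner and of station design), not proof of aberrant behavior.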

Conclusions: In complex assessment environments, institutions have a duty to maximize the defensibility, quality and validity of the assessment process. A key element of this involves critical analysis, through a range of approaches, of assessor judgments. However, care must be taken before assuming that apparently aberrant examiner behavior is just that.

MeSH terms

  • Checklist
  • Clinical Competence
  • Education, Medical / methods*
  • Education, Medical / standards*
  • Educational Measurement / methods*
  • Educational Measurement / standards*
  • Humans
  • Judgment
  • Observer Variation*
  • Psychometrics
  • Reproducibility of Results