National inter-rater agreement of standardised simulated-patient-based assessments

Med Teach. 2021 Mar;43(3):341-346. doi: 10.1080/0142159X.2020.1845909. Epub 2020 Nov 16.

Abstract

Purpose: The forthcoming UK Medical Licensing Assessment will require all medical schools in the UK to ensure that their students pass an appropriately designed Clinical and Professional Skills Assessment (CPSA) prior to graduation and registration with a licence to practise medicine. The requirements for the CPSA will be set by the General Medical Council, but individual medical schools will be responsible for implementing their own assessments. It is therefore important that assessors from different medical schools across the UK agree on what standard of performance constitutes a fail, pass or good grade.

Methods: We used an experimental, video-based, single-blinded, randomised, internet-based design. We created videos of simulated student performances of a clinical examination at four scripted standards: clear fail (CF), borderline (BD), clear pass (CP) and good (GD). Assessors from ten regions across the UK were randomly assigned to watch five videos in one of 12 different combinations and were asked to give competence domain scores and an overall global grade for each simulated candidate. Inter-rater agreement for the total domain scores was measured using the intraclass correlation coefficient (ICC) based on a two-way random-effects model for absolute agreement.
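For reference, under a fully crossed two-way random-effects design the single-measure absolute-agreement ICC (Shrout and Fleiss ICC(2,1)) is conventionally estimated from the ANOVA mean squares as sketched below. This is the standard textbook formulation rather than the authors' stated computation; the abstract does not report whether single- or average-measure estimates were used, nor how the incomplete video-to-assessor assignment was handled.

\[
\mathrm{ICC}(2,1) \;=\; \frac{MS_R - MS_E}{MS_R + (k-1)\,MS_E + \tfrac{k}{n}\,(MS_C - MS_E)}
\]

where \(MS_R\) is the between-targets (videos) mean square, \(MS_C\) the between-raters mean square, \(MS_E\) the residual mean square, \(k\) the number of raters and \(n\) the number of targets.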

Results: A total of 120 assessors enrolled in the study, of whom 98 were eligible for analysis. The ICC was 0.93 (95% CI 0.81-0.99). The mean percentage agreement with the scripted global grade was 74.4% (range 40.8-96.9%).

Conclusions: Inter-rater agreement amongst assessors across the UK when rating simulated candidates performing at scripted levels is excellent. Agreement on the overall global performance level for simulated candidates is also high. These findings suggest that assessors from across the UK, viewing the same simulated performances, show high levels of agreement on the standards expected of students at the 'clear fail', 'borderline', 'clear pass' and 'good' levels.

Keywords: Undergraduate; assessment; inter-rater reliability.

Publication types

  • Randomized Controlled Trial

MeSH terms

  • Clinical Competence*
  • Educational Measurement*
  • Humans
  • Observer Variation
  • Reproducibility of Results
  • Schools, Medical
  • Students