National inter-rater agreement of standardised simulated-patient-based assessments

Med Teach. 2021 Mar;43(3):341-346. doi: 10.1080/0142159X.2020.1845909. Epub 2020 Nov 16.

Abstract

Purpose: The forthcoming UK Medical Licensing Assessment will require all medical schools in the UK to ensure that their students pass an appropriately designed Clinical and Professional Skills Assessment (CPSA) prior to graduation and registration with a licence to practise medicine. The requirements for the CPSA will be set by the General Medical Council, but individual medical schools will be responsible for implementing their own assessments. It is therefore important that assessors from different medical schools across the UK agree on what standard of performance constitutes a fail, pass or good grade.

Methods: We used an experimental, video-based, single-blinded, randomised, internet-based design. We created videos of simulated student performances of a clinical examination at four scripted standards: clear fail (CF), borderline (BD), clear pass (CP) and good (GD). Assessors from ten regions across the UK were randomly assigned to watch five videos in one of 12 different combinations and were asked to give competence domain scores and an overall global grade for each simulated candidate. Inter-rater agreement for the total domain scores was measured using the intraclass correlation coefficient (ICC) based on a two-way random-effects model for absolute agreement.
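For reference, under a fully crossed two-way random-effects design the single-measure absolute-agreement ICC (Shrout and Fleiss ICC(2,1)) is conventionally estimated from the ANOVA mean squares as sketched below. This is the standard textbook formulation rather than the authors' stated computation; the abstract does not report whether single- or average-measure estimates were used, nor how the incomplete video-to-assessor assignment was handled.

\[
\mathrm{ICC}(2,1) \;=\; \frac{MS_R - MS_E}{MS_R + (k-1)\,MS_E + \tfrac{k}{n}\,(MS_C - MS_E)}
\]

where \(MS_R\) is the between-targets (videos) mean square, \(MS_C\) the between-raters mean square, \(MS_E\) the residual mean square, \(k\) the number of raters and \(n\) the number of targets.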

Results: A total of 120 assessors enrolled in the study, of whom 98 were eligible for analysis. The ICC was 0.93 (95% CI 0.81-0.99). The mean percentage agreement with the scripted global grade was 74.4% (range 40.8-96.9%).

Conclusions: Inter-rater agreement amongst assessors across the UK when rating simulated candidates performing at scripted levels is excellent. Agreement on the overall global performance level for simulated candidates is also high. These findings suggest that assessors from across the UK, viewing the same simulated performances, show high levels of agreement on the standards expected of students at the 'clear fail', 'borderline', 'clear pass' and 'good' levels.

Keywords: Undergraduate; assessment; inter-rater reliability.

Publication types

  • Randomized Controlled Trial

MeSH terms

  • Clinical Competence*
  • Educational Measurement*
  • Humans
  • Observer Variation
  • Reproducibility of Results
  • Schools, Medical
  • Students