Correcting for Rater Effects in Operating Room Surgical Skills Assessment

Laryngoscope. 2024 Mar 12. doi: 10.1002/lary.31391. Online ahead of print.

Abstract

Objective: To estimate and adjust for rater effects in operating room surgical skills assessment performed using a structured rating scale for nasal septoplasty.

Methods: We analyzed survey responses from attending surgeons (raters) who supervised residents and fellows (trainees) performing nasal septoplasty in a prospective cohort study. We first fit a structural equation model in which the rubric item scores were regressed on a latent component of skill. We then fit a second model that included the rating surgeon as a random effect, yielding a rater-effects-adjusted latent surgical skill. We validated this model against conventional measures: the level of expertise and postgraduate year (PGY) commensurate with the trainee's performance, the actual PGY of the trainee, and whether the surgical goals were achieved.
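
One way to write down the two models described in the Methods is sketched below. This is an assumed formalization for illustration, not the authors' exact specification: the symbols, the treatment of item scores as continuous, and the placement of the rater effect on the item scores are all assumptions. Here $y_{ak}$ is the score on rubric item $k$ in assessment $a$, $\eta_a$ is the latent skill shown in that assessment, and $j(a)$ is the rater who completed it.

```latex
% Model 1: one latent skill factor, no rater term (assumed form)
y_{ak} = \nu_k + \lambda_k \, \eta_a + \varepsilon_{ak},
\qquad \varepsilon_{ak} \sim \mathcal{N}(0, \sigma_k^2)

% Model 2: same factor structure plus a rater random effect (assumed form)
y_{ak} = \nu_k + \lambda_k \, \eta_a + u_{j(a)} + \varepsilon_{ak},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Under this reading, $u_j$ captures how severe or lenient rater $j$ is, and the estimate of $\eta_a$ from the second model plays the role of the rater-effects-adjusted latent skill.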

Results: Our dataset included 188 assessments of 41 trainees by 7 raters. The model with one latent construct for surgical skill and the rater as a random effect was the best among the models compared. Rubric scores depended on how severe or lenient the rater was, sometimes almost as much as they depended on trainee skill. Rater-adjusted latent skill scores increased with the attending-estimated level of expertise and PGY of trainees, increased with the trainees' actual PGY, and appeared constant across levels of achievement of surgical goals.
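
The dependence of rubric scores on rater severity can be illustrated with a toy example. The sketch below is not the structural equation model used in the study. It assumes a much simpler additive model (composite score = trainee effect + rater effect) fitted by ordinary least squares, and the function name `adjust_for_raters`, the trainee and rater labels, and all scores are made up for illustration.

```python
import numpy as np

def adjust_for_raters(trainee_ids, rater_ids, scores):
    """Fit score = trainee effect + rater effect by least squares.

    Rater effects are centered to mean zero so that trainee effects
    are on the scale of an "average" rater. Assumes the design is
    connected (raters share trainees), otherwise effects are not
    comparable across raters.
    """
    trainees = sorted(set(trainee_ids))
    raters = sorted(set(rater_ids))
    t_index = {t: k for k, t in enumerate(trainees)}
    r_index = {r: k for k, r in enumerate(raters)}

    # Dummy-coded design matrix: one column per trainee, one per rater.
    X = np.zeros((len(scores), len(trainees) + len(raters)))
    for row, (t, r) in enumerate(zip(trainee_ids, rater_ids)):
        X[row, t_index[t]] = 1.0
        X[row, len(trainees) + r_index[r]] = 1.0

    y = np.asarray(scores, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    t_eff = coef[:len(trainees)]
    r_eff = coef[len(trainees):]

    # Resolve the additive indeterminacy by centering rater effects.
    shift = r_eff.mean()
    adjusted = dict(zip(trainees, t_eff + shift))
    severity = dict(zip(raters, r_eff - shift))
    return adjusted, severity

# Hypothetical data: R1 is lenient, R2 is severe; only C is seen by both.
trainee_ids = ["A", "B", "C", "C"]
rater_ids = ["R1", "R2", "R1", "R2"]
scores = [4.0, 2.0, 5.0, 3.0]

# Raw per-trainee means ignore who did the rating.
raw = {t: float(np.mean([s for tt, s in zip(trainee_ids, scores) if tt == t]))
       for t in sorted(set(trainee_ids))}

adjusted, severity = adjust_for_raters(trainee_ids, rater_ids, scores)
print("raw means:     ", raw)
print("adjusted:      ", {t: round(v, 3) for t, v in adjusted.items()})
print("rater effects: ", {r: round(v, 3) for r, v in severity.items()})
```

In this toy data the raw means rank trainee A well above trainee B only because A was scored by the lenient rater and B by the severe one. After adjustment the two are placed at the same level. A random-effects or latent-variable model, as in the study, would additionally shrink rater effects and handle multiple rubric items, which this sketch does not attempt.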

Conclusion: Our work provides a method to obtain rater-effects-adjusted surgical skill assessments in the operating room using structured rating scales. Our method allows for the creation of standardized (i.e., rater-effects-adjusted) quantitative surgical skill benchmarks using national-level databases of trainee assessments.

Level of evidence: N/A. Laryngoscope, 2024.

Keywords: OSATS; SGAT; rater bias; rater effect; septoplasty; surgical skill assessment.