Correcting for Rater Effects in Operating Room Surgical Skills Assessment

Ryan Chou; Hajira Naz; Kofi D O Boahene; Jessica H Maxwell; John R Wanamaker; Patrick J Byrne; Ira D Papel; Theda C Kontis; Gregory D Hager; Lisa E Ishii; Sonya Malekzadeh; S Swaroop Vedula; Masaru Ishii

doi:10.1002/lary.31391

Laryngoscope. 2024 Mar 12. doi: 10.1002/lary.31391. Online ahead of print.

Authors

Ryan Chou¹, Hajira Naz², Kofi D O Boahene³,⁴, Jessica H Maxwell⁵,⁶, John R Wanamaker⁵,⁶, Patrick J Byrne⁷, Ira D Papel³,⁸, Theda C Kontis³,⁸, Gregory D Hager⁹,¹⁰, Lisa E Ishii³,⁴, Sonya Malekzadeh⁵,⁶, S Swaroop Vedula⁹, Masaru Ishii³

Affiliations

¹ Department of Biomedical Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, U.S.A.
² Dugoni School of Dentistry, University of the Pacific, San Francisco, California, U.S.A.
³ Department of Otolaryngology-Head and Neck Surgery, Johns Hopkins University School of Medicine, Baltimore, Maryland, U.S.A.
⁴ Division of Facial Plastic and Reconstructive Surgery, Department of Otolaryngology-Head and Neck Surgery, Johns Hopkins University School of Medicine, Baltimore, Maryland, U.S.A.
⁵ Department of Otolaryngology-Head and Neck Surgery, MedStar Georgetown University Hospital, Washington, DC, U.S.A.
⁶ ENT Section, Veterans Affairs Medical Center, Washington, DC, U.S.A.
⁷ Head and Neck Institute, Cleveland Clinic, Cleveland, Ohio, U.S.A.
⁸ Aesthetic Center at Woodholme, Baltimore, Maryland, U.S.A.
⁹ Malone Center for Engineering in Healthcare, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, U.S.A.
¹⁰ Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, U.S.A.

PMID: 38470307
DOI: 10.1002/lary.31391

Abstract

Objective: To estimate and adjust for rater effects in operating room surgical skills assessment performed using a structured rating scale for nasal septoplasty.

Methods: We analyzed survey responses from attending surgeons (raters) who supervised residents and fellows (trainees) performing nasal septoplasty in a prospective cohort study. We fit a structural equation model with the rubric item scores regressed on a latent component of skill and then fit a second model including the rating surgeon as a random effect to model a rater-effects-adjusted latent surgical skill. We validated this model against conventional measures including the level of expertise and post-graduation year (PGY) commensurate with the trainee's performance, the actual PGY of the trainee, and whether the surgical goals were achieved.
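The two-step model described above can be written out schematically. The abstract does not give the equations, so the following is only one plausible form, and every symbol in it is an assumption introduced for illustration: y_{ak} is the score on rubric item k in assessment a, \eta_a is the latent skill expressed in that assessment, \theta_{i(a)} is the rater-adjusted skill of the trainee i assessed, and u_{j(a)} is the random effect of the rater j who scored it.

```latex
% Illustrative form only; symbols and distributional assumptions are not from the source.
\begin{align}
  y_{ak} &= \nu_k + \lambda_k \, \eta_a + \varepsilon_{ak},
    & \varepsilon_{ak} &\sim \mathcal{N}(0, \sigma_k^2)
    && \text{(item scores regressed on latent skill)} \\
  \eta_a &= \theta_{i(a)} + u_{j(a)},
    & u_j &\sim \mathcal{N}(0, \sigma_u^2)
    && \text{(rater as a random effect)}
\end{align}
```

Under this reading, the first model corresponds to the first line alone, with \eta_a taken directly as skill, and the second model adds the rater term u_j so that \theta captures skill net of rater severity or leniency.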

Results: Our dataset included 188 assessments by 7 raters and 41 trainees. The model with one latent construct for surgical skill and the rater as a random effect was the best. Rubric scores depended on how severe or lenient the rater was, sometimes almost as much as they depended on trainee skill. Rater-adjusted latent skill scores increased with attending-estimated skill levels and PGY of trainees, increased with the actual PGY, and appeared constant over different levels of achievement of surgical goals.
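The finding that rater severity can weigh on rubric scores almost as much as trainee skill can be illustrated with a small simulation. This is a minimal sketch, not the authors' structural equation model: only the counts (188 assessments, 7 raters, 41 trainees) come from the abstract, while the number of rubric items, the effect sizes, the balanced random assignment of raters, and the crude adjustment (subtracting each rater's mean score) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Counts reported in the abstract.
n_assessments, n_raters, n_trainees = 188, 7, 41
# Hypothetical number of rubric items (not stated in the abstract).
n_items = 6

# Assumed "true" trainee skill and fixed rater severity/leniency shifts.
trainee_skill = rng.normal(0.0, 1.0, n_trainees)
rater_shift = np.linspace(-1.0, 1.0, n_raters)

# Assumed design: trainees drawn at random, raters assigned in a balanced,
# shuffled rotation so that every rater scores at least 26 assessments.
trainee = rng.integers(0, n_trainees, n_assessments)
rater = rng.permutation(np.arange(n_assessments) % n_raters)

# Each item score = trainee skill + rater shift + item-level noise.
skill = trainee_skill[trainee]
noise = rng.normal(0.0, 0.5, (n_assessments, n_items))
items = skill[:, None] + rater_shift[rater][:, None] + noise

# Raw score: mean over rubric items for each assessment.
raw = items.mean(axis=1)

# Crude adjustment: subtract each rater's own mean raw score.
rater_mean = np.array([raw[rater == j].mean() for j in range(n_raters)])
adjusted = raw - rater_mean[rater]

# Compare how well raw and adjusted scores track the simulated skill.
raw_corr = np.corrcoef(raw, skill)[0, 1]
adj_corr = np.corrcoef(adjusted, skill)[0, 1]
print(f"correlation with simulated skill: raw={raw_corr:.2f}, adjusted={adj_corr:.2f}")
```

In this toy setting the adjusted scores track the simulated skill more closely than the raw scores. Centering on rater means is only a stand-in for a random-effects model: it also removes real differences in the average skill of the trainees each rater happened to see, which a model of the kind described in the Methods is better placed to handle.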

Conclusion: Our work provides a method to obtain rater-effect-adjusted surgical skill assessments in the operating room using structured rating scales. Our method allows for the creation of standardized (i.e., rater-effects-adjusted) quantitative surgical skill benchmarks using national-level databases on trainee assessments.

Level of evidence: N/A. Laryngoscope, 2024.

Keywords: OSATS; SGAT; rater bias; rater effect; septoplasty; surgical skill assessment.
