Automatic Surgical Skill Assessment System Based on Concordance of Standardized Surgical Field Development Using Artificial Intelligence

JAMA Surg. 2023 Aug 1;158(8):e231131. doi: 10.1001/jamasurg.2023.1131. Epub 2023 Aug 9.

Abstract

Importance: Automatic surgical skill assessment with artificial intelligence (AI) is more objective than manual video review-based skill assessment and can reduce the workload of human reviewers. Standardization of surgical field development is an important aspect of this skill assessment.

Objective: To develop a deep learning model that can recognize the standardized surgical fields in laparoscopic sigmoid colon resection and to evaluate the feasibility of automatic surgical skill assessment based on the concordance of the standardized surgical field development using the proposed deep learning model.

Design, setting, and participants: This retrospective diagnostic study used intraoperative videos of laparoscopic colorectal surgery submitted to the Japan Society for Endoscopic Surgery between August 2016 and November 2017. Data were analyzed from April 2020 to September 2022.

Interventions: Videos of surgery performed by expert surgeons with Endoscopic Surgical Skill Qualification System (ESSQS) scores higher than 75 were used to construct a deep learning model able to recognize a standardized surgical field and output its similarity to standardized surgical field development as an AI confidence score (AICS). Other videos were extracted as the validation set.

Main outcomes and measures: Videos with ESSQS scores more than 2 SDs below or above the mean were defined as the low- and high-score groups, respectively. The correlation between AICS and ESSQS score and the screening performance of AICS for the low- and high-score groups were analyzed.

Results: The sample included 650 intraoperative videos, 60 of which were used for model construction and 60 for validation. The Spearman rank correlation coefficient between the AICS and ESSQS score was 0.81. The receiver operating characteristic (ROC) curves for the screening of the low- and high-score groups were plotted, and the areas under the ROC curve for the low- and high-score group screening were 0.93 and 0.94, respectively.
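
The two statistics reported above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' analysis code: the helper names (`average_ranks`, `spearman`, `roc_auc`) are hypothetical, and the AUC is computed through its pairwise-comparison interpretation rather than by plotting a curve.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def roc_auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive has the higher score; ties count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

For example, `roc_auc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0])` evaluates to 0.75, because three of the four positive-negative pairs are ordered correctly. When screening for a low-score group, where a lower score marks the positive class, the scores would be negated before calling `roc_auc`; whether the study handled it this way is an assumption.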

Conclusions and relevance: The AICS from the developed model strongly correlated with the ESSQS score, demonstrating the model's feasibility for use as a method of automatic surgical skill assessment. The findings also suggest the feasibility of the proposed model for creating an automated screening system for surgical skills and its potential application to other types of endoscopic procedures.

MeSH terms

  • Artificial Intelligence
  • Digestive System Surgical Procedures*
  • Humans
  • Laparoscopy* / methods
  • ROC Curve
  • Retrospective Studies