Speech-Based Surgical Phase Recognition for Non-Intrusive Surgical Skills' Assessment in Educational Contexts

Sensors (Basel). 2021 Feb 13;21(4):1330. doi: 10.3390/s21041330.

Abstract

Surgeons' procedural skills and intraoperative decision making are key elements of clinical practice. However, the objective assessment of these skills remains a challenge to this day. Surgical workflow analysis (SWA) is emerging as a powerful tool to solve this issue in surgical educational environments in real time. Typically, SWA makes use of video signals to automatically identify the surgical phase. We hypothesize that the analysis of surgeons' speech using natural language processing (NLP) can provide deeper insight into the surgical decision-making processes. As a preliminary step, this study proposes to use audio signals registered in the educational operating room (OR) to classify the phases of a laparoscopic cholecystectomy (LC). To do this, we firstly created a database with the transcriptions of audio recorded in surgical educational environments and their corresponding phase. Secondly, we compared the performance of four feature extraction techniques and four machine learning models to find the most appropriate model for phase recognition. The best resulting model was a support vector machine (SVM) coupled to a hidden-Markov model (HMM), trained with features obtained with Word2Vec (82.95% average accuracy). The analysis of this model's confusion matrix shows that some phrases are misplaced due to the similarity in the words used. The study of the model's temporal component suggests that further attention should be paid to accurately detect surgeons' normal conversation. This study proves that speech-based classification of LC phases can be effectively achieved. This lays the foundation for the use of audio signals for SWA, to create a framework of LC to be used in surgical training, especially for the training and assessment of procedural and decision-making skills (e.g., to assess residents' procedural knowledge and their ability to react to adverse situations).

Keywords: feature extraction; machine learning; natural language processing; speech analysis; surgical process model; surgical workflow analysis.

MeSH terms

  • Cholecystectomy, Laparoscopic*
  • Clinical Competence*
  • General Surgery* / standards
  • Humans
  • Operating Rooms
  • Pattern Recognition, Automated*
  • Speech