Feasibility Study on Automatic Surgical Phase Identification based on Speech Recognition for Laparoscopic Prostatectomy

Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul:2022:4411-4414. doi: 10.1109/EMBC48229.2022.9870990.

Abstract

Efficient capacity management of the Operating Room (OR) is crucial for optimizing resources and thereby minimizing cost. However, the duration of each surgery is hard to predict and is prone to variability and unforeseen events that may delay or accelerate the procedure. Automatic surgical phase identification strategies can lead to a more accurate and automated estimation of surgery duration, improving OR schedule optimization. One possible contribution to these methods is the introduction of speech recognition systems. The described work aims at the implementation of a speech recognition (SR) engine for surgical phase identification, specifically optimized for Laparoscopic Radical Prostatectomy (LRP). For this application, the performance of three engines was tested under different background noise and distance conditions: Microsoft Speech SDK (Azure); Google Speech Recognition API (Google); and an optimization of the first with the introduction of specific vocabulary (AzureGrammar). Action/target binomials were used to identify the Laparoscopic Prostatectomy surgical phases. Fifteen participants were selected to perform the tests, and Word Error Rate (WER) was calculated as the main comparison metric. The total WER values indicate that AzureGrammar has superior performance, with an error rate of 29.25%, compared to 67.91% for Azure and 67.48% for Google. The Phase Accuracy Ratio (PAR) values, considering only the AzureGrammar SR engine, were 84% and 83% at 1 m and 2 m without background noise, and 61% and 51% at 1 m and 2 m with background noise. This study demonstrates that a conventional SR engine with specific vocabulary incorporated has the potential to achieve acceptable performance with minimal setup.
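For readers unfamiliar with the comparison metric, the following is a minimal sketch of the standard WER definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference. The abstract does not state how the authors normalized text or computed WER in practice, so the function name `word_error_rate`, the lowercasing and whitespace tokenization, and the sample phrases are assumptions for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is lowercased and split on whitespace; the
    study's actual normalization is not given in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical phrases, not taken from the study's test set.
print(word_error_rate("dissect the bladder neck", "dissect bladder net"))
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.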
Clinical Relevance- Speech-based surgical phase identification can be achieved with minimal setup on top of existing SR engines, showing potential for implementation in the Operating Room after further optimization.
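As a rough illustration of how action/target binomials might map a recognized utterance to a surgical phase, the sketch below looks for an action word and a target phrase in the transcript. The abstract does not list the binomials or the phase names used in the study, so the `PHASE_BINOMIALS` table, the `identify_phase` function, and the substring-matching rule are all hypothetical.

```python
from typing import Optional

# Hypothetical action/target pairs and phase labels; the study's
# actual vocabulary is not given in the abstract.
PHASE_BINOMIALS = {
    ("dissect", "bladder neck"): "bladder neck dissection",
    ("suture", "anastomosis"): "anastomosis",
}


def identify_phase(transcript: str) -> Optional[str]:
    """Return the first phase whose action and target both occur in the transcript."""
    # Lowercase and collapse whitespace so multi-word targets match reliably.
    text = " ".join(transcript.lower().split())
    for (action, target), phase in PHASE_BINOMIALS.items():
        if action in text and target in text:
            return phase
    return None  # no binomial recognized


print(identify_phase("Now dissect the bladder neck"))
```

Simple substring matching is used here only to keep the sketch short; a deployed system would more likely rely on the SR engine's own grammar or phrase-list facilities to constrain recognition.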

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Feasibility Studies
  • Humans
  • Laparoscopy* / methods
  • Male
  • Prostatectomy
  • Speech
  • Speech Perception*