Towards accurate and interpretable surgical skill assessment: a video-based method for skill score prediction and guiding feedback generation

Int J Comput Assist Radiol Surg. 2021 Sep;16(9):1595-1605. doi: 10.1007/s11548-021-02448-4. Epub 2021 Jul 10.

Abstract

Purpose: Automatic surgical skill assessment has recently received attention given the increasingly important role of surgical training. The assessment usually involves skill score prediction and subsequent feedback generation. Existing work on skill score prediction faces several challenges, and its results leave room for improvement. For feedback, most work identifies flaws at the granularity of video frames or clips. It thus remains to be explored how to identify poorly performed gestures (segments) and how to provide good references for improvement.

Methods: To overcome these problems, a novel method consisting of three correlated frameworks is proposed. The first framework learns to predict final skill scores of surgical trials with two auxiliary tasks. The second framework learns to predict running intermediate skill scores that indicate the problematic gestures, while the third framework explores the optimal gesture sequences as references through a new Policy Gradient based formulation.
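As a rough illustration of what a policy gradient search over gesture sequences can look like, the sketch below runs plain REINFORCE with a moving-average baseline on a toy problem. It is not the formulation proposed in the paper, which the abstract does not specify. The gesture labels, the `IDEAL_NEXT` transition table, the reward function and all hyperparameters are hypothetical and chosen only to make the example self-contained.

```python
import math
import random

GESTURES = ["G1", "G2", "G3"]        # hypothetical gesture labels
IDEAL_NEXT = {0: 1, 1: 2, 2: 0}      # hypothetical "good" transition per gesture


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits, length, rng):
    """Sample a gesture sequence starting from gesture 0."""
    seq = [0]
    for _ in range(length - 1):
        probs = softmax(logits[seq[-1]])
        seq.append(rng.choices(range(len(probs)), weights=probs)[0])
    return seq


def reward(seq):
    """Toy reward (assumption): number of transitions matching IDEAL_NEXT."""
    return sum(1.0 for a, b in zip(seq, seq[1:]) if IDEAL_NEXT[a] == b)


def train(steps=5000, length=5, lr=0.1, seed=0):
    """REINFORCE on a tabular softmax policy pi(next gesture | previous gesture)."""
    rng = random.Random(seed)
    n = len(GESTURES)
    logits = [[0.0] * n for _ in range(n)]
    baseline = 0.0
    for _ in range(steps):
        seq = sample_sequence(logits, length, rng)
        ret = reward(seq)
        advantage = ret - baseline
        baseline += 0.05 * (ret - baseline)   # moving-average baseline
        # Freeze the sampling policy so every gradient term uses the same parameters.
        probs_by_state = [softmax(row) for row in logits]
        for prev, nxt in zip(seq, seq[1:]):
            for g in range(n):
                # d log pi(nxt | prev) / d logit[prev][g]
                grad = (1.0 if g == nxt else 0.0) - probs_by_state[prev][g]
                logits[prev][g] += lr * advantage * grad
    return logits


def greedy_sequence(logits, length):
    """Follow the most probable transition at each step, starting from gesture 0."""
    seq = [0]
    for _ in range(length - 1):
        row = logits[seq[-1]]
        seq.append(max(range(len(row)), key=row.__getitem__))
    return seq


if __name__ == "__main__":
    learned = train()
    print([GESTURES[g] for g in greedy_sequence(learned, 5)])
```

In a real setting the toy reward would presumably be replaced by a signal tied to predicted skill; how the paper defines that reward is not stated in the abstract.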

Results: Our method is evaluated on the JIGSAWS dataset. The first framework pushes state-of-the-art prediction performance further, to Spearman's correlations of 0.83, 0.86 and 0.69 for the three surgical tasks under the leave-one-user-out (LOUO) validation scheme. Moreover, the intermediate scores predicted by the second framework are in better agreement with the experts' assessments. In addition, the gesture sequences generated by the third framework reflect the optimality of the gesture flow.
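To make the evaluation protocol concrete, the following sketch computes Spearman's rank correlation under a leave-one-user-out split using only the Python standard library. The trial records, the single `feature` field and the linear `fit_linear` predictor are hypothetical stand-ins, not the paper's model or data. Averaging the per-fold correlations is one possible aggregation; the abstract does not state which aggregation the authors used.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def louo_folds(trials):
    """Yield (held_out_user, train, test), holding out one user per fold."""
    for user in sorted({t["user"] for t in trials}):
        train = [t for t in trials if t["user"] != user]
        test = [t for t in trials if t["user"] == user]
        yield user, train, test


def fit_linear(train):
    """Hypothetical predictor: least-squares line from `feature` to `score`."""
    n = len(train)
    mx = sum(t["feature"] for t in train) / n
    my = sum(t["score"] for t in train) / n
    sxx = sum((t["feature"] - mx) ** 2 for t in train)
    sxy = sum((t["feature"] - mx) * (t["score"] - my) for t in train)
    slope = sxy / sxx
    return lambda t: my + slope * (t["feature"] - mx)


def louo_spearman(trials, fit):
    """Mean of per-fold Spearman correlations (an assumed aggregation)."""
    rhos = []
    for _, train, test in louo_folds(trials):
        predict = fit(train)
        rhos.append(spearman([predict(t) for t in test],
                             [t["score"] for t in test]))
    return sum(rhos) / len(rhos)


# Made-up toy trials: three users, three trials each.
TRIALS = [
    {"user": "A", "feature": 1.0, "score": 10},
    {"user": "A", "feature": 2.0, "score": 14},
    {"user": "A", "feature": 3.0, "score": 19},
    {"user": "B", "feature": 1.5, "score": 12},
    {"user": "B", "feature": 2.5, "score": 15},
    {"user": "B", "feature": 3.5, "score": 22},
    {"user": "C", "feature": 0.5, "score": 9},
    {"user": "C", "feature": 2.0, "score": 13},
    {"user": "C", "feature": 4.0, "score": 24},
]

if __name__ == "__main__":
    print(louo_spearman(TRIALS, fit_linear))
```

Holding out all trials of one user at a time tests generalization to an unseen operator, which is why a correlation obtained under LOUO is typically harder to achieve than one obtained with trial-level splits.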

Conclusion: In summary, multi-task learning with semantic visual features successfully boosts the performance of skill score prediction, while exploring gesture-level annotations and score elements of the final skill score is useful for generating more interpretable feedback. The presented method potentially contributes towards a complete loop of automated surgical training.

Keywords: Incorporating recognized surgical gestures and skill levels; Interpretable feedback; Optimal gesture sequences; Surgical skill assessment.

MeSH terms

  • Biomechanical Phenomena
  • Clinical Competence*
  • Feedback
  • Gestures*
  • Humans