Real-Time Extraction of Important Surgical Phases in Cataract Surgery Videos

Sci Rep. 2019 Nov 12;9(1):16590. doi: 10.1038/s41598-019-53091-8.

Abstract

The present study aimed to conduct a real-time automatic analysis of two important surgical phases of cataract surgery, continuous curvilinear capsulorrhexis (CCC) and nuclear extraction, together with three other surgical phases, using artificial intelligence technology. A total of 303 cases of cataract surgery registered in the clinical database of the Ophthalmology Department of Tsukazaki Hospital were used as a dataset. Surgical videos were downsampled to a resolution of 299 × 168 and sampled at 1 FPS to obtain still images. Next, each image was labeled with its surgical phase based on the start and end times of each phase recorded by an ophthalmologist. Using these data, a neural network model, InceptionV3, was trained to identify the surgical phase of each image. The images were then processed in chronological order by the neural network model, and the moving average of the model output over five consecutive images was computed. The class with the maximum averaged output value was defined as the surgical phase. For each surgical phase, the time at which the phase was first identified was defined as the start time, and the time at which it was last identified was defined as the end time. Performance was evaluated as the mean absolute error between the start and end times of each important phase recorded by the ophthalmologist and those determined by the model. The correct response rate of the cataract surgical phase classification was 90.7% for CCC, 94.5% for nuclear extraction, and 97.9% for other phases, with a mean correct response rate of 96.5%. The errors between the start and end times recorded by the ophthalmologist and those determined by the neural network model were 3.34 seconds and 4.43 seconds for the start and end of CCC, respectively, and 7.21 seconds and 6.04 seconds for the start and end of nuclear extraction, respectively, with a mean of 5.25 seconds.
The neural network model used in this study classified the surgical phase by referring only to the last 5 seconds of video images. The method can therefore operate in a manner close to real-time classification.
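
The post-processing described in the abstract (five-image moving average, maximum-output class, first and last identification as start and end times, mean absolute error) can be illustrated with a minimal sketch. This is not the authors' code. It assumes the per-image model outputs are already available as an array of class scores, that the moving average is a trailing window over the current and preceding images (shorter at the very start of a video), and that images are sampled at 1 FPS so an image index equals a time in seconds. The function names and the class indices are hypothetical.

```python
import numpy as np


def smooth_and_classify(probs, window=5):
    """Label each image from a trailing moving average of class scores.

    probs: array of shape (n_images, n_classes), one row per image in
    chronological order. Assumption: the window covers the current image
    and up to (window - 1) preceding ones.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.empty(probs.shape[0], dtype=int)
    for t in range(probs.shape[0]):
        start = max(0, t - window + 1)
        averaged = probs[start:t + 1].mean(axis=0)
        labels[t] = int(np.argmax(averaged))  # class with the maximum output
    return labels


def phase_bounds(labels, phase, fps=1.0):
    """Return (start, end) in seconds: first and last time a phase is identified."""
    idx = np.flatnonzero(np.asarray(labels) == phase)
    if idx.size == 0:
        return None  # phase never identified in this video
    return float(idx[0]) / fps, float(idx[-1]) / fps


def mean_absolute_error(reference, predicted):
    """Mean absolute difference between annotated and predicted times."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(reference - predicted)))


# Toy example with three hypothetical classes and one-hot scores per image.
frame_classes = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2]
toy_probs = np.eye(3)[frame_classes]
toy_labels = smooth_and_classify(toy_probs, window=5)
toy_bounds = phase_bounds(toy_labels, phase=1)
```

In this toy input the smoothed label switches to class 1 two images after the raw scores do, and the two trailing class-2 images never become the majority of the window. This shows how the averaging trades a short delay for stability against isolated misclassifications.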

MeSH terms

  • Algorithms*
  • Artificial Intelligence
  • Cataract / therapy*
  • Cataract Extraction / methods*
  • Databases, Factual
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Neural Networks, Computer*
  • Surgery, Computer-Assisted / methods*
  • Video-Assisted Surgery / methods*