An Automatized Deep Segmentation and Classification Model for Lumbar Disk Degeneration and Clarification of Its Impact on Clinical Decisions

Global Spine J. 2023 Sep 12:21925682231200783. doi: 10.1177/21925682231200783. Online ahead of print.

Abstract

Study design: Cross-sectional database study.

Objective: The purpose of this study was to develop a successful, reproducible, and reliable convolutional neural network (CNN) model capable of segmentation and classification for grading intervertebral disc degeneration (IVDD), and to quantify the network's impact on doctors' clinical decision-making.

Methods: A total of 5685 discs from 1137 patients were graded independently by four experienced doctors according to the Pfirrmann classification. A ground truth (GT) was established for each disc in accordance with the decision of the majority of doctors. The U-net model was used for segmentation; 1815 discs from 363 patients were used to train and test the U-net. The Inception V3 model was employed for classification. For classification, all discs were separated into two distinct sets: 90% in a training set and 10% in a test set. The performance metrics of these models were measured. Reliability tests were performed. The impact of CNN assistance on doctors was assessed.
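
The majority-vote ground truth and the 90%/10% split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, Pfirrmann grades are encoded as integers 1 to 5, and two points the abstract does not specify are assumptions here, namely that a tie among the four raters is returned as unresolved (None) and that the split is a seeded random shuffle at the disc level.

```python
import random
from collections import Counter


def majority_grade(grades):
    """Return the grade chosen by most raters, or None if the top count is tied.

    Assumption: the abstract does not say how ties (e.g. a 2-2 split among
    four raters) were resolved, so they are flagged rather than guessed.
    """
    if not grades:
        raise ValueError("at least one grade is required")
    counts = Counter(grades).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie between the two most frequent grades
    return counts[0][0]


def split_train_test(items, test_fraction=0.1, seed=0):
    """Shuffle items reproducibly and return (train, test) lists.

    Assumption: a disc-level random split; a patient-level split would
    instead group the discs of each patient before shuffling.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    # Hypothetical ratings for three discs from four raters each.
    ratings = {"disc_a": [3, 3, 3, 4], "disc_b": [2, 2, 3, 3], "disc_c": [1, 2, 3, 3]}
    for disc, grades in ratings.items():
        print(disc, majority_grade(grades))
    train, test = split_train_test(range(100))
    print(len(train), len(test))
```

Returning None for ties keeps the unresolved cases visible so that they can be handled by whatever adjudication rule a study actually adopts.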

Results: Segmentation accuracy was .9597, with a Jaccard index of .8717 and a Sørensen-Dice coefficient of .9314. Classification accuracy was .9346, and the F1 score was .9355. The intraclass correlation coefficient (ICC) and kappa values between the CNN and the GT were .95-.97. With the CNN's assistance, the success rates of doctors increased by 7.9% to 22%.
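
For reference, the three segmentation metrics reported above can be computed from a predicted mask and a reference mask as in the sketch below. It assumes flat binary masks (1 = disc pixel, 0 = background) of equal length; the function name and the example masks are hypothetical and are not taken from the study.

```python
def overlap_metrics(pred, truth):
    """Return (accuracy, jaccard, dice) for two flat binary masks."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("masks must be non-empty and of equal length")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)          # both foreground
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)      # predicted only
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)      # reference only
    tn = len(pred) - tp - fp - fn                                # both background
    accuracy = (tp + tn) / len(pred)
    union = tp + fp + fn
    # Convention assumed here: two empty masks count as perfect overlap.
    jaccard = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return accuracy, jaccard, dice


if __name__ == "__main__":
    pred = [1, 1, 1, 0, 0, 0, 0, 0]
    truth = [1, 1, 0, 1, 0, 0, 0, 0]
    acc, jac, dice = overlap_metrics(pred, truth)
    print(acc, jac, round(dice, 4))
```

For a single pair of masks the two overlap measures are linked by Dice = 2J / (1 + J). The reported values (.8717 and .9314) are close to this relation, although metrics averaged over many images need not satisfy it exactly.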

Conclusions: The fully automated network markedly outperformed doctors in terms of accuracy and reliability. The CNN's results were comparable to those of other recent studies in the literature. The CNN's assistance had a substantial positive effect on doctors' decisions.

Keywords: convolutional neural network; degenerative disc disease; Pfirrmann classification; segmentation.