Diagnostic triage in patients with central lumbar spinal stenosis using a deep learning system of radiographs

J Neurosurg Spine. 2022 Jan 21:1-8. doi: 10.3171/2021.11.SPINE211136. Online ahead of print.

Abstract

Objective: Magnetic resonance imaging (MRI) is the gold-standard tool for diagnosing lumbar spinal stenosis (LSS), but it is difficult to promptly examine all suspected cases with MRI considering the modality's high cost and limited accessibility. Although radiography is an efficient screening technique owing to its low cost, rapid operability, and wide availability, its diagnostic accuracy is relatively poor. In this study, the authors aimed to develop a deep learning model with a convolutional neural network (CNN) for diagnosing severe central LSS using radiography and to evaluate radiological diagnostic features using gradient-weighted class activation mapping (Grad-CAM).

Methods: Patients who had undergone both spinal MRI and radiography in the period from May 1, 2005, to December 31, 2017, were screened. According to the formal MRI report, participants were consecutively included in the severe central LSS or healthy control group, and radiographs for both groups were collected. A CNN-based transfer learning algorithm was developed to classify radiographic findings as LSS or normal (binary classification). The proposed models were evaluated using six performance metrics: area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, and positive and negative predictive values.
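
As an illustration of the six performance metrics listed above, the following is a minimal sketch of how they could be computed for a binary LSS-versus-normal classifier. It is not the authors' evaluation code. The function names, the 0.5 decision threshold, and the rank-based AUROC estimate are assumptions made for this example.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUROC: chance that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(labels: Sequence[int], scores: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Compute the six metrics; label 1 = LSS, label 0 = normal.
    The threshold of 0.5 is an assumption, not a value from the study."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return {
        "auroc": auroc(labels, scores),
        "accuracy": _ratio(tp + tn, len(labels)),
        "sensitivity": _ratio(tp, tp + fn),   # true-positive rate
        "specificity": _ratio(tn, tn + fp),   # true-negative rate
        "ppv": _ratio(tp, tp + fp),           # positive predictive value
        "npv": _ratio(tn, tn + fn),           # negative predictive value
    }


# Toy data for illustration only; these are not values from the study.
example = binary_metrics([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.6, 0.3, 0.1])
```

Unlike the other five metrics, AUROC does not depend on the chosen threshold, which is one reason it is commonly reported alongside them.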

Results: Among the models tested, VGG19 performed best, with an AUROC of 90.0% (95% CI 89.8%-90.3%) when trained on 12,442 images. Accuracy, averaged over the 5-fold models, was 82.8% (95% CI 82.5%-83.1%). The regions highlighted by Grad-CAM were reasonable, and the features could be categorized into reduced disc height, narrow foramina, short pedicles, and hyperdense facet joints. In the extra validation, the AUROC was 89.3% (95% CI 88.7%-90.0%) and accuracy, averaged over the 5-fold models, was 81.8% (95% CI 80.6%-83.0%). Multivariate logistic regression analysis showed that adding demographic factors (age and sex) did not improve model performance.
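
To make the Grad-CAM step more concrete, the following is a minimal sketch of how a Grad-CAM heat map is formed from the feature maps of a convolutional layer and the gradients of the class score with respect to those maps. It is not the authors' implementation. The function name and the toy inputs are hypothetical, and the final rescaling to the range 0 to 1 is a common display convention assumed here rather than part of the method itself.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Combine feature maps into a class activation map.

    activations[k][i][j]: value of feature map k at position (i, j).
    gradients[k][i][j]:   derivative of the class score with respect to
                          activations[k][i][j].
    """
    if not activations or len(activations) != len(gradients):
        raise ValueError("activations and gradients must match and be non-empty")
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weights: global average of each channel's gradients.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]

    # Weighted sum of the feature maps.
    cam = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * fmap[i][j]

    # ReLU keeps only regions that support the class of interest.
    cam = [[max(0.0, v) for v in row] for row in cam]

    # Assumed display step: rescale so the strongest region equals 1.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy 2x2 feature maps for illustration only.
toy_activations = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
toy_gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heat_map = grad_cam(toy_activations, toy_gradients)
```

In practice the resulting map is upsampled to the size of the radiograph and overlaid on it, so that the highlighted regions can be compared with anatomical findings.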

Conclusions: The CNN-based algorithm trained to identify central LSS on radiographs showed high diagnostic accuracy and is expected to be useful as a triage tool. The algorithm could accurately localize the stenotic lesion to assist physicians in the identification of LSS.

Keywords: artificial intelligence; convolutional neural network; deep learning; lumbar; radiograph; spinal stenosis; triage.