An Integrated Voice Recognition and Natural Language Processing Platform to Automatically Extract Thoracolumbar Injury Classification Score Features From Radiology Reports

Archis R Bhandarkar; Chiduziem Onyedimma; Ryan M Jarrah; Sufyan Ibrahim; Sunyang Fu; Hongfang Liu; Mohamad Bydon

doi:10.1016/j.wneu.2023.12.065

World Neurosurg. 2024 Mar:183:e243-e249. doi: 10.1016/j.wneu.2023.12.065. Epub 2023 Dec 15.

Authors

Archis R Bhandarkar¹, Chiduziem Onyedimma², Ryan M Jarrah², Sufyan Ibrahim², Sunyang Fu³, Hongfang Liu⁴, Mohamad Bydon⁵

Affiliations

¹ Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota, USA; Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA.
² Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota, USA.
³ Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA.
⁴ Digital Health Sciences, Mayo Clinic Alix School of Medicine, Rochester, Minnesota, USA.
⁵ Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota, USA. Electronic address: bydon.mohamad@mayo.edu.

PMID: 38103686

Abstract

Background: Many predictive models for estimating clinical outcomes after spine surgery have been reported in the literature. However, implementation of predictive scores in practice is limited by the time-intensive nature of manually abstracting relevant predictors. In this study, we designed natural language processing (NLP) algorithms to automate data abstraction for the thoracolumbar injury classification score (TLICS).

Methods: We retrieved the radiology reports of all Mayo Clinic patients with an International Classification of Diseases, 9th or 10th revision, code corresponding to a fracture of the thoracolumbar spine between January 2005 and October 2020. The annotated reports were used to train N-gram-based NLP classifiers using machine learning methods, including random forest, stepwise linear discriminant analysis, k-nearest neighbors, and penalized logistic regression models.
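As an illustration of the N-gram representation mentioned above, the following is a minimal sketch of turning report text into unigram and bigram counts. The abstract does not describe the tokenization or N-gram order used in the study, so the regular expression, the `max_n` parameter, the function names, and the sample report are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed scheme)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=2):
    """Count every 1-gram up to max_n-gram in a report (max_n is an assumption)."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Hypothetical report text, not taken from the study data.
report = "Burst fracture of L1 with retropulsion. No burst fracture at L2."
features = ngram_counts(report)
print(features["burst fracture"])
```

Count vectors of this kind can then be passed to any of the classifiers named in the Methods, with the annotated fracture morphology or posterior ligamentous complex status as the label.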

Results: A total of 1085 spine radiology reports were included in our analysis. The dataset comprised 483 compression, 401 burst, 103 translational/rotational, and 98 distraction fractures, and 103 reports documented an injury of the posterior ligamentous complex. For fracture morphology detection, the random forest model achieved the highest overall accuracy at 76.96%, versus 65.90% for stepwise linear discriminant analysis, 62.67% for penalized logistic regression, and 50.69% for k-nearest neighbors. Overall accuracy for detecting posterior ligamentous complex integrity was also highest with the random forest model, at 83.41%. The random forest model was implemented in the backend of a web application in which users can dictate reports and have TLICS features automatically extracted.
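To put the reported morphology accuracies in context, the class shares and a majority-class baseline can be computed from the fracture counts given above. This is a rough reference point only: it uses the full dataset, and the abstract does not state how the reports were split for evaluation, so the class mix of the actual test set is an assumption.

```python
# Fracture morphology counts as reported in the Results.
counts = {
    "compression": 483,
    "burst": 401,
    "translational/rotational": 103,
    "distraction": 98,
}

total = sum(counts.values())

# Share of each class, in percent.
shares = {label: 100 * n / total for label, n in counts.items()}

# Accuracy of a trivial model that always predicts the most common class.
majority_label = max(counts, key=counts.get)
majority_baseline = shares[majority_label]

for label, pct in shares.items():
    print(f"{label}: {pct:.2f}%")
print(f"majority baseline ({majority_label}): {majority_baseline:.2f}%")
```

Under this assumption, a classifier that always predicted the most frequent morphology would be correct a little under half the time, which is the level the reported accuracies should be compared against.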

Conclusions: We have developed a machine learning NLP model for extracting TLICS features from radiology reports, which we deployed in a web application that can be integrated into clinical practice.
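Once morphology and posterior ligamentous complex status have been extracted, they can be combined with neurologic status into a TLICS total. The sketch below uses the standard published TLICS point values; it is not the authors' implementation, the function and dictionary names are hypothetical, and neurologic status is assumed to be supplied separately because the abstract describes extraction of morphology and ligamentous features only.

```python
# Standard TLICS point values (names of the keys are chosen for this example).
MORPHOLOGY_POINTS = {
    "compression": 1,
    "burst": 2,
    "translational/rotational": 3,
    "distraction": 4,
}
PLC_POINTS = {"intact": 0, "indeterminate": 2, "injured": 3}
NEURO_POINTS = {
    "intact": 0,
    "nerve root": 2,
    "complete cord": 2,
    "incomplete cord": 3,
    "cauda equina": 3,
}


def tlics_score(morphology, plc, neuro):
    """Sum the three TLICS components; raises KeyError on unknown labels."""
    return MORPHOLOGY_POINTS[morphology] + PLC_POINTS[plc] + NEURO_POINTS[neuro]


def tlics_recommendation(score):
    """Map a total score to the conventional management category."""
    if score <= 3:
        return "nonoperative"
    if score == 4:
        return "surgeon's choice"
    return "operative"


score = tlics_score("burst", "injured", "intact")
print(score, tlics_recommendation(score))
```

Keeping the point tables as plain dictionaries makes it easy to audit the mapping against the published scale and to reject labels the extraction step was never trained to produce.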

Keywords: Artificial intelligence; Fracture; Natural language processing; Spine; Thoracolumbar.

MeSH terms

Fractures, Bone*
Humans
Lumbar Vertebrae / diagnostic imaging
Lumbar Vertebrae / injuries
Natural Language Processing
Radiology*
Thoracic Vertebrae / diagnostic imaging
Thoracic Vertebrae / injuries
Voice Recognition