Differentiating between drug-sensitive and drug-resistant tuberculosis with machine learning for clinical and radiological features

Quant Imaging Med Surg. 2022 Jan;12(1):675-687. doi: 10.21037/qims-21-290.

Abstract

Background: Tuberculosis (TB) drug resistance is a worldwide public health problem that threatens progress made in TB care and control. Early detection of drug resistance is important for disease control, but discriminating between drug-resistant TB (DR-TB) and drug-sensitive TB (DS-TB) remains an open problem. The objective of this work is to investigate the relevance of readily available clinical data and data derived from chest X-rays (CXRs) in DR-TB prediction, and to investigate whether machine learning techniques applied to selected clinical and radiological features can discriminate between DR-TB and DS-TB. We hypothesize that the number of sextants affected by abnormalities such as nodule, cavity, collapse and infiltrate may serve as a radiological feature for DR-TB identification, and that both clinical and radiological features are important factors for machine classification of DR-TB and DS-TB.

Methods: We use data from the NIAID TB Portals program (https://tbportals.niaid.nih.gov), comprising 1,455 DR-TB cases and 782 DS-TB cases from 11 countries. We first select three clinical features and 26 radiological features from the dataset. Then, we perform Pearson's chi-squared test to analyze the significance of the selected clinical and radiological features. Finally, we train machine classifiers based on different features and evaluate their ability to differentiate between DR-TB and DS-TB.
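
As an illustration of the significance-testing step, the following is a minimal sketch of Pearson's chi-squared test on a contingency table of feature value versus class (DR-TB or DS-TB). The function names and the toy counts are hypothetical and are not taken from the study; the sketch applies no continuity correction, and the closed-form p-value shown holds only for one degree of freedom (a 2 x 2 table). A feature with more levels, such as a sextant count, would need the general chi-squared survival function from a statistics library.

```python
import math


def chi_squared(table):
    """Return Pearson's chi-squared statistic and degrees of freedom
    for an r x c contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of feature and class.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_dof1(stat):
    """Upper-tail p-value of the chi-squared distribution.
    Valid ONLY for one degree of freedom (2 x 2 tables)."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts for illustration only (NOT data from the paper):
# rows = feature present / absent, columns = DR-TB / DS-TB.
toy_table = [[30, 10],
             [20, 40]]
stat, dof = chi_squared(toy_table)
p = p_value_dof1(stat)
```

A feature would then be retained when its p-value falls below a chosen significance level; the level used in the study is not stated in this abstract.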

Results: Pearson's chi-squared test shows that two clinical features and 23 radiological features are statistically significant regarding DR-TB vs. DS-TB. A ten-fold cross-validation using a support vector machine shows that automatic discrimination between DR-TB and DS-TB achieves an average accuracy of 72.34% and an average AUC value of 78.42% when combining all 25 statistically significant features.
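
To make the reported metrics concrete, here is a minimal sketch of how average accuracy and average AUC over ten folds could be computed. It is not the authors' pipeline: the stratified fold assignment, the decision threshold of 0 on the classifier score, and the `fit_and_score` callable (a stand-in for training any classifier, such as a support vector machine, and returning signed scores for the held-out samples) are all assumptions made for illustration.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive (label 1) scores
    higher than a random negative (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(X, y, fit_and_score, k=10, seed=0):
    """Return (mean accuracy, mean AUC) over k stratified folds.

    fit_and_score(train_X, train_y, test_X) is a hypothetical callable that
    trains a classifier and returns one signed score per test sample.
    Each fold must contain at least one sample of each class."""
    accs, aucs = [], []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        scores = fit_and_score([X[i] for i in train_idx],
                               [y[i] for i in train_idx],
                               [X[i] for i in test_idx])
        y_test = [y[i] for i in test_idx]
        # Assumed decision rule: positive score means class 1 (DR-TB).
        preds = [1 if s > 0 else 0 for s in scores]
        accs.append(accuracy(y_test, preds))
        aucs.append(auc(y_test, scores))
    return sum(accs) / k, sum(aucs) / k
```

Accuracy depends on the chosen threshold, whereas AUC depends only on how the scores rank positives against negatives, which is one reason the two figures can differ as they do in the reported results.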

Conclusions: Our study suggests that the number of affected lung sextants can be used for predicting DR-TB, and that automatic discrimination between DR-TB and DS-TB is possible, with a combination of clinical features and radiological features providing the best performance.

Keywords: Differential diagnosis; clinical features; drug-resistance (DR); machine learning; radiological features; tuberculosis (TB).