Machine learning to predict bacteriologic confirmation of Mycobacterium tuberculosis in infants and very young children

PLOS Digit Health. 2023 May 17;2(5):e0000249. doi: 10.1371/journal.pdig.0000249. eCollection 2023 May.

Abstract

Diagnosis of tuberculosis (TB) among young children (<5 years) is challenging due to the paucibacillary nature of clinical disease and clinical similarities to other childhood diseases. We used machine learning to develop accurate prediction models of microbial confirmation with simply defined and easily obtainable clinical, demographic, and radiologic factors. We evaluated eleven supervised machine learning models (using stepwise regression, regularized regression, decision tree, and support vector machine approaches) to predict microbial confirmation in young children (<5 years) using samples from invasive (reference-standard) or noninvasive procedures. Models were trained and tested using data from a large prospective cohort of young children with symptoms suggestive of TB in Kenya. Model performance was evaluated using the areas under the receiver operating characteristic curve (AUROC) and the precision-recall curve (AUPRC), accuracy metrics (i.e., sensitivity and specificity), F-beta scores, Cohen's kappa, and the Matthews correlation coefficient. Among 262 included children, 29 (11%) were microbially confirmed using any sampling technique. Models were accurate at predicting microbial confirmation in samples obtained from invasive procedures (AUROC range: 0.84-0.90) and from noninvasive procedures (AUROC range: 0.83-0.89). History of household contact with a confirmed case of TB, immunological evidence of TB infection, and a chest x-ray consistent with TB disease were consistently influential across models. Our results suggest machine learning can accurately predict microbial confirmation of M. tuberculosis in young children using simply defined features and increase the bacteriologic yield in diagnostic cohorts. These findings may facilitate clinical decision making and guide clinical research into novel biomarkers of TB disease in young children.
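The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `confusion_metrics` and `auroc` are hypothetical, the counts and scores in the usage lines are made-up toy values rather than study data, and the sketch assumes binary labels (1 = microbially confirmed) and non-zero denominators throughout.

```python
from math import sqrt


def confusion_metrics(tp, fp, fn, tn, beta=1.0):
    """Threshold-based metrics from binary confusion-matrix counts.

    Assumes every denominator is non-zero (no empty class or prediction).
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # recall among confirmed cases
    specificity = tn / (tn + fp)   # recall among unconfirmed cases
    precision = tp / (tp + fp)

    # F-beta: weighted harmonic mean of precision and sensitivity
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * sensitivity / (b2 * precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)

    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_beta": f_beta,
        "kappa": kappa,
        "mcc": mcc,
    }


def auroc(scores, labels):
    """AUROC as the probability that a randomly chosen positive outranks
    a randomly chosen negative (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (not from the study).
metrics = confusion_metrics(tp=8, fp=4, fn=2, tn=86)
toy_auc = auroc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

With a low prevalence of confirmed cases, as in this cohort, chance-corrected measures such as kappa and the Matthews correlation coefficient, together with AUPRC, tend to be more informative than raw accuracy, which is presumably why several complementary metrics are reported.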

Grants and funding

This work was supported by the US Agency for International Development (USAID) and the US Centers for Disease Control and Prevention (CDC). A portion of this work was funded by the President’s Emergency Plan for AIDS Relief (PEPFAR) through the Centers for Disease Control and Prevention, and the Eunice Kennedy Shriver National Institute of Child Health & Human Development [K23HD072802 to RS]. The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the funding agencies. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.