Comparison of logistic regression with regularized machine learning methods for the prediction of tuberculosis disease in people living with HIV: cross-sectional hospital-based study in Kisumu County, Kenya

Res Sq [Preprint]. 2023 Sep 21:rs.3.rs-3354948. doi: 10.21203/rs.3.rs-3354948/v1.

Abstract

Background: Tuberculosis (TB) is a major public health concern, particularly among people living with the human immunodeficiency virus (PLWH). Accurate prediction of TB disease in this population is crucial for early diagnosis and effective treatment. Logistic regression and regularized machine learning methods have been used to predict TB, but their comparative performance in HIV patients remains unclear. This study aims to compare the predictive performance of logistic regression with that of regularized machine learning methods for TB disease in HIV patients.

Methods: We conducted a retrospective analysis of data from HIV patients diagnosed with TB in three hospitals in Kisumu County (JOOTRH, Kisumu sub-county hospital, Lumumba health center) between [dates]. Logistic regression, Lasso, Ridge, and Elastic net regression were used to develop predictive models for TB disease. Model performance was evaluated using accuracy and area under the receiver operating characteristic curve (AUC-ROC).
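The four models named above differ only in the penalty added to the logistic log-likelihood, which the following minimal sketch tries to make concrete. It is not the authors' code: the abstract does not state the software used, the data here are synthetic, and the predictor names, penalty strength `lam`, mixing weight `l1_ratio`, learning rate, and train/test split are all illustrative assumptions. It fits each model by proximal gradient descent and scores it with accuracy and AUC-ROC.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink toward zero by t.
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_penalized_logistic(X, y, lam=0.0, l1_ratio=0.0, lr=0.1, epochs=300):
    """Proximal gradient descent on the mean logistic loss plus
    lam * (l1_ratio * ||w||_1 + (1 - l1_ratio) / 2 * ||w||_2^2).
    lam=0 -> plain logistic; l1_ratio=1 -> Lasso; 0 -> Ridge; else Elastic net."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is left unpenalized
        for j in range(p):
            step = w[j] - lr * (grad_w[j] + lam * (1 - l1_ratio) * w[j])
            w[j] = soft_threshold(step, lr * lam * l1_ratio)
    return w, b

def predict_proba(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def accuracy(y, probs, threshold=0.5):
    # Share of cases whose thresholded prediction matches the label.
    return sum((p >= threshold) == bool(yi) for yi, p in zip(y, probs)) / len(y)

def auc_roc(y, probs):
    # Probability that a random positive outscores a random negative (ties = 0.5).
    pos = [p for yi, p in zip(y, probs) if yi == 1]
    neg = [p for yi, p in zip(y, probs) if yi == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in data (NOT the study data): three binary predictors,
# hypothetically "WHO stage III/IV", "cough in last 4 weeks", and a noise flag.
random.seed(0)
X, y = [], []
for _ in range(500):
    stage = int(random.random() < 0.3)
    cough = int(random.random() < 0.4)
    noise = int(random.random() < 0.5)
    prob_tb = sigmoid(-3.0 + 2.0 * stage + 0.85 * cough)
    X.append([stage, cough, noise])
    y.append(int(random.random() < prob_tb))
X_train, y_train, X_test, y_test = X[:350], y[:350], X[350:], y[350:]

# Illustrative hyperparameters only; in practice lam would be tuned.
models = {
    "Logistic": dict(lam=0.0, l1_ratio=0.0),
    "Lasso": dict(lam=0.01, l1_ratio=1.0),
    "Ridge": dict(lam=0.01, l1_ratio=0.0),
    "Elastic net": dict(lam=0.01, l1_ratio=0.5),
}
for name, params in models.items():
    w, b = fit_penalized_logistic(X_train, y_train, **params)
    probs = predict_proba(X_test, w, b)
    print(f"{name}: accuracy={accuracy(y_test, probs):.3f}, "
          f"AUC-ROC={auc_roc(y_test, probs):.3f}")
```

In this sketch the penalty strength is fixed by hand; a real analysis would normally choose it by cross-validation, and the features would be standardized if they were not all binary.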

Results: Of the 927 PLWH included in the study, 107 (12.6%) were diagnosed with TB. Being in WHO disease stage III/IV (aOR: 7.13; 95% CI: 3.86-13.33) and having a cough in the last 4 weeks (aOR: 2.34; 95% CI: 1.43-3.89) were significantly associated with TB. Logistic regression achieved an accuracy of 0.868 and an AUC-ROC of 0.744. Elastic net regression also showed good predictive performance, with accuracy and AUC-ROC values of 0.874 and 0.762, respectively.
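For readers relating the adjusted odds ratios to model coefficients, the standard correspondence is sketched below. It assumes the aORs come from a multivariable logistic regression and that the intervals are Wald-type; the abstract states neither explicitly, and the subscript labels are illustrative.

```latex
\[
\mathrm{aOR}_j = e^{\hat\beta_j},
\qquad
95\%\ \mathrm{CI} = \exp\!\bigl(\hat\beta_j \pm 1.96\,\widehat{\mathrm{SE}}(\hat\beta_j)\bigr)
\]
\[
\hat\beta_{\text{stage III/IV}} = \ln 7.13 \approx 1.964,
\qquad
\hat\beta_{\text{cough}} = \ln 2.34 \approx 0.850
\]
```

On the log-odds scale, then, WHO stage III/IV shifts the linear predictor by roughly 1.96 and recent cough by roughly 0.85, holding the other covariates fixed.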

Conclusions: Our results suggest that logistic regression, Lasso, Ridge regression, and Elastic net can all be effective methods for predicting TB disease in HIV patients. These findings may have important implications for the development of accurate and reliable models for TB prediction in HIV patients.

Keywords: HIV; Lasso; Ridge regression; cross-sectional study; logistic regression; machine learning; prediction; tuberculosis.

Publication types

  • Preprint