Validation and Improvement of a Machine Learning Model to Predict Interruptions in Antiretroviral Treatment in South Africa

J Acquir Immune Defic Syndr. 2023 Jan 1;92(1):42-49. doi: 10.1097/QAI.0000000000003108.

Abstract

Introduction: Machine learning algorithms are increasingly being used to inform HIV prevention and detection strategies. We validated and extended a previously developed machine learning model for patient retention on antiretroviral therapy in a new geographic catchment area in South Africa.

Methods: We compared the ability of an adaptive boosting algorithm to predict interruption in treatment (IIT) in 2 South African cohorts: one from the Free State and Mpumalanga provinces and one from the Gauteng and North West (GA/NW) provinces. We developed a novel set of predictive features for the GA/NW cohort using a categorical boosting model. We evaluated the ability of the model to predict IIT over all visits and across different periods within a patient's treatment trajectory.

Results: When predicting IIT, the GA/NW and Free State and Mpumalanga models demonstrated sensitivities of 60% and 61%, respectively, correctly predicting about 6 in 10 missed visits, with positive predictive values of 18% and 19%. Using predictive features generated from the GA/NW cohort, the categorical boosting model correctly predicted 22,119 of a total of 35,985 missed next visits, yielding a sensitivity of 62%, specificity of 67%, and positive predictive value of 20%. Model performance was highest when tested on visits within the first 6 months.
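
As a rough illustration of how the reported metrics relate to one another, the sketch below recomputes sensitivity from the two counts quoted above and uses Bayes' rule to link sensitivity, specificity, and positive predictive value. The function names are hypothetical and do not come from the study. The prevalence of missed visits is not reported in the abstract; the value computed here is back-calculated from rounded percentages and should be read only as an approximation. The ratio of the two quoted counts comes to about 0.61, in line with the reported sensitivity of roughly 62%.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual missed visits that the model flagged."""
    return true_pos / (true_pos + false_neg)


def ppv_from_rates(sens: float, spec: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule."""
    true_pos_rate = sens * prevalence
    false_pos_rate = (1.0 - spec) * (1.0 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


def implied_prevalence(sens: float, spec: float, ppv: float) -> float:
    """Solve the Bayes' rule expression above for prevalence."""
    fpr = 1.0 - spec
    return (ppv * fpr) / (sens * (1.0 - ppv) + ppv * fpr)


# Counts quoted in the Results: 22,119 correctly predicted
# out of 35,985 missed next visits.
missed_total = 35_985
correctly_predicted = 22_119
recall = sensitivity(correctly_predicted, missed_total - correctly_predicted)

# Assumption: the rounded percentages from the abstract are used as inputs,
# so the resulting prevalence is approximate and not a reported figure.
prevalence = implied_prevalence(sens=0.62, spec=0.67, ppv=0.20)

print(f"sensitivity from counts: {recall:.3f}")
print(f"implied share of visits that were missed: {prevalence:.3f}")
```

The back-calculated prevalence helps explain why the positive predictive value stays near 20% despite a sensitivity above 60%: when missed visits are a small share of all visits, even a moderately specific model flags many visits that are ultimately attended.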

Conclusions: Machine learning algorithms may be useful in informing tools to increase antiretroviral therapy patient retention and efficiency of HIV care interventions. This is particularly relevant in developing countries where health data systems are being strengthened to collect data on a scale that is large enough to apply novel analytical methods.

MeSH terms

  • HIV Infections* / drug therapy
  • Humans
  • Machine Learning
  • South Africa