Validation and Improvement of a Machine Learning Model to Predict Interruptions in Antiretroviral Treatment in South Africa

Rachel Esra; Jacques Carstens; Sue Le Roux; Tonderai Mabuto; Michael Eisenstein; Olivia Keiser; Erol Orel; Aziza Merzouki; Lucien De Voux; Mhari Maskew; Kieran Sharpey-Schafer

doi:10.1097/QAI.0000000000003108

J Acquir Immune Defic Syndr. 2023 Jan 1;92(1):42-49. doi: 10.1097/QAI.0000000000003108.

Affiliations

¹ University of Geneva, Institute of Global Health, Genève, Switzerland.
² Imperial College of London, United Kingdom.
³ Palindrome Data, Cape Town, South Africa.
⁴ The Aurum Institute, Parktown, Johannesburg, South Africa.
⁵ Health Economics and Epidemiology Research Office, Department of Internal Medicine, School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, South Africa.

PMID: 36194900

Abstract

Introduction: Machine learning algorithms are increasingly being used to inform HIV prevention and detection strategies. We validated and extended a previously developed machine learning model for patient retention on antiretroviral therapy in a new geographic catchment area in South Africa.

Methods: We compared the ability of an adaptive boosting algorithm to predict interruption in treatment (IIT) in 2 South African cohorts: one from the Free State and Mpumalanga provinces and one from the Gauteng and North West (GA/NW) provinces. We then developed a novel set of predictive features for the GA/NW cohort using a categorical boosting model. We evaluated the ability of the model to predict IIT over all visits and across different periods within a patient's treatment trajectory.
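
The per-period evaluation described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the record layout (months on treatment, predicted IIT, missed next visit), the period boundaries in `PERIODS`, and the toy records in `toy_visits` are all assumptions invented for this example and do not come from the study data.

```python
from collections import defaultdict

# Hypothetical period boundaries in months on ART (assumed for illustration).
PERIODS = [
    (0, 6, "0-6 months"),
    (6, 12, "6-12 months"),
    (12, float("inf"), "12+ months"),
]


def period_label(months_on_art):
    """Return the label of the period containing months_on_art."""
    for low, high, label in PERIODS:
        if low <= months_on_art < high:
            return label
    raise ValueError(f"months_on_art must be non-negative, got {months_on_art}")


def sensitivity_by_period(visits):
    """Compute sensitivity (share of missed visits flagged) per period.

    visits: iterable of (months_on_art, predicted_iit, missed_next_visit).
    Periods with no missed visits are left out of the result.
    """
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for months, predicted, missed in visits:
        if not missed:
            continue  # sensitivity only considers visits that were missed
        label = period_label(months)
        if predicted:
            true_pos[label] += 1
        else:
            false_neg[label] += 1
    result = {}
    for _, _, label in PERIODS:
        total = true_pos[label] + false_neg[label]
        if total > 0:
            result[label] = true_pos[label] / total
    return result


# Made-up toy records, NOT study data.
toy_visits = [
    (2, True, True),
    (4, True, True),
    (5, False, True),
    (3, True, False),
    (8, True, True),
    (10, False, True),
    (18, False, True),
    (24, False, False),
]

result = sensitivity_by_period(toy_visits)
for label, value in result.items():
    print(f"{label}: sensitivity = {value:.2f}")
```

Grouping by time on treatment before computing the metric keeps each period's score independent of how many visits fall in the other periods.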

Results: When predicting IIT, the GA/NW and Free State and Mpumalanga models demonstrated sensitivities of 60% and 61%, respectively, correctly predicting nearly two-thirds of all missed visits, with positive predictive values of 18% and 19%. Using predictive features generated from the GA/NW cohort, the categorical boosting model correctly predicted 22,119 of a total of 35,985 missed next visits, yielding a sensitivity of 62%, specificity of 67%, and positive predictive value of 20%. Model performance was highest when tested on visits within the first 6 months.
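
The relationship between the reported metrics can be illustrated with a short calculation. The sketch below uses only the counts and percentages stated in the Results; the function names are hypothetical, and the back-calculated share of visits that were missed is an estimate derived from rounded figures, not a number reported by the authors. Note that 22,119 of 35,985 works out to about 61.5%, within a percentage point of the reported 62%.

```python
def sensitivity(true_positives, false_negatives):
    """Share of actual missed visits that the model flagged."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(sens, spec, prevalence):
    """PPV from sensitivity, specificity, and prevalence (Bayes' rule)."""
    flagged_true = sens * prevalence
    flagged_false = (1 - spec) * (1 - prevalence)
    return flagged_true / (flagged_true + flagged_false)


def implied_prevalence(sens, spec, ppv):
    """Solve the PPV formula above for prevalence."""
    false_positive_rate = 1 - spec
    return (ppv * false_positive_rate) / (
        sens * (1 - ppv) + ppv * false_positive_rate
    )


# Counts stated in the Results for the categorical boosting model.
correctly_predicted = 22_119
total_missed = 35_985
sens_from_counts = sensitivity(
    correctly_predicted, total_missed - correctly_predicted
)

# Rounded percentages as reported; the prevalence below is therefore
# only an approximate, back-calculated estimate.
prevalence_estimate = implied_prevalence(sens=0.62, spec=0.67, ppv=0.20)
ppv_check = positive_predictive_value(0.62, 0.67, prevalence_estimate)

print(f"sensitivity from counts: {sens_from_counts:.3f}")
print(f"implied share of visits missed: {prevalence_estimate:.3f}")
print(f"PPV recomputed from that share: {ppv_check:.3f}")
```

Under these assumptions, roughly 1 in 9 visits would have been a missed visit, which helps explain why the positive predictive value stays near 20% even though the model catches most missed visits: when the outcome is uncommon, false positives from the much larger group of attended visits outnumber true positives.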

Conclusions: Machine learning algorithms may be useful in informing tools to increase antiretroviral therapy patient retention and efficiency of HIV care interventions. This is particularly relevant in developing countries where health data systems are being strengthened to collect data on a scale that is large enough to apply novel analytical methods.

MeSH terms

HIV Infections* / drug therapy
Humans
Machine Learning
South Africa

Grants and funding

PEPFAR/PEPFAR/United States