Adolescent HIV-related behavioural prediction using machine learning: a foundation for precision HIV prevention

AIDS. 2021 May 1;35(Suppl 1):S75-S84. doi: 10.1097/QAD.0000000000002867.

Abstract

Background: Precision prevention is increasingly important in HIV prevention research to move beyond universal interventions to those tailored for high-risk individuals. The current study was designed to develop machine learning algorithms for predicting adolescent HIV risk behaviours.

Methods: Comprehensive longitudinal data on adolescent risk behaviours, perceptions, peer and family influence, and neighbourhood risk factors were collected from 2564 grade-10 students at baseline, who were then followed for 24 months over 2008-2012. Machine learning techniques [support vector machine (SVM) and random forests] were applied to these longitudinal data to predict HIV risk behaviours. We focused on two adolescent risk behaviours: ever having had sex and having multiple sex partners. Twenty percent of the data were withheld for model testing.
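As a rough illustration of the data-partitioning step described in the Methods, the sketch below withholds 20% of records for testing and derives class weights of the kind commonly used for cost-sensitive learning. The abstract does not say whether the hold-out was stratified or how misclassification costs were set, so the stratification, the inverse-frequency weighting formula, the function names, and the toy labels are all assumptions made for illustration only.

```python
import random
from collections import Counter, defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test, holding out test_fraction of each class.

    Stratification is an assumption; the abstract only states that 20% of
    the data were withheld for testing.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    This inverse-frequency heuristic is one common way to make errors on the
    rarer class cost more; it is not necessarily the scheme used in the study.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical labels (not study data): 1 = reported the behaviour, 0 = did not.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_holdout(labels)
weights = balanced_class_weights([labels[i] for i in train_idx])
```

Weights like these could then be passed to a classifier that accepts per-class costs, so that missing an at-risk adolescent is penalised more heavily than a false alarm.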

Results: In predicting multiple sex partners, the SVM model with cost-sensitive learning achieved the highest sensitivity: 79.1%, with specificity of 75.4% and AUC of 0.86 on the training data (10-fold cross-validation), and sensitivity of 79.7%, specificity of 76.5% and AUC of 0.86 on the testing data. The random forest model performed best in predicting ever having had sex, with sensitivity of 78.5%, specificity of 73.1% and AUC of 0.84 on the training data, and sensitivity of 82.7%, specificity of 75.3% and AUC of 0.87 on the testing data.
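For readers less familiar with the metrics reported above, the following minimal sketch shows how sensitivity, specificity, and AUC can be computed from true labels and model outputs. The function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and are not taken from the study; AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case).

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1/0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    Tied scores count as half a correctly ranked pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the two kinds of figure are reported side by side.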

Conclusion: Machine learning methods can be used to build effective prediction models to identify adolescents who are likely to engage in HIV risk behaviours. This study builds a foundation for targeted intervention strategies and informs precision prevention efforts in school settings.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adolescent
  • Adolescent Behavior*
  • Algorithms
  • HIV Infections* / diagnosis
  • HIV Infections* / prevention & control
  • Humans
  • Machine Learning
  • Risk Factors