A Machine Learning Model for Evaluating Imported Disease Screening Strategies in Immigrant Populations

Am J Trop Med Hyg. 2021 Sep 20;105(5):1413-1419. doi: 10.4269/ajtmh.20-1443.

Abstract

Given the high prevalence of imported diseases in immigrant populations, the need has been postulated for screening programs that allow their early diagnosis and treatment. We present a mathematical model based on machine learning methodologies to contribute to the design of screening programs in this population. We conducted a retrospective cross-sectional study of imported disease screening in all immigrant patients who attended the Tropical Medicine Unit between January 2009 and December 2016. We designed a mathematical model based on machine learning methodologies to establish the set of most discriminatory prognostic variables for predicting the onset of HIV infection, malaria, chronic hepatitis B and C, schistosomiasis, and Chagas disease in the immigrant population. We analyzed 759 patients. HIV infection was predicted with an accuracy of 84.9%, and the number of screenings needed to detect the first HIV-infected person was 26, the same as for Chagas disease (predictive accuracy of 92.9%). For the other diseases, the average number of screenings needed to detect the first case was 12 for chronic hepatitis B (85.4%) and schistosomiasis (86.9%), 23 for hepatitis C (85.6%) and malaria (93.3%), and eight for syphilis (79.4%) and strongyloidiasis (88.4%). The use of machine learning methodologies allowed prediction of the expected disease burden and made it possible to pinpoint with greater precision those immigrants who are likely to benefit from screening programs, thus contributing effectively to the development and design of such programs.
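The abstract does not state how the "number of screenings to detect the first case" was calculated. As a rough illustration only, the sketch below assumes a simple geometric model, in which each screening is an independent trial with a constant probability p of finding a case, so that the expected number of screenings until the first case is 1/p. The function names and the geometric assumption are hypothetical and are not taken from the study; only the screening counts come from the abstract.

```python
# Illustrative sketch only: ASSUMES a geometric model (independent screenings,
# constant per-screening detection probability p). This is not the authors' method.

def implied_positive_rate(expected_screenings: float) -> float:
    """Per-screening detection probability implied by an expected wait of
    `expected_screenings` until the first case (geometric mean = 1/p)."""
    if expected_screenings < 1:
        raise ValueError("expected_screenings must be at least 1")
    return 1.0 / expected_screenings


def prob_first_case_within(p: float, k: int) -> float:
    """Probability that at least one case is found within the first k screenings."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - p) ** k


# Screening counts as reported in the abstract.
reported_screenings = {
    "HIV infection": 26,
    "Chagas disease": 26,
    "chronic hepatitis B": 12,
    "schistosomiasis": 12,
    "hepatitis C": 23,
    "malaria": 23,
    "syphilis": 8,
    "strongyloidiasis": 8,
}

for disease, n in reported_screenings.items():
    p = implied_positive_rate(n)
    within_n = prob_first_case_within(p, n)
    print(f"{disease}: implied p = {p:.3f}, "
          f"P(first case within {n} screenings) = {within_n:.3f}")
```

Under this assumption, a lower number of screenings corresponds to a higher implied detection rate among the people selected for screening, which is one way to read the comparison across diseases.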

MeSH terms

  • Adolescent
  • Adult
  • Africa
  • Aged
  • Aged, 80 and over
  • Asia
  • Central America
  • Child
  • Child, Preschool
  • Communicable Diseases, Imported / diagnosis*
  • Communicable Diseases, Imported / epidemiology
  • Cross-Sectional Studies
  • Early Diagnosis*
  • Emigrants and Immigrants / statistics & numerical data*
  • Female
  • Humans
  • Infant
  • Infant, Newborn
  • Machine Learning*
  • Male
  • Mass Screening / methods*
  • Mexico
  • Middle Aged
  • Models, Theoretical
  • Prevalence
  • Retrospective Studies
  • South America
  • Spain / epidemiology
  • Young Adult