Machine-Learning-Based Forecasting of Dengue Fever in Brazilian Cities Using Epidemiologic and Meteorological Variables

Am J Epidemiol. 2022 Sep 28;191(10):1803-1812. doi: 10.1093/aje/kwac090.

Abstract

Dengue is a serious public health concern in Brazil and globally. In the absence of a universal vaccine or specific treatments, prevention relies on vector control and disease surveillance. Accurate and early forecasts can help reduce the spread of the disease. In this study, we developed a model for predicting monthly dengue cases in Brazilian cities 1 month ahead, using data from 2007 to 2019. We compared different machine learning algorithms and feature selection methods using epidemiologic and meteorological variables. We found that different models worked best in different cities, and that a random forest model trained on monthly dengue cases performed best overall. It produced lower errors than a seasonal naive baseline model, gradient boosting regression, a feed-forward neural network, or support vector regression. For each city, we computed the mean absolute error between predictions and true monthly numbers of dengue cases on the test data set. The median error across all cities was 12.2 cases. This error was reduced to 11.9 when selecting the optimal combination of algorithm and input features for each city individually. Machine learning, and especially decision tree ensemble models, may contribute to dengue surveillance in Brazil, as they produce low out-of-sample prediction errors for a geographically diverse set of cities.
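The evaluation described in the abstract (a per-city mean absolute error on held-out months, summarized by the median across cities) can be illustrated with a minimal Python sketch. It scores only a seasonal naive baseline, which predicts each month from the value one season earlier; it does not reproduce the authors' random forest or their data. The function names, the `season_length` and `n_test` parameters, and the toy case counts are all assumptions made for illustration.

```python
from statistics import median


def seasonal_naive_forecast(history, season_length=12):
    """Predict the next month's cases as the value seen one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    return history[-season_length]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted counts."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def evaluate_city(series, n_test, season_length=12):
    """One-step-ahead seasonal naive forecasts over the last n_test months."""
    start = len(series) - n_test
    preds = [
        seasonal_naive_forecast(series[:i], season_length)
        for i in range(start, len(series))
    ]
    return mean_absolute_error(series[start:], preds)


def median_error_across_cities(city_series, n_test, season_length=12):
    """Median of the per-city MAE values (the summary used in the abstract)."""
    return median(
        evaluate_city(s, n_test, season_length) for s in city_series.values()
    )


# Hypothetical toy data: monthly case counts with a 3-month "season".
toy_cities = {
    "city_a": [10, 20, 30, 12, 18, 33],
    "city_b": [5, 5, 5, 5, 5, 5],
    "city_c": [0, 1, 2, 3, 4, 5],
}
summary = median_error_across_cities(toy_cities, n_test=3, season_length=3)
```

With real monthly data, `season_length` would be 12, and any candidate model could be scored by replacing `seasonal_naive_forecast` with its own one-month-ahead prediction while keeping the same error summary.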

Keywords: dengue; epidemiologic methods; feature selection; machine learning; prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Brazil / epidemiology
  • Cities / epidemiology
  • Dengue* / epidemiology
  • Dengue* / prevention & control
  • Forecasting
  • Humans
  • Machine Learning