Syndromic Surveillance Models Using Web Data: The Case of Influenza in Greece and Italy Using Google Trends

JMIR Public Health Surveill. 2017 Nov 20;3(4):e90. doi: 10.2196/publichealth.8015.

Abstract

Background: In recent years, data collected from search queries submitted via the Internet have been the subject of extensive discussion and research. Overall activity on the Internet has been shown to be related to the number of cases during an infectious disease outbreak.

Objective: The aim of the study was to establish a similar correlation between data from Google Trends and data collected by the official authorities of Greece and Europe by examining the development and spread of seasonal influenza in Greece and Italy.

Methods: We ran multiple regressions on influenza-related terms submitted to the Google search engine in Greece and Italy for 2011 and 2012 (104 weeks of sample data for each country). We then used the autoregressive integrated moving average (ARIMA) statistical model to determine the correlation between the Google search data and the real influenza cases confirmed by the aforementioned authorities. Two methods were used: (1) creating a flu score for Greece and (2) comparing the data with those of a neighboring country, Italy.
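The multiple regression step can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_ols` and `predict`, the two search terms, and all numbers below are hypothetical, and the case counts are constructed to be exactly linear so that the fit is easy to check. It regresses weekly case counts on the search volumes of several terms using ordinary least squares.

```python
def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    Solves the normal equations (X'X) b = X'y with Gauss-Jordan elimination.
    Returns [b0, b1, ..., bk], where b0 is the intercept.
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: pick the row with the largest entry in this column
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("search-term columns are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(k):
            if row != col:
                factor = A[row][col] / A[col][col]
                A[row] = [a - factor * b for a, b in zip(A[row], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, x):
    """Predicted case count for one week of search volumes x."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))


# Hypothetical weekly search volumes for two influenza-related terms (0-100 scale)
searches = [[10, 5], [20, 9], [35, 20], [60, 30], [80, 52], [55, 33], [25, 12]]
# Toy case counts, constructed as 4 + 2*term1 + 3*term2 (illustrative only)
cases = [2 * a + 3 * b + 4 for a, b in searches]

coefs = fit_ols(searches, cases)
print([round(c, 6) for c in coefs])
```

With real data, each column would hold the weekly volume of one search term over the 104 weeks, and the fitted coefficients would indicate how strongly each term tracks the confirmed cases.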

Results: The results showed a significant correlation that can help predict the spread and the peak of seasonal influenza using data from Google searches. The correlation for Greece for 2011 and 2012 was .909 and .831, respectively, and the correlation for Italy for 2011 and 2012 was .979 and .933, respectively. The prediction of the peak was quite precise, providing a forecast before the peak reached the population.
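As a rough illustration of the two quantities reported here, the sketch below computes a correlation coefficient between a search series and a case series, and the number of weeks by which the search peak precedes the case peak. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption; the helper names and both series are hypothetical and do not reproduce the study's data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def peak_week(series):
    """Index of the week with the highest value (first one on ties)."""
    return max(range(len(series)), key=series.__getitem__)


# Hypothetical weekly series (illustrative only, not the study's data)
search_index = [5, 12, 30, 70, 100, 60, 25, 10]
reported_cases = [3, 8, 20, 55, 90, 95, 40, 15]

r = pearson(search_index, reported_cases)
# Positive lead: searches peak this many weeks before confirmed cases
lead = peak_week(reported_cases) - peak_week(search_index)
print(f"r = {r:.3f}, search peak leads case peak by {lead} week(s)")
```

A positive lead means that search activity peaks before confirmed cases do, which is the property that would make such a signal useful for early warning.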

Conclusions: An Internet surveillance system based on Google searches can be created to track influenza in Greece and Italy.

Keywords: ARIMA; Google Trends; Web; syndromic surveillance; forecast; influenza; statistical correlation.