Forecasting smog-related health hazard based on social media and physical sensor

Inf Syst. 2017 Mar:64:281-291. doi: 10.1016/j.is.2016.03.011. Epub 2016 Apr 13.

Abstract

Smog disasters are becoming increasingly frequent and can have severe consequences for the environment and public health, especially in urban areas. Social media, as a real-time urban data source, has become an increasingly effective channel for observing people's reactions to smog-related health hazards. It can be used to capture possible smog-related public health disasters in their early stages. We propose a predictive analytic approach that uses both social media and physical sensor data to forecast the next-day smog-related health hazard. First, we model smog-related health hazards and smog severity by mining raw microblogging text and network information diffusion data. Second, we develop an artificial neural network (ANN)-based model that forecasts the smog-related health hazard from the current health hazard and smog severity observations. We evaluate the performance of the approach against alternative machine learning methods. To the best of our knowledge, we are the first to integrate social media and physical sensor data for smog-related health hazard forecasting. The empirical findings can help researchers better understand the non-linear relationships between current smog observations and the next-day health hazard. In addition, this forecasting approach can provide decision support for smog-related health hazard management through functions such as early warning.
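
As an illustration only, the next-day forecasting setup described above can be framed as a supervised learning problem: each day's health hazard and smog severity observations form the input, and the following day's health hazard is the target. The sketch below is not the authors' implementation. The function name make_next_day_pairs, the two-feature layout, and the sample values are all assumptions made for this example; the abstract does not specify how the indices are scaled or which features the ANN actually receives.

```python
from typing import List, Sequence, Tuple


def make_next_day_pairs(
    hazard: Sequence[float],
    severity: Sequence[float],
) -> Tuple[List[Tuple[float, float]], List[float]]:
    """Pair day t's (hazard, severity) observation with day t+1's hazard.

    Hypothetical helper: the feature layout is an assumption, not taken
    from the paper.
    """
    if len(hazard) != len(severity):
        raise ValueError("both series must cover the same days")
    # The last day has no following day, so it yields no training pair.
    n_pairs = max(len(hazard) - 1, 0)
    features = [(hazard[t], severity[t]) for t in range(n_pairs)]
    targets = [hazard[t + 1] for t in range(n_pairs)]
    return features, targets


# Made-up daily index values, for illustration only.
hazard_index = [0.10, 0.30, 0.55, 0.40]
severity_index = [0.20, 0.60, 0.80, 0.50]

X, y = make_next_day_pairs(hazard_index, severity_index)
```

Any regressor, including a small feed-forward neural network, could then be fitted on the resulting pairs and compared with other learners on held-out days.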

Keywords: Data mining; Forecasting; Health hazard; Smog disaster; Social media; Urban data.