An implementation of real-time air quality and influenza-like illness data storage and processing platform

Comput Human Behav. 2019 Nov:100:266-274. doi: 10.1016/j.chb.2018.10.009. Epub 2018 Oct 8.

Abstract

Air pollution has recently become a primary concern in Taiwan because it significantly affects people's health. Several air pollution monitoring, analysis, and prediction systems have been proposed to address the problem. However, very little research has examined whether air quality is associated with influenza-like illness (ILI). Such a study requires a system in which air quality data and ILI data can be analyzed together to determine their associations accurately and effectively. In this work, a novel integrated platform was implemented, consisting of a cluster environment based on Hadoop and Spark, a visualization environment built on the ELK Stack, and a backup storage system based on the Ceph object storage architecture. Sqoop and Alluxio were also used to address the inefficiency of processing vast amounts of data. The experimental results present visualizations of air quality and ILI data collected from 2016 to 2017 in Taichung, Taiwan. Association analyses between air quality and ILI are also presented and discussed.

Keywords: Air pollution; Alluxio; Association analysis; Ceph; Influenza-like illness.
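
Illustrative sketch: The abstract does not state which statistic the association analysis uses. As one plausible illustration only, the following Python sketch computes a Pearson correlation between a weekly pollutant series and a weekly ILI series, with an optional lag so that exposure in week t is compared with ILI counts in week t + lag. The function names, the choice of Pearson correlation, the lag parameter, and the sample values are all assumptions made for this example; the numbers are synthetic placeholders, not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(pollutant, ili, lag=0):
    """Correlate pollutant at week t with ILI at week t + lag (lag >= 0).

    Hypothetical helper: the lag structure is an assumption of this sketch.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(pollutant, ili)
    # Drop the last `lag` pollutant weeks and the first `lag` ILI weeks
    # so that the two series stay aligned.
    return pearson(pollutant[:-lag], ili[lag:])


if __name__ == "__main__":
    # Synthetic placeholder series, NOT measurements from the study.
    pm25_weekly = [10, 20, 30, 40, 50, 60]
    ili_weekly = [3, 1, 2, 3, 4, 5]
    for lag in (0, 1):
        r = lagged_correlation(pm25_weekly, ili_weekly, lag)
        print(f"lag={lag} week(s): r={r:.3f}")
```

On a platform like the one described, a computation of this kind would more likely run as a Spark job over the stored data; the pure-Python form is used here only to keep the example self-contained. A correlation of this sort indicates association, not causation.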