Investigating the Association Between Sociodemographic Factors and Lung Cancer Risk Using Cyber Informatics

IEEE EMBS Int Conf Biomed Health Inform. 2016 Feb:2016:557-560. doi: 10.1109/BHI.2016.7455958. Epub 2016 Apr 21.

Abstract

Openly available online sources can be valuable for conducting in silico case-control epidemiological studies. Such studies must adjust for confounding factors to isolate the association between the factor under observation and the disease. However, the information needed for this adjustment is not always readily available online. This paper proposes natural language processing methods for extracting sociodemographic information from openly available online content. The feasibility of the proposed method is demonstrated with a case-control study of the association between age, gender, and income level and lung cancer risk. The study shows a stronger association of older age and lower socioeconomic status with higher lung cancer risk, which is consistent with findings reported in traditional cancer epidemiology studies.
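To illustrate what adjusting for a confounder means in a case-control setting, the sketch below compares a crude odds ratio with a Mantel-Haenszel odds ratio pooled across strata of a confounder (for example, age groups). The abstract does not state which adjustment technique the authors used, so this is a generic textbook estimator, not the paper's method. The function names and all counts are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# One 2x2 table per confounder stratum, laid out as
# (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls).
Stratum = Tuple[int, int, int, int]


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Crude odds ratio for a single 2x2 table: (a * d) / (b * c)."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata: List[Stratum]) -> float:
    """Mantel-Haenszel pooled odds ratio across confounder strata.

    OR_MH = sum(a_i * d_i / n_i) / sum(b_i * c_i / n_i),
    where n_i is the total number of subjects in stratum i.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts: exposure = low income, strata = two age groups.
    strata = [(10, 20, 5, 40), (30, 10, 20, 20)]

    # Crude estimate ignores the strata by collapsing them into one table.
    a, b, c, d = (sum(col) for col in zip(*strata))
    print("crude OR:", odds_ratio(a, b, c, d))
    print("age-adjusted (Mantel-Haenszel) OR:", mantel_haenszel_or(strata))
```

A gap between the crude and pooled estimates suggests that the stratifying variable confounds the exposure-disease association, which is why sociodemographic attributes such as age need to be recovered before the association of interest can be isolated.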