The Use of Multiple Correspondence Analysis to Explore Associations Between Categories of Qualitative Variables and Cancer Incidence

IEEE J Biomed Health Inform. 2021 Sep;25(9):3659-3667. doi: 10.1109/JBHI.2021.3073605. Epub 2021 Sep 3.

Abstract

Background: Previous studies have shown that risk factors for some kinds of cancer depend on people's lifestyle (e.g. rural or urban residence). This article examines that dependence by seeking relationships among cancer type, age group, gender and population (rural or urban) in the region of Lleida (Catalonia, Spain) using Multiple Correspondence Analysis (MCA).

Methods: The dataset analysed comprised 3408 cancer episodes recorded between 2012 and 2014, extracted from the Population-based Cancer Registry (PCR) for Lleida province. The cancers studied were colon and rectal (1059 cases), lung (551 cases), urinary bladder (446 cases), prostate (609 cases) and breast (743 cases). MCA was applied to search for relationships among the main qualitative features. The basic statistics were the percentage of explained variance, the inertia and the contribution of each qualitative variable.
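To make the three statistics named above concrete, the following is a minimal sketch of one standard way to compute MCA: correspondence analysis of the indicator (one-hot) matrix via a singular value decomposition. It is not the authors' implementation. The function name `mca`, the variable layout and the eight toy records are hypothetical and invented purely for illustration; they are not drawn from the Lleida registry.

```python
import numpy as np


def mca(records):
    """Correspondence analysis of the indicator matrix of categorical records.

    records: list of equal-length tuples of strings, one tuple per case.
    Returns (categories, inertia, explained, contributions, col_coords).
    """
    n_vars = len(records[0])
    # One indicator column per (variable index, category value) pair.
    categories = sorted({(j, row[j]) for row in records for j in range(n_vars)})
    col_index = {cat: k for k, cat in enumerate(categories)}

    Z = np.zeros((len(records), len(categories)))
    for i, row in enumerate(records):
        for j, value in enumerate(row):
            Z[i, col_index[(j, value)]] = 1.0

    P = Z / Z.sum()        # correspondence matrix
    r = P.sum(axis=1)      # row masses
    c = P.sum(axis=0)      # column (category) masses
    # Standardized residuals from the independence model r * c.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

    _, sing, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = sing ** 2                  # principal inertia of each axis
    keep = inertia > 1e-12               # drop numerically null axes
    inertia, sing, Vt = inertia[keep], sing[keep], Vt[keep]

    explained = inertia / inertia.sum()  # share of total inertia per axis
    contributions = Vt.T ** 2            # category contributions; each column sums to 1
    col_coords = (Vt.T * sing) / np.sqrt(c)[:, None]  # principal coordinates
    return categories, inertia, explained, contributions, col_coords


# Hypothetical toy records: (cancer, sex, residence, age group).
records = [
    ("lung", "male", "urban", "70-79"),
    ("lung", "male", "urban", "<60"),
    ("bladder", "male", "urban", "70-79"),
    ("colorectal", "female", "rural", ">=80"),
    ("colorectal", "male", "rural", "<60"),
    ("breast", "female", "urban", "70-79"),
    ("breast", "female", "rural", ">=80"),
    ("prostate", "male", "rural", "<60"),
]

categories, inertia, explained, contributions, col_coords = mca(records)
for (var, value), share in zip(categories, contributions[:, 0]):
    print(f"variable {var}, category {value}: contribution to axis 1 = {share:.3f}")
```

In this formulation the total inertia of an indicator matrix depends only on the number of categories J and variables Q, namely J/Q - 1, which is why explained-inertia percentages in MCA tend to look low. A per-variable contribution, as reported in the abstract, could be obtained by summing the category contributions that belong to the same variable.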

Results: General outcomes showed a low and a moderate contribution of living in rural areas to colorectal and male prostate cancer, respectively. Males in urban areas were slightly and heavily affected by lung and urinary bladder cancer, respectively. The analysis of each cancer provided additional information. Colorectal cancer greatly affected males aged <60, urban residents aged 70-79, and rural females aged ≥ 80. The impact of lung cancer was high among urban females aged <60, moderate among males aged 70-79 and high among rural females aged ≥ 80. The results for urinary bladder cancer were similar to those for lung cancer. Prostate cancer affected both the <60 and ≥ 80 age groups significantly in rural areas. Breast cancer hit the 70-79 group significantly and, somewhat less so, rural females aged ≥ 80.

Conclusions: MCA helped considerably in detecting the contributions of qualitative variables and the associations between them, and it proved to be an effective technique for analyzing the incidence of cancer. The outcomes obtained help to corroborate suspected trends, and to detect and stimulate new hypotheses about the risk factors associated with a specific area and cancer. These findings will be helpful for encouraging new studies and prevention campaigns that highlight the observed singularities.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Incidence
  • Lung Neoplasms*
  • Male
  • Prostatic Neoplasms*
  • Risk Factors
  • Rural Population