Multivariate analysis in data science for the geospatial distribution of the breast cancer mortality rate in Colombia

Front Oncol. 2023 Jan 6:12:1055655. doi: 10.3389/fonc.2022.1055655. eCollection 2022.

Abstract

This research, framed in the area of biomathematics, helps epidemiological surveillance entities in Colombia clarify how the breast cancer mortality rate (BCM) is spatially distributed in relation to the forest area index (FA) and the circulating vehicle index (CV). In this regard, the World Health Organization has highlighted how little knowledge has been generated relating mortality from tumor diseases to environmental factors. Quantitative methods based on geospatial data science are applied to cross-sectional information from the 2018 census; the BCM in Colombia is found not to be randomly distributed in space but to follow cluster aggregation patterns. Under multivariate modeling methods, the research provides sufficient statistical evidence not to reject the hypothesis that a spatial unit with high FA and low CV has a significant advantage in terms of lower BCM.
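The claim that a rate is not randomly distributed in space is commonly checked with a global spatial autocorrelation statistic. The abstract does not name the statistic used, so the following is only a minimal sketch that assumes global Moran's I with a permutation test. The function names (`morans_i`, `rook_weights`, `permutation_p_value`) and the 4 x 4 toy grid are hypothetical illustrations, not the study's data or code.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for a 1-D array of values and an (n, n) spatial weights matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                      # deviations from the mean
    num = (w * np.outer(z, z)).sum()      # weighted cross-products of neighbours
    den = (z ** 2).sum()
    return (x.size / w.sum()) * num / den


def rook_weights(n_rows, n_cols):
    """Binary rook-contiguity weights for a regular grid (hypothetical helper)."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:            # neighbour to the right
                w[i, i + 1] = w[i + 1, i] = 1.0
            if r + 1 < n_rows:            # neighbour below
                w[i, i + n_cols] = w[i + n_cols, i] = 1.0
    return w


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive autocorrelation via random relabelling."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, weights)
    sims = np.array(
        [morans_i(rng.permutation(values), weights) for _ in range(n_perm)]
    )
    p = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p


# Toy data (assumption): high rates in the two left columns, low rates in the right.
w = rook_weights(4, 4)
clustered = np.array([10, 10, 1, 1] * 4, dtype=float)
# Alternating high/low values, the opposite extreme of clustering.
checkerboard = np.array(
    [10 if (r + c) % 2 == 0 else 1 for r in range(4) for c in range(4)],
    dtype=float,
)

i_clustered, p_clustered = permutation_p_value(clustered, w)
i_checker = morans_i(checkerboard, w)
print(f"clustered:    I = {i_clustered:.3f}, pseudo p = {p_clustered:.3f}")
print(f"checkerboard: I = {i_checker:.3f}")
```

On this toy grid the clustered layout gives I of about 0.667 and the checkerboard gives I = -1. A value well above the permutation distribution points to cluster aggregation, whereas values near -1/(n-1) are what spatial randomness would produce.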

Keywords: breast cancer; data science; georeferencing; spatial clusters; spatial distribution.

Grants and funding

This research has been supported by DICYT (Scientific and Technological Research Bureau) of the University of Santiago of Chile (USACH) and the Department of Industrial Engineering. This research was supported in part by the National Fund for Scientific and Technological Development (FONDECYT, Chile), grant no. 11200993 (MV).