Spatial distribution of esophageal cancer mortality in China: a machine learning approach

Int Health. 2021 Jan 14;13(1):70-79. doi: 10.1093/inthealth/ihaa022.

Abstract

Background: Esophageal cancer (EC) is one of the most common cancers and causes many deaths worldwide every year. Accurate estimates of the spatial distribution of EC are essential for effective cancer prevention.

Methods: EC mortality surveillance data covering 964 surveyed counties in China in 2014 and three classes of auxiliary data, including physical condition, living habits and living environment data, were collected. Genetic programming (GP), a hierarchical Bayesian model and sandwich estimation were used to estimate the spatial distribution of female EC mortality. Finally, we evaluated the accuracy of the three mapping methods.

Results: GP had the lowest root mean square error (RMSE) at 5.894, compared with 6.546 for the hierarchical Bayesian model and 7.611 for sandwich estimation. According to the distribution estimated by GP, female EC mortality was low in some regions of Northeast China, Northwest China and southern China, and relatively high in some regions downstream of the Yellow River Basin, north of the Yangtze River in the Yangtze River Basin and in Southwest China.
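
The accuracy comparison above rests on RMSE, which can be illustrated with a minimal sketch of the standard definition (the square root of the mean squared difference between observed and estimated values). The abstract does not describe the validation procedure, so the function name `rmse` and the sample values below are assumptions made purely for illustration and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical county-level mortality rates, invented for illustration only.
observed = [10.0, 12.0, 8.0, 15.0]
predicted = [11.0, 10.0, 9.0, 13.0]
score = rmse(observed, predicted)
print(f"RMSE = {score:.3f}")
```

A lower RMSE means the estimates sit closer to the observed values on average, which is the sense in which GP (5.894) outperformed the hierarchical Bayesian model (6.546) and sandwich estimation (7.611).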

Conclusions: This paper provides an accurate map of female EC mortality in China. A series of targeted preventive measures can be proposed based on the spatial disparities displayed on the map.

Keywords: cancer mapping; esophageal cancer; genetic programming; prevention and control; spatial distribution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • China / epidemiology
  • Esophageal Neoplasms*
  • Female
  • Humans
  • Machine Learning
  • Rivers*