Modeling age-specific cancer incidences using logistic growth equations: implications for data collection

Asian Pac J Cancer Prev. 2014;15(22):9731-7. doi: 10.7314/apjcp.2014.15.22.9731.

Abstract

Large-scale secular registry or surveillance systems have been accumulating vast data that allow mathematical modeling of cancer incidence and mortality rates. Most contemporary models in this regard use time series and APC (age-period-cohort) methods and focus primarily on predicting or analyzing cancer epidemiology, with little attention paid to implications for designing cancer registry, surveillance or evaluation initiatives. This research models age-specific cancer incidence rates using logistic growth equations and explores their performance under different scenarios of data completeness, in the hope of deriving clues for reshaping relevant data collection. The study used the China Cancer Registry Report 2012 as the data source. It employed 3-parameter logistic growth equations and modeled the age-specific incidence rates of all cancers combined and of the top 10 cancers presented in the registry report. The study performed 3 types of modeling, namely full age-span fitting, multiple 5-year-segment fitting and single-segment fitting. Model performance was measured with an adjusted goodness of fit that combines the sum of squared residuals and relative errors. Both model simulation and performance evaluation used self-developed algorithms programmed in the C# language with MS Visual Studio 2008. For models built upon full age-span data, predicted age-specific cancer incidence rates fitted very well with observed values for most cancers (all except cervical and breast cancer), with estimated goodness of fit (Rs) over 0.96. For any given cancer, the R value of the logistic growth model derived from observed data on urban residents was greater than or at least equal to that of the same model built on data from rural residents. For models based on multiple 5-year-segment data, the Rs remained fairly high (over 0.89) until three-fourths of the data segments were excluded. For models using a fixed-length single segment of observed data, the older the age covered by the corresponding data segment, the higher the resulting Rs. Logistic growth models describe age-specific incidence rates perfectly for most cancers and may be used to inform data collection for purposes of monitoring and analyzing cancer epidemics. Helped by appropriate logistic growth equations, the work volume of contemporary data collection, e.g., cancer registry and surveillance systems, may be reduced substantially.
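
As a rough illustration of the full age-span and multiple-segment fitting described above, the following Python sketch fits a 3-parameter logistic curve to age-specific rates. It is not the authors' C# implementation. The parameterization k / (1 + exp(-r (age - a0))), the grid-search fitting procedure, the plain R-squared measure (used here in place of the study's adjusted goodness of fit, whose formula the abstract does not give) and all numbers are assumptions made for illustration. The rates are synthetic, not registry data.

```python
import math

def logistic(age, k, r, a0):
    """Assumed 3-parameter form: plateau k, growth rate r, midpoint age a0."""
    return k / (1.0 + math.exp(-r * (age - a0)))

def fit_logistic(ages, rates, r_grid, a0_grid):
    """Grid search over (r, a0); for each pair the best k is closed-form least squares."""
    best = None
    for r in r_grid:
        for a0 in a0_grid:
            g = [logistic(a, 1.0, r, a0) for a in ages]
            k = sum(y * gi for y, gi in zip(rates, g)) / sum(gi * gi for gi in g)
            ssr = sum((y - k * gi) ** 2 for y, gi in zip(rates, g))
            if best is None or ssr < best[0]:
                best = (ssr, k, r, a0)
    return best[1], best[2], best[3]

def r_squared(ages, rates, params):
    """Plain coefficient of determination (not the paper's adjusted measure)."""
    mean = sum(rates) / len(rates)
    ss_tot = sum((y - mean) ** 2 for y in rates)
    ss_res = sum((y - logistic(a, *params)) ** 2 for a, y in zip(ages, rates))
    return 1.0 - ss_res / ss_tot

# Synthetic example: midpoints of 18 five-year age groups (2.5, 7.5, ..., 87.5).
ages = [2.5 + 5 * i for i in range(18)]
TRUE_PARAMS = (1500.0, 0.12, 62.0)  # hypothetical k (per 100,000), r, a0
# Deterministic +/-3% alternating perturbation stands in for observation noise.
rates = [logistic(a, *TRUE_PARAMS) * (1 + 0.03 * (-1) ** i)
         for i, a in enumerate(ages)]

r_grid = [0.02 + 0.01 * i for i in range(29)]   # 0.02 .. 0.30
a0_grid = [30.0 + 1.0 * i for i in range(61)]   # 30 .. 90

# Full age-span fitting: use all 18 segments.
full_params = fit_logistic(ages, rates, r_grid, a0_grid)
r2_full = r_squared(ages, rates, full_params)

# Multiple-segment fitting: keep every fourth segment, then score on all data.
sparse_params = fit_logistic(ages[1::4], rates[1::4], r_grid, a0_grid)
r2_sparse = r_squared(ages, rates, sparse_params)

print("full fit:  ", full_params, round(r2_full, 4))
print("sparse fit:", sparse_params, round(r2_sparse, 4))
```

Scoring the sparse fit against all segments, rather than only the segments used for fitting, mirrors the question the abstract raises: how much of the age-specific curve can be recovered when most segments are never collected. A real analysis would likely replace the grid search with a nonlinear least-squares routine and use the registry's observed rates.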

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Age Factors
  • Aged
  • Aged, 80 and over
  • Algorithms*
  • Child
  • Child, Preschool
  • China / epidemiology
  • Data Collection / methods*
  • Female
  • Humans
  • Incidence
  • Infant
  • Infant, Newborn
  • Male
  • Middle Aged
  • Models, Statistical*
  • Neoplasms / epidemiology*
  • Registries*
  • Young Adult