Obtaining long-term stage-specific relative survival estimates in the presence of incomplete historical stage information

Br J Cancer. 2022 Oct;127(6):1061-1068. doi: 10.1038/s41416-022-01866-8. Epub 2022 Jun 17.

Abstract

Background: Recording of cancer stage at diagnosis has often been incomplete historically in cancer registries, making it challenging to provide long-term stage-specific survival estimates. Because stage-specific survival differences are driven by differences in short-term prognosis, survival metrics estimated using period analysis are unlikely to be sensitive to imputed historical stage data.

Methods: We used data from the Surveillance, Epidemiology, and End Results (SEER) Program for lung, colon and breast cancer. To represent missing data patterns in less complete registry data, we artificially inflated the proportion of missing stage information conditional on stage at diagnosis and calendar year of diagnosis. Period analysis was applied, and missing stage at diagnosis was imputed under four different conditions chosen to emulate extreme imputed stage distributions.
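
The step of inflating missingness conditional on stage and calendar year can be sketched as follows. This is a minimal illustration and not the authors' code: the stage labels, the cutoff year, the deletion probabilities and the function names are all hypothetical, chosen only to show stage being set to missing with a probability that depends on the true stage and the year of diagnosis.

```python
import random

# Hypothetical probabilities of deleting a recorded stage, keyed by
# (stage, era). Illustrative values only, not taken from the paper.
MISSING_PROB = {
    ("localised", "early"): 0.6,
    ("regional", "early"): 0.7,
    ("distant", "early"): 0.8,
    ("localised", "late"): 0.1,
    ("regional", "late"): 0.1,
    ("distant", "late"): 0.2,
}


def era(year, cutoff=2004):
    """Classify a diagnosis year as 'early' or 'late' (cutoff is an assumption)."""
    return "early" if year < cutoff else "late"


def inflate_missing_stage(records, seed=0):
    """Return copies of the records with stage set to None at a rate
    that depends on the true stage and the calendar era of diagnosis."""
    rng = random.Random(seed)
    out = []
    for rec in records:
        new = dict(rec)  # copy so the complete data stay untouched
        if rec["stage"] is not None:
            p = MISSING_PROB[(rec["stage"], era(rec["year"]))]
            if rng.random() < p:
                new["stage"] = None
        out.append(new)
    return out


# Synthetic example data: diagnosis years 1995-2014, three stages.
stages = ["localised", "regional", "distant"]
records = [{"year": 1995 + i % 20, "stage": stages[i % 3]} for i in range(3000)]
masked = inflate_missing_stage(records)
```

Keeping the complete records alongside the masked copy makes it possible to compare estimates from imputed data against those from the fully observed data.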

Results: We fit a flexible parametric model for each cancer stage on the excess hazard scale and the differences in stage-specific marginal relative survival were assessed. Estimates were also obtained from non-parametric approaches for validation. There was little difference between the 10-year stage-specific marginal relative survival estimates, regardless of the assumed historical stage distribution.
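
For readers unfamiliar with the quantities named above, the standard definitions can be written as follows. The notation is an assumption introduced here, not taken from the abstract: h is the all-cause hazard, h* the expected (population) hazard, λ the excess hazard, S and S* the corresponding all-cause and expected survival functions, and x_i the covariate pattern of patient i among N patients in a given stage group.

```latex
\begin{align}
  h(t \mid x) &= h^{*}(t \mid x) + \lambda(t \mid x) \\
  R(t \mid x) &= \frac{S(t \mid x)}{S^{*}(t \mid x)}
               = \exp\!\left( -\int_{0}^{t} \lambda(u \mid x)\, du \right) \\
  R_{m}(t)    &= \frac{1}{N} \sum_{i=1}^{N} R(t \mid x_{i})
\end{align}
```

Under these definitions, marginal relative survival is the average of the individual model-based relative survival curves over the covariate distribution of the group of interest.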

Conclusions: When conducting a period analysis, multiple imputation can be used to obtain stage-specific long-term estimates of relative survival, even when the historical stage information is largely incomplete.
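
As a generic illustration of how estimates from multiply imputed datasets are commonly combined, the sketch below applies Rubin's rules. The abstract does not state which pooling procedure was used, so this is an assumption; the function name and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine M point estimates and their variances using Rubin's rules.

    Returns the pooled estimate and its total variance,
    T = W + (1 + 1/M) * B, where W is the mean within-imputation
    variance and B is the between-imputation variance.
    """
    m = len(estimates)
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = w + (1 + 1 / m) * b
    return q_bar, total


# Hypothetical estimates from three imputed datasets.
pooled, total_var = pool_rubin([0.50, 0.52, 0.48], [0.0004, 0.0005, 0.0004])
```

The inputs can be on whatever scale the analyst chooses for pooling; the function only assumes one estimate and one variance per imputed dataset.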

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms*
  • Female
  • Humans
  • Neoplasm Staging
  • Prognosis
  • Registries
  • SEER Program
  • Survival Analysis