Characteristics of The Cancer Genome Atlas cases relative to U.S. general population cancer cases

Br J Cancer. 2018 Oct;119(7):885-892. doi: 10.1038/s41416-018-0140-8. Epub 2018 Aug 21.

Abstract

Background: Despite anecdotal reports of differences in clinical and demographic characteristics between The Cancer Genome Atlas (TCGA) cases and general population cancer cases, these differences have not been systematically evaluated.

Methods: Data from 11,160 cases with 33 cancer types were obtained from the TCGA data portal. Corresponding data from the Surveillance, Epidemiology, and End Results (SEER) 18 and North American Association of Central Cancer Registries databases were obtained. Characteristics were compared using Student's t, chi-square, and Fisher's exact tests. Differences in mean survival months were assessed using restricted mean survival time analysis and a generalised linear model.
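
As an illustration of the restricted mean survival time (RMST) idea named in the Methods, the following is a minimal sketch and not the authors' code. It assumes RMST is taken as the area under a Kaplan-Meier survival curve from time 0 to a truncation point tau (12 months here, matching the end point in the Results). The function names `kaplan_meier` and `rmst` and the follow-up data are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival) pairs at each distinct time with at least one event.

    events[i] is 1 if the case died at times[i], 0 if it was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all cases that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def rmst(times: List[float], events: List[int], tau: float) -> float:
    """Area under the Kaplan-Meier step function from 0 to tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Final rectangle from the last drop before tau up to tau.
    area += prev_s * (tau - prev_t)
    return area


# Hypothetical follow-up in months (1 = death, 0 = censored).
months = [3, 6, 9, 12, 15]
died = [1, 0, 1, 1, 0]
print(rmst(months, died, tau=12.0))
```

A difference in mean survival months between two groups would then be the difference of their two RMST values at the same tau; the variance estimation and the generalised linear model used in the study are not reproduced here.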

Results: TCGA cases were 3.9 years (95% CI 1.7-6.2) younger on average than SEER cases, with a significantly younger mean age for 20/33 cancer types. Although most cancer types had a similar sex distribution, race and stage at diagnosis distributions were disproportional for 13/18 and 25/26 assessed cancer types, respectively. Using 12 months as an end point, the observed mean survival months were longer for 27 of 33 TCGA cancer types.

Conclusions: Differences exist in the characteristics of TCGA vs. general population cancer cases. Our study highlights population subgroups where increased sample collection is warranted to increase the applicability of cancer genomic research results to all individuals.

Publication types

  • Comparative Study

MeSH terms

  • Age of Onset
  • Databases, Factual*
  • Databases, Genetic
  • Female
  • Humans
  • Male
  • Neoplasms / epidemiology*
  • Neoplasms / genetics
  • Registries
  • SEER Program
  • Sex Distribution
  • Survival Analysis
  • United States / epidemiology