Population Substructure Has Implications in Validating Next-Generation Cancer Genomics Studies with TCGA

Marina D Miller; Eric J Devor; Erin A Salinas; Andreea M Newtson; Michael J Goodheart; Kimberly K Leslie; Jesus Gonzalez-Bosquet

doi:10.3390/ijms20051192

Int J Mol Sci. 2019 Mar 8;20(5):1192. doi: 10.3390/ijms20051192.

Authors

Marina D Miller¹, Eric J Devor²,³, Erin A Salinas⁴, Andreea M Newtson⁵, Michael J Goodheart⁶,⁷, Kimberly K Leslie⁸,⁹, Jesus Gonzalez-Bosquet¹⁰,¹¹

Affiliations

¹ Department of Obstetrics and Gynecology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. marina-miller@uiowa.edu.
² Department of Obstetrics and Gynecology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. eric-devor@uiowa.edu.
³ Holden Comprehensive Cancer Center, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. eric-devor@uiowa.edu.
⁴ Compass Oncology, Portland, OR 97227, USA. Erin.Salinas@compassoncology.com.
⁵ Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. andreea-newtson@uiowa.edu.
⁶ Holden Comprehensive Cancer Center, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. michael-goodheart@uiowa.edu.
⁷ Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. michael-goodheart@uiowa.edu.
⁸ Department of Obstetrics and Gynecology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. kimberly-leslie@uiowa.edu.
⁹ Holden Comprehensive Cancer Center, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. kimberly-leslie@uiowa.edu.
¹⁰ Holden Comprehensive Cancer Center, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. jesus-gonzalezbosquet@uiowa.edu.
¹¹ Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA. jesus-gonzalezbosquet@uiowa.edu.

Abstract

In the era of large genetic and genomic datasets, it has become crucial to validate the results of individual studies using data from publicly available sources, such as The Cancer Genome Atlas (TCGA). However, how generalizable are results from either an independent or a large public dataset to the remainder of the population? The study presented here aims to answer that question. Using next-generation sequencing data from endometrial and ovarian cancer patients from both the University of Iowa and TCGA, the genomic admixture of each population was analyzed with the STRUCTURE and ADMIXTURE software. In our independent data set, one subpopulation was identified, whereas in TCGA 4 to 6 subpopulations were identified. These data demonstrate that the genetic substructures of the TCGA and University of Iowa populations differ substantially. Validation of genomic studies across two different population samples must therefore recognize, account for, and correct for background genetic substructure.

Keywords: The Cancer Genome Atlas; endometrial cancer; genetic admixture; ovarian cancer; population substructure.
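
The contrast the abstract draws between one subpopulation and several can be illustrated with a minimal sketch. It assumes a table of per-individual ancestry proportions of the kind admixture tools report, with one row per individual and one column per inferred cluster. The function names, the 0.5 threshold, and the example values are all hypothetical and are not taken from the study or from either software package.

```python
from collections import Counter


def dominant_clusters(q_rows, min_share=0.5):
    """Label each individual with the index of its largest ancestry
    proportion, or None when no cluster reaches min_share (hypothetical
    cutoff for calling an individual 'admixed')."""
    labels = []
    for row in q_rows:
        best = max(range(len(row)), key=lambda k: row[k])
        labels.append(best if row[best] >= min_share else None)
    return labels


def count_subpopulations(q_rows, min_share=0.5, min_members=1):
    """Count clusters that are dominant in at least min_members individuals."""
    labels = dominant_clusters(q_rows, min_share)
    counts = Counter(label for label in labels if label is not None)
    return sum(1 for n in counts.values() if n >= min_members)


# Invented example data, not results from the paper.
homogeneous = [
    [0.97, 0.02, 0.01],
    [0.95, 0.03, 0.02],
    [0.99, 0.01, 0.00],
]
mixed = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90],
    [0.40, 0.30, 0.30],  # no cluster reaches the threshold
]

print(count_subpopulations(homogeneous))
print(count_subpopulations(mixed))
print(dominant_clusters(mixed))
```

A cohort like `homogeneous` collapses to a single dominant cluster, while one like `mixed` spreads across several, which is the kind of mismatch that would need to be accounted for before using one cohort to validate findings from the other.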

MeSH terms

Databases, Genetic
Endometrial Neoplasms / genetics*
Female
Genome, Human
Genomics / methods*
High-Throughput Nucleotide Sequencing / methods
Humans
Middle Aged
Ovarian Neoplasms / genetics*
Software
