A linear mixed model approach to gene expression-tumor aneuploidy association studies

Sci Rep. 2019 Aug 16;9(1):11944. doi: 10.1038/s41598-019-48302-1.

Abstract

Aneuploidy, defined as abnormal chromosome number or somatic DNA copy number, is a characteristic of many aggressive tumors and is thought to drive tumorigenesis. Gene expression-aneuploidy association studies have previously been conducted to explore cellular mechanisms associated with aneuploidy. However, in an observational setting, gene expression is influenced by many factors that can act as confounders between gene expression and aneuploidy, leading to spurious correlations between the two variables. These factors include known confounders such as sample purity or batch effects, as well as gene co-regulation, which induces correlations between the expression of causal genes and non-causal genes. We use a linear mixed-effects model (LMM) to account for the confounding effects of tumor purity and gene co-regulation on gene expression-aneuploidy associations. Applied to patient tumor data across diverse tumor types, the LMM both accounts for the impact of purity on aneuploidy measurements and identifies a new association between histone gene expression and aneuploidy.
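
To make the idea of an LMM-based association test concrete, the following is a minimal sketch on simulated data. It is not the authors' implementation: the abstract does not give the model specification, so the formulation here (aneuploidy score as the response, purity and one gene's expression as fixed effects, and a random effect whose sample-by-sample covariance K is built from the expression of the remaining genes) is an assumption borrowed from the way LMMs are commonly used in genome-wide association studies. The function name `fit_lmm`, the simulated effect sizes, and the grid over the variance ratio are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_lmm(y, X, K, deltas=np.logspace(-3, 3, 61)):
    """Maximum-likelihood fit of y = X b + u + e,
    with u ~ N(0, sg2 * K) and e ~ N(0, sg2 * delta * I).
    The likelihood is profiled over delta on a fixed grid."""
    n = len(y)
    S, U = np.linalg.eigh(K)            # K = U diag(S) U^T
    S = np.clip(S, 0.0, None)           # remove tiny negative eigenvalues
    yr, Xr = U.T @ y, U.T @ X           # rotation makes the covariance diagonal
    best = None
    for d in deltas:
        w = 1.0 / (S + d)               # per-component inverse variance (up to sg2)
        XtW = Xr.T * w
        beta = np.linalg.solve(XtW @ Xr, XtW @ yr)   # generalized least squares
        resid = yr - Xr @ beta
        sg2 = np.sum(w * resid**2) / n
        ll = -0.5 * (n * np.log(2 * np.pi * sg2) + np.sum(np.log(S + d)) + n)
        if best is None or ll > best[0]:
            se = np.sqrt(np.diag(sg2 * np.linalg.inv(XtW @ Xr)))
            best = (ll, beta, se, d)
    return best[1], best[2], best[3]

# Simulated cohort (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 100
purity = rng.uniform(0.3, 1.0, n_samples)
factor = rng.normal(size=n_samples)             # latent co-regulation program
loadings = rng.normal(size=n_genes)
loadings[0] = 1.0
expr = rng.normal(size=(n_samples, n_genes)) + np.outer(factor, loadings)
expr[:, 0] += 3.0 * purity                      # gene 0 also tracks purity

# Measured aneuploidy depends on purity and the latent program,
# but NOT directly on gene 0: any gene-0 signal is confounded.
aneuploidy = 2.0 * purity + 0.5 * factor + rng.normal(scale=0.3, size=n_samples)

# Naive regression: aneuploidy ~ intercept + gene 0.
X_naive = np.column_stack([np.ones(n_samples), expr[:, 0]])
beta_naive = np.linalg.lstsq(X_naive, aneuploidy, rcond=None)[0]

# LMM: purity as a fixed effect, co-regulation captured through K,
# which is built from the standardized expression of the other genes.
Z = expr[:, 1:]
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
K = Z @ Z.T / Z.shape[1]
X_lmm = np.column_stack([np.ones(n_samples), purity, expr[:, 0]])
beta_lmm, se_lmm, delta = fit_lmm(aneuploidy, X_lmm, K)

print("naive gene-0 effect:", beta_naive[1])
print("LMM gene-0 effect:  ", beta_lmm[2], "+/-", se_lmm[2])
```

In this simulation gene 0 has no direct effect on aneuploidy, so the naive slope reflects only confounding by purity and by the shared expression program; the LMM estimate is expected to shrink toward zero. For real data, an established mixed-model package would be preferable to this grid search.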

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Aneuploidy*
  • Carcinogenesis / genetics
  • Carcinogenesis / metabolism
  • Carcinogenesis / pathology
  • DNA Copy Number Variations
  • Datasets as Topic
  • Gene Expression Regulation, Neoplastic*
  • Genome-Wide Association Study
  • Genomic Instability
  • Histones / genetics*
  • Histones / metabolism
  • Humans
  • Linear Models
  • Neoplasm Proteins / genetics*
  • Neoplasm Proteins / metabolism
  • Neoplasms / diagnosis*
  • Neoplasms / genetics*
  • Neoplasms / metabolism
  • Neoplasms / pathology

Substances

  • Histones
  • Neoplasm Proteins