Prioritizing Crohn's disease genes by integrating association signals with gene expression implicates monocyte subsets

Genes Immun. 2019 Sep;20(7):577-588. doi: 10.1038/s41435-019-0059-y. Epub 2019 Jan 29.

Abstract

Genome-wide association studies have identified ~170 loci associated with Crohn's disease (CD), and defining which genes drive these association signals is a major challenge. The primary aim of this study was to define which CD locus genes are most likely to be disease related. We developed a gene prioritization regression model (GPRM) by integrating complementary mRNA expression datasets, including bulk RNA-Seq from the terminal ileum of 302 newly diagnosed, untreated CD patients and controls, and expression data from stimulated monocytes. Transcriptome-wide association and co-expression network analyses were performed on the ileal RNA-Seq datasets, identifying 40 genome-wide significant genes. Co-expression network analysis identified a single gene module, which was substantially enriched for CD locus genes and most highly expressed in monocytes. By including expression-based and epigenetic information, we refined likely CD genes to 2.5 prioritized genes per locus from an average of 7.8 total genes. We validated our model structure using cross-validation and our prioritization results by protein-association network analyses, which demonstrated significantly higher CD gene interactions for prioritized compared with non-prioritized genes. Although individual datasets cannot convey all of the information relevant to a disease, combining data from multiple relevant expression-based datasets improves prediction of disease genes and helps to further understanding of disease pathogenesis.
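
The per-locus prioritization described above can be illustrated with a minimal sketch. It is not the authors' GPRM: the feature names (`twas_hit`, `in_module`, `monocyte_expr`, `epigenetic`), the coefficients, the intercept, the 0.5 threshold, and the gene and locus labels are all hypothetical placeholders, since the abstract does not report the model terms or fitted values. The sketch only shows the general idea of combining several expression-based and epigenetic evidence sources into one logistic score and keeping the higher-scoring genes at each locus.

```python
from math import exp

# Hypothetical coefficients standing in for fitted regression weights.
# The abstract does not report the actual model terms or their values.
WEIGHTS = {
    "twas_hit": 1.5,       # gene significant in a transcriptome-wide association test
    "in_module": 1.0,      # gene belongs to the CD-enriched co-expression module
    "monocyte_expr": 0.8,  # gene responds in stimulated monocytes
    "epigenetic": 0.7,     # supporting epigenetic annotation
}
INTERCEPT = -2.0

# Toy input: made-up genes at two made-up loci.
GENES = [
    {"locus": "locus1", "gene": "GENE_A", "features": {"twas_hit": 1, "in_module": 1}},
    {"locus": "locus1", "gene": "GENE_B", "features": {"epigenetic": 1}},
    {"locus": "locus1", "gene": "GENE_C", "features": {}},
    {"locus": "locus2", "gene": "GENE_D",
     "features": {"in_module": 1, "monocyte_expr": 1, "epigenetic": 1}},
    {"locus": "locus2", "gene": "GENE_E", "features": {"twas_hit": 1}},
    {"locus": "locus2", "gene": "GENE_F", "features": {}},
]


def score(features):
    """Return a logistic score in (0, 1) from the evidence features."""
    z = INTERCEPT + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + exp(-z))


def prioritize(genes, threshold=0.5):
    """Map each locus to the genes whose score is at or above the threshold."""
    by_locus = {}
    for g in genes:
        kept = by_locus.setdefault(g["locus"], [])  # keep loci with no hits too
        if score(g["features"]) >= threshold:
            kept.append(g["gene"])
    return by_locus


def mean_genes_per_locus(by_locus):
    """Average number of genes listed per locus."""
    return sum(len(v) for v in by_locus.values()) / len(by_locus)


if __name__ == "__main__":
    print(prioritize(GENES))
    print(mean_genes_per_locus(prioritize(GENES, threshold=0.0)))  # all genes
    print(mean_genes_per_locus(prioritize(GENES)))                 # prioritized only
```

In this toy input, each locus goes from three candidate genes to one prioritized gene. The figures reported in the abstract (an average of 7.8 total genes per locus refined to 2.5 prioritized genes) correspond to a reduction of roughly 68%. In a real analysis the weights would be estimated from data and checked by cross-validation, as the abstract describes, rather than set by hand.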

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Algorithms
  • Case-Control Studies
  • Child
  • Child, Preschool
  • Crohn Disease / genetics*
  • Crohn Disease / metabolism
  • Female
  • Gene Regulatory Networks / genetics
  • Genetic Predisposition to Disease / genetics
  • Genome-Wide Association Study
  • Humans
  • Male
  • Monocytes / metabolism
  • Monocytes / pathology*
  • Polymorphism, Single Nucleotide / genetics
  • Quantitative Trait Loci / genetics
  • Sequence Analysis, DNA / methods*
  • Software
  • Transcriptome / genetics