Efficient and accurate causal inference with hidden confounders from genome-transcriptome variation data

PLoS Comput Biol. 2017 Aug 18;13(8):e1005703. doi: 10.1371/journal.pcbi.1005703. eCollection 2017 Aug.

Abstract

Mapping gene expression as a quantitative trait using whole-genome sequencing and transcriptome analysis allows the discovery of the functional consequences of genetic variation. We developed a novel method and ultra-fast software, Findr, for highly accurate causal inference between gene expression traits using cis-regulatory DNA variations as causal anchors, which improves on current methods by taking into account hidden confounders and weak regulations. Findr outperformed existing methods on the DREAM5 Systems Genetics challenge and on the prediction of microRNA and transcription factor targets in human lymphoblastoid cells, while being nearly a million times faster. Findr is publicly available at https://github.com/lingfeiwang/findr.
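
To make the "causal anchor" idea concrete, the following is a minimal sketch of the textbook mediation check on simulated data. It is not Findr's algorithm and does not use the Findr API. A cis-acting variant E is assumed to regulate gene A, which in turn regulates gene B, so E should be associated with both genes but become uninformative about B once A is accounted for. The effect sizes, sample size, and the function names `simulate` and `partial_corr` are illustrative assumptions.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation of x and y after regressing z (plus an intercept) out of both."""
    design = np.column_stack([np.ones_like(z), z])
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])


def simulate(n=5000, seed=0):
    """Simulate E -> A -> B with no hidden confounder (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    e = rng.integers(0, 3, size=n).astype(float)  # genotype coded 0/1/2
    a = 0.8 * e + rng.normal(size=n)              # cis effect of E on gene A
    b = 0.7 * a + rng.normal(size=n)              # regulation of gene B by A
    return e, a, b


e, a, b = simulate()
r_ea = float(np.corrcoef(e, a)[0, 1])   # anchor is associated with A
r_eb = float(np.corrcoef(e, b)[0, 1])   # anchor is associated with B through A
r_eb_given_a = partial_corr(e, b, a)    # near zero when A fully mediates E -> B
print(r_ea, r_eb, r_eb_given_a)
```

In this toy setting the partial correlation of E and B given A is close to zero. If an unobserved factor influenced both A and B, conditioning on A would induce a dependence between E and B and the check could wrongly reject a true A -> B regulation, which is the kind of hidden-confounder problem the abstract says Findr is designed to address.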

MeSH terms

  • Algorithms
  • Chromosome Mapping* / methods
  • Chromosome Mapping* / standards
  • Databases, Genetic
  • Genetic Variation
  • High-Throughput Nucleotide Sequencing* / methods
  • High-Throughput Nucleotide Sequencing* / standards
  • Models, Statistical
  • Transcriptome / genetics*