Towards mouse genetic-specific RNA-sequencing read mapping

PLoS Comput Biol. 2022 Sep 26;18(9):e1010552. doi: 10.1371/journal.pcbi.1010552. eCollection 2022 Sep.

Abstract

Genetic variations affect behavior and cause disease, but how these variants drive complex traits remains an open question. A common approach is to link the genetic variants to intermediate molecular phenotypes, such as the transcriptome, using RNA-sequencing (RNA-seq). Paradoxically, the variants that distinguish the samples are usually ignored at the beginning of RNA-seq analyses in many model organisms. This can skew the transcriptome estimates that are later used for downstream analyses, such as expression quantitative trait locus (eQTL) detection. Here, we assessed the impact of reference-based analysis on the transcriptome and eQTLs in a widely used mouse genetic population: the BXD panel of recombinant inbred lines. We highlight existing reference bias in transcriptome data analysis and propose practical solutions that combine available genetic variants, genotypes, and the genome reference sequence. The use of custom BXD line references improved downstream analysis compared with the classical genome reference. These insights would likely benefit genetic studies with a transcriptomic component and demonstrate that genome references need to be reassessed and improved.
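
The idea of combining variants, genotypes, and a reference sequence into a line-specific reference can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Snp` type, the `build_line_reference` function, the toy sequence, and the "B"/"D" genotype coding (reference-like versus alternative parental haplotype) are all assumptions made for illustration, and only single-base substitutions are handled.

```python
from typing import Dict, List, NamedTuple


class Snp(NamedTuple):
    """Hypothetical single-nucleotide variant record."""
    pos: int  # 0-based position on the reference sequence
    ref: str  # allele present in the reference genome
    alt: str  # allele carried by the alternative parental strain


def build_line_reference(
    reference: str, snps: List[Snp], genotypes: Dict[int, str]
) -> str:
    """Return a copy of `reference` with the alternative allele substituted
    at every SNP where this line's genotype is 'D' (assumed coding)."""
    seq = list(reference)
    for snp in snps:
        # Sanity check: the variant must agree with the reference sequence.
        if seq[snp.pos] != snp.ref:
            raise ValueError(f"reference mismatch at position {snp.pos}")
        # Positions genotyped 'B' (or not genotyped) keep the reference allele.
        if genotypes.get(snp.pos) == "D":
            seq[snp.pos] = snp.alt
    return "".join(seq)


# Toy example: two SNPs, of which the line inherited only the first
# from the alternative parent.
reference = "ACGTACGTAC"
snps = [Snp(2, "G", "A"), Snp(7, "T", "C")]
genotypes = {2: "D", 7: "B"}
custom = build_line_reference(reference, snps, genotypes)
print(custom)  # ACATACGTAC
```

Reads from this line would then be mapped against `custom` rather than against the unmodified reference, so that reads carrying the alternative allele are not penalized as mismatches. A realistic implementation would also need to handle indels, which shift coordinates, and genotypes defined over intervals rather than single positions.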

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Gene Expression Profiling
  • Mice
  • Quantitative Trait Loci* / genetics
  • RNA / genetics
  • Sequence Analysis, RNA
  • Transcriptome* / genetics

Substances

  • RNA

Grants and funding

P.F. and M.J. were funded by the University of Lausanne (Etat de Vaud). N.G. was funded by Swiss National Science Foundation grants to P.F. (31003A_173182 and 310030B_192805, https://www.snf.ch/en). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.