Foreign RNA spike-ins enable accurate allele-specific expression analysis at scale

Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i431-i439. doi: 10.1093/bioinformatics/btad254.

Abstract

Motivation: Analysis of allele-specific expression is strongly affected by the technical noise present in RNA-seq experiments. Previously, we showed that technical replicates can be used for precise estimates of this noise, and we provided a tool for correction of technical noise in allele-specific expression analysis. This approach is very accurate but costly due to the need for two or more replicates of each library. Here, we develop a spike-in approach which is highly accurate at only a small fraction of the cost.

Results: We show that a distinct RNA added as a spike-in before library preparation reflects technical noise of the whole library and can be used in large batches of samples. We experimentally demonstrate the effectiveness of this approach using combinations of RNA from species distinguishable by alignment, namely, mouse, human, and Caenorhabditis elegans. Our new approach, controlFreq, enables highly accurate and computationally efficient analysis of allele-specific expression in (and between) arbitrarily large studies at an overall cost increase of ∼5%.
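
Illustration: the following minimal Python sketch shows one way a shared spike-in could be used to quantify technical noise. It is not the controlFreq algorithm. The function names (variance_inflation, allelic_interval), the method-of-moments estimator, and the normal-approximation interval are all assumptions made for illustration. The sketch assumes the same spike-in RNA was added to two libraries, so any difference in a spike-in gene's allelic fraction between them is technical. It compares that observed difference with what binomial sampling alone would predict, then uses the resulting factor to widen the interval for a gene in the sample of interest.

```python
import math


def variance_inflation(pairs):
    """Estimate how much allelic noise exceeds binomial sampling noise.

    pairs: list of ((ref_a, alt_a), (ref_b, alt_b)), the allelic read counts
    of the same spike-in gene in two libraries (hypothetical input layout).
    Returns a factor >= 1.0; 1.0 means no noise beyond binomial sampling.
    """
    ratios = []
    for (ref_a, alt_a), (ref_b, alt_b) in pairs:
        n_a, n_b = ref_a + alt_a, ref_b + alt_b
        if n_a == 0 or n_b == 0:
            continue  # gene not covered in one of the libraries
        pooled = (ref_a + ref_b) / (n_a + n_b)
        # Variance of the difference in fractions under pure binomial sampling.
        expected = pooled * (1 - pooled) * (1 / n_a + 1 / n_b)
        if expected == 0:
            continue  # monoallelic in both libraries, carries no information
        diff = ref_a / n_a - ref_b / n_b
        ratios.append(diff ** 2 / expected)
    if not ratios:
        raise ValueError("no informative spike-in genes")
    return max(1.0, sum(ratios) / len(ratios))


def allelic_interval(ref, alt, factor, z=1.96):
    """Normal-approximation interval for the reference-allele fraction,
    with the binomial variance scaled by the spike-in-derived factor."""
    n = ref + alt
    p = ref / n
    se = math.sqrt(factor * p * (1 - p) / n)
    return max(0.0, p - z * se), min(1.0, p + z * se)
```

With a factor of 1.0 the interval reduces to the plain binomial approximation. A larger factor widens it, so fewer genes are called as allelically imbalanced when the library is noisy.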

Availability and implementation: The analysis pipeline for this approach is available on GitHub as the R package controlFreq (github.com/gimelbrantlab/controlFreq).

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Alleles
  • Animals
  • Caenorhabditis elegans* / genetics
  • Gene Library
  • Humans
  • Libraries*
  • Mice
  • RNA / genetics

Substances

  • RNA