Correcting signal biases and detecting regulatory elements in STARR-seq data

Genome Res. 2021 May;31(5):877-889. doi: 10.1101/gr.269209.120. Epub 2021 Mar 15.

Abstract

High-throughput reporter assays such as self-transcribing active regulatory region sequencing (STARR-seq) have made it possible to measure regulatory element activity across the entire human genome at once. The resulting data, however, present substantial analytical challenges. Here, we identify technical biases that explain most of the variance in STARR-seq data. We then develop a statistical model to correct those biases and to improve detection of regulatory elements. Compared with current methods, this approach substantially improves precision and recall, improves detection of both activating and repressive regulatory elements, and controls for false discoveries despite strong local correlations in signal.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bias
  • Enhancer Elements, Genetic*
  • Genome, Human*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans