Challenges and considerations for reproducibility of STARR-seq assays

Maitreya Das; Ayaan Hossain; Deepro Banerjee; Craig Alan Praul; Santhosh Girirajan

doi:10.1101/gr.277204.122

Genome Res. 2023 Apr;33(4):479-495. doi: 10.1101/gr.277204.122. Epub 2023 May 2.

Authors

Maitreya Das¹,²,³, Ayaan Hossain³,⁴, Deepro Banerjee³,⁴, Craig Alan Praul³, Santhosh Girirajan¹,²,³,⁴,⁵

Affiliations

¹ Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA; sxg47@psu.edu, mud367@psu.edu.
² Molecular and Cellular Integrative Biosciences Graduate Program, Pennsylvania State University, University Park, Pennsylvania 16802, USA.
³ Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802, USA.
⁴ Bioinformatics and Genomics Graduate Program, Pennsylvania State University, University Park, Pennsylvania 16802, USA.
⁵ Department of Anthropology, Pennsylvania State University, University Park, Pennsylvania 16802, USA.

Abstract

High-throughput methods such as RNA-seq, ChIP-seq, and ATAC-seq have well-established guidelines, commercial kits, and analysis pipelines that enable consistency and wider adoption for understanding genome function and regulation. STARR-seq, a popular assay for directly quantifying the activities of thousands of enhancer sequences simultaneously, has seen limited standardization across studies. The assay is long, with more than 250 steps, and frequent customization of the protocol and variations in bioinformatics methods raise concerns for reproducibility of STARR-seq studies. Here, we assess each step of the protocol and analysis pipelines from published sources and in-house assays, and identify critical steps and quality control (QC) checkpoints necessary for reproducibility of the assay. We also provide guidelines for experimental design, protocol scaling, customization, and analysis pipelines for better adoption of the assay. These resources will allow better optimization of STARR-seq for specific research needs, enable comparisons and integration across studies, and improve the reproducibility of results.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Computational Biology / methods
Genome*
High-Throughput Nucleotide Sequencing / methods
Regulatory Sequences, Nucleic Acid*
Reproducibility of Results
Sequence Analysis, DNA / methods
