Correcting signal biases and detecting regulatory elements in STARR-seq data

  1. Timothy E. Reddy (1,2,3,4,5)
  1. Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA;
  2. Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA;
  3. Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27710, USA;
  4. Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA;
  5. Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA
  • Corresponding author: tim.reddy{at}duke.edu
  • Abstract

    High-throughput reporter assays such as self-transcribing active regulatory region sequencing (STARR-seq) have made it possible to measure regulatory element activity across the entire human genome at once. The resulting data, however, present substantial analytical challenges. Here, we identify technical biases that explain most of the variance in STARR-seq data. We then develop a statistical model to correct those biases and to improve detection of regulatory elements. This approach substantially improves precision and recall over current methods, improves detection of both activating and repressive regulatory elements, and controls for false discoveries despite strong local correlations in signal.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.269209.120.

    • Freely available online through the Genome Research Open Access option.

    • Received July 21, 2020.
    • Accepted March 9, 2021.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
