A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity

Jay Shendure (2,6)

  1. Department of Bioengineering and Therapeutic Sciences, Institute for Human Genetics, University of California San Francisco, San Francisco, California 94158, USA
  2. Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
  3. HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
  4. Departments of Statistics and Biostatistics, University of Washington, Seattle, Washington 98195, USA
  5. Department of Microbiology and Immunology, UCSF Diabetes Center, Keck Center for Noncoding RNA, University of California, San Francisco, San Francisco, California 94143, USA
  6. Howard Hughes Medical Institute, Seattle, Washington 98195, USA

Corresponding authors: nadav.ahituv{at}ucsf.edu, shendure{at}uw.edu

(7) These authors contributed equally to this work.

Abstract

Candidate enhancers can be identified on the basis of chromatin modifications, the binding of chromatin modifiers and transcription factors and cofactors, or chromatin accessibility. However, validating such candidates as bona fide enhancers requires functional characterization, typically achieved through reporter assays that test whether a sequence can increase expression of a transcriptional reporter via a minimal promoter. A longstanding concern is that reporter assays are mainly implemented on episomes, which are thought to lack physiological chromatin. However, the magnitude and determinants of differences in cis-regulation for regulatory sequences residing in episomes versus chromosomes remain almost completely unknown. To address this systematically, we developed and applied a novel lentivirus-based massively parallel reporter assay (lentiMPRA) to directly compare the functional activities of 2236 candidate liver enhancers in an episomal versus a chromosomally integrated context. We find that the activities of chromosomally integrated sequences are substantially different from the activities of the identical sequences assayed on episomes, and furthermore are correlated with different subsets of ENCODE annotations. The results of chromosomally based reporter assays are also more reproducible and more strongly predictable by both ENCODE annotations and sequence-based models. With a linear model that combines chromatin annotations and sequence information, we achieve a Pearson's R² of 0.362 for predicting the results of chromosomally integrated reporter assays. This level of prediction is better than with either chromatin annotations or sequence information alone and also outperforms predictive models of episomal assays. Our results have broad implications for how cis-regulatory elements are identified, prioritized, and functionally validated.
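
The comparison of a combined linear model against annotation-only and sequence-only models can be illustrated with a minimal sketch. This is not the authors' pipeline: the feature matrices, their dimensions, and the effect sizes below are synthetic assumptions chosen only to show how a squared Pearson correlation between predicted and observed activity might be computed for each model. The helper names `fit_linear` and `pearson_r2` are hypothetical.

```python
import numpy as np

def pearson_r2(y_true, y_pred):
    """Squared Pearson correlation between observed and predicted values."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return float(r ** 2)

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns in-sample predictions."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef

rng = np.random.default_rng(0)
n = 2236  # number of candidate sequences named in the abstract; all values below are synthetic

# Hypothetical feature blocks (assumptions, not the study's features)
annot = rng.normal(size=(n, 5))  # stand-in for chromatin annotation features
seq = rng.normal(size=(n, 5))    # stand-in for sequence-derived features

# Synthetic "reporter activity": weak signal from both blocks plus noise
activity = (
    0.3 * annot @ rng.normal(size=5)
    + 0.3 * seq @ rng.normal(size=5)
    + rng.normal(size=n)
)

r2_annot = pearson_r2(activity, fit_linear(annot, activity))
r2_seq = pearson_r2(activity, fit_linear(seq, activity))
r2_both = pearson_r2(activity, fit_linear(np.hstack([annot, seq]), activity))

print(f"annotations only: {r2_annot:.3f}")
print(f"sequence only:    {r2_seq:.3f}")
print(f"combined:         {r2_both:.3f}")
```

Because this sketch scores each model on the same data it was fit to, the combined model can never score lower than either feature block alone. A meaningful comparison would evaluate on held-out sequences or with cross-validation.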

Footnotes

  • Received June 30, 2016.
  • Accepted November 8, 2016.

This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
