Identification of higher-order functional domains in the human ENCODE regions

  1. Robert E. Thurman (1,2),
  2. Nathan Day (3),
  3. William S. Noble (2,3), and
  4. John A. Stamatoyannopoulos (2,4)

  1. Division of Medical Genetics, University of Washington, Seattle, Washington 98195, USA;
  2. Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA;
  3. Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA

Abstract

It has long been posited that human and other large genomes are organized into higher-order (i.e., greater than gene-sized) functional domains. We hypothesized that diverse experimental data types generated by The ENCODE Project Consortium could be combined to delineate active and quiescent or repressed functional domains and thereby illuminate the higher-order functional architecture of the genome. To test this hypothesis, we coupled wavelet analysis with hidden Markov models for unbiased discovery of “domain-level” behavior in high-resolution functional genomic data, including activating and repressive histone modifications, RNA output, and DNA replication timing. We find that higher-order patterns in these data types are largely concordant and may be analyzed collectively in the context of HeLa cells to delineate 53 active and 62 repressed functional domains within the ENCODE regions. Active domains comprise ∼44% of the ENCODE regions but contain ∼75%–80% of annotated genes, transcripts, and CpG islands. Repressed domains are enriched in certain classes of repetitive elements and, surprisingly, in evolutionarily conserved nonexonic sequences. The functional domain structure of the ENCODE regions appears to be largely stable across different cell types. Taken together, our results suggest that higher-order functional domains represent a fundamental organizing principle of human genome architecture.
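
The general idea of coupling wavelet smoothing with a hidden Markov model can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single toy data track (the study combined several data types), a Haar approximation at one fixed scale, and a two-state HMM with Gaussian emissions whose means, standard deviation, and transition probability are arbitrary illustrative values. All function names (`haar_approximation`, `viterbi_two_state`, `segments`) are hypothetical.

```python
import math

def haar_approximation(signal, levels):
    """Haar-wavelet approximation: mean over dyadic blocks of 2**levels samples."""
    block = 2 ** levels
    smoothed = []
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        mean = sum(chunk) / len(chunk)
        smoothed.extend([mean] * len(chunk))  # keep the original length
    return smoothed

def viterbi_two_state(obs, means=(0.0, 1.0), sd=0.5, stay_prob=0.95):
    """Most likely state path for a 2-state HMM with Gaussian emissions.

    State 0 stands in for "repressed" and state 1 for "active"; the
    parameter values are assumptions chosen for the toy data below.
    """
    def log_emit(x, state):
        z = (x - means[state]) / sd
        return -0.5 * z * z  # terms shared by both states are omitted

    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)
    score = [math.log(0.5) + log_emit(obs[0], s) for s in (0, 1)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in obs[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            candidates = [score[p] + (log_stay if p == s else log_switch)
                          for p in (0, 1)]
            best_prev = 0 if candidates[0] >= candidates[1] else 1
            pointers.append(best_prev)
            new_score.append(candidates[best_prev] + log_emit(x, s))
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def segments(path):
    """Collapse a state path into (state, start, end_exclusive) runs."""
    runs = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            runs.append((path[start], start, i))
            start = i
    return runs

# Toy track: 16 low samples, 16 high samples, 16 low samples, with small noise.
noise = [0.2, -0.2, 0.1, -0.1]
signal = noise * 4 + [1.0 + n for n in noise] * 4 + noise * 4
domains = segments(viterbi_two_state(haar_approximation(signal, levels=2)))
print(domains)
```

On this toy track the sketch recovers three runs: a state-0 run over samples 0 to 15, a state-1 run over samples 16 to 31, and a state-0 run over samples 32 to 47. Smoothing before decoding suppresses sample-level fluctuations so that the HMM responds to domain-scale shifts rather than to local noise.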

Footnotes

  • 4 Corresponding author.

    4 E-mail jstam{at}u.washington.edu; fax (206) 267-1094.

  • [Supplemental material is available online at www.genome.org.]

  • Article is online at http://www.genome.org/cgi/doi/10.1101/gr.6081407

    • Received October 29, 2006.
    • Accepted March 27, 2007.
  • Freely available online through the Genome Research Open Access option.
