A comprehensive computational characterization of conserved mammalian intronic sequences reveals conserved motifs associated with constitutive and alternative splicing

Rodger B. Voelker and J. Andrew Berglund¹

Institute of Molecular Biology, University of Oregon, Eugene, Oregon 97403, USA

Abstract

Orthologous mammalian introns contain many highly conserved sequences. Of these sequences, many are likely to represent protein binding sites that are under strong positive selection. To identify conserved protein binding sites that are important for splicing, we analyzed the composition of intronic sequences that are conserved between human and six eutherian mammals. We focused on all completely conserved sequences of seven or more nucleotides located in the regions adjacent to splice junctions. We found that these conserved intronic sequences are enriched in specific motifs, and that many of these motifs are statistically associated with either alternative or constitutive splicing. In validation of our methods, we identified several motifs that are known to play important roles in alternative splicing. In addition, we identified several novel motifs containing GCT that are abundant and are associated with alternative splicing. Furthermore, we demonstrate that, for some of these motifs, conservation is a strong indicator of potential functionality, since conserved instances are associated with alternative splicing while nonconserved instances are not. A surprising outcome of this analysis was the identification of a large number of AT-rich motifs that are strongly associated with constitutive splicing. Many of these appear to be novel and may represent conserved intronic splicing enhancers (ISEs). Together, these data show that conservation provides important insights into the identification and possible roles of cis-acting intronic sequences important for alternative and constitutive splicing.
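
The selection criterion described above (completely conserved sequences of seven or more nucleotides) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the orthologous intronic regions have already been aligned into equal-length strings with "-" for gaps, and the function name conserved_runs and the toy alignment are hypothetical, introduced only for illustration.

```python
def conserved_runs(aligned, min_len=7):
    """Return (start, sequence) for each maximal run of alignment columns
    in which every sequence carries the same ungapped nucleotide.

    `aligned` is a list of equal-length, pre-aligned strings (assumption:
    the alignment was produced elsewhere). Only runs of at least
    `min_len` columns are reported.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")

    ref = aligned[0].upper()
    others = [s.upper() for s in aligned[1:]]
    runs = []
    start = None
    # Iterate one position past the end so a run touching the last
    # column is closed and evaluated like any other.
    for i in range(length + 1):
        conserved = (
            i < length
            and ref[i] in "ACGT"
            and all(s[i] == ref[i] for s in others)
        )
        if conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, ref[start:i]))
            start = None
    return runs


# Toy three-species alignment (hypothetical data, not from the study).
toy = [
    "ACGTTGCATGCCA",
    "ACGTTGCATGACA",
    "TCGTTGCATG-CA",
]
print(conserved_runs(toy))  # -> [(1, 'CGTTGCATG')]
```

Reporting maximal runs rather than every overlapping 7-mer keeps the output compact; fixed-length words for motif counting could be extracted from each run in a later step.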

Footnotes

  • ¹ Corresponding author. E-mail aberglund{at}molbio.uoregon.edu; fax (541) 346-5891.

  • [Supplemental material is available online at www.genome.org.]

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6017807

    • Received October 12, 2006.
    • Accepted April 12, 2007.