Comparison of splice sites reveals that long noncoding RNAs are evolutionarily well conserved

Peter F. Stadler (affiliations 1, 2, 7, 8, 9, 10, 11)

  1. Bioinformatics Group, Department of Computer Science, University of Leipzig, D-04107 Leipzig, Germany
  2. Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany
  3. Bioinformatics Group, Department of Computer Science, University of Freiburg, D-79110 Freiburg, Germany
  4. MML, Munich Leukemia Laboratory GmbH, D-81377 München, Germany
  5. ecSeq Bioinformatics, D-04275 Leipzig, Germany
  6. Young Investigators Group Bioinformatics and Transcriptomics, Department of Proteomics, Helmholtz Centre for Environmental Research–UFZ, D-04318 Leipzig, Germany
  7. Department of Diagnostics, Fraunhofer Institute for Cell Therapy and Immunology–IZI, D-04103 Leipzig, Germany
  8. Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany
  9. Department of Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria
  10. Center for non-coding RNA in Technology and Health, University of Copenhagen, DK-1870 Frederiksberg C, Denmark
  11. Santa Fe Institute, Santa Fe, New Mexico 87501, USA

Corresponding author: studla{at}bioinf.uni-leipzig.de

Abstract

Large-scale RNA sequencing has revealed a large number of long, mRNA-like transcripts that do not code for proteins (long noncoding RNAs, lncRNAs). The evolutionary history of these lncRNAs has been notoriously hard to study systematically because their low level of sequence conservation precludes comprehensive homology-based surveys and makes them nearly impossible to align. An increasing number of special cases, however, have been shown to be at least as old as the vertebrate lineage. Here we use the conservation of splice sites to trace the evolution of lncRNAs. We show that >85% of the human GENCODE lncRNAs were already present at the divergence of placental mammals and that many hundreds of these RNAs date back even further. Nevertheless, we observe a fast turnover of intron/exon structures. We conclude that lncRNA genes are evolutionarily ancient components of vertebrate genomes that show an unexpected and unprecedented evolutionary plasticity. We offer a public web service (http://splicemap.bioinf.uni-leipzig.de) that allows users to retrieve sets of orthologous splice sites and to produce overview maps of evolutionarily conserved splice sites for visualization and further analysis. An electronic supplement containing the ncRNA data sets used in this study is available at http://www.bioinf.uni-leipzig.de/publications/supplements/12-001.
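The idea of dating a transcript by the conservation of its splice sites can be illustrated with a minimal sketch. This is not the pipeline used in the study. It assumes that conservation calls per species are already available, and that a site conserved in a distant species was present in the last common ancestor shared with human. The species panel and the names `SHARED_CLADE`, `oldest_conserved_clade`, and `date_transcript` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the species panel and function names are
# assumptions, not part of the published method.

# Each outgroup species is mapped to the clade it shares with human,
# ranked from the most recent common ancestor (0) to the most ancient.
SHARED_CLADE = {
    "mouse": (0, "Euarchontoglires"),
    "dog": (1, "Boreoeutheria"),
    "elephant": (2, "Placentalia"),
    "opossum": (3, "Theria"),
    "platypus": (4, "Mammalia"),
    "chicken": (5, "Amniota"),
    "frog": (6, "Tetrapoda"),
    "zebrafish": (7, "Euteleostomi"),
}


def oldest_conserved_clade(conserved_species):
    """Return the oldest clade implied by the species in which a human
    splice site has a conserved ortholog, or None if there is none."""
    ranked = [SHARED_CLADE[s] for s in conserved_species if s in SHARED_CLADE]
    if not ranked:
        return None
    # Tuples compare by rank first, so max() picks the most ancient clade.
    return max(ranked)[1]


def date_transcript(sites):
    """Date a transcript by its oldest conserved splice site.

    `sites` maps a splice-site identifier to the set of species in which
    that site is conserved.
    """
    all_species = set()
    for species in sites.values():
        all_species |= set(species)
    return oldest_conserved_clade(all_species)


if __name__ == "__main__":
    example = {
        "donor_1": {"mouse", "dog"},
        "acceptor_1": {"mouse", "opossum"},
        "donor_2": set(),  # site with no conserved ortholog
    }
    print(date_transcript(example))
```

Dating by the single oldest site is a deliberately simple choice: because individual splice sites are gained and lost quickly, one well-conserved site can indicate an old locus even when the rest of the intron/exon structure has changed.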

Footnotes

  • Received May 12, 2014.
  • Accepted December 24, 2014.

This article is distributed exclusively by the RNA Society for the first 12 months after the full-issue publication date (see http://rnajournal.cshlp.org/site/misc/terms.xhtml). After 12 months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
