Computational Discovery of Internal Micro-Exons

Natalia Volfovsky (1,2,3), Brian J. Haas (1), and Steven L. Salzberg (1)

(1) The Institute for Genomic Research, Rockville, Maryland 20850, USA

Abstract

Very short exons, also known as micro-exons, occur in large numbers in some eukaryotic genomes. Existing annotation tools have a limited ability to recognize these short sequences, which range in length up to 25 bp. Here, we describe a computational method for identifying micro-exons using near-perfect alignments between cDNA and genomic DNA sequences. Using this method, we detected 319 micro-exons in four complete genomes, of which 224 were previously unknown: 170 in human, 4 in the nematode Caenorhabditis elegans, 14 in the fruit fly Drosophila melanogaster, and 36 in the mustard plant Arabidopsis thaliana. Comparison of our computational method with popular cDNA alignment programs shows that the new algorithm is both efficient and accurate. The algorithm also aids in the discovery of micro-exon-skipping events and cross-species micro-exon conservation.
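
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that a cDNA-to-genome alignment has left a few cDNA bases unaligned between two flanking exons, and that a micro-exon candidate is an exact copy of those bases inside the intervening intron, bounded by the canonical AG acceptor and GT donor dinucleotides. The function name, parameters, and example sequences are hypothetical.

```python
def find_micro_exon_candidates(intron, missing, acceptor="AG", donor="GT"):
    """Return 0-based offsets in `intron` where `missing` occurs exactly,
    immediately preceded by `acceptor` and immediately followed by `donor`.

    Assumption: exact matching and canonical splice sites only; a real
    tool would likely tolerate mismatches and score alternative sites.
    """
    intron = intron.upper()
    pattern = (acceptor + missing + donor).upper()
    hits = []
    start = intron.find(pattern)
    while start != -1:
        # Report the position of the micro-exon itself, not of the acceptor.
        hits.append(start + len(acceptor))
        start = intron.find(pattern, start + 1)
    return hits


# Hypothetical intron containing a 6-bp micro-exon "ACGTGC".
example_intron = "GTAAGTTTTCAGACGTGCGTAAGTCCCTTTTCAG"
example_missing = "ACGTGC"
print(find_micro_exon_candidates(example_intron, example_missing))  # [12]
```

Requiring the flanking splice-site dinucleotides is what makes such a search selective: a sequence of only a few base pairs would otherwise match many positions in a long intron by chance.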

Footnotes

  • [Supplementary material available online at www.genome.org.]

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.677503.

  • 2 Present address: Advanced Biomedical Computing Center, National Cancer Institute-Frederick/SAIC, Frederick, Maryland 21702, USA.

  • 3 Corresponding author. E-MAIL natalia@ncifcrf.gov; FAX (301) 838-0208.

    • Received August 14, 2002.
    • Accepted March 25, 2003.