Computational and experimental approaches double the number of known introns in the pathogenic yeast Candida albicans

Quinn M. Mitrovich,1 Brian B. Tuch, Christine Guthrie, and Alexander D. Johnson

Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, California 94143-2200, USA

Abstract

Candida albicans is the most common fungal pathogen of humans. Frequently found as a commensal within the digestive tracts of healthy individuals, C. albicans is an opportunistic pathogen that causes a wide variety of clinical syndromes in immuno-compromised individuals. A comprehensive annotation of the C. albicans genome sequence was recently published. Because many C. albicans coding sequences are interrupted by introns, proper intron annotation is essential for the accurate definition of genes in this pathogen. Intron annotation is also important for identifying potential targets of splicing regulation, a common mechanism of gene control in eukaryotes. In this study, we report an improved annotation of C. albicans introns. In addition to correcting the existing intron annotations, 25% of which were incorrect, we have used novel computational and experimental approaches to identify new introns, bringing the total to 415, almost double the number previously known. Our identification methods focus primarily on intron features rather than protein-coding features, overcoming biases of traditional intron annotation methods. Introns are not randomly distributed in C. albicans, and are over-represented in genes involved in specific cellular processes, such as splicing, translation, and mitochondrial respiration. This nonrandom distribution suggests functional roles for these introns, and we demonstrate that splicing of two transcripts whose introns have unusual sequence features is responsive to environmental factors.

Footnotes

  • 1 Corresponding author. E-mail quinn.mitrovich@ucsf.edu; fax (415) 502-4315.

  • [Supplemental material is available online at www.genome.org. The microarray data from this study are available at ArrayExpress (http://www.ebi.ac.uk), accession no. E-MEXP-1003.]

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6111907

    • Received November 9, 2006.
    • Accepted February 7, 2007.