Accurate Identification of Novel Human Genes Through Simultaneous Gene Prediction in Human, Mouse, and Rat

  1. Colin Dewey (1,6),
  2. Jia Qian Wu (5,6),
  3. Simon Cawley (3),
  4. Marina Alexandersson (4),
  5. Richard Gibbs (5), and
  6. Lior Pachter (2,7)

  1 Department of Electrical Engineering, University of California, Berkeley, Berkeley, California 94720, USA
  2 Department of Mathematics, University of California, Berkeley, Berkeley, California 94720, USA
  3 Affymetrix Inc., Emeryville, California 94608, USA
  4 Fraunhofer-Chalmers Centre, SE-412 88 Gothenburg, Sweden
  5 Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA

Abstract

We describe a new method for simultaneously identifying novel homologous genes with identical structure in the human, mouse, and rat genomes by combining pairwise predictions made with the SLAM gene-finding program. Using this method, we found 3698 gene triples in the human, mouse, and rat genomes that are predicted with exactly the same gene structure. We show, both computationally and experimentally, that the introns of these triples are predicted accurately compared with the introns of other ab initio gene prediction sets. Computationally, we compared the introns of these gene triples, as well as those from other ab initio gene finders, with known intron annotations. We show that a unique property of SLAM, namely that it predicts gene structures simultaneously in two organisms, is key to producing sets of predictions that are highly accurate in intron structure when combined with other programs. Experimentally, we performed reverse transcription-polymerase chain reaction (RT-PCR) in both human and rat to test the exon pairs flanking introns from a subset of the gene triples for which the human gene had not been previously identified. By performing RT-PCR on orthologous introns in both the human and rat genomes, we additionally explore the validity of using RT-PCR as a method for confirming gene predictions.

Footnotes

  • [Supplemental material is available online at http://hanuman.math.berkeley.edu/~cdewey/SLAMHMR/index.html.]

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1939804.

  • 6 These authors contributed equally to this work.

  • 7 Corresponding author. E-MAIL lpachter{at}math.berkeley.edu; FAX (510) 642-2028.

  • Received November 5, 2003.

  • Accepted January 26, 2004.