Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

  1. Stephen Richards1,15,
  2. Yue Liu1,2,3,
  3. Brian R. Bettencourt4,
  4. Pavel Hradecky4,
  5. Stan Letovsky4,
  6. Rasmus Nielsen5,
  7. Kevin Thornton5,
  8. Melissa J. Hubisz5,
  9. Rui Chen1,
  10. Richard P. Meisel6,
  11. Olivier Couronne8,12,
  12. Sujun Hua9,
  13. Mark A. Smith4,
  14. Peili Zhang4,
  15. Jing Liu1,
  16. Harmen J. Bussemaker10,
  17. Marinus F. van Batenburg10,13,
  18. Sally L. Howells1,
  19. Steven E. Scherer1,
  20. Erica Sodergren1,
  21. Beverly B. Matthews4,
  22. Madeline A. Crosby4,
  23. Andrew J. Schroeder4,
  24. Daniel Ortiz-Barrientos11,
  25. Catharine M. Rives1,
  26. Michael L. Metzker1,
  27. Donna M. Muzny1,
  28. Graham Scott1,
  29. David Steffen1,
  30. David A. Wheeler1,
  31. Kim C. Worley1,
  32. Paul Havlak1,
  33. K. James Durbin1,
  34. Amy Egan1,
  35. Rachel Gill1,
  36. Jennifer Hume1,
  37. Margaret B. Morgan1,
  38. George Miner1,
  39. Cerissa Hamilton1,
  40. Yanmei Huang4,
  41. Lenée Waldron1,
  42. Daniel Verduzco1,
  43. Kerstin P. Clerc-Blankenburg1,
  44. Inna Dubchak8,
  45. Mohamed A.F. Noor11,
  46. Wyatt Anderson14,
  47. Kevin P. White9,
  48. Andrew G. Clark5,
  49. Stephen W. Schaeffer7,
  50. William Gelbart4,
  51. George M. Weinstock1, and
  52. Richard A. Gibbs1
  1. Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
  2. Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas 77030, USA
  3. W.M. Keck Center for Computational Biology, Baylor College of Medicine, Houston, Texas 77030, USA
  4. FlyBase–Harvard, Department of Molecular and Cellular Biology, Harvard University, Biological Laboratories, Cambridge, Massachusetts 02138, USA
  5. Department of Biological Statistics and Computational Biology, and Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
  6. Intercollege Graduate Degree Program in Genetics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
  7. Department of Biology and Institute of Molecular Evolutionary Genetics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
  8. Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
  9. Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06520, USA
  10. Department of Biological Sciences and Center for Computational Biology and Bioinformatics, Columbia University, New York, New York 10027, USA
  11. Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803, USA
  12. U.S. Department of Energy Joint Genome Institute, Walnut Creek, California 94598, USA
  13. Swammerdam Institute for Life Sciences, University of Amsterdam, The Netherlands
  14. Department of Genetics, University of Georgia, Athens, Georgia 30602, USA

Abstract

We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared it to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leaving a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over the 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved between the species than random and nearby sequences, but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, repeat-mediated chromosomal rearrangement and high coadaptation of both male genes and cis-regulatory sequences emerge as important themes of genome divergence between these species of Drosophila.
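
To illustrate what a count of syntenic blocks measures, the following minimal Python sketch counts maximal runs of genes whose relative order is conserved along one chromosome arm. It is an illustration only and not the method used in this study: the function name `count_syntenic_blocks`, the rank-based input encoding, and the toy gene order are all assumptions introduced here.

```python
from typing import List


def count_syntenic_blocks(positions: List[int]) -> int:
    """Count maximal runs of genes whose order is conserved.

    positions[i] is the rank, in species B, of the ortholog of the i-th gene
    along a chromosome arm in species A (hypothetical encoding). Adjacent
    genes stay in the same block when their ranks in species B differ by
    exactly 1 and the run keeps a single direction (forward or inverted).
    """
    if not positions:
        return 0
    blocks = 1
    direction = 0  # 0 = not yet set, +1 = forward run, -1 = inverted run
    for prev, curr in zip(positions, positions[1:]):
        step = curr - prev
        if step in (1, -1) and direction in (0, step):
            direction = step  # extend the current block
        else:
            blocks += 1       # breakpoint: start a new block
            direction = 0
    return blocks


# Toy arm of eight genes in which the middle three genes are inverted.
print(count_syntenic_blocks([0, 1, 2, 5, 4, 3, 6, 7]))  # 3
```

In this toy arm a single paracentric inversion creates two breakpoints and therefore three blocks, which is why a block count gives a rough lower bound on the rearrangement history separating two gene orders.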

Footnotes

  • [Supplemental material is available online at www.genome.org. The annotated whole genome project has been deposited into DDBJ/EMBL/GenBank under the project accession AADE00000000. The version described in this paper is the first version, AADE01000000. The sequences of the proximal and distal Arrowhead breakpoints have been deposited in GenBank with accession nos. AY693425 and AY693426. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: P.J. de Jong and K. Osoegawa.]

  • 16 To the best of our knowledge, a comprehensive curated collection of experimentally determined cis-regulatory element information does not exist at the present time; such a resource would be of great value for analyses such as this.

  • 17 The magnitude of the p-values is in part a function of the different set sizes and should not be viewed as an estimate of the magnitude of the effect.

  • 18 This figure corresponds to 51.3% identity for all sites, because ∼70% of sites are at least partially aligned.

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3059305. Freely available online through the Genome Research Immediate Open Access option.

  • 15 Corresponding author. E-mail stephenr{at}bcm.tmc.edu; fax (713) 798-5741.

    • Accepted October 14, 2004.
    • Received July 27, 2004.