Reconstructing the Genomic Architecture of Ancestral Mammals: Lessons From Human, Mouse, and Rat Genomes

Guillaume Bourque¹, Pavel A. Pevzner², and Glenn Tesler³,⁴

  1. Centre de Recherches Mathématiques, Université de Montréal, Canada H3C 3J7
  2. Department of Computer Science and Engineering, University of California–San Diego, La Jolla, California 92093, USA
  3. Department of Mathematics, University of California–San Diego, La Jolla, California 92093, USA

Abstract

Recent analysis of genome rearrangements in the human and mouse genomes revealed evidence for more rearrangements than previously thought and shed light on previously unknown features of mammalian evolution, such as breakpoint reuse and numerous microrearrangements. However, two-way analysis cannot reveal the genomic architecture of ancestral mammals or assign rearrangement events to different lineages. Thus, the “original synteny” problem introduced earlier by Nadeau and Sankoff remains unsolved, as at least three mammalian genomes are required to derive the ancestral mammalian karyotype. We show that the availability of the rat genome allows one to reconstruct a putative genomic architecture of the ancestral murid rodent genome. This reconstruction suggests that the ancestral murid genome retained many of the chromosome associations previously postulated for the placental ancestor and reveals others that were beyond the resolution of cytogenetic, radiation hybrid mapping, and chromosome painting techniques. Three-way analysis of rearrangements leads to a reliable reconstruction of the genomic architecture of specific regions in the murid ancestor, including the X chromosome, and for the first time allows one to assign major rearrangement events to one of the human, mouse, and rat lineages. Our analysis implies that the rate of rearrangements is much higher in murid rodents than in the human lineage and confirms the existence of rearrangement hot-spots in all three lineages.

Footnotes

  • [Supplemental material is available online at www.genome.org.]

  • 5 Table 1 illustrates some potential pitfalls of the artificial separation of large-scale and small-scale rearrangements. Increasing the minimum synteny block length filters out smaller blocks and reduces the number of large-scale rearrangements among the blocks that remain. We concurrently increased the gap threshold (described later) in proportion to the minimum synteny block length, which allowed more anchors into these larger synteny blocks and resulted in more microrearrangements. Apart from these issues, there is ambiguity in how to treat the chromosome ends: because the telomeres have not been sequenced (too many repeats), we do not know whether rearrangements involving them are bounded at the exact end of the chromosome or somewhere within the end regions. The lack of sequence data precludes us from computing the blocks (if any) at the ends and the rearrangements involving them. Some chromosome ends do not participate in rearrangements, so it is ambiguous whether to call them breakpoint regions.

  • 6 In Pevzner and Tesler (2003a), we used anchors produced by Michael Kamal (Waterston et al. 2002) using another procedure, which output neither the signs of the anchors nor information on the repeat families. We used heuristics to guess the signs of those anchors when possible, and discarded the 2% of the anchors for which this determination was not possible. Some of the guessed anchor signs may have been incorrect, leading to an undercount of microrearrangements.

  • 7 Although the Hannenhalli-Pevzner algorithm finds a most parsimonious rearrangement scenario for two genomes, the real scenario is not necessarily a most parsimonious one, and the order of rearrangement events within a most parsimonious scenario usually remains uncertain. Availability of more than two genomes remedies some of these limitations and provides a means to infer the gene order in the mammalian ancestor (Bourque and Pevzner 2002).

  • 8 In many cases, MGR does not provide an exact solution. It attempts to determine a most parsimonious tree based on macrorearrangements of the synteny block orders in the genomes. The synteny block order of the computed ancestral nodes is only approximate, because there are other possible orders that give identical tree scores, although exploration of neighboring alternative solutions suggests that most of the adjacencies (including all of the chromosome associations) are valid. Localizing the analysis to particular regions of the genome, or including data from appropriate additional species acting as outgroups, can help resolve these issues.

  • 9 The rate of rearrangements on the path from the placental ancestor to human may be significantly smaller than 1.6, as most of the rearrangements on the human branch in Figure 3 may have been acquired on the path from the placental to the murid rodent ancestor.

  • 10 An optimal scenario and a unique ancestor are not typical. With arbitrary data, MGR approximates an optimal scenario but does not guarantee that it will find one. In general, the distance between two genomes along the computed tree equals or exceeds their pairwise distance computed without the tree constraints; in this case the two are equal, proving that the scenario is optimal. Also, in general, the recovered ancestor may not be unique.

  • 11 We emphasize that by reusing breakpoints, we do not mean multiple use of exactly the same genomic position as an endpoint of rearrangements, but rather the fact that the breakpoint regions host endpoints for multiple rearrangement events.

  • 12 The breakpoint reuse analysis depends crucially on the accurate generation of synteny blocks and separation between macro- and microrearrangements (Sankoff and Nadeau 2003). Our breakpoint reuse estimates assume this separation is accurate and is not blurred by microrearrangements on the border between breakpoint regions and synteny blocks.

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1975204.

  • 4 Corresponding author. E-MAIL gptesler{at}ucsd.edu; FAX (858) 534-7029.

    • Accepted November 17, 2003.
    • Received September 13, 2003.
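
The optimality argument in footnote 10 can be illustrated on a toy example. The sketch below is not MGR and not the Hannenhalli-Pevzner algorithm. It assumes single-chromosome genomes written as signed permutations of synteny blocks, allows only reversals, and computes exact distances by brute-force breadth-first search, which is feasible only for a handful of blocks. The function names and the three four-block "genomes" are hypothetical and are not taken from the paper's data. The check relies on the triangle inequality: the path between two leaves through the ancestor can never be shorter than their pairwise distance, so if every such path matches the pairwise distance, no ancestor can give a lower total for this three-leaf tree.

```python
from collections import deque
from itertools import combinations


def reversals(perm):
    """Yield every signed permutation reachable from perm by one reversal."""
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            # A reversal flips the order and the sign of blocks i..j.
            segment = tuple(-x for x in reversed(perm[i:j + 1]))
            yield perm[:i] + segment + perm[j + 1:]


def reversal_distance(src, dst):
    """Exact reversal distance by breadth-first search (toy sizes only)."""
    src, dst = tuple(src), tuple(dst)
    seen = {src: 0}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            return seen[cur]
        for nxt in reversals(cur):
            if nxt not in seen:
                seen[nxt] = seen[cur] + 1
                queue.append(nxt)
    raise ValueError("dst is not a signed permutation of the same blocks")


def scenario_is_tight(leaves, ancestor):
    """True if each leaf-to-leaf path through the ancestor equals the pairwise distance."""
    to_anc = {name: reversal_distance(g, ancestor) for name, g in leaves.items()}
    return all(
        to_anc[a] + to_anc[b] == reversal_distance(leaves[a], leaves[b])
        for a, b in combinations(leaves, 2)
    )


# Hypothetical block orders, each one reversal away from (1, 2, 3, 4).
leaves = {
    "human": (1, -2, 3, 4),
    "mouse": (1, 2, -4, -3),
    "rat": (1, 2, 3, -4),
}
print(scenario_is_tight(leaves, (1, 2, 3, 4)))   # True: tree paths equal pairwise distances
print(scenario_is_tight(leaves, (1, 2, 3, -4)))  # False: human-mouse path is 4, pairwise is 2
```

In the second call the candidate ancestor coincides with the rat order, so the human-to-mouse path through it costs four reversals even though the two orders are only two reversals apart. A gap of this kind does not by itself prove a scenario suboptimal; it only means that this particular lower-bound argument cannot certify it.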