Structure of the Highly Conserved HERC2 Gene and of Multiple Partially Duplicated Paralogs in Human

Abstract

Recombination between chromosome-specific low-copy repeats (duplicons) is an underlying mechanism for several genetic disorders. Recently, a chromosome 15 duplicon was discovered in the common breakpoint regions of Prader–Willi and Angelman syndrome deletions. We previously identified the large HERC2 transcript as an ancestral gene in this duplicon, with ∼11 HERC2-containing duplicons, and demonstrated that recessive mutations in mouse Herc2 lead to a developmental syndrome, juvenile development and fertility 2 (jdf2). We have now constructed and sequenced a genomic contig of HERC2, revealing a total of 93 exons spanning ∼250 kb and a CpG island promoter. A processed ribosomal protein L41 pseudogene occurs in intron 2 of HERC2, and putative VNTRs occur in intron 70 (28 copies, ∼76-bp repeat) and 3′ exon 40 through intron 40 (6 copies, ∼62-bp repeat). Sequence comparisons show that HERC2-containing duplicons have undergone several deletion, inversion, and dispersion events to form complex duplicons in 15q11, 15q13, and 16p11. To further understand the developmental role of HERC2, a highly conserved Drosophila ortholog was characterized, with 70% amino acid sequence identity to human HERC2 over the carboxy-terminal 743 residues. Combined, these studies provide significant insights into the structure of complex duplicons and into the evolutionary pathways of formation, dispersal, and genomic instability of duplicons. Our results establish that some genes not only have a protein coding function but can also play a structural role in the genome.

[The sequence data described in this paper have been submitted to GenBank under accession nos. AF189221 (Drosophila HERC2 partial cDNA), AC004583 (human HERC2 exons 1–52, genomic); AF224242–AF224257 (human HERC2 exons 54–70, partial genomic sequences); AF225400–AF225409 (human HERC2 exons 71–93, partial genomic sequences). The exon-intron boundaries for exons 53–93 are derived from BACs R-142A11 and 263O22. Additional information is available as a supplementary table at www.genome.org.]

Footnotes

  • 3 Present address: Genentech, Inc., Department of Bioinformatics, South San Francisco, California 94080 USA.

  • 4 Corresponding author.

  • E-MAIL rxn19{at}po.cwru.edu; FAX (216) 368-3432.

    • Received September 28, 1999.
    • Accepted January 5, 2000.