Extensive variation between inbred mouse strains due to endogenous L1 retrotransposition

  1. Keiko Akagi (1,5),
  2. Jingfeng Li (2,5),
  3. Robert M. Stephens (3,5),
  4. Natalia Volfovsky (3), and
  5. David E. Symer (2,4,6)
  1. Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, USA
  2. Basic Research Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, USA
  3. Advanced Biomedical Computing Center, Advanced Technology Program, SAIC-Frederick, Inc., Frederick, Maryland 21702, USA
  4. Laboratory of Biochemistry and Molecular Biology, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, USA
  5. These authors contributed equally to this work.

Abstract

Numerous inbred mouse strains serve as models for human diseases and diversity, but the molecular differences between them are mostly unknown. Several mammalian genomes have been assembled, providing a framework for identifying structural variations. To identify variants between inbred mouse strains at single-nucleotide resolution, we aligned 26 million individual sequence traces from four laboratory mouse strains to the C57BL/6J reference genome. We discovered and analyzed over 10,000 intermediate-length genomic variants (from 100 nucleotides to 10 kilobases) that distinguish these strains from the C57BL/6J reference. Approximately 85% of such variants are due to recent mobilization of endogenous retrotransposons, predominantly L1 elements, a proportion greatly exceeding that reported in humans. The structures and expression of many genes are altered directly by polymorphic L1 retrotransposons; these include Drosha (also called Rnasen), Parp8, Scn1a, Arhgap15, and others, as well as novel genes. L1 polymorphisms are distributed nonrandomly across the genome: they are significantly excluded from the X chromosome and from genes associated with the cell cycle, but are enriched in receptor genes. Thus, recent endogenous L1 retrotransposition has diversified genomic structures and transcripts extensively, distinguishing mouse lineages and driving a major portion of natural genetic variation.

Footnotes

  • 6 Corresponding author.

    6 E-mail symerd{at}mail.nih.gov; fax (301) 846-1638.

  • [Supplemental material is available online at www.genome.org. Novel L1 fusion transcript and genomic integrant sequences have been submitted to GenBank under accession nos. EF591871–EF591883.]

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.075770.107.

    • Received December 20, 2007.
    • Accepted March 27, 2008.