Recompleting the Caenorhabditis elegans genome

  1. Erich M. Schwarz6
  1. Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8583, Japan;
  2. Department of Pathology, Stanford University, Stanford, California 94305, USA;
  3. Department of Genetics, Stanford University, Stanford, California 94305, USA;
  4. Department of Zoology and Michael Smith Laboratories, University of British Columbia, Vancouver V6T 1Z3, British Columbia, Canada;
  5. Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota 55454, USA;
  6. Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
  7. These authors contributed equally to this work.

  • Corresponding authors: afire{at}stanford.edu, moris{at}edu.k.u-tokyo.ac.jp, ems394{at}cornell.edu
  • Abstract

    Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84%) were also found in two outgroup strains, implying deficiencies in N2. Over 98% of N2 genes encoded unchanged products in VC2010; moreover, we predicted ≥53 new genes in VC2010. The recompleted genome of C. elegans should be a valuable resource for genetics, genomics, and systems biology.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.244830.118.

    • Freely available online through the Genome Research Open Access option.

    • Received October 4, 2018.
    • Accepted March 11, 2019.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

    Genome Res. 29: 1009-1022 © 2019 Yoshimura et al.; Published by Cold Spring Harbor Laboratory Press
