Genome-wide genetic variations are highly correlated with proximal DNA methylation patterns

  1. Shinichi Morishita1,5
  1. Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa 277-0882, Japan;
  2. Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo 113-0033, Japan;
  3. Department of Molecular Preventive Medicine, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan;
  4. Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo 108-8639, Japan

    Abstract

    5-methyl-cytosines at CpG sites frequently mutate into thymines, accounting for a large proportion of spontaneous point mutations. The repair system would leave substantial numbers of errors in neighboring regions if the synthesis of erased gaps around deaminated 5-methyl-cytosines is error-prone. Indeed, we identified an unexpected genome-wide role of the CpG methylation state as a major determinant of proximal natural genetic variation. Specifically, 507 Mbp (∼18%) of the human genome was within 10 bp of a CpG site; in these regions, the single nucleotide polymorphism (SNP) rate significantly increased by ∼50% (P < 10^−566 by a two-proportion z-test) if the neighboring CpG sites were methylated. To reconfirm this finding in another vertebrate, we compared six single-base resolution methylomes in two inbred medaka (Oryzias latipes) strains with sufficient genetic divergence (3.4%). We found that the SNP rate also increased by ∼50% (P < 10^−2170), and the substitution rates in all dinucleotides increased simultaneously (P < 10^−441) around methylated CpG sites. In the hypomethylated regions, the “CGCG” motif was significantly enriched (P < 10^−680) and evolutionarily conserved (P = ∼0.203%), and slow CpG deamination rather than fast CpG gain was seen, indicating a possible role of CGCG as a candidate cis-element for the hypomethylation state. In regions that were hypermethylated in germline-like tissues but hypomethylated in somatic liver cells, the SNP rate was significantly lower than that in regions hypomethylated in both tissues, suggesting a positive selective pressure during DNA methylation reprogramming. This is the first report showing that the CpG methylation state is significantly correlated with the characteristics of evolutionary change in neighboring DNA.
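
    The P-values quoted above for the two-proportion z-test are far below the smallest value a double-precision float can hold, so they can only be handled on a log scale. The following is a minimal Python sketch of one way to do that, not the authors' code: the function names are hypothetical, the pooled-variance form of the test is an assumption, and the counts in the usage example are invented purely for illustration.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for H0: p1 == p2.

    x1, x2 are event counts (e.g. SNP positions) and n1, n2 are the
    numbers of positions examined in each group.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def log10_two_sided_p(z):
    """log10 of the two-sided normal P-value, stable for very large |z|."""
    z = abs(z)
    if z < 30:
        # erfc(z / sqrt(2)) equals the two-sided P-value and does not
        # underflow in this range.
        return math.log10(math.erfc(z / math.sqrt(2)))
    # Asymptotic upper tail: sf(z) ~ phi(z) / z * (1 - 1/z^2 + 3/z^4),
    # evaluated in log space to avoid underflow.
    log_sf = (
        -0.5 * z * z
        - math.log(z)
        - 0.5 * math.log(2 * math.pi)
        + math.log(1 - 1 / z**2 + 3 / z**4)
    )
    return (log_sf + math.log(2)) / math.log(10)


if __name__ == "__main__":
    # Invented counts: SNPs near methylated vs. unmethylated CpG sites.
    z = two_proportion_z(x1=150_000, n1=10_000_000, x2=100_000, n2=10_000_000)
    print(f"z = {z:.1f}, log10(P) = {log10_two_sided_p(z):.1f}")
```

    Reporting log10(P) rather than P keeps extreme results such as those in the abstract comparable without rounding them to zero.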

    Footnotes

    • 5 Corresponding authors

      E-mail quwei{at}cb.k.u-tokyo.ac.jp

      E-mail moris{at}cb.k.u-tokyo.ac.jp

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.140236.112.

      Freely available online through the Genome Research Open Access option.

    • Received March 6, 2012.
    • Accepted May 24, 2012.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
