Published December 18, 2018 | Version v3
Preprint Open

Haplotype-aware diplotyping from noisy long reads

  • 1. Max Planck Institute for Informatics, Saarbruecken, Germany; Center for Bioinformatics, Saarland University, Saarbruecken, Germany
  • 2. UC Santa Cruz Genomics Institute, University of California Santa Cruz, USA

Description

SNP calls for individual NA12878 produced by MarginPhase and WhatsHap on PacBio and Oxford Nanopore data, together with the version of the source code used to generate this dataset.

Paper Abstract: Current genotyping approaches for single nucleotide variations rely on short, accurate reads from second generation sequencing devices. Presently, third generation sequencing platforms are rapidly becoming more widespread, yet approaches for leveraging their long but error-prone reads for genotyping are lacking. Here, we introduce a novel statistical framework for the joint inference of haplotypes and genotypes from noisy long reads, which we term diplotyping. Our technique takes full advantage of linkage information provided by long reads. We validate hundreds of thousands of candidate variants that have not yet been included in the high-confidence reference set of the Genome-in-a-Bottle effort.

Source Code:

WhatsHap: bitbucket.org/whatshap/whatshap

MarginPhase: github.com/benedictpaten/marginPhase

Files (924.5 MB)

MD5 checksum and size of each file:

  • md5:df75832ef1a52be1a55f80717dd60ec4 (103.3 MB)
  • md5:3fa35a2f37e63285cbadb1a315adae7a (1.4 MB)
  • md5:5c434ba655f6a9fbfb49500af6804ca7 (52.4 MB)
  • md5:212545e8271dd2beab1f7a5ffde0ea38 (1.5 MB)
  • md5:8bbb441e7bd80d7dc282ca18e0550920 (83.5 MB)
  • md5:e446d6bbec3e1545fedadfd1868dcf94 (1.5 MB)
  • md5:74c491919ea05769cc90615b89d77c52 (356.8 MB)
  • md5:9083ba35de7b8101f9425f39d62aae50 (145.6 MB)
  • md5:2111c74504346ebdbfda1727d70b5769 (1.6 MB)
  • md5:9213f47a71a3c8f3e6a16046af5f0901 (130.6 MB)
  • md5:2ce263fd7147539e057f9cabc1e615ca (1.5 MB)
  • md5:e4e1845b8fde57b4571fe47c52a673ad (44.7 MB)
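
The MD5 values listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical and not part of WhatsHap or MarginPhase, and the local file path is an assumption, since this record listing does not show the file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:df75...'."""
    expected = listed.strip().lower()
    # Accept the checksum with or without the 'md5:' prefix.
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("downloaded_file", "md5:df75832ef1a52be1a55f80717dd60ec4")` returns `True` only if the local file (a placeholder name here) has the first checksum in the list. A mismatch usually means the download was truncated or a different file was fetched.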