Simulated reads for benchmarking SARS-CoV-2 lineage abundance estimation
- 1. Delft University of Technology
- 2. Texas A&M University - San Antonio
Description
To evaluate the accuracy of lineage abundance estimates from amplicon-based and whole-genome sequencing, we simulated paired-end reads from amplicons determined by AmpliDiff, as well as reads spanning full genomes. The simulated abundance of each lineage is based on its relative abundance among the genomes in the corresponding dataset. The data consist of the following 8 independent datasets:
- 200 bp reads from the Netherlands based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage,
- 400 bp reads from the Netherlands based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage,
- 200 bp reads from the Netherlands based on whole genome sequencing at 100x coverage,
- 400 bp reads from the Netherlands based on whole genome sequencing at 100x coverage,
- 200 bp reads from Texas based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage,
- 400 bp reads from Texas based on AmpliDiff amplicons (1, 2, 5 or 10 amplicons) at 1000x coverage,
- 200 bp reads from Texas based on whole genome sequencing at 100x coverage,
- 400 bp reads from Texas based on whole genome sequencing at 100x coverage.
Each of the 8 datasets contains 20 sets of reads, each generated with a different random seed. The genomes used for the Netherlands-based simulations can be obtained from GISAID under accession ID EPI_SET_230825fe, and the genomes used for the Texas-based simulations under accession ID EPI_SET_230825pe.
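As an illustration of what "relative abundance of a lineage within the dataset" could mean in practice, the sketch below computes each lineage's share of the genomes in a collection. It assumes that abundance is simply the fraction of genomes assigned to a lineage; the record does not spell out the exact computation, and the function name and example labels are hypothetical.

```python
from collections import Counter


def relative_abundances(lineage_labels):
    """Return {lineage: fraction of genomes assigned to that lineage}.

    Assumption: one lineage label per genome, and abundance is the plain
    fraction of genomes carrying that label.
    """
    counts = Counter(lineage_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no lineage labels given")
    return {lineage: n / total for lineage, n in counts.items()}


# Hypothetical lineage labels for five genomes (not taken from the dataset).
labels = ["B.1.1.7", "B.1.1.7", "BA.2", "BA.5", "B.1.1.7"]
abundances = relative_abundances(labels)
```

Abundances computed this way sum to 1 and can serve as the ground truth against which estimates from the simulated reads are compared.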
Files (37.1 GB)
MD5 checksum | Size |
---|---|
ed890c54529ebbfdc998d844e156675a | 2.0 GB |
e26c29969082ef4f9a10a67937d2a2af | 1.8 GB |
161ffc4b119eb1fcfcbd427e36d44bb9 | 7.3 GB |
c5c9451d28a2c6f1d18411998259add9 | 2.9 GB |
e093c9eae0687adc83f1a7a39fab5dda | 3.4 GB |
52d09f19130defaf7647bc5b14127bcd | 3.2 GB |
be1b08038a5eb791cbe9bfbe254f3474 | 12.6 GB |
e17030c2de9a82cbf2e7855fd0ee6d5f | 3.8 GB |
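Because the archives are large, it can be worth checking a download against its listed MD5 checksum before use. The following is a minimal sketch using only Python's standard `hashlib`; the helper names are hypothetical, and the file is read in chunks so a multi-gigabyte archive does not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a listed checksum.

    The expected value may carry an "md5:" prefix, as on the record page.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(local_path, "md5:ed890c54529ebbfdc998d844e156675a")` should return `True` for an intact copy of the first listed file, where `local_path` is wherever that file was saved.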
Additional details
Related works
- Is cited by: Preprint, doi: 10.1101/2023.07.22.550164
References
- Khare, S., et al. (2021). GISAID's Role in Pandemic Response. China CDC Weekly, 3(49): 1049-1051. doi: 10.46234/ccdcw2021.255. PMCID: 8668406