RNA-seq accuracy and reproducibility for the mapping and quantification of influenza defective viral genomes

  1. Nadia Naffakh4,9
  1. Unité d'Immunobiologie des Cellules Dendritiques, Institut Pasteur, INSERM U1223, 75015 Paris, France
  2. Université de Paris, Sorbonne Paris Cité, 75013 Paris, France
  3. Viral Populations and Pathogenesis Unit, Institut Pasteur, CNRS UMR 3569, 75015 Paris, France
  4. Unité de Génétique Moléculaire des Virus à ARN, Institut Pasteur, CNRS UMR 3569, Université de Paris, Paris, France
  5. Hub de Bioinformatique et Biostatistique, Institut Pasteur, CNRS USR 3756, 75015 Paris, France
  6. Centre National de Référence des Virus des Infections Respiratoires, Institut Pasteur, 75015 Paris, France
  7. Pasteur International Bioresources network (PIBnet), Plateforme de Microbiologie Mutualisée (P2M), Institut Pasteur, 75015 Paris, France

  • Corresponding author: nadia.naffakh{at}pasteur.fr

  • 10 These authors contributed equally to this work.

  • 8 Present address: Sorbonne Université, Hôpital de la Pitié-Salpêtrière, 75013 Paris, France

  • 9 Present address: Unité Biologie des ARN et Virus Influenza, Institut Pasteur, CNRS UMR 3569, 75015 Paris, France

Abstract

Like most RNA viruses, influenza viruses generate defective viral genomes (DVGs) with large internal deletions during replication. Accumulating evidence supports the biological relevance of such DVGs. However, further understanding of the molecular mechanisms that underlie the production and biological activity of DVGs depends on the sensitivity and accuracy of detection methods, that is, next-generation sequencing (NGS) technologies and the associated bioinformatics algorithms. Although many algorithms have been developed, their sensitivity and reproducibility have mostly been assessed on simulated data. Here, we introduce DG-seq, a time-efficient pipeline for DVG detection and quantification, and a set of biological controls to assess the performance not only of our bioinformatics algorithm but also of the upstream NGS steps. Using these tools, we provide the first rigorous comparison of the two commonly used sample processing methods for RNA-seq, with or without a PCR preamplification step. Our data show that preamplification confers only a limited advantage in sensitivity and introduces both size- and sequence-dependent biases in DVG quantification, thereby providing a strong rationale to favor preamplification-free methods. We further examine the features of DVGs produced by wild-type and transcription-defective (PA-K635A or PA-R638A) influenza viruses, and show that the PA mutants produce DVGs with greater diversity and at higher frequency than the wild-type virus. Finally, we demonstrate a significant enrichment in DVGs showing direct, A/T-rich sequence repeats at the deletion breakpoint sites. Our findings provide novel insights into the mechanisms of influenza virus DVG production.
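
To make the notion of a direct repeat at a deletion breakpoint concrete, the following is a minimal Python sketch. It is not the DG-seq implementation, which this page does not describe. The function names `junction_repeat` and `at_fraction`, the toy reference sequence, and the coordinate convention (0-based, half-open, with the deletion removing `ref[start:end]`) are all assumptions made for illustration. The idea is that when the same short sequence occurs at both breakpoints, the exact junction position is ambiguous, and the length of that ambiguity equals the length of the repeat.

```python
def junction_repeat(ref: str, start: int, end: int) -> str:
    """Return the direct repeat shared by the two breakpoints of a deletion.

    Assumed convention (hypothetical, for illustration): the deletion
    removes ref[start:end], using 0-based half-open coordinates.
    """
    if not 0 <= start < end <= len(ref):
        raise ValueError("expected 0 <= start < end <= len(ref)")

    # Extend leftwards while the base just before each breakpoint matches.
    left = 0
    while start - left > 0 and ref[start - left - 1] == ref[end - left - 1]:
        left += 1

    # Extend rightwards while the base at each breakpoint matches.
    right = 0
    while end + right < len(ref) and ref[start + right] == ref[end + right]:
        right += 1

    return ref[start - left:start + right]


def at_fraction(seq: str) -> float:
    """Fraction of A/T (or U) bases in seq; 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    return sum(base in "ATU" for base in seq.upper()) / len(seq)


# Toy reference (hypothetical): two copies of AATT flank the deleted region.
toy_ref = "CCGC" + "AATT" + "CGCGCG" + "AATT" + "GGCC"
repeat = junction_repeat(toy_ref, 8, 18)
print(repeat, at_fraction(repeat))
```

In this toy case the deletions `ref[8:18]` and `ref[4:14]` yield the same defective sequence, so both are reported with the same repeat. A real analysis would also need to compare the observed repeat frequency against a null expectation, which this sketch does not attempt.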

  • Received August 1, 2020.
  • Accepted September 2, 2020.

This article is distributed exclusively by the RNA Society for the first 12 months after the full-issue publication date (see http://rnajournal.cshlp.org/site/misc/terms.xhtml). After 12 months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
