Communication

SARS-Cov-2 Interactome with Human Ghost Proteome: A Neglected World Encompassing a Wealth of Biological Data

by Tristan Cardon 1,*, Isabelle Fournier 1,2,* and Michel Salzet 1,2,*
1 Inserm U1192, University Lille, CHU Lille, Laboratory Protéomique Réponse Inflammatoire Spectrométrie de Masse (PRISM), F-59000 Lille, France
2 Institut Universitaire de France, 75000 Paris, France
* Authors to whom correspondence should be addressed.
Submission received: 12 November 2020 / Revised: 16 December 2020 / Accepted: 17 December 2020 / Published: 19 December 2020
(This article belongs to the Special Issue SARS-CoV-2: Epidemiology and Pathogenesis)

Abstract
Conventionally, eukaryotic mRNAs were thought to be monocistronic, leading to the translation of a single protein. However, large-scale proteomics has led to the massive identification of proteins translated from alternative ORFs (AltORFs) of mRNAs, or from ncRNAs, in addition to the predicted proteins issued from the reference ORF. These alternative proteins (AltProts) are not represented in conventional protein databases, and this “ghost proteome” was not considered until recently. Some of these proteins are functional, and there is growing evidence that they are involved in central functions in physiological and physiopathological contexts. Based on our experience with AltProts, we were interested in their interactions with the viral proteins of SARS-CoV-2, the virus responsible for the 2020 COVID-19 outbreak. We therefore scrutinized the data recently published by Krogan and coworkers (2020) on the SARS-CoV-2 interactome with host cells, obtained by affinity purification (co-immunoprecipitation, co-IP) with a view to drug repurposing. The initial work revealed interactions between 332 human cellular reference proteins (RefProts) and 27 viral proteins. Re-interrogation of these data using 23 viral targets and including AltProts, followed by enrichment of the interaction networks, led to the identification of 218 RefProts (in common with the initial study) plus 56 AltProts involved in 93 interactions. This demonstrates the need to take the ghost proteome into account when searching for new therapeutic targets and establishing new therapeutic strategies. Omitting the ghost proteome from the drug metabolism and pharmacokinetic (DMPK) drug development pipeline will certainly be a major limitation to the establishment of efficient therapies.

1. Introduction

Because proteins are the end products of gene expression, they have a major impact on cell regulation and are therefore main targets for the development of new drugs and therapies. Holistic approaches must thus be developed to grasp the proteome in its completeness and to find out how it relates to the upstream genes from which it is issued. Grasping the proteome can be difficult because of the broad dynamic range it spans (i.e., > 7 orders of magnitude, from 1 copy up to 10 million copies per cell) when compared to the transcriptome (only 3–4 orders of magnitude) [1]. However, thanks to the latest generation of liquid chromatography-mass spectrometry (LC-MS) instrumentation, > 5000 proteins can be identified in a single-run experiment by large-scale bottom-up proteomics [2]. Both bottom-up and top-down proteomic approaches are very powerful, but they share a major drawback: protein identification is based on databank interrogation. Databanks are thus critical to large-scale proteomic approaches, since only proteins referenced in the database can be identified. A large part of the proteins in databases such as UniProtKB/Swiss-Prot, which is the reference database in proteomics [3], is predicted from genes according to well-established rules. Under these rules, only mRNA sequences longer than 100 codons that start with an “AUG” and present a favorable Kozak consensus motif are considered to be translated, into a single protein, in accordance with the accepted idea that eukaryotic mRNAs are monocistronic. The single protein product expected from gene translation is designated as the reference protein (RefProt).
However, eukaryotic translation was finally demonstrated to be polycistronic, as already suspected in the 1990s by M. Kozak [4]. Indeed, alternative translation mechanisms such as reinitiation or leaky scanning, leading to translation from alternative ORFs (AltORFs), had already been described by that time, although they remained considered an epiphenomenon. Hence, a huge number of proteins were missing from protein databases and simply remained invisible to all proteomic studies, thereby representing a ghost proteome. This ghost proteome was eventually unveiled by two distinct approaches, one using ribosome profiling [5] and the other MS-based proteomics. In ribosome profiling, many possible ribosome binding sites were described on non-coding RNA (ncRNA) and on the untranslated regions (UTRs) of mRNA [6,7], highlighting the existence of unexpected protein products in mammals. From proteomic data, by using novel databases that included protein predictions translated from AltORFs, novel protein sequences were identified, filling the gap of good-quality data that remained unmatched after conventional database interrogation (>10% of the data) [8]. These proteins, designated as alternative proteins (AltProts), are neither proteoforms nor proteins issued from alternative splicing. Some show sequence similarities with proteins carried by other mRNAs, but others present totally new amino acid sequences. Finally, identified AltProts are found to be translated either from mRNA, including from the non-coding 5′ and 3′ UTRs or from a frame shift (+1 or +2 nucleotides) in the CDS of the RefProt, or from ncRNA [9]. Overall, large-scale bottom-up [9,10,11,12] and top-down [13,14] proteomics have enabled the identification of a significant number of these AltProts. Very importantly, AltProts were also shown to be functional and to carry out important cell functions [12,15,16,17].
In a way, the rediscovery of the “lost world” of protein products will open a new page in the history of biological mechanisms.
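The AltORF classes described above (5′UTR, 3′UTR, and frame-shifted CDS) can be illustrated with a minimal sketch that scans the three forward reading frames of an mRNA and labels each ORF relative to the annotated CDS. This is not the prediction pipeline of OpenProt or of any cited work; the function names, the toy sequence, and the `min_codons` default are assumptions made for illustration only.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna, min_codons=30):
    """Return (start, end, frame) for AUG-initiated ORFs in the 3 forward frames.

    Coordinates are 0-based and end-exclusive; `end` includes the stop codon.
    `min_codons` (stop codon excluded) is an illustrative cutoff, not a value
    taken from the article.
    """
    mrna = mrna.upper().replace("T", "U")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # first AUG after the previous stop in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def classify_orf(orf, cds_start, cds_end):
    """Label an ORF relative to the annotated reference CDS (hypothetical labels)."""
    start, end, _ = orf
    if (start, end) == (cds_start, cds_end):
        return "RefORF"
    if end <= cds_start:
        return "AltORF 5'UTR"
    if start >= cds_end:
        return "AltORF 3'UTR"
    if (start - cds_start) % 3 != 0:
        return "AltORF CDS frame shift"
    return "in-frame overlap"


# Toy transcript: 12-nt 5'UTR, 24-nt CDS, 11-nt 3'UTR (invented sequence).
toy_mrna = "GGAUGAAAUAAC" + "AUGGCAUGGCCUAAGCUUGAAUAA" + "CAUGCCCUAGG"
labels = [classify_orf(o, 12, 36) for o in find_orfs(toy_mrna, min_codons=2)]
```

On the toy transcript, the scan recovers the reference ORF plus one AltORF in each of the three locations. A real predictor would also need to handle ncRNA transcripts, which have no annotated CDS.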
A total of ~450,000 proteins have ultimately been predicted in humans and are publicly available through the OpenProt database [18]. This is about 20-fold more than estimated so far from conventional databases (20,353 entries in June 2020 for reviewed RefProts). It is thus possible to gain considerable knowledge by considering AltProts in already generated data. Previously, proteomic data reuse has enabled the discovery of the ghost proteome interactome using cross-linking MS (XL-MS) data from HeLa cells [19,20]. In that study, AltProts were found to interact with RefProts involved in the regulation of protein translation, as evidenced by the participation of AltATAD2 in the RPL10/AUF1 complex [20]. Since then, the study of a glioma cell line (NCH82) treated with a protein kinase A activator, which induces a cellular phenotypic change, has confirmed the presence of AltProts in the signaling pathways of protein translation. AltProts were also shown to interact with cytoskeleton proteins (e.g., AltTRNAU1AP, AltMAP2, and AltEPHA5 interacting with TPM4) [10].
Based on our experience with AltProts, we were interested in their involvement in the development of the SARS-CoV-2 virus, responsible for the 2020 COVID-19 outbreak. We therefore scrutinized the data recently published by Gordon and the Krogan team [21] on the SARS-CoV-2 interactome with host cells, obtained by co-IP with a view to drug repurposing. In that work, the team cloned the viral target proteins with a 2XStrep tag, based on the GenBank sequence for SARS-CoV-2 isolate 2019-nCoV/USA-WA1/2020, accession MN985325, downloaded on 24 January 2020. The tagged proteins were expressed in human cells (HEK-293T/17) in order to identify their physical interaction partners. By affinity purification coupled to mass spectrometry (AP-MS), 332 high-confidence interactions were identified between the viral proteins and the host. Based on these identifications, gene ontology enrichment analysis was performed to identify the pathways involved in the viral infection; moreover, structure prediction was performed for some viral proteins, together with some measurements of interaction, e.g., between ORF6 and the NUP98-RAE1 complex. Finally, drug repurposing targeting the identified host proteins was proposed, based on chemoinformatic analysis of the SARS-CoV-2-interacting partners and molecular docking. In this way, 69 FDA-approved therapeutic compounds were evaluated against SARS-CoV-2 infection; some were tested in viral growth and cytotoxicity assays. Techniques and methodology are described in detail in the article of 30 April 2020: “A SARS-CoV-2 protein interaction map reveals targets for drug repurposing”.

2. Material and Methods

Ghost Proteins Databases

The study was carried out using the OpenProt database (www.openprot.org) [18,22], which is derived from the predicted H. sapiens alternative proteins (GRCh38.p5, Assembly: GCA_000001405.20). This database compiles all proteins coming from non-coding regions of mRNA, such as the 5′ and 3′ UTRs, from a shift in reading frame (+2 or +3), and the proteins discovered to be encoded in ncRNA. The RefProts from UniProtKB were added to this database, for a total of 658,263 entries. Proteome Discoverer 2.3 (PD2.3) with the label-free quantification node was used to analyze the RAW data from the ProteomeXchange consortium, available via the PRIDE repository under dataset number PXD018117 [21]. The following parameters were applied in PD2.3: trypsin as enzyme, 2 missed cleavages, methionine oxidation as variable modification, carbamidomethylation of cysteines as static modification, precursor mass tolerance of 10 ppm, and fragment mass tolerance of 0.6 Da. The validation was performed using Percolator with an FDR set to 0.001%. A consensus workflow was then applied for the statistical arrangement, using high-confidence protein identification and requiring at least one unique peptide per identified protein.
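The two mass tolerances quoted above are expressed in different units: the precursor tolerance is relative (ppm) and the fragment tolerance is absolute (Da). The sketch below shows how such tolerances translate into matching windows. It is a generic illustration with hypothetical function names, not a description of the internals of Proteome Discoverer.

```python
def ppm_window(theoretical_mz, ppm=10.0):
    """Return the (low, high) m/z window for a relative tolerance given in ppm."""
    delta = theoretical_mz * ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


def precursor_matches(observed_mz, theoretical_mz, ppm=10.0):
    """True if the observed precursor lies within +/- ppm of the theoretical m/z."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= ppm


def fragment_matches(observed_mz, theoretical_mz, tol_da=0.6):
    """True if the observed fragment lies within an absolute tolerance in Da."""
    return abs(observed_mz - theoretical_mz) <= tol_da


# At m/z 1000, a 10 ppm tolerance corresponds to a window of +/- 0.01.
low, high = ppm_window(1000.0)
```

Because the ppm window scales with m/z, the same 10 ppm setting is twice as wide in absolute terms at m/z 1000 as at m/z 500, whereas the 0.6 Da fragment window is constant.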
The identified proteins were correlated with the co-IP bait described in the dataset and in the PRIDE project [21]. Proteins identified with a fold change of at least 2 between the bait expression and the co-IP control were kept as potential interactors. The network was drawn in Cytoscape V3.8.0 [23], and the DyNet [24] application was used to compare the network published in NDEx (according to [21]) with our result. A color code is given for the nodes: red hexagons are the viral proteins (baits), blue circles are the RefProts, and green circles are the AltProts. For the edges, red means an interaction not recovered in our analysis, grey means an interaction recovered in both analyses, green means an interaction specific to our analysis with a ratio <100, and purple means an interaction specific to our analysis with a ratio of 100. A ratio of 100 means that the protein is not detected in the control, so that its expression can be linked to the expression of the viral protein.
The identified AltProts (Supplementary Table S1) were described using the information retrieved from the OpenProt, Ensembl, and RefSeq databases.
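The interactor filter and the edge color code described above can be summarized in a short sketch. The function names, the abundance inputs, and the handling of undetected proteins are assumptions made for illustration; the actual ratios were computed by the label-free quantification workflow.

```python
MIN_FOLD_CHANGE = 2.0    # sample/control cutoff used to define an interactor
ABSENT_IN_CONTROL = 100  # conventional ratio when the control shows no signal


def sample_control_ratio(sample_abundance, control_abundance):
    """Return the sample/control ratio, or None if the protein is not in the sample."""
    if sample_abundance <= 0:
        return None
    if control_abundance <= 0:
        return ABSENT_IN_CONTROL  # not detected in the control
    return sample_abundance / control_abundance


def edge_color(in_published_network, sample_abundance, control_abundance):
    """Map one bait-prey pair to the edge color of the network comparison.

    Returns None when the pair is neither published nor recovered.
    """
    ratio = sample_control_ratio(sample_abundance, control_abundance)
    recovered = ratio is not None and ratio >= MIN_FOLD_CHANGE
    if in_published_network:
        return "grey" if recovered else "red"
    if not recovered:
        return None
    return "purple" if control_abundance <= 0 else "green"


colors = [
    edge_color(True, 10.0, 2.0),   # published and recovered
    edge_color(True, 3.0, 2.0),    # published, below the cutoff here
    edge_color(False, 8.0, 0.0),   # new, absent from the control
    edge_color(False, 8.0, 2.0),   # new, measurable fold change
]
```

Testing for absence in the control directly, rather than comparing the ratio to 100, avoids confusing a genuinely measured 100-fold change with the placeholder value.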
Blast analysis (non-redundant sequences and RefSeq) of the sequences of the AltProts identified in interaction with the SARS-CoV-2 proteins shows 27 AltProts with a homology rate greater than 80% (average of the coverage and identity percentages). Most of these proteins are derived from ncRNAs and are therefore not isoforms of the homologous proteins, because they originate from a different RNA sequence. From the total list of AltProts identified, Blast analysis revealed 16 AltProts with no significant homology (<80%); these 16 may carry a known protein domain, based on a few identities with a referenced protein, but experimental data are needed to establish the context of action of these AltProts. Similarly, 16 other AltProts gave no Blast result in the human database (non-redundant sequences and RefSeq). To follow and understand the mode of action of SARS-CoV-2 in the host cell, and considering the bat origin of the virus, the protein sequences without Blast result were queried against the bat database (taxid:9397); 7 of the 16 AltProts show similarity to bat proteins, with a homology rate between 35% and 78%.
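The homology rate used above is the average of the Blast query coverage and the identity percentage. A minimal sketch of this scoring and of the resulting three-way classification is given below; the function names and the category labels are hypothetical, and the treatment of a score of exactly 80% is an assumption, since the text does not specify it.

```python
def homology_rate(coverage_pct, identity_pct):
    """Average of query coverage and identity, both given in percent."""
    return (coverage_pct + identity_pct) / 2


def classify_altprot(best_hit, threshold=80.0):
    """Classify an AltProt from its best Blast hit.

    `best_hit` is a (coverage_pct, identity_pct) tuple, or None when Blast
    returns nothing. A score of exactly `threshold` is treated as not
    significant (an assumption of this sketch).
    """
    if best_hit is None:
        return "no Blast result"
    if homology_rate(*best_hit) > threshold:
        return "homolog"
    return "no significant homology"


# Invented example hits, not values from Supplementary Table S1.
examples = {
    "hit_a": classify_altprot((90.0, 85.0)),  # rate 87.5
    "hit_b": classify_altprot((60.0, 70.0)),  # rate 65.0
    "hit_c": classify_altprot(None),
}
```

Averaging the two percentages means that a short, perfectly identical match and a long, moderately similar match can receive the same score, which is one reason the text calls for experimental confirmation of domain-level similarities.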

3. Results and Discussion

We investigated the presence of AltProts potentially involved in the interaction between the virus and the host cell, reflecting the possible role of the ghost proteome during a viral infection. The SARS-CoV-2 virus has a ~30 kb genome coding for at least 12 ORFs, able to produce at least 36 proteins (10 canonical + 26 nsps) [25,26], as known at the time of the study. Later research on the translational capabilities of viral RNA in host cells showed the presence of viral proteins translated from shifted reading frames [27]; interestingly, these could be considered viral AltProts according to our definition of AltProts. The initial work [21] revealed interactions between 332 human cellular RefProts and 27 viral proteins. We re-interrogated these data using 23 viral targets: although some AltProts are known to be present at the cell membrane, we focused on the viral proteins present in the cytoplasm, which are potentially involved in the replication mechanisms of the virus in the host cell. With the AltProt database included, this led to the identification of 218 RefProts (in common with the initial study) plus 56 AltProts involved in 93 interactions (Figure 1), of which 17 interacted with more than one viral protein. Of these AltProts, 59% originate from ncRNA and 41% from mRNA; of the latter, 39% come from the 3′UTR region, 34% from the 5′UTR region, and 26% from a CDS shift (Table 1). Furthermore, 26 AltProts were identified only in the host cells (samples) in which the viral proteins were expressed, and not in the control. These proteins are therefore specific to the stimulated condition; since an expression variation cannot be determined, the sample/control ratio is set to 100. The other 30, identified both under stimulation and in the control, show an expression variation of at least two-fold. Some identified proteins and interactions differ from the initial study because a different methodology was applied in the data reuse.
This is a consequence of using a larger database, including both RefProts and AltProts, which in turn required the use of Proteome Discoverer in place of MaxQuant, following the recommendations of the OpenProt developers [18,28]. Nevertheless, a strong FDR filter was used, a unique peptide was verified for each identified protein, and a sample/control cutoff of 2 was applied to define an interactor. Furthermore, after a Blast against a human non-redundant database, 25 AltProts present a strong homology (>80% for the average of the coverage and identity percentages) to a RefProt, although they were identified with a peptide unique to the AltProt sequence. These are not isoforms: they come from a gene other than that of the RefProt, or from an ncRNA, and share a common domain with the referenced or predicted protein. Global analysis of the biological processes of the proteins identified as homologs shows that the main pathway affected is protein metabolism (Figure 2A), in particular signaling pathways such as protein translation and elongation (EIF2S2; EEF1A1; RPL35A; RPL4; RPS17; RPS18; RPL18A) and the regulation of protein synthesis by insulin (UBE2D3; HSPD1; HSPA8; PRKDC; HNRNPA1); interestingly, the proteins RPL35A, RPL4, RPS17, RPS18, and RPL18A are found in the biological process of viral RNA translation and in the pathway “Influenza Viral RNA Transcription and Replication”.
Interestingly, SARS-CoV-2 proteins have been described to affect the phosphorylation state of host cell proteins; for example, LARP1 and RRP9, which interact with the N protein, were shown to be differentially phosphorylated [29]. It was therefore not surprising to recover some AltProts with a riboprotein domain in interaction with SARS-CoV-2 proteins, such as IP_668819, IP_637436, IP_639311, IP_597129, IP_750273, and IP_667059. These proteins were identified as interacting with the non-structural proteins nsp8 (IP_637436, IP_750273, IP_667059) and nsp12 (IP_639311), two viral proteins described as being involved in viral RNA replication [30,31,32]. Finding interactions with ribosomal proteins and AltProts was thus expected: the viral proteins nsp8 and nsp12 are described as interacting with the RNA of the host cell, and the ribosomal proteins are also bound to RNA, which increases their likelihood of interaction. More than 37 ribosomal proteins (RPL) can be observed in interaction with nsp8, RefProts and AltProts combined.
Historically, the SARS coronavirus (SARS-CoV) is known to be present in a large number of bats. Although bat genomes are less studied and annotated, genomic and proteomic data banks exist. We therefore examined whether the AltProt sequences with no homology in humans could have homologs in bats. Of the 16 AltProts analyzed, 7 have a sequence homology of between 35% and 78% with a bat protein. Because they are unknown and their sequences are unreferenced, AltProts can present sequence similarities with other species that were unexpected and not predicted until now. As a result, they could be a source of inter-species virus transmission, as well as the key to a new therapeutic approach in cases such as SARS-CoV-2 pathology.
The experiments reanalyzed in this study make it possible to demonstrate the interactions of viral proteins with the proteins of the host cell. In this context, we have no information on the protein interactions inside the host cell, so determining the functions of the identified AltProts is difficult, since they can be linked to any of the signaling pathways affected by the viral protein. Domain homology allows us to speculate on the function of the 27 AltProts with homology. For the others (32 proteins with <80% homology or without homology), considering their viral interacting protein and the RefProts that interact with this viral protein, it is possible to hypothesize the signaling pathways involving these AltProts. For instance, among the five AltProts interacting with the viral protein “E”, namely IP_219869 (AltDGKH), IP_724315 (AltHMGN2P3), IP_788706 (AltEIF2S2P3), IP_555327 (AltAC006386.1), and IP_594707 (AltEEF1A1), three do not reach 80% homology with a RefProt (IP_219869, IP_555327, IP_594707). However, the Gene Ontology analysis of the RefProts found in interaction with E (Figure 2B) shows that the most represented Biological Processes are “regulation of histone H3-K36 trimethylation” and “Synaptic vesicle budding from endosome”, represented by the RefProts BRD4 and AP3B1. Thus, these three AltProts, like IP_724315 (AltHMGN2P3), which is homologous to the “non-histone chromosomal protein HMG-17”, may be involved in histone modification or chromosomal binding and, therefore, in epigenetic phenomena.
In the same way, six AltProts interact with the viral protein “M”; among them, two do not present any homology with RefProts, whereas the other four are homologous to members of the tubulin family (TUBA3, TUBB2BP, and TUBAP2). Moreover, the Gene Ontology analysis of the RefProts in interaction with M (Figure 2C) gives “microtubule nucleation by microtubule organizing center” as the main Biological Process. It is therefore plausible that the two AltProts of unknown function are involved in microtubule organization and protein transport. Finally, the two AltProts (IP_671071, IP_565887) exhibiting low homology with bat proteins and observed in interaction with Orf8 may be cytoskeleton proteins, like the AltProts IP_774695, IP_593099, IP_774693, and IP_656465, which exhibit strong homologies with the tubulin family; they may also be linked to the signaling pathway of post-translational glycosylation, in line with the Biological Processes of the RefProts interacting with Orf8 (Figure 2D).
Overexpression of SARS-CoV-2 proteins in cell lines, followed by affinity purification and mass spectrometry of the host proteins bound to the bait, suggests an interaction, which needs to be validated experimentally (i.e., “demonstrated”) using additional assays. Nevertheless, some AltProts can already be foreseen as key players in the hijacking of the cell by the virus, such as AltHSPA8P11, which is found to interact with seven viral proteins. A cluster of AltProts centered on nsp6, nsp10, nsp11, Orf3b, Orf6, Orf7a, and Orf9b is also identified. Very interestingly, most of these proteins are involved in the inhibition of interferon production, innate immunity modulation, cycle arrest, and host translation inhibition [21]. A major interest of large-scale interactomics is the possibility to screen for drug repurposing, as presented by the authors in their initial study. AltProts must now be considered as new potential therapeutic targets. Indeed, among the AltProts identified, IP_2336782 (AltDUSP4) is found in interaction with nsp6. AltDUSP4 shares 54% sequence homology with the C3a anaphylatoxin chemotactic receptor (C3AR1), which was recently shown to be involved in severe forms of COVID-19. C3AR1 is found over-activated in some patients, leading to a hyper-inflammatory profile that induces persistence of the virus and a strong immunopathology [33]. Thus, AltDUSP4 is a potential target to reduce severe symptoms of COVID-19. Interestingly, the search for partner molecules via the IUPHAR/BPS Guide to Pharmacology and BindingDB shows a sequence similarity between AltDUSP4 and the ATP binding cassette subfamily G member 2. It should be noted that the viral protein nsp6 was previously identified as a target of Bafilomycin A1, a potent and selective inhibitor of the vacuolar H+-ATPase [21]. Several drugs are known to act on ATPase activity, e.g., cyclosporin A, KS 176, compound 14, Ko143, and Fumitremorgin C, and could thus target both nsp6 and AltDUSP4.
Taken together, these new findings highlight the presence of many unknown proteins in the interactome between the host cells and the viral proteins, which are involved in major pathways such as the innate immune response or translation regulation. Nevertheless, this is a preliminary and descriptive study of AltProt identification in a previously published dataset, and dedicated research is required to rigorously specify the function and the role of these proteins. This establishes that, besides the reference proteome, a ghost proteome exists, whose consideration would be highly beneficial both to the understanding of the pathophysiological mechanism of the virus and to the establishment of therapeutic strategies.

Author Contributions

Conceptualization: M.S., I.F.; methodology: T.C.; formal analysis: T.C.; investigation: M.S., T.C.; resources: M.S., I.F.; data curation: T.C.; writing: I.F., M.S., T.C.; original draft: I.F., M.S., T.C.; supervision, project: I.F., M.S., T.C.; administration: M.S., I.F.; funding acquisition: I.F., M.S. All authors have read and agreed to the published version of the manuscript.

Supplementary Materials

The following are available online at https://0-www-mdpi-com.brum.beds.ac.uk/2076-2607/8/12/2036/s1, Table S1: List of AltProts identified to be interacting with the SARS-Cov-2 viral proteins.

Funding

This research was funded by I-site, grant number Coughzyme. The APC was funded by Inserm.

Acknowledgments

This research was supported by funding from Ministère de lʼEnseignement Supérieur, de la Recherche et de lʼInnovation (MESRI), Institut National de la Santé et de la Recherche Médicale (Inserm), I-Site Ulne and Université de Lille.

Conflicts of Interest

The authors declare no competing interests.

References

  1. Zubarev, R.A. The challenge of the proteome dynamic range and its implications for in-depth proteomics. Proteomics 2013, 13, 723–726.
  2. Meier, F.; Geyer, P.E.; Virreira Winter, S.; Cox, J.; Mann, M. BoxCar acquisition method enables single-shot proteomics at a depth of 10,000 proteins in 100 minutes. Nat. Methods 2018, 15, 440–448.
  3. Boeckmann, B.; Bairoch, A.; Apweiler, R.; Blatter, M.-C.; Estreicher, A.; Gasteiger, E.; Martin, M.J.; Michoud, K.; O’donovan, C.; Phan, I.; et al. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 2003, 31, 365–370.
  4. Kozak, M. Regulation of translation in eukaryotic systems. Annu. Rev. Cell Biol. 1992, 8, 197–225.
  5. Ingolia, N.T.; Ghaemmaghami, S.; Newman, J.R.S.; Weissman, J.S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 2009, 324, 218–223.
  6. Ingolia, N.T.; Lareau, L.F.; Weissman, J.S. Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell 2011, 147, 789–802.
  7. Bazzini, A.A.; Johnstone, T.G.; Christiano, R.; MacKowiak, S.D.; Obermayer, B.; Fleming, E.S.; Vejnar, C.E.; Lee, M.T.; Rajewsky, N.; Walther, T.C.; et al. Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation. EMBO J. 2014, 33, 981–993.
  8. Vanderperre, B.; Lucier, J.-F.; Bissonnette, C.; Motard, J.; Tremblay, G.; Vanderperre, S.; Wisztorski, M.; Salzet, M.; Boisvert, F.-M.; Roucou, X.; et al. Direct Detection of Alternative Open Reading Frames Translation Products in Human Significantly Expands the Proteome. PLoS ONE 2013, 8, e70698.
  9. Mouilleron, H.; Delcourt, V.; Roucou, X. Death of a dogma: Eukaryotic mRNAs can code for more than one protein. Nucleic Acids Res. 2016, 44, 14–23.
  10. Cardon, T.; Franck, J.; Coyaud, E.; Laurent, E.M.N.; Damato, M.; Maffia, M.; Vergara, D.; Fournier, I.; Salzet, M. Alternative proteins are functional regulators in cell reprogramming by PKA activation. Nucleic Acids Res. 2020.
  11. Murgoci, A.N.; Cardon, T.; Aboulouard, S.; Duhamel, M.; Fournier, I.; Cizkova, D.; Salzet, M. Reference and Ghost Proteins Identification in Rat C6 Glioma Extracellular Vesicles. iScience 2020, 23, 101045.
  12. Samandi, S.; Roy, A.V.; Delcourt, V.; Lucier, J.F.; Gagnon, J.; Beaudoin, M.C.; Vanderperre, B.; Breton, M.A.; Motard, J.; Jacques, J.F.; et al. Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins. Elife 2017, 6, e27860.
  13. Delcourt, V.; Franck, J.; Leblanc, E.; Narducci, F.; Robin, Y.M.; Gimeno, J.P.; Quanico, J.; Wisztorski, M.; Kobeissy, F.; Jacques, J.F.; et al. Combined Mass Spectrometry Imaging and Top-down Microproteomics Reveals Evidence of a Hidden Proteome in Ovarian Cancer. EBioMedicine 2017, 21, 55–64.
  14. Delcourt, V.; Franck, J.; Quanico, J.; Gimeno, J.P.; Wisztorski, M.; Raffo-Romero, A.; Kobeissy, F.; Roucou, X.; Salzet, M.; Fournier, I. Spatially-Resolved Top-down Proteomics Bridged to MALDI MS Imaging Reveals the Molecular Physiome of Brain Regions. Mol. Cell. Proteom. 2018, 17, 357–372.
  15. Cao, X.; Khitun, A.; Na, Z.; Phoodokmai, T.; Sappakhaw, K.; Olatunji, E.; Uttamapinant, C.; Slavoff, S.A. Alt-RPL36 downregulates the PI3K-AKT-mTOR signaling pathway by interacting with TMEM24. bioRxiv 2020.
  16. Dubois, M.L.; Meller, A.; Samandi, S.; Brunelle, M.; Frion, J.; Brunet, M.A.; Toupin, A.; Beaudoin, M.C.; Jacques, J.F.; Lévesque, D.; et al. UBB pseudogene 4 encodes functional ubiquitin variants. Nat. Commun. 2020, 11, 1–12.
  17. Chen, J.; Brunner, A.D.; Cogan, J.Z.; Nuñez, J.K.; Fields, A.P.; Adamson, B.; Itzhak, D.N.; Li, J.Y.; Mann, M.; Leonetti, M.D.; et al. Pervasive functional translation of noncanonical human open reading frames. Science 2020, 367, 140–146.
  18. Brunet, M.A.; Brunelle, M.; Lucier, J.F.; Delcourt, V.; Levesque, M.; Grenier, F.; Samandi, S.; Leblanc, S.; Aguilar, J.D.; Dufour, P.; et al. OpenProt: A more comprehensive guide to explore eukaryotic coding potential and proteomes. Nucleic Acids Res. 2019, 47, D403–D410.
  19. Liu, F.; Rijkers, D.T.S.S.; Post, H.; Heck, A.J.R.R. Proteome-wide profiling of protein assemblies by cross-linking mass spectrometry. Nat. Methods 2015, 12, 1179–1184.
  20. Cardon, T.; Salzet, M.; Franck, J.; Fournier, I. Nuclei of HeLa cells interactomes unravel a network of ghost proteins involved in proteins translation. Biochim. Biophys. Acta-Gen. Subj. 2019.
  21. Gordon, D.; Jang, G.; Bouhaddou, M.; Xu, J.; Obernier, K.; O’Meara, M.; Guo, J.; Swaney, D.; Tummino, T.; Hüttenhain, R.; et al. A SARS-CoV-2-Human Protein-Protein Interaction Map Reveals Drug Targets and Potential Drug-Repurposing. bioRxiv Prepr. Serv. Biol. 2020, 19, 4.
  22. Delcourt, V.; Brunelle, M.; Roy, A.V.; Jacques, J.-F.; Salzet, M.; Fournier, I.; Roucou, X. The Protein Coded by a Short Open Reading Frame, Not by the Annotated Coding Sequence, Is the Main Gene Product of the Dual-Coding Gene MIEF1. Mol. Cell. Proteom. 2018, 17, 2402–2411.
  23. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A software Environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 2498–2504.
  24. Goenawan, I.H.; Bryan, K.; Lynn, D.J. DyNet: Visualization and analysis of dynamic molecular interaction networks. Bioinformatics 2016, 32, 2713–2715.
  25. Zhu, N.; Zhang, D.; Wang, W.; Li, X.; Yang, B.; Song, J.; Zhao, X.; Huang, B.; Shi, W.; Lu, R.; et al. A novel coronavirus from patients with pneumonia in China, 2019. N. Engl. J. Med. 2020, 382, 727–733.
  26. Kim, D.; Lee, J.Y.; Yang, J.S.; Kim, J.W.; Kim, V.N.; Chang, H. The Architecture of SARS-CoV-2 Transcriptome. Cell 2020, 181, 914–921.e10.
  27. Jungreis, I.; Nelson, C.W.; Ardern, Z.; Finkel, Y.; Krogan, N.J.; Sato, K.; Ziebuhr, J.; Stern-Ginossar, N.; Pavesi, A.; Firth, A.E.; et al. Conflicting and ambiguous names of overlapping ORFs in SARS-CoV-2: A homology-based resolution. Biochemistry 2020.
  28. Brunet, M.A.; Roucou, X. Mass spectrometry-based proteomics analyses using the openprot database to unveil novel proteins translated from non-canonical open reading frames. J. Vis. Exp. 2019, 2019, 59589.
  29. Bouhaddou, M.; Memon, D.; Meyer, B.; White, K.M.; Rezelj, V.V.; Marrero, M.C.; Polacco, B.J.; Melnyk, J.E.; Ulferts, S.; Kaake, R.M.; et al. The Global Phosphorylation Landscape of SARS-CoV-2 Infection. Cell 2020.
  30. Hillen, H.S.; Kokic, G.; Farnung, L.; Dienemann, C.; Tegunov, D.; Cramer, P. Structure of replicating SARS-CoV-2 polymerase. Nature 2020, 584, 154–156.
  31. Kirchdoerfer, R.N.; Ward, A.B. Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors. Nat. Commun. 2019, 10, 1–9.
  32. Subissi, L.; Posthuma, C.C.; Collet, A.; Zevenhoven-Dobbe, J.C.; Gorbalenya, A.E.; Decroly, E.; Snijder, E.J.; Canard, B.; Imbert, I. One severe acute respiratory syndrome coronavirus protein complex integrates processive RNA polymerase and exonuclease activities. Proc. Natl. Acad. Sci. USA 2014, 111, E3900–E3909.
  33. Jodele, S.; Köhl, J. Tackling COVID-19 infection through complement-targeted immunotherapy. Br. J. Pharmacol. 2020.
Figure 1. Interaction network of the SARS-CoV-2 viral proteins from co-IP experiments [21], re-analyzed with AltProt, reference protein (RefProt), and viral databases. The previously established network is compared with the new query using the DyNet Analyzer application in Cytoscape v3.8.0. Node colors: red, viral protein (bait); blue, RefProt; green, AltProt. Edge colors: red, interaction not recovered in our analysis; grey, interaction recovered in both analyses; green, interaction specific to our analysis with a ratio < 100; purple, interaction specific to our analysis with a ratio of 100.
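The edge legend of Figure 1 amounts to a comparison between two edge sets. The following is a minimal sketch of that comparison, not DyNet's own implementation. The function name `classify_edges`, the placeholder prey names, and the reading of the "ratio" as a 0 to 100 score attached to each edge of the re-analysis are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (viral bait, human prey)


def classify_edges(previous: Set[Edge], current: Dict[Edge, float]) -> Dict[Edge, str]:
    """Return the Figure 1 legend colour for every edge of both networks.

    `previous` holds the edges of the earlier network; `current` maps each
    edge of the re-analysis to its ratio (assumed here to be a 0-100 score).
    """
    colours: Dict[Edge, str] = {}
    for edge in previous - set(current):
        colours[edge] = "red"  # not recovered in the re-analysis
    for edge, ratio in current.items():
        if edge in previous:
            colours[edge] = "grey"  # recovered in both analyses
        elif ratio >= 100:
            colours[edge] = "purple"  # new edge, ratio of 100
        else:
            colours[edge] = "green"  # new edge, ratio < 100
    return colours


# Toy input: placeholder prey names and ratios, not values from the study.
previous = {("M", "RefProt_A"), ("E", "RefProt_B")}
current = {
    ("M", "RefProt_A"): 100.0,
    ("E", "AltProt_X"): 100.0,
    ("orf8", "AltProt_Y"): 80.0,
}
for edge, colour in sorted(classify_edges(previous, current).items()):
    print(edge, colour)
```

Keeping the ratio only on the edges of the re-analysis mirrors the legend, where the ratio distinguishes only edges specific to the new query.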
Figure 2. Gene Ontology analysis of the RefProts identified in the interaction network. ToppGene analysis was performed on (A) the gene names of the RefProts identified with more than 80% homology and the RefProts identified as interacting with the co-IP bait; (B) the RefProts interacting with E; (C) the RefProts interacting with M; and (D) the RefProts interacting with Orf8. The pathways attributed to the RefProts give a clue about the AltProts linked to the same bait.
Table 1. Alternative proteins (AltProts) identified as interacting with the SARS-CoV-2 viral proteins. The co-IP raw data were re-interrogated using OpenProt [18]. For each of the 56 identified AltProts, the table gives the name of the gene coding for the RNA transcript (GN), the accession number of the transcript (TA), the name of the AltProt, the type of transcript from which the AltProt is issued and, for AltProts originating from mRNA, the location on the mRNA.
| Accession | Gene Name (GN) | Transcript Accession (TA) | AltProt | Type | Location | Viral Protein in Interaction |
|---|---|---|---|---|---|---|
| IP_555327 | AC006386.1 | ENST00000624491 | AltAC006386.1 | mRNA | 3’UTR | E |
| IP_581922 | AC018641.7 | ENST00000435950 | AltAC018641.7 | ncRNA | - | orf9b |
| IP_659614 | ANKRD20A11P | ENST00000442192 | AltANKRD20A11P | ncRNA | - | nsp7 |
| IP_187691 | ARL3 | NM_004311.3 | AltARL3 | mRNA | 3’UTR | nsp12 |
| IP_077449 | BATF3 | NM_018664.2 | AltBATF3 | mRNA | CDS | nsp6 |
| IP_2387661 | BCL11A | XM_017004337.1 | AltBCL11A | mRNA | 5’UTR | M |
| IP_565887 | C9orf116 | ENST00000371789 | AltC9orf116 | mRNA | 5’UTR | orf8 |
| IP_075271 | CDC73 | XM_006711537.3 | AltCDC73 | mRNA | CDS | nsp11 |
| IP_766056 | CEP290 | ENST00000547691 | AltCEP290 | mRNA | 5’UTR | nsp11, orf6, orf7a |
| IP_691726 | CTC-398G3.1 | ENST00000483614 | AltCTC-398G3.1 | ncRNA | - | N, nsp8 |
| IP_219869 | DGKH | NM_152910.5 | AltDGKH | mRNA | CDS | E, orf6, orf7a |
| IP_2336782 | DUSP4 | XM_011544428.2 | AltDUSP4 | mRNA | 5’UTR | nsp6 |
| IP_235699 | EDC3 | ENST00000565602 | AltEDC3 | mRNA | 3’UTR | nsp7 |
| IP_594707 | EEF1A1 | ENST00000309268 | AltEEF1A1 | mRNA | 5’UTR | E, nsp14 |
| IP_788706 | EIF2S2P3 | ENST00000428356 | AltEIF2S2P3 | ncRNA | - | E |
| IP_2396759 | GJA5 | XM_017001044.1 | AltGJA5 | mRNA | 5’UTR | nsp8 |
| IP_711582 | HGS | ENST00000577012 | AltHGS | mRNA | 5’UTR | nsp13 |
| IP_775502 | HIGD1AP10 | ENST00000527837 | AltHIGD1AP10 | ncRNA | - | orf10 |
| IP_724315 | HMGN2P3 | ENST00000433603 | AltHMGN2P3 | ncRNA | - | E |
| IP_557348 | HNRNPA1P28 | ENST00000424481 | AltHNRNPA1P28 | ncRNA | - | nsp11 |
| IP_572435 | HSPA8P11 | ENST00000508840 | AltHSPA8P11 | ncRNA | - | N, nsp10, nsp11, nsp4, nsp9, orf10, orf3b |
| IP_658154 | HSPD1P7 | ENST00000447985 | AltHSPD1P7 | ncRNA | - | orf9b |
| IP_289249 | KCNE1 | XM_017028342.1 | AltKCNE1 | mRNA | 3’UTR | nsp14 |
| IP_075761 | LAD1 | ENST00000631576 | AltLAD1 | mRNA | CDS | orf9b |
| IP_671071 | LOC101929023 | ENST00000434879 | AltLOC101929023 | ncRNA | - | orf8 |
| IP_2361135 | LOC102723525 | XR_925379.2 | AltLOC102723525 | ncRNA | - | orf10 |
| IP_2268667 | LOC105372714 | XM_017028195.1 | AltLOC105372714 | mRNA | 5’UTR | nsp15 |
| IP_2266298 | LOC107985441 | XR_001754616.1 | AltLOC107985441 | ncRNA | - | nsp15 |
| IP_2354489 | LOC107986350 | XR_001742414.1 | AltLOC107986350 | ncRNA | - | orf3b |
| IP_143572 | LYRM2 | NM_020466.4 | AltLYRM2 | mRNA | 3’UTR | nsp15 |
| IP_745252 | MEG8 | ENST00000553465 | AltMEG8 | ncRNA | - | nsp6, orf3b |
| IP_213668 | METAP2 | XM_005268583.3 | AltMETAP2 | mRNA | CDS | nsp10, nsp6, orf6 |
| IP_729791 | MT1X | ENST00000568370 | AltMT1X | mRNA | 3’UTR | nsp11, nsp6 |
| IP_230046 | NKX2-1-AS1 | ENST00000521292 | AltNKX2-1-AS1 | ncRNA | - | nsp6 |
| IP_597201 | NOP56P1 | ENST00000440030 | AltNOP56P1 | ncRNA | - | orf3b |
| IP_105102 | POC1A | XM_011533561.1 | AltPOC1A | mRNA | CDS | orf9b |
| IP_581419 | RP11-10F11.4 | ENST00000634439 | AltRP11-10F11.4 | ncRNA | - | nsp6, orf6 |
| IP_734708 | RP11-24M17.3 | ENST00000567565 | AltRP11-24M17.3 | ncRNA | - | nsp11, orf3b |
| IP_667059 | RP11-397P13.7 | ENST00000427282 | AltRP11-397P13.7 | ncRNA | - | nsp8 |
| IP_591742 | RP11-471B18.1 | ENST00000407538 | AltRP11-471B18.1 | ncRNA | - | nsp11, nsp6 |
| IP_612631 | RP11-553P9.1 | ENST00000509116 | AltRP11-553P9.1 | ncRNA | - | orf10 |
| IP_639311 | RPL36AP13 | ENST00000457490 | AltRPL36AP13 | ncRNA | - | nsp12 |
| IP_750273 | RPL4P1 | ENST00000496596 | AltRPL4P1 | ncRNA | - | N, nsp8 |
| IP_637436 | RPL5P9 | ENST00000448118 | AltRPL5P9 | ncRNA | - | nsp8 |
| IP_597129 | RPS17P1 | ENST00000396783 | AltRPS17P1 | ncRNA | - | orf10 |
| IP_668819 | RPS23P9 | ENST00000448848 | AltRPS23P9 | ncRNA | - | orf8 |
| IP_594653 | SENP6 | ENST00000474906 | AltSENP6 | ncRNA | - | M, orf7a |
| IP_769089 | SPRYD4 | ENST00000338146 | AltSPRYD4 | mRNA | 3’UTR | nsp11, orf10 |
| IP_713094 | SSTR2 | ENST00000357585 | AltSSTR2 | mRNA | 3’UTR | orf3b |
| IP_656465 | TUBA3GP | ENST00000410028 | AltTUBA3GP | ncRNA | - | M, orf10, orf8 |
| IP_774695 | TUBAP2 | ENST00000530835 | AltTUBAP2 | ncRNA | - | M, nsp11, nsp4, nsp6, nsp7, nsp8, orf10, orf6, orf7a, orf8, orf9b |
| IP_557241 | TUBB4AP1 | ENST00000450755.1 | AltTUBB4AP1 | ncRNA | - | orf3a |
| IP_593099 | TUBB2BP1 | ENST00000404155 | AltTUBB2BP1 | ncRNA | - | M, nsp11, orf10, orf8 |
| IP_572422 | TUBBP1 | ENST00000518096 | AltTUBBP1 | ncRNA | - | nsp9 |
| IP_665452 | UBE2D3P1 | ENST00000436669 | AltUBE2D3P1 | ncRNA | - | orf7a |
| IP_274314 | ZNF569 | XM_006723046.2 | AltZNF569 | mRNA | 3’UTR | nsp9 |
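Because one AltProt can interact with several viral proteins, each row of Table 1 may stand for more than one interaction. As a minimal sketch, the snippet below expands a handful of rows copied from the table into one (bait, AltProt) pair per interaction and counts the pairs per viral protein. The names `ROWS`, `to_edges` and `interactions_per_bait` are illustrative and do not come from the authors' pipeline.

```python
from collections import Counter
from typing import Dict, List, Tuple

# A few rows copied from Table 1: AltProt name -> viral proteins in its last column.
ROWS: Dict[str, List[str]] = {
    "AltCEP290": ["nsp11", "orf6", "orf7a"],
    "AltDGKH": ["E", "orf6", "orf7a"],
    "AltEEF1A1": ["E", "nsp14"],
    "AltTUBB2BP1": ["M", "nsp11", "orf10", "orf8"],
    "AltZNF569": ["nsp9"],
}


def to_edges(rows: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Expand each table row into one (viral bait, AltProt) pair per interaction."""
    return [(bait, alt) for alt, baits in rows.items() for bait in baits]


def interactions_per_bait(edges: List[Tuple[str, str]]) -> Counter:
    """Count how many AltProt interactions each viral protein has."""
    return Counter(bait for bait, _ in edges)


edges = to_edges(ROWS)
print(len(edges), "interactions from", len(ROWS), "AltProts")
for bait, count in sorted(interactions_per_bait(edges).items()):
    print(bait, count)
```

Applied to all 56 rows, the same expansion should give the 93 AltProt interactions reported in the abstract. The resulting pair list is also the long format that network tools usually expect for import.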
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cardon, T.; Fournier, I.; Salzet, M. SARS-Cov-2 Interactome with Human Ghost Proteome: A Neglected World Encompassing a Wealth of Biological Data. Microorganisms 2020, 8, 2036. https://0-doi-org.brum.beds.ac.uk/10.3390/microorganisms8122036