COVID-19: viral–host interactome analyzed by network based-approach model to study pathogenesis of SARS-CoV-2 infection
Journal of Translational Medicine volume 18, Article number: 233 (2020)
Abstract
Background
Epidemiological, virological and pathogenetic characteristics of SARS-CoV-2 infection are under evaluation. A better understanding of the pathophysiology associated with COVID-19 is crucial to improve treatment modalities and to develop effective prevention strategies. Transcriptomic and proteomic data on the host response against SARS-CoV-2 are still anecdotal; currently available data from other coronavirus infections are therefore a key source of information.
Methods
We investigated selected molecular aspects of three human coronavirus (HCoV) infections, namely SARS-CoV, MERS-CoV and HCoV-229E, through a network-based approach. A functional analysis of the HCoV–host interactome was carried out in order to provide a theoretical host–pathogen interaction model for HCoV infections and to translate the results into predictions for SARS-CoV-2 pathogenesis. The 3D model of the S-glycoprotein of SARS-CoV-2 was compared to the structures of the corresponding SARS-CoV, HCoV-229E and MERS-CoV S-glycoproteins. The interactomes of SARS-CoV, MERS-CoV and HCoV-229E with the host were inferred through published protein–protein interactions (PPI) as well as gene co-expression, triggered by HCoV S-glycoprotein in host cells.
Results
Although the amino acid sequences of the S-glycoprotein were found to differ between the various HCoVs, the structures showed high similarity, with the best 3D structural overlap shared by SARS-CoV and SARS-CoV-2, consistent with the shared ACE2 predicted receptor. The host interactome, linked to the S-glycoprotein of SARS-CoV and MERS-CoV, mainly highlighted innate immunity pathway components, such as Toll-like receptors, cytokines and chemokines.
Conclusions
In this paper, we developed a network-based model with the aim of defining molecular aspects of pathogenic phenotypes in HCoV infections. The resulting pattern may facilitate structure-guided pharmaceutical and diagnostic research, with the prospect of identifying potential new biological targets.
Background
In December 2019, a novel coronavirus (SARS-CoV-2) was first identified as a zoonotic pathogen of humans in Wuhan, China, causing a respiratory infection with associated bilateral interstitial pneumonia. The disease caused by SARS-CoV-2 was named COVID-19 by the World Health Organization and has been classified as a global pandemic, since it has spread rapidly to all continents. As of May 20, 2020, there have been 4,889,287 confirmed COVID-19 cases worldwide, with 322,457 deaths reported to the WHO [1]. Whilst clinical and epidemiological data on COVID-19 have become readily available, information on the pathogenesis of the SARS-CoV-2 infection has not been forthcoming [2]. Transcriptomic and proteomic data on the host response against SARS-CoV-2 are scant, and no effective therapeutics or vaccines for COVID-19 are available yet.
Coronaviruses (CoVs) typically affect the respiratory tract of mammals, including humans, and lead to mild to severe respiratory tract infections [3]. Many emerging HCoV infections have spilled over from animal reservoirs, such as HCoV-OC43 and HCoV-229E, which cause mild diseases such as common colds [4, 5]. During the past two decades, two highly pathogenic HCoVs, severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV), have led to global epidemics with high morbidity and mortality [6]. In this period, a large amount of experimental data associated with the two infections has allowed a better understanding of the molecular mechanism(s) of coronavirus infection and has enhanced pathways for developing new drugs, diagnostics and vaccines; however, the identification of host factors stimulating (proviral factors) or restricting (antiviral factors) infection remains poorly understood [7]. Structures of many proteins of SARS-CoV and MERS-CoV, and their biological interactions with other viral and host proteins, have been widely explored, including through experimental testing of small molecule inhibitors with anti-viral effects [8, 9]. ACE2, expressed in type 2 alveolar cells in the lung, has been identified as the receptor of SARS-CoV and SARS-CoV-2, while dipeptidyl peptidase DPP4 was identified as the specific receptor for MERS-CoV [10, 11].
The investigation of structural genomics and interactomics of SARS-CoV-2 can be implemented through systematic mapping of protein–protein interactions (PPI) between SARS-CoV-2 and the human host, together with an integrated bioinformatics approach [12, 13]. Structural analysis of specific SARS-CoV-2 proteins, in particular Spike glycoproteins (S-glycoproteins), and their interactions with human proteins, can guide the identification of the putative functional sites and help to better define the pathologic phenotype of the infection. This functional interaction analysis between the host and other HCoVs, combined with an evolutionary sequence analysis of SARS-CoV-2, can be used to guide new treatment and prevention interventions.
Here we investigated biologically and clinically relevant molecular targets of three human coronavirus (HCoV) infections using a network-based approach. A functional analysis of the HCoV–host interactome was carried out in order to provide a theoretical host–pathogen interaction model for HCoV infections, and to predict viable models for SARS-CoV-2 pathogenesis. Three HCoVs causing respiratory diseases were used as the model targets, namely SARS-CoV, which shares a strong genetic similarity with SARS-CoV-2, MERS-CoV and HCoV-229E.
Methods
Comparative reconstruction of S-glycoprotein in HCoVs
The reconstruction of the virus–host interactome was carried out using the RWR algorithm to explore the human PPI network and the multilayer PPI platform enriched with gene expression data sets. A total of 259 sequences of CoVs infecting different animal hosts (Additional file 1: Table S1) were downloaded from the GISAID and NCBI databases in order to evaluate the variability in the S gene. SARS-CoV, HCoV-229E, MERS-CoV and other CoV full genome sequence groups were aligned with MAFFT [14]; synonymous and non-synonymous mutations and amino acid similarity were calculated using the SSE program with a sliding window of 250 nucleotides and a step of 25 nucleotides [15]. A homology model was built for the amino acid sequence of the S-glycoprotein, derived from the full genome sequence “SARS-CoV-2/INMI1/human/2020/ITA” (MT066156.1). The SWISS-MODEL server was used to construct three-dimensional models for the S-glycoprotein of SARS-CoV-2 [16]. Among proteins with a 3D structure, the best match with the “SARS-CoV-2/INMI1/human/2020/ITA” sequence was 6VSB.1, which was evaluated on the basis of the identity between the two amino acid sequences and the QMEAN value provided by the SWISS-MODEL server. The model of a single chain was overlapped with the three-dimensional structure of the S-glycoprotein single chain belonging to SARS-CoV (5WRG), HCoV-229E (6U7H.1) and MERS-CoV (5X59), using Chimera 1.14 [17]. In order to better evaluate the conservation of the sequence at each site, all sequences were aligned with MAFFT and the topology of all structures was compared. The detailed description of the reconstruction of the S-glycoprotein structure is reported in Additional file 2.
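The sliding-window scan along the alignment can be illustrated with a minimal Python sketch. It assumes two already aligned nucleotide sequences of equal length and reports the fraction of identical, non-gap positions in each window of 250 nucleotides moved in steps of 25. The function name and the toy sequences are illustrative only; this is not the SSE program and it does not compute synonymous or non-synonymous rates.

```python
def sliding_window_identity(seq_a, seq_b, window=250, step=25):
    """Return (start, identity) pairs for each window over two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped; a window
    consisting only of gaps yields an identity of None.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        matches = 0
        compared = 0
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        for a, b in pairs:
            if a == "-" or b == "-":
                continue  # ignore alignment gaps
            compared += 1
            if a.upper() == b.upper():
                matches += 1
        results.append((start, matches / compared if compared else None))
    return results


# Toy example (made-up data): 300-nt sequences differing in the first 10 positions.
ref = "A" * 300
alt = "C" * 10 + "A" * 290
profile = sliding_window_identity(ref, alt)
for start, identity in profile:
    print(start, identity)
```

Regions of low identity in such a profile correspond to the more variable parts of the genome, which is how a scan of this kind can point to a divergent gene.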
PPI and gene co-expression network
Network analysis, based on protein–protein interactions and gene expression data, was performed in order to view all possible virus–host protein interactions during the HCoV infections. Since the SARS-CoV-2 genome exhibits substantial similarity to the SARS-CoV genome [18], and consequently so does the proteome [19], we hypothesized that several molecular interactions observed in the SARS-CoV interactome will be preserved in the SARS-CoV-2 interactome. Virus–host interactomes (SARS-CoV, MERS-CoV, HCoV-229E) were inferred through published PPI data, using two publicly accessible databases (STRING Viruses and VirHostNet), as well as published scientific reports with a focus on virus–host interactions [20,21,22]. As a next step, the virus–host PPI list extracted in this first step was merged with additional PPI databases, i.e. BioGrid, InnateDB-All, IMEx, IntAct, MatrixDB, MBInfo, MINT, Reactome, Reactome-FIs, UniProt, VirHostNet, BioData and CCSB Interactome Database, using the R packages PSICQUIC and biomaRt [23, 24]. In total, a large PPI database was assembled, including 13,020 nodes and 71,496 interactions.
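The merging step can be pictured as building one undirected graph from several edge lists while discarding duplicates and self-interactions. The Python sketch below is a simplified illustration with made-up interaction pairs; it does not reproduce the PSICQUIC or biomaRt queries, and the helper name and the `HOST_X` identifier are hypothetical.

```python
def merge_ppi_sources(*edge_lists):
    """Merge several lists of (protein_a, protein_b) pairs into one undirected network.

    Returns (nodes, edges); edges is a set of frozensets, so that A-B and B-A
    are counted as the same interaction.
    """
    edges = set()
    for edge_list in edge_lists:
        for a, b in edge_list:
            if a == b:
                continue  # drop self-interactions
            edges.add(frozenset((a, b)))
    nodes = set()
    for edge in edges:
        nodes.update(edge)
    return nodes, edges


# Toy inputs standing in for a virus-host list and a human PPI list.
virus_host = [("S", "ACE2"), ("S", "CD209")]
human_ppi = [("ACE2", "S"), ("ACE2", "HOST_X"), ("HOST_X", "HOST_X")]
nodes, edges = merge_ppi_sources(virus_host, human_ppi)
print(len(nodes), "nodes,", len(edges), "interactions")
```

Storing each interaction as an unordered pair is one simple way to avoid counting the same interaction twice when it is reported by more than one source.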
The gene expression data set was built from the Protein Atlas database, using tissue and cell line data [25]. To identify the most likely interactions, and to obtain functional information, random walk with restart (RWR), a state-of-the-art guilt-by-association approach implemented in the R package RandomWalkRestartMH [26], was used. It allows a proximity network to be established from a given protein (seed) in order to study its functions, based on the premise that nodes related to similar functions tend to lie close to each other in the network. For each node, a score was computed as a measure of proximity to the seed protein. The S-glycoproteins of SARS-CoV, MERS-CoV and HCoV-229E were used as seeds in the application of the RWR algorithm.
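As an illustration of the principle, the following Python sketch implements a basic random walk with restart on a single-layer, unweighted network. It is not the RandomWalkRestartMH implementation: the restart probability of 0.7, the toy network and the function names are assumptions made for this example. At each step the walker either jumps back to the seed (with probability `restart`) or moves to a randomly chosen neighbour, and the steady-state probabilities serve as proximity scores.

```python
def random_walk_with_restart(adjacency, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at `seed`.

    adjacency: dict mapping every node to the set of its neighbours (undirected).
    restart:   probability of returning to the seed at each step
               (0.7 is an assumed value, not one reported in the paper).
    """
    nodes = list(adjacency)
    p0 = {node: 1.0 if node == seed else 0.0 for node in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {node: restart * p0[node] for node in nodes}
        for node in nodes:
            neighbours = adjacency[node]
            if not neighbours:
                # isolated node: its probability mass stays where it is
                nxt[node] += (1.0 - restart) * p[node]
                continue
            share = (1.0 - restart) * p[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        delta = sum(abs(nxt[node] - p[node]) for node in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def closest_to_seed(scores, seed, k):
    """Return the k non-seed nodes with the highest proximity scores."""
    ranked = sorted((n for n in scores if n != seed), key=scores.get, reverse=True)
    return ranked[:k]


# Toy network (hypothetical edges); "S" plays the role of the viral seed protein.
toy_network = {
    "S": {"ACE2", "CD209"},
    "ACE2": {"S", "HOST_A"},
    "CD209": {"S"},
    "HOST_A": {"ACE2", "HOST_B"},
    "HOST_B": {"HOST_A"},
}
scores = random_walk_with_restart(toy_network, seed="S")
neighbourhood = closest_to_seed(scores, seed="S", k=2)
print(neighbourhood)
```

Ranking nodes by score and keeping the top `k` mirrors the idea of retaining a fixed number of proteins closest to the seed; a multiplex version would additionally let the walker move between network layers.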
Functional enrichment analysis
To evaluate the functional pathways of proteins involved in the host response, gene enrichment analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) human pathways and Gene Ontology databases. Network representation of the gene enrichment analysis was performed using ShinyGO v0.61 [27]. Statistical significance was assessed by calculating the false discovery rate (FDR).
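For readers unfamiliar with this kind of test, the Python sketch below shows one common way to compute pathway enrichment: a one-sided hypergeometric test per pathway, followed by Benjamini-Hochberg adjustment to obtain FDR values. Whether ShinyGO v0.61 applies exactly this procedure is an assumption here, and the pathway size and overlap in the example are invented; only the list size (200) and the network size (13,020) are taken from the text.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k pathway genes in a list of n genes,
    when K of the N background genes belong to the pathway."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical pathway: 150 member genes in the background, 12 of them in the list.
p_example = hypergeom_pvalue(k=12, n=200, K=150, N=13020)
fdr_example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(p_example, fdr_example)
```

A pathway is then reported as enriched when its adjusted value falls below the chosen FDR threshold.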
Results
Structure of S-glycoprotein CoVs
To evaluate the diversity along the full genome, pairwise distance was calculated on the 259 CoV genomes. Diversity was distributed along the entire CoV genome, with the most conserved region located in Orf1ab, as expected, while the spike gene region exhibited rather high diversity (Additional file 2: Figure S1), consistent with the key role of the S-glycoprotein during viral entry in specific hosts [28].
Consequently, the analysis was focused on the S-glycoprotein, as a key virus component involved in host interaction [29]. A 3D model of the S-glycoprotein of the SARS-CoV-2 sequence (MT066156.1) was built on the sequence obtained at the Laboratory of Virology, National Institute for Infectious Diseases “L. Spallanzani” IRCCS, using the SWISS-MODEL server (Additional file 2: Figure S2a, b). The SARS-CoV-2 S-glycoprotein structure was then compared to those of the other HCoVs, as shown in Additional file 2: Figure S2. The S-glycoprotein structures of the various HCoVs were very similar overall. In particular, a strong similarity was shown in the RBD (nCov: residues 319–591) [30], and this was most evident for the comparison between SARS-CoV-2 and SARS-CoV, which share the same cell receptor (ACE2). The amino acid differences among the S-glycoproteins of the selected HCoVs (SARS-CoV-2, MERS-CoV, SARS-CoV, HCoV-229E) are shown in Additional file 2: Figure S3, where a lower topology similarity was observed with the HCoV-229E S-glycoprotein, which binds a different host receptor.
Overall, the pattern arising from such comparison was consistent with specific host receptors, as well as with different host reservoirs and ancestry [31].
Human CoV and host interactome
An interactome map was built to highlight biological connections between the S-glycoprotein and the human proteome. Using the analysis pipeline described in the Methods, a large PPI database was assembled, including 13,020 nodes and 71,496 interactions between the human host and the three selected viruses (SARS-CoV, MERS-CoV and HCoV-229E).
The interactome reconstruction was obtained with the RWR analysis, identifying the 200 proteins closest to the seed, i.e. the S-glycoproteins of HCoV-229E, SARS-CoV and MERS-CoV (Additional file 2: Figures S4–S6). The lists of genes selected by the RWR algorithm for HCoV-229E, SARS-CoV and MERS-CoV, along with their proximity scores, are reported in Additional file 1: Tables S2–S4. In order to further dissect the S-glycoprotein–host interactions, enrichment analysis was carried out with the Reactome and KEGG databases. Reactome pathway enrichment analysis revealed biological pathways of DNA repair, transcription and gene regulation for the HCoV-229E S-glycoprotein, with high significance (FDR < 0.01%). KEGG pathway enrichment analysis revealed ubiquitin-mediated proteolysis as the most significant pathway (FDR < 0.01%), as well as cellular proliferation pathways associated with other viral infections (Hepatitis B, measles, Epstein–Barr virus infection and Human T-cell leukemia virus 1 infection) and with carcinogenesis (Fig. 1). Next, the RWR algorithm was applied to a multilayer network built on the PPI interactome and on the gene coexpression (COEX) network, again with the S-glycoprotein of HCoV-229E as seed. The results highlighted a set of genes that are connected in both the PPI and COEX analyses, including ANPEP, RAD18, APEX, POLH, APEX1, TERF2, RAD51, CDC7, USP7, XRCC5, FEN1 and PCNA, all associated with the GO biological process category of DNA repair (FDR < 0.0001%) (Fig. 2). The same analyses were conducted for SARS-CoV and MERS-CoV.
The Reactome pathway enrichment analysis for SARS-CoV revealed a connection of the S-glycoprotein with early activation of the innate immune system, such as the Toll-like receptor cascade and TGF-β, with strong significance (FDR < 0.0001%), while the KEGG pathway enrichment analysis revealed an association with cellular proliferation, TGF-β and other infection-related pathways (FDR < 0.0001%) (Fig. 3). The PPI-COEX multilayer analysis highlighted a set of genes that are connected in both the PPI and COEX analyses, i.e. CLEC4G, CLEC4M, CD209, ACE2 and RPSA, all associated with the GO biological process category of SARS-CoV entry into the host cell (FDR < 0.01%) (Fig. 4). For MERS-CoV, the Reactome pathway enrichment analysis showed a strong association with membrane signals activated by GPCR ligand binding (FDR < 0.0001%), and with chemokine/chemokine receptor pathways. Consistent results were obtained with KEGG pathway enrichment, which highlighted cytokine–cytokine receptor and chemokine signalling pathways (FDR < 0.0001%) (Fig. 5). Finally, the PPI-COEX multilayer analysis highlighted, for both PPI and COEX, CCR4, CXCL2, CXCL10, CXCL9, PF4, PF4V1, CCL11, CXCL11, XCL1, CXCR4 and CXCL14, all genes assigned to GO biological processes involved in the chemokine cascade (FDR < 0.0001%), in line with the results obtained with the enrichment analyses (Fig. 6).
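The notion of genes "connected in both PPI and COEX" can be illustrated with a small Python sketch. It is a deliberate simplification: the study applied RWR to a multilayer network, whereas the code below only checks which selected genes have at least one selected partner in each layer. The gene labels and edges are placeholders, and the function name is hypothetical.

```python
def nodes_connected_in_both_layers(ppi_edges, coex_edges, selected):
    """Return selected genes that have at least one selected partner in each layer."""

    def linked_nodes(edges):
        linked = set()
        for a, b in edges:
            if a != b and a in selected and b in selected:
                linked.update((a, b))
        return linked

    return linked_nodes(ppi_edges) & linked_nodes(coex_edges)


# Placeholder data: G1..G4 are not real gene symbols.
ppi_layer = [("G1", "G2"), ("G2", "G3")]
coex_layer = [("G1", "G2"), ("G3", "G4")]
rwr_selection = {"G1", "G2", "G3"}
shared = nodes_connected_in_both_layers(ppi_layer, coex_layer, rwr_selection)
print(sorted(shared))
```

In this toy case G3 is dropped because its only coexpression partner, G4, lies outside the selected set.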
Discussion
In-depth comparative analysis of S-glycoprotein
We applied network analysis, based on protein–protein interactions and gene expression data, in order to describe the interactome of the coronavirus S-glycoprotein and host proteins, with the aim of better understanding SARS-CoV-2 pathogenesis. A preliminary structural analysis was conducted on the S-glycoprotein of SARS-CoV-2 as compared to the other three HCoVs, using the S-glycoprotein as a model to shed light on the host–pathogen interaction in the dynamic process of SARS-CoV-2 infection. Although the amino acid sequences of the S-glycoprotein differed between the various HCoVs, the structural analysis showed high similarity; the best 3D structural overlap was found for SARS-CoV and SARS-CoV-2, consistent with the shared ACE2 predicted receptor.
Of note, the newly discovered SARS-CoV-2 genome has revealed differences between SARS-CoV-2 and SARS or SARS-like coronaviruses [31]. Although no amino acid substitutions were present in the receptor-binding motifs that directly interact with the human receptor ACE2 protein in SARS-CoV, six mutations were identified in the other region of the RBD [31, 32]. On the other hand, the genomic comparative analysis highlighted the strong diversity in the S gene among CoVs in different hosts, confirming the biologically vital role of the S-glycoprotein as a key factor in viral entry in cross-species transmission events [28].
In addition, the comparative 3D structural data may facilitate the definition of already known antibody epitopes in the S-glycoprotein of other coronaviruses; they will also be useful in rational vaccine design and in gauging anti-virus directed immune responses after vaccination [30]. In fact, the S-glycoprotein remains an important target for vaccines and drugs previously evaluated in SARS and MERS, while a neutralizing antibody targeting the S-glycoprotein could provide passive immunity. The host interactome, linked to the S-glycoprotein of SARS-CoV and MERS-CoV, mainly highlighted innate immunity pathway components, such as Toll-like receptors, cytokines and chemokines. The 3D structural analysis confirmed that the S-glycoprotein of SARS-CoV-2 has strong similarity in 3D structure to that of SARS-CoV [18].
Host interactome in all three HCoV infections
The reconstruction of the virus–host interactome was carried out using the RWR algorithm to explore the human PPI network and the PPI-COEX multilayer network. The PPI network topology of the host interactome in all three infections indicated the presence of several hub proteins. In the HCoV-229E–host interactome, the hub positions were held by RAD18 and APEX, which play an important role in the repair of DNA damaged by UV during S phase [33].
For the SARS-CoV interactome, the hub genes were ACE2, CLEC4G and CD209, which are known interactors of the S-glycoprotein of SARS-CoV [34, 35].
In fact, two independent mechanisms were described as triggers of SARS-CoV infection: proteolytic cleavage of ACE2 and cleavage of the S-glycoprotein. The latter activates the glycoprotein for cathepsin L-independent host cell entry. Activation of the S-glycoprotein by a cathepsin L mechanism during host cell entry has been reported for several CoV infections, such as HCoV-229E and SARS-CoV [36, 37]. A recent study speculated that this interaction will be preserved in SARS-CoV-2 [19], but it might be disrupted if a substantial number of mutations occur in the receptor binding site of the S gene. Likewise, the S-glycoprotein of SARS-CoV-2 is expected to interact with the type II transmembrane serine protease TMPRSS2 and is probably involved in the inhibition of antibody-mediated neutralization [38, 39]. It is rather unexpected that, for this virus, no intracellular pathways were highlighted by the multilayer analysis, suggesting that this field is still open to further investigation.
In MERS-CoV infection, a hub gene role was described for DPP4, which is known to regulate cytokine levels through catalytic cleavage [40]. Immune cell-recruiting chemokines and cytokines, such as IP-10/CXCL-10, MCP-1/CCL-2, MIP-1α/CCL-3 and RANTES/CCL-5, can be strongly induced by MERS-CoV, which shows higher inducibility in human monocyte-derived macrophages as compared to SARS-CoV infection [41]. The cellular proliferation pathways, involved immediately after virus entry, were described in all three models, consistent with the inhibitory activity on cell proliferation and the cytotoxic effect of HCoV infections [42, 43]. Finally, the biological pathways revealed by enrichment analysis in all models supported early activation of the innate immune system, such as the Toll-like receptor cascade and TGF-β for SARS-CoV, or chemokine, cytokine and infection-related pathways for MERS-CoV, with strong significance for both.
Pathogenic model for HCoV infections
We constructed a host molecular interactome with SARS-CoV, MERS-CoV and HCoV-229E, assuming that most of these interactions, especially for SARS-CoV, are shared with SARS-CoV-2. A network-based methodology, along with a guilt-by-association algorithm (RWR), was applied to define the pathological model of COVID-19 and to support the search for treatments of SARS-CoV-2 infection, using existing transcriptomic and proteomic information.
Based on the main pathways identified by the network-based interactome analysis, the following issues require further study:
First, the predicted receptor for SARS-CoV-2 has been inferred to be ACE2, i.e. the same used by SARS-CoV, based on the high similarity of the S-glycoproteins of the two viruses, and this is the basis for proposing SARS-CoV as a model for the virus–host interactome in COVID-19;
Second, mitogen-activated protein kinase (MAPK) is a major cell signalling pathway that is known to be activated by diverse groups of viruses, and plays an important role in the cellular response to viral infections. MAPK-interacting kinase 1 (MNK1) has been shown to regulate both cap-dependent and internal ribosomal entry site (IRES)-mediated mRNA translation;
Third, the identification of the MAPK pathway in the SARS-CoV model is highly consistent with the in vivo model, where p38 MAPK was found to be increased in the lungs of mice infected with SARS-CoV [44];
Fourth, the identification of the TGF-β pathway in the S-glycoprotein-induced interactome for SARS-CoV is of particular interest, due to previous evidence that this virus, and in particular its protease, triggered TGF-β through the p38 MAPK/STAT3 pathway in alveolar basal epithelial cells [45, 46];
Fifth, innate immune pathways, such as the Toll-like receptor, cytokine and chemokine pathways, were identified in the S-glycoprotein-induced models of SARS-CoV and MERS-CoV.
Every described pathway can be matched with clinical aspects; the data presented in this report may therefore help to design a blueprint for SARS-CoV-2-associated pathogenicity.
The severity and the clinical picture of SARS-CoV and MERS-CoV infections could be related to the activation of an exaggerated immune mechanism, causing uncontrolled inflammation [47]; however, the role of a strong immune response in SARS-CoV-2 infection severity is still uncertain.
Nevertheless, we may consider that host kinases link multiple signalling pathways in response to a broad array of stimuli, including viral infections. TGF-β, produced during the inflammatory phase by macrophages, is an important mediator of fibroblast activation and tissue repair. High levels of systemic inflammatory cytokines/chemokines have been widely reported for MERS-CoV infections [48,49,50], correlating with immunopathology and massive infiltration into the lungs [51]. HCoV-229E infection can also be described with this distance model, although this infection was not associated with severe respiratory disease. In fact, HCoV-229E is responsible for mild upper respiratory tract infections, such as common colds, with only occasional spreading to the lower respiratory tract, but it interacts with dendritic cells in the upper respiratory tract, inducing a cytopathic effect [52].
Conclusions
In conclusion, we developed a network-based model, which could serve as a framework for structure-guided research and for the pathogenetic evaluation of specific clinical outcomes. Accurate structural 3D protein models and their interactions with host receptor proteins can allow a more detailed theoretical disease model to be built for each HCoV infection, and can support the drawing of a disease model for COVID-19. Our analyses suggest that it is important to carry out in silico experiments and simulations through specific algorithms.
Limitations to our study
A single protein, namely the S-glycoprotein, was used as seed; therefore, the highlighted interactions were limited to those connected with this single viral protein. However, this is a proof-of-concept study, from which it appears that a similar approach may be used to study other viral proteins interacting with host cell pathways.
Another limitation is that the pathway analysis did not consider tissue and cell type diversity. Finally, the low threshold established for the number of nodes found by RWR (200) limited the reconstruction of the entire pathways. However, this was a software-imposed threshold.
In summary, the interactome analysis helped to guide the design of novel models of SARS-CoV pathogenicity.
Availability of data and materials
PPI data for the SARS-CoV, MERS-CoV and HCoV-229E S-glycoproteins were inferred through published PPI data, using STRING Viruses (http://viruses.string-db.org/) and VirHostNet (http://virhostnet.prabi.fr/), as well as published scientific reports with a focus on virus-host interactions [20,21,22]. Human PPI databases (BioGrid, InnateDB-All, IMEx, IntAct, MatrixDB, MBInfo, MINT, Reactome, Reactome-FIs, UniProt, VirHostNet, BioData, CCSB Interactome Database) were queried using the R packages PSICQUIC (https://bioconductor.org/packages/release/bioc/html/PSICQUIC.html) and biomaRt (https://bioconductor.org/packages/release/bioc/html/biomaRt.html) [23, 24]. The gene expression data set was built from the Protein Atlas database (https://www.proteinatlas.org/) [25].
Abbreviations
- CoVs: Coronaviruses
- HCoV: Human coronavirus
- PPI: Protein–protein interactions
- COEX: Gene coexpression data
- SARS-CoV: Severe acute respiratory syndrome coronavirus
- MERS-CoV: Middle East respiratory syndrome coronavirus
- S-glycoprotein: Spike glycoprotein
- RWR: Random walk with restart
- FDR: False Discovery Rate
References
Johns Hopkins University. Global-cases-covid-19. 2020. https://www.gisaid.org/epiflu-applications/global-cases-covid-19/.
Jernigan DB. Update: public health response to the coronavirus disease 2019 outbreak—United States, February 24, 2020. MMWR Morb Mortal Wkly Rep. 2020;69(8):216–9. https://doi.org/10.15585/mmwr.mm6908e1.
Paules CI, Marston HD, Fauci AS. Coronavirus infections-more than just the common cold. JAMA. 2020. https://doi.org/10.1001/jama.2020.0757.
Walsh EE, Shin JH, Falsey AR. Clinical impact of human coronaviruses 229E and OC43 infection in diverse adult populations. J Infect Dis. 2013;208(10):1634–42. https://doi.org/10.1093/infdis/jit393.
Hendley JO, Fishburne HB, Gwaltney JM Jr. Coronavirus infections in working adults. Eight-year study with 229 E and OC 43. Am Rev Respir Dis. 1972;105(5):805–11. https://doi.org/10.1164/arrd.1972.105.5.805.
de Wit E, van Doremalen N, Falzarano D, Munster VJ. SARS and MERS: recent insights into emerging coronaviruses. Nat Rev Microbiol. 2016;14(8):523–34. https://doi.org/10.1038/nrmicro.2016.81.
de Wilde AH, Wannee KF, Scholte FE, Goeman JJ, Ten Dijke P, Snijder EJ, et al. A kinome-wide small interfering rna screen identifies proviral and antiviral host factors in severe acute respiratory syndrome coronavirus replication, including double-stranded RNA-activated protein kinase and early secretory pathway proteins. J Virol. 2015;89(16):8318–33. https://doi.org/10.1128/JVI.01029-15.
Cotten M, Watson SJ, Kellam P, Al-Rabeeah AA, Makhdoom HQ, Assiri A, et al. Transmission and evolution of the Middle East respiratory syndrome coronavirus in Saudi Arabia: a descriptive genomic study. Lancet. 2013;382(9909):1993–2002. https://doi.org/10.1016/S0140-6736(13)61887-5.
Yuan Y, Cao D, Zhang Y, Ma J, Qi J, Wang Q, et al. Cryo-EM structures of MERS-CoV and SARS-CoV spike glycoproteins reveal the dynamic receptor binding domains. Nat Commun. 2017;8:15092. https://doi.org/10.1038/ncomms15092.
Zhou P, Yang XL, Wang XG, Hu B, Zhang L, Zhang W, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020. https://doi.org/10.1038/s41586-020-2012-7.
Zhu N, Zhang D, Wang W, Li X, Yang B, Song J, et al. A novel coronavirus from patients with pneumonia in China, 2019. N Engl J Med. 2020;382(8):727–33. https://doi.org/10.1056/NEJMoa2001017.
Gordon DE, Jang GM, Bouhaddou M, Xu J, Obernier K, White KM, et al. A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature. 2020. https://doi.org/10.1038/s41586-020-2286-9.
Srinivasan S, Cui H, Gao Z, Liu M, Lu S, Mkandawire W, et al. Structural genomics of SARS-CoV-2 indicates evolutionary conserved functional regions of viral proteins. Viruses. 2020;12(4):360. https://doi.org/10.3390/v12040360.
Nakamura T, Yamada KD, Tomii K, Katoh K. Parallelization of MAFFT for large-scale multiple sequence alignments. Bioinformatics. 2018;34(14):2490–2. https://doi.org/10.1093/bioinformatics/bty121.
Simmonds P. SSE: a nucleotide and amino acid sequence analysis platform. BMC Res Notes. 2012;5:50. https://doi.org/10.1186/1756-0500-5-50.
Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018;46(W1):W296–303. https://doi.org/10.1093/nar/gky427.
Meng EC, Pettersen EF, Couch GS, Huang CC, Ferrin TE. Tools for integrated sequence-structure analysis with UCSF Chimera. BMC Bioinform. 2006;7:339. https://doi.org/10.1186/1471-2105-7-339.
Sawicki SG, Sawicki DL, Siddell SG. A contemporary view of coronavirus transcription. J Virol. 2007;81(1):20–9. https://doi.org/10.1128/JVI.01358-06.
Wan Y, Shang J, Graham R, Baric RS, Li F. Receptor recognition by novel coronavirus from Wuhan: an analysis based on decade-long structural studies of SARS. J Virol. 2020. https://doi.org/10.1128/JVI.00127-20.
Cook HV, Doncheva NT, Szklarczyk D, von Mering C, Jensen LJ. Viruses.STRING: a virus–host protein–protein interaction database. Viruses. 2018;10(10):519. https://doi.org/10.3390/v10100519.
Letko M, Miazgowicz K, McMinn R, Seifert SN, Sola I, Enjuanes L, et al. Adaptive evolution of MERS-CoV to species variation in DPP4. Cell Rep. 2018;24(7):1730–7. https://doi.org/10.1016/j.celrep.2018.07.045.
Pfefferle S, Schopf J, Kogl M, Friedel CC, Muller MA, Carbajo-Lozoya J, et al. The SARS-coronavirus-host interactome: identification of cyclophilins as target for pan-coronavirus inhibitors. PLoS Pathog. 2011;7(10):e1002331. https://doi.org/10.1371/journal.ppat.1002331.
Aranda B, Blankenburg H, Kerrien S, Brinkman FS, Ceol A, Chautard E, et al. PSICQUIC and PSISCORE: accessing and scoring molecular interactions. Nat Methods. 2011;8(7):528–9. https://doi.org/10.1038/nmeth.1637.
Smedley D, Haider S, Ballester B, Holland R, London D, Thorisson G, et al. BioMart–biological queries made easy. BMC Genomics. 2009;10:22. https://doi.org/10.1186/1471-2164-10-22.
Uhlen M, Zhang C, Lee S, Sjostedt E, Fagerberg L, Bidkhori G, et al. A pathology atlas of the human cancer transcriptome. Science. 2017. https://doi.org/10.1126/science.aan2507.
Valdeolivas A, Tichit L, Navarro C, Perrin S, Odelin G, Levy N, et al. Random walk with restart on multiplex and heterogeneous biological networks. Bioinformatics. 2019;35(3):497–505. https://doi.org/10.1093/bioinformatics/bty637.
Ge SX, Jung D, Yao R. ShinyGO: a graphical enrichment tool for animals and plants. Bioinformatics. 2019. https://doi.org/10.1093/bioinformatics/btz931.
Graham RL, Baric RS. Recombination, reservoirs, and the modular spike: mechanisms of coronavirus cross-species transmission. J Virol. 2010;84(7):3134–46. https://doi.org/10.1128/JVI.01394-09.
Gallagher TM, Buchmeier MJ. Coronavirus spike proteins in viral entry and pathogenesis. Virology. 2001;279(2):371–4. https://doi.org/10.1006/viro.2000.0757.
Wrapp D, Wang N, Corbett KS, Goldsmith JA, Hsieh CL, Abiona O, et al. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science. 2020;367(6483):1260–3. https://doi.org/10.1126/science.abb2507.
Wu A, Peng Y, Huang B, Ding X, Wang X, Niu P, et al. Genome composition and divergence of the novel coronavirus (2019-nCoV) originating in China. Cell Host Microbe. 2020. https://doi.org/10.1016/j.chom.2020.02.001.
Ge XY, Li JL, Yang XL, Chmura AA, Zhu G, Epstein JH, et al. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature. 2013;503(7477):535–8. https://doi.org/10.1038/nature12711.
Srivastava M, Chen Z, Zhang H, Tang M, Wang C, Jung SY, et al. Replisome dynamics and their functional relevance upon DNA damage through the PCNA interactome. Cell Rep. 2018;25(13):3869–3883.e4. https://doi.org/10.1016/j.celrep.2018.11.099.
Gramberg T, Hofmann H, Moller P, Lalor PF, Marzi A, Geier M, et al. LSECtin interacts with filovirus glycoproteins and the spike protein of SARS coronavirus. Virology. 2005;340(2):224–36. https://doi.org/10.1016/j.virol.2005.06.026.
Yang ZY, Huang Y, Ganesh L, Leung K, Kong WP, Schwartz O, et al. pH-dependent entry of severe acute respiratory syndrome coronavirus is mediated by the spike glycoprotein and enhanced by dendritic cell transfer through DC-SIGN. J Virol. 2004;78(11):5642–50. https://doi.org/10.1128/JVI.78.11.5642-5650.2004.
Bertram S, Dijkman R, Habjan M, Heurich A, Gierer S, Glowacka I, et al. TMPRSS2 activates the human coronavirus 229E for cathepsin-independent host cell entry and is expressed in viral target cells in the respiratory epithelium. J Virol. 2013;87(11):6150–60. https://doi.org/10.1128/JVI.03372-12.
Bosch BJ, Bartelink W, Rottier PJ. Cathepsin L functionally cleaves the severe acute respiratory syndrome coronavirus class I fusion protein upstream of rather than adjacent to the fusion peptide. J Virol. 2008;82(17):8887–90. https://doi.org/10.1128/JVI.00415-08.
Glowacka I, Bertram S, Muller MA, Allen P, Soilleux E, Pfefferle S, et al. Evidence that TMPRSS2 activates the severe acute respiratory syndrome coronavirus spike protein for membrane fusion and reduces viral control by the humoral immune response. J Virol. 2011;85(9):4122–34. https://doi.org/10.1128/JVI.02232-10.
Hoffmann M, Kleine-Weber H, Schroeder S, Kruger N, Herrler T, Erichsen S, et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell. 2020. https://doi.org/10.1016/j.cell.2020.02.052.
Zhong J, Rajagopalan S. Dipeptidyl peptidase-4 regulation of SDF-1/CXCR4 axis: implications for cardiovascular disease. Front Immunol. 2015;6:477. https://doi.org/10.3389/fimmu.2015.00477.
Zhou J, Chu H, Li C, Wong BH, Cheng ZS, Poon VK, et al. Active replication of Middle East respiratory syndrome coronavirus and aberrant induction of inflammatory cytokines and chemokines in human macrophages: implications for pathogenesis. J Infect Dis. 2014;209(9):1331–42. https://doi.org/10.1093/infdis/jit504.
Lin SC, Ho CT, Chuo WH, Li S, Wang TT, Lin CC. Effective inhibition of MERS-CoV infection by resveratrol. BMC Infect Dis. 2017;17(1):144. https://doi.org/10.1186/s12879-017-2253-8.
Mizutani T, Fukushi S, Iizuka D, Inanami O, Kuwabara M, Takashima H, et al. Inhibition of cell proliferation by SARS-CoV infection in Vero E6 cells. FEMS Immunol Med Microbiol. 2006;46(2):236–43. https://doi.org/10.1111/j.1574-695X.2005.00028.x.
Jimenez-Guardeno JM, Nieto-Torres JL, DeDiego ML, Regla-Nava JA, Fernandez-Delgado R, Castano-Rodriguez C, et al. The PDZ-binding motif of severe acute respiratory syndrome coronavirus envelope protein is a determinant of viral pathogenesis. PLoS Pathog. 2014;10(8):e1004320. https://doi.org/10.1371/journal.ppat.1004320.
Li SW, Wang CY, Jou YJ, Yang TC, Huang SH, Wan L, et al. SARS coronavirus papain-like protease induces Egr-1-dependent up-regulation of TGF-beta1 via ROS/p38 MAPK/STAT3 pathway. Sci Rep. 2016;6:25754. https://doi.org/10.1038/srep25754.
Li SW, Yang TC, Wan L, Lin YJ, Tsai FJ, Lai CC, et al. Correlation between TGF-beta1 expression and proteomic profiling induced by severe acute respiratory syndrome coronavirus papain-like protease. Proteomics. 2012;12(21):3193–205. https://doi.org/10.1002/pmic.201200225.
Newton AH, Cardani A, Braciale TJ. The host immune response in respiratory virus infection: balancing virus clearance and immunopathology. Semin Immunopathol. 2016;38(4):471–82. https://doi.org/10.1007/s00281-016-0558-0.
Kindler E, Thiel V, Weber F. Interaction of SARS and MERS coronaviruses with the antiviral interferon response. Adv Virus Res. 2016;96:219–43. https://doi.org/10.1016/bs.aivir.2016.08.006.
Mahallawi WH, Khabour OF, Zhang Q, Makhdoum HM, Suliman BA. MERS-CoV infection in humans is associated with a pro-inflammatory Th1 and Th17 cytokine profile. Cytokine. 2018;104:8–13. https://doi.org/10.1016/j.cyto.2018.01.025.
Tynell J, Westenius V, Ronkko E, Munster VJ, Melen K, Osterlund P, et al. Middle East respiratory syndrome coronavirus shows poor replication but significant induction of antiviral responses in human monocyte-derived macrophages and dendritic cells. J Gen Virol. 2016;97(2):344–55. https://doi.org/10.1099/jgv.0.000351.
Mella C, Suarez-Arrabal MC, Lopez S, Stephens J, Fernandez S, Hall MW, et al. Innate immune dysfunction is associated with enhanced disease severity in infants with severe respiratory syncytial virus bronchiolitis. J Infect Dis. 2013;207(4):564–73. https://doi.org/10.1093/infdis/jis721.
Mesel-Lemoine M, Millet J, Vidalain PO, Law H, Vabret A, Lorin V, et al. A human coronavirus responsible for the common cold massively kills dendritic cells but not monocytes. J Virol. 2012;86(14):7577–87. https://doi.org/10.1128/JVI.00269-12.
Acknowledgements
We gratefully acknowledge the collaborators and members of the INMI COVID-19 Study Group and the COVID-19 INMI Network Medicine for IDs Study Group:
Isabella Abbate, Chiara Agrati, Samir Al Moghazi, Tommaso Ascoli Bartoli, Barbara Bartolini, Maria R. Capobianchi, Alessandro Capone, Delia Goletti, Gabriella Rozera, Carla Nisii, Roberta Gagliardini, Fabiola Ciccosanti, Gian Maria Fimia, Emanuele Nicastri, Emanuela Giombini, Simone Lanini, Alessandra D’Abramo, Gabriele Rinonapoli, Enrico Girardi, Chiara Montaldo, Raffaella Marconi, Antonio Addis, Bradley Maron, Ginestra Bianconi, Bertrand De Meulder, Jason Kennedy, Shabaana Abdul Khader, Francesca Luca, Markus Maeurer, Mauro Piacentini, Stefano Merler, Giuseppe Pantaleo, Rafick-Pierre Sekaly, Serena Sanna, Nicola Segata, Alimuddin Zumla, Francesco Messina, Francesco Vairo, Francesco Nicola Lauria, Giuseppe Ippolito.
Funding
INMI authors are supported by the Italian Ministry of Health (Ricerca Corrente Linea 1) and the Italian Ministry of Economy (It-IDRIN). This work was also supported by a donation from Findus Italia, part of Nomad Foods. G. Ippolito and A. Zumla are co-principal investigators of the Pan-African Network on Emerging and Re-emerging Infections (PANDORA-ID-NET), funded by the European & Developing Countries Clinical Trials Partnership, supported under Horizon 2020. Sir Zumla is in receipt of a National Institutes of Health Research senior investigator award. M. Maeurer is a member of the innate immunity advisory group of the Bill & Melinda Gates Foundation and is funded by the Champalimaud Foundation.
Author information
Contributions
Conceptualization: FM, EG, FNL. Data curation: FM, EG. Formal analysis: FM, EG. Funding acquisition: MRC, GI, FV, MM, AZ. Investigation: FM, EG, FNL. Methodology: FM, EG, TA, SA. Resources: MRC, GI. Software: FM, EG. Supervision: MRC, FNL. Validation: CA, FV, GK, MM, MP. Visualization: FM, EG. Writing – original draft: FM, EG, MRC, FNL, GK, AZ, MM. Writing – review and editing: FM, EG, MRC, FNL, AZ. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1: Table S1.
List of accession numbers of HCoV. Table S2. List of genes selected by the RWR algorithm for HCoV-229E, along with proximity scores. Table S3. List of genes selected by the RWR algorithm for SARS-CoV, along with proximity scores. Table S4. List of genes selected by the RWR algorithm for MERS-CoV, along with proximity scores.
Additional file 2: Figure S1.
Pairwise distances along 259 full-length CoV genomes. At the bottom of the figure, the indicative gene positioning along the CoV genomes is reported. The list of all considered genomes is reported in Additional file 1: Table S1. Figure S2. 3D structure of the S-glycoprotein of SARS-CoV-2 and comparison with the orthologs from HCoV-229E, SARS-CoV, and MERS-CoV. Lateral (a) and superior (b) representation of the SARS-CoV-2 S-glycoprotein, deduced from the sequence of patient INMI1 (MT066156.1). Each subunit chain has a different color. Structure comparison of the S-glycoprotein subunit between: HCoV-229E and SARS-CoV-2, in purple and blue, respectively (c); SARS-CoV and SARS-CoV-2, in red and blue, respectively (d); MERS-CoV and SARS-CoV-2, in green and blue, respectively (e). Figure S3. Amino acid alignment and secondary motifs in the receptor binding domain (RBD) of the S-glycoprotein of HCoV-229E, SARS-CoV, MERS-CoV and SARS-CoV-2. Legend of secondary motif identifiers: H = α Helix, E = β Sheet, X = Random coil. Figure S4. HCoV-229E–host interactome resulting from RWR applied to the top 200 closest proteins identified by RWR, using the S-glycoprotein of HCoV-229E. Figure S5. SARS-CoV–host interactome resulting from RWR applied to the top 200 closest proteins identified by RWR, using the S-glycoprotein of SARS-CoV. Figure S6. MERS-CoV–host interactome resulting from RWR applied to the top 200 closest proteins identified by RWR, using the S-glycoprotein of MERS-CoV.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Messina, F., Giombini, E., Agrati, C. et al. COVID-19: viral–host interactome analyzed by network based-approach model to study pathogenesis of SARS-CoV-2 infection. J Transl Med 18, 233 (2020). https://doi.org/10.1186/s12967-020-02405-w
DOI: https://doi.org/10.1186/s12967-020-02405-w