Emerging of a SARS-CoV-2 viral strain with a deletion in nsp1
Journal of Translational Medicine volume 18, Article number: 329 (2020)
The new Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), which was first detected in Wuhan (China) in December 2019, is responsible for the current global pandemic. Phylogenetic analysis revealed that it is similar to other betacoronaviruses, such as SARS-CoV and the Middle East Respiratory Syndrome coronavirus (MERS-CoV). Its genome is ∼30 kb in length and contains two large overlapping open reading frames, ORF1a and ORF1ab, which encode polyproteins that are processed into the non-structural proteins. The non-structural protein 1 (nsp1) is arguably the most important pathogenic determinant, and previous studies on SARS-CoV indicate that it is involved both in viral replication and in hampering the innate immune response. Detailed site-specific mutagenesis experiments and in vitro reconstitution studies determined that its mechanisms of action are mediated by (a) the presence of specific amino acid residues of nsp1 and (b) the interaction between the protein and the host’s small ribosomal subunit. In fact, substitution of certain amino acids reduced its negative effects.
A total of 17,928 genome sequences from patients infected by SARS-CoV-2 in different areas around the world were obtained from the GISAID database (December 2019 to July 2020). Genome alignment was performed using MAFFT, and the nsp1 genomic regions were identified using BioEdit and verified using BLAST. The nsp1 protein of SARS-CoV-2, with and without the deletion, was subsequently modelled using I-TASSER.
We identified SARS-CoV-2 genome sequences, from several countries, carrying a previously unknown deletion of 9 nucleotides at positions 686-694, corresponding to amino acid (AA) positions 241-243 (KSF). This deletion was found in different geographical areas. Structural prediction modelling suggests an effect on the C-terminal tail structure.
Modelling analysis of a newly identified deletion of 3 amino acids (KSF) in SARS-CoV-2 nsp1 suggests that this deletion could affect the structure of the C-terminal region of the protein, which is important for the regulation of viral replication and for its negative effect on the host’s gene expression. In addition, substitution of two of these amino acids (KS) in nsp1 of SARS-CoV was previously reported to revert the loss of interferon-alpha expression. The deletion that we describe indicates that SARS-CoV-2 is undergoing profound genomic changes. It is important to: (i) confirm the spreading of this particular viral strain, and potentially of strains with other deletions in the nsp1 protein, in the population of both asymptomatic and pauci-symptomatic subjects, and (ii) correlate these changes in nsp1 with a potential decrease in viral pathogenicity.
Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) belongs to the realm Riboviria, order Nidovirales, suborder Cornidovirineae, family Coronaviridae, subfamily Orthocoronavirinae, genus Betacoronavirus (lineage B), subgenus Sarbecovirus, and the species Severe acute respiratory syndrome-related coronavirus, and is the virus responsible for the current global pandemic [1,2,3]. The genome of SARS-CoV-2 is highly homologous to that of SARS-CoV, the coronavirus that caused the SARS epidemic in 2003 [5, 6], and to that of MERS-CoV, the coronavirus responsible for the Middle East Respiratory Syndrome.
Coronavirus disease 2019 (COVID-19) comprises the symptoms reported by patients infected by SARS-CoV-2, which range from mild to severe, and some cases result in death. Severe acute respiratory illness with fever and respiratory symptoms, such as cough and shortness of breath, is the primary case definition, but patients without respiratory symptoms are increasingly recognized, with gastrointestinal, olfactory, cardiovascular, and neurological manifestations. Cases resulting in death are primarily middle-aged and elderly patients with obesity and/or pre-existing diseases (tumor surgery, cirrhosis, hypertension, coronary heart disease, diabetes, and Parkinson’s disease) [8,9,10,11].
Given the similarity among the viruses, the data about the biological functions, characteristics and effects on the host of the proteins expressed by SARS-CoV-2 are mostly inferred from previous studies on SARS-CoV and other related human (e.g. MERS-CoV) [12,13,14] and animal coronaviruses (e.g. mouse hepatitis virus). In SARS-CoV, two large polyproteins, ORF1a and ORF1ab, are cleaved by viral proteases to form 16 nonstructural proteins (nsp); the genome also encodes four structural proteins, namely spike (S), envelope (E), membrane (M), and nucleocapsid (N), and eight accessory proteins: ORF3a, ORF3b (absent in SARS-CoV-2), ORF6, ORF7a, ORF7b, ORF8a, ORF8b, and ORF9b (absent in SARS-CoV-2). Experimental data indicate that some accessory proteins are not essential for viral replication, while others have been demonstrated to be important for virus-host interactions both in vitro and in vivo [16, 17].
Among these proteins, SARS-CoV nonstructural protein 1 (nsp1), also known as the leader protein, plays a central role in hampering the anti-viral innate immune response, in particular interferon-alpha expression, and it has been considered as a possible target for therapeutic interventions aimed at reducing viral pathogenicity. Further indicating that its biological function is preserved, nsp1 proteins from alpha- and beta-CoVs differ in size but show comparable abilities to reduce host gene expression, even though the mechanisms seem to differ [15, 20,21,22].
SARS-CoV nsp1 almost completely blocks host protein translation by binding the 40S ribosomal subunit of the host cell, which stops canonical mRNA translation at different steps during the initiation process [23,24,25]. This in turn results in template-dependent endonucleolytic cleavage, followed by degradation, of the mRNAs of infected cells, while shutdown of viral mRNA is avoided through a still unclear mechanism involving an interaction between nsp1 and a conserved 5′ untranslated region of the SARS-CoV mRNA. By blocking expression of several components of the innate immune system, including the interferon response, SARS-CoV is thus able to maintain viral expression and escape immune system detection.
Certain amino acid residues of nsp1 are critical for this mechanism. For example, in SARS-CoV several residues have been identified that differentially inhibit host gene expression, such as that of interferon-alpha, which is responsible for antiviral activity. More recently, a region in the C-terminal domain of nsp1 of SARS-CoV-2 has been demonstrated to interfere with host expression factors.
Here we describe a deletion in the C-terminal region of nsp1 observed in certain genomes from SARS-CoV-2 patients from different areas of the world. The deletion resulted in the removal of three amino acid residues (KSF). Substitution of two of them (KS) in nsp1 of SARS-CoV has been shown to partially attenuate both the inhibition of signal transduction and the inhibition of gene expression, including that of interferon-alpha. Our data indicate that a small percentage of SARS-CoV-2 viruses harbor a deletion in an important protein responsible for pathogenesis, possibly adapting toward decreased pathogenicity.
We analyzed 17,928 genomic sequences obtained from the GISAID database (updated on July 24, 2020) derived from patients infected by SARS-CoV-2 in different areas around the world. The genomes were collected from December 2019 to July 2020. The SARS-CoV-2 reference genome (RefSeq: NC_045512.2) was obtained from the GenBank database. Genome alignment was performed using MAFFT.
Nsp1 sequences belonging to SARS-CoV-2 were identified using BioEdit and verified using BLAST. The nsp1 protein of SARS-CoV-2, with and without the deletion, was subsequently modelled using I-TASSER.
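As an illustration of the screening step described above, the following minimal Python sketch flags aligned sequences that carry a contiguous gap across a chosen window of alignment columns. It is not the authors' pipeline (which relied on MAFFT, BioEdit, and BLAST); the function names `has_deletion` and `find_deletion_carriers`, the toy alignment, and the window coordinates are hypothetical and chosen only for demonstration.

```python
def has_deletion(aligned_seq, start, end, gap="-"):
    """Return True if every column in the 1-based inclusive window is a gap."""
    window = aligned_seq[start - 1:end]
    return len(window) == end - start + 1 and set(window) == {gap}


def find_deletion_carriers(alignment, start, end):
    """Return names of aligned sequences gapped across the whole window."""
    return [name for name, seq in alignment.items() if has_deletion(seq, start, end)]


# Toy alignment (hypothetical sequences, not real SARS-CoV-2 data).
toy_alignment = {
    "reference": "ATGGGTAAGAGCTTCGATTAA",
    "sample_1":  "ATGGGT---------GATTAA",   # 9-column gap across the window
    "sample_2":  "ATGGGTAAGAGCTTCGATTAA",   # no gap
    "sample_3":  "ATGGGTAAG---TTCGATTAA",   # shorter gap, not the full window
}

carriers = find_deletion_carriers(toy_alignment, start=7, end=15)
print(carriers)
```

In a real multiple sequence alignment the window must be given in alignment columns, which can differ from reference genome coordinates when other sequences introduce insertions.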
We identified genomic sequences, from specific countries, carrying a deletion of 9 nucleotides at positions 686-694, corresponding to AA positions 241-243 (KSF) (Fig. 1). Table 1 lists the countries, the number of available sequences analyzed for each, and the number of sequences carrying the deletion. Overall, 0.44% of the genomes analyzed carried the deletion, though it was not homogeneously distributed. In fact, we did not find it in certain countries, such as Italy, Germany and Austria, while in others it was clearly present, for example in Sweden with 10 out of 527 genomes (1.90%), Israel (0.90%), Brazil (0.63%) and England (0.45%). Among the states analyzed in the United States, we detected it in New Jersey (0.91%), New York (0.74%), Utah (0.73%), and Connecticut (0.65%), while we could not detect it in Texas and Nebraska. We note that some of the areas where the deletion could not be detected had a very low number of genomic sequences available for analysis, making the negative results difficult to interpret. Furthermore, the available dataset did not allow us to determine whether this deletion arose through a series of independent events at different times and in different geographical areas, as if the virus had an intrinsically fragile site, or whether it emerged from a single transforming event originating from a unique cluster. More data are needed to differentiate between these hypotheses.
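The percentages above are simple proportions of deletion-carrying genomes among those analyzed. The Python sketch below reproduces the Sweden figure from the counts given in the text and, to illustrate why small samples make negative results hard to interpret, computes a one-sided 95% upper bound on prevalence when zero carriers are observed. The bound is a standard binomial calculation added here for illustration and is not part of the original analysis; the sample sizes passed to it are hypothetical.

```python
def prevalence_percent(carriers, total):
    """Percentage of analyzed genomes that carry the deletion."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * carriers / total


def upper_bound_zero_observed(total, confidence=0.95):
    """One-sided upper bound on the true proportion when 0 of `total` carry it.

    Solves (1 - p) ** total = 1 - confidence for p (exact binomial bound).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / total)


# Sweden: 10 of 527 genomes, as reported in the text.
print(f"Sweden: {prevalence_percent(10, 527):.2f}%")

# Hypothetical sample sizes for an area with no detected carriers.
for n in (20, 500):
    print(f"n={n}: upper bound {100 * upper_bound_zero_observed(n):.2f}%")
```

With only a few dozen genomes, the upper bound remains far above the overall 0.44% frequency, so failing to observe the deletion in such an area does not show that it is absent there.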
We next used I-TASSER to model the nsp1 protein of SARS-CoV-2 carrying the deletion. A structural comparison of the SARS-CoV-2 nsp1 models with and without the deletion is shown in Fig. 2. Cartoon depictions of SARS-CoV-2 nsp1 with and without the deletion show the superimposed core (AA1-127) and the C-terminal tails (AA128-148). The structure of the C-terminal tail is unresolved in the NMR structure of SARS-CoV nsp1 (PDB code 2GDT), and this region is predicted to be highly flexible and disordered, with a few predicted helical secondary elements. Prediction models for both SARS-CoV nsp1 and SARS-CoV-2 nsp1 indicate the possibility of a short helical secondary structure at the KSY and KSF amino acids, respectively, and this terminal tail was found to be very important for expression of nsp1 itself. The flexibility, lack of structure and disorder in this region are speculated to make the protease recognition sequence between nsp1 and nsp2 accessible. Indeed, the C-terminal tail was found to be dispensable for MHV (murine hepatitis virus) replication but necessary for proteolysis of nsp1 and nsp2. The newly described deletion of the KSF amino acids may influence potential secondary structure in this region of SARS-CoV-2, thereby altering nsp1 interactions and the consequent regulation of viral protein expression and host gene expression.
Our analysis shows the emergence of a deletion in nsp1, one of the most important determinants of pathogenicity of SARS-CoV-2. This is quite surprising, since coronaviruses typically experience a moderate rate of mutation, calculated at about 26 mutations per year (https://nextstrain.org/ncov/global?l=clock), due to the presence of a protein with proofreading activity (ExoN, also called nsp14). Though the sequences carrying the deletion were a small fraction of the total analyzed, our data clearly identify a new SARS-CoV-2 viral strain present in subjects from different areas (Europe, North and South America). However, our analysis also indicates that this deletion is not homogeneously present in all the countries analyzed. For this reason, it would be important to monitor its presence over time, and to determine its penetrance and its probability of spreading and competing with the current viral strains. Nonetheless, our results suggest the possible evolution of a new viral quasispecies, but further data are necessary to confirm this hypothesis and to explore the possibility of a developing intra-host adaptive process.
The new viral strain that we describe carries a characteristic deletion of 9 nucleotides in the C-terminal region of the nsp1 gene, translating into a protein lacking three amino acids (KSF). Substitution of two of these amino acids (KS) reduced the inhibitory effect of SARS-CoV nsp1 on the innate immune response, and our predicted structure analysis suggests that loss of these amino acids may compromise proper folding of nsp1. Consequently, we hypothesize that viruses harboring this deletion are likely to be less pathogenic than commonly observed viral strains. In this regard, we note that the two common endemic human coronaviruses, HCoV-OC43 and HCoV-229E, have extensive deletions in the C-terminal region of nsp1. Though crystallization and biological data are needed to confirm our hypothesis, our observations, together with the recent findings of viral strains carrying an extensive deletion in the orf7a gene, a deletion in the nsp2 gene, and deletions in the nsp1 gene also identified by other groups [37, 38], indicate that the SARS-CoV-2 genome may be undergoing a significant evolutionary process, which may result in virus-host adaptation. Since the overwhelming majority of genomic sequences collected so far are from symptomatic subjects, it seems logical to characterize in detail SARS-CoV-2 genomes from the asymptomatic population. If our hypothesis is correct, this is the population in which we should be able to identify in more detail further viral evolutionary steps, which may indicate a reduction of pathogenicity. Understanding the different steps that characterize the pathogenicity of this virus, as well as the spreading and changes of these pathogenic determinants among the population, may help determine proper strategies for containing the spread of SARS-CoV-2 and identify better drugs for the treatment of COVID-19.
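The relationship between a 9-nucleotide deletion and the loss of exactly three amino acids can be illustrated with a toy example: because 9 is a multiple of 3, removing nine nucleotides that span whole codons deletes three residues and leaves the downstream reading frame intact. The Python sketch below uses a short made-up coding sequence and a deliberately partial codon table; neither is taken from the SARS-CoV-2 genome, and the coordinates and function names are hypothetical.

```python
# Partial codon table covering only the codons used in this toy example.
CODON_TABLE = {
    "ATG": "M", "GGT": "G", "AAG": "K", "AGC": "S",
    "TTC": "F", "GAT": "D", "TAA": "*",
}


def translate(cds):
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def delete_range(seq, start, end):
    """Remove the 1-based inclusive range start..end from seq."""
    return seq[:start - 1] + seq[end:]


wild_type = "ATGGGTAAGAGCTTCGATTAA"       # toy sequence: M G K S F D stop
mutant = delete_range(wild_type, 7, 15)   # drop 9 nt (the codons for K, S, F)

print(translate(wild_type))
print(translate(mutant))
```

A deletion whose length is not a multiple of three would instead shift the reading frame and change the downstream codons.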
We identified the emergence in infected subjects of a new viral strain of SARS-CoV-2 with a deletion of 3 amino acids (KSF) in the C-terminal region of nsp1. I-TASSER structure analysis indicates that this deletion may affect the structure of the C-terminal region, which is important for the regulation of nsp1 activity. Substitution of two of these amino acids (KS) was also previously reported to revert the loss of interferon-alpha expression in cells transfected with mutated nsp1 from SARS-CoV. This deletion in nsp1, together with deletions previously described in other parts of the SARS-CoV-2 genome by different groups, indicates that the virus is undergoing profound genomic changes. It should be noted that mutations of the virus are not very common, due to its proofreading mechanism, and that collection of the sequencing data is currently biased toward symptomatic subjects. It would be of interest to monitor over time and confirm the spreading of this particular viral strain, and potentially of strains with other deletions in the nsp1 protein, in the population of asymptomatic and pauci-symptomatic subjects, and to correlate these changes in nsp1 with a possible decrease in viral pathogenicity.
Availability of data and materials
All data utilized, generated or analyzed during these studies are included in this published article.
Abbreviations
SARS-CoV: Severe acute respiratory syndrome coronavirus
SARS-CoV-2: Severe acute respiratory syndrome coronavirus 2 (the virus that causes COVID-19)
MERS-CoV: Middle East respiratory syndrome coronavirus
Gralinski LE, Menachery VD. Return of the Coronavirus: 2019-nCoV. Viruses. 2020;12:2.
Chan JF, et al. Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan. Emerg Microbes Infect. 2020;9(1):221–36.
Li X, et al. Potential of large “first generation” human-to-human transmission of 2019-nCoV. J Med Virol. 2020;92(4):448–54.
Wang C, et al. The establishment of reference sequence for SARS-CoV-2 and variation analysis. J Med Virol. 2020;92(6):667–74.
Khailany RA, Safdar M, Ozaslan M. Genomic characterization of a novel SARS-CoV-2. Gene Rep. 2020;19:100682.
Andersen KG, et al. The proximal origin of SARS-CoV-2. Nat Med. 2020;26(4):450–2.
Wu A, et al. Genome composition and divergence of the novel coronavirus (2019-nCoV) originating in China. Cell Host Microbe. 2020;27(3):325–8.
Vetter P, et al. Clinical features of covid-19. BMJ. 2020;369:m1470.
Adhikari SP, et al. Epidemiology, causes, clinical manifestation and diagnosis, prevention and control of coronavirus disease (COVID-19) during the early outbreak period: a scoping review. Infect Dis Povert. 2020;9(1):29.
Fu L, et al. Clinical characteristics of coronavirus disease 2019 (COVID-19) in China: a systematic review and meta-analysis. J Infect. 2020;80(6):656–65.
Huang C, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020;395(10223):497–506.
Li Y-H, et al. Molecular characteristics, functions, and related pathogenicity of MERS-CoV proteins. Engineering. 2019;5(5):940–7.
Song Z, et al. From SARS to MERS, thrusting coronaviruses into the spotlight. Viruses. 2019;11:1.
Corman VM, et al. Hosts and sources of endemic human coronaviruses. Adv Virus Res. 2018;100:163–88.
Lei L, et al. Attenuation of mouse hepatitis virus by deletion of the LLRKxGxKG region of Nsp1. PLoS ONE. 2013;8(4):e61166–e61166.
Liu DX, et al. Accessory proteins of SARS-CoV and other coronaviruses. Antiviral Res. 2014;109:97–109.
Gordon DE, et al. A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature. 2020;583(7816):459–68.
Jauregui AR, et al. Identification of residues of SARS-CoV nsp1 that differentially affect inhibition of gene expression and antiviral signaling. PLoS ONE. 2013;8(4):e62416–e62416.
Wu C, et al. Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods. Acta Pharm Sin B. 2020;10(5):766–88.
Tohya Y, et al. Suppression of host gene expression by nsp1 proteins of group 2 bat coronaviruses. J Virol. 2009;83(10):5282–8.
Narayanan K, et al. Severe acute respiratory syndrome coronavirus nsp1 suppresses host gene expression, including that of type I interferon, in infected cells. J Virol. 2008;82(9):4471–9.
Huang C, et al. Alphacoronavirus transmissible gastroenteritis virus nsp1 protein suppresses protein translation in mammalian cells and in cell-free HeLa cell extracts but not in rabbit reticulocyte lysate. J Virol. 2011;85(1):638–43.
Lokugamage KG, et al. Severe acute respiratory syndrome coronavirus protein nsp1 is a novel eukaryotic translation inhibitor that represses multiple steps of translation initiation. J Virol. 2012;86(24):13598–608.
Kamitani W, et al. A two-pronged strategy to suppress host protein synthesis by SARS coronavirus Nsp1 protein. Nat Struct Mol Biol. 2009;16(11):1134–40.
Thoms M, et al. Structural basis for translational shutdown and immune evasion by the Nsp1 protein of SARS-CoV-2. bioRxiv. 2020;2020.05.18.102467.
Huang C, et al. SARS coronavirus nsp1 protein induces template-dependent endonucleolytic cleavage of mRNAs: viral mRNAs are resistant to nsp1-induced RNA cleavage. PLoS Pathog. 2011;7(12):e1002433.
Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 2019;20(4):1160–6.
Altschul SF, et al. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.
Roy A, Kucukural A, Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 2010;5(4):725–38.
Schrödinger LLC. The PyMOL Molecular Graphics System, Version 1.3r1. 2010.
Almeida MS, et al. Novel beta-barrel fold in the nuclear magnetic resonance structure of the replicase nonstructural protein 1 from the severe acute respiratory syndrome coronavirus. J Virol. 2007;81(7):3151–61.
Brockway SM, Denison MR. Mutagenesis of the murine hepatitis virus nsp1-coding region identifies residues important for protein processing, viral RNA synthesis, and viral replication. Virology. 2005;340(2):209–23.
Vijgen L, et al. Complete genomic sequence of human coronavirus OC43: molecular clock analysis suggests a relatively recent zoonotic coronavirus transmission event. J Virol. 2005;79(3):1595–604.
Farsani SMJ, et al. The first complete genome sequences of clinical isolates of human coronavirus 229E. Virus Genes. 2012;45(3):433–9.
Holland LA, et al. An 81-nucleotide deletion in SARS-CoV-2 ORF7a identified from sentinel surveillance in Arizona (January to March 2020). J Virol. 2020;94:14.
Bal A, et al. Molecular characterization of SARS-CoV-2 in the first COVID-19 cluster in France reveals an amino acid deletion in nsp2 (Asp268del). Clin Microbiol Infect. 2020;26(7):960–2.
Phan T. Genetic diversity and evolution of SARS-CoV-2. Infect Genet Evol. 2020;81:104260. https://doi.org/10.1016/j.meegid.2020.104260
Islam MR, Hoque MN, Rahman MS, et al. Genome-wide analysis of SARS-CoV-2 virus strains circulating worldwide implicates heterogeneity. Sci Rep. 2020;10:14004. https://doi.org/10.1038/s41598-020-70812-6.
Benedetti F, et al. SARS-CoV-2: march toward adaptation. J Med Virol. 2020;12:1–3.
MG is supported by Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ).
Ethics approval and consent to participate
Consent for publication
We consent to publish our data.
The authors have declared that no competing interests exist.
About this article
Cite this article
Benedetti, F., Snyder, G.A., Giovanetti, M. et al. Emerging of a SARS-CoV-2 viral strain with a deletion in nsp1. J Transl Med 18, 329 (2020). https://doi.org/10.1186/s12967-020-02507-5
- Viral adaptation