A modular framework for the development of targeted Covid-19 blood transcript profiling panels
Journal of Translational Medicine volume 18, Article number: 291 (2020)
Covid-19 morbidity and mortality are associated with a dysregulated immune response. Tools are needed to enhance existing immune profiling capabilities in affected patients. Here we aimed to develop an approach to support the design of targeted blood transcriptome panels for profiling the immune response to SARS-CoV-2 infection.
We designed a pool of candidates based on a pre-existing and well-characterized repertoire of blood transcriptional modules. Available Covid-19 blood transcriptome data was also used to guide this process. Further selection steps relied on expert curation. Additionally, we developed several custom web applications to support the evaluation of candidates.
As a proof of principle, we designed three targeted blood transcript panels, each with a different translational connotation: immunological relevance, therapeutic development relevance and SARS biology relevance.
Altogether the work presented here may contribute to the future expansion of immune profiling capabilities via targeted profiling of blood transcript abundance in Covid-19 patients.
Covid-19 is an infectious respiratory disease caused by a newly discovered coronavirus, SARS-CoV-2. The course of infection varies widely, with most patients presenting mild symptoms. However, about 20% of patients develop severe disease and require hospitalization [1, 2]. The interaction between innate and adaptive immunity can lead to the development of neutralizing antibodies against SARS-CoV-2 antigens that might be associated with viral clearance and protection. But immune factors are also believed to play an important role in the rapid clinical deterioration observed in some Covid-19 patients. There is thus a need to develop new modalities that can improve the delineation of “immune trajectories” during SARS-CoV-2 infection.
Blood transcriptome profiling involves measuring the abundance of circulating leukocyte RNA on a genome-wide scale via RNA sequencing. Processing of the samples and the raw sequencing data, however, is time consuming and requires access to sophisticated laboratory and computational infrastructure. Thus, the possibility of implementing this approach on large scales to ensure immediate translational potential is limited. Such unbiased omics profiling data might rather be leveraged to inform the development of more practical, scalable and targeted transcriptional profiling assays. These assays could in turn serve to significantly bolster existing immune profiling capacity.
Fixed sets of transcripts grouped based on co-expression observed in large collections of reference datasets provide a robust platform for transcriptional profiling data analyses. Here we leveraged a repertoire of 382 transcriptional modules previously developed by our team. The repertoire is based on a collection of reference patient cohorts encompassing 16 pathological or physiological states and 985 individual transcriptome profiles. In this proof of principle study, we used the available transcript profiling data from two separate studies to select Covid-19 relevant sets of modules [8, 9]. Next, we applied filters based on pre-specified selection criteria (e.g. immunologic relevance or therapeutic relevance). Finally, expert curation was used as the last selection step. For this we have developed custom web applications to consolidate the information necessary for the evaluation of candidates. One of these applications provides access to module-level transcript abundance profiles for available Covid-19 blood transcriptome profiling datasets. Another web interface was implemented to serve as a scaffold for the juxtaposition of such transcriptional profiling data with extensive functional annotations.
Two Covid-19 blood transcriptional datasets available at the time this work was conducted were used: (1) Xiong et al. obtained peripheral blood mononuclear cell samples from one uninfected control individual and three patients with Covid-19. RNA abundance was profiled via RNAseq. The data were deposited in the Genome Sequence Archive of the Beijing Institute of Genomics, Chinese Academy of Sciences, under the accession number CRA002390. FASTQ files were downloaded from this repository. Following QC, reads were aligned to reference genome GRCh38/hg19 using Hisat2 (v2.05). BAM files were converted to a raw count expression matrix using subreads (v1.6.2). Raw expression data were corrected for within-lane and between-lane effects using the R package EDASeq (v2.12.0) and quantile normalized using preprocessCore (v1.36.0). The modular analysis was performed using the 10,617 RNA-seq genes that overlapped with transcripts from the 3rd generation module construction. Details of the analysis are described in the sections below.
(2) Ong et al. collected whole blood stabilized in RNA buffer from uninfected controls and three Covid-19 patients at multiple time points. RNA abundance was profiled using a standard immunology panel from Nanostring comprising 594 transcripts. The data were deposited in the ArrayExpress public repository with accession ID E-MTAB-8871. The normalized data were downloaded, and modular analysis was performed using the 403 NanoString genes that overlapped with transcripts from the 3rd generation module construction. Details of the analysis are described in the sections below.
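The restriction to overlapping genes described for both datasets can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the helper name `module_coverage`, the module identifiers and the gene memberships are hypothetical placeholders, and the real repertoire definition is the one provided in Additional file 3.

```python
def module_coverage(module_to_genes, measured_genes):
    """For each module, list the measured member genes and the fraction of the module covered."""
    measured = set(measured_genes)
    coverage = {}
    for module, genes in module_to_genes.items():
        members = set(genes)
        kept = sorted(members & measured)  # genes usable for the modular analysis
        coverage[module] = {
            "kept": kept,
            "fraction": len(kept) / len(members) if members else 0.0,
        }
    return coverage


# Toy placeholders, not the real module repertoire
modules = {
    "M_toy_1": ["ISG15", "STAT1", "OAS1", "MX1"],
    "M_toy_2": ["GZMB", "PRF1"],
}
panel_genes = ["ISG15", "STAT1", "GZMB", "IL6R"]

cov = module_coverage(modules, panel_genes)
for module, info in cov.items():
    print(module, info["kept"], info["fraction"])
```

A per-module coverage table of this kind makes it easy to spot modules that are represented by only one or two measured transcripts, for which percentage-based summaries are fragile.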
We used in addition a reference dataset generated by our group that was previously used for the construction of the 382 blood transcriptional module repertoire. This repertoire served in turn as the basis for the selection/development of targeted Covid-19 blood transcript panels described in the present article. Briefly, this repertoire consists of the following cohorts of patients and respective control subjects: S. aureus infection (99 cases, 44 controls), sepsis (35 cases, 12 controls), tuberculosis (23 cases, 11 controls), influenza (25 cases, 14 controls), RSV infection (70 cases, 14 controls), HIV infection (28 cases, 35 controls), systemic lupus erythematosus (55 cases, 14 controls), multiple sclerosis (34 cases, 22 controls), juvenile dermatomyositis (40 cases, 9 controls), Kawasaki disease (21 cases, 23 controls), systemic onset idiopathic arthritis (62 cases, 23 controls), COPD (19 cases, 24 controls), melanoma (22 cases, 5 controls), pregnancy (25 cases, 20 controls), liver transplant recipients (94 cases, 30 controls), and B cell deficiency (20 cases, 13 controls). All samples were run at the same facility on Illumina HumanHT-12 v3.0 Gene Expression BeadChips. The data have been deposited in NCBI Gene Expression Omnibus (GEO) with accession number GSE100150.
Transcriptional module repertoire
The method used to construct the transcriptional module repertoire has been described elsewhere [10, 11]. The version used here is the third and last to have been developed by our group over a period of 12 years. It is the object of a separate publication (available on a pre-print server).
Briefly, the approach consists of identifying sets of co-expressed transcripts in a wide range of pathological or physiological states, focusing in this case on the blood transcriptome as the biological system. We determined co-expression based on patterns of co-clustering observed for all gene pairs across the collection of 16 reference datasets listed in the previous section, which encompassed viral and bacterial infectious diseases as well as several inflammatory or autoimmune diseases, B-cell deficiency, liver transplantation, stage IV melanoma and pregnancy. Overall, this collection comprised 985 blood transcriptome profiles. A weighted co-expression network was built, with the weight of the edge connecting a gene pair being based on the number of times co-clustering was observed for the pair among the 16 reference datasets. Thus, the weights ranged from 1 (where co-clustering occurs in one of 16 datasets) to 16 (where co-clustering occurs in all 16 datasets). Next, this network was mined using a graph theory algorithm to define subsets of densely connected gene sets that constituted our module repertoire (“Cliques” and “Paracliques”).
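The weighting scheme described above can be illustrated with a short sketch. It assumes that each reference dataset has already been clustered independently and is represented as a mapping from gene to cluster label; the function names and the toy data are hypothetical, and the subsequent clique and paraclique mining step is not shown.

```python
from collections import Counter
from itertools import combinations


def coclustering_weights(clusterings):
    """Count, for each gene pair, the number of datasets in which both genes share a cluster.

    clusterings: list with one dict per dataset, mapping gene -> cluster label.
    """
    weights = Counter()
    for assignment in clusterings:
        by_cluster = {}
        for gene, label in assignment.items():
            by_cluster.setdefault(label, []).append(gene)
        for genes in by_cluster.values():
            # every pair inside a cluster co-clusters once in this dataset
            for a, b in combinations(sorted(genes), 2):
                weights[(a, b)] += 1
    return weights


def edges_at_or_above(weights, min_weight):
    """Keep only gene pairs that co-cluster in at least min_weight datasets."""
    return {pair for pair, w in weights.items() if w >= min_weight}


# Three toy datasets (the study used 16)
clusterings = [
    {"g1": 0, "g2": 0, "g3": 1, "g4": 1},
    {"g1": 0, "g2": 0, "g3": 0, "g4": 1},
    {"g1": 2, "g2": 2, "g3": 5, "g4": 5},
]
weights = coclustering_weights(clusterings)
strong_edges = edges_at_or_above(weights, 2)
print(dict(weights))
print(sorted(strong_edges))
```

With 16 datasets the same counting yields edge weights between 1 and 16; pairs that never co-cluster simply have no edge.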
Overall, 382 transcriptional modules were identified, encompassing 14,168 transcripts. A supplemental file including the definition of this module repertoire along with the functional annotations is made available here (Additional file 3). To provide another level of granularity and facilitate data interpretation, a second round of clustering was performed to group the modules into “aggregates”. This process was achieved by grouping the set of 382 modules according to the patterns of transcript abundance across the 16 reference datasets that were used for module construction. This segregation resulted in the formation of 38 aggregates, each comprising between one and 42 modules.
Module repertoire analyses
The modular analyses were performed using the core set of 14,168 transcripts forming the module repertoire. For group-level comparisons (cases vs controls), a paired t-test was performed on the log2-transformed data [Fold change (FC) cut off = 1.5; FDR cut off = 0.1]. For individual-level comparisons, each sample was compared to the mean value of the corresponding control samples (or to the single control sample in the case of the Xiong et al. dataset). The cut off comprised an absolute FC > 1.5 and a difference in counts > 10. The results for each module are reported as the percentage of its constitutive transcripts that increased or decreased in abundance. Group-level comparisons were performed on the reference datasets (collection of 16 datasets from Altman et al.). Individual-level comparisons were performed on both Covid-19 datasets. Because the genes comprised in a module are selected based on the co-expression observed in blood, the changes in abundance within a given module tend to be coordinated and the dominant trend is therefore selected (the greater value of the percentage increased vs. percentage decreased). Thus, the values range from -100% (all constitutive transcripts are decreased) to +100% (all constitutive transcripts are increased). A module was considered to be “responsive” when the proportion of transcripts found to be increased was > 15% (induced), or when the proportion of transcripts found to be decreased was > 15% (repressed). At the aggregate level, the percent values of the constitutive modules were averaged. Module aggregates showing little change in Covid-19 patients were filtered out from the selection process. This was based on the proportion of modules for a given aggregate showing changes for all three subjects from the Xiong et al. dataset. The cutoff was set at 15%. In total, 17 of the 38 module aggregates exceeded this cutoff and were thus retained for downstream analyses. They are listed in Table 1.
Changes in transcript abundance summarized at the module or module-aggregate level were visualized using a custom fingerprint heatmap format. For each module, the percentage of increased transcripts is represented by a red spot and the percentage of decreased transcripts is represented by a blue spot. The fingerprint grid plots were generated using “ComplexHeatmap”. A web application was developed to generate the plots and browse modules and module aggregates (https://drinchai.shinyapps.io/COVID_19_project/). A detailed description and source code will be made available on GitHub and bioRxiv as part of a separate publication (in preparation).
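The individual-level scoring described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names and toy values are hypothetical, abundance values are assumed to be positive, ties between the increased and decreased percentages are resolved in favor of the increase, and the repressed criterion is read as more than 15% of transcripts decreased (a module value below -15%).

```python
import statistics


def transcript_call(case, control, fc_cutoff=1.5, diff_cutoff=10):
    """Return +1 (increased), -1 (decreased) or 0 (no change) for one transcript."""
    if case <= 0 or control <= 0:
        return 0  # assumption: non-positive values are not called
    if abs(case - control) <= diff_cutoff:
        return 0
    ratio = case / control
    if ratio > fc_cutoff:
        return 1
    if ratio < 1 / fc_cutoff:
        return -1
    return 0


def module_score(case, control, genes):
    """Dominant trend for a module: +% increased or -% decreased transcripts."""
    calls = [transcript_call(case[g], control[g]) for g in genes]
    pct_up = 100 * calls.count(1) / len(calls)
    pct_down = 100 * calls.count(-1) / len(calls)
    return pct_up if pct_up >= pct_down else -pct_down


def module_status(score, threshold=15):
    if score > threshold:
        return "induced"
    if score < -threshold:
        return "repressed"
    return "unchanged"


# Toy abundance values for one case sample and the control mean
control = {"a": 100, "b": 100, "c": 100, "d": 100, "e": 50, "f": 20}
case = {"a": 200, "b": 160, "c": 105, "d": 40, "e": 20, "f": 25}
toy_modules = {"M_toy_1": ["a", "b", "c", "d"], "M_toy_2": ["e", "f"]}

scores = {m: module_score(case, control, g) for m, g in toy_modules.items()}
aggregate_value = statistics.mean(scores.values())  # aggregate-level average
for module, score in scores.items():
    print(module, score, module_status(score))
print("aggregate", aggregate_value)
```

In the toy aggregate, one induced and one repressed module average to zero, which illustrates why modules within an aggregate are later split into sets with coherent abundance patterns.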
Selection of transcripts for inclusion in targeted panels
Covid-19 module sets belonging to aggregates comprising module annotations relating to inflammation, monocytes, neutrophils or coagulation pathway were selected for screening (A7, A8, A26, A31, A33, A34, A35). In turn, transcripts from each of the corresponding module sets were selected on the basis of their status as a known therapeutic target of a drug for which clinical precedence exists (source: targetvalidation.org). Next, candidates were prioritized via expert curation on the basis of compatibility and potential benefit as a Covid-19 treatment. Curators selected for this task were medical degree holders. They were provided with reports from the Open Targets website. These reports included transcripts within a given module set whose products were identified as being targetable by existing drugs (with tractability information indicating “with clinical precedence”; e.g. for Module M16.64 A31/S1: https://bit.ly/3dLin5P). Based on this information, their medical knowledge, and a review of the relevant literature, curators identified, among candidates targeted by drugs, those that would be most likely to be considered for treatment of Covid-19 patients. When multiple such candidates were identified, a ranking was given based on feasibility and perceived potential clinical benefit. Only the top ranked candidate from each set was selected for inclusion in the panel. Module sets from aggregate A28 (interferon response) may also be of clinical relevance as indicators of a treatment response, since interferon administration has been shown to increase the activity of anti-viral drugs in Covid-19 patients. The selection of candidates for aggregate A28 sets was thus based on the amplitude of the response to beta-interferon therapy measured in patients with multiple sclerosis [fold-change over pre-treatment baseline; NCBI GEO accession GSE26104].
The remaining nine aggregates, which tended to associate preferentially with adaptive immune responses and for which targeting by therapies might prove detrimental, were not included in this screen. For these, representative transcripts from the default panel of immune relevant transcripts were included.
Relevance to Coronavirus biology
For the panel emphasizing relevance to coronavirus biology, transcripts were primarily selected based on their relevance to SARS (Severe Acute Respiratory Syndrome) biology. As a first step, a literature profiling tool was used to identify, among the SARS, MERS (Middle East Respiratory Syndrome), or Covid-19 literature, articles that were associated with transcripts forming the 28 Covid-19 module sets [Literature Lab (LitLab) by Acumenta Biotech and LitLab Gene Retriever application, Acumenta Biotech, Boston, MA]. Next, the potential associations were assessed by manual curation. The curators prioritized the transcripts for which the associations could be confirmed based on importance and robustness.
Lists of immunologically relevant genes were retrieved from ImmPort, the NIAID Immunology Database and Analysis Portal, and were used along with membership to IPA pathways (Ingenuity Pathway Analysis, QIAGEN, Germantown MD) to annotate transcripts comprising Covid-19 module sets. The curators prioritized annotated transcripts on the basis of their relevance to the functional annotations of the module set (e.g. if the main annotation for the modules of a given set is “cytotoxic cells”, markers for NK cells would be preferred over a cytokine that is better characterized but is unrelated to cytotoxic functions). The transcript with the highest priority rank was included in the assay.
A recommended set of housekeeping genes is provided in Table 2. These were selected on the basis of low variance observed across the 985 transcriptome profiles generated for our reference cohorts.
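A low-variance selection of this kind can be sketched as follows. The source states only that low variance across the 985 reference profiles was the criterion; the log2 transformation, the number of genes returned, the function name and the toy values are assumptions made for illustration.

```python
import math
import statistics


def lowest_variance_genes(expression, n):
    """Rank genes by the variance of their log2 abundance across samples.

    expression: dict mapping gene -> list of positive abundance values.
    Returns the n genes with the lowest variance, most stable first.
    """
    variances = {
        gene: statistics.variance([math.log2(v) for v in values])
        for gene, values in expression.items()
    }
    return sorted(variances, key=variances.get)[:n]


# Toy abundance values across four samples (placeholders, not real data)
expression = {
    "gene_A": [100, 102, 98, 101],
    "gene_B": [50, 400, 10, 900],
    "gene_C": [200, 210, 190, 205],
}
stable = lowest_variance_genes(expression, 2)
print(stable)
```

Working on the log2 scale keeps the ranking from being dominated by highly abundant transcripts; other stability measures, such as the coefficient of variation, would be equally reasonable choices.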
Links to the resources described in this section and to video demonstrations are available in Table 3. Interactive presentations were created via the Prezi web application. For this we have built and expanded upon an annotation framework established as part of the characterization of our reference blood transcriptome repertoire. Several bioinformatic resources were used to populate interactive presentations that served as a framework for annotation of Covid-19 relevant module sets. These resources include web applications deployed using R Shiny, which make it possible to plot transcript abundance patterns at the module and aggregate levels. Two of these applications were developed as part of a previous work establishing the blood transcriptome repertoire and applying it in the context of a meta-analysis of six public RSV datasets. As described above, a third application was developed as part of this work and can generate profiles at the transcript, module and module-aggregate levels for the Xiong et al. and Ong et al. datasets.
Mapping Covid-19 blood transcriptome signatures against a pre-existing reference set of transcriptional modules
Changes in blood transcript abundance in response to SARS-CoV-2 infection have thus far been reported in two different studies. Different platforms, methodologies and designs were employed. We therefore first used a reference set of signatures as a common framework to compare changes in transcript abundance measured in each study.
We employed a pre-established repertoire of 382 transcriptional modules (Fig. 1a) to map changes observed in Covid-19 patients. This module framework is described in detail in the “Methods” section and in a separate publication. The Covid-19 datasets that we used for this were contributed by Xiong et al. (one control and three subjects) and Ong et al. (nine controls and three subjects profiled at multiple time points). Their data were generated using RNA-seq and Nanostring technology, respectively. The generic 594 transcript panel used by Ong et al. did not give sufficient coverage across the 382-module set. We thus mapped the transcript changes at a lower resolution, using 38 module “aggregates” (Fig. 2). These 38 aggregates encompass the entire 382 module set and constitute a more reduced version of this repertoire (see “Methods” section). In general, we saw a decrease in aggregates associated with lymphocytic compartments (aggregates A1 & A5) and an increase in aggregates associated with myeloid compartments and inflammation (aggregates A33 & A35). As expected, we also saw increases over uninfected controls for the module aggregate associated with interferon (IFN) responses (A28) and the module aggregate presumably associated with the effector humoral response (A27). We detected a wide spread of values for aggregate A11 for the Nanostring (Ong et al.) dataset. However, this aggregate comprises only one module, with only two of its transcripts measured in this Nanostring code set (the probe coverage across module aggregates is shown in Additional file 1 Fig. S1).
Despite large differences between the two studies in terms of design, range of clinical severity, technology platforms and module coverage, the combined overall changes (detected at a high-level perspective) are consistent with those observed in known acute infections, such as those caused by influenza, respiratory syncytial virus (RSV) or S. aureus. This consistency is evidenced by the patterns of change observed for the reference fingerprints shown alongside those of Covid-19 patients (Fig. 2).
This analysis provides a high-level mapping of changes associated with SARS-CoV-2 infection in two independent studies. It revealed a significant degree of inter-individual variability among Covid-19 patients. In one of the studies, dynamic changes were also observed for the same individuals at multiple time points. Overall, the results show that changes in abundance of blood transcripts can be measured during the course of Covid-19 disease. They also highlight the need for transcript profiling analyses to be carried out in large numbers of patients and at high temporal frequencies.
Selection of aggregates and identification of coherent sets of Covid-19-relevant modules
The pre-established repertoire of 382 transcriptional modules that we have employed here covers 14,168 transcripts. It is based on co-expression patterns observed across a wide range of immune states (Fig. 1a). However, only a fraction of the modules constituting this repertoire are expected to be of relevance for monitoring changes in transcript abundance in Covid-19 patients (as shown in Fig. 2). Thus, in the next step we selected a subset of Covid-19 relevant module aggregates (Fig. 1b). This was achieved by filtering out aggregates for which changes were seldom observed among patients profiled via RNAseq by Xiong et al. (see “Methods” for details). As a result, 17 of the 38 module aggregates forming the repertoire were retained for further analysis and target selection (Table 1).
However, patterns of changes in transcript abundance for modules comprised in a given aggregate are not always homogeneous. Thus, a further step consisted of identifying sets of modules within each of the 17 aggregates that display coherent abundance patterns (Fig. 1c). To achieve this, we first mapped the changes in transcript abundance associated with Covid-19 disease using the RNAseq dataset from Xiong et al., as illustrated for A31 (Fig. 3a) and A28 (Fig. 4a). Similar plots can be generated for all other aggregates using the “COVID-19” web application (links listed in Table 3 and output provided in Additional file 2).
Next, we identified the modules that formed homogeneous clusters and assigned a module set ID to each cluster. For example, we designated the first A28 set as A28/S1. Such module grouping is only based on patterns of transcript abundance observed in three Covid-19 patients; however, the groupings were often consistent with those observed for the much larger reference cohorts that constitute the module repertoire (Fig. 3b and Fig. 4b). A28/S1, which is formed by M8.3 and M15.127, serves as a good example of this consistency (Fig. 4b). Likewise, the segregation of the modules forming A31 based on differences observed in the three Covid-19 patients was also apparent in the reference patient cohorts (Fig. 3b). Specifically, an increase in A31/S1 modules, which accompanied a decrease in A31/S2 modules, in these three patients was also characteristic of RSV patients.
We ultimately derived 28 homogeneous Covid-19 relevant module sets from the 17 aggregates selected in the earlier step (Table 1). These sets were used as a basis for further selection.
Design of a preliminary targeted panel emphasizing immunological relevance
In the previous step, we used available Covid-19 data to guide the selection of 28 distinct “Covid-19 relevant module sets”. In the next step, we selected the transcripts within each module set that warranted inclusion in one of three preliminary Covid-19 targeted panels. A first panel was formed using immunologic relevance as the primary criterion, a second was formed on the basis of relevance to therapy, and a third was constituted on the basis of relevance to coronavirus biology.
For the first panel we matched transcripts comprised in each module set to a list of canonical immune genes (see “Methods” for details). Expert curation also involved accessing transcript profiling data from the reference datasets, indicating for instance leukocyte restriction or patterns of response to a wide range of immune stimuli in vitro. We describe our approach for module and gene annotation in more detail below and provide access to our resources to support expert curation (Table 3).
For our illustrative case, we selected one representative transcript per module set to produce a panel comprising 28 representative transcripts (Table 4). Examples of signatures surveyed by such a panel include: (1) ISG15 in A28/S1 (interferon responses), which encodes a member of the ubiquitin family. ISG15 plays a central role in the host defense to viral infections. (2) GATA1 in A37/S1 (erythroid cells), which encodes a master regulator of erythropoiesis. It is associated with a module signature (A37) that we recently reported as being associated with immunosuppressive states, such as late stage cancer and maintenance immunosuppressive therapy in solid organ transplant recipients. In the same report we also found an association between this signature and heightened severity in patients with RSV infection and established a putative link with a population of immunosuppressive circulating erythroid cells. (3) CD38 in A27/S1 (cell cycle), which encodes the CD38 molecule expressed on different circulating leukocyte populations. In whole blood, we find that the abundance of its transcript correlates with that of IGJ, TNFRSF17 (BCMA) and TXNDC5 (M12.15). Such a signature was previously found to be increased in response to vaccination at day 7 post administration, and to correlate with the prevalence of antibody producing cells and the development of antibody titers at day 28. (4) TLR8 in A35/S1 (inflammation), which encodes toll-like receptor 8. Expression of transcripts comprising this aggregate is generally restricted to neutrophils and robustly increased during sepsis (e.g. as we have described in detail earlier for ACSL1, another transcript belonging to this aggregate). (5) GZMB in A2/S1 (Cytotoxic cells), which encodes Granzyme B, a serine protease known to play a role in immune-mediated cytotoxicity. Other transcripts forming this panel are listed in Table 4.
Even with the limited amount of data available to guide the selection in the previous steps, it is reasonable to assume that such a panel (while not optimal) would already provide valid information for Covid-19 immune profiling. Additional Covid-19 blood transcriptome data that will become available in the coming weeks will allow us to refine the overall selection process.
Design of a preliminary targeted panel emphasizing therapeutic relevance
A different translational connotation was given for this second panel. Here, we based the selection on the same collection of 28 module sets. However, this time, whenever possible, we included transcripts that could have value as targets for the treatment of Covid-19 patients. An initial screen identified 82 transcripts encoding molecules that are known targets for existing drugs (see “Methods”). We further prioritized these candidates based on an expert’s evaluation of the compatibility of use of the drugs for treating Covid-19 patients. As an exception, module sets belonging to A28 (interferon response) were selected based on their suitability as markers of a response to interferon therapy (as described in “Methods” and illustrated in Fig. 5). Sets for which no targets of clinical relevance were identified (16/28) were instead represented in the panel by immunologically-relevant transcripts (defined earlier). Indeed, while it is possible to customize panels according to preference or needs, it would be optimal for any such custom targeted panel to maintain coverage across the entire breadth of Covid-19 signatures (i.e. the 28 homogenous Covid-19 module sets).
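The fallback rule described above, in which a module set without a clinically relevant target is represented by its immunologically relevant transcript, can be sketched as follows. The function name is hypothetical, and the small dictionaries are an excerpt built from examples named in the text rather than the full panels of Tables 4 and 5.

```python
def assemble_panel(module_sets, preferred, fallback):
    """Pick one transcript per module set, keeping coverage across all sets.

    preferred: module set -> curated candidate (e.g. therapeutic target).
    fallback:  module set -> default immunologically relevant transcript.
    """
    panel = {}
    for set_id in module_sets:
        if set_id in preferred:
            panel[set_id] = (preferred[set_id], "therapeutic")
        else:
            panel[set_id] = (fallback[set_id], "immunological")
    return panel


# Excerpt for illustration only; the study covers 28 module sets
module_sets = ["A35/S2", "A33/S1", "A2/S1", "A37/S1"]
therapeutic = {"A35/S2": "IL6R", "A33/S1": "PDE8A"}
immunological = {"A2/S1": "GZMB", "A37/S1": "GATA1"}

panel = assemble_panel(module_sets, therapeutic, immunological)
for set_id, (transcript, source) in panel.items():
    print(set_id, transcript, source)
```

Because every module set contributes exactly one transcript regardless of which criterion supplied it, a panel built this way keeps the same breadth of signature coverage as the immunological panel.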
We ultimately identified a preliminary set of 12 targets through this high stringency selection process (Table 5). Developing effective immune modulation therapies in critical care settings has proven challenging. Current efforts in the context of Covid-19 disease particularly aim at controlling runaway systemic immune responses or so called “cytokine storms” that have been associated with organ damage and clinical worsening. Targets of interest identified among our gene set include: (1) IL6R in A35/S2 (inflammation), encoding the Interleukin-6 Receptor, which is a target for the biologic drug Tocilizumab. Several studies have tested this antagonist in open label single arm trials in Covid-19 patients with the intent of blocking the cytokine storm associated with severe Covid-19 infection [25, 26]. (2) CCR2 in A26/S1 (monocytes), encoding the chemokine (C–C motif) receptor 2, which is targeted along with CCR5 by the drug Cenicriviroc. This drug exerts potent anti-inflammatory activity. (3) TBXA2R in A31/S1 (platelets), encoding the Thromboxane receptor, which is targeted by several drugs with anti-platelet aggregation properties. (4) PDE8A in A33/S1 (inflammation), encoding Phosphodiesterase 8A, which is targeted by Pentoxifylline, a non-selective phosphodiesterase inhibitor that increases perfusion, may reduce the risk of acute kidney injury and attenuates LPS-induced inflammation. (5) NQO1 in A8/S1 (Complement), encoding NAD(P)H quinone dehydrogenase 1. The NQO1 antagonist Vatiquinone (EPI-743) has been found to inhibit ferroptosis, a process associated with tissue injury, including in sepsis. A complete list is provided in Table 5.
The fact that this transcript panel and the previous one survey the same pre-defined 28 homogenous Covid-19 relevant module sets should make them largely synonymous (since modules are formed on the basis of co-expression). Nevertheless, this second panel may be more relevant for investigators interested in exploring new therapeutic approaches or measuring responses to treatment.
Design of a preliminary targeted panel of blood transcripts of relevance for SARS-CoV-2 biology
For the third panel designed in this proof of principle, we primarily selected transcripts based on their relevance to SARS biology. As a first step, we used a literature profiling tool to identify SARS, MERS, or Covid-19 literature articles that were associated with transcripts forming the 28 Covid-19 module sets. Next, the potential associations were subjected to expert curation (see “Methods”). Once again, to keep redundancies to a minimum, we only included one candidate per set in this panel (Table 6). Notable examples include: (1) LTF in A38/S1 (Neutrophil activation), encoding Lactotransferrin, which is known to block the binding of the SARS-CoV spike protein to host cells, thus exerting an inhibitory function at the viral attachment stage. (2) FURIN in A37/S1 (Erythroid cells), encoding a proprotein convertase that preactivates SARS-CoV-2, thus reducing its dependence on target cell proteases for entry. (3) EGR1 in A7/S1 (Monocytes), encoding Early Growth Response 1, which upon induction by SARS Coronavirus Papain-Like Protease mediates upregulation of TGF-β1. (4) STAT1 in A28/S3 (Interferon response), encoding a transcription factor known to play an important role in the induction of antiviral effector responses. It was reported that SARS ORF6 antagonizes STAT1 function by preventing its translocation to the nucleus and acts as an interferon antagonist in the context of SARS-CoV infection.
This screen identified several molecules that may be of importance for SARS-CoV-2 entry and replication. A complete list is provided in Table 6. It is expected that this knowledge will evolve rapidly over time and periodic updates may be necessary. As for the previous two panels, investigators may also have an interest in including more than one candidate per module set. This would also be feasible, although at the expense of parsimony.
Development of an annotation framework in support of signatures curation efforts
A vast amount of information is available to support the work of expert curators who are responsible for finalizing the selection of candidates. This process often requires accessing a number of different resources (e.g. those listed in Table 3). Here we have built upon earlier efforts to aggregate this information in a manner that makes it seamlessly accessible by the curators.
As proof of principle, we created dedicated, interactive presentations in Prezi for module aggregates A28 (https://prezi.com/view/7lbgGwfiNflffqQzvL14/) and A31 (https://prezi.com/view/zYCSLyo0nvJTwjfJkJqb/). These presentations are intended, on the one hand, to aggregate contextual information that can serve as a basis for data interpretation. On the other hand, they are intended to capture the results of the interpretative efforts of expert curators.
The interactive presentations are organized in sections, each showing aggregated information from a different level: module-sets, modules and transcripts (Fig. 6). The information derived from multiple online sources, including both third party applications and custom applications developed by our team (Table 3). Among those is a web application developed specifically for this work, which was used to generate the Covid-19 plots from Ong et al. and Xiong et al. (Figure 6a). The interactive presentation itself permits to zoom in and out, determine spatial relationships and interactively browse the very large compendium of analysis reports and heatmaps generated as part of these annotation efforts (Fig. 6b). The last section that contains transcript-centric information, is also the area where interpretations from individual curators can be aggregated (Fig. 6c).
We have annotated and interpreted some of the transcripts included in A31/S1 in such a manner: (1) OXTR encodes the oxytocin receptor, through which the anti-inflammatory and wound healing properties of oxytocin are mediated. Among our reference cohort datasets, OXTR is most highly increased in patients with S. aureus infection or active pulmonary tuberculosis. (2) CD9 encodes a member of the tetraspanin family that facilitates the condensation of receptors and proteases activating MERS-CoV, promoting its rapid and efficient entry into host cells. (3) TNFSF4 encodes OX40L, a member of the TNF superfamily. Although OX40L is best known as a T-cell co-stimulatory molecule, reports have also shown that it is present on the neutrophil surface. Furthermore, OX40L blockade improved outcomes of sepsis in an animal model.
Our interpretation efforts have been limited thus far by expediency. Certainly, interpretation will be the object of future, more targeted efforts. In the meantime, this annotation framework supports the selection of candidates forming the panels presented here. It may also serve as a resource for investigators who wish to design custom panels of their own.
Early reports point to profound immunological changes occurring during the course of SARS-CoV-2 infection [40, 41]. In particular, patterns of immune dysfunction have been associated with rapid worsening of symptoms and the onset of severe respiratory failure. However, disease outcomes remain highly heterogeneous and factors contributing to clinical deterioration are poorly understood. Among other modalities, means to enhance immune monitoring capabilities in cohorts of Covid-19 patients are needed. Here we designed an approach to select and curate targeted blood transcript panels relevant to Covid-19 disease.
The sparsity of the Covid-19 blood transcriptome data available to guide the selection process described in this paper was an obvious limitation. The Xiong et al. dataset comprised profiles of only three Covid-19 subjects and one uninfected control subject. More transcriptome profiling data will be generated and become available in the coming weeks and months, including from our group. This will make it possible to reiterate the selection process and refine the design of the preliminary versions of the three Covid-19 transcript panels presented here. Additional Covid-19 data would likely make it possible to adjust the filtering of module aggregates (Fig. 1b) and improve the delineation of Covid-19 module sets (Fig. 1c). However, the generic module repertoire that serves as the main framework for candidate selection would remain unchanged (Fig. 1a). Likewise, knowledge-driven prioritization of candidates based on relevance to therapy, immunology or SARS biology is by definition independent of Covid-19 profiling data availability (Fig. 1d). Therefore, while changes to the preliminary panels presented in Tables 4, 5 and 6 are to be expected as additional Covid-19 profiling data become available, they may not prove to be extensive.
The targeted panel design approach that we are presenting is also partly knowledge-driven. Indeed, we have relied on expert knowledge for the identification of transcripts coding for molecules with biological significance or therapeutic relevance, specifically in the context of Covid-19 disease (Fig. 1d). While it was possible to enlist the help of several curators to work in parallel on this task, the amount of time allotted was limited by the need for expediency. Curation of candidates is therefore another area that will be worth revisiting over the coming weeks and months as targeted blood transcript panels are further refined. It is also an effort that the platform we have developed for aggregating large amounts of information from various sources would help support. This will become especially important now that an increasing number of bioinformatics tools and resources are being made available by the scientific community for tackling the current health crisis (e.g. more specifically for drug target identification and repurposing: [43, 44]).
In the illustrative use cases that we are providing, non-synonymous targeted panels were formed by selecting only one representative transcript from each of the 28 homogeneous Covid-19 module sets. It is nevertheless possible to devise custom selection strategies where more than one transcript is retained from each set. Our own implementation of a preliminary targeted Covid-19 blood transcriptional assay will be based on the Fluidigm Biomark high-throughput PCR platform. The panel will comprise 96 targets in order to comply with the format of Fluidigm’s integrated fluidics circuits (96 samples × 96 reactions). These will include all transcripts listed in Tables 4, 5 and 6 (53 unique transcripts), complemented by 35 additional candidates that received priority ranking from our expert curators and 8 housekeeping genes (53 + 35 + 8 = 96). While the number of candidates to be selected within a given module set remains flexible, our recommendation when designing such a targeted panel would be for all 28 module sets to be covered by at least one transcript. Additional file 3 includes the list of the genes included in the modules forming the 28 Covid-19 module sets. Other medium-throughput technology platforms, such as the Nanostring nCounter System or ThermoFisher OpenArray, would also be appropriate for implementing custom profiling assays with the number of targets comprising the preliminary panels presented here (or a combination thereof). Downsizing panels to comprise approximately 10 key markers might serve as a basis for implementation on more ubiquitous real-time PCR platforms.
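The selection and coverage rules described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the function names, the priority scores, the gene placeholder "GENE_X" and the module set labels "SET_B" and "SET_C" are hypothetical and serve only as toy input. Only the counts (53, 35, 8 and 96) and the assignment of OXTR, CD9 and TNFSF4 to A31/S1 come from the text.

```python
from collections import defaultdict

# Counts stated in the text for the planned 96-target panel.
PANEL_TRANSCRIPTS = 53    # unique transcripts from Tables 4, 5 and 6
CURATOR_PRIORITIZED = 35  # additional curator-prioritized candidates
HOUSEKEEPING = 8          # housekeeping genes
CHIP_FORMAT = 96          # reactions per integrated fluidic circuit


def select_per_module_set(candidates, per_set=1):
    """Keep the `per_set` highest-scoring candidates in each module set.

    `candidates` is an iterable of (gene, module_set, score) tuples.
    Scores are hypothetical curator priority scores (higher is better).
    """
    by_set = defaultdict(list)
    for gene, module_set, score in candidates:
        by_set[module_set].append((score, gene))
    selected = {}
    for module_set, scored in by_set.items():
        # Descending score, then gene symbol as a deterministic tie-break.
        ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
        selected[module_set] = [gene for _, gene in ranked[:per_set]]
    return selected


def uncovered_sets(selected, all_module_sets):
    """Return the module sets that have no selected transcript."""
    return sorted(s for s in all_module_sets if not selected.get(s))


# Toy input: scores, "GENE_X", "SET_B" and "SET_C" are placeholders.
toy_candidates = [
    ("OXTR", "A31/S1", 0.6),
    ("CD9", "A31/S1", 0.9),
    ("TNFSF4", "A31/S1", 0.7),
    ("GENE_X", "SET_B", 0.5),
]
toy_sets = ["A31/S1", "SET_B", "SET_C"]

one_each = select_per_module_set(toy_candidates, per_set=1)
two_each = select_per_module_set(toy_candidates, per_set=2)
missing = uncovered_sets(one_each, toy_sets)
total_targets = PANEL_TRANSCRIPTS + CURATOR_PRIORITIZED + HOUSEKEEPING
```

In this toy example, `missing` flags the one module set without a candidate, which is the situation the recommendation above aims to avoid, and `total_targets` confirms that the stated composition fills the 96-reaction format exactly. Raising `per_set` corresponds to the custom strategies in which more than one transcript is retained per set.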
Monitoring of “immune trajectories” associated with response to SARS-CoV-2 infection and clinical deterioration of Covid-19 patients is one possible application for such a targeted assay. Another would be the measurement of responses to therapy (as part of standard of care or a trial). The immune profiling of asymptomatic or pre-symptomatic patients (e.g. quarantined) would be another setting where implementation of such an assay could prove useful. For this, it would for instance be possible to use protocols that we have previously developed for home-based, self-sampling and blood RNA stabilization [45, 46].
Overall, this work lays the groundwork for a framework designed to support the development of interpretable targeted panels for profiling immune responses to SARS-CoV-2 infection. It consists of an analytic pipeline for data-driven selection of targets, on the one hand, and an information aggregation platform supporting the work of expert curators, on the other. The preliminary blood transcript panels presented here will be leveraged for a first round of implementation of a targeted Covid-19 immune profiling assay.
Availability of data and materials
The datasets generated during and/or analysed during the current study are available:
in the ArrayExpress repository, under the accession number E-MTAB-8871, https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-8871/
in the Genome Sequence Archive of the Beijing Institute of Genomics, Chinese Academy of Sciences, under the accession number CRA002390, https://bigd.big.ac.cn/gsa/browse/CRA002390
In the NCBI GEO database, under the accession numbers
Abbreviations
SARS: Severe Acute Respiratory Syndrome
MERS: Middle East Respiratory Syndrome
RSV: Respiratory Syncytial Virus
GEO: Gene Expression Omnibus
Yang W, Cao Q, Qin L, Wang X, Cheng Z, Pan A, et al. Clinical characteristics and imaging manifestations of the 2019 novel coronavirus disease (COVID-19): a multi-center study in Wenzhou city, Zhejiang, China. J Infect. 2020;80(4):388–93.
Guan W-J, Ni Z-Y, Hu Y, Liang W-H, Ou C-Q, He J-X, et al. Clinical characteristics of Coronavirus disease 2019 in China. N Engl J Med. 2020;382(18):1708–20.
Long Q-X, Liu B-Z, Deng H-J, Wu G-C, Deng K, Chen Y-K, et al. Antibody responses to SARS-CoV-2 in patients with COVID-19. Nat Med. 2020. https://doi.org/10.1038/s41591-020-0897-1.
Henderson LA, Canna SW, Schulert GS, Volpi S, Lee PY, Kernan KF, et al. On the alert for cytokine storm: immunopathology in COVID-19. Arthritis Rheumatol. 2020. https://doi.org/10.1002/art.41285.
Chaussabel D. Assessment of immune status using blood transcriptomics and potential implications for global health. Semin Immunol. 2015;27(1):58–66.
Zhou W, Altman RB. Data-driven human transcriptomic modules determined by independent component analysis. BMC Bioinform. 2018;19(1):327.
Altman MC, Rinchai D, Baldwin N, Toufiq M, Whalen E, Garand M, et al. Development and Characterization of a Fixed Repertoire of Blood Transcriptome Modules Based on Co-expression Patterns Across Immunological States. bioRxiv. 2020;525709.
Ong EZ, Chan YFZ, Leong WY, et al. A Dynamic Immune Response Shapes COVID-19 Progression. Cell Host Microbe. 2020;27(6):879–882.e2. https://doi.org/10.1016/j.chom.2020.03.021.
Xiong Y, Liu Y, Cao L, Wang D, Guo M, Jiang A, et al. Transcriptomic characteristics of bronchoalveolar lavage fluid and peripheral blood mononuclear cells in COVID-19 patients. Emerg Microbes Infect. 2020;9(1):761–70.
Chaussabel D, Quinn C, Shen J, Patel P, Glaser C, Baldwin N, et al. A modular analysis framework for blood genomics studies: application to systemic lupus erythematosus. Immunity. 2008;29(1):150–64.
Chaussabel D, Baldwin N. Democratizing systems immunology with modular transcriptional repertoire analyses. Nat Rev Immunol. 2014;14(4):271–80.
Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinforma Oxf Engl. 2016;32(18):2847–9.
Carvalho-Silva D, Pierleoni A, Pignatelli M, Ong C, Fumis L, Karamanis N, et al. Open Targets Platform: new developments and updates two years on. Nucleic Acids Res. 2019;47(D1):D1056–65.
Hung IF-N, Lung K-C, Tso EY-K, Liu R, Chung TW-H, Chu M-Y, et al. Triple combination of interferon beta-1b, lopinavir-ritonavir, and ribavirin in the treatment of patients admitted to hospital with COVID-19: an open-label, randomised, phase 2 trial. Lancet Lond Engl. 2020.
Malhotra S, Bustamante MF, Pérez-Miralles F, Rio J, Ruiz de Villa MC, Vegas E, et al. Search for specific biomarkers of IFNβ bioactivity in patients with multiple sclerosis. PLoS ONE. 2011;6(8):e23634.
Febbo PG, Mulligan MG, Slonina DA, Stegmaier K, Di Vizio D, Martinez PR, et al. Literature Lab: a method of automated literature interrogation to infer biology from microarray analysis. BMC Genomics. 2007;18(8):461.
Bhattacharya S, Dunn P, Thomas CG, Smith B, Schaefer H, Chen J, et al. ImmPort, toward repurposing of open access immunological assay data for translational and clinical research. Sci Data. 2018;27(5):180015.
Rinchai D, Altman MB, Konza O, Hässler S, Martina F, Toufiq M, et al. Identification of erythroid cell positive blood transcriptome phenotypes associated with severe respiratory syncytial virus infection. bioRxiv. 2020;527812. https://doi.org/10.1101/527812
Perng Y-C, Lenschow DJ. ISG15 in antiviral immunity and beyond. Nat Rev Microbiol. 2018;16(7):423–39.
Gutiérrez L, Caballero N, Fernández-Calleja L, Karkoulia E, Strouboulis J. Regulation of GATA1 levels in erythropoiesis. IUBMB Life. 2020;72(1):89–105.
Elahi S. Neglected Cells: immunomodulatory Roles of CD71+ Erythroid Cells. Trends Immunol. 2019;40(3):181–5.
Obermoser G, Presnell S, Domico K, Xu H, Wang Y, Anguiano E, et al. Systems scale interactive exploration reveals quantitative and qualitative differences in response to influenza and pneumococcal vaccines. Immunity. 2013;38(4):831–44.
Roelands J, Garand M, Hinchcliff E, Ma Y, Shah P, Toufiq M, et al. Long-Chain Acyl-CoA Synthetase 1 Role in Sepsis and Immunity: perspectives From a Parallel Review of Public Transcriptome Datasets and of the Literature. Front Immunol. 2019;10:2410.
Remy KE, Brakenridge SC, Francois B, Daix T, Deutschman CS, Monneret G, et al. Immunotherapies for COVID-19: lessons learned from sepsis. Lancet Respir Med. 2020 Apr 28. https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(20)30217-4/abstract. Accessed 18 May 2020.
Guo C, Li B, Ma H, Wang X, Cai P, Yu Q, et al. Tocilizumab treatment in severe COVID-19 patients attenuates the inflammatory storm incited by monocyte centric immune interactions revealed by single-cell analysis. bioRxiv. 2020 Apr 9;2020.04.08.029769.
Roumier M, Paule R, Groh M, Vallee A, Ackermann F. Interleukin-6 blockade for severe COVID-19. medRxiv. 2020 Apr 22;2020.04.20.20061861.
Friedman S, Sanyal A, Goodman Z, Lefebvre E, Gottwald M, Fischer L, et al. Efficacy and safety study of cenicriviroc for the treatment of non-alcoholic steatohepatitis in adult subjects with liver fibrosis: CENTAUR Phase 2b study design. Contemp Clin Trials. 2016;1(47):356–65.
Dogné J-M, Hanson J, de Leval X, Pratico D, Pace-Asciak CR, Drion P, et al. From the design to the clinical application of thromboxane modulators. Curr Pharm Des. 2006;12(8):903–23.
Goicoechea M, García de Vinuesa S, Quiroga B, Verdalles U, Barraca D, Yuste C, et al. Effects of pentoxifylline on inflammatory parameters in chronic kidney disease patients: a randomized trial. J Nephrol. 2012;25(6):969–75.
Kahn-Kirby AH, Amagata A, Maeder CI, Mei JJ, Sideris S, Kosaka Y, et al. Targeting ferroptosis: a novel therapeutic strategy for the treatment of mitochondrial disease-related epilepsy. PLoS ONE. 2019;14(3):e0214250.
Stockwell BR, Friedmann Angeli JP, Bayir H, Bush AI, Conrad M, Dixon SJ, et al. Ferroptosis: a regulated cell death nexus linking metabolism, redox biology, and disease. Cell. 2017;171(2):273–85.
Wang C, Yuan W, Hu A, Lin J, Xia Z, Yang CF, et al. Dexmedetomidine alleviated sepsis-induced myocardial ferroptosis and septic heart injury. Mol Med Rep. 2020;22(1):175–84.
Lang J, Yang N, Deng J, Liu K, Yang P, Zhang G, et al. Inhibition of SARS pseudovirus cell entry by lactoferrin binding to heparan sulfate proteoglycans. PLoS ONE. 2011;6(8):e23710.
Shang J, Wan Y, Luo C, Ye G, Geng Q, Auerbach A, et al. Cell entry mechanisms of SARS-CoV-2. Proc Natl Acad Sci USA. 2020;11727-34.
Li S-W, Wang C-Y, Jou Y-J, Huang S-H, Hsiao L-H, Wan L, et al. SARS coronavirus papain-like protease inhibits the TLR7 signaling pathway through removing Lys63-linked polyubiquitination of TRAF3 and TRAF6. Int J Mol Sci. 2016;17(5):678.
Frieman M, Yount B, Heise M, Kopecky-Bromberg SA, Palese P, Baric RS. Severe acute respiratory syndrome coronavirus ORF6 antagonizes STAT1 function by sequestering nuclear import factors on the rough endoplasmic reticulum/Golgi membrane. J Virol. 2007;81(18):9812–24.
Li T, Wang P, Wang SC, Wang Y-F. Approaches mediating oxytocin regulation of the immune system. Front Immunol. 2016;7:693.
Earnest JT, Hantak MP, Li K, McCray PB, Perlman S, Gallagher T. The tetraspanin CD9 facilitates MERS-coronavirus entry by scaffolding host cell receptors and proteases. PLoS Pathog. 2017;13(7):e1006546.
Karulf M, Kelly A, Weinberg AD, Gold JA. OX40 ligand regulates inflammation and mortality in the innate immune response to sepsis. J Immunol. 2010;185(8):4856–62.
Tay MZ, Poh CM, Rénia L, MacAry PA, Ng LFP. The trinity of COVID-19: immunity, inflammation and intervention. Nat Rev Immunol. 2020;28:1–12.
Azkur AK, Akdis M, Azkur D, Sokolowska M, van de Veen W, Brüggen M-C, et al. Immune response to SARS-CoV-2 and mechanisms of immunopathological changes in COVID-19. Allergy. 2020;75(7):1564–81.
Giamarellos-Bourboulis EJ, Netea MG, Rovina N, Akinosoglou K, Antoniadou A, Antonakos N, et al. Complex immune dysregulation in COVID-19 patients with severe respiratory failure. Cell Host Microbe. 2020. https://doi.org/10.1016/j.chom.2020.04.009.
Kim J, Zhang J, Cha Y, Kolitz S, Funt J, Escalante Chong R, et al. Advanced bioinformatics rapidly identifies existing therapeutics for patients with coronavirus disease-2019 (COVID-19). J Transl Med. 2020;18(1):257.
Gysi DM, Valle ÍD, Zitnik M, Ameli A, Gan X, Varol O, et al. Network Medicine Framework for Identifying Drug Repurposing Opportunities for COVID-19. ArXiv200407229 Cs Q-Bio Stat. 2020; http://arxiv.org/abs/2004.07229. Accessed 24 June 2020.
Speake C, Whalen E, Gersuk VH, Chaussabel D, Odegard JM, Greenbaum CJ. Longitudinal monitoring of gene expression in ultra-low-volume blood samples self-collected at home. Clin Exp Immunol. 2017;188(2):226–33.
Rinchai D, Anguiano E, Nguyen P, Chaussabel D. Finger stick blood collection for gene expression profiling and storage of tempus blood RNA tubes. F1000Research. 2016;5:1385.
Rinchai D, Konza O, Hassler S, Martina F, Mejias A, Ramilo O, et al. Characterizing blood modular transcriptional repertoire perturbations in patients with RSV infection: a hands-on workshop using public datasets as a source of training material. bioRxiv. 2019;527812.
Speake C, Presnell S, Domico K, Zeitner B, Bjork A, Anderson D, et al. An interactive web application for the dissemination of human systems immunology data. J Transl Med. 2015;19(13):196.
Linsley PS, Speake C, Whalen E, Chaussabel D. Copy number loss of the interferon gene cluster in melanomas is linked to reduced T cell infiltrate and poor patient prognosis. PLoS ONE. 2014;9(10):e109760.
Bougarn S, Boughorbel S, Chaussabel D, Marr N. A curated transcriptome dataset collection to investigate the blood transcriptional response to viral respiratory tract infection and vaccination. F1000Research. 2019;8:284.
Parnell G, McLean A, Booth D, Huang S, Nalos M, Tang B. Aberrant cell cycle and apoptotic changes characterise severe influenza A infection–a meta-analysis of genomic signatures in circulating leukocytes. PLoS ONE. 2011;6(3):e17186.
Ayllon-Benitez A, Bourqui R, Thébault P, Mougin F. GSAn: an alternative to enrichment analysis for annotating gene sets. NAR Genomics Bioinforma. 2020;2(2). https://academic.oup.com/nargab/article/2/2/lqaa017/5805305. Accessed 1 Apr 2020.
Ambade A, Lowe P, Kodys K, Catalano D, Gyongyosi B, Cho Y, et al. Pharmacological inhibition of CCR2/5 signaling prevents and reverses alcohol-induced liver damage, steatosis, and inflammation in mice. Hepatol Baltim Md. 2019;69(3):1105–21.
Wang S-M, Wang C-T. APOBEC3G cytidine deaminase association with coronavirus nucleocapsid protein. Virology. 2009;388(1):112–20.
Milewska A, Kindler E, Vkovski P, Zeglen S, Ochman M, Thiel V, et al. APOBEC3-mediated restriction of RNA virus replication. Sci Rep. 2018;8(1):5960.
de Lang A, Osterhaus ADME, Haagmans BL. Interferon-gamma and interleukin-4 downregulate expression of the SARS coronavirus receptor ACE2 in Vero E6 cells. Virology. 2006;353(2):474–81.
Sisk JM, Frieman MB, Machamer CE. Coronavirus S protein-induced fusion is blocked prior to hemifusion by Abl kinase inhibitors. J Gen Virol. 2018;99(5):619–30.
Song R, Lisovsky I, Lebouché B, Routy J-P, Bruneau J, Bernard NF. HIV protective KIR3DL1/S1-HLA-B genotypes influence NK cell-mediated inhibition of HIV replication in autologous CD4 targets. PLoS Pathog. 2014;10(1):e1003867.
Yeung M-L, Yao Y, Jia L, Chan JFW, Chan K-H, Cheung K-F, et al. MERS coronavirus induces apoptosis in kidney and lung by upregulating Smad7 and FGF2. Nat Microbiol. 2016;22(1):16004.
Li S-W, Wang C-Y, Jou Y-J, Yang T-C, Huang S-H, Wan L, et al. SARS coronavirus papain-like protease induces Egr-1-dependent up-regulation of TGF-β1 via ROS/p38 MAPK/STAT3 pathway. Sci Rep. 2016;13(6):25754.
Padhan K, Tanwar C, Hussain A, Hui PY, Lee MY, Cheung CY, et al. Severe acute respiratory syndrome coronavirus Orf3a protein interacts with caveolin. J Gen Virol. 2007;88(Pt 11):3067–77.
Lindner HA, Lytvyn V, Qi H, Lachance P, Ziomek E, Ménard R. Selectivity in ISG15 and ubiquitin recognition by the SARS coronavirus papain-like protease. Arch Biochem Biophys. 2007;466(1):8–14.
Riva L, Yuan S, Yin X, Martin-Sancho L, Matsunaga N, Burgstaller-Muehlbacher S, et al. A Large-scale Drug Repositioning Survey for SARS-CoV-2 Antivirals. bioRxiv. 2020.
Liu M, Yang Y, Gu C, Yue Y, Wu KK, Wu J, et al. Spike protein of SARS-CoV stimulates cyclooxygenase-2 expression via both calcium-dependent and calcium-independent protein kinase C pathways. FASEB J Off Publ Fed Am Soc Exp Biol. 2007;21(7):1586–96.
Shi C-S, Nabar NR, Huang N-N, Kehrl JH. SARS-Coronavirus Open Reading Frame-8b triggers intracellular stress pathways and activates NLRP3 inflammasomes. Cell Death Discov. 2019;5:101.
Siu K-L, Yuen K-S, Castaño-Rodriguez C, Ye Z-W, Yeung M-L, Fung S-Y, et al. Severe acute respiratory syndrome coronavirus ORF3a protein activates the NLRP3 inflammasome by promoting TRAF3-dependent ubiquitination of ASC. FASEB J Off Publ Fed Am Soc Exp Biol. 2019;33(8):8865–77.
Qi F, Qian S, Zhang S, Zhang Z. Single cell RNA sequencing of 13 human tissues identify cell types and receptors of human coronaviruses. Biochem Biophys Res Commun. 2020;526(1):135–40.
Li D, Wu N, Yao H, Bader A, Brockmeyer NH, Altmeyer P. Association of RANTES with the replication of severe acute respiratory syndrome coronavirus in THP-1 cells. Eur J Med Res. 2005;10(3):117–20.
Follis KE, York J, Nunberg JH. Furin cleavage of the SARS coronavirus spike glycoprotein enhances cell-cell fusion but does not affect virion entry. Virology. 2006;350(2):358–69.
Zhou Y, Vedantham P, Lu K, Agudelo J, Carrion R, Nunneley JW, et al. Protease inhibitors targeting coronavirus and filovirus entry. Antiviral Res. 2015;116:76–84.
We would like to thank all study participants and the investigators who have chosen to make their data available publicly, without whom this work would not have been possible. We would also like to acknowledge Insight Editing London for editing the manuscript prior to submission.
Development of some of the bioinformatic resources and approaches employed here was supported by NPRP grant # 10-0205-170348 from the Qatar National Research Fund (a member of Qatar Foundation). The work reported herein is solely the responsibility of the authors.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Coverage of the pre-established 38 transcriptional module aggregate repertoire by the Nanostring immunology panel 2. The bar graphs show the distribution of the 579 transcripts constituting the standard Nanostring immunology panel used by Ong et al. across the 38 module aggregates forming this repertoire. The Venn diagram shows the degree of overlap between the Nanostring panel and the transcripts forming this modular repertoire.
Delineation of Covid-19 relevant module sets in all 17 aggregates retained in the first step of the selection process.
Composition and annotation of the 382 module repertoire employed as a framework for the selection of targeted Covid-19 blood transcriptional panels. Membership to the 28 Covid-19 module sets is indicated in column D.
Cite this article
Rinchai, D., Syed Ahamed Kabeer, B., Toufiq, M. et al. A modular framework for the development of targeted Covid-19 blood transcript profiling panels. J Transl Med 18, 291 (2020). https://doi.org/10.1186/s12967-020-02456-z