Skip to content


  • Research
  • Open Access

A novel signature derived from immunoregulatory and hypoxia genes predicts prognosis in liver and five other cancers

Journal of Translational Medicine201917:14

  • Received: 13 October 2018
  • Accepted: 4 January 2019
  • Published:



Despite much progress in cancer research, its incidence and mortality continue to rise. A robust biomarker that would predict tumor behavior is highly desirable and could improve patient treatment and prognosis.


In a retrospective bioinformatics analysis involving patients with liver cancer (n = 839), we developed a prognostic signature consisting of 45 genes associated with tumor-infiltrating lymphocytes and cellular responses to hypoxia. From this gene set, we were able to identify a second prognostic signature comprised of 8 genes. Its performance was further validated in five other cancers: head and neck (n = 520), renal papillary cell (n = 290), lung (n = 515), pancreas (n = 178) and endometrial (n = 370).


The 45-gene signature predicted overall survival in three liver cancer cohorts: hazard ratio (HR) = 1.82, P = 0.006; HR = 1.84, P = 0.008 and HR = 2.67, P = 0.003. Additionally, the reduced 8-gene signature was sufficient and effective in predicting survival in liver and five other cancers: liver (HR = 2.36, P = 0.0003; HR = 2.43, P = 0.0002 and HR = 3.45, P = 0.0007), head and neck (HR = 1.64, P = 0.004), renal papillary cell (HR = 2.31, P = 0.04), lung (HR = 1.45, P = 0.03), pancreas (HR = 1.96, P = 0.006) and endometrial (HR = 2.33, P = 0.003). Receiver operating characteristic analyses demonstrated both signatures superior performance over current tumor staging parameters. Multivariate Cox regression analyses revealed that both 45-gene and 8-gene signatures were independent of other clinicopathological features in these cancers. Combining the gene signatures with somatic mutation profiles increased their prognostic ability.


This study, to our knowledge, is the first to identify a gene signature uniting both tumor hypoxia and lymphocytic infiltration as a prognostic determinant in six cancer types (n = 2712). The 8-gene signature can be used for patient risk stratification by incorporating hypoxia information to aid clinical decision making.


  • Hypoxia
  • T cells
  • Tumor-infiltrating lymphocytes
  • Hepatocellular carcinoma
  • Pan-cancer
  • Gene signature


Hepatocellular carcinoma (HCC) is the sixth most common cancer and the second leading cause of cancer-related mortality worldwide [1]. Due to an initially asymptomatic disease course, this aggressive cancer once diagnosed has especially poor outcomes. The etiological risk factors for HCC varies across geographical locations [2]. This pattern mirrors the burden of viral hepatitis in Asia and Africa. Whereas in the Western world, the risk can be attributed more to alcoholic and non-alcoholic steatohepatitis. Therapy for HCC, involving either surgery or radiological ablation, is most effective when the cancer is detected early. Prompt diagnosis, however, requires regular liver imaging, usually 6-monthly ultrasound scans, which is both resource intensive and dependent upon identification of otherwise silent risk factors. Once diagnosed, curative liver transplantation is based on tumor size and restricted to patients where the primary cancer is thought unlikely to recur in the new liver. The use of size as the only prognostic criterion precludes therapy in patients with large, indolent cancers that have a low probability of recurrence and many centers are questioning whether additional data on tumor biology would add value to the diagnostic pathway.

Gene signatures obtained from tumor that are derived from common oncogenic pathways can be used to risk-stratify patients to provide individualized care. Despite genomic instability driving intertumoral heterogeneity, solid malignancies do share two common characteristics. Solid tumors are often hypoxic due to aberrant vasculature resulting in metabolic shifts towards aerobic glycolysis known as the Warburg effect [3, 4]. Tumor hypoxia is associated with metastases and aggressive phenotypes that influence clinical outcomes [5, 6]. Additionally, within tumor cells one can find lymphocytes termed tumor-infiltrating lymphocytes. These can affect patient prognosis in multiple cancers [7, 8]. A subset of lymphocytic infiltrates known as the FoxP3+ regulatory T cells (Tregs) function to suppress cytotoxic T cells activity to maintain tumor tolerance. Increases in Tregs contributes to unfavorable prognosis in multiple solid tumors [9]. Importantly, infiltrating Tregs within the tumor milieu can be affected by hypoxia where the latter facilitates Tregs recruitment to promote angiogenesis and maintain growth [10]. Another category of T cells known as the cytotoxic CD8+ T cells are required to kill cancer cells. But they can be made ineffective by Tregs or be rendered dysfunctional when exhausted [8]. Together, accumulation of Tregs and exhausted CD8+ T cells further exacerbate the immunosuppressive functions in solid tumors.

By taking advantage of the intricate link between hypoxia and Treg infiltration in solid malignancies, we systematically developed two prognostic gene signatures for risk stratification. Both signatures can accurately predict high-risk patients as confirmed by multi-cohort validations in six cancer types to support their validity and immediate clinical application.


Study cohorts

HCC data sets (n = 839) used in this study were GSE14520 (n = 242) from the Liver Cancer Institute, LIRI-JP (n = 226) from the International Cancer Genome Consortium (ICGC) database and TCGA-LIHC (n = 371) from the Cancer Genome Atlas (TCGA) Research Network (Additional file 1) [11]. Full clinical characteristics of HCC patients were listed in Additional file 2. Among these patients, underlying etiology differed between the cohorts with GSE14520 comprising of patients with HBV, while LIRI-JP and TCGA-LIHC consisting of patients with HBV, HCV and/or non-alcoholic steatohepatitis.

Gene expression data for 24 other cancer types (n = 20,241) generated by TCGA Research Network [11] were downloaded from Broad Institute GDAC Firehose. Illumina HiSeq rnaseqv2 Level 3 RSEM normalized gene expression profiles were converted to log2(x + 1) scale. Expression values of genes associated with tumor-infiltrating T cells were corrected for levels of infiltration by dividing expression values of each gene with CD3D values.

Differential expression analysis

To determine differentially expressed genes between tumor and adjacent normal liver tissues in the GSE14520 cohort, the Bayes method and linear model were employed using the R package limma. P-values were adjusted using the false discovery rate controlling procedure of Benjamini–Hochberg. Genes with log2 fold change of > 1 and adjusted P-values < 0.05 were considered significant.

Gene signature and risk scores

Expression scores for hypoxia, tumor-infiltrating T cells, 45-gene signature and 8-gene signature were calculated for each patient in all the cohorts by taking the average log2 expression values. Nonparametric Mann–Whitney–Wilcoxon test was used to compare the distribution of hypoxia scores in tumor and adjacent non-tumor samples. Risk scores for each patient were determined by taking the sum of Cox regression coefficient for each of the signature genes multiplied with its corresponding expression value. Nonparametric Spearman’s rank correlation was employed to assess the relationship of gene scores and risk scores with tumor hypoxia (hypoxia score).

Survival analyses

We employed the Cox proportional hazards model to investigate the association between patient survival duration and one or more risk factors, e.g. 45-gene or 8-gene score, tumor stage and other clinical variables. Univariate analysis was performed to determine the influence of individual risk factors on overall survival. As multiple covariates can potentially influence patient prognosis, multivariate analyses were performed by including risk factors that were significantly associated with overall survival identified in univariate analysis (P < 0.05). Hazard ratios (HR) were determined from Cox models with HR greater than one indicating that a covariate was positively associated with event probability (increased hazard) and negatively associated with survival length. Cox regression analyses were performed using the R survival and survminer packages. Proportional hazards assumption was supported by a non-significant relationship between scaled Schoenfeld residuals and time using the R survival package. In addition, Kaplan–Meier and log-rank tests were used in univariate analyses of the gene signatures in relation to patient survival and were performed using the survival and survminer packages. Patients were median-dichotomized into low and high-risk groups based on mean expression scores of signature genes and Kaplan–Meier estimates of survival probability over time were generated. Difference between high and low-risk groups were tested using the log-rank test.

Time-dependent receiver operating characteristic (ROC) curve analysis was used to assess the predictive performance of both 45-gene and 8-gene signatures in comparison with standard tumor staging parameters. The R survcomp package was employed to compute time-dependent ROC curves. ROC curves depicted true positive rates (sensitivity) versus false positive rates (1-specificity). The area under the curve (AUC) is a measure of how well the gene signatures can predict patient survival where AUC ranges from 0.5 (for an uninformative marker) to 1 (for a perfect predictive marker).

Visualization of sample distance in the reduced 2-dimensional space

To determine whether the gene signatures (45-gene or 8-gene signature) can distinguish tumor and non-tumor samples, multidimensional scaling (MDS) analyses were performed using the R vegan package to visualize samples’ distance in the reduced 2-dimensional space. Euclidean genetic distances between each sample were investigated by MDS ordination. Permutational multivariate analysis of variance (PERMANOVA) was employed to test for differences between tumor and non-tumor samples.

Biological enrichment and protein interaction network analyses

Analysis of biological pathway enrichment on the 45-gene set was conducted using GeneCodis [12] against the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) databases. The Enrichr [13] tool was used to identify transcription factors from the ChEA database that are potential regulators of the 45 genes. Functional protein association network of the 45 genes was determined using the STRING [14] database.

Somatic mutation identification

Level 3 mutation datasets were downloaded from GDAC. Kaplan–Meier analysis and log-rank tests were employed to determine the association of somatic mutations in combination with the 45-gene or 8-gene signatures, on overall survival.

All graphs were generated using the ggplot2 package in R. Heatmap was generated using the R pheatmap package.


Hypoxia is associated with immunosuppression through enhanced expression of Tregs and exhausted CD8+ T cells genes

Since both hypoxia and high density of infiltrating Tregs are linked to poor clinical outcomes, we hypothesized that these features could be employed to predict prognosis in patients with hepatocellular carcinoma (HCC). Hypoxia inducible factors (HIFs) are transcription factors (TFs) that play key roles in regulating cellular responses to hypoxia [15]. The alpha subunits (HIF-1α and -2α) heterodimerize with the beta subunit (HIF-1β) to orchestrate physiological responses in hypoxia [16]. To develop our gene signature uniting hypoxia and tumor-infiltrating T-cells, we utilized three gene sets: (1) pan-cancer hypoxia genes (52 genes) [6]; (2) HCC-infiltrating Tregs (401 genes) and (3) HCC-infiltrating exhausted CD8+ T cells (82 genes) [8] (Fig. 1). We utilized a HepG2 hepatoma HIF chromatin immunoprecipitation sequencing (ChIP-seq) dataset, GSE120885 [17], to determine which of these genes were bound by HIFs (Fig. 1).
Fig. 1
Fig. 1

Schematic diagram of the study design and development of gene signatures. A liver cancer cohort (GSE14520) was used to define the first 45-gene signature. Briefly, 79 tumor-infiltrating T-cell genes were identified as HIF targets using a HIF-1α/2α ChIP-seq dataset. Of these 79 genes, 26 genes were > 1.5-fold upregulated in GSE14520. Independently, 23 hypoxia genes were identified as HIF targets. Uniting the 23 hypoxia-HIF genes and 26 T-cell-HIF genes resulted in 45 unique genes representing the first gene signature. Cox regression analyses of individual 45 genes in each of the three liver cancer cohorts (GSE14520, TCGA-LIHC and LIRI-JP) revealed a common prognostic set consisting of 8 genes that represent the second signature. This 8-gene signature is further validated in liver and five other cancers using Kaplan–Meier, Cox regression and receiver operating characteristic analyses

To determine the extent of hypoxia in cancers, we interrogated in vivo mRNA expression patterns of the 52 hypoxia genes [6] in 25 cancer types including HCC (n = 20,662) retrieved from The Cancer Genome Atlas (TCGA) [11] (Additional file 1). Hypoxia scores for each tumor and non-tumor samples were determined by obtaining the mean expression values (log2) of the 52 hypoxia genes [6]. We observed that tumor samples were significantly more hypoxic than non-tumor samples in 20 out of 25 cancers, which included HCC (Additional file 3A). Multidimensional scaling analyses revealed that the 52 genes can distinguish tumor from non-tumor samples in these cancers, hence hypoxic transcriptional states can be used as a proxy for identifying cancerous cells (Additional file 3B).

We predict that patients with more hypoxic tumors would have higher expression of tumor-infiltrating T cell genes since hypoxia could promote the maintenance of immunological escape via the suppressive function of Tregs [10]. The HCC-infiltrating T cell gene set uniting both Tregs and exhausted CD8+ T cells consisted of 438 genes collectively, with 45 genes found to be in common between the two cell types (Fig. 1). Infiltrating T cell expression scores for each patient were calculated as mean expression values (log2) of the 438 genes (Fig. 1). Patient hypoxia scores significantly correlated with tumor-infiltrating T cell scores in three independent HCC cohorts (n = 839; Additional file 2): GSE14520 (ρ = 0.564, P < 0.0001); TCGA-LIHC (ρ = 0.390, P < 0.0001) and LIRI-JP (ρ = 0.633, P < 0.0001) (Additional file 3C), suggesting that highly hypoxic tumors have transcriptional profiles associated with immunosuppressive functions.

Since HIFs are the key TFs that regulate numerous important aspects of oncogenesis [18], we hypothesize that identifying genes that are directly bound by HIFs may have more profound implications on prognosis. Using the HIF ChIP-seq dataset, we observed that 79 of the 438 tumor-infiltrating T cell genes were direct HIF targets (Fig. 1). Notably, Spearman’s correlation coefficients between the 79 genes and hypoxia score were significantly higher: GSE14520 (ρ = 0.669, P < 0.0001), TCGA-LIHC (ρ = 0.469, P < 0.0001) and LIRI-JP (ρ = 0.694, P < 0.0001) (Additional file 3D), suggesting a role for HIFs in regulating tumor lymphocytic infiltration in HCC.

Levels of T cell infiltration were corrected by dividing the expression of each of the 79 genes with CD3D expression. Of the 79 genes, 26 were upregulated more than 1.5-fold in the training cohort, GSE14520 (Additional file 3E). On the other hand, ChIP-seq analysis revealed that 23 of the 52 hypoxia genes mentioned earlier were direct HIF targets (Fig. 1). The final gene set was comprised of 45 genes that encompass 26 tumor-infiltrating lymphocyte genes and 23 hypoxia genes, with 4 genes being in common (Fig. 1; Additional file 4).

The 45-gene signature is strongly associated with poor overall survival in HCC

We next assessed the ability of the 45-gene signature (HCC45) to predict overall survival (OS) in three independent HCC cohorts (n = 839). The signature can successfully discriminate between tumor and non-tumor samples in these cohorts (Additional file 5A). Tumor samples were notably less tightly clustered implying significant intertumoral heterogeneity (Additional file 5A). We next determined HCC45 scores, calculated for each patient as the mean expression of the 45 genes. Patients were ranked according to their HCC45 scores and divided into high- or low-risk groups using the median cutoff. The OS rates were significantly reduced in high-risk patient groups with log-rank P values of 0.012, 0.0042 and 0.0043 in the GSE14520, TCGA-LIHC and LIRI-JP cohorts respectively (Fig. 2a). Disease-free survival rates were also significantly lower in high-risk patients (P = 0.026) (Additional file 5B).
Fig. 2
Fig. 2

Patient stratification using the 45-gene signature in HCC cohorts. a Kaplan–Meier plots of overall survival in HCC patients across three cohorts stratified into low- and high-risk groups using the 45-gene signature. P-values were calculated from the log-rank test. b Kaplan–Meier plots show independence of the signature over current staging systems in HCC cohorts. Patients were sub-grouped according to TNM stages and further stratified using the 45-gene signature. The signature successfully identified high-risk patients in different TNM stages. P-values were calculated from the log-rank test. c Analysis of specificity and sensitivity of the signature in HCC cohorts with receiver operating characteristic (ROC). Plots depict comparison of ROC curves of signature and clinical tumor staging parameters. The signature demonstrates an incremental value over current staging systems. AUC: area under the curve. TNM tumor, node, metastasis staging. d Kaplan–Meier plots depicting combined relation of TP53 or CTNNB1 mutation status with the signature on overall survival in HCC

To test the independence of HCC45 over current staging systems, we performed subgroup analysis by applying the signature to patients within each tumor, node, metastasis (TNM) stages. The signature successfully identified high-risk patients, particularly among early-stage (1 and/or 2) patients (Fig. 2b). To evaluate the predictive performance of HCC45 on 5-year OS, we performed the receiver operating characteristic (ROC) analysis in comparison with tumor staging parameters. The area under the curves (AUC) for HCC45 were 0.759 (GSE14520), 0.804 (TCGA-LIHC) and 0.883 (LIRI-JP) (Fig. 2c). All HCC45 AUCs were higher than that of TNM staging (Fig. 2c). Moreover, combining HCC45 with TNM staging considerably improved its predictive ability: AUC = 0.800 (GSE14520), 0.830 (TCGA-LIHC) and 0.890 (LIRI-JP) (Fig. 2c).

We assessed two common mutations in HCC, TP53 and CTNNB1. Mutation in the tumor suppressor TP53 is associated with poor prognosis [19, 20]. Results on β‐catenin CTNNB1 mutations have been mixed; some demonstrating poor outcomes [21, 22], while others demonstrating favorable prognosis [23]. Both TP53 (P < 0.001) and CTNNB1 (P = 0.025) mutations were associated with poor prognosis (Table 1). Mortality in patients with high HCC45 scores and TP53 mutation were ~ 45% higher than in patients with low HCC45 scores harboring wild-type TP53 at 2 years (Fig. 2d). Likewise, joint relation of HCC45 and CTNNB1 predicted a ~ 35% increase in mortality at 4 years between the lowest and highest risk groups (Fig. 2d).
Table 1

Univariate and multivariate Cox proportional hazard regression analysis of risk factors associated with overall survival


Hazard Ratio (95% CI)




45-gene signature (high vs. low risk)

1.823 (1.18–2.816)


8-gene signature (high vs. low risk)

2.36 (1.49–3.74)


Tumour size (> 5 cm vs. < 5 cm)

2.159 (1.406–3.316)


Cirrhosis (yes vs. no)

4.665 (1.147–18.97)


TNM staging (II and III vs. I)

2.26 (1.706–2.994)


BCLC staging (B and C vs. A and 0)

2.181 (1.722–2.762)


AFP (> 300 ng/mL vs. < 300 ng/mL)

1.606 (1.049–2.46)



Multivariate (45-gene signature)

45-gene signature (high vs. low risk)

1.8864 (1.2088–2.944)


Tumour size (> 5 cm vs. < 5 cm)

0.9487 (0.5605–1.606)


Cirrhosis (yes vs. no)

4.5927 (1.1084–19.031)


TNM staging (II and III vs. I)

1.4359 (1.2752–2.114)


BCLC staging (B and C vs. A and 0)

1.7808 (1.2521–2.533)


AFP (> 300 ng/mL vs. < 300 ng/mL)

1.4159 (0.9102–2.203)



Multivariate (8-gene signature)

8-gene signature (high vs. low risk)

1.715 (1.054–2.791)


Tumour size (> 5 cm vs. < 5 cm)

1.1834 (0.4822–1.481)



2.820 (0.6811–11.673)


TNM staging (II and III vs. I)

1.540 (1.178–2.383)


BCLC staging (B and C vs. A and 0)

1.675 (1.178–2.383)


AFP (> 300 ng/mL vs. < 300 ng/mL)

1.164 (0.7304–1.855)



Hazard Ratio (95% CI)




45-gene signature (high vs. low risk)

1.839 (1.164–2.904)


8-gene signature (high vs. low risk)

2.423 (1.51–3.89)


TNM staging (II and III vs. I)

2.000 (1.533–2.608)


TP53 mutation (Yes vs. No)

2.484 (1.485–4.155)


CTNNB1 mutation (Yes vs. No)

1.894 (1.084–3.311)



Multivariate (45-gene signature)

45-gene signature (high vs. low risk)

1.606 (1.014–2.542)


TNM staging (II and III vs. I)

1.879 (1.434–2.461)


TP53 mutation (Yes vs. No)

2.150 (1.271–3.635)


CTNNB1 mutation (Yes vs. No)

1.553 (0.884–2.726)



Multivariate (8-gene signature)

8-gene signature (high vs. low risk)

2.229 (1.388–3.580)


TNM staging (II and III vs. I)

1.865 (1.428–2.434)


TP53 mutation (Yes vs. No)

1.939 (1.158–3.246)


CTNNB1 mutation (Yes vs. No)

1.917 (1.085–3.385)



Hazard Ratio (95% CI)




45-gene signature (high vs. low risk)

2.671 (1.361–5.240)


8-gene signature (high vs. low risk)

3.45 (1.69–7.04)


Fibrosis (yes vs. no)

1.024 (0.8155–1.286)


Tumour size (> 3 cm vs. < 3 cm)

2.415 (1.263–4.615)


Portal vein invasion (yes vs. no)

2.281 (1.712–3.039)


Hepatic vein invasion (yes vs. no)

1.822 (1.112–2.985)


Bile duct invasion (yes vs. no)

1.392 (0.987–1.963)


TNM staging (II and III vs. I)

2.267 (1.556–3.304)



Multivariate (45-gene signature)

45-gene signature (high vs. low risk)

2.201 (1.112–4.357)


Tumour size (> 3 cm vs. < 3 cm)

1.471 (0.746–2.898)


TNM staging (II and III vs. I)

2.009 (1.348–2.995)



Multivariate (45-gene signature)

45-gene signature (high vs. low risk)

2.286 (1.139–4.587)


Portal vein invasion (yes vs. no)

2.314 (1.635–3.275)


Hepatic vein invasion (yes vs. no)

0.843 (0.462–1.539)



Multivariate (8-gene signature)

8-gene signature (high vs. low risk)

2.578 (1.228–5.412)


Tumour size (> 3 cm vs. < 3 cm)

1.386 (0.698–2.752)


TNM staging (II and III vs. I)

1.878 (1.268–2.781)



Multivariate (8-gene signature)

8-gene signature (high vs. low risk)

2.419 (1.141–5.129)


Portal vein invasion (yes vs. no)

2.057 (1.449–2.920)


Hepatic vein invasion (yes vs. no)

0.914 (0.508–1.646)


Significant covariates are in italics

We next examined the relation of HCC45 to other clinical parameters and observed that it remained highly prognostic after adjustment for tumor size, cirrhosis, fibrosis, TNM stages, Barcelona Clinic Liver Cancer stages, vascular invasion, TP53/CTNNB1 mutation status and/or alpha-fetoprotein levels in a multivariate Cox regression analysis: GSE14520 hazard ratio [HR] (HR = 1.89, P = 0.0052), TCGA-LIHC (HR = 1.61, P = 0.033) and LIRI-JP (HR = 2.20, P = 0.024) (Table 1). Both cirrhosis and HCC45 harbor complementary prognostic information contributing to elevated HR to 4.59 (95% CI 1.11–19.03, P = 0.036) (Table 1).

Gene Ontology analysis of HCC45 revealed enrichment of biological pathways linked to hypoxia, metabolism, inflammation and cancer (Additional file 6). Moreover, HCC45 was enriched for targets of several well-known cancer-related transcription factors such as MYC and RUNX1 (Additional file 6) and were functionally connected as a group (protein–protein interaction network enrichment: P < 1e−16) (Additional file 7).

The minimal prognostic 8-gene signature is sufficient and effective in predicting overall survival in HCC

Having identified that HCC45 predicts outcome in HCC, we evaluated the performance of each gene as a prognostic factor (Fig. 1). Of the 45 genes, univariate Cox regression analysis revealed that 14, 19 and 28 genes were significantly associated with unfavorable prognosis in GSE14520, TCGA-LIHC and LIRI-JP respectively (Figs. 1; 3a). Eight genes (HCC8) were found to confer prognostic information in all HCC cohorts: CA9, CCL20, CORO1C, CTSC, LDHA, NDRG1, PTP4A3 and TUBA1B (Fig. 3a; Additional file 8). Expression of HCC8 not only distinguished tumor from non-tumor sample (Additional file 9) but also increased according to tumor progression (Additional file 10). As expected, the signature successfully identified high-risk patients in full HCC cohorts, GSE14520 (P = 0.0001), TCGA-LIHC (P = 0.00016) and LIRI-JP (P = 0.0003) cohorts, and in patients stratified by tumor stage (Fig. 3b). The 5-year disease-free survival rates were also lower in high-risk patients as determined by the 8-gene signature (Additional file 11). Furthermore, as measured by Wald Chi square statistics and log-rank tests, HCC8 is superior in predicting death than HCC45 (Figs. 2a; 3b; Table 1).
Fig. 3
Fig. 3

Minimal prognostic 8-gene signature in HCC. a Forest plots depict Cox proportional hazards analysis on 45 signature genes in three HCC cohorts. Hazard ratios (HR) were denoted as dark blue circles and light blue bars represent 95% confidence interval. Eight genes are consistently prognostic across all three cohorts, thereby constituting the minimal prognostic signature. Significant Wald test P values were indicated in red. Signature genes were highlighted in red. b The 8-gene signature successfully identified high-risk patients in different TNM stages. Kaplan–Meier plots of overall survival in HCC patients across three cohorts stratified by 8-gene signature into low and high-risk groups. Patients were stratified by the signature as a full cohort, or sub-grouped according to TNM stages. P-values were calculated from the log-rank test. Plots show independence of signature over current staging systems. c Analysis of specificity and sensitivity of the signature in HCC cohorts with ROC. Plots depict comparison of ROC curves of signature and clinical tumor staging parameters. AUC: area under the curve. TNM tumor, node, metastasis staging. d Kaplan–Meier plots depicting combined relation of TP53 or CTNNB1 mutation status with the 8-gene signature on overall survival in HCC

Multivariate models of HCC8 revealed that this reduced gene set sufficiently served as an independent prognostic risk factor: GSE14520 (HR = 1.72, P = 0.02), TCGA-LIHC (HR = 2.23, P = 0.0009) and LIRI-JP (HR = 2.58, P = 0.012) (Table 1). Predictive value for 5-year OS increased when the HCC8 signature was used in combination with TNM staging: GSE14520 AUC = 0.67 versus 0.73 and TCGA-LIHC AUC = 0.73 versus 0.76 (Fig. 3c). When TP53 or CTNNB1 mutation status and HCC8 were jointly used, patients with high HCC8 levels and mutant TP53 (P < 0.0001) or CTNNB1 (P < 0.0001) had the lowest chances of survival (Fig. 3d).

We calculated risk scores for each patient by taking the sum of Cox regression coefficient for each of the 8 genes multiplied with its corresponding expression value [24]. Risk scores significantly correlated with hypoxia scores in all 3 cohorts, indicating that patients with more hypoxic tumors have higher risk for death as predicted by HCC8 (Additional file 12), overall suggesting that it was sufficient and effective in predicting death.

Prognosis of the 8-gene signature in 5 other cancers

Hypoxic tumors recruit CD4+ Tregs to suppress effector T cell function and promote tumor tolerance [10]. We hypothesized that HCC8 could predict outcome in other solid malignancies in addition to HCC. HCC8 could indeed discriminate between tumor and non-tumor samples in five other cancer types (Additional file 13). Remarkably, our prognostic 8-gene signature derived from HCC was a significant adverse prognostic factor for OS in these five cancer types: head and neck (HR = 1.64, P = 0.004), renal papillary cell (HR = 2.31, P = 0.047), lung (HR = 1.45, P = 0.03), pancreas (HR = 1.96, P = 0.006) and endometrium (HR = 2.33, P = 0.003) (Fig. 4a; Additional file 14). Multivariate analysis adjusting for tumor stage revealed that the HCC8 signature retained independent prognostic relevance in these cancers: head and neck (HR = 1.66, P = 0.003), lung (HR = 1.45, P = 0.029), pancreas (HR = 1.91, P = 0.009) and endometrium (HR = 1.99, P = 0.016) (Additional file 14). Importantly, while TNM staging was not a reliable predictor of OS in pancreatic cancer on its own, our 8-gene signature successfully predicted high-risk patients when accounting for tumor stage (P = 0.009) (Additional file 14).
Fig. 4
Fig. 4

Prognosis of the 8-gene signature in 5 other non-HCC cancers. a Kaplan–Meier plots of overall survival in patients across multiple cancers stratified into low and high-risk groups using the prognostic 8-gene signature. P-values were obtained from the log-rank test. b Analysis of specificity and sensitivity of the 8-gene signature in multiple cancers. Plots depict comparison of ROC curves of the 8-gene signature and clinical tumor staging parameters. AUC: area under the curve. TNM tumor, node, metastasis staging

The predictive value of the 8-gene signature outperformed the TNM staging system: head and neck (AUC = 0.610 versus 0.606), renal papillary cell (AUC = 0.701 versus 0.640) and pancreas (AUC = 0.694 versus 0.593) (Fig. 4b). When used in conjunction with staging information, additive effects on the predictive performance of HCC8 were observed: head and neck (AUC = 0.661), renal papillary cell (AUC = 0.768), lung (AUC = 0.671) and pancreas (AUC = 0.701) (Fig. 4b). Importantly, risk scores derived from HCC8 showed strong positive trends with tumor hypoxia, suggesting that high risk patients had more aggressive tumors: head and neck (ρ = 0.729, P < 0.0001), renal papillary cell (ρ = 0.766, P < 0.0001), lung (ρ = 0.856, P < 0.0001), pancreas (ρ = 0.844, P < 0.0001) and endometrium (ρ = 0.374, P < 0.0001) (Additional file 15).

TP53, KRAS and CDKN2A mutations are commonly observed in lung cancer [25]. CDKN2A (P = 0.03), but not TP53 (P = 0.2) or KRAS (P = 0.4), was significantly associated with poor prognosis (Additional file 14). When used in combination with our HCC8 signature, patients with high HCC8 score and mutant CDKN2A had ~ 30% higher chances for death at 2 years than patients with low HCC8 and wild-type CDKN2A (Additional file 16).


We developed a novel gene signature that predicts overall survival in six cancers. This represents a significant step forward in the prognostic pathway since pan-cancer genetic signatures are limited at best. Capitalizing two biological phenomena of solid tumors, hypoxia and immune cells infiltration, our signature can be applied across multiple cancers, suggesting that a high degree of commonality exists in host immune response within a hypoxic tumor microenvironment. Signature genes may provide insights into biological features of individual tumors as they are implicated in oncogenic processes, i.e. pH regulation, invasion, cell proliferation, adhesion and migration, glycolysis and inflammation [2630]. Combining the RNA gene signature with DNA somatic mutations simultaneously improved its prognostic capabilities, indicating that signature analysis at two macromolecular levels offers the opportunity for multiple levels of drug targeting. High-risk patients, as predicted by our signature, that concurrently have TP53 mutations suggests a role for hypoxia in promoting genomic instability and aberrant DNA damage repair.

Canonical tumor staging parameters are useful, but they do not offer additional resolution for discriminating patients with similarly-staged malignant grades. Especially for patients with stage 1 cancer, the signature allows the incorporation of hypoxia information to identify patients that are more at risk of progressing to advanced stages and developing lethal metastasis. Additionally, we were able to differentiate tumor from non-tumor samples using the signature, suggesting that it offers additional information on how transformed cells differ from normal cells. It is interesting to speculate that the signature could be used in diagnostics since non-transformed cells would have a distinct expression profile and a deviation from this profile could indicate the onset of oncogenesis.

Tumor hypoxia has wide-ranging effects causing metabolic alteration, angiogenesis, metastasis and immune suppression [31]. Significant crosstalk exists between hypoxia and antitumor immune functions where tumor hypoxia contributes to attenuated antitumor responses [10]. We observed a significant positive correlation between Treg and hypoxia gene expressions, supporting the notion that immunosuppression is higher in patients with more hypoxic tumors (Additional file 3C, D). Cancerous cells are protected through immunosuppressive functions enhanced by tumor hypoxia and together, they contribute to chemotherapy resistance [32]. Hence, to improve patient response to treatment, there is an urgent need to incorporate hypoxic tumor assessments in clinic for an effective patient-centric strategy. Through personalized therapy, patients with more hypoxic tumors would benefit from neoadjuvant treatment using hypoxia-modifying drugs to improve response to immunotherapy and chemotherapy [18].

Elevation of Tregs and exhausted CD8+ T cells in tumors offers another opportunity for therapeutic intervention. Exhausted CD8+ T cells are overrepresented among tumor-infiltrating CD8+ T cells [8]. These cells have reduced levels of cytotoxic markers and high expression of the PDCD1 exhaustion marker [8]. Hence, patients with high signature scores may likely have increased Treg and exhausted CD8+ T cell densities. These patients could benefit from therapy with the PD1 antibody as anti-PD1 treatment can rejuvenate exhausted CD8+ T cells [33]. When used in combination with first-line treatments, this may dramatically improve patient prognosis.


Although prospective validation in clinical trials is warranted, we consider our results as supporting the implications of tumor’s hypoxic and immunologic microenvironment in influencing patients’ prognosis and potentially their response to treatment. Multi-cohort validations incorporating large sample size (n = 2712) confirmed that the 8-gene signature is robust, clinically actionable and has potential to radically change how we determine prognosis and guide therapy.



hepatocellular carcinoma


regulatory T cells


the Cancer Genome Atlas


International Cancer Genome Consortium


receiver operating characteristic


area under the curve


multidimensional scaling


permutational multivariate analysis of variance


Kyoto encyclopedia of genes and genomes


Gene Ontology


hypoxia inducible factor


transcription factor


chromatin immunoprecipitation sequencing


45-gene signature


8-gene signature


overall survival


tumor, node, metastasis


hazard ratio


Authors’ contributions

WHC and AGL designed the study and analyzed the data. AGL supervised the research. All authors interpreted the data. WHC and AGL wrote the initial manuscript draft. All authors revised the manuscript. All authors read and approved the final manuscript.


A special thank you to Graham Foster for his critical insights and help on manuscript preparation.

Competing interests

The authors declare that they have no competing interests.

Availability of data and materials

The datasets supporting the conclusions of this article are included within the article and its additional files.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.



Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Authors’ Affiliations

Nuffield Department of Medicine, University of Oxford, Old Road Campus, Oxford, OX3 7FZ, UK


  1. Ferlay J, Soerjomataram I, Dikshit R, Eser S, Mathers C, Rebelo M, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer. 2012;136:E359–86.View ArticleGoogle Scholar
  2. Forner A, Llovet JM, Bruix J. Hepatocellular carcinoma. Lancet. 2012;379:1245–55.View ArticleGoogle Scholar
  3. Warburg O. On the origin of cancer cells. Science. 1956;123:309–14.View ArticleGoogle Scholar
  4. Gatenby RA, Gillies RJ. Why do cancers have high aerobic glycolysis? Nat Rev Cancer. 2004;4:891.View ArticleGoogle Scholar
  5. Milosevic M, Warde P, Ménard C, Chung P, Toi A, Ishkanian A, et al. Tumor hypoxia predicts biochemical failure following radiotherapy for clinically localized prostate cancer. Clin cancer Res. 2012. ArticlePubMedGoogle Scholar
  6. Buffa FM, Harris AL, West CM, Miller CJ. Large meta-analysis of multiple cancers reveals a common, compact and highly prognostic hypoxia metagene. Br J Cancer. 2010;102:428–35.View ArticleGoogle Scholar
  7. De Simone M, Arrigoni A, Rossetti G, Gruarin P, Ranzani V, Politano C, et al. Transcriptional landscape of human tissue lymphocytes unveils uniqueness of tumor-infiltrating T regulatory cells. Immunity. 2016;45:1135–47.View ArticleGoogle Scholar
  8. Zheng C, Zheng L, Yoo JK, Guo H, Zhang Y, Guo X, et al. Landscape of infiltrating T cells in liver cancer revealed by single-cell sequencing. Cell. 2017;169(1342–1356):e16.Google Scholar
  9. Kobayashi N, Hiraoka N, Yamagami W, Ojima H, Kanai Y, Kosuge T, et al. FOXP3+ regulatory T cells affect the development and progression of hepatocarcinogenesis. Clin Cancer Res. 2007;13:902–11.View ArticleGoogle Scholar
  10. Facciabene A, Peng X, Hagemann IS, Balint K, Barchetti A, Wang LP, et al. Tumour hypoxia promotes tolerance and angiogenesis via CCL28 and Tregcells. Nature. 2011;475:226–30.View ArticleGoogle Scholar
  11. Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, Ellrott K, et al. The cancer genome atlas pan-cancer analysis project. Nat Genet. 2013;45:1113.View ArticleGoogle Scholar
  12. Tabas-Madrid D, Nogales-Cadenas R, Pascual-Montano A. GeneCodis3: a non-redundant and modular enrichment analysis tool for functional genomics. Nucleic Acids Res. 2012. ArticlePubMedPubMed CentralGoogle Scholar
  13. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A, McDermott MG, Monteiro CD, Gundersen GW, Ma'ayan A. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90–7.View ArticleGoogle Scholar
  14. Szklarczyk D, et al. STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2014:43(D1):D447–52.View ArticleGoogle Scholar
  15. Pugh CW, Ratcliffe PJ. Regulation of angiogenesis by hypoxia: role of the HIF system. Nat Med. 2003;9:677.View ArticleGoogle Scholar
  16. Kaelin WG. Cancer and altered metabolism: Potential importance of hypoxia-Inducible factor and 2-oxoglutarate-dependent dioxygenases. Cold Spring Harb Symp Quant Biol. 2011;76:335–45.View ArticleGoogle Scholar
  17. Smythies JA, et al. Inherent DNA-binding specificities of the HIF-1α and HIF-2α transcription factors in chromatin. EMBO Rep. 2018;2018:e46401.Google Scholar
  18. Wilson WR, Hay MP. Targeting hypoxia in cancer therapy. Nat Rev Cancer. 2011;11:393.View ArticleGoogle Scholar
  19. Petitjean A, Achatz MIW, Borresen-Dale AL, Hainaut P, Olivier M. TP53 mutations in human cancers: functional selection and impact on cancer prognosis and outcomes. Oncogene. 2007;26:2157.View ArticleGoogle Scholar
  20. Hussain SP, Schwank J, Staib F, Wang XW, Harris CC. TP53 mutations and hepatocellular carcinoma: insights into the etiology and pathogenesis of liver cancer. Oncogene. 2007;26:2166.View ArticleGoogle Scholar
  21. Cieply B, Zeng G, Proverbs-Singh T, Geller DA, Monga SPS. Unique phenotype of hepatocellular cancers with exon-3 mutations in beta-catenin gene. Hepatology. 2009;49:821–31.View ArticleGoogle Scholar
  22. Wong CM, Fan ST, Ng IOL. β-Catenin mutation and overexpression in hepatocellular carcinoma: clinicopathologic and prognostic significance. Cancer. 2001;92:136–45.View ArticleGoogle Scholar
  23. Mao T-L, Chu J-S, Jeng Y-M, Lai P-L, Hsu H-C. Expression of mutant nuclear β-catenin correlates with non-invasive hepatocellular carcinoma, absence of portal vein spread, and good prognosis. J Pathol. 2001;193:95–101.View ArticleGoogle Scholar
  24. Kim SM, Leem SH, Chu IS, Park YY, Kim SC, Kim SB, et al. Sixty-five gene-based risk score classifier predicts overall survival in hepatocellular carcinoma. Hepatology. 2012;55:1443–52.View ArticleGoogle Scholar
  25. Ding L, Getz G, Wheeler DA, Mardis ER, McLellan MD, Cibulskis K, et al. Somatic mutations affect key pathways in lung adenocarcinoma. Nature. 2008;455:1069.View ArticleGoogle Scholar
  26. Den Hollander P, Rawls K, Tsimelzon A, Shepherd J, Mazumdar A, Hill J, et al. Phosphatase PTP4A3 promotes triple-negative breast cancer growth and predicts poor patient survival. Cancer Res. AACR. 2016;76:1942–53.View ArticleGoogle Scholar
  27. Pastorek J, Pastorekova S. Hypoxia-induced carbonic anhydrase IX as a target for cancer therapy: from biology to clinical use. Semin Cancer Biol. 2015;31:52–64.View ArticleGoogle Scholar
  28. Cheng X-S, Li Y-F, Tan J, Sun B, Xiao Y-C, Fang X-B, et al. CCL20 and CXCL8 synergize to promote progression and poor survival outcome in patients with colorectal cancer by collaborative induction of the epithelial–mesenchymal transition. Cancer Lett Elsevier. 2014;348:77–87.View ArticleGoogle Scholar
  29. Wang J, Tsouko E, Jonsson P, Bergh J, Hartman J, Aydogdu E, et al. miR-206 inhibits cell migration through direct targeting of the actin-binding protein Coronin 1C in triple-negative breast cancer. Mol Oncol. 2014;8:1690–702.View ArticleGoogle Scholar
  30. Mataki H, Enokida H, Chiyomaru T, Mizuno K, Matsushita R, Goto Y, et al. Downregulation of the microRNA-1/133a cluster enhances cancer cell migration and invasion in lung-squamous cell carcinoma via regulation of Coronin1C. J Hum Genet. 2015;60:53.View ArticleGoogle Scholar
  31. Yotnda P, Wu D, Swanson AM. Hypoxic tumors and their effect on immune cells and cancer therapy. Immunother Cancer. 2010;651:1–29.View ArticleGoogle Scholar
  32. Lukashev D, Ohta A, Sitkovsky M. Hypoxia-dependent anti-inflammatory pathways in protection of cancerous tissues. Cancer Metastasis Rev. 2007;26:273–9.View ArticleGoogle Scholar
  33. Blackburn SD, Shin H, Freeman GJ, Wherry EJ. Selective expansion of a subset of exhausted CD8 T cells by α-PD-L1 blockade. Proc Natl Acad Sci USA. 2008;105:15016–21.View ArticleGoogle Scholar


© The Author(s) 2019