Dual analysis of glycolytic and OXPHOS gene expression describes the metabolic landscape of cancer
The study design is summarized in Additional file 1: Fig. S1. In order to investigate the landscape of tumor metabolism, we complied a pan-cancer cohort of 9668 cases from 33 cancer types from TCGA and stratified them based on their relative expression levels of glycolytic and OXPHOS pathway genes. Gene signatures representing all enzymes and isoenzymes involved in these two pathways proposed by different studies [23, 24] (Glycolysis = 292; OXPHOS = 200) were integrated together for further analysis to achieve the best comprehensiveness and representativeness when measuring the actual status of the metabolic pathways. However, previous studies have figured out the potential heterogeneity in metabolic gene expression, including the isoenzymes within specific pathways between different cancer types [29,30,31]. To avoid this problem, we performed consensus clustering to identify the signature genes exhibiting similar expression pattern among multiple tumor types. As shown in Fig. 1A, based on the pan-cancer RNA-sequencing data, the expression level of the genes involved in the signatures mentioned above [23, 24] were categorized into five clusters, in which the cluster 1 (n = 72) mainly consists of glycolytic genes whereas cluster 4 (n = 66) mainly consists of OXPHOS genes. Therefore, we selected these co-expressed genes belonging to the two groups as the glycolysis and OXPHOS pathway gene signatures respectively to be used for describing the status of the two pathways (Fig. 1A, Additional file 9: Data 1). The Glycolysis and OXPHOS scores were then calculated by GSVA based on the corresponding “co-expressed” gene signatures for each cancer patient enrolled in this study (Additional file 10: Data 2). The robustness and representativeness of the two “filtered” metabolic scores could be verified by their strong correlations with the GSVA scores for the above mentioned 2 “primary” signatures on a pan-cancer scale (Additional file 2: Fig. S2A, Glycolysis: r = 0.75; OXPHOS: r = 0.96). However, the correlation between these two scores is not completely opposite to each other as expected (r = 0.33), indicating the complicated mutual relationship between the two pathways.
The Glycolysis and OXPHOS scores exhibited distinct distribution patterns in different cancer types (Fig. 1B), highlighting the metabolic diversity among tumor types. For example, in terms of Glycolysis score, the median (interquartile range) was − 0.30 (− 0.33 to − 0.24) for lymphoid neoplasm diffuse large B-cell lymphoma (DLBC); for acute myeloid leukemia (LAML), the value was even lower: − 0.35 (− 0.39 to − 0.31). These were in contrast to the solid tumor types such as colon adenocarcinoma (COAD): 0.17 (0.08–0.27), liver hepatocellular carcinoma (LIHC): 0.014 (0.08–0.21), and rectum adenocarcinoma (READ): 0.14 (0.07–0.24). These results suggest that glycolysis-related genes are significantly upregulated in digestive malignancies such as COAD, LIHC, and READ, thus leading to the increased score, while the Glycolysis scores are much lower in non-solid tumors like DLBC and LAML. As for OXPHOS score, we observed higher viability among different tumor types (Fig. 1B). LAML also exhibits lowest score: − 0.79 (− 0.81 to − 0.77), whereas kidney chromophobe (KICH) has the highest: 0.72 (0.69–0.75).
We further adopted metabolite data to investigated whether the expression patterns of metabolic pathway genes and corresponding scores could reflect actual metabolic activities in tumor patients. Considering the unavailability of metabolite data in TCGA, we enrolled an independent dataset that provides parallel gene expression and metabolites profiling data and focused on the 12 metabolites that had been annotated to Glycolysis (n = 9) and OXPHOS (n = 3) pathways [14]. The correlations between metabolite abundance and corresponding pathway genes were calculated and exhibited in Fig. 1C, D (only those with an FDR < 0.15). For example, HMMR, a hyaluronan associated gene from the Glycolysis signature, was found to be significantly correlated with 5 glycolytic metabolites, with 4 of them displaying correlation > 0.6 (3-phosphoglycerate, r = 0.63, FDR < 0.001; fructose-6-phosphate, r = 0.68, FDR < 0.001; lactate, r = 0.75, FDR < 0.001; glucose-6-phosphate (G6P), r = 0.76, FDR < 0.001). And we also noticed SPAG4, whose expression level was significantly in agreement with six glycolytic metabolites (Fig. 1C). As for OXPHOS pathway, only three metabolites were annotated, but we still detected the significant correlation between phosphate, a key factor in OXPHOS process, and five genes in this pathway (Fig. 1D). However, the correlation between pathway scores and metabolites are quite weak (Additional file 2: Fig. S2B–E). This result might be due to the integrative nature of the Glycolysis and OXPHOS score, which represent the overall activity of the corresponding metabolic process rather than the abundance of a specific metabolite. Besides, to assess the statistical significance of the number of significant hits detected, we conducted a simulation analysis to compare the number of genes with significant signals from Glycolysis signature with those based on random gene sets of the same size (n = 72, FDR < 0.05). Strikingly, the mean number of genes exhibiting significant correlations with glycolytic metabolites in 1000 random samples was only 16.862, whereas the real number of significant genes in glycolytic signatures was 27 (Student t-test: p < 0.001, Fig. 1E). These results demonstrate that the expression patterns of metabolic pathway genes do reflect actual metabolic activities.
Classification of metabolic subtypes and their prognostic value
We classified all the cancer patients enrolled in this study into four metabolic subgroups according to their Glycolysis and OXPHOS scores: High-glycolysis and high OXPHOS (HGHO), High-glycolysis and low OXPHOS (HGLO), low-glycolysis and high OXPHOS (LGHO), and Low-Glycolysis & Low OXPHOS (LGLO) (Fig. 2A, Additional file 10: Data 2). The medians for the two scores in each cancer type were considered as the cut-off value for the definition of “High” and “Low” (Using LUAD as an example, Fig. 2B).
We next investigated the clinical relevance of the metabolic subtypes mentioned above, since overall survival serves as a key index for evaluating tumor aggressiveness (Fig. 2C, Additional file 3: Fig. S3). Notably, compared with HGLO, LGHO subtypes were consistently associated with better prognosis in 23 out of 33 tumor types, with 7 exhibiting statistical significance, including cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC; HR: 0.33 (0.16–0.70), p = 0.003), kidney renal papillary cell carcinoma (KIRP; HR: 0.20 (0.07–0.60), p = 0.004), mesothelioma (MESO; HR: 0.38 (0.18–0.80), p = 0.011), pancreatic adenocarcinoma (PAAD; HR: 0.35 (0.18–0.69), p = 0.002), and LUAD (HR: 0.49 (0.31–0.80), p = 0.003), sarcoma (SARC; HR: 0.49 (0.26–0.94), p = 0.032), and bladder urothelial carcinoma (BLCA; HR: 0.57 (0.37–0.86), p = 0.008) (Fig. 2C). These results are compatible with the role of glycolysis and OXPHOS in tumor progression [1], where tumors with higher rates of glycolysis but lower OXPHOS may be more aggressive than those with adverse features. Unexpectedly, LGHO was significantly associated with worse survival in esophageal carcinoma (ESCA; HR: 3.69 (1.32–10.31), p = 0.012), suggesting the potential distinct metabolic feature of ESCA. However, we failed to identify common prognostic value in HGHO and LGLO subtypes across multiple cancer types (Additional file 4: Fig. S4), thus, subsequent analyses were mainly focused on HGLO and LGHO.
Moreover, to further validate the metabolic discrepancy between HGLO and LGHO subgroups at a post-transcriptional level, we performed gene set enrichment analysis (GSEA) in 102 BRCA patients and ten OV patients, since only in these two tumor types the mass proteomic spectrum data were available in the CTPAC [9, 10]. The expression data of 9733 and 7625 proteins was available for BRCA and OV patients, respectively. Based on the metabolic group determined by RNA-sequencing data, the Glycolysis and OXPHOS pathway mentioned above also exhibited similar enrichment features for both cancer types (Glycolysis: adjusted p-value = 0.014 for BRCA and 0.004 for OV; OXPHOS: adjusted p-value = 0.121 for BRCA and 0.004 for OV, Fig. 2D). This phenomenon could be explained by the high correlations between mRNA and protein expression levels in a patient, further indicating the robustness of our metabolic scores and classification from another dimension.
Association of metabolic subtypes with multi-omics genomic profiles across multiple cancer types
To systematically outline the genomic characteristics of patients in different metabolic subtypes and identify common changes or regulatory relationship across multiple cancer types, we performed a multi-omics comparison of three types of molecular features between the HGLO and LGHO groups in all 33 cancers: mRNA-sequencing, mature miRNA, and DNA methylation (Additional file 11: Data 3). Significant differential genomic features were detected and exhibited varied distribution across the 33 cancer types, which could be due to the distinct tumor characteristics and varied sample sizes (Fig. 3A). For instance, no DEG was detected in DLBC, cholangiocarcinoma (CHOL), and thymoma (THYM), while 3762 were detected in uveal melanoma (UVM). Differentially expressed miRNAs were identified in only 23 out of 33 cancers, and alterations in DNA methylation probes ranged from one in CESC to 14,405 in testicular germ cell tumors (TGCT).
Despite the considerable heterogeneity among different tumor types, we still noted lots of common significant genomic alterations across different cancer types, indicating the respective general characteristics of HGLO and LGHO subtypes, regardless of tumor origins. For example, our analysis revealed increased expression of NDUFA2 and UQCRQ, all key components of the mitochondrial respiratory chain, in LGHO group in 25/33 cancer types (Additional file 5: Fig. S5). The expression of MPC1, which encodes one subunit of mitochondrial pyruvate complex (MPC) that is responsible for transporting pyruvate into mitochondria, was downregulated in HGLO group in 15 cancer types (Fig. 3B). It has been reported that the effects of glycolysis on tumor progression can be diminished by diverting the metabolite pyruvate from conversion to lactate in part through transport into the mitochondria via the activity of the MPC1/MPC2 heterodimer [32,33,34]. Therefore, reduced expression of MPC1 is associated with increased glycolytic activity in tumors and thus promoting tumor development, which is consistent with our findings above. Besides, 15 tumor types exhibited increased expression of HIF1A, which is activated in hypoxia environment in tumor cells, in HGLO groups, indicating that hypoxia serves as the inducer of the glycolytic pathway and Warburg effect in cancer (Fig. 3B). Furthermore, GO enrichment analysis of the 632 genes (315 upregulated in HGLO and 317 upregulated in LGHO), which were differentially expressed in at least ten cancer types, also demonstrated similar results. As shown in Fig. 3C and Additional file 11: Data 3, in addition to the common pathways like extracellular matrix organization, we found the significant enrichment of a series of metabolic pathways involved in glycolysis and OXPHOS, including response to hypoxia (adjusted p-value < 0.001), respiratory chain (adjusted p-value < 0.001), and NADH dehydrogenase activity (adjusted p-value < 0.001), depicting the typical common genomic feature of different metabolic subgroups.
Considering the regulatory role of miRNA in gene expression, we also investigated the intersection between metabolism-related DEGs and differentially expressed miRNAs across multiple cancer types based on the miRNA-gene targeting relationship predicted by miRWalk (Additional file 12: Data 4). As shown in Fig. 3D, 24 key miRNAs (12 upregulated in HGLO and 12 in LGHO) and 38 genes (28 upregulated in HGLO and 10 in LGHO) were identified upon the following criteria: (1) the miRNAs are significantly differentially expressed in at least five tumor types; (2) their targets genes exhibit opposite expression mode in at least 15 tumor types. MiR-30c-5p (LGHO), miR-362-5p (LGHO), and miR-379-5p (HGLO) appear to be strong regulators of tumor glycolysis and OXPHOS since they were all dysregulated in 8 cancer types and targeted several metabolism-related DEGs. Meanwhile, the three miRNAs have been reported to play an important role in tumor metabolism and progression [35,36,37], which further supports our findings.
For differentially methylated genes (DMGs) between HGLO and LGHO subgroups, we focused on gene promoter region including TSS200, TSS1500, 3′-UTR, and 1st-Exon, since DMGs are commonly defined according to their promoter methylation status [38, 39]. The genes were classified into four groups based on the intersection between metabolism-associated DMGs and DEGs: hypermethylated and upregulated (hyper-up), hypermethylated, and downregulated (hyper-down), hypomethylated and upregulated (hypo-up), and hypomethylated and downregulated (hypo-down). Considering the nature of DMGs and DEGs, we focused on genes in the hyper-down and hypo-up groups in downstream analyses. However, these differentially methylated and expressed genes exhibited few generalities across tumor types, suggesting that abnormal methylation of a specific gene might not explain the metabolic alteration in multiple cancer types.
Regulatory network of transcription factors in cancer metabolic heterogeneity
To further elucidate how the tumor metabolic status is regulated by transcription factors (TFs), we adopted tumor-context-specific TF regulatory networks, which is based on ARACNe algorithm and proposed by Giorgi et al. [18], to infer the significantly enriched master TFs that potentially drive the conversion of tumor metabolic phenotypes (HGLO and LGHO) from one to the other using MARINa in each cancer type (Additional file 13: Data 5). The results were further statistically adjusted by shadow analysis [19] and the activating and expression level of identified TFs in different cancer types are shown in Fig. 4A (we took BRCA as an example) and Additional file 6: Fig. S6. After this two-step screening method, 55 common TFs (see “Methods”) were enrolled for subsequent analyses, and the result of functional enrichment of them also confirms their role in multiple metabolic pathways, including respiratory electron transport chain (Fig. 4B). We then employed leading-edge analysis to identify the 56 target genes of those common TFs (Additional file 13: Data 5) and integrated their genomic location, relative expression level across different cancer types, and corresponding TF-target regulatory network in Fig. 4C. For example, MTERF2, which has been demonstrated to be involved in mitochondrial gene transcription and metabolism and serve pivotal roles in the pathogenesis of various cancer types [40], was identified as a master TF driving the LGHO phenotype in 7 cancer types. The regulatory network analysis revealed that ZNF10, a transcriptional repressor protein, was MTERF2’s potential target in 10 cancers. Meanwhile, we found that HIF1A, which is up-regulated in 15 tumor types as mentioned above, served as a master TF for the HGLO phenotype in 7 of them, as tumor formation associated gene SNAPC1 was identified as its downstream target in 6 cancers [41]. In addition, we noticed an interesting phenomenon that many TFs mentioned above are also downstream target genes of other TFs, indicating the complicated TF-related regulatory network of tumor metabolism status.
Metabolism associated somatic genomic changes
The reprogramming of the glycolytic and OHPHOS status can be largely viewed as a consequence of oncogenic driver events [42]. For instance, it has been widely accepted that mutated TP53 and amplified MYC are linked to several anabolic and catabolic pathways [43, 44]. To identify somatic genomic changes that potentially characterize tumor metabolism, we systematically evaluated the association between metabolic subtypes and specific mutational events, including both SNVs and CNVs. For SNVs, Fig. 5A shows the overall mutational landscape of the 15 most frequently mutated genes on the pan-cancer scale (Additional file 14: Data 6). Besides, for each cancer type, we identified significantly differently mutated genes and only noted the elevation rated of TP53 (Fisher-exact test, P < 0.05) across five cancer types in HGLO compared with LGHO, which is in agreement with its reported function in metabolism and the selection of TP53-mutated cells under hypoxic stress [45]. However, most metabolism-related SNVs were detected in only one or two tumor types, thus highlighting strong inter-tumor-type diversity in this mutational feature.
As for CNVs, we first focused on 131 cancer driver genes altered by CNVs [22] and identified the oncogenes and tumor suppressors recurrently associated with tumor metabolism in multiple cancer types (Fig. 5B; Additional file 7: Fig. S7A). For example, the amplification of the oncogene MYC was significantly associated with elevated Glycolysis score in four independent cancer types (FDR < 0.05). However, we failed to find CNVs recurrently associated with OXPHOS in even three cancer types, indicating that individual tumor types might have distinctive cancer driver CNV-metabolism relationships. We next investigated the CNV fraction, a surrogate of genomic instability that is correlated with aggressivity in several tumor types (Fig. 5C) [46, 47]. Our analysis revealed that CNV fraction was significantly different between HGLO and LGHO subtypes only in BRCA and uterine corpus endometrial carcinoma (UCEC, FDR adjusted Wilcoxon P < 0.05, Fig. 5C, D). Meanwhile, strong correlations were observed between CNV fraction value and the Glycolysis score in BRCA and the OXPHOS score in UCEC, respectively. (Fig. 5C, left). Therefore, we focused on these two cancers and further assessed whether metabolic scores were associated simply with specific gene-level mutational events, including both CNVs and SNVs. As shown in Fig. 5E, F, in BRCA, tumors harboring amplification of PIK3CA, GATA3, MYC, and deletion of MAP3K1, CHAF1A are more likely to have higher Glycolysis score. Besides, high glycolytic breast tumors also exhibit an elevated rate of TP53 (OR = 3.18, P < 0.001) and decreased rate of PIK3CA point mutation (OR = 0.36, P < 0.001). As for UCEC, the strong association between mutational events and OXPHOS score was observed. Tumors with amplification of PIK3CA and BCL6, or those with deletion of ARID1A and CH1F1A, are more likely to show lower OXPHOS score (Additional file 7: Fig. S7B, C). Additionally, TP53 point mutations are more common in low-OXPHOS tumors (OR = 2.63, P < 0.001). Taken together, these analyses provide a broad view of potential somatic drivers related to metabolic reprogramming in human cancer, thereby proposing testable hypotheses for future experimental modeling studies.
Tumor metabolism are informative about pathway functional enrichment and drug response
To further assess the biological relevance of metabolic subtypes, we examined their correlation with various cellular pathways based on mRNA expression (FDR < 0.05, Fig. 6A). This analysis enrolled the 50 hallmark gene sets proposed by Liberzon et al. [23] and seven metabolic super-pathways from the latest Reactome annotations [24]. The GSVA score of these 57 pathways was calculated for each cancer patient. Tumor stemness data proposed by Malta et al. was also enrolled [25]. Interestingly, despite the diversity of cancer types surveyed, we observed an almost adverse correlation pattern of most pathway enrichment levels with Glycolysis and OXPHOS score across multiple cancer types. For example, the majority of included pathways, including angiogenesis, epithelial-mesenchymal transition, TGF-beta signaling, hypoxia, and mTORc, were consistently positively correlated with Glycolysis but negatively correlated with OXPHOS score. In contrast, asides from the TCA cycle, which is an important section of OXPHOS process, the tumor-stemness and amino acid metabolism pathway displayed recurrent positive correlations with OXPHOS, but negative correlation with Glycolysis score across several tumor types. Overall, these results demonstrate that tumor metabolic activity is intrinsically coupled with cancer pathways and development.
Considering the previously reported associations of drug resistance with tumor metabolic features [30], we adopted the expression and drug sensitivity data of cancer cell lines from GDSC and investigated whether the tumor cells from HGLO & LGHO subtypes would harbor different sensitivity to anti-cancer drugs. The RNA sequencing data of 435 cancer cell lines and their sensitivity to 167 anti-cancer drugs, presented as AUC, were used for subsequent analyses. As shown in Additional file 8: Fig. S8, some of the commonly used anti-cancer drugs, including cisplatin, gemcitabine, and paclitaxel, as well as mTOR suppressor OSI_027, exhibited significant different pharmacological effect in the two metabolic subtypes. Besides, we further analyzed the correlation between drug sensitivity and the expression of 438 metabolism-associated DEGs, which were significantly differentially expressed between HGLO and LGHO subgroups in at least ten cancer types. These drugs target several important biological processes, including metabolism, apoptosis, DNA replication, PI3K-mTOR signaling, and the p53 pathway. The network in Fig. 6B depicts the complicated correlations among the anti-cancer drugs and 111 of 438 metabolism-related DEGs that were strongly correlated (|r|> 0.3) with cell line sensitivity (AUC) to at least three drugs. For example, the expression of ASAP2, which has been identified as a biomarker for several types of tumor, is up-regulated in HGLO in 16 cancer types and positively correlated with the resistance to DNA topoisomerases inhibitor Topotecan (r = 0.40) and DNA synthesis inhibitor Gemcitabine (r = 0.34). It has been reported that overexpression of ASAP2 is correlated with worse prognosis in pancreatic cancers [48]. Besides, Overexpression of APP in HGLO group was detected in 15 cancer types and suggested potential resistance to 33 types of drugs, including Teniposide, Mitoxantrone, Oxaliplatin, and Irinotecan, all of the which target DNA replication (r = 0.434, r = 0.425, r = 0.416, and r = 0.366, respectively). Taken together, our findings demonstrate the extensive interactions between drug responses and tumor metabolism, suggesting that our metabolic classification might serve as a predictive marker for the response to different anti-cancer treatment.