Fig. 4 | Journal of Translational Medicine

Fig. 4

From: Ensemble learning model for identifying the hallmark genes of NFκB/TNF signaling pathway in cancers

Triple-negative breast cancer analysis.
a Enrichment analysis of breast cancer subtypes for patients by their weights in the BRCA ensemble model. The z-score of each subtype was calculated through 10,000 random permutations during the GSEA process, wherein patients were ranked by their weights in the BRCA model to hit patients with different subtypes.
b The highly voted functional modules activated/inactivated in TNBC patients. The z-score, calculated from 1,000 random permutations during the GSEA process, reflects the functional module's activity in TNBC patients. We assessed the activity of the highly voted functional modules identified in BRCA by performing GSEA, wherein genes were ranked by their fold-change (TNBC vs. non-TNBC patients) to hit the member genes in the tested module. Modules marked in red are activated, and those marked in green are inactivated, in TNBC patients.
c The performance of the identified functional modules in identifying TNBC patients. We performed 100 hold-out processes for each module to assess potential overfitting. During each hold-out process, 60% of the sampled data was used for training, and the remaining 40% was used for testing. The performance metrics shown here are AUC and AUCPR values derived from the testing data.
d Principal component analysis (PCA) of BRCA samples by expression profiles of the functional module “mononuclear cell differentiation”. The left panel illustrates the separation of TNBC and non-TNBC patients based on the expression profiles of the functional module, with the p-value obtained through a permutational multivariate analysis of variance (PERMANOVA). The right panel displays the trajectory of the receptor status in BRCA patients.
e The association between the predicted TNBC probability and non-TNBC patients’ ER level. The TNBC probabilities of all BRCA patients were predicted by a naïve logistic regression model using gene expression profiles in the mononuclear cell differentiation module. The subtypes of non-TNBC patients are indicated by color.
f The hazard rate of non-TNBC patient survival, estimated by Aalen’s additive regression model. The patients’ age and predicted TNBC probability are covariates in the model. The three vertical dashed lines mark, from left to right, the time points of one, three, and five years.
g The 3-year hazard ratio of age and predicted TNBC probability for non-TNBC patients. The hazard ratios were calculated by a multivariate Cox regression model using age and TNBC probability as covariates.
h The Kaplan-Meier survival curve of the estimated three-year survival probability of non-TNBC patients with high and low predicted TNBC probability. The p-value is estimated by the log-rank test.
i The functional module of mononuclear cell differentiation. The size of each node in the figure represents the magnitude of the gene's impact on TNBC prediction. Red and blue nodes represent genes with positive and negative z-scores, respectively. Genes with an absolute z-score greater than two are labeled in white.
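
The permutation-based z-scores described for panels a and b can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes an unweighted, Kolmogorov-Smirnov-style running-sum enrichment score and a z-score defined as the observed score standardized against the scores of randomly permuted rankings. The function names (`enrichment_score`, `permutation_z_score`) and the toy gene list are hypothetical and chosen only for illustration.

```python
import random
import statistics


def enrichment_score(ranked_items, members):
    """Unweighted running-sum enrichment score (assumed KS-style variant).

    Walks down the ranked list, stepping up on a hit and down on a miss,
    and returns the signed deviation from zero with the largest magnitude.
    """
    members = set(members)
    n = len(ranked_items)
    n_hit = sum(1 for item in ranked_items if item in members)
    if n_hit == 0 or n_hit == n:
        raise ValueError("member set must hit some, but not all, ranked items")
    hit_step = 1.0 / n_hit
    miss_step = 1.0 / (n - n_hit)
    running, best = 0.0, 0.0
    for item in ranked_items:
        running += hit_step if item in members else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


def permutation_z_score(ranked_items, members, n_perm=1000, seed=0):
    """Standardize the observed score against a permutation null (assumption)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_items, members)
    shuffled = list(ranked_items)
    null_scores = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random ranking breaks any real association
        null_scores.append(enrichment_score(shuffled, members))
    mu = statistics.mean(null_scores)
    sd = statistics.stdev(null_scores)
    return (observed - mu) / sd


# Toy data: 200 hypothetical genes, already ranked by fold-change (highest first).
genes = [f"g{i}" for i in range(200)]
top_module = genes[:15]      # members concentrated at the top of the ranking
bottom_module = genes[-15:]  # members concentrated at the bottom

z_top = permutation_z_score(genes, top_module, n_perm=1000, seed=1)
z_bottom = permutation_z_score(genes, bottom_module, n_perm=1000, seed=1)
print(f"top module z = {z_top:.2f}, bottom module z = {z_bottom:.2f}")
```

Under these assumptions, a module whose genes cluster at the top of the fold-change ranking receives a positive z-score and one whose genes cluster at the bottom receives a negative z-score, which matches the activated/inactivated reading in panel b. The same routine applies to panel a if patients ranked by model weight replace the genes and each subtype's patients replace the module members.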
