- Open Access
Identification and validation of stemness-related lncRNA prognostic signature for breast cancer
Journal of Translational Medicine volume 18, Article number: 331 (2020)
Long noncoding RNAs (lncRNAs) are emerging as crucial contributors to the development of breast cancer and are involved in the stemness regulation of breast cancer stem cells (BCSCs). LncRNAs are closely associated with the prognosis of breast cancer patients. It is critical to identify BCSC-related lncRNAs with prognostic value in breast cancer.
A co-expression network of BCSC-related mRNAs-lncRNAs from The Cancer Genome Atlas (TCGA) was constructed. Univariate and multivariate Cox proportional hazards analyses were used to identify a stemness risk model with prognostic value. Kaplan–Meier analysis, univariate and multivariate Cox regression analyses and receiver operating characteristic (ROC) curve analysis were performed to validate the risk model. Principal component analysis (PCA) and Gene Set Enrichment Analysis (GSEA) functional annotation were conducted to analyze the risk model.
In this study, BCSC-related lncRNAs in breast cancer were identified. We evaluated the prognostic value of these BCSC-related lncRNAs and eventually obtained a prognostic risk model consisting of 12 BCSC-related lncRNAs (Z68871.1, LINC00578, AC097639.1, AP003119.3, AP001207.3, LINC00668, AL122010.1, AC245297.3, LINC01871, AP000851.2, AC022509.2 and SEMA3B-AS1). The risk model was further verified as a novel independent prognostic factor for breast cancer patients based on the calculated risk score. Moreover, based on the risk model, the low- risk and high-risk groups displayed different stemness statuses.
These findings suggested that the 12 BCSC-related lncRNA signature might be a promising prognostic factor for breast cancer and can promote the management of BCSC-related therapy in clinical practice.
Breast cancer is the most commonly diagnosed malignancy and is the leading cause of cancer-associated mortality among women worldwide [1, 2]. In the field of clinical treatment, increasing attention has been focused on individual and precise therapeutic strategies. Thus, the identification of novel prognostic biomarkers and promising targets is considered to be an effective way to achieve this goal.
Heterogeneity is a hallmark of solid tumors, including breast cancer, which results from the enrichment of cancer stem cells (CSCs) [3, 4]. CSCs represent a dynamic subpopulation of tumor cells characterized by self-renewal, pluripotency and limitless proliferative properties . Breast cancer stem cells (BCSCs) are considered the source of tumor aggression, metastasis, worse prognosis, chemoresistance and recurrence in breast cancer . Therefore, identifying key stemness regulators of BCSCs is of great importance for both theoretical studies and clinical practice.
Long noncoding RNAs (lncRNAs) are a class of transcript RNAs longer than 200 nucleotides that are not translated into proteins . LncRNAs are involved in the development and progression of various cancers at different levels, including epigenetic, transcriptional and posttranscriptional regulation, and are considered one of the most sensitive and specific cancer biomarkers [8,9,10,11]. Recently, lncRNAs have become a hot topic in stemness regulation of CSCs and prediction of prognosis in numerous cancers. Therefore, it is essential to identify key lncRNAs closely related to the stemness of BCSCs and prognosis in breast cancer.
In this study, we analyzed a dataset of lncRNA expression in breast cancers from The Cancer Genome Atlas (TCGA) and screened prognostic lncRNAs related to the stemness of BCSCs. We identified a 12 BCSC-related lncRNA signature with the ability to predict the survival prognosis of breast cancer patients.
Patient data sets
Clinical information and pathology records of patients with breast cancer were taken from the TCGA (https://cancergenome.nih.gov/). The edgeR package was used to normalize gene expression. A total of 1053 TCGA female breast cancer patients with lncRNA expression profiles were utilized in the present study.
Among them, 986 patients with complete follow-up information and survival time ≥ 30 days and 539 patients with complete clinicopathological data were applied to subsequent analyses. The clinical characteristics are detailed in Table 1.
Identification of BCSC-related lncRNAs in breast cancer
A total of 213 BCSC-related encoding genes (mRNAs) were extracted from the Molecular Signatures Database of Gene Set Enrichment Analysis (GSEA: M14079, M4740, M9246, and M13135). Finally, 1198 BCSC-related lncRNAs were identified by constructing BCSC-related mRNAs-lncRNAs co-expression network according to the criteria of |Correlation Coefficient|> 0.3 and p < 0.001 by Pearson correlation analysis using the limma R package .
Identification of BCSC-related lncRNA prognostic signatures for breast cancer
To identify BCSC-related lncRNAs associated with survival, univariate Cox proportional hazards analysis was performed with p < 0.01 as the criteria, and multivariate Cox analysis was subsequently performed to establish the optimal prognostic risk model based on the Akaike information criterion (AIC = 1422.28) using the survival R package. Then, the risk score for each patient was calculated based on the following formula:
Risk score = coef (lncRNA1) × expr (lncRNA1) + coef (lncRNA2) × expr (lncRNA2) + …… + coef (lncRNAn) × expr (lncRNAn).
coef (lncRNAn) was defined as the coefficient of lncRNAs correlated with survival.
expr (lncRNAn) was defined as the expression of lncRNAs.
Breast cancer patients in the TCGA were divided into a high-risk group and a low-risk group according to the median risk score. Kaplan–Meier survival analysis was conducted to evaluate the survival difference between the two groups using the survival and survminer R packages.
Independent prognostic analysis and ROC curve plotting
Univariate and multivariate Cox regression analyses were performed to assess the relationship between survival prognosis and age; estrogen receptor (ER) expression; progesterone receptor (PR) expression; human epidermal growth factor receptor 2 (HER2) expression; Tumor, Node, Metastasis (TNM) stage; tumor size (T); lymph node (N) metastasis; distant metastasis (M); and risk score using the survival R package. Time-dependent receiver operating characteristic (ROC) curves were plotted to evaluate the predictive accuracy for survival time through different clinical pathological factors and risk scores using the survival ROC R package.
All statistical analyses were performed using R software (version 3.6.2). A co-expression network of BCSC-related lncRNAs-mRNAs with prognostic value in breast cancer was constructed visualized by Cytoscape and Sankey diagram. The correlation between the expression of BCSC-related lncRNAs and clinicopathological factors was analyzed by ggpubr R package. Principal component analysis (PCA) was performed for effective dimension reduction, pattern recognition, and exploratory visualization of high-dimensional data of the whole genome, 213 BCSC-related encoding genes and the risk model of BCSC-related lncRNAs expression profiles, respectively [13, 14]. GSEA was used for functional annotation. Two-tailed p < 0.05 was considered statistically significant.
Identification of significant prognostic value of BCSC-related lncRNAs in breast cancer
A total of 1198 BCSC-related lncRNAs were obtained by constructing co-expression networks with 213 BCSC-related encoding genes (mRNAs) (Additional files 1, 2 and 3). Among them, 32 BCSC-related lncRNAs were significantly associated with the survival of breast cancer patients from the TCGA (p < 0.01) by Cox proportional hazards analysis, including 23 lncRNAs with low risk (hazard ratio (HR) < 1) and 9 lncRNAs with high risk (HR > 1) (Fig. 1). Subsequently, multivariate Cox analysis further screened 12 lncRNAs from the above 23 BCSC-related lncRNAs with prognostic significance, namely, Z68871.1, LINC00578, AC097639.1, AP003119.3, AP001207.3, LINC00668, AL122010.1,
AC245297.3, LINC01871, AP000851.2, AC022509.2 and SEMA3B-AS1 (Table 2). These 12 lncRNAs were constructed into the optimal prognostic risk model of BCSC-related lncRNAs. As shown in Fig. 1a–c visualization co-expression network of BCSC-related lncRNAs-mRNAs with prognostic value was established. According to the risk score formula and the calculated median risk score, breast cancer patients were divided into a high-risk group and a low-risk group. Kaplan–Meier survival analysis showed that the high-risk group presented worse overall survival (OS) than the low risk group (p = 3.021E−11) (Fig. 2a), suggesting that the risk score had prognostic value. The risk curve and scatterplot were made to illustrate the risk score and the corresponding survival status of breast cancer patients. The results showed that the higher the risk score was, the more mortality occurred (Fig. 2b, c). The heatmap of the expression of these 12 BCSC-related lncRNAs in breast cancer samples showed that Z68871.1, LINC00578, AC097639.1, AP003119.3, AP001207.3 and LINC00668 were upregulated in the high risk group, while AL122010.1, AC245297.3, LINC01871, AP000851.2, AC022509.2 and SEMA3B-AS1 were highly expressed in the low risk group (Fig. 2d). Therefore, these studies identified 12 BCSC-related lncRNAs with prognostic significance for breast cancer.
Evaluation of the risk model of the 12 BCSC-related lncRNAs as independent prognostic factor for breast cancer patients
Univariate and multivariate Cox regression analyses were performed to assess whether the risk model of the above 12 BCSC-related lncRNAs was an independent prognostic factor for breast cancer. The HR of the risk score and 95% CI were 1.190 and 1.122–1.262 (p < 0.001) in univariate Cox regression analysis (Fig. 3a) and 1.162 and 1.074–1.258 (p < 0.001) in multivariate Cox regression analysis (Fig. 3b), respectively, indicating that the risk model of the 12 BCSC-related lncRNAs was the most significant prognostic factor for breast cancer, independent of clinicopathological parameters such as age, ER expression, PR expression, HER2 expression, molecular subtypes, TNM stage, tumor size, lymph node metastasis and distant metastasis. To evaluate the predictive specificity and sensitivity of the risk score on the prognosis of breast cancer patients, the area under the ROC curve (AUC) of the risk score was estimated. The AUC of the risk score was 0.813, followed by the AUC of age and more than the AUCs of other clinicopathological factors (Fig. 3c), suggesting that the prognostic risk model of the 12 BCSC-related lncRNAs for breast cancer was considerably reliable. All of the above results indicated that the 12 BCSC-related lncRNAs were significant independent prognostic factors for breast cancer patients.
Correlation of the expression of the 12 BCSC-related lncRNAs with clinicopathological factors
To further investigate whether the 12 BCSC-related lncRNAs were involved in the development of breast cancer, we assessed the association of the expression of the 12 BCSC-related lncRNAs with clinicopathological factors. There were significant correlations between most of the 12 BCSC-related lncRNAs and ER expression, PR expression and molecular subtypes, as shown in Fig. 4.
Different stemness statuses in the low-risk and high-risk groups
PCA was performed to compare the difference between the low-risk and high-risk groups based on the risk model of the 12 BCSC-related lncRNAs, 224 BCSC-related encoding genes and whole-genome expression profiles, respectively (Fig. 5). The results showed that the low-risk and high-risk groups based on the risk model were distributed in distinct directions, more obvious than the others, suggesting that the risk model could divide breast cancer patients into two parts and that the stemness status of breast cancer patients in the high-risk group differed from those in the low-risk group. Functional annotation was further conducted using GSEA, and the results showed that the differentially expressed genes between the high-risk and low-risk groups based on the risk model of the 12 BCSC-related genes were enriched in stemness-related processes and CSC-related pathways (Fig. 6). These results indicated that the low-risk and high-risk groups showed different stemness statuses.
In clinical practice, although the OS of breast cancer patients has made promising improvements, the occurrence of drug resistance to breast cancer has constantly increased. The increased frequency of chemoresistance and endocrine resistance might result from enrichment with BCSCs. Because large-scale current therapeutic strategies focus on eliminating the majority of non-CSCs, the residual population of chemo-resistant cancer cells that contributes to relapse and metastasis is thought to result from the existence of minimal residual CSCs [15, 16]. Thus, effective treatment aimed at BCSCs has promising prospects. This led us to find potential specific prognostic biomarkers and therapeutic targets for BCSCs. Recently, an increasing number of studies have suggested the crucial role of lncRNAs in regulating the stemness of CSCs and predicting prognosis in numerous cancers such as intestinal cancer, lung cancer, and hepatocellular cancer [17,18,19]. Similarly, an increasing number of lncRNAs have been implicated in the regulation of stemness in breast cancer. LncRNA-Hh strengthened CSC generation in twist-positive breast cancer by activating the Hedgehog signaling pathway . LncRNA FEZF1-AS1 promoted the stemness of BCSC tumorigenesis by targeting the miR-30a/Nanog axis . LncRNA HOTAIR contributed to the epithelial–mesenchymal transition (EMT) of BCSCs by activating the STAT3 signaling pathway . In this study, we identified the risk model of the 12 BCSC-related lncRNAs as an independent prognostic factor for breast cancer. To date, among these 12 BCSC-related lncRNAs, only LINC00578, LINC00668 and SEMA3B-AS1 have been studied in breast cancer or other cancers. It has been reported that LINC00578 is associated with worse OS in pancreatic cancer and lung adenocarcinoma [23, 24]. LINC00668 promoted tumorigenesis and progression and indicated poor prognosis in not only breast cancer but also other cancers, such as colorectal cancer, hepatocellular carcinoma and non-small-cell lung cancer [25,26,27,28]. SEMA3B-AS1 might serve as a new tumor suppressor to inhibit the development of hepatocellular carcinoma, esophageal squamous cell carcinoma and gastric cardia adenocarcinoma [29,30,31]. Consistent with our results, LINC00578 and LINC00668, as high-risk BCSC-related lncRNAs, were correlated with worse prognosis in breast cancer patients, whereas SEMA3B-AS1, as a low-risk BCSC-related lncRNA, was associated with better prognosis in breast cancer patients. Breast cancer is a complex disease and highly heterogeneous tumor . To assess survival prognosis and guide individual therapeutic decisions, breast cancer has been divided into distinct molecular subtypes based primarily on the expression status of hormonal receptors such as ER, PR, HER2 and Ki67 (tumor proliferation index) as follows: luminal A/B (ER and/or PR positive), HER2 enriched (HER2 positive) and triple-negative breast cancer (ER, PR and HER2 negative) . It is also well known that ER and/or PR positive indicates an effective endocrine therapy outcome and better survival prognosis; HER2 positive represents a more aggressive phenotype but is sensitive to HER2-targeted therapy; triple negative is enriched with BCSCs and associated with a worse prognosis due to the lack of effective therapeutic targets [34, 35]. Moreover, it was demonstrated that distinct molecular subtypes of breast cancer were enriched with different amounts of BCSCs . Thus, distinct ER, PR and HER2 statuses indicated different biological processes of breast cancer and survival outcomes. In line with the abovementioned findings, most of the 12 BCSC-related lncRNAs were remarkably associated with ER expression, PR expression and molecular subtypes, which further suggested that the 12 BCSC-related lncRNAs might be involved in the development and progression of breast cancer, and the risk model was also based on the intrinsic properties of breast cancer. More interestingly, there were no significant relationship between most of the 12 BCSCs-related lncRNAs and tumor size (T), lymph node (N) status, distant metastasis (M) and TNM stage, indicating that the risk model of the 12 BCSCs-related lncRNAs has no close correlation with the sooner or later for finding and diagnosis of breast cancer, but is only strongly linked to the intrinsic biological characteristics of distinct subtypes of breast cancer. Furthermore, the risk model was the second most statistically significant prognostic signature compared with other clinicopathological factors, and the ROC result (AUC = 0.813) confirmed that the risk model is reliable. Combined with the AUC of the risk model score, these results all indicated that the risk model of the 12 BCSC-related lncRNAs had superior prognostic value to other clinicopathological factors. The results of PCA and GSEA functional annotation illustrated that the high-risk and low- risk groups showed different distribution directions and aggregation centers based on the risk model, rather than the whole-genome expression profiles and BCSC-related genes expression profiles, indicating that the significant differences in OS between the high-risk and low-risk groups might result from different stemness and oncogenic statuses induced by the risk model of the 12 BCSC-related lncRNAs. Taken together, these results indicated that the prognostic signature of the 12 BCSC-related lncRNAs might be a feasible independent prognostic factor for breast cancer in clinical practice. To date, a key challenge of precision genomic medicine is to make reliable and accurate predictions of clinical outcomes from high-dimensional molecular data . To solve this problem, there have been some advances in Cox regression with prognostic value in recent years. A Cox elastic net has been used in objective and data-driven feature selection with time-to-event data . Cox-nnet is an artificial neural network approach that has been utilized in predicting low-dimensional survival prognosis . Bayesian-optimized deep survival models (SurvivalNet models) have successfully improved the accuracy of prognostic prediction for high-dimensional cancer genomic profiles . In addition, Cox-nnet has a better performance than SurvivalNet models, and SurvivalNet models provide comparable performance to the Cox elastic net [39, 40]. Moreover, Cox-PASNet, which is a novel pathway-based sparse deep neural network for survival analysis that integrates high-dimensional genomic data and clinical data, has been applied to identify significant prognostic factors . However, our study has some limitations. We applied traditional univariate and multivariate Cox proportional hazards analyses to establish and estimate the prognostic value of the risk model of the 12 BCSC-related lncRNAs. Although the method has been approved and employed in many researches, it is necessary to improve our further study with more advanced methodologies and technologies in the future. To further validate our bioinformatics prediction results, in-depth studies on the 12 BCSC-related lncRNAs, including functional experiments and molecular mechanisms, are needed.
In conclusion, we identified a BCSC-related lncRNA signature consisting of 12 lncRNAs (Z68871.1, LINC00578, AC097639.1, AP003119.3, AP001207.3, LINC00668, AL122010.1, AC245297.3, LINC01871, AP000851.2, AC022509.2 and SEMA3B-AS1), which can act as a novel independent prognostic factor for breast cancer. In the future, with prospective validation, the 12 BCSC-related lncRNA signature may improve the predictive accuracy and guide individual specific therapy for patients with breast cancer.
All data utilized in this study are included in this article and all data supporting the findings of this study are available on reasonable request from the corresponding author.
Long noncoding RNAs
The Cancer Genome Atlas
Receiver operating characteristics
Principal components analysis
Gene Set Enrichment Analysis
Human epidermal growth factor receptor 2
DeSantis CE, Ma J, Gaudet MM, Newman LA, Miller KD, Goding Sauer A, et al. Breast cancer statistics, 2019. CA Cancer J Clin. 2019;69(6):438–51.
Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394–424.
Wicha MS. Cancer stem cell heterogeneity in hereditary breast cancer. Breast Cancer Res. 2008;10(2):105.
Bianchini G, Balko JM, Mayer IA, Sanders ME, Gianni L. Triple-negative breast cancer: challenges and opportunities of a heterogeneous disease. Nat Rev Clin Oncol. 2016;13(11):674–90.
Batlle E, Clevers H. Cancer stem cells revisited. Nat Med. 2017;23(10):1124–34.
Peitzsch C, Tyutyunnykova A, Pantel K, Dubrovska A. Cancer stem cells: the root of tumor recurrence and metastases. Semin Cancer Biol. 2017;44:10–24.
Fazal FM, Chang HY. lncRNA structure: message to the heart. Mol Cell. 2016;64(1):1–2.
Kim J, Piao HL, Kim BJ, Yao F, Han Z, Wang Y, et al. Long noncoding RNA MALAT1 suppresses breast cancer metastasis. Nat Genet. 2018;50(12):1705–15.
Yang F, Fang E, Mei H, Chen Y, Li H, Li D, et al. Cis-Acting circ-CTNNB1 promotes beta-catenin signaling and cancer progression via DDX3-mediated transactivation of YY1. Cancer Res. 2019;79(3):557–71.
Wang Z, Yang B, Zhang M, Guo W, Wu Z, Wang Y, et al. lncRNA epigenetic landscape analysis identifies EPIC1 as an oncogenic lncRNA that interacts with MYC and promotes cell-cycle progression in cancer. Cancer Cell. 2018;33(4):706–20.
Slack FJ, Chinnaiyan AM. The role of non-coding RNAs in oncology. Cell. 2019;179(5):1033–55.
Schober P, Boer C, Schwarte LA. Correlation coefficients: appropriate use and interpretation. Anesth Analg. 2018;126:1763–8.
Kim S, Kang D, Huo Z, Park Y, Tseng GC. Meta-analytic principal component analysis in integrative omics application. Bioinformatics. 2018;34:1321–8.
Li Z, Safo SE, Long Q. Incorporating biological information in sparse principal component analysis with application to genomic data. BMC Bioinform. 2017;18:332.
Creighton CJ, Li X, Landis M, Dixon JM, Neumeister VM, Sjolund A, et al. Residual breast cancers after conventional therapy display mesenchymal as well as tumor-initiating features. Proc Natl Acad Sci USA. 2009;106(33):13820–5.
Dean M, Fojo T, Bates S. Tumour stem cells and drug resistance. Nat Rev Cancer. 2005;5(4):275–84.
Wang R, Dong HX, Zeng J, Pan J, Jin XY. LncRNA DGCR5 contributes to CSC-like properties via modulating miR-330-5p/CD44 in NSCLC. J Cell Physiol. 2018;233(9):7447–566.
Zhu P, Wu J, Wang Y, Zhu X, Lu T, Liu B, et al. LncGata6 maintains stemness of intestinal stem cells and promotes intestinal tumorigenesis. Nat Cell Biol. 2018;20(10):1134–44.
Zhao X, Sun B, Liu T, Shao B, Sun R, Zhu D, et al. Long noncoding RNA n339260 promotes vasculogenic mimicry and cancer stem cell development in hepatocellular carcinoma. Cancer Sci. 2018;109(10):3197–208.
Zhou M, Hou Y, Yang G, Zhang H, Tu G, Du YE, et al. LncRNA-Hh strengthen cancer stem cells generation in twist-positive breast cancer via activation of hedgehog signaling pathway. Stem Cells. 2016;34(1):55–66.
Zhang Z, Sun L, Zhang Y, Lu G, Li Y, Wei Z. Long non-coding RNA FEZF1-AS1 promotes breast cancer stemness and tumorigenesis via targeting miR-30a/Nanog axis. J Cell Physiol. 2018;233(11):8630–8.
Zhang H, Cai K, Wang J, Wang X, Cheng K, Shi F, et al. MiR-7, inhibited indirectly by lincRNA HOTAIR, directly inhibits SETDB1 and reverses the EMT of breast cancer stem cells by downregulating the STAT3 pathway. Stem Cells. 2014;32(11):2858–68.
Zhang B, Li C, Sun Z. Long non-coding RNA LINC00346, LINC00578, LINC00673, LINC00671, LINC00261, and SNHG9 are novel prognostic markers for pancreatic cancer. Am J Transl Res. 2018;10(8):2648–58.
Wang L, Zhao H, Xu Y, Li J, Deng C, Deng Y, et al. Systematic identification of lincRNA-based prognostic biomarkers by integrating lincRNA expression and copy number variation in lung adenocarcinoma. Int J Cancer. 2019;144(7):1723–34.
Qiu X, Dong J, Zhao Z, Li J, Cai X. LncRNA LINC00668 promotes the progression of breast cancer by inhibiting apoptosis and accelerating cell cycle. Onco Targets Ther. 2019;12:5615–25.
Yan S, Yue Y, Wang J, Li W, Sun M, Gu C, et al. LINC00668 promotes tumorigenesis and progression through sponging miR-188-5p and regulating USP47 in colorectal cancer. Eur J Pharmacol. 2019;858:172464.
Wang X, Zhou X, Liu J, Liu Z, Zhang L, Gong Y, et al. Genomewide investigation of the clinical implications and molecular mechanism of long noncoding RNA LINC00668 and proteincoding genes in hepatocellular carcinoma. Int J Oncol. 2019;55(4):860–78.
Jiang R, Hu C, Li Q, Cheng Z, Gu L, Li H, et al. Sodium new houttuyfonate suppresses metastasis in NSCLC cells through the Linc00668/miR-147a/slug axis. J Exp Clin Cancer Res. 2019;38(1):155.
Zhong Y, Li Y, Song T, Zhang D. MiR-718 mediates the indirect interaction between lncRNA SEMA3B-AS1 and PTEN to regulate the proliferation of hepatocellular carcinoma cells. Physiol Genomics. 2019;51(10):500–5.
Dong Z, Liang X, Wu X, Kang X, Guo Y, Shen S, et al. Promoter hypermethylation-mediated downregulation of tumor suppressor gene SEMA3B and lncRNA SEMA3B-AS1 correlates with progression and prognosis of esophageal squamous cell carcinoma. Clin Exp Metastasis. 2019;36(3):225–41.
Guo W, Liang X, Liu L, Guo Y, Shen S, Liang J, et al. MiR-6872 host gene SEMA3B and its antisense lncRNA SEMA3B-AS1 function synergistically to suppress gastric cardia adenocarcinoma progression. Gastric Cancer. 2019;22(4):705–22.
Yeo SK, Guan JL. Breast Cancer: Multiple Subtypes within a Tumor? Trends Cancer. 2017;3(11):753–60.
Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature. 2000;406(6797):747–52.
Goldhirsch A, Winer EP, Coates AS, Gelber RD, Piccart-Gebhart M, Thurlimann B, et al. Personalizing the treatment of women with early breast cancer: highlights of the St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2013. Ann Oncol. 2013;24(9):2206–23.
Coates AS, Winer EP, Goldhirsch A, Gelber RD, Gnant M, Piccart-Gebhart M, et al. Tailoring therapies–improving the management of early breast cancer: St Gallen international expert consensus on the primary therapy of early breast cancer 2015. Ann Oncol. 2015;26(8):1533–46.
Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J. 2015;13:8–17.
Simon N, Friedman J, Hastie T, Tibshirani R. Regularization paths for Cox's proportional hazards model via coordinate descent. J Stat Softw. 2011;39:1–13.
Ching T, Zhu X, Garmire LX. Cox-nnet: An artificial neural network method for prognosis prediction of high-throughput omics data. PLoS Comput Biol. 2018;14:e1006076.
Yousefi S, Amrollahi F, Amgad M, Dong C, Lewis JE, Song C, et al. Predicting clinical outcomes from large scale cancer genomic profiles with deep survival models. Sci Rep. 2017;7:11707.
Hao J, Kim Y, Mallavarapu T, Oh JH, Kang M. Interpretable deep neural network for cancer survival analysis by integrating genomic and clinical data. BMC Med Genomics. 2019;12:189.
The data of this study was download from The Cancer Genome Atlas (TCGA), we gratefully acknowledge the patients and operations.
This study was supported by grants from the National Natural Science Foundation of China (Nos. 81773163, 81702881).
Ethics approval and consent to participate
Consent for publication
The authors declare no conflicts of interest.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1.
The list of 1198 BCSC-related lncRNAs.
Additional file 2.
The list of 213 BCSC-related mRNAs.
Additional file 3: Fig. S1.
Protein-protein interaction network of the 213 BCSC-related encoding genes (mRNAs).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Li, X., Li, Y., Yu, X. et al. Identification and validation of stemness-related lncRNA prognostic signature for breast cancer. J Transl Med 18, 331 (2020). https://doi.org/10.1186/s12967-020-02497-4