Molecular network-based identification of competing endogenous RNAs and mRNA signatures that predict survival in prostate cancer

Background The aim of the study is described the regulatory mechanisms and prognostic values of differentially expressed RNAs in prostate cancer and construct an mRNA signature that predicts survival. Methods The RNA profiles of 499 prostate cancer tissues and 52 non-prostate cancer tissues from TCGA were analyzed. The differential expression of RNAs was examined using the edgeR package. Survival was analyzed by Kaplan–Meier method. microRNA (miRNA), messenger RNA (mRNA), and long non-coding RNA (lncRNA) networks from the miRcode database were constructed, based on the differentially expressed RNAs between non-prostate and prostate cancer tissues. Results A total of 773 lncRNAs, 1417 mRNAs, and 58 miRNAs were differentially expressed between non-prostate and prostate cancer samples. The newly constructed ceRNA network comprised 63 prostate cancer-specific lncRNAs, 13 miRNAs, and 18 mRNAs. Three of 63 differentially expressed lncRNAs and 1 of 18 differentially expressed mRNAs were significantly associated with overall survival in prostate cancer (P value < 0.05). After the univariate and multivariate Cox regression analyses, 4 mRNAs (HOXB5, GPC2, PGA5, and AMBN) were screened and used to establish a predictive model for the overall survival of patients. Our ROC curve analysis revealed that the 4-mRNA signature performed well. Conclusion These ceRNAs may play a critical role in the progression and metastasis of prostate cancer and are thus candidate therapeutic targets and potential prognostic biomarkers. A novel model that incorporated these candidates was established and might provide more powerful prognostic information in predicting survival in prostate cancer.


Background
In men, prostate cancer remains the second leading cause of deaths due to cancer in the US [1]. Approximately 26,000 men were expected to die from prostate cancer in 2016 [2]. Siegel et al. [2] also estimated that many patients with advanced prostate cancer will develop castrationresistant prostate cancer (CRPC). Previous studies [3][4][5][6] have reported that there are several treatment options for CRPC, including chemotherapy, androgen receptortargeted agents, and radiopharmaceuticals. Nevertheless, there are currently no effective biomarkers for the early diagnosis and treatment of prostate cancer.
Morphological, immunological, and molecular features have been used to predict the progression and prognosis of prostate cancers [7,8]. Over the past several decades, urologists have devoted much effort toward identifying prostate cancer-related protein-coding genes [9]. However, only approximately 2% of all transcripts in mammals are protein-coding RNAs [10]. Thus, the functions of non-coding RNAs should be examined [11]. Previous studies [12][13][14][15][16] proposed a competing endogenous RNA (ceRNA) hypothesis, which described an intricate posttranscriptional regulatory network in which mRNAs, lncRNAs, and other RNAs act as natural miRNA sponges to weaken the function of miRNA via sharing one or more miRNA response elements.
In this study, a ceRNA network was constructed to identify the ceRNAs that are involved in prostate cancer using data from the TCGA database. The RNA profiles of 499 prostate cancer tissues and 52 non-prostate cancer tissues were analyzed. Finally, a prostate cancerassociated ceRNA network was established, based on our bioinformatics prediction and correlation analysis, consisting of 63 lncRNAs, 13 miRNAs, and 18 mRNAs. We examined the functions of the differentially expressed miRNAs that we identified and developed a novel model using several candidates to predict survival in prostate cancer patients. This study aimed to identify prostate cancer-specific RNAs as ceRNAs that regulate target genes and are involved in the pathogenesis and prognosis of prostate cancer.

Data collection
RNA profiles of prostate cancer and control samples were downloaded from the genomic data commons (GDC) data portal and the cancer genome atlas (TCGA, https ://tcga-data.nci.nih.gov/tcga/) database. A total of 551 samples were collected, comprising 499 primary prostate cancer samples and 52 normal solid tissue samples.

Differential gene expression analysis
mRNA, lncRNA, and miRNA expression in the prostate cancer samples were analyzed using the RNASeqV2 and Illumina HiSeq 2000 miRNA sequencing platforms. Samples were divided into prostate cancer tissues versus adjacent non-tumor tissues to identify differentially expressed RNAs using edgeR. Differences in the expression of each RNA between prostate cancer and adjacent non-tumor tissue were expressed as fold-change and the associated P value. Downregulated and upregulated RNAs were defined as those that decreased and increased by a foldchange of > 1.5, respectively, with an FDR-adjusted P of < 0.05.

Survival analysis and definition of mRNA-related prognostic model
The association between differentially expressed mRNAs and overall survival was evaluated by univariate Cox proportional hazards regression analysis using the R survival package. Only mRNAs with P < 0.01 were considered to be candidates and selected for multivariate Cox regression analysis. The best explanatory and most informative predictive model was identified using Akaike Information Criterion (AIC), which assesses the goodness of fit of a statistical model.

Gene ontology and pathway analysis
To understand the underlying biological processes and pathways between differentially expressed genes in the ceRNA network, the database for annotation, visualization, and integrated discovery (DAVID) (http://david .abcc.ncifc rf.gov/) was used to perform functional enrichment analysis. Then, significantly differentially expressed mRNAs were analyzed in the gene ontology (GO) database (http://www.geneo ntolo gy.org). Finally, significantly enriched GO terms were selected to analyze their biological function. The kyoto encyclopedia of genes and genomes (KEGG; http://www.kegg.jp/) was used to perform the pathway enrichment analysis.

Survival analysis of key members in the ceRNA network
The clinical data on the patients were combined with prostate cancer data in TCGA to evaluate the prognostic value of differential RNAs in the ceRNA network. Survival curves were generated using the survival package in R for samples with differentially expressed mRNAs, lncR-NAs, and miRNAs. Survival was analyzed by Kaplan-Meier method, and P values < 0.05 were considered to be significant.

Identification of significantly differentially expressed lncRNAs
In this study, 551 samples were obtained from the TCGA database. Differential expression was analyzed by comparing the expression of 14,254 lncRNAs in prostate cancer and adjacent normal prostate tissues in the TCGA database. Fold-change > 1.5 and P value < 0.05 were set as cutoffs to identify significantly differentially expressed lncRNAs. As a result, 773 differentially expressed lncRNAs between prostate cancer and adjacent normal prostate tissue were obtained-of which 414 were upregulated and 359 were downregulated ( Fig. 1; Table 1).

Predictions of mRNAs and lncRNAs targeted by miRNAs
Next, we predicted the mRNAs and lncRNAs that were targeted by miRNAs, focusing on the relationship between the 58 differentially expressed miRNAs and 773 differentially expressed lncRNAs above. Only 13 of 58 differentially expressed miRNAs were predicted to target 63 of 773 differentially expressed lncRNAs.
The relationships between these 13 differentially expressed lncRNA-targeting miRNAs were used to predict the targeted mRNAs using Targetscan, miRTarBase, and miRDB. Then, 13 prostate cancer-specific miRNAs were predicted to target the 644 mRNAs. After 644 mRNAs were found, the intersection of 644 mRNAs and 19,660 differentially expressed mRNAs between prostate cancer and adjacent normal prostate tissue were performed. Finally, 18 mRNAs were obtained from the 644 mRNAs. Overall, 63 lncRNAs, 13 miRNAs, and 18 mRNAs were selected to construct the lncRNA-miRNA-mRNA ceRNA network using Cytoscape 3.8.5 ( Fig. 4; Tables 4 and 5).

Survival analysis with differentially expressed lncRNAs
To examine the relationship between the differentially expressed lncRNAs and the prognosis of patients with prostate cancer, the link between overall survival and the 63 differentially expressed lncRNAs in prostate cancer patients was analyzed by Kaplan-Meier method. Three of 63 differentially expressed lncRNAs were linked to the prognosis in prostate cancer: LINC00355 and lncRNA OSTN-AS1 were positively associated with overall survival, whereas LINC00308 correlated negatively with it (log-rank P < 0.05) (Fig. 5).

Establishment of a 4-mRNA signature associated with overall survival in prostate cancer patients
Univariate Cox regression analysis was first used to identify prognosis-related mRNAs, identifying 21 mRNAs that were significantly related to overall survival (P < 0.01). Then, multivariate Cox regression was performed, and 4 mRNAs were ultimately selected to establish a predictive model. The predictive model was defined as the linear combination of the expression levels of the 4 mRNAs, which were weighted using the corresponding relative coefficient in the multivariate Cox regression as follows: survival risk score = (0.420 × expression value of HOXB5 + 0.794 × expression value of GPC2 + 0.947 × PGA5 + 0.473 × AMBN). All 4 mRNAs had positive coefficients in the Cox regression analysis, indicating that their high expression was associated with shorter overall survival in prostate cancer patients.

Risk stratification and ROC curve analysis
The 4-mRNA expression-based survival risk score was used to assign patients into a low-risk or high-risk group using a median risk score of 0.9558 as the cutoff. Ultimately, a total of 247 patients were assigned to the high-risk group, versus 248 in the low-risk group (Fig. 6a). The Kaplan-Meier curves for overall survival demonstrated that there was a significant difference between the 2 groups, based on the 4 mRNAs (Fig. 6b). The 5-year and 10-year overall survival rates were 96.0% and 46.3% in the high-risk group, respectively. The prognostic power of the 4-mRNA signature was evaluated using the area under the ROC curve. In this study, the area under the ROC curve was 0.904, indicating good sensitivity and specificity of the 4-mRNA signature in predicting survival in prostate cancer patients ( Fig. 6c; Table 6).

Functional assessment
The functions of the differentially expressed mRNAs in the ceRNA network were determined using DAVID bioinformatics resources. The results demonstrated that 7 GO terms and 19 enriched KEGG pathways were involved in the ceRNA network ( Fig. 7; Table 7).

Discussion
Differentially expressed lncRNAs that correlated significantly with OS were identified by constructing an lncRNA-miRNA-mRNA ceRNA network, based on specific criteria in a large sample of prostate cancer patients in the TCGA database. Thus, there are potential interactions between mRNAs, lncRNAs, and miRNAs in the progression and metastasis of prostate cancer. In this study, ceRNA networks for prostate cancer were built by bioinformatics prediction and correlation analysis of data on significantly differentially expressed mRNAs, lncRNAs, and miRNAs. Further, considering the associations between cancer-specific ceRNAs and clinical characteristics, 3 lncRNAs (LINC00308, OSTN-AS1 and LINC00355) were related to the clinical prognosis. Moreover, 4 mRNAs (HOXB5, GPC2, PGA5, and AMBN) which screened to establish a predictive model were also associated with the clinical prognosis. Both 3 lncRNAs and 4 mRNAs are important because these RNAs are associated with overall survival of patients. These RNAs might provide more powerful prognostic information in predicting survival in prostate cancer.
The mechanisms that underlie the progression and metastasis of prostate cancer remain unknown. However, our understanding of the genesis and characteristics of prostate cancer has grown because of the development of high-throughput sequencing and bioinformatics. Recently, Liu et al. [17] revealed that miRNA genes can be considered tumor suppressor genes and novel oncogenes that are involved in the progression and metastasis of carcinomas. Liu et al. [17] also demonstrated that miR-141 employs several mechanisms to reduce the growth and metastasis of prostate cancer. Liu et al. [18] reported that the microRNA miR-34a inhibits the regeneration and metastasis of prostate cancer by repressing CD44 directly. Tinay et al. [19] demonstrated that 3 miRNAs are significantly overexpressed in serum from prostate cancer patients versus those without cancer. In this study, 58 miRNAs were significantly differentially expressed in prostate cancer compared with adjacent non-tumorous tissues.
lncRNAs are potential biomarkers in carcinogenesis and have significant advantages as diagnostic and prognostic biomarkers [20]. Previous research has confirmed that differentially expressed lncRNAs correlate with the progression and metastasis of carcinomas [21,22]. Ramnarine et al. [23] reported that the lncRNAs FEN-DRR, H19, LINC00514, LINC00617, and SSTR5-AS1 are involved in the development of neuroendocrine prostate cancer. Zhang et al. [24] found that cell proliferation in hormone-refractory prostate cancer is promoted by the lncRNA PCGEM1. In this study, 773 lncRNAs were identified. LINC00355 and OSTN-AS1 were positively associated with overall survival, whereas LINC00308 correlated negatively with overall survival. LINC00355, OSTN-AS1, and LINC00308 were included in the ceRNA network, suggesting that these lncRNAs play an important role in the progression and prognosis of prostate cancer.
Only 1 of 18 differentially expressed mRNAs (RRM2), which constructed of ceRNA networks, were significantly associated with overall survival in prostate cancer. Although RRM2 has been studied in colorectal cancer [25], non-small cell lung cancer [26], pancreatic cancer [27], adrenocortical cancer [28], and cervical cancer [29]. However, the role of RRM2 in prostate cancer has not been established yet. In this study, the higher expression of RRM2 was associated with worse survival outcome in prostate cancer. Chang et al. [25] demonstrated that overexpression of RRM2 was associated with survival and recurrence in colorectal cancer patients with k-ras mutation. Yoshida et al. [30] found that the upregulation of RRM2 was essential for the proliferation of colorectal cancer cell lines. Rahman et al. [26] indicated that knockdown of RRM2 was associated with apoptosis of head and neck squamous cell carcinoma and non-small cell lung cancer. These finds mentioned above suggested that RRM2 may be a potential prognostic targets in prostate cancer.
However, there are no reports on the correlation between LINC00308 and disease. Moreover, the function of LINC00308 has not been examined. Thus, the genes that are related to LINC00308 were predicted by constructing an lncRNA-miRNA-mRNA network. The results demonstrated that 2 miRNAs (has-mir-137 and   Rectangles represent miRNAs, ellipses represent protein-coding genes, and diamonds represent lncRNAs; gray edges indicate lncRNA-miRNA-mRNA interactions has-mir-93-5p) are associated with LINC00308. The target genes of these 2 miRNAs were then predicted, resulting in 29 has-mir-137 target genes and 385 has-mir-93-5p target genes. We found three common hits between the target genes of these 2 miRNAs: RORA, GIGYF1, and NCOA3. Mocellin et al. [31] reported that RORA is significantly associated with the risk of breast carcinoma, prostate carcinoma, and lung carcinoma. Zhu et al. [32] also found that RORA is a common fragile site gene that is inactivated in several carcinomas and is involved in responses to cellular stress. Moretti et al. [33] reported that RORA is a molecular target for the development of chemotherapeutic strategies for prostate carcinoma. Ajiro et al. [34] demonstrated that the phosphorylation of Akt at Ser 473 is significantly reduced after GIGYF1 knockdown in breast cancer cell lines. Tong et al. [35] revealed that NCOA3 is overexpressed in human hepatocellular carcinoma specimens and promotes the proliferation of human hepatocellular carcinoma. Ngollo et al. [36] showed that NCOA3 is upregulated in prostate cancer compared with normal prostate tissues. Moreover, the expression of NCOA3 also correlates with Gleason score, clinical stage, and PSA levels. Conventional prognostic systems generally make insufficient predictions for risk stratification and estimations of clinical outcome because of the heterogeneity between patients. Thus, in recent decades, much effort has been made to establish a novel prognostic model to improve the prediction of survival in prostate cancer patients [37][38][39]. In this study, we generated a 4-mRNA signature that predicted the clinical outcome of prostate cancer. To the best of our knowledge, this is the first mRNA-related predictive model that is based on TCGA RNA-seq data from 495 prostate cancer patients. These 4 mRNAs were identified to establish a predictive model that is based on their linear combination. A significant difference of survival rate was observed between the high-risk and lowrisk groups. In the ROC analysis, the AUC was 0.904, indicating high sensitivity and specificity of the mRNA signature. The GCP2 has been explored in several studies [40][41][42]. However, the role of GCP2 in prostate cancer has not been elucidated yet. Dráberová et al. [41] reported that the immunoreactivity of GCP2 was significantly increased in glioblastoma cells than that in normal brains cells. The GCP2 was also related to the progress of the microvascular proliferation. The dysregulation of GCP2 in glioblastomas may also associated with the alteration of transcriptional checkpoint activity.