Expression profiles (GSE34933 from NCBI GEO) for PCa and benign prostatic hyperplasia (BPH) samples generated by Zhong and colleagues [19, 20] were used. Eight available paired miRNA and mRNA expression profiles (4 PCa and 4 BPH samples) were selected for further analysis. Information on these profiles is provided in Additional file 1. Normalized miRNA and mRNA data were downloaded directly. For genes represented by multiple probes in the mRNA expression data, the average probe intensity was used as the gene expression level. The final profiles covered the expression of 851 miRNAs and 19595 genes.
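The probe-averaging step can be illustrated with a minimal sketch. It assumes the normalized data have already been parsed into plain dictionaries; the function name `collapse_probes` and the `probe_to_gene` annotation map are hypothetical and are not part of the original pipeline.

```python
from collections import defaultdict
from statistics import fmean


def collapse_probes(probe_values, probe_to_gene):
    """Average probe-level intensities into one value per gene and sample.

    probe_values:  dict probe_id -> list of intensities, one per sample
    probe_to_gene: dict probe_id -> gene symbol (hypothetical annotation map)
    """
    rows_by_gene = defaultdict(list)
    for probe, values in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:  # probes without a gene annotation are skipped
            rows_by_gene[gene].append(values)
    # zip(*rows) iterates sample by sample across all probes of one gene
    return {
        gene: [fmean(column) for column in zip(*rows)]
        for gene, rows in rows_by_gene.items()
    }
```

Genes with a single probe pass through unchanged, since the mean of one value is the value itself.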
The second dataset used in this study was a miRNA-mRNA target network that combined experimentally validated targeting data with computational prediction data. The experimentally validated data included information from miRecords, TarBase, miR2Disease, and miRTarBase, while the computational prediction data consisted of miRNA-mRNA target pairs present in at least two of HOCTAR, ExprTargetDB, and starBase. In total, there were 32739 regulatory pairs among 641 miRNAs and 7706 target genes.
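As an illustration of how such a network can be assembled, the sketch below takes the union of the validated pairs and keeps a predicted pair only if it is supported by at least two prediction databases. It assumes each database has already been reduced to a set of (miRNA, gene) tuples; `build_network` and `min_support` are illustrative names, not the authors' code.

```python
from collections import Counter


def build_network(validated_sources, predicted_sources, min_support=2):
    """Combine validated and predicted (miRNA, gene) pairs into one edge set.

    validated_sources: list of sets of (miRNA, gene) tuples, one set per database
    predicted_sources: list of sets of (miRNA, gene) tuples, one set per database
    min_support:       number of prediction databases that must contain a pair
    """
    validated = set().union(*validated_sources)
    # Each source is a set, so a pair is counted at most once per database
    support = Counter(pair for source in predicted_sources for pair in source)
    predicted = {pair for pair, n in support.items() if n >= min_support}
    return validated | predicted
```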
Prostate cancer miRNA biomarker identification
We developed a novel approach to identify candidate miRNA biomarkers for PCa. The workflow of the pipeline is shown in Figure 1. Paired miRNA and gene expression profiles were integrated with the miRNA-mRNA network to predict outlier miRNAs associated with PCa progression. The procedure consisted of four steps. First, differentially expressed miRNAs and genes between PCa and BPH samples were detected using the two-sample t-test. Second, Pearson’s correlation was used to detect negative correlations between the expression profiles of outlier miRNAs and outlier genes. Third, the negatively correlated pairs were intersected with the miRNA-mRNA binding pairs to identify miRNA regulatory networks related to PCa progression. Fourth, a new index designated the novel out-degree (NOD) was defined to measure the independent regulatory power of an individual miRNA and was used to prioritize novel PCa miRNA biomarkers.
Step 1: Detection of differentially expressed miRNAs and genes associated with prostate cancer
The detection of cancer-specific abnormal changes in miRNA and gene expression is a common aim of cancer studies [28–31]. Here, we used two-sample t-tests to identify differentially expressed miRNAs and genes associated with PCa progression on the basis of their expression profiles. The top 30% of miRNAs (or genes), ranked by statistical significance (p-value), were retained for further analysis. As a result, 256 miRNAs and 5878 genes were considered candidate PCa outliers.
The choice of threshold for outlier miRNAs and outlier genes is debatable. We therefore also tested a less stringent cut-off (top 40%) and a stricter cut-off (top 20%) for candidate miRNA biomarker prediction. Details of the comparison between these predictions are listed in Additional file 2. The prediction results were highly conserved across thresholds, and only the number of candidate miRNAs changed. We therefore adopted the moderate threshold (top 30%) in the present study.
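The selection in Step 1 can be sketched as follows. The sketch assumes a pooled-variance (Student's) t-test and rounds the number of retained features up; neither detail is stated in the text. Because every feature is measured on the same samples, the degrees of freedom are identical across features, so ranking by |t| gives the same order as ranking by two-sided p-value. The names `t_statistic` and `top_fraction` are hypothetical.

```python
import math
from statistics import fmean, variance


def t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance (an assumption)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    diff = fmean(a) - fmean(b)
    if pooled == 0:  # both groups constant: no evidence if means are equal
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(pooled * (1 / na + 1 / nb))


def top_fraction(expr, tumor_idx, benign_idx, fraction=0.30):
    """Return the most significant `fraction` of features, best first.

    expr: dict feature name -> list of expression values, one per sample
    tumor_idx / benign_idx: positions of the PCa and BPH samples in each list
    """
    scores = {}
    for name, values in expr.items():
        tumor = [values[i] for i in tumor_idx]
        benign = [values[i] for i in benign_idx]
        scores[name] = abs(t_statistic(tumor, benign))
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = math.ceil(fraction * len(ranked))  # rounding up is an assumption
    return ranked[:keep]
```

Changing `fraction` to 0.20 or 0.40 corresponds to the stricter and less stringent cut-offs discussed above.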
Step 2: Acquisition of inverse correlation pairs
One major function of miRNAs is the cleavage of the transcripts of their target genes at the post-transcriptional level. An inverse correlation between expression profiles should therefore be a prerequisite for a miRNA and its candidate targets. In the present study, Pearson’s correlation was used to detect negative correlations between outlier miRNAs and outlier genes. The cut-off for the correlation coefficient was set at -0.6, a threshold that has been used in several correlation studies [32, 33].
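A minimal sketch of this step, assuming the outlier profiles are held as dictionaries of equal-length value lists. Whether the cut-off is inclusive is not stated in the text, so the `<=` comparison below is an assumption, as is treating a constant profile as uncorrelated.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined for a constant profile; treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def negative_pairs(mirna_expr, gene_expr, cutoff=-0.6):
    """Return the (miRNA, gene) pairs whose correlation is at or below `cutoff`."""
    return {
        (mirna, gene)
        for mirna, m_values in mirna_expr.items()
        for gene, g_values in gene_expr.items()
        if pearson(m_values, g_values) <= cutoff
    }
```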
Step 3: Constructing a prostate cancer miRNA-mRNA binding network
Possible human miRNA-mRNA target pairs were taken from the experimentally validated and computationally predicted binding data described above. These target pairs were then filtered by the miRNA-mRNA negative correlations to generate a PCa miRNA regulatory network. The resulting miRNA-mRNA target sub-network consisted of 136 miRNAs and 551 target genes.
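In set terms, this filtering is an intersection of two edge sets. The following sketch assumes both inputs are collections of (miRNA, gene) tuples; the function name is illustrative.

```python
def regulatory_network(target_pairs, correlated_pairs):
    """Keep the target pairs that are also negatively correlated.

    Returns the retained edges together with the miRNAs and genes they involve.
    """
    edges = set(target_pairs) & set(correlated_pairs)
    mirnas = {mirna for mirna, _ in edges}
    genes = {gene for _, gene in edges}
    return edges, mirnas, genes
```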
Step 4: Prioritizing candidate prostate cancer miRNA biomarkers
Predicting cancer-related miRNAs from miRNA-mRNA regulatory data faces two main challenges. First, when an abnormally expressed gene is regulated by more than one miRNA, it is difficult to determine which miRNA contributed to its deregulation. Second, besides miRNA regulation, other factors such as DNA methylation may also result in abnormal expression of the gene. To overcome these problems, we defined a novel out-degree (NOD) index to measure the independent regulatory power of an individual miRNA, i.e., the number of genes uniquely regulated by that miRNA. Based on the observation that miRNAs with greater independent regulatory power are more likely to be cancer biomarkers, as described in the Results section, we prioritized candidate PCa miRNA biomarkers according to their NOD values calculated from the PCa miRNA regulatory network.
In summary, the number of uniquely regulated genes was first computed as the NOD value of each miRNA in the PCa miRNA regulatory network, and the miRNAs were ranked by their NOD values. The Wilcoxon signed-rank test was then applied to assign each miRNA a statistical significance value (p-value) indicating whether its NOD value was significantly greater than the median level of all candidate miRNAs. The p-value threshold was set at 0.01. In total, 39 miRNAs were identified as potential PCa miRNA biomarkers in our study.
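The NOD computation and ranking can be sketched as below, assuming the regulatory network is available as a set of (miRNA, gene) edges. The sketch covers only the counting and ranking; the significance test is left out because the text does not give enough detail on its setup to reproduce it. The function name and the alphabetical tie-breaking are assumptions.

```python
from collections import defaultdict


def novel_out_degree(edges):
    """Rank miRNAs by the number of genes that only they regulate.

    edges: iterable of (miRNA, gene) tuples from the regulatory network
    Returns a list of (miRNA, NOD) tuples, highest NOD first.
    """
    edges = list(edges)
    regulators = defaultdict(set)
    for mirna, gene in edges:
        regulators[gene].add(mirna)
    nod = {mirna: 0 for mirna, _ in edges}  # miRNAs without unique targets get 0
    for mirnas in regulators.values():
        if len(mirnas) == 1:  # gene has exactly one regulating miRNA
            (only,) = mirnas
            nod[only] += 1
    # Ties are broken alphabetically, which is an arbitrary choice
    return sorted(nod.items(), key=lambda item: (-item[1], item[0]))
```

In this toy representation, a gene targeted by three miRNAs contributes to none of their NOD values, which reflects the idea that its deregulation cannot be attributed to a single miRNA.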
Performance comparison with other computational methods
To evaluate the accuracy of our method, we compared its performance with that of two other computational approaches: the miRNA expression fold-change method based on the t-test, and a method based on the cancer miRNA synergism theory. For comparison, the same number of top-ranked miRNAs as in our prediction results was extracted from each of these two methods. The performance of each computational method was expressed as the percentage of known PCa abnormal miRNAs among its predictions.
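The performance measure can be written as a short function. This is a sketch under the assumption that each method yields a ranked list of miRNA names and that the known PCa miRNAs are available as a set; `known_percentage` is a hypothetical name.

```python
def known_percentage(ranked_mirnas, known_mirnas, k):
    """Percentage of the top-k predicted miRNAs that are known PCa miRNAs."""
    top = ranked_mirnas[:k]
    if not top:
        return 0.0
    hits = sum(1 for mirna in top if mirna in known_mirnas)
    return 100.0 * hits / len(top)
```

Using the same `k` for every method keeps the comparison on equal footing, as described above.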
In vitro q-PCR confirmation of candidate prostate cancer miRNA biomarkers
When normal prostate tissue (NPT) samples are unavailable, benign prostatic hyperplasia (BPH) samples can be used as normal prostate samples for comparison with PCa samples [35, 36]. The study group consisted of 25 Han Chinese patients with PCa and 20 Han Chinese individuals with BPH, aged 60 to 91 years. The PCa and BPH samples were part of a sample set collected for clinical diagnostic tests at the First Affiliated Hospital of Soochow University (Suzhou, China). No extra samples were collected from the study subjects; therefore, verbal consent was obtained from all participating individuals. The study procedure was approved by the ethics committee of Soochow University.
The PCa and BPH tissues were snap-frozen in liquid nitrogen and stored at -80°C. Total RNA was extracted with the TRIzol reagent (Invitrogen, China). RNA quantity was measured on a Nanodrop 1000 Spectrophotometer (Thermo Scientific, China). For universal reverse transcription of all mature miRNAs, the miRNAs were first tailed enzymatically with Poly(A) Polymerase and then reverse transcribed using universal primers. The sequences of miRNAs were obtained from the miRNAMap database. MiRNA-specific primers were designed with Primer 3 software. Quantitative PCR was performed in a volume of 20 μl containing 2 μl of 10-fold diluted cDNA, 10 μl of LightCycler® 480 SYBR Green I Master (Roche, China), and 200 nM of each primer. U6 RNA was used as the internal control, and all quantitative PCR values were normalized to those of U6. All reactions were run in triplicate on a LightCycler® 480 System (Roche, China). Relative expression was analyzed by the Pfaffl method. All statistical analyses were carried out with GraphPad Prism software.
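For reference, the Pfaffl method computes the relative expression ratio as E_target^(ΔCt_target) / E_ref^(ΔCt_ref), where each ΔCt is the control Ct minus the sample Ct and E is the amplification efficiency (2.0 for perfect doubling per cycle). The sketch below assumes BPH as the control, PCa as the sample, and U6 as the reference. The default efficiencies of 2.0 are placeholders, since the text does not report measured efficiencies.

```python
from statistics import fmean


def pfaffl_ratio(ct_target_control, ct_target_sample,
                 ct_ref_control, ct_ref_sample,
                 e_target=2.0, e_ref=2.0):
    """Relative expression of a target miRNA by the Pfaffl method.

    Each ct_* argument is a list of replicate Ct values (e.g. a triplicate).
    e_target / e_ref are amplification efficiencies (placeholder defaults).
    """
    delta_target = fmean(ct_target_control) - fmean(ct_target_sample)
    delta_ref = fmean(ct_ref_control) - fmean(ct_ref_sample)
    return (e_target ** delta_target) / (e_ref ** delta_ref)
```

A ratio above 1 indicates higher expression of the target in the sample than in the control after normalization to the reference.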
Systematic analysis of the target genes of candidate prostate cancer miRNA biomarkers
The genes uniquely regulated by the predicted miRNAs were retrieved from the PCa miRNA-mRNA target network. Gene Ontology (GO) analysis and pathway analysis were performed to explore the relationships between these genes and PCa. The Database for Annotation, Visualization and Integrated Discovery (DAVID) was used for GO annotation and KEGG pathway [39, 40] analysis. Another pathway source, the MetaCore™ database from GeneGo Inc., was used for GeneGo pathway mapping analysis. The highly significant mapped pathways (p-value < 0.01) were further checked for their association with PCa through a literature search in NCBI PubMed.