Prostate cancer small non-coding RNA transcriptome in Arabs

Background: Prostate cancer (PCa) is a complex disorder resulting from the combined effects of multiple environmental and genetic factors. Small non-coding RNAs (sRNAs), particularly microRNAs (miRNAs), regulate several cellular processes and have an important role in many human malignancies, including PCa. We assessed the sRNA profiles associated with PCa in Arabs, a population that has rarely been studied.
Methods: We used next generation sequencing technology to obtain the entire sRNA transcriptome of primary prostate tumor formalin-fixed paraffin-embedded (FFPE) tissues and their paired non-tumor tissues, collected from Bedouin patients (Qatari and Saudi). miRNA and target gene expression was evaluated by real-time quantitative PCR. miRNA KEGG pathways and miRNA target genes were subsequently analyzed with the starBase and TargetScan software.
Results: PCa tumors and their paired non-tumor tissues differed in the expression patterns of several sRNAs and in miRNA editing. Our study identified four miRNAs that are strongly associated with prostate cancer and have not been reported previously. Differentially expressed miRNAs significantly affect various biological pathways, such as cell cycle, endocytosis, adherens junction and pathways involved in cancer. Analysis of potential targets of the identified miRNAs indicated overexpression of KRAS and BCL2 and down-regulation of PTEN in PCa tumor tissues.
Conclusion: These miRNAs, newly associated with prostate cancer, may not only represent markers for the increased risk of PCa in Arabs but may also reflect the clinical and pathological diversity as well as the ethno-specific heterogeneity of prostate cancer.
Electronic supplementary material: The online version of this article (10.1186/s12967-017-1362-x) contains supplementary material, which is available to authorized users.


Background
Prostate cancer is the most common malignancy in Western countries and the second leading cause of cancer-related death in Europe and the United States [1]. With lifestyle changes, the incidence of the disease has been increasing in Arab populations [2]. From 1991 to 2006, PCa was the most common cancer in Qatari males over 65 years old [3]. In Kuwait, the incidence of prostate cancer rose to 12.3 per 100,000 men per year in 2004 [4]. In Arab populations, the incidence of PCa correlates with a low prostate volume and a low testosterone level. The high frequency of aggressive forms of PCa in Arab patients, despite the low levels of testosterone, indicates an increased sensitivity of Arab men to this steroid [5].
Prostate cancer is generally considered a complex disease and several genes underlie its onset, course, and severity. The genetic susceptibility to prostate cancer is variable among different populations [6]. The identification of population-specific genetic variants may help to better understand the genetics and the molecular mechanisms of prostate cancer.
At present, PCa is diagnosed primarily through digital rectal examination and the measurement of serum levels of prostate-specific antigen (PSA). However, PSA is not specific to prostate cancer and can be found in men with a normal prostate at levels equal to or higher than those in PCa.
The non-specificity of PSA has been reported in particular for Middle-Eastern and North African populations [7]. The poor specificity of serum PSA, the only current biomarker of the disease, presents significant problems for disease diagnosis, patient treatment and management. It is widely acknowledged that more specific prognostic and diagnostic markers of PCa are urgently needed.
Next generation sequencing (NGS) studies have revealed that the majority of the human genome is transcribed, yielding thousands of non-protein-coding RNAs (ncRNAs), which comprise small and long ncRNAs [8,9]. Alterations in the expression of miRNA genes, which encode small RNAs 19-25 nucleotides (nt) in length, contribute to the pathogenesis of most, perhaps all, human malignancies [10][11][12]. Several findings support an important role of small non-coding RNAs in PCa [13][14][15][16][17][18]. Studies of PCa-specific miRNAs show potential for their use in the diagnosis and treatment of PCa [13][14][15]. Moreover, ribosomal RNA (rRNA) modification, small nuclear RNAs (snRNAs) and small nucleolar RNAs (snoRNAs) have been shown to be involved in PCa progression [16][17][18]. Previous studies that have assessed the small RNA transcriptome in PCa and/or in different subtypes of PCa are summarized in [15]. Most, if not all, PCa sRNA data obtained so far, including miRNA data, originated from Western and Asian specimens, although significant differences in prostate tumor pathological and clinical characteristics have been found between different ethnicities [19,20].
To identify an sRNA signature associated with prostate cancer in Arabs, we first performed deep sequencing of the entire small RNA transcriptome in PCa tissues and in adjacent non-malignant tissues. We then extended the study to validate the expression of several miRNAs and to search for potential targets associated with their deregulation in prostate cancer.

Patients and sample collection
Thirty-two patients with prostate cancer from Bedouin tribes in Qatar and Saudi Arabia were included in this study. Informed consent was obtained from all patients, and the study protocol was approved by the Institutional Review Boards of Weill Cornell Medicine-Qatar, Hamad Medical Corporation and King Saud University Hospital. The ages and Gleason scores of the Qatari (Q) and Saudi (S) patients are listed in Additional file 1: Table S1. All tissues collected from prostate cancer surgical specimens and the FFPE prostate tissues were stored at Hamad Medical Corporation and King Saud University Hospital.
The areas of tumor and normal tissue sampling were identified by pathologists, and three 10-μm-thick sections of each FFPE tissue were taken for RNA extraction. Total RNA was extracted with the RecoverAll Total Nucleic Acid Isolation Kit (Ambion, USA) following the manufacturer's protocols. The quantity and quality of RNA were examined with an Agilent 2100 Bioanalyzer (Agilent Technologies, USA).

Small RNA transcriptome sequencing
Next generation sequencing (NGS) technology was used to obtain the entire sRNA transcriptome of 20 samples (10 primary prostate tumor FFPE tissues and their paired non-tumor tissues). Briefly, small RNAs in the size range of 18 to 30 nt were gel purified and ligated to 5′ and 3′ adaptors, and the ligation products were reverse transcribed and then amplified for 15 cycles using the adaptor primers. Fragments of around 150 bp were isolated and sequenced on an Illumina HiSeq 2000 platform (Illumina, USA).

NGS data analysis
Raw 50-nt reads were first cleaned by removing adaptors, low-quality tags and several kinds of contaminants. The length distribution of the clean tags was then summarized. Clean reads were mapped to the human genome (hg18) with the Short Oligonucleotide Analysis Package (SOAP) to analyze their expression and distribution.
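The cleaning step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the adaptor sequence, the quality cutoff, and the function names are hypothetical and are not taken from the study; only the 18-30 nt size window comes from the library preparation described earlier.

```python
from collections import Counter

# Hypothetical 3' adaptor sequence, used only for illustration.
ADAPTOR_3P = "TGGAATTCTCGGGTGCCAAGG"

def clean_tag(read, quals, adaptor=ADAPTOR_3P, min_len=18, max_len=30, min_q=20):
    """Return the adaptor-trimmed insert, or None if the tag is rejected.

    read  : base calls of one raw read (string)
    quals : per-base quality scores (list of ints, same length as read)
    min_q : assumed quality cutoff, not a value reported in the study
    """
    pos = read.find(adaptor)
    if pos == -1:                      # no adaptor found: insert end unknown
        return None
    insert = read[:pos]
    if not (min_len <= len(insert) <= max_len):
        return None                    # outside the size-selected range
    if "N" in insert:                  # ambiguous base calls
        return None
    if min(quals[:pos]) < min_q:       # any low-quality base in the insert
        return None
    return insert

def length_distribution(tags):
    """Count clean tags by length."""
    return Counter(len(t) for t in tags)
```

A summary such as length_distribution(clean_tags) would then give the kind of length profile reported in Additional file 1: Figure S1.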
To obtain the miRNA expression profile, small RNA tags were aligned to the precursor and mature miRNAs of Homo sapiens in miRBase 18. Tags corresponding to rRNA, snRNA, snoRNA, small cytoplasmic RNA (scRNA) and transfer RNA (tRNA) were annotated against GenBank and Rfam. After excluding all the matched tags, the remaining sequencing reads were aligned to the exons and introns of mRNAs to identify degraded mRNA fragments. The small RNA tags that remained unannotated might represent novel miRNAs or base-edited forms of known miRNAs.
Differences in read-count percentages between tumor and normal tissues were assessed with a paired one-tailed t test.
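As a minimal sketch of such a comparison, the paired one-tailed t test can be written with the Python standard library alone. The function names are hypothetical, the direction of the alternative hypothesis (tumor greater than normal) is an assumption chosen for the example, and the upper-tail probability is obtained by numerical integration rather than by whatever statistical package the study used.

```python
import math
from statistics import mean, stdev

def t_sf(t, df, steps=2000):
    """Upper-tail probability P(T > t) of Student's t (Simpson's rule)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: coef * (1 + x * x / df) ** (-(df + 1) / 2)
    b = abs(t)
    h = b / steps                      # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3                      # integral of the density over [0, |t|]
    upper = 0.5 - area
    return upper if t >= 0 else 1 - upper

def paired_one_tailed_t(tumor, normal):
    """Test H1: mean(tumor - normal) > 0 on paired percentages.

    Returns (t statistic, one-tailed P value). The differences must not all
    be identical, otherwise the standard deviation is zero.
    """
    diffs = [t - n for t, n in zip(tumor, normal)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, t_sf(t_stat, n - 1)
```

For a category expected to be lower in tumors (as reported for miRNA), the two arguments would simply be swapped.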

Real-time quantitative PCR (RT-qPCR)
For mRNA expression, total RNA was reverse transcribed into cDNA using oligo 16T primer, and gene expression was then relatively quantified with the GoTaq® 2-Step RT-qPCR System for SYBR Green-based detection on an Applied Biosystems® 7500 Fast Real-Time PCR machine. The HPRT1 gene was used as a reference. The sequences of the primers are listed in Additional file 1: Table S2.
For miRNA expression, total RNA was reverse transcribed using miRNA-specific primers with the TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems, USA). The miRNA levels were quantified with TaqMan probe-based detection (Applied Biosystems, USA) on an Applied Biosystems® 7500 Fast Real-Time PCR machine. The 18S rRNA was used as a reference.
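The text does not state which formula was used for relative quantification. One common choice, assumed here purely for illustration, is the 2^-ΔΔCt method, in which the target Ct is first normalized to the reference (HPRT1 for mRNA, 18S rRNA for miRNA) and the tumor value is then compared with its paired non-tumor value. The function name is hypothetical.

```python
def relative_expression(ct_target_tumor, ct_ref_tumor,
                        ct_target_normal, ct_ref_normal):
    """Tumor-versus-normal fold change by the 2^-ddCt method.

    Assumption: target and reference amplify with roughly 100% efficiency,
    so one Ct unit corresponds to a two-fold difference.
    """
    d_ct_tumor = ct_target_tumor - ct_ref_tumor      # normalize tumor sample
    d_ct_normal = ct_target_normal - ct_ref_normal   # normalize paired normal
    dd_ct = d_ct_tumor - d_ct_normal
    return 2 ** (-dd_ct)
```

A value above 1 indicates higher expression in the tumor, and a value below 1 indicates lower expression.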

Small non-coding RNA transcriptomes of Arab prostate cancer specimens
Small RNA transcriptomes from a total of 10 pairs of FFPE PCa tissues and their adjacent normal tissues were analyzed by NGS. A total of 766,824,250 high quality reads were obtained from the sequencing. After removal of irrelevant sequences there were 691,235,882 total reads. The length distribution analysis revealed that the RNA sequences were mainly within a range of 20-23 nt (Additional file 1: Figure S1), which corresponds to the size of most known small RNAs.

Library composition and mapping results
For each sample, 19 to 38 million reads were mapped to the human genome. For all samples, the percentage of alignments exceeded 70% (Additional file 1: Table S3). These reads included miRNAs, rRNAs, tRNAs, scRNAs, snRNAs, snoRNAs, sRNA repeats, exons, introns, and unknown nucleotide sequences (Table 1). In most of the cases, the total mapped reads were higher in non-tumor tissues than in tumor tissues (Fig. 1a). The read count percentages (Table 1) for snRNA, snoRNA, scRNA and sRNA repeats were significantly higher in PCa tumor tissues than in non-tumor tissues (P = 0.015; P = 0.002; P = 0.049 and P = 0.01 respectively). Conversely, the read count percentage for miRNA was significantly lower in PCa tumor tissues (P = 0.024).
Up to 1311 miRNAs were detected across all samples, with a wide dynamic range of read counts, from 1 to 215,035,382 (Additional file 2: Table S4). Of the 1311 miRNAs, 590 had at least one count in more than 50% of samples, and only 247 had an average of more than 100 reads per sample (Additional file 2: Table S4). Comparison of the expression of these 247 miRNAs in PCa tissues and in their corresponding adjacent non-tumor tissues ranked miR-143-3p and miR-10b as the most abundant miRNAs in the PCa tumor tissues (about 50 and 20% of total miRNA reads, respectively). The expression of the 20 top-ranked miRNAs, representing more than 90% of the total miRNA reads in PCa tumor tissues, is shown in Fig. 1b.
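The two abundance filters described above (detected in more than 50% of samples; average of more than 100 reads per sample) can be expressed as a short sketch. The data layout and the function name are assumptions made for the example.

```python
def filter_mirnas(counts, min_detect_frac=0.5, min_mean_reads=100):
    """Apply the two abundance filters to a raw count table.

    counts : dict mapping miRNA name -> list of raw read counts, one per sample
    Returns (detected, abundant):
      detected - miRNAs with at least one read in more than half of the samples
      abundant - miRNAs averaging more than `min_mean_reads` reads per sample
    """
    detected, abundant = [], []
    for name, reads in counts.items():
        n_detected = sum(1 for r in reads if r >= 1)
        if n_detected > min_detect_frac * len(reads):
            detected.append(name)
        if sum(reads) / len(reads) > min_mean_reads:
            abundant.append(name)
    return detected, abundant
```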

Prostate cancer miRNA expression profiling
To compare miRNA expression between tumor and non-tumor tissues, the raw miRNA counts were normalized to transcripts per million (TPM), and the fold change and P-value were calculated from the normalized expression. The results for each pair of samples are listed in Additional file 3: Table S5 and Additional file 1: Figure S2. The miRNA expression of all 10 pairs of samples was subjected to an unsupervised cluster analysis (Fig. 2). All miRNAs with average reads below 100 per sample were filtered out. The cluster highlighted in red corresponds to a group of miRNAs up-regulated in the PCa tumor tissues, whereas the cluster highlighted in green corresponds to those down-regulated (Fig. 2). Twenty-seven miRNAs were up-regulated and 18 down-regulated in the PCa tumors (Table 2). Of these 45 miRNAs, 26 had read counts exceeding an average of 5000 per sample.
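The normalization and fold-change calculation can be illustrated as follows. This is a sketch under assumptions: the text does not say whether the TPM denominator is the total miRNA count or the total clean read count of a library (the total of the supplied counts is used here), and the pseudocount is a hypothetical value added to avoid division by zero.

```python
import math

def to_tpm(counts):
    """Scale the raw miRNA counts of one library to transcripts per million."""
    total = sum(counts.values())
    return {name: c * 1_000_000 / total for name, c in counts.items()}

def log2_fold_change(tumor_tpm, normal_tpm, pseudo=0.01):
    """Per-miRNA log2(tumor / normal) for one tumor/non-tumor pair.

    `pseudo` is an assumed pseudocount; miRNAs absent from one library
    are treated as having zero expression there.
    """
    names = set(tumor_tpm) | set(normal_tpm)
    return {
        n: math.log2((tumor_tpm.get(n, 0.0) + pseudo)
                     / (normal_tpm.get(n, 0.0) + pseudo))
        for n in names
    }
```

Positive values correspond to miRNAs that are higher in the tumor (red in Fig. 2) and negative values to those higher in the non-tumor tissue (green).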
In contrast to the findings of a previous report [15], miR-107 was found to be down-regulated in PCa tumor tissues. Five miRNAs had not previously been associated with PCa, namely miR-671-3p, miR-143-5p, miR-145-3p, miR-195-3p and miR-320b. Except for miR-671-3p, all of these miRNAs were down-regulated in tumor tissues. The NGS findings were validated using RT-qPCR. As shown in Fig. 3a, the NGS findings were replicated by RT-qPCR for all of these miRNAs except miR-195-3p. Taken together, our results therefore unveil four novel associations between miRNAs and PCa.

Identification of miRNAs targets
Since the primary function of miRNAs is to target mRNAs and interfere with their expression, we analyzed the KEGG pathways affected by the 45 miRNAs that were differentially expressed in PCa tumor tissues. We applied the starBase tool [21], which is based on microRNA-mRNA interactions from Argonaute CLIP-Seq and Degradome-Seq data. Pathways targeted by up-regulated miRNAs were expected to be negatively affected, whereas those targeted by down-regulated miRNAs were expected to be up-regulated (Table 3). The cancer pathway (KEGG ID: hsa05200) was the most significantly affected pathway. Both up-regulated and down-regulated miRNAs significantly affected the cell cycle, endocytosis, and adherens junction pathways. The prostate cancer pathway was found to be up-regulated. Based on the miRNA target prediction results, we selected the six most frequently targeted genes relevant to cancer and assessed their expression in 22 pairs of prostate tumor specimens and adjacent non-tumor tissues. The KRAS and BCL2 oncogenes were highly expressed in tumor tissues, whereas the tumor suppressor gene PTEN was significantly down-regulated (Fig. 3b).
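Ranking genes by how many deregulated miRNAs are predicted to target them can be sketched as below. The input layout, the function name, and the placeholder miRNA and gene names are hypothetical; the actual predictions came from starBase and TargetScan and are not reproduced here.

```python
from collections import Counter

def most_targeted_genes(targets_by_mirna, top_n=6):
    """Rank genes by the number of miRNAs predicted to target them.

    targets_by_mirna : dict mapping miRNA name -> iterable of gene symbols
    Returns a list of (gene, number_of_targeting_miRNAs), most targeted first.
    """
    tally = Counter()
    for genes in targets_by_mirna.values():
        tally.update(set(genes))       # count each gene once per miRNA
    return tally.most_common(top_n)

# Toy input with placeholder names, not real predictions.
example = {
    "miR-x": ["GENE_A", "GENE_B"],
    "miR-y": ["GENE_A", "GENE_C"],
    "miR-z": ["GENE_A", "GENE_B", "GENE_B"],
}
```

With this toy input, most_targeted_genes(example, top_n=2) ranks GENE_A first (three miRNAs) and GENE_B second (two miRNAs).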

miRNA editing analysis
Transcriptome analysis is commonly based on the analysis of transcript levels and biological pathway alterations. Recently, more emphasis has been placed on post-transcriptional modifications, particularly RNA editing. This process targets not only mRNAs but also small RNAs, including miRNAs. Adenosine to inosine (A-to-I) substitution, equivalent to an A-to-G change in cDNA, is the most prevalent alteration. A-to-I changes in the seed sequence (positions +2 to +8 of the mature miRNA) could modulate miRNA-binding specificity [22], whereas changes in the non-seed region could modulate miRNA maturation [23] and expression [24]. To gain insight into miRNA editing in PCa, we analyzed the unannotated sRNA tags that align to a mature miRNA with one base mismatch. A summary of the read counts of edited and wild-type miRNAs is given in Table 4. The results indicate that, for several miRNAs, the edited form predominates in the miRNA pool. For certain miRNAs, such as miR-23c, editing was seen in 100% of the miRNA pool (Additional file 4: Table S6). miRNA editing tended to be more frequent in PCa tumor tissues than in non-tumor tissues (P = 0.0560). A positive correlation between miRNA editing and miRNA expression pattern was seen only for let-7e-5p (Table 5).
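The selection of candidate edited tags (exactly one mismatch against a mature miRNA) and the seed-region check can be sketched as follows. Sequences are assumed to be written in the cDNA alphabet (A, C, G, T), the function name is hypothetical, and the example sequence in the usage note is a toy sequence rather than a tag from the study.

```python
def classify_edit(tag, mature):
    """Compare a tag with a mature miRNA sequence of equal length.

    Returns None unless the two differ at exactly one position; otherwise a
    dict with the 1-based position, the base change, whether it has the
    A-to-G signature of A-to-I editing, and whether it lies in the seed
    (positions 2 to 8).
    """
    if len(tag) != len(mature):
        return None
    mismatches = [(i + 1, m, t)
                  for i, (m, t) in enumerate(zip(mature, tag)) if m != t]
    if len(mismatches) != 1:
        return None
    pos, ref, alt = mismatches[0]
    return {
        "position": pos,
        "change": f"{ref}>{alt}",
        "a_to_g": ref == "A" and alt == "G",
        "in_seed": 2 <= pos <= 8,
    }
```

For example, classify_edit("TGGGGTAGGAGGTTGTATAGTT", "TGAGGTAGGAGGTTGTATAGTT") reports an A>G change at position 3, inside the seed region.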

Discussion
Substantial data on small RNA profiling in prostate cancer have accumulated from studies of populations of different ancestries, including Europeans and Asians. Arab populations, including Arab Gulf populations, however, have not been studied. To our knowledge, this is the first study to unveil small RNA profiles associated with prostate cancer in Arab populations, in which aggressive forms of prostate cancer are frequently found.
Our analysis of the entire small non-coding RNA profile of prostate tumors collected from Arab patients yielded more than 691 million clean reads. Since miRNA reads account for more than 70% of all the small RNA reads, we focused our analysis on miRNAs. We found that 45 miRNAs were significantly deregulated in PCa tumor tissues. We specifically identified the KEGG pathways targeted by these deregulated miRNAs. We further assessed the expression levels of the oncogenes and tumor suppressor genes most frequently targeted by these deregulated miRNAs.
Fig. 2 Cluster analysis of differentially expressed miRNAs in 10 pairs of PCa tumor and non-tumor tissues. Each row shows one miRNA and each column shows one sample pair; each cell therefore shows the differential expression of a miRNA in one sample pair. Red indicates that the miRNA has a higher expression in tumor tissue, green indicates that the miRNA has a higher expression in non-tumor tissue, and grey indicates that the miRNA has no expression (detected tag counts < 5) in at least one sample of the pair. miRNAs with similar expression patterns in different sample pairs are clustered together
Shan et al. J Transl Med (2017) 15:260
Our findings are consistent with several reports (summarized in [15]), which showed positive association of several miRNAs with prostate cancer. However, our study unveiled novel associations in Arab patients. We report here 4 miRNAs, which are associated with prostate cancer for the first time, namely miR-671-3p, miR-143-5p, miR-145-3p and miR-320b. Our findings, along with the report indicating a significant association of miR-671-3p with breast cancer [25], suggest that miR-671-3p could be an attractive marker for prostate cancer risk.
Ethno-specific genetic variation could affect the prevalence and expression of miRNAs linked to cancer [19,37,38,39]. Our findings, showing novel associations between 4 miRNAs and prostate cancer in Arabs, suggest that miRNA expression may contribute to the clinical and pathological diversity and ethnic-related heterogeneity of prostate cancer.

Conclusions
This study suggests that the identified miRNAs, differentially regulated in prostate cancer, represent putative factors for the increased risk of PCa in Arabs. The role of miRNA editing as a potential mechanism underlying