A circRNA–miRNA–mRNA network identification for exploring underlying pathogenesis and therapy strategy of hepatocellular carcinoma

Background Circular RNAs (circRNAs) have received increasing attention in human tumor research. However, there are still a large number of unknown circRNAs that need to be deciphered. The aim of this study is to unearth novel circRNAs as well as their action mechanisms in hepatocellular carcinoma (HCC). Methods A combinative strategy of big data mining, reverse transcription-quantitative polymerase chain reaction (RT-qPCR) and computational biology was employed to dig HCC-related circRNAs and to explore their potential action mechanisms. A connectivity map (CMap) analysis was conducted to identify potential therapeutic agents for HCC. Results Six differently expressed circRNAs were obtained from three Gene Expression Omnibus microarray datasets (GSE78520, GSE94508 and GSE97332) using the RobustRankAggreg method. Following the RT-qPCR corroboration, three circRNAs (hsa_circRNA_102166, hsa_circRNA_100291 and hsa_circRNA_104515) were selected for further analysis. miRNA response elements of the three circRNAs were predicted. Five circRNA–miRNA interactions including two circRNAs (hsa_circRNA_104515 and hsa_circRNA_100291) and five miRNAs (hsa-miR-1303, hsa-miR-142-5p, hsa-miR-877-5p, hsa-miR-583 and hsa-miR-1276) were identified. Then, 1424 target genes of the above five miRNAs and 3278 differently expressed genes (DEGs) on HCC were collected. By intersecting the miRNA target genes and the DEGs, we acquired 172 overlapped genes. A protein–protein interaction network based on the 172 genes was established, with seven hubgenes (JUN, MYCN, AR, ESR1, FOXO1, IGF1 and CD34) determined from the network. The Gene Oncology, Kyoto Encyclopedia of Genes and Genomes and Reactome enrichment analyses revealed that the seven hubgenes were linked with some cancer-related biological functions and pathways. Additionally, three bioactive chemicals (decitabine, BW-B70C and gefitinib) based on the seven hubgenes were identified as therapeutic options for HCC by the CMap analysis. Conclusions Our study provides a novel insight into the pathogenesis and therapy of HCC from the circRNA–miRNA–mRNA network view.


Background
Circular RNA (circRNA), with a complete closed loop structure, is firstly identified in 1976 [1]. However, these transcripts without poly-A tail are ignored for a long time due to the limitation of traditional RNA detection methods. In recent years, with the development of high-throughput sequencing technology, a myriad of circRNAs has been found in the eukaryotic transcriptome [2]. With the features of cell-type specific [3] and highly conserved across species [4], cir-cRNAs are thought to be new star RNAs which play important roles in various diseases, including human cancers [5,6].
Competing endogenous RNAs (ceRNAs) are transcripts that act as miRNA sponges, modulating each other at post-transcriptional level via competely binding to shared miRNAs [7]. Recently, circRNAs have become new hotspots in ceRNA family since they have been demonstrated to harbor abundant conserved miRNA response elements (MREs) [8]. Increasing study has revealed that some circRNAs are involved in tumor initiation and progression by the ceRNA mechanism. For example, a classic cirRNA, ciRs7, is deemed to be a miRNA sponge, absorbing miR-7 and liberating the latter inhibitory effect on its target gene in many human cancers [9,10]. CircRNAs as ceRNAs mediating pathological processes has also been reported in HCC [11,12]. However, many unknown circRNAs still remain to be explored.
In this study, we employed a combinative strategy of gene chip and computational biology to investigate novel circRNAs and their potential action mechanisms in HCC. The flow chart recapitulating the present work is shown in Fig. 1 as follows: First, we collected HCCrelated microarray datasets providing expression profile of circRNAs from the Gene Expression Omnibus (GEO), obtaining differently expressed circRNAs (DECs) with RobustRankAggreg method and corroborating their expression using reverse transcription-quantitative polymerase chain reaction (RT-qPCR). To depict whether the DECs function as ceRNAs in HCC, we collected their sponge miRNAs and miRNA target genes, constructing a circRNA-miRNA-mRNA network. A protein-protein interaction (PPI) network was subsequently established and the hubgenes were identified. Then, Gene Oncology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome enrichment analyses on the hubgenes were performed to elucidate the potential pathogenesis of HCC. Furthermore, we conducted a connectivity map (CMap) analysis to acquire bioactive compounds for the treatment of HCC, which provide a new insight into the latent therapeutic capacity of circRNAs in HCC.

Screening of DECs in HCC from GEO
Microarray datasets providing circRNA expression profile in HCC were achieved from the GEO database [13]. All raw expression data were normalized and log2-transformed. First, we used Limma, a Bioconductor package for differential analysis of microarray data, to determine DECs in each dataset with the criteria of |log2(foldchange)| > 1 and P-value < 0.05 [14]. Then, we integrated and ranked all of the DECs with R package RobustRankAggreg [15].

Validation of DECs with RT-qPCR method
Sixteen paired fresh frozen HCC tissues and corresponding adjacent non-tumor tissues were obtained from patients diagnosed with HCC at the First Affiliated Hospital of Guangxi Medical University (Nanning, Guangxi, People's Republic of China). No patient received any radiotherapy or chemotherapy before surgery. The Ethics Committee of the First Affiliated Hospital of Guangxi Medical University approved this study.
Total RNA was isolated with TRIzol ® Reagent (Life technologies, Thermo Fisher Scientific, USA) following the manufacturer's instruction. Then, 1 µg total RNA was reversed into 20 µl complementary DNA (cDNA) with Geneseed ® II First Strand cDNA Synthesis Kit (Geneseed, Guangzhou, China). RT-qPCR was conducted using Geneseed ® qPCR SYBR ® Green Master Mix (Geneseed) on ABI7500 system (Applied Biosystems, CA, USA) in line with the manufacturer's procedure. Beta-actin (Geneseed) was set as the endogenous reference. All of the primer sequences used in this study were synthesized by Geneseed and are displayed in Table 1. CircRNA expression was determined using the 2 −ΔCT method. Significance between groups was analyzed by paired T-test with SPSS 22.0 (IBM, New York, USA). A P-value < 0.05 denotes a statistical significance.

Prediction of MREs
The miRNA binding sites, also known as MREs, of those selected DECs, were predicted with two web tools, Cancer-Specific CircRNA (CSCD) [16] and Circular RNA Interactome (CircInteractome) [17]. We identified overlapped miRNAs of the two algorithms as potential target miRNAs of the DECs.

Verification of miRNA expression based on data from GEO and TCGA
Microarray datasets providing miRNA expression profile in HCC were obtained from the public databases GEO and The Cancer Genome Atlas (TCGA) [18]. The retrieval terms were as follows: (hepatocellular OR hepatic OR liver OR HCC) and ("cancer" OR "tumor" OR  16:220 "tumour" OR "carcinoma" OR "neoplasm" OR "malignan*") and (miRNA OR microRNA OR miR OR "noncoding RNA" OR ncRNA OR "noncoding RNA" OR "non coding RNA").
We screened available datasets based on the listed inclusion criteria: (1) all of the patients were diagnosed with HCC; (2) the studies must contain circRNA expression data both in cancerous and normal liver tissues; and (3) the sample sizes in tumor and non-tumor group were at least three.
Two revivers extracted the basic information of each included record: miRNA type, first author and published year, region, data source, platform, number of case, and expression level of miRNA. Any divergences were settled via discussion with a third investigator.
The combined standard mean difference (SMD) and 95% confidence interval (95% CI) were computed by STATA 12.0 (StataCorp, College Station, TX, USA). A SMD > 0 represents high expression of miRNA in HCC samples than in normal controls. The corresponding 95% CI do not cross 1 and a P-value < 0.05 suggest a statistical significance.

Collection of differently expressed genes (DEGs) of HCC form TCGA
RNA-sequencing (RNA-seq) data containing 374 HCC samples and 50 normal controls was downloaded from the TCGA. DEGs were determined by the edgeR package [20] in Bioconductor with the filter criteria of |log2(foldchange)| > 1 and adjust P-value < 0.05.

Construction of circRNA-miRNA-mRNA network
The overlapping genes between the predicted miRNA target genes and the DEGs were obtained for circRNA-miRNA-mRNA network construction. The Cytoscape 3.6.1 software [21] was used to visualize the regulatory network.

Establishment of PPI network and identification of hub-genes
A PPI network was established by the STRING (v10. 5) [27924014] and visualized by the Cytoscape 3.6.1. Then, the "Molecular Complex Detection" (MCODE), a clustering algorithm identifying locally densely connected regions in a large PPI network based on a node-weighting arithmetic [22], was employed to recognize highly interacted hubgene clustering.

GO, KEGG and Reactome enrichment analyses
GO annotation and KEGG pathway analyses were conducted by clusterProfiler, an R package for functional classification and enrichment of gene clusters using hypergeometric distribution [23]. Reactome pathway analysis was performed by Reactome FI, a plugin of Cytoscape for network and pathway analysis [24].

CMap analysis
Hubgenes consisting of two lists of up-and down-regulated tags were uploaded into the CMap web tool, matching against over 7000 gene expression profiles following treatment of 1309 bioactive compounds in human cell lines [25]. The match between the signatures of interest and chemicals from CMap was assessed by a connectivity score from − 1 to 1: a positive score denotes a stimulative effect of compound on the query signatures; while a negative score implicates a repressed effect of a compound on the query signatures.

Identification of six DECs in HCC based on RobustRankAggreg method
Three microarray datasets (GSE78520, GSE94508 and GSE97332) were included in our study. All of the three gene chips were from the platform of Agilent-069978 Arraystar Human CircRNA microarray V1. The basic information of the three datasets is concluded in Table 2.
A total of 259 DECs with 211 up-regulated circRNAs and 48 down-regulated circRNAs were found in gene chip GSE78520 (Fig. 2a); 299 DECs with 46 up-regulated cir-cRNAs and 253 down-regulated circRNAs were determined in gene chip GSE94508 (Fig. 2b); 882 DECs with 429 up-regulated circRNAs and 453 down-regulated cir-cRNAs were identified in gene chip GSE97332 (Fig. 2c). Subsequently, we integrated the DECs of the three datasets and ranked them with a robust method. A total of six circRNAs, including two up-regulated circRNAs (hsa_circRNA_103510 and hsa_circRNA_100542) and four down-regulated circRNAs (hsa_circRNA_102166, hsa_circRNA_104515, hsa_circRNA_105031 and hsa_ circRNA_100291), were found to be in the top rankings with an adjust P-value < 0.05 (Fig. 3). The expression of   16:220 the six circRNAs in each dataset is shown in Fig. 4. The essential characteristics of the six circRNAs are displayed in Table 3. The basic structural patterns of the six circR-NAs are exhibited in Fig. 5.

Corroboration of the six circRNAs with RT-qPCR
RT-qPCR was conducted in 16 pairs HCC samples and adjacent non-cancerous tissues to corroborate the expression of the six circRNAs. Among the six circR-NAs, hsa_circRNA_103510 and hsa_circRNA_100542 cannot be detected in these in-house tissues due to their relative low expression levels. The low expression of hsa_circRNA_102166 in HCC tissues was corroborated by RT-qPCR (P = 0.047, Fig. 6a). Hsa_circRNA_100291 and hsa_circRNA_104515 showed a down-regulated tendency in HCC, though the P-values were greater than 0.05 (hsa_circRNA_100291: P = 0.069, Fig. 6b; hsa_circRNA_104515: P = 0.059, Fig. 6c). For hsa_cir-cRNA_105031, its expression in HCC tissues was similar to that in non-cancerous tissues (P = 0.588, Fig. 6d). The head-to-tail splicing in the RT-qPCR product of hsa_circRNA_102166, hsa_circRNA_100291, hsa_cir-cRNA_104515 and hsa_circRNA_105031 was confirmed by Sanger sequencing (Fig. 7).

Identification of five circRNA-miRNA interactions
Hsa_circRNA_102166, hsa_circRNA_100291 and hsa_circRNA_104515 were selected for further analysis. Increasing evidence has demonstrated that some circRNAs play critical roles in tumors by functioning as "decoys" to sponge miRNAs. To depict whether the three circRNAs perform the similar role in HCC, we collected their potential target miRNAs from two online databases, CSCD and CircInteractome. A total of five circRNA-miRNA interactions including two circRNAs (hsa_circRNA_104515 and hsa_circRNA_100291) and five miRNAs (hsa-miR-1303, hsa-miR-142-5p, hsa-miR-877-5p, hsa-miR-583 and hsa-miR-1276) were identified (Fig. 8). DIANA-miRPath [26] was exploited to explore the signaling pathways in which the five miRNAs may be involved. As shown in Fig. 9, all of the five miRNAs were closely linked with some cancer-related pathways.

Expression of the five miRNAs based on the data from GEO and TCGA
In all, 11 microarray and RNA-seq datasets were included. Among them, 7 datasets with 339 HCC samples and 113 normal controls were for miR-1303; 11 datasets with 881 HCC samples and 351 normal controls were for miR-142-5p; 7 datasets with 671 HCC samples and 149 normal controls were for miR-877-5p; 5 datasets with 307 HCC samples and 97 normal controls were for miR-583; and 5 datasets with 534 HCC samples and 110 normal controls were for miR-1276. The essential properties of the 11 records are concluded in Table 4. According  Fig. 10d). For miR-583, its expression in HCC tissues was similar to that in normal liver tissues (SMD = 0.03, 95% CI − 0.25 to 0.31, P = 0.851, Fig. 10e).

Construction of circRNA-miRNA-mRNA network
Total 1424 target genes of the aforementioned five miRNAs were obtained from the miRWalk. Additionally, 3278 DEGs in HCC were gained from the TCGA (Fig. 11a). By intersecting the predicted target genes and DEGs, we identified 172 target genes that exert momentous roles in HCC (Fig. 11b).

Identification of three bioactive compounds for the treatment of HCC based on CMap analysis
The seven hubgenes consisting of two up-regulated genes (CD34, MYCN) and five down-regulated genes (AR, JUN, ESR1, FOXO1, IGF1) were loaded into the CMap web tool as up-regulated tags and down-regulated tags, respectively. Following the signature query, three compounds (decitabine, BW-B70C, gefitinib) with the highest negative enrichment score were determined as the potential therapeutic agents for HCC ( Table 5). The chemical structures of the three compounds are presented in Fig. 18.

Discussion
Due to the lack of 5′ caps and 3′polyadenylated tails, cir-cRNAs are ignored in classical polyadenylated transcriptome studies for a long time. In the past few decades, with the development of high-throughput sequencing, biochemical and computational biology methods, a large number of circRNAs are unmasked the veil in various tissues and cells [30]. Increasing study has unveiled the important roles of circRNAs in a myriad of human diseases, including malignant tumors [31][32][33]. CircRNAs usually serve as diagnostic and prognostic biomarkers because of their relative tolerance to exonucleases, which benefits from the covalently closed loop structures [34,35]. In addition, owing to the high cell-and tissue-specificity of circRNAs, their roles in different neoplasms are not complete accord. In HCC, an increasing number of circRNAs, such as ciRS-7 [36], has_circ_0067934 [12], CDR1as [37] and circMTO1 [11], have been reported to exert momentous roles in regulating pathophysiological process and guiding clinical diagnosis and treatment. However, there are still a lot of circRNAs that need to be unearthed.
In this study, we firstly collected three gene chips (GSE78520, GSE94508 and GSE97332) from the GEO database and screened six DECs (hsa_circRNA_103510,hsa_circRNA_100542, hsa_ circRNA_102166, hsa_circRNA_104515, hsa_cir-cRNA_105031 and hsa_circRNA_100291) with the RobustRankAggreg approach, an algorithm for integration of different gene expression spectra, ranking genes by their fold changes and aggregating these rankings to achieve the final robust and rigorous results. Following the RT-qPCR validation of the six DECs, three circRNAs (hsa_circRNA_102166, hsa_circRNA_100291 and hsa_ circRNA_104515) were selected for further analysis.
As highly conserved endogenous RNAs, many circR-NAs harbors abundant miRNA binding sites, indicating that they can sponge corresponding miRNAs and thus function as ceRNAs to regulate gene expression [8,38,39]. To ascertain whether the aforementioned three circRNAs function as ceNAs in HCC, we predicted their MREs via two online tools, CSCD and CircInteractome. The former web tool predicted MREs within 50 bp upstream and downstream of circRNA junction point [16]. The latter web tool predicted MREs based on the TargetScane algorithm, which forecasts MREs by searching for 7mer or 8mer complementarity to the 3′ end and the seed region of each miRNA [17,40]. We chose miRNA predicted by both algorithms as the putative target miRNA for the three circRNAs. Finally, five circRNA-miRNA interactions consisting of two circR-NAs (hsa_circRNA_104515 and hsa_circRNA_100291) and five miRNAs (hsa-miR-1303, hsa-miR-142-5p, hsa-miR-877-5p, hsa-miR-583 and hsa-miR-1276) were determined. The expression of the five miRNAs was then verified based on the microarray and RNA-seq data from the GEO and TCGA. The results showed that miR-142-5p was down-regulated in HCC, which was consistent with previous studies [41,42]. No relevant study reported the expression of miR-877, miR-1303, miR-1276 and miR-583 in HCC. According to our results, miR-877 was up-regulated in HCC. MiR-1303 and miR-1276 showed high expression tendencies in HCC, while the differences were not statistical significant. For miR-583, Fig. 11 Identification of 172 genes that exert momentous roles in hepatocellular carcinoma (HCC). a Volcano plot of the differentially expressed genes (DEGs) in HCC based on data from TCGA. The volcano plot was generated by R package 'ggplot2' . b Venn diagram for the intersections between DEGs and miRNA target genes its expression in HCC tissues was similar to that in normal liver tissues. Due to the inter-study heterogeneity that cannot be ignored, the reliable of the pooled results is reduced. Thereby, further rigorous studies are necessary to validate these findings.

Fig. 13
Identification of hubgenes from the PPI network with the MCODE algorithm. The node color changes gradually from green to red in ascending order according to the log2(foldchange) of genes. The edge size changes gradually from fine to coarse in ascending order according to the combined score between two neighbored genes.  16:220 ESR1, hsa_circRNA_100291/miR-583/JUN, hsa_cir-cRNA_100291/miR-583/AR, hsa_circRNA_104515/ miR-877-5p/AR, hsa_circRNA_104515/miR-142-5p/ MYC, hsa_circRNA_104515/miR-142-5p/MYCN, hsa_circRNA_104515/miR-142-5p/IGF1 and hsa_cir-cRNA_104515/miR-142-5p/CD34), indicating competitive regulatory relationships of hsa_circRNA_100291 and hsa_circRNA_104515 with the seven genes in HCC. However, given that the results are on the basis of computational biology, further in-depth studies are indispensable to verify the possible roles of the seven axes in HCC.
To date drug control is an important treatment for patients with HCC [50]. Digging effective and sensitive drugs against HCC helps improve patients' outcomes. We therefore implemented the CMap analysis of the seven hubgenes to explore usable drugs for the treatment of HCC. Based on the genome-wide expression profiling of transcripts technology, CMap provides a Fig. 15 CircRNA-miRNA-hubgene network. The network consisting of two cricRNAs (hsa_circRNA_104515 and hsa_circRNA_100291), four miRNAs (hsa-miR-142-5p, hsa-miR-877-5p, hsa-miR-583 and hsa-miR-1276) and seven hubgenes (JUN, MYCN, AR, ESR1, FOXO1, IGF1 and CD34) was generated by Cytoscape 3.6.1 comprehensive and accurate data resource for exploration of novel drug or relocation of existing drug [51]. Drugs available in CMap are all licensed for human use by the Food and Drug Administration, thus it is an ideal and reliable approach to obtain therapeutic agents for human diseases [52]. Three chemicals (decitabine, BW-B70C and gefitinib) were identified as the treatment options for HCC. As a cytidine antimetabolite analogue, decitabine represses DNA methylation, arresting cells  into G1/S phase and inhibiting tumor cell proliferation. Its antitumor activity in solid tumors including HCC has been elucidated in previous studies [53]. Xing et al. [54] have demonstrated that decitabine could facilitate the expression of miR-122 via inhibition of methylation in HCC cells. More importantly, Skårn et al. [55] have demonstrated that miR-142 is epigenetically restrained by DNA methylation. Treating decitabine in mesenchymal cells promotes mature miR-142-5p/3p expression and thus depresses cell proliferation. We thus hypothesize that decitabine may exert its anti-HCC effect by augmenting miR-142-5p via demethylation and subsequently regulating the downstream target genes of miR-142-5p. Further well-design study is necessary to validate the conclusion. BW-B70C is an inhibitor of arachidonic acid 5-lipoxygenase. Previous study has reported its anti-neoplastic activity in leukemic cells by suppressing the NOTCH1-P13K-AKT-eNOS axis [56]. However, its anti-HCC effect and action mechanism have not been elucidated as of yet. In this study, we found its potential as therapeutic agent for HCC. More studies are needed to verify this finding. Gefitinib is a selective inhibitor of tyrosine kinase receptor used in clinic for the treatment of locally advanced or metastatic non-small cell lung cancer (NSCLC) [57]. Its antineoplastic effect on the other solid tumors including HCC has also been reported [58,59]. However, the responsiveness of different HCC patients to gefitinib varies greatly and most patients even develop gefitinib resistance [60]. Multiple non-coding RNAs including miRNAs and lncRNAs have been reported to contribute to gefitinib resistance and sensitivity in NSCLC [61][62][63]. We speculate that gefitinib share the similar resistance mechanisms in HCC and lung cancer. Our study provides a theoretical basis for studying gefitinib resistance mechanism and enhancing gefitinib sensitivity in patients with HCC from the perspective of circRNA-miRNA-mRNA network.

Conclusions
In conclusion, by employing a comprehensive strategy of big data mining, RT-qPCR and computational biology, we constructed a circRNA-miRNA-mRNA network and found that hsa_circRNA_100291 and hsa_circRNA_104515 may function as ceRNAs to exert critical roles in HCC. In addition, three bioactive chemicals (decitabine, BW-B70C and gefitinib) based on the CMap analysis was determined as therapeutic agents for HCC. Our study provides a novel insight into the pathogenesis and therapy for HCC from the circRNA-miRNA-mRNA view.