Identification and validation a TGF-β-associated long non-coding RNA of head and neck squamous cell carcinoma by bioinformatics method

Background The role of transforming growth factorβ (TGF-β)-induced tumor progression in advanced malignancy is well established, but the involvement of long non-coding RNAs (lncRNAs) in TGF-β signaling remains unclear. This study aimed to identify TGF-β-associated lncRNAs in head and neck squamous cell carcinoma (HNSCC). Methods Expression profiling of lncRNAs was obtained using Gene Expression Omnibus and The Cancer Genome Atlas. Real-time quantitative PCR was used to analyze the expression of EPB41L4A-AS2 in HNSCC cell line. We used bioinformatics resources (DAvID) to conduct Gene Ontology biological processes and KEGG pathways at the significant level. Wound healing assay, cell migration and invasion assays, were used to examine the effects of EPB41L4A-AS2 on tumor cell metastasis in vivo. Protein levels of EPB41L4A-AS2 targets were determined by western blot. Results A novel TGF-β-associated lncRNA, EPB41L4A-AS2, was found downregulated by TGF-β and associated with invasion and metastasis. The relationship of EPB41L4A-AS2 with the clinicopathological features and prognosis of HNSCC patients was evaluated. Bioinformatic analyses revealed that EPB41L4A-AS2 may be involved in processes associated with the tumor-associated signaling pathway, especially the TGF-β signaling pathway. Furthermore, a TGF-β-induced epithelial-to-mesenchymal transition (EMT) model was established. Low EPB41L4A-AS2 expression was determined, and overexpression of this gene inhibited cell migration and invasion in the EMT model. Moreover, EPB41L4A-AS2 suppressed TGFBR1 expression. Conclusions EPB41L4A-AS2 might serve as a negative regulator of TGF-β signaling and as an effective prognostic biomarker and important target in anti-metastasis therapies of HNSCC patients. Electronic supplementary material The online version of this article (10.1186/s12967-018-1418-6) contains supplementary material, which is available to authorized users.


Background
Long non-coding RNAs (lncRNAs) are a class of transcripts that are longer than 200 nucleotides without a protein-coding capacity [1]. Mounting evidence has recently indicated that lncRNAs are aberrantly expressed in different cancer types and are tightly associated with disease processes [2,3]. A growing number of lncRNAs have been determined to play significant roles in invasion and metastasis in malignant tumor [4]. For example, HOTAIR promotes invasion in cervical cancer by targeting the Notch pathway, whereas HULC enhances the metastasis of hepatocellular carcinoma through the miR-200a-3p/ZEB1 signaling pathway [5,6].
Transforming growth factorβ (TGF-β) is a vital cytokine factor in tumorigenesis and tumor progression [7,8]. TGF-β promotes tumor progression by enhancing microenvironment modification, distant metastasis, and cell invasion partly through its capability to induce EMT [9,10]. EMT is an important progression in the early events of cancer cell invasion and metastasis by endowing tumor cells with enhanced motility and invasion potentials [11]. The role of TGF-β-induced EMT in tumor cell dissemination has been well established. TGF-β signaling is mediated through SMAD and non-SMAD pathways to regulate EMT in tumor [12]. Recent studies have suggested that lncRNA, which serves as a mediator, participates in TGF-β-induced EMT [13]. lncRNA ATB, which is a mediator of TGF-β signaling, binds the miR-200 family and then the induced EMT in hepatocellular carcinoma [1]. TGF-β-induced upregulation of malat1 promotes bladder cancer metastasis through associating with SUZ12 [14]. However, the physiological significance of such interaction of the TGF-β-lncRNA axis requires further investigation.
Head and neck squamous cell carcinoma (HNSCC), the sixth leading cancer worldwide, is an aggressive cancer form with high metastasis and recurrence rates [15,16]. Understanding the diverse etiology and molecular heterogeneity of HNSCC is crucial to overcome the barriers to the progress of prognostic tools and treatments. As members of a new class of regulatory molecules, lncR-NAs have received increasing attention because of their roles in the invasion-metastasis cascade [17]. However, the specific roles of lncRNAs in mediating the metastatic role of TGF-β have not been extensively studied. In this study, we utilized gene expression microarray data in the GEO and mined next-generation sequencing and clinical data HNSCC patients from TCGA to fill this research gap. We specifically determined the effects of TGF-β on lncRNA expression and the correlation of this expression with cancer cell invasion and metastasis. We aimed to identify an lncRNA signature that participates in TGF-β signaling in the invasion-metastasis cascade of HNSCC.

GEO and TCGA gene expression datasets
HNSCC gene expression data and corresponding clinical data were obtained from the TCGA (https ://tcgadata.nci.nih.gov and https ://genom e-cance r.ucsc.edu) and GEO databases (http://www.ncbi.nlm.nih.gov/ geo/). To investigate lncRNA and TGF-β expression and between normal and cancer tissues, we downloaded RNASeq datasets for 43 tumor-adjacent normal pairs from TCGA and GSE59652 (N = 7) from the GEO database. Level date 3 were downloaded from the TCGA data and the expression profiling platforms is RNA-seqv2. To investigate the TGF-β-induced lncRNA changes in HNSCC, GSE54800 (lncRNA expression profile activated by TGF-β promotes the invasion-metastasis cascade in hepatocellular carcinoma, N = 14) was downloaded. To investigate the lncRNA signature related to invasion and metastasis in HNSCC, the data of 127 HNSCC patients were obtained from the TCGA database. These data consisted of 39 patients with and 88 patients without metastasis or recurrence. The exclusion criteria were set as follows: (i) histologic diagnosis is not HNSCC; (ii) suffering from other malignancies aside from HNSCC; and (iii) patients samples without complete data for analysis.

RNA sequence data processing and computational analysis
The RNA sequencing raw reads were post-processed and normalized by TCGA RNASeqV2 system. Level 3, normalized miRNA expression data were downloaded from the TCGA data portal performed using Illumina HiSeq and Illumina GA microRNA sequencing platforms and quantile normalized before performing analysis. No further normalizations were applied to the lncRNA, miRNA and mRNA expression profile data in level 3 [18,19]. To detect the differential expression of lncRNA, miRNA, and mRNA, the samples were divided into different groups. For further analysis, the intersection of lncRNA, miRNA, and mRNA was selected. The flow chart for bioinformatics analysis is depicted in Fig. 1.

Functional enrichment analysis
Functional enrichment analysis was actualized according to the literature [18,19]. To understand further the underlying pathways and biological processes of aberrantly expressed lncRNAs, miRNAs, and mRNAs in HNSCC, we used bioinformatics resources (DAvID) (http://david .abcc.ncifc rf.gov/), to conduct Gene Ontology (GO) biological processes and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways at the significant level.

Cell transfection
CNE1 cells were transfected with EPB41L4A-AS2 ORF clone (GeneCreate Biological Engineering Co., Ltd., Wuhan, China) in accordance with the manufacturer's protocol. To determine the efficiency of transfection, the expression of EPB41L4A-AS2 was assessed with a qRT-PCR detection system (Bio-Rad, Hercules, CA, USA).

RNA sample preparation
Total RNA was extracted using TRIzol reagent (Invitrogen, Burlington, ON, Canada) in accordance with the manufacturer's recommended protocol. The yield and purity of the RNA were determined by measuring the absorbance (Abs) at 260 and 280 nm. The RNA samples were only used when the Abs ratio of 260/280 nm was > 1.8.
Cell wound healing assay CNE1 cells were cultured in six-well plates (2 × 105 cells per well) and then transfected with EPB41L4A-AS2 ORF clone. When the cells reached 75-90% confluence, a single wound was created by using a T-200-Y pipet tip. Wound areas were visualized under an optical microscope with a magnification of 100×. Cell migration capability was measured by gap closure.

Migration and invasion assays
For migration assays, CNE1-transfected cells (5 × 104) were plated on the top chamber with a non-coated membrane. For invasion assays, CNE1-transfected cells (5 × 104) were plated on the top chamber with a coated membrane. In both assays, the cells were plated on the top chamber in medium without serum; the lower chamber was filled with 20% FBS. After 24-36 h of incubation, cells that invaded the membrane were fixed with 4% paraformaldehyde and then stained with crystal violet. Six random fields of cells were counted in each well under a microscope at a magnification of 100×.

Statistical analysis
One-way ANOVA and Student's t test were performed with SPSS 13.0 and GraphPad Prism 5.0 software. The significance level was set at 0.05 as default to control the false discovery rate. Data were expressed as the mean ± standard deviation of at least three separate experiments. Values of p < 0.05 were considered statistically significant.

TGF-β upregulation in HNSCC and association with worse patient survival
The expression levels and clinical significance of TGFβ were analyzed based on available samples in TCGA to determine the TGFβ expression profile in HNSCC and evaluate patient survival as a function of TGFβ expression. We downloaded RNAseq datasets for 43 tumoradjacent normal pairs and determined that TGFβ expression was higher in the HNSCC samples than in the normal samples ( Fig. 2a, b). The cutoff value of TGFβ was determined by receiver-operating characteristic analysis, which was employed to differentiate low and high TGFβ levels among patients (Fig. 2c). 507 HNSCC samples were downloaded from TCGA. On the basis of the expression levels of TGFβ, Kaplan-Meier survival analysis was performed. Results showed a negative association between high TGFβ expression and overall survival of HNSCC patients (Fig. 2d).

Dysregulated TGF-β-associated lncRNAs in HNSCC
TGF-β-induced lncRNA profile in HNSCC is unavailable. Thus, we employed the TGF-β-induced lncRNA profile in hepatocellular carcinoma (GSE54800) as the lncRNA pool to identify the specific TGF-βassociated lncRNAs profile in HNSCC. Analysis of 50 tumor-adjacent normal pairs from TCGA and GEO (GSE59652) revealed 90 dysregulated lncRNAs in tumor tissues paired with normal tissues (Additional file 1). Many established and cancer-linked lncRNAs were observed in our results. Among the upregulated lncRNAs, Lnc00152 promotes proliferation in gastric cancer through the EGFR-dependent pathway [20]. HCP5 regulates the malignant behavior of glioma cells [21]. Among the downregulated lncRNAs, MEG3 inhibits lung cancer tumor progression through MYC downregulation [22]. Many lncRNAs with unexplored roles in HNSCC were revealed in our analysis, such as SFTA1P, FIRRE, and FKBP9P1. Considering the intersection GSE54800, our analysis identified three dysregulated TGF-β-associated lncRNAs in HNSCC (Table 1). EPB41L4A-AS2 and LINC00515 were downregulated, whereas MIR4435-2HG was upregulated. HNSCC studies associated with these lncRNAs are unavailable. Furthermore, we measured the expression levels of these three lncRNAs in HNSCC samples from TCGA. A negative correlation was observed between EPB41L4A-AS2 as well as LINC00515 and TGF-β (Fig. 3a, b). A positive correlation was observed between MIR4435-2HG and TGF-β (Fig. 3c).

Identification of lncRNA signatures related to invasion and metastasis in HNSCC
We compared the levels of each lncRNA expression in the patient samples to identify the HNSCC-dysregulated lncRNA associated with invasion and metastasis. Considering that the clinical data of the patients were often incomplete, we focused on patients with the most complete set of criteria in terms of tumor status and follow-up data. We identified 167 dysregulated lncRNA transcripts ( Table 2, Additional file 2). Our results confirmed several of the previously reported lncRNAs in HNSCC. HOTAIR, for instance, promotes tumor cell invasion and metastasis by recruiting EZH2 and repressing E-cadherin in oral squamous cell carcinoma [5]. Several lncRNAs (e.g., LINC00964, DICER1-AS1, and PPIEL) with unexplored roles in the invasion and metastasis of HNSCC were revealed in our analysis. Moreover, many other established invasion-and metastasis-linked lncRNAs showed differential expression. TUBA4B, MEG3, and   [23][24][25][26][27][28]. Finally, our results indicated that EPB41L4A-AS2 was downregulated not only in cancer tissues but also in metastasis or recurrence tissues.

Significant correlation of EPB41L4A-AS2 with clinicopathological features and patient survival in HNSCC
Clinical characteristics for the total HNSCC cohort used are provided in Additional file 3. Low EPB41L4A-AS2 expression significantly correlated with increasing N classification, T classification, and tumor stage (Fig. 4a-c). Moreover, EPB41L4A-AS2 was aberrantly upregulated in HNSCC patients with perineural invasion (Fig. 4d). The relationship between EPB41L4A-AS2 expression and patient survival was analyzed. The cutoff value of EPB41L4A-AS2 was determined by receiver-operating characteristic analysis, which was employed to differentiate low and high TGF-β levels among patients (Additional file 4). High EPB41L4A-AS2 expression was significantly positively Fig. 3 Three dysregulated TGF-β-associated lncRNAs were found in HNSCC. a, b A negative correlation was observed between EPB41L4A-AS2 as well as LINC00515 and TGF-β. c A positive correlation was observed between MIR4435-2HG and TGF-β associated with a poor 3-year overall survival (OS) and relapse-free survival (RFS) in HNSCC (Fig. 4e,  f ). These data suggest that EPB41L4A-AS2 associated with disease progression and may has anti-oncogenic activity.

EPB41L4A-AS2 is downregulated in the CNE1-EMT model
We selected the nasopharyngeal carcinoma line CNE1 to establish an EMT model in HNSCC for result verification [29]. CNE1 was treated continuously with TGF-β for 8 days, which caused the CNE1 cells to undergo EMT (Fig. 5a). CNE1 was indicated by a spindle-shaped appearance, increased Snail, vimentin and N-cadherin expression levels, and decreased E-cadherin expression (Fig. 5b, c). QPCR assays revealed that TGF-β induced a large decrease in EPB41L4A-AS2 expression in the CNE1-EMT model (Fig. 5d). Moreover, the EPB41L4A-AS2 expression level was also downregulated in the 3-day treatment (Fig. 5e).

EPB41L4A-AS2 inhibits cell migration and invasion in the CNE1-EMT model
TGF-β expression was upregulated in HNSCC as previously described. Therefore, we detected the function of EPB41L4A-AS2 in the EMT model to approximate the environment in vivo. CNE-1-EMT was transfected with pcDNA3.1-EPB41L4A-AS2. The successful overexpression of EPB41L4A-AS2 in CNE-1-EMT was confirmed by qPCR (Fig. 6a). The transfected CNE1-EMT was subjected to a wound-healing assay.
Results showed that overexpression of EPB41L4A-AS2 decreased the migration capability of the CNE1-EMT model (Fig. 6b). Moreover, the transfected cells were cultured on a transwell apparatus. The numbers of migrated cells significantly decreased upon overexpression of EPB41L4A-AS2 (Fig. 6c). Furthermore, a matrigel model was established to examine the role of EPB41L4A-AS2 in invasion. The upregulation of EPB41L4A-AS2 decreased the invasion capability of the CNE1-EMT model (Fig. 6d). Moreover, overexpression of EPB41L4A-AS2 suppressed the expression of vimentin and promoted that of E-cadherin (Fig. 6e, f ). These findings indicate that EPB41L4A-AS2 restrained cell migration and invasion in the CNE1-EMT model, as well as may participate in TGFβ-induced EMT.

Analyses of the potential molecular mechanisms of EPB41L4A-AS2 by bioinformatics-based methods
The function of lncRNA involves a wide spectrum of biological contexts. The specific molecular mechanisms of EPB41L4A-AS2 have not been investigated in HNSCC. Our preliminary study indicated that EPB41L4A-AS2 inhibited TGFBR1/SMAD2/SMAD3 axis, but the specific mechanism by which EPB41L4A-AS2 participates in TGF-β signaling requires further investigation. Bioinformatics has recently become an effective approach to explore the potential molecular mechanisms of novel lncRNA. LncRNAs that target RNA-binding proteins (RBPs) are the vital mechanisms of lncRNA functions [4,31]. Identifying the predicted target RBPs provides a basis for understanding lncRNA functions. Thus, the RBPs of EPB41L4A-AS2 were analyzed using the RBP prediction program RBP database (http://rbpdb .ccbr.utoro nto.ca/). A part of analysis result are provided in Table 3. Among the different RBPs, several proteins have been proven to be associated with EMT. MBNL1 is a negative regulator of the TGF-β-dependent EMT of atrioventricular canal endocardial cells [32]. KHSRP silencing rewires posttranscriptional programs during TGF-β-induced EMT transition [33]. YBX1/YB-1 induces partial EMT and tumorigenicity through the secretion of angiogenic factors into the extracellular microenvironment [34].
Competing endogenous RNA (ceRNA) is currently under intense study as a novel mechanism for lncRNA functions, which indicates that RNA transcripts communicate with one another by miRNA response elements [31]. The ceRNA network of EPB41L4A-AS2 was Fig. 7 EPB41L4A-AS2 inhibits the TGFBR1/SMAD2/SMAD3 axis. a Western blot analysis to detect the expression of TGFBR1, Smad2, p-Smad2, Smad3, p-Smad3, Snail and β-actin. b The relative protein expression levels were represented as columns. c-e A negative correlation was found between EPB41L4A-AS2 and TGFBR1 or Snail1/2. * p < 0.05 and ** p < 0.01 constructed (Fig. 8a). Several miRNAs have contributed to EMT. For instance, miR-301a acts as an oncogene in laryngeal SCC and induces EMT in prostate cancer [35]. mRNAs from the ceRNA analysis serve as the target genes, and the lncRNA-mRNA pathway network was constructed (Fig. 8b). Several pathways are involved in the carcinogenesis of HNSCC metastases and invasion, such as the "PI3 K-Akt signaling pathway", "Wnt signaling pathway", and "mTOR signaling pathway". Moreover, several pathways, such as the "MAPK signaling pathway", "RAP1 signaling pathway", and "AMPK signaling pathway", are related to cancer development [36,37].
Antisense transcripts have a wide variety of biological roles and play an important role in modulating transcription of sense RNA [38]. EPB41L4A-AS2 is the antisense RNA of EPB41L4A, which is a gene involved in the β-catenin/tcf signaling pathway [39]. We measured the expression levels of EPB41L4A-AS2 and EPB41L4A in 564 HNSCC samples from TCGA. A positive correlation was observed between EPB41L4A-AS2 and EPB41L4A (Additional file 5). Furthermore, we analyzed the regulatory network of EPB41L4A in GCBI (https ://www.gcbi. com.cn/gclib /html/index ) (Fig. 8c), so as to gain a better understanding of the mechanism of EPB41L4A-AS2.

Discussion
In this study, we analyzed 50 tumor-adjacent normal pairs and identified 90 cancer-linked lncRNAs dysregulated in HNSCC relative to paired normal tissue. Then, TGF-β-induced lncRNA profile (GSE54800) was downloaded to identify the specific TGF-β-induced lncRNAs in HNSCC. Finally, our analysis confirmed three TGFβ-induced lncRNAs correlated with cancer. The key role of TGF-β-induced EMT is well established, and previous research has shown that EMT is a key molecular mechanism in tumor cell invasion and metastasis [40]. To identify the specific TGF-β-induced lncRNAs associated with invasion and metastasis in HNSCC, we compared the data of 39 patients with and 88 patients without distant metastasis or recurrence. Moreover, 167 lncRNAs were identified as differentially expressed. Finally, the expression of EPB41L4A-AS2 decreased in HNSCC and further decreased in distant metastasis or recurrence tissues. Moreover, a negative correlation was observed between EPB41L4A-AS2 and TGF-β in HNSCC tissues. Furthermore, we examined the relationship between EPB41L4A-AS2 expression and clinicopathological features in 507 patients with HNSCC. A lower level of EPB41L4A-AS2 was associated with T, N, and TNM stage in patients and was inversely correlated with prognosis. Therefore, EPB41L4A-AS2 was believed to have tumor suppressor gene activities and associated with the TGF-β signaling pathway.
Our results indicated that a low level of EPB41L4A-AS2 was associated with perineural invasion in patients with HNSCC, which is consistent with a previous finding that the biphasic activities of the TGF-β signaling pathway are also associated with perineural invasion [41]. Recent research has indicated that the level of TGF-β signaling pathway activity is higher in invasion and metastasis tissues than in primary HNSCC tissues, which further supports the regulation of EPB41L4A-AS2 by TGF-β in vivo [42].
Biphasic level of the TGF-β signaling pathway activity contributes to tumorigenesis, and recent research has indicated that TGF-β-induced lncRNAs inhibited the invasion-metastasis cascade of tumor in vivo [1]. Thus, the specific downstream lncRNAs of different TGF-β signaling pathways need to be explored further. In this study, the CNE1-EMT model was established by treatment with TGF-β. Our results indicated that EPB41L4A-AS2 was downregulated in the CNE1-EMT model, and overexpression of EPB41L4A-AS2 decreased the migration and invasion capability of the CNE1-EMT model. Moreover, overexpression of EPB41L4A-AS2 reversed the TGF-β-induced EMT. Thus, we identified that EPB41L4A-AS2 can be downregulated by TGF-β and mediates the role of TGF-β in inducing EMT. However, the mechanism by which TGF-β inhibits EPB41L4A-AS2 expression requires further investigation. Our results showed that a 3-day treatment with TGF-β was sufficient to downregulate EPB41L4A-AS2 expression, which implied that EPB41L4A-AS2 may be a direct target of the TGF-β/Smad pathway. Previous studies have identified that TGFBR1 is the vital receptor for TGF-β-induced EMT [30]. Our results indicated that overexpression of EPB41L4A-AS2 inhibited TGFBR1 expression. Furthermore, a negative correlation was found between EPB41L4A-AS2 and TGFBR1 in 563 HNSCC tissues from TCGA. However, the mechanism by which EPB41L4A-AS2 downregulates TGFBR1 expression requires further investigation. A major mechanism of lncRNAs in regulating gene expression involves the interaction with Polycomb repressive complex 2 (PRC2), which catalyzes the trimethylation of histone H3 lysine 27 (H3K27me3) to repress the transcription of specific genes [43]. Considering that the promoter region of TGFBR1 is rich in methylation, we hypothesized that EPB41L4A-AS2 plays a role in inducing the methylation of the TGFBR1 promoter (Additional file 6). The function of lncRNA involves many biological processes. We analyzed the specific molecular mechanisms of EPB41L4A-AS2 through a bioinformatics-based method. Our results indicated that EPB41L4A-AS2 may be a key participant in TGF-β signaling pathways through ceRNA or targeting RBPs. Several vital TGF-β signaling molecules, such as MBNL1 or miR-301a, may be regulated by EPB41L4A-AS2. Overall, our research demonstrated that EPB41L4A-AS2 is downregulated by TGF-β and mediates the role of TGF-β in inducing EMT. EPB41L4A-AS2 inhibits TGFBR1 expression, which possibly regulates TGF-β signaling pathways. Our findings have significant implications regarding the understanding of TGF-β-lncRNAs in HNSCC. EPB41L4A-AS2 could be an effective target for anti-metastasis therapies.

Conclusions
In summary, we revealed that EPB41L4A-AS2 is upregulated in HNSCC and downregulated by TGF-β. Our results indicated that EPB41L4A-AS2 might serve as a negative regulator of TGF-β signaling and as an effective prognostic biomarker. The newly identified EPB41L4A-AS2/TGFBR1/SMAD2/SMAD3 axis sheds light on a novel molecular mechanism for HNSCC cell metastasis, indicating that EPB41L4A-AS2 is a valuable biomarker and a promising therapeutic target for the management of HNSCC.