Identification of a 4-lncRNA signature predicting prognosis of patients with non-small cell lung cancer: a multicenter study in China

Previous findings have indicated that the tumor, node, and metastasis (TNM) staging system is not sufficient to accurately predict survival outcomes in patients with non-small cell lung cancer (NSCLC). Thus, this study aimed to identify a long non-coding RNA (lncRNA) signature for predicting survival in patients with NSCLC and to provide additional prognostic information to the TNM staging system. Patients with NSCLC were recruited from a hospital and divided into a discovery cohort (n = 194) and a validation cohort (n = 172), and their samples were analyzed using a custom lncRNA microarray. Another 73 NSCLC cases obtained from a different hospital (an independent validation cohort) were examined with qRT-PCR. Differentially expressed lncRNAs were determined with the Significance Analysis of Microarrays program, from which lncRNAs associated with survival were identified using Cox regression in the discovery cohort. These prognostic lncRNAs were employed to construct a prognostic signature with a risk-score method. Then, the utility of the prognostic signature was confirmed using the validation cohort and the independent cohort. In the discovery cohort, we identified 305 lncRNAs that were differentially expressed between the NSCLC tissues and matched, adjacent normal lung tissues, of which 15 were associated with survival; a 4-lncRNA prognostic signature was identified from the 15 survival-associated lncRNAs and was significantly correlated with the survival of NSCLC patients. This signature was further validated in the validation cohort and the independent validation cohort. Moreover, multivariate Cox analysis demonstrated that the 4-lncRNA signature is an independent survival predictor. We then established a new risk-score model by combining the 4-lncRNA signature and the TNM stage. Receiver operating characteristic (ROC) curve analysis indicated that the prognostic value of the combined model is significantly higher than that of the TNM stage alone in all the cohorts.
In this study, we identified a 4-lncRNA signature that may be a powerful prognosis biomarker and can provide additional survival information to the TNM staging system.


Background
Lung cancer is the most common and lethal malignant disease in the world, and approximately 85% of lung cancer cases are non-small cell lung cancer (NSCLC) [1]. In clinical practice, delayed diagnosis and the lack of effective prognostic biomarkers are two main reasons for the poor survival of patients with NSCLC [2,3]. The 5-year survival rates for patients with late-stage lung cancer and those with stage-I lung cancer are 15% and 83%, respectively [4]. Currently, the treatment strategy and prognosis of lung cancer are mainly determined according to the TNM staging system. However, NSCLC patients with the same TNM stage may have different prognoses [2,5,6]. Therefore, an urgent need exists for new biomarkers that can help improve the accuracy of prognosis prediction, which would enhance the quality of life of patients as well as the survival rate [7,8].
With the development and advancement of high-throughput technologies, numerous investigators have proposed using single genes or gene sets (signatures) as biomarkers for tumor diagnosis, prognosis, disease classification, and personalized treatment. Genomic abnormalities such as DNA mutations, copy-number variations, DNA methylation, and gene expression have been investigated for their usefulness in identifying prognostic biomarkers in patients with NSCLC. High-throughput technologies like microarray and RNA-sequencing (RNA-seq) have enabled simultaneous analysis of hundreds or thousands of genes and their relationships with clinical features, including the survival of patients with cancer, which has led to the discovery of many novel biomarkers (single genes or signatures) for diagnosis, prognosis, and targeted therapy in patients with NSCLC [9,10]. However, only a few molecular biomarkers have been evaluated in clinical practice (mainly as therapeutic targets) [11] because most of the biomarkers show low accuracy (low sensitivity and/or specificity) [12] or need to be further confirmed with a larger population in an independent validation study [13]. Therefore, more reliable biomarkers are still needed to improve diagnosis, prognosis, and personalized therapy for NSCLC patients.
Long non-coding RNAs (lncRNAs) that are expressed at high levels in the body have exhibited superior potential as novel diagnostic or prognostic biomarkers when compared to protein-coding genes, which raises the possibility of identifying more reliable biomarkers for lung cancer [14,15]. LncRNAs are a type of non-coding RNA that are longer than 200 nucleotides [16,17]. Accumulating reports have shown that lncRNAs can participate in numerous biological processes, such as the regulation of epigenetic modification, cell cycle progression, and cell differentiation. Growing evidence shows that numerous lncRNAs are significantly deregulated in various types of cancers and play important roles in tumorigenesis [18][19][20]. An increasing number of lncRNAs have been shown to be dysregulated and involved in lung cancer tumorigenesis, and to be useful as diagnostic or prognostic biomarkers, or as targets for therapy. For example, the lncRNAs MALAT1 and NEAT1 play important roles in lung cancer cell proliferation, cell cycle progression, and apoptosis, as well as tumor progression and prognosis [21][22][23][24][25]. Inhibitors targeting MALAT1 significantly reduced lung cancer metastasis in a mouse model [21]. The prognostic role of lncRNA signatures in NSCLC has been investigated in many reports using data downloaded from the Gene Expression Omnibus (GEO) database or The Cancer Genome Atlas (TCGA) database. However, lncRNA expression profiling aimed specifically at identifying a prognostic signature in a large, multicenter cohort of NSCLC patients has not yet been reported. Therefore, the prognostic value and potential clinical application of lncRNA signatures in NSCLC patients need to be explored further and systematically.
In this study, we performed what is, to our knowledge, the first multicenter retrospective study on the prognosis of a total of 439 NSCLC patients using a custom lncRNA microarray and qRT-PCR. NSCLC patients from South China were randomly divided into a discovery cohort (194 cases) and a validation cohort (172 cases), and those from Southwest China were used as an independent validation cohort (73 cases). A 4-lncRNA signature was established to predict the survival of NSCLC patients in the discovery cohort, and was validated in the validation and independent cohorts.

Patients and clinical information
A total of 439 NSCLC cases were collected for this study. These patients underwent radical resection of lung cancer at the Sun Yat-Sen University Cancer Center (n = 366) or Yunnan Cancer Hospital (n = 73) between 2003 and 2008. Matched cancer tissues and adjacent normal tissues were obtained from each patient recruited at Sun Yat-Sen University Cancer Center. The inclusion criteria for our study were: (i) NSCLC was confirmed by pathological diagnosis and reviewed by 2 experienced pathologists, (ii) the patient did not receive any form of anti-tumor therapy before surgery, (iii) the patient did not die within 1 month after surgery, and (iv) the patient's sample was preserved at −80 °C immediately after surgery. The samples collected from the 366 patients enrolled at Sun Yat-Sen University Cancer Center were divided randomly into a discovery cohort (n = 194) and a validation cohort (n = 172). Seventy-three patients with NSCLC were recruited from Yunnan Cancer Hospital (using the inclusion criteria described above) and assigned to an independent validation cohort. Overall survival (OS) was defined as the time from the date of surgery to the date of death or last follow-up, and disease-free survival (DFS) was defined as the time from the date of surgery to the date of first recurrence or distant metastasis, death, or the last follow-up. The clinicopathological characteristics of the patients in all three cohorts are shown in Table 1. This study was reviewed and approved by the Ethical Committees of Sun Yat-Sen University Cancer Center and Yunnan Cancer Hospital. Written informed consent was obtained from each patient.

RNA extraction
RNA was extracted from tumor and normal lung tissues using the TRIzol reagent (Invitrogen, Carlsbad, CA, USA) and homogenized with a Bullet Blender (Vortex-Genie 2), according to the manufacturer's instructions. Briefly, each tissue (100 mg) was mixed with 1 mL TRIzol reagent and homogenized in a Bullet Blender at 4 °C for 15 min, after which the mixtures were incubated at 25 °C for 5 min. After adding chloroform, the mixtures were vigorously shaken for 15 s, incubated at room temperature for 10 min, and then centrifuged for 15 min at 4 °C and 14,000 rpm. After each supernatant was transferred to a new tube, an equal volume of isopropyl alcohol was added, and the tube contents were mixed. After holding the tubes at room temperature for 10 min, the supernatants were discarded after centrifugation. Each precipitate was washed with 75% ethanol, and then the ethanol was removed after additional centrifugation. After allowing the residual ethanol to evaporate, double-distilled H2O was added to dissolve the RNA. Finally, the concentration and quality of each extracted RNA were measured with an ND-1000 spectrophotometer (NanoDrop Technologies) to ensure that they met the requirements of the microarray and qRT-PCR experiments.

Quantitative RT-PCR
Total RNA (1 µg) was reverse transcribed using the GoScript™ Reverse Transcription System (Promega), which includes oligo(dT) primers and random primers for the reverse transcription step, and qPCR was performed using GoTaq® qPCR (Promega) and SYBR Green on a PRISM 7900HT system (Applied Biosystems). Each sample was analyzed in triplicate wells, and reactions without cDNA were included as negative controls. The thermal cycling conditions were as follows: 94 °C for 5 min (for the hot start step), followed by 40 cycles of 94 °C for 15 s and 60 °C for 30 s. The sequences of the primers used in this study are shown in Additional file 1: Table S1. The PCR data were processed by normalizing the median expression value of a given lncRNA to the expression of GAPDH in the same sample. Relative lncRNA-expression levels were quantified using the 2^(−ΔΔCt) method.
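The 2^(−ΔΔCt) calculation can be illustrated with a minimal Python sketch. It assumes that the median Ct of the triplicate wells is used for both the target lncRNA and GAPDH, and that adjacent normal tissue is the calibrator; the function name and the Ct values are hypothetical and are not study data.

```python
from statistics import median


def relative_expression(ct_target_tumor, ct_ref_tumor,
                        ct_target_normal, ct_ref_normal):
    """Tumor-vs-normal fold change of a target lncRNA by the 2^(-ddCt) method.

    Each argument is a list of Ct values from replicate wells.
    Assumption: the median of the replicates summarizes each sample.
    """
    # dCt = Ct(target) - Ct(reference gene, here GAPDH) within one sample
    delta_ct_tumor = median(ct_target_tumor) - median(ct_ref_tumor)
    delta_ct_normal = median(ct_target_normal) - median(ct_ref_normal)
    # ddCt = dCt(tumor) - dCt(normal calibrator)
    delta_delta_ct = delta_ct_tumor - delta_ct_normal
    return 2.0 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values, for illustration only
fold_change = relative_expression(
    ct_target_tumor=[24.0, 24.1, 23.9],
    ct_ref_tumor=[18.0, 18.1, 17.9],
    ct_target_normal=[25.0, 25.2, 24.8],
    ct_ref_normal=[18.0, 18.0, 18.0],
)
print(f"fold change (tumor / normal): {fold_change:.2f}")
```

A value above 1 indicates higher expression in the tumor than in the paired normal tissue, and a value below 1 indicates lower expression.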

LncRNA microarray fabrication and hybridization
Human lncRNA transcript sequences selected from public lncRNA databases, including the LNCipedia, LncRNAdb, LncRNADisease, and EST databases, were used to design probes for constructing an lncRNA microarray, and 2412 probes were successfully designed. The lncRNA microarray was fabricated in-house and hybridized as described previously [26,27]. RNA samples obtained from the 366 cancer samples and 100 normal lung tissues in the discovery and validation cohorts were examined with the lncRNA microarray. Briefly, each probe was mixed with printing buffer to a final concentration of 40 μmol/L and printed in duplicate on cleaned glass slides (75 × 25 mm). Total RNA (2.0 μg) was labeled with 100 nmol/L of Cy5-dUTP (Enzo Life Sciences, New York, USA) during reverse transcription. Then the mixture of labeled RNA sample and 1× hybridization solution was hybridized onto the microarray for 12-18 h at 45 °C. After hybridization, the slides were washed in 1× SSC/1% SDS for 10 min at 45 °C, followed by sequential washing in 2 cycles of 0.5× SSC/0.1% SDS, 2 cycles of 0.2× SSC, and 1 cycle of purified water, each for 1 min at room temperature, and then dried in a small centrifuge and scanned using the InnoScan 700A Scanner (Innopsys Inc, France).

Microarray data processing
The raw microarray data were first processed by subtracting the background signals and then normalized with the quantile method and a log transformation. The log-transformed data were deposited in the GEO database (National Center for Biotechnology Information website), under GEO Accession number GSE143018 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE143018).
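The three preprocessing steps named above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline actually used: quantile normalization is assumed to precede the log2 transformation, corrected intensities are floored at 1.0 so that the logarithm is defined, and ties are broken by probe order rather than averaged. The function name and the intensities are hypothetical.

```python
import math


def preprocess(raw, background):
    """Background-subtract, quantile-normalize, and log2-transform intensities.

    raw, background: dict mapping sample name -> list of probe intensities,
    with the same probe order in every sample.
    """
    samples = list(raw)
    n_probes = len(raw[samples[0]])

    # Step 1: background subtraction (floored at 1.0, an assumption)
    corrected = {
        s: [max(r - b, 1.0) for r, b in zip(raw[s], background[s])]
        for s in samples
    }

    # Step 2: quantile normalization. The reference distribution is the
    # mean across samples of the sorted intensities at each rank.
    sorted_values = [sorted(corrected[s]) for s in samples]
    reference = [
        sum(values[rank] for values in sorted_values) / len(samples)
        for rank in range(n_probes)
    ]

    normalized = {}
    for s in samples:
        # Probe indices ordered by intensity (ties broken by probe order)
        order = sorted(range(n_probes), key=lambda i: corrected[s][i])
        values = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            values[probe_index] = reference[rank]
        # Step 3: log2 transformation
        normalized[s] = [math.log2(v) for v in values]
    return normalized


# Hypothetical intensities for two samples and three probes
raw = {"A": [110.0, 210.0, 410.0], "B": [810.0, 110.0, 210.0]}
background = {"A": [10.0, 10.0, 10.0], "B": [10.0, 10.0, 10.0]}
normalized = preprocess(raw, background)
print(normalized)
```

After normalization every sample shares the same set of values, and only the assignment of those values to probes differs between samples.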
To identify differentially expressed lncRNAs between lung cancer tissues and paired normal lung tissues, the Significance Analysis of Microarrays (SAM) program was employed to identify lncRNAs with a fold-change of > 1.25, a P-value of < 0.01, and a false-discovery rate (FDR) of < 0.01 (t test). Hierarchical-clustering analysis (for classifying the samples in the discovery cohort) was performed using the average-linkage method and uncentered Pearson's correlation coefficients in MEV software, version 4.2.

Statistical analysis
Correlations between the 4-lncRNA prognostic signature and clinical characteristics were assessed by Fisher's exact test and the χ 2 test, using SPSS software, version 23.0. The prognostic accuracies of the 4-lncRNA signature, the TNM staging system, and the combined-risk model were compared with receiver operating characteristic (ROC) curves, which were generated using MedCalc software, version 11.4.2. The OS and DFS of patients were assessed using the Kaplan-Meier method, and the corresponding graphs were generated using GraphPad Prism software, version 8.0.
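As an illustration of how two predictors can be compared by the area under the ROC curve (AUC), the following Python sketch computes the AUC as the probability that a patient with an event has a higher score than a patient without one, counting ties as one half. It assumes a binary outcome and ignores censoring; the curves in this study were generated with MedCalc, whose exact procedure is not reproduced here. All values are hypothetical.

```python
def roc_auc(scores, events):
    """AUC of a score for a binary outcome (1 = event, 0 = no event).

    Computed as the fraction of (event, non-event) pairs in which the
    event patient has the higher score; ties count as 0.5.
    """
    positives = [s for s, e in zip(scores, events) if e == 1]
    negatives = [s for s, e in zip(scores, events) if e == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data for six patients, for illustration only
events = [0, 1, 0, 1, 0, 1]          # 1 = death observed
tnm_stage = [1, 1, 2, 2, 3, 3]       # stage I, II, III coded as 1, 2, 3
combined_score = [1, 2, 2, 3, 3, 4]  # a hypothetical combined risk score

auc_tnm = roc_auc(tnm_stage, events)
auc_combined = roc_auc(combined_score, events)
print(f"TNM stage AUC: {auc_tnm:.3f}, combined score AUC: {auc_combined:.3f}")
```

A formal comparison of two AUCs additionally requires a significance test for correlated ROC curves, which this sketch does not include.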
The impacts of the lncRNA-expression level and clinical characteristics on DFS and OS were determined using univariate and multivariate Cox-regression models. By employing the risk-score method reported previously [28,29], 15 lncRNAs were incorporated into different combinations to construct a signature and tested by survival analysis, and the lncRNAs were gradually subtracted from the combinations to obtain a final 4-lncRNA signature with the greatest prognostic value.
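A risk score of this kind is a linear combination of expression levels weighted by Cox regression coefficients, followed by a split of the patients at the median score. The Python sketch below illustrates the idea only: the lncRNA names, coefficients, and expression values are hypothetical, and assigning scores equal to the median to the low-risk group is an assumption.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Sum of (Cox coefficient x expression level) over the signature lncRNAs."""
    return sum(coefficients[name] * expression[name] for name in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score.

    Assumption: a score equal to the median is assigned to the low-risk group.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients: positive = risk-associated, negative = protective
coefficients = {"lncRNA_A": 0.5, "lncRNA_B": -0.25}

# Hypothetical normalized expression levels per patient
patients = {
    "P1": {"lncRNA_A": 2.0, "lncRNA_B": 4.0},
    "P2": {"lncRNA_A": 4.0, "lncRNA_B": 0.0},
    "P3": {"lncRNA_A": 0.0, "lncRNA_B": 4.0},
    "P4": {"lncRNA_A": 8.0, "lncRNA_B": 4.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

Candidate signatures built from different lncRNA subsets can then be compared by repeating the survival analysis on the resulting high- and low-risk groups.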

Detection of lncRNA-expression profiles in NSCLC tissues from the discovery cohort, using a custom microarray
The 366 patients with NSCLC from Sun Yat-Sen University Cancer Center in Southern China were randomly divided into a discovery cohort and a validation cohort. The clinical characteristics of these patients are shown in Table 1. We first detected the lncRNA-expression profiles in 194 NSCLC samples and 100 matched normal lung tissues in the discovery cohort, using an in-house generated lncRNA microarray containing 2412 human lncRNA probes. After subtracting the background signals, and normalizing and log-transforming the microarray data, we analyzed the lncRNA-expression profiles with the SAM program and Student's t test, and identified 305 differentially expressed lncRNAs between the NSCLC tissues and adjacent normal lung tissues (FDR = 0 and fold-change > 1.25), of which 138 lncRNAs were upregulated and 167 were down-regulated in the NSCLC tissues (Additional file 1: Fig. S1 and Table S2). The log-transformed microarray data were submitted and deposited in the GEO database.
To confirm the reliability and repeatability of the microarray results, 5 of the 15 prognostic lncRNAs were selected for qRT-PCR analysis with 30 pairs of samples that were randomly selected from the discovery cohort. Of these 5 lncRNAs, 2 (NEAT1 and XLOC_009261) were up-regulated and 3 (XLOC_005302, XLOC_001306, and lnc-GAN1) were down-regulated in the lung cancer tissues compared with the normal lung tissues. The expression-level ratios of the 5 lncRNAs in cancer tissues versus adjacent tissues detected by qRT-PCR were consistent with the microarray results (Fig. 1a), and significant correlations were found between the qRT-PCR and microarray data for the 5 lncRNAs (Fig. 1b-f). These results indicate that the lncRNA-expression levels detected with the lncRNA microarray are reliable and reproducible and can be used for further analysis.

Identification of a 4-lncRNA prognostic signature for NSCLC patients in the discovery cohort
To elucidate the prognostic significance of lncRNAs in NSCLC, we conducted univariate Cox regression analysis on all 305 differentially expressed lncRNAs in the discovery cohort. Based on the threshold of P-value<0.05, 15 lncRNAs were significantly associated with OS in the NSCLC patients (Table 2), of which 6 lncRNAs were risky and 9 lncRNAs were protective.
To determine an optimal lncRNA combination (signature) for predicting the survival outcomes of patients with NSCLC, we employed the 15 lncRNAs associated with survival to establish a prognostic signature with a risk-score method, as previously reported [28,29]. Using this method, we established a 4-lncRNA signature with the highest prognostic power, consisting of NEAT1, lnc-GAN1, ASLNC11245, and GSO_1539832_023. Based on the expression levels of the 4 lncRNAs (measured by microarray analysis and weighted by their corresponding regression coefficients derived from univariate Cox regression analysis), the risk scores were calculated as follows:
Risk score = (0.412 × NEAT1 level) + …
The risk-score formula was used to calculate risk scores for each patient, and the patients were divided into high- and low-risk groups according to the median risk score. Kaplan-Meier survival analysis showed that patients in the high-risk group had remarkably lower OS and DFS rates than those in the low-risk group (Fig. 2a), implying that this prognostic signature is potentially highly effective for predicting the survival of patients with NSCLC.

Validation of the 4-lncRNA prognostic signature in patients with NSCLC from a multicenter registry
To verify the prognostic value of the 4-lncRNA signature identified in the discovery cohort, we attempted to validate it with NSCLC patients from two different geographical locations, where one cohort was used as an internal validation cohort, and the other was used as an independent validation cohort. First, we tested the 4-lncRNA signature with the internal validation cohort (n = 172 NSCLC samples) acquired from the same center as the discovery cohort in southern China. The NSCLC samples in the internal validation cohort were analyzed using the same lncRNA microarray and risk-score formula that were used for the discovery cohort. Based on the risk scores, patients in the internal validation cohort were classified into high-risk and low-risk groups. Survival analysis showed that patients in the high-risk group had significantly lower OS and DFS rates than those in the low-risk group (Fig. 2b), which was consistent with the results obtained in the discovery cohort.
Second, we tested the 4-lncRNA prognostic signature with another 73 NSCLC samples (as an independent validation cohort) obtained from another medical center in southwestern China and detected the expression of the 4 lncRNAs using qRT-PCR. Then, univariate Cox regression analysis was performed on the 4 lncRNAs, and a risk-score formula was constructed with the same method used in the discovery cohort:
Risk score = (0.297 × NEAT1 level) + …
We calculated the risk score for each patient in the independent validation cohort with the new formula (shown immediately above). By applying the median risk score as the cutoff point, patients were categorized into high- and low-risk groups. As shown in Fig. 2c, the OS and DFS rates of patients with NSCLC in the high-risk group were significantly lower than those in the low-risk group, which was in concordance with the results obtained from the discovery and internal validation cohorts. The above results demonstrated that the 4-lncRNA signature is correlated significantly with the prognosis of patients with NSCLC from a multicenter cohort in different geographical regions, suggesting that the 4-lncRNA signature is a new and powerful prognostic biomarker for patients with NSCLC from different regions of China.
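The survival comparisons between high- and low-risk groups rest on the Kaplan-Meier (product-limit) estimator. The Python sketch below shows the estimator itself under simple assumptions; it does not include the log-rank test, and the follow-up times and event indicators are hypothetical values rather than study data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time of each patient (e.g., months).
    events: 1 if the event (e.g., death) was observed, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (months), for illustration only
high_risk = kaplan_meier(times=[6, 12, 18, 24], events=[1, 1, 0, 1])
low_risk = kaplan_meier(times=[20, 30, 40, 50], events=[0, 1, 0, 0])
print("high-risk:", high_risk)
print("low-risk:", low_risk)
```

Censored patients contribute to the number at risk until their last follow-up but never lower the survival estimate themselves.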

The 4-lncRNA prognostic signature was independent of the TNM staging system
To gain deeper insight into the clinical significance of the 4-lncRNA signature, we first conducted a correlation analysis between the signature and the clinical characteristics. The results showed that the 4-lncRNA signature did not correlate with any clinical characteristic (Table 3), implying that the signature was independent of the clinical characteristics. Then, we carried out a univariate Cox regression analysis of the signature and clinical characteristics. The results revealed that only the 4-lncRNA signature and TNM stage were associated with the OS (Table 4) and DFS (Table 5) rates of patients with NSCLC in all 3 cohorts, providing further evidence that the 4-lncRNA signature is a useful prognostic indicator. Finally, we performed a multivariate Cox regression analysis on the 4-lncRNA signature and all clinical characteristics. After adjustment for other clinicopathological variables, both the 4-lncRNA signature and the TNM stage correlated significantly with the OS and DFS rates of patients in all 3 cohorts, whereas other factors did not (Table 6). To further confirm the utility of the 4-lncRNA signature as an independent predictive factor for survival, we performed a stratified analysis of patients at three different TNM stages with the 4-lncRNA prognostic signature. Patients in the same TNM stage (stage I, II, or III) were divided into high- or low-risk subgroups, based on the risk scores generated with the 4-lncRNA prognostic signature. The results showed that NSCLC patients with high risk scores generally had significantly lower OS and DFS rates than those with low risk scores (Fig. 3) in stage I, II, or III, indicating that the prognostic 4-lncRNA signature performs independently of the TNM staging system. Collectively, these results indicated that the 4-lncRNA signature is a powerful and independent prognostic indicator for patients with NSCLC.

The 4-lncRNA signature provides additional prognostic information to the TNM staging system in patients with NSCLC
In clinical practice, the traditional TNM staging system is the main assessment used to predict the survival of patients with NSCLC and to determine the treatment strategy. However, the TNM staging system is mainly based on anatomical information and does not include factors related to the tumor biology. Therefore, the TNM system is insufficient for predicting survival outcomes in patients with NSCLC [30]. For example, Kaplan-Meier survival analysis of the 3 cohorts in this study showed that the TNM staging system did not effectively determine the prognosis of NSCLC patients at different stages, especially in stages I and II (Fig. 4). To improve the ability of the TNM staging system to predict patient survival, we established a new risk-score model by combining the risk scores of the 4-lncRNA signature and the TNM staging system: low- and high-risk signatures were scored as 0 and 1, respectively, and stage I, II, and III NSCLC were scored as 1, 2, and 3, respectively. Patients with combined scores of 1, 2-3, or 4 were classified as low-, medium-, or high-risk patients, respectively. Then we performed Kaplan-Meier survival analysis of the patients with different combined risks in the 3 cohorts. The results revealed significant differences in OS and DFS rates between patients with low, medium, or high risk in the discovery cohort (Fig. 5a), and these results were confirmed in the internal validation and independent validation cohorts (Fig. 5b, c). Next, receiver operating characteristic (ROC) analysis was performed to compare the accuracy of the TNM staging system and the combined-risk model. ROC analysis showed that the combined-risk model achieved a significantly higher predictive accuracy for OS (AUC = 0.726 vs. 0.644) and DFS (AUC = 0.723 vs. 0.641) than that achieved by the TNM staging system in the discovery cohort (Fig. 6a).
Similar results were observed in the internal validation cohort and the independent validation cohort (Fig. 6b, c). These results demonstrated that the 4-lncRNA signature can provide additional prognostic information and improve the prognostic power of the TNM staging system.
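The scoring rule of the combined-risk model (signature risk scored 0 or 1, TNM stage scored 1 to 3, and combined scores of 1, 2-3, or 4 mapped to low, medium, or high risk) can be written as a short Python sketch. The function name and the string labels are illustrative choices, not part of the original analysis.

```python
def combined_risk_group(signature_risk, tnm_stage):
    """Combine the 4-lncRNA signature risk group with the TNM stage.

    signature_risk: "low" or "high" (scored 0 or 1).
    tnm_stage: "I", "II", or "III" (scored 1, 2, or 3).
    Returns (combined score, combined risk group).
    """
    signature_points = {"low": 0, "high": 1}[signature_risk]
    stage_points = {"I": 1, "II": 2, "III": 3}[tnm_stage]
    total = signature_points + stage_points
    if total == 1:
        group = "low"
    elif total == 4:
        group = "high"
    else:  # combined score of 2 or 3
        group = "medium"
    return total, group


# Enumerate every combination of signature risk and stage
for stage in ("I", "II", "III"):
    for risk in ("low", "high"):
        print(stage, risk, combined_risk_group(risk, stage))
```

Under this rule only stage I patients with a low-risk signature fall into the low-risk group, and only stage III patients with a high-risk signature fall into the high-risk group.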

Discussion
LncRNAs are widely dysregulated in various cancers and participate in a diverse range of associated biological functions. Numerous aberrant lncRNAs have been detected as hallmarks of cancers and can potentially be used for diagnosis, prognosis, and targeted therapy in cancer. Some investigators have discovered lncRNA profiles and lncRNA signatures in NSCLC by mining data from the GEO and TCGA databases. For example, Zhou et al. [31] analyzed the lncRNA-expression profiles of 603 patients from 3 independent NSCLC cohorts in the GEO database and developed a risk-score model based on the expression of 8 lncRNAs, which were significantly associated with OS in patients with NSCLC. Lin et al. [10] identified a 7-lncRNA signature for predicting the OS of patients with NSCLC after combining lncRNA profiles from 4 GEO datasets and validated the signature in 2 independent datasets (TCGA and GSE31210). Recently, He et al. [32] proposed a novel 8-gene signature as a prognostic indicator for patients with early-stage NSCLC after analyzing data from the GEO and TCGA projects. However, the abovementioned prognostic signatures generated by data mining have not been confirmed in patients with NSCLC in a prospective multicenter study. Therefore, the clinical application of prognostic lncRNA biomarkers in NSCLC remains very limited to date. Here, we report the first lncRNA-expression profiling (as determined by microarray analysis) of a large cohort of patients with NSCLC and the identification of an effective prognostic 4-lncRNA signature.
In this study, we identified 305 aberrantly expressed lncRNAs in 194 NSCLC tissues when compared with those in matched normal tissues in the discovery cohort, using a custom lncRNA microarray containing 2412 probes. Notably, we identified a novel 4-lncRNA prognostic signature for patients with NSCLC in the discovery cohort. Kaplan-Meier survival analysis demonstrated the effective prognostic performance of the 4-lncRNA signature in all 3 cohorts. Multivariate Cox regression analysis identified the 4-lncRNA signature as an independent prognostic factor for patients with NSCLC in all the cohorts.
Although TNM staging is widely accepted for disease prognosis and guiding treatment decisions for most solid cancers (including NSCLC), at present, the TNM staging system has critical limitations and insufficiencies in clinical practice, due to intra-tumoral molecular and genetic heterogeneities among patients with lung cancer. The clinical outcomes of lung cancer patients with similar clinical and pathological features are often quite different after receiving similar treatments. Therefore,  an advanced TNM stage and lymphatic metastasis in patients with NSCLC [33]. Previous findings revealed that NEAT1 promoted the epithelial-mesenchymal transition and metastasis in NSCLC via the Wnt/β-catenin pathway [25,34]. However, the association of NEAT1 with the survival of patients with lung cancer has not been reported previously. Consistent with published reports, we found that NEAT1 expression was significantly higher in NSCLC tissues than in adjacent normal tissues (fold-change = 1.7). Moreover, we found the first evidence that NEAT1 can serve as an independent prognostic indicator for patients with NSCLC (unpublished data). To our knowledge, the remaining 3 lncRNAs (lnc-GAN1, ASLNC11245, and GSO_1539832_023) in the prognostic 4-lncRNA signature have not been functionally annotated. In our study, these 3 lncRNAs were significantly down-regulated in lung cancer tissues compared with adjacent normal tissues (fold-change = 0.39, 0.75, and 0.47, respectively), and high expression levels of these lncRNAs could serve as indicators for a good prognosis of patients with NSCLC. Current treatment strategies for lung cancer have led to a comprehensive approach that includes surgery, radiotherapy, chemotherapy, targeted therapy, gene therapy, and immunotherapy [35,36].
Based on insights gained into the molecular mechanisms underlying NSCLC in the past 10 years, therapies directed at molecular targets, such as tyrosine kinase inhibitors against members of the epidermal growth factor receptor (EGFR) family (EGFR-TKIs) and agents targeting programmed cell death protein 1, have entered clinical use [37][38][39][40][41][42][43]. Even though these targeted therapies have improved the survival rates and quality of life of patients with NSCLC, their effects are far from satisfactory. Most patients exhibit drug resistance or disease progression after receiving treatment for a certain period of time [44,45]. Therefore, specific biomarkers for monitoring therapeutic responses in patients with NSCLC are urgently needed. By applying microarray and RNA-seq technology in cancer research, numerous molecular biomarkers have been identified that can predict the responses to specific treatment regimens [46][47][48]. Among the 4 lncRNAs in the signature identified in this study, NEAT1 was significantly up-regulated in paclitaxel-resistant NSCLC cells and contributed to paclitaxel resistance by activating the Akt/mTOR signaling pathway [49]. Recent data showed that NEAT1 can inhibit apoptosis in multiple myeloma cells by regulating genes involved in DNA-repair processes, including the homologous-recombination pathway, suggesting its association with drug resistance [49]. Therefore, NEAT1, a component of our 4-lncRNA signature, may play an important role in NSCLC.
Although the 4-lncRNA prognostic signature is a novel and potentially powerful predictor of survival in NSCLC patients, further prospective validation studies in larger cohorts and clinical trials are still required. This study also has other limitations. First, although the 4-lncRNA signature was identified in a large number of NSCLC samples from 2 different regions of China, the signature still needs to be validated in a larger prospective multicenter study involving patients from more institutions and other countries. Second, models based on multiple types of markers are thought to provide better prognostic value than those based on a single type of marker. Thus, further study will be conducted to identify a multi-gene panel by integrating lncRNAs, microRNAs, and messenger RNAs, with the aim of obtaining a more accurate prognostic assessment of NSCLC. Finally, further experiments need to be performed to elucidate the characteristics and functions of the identified prognostic lncRNAs.

Conclusions
In this study, our findings reveal a tumor-specific lncRNA expression profile in NSCLC tissues and a novel prognostic signature based on 4 lncRNAs, which is a powerful and independent predictor of OS and DFS in patients with NSCLC. Moreover, a new prognostic model is developed by combining the 4-lncRNA signature and TNM stage to refine the current staging system and to improve the predictive performance. The results of our study suggest that the 4-lncRNA classifier might serve as a precise predictive biomarker for selecting high-risk patients who might benefit from adjuvant therapy and thus guide the personalized management of patients with NSCLC.