A novel immune checkpoints-based signature to predict prognosis and response to immunotherapy in lung adenocarcinoma

Except for B7-CD28 family members, more novel immune checkpoints are being discovered. They are closely associated with tumor immune microenvironment and regulate the function of many immune cells. Various cancer therapeutic studies targeting these novel immune checkpoints are currently in full swing. However, studies concerning novel immune checkpoints phenotypes and clinical significance in lung adenocarcinoma (LUAD) are still limited. We enrolled 1883 LUAD cases from nine different cohorts. The samples from The Cancer Genome Atlas (TCGA) were used as a training set, whereas seven microarray data cohorts and an independent cohort with 102 qPCR data were used for validation. The immune profiles and potential mechanism of the system were also explored. After univariate Cox proportional hazards regression and stepwise multivariable Cox analysis, a novel immune checkpoints-based system (LTA, CD160, and CD40LG) were identified from the training set, which significantly stratified patients into high- and low-risk groups with different survivals. Furthermore, this system has been well validated in different clinical subgroups and multiple validation cohorts. It also acted as an independent prognostic factor for patients with LAUD in different cohorts. Further exploration suggested that high-risk patients exhibited distinctive immune cells infiltration and suffered an immunosuppressive state. Additionally, this system is closely linked to various classical immunotherapy biomarkers. we constructed a novel immune checkpoints-based system for LUAD, which predicts prognosis and immunotherapeutic implications. We believe that these findings will not only aid in clinical management but will also shed some light on screening appropriate patients for immunotherapy.


Introduction
Lung adenocarcinoma (LUAD) is the most common variant of non-small cell lung cancer (NSCLC) across the world. LUAD has become significantly more prevalent over the past two decades, with a similar decrease in squamous cell carcinoma during the same period [1]. There have been advances in the current strategy of combining various therapies with tyrosine kinase inhibitors (TKIs) to treat patients with LAUD. However, the 5-year overall survival (OS) rate for patients with LUAD remains poor [2,3]. At present, there is an urgent need to identify specific prognostic biomarkers in patients with LAUD, to enable the development of optimized, appropriate treatment protocols and clinical management for patients with different elements of risk. Several studies have attempted to predict and evaluate prognosis in patients with LAUD, through diverse gene expression profiles and bioinformatics approaches [4][5][6]. However, these studies usually incorporate genes from the entire genome or transcriptome and do not consider the intrinsic biological functions. This may give rise to the fact that many of these signatures remain merely mathematical but unable to truly reflect the real characters inherent in the tumor. In recent years, the treatment of LUAD has developed rapidly. In addition to traditional surgery, radiotherapy, and chemotherapy, molecular targeted therapy, novel strategy like immunotherapy has been becoming of increasingly great concern [7,8]. The introduction of immune checkpoint-blocking antibodies that target programmed death 1 (PD-1) receptor as well as against its ligand, programmed death-ligand 1 (PD-L1) of the B7-CD28 family has significantly improved survival rates in patients with advanced lung cancer. These antibodies are available for clinical use [9][10][11]. However, beneficial immunotherapeutic effects were only observed in a subgroup of patients with response rates of 17-21% [12], suggesting that there may be other immune checkpoints in the LAUD tumor microenvironment (TME).
In addition to these star targets of the B7-CD28 family, several emerging and promising immune checkpoints have been identified [13]. Most of these novel immune checkpoints are from the TNF superfamily, including the TNF ligand superfamily (TNFSF) as well as the TNF receptor superfamily (TNFRSF), which are responsible for regulating pathways related to cell survival, differentiation, and non-immunogenic death. Most notably, these novel immune checkpoints belonging to the TNF superfamily also play a critical function in regulating the immune system, providing co-stimulatory or co-inhibitory signals vital for natural and adaptive immunity with an emphasis on T-cell responsiveness [14,15].
These novel immune checkpoints molecules can trigger inflammatory activities in several cells in the TME, including cells like T and B lymphocytes as well as tissueresident cells such as epithelial and fibroblasts cells [16]. Therefore, modulating and controlling the interaction of these novel immune checkpoints would be a novel therapeutic target with great potential for tumor treatment. Meanwhile, many cancer therapies targeting these novel immune checkpoints are in full swing, including preclinical and clinical trials. For example, therapies targeting OX40, CD40, and CD27 have achieved favorable progress in various tumor treatments, including for patients with lung cancer [14,17,18]. This observation inspired us to explore and establish a novel immune checkpoints-based prognosis system for LUAD, revealing immune features for patients with LUAD.
Herein, we firstly enrolled 1883 LUAD samples from nine independent cohorts, including eight public cohorts and an independent cohort. Then, we systematically explored these novel immune checkpoints in LAUD and filtered out genes with the most prognostic value. From this, we constructed a signature based on the combination of LTA, CD160, and CD40LG, which was also well-validated in different cohorts. Finally, we further examined the clinical characteristics, immunotherapy responses, and immune cells infiltration of the prognostic signature. The novel combination of LTA, CD160, and CD40LG, may help us understand the immune status of patients with LUAD, but also optimize available tumor immunotherapies.

Publicly available mRNA expression datasets
We collected 1731 publicly available samples from The Cancer Genome Atlas (TCGA), including level three RNA-seq expression data from 502 LAUD cases (Illumina HiSeq 2000). All samples had corresponding prognostic data and other clinical information. These data were downloaded from the Cancer Genomics Browser of the University of California Santa Cruz (UCSC) (https:// genom ecanc er. ucsc. edu) and served as the training cohort. The microarray data and corresponding survival information for the remaining 1670 samples were gathered from seven different Gene Expression Omnibus (GEO) datasets (http:// www. ncbi. nlm. nih. gov/ geo), including 90 samples from GSE11969, 83 samples from GSE30219, 226 samples from GSE31210, 106 samples from GSE37745, 127 samples from GSE50081, 442 samples from GSE68465 and 105 samples from GSE81089, which were used as public validation cohorts. First, we log2-transformed the mRNA expressions of these public validation sets. After the quantiles were normalized, we selected the expression of genes with several probes as the mean expression.

Sample analysis and quantitative real-time polymerase chain reaction (qRT-PCR)
Between May '13 and September '14, a total of 102 frozen and surgically resected tissue samples from patients with LUAD were collected from the First Affiliated Hospital of Zhengzhou University. The samples were snapfrozen promptly after surgical resection and stored in liquid nitrogen. We extracted the total RNA from the stored tissues using RNAiso Plus reagent (Takara, #9109)  20:332 in accordance with the manufacturer's instructions. Next, we reverse-transcribed the total RNA signature into single-stranded cDNA using the Prime Script ™ RT reagent kit (Takara, #RR047A). The expression of the three selected genes in this prognostic signature was then detected via qRT-PCR. We used SYBR Premix Ex Taq II (Takara, #RR820A) reagent and Agilent Mx3005P software for data analysis. The three genes' expression values were first standardized to GAPDH and then log2 transformed for subsequent analysis. All primer sequences used for the three genes of interest, as well as GAPDH, are shown in Additional file 6: Table S1. The First Affiliated Hospital of Zhengzhou University's Ethics Committee Board approved this protocol and monitored the study's progress.

Functional enrichment analysis and prognostic meta-analysis
We analyzed the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways to analyze the prognostic signature using DAVID 6.8 (http:// david. abcc. ncifc rf. gov/ home. jsp). Then, we used STATA software (version 12.0) to carry out a prognostic metaanalysis to better understand this signature's prognostic significance across different public cohorts. We used a random-effects model to calculate the pooled HR value.

Immune cell infiltration analysis
The novel CIBERSORT method estimates cell fractions within complex tissues using gene expression levels in solid tumors. The results of this process stand in agreement with ground truth evaluation [19,20]. The LM22 signature matrix included 547 genes that distinguished 22 immune cell phenotypes. These included distinct Tand B-cell subsets, as well as natural killer (NK) cells and other various myeloid subsets. We used CIBERSORT and the LM22 signature matrix to calculate the immune cell infiltration of patients with high-and low-risk disease statuses.

Tumor Immune Dysfunction and Exclusion (TIDE) analysis
TIDE is an accurate computational framework to predict immunotherapeutic responses, especially to therapeutic strategies based on checkpoint blockade [22]. We integrated the mechanisms of primary tumor immune evasion (T-cell exclusion and dysfunction) into this novel method for modeling tumor immune evasion. The TIDE score is superior to many known biomarkers in predicting immunotherapy responses for melanoma and lung cancer, especially for patients who are treated with anti-PD-1/PD-L1 or anti-CTLA4 [22]. We uploaded the transcriptome profiles of patients from the TCGA cohort to the TIDE website (http:// tide. dfci. harva rd. edu), and after analyzing through the website, downloaded the scores for T-cell dysfunction and exclusion as well as the TIDE scores for all patients.

Tumor Inflammation Signature analysis
Tumor Inflammation signature (TIS) is a novel clinical assay to predict immunotherapy responses in various tumors. We calculated the TIS scores for patients from the TCGA cohort according to the methods mentioned in the previous study [23].

Signature generation and statistical analysis
We used a univariate Cox proportional hazards regression model to screen for genes that were strongly associated with OS in patients with LAUD. Then, we chose to use stepwise Cox proportional hazards regression analysis to reduce the total number of covariates and filter out the most valuable prognostic genes for the prediction of OS. A risk formula was constructed for each patient based on the expression of selected genes and corresponding regression coefficients from the stepwise Cox proportional hazards regression analysis. This information was used to classify patients into two groups: low risk and high risk. Next, we used a Kaplan-Meier analysis to determine the OS and RFS among the two risk groups. We used the Mann-Whitney U-test to explore estimates of immune cell infiltration, TMB load, the number of neoantigens (including clonal and subclonal neoantigens), expression of PD-L1 protein, the TIDE score, the T-cell dysfunction score, the T-cell exclusion scores, and TIS scores in the high-and low-risk patient subsets. Then, a Cox proportional hazards regression analysis was used to determine if the combined gene signature was capable of predicting prognoses dependently. We used R software version 3.5.1 (https:// www.r-proje ct. org) to analyze all data and generate the corresponding figures. A P-value < 0.05 indicated statistical significance. All statistical tests were two-tailed.

Identification of prognostic novel immune checkpoint members in LUAD
After a careful literature review, 21 clearly defined novel immune checkpoint members, which all belong to the TNF superfamily, were included in this study [15,[24][25][26][27]. We used a set of 502 patients with LUAD from the TCGA database as the training cohort for our model. The demographics of this cohort are listed in Table 1. The expression correlations among these 21 immune checkpoint members are displayed in Fig. 1A, which exhibited that many members have a strong positive association. The expression of the 21 defined members was explored in the training cohort. Then, a univariate Cox proportional regression analysis found that seven immune checkpoint genes were significantly associated with OS (P < 0.05, Additional file 6: Table S2). Interestingly, all seven genes (BTLA, CD160, CD27, CD40LG, LTA, TNFRSF14, and TNFSF8) with low hazard ratios (hazard ratio < 1) were positively correlated with favorable survival (Additional file 6: Table S2).

Generation of the prognostic signature in the TCGA cohort
The seven immune checkpoint genes with prognostic significance in univariate comparisons were used for further analysis. Next, we used a stepwise Cox proportional hazard regression model to determine which set of variables provided the most efficient estimate of patient prognoses. This procedure identified three genes of interest: CD160, LTA, and CD40LG. We established the risk score formula based on the expression of these three genes and their corresponding coefficients: risk score = 0.1379 × LTA-0.1525 × CD160-0.2595 × CD40LG. Each patient's risk score was calculated, and patients were categorized as high-or low-risk based on the optimal cutoff point (determine from the training data set, Fig. 1B). The high-risk group had a worse OS (P < 0.0001) and RFS (P = 0.0050) than the low-risk group (Fig. 1C, D). According to the multivariable Cox analysis, this novel risk score independently predicted OS in the TCGA cohort even after correcting for potential confounding from the other clinical variables (Table 2).

Validating the prognostic signature in independent cohorts
To test the prognostic signature's reproducibility in LAUD, we also applied this risk score formula in seven publicly available independent cohorts. Demographic data from these cohorts are presented in Table 1. All patients in these seven cohorts were divided into highand low-risk groups based on the optimal risk-score cutoff that was determined from the training data set.
Of interest, we found that patients in the high-risk group had an unfavorable OS than those in the low-risk group, especially in six of the test data sets: GSE11969 (P = 0.0496, Fig. 2A  indicated a borderline difference between the OS in the high-and low-risk groups in the GSE37745 cohort (P = 0.0630, Fig. 2D). Furthermore, to validate the prognostic value of the novel immune checkpoints-based signature, we also performed a prognostic meta-analysis in these seven GEO cohorts and the TCGA cohort.
The results of the analysis provide further evidence that this prognostic signature is a risk predictor in LAUD (P < 0.001) (Fig. 2H).
To assess our signature's applicability to clinical settings, we used qRT-PCR to validate this risk score in another independent cohort consisting of frozen samples from 102 patients with LUAD. This cohort's demographic data are presented in Table 3. We used the risk score formula to divide the patients into two groups (Fig. 3A). We found a significant difference in OS (P < 0.0001) and RFS (P = 0.00015) between the two risk groups (Fig. 3B, E). Since LUAD patients with different clinical stages have different treatment principles and prognoses [28], we also applied this risk score within early-(clinical stage I and II) and advanced-stage (clinical stage III and IV) subgroups. We observed that patients in these clinicalstage subgroups who were at high risk had shorter survival than patients at low risk (Fig. 3C, D). Finally, we confirmed that this risk score was independently related to OS in these 102 patients with LAUD using qPCR data and multivariable Cox analysis ( Table 3).

Validation of the prognostic signature in important clinical subsets.
We next explored the association between prognosis and risk score for early-and advanced-stage samples in the TCGA cohort since the clinical stage is known to affect prognosis. For the early-stage subgroup, highrisk patients suffered shorter OS (P = 0.00012) and RFS (P = 0.0058) than their low-risk counterparts (Additional file 1: Fig. S1A, C). Similarly, in the advanced-stage subgroup, high-risk samples had a worse OS relative to lowrisk ones (P = 0.17) (Additional file 1: Fig. S1B). However, among patients with advanced-stage disease, there was a borderline difference in RFS between the high-and lowrisk groups (P = 0.071) (Additional file 1: Fig. S1D).
Patients above the risk cutoff suggested significantly inferior OS in other important clinical feature subtypes of the TCGA cohort, including older (age ≥ 60), younger (age < 60), male, female, and smoker relative to nonsmokers (Additional file 2: Fig. S2). Next, considering the critical significance of the EGFR and KRAS mutations in LAUD, we further investigated the prognostic signature's performance in patient subgroups with EGFR and KRAS mutation status. Our prognostic signature accurately stratified varying OS for the EGFR wild-type (WT) vs. mutation (MUT), KRAS WT/MUT, or EGFR/KRAS WT subtypes (Additional file 3: Fig. S3). Last, we evaluated the risk scores across the three expression subtypes in LAUD: magnoid, squamoid, and bronchioid [29]. The risk score was also effective in stratifying patients with different OS in the magnoid and bronchioid subgroups, while showing a borderline difference in the squamiod subgroup (P = 0.0580) (Additional file 4: Fig. S4).

Biological pathways and inflammatory responses analysis of the prognostic signature
To explore this prognostic signature's related and potential biological pathways, we selected the genes that are strongly related to the signature (Pearson |R|> 0.45). The results demonstrated that about 436 genes were negatively, and 12 genes were positively associated with the risk score (Fig. 4A). Next, we performed GO and KEGG analysis for these genes, which showed that they are mainly enriched in immune response and leukocyte activation pathways (Fig. 4B). Meanwhile, we sought to further investigate the relevant inflammatory responses of the risk score. We analyzed the relationships between the seven clusters of metagenes identified from univariate (HCK, interferon, LCK, MHC-I, MHC-II, STAT1, IgG) and risk score [30]. The seven clusters of metagenes found here broadly represent pathways related to inflammatory responses and immune regulation. The profiles of these metagenes are shown in Fig. 4C. A Gene Sets Variation Analysis (GSVA) was performed to validate the seven metagene clusters [31]. The findings from this analysis indicated that the risk score was positively related to IgG, interferon, and STAT clusters, but negatively related to LCK and MHC-II clusters (Fig. 4D).

Immune landscapes of the prognostic signature
Since the prognostic signature is linked to immune response pathways, we intend to delve into the immune cell infiltration in patients with LUAD. Next, we combined CIBERSORT with LM22, a novel method for estimating 22 immune cell fractions and subtypes, to evaluate immune cell infiltration within the training cohort. We found that patients in the high and low-risk groups had distinct profiles of immune cells (Fig. 5A). The high-risk patient groups exhibited higher infiltration of resting CD4 T-cells, macrophagocyte M1, but lower infiltration of memory B-cells and resting T-cells (Fig. 5B, C). Meanwhile, because some immune checkpoints can construct a communication system to modulate antitumor immune responses [32], we next sought to explore the correlation between patient risk scores and several essential immune checkpoints. The Pearson correlation analysis suggested that the risk score was positively correlated with CD276. CD276, a vital immune checkpoint of the B7-28 family, and the immunotherapies targeting CD276 have achieved positive results in various tumors (Fig. 5D-G) [33]. This may indicate that patients at higher risk may benefit from treatment with immunotherapies that target CD276.

Relationship between the prognostic signature and immunotherapy responses
Nowadays, immunotherapies targeting immune checkpoints are increasingly significant to cancer treatment and become the first-line choice in various tumors, especially for LAUD. Meanwhile, these novel immune checkpoint members also are potential and crucial candidate checkpoints for immunotherapies. We sought to find out the relationship between this signature and immunotherapy responses. We selected some classical and widespread biomarkers and analyzed their connection to the risk score [22,32]. First, we calculated and evaluated the tumor mutation burden (TMB), the number of neoantigens, as well as the number of clonal and subclonal neoantigens among the high-and low-risk patients. The high-risk group exhibited higher TMB, and these three kinds of neoantigens (Fig. 6A-D). Second, we also compared the protein expression of PD-L1 in high-and low-risk patients and found a borderline, difference between the patients in the high-and lowrisk groups in average PD-L1 protein expression levels (Fig. 6H). Finally, the TIDE score, an accurate and reliable biomarker for immunotherapy, was also applied in our analysis. We systematically explored the TIDE and T-cell dysfunction and exclusion scores in patients from different groups. As expected, high-risk patients exhibited lower TIDE scores and higher T-cell dysfunction and exclusion scores (Fig. 6E-G). However, the TIS score is negatively associated our risk scores, and the low-risk patients also demonstrated a higher TIS score than highrisk score counterparts, which contradicts with the above  results (Additional file 5: Fig. S5). Together, these results indicated high-risk patients may be more likely to benefit from immunotherapies.

Discussion
High-throughput sequencing and genomics research has ushered in the identification of emerging biomarkers and therapeutic targets. These advancements have also improved our understanding of tumors. However, there is little information on which biomarkers may predict immune therapy responses and prognoses, revealing tumor immune phenotypes in LAUD. To improve our understanding of the co-stimulatory signals important in LAUD, we combined three crucial novel immune checkpoints (LTA, CD160, and CD40LG) to create a novel signature. We used the TCGA database as a training cohort and systematically explored the association between the gene expressions of some novel immune checkpoint members and prognostic outcomes in patients with LUAD. In the process, we identified a novel immune checkpoints-based risk signature, which is closely related to OS and RFS of patients with LAUD. This novel risk signature was validated in seven different publicly available cohorts as well as 102 samples of frozen tumor tissues using qPCR data. The validity of our signature was further confirmed using a meta-analysis. Our results indicated that the risk signature was well-validated and significantly related to OS in various important clinical and mutation subgroups. Our novel signature independently predicted prognosis in patients with LAUD. We also explored and analyzed relevant mechanisms, immune predictors for immunotherapy, and immune cells infiltration of this risk signature. Thus, this signature may contribute to a deep and comprehensive understanding of precision immunotherapy for LAUD. Currently, apart from B7-CD28 family members, there are several emerging and potential immune checkpoints for immunotherapy, such as some TNF superfamily members [16]. After analyzing the association between these novel immune checkpoints and prognoses in LUAD, we determined the most significant prognostic genes in these novel immune checkpoints, including protective (LTA) and risky (CD160 and CD40LG) genes.
Regarding the protective gene, LTA, a proinflammatory cytokine, belongs to the TNF superfamily, involved in the inflammatory and immune responses [34]. This molecule also can assist lymphocytes and stromal cells to induce cytotoxic effects on tumor cells [34,35]. It was reported that the polymorphisms of the LTA gene are closely related to cancer risk, including LAUD and other adenocarcinoma malignancies [36]. Meanwhile, researchers indicated that depleting LTA-expressing lymphocytes with LTA-specific monoclonal antibodies may be useful in treating autoimmune diseases [37]. Also, in terms of risky genes, CD160-also known as BY55, is an immunoglobulin-like, glycosylphosphatidylinositolanchored protein and expressed on natural killer cells, γδ T-cells, and a subset of CD4 + and CD8 + T-cells [38,39]. CD160 was proven to bind to herpesvirus entry mediator (HEVM) with high affinity, which induces robust natural cells effector activity and suppressed T-cell responses in vitro [40,41]. B and T lymphocyte attenuator (BTLA) is a novel checkpoint receptor for immunotherapy, while CD160 shares the same ligands with it as BTLA repaired receptor, suggesting that CD160 inhibitory also may be a promising target for immunotherapy [27]. CD40LG is the ligand of CD40, and after they get combined, it will stimulate proinflammatory gene expression, including interleukins (IL)-1, IL-6, IL-8, IL-12, TNF-α, IFN-γ, and monocytic chemoattractant protein (MCP)-1. The activation of the CD40/CD40LG system is a major contributor to carcinogenesis. When CD40LG antibodies are used to disrupt the CD40/CD4OLG system's function, it leads to the suppression of tumor cells [42]. However, some researchers demonstrated that the membrane-stable CD40LG mutant gene transfer into the CD40-positive LUAD cell line. This gene transfer reduces cell proliferation and enhances apoptosis [43]. Thus, the role of CD40LG in LUAD is still uncertain and requires further in-depth study and exploration. Similarly, the functions of LTA and CD160 in LUAD are unclear, and more relevant research are urgently needed. We also investigated the potential genetic mechanisms of this signature. Risk-score-related genes were mainly enriched in immune response and leukocyte activation processes and pathways after correlation analysis. We analyzed seven immune-related metagenes to better understand the relationship between risk signature and immune responses. We found that risk score was negatively associated with LCK and MHC_II clusters, suggesting that high-risk patients have decreased B-cell function and impaired antigen-presentation ability. We also found higher infiltration of resting type CD4 + T-cells in high-risk patients, suggestive of immunosuppression. Next, we found that the patient risk score was positively associated with the expression of CD276. CD276 is a crucial immune checkpoint member of the B7-28 family that plays a pivotal role in inhibiting T-cell function, and immunotherapies that target CD276 have achieved positive results in various tumors [33]. This showed that patients at elevated risk may benefit from immunotherapies that target CD276. We also found that patients at elevated risk exhibited higher average TMB levels, more T-cell dysfunction, greater exclusion scores, lower TIDE scores. Of interest, despite this trend, PD-L1 protein expression differed across risk groups. PD-L1 and TMB are currently the most reliable biomarkers for predicting responses to PD-1/PD-L1 immunotherapy [44,45]. These findings suggest that high-risk patients are mostly immunosuppressed and are therefore better candidates for immunotherapy. Although this signature was successfully validated across multiple cohorts and appeared to act as an independent prognostic factor for LUAD, this study had several limitations. Firstly, it was a retrospective study that should be validated in prospective, largescale cohorts. Secondly, we only examined some novel immune checkpoint genes, which may limit our signature's predictive capacity. However, this new classifier also provides more information on the state of the TME. Finally, none of the patients in this study underwent immunotherapy, thus our prediction of immunotherapy responsiveness is merely theoretical. Future studies should address these limitations.
In conclusion, we found that analyzing tumor expression of LTA, CD160, and CD40LG represents a novel immune signature that may be useful for predicting prognosis and response to immunotherapy in patients with LUAD. Further validation of these findings may improve the ability to shed some light on screening appropriate patients for are most likely to benefit from immunotherapy, enabling increasingly personalized, evidence-based care.