Heterogeneous activation of the TGFβ pathway in glioblastomas identified by gene expression-based classification using TGFβ-responsive genes

Background TGFβ has emerged as an attractive target for the therapeutic intervention of glioblastomas. Aberrant TGFβ overproduction in glioblastoma and other high-grade gliomas has been reported, however, to date, none of these reports has systematically examined the components of TGFβ signaling to gain a comprehensive view of TGFβ activation in large cohorts of human glioma patients. Methods TGFβ activation in mammalian cells leads to a transcriptional program that typically affects 5–10% of the genes in the genome. To systematically examine the status of TGFβ activation in high-grade glial tumors, we compiled a gene set of transcriptional response to TGFβ stimulation from tissue culture and in vivo animal studies. These genes were used to examine the status of TGFβ activation in high-grade gliomas including a large cohort of glioblastomas. Unsupervised and supervised classification analysis was performed in two independent, publicly available glioma microarray datasets. Results Unsupervised and supervised classification using the TGFβ-responsive gene list in two independent glial tumor gene expression data sets revealed various levels of TGFβ activation in these tumors. Among glioblastomas, one of the most devastating human cancers, two subgroups were identified that showed distinct TGFβ activation patterns as measured from transcriptional responses. Approximately 62% of glioblastoma samples analyzed showed strong TGFβ activation, while the rest showed a weak TGFβ transcriptional response. Conclusion Our findings suggest heterogeneous TGFβ activation in glioblastomas, which may cause potential differences in responses to anti-TGFβ therapies in these two distinct subgroups of glioblastomas patients.


Background
Glial tumors are the most common primary brain malignancies in adults. In the United States, they result in an estimated 13,000 deaths every year [1]. The most aggressive form, glioblastoma (WHO Grade IV), also known as glioblastoma multiforme, is one of the most deadly human malignancies. Glioblastoma patients have a median survival time of less than 12 months despite the standard treatment of surgery, radiotherapy and nitrosourea-based chemotherapy [2]. Significant morbidity and mortality comes from local invasion of the tumor preventing complete surgical resection. Glioblastoma may develop from a diffuse astrocytoma or an anaplastic astrocytoma (secondary glioblastoma), but more commonly presents de novo without evidence of a less malignant precursor (primary glioblastoma). Genetically, amplification of the epidermal growth factor receptor (EGFR) locus is found in approximately 40% of primary glioblastomas but is rarely found in secondary glioblastomas; mutations of the tumor suppressor gene phosphatase and tensin homolog deleted on chromosome 10 (PTEN) are observed in 45% of primary glioblastomas and are seen more frequently in primary glioblastomas than in secondary glioblastomas [3]. Loss of heterozygosity (LOH) of chromosome 10 and loss of an entire copy of chromosome 10, which harbors the PTEN gene, are the most frequently observed chromosomal alterations. The aberrant EGFR expression and the mutation of PTEN leads to abnormal activation of phosphoinositide-3-kinase (PI3K)/v-akt murine thymoma viral oncogene homolog (AKT) pathway, which provides necessary signals for tumor cell growth, survival and migration [4]. The importance of activation of EGFR-PI3K/PTEN pathway in the pathogenesis of glioblastoma has been confirmed in the subgroup of patients who showed clinical responses to EGFR kinase inhibitors [5,6].
The transforming growth factor-β (TGFβ)-mediated pathway has also been shown to play critical roles in glial tumors. The high-grade malignant gliomas express TGFβ ligands and receptors, which are not expressed in normal brain, gliosis, or low-grade astrocytomas [7][8][9][10]. The immunosuppressive cytokine, TGFβ, secreted by the tumor cells interferes with the host antitumor immune response therefore allowing the tumor to escape immunosurveilance [11]. Furthermore, TGFβ may act directly as a tumor progression factor. The growth-inhibition function on normal epithelial cells has been lost in many tumorderived cell lines [12]. The ability of TGFβ to enhance cell migration promotes tumor growth and invasion in advanced epithelial tumors [13][14][15].
TGFβ ligands are secreted in latent forms and are activated through cleavage of the carboxyl-terminal latency-associated peptide. Activated TGFβ ligands bind to specific cell surface receptors to form an activated heterodimeric serine/threonine kinase receptor complex. The constitutively active type II receptor phosphorylates and activates the type I receptor upon binding of the activated ligands, which then initiates the intracellular signaling cascade involving the SMAD, a family of proteins similar to the gene products of the Drosophila gene "mothers against decapentaplegic" (Mad) and the C. elegans gene Sma. SMAD2 and SMAD3 specifically mediate the signals induced by TGFβ. Phosphorylated SMAD2/3 are released from the receptor complex and bind to SMAD4. The SMAD2(3)/SMAD4 complex is translocated into the nucleus and regulates the transcription of specific target genes. TGFβ may act via the SMAD pathway to either promote or inhibit the transcription of specific genes [16]. The transcriptional profiles induced upon TGFβ stimulation have been examined using microarray technology [17][18][19][20][21][22][23][24]. Diversified yet overlapping transcriptional responses are generated by TGFβ stimulation in different tissues in different species. In general, the expressions of 5-10% genes in the genome are affected upon TGFβ stimulation.
Large-scale microarray analysis has been used in gliomas to identify gene signatures that have the power to predict survival and subclasses of gliomas that represent distinct prognostic groups [25][26][27]. Gene expression-based classification of malignant gliomas was shown to correlate better with survival than histological classification [28]. In this current investigation, we analyzed the transcriptional responses generated upon TGFβ stimulation from multiple studies. We then used this gene signature to examine the activation status of TGFβ in high-grade gliomas using published microarray data.

Glioma microarray datasets
Two glioblastoma microarray datasets were used in this study: Freije et al [25] and Nutt et al [28]. The Freije study included 85 tumor samples (dChip133ABGliomasGrdIII_ IV.xls) and used the affymetrix U133A and U133B gene chips, which contain more than 45,000 probesets. Consistent with the original publication, the dCHIP [29] normalized expression values were used in the analysis. The quality of the data was examined by scatter plots and correlation coefficients were calculated among all samples. 5 tumors (GBM 1469, GBM 1544, GBM 2015, GBM 749, GBM 839) were excluded from further analysis due to large artifacts on the scatter plots and low correlation coefficients with the rest of the samples. Between the two replicates of tumor # 975 (OLIGO III 975 and OLIGO III 975.1), OLIGO III 975 was included here, since it showed better quality as assessed from the scatter plot. The average of the two replicates (OLIGO III 744, OLIGO III 744.1) was used for the same reason. A total of 78 tumors from this dataset were used in the subsequent analysis. The second, independent dataset from Nutt et al [28] included 50 tumors and was generated on the Affymatrix U95A platform. The files with .cel format were downloaded from http://www.broad.mit.edu/publications/broad888 and normalized with GC-RMA in Splus 6.2 (Insightful) with the S+ArrayAnalyzer module (2.0). Pearson's correlation coefficients were calculated among all tumors and 4 tumors (Brain_NG_13, Brain_CG_1, Brain_NG_11, Brain_CG_10) were excluded from further analysis due to low correlation coefficients with the rest of samples. A total of 46 samples from this dataset were used in the following analysis.

Data analysis
ANOVA, t-test, Pearson's correlation coefficient calculations, Support Vector Machine (SVM) classification, and survival analysis were computed using MATLAB 7.

TGFβ-Responsive gene list
The comprehensive TGFβ-responsive gene set was compiled from 3 in-house microarray studies, 6 published microarray studies [19][20][21][22][23][24], and an in-house curation of >100 publications on TGFβ regulated genes. The 3 inhouse microarray studies include: human lung fibroblast +/-TGFβ [17], human glioblastoma cell line LN308 +/-TGFβ (unpublished data), and human pancreatic cancer cell line Panc1 +/-TGFβ [30]. For the published microarray studies, the whole datasets were not always available, however, the differentially expressed gene list based on the authors' criteria was normally presented in the publications. The following strategy was utilized to summarize the results from different studies and publications. For each of the microarray studies, if a gene was identified by the original authors using their criteria as differentially expressed after TGFβ stimulation at any of the time points in the original publication, it contributed one count to this gene. If the gene was one of the in-house curated TGFβ regulated genes, it also contributed one count. For in-house microarray studies where the whole datasets were available, a differentially expressed gene was defined as genes with at least 1.8 fold change in response to TGFβ treatment. If the study was done in mouse models, the human orthologs were identified for the mouse genes through the ortholog map from Mouse Genome Informatics http://www.informatics.jax.org/. The counts were then summed across all studies for each gene (Additional file 1: Counts of Studies). The direction of changes after TGFβ treatment was also summarized in the following fashion: upregulation of gene expressions upon TGFβ stimulation contributed positive counts, while downregulation of gene expressions after TGFβ treatment contributed negative counts. The signed counts were then summed across all microarray studies. If one gene is upregulated by TGFβ in one study but downregulated by TGFβ in another study, the direction counts will cancel each other during summarization therefore the total direction counts will be fewer than the total counts of the studies (Additional file 1: Directions). Since the direction of changes in TGFβ regulated genes curated from literature were not readily available in our database, they were not included in the directional counts.

Identification of TGFβ-Responsive gene set
To investigate potential TGFβ activation among glial tumors, we first identified a gene set that was responsive to TGFβ stimulation using in-house and public microarray data. Based upon several large-scale gene expression profiling experiments, TGFβ is expected to generate transcriptional responses that would impact 5-10% of the genome in any given tissues and the transcription profiles upon TGFβ stimulation would be quite diversified in different tissues and species [17,[19][20][21][22][23][24]. The transcriptional responses generated by chronic TGFβ stimulation on tumor tissues would also be different from acute TGFβ stimulation on normal tissues and cell lines. With the variability among microarray experiments, the transcriptional profile from a single experiment is not sufficient to identify TGFβ-responsive genes in glioma tumors. We examined the genes differentially expressed upon TGFβ Outline of data analysis steps Figure 1 Outline of data analysis steps. treatment in multiple large-scale gene expression profiling studies from both the majority of the published literature at the time this study was conducted, and data from inhouse microarray experiments; these datasets included multiple tissue types in both human and animal models. Together with curating >100 publications on TGFβ-regulated genes, we compiled a comprehensive TGFβ-responsive gene set using the strategy described above. A total of 2749 unique human genes were identified as responsive to TGFβ stimulation in at least one of the studies (Additional file 2). Although a majority (2129, 77%) of the genes were identified from one study, which may reflect the diversity of TGFβ transcriptional responses in different tissues and species, core TGFβ-responsive genes were identified in multiple studies showing the independence of tissue and species origins. 445 (17%) genes were identified in 2 independent studies and 175 (6%) genes were identified in at least 3 independent studies. Representative TGFβ-responsive genes with references are shown in Additional file 1. Gene ontology annotation showed that these genes are involved in a wide variety of biological functions where TGFβ plays a role, such as cell growth control, angiogenesis, signal transduction, immune response, cell adhesion, cell motility, and regulation of transcription.
As a first step towards characterizing the TGFβ-responsive gene set in gliomas, we examined the expression of a classic TGFβ target gene SERPINE1 in glial tumors within the Freije data set. The expression of SERPINE1, also called PAI-1, has been shown to be regulated by TGFβ in several reports [31]. Multiple TGFβ-responsive elements have been identified at the promoter region of the SERPINE1 gene [32,33]. The protein products of the SERPINE1 gene play important roles in TGFβ-mediated biological processes such as fibrosis and wound healing [34]. The induction of SERPINE1 expression by TGFβ was abolished by agents that interfered with TGFβ signaling [17]. Our ANOVA analysis of the Freije study suggested that there was no significant association between SERPINE1 expression and age or gender. However, SERPINE1 expression was significantly associated with the following histological types: glioblastoma (GBM), anaplastic astrocytoma

Figure 2
The expression of TGFβ downstream targets SERPINE1 in glial tumors (the Freije dataset) shown in box plots Figure 2 The expression of TGFβ downstream targets SERPINE1 in glial tumors (the Freije dataset) shown in box plots. Y-axis is the expression level of SERPINE1 in log2 scale. The black arrow indicates the mean expression level of SERPINE1 in each type of gliomas. Red spots indicate the outlier samples. The table underneath of the box plots are the summary statistics (count, mean, standard deviation (StdDev), median) of the expression level of SERPINE1 by glioma types. A: Significant association of SERPINE1 expression and histology classification. SERPINE1 is significantly upregulated in glioblastoma (GBM) compared to anaplastic astrocytoma (Astro), anaplastic oligodendroglioma (Oligo) and mixed glioma, anaplastic oligoastrocytoma (Mix). The mean expression level of SERPINE1 is 6.1-fold higher in glioblastoma compared to anstrocytoma, 5.3-fold higher compared to mixed glioma and 1.9-folder higher compared to oligodendroglioma. P-value computed using ANOVA is indicated at the top right corner of the plot. B. Significant association of SERPINE1 expression and the grade of the tumor. SERPINE1 is significantly upregulated in grade IV tumors (GBM) compared to grade III tumors (Astro, Oligo, Mix). The mean expression level of SERPINE1 is 3.7-fold higher in grade IV tumors (GBM) than in grade III tumors. The P-value was computed using a t-test as indicated in the top left corner of the plot. C. The expression of SERPINE1 is highly correlated with FN1 expression in gliomas. The correlation coefficient (R) and P-value of correlation (p) were indicated in the plot. The histology types of the gliomas are indicated by colors (blue: GBM, red: Astro, pink: Mix, black: Oligo).

Assessment of TGFβ activation in gliomas using the TGFβ-Responsive gene set
Initially the activation of TGFβ in gliomas was assessed by unsupervised hierarchical clustering of glial tumor microarray data from the Freije study [25] using the most consistent TGFβ-responsive genes in the set (Additional file 1). A TGFβ-responsive classifier set (103 probe sets representing 60 unique genes) was selected as the classifiers using the following criteria: 1) they have been identified to respond to TGFβ stimulation in at least 3 studies; 2) they were consistently up-or down-regulated by TGFβ stimulation in a majority of these studies (absolute direction counts > 50% of total study counts); 3) they varied among all tumors in the Freije dataset (CV >10%) [25]. By visual inspection of the hierarchical clustering results, we identified two small subsets of the glial tumors that showed distinct patterns of the 103 TGFβ-responsive classifiers ( Figure 3): one with higher expression of many molecules that were induced by TGFβ in vitro and were known as classical TGFβ downstream targets, including SERPINE1, FN1, THBS1, COL6A1, COL4A1, COL1A2, LTBP2, ITGB5 (Figure 3, highlighted in green, see Additional file 2 for the order of 103 probe sets), therefore represented strong TGFβ transcriptional response (right, 11 tumors). In contrast, the expression of these molecules was much lower in the other cluster, which represented weak TGFβ transcriptional response (left, 10 tumors). Grade III tumors (8 out of 10) were the majority in the weak TGFβ response cluster, while the strong TGFβ response cluster contained all glioblastomas (Figure 3). The status of TGFβ activation in the remaining tumors is unclear from visual inspection of the hierarchical clustering results. Support vector machine algorithm was then used to further classifying the TGFβ transcriptional responses among the remaing glial tumors. The 11 tumors in the subset showing strong TGFβ transcriptional responses and the 10 tumors in the weak TGFβ transcriptional responses group ( Figure 3) served as the training set. The machine learning was restricted to the 7173 TGFβ-responsive probe sets. The Leave-two-out cross-validation showed 100% accuracy among the training set, suggesting clear distinction between the two subgroups. The rest of the glioma samples were then subjected to SVM as the test set. Table 1 summarized the results of the SVM classification. In total, the majority of the grade III (96%) tumors with one exception were classified as weak TGFβ response group, while over half of grade IV glioblastomas (59%) were classified as strong TGFβ responses, suggesting that TGFβ is more commonly activated in glioblastomas. However, among glioblastomas, the level of TGFβ activation, as assessed by TGFβ-induced transcriptional response, is quite heterogeneous.
To further examining the differential gene expressions between the two TGFβ response glioblastomas subgroups, we employed the student t test for each gene and the results are shown in Additional file 3. A total of 3497 probesets had a p value of less than 0.001, including 1386 that had a fold change larger than 1.7. This set represented 982 unique known genes and 97 unknown genes, and their differential gene expression patterns among the glioblastomas are shown in Figure 4. P values and mean fold changes for representative TGFβ downstream targets (highlighted in green in Figure 4) are shown in Table 2. The expressions of these TGFβ downstream targets were highly elevated in TGFβ strong response glioblastomas compared to those in TGFβ weak response glioblastoma subgroup, confirming the heterogenenous activation of TGFβ pathway in glioblastomas.
TGFβ activation is associated with tumor progression and recurrence. In 4 out of 6 cases where primary and recur-rent tumor samples from the same patients were available, TGFβ response in the recurrent glioblastomas became strong from the weak status in the primary tumors (Table  3). No significant survival difference between the two TGFβ response groups in glioblastomas was observed with standard treatments (data not shown), though their potential response to anti-TGFβ therapies may be different.

Validation of TGFβ transcriptional response patterns in an independent gliomas microarray study
An independent microarray gene expression dataset containing 28 glioblastoma and 22 anaplastic oligodendrog- The SVM training set showing distinct weak or strong TGFβ response pattern in the 103 classifiers that were selected from the most consistent TGFβ-responsive genes (in the Freije dataset) Figure 3 The SVM training set showing distinct weak or strong TGFβ response pattern in the 103 classifiers that were selected from the most consistent TGFβ-responsive genes (in the Freije dataset). The data were Z-score transformed and the color range was indicated by the color bar below the heatmap. Each column represents a tumor sample and the tumor identification number is shown at the bottom of the column. These tumors were selected as training set for the SVM algorithm. Each row represents one of the 103 TGFβ-responsive probesets that were selected from the most consistent TGFβ-responsive genes. The orders of these genes are shown in Additional file 2.
lioma were obtained from Nutt et al [28]. The Nutt dataset was generated using the Affymatrix U95A platform that includes about 12000 probe sets. Using the same criteria described above, 101 probe sets representing 72 unique genes were selected from the most consistent TGFβresponsive genes. 47 of the 72 genes overlap with those used in the Freije study [25]. Subgroups of TGFβ responses similar to those seen in the Freije study [25] were also found by unsupervised clustering (data not shown). SVM classification was used among 3095 probe sets representing TGFβ responsive genes, with a training set of 8 samples showing weak TGFβ response and 8 samples showing strong TGFβ response in the hierarchical clustering analysis. The summary of the TGFβ response subgroups from the Nutt study [28] is also shown in Table   1. Overall, the results from the Nutt dataset were consistent with our results from the Freije dataset [25]. The majority of grade III anaplastic oligodendrogliomas (82%) showed weak TGFβ response while the majority of grade IV glioblastoma (67%) showed strong TGFβ response. Similar to the observations in the Freije study [25], TGFβ activation is heterogeneous. The expressions of many well-known TGFβ downstream targets were significantly different between the two TGFβ response subgroups within glioblastomas (Table 2).

Discussion
Antagonizing the biological effects of TGFβ has become a potential experimental strategy to treat glioblastoma, one of the most devastating human cancers. Several anti-TGFβ therapies have shown promise in both preclinical and early clinical trials [39]. The current rationale for TGFβ antagonism includes its role in tumor promotion, migration and invasion, metastasis, and tumor-induced immunosuppression. Numerous reports suggest aberrant TGFβ activation in glioblastoma and other high-grade gliomas. This includes abnormal expression of the ligands, more specifically TGFB2 and higher levels of phosphorylated SMADs. However, to date, none of these reports has systematically examined the components of TGFβ signaling to gain a comprehensive view of TGFβ activation in a large cohort of human glioma patients. In this study, we adopted an alternative approach. By examining the transcriptional responses induced by TGFβ activation in publicly available microarray data, we identified two subgroups of glioblastomas that showed distinct patterns of TGFβ activation in two independent studies. Combin-  By examining the genes differentially expressed between the two identified subgroups of glioblastomas that showed different TGFβ transcriptional responses, we found that the ligands TGFB1, TGFB2 and their receptors were expressed significantly higher in the strong TGFβ response group (Additional file 3) compared to those in the weak TGFβ response group, suggesting that increased expression of the ligands and receptors contributed to TGFβ activation. THBS1, an activator of TGFβ, was shown to have a higher level in the strong TGFβ response group in one study, suggesting that TGFβ activation may also result from increased bioavailability. In contrast, SMAD7, a negative regulator of TGFβ pathway that often was induced upon TGFβ stimulation in vitro (Additional file 1), was downregulated in the strong TGFβ response group (fold change -1.48, p < 0.0007), suggesting the tumor-specific escape of the negative feedback mechanism may also contributed to TGFβ activation in glioblastomas. In addition, genes involved in antigen presentation were upregulated in the TGFβ strong response glioblastomas. These included the genes encoding class I major histocompatibility complex proteins HLA-A, HLA-B, HLA-C, HLA-E, HLA-F, HLA-G, class II major histocompatibility complex Differentially expressed genes in the two subgroups of glioblastomas with strong and weak TGFβ response (in the Freije data-set) Figure 4 Differentially expressed genes in the two subgroups of glioblastomas with strong and weak TGFβ response (in the Freije dataset). The data were Z-score transformed and the color range was indicated by the color bar below the heatmap. Each column represents a glioblastoma sample and the tumor identification number is shown at the bottom of the column. Each row represents one of the 1386 differentially expressed gene with p < 0.001 and fold change >1.7. The classical TGFβ downstream targets in Table 2 are highlighted as green.
EGFR amplification and PTEN mutations/10q LOH are frequent genetic alterations observed in glioblastomas.
Recently a gene signature generated from autocrine platelet-derived growth factor (PDGF) signaling in gliomas has been used to classify gliomas, and it was shown that EGFR amplification and PTEN mutation/10q LOH were largely enriched in the cluster showing weak autocrine PDGF signaling [46]. Using the same signature, we found the TGFβ strong response cluster overlapped with the weak autocrine PGDG signaling subgroup extensively (data not shown), suggesting potential collaboration between EGFR/PTEN/PI-3K pathway and TGFβ pathway in glioblastoma development and progression. Numerous evidence in vitro also showed the collaborating roles of EGFR and TGFβ in inducing epithelial to mesenchymal transition, an event that contributes to cell migration, invasion, cell survival and angiogenesis [47][48][49][50]. Future studies will be needed to examine if EGFR amplification and PTEN mutation/10q LOH were enriched in the subgroups of glioblastomas that showed strong TGFβ transcriptional response.

Conclusion
Using the TGFβ-responsive genes we compiled from various studies, we examined the status of TGFβ pathway activation in high-grade gliomas in two independent, publicly available, large-scale gene expression datasets. The purpose of this manuscript is not to establish or test a gene signature that can be used to prospectively classify future datasets in a platform-independent fashion. Rather our goal is to examine the status of TGFβ activation and its heterogeneity among glioblastomas. Therefore, we applied the same methodology/algorithm in two independent datasets and found similar results. Consistent with previous reports, we found that glioblastomas showed a stronger TGFβ response than grade III gliomas. More importantly, among glioblastmas, two subgroups with distinct patterns of TGFβ activation were identified. This molecular stratification of glial tumors using TGFβ transcriptional response is potentially relevant to TGFβtargeted therapies. A small subset of the gene signatures with classification power are currently under investigation to identify biomarkers that potentially can be used in the clinical setting with anti-TGFβ therapies.