Integrated weighted gene coexpression network analysis identifies Frizzled 2 (FZD2) as a key gene in invasive malignant pleomorphic adenoma

Background Invasive malignant pleomorphic adenoma (IMPA) is a highly malignant neoplasm of the oral salivary glands with a poor prognosis and a considerable risk of recurrence. Many disease-causing genes of IMPA have been identified in recent decades (e.g., P53, PCNA and HMGA2), but many of these genes remain to be explored. Weighted gene coexpression network analysis (WGCNA) is a newly emerged algorithm that can cluster genes and form modules based on similar gene expression patterns. This study constructed a gene coexpression network of IMPA via WGCNA and then carried out multifaceted analysis to identify novel disease-causing genes. Methods RNA sequencing (RNA-seq) was performed for 10 pairs of IMPA and normal tissues to acquire the gene expression profiles. Differentially expressed genes (DEGs) were screened out with the cutoff criteria of |log2 Fold change (FC)|> 1 and adjusted p value  < 0.05. Then, WGCNA was applied to systematically identify the hidden diagnostic hub genes of IMPA. Results In this research, a total of 1970 DEGs were screened out in IMPA tissues, including 1056 upregulated DEGs and 914 downregulated DEGs. Functional enrichment analysis was performed for identified DEGs and revealed an enrichment of tumor-associated GO terms and KEGG pathways. We used WGCNA to identify gene module most relevant with the histological grade of IMPA. The gene FZD2 was then recognized as the hub gene of the selected module with the highest module membership (MM) value and intramodule connectivity in protein–protein interaction (PPI) network. According to immunohistochemistry (IHC) staining, the expression level of FZD2 was higher in low-grade IMPA than in high-grade IMPA. Conclusion FZD2 shows an expression dynamic that is negatively correlated with the clinical malignancy of IMPA and it plays a central role in the transcription network of IMPA. Thus, FZD2 serves as a promising histological indicator for the precise prediction of IMPA histological stages. Supplementary Information The online version contains supplementary material available at 10.1186/s12967-021-03204-7.


Background
Invasive malignant pleomorphic adenoma (IMPA) is characterized by high malignancy and invasive growth. It is a subtype of carcinoma ex pleomorphic adenoma (Ca-ex-PA) that arises mainly in the parotid gland, with more than 1.5 mm of cancerous components extending beyond the adenoma capsule into surrounding tissues [1,2]. Based on the histological morphology of the malignant components, IMPA can be subclassified into the myoepithelial subtype and adenocarcinoma subtype. Briefly, if two or more tumor myoepithelial markers, such as Calponin, S100, or SMA are positively expressed at the same time, IMPA is considered to be a myoepithelial carcinoma subtype. If tumors do not meet the above criteria or are only positive for the epithelial marker Ckpan, they are categorized as adenocarcinoma subtypes [3,4]. To date, surgical resection combined with radiotherapy and chemotherapy is the main treatment strategy, but the prognosis is poor due to higher local recurrence, distant metastasis and a lower survival rate after surgery [1,5,6]. Thus, further research on more accurate molecular targets for early diagnosis and targeted therapy and predictors of a good prognosis of IMPA are urgently needed to develop more efficient therapy that can improve patient survival and quality of life.
A diffusion-weighted imaging (DWI)-based triple-classification radiomics model to characterize intratumoral heterogeneity for preoperative auxiliary diagnosis of pleomorphic adenoma (PA) and a nomogram for predicting the prognosis in an individual with Ca-ex-PA have been established [7]. However, these models are limited to neoplasms and more precise early diagnostic biomarkers are necessary [8]. As bioinformatics analysis is widely applied to cancer research, our previous study first profiled the N 6 -methyladenosine (m 6 A) methylome map in IMPA by methylated RNA immunoprecipitation with high-throughput sequencing (MeRIP-seq) [9]. This study indicated a significant effect of m 6 A modification on IMPA progression, but gene-targeted diagnostic biomarkers are not clear. Weighted gene coexpression network analysis (WGCNA) is a suitable tool to establish free-scale coexpression networks and it is utilized to determine the hub genes based on the correlation between gene modules and clinical characteristics [10,11]. It has been successfully used in a variety of tumor studies. With WGCNA, potential biomarkers and molecular mechanisms of breast cancer, papillary thyroid carcinoma (PTC) and medulloblastoma (MB) etc. have been identified [12][13][14], but a gene coexpression network for identifying hub genes closely related to IMPA is still poorly characterized.
Therefore, we evaluated the relationship between the hub gene and tumor grade using WGCNA since histological grade has a strong effect on diagnosis, treatment and prognosis. To the best of our knowledge, our research is the first to utilize WGCNA for the underlying mechanism profiling and biomarker verification of IMPA, which might provide new ideas for increasing diagnosis accuracy and designing efficient strategies for IMPA.

Tissue samples
Ten pairs of IMPA and adjacent normal control tissues for RNA sequencing (RNA-seq) were obtained from Shanghai Ninth People's Hospital. In addition, another 45 Formalin-fixed and paraffin-embedded IMPA tissues with clinicopathological information were retrieved for verification of hub genes via Immunohistochemistry (IHC). The clinicopathological information of these 45 tissue samples is summarized in Table 1. This study was approved by the ethics committee of Shanghai Ninth People's Hospital.

Histological analyses
All IMPA tissues and adjacent normal controls were fixed in 4% PFA for > 4 h at 4 °C. Then, these fixed tissues were dehydrated with graded ethanol (70-100%) and embedded in paraffin. Sections (4 μm) were cut on a Leica HistoCore BIOCUT RM2235. Hematoxylin-eosin (H&E) staining was performed following standard procedures. Stained sections were imaged using a Leica LF200 microscope.

RNA-seq and differentially expressed gene (DEG) identification
Ten pairs of IMPA tissue samples and the surrounding normal control were immediately placed into RNase-free centrifuge tubes and snap-frozen in liquid nitrogen for RNA-seq. Briefly, total RNA was extracted and controlled for quality using a BioAnalyzer 2100 system (Agilent Technologies, Inc., USA). Then, RNA-seq was performed by Illumina NovaSeq ™ 6000. EdgeR was performed to normalize the raw data [16] and the DEGs were identified via adjusted p value and Fold change (FC).

Functional and pathway enrichment analysis
The acquired DEGs were analyzed by the database for annotation, visualization and integrated discovery (DAVID) database (https:// david. ncifc rf. gov/) [17,18] online tool for gene ontology (GO) [19] enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) [20] pathway enrichment analysis to identify potential biological functions of DEGs in IMPA.

Weighted gene co-expression network construction
The R package 'WGCNA' was used to construct a gene coexpression network of DEGs [10]. The correlation strength between nodes was calculated using an adjacency matrix and the formula was as follows: Briefly, i and j in the formula were two distinct genes and xi and xj represented their expression values. sij is Pearson's correlation coefficient and aij is the strength of the correlation between two genes. The soft-threshold β was 10 in this study. Then, we transformed the adjacency matrix into the topological overlap matrix and performed hierarchical clustering to identify modules with min. Module Size = 30. Notably, genes without characteristics were assigned to the grey module. Subsequently, gene significance (GS), module membership (MM), module eigengene (ME) and other parameters were calculated to identify the module most relevant with histological grade of IMPA.

Identification and validation of hub genes
Genes in the selected clinically relevant module were uploaded to the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (https:// string-db. org/) [21,22]to analyze the protein-protein interaction (PPI) network of DEGs. The PPI network was visualized via Cytoscape software [23]. KEGG pathway enrichment of these genes was analyzed with the R package 'ClusterProfiler' to investigate the biological functions of the module [24]. Of note, the gene with MM > 0.8 and the highest degree of connectivity was determined to be the hub gene. IHC was further performed to examine the expression level of the hub gene in different clinical stages of IMPA.

Statistical analysis
Data were expressed as mean ± standard deviation (SD) and analyzed using SPSS version 23 statistical analysis package (SPSS Inc., Chicago, IL, USA). Unpaired/paired Student's t test (two-tailed) was applied to analyze differences between two groups. Pearson's chi-square test was used to assess differences between the low-grade and high-grade groups. The Pearson's correlation coefficient analysis was performed to examine the correlation. Statistical significance was described as follows: *p value < 0.05; **p value < 0.01; ***p value < 0.001. sij = cor xi, xj aij = Sijβ

Identification of DEGs
Based on |log 2 FC|> 1 and adjusted p value < 0.05, a total of 1970 DEGs between IMPA and normal samples were screened out, including 1056 upregulated and 914 downregulated genes (Fig. 3a) [25]. In addition, volcano plots listed the top 10 upregulated and downregulated genes ( Fig. 3b, Table 2). More details can be found in Additional file 1: Table S1.

Functional and pathway enrichment analysis of DEGs
GO and KEGG pathway enrichment analyses were used to verify the potential functional and molecular pathways associated with DEGs. The upregulated DEGs were significantly enriched in biological process of GO terms including nuclear division, organelle fission, extracellular matrix (ECM) organization and chromosome segregation (Fig. 4a). On the other hand, downregulated DEGs were mainly involved in second messenger-mediated signaling, lipid catabolic processes and purine-containing compound catabolic processes (Fig. 4b). In addition, KEGG pathway analysis revealed that upregulated DEGs exhibited enrichments in the cell cycle, ECM-receptor interaction and p53 signaling pathway (Fig. 4c), whereas downregulated DEGs were mainly enriched in salivary secretion, calcium signaling and cAMP signaling pathways (Fig. 4d). In general, GO and KEGG pathway analyses both suggestd that upregulated DEGs in IMPA exerted functions of promoting tumorigenesis. They were involved in cell cycle progression, ECM organization and tumor-associated signaling regulation. Similarly, downregulated DEGs were revealed to participate in the modulation of second messenger-mediated signaling, which is extremely important for oncogenesis. Notably, the KEGG pathway analysis of downregulated DEGs also showed enrichments in second messenger-mediated signaling (e.g., cAMP and calcium), indicating a critical role of second messengers in IMPA pathogenesis.

Identification of gene module relevant with IMPA histological grade
A cluster dendrogram of 1970 DEGs was produced via WGCNA based on the criteria of soft-threshold β = 10 and scale-free R 2 = 0.85 (Fig. 5a-c). Then, 11 gene modules were identified in the hierarchical clustering, based on a merge cut height of 0.25 and a minimum module size of 30 (Figs. 5d, 6a). The enriched GO terms and KEGG pathways of these 11 gene modules are summarized in Additional file 2: Table S2. It was intriguing to find that there were 6 modules exhibiting enrichment of tumor-associated signaling pathways or GO terms, namely, the cyan module, green module, grey60 module, midnight blue module, salmon module and yellow module. Reasonably, those modules were significantly relevant to the onset and progression of IMPA. The remaining 5 gene modules did not show tumor-related pathway or GO enrichments and thus were not deeply explored in this study.
Among the 6 tumor-associated modules, the green one showed the strongest correlation with the histological grade of IMPA (R = − 0.62; p = 0.06; Fig. 6b). Besides, the green module had the highest correlation with histological grade in GS analysis (Fig. 6c). It was noteworthy that MM and GS of the green module were positively related (Fig. 6d). Therefore, the green module was selected for further analysis.

FZD2 as a promising biomarker of IMPA malignancy
KEGG pathway analysis was performed to determine the possible functions of the 132 genes in the green module. As a result, the 132 genes in the green module were primarily enriched in pathways of the Wnt signaling pathway, protein digestion and absorption and ECM-receptor interaction (Fig. 7a). Subsequently, the PPI network of the green module was constructed via the STRING database (Fig. 7b). Notably, FZD2 had the highest MM value and intramodule connectivity among genes involved in the PPI network (Table 3; Additional file 3: S3) and was therefore selected as the core gene. IHC staining of FZD2 revealed that the expression level of FZD2 was negatively correlated with the clinical malignancy of IMPA; that is, the FZD2 protein level was lower in high-grade IMPA than in low-grade IMPA. FZD2 IHC staining was then performed in distinct grades of IMPA tissues and showed identical results (Fig. 7c-e; Table 4). These results suggest that FZD2 might be an ideal indicator for predicting the histological stages of IMPA.

Discussion
IMPA is a seriously malignant salivary gland neoplasm with a high risk of recurrence and poor prognosis [1,2], especially in high-grade tumors [8]. Thus, better biomarkers are needed and the underlying pathogenesis must be clarified for precise diagnosis, targeted treatment and the prognosis prediction of IMPA. Therefore, 10 pairs of IMPA tissue and tumor-surrounding normal tissue samples were subjected to RNA-seq analysis in this study. In our study, 1970 DEGs were identified according to the criteria of |log 2 FC|> 1 and adjusted p value < 0.05. These DEGs included 1056 upregulated genes and 914 downregulated genes. According to the GO analysis results, upregulated DEGs were mainly enriched in nuclear division and extracellular matrix (ECM) organization. The upregulated DEGs, as revealed by KEGG enrichment analysis, participated in pathways such as ECM-receptor interaction and p53 signaling. The ECM provides biochemical and essential structural support for cellular constituents of the tissue and is responsible for cell proliferation, cell adhesion and cell-cell communication [25][26][27]. Aberrant ECM organization has been reported to be associated with the progression of multiple tumors (e.g., breast cancer and hepatocellular carcinoma) and may likewise contribute to IMPA development [28,29]. Furthermore, the downregulated DEGs exhibited an enrichment in the GO category of second messenger-mediated signaling and were involved in salivary secretion and calcium signaling pathways. Notably, intracellular calcium ions (Ca 2+ ) are well-studied second messengers and they play direct and robust roles in many biological processes. Calcium  signaling changes reflect the 'in or out' of intracellular Ca 2+ [30,31]. Studies have revealed that calcium signaling regulates cancer progression mainly by modulating immune-associated pathways and remodeling the tumor microenvironment [32,33]. Therefore, there can be dysregulation of intracellular Ca 2+ and calcium signaling in IMPA.
To determine the hub genes of the 1970 DEGs, WGCNA was performed to construct a gene coexpression network associated with the clinical features of IMPA. Among the 11 identified gene modules, the green module, comprised of 132 DEGs, had the strongest correlation with IMPA stage and grade and was selected for further analysis. Enrichment analysis showed that genes in the green module were enriched in the Wnt signaling pathway, protein digestion and absorption, ECM-receptor interaction and neuroactive ligand receptor interaction. The Wnt signaling pathway is a critical pathway in multiple biological processes [34]. Aberrant activity of the Wnt pathway plays an important role in carcinogenesis, cancer proliferation, metastasis and invasion [35]. Moreover, the activation of protein anabolism is important to high-grade cancers because proteins are required for cell division during proliferation and against cancer-associated cachexia and malnutrition [36]. Studies have reported that neuroactive ligand receptor interactions are involved in tumor immune infiltration and metastasis [37][38][39]. Therefore, these pathway analysis results may contribute to the comprehensive illumination of the hidden pathogenesis of IMPA. FZD2 encodes a transmembrane protein with seven transmembrane domains and an extracellular cysteine-rich domain [40,41]. It can strongly bind to Wnt proteins, which subsequently activates the expression of downstream genes by the Wnt signaling pathway [40,41]. A study performing a comprehensive analysis of FZD2 in 33 cancer types indicated that FZD2 was associated with high oncogenicity [42]. It was implicated to participate in cancer cell growth, migration and invasion and dominantly affected treatment and prognosis, suggesting the importance of FZD2 in cancer [42]. In oral neoplasms, FZD2 promotes the oncogenesis of oral squamous cell carcinoma [43,44]. Interestingly, FZD2 was found to act as an inhibitor in salivary adenoid cystic carcinoma, which might be due to the different microenvironment of these tumors [45]. Nonetheless, the role of FZD2 in malignant pleomorphic adenoma has not been reported till now.
In the present study, we showed that FZD2 in this module has the strongest relationship with the grade characteristics of IMPA. Some reports have demonstrated that the expression of FZD2 varies with the different clinical stages and histological grades in cancers [42,46]. Here, we first found that FZD2 was more highly upregulated in low-grade IMPA than in high-grade IMPA, indicating that FZD2 serves as an ideal indicator for the precise prediction of IMPA histological stages. In general, our study has uncovered an important role of FZD2 in the pathological progression of IMPA. Hub genes of the other 5 tumor-associated modules have also been screened out (Table 5), which might also play a role in IMPA. Further studies are necessary to figure out detailed functions of FZD2 as well as other 5 hub genes in IMPA.

Conclusions
The expression of FZD2 is negatively correlated with the clinical malignancy of IMPA. Moreover, FZD2 was identified as one of the hub genes in the IMPA transcription network and can play a vital role in regulating IMPA progression. In general, FZD2 serves as a promising indicator for predicting the histological stages of IMPA.