- Open Access
Uncovering the gene regulatory network of type 2 diabetes through multi-omic data integration
Journal of Translational Medicine volume 20, Article number: 604 (2022)
Type 2 diabetes (T2D) onset is a complex, organized biological process with multilevel regulation, and its physiopathological mechanisms are yet to be elucidated. This study aims to find out the key drivers and pathways involved in the pathogenesis of T2D through multi-omics analysis.
The datasets used in the experiments comprise three groups: (1) genomic (2) transcriptomic, and (3) epigenomic categories. Then, a series of bioinformatics technologies including Marker set enrichment analysis (MSEA), weighted key driver analysis (wKDA) was performed to identify key drivers. The hub genes were further verified by the Receiver Operator Characteristic (ROC) Curve analysis, proteomic analysis, and Real-time quantitative polymerase chain reaction (RT-qPCR). The multi-omics network was applied to the Pharmomics pipeline in Mergeomics to identify drug candidates for T2D treatment. Then, we used the drug-gene interaction network to conduct network pharmacological analysis. Besides, molecular docking was performed using AutoDock/Vina, a computational docking program.
Module-gene interaction network was constructed using MSEA, which revealed a significant enrichment of immune-related activities and glucose metabolism. Top 10 key drivers (PSMB9, COL1A1, COL4A1, HLA-DQB1, COL3A1, IRF7, COL5A1, CD74, HLA-DQA1, and HLA-DRB1) were selected by wKDA analysis. Among these, COL5A1, IRF7, CD74, and HLA-DRB1 were verified to have the capability to diagnose T2D, and expression levels of PSMB9 and CD74 had significantly higher in T2D patients. We further predict the co-expression network and transcription factor (TF) binding specificity of the key driver. Besides, based on module interaction networks and key driver networks, 17 compounds are considered to possess T2D-control potential, such as sunitinib.
We identified signature genes, biomolecular processes, and pathways using multi-omics networks. Moreover, our computational network analysis revealed potential novel strategies for pharmacologic interventions of T2D.
Type 2 diabetes (T2D) is a chronic metabolic disease distinguished by insulin resistance and elevated blood glucose levels. As a global endemic, recent data from the Centers for Disease Control and Prevention (CDC)  suggested that as of 2019, roughly 28.7 million people in the United States (8.7% of the total U.S. population) were diagnosed with diabetes, of which about 90–95% have T2D. As prolonged hyperglycemia is a high-risk factor for heart disease, chronic kidney disease (CKD), and nerve damage, T2D imposes a substantial economic burden on society . In addition, calculations based on epidemiological data suggested that the expenditure on diabetes in the U.S. was approximately $327 billion in 2017, including $237 billion in direct medical costs and $90 billion in lost productivity .
Due to the complexity of the onset and progression of T2D and the tandem with various diseases, analysis from multiple levels could exponentially augment our understanding of its pathophysiological mechanism. Including genomics, epigenomics, transcriptomics, etc., multi-omics analysis  can provide a list of disease-related differences that can be used as biomarkers of the disease process and uncover critical pathways in disease. For instance, Yang-Tay et al. discovered the DNMT1-NT5C2-insulin receptor pathway using the DNA methylation array data, demonstrating that DNMT1 is relevant to the susceptibility of T2D patients . Besides, through the single-cell transcriptome analysis, Lawlor et al. revealed that PP/gamma cells in T2D patients could integrate central and peripheral hunger and satiety signals, which increases our accurate knowledge of the molecular components of rare islet cells . However, unlike single-omics studies, multi-omics studies can be more comprehensive and accurate. At present, multi-omics studies on T2D have received extensive attention. For example, to better understand preT2D status, Wenyu et al.  conducted a cohort study of 106 normal individuals and individuals with prediabetes. Through conjoined analysis of transcriptomic and proteomic data, etc., the study indicated a new sight in that the gut microbiota of insulin-resistant individuals may have a decreased response to respiratory viral infection. This could lead to the emergence of chronic inflammation, which promotes the progression of T2D in preT2D patients. In addition, by combining GWAS data with other multi-omics datasets, Yon Jung  revealed the common pathways shared by IGF-I and IR, such as glycosaminoglycan biosynthesis, etc. It provided assistance for a more comprehensive understanding of the molecular mechanism of the IGF-I/IR axis. Although previous observations are of great significance, by comprehensively using the data of genomics, transcriptomics, and epigenomics to explore disease-related gene expression signatures, our cognition of T2D and the clinical transformation of drug discovery can both be advanced.
In this study, more accurate potential key genes and their regulatory mechanisms in T2D were identified by constructing gene regulatory network through multi-omics analysis, which contributes to demonstrating pathology, and identifying drug targets of T2D (Fig. 1).
Gene expression profiles and DNA methylation profiles of T2D were filtrated through the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo). Inclusion criteria were as follows: (1) Availability of islets from T2D patients in the dataset; (2) Ten or more islet samples in the dataset. Three eligible datasets were selected, including GSE38642 and GSE21232 (training set), and GSE25724 (test set). Additionally, genomics data (Nature 536(7614): 41–47.) was retrieved from Mergeomics web server. The details of the data are shown in Table 1.
All microarray data were submitted to the GEO database (http://www.ncbi.nih.gov/geo). The raw data were downloaded as MINiML files. It contains the data for all platforms, samples, and GSE records. The extracted data were normalized by log2 transformation. The microarray data were normalized by the normalized quantiles function of the preprocessCore package in R software (version 3.4.1). Probes were converted to gene symbols according to the annotation information of the normalized data in the platform. Probes matching multiple genes were removed from these datasets; the average expression value of genes measured by multiple probes was calculated as the final expression value and, as in the case of the same dataset and platform but in different batches, used the removeBatchEffect function of the limma package in the R software to remove batch effects. As in different datasets or the same dataset but in different platforms, extracting multiple data sets with common gene symbols, marking different datasets or platforms as different batches, used the removeBatchEffect function of the limma package in the R software to remove batch effects. The result of the data preprocessing was assessed by Density plot. The UMAP plot was drawn to illustrate the samples before and after batch effect.
Multi-omics data integration and analysis
To investigate the functional connections among the T2D-associated genes, we used Mergeomics, a web server for multi-omics data integration, to elucidate disease networks and predict therapeutics . Mergeomics consists of two main libraries, Marker Set Enrichment Analysis (MSEA) and Weighted Key Driver Analysis (wKDA). In the current study, we used MSEA to assess whether known biological processes or pathways were enriched for multi-omics data of T2D; wKDA leverages gene network topology (interactions or regulatory relations among genes) and edge weight (strength or reliability of interactions and regulatory connections) information of graphical gene networks to predict potential key regulators of top T2D-related genes after integration. The wKDA depth was set at 1 and default incoming and outgoing directionality, the minimum overlap of 0.33, and edge factor 0.5 were used. Genes were compared against the pancreas tissue-specific Bayesian network.
Based on the key driver genes screened above, the Cytoscape (version 3.9.1)  software was exploited for network analysis and visualization. Here, we used 9 common algorithms (MCC, MNC, Neck, ECC, Degree, Closeness, Radiality, Stress, EPC) to evaluate modules. Subsequently, we constructed a co-expression network of these hub genes via GeneMANIA (http://www.genemania.org/) , which is a reliable tool for identifying internal associations in gene sets.
Enrichment analyses of hub genes
To uncover the biological function related to the hub genes, Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed using Hiplot (https://hiplot.com.cn/).
RNA isolation and RT‐qPCR analyses
A total of 6 serum samples (3 serum samples from T2D patients and 3 serum samples from healthy controls) were evaluated in this study. The total RNA was isolated using RNA extraction kit (TIANGEN) and reverse transcribed into cDNA using reverse transcription kit (ABI). Real-time quantitative PCR (RT-qPCR) analysis was performed using real‐time PCR kit (ABI). The relative mRNA expression levels of PSMB9, COL1A1, COL4A1, HLA-DQB1, COL3A1, IRF7, COL5A1, CD74, HLA-DQA1, and HLA-DRB1were normalized with the GADPH in the same sample. The thermal cycler parameters for the amplification of these genes were as follows: 1 cycle at 95 °C for 10 min followed by 40 cycles at 95 °C for 15 s, 60 °C for 15 s, and 72 °C for 30 s. Gene expression was evaluated by the 2 − ΔΔCt method. The sequences of RT‐PCR primers are the following (5′–3′; Table 2). For gene expression using RT- PCR, statistical analysis was performed by the unpaired two-tailed t-test.
iTRAQ-based quantitative proteomic analysis of obese diabetic mice
All animal studies were approved by the Animal Care and Use Committee of our institution and in accordance with relevant guidelines and regulations. Five-week-old male Sprague–Dawley (SD) rats initially weighing 160–180 g were housed individually in cages at a constant temperature of 24 ± 2 °C with a 12:12-h light–dark cycle. Rats were fed a high-fat diet and intraperitoneally injected low-dose streptozotocin (32 mg/kg) to induce T2DM model. Rats with random blood glucose ≥ 16.7 mmol/L on 3 consecutive days were selected . Finally, 4 rats were selected in T2D group.
Pancreas tissues were grinded and then dissolved in SDT lysis buffer (4% sodium dodecyl sulfate, 100 mM Tris–HCl, pH 7.6, Sangon, China). The supernatant was collected and quantified after boiling and centrifuging. The extracted proteins were treated with the method of filter-aided sample preparation (FASP) enzymatic hydrolysis. The samples were labeled according to the instructions of the iTRAQ Reagents 8-plex kit (AB SCIEX, USA). Mascot software 2.6 and Proteome Discoverer software 2.1 (Thermo Fisher Scientific) were used to process proteomic data against the rat database (Uniprot_RattusNorvegicus_36080_20180123).
After merging the GSE77943 (including 5 islet samples from normal mice), the comparison between the hub genes expression of T2D and contorl was performed with the T-test. P-value < 0.05 was considered significant.
Analysis of the predictive value of biomarkers
Receiver operator characteristic (ROC) curve analysis was performed to predict the diagnostic effectiveness of biomarkers by SSPA Statistics 23. The area under the ROC curve (AUC) value was utilized to determine the diagnostic effectiveness in discriminating T2D from control samples in the GSE25724 dataset.
Prediction and verification of transcription factors (TFs)
Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining (TRRUST)  is a database for predicting transcriptional regulatory networks, which contains the target genes corresponding to transcription factors (TFs) and the regulatory relationships between TFs. TRRUST currently includes two species: human and mouse, containing 8444 and 6552 TFs target regulatory relationships of 800 human TFs and 828 mouse TFs, respectively. TFs that regulate the hub genes were obtained through the TRRUST database, and an adjusted P-value < 0.05 was considered significant. Subsequently, we verified the expression levels of these TFs in obese diabetic mice with the T-test.
Transcriptional regulation and histone modification related to hub genes based on epigenetic data
"Homo sapiens" and "pancreas islet" were jointly searched in the Cistrome Data Browser (DB) (http://cistrome.org/db) . Through the analysis and processing of all samples in the preset process, as well as the evaluation using comprehensive quality control indicators, we obtained the factors targeting specific genes and visualized them in the UCSC genome browser.
Construction of disease network
The gene-disease association networks were created using DisGeNet , OMIM , OpenTargets , and Genecards databases , and genes were selected as nodes in the network if retrieved by at least two databases using Venny2.1.0 (http://bioinfogp.cnb.csic.es/tools/venny/) .
Potential drugs for the management of T2D were selected using the network based drug repositioning method from the Pharmomics pipeline in the Mergeomics web server. Drug-target interactions were used to construct a drug-target interaction network and visualized using Cytoscape v3.9.1. Gene Ontology and KEGG pathway analysis can clarify the role of potential targets by gene function and signaling pathways. The drug-disease common targets were converted into Entrez IDs, and then the “clusterProfiler” package was installed in the R software. According to the converted Entrez IDs, enrichment analysis of key target gene GO functions and analysis of KEGG signaling pathways were performed with p < 0.05.
To analyze the binding affinities and modes of interaction between the drug candidate and their targets, AutodockVina 1.2.2, a silico protein–ligand docking software was employed . The molecular structures of sunitinib were retrieved from PubChem Compound (https://pubchem.ncbi.nlm.nih.gov/) . The 3D coordinates of COL1A1 (PDB ID, 5CTD; resolution, 1.6 Å), COL4A1 (PDB ID, 1LI1; resolution, 1.9 Å), PSMB9 (PDB ID, 7AWE; resolution, 2.3 Å), IRF7 (PDB ID, 2O61; resolution, 2.8 Å), HLA-DQB1 (PDB ID, 1JK8; resolution, 2.4 Å) and COL3A1 (PDB ID, 4AE2; resolution, 1.68 Å) were downloaded from the PDB (http://www.rcsb.org/pdb/home/home.do). For docking analysis, all protein and molecular files were converted into PDBQT format with all water molecules excluded and polar hydrogen atoms added. The grid box was centered to cover the domain of each protein and to accommodate free molecular movement. The grid box was set to 30 Å × 30 Å × 30 Å, and the grid point distance was 0.05 nm. Molecular docking studies were performed by Autodock Vina 1.2.2 (http://autodock.scripps.edu/).
Statistical analysis was performed using GraphPad software (GraphPad Prism v9.0; GraphPad Software, USA) and R software (version 3.4.1).
Multi-omics data collection and integration
We integrated the multi-omics data, including gene expression profiles and epigenomic profiling data sets collected from the GEO database (GSE38642 and GSE21232) and genomics data retrieved from Mergeomics web server (Table 1).
Marker set enrichment analysis (MSEA)
To better understand the biological implications that relate to T2D, based on the multi-omic profile, we applied marker set enrichment analysis (MSEA) to evaluate the biological modules and functional categories. GO analysis results are mainly enriched in glucose homeostasis, carbohydrate homeostasis, and lipoprotein particle binding (see Additional file 1: Fig. S1A–C).
In order to identify the regulated modules and potential association between significant modules (FDR < 0.05 and MSEA score > 5) and genes in T2D, module-gene network were visualized (see Additional file 5: Table S1) (Fig. 2A) Of the 32 statistically significant modules, the top 10 modules are shown after applying 12 algorithms in the plug-in cyto-Hubba (Fig. 2B) (see Additional file 5: Table S2), and all 10 modules reported by MSEA are implicated in T2D. Besides, we further summarized 15 SNPs from 28 genes in the top 10 modules to investigate the aggregate genetic link between the modules and T2D (Fig. 2C).
Weighted key driver analysis (wKDA) of T2D-related genes and analysis of hub genes
The wKDA using Mergeomics was used to evaluate potential networks and key regulators of type 2 diabetes-related genes based on multi-omics data. wKDA identified a network within the 84 genes (Fig. 3A). The top 10 key drives of the network were further screened (see Additional file 5: Table S3) (Fig. 3B, C), and their full names and detailed functional information are shown (see Additional file 5: Table S4). Based on the GeneMANIA database, we analyzed the co-expression network of these genes, which showed the complex PPI network with a co-expression of 35.34%, physical interactions of 22.12%, and co-localization 25.21%, shared protein domains of 8.23% and predicted of 9.10%. (Fig. 3D) GO analysis showed that these genes are mainly involved in immunoglobulin-mediated immune response, B cell-mediated immunity, immune receptor activity, and MHC class II protein complex (Fig. 3E–H). In addition, KEGG pathway analysis showed that they are mainly involved in the AGE-RAGE signaling pathway in diabetic complications and the relaxin signaling pathway (Fig. 3I).
Validation of hub genes
To make the results more reliable, the expression of hub genes was subjected to RT-qPCR verification from 3 T2D patients and 3 controls. No significant difference was observed in mRNA level between the control and T2D (PSMB9: P = 0.16; CD74: P = 0.64), although there were modest trends (Fig. 4A). Besides, the GSE25724 dataset was used to validate the diagnostic effectiveness of the biomarkers for T2D by ROC analysis. (Fig. 4B) AUC of more than 0.800 was considered as having the capability to diagnose T2D with excellent specificity and sensitivity. As shown in Fig. 4B, the AUC values of COL5A1, IRF7, CD74, and HLA-DRB1 were 0.928, 1.000, 0.952, and 0.833, respectively.
Further expression validation of the hub genes was performed in obese diabetic mice, which was achieved by comparing proteomics data from obese diabetic mice and data from the GEO database (GSE77943) after normalization and batch effect adjustment. The results show that expression levels of PSMB9, COL1A1, and COL4A1 had significantly higher in T2D. (Fig. 4C) (The expression value of HLA-DQB1, COL3A1, IRF7, COL5A1, CD74, HLA-DQA1, and HLA-DRB1 are missing in proteomics data from obese diabetic mice).
Prediction and verification of transfer factors (TFs)
Based on the TRRUST database, we found that 8 TFs may regulate the expression of these genes (Fig. 5A) (see Additional file 5: Table S5). Further verification, we discovered that NFKB1 is highly expressed in the T2D group (Fig. 5B) (the expression value of RFXANK, RFXAP, RFX5, CIITA, ILF3, and RELA1 missing in proteomics data from obese diabetic mice), which coordinately participated in the regulation of four hub genes (IRF7, PSMB9, CD74, and COL1A1).
Transcriptional regulation and histone modification related to hub genes based on epigenetic data
After quality control analysis, ChIP-seq results showed that the CTCF binding site was located on the CpG islands after the first exon of COL1A1 (Fig. 6A). In addition, CTCF was also combined with the promoter of PSMB9, which contains CpG islands (Fig. 6C). Moreover, we found that the regulation of hub genes including COL1A1, IRF7, PSMB9, COL4A1, and COL5A1 is widely related to histone modifications such as histone methylation and acetylation (Fig. 6A–E). The modification of H3K4me3 in pancreatic islets was confirmed to depend on the presence of CpG , which explains our result that CTCF binds near the chromatin region occupied by H3K4me3. Concerning chromatin accessibility, highly sensitive sites of DNase I were found on COL1A1, IRF7, and PSMB9 (Fig. 6A–C), which directly indicated the location of regions where transcriptional regulatory elements can bind.
Drug repositioning based on multi-omics data
Construction of multi-omics network and disease network
We merged the MSEA network and wKDA network to construct a multi-omics network for repositioning. The gene-disease association networks were created using DisGeNet, OMIM, OpenTargets, and Genecards databases, and genes were selected as nodes in the network if retrieved by at least two databases (Fig. 7A).
T2D-targeted screening for candidate drugs
According to the above results, a total of 1274 compounds were screened from the Mergeomics database. 17 compounds with intervention records from pancreas were proposed to possess therapeutic potential against T2D (see Additional file 5: Table S6). After excluding compounds that have not yet been widely used in clinical, sunitinib, a receptor tyrosine kinase inhibitor and chemotherapeutic agent used for the treatment of renal cell carcinoma (RCC) and imatinib-resistant gastrointestinal stromal tumor (GIST), was considered a potential drug of T2D.
Network pharmacology approach and molecular docking to predict the mechanisms of drugs counteracting T2D.
87 drug genes directly connected to multi-omics network and 33 genes from multi-omics network directly connected to drug genes were identified as possible anti-diabetic targets of sunitinib. (Fig. 7B) The target genes of sunitinib were further analyzed by GO and KEGG analyses (see Additional file 2: Fig S2). The results show that sunitinib may affect the AGE-RAGE pathways and gluconeogenesis, thereby influencing the development of T2D.
To evaluate the affinity of sunitinib for their targets, we performed molecular docking analysis. The binding poses and interactions of sunitinib with seven top key drivers from targets were obtained with Autodock Vina v.1.2.2 and binding energy for each interaction was generated. (The 3D-structure of the molecular target of COL5A1 was missing in the PDB database) Results showed that each target bound to sunitinib through visible hydrogen bonds and strong electrostatic interactions, moreover, hydrophobic pockets of each target were occupied successfully by sunitinib (Fig. 7C–H; Table 3).
The current paper demonstrated the molecular mechanisms associated with T2D progression by combining transcriptomic, genomic, epigenetic, and proteomic data. At the same time, some less-reported genes and pathways that may play key regulatory roles in T2D have been identified, which could provide new biomarker options for T2D research. In addition, this study conducted drug repositioning based on multi-omics data to facilitate the clinical translation of potential T2D therapeutics.
To identify markers for T2D and better understand the underlying pathways, we performed MSEA using multi-omics data. In the results, GO terms were enriched in “response to hypoxia” and “response to decreased oxygen levels”, implying the stress of oxygen deprivation. Hypoxia could activate the hypoxia-inducible factor rapidly, leading to the conversion of glucose utilization from aerobic to anaerobic metabolism, resulting in β-cell dysfunction [21, 22]. Besides, Li et al. found that hypoxia may cause intensive apoptotic injury of β cells by destroying islet vascular integrity . Related to the above, “gap junction assembly” and “mitochondrial outer membrane” were also enriched. β cells are connected by gap junction, which provides electric coupling between β cells, thus promoting the regulation of electrical activity and insulin secretion . Studies have confirmed that related to the abnormal expression of gap junction proteins, the breakdown of mitochondrial redox balance in β cells under long-term hyperglycemia will accelerate the dysfunction of β cells . On the other hand, “sulfur compound binding” was also enriched in the GO-based list. Sulfur-containing compounds include sulfur-containing amino acids, glutathione, etc. Elevated cysteine of sulfur-containing amino acids is associated with a doubling of the risk of insulin resistance . In addition, data from a prospective cohort study showed that diabetic patients had higher cystathionine and plasma total cysteine and lower antioxidants such as taurine . These findings suggest that sulfur-containing amino acids may interfere with blood glucose levels through oxidative stress. Moreover, the pathways “Leukocyte Transendothelial Migration” and “Reactome Signaling By Interleukins” were also significantly enriched. For islet vascular damage in diabetes, neutrophil transmigration across TNF-activated endothelial monolayers can be accelerated by co-clustering L-selectin with PECAM-1. And another crucial step is the shedding of L-selectin activated by Akt family kinases, p38 MAPK signaling pathways, etc. . As previously mentioned, increased secretion and chemotaxis of neutrophils would stimulate β cell apoptosis by reducing insulin signaling transduction, promoting the occurrence of ROS-NLRP3 inflammasome-IL-1β .
Elucidation of T2D driving molecular profiles through integrative multi-omic analysis, including genomic, epigenomic, and transcriptomic analysis, was the primary focus of this study. The most prominent finding to emerge from the research is that putative ten hub genes are associated with T2D. Among these, COL5A1, IRF7, CD74, and HLA-DRB1 expression was suggested to have diagnostic value in T2D, and the expression levels of PSMB9, COL1A1, and COL4A1 were significantly higher in T2D after validation. Besides, the enrichment analysis showed the hub genes were significantly associated with immune-related terms and T2D-related terms, which corroborates the findings of a great deal of the previous work [30,31,32]. Chemokines and cytokines are jointly involved in the occurrence and development of T2D [33, 34], such as serum TNF-α, adiponectin, Growth factor 19/21, Interleukin-1 beta (IL-1β), Interleukin-6 (IL-6), Interleukin-18 (IL-18), and C-reactive protein (CRP), which play a central role in the development of T2D. According to KEGG analysis, the AGE-RAGE signaling pathway plays a vital role in T2D. In T2D patients, elevated blood glucose leads to advanced glycation end products ; the products of nonenzymatic glycation/oxidation of proteins/lipids are signal transduction ligands for Receptor for AGE (RAGE), which accumulate in the vessel wall. Furthermore, the recruitment of inflammatory cells bearing Calgranulin B (S100A9), also ligands for RAGE, augments vascular dysfunction and can subsequently exacerbate the progression of T2D [36, 37]. In addition, we found that 8 TFs may regulate the expression of these genes. We further verified that NFKB1 is highly expressed in T2D patients, which coordinately participated in regulating four hub genes (IRF7, PSMB9, CD74, and COL1A1).
In accordance with the present results, previous studies have demonstrated that HLA-DQB1 , COL1A1 , COL3A1 , COL4A1 , CD74 , and HLA-DQA1  are highly related to T2D pathogenesis. While PSMB9, IRF7, and COL5A1 have not been previously reported to be associated with T2D in the literature. PSMB9 is a multicatalytic proteinase complex with a highly ordered ring-shaped 20S core structure, which encodes a member of the proteasome B-type family, also known as the T1B family. An essential function of a modified proteasome, the immunoproteasome, is the processing of class I MHC peptides, which have critical roles in the immune response. The activation of PSMB9 may be related to the expression of CTCF, which is shown in the results. CTCF, also called CCCTC-binding factor, encodes a transcriptional regulatory protein with 11 highly conserved zinc finger (ZF) domains. It has been revealed that CTCF-mediated chromatin accessibility changes could help to increase the transcription of genes related to important functions of pancreatic β cells, thereby increasing insulin secretion and improving T2D . Diseases associated with the PSMB9 gene in GWAS datasets from the DISEASES Experimental Gene-Disease Association Evidence Scores dataset revealed a potential association between PSMB9 and type 1 diabetes mellitus with a standardized value of 1.20753. Besides, Phenotypes associated with the PSMB9 gene by text-mining GWAS publications from the HuGE Navigator Gene-Phenotype Associations dataset also suggested the potential association between PSMB9 mutations and type 1 diabetes. IRF7 encodes interferon regulatory factor 7, a member of the interferon regulatory factor (IRF) family, which has been shown to play a role in the transcriptional activation of virus-inducible cellular genes, including interferon β chain genes. In the results of DNase-seq, the cis-acting elements located near the TSS region of IRF7 showed high chromatin accessibility. It means that the transcription of IRF7 may be regulated in many ways, such as H3K4me3, H3K27ac, and H3K9ac, which promote the activation of transcription. A previous study has reported that highly expressed IRF7 in the pancreas is implicated in immunoinflammatory diseases such as autoimmune pancreatitis  and pancreatic ductal adenocarcinoma . In addition, a study by Hemin et al. demonstrated that STAT1-IRF7-MHC I complex axis was crucial for IFN-α signaling in islets and created positive feedback through IRF7-STAT2 cascade amplifying signals which accelerated the process of type 1 diabetes. COL5A1 encodes an alpha chain for one of the low abundance fibrillar collagens. Previous work revealed that Col5A1, Nqo1, and Notch2 modulated by Ast may promote insulin-releasing balance, relieve insulin resistance, and maintain normal size in marginal-zone B cells . Besides, diseases associated with COL5A1 gene/protein from the curated CTD Gene-Disease Associations dataset suggested the potential association between COL5A1 and diabetes mellitus with a standardized value of 1.24401 . Hence, the above results provide meaningful clues for the further study of PSMB9, IRF7, and COL5A1 in the occurrence and development of T2D, and further exploration is needed.
Due to the lack of effective/safe and less expensive drugs, drug repositioning appears to be the best tool for finding proper targets and predicting latent drugs in the therapy for T2D and related complications. Finally, 17 compounds were expected to have potential therapeutic effects on T2D. Among these, sunitinib  is a multi-target receptor tyrosine kinase inhibitor of vascular endothelial growth factor receptor (VEGFR), platelet derived growth factor receptor (PDGFR), etc. It is often used as a chemotherapeutic agent to treat renal cell carcinoma (RCC) and pancreatic neuroendocrine tumors. Considering that VEGF/VEGFR, PDGF/PDGFR related signal pathways play essential roles during the development of T2D, such as insulin resistance , the expression of iron metabolism related-proteins , islet cell inflammation , further studies to investigate potential therapeutic benefits for sunitinib in T2D are warranted.
Due to the heterogeneity of T2D , future T2D treatment urgently needs to be personalized. Nowadays, nanotechnology has shown great prospects in T2D pharmacological intervention, such as extending the release of anti-diabetic peptides through the hydrogel system , oracle delivery of nuclear acid therapy with higher stability , etc. In addition, the use of versatile drug delivery nanocarriers after multi-level specific biomarker recognition based on multi-omics data may increase the targeting effect of drugs and promote the realization of personalized T2D treatment .
Despite these promising results, questions remain. The results of RT-qPCR on the predicted hub genes could only show that the expressions of CD74 and PSMB9 were up-regulated in the T2D group, while the expressions of other genes were not detected (Additional files 3, 4). In addition to the poor effect of RNA extraction, since the samples used for RNA extraction were human serum rather than islets, the greater heterogeneity of the gene expression levels of the two could be the cause. For that, it would be problematic to demonstrate the expression of predicted genes in islets due to the inaccessibility of normal human islets and the potential ethical issues involved.
Availability of data and materials
The original contributions presented in the study are publicly available. The data can be found here: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE38642, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE25724, and https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE21232.
Type 2 diabetes
Gene expression omnibus
Differentially expressed genes
Marker set enrichment analysis
Weighted key driver analysis
Kyoto Encyclopedia of Genes and Genomes
Real-time quantitative PCR
Transcriptional regulatory relationships unraveled by sentence-based text mining
Receiver operator characteristic
Area under the ROC curve
Receptor for AGE
Interferon regulatory factor
Centers for Disease Control and Prevention National Diabetes Statistics Report website https://www.cdc.gov/diabetes/data/statistics-report/indexhtml. Accessed 2022.
Aune D, Schlesinger S, Neuenschwander M, Feng T, Janszky I, Norat T, Riboli E. Diabetes mellitus, blood glucose and the risk of heart failure: a systematic review and meta-analysis of prospective studies. Nutr Metab Cardiovasc Dis. 2018;28:1081–91.
Economic Costs of Diabetes in the U.S. in 2017. Diabetes Care. 2018; 41:917–928.
Hasin Y, Seldin M, Lusis A. Multi-omics approaches to disease. Genome Biol. 2017;18:83.
Chen YT, Lin WD, Liao WL, Tsai YC, Liao JW, Tsai FJ. NT5C2 methylation regulatory interplay between DNMT1 and insulin receptor in type 2 diabetes. Sci Rep. 2020;10:16087.
Lawlor N, George J, Bolisetty M, Kursawe R, Sun L, Sivakamasundari V, Kycia I, Robson P, Stitzel ML. Single-cell transcriptomes identify human islet cell signatures and reveal cell-type-specific expression changes in type 2 diabetes. Genome Res. 2017;27:208–22.
Zhou W, Sailani MR, Contrepois K, Zhou Y, Ahadi S, Leopold SR, Zhang MJ, Rao V, Avina M, Mishra T, et al. Longitudinal multi-omics of host-microbe dynamics in prediabetes. Nature. 2019;569:663–71.
Jung SY. Multi-omics data analysis uncovers molecular networks and gene regulators for metabolic biomarkers. Biomolecules. 2021;11:406.
Ding J, Blencowe M, Nghiem T, Ha SM, Chen YW, Li G, Yang X. Mergeomics 2.0: a web server for multi-omics data integration to elucidate disease networks and predict therapeutics. Nucleic Acids Res. 2021;49:W375-w387.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–504.
Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, et al. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 2010;38:W214-220.
Srinivasan K, Viswanad B, Asrat L, Kaul CL, Ramarao P. Combination of high-fat diet-fed and low-dose streptozotocin-treated rat: a model for type 2 diabetes and pharmacological screening. Pharmacol Res. 2005;52:313–20.
Han H, Cho JW, Lee S, Yun A, Kim H, Bae D, Yang S, Kim CY, Lee M, Kim E, et al. TRRUST v2: an expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res. 2018;46:D380-d386.
Mei S, Qin Q, Wu Q, Sun H, Zheng R, Zang C, Zhu M, Wu J, Shi X, Taing L, et al. Cistrome Data Browser: a data portal for ChIP-Seq and chromatin accessibility data in human and mouse. Nucleic Acids Res. 2017;45:D658-d662.
Piñero J, Ramírez-Anguita JM, Saüch-Pitarch J, Ronzano F, Centeno E, Sanz F, Furlong LI. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 2020;48:D845-d855.
Amberger JS, Bocchini CA, Schiettecatte F, Scott AF, Hamosh A. OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorders. Nucleic Acids Res. 2015;43:D789-798.
Ghoussaini M, Mountjoy E, Carmona M, Peat G, Schmidt EM, Hercules A, Fumis L, Miranda A, Carvalho-Silva D, Buniello A, et al. Open targets genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 2021;49:D1311-d1320.
Safran M, Dalah I, Alexander J, Rosen N, Iny Stein T, Shmoish M, Nativ N, Bahir I, Doniger T, Krug H, et al. GeneCards Version 3: the human gene integrator. Database (Oxford). 2010;2010:baq020.
Shade A, Handelsman J. Beyond the Venn diagram: the hunt for a core microbiome. Environ Microbiol. 2012;14:4–12.
Bhandare R, Schug J, Le Lay J, Fox A, Smirnova O, Liu C, Naji A, Kaestner KH. Genome-wide analysis of histone modifications in human pancreatic islets. Genome Res. 2010;20:428–33.
Cantley J, Selman C, Shukla D, Abramov AY, Forstreuter F, Esteban MA, Claret M, Lingard SJ, Clements M, Harten SK, et al. Deletion of the von Hippel-Lindau gene in pancreatic beta cells impairs glucose homeostasis in mice. J Clin Invest. 2009;119:125–35.
Kim JW, Tchernyshyov I, Semenza GL, Dang CV. HIF-1-mediated expression of pyruvate dehydrogenase kinase: a metabolic switch required for cellular adaptation to hypoxia. Cell Metab. 2006;3:177–85.
Li X, Zhang L, Meshinchi S, Dias-Leme C, Raffin D, Johnson JD, Treutelaar MK, Burant CF. Islet microvasculature in islet hyperplasia and failure in a model of type 2 diabetes. Diabetes. 2006;55:2965–73.
Head WS, Orseth ML, Nunemaker CS, Satin LS, Piston DW, Benninger RK. Connexin-36 gap junctions regulate in vivo first- and second-phase insulin secretion dynamics and glucose tolerance in the conscious mouse. Diabetes. 2012;61:1700–7.
Benáková Š, Holendová B, Plecitá-Hlavatá L. Redox homeostasis in pancreatic β-cells: from development to failure. Antioxidants (Basel). 2021; 10.
Elshorbagy AK, Valdivia-Garcia M, Refsum H, Butte N. The association of cysteine with obesity, inflammatory cytokines and insulin resistance in Hispanic children and adolescents. PLoS ONE. 2012;7: e44166.
Elshorbagy AK, Turner C, Bastani N, Refsum H, Kwok T. The association of serum sulfur amino acids and related metabolites with incident diabetes: a prospective cohort study. Eur J Nutr. 2022;61:3161.
Rahman I, Collado Sánchez A, Davies J, Rzeniewicz K, Abukscem S, Joachim J, Hoskins Green HL, Killock D, Sanz MJ, Charras G, et al. L-selectin regulates human neutrophil transendothelial migration. J Cell Sci. 2021; 134.
Dror E, Dalmas E, Meier DT, Wueest S, Thévenet J, Thienel C, Timper K, Nordmann TM, Traub S, Schulze F, et al. Postprandial macrophage-derived IL-1β stimulates insulin, and both synergistically promote glucose disposal and inflammation. Nat Immunol. 2017;18:283–92.
Donath MY, Dinarello CA, Mandrup-Poulsen T. Targeting innate immune mediators in type 1 and type 2 diabetes. Nat Rev Immunol. 2019;19:734–46.
Golden TN, Simmons RA. Immune dysfunction in developmental programming of type 2 diabetes mellitus. Nat Rev Endocrinol. 2021;17:235–45.
SantaCruz-Calvo S, Bharath L, Pugh G, SantaCruz-Calvo L, Lenin RR, Lutshumba J, Liu R, Bachstetter AD, Zhu B, Nikolajczyk BS. Adaptive immune cells shape obesity-associated type 2 diabetes mellitus and less prominent comorbidities. Nat Rev Endocrinol. 2022;18:23–42.
Pan X, Kaminga AC, Wen SW, Liu A. Chemokines in prediabetes and type 2 diabetes: a meta-analysis. Front Immunol. 2021;12: 622438.
Dovio A, Angeli A. Cytokines and type 2 diabetes mellitus. JAMA. 2001;286:2233.
Eftekhari A, Vahed SZ, Kavetskyy T, Rameshrad M, Jafari S, Chodari L, Hosseiniyan SM, Derakhshankhah H, Ahmadian E, Ardalan M. Cell junction proteins: crossing the glomerular filtration barrier in diabetic nephropathy. Int J Biol Macromol. 2020;148:475–82.
Ortega FJ, Mercader JM, Moreno-Navarrete JM, Sabater M, Pueyo N, Valdés S, Ruiz B, Luche E, Serino M, Naon D, et al. Targeting the association of calgranulin B (S100A9) with insulin resistance and type 2 diabetes. J Mol Med (Berl). 2013;91:523–34.
Yan SF, Ramasamy R, Naka Y, Schmidt AM. Glycation, inflammation, and RAGE: a scaffold for the macrovascular complications of diabetes and beyond. Circ Res. 2003;93:1159–69.
Ma ZJ, Sun P, Guo G, Zhang R, Chen LM. Association of the HLA-DQA1 and HLA-DQB1 alleles in type 2 diabetes mellitus and diabetic nephropathy in the Han ethnicity of China. J Diabetes Res. 2013;2013: 452537.
Lin G, Wan X, Liu D, Wen Y, Yang C, Zhao C. COL1A1 as a potential new biomarker and therapeutic target for type 2 diabetes. Pharmacol Res. 2021;165: 105436.
Blencowe M, Furterer A, Wang Q, Gao F, Rosenberger M, Pei L, Nomoto H, Mawla AM, Huising MO, Coppola G, et al. IAPP-induced beta cell stress recapitulates the islet transcriptome in type 2 diabetes. Diabetologia. 2022;65:173–87.
Park JT, Kato M, Lanting L, Castro N, Nam BY, Wang M, Kang SW, Natarajan R. Repression of let-7 by transforming growth factor-β1-induced Lin28 upregulates collagen expression in glomerular mesangial cells under diabetic conditions. Am J Physiol Renal Physiol. 2014;307:F1390-1403.
Chen L, Yin Z, Qin X, Zhu X, Chen X, Ding G, Sun D, Wu NN, Fei J, Bi Y, et al. CD74 ablation rescues type 2 diabetes mellitus-induced cardiac remodeling and contractile dysfunction through pyroptosis-evoked regulation of ferroptosis. Pharmacol Res. 2022;176: 106086.
Wang RR, Qiu X, Pan R, Fu H, Zhang Z, Wang Q, Chen H, Wu QQ, Pan X, Zhou Y, et al. Dietary intervention preserves β cell function in mice through CTCF-mediated transcriptional reprogramming. J Exp Med. 2022; 219.
Minaga K, Watanabe T, Arai Y, Shiokawa M, Hara A, Yoshikawa T, Kamata K, Yamashita K, Kudo M. Activation of interferon regulatory factor 7 in plasmacytoid dendritic cells promotes experimental autoimmune pancreatitis. J Gastroenterol. 2020;55:565–76.
Muthalagu N, Monteverde T, Raffo-Iraolagoitia X, Wiesheu R, Whyte D, Hedley A, Laing S, Kruspig B, Upstill-Goddard R, Shaw R, et al. Repression of the Type I interferon pathway underlies MYC- and KRAS-dependent evasion of NK and B cells in pancreatic ductal adenocarcinoma. Cancer Discov. 2020;10:872–87.
Sun X, Ji Y, Tahir A, Kang J. Network pharmacology combined with transcriptional analysis to unveil the biological basis of astaxanthin in reducing the oxidative stress induced by diabetes mellitus. Diabetes Metab Syndr Obes. 2020;13:4281–95.
Davis AP, Grondin CJ, Lennon-Hopkins K, Saraceni-Richards C, Sciaky D, King BL, Wiegers TC, Mattingly CJ. The comparative toxicogenomics database’s 10th year anniversary: update 2015. Nucleic Acids Res. 2015;43:D914-920.
Christensen JG. A preclinical review of sunitinib, a multitargeted receptor tyrosine kinase inhibitor with anti-angiogenic and antitumour activities. Ann Oncol. 2007;18(Suppl 10):x3-10.
Abderrahmani A, Yengo L, Caiazzo R, Canouil M, Cauchi S, Raverdy V, Plaisance V, Pawlowski V, Lobbens S, Maillet J, et al. Increased hepatic PDGF-AA signaling mediates liver insulin resistance in obesity-associated type 2 diabetes. Diabetes. 2018;67:1310–21.
Karamzad N, Eftekhari A, Ashrafi-Asgarabad A, Sullman MJM, Sahebkar A, Safiri S. Serum hepcidin, the hepcidin/ferritin ratio and the risk of type 2 diabetes: a systematic review and meta-analysis. Curr Med Chem. 2021;28:1224–33.
Villalta SA, Lang J, Kubeck S, Kabre B, Szot GL, Calderon B, Wasserfall C, Atkinson MA, Brekken RA, Pullen N, et al. Inhibition of VEGFR-2 reverses type 1 diabetes in NOD mice by abrogating insulitis and restoring islet function. Diabetes. 2013;62:2870–8.
Redondo MJ, Hagopian WA, Oram R, Steck AK, Vehik K, Weedon M, Balasubramanyam A, Dabelea D. The clinical consequences of heterogeneity within and between different diabetes types. Diabetologia. 2020;63:2040–8.
Zhuang Y, Yang X, Li Y, Chen Y, Peng X, Yu L, Ding J. Sustained release strategy designed for lixisenatide delivery to synchronously treat diabetes and associated complications. ACS Appl Mater Interfaces. 2019;11:29604–18.
Nurunnabi M, Lee SA, Revuri V, Hwang YH, Kang SH, Lee M, Cho S, Cho KJ, Byun Y, Bae YH, et al. Oral delivery of a therapeutic gene encoding glucagon-like peptide 1 to treat high fat diet-induced diabetes. J Control Release. 2017;268:305–13.
Rosenblum D, Peer D. Omics-based nanomedicine: the future of personalized oncology. Cancer Lett. 2014;352:126–36.
Taneera J, Lang S, Sharma A, Fadista J, Zhou Y, Ahlqvist E, Jonsson A, Lyssenko V, Vikman P, Hansson O, et al. A systems genetics approach identifies genes and pathways for type 2 diabetes in human islets. Cell Metab. 2012;16:122–34.
Dominguez V, Raimondi C, Somanath S, Bugliani M, Loder MK, Edling CE, Divecha N, da Silva-Xavier G, Marselli L, Persaud SJ, et al. Class II phosphoinositide 3-kinase regulates exocytosis of insulin granules in pancreatic beta cells. J Biol Chem. 2011;286:4216–25.
Volkmar M, Dedeurwaerder S, Cunha DA, Ndlovu MN, Defrance M, Deplus R, Calonne E, Volkmar U, Igoillo-Esteve M, Naamane N, et al. DNA methylation profiling identifies epigenetic dysregulation in pancreatic islets from type 2 diabetic patients. Embo J. 2012;31:1405–26.
Fuchsberger C, Flannick J, Teslovich TM, Mahajan A, Agarwala V, Gaulton KJ, Ma C, Fontanillas P, Moutsianas L, McCarthy DJ, et al. The genetic architecture of type 2 diabetes. Nature. 2016;536:41–7.
All authors would like to thank all those who contributed to this study.
This work was supported by a grant from the Natural Science Foundation of Hunan Province [Grant Number 2020JJ5856], and the National undergraduate innovation training Project (2020105330289).
Ethics approval and consent to participate
The studies involving human participants were reviewed and approved by IRB of the third Xiangya hospital of central south university expedited approval. The patients/participants provided their written informed consent to participate in this study.
Consent for publication
Authors declare no competing interest for this article.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original version of this article was revised: incorrect affiliation of author Weizheng Li was assigned - 3 Xiangya Medical College, Central South University, No. 138 Tongzipo Road Yuelu District, Changsha 410013, Hunan, People’s Republic of China. It should be Weizheng Li 1 – 1 Department of General Surgery, Third Xiangya Hospital Central South University, No. 138 Tongzipo Road Yuelu District, Changsha 410013, Hunan, People’s Republic of China
Additional file 1: Figure S1.
GO analysis of the multi-omics profile. (A-C) The results of GO were presented by bubble charts.
Additional file 2: Figure S2.
GO and KEGG analysis of sunitinib target genes. (A-C) The results of GO were presented by bubble charts. (D) Bubble graphs showed the enrichment results of the KEGG pathways.
Additional file 3: Figure S3.
RT-qPCR analysis of CD74. (A-H) The RT-qPCR process results of CD74.
Additional file 4: Figure S4.
RT-qPCR analysis of PSMB9. (A-H) The RT-qPCR process results of PSMB9.
Additional file 5: Table S1
: MSEA top pathways. Table S2: Scores of pathways after applying 12 topological algorithms. Table S3: Detailed information of key drivers in wKDA network. Table S4: The details of hub genes. Table S5: The details of TFs. Table S6: Detailed results of drug repositioning results.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Liu, J., Liu, S., Yu, Z. et al. Uncovering the gene regulatory network of type 2 diabetes through multi-omic data integration. J Transl Med 20, 604 (2022). https://doi.org/10.1186/s12967-022-03826-5
- Multi-omics network
- Key driver genes
- Type 2 diabetes
- Drug repositioning
- Bioinformatics analysis