Serum protein signature of coronary artery disease in type 2 diabetes mellitus

Background Coronary artery disease (CAD) is the leading cause of morbidity and mortality in patients with type 2 diabetes mellitus (T2DM). The purpose of the present study was to discriminate Indian CAD patients with or without T2DM using multiple pathophysiological biomarkers. Methods Using sensitive multiplex protein assays, we assessed 46 protein markers, including cytokines/chemokines, metabolic hormones, adipokines and apolipoproteins, to evaluate the different pathophysiological conditions of control, T2DM, CAD and T2DM with CAD (T2DM_CAD) patients. Network analysis was performed with the STRING 10.5 database to create protein-protein interaction networks from the protein markers that were significantly (p < 0.05) altered in each disease. We used two supervised analysis methods, between class analysis (BCA) and principal component analysis (PCA), to reveal distinct biomarker profiles. Further, random forest (RF) classification was used to classify the diseases by the panel of markers. Results BCA and PCA revealed distinct biomarker profiles and a high degree of variability in the marker profiles for T2DM_CAD and CAD. The present study identified multiple potential biomarkers that differentiate T2DM, CAD, and T2DM_CAD patients based on their relative abundance in serum. With respect to the control subjects, RF classified T2DM based on the abundance patterns of nine markers (IL-1β, GM-CSF, glucagon, PAI-1, RANTES, IP-10, resistin, GIP and Apo-B); CAD by 14 markers (resistin, PDGF-BB, PAI-1, lipocalin-2, leptin, IL-13, eotaxin, GM-CSF, Apo-E, ghrelin, adipsin, GIP, Apo-CII and IP-10); and T2DM_CAD by 12 markers (insulin, resistin, PAI-1, adiponectin, lipocalin-2, GM-CSF, adipsin, leptin, Apo-AII, RANTES, IL-6 and ghrelin).
Using network analysis, we identified several cellular network proteins, such as PTPN1, AKT1, INSR, LEPR, IRS1, IRS2, IL1R2, IL6R, PCSK9 and MYD88, which are responsible for regulating inflammation, insulin resistance, and atherosclerosis. Conclusion We identified three distinct sets of serum markers for diabetes, CAD and diabetes associated with CAD in Indian patients using a nonparametric machine learning approach. These multiple-marker classifiers may be useful for monitoring progression from a healthy state to T2DM and from T2DM to T2DM_CAD. However, these findings need to be confirmed in future studies with a larger number of samples. Electronic supplementary material The online version of this article (10.1186/s12967-018-1755-5) contains supplementary material, which is available to authorized users.

Background Type 2 diabetes mellitus (T2DM) is a chronic metabolic disease characterized by elevated blood glucose levels resulting from insufficient insulin secretion, impaired insulin action, or both. T2DM is influenced by both host genetics and environmental factors, including age, family history, diet and a sedentary lifestyle. The global burden of T2DM has been estimated at 425 million people and is projected to increase by 48%, to 625 million, by the year 2045 [1]. T2DM is a major risk factor for developing micro- and macrovascular complications [2]. Patients with T2DM have a two- to three-fold higher cardiovascular risk than non-diabetic subjects [1]. Persistently high blood glucose levels may cause vascular damage and lead to vascular complications such as coronary artery disease (CAD), which can in turn lead to angina or myocardial infarction. Diabetic patients are often unaware of cardiovascular complications until they are hospitalized with angina or myocardial infarction. Early prediction or detection of the disease, followed by therapeutic intervention and a management plan, may prevent disease progression. Therefore, there is a need to find specific markers for detecting different levels of disease severity or the progression of T2DM associated with CAD. Until now, most studies have reported a single marker or very few markers to understand disease progression [3], and only a limited number of studies have used multiple-marker approaches that consider different pathophysiological processes, such as lipoprotein metabolism, hormonal imbalance and inflammation [4]. Several prospective and randomized studies, such as the United Kingdom Prospective Diabetes Study (UKPDS), Action to Control Cardiovascular Risk in Diabetes (ACCORD), Action in Diabetes and Vascular Disease: Preterax and Diamicron Modified Release Controlled Evaluation (ADVANCE) and the Veterans Affairs Diabetes Trial (VADT), have shown that regulation of glucose has little or no effect on cardiovascular complications and mortality [5].
The Canakinumab Anti-inflammatory Thrombosis Outcome Study (CANTOS), which tested anti-inflammatory therapy with canakinumab targeting IL-1β, showed reduced recurrent cardiovascular events and a dose-dependent decrease in IL-6 levels [6]. The CANTOS study and previous literature demonstrate that inflammation plays a pivotal role in the development and progression of many complex diseases, such as hypertension, dyslipidaemia, diabetes, and cardiovascular diseases (CVD) [6,7]. These data indicate that inflammation could be the bridging link between T2DM and CVD. Chronic inflammation works through different mechanisms, such as endothelial dysfunction, oxidative stress, and insulin resistance [8]. Among different cell types, macrophages and adipocytes play an important role in inducing inflammation in diabetes. Cytokines are cell-signaling molecules that are secreted by macrophages in the inflammatory state [9]. Like macrophages, adipocytes are very sensitive to inflammation and secrete different adipokines, such as leptin, adiponectin and resistin, and proinflammatory factors, such as TNF-α, IL-1β, IL-6, and PAI-1. Retained lipoproteins, together with other atherogenic factors observed during the atherosclerotic process, activate endothelial cells to recruit more immune cells such as monocytes, which further differentiate into macrophages and activate other inflammatory signaling pathways [10]. During the development of atherosclerosis, the expression of lipoprotein surface molecules such as the apolipoproteins, both atherogenic (Apo-B) and anti-atherogenic (Apo-AI) [11], may be altered and cause coronary artery disease [12]. We hypothesised that different cells, such as macrophages, adipocytes, and endothelial cells, communicate through metabolic hormones, cytokines/chemokines, adipokines and apolipoproteins to regulate insulin sensitivity, lipid metabolism, and inflammation. Perturbation of the homeostasis of these signaling molecules due to environmental or genetic changes may contribute to T2DM and CAD progression.
Therefore, a set of markers perturbed at different stages of disease progression may provide potential biomarkers for predicting disease phenotypes. To identify potential biomarkers of T2DM with or without cardiovascular complications, different omics approaches targeting multiple proteins, metabolites, microRNAs and long non-coding RNAs are preferred [13]. In the present study, we measured protein markers in patients with T2DM, CAD and T2DM_CAD and in healthy controls to identify a disease-specific panel, using nonparametric machine learning classification to distinguish different stages of disease progression.

Inclusion and exclusion criteria for selection of study subjects
Group 1: Control (CT, n = 26) subjects had no prior history of T2DM, hypertension, coronary artery disease or any other cardiovascular disease, and were not taking medication for any chronic medical condition. Fasting blood glucose, HbA1c and blood chemistry were normal. Group 2: Type 2 diabetes (T2DM, n = 53) subjects had HbA1c levels ≥ 6.5% as per American Diabetes Association (ADA) guidelines, a proven history of T2DM and no other complications. Group 3: Coronary artery disease (CAD, n = 21) subjects were diagnosed based on a positive medical history (myocardial infarction, angina pectoris and coronary artery bypass graft) and/or ischemic changes on a conventional 12-lead ECG, which included ST-segment depression or Q-wave changes [14]. Coronary artery disease subjects were identified by the cardiologist in the inpatient setting of the cardiac catheterization unit at Mediciti hospital. This group had no prior history of T2DM. Group 4: Type 2 diabetes with coronary artery disease (T2DM_CAD, n = 27) subjects had coronary artery disease as defined for group 3 and also had HbA1c levels ≥ 6.5% and a prior history of T2DM. Demographic, clinical, and angiographic data were also collected from all patients. Fasting samples were collected from the patients prior to percutaneous coronary intervention (PCI) or coronary artery bypass graft (CABG). Subjects with clinical or laboratory evidence of chronic disease conditions such as liver failure, renal failure (serum creatinine levels > 1.5 mg/dl), type 1 diabetes, cancer or thyroid disease, and pregnant subjects, were excluded from the study.

Statistical analysis, classification and visualization
Most statistical analyses and visualizations were performed using modules of the R programming environment. To obtain the fold change, we used the median value of each protein marker across all the healthy controls (referred to as the 'control median'). The fold change of a given marker for a given patient was then obtained as the log-ratio of the value of the marker in that patient to the control median of that marker. The modules used for the analyses were randomForest (random forest classification), dudi.pca of the ade4 package (Principal Component Analysis), dunn.test (post hoc Dunn's tests of pairwise comparisons across cohorts), and kruskal.test (identification of the markers that differ significantly between cohorts). Only p < 0.05 was considered significant. We used a two-stage approach to select the specific markers (Table 2). In the first step, we selected all protein features that were significantly different between two groups at a nominal p-value < 0.05. In the second step, we applied a Benjamini-Hochberg correction to this subset and selected the features with a false discovery rate (fdr) corrected p-value < 0.15. Correlations between markers, and between clinical parameters and protein markers, were obtained using Kendall's tau (corr function of R) and Spearman correlation. A correlation coefficient (R) of 0.3 was set as the threshold, and p < 0.05 was considered significant. Between Class Analysis is a specialized form of 'supervised' Principal Component Analysis (PCA) with respect to an instrumental variable (in this case, the class). It provided better resolution and a better analysis for marker identification than PCA.
The modules used for visualization were heatmap.2 (heatmap), factoextra (PCA biplot showing the association of the markers with different patient marker profiles), ggcorrplot (plotting the correlations as heatmaps) and the s.class function of the ade4 package (visualization of the class-based resolution of the patient marker profiles obtained using the PCA). In-house code was written in Perl to compute the pairwise variations among patient marker profiles using the J-divergence measure [16].
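The fold-change definition and the two-stage marker selection described above can be illustrated with a short sketch. It is written in Python rather than the R modules named in the text, and it is not the authors' code. The base of the logarithm (log2), the function names and the example p-values are assumptions made for illustration only.

```python
import math
from statistics import median


def log_fold_changes(patient_values, control_values):
    """Log-ratio of each patient value to the control median of one marker.

    The log base is not stated in the text; log2 is an assumption.
    """
    control_median = median(control_values)
    return [math.log2(v / control_median) for v in patient_values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def two_stage_selection(marker_p, nominal=0.05, fdr=0.15):
    """Stage 1: keep markers with nominal p < 0.05.
    Stage 2: apply BH to that subset only and keep adjusted p < fdr."""
    stage1 = {name: p for name, p in marker_p.items() if p < nominal}
    names = list(stage1)
    adjusted = benjamini_hochberg([stage1[name] for name in names])
    return {name: q for name, q in zip(names, adjusted) if q < fdr}


# Hypothetical values for illustration; not data from the study.
example_fold = log_fold_changes([4.0, 1.0], [1.0, 2.0, 3.0])
example_p = {"marker_A": 0.001, "marker_B": 0.02, "marker_C": 0.04, "marker_D": 0.30}
selected = two_stage_selection(example_p)
```

In this sketch the correction is applied only to the features that pass the nominal threshold, which mirrors the two-stage wording above; correcting over all measured markers would be stricter.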

Results
A total of 127 subjects were randomly selected and enrolled in the study. Their clinical and biochemical characteristics are presented in Table 1. The male-to-female ratio was not matched across the study groups, as male subjects are more prone to CAD than females. Fasting blood glucose and glycated haemoglobin (HbA1c) levels were significantly (p < 0.001) increased in the T2DM and T2DM_CAD groups as compared to control.

Serum protein markers levels in study groups
Levels of 45 protein markers in the serum samples of enrolled subjects were measured. Four cytokines (IL-2, IL-7, IL-15 and MIP-1α) out of the 46 protein markers were excluded because of the detection limits of the present assay. The fold change of the clinically significant markers across all individuals belonging to the three disease states (T2DM, CAD and T2DM_CAD) is represented in a heat map. The median fold change in each disease cohort versus the control median of each marker is also shown (Fig. 1).

Table 1 Clinical and biochemical variables in study groups
Data are mean (SD) for normally distributed variables and median (Q1-Q3) for non-normally distributed variables. BMI, body mass index; FBS, fasting blood sugar; HbA1c, glycated hemoglobin; eGFR, estimated glomerular filtration rate; FRS, Framingham Coronary Heart Disease Risk Score in 10 years; ASCVD, estimated risk score for atherosclerotic cardiovascular disease in 10 years. Anti-platelet and statin therapy was given to all the patients as prophylaxis for CAD events. a p < 0.05 compared to control
Variable: Control; T2DM; CAD; T2DM_CAD
Hypertension history: -; 25; 16; 14
Patients were on hypertensive drugs: -; 18; 13; 14
Patients were on anti-diabetes medication: -; 34; -; 17
Patients were on anti-platelet and statin therapy: -; -; 25; 26

were significantly (fdr corrected p-value < 0.15) increased in the T2DM_CAD group as compared to T2DM group.
In the present study, we also compared protein marker alterations between the T2DM_CAD and CAD groups: levels of the metabolic hormones insulin, GIP and GLP-1 were significantly (fdr corrected p-value < 0.10) increased, whereas IP-10 levels were decreased, in T2DM_CAD as compared with the CAD group.

Serum protein marker levels in T2DM and CAD group compared with control group
Our data also indicated that levels of the metabolic hormone GIP, the cytokine GM-CSF and the apolipoprotein Apo-AI were significantly (fdr corrected p-value < 0.10) decreased, while lipocalin-2 was significantly (fdr corrected p-value < 0.10) increased, in the T2DM group as compared with the control group. In the CAD group, levels of the metabolic hormone ghrelin, the adipokines resistin, PAI-1, adipsin and lipocalin-2, the cytokines eotaxin, IP-10 and PDGF-BB, and the apolipoprotein Apo-CII were significantly (fdr corrected p-value < 0.10) increased as compared with the control group. Levels of the metabolic hormones GIP and leptin were significantly (fdr corrected p-value < 0.05) decreased in the CAD group as compared to the control group (Table 2).

Network analysis
Further, we conducted protein-protein interaction (PPI) network analysis with the STRING 10.5 database, using the proteins that were significantly (p ≤ 0.05) altered in each disease. The STRING database compiles PPIs from experimental interactions and from other sources, combining text- and data-mining approaches. We constructed disease-specific PPI networks based on a high confidence score threshold (STRING score ≥ 0.7). The Kyoto Encyclopedia of Genes and Genomes (KEGG) database was used, through the STRING interface, to assign related gene categories to their associated pathways. KEGG pathway enrichment analysis was performed, and results with multiple testing corrections were used for further analysis. A false discovery rate (FDR) threshold ≤ 1% was applied. The KEGG pathway analysis sheet is provided as Additional file 1. The important processes were colored using the STRING analysis tool tab. The networks were downloaded and edited to highlight the upregulated and downregulated proteins. In the T2DM group, three proteins were downregulated and one protein, lipocalin-2, was upregulated. The PPI network showed that these molecules are involved in cytokine-cytokine receptor signaling and Jak/Stat signaling (Fig. 2a).
Similarly, in the CAD group, lipocalin-2, ghrelin, PAI-1 (SERPINE1), adipsin (CFD), resistin, PDGF-BB, CCL11, IP-10 and Apo-CII were upregulated, and GIP and leptin were downregulated. The network revealed that these molecules are closely associated with cytokine-cytokine receptor signaling and chemokine signaling (Fig. 2b). In the T2DM_CAD group, GM-CSF (CSF2) and PAI-1 levels were downregulated, and TNF-α, IL-1β, CCL11, lipocalin-2, insulin, GLP-1, adiponectin and adipsin were upregulated. All these proteins are involved in cytokine-cytokine receptor signaling, NF-kB signaling, insulin signaling and adipocytokine signaling (Fig. 2c). In the T2DM_CAD group compared with T2DM, nine markers (GLP-1, GIP, insulin, IL-6, Apo-E, Apo-AI, Apo-AII, TNF-α and adipsin) were upregulated and leptin was downregulated. These proteins are involved in cytokine-cytokine receptor signaling, Jak-Stat signaling, PI3K-Akt signaling, adipocytokine signaling and insulin signaling (Fig. 2d).

Fig. 1 Heatmap showing the fold change of the various clinically significant markers across all the individuals belonging to the three different disease states. To obtain the fold change, the median value of each clinical marker was obtained across all the healthy controls (referred to as the 'control median'). The fold change of a given marker for a given patient was then obtained as the log-ratio of the value of the marker in that patient to the control median of that marker. Four distinct sets of correlated protein markers (CLs) are highlighted by dark blue, light blue, yellow and green boxes on the heatmap. The median fold change in each disease cohort versus the control medians of each marker is also shown
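The high-confidence filtering step used for the network construction can be sketched as follows. This is a minimal Python illustration and not the STRING interface itself. The edge list, its scores and the function names are hypothetical; only the 0.7 threshold comes from the text.

```python
from collections import defaultdict

HIGH_CONFIDENCE = 0.7  # STRING score threshold stated in the text


def filter_edges(edges, threshold=HIGH_CONFIDENCE):
    """Keep only interactions whose combined score meets the threshold."""
    return [(a, b, score) for a, b, score in edges if score >= threshold]


def node_degree(edges):
    """Count how many retained interactions each protein takes part in."""
    degree = defaultdict(int)
    for a, b, _score in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


# Hypothetical scores for illustration only; not values retrieved from STRING.
example_edges = [
    ("LEP", "LEPR", 0.99),
    ("INS", "INSR", 0.95),
    ("LEP", "INS", 0.65),   # below the threshold, so it is dropped
    ("INS", "IRS1", 0.80),
]
kept = filter_edges(example_edges)
degrees = node_degree(kept)
```

Counting node degree after filtering is one simple way to see which proteins remain as hubs once low-confidence interactions are removed.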

Correlation among protein markers and clinical characteristics
The correlations between protein markers and clinical characteristics were shown in Fig. 3b. Our correlation

Machine learning classification methods for characterizing the disease groups
We used two supervised analysis methods, Between Class Analysis (BCA) and Principal Component Analysis (PCA), to distinguish the disease groups based on the protein markers. BCA revealed distinct biomarker profiles for the T2DM_CAD and CAD groups. However, the T2DM group had marker profiles relatively similar to those of the control group. The BCA ordination plot is shown in Fig. 4a. Subjects belonging to different groups are colored differently (as indicated in the figure legend) and connected with the centroid profile of each group. Furthermore, PCA of the serum markers showed decent separation of samples from patients with CAD and T2DM_CAD from both controls and T2DM individuals based on the most decisive component of the dataset (Fig. 4b). Dimension 1 (Dim 1) of the PCA accounted for 28.3% of the variation, while Dimension 2 (Dim 2) accounted for 13.9%. Further, we have presented the within-group marker profile variations, which revealed a high degree of variability in the markers within the CAD and T2DM_CAD groups as compared to the control and T2DM groups (Fig. 4c).
For the prediction of each disease group against the control group (Fig. 5), we performed 100 iterations. In each iteration, we trained a random forest classification model on 50% of the dataset (based on the protein profiles) and tested it on the remaining 50% (2-fold cross validation, rather than a 90%:10% training:testing ratio, thereby reducing over-fitting). In other words, each iteration involved a random forest classifier trained on a different subset of control and diseased samples and tested on a completely non-overlapping set. The two-fold cross validation helps ensure that the models do not over-fit. Furthermore, the 100 iterations gave us the statistical power to explore the entire landscape of individuals (available in the current study) and to judge the power of each classification strategy. The classification power of each feature was then computed as the mean feature rank across all 100 iterations. Our random forest approach yielded predictive models that distinguish T2DM, CAD, and T2DM_CAD from the control group. We also built a random forest classifier to predict T2DM_CAD with respect to T2DM. The accuracy, variable importance scores and median abundance of the markers for each disease state are shown in Fig. 5a. The iterative approach gave us the statistical power not only to compare the classification efficiency for the three diseases using the protein profiles (p < 3.4e−10 using the Kruskal-Wallis H-test, CAD > T2DM_CAD > T2DM), but also to validate the stability of the features for prediction across the entire landscape of subjects (Fig. 5b). Principal Coordinate Analysis of the 100 ranked feature importance profiles (obtained in each iteration) for each disease indicated that the feature importance profiles are relatively similar to each other for a given disease and significantly different from those of other diseases (PERMANOVA p < 2.8e−13).
Briefly, the Random Forest (RF) classifier distinguished the T2DM group from the control group by nine markers (IL-1β, GM-CSF, glucagon, PAI-1, RANTES, IP-10, resistin, GIP, Apo-B; accuracy 76%, sensitivity 72%, specificity 81%, AUC 0.72); CAD was predicted with respect to the control by 14 markers (resistin, PDGF-BB, PAI-1, lipocalin-2, leptin, IL-13, eotaxin, GM-CSF, Apo-E, ghrelin, adipsin, GIP, Apo-CII, IP-10; accuracy 86%, sensitivity 85%, specificity 87.5%, AUC 0.84); and T2DM_CAD was predicted by 12 markers (insulin, resistin, PAI-1, adiponectin, lipocalin-2, GM-CSF, adipsin, leptin, Apo-AII, RANTES, IL-6, and ghrelin; accuracy 92%, sensitivity 92.3%, specificity 90%, AUC 0.92) (Fig. 5c). T2DM_CAD was also classified well with respect to T2DM by nine markers (adiponectin, C-peptide, resistin, IL-1β, ghrelin, lipocalin-2, Apo-AII, IP-10, Apo-B; accuracy 85.7%, sensitivity 86.9%, specificity 78.5%, AUC 0.76) (Fig. 6a-c). All these classifiers were considered significant (p < 0.05) and are shown in Table 3.
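The repeated 50%/50% split and the mean-feature-rank aggregation described in this section can be sketched as follows. This is a Python illustration, not the authors' R code. To stay self-contained, it replaces the random forest importance with a simple placeholder score (the absolute difference between class medians in the training half), and it assumes the split is stratified by class. Both choices, the function names and the synthetic data are assumptions made for illustration only.

```python
import random
from statistics import mean, median


def stratified_half_split(labels, rng):
    """Split sample indices 50/50 within each class; return (train, test)."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        half = len(idx) // 2
        train += idx[:half]
        test += idx[half:]
    return train, test


def placeholder_importance(rows, labels, train, n_features):
    """Stand-in score: |median(cases) - median(controls)| per feature.

    This is NOT random forest importance; a real classifier's variable
    importance would be substituted here.
    """
    scores = []
    for j in range(n_features):
        cases = [rows[i][j] for i in train if labels[i] == 1]
        controls = [rows[i][j] for i in train if labels[i] == 0]
        scores.append(abs(median(cases) - median(controls)))
    return scores


def to_ranks(scores):
    """Rank 1 is the most important feature; ties keep feature order."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def mean_feature_rank(rows, labels, n_iter=100, seed=0):
    """Average each feature's rank over repeated random half splits."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    all_ranks = []
    for _ in range(n_iter):
        # The held-out half would be used to score accuracy in a full pipeline.
        train, _held_out = stratified_half_split(labels, rng)
        scores = placeholder_importance(rows, labels, train, n_features)
        all_ranks.append(to_ranks(scores))
    return [mean(r[j] for r in all_ranks) for j in range(n_features)]


# Synthetic data for illustration: 10 controls (label 0) and 10 cases (label 1).
# Feature 0 separates strongly, feature 1 weakly, feature 2 not at all.
labels = [0] * 10 + [1] * 10
rows = [[i * 0.1, i * 0.01, 0.0] for i in range(10)]
rows += [[5 + i * 0.1, 1 + i * 0.01, 0.0] for i in range(10)]
example_mean_ranks = mean_feature_rank(rows, labels, n_iter=20, seed=1)
```

Averaging ranks rather than raw importance scores keeps iterations comparable even when the scale of the scores changes from one split to the next.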

Discussion
The development of CAD in T2DM involves multiple pathophysiologic processes, including hyperglycemia, hyperinsulinemia, insulin resistance, dyslipidemia, chronic low-grade inflammation, oxidative stress, endothelial dysfunction, vascular calcification, and hypercoagulability [17]. Together, these mechanisms alter the plasma levels of metabolic hormones, inflammatory mediators, adipokines and apolipoproteins in diabetes and its associated cardiovascular complications [18]. The individual mechanisms and their interplay in diabetic disease progression are not fully understood. In the present study, we sought to identify the simultaneous induction of various protein markers in diabetes and CAD based on disease mechanisms related to metabolic hormonal regulation, inflammation, adipogenesis, and atherogenesis. Our aim was to characterise T2DM by a panel of protein markers that accelerate CAD progression in Indian subjects. Identifying specific markers in Indian T2DM patients is very important considering the high prevalence of type 2 diabetes and its complications, especially CAD, in India.
Fig. 6 a Classification area under the curves (AUCs) of random forest-based classifiers (trained on the marker profiles) for predicting T2DM_CAD with respect to T2DM. For each disease state, classification accuracies were obtained after 100 iterations, where in each iteration the model was trained on 50% of the data and validated/tested on the remaining 50%. b Variable importance scores of the markers identified to be optimal for at least one of the three comparisons (T2DM v/s T2DM_CAD). c Fold change of the median abundances of the corresponding markers for each comparison (T2DM v/s T2DM_CAD)

The present study showed that levels of metabolic hormones such as GLP-1, C-peptide and insulin were significantly increased in T2DM_CAD as compared with control. Similarly, GLP-1 and C-peptide levels were significantly increased in T2DM_CAD as compared with T2DM. Previously, it was reported that increased insulin and C-peptide levels were independently associated with an increased risk of coronary artery disease in T2DM subjects. Interestingly, hyperinsulinemia accelerates atherosclerosis more in diabetic patients than in non-diabetic patients [19,20]. In contrast, GLP-1, in addition to its role in metabolic regulation, shows an anti-atherosclerotic effect [21]. The increased GLP-1 levels that we observed in the T2DM_CAD group might be a compensatory response to the atherosclerotic effect. In the present study, leptin levels were significantly lower in CAD patients but increased when diabetes was associated with CAD. Al-Daghri et al. reported the association of increased leptin levels with the severity of both metabolic syndrome and diabetes associated with coronary heart disease [22]. Many clinical reports suggest that leptin might be a key link between metabolism and inflammation across different age categories, ranging from pediatric to geriatric patients with diabetes or other cardiovascular risk factors [23].
Further, leptin is able to induce production of C-reactive protein by endothelial cells [24]. The increased local availability of leptin in the vascular wall can in turn exert pro-atherothrombotic effects on both endothelial cells and smooth muscle cells [25,26]. Among the pro-inflammatory cytokines, TNF-α and IL-1β levels were increased in T2DM_CAD subjects as compared with control subjects. According to previous literature, TNF-α induces insulin resistance by inhibiting IRS-1 phosphorylation and GLUT-4 expression, and is elevated in patients with heart failure and myocardial ischemia reperfusion [27]. Pu et al. reported the association of increased TNF-α levels with CAD among T2DM patients [28]. Increased TNF-α and IL-1β levels, along with hyperglycemia, contribute to the development of atherosclerosis [29,30]. Similarly, another pro-inflammatory marker, IL-6, increased in both the CAD and T2DM_CAD groups. However, a significant increase was observed only in the T2DM_CAD group as compared with T2DM. Increased IL-6 levels disturb glucose metabolism and contribute to the development of insulin resistance. It was also reported earlier that IL-6 levels are positively correlated with CAD [31]. The present study showed that the pro-inflammatory markers TNF-α, IL-1β, and IL-6 together induce a chronic inflammatory condition that promotes CAD in diabetes. Recently, the CANTOS study reported that treatment with canakinumab, a monoclonal antibody targeted against IL-1β, produced a dose-dependent decrease in IL-6 levels in myocardial infarction patients [6]. Levels of serum markers such as PDGF-BB, IP-10, resistin and PAI-1 were significantly increased in the CAD group and therefore represent specific markers for CAD in the absence of diabetes. All these parameters contribute to the development of atherosclerosis, as reported earlier [32-37].
In our study, GM-CSF levels decreased in the T2DM and T2DM_CAD groups as compared with the control group. GM-CSF induces activation of monocytes/macrophages and mediates their differentiation to other states that participate in immune responses. Previously, researchers reported that GM-CSF protects from diabetes by increasing the tolerogenic dendritic cell population [38]. Recently, Al-Hassnawi et al. reported decreased GM-CSF levels in type 2 diabetes patients and found an indirect association with blood glucose levels [39]. The present study also suggests that GM-CSF could be used as a disease progression marker for diabetes and diabetes with CAD. Similarly, eotaxin levels were significantly increased in the CAD and T2DM_CAD groups. Researchers previously reported that increased eotaxin levels were associated with CAD and coronary atherosclerosis [40]. Therefore, elevation of eotaxin levels might be important for atherosclerosis development in diabetic subjects. Increased adipsin levels were observed in the CAD and T2DM_CAD groups, while lipocalin-2 levels were increased in the T2DM, CAD and T2DM_CAD groups. Researchers reported that adipsin promotes lipid accumulation and adipocyte differentiation, and improves beta cell function [41,42]. Previous literature reported that increased serum lipocalin-2 is positively correlated with insulin resistance and inflammation in T2DM patients [43,44].
The association of Apo-AI, Apo-AII and Apo-CII levels with atherosclerotic occlusive disease, CAD and type 2 diabetes associated with CAD has been reported [45-47]. In the present study, Apo-AI, Apo-AII and Apo-CII levels were increased in the CAD and T2DM_CAD groups; however, a significant increase was observed only in the CAD group. According to previous literature, increased Apo-AI has a cardio-protective effect, and improving Apo-AI expression is therefore considered a potential therapeutic strategy to inhibit atheroma formation [3]. However, the ApoA-I Milano product (MDCO-216) and the wild-type ApoA-I product (CER-001) failed to promote regression of coronary atherosclerosis compared with placebo. Another ApoA-I product, CSL112, recently entered a phase III cardiovascular outcomes trial [48]. It is hoped that this trial will yield a fruitful result. Researchers reported that Apo-AII promotes insulin resistance and disturbs body fat homeostasis [49]. Increased Apo-AII levels promote the development of atherosclerosis by disturbing reverse cholesterol transport and the antioxidant properties of HDL [49-53]. Similarly, Apo-CII plays an important role in triglyceride-rich lipoprotein metabolism and correlates positively with increased CAD and coronary heart disease (CHD) [54]. Between class analysis (BCA) was performed to distinguish the disease groups by their marker profiles. Our BCA revealed distinct biomarker profiles for the T2DM_CAD and CAD groups but no difference between the control and T2DM groups. Thus, the distinct biomarker profile between groups depends on the degree of glycaemia, the duration of the diabetic state and its complications. Principal component analysis (PCA) of the serum markers showed decent separation, with a high degree of variability, of CAD and T2DM_CAD samples from controls and type 2 diabetes based on the most decisive component of the dataset.

Table 3 Classification performance of marker profile based on random-forest classifiers for different pairs of groups
For each pair of groups, the Random Forest classifications were obtained with 10-fold cross validation (there were 1000 iterations, where in each iteration the classifiers were trained on 90% of the subjects, while the remaining 10% were used for prediction). Top discriminatory marker features are listed for each pairwise classification. Fisher's exact test was then performed on the confusion matrix in order to judge the significance of the prediction profile. VIS, variable importance score; AUC, area under the curve. AUC is reported as the median. Optimal markers are those giving a separation of > 1 VIS

Control vs T2DM: 9 markers in total (IL-1β, GM-CSF, glucagon, PAI-1, RANTES, IP-10, resistin, GIP, Apo-B)

We also analysed our data using a random forest (RF) classifier, a predictive model, to distinguish T2DM, CAD and T2DM_CAD from the control group, and to distinguish the T2DM_CAD group from the T2DM group. The protein markers selected by the RF classifier were then used to construct a Venn diagram showing the markers that are common to, and unique to, each disease group (Fig. 7a). We found that GM-CSF, PAI-1 and resistin are common classifiers for T2DM, CAD and T2DM_CAD. IL-1β, glucagon and Apo-B are unique markers for type 2 diabetes, whereas PDGF-BB, IL-13, eotaxin, Apo-E and Apo-CII are unique markers for the CAD group. Only four markers, namely insulin, adiponectin, Apo-AII and IL-6, are unique to the T2DM_CAD group. Six serum markers representing metabolic hormones (leptin, ghrelin) and adipokines (resistin, adipsin, PAI-1, lipocalin-2) are common to the CAD and T2DM_CAD groups, while the inflammatory, cytokine and apolipoprotein markers observed in the two groups are completely different. Similarly, four serum markers representing inflammation (GM-CSF), adipokines (resistin, PAI-1) and apolipoproteins (Apo-B) are common to the T2DM and T2DM_CAD groups, while the metabolic hormone markers observed in the two groups are completely different.
Fig. 7 a Venn diagram of the common and unique protein markers from the RF classifier that distinguish type 2 diabetes, CAD and T2DM_CAD from control. b Protein markers responsible for the development and progression of diabetes and its associated coronary artery disease complication. Different pathological protein markers, i.e., adipokines, cytokines, metabolic hormones and apolipoproteins (markers identified by the RF classifier, Table 3), may act as mediators in the initiation of insulin resistance, systemic inflammation, endothelial dysfunction, and increased lipolysis and free fatty acids. Up arrows denote upregulated proteins and down arrows denote downregulated markers.
The classifier analysis showed that a few serum markers from the T2DM_CAD group are shared with the T2DM and CAD marker panels. These shared markers suggest the involvement of common pathologies among T2DM, CAD and T2DM_CAD.
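
The set relationships summarised in the Venn diagram can be checked with a few lines of Python. The sketch below is illustrative only and is not the authors' procedure: it takes the three RF marker panels (each versus control) as they are listed in the abstract and computes the markers shared by all three panels and those unique to each panel. The names `panels`, `shared_by_all` and `unique_to` are hypothetical.

```python
# RF marker panels versus control, copied from the abstract of this article.
panels = {
    "T2DM": {"IL-1β", "GM-CSF", "glucagon", "PAI-1", "RANTES",
             "IP-10", "resistin", "GIP", "Apo-B"},
    "CAD": {"resistin", "PDGF-BB", "PAI-1", "lipocalin-2", "leptin",
            "IL-13", "eotaxin", "GM-CSF", "Apo-E", "ghrelin",
            "adipsin", "GIP", "Apo-CII", "IP-10"},
    "T2DM_CAD": {"insulin", "resistin", "PAI-1", "adiponectin",
                 "lipocalin-2", "GM-CSF", "adipsin", "leptin",
                 "Apo-AII", "RANTES", "IL-6", "ghrelin"},
}


def shared_by_all(panels):
    """Markers present in every panel (centre of the Venn diagram)."""
    return set.intersection(*panels.values())


def unique_to(name, panels):
    """Markers present in the named panel and in no other panel."""
    others = set().union(*(m for k, m in panels.items() if k != name))
    return panels[name] - others


common = shared_by_all(panels)
unique = {name: unique_to(name, panels) for name in panels}

print("Shared by all:", sorted(common))
for name, markers in unique.items():
    print(f"Unique to {name}:", sorted(markers))
```

Under these inputs, the shared set and the three unique sets agree with the markers named in the text for the comparisons against control.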
A few serum protein markers, i.e., RANTES, IL-13, glucagon and Apo-B, were selected by the classifier despite the absence of statistically significant differences. RANTES (CCL5) is an inflammatory marker that is secreted by adipocytes and contributes to leukocyte infiltration. Previous studies reported increased levels of circulating RANTES in obesity, impaired glucose tolerance, type 2 diabetes and coronary artery disease [55,56]. In contrast, Podolec et al. reported that severe coronary atherosclerosis correlated with decreased RANTES levels [57]. Our data suggest that RANTES could be a marker for type 2 diabetes and for type 2 diabetes with CAD. Similarly, the anti-inflammatory molecule IL-13 is a classifier in the CAD panel without showing any significant change. Decreased serum IL-13 levels in T2DM subjects play a role in impaired glucose uptake and metabolism [58]. Another study likewise showed decreased IL-13 levels in patients with coronary artery disease [59]. The metabolic hormone glucagon emerged as one of the markers in the type 2 diabetes classifier, although its levels did not differ significantly. Glucagon and insulin together regulate glucose production through stimulatory and inhibitory actions, respectively [60]. In our study, decreased glucagon levels and increased insulin levels were observed in the type 2 diabetes group. In the T2DM_CAD group, however, both levels were increased, which indicates impaired glucose regulation. The apolipoproteins Apo-B and Apo-E were classifiers for T2DM and CAD, respectively, and Apo-B was also a classifier marker for T2DM_CAD. It is well known that Apo-B and Apo-E are risk factors for coronary artery disease and cardiovascular mortality [61]. Researchers have also reported that Apo-B is associated with incident type 2 diabetes and is a better predictor of coronary artery disease among diabetic patients [62,63].
Consistent with these earlier reports, our study also classified Apo-B as a marker for T2DM and T2DM_CAD when compared with control and T2DM, respectively. Furthermore, all protein markers that were significantly altered in the study groups relative to the control group, and in the T2DM_CAD group relative to T2DM, were analyzed using the STRING database to identify the possible cellular signaling pathways (Fig. 2). Circulating protein markers with a significant change relative to control or T2DM were used to build a closely associated network of known pathways.
Network analysis revealed the complexity of T2DM_CAD, which is linked to several inflammatory and insulin resistance pathways such as the cytokine receptor, Jak-STAT, PI3K-Akt, adipocytokine and insulin signaling pathways (Fig. 2d). To understand the pathways affected throughout disease progression, from healthy to diabetes to diabetes with CAD, we combined all significant circulating markers and linked them to the crucial cellular proteins that could be affected during disease progression (Fig. 2b). PTPN1, identified in the network study (Fig. 2c), is linked to the common network pathway and modulates insulin resistance through the Jak-STAT, insulin receptor substrate (IRS1 and IRS2) and leptin signaling pathways [64]. All these proteins are already being explored as targets for the treatment of diabetes in diverse studies. Hyperinsulinemia, which was observed in the T2DM and T2DM_CAD groups, may downregulate IRS1 and IRS2 via p38-MAPK and trigger insulin resistance in liver and skeletal muscle. The alteration of leptin found in T2DM_CAD also affects the phosphatidylinositol 3-kinase-Akt (PI3K-Akt) signaling pathway via Jak2 activation of IRS1 and IRS2 (Fig. 2d). Together, these proteins contribute to the dysregulation of glucose and lipid metabolism, mitochondrial biogenesis, calcium handling, fibrosis and motor gene expression, culminating in cardiovascular complications. Signaling of the pro-inflammatory marker IL-6 via IL6R and STAT3 contributes to vascular inflammation in the vessel wall or fatty streak. Another pro-inflammatory marker, IL-1β, which belongs to the IL-1 family, binds to the IL-1 receptor type I (IL-1RI) and induces a downstream signal via numerous inflammatory mediators, such as MYD88, ERK, JNK and NF-κB, leading to the transcription of several inflammatory genes such as cytokines and chemokines (Fig. 2c). Some signaling molecules, especially NF-κB, also overlap with toll-like receptor (TLR) signaling [65].
Recently, the transcription factor high-mobility-group AT-hook 1 (HMGA1) has been linked to NF-κB activation and is involved in inflammation and in the pathogenesis of insulin resistance. It has been demonstrated that HMGA1 is associated with both the risk of diabetes and the risk of developing cardiovascular complications [66].
With the help of the network analysis, we were able to identify several cellular network proteins such as PTPN1, AKT1, INSR, LEPR, IRS1, IRS2, IL1R2, IL6R, PCSK9 and MYD88, which are responsible for regulating inflammation, insulin resistance and atherosclerosis. Several adipokines such as resistin, adiponectin, lipocalin-2 and IL-6 contribute to the development of insulin resistance, type 2 diabetes and cardiovascular disease. We also found that apolipoproteins and cytokines are tightly connected with each other in the network and contribute to the development as well as the progression of diabetes and of diabetes with coronary artery disease.