External validation of the European risk assessment tool for cardio metabolic disease in a Middle Eastern population

Background: A high burden of chronic cardio-metabolic disease (CCD), including type 2 diabetes mellitus (T2DM), chronic kidney disease (CKD), and cardiovascular disease (CVD), has been reported in the Middle East and North Africa (MENA) region. We aimed to externally validate a risk assessment tool developed by Alssema et al. in a Europoid population, which includes only non-laboratory measures, for the prediction of CCD in the Iranian population. Methods: The predictors were age, body mass index, waist circumference, use of antihypertensive medication, current smoking, and family history of cardiovascular disease and/or diabetes. For external validation of the model in the Tehran Lipid and Glucose Study (TLGS), the area under the curve (AUC) and the Hosmer-Lemeshow (HL) goodness-of-fit test were used to assess discrimination and calibration, respectively. Results: Among 1310 men and 1960 women aged 28-85 years, 29.5% and 47.4% experienced CCD during the 6- and 9-year follow-up, respectively. The model showed acceptable discrimination, with an AUC of 0.72 (95% CI: 0.69-0.75) for men and 0.73 (95% CI: 0.71-0.76) for women. The calibration of the model was good for both genders (minimum HL P = 0.5). Considering separate outcomes in men, the AUC was highest for CKD (0.76; 95% CI: 0.72-0.79) and lowest for T2DM (0.65; 95% CI: 0.61-0.69). In women, the AUC was highest for CVD (0.82; 95% CI: 0.78-0.86) and lowest for T2DM (0.69; 95% CI: 0.66-0.73). Performance during the 9-year follow-up was similar to that during the 6-year follow-up. Conclusion: This model showed acceptable discrimination and good calibration for risk prediction of CCD over short- and long-term follow-up in the Iranian population.

second-largest country in the MENA, with an increasing prevalence of non-communicable diseases (NCD), including T2DM, CKD, and hypertension leading to CVD. Moreover, the incidence density rates of T2DM, CKD, and CVD were 10.6, 21.5, and 10.5 per 1000 person-years, respectively, over more than 10 years of follow-up in an Iranian population (3,4). Age-standardized mortality from NCD among people aged 30-70 years was 346.1 per 100,000 population in 2016 (5). Programs for screening and primary prevention have been reported in Iran but so far have scarcely been implemented (6,7). Incident T2DM, CKD, and CVD share many risk factors, including age, sex, obesity, smoking, high blood pressure, and a sedentary lifestyle. Diabetes is itself a risk factor for both CKD and CVD (3,4,8). To date, various models have been proposed for the prediction of T2DM (9,10), CKD (11) and CVD (10,12-15) separately. Most of the previous models consisted of non-laboratory measures. In a recent article, Sattar et al. suggest that the best approach to screening for cardiometabolic disease is to start with non-laboratory measures in the primary phase and to employ laboratory measures only for individuals at high risk (10). In 2012, Alssema et al. (16) proposed a model comprising non-laboratory measures for the 7-year risk prediction of combined endpoints (i.e. T2DM, CKD, and CVD) in the Dutch population. It showed good discrimination between low- and high-risk populations for the combined outcomes. This prediction tool is now implemented in the Dutch guidelines for general practitioners (17). The model was validated in Australia in 2018, where it showed good discrimination but poor calibration (18). Because the development and validation of this model have been performed predominantly in Europoid populations, it might not be transferable to other ethnicities.
This prediction model consists of non-laboratory measures, which makes it a cost-effective tool for screening and primary prevention of CCD, especially in MENA countries with limited healthcare facilities. Therefore, in the current study, we aimed to externally validate the risk prediction tool for CCD in the Iranian population. We also examined the validity of this model for predicting CCD during an extended follow-up period of 9 years.

Methods
-Study Population
The Tehran Lipid and Glucose Study (TLGS) is a community-based prospective cohort study conducted on an urban Iranian population in Tehran. The study aims to determine the prevalence and incidence of non-communicable diseases and related risk factors among individuals aged ≥ 3 years, and to promote a healthy lifestyle and programs for the prevention of NCD. Enrollment took place in two phases, the first in 1999-2001 (n = 15005) and the second in 2001-2005 (n = 3550), and the study is designed to continue for at least 20 years with triennial examinations. The design and methodology of the TLGS have been reported elsewhere (19). Because detailed data on cardiovascular status at recruitment were available from phase II, the current study was based on 7490 individuals aged 28-85 years who participated in the second phase of the TLGS (phase I = 5716 and phase II = 1774). From this number, we excluded those with prevalent CVD (i.e. participants with a history of myocardial infarction, angioplasty, coronary artery bypass graft (CABG) or stroke; n = 546), prevalent T2DM defined as self-reported use of glucose-lowering medication (n = 856), and prevalent end-stage renal disease (ESRD) defined as an estimated glomerular filtration rate (eGFR) < 15 mL/min/1.73 m² (n = 1). After further excluding those with missing baseline data on creatinine (Cr), fasting plasma glucose (FPG), 2-hour post-challenge plasma glucose (2 h-PCG), body mass index (BMI), waist circumference (WC) or smoking status (n = 1864, accounting for overlap between missing values), as well as participants with missing follow-up data on Cr (n = 32), FPG or 2 h-PCG (n = 718), or CVD status (n = 203), 3270 individuals were eligible for the current study, with a 6-year follow-up until March 2011. In line with the risk assessment tool, no participant died from non-cardiovascular causes during the follow-up period.
To investigate the long-term performance of the risk assessment tool for the prediction of CCD, we started from a total of 4223 individuals and excluded prevalent cases of CVD, T2DM and ESRD and those with missing data on covariates, using the approach described above. A total of 3240 individuals remained for the analysis of the 9-year follow-up until March 2018 (Supplementary Fig. 1). This study was approved by the Institutional Review Board (IRB) of the Research Institute for Endocrine Sciences (RIES), Shahid Beheshti University of Medical Sciences, Tehran, Iran, and all participants provided written informed consent.
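The participant flow for the 6-year analysis can be checked with simple arithmetic. The sketch below uses only the counts reported in the paragraph above and assumes that each exclusion count is applied sequentially to those remaining after the previous step (that is, the reported numbers do not overlap); the variable and function names are illustrative.

```python
# Counts as reported in the text; names are illustrative only.
phase_counts = {"phase I": 5716, "phase II": 1774}
exclusions = {
    "prevalent CVD": 546,
    "prevalent T2DM": 856,
    "prevalent ESRD": 1,
    "missing baseline covariates": 1864,
    "missing follow-up creatinine": 32,
    "missing follow-up FPG or 2 h-PCG": 718,
    "missing follow-up CVD status": 203,
}


def participant_flow(start, steps):
    """Return (reason, excluded, remaining) after each sequential exclusion."""
    flow = []
    remaining = start
    for reason, excluded in steps.items():
        remaining -= excluded
        flow.append((reason, excluded, remaining))
    return flow


baseline_total = sum(phase_counts.values())
flow = participant_flow(baseline_total, exclusions)
eligible = flow[-1][2]

print(f"start: {baseline_total}")
for reason, excluded, remaining in flow:
    print(f"  minus {reason} ({excluded}): {remaining}")
```

Under this sequential reading, the counts reproduce the 7490 participants at the start and the 3270 eligible individuals stated in the text.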

-Clinical And Laboratory Measurements
Information on demographic data, family history of premature CVD and T2DM, current smoking status and medication history was obtained by a trained interviewer using a standard questionnaire. Details of the anthropometric measurements, including height, weight and WC, have been reported elsewhere. A blood sample was taken from all study participants between 7:00 and 9:00 AM after a 12- to 14-hour overnight fast. Further details of the laboratory measurements, including FPG, 2 h-PCG and serum creatinine, have been described previously (19).
-Definition Of Variables
BMI was calculated as weight (kg) divided by height squared (m²). A positive family history of premature CVD was defined as previously diagnosed CVD in a first-degree male relative aged < 55 years or a first-degree female relative aged < 65 years. A current smoker was defined as a participant who smokes cigarettes daily or occasionally.

C. Cardiovascular Disease
As described in a previous article on CVD outcomes in the TLGS cohort (22), each participant is followed up annually by telephone for any medical event leading to hospitalization during the previous year. Participants were asked about any medical conditions by a trained nurse; a trained physician later collected complementary data on the event during a home visit and from medical files. The collected data were then evaluated by an outcome assessment committee consisting of an internist, an endocrinologist, a cardiologist, an epidemiologist and, if required, other experts, to assign a specific outcome to every event. In the current study, CHD events included cases of definite and probable MI, unstable angina, angiographically proven CHD and CHD death. Stroke was defined as definite or possible stroke or transient ischemic attack. Finally, CVD was defined as a composite of any CHD event, stroke or cerebrovascular death.

D. Chronic Cardio-metabolic Disease
CCD was defined as the diagnosis of either T2DM, CKD or CVD during the follow-up period.
-Risk Tool For Chronic Cardio-metabolic Disease
The risk tool for the CCD outcome was developed on 6780 Dutch men and women (aged 28-85 years) from three population-based cohorts: the Rotterdam study (n = 4018), the Hoorn study (n = 627) and the Prevention of Renal and Vascular End-stage Disease study (PREVEND; n = 2135) (16). The sex-stratified models, which include age, BMI, WC, use of antihypertensive medication, current smoking, a parent and/or sibling with MI or stroke (age < 65 years), and a parent and/or sibling with diabetes, were developed using logistic regression (Supplementary Table 1). The 7-year risk of CCD was calculated for each TLGS man and woman according to the original risk assessment tool of Alssema et al. (16).
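To illustrate how a logistic-regression risk tool of this kind turns predictors into a predicted probability, a minimal sketch follows. The coefficients are hypothetical placeholders chosen only for illustration; the actual sex-specific coefficients are those in Supplementary Table 1 and are not reproduced here. Treating age, BMI and WC as continuous terms is also an assumption, since the original tool may code them differently.

```python
import math

# HYPOTHETICAL coefficients for illustration only. They are NOT the published
# values; the real sex-specific coefficients are in Supplementary Table 1.
EXAMPLE_COEFFICIENTS = {
    "intercept": -6.0,
    "age_years": 0.05,
    "bmi": 0.04,
    "waist_cm": 0.02,
    "antihypertensive_use": 0.5,
    "current_smoker": 0.4,
    "family_history_cvd": 0.3,
    "family_history_diabetes": 0.3,
}


def predicted_risk(person, coefficients):
    """Logistic model: risk = 1 / (1 + exp(-(intercept + sum(beta * x))))."""
    linear_predictor = coefficients["intercept"]
    for name, beta in coefficients.items():
        if name != "intercept":
            linear_predictor += beta * person[name]
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Illustrative participant; binary predictors are coded 0 (no) or 1 (yes).
example_person = {
    "age_years": 50,
    "bmi": 27.0,
    "waist_cm": 95.0,
    "antihypertensive_use": 0,
    "current_smoker": 1,
    "family_history_cvd": 0,
    "family_history_diabetes": 0,
}

risk = predicted_risk(example_person, EXAMPLE_COEFFICIENTS)
print(f"predicted risk with placeholder coefficients: {risk:.3f}")
```

In a sex-stratified tool, one such coefficient set would be held for men and another for women, and the appropriate set would be passed to the function for each participant.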
-Statistical analysis
Baseline characteristics of respondents (those with complete data) and non-respondents (those with missing covariate data or lost to follow-up) were expressed as mean (standard deviation) for continuous variables and number (%) for categorical variables. For covariates with a skewed distribution, the median (interquartile range: IQR) was reported. Baseline characteristics were compared between men and women using Student's t-test for normally distributed continuous variables, the Mann-Whitney U test for skewed variables, and the chi-squared test for categorical variables. To evaluate the external validity of the risk equation, the area under the curve (AUC) and the Hosmer-Lemeshow chi-square statistic were used to assess the discrimination and calibration of the prediction model, respectively.
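The two validation measures can be sketched in a few lines. The code below is a plain-Python illustration, not the Stata routines used in the study: the AUC is computed as the proportion of case/non-case pairs in which the case has the higher predicted risk (ties count one half), and the Hosmer-Lemeshow statistic is computed over equal-sized groups of ranked predicted risk. The function names and the default of ten groups are assumptions; a p-value would additionally require a chi-square distribution function, which is omitted here.

```python
def auc(y_true, y_score):
    """Probability that a random case scores higher than a random non-case."""
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    non_cases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                wins += 1.0
            elif case_score == other_score:
                wins += 0.5  # ties contribute one half
    return wins / (len(cases) * len(non_cases))


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """HL chi-square: sum over groups of (O - E)^2 / (n * p_bar * (1 - p_bar))."""
    pairs = sorted(zip(y_prob, y_true))  # rank participants by predicted risk
    n = len(pairs)
    chi_square = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        mean_prob = expected / size
        variance = size * mean_prob * (1.0 - mean_prob)
        if variance > 0:
            chi_square += (observed - expected) ** 2 / variance
    return chi_square


# Toy data, for illustration only.
toy_outcomes = [0, 0, 1, 1]
toy_risks = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_outcomes, toy_risks))
print(hosmer_lemeshow(toy_outcomes, toy_risks, groups=2))
```

The pairwise AUC loop is quadratic in sample size, which is adequate for illustration; a rank-based formula gives the same value more efficiently on large cohorts.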
According to the criteria of Hosmer et al. (23), AUCs of 0.50-0.70, 0.70-0.80, 0.80-0.90 and ≥ 0.90 indicate poor, acceptable, excellent and outstanding discrimination, respectively. To show the calibration in detail, the observed risk was plotted against the mean of the predicted probabilities using the calibration belt Stata module (24). In addition, the observed-to-expected (O/E) ratio for the CCD outcome was calculated; a ratio < 1 indicated overestimation and a ratio > 1 indicated underestimation of the risk.
We also recalibrated the risk assessment tool to the characteristics of the TLGS cohort by adjusting the intercept of the model: the predictors and regression coefficients of the original model were kept fixed, while the intercept was estimated as a free parameter (25). Using the same statistical approach, we repeated the analysis for participants with a 9-year follow-up. To compare the discrimination of the risk assessment tool with another available non-invasive prediction model for the CVD outcome, we used the Gaziano et al. (13) risk score. Statistical analysis was performed using STATA version 14 (StataCorp LP, College Station, Texas). A p-value < 0.05 was considered statistically significant.
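The discrimination categories and the O/E ratio described above can be expressed as two small helper functions. This is a sketch with illustrative names; treating each lower bound as inclusive (so that an AUC of exactly 0.70 counts as acceptable) is an assumption, because the stated ranges share their boundaries, and the label for values below 0.50 is not part of the cited criteria.

```python
def discrimination_label(auc_value):
    """Map an AUC to the categories attributed to Hosmer et al. in the text."""
    if not 0.0 <= auc_value <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc_value >= 0.90:
        return "outstanding"
    if auc_value >= 0.80:
        return "excellent"
    if auc_value >= 0.70:  # assumption: lower bounds are inclusive
        return "acceptable"
    if auc_value >= 0.50:
        return "poor"
    return "below chance"  # not covered by the cited criteria


def observed_expected_ratio(y_true, y_prob):
    """Observed events divided by the sum of predicted probabilities."""
    expected = sum(y_prob)
    if expected == 0:
        raise ValueError("expected number of events is zero")
    return sum(y_true) / expected


# AUCs reported in the abstract for CCD in men and for CVD in women.
print(discrimination_label(0.72))
print(discrimination_label(0.82))

# Toy data: one observed event against two expected gives a ratio below 1,
# which the text interprets as overestimation of risk.
print(observed_expected_ratio([1, 0, 0, 0], [0.5, 0.5, 0.5, 0.5]))
```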
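Intercept-only recalibration can be sketched as a one-parameter logistic fit in which the original linear predictor enters as a fixed offset. The code below is a minimal illustration using Newton's method, not the Stata procedure used in the study; the function names and toy data are assumptions. It returns the shift to be added to the original intercept.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate_intercept(y_true, linear_predictors, tol=1e-10, max_iter=100):
    """Estimate the intercept shift with all original coefficients held fixed.

    Fits logit(p) = shift + lp by maximum likelihood, where lp is the original
    model's linear predictor (offset with slope fixed at 1).
    """
    events = sum(y_true)
    if not 0 < events < len(y_true):
        raise ValueError("need both events and non-events to estimate a shift")
    shift = 0.0
    for _ in range(max_iter):
        probs = [sigmoid(shift + lp) for lp in linear_predictors]
        gradient = events - sum(probs)              # score for the intercept
        curvature = sum(p * (1.0 - p) for p in probs)
        step = gradient / curvature                 # Newton update
        shift += step
        if abs(step) < tol:
            break
    return shift


# Toy data, for illustration only: the model predicts 2.5 events but 3 occur,
# so the fitted shift is positive.
toy_lp = [-2.0, -1.0, 0.0, 1.0, 2.0]
toy_outcomes = [0, 0, 1, 1, 1]
shift = recalibrate_intercept(toy_outcomes, toy_lp)
recalibrated = [sigmoid(shift + lp) for lp in toy_lp]
print(f"intercept shift: {shift:.3f}")
print(f"expected events after recalibration: {sum(recalibrated):.3f}")
```

After this update the sum of predicted probabilities equals the observed number of events, so the O/E ratio becomes 1 by construction, while the ranking of participants, and therefore the AUC, is unchanged.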

Results
-Baseline characteristics
The study population consisted of 1310 men and 1960 women at baseline, with a mean (SD) age of 47.1 (12.8) and 45.3 (11.3) years, respectively. The baseline characteristics of men and women are shown in Table 1. There were significant differences between men and women: men were older, had a larger WC and were more often current smokers, whereas women had a higher BMI, used antihypertensive medications more frequently and more often had a positive family history of CVD. The comparison of the baseline characteristics of respondents vs. non-respondents is shown in Supplementary Table 2.
and CVD (for both genders), the hypothesis of good calibration was rejected. Moreover, recalibration by adjusting the intercept to the TLGS study did not improve the model's goodness-of-fit; HL tests were significant for T2DM in women and for CVD in men. The AUC also showed similar discrimination compared with the original model (Table 2). The O/E ratio for the combined cardiometabolic disease outcome was almost 1 for both men and women.

-Additional Analysis
The secondary analysis, with a median (IQR) follow-up of 9.2 (8.7-10.2) years, demonstrated almost similar discrimination and calibration for both genders compared with the 6-year follow-up (Table 2).
(18). This difference might be explained by the difference in discrimination for the specific NCD groups, despite the higher incidence of CCD (40.2%) compared to the development data (36.0%) and the AusDiab data (13.3%) (Fig. 2). Moreover, in the current study, we found a high prevalence of newly diagnosed T2DM and CKD (i.e. those with eGFR between 15 and 60 mL/min/1.73 m²) among the Iranian population at baseline compared to the development data (4.6% and 7.2%, respectively) (16) and the AusDiab data (3.7% and 11.2%, respectively) (27,28). An efficient risk prediction model requires a series of assumptions to eliminate the potential presence of reverse causality (29). We believe that the high prevalence of newly diagnosed T2DM among the TLGS population at baseline caused reverse causality that might have affected obesity indices, leading to lower performance of the model in the prediction of T2DM.
The incidence of CVD was lower in the TLGS population compared to the development and AusDiab data (Fig. 2). There are several previously developed models comprising non-laboratory measures for the prediction of CVD only (13,30). One of these is a model introduced by Gaziano
model showed good calibration in the 9-year follow-up, better discrimination in women and the same discriminative performance in men, despite DM not being included as a major risk factor in the CCD model (30). Although CVD contributed less to the composite outcome, the model showed the best discrimination for CVD in women and the second-best discrimination for CVD in men for both follow-up periods. Other models for the prediction of CVD showed the same gender difference as the current model. The Framingham CVD risk score is one of the models that has also been validated in Iran. The results were in line with ours and showed higher discrimination in women than in men (15). The model showed good calibration for CVD in both genders during the 9-year follow-up. This could be explained by the time-dependent course of CVD progression, leading to a higher rate of CVD incidence in the long-term follow-up.
The incidence of CKD was higher in the TLGS population compared to the development and AusDiab data (Fig. 2). There are several previously developed models comprising non-laboratory measures for the prediction of CKD (31). CKD contributed most to the composite CCD outcome. This could be due to the presence of multiple major risk factors for CKD in the current model, including age, hypertension, and smoking. The inclusion of laboratory measures could increase the predictive power of the model, as has been shown in a meta-analysis (C-statistic = 0.845) (11).
Considering that only non-laboratory measures were included in the model and eGFR was absent as a major predictor of CKD, an AUC of about 0.76 in men and 0.71 in women is acceptable. The model showed the best CKD discrimination for men and the second-best discriminative performance for CKD in women. Calibration was good for the prediction of CKD.
Focusing on T2DM, its incidence was almost similar to that in the development data but higher than in the AusDiab population (Fig. 2). The model showed the worst discriminative performance for T2DM in both men and women. Calibration was good in men but poor in women. Several explanations could be proposed for the poor performance of the model in the prediction of T2DM. Firstly, as mentioned earlier, the percentage of newly diagnosed T2DM in the TLGS was higher than in the development and AusDiab data (16,28); this affects the discriminative power of the CCD model for incident T2DM. Secondly, we previously found that, during 6 years of follow-up, general adiposity was not an independent risk factor for incident T2DM; however, including age, SBP, family history of T2DM as well as waist-to-hip ratio (WHR) and waist-to-height ratio (WHtR) in a non-laboratory model resulted
This study had several limitations. Firstly, history of intermittent claudication and peripheral intervention was not assessed in the TLGS, so CVD incidence might have been underestimated. However, despite differences in CVD definitions from the original study, the discriminative power of the CCD tool for CVD assessment was acceptable. Secondly, compared with non-respondents, male participants were more obese and reported a higher rate of smoking, while female participants reported a lower frequency of smoking and of antihypertensive medication use, which may have led to over- and underestimation of CCD among men and women, respectively (Supplementary Table 2). Thirdly, this study was conducted on the population of Tehran and might not be generalizable to the entire population of Iran, especially rural areas.
The current risk prediction tool is freely available on websites in the Netherlands and is also incorporated into the Dutch guidelines for general practitioners, 'The Prevention Visit' (17). Recent studies have discussed the cost-effectiveness of cardio-metabolic risk assessment (35,36). This model could help differentiate between high-risk individuals in need of further risk assessment and those at low risk in the MENA region.

Conclusion
In conclusion, the previously developed non-invasive 7-year risk prediction tool for CCD performed well with regard to discrimination and calibration in a non-Europoid population over 6- and 9-year follow-up. The model performed best for the prediction of CVD and CKD in both genders, but further evaluation is needed for better prediction of T2DM. Results from this study suggest that this model has acceptable performance in other ethnic groups and over a longer follow-up period. The World Health Organization (WHO) has implemented a prevention program to reduce deaths from NCD by 25% in the Eastern Mediterranean region by 2025 (37). This non-laboratory, cost-effective tool is especially beneficial for screening three important NCDs in low- to middle-income regions with limited access to health care facilities.

List Of Abbreviations
Chronic cardio-metabolic disease (CCD)

Consent for publication: Not applicable
Availability of data and materials: The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Funding Sources: None
Author contributions: FH, FA conceived and planned the study. SA and DK conducted the analyses.
FM and SA developed the first draft of the manuscript. FH, FM and SA critically revised the manuscript. All authors contributed to the writing of the paper, and have read and approved the final manuscript.
*No deaths were recorded during follow-up from non-cardiovascular causes.
Figure 1 Calibration belt plot of the risk prediction tool for T2DM, CKD, CVD, and CCD outcomes among men and women separately. The solid line indicates the bisector line (perfect calibration). The light-gray area defines an 80% confidence level. The dark-gray area defines a 95% confidence level. A likelihood-ratio test was used for evaluating the hypothesis of good calibration (p > 0.05).