Improving accuracy of estimating glomerular filtration rate using artificial neural network: model development and validation

Background The performance of previously published glomerular filtration rate (GFR) estimation equations degrades when they are applied directly to the Chinese population. We incorporated more independent variables and used a complex non-linear modeling technique (artificial neural network, ANN) to develop a more accurate GFR estimation model for the Chinese population.
Methods Participants were enrolled at the Third Affiliated Hospital of Sun Yat-sen University, China, from Jan 2012 to Jun 2016. Participants who were younger than 18 years, had unstable kidney function, were taking trimethoprim or cimetidine, or were receiving dialysis were excluded. Of the 1952 participants finally enrolled, 1075 (55.07%) seen from Jan 2012 to Dec 2014 were assigned to the development data, and 877 (44.93%) seen from Jan 2015 to Jun 2016 were assigned to the internal validation data. We developed 3 GFR estimation models in total: a 4-variable revised CKD-EPI (chronic kidney disease epidemiology collaboration) equation (standardized serum creatinine and cystatin C, age and gender), a 9-variable revised CKD-EPI equation (additional auxiliary variables: body mass index, blood urea nitrogen, albumin, uric acid and hemoglobin), and a 9-variable ANN model.
Results Compared with the 4-variable equation, the 9-variable equation did not achieve superior performance in the internal validation data (mean of difference: 5.00 [3.82, 6.54] vs 4.67 [3.55, 5.90], P = 0.5; interquartile range (IQR) of difference: 18.91 [17.43, 20.48] vs 20.11 [18.46, 21.80], P = 0.05; P30: 76.6% [73.7%, 79.5%] vs 75.8% [72.9%, 78.6%], P = 0.4), but the 9-variable ANN model significantly improved bias and P30 accuracy (mean of difference: 2.77 [1.82, 4.10], P = 0.007; IQR: 19.33 [17.77, 21.17], P = 0.3; P30: 80.0% [77.4%, 82.7%], P < 0.001).
Conclusions Our results suggest that complex non-linear models such as ANN can make fuller use of the predictive ability of the independent variables and thereby yield a superior GFR estimation model.


Background
Glomerular filtration rate (GFR) is well recognized as the best overall indicator of kidney function and is widely used in the diagnosis, treatment and prognosis of chronic kidney disease (CKD) [1]. GFR can be measured by renal or serum clearance of exogenous filtration markers such as inulin and iohexol, but obtaining this measured GFR (mGFR) is cumbersome and costly in routine clinical practice. Therefore, investigators have developed widely used GFR estimation equations that combine established filtration markers (e.g., serum creatinine and cystatin C) with demographic variables (e.g., age, gender and race) [2][3][4][5][6][7]. The global organization Kidney Disease: Improving Global Outcomes (KDIGO) recommends using estimated GFR (eGFR) as the initial test in clinical practice and epidemiological surveys [8]. By 2017, many countries were reporting eGFR alongside serum creatinine measurements [9].
The most widely accepted eGFR equations are the Modification of Diet in Renal Disease (MDRD) equation [5] and the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equations [7], which provide acceptable GFR estimates for the North American population. However, these equations may not perform well in the Chinese population, because they were not developed from Chinese data [10]. Studies have therefore been conducted to develop accurate equations for Chinese or Asian populations [11]. However, most of these studies either established an ethnic factor [10] or developed a new equation using only traditional regression methods.
In the development of GFR estimation equations, the standard procedure is to apply a natural logarithm transformation to mGFR and the filtration markers and then fit an ordinary least squares linear or piecewise linear regression. This simple linearity might not capture the complicated relationship among kidney function, GFR and filtration markers [1,12]. Moreover, the potential predictive power of auxiliary variables (demographic variables and other laboratory test variables) has not been sufficiently utilized, as no interaction terms were incorporated into the equations. Studies have shown that complex non-linear modeling techniques may improve the performance of GFR estimation [13][14][15][16]. Therefore, we used an artificial neural network (ANN), a powerful and common machine learning method, to develop a more accurate eGFR model for the Chinese population; we then validated this model and compared its performance with that of standard regression equations.

Study design and study participants
Patients diagnosed with CKD at the Third Affiliated Hospital of Sun Yat-sen University between January 2012 and June 2016 were recruited consecutively into this study. Participants were excluded for any of the following reasons: (1) age < 18 years; (2) acute deterioration of kidney function, skeletal muscle atrophy, edema, pleural effusion or ascites, heart failure, malnutrition, amputation, or ketoacidosis; (3) taking trimethoprim or cimetidine; or (4) receiving dialysis at the time of the study.
The institutional review board at the Third Affiliated Hospital of Sun Yat-sen University approved this study. Written informed consent was obtained from all participants.

Laboratory measurements
GFR was measured by 99mTc-DTPA renal dynamic imaging, which had been recalibrated to a dual plasma sample 99mTc-DTPA GFR. Renal dynamic imaging was obtained with a Millennium TMMPR SPECT using the General Electric Medical System (Discovery VH, GE Healthcare). Serum samples from each participant were collected on the same day as the GFR measurement and assayed on a Hitachi 7180 auto-analyzer (Hitachi reagents from Roche Diagnostics) in a single laboratory at the Department of Laboratory in the Third Affiliated Hospital of Sun Yat-sen University. Creatinine was measured by an enzymatic method and then recalibrated to the isotope dilution mass spectrometry reference method [17,18]. We also recalibrated serum cystatin C to the standard reference material (ERM-DA471) [19]. The laboratory test variables were extracted from the analysis report and recorded manually.

Development of revised CKD-EPI equations and ANN model
The revised equations were derived using the same method that Inker and colleagues used to develop the CKD-EPI equation [7]. We first developed an equation for GFR estimation using the 4 conventional variables: age, sex, serum creatinine (Scr) and serum cystatin C (Scys). We then developed a 9-variable equation by incorporating 5 more auxiliary variables: body mass index (BMI), blood urea nitrogen (BUN), albumin (ALB), uric acid (UA) and hemoglobin (HGB). For both equations the dependent variable was mGFR. mGFR and the independent variables Scr, Scys and BUN were log-transformed, so that the relationship between mGFR and the independent variables became nearly linear. The 4-variable and 9-variable equations were fitted as piecewise linear splines with one knot each for Scr and Scys, using the splines package in R software (version 3.5.0, R Development Core Team). The method for determining the spline knots of Scr and Scys is described in Additional file 1.
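To illustrate the piecewise linear spline regression described above, the following is a minimal sketch in Python rather than the R splines package used in the study. It assumes a CKD-EPI-style design in which each marker contributes two log-scale terms, one active below the knot and one above it, and in which age enters linearly on the log scale. The function names, the coefficients in `TRUE_BETA` and the simulated data are hypothetical and serve only to show that ordinary least squares recovers the slopes; the knot values are those reported in the Results of this paper.

```python
import math
import random

SCYS_KNOT = 0.9                      # mg/L, same knot for both sexes
SCR_KNOT = {True: 0.7, False: 0.9}   # mg/dL, keyed by "is female"

def hinge_terms(value, knot):
    """Log-scale terms of a one-knot piecewise linear spline.

    The first term is non-zero only below the knot, the second only above.
    """
    ratio = value / knot
    return math.log(min(ratio, 1.0)), math.log(max(ratio, 1.0))

def design_row(scr, scys, age, female):
    """One row of the 4-variable design matrix (intercept first)."""
    scr_lo, scr_hi = hinge_terms(scr, SCR_KNOT[female])
    cys_lo, cys_hi = hinge_terms(scys, SCYS_KNOT)
    return [1.0, scr_lo, scr_hi, cys_lo, cys_hi, age, float(female)]

def least_squares(rows, targets):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    n = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

# Hypothetical coefficients, used only to simulate noise-free log(mGFR).
TRUE_BETA = [math.log(135.0), -0.25, -0.60, -0.35, -0.70, -0.005, -0.03]

rng = random.Random(0)
rows, targets = [], []
for _ in range(300):
    female = rng.random() < 0.4
    row = design_row(rng.uniform(0.4, 3.0), rng.uniform(0.5, 3.0),
                     rng.uniform(18, 90), female)
    rows.append(row)
    targets.append(sum(b * x for b, x in zip(TRUE_BETA, row)))

fitted = least_squares(rows, targets)
print([round(b, 4) for b in fitted])
```

Because both hinge terms are zero exactly at the knot, the fitted curve is continuous there while the slope is allowed to change, which is the property the piecewise specification is meant to provide.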
We also developed an ANN model with the same 9 independent variables for GFR estimation. Prior to ANN development, we performed data cleaning and pre-processing on the development data, including outlier removal and variable normalization. We used only 1 hidden layer with 4 neurons, and the activation function of every hidden neuron was Leaky ReLU (alpha = 0.1) [20]. The ANN was trained with a stochastic gradient descent (SGD) optimizer, and the whole development of the ANN was implemented under the Keras framework in

Model evaluation and statistical analysis
The performance indicators of GFR estimation include bias, precision and accuracy. Bias and precision were defined as the median and the interquartile range (IQR), respectively, of the difference eGFR minus mGFR. Accuracy was assessed as P30 (the percentage of eGFR values within ± 30% of mGFR). In addition to evaluating the models on the overall cohort, we performed identical evaluation procedures on subgroups divided by mGFR. Data from patients seen from Jan 2015 to Jun 2016 were used for internal validation of the derived models. We also performed a sensitivity analysis by developing and internally validating the 3 GFR estimation models on randomly split datasets. Complete-case analysis was used to handle missing data. The 95% confidence intervals were calculated with bootstrap methods (2000 bootstraps) [21][22][23]. The Wilcoxon signed rank test was used to compare bias between models, a permutation test was used to compare precision, and the McNemar test was used to compare P30. All statistical analyses were performed using MATLAB software (version 2018b, MathWorks).
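The three performance indicators and the bootstrap confidence interval can be sketched as follows. This is an illustrative Python version, not the MATLAB code used in the study. Two points are assumptions: the quartiles use the default rule of `statistics.quantiles`, which may differ slightly from other software, and the interval is a simple percentile bootstrap, since the text does not state which bootstrap variant was applied. The six eGFR/mGFR pairs are made-up values.

```python
import random
import statistics

def bias(egfr, mgfr):
    """Median of the difference eGFR minus mGFR."""
    return statistics.median(e - m for e, m in zip(egfr, mgfr))

def precision_iqr(egfr, mgfr):
    """Interquartile range of the difference eGFR minus mGFR."""
    q1, _, q3 = statistics.quantiles([e - m for e, m in zip(egfr, mgfr)], n=4)
    return q3 - q1

def p30(egfr, mgfr):
    """Percentage of estimates within +/- 30% of the measured value."""
    hits = sum(abs(e - m) <= 0.30 * m for e, m in zip(egfr, mgfr))
    return 100.0 * hits / len(egfr)

def bootstrap_ci(stat, egfr, mgfr, n_boot=2000, seed=0):
    """Percentile 95% CI: resample participants with replacement."""
    rng = random.Random(seed)
    n = len(egfr)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(stat([egfr[i] for i in idx], [mgfr[i] for i in idx]))
    values.sort()
    return values[int(0.025 * n_boot)], values[int(0.975 * n_boot) - 1]

# Made-up example values in mL/min/1.73 m^2.
egfr = [62, 95, 41, 110, 28, 75]
mgfr = [60, 90, 50, 80, 30, 70]

print("bias:", bias(egfr, mgfr))                 # differences: 2, 5, -9, 30, -2, 5
print("IQR:", precision_iqr(egfr, mgfr))
print("P30:", round(p30(egfr, mgfr), 1))         # 5 of 6 within 30%
print("95% CI of bias:", bootstrap_ci(bias, egfr, mgfr))
```

Resampling whole participants, rather than eGFR and mGFR separately, keeps each estimate paired with its own measurement, which the paired tests named above also rely on.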

Characteristics of participants
Of the 2997 CKD patients initially enrolled between 2012 and 2016, 970 with incomplete data and 75 with irregular recordings were excluded (details are available in Fig. 1). The remaining 1952 participants were included in model development or validation: 1075 (55.1%) participants from Jan 2012 to Dec 2014 were assigned to the development dataset used to derive the revised equations and the ANN, and 877 (44.9%) from Jan 2015 to Jun 2016 were assigned to the internal validation dataset used to independently evaluate the performance of the derived models. Table 1 summarizes the characteristics of the development and internal validation datasets. In the development dataset, 57.3% were male; mean age was 55.6 years (standard deviation [SD], 14.5); mean mGFR was 71.0 (SD, 27.4) mL/min/1.73 m², serum creatinine 1.5 (SD, 1.3) mg/dL, and serum cystatin C 1.5 (SD, 0.9) mg/L. In the internal validation dataset, 59.0% were male; mean age was 57.4 (SD, 13.4) years; mean mGFR was 68.8 (SD, 27.1) mL/min/1.73 m², serum creatinine 1.3 (SD, 0.9) mg/dL, and serum cystatin C 1.3 (SD, 0.7) mg/L. Few participants had mGFR less than 30 mL/min/1.73 m² in either the development or the internal validation dataset (6.5% and 6.3%, respectively).
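The participant flow and the temporal split can be checked with a few lines of Python. This is only a sketch: the record layout and the field name `exam_date` are hypothetical, and the cut-off of 1 January 2015 is inferred from the date ranges stated above.

```python
from datetime import date

CUTOFF = date(2015, 1, 1)  # validation period starts in Jan 2015

def temporal_split(records, cutoff=CUTOFF):
    """Split records into (development, validation) by examination date."""
    development = [r for r in records if r["exam_date"] < cutoff]
    validation = [r for r in records if r["exam_date"] >= cutoff]
    return development, validation

# Counts reported in the text.
enrolled, incomplete, irregular = 2997, 970, 75
included = enrolled - incomplete - irregular
n_dev, n_val = 1075, 877

print(included)                            # 1952
print(round(100 * n_dev / included, 1))    # 55.1
print(round(100 * n_val / included, 1))    # 44.9

# Hypothetical records, only to show the split in use.
demo = [{"id": 1, "exam_date": date(2013, 5, 2)},
        {"id": 2, "exam_date": date(2015, 1, 1)},
        {"id": 3, "exam_date": date(2016, 6, 30)}]
dev, val = temporal_split(demo)
print(len(dev), len(val))                  # 1 2
```

Splitting by time rather than at random means the validation participants were all seen after the development period, which is a somewhat stricter test than a random split of the same cohort.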

Formulation of revised CKD-EPI equations and ANN model
The knots of serum creatinine for female and male participants were 0.7 and 0.9 mg/dL, respectively, whereas the knot of serum cystatin C was 0.9 mg/L for both sexes. The formulations of the revised 4-variable and 9-variable CKD-EPI equations are shown in Table 2. Additional file 3 shows how to implement the 9-variable ANN model.
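For readers unfamiliar with the architecture, the forward pass of a network of the size described in the Methods (9 inputs, 1 hidden layer of 4 Leaky ReLU neurons with alpha = 0.1) can be sketched in plain Python. This is not the implementation in Additional file 3. The weights below are random placeholders rather than the trained parameters, the single linear output neuron is an assumption (the output activation is not stated in the text), and the inputs are assumed to be already normalized.

```python
import random

ALPHA = 0.1          # Leaky ReLU slope for negative inputs (from the text)
N_INPUTS = 9         # Scr, Scys, age, sex, BMI, BUN, ALB, UA, HGB
N_HIDDEN = 4

def leaky_relu(x, alpha=ALPHA):
    """Identity for positive inputs, small linear slope for negative ones."""
    return x if x > 0 else alpha * x

def forward(features, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer followed by a linear output (assumed)."""
    hidden = [leaky_relu(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def n_parameters(n_inputs=N_INPUTS, n_hidden=N_HIDDEN):
    """Weights and biases of the hidden layer plus those of the output."""
    return n_inputs * n_hidden + n_hidden + n_hidden + 1

# Placeholder parameters; a trained model would load fitted values instead.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = 0.0

normalized_inputs = [0.0] * N_INPUTS   # an "average" participant after scaling
print(forward(normalized_inputs, w_hidden, b_hidden, w_out, b_out))
print(n_parameters())                  # 45
```

With only 45 parameters the network is small relative to the 1075 development participants, which limits its capacity to overfit while still allowing interactions between the inputs through the hidden layer.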

Performance of models in the internal validation dataset
The performance of the three derived models is summarized in Table 3.
Model performance in the subgroups defined by mGFR was similar to the overall performance. In both the mGFR < 60 mL/min/1.73 m² and the mGFR ≥ 60 mL/min/1.73 m² subgroups, the 9-variable ANN model consistently achieved a higher P30 than the two revised equations. However, in the subgroup with mGFR ≥ 90 mL/min/1.73 m², the ANN model tended to be more biased (median of difference − 2.91 [− 4.60 to − 1.32] mL/min/1.73 m²) (Table 3).
The sensitivity analysis based on randomly split datasets showed that the 9-variable ANN model had significantly better P30 and precision, and similar bias, compared with the 4-variable CKD-EPI equation (see Additional file 4).

Discussion
Accurate evaluation of GFR is important for assessing the severity of CKD, predicting prognosis and deciding on appropriate therapeutic interventions. Since the publication of the Cockcroft-Gault (CG) equation in 1976 [2], many studies have been conducted to derive practical models for estimating GFR. The major barrier to accurately estimating an individual's GFR is the non-GFR determinants of filtration markers [1,12,24,25], which degrade the ideal linear correlation between GFR and filtration markers. For reasons of cost and convenience, these unmeasured non-GFR determinants cannot be incorporated into GFR estimation models; instead, auxiliary variables (demographic variables and other laboratory test variables) are used as surrogates. The most frequently used demographic variables are age, gender and race, whereas the most frequently used laboratory test variables are blood urea nitrogen and albumin. However, in linear equations these other laboratory test variables seem to have limited predictive ability for GFR compared with filtration markers and demographic variables. The 6-variable MDRD equation includes two more variables (serum urea nitrogen and albumin) than the simplified 4-variable MDRD equation, but the performance of the two equations is nearly the same [3][4][5]. In the development of the CKD-EPI equation in 2012, no other laboratory test variables or interaction terms were incorporated into the final equation, because their predictive ability was not statistically significant during variable selection [7].
In our study, we developed two revised CKD-EPI equations. The first incorporated 4 variables (standardized serum creatinine and cystatin C, age and gender), which is the standard variable combination for developing GFR estimation models. We then incorporated more auxiliary variables, because in theory additional independent variables are beneficial when developing prediction models. Besides blood urea nitrogen and albumin, we also incorporated body mass index, uric acid and hemoglobin, and developed a 9-variable revised CKD-EPI equation. However, the two revised equations turned out to have similar GFR estimation performance. A plausible explanation is that simple linear regression cannot sufficiently utilize the potential predictive power of these auxiliary variables. When we used the same 9 variables to develop an ANN model, the 9-variable ANN model significantly reduced bias and improved P30 accuracy compared with the revised 4-variable CKD-EPI equation.
The mathematical foundation of ANN is the universal approximation theorem [26,27], which states that an ANN can approximate any continuous function, and even some discontinuous functions, to arbitrary accuracy. As the network size increases, the capacity of the ANN increases. Furthermore, ANN does not require assumptions about the distribution of variables and can handle multicollinearity among independent variables [28]. Therefore, ANN can capture not only the complicated correlations between GFR and the independent variables but also interactions among the independent variables, and it can base its GFR estimates on these relationships.
Our study suggests that more complex models can make fuller use of the predictive ability of these variables and thereby achieve good GFR estimation performance.
Our study has limitations. First, all study participants were from one medical center in China, and most were CKD patients. The generalizability of the study may be limited to CKD patients, and the performance of the developed ANN still requires additional validation in diverse populations. Second, the reference standard mGFR was measured by 99mTc-DTPA renal dynamic imaging and then recalibrated to a dual plasma sample 99mTc-DTPA GFR. It is widely accepted that iohexol or iothalamate yields a more accurate mGFR than 99mTc-DTPA [33]. Third, the development and internal validation datasets are relatively small; in particular, there were few participants with mGFR ≤ 30 mL/min/1.73 m². Fourth, although the ANN model is superior in accuracy, it is difficult to interpret, and