Predictive models for chronic kidney disease after radical or partial nephrectomy in renal cell cancer using early postoperative serum creatinine levels

Several predictive factors for chronic kidney disease (CKD) following radical nephrectomy (RN) or partial nephrectomy (PN) have been identified. However, early postoperative laboratory values were infrequently considered as potential predictors. Therefore, this study aimed to develop predictive models for CKD 1 year after RN or PN using early postoperative laboratory values, including serum creatinine (SCr) levels, in addition to preoperative and intraoperative factors. Moreover, the optimal SCr sampling time point for the best prediction of CKD was determined. Data were retrospectively collected from patients with renal cell cancer who underwent laparoscopic or robotic RN (n = 557) or PN (n = 999). Preoperative, intraoperative, and postoperative factors, including laboratory values, were incorporated during model development. We developed 8 final models using information collected at different time points (preoperative, postoperative day [POD] 0 to 5, and postoperative 1 month). Lastly, we combined all possible subsets of the developed models to generate 120 meta-models. Furthermore, we built a web application to facilitate the implementation of the model. The magnitude of postoperative elevation of SCr and history of CKD were the most important predictors for CKD at 1 year, followed by RN (compared to PN) and older age. Among the final models, the model using features of POD 4 showed the best performance for correctly predicting the stages of CKD at 1 year compared to other models (accuracy: 79% of POD 4 model versus 75% of POD 0 model, 76% of POD 1 model, 77% of POD 2 model, 78% of POD 3 model, 76% of POD 5 model, and 73% in postoperative 1 month model). Therefore, POD 4 may be the optimal sampling time point for postoperative SCr. A web application is hosted at https://dongy.shinyapps.io/aki_ckd. 
Our predictive model, which incorporated postoperative laboratory values, especially SCr levels, in addition to preoperative and intraoperative factors, effectively predicted the occurrence of CKD 1 year after RN or PN and may be helpful for comprehensive management planning.

7 days or less, whereas chronic kidney disease (CKD) is defined by the persistence of kidney disease for a period of > 90 days [3]. The severity of AKI and recovery time have been implicated as important predictors of CKD progression [3][4][5][6]. Surgically induced CKD may be associated with a lower risk of progression and mortality than CKD due to medical causes [7]. Nevertheless, an estimated glomerular filtration rate (eGFR) of less than 45 mL/min/1.73 m² in patients with surgically induced CKD has been associated with an increased risk of mortality [8].
Numerous studies have investigated the predictive factors for CKD following RN or PN [9][10][11][12][13][14][15][16], some of which developed predictive models for CKD [13,14]. However, these studies analyzed preoperative and intraoperative factors (such as patient characteristics, preoperative laboratory values, and surgical type or technique) as possible predictors, without including postoperative laboratory values [9][10][11][12][13][14]. Although a few studies included the occurrence of AKI or time to nadir eGFR as one of the predictors of CKD [15][16][17], serial changes in serum creatinine (SCr) were not considered. Given the variable trajectories following AKI [18], postoperative laboratory values, especially SCr, should be considered for better prediction. Therefore, we hypothesized that SCr levels collected in the first 5 days after nephrectomy would provide important information to predict the SCr levels 1 year after surgery and, ultimately, the occurrence of CKD. Thus, this study aimed to develop predictive models for CKD after RN or PN using early postoperative laboratory values, including SCr levels, in addition to preoperative and intraoperative factors, and to build a web application to facilitate their implementation. Moreover, we aimed to find the optimal SCr sampling time points for accurate CKD prediction.

Patients
The analysis data set included 1,556 patients with RCC who received either laparoscopic or robotic RN (n = 557) or PN (n = 999) between December 2005 and May 2019 and were followed up for at least 1 year after surgery. Patients who were lost to follow-up or died before 1 year were excluded. Data were retrospectively collected from the electronic medical records of a single institution.

Features used for prediction
Our study aimed to predict the rise in SCr levels relative to the preoperative value at 1 year after surgery (between 11 and 13 months after surgery), hereafter denoted as SCr 1y, given preoperative, intraoperative, and postoperative features. Data were also collected on POD 7 and 14 and at 3 and 6 months; however, they were not included in the model development. For brevity, a common notation has been used throughout to represent the baseline-subtracted level of different variables, with the associated subscript indicating the sampling time point. For example, baseline-subtracted SCr on POD 3 has been denoted as SCr 3d. The letters d, m, and y, used as subscripts, represent the day, month, and year, respectively.
Missing values were imputed using multiple imputation by chained equations (MICE), also known as fully conditional specification or sequential regression multiple imputation. The method operates under the assumption that missing data are Missing at Random (MAR), i.e., the probability of a particular value being missing depends only on the observed values and not the unobserved values [19]. Since missing values in our data showed a clear time-dependency and were likely unrelated to the true value of SCr, we assumed that the condition of MAR was fulfilled, thus allowing the partial deduction of the missing values based on the measurements immediately before and after them. The highest proportions of missing values occurred on POD 4 and 5 in both RN and PN (Additional file 1: Fig. S1). The proportions of patients without any missing value were 14.7% and 16.8% in RN and PN, respectively. The widely validated R package, mice, was used to carry out the imputation process [20].

Model development
First, the features were grouped into eight categories, namely F pre, F 0d, F 1d, F 2d, F 3d, F 4d, F 5d, and F 1m, based on their time of acquisition. F pre only included factors available prior to completion of the surgery, mentioned above as patient characteristics and preoperative and intraoperative factors. Other feature sets, denoted as F i with i = 0, 1, 2, 3, 4, 5 days, included F pre and information collected on the i-th postoperative day. The last feature set, F 1m, included F pre and features collected 1 month after surgery. Thereafter, Lasso regression models were built on each of the feature sets to predict SCr 1y, hereafter referred to individually as Model Lasso,pre, Model Lasso,0d, Model Lasso,1d, Model Lasso,2d, Model Lasso,3d, Model Lasso,4d, Model Lasso,5d, and Model Lasso,1m, and collectively as Model Lasso. The features with non-zero regression coefficients in Model Lasso were collectively referred to as F Lasso and individually as F Lasso,pre, F Lasso,0d, F Lasso,1d, F Lasso,2d, F Lasso,3d, F Lasso,4d, F Lasso,5d, and F Lasso,1m; each of these consisted of selected features from F pre, F 0d, F 1d, F 2d, F 3d, F 4d, F 5d, and F 1m, respectively. Prior to model development, we split the dataset into training and test datasets in the ratio of 8:2. The features were z-score normalized. A grid search algorithm and four-fold cross-validation were used on the training dataset to tune the shrinkage hyper-parameters.
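The Lasso step described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the authors' code. The function names (`zscore_columns`, `fit_lasso`) are hypothetical, the shrinkage value `lam` is fixed rather than tuned by grid search and four-fold cross-validation, and every feature is assumed to have non-zero variance.

```python
import statistics


def zscore_columns(rows):
    """Z-score each column of a list-of-rows matrix (population SD)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]  # assumed non-zero
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def soft_threshold(value, lam):
    """Shrink value towards zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def fit_lasso(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - b0 - Xb||^2 + lam*||b||_1.

    X is assumed to be column-centred (e.g. z-scored), so the intercept
    is simply the mean of y.
    """
    n, p = len(X), len(X[0])
    intercept = statistics.fmean(y)
    coefs = [0.0] * p
    residual = [yi - intercept for yi in y]
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            # Covariance of feature j with the residual that excludes
            # feature j's own current contribution.
            rho = sum(c * (r + coefs[j] * c) for c, r in zip(col, residual)) / n
            norm = sum(c * c for c in col) / n
            new_coef = soft_threshold(rho, lam) / norm
            delta = new_coef - coefs[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
                coefs[j] = new_coef
    return intercept, coefs
```

Features whose coefficient is exactly zero after fitting would be dropped; the remaining ones play the role of F Lasso in the text.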

Construction of the final model
To retain only the most parsimonious set of features, we calculated the Spearman's partial correlation coefficients [21] of F Lasso with SCr 1y and eliminated features with absolute values less than 0.1. This yielded the final selected features, collectively referred to as F final and individually as F final,pre, F final,0d, F final,1d, F final,2d, F final,3d, F final,4d, F final,5d, and F final,1m. We then constructed multivariate linear regression models using F final. Unlike the Lasso models, the features were not z-score normalized, so the estimated regression coefficients could be readily interpreted. The final models were referred to as Model final,pre, Model final,0d, Model final,1d, Model final,2d, Model final,3d, Model final,4d, Model final,5d, and Model final,1m.
The predictive performances of Model final were then compared with those of Model Lasso. R² and mean squared error (MSE) between the predicted and observed SCr 1y values calculated using the test dataset were used as the performance metrics. The eGFR was calculated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation [22]. CKD was categorized according to the eGFR: stage 1 (≥ 90 mL/min/1.73 m²), stage 2 (60-89 mL/min/1.73 m²), and stage 3 and higher (< 60 mL/min/1.73 m²) [23]. The overall analysis workflow is schematically shown in Fig. 1.
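The partial correlation filter can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it controls for a single covariate `z` using the first-order partial correlation formula applied to ranks, whereas the study may have adjusted for several features at once. All function names are hypothetical.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman_partial(x, y, z):
    """Spearman correlation of x and y after controlling for z."""
    rx, ry, rz = ranks(x), ranks(y), ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def keep_feature(x, y, z, threshold=0.1):
    """Retain a feature unless its absolute coefficient is below the threshold."""
    return abs(spearman_partial(x, y, z)) >= threshold
```

Here `x` would be a candidate feature, `y` the target SCr 1y, and `z` another feature to adjust for; the 0.1 threshold is the one stated in the text.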
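The conversion from a predicted SCr rise to eGFR and CKD stage can be sketched as below. The text cites the CKD-EPI equation [22] without giving its form; this sketch assumes the 2009 creatinine equation with SCr in mg/dL and omits the race coefficient, which is an assumption rather than something stated in the source. The function names are hypothetical, and the staging cut-offs are the ones given above.

```python
def egfr_ckd_epi_2009(scr_mg_dl, age_years, female):
    """eGFR (mL/min/1.73 m^2) from the assumed 2009 CKD-EPI creatinine equation."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (141.0 * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    return egfr * 1.018 if female else egfr


def ckd_stage(egfr):
    """Stage 1 (>= 90), stage 2 (60-89), stage 3 and higher (< 60)."""
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    return 3  # represents "stage 3 and higher"


def predict_one_year(preop_scr, predicted_rise, age_at_one_year, female):
    """Turn a predicted baseline-subtracted SCr rise into SCr, eGFR, and stage."""
    scr_1y = preop_scr + predicted_rise
    egfr = egfr_ckd_epi_2009(scr_1y, age_at_one_year, female)
    return scr_1y, egfr, ckd_stage(egfr)
```

Because the models predict the rise relative to the preoperative value, the preoperative SCr is added back before the eGFR is computed.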

Model stacking
We performed model stacking by developing meta-models using predictions generated from all possible subsets of Model final,0d, Model final,1d, Model final,2d, Model final,3d, Model final,4d, Model final,5d, and Model final,1m as features. Ridge regressions with four-fold cross-validation were used to acquire appropriate weights to be assigned to each of the predictions generated by the component models. A total of 120 (= 2^7 − 1 − 7) meta-models were thus developed, that is, all subsets of the 7 component models except the empty set and the 7 single-model subsets. Meta-models trained using predictions of k different final models (k = 2, …, 7) are hereafter referred to as Meta_models k. For example, a meta-model developed using predictions of Model final,0d, Model final,3d, and Model final,4d constituted one of Meta_models 3, and was used when supplied with SCr and laboratory values measured on POD 0, 3, and 4.
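The count of 120 meta-models follows from enumerating every subset of the 7 postoperative final models that contains at least two members. A short sketch, with hypothetical labels for the component models and without the ridge regression fit itself:

```python
from itertools import combinations

# Hypothetical labels for the 7 postoperative final models.
COMPONENT_MODELS = ("0d", "1d", "2d", "3d", "4d", "5d", "1m")


def meta_model_subsets(models=COMPONENT_MODELS):
    """All subsets with k = 2, ..., len(models) component models."""
    return [
        combo
        for k in range(2, len(models) + 1)
        for combo in combinations(models, k)
    ]


subsets = meta_model_subsets()
sizes = {k: sum(1 for s in subsets if len(s) == k) for k in range(2, 8)}
```

Each tuple in `subsets` names the component predictions that would serve as input features of one meta-model, and `sizes` gives the number of meta-models for each k.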

Web application development
To facilitate the automatic selection and implementation of the meta-model, we developed a web application with a user-friendly interface, hosted at https://dongy.shinyapps.io/aki_ckd. The shiny package in R (https://shiny.rstudio.com) was used for programming the application. The application used 8 basic models (Model final,pre, Model final,0d, Model final,1d, Model final,2d, Model final,3d, Model final,4d, Model final,5d, and Model final,1m) and 120 meta-models built from their predictions. Given only preoperative and intraoperative factors, Model final,pre is activated to generate the predictions. Following the input of extra postoperative factors, the optimal model is chosen based on the number of samples (= k) and the corresponding PODs of their acquisition. Outputs of the model are predicted values of SCr 1y, eGFR, and CKD stage at 1 year.

Results
Table 1 summarizes the baseline characteristics of patients that underwent RN and PN. The longitudinal trajectories of postoperative SCr in RN and PN are shown in Additional file 2: Fig. S2. SCr typically increased from POD 0 to POD 3, decreased from POD 4 to 7, showed a secondary surge until POD 15, and gradually declined towards the final level. SCr levels on POD 4 and 5 showed the highest correlation with SCr level at 1 year (Additional file 3: Table S1). In RN and PN, 43.1% (240 of 557) and 7.2% (72 of 999) of the patients, respectively, developed CKD stage 3 and higher 1 year after surgery. All the above exploratory analyses were carried out using the raw data prior to imputation.
For all features included in each F Lasso subset, partial Spearman's correlation coefficients with SCr 1y were calculated; only features with absolute coefficient values greater than 0.1 (i.e., F final) were retained. Final regression models (i.e., Model final) were developed on F final and their estimation results are shown in Table 2.
The classification performances for CKD stage based on the final models are shown in Table 3. Model final,4d was found to confer the best accuracy, weighted averaged precision, and weighted averaged recall.

Fig. 1 Overall analysis workflow. Each postoperative feature set was combined with F pre to yield 7 different feature sets (F 0d, …, F 5d, and F 1m). The 8 feature sets were used to fit 8 different Lasso regression models. Features of each set with non-zero coefficients, F Lasso, were then passed onto a partial correlation filter that evaluated the correlation of each of the features with the target variable SCr 1y. The final features, F final, were then used to train the final multivariate linear regression models. The final step used all possible combinations of the predictions generated by the 7 final models, Model final,0d to Model final,1m, to yield 120 meta-models.

Model stacking
Ridge regressions with zero intercept were performed using predictions of all possible subsets of Model final, yielding 120 meta-models.

Web application development
Figure 3 shows a screenshot of the developed web application. The minimum information required to run the application is patient characteristics, such as age, sex, and history of CKD, and preoperative and intraoperative factors, including the type of nephrectomy (RN or PN), size of mass removed, and preoperative SCr and BUN levels. The left panel is used for generating predictions. As postoperative measurements of SCr become available, they can be used to update the predictions.

Discussion
We developed predictive models for CKD 1 year after RN or PN that fully incorporated preoperative, intraoperative, and postoperative factors. Our work can be summarized as follows: 1) We clearly demonstrated the need to incorporate early postoperative information (specifically SCr levels) to accurately predict long-term renal function. 2) Within the first postoperative week, we identified POD 4 as the optimal sampling point for SCr (and BUN). 3) We identified the magnitude of early SCr elevation, history of CKD, surgery type (RN or PN), and patient age as the most robust predictors of CKD. 4) We provide a practical framework to predict CKD and offer an easy-to-use web application to implement our models.
Although surgically induced CKD may have a better prognosis than CKD due to medical causes [7], the risk of mortality is known to increase if patients have a reduced eGFR (< 45 mL/min/1.73 m²) following RCC surgery [8]. Our results showed that 14% (78 out of 557) of RN patients and 2.2% (22 out of 999) of PN patients had an eGFR < 45 mL/min/1.73 m² 1 year after surgery. This percentage was enough to warrant attention.
Studies investigating the risk factors for CKD after RN or PN have shown older age, male sex, history of CKD, diabetes mellitus, and RN as independent predictors [9][10][11][12][13][14][15]. However, most of these studies did not consider perioperative laboratory values. Another study showed that time to nadir eGFR was one of the predictors of CKD [18]. However, computing the exact time to nadir eGFR requires intensive sampling, which may limit easy clinical implementation. Moreover, the perioperative laboratory values tested were limited to eGFR in that study. In contrast to the aforementioned studies, our study comprehensively considered perioperative laboratory values, including CBC, routine chemistry, and serum electrolytes, as predictive factors.
In all the final models, an increase in SCr levels from the preoperative value and history of CKD were the most important features (Table 2). Among other features, RN (compared to PN) was robustly associated with a higher SCr 1y. Older age was additionally identified as a significant risk factor in most final models. Male sex was only significant in Model final,pre, which was built using just preoperative information; its effect was fully accounted for by other factors once postoperative information was available.
The size of mass removed was a significant feature only immediately after surgery (Model final,0d), and its contribution disappeared after incorporating the SCr values on POD 1 and onwards. Overall, the important predictors for CKD occurrence were postoperative SCr levels, history of CKD, RN, and older age. Most other factors that were tested as candidate predictors, such as postoperative electrolytes, CBC, routine chemistry, and vital signs, were not significant.
In addition to model development, we aimed to identify the optimal time point of postoperative SCr sampling for predicting CKD. To this end, we compared the predictive performances of 6 final models that used the information obtained on POD 0 to 5 (Model final,0d to Model final,5d) to that of a reference model that only used preoperative information (Model final,pre), and then to that of a model that used information collected at 1 month (Model final,1m). The predictive performances of Model final,0d to Model final,5d were better than those of Model final,pre but were similar to those of Model final,1m. Hence, postoperative SCr levels measured in the first 5 days after surgery constituted crucial, nearly sufficient information to predict CKD at 1 year. Among Model final,0d to Model final,5d, Model final,4d showed the best performance. In predicting the CKD stage, this model demonstrated classification accuracy of 79%, weighted averaged precision of 80%, and weighted averaged recall of 79% (Table 3). This suggested that POD 4 may be the optimal sampling point for predicting CKD.
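For readers unfamiliar with the classification metrics quoted above, the following sketch shows one common way to compute accuracy and support-weighted precision and recall for predicted CKD stages. It assumes the usual definition in which each class is weighted by its number of true cases; the source does not spell out its exact definition, and the function name is hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall)."""
    n = len(y_true)
    support = Counter(y_true)      # true cases per class
    predicted = Counter(y_pred)    # predicted cases per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    accuracy = sum(correct.values()) / n
    precision = sum(
        support[c] / n * (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    )
    recall = sum(support[c] / n * (correct[c] / support[c]) for c in support)
    return accuracy, precision, recall
```

Under this definition, support-weighted recall is algebraically equal to accuracy, which is consistent with both being reported as 79% for the POD 4 model.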
To maximize the predictive performance, we adopted model stacking, a technique increasingly used in the medical field [24][25][26], wherein predictions from each of the 7 models (excluding Model final,pre) were combined in all possible ways to generate 120 feature sets. Ridge regression models were then trained on these sets, yielding 120 meta-models. Our prediction strategy was to select, from the meta-models, the one that makes the best use of all available information. For example, if we had SCr measurements on POD 3, 4, and 5, we would choose a meta-model built on predictions of Model final,3d, Model final,4d, and Model final,5d.
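The selection strategy can be sketched as a simple lookup keyed by the time points at which postoperative measurements are available. This is an illustrative assumption about how such a selector might look, not the web application's actual code; the labels and the function name are hypothetical.

```python
# Hypothetical labels for postoperative sampling time points, in time order.
TIME_POINTS = ("0d", "1d", "2d", "3d", "4d", "5d", "1m")


def select_model(available):
    """Pick the model family and component models for the available data.

    Returns ("final", ("pre",)) with no postoperative data, a single final
    model with one time point, and a meta-model with two or more.
    """
    unknown = set(available) - set(TIME_POINTS)
    if unknown:
        raise ValueError(f"unknown time points: {sorted(unknown)}")
    chosen = tuple(t for t in TIME_POINTS if t in available)
    if not chosen:
        return ("final", ("pre",))
    if len(chosen) == 1:
        return ("final", chosen)
    return ("meta", chosen)
```

With measurements on POD 3, 4, and 5, `select_model({"3d", "4d", "5d"})` returns `("meta", ("3d", "4d", "5d"))`, matching the example in the text.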
This study has a few limitations. First, the data used for model building were retrospectively collected at a single center primarily comprising Korean patients. Hence, for generalization to patients of different ethnic backgrounds or those treated under different hospital environments, external validation is required. Second, this study only included surgeries that used minimally invasive laparoscopic or robotic techniques, and not open techniques. One study reported a lower risk of CKD in minimally invasive approaches [14], whereas others reported a similar risk [10,13]. However, minimally invasive approaches are being used more frequently for RN and PN [27]; thus, our model may be appropriate for future studies. Third, SCr was used as the surrogate of postoperative renal function and the target to be predicted, although the definition of CKD is based on eGFR [3]. However, as AKI is defined by changes in SCr, we wanted to examine the longitudinal changes of postoperative SCr while treating AKI and CKD as a continuum. Moreover, eGFR can easily be calculated using SCr. Therefore, we displayed eGFR in the web application by converting the predicted SCr to predicted eGFR and then classifying the CKD stages of the patients. Fourth, the strong correlation between the early increase in SCr levels and CKD at 1 year after surgery, while useful for prediction, offers little guidance on how to modify treatment to improve the clinical outcome. However, our results suggest that further investigations to prevent CKD progression should focus on preventing AKI in the first place, since early SCr elevation is strongly associated with long-term clinical outcomes. Despite these limitations, our model was the first to show the serial trends of SCr during 1 year with the incorporation of preoperative, intraoperative, and postoperative information.

Conclusions
We developed a model for predicting CKD after RN or PN, effectively extending the applicability of our prior model for predicting AKI after RN or PN [2]. The main strengths of our study were the active utilization of postoperative SCr and other laboratory values for CKD prediction and a clear demonstration of the importance of SCr measured within the first 5 days after surgery as

Table 3 Classification performances of chronic kidney disease stage based on the final models