- Research
- Open Access
- Published:
Multi-institutional development and external validation of machine learning-based models to predict relapse risk of pancreatic ductal adenocarcinoma after radical resection
Journal of Translational Medicine volume 19, Article number: 281 (2021)
Abstract
Background
Surgical resection is the only potentially curative treatment for pancreatic ductal adenocarcinoma (PDAC) and the survival of patients after radical resection is closely related to relapse. We aimed to develop models to predict the risk of relapse using machine learning methods based on multiple clinical parameters.
Methods
Data were collected and analysed of 262 PDAC patients who underwent radical resection at 3 institutions between 2013 and 2017, with 183 from one institution as a training set, 79 from the other 2 institution as a validation set. We developed and compared several predictive models to predict 1- and 2-year relapse risk using machine learning approaches.
Results
Machine learning techniques were superior to conventional regression-based analyses in predicting risk of relapse of PDAC after radical resection. Among them, the random forest (RF) outperformed other methods in the training set. The highest accuracy and area under the receiver operating characteristic curve (AUROC) for predicting 1-year relapse risk with RF were 78.4% and 0.834, respectively, and for 2-year relapse risk were 95.1% and 0.998. However, the support vector machine (SVM) model showed better performance than the others for predicting 1-year relapse risk in the validation set. And the k neighbor algorithm (KNN) model achieved the highest accuracy and AUROC for predicting 2-year relapse risk.
Conclusions
By machine learning, this study has developed and validated comprehensive models integrating clinicopathological characteristics to predict the relapse risk of PDAC after radical resection which will guide the development of personalized surveillance programs after surgery.
Introduction
Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal human malignant diseases worldwide and the sixth leading cause of cancer-related deaths in China [1]. So far, radical resection followed by adjuvant chemotherapy has been the only potentially curative treatment [2]. However, only a minority of patients present with a tumor suitable for this combination therapy at diagnosis, due to lack of early clinical symptoms and effective screening approaches [3]. Even after curative resection, up to 80% of patients will suffer from disease relapse resulting in a 5-year survival of only 20–30% [4,5,6,7]. Hence, the survival of patients with resectable PDAC is closely related to recurrence. It is necessary and urgent to build robust models to identify those patients with increased risk of relapse and further optimize treatment decision-making.
Nowadays, development of methods to predict treatment outcomes and prognosis is an important paradigm in the realm of personalized medicine [8]. Several studies have shown comparable prediction accuracy by using traditional regression-based statistical methods on a basis of a combination of biomarkers and multiple clinical factors [9,10,11,12]. However, common statistical methods familiar to clinicians ignore more complex non-linear interactions between variables that might play significant roles in the potential of future relapse, and which could be captured using more sophisticated modeling approaches [13]. In recent years, machine learning, as a branch of artificial intelligence (AI) technology, has attracted extensive interest in developing clinical predictive tools for diagnosis, staging and prognosis of various diseases [14,15,16]. It has been successfully applied for recognizing hidden patterns in complex data, allowing for better predictions of clinical outcomes than conventional statistical models, especially when applied to large-scale datasets [17].
Thus, the aim of this study was to develop, and externally validate, new cutting-edge machine learning-based models that accurately predict 1- and 2-year relapse of PDAC using clinicopathological factors in patients with resectable disease. Predicting the risk of relapse offers the potential to improve personalized surveillance schedules, determine clinical trial eligibility and compare results across studies and different institutions [18].
Materials and methods
Study population
Data of PDAC patients who underwent radical resection at 3 institutions between January 2013 and December 2017 were obtained. The study was approved by the Institutional Review Boards of 3 institutions. And no additional patient consent was required since the medical records were retrospectively reviewed. As this study aimed to build models based on preoperative clinical and pathological factors affecting relapse risk after surgery in resectable PDAC, patients who had initially borderline resectable/unresectable cancers according to the NCCN guideline [19] or received neoadjuvant therapy were excluded. So were those who were lost to follow-up or lacking complete clinical data. The inclusion criteria were met by a total of 262 patients, including 183 from the Second Affiliated Hospital of Zhejiang University School of Medicine, 70 from the Cancer Hospital of the University of Chinese Academy of Sciences and 9 from the Fourth Affiliated Hospital of Zhejiang University School of Medicine.
Data collection
Preoperative blood biomarkers including carcinoembryonic antigen (CEA), CA199, CA125, white blood cell (WBC) count, hemoglobin (Hb) count, platelet (Plt) count, neutrophil (Neut) count, lymphocyte (Lymp) count, monocyte (Mono) count, albumin (Alb), globulin (Glb), aspartate transaminase (AST), alanine transaminase (ALT), alkaline phosphatase (ALP), gamma-glutamyltransferase (GGT), total bilirubin (TB) and direct bilirubin (DB) were collected using the measurements that were closest to the operation and within at least 1 week before the surgery. Inflammation-based prognostic scores, including albumin-globulin ratio (AGR) [20], lymphcyte-monocyte ratio (LMR) [21], neutrophil–lymphocyte ratio (NLR) [22] and platelet-lymphocyte ratio (PLR) [23], were calculated. Additionally, pathological diagnosis and description was carried out by experienced pancreatic pathologists at 3 institutions, including surgical margin status, tumor site, tumor size, tumor differentiation, T-stage, lymph node status (N-stage), vascular invasion, perineural invasion and adipose tissue invasion.
After surgery, the follow-up of patients was initially performed every 3Â months for the first 2 years, every 6Â months during years 3 and 4, and then annually. The surveillance protocol included physical examination, serum CA19-9 level and contrast-enhanced abdominoperineal computed tomography (CT). When imaging features were consistent with a cancer recurrence, magnetic resonance imaging (MRI) and/or fluorodeoxyglucose positron emission tomography (PET) was carried out to further clarify ambiguous CT findings if necessary. Relapse-free survival (RFS) and overall survival (OS) were defined as the duration from the date of surgery until the date when a relapse was diagnosed and death, respectively, or last follow-up.
Statistical analysis
Differences of clinical characteristics between the training set and the validation set as well as between patient groups with or without 1- and 2-year relapse were assessed using independent sample t test, Mann–Whitney U test, or χ2 test with a statistical significance level set at 0.05. Clinical variables found significantly different (p < 0.05) between patient groups with or without 1- and 2-year relapse were selected as inputs for the predictive models.
In our study, six algorithms were applied to build models for predicting 1- and 2-year relapse. In addition to the basic binary LR model, several machine learning models were developed: random forest (RF), support vector machine (SVM), gradient boosting machine (GBM), Neural network (NN), k neighbor algorithm (KNN). RF and GBM both are tree-based ensemble algorithms. RF creates multiple decision tree models by bootstrap samples, and aggregates decisions through averaging or majority voting [24]. And GBM uses all the data to build a regression tree model from the beginning, and constructs the new models to be maximally correlated with the negative gradient of the loss function [25]. SVM provides two-class prediction by constructing the separating hyperplane that has the largest distance to the nearest training data points from each of the two classes [26]. The neural network algorithm recognizes the potential relationships in a set of data through constructing a network structure composed of three main layers (input, hidden and output layer) and the main task is to transform raw input units into useful output units [27]. The K-nearest neighbor algorithm is based on analogical reasoning, it stores all the training data and classifies the new data point based on similarity measures [28].
For data standardizing, we centered and scaled the input features to the same range of values with mean of zero prior to modeling. Model tuning were carried out using the repeated fivefold cross-validation method with the training set. Repeated cross-validation means repeating the procedure of cross-validation for k times (k = 3 in this study), each time with different splits. The model assessment metric was calculated in each repetition and finally averaged as the final result. Compared with performing cross-validation only once, repeated cross-validation can improve estimated performance of a chosen model [29]. In each cross-validation, we tried all possible combinations of parameters by grid search. For each set of parameters, we used 4/5 of the data to fit the model, and the remaining 1/5 was assessed to compute the performance measure. Here we selected accuracy as the performance measure, which was calculated 5 times and averaged to produce the performance score of each parameter set. The ranges of training parameters for grid search were provided in Additional file 1: Table S1. Relative variable importance was calculated and plotted to find out the impact of features on the predictive models.
The performance of the final models was assessed in the validation set. The evaluation indicators used to compare the performance of models were AUROC, sensitivity, specificity, accuracy, positive predictive value (PPV), negative predictive value (NPV), F1 score and root mean squared error (RMSE). To further evaluate the performance of the models, we used bootstrapping resampling (2000 times) to compute the 95% confidence interval (CI) of AUROC and compared the AUROCs of machine learning models using 2-sided test. Finally, 95%CI of AUROCs and p values from comparisons were plotted together. We determined the best machine learning models for prediction of 1- and 2-year relapse with the validation set. Calibration curves were constructed to regress observed data against model fits of the best machine learning models. We also tried other variable sets as inputs for these ML models: (1) all the 32 clinical variables, (2) variables obtained through fivefold cross-validation Lasso analysis.
All statistical analysis was performed with R 4.0.2. The R package ‘caret’ was used for data pre-processing, model training (SVM and KNN), and calculation of variable importance. The R packages ‘randomForest’, ‘gbm’ and ‘nnet’ were used for the RF, GBM, and NNET model training, respectively. Lasso analysis was performed by the R package ‘glmnet’.
Results
Basic characteristics
The clinicopathological characteristics of the training set and validation set are shown in Table 1. 183 from the Second Affiliated Hospital of Zhejiang University School of Medicine were included as the training set. 70 from the Cancer Hospital of the University of Chinese Academy of Sciences and 9 from the Fourth Affiliated Hospital of Zhejiang University School of Medicine were used as the external independent validation cohort. Several clinical features were found significantly different between the training and validation datasets including globulin (Glb), albumin-globulin ratio (AGR), tumor differentiation, T-stage, lymph node status (N-stage) and vascular invasion (VI).
Comparison of characteristics between patients with and without 1- or 2-year relapse in training set was shown in Additional file 1: Table S2 and S3, respectively. According to the univariate analysis, significant differences were observed in various clinical parameters (CA199, N stage, vascular invasion, adipose tissue invasion, differentiation) for 1-year relapse, and (CA199, N stage, vascular invasion, monocyte counts, albumin, AGR) for 2-year relapse. These variables were then included in the construction of machine learning models to predict the relapse risk of PDAC after radical surgery.
Model performance
Six models including LR, RF, SVM, GBM, KNN and NN were built and externally validated and the optimal parameters of these models were shown in Additional file 1: Table S4. Relative importance of variables was calculated and shown in Figs. 1 and 2. Pathological characteristics such as lymph node status (N-stage), tumor differentiation, and vascular invasion were found to have a major impact on most predictive models.
Relative importance of variables on models to predict 1-year relapse. Interpretation: N2 = N stage 1, N3 = N stage 2; grade 2 = moderate differentiation, grade 3 = poor differentiation or undifferentiated; ATI1 = with adipose tissue invasion; VI1 = with vascular invasion; CA1991 = CA 199 ≥ 37U/mL
Comparisons of ROC curves and AUROC of different models to predict 1- and 2-year relapse in training cohort and validation sets were shown in Fig. 3 and Additional file 1: Figure S1. All six methods had excellent performance in the training set. Among them, the RF model outperformed the others in the training set. The highest accuracy and AUROC for predicting 1-year relapse risk with RF were 78.4% and 0.834, respectively; and for 2-year relapse risk were 95.1% and 0.998, respectively. LR obtained the lowest AUROC value of 0.776 to predict 1-year relapse risk and KNN of 0.808 to predict 2-year relapse risk.
Comparisons of ROC curves and AUROC of different models to predict 1- and 2-year relapse in training cohort and validation sets (1-year relapse: training set: A, validation set: B, comparison of AUROC in validation set: C; 2-year relapse: training set: D, validation set: E, comparison of AUROC in validation set: F)
In the validation set, the SVM model showed better performance than the others for predicting 1-year relapse risk with an accuracy and AUROC of 70.9% and 0.733, respectively (Table 2). And the KNN model achieved the highest accuracy and AUROC for predicting 2-year relapse risk of 73.4% and 0.689, respectively (Table 3). We further separately compared these two models with the rest using the AUROC. However, there was no significant statistical difference between RF and either of these two models, implying that these models might be similar in terms of their predictive power.
In addition, we also built models based on all the 32 clinical variables or variables obtained from fivefold cross-validation Lasso analysis. Nonetheless, no better predictive performance was achieved by either of these two approaches (Additional file 1: Tables S5 and S6). We still used the results from univariate analysis considering its simplicity and good performance.
Finally, we used a calibration curve to assess the agreement between the predicted and observed risks of relapse of PDAC. Adequate consistency was displayed in the training set between estimated risks using the predictive models and the actual observed outcome. However, SVM and KNN showed relatively poorer calibration performance in the validation set due to a smaller sample size (Additional file 1: Figure S2).
Discussion
The development of predictive tools for individual relapse risk assessment after multimodal therapy may help to further optimize treatment decision-making [30]. In this study, we have constructed and validated comprehensive models integrating clinicopathological characteristics to predict the relapse risk of PDAC after radical resection. It turned out that machine learning techniques were superior to conventional regression-based analyses in terms of the predictive performance. In accordance with various studies investigating the prognostic factors of PDAC [11, 31, 32], lymph node status (N-stage), vascular invasion and CA199 are independent predictors for both 1- and 2-year relapse. Although the RF model had the highest AUROC in the training set, the SVM model and KNN model showed better robustness to predict 1- and 2-year relapse in the validation set, respectively.
Currently, lack of screening and early detection, the proneness for early relapse after radical resection and minimally effective systemic therapy remain major barriers to curing patients with PDAC [33]. Timely and accurate prediction of relapse even after operative intervention is difficult. Implementation of cutting-edge machine learning algorithms may help to identify at-risk patients, among whom more intensive surveillance, the use of adjuvant treatment, or even the inclusion of these patients into clinical trials may be considered. Nowadays, artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications across almost every domain of medicine [34,35,36]. As an important branch of AI, machine learning allows computers to train models using large numbers of examples and may detect difficult-to-recognize patterns from complex dataset [37]. Unlike conventional regression-based approaches, machine learning algorithms are capable of capturing higher-order, non-linear inter-actions between predictors [38]. As a widely used model in biomedical analytics, SVM creates a set of hyperplanes for each feature in an infinite dimensional space, and fits linear or nonlinear models that most effectively discriminate between the values of a binary output variable [39]. Its effectiveness has been proved in studies to predict the recurrence of various diseases [40,41,42]. KNN is another stringent methodology for classification and regression. Reports have also demonstrated its promising role in prognostic research [43,44,45]. It can be useful to weight the contributions of the neighbours, so that the nearer neighbours contribute more to the average than the more distant ones [46]. Our study allowed for the comparison of multiple learning algorithms to identify the approach with the most favorable performance.
To the best of our knowledge, this is the first study to develop and compare machine learning-based models to predict relapse risk of pancreatic ductal adenocarcinoma after radical resection from multi-institutional datasets. Predictive nomograms based on conventional regression methods have been built for early recurrence after pancreatectomy in resectable pancreatic cancer [9, 12]. Kim et al. established a nomogram to predict the probability of recurrence within 12 months after surgery in single medical center with AUROC = 0.655 [9]. While in our study, we constructed and externally validated a predictive SVM model for 1-year relapse risk with AUROC = 0.733 and a KNN model for 2-year relapse risk with AUROC = 0.689 using stringent statistical method. Another work by Guo et al. redefined early recurrence as the first 162 days postoperatively on a basis of its own cohort, which made it difficult to compare results across studies and different institutions [12]. Particularly, it is understandable that this study did not include histopathologic data in its Cox proportional hazards regression model for the purpose of guiding preoperative decision-making concerning the use of neoadjuvant therapy. Other reports regarding this topic also have their own specific drawbacks with either a very small sample size of less than 40 [30] or lack of external validation [10]. In addition, recent research has revealed the links between radiomics and underlying tumor biology in PDAC, which are strongly correlated with tumor phenotype [47], response to treatment [48], and prognosis [49,50,51]. However, the steps of image texture analysis and manual contouring of region of interests (ROIs) are still tedious, laborious and time-consuming, which is inconvenient for clinical practice at present and has ample room to improve in the future.
Certain limitations of this study and the results need to be discussed. First, given the retrospective nature of our study, there might be some selection bias existing because of its inherent flaws. Second, despite the low incidence of PDAC, the relatively limited sample size included in the training and validation dataset might impair the accuracy for quantifying interpatient variability effects. Both two models showed high sensitivity with a trade-off that the specificity might be sacrificed in a certain level, which is relevant to the threshold selection when performing binary classification [52]. More larger and balanced cohorts will be collected from multiple medical centers in the future to further establish the robustness of the proposed models. Third, limitations in the interpretability of inner workings of models currently poses a severe bottleneck in implementing cutting-edge machine learning techniques in biomedical research [34, 53]. We need to keep pursuing a better understanding of the complex and evolving relationship between physicians and human-centred AI tools in the live clinical environment, thus providing better outcomes to our patients [54].
In conclusion, we employed machine learning algorithms to construct models integrating clinicopathological characteristics to predict the relapse risk of PDAC after radical resection. And we have externally validated the prediction capacity of our models in independent groups from other medical institutions. Machine learning systems can provide critical prognostic prediction for patients with PDAC after radical resection, and the use of predictive algorithms may offer promising clinical decision support for both practitioners and patients.
Availability of data and materials
The datasets generated during and analysed during the current study are available in the Code Ocean (https://codeocean.com/capsule/2968380/tree).
Abbreviations
- PDAC:
-
Pancreatic ductal adenocarcinoma
- AI:
-
Artificial intelligence
- CT:
-
Computed tomography
- MRI:
-
Magnetic resonance imaging
- PET:
-
Positron emission tomography
- RFS:
-
Relapse-free survival
- OS:
-
Overall survival
- RF:
-
Ramdom forest
- SVM:
-
Support vector machine
- GBM:
-
Gradient boosting machine
- NN:
-
Neural network
- KNN:
-
K neighbor algorithm
- AUROC:
-
Area under the receiver operating characteristic curve
- PPV:
-
Positive predictive value
- NPV:
-
Negative predictive value
- RMSE:
-
Root mean squared error
- CI:
-
Confidence interval
- BMI:
-
Body mass index
- CEA:
-
Carcinoembryonic antigen
- CA:
-
Cancer antigen
- WBC:
-
White blood cell
- Hb:
-
Hemoglobin
- Plt:
-
Platelet
- Neut:
-
Neutrophil
- Lymph:
-
Lymphocyte
- Mono:
-
Monocyte
- Alb:
-
Albumin
- Glb:
-
Globulin
- AGR:
-
Albumin-globulin ratio
- NLR:
-
Neutrophil–lymphocyte ratio
- LMR:
-
Lymphcyte-monocyte ratio
- PLR:
-
Platelet-lymphocyte ratio
- AST:
-
Aspartate transaminase
- ALT:
-
Alanine transaminase
- ALP:
-
Alkaline phosphatase
- GGT:
-
Gamma-glutamyltransferase
- TB:
-
Total bilirubin
- DB:
-
Direct bilirubin
- VI:
-
Vascular invasion
- PI:
-
Perineural invasion
- ATI:
-
Adipose tissue invasion
- OS:
-
Overall survival
- RFS:
-
Relapse-free survival
References
Chen WQ, Zheng RS, Baade PD, Zhang SW, Zeng HM, Bray F, et al. Cancer Statistics in China, 2015. Cancer J Clin. 2016;66(2):115–32.
Nevala-Plagemann C, Hidalgo M, Garrido-Laguna I. From state-of-the-art treatments to novel therapies for advanced-stage pancreatic cancer. Nat Rev Clin Oncol. 2020;17(2):108–23.
Aier I, Semwal R, Sharma A, Varadwaj PK. A systematic assessment of statistics, risk factors, and underlying features involved in pancreatic cancer. Cancer Epidemiol. 2019;58:104–10.
Katz MHG, Wang H, Fleming JB, Sun CC, Hwang RF, Wolff RA, et al. Long-term survival after multidisciplinary management of resected pancreatic adenocarcinoma. Ann Surg Oncol. 2009;7:25.
Ferrone CR, Pieretti-Vanmarcke R, Bloom JP, Zheng H, Szymonifka J, Wargo JA, et al. Pancreatic ductal adenocarcinoma: long-term survival does not equal cure. Surgery. 2012;152:S43-9.
He J, Ahuja N, Makary MA, Cameron JL, Eckhauser FE, Choti MA, et al. 2564 resected periampullary adenocarcinomas at a single institution: trends over three decades. HPB. 2014;17:325.
Ellison LF, Wilkins K. An update on cancer survival. Health Rep. 2010;21(3):55–60.
Kawakami E, Tabata J, Yanaihara N, Ishikawa T, Koseki K, Iida Y, et al. Application of artificial intelligence for preoperative diagnostic and prognostic prediction in epithelial ovarian cancer based on blood biomarkers. Clin Cancer Res. 2019;25(10):3006–15.
Kim N, Han IW, Ryu Y, Hwang DW, Heo JS, Choi DW, et al. Predictive nomogram for early recurrence after pancreatectomy in resectable pancreatic cancer: Risk classification using preoperative clinicopathologic factors. Cancers. 2020;12:18.
He C, Huang X, Zhang Y, Cai Z, Lin X, Li S. A quantitative clinicopathological signature for predicting recurrence risk of pancreatic ductal adenocarcinoma after radical resection. Front Oncol. 2019;9:87.
He C, Sun S, Zhang Y, Lin X, Li S. A novel nomogram to predict survival in patients with recurrence of pancreatic ductal adenocarcinoma after radical resection. Front Oncol. 2020;10:147.
Guo SW, Shen J, Gao JH, Shi XH, Gao SZ, Wang H, et al. A preoperative risk model for early recurrence after radical resection may facilitate initial treatment decisions concerning the use of neoadjuvant therapy for patients with pancreatic ductal adenocarcinoma. Surgery. 2020;168(6):1003–14.
Wei R, Wang J, Wang X, Xie G, Wang Y, Zhang H, et al. Clinical prediction of HBV and HCV related hepatic fibrosis using machine learning. EBioMedicine. 2018;35:124–32.
Jurmeister P, Bockmayr M, Seegerer P, Bockmayr T, Treue D, Montavon G, et al. Machine learning analysis of DNA methylation profiles distinguishes primary lung squamous cell carcinomas from head and neck metastases. Sci Transl Med. 2019;11:509.
Xu R-H, Wei W, Krawczyk M, Wang W, Luo H, Flagg K, et al. Circulating tumour DNA methylation markers for diagnosis and prognosis of hepatocellular carcinoma. Nat Mater. 2017;8:54.
Motwani M, Dey D, Berman DS, Germano G, Achenbach S, Al-Mallah MH, et al. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur Heart J. 2017;38(7):500–7.
Singal AG, Mukherjee A, Joseph Elmunzer B, Higgins PDR, Lok AS, Zhu J, et al. Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma. Am J Gastroenterol. 2013;108(11):1723–30.
Pulvirenti A, Javed AA, Landoni L, Jamieson NB, Chou JF, Miotto M, et al. Multi-institutional development and external validation of a nomogram to predict recurrence after curative resection of pancreatic neuroendocrine tumors. Ann Surg. 2019;10:1–7.
Tempero MA, Malafa MP, Chiorean EG, Czito B, Scaife C, Narang AK, et al. Pancreatic adenocarcinoma, version 1.2019 featured updates to the NCCN guidelines. JNCCN. 2019;17(3):203–10.
He J, Pan H, Liang W, Xiao D, Chen X, Guo M, et al. Prognostic effect of albumin-to-globulin ratio in patients with solid tumors: a systematic review and meta-analysis. J Cancer. 2017;8(19):4002–10.
Goto W, Kashiwagi S, Asano Y, Takada K, Takahashi K, Hatano T, et al. Predictive value of lymphocyte-to-monocyte ratio in the preoperative setting for progression of patients with breast cancer. BMC Cancer. 2018;18(1):1137.
Tong Z, Liu L, Zheng Y, Jiang W, Zhao P, Fang W, et al. Predictive value of preoperative peripheral blood neutrophil/lymphocyte ratio for lymph node metastasis in patients of resectable pancreatic neuroendocrine tumors: A nomogram-based study. World J Surg Oncol. 2017;15(1):1–9.
Wang C, He W, Yuan Y, Zhang Y, Li K, Zou R, et al. Comparison of the prognostic value of inflammation-based scores in early recurrent hepatocellular carcinoma after hepatectomy. Liver Int. 2020;9:547.
Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001;29(5):1189–232.
Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–97.
Cross SS, Harrison RF, Kennedy RL. Introduction to neural networks. The Lancet. 1995;346(8982):1075–9.
Altman NS. An introduction to kernel and nearest-neighbor nonparametric regression. Am Stat. 1992;46(3):175–85.
Moss HB, Leslie DS, Rayson P. Using J-K-fold cross validation to reduce variance when tuning NLP models. BMC. 2018;5:2978–89.
Sala Elarre P, Oyaga-Iriarte E, Yu KH, Baudin V, Arbea Moreno L, Carranza O, et al. Use of machine-learning algorithms in intensified preoperative therapy of pancreatic cancer to predict individual risk of relapse. Cancers. 2019;11(5):606.
Song W, Miao DL, Chen L. Nomogram for predicting survival in patients with pancreatic cancer. Onco Targets Ther. 2018;11:539–45.
De CMM, Biere SSAY, Lagarde SM, Busch ORC, Van GM, Gouma DJ. Validation of a nomogram for predicting survival after resection for adenocarcinoma of the pancreas. Br J Surg. 2009;96(4):417–23.
Groot VP, Rezaee N, Wu W, Cameron JL, Fishman EK, Hruban RH, et al. Patterns, timing, and predictors of recurrence following pancreatectomy for pancreatic ductal adenocarcinoma. Ann Surg. 2018;267(5):936–45.
Kelly CJ, Karthikesalingam A, Suleyman M, Corrado G, King D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 2019;17:1–9.
Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44–56.
Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, et al. A guide to deep learning in healthcare. Nat Med. 2019;25(1):24–9.
Bohr A, Memarzadeh K. The rise of artificial intelligence in healthcare applications. PLoS ONE. 2020;4:25–60.
Zeevi D, Korem T, Zmora N, Israeli D, Rothschild D, Weinberger A, et al. Personalized nutrition by prediction of glycemic responses. Cell. 2015;163(5):1079–94.
Kim W, Kim KS, Lee JE, Noh D-Y, Kim S-W, Jung YS, et al. Development of novel breast cancer recurrence prediction model using support vector machine. J Breast Cancer. 2012;15(2):230.
Liang J-D, Ping X-O, Tseng Y-J, Huang G-T, Lai F, Yang P-M. Recurrence predictive models for patients with hepatocellular carcinoma after radiofrequency ablation using support vector machines with feature selection methods. Comput Methods Prog Biomed. 2014;117(3):425–34.
Tseng C-J, Lu C-J, Chang C-C, Chen G-D. Application of machine learning to predict the recurrence-proneness for cervical cancer. Neural Comput Appl. 2014;24(6):1311–6.
Lg A, At E. Using three machine learning techniques for predicting breast cancer recurrence. J Health Med Inf. 2013;04(02):2–4.
Medjahed SA, Saadi TA, Benyettou A. Breast cancer diagnosis by using k-nearest neighbor with different distances and classification rules. Int J Comput Appl. 2013;62:18.
Li C, Zhang S, Zhang H, Pang L, Lam K, Hui C, et al. Using the K-nearest neighbor algorithm for the classification of lymph node metastasis in gastric cancer. Comput Math Methods Med. 2012;2012:77.
Atallah DM, Badawy M, El-Sayed A, Ghoneim MA. Predicting kidney transplantation outcome based on hybrid feature selection and KNN classifier. Multimedia Tools Appl. 2019;78(14):20383–407.
Rana M, Chandorkar P, Dsouza A, Kazi N. Breast cancer diagnosis and recurrence prediction using machine learning techniques. IJRET. 2015;8:2319–1163.
Lim CH, Cho YS, Choi JY, Lee KH, Lee JK, Min JH, et al. Imaging phenotype using 18F-fluorodeoxyglucose positron emission tomography–based radiomics and genetic alterations of pancreatic ductal adenocarcinoma. Eur J Nucl Med Mol Imaging. 2020;47(9):2113–22.
Nasief H, Zheng C, Schott D, Hall W, Tsai S, Erickson B, et al. A machine learning based delta-radiomics process for early prediction of treatment response of pancreatic cancer. NPJ Precis Oncol. 2019;3(1):1–10.
Kaissis G, Ziegelmayer S, Lohöfer F, Algül H, Eiber M, Weichert W, et al. A prospectively validated machine learning model for the prediction of survival and tumor subtype in pancreatic ductal adenocarcinoma. BMC Med. 2019;17:1–9.
Hwang SH, Kim HY, Lee EJ, Hwang HK, Park M-S, Kim M-J, et al. Preoperative clinical and computed tomography (CT)-based nomogram to predict oncologic outcomes in patients with pancreatic head cancer resected with curative intent: a retrospective study. J Clin Med. 2019;8(10):1749.
Yun G, Kim YH, Lee YJ, Kim B, Hwang JH, Choi DJ. Tumor heterogeneity of pancreas head cancer assessed by CT texture analysis: Association with survival outcomes after curative resection. Sci Rep. 2018;8(1):1–10.
Lu CF, Hsu FT, Hsieh KL, Kao YJ, Cheng SJ, Hsu JB, et al. Machine learning-based radiomics for molecular subtyping of gliomas. Clin Cancer Res. 2018;24(18):4429–36.
Manamley N, Mallett S, Sydes MR, Hollis S, Scrimgeour A, Burger HU, et al. Data sharing and the evolving role of statisticians. BMC Med Res Methodol. 2016;16(Suppl 1):75.
Ahuja AS. The impact of artificial intelligence in medicine on the future role of the physician. PeerJ. 2019;2019:10.
Acknowledgements
Not applicable.
Funding
This work was supported by the General Program of National Natural Science Foundation of China under Grant [Grant Number: 81772562, 2017] (Yulian Wu) and the Fundamental Research Funds for the Central Universities [Grant Number: 2021FZZX005-08] (Xiawei Li).
Author information
Authors and Affiliations
Contributions
XL: conceptualization, methodology, formal analysis, writing-original draft, writing-review & editing. LY: resources, writing-review & editing. ZY: writing-original draft, formal analysis. JL: investigation. YF: resources, data curation. AS: data curation, methodology. JH, MZ: data curation. YW: funding acquisition, project administration, resources, supervision. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
The study was approved by the Institutional Review Boards of 3 institutions. And no additional patient consent was required since the medical records were retrospectively reviewed.
Consent for publication
All the authors agree to the publication of this work.
Competing interests
The authors declare no potential conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1: Figure S1.
Comparisons of AUROC of different models to predict 1- and 2-year relapse in training set (a: 1-year relapse, b: 2-year relapse). Figure S2. Calibration curves of SVM model (A: training set; B: validation set) to predict 1-year relapse and KNN model (C: training set; D: validation set) to predict 2-year relapse. Table S1. The ranges of training parameters for grid search in different models. Table S2. Comparison of characteristics between patients with and without 1-year relapse in training set. Table S3. Comparison of characteristics between patients with and without 2-year relapse in training set. Table S4. The optimal parameters for different models to predict relapse risks of PDAC. Table S5. Performance of models built on all 32 variables in the validation set. Table S6. Performance of models built on variables from lasso analysis in the validation set.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Li, X., Yang, L., Yuan, Z. et al. Multi-institutional development and external validation of machine learning-based models to predict relapse risk of pancreatic ductal adenocarcinoma after radical resection. J Transl Med 19, 281 (2021). https://doi.org/10.1186/s12967-021-02955-7
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s12967-021-02955-7
Keywords
- Machine learning
- PDAC
- Relapse
- Prediction model
- Radical surgery