Multi-institutional development and external validation of machine learning-based models to predict relapse risk of pancreatic ductal adenocarcinoma after radical resection

Li, Xiawei; Yang, Litao; Yuan, Zheping; Lou, Jianyao; Fan, Yiqun; Shi, Aiguang; Huang, Junjie; Zhao, Mingchen; Wu, Yulian

doi:10.1186/s12967-021-02955-7

Research
Open access
Published: 30 June 2021

Xiawei Li^1,2,3,
Litao Yang⁴,
Zheping Yuan⁵,
Jianyao Lou^1,2,3,
Yiqun Fan⁶,
Aiguang Shi^1,2,3,
Junjie Huang⁷,
Mingchen Zhao⁵ &
…
Yulian Wu ORCID: orcid.org/0000-0002-4090-0551^1,2,3

Journal of Translational Medicine volume 19, Article number: 281 (2021)

Abstract

Background

Surgical resection is the only potentially curative treatment for pancreatic ductal adenocarcinoma (PDAC) and the survival of patients after radical resection is closely related to relapse. We aimed to develop models to predict the risk of relapse using machine learning methods based on multiple clinical parameters.

Methods

Data were collected and analysed from 262 PDAC patients who underwent radical resection at 3 institutions between 2013 and 2017; 183 patients from one institution formed the training set and 79 patients from the other 2 institutions formed the validation set. We developed and compared several models for predicting 1- and 2-year relapse risk using machine learning approaches.

Results

Machine learning techniques were superior to conventional regression-based analyses in predicting risk of relapse of PDAC after radical resection. Among them, the random forest (RF) outperformed other methods in the training set. The highest accuracy and area under the receiver operating characteristic curve (AUROC) with RF were 78.4% and 0.834, respectively, for predicting 1-year relapse risk, and 95.1% and 0.998 for 2-year relapse risk. In the validation set, however, the support vector machine (SVM) model performed better than the others for predicting 1-year relapse risk, and the k-nearest neighbor (KNN) model achieved the highest accuracy and AUROC for predicting 2-year relapse risk.

Conclusions

Using machine learning, this study developed and validated comprehensive models that integrate clinicopathological characteristics to predict the relapse risk of PDAC after radical resection, which will guide the development of personalized surveillance programs after surgery.

Introduction

Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal human malignant diseases worldwide and the sixth leading cause of cancer-related deaths in China [1]. So far, radical resection followed by adjuvant chemotherapy has been the only potentially curative treatment [2]. However, only a minority of patients present with a tumor suitable for this combination therapy at diagnosis, due to lack of early clinical symptoms and effective screening approaches [3]. Even after curative resection, up to 80% of patients will suffer from disease relapse resulting in a 5-year survival of only 20–30% [4,5,6,7]. Hence, the survival of patients with resectable PDAC is closely related to recurrence. It is necessary and urgent to build robust models to identify those patients with increased risk of relapse and further optimize treatment decision-making.

The development of methods to predict treatment outcomes and prognosis is now an important paradigm in the realm of personalized medicine [8]. Several studies have shown comparable prediction accuracy using traditional regression-based statistical methods on the basis of a combination of biomarkers and multiple clinical factors [9,10,11,12]. However, the common statistical methods familiar to clinicians ignore more complex non-linear interactions between variables that might play significant roles in the potential for future relapse, and which could be captured using more sophisticated modeling approaches [13]. In recent years, machine learning, as a branch of artificial intelligence (AI) technology, has attracted extensive interest for developing clinical predictive tools for the diagnosis, staging and prognosis of various diseases [14,15,16]. It has been successfully applied to recognize hidden patterns in complex data, allowing for better predictions of clinical outcomes than conventional statistical models, especially when applied to large-scale datasets [17].

Thus, the aim of this study was to develop, and externally validate, new cutting-edge machine learning-based models that accurately predict 1- and 2-year relapse of PDAC using clinicopathological factors in patients with resectable disease. Predicting the risk of relapse offers the potential to improve personalized surveillance schedules, determine clinical trial eligibility and compare results across studies and different institutions [18].

Materials and methods

Study population

Data of PDAC patients who underwent radical resection at 3 institutions between January 2013 and December 2017 were obtained. The study was approved by the Institutional Review Boards of the 3 institutions, and no additional patient consent was required because the medical records were reviewed retrospectively. As this study aimed to build models based on preoperative clinical and pathological factors affecting relapse risk after surgery in resectable PDAC, patients who initially had borderline resectable or unresectable cancers according to the NCCN guideline [19], or who received neoadjuvant therapy, were excluded, as were those who were lost to follow-up or lacked complete clinical data. A total of 262 patients met the inclusion criteria: 183 from the Second Affiliated Hospital of Zhejiang University School of Medicine, 70 from the Cancer Hospital of the University of Chinese Academy of Sciences and 9 from the Fourth Affiliated Hospital of Zhejiang University School of Medicine.

Data collection

Preoperative blood biomarkers, including carcinoembryonic antigen (CEA), CA19-9, CA125, white blood cell (WBC) count, hemoglobin (Hb), platelet (Plt) count, neutrophil (Neut) count, lymphocyte (Lymp) count, monocyte (Mono) count, albumin (Alb), globulin (Glb), aspartate transaminase (AST), alanine transaminase (ALT), alkaline phosphatase (ALP), gamma-glutamyltransferase (GGT), total bilirubin (TB) and direct bilirubin (DB), were taken from the measurements closest to the operation and within 1 week before surgery. Inflammation-based prognostic scores, including albumin-globulin ratio (AGR) [20], lymphocyte-monocyte ratio (LMR) [21], neutrophil–lymphocyte ratio (NLR) [22] and platelet-lymphocyte ratio (PLR) [23], were calculated. Additionally, pathological diagnosis and description were carried out by experienced pancreatic pathologists at the 3 institutions and covered surgical margin status, tumor site, tumor size, tumor differentiation, T-stage, lymph node status (N-stage), vascular invasion, perineural invasion and adipose tissue invasion.
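As an illustration only, the four ratio scores named above can be computed as follows. This is a minimal Python sketch, not the authors' R code; the class `BloodPanel`, its field names, the units and the example values are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class BloodPanel:
    """Preoperative values; field names and units are illustrative assumptions."""
    albumin: float      # g/L
    globulin: float     # g/L
    neutrophils: float  # 10^9 cells/L
    lymphocytes: float  # 10^9 cells/L
    monocytes: float    # 10^9 cells/L
    platelets: float    # 10^9 cells/L


def inflammation_scores(p: BloodPanel) -> dict[str, float]:
    """Return the four ratio scores (AGR, LMR, NLR, PLR) for one patient."""
    if min(p.globulin, p.lymphocytes, p.monocytes) <= 0:
        raise ValueError("denominators must be positive")
    return {
        "AGR": p.albumin / p.globulin,         # albumin-globulin ratio
        "LMR": p.lymphocytes / p.monocytes,    # lymphocyte-monocyte ratio
        "NLR": p.neutrophils / p.lymphocytes,  # neutrophil-lymphocyte ratio
        "PLR": p.platelets / p.lymphocytes,    # platelet-lymphocyte ratio
    }


# Made-up example values, not patient data.
panel = BloodPanel(albumin=40.0, globulin=25.0, neutrophils=4.0,
                   lymphocytes=2.0, monocytes=0.5, platelets=200.0)
scores = inflammation_scores(panel)
print(scores)
```

Because each score is a simple ratio of two routinely measured values, it adds no new measurement and is highly correlated with its components, which is worth keeping in mind when both enter the same model.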

After surgery, patients were followed up every 3 months for the first 2 years, every 6 months during years 3 and 4, and annually thereafter. The surveillance protocol included physical examination, serum CA19-9 level and contrast-enhanced abdominoperineal computed tomography (CT). When imaging features were consistent with cancer recurrence, magnetic resonance imaging (MRI) and/or fluorodeoxyglucose positron emission tomography (PET) was performed, if necessary, to clarify ambiguous CT findings. Relapse-free survival (RFS) was defined as the time from the date of surgery to the date on which relapse was diagnosed, and overall survival (OS) as the time from the date of surgery to death, or to the last follow-up if the event had not occurred.
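The 1- and 2-year relapse endpoints used later are binary, whereas RFS is a time-to-event quantity. The text does not spell out how one was derived from the other, so the following Python sketch is an assumption: a patient counts as relapsed if relapse was diagnosed within the horizon, as relapse-free if followed beyond the horizon without relapse by then, and as undetermined otherwise. The function name `relapse_label` and the example values are hypothetical.

```python
from typing import Optional


def relapse_label(rfs_months: float, relapsed: bool,
                  horizon_months: float) -> Optional[int]:
    """1 = relapse within the horizon, 0 = relapse-free at the horizon,
    None = follow-up ended before the horizon without relapse
    (outcome unknown; how such cases were handled is an assumption here)."""
    if relapsed and rfs_months <= horizon_months:
        return 1
    if rfs_months >= horizon_months:
        return 0  # followed to the horizon with no relapse by that time
    return None


# Made-up (RFS in months, relapse observed) pairs, not patient data.
patients = [(8.0, True), (18.0, True), (10.0, False), (36.0, False)]
labels_1y = [relapse_label(t, e, 12) for t, e in patients]
labels_2y = [relapse_label(t, e, 24) for t, e in patients]
print(labels_1y)
print(labels_2y)
```

Note that the same patient can be relapse-free at 1 year and relapsed at 2 years, which is why the two endpoints are modeled separately.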

Statistical analysis

Differences in clinical characteristics between the training set and the validation set, as well as between patients with and without 1- and 2-year relapse, were assessed using the independent-samples t test, Mann–Whitney U test, or χ² test, with the statistical significance level set at 0.05. Clinical variables that differed significantly (p < 0.05) between patients with and without 1- or 2-year relapse were selected as inputs for the corresponding predictive models.
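The univariate screening step can be sketched for a binary variable as below. This is a minimal Python illustration, not the authors' analysis: it uses Pearson's χ² statistic for a 2 × 2 table without continuity correction (an assumption, since the text does not state which variant was used), and the variable names and counts are made up.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, p-value) with 1 degree of freedom."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


# Made-up counts: rows = feature present/absent, columns = relapse yes/no.
tables = {
    "vascular_invasion": (30, 10, 40, 60),
    "tumor_site_head": (35, 30, 35, 40),
}
selected = [name for name, t in tables.items() if chi2_2x2(*t)[1] < 0.05]
print(selected)
```

Only variables passing the 0.05 threshold would be carried forward as model inputs; continuous variables would instead go through the t test or Mann–Whitney U test, which are not shown here.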

In our study, six algorithms were applied to build models for predicting 1- and 2-year relapse. In addition to the basic binary logistic regression (LR) model, five machine learning models were developed: random forest (RF), support vector machine (SVM), gradient boosting machine (GBM), neural network (NN) and k-nearest neighbor (KNN). RF and GBM are both tree-based ensemble algorithms. RF builds multiple decision trees on bootstrap samples and aggregates their decisions by averaging or majority voting [24]. GBM builds regression trees sequentially on the full data, constructing each new tree to be maximally correlated with the negative gradient of the loss function [25]. SVM provides two-class prediction by constructing the separating hyperplane that has the largest distance to the nearest training data points of each of the two classes [26]. The neural network algorithm recognizes potential relationships in a set of data by constructing a network composed of three main layers (input, hidden and output), whose main task is to transform raw input units into useful output units [27]. The KNN algorithm is based on analogical reasoning: it stores all the training data and classifies a new data point according to its similarity to the stored points [28].
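Of the algorithms described above, KNN is the simplest to write out in full. The following Python sketch shows the idea of storing the training data and classifying by similarity; it assumes Euclidean distance and an unweighted majority vote, and the function name `knn_predict` and the toy data are introduced here for illustration. It is not the 'caret' implementation used in the study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points
    (Euclidean distance)."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Rank all stored training points by distance to the query.
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy two-feature data: class 0 clusters near (0, 0), class 1 near (1, 1).
train_x = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (1.0, 1.0), (0.9, 1.2), (1.1, 0.8)]
train_y = [0, 0, 0, 1, 1, 1]
pred_a = knn_predict(train_x, train_y, (0.15, 0.2), k=3)
pred_b = knn_predict(train_x, train_y, (1.0, 0.9), k=3)
print(pred_a, pred_b)
```

Because the prediction depends directly on distances, features on larger numeric scales would dominate the vote unless the inputs are standardized first.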

To standardize the data, we centered and scaled the input features so that each had a mean of zero and a comparable range of values prior to modeling. Model tuning was carried out on the training set using repeated fivefold cross-validation. Repeated cross-validation means repeating the cross-validation procedure k times (k = 3 in this study), each time with different splits. The model assessment metric was calculated in each repetition and averaged to give the final result. Compared with performing cross-validation only once, repeated cross-validation gives a more stable estimate of the performance of a chosen model [29]. In each cross-validation, we tried all possible combinations of parameters by grid search. For each set of parameters, we used 4/5 of the data to fit the model and the remaining 1/5 to compute the performance measure. We selected accuracy as the performance measure, which was calculated 5 times and averaged to produce the performance score of each parameter set. The ranges of the training parameters for grid search are provided in Additional file 1: Table S1. Relative variable importance was calculated and plotted to assess the impact of each feature on the predictive models.
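The tuning procedure described above (standardization, fivefold cross-validation repeated 3 times, grid search on accuracy) can be sketched as follows. This is a Python illustration under stated assumptions rather than the authors' 'caret' workflow: the classifier is a small KNN with a single tuning parameter k, the scaler is fitted on each training fold only (a design choice made here, not stated in the text), and the synthetic data and all function names are hypothetical.

```python
import math
import random
import statistics
from collections import Counter


def fit_scaler(rows):
    """Column-wise mean and sample SD of a list of feature tuples."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.stdev(c) or 1.0 for c in cols]  # guard constant columns
    return means, sds


def transform(rows, scaler):
    """Center and scale each feature with a previously fitted scaler."""
    means, sds = scaler
    return [tuple((v - m) / s for v, m, s in zip(r, means, sds)) for r in rows]


def knn_predict(train_x, train_y, query, k):
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def repeated_cv_accuracy(x, y, k, folds=5, repeats=3, seed=0):
    """Mean accuracy over `repeats` differently shuffled `folds`-fold splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        idx = list(range(len(x)))
        rng.shuffle(idx)  # a new split for every repetition
        for f in range(folds):
            test = idx[f::folds]
            test_set = set(test)
            train = [i for i in idx if i not in test_set]
            scaler = fit_scaler([x[i] for i in train])  # training fold only
            tr_x = transform([x[i] for i in train], scaler)
            te_x = transform([x[i] for i in test], scaler)
            tr_y = [y[i] for i in train]
            hits = sum(knn_predict(tr_x, tr_y, q, k) == y[i]
                       for q, i in zip(te_x, test))
            scores.append(hits / len(test))
    return statistics.fmean(scores)


def grid_search(x, y, grid):
    """Return (best_k, {k: mean accuracy}) over the candidate values in `grid`."""
    results = {k: repeated_cv_accuracy(x, y, k) for k in grid}
    return max(results, key=results.get), results


# Synthetic two-class data with features on very different scales.
rng = random.Random(42)
x = [(rng.gauss(0, 1), rng.gauss(0, 10)) for _ in range(40)] + \
    [(rng.gauss(3, 1), rng.gauss(30, 10)) for _ in range(40)]
y = [0] * 40 + [1] * 40
best_k, results = grid_search(x, y, grid=[1, 3, 5, 7])
print(best_k, results)
```

Using the same seed for every candidate value means all parameter sets are scored on identical splits, so differences in the averaged accuracy reflect the parameter rather than the partition.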

The performance of the final models was assessed in the validation set. The evaluation indicators used to compare the models were AUROC, sensitivity, specificity, accuracy, positive predictive value (PPV), negative predictive value (NPV), F1 score and root mean squared error (RMSE). To further evaluate the models, we used bootstrap resampling (2000 replicates) to compute the 95% confidence interval (CI) of the AUROC and compared the AUROCs of the machine learning models using a 2-sided test. Finally, the 95% CIs of the AUROCs and the p values from the comparisons were plotted together. We determined the best machine learning models for prediction of 1- and 2-year relapse using the validation set. Calibration curves were constructed for the best machine learning models by regressing the observed outcomes against the model predictions. We also tried other variable sets as inputs for these machine learning models: (1) all 32 clinical variables and (2) variables selected by Lasso analysis with fivefold cross-validation.
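Several of the evaluation indicators listed above can be written out directly. The Python sketch below computes the AUROC as a rank statistic, a percentile bootstrap CI with 2000 replicates, and the threshold-dependent metrics from a confusion matrix. The percentile method, the skipping of single-class resamples and the 0.5 threshold are assumptions made for illustration; the text does not specify them, and RMSE and the 2-sided comparison test are not shown. All names and the toy scores are hypothetical.

```python
import random


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outranks a random
    negative (ties count one half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC; resamples that contain only
    one class are skipped and redrawn."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


def confusion_metrics(y_true, y_pred):
    """Threshold-dependent metrics from hard 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    def ratio(a, b):
        return a / b if b else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy outcomes and predicted risks, not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
risk = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = auroc(y_true, risk)
ci_low, ci_high = bootstrap_ci(y_true, risk)
metrics = confusion_metrics(y_true, [int(r >= 0.5) for r in risk])
print(auc, (ci_low, ci_high), metrics)
```

The AUROC is independent of any cut-off, whereas sensitivity, specificity, PPV, NPV, accuracy and F1 all change with the threshold applied to the predicted risk.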

All statistical analyses were performed with R 4.0.2. The R package ‘caret’ was used for data pre-processing, model training (SVM and KNN), and calculation of variable importance. The R packages ‘randomForest’, ‘gbm’ and ‘nnet’ were used to train the RF, GBM and NN models, respectively. Lasso analysis was performed with the R package ‘glmnet’.

Results

Basic characteristics

The clinicopathological characteristics of the training set and validation set are shown in Table 1. The 183 patients from the Second Affiliated Hospital of Zhejiang University School of Medicine formed the training set. The 70 patients from the Cancer Hospital of the University of Chinese Academy of Sciences and the 9 patients from the Fourth Affiliated Hospital of Zhejiang University School of Medicine formed the external independent validation cohort. Several clinical features differed significantly between the training and validation datasets, including globulin (Glb), albumin-globulin ratio (AGR), tumor differentiation, T-stage, lymph node status (N-stage) and vascular invasion (VI).

Table 1 Characteristics of the study population in training set and validation set

Comparisons of characteristics between patients with and without 1- or 2-year relapse in the training set are shown in Additional file 1: Tables S2 and S3, respectively. In the univariate analysis, significant differences were observed in CA19-9, N-stage, vascular invasion, adipose tissue invasion and differentiation for 1-year relapse, and in CA19-9, N-stage, vascular invasion, monocyte count, albumin and AGR for 2-year relapse. These variables were then used to construct the machine learning models for predicting the relapse risk of PDAC after radical surgery.

Model performance

Six models (LR, RF, SVM, GBM, KNN and NN) were built and externally validated; the optimal parameters of these models are shown in Additional file 1: Table S4. The relative importance of variables was calculated and is shown in Figs. 1 and 2. Pathological characteristics such as lymph node status (N-stage), tumor differentiation, and vascular invasion were found to have a major impact on most predictive models.

Comparisons of the ROC curves and AUROCs of the different models for predicting 1- and 2-year relapse in the training and validation sets are shown in Fig. 3 and Additional file 1: Figure S1. All six methods performed well in the training set, with the RF model outperforming the others. The highest accuracy and AUROC with RF were 78.4% and 0.834, respectively, for predicting 1-year relapse risk, and 95.1% and 0.998, respectively, for 2-year relapse risk. The lowest AUROC was obtained by LR (0.776) for 1-year relapse risk and by KNN (0.808) for 2-year relapse risk.

In the validation set, the SVM model performed better than the others for predicting 1-year relapse risk, with an accuracy of 70.9% and an AUROC of 0.733 (Table 2). The KNN model achieved the highest accuracy and AUROC for predicting 2-year relapse risk, at 73.4% and 0.689, respectively (Table 3). We further compared each of these two models with the rest using the AUROC. However, there was no statistically significant difference between RF and either of these two models, implying that these models might be similar in predictive power.

Table 2 Performance comparison of different models to predict 1-year relapse in the validation set

Table 3 Performance comparison of different models to predict 2-year relapse in the validation set

We also built models based on all 32 clinical variables or on the variables selected by Lasso analysis with fivefold cross-validation. Neither approach achieved better predictive performance (Additional file 1: Tables S5 and S6). We therefore retained the variables selected by univariate analysis, considering the simplicity and good performance of this approach.

Finally, we used calibration curves to assess the agreement between the predicted and observed risks of relapse of PDAC. In the training set, the risks estimated by the predictive models were adequately consistent with the observed outcomes. However, SVM and KNN showed relatively poorer calibration in the validation set, which may be due to its smaller sample size (Additional file 1: Figure S2).
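One common way to inspect calibration, shown here only as an illustration, is to group patients by predicted risk and compare the mean predicted risk in each group with the observed relapse fraction. The Python sketch below uses equal-width bins, which is an assumption and differs from the regression-based curves described in the Methods; the function name `calibration_bins` and the toy values are hypothetical.

```python
def calibration_bins(y_true, probs, n_bins=5):
    """Group predictions into equal-width probability bins and return, for each
    non-empty bin, (mean predicted risk, observed relapse fraction, count)."""
    bins = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, probs):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[b].append((p, t))
    out = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            observed = sum(t for _, t in members) / len(members)
            out.append((mean_p, observed, len(members)))
    return out


# Toy predicted risks and outcomes, not study data.
probs = [0.1, 0.15, 0.3, 0.35, 0.6, 0.65, 0.9, 0.95]
y_true = [0, 0, 0, 1, 1, 0, 1, 1]
curve = calibration_bins(y_true, probs, n_bins=4)
for mean_p, observed, count in curve:
    print(f"predicted {mean_p:.3f}  observed {observed:.2f}  n={count}")
```

A well-calibrated model gives points close to the diagonal (predicted equal to observed). With few patients per bin the observed fractions become noisy, which is consistent with calibration looking worse in a small validation set.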

Discussion

The development of predictive tools for individual relapse risk assessment after multimodal therapy may help to further optimize treatment decision-making [30]. In this study, we have constructed and validated comprehensive models integrating clinicopathological characteristics to predict the relapse risk of PDAC after radical resection. Machine learning techniques proved superior to conventional regression-based analyses in predictive performance. In accordance with various studies investigating the prognostic factors of PDAC [11, 31, 32], lymph node status (N-stage), vascular invasion and CA19-9 are independent predictors of both 1- and 2-year relapse. Although the RF model had the highest AUROC in the training set, the SVM model and the KNN model were more robust in predicting 1- and 2-year relapse, respectively, in the validation set.

Currently, lack of screening and early detection, the proneness to early relapse after radical resection and minimally effective systemic therapy remain major barriers to curing patients with PDAC [33]. Timely and accurate prediction of relapse, even after operative intervention, is difficult. Implementation of cutting-edge machine learning algorithms may help to identify at-risk patients, for whom more intensive surveillance, adjuvant treatment, or even inclusion in clinical trials may be considered. Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications across almost every domain of medicine [34,35,36]. As an important branch of AI, machine learning allows computers to train models using large numbers of examples and may detect difficult-to-recognize patterns in complex datasets [37]. Unlike conventional regression-based approaches, machine learning algorithms are capable of capturing higher-order, non-linear interactions between predictors [38]. As a widely used model in biomedical analytics, SVM constructs a hyperplane, or a set of hyperplanes, in a high- or infinite-dimensional space, and fits linear or nonlinear models that most effectively discriminate between the values of a binary output variable [39]. Its effectiveness has been demonstrated in studies predicting the recurrence of various diseases [40,41,42]. KNN is another established method for classification and regression, and reports have demonstrated its promising role in prognostic research [43,44,45]. In KNN, it can be useful to weight the contributions of the neighbours so that nearer neighbours contribute more to the average than more distant ones [46]. Our study allowed for the comparison of multiple learning algorithms to identify the approach with the most favorable performance.

To the best of our knowledge, this is the first study to develop and compare machine learning-based models to predict the relapse risk of pancreatic ductal adenocarcinoma after radical resection using multi-institutional datasets. Predictive nomograms based on conventional regression methods have been built for early recurrence after pancreatectomy in resectable pancreatic cancer [9, 12]. Kim et al. established a nomogram to predict the probability of recurrence within 12 months after surgery at a single medical center, with an AUROC of 0.655 [9]. In our study, by contrast, we constructed and externally validated an SVM model for 1-year relapse risk with an AUROC of 0.733 and a KNN model for 2-year relapse risk with an AUROC of 0.689, using stringent statistical methods. Another work, by Guo et al., redefined early recurrence as recurrence within the first 162 days postoperatively on the basis of its own cohort, which makes it difficult to compare results across studies and institutions [12]. Understandably, that study did not include histopathologic data in its Cox proportional hazards regression model, because its purpose was to guide preoperative decision-making concerning the use of neoadjuvant therapy. Other reports on this topic have their own drawbacks, with either a very small sample size of fewer than 40 patients [30] or a lack of external validation [10]. In addition, recent research has revealed links between radiomics and underlying tumor biology in PDAC, with radiomic features strongly correlated with tumor phenotype [47], response to treatment [48], and prognosis [49,50,51]. However, image texture analysis and manual contouring of regions of interest (ROIs) are still tedious, laborious and time-consuming, which is inconvenient for clinical practice at present and leaves ample room for improvement.

Certain limitations of this study and its results need to be discussed. First, given the retrospective nature of our study, some selection bias might be present. Second, given the low incidence of PDAC, the sample sizes of the training and validation datasets were relatively limited, which might impair the accuracy of quantifying interpatient variability. Both models showed high sensitivity at the cost of some specificity, a trade-off that depends on the threshold selected for binary classification [52]. Larger and more balanced cohorts will be collected from multiple medical centers in the future to further establish the robustness of the proposed models. Third, limitations in the interpretability of the inner workings of models currently pose a severe bottleneck in implementing cutting-edge machine learning techniques in biomedical research [34, 53]. We need to keep pursuing a better understanding of the complex and evolving relationship between physicians and human-centred AI tools in the live clinical environment, thus providing better outcomes to our patients [54].

In conclusion, we employed machine learning algorithms to construct models that integrate clinicopathological characteristics to predict the relapse risk of PDAC after radical resection, and we externally validated their predictive capacity in independent groups from other medical institutions. Machine learning systems can provide critical prognostic prediction for patients with PDAC after radical resection, and the use of predictive algorithms may offer promising clinical decision support for both practitioners and patients.

Availability of data and materials

The datasets generated and analysed during the current study are available in Code Ocean (https://codeocean.com/capsule/2968380/tree).

Abbreviations

PDAC: Pancreatic ductal adenocarcinoma
AI: Artificial intelligence
CT: Computed tomography
MRI: Magnetic resonance imaging
PET: Positron emission tomography
RFS: Relapse-free survival
OS: Overall survival
RF: Random forest
SVM: Support vector machine
GBM: Gradient boosting machine
NN: Neural network
KNN: K-nearest neighbor algorithm
AUROC: Area under the receiver operating characteristic curve
PPV: Positive predictive value
NPV: Negative predictive value
RMSE: Root mean squared error
CI: Confidence interval
BMI: Body mass index
CEA: Carcinoembryonic antigen
CA: Cancer antigen
WBC: White blood cell
Hb: Hemoglobin
Plt: Platelet
Neut: Neutrophil
Lymph: Lymphocyte
Mono: Monocyte
Alb: Albumin
Glb: Globulin
AGR: Albumin-globulin ratio
NLR: Neutrophil–lymphocyte ratio
LMR: Lymphocyte-monocyte ratio
PLR: Platelet-lymphocyte ratio
AST: Aspartate transaminase
ALT: Alanine transaminase
ALP: Alkaline phosphatase
GGT: Gamma-glutamyltransferase
TB: Total bilirubin
DB: Direct bilirubin
VI: Vascular invasion
PI: Perineural invasion
ATI: Adipose tissue invasion

References

1. Chen WQ, Zheng RS, Baade PD, Zhang SW, Zeng HM, Bray F, et al. Cancer Statistics in China, 2015. Cancer J Clin. 2016;66(2):115–32.
2. Nevala-Plagemann C, Hidalgo M, Garrido-Laguna I. From state-of-the-art treatments to novel therapies for advanced-stage pancreatic cancer. Nat Rev Clin Oncol. 2020;17(2):108–23.
3. Aier I, Semwal R, Sharma A, Varadwaj PK. A systematic assessment of statistics, risk factors, and underlying features involved in pancreatic cancer. Cancer Epidemiol. 2019;58:104–10.
4. Katz MHG, Wang H, Fleming JB, Sun CC, Hwang RF, Wolff RA, et al. Long-term survival after multidisciplinary management of resected pancreatic adenocarcinoma. Ann Surg Oncol. 2009;7:25.
5. Ferrone CR, Pieretti-Vanmarcke R, Bloom JP, Zheng H, Szymonifka J, Wargo JA, et al. Pancreatic ductal adenocarcinoma: long-term survival does not equal cure. Surgery. 2012;152:S43-9.
6. He J, Ahuja N, Makary MA, Cameron JL, Eckhauser FE, Choti MA, et al. 2564 resected periampullary adenocarcinomas at a single institution: trends over three decades. HPB. 2014;17:325.
7. Ellison LF, Wilkins K. An update on cancer survival. Health Rep. 2010;21(3):55–60.
8. Kawakami E, Tabata J, Yanaihara N, Ishikawa T, Koseki K, Iida Y, et al. Application of artificial intelligence for preoperative diagnostic and prognostic prediction in epithelial ovarian cancer based on blood biomarkers. Clin Cancer Res. 2019;25(10):3006–15.
9. Kim N, Han IW, Ryu Y, Hwang DW, Heo JS, Choi DW, et al. Predictive nomogram for early recurrence after pancreatectomy in resectable pancreatic cancer: Risk classification using preoperative clinicopathologic factors. Cancers. 2020;12:18.
10. He C, Huang X, Zhang Y, Cai Z, Lin X, Li S. A quantitative clinicopathological signature for predicting recurrence risk of pancreatic ductal adenocarcinoma after radical resection. Front Oncol. 2019;9:87.
11. He C, Sun S, Zhang Y, Lin X, Li S. A novel nomogram to predict survival in patients with recurrence of pancreatic ductal adenocarcinoma after radical resection. Front Oncol. 2020;10:147.
12. Guo SW, Shen J, Gao JH, Shi XH, Gao SZ, Wang H, et al. A preoperative risk model for early recurrence after radical resection may facilitate initial treatment decisions concerning the use of neoadjuvant therapy for patients with pancreatic ductal adenocarcinoma. Surgery. 2020;168(6):1003–14.
13. Wei R, Wang J, Wang X, Xie G, Wang Y, Zhang H, et al. Clinical prediction of HBV and HCV related hepatic fibrosis using machine learning. EBioMedicine. 2018;35:124–32.
14. Jurmeister P, Bockmayr M, Seegerer P, Bockmayr T, Treue D, Montavon G, et al. Machine learning analysis of DNA methylation profiles distinguishes primary lung squamous cell carcinomas from head and neck metastases. Sci Transl Med. 2019;11:509.
15. Xu R-H, Wei W, Krawczyk M, Wang W, Luo H, Flagg K, et al. Circulating tumour DNA methylation markers for diagnosis and prognosis of hepatocellular carcinoma. Nat Mater. 2017;8:54.
16. Motwani M, Dey D, Berman DS, Germano G, Achenbach S, Al-Mallah MH, et al. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur Heart J. 2017;38(7):500–7.
17. Singal AG, Mukherjee A, Joseph Elmunzer B, Higgins PDR, Lok AS, Zhu J, et al. Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma. Am J Gastroenterol. 2013;108(11):1723–30.
18. Pulvirenti A, Javed AA, Landoni L, Jamieson NB, Chou JF, Miotto M, et al. Multi-institutional development and external validation of a nomogram to predict recurrence after curative resection of pancreatic neuroendocrine tumors. Ann Surg. 2019;10:1–7.
19. Tempero MA, Malafa MP, Chiorean EG, Czito B, Scaife C, Narang AK, et al. Pancreatic adenocarcinoma, version 1.2019 featured updates to the NCCN guidelines. JNCCN. 2019;17(3):203–10.
20. He J, Pan H, Liang W, Xiao D, Chen X, Guo M, et al. Prognostic effect of albumin-to-globulin ratio in patients with solid tumors: a systematic review and meta-analysis. J Cancer. 2017;8(19):4002–10.
21. Goto W, Kashiwagi S, Asano Y, Takada K, Takahashi K, Hatano T, et al. Predictive value of lymphocyte-to-monocyte ratio in the preoperative setting for progression of patients with breast cancer. BMC Cancer. 2018;18(1):1137.
22. Tong Z, Liu L, Zheng Y, Jiang W, Zhao P, Fang W, et al. Predictive value of preoperative peripheral blood neutrophil/lymphocyte ratio for lymph node metastasis in patients of resectable pancreatic neuroendocrine tumors: A nomogram-based study. World J Surg Oncol. 2017;15(1):1–9.
23. Wang C, He W, Yuan Y, Zhang Y, Li K, Zou R, et al. Comparison of the prognostic value of inflammation-based scores in early recurrent hepatocellular carcinoma after hepatectomy. Liver Int. 2020;9:547.
24. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
25. Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001;29(5):1189–232.
26. Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–97.
27. Cross SS, Harrison RF, Kennedy RL. Introduction to neural networks. The Lancet. 1995;346(8982):1075–9.
28. Altman NS. An introduction to kernel and nearest-neighbor nonparametric regression. Am Stat. 1992;46(3):175–85.
29. Moss HB, Leslie DS, Rayson P. Using J-K-fold cross validation to reduce variance when tuning NLP models. BMC. 2018;5:2978–89.
30. Sala Elarre P, Oyaga-Iriarte E, Yu KH, Baudin V, Arbea Moreno L, Carranza O, et al. Use of machine-learning algorithms in intensified preoperative therapy of pancreatic cancer to predict individual risk of relapse. Cancers. 2019;11(5):606.
31. Song W, Miao DL, Chen L. Nomogram for predicting survival in patients with pancreatic cancer. Onco Targets Ther. 2018;11:539–45.
32. De CMM, Biere SSAY, Lagarde SM, Busch ORC, Van GM, Gouma DJ. Validation of a nomogram for predicting survival after resection for adenocarcinoma of the pancreas. Br J Surg. 2009;96(4):417–23.
33. Groot VP, Rezaee N, Wu W, Cameron JL, Fishman EK, Hruban RH, et al. Patterns, timing, and predictors of recurrence following pancreatectomy for pancreatic ductal adenocarcinoma. Ann Surg. 2018;267(5):936–45.
34. Kelly CJ, Karthikesalingam A, Suleyman M, Corrado G, King D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 2019;17:1–9.
35. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44–56.
36. Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, et al. A guide to deep learning in healthcare. Nat Med. 2019;25(1):24–9.
37. Bohr A, Memarzadeh K. The rise of artificial intelligence in healthcare applications. PLoS ONE. 2020;4:25–60.
38. Zeevi D, Korem T, Zmora N, Israeli D, Rothschild D, Weinberger A, et al. Personalized nutrition by prediction of glycemic responses. Cell. 2015;163(5):1079–94.
39. Kim W, Kim KS, Lee JE, Noh D-Y, Kim S-W, Jung YS, et al. Development of novel breast cancer recurrence prediction model using support vector machine. J Breast Cancer. 2012;15(2):230.
40. Liang J-D, Ping X-O, Tseng Y-J, Huang G-T, Lai F, Yang P-M. Recurrence predictive models for patients with hepatocellular carcinoma after radiofrequency ablation using support vector machines with feature selection methods. Comput Methods Prog Biomed. 2014;117(3):425–34.
41. Tseng C-J, Lu C-J, Chang C-C, Chen G-D. Application of machine learning to predict the recurrence-proneness for cervical cancer. Neural Comput Appl. 2014;24(6):1311–6.
42. Lg A, At E. Using three machine learning techniques for predicting breast cancer recurrence. J Health Med Inf. 2013;04(02):2–4.
43. Medjahed SA, Saadi TA, Benyettou A. Breast cancer diagnosis by using k-nearest neighbor with different distances and classification rules. Int J Comput Appl. 2013;62:18.
44. Li C, Zhang S, Zhang H, Pang L, Lam K, Hui C, et al. Using the K-nearest neighbor algorithm for the classification of lymph node metastasis in gastric cancer. Comput Math Methods Med. 2012;2012:77.
45. Atallah DM, Badawy M, El-Sayed A, Ghoneim MA. Predicting kidney transplantation outcome based on hybrid feature selection and KNN classifier. Multimedia Tools Appl. 2019;78(14):20383–407.
46. Rana M, Chandorkar P, Dsouza A, Kazi N. Breast cancer diagnosis and recurrence prediction using machine learning techniques. IJRET. 2015;8:2319–1163.
47. Lim CH, Cho YS, Choi JY, Lee KH, Lee JK, Min JH, et al. Imaging phenotype using 18F-fluorodeoxyglucose positron emission tomography–based radiomics and genetic alterations of pancreatic ductal adenocarcinoma. Eur J Nucl Med Mol Imaging. 2020;47(9):2113–22.
48. Nasief H, Zheng C, Schott D, Hall W, Tsai S, Erickson B, et al. A machine learning based delta-radiomics process for early prediction of treatment response of pancreatic cancer. NPJ Precis Oncol. 2019;3(1):1–10.
49. Kaissis G, Ziegelmayer S, Lohöfer F, Algül H, Eiber M, Weichert W, et al. A prospectively validated machine learning model for the prediction of survival and tumor subtype in pancreatic ductal adenocarcinoma. BMC Med. 2019;17:1–9.
Google Scholar
Hwang SH, Kim HY, Lee EJ, Hwang HK, Park M-S, Kim M-J, et al. Preoperative clinical and computed tomography (CT)-based nomogram to predict oncologic outcomes in patients with pancreatic head cancer resected with curative intent: a retrospective study. J Clin Med. 2019;8(10):1749.
Article PubMed Central Google Scholar
Yun G, Kim YH, Lee YJ, Kim B, Hwang JH, Choi DJ. Tumor heterogeneity of pancreas head cancer assessed by CT texture analysis: Association with survival outcomes after curative resection. Sci Rep. 2018;8(1):1–10.
Article Google Scholar
Lu CF, Hsu FT, Hsieh KL, Kao YJ, Cheng SJ, Hsu JB, et al. Machine learning-based radiomics for molecular subtyping of gliomas. Clin Cancer Res. 2018;24(18):4429–36.
Article PubMed Google Scholar
Manamley N, Mallett S, Sydes MR, Hollis S, Scrimgeour A, Burger HU, et al. Data sharing and the evolving role of statisticians. BMC Med Res Methodol. 2016;16(Suppl 1):75.
Article PubMed PubMed Central Google Scholar
Ahuja AS. The impact of artificial intelligence in medicine on the future role of the physician. PeerJ. 2019;2019:10.
Google Scholar

Download references

Acknowledgements

Not applicable.

Funding

This work was supported by the General Program of National Natural Science Foundation of China under Grant [Grant Number: 81772562, 2017] (Yulian Wu) and the Fundamental Research Funds for the Central Universities [Grant Number: 2021FZZX005-08] (Xiawei Li).

Author information

Authors and Affiliations

Department of Surgery, Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310000, Zhejiang, China
Xiawei Li, Jianyao Lou, Aiguang Shi & Yulian Wu
Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Cancer Institute, Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310000, Zhejiang, China
Xiawei Li, Jianyao Lou, Aiguang Shi & Yulian Wu
Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
Xiawei Li, Jianyao Lou, Aiguang Shi & Yulian Wu
Department of Surgery, Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, 310000, Zhejiang, China
Litao Yang
Hessian Health Technology Co., Ltd, Beijing, 100007, China
Zheping Yuan & Mingchen Zhao
Department of Surgery, Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, 322000, Zhejiang, China
Yiqun Fan
Department of Surgery, Changxing People’s Hospital, Huzhou, 313100, Zhejiang, China
Junjie Huang

Contributions

XL: conceptualization, methodology, formal analysis, writing-original draft, writing-review & editing. LY: resources, writing-review & editing. ZY: writing-original draft, formal analysis. JL: investigation. YF: resources, data curation. AS: data curation, methodology. JH, MZ: data curation. YW: funding acquisition, project administration, resources, supervision. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yulian Wu.

Ethics declarations

Ethics approval and consent to participate

The study was approved by the Institutional Review Boards of the 3 institutions. No additional patient consent was required because the medical records were reviewed retrospectively.

Consent for publication

All the authors agree to the publication of this work.

Competing interests

The authors declare no potential conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:
Figure S1. Comparisons of AUROC of different models to predict 1- and 2-year relapse in training set (a: 1-year relapse, b: 2-year relapse).
Figure S2. Calibration curves of SVM model (A: training set; B: validation set) to predict 1-year relapse and KNN model (C: training set; D: validation set) to predict 2-year relapse.
Table S1. The ranges of training parameters for grid search in different models.
Table S2. Comparison of characteristics between patients with and without 1-year relapse in training set.
Table S3. Comparison of characteristics between patients with and without 2-year relapse in training set.
Table S4. The optimal parameters for different models to predict relapse risks of PDAC.
Table S5. Performance of models built on all 32 variables in the validation set.
Table S6. Performance of models built on variables from lasso analysis in the validation set.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Li, X., Yang, L., Yuan, Z. et al. Multi-institutional development and external validation of machine learning-based models to predict relapse risk of pancreatic ductal adenocarcinoma after radical resection. J Transl Med 19, 281 (2021). https://doi.org/10.1186/s12967-021-02955-7

Received: 30 March 2021
Accepted: 19 June 2021
Published: 30 June 2021
DOI: https://doi.org/10.1186/s12967-021-02955-7
