Multi-institutional development and external validation of machine learning-based models to predict relapse risk of pancreatic ductal adenocarcinoma after radical resection

Abstract

Background

Surgical resection is the only potentially curative treatment for pancreatic ductal adenocarcinoma (PDAC) and the survival of patients after radical resection is closely related to relapse. We aimed to develop models to predict the risk of relapse using machine learning methods based on multiple clinical parameters.

Methods

Data from 262 PDAC patients who underwent radical resection at 3 institutions between 2013 and 2017 were collected and analysed; 183 patients from one institution formed the training set and 79 from the other 2 institutions formed the validation set. We developed and compared several predictive models for 1- and 2-year relapse risk using machine learning approaches.

Results

Machine learning techniques were superior to conventional regression-based analyses in predicting the risk of relapse of PDAC after radical resection. Among them, the random forest (RF) model outperformed the other methods in the training set. With RF, the highest accuracy and area under the receiver operating characteristic curve (AUROC) for predicting 1-year relapse risk were 78.4% and 0.834, respectively, and for 2-year relapse risk were 95.1% and 0.998. However, in the validation set, the support vector machine (SVM) model performed best for predicting 1-year relapse risk, and the k-nearest neighbor (KNN) model achieved the highest accuracy and AUROC for predicting 2-year relapse risk.

Conclusions

Using machine learning, this study developed and validated comprehensive models integrating clinicopathological characteristics to predict the relapse risk of PDAC after radical resection, which may guide the development of personalized surveillance programs after surgery.

Introduction

Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal human malignant diseases worldwide and the sixth leading cause of cancer-related deaths in China [1]. So far, radical resection followed by adjuvant chemotherapy has been the only potentially curative treatment [2]. However, only a minority of patients present with a tumor suitable for this combination therapy at diagnosis, due to lack of early clinical symptoms and effective screening approaches [3]. Even after curative resection, up to 80% of patients will suffer from disease relapse resulting in a 5-year survival of only 20–30% [4,5,6,7]. Hence, the survival of patients with resectable PDAC is closely related to recurrence. It is necessary and urgent to build robust models to identify those patients with increased risk of relapse and further optimize treatment decision-making.

Nowadays, development of methods to predict treatment outcomes and prognosis is an important paradigm in the realm of personalized medicine [8]. Several studies have shown comparable prediction accuracy by using traditional regression-based statistical methods on a basis of a combination of biomarkers and multiple clinical factors [9,10,11,12]. However, common statistical methods familiar to clinicians ignore more complex non-linear interactions between variables that might play significant roles in the potential of future relapse, and which could be captured using more sophisticated modeling approaches [13]. In recent years, machine learning, as a branch of artificial intelligence (AI) technology, has attracted extensive interest in developing clinical predictive tools for diagnosis, staging and prognosis of various diseases [14,15,16]. It has been successfully applied for recognizing hidden patterns in complex data, allowing for better predictions of clinical outcomes than conventional statistical models, especially when applied to large-scale datasets [17].

Thus, the aim of this study was to develop, and externally validate, new cutting-edge machine learning-based models that accurately predict 1- and 2-year relapse of PDAC using clinicopathological factors in patients with resectable disease. Predicting the risk of relapse offers the potential to improve personalized surveillance schedules, determine clinical trial eligibility and compare results across studies and different institutions [18].

Materials and methods

Study population

Data of PDAC patients who underwent radical resection at 3 institutions between January 2013 and December 2017 were obtained. The study was approved by the Institutional Review Boards of the 3 institutions; no additional patient consent was required because the medical records were reviewed retrospectively. As this study aimed to build models based on preoperative clinical and pathological factors affecting relapse risk after surgery in resectable PDAC, patients who initially had borderline resectable or unresectable cancers according to the NCCN guideline [19] or who received neoadjuvant therapy were excluded, as were those who were lost to follow-up or lacked complete clinical data. A total of 262 patients met the inclusion criteria: 183 from the Second Affiliated Hospital of Zhejiang University School of Medicine, 70 from the Cancer Hospital of the University of Chinese Academy of Sciences and 9 from the Fourth Affiliated Hospital of Zhejiang University School of Medicine.

Data collection

Preoperative blood biomarkers including carcinoembryonic antigen (CEA), CA199, CA125, white blood cell (WBC) count, hemoglobin (Hb), platelet (Plt) count, neutrophil (Neut) count, lymphocyte (Lymp) count, monocyte (Mono) count, albumin (Alb), globulin (Glb), aspartate transaminase (AST), alanine transaminase (ALT), alkaline phosphatase (ALP), gamma-glutamyltransferase (GGT), total bilirubin (TB) and direct bilirubin (DB) were collected using the measurements closest to the operation and within 1 week before surgery. Inflammation-based prognostic scores, including the albumin-globulin ratio (AGR) [20], lymphocyte-monocyte ratio (LMR) [21], neutrophil–lymphocyte ratio (NLR) [22] and platelet-lymphocyte ratio (PLR) [23], were calculated. Additionally, pathological diagnosis and description were carried out by experienced pancreatic pathologists at the 3 institutions, covering surgical margin status, tumor site, tumor size, tumor differentiation, T-stage, lymph node status (N-stage), vascular invasion, perineural invasion and adipose tissue invasion.
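The four inflammation-based scores are simple ratios of these laboratory values. A minimal sketch (in Python for illustration; the study's analyses were done in R, and the example values below are hypothetical):

```python
def inflammation_scores(alb, glb, lymp, mono, neut, plt):
    """Compute the four inflammation-based prognostic scores from
    albumin, globulin, and peripheral blood cell counts."""
    return {
        "AGR": alb / glb,    # albumin-globulin ratio
        "LMR": lymp / mono,  # lymphocyte-monocyte ratio
        "NLR": neut / lymp,  # neutrophil-lymphocyte ratio
        "PLR": plt / lymp,   # platelet-lymphocyte ratio
    }

# Hypothetical patient: Alb/Glb in g/L, cell counts in 10^9/L
scores = inflammation_scores(alb=40.0, glb=25.0, lymp=1.5,
                             mono=0.5, neut=4.5, plt=225.0)
# AGR = 1.6, LMR = 3.0, NLR = 3.0, PLR = 150.0
```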

After surgery, patients were followed up every 3 months for the first 2 years, every 6 months during years 3 and 4, and annually thereafter. The surveillance protocol included physical examination, serum CA19-9 level and contrast-enhanced abdominopelvic computed tomography (CT). When imaging features were consistent with cancer recurrence, magnetic resonance imaging (MRI) and/or fluorodeoxyglucose positron emission tomography (PET) was carried out, if necessary, to clarify ambiguous CT findings. Relapse-free survival (RFS) and overall survival (OS) were defined as the duration from the date of surgery until the date a relapse was diagnosed or the date of death, respectively, or until last follow-up.

Statistical analysis

Differences in clinical characteristics between the training set and the validation set, as well as between patient groups with or without 1- and 2-year relapse, were assessed using the independent-sample t test, Mann–Whitney U test, or χ2 test, with statistical significance set at 0.05. Clinical variables that differed significantly (p < 0.05) between patient groups with or without 1- and 2-year relapse were selected as inputs for the predictive models.
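This univariate screening step can be sketched as follows (a Python/scipy illustration of the procedure, not the authors' R code; the function and variable names are hypothetical):

```python
import numpy as np
from scipy import stats

def screen_features(X_cont, X_cat, y, alpha=0.05):
    """Univariate screening: Mann-Whitney U test for continuous variables,
    chi-square test for categorical ones; keep features with p < alpha.
    X_cont / X_cat map feature name -> 1-D array; y is the binary relapse label."""
    selected = []
    for name, x in X_cont.items():
        _, p = stats.mannwhitneyu(x[y == 1], x[y == 0], alternative="two-sided")
        if p < alpha:
            selected.append(name)
    for name, x in X_cat.items():
        # Build the 2 x k contingency table of class label vs. category
        table = np.array([[np.sum((x == v) & (y == c)) for v in np.unique(x)]
                          for c in (0, 1)])
        _, p, _, _ = stats.chi2_contingency(table)
        if p < alpha:
            selected.append(name)
    return selected
```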

In our study, six algorithms were applied to build models for predicting 1- and 2-year relapse. In addition to the basic binary LR model, several machine learning models were developed: random forest (RF), support vector machine (SVM), gradient boosting machine (GBM), neural network (NN) and k-nearest neighbor (KNN). RF and GBM are both tree-based ensemble algorithms. RF creates multiple decision tree models from bootstrap samples and aggregates their decisions through averaging or majority voting [24]. GBM uses all the data to build a regression tree model from the beginning, and constructs each new model to be maximally correlated with the negative gradient of the loss function [25]. SVM provides two-class prediction by constructing the separating hyperplane with the largest distance to the nearest training data points of each class [26]. The neural network algorithm recognizes potential relationships in a dataset through a network structure composed of three main layers (input, hidden and output); its main task is to transform raw input units into useful output units [27]. The k-nearest neighbor algorithm is based on analogical reasoning: it stores all the training data and classifies a new data point based on similarity measures [28].
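In scikit-learn terms the six classifier families can be instantiated as below. This Python sketch is only illustrative: the study used the R packages listed later, and the hyperparameter values here are generic defaults, not the tuned ones from the paper.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier

models = {
    "LR": LogisticRegression(max_iter=1000),
    # Bootstrap-sampled trees aggregated by majority vote
    "RF": RandomForestClassifier(n_estimators=500, random_state=0),
    # Each new tree fits the negative gradient of the loss
    "GBM": GradientBoostingClassifier(n_estimators=100, learning_rate=0.1),
    # Maximum-margin separating hyperplane
    "SVM": SVC(kernel="rbf", probability=True, random_state=0),
    # Input -> hidden -> output layers
    "NN": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    # Classify by similarity to stored training points
    "KNN": KNeighborsClassifier(n_neighbors=5),
}
```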

For data standardization, we centered and scaled the input features to the same range of values with a mean of zero prior to modeling. Model tuning was carried out using repeated fivefold cross-validation on the training set. Repeated cross-validation means repeating the cross-validation procedure several times (3 times in this study), each time with different splits; the model assessment metric is calculated in each repetition and averaged to give the final result. Compared with performing cross-validation only once, repeated cross-validation can improve the estimated performance of a chosen model [29]. In each cross-validation, we tried all possible combinations of parameters by grid search. For each set of parameters, we used 4/5 of the data to fit the model and the remaining 1/5 to compute the performance measure. We selected accuracy as the performance measure, which was calculated 5 times and averaged to produce the performance score of each parameter set. The ranges of training parameters for grid search are provided in Additional file 1: Table S1. Relative variable importance was calculated and plotted to show the impact of features on the predictive models.
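The tuning procedure above can be sketched with scikit-learn equivalents (illustrative only; the study used caret in R). KNN is used as the example estimator and the candidate neighbor counts are hypothetical:

```python
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Scaling lives inside the pipeline so each CV fold is standardized
# using only its own 4/5 training portion.
pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])

# Fivefold CV repeated 3 times with different splits; accuracy is
# averaged over the held-out folds for every parameter combination.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
grid = GridSearchCV(pipe,
                    param_grid={"knn__n_neighbors": [3, 5, 7, 9]},  # illustrative range
                    scoring="accuracy", cv=cv)
```

Calling `grid.fit(X_train, y_train)` then exhaustively evaluates every parameter set and retains the one with the best mean accuracy.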

The performance of the final models was assessed in the validation set. The evaluation indicators used to compare models were AUROC, sensitivity, specificity, accuracy, positive predictive value (PPV), negative predictive value (NPV), F1 score and root mean squared error (RMSE). To further evaluate the models, we used bootstrap resampling (2000 times) to compute the 95% confidence interval (CI) of the AUROC and compared the AUROCs of the machine learning models using a two-sided test. Finally, the 95% CIs of the AUROCs and the p values from these comparisons were plotted together. We determined the best machine learning models for prediction of 1- and 2-year relapse with the validation set. Calibration curves were constructed to regress observed data against the fits of the best machine learning models. We also tried other variable sets as inputs for these ML models: (1) all 32 clinical variables, and (2) variables obtained through fivefold cross-validation Lasso analysis.
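A bootstrap percentile CI for the AUROC can be computed as sketched below (a Python illustration of the general technique; the paper's exact resampling implementation in R is not specified):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auroc_ci(y_true, y_score, n_boot=2000, seed=0):
    """Percentile 95% CI for the AUROC via bootstrap resampling
    (with replacement) of the validation set."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    aucs = []
    while len(aucs) < n_boot:
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:  # resample must contain both classes
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.percentile(aucs, [2.5, 97.5])
```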

All statistical analysis was performed with R 4.0.2. The R package 'caret' was used for data pre-processing, model training (SVM and KNN), and calculation of variable importance. The R packages 'randomForest', 'gbm' and 'nnet' were used for RF, GBM and NN model training, respectively. Lasso analysis was performed with the R package 'glmnet'.

Results

Basic characteristics

The clinicopathological characteristics of the training set and validation set are shown in Table 1. The 183 patients from the Second Affiliated Hospital of Zhejiang University School of Medicine were included as the training set; the 70 from the Cancer Hospital of the University of Chinese Academy of Sciences and the 9 from the Fourth Affiliated Hospital of Zhejiang University School of Medicine served as the external independent validation cohort. Several clinical features differed significantly between the training and validation datasets, including globulin (Glb), albumin-globulin ratio (AGR), tumor differentiation, T-stage, lymph node status (N-stage) and vascular invasion (VI).

Table 1 Characteristics of the study population in training set and validation set

Comparisons of characteristics between patients with and without 1- or 2-year relapse in the training set are shown in Additional file 1: Tables S2 and S3, respectively. In the univariate analysis, significant differences were observed in CA199, N-stage, vascular invasion, adipose tissue invasion and differentiation for 1-year relapse, and in CA199, N-stage, vascular invasion, monocyte count, albumin and AGR for 2-year relapse. These variables were then included in the construction of machine learning models to predict the relapse risk of PDAC after radical surgery.

Model performance

Six models including LR, RF, SVM, GBM, KNN and NN were built and externally validated; the optimal parameters of these models are shown in Additional file 1: Table S4. The relative importance of variables was calculated and is shown in Figs. 1 and 2. Pathological characteristics such as lymph node status (N-stage), tumor differentiation and vascular invasion had a major impact on most predictive models.

Fig. 1

Relative importance of variables in models to predict 1-year relapse. Interpretation: N2 = N stage 1, N3 = N stage 2; grade 2 = moderate differentiation, grade 3 = poor differentiation or undifferentiated; ATI1 = with adipose tissue invasion; VI1 = with vascular invasion; CA1991 = CA199 ≥ 37 U/mL

Fig. 2

Relative importance of variables in models to predict 2-year relapse. Interpretation: VI1 = with vascular invasion; N2 = N stage 1, N3 = N stage 2; Mono = monocyte; Alb = albumin; AGR = albumin-globulin ratio; CA1991 = CA199 ≥ 37 U/mL

Comparisons of the ROC curves and AUROCs of the different models for predicting 1- and 2-year relapse in the training and validation sets are shown in Fig. 3 and Additional file 1: Figure S1. All six methods performed excellently in the training set, among which the RF model was best: its accuracy and AUROC were 78.4% and 0.834, respectively, for predicting 1-year relapse risk, and 95.1% and 0.998 for 2-year relapse risk. LR obtained the lowest AUROC (0.776) for predicting 1-year relapse risk and KNN the lowest (0.808) for 2-year relapse risk.

Fig. 3

Comparisons of ROC curves and AUROC of different models to predict 1- and 2-year relapse in training cohort and validation sets (1-year relapse: training set: A, validation set: B, comparison of AUROC in validation set: C; 2-year relapse: training set: D, validation set: E, comparison of AUROC in validation set: F)

In the validation set, the SVM model showed better performance than the others for predicting 1-year relapse risk, with an accuracy of 70.9% and an AUROC of 0.733 (Table 2). The KNN model achieved the highest accuracy (73.4%) and AUROC (0.689) for predicting 2-year relapse risk (Table 3). We further compared each of these two models with the rest using the AUROC. However, there was no statistically significant difference between RF and either of these two models, implying that the models may be similar in predictive power.

Table 2 Performance comparison of different models to predict 1-year relapse in the validation set
Table 3 Performance comparison of different models to predict 2-year relapse in the validation set

In addition, we also built models based on all 32 clinical variables, or on variables obtained from fivefold cross-validation Lasso analysis. However, neither approach achieved better predictive performance (Additional file 1: Tables S5 and S6), so we retained the variables from the univariate analysis, given its simplicity and good performance.
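The Lasso variable-selection alternative works by keeping only features whose L1-penalized coefficients remain non-zero. A sketch using the scikit-learn analogue of cv.glmnet (illustrative Python, not the authors' R code; the function name and feature labels are hypothetical):

```python
from sklearn.linear_model import LogisticRegressionCV

def lasso_select(X, y, feature_names, seed=0):
    """L1-penalized logistic regression with fivefold CV over the penalty
    strength; features with non-zero coefficients at the chosen penalty
    are retained."""
    model = LogisticRegressionCV(penalty="l1", solver="liblinear",
                                 cv=5, Cs=10, random_state=seed).fit(X, y)
    return [name for name, coef in zip(feature_names, model.coef_.ravel())
            if coef != 0]
```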

Finally, we used calibration curves to assess the agreement between the predicted and observed risks of relapse of PDAC. In the training set, the risks estimated by the predictive models were adequately consistent with the observed outcomes. However, SVM and KNN showed relatively poorer calibration in the validation set, owing to its smaller sample size (Additional file 1: Figure S2).
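A calibration curve simply bins predicted probabilities and compares the mean prediction in each bin with the observed event fraction. A minimal sketch of that computation (Python illustration of the general technique, not the paper's R plotting code):

```python
from sklearn.calibration import calibration_curve

def calibration_points(y_true, y_prob, n_bins=5):
    """Return (observed event fraction, mean predicted probability) per
    probability bin; plotting the first against the second gives the
    calibration curve, with the diagonal indicating perfect calibration."""
    frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=n_bins)
    return frac_pos, mean_pred
```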

Discussion

The development of predictive tools for individual relapse risk assessment after multimodal therapy may help to further optimize treatment decision-making [30]. In this study, we constructed and validated comprehensive models integrating clinicopathological characteristics to predict the relapse risk of PDAC after radical resection, and machine learning techniques proved superior to conventional regression-based analyses in predictive performance. Consistent with various studies investigating the prognostic factors of PDAC [11, 31, 32], lymph node status (N-stage), vascular invasion and CA199 were independent predictors of both 1- and 2-year relapse. Although the RF model had the highest AUROC in the training set, the SVM and KNN models were more robust for predicting 1- and 2-year relapse in the validation set, respectively.

Currently, the lack of screening and early detection, the proneness for early relapse after radical resection and minimally effective systemic therapy remain major barriers to curing patients with PDAC [33]. Timely and accurate prediction of relapse, even after operative intervention, is difficult. Implementation of cutting-edge machine learning algorithms may help to identify at-risk patients, for whom more intensive surveillance, adjuvant treatment, or even inclusion in clinical trials may be considered. Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications across almost every domain of medicine [34,35,36]. As an important branch of AI, machine learning allows computers to train models on large numbers of examples and can detect difficult-to-recognize patterns in complex datasets [37]. Unlike conventional regression-based approaches, machine learning algorithms are capable of capturing higher-order, non-linear interactions between predictors [38]. As a widely used model in biomedical analytics, SVM constructs hyperplanes in a high- or infinite-dimensional feature space and fits linear or nonlinear models that most effectively discriminate between the values of a binary output variable [39]. Its effectiveness has been demonstrated in studies predicting the recurrence of various diseases [40,41,42]. KNN is another well-established method for classification and regression, and reports have demonstrated its promising role in prognostic research [43,44,45]. It can be useful to weight the contributions of the neighbours, so that nearer neighbours contribute more to the average than more distant ones [46]. Our study allowed the comparison of multiple learning algorithms to identify the approach with the most favorable performance.

To the best of our knowledge, this is the first study to develop and compare machine learning-based models for predicting the relapse risk of pancreatic ductal adenocarcinoma after radical resection using multi-institutional datasets. Predictive nomograms based on conventional regression methods have been built for early recurrence after pancreatectomy in resectable pancreatic cancer [9, 12]. Kim et al. established a nomogram to predict the probability of recurrence within 12 months after surgery at a single medical center, with an AUROC of 0.655 [9]. In contrast, we constructed and externally validated a predictive SVM model for 1-year relapse risk with an AUROC of 0.733 and a KNN model for 2-year relapse risk with an AUROC of 0.689, using stringent statistical methods. Another work, by Guo et al., redefined early recurrence as the first 162 days postoperatively on the basis of its own cohort, which makes it difficult to compare results across studies and institutions [12]; understandably, that study did not include histopathologic data in its Cox proportional hazards regression model, since its purpose was to guide preoperative decision-making concerning the use of neoadjuvant therapy. Other reports on this topic also have specific drawbacks, with either a very small sample size of less than 40 [30] or a lack of external validation [10]. In addition, recent research has revealed links between radiomics and underlying tumor biology in PDAC that are strongly correlated with tumor phenotype [47], response to treatment [48] and prognosis [49,50,51]. However, image texture analysis and the manual contouring of regions of interest (ROIs) remain tedious, laborious and time-consuming, which limits their current clinical practicality and leaves ample room for future improvement.

Certain limitations of this study need to be discussed. First, given its retrospective nature, some selection bias may exist. Second, despite the low incidence of PDAC, the relatively limited sample sizes of the training and validation datasets might impair the accuracy of quantifying interpatient variability. Both models showed high sensitivity at the cost of some specificity, a trade-off that depends on the threshold chosen for binary classification [52]. Larger and more balanced cohorts will be collected from multiple medical centers in the future to further establish the robustness of the proposed models. Third, the limited interpretability of the inner workings of these models currently poses a severe bottleneck to implementing cutting-edge machine learning techniques in biomedical research [34, 53]. We need to keep pursuing a better understanding of the complex and evolving relationship between physicians and human-centred AI tools in the live clinical environment, so as to provide better outcomes for our patients [54].

In conclusion, we employed machine learning algorithms to construct models integrating clinicopathological characteristics to predict the relapse risk of PDAC after radical resection, and we externally validated the predictive capacity of our models in independent cohorts from other medical institutions. Machine learning systems can provide critical prognostic prediction for patients with PDAC after radical resection, and predictive algorithms may offer promising clinical decision support for both practitioners and patients.

Availability of data and materials

The datasets generated and analysed during the current study are available in Code Ocean (https://codeocean.com/capsule/2968380/tree).

Abbreviations

PDAC:

Pancreatic ductal adenocarcinoma

AI:

Artificial intelligence

CT:

Computed tomography

MRI:

Magnetic resonance imaging

PET:

Positron emission tomography

RFS:

Relapse-free survival

OS:

Overall survival

RF:

Random forest

SVM:

Support vector machine

GBM:

Gradient boosting machine

NN:

Neural network

KNN:

K-nearest neighbor algorithm

AUROC:

Area under the receiver operating characteristic curve

PPV:

Positive predictive value

NPV:

Negative predictive value

RMSE:

Root mean squared error

CI:

Confidence interval

BMI:

Body mass index

CEA:

Carcinoembryonic antigen

CA:

Cancer antigen

WBC:

White blood cell

Hb:

Hemoglobin

Plt:

Platelet

Neut:

Neutrophil

Lymph:

Lymphocyte

Mono:

Monocyte

Alb:

Albumin

Glb:

Globulin

AGR:

Albumin-globulin ratio

NLR:

Neutrophil–lymphocyte ratio

LMR:

Lymphocyte-monocyte ratio

PLR:

Platelet-lymphocyte ratio

AST:

Aspartate transaminase

ALT:

Alanine transaminase

ALP:

Alkaline phosphatase

GGT:

Gamma-glutamyltransferase

TB:

Total bilirubin

DB:

Direct bilirubin

VI:

Vascular invasion

PI:

Perineural invasion

ATI:

Adipose tissue invasion


References

1. Chen WQ, Zheng RS, Baade PD, Zhang SW, Zeng HM, Bray F, et al. Cancer statistics in China, 2015. CA Cancer J Clin. 2016;66(2):115–32.
2. Nevala-Plagemann C, Hidalgo M, Garrido-Laguna I. From state-of-the-art treatments to novel therapies for advanced-stage pancreatic cancer. Nat Rev Clin Oncol. 2020;17(2):108–23.
3. Aier I, Semwal R, Sharma A, Varadwaj PK. A systematic assessment of statistics, risk factors, and underlying features involved in pancreatic cancer. Cancer Epidemiol. 2019;58:104–10.
4. Katz MHG, Wang H, Fleming JB, Sun CC, Hwang RF, Wolff RA, et al. Long-term survival after multidisciplinary management of resected pancreatic adenocarcinoma. Ann Surg Oncol. 2009;7:25.
5. Ferrone CR, Pieretti-Vanmarcke R, Bloom JP, Zheng H, Szymonifka J, Wargo JA, et al. Pancreatic ductal adenocarcinoma: long-term survival does not equal cure. Surgery. 2012;152:S43-9.
6. He J, Ahuja N, Makary MA, Cameron JL, Eckhauser FE, Choti MA, et al. 2564 resected periampullary adenocarcinomas at a single institution: trends over three decades. HPB. 2014;17:325.
7. Ellison LF, Wilkins K. An update on cancer survival. Health Rep. 2010;21(3):55–60.
8. Kawakami E, Tabata J, Yanaihara N, Ishikawa T, Koseki K, Iida Y, et al. Application of artificial intelligence for preoperative diagnostic and prognostic prediction in epithelial ovarian cancer based on blood biomarkers. Clin Cancer Res. 2019;25(10):3006–15.
9. Kim N, Han IW, Ryu Y, Hwang DW, Heo JS, Choi DW, et al. Predictive nomogram for early recurrence after pancreatectomy in resectable pancreatic cancer: risk classification using preoperative clinicopathologic factors. Cancers. 2020;12:18.
10. He C, Huang X, Zhang Y, Cai Z, Lin X, Li S. A quantitative clinicopathological signature for predicting recurrence risk of pancreatic ductal adenocarcinoma after radical resection. Front Oncol. 2019;9:87.
11. He C, Sun S, Zhang Y, Lin X, Li S. A novel nomogram to predict survival in patients with recurrence of pancreatic ductal adenocarcinoma after radical resection. Front Oncol. 2020;10:147.
12. Guo SW, Shen J, Gao JH, Shi XH, Gao SZ, Wang H, et al. A preoperative risk model for early recurrence after radical resection may facilitate initial treatment decisions concerning the use of neoadjuvant therapy for patients with pancreatic ductal adenocarcinoma. Surgery. 2020;168(6):1003–14.
13. Wei R, Wang J, Wang X, Xie G, Wang Y, Zhang H, et al. Clinical prediction of HBV and HCV related hepatic fibrosis using machine learning. EBioMedicine. 2018;35:124–32.
14. Jurmeister P, Bockmayr M, Seegerer P, Bockmayr T, Treue D, Montavon G, et al. Machine learning analysis of DNA methylation profiles distinguishes primary lung squamous cell carcinomas from head and neck metastases. Sci Transl Med. 2019;11:509.
15. Xu R-H, Wei W, Krawczyk M, Wang W, Luo H, Flagg K, et al. Circulating tumour DNA methylation markers for diagnosis and prognosis of hepatocellular carcinoma. Nat Mater. 2017;8:54.
16. Motwani M, Dey D, Berman DS, Germano G, Achenbach S, Al-Mallah MH, et al. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur Heart J. 2017;38(7):500–7.
17. Singal AG, Mukherjee A, Joseph Elmunzer B, Higgins PDR, Lok AS, Zhu J, et al. Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma. Am J Gastroenterol. 2013;108(11):1723–30.
18. Pulvirenti A, Javed AA, Landoni L, Jamieson NB, Chou JF, Miotto M, et al. Multi-institutional development and external validation of a nomogram to predict recurrence after curative resection of pancreatic neuroendocrine tumors. Ann Surg. 2019;10:1–7.
19. Tempero MA, Malafa MP, Chiorean EG, Czito B, Scaife C, Narang AK, et al. Pancreatic adenocarcinoma, version 1.2019 featured updates to the NCCN guidelines. JNCCN. 2019;17(3):203–10.
20. He J, Pan H, Liang W, Xiao D, Chen X, Guo M, et al. Prognostic effect of albumin-to-globulin ratio in patients with solid tumors: a systematic review and meta-analysis. J Cancer. 2017;8(19):4002–10.
21. Goto W, Kashiwagi S, Asano Y, Takada K, Takahashi K, Hatano T, et al. Predictive value of lymphocyte-to-monocyte ratio in the preoperative setting for progression of patients with breast cancer. BMC Cancer. 2018;18(1):1137.
22. Tong Z, Liu L, Zheng Y, Jiang W, Zhao P, Fang W, et al. Predictive value of preoperative peripheral blood neutrophil/lymphocyte ratio for lymph node metastasis in patients of resectable pancreatic neuroendocrine tumors: a nomogram-based study. World J Surg Oncol. 2017;15(1):1–9.
23. Wang C, He W, Yuan Y, Zhang Y, Li K, Zou R, et al. Comparison of the prognostic value of inflammation-based scores in early recurrent hepatocellular carcinoma after hepatectomy. Liver Int. 2020;9:547.
24. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
25. Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001;29(5):1189–232.
26. Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–97.
27. Cross SS, Harrison RF, Kennedy RL. Introduction to neural networks. Lancet. 1995;346(8982):1075–9.
28. Altman NS. An introduction to kernel and nearest-neighbor nonparametric regression. Am Stat. 1992;46(3):175–85.
29. Moss HB, Leslie DS, Rayson P. Using J-K-fold cross validation to reduce variance when tuning NLP models. BMC. 2018;5:2978–89.
30. Sala Elarre P, Oyaga-Iriarte E, Yu KH, Baudin V, Arbea Moreno L, Carranza O, et al. Use of machine-learning algorithms in intensified preoperative therapy of pancreatic cancer to predict individual risk of relapse. Cancers. 2019;11(5):606.
31. Song W, Miao DL, Chen L. Nomogram for predicting survival in patients with pancreatic cancer. Onco Targets Ther. 2018;11:539–45.
32. De CMM, Biere SSAY, Lagarde SM, Busch ORC, Van GM, Gouma DJ. Validation of a nomogram for predicting survival after resection for adenocarcinoma of the pancreas. Br J Surg. 2009;96(4):417–23.
33. Groot VP, Rezaee N, Wu W, Cameron JL, Fishman EK, Hruban RH, et al. Patterns, timing, and predictors of recurrence following pancreatectomy for pancreatic ductal adenocarcinoma. Ann Surg. 2018;267(5):936–45.
34. Kelly CJ, Karthikesalingam A, Suleyman M, Corrado G, King D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 2019;17:1–9.
35. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44–56.
36. Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, et al. A guide to deep learning in healthcare. Nat Med. 2019;25(1):24–9.
37. Bohr A, Memarzadeh K. The rise of artificial intelligence in healthcare applications. PLoS ONE. 2020;4:25–60.
38. Zeevi D, Korem T, Zmora N, Israeli D, Rothschild D, Weinberger A, et al. Personalized nutrition by prediction of glycemic responses. Cell. 2015;163(5):1079–94.
39. Kim W, Kim KS, Lee JE, Noh D-Y, Kim S-W, Jung YS, et al. Development of novel breast cancer recurrence prediction model using support vector machine. J Breast Cancer. 2012;15(2):230.
40. Liang J-D, Ping X-O, Tseng Y-J, Huang G-T, Lai F, Yang P-M. Recurrence predictive models for patients with hepatocellular carcinoma after radiofrequency ablation using support vector machines with feature selection methods. Comput Methods Prog Biomed. 2014;117(3):425–34.
41. Tseng C-J, Lu C-J, Chang C-C, Chen G-D. Application of machine learning to predict the recurrence-proneness for cervical cancer. Neural Comput Appl. 2014;24(6):1311–6.
42. Lg A, At E. Using three machine learning techniques for predicting breast cancer recurrence. J Health Med Inf. 2013;04(02):2–4.
43. Medjahed SA, Saadi TA, Benyettou A. Breast cancer diagnosis by using k-nearest neighbor with different distances and classification rules. Int J Comput Appl. 2013;62:18.
44. Li C, Zhang S, Zhang H, Pang L, Lam K, Hui C, et al. Using the K-nearest neighbor algorithm for the classification of lymph node metastasis in gastric cancer. Comput Math Methods Med. 2012;2012:77.
  44. 44.

    Li C, Zhang S, Zhang H, Pang L, Lam K, Hui C, et al. Using the K-nearest neighbor algorithm for the classification of lymph node metastasis in gastric cancer. Comput Math Methods Med. 2012;2012:77.

    Google Scholar 

  45. 45.

    Atallah DM, Badawy M, El-Sayed A, Ghoneim MA. Predicting kidney transplantation outcome based on hybrid feature selection and KNN classifier. Multimedia Tools Appl. 2019;78(14):20383–407.

    Article  Google Scholar 

  46. 46.

    Rana M, Chandorkar P, Dsouza A, Kazi N. Breast cancer diagnosis and recurrence prediction using machine learning techniques. IJRET. 2015;8:2319–1163.

    Google Scholar 

  47. 47.

    Lim CH, Cho YS, Choi JY, Lee KH, Lee JK, Min JH, et al. Imaging phenotype using 18F-fluorodeoxyglucose positron emission tomography–based radiomics and genetic alterations of pancreatic ductal adenocarcinoma. Eur J Nucl Med Mol Imaging. 2020;47(9):2113–22.

    CAS  PubMed  Article  Google Scholar 

  48. 48.

    Nasief H, Zheng C, Schott D, Hall W, Tsai S, Erickson B, et al. A machine learning based delta-radiomics process for early prediction of treatment response of pancreatic cancer. NPJ Precis Oncol. 2019;3(1):1–10.

    CAS  Article  Google Scholar 

  49. 49.

    Kaissis G, Ziegelmayer S, Lohöfer F, Algül H, Eiber M, Weichert W, et al. A prospectively validated machine learning model for the prediction of survival and tumor subtype in pancreatic ductal adenocarcinoma. BMC Med. 2019;17:1–9.

    Article  Google Scholar 

  50. 50.

    Hwang SH, Kim HY, Lee EJ, Hwang HK, Park M-S, Kim M-J, et al. Preoperative clinical and computed tomography (CT)-based nomogram to predict oncologic outcomes in patients with pancreatic head cancer resected with curative intent: a retrospective study. J Clin Med. 2019;8(10):1749.

    PubMed Central  Article  PubMed  Google Scholar 

  51. 51.

    Yun G, Kim YH, Lee YJ, Kim B, Hwang JH, Choi DJ. Tumor heterogeneity of pancreas head cancer assessed by CT texture analysis: Association with survival outcomes after curative resection. Sci Rep. 2018;8(1):1–10.

    Google Scholar 

  52. 52.

    Lu CF, Hsu FT, Hsieh KL, Kao YJ, Cheng SJ, Hsu JB, et al. Machine learning-based radiomics for molecular subtyping of gliomas. Clin Cancer Res. 2018;24(18):4429–36.

    PubMed  Article  Google Scholar 

  53. 53.

    Manamley N, Mallett S, Sydes MR, Hollis S, Scrimgeour A, Burger HU, et al. Data sharing and the evolving role of statisticians. BMC Med Res Methodol. 2016;16(Suppl 1):75.

    PubMed  PubMed Central  Article  Google Scholar 

  54. 54.

    Ahuja AS. The impact of artificial intelligence in medicine on the future role of the physician. PeerJ. 2019;2019:10.

    Google Scholar 

Acknowledgements

Not applicable.

Funding

This work was supported by the General Program of the National Natural Science Foundation of China [Grant Number 81772562, 2017] (Yulian Wu) and the Fundamental Research Funds for the Central Universities [Grant Number 2021FZZX005-08] (Xiawei Li).

Author information

Contributions

XL: conceptualization, methodology, formal analysis, writing-original draft, writing-review & editing. LY: resources, writing-review & editing. ZY: writing-original draft, formal analysis. JL: investigation. YF: resources, data curation. AS: data curation, methodology. JH, MZ: data curation. YW: funding acquisition, project administration, resources, supervision. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yulian Wu.

Ethics declarations

Ethics approval and consent to participate

The study was approved by the Institutional Review Boards of all 3 institutions. No additional patient consent was required because the medical records were reviewed retrospectively.

Consent for publication

All the authors agree to the publication of this work.

Competing interests

The authors declare no potential conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Figure S1. Comparisons of AUROC of different models to predict 1- and 2-year relapse in the training set (a: 1-year relapse; b: 2-year relapse).

Figure S2. Calibration curves of the SVM model (A: training set; B: validation set) to predict 1-year relapse and of the KNN model (C: training set; D: validation set) to predict 2-year relapse.

Table S1. The ranges of training parameters for grid search in different models.

Table S2. Comparison of characteristics between patients with and without 1-year relapse in the training set.

Table S3. Comparison of characteristics between patients with and without 2-year relapse in the training set.

Table S4. The optimal parameters for different models to predict relapse risks of PDAC.

Table S5. Performance of models built on all 32 variables in the validation set.

Table S6. Performance of models built on variables from lasso analysis in the validation set.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Li, X., Yang, L., Yuan, Z. et al. Multi-institutional development and external validation of machine learning-based models to predict relapse risk of pancreatic ductal adenocarcinoma after radical resection. J Transl Med 19, 281 (2021). https://doi.org/10.1186/s12967-021-02955-7

Keywords

  • Machine learning
  • PDAC
  • Relapse
  • Prediction model
  • Radical surgery