Overexpression of a panel of cancer stem cell markers enhances the predictive capability of the progression and recurrence in the early stage cholangiocarcinoma

Background: Cancer recurrence is a major problem for cholangiocarcinoma (CCA) patients and leads to a very high mortality rate. Therefore, the identification of candidate markers to predict CCA recurrence is needed in order to manage the disease effectively. This study aims to examine the predictive value of cancer stem cell (CSC) markers for the progression and recurrence of CCA.
Methods: The expression of 6 putative CSC markers, cluster of differentiation 44 (CD44), CD44 variant 6 (CD44v6), CD44 variants 8-10 (CD44v8-10), cluster of differentiation 133 (CD133), epithelial cell adhesion molecule (EpCAM), and aldehyde dehydrogenase 1A1 (ALDH1A1), was investigated in 178 CCA tissue samples using immunohistochemistry (IHC) and analyzed with respect to clinicopathological data and patient outcome, including recurrence-free survival (RFS) and overall survival (OS). The candidate CSC markers were also investigated in serum from CCA patients and explored for their ability to predict CCA recurrence.
Results: Elevated protein levels of CD44 and positive expression of CD44v6 and CD44v8-10 were significantly associated with short RFS and OS, while high levels of ALDH1A1 were correlated with a favorable prognosis. An elevated CD44v6 level was also correlated with higher tumor staging, whereas the ALDH1A1 level decreased with increasing tumor stage. The levels of CD44, CD44v6 and CD44v8-10 were positively correlated with one another and were associated with a poor outcome. Furthermore, soluble CD44, CD44v6, CD44v8-10 and EpCAM were significantly increased in the recurrence group for early stage CCA; they also correlated with high levels of the tumor marker CA19-9. Elevated levels of CD44, CD44v6, CD44v8-10 or EpCAM, alone or in combination, have the potential to predict CCA recurrence.
Conclusions: The overexpression of CD44, CD44v6, CD44v8-10 and EpCAM increases the predictability of post-operative CCA recurrence.
Moreover, the overexpression of the panel of CSC markers combined with CA19-9 could improve the prediction of tumor recurrence in early stage CCA patients. These results may help to predict patient outcome after treatment and may be useful for clinical intervention to improve patient survival.
Keywords: Cholangiocarcinoma, Cancer recurrence, Cancer stem cell marker, Tumor marker, Prognostic factor


Background
Cholangiocarcinoma (CCA) is the second most common primary hepatic cancer. It originates from the bile duct epithelium and accounts for 10-20% of primary liver cancers [1]. CCA can be divided into intrahepatic (iCCA), perihilar (pCCA) and distal (dCCA) forms based on anatomical location. iCCA arises from the bile duct epithelium inside the liver, while pCCA and dCCA arise from epithelium outside the liver [2]. Surgical resection is the only curative treatment, and there is evidence that complete resection can improve patient survival [3]. In addition to surgery, adjuvant chemo- or radiotherapy is necessary to improve patient outcome [4]. However, many patients still experience recurrence after surgery, which results in the high mortality rate of CCA patients [5]. Therefore, understanding the tumor biology and identifying markers to predict cancer recurrence are necessary to manage the disease.
Accumulating evidence suggests that subpopulations of cancer cells, called cancer stem cells (CSCs), show stem cell-like properties such as self-renewal. CSCs play a critical role in many cancer processes, including development, progression and recurrence [6]. Because CSCs affect tumor aggressiveness, CSC markers, which are the markers most commonly expressed in CSCs, have become important factors for predicting cancer progression and recurrence. Currently, several CSC markers have been established in CCA, including cluster of differentiation 44 (CD44), cluster of differentiation 133 (CD133), epithelial cell adhesion molecule (EpCAM), and aldehyde dehydrogenase 1 (ALDH1) [7]. The expression of these markers is usually associated with a poor clinical outcome in patients with different cancers [8][9][10][11][12]. CD44 is a cell surface glycoprotein with a single polypeptide chain that functions as a cell surface receptor for hyaluronic acid. Many CD44 variants (CD44v) are generated by alternative splicing [13]. The expression of CD44 and its variant isoforms is related to tumor progression and recurrence in some cancers [11,12,14]. CD133, a transmembrane glycoprotein, is another marker for cancer stem-like cells in CCA. High expression of CD133 has been reported to be significantly associated with more aggressive tumors and correlated with a worse outcome for cancer patients [15]. Another CSC marker, EpCAM, is mostly overexpressed in tumors of epithelial origin. EpCAM overexpression is usually associated with tumor progression, especially metastasis [16]. Moreover, it has long been recognized that EpCAM can be cleaved [17], and soluble EpCAM is also associated with an aggressive tumor phenotype [18][19][20]. Even though surface markers are mostly used to isolate and characterize CSCs, other types of markers have also been used to identify CSCs and to predict tumor progression and patient outcome.
ALDH1A1 is an enzyme of the ALDH family that functions as a detoxifying enzyme and also converts retinol (vitamin A) into retinoic acid (RA) [10]. The overexpression of ALDH1A1 is mostly associated with a poor cancer prognosis; however, numerous studies suggest that high expression of ALDH1A1 can also be associated with a better prognosis [21]. Although many studies have reported that the expression of CD44, CD44v, CD133, EpCAM and ALDH1A1 is associated with tumor progression and can be used to predict patient outcome, their prognostic significance for recurrence in CCA patients has not been elucidated.
Therefore, in the present study, the expression of the above 6 putative CSC markers was investigated in 178 paraffin-embedded CCA tissue samples using immunohistochemical staining in order to explore their relationship with clinicopathological features and patient survival. Moreover, 4 candidate CSC markers were selected for further investigation using enzyme-linked immunosorbent assay (ELISA) to examine their levels in serum and to provide a potential CSC panel for predicting CCA recurrence.

Patient selection and follow-up
Patients who were diagnosed with CCA and underwent surgery at Srinagarind Hospital, Khon Kaen University, Khon Kaen, Thailand between February 2007 and December 2016 were retrospectively studied. Patients were excluded if they had received either radiotherapy or chemotherapy before surgery, in order to reduce the effect of neoadjuvant treatment on protein expression. Patients were also excluded if they died within 30 days after surgery, to avoid the effect of the operation. Tissue samples and pre-operative peripheral blood were obtained from patients and kept in the BioBank of the Cholangiocarcinoma Research Institute. All patients were assessed for clinicopathological characteristics including sex, age, tumor site, histology type, primary tumor (T stage), regional lymph node metastasis (N stage), distant metastasis (M stage), TNM stage, and post-operative chemotherapy status. In addition, pre-operative peripheral blood was used for laboratory testing, including tumor markers and liver function tests. For the follow-up protocol after surgery, patients were examined every 3 months during the first year and every 6 months thereafter. Patients who developed a new tumor after surgery were defined as having a post-operative recurrence. Overall survival (OS) was defined as the interval from the date of surgery to the time of death or the last follow-up date, and recurrence-free survival (RFS) was defined as the interval from the date of surgery to the time of recurrence or the last follow-up date.
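The OS and RFS definitions above can be illustrated with a short sketch. This is not the authors' procedure: the function name `survival_interval`, the conversion from days to months using an average month length of 30.4375 days, and the example dates are assumptions introduced purely for illustration.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_MONTH = 30.4375  # assumed average month length (365.25 / 12)


def survival_interval(
    surgery: date, event: Optional[date], last_follow_up: date
) -> Tuple[float, bool]:
    """Return (interval in months, event observed).

    For OS the event is death; for RFS the event is recurrence.
    If no event date is given, the interval ends at the last
    follow-up date and the observation is treated as censored.
    """
    end = event if event is not None else last_follow_up
    if end < surgery:
        raise ValueError("end date precedes the date of surgery")
    months = (end - surgery).days / DAYS_PER_MONTH
    return months, event is not None


# Hypothetical patient: surgery in March 2010, recurrence in June 2011.
rfs_months, recurred = survival_interval(
    date(2010, 3, 1), date(2011, 6, 1), date(2012, 1, 15)
)
```

A patient without a recorded event contributes a censored interval, which is the form of input that Kaplan-Meier analysis expects.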
Protein expression was analyzed according to staining frequency and intensity. The staining frequency was semi-quantitatively scored based on the percentage of positive cells: 0% = negative, 1-25% = +1, 26-50% = +2, and > 50% = +3. The intensity of protein expression was scored as weak = 1, moderate = 2, and strong = 3. The final immunohistochemical score was determined by multiplying the intensity score by the frequency score, giving a minimum score of 0 and a maximum of 9. The score of each patient was calculated as the average of two independent punctures. Finally, the median of the scores of all patients was calculated and used as the cut-off point: patients with a score lower than the median were classified as the low expression group, while those with a score equal to or higher than the median were classified as the high expression group [22]. For proteins with a median score of zero, patients with a score of zero were classified as the negative group, while those with a score above zero were classified as the positive group.
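The scoring and grouping rules described above can be sketched as follows. The function names and the example values are hypothetical, and the handling of non-integer percentages (for example 25.5%, which is assigned here to the 26-50% band) is an assumption, since the text gives integer bands only.

```python
from statistics import median
from typing import List, Sequence, Tuple


def frequency_score(percent_positive: float) -> int:
    """Map the percentage of positive cells to a frequency score of 0-3."""
    if percent_positive <= 0:
        return 0  # negative
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    return 3


def ihc_score(percent_positive: float, intensity: int) -> int:
    """Frequency score multiplied by intensity (1 weak, 2 moderate, 3 strong)."""
    return frequency_score(percent_positive) * intensity


def patient_score(cores: Sequence[Tuple[float, int]]) -> float:
    """Average the IHC scores of a patient's samplings (two in the study)."""
    return sum(ihc_score(p, i) for p, i in cores) / len(cores)


def classify(scores: Sequence[float]) -> List[str]:
    """Group patients by the cohort median, as described in the text."""
    cut_off = median(scores)
    if cut_off == 0:
        # Median of zero: split into negative (score 0) and positive (> 0).
        return ["positive" if s > 0 else "negative" for s in scores]
    return ["high" if s >= cut_off else "low" for s in scores]


# Hypothetical cohort of four patients, two samplings each.
cohort = [
    patient_score([(60, 3), (40, 2)]),
    patient_score([(10, 1), (0, 1)]),
    patient_score([(30, 2), (30, 2)]),
    patient_score([(80, 2), (55, 3)]),
]
groups = classify(cohort)
```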

Statistical analyses
Statistical analyses were carried out using the Statistical Package for the Social Sciences (SPSS) software v.17. The association between protein expression and the clinicopathological features of the CCA patients was determined using the Chi-square test. Overall and recurrence-free survival analyses were performed using Kaplan-Meier analysis with the log-rank test. The correlation between protein types was analyzed using Pearson's correlation. Differences in IHC scores between tumor stages were analyzed using the Kruskal-Wallis test. The results from ELISA were analyzed using Student's t-test. The receiver operating characteristic (ROC) curve and logistic regression were used to determine the predictive ability, with respect to cancer recurrence, of soluble protein levels alone or in combination with tumor markers. A p-value less than 0.05 was considered statistically significant.
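As an illustration of the Kaplan-Meier method named above, the product-limit estimate can be computed in a few lines. The study itself used SPSS; the function `kaplan_meier` and the follow-up data below are hypothetical and serve only to show how censored observations enter the estimate.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Product-limit estimate of the survival function.

    times  : follow-up time of each patient (e.g. months)
    events : True if the event (death or recurrence) was observed,
             False if the patient was censored at that time
    Returns (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up data for five patients (months, event observed).
curve = kaplan_meier([5, 8, 12, 12, 20], [True, False, True, True, False])
```

Censored patients lower the number at risk for later time points without producing a step in the curve, which is why the follow-up interval and the event flag have to be recorded together.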

Correlation between CSC marker expression and clinicopathological features
The expression levels of the CSC markers were investigated using immunohistochemistry. Representative images of a normal bile duct, the precancerous (dysplasia) stage and CCA are shown in Fig. 1. To investigate the correlation between protein expression and clinicopathological features, the expression of CSC markers was divided into low and high expression groups, or into negative and positive expression groups. High expression of the candidate CSC markers CD44, EpCAM and ALDH1A1 and positive expression of CD44v6, CD44v8-10 and CD133 were observed in 65%, 52%, 47%, 38%, 42%, and 36% of the samples, respectively. High expression of CD44 and positive expression of CD44v6 were observed mostly in males (p = 0.028 and p = 0.026). In addition, positive CD44v6 and CD133 expression was frequently observed in intrahepatic CCA (p = 0.001 and p = 0.019). A significant association was found between T stage and CD44v6 expression (p = 0.026). Additionally, regional lymph node metastasis and TNM staging were significantly associated with CD44v6 (p = 0.002 and p = 0.005) and ALDH1A1 expression (p = 0.001 and p = 0.002) (Table 1).

The prognostic significance of clinicopathological features
To identify prognostic factors for CCA patients, we analyzed all clinicopathological features, including sex, age, tumor site, histology type, T stage, regional lymph node metastasis, distant metastasis status, TNM stage and post-operative chemotherapy (CMT) status, with respect to recurrence-free survival (RFS) and overall survival (OS) of the patients. The median RFS and OS were 15 and 17 months, respectively. Among all clinicopathological features, a higher T stage, regional lymph node metastasis and a higher TNM stage were significantly correlated with shorter RFS compared with a low T stage, absence of regional lymph node metastasis, or a low TNM stage (p < 0.001, p = 0.001 and p < 0.001, respectively). The OS analysis showed a similar result, except that an age of 61 or greater was also significantly associated with a short OS (p = 0.032). There was no significant correlation of sex, histology type, tumor site, distant metastasis status, or post-operative CMT status with RFS or OS (Table 2).

The expression of CSC markers and their prognostic significance in CCA patients
The expression of the candidate CSC markers CD44, CD44v6, CD44v8-10, CD133, EpCAM, and ALDH1A1 was analyzed with respect to RFS and OS. Univariate analysis showed that patients with a high expression of CD44 or positive expression of CD44v6 or CD44v8-10 had a shorter RFS compared with other patients (p = 0.007, p = 0.001 and p = 0.007, respectively). In addition, a high expression of CD44, a low expression of ALDH1A1, and positive expression of CD44v6 and CD44v8-10 were associated with a shorter OS compared with the other group of patients (p = 0.001, p = 0.022, p = 0.006 and p < 0.001, respectively) (Table 2). Moreover, multivariate analysis showed that CD44 and CD44v8-10 could be used as prognostic factors independent of clinicopathological characteristics for RFS (p = 0.020 and p = 0.012) (Table 3) and OS (p = 0.002 and p = 0.001) (Table 4).
Kaplan-Meier analysis was used to examine the importance of tumor location. The results for intrahepatic CCA showed that a high expression of CD44 or the positive expression of CD44v6 or CD44v8-10 was significantly correlated with a shorter RFS compared with low or negative expression (p = 0.007, p = 0.017 and p < 0.001, respectively), while a high expression of EpCAM and ALDH1A1 was significantly correlated with a favorable prognosis (p = 0.028 and p = 0.008) (Fig. 2). The results from OS analysis showed that patients with a high expression of CD44, a positive expression of CD44v8-10 or a low expression of ALDH1A1 also had a shorter OS compared with other groups (p = 0.002, p < 0.001 and p = 0.002, respectively) (Fig. 2). The results for extrahepatic CCA showed that a positive expression of CD44v6 was significantly correlated with a shorter RFS and OS (p = 0.034 and p = 0.039) (Fig. 3). Additionally, a positive expression of CD133 was significantly correlated with a shorter OS compared with negative expression (p = 0.033) (Fig. 3).
In addition to the RFS and OS analyses, the differences in IHC scores for the different proteins were evaluated across tumor stages. The expression of CD44, CD44v6, and CD44v8-10 appeared to increase at higher stages compared with stage I tumors (Fig. 4a-c). For CD44v6, significant differences were observed between stages I and IV and between stages II and IV (p = 0.033 and p = 0.020, respectively) (Fig. 4b). In addition, the expression level of ALDH1A1 could be used to classify tumor staging. We found that the ALDH1A1 expression level decreased with increasing tumor stage and was significantly lower in stages III and IV compared with stage I tumors (p = 0.019 and p = 0.013) (Fig. 4f).
CD44, CD44v6, CD44v8-10 and ALDH1A1 showed prognostic significance for CCA patients. The correlation between these markers was therefore further analyzed. Significant positive correlations between CD44, CD44v6, and CD44v8-10 were observed, while there was no significant correlation between ALDH1A1 and the other markers (Table 5). The combination of high expression of CD44 with positive expression of CD44v6 and CD44v8-10 was significantly associated with RFS (p = 0.001 and p = 0.002) and OS (p = 0.001 and p < 0.001) in intrahepatic CCA. Patients with high or positive expression of two or three markers had a poorer prognosis compared with other groups of patients (Fig. 5a and 5b). On the other hand, only high expression of CD44 with a positive expression of CD44v6 and CD44v8-10 was significantly associated with OS (p = 0.016) (Fig. 6b).

The correlation of soluble CSC markers with cancer recurrence
The previous results showed that CD44, CD44v6, and CD44v8-10 were significantly correlated with RFS. In order to identify soluble CSC markers that can be used to predict cancer relapse, soluble CD44, CD44v6, and CD44v8-10 were further measured in CCA sera. Moreover, soluble EpCAM was also investigated because there is considerable evidence suggesting that it plays an important role in the progression of many cancers. In addition, the IHC results showed that T stage, the presence of lymph node metastasis and TNM staging were associated with RFS and OS. Therefore, the differences in the soluble CSC markers CD44, CD44v6, CD44v8-10 and EpCAM between patients with and without recurrence were analyzed according to tumor staging in order to avoid the effect of T, N and TNM stage on recurrence. Detailed information on the 127 CCA serum samples is summarized in Additional file 1: Table S1. The results showed that, in patients with early stage CCA, the levels of the soluble CSC markers CD44, CD44v6, CD44v8-10, and EpCAM were significantly increased in patients with recurrence (p = 0.019, p = 0.028, p = 0.031, and p = 0.001, respectively). On the other hand, there were no differences in soluble CSC markers between patients with and without recurrence in the late stage (Fig. 7).

Correlation between CSC marker levels in sera with clinicopathological features and laboratory results
The correlation between the levels of soluble CSC markers and clinicopathological features and laboratory results was analyzed. The results from the early stage group show that high levels of CD44, CD44v8-10 and EpCAM were significantly correlated with high levels of CA19-9 (p = 0.006, p = 0.011 and p < 0.001, respectively) (Table 6). On the other hand, no correlation was found between the soluble CSC markers and sex, age, tumor site, histology type, CEA, AFP or liver function tests. In addition, the results from the late stage group show that a high level of CD44v6 was significantly associated with elevated total bilirubin, direct bilirubin, AST and ALP (p = 0.037, p = 0.029, p = 0.037 and p = 0.049, respectively) (Table 7). Moreover, CD44v8-10 and EpCAM were also associated with elevated ALP (p = 0.024 and p = 0.006) (Table 7).
[...] (Table 9). On the other hand, soluble CD44, CD44v6, CD44v8-10, and EpCAM were not suitable for distinguishing between recurrence and non-recurrence in patients with late stage CCA (Additional file 3: Fig. S2).

The combination of soluble CSC markers and CA19-9 for improving predictive ability for post-operative recurrence
Soluble CD44, CD44v6, CD44v8-10 and EpCAM are promising factors for predicting cancer recurrence. A combination of these markers and its predictive efficacy for cancer recurrence was therefore further examined. Interestingly, a combination of high levels of CD44, CD44v6, CD44v8-10 and EpCAM was associated with an increased risk of recurrence, with a high crude odds ratio (crude OR = 7.08, p = 0.004) and adjusted OR (adjusted OR = 7.39, p = 0.006). Moreover, the best predictive ability for recurrence was observed with the combination of high levels of these 4 CSC markers and elevated CA19-9 levels, which increased the crude and adjusted OR to 12.25 (p = 0.005) and 15.28 (p = 0.011), respectively (Table 10). Survival was also evaluated in patients with high levels of CD44, CD44v6, CD44v8-10 and EpCAM combined with an elevated CA19-9 level compared with other groups of patients. Patients with high levels of CD44, CD44v6, CD44v8-10 and EpCAM combined with elevated CA19-9 had a shorter RFS when compared with other groups (p = 0.004) (Fig. 8).
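For readers unfamiliar with the crude odds ratio, the following sketch shows how it is obtained from a 2x2 table of marker status against recurrence, together with a Wald 95% confidence interval. The counts are invented for illustration and are not the study data; the function name `crude_odds_ratio` is also an assumption. The adjusted OR reported above comes from logistic regression with covariates and is not reproduced here.

```python
import math
from typing import Tuple


def crude_odds_ratio(a: int, b: int, c: int, d: int) -> Tuple[float, float, float]:
    """Crude odds ratio with a Wald 95% confidence interval.

    a: marker-high patients with recurrence
    b: marker-high patients without recurrence
    c: marker-low patients with recurrence
    d: marker-low patients without recurrence
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts only: 12 of 16 marker-high patients recurred,
# compared with 6 of 20 marker-low patients.
or_value, ci_low, ci_high = crude_odds_ratio(12, 4, 6, 14)
```

An interval that excludes 1 corresponds to a significant association at the 0.05 level under this approximation; with small cell counts, as in early stage subgroups, the interval is wide and exact methods may be preferable.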

Discussion
CCA is a malignant tumor with an asymptomatic early stage, so the disease is usually diagnosed once it has become advanced, resulting in a poor outcome for patients after treatment [23]. Even though several therapeutic approaches can be considered for CCA treatment, the recurrence rate is still high and leads to high mortality in CCA patients [5]. Many studies suggest that tumor size and metastatic status are potential factors influencing RFS and OS in CCA patients [24][25][26][27]. Consistent with these reports, we found that CCA patients with a high primary tumor stage, presence of regional lymph node metastasis and high TNM staging have a shorter RFS and OS compared with other groups of patients. Even though several studies have reported potential pathological factors for predicting CCA recurrence, effective biomarkers are required to assess the likely outcome of patients, including survival rate and the probability of recurrence after treatment. Moreover, such markers are likely to be useful for targeted therapy in order to prevent cancer progression and recurrence. The subpopulation of cancer cells with stem cell-like properties, CSCs, has been reported to be involved in many cancer processes such as tumor growth, metastasis, resistance to treatment, and recurrence [28]. Raggi et al. demonstrated the existence of CSCs in biliary tract cancer (BTC) and suggested that isolated BTC cells expressing CD24, CD44 or EpCAM had a higher tumorigenic potential than the negative groups [29]. In addition, other CSC markers have also been reported as markers for CSCs in CCA [7]. Therefore, CSC markers might be used to predict CCA progression and recurrence. To test this hypothesis, we performed immunohistochemical staining to evaluate the expression of 6 putative CSC markers, CD44, CD44v6, CD44v8-10, CD133, EpCAM and ALDH1A1, in CCA tissue.
The results show that among the 6 CSC markers investigated, the expression of CD44 and its variant isoforms (CD44v6 and CD44v8-10), and also ALDH1A1, was associated with tumor progression and patient outcome, including RFS and OS. CD44 is a well-known marker that plays an important role in tumor progression, but the different isoforms work differently [30]. There is considerable evidence suggesting that a high expression of CD44 is associated with tumor progression and recurrence [31,32], and the other two variant isoforms have likewise been reported to be involved in cancer progression and recurrence [33][34][35]. This is consistent with our finding for CCA, which shows that patients with a high expression of CD44, CD44v6, and CD44v8-10 had a shorter RFS and OS compared with the low expression group. In addition, the expression of these markers seems to increase along with tumor stage, suggesting that their expression is involved in tumor progression. ALDH1A1 is a cytosolic enzyme that can convert retinol into retinoic acid. It plays an important role in many processes occurring in the normal cell, including growth, development and differentiation [21]. It has been reported to be a marker for normal stem cells (SCs) and also for CSCs. Although many studies have reported that a high expression of ALDH1A1 is associated with tumor progression, this result is controversial, as many other studies have shown that a high expression of ALDH1A1 is correlated with a favorable prognosis in different cancers [21].
In the present study, we found that a high expression of ALDH1A1 was also associated with a favorable prognosis for CCA patients. There is evidence suggesting that a combination of protein expressions has more potential to divide patients into different prognostic groups [36]. Thus, the correlation of our 4 promising markers was also analyzed. A significant positive correlation was found between CD44, CD44v6 and CD44v8-10, with the combination of high expression in two or three markers being more useful in dividing patients into different prognostic groups. On the other hand, there was no correlation between ALDH1A1 and the other markers. The panel of protein expression markers (CD44, CD44v6, and CD44v8-10) discriminates patients into different prognostic groups more effectively than the individual markers. Moreover, the elevation of these markers was also associated with RFS. Therefore, we further investigated the levels of these markers in serum using ELISA, so that they could be evaluated as predictive factors for CCA recurrence. As many studies suggest that soluble EpCAM is associated with an aggressive tumor phenotype [18][19][20], soluble EpCAM was also considered as a marker for CCA recurrence. According to the literature, tumor staging is an important factor involved in tumor recurrence in CCA patients [27], and our IHC results also demonstrate that tumor staging has the potential to predict CCA recurrence. Therefore, in order to account for the effect of staging on cancer recurrence, the levels of soluble CD44, CD44v6, CD44v8-10 and EpCAM in patients with and without recurrence were examined according to staging. The results indicate that early stage tumors are less variable than late or advanced stage tumors. Recurrence in this setting may therefore reflect the inherent resistance of cancer cells [37].
Our results on early stage CCA patients show that patients with a low T stage, absence of lymph node involvement and no distant metastases but with recurrence had higher soluble levels of CD44, CD44v6, CD44v8-10 and EpCAM compared with patients without recurrence. Accumulating evidence indicates that highly proliferative cancer cells can be killed by chemotherapy and radiotherapy; however, a subpopulation of cancer cells with therapeutic resistance might survive and lead to relapse [6]. CD44 is known as a surface marker associated with CSCs in various cancer types, and several CD44 variant isoforms are generated by alternative splicing [38]. Shi et al. reported that the expression of CD44v6 is up-regulated in recurrent ovarian cancer, and that this is also associated with cancer progression and metastasis [34]. Another CSC marker, CD44v8-10, stabilizes xCT, a cystine-glutamate transporter that induces glutathione synthesis. This process contributes to tumor cells becoming resistant to oxidative stress, including reactive oxygen species (ROS) [39]. In addition, a study by Tayama et al. on ovarian cancer demonstrated that chemotherapy mostly eliminated the EpCAM-negative population rather than the EpCAM-positive population, suggesting that the EpCAM-positive population contributes to chemoresistance and cancer recurrence after chemotherapy [40]. Thus, the CSC markers CD44, CD44v6, CD44v8-10, and EpCAM have the potential to predict recurrence in cancers including CCA. The levels of tumor markers (CA19-9, CEA, and AFP) and liver function tests are also used to monitor CCA patients after treatment [27]. In this study, we also found that high levels of soluble CD44, CD44v6, CD44v8-10 and EpCAM were correlated with elevated levels of CA19-9, suggesting that their expression is involved in tumor progression.
However, in late stage disease, there was no difference in the levels of soluble CD44, CD44v6, CD44v8-10 and EpCAM between patients with and without recurrence, even though some of these markers showed an association with poor liver function test results. Therefore, our further analysis focused on early stage disease in CCA patients, with the aim of examining the predictive value of soluble CD44, CD44v6, CD44v8-10 and EpCAM for post-operative CCA recurrence. Interestingly, we found that high levels of soluble CD44, CD44v6, CD44v8-10 and EpCAM, either alone or in combination, have the potential to predict CCA recurrence. Furthermore, there are studies that suggest that elevated serum levels of CA19-9 are also associated with CCA recurrence [27,41], a result corroborated by our study of soluble CD44, CD44v6, CD44v8-10, EpCAM and CA19-9. Therefore, the association between the combination of high levels of these 4 markers and CA19-9 was further evaluated. Our findings suggest that overexpression of the panel of CSC markers in combination with elevated levels of CA19-9 provides the best predictive factor for the post-operative recurrence of CCA in early stage patients. However, the small number of patients is a limitation of this study, and a larger independent patient cohort needs to be evaluated before clinical application.

Conclusion
Elevated levels of CD44, CD44v6, CD44v8-10 and EpCAM increase the predictability of post-operative CCA recurrence. Moreover, the best predictive ability was found when overexpression of the panel of CSC markers was combined with CA19-9.