IGHV gene mutational status and 17p deletion are independent molecular predictors in a comprehensive clinical-biological prognostic model for overall survival prediction in chronic lymphocytic leukemia

Background Prognostic indices for survival estimation based on clinical-demographic variables have previously been proposed for chronic lymphocytic leukemia (CLL) patients. Our objective was to test, in a large retrospective cohort of CLL patients, the prognostic power of biological and clinical-demographic variables in a comprehensive multivariate model. A new prognostic index was proposed.
Methods Overall survival and time to treatment in 620 untreated CLL patients were analyzed retrospectively to evaluate the multivariate independence and predictive power of the mutational status of immunoglobulin heavy chain variable gene segments (IGHV), high-risk chromosomal aberrations such as 17p or 11q deletions, CD38 and ZAP-70 expression, age, gender, Binet stage, β2-microglobulin levels, absolute lymphocyte count and number of lymph node regions.
Results IGHV mutational status and 17p deletion were the sole biological variables with independent prognostic relevance in a multivariate model for overall survival, which included easily measurable clinical parameters (Binet staging, β2-microglobulin levels) and demographics (age and gender). Analysis of time to treatment in Binet A patients below 70 years of age showed that IGHV was the most important predictor. A novel 6-variable clinical-biological prognostic index was developed and internally validated, which assigned 3 points for Binet C stage, 2 points each for Binet B stage and for age > 65 years, and 1 point each for male gender, high β2-microglobulin levels, unmutated IGHV gene status and 17p deletion. Patients were classified as low-risk (score 0-1; 21% of cases), intermediate-risk (score 2-5; 63% of cases) or high-risk (score 6-9; 16% of cases). Projected 5-year overall survival was 98%, 90% and 58% in the low-, intermediate- and high-risk groups, respectively. A nomogram for individual patient survival estimation was also proposed.
Conclusions Data indicate that IGHV mutational status and 17p deletion may be integrated with clinical-demographic variables in new prognostic tools to estimate overall survival.

Bulian et al.

Background
According to the updated National Cancer Institute-Working Group (NCI-WG) guidelines, the indication for treatment of chronic lymphocytic leukemia (CLL) still depends on clinical stage and disease activity [1]. In this context, measurements of biological prognostic markers, namely CD38, ZAP-70 and the mutational status of immunoglobulin heavy chain variable gene segments (IGHV), are considered mandatory in clinical trials but not in general practice, since they do not influence therapeutic decisions [1]. The only exception is the analysis of chromosomal aberrations by interphase fluorescence in-situ hybridization (FISH), since high-risk cytogenetic lesions (del11q and del17p) may predict resistance to chemotherapy-based treatments [2]. Wierda et al. [3] proposed combining a set of clinical risk factors, i.e. age, gender, Rai staging, absolute lymphocyte count (ALC) and number of involved lymph node regions (LNR), with an inexpensive and widely available serum marker, beta2-microglobulin (β2M), to develop a prognostic index (PI) stratifying patients into three risk groups with different expected median survival, and a nomogram estimating individual patient survival. This model was subsequently validated in independent patient series, also using time to first treatment as an end-point [4][5][6][7][8]. A reduction of this model from six to four variables, i.e. age, gender, β2M levels and Binet staging, was also shown to predict survival with equal or even better performance [8]. The objective of the present study was to provide evidence that prognostic models for overall survival based on clinical variables [4][5][6][7][8] could be improved by information on biological risk factors.
By retrospectively analyzing a multicentre CLL population of over 600 untreated patients, the most significant and independent biological and clinical prognosticators were integrated into a new clinical-biological prognostic index for group stratification and into a novel nomogram for estimating individual survival.

Patient population
Between 1996 and 2008, a cohort of 620 CLL patients was collected within a larger multicenter patient dataset (n = 1037), previously utilized to propose a modified prognostic model and nomogram [8]; patients were included according to the availability of the following biological prognosticators: IGHV mutational status, chromosomal abnormalities as detected by interphase FISH, and flow cytometric expression of CD38 and ZAP-70. Moreover, since most of the diagnoses of the original patient set were made before the publication of the revised NCI-WG guidelines [1], all cases of previously defined CLL that could be re-classified as monoclonal B cell lymphocytosis (MBL) were removed accordingly. The percentage of recruited cases in the different centers was: 30% at Roma Catholic University, 25% at Novara, 15% at Roma Tor Vergata, 8% at Siena, 6% at Milano, and 4% at each of the other 4 centers. Cut-points for LNR were as previously reported [3]. The continuous variables age and β2M level were categorized using cut-points at 65 years for age and at the upper limit of normal (ULN) for β2M, as deduced from the analysis of martingale residual plots [9]; ALC was categorized at the median, since the martingale residual plots did not show any suitable cut-point.
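The categorization of the continuous variables described above can be illustrated with a minimal Python sketch. This is not the code used in the study (the analyses were run in R). The function name, the dictionary keys and the `b2m_uln` parameter are hypothetical, and the use of strict inequalities for β2M and ALC is an assumption; only the 65-year cut-point for age, the ULN cut-point for β2M and the median cut-point for ALC come from the text.

```python
from statistics import median

AGE_CUTPOINT = 65  # years; cut-point reported in the text


def dichotomize(patients, b2m_uln):
    """Turn continuous variables into binary flags.

    patients: list of dicts with hypothetical keys 'age', 'b2m', 'alc'.
    b2m_uln:  upper limit of normal for beta2-microglobulin, in the same
              units as 'b2m' (laboratory specific, so passed in by the caller).
    """
    # ALC is split at the cohort median, as described in the text.
    alc_median = median(p["alc"] for p in patients)
    return [
        {
            "age_high": p["age"] > AGE_CUTPOINT,
            # Assumption: values strictly above the cut-point count as high.
            "b2m_high": p["b2m"] > b2m_uln,
            "alc_high": p["alc"] > alc_median,
        }
        for p in patients
    ]
```

Passing the ULN as a parameter reflects the fact that the reference range may differ between laboratories in a multicenter series.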

Biological prognosticators
Evaluation of biological prognosticators was centralized in a few reference laboratories, utilizing previously validated common procedures; in detail, 5 centers performed IGHV mutational analysis, and 6 centers performed cytogenetics and flow cytometry. IGHV mutational status was determined as previously reported [10]. Cytogenetic abnormalities involving chromosomes 11 (del11q22; hereafter del11), 12 (trisomy 12), 13 (13q14.3) and 17 (del17p13; hereafter del17) were investigated by interphase FISH, as reported [11]. Results of FISH analyses were classified as unfavourable when high-risk genomic aberrations (del17p and/or del11q) were present [12][13][14]. ZAP-70 expression was determined by flow cytometry, using 20% positive CLL cells as the cut-off to discriminate between ZAP-70 positive and negative cases [15][16][17][18]. CD38 measurements were performed as reported [19], using a threshold of 30% expression to define positive cases. All variables were measured at or within one year from diagnosis, and always before treatment, on either fresh or frozen samples. Data were used upon informed consent from patients and approval by the Institutional Review Boards (Centro di Riferimento Oncologico, Aviano; Catholic University of the Sacred Heart, Rome), and in accordance with the Declaration of Helsinki.

Statistical methods
All analyses were performed in R, an open source statistical package (http://www.r-project.org/). Median follow-up was computed using the reverse censoring method. The primary end points were overall survival (OS) and time-to-first-treatment (TTT), defined as described [1,20,21]. OS was estimated using Kaplan-Meier plots and compared between groups by the log-rank test. Univariate and multivariate Cox models were used to verify the independent prognostic power of each parameter. Model minimization was performed by stepwise backward elimination. A p value < 0.05 was considered statistically significant. Departure from proportionality of hazards was tested in all Cox models. The predictive accuracy of the various Cox models was evaluated by calculating the concordance index (c-index), which is the probability of concordance between predicted and observed survival and is equal to the area under the receiver operating characteristic curve for censored data [22]. A c-index of 0.5 indicates that outcomes are completely random, whereas a c-index of 1 indicates that the model is a perfect predictor. Prediction error was calculated as 1 - c-index. U-statistics were applied to test the significance of differences between c-index values [22]. The nomogram was developed and calibrated following published methods [22]. Final risk group scoring was developed in four steps: 1. selection of independent predictive variables; 2. fitting of a Cox model with the selected variables; 3. score assignment based on regression coefficients; 4. identification of the best cut-points to split the score into 3 risk groups by recursive partitioning [23]. Internal validation of steps 1 and 2 was performed with the bootstrap .632+ method [24,25] with B = 620 bootstrap samples and (step 2) with cross-validation [26]. Variables selected with a frequency greater than 50% were entered in the final model.
The categorical risk score model obtained by recursive partitioning was internally validated by bootstrap methods applied to tree-based analysis [27]. Finally, the whole model building procedure was validated by a comprehensive leave-one-out cross-validation (see Additional file 1: supplementary statistical methods). All p values are based on two-tailed tests.
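To make the c-index and the derived prediction error concrete, the following Python sketch computes a pairwise concordance index for right-censored data. It is an illustration under stated assumptions, not the R routine used in the study: the function names are hypothetical, a pair is treated as usable only when the patient with the shorter time had an observed event, pairs with tied times are ignored for simplicity, and tied risk scores count as half concordant.

```python
def concordance_index(times, events, risk_scores):
    """Pairwise concordance for right-censored survival data.

    times:       follow-up time for each patient
    events:      1 (or True) if death was observed, 0 if censored
    risk_scores: model output, higher meaning worse predicted survival
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            # Usable pair: patient i died strictly before patient j's time.
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


def prediction_error(times, events, risk_scores):
    """Prediction error as defined in the text: 1 - c-index."""
    return 1.0 - concordance_index(times, events, risk_scores)
```

With this definition a model that ranks every usable pair correctly has a c-index of 1 and a prediction error of 0, while a model assigning the same risk to everyone has a c-index of 0.5.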

Patient characteristics
Patient characteristics are reported in Table 1. Treatment was administered according to NCI-WG indications. Deaths occurred mostly in treated patients (83%). Deaths among untreated patients aged over 70 years accounted for 11% of all deaths. All patient characteristics were balanced across the age groups <55, 55-64, 65-74 and ≥75 years (chi-square tests), except for a greater proportion of males in the <55 age group and a greater proportion of high β2M levels and death events in the ≥75 age group. Kaplan-Meier plots of OS and TTT are shown in Figure 1.

Univariate and multivariate analysis for OS and TTT
In univariate analysis for OS all clinical and biological variables were significant, except for del11q (  [24]. All the variables introduced in the final model were selected in more than 50% of bootstrap samples (Table 3); the prediction error in this step of model building was 0.244. The final model fitting was also validated by the bootstrap .632+ method, showing at this step a prediction error of 0.247. Leave-one-out cross-validation [24,26] showed that neither β2M nor gender, the least important variables, could be safely removed from the model (Table 4). Univariate and multivariate analyses of TTT, performed on the subset of Binet A patients below 70 years of age, are shown in Table 5.

Clinical-biological prognostic index
The 4-variable clinical model previously proposed by us [8] was refitted in the present CLL cohort (see Additional file 2: Table S1).  Table 3). The score point distribution is reported in Figure 2a. To this distribution we applied a recursive partitioning method [23], which yielded three prognostic groups, with scores of 0-1, 2-5 and 6-9. The Kaplan-Meier plots of the three risk groups defined by the prognostic score are shown in Figure 2b; for comparison, the risk group partition by the Wierda PI [3] is shown in Figure 2c. In particular, 21% of patients (score 0-1) were at low risk, 63% (score 2-5) were at intermediate risk, and 16% (score 6-9) were at high risk. To show the combination of predictive variables in each patient and in each group we used a heat-map plot (Figure 3). In the low-risk group, comprising 133 cases, 52 patients had no adverse predictors (score 0), 50 patients were male, 16 patients had a β2M > 1 and 15 patients had an unmutated IGHV gene status. Of note, low-risk patients were never aged >65 years, never had Binet stage B or C disease, and never had a CLL bearing del17p (Figure 4). Conversely, in the high-risk group, only 3 and 4 patients, respectively, were aged <65 years or had Binet stage A disease; these patients, however, had all the other prognosticators in their adverse configuration. Moreover, the 51 patients in Binet stage B of the high-risk group mostly had an unmutated IGHV gene status (37/51) or high β2M levels (42/51). Finally, the 29 patients classified as Binet stage C and belonging to the high-risk group mostly had high β2M levels (26/29) (Figure 4). Kaplan-Meier plots of the individual variables are reported in Figure 4.
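The point assignments stated in the abstract (3 points for Binet C, 2 points each for Binet B and age > 65 years, 1 point each for male gender, high β2M, unmutated IGHV and del17p) and the risk group boundaries reported above (0-1, 2-5, 6-9) can be written as a short Python sketch. The function and argument names are hypothetical and the code is only an illustration of the arithmetic, not a clinical tool.

```python
BINET_POINTS = {"A": 0, "B": 2, "C": 3}  # points per Binet stage


def prognostic_score(binet, age, male, b2m_high, ighv_unmutated, del17p):
    """Sum the points of the 6-variable clinical-biological index (range 0-9)."""
    score = BINET_POINTS[binet]
    score += 2 if age > 65 else 0
    # One point for each of the remaining adverse factors.
    score += int(male) + int(b2m_high) + int(ighv_unmutated) + int(del17p)
    return score


def risk_group(score):
    """Map a score to the three groups found by recursive partitioning."""
    if not 0 <= score <= 9:
        raise ValueError("score must be between 0 and 9")
    if score <= 1:
        return "low"
    if score <= 5:
        return "intermediate"
    return "high"
```

For example, a Binet A patient can only reach the high-risk group (score 6) when aged > 65 years and carrying all four 1-point factors, which agrees with the observation that the few Binet A patients in the high-risk group had all the other prognosticators in their adverse configuration.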

Nomogram for estimating prognosis in individual patients
Even though individual estimates of survival, such as those obtained from nomograms, are more likely to be affected by inaccuracy than group estimates [28], a nomogram was developed to allow individual patient survival estimation, as described previously [8]. It was based on the final model with the clinical and biological prognostic factors shown in Table 3, modified by using age and β2M as continuous variables (Figure 5). The clinical-biological nomogram showed better predictive accuracy than the clinical nomogram proposed by Wierda et al. [3] (c-index 0.79 and 0.76, respectively; p = 0.046).

Discussion
Survival time at CLL diagnosis may be simply estimated by means of six variables, four of them clinical-demographic   [3,4,8]. Two independent studies [4,8] failed to confirm the predictive power of ALC. A simplification of the PI from six to four variables was previously proposed by us as capable of stratifying patients with equal or better performance [8]. In the present study, the aim was to improve these clinical prognostic models by adding information on biological variables, in particular those identified by the updated NCI-WG guidelines [1] as mandatory at least in the context of clinical trials. We demonstrated that a PI for OS prediction based on clinical variables could be improved only by IGHV gene mutational status and del17p, but not by CD38, ZAP-70 and del11q. The lack of prognostic power of CD38 and ZAP-70 is not totally unexpected. Similar findings have been reported in analyses of either OS [13,14,29] or TTT [30,31], although none of these reports included both biological and clinical prognosticators in a comprehensive clinical-biological PI, as proposed here. It has often been emphasized that assays evaluating ZAP-70 and, at least in part, CD38 expression suffer from inherent weaknesses and a lack of proper standardization [13][32][33][34]. As a consequence, considerable analytic variability still exists in the measurement of these parameters [35]. Such variability could be more relevant in multi-center series like the one investigated in this study. Indeed, at variance with our results, ZAP-70 or CD38 turned out to be among the strongest prognosticators in mono-center studies [36,37], with time-to-first-treatment or time-to-progression as end-points. Lack of reproducibility and standardization of biological markers can affect the results of prognostic tools applied at different institutions. Our model might be less subject to this bias, since it includes IGHV and del17p, but not the less standardized measurements of CD38 and ZAP-70.
Krober et al. [13,14] have previously shown the importance of molecular risk factors in CLL by stratifying patients by IGHV gene mutational status and presence of high-risk genomic aberrations (del17p or del11q), although the authors did not test whether their model was independent of clinical and demographic risk factors. Here we had the chance to extend the data of Krober et al. [13,14] by showing the independent prognostic relevance of unmutated (UM) IGHV gene status and del17p in a model that also included clinical and demographic risk factors. Of note, the effect of these molecular prognosticators was found to be additive and of equal importance. The unexpectedly limited relevance of molecular risk factors in our model, and the lack of predictive power of CD38 and ZAP-70, may in part be explained by a relatively small number of deaths and a median follow-up of only 5 years, despite the large number of patients collected. Future analyses with longer follow-up and more events might restore significance to some biological variables, indicating the need to update the clinical-biological score. Compared to our previous clinical model [8], we confirmed the value of β2M, although with the smallest coefficient and the weakest level of significance. We had no data to adjust for renal function impairment, particularly in elderly patients, or for other comorbidities. However, β2M was shown to be important in other retrospective and prospective studies [38][39][40]. The value of prognostic factors in elderly CLL patients has recently been questioned by the finding that FISH aberrations (del11q or del17p) and IGHV lost their predictive power for OS in patients aged above 75 years [41]. In our CLL series, we specifically addressed this issue by testing age-dependent variations of the predictive power of all the variables included in the final model. No significant interaction effect was found for age. We found a greater proportion of death events in the oldest age group.
The risk of death in this group may be influenced by other factors not related to the disease. However, epidemiological data from cancer registries show the frequent occurrence of late deaths attributable to CLL also in the elderly patient group [42,43]. The effect of chemoimmunotherapy with anti-CD20 was small, with a non-significant trend towards longer survival (p = 0.10). Results of the present study differed in part from those of a randomized prospective trial [40], where Binet stage and gender, in addition to del11q, CD38 and ZAP-70, all failed to be independent prognostic markers in a multivariate model for OS which also included IGHV gene mutational status, usage of the IGHV3-21 gene, del17p, age and β2M. Notably, that study, which investigated a selected population of patients in need of treatment (i.e. with active or progressive disease), identified del17p as the strongest risk factor [40]. Conversely, in our retrospective study dealing with untreated patients at diagnosis, the relative weight of del17p appeared equal to or lower than that of other variables. Therefore, while the model described in [40] seems to better predict the outcome of CLL patients with progressive or active disease, our model appears to be more suited to estimating survival in untreated patients at diagnosis or before clinical progression. The analysis of TTT in Binet A patients below 70 years of age showed that demographic factors (age, gender), important for OS estimation, lost their prognostic power for TTT. Conversely, it might be expected that biological prognosticators, particularly those with limited or even absent significance in the OS analyses, would have gained more importance in the TTT analyses in this subset of patients. However, only IGHV and β2M confirmed their role, with IGHV being the most important predictor of TTT.
CD38 and ZAP-70 were again not significant, in spite of a good representation of positive cases (respectively 21% and 36%); del17p lost its power whereas del11q gained significance. In this case the low percentage of positive cases may, at least in part, justify the fluctuating results.

Conclusions
In the present study we showed that the survival of untreated CLL patients may be estimated by a limited set of clinical and biological variables, integrated in a prognostic index and in a nomogram, allowing group and individual estimation, respectively. CD38, ZAP-70 and del11q gave redundant prognostic information. Both the proposed PI and the nomogram were only internally validated. Even in internally validated models, the performance of prognostic tools may be influenced or biased by the composition of the population in which they are developed and by the lack of standardization of biological variables. Therefore, the prognostic tools proposed should be used with caution until externally validated on independent, prospective patient series.

Table S1. Previously proposed prognostic score for overall survival with clinical risk factors.