Development and validation of a score to predict mortality in ICU patients with sepsis: a multicenter retrospective study

Background Early and accurate identification of septic patients at high risk of ICU mortality can help clinicians make optimal clinical decisions and improve patient outcomes. This study aimed to develop and validate (internally and externally) a mortality prediction score for patients with sepsis following ICU admission.
Methods We retrospectively extracted data on adult septic patients from one teaching hospital in Wenzhou, China and from a large multicenter critical care database from the USA. Demographic data, vital signs, laboratory values, comorbidities, and clinical outcomes were collected. The primary outcome was ICU mortality. A mortality prediction score for sepsis was developed and validated using multivariable logistic regression.
Results A total of 4236 patients were included in the development cohort and 8359 patients in three validation cohorts. The Prediction of Sepsis Mortality in ICU (POSMI) score included age ≥ 50 years, temperature < 37 °C, respiratory rate > 35 breaths/min, MAP ≤ 50 mmHg, SpO2 < 90%, albumin ≤ 2 g/dL, bilirubin ≥ 0.8 mg/dL, lactate ≥ 4.2 mmol/L, BUN ≥ 21 mg/dL, mechanical ventilation, hepatic failure and metastatic cancer. The area under the receiver operating characteristic curve (AUC) was 0.831 (95% CI, 0.813–0.850) in the development cohort and ranged from 0.798 to 0.829 in the three validation cohorts. The POSMI score had a higher AUC than both the SOFA and APACHE IV scores. The Hosmer–Lemeshow (H–L) goodness-of-fit test results and calibration curves suggested good calibration in the development and validation cohorts. The POSMI score still exhibited excellent discrimination and calibration in the sensitivity analysis. With regard to clinical usefulness, decision curve analysis (DCA) showed a higher net benefit for POSMI than for SOFA and APACHE IV in the development cohort.
Conclusion POSMI was validated to be an effective tool for predicting mortality in ICU patients with sepsis. Supplementary Information The online version contains supplementary material available at 10.1186/s12967-021-03005-y.


Data sources
This was a multicenter, retrospective, observational study conducted using data from the eICU Collaborative Research Database, a large multicenter critical care database containing information on 139,367 patients from 335 ICUs in 208 hospitals across the USA in 2014 and 2015 [12,13]. The study also obtained patient clinical data from the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China, which has over 2000 beds. Data on septic patients admitted to the hospital between January 2010 and September 2020 were obtained through the electronic medical record management system. The study was approved by the Ethics Committee of the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University.

Participants
Participants were enrolled based on the Sepsis-3 definition, i.e., a known or suspected infection plus a SOFA score ≥ 2 points indicating organ dysfunction [14,15]. For septic patients admitted to the ICU more than once, only the first ICU admission was included. Patients younger than 18 years of age were excluded. Considering the different levels of medical care in different regions, we divided the eICU patients into three groups (Midwest, West and South) according to hospital location in the USA. Septic patients from the Midwest were used as the development cohort because this group had the largest sample size. Patients from the West and South were used as external validation sets, and data from the Chinese ICU served as another external validation set.

Variables
Structured Query Language (SQL) with pgAdmin 4 (version 4.30) was used to extract data from the eICU database. The study retrospectively collected the following data: (1) demographic information, including age, sex, race, height and weight; (2) site of infection, including pulmonary, renal/urinary tract infection (UTI), cutaneous/soft tissue, gastrointestinal (GI), gynecologic, others, and unknown; (3) APACHE IV and SOFA scores on the day of ICU admission; (4) vital signs, including temperature, heart rate, respiratory rate, systolic pressure, diastolic pressure, mean arterial pressure (MAP) and oxygen saturation, taken from the first records after ICU admission; (5) laboratory data, including albumin, bicarbonate, bilirubin, creatinine, glucose, hematocrit, hemoglobin, lactate, platelet count, blood urea nitrogen (BUN), white blood cell (WBC) count and alanine transaminase (ALT), within 24 h of ICU admission; (6) comorbidities, including Acquired Immunodeficiency Syndrome (AIDS), hepatic failure, lymphoma, metastatic cancer, leukemia, immunosuppression, and cirrhosis. For laboratory data recorded more than once, the value indicating the greatest disease severity was used. The proportion of missing values was less than 10% for every variable.

Endpoints
The main outcome of the present study was ICU mortality. Survival following admission to the ICU was clearly recorded in the eICU database. On the other hand, electronic

Statistical analysis
The Shapiro-Wilk test was used to examine whether the data were normally distributed. Categorical variables were described as frequencies (percentages) and continuous variables as mean (SD) or median (interquartile range), as appropriate. Non-normally distributed continuous variables were compared using the Wilcoxon rank-sum test, while Student's t test was used for normally distributed data. Categorical variables were analyzed using the chi-squared test or Fisher's exact test, as appropriate. Because the primary outcome was ICU mortality, univariate logistic regression analyses were conducted to identify the unadjusted associations between potential predictors and ICU mortality. Backward stepdown logistic regression based on the smallest Akaike Information Criterion (AIC) value was then used to identify the independent risk variables for ICU mortality [16]. Multicollinearity was examined using the Variance Inflation Factor (VIF) of each predictive variable, and a VIF ≥ 5 indicated multicollinearity among variables.
Thereafter, the continuous independent predictor variables identified above were transformed into categorical variables based on quartiles, and all the categorical variables (including AIDS, hepatic failure and metastatic cancer) were then entered into multivariable logistic regression to identify the final predictor variables for the prediction scoring system. The study developed this scoring system to predict mortality in septic patients and named it Prediction of Sepsis Mortality in the ICU (POSMI). Each predictor variable was allocated an integer or half-integer score, calculated by dividing its regression coefficient by the smallest regression coefficient. The sum of the predictor variable scores yielded a total score for each individual, and this total score was included in the final regression model. The model's discrimination for ICU mortality was examined using the area under the receiver operating characteristic curve (AUC), and calibration was assessed using calibration curves and the Hosmer-Lemeshow (H-L) goodness-of-fit test. DeLong's non-parametric method was used to compare two AUC values obtained from samples of equal size [17]. Following the recommendations of Hosmer and Lemeshow, an AUC ≥ 0.7 indicated acceptable discrimination while an AUC ≥ 0.8 indicated excellent discrimination. Furthermore, Integrated Discrimination Improvement (IDI) was used to evaluate improvement in model performance [18], and the 95% Confidence Intervals (CIs) were calculated using non-parametric bootstrapping. To assess the clinical utility of the POSMI score, Decision Curve Analysis (DCA) was performed to compare the net benefit of the POSMI, APACHE IV and SOFA scores in the prediction of ICU mortality at different threshold probabilities.
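The point allocation described above can be illustrated with a minimal sketch. It assumes that "an integer or half-integer score" means rounding the coefficient ratio to the nearest 0.5, which the text does not state explicitly. The function name and the example coefficients are hypothetical and are not values from the study.

```python
def allocate_points(coefficients):
    """Divide each regression coefficient by the smallest absolute one and
    round to the nearest half point (the rounding rule is an assumption)."""
    smallest = min(abs(b) for b in coefficients.values())
    return {
        name: round(2 * b / smallest) / 2
        for name, b in coefficients.items()
    }


# Hypothetical coefficients, for illustration only (not taken from Table 2).
example = {"predictor_a": 0.40, "predictor_b": 0.95, "predictor_c": 1.25}
points = allocate_points(example)

# Total score of a hypothetical patient in whom all three predictors are present.
total = sum(points.values())
```

A patient's total score is simply the sum of the points for the predictors that are present, which is what makes a score of this kind easy to compute at the bedside.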
Because missing data could have influenced the results, multiple imputation by chained equations was performed before data analysis to minimize bias and maintain statistical power. The "mice" package in R was used to implement this method. Sensitivity analyses were conducted to evaluate the influence of imputing missing values. All statistical analyses were conducted using R (version 3.6.1), and a p value < 0.05 was considered statistically significant.

Populations
A total of 10,754 septic patients from the eICU database met the inclusion criteria. Based on geographical location in the USA, 4236 patients were from the Midwest, 3185 from the West, 2934 from the South, 386 from the Northeast and 13 from unknown locations. In addition, the study consecutively collected a total of 1878 sepsis cases from the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China, between January 2010 and September 2020. Septic patients from the Midwest, the region with the largest sample size, were used as the development cohort. The median age in the development cohort was 69 years (interquartile range, 57 to 80 years) and 53% of the patients were male. The infection sites most frequently associated with sepsis were pulmonary (41%), renal/UTI (23%), GI (13%), unknown (10%), cutaneous/soft tissue (8%) and others (4%). The ICU mortality and hospital mortality rates were 11.8% and 19.1%, respectively. Patients from the West of the USA and from Wenzhou, China were used as validation sets, named the West and Wenzhou validation cohorts. Patients from the Northeast and unknown regions were grouped with the South region because of their small sample sizes and used as another validation set, named the South validation cohort. Table 1 shows the demographic and clinical characteristics of the development and validation cohorts.
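The cohort grouping described above can be summarized as a small lookup. This is only a sketch of the assignment rule as stated in the text; the function name, argument names and label strings are hypothetical and do not correspond to fields in the eICU database.

```python
def assign_cohort(source, region=None):
    """Map a patient's data source and US region to a study cohort."""
    if source == "wenzhou":
        return "Wenzhou validation"
    if region == "Midwest":
        return "development"
    if region == "West":
        return "West validation"
    # South, Northeast and unknown regions were pooled into one cohort.
    return "South validation"


# Example: an eICU patient from a Northeast hospital joins the South cohort.
example_cohort = assign_cohort("eicu", "Northeast")
```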

Predictors of ICU mortality
Univariate logistic regression was used to test for potential risk factors that would predict ICU mortality. Most of the variables were associated with ICU mortality (Additional file 1: Table S1). To determine the independent risk factors, backward stepdown multivariable logistic regression based on the smallest AIC value was then performed. The results revealed fourteen independent risk factors: age, temperature, heart rate, respiratory rate, MAP, SpO2, mechanical ventilation, albumin, bilirubin, lactate, BUN, AIDS, hepatic failure and metastatic cancer (Additional file 1: Table S2). Thereafter, the continuous variables were converted into categorical variables based on quartiles, both to further validate these continuous independent predictors and for practical purposes. Multivariable logistic regression analysis identified the following final predictors: age ≥ 50 years, temperature < 37 °C, respiratory rate > 35 breaths/min, MAP ≤ 50 mmHg, SpO2 < 90%, albumin ≤ 2 g/dL, bilirubin ≥ 0.8 mg/dL, lactate ≥ 4.2 mmol/L, BUN ≥ 21 mg/dL, mechanical ventilation, hepatic failure and metastatic cancer (Additional file 1: Table S3).

The POSMI score
Twelve variables were used to create the POSMI score and each prognostic variable was assigned a score (Table 2). The POSMI score for each patient was the sum of the points corresponding to the prognostic factors present and ranged from 0 to 25. Septic patients in the Midwest development cohort were divided into four categories according to the distribution of the POSMI score: low risk (0-6 points), with 1.2% ICU mortality; moderate risk (7-10 points), with 6.3% ICU mortality; high risk (11-15 points), with 23.3% ICU mortality; and very high risk (> 15 points), with 66.9% ICU mortality (Table 3).
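The four risk categories can be expressed as a simple lookup. The sketch below takes the very high-risk band to be scores above 15, which is what the other three bands imply. It also assumes that half-point totals fall into the next band up (for example, 6.5 is treated as moderate risk), which the text does not specify. The function and constant names are hypothetical.

```python
# Observed ICU mortality (%) per risk class in the development cohort, from the text.
OBSERVED_ICU_MORTALITY_PCT = {
    "low": 1.2,
    "moderate": 6.3,
    "high": 23.3,
    "very high": 66.9,
}


def posmi_risk_class(score):
    """Return the risk class for a total POSMI score between 0 and 25."""
    if not 0 <= score <= 25:
        raise ValueError("POSMI score must be between 0 and 25")
    if score <= 6:
        return "low"
    if score <= 10:
        return "moderate"
    if score <= 15:
        return "high"
    return "very high"


# Example: a patient scoring 12 points falls in the high-risk class.
example_class = posmi_risk_class(12)
example_mortality = OBSERVED_ICU_MORTALITY_PCT[example_class]
```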

Risk stratification
Classification of the Midwest development cohort based on the POSMI score resulted in 1126 (26.6%) patients in the low-risk class, 1836 (43.3%) in the moderate-risk class, 1105 (26.1%) in the high-risk class and 169 (4.0%) in the very high-risk class (Table 3). Classification results for the West, South and Wenzhou validation cohorts were similar to those of the development cohort.
In the West, South and Wenzhou validation cohorts, 24.8%, 24.5% and 24.6% of the patients, respectively, were assigned to the low-risk class; 40.9%, 42.7% and 39.7%, respectively, to the moderate-risk class; 28.5%, 27.5% and 30.0%, respectively, to the high-risk class; and 5.8%, 5.3% and 5.7%, respectively, to the very high-risk class (Table 3). Within each of the four risk classes, ICU mortality rates in the three validation cohorts were similar to those in the development cohort (Table 3). Furthermore, the ICU mortality predicted by the POSMI score was very close to the actual ICU mortality at all four risk levels (Table 3).

Validation of the POSMI score
The performance of the POSMI score in predicting ICU mortality in septic patients was compared to that of the SOFA and APACHE IV scores. In the development cohort, the AUC for the POSMI score was 0.831 (95% CI, 0.813-0.850), significantly higher than that of the SOFA score, which was 0.728 (95% CI, 0.703-0.754), and the APACHE IV score, which was 0.773 (95% CI, 0.752-0.795) (Fig. 1A; Table 4). This indicated that the POSMI score had better discrimination than both the SOFA and APACHE IV scores. Similarly, the AUC values for the POSMI score in the West and South validation cohorts were above 0.8 and were also significantly higher than those of both the SOFA and APACHE IV scores. The results therefore showed that the POSMI score had excellent discrimination in predicting mortality in ICU patients with sepsis (Fig. 1B, C; Table 4). In the Wenzhou validation cohort, the AUC of the POSMI score was 0.798 (95% CI, 0.769-0.826), higher than that of the SOFA score, which was 0.747 (95% CI, 0.714-0.780), and the APACHE IV score, which was 0.777 (95% CI, 0.747-0.807). However, the difference in AUC between POSMI and APACHE IV was not significant in this cohort (Fig. 1D; Table 4).
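For readers less familiar with the AUC, it can be read as the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the toy data are hypothetical, and this is not the software routine used in the study.

```python
def auc_from_scores(scores, died):
    """AUC as the share of (non-survivor, survivor) pairs ranked correctly."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data for illustration only: five patients, two of whom died.
toy_scores = [12, 9, 3, 9, 5]
toy_died = [True, True, False, False, False]
toy_auc = auc_from_scores(toy_scores, toy_died)
```

In the toy data, 5.5 of the 6 possible pairs are ranked correctly (one pair is tied), so the AUC is 5.5/6.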
Additionally, the calibration of the POSMI score was assessed using calibration curves and the H-L Chi-square test in the development and validation cohorts. The bias-corrected curve, generated through a bootstrap method, showed a slight deviation from the reference line, although the predicted ICU mortality was still in good agreement with the actual ICU mortality (Fig. 2). Moreover, the H-L Chi-square test showed that the POSMI score had good calibration in the Midwest development cohort (H-L Chi-square = 10.963; p = 0.204). Good calibration was also confirmed in the West (H-L Chi-square = 3.092; p = 0.929), South (H-L Chi-square = 10.888; p = 0.208) and Wenzhou validation cohorts (H-L Chi-square = 13.135; p = 0.107), as shown in Table 4. In addition, the IDI showed a significant improvement for the POSMI score over the SOFA and APACHE IV scores in the development and validation cohorts, suggesting that the POSMI score significantly improved prediction performance (Table 4). Furthermore, excellent discrimination and calibration were still observed in the sensitivity analyses.
Table 2 Risk factors for the predictive model for ICU mortality in the Midwest development cohort (n = 4236). OR, odds ratio. a ICU mortality odds ratio. b Assignment of points to risk factors was based on a linear transformation of the corresponding β regression coefficient. The coefficient of each variable was divided by 0.4119 (the smallest absolute β value, corresponding to BUN ≥ 33 to < 52 mg/dL) and each variable was allocated an integer or half-integer score.
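The H-L p values reported for the four cohorts can be checked from the Chi-square statistics with a short calculation. The sketch assumes 8 degrees of freedom (ten risk groups minus two, the usual convention for the H-L test), because the number of groups is not stated in the text. It uses the closed-form upper tail of the Chi-square distribution, which holds for even degrees of freedom only. The function and variable names are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a Chi-square variable with even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^i / i! for i = 0 .. df/2 - 1
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# (H-L Chi-square, reported p value) for each cohort, as given in the text.
reported = {
    "Midwest": (10.963, 0.204),
    "West": (3.092, 0.929),
    "South": (10.888, 0.208),
    "Wenzhou": (13.135, 0.107),
}
recomputed = {name: chi2_sf_even_df(stat, 8) for name, (stat, _) in reported.items()}
```

Under the assumption of 8 degrees of freedom, the recomputed values agree with the reported p values to within 0.001.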

Net benefit of using the POSMI score
Decision curve analysis (Fig. 3) showed that POSMI had a positive net benefit at predicted threshold probabilities between 1 and 80%, compared to treating septic patients as if they would all die or all survive (i.e., the treat-all or treat-none strategies). The estimates of net benefit from using the POSMI score at different probability thresholds are provided in Table 5 (more estimates of net benefit are shown in Table S4). The net benefit was positive at threshold probabilities of 1-60% for the SOFA score and 1-80% for the APACHE IV score (Table 5). With regard to clinical use, medical treatment aided by POSMI had a greater net benefit than treatment aided by either the SOFA or the APACHE IV score when the predicted threshold probability was between 1 and 80% (Fig. 3; Table 5).
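The net benefit plotted in a decision curve follows the standard definition: true positives per patient minus false positives per patient, with the latter weighted by the odds of the threshold probability. The sketch below applies that definition. The function names and the counts in the example are hypothetical; only the 11.8% ICU mortality of the development cohort comes from the text.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit of acting on patients flagged at a given threshold probability."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1 - threshold)  # odds of the threshold probability
    return true_pos / n - weight * false_pos / n


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of treating every patient as high risk."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1 - threshold)
    return prevalence - weight * (1 - prevalence)


# Hypothetical counts: 1000 patients, 80 true and 150 false positives at a 20% threshold.
nb_model = net_benefit(true_pos=80, false_pos=150, n=1000, threshold=0.20)

# Treat-all strategy at the same threshold, using 11.8% ICU mortality.
nb_all = net_benefit_treat_all(prevalence=0.118, threshold=0.20)
```

The treat-none strategy has a net benefit of zero by definition, so a model is useful at a given threshold only if its net benefit exceeds both zero and the treat-all value.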

Discussion
The present study involved 12,631 patients admitted with sepsis to more than 300 ICUs in over 200 hospitals. The study developed and validated (both internally and externally) the POSMI score for predicting the risk of ICU mortality. Although some of the predictor variables in the risk score have been reported previously, there is a limited number of tools for predicting the risk of mortality in septic patients [19-21]. The novel POSMI score had a number of advantages. The score could easily be implemented using commonly available variables and had good calibration as well as discrimination for ICU mortality in septic patients in both the development and validation cohorts. Additionally, the discrimination and IDI of the POSMI score were significantly higher than those of the APACHE IV and SOFA scores (except in the Wenzhou validation cohort, where the discrimination of the POSMI score was similar to that of the APACHE IV score). The score may therefore be well suited to guiding decision-making in clinical practice for the management of septic patients. Moreover, the POSMI score showed comparable or better discrimination for predicting ICU mortality in sepsis than other predictive scoring systems in sepsis and critically ill patients. These include scores for the prediction of mortality in sepsis (AUC, 0.68-0.75) [20,22-24], prediction of ICU mortality in surgical patients (AUC, 0.72) [25], prediction of mortality in the critically ill with sepsis using the SOFA score (AUC, 0.77) [26] and prediction of mortality in an academic cardiac intensive care unit using the APACHE IV score (AUC, 0.82) [27]. Considering the high morbidity and mortality rates associated with sepsis, it is necessary to establish a risk score that allows clinicians to accurately predict and evaluate the outcomes of septic patients. This will also be important in clinical decision-making.
Body temperature is a main area of focus in studies on sepsis [28]. For instance, two recent studies on body temperature and sepsis showed that hyperpyrexia was associated with poor prognosis in septic patients [29,30]. In addition, a randomized controlled trial demonstrated that fever control by external cooling significantly reduced early mortality in septic shock [31]. Most studies have shown that hypothermia was associated with higher mortality in septic patients [32-34]. According to a previous study, the occurrence of fever in sepsis may be associated with better survival [35]. The present study found an association between an admission body temperature below 37 °C and the risk of ICU mortality, although body temperature alone was not sufficiently predictive of the severity of illness. In addition to body temperature, the study showed that respiratory rate and blood pressure were also predictors of poor outcomes in patients with sepsis. It is noteworthy that the two have been adopted as predictors in many critical illness prediction scoring systems, such as qSOFA [20] as well as the APACHE II and IV scores [5,6]. Heart rate was not independently associated with mortality from sepsis in the study, although variability in heart rate was associated with mortality from sepsis in some previously reported studies [36-38]. Consequently, heart rate at admission was not incorporated in the POSMI score. Low albumin, high bilirubin and high BUN reflect acute and/or chronic damage to the liver and kidney, both of which are strong and independent risk factors for poor prognosis in critical illness [41]. Additionally, high serum lactate has been shown to be significantly associated with mortality in patients with sepsis [42-44]. All these factors could therefore provide important prognostic information for the prediction model.
Performance of the model in this study was evaluated based on discrimination and calibration through statistical analysis and graphical methods. The AUC for the development and validation cohorts ranged from 0.798 to 0.831, reflecting the excellent ability of the model to discriminate ICU mortality in patients. Additionally, the H-L goodness-of-fit test results and calibration curves suggested that the predicted ICU mortality was similar to the actual ICU mortality, indicating that the prediction model was well calibrated. Moreover, the study validated the calibration of the prediction model at four risk levels (low, moderate, high and very high risk). As expected, the ICU mortality predicted by the model was closely consistent with the actual ICU mortality. In addition, the IDI values of POSMI in the development and validation cohorts were all significantly higher than those of the APACHE IV and SOFA scores, suggesting that the prediction model was superior to both the APACHE IV and SOFA scores. The study also conducted sensitivity analyses in cohorts with no missing values; the POSMI score still maintained excellent discrimination and calibration, suggesting that the results were not driven by the multiple imputation approach. With regard to clinical benefit, patients could get more net benefit from using the POSMI score.
Although a high-or a very high-risk score does not directly influence treatment decision-making, it may be useful in making objective prognoses and recommendations for clinicians as well as patients and their families. Nevertheless, further studies are required to confirm the clinical application of the POSMI score. In addition, clinical trials on sepsis may benefit from using the POSMI score as an inclusion and exclusion criterion. For instance, very high-risk patients, where therapeutic measures may not bring clinical benefits because of the severity of disease and low-risk patients whose event rate may be too low to warrant inclusion, may be excluded to optimize the study design. Furthermore, the POSMI score could facilitate patient stratification in clinical studies.
The present study had a number of strengths. First, the POSMI score had excellent model performance in the development and external validation cohorts. The POSMI score was also relatively easy to calculate and all the variables could easily be obtained. In addition, the development and validation cohorts were from hospitals of different sizes (most hospitals in the eICU database are small and medium-sized, while the Wenzhou validation cohort came from a large hospital), making it possible to use the model in other hospitals or countries. Nonetheless, the study had a few limitations. First, this was a retrospective cohort study and, although we adjusted for many potential confounders, the possibility of residual confounding remains. In addition, the POSMI score was only validated in the USA and China, so further validation is needed to determine whether our prediction model can be applied to other locations or countries. Second, information on the time from onset of illness to hospital admission was missing. Third, the reported ICU mortality was all-cause mortality because the cause of death was not available in the cohorts. Finally, there was no information about the treatments given to prevent ICU mortality. Most notably, the baseline differences between the study populations (different continents), ICU practices and study dates (2 years vs. 10 years, one of which includes the COVID-19 pandemic, which is likely to have influenced the ICU data collected during that time) were not addressed in the present study. Hence, more prospective studies are needed to validate these findings.

Conclusions
In conclusion, the present study developed and validated a simple risk score, the POSMI score, which is valuable in predicting mortality in septic patients admitted to the ICU. The POSMI score was superior to the SOFA and APACHE IV scores in the present study. We anticipate it will be most useful for risk stratification and decision-making.