Evaluation of the E-PRE-DELIRIC prediction model for ICU delirium: a retrospective validation in a UK general ICU

Introduction
E-PRE-DELIRIC is a point-of-admission ICU delirium risk prediction tool [1], with reported good or moderate performance [2–4]. In this study, we assessed its performance in a large UK teaching hospital general ICU using routinely collected data, as approved by the local Research Data Governance Committee.

Methods
We retrospectively analysed data for 2445 consecutive ICU admissions (November 2014 to June 2017). Patients were routinely assessed for delirium using the Confusion Assessment Method for the ICU (CAM-ICU) twice daily [5]. As in previous E-PRE-DELIRIC studies [1–4], delirium was defined as any positive CAM-ICU assessment or antipsychotic initiation while on ICU. We adopted the original E-PRE-DELIRIC exclusion criteria [1], excluding 683 ICU admissions for ICU stay < 24 h (425 admissions), incomplete CAM-ICU data (152), delirium on admission (50), comatose throughout entire ICU stay (47), and age under 18 (9). A further 16 admissions were excluded due to missing E-PRE-DELIRIC components, leaving 1746 admissions (1569 unique patients) for analysis; this 71.4% inclusion rate is consistent with previous studies (Table 1).

Results and discussion
Seven hundred sixty-three delirium cases were identified (43.7% of ICU admissions), a higher incidence than reported previously (Table 1). This is likely due to differences in the study population compared to previous studies: more patients were classified as urgent, the mean APACHE II score was higher, and median length of stay (LoS) was longer (Table 1).
The mean E-PRE-DELIRIC score was 0.269 (Q1–Q3; 0.154–0.371). The histogram of E-PRE-DELIRIC scores shows extensive overlap between patients who did and did not develop delirium (Fig. 1a). The receiver operating characteristic (ROC) curve (Fig. 1b) and the precision-recall (PR) curve (Fig. 1c), showing precision (positive predictive value (PPV)) against recall (sensitivity), both indicate moderate-to-poor discriminative performance. The area under the ROC curve (AUROC) was 0.628 (95% CI 0.602–0.653). The area under the PR curve (AUPRC) was 0.534. For sensitivity > 0.1, PPV was between 0.437 and 0.585, indicating only around half of the patients predicted to develop delirium actually did, in a population with 43.7% incidence.
Refitting the E-PRE-DELIRIC logistic regression model to our data hardly improved discrimination: AUROC was 0.648 (95% CI 0.622–0.673) and AUPRC was 0.566.
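For readers less familiar with the discrimination measures used here, the following is a minimal sketch of how an AUROC and a PPV/sensitivity pair can be computed from risk scores and observed outcomes. It is not the analysis code used in this study (the software is not stated); the function names and the toy data are hypothetical and chosen only for illustration. The AUROC is computed through its rank interpretation: the probability that a randomly chosen patient with delirium has a higher score than a randomly chosen patient without, with ties counted as one half.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of cases (label 1) and non-cases (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # case ranked above non-case
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


def ppv_and_sensitivity(scores, labels, threshold):
    """PPV (precision) and sensitivity (recall) when predicting delirium for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    sens = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, sens


# Toy data, invented for illustration only (NOT study data).
toy_scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
toy_labels = [0, 1, 0, 0, 1, 1]

toy_auroc = auroc(toy_scores, toy_labels)
toy_ppv, toy_sens = ppv_and_sensitivity(toy_scores, toy_labels, threshold=0.35)
print(round(toy_auroc, 3), round(toy_ppv, 3), round(toy_sens, 3))
```

Sweeping the threshold over all observed scores and plotting PPV against sensitivity traces out a PR curve; a PPV close to the incidence at most thresholds, as reported for this cohort, means the score adds little beyond the base rate.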
The calibration plot, of predicted risk against observed delirium rate, shows the risk of delirium is considerably underestimated, especially in patients with predicted risk of delirium less than 0.5 (Fig. 1d). Poor calibration is corroborated by the calibration slope model logit(probability of delirium) = alpha + beta × logit(p), where p is the E-PRE-DELIRIC score [6]. The estimated slope beta = 0.58 (95% CI 0.46–0.71) is significantly below 1, indicating the predicted probabilities are overly variable; and the estimated intercept alpha = 0.84 (95% CI 0.74–0.95) is significantly above 0 when fixing beta = 1, indicating the predicted probabilities are predominantly too low. E-PRE-DELIRIC is particularly poorly calibrated for the surgical patients in the study, many of whom have major intra-abdominal pathology: those with predicted risk < 10% had an observed incidence of 26%.
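The calibration slope model above can be sketched in a few lines. The code below is an illustrative assumption, not the study's implementation: it fits logit(probability of delirium) = alpha + beta × logit(p) by Newton-Raphson maximum likelihood, and, when the slope is held at 1, estimates the intercept alone with logit(p) acting as an offset. The function name fit_calibration and the toy data are hypothetical; the toy data were constructed so that the true intercept is ln 2 and the true slope is 0.5.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_calibration(p_pred, y, fix_slope=None, iters=50):
    """Fit logit(P(y=1)) = alpha + beta * logit(p) by Newton-Raphson.

    If fix_slope is given, beta is held at that value and only alpha is estimated.
    """
    x = [logit(p) for p in p_pred]
    alpha = 0.0
    beta = 1.0 if fix_slope is None else fix_slope
    for _ in range(iters):
        mu = [expit(alpha + beta * xi) for xi in x]
        w = [m * (1.0 - m) for m in mu]
        g_a = sum(yi - m for yi, m in zip(y, mu))            # score for alpha
        h_aa = sum(w)                                        # information for alpha
        if fix_slope is not None:
            alpha += g_a / h_aa                              # one-parameter Newton step
            continue
        g_b = sum((yi - m) * xi for yi, m, xi in zip(y, mu, x))
        h_ab = sum(wi * xi for wi, xi in zip(w, x))
        h_bb = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h_aa * h_bb - h_ab * h_ab
        alpha += (h_bb * g_a - h_ab * g_b) / det             # two-parameter Newton step
        beta += (h_aa * g_b - h_ab * g_a) / det
    return alpha, beta


# Toy data, invented for illustration only (NOT study data):
# predicted risk 0.2 -> 1 of 2 events, 0.5 -> 2 of 3 events, 0.8 -> 4 of 5 events.
toy_p = [0.2] * 2 + [0.5] * 3 + [0.8] * 5
toy_y = [1, 0] + [1, 1, 0] + [1, 1, 1, 1, 0]

alpha_hat, beta_hat = fit_calibration(toy_p, toy_y)               # slope model
alpha_fixed, _ = fit_calibration(toy_p, toy_y, fix_slope=1.0)     # intercept with beta = 1
print(round(alpha_hat, 3), round(beta_hat, 3), round(alpha_fixed, 3))
```

A slope below 1 together with a positive intercept at beta = 1, as in this toy example and in the estimates reported above, corresponds to predictions that are too spread out and too low on average.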
While E-PRE-DELIRIC is intended as a point-of-admission score, some of its exclusion criteria are retrospective (LoS; CAM-ICU completeness; comatose throughout). To assess real-world performance, we repeated our analysis without these criteria. The AUROC (0.615) and AUPRC (0.423, with 35.0% observed incidence) remained similar.

Conclusion
In this population, the E-PRE-DELIRIC score is not as discriminative or as well calibrated as previously reported. PPV was only slightly higher than delirium incidence, meaning the utility of E-PRE-DELIRIC for guiding clinical decision-making in this population is limited.