Case mix, outcomes and comparison of risk prediction models for admissions to adult, general and specialist critical care units for head injury: a secondary analysis of the ICNARC Case Mix Programme Database

Introduction: This report describes the case mix and outcome (mortality, intensive care unit (ICU) and hospital length of stay) for admissions to ICU for head injury and evaluates the predictive ability of five risk adjustment models.
Methods: A secondary analysis was conducted of data from the Intensive Care National Audit and Research Centre (ICNARC) Case Mix Programme, a high quality clinical database, of 374,594 admissions to 171 adult critical care units across England, Wales and Northern Ireland from 1995 to 2005. The discrimination and calibration of five risk prediction models (SAPS II, MPM II, APACHE II and III, and the ICNARC model) plus raw Glasgow Coma Score (GCS) were compared.
Results: There were 11,021 admissions following traumatic brain injury identified (3% of all database admissions). Mortality in ICU was 23.5% and in-hospital was 33.5%. Median ICU and hospital lengths of stay were 3.2 and 24 days, respectively, for survivors and 1.6 and 3 days, respectively, for non-survivors. The ICNARC model, SAPS II and MPM II discriminated best between survivors and non-survivors and were better calibrated than raw GCS, APACHE II and III in 5,393 patients eligible for all models.
Conclusion: Traumatic brain injury requiring intensive care has a high mortality rate. Non-survivors have a short length of ICU and hospital stay. APACHE II and III have poorer calibration and discrimination than SAPS II, MPM II and the ICNARC model in traumatic brain injury; however, no model had perfect calibration.


Introduction
Traumatic brain injury is a common and potentially fatal condition. In the United States, 50,000 people die annually after head injury and 80,000 to 90,000 suffer long-term disability [1]. Head injury accounted for more than 120,000 admissions in England during 2000 to 2001, utilising over 320,000 bed days [2]. Ninety percent of head injuries seen in UK Accident and Emergency departments are mild, defined by the Royal Society of Rehabilitation Physicians as Glasgow Coma Score (GCS) 13 to 15 [3], 5% are moderate (GCS 9 to 12) and 5% are severe (GCS 3 to 8) [4].
Patients with severe head injury, in whom treatment is not deemed futile, are cared for in general or specialist intensive care units (ICUs). This is for a variety of reasons, most importantly because patients with a GCS below 9 need endotracheal intubation to protect their airway. Other reasons include management of associated extracranial injuries. Therefore, head injury presents a large burden on critical care facilities in the UK.
Factors associated with increased mortality after head injury include age [5], presenting GCS [6], lower blood pressure [7], serum glucose [8], and hypoxia [9]. Various risk prediction models, such as Simplified Acute Physiology Score (SAPS) II, Mortality Probability Models (MPM) II and Acute Physiology And Chronic Health Evaluation (APACHE) II and III have also been demonstrated to predict head injury mortality [10,11]. This report describes head injury patients admitted to ICUs across England, Wales and Northern Ireland, identified using the Intensive Care National Audit and Research Centre (ICNARC) Case Mix Programme (CMP) Database. The case mix of ICU admissions, outcome and activity associated with these admissions are described. The aim is to indicate the burden of head injury on intensive care nationally to help inform future planning policy and to allow local units to compare their practice and results. A comparison is also made of the ability to predict head injury mortality using several commonly used risk prediction models, which are already well established in intensive care audit.

Case Mix Programme Database
The CMP is a national comparative audit of adult, general critical care units (including ICUs and combined intensive care and high dependency units) in England, Wales and Northern Ireland. Additionally, a small number of specialist units, including neurosurgical units, participate in the audit. The data undergo extensive validation before being incorporated into the CMP Database. Details of the data collection and validation have been reported previously [12]. Data were extracted for 374,594 admissions to 169 general ICUs and 5,743 admissions to two neurosurgical units, from the period December 1995 to May 2005.

Selection of cases
Primary and secondary reasons for admission to ICU are coded in the CMP Database using the ICNARC Coding Method [13], a hierarchical method specifically designed for coding the reasons for admission to ICU. Admissions were selected from the database if they were aged 16 years or over, and the primary reason for admission to ICU was recorded as 'Primary brain injury', 'Subdural haematoma', or 'Extradural haematoma'.

Data
Data were extracted on the case mix, outcome and activity as defined below.

Case mix
The lowest total GCS from the first 24 hours following ICU admission (or the entire stay, if less than 24 hours) is recorded in the CMP if the admission was not sedated or paralysed and sedated for the whole of the first 24 hours.
The pre-sedation GCS quantifies the level of consciousness following traumatic brain injury before the admission is sedated or paralysed and sedated. It is recorded if the admission was sedated or paralysed and sedated at any time during the first 24 hours in ICU.
Severity of illness was summarised by the APACHE II score [14], encompassing weightings for acute physiology (defined by derangement from the normal range for 12 physiological variables in the first 24 hours following admission to ICU), age, and a past medical history of specified severe conditions.
Admissions following emergency surgery were identified based on the source of admission to the CMP unit and the National Confidential Enquiry into Perioperative Death (NCEPOD) classification of surgery, as has been described previously [12].

Outcome
Survival data were extracted at discharge from the CMP unit and at ultimate discharge from hospital.

Activity
Length of stay in the CMP unit was calculated in fractions of days from the dates and times of admission and discharge. Length of stay in hospital was calculated in days from the dates of original admission and ultimate discharge. Readmissions to the unit within the same hospital stay were identified from the postcode, date of birth and sex, and confirmed by the participating units.
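As an illustration of the two length-of-stay definitions, the following minimal Python sketch computes unit stay in fractions of days from timestamps and hospital stay in whole days from dates. The function names are hypothetical and are not taken from the CMP specification; the original analyses were performed in Stata.

```python
from datetime import date, datetime


def unit_los_days(admitted: datetime, discharged: datetime) -> float:
    """Length of stay in the unit, in fractions of days (dates and times)."""
    return (discharged - admitted).total_seconds() / 86400.0


def hospital_los_days(admitted: date, discharged: date) -> int:
    """Length of stay in hospital, in whole days (dates only)."""
    return (discharged - admitted).days


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(unit_los_days(datetime(2000, 1, 1, 12, 0), datetime(2000, 1, 3, 0, 0)))
    print(hospital_los_days(date(2000, 1, 1), date(2000, 1, 25)))
```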

Analyses
A statistical analysis plan was agreed a priori. The analyses performed were as follows.

Descriptive statistics
The case mix, outcome and activity, as above, were described for head injury admissions. Continuous variables were summarised by mean and standard deviation, or median and interquartile range for skewed variables.
The number of admissions and mortality were presented by lowest total GCS for admissions not sedated/paralysed and sedated for the entire first 24 hours in ICU, and by pre-sedation GCS for admissions sedated/paralysed and sedated for the entire first 24 hours in ICU.

Evaluation of models in head injury admissions
The prognostic ability of the APACHE II [14], APACHE III [15], SAPS II [16], MPM II [17] and ICNARC [18] models was assessed. Coefficients for APACHE II were taken from the UK-specific model [19]. These models were evaluated for discrimination (the ability of the model to distinguish survivors from non-survivors), calibration (the accuracy of the estimated probability of survival), and overall fit.
Discrimination was assessed by the area under the receiver operating characteristic (ROC) curve (AUC) [20]. Calibration was assessed by the Hosmer-Lemeshow C*-statistic [21] and Cox's calibration regression [22]. Overall fit of the model was assessed by Brier's score [23].
The AUC (also called the concordance statistic) measures the probability that a randomly selected non-survivor has a higher prediction than a randomly selected survivor. A value of 0.5 indicates no discrimination, and 1 indicates perfect discrimination.
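The concordance interpretation of the AUC can be made concrete with a short Python sketch that compares every non-survivor with every survivor. This is only an illustration: the function name is hypothetical, and counting tied predictions as half-concordant is a common convention assumed here rather than something stated in the text.

```python
def auc(predictions, died):
    """Probability that a random non-survivor has a higher predicted
    mortality than a random survivor (ties count as one half, by assumption)."""
    non_survivors = [p for p, d in zip(predictions, died) if d]
    survivors = [p for p, d in zip(predictions, died) if not d]
    if not non_survivors or not survivors:
        raise ValueError("need at least one survivor and one non-survivor")
    concordant = 0.0
    for p_dead in non_survivors:
        for p_alive in survivors:
            if p_dead > p_alive:
                concordant += 1.0
            elif p_dead == p_alive:
                concordant += 0.5
    return concordant / (len(non_survivors) * len(survivors))


if __name__ == "__main__":
    # Illustrative values only: constant predictions cannot discriminate.
    print(auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
```

The pairwise loop is quadratic in the number of admissions; rank-based formulas give the same value more efficiently for large datasets.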
The Hosmer-Lemeshow test divides the data into a number of equal-sized groups (typically 10) based on the predicted mortality, that is, the 10% with the lowest predicted mortality, the 10% with the next highest, and so on. The observed mortality in these groups is then compared to the expected mortality predicted by the model. The C*-statistic is calculated by summing the following quantity over these groups: $N(O - E)^2 / [E(1 - E)]$, where N is the number of patients in the group, O is the observed mortality and E is the expected mortality. Under the null hypothesis of perfect calibration (observed mortality = expected mortality), the C*-statistic has a chi-squared distribution with degrees of freedom equal to the number of groups. As the C*-statistic is sensitive to variation in sample sizes, it is not appropriate to directly compare C*-statistics for different models unless they have been calculated on the same patients.
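A minimal Python sketch of the grouping and summation is given below. It assumes the summed quantity takes the standard Hosmer-Lemeshow form N(O - E)^2 / (E(1 - E)), with O and E expressed as proportions within each group; the function name and the way near-equal groups are cut are illustrative choices, not the authors' implementation.

```python
def hosmer_lemeshow_c(predictions, died, groups=10):
    """Hosmer-Lemeshow C*-statistic over (near) equal-sized risk groups.

    Assumes each group contributes N * (O - E)**2 / (E * (1 - E)), where
    O and E are the observed and expected mortality proportions.
    Raises ZeroDivisionError if a group's mean prediction is exactly 0 or 1.
    """
    # Sort admissions by predicted mortality only (stable for ties).
    pairs = sorted(zip(predictions, died), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer admissions than groups
        size = len(chunk)
        observed = sum(d for _, d in chunk) / size
        expected = sum(p for p, _ in chunk) / size
        statistic += size * (observed - expected) ** 2 / (expected * (1.0 - expected))
    return statistic


if __name__ == "__main__":
    # Illustrative values only: two groups, each slightly miscalibrated.
    print(hosmer_lemeshow_c([0.2, 0.2, 0.8, 0.8], [0, 0, 1, 1], groups=2))
```

The resulting value would then be referred to a chi-squared distribution with degrees of freedom equal to the number of groups, as described above.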
Cox's calibration regression tests for a systematic lack of calibration by performing a linear recalibration of the log odds. The log odds are given by log(p/(1 - p)), where p is the mortality probability. The following model is fitted: true log odds = slope × predicted log odds + intercept. If the model is perfectly calibrated then the slope will be 1 and the intercept 0, that is, true log odds = predicted log odds. This is tested with a likelihood ratio chi-squared test.
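The recalibration model can be sketched in plain Python as a two-parameter logistic regression of outcome on the predicted log odds, fitted by Newton-Raphson. This is a hedged illustration rather than the analysis code: the function names are hypothetical, and treating the likelihood ratio test as a joint 2 degree-of-freedom test of intercept = 0 and slope = 1 is an assumption, since the text does not state the degrees of freedom.

```python
import math


def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def log_likelihood(intercept, slope, xs, ys):
    """Bernoulli log-likelihood with a logit link."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = intercept + slope * x
        total += y * eta - math.log1p(math.exp(eta))
    return total


def cox_calibration(predictions, died, iterations=25):
    """Fit true log odds = intercept + slope * predicted log odds.

    Returns (intercept, slope, lr_statistic, p_value); the p-value assumes
    a chi-squared distribution with 2 degrees of freedom.
    """
    xs = [logit(p) for p in predictions]
    ys = list(died)
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            mu = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = mu * (1.0 - mu)
            g0 += y - mu
            g1 += (y - mu) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * delta = gradient.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-10:
            break
    lr = 2.0 * (log_likelihood(a, b, xs, ys) - log_likelihood(0.0, 1.0, xs, ys))
    lr = max(lr, 0.0)  # guard against tiny negative rounding error
    p_value = math.exp(-lr / 2.0)  # chi-squared survival function, 2 df
    return a, b, lr, p_value


if __name__ == "__main__":
    # Illustrative values only: predictions that are too extreme give slope < 1.
    preds = [0.2] * 5 + [0.8] * 5
    outcome = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
    print(cox_calibration(preds, outcome))
```

A fitted slope below 1 indicates predictions that are too extreme, and a non-zero intercept indicates systematic over- or under-prediction.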
Brier's score, developed in relation to meteorological forecasting, is an overall measure of accuracy. It is the mean square error between outcome and prediction. For perfect predictions, Brier's score will be 0; for constant predictions of 0.5 for every individual, Brier's score will be 0.25. The lower the value of Brier's score, the more accurate the predictions.
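The definition translates directly into a few lines of Python. The sketch below uses a hypothetical function name and codes the outcome as 1 for death and 0 for survival; it reproduces the two reference values quoted above.

```python
def brier_score(predictions, died):
    """Mean squared error between predicted mortality and outcome (1 = died)."""
    if len(predictions) != len(died):
        raise ValueError("predictions and outcomes must have the same length")
    return sum((p - d) ** 2 for p, d in zip(predictions, died)) / len(predictions)


if __name__ == "__main__":
    print(brier_score([1.0, 0.0], [1, 0]))                # perfect predictions
    print(brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))  # constant 0.5
```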
Admissions were excluded if they were readmissions of the same patient within the same hospital stay, as outcomes would not be independent, or if they were missing the outcome variable of hospital mortality. In addition, the standard exclusion criteria were applied for each model.
Admissions were excluded from APACHE II if they stayed less than eight hours in the critical care unit, were transfers from another critical care unit, or were admitted for burns or following coronary artery bypass graft. Admissions were excluded from APACHE III if they stayed less than four hours in the critical care unit, or were admitted for burns or following coronary artery bypass graft. Admissions were excluded from SAPS II if they were under the age of 18, admitted for burns or following cardiac surgery, transferred to an ICU in another hospital, missing surgical status, or missing ventilation and oxygenation data. The MPM II model gives a mortality prediction on admission to ICU, and again at 24 hours into their stay. The prediction on admission was used for admissions staying less than 24 hours, with the prediction at 24 hours used for all other admissions. Admissions were excluded from MPM II if they were under the age of 18, admitted for burns or following cardiac surgery, or transferred to an ICU in another hospital. The ICNARC model has no exclusion criteria. Participating units in the CMP may elect not to collect additional data for APACHE III, SAPS II and/or MPM II, so all admissions to these units were excluded from the relevant model(s).
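The model-specific exclusion rules and the choice of MPM II prediction can be summarised as a small eligibility function. The Python sketch below is illustrative only: the `Admission` record and all of its field names are hypothetical stand-ins for the corresponding CMP data items, and the unit-level exclusion (units electing not to collect data for a model) is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Admission:
    # Hypothetical field names, not the CMP dataset specification.
    age: int
    unit_stay_hours: float
    transferred_in_from_critical_care: bool
    burns: bool
    cabg: bool
    cardiac_surgery: bool
    transferred_out_to_other_hospital_icu: bool
    surgical_status_missing: bool
    ventilation_oxygenation_missing: bool


def eligible_models(a: Admission) -> set:
    """Return the models for which an admission meets the stated criteria."""
    models = {"ICNARC"}  # the ICNARC model has no exclusion criteria
    if not (a.unit_stay_hours < 8 or a.transferred_in_from_critical_care
            or a.burns or a.cabg):
        models.add("APACHE II")
    if not (a.unit_stay_hours < 4 or a.burns or a.cabg):
        models.add("APACHE III")
    common = (a.age < 18 or a.burns or a.cardiac_surgery
              or a.transferred_out_to_other_hospital_icu)
    if not (common or a.surgical_status_missing
            or a.ventilation_oxygenation_missing):
        models.add("SAPS II")
    if not common:
        models.add("MPM II")
    return models


def mpm2_prediction(a: Admission, on_admission: float, at_24_hours: float) -> float:
    """Admission prediction for stays under 24 hours, else the 24-hour one."""
    return on_admission if a.unit_stay_hours < 24 else at_24_hours


if __name__ == "__main__":
    # Illustrative admission only: a 17-year-old staying six hours.
    example = Admission(17, 6.0, False, False, False, False, False, False, False)
    print(sorted(eligible_models(example)))
```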
Initially, the evaluation of these models was carried out on all the admissions that met the inclusion criteria for each model. A second evaluation was also carried out on just the admissions that met the inclusion criteria for all the models, allowing more direct comparison of the results. As the ICNARC model was developed using the CMP Database, a further sensitivity analysis was performed, excluding all admissions that had been included in the development dataset for the ICNARC model.
The discrimination of the models was also compared to the discrimination of GCS alone (both the lowest GCS from the first 24 hours and the pre-sedation value) in all admissions with a GCS recorded and the subset of these eligible for all the models. As no prediction of mortality is produced directly from GCS, it is not possible to include GCS in comparisons of calibration.
All analyses were performed using Stata 8.2 (Stata Corporation, College Station, TX, USA).

Descriptive statistics
Overall, 11,021 admissions following traumatic brain injury were identified in the database, representing 3.0% of all admissions to these units. The case mix, outcome and activity of these admissions are presented in Table 1. Hospital mortality was considerably higher than ICU mortality (33.5% versus 23%). Of the 938 admissions that died in hospital after their first ICU discharge, 74 (7.9%) had all active treatment withdrawn prior to ICU discharge and 176 (18.8%) were discharged for palliative care.
Of the 11,021 admissions: 4,766 (43.2%) were not sedated or paralysed for the entire first 24 hours in ICU, and had a GCS recorded during this time; 4,331 (39.4%) were sedated and paralysed for the entire first 24 hours in ICU and had a pre-sedation GCS recorded; the remaining 1,914 (17.4%) did not have a GCS recorded. Figures 1 and 2 show the distribution of GCS and its relationship with mortality for these groups. The relationship between GCS and mortality is more extreme (with a higher mortality for GCS 3 and a lower mortality for GCS 15) for GCS measurements from the first 24 hours in ICU than for pre-sedation measurements.

Table 1 Case mix, outcome and activity of admissions following traumatic brain injury by type of unit

Evaluation of models in head injury admissions
Tables 2 and 3 show the measures of model performance for each of the five risk prediction models in all admissions eligible for that model and the 5,393 admissions eligible for all five models, respectively. Figure 3 shows calibration plots for the five models, and Figure 4 shows ROC curves for the five models plus lowest total GCS. MPM II excluded the most admissions, with 7,267 admissions included in the analysis, compared to 10,285 for the ICNARC model. However, this was largely due to units electing not to collect data for MPM II. The ICNARC model had the best performance on all measures, closely followed by SAPS II and MPM II. The APACHE models had much poorer performance. The calibration plots (Figure 3) indicate that APACHE III significantly underestimated mortality for all admissions, whereas APACHE II appears to overestimate mortality for higher risk admissions.
Excluding all admissions included in the development dataset for the ICNARC model left 2,563 admissions. All models exhibited slightly poorer performance in terms of both discrimination and overall fit in this dataset; however, the order of the models was preserved, with the ICNARC model still demonstrating the best performance.
The discrimination of GCS is compared with that of the risk prediction models in Table 4. While the discrimination of the lowest GCS from the first 24 hours in ICU was good (AUC 0.81) in admissions that were not sedated or paralysed for the entire first 24 hours, it was outperformed by all the risk prediction models. APACHE III, SAPS II, MPM II and the ICNARC model all displayed excellent discrimination (AUC 0.89 to 0.91) in this group. Discrimination of all models was considerably worse when restricted to patients that were sedated and had a pre-sedation GCS recorded; however, the raw GCS still displayed worse discrimination than the models.

Discussion
This study examines the outcomes of 11,021 head injury patients admitted to UK ICUs since 1995, and the predictive ability of five risk-adjustment scores for intensive care head injury mortality.
ICNARC is an independent charity and coordinates a national comparative audit of patient outcomes from participating ICUs. In total, 171 UK ICUs contributed data used in this study: 153 in England, 8 in Northern Ireland and 10 in Wales. The CMP is a high quality database and performs well against the Directory of Clinical Databases criteria [12], comprising data on consecutive admissions from each centre, explicit variable definitions, data collection training for observers and objective variables without scope for inter-observer error.
A limitation of the analysis is that a proportion of admissions, 1,692 patients, did not have a documented GCS/pre-sedation GCS. These patients were, therefore, excluded from the analysis, which may have introduced an element of bias.
Seventy-seven percent of the head injury admissions in this analysis were male, a male to female ratio of 3.3:1. In series of several thousands of head injuries in adults, including patients who did not require intensive care and presented with any GCS, males accounted for 67% to 90% of cases [2,8,24-27]. This association is well established and is correlated with the greater sensation seeking behaviour of males [28]. The mean age of adults admitted to the ICUs in this analysis was 44 years. This is moderately higher than other studies that quote mean ages between 28 and 38 years [8,24-27,29,30].
Survivors' length of stay (LOS) in ICU was a median of 3.2 days. Non-survivors only stayed a median of 1.6 days. The design of this analysis is such that we can only speculate upon explanations for this disparity. Firstly, this may be a reflection of limited provision of ICU beds within participating centres.
Much has been written about critical care provision in the UK [31-34]. A shortage of units providing intermediate/high dependency care [35] and intensive care [33] has been identified and appropriately referred patients may still be refused admission to intensive care because of this [34]. Patients in our study may have been cared for on a ward or in a lower level critical care setting and transferred to the ICU only when they had acutely deteriorated, by which point it may have been too late to sizeably influence their outcome. Similarly, there may have been pressure to discharge patients from the ICU prematurely to make way for others with greater perceived critical care need. This appears to be supported by the higher non-survivors' median hospital LOS of three days; that is, patients with ultimately fatal head injury did not spend all of their admission receiving intensive care. Alternatively, the fact that these patients did not stay on an ICU for their entire admission may have reflected sound medical judgement by which intensive care was channelled away from patients whose outcome would not have been expected to be changed significantly regardless of the level of medical attention they received.

Figure 2
Distribution of Glasgow Coma Score and mortality for admissions with a Glasgow Coma Score recorded before sedation.

Figure 1
Distribution of Glasgow Coma Score and mortality for admissions with a Glasgow Coma Score recorded in the first 24 hours.
A study of 843 head injury patients in a UK ICU demonstrated similar overall median ICU LOS of three days [36]. Asthma patients, in contrast, only stay a median of 1.5 days in UK ICUs and represented only 1.7% of UK ICU admissions [37]. As well as longer ICU LOS, head injury accounted for 3% of all ICU admissions in our analysis. Therefore, with intensive care costing £1,219 to £1,638 per day [33], head injury represents a large burden on critical care resources. The survivors tended to stay in hospital for some time, with a median hospital LOS of 24 days. Overall, median LOS was 23 days in a series of 182 head injury patients presenting with a GCS of less than 9 [38], and 10 days in a series of 843 patients requiring ICU treatment [36]. Asthma and chronic obstructive pulmonary disease median hospital LOS were 8 days [37] and 16 days [39], respectively.
It is notable that non-survivors spent only a median of three days in hospital. It appears that if their injury was serious enough to be fatal, they would die early during the admission, within only a couple of days. In contrast, the long hospital LOS of survivors in our study is not surprising. These are highly dependent patients who need almost all activities of daily living performed for them; those requiring initial intubation will require weaning from the respiratory support and their immobility puts them at risk of multiple medical complications, such as venous thromboembolism and pneumonia. Many of these patients will also have associated injuries that will delay their recovery, such as those to the chest, abdominal organs and spine.
Head injury patients in this analysis had a 77% chance of surviving to leave the ICU and a 66.5% chance of surviving to leave hospital. Our in-hospital mortality rate of 33.5% is comparable to previous studies, whose patients were adults and all received intensive care, where the mortality rate was 23% to 39.5% [8,36,38,40]. A mortality of just 14% was described where severe head injury accounted for only 32% of 22,924 patients [41]. Patients in our analysis with a pre-sedation GCS of 3 to 8 had a mortality of 38.5% compared to 44% mortality in a series of severe head injury only [27]. However, that series included all hospital admissions, whereas our analysis was restricted to those who were accepted for intensive therapy; it would therefore have included patients whose prognosis was deemed so dire and unmodifiable that they would not have been ICU candidates or who died prior to admission to an ICU.
It is notable that almost 30% of non-survivors in our analysis died after initial discharge from the ICU. This again may represent sound medical judgement in patients deemed to have a dire prognosis that would not improve significantly despite intensive care; however, only 8% of these patients had all active treatment withdrawn during their ICU stay, and only 19% were specified as a discharge for palliative care. Alternatively, it may again represent a lack of ICU provision to patients with great need for it. An estimated figure of 50% of non-survivors after surgery in the UK are never admitted to an ICU [31]. Clearly all of our patients were within an ICU at some point of their admission but it is possible that the length of time they spent there may not have been optimal.
Surgical status showed that approximately 20% of admissions required emergency surgery prior to their arrival in the ICU. However, the nature of the database does not inform us of the type of surgery performed and, although some may have undergone craniotomy, they may alternatively have undergone surgery for extracranial injuries, for example, laparotomy.

Figure 3
Calibration plots for APACHE II, APACHE III, SAPS II, MPM II and ICNARC models.
Our analysis also compared the performance of five risk prediction models in the prediction of head injury mortality in this population. Risk prediction models can be used to prognosticate but also to allow large-scale audit of outcomes in different centres or at different times. Observational studies of provision and outcomes in critical care often rely on risk prediction models to reduce bias. For these reasons, the models must be robust with as accurate calibration as possible to the particular population. Established models can display a loss of fit when evaluated in different critical care populations [42]. Even more so, this is a potential problem when they are evaluated in a single condition, such as head injury, for which they have not been specifically developed. We compared the models using a spectrum of measures of calibration and discrimination. This followed an approach developed under the guidance of an expert statistical steering committee for a large multicentre comparison of the risk prediction models in all ICU admissions [42]. The use of quantitative measures of model fit (Cox's calibration regression, Brier's score) rather than tests of perfect calibration (Hosmer-Lemeshow) alone allows more reliable comparison of the degree of miscalibration among the models.
Although none of the risk prediction models evaluated in this analysis discriminated perfectly between survivors and non-survivors amongst 5,393 head injury patients in intensive care, SAPS II, MPM II and the ICNARC model discriminated better than APACHE II and III and had superior calibration. The performance of the risk prediction models surpassed that of raw GCS alone. This is a reflection of the importance of multiple factors in the prediction of outcome after head injury. Extracranial factors indirectly reflect the scale of secondary brain injury.
Not only does the outcome from traumatic brain injury depend on adequate oxygenation and perfusion to facilitate restoration of normal neural architecture and physiology, but insufficiency of these also causes further neuronal insult. Therefore, incorporating the factors relating to systemic injury and cardiorespiratory function, for example, allows more accurate prediction of outcome.

Figure 4
Receiver operating characteristic (ROC) curves for APACHE II, APACHE III, SAPS II, MPM II, ICNARC model and lowest total Glasgow Coma Score.

The ICNARC model performed the best of all with respect to both discrimination and calibration. There may be several reasons for this. Firstly, this comparative analysis was based on an original dataset of 374,594 admissions from the CMP and the ICNARC model was derived using 231,930 of those admissions from the same database. Therefore, this would not be a fair representation of the ICNARC model's performance in head injury in intensive care. The comparative analysis was repeated using the remaining 142,664 admissions (2,563 head injuries), where none had been used to develop the ICNARC model, and it was again demonstrated to perform the best. A criticism that could still be raised is that these patients still have a similar case mix to those used to develop the ICNARC model as they came from the same UK ICUs. It is surprising, therefore, that the APACHE II model we used, which had been recalibrated for UK ICUs, did not perform better. In contrast, SAPS II and MPM II were developed using data from 137 ICUs in 12 countries throughout Europe and North America, the UK being only one of them, but still performed better than the UK-calibrated APACHE model. Thus, case mix can only partially explain the differences in model performance.
The second reason for the superior performance of the ICNARC model, followed by SAPS II and MPM II, may be their choice and weighting of variables relevant to neurological outcome. All of the models incorporate a mixture of the basic physiological factors that cause secondary brain injury, such as systolic blood pressure, hypoxia and temperature. However, they treat the neurological status of the patient differently and in varying depth. APACHE III uses a grid combining variations of eye opening, verbal and motor responses to give an overall score. The categories of each component are a compressed form of those in the Glasgow Coma Scale. SAPS II, on the other hand, uses the full GCS, which has been repeatedly shown to independently predict head injury mortality [6,8,26,43,44]. Although MPM II uses a cruder assessment of conscious level, that is, 'coma or deep stupor', it also incorporates the presence of intracranial mass effect. Presence of a mass lesion has also been demonstrated to be an independent predictor of head injury mortality [8,40,45,46]. In the case of sedated patients, APACHE II and III assume their GCS to be 15, which in the context of head injury requiring intensive care is clearly an often-false assumption and will underestimate disease severity. In contrast, SAPS II uses pre-sedation GCS as a direct replacement for the GCS in the first 24 hours, and the ICNARC model uses weightings for sedated and paralysed/sedated patients. The combination is much more likely to give a truer neurological assessment and make the ICNARC model a more appropriate tool in predicting head injury mortality.

Conclusion
This study demonstrates that head injury patients requiring intensive care in the UK have a 77% chance of surviving to leave the ICU and a 66.5% chance of surviving to leave hospital. Non-survivors had a much briefer length of stay than survivors. When predicting mortality in this population using risk prediction models that have been successfully evaluated in the ICU, APACHE II and III were found to have poorer calibration and discrimination than the ICNARC model, SAPS II and MPM II. The ICNARC model performed the best of the five models evaluated, although all models had significant departures from perfect calibration. A comparison between the raw GCS and more detailed ICU predictive models demonstrated the better performance of the models and thus reflects the contribution of extracranial physiological factors to outcome after head injury. While the impact of individual physiological variables on outcome has been recognised by previous studies, we need to examine the relative contribution to outcome in our patient population.

Key messages
• Traumatic brain injury requiring intensive care has an ICU mortality of 23.5%
• Length of stay is much shorter in-ICU and in-hospital for non-survivors
• The ICNARC model, SAPS II and MPM II have superior calibration and discrimination compared to APACHE II and III in traumatic brain injury, although none of these models have perfect calibration
• Lowest GCS from the first 24 hours in ICU had good discrimination when measured, but could not be objectively assessed for 57% of admissions