Validation and analysis of prognostic scoring systems for critically ill patients with cirrhosis admitted to ICU

Introduction: The number of patients admitted to ICU who have liver cirrhosis is rising. Current prognostic scoring tools to predict ICU mortality have performed poorly in this group. In previous research from a single centre, a novel scoring tool which modifies the Child-Turcotte Pugh score by adding lactate concentration, the CTP + L score, was strongly associated with mortality. This study aims to validate the use of the CTP + L scoring tool for predicting ICU mortality in patients admitted to a general ICU with cirrhosis, and to determine significant predictive factors for mortality within this group of patients. This study will also explore the use of the Royal Free Hospital (RFH) score in this cohort.
Methods: A total of 84 patients admitted to the Glasgow Royal Infirmary ICU between June 2012 and December 2013 with cirrhosis were included. An additional cohort of 115 patients was obtained from two ICUs in London (St George's and St Thomas'), collected between October 2007 and July 2009. Liver-specific and general ICU scoring tools were calculated for both cohorts, and compared using the area under the receiver operating characteristic (ROC) curve. Independent predictors of ICU mortality were identified by univariate analysis. Multivariate analysis was utilised to determine the most predictive factors affecting mortality within these patient groups.
Results: Within the Glasgow cohort, independent predictors of ICU mortality were identified as lactate (p < 0.001), bilirubin (p = 0.0048), PaO2/FiO2 ratio (p = 0.032) and PT ratio (p = 0.012). Within the London cohort, independent predictors of ICU mortality were lactate (p < 0.001), PT ratio (p < 0.001), bilirubin (p = 0.027), PaO2/FiO2 ratio (p = 0.0011) and ascites (p = 0.023). The CTP + L and RFH scoring tools had the highest area under the ROC curve in both cohorts examined.
Conclusion: The CTP + L and RFH scoring tools are validated prognostic scoring tools for predicting ICU mortality in patients admitted to a general ICU with cirrhosis.


Introduction
The prevalence of liver disease in Scotland has been increasing over the last 30 years. Mortality due to liver disease is one of the few causes of death that is increasing [1]. There is a similar picture in England and Wales, with liver disease being the fifth most common cause of mortality after heart disease, cancer, stroke, and respiratory disease. This is in contrast to most Western European countries, which have seen a decline [1,2].
Liver disease accounts for an increasing proportion of Intensive Care Unit (ICU) and hospital admissions. Admissions rose by 71 % in male patients and 43 % in female patients between 1990 and 2003. This change has been mainly attributed to alcohol, which accounted for 85 % of liver disease deaths in 2007 [3]. Overall, patients with cirrhosis account for 15 % of Glasgow Royal Infirmary ICU admissions, and 3.3 % of ICU admissions in St George's and St Thomas' Hospitals. Patients with liver disease admitted to ICU have poor outcomes and a complex disease process. Mortality in these patients is widely documented in the literature, with a meta-analysis of seventeen papers in 2010 showing the weighted mean ICU and hospital mortality to be 45 % and 58 %, respectively [4,5].
There are currently no validated prognostic scoring tools to predict ICU outcome in patients with cirrhotic liver disease within the general ICU setting that can be calculated quickly at a patient's bedside. Existing hepatic scoring tools are designed for a specific use, for example, the Child-Turcotte Pugh (CTP) score was designed to predict mortality following surgical treatment of oesophageal varices, and the United Kingdom model for end-stage liver disease (UKELD) was designed to assess patients for transplant in the UK [6,7].
In research conducted at Glasgow Royal Infirmary, existing scoring tools (the CTP, UKELD, model of end-stage liver disease (MELD), Glasgow alcoholic hepatitis score (GAHS), sequential organ failure assessment (SOFA), the acute physiology and chronic health evaluation II (APACHE II), and the chronic liver failure-sequential organ failure assessment score (CLIF-SOFA)) did not reach the level of clinical usefulness based on receiver operating characteristic (ROC) curve analysis of an area under the curve (AUC) of ≥0.8 [8][9][10]. Therefore, these existing tools may not be clinically useful for predicting ICU mortality. Analysis of this cohort found lactate, bilirubin, ascites, and prothrombin time (PT) ratio as independent predictors of outcome [11]. Other published studies have shown promising results for predicting outcome in patients with cirrhosis using the SOFA and MELD scoring tools, and have also demonstrated that the current CTP score is not the most effective tool for predicting outcome in patients with cirrhosis [12,13].
The relationship between blood lactate concentration on admission to the ICU and mortality in patients with cirrhosis is widely demonstrated within the literature [14][15][16][17]. Despite this, only one existing liver-specific scoring tool includes lactate: the Royal Free Hospital (RFH) score, which is validated for use in a tertiary hepatic treatment centre [18]. As a result, in a previous study [11] lactate was incorporated into an existing scoring tool, the CTP, to produce two novel tools. The CTP score was chosen because its categorical variables can easily be calculated at a patient's bedside. One scoring tool (CTP-L) splits the lactate into bands and awards 1, 2 or 3 points, in the same way as the other variables in the CTP score. The other tool (CTP + L) adds the raw value of lactate (mmol/L) to the existing CTP score. These unvalidated tools performed well in a single cohort of patients, but the results need to be validated in patients from another centre to demonstrate the usefulness of the scoring tools [11]. This study aims to validate these newly created scoring tools as prognostic measures of ICU outcome using data obtained from another ICU centre, and to determine the most predictive factors for ICU outcome within these cohorts.

Methods
Data collection for the previous study took place between June 2012 and May 2013, and 59 patients were recruited from the Glasgow Royal Infirmary (GRI) ICU. This is a 20-bed facility with a large gastroenterology unit, but is not a tertiary hepatic transplant centre [11]. An additional cohort of 25 patients from the GRI was recruited by extension of the data collection period by 6 months, giving a combined total of 84 patients. Data were collected as part of routine data collection within the department and no additional consent was required. Ethics approval was granted by the Local Research Ethics Committee (West of Scotland Research Ethics Committee, approved 20 March 2012, REC reference: 12/WS/0039, Chair: Dr Gregory Ofili) for the original data collection and the extension of data collection.
Inclusion was based on the presence of cirrhosis on admission to the ICU in any patient over 18 years old. Cirrhosis was diagnosed either histologically following biopsy or clinically by evidence of portal hypertension and one of the following: ascites, encephalopathy or oesophageal varices. Diagnosis was confirmed by an independent clinician.
Clinical and demographic data were obtained from patients' electronic records (CareVue, Philips IntelliVue Clinical Information Portfolio (ICIP) Revision D.03, Warrick, Naik, Avis, Fletcher, Franklin, Inwald 2011) and WardWatcher (Critical Care Audit Limited, Yorkshire). These are validated and complete clinical information systems [19]. First available clinical test results after admission to the ICU were recorded and used to calculate scores using all scoring tools. Clinical and biochemical data collected included sodium, potassium, urea, arterial lactate, creatinine, white cell count, bilirubin, PT ratio, albumin, platelets, arterial partial pressure of oxygen (PaO2), arterial partial pressure of carbon dioxide (PaCO2), PaO2/inspired oxygen fraction (FiO2) ratio (calculated from the arterial gas sample), Glasgow coma scale (GCS), mean arterial blood pressure (MAP), noradrenaline dose, ascites and encephalopathy grade. Demographic information was also recorded relating to age, gender, reason for admission, and Scottish Index of Multiple Deprivation (SIMD). The SIMD scores deprivation based on postcode and takes into account employment, income, health, education and crime, and is only applicable within Scotland [20]. The Indices of Deprivation is a deprivation score that is valid within England, but the required information was not available at the time of data collection so the score is not included [21]. West Haven criteria encephalopathy scores and ascites scores were collected pre-intubation in all patients.
A second previously published cohort of 115 ICU patients with cirrhosis was obtained from St Thomas' Hospital and St George's Hospital in London. Data were collected over a period of 20 months between 31 October 2007 and 1 July 2009. These patients were recruited for a demographic study in cirrhotic patients within a general ICU population. Scoring tools for these patients were recalculated based on raw data [5].
Both general ICU and liver-specific scoring tools were used. The general ICU scoring tools calculated were the APACHE II and the SOFA score [22,23]. Liver-specific scoring tools used were the CTP, UKELD, MELD, CLIF-SOFA and the RFH score [6,7,18,24,25].
Of the two new scoring tools, CTP-L and CTP + L, the latter had achieved a higher AUC on ROC curve analysis in the cohort upon which it was designed. As a result, this was the only scoring tool used [11]. A breakdown for the calculation of the CTP + L score can be seen in Table 1. The GRI data are referred to as the Glasgow dataset, and the St Thomas' and St George's data are referred to as the London dataset.
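The arithmetic behind the CTP + L score can be illustrated with a minimal sketch. It assumes that each of the five CTP variables has already been banded into 1, 2 or 3 points according to Table 1 (the banding thresholds are not reproduced here), and that lactate is supplied in mmol/L. The class and function names are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class CTPComponents:
    """Points (1, 2 or 3) already assigned to each CTP variable.

    Field names are illustrative; the banding itself follows Table 1
    and is deliberately not implemented here.
    """
    bilirubin: int
    albumin: int
    pt_ratio: int
    ascites: int
    encephalopathy: int

    def total(self) -> int:
        points = (self.bilirubin, self.albumin, self.pt_ratio,
                  self.ascites, self.encephalopathy)
        for p in points:
            if p not in (1, 2, 3):
                raise ValueError("each CTP component must score 1, 2 or 3")
        return sum(points)


def ctp_plus_l(components: CTPComponents, lactate_mmol_per_l: float) -> float:
    """CTP + L: the CTP total plus the raw arterial lactate value (mmol/L)."""
    if lactate_mmol_per_l < 0:
        raise ValueError("lactate concentration cannot be negative")
    return components.total() + lactate_mmol_per_l


# Hypothetical patient: CTP total of 10 points and a lactate of 4.5 mmol/L.
example = CTPComponents(bilirubin=2, albumin=3, pt_ratio=2,
                        ascites=1, encephalopathy=2)
print(ctp_plus_l(example, 4.5))
```

Because lactate enters as a raw value rather than a banded one, the score is continuous, which is what distinguishes CTP + L from the banded CTP-L variant.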
Within the Glasgow data, encephalopathy scores were collected prospectively pre-intubation in all patients to record accurate values. In other studies, including the London dataset, the encephalopathy score was presumed to be 2, as pre-intubation scores were not available [5]. To test whether the collection of pre-intubation encephalopathy scores was necessary, the Glasgow dataset was modified so that the CTP score calculated with the recorded encephalopathy scores could be compared with the CTP score calculated without them.

Statistical analysis
Univariate analysis was performed to identify variables significantly related to ICU outcome. The Welch independent samples t test was performed for continuous parametric data, and the Mann-Whitney U test was used for non-parametric continuous data. Pearson's Chi squared test with Yates' continuity correction (where appropriate) or Fisher's exact test for count data were used for categorical data. All assumptions for statistical tests were met and p <0.05 was considered statistically significant. All missing data were kept blank.
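As an illustration of the Welch test named above, the following sketch computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom for two groups with unequal variances. The sample values are invented for demonstration and are not study data. The p-value lookup is omitted because the Python standard library has no t distribution; the study itself performed its tests in R.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Welch's unequal-variance t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Sample variances (n - 1 denominator) divided by group size.
    var_a = statistics.variance(sample_a) / n_a
    var_b = statistics.variance(sample_b) / n_b
    standard_error_squared = var_a + var_b
    if standard_error_squared == 0:
        raise ValueError("both groups have zero variance")
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(
        standard_error_squared)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = standard_error_squared ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df


# Invented values standing in for a continuous variable in two outcome groups.
non_survivors = [2.0, 4.0, 6.0, 8.0]
survivors = [1.0, 2.0, 3.0]
t_value, dof = welch_t(non_survivors, survivors)
print(round(t_value, 3), round(dof, 3))
```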
Scoring tools were applied to both datasets, and compared using the AUC and optimum cutoff point determined by the Youden's index from ROC curves. Statistical models were produced by binary logistic regression for individual variables in both datasets against ICU mortality, with model selection based on analysis of variance (ANOVA) and Akaike information criterion (AIC) values. The optimum cut point from ROC curves produced for models was used to predict outcomes in the other dataset, and goodness-of-fit was compared using the Chi squared test and the phi coefficient. ROC curves were directly compared using DeLong's test for correlated ROC curves and the Chi squared test. An independent statistician provided assistance with the analysis in this study. Statistical analysis was performed using RStudio version 0.98.493 (R Foundation for Statistical Computing: Vienna, Austria) [26][27][28].
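The AUC and Youden's index used to compare the scoring tools can be sketched as follows. This is a minimal pure-Python illustration rather than the R code used in the study. It assumes that a higher score predicts death, that a score at or above the cutoff counts as a positive prediction, and the scores and outcomes shown are invented.

```python
def roc_auc(scores, died):
    """AUC as the probability that a randomly chosen non-survivor has a
    higher score than a randomly chosen survivor (ties count as one half)."""
    positives = [s for s, d in zip(scores, died) if d]
    negatives = [s for s, d in zip(scores, died) if not d]
    if not positives or not negatives:
        raise ValueError("both outcome groups must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def youden_cutoff(scores, died):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    If several cutoffs tie, the lowest one is returned.
    """
    n_pos = sum(1 for d in died if d)
    n_neg = len(died) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome groups must be present")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, d in zip(scores, died) if d and s >= cutoff)
        true_neg = sum(1 for s, d in zip(scores, died) if not d and s < cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Invented scores with ICU outcome (1 = died, 0 = survived).
scores = [8.0, 9.5, 10.0, 11.5, 12.0, 13.5, 15.0, 17.0]
died = [0, 0, 0, 1, 0, 1, 1, 1]
print(roc_auc(scores, died))
print(youden_cutoff(scores, died))
```

The pairwise definition of the AUC is quadratic in the number of patients, which is acceptable for cohorts of this size; a rank-based calculation gives the same value more efficiently.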

Results
There were 84 patients from the Glasgow dataset and 115 patients from the London dataset initially included in the analysis. During model selection, five patients from the Glasgow dataset and one patient from the London dataset were excluded from subsequent data analysis due to missing values, leaving 79 and 114 patients in each group, respectively.
Univariate analysis of the Glasgow dataset demonstrated that significant predictors of mortality were lactate (p <0.001), bilirubin (p = 0.0048), PaO2/FiO2 ratio (p = 0.032), and PT ratio (p = 0.012). Mean age of patients in the Glasgow dataset was 50 years, and 66 % of patients were from the most deprived category of the SIMD. A summary of patient data collected and univariate analysis results can be seen in Table 2.
Univariate analysis of the London dataset showed that significant predictors of mortality were PT ratio (p <0.001), lactate (p <0.001), PaO2/FiO2 ratio (p = 0.0011), bilirubin (p = 0.027), and the presence of ascites (p = 0.023). Mean age for patients in the London dataset was similar to the Glasgow dataset at 51 years. A summary of the London dataset patient data can be seen in Table 3.
The mortality rates in the Glasgow dataset were 30 % for ICU and 46 % for hospital, compared with the London dataset with 37 % ICU and 46 % hospital mortality. Mean APACHE II scores for the Glasgow and London datasets were 23.5 and 16.9, respectively, and the mean SOFA scores were 9.7 and 6.4, respectively. Comparison of the AUC of scoring tools applied to each dataset can be seen in Table 4. On the London dataset the RFH score performed most accurately (AUC = 0.77), with the CTP + L score performing to a similar level (AUC = 0.75). The original CTP score was again the least predictive of ICU mortality (AUC = 0.68). No scoring tool reached the clinically useful AUC of 0.8 in this dataset.

Binary logistic regression models
As none of the existing scoring tools applied to the London dataset reached the level of clinical usefulness, the raw data were analysed to find the optimum model for predicting ICU outcome. Statistical models were produced by binary logistic regression using the Glasgow dataset, and the optimum model for predicting ICU mortality was determined by ROC curve analysis. The highest AUC from ROC curve analysis in the Glasgow dataset was achieved by a model containing lactate, bilirubin, and PaO2/FiO2 ratio (AUC = 0.89). Model selection using stepwise regression and ANOVA resulted in the same model.
A model was produced using the Glasgow data, based on independent predictors of outcome from the London data consisting of PaO2/FiO2 ratio, PT ratio, and urea, which performed poorly compared to all other models, with an AUC of 0.73. All other models produced using the Glasgow dataset had an AUC >0.8.
The above process was repeated for the London dataset, where models were produced by binary logistic regression and compared using ROC curves. The most predictive model for the London dataset was a combination of PaO2/FiO2 ratio, PT ratio, and urea, obtaining an AUC of 0.78. No model based on the London dataset had an AUC >0.8. Based on stepwise regression analysis and ANOVA, a model containing lactate and PT ratio performed best in the London dataset. Odds ratios for predicting ICU outcome using this model were 1.13 (95 % CI 1.02-1.27, p = 0.032) for each mmol/L increase in lactate, and 2.17 (95 % CI 1.20-4.53, p = 0.021) for each increment in PT ratio.
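To make the per-unit odds ratios above concrete, the sketch below shows how an odds ratio from a logistic regression model scales over several units of a predictor and how it shifts a baseline probability. It assumes the usual logistic model, in which the effect is linear on the log-odds scale and all other predictors are held fixed. The function names and the baseline probability are illustrative choices, not values from the study.

```python
def odds_multiplier(odds_ratio_per_unit: float, units: float) -> float:
    """Factor by which the odds change for a given increase in the predictor.

    Under a logistic model the per-unit odds ratio compounds, so an
    increase of k units multiplies the odds by OR ** k.
    """
    if odds_ratio_per_unit <= 0:
        raise ValueError("an odds ratio must be positive")
    return odds_ratio_per_unit ** units


def probability_after(baseline_probability: float,
                      odds_ratio_per_unit: float,
                      units: float) -> float:
    """Convert a baseline probability to odds, apply the multiplier,
    and convert back to a probability."""
    if not 0 < baseline_probability < 1:
        raise ValueError("baseline probability must lie strictly between 0 and 1")
    odds = baseline_probability / (1 - baseline_probability)
    new_odds = odds * odds_multiplier(odds_ratio_per_unit, units)
    return new_odds / (1 + new_odds)


# Lactate odds ratio of 1.13 per mmol/L (London model) over a 5 mmol/L rise.
print(round(odds_multiplier(1.13, 5), 3))
# Effect on a hypothetical baseline mortality probability of 0.30.
print(round(probability_after(0.30, 1.13, 5), 3))
```

An odds ratio is not a risk ratio: the change in probability depends on the baseline, which is why the second function takes the baseline as an explicit argument.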
A model was produced using the London data, based on the independent predictors of outcome from the Glasgow data, consisting of lactate, bilirubin, and PaO2/FiO2 ratio. This model performed well compared to all others, with an AUC of 0.76. This is similar to the AUC for the RFH and CTP + L scoring tools in this cohort of patients.

Goodness-of-fit of regression models
In order to determine the usefulness of the statistical models, goodness-of-fit testing was undertaken for both the Glasgow models by applying them to the London dataset, and for the London models by applying them to the Glasgow dataset. The Chi squared goodness-of-fit test was used for all models and phi coefficients compared.
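The phi coefficient used in this comparison can be computed from a 2 x 2 table of predicted against observed ICU outcome. The sketch below assumes that predictions come from applying a model's cut point to the other dataset. The counts are invented, and the relation chi-squared = n x phi^2 holds for the Pearson statistic without continuity correction.

```python
import math


def phi_coefficient(true_pos: int, false_pos: int,
                    false_neg: int, true_neg: int) -> float:
    """Phi coefficient for a 2 x 2 table of predicted vs observed outcome."""
    numerator = true_pos * true_neg - false_pos * false_neg
    denominator = math.sqrt(
        (true_pos + false_pos) * (false_neg + true_neg)
        * (true_pos + false_neg) * (false_pos + true_neg))
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    return numerator / denominator


def chi_squared_from_phi(phi: float, n: int) -> float:
    """Pearson chi-squared statistic (no continuity correction) for a
    2 x 2 table with n observations."""
    return n * phi ** 2


# Invented counts: predicted death vs observed death in a validation cohort.
phi = phi_coefficient(true_pos=20, false_pos=10, false_neg=5, true_neg=65)
print(round(phi, 3), round(chi_squared_from_phi(phi, 100), 2))
```

Phi ranges from -1 to 1, so it allows models validated on cohorts of different sizes to be compared on a common scale, whereas the chi-squared statistic grows with sample size.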

Collecting encephalopathy grade for the CTP score
When the CTP and CTP + L scores were compared in the Glasgow dataset with and without encephalopathy scores, results showed that there was no statistically significant difference between collecting and not collecting pre-intubation encephalopathy scores, either with the original CTP score (p = 0.12) or the modified score CTP + L (p = 0.52). Therefore encephalopathy scores do not significantly influence the CTP or CTP + L scores, and it may be unnecessary to collect these.

Combined performance of the RFH score and CTP + L score
As the RFH and CTP + L scores were the best performing tools in both datasets, the datasets were combined to create a cohort of 199 patients. ROC curves for the RFH and CTP + L scores were produced and are shown in Fig. 1. Both scores performed well in this combined dataset, with the CTP + L score having an AUC of 0.79 and the RFH score an AUC of 0.78. There was no statistically significant difference between the ROC curves (p = 0.92).

Discussion
This paper aimed to validate the use of the CTP + L as a method of predicting ICU outcome in patients with cirrhosis admitted to a general ICU. The CTP + L and RFH scoring tools performed similarly in the Glasgow dataset and both performed significantly better than the existing CTP score. When applied to the London dataset to validate the tool, the CTP + L tool performed well but failed to reach the clinically useful AUC of 0.8, which is commonly reported in the literature [10]. The CTP + L tool was more predictive than the CLIF-SOFA score in both datasets.
Mortality rates in the Glasgow and London datasets were similar, with ICU mortality being 30 % and 37 % for the Glasgow and London datasets, respectively, and the hospital mortality the same in both datasets at 46 %. These values are lower than the 48 % and 58 % weighted mean values reported in the literature for ICU and hospital mortality [5]. The SOFA and APACHE II scores were lower in the London dataset than in the Glasgow dataset. These are validated scores for measuring the severity of illness in ICU patients, with higher scores indicating increased severity of illness and greater probability of mortality [22,23,29]. The difference in these scores between the two datasets may reflect a difference in admission criteria between the different ICUs. This difference may also be related to the lower odds ratio associated with lactate in the London dataset, as higher arterial lactate concentration is reported within the literature [30,31] to be associated with higher APACHE II scores and mortality. Future work in this area should explore ICU admission criteria for patients with cirrhosis to help understand the care trajectory for this patient cohort.
Although arterial lactate concentration on admission was an independent predictor of ICU mortality in both datasets, the odds ratio for lactate predicting ICU mortality was higher within the Glasgow dataset than the London dataset (odds ratio 1.89 vs 1.13, respectively). Other published papers in this field [16,17] report odds ratios for lactate higher than that in the London dataset. Univariate analysis of the Glasgow dataset demonstrated that on admission, lactate, bilirubin, PaO2/FiO2 ratio, and PT ratio were significant predictors of ICU mortality. These variables were also significant predictors within the London dataset and in other published literature [17]. In the London dataset, the presence of ascites was also a significant predictor of mortality, which was not found in the Glasgow dataset. The significance of lactate, bilirubin and PT ratio as predictors of mortality is logical from a physiological point of view, with elevated lactate demonstrating insufficient oxygen delivery to tissues, or the failure of the liver to metabolise lactate, or both [32]. Elevated bilirubin and PT ratio may represent failure of the liver to metabolise waste products and perform its synthetic function [33,34].
The RFH score was designed and validated for predicting hospital mortality within a liver transplant centre but not for predicting mortality in a general ICU population [18]. The results of this study validate the use of the RFH score as a prognostic scoring tool for predicting ICU mortality, as demonstrated by its performance in both the Glasgow and London datasets. Although the performance of the RFH is similar to that of the CTP + L score, the RFH is more complex to calculate. Due to the complexity of the calculation, and the requirement to separately calculate the number of failing organ systems, it is clear that this tool is designed for calculation on a computer, which may not be available at the patient's bedside.
The CTP + L score, however, can be calculated quickly at the patient's bedside using simple criteria to score 1, 2, or 3 points for each variable, and adding the raw value of the serum arterial lactate to this score, as can be seen in Table 1. CTP + L and RFH scores both predict ICU mortality to a similar degree. However, due to its simplicity, the CTP + L may be a more practical and versatile tool for evaluating patients quickly for admission to the ICU.
From multivariable analysis in the Glasgow dataset, the model that produced the best AUC was that containing lactate, bilirubin and PaO2/FiO2 ratio (AUC = 0.89). This model also performed well in the London cohort of patients, producing an AUC of 0.76, which is comparable to that of the RFH and CTP + L, as can be seen in Table 4. This same model was selected as the optimum model within the Glasgow dataset based on ANOVA. This suggests that a scoring tool comprising only these factors could be used to predict ICU mortality. This would need to be validated in a larger cohort of patients in order to test its usefulness.
No scoring tool or statistical model from the London dataset reached the clinically useful AUC of 0.8. This is in contrast to the Glasgow dataset, where all models except one performed with an AUC >0.8. This suggests either that the recorded variables are less predictive in patients from the London dataset than in those from the Glasgow dataset, or that the variables most predictive in the London cohort were not recorded and were therefore not included in this analysis. Goodness-of-fit tests show that although the AUC was lower when the models were applied to the London dataset, the models were still predictive of ICU mortality.
This paper demonstrates that collecting pre-intubation hepatic encephalopathy scores does not increase the predictive values of the CTP or CTP + L scoring tools. Both datasets were combined to compare the performance of the two most predictive scoring tools: the RFH and CTP + L. This ROC curve (Fig. 1) shows that these two tools are similarly matched in predicting ICU outcome in the combined cohort of 199 patients.

Limitations
Validation of the CTP + L in the London dataset is limited, as the London dataset does not contain pre-intubation encephalopathy scores, which are a key component of the CTP + L tool. This lack of pre-intubation encephalopathy scores limits the ability to show that the encephalopathy score is not required as part of the CTP + L score, and further evidence from another centre with pre-intubation encephalopathy scores would be required to prove this conclusively. Both hepatic encephalopathy grade and ascites are subjective in nature, and this makes them difficult to assess objectively and apply as part of a scoring tool. Additionally, clinical values for the scoring tools at admission were taken as soon as possible following admission, but in some cases this was delayed by a few hours. This variability in time from admission until the first available test results may affect the predictive ability of the scoring tools. The use of first available clinical values does not account for any lead-time bias that may occur. Albumin is routinely administered to patients with cirrhosis as part of current guidelines; however, any albumin administered before ICU admission would affect the utility of albumin as a predictive measure of ICU outcome.

Conclusion
It is known that patients admitted to ICU with cirrhosis have high mortality; however, the mortality rates within these cohorts are more favourable than those published in the literature. The CTP + L and RFH scoring tools are validated prognostic scoring tools for predicting ICU mortality in patients with cirrhotic liver disease admitted to a general ICU department. Collecting hepatic encephalopathy scores is not required for the CTP or CTP + L score; however, this would need to be validated on an external cohort of patients.

Key messages
The CTP + L and RFH scoring tools are validated prognostic scoring tools for patients with cirrhosis admitted to a general ICU
Mortality rates in these cohorts are more favourable than those published in the literature
Collecting hepatic encephalopathy scores may not be necessary for the CTP or CTP + L score in patients admitted to the general ICU, although this requires external validation

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
JC collected and performed the main analysis of the data, and is the main author of the manuscript. JM assisted in data collection, data interpretation, and critical review of the manuscript. AP, CS, PE, TQ, ST, and TR assisted in data collection and critical review of the manuscript. MS provided support in statistical analysis of the data and presentation of results and has critically reviewed the manuscript. EF provided specialist expert guidance and critically reviewed the manuscript. JK provided overall supervision and was involved in the drafting and critical revision of the manuscript. All authors have read and approved the manuscript.
Author details