The early change of SOFA score as a prognostic marker of 28-day sepsis mortality: analysis through a derivation and a validation cohort

Background Since the Sepsis-3 criteria, change in Sequential Organ Failure Assessment (SOFA) score has become a key component of sepsis identification. Thus, it could be argued that reversal of this change (ΔSOFA) may reflect sepsis response and could be used as a measure of efficacy in interventional trials. We aimed to assess the predictive performance of ΔSOFA for 28-day mortality. Methods Data from two previously published randomized controlled trials were studied: the first reporting on patients with severe Gram-negative infections as a derivation cohort and the second reporting on patients with ventilator-associated pneumonia as a validation cohort. Only patients with sepsis according to the Sepsis-3 definition were included in this analysis. SOFA scores were calculated on days 1, 2, 3, 5, 7, 14, and 28. Results We included 448 patients within the derivation cohort and 199 within the validation cohort. Mean SOFA scores on day 1 were 6.06 ± 4.07 and 7.84 ± 3.39, and 28-day mortality was 22.8% and 29.6%, respectively. In the derivation cohort, the earliest time point where ΔSOFA score predicted mortality was day 7 (AUROC (95% CI) 0.84 (0.80–0.89); p < 0.001). The best tradeoff for prediction was found with 25% changes (78% sensitivity, 80% specificity); less than 25% decrease of admission SOFA was associated with increased mortality (odds ratio for death 14.87). This finding was confirmed in the validation cohort. Conclusions ΔSOFA on day 7 is a useful early prognostic marker of 28-day mortality and could serve as an endpoint in future sepsis trials alongside mortality. Trial registration ClinicalTrials.gov numbers NCT01223690 and NCT00297674


Background
In the light of numerous inconclusive interventional clinical trials in sepsis during the past two decades, the framework of those trials needs to be revised [1][2][3][4]. All-cause mortality after 28 days has traditionally been the primary endpoint in these trials. However, with recent improvements in standard-of-care therapy, 28-day mortality is strongly dependent on other variables such as comorbid conditions and the adverse events of multiple interventions [5]. As such, it is reasonable that alternative endpoints need to be developed for sepsis. These endpoints need to provide an earlier and accurate evaluation of the treatment effect under study.
Since sepsis is triggered by an infection, the endpoint of sepsis trials may be influenced by the tendency of regulatory bodies to focus new registration trials of antimicrobial agents on early efficacy. The main example is the joint initiative between the Food and Drug Administration (FDA) and the Biomarkers Consortium of the Foundation for the National Institutes of Health (FNIH) on the update of primary endpoint definitions for non-inferiority trials for the management of infectious diseases. More precisely, the former test-of-cure visit, usually taking place 7-14 days after the end of treatment, was replaced by the early response 48-72 h after the start of treatment for acute bacterial skin and skin structure infections [6] and 3-5 days after the start of treatment for community-acquired pneumonia [7], while efforts are being made to expand this concept to hospital-acquired and ventilator-associated pneumonia [8,9]. However, in order to develop a similar early endpoint for sepsis, it is mandatory that this endpoint is a predictor of 28-day mortality, i.e., the salient sequela of sepsis, and eventually of 90-day mortality, which has recently emerged as a relevant clinical endpoint [10]. With the Sepsis-3 classification criteria, the Sequential Organ Failure Assessment (SOFA) score is used as a measure of sepsis-associated organ dysfunction. As a consequence, it is reasonable to define the earliest time point during the course of the disease where a clinically meaningful change of the baseline SOFA score is achieved.
The present study tries to define (a) the earliest time point during the course of sepsis where SOFA score changes can predict 28-day mortality and (b) the cutoff change of baseline SOFA score that may be considered an early sign of sepsis resolution. The association of SOFA score changes with 90-day mortality is also assessed. To do so, we used two independent prospective cohorts of patients: the first as a derivation cohort and the second as a validation cohort.

Study populations
We retrospectively analyzed clinical data from a cohort of patients with sepsis, according to the 1991 sepsis definitions (derivation cohort) [11]; a second independent cohort using the 1991 sepsis definitions served as validation dataset for the primary hypothesis. Both cohorts were part of previously published multicenter randomized controlled trials comparing clarithromycin to placebo as adjunctive immunomodulatory treatment in sepsis [12,13].
The derivation cohort included patients with Gram-negative sepsis, enrolled in a prospective double-blind, placebo-controlled randomized clinical trial (RCT) studying the efficacy of intravenous clarithromycin on 28-day mortality. Patients were recruited from July 2007 to August 2011 in six departments (two intensive care units (ICUs), three medical wards, and one surgical ward) in five tertiary teaching hospitals in Greece. Patients were suffering from acute pyelonephritis, intra-abdominal infections, or primary Gram-negative bacteremia [12] (ClinicalTrials.gov NCT01223690). Since the 28-day mortality of patients allocated to the placebo arm and of patients allocated to the clarithromycin arm did not differ, both arms were analyzed together for the purpose of this study.
The validation cohort consisted of patients with ventilator-associated pneumonia (VAP), enrolled in an RCT in two ICUs (one patient enrolled in one medical ward has not been included in the present study) in two tertiary teaching hospitals in Greece, from June 2004 to November 2005 (ClinicalTrials.gov NCT00297674) [13]. Since the 28-day mortality of patients allocated to the placebo arm and of patients allocated to the clarithromycin arm did not differ, both arms were analyzed together.
All medical and nursing charts of the derivation cohort were retrospectively reviewed, and components of SOFA score for each system (respiratory, coagulation, liver, cardiovascular, central nervous, and renal) were collected. Serial SOFA scores were calculated initially on day 1 (initial SOFA) and on days 2, 3, 5, 7, 14, and 28 after enrollment in the study.
For the purposes of this study, patients of each cohort who met the Sepsis-3 criteria were identified; only those were included in this analysis. For the calculation of serial SOFA scores, when the Glasgow Coma Scale (GCS) was not evaluable due to sedation for mechanical ventilation, the GCS immediately before mechanical ventilation was used. Patients discharged from hospital or deceased before day 28 were censored to the last known SOFA score. Delta SOFA (ΔSOFA) for any follow-up day was given by the formula: (SOFA score of the follow-up day − initial SOFA score) × 100 / initial SOFA score, and it was expressed as a percentage.
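The ΔSOFA formula above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study; the function names are hypothetical, and the guard against a day 1 score of zero is an assumption added here (patients meeting Sepsis-3 criteria would be expected to have a positive score). Under this sign convention a negative ΔSOFA means the score has fallen, so "at least 25% decrease" corresponds to ΔSOFA of −25 or lower.

```python
def delta_sofa_percent(initial_sofa: int, followup_sofa: int) -> float:
    """Percentage change of SOFA relative to day 1 (negative = score decreased)."""
    if initial_sofa <= 0:
        # Assumption: a percentage change is undefined for a day 1 score of 0.
        raise ValueError("initial SOFA must be positive for a percentage change")
    return (followup_sofa - initial_sofa) * 100.0 / initial_sofa


def has_at_least_25pct_decrease(initial_sofa: int, followup_sofa: int) -> bool:
    """True when the follow-up score fell by 25% or more from day 1."""
    return delta_sofa_percent(initial_sofa, followup_sofa) <= -25.0


# Hypothetical patient: SOFA 8 on day 1 and SOFA 5 on day 7.
example_delta = delta_sofa_percent(8, 5)  # -37.5, i.e. a 37.5% decrease
```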
The outcome measure in both cohorts was the earliest time point where the change of SOFA score was associated with 28-day mortality. The association of this change with 90-day mortality was a secondary endpoint.

Statistical analysis
Categorical variables were presented as percentages, and continuous variables with normal distribution as mean and standard deviation (± SD). Categorical variables were compared using the two-sided Fisher exact test, whereas quantitative variables were assessed using Student's t test or the non-parametric Mann-Whitney test, as appropriate. The predictive capacity of ΔSOFA on the different follow-up days for mortality was evaluated with the area under the respective receiver operating characteristic (AUROC) curves and 95% confidence intervals (CIs). The optimal cutoff value for prediction of 28-day mortality was calculated using Youden's index. The ΔSOFA was expressed as medians and 95% CIs; comparisons between survivors and non-survivors were done by the Mann-Whitney U test. The Breslow-Day test was used to compare the performance of this cutoff value between the derivation and validation cohorts. A p value lower than 0.05 was considered statistically significant. All p values were two-sided. Statistical analyses were performed using SPSS version 25.0 software.
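As an illustration of the two quantities named above, the following Python sketch computes an AUROC through its Mann-Whitney formulation and selects a cutoff by Youden's index (J = sensitivity + specificity − 1). The analysis itself was run in SPSS; the function names and the toy values below are hypothetical and are not study data. A higher ΔSOFA (less improvement) is assumed to indicate higher risk.

```python
from typing import List, Tuple


def auroc(scores: List[float], died: List[bool]) -> float:
    """Probability that a non-survivor has a higher score than a survivor
    (ties count one half); this equals the area under the ROC curve."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: List[float], died: List[bool]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing J, where score >= cutoff predicts death."""
    n_pos = sum(died)
    n_neg = len(died) - n_pos
    best_cutoff, best_j = float("nan"), -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if d and s >= c)
        tn = sum(1 for s, d in zip(scores, died) if not d and s < c)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # keep the first (lowest) cutoff on ties
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Toy day 7 ΔSOFA percentages, made up for illustration only.
toy_delta = [-60, -50, -40, -30, -20, -10, 0, 20]
toy_died = [False, False, False, True, False, True, True, True]
toy_auc = auroc(toy_delta, toy_died)
toy_cutoff, toy_j = youden_cutoff(toy_delta, toy_died)
```

With real data, confidence intervals for the AUROC would additionally be needed; they are omitted here to keep the sketch short.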

Results
The study flow charts for both cohorts are shown in Fig. 1. A total of 448 patients of the derivation cohort and 199 patients of the validation cohort could be classified as having sepsis according to the Sepsis-3 criteria and were included in the analysis. Demographic baseline data of the two cohorts differed significantly (Table 1).

Primary endpoint
The ROC curves of ΔSOFA on the follow-up days for the prediction of 28-day mortality in the derivation cohort are shown in Fig. 2a. When the AUROCs of ΔSOFA on the follow-up days were compared, the earliest time point at which the AUROC was greater than on previous days was day 7 (Fig. 2b). When the absolute ΔSOFA scores were compared over time between survivors and non-survivors, a great overlap of values was shown (Fig. 2c), despite the significantly greater decreases in survivors than in non-survivors found by non-parametric statistics at all time points. This led us to consider the percentage change of baseline SOFA as a more appropriate expression of the sepsis course than the absolute ΔSOFA. To this end, our analysis focused on the development of a specific value of ΔSOFA on day 7 as an early predictor of 28-day mortality. The analysis using the Youden index showed that a 25% cutoff value could discriminate non-survivors from survivors with sensitivity 78.4% (95% CI 69.0-85.7%), specificity 80.3% (95% CI 75.7-84.3%), positive predictive value 54.1% (95% CI 45.7-62.2%), and negative predictive value 92.7% (95% CI 89.0-95.2%).
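The four performance figures for the 25% cutoff follow from a 2 × 2 table, which the Python sketch below makes explicit. The cell counts are not stated in the text; they are back-calculated here, as an assumption, from the reported group sizes (148 and 300 patients) and 28-day mortality (54.1% and 7.3%) in the derivation cohort. A "positive" test means less than 25% decrease of SOFA on day 7. The function name is hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV (as fractions) from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Back-calculated counts (assumption, see lead-in):
#   <25% decrease:  80 died, 68 survived  (148 patients)
#   >=25% decrease: 22 died, 278 survived (300 patients)
metrics = diagnostic_metrics(tp=80, fp=68, fn=22, tn=278)
percentages = {name: round(value * 100, 1) for name, value in metrics.items()}
# percentages == {'sensitivity': 78.4, 'specificity': 80.3, 'ppv': 54.1, 'npv': 92.7}
```

The 95% CIs reported above depend on the interval method used by the statistical software and are not reproduced in this sketch.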
Overall, in the derivation cohort, 148 (33%) patients had less than 25% decrease of SOFA score on day 7 and 300 (67%) patients had at least 25% decrease of initial SOFA score on day 7. Mortality after 28 days was 54.1% and 7.3%, respectively (p = 1.8361 × 10⁻²⁷). The OR for death after 28 days with a decrease of initial SOFA on day 7 of less than 25% was 14.87 (95% CI 8.65-25.54). Similarly, the OR for death in the validation cohort was 6.95 (95% CI 2.05-23.55) (p value of the Breslow-Day test of homogeneity 0.250) (Table 2).
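The odds ratio for the derivation cohort can be checked with a short Python sketch. Two assumptions are made here: the cell counts are back-calculated from the group sizes and mortality percentages given above (80 of 148 and 22 of 300 deaths), and the confidence interval uses the log-based (Woolf) method, which the text does not name. The function name is hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_woolf(a: int, b: int, c: int, d: int,
                     z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Woolf (log-based) CI for a 2x2 table.

    a: exposed, died       b: exposed, survived
    c: unexposed, died     d: unexposed, survived
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# "Exposed" = less than 25% decrease of SOFA on day 7 (back-calculated counts).
or_28d, ci_low, ci_high = odds_ratio_woolf(a=80, b=68, c=22, d=278)
# Rounded to two decimals, or_28d is 14.87, in line with the reported OR.
```

Under these assumptions the interval bounds also agree with the reported 8.65-25.54 after rounding, which suggests, but does not prove, that a log-based interval was used.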

Secondary endpoint
After ROC analysis, the day 7 ΔSOFA in the derivation cohort yielded an AUROC of 0.847 (0.807-0.886; p = 5.11 × 10⁻²⁹) for predicting 90-day mortality. When applying the cutoff of less than 25% decrease, this was associated with an OR of 13.20 for death after 90 days (95% CI 8.01-21.76; p = 4.78 × 10⁻²⁸). Table 3 describes the performance characteristics of the cutoff in predicting 90-day mortality in both cohorts.

Post hoc analysis
Although the validation cohort involved 199 patients with VAP, all of whom were under mechanical ventilation, the derivation cohort comprised both mechanically ventilated (n = 71) and non-mechanically ventilated patients (n = 377) on study enrollment. The 28-day mortality among mechanically ventilated patients with at least 25% decrease of initial SOFA score and among mechanically ventilated patients with less than 25% decrease of initial SOFA score was 11.5% and 37.8%, respectively (p = 0.027). The 28-day mortality among the non-mechanically ventilated patients was 7.0% and 60.0%, respectively (p = 1.1 × 10⁻²⁶).
Due to the significant baseline differences between the derivation and validation cohorts and in order to assess the robustness of the above findings, a post hoc analysis was performed by merging both initial cohorts and randomly splitting them into cohort A and cohort B. It should be noted that patients of both original cohorts were recruited before 2012 (the [...] Table S1 did not differ. The 25% change of initial SOFA score worked equally well for the prediction of both 28-day and 90-day mortality in both cohorts A and B (Table 4 and Additional file 2: Table S2, respectively).
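A random split of the merged cohorts of this kind can be sketched as follows in Python. The text does not state the split proportion or the randomization procedure; the equal split, the fixed seed, and the integer patient identifiers below are all assumptions made for illustration, and the function name is hypothetical.

```python
import random
from typing import Iterable, List, Tuple


def random_split(patient_ids: Iterable[int], fraction_a: float = 0.5,
                 seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle the merged patient list reproducibly and cut it in two."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # local generator, global state untouched
    k = round(len(ids) * fraction_a)
    return ids[:k], ids[k:]


# 448 derivation + 199 validation patients = 647 merged (hypothetical IDs 0..646).
cohort_a, cohort_b = random_split(range(448 + 199))
```

Fixing the seed keeps the split reproducible, so that baseline comparisons and the cutoff evaluation refer to the same two groups on every run.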
Another concern was that some investigators handle SOFA score for deceased patients as the last observation carried forward, while others set the score to 24 in case of death. Using the second approach in the derivation cohort, it was found that 28-day mortality among 295 patients with at least 25% decrease of initial SOFA score was 6.1%; this was 56.2% among 153 patients with less than 25% decrease of the initial SOFA score.
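The two ways of handling the SOFA score of deceased patients can be contrasted in a short Python sketch. It is illustrative only: the function name, the strategy labels, and the example patient are hypothetical. Under last observation carried forward (LOCF) the last known score is used; under the alternative the score is set to the maximum of 24 once the patient has died.

```python
from typing import Dict, Optional

MAX_SOFA = 24  # six organ systems, each scored 0 to 4


def sofa_on_day(scores: Dict[int, int], day: int,
                death_day: Optional[int], strategy: str) -> int:
    """SOFA for `day` under 'locf' or 'max_on_death' handling of deaths.

    scores maps study day -> observed SOFA; death_day is None for survivors.
    """
    if strategy not in ("locf", "max_on_death"):
        raise ValueError(f"unknown strategy: {strategy}")
    if strategy == "max_on_death" and death_day is not None and death_day <= day:
        return MAX_SOFA
    observed_days = [d for d in scores if d <= day]
    if not observed_days:
        raise ValueError("no SOFA score observed on or before the requested day")
    return scores[max(observed_days)]  # last known score carried forward


# Hypothetical patient who died on day 5 with scores recorded on days 1 to 3.
patient = {1: 10, 2: 11, 3: 12}
day7_locf = sofa_on_day(patient, 7, death_day=5, strategy="locf")         # 12
day7_max = sofa_on_day(patient, 7, death_day=5, strategy="max_on_death")  # 24
```

Because the second approach assigns the worst possible score to every patient who dies before day 7, it moves such patients towards the "less than 25% decrease" group, which is consistent with the shift in group sizes reported above.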

Discussion
To the best of our knowledge, this is the first study to report a specific cutoff of 25% decrease of SOFA score as the earliest significant surrogate of 28-day mortality using a derivation and a validation cohort. The cutoff remained robust in all subsequent analyses and subgroup evaluations, despite the fact that the cohorts used differed considerably in baseline characteristics, indicating that the elaborated endpoint may be generalizable.
Previous studies have shown that serial SOFA measurements are predictors of mortality on both days 3 and 5 of follow-up [14,15]. A cohort study of 20,007 critically ill patients in Canada reported that the slope of the SOFA score between days 1 and 7 was higher and better associated with final outcome (both ICU and hospital mortality) than was the average rate of change at later time points (between days 8 and 14) [16]. According to the authors, any increase between days 1 and 5 (defined as early change) was significantly associated with hospital and ICU mortality.
Recently, in a meta-regression analysis of 87 RCTs on septic patients using different SOFA derivatives as primary or secondary endpoints, the authors showed that ΔSOFA (when defined as a fixed day minus initial day SOFA) explained 32% of the treatment effect on mortality, suggesting that ΔSOFA is both responsive and consistent in detecting differences of treatment effects on mortality and could replace mortality as a surrogate endpoint in clinical trials [17]. The validity of the change of SOFA on day 7 as an early predictor of 28-day mortality was analyzed in a large post-marketing survey among patients with sepsis and disseminated intravascular coagulation, who were propensity-matched to receive either antithrombin III alone or combination therapy with thrombomodulin. Although no difference was found between the two groups, when they were analyzed together, the changes between day 1 and day 7 provided an AUROC of 0.81 for 28-day mortality [18]. In a cohort of patients with severe sepsis and septic shock, day 3 ΔSOFA displayed an AUROC of 0.68 (95% CI 0.56-0.79), whereas a 50% SOFA decrease was associated with 61.3% sensitivity and 85.9% negative predictive value for ICU mortality prediction [19].
Another suggested endpoint based on SOFA score is the mean total SOFA score. This is the sum of the follow-up day SOFA scores divided by the number of days of ICU stay. In an historical cohort of 352 patients with mean length of stay (LOS) of 6.5 days, the mean total SOFA correlated well with mortality (OR 3.06, 95% CI 2.36 to 3.97) [15]. In a study evaluating levosimendan compared to placebo in patients with septic shock (the LeoPARDS RCT), the primary endpoint was powered to detect an absolute difference in the mean SOFA score (calculated up to a maximum of 28 ICU days) of at least 0.5 between the two arms [20]. The MaxSep RCT, comparing meropenem alone or in combination with moxifloxacin, in patients with severe sepsis, aimed to demonstrate a minimum of 1.1 point difference in mean SOFA scores between the two arms (calculated for a maximum ICU stay of 14 days) [21]. Both studies failed to demonstrate the expected difference, despite adequately large sample sizes (more than 500 patients per study), possibly due to the cutoffs used.
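The mean total SOFA score described above reduces to a simple average, sketched here in Python. The sketch assumes one recorded score per day of ICU stay, so that the number of scores equals the length of stay; the function name and the example values are hypothetical.

```python
from typing import Sequence


def mean_total_sofa(daily_scores: Sequence[int]) -> float:
    """Sum of the daily SOFA scores divided by the number of ICU days.

    Assumes exactly one score per ICU day (len(daily_scores) == length of stay).
    """
    if not daily_scores:
        raise ValueError("at least one daily SOFA score is required")
    return sum(daily_scores) / len(daily_scores)


# Hypothetical 5-day ICU stay.
example_mean = mean_total_sofa([9, 8, 8, 6, 4])  # 7.0
```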
In the light of the existing publications, the suggested cutoff of at least 25% decrease of SOFA score on day 7 may neither replace mortality as an endpoint of clinical trials nor be considered a surrogate for sepsis resolution. However, it may be considered an early marker of improvement of the sepsis process, to be evaluated alongside mortality.
One major limitation of our study is the retrospective analysis of the data. However, due to the fact that all included patients were part of a prospective follow-up protocol during the initial randomized clinical trials, all required data were systematically collected up to day 28 limiting the bias that may come from this approach.

Conclusions
Changes over time in the Sequential Organ Failure Assessment score (ΔSOFA) offer a more direct, scalar measurement of treatment effect in sepsis compared to traditional mortality endpoints. A decrease of less than 25% in SOFA score on day 7 may identify patients at high risk of death, showing that ΔSOFA may be incorporated alongside mortality in future clinical trials.
Additional file 1: Table S1. Comparative demographics of the two novel cohorts.
Additional file 2: Table S2. Prognostic performance for 90-day mortality of the 25% SOFA decrease cutoff on day 7 ΔSOFA between the derivation and validation cohorts.