Skip to main content

Sepsis subphenotyping based on organ dysfunction trajectory



Sepsis is a heterogeneous syndrome, and the identification of clinical subphenotypes is essential. Although organ dysfunction is a defining element of sepsis, subphenotypes of differential trajectory are not well studied. We sought to identify distinct Sequential Organ Failure Assessment (SOFA) score trajectory-based subphenotypes in sepsis.


We created 72-h SOFA score trajectories in patients with sepsis from four diverse intensive care unit (ICU) cohorts. We then used dynamic time warping (DTW) to compute heterogeneous SOFA trajectory similarities and hierarchical agglomerative clustering (HAC) to identify trajectory-based subphenotypes. Patient characteristics were compared between subphenotypes and a random forest model was developed to predict subphenotype membership at 6 and 24 h after being admitted to the ICU. The model was tested on three validation cohorts. Sensitivity analyses were performed with alternative clustering methodologies.


A total of 4678, 3665, 12,282, and 4804 unique sepsis patients were included in development and three validation cohorts, respectively. Four subphenotypes were identified in the development cohort: Rapidly Worsening (n = 612, 13.1%), Delayed Worsening (n = 960, 20.5%), Rapidly Improving (n = 1932, 41.3%), and Delayed Improving (n = 1174, 25.1%). Baseline characteristics, including the pattern of organ dysfunction, varied between subphenotypes. Rapidly Worsening was defined by a higher comorbidity burden, acidosis, and visceral organ dysfunction. Rapidly Improving was defined by vasopressor use without acidosis. Outcomes differed across the subphenotypes, Rapidly Worsening had the highest in-hospital mortality (28.3%, P-value < 0.001), despite a lower SOFA (mean: 4.5) at ICU admission compared to Rapidly Improving (mortality:5.5%, mean SOFA: 5.5). An overall prediction accuracy of 0.78 (95% CI, [0.77, 0.8]) was obtained at 6 h after ICU admission, which increased to 0.87 (95% CI, [0.86, 0.88]) at 24 h. Similar subphenotypes were replicated in three validation cohorts. The majority of patients with sepsis have an improving phenotype with a lower mortality risk; however, they make up over 20% of all deaths due to their larger numbers.


Four novel, clinically-defined, trajectory-based sepsis subphenotypes were identified and validated. Identifying trajectory-based subphenotypes has immediate implications for the powering and predictive enrichment of clinical trials. Understanding the pathophysiology of these differential trajectories may reveal unanticipated therapeutic targets and identify more precise populations and endpoints for clinical trials.


Sepsis is defined as a dysregulated immunological response to infection that results in acute organ dysfunction [1, 2]. The morbidity and mortality of sepsis remain high despite decades of research and numerous failed clinical trials [3, 4]. Recent research has highlighted that sepsis is a complex and heterogeneous syndrome, which includes a multidimensional array of clinical and biological features [5]. Identifying rigorous sepsis subphenotypes that present with similar prognostic markers and pathophysiologic features has the potential to improve therapy [6,7,8,9].

Recent sepsis subphenotyping studies used static measurements available soon after admission to the emergency department or intensive care unit (ICU) to characterize patients [5, 10,11,12]. However, due to the stochastic nature of infection and variable presentation to health care after developing symptoms, static assessments of sepsis subphenotypes may be incomplete, ignoring the dynamic nature of organ failure in sepsis [13].

More recently, subphenotypes characterized by dynamic patient temperature trajectories have been identified in sepsis. The differential pattern of temperature change may represent a varied underlying inflammatory response to infection [1]. The trajectory of the Sequential Organ Failure Assessment (SOFA) score after ICU admission has been used to predict outcomes and improve prognostic stratification in sepsis [13, 14]. In a recent study, Sanchez-Pinto et al. [15] leveraged a matrix factorization-based approach to identify multiple organ dysfunction syndrome subphenotypes according to longitudinal pediatric SOFA (pSOFA) scores, but their approach was focusing on the subphenotypes captured by the “motifs,” or frequent subsequence patterns, of the SOFA trajectories, which may not characterize the long term trends encoded in those trajectories well. However, whether the trajectory of multisystem organ failure is associated with distinct phenotypic patterns in sepsis remains largely unexplored. Identifying distinct subphenotypes of organ dysfunction trajectory in sepsis can refine our understanding of the natural history of sepsis in the ICU in response to standard of care treatment and define patterns of disease that may benefit from novel therapeutic strategies [16].

The objective of this study was to develop and evaluate sepsis subphenotypes. The first goal was to determine whether distinct SOFA score trajectory-based subphenotypes in patients with sepsis can be identified through the electronic health record. The second goal was to understand whether those different subphenotypes are associated with the patterns of biomarkers and clinical outcomes. The third goal was to determine whether the identified subphenotypes can be predicted by using patient baseline characteristics and early-stage clinical features.



We did a cohort study on datasets that contained granular patient level data from a total of 221 hospitals in the USA, whose overall workflow is illustrated in Fig. 1. Our goal was to derive sepsis subphenotypes of patients in ICU according to their SOFA organ dysfunction trajectories using dynamic time warping (DTW) [17] and hierarchical agglomerative clustering (HAC) [18]. We then characterized these subphenotypes using comprehensive patient information including demographics, comorbidities, use of mechanical ventilation, type of ICU unit, admission source, organ source of sepsis, and examined their associated clinical outcomes as well as clinical biomarkers. We further built multiple random forest models to predict the derived subphenotypes from different time points’ patient clinical characteristics. To ensure replicability, the same analysis pipeline was conducted in three validation cohorts.

Fig. 1
figure 1

Workflow of study. A The MIMIC-III dataset was used as development cohort and NMEDW, eICU, and CEDAR datasets were used as validation cohorts. Electronic health records including laboratory tests, vital signs, and medication were extracted to compute the SOFA score every 6 h during 72 h after admission to ICU. B Each patient was represented as a 72-h SOFA score trajectory. Dynamic time warping (DTW) was used to compute heterogeneous SOFA trajectory similarities and HAC was applied to identify subphenotypes based on trajectory similarities. C To re-derive subphenotypes in three validation cohorts and consider sensitivity analysis to clustering method, specifically, use another method (Group-Based Trajectory Modeling, GBTM) to generate subphenotypes. Statistical analysis were performed among subphenotypes in terms of demographic factors, laboratory tests and vital signs. D The predictive model of subphenotypes at successive time points (hours 6, 24, 36, 48, 60) after ICU admission was constructed based on a random forest classifier by using patients’ clinical data including laboratory tests, vital signs, and SOFA subscores

Definition of sepsis and study population

The development cohort (Medical Information Mart for Intensive Care III database: MIMIC-III) was from Beth Israel Deaconess Medical Center (BIDMC) with admissions dating from 2001 to 2012, which has 673 licensed beds, including 493 medical/surgical beds, 77 critical care beds, and 62 OB/GYN beds [19]. The first validation cohort was from Northwestern Medicine Enterprise Data Warehouse (NMEDW), which is a network of eleven hospitals located in northern Illinois with 2,554 beds in total, with ICU admissions dating from 2012 to 2019 [20]. The second validation cohort was from the eICU collaborative research database, which combined multicenter data from patients who were admitted to one of 335 units at 208 hospitals located throughout the USA between 2014 and 2015 [21]. The third validation cohort was from Weill Cornell Critical carE Database for Advanced Research (CEDAR) with ICU admissions dating from 2001 to 2020, which was built on NewYork-Presbyterian/Weill Cornell Medical Center (NYP/WCMC), including 862 beds in total [22]. The inclusion–exclusion cascade for the patients are shown in Additional file 1: Fig. S1, where Sepsis-3 criteria are defined as in Singer et al. [2]

SOFA score computation and model descriptions

The SOFA score was derived from six organ-specific subscores including respiration, coagulation, liver, cardiovascular, CNS, and renal [16], which was obtained every 6 h within the first 72 h of ICU admission. For each 6-h period, the worst variable value was used to compute the SOFA subscores. To obtain the urine output during 6 h, we divided daily urine output by 4. The lowest GCS for each 6-h period was used irrespective of sedation. Missing values (Additional file 1: Table S14) were imputed using last observation carried forward (LOCF) and next observation carried backward (NOCB) [23]. If there was no any value during the first 72 h, we used 0 to fill.

After the SOFA scores were derived, each patient is represented as a vector of 12 SOFA scores from the first 6 h to the last 6 h across the 72-h period after ICU admission. Then, DTW and HAC were used to derive subphenotypes [17]. In particular, DTW was used to evaluate the similarities between pairwise patient SOFA trajectories (Additional file 1: Figs. S19 and S20). This method can capture the differences among the evolution heterogeneity in terms of the temporal curves, which can assess similarity between patients robustly. HAC was then used to perform clustering among patients based on the similarities obtained from DTW. Multiple clustering indices (Supplemental Appendix 7) were calculated to determine the optimal numbers of subphenotypes.

Subphenotype reproducibility and prediction

To ensure the robustness of the derived subphenotypes, we re-derived them with group-based trajectory modeling (GBTM), which is one type of latent class analysis (LCA) that assigns each patient a probability of belonging to each particular subphenotype on the basis of maximum likelihood estimation [24].

We trained a random forest model to predict the derived subphenotypes from the baseline patient clinical collected characteristics at successive time points after ICU admission, with the goal of examining whether the trajectory subphenotypes could be predicted early. Candidate predictors included demographics, comorbidities, SOFA subscores, laboratory tests, and vital signs. Predictor contributions were evaluated with the Shapley additive explanations (SHAP) strategy [25].

Statistical analysis

Data were analyzed using tslearn package 0.3.1 and scikit-learn package 0.22.2 with Python 3.7. Survival analysis to 28 days was performed using Kaplan–Meier curves. Statistical significance was set at P < 0.05, and all tests were 2-tailed. The detailed descriptions about statistical testing are shown in Supplemental Appendix 2.


Cohort characteristics

Our development cohort MIMIC-III had 4,678 sepsis patients with the median age 65.9 years (Interquartile Range (IQR) [53.7–77.9]), which included 2,625 male (56.1%) and 3,367 white (71.9%) patients. The overall in-hospital mortality rate was 10.9%, and the median ICU length-of-stay was 2.8 days (IQR [1.6–5.6]). There were 1,893 patients (40.5%) treated with mechanical ventilation during the first three days. The mean baseline SOFA score obtained from the first 6 h after ICU admission was 4.96 (Standard Deviation (SD): 2.8). Most of the patients (2,611, 55.8%) were in the medical intensive care unit (MICU). The overall demographic distributions of the validation cohorts from NMEDW(n = 3,665) and eICU [21] (n = 12,282) are similar to the development cohort. Patients in validation cohort CEDAR (n = 4,804) were older (median age 77 years (IQR [66.0–88.0]) compared to development cohort. The overall in-hospital mortality rates of patients in NMEDW, eICU, and CEDAR were 14.0%, 10.5%, and 199%, respectively. The median length-of-stay were 3.8 days (IQR [1.9–7.9]), 2.8 days (IQR [1.7–5.1]), 4.4 days (IQR [2.7–7.9]). There were 1,524 (41.6%), 5,772 (47.0%), and 2,263 (47.1%) patients that needed mechanical ventilation in the first three days. The mean baseline SOFA scores were 5.68 (SD:2.8), 5.9 (SD:3.1), and 6.4 (SD:3.1) in validation cohorts.

Comparisons between survivors and nonsurvivors

In the development cohort, nonsurvivors were older than survivors, with a median age of 71.5 years (IQR, [59.9–80.9]) compared with 65.2 years for survivors (IQR, [53.2–77.4], P-value < 0.001). Nonsurvivors had higher comorbidity burden with a median Elixhauser index score [26] 7.0 (IQR [2.0–12.0]). Median ICU length-of-stay for nonsurvivors was 3.95 days (IQR [1.9–7.7]), and the rate of mechanical ventilation during the first three days was 59.8%. Nonsurvivors had higher baseline SOFA scores, with a mean value 7.1 (SD: 3.7). More nonsurvivors were admitted in MICU (Additional file 1: Table S1). Similar statistics in validation cohorts are shown in Additional file 1: Tables S2, S3, and S4.

SOFA trajectory and the derived subphenotypes

Based on the pairwise patients’ SOFA trajectory similarity matrix obtained from DTW, we generated clustermaps with HAC (Additional file 1: Fig. S2), where four distinct clusters were identified as subphenotypes. The number of clusters was determined according to multiple clustering indexes (Additional file 1: Appendix 6 and Table S5).

The overall trajectory and prevalence of each subphenotype across four cohorts are shown in Figs. 2 and 3. Specifically, in the development cohort, the Rapidly Worsening subphenotype (n = 612, 13.1%) was characterized by continuously increased SOFA scores from a mean (SD) of 4.5 (2.8) at admission to more than 7 at 72 h. This subphenotype had the fewest patients. The Delayed Worsening subphenotype (n = 960, 20.5%) was characterized by decreased SOFA scores within the first 48 h from a mean (SD) of 5.2 (2.7) at baseline to 3.7 (2.8), followed by an increase over the last 24 h. The Rapidly Improving subphenotype (n = 1,932, 41.3%) was characterized by a consistent continuous improvement in SOFA scores from a mean (SD) of 5.54 (2.9) at baseline to less than 3. This was the most common subphenotype and it had the highest SOFA score at baseline. The Delayed Improving subphenotype (n = 1,174, 25.1%) was characterized by an increase and then a gradual decrease in SOFA score over 72 h. It had the lowest SOFA score at baseline with mean (SD) 4.0 (2.4). Similar trajectory trends were obtained in all three validation cohorts (Figs. 2 and 3, Supplemental Appendix 3). Individual SOFA subscore trajectories for each subphenotype are provided in Additional file 1: Figs. S3, S4, S10, and S14.

Fig. 2
figure 2

Sequential Organ Failure Assessment (SOFA) trajectories of the identified subphenotypes in development and three validation cohorts. DI: Delayed Improving; RI: Rapidly Improving; DW: Delayed Worsening; RW: Rapidly Worsening

Fig. 3
figure 3

The prevalence of each subphenotype in development (MIMIC-III) and other three validation cohorts (NMEDW, eICU, CEDAR). DI: Delayed Improving; RI: Rapidly Improving; DW: Delayed Worsening; RW: Rapidly Worsening

Patient characteristics comparisons across subphenotypes

Patient characteristics differed across subphenotypes (Table 1, Figs. 4, 5, and Additional file 1: Table S6). Specifically, Rapidly Worsening patients had the highest rates of mechanical ventilation (46.41%), the highest median Elixhauser comorbidity burden value of 5 (IQR [0–10]) but the lowest baseline SOFA score compared to the other subphenotypes. They had the highest mortality rate (Fig. 4A 28.3%, P-value < 0.001) and a longer length of stay (Table 1, 2.9 days, P-value < 0.001). Rapidly Improving patients had the lowest rate of mortality (Fig. 4A 5.5%) and mechanical ventilation (37.9%), and the shortest length-of-stay (2.4 days). It had the highest proportion of patients meeting criteria for septic shock (15.5%, P-value = 0.002). Delayed Improving and Delayed Worsening patients had lower rates of mortality (10.7%, 10.6%) and mechanical ventilation (42.5%, 39.3%) than the Rapidly Worsening subphenotype. The median age of the four subphenotypes was similar in the development cohort. Male patients were more common in all subphenotypes. Chord diagrams (Fig. 5) showed the differences of subphenotypes in terms of abnormal clinical biomarkers. The Rapidly Worsening group was more likely to have patients with abnormal cardiovascular biomarkers (bicarbonate, troponin T or I, lactate) and hematologic (such as hemoglobin, INR, platelet, glucose, RDW). Patients in this subphenotype had a higher chronic comorbidity burden and had abnormal SOFA subscores including respiration, coagulation and liver. The Rapidly Improving patients were more likely to have abnormal inflammatory laboratory values (temperature, WBC, bands, CRP, albumin, lymphocyte percent) and abnormal cardiovascular, CNS, and renal SOFA subscores. There was a lower chronic comorbidity burden in this subphenotype. Delayed Worsening group had more abnormal hematologic and respiration, coagulation, CNS, and SOFA renal subscores. Abnormal respiration, coagulation, and cardiovascular SOFA subscores were strongly associated with Delayed Improving. The characteristics on validation cohorts are provided in Additional file 1: Appendix 4 and Tables S7, S8, S9, S10, S11, and S12. The associations between all comorbidities and subphenotypes were investigated and shown in Additional file 1: Tables S16, S17, S18, and S19. Multiple comorbidities such as congestive heart failure, renal failure, liver disease, and cancer showed the differences among subphenotypes.

Table 1 Patient characteristics among subphenotypes in the development cohort
Fig. 4
figure 4

Survival analysis in terms of the identified subphenotypes in development and three validation cohorts. DI: Delayed Improving; RI: Rapidly Improving; DW: Delayed Worsening; RW: Rapidly Worsening. The A, B, C, and D show the survival analysis results in development and three validation cohorts, respectively

Fig. 5
figure 5

Chord diagrams showing abnormal variables by subphenotype in development cohort. a abnormal biomarkers vs. all subphenotypes; I: abnormal biomarkers vs. DI; II: abnormal biomarkers vs. RI; III: abnormal biomarkers vs. DW; IV: abnormal biomarkers vs. RW; b abnormal subscores vs. all subphenotypes; V: abnormal subscores vs. DI; VI: abnormal subscores vs. RI; VII: abnormal subscores vs. DW; VIII: abnormal subscores vs. RW. DI: Delayed Improving; RI: Rapidly Improving; DW: Delayed Worsening; RW: Rapidly Worsening

Subphenotype reproducibility and prediction

Sensitivity analysis with another clustering approach GBTM confirmed the four subphenotypes with the data from development cohort (Additional file 1: Fig. S8). Patients’ memberships of the four subphenotypes re-derived by GBTM were highly consistent with those obtained from HAC (Additional file 1: Fig. S9), and thus, we did not find substantial changes in clinical characteristics of those subphenotypes derived from the sensitivity analysis (Additional file 1: Table S13).

We trained random forest models for predicting subphenotypes according to early-stage patient characteristics. Overall, with the first 6 h after ICU admission, the models obtained the accuracy of 0.78 (95% Confidence Interval [CI] [0.77, 0.8]) in development cohort and 0.79 (95% CI [0.78, 0.8]), 0.81 (95% CI [0.8, 0.84]), and 0.82 (95% CI [0.81, 0.84]) in NMEDW, eICU, and CEDAR validation cohorts, respectively. Predictor contributions on four cohorts are shown in Fig. 6 and Additional file 1: Figs. S5, S11, and S15, which demonstrated different patterns when predicting different subphenotypes. For example, creatinine, bicarbonate, RDW, and BUN contribute more for predicting the Rapidly Improving group, while platelet, INR, AST, and lactate contributed more to the prediction of the Rapidly Worsening group. The prediction performance at successive time points are shown in Additional file 1: Fig. S18. The accuracy increased to 0.87 (95% CI [0.86, 0.88]) in development cohort and 0.86 (95% CI [0.85, 0.88]), 0.86 (95% CI [0.85, 0.87]), and 0.84 (95% CI [0.83, 0.85]) in NMEDW, eICU, and CEDAR validation cohorts at the 24 h after ICU admission, respectively.

Fig. 6
figure 6

SHAP value-based predictor contribution to the subphenotype prediction of the predictive model in development cohort. Features’ importance is ranked based on SHAP values. In this figure, each point represented a single observation. The horizontal location showed whether the effect of that value was associated with a positive (a SHAP value greater than 0) or negative (a SHAP value less than 0) impact on prediction. Color showed whether the original value of that variable was high (in red) or low (in blue) for that observation. For example, in RW, a low platelets value had a positive impact on the RW subphenotype prediction; the “low” came from the blue color, and the “positive” impact was shown on the horizontal axis. DI: Delayed Improving; RI: Rapidly Improving; DW: Delayed Worsening; RW: Rapidly Worsening


We reported four sepsis subphenotypes based on dynamic organ dysfunction trajectories using a data-driven methodology. DTW was used to calculate patients’ SOFA trajectory similarities because of its capability of capturing heterogeneous evolution among the temporal sequences robustly, based on which HAC was leveraged to identify patient groups with similar trajectories. The subphenotypes identified were Rapidly Worsening, Delayed Worsening, Rapidly Improving, and Delayed Improving. Patients in the Rapidly Worsening subphenotype had progressive organ dysfunction with the ongoing ICU stay. The two Delayed groups had unstable organ dysfunction over the study period and the Rapidly Improving group had the highest admission organ dysfunction but quickly improved. Outcomes followed SOFA trajectory across each subphenotype were irrespective of traditional baseline SOFA score and septic shock categories.

A major strength of this analysis is that we have identified time-dependent progression patterns that may be related to the differential response of specific organ dysfunction to standard of care interventions. For example, the Rapidly Improving group had cardiovascular and respiratory failure at admission that resolved over 72 h. The Rapidly Worsening groups developed multisystem organ failure including visceral organ dysfunction, specifically liver failure in addition to cardiovascular and respiratory failure. These differential patterns suggest varying time-dependent, treatment responsive organ dysfunction pathophysiology in sepsis. The cardiovascular and respiratory subscores are driven by the vasopressor dose and PaO2/FiO2, respectively, which may respond to therapeutic interventions such as corticosteroids, volume resuscitation, and the application of PEEP or therapeutic suctioning [27]. However, as demonstrated by our analysis, sepsis-related renal and liver failure may be less modifiable with our current therapeutic strategies over the past twenty years [28, 29]. Our study highlights that patterns of organ dysfunction in patients with sepsis are Rapidly Improving, Rapidly Worsening, and Delayed. Each of these patterns may be due to a different pathophysiology and benefit from different treatments in the future. However, these findings have immediate implications for those designing clinical trial endpoints such as change in SOFA subscore [30]. Moreover, enrolling patients with a Rapidly Improving phenotype into a trial evaluating a therapeutic agent to reduce the duration of organ dysfunction will unlikely reveal a difference.

It deserves noting that our Rapidly Improving patients had better outcomes across all patients studied, but still represented 21%, 36%, 21%, and 24% of all deaths in our development and validation cohorts (NMEDW, eICU, and CEDAR cohorts), respectively, despite an overall 5%, 10%, 5%, and 9% in-hospital mortality. This low mortality rate but high numbers of absolute deaths highlights that further research is needed to understand the cause of death in patients with rapidly improving organ dysfunction in sepsis [31]. The Rapidly Worsening subphenotype was less common compared to rapidly improving and may represent patients with our classical understanding of septic shock [32]. More recent evidence suggests that the pathophysiology of early, progressive organ dysfunction in our Rapidly Worsening patients may be due to over exuberant activation of necroinflammatory cell death pathways in multiple organs, highlighting the need for novel treatment strategies [33,34,35]. The Delayed Worsening and Improving subphenotypes had intermediate outcomes across our cohorts, and more nuanced differences in clinical characteristics. These trajectories may be influenced by non-resolving inflammation or immune paralysis [36, 37]. Further understanding of the biology underlying these subphenotypes will be critical to develop the next generation of treatments for sepsis in all its forms.

The potential for distinct pathophysiologic etiologies for the differential trajectories is supported by the differential patterns of organ dysfunction, infectious source, vital signs, inflammatory, hematologic, and cardiovascular variables at admission to the ICU. As shown in Fig. 5, and Additional file 1: Figs. S6, S7, S12, S13, S16, and S17, there were different variables associated with different groups over the course of the study. For example, those patients of Rapidly Improving were more likely to have more abnormal inflammatory markers (such as WBC, bands, albumin, temperature, and lymphocyte) and more abnormal values on cardiovascular, and CNS subscores. They were also more likely to have urosepsis. There was a lower comorbidity score in patients with this subphenotype, which suggests that sepsis outcomes may be more dependent on underlying illness. The Rapidly Worsening patients had more comorbidities and distinct derangements in clinical variables associated with metabolic acidosis and hypoperfusion, e.g., a low bicarbonate and higher lactate, and disseminated intravascular coagulation, e.g., low platelets and a higher INR and respiratory failure. Both of the Delayed subphenotypes had less specific variables associated with group membership, including inflammatory, hepatic, hematologic, and pulmonary associated with Delayed Improvement and hematologic, cardiovascular and renal variables associated with Delayed Worsening. These differences may be related to secular trends in therapeutics and differing case mixes in each cohort.

We built multivariable prediction models for the identified trajectory subphenotypes from patient baseline characteristics and early-stage clinical features. Several interesting findings were obtained. (1) A high comorbidity score tended to predict the subphenotypes of Rapidly Worsening because patients with high comorbidity burden were more likely to present worse organ dysfunction in ICU; (2) the roles of laboratory tests and vital signs were different on prediction. For example, low platelets had a positive impact on the Rapidly Worsening prediction and high platelets had a positive impact on the Rapidly Improving prediction. These prediction models may enhance the clinical utility of the identified subphenotypes in practice, as they can be predicted with the EHR information captured within the early hours of ICU admission, especially for Rapid Improving and Rapid Worsening subphenotypes, which has important clinical implications as discussed above. Our model can be implemented within the EHR system as a risk calculator for subphenotype assignments.

Our manuscript complements and adds to other recent study of sepsis subphenotypes. For example, Seymour et al. and Knox et al. each identified four subphenotypes that were associated with organ dysfunction patterns and clinical outcomes in patients with sepsis using a panel of baseline clinical variables [5, 10]. There is some overlap in our high risk groups, notably both include liver injury and shock. However, our work demonstrates that the difference in outcome in this group is due to progressive non-resolving organ dysfunction that calls for novel treatments. Prior work by Ferreira et al. and Sakr et al. used changes in the SOFA score after ICU admission to improve prognostic stratification in sepsis, but did not use these changes to establish subphenotypes. Bhavani et al. used longitudinal temperature trajectories to identify four sepsis subphenotypes, with significant variability in inflammatory markers and outcomes, highlighting the potential for novel immune signatures to be uncovered through trajectory analysis [1]. Differential organ dysfunction trajectory may be related to the immune response but may also be explained by differences in preexisting frailty, effective source control, resuscitation, and processes of care.

This study has several limitations. First, our sepsis subphenotypes were identified based on the data-driven method, which may not be directly related to underlying differences in biology. Integration of biological data may help refine our understanding of differential disease progression and the potential for therapeutics to alter the course. Second, although we used many separate hospitals in validation, all of them are located in the USA, which may limit generalizability to other locations of care. Moreover, these observational cohorts may not directly reflect sepsis clinical trial populations but are representative of academic and community hospitals across the USA. Third, we did not evaluate the effect of specific randomized interventions on SOFA score trajectory. Fourth, this identified sepsis subphenotypes only focused on patients admitted to an ICU, which is subject to differences in ICU admission practices across institutions. Last but not the least, we did not investigate the association between care processes and the subphenotypes, which would be an important topic in future research.


We discovered four sepsis subphenotypes with different natural histories following admission to the ICU. Our results suggest that these subphenotypes represent a differential host pathogen response in the setting of current standard of care therapy. Understanding differential trajectory has implications for the design and predictive enrichment of therapeutic clinical trials [38]. Further understanding of the underlying biology of subphenotypes may reveal insights into sepsis pathophysiology and improve the personalization of sepsis management.

Availability of data and materials

The deidentified data from development cohort (MIMIC-III) and data from validation cohort (eICU) can be obtained after approval of proposal with a signed data access agreement by checking physionet (The Research Resource for Complex Physiologic Signals, For validation cohorts (NMEDW and CEDAR), data access are not covered by our data transfer agreements. All source codes in this study are available at Our implementation is based on Python 3.7 and R 3.6. More specifically, clustering models were implemented by using Python packages ‘scikit-learn 0.23.2’ ( and ‘scipy 1.5.3’ ( The implementation of SHAP is based on ‘SHAP 0.35.0’ ( R package ‘NbClust’ ( was used to determine the optimal number of clusters in agglomerative hierarchical clustering. Chord diagrams were created using R package ‘circlize’ ( Statistical tests and survival analyses were performed based on R.



Intensive care unit


Dynamic time warping


Hierarchical agglomerative clustering


Sequential organ failure assessment


Group-based trajectory modeling


Latent class analysis


Electronic health record


Medical information mart for intensive care III database


Beth Israel deaconess medical center


Northwestern medicine enterprise data Warehouse


Critical carE database for advanced research


NewYork-Presbyterian/Weill Cornell Medical Center


Central nervous system


Last observation carried forward


Next observation carried backward


Shapley additive explanations


Medical intensive care unit


Standard deviation


White blood cell count


Red blood cell distribution width


C-reactive protein


Confidence interval


Aspartate aminotransferase


  1. Bhavani SV, Carey KA, Gilbert ER, Afshar M, Verhoef PA, Churpek MM. Identifying novel sepsis subphenotypes using temperature trajectories. Am J Respir Crit Care Med. 2019;200(3):327–35.

    Article  Google Scholar 

  2. Singer M, Deutschman CS, Seymour CW, et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA. 2016;315(8):801–10.

    Article  CAS  Google Scholar 

  3. Rhee C, Dantes R, Epstein L, et al. Incidence and trends of sepsis in US hospitals using clinical vs claims data, 2009–2014. JAMA. 2017;318(13):1241–9.

    Article  Google Scholar 

  4. Mullard A. Drug withdrawal sends critical care specialists back to basics. Lancet. 2011;378(9805):1769.

    Article  Google Scholar 

  5. Seymour CW, Kennedy JN, Wang S, et al. Derivation, validation, and potential treatment implications of novel clinical phenotypes for sepsis. JAMA. 2019;321(20):2003–17.

    Article  CAS  Google Scholar 

  6. Abraham E. Moving forward with refinement of definitions for sepsis. Crit Care Med. 2021;49(5):861–3.

    Article  Google Scholar 

  7. Cohen J, Opal S, Calandra T. Sepsis studies need new direction. Lancet Infect Dis. 2012;12(7):503–5.

    Article  Google Scholar 

  8. Cohen J, Vincent JL, Adhikari NK, et al. Sepsis: a roadmap for future research. Lancet Infect Dis. 2015;15(5):581–614.

    Article  Google Scholar 

  9. DeMerle KM, Angus DC, Baillie JK, et al. Sepsis subclasses: a framework for development and interpretation. Crit Care Med. 2021;49(5):748–59.

    Article  Google Scholar 

  10. Knox DB, Lanspa MJ, Kuttler KG, Brewer SC, Brown SM. Phenotypic clusters within sepsis-associated multiple organ dysfunction syndrome. Intens Care Med. 2015;41(5):814–22.

    Article  Google Scholar 

  11. Scicluna BP, van Vught LA, Zwinderman AH, et al. Classification of patients with sepsis according to blood genomic endotype: a prospective cohort study. Lancet Resp Med. 2017;5(10):816–26.

    Article  Google Scholar 

  12. Sweeney TE, Azad TD, Donato M, et al. Unsupervised analysis of transcriptomics in bacterial sepsis across multiple datasets reveals three robust clusters. Crit Care Med. 2018;46(6):915–25.

    Article  Google Scholar 

  13. Ferreira FL, Bota DP, Bross A, Melot C, Vincent JL. Serial evaluation of the SOFA score to predict outcome in critically ill patients. JAMA-J Am Med Assoc. 2001;286(14):1754–8.

    Article  CAS  Google Scholar 

  14. Sakr Y, Lobo SM, Moreno RP, et al. Patterns and early evolution of organ failure in the intensive care unit and their relation to outcome. Crit Care. 2012;16(6):1–9.

    Article  Google Scholar 

  15. Sanchez-Pinto LN, Stroup EK, Pendergrast T, Pinto N, Luo Y. Derivation and validation of novel phenotypes of multiple organ dysfunction syndrome in critically ill children. JAMA Network Open. 2020;3(8):e209271-e.

    Article  Google Scholar 

  16. Vincent JL, Moreno R, Takala J, et al. The SOFA (sepsis-related organ failure assessment) score to describe organ dysfunction/failure. Intens Care Med. 1996;22(7):707–10.

    Article  CAS  Google Scholar 

  17. Berndt DJ, Clifford J. Using dynamic time warping to find patterns in time series. KDD workshop; 1994: Seattle, WA, USA; 1994. p. 359-70.

  18. Jain AK, Murty MN, Flynn PJ. Data clustering: a review. ACM Comput Surv. 1999;31(3):264–323.

    Article  Google Scholar 

  19. Johnson AEW, Pollard TJ, Shen L, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016;3:4458.

    Article  Google Scholar 

  20. Starren JB, Winter AQ, Lloyd-Jones DM. Enabling a learning health system through a unified enterprise data warehouse: the experience of the northwestern university clinical and translational sciences (NUCATS) institute. Clin Transl Sci. 2015;8(4):269–71.

    Article  Google Scholar 

  21. Pollard TJ, Johnson AE, Raffa JD, Celi LA, Mark RG, Badawi O. The eICU collaborative research database, a freely available multi-center database for critical care research. Sci Data. 2018;5(1):1–13.

    Article  Google Scholar 

  22. Schenck EJ, Hoffman KL, Cusick M, Kabariti J, Sholle ET, Campion TR Jr. Critical carE Database for Advanced Research (CEDAR): An automated method to support intensive care units with electronic health record data. J Biomed Inform. 2021;118:103789.

    Article  Google Scholar 

  23. Moritz S, Bartz-Beielstein T. imputeTS: time series missing value imputation in R. R J. 2017;9(1):207.

    Article  Google Scholar 

  24. Nagin DS, Odgers CL. Group-based trajectory modeling in clinical research. Annu Rev Clin Psychol. 2010;6:109–38.

    Article  Google Scholar 

  25. Molnar C. Interpretable machine learning:; 2020.

  26. Quan HD, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130–9.

    Article  Google Scholar 

  27. Schenck EJ, Oromendia C, Torres LK, Berlin DA, Choi AMK, Siempos II. Rapidly improving ARDS in therapeutic randomized controlled trials. Chest. 2019;155(3):474–82.

    Article  CAS  Google Scholar 

  28. Woźnica EA, Inglot M, Woźnica RK. Łysenko LJAic, University emooWM Liver dysfunction in sepsis. Science. 2018;27(4):547–51.

    Google Scholar 

  29. Gaudry S, Hajage D, Benichou N, et al. Delayed versus early initiation of renal replacement therapy for severe acute kidney injury: a systematic review and individual patient data meta-analysis of randomised clinical trials. Lancet. 2020;395(10235):1506–15.

    Article  Google Scholar 

  30. Lambden S, Laterre PF, Levy MM, Francois B. The SOFA score—development, utility and challenges of accurate assessment in clinical trials. Crit Care. 2019;23(1):1–9.

    Article  Google Scholar 

  31. Abrams D, Montesi SB, Moore SKL, et al. Powering bias and clinically important treatment effects in randomized trials of critical Illness. Science. 2020;48(12):1710–9.

    Google Scholar 

  32. Hotchkiss RS, Moldawer LL, Opal SM, Reinhart K, Turnbull IR, Vincent J-L. Sepsis and septic shock. Nat Rev Dis Primers. 2016;2(1):16045.

    Article  Google Scholar 

  33. Ma KC, Schenck EJ, Siempos II, et al. Circulating RIPK3 levels are associated with mortality and organ failure during critical illness. Science. 2018;3(13):7789.

    Google Scholar 

  34. Linkermann A. Death and fire—the concept of necroinflammation. Cell Death Differ. 2019;26(1):1–3.

    Article  Google Scholar 

  35. Schenck EJ, Ma KC, Price DR, et al. Circulating cell death biomarker TRAIL is associated with increased organ dysfunction in sepsis. Science. 2019;4(9):7789.

    Google Scholar 

  36. Mira JC, Gentile LF, Mathias BJ, et al. Sepsis pathophysiology, chronic critical Illness, and persistent inflammation-immunosuppression and catabolism syndrome. Science. 2017;45(2):253–62.

    Google Scholar 

  37. Cao C, Yu M, Chai Y. Pathological alteration and therapeutic implications of sepsis-induced immune cell apoptosis. Cell Death Dis. 2019;10(10):782.

    Article  Google Scholar 

  38. Granholm A, Alhazzani W, Derde LPG, et al. Randomised clinical trials in critical care: past, present and future. Intens Care Med. 2021;5:70002.

    Google Scholar 

Download references


The work of ZX, CZ, HZ, and FW are supported by NSF 1750326, NIH RF1AG072449, and NIH R01MH124740. The work of ES is supported by NHLBI K23HL151876. The work of CM and YL are supported in part by NIH 1R01LM013337 and U01TR003528.


The research funding is from NSF and NIH.

Author information

Authors and Affiliations



ES and FW contributed to the conceptualization, investigation, writing, reviewing and editing of the manuscript. ZX contributed to data analysis, drafting, editing and reviewing manuscript. CM contributed to data analysis, editing and reviewing manuscript. CS, HZ, IS, LT, and DP contributed to discussion, commenting and editing the manuscript. YL, CM, ZX, FW, and ES verified the data. YL and CM had access to the raw data from the NMEDW. FW, ES, and ZX had access to the raw data from the CEDAR, eICU, and MIMIC-III. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Yuan Luo, Edward J. Schenck or Fei Wang.

Ethics declarations

Ethics approval and consent to participate

Consent obtained for use of MIMIC-III, eICU, NMEDW, and CEDAR databases.

Consent for publication

Not applicable.

Competing interests

ES received the consulting fees in terms of Axle Informatics (NIAID COVID19 Vaccine Subject Matter Expert Program) and payment in terms of Department of Defense (Peer Reviewed Medical Research Program). All other authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

 Sepsis Subphenotyping Based on Organ Dysfunction Trajectory.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Xu, Z., Mao, C., Su, C. et al. Sepsis subphenotyping based on organ dysfunction trajectory. Crit Care 26, 197 (2022).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: