Deploying unsupervised clustering analysis to derive clinical phenotypes and risk factors associated with mortality risk in 2022 critically ill patients with COVID-19 in Spain

Background The identification of factors associated with Intensive Care Unit (ICU) mortality and derived clinical phenotypes in COVID-19 patients could help for a more tailored approach to clinical decision-making that improves prognostic outcomes. Methods Prospective, multicenter, observational study of critically ill patients with confirmed COVID-19 disease and acute respiratory failure admitted from 63 ICUs in Spain. The objective was to utilize an unsupervised clustering analysis to derive clinical COVID-19 phenotypes and to analyze patient’s factors associated with mortality risk. Patient features including demographics and clinical data at ICU admission were analyzed. Generalized linear models were used to determine ICU morality risk factors. The prognostic models were validated and their performance was measured using accuracy test, sensitivity, specificity and ROC curves. Results The database included a total of 2022 patients (mean age 64 [IQR 5–71] years, 1423 (70.4%) male, median APACHE II score (13 [IQR 10–17]) and SOFA score (5 [IQR 3–7]) points. The ICU mortality rate was 32.6%. Of the 3 derived phenotypes, the A (mild) phenotype (537; 26.7%) included older age (< 65 years), fewer abnormal laboratory values and less development of complications, B (moderate) phenotype (623, 30.8%) had similar characteristics of A phenotype but were more likely to present shock. The C (severe) phenotype was the most common (857; 42.5%) and was characterized by the interplay of older age (> 65 years), high severity of illness and a higher likelihood of development shock. Crude ICU mortality was 20.3%, 25% and 45.4% for A, B and C phenotype respectively. The ICU mortality risk factors and model performance differed between whole population and phenotype classifications. Conclusion The presented machine learning model identified three clinical phenotypes that significantly correlated with host-response patterns and ICU mortality. Different risk factors across the whole population and clinical phenotypes were observed which may limit the application of a “one-size-fits-all” model in practice.


Introduction
Since the outbreak of COVID-19 disease began in December 2019 in China, soaring cases of confirmed SARS-CoV-2 are pummeling the global health system. More than 91 million people have developed SARS-CoV-2 infection, and more than 2 million have died [1]. Critical illness from COVID-19 has constrained intensive care unit (ICU) material and human resources [2]. As of January 18, 2021 more than 2.5 million people in Spain have been infected with SARS-CoV-2 and more than 53,000 have died [3]. Short-term mortality reported rate ranges from 16 to 62% of patients admitted to ICU [4][5][6][7][8]. The heterogeneity of patients that have been treated in China [4], Italy [8], USA [5][6][7] or Spain [9,10] may explain the wide variation of mortality rate due to their population characteristics, presence of comorbidities and healthcare systems. A recent international survey [11] reported significant practice variations in the management of severe COVID-19 patients, including differences at the regional, hospital, and patient level. Therefore, it is necessary to characterize phenotypes, by extending the enrolment of patients outside of one ICU site to multiple patients being treated in different hospitals. Allowing to adequately measure mortality-related factors adjusted by the inter-hospital variation to determine clinical outcomes.
Risk factors represent the most important approach when defining treatment of hospitalized patients as these measures can inform clinical courses most likely for a patient given their a priori risk. However, risk factors can also interplay differently when they are included in different patient clusters. A single model based on general risk factors (one-size-fits-all) might be limited for clinical data interpretation and application across sites. Different combinations of risk factors may naturally cluster into previously undescribed subsets or phenotypes that may have different risks for a high mortality rate and that may therefore help to determine the response to treatments in COVID- 19. We hypothesize that the presence of welldefined phenotypes in COVID-19 could help to more appropriately identify patients at risk of ICU mortality than general models for the entire population considering that this disease results in a constellation of symptoms, laboratory derangement, immune dysregulation, and clinical complications.
The primary objective was to determine the presence of distinct clinical phenotypes using unsupervised clustering methods that were applied to the datasets available on ICU admission. The second objective was to assess which factors are independently associated with ICU mortality. The added value of this large-scale multicenter prospective study lies to discover phenotypes based on clinical data available at ICU admission that can help explain the variation in clinical results of COVID-19 disease in the ICU.

Study design
A multicenter observational, prospective cohort study that consisted of a large-scale data source of hospital ICU admissions and patient-level clinical data. The enrolment criteria included adult' patients with laboratory confirmed SARS-CoV-2 infection admitted in 63 ICUs across Spain due to acute respiratory failure between February 22, 2020 and May 11, 2020. The study was approved by the reference institutional review board at Joan XXIII University Hospital (IRB# CEIM/066/2020) and each participating site with a waiver of informed consent. All data values were anonymized prior to the phenotyping which consisted of clustering clinical variables on their association with COVID-19 mortality.

Study sites and patients population
The study enrolled consecutive adult patients (> 16 years) with laboratory confirmed SARS-CoV-2 infection, detected by RT-PCR positive test of nasopharyngeal, oropharyngeal swab or invasive respiratory samples according to the WHO recommendations [12]. The follow-up of patients was scheduled until August 11, 2020, which confirmed ICU discharge or death whichever occurred first. A complete list of participating ICUs and their investigators is provided in the acknowledgements section. In this cohort, 43 patients were described in a preliminary report of a single-center case series in Tarragona, Spain [9]. defined as any in-ICU death. All complications and outcomes were followed during ICU admission.

Data collection
Data was obtained from a voluntary registry created by Spanish Society of Intensive Care Medicine-SEMICYUC. All consecutive cases admitted to the ICU were collected. There were no patients excluded from the analysis that was enrolled to participating ICU and met criteria.
All the collected variables recorded at ICU admission are listed in the Additional file 1: p. 6. To determine severity of illness, the Acute Physiology and Chronic Health Evaluation (APACHE) II score [13] and Sequential Organ Failure Assessment (SOFA) scoring [14] were calculated for all patients within the first 24 h of ICU admission.
The ICU admission criteria, use of antiviral, antibiotic or co-adjuvant treatment, and also the measures that would determine the need to intubate and type of ventilator support required (oxygenation, high flow nasal cannula [HFNC], noninvasive [NIV] or invasive [IMV] mechanical ventilation) were not standardized between centers and were left to the discretion of the attending physician, according to SEMICYUC and National Ministry of Health [15] and were included in the case report form and confirmed by the medical records. We also collected hospital-level data including city, county and number of hospital beds available. The study definitions used in the present analysis are shown in the Additional file 1: p. 2.

Statistical analysis
No statistical sample size calculation was performed a priori, and sample size was equal to the number of patients admitted to the participant's ICUs with confirmed COVID-19 during the study period. To describe baseline characteristics, the continuous variables were expressed as median (interquartile range [IQR]) and categorical variables as number of cases (percentage). For patient demographics and clinical characteristics, differences between groups were assessed using the chi-squared test and Fisher's exact test for categorical variables, and the Mann-Whitney U or Wilcoxon test for continuous variables. To performed the analysis, we first assessed the candidate variables, missing values, and correlation. Multiple imputation was used to account for missing data (Additional file 1: p. 2). After evaluating correlation, highly correlated variables were excluded (Additional file 1: p. 5).
An overview of the primary analysis plan is outlined in Fig. 1. In a first step, a multilevel conditional logistic modelling and the intraclass correlation coefficient (ICC) was calculated (Additional file 1: p. 2) with patients nested in hospital to characterize hospital-level variation In a second step, to determine presence of distinct clinical phenotypes in our population of COVID-19 patients, an unsupervised clustering analysis was applied to the database at ICU admission. In order to carry out this analysis, a discretization of the numerical variables into categorical ones was done using "ChiMerger" packages for R software. The information provided by each variable regarding ICU mortality was defined using the Information Value (IV). A IV greater than 0.03 was considered clinically important and this variable was included in the multivariate logistic regression analysis. Model performance was examined using accuracy test, Sensitivity, Specificity and AUC modeling. Subsequently, the unsupervised cluster analysis was performed using the important variables. The Podani distance was used to calculate the distance between patients and the "partition around medoids" (PAM) algorithm to perform the clustering [16]. The optimal number of clusters were determined after studying the silhouette [17] and the PAM objective for different numbers of clusters (Additional file 1: p. 12). Each of these clusters represent a specific patient's phenotype. To visualize the clusters in a lower dimensional space, we used a Principal Component Analysis (PCA). We obtain important variables according to IV for each phenotype, and the OR of these variables were obtained after applying a GLM (Generalize linear Regression model) analysis. GLM is how statistical software R performs multiple logistic regression analysis when the command family = "binomial" is indicated. Multinomial regression models were fit to further compare patient comorbidities across phenotype classification. Model performance in each phenotype was examined using accuracy test, Sensibility, Specificity and AUC.
Lastly, a traditional multivariate analysis GLM was performed to investigate the association between baseline (on ICU admission) variables and ICU-mortality. The GLM model comprised factors of clinical interest and all significant covariates (p < 0.05) in the univariate analysis of ICU mortality and presence of collinearity was studied by variance inflation factors (VIF). The results are presented as odds ratios (OR) and 95% confidence intervals (CI). To determine our model, we checked adequate model performance between groups with a cross validation model (K-fold = 10) and the model with better performance was chosen.
For all model validation, database was randomly split into two subsets: (a) a "training set" (80%), and (b) a "validation set" (20%). Model performance was examined using accuracy test, precision, sensitivity, specificity and area under ROC curve (AUC). Data analysis was performed using R software (cran.r-project.org).

Patients characteristics at ICU admission
From February 29, 2020 to June 11, 2020 a total of 2,022 critically ill patients from 63 ICUs were enrolled in the present analysis. Forty percent of ICUs belonged to hospitals with more than 500 beds, 40% to hospitals between 200 and 500 beds and the remaining 20 percent to hospitals with fewer than 200 beds. To determine if a significant inter-hospital variation is present, multilevel conditional logistic modelling with patients nested in hospital to characterize hospital-level variation of ICU mortality was done. According to intraclass correlation coefficient (ICC) obtained 0.04 when considering all hospital (n = 63) and of 0.04 when excluded hospitals that submitted data on few than 10 patients, no significant inter-hospital variation was observed (Additional file 1: e- Fig Table 3, p. 8.

ICU mortality
Overall, 660 patients (32.6%) died. The crude ICU mortality increased significantly with the increase in predefined age cut-off and was greater than 80% in patients over 80 years old (Additional file 1: p. 7). Age, male sex, severity of illness (APACHE II and SOFA), presence of arterial hypertension, diabetes, coronary arterial disease, chronic obstructive pulmonary disease (COPD), chronic kidney disease (CKD), immunosuppression and hematologic disease markers were significantly higher in 'nonsurvivor' patients. Non-survivor patients compared with those that survived had higher levels of D-Lactate dehydrogenase (LDH), white blood cells, serum creatinine, C-reactive protein (CRP), Procalcitonin (PCT), serum lactate, serum D-dimer and serum ferritin. Nonsurvivor patients developed more frequent complications such as shock, kidney and myocardial dysfunction at ICU admission. High Flow Nasal Cannula (HFNC) was more frequent in survivors, while invasive mechanical ventilation (MV) was more common in non-survivors. Mortality for those who received MV during their ICU stay (n = 1554; 76.8%) was 37.3% (n = 580) higher than observed in patients who did not require MV 17.0% (80/468, p < 0.001). Complete characteristics of patients according outcome are shown in Table 1.

Unsupervised analysis (cluster) to determine different phenotypes in critically ill patients
Once the variables were categorized, 5 patients (0.24%) were excluded for outlier's data, and the analysis was performed with 2,017 patients. Of the 50 variables considered, only 25 were considered as predictors according to the IV (Additional file 1: p. 11) and were included in the model. Remarkably, no treatment option was a predictive factor for ICU-mortality. The categorized variables independently associated with ICU-mortality are shown in Additional file 1: p. 12. The performance of the model was adequate with an accuracy of 0.77, sensitivity of 0.88, specificity of 0.54 and AUC of 0.82. According to the Podani's distance and the Shilouette and PAM plots (Additional file 1: p. 13) the optimal number of clusters in our dataset was 3. Cluster A included 537 patients (26.7%), cluster B included 623 (30.8%) and cluster C included 857 patients (42.5%). The clusters in   Fig. 2. By including these important variables in a regression model for each cluster, we observed that the discrimination of each model was higher than general model except for C phenotype ( Table 3). The Variables independently associated with mortality were different between automatic and cluster models (Table 4 and Fig. 3  A-B).

Construction of the ICU Mortality classic multivariate model
Of the 42 variables measured at ICU admission, 25 variables that were statistically significant in the univariate analysis (Table 1) were included in the model. The initial dataset of patients was randomly split in two subgroups "Training group" with 1,618 patients (80%) and "Test group" with 404 patients (20%). The characteristics of patients included into each subgroup are shown in Additional file 1: e- Table 3, p. 8. No significant differences were observed between the subgroups. Inclusion of these 25 variables in a GLM model for the training group, resulted in 10 variables that were independently associated with ICU mortality (Fig. 4) (Table 3), however, the variables included in each model were different (Fig. 4 and Additional file 1: e- Fig. 9, p. 16).

Discussion
The main finding of our study is that among patients with COVID-19, 3 clinical phenotypes were derived using habitual clinical and laboratory variables at ICU admission. The ability of identifying phenotypes using a small set of variables is a crucial step towards clinical application and has important implications for possible IQR interquartile range, APACHE II Acute Physiology and Chronic Health Evaluation II, SOFA Sequential Organ Failure Assessment, BMI body mass index, COPD Chronic obstructive pulmonary disease, HIV human immunodeficiency viruses, PaO 2 /FiO 2 Partial pressure arterial oxygen/fraction of inspired oxygen a Corresponds to minimum or maximum value, as appropriate, within 12 h of ICU admission. The variables in this Table were no transformed for your comparison b APACHE II score to the severity of illness, the score is obtained by adding the following components (1) 12 clinical and laboratory variables each with a score range of 0 to 4 points (APS). The APS is determined from the worst physiologic values during the initial 24 h after ICU admission, (2) age with a range 0 to 6 points and (3) Chronic health points if the patients has history of severe organ system insufficiency or is immunocompromised assign 5 points if the patients is no operative or emergency postoperative and 2 points for elective postoperative patients with a total score range of 0 to 71 c SOFA score corresponds to the severity of organ dysfunction, reflecting six organ systems each with a score range of 0 to 4 points (cardiovascular, hepatic, hematologic, respiratory, neurological, renal), with a total score range of 0 to 24    3)*** differential treatment guided by phenotypes and validated prognostic scoring systems [18,19]. Our C phenotype was associated with more than double the ICU mortality than each of the remaining two phenotypes. This C phenotype was characterized by the interplay of older age (> 65 year), a high severity (APACHE II > 15 and SOFA > 5), greater burden of risk factors (hematologic disease and coronary disease) and a higher likelihood of developing further complications (shock and AKI).

Table 2 Characteristics of 2017 critically ill patients included in machine learning analysis according to overall or cluster (phenotype) population
Previous studies have implemented clustering techniques to analyze various data sources relating to demographic, geographic, environment, and socioeconomic determinants of health and disease. There are studies that have evaluated treatment decisions and characterized clinical phenotypes associated with complications, ICU admission and mortality risk in critically ill COVID-19 patients. According to the Situation Report & Public Health Guidance published by Johns Hopkins University on March 19th, 2020, people over 60 and those with chronic health conditions are at the highest risk for COVID-19 complications [20]. To our knowledge, this is the first study with a high number of critically ill patients to analyze the presence of phenotypes in patients with SARS-CoV-2 infection. This multicenter cohort study of 2,022 critically ill patients found that 660 patients (32.6%) died at ICU discharge. Our ICU mortality rates was significantly lower as reported in Yang et al. [4] in Wuhan, China (61.5%), by Myers et al. [7] in California, USA (50.0%), by Arentz et al. [6] in Washington, USA (67%) and by Richardson et al. [5] in New York, USA (78%), but slightly higher to reported by Grasselli et al. [8] in Lombardy region, Italy (26%). These observed differences in ICU mortality could respond to different healthcare models and important practice variations in the management of severe COVID-19 patients [11], but it can also depend on the frequency of presentation of the different phenotypes.
In our study, a great variability in model performance and risk factors were observed during cross-validation to choose the best model to use. In addition, we use 2 different techniques for the selection of important variables, one of them is the "classic" approach dependent on the p-value, while the other, a "modern" statistical Table 2 (continued) IQR interquartile range, APACHE II Acute Physiology and Chronic Health Evaluation II, SOFA Sequential Organ Failure Assessment, BMI body mass index, COPD Chronic obstructive pulmonary disease, HIV human immunodeficiency viruses, PaO 2 /FiO 2 Partial pressure arterial oxygen/ fraction of inspired oxygen a Corresponds to minimum or maximum value, as appropriate, within 12 h of ICU admission. The variables in this Table were no transformed for your comparison b APACHE II score to the severity of illness, the score is obtained by adding the following components (1) 12 clinical and laboratory variables each with a score range of 0 to 4 points (APS). The APS is determined from the worst physiologic values during the initial 24 h after ICU admission, (2) age with a range 0 to 6 points and (3) Chronic health points if the patients has history of severe organ system insufficiency or is immunocompromised assign 5 points if the patients is no operative or emergency postoperative and 2 points for elective postoperative patients with a total score range of 0 to 71 c SOFA score corresponds to the severity of organ dysfunction, reflecting six organ systems each with a score range of 0 to 4 points (cardiovascular, hepatic, hematologic, respiratory, neurological, renal), with a total score range of 0 to 24 d Defined as a body mass index (calculated as weight in kilograms divided by height in meters squared) of 30 or greater e Baseline eGFR < 60 on at least two consecutive values at least 12 weeks apart prior or hemodialysis f Included acute leukemia, myelodysplastic syndrome and Lymphomas g According to the New York Heart Association (NYHA) Functional Classification III and IV h Defined as patients in whom adequate fluid resuscitation therapy are unable to restore hemodynamic stability and need any dose of vasopressor drugs i Define as an abrupt and sustained (more than 24 h) decrease in kidney function and categorized according to RIFLE criteria j Define as an acute decrease in ejection fraction (EF) with dilatation of ventricles observed in echocardiography upon ICU admission All comparison between clusters. *p < .05; **p < .01; ***p < .001, others comparison p > .01 approaches, is more in line with the new recommendations [21]. Although the performance of the models was similar, the variables included in each of them are different. This could be related to the presence of a very heterogeneous patient population, which is revealed during random partitioning (80%/20%) validation of each model or by implementing 2 variable selection techniques. In this context, three clinical phenotypes of COVID-19 patients were derived using routinely available clinical data at ICU admission by an unsupervised cluster analysis. The phenotypes were multidimensional, differed in their demographics, laboratory abnormalities, patterns of organ dysfunction, and associated with ICU mortality. In addition, our phenotypes are not similar with groupings or phenotypes of patients performed so far considering only the presence of clinical complications [22,23], or the type of ARDS [24]. Our COVID-19 phenotypes can be identified at the time of the ICU admission, and thus could be useful in facilitating early tailored therapy and improve prognosis. Only routinely available clinical and laboratory data were used in the clustering models, and the phenotypes were derived from a large observational multicenter cohort to ensure generalizability. Importantly, we have observed that the variables associated with the ICU mortality varied between the global model and the models

Table 4 Factors independently associated with ICU mortality in automatic and clustering models (automatic model: generalized linear model [GLM] with important variables according to information value analysis for ICU mortality; A, B and C Phenotypes: GLM models with important variables according to information value analysis for each cluster)
OR Odds ratio, CI Confidence interval, APACHE II Acute Physiology and Chronic Health Evaluation II, SOFA Sequential Organ Failure Assessment, COPD Chronic obstructive pulmonary disease, LDH D-Lactate dehydrogenase, PCT Procalcitonin, GAP-UCI Time in days from symptoms onset and ICU admission, PaO 2 /FiO 2 Partial pressure arterial oxygen/fraction of inspired oxygen, AKI acute kidney injury a Corresponds to categorized variables independently associated with ICU mortality for COVID-19 patients, (1) "Type L" characterized by high compliance and low lung recruitablity and (2) "Type H" with low compliance and high lung recruitability as a two "extremes" of a spectrum of respiratory failure in COVID-19 pneumonia. Despite the importance related to clinical experience in each of these approaches, none of these studies have been developed through a machine learning process to determine phenotypes nor have they been tested for validation. Hypoxemia has been proposed as a marker of severity for the differentiation of phenotypes [22,24]. In our study, the PaO 2 /FiO 2 relationship at ICU admission was an independent risk factor for ICU mortality in overall multivariate analysis (as a continuous or dichotomized variable), but was only closely associated with ICU mortality in phenotype C. Other variables such as advanced age, serum D-dimer values and the development of AKI were variables more strongly related to ICU mortality in all subgroups or phenotypes analyzed than PaO 2 /FiO 2 at ICU admission.
Our results should be interpreted in the context of the study limitations. First, although phenotypes were found to be generalizable in our population, risk factors and characteristics of clinical phenotypes were derived initially from data at ICU admission of multicenter observational study in Spain. However, these risk factors are similar to those that have been reported by other investigators [4][5][6][7][8]. The cross-validation carried out and the high discrimination observed for each of the models built for phenotypes, suggests their applicability to other populations, but it should be examined considering the high variability observed in patients with COVID-19 and in the support measures applied. Second, because missing data were common for some variables included in the clustering models, multiple imputation was used in the primary analysis. However, variables with high missing values were excluded and the missing threshold used was reported elsewhere [18]. Third, only routinely available clinical data at ICU admission were used to identify risk factors and clinical phenotypes, and the inclusion of other data related to clinical evolution of patients in the ICU could change risk factors or phenotype assignments. However, our objective was to study early risk factors and phenotypes at ICU admission that may allow for early treatment implementation and as a result improve patient outcome. Fourth, although IL-6 is an excellent severity biomarker, we have not been able to include this biomarker in the models because more than 50% of patients had no IL-6 determination upon ICU admission. Although the inclusion of IL-6 in models could modify or improve their performance, we do not consider it appropriate to impute a large number of missing data. In addition, if IL-6 is a biomarker not usually available its inclusion in the models would not have practical application. Finally, we did not collect data on ethnicity or socioeconomic factors. These factors may play a role in the prevalence of pre-existing comorbidities and mortality due to COVID-19. Our findings should be interpreted within the context of the study population and its generalizability to other populations warrant further investigation.

Conclusion
To our knowledge this is the largest study that describe different phenotypes of patients with confirmed COVID-19 that were admitted to ICU to date. We not only characterized three novel clinical phenotypes, but extended findings outside of a single site ICU by characterizing the association of comorbidities with clinical phenotype and the association of clinical phenotypes with clinical outcomes. Different risk factors for the global population and clinical phenotypes were observed, possibly due to the heterogeneity of patients, which may limit the application of a single predictive model for all patients with COVID-19. Further research is needed to determine the application of these phenotypes in clinical practice, in other patient's population and for clinical trial design.