Real-world inter-observer variability of the Sequential Organ Failure Assessment (SOFA) score in intensive care medicine: the time has come for an update

© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Critical Care


Dear Editor,
Moreno et al. [1] have elegantly highlighted the recent changes in clinical practice and organ support that may not be captured by the SOFA score in its original form, suggesting the need for an update.
Another reason to move to a new version is that the real-world application of the SOFA score in its current form may lack reproducibility. We have observed in practice that some intensivists strictly follow the original description of the score for calculation, whereas others adopt a more liberal approach, seeking to preserve its original essence in a non-standard fashion. As a result, a patient with acute respiratory failure who is receiving veno-venous extracorporeal membrane oxygenation (VV-ECMO) and has a PaO2/FiO2 of 190 mmHg may be assigned 3 points (strict approach) or 4 points (liberal approach).
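The discrepancy in this example can be sketched in a few lines. The thresholds below follow the respiratory component of the original SOFA score; the `liberal` rule, which treats VV-ECMO as the maximal degree of respiratory failure, is only our assumption about how such an approach might be encoded, and the function and parameter names are hypothetical.

```python
def respiratory_sofa(pf_ratio, on_support, on_vv_ecmo=False, liberal=False):
    """Respiratory SOFA sub-score from the PaO2/FiO2 ratio (mmHg).

    pf_ratio   -- PaO2/FiO2 in mmHg
    on_support -- True if the patient receives respiratory support
    on_vv_ecmo -- True if the patient is on VV-ECMO
    liberal    -- assumption: a non-standard rule that scores any
                  VV-ECMO patient as 4, regardless of PaO2/FiO2
    """
    if liberal and on_vv_ecmo:
        return 4  # hypothetical 'liberal' reading, not in the original score
    if pf_ratio < 100 and on_support:
        return 4
    if pf_ratio < 200 and on_support:
        return 3
    if pf_ratio < 300:
        return 2
    if pf_ratio < 400:
        return 1
    return 0


# The patient described above: VV-ECMO, PaO2/FiO2 of 190 mmHg.
strict_points = respiratory_sofa(190, on_support=True, on_vv_ecmo=True)
liberal_points = respiratory_sofa(190, on_support=True, on_vv_ecmo=True,
                                  liberal=True)
print(strict_points, liberal_points)  # 3 4
```

The same patient data yield two different sub-scores depending only on the rater's convention, which is the source of variability we set out to quantify.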
To test this hypothesis, we performed a retrospective study in the ICU of a university hospital in Spain. The study was approved by the Ethics Committee, with a waiver for informed consent. We obtained a random sample by selecting the patients admitted to the ICU at 9:00 a.m. on the 15th of every odd-numbered month in 2022. We asked two consultants, two senior residents, and two junior residents to rate the SOFA score from the information available in the electronic medical record. We used the two-way random effects intraclass correlation coefficient (ICC) with its 95% confidence interval (CI) to assess the reliability and consistency of the measurements performed by the different raters, and the linearly weighted Cohen's kappa (κ) with its 95% CI to measure inter-rater reliability between clinicians with similar professional experience. ICC and κ values were interpreted as poor (< 0.5), moderate (0.5-0.75), good (0.75-0.9) or excellent (> 0.9) inter-rater agreement [2].
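As an illustration of the pairwise statistic and the interpretation bands described above, the following is a minimal sketch rather than the analysis code used in the study. It computes a linearly weighted Cohen's κ for two raters and maps a coefficient to the categories of agreement. The function names and the toy ratings are hypothetical, confidence intervals are not computed, and the assignment of the boundary values 0.5 and 0.75 to the higher band is our assumption.

```python
from collections import Counter


def linear_weighted_kappa(rater_a, rater_b, categories):
    """Linearly weighted Cohen's kappa for two raters.

    categories must list all possible ordinal scores in order,
    e.g. range(5) for a SOFA organ sub-score (0 to 4).
    """
    n = len(rater_a)
    index = {c: i for i, c in enumerate(categories)}
    observed = Counter(zip(rater_a, rater_b))  # joint counts
    marg_a = Counter(rater_a)                  # marginal counts, rater A
    marg_b = Counter(rater_b)                  # marginal counts, rater B
    obs_dis = 0.0  # observed weighted disagreement
    exp_dis = 0.0  # weighted disagreement expected by chance
    for a in categories:
        for b in categories:
            w = abs(index[a] - index[b])  # linear disagreement weight
            obs_dis += w * observed[(a, b)] / n
            exp_dis += w * marg_a[a] * marg_b[b] / (n * n)
    if exp_dis == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - obs_dis / exp_dis


def interpret_agreement(value):
    """Map an ICC or kappa value to the bands used in this letter."""
    if value < 0.5:
        return "poor"
    if value < 0.75:   # assumption: 0.75 itself counts as 'good'
        return "moderate"
    if value <= 0.9:
        return "good"
    return "excellent"


# Hypothetical sub-scores given by two raters to the same four patients.
kappa = linear_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 4], range(5))
print(round(kappa, 2), interpret_agreement(kappa))
```

With linear weights, a one-point disagreement is penalized half as much as a two-point disagreement, which suits an ordinal scale such as the 0 to 4 organ sub-scores.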
We calculated the SOFA score for 102 patients. The overall ICC (95% CI) of the SOFA score was 0.83 (0.77-0.87). The ICC (95% CI) values for the individual organ systems are shown in Table 1.
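For readers who wish to compute a coefficient of this kind on their own data, a two-way random effects ICC can be derived from the mean squares of a patients-by-raters table. The sketch below assumes the single-measure, absolute-agreement form often labeled ICC(2,1); this choice, the function name and the toy ratings are our assumptions for illustration, and the confidence interval is not computed.

```python
def icc_2_1(ratings):
    """Two-way random effects, single-measure, absolute-agreement ICC.

    ratings -- list of rows, one per patient, each row holding the
               scores given by the same k raters in the same order.
    """
    n = len(ratings)     # number of patients
    k = len(ratings[0])  # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical total SOFA scores: 5 patients rated by 3 clinicians.
toy_ratings = [
    [4, 5, 4],
    [9, 9, 10],
    [2, 2, 3],
    [12, 11, 12],
    [6, 7, 6],
]
print(round(icc_2_1(toy_ratings), 2))
```

Because the absolute-agreement form includes the between-rater mean square in the denominator, a rater who systematically scores higher or lower than colleagues lowers the coefficient.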
In our study, inter-observer agreement of the overall SOFA score was good. We observed excellent inter-observer reliability in the liver and coagulation systems, which can be attributed to the objectivity afforded by the use of laboratory measurements alone. We found good inter-rater agreement in the cardiovascular system, where the small differences detected may be explained by the use of inotropes such as dobutamine or levosimendan, or of mechanical circulatory support, none of which are captured by the original SOFA score. We identified moderate inter-observer agreement in the evaluation of the respiratory and renal systems. The moderate agreement in the respiratory system may reflect the wide range of respiratory support devices available nowadays, which include high-flow oxygen therapy, non-invasive and invasive ventilation, and VV-ECMO. In addition, the original score does not take into account the possibility of using SpO2 when arterial blood gas analysis is not available. The variability detected in the renal system is explained by the use of renal replacement therapy (RRT) or the removal of urinary catheters to reduce the risk of infection. Finally, we observed poor agreement in the assessment of the central nervous system (CNS), which we attribute to the inherent subjectivity in the evaluation of the Glasgow Coma Scale, particularly in patients under sedation or mechanical ventilation. As for the possibility that professional experience may influence the reliability of the SOFA score, our data show that overall agreement between clinicians with the same level of expertise remains only moderate.
Previous studies have described good inter-observer agreement for the overall SOFA score. Arts et al. [3] found an ICC of 0.89 in 2005, with weighted κ coefficients similar to the ones we obtained for the hepatic and coagulation systems (> 0.9) and the circulatory system (0.75-0.9). However, the reported κ coefficients for the renal (0.85), respiratory (0.63), and CNS (0.55) systems were higher than the ones we observed. We consider that the decline in agreement in these organ systems over time can be explained by the wider availability of RRT and respiratory support devices, as well as the implementation of light sedation policies and screening tools for delirium. Moreover, a study by Tallgren et al. [4] pointed out that less than half of the SOFA scores calculated by physicians were accurate in real-world practice. Training, education, and comprehensive guidelines may improve the accuracy of the SOFA score [4,5].
We believe that the SOFA score should be updated to reflect the trends in current clinical practice and organ support, and that this should be complemented by a training program to increase its accuracy and minimize inter-observer variability.

Declarations

Ethics approval and consent to participate
The study was approved by the local ethics committee (Comité de Ética de la Investigación del Área de Salud de Valladolid Oeste), with a waiver for informed consent.

Consent for publication
Not applicable.