Real-world inter-observer variability of the Sequential Organ Failure Assessment (SOFA) score in intensive care medicine: the time has come for an update

A Correspondence to this article was published on 11 May 2023

The Original Article was published on 13 January 2023

Dear Editor,

Moreno et al. [1] have elegantly highlighted the recent changes in clinical practice and organ support that may not be captured by the SOFA score in its original form, suggesting the need for an update.

Another reason to move to a new version is that the real-world application of the SOFA score in its current form may lack reproducibility. We have observed in practice that some intensivists strictly follow the original description of the score for calculation, whereas others adopt a more liberal approach, seeking to preserve its original essence in a non-standard fashion. As a result, a patient with acute respiratory failure who is receiving veno-venous extracorporeal membrane oxygenation (VV-ECMO) and has a PaO2/FiO2 of 190 mmHg may be assigned 3 points (strict approach) or 4 points (liberal approach).
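The discrepancy described above can be illustrated with a minimal Python sketch. The strict function follows the PaO2/FiO2 thresholds of the original SOFA respiratory component. The liberal function is only an assumption about how a rater might reason (treating VV-ECMO as maximal respiratory failure); it is not a rule stated in this letter or in the original score, and the function and parameter names are hypothetical.

```python
def respiratory_sofa_strict(pf_ratio: float, respiratory_support: bool) -> int:
    """Respiratory SOFA points from PaO2/FiO2 (mmHg), original thresholds."""
    if pf_ratio >= 400:
        return 0
    if pf_ratio >= 300:
        return 1
    # 3 and 4 points require respiratory support in the original score;
    # without it, a ratio below 300 stays at 2 points.
    if pf_ratio >= 200 or not respiratory_support:
        return 2
    if pf_ratio >= 100:
        return 3
    return 4


def respiratory_sofa_liberal(pf_ratio: float, respiratory_support: bool,
                             on_vv_ecmo: bool) -> int:
    """Hypothetical 'liberal' reading: VV-ECMO is scored as 4 points."""
    if on_vv_ecmo:  # assumption, not part of the original SOFA definition
        return 4
    return respiratory_sofa_strict(pf_ratio, respiratory_support)


# The patient from the text: PaO2/FiO2 of 190 mmHg on VV-ECMO.
strict_points = respiratory_sofa_strict(190, respiratory_support=True)
liberal_points = respiratory_sofa_liberal(190, respiratory_support=True,
                                          on_vv_ecmo=True)
print(strict_points, liberal_points)
```

Two raters applying these two readings to the same chart would differ by one point on the respiratory component alone.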

To test this hypothesis, we performed a retrospective study in the ICU of a university hospital in Spain. The study was approved by the Ethics Committee, with a waiver for informed consent. We obtained a random sample by selecting the patients admitted to the ICU at 9:00 a.m. on the 15th of all odd months in 2022. We asked two consultants, two senior residents and two junior residents to rate the SOFA score from the information available in the electronic medical record. We used the two-way random effects intraclass correlation coefficient (ICC) and its 95% confidence interval (CI) to assess the reliability and consistency of the measurements performed by the different raters. We used the linearly weighted Cohen’s Kappa (κ) and its 95% CI to measure the inter-rater reliability between clinicians with similar professional experience. The ICC or κ values were interpreted as poor (< 0.5), moderate (0.5–0.75), good (0.75–0.9) or excellent (> 0.9) inter-rater agreement [2].
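For readers unfamiliar with the two statistics named above, the following Python sketch shows one way they could be computed. It rests on assumptions: the letter does not state whether the ICC was single-measure or average-measure, nor whether it was defined for absolute agreement or consistency, so the sketch uses the single-measure, absolute-agreement form (often written ICC(2,1)). Confidence intervals are not computed, and the function names are illustrative rather than the authors' code.

```python
import numpy as np


def icc2_1(ratings) -> float:
    """Two-way random effects, absolute agreement, single-rater ICC.

    `ratings` is an (n_patients, n_raters) array with no missing values.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between patients
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between raters
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def linear_weighted_kappa(rater_a, rater_b, n_categories: int) -> float:
    """Cohen's kappa with linear weights for two raters.

    Ratings are integers in 0..n_categories-1 (e.g. 0..4 for one organ system).
    Undefined (division by zero) if both raters use a single category only.
    """
    observed = np.zeros((n_categories, n_categories))
    for i, j in zip(rater_a, rater_b):
        observed[i, j] += 1
    observed /= observed.sum()
    # Agreement expected by chance from the two raters' marginal distributions.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    idx = np.arange(n_categories)
    # Linear disagreement weights: 0 on the diagonal, 1 at maximal distance.
    weights = np.abs(idx[:, None] - idx[None, :]) / (n_categories - 1)
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()


# Toy data (invented for illustration): 5 patients scored 0-4 by two raters.
a = [0, 1, 2, 3, 4]
b = [0, 1, 2, 4, 4]
print(round(linear_weighted_kappa(a, b, n_categories=5), 3))
print(round(icc2_1(np.column_stack([a, b])), 3))
```

Linear weights penalise a one-point disagreement less than a four-point disagreement, which suits an ordinal 0 to 4 organ sub-score better than unweighted κ.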

We calculated the SOFA score for 102 patients. The overall ICC (95% CI) of the SOFA score was 0.83 (0.77–0.87). We found the following ICC (95% CI) for the different organ systems: central nervous system (CNS) 0.42 (0.32–0.53), renal 0.62 (0.54–0.70), respiratory 0.65 (0.57–0.72), cardiovascular 0.84 (0.80–0.88), coagulation 0.93 (0.91–0.95), and liver 0.94 (0.92–0.96). The inter-observer agreement according to the degree of professional experience is summarized in Table 1.
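As a quick check, the reported ICC point estimates can be mapped onto the interpretation bands quoted from reference [2]. The sketch below is only illustrative; how values falling exactly on a boundary (0.5, 0.75, 0.9) are classified is an assumption, since the bands as quoted overlap at their endpoints.

```python
def interpret_agreement(value: float) -> str:
    """Map an ICC or kappa value to the bands used in this letter."""
    if value < 0.5:
        return "poor"
    if value < 0.75:  # boundary handling is an assumption
        return "moderate"
    if value <= 0.9:
        return "good"
    return "excellent"


# ICC point estimates reported in the text above.
reported_icc = {
    "overall": 0.83,
    "central nervous system": 0.42,
    "renal": 0.62,
    "respiratory": 0.65,
    "cardiovascular": 0.84,
    "coagulation": 0.93,
    "liver": 0.94,
}

labels = {name: interpret_agreement(v) for name, v in reported_icc.items()}
for name, label in labels.items():
    print(f"{name}: {reported_icc[name]:.2f} ({label})")
```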

Table 1 Inter-rater reliability of the SOFA score between clinicians with similar professional experience, expressed as linearly weighted Cohen's Kappa (κ) and the 95% confidence interval (CI)

In our study, inter-observer agreement for the overall SOFA score was good. We observed excellent inter-observer reliability in the liver and coagulation systems, which can be attributed to the objectivity afforded by the use of laboratory measurements alone. We found good inter-rater agreement in the cardiovascular system, where the small differences detected may be explained by the use of inotropes like dobutamine or levosimendan, or mechanical circulatory support, which are not captured by the original SOFA score. We identified moderate inter-observer agreement in the evaluation of the respiratory and renal systems. The moderate agreement in the respiratory system may reflect the wide range of respiratory support devices now available, which include high-flow oxygen therapy, non-invasive and invasive ventilation, and VV-ECMO. In addition, the original score does not take into account the possibility of using SpO2 when arterial blood gas analysis is not available. The variability detected in the renal system is explained by the use of renal replacement therapy (RRT) or the removal of urinary catheters to reduce the risk of infection. Finally, we observed poor agreement in the assessment of the CNS, which we attribute to the inherent subjectivity in the evaluation of the Glasgow Coma Scale, particularly in patients under sedation or mechanical ventilation. Regarding the possibility that professional experience may influence the reliability of the SOFA score, our data show that overall agreement between clinicians with the same level of expertise remains only moderate.

Previous studies have described good inter-observer agreement for the overall SOFA score. Arts et al. [3] found an ICC of 0.89 in 2005, with weighted κ coefficients similar to the ones we obtained for the hepatic and coagulation systems (> 0.9) and the circulatory system (0.75–0.9). However, the reported κ coefficients for the renal (0.85), respiratory (0.63), and CNS (0.55) systems were higher than the ones we observed. We consider that the decline in agreement for these organ systems over time can be explained by the wider availability of RRT and respiratory support devices, as well as the implementation of light sedation policies and screening tools for delirium. Moreover, another study by Tallgren et al. [4] reported that fewer than half of the SOFA scores calculated by physicians were accurate in real-world practice. Training, education, and comprehensive guidelines may improve the accuracy of the SOFA score [4, 5].

We believe that the SOFA score should be updated to reflect the trends in current clinical practice and organ support, and that this should be complemented by a training program to increase its accuracy and minimize inter-observer variability.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.



Abbreviations

SOFA: Sequential Organ Failure Assessment

VV-ECMO: Veno-venous extracorporeal membrane oxygenation

PaO2: Partial pressure of oxygen

FiO2: Fraction of inspired oxygen

ICU: Intensive care unit

ICC: Intraclass correlation coefficient

CI: Confidence interval

κ: Cohen’s Kappa

CNS: Central nervous system

SpO2: Peripheral oxygen saturation

RRT: Renal replacement therapy


References

  1. Moreno R, Rhodes A, Piquilloud L, Hernandez G, Takala J, Gershengorn HB, et al. The Sequential Organ Failure Assessment (SOFA) Score: has the time come for an update? Crit Care. 2023;27(1):15.

  2. Koo TK, Li MY. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med. 2016;15(2):155–63.

  3. Arts DGT, De Keizer NF, Vroom MB, De Jonge E. Reliability and accuracy of Sequential Organ Failure Assessment (SOFA) scoring. Crit Care Med. 2005;33(9):1988–93.

  4. Tallgren M, Bäcklund M, Hynninen M. Accuracy of Sequential Organ Failure Assessment (SOFA) scoring in clinical practice. Acta Anaesthesiol Scand. 2009;53(1):39–45.

  5. Lambden S, Laterre PF, Levy MM, Francois B. The SOFA score-development, utility and challenges of accurate assessment in clinical trials. Crit Care. 2019;23(1):374.



Acknowledgements

Not applicable.


Funding

The authors conducted this research without any funding.

Author information



DPT designed the study. PAMG, ICP, CCR, GJPP, CCC and TGTE evaluated the SOFA score in the patient sample. DPT and PAMG analyzed the data. DPT, PAMG, ICP, CCR and CDR critically interpreted the data. All authors read and approved the final text.

Corresponding author

Correspondence to David Pérez-Torres.

Ethics declarations

Ethics approval and consent to participate

The study was approved by the local ethics committee (Comité de Ética de la Investigación del Área de Salud de Valladolid Oeste), with a waiver for informed consent.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Pérez-Torres, D., Merino-García, P.A., Canas-Pérez, I. et al. Real-world inter-observer variability of the Sequential Organ Failure Assessment (SOFA) score in intensive care medicine: the time has come for an update. Crit Care 27, 160 (2023).
