A systematic review and meta-analysis of the clinimetric properties of the core outcome measurement instruments for clinical effectiveness trials of nutritional and metabolic interventions in critical illness (CONCISE)
Critical Care volume 27, Article number: 450 (2023)
CONCISE is an internationally agreed minimum set of outcomes for use in nutritional and metabolic clinical research in critically ill adults. Clinicians and researchers need to be aware of the clinimetric properties of these instruments and understand any limitations to ensure valid and reliable research. This systematic review and meta-analysis were undertaken to evaluate the clinimetric properties of the measurement instruments identified in CONCISE.
Four electronic databases were searched from inception to December 2022 (MEDLINE via Ovid, EMBASE via Ovid, CINAHL via Healthcare Databases Advanced Search, CENTRAL via Cochrane). Studies were included if they examined at least one clinimetric property of a CONCISE measurement instrument or recognised variation in adults ≥ 18 years with critical illness or recovering from critical illness in any language. The COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist for systematic reviews of Patient-Reported Outcome Measures was used. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses were used in line with COSMIN guidance. The COSMIN checklist was used to evaluate the risk of bias and the quality of clinimetric properties. Overall certainty of the evidence was rated using a modified Grading of Recommendations, Assessment, Development and Evaluation approach. Narrative synthesis was performed and where possible, meta-analysis was conducted.
A total of 4316 studies were screened. Forty-seven were included in the review, reporting data for 12,308 participants. The Short Form-36 Questionnaire (Physical Component Score and Physical Functioning), sit-to-stand test, 6-min walk test and Barthel Index had the strongest clinimetric properties and certainty of evidence. The Short Physical Performance Battery, Katz Index and handgrip strength had less favourable results. There was limited data for the Lawton Instrumental Activities of Daily Living scale and the Global Leadership Initiative on Malnutrition criteria. The risk of bias ranged from inadequate to very good. The certainty of the evidence ranged from very low to high.
Variable evidence exists to support the clinimetric properties of the CONCISE measurement instruments. We suggest using this review alongside CONCISE to guide outcome selection for future trials of nutrition and metabolic interventions in critical illness.
Trial registration: PROSPERO (CRD42023438187). Registered 21/06/2023.
Functional decline and disability affect many survivors of critical illness and can be long-lasting [1]. Post-intensive care syndrome comprises physical, cognitive, and mental health impairments, which can result in adverse socioeconomic consequences and are recognised by patients, clinicians, and public sector organisations as a major public health issue [2, 3]. Muscle wasting occurs rapidly in critical illness and is the result of decreased protein synthesis, bioenergetic failure, and intramuscular inflammation [4,5,6]. Nutritional and metabolic interventions may be able to reverse these pathological changes, improving patient outcomes [7]. The variation in outcomes collected makes comparison between trials challenging, limiting future systematic reviews and meta-analyses [8, 9].
A methodological approach to address this issue is the creation of a Core Outcome Set (COS). This approach does not prevent researchers from evaluating additional outcomes; rather, it provides a minimum standard, ensuring that essential outcomes within a research area are consistently assessed using the same measurement instruments. Core outcome measures for clinical effectiveness of nutritional and metabolic interventions in critical illness (CONCISE) is an internationally agreed set of outcomes and measurement instruments for use at 30 and 90 days post enrolment, in nutritional and metabolic clinical research in critically ill adults [10]. The development of CONCISE involved a systematic review identifying outcome measures used in critical care nutrition trials and their clinimetric properties followed by a consensus process. The following measurement instruments were recommended: Short Form-36 Physical Component Score (SF-36 PCS) [11], 30 s sit-to-stand (30STS) [12], 6-min walk test (6MWT) [13], Short Physical Performance Battery (SPPB) [14], Barthel Index [15], Katz Index [16], Lawton Instrumental Activities of Daily Living (IADL) [17], Global Leadership Initiative on Malnutrition criteria (GLIM) [18] and handgrip strength (HGS) [19].
Clinicians and researchers using the measurement instruments recommended by CONCISE need to be aware of the clinimetric properties of these measurement instruments, to ensure valid and reliable research. Clinimetric or measurement properties refer to the quality of the measurement tool and the quality of its performance [20]. This systematic review and meta-analysis aimed to summarise and evaluate the clinimetric properties of the measurement instruments recommended in CONCISE.
The review was registered on PROSPERO (CRD42023438187) on 21st June 2023. This study followed the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) methodology for systematic reviews of Patient-Reported Outcome Measures (PROMs) [21]. This is reported in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Additional file 1: Table S1), as recommended by the COSMIN guidelines as we await the combined PRISMA-COSMIN guideline.
Search strategy and selection criteria
A search strategy was designed based on the search filter for finding studies on clinimetric properties, developed by Terwee et al. . The search strategy is outlined in the Additional file 1. Four electronic databases (MEDLINE via Ovid, EMBASE via Ovid, CINAHL via Healthcare Databases Advanced Search, and CENTRAL via Cochrane) were searched. Databases were searched from inception to December 2022. Studies identified in the preliminary systematic review process for CONCISE were added [8, 10]. Reference lists were manually searched to screen for eligible studies and relevant review articles. No limits for language, date or geographical region were used. Citations were imported to the web-based collaboration software platform, Covidence .
Inclusion and exclusion criteria
Inclusion and exclusion criteria were established prior to screening. Studies were included if they examined at least one clinimetric property of a CONCISE measurement instrument in adults ≥ 18 years with critical illness or recovering from critical illness in any language. To ensure completeness, we also included studies examining the clinimetric properties of variations or components of CONCISE measurement instruments, including the Short Form-36 Physical Functioning (SF-36 PF), five times STS (5xSTS) and SPPB 4 m gait speed. We included systematic reviews and pooled analyses where they provided new data. Unpublished studies, preprints, and conference abstracts without subsequent study publication were excluded.
Two authors (TD, EK) screened each title and abstract independently to determine eligibility for inclusion. Disagreements were resolved through discussion with a third reviewer (ZP). Full texts were assessed by both authors against the predetermined inclusion and exclusion criteria. Data extraction was completed by two authors (TD, EK) independently using standardised extraction forms. Data extraction included publication details (e.g., title, year, journal), patient characteristics (e.g. age, sex, severity and duration of illness), details of measurement setting (e.g., type of intensive care unit (ICU), timeframe) and the predetermined clinimetric properties of the measurement instrument. Authors were contacted for missing demographic data. Clinimetric properties extracted were based on the COSMIN guidelines and are described in Table 1. Data included structural validity (factor analysis results on dimensionality), internal consistency (Cronbach’s alpha), reliability (intraclass correlations), measurement error (standard error of measurement (SEM), smallest detectable change (SDC) and minimal important change (MIC)), construct validity (convergent validity—correlation of CONCISE instruments with comparator measures (Additional file 1: Table S2), divergent validity—correlation of CONCISE instruments with dissimilar measures (Additional file 1: Table S2); and known-groups validity—comparison of CONCISE instrument scores between two subgroups using relative effect sizes or area under the curve (AUC)), responsiveness to change (mean differences, median differences, AUC or relative effect sizes), predictive validity (correlation, odds ratio, AUC or regression coefficient) and interpretability (floor and ceiling effects). 
Content validity (as per step 5 of COSMIN guidelines)  was not evaluated as the aim of this review was to present and evaluate the clinimetric properties of the measurement instruments which had reached consensus through rigorous methodology in CONCISE, and not to formulate additional recommendations about the use of specific outcome measurement instruments.
Assessment of risk of bias and certainty of the evidence
Two independent reviewers (TD, EK) used the COSMIN checklist to evaluate the risk of bias of clinimetric properties, blinded to each other's ratings. Disagreements were resolved by discussion with a third reviewer (ZP). Based on the risk of bias assessment, studies were rated as either very good, adequate, doubtful, or inadequate. Following this, each clinimetric property result was rated against the criteria for good measurement (clinimetric) properties (Table 1). Each result was rated as sufficient (+), insufficient (−) or indeterminate (?). Predictive validity was not rated as this is not included in the COSMIN checklist. Specific hypotheses were developed for construct validity and responsiveness (Additional file 1: Table S3). Construct validity and responsiveness were considered sufficient (+) if ≥ 75% of the hypotheses were met, or insufficient (−) if ≥ 75% of the hypotheses were not met; otherwise they were considered inconsistent (±). All results for each clinimetric property were qualitatively summarised and, where appropriate, quantitatively pooled, and this summarised result was evaluated against the criteria for good measurement (clinimetric) properties to obtain an overall rating. Finally, the evidence was graded using the modified Grading of Recommendations, Assessment, Development and Evaluation (GRADE) approach. GRADE was adopted and modified as per COSMIN guidelines to rate four of the five GRADE factors (risk of bias, inconsistency, imprecision, and indirectness). Disagreements were resolved by discussion with a third reviewer (ZP).
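The 75% decision rule for hypothesis testing described above can be illustrated with a short sketch. The function name and its arguments are hypothetical and are not taken from COSMIN or from the review; returning "?" when no hypotheses were tested is an assumption made here for completeness.

```python
def rate_hypothesis_testing(n_met: int, n_total: int) -> str:
    """Rate construct validity or responsiveness from hypothesis counts.

    Returns "+" (sufficient), "-" (insufficient) or "±" (inconsistent),
    following the 75% thresholds described in the text. Returning "?"
    when no hypotheses were tested is an assumption of this sketch.
    """
    if n_met < 0 or n_total < 0 or n_met > n_total:
        raise ValueError("counts must satisfy 0 <= n_met <= n_total")
    if n_total == 0:
        return "?"  # assumption: nothing to rate
    share_met = n_met / n_total
    share_not_met = (n_total - n_met) / n_total
    if share_met >= 0.75:
        return "+"
    if share_not_met >= 0.75:
        return "-"
    return "±"


if __name__ == "__main__":
    # Illustrative counts only, not data from the review.
    for met, total in [(9, 10), (2, 10), (5, 10)]:
        print(met, total, rate_hypothesis_testing(met, total))
```

With these illustrative counts, nine of ten hypotheses met is rated sufficient, two of ten is rated insufficient, and five of ten is rated inconsistent.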
For reliability, where there were three or more studies, we calculated pooled intraclass correlation coefficients (ICCs) and 95% confidence intervals using a standard generic inverse variance random effects model. ICC values were combined based on estimates derived from a Fisher transformation, z = 0.5 × ln((1 + ICC)/(1 − ICC)), which has an approximate variance Var(z) = 1/(N − 3), where N is the sample size. Between-study heterogeneity was evaluated using the I² statistic. Where meta-analysis was not appropriate, we calculated weighted means (weighted by the number of participants included per study) and weighted standard deviations. Where it was not possible to pool results statistically, results were descriptively summarised. Meta-analysis of data was performed using the statistical software package Review Manager 5.4 (RevMan 5.4.1). Where effect sizes were missing and studies provided sufficient data, Cohen's d was computed as the effect size to assess responsiveness. In cases where the data did not allow for Cohen's d calculation, standardised response mean (SRM) was used as an alternative effect size measure.
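As an illustration of the pooling step, the sketch below combines ICCs on the Fisher z scale with inverse variance weights and back-transforms the pooled estimate. It assumes a DerSimonian and Laird estimate of between-study variance, which is one common way of fitting a generic inverse variance random effects model; the analysis reported here was run in RevMan, and this code is not that implementation. The function name `pool_icc` and the example inputs are hypothetical.

```python
import math


def pool_icc(iccs, ns):
    """Random-effects pooling of ICCs on the Fisher z scale.

    iccs: ICC estimates, each strictly between -1 and 1.
    ns:   matching sample sizes, each greater than 3.
    Assumes a DerSimonian-Laird between-study variance (tau^2).
    """
    if len(iccs) != len(ns) or len(iccs) < 2:
        raise ValueError("need at least two studies with matching sizes")
    if any(abs(r) >= 1 for r in iccs) or any(n <= 3 for n in ns):
        raise ValueError("require |ICC| < 1 and N > 3")

    # Fisher transformation and its approximate variance, 1 / (N - 3).
    z = [0.5 * math.log((1 + r) / (1 - r)) for r in iccs]
    v = [1.0 / (n - 3) for n in ns]

    # Fixed-effect weights, used to compute Cochran's Q.
    w = [1.0 / vi for vi in v]
    z_fixed = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, z))
    df = len(z) - 1

    # DerSimonian-Laird tau^2, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights and pooled z with a 95% confidence interval.
    w_re = [1.0 / (vi + tau2) for vi in v]
    z_re = sum(wi * zi for wi, zi in zip(w_re, z)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    lower, upper = z_re - 1.96 * se, z_re + 1.96 * se

    # I^2 as the percentage of variability beyond chance.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # tanh is the inverse of the Fisher transformation.
    return {
        "icc": math.tanh(z_re),
        "ci": (math.tanh(lower), math.tanh(upper)),
        "i2": i2,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Illustrative inputs only, not data from the included studies.
    result = pool_icc([0.82, 0.90, 0.86], [40, 55, 32])
    print(result)
```

Because the confidence interval is built on the z scale and then back-transformed, it is asymmetric around the pooled ICC and always stays within the valid range.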
The search identified 4316 studies. Forty-seven were included in the review, reporting data for 12,308 participants. PRISMA flow diagram is outlined in Fig. 1. All included articles were in English. Table 2 outlines the characteristics of the included studies.
Risk of bias
The COSMIN risk of bias rating varied from inadequate to very good. Ratings for individual studies are provided in Additional file 1: Table S4. Multiple studies tested more than one measurement property (n = 15). The breakdown of studies reporting clinimetric properties was as follows: structural validity (n = 0), internal consistency (n = 4), reliability (n = 10), measurement error (n = 9), hypothesis testing for construct validity (n = 25) and responsiveness (n = 12). Certainty of evidence was rated using the GRADE approach . Ratings ranged from very low to high. GRADE ratings are outlined in Additional file 1: Table S5.
Short Form-36 Physical Function (SF-36 PF)
Eleven studies reported data for the SF-36 PF [28,29,30,31,32,33,34,35,36,37,38]. The SF-36 PF had excellent internal consistency (pooled Cronbach’s α 0.94) supported by a high certainty of evidence but was rated indeterminate due to no information on its structural validity. It had sufficient test–retest reliability (Pooled ICC 0.86) supported by a low certainty of evidence [32, 33, 35]. There was a moderate to high certainty of evidence supporting sufficient construct validity and responsiveness [29,30,31, 35,36,37,38,39]. No studies tested measurement error. Floor effects post ICU discharge ranged from 6 to 32% and ceiling effects post ICU discharge ranged from 9 to 38% (Additional file 1: Table S7 and Fig. 3) [34, 35, 37]. The SF-36 PF score at 1 month post ICU discharge was not predictive of 1 year mortality or 6 month readmissions . There was no data on the association with length of stay.
Short Form-36 Physical Component Score (SF-36 PCS)
Nine studies reported data for the SF-36 PCS [29, 33, 37,38,39,40,41,42,43]. No studies tested internal consistency or reliability. There was a moderate to high certainty of evidence supporting sufficient construct validity and responsiveness [33, 37,38,39,40,41,42,43]. The MIC of the SF-36 PCS was 6.5 but measurement error was rated indeterminate due to no calculation of SDC . A floor effect of 3% was seen at 6 months post ICU discharge (Additional file 1: Table S7 and Fig. 3) . The SF-36 PCS score at 1 month post discharge was not predictive of 1 year mortality or 6 month readmissions . There was no data on the association with length of stay.
Two studies reported data for the 30STS [44, 45] and three studies for the 5xSTS [39, 46, 47]. When pooled together, there was a very low certainty of evidence supporting excellent test–retest reliability (ICC 0.99) and inter-rater reliability (Pooled ICC 0.95) [44, 46, 47]. Sufficient construct validity was supported by a high certainty of evidence [39, 47] and one study demonstrated sufficient responsiveness with a low certainty of evidence . Measurement error was indeterminate due to no calculation of MIC but the SEM of the 30STS ranged from 0.51 to 1.51 repetitions and the SDC ranged from 1.19 to 4.45 repetitions [29, 35]. No floor or ceiling effects were seen at hospital discharge . A floor effect of 15% was seen at ICU discharge when using the 30STS and 35% at 3 months post discharge when using the 5xSTS (Additional file 1: Table S7 and Fig. 3) [39, 45]. STS performance at ICU discharge was predictive of hospital length of stay . There was no data on the association with mortality or hospital readmissions.
6-min walk test (6MWT)
Nine studies reported data for the 6MWT [13, 28, 30, 31, 36, 38, 39, 48]. No studies in our review tested the reliability of the 6MWT. Sufficient construct validity and responsiveness were supported by a high certainty of evidence [13, 28, 30, 31, 39, 40]. Measurement error was rated as insufficient with a high certainty of evidence as the range for MIC was estimated to be 14–30 m by anchor-based methods, which was lower than the SDC of 21–34 m. A floor effect of 40% was seen at hospital discharge and 4% at 3 months post ICU discharge (Additional file 1: Table S7 and Fig. 3) [38, 39]. 6MWT performance at 3 and 6 months post ICU discharge can predict 1 year mortality and hospital readmissions [6, 12]. There was no data on the association with length of stay.
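The measurement error ratings in these results follow from comparing the SDC with the MIC: a rating is indeterminate when either quantity is missing, and insufficient when the MIC is smaller than the SDC. A minimal sketch of that comparison is given below. The function name is hypothetical, and the handling of ranges (sufficient only if the largest SDC is below the smallest MIC) is a conservative assumption of this sketch rather than a rule stated in the review.

```python
from typing import Optional, Tuple

Range = Tuple[float, float]  # (lowest, highest) reported value


def rate_measurement_error(sdc: Optional[Range], mic: Optional[Range]) -> str:
    """Rate measurement error as "+", "-" or "?".

    "?" if the SDC or the MIC was not reported.
    "+" only if every reported SDC is below every reported MIC
    (assumption of this sketch for handling ranges).
    "-" otherwise.
    """
    if sdc is None or mic is None:
        return "?"
    return "+" if max(sdc) < min(mic) else "-"


if __name__ == "__main__":
    # 6MWT values from the text: SDC 21-34 m, MIC 14-30 m.
    print(rate_measurement_error(sdc=(21, 34), mic=(14, 30)))
    # SPPB values from the text: SDC 1.3-1.5 points, MIC not calculated.
    print(rate_measurement_error(sdc=(1.3, 1.5), mic=None))
```

Under this assumption the 6MWT is rated insufficient and the SPPB indeterminate, in line with the ratings reported in the text.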
Short Physical Performance Battery (SPPB)
Two studies reported data for the SPPB [29, 49]. No studies in our review tested the reliability of the SPPB. Sufficient construct validity supported by a low certainty of evidence was demonstrated in one study . Responsiveness to change was insufficient from awakening to ICU discharge (ES 0.33) with a very low certainty of evidence . Measurement error was indeterminate due to no calculation of MIC. The reported range of SDC was 1.3–1.5 points . The SPPB had a significant floor effect of 83% at awakening and 57% at ICU discharge (Additional file 1: Table S7 and Fig. 3) . SPPB performance at 1 month post ICU discharge was not predictive of 1 year mortality or 6 month readmissions . There was no data on the association with length of stay.
Short Physical Performance Battery (SPPB)—4 m gait speed
Five studies reported data on the SPPB 4 m gait speed [30, 31, 36, 40, 50]. Excellent test–retest reliability of the SPPB 4 m gait speed was supported by a low certainty of evidence (ICC range 0.89–0.99) . Sufficient construct validity was supported by a high certainty of evidence and responsiveness was indeterminate [30, 31, 36, 40, 50]. Measurement error was rated insufficient with a high certainty of evidence as the range for MIC was estimated to be 0.13–0.14 m/s by anchor-based methods which was lower than the SDC of 0.06 m/s . No studies tested interpretability. SPPB 4 m gait speed performed at 6 months was predictive of hospital readmissions between 6 to 12 months . There was no data on the association with mortality or length of stay.
Activities of daily living
Four studies reported data for the Barthel Index [51,52,53,54]. It showed sufficient inter-rater reliability (ICC 0.98) and good internal consistency (Cronbach’s α 0.81) supported by a low certainty of evidence but was rated indeterminate for internal consistency due to no information on structural validity . Sufficient construct validity was supported by a high certainty of evidence [52, 54]. Sufficient responsiveness was demonstrated in a single study with a very low certainty of evidence . Measurement error was rated as indeterminate due to no calculation of MIC. A floor effect of 11% and a ceiling effect of 1% were seen at ICU discharge with an SEM of 7.2 points and an SDC of 20 points (Additional file 1: Table S7 and Fig. 3) . There was no data on the association with mortality, hospital readmissions, or length of stay.
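The SEM and SDC reported for the Barthel Index are linked by a conventional relation, SDC = 1.96 × √2 × SEM, where the factor √2 reflects the error in two measurements and 1.96 gives a 95% level. The text does not state how the underlying study derived its SDC, so the sketch below is only a consistency check under that assumption; the function name is hypothetical.

```python
import math


def sdc_from_sem(sem: float, z: float = 1.96) -> float:
    """Smallest detectable change from the standard error of measurement.

    Uses the conventional relation SDC = z * sqrt(2) * SEM, with z = 1.96
    for a 95% level. This relation is an assumption of the sketch; the
    source does not state how the reported SDC values were calculated.
    """
    if sem < 0:
        raise ValueError("SEM must be non-negative")
    return z * math.sqrt(2) * sem


if __name__ == "__main__":
    # Barthel Index SEM reported in the text (7.2 points).
    print(f"Barthel Index SDC: {sdc_from_sem(7.2):.1f} points")
```

Under this assumption, an SEM of 7.2 points corresponds to an SDC of about 20 points, matching the reported value to rounding. In practical terms, a change in an individual patient's Barthel Index score smaller than this cannot be distinguished from measurement error.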
Eight studies reported data for the Katz Index [40, 55,56,57,58,59,60,61]. No studies in our review examined the Katz Index in terms of internal consistency, reliability, measurement error and interpretability. Construct validity was rated insufficient with a high certainty of evidence [40, 57, 60, 61]. Responsiveness was sufficient in a single study with a very low certainty of evidence . The Katz index score on ICU admission was predictive of short term (in-hospital to 90 days) mortality but there was no data on the association with longer term mortality, hospital readmissions or length of stay [55, 56, 59, 62].
Instrumental Activities of Daily Living (Lawton IADL)
Four studies provided data on the Lawton IADL [40, 53, 56, 63]. No studies in our review examined the IADL in terms of internal consistency, reliability, responsiveness, measurement error and interpretability. Sufficient construct validity was supported by a moderate certainty of evidence. The IADL at ICU admission was predictive of long term mortality but there were conflicting results regarding shorter term mortality and it was not predictive of hospital length of stay [53, 56, 63]. When performed at 6 months, it was not predictive of hospital readmissions between 6 and 12 months.
Handgrip strength (HGS)
Fifteen studies reported data on HGS [29, 36, 40, 47, 52, 54, 64,65,66,67,68,69,70,71]. There was excellent inter-rater reliability (Pooled ICC 0.95) and good test–retest reliability (Pooled ICC 0.89) supported by a very low to low certainty of evidence [65, 68]. Construct validity was inconsistent and no studies tested responsiveness [31, 36, 40, 47, 52, 54, 64, 69, 71, 72]. Measurement error was indeterminate due to no calculation of MIC. The SEM ranged from 2.8 to 4.5 kg and the SDC from 7.8 to 12.5 kg. Significant floor effects were seen during ICU admission ranging from 26 to 55% (Additional file 1: Table S7 and Fig. 3) [64, 69, 71]. Handgrip strength performed well in the diagnosis of ICU-acquired weakness with high sensitivity and specificity. Handgrip strength during ICU admission was not predictive of in-hospital mortality, hospital length of stay or ICU length of stay [69,70,71]. When performed at 1 month and 6 months post ICU discharge, handgrip strength was not predictive of 1 year mortality or hospital readmissions [29, 40].
Global Leadership Initiative on Malnutrition Criteria (GLIM)
Two studies reported data for the GLIM [73, 74]. No studies in our review examined the GLIM in terms of reliability, responsiveness, measurement error and interpretability. There was a high certainty of evidence supporting sufficient construct validity. Two studies validated the GLIM against the Subjective Global Assessment (SGA) demonstrating a high level of precision (AUC 0.85–0.93) and agreement (Kappa 0.85) [48, 49]. The GLIM at ICU admission was predictive of ICU mortality and hospital length of stay . There was no data on its association with longer term mortality and hospital readmissions.
This systematic review and meta-analysis evaluated the clinimetric properties of the measurement instruments recommended in CONCISE [10]. The SF-36 PCS, SF-36 PF, STS, 6MWT and Barthel Index had the strongest clinimetric properties and certainty of evidence. The SPPB, Katz Index and handgrip strength had less favourable results. There was limited available data for the IADL and GLIM.
The CONCISE measurement instruments are established and considered feasible to use during critical illness and its recovery. Our review highlighted differences between the instruments in the strength of clinimetric properties and performance at different time points. The ability to stand from sitting unaided is increasingly recognised by patients as playing a fundamental role in activities of daily living [75,76,77], and our data show the STS to be an attractive functional independence test with minimal floor effects at ICU and hospital discharge when the repetition-based 30STS is used. Our data also support previous findings regarding the 6MWT being a well-defined test for use in critical care nutrition research, post ICU discharge [13, 30]. ICU survivors experience profound disability, with previous work demonstrating that only 40% could ambulate at 7 days after ICU discharge. As a result, more complex outcome measures including the 6MWT, SPPB and the Physical Function in ICU Test (PFIT-S) are plagued by floor effects at ICU or hospital discharge as demonstrated in our data [13, 38, 79]. The properties of the SPPB in critically ill patients are poorly defined, with a significant floor effect at ICU discharge. Interestingly, the 4 m gait speed test, a component of the SPPB, had robust clinimetric properties post hospital discharge, suggesting its role may be best utilised later in the recovery period.
The SF-36 and its PCS are widely reported in critical care rehabilitation trials with well-established clinimetric properties. While our data support excellent construct validity and responsiveness of the SF-36 PCS with no significant floor or ceiling effects, we found no data describing its internal consistency or reliability. The closely related SF-36 PF domain had excellent internal consistency and reliability, but patients with good recovery trajectories have significant ceiling effects, unlike those with persistent impairment where significant floor effects are seen.
Measurement of activities of daily living was deemed essential in the CONCISE Delphi process. Our data suggest the Barthel Index has the current best clinimetric properties with more limited evidence for the Katz Index and IADL. Handgrip strength had excellent inter-rater reliability but studies with a larger sample size are needed to improve the certainty of evidence to allow generalisability in trials of critical illness and there are significant floor effects when used during ICU admission.
The GLIM criteria are a diagnostic tool for malnutrition rather than a patient-reported or performance-based measurement instrument. Reliability, responsiveness, and measurement error testing, as described elsewhere in this review are therefore less relevant for the GLIM criteria and have not been studied. It was seen to be highly accurate in diagnosing malnutrition in critical illness and showed excellent construct validity when compared to the SGA supporting its use in the ICU setting.
Implications for outcome selection and future research
The paucity of relevant research and the difficulty of face-to-face assessments during recovery from critical illness make mandating measurement instruments challenging. The use of patient-reported questionnaires, such as the SF-36, or objective performance-based measurement instruments that can be feasibly administered at home via telemedicine, such as the STS [81, 82], may improve loss to follow-up and enable adequate analysis of interventions over recovery from critical illness.
It has previously been suggested that a single measurement instrument to evaluate functional outcomes cannot be used due to the presence of floor and ceiling effects at different time points, which we highlight above . This means identifying change over time or change in response to an intervention is challenging. The repetition based 30STS has robust clinimetric properties and no floor and ceiling effects at hospital discharge making it an attractive measure of physical function for longitudinal nutrition studies in critical illness.
The strong interest in activities of daily living suggests the Katz Index and IADL require further evaluation in the critically ill population. It has previously been suggested that the Barthel Index is more suitable than the Katz Index for assessing patients after an ICU stay  and our analysis supports this recommendation. Additional clinimetric research is required for a more complete evaluation of IADL, handgrip strength and GLIM. Without further research, these instruments may be less attractive for future clinical trials involving patient care. Defining measurement error and responsiveness in more detail for all CONCISE measurement instruments will aid future trial design and sample size calculation.
Strengths and limitations
This review followed the COSMIN methodology and a rigorous approach was taken to the evaluation of the quality and certainty of evidence using the COSMIN risk of bias checklist, COSMIN’s criteria for good measurement properties and the modified GRADE approach . The most important limitations are the low number of high-quality studies and the possibility that relevant studies with clinimetric data were missed in our searches hence results should be interpreted with this in mind. This is especially true for responsiveness where studies used a CONCISE measurement instrument but failed to comment specifically on responsiveness and therefore did not appear in our search. To minimise this, we included all randomised controlled trials of nutrition in critical illness since 2000 from the preliminary CONCISE systematic review [8, 10] but studies with non-nutritional interventions using CONCISE measurement instruments may have been missed. Due to the small number of studies, we included all studies in this review regardless of the risk of bias and subgroup analysis was not performed. We also had to adapt the COSMIN methodology for PROMs to use for the CONCISE performance-based and diagnostic measurement instruments. The studies examined were heterogeneous with variable time points of measurement which were often different to the 30 day or 90 day fixed time points we recommend in CONCISE. Finally, there were no studies evaluating structural validity and the risk of bias was doubtful in many of the studies due to the small sample size or other important methodological flaws such as an inappropriate time interval between assessments when examining reliability. This reinforces the need for large high-quality clinimetric studies in critical illness.
The CONCISE measurement instruments are established and feasible to administer during critical illness and its recovery. The SF-36 PF, SF-36 PCS, STS, 6MWT, and Barthel Index had the strongest clinimetric properties and certainty of evidence. Further clinimetric research into all the CONCISE measurement instruments will improve outcome selection for future trials of nutrition and metabolic interventions in critical illness and enable greater generalisability of findings between studies. We suggest using this review alongside CONCISE to guide outcome selection for future trials of nutrition and metabolic interventions in critical illness.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author upon reasonable request.
Abbreviations
ADL: Activities of daily living
AUC: Area under the curve
COS: Core outcome set
COSMIN: COnsensus-based Standards for the selection of health Measurement INstruments
GLIM: Global Leadership Initiative on Malnutrition
GRADE: Grading of Recommendations, Assessment, Development and Evaluation
IADL: Instrumental activities of daily living
ICC: Intraclass correlation coefficient
ICU: Intensive care unit
MIC: Minimal important change
MRC: Medical Research Council
PFIT-S: The Physical Function in ICU Test
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
RoB: Risk of bias
SF-36: Short Form-36 Questionnaire
- SF-36 PCS: Short Form-36 Questionnaire Physical Component Score
- SF-36 PF: Short Form-36 Questionnaire Physical Functioning
SDC: Smallest detectable change
SEM: Standard error of measurement
SGA: Subjective Global Assessment
SPPB: Short Physical Performance Battery
SRM: Standardised response mean
30STS: 30 second sit-to-stand test
5xSTS: Five times sit-to-stand
6MWT: 6-minute walk test
Herridge MS, Tansey CM, Matté A, Tomlinson G, Diaz-Granados N, Cooper A, et al. Functional disability 5 years after acute respiratory distress syndrome. N Engl J Med. 2011;364(14):1293–304.
Herridge MS, Moss M, Hough CL, Hopkins RO, Rice TW, Bienvenu OJ, et al. Recovery and outcomes after the acute respiratory distress syndrome (ARDS) in patients and their family caregivers. Intensive Care Med. 2016;42(5):725–38.
Needham DM, Davidson J, Cohen H, Hopkins RO, Weinert C, Wunsch H, et al. Improving long-term outcomes after discharge from intensive care unit: report from a stakeholders’ conference. Crit Care Med. 2012;40(2):502–9.
Puthucheary ZA, Rawal J, McPhail M, Connolly B, Ratnayake G, Chan P, et al. Acute skeletal muscle wasting in critical illness. JAMA. 2013;310(15):1591–600.
Puthucheary ZA, Astin R, Mcphail MJW, Saeed S, Pasha Y, Bear DE, et al. Metabolic phenotype of skeletal muscle in early critical illness. Thorax. 2018;73(10):926–35.
Chapple LAS, Kouw IWK, Summers MJ, Weinel LM, Gluck S, Raith E, et al. Muscle protein synthesis after protein administration in critical illness. Am J Respir Crit Care Med. 2022;206(6):740–9.
Bear DE, Parry SM, Puthucheary ZA. Can the critically ill patient generate sufficient energy to facilitate exercise in the ICU? Curr Opin Clin Nutr Metab Care. 2018;21(2):110–5.
Taverny G, Lescot T, Pardo E, Thonon F, Maarouf M, Alberti C. Outcomes used in randomised controlled trials of nutrition in the critically ill: a systematic review. Crit Care. 2019;23(1):12.
Chapple LAS, Ridley EJ, Chapman MJ. Trial design in critical care nutrition: the past, present and future. Nutrients. 2020;12:3694.
Davies TW, van Gassel RJJ, van de Poll M, Gunst J, Casaer MP, Christopher KB, et al. Core outcome measures for clinical effectiveness trials of nutritional and metabolic interventions in critical illness: an international modified Delphi consensus study evaluation (CONCISE). Crit Care. 2022;26(1):240.
Ware JEJ, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30(6):473–83.
McAllister LS, Palombaro KM. Modified 30-Second Sit-to-Stand Test: Reliability and Validity in Older Adults Unable to Complete Traditional Sit-to-Stand Testing. J Geriatr Phys Ther. 2020;43(3):153–8.
Parry SM, Nalamalapu SR, Nunna K, Rabiee A, Friedman LA, Colantuoni E, et al. Six-minute walk distance after critical illness: a systematic review and meta-analysis. J Intensive Care Med. 2021;36(3):343–51.
Pavasini R, Guralnik J, Brown JC, di Bari M, Cesari M, Landi F, et al. Short Physical Performance Battery and all-cause mortality: systematic review and meta-analysis. BMC Med. 2016;14(1):215.
Mahoney FI, Barthel DW. Functional evaluation: the Barthel Index. Md State Med J. 1965;14:61–5.
Katz S. Assessing self-maintenance: activities of daily living, mobility, and instrumental activities of daily living. J Am Geriatr Soc. 1983;31(12):721–7.
Graf C. The Lawton instrumental activities of daily living scale. Am J Nurs. 2008;108(4):52–62 (quiz 62–3).
Cederholm T, Jensen GL, Correia MITD, Gonzalez MC, Fukushima R, Higashiguchi T, et al. GLIM criteria for the diagnosis of malnutrition: a consensus report from the global clinical nutrition community. Clin Nutr Edinb Scotl. 2019;38(1):1–9.
Bohannon RW. Grip strength predicts outcome. Age Ageing. 2006;35(3):320 (author reply 320).
de Vet HCW, Terwee CB, Bouter LM. Current challenges in clinimetrics. J Clin Epidemiol. 2003;56(12):1137–41.
Prinsen CAC, Mokkink LB, Bouter LM, Alonso J, Patrick DL, de Vet HCW, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res Int J Qual Life Asp Treat Care Rehabil. 2018;27(5):1147–57.
Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.
Elsman EBM, Butcher NJ, Mokkink LB, Terwee CB, Tricco A, Gagnier JJ, et al. Study protocol for developing, piloting and disseminating the PRISMA-COSMIN guideline: a new reporting guideline for systematic reviews of outcome measurement instruments. Syst Rev. 2022;11(1):121.
Terwee CB, Jansma EP, Riphagen II, de Vet HCW. Development of a methodological PubMed search filter for finding studies on measurement properties of measurement instruments. Qual Life Res Int J Qual Life Asp Treat Care Rehabil. 2009;18(8):1115–23.
Covidence—Better systematic review management. [cited 2023 Jan 15]. https://www.covidence.org/
Mokkink LB, Terwee CB, Knol DL, Stratford PW, Alonso J, Patrick DL, et al. The COSMIN checklist for evaluating the methodological quality of studies on measurement properties: A clarification of its content. BMC Med Res Methodol. 2010;10(1):22.
DerSimonian R, Laird N. Meta-analysis in clinical trials revisited. Contemp Clin Trials. 2015;45(Pt A):139–45.
Alison JA, Kenny P, King MT, McKinley S, Aitken LM, Leslie GD, et al. Repeatability of the six-minute walk test and relation to physical function in survivors of a critical illness. Phys Ther. 2012;92(12):1556–63.
Bakhru RN, Davidson JF, Bookstaver RE, Kenes MT, Welborn KG, Morris PE, et al. Physical function impairment in survivors of critical illness in an ICU Recovery Clinic. J Crit Care. 2018;45:163–9.
Chan KS, Pfoh ER, Denehy L, Elliott D, Holland AE, Dinglas VD, et al. Construct validity and minimal important difference of 6-minute walk distance in survivors of acute respiratory failure. Chest. 2015;147(5):1316–26.
Chan KS, Mourtzakis M, Aronson Friedman L, Dinglas VD, Hough CL, Ely EW, et al. Evaluating muscle mass in survivors of acute respiratory distress syndrome: a 1-year multicenter longitudinal study. Crit Care Med. 2018;46(8):1238–46.
Chrispin PS, Scotton H, Rogers J, Lloyd D, Ridley SA. Short Form 36 in the intensive care unit: assessment of acceptability, reliability and validity of the questionnaire. Anaesthesia. 1997;52(1):15–23.
Heyland DK, Hopman W, Coo H, Tranmer J, McColl MA. Long-term health-related quality of life in survivors of sepsis. Short Form 36: a valid and reliable measure of health-related quality of life. Crit Care Med. 2000;28(11):3599–605.
Kaarlola A, Pettilä V, Kekki P. Performance of two measures of general health-related quality of life, the EQ-5D and the RAND-36 among critically ill patients. Intensive Care Med. 2004;30(12):2245–52.
Khoudri I, Ali Zeggwagh A, Abidi K, Madani N, Abouqal R. Measurement properties of the short form 36 and health-related quality of life after intensive care in Morocco. Acta Anaesthesiol Scand. 2007;51(2):189–97.
Needham DM, Wozniak AW, Hough CL, Morris PE, Dinglas VD, Jackson JC, et al. Risk factors for physical impairment after acute lung injury in a national, multicenter study. Am J Respir Crit Care Med. 2014;189(10):1214–24.
Puthucheary ZA, Gensichen JS, Cakiroglu AS, Cashmore R, Edbrooke L, Heintze C, et al. Implications for post critical illness trial design: sub-phenotyping trajectories of functional recovery among sepsis survivors. Crit Care. 2020;24(1):577.
Wischmeyer PE, Hasselmann M, Kummerlen C, Kozar R, Kutsogiannis DJ, Karvellas CJ, et al. A randomized trial of supplemental parenteral nutrition in underweight and overweight critically ill patients: the TOP-UP pilot trial. Crit Care. 2017;21(1):142.
Denehy L, Nordon-Craft A, Edbrooke L, Malone D, Berney S, Schenkman M, et al. Outcome measures report different aspects of patient function three months following critical care. Intensive Care Med. 2014;40(12):1862–9.
Chan KS, Aronson Friedman L, Dinglas VD, Hough CL, Shanholtz C, Ely EW, et al. Are physical measures related to patient-centred outcomes in ARDS survivors? Thorax. 2017;72(10):884–92.
de Azevedo JRA, Lima HCM, Frota PHDB, Nogueira IROM, de Souza SC, Fernandes EAA, et al. High-protein intake and early exercise in adult intensive care patients: a prospective, randomized controlled trial to evaluate the impact on functional outcomes. BMC Anesthesiol. 2021;21(1):283.
Kawakami D, Fujitani S, Morimoto T, Dote H, Takita M, Takaba A, et al. Prevalence of post-intensive care syndrome among Japanese intensive care unit patients: a prospective, multicenter, observational J-PICS study. Crit Care. 2021;25(1):69.
Weinert CR, Gross CR, Kangas JR, Bury CL, Marinelli WA. Health-related quality of life after acute lung injury. Am J Respir Crit Care Med. 1997;156(4 Pt 1):1120–8.
Costigan FA, Rochwerg B, Molloy AJ, McCaughan M, Millen T, Reid JC, et al. I SURVIVE: inter-rater reliability of three physical functional outcome measures in intensive care unit survivors. Can J Anaesth J Can Anesth. 2019;66(10):1173–83.
O’Grady HK, Edbrooke L, Farley C, Berney S, Denehy L, Puthucheary Z, et al. The sit-to-stand test as a patient-centered functional outcome for critical care research: a pooled analysis of five international rehabilitation studies. Crit Care. 2022;26(1):175.
de Melo TA, Duarte ACM, Bezerra TS, França F, Soares NS, Brito D. The Five Times Sit-to-Stand Test: safety and reliability with older intensive care unit patients at discharge. Rev Bras Ter Intensiva. 2019;31(1):27–33.
de Melo TA, Silva-Guimarães F, Lapa-e-Silva JR. The five times sit-to-stand test: safety, validity and reliability with critical care survivors’s at ICU discharge. Arch Physiother. 2022;13(1):2.
Rosa RG, Dietrich C, Valle ELT, Souza D, Tagliari L, Mattioni M, et al. The 6-Minute Walk Test predicts long-term physical improvement among intensive care unit survivors: a prospective cohort study. Rev Bras Ter Intensiva. 2021;33(3):374–83.
Parry SM, Denehy L, Beach LJ, Berney S, Williamson HC, Granger CL. Functional outcomes in ICU—what should we be using? An observational study. Crit Care. 2015;19(1):127.
Chan KS, Aronson Friedman L, Dinglas VD, Hough CL, Morris PE, Mendez-Tellez PA, et al. Evaluating physical outcomes in acute respiratory distress syndrome survivors: validity, responsiveness, and minimal important difference of 4-meter gait speed test. Crit Care Med. 2016;44(5):859–68.
Chiang LL, Wang LY, Wu CP, Wu HD, Wu YT. Effects of physical training on functional status in patients with prolonged mechanical ventilation. Phys Ther. 2006;86(9):1271–81.
Dos Reis NF, Figueiredo FCXS, Biscaro RRM, Lunardelli EB, Maurici R. Psychometric properties of the Barthel Index used at intensive care unit discharge. Am J Crit Care Off Publ Am Assoc Crit-Care Nurses. 2022;31(1):65–72.
Sacanella E, Pérez-Castejón JM, Nicolás JM, Masanés F, Navarro M, Castro P, et al. Mortality in healthy elderly patients after ICU admission. Intensive Care Med. 2009;35(3):550–5.
Van Der Schaaf M, Dettling DS, Beelen A, Lucas C, Dongelmans DA, Nollet F. Poor functional status immediately after discharge from an intensive care unit. Disabil Rehabil. 2008;30(23):1812–8.
Abd-El-Gawad WM, Adly NN, Salem HM. Diagnostic accuracy of activities of daily living in prediction of community-acquired pneumonia outcomes in elderly patients admitted to intensive care units. J Clin Gerontol Geriatr. 2013;4(4):123–7.
Bo M, Massaia M, Raspo S, Bosco F, Cena P, Molaschi M, et al. Predictive factors of in-hospital mortality in older patients admitted to a medical intensive care unit. J Am Geriatr Soc. 2003;51(4):529–33.
Clini EM, Crisafulli E, Antoni FD, Beneventi C, Trianni L, Costi S, et al. Functional recovery following physical training in tracheotomized and chronically ventilated patients. Respir Care. 2011;56(3):306–13.
Daubin C, Chevalier S, Séguin A, Gaillard C, Valette X, Prévost F, et al. Predictors of mortality and short-term physical and cognitive dependence in critically ill persons 75 years and older: a prospective cohort study. Health Qual Life Outcomes. 2011;9:35.
Tripathy S, Mishra JC, Dash SC. Critically ill elderly patients in a developing world—mortality and functional outcome at 1 year: A prospective single-center study. J Crit Care. 2014;29(3):474.e7-474.e13.
Vest MT, Murphy TE, Araujo KLB, Pisani MA. Disability in activities of daily living, depression, and quality of life among older medical ICU survivors: a prospective cohort study. Health Qual Life Outcomes. 2011;9:9.
Wu AW, Damiano AM, Lynn J, Alzola C, Teno J, Landefeld CS, et al. Predicting future functional status for seriously ill hospitalized adults. The SUPPORT prognostic model. Ann Intern Med. 1995;122(5):342–50.
Bruno RR, Wernly B, Flaatten H, Fjølner J, Artigas A, Baldia PH, et al. The association of the activities of daily living and the outcome of old intensive care patients suffering from COVID-19. Ann Intensive Care. 2022;12(1):26.
Broslawski GE, Elkins M, Algus M. Functional abilities of elderly survivors of intensive care. J Am Osteopath Assoc. 1995;95(12):712–7.
Ali NA, O’Brien JMJ, Hoffmann SP, Phillips G, Garland A, Finley JCW, et al. Acquired weakness, handgrip strength, and mortality in critically ill patients. Am J Respir Crit Care Med. 2008;178(3):261–8.
Baldwin CE, Paratz JD, Bersten AD. Muscle strength assessment in critically ill patients with handheld dynamometry: An investigation of reliability, minimal detectable change, and time to peak force generation. J Crit Care. 2013;28(1):77–86.
Cottereau G, Dres M, Avenel A, Fichet J, Jacobs FM, Prat D, et al. Handgrip strength predicts difficult weaning but not extubation failure in mechanically ventilated subjects. Respir Care. 2015;60(8):1097–104.
Fan E, Dowdy DW, Colantuoni E, Mendez-Tellez PA, Sevransky JE, Shanholtz C, et al. Physical complications in acute lung injury survivors: a two-year longitudinal prospective study. Crit Care Med. 2014;42(4):849–59.
Hermans G, Clerckx B, Vanhullebusch T, Segers J, Vanpee G, Robbeets C, et al. Interobserver agreement of Medical Research Council sum-score and handgrip strength in the intensive care unit. Muscle Nerve. 2012;45(1):18–25.
Lee JJ, Waak K, Grosse-Sundrup M, Xue F, Lee J, Chipman D, et al. Global muscle strength but not grip strength predicts mortality and length of stay in a general population in a surgical intensive care unit. Phys Ther. 2012;92(12):1546–55.
Mohamed-Hussein AAR, Makhlouf HA, Selim ZI, Gamaleldin SW. Association between hand grip strength with weaning and intensive care outcomes in COPD patients: a pilot study. Clin Respir J. 2018;12(10):2475–9.
Parry SM, Berney S, Granger CL, Dunlop DL, Murphy L, El-Ansary D, et al. A new two-tier strength assessment approach to the diagnosis of weakness in intensive care: an observational study. Crit Care. 2015;19(1):52.
Fan E, Ciesla ND, Truong AD, Bhoopathi V, Zeger SL, Needham DM. Inter-rater reliability of manual muscle strength testing in ICU survivors and simulated patients. Intensive Care Med. 2010;36(6):1038–43.
Shahbazi S, Hajimohammadebrahim-Ketabforoush M, Vahdat-Shariatpanahi M, Shahbazi E, Vahdat-Shariatpanahi Z. The validity of the global leadership initiative on malnutrition criteria for diagnosing malnutrition in critically ill patients with COVID-19: A prospective cohort study. Clin Nutr ESPEN. 2021;43:377–82.
Theilla M, Rattanachaiwong S, Kagan I, Rigler M, Bendavid I, Singer P. Validation of GLIM malnutrition criteria for diagnosis of malnutrition in ICU patients: an observational study. Clin Nutr Edinb Scotl. 2021;40(5):3578–84.
Applebaum EV, Breton D, Feng ZW, Ta AT, Walsh K, Chassé K, et al. Modified 30-second Sit to Stand test predicts falls in a cohort of institutionalized older veterans. PLoS ONE. 2017;12(5):e0176946.
Dall PM, Kerr A. Frequency of the sit to stand task: An observational study of free-living adults. Appl Ergon. 2010;41(1):58–61.
Grant PM, Dall PM, Kerr A. Daily and hourly frequency of the sit to stand movement in older adults: a comparison of day hospital, rehabilitation ward and community living groups. Aging Clin Exp Res. 2011;23(5–6):437–44.
Herridge MS, Chu LM, Matte A, Tomlinson G, Chan L, Thomas C, et al. The RECOVER program: disability risk groups and 1-year outcome after 7 or more days of mechanical ventilation. Am J Respir Crit Care Med. 2016;194(7):831–44.
Parry SM, Knight LD, Baldwin CE, Sani D, Kayambu G, Da Silva VM, et al. Evaluating Physical Functioning in Survivors of Critical Illness: Development of a New Continuum Measure for Acute Care. Crit Care Med. 2020;48(10):1427–35.
Wright SE, Thomas K, Watson G, Baker C, Bryant A, Chadwick TJ, et al. Intensive versus standard physical rehabilitation therapy in the critically ill (EPICC): a multicentre, parallel-group, randomised controlled trial. Thorax. 2018;73(3):213–21.
Bowman A, Denehy L, Benjemaa A, Crowe J, Bruns E, Hall T, et al. Feasibility and safety of the 30-second sit-to-stand test delivered via telehealth: an observational study. PM R. 2023;15(1):31–40.
Núñez-Cortés R, Flor-Rufino C, Martínez-Arnau FM, Arnal-Gómez A, Espinoza-Bravo C, Hernández-Guillén D, et al. Feasibility of the 30 s Sit-to-Stand Test in the Telehealth Setting and Its Relationship to Persistent Symptoms in Non-Hospitalized Patients with Long COVID. Diagn Basel Switz. 2022;13(1):24.
da Silveira LTY, da Silva JM, Soler JMP, Sun CYL, Tanaka C, Fu C. Assessing functional status after intensive care unit stay: the Barthel Index and the Katz Index. Int J Qual Health Care J Int Soc Qual Health Care. 2018;30(4):265–70.
Hu L, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Model. 1999;6(1):1–55. https://doi.org/10.1080/10705519909540118.
Floyd FJ, Widaman KF. Factor analysis in the development and refinement of clinical assessment instruments. Psychol Assess. 1995;7(3):286–99. https://doi.org/10.1037/1040-3590.7.3.286.
The authors would like to thank the European Society of Intensive Care Medicine and the American Society for Parenteral and Enteral Nutrition for endorsing this work.
This research received no external funding. Thomas Davies receives funding from the National Institute for Health Research (NIHR) Academic Clinical Fellowship Award Programme (Award Number: 2021-19-009). Eileen Kelly received full-time funding from the NIHR Pre-Doctoral Clinical Academic Fellowship Award Programme (Award Number: NIHR302695).
Ethics approval and consent to participate
Consent for publication
JCP is an editor of Critical Care. ZP has received honoraria for consultancy and/or speaker fees from Nestlé, Fresenius Kabi, Nutricia, Baxter and Faraday Pharmaceuticals, and research and educational grants from Nestlé and Baxter. ARB has received honoraria for consultancy and/or speaker fees from Nestlé, Fresenius Kabi, VIPUN Medical and Nutricia. SJS received grants and non-financial support from Reactive Robotics GmbH (Munich, Germany), ASP GmbH (Attendorn, Germany), STIMIT AG (Biel, Switzerland), ESICM (Geneva, Switzerland), grants, personal fees, and non-financial support from Fresenius Kabi Deutschland GmbH (Bad Homburg, Germany), grants from the Innovations fund of The Federal Joint Committee (G-BA), personal fees from Springer Verlag GmbH (Vienna, Austria) for educational purposes and Advanz Pharma GmbH (Bielefeld, Germany), non-financial support from national and international societies (and their congress organisers) in the field of anaesthesiology and intensive care medicine, outside the submitted work. SJS holds stocks in small amounts from Alphabet Inc., Bayer AG, and Siemens AG; these holdings have not affected any decisions regarding his research or this study. AH’s position is currently supported by a stipend from the Medical Faculty RWTH Aachen “Habilitationsstipendium”. Within the last 36 months AH received lecture and travel fees from Fresenius Kabi Germany and Baxter and grants for investigator-initiated trials from the DFG, Fresenius Kabi Germany, Lotte & John Hecht Memorial Foundation and Pascoe. None of the disclosed financial relationships may be perceived as inappropriately influencing AH’s contribution to this project or this manuscript. MVP received research funding from Fresenius Kabi and Nutricia Research, and speaker fees from Nutricia. MVP is the principal investigator of the PRECISe trial, which uses a COS for respiratory failure by DN. MPC receives funding from the Research Foundation Flanders (FWO) (Grant No. 1832817N) and Onderzoeksraad, KU Leuven (Grant No. C24/17/070) and the Private Charity Organization “Help Brandwonden Kids”. DEB has received speaker fees from Baxter Healthcare and has received research grant funding from Nutricia Ltd. RMP has received honoraria and/or research grants from Edwards Lifesciences and Intersurgical UK. All other authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Davies, T.W., Kelly, E., van Gassel, R.J.J. et al. A systematic review and meta-analysis of the clinimetric properties of the core outcome measurement instruments for clinical effectiveness trials of nutritional and metabolic interventions in critical illness (CONCISE). Crit Care 27, 450 (2023). https://doi.org/10.1186/s13054-023-04729-7