Functional hemodynamic tests: a systematic review and a metanalysis on the reliability of the end-expiratory occlusion test and of the mini-fluid challenge in predicting fluid responsiveness

Background: Bedside functional hemodynamic assessment has gained popularity in recent years as a way to overcome the limitations of static or dynamic indexes in predicting fluid responsiveness. The aim of this systematic review and metanalysis is to investigate the reliability of the functional hemodynamic tests (FHTs) used to assess fluid responsiveness in adult patients in the intensive care unit (ICU) and operating room (OR).

Methods: MEDLINE, EMBASE, and Cochrane databases were screened for relevant articles using a FHT, with the exception of the passive leg raising. The QUADAS-2 scale was used to assess the risk of bias of the included studies. Between-study heterogeneity was assessed through the I² indicator. Bias assessment graphs were plotted, and Egger’s regression analysis was used to evaluate the publication bias. The metanalysis determined the pooled area under the receiver operating characteristic (ROC) curve, sensitivity, specificity, and threshold for two FHTs: the end-expiratory occlusion test (EEOT) and the mini-fluid challenge (FC).

Results: After text selection, 21 studies met the inclusion criteria, 7 performed in the OR, and 14 in the ICU between 2005 and 2018. The search included 805 patients and 870 FCs with a median (IQR) of 39 (25–50) patients and 41 (30–52) FCs per study. The median fluid responsiveness was 54% (45–59). Ten studies (47.6%) adopted a gray zone analysis of the ROC curve, and a median (IQR) of 20% (15–51) of the enrolled patients was included in the gray zone. The pooled area under the ROC curve for the EEOT was 0.96 (95%CI 0.92–1.00). The pooled sensitivity and specificity were 0.86 (95%CI 0.74–0.94) and 0.91 (95%CI 0.85–0.95), respectively, with a best threshold of 5% (4.0–8.0%). The pooled area under the ROC curve for the mini-FC was 0.91 (95%CI 0.85–0.97). The pooled sensitivity and specificity were 0.82 (95%CI 0.76–0.88) and 0.83 (95%CI 0.77–0.89), respectively, with a best threshold of 5% (3.0–7.0%).

Conclusions: The EEOT and the mini-FC reliably predict fluid responsiveness in the ICU and OR. Other FHTs have so far been tested in heterogeneous clinical settings and, despite promising results, warrant further investigation.

Electronic supplementary material: The online version of this article (10.1186/s13054-019-2545-z) contains supplementary material, which is available to authorized users.


Introduction
Tailored fluid therapy has received increasing attention in the management of patients with acute circulatory failure in both the intensive care unit (ICU) and operating room (OR). The aim is to try and prevent both inadequate tissue perfusion and fluid overload [1]. Unnecessary fluid administration has been associated with increased morbidity, mortality, and hospital length of stay in both critically ill and surgical patients [2][3][4][5][6][7][8][9][10].
The only physiological reason to give a fluid challenge (FC) to a patient with acute circulatory failure is to increase the stroke volume (SV), ultimately leading to an increase in oxygen transport [11-13]. However, this is only achieved in approximately 50% of ICU and OR patients [14,15]. The prediction of fluid responsiveness prior to FC administration is a topic of interest, which has been extensively investigated, but remains challenging [1,13,16-18]. Bedside clinical signs, systemic pressures, and static volumetric variables poorly predict fluid responsiveness [17]. Moreover, the values of the ventilator-induced dynamic changes in pulse pressure and stroke volume [pulse pressure variation (PPV) and stroke volume variation (SVV), respectively] are often unreliable in a significant number of ICU and OR patients [19-21].
To overcome these limitations, bedside functional hemodynamic assessment has gained in popularity [17,18,22]. A functional hemodynamic test (FHT) consists of a maneuver that affects cardiac function and/or heart-lung interactions, with a subsequent hemodynamic response, the extent of which varies between fluid responders and non-responders [17,18,22].
The FHT called passive leg raise (PLR) has been successfully used since 2009 to assess fluid responsiveness in ICU patients [23], as confirmed by three metanalyses [24][25][26]. Some conditions, however, including abdominal or intracranial hypertension and traumatic hip or lower limb fractures, limit the use of a PLR [27], and it is often unfeasible in the OR.
A number of different FHTs have been proposed as alternatives to the PLR, for use in both the ICU and more recently the OR. These tests can be subdivided into two groups. One subgroup of FHTs is based on the assessment of changes in systemic PPV and SVV or left ventricular SV in response to a predefined alteration in ventilatory settings. These tests rely on physiological heart-lung interactions, which can affect several cardiac properties. A change in respiratory dynamics alters venous return, leading to changes in right ventricular preload, afterload, and subsequently left ventricular function [23,28]. A second subgroup of tests aims at testing the increase in SV after the rapid administration of a small aliquot of a predefined FC [29,30].
Since the reliability and the limits of PPV, SVV, and PLR in predicting fluid responsiveness have been already extensively investigated in different clinical settings [15,24-26,31], we conducted a systematic review of the literature and performed a metanalysis aimed at assessing the overall quality, external validation, consistency, and risk of bias of the other FHTs available in both the ICU and OR.

Study selection and inclusion criteria
We included articles published in the English language, in indexed scientific journals, from 1966 to June 2018. Reviews, case reports, and studies published in abstract form were not included. Only studies performed in adults were eligible for inclusion.
Only studies that compared the reliability of the FHT to a FC, as the gold standard for assessing fluid responsiveness, were included. The definition of a FHT was a standardized hemodynamic maneuver affecting cardiac function and/or heart-lung interactions and used to assess fluid responsiveness. The definition of a FC was a fixed quantity of fluid administered in a defined time to change a hemodynamic variable by a predetermined threshold. We included only the following hemodynamic variables as potential indicators of a positive FC: cardiac output (CO); SV; their indexed values (CI and SVI) or SV surrogates, i.e., aortic velocity-time integrals; and aortic blood flow, as assessed by either transthoracic or trans-esophageal echocardiography.
We excluded those studies in which FHTs were performed in patients with an open chest or with atrial fibrillation. We did not impose exclusion criteria regarding the modality or the absence of mechanical ventilation.

Search strategy and data extraction
We independently searched the MEDLINE, EMBASE, and Cochrane Database of Systematic Reviews using the following search criteria: (fluid AND responsiveness) OR passive AND leg AND raising) OR end-expiratory AND occlusion AND test) OR pulse AND pressure AND variation) OR stroke AND volume AND variation) OR (dynamic AND indices OR indexes)) OR mini-fluid challenge) OR functional AND hemodynamic AND monitoring) OR (fluid AND challenge). Filters: Humans; English; Adult: 19+ years.
The references for all included papers, review articles, commentaries, and editorials on this topic were also reviewed to identify other studies of interest that were missed during the primary search. Two of the authors (FT and GM) independently performed the evaluation of titles and abstracts. The articles were then subdivided into three subgroups: "included" and "excluded" (if the two examiners agreed with the selection) or "uncertain" (in case of disagreement). In case of "uncertain" classification, a further examination was performed by an expert (AM) and a conclusive decision was made.
We used a standardized data form to extract the data from all included studies, recording (1) the characteristics of the investigated population, (2) the methods used to perform the FHT and to assess its hemodynamic effect, (3) the modalities of FC administration and the definition of fluid responsiveness, and (4) the area under the receiver operating characteristic (ROC) curve (AUC) and all the statistical data obtained by the ROC curve analysis (i.e., sensitivity, specificity, Youden index, positive and negative predictive values, positive and negative likelihood ratios). For those studies in which more than one method of hemodynamic monitoring was used to estimate flow parameters, we reported only the data obtained by the technique considered to be the most reliable, according to the following scale: pulmonary artery catheter or calibrated technique > echocardiography performed by experts (either transthoracic or transesophageal) > uncalibrated technique or esophageal Doppler probes > bioimpedance or bioreactance.
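As an illustration of how the ROC-derived statistics listed above relate to one another, the following minimal sketch computes them from a single 2x2 table of FHT result versus response to the FC. The function name `diagnostic_metrics` and the patient counts are hypothetical and are not taken from any included study.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy metrics from a 2x2 table of FHT result vs. response to the FC.

    tp/fn: responders with a positive/negative FHT;
    fp/tn: non-responders with a positive/negative FHT.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_index": sensitivity + specificity - 1,
        "positive_predictive_value": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
        # Likelihood ratios are undefined at the extremes; report infinity there.
        "positive_likelihood_ratio": (
            sensitivity / (1 - specificity) if specificity < 1 else math.inf
        ),
        "negative_likelihood_ratio": (
            (1 - sensitivity) / specificity if specificity > 0 else math.inf
        ),
    }


# Hypothetical counts for one 40-patient study (illustrative only).
metrics = diagnostic_metrics(tp=18, fp=2, fn=3, tn=17)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The predictive values depend on the proportion of responders in the sample, whereas sensitivity, specificity, and the likelihood ratios do not, which is one reason the latter are the quantities usually pooled across studies.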

Assessment of risk of bias in the included studies
The QUADAS-2 scale was used to assess the risk of bias of the included studies [32]. Two expert authors (AM and MC) independently examined the studies using predefined criteria, which are reported in Additional file 1: Table S1.
For each criterion, the risk of bias was judged as high (3 points), unclear (2 points), or low (1 point). If the answers to all signaling questions for a domain were "yes," then the risk of bias was judged as "low." If any signaling question was answered "no," the potential risk of bias was defined as indicated in Additional file 1: Table S1. The sum of these points was used to calculate the global risk of bias.
Studies were included in the highest risk of bias group if the sum of the points obtained by the risk of bias and applicability judgment assessment was higher than the median value for all the studies.
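The scoring rule described above can be sketched as follows. This is a minimal illustration that assumes seven judgments per study (four risk-of-bias domains plus three applicability domains, as in the standard QUADAS-2 layout); the study labels and judgments are hypothetical.

```python
from statistics import median

# Points assigned to each judgment, as described in the text.
POINTS = {"low": 1, "unclear": 2, "high": 3}


def quadas_score(judgments):
    """Sum the points over all risk-of-bias and applicability judgments of one study."""
    return sum(POINTS[j] for j in judgments)


def highest_risk_group(scores):
    """Return the studies whose total score is higher than the median of all studies."""
    cutoff = median(scores.values())
    return {study for study, score in scores.items() if score > cutoff}


# Hypothetical judgments for four studies (seven judgments each, an assumption).
judgments = {
    "Study A": ["low"] * 7,
    "Study B": ["low"] * 5 + ["unclear"] * 2,
    "Study C": ["low"] * 4 + ["unclear"] * 2 + ["high"],
    "Study D": ["low"] * 6 + ["unclear"],
}
scores = {study: quadas_score(j) for study, j in judgments.items()}
print(scores)
print(sorted(highest_risk_group(scores)))
```

Because the cutoff is the median of the observed scores, the classification is relative to the included studies rather than to an absolute quality standard.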

Statistical analysis
Statistical analysis was conducted on the summary statistics described in the selected articles (e.g., means, medians, proportions), and therefore, the statistical unit of observation for all the selected variables was the single study and not the individual patients.
The descriptive statistics of individual studies used different statistical indicators for central tendency and variability, whereas absolute and relative frequencies were adopted for qualitative variables. Quantitative variables were summarized with means (standard deviation, SD) or medians (25th-75th interquartile range, IQR) according to their distribution.
For the selected studies, we planned to perform (1) a metanalysis in order to determine the pooled AUC and the pooled sensitivity and specificity of the FHT as a predictor of fluid responsiveness and (2) a metanalysis in order to determine the pooled correlation between the changes in the flow hemodynamic parameters after FHT and the changes after FC administration. The FC was the exposure variable, and clinical and hemodynamic characteristics were considered as the outcome variables. Fixed effect models were used. Between-study heterogeneity was assessed through the I² indicator. Bias assessment graphs were plotted, and Egger's regression analysis was used to evaluate the publication bias. Student's t test or the Mann-Whitney test was used, for parametric and non-parametric distributions respectively, to assess a difference in mean values between responders and non-responders.
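For readers unfamiliar with these quantities, the sketch below shows one common way to obtain a fixed-effect pooled estimate and the I² indicator, using inverse-variance weights and Cochran's Q. It is an assumption that this matches the software procedure actually used; the per-study AUCs and standard errors are hypothetical, and Egger's regression is not covered.

```python
import math


def fixed_effect_pool(estimates, standard_errors):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I2 (in percent)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations of each study from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I2 is the share of total variation attributed to heterogeneity, floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared


# Hypothetical per-study AUCs and standard errors (illustrative values only).
aucs = [0.80, 0.95, 0.70]
ses = [0.05, 0.05, 0.05]
pooled, pooled_se, q, i_squared = fixed_effect_pool(aucs, ses)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled AUC {pooled:.3f} (95%CI {low:.3f}-{high:.3f}), "
      f"Q = {q:.2f}, I2 = {i_squared:.1f}%")
```

With equal standard errors the pooled value reduces to the simple mean of the study estimates; an I² well above 50%, as in this made-up example, would indicate that the spread between studies exceeds what sampling error alone would explain.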

Results
The electronic search identified 7674 titles. After the first assessment by two authors, 32 full-text manuscripts were included in the secondary analysis and 21 met the inclusion criteria: 7 performed in OR and 14 in ICU between 2005 and 2018. The senior examiner evaluated 177 of the 7524 (2%) potentially relevant studies because of disagreement between the two authors. A detailed description of the selection process flow is provided in Fig. 1. We did not find any further relevant publications by reviewing the references of the selected studies, review articles, commentaries, or editorials regarding the use of FHTs.
According to the search criteria, we identified seven different types of FHTs (see Table 1):
1. An interruption of mechanical ventilation for a few seconds to increase right ventricular preload (the end-expiratory occlusion test, EEOT)
2. A quick administration of an aliquot of 50-100 ml of fluid to increase the SV (the mini-FC test)
3. The use of a lung recruitment maneuver (LRM) of 25-30 cmH2O to affect the hemodynamic response of the right ventricle
4. The assessment of the systolic arterial pressure decrease after the use of successive incremental pressure-controlled breaths [the respiratory systolic variation test (RSVT)]
5. The assessment of the arterial pressure response during a Valsalva maneuver
6. The assessment of the arterial pressure response during a brief increase of the positive end-expiratory pressure from 10 to 20 cmH2O
7. An increase of the tidal volume from 6 to 8 ml/kg for 1 min to enhance the baseline reliability of the dynamic indexes of fluid responsiveness

All the studies were monocentric and, overall, included 805 patients and 870 FCs with a median (IQR) of 39 (25-50) patients and 41 (30-52) FCs per study. The median (IQR) fluid responsiveness was 54% (45-60) and was not different between the OR and ICU studies [51% (37-62) vs. 54% (45-58), respectively; p = 0.81]. The hemodynamic values of responders and non-responders before FHT application in both the OR and ICU studies did not differ (see Additional file 1: Table S2). Ten studies (48%) adopted a gray zone analysis of the ROC curve, and a median (IQR) of 20% (15-51) of the enrolled patients was included in the gray zone.

Overall, the median (IQR) QUADAS-2 score of the included studies was 9 (8-11) and was not different between the OR and ICU [10 (8-11) vs. 9 (8-11), respectively; p = 0.67]. Three OR studies (43%) and six ICU studies (43%) were classified in the subgroup with the highest risk of bias (see Table 2).

Metanalysis of the included studies (see Figs. 2, 3, and 4)
The pooled AUC of the EEOT from two studies conducted in the OR [33,34] and six [23,43,46,48-50] in the ICU was 0.96 (95%CI 0.92-1.00). The pooled sensitivity of the test was 0.86 (95%CI 0.74-0.94), with I² of 75% (95%CI 43-85%), and the pooled specificity was 0.91 (95%CI 0.85-0.95), with I² of 35% (95%CI 0-69%). The median threshold identified was a 5% (4-8%) increase in the considered variable. The funnel plot of the included studies testing the EEOT shows a significant likelihood of publication bias (see Additional file 1: Figures S1 and S2).

Fig. 3 EEOT forest plot of included studies. Forest plot reporting the pooled sensitivity and specificity (green diamonds) of the end-expiratory occlusion test (EEOT) in predicting fluid responsiveness by considering the changes in stroke volume or its surrogates after the test and those induced by fluid challenge administration. Black squares represent the values of sensitivity and specificity (with 95% confidence intervals; black lines) of each study included in the metanalysis, and the size of each square indicates the size of each study. The definitions Monnet et al. "a" and "b" refer to the two populations investigated in the study [50] (see also Table 3 and see text for details). 95%CI, 95% confidence intervals
The funnel plot for the included studies testing the mini-FC shows a small likelihood of publication bias (see Additional file 1: Figures S3 and S4). Moreover, it was possible to calculate a pooled correlation of r = 0.68 (95%CI 0.41-0.84) between the changes in the cardiac flow parameters after mini-FC application and after FC administration from data obtained from 6 studies [29,36,40,41,44,45].
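The text does not state how the pooled correlation was computed. One common approach, shown here purely as an assumption, is to pool Fisher z-transformed coefficients with weights of n - 3 and back-transform the result; the coefficients and sample sizes below are hypothetical.

```python
import math


def pool_correlations(rs, ns):
    """Fixed-effect pooling of correlation coefficients via Fisher's z transform."""
    zs = [math.atanh(r) for r in rs]
    # The variance of Fisher's z is approximately 1 / (n - 3), so the weight is n - 3.
    weights = [n - 3 for n in ns]
    z_pooled = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    se = 1.0 / math.sqrt(sum(weights))
    low, high = z_pooled - 1.96 * se, z_pooled + 1.96 * se
    # Back-transform the pooled z and its confidence limits to the r scale.
    return math.tanh(z_pooled), math.tanh(low), math.tanh(high)


# Hypothetical per-study correlations and sample sizes (illustrative only).
r_pooled, r_low, r_high = pool_correlations(rs=[0.55, 0.70, 0.75], ns=[30, 40, 50])
print(f"pooled r = {r_pooled:.2f} (95%CI {r_low:.2f}-{r_high:.2f})")
```

Working on the z scale keeps the confidence limits inside the interval from -1 to 1 and makes them asymmetric around the pooled r, as in the interval reported above.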

Discussion
The main findings of this systematic review conducted in ICU and OR patients are as follows: (1) the EEOT and the mini-FC have been tested in the OR and ICU and have shown good sensitivity and specificity for predicting fluid responsiveness; (2) currently, the literature provides insufficient data regarding the other FHTs to assess a pooled quantification of their reliability in predicting fluid responsiveness; and (3) publication bias, small-study effects, and methodological heterogeneity of the individual studies should be considered.
The EEOT was first proposed by Monnet et al. [23] and predicts fluid responsiveness by assessing changes in CO, or its surrogates, following an interruption of mechanical ventilation lasting a few seconds. In preload-dependent patients, this maneuver increases venous return and right ventricular and, subsequently, left ventricular stroke volume. The potential drawbacks of this FHT are that it may be limited by patient positioning, the baseline tidal volume ventilation adopted, and the hemodynamic effects of residual spontaneous breathing efforts. Only one study used the EEOT to assess fluid responsiveness in prone ICU patients with moderate ARDS, reporting an AUC of 0.65 (0.46-0.84) [43]. Prone positioning affects the venous return by compressing the inferior vena cava and changing the intra-abdominal pressure [51][52][53], which may reduce the changes in CO and SV seen in response to the ventilatory challenge and limit the reliability of the EEOT.
The change in intrathoracic pressure may be insufficient to adequately increase right ventricular preload when a lung-protective ventilation strategy is used. Also, if the neural trigger for ventilation is preserved, a 15- to 30-s expiratory hold would result in a progressive increase in inspiratory pressure [54], affecting the venous return and the reliability of the FHT. Unfortunately, data regarding these issues are limited and contradictory.
In the OR, the EEOT performed better in a study using a mean tidal volume of 6.8 ml/kg [34], when compared to another study using 8.2 ml/kg [33]. In the ICU, the median tidal volume in those studies enrolling supine patients was 6.8 ml/kg (6.1-7.3). The EEOT failed to predict fluid responsiveness in the study of Myatra et al. using a 6-ml/kg ventilation [49], whereas Jozwiak et al. reported an AUC of 0.98 (0.85-1.0) using a 6.2-ml/kg ventilation. Interestingly, these two latter studies reported a comparable mean total respiratory system compliance in the enrolled patients (28 vs. 36 ml/cmH2O, respectively).
Monnet et al. reported an EEOT failure as high as 22.5%, due to the patient effort against an occluded airway [23]. However, none of the other studies using this FHT reported this failure rate. Four of the five studies reported no spontaneous breathing activity during assisted-controlled ventilation (see Table 1), implying the level of sedation was inhibiting neural triggering. None of these studies reported a flowchart showing the overall number of excluded patients, limiting the assessment of EEOT reliability during visible spontaneous breathing activity, which is a potential drawback for assessing fluid responsiveness.

Mini-FC
The mini-FC showed a pooled AUC of 0.91 (95%CI 0.85-0.97). The pooled sensitivity and specificity were 0.82 (95%CI 0.76-0.88) and 0.83 (95%CI 0.77-0.89), respectively, with a best threshold of 5% (3.0-7.0%) increase in SV or its surrogates (see Fig. 4 and Table 3). These values of the pooled ROC curve imply a moderate overlap in the distribution of responders and non-responders.
In the two studies reporting an AUC higher than 0.90, the percentage of patients included in the gray zone was approximately 14-19% [35, 36] (see Table 3). Moreover, the performance of this FHT was comparable under stable conditions in the OR (using uncalibrated tools) and in more unstable ICU patients (using calibrated tools) (see Table 1).
The dose of the mini-FC was not fixed. Most of the studies used a bolus of 100 ml infused over 60 s, but Wu et al. demonstrated that a 10% change in SV following the infusion of a 50-ml bolus in 10 s reliably predicted fluid responsiveness [40].
Some may argue that the mini-FC should not be considered an appropriate FHT, since the response to the first small aliquot of fluids is actually included in the response to the final volume administered, therefore not predicting the response to the whole FC, but only to a part of it. However, recent studies have shown different components of FC, related to the response (the extent of SV increase) and sustainability of the hemodynamic effect (the effect of SV over time) [55][56][57]. The mini-FC allows a dynamic evaluation of fluid administration, preventing inappropriate administration and allowing a tailored infusion. Moreover, this FHT has also been used in a different functional manner. In fact, Mallat et al. [45] demonstrated that a reduction in PPV [AUC = 0.92 (0.81-0.98)] or SVV [AUC = 0.91 (0.80-0.97)] following a mini-FC test was a better predictor of fluid responsiveness than an increase in CO. The cutoffs identified by the ROC curve for the changes in PPV and SVV are even smaller (2.0%) than the changes in CO (5.2%), implying a high precision of measurement, whichever hemodynamic tool is used.

Other FHTs
All the other FHTs reported in the literature affect both right ventricular preload and afterload, by briefly altering intrathoracic pressure and, as a consequence, venous return and pulmonary vascular resistance.
The RSVT is based on the delivery of consecutive pressure-controlled inspiratory breaths, using incremental peak inspiratory pressures (up to 30 cmH2O) and plotting the minimal values of the systolic arterial pressure recorded after each breath against the related airway pressures (offline slope calculation) [28,37]. Despite promising results obtained in both the OR and ICU [28,37], the integration of respiratory and hemodynamic signals required to allow an online computation of the RSVT has never been achieved at the bedside.
Raising intrathoracic pressure, whether by increasing peak inspiratory pressure using a Valsalva maneuver [42] or the end-expiratory occlusion pressure [47] or by performing a LRM, induces a sudden change in right ventricular preload and afterload. LRMs have been successfully applied in the OR, showing a comparable AUC in neurosurgery [38] and general abdominal surgery [39]. However, Biais et al. found that the best threshold to define fluid responsiveness was a 30% reduction in SV, whereas De Broca et al. showed that only a 16% reduction was required [39], suggesting caution in the interpretation of this FHT.
Finally, more recently, Myatra et al. successfully enhanced the reliability of baseline indexes of fluid responsiveness by increasing the tidal volume from 6 to 8 ml/kg for 1 min (the tidal volume challenge) [49]. This simple and quick FHT could be used in patients undergoing protective ventilation but should be tested in larger ICU populations both with and without spontaneous breathing activity.

Bedside application
The EEOT and the mini-FC could be appropriately used in different clinical scenarios, especially when the PLR is unsuitable, or as an adjunct to it. In Fig. 5, we propose a step-by-step clinical algorithm in patients who would benefit from FC administration in the OR and the ICU.

Limitations
The comparability of the included studies is limited by the heterogeneity of FC administration used as the reference point (see Table 1). Aya et al. have previously demonstrated that a FC should be at least 4 ml/kg [55]. For this reason, some patients enrolled in those studies adopting a smaller dose of FC (3.7 ml/kg [34]; 3.3 ml/kg [35,38]) may have been underchallenged, which may have affected the observed rate of fluid responsiveness and, in turn, the ROC curve construction.
Another potential source of bias is related to the different hemodynamic tools used to assess both fluid responsiveness and FHT reliability. In fact, when considering the median cutoff value identifying responders from non-responders (about 5% for both the EEOT and the mini-FC), the accuracy of measurement of the changes in CO, or its surrogates, is of pivotal importance. For example, the negative results of Guinot et al. [33], conducted in the OR, have been questioned as the esophageal Doppler does not measure the change in aortic diameter and could therefore underestimate the change in SV during either the EEOT or the FC [59].
Additionally, the reliability of different calibrated and uncalibrated tools in tracking the dynamic trends of CO may not be consistent and may be below the boundaries of the accuracy and precision of the Critchley-Critchley criteria [60,61]. For instance, the reproducibility of the measurements obtained by the different hemodynamic tools has never been reported in the included studies. This implies that small changes in CO, or its surrogates, after a FHT may be inaccurately detected in the OR, where the hemodynamic monitoring is usually performed with uncalibrated tools, whereas the use of calibrated techniques by means of thermodilution could reduce the risk of imprecise measurements in ICU.
All the included studies had a small-sized single-center design and enrolled a rather small median number of patients [39 (25-50)], and about 43% of the included studies were classified in the subgroup with the highest risk of bias, mainly because of the drawbacks related to the patient selection, according to the QUADAS-2 score (see Table 2). This limitation along with the use of different cutoff values, thresholds, and measurement techniques to assess fluid responsiveness potentially produced heterogeneity in the response to the FC administration. As confirmed, the proportion of responders ranged between 30 and 71% across the included studies. The bedside application is also limited in those potentially misclassified patients (roughly 20% in the reported studies) included in the gray zone of the ROC curve, where the predictive power of the FHT is rather low. Another source of heterogeneity may be related to the different sample sizes of the included studies, as confirmed by the large interquartile ranges of the I². Finally, we did not include non-full-text studies, studies not in English, and unpublished studies, and this systematic review was not prospectively registered in PROSPERO, an international database of systematic reviews in health and social care, increasing the overall risk of reporting bias.

Fig. 5 Clinical algorithm for EEOT and mini-FC application in the ICU and the OR. In the OR, FHTs can be added to the dynamic indexes evaluation, considering the gray zone reported in the literature [21]. When PPV or SVV values range within the gray zone, we suggest the use of the EEOT as the first step. A clear positive response (SV increase > 5%) suggests fluid responsiveness, whereas a negative/uncertain response could be further investigated by the subsequent use of the mini-FC, as indicated. In critically ill patients, the need for FC administration is often evaluated combining different signs and measurements [58]. In this context, the EEOT (in patients undergoing controlled mechanical ventilation) and the mini-FC (in patients retaining to some extent a spontaneous breathing effort) can be useful when the PLR is unsuitable. *We suggest a FC of 4 ml/kg [55] over 10 min. **Intra-abdominal hypertension; uncontrolled pain, cough, discomfort, and awakening; hip/leg fractures; uncontrolled intracranial hypertension. ICU, intensive care unit; OR, operating room; FC, fluid challenge; PLR, passive leg raising; CMV, controlled mechanical ventilation; SB, spontaneously breathing patients; AMV, assisted mechanical ventilation; PPV, pulse pressure variation; SVV, stroke volume variation; EEOT, end-expiratory occlusion test; SV, stroke volume
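The operating-room branch described in the Fig. 5 caption (EEOT first when PPV or SVV falls in the gray zone, then the mini-FC if the EEOT is negative or uncertain) can be sketched as a simple decision function. This is an illustrative sketch only, not a clinical tool: the function name and return strings are hypothetical, and applying the pooled 5% threshold to the mini-FC step and labeling a sub-threshold mini-FC response as "unlikely" are assumptions not stated in the caption.

```python
def or_gray_zone_pathway(eeot_sv_change_pct, mini_fc_sv_change_pct=None,
                         threshold_pct=5.0):
    """Sketch of the OR pathway for a patient whose PPV/SVV lies in the gray zone.

    eeot_sv_change_pct: percent change in SV (or surrogate) during the EEOT.
    mini_fc_sv_change_pct: percent change in SV after the mini-FC, if performed.
    threshold_pct: cutoff for a positive response (5% per the caption and pooled data).
    """
    # Step 1: a clear positive EEOT response suggests fluid responsiveness.
    if eeot_sv_change_pct > threshold_pct:
        return "fluid responsiveness suggested by EEOT"
    # Step 2: a negative or uncertain EEOT is followed by the mini-FC.
    if mini_fc_sv_change_pct is None:
        return "EEOT negative or uncertain: consider mini-FC"
    # Assumption: the same 5% cutoff is applied to the mini-FC response.
    if mini_fc_sv_change_pct > threshold_pct:
        return "fluid responsiveness suggested by mini-FC"
    return "fluid responsiveness unlikely"


# Hypothetical measurements (illustrative only).
print(or_gray_zone_pathway(eeot_sv_change_pct=7.0))
print(or_gray_zone_pathway(eeot_sv_change_pct=3.0))
print(or_gray_zone_pathway(eeot_sv_change_pct=3.0, mini_fc_sv_change_pct=6.5))
```

Given the gray zone reported around these cutoffs, values close to the threshold would in practice warrant the caution discussed in the limitations rather than a binary reading.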