Performance of the cuff leak test in adults in predicting post-extubation airway complications: a systematic review and meta-analysis

Background Clinical practice guidelines recommend performing a cuff leak test in mechanically ventilated adults who meet extubation criteria to screen those at high risk for post-extubation stridor. Previous systematic reviews demonstrated excellent specificity of the cuff leak test but disagreed with respect to sensitivity. We conducted a systematic review and meta-analysis to assess the diagnostic accuracy of the cuff leak test for predicting post-extubation airway complications in intubated adult patients in critical care settings.
Methods We searched Medline, EMBASE, Scopus, ISI Web of Science, and the Cochrane Library for eligible studies from inception to March 16, 2020, without language restrictions. We included studies that examined the diagnostic accuracy of the cuff leak test if post-extubation airway obstruction or reintubation was explicitly reported as the reference standard. Two authors independently assessed the risk of bias in duplicate using the Quality Assessment for Diagnostic Accuracy Studies-2 (QUADAS-2) tool. We pooled sensitivities and specificities using a generalized linear mixed model approach to bivariate random-effects meta-analysis. Our primary outcomes were post-extubation airway obstruction and reintubation.
Results We included 28 studies involving 4493 extubations. Three studies were at low risk for all QUADAS-2 risk of bias domains. The pooled sensitivity and specificity of the cuff leak test for post-extubation airway obstruction were 0.62 (95% CI 0.49–0.73; I² = 81.6%) and 0.87 (95% CI 0.82–0.90; I² = 97.8%), respectively. The pooled sensitivity and specificity of the cuff leak test for reintubation were 0.66 (95% CI 0.46–0.81; I² = 48.9%) and 0.88 (95% CI 0.83–0.92; I² = 87.4%), respectively. Subgroup analyses suggested that the type of cuff leak test and the length of intubation might be the cause of statistical heterogeneity of sensitivity and specificity, respectively, for post-extubation airway obstruction.
Conclusions The cuff leak test has excellent specificity but moderate sensitivity for post-extubation airway obstruction. The high specificity suggests that clinicians should consider intervening in patients with a positive test, but the low sensitivity suggests that patients still need to be closely monitored post-extubation.

Since the endotracheal tube precludes direct visualization of the upper airway, the cuff leak test was proposed to predict the presence of laryngeal edema and post-extubation airway obstruction [10,11]. Theoretically, when there is no laryngeal edema, air leaks around the tube after the balloon cuff of the endotracheal tube is deflated [12,13]. In contrast, a failed cuff leak test, in which there is little or no air leak around the tube, suggests potential airway obstruction from laryngeal edema [12,13]. The clinical practice guideline published by the American Thoracic Society and American College of Chest Physicians in 2017 recommends performing a cuff leak test in mechanically ventilated adults who meet extubation criteria to screen those at high risk for post-extubation stridor [14]. This guideline referenced two systematic reviews on the diagnostic accuracy of the cuff leak test [15,16], published in 2009 and 2011. Both reviews demonstrated excellent specificity of the cuff leak test but disagreed with respect to sensitivity. In addition, several studies of the diagnostic accuracy of the cuff leak test have been published since these reviews were completed.
Consequently, we conducted a systematic review and meta-analysis to assess the diagnostic accuracy of the cuff leak test for predicting post-extubation airway obstruction and subsequent reintubation.

Methods
The conduct and reporting of this systematic review followed the PRISMA-DTA Statement [17]. Our review protocol was registered at PROSPERO (CRD42018084357).
We searched Medline, EMBASE, Scopus, ISI Web of Science, and the Cochrane Library for eligible studies from inception to March 16, 2020, without language restrictions [18]. Our search strategy was developed with the help of a medical librarian (Additional file 3: Table S1). We hand-searched the references of included articles for potentially relevant studies.
We examined the diagnostic accuracy of the cuff leak test in intubated adult patients awaiting extubation in critical care settings. The index test was the cuff leak test, regardless of its type (quantitative or qualitative) and the threshold used. The reference standards were post-extubation airway obstruction, as determined by the original authors, and subsequent reintubation. We included observational studies (cross-sectional and cohort studies) that examined the diagnostic accuracy of the cuff leak test in critical care settings if (1) the reported data were extractable into a 2 × 2 table and (2) post-extubation airway obstruction or reintubation was explicitly reported as the reference standard. We considered both published studies and conference proceedings; however, we included abstracts from conference proceedings only when they provided data in enough detail to be extractable. We also considered interventional studies in critical care settings that examined the efficacy of systemic corticosteroids in preventing post-extubation airway complications; however, we excluded patients who received systemic corticosteroids after being judged at high risk of post-extubation airway complications. Two authors (AK and JK) independently screened titles and abstracts obtained from the search and selected potentially relevant articles. Disagreement was resolved through discussion.
The first author (AK) and one of the other authors (JLJ or JK) independently extracted the following data from each study in duplicate: (1) patient demographics (age, sex); (2) study characteristics (country, study population, duration of mechanical ventilation, mode of ventilation, and observation period after extubation); (3) the type of cuff leak test (quantitative or qualitative); (4) numbers of true-positive, false-positive, true-negative, and false-negative results; and (5) the reference standards used. The quantitative cuff leak test measures the air leak volume with the cuff deflated and judges the risk of post-extubation airway obstruction based on its absolute volume, or its proportion relative to the expiratory tidal volume, compared against a certain threshold [19]. The qualitative cuff leak test examines the presence or absence of audible expired air around the endotracheal tube, which indicates passing or failing the test [19]. In this study, the absence of a cuff leak, indicating a risk of post-extubation complications, was considered a positive test, while the presence of a leak, suggesting a low risk of post-extubation complications, was considered a negative test [19].
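As a minimal sketch of how the extracted counts map onto accuracy measures under the convention above (positive test = absent or insufficient leak), the following computes sensitivity and specificity from a single 2 × 2 table. The class name, the helper `is_positive_quantitative`, the strict "less than" comparison, and the counts are all hypothetical choices made for illustration; they are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one study; a positive test means absent or insufficient cuff leak."""
    tp: int  # positive test, airway obstruction occurred
    fp: int  # positive test, no airway obstruction
    fn: int  # negative test (leak present), airway obstruction occurred
    tn: int  # negative test (leak present), no airway obstruction

    def sensitivity(self) -> float:
        # Share of patients with obstruction who had a positive test
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of patients without obstruction who had a negative test
        return self.tn / (self.tn + self.fp)


def is_positive_quantitative(leak_volume_ml: float, cutoff_ml: float = 110.0) -> bool:
    """Assumed rule: a leak volume below the cutoff counts as a positive (failed) test."""
    return leak_volume_ml < cutoff_ml


# Hypothetical counts, for illustration only
study = TwoByTwo(tp=6, fp=12, fn=4, tn=78)
print(round(study.sensitivity(), 2), round(study.specificity(), 2))
```

Individual studies used different cutoffs and some used a proportion of tidal volume rather than an absolute volume, so the default of 110 mL here is only one of several possible thresholds.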
Two authors (AK and JLJ) independently assessed the risk of bias using the Quality Assessment for Diagnostic Accuracy Studies-2 (QUADAS-2) tool [20]. Inconsistency was resolved through consensus.
We pooled the data using a generalized linear mixed model approach to bivariate random-effects meta-analysis to calculate summary estimates of sensitivity, specificity, and likelihood ratios as well as the associated 95% confidence intervals (CIs) [24]. We pooled prevalence using a random-effects model, with exact binomial estimates of standard deviation and the Freeman-Tukey transformation for zero cells [25]. To examine the sources of heterogeneity, we examined the receiver operating characteristic (ROC) curves, assessed the correlation between sensitivity and specificity, and analyzed whether sensitivity and specificity of cuff leak changed with the type of cuff leak test (qualitative or quantitative), the proportion of women, inclusion or exclusion of reintubated patients in a study, and the length of intubation, using subgroup or meta-regression analysis [14,19]. We also calculated the sensitivity and specificity of cuff leak test using a cutoff of 110 mL, a value that is frequently used in clinical practice [21]. We tested for publication bias using Deeks' method [26]. We created a Fagan's nomogram, which determines the posttest probability according to the pretest probability and the calculated positive and negative likelihood ratios [27]. We followed standard diagnostic meta-analytic approaches in focusing on the sensitivity, specificity, and likelihood ratios instead of positive and negative predictive values because predictive values are dependent on the population prevalence of post-extubation complications, which can vary considerably. The threshold of statistical significance was set at P < 0.05. All analyses were performed with Stata SE, version 15.1 (Stata Corp; College Station, TX).
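The arithmetic behind a Fagan's nomogram can be sketched as follows. This is a simplified illustration, not the review's analysis: it derives likelihood ratios directly from point estimates of sensitivity and specificity, whereas the likelihood ratios reported in the review come from the bivariate model and may differ. The function names are hypothetical; the inputs are the pooled estimates for post-extubation airway obstruction and the 9% pretest probability reported in this review.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (positive LR, negative LR) from point estimates."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Pooled estimates for post-extubation airway obstruction and the pretest
# probability used for the nomogram in this review
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.62, specificity=0.87)
p_after_positive = posttest_probability(0.09, lr_pos)
p_after_negative = posttest_probability(0.09, lr_neg)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"posttest probability: positive test {p_after_positive:.2f}, "
      f"negative test {p_after_negative:.2f}")
```

Under these simplifying assumptions, a positive test raises the probability of airway obstruction from 9% to roughly one in three, and a negative test lowers it to roughly 4%.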
Results
Three of the 28 studies were at low risk of bias for all QUADAS-2 risk of bias domains (Table 3). Seventeen studies (60.7%) were deemed at low risk of bias for the domain of patient selection. Eight out of 22 studies that assessed quantitative cuff leak prespecified the cutoff of cuff leak; fourteen studies (50%) were deemed to have adequately assessed the index test. A reference standard was adequately assessed in 18 studies (64.3%). Study participants were adequately followed up in 19 studies (67.9%).
Subgroup analysis suggested that the specificity was similar between the qualitative and quantitative cuff leak tests. A nomogram based on a pretest probability of 9% (the incidence of stridor in the studies included in our review) is provided (Fig. 3).

Discussion
Our study found that the cuff leak test has excellent specificity but moderate sensitivity for post-extubation airway obstruction. The cuff leak test thus works better to rule in than to rule out potential post-extubation airway obstruction. However, the false-negative rate of 38% suggests that the cuff leak test may fail to identify some patients with post-extubation airway obstruction. The excellent specificity for post-extubation airway obstruction is consistent with two previous systematic reviews [15,16]. In contrast, Ochoa et al. and Zhou et al. concluded that the sensitivity of cuff leak testing for post-extubation airway obstruction was 56% and 80%, respectively [15,16]. Our pooled sensitivity of 62% fell between those two findings. We included nearly twice as many studies as those reviews did. Furthermore, the additional studies we included were of higher quality, potentially making our findings more reliable.
Our analysis found that the qualitative cuff leak test had low sensitivity (35%) in predicting post-extubation airway obstruction. This has been consistently found in recent studies. The most likely explanation is the subjective nature of this test. In addition, Schnell et al. provided data from three different methods of qualitative cuff leak testing [48], the sensitivities of which were all around 30%, so this study may have been overweighted; however, a repeat analysis limiting Schnell's study to a single data contribution did not change the sensitivity of the qualitative test. In contrast, the specificity of both the qualitative and quantitative cuff leak tests was high, nearly 90%; although there was a statistically significant difference between the two methods, both performed equally well clinically. A cutoff of 110 mL also had low sensitivity (44%) and high specificity for predicting post-extubation airway obstruction. We thus conclude that the cuff leak test has high specificity and can be used to select patients to consider treating with systemic corticosteroids, but its low sensitivity suggests that the traditional practice of closely observing all patients in the immediate post-extubation period should be continued. Consistent with these findings, the nomogram suggested that while a negative cuff leak test indicates a low probability of post-extubation airway obstruction, a positive test still provides a relatively low posttest probability.
The ATS/ACCP guideline provided a conditional recommendation regarding the cuff leak test [14], because failing the cuff leak test might lead to a delay in extubation and an increase in complications such as barotrauma and ventilator-associated pneumonia. The guideline weakly recommended that the cuff leak test be reserved for high-risk patients: those who experienced a traumatic intubation, were intubated for more than 6 days, have a large endotracheal tube, are female, or were reintubated after an unplanned extubation [14]. Our analysis found that the length of intubation had a small impact on the specificity of the cuff leak test. Female sex and reintubation had no impact on the accuracy of the cuff leak test.
Since the original studies included in our review examined non-selected patients with respect to the risk of post-extubation airway obstruction and the sensitivity of cuff leak test is moderate, we support the idea of the ATS/ACCP guideline to reserve cuff leak test for high-risk patients.
Our study suggested that the cuff leak test has moderate sensitivity and excellent specificity for reintubation. Although the sensitivity in our study was similar to those of previous meta-analyses, the specificity in our study was slightly higher [15,16]. The area under the summary ROC (SROC) curve and the diagnostic odds ratio (DOR) were also greater than previously reported [16]. Thus, a failed cuff leak test may serve as a good marker for those at risk of reintubation if post-extubation airway obstruction is not treated adequately.
The limitations of the cuff leak test have been repeatedly discussed. The test can be affected by the relationship of tube size to laryngeal diameter [41], respiratory system compliance and resistance, inspiratory flow, expiratory flow and time, and airway collapse [51], and clinicians should bear in mind that the performance of the cuff leak test may vary according to the condition or type of patient [52]. Additionally, coughing during cuff deflation hinders accurate measurement of the leak volume and lowers reproducibility. A previous physiological study suggested that the cuff leak volume was reliably measurable while patients were sedated and paralyzed [53]. An adequate amount of sedatives and opioids can suppress coughing during airway suctioning before the cuff leak test or during cuff deflation. Further, cuff leak testing is recommended several hours before extubation, which allows patients to arouse from sedation by the time of extubation. Thus, it may be possible to at least attempt to increase the reliability of cuff leak measurement.
Few tests are available to estimate the risk of post-extubation airway complications. A case series of three patients suggested that video laryngoscopy enabled visualization of laryngeal edema prior to extubation [54], but its clinical efficacy in estimating post-extubation airway complications is yet to be determined. Several studies have examined the role of laryngeal ultrasonography in adult patients. Laryngeal air column width difference is the difference between the width of the airway at the level of the vocal cords with the cuff inflated and deflated. Its reported sensitivity and specificity varied across studies, ranging from 50 to 91% and 54 to 72%, respectively [44,46,49]. Laryngeal air column width ratio is the ratio of air column width before extubation over that after intubation. It has been examined in only one study [55] and needs further validation. Thus, no single available option can correctly estimate the risk of post-extubation airway complications. Clinicians should not overly rely on one single test in predicting the success or failure of extubation.
Our study had several strengths. We conducted a comprehensive search in five databases without language restrictions. This allowed us to conduct relevant subgroup analyses with a larger number of studies. Further, the inclusion of non-English studies supports the generalizability of our findings to various clinical settings [56].
Our study had some limitations. First, the definition of post-extubation airway obstruction differed across studies. Stridor was used as the reference standard more frequently than laryngeal edema (as assessed with endoscopy). Laryngeal edema may be more frequent than stridor, because stridor and respiratory distress occur when laryngeal edema narrows the airway by ≥ 50% [57]. However, laryngeal edema is not always screened for in extubated patients, and the presence of stridor is an accepted sign of respiratory distress that triggers a concern for airway obstruction. Thus, the findings of our study are generalizable to clinical practice. Second, although we attempted to include in the analysis the incidence of stridor due to post-extubation airway events, some patients might have had a concurrent clinical state that necessitated high minute ventilation or tachypnea through an edematous airway, which manifested as 'stridor.' Therefore, we might not have been able to completely separate stridor due to post-extubation airway events from stridor due to other etiologies, such as respiratory insufficiency. This limitation also applies to reintubation. Third, whether to reintubate patients is subject to the treating physicians' discretion and the effect of treatment to abort post-extubation stridor. Therefore, the value of the cuff leak test in predicting the need for reintubation in clinical practice may be limited. However, prevention of post-extubation airway obstruction is more important than reintubation per se. Once the cuff leak test identifies patients at high risk of post-extubation airway obstruction, prophylactic systemic corticosteroids are indicated [9,14,58]. Fourth, 15 out of 23 studies that assessed the quantitative cuff leak test determined the cutoff with knowledge of the results of the reference standard. It is known that data-driven optimization of the cutoff can lead to overestimation of test performance [59]. Thus, the pooled accuracy of the quantitative cuff leak test in our study may be an overestimate, and the optimal cutoff is still unknown. Fifth, the quantitative cutoff for a positive test varied between studies. Because we had only aggregate data from each included study, we could not determine the optimal cutoff of the cuff leak test. Finally, there was substantial statistical heterogeneity in the pooled sensitivity and specificity for both outcomes. The presence of statistical heterogeneity is a common issue intrinsic to the meta-analysis of diagnostic accuracy of a test, given the clinical and methodological diversity in original studies as well as the possible relationship between sensitivity and specificity, as exemplified in ROC curves in which more sensitive cut points have lower specificity (and vice versa). Our subgroup analyses suggested that the type of cuff leak test (quantitative versus qualitative) and length of intubation might have been the cause of statistical heterogeneity of sensitivity and specificity, respectively, for post-extubation airway obstruction.
[Figure caption fragment: The prediction region illustrates the extent of statistical heterogeneity by depicting a region within which there is 95% confidence that the true sensitivity and specificity of a future study should lie.]

Conclusion
The cuff leak test has excellent specificity but moderate sensitivity for post-extubation airway obstruction. The cuff leak test is a useful tool in decision-making about extubation, but the low sensitivity suggests that a negative test cannot completely exclude post-extubation airway obstruction and that patients still need to be closely monitored post-extubation. The high specificity suggests that clinicians should consider treating patients who have a positive test with systemic corticosteroids. Continued research to find better modalities to rule out post-extubation airway obstruction is needed.