The limitations of observational studies on the treatment of severe sepsis

Observational studies usually agree with randomised, controlled trials. It is a logical fallacy, however, to suggest that agreement in one direction implies prediction in the other direction. Observational studies are not scientifically capable of proving or disproving hypotheses such as the efficacy and safety of the treatment of severe sepsis with antithrombin. Observational studies are difficult to analyse and interpret because of the heterogeneity of real-life patient populations, the lack of standardised treatment regimens, the lack of standardised indications for treatment, and the lack of predefined endpoints.

AT III = antithrombin III; RCT = randomised, controlled trial.

Messori and colleagues are to be congratulated on their interesting report on the Italian observational study of antithrombin III (AT III) use in intensive care units [1]. Audits of this sort are difficult and time consuming to conduct. Observational studies are also difficult to analyse and interpret because of the heterogeneity of real-life patient populations, the lack of standardised treatment regimens, the lack of standardised indications for treatment, and the lack of predefined endpoints for assessing survival (e.g. 28-day survival or in-hospital mortality). For these reasons we take issue with Messori and colleagues over the very strong conclusion they draw on the basis of their survey. In our opinion, this nonrandomised observational study is not able to provide scientific evidence to support the authors' statements.

Observational study
As pointed out by Pocock and Elbourne, observational studies have one crucial deficiency: the design is not an experimental one [2]. Experimental design involves randomisation, the use of entry criteria, and the rigorous use of standard definitions of index medical conditions such as disseminated intravascular coagulation and septic shock. None of these are to be found in the survey by Messori and colleagues [1]. In the absence of rigorous experimental methodology, it is not possible to be sure that the findings of an observational study are predictive of the results of a randomised, controlled trial (RCT). The published finding that observational studies usually agree with RCTs comes from studies in indications unrelated to sepsis and septic shock [3,4]. Furthermore, it is a logical fallacy to suggest that agreement in one direction implies prediction in the other direction. Observational studies are not scientifically capable of proving or disproving any hypothesis.
With this in mind, it is interesting to note that one of the main observations of Messori and colleagues' survey (n = 56 for sepsis) is at variance with the results from a very much larger RCT reported in the same indication by Warren and colleagues (n = 2341 for sepsis) [5]. Whereas Warren and colleagues' RCT found significantly lower mortality in a prespecified subgroup analysis of AT III-treated patients not receiving heparin (n = 352) compared with placebo-treated patients not receiving heparin (n = 346), Messori and colleagues' survey observed an increase in mortality in patients not receiving heparin. This casts serious doubt over the scientific value of a survey reporting outcomes of a treatment to which patients were not randomised, with only 56 patients with sepsis treated with AT III. This type of study is not scientifically or statistically capable of proving causality or supporting such statements.
Other crucial weaknesses of Messori and colleagues' survey lie in its observational nature and the lack of any standardisation of definitions or outcomes. Controlled studies usually report 28-day mortality whereas observational studies report hospital mortality, which is normally higher than 28-day mortality. In addition, hospital mortality may depend on local factors affecting clinical practice, such as when to discharge patients from hospital, which varies among countries.

Overview of studies
Messori and colleagues also report the results of a meta-analysis of four studies of AT III in sepsis [5-8]. Conclusions are drawn based on confidence intervals calculated for their meta-analysis. It is probably unwise, however, to place too much weight on meta-analyses of studies conducted in critical care settings because the spectacular failure of meta-analyses to predict the outcomes of subsequent large-scale RCTs is well known [9-11].
Furthermore, if confidence intervals overlap or P > 0.05, this does not prove that there is no difference; on the contrary, there is still a possibility that the observation suffers from a type II error (failure to detect a true difference due to inadequate sample size). This is particularly likely to occur when the number of patients studied is small, as in the phase II trials on AT III. In this context, it is misleading of Messori and colleagues to use Figure 1 of their paper to suggest, with reference to the confidence intervals, that there is no difference between the studies or that their observational study has a very similar result to the other studies cited. In the same way, potentially misleading claims are made about the subgroup analysis of the effects of coadministered heparin. In the results section it is suggested, erroneously, that no differences exist between results on the basis of P values alone.
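
The point about type II error can be illustrated with a rough power calculation. The following is a minimal sketch only, using a standard normal approximation for a two-sided comparison of two proportions. The mortality rates (50% versus 40%) and the group sizes are hypothetical values chosen for illustration; they are not taken from Messori and colleagues' survey or from any of the cited trials, and the function name is our own.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Normal approximation with equal group sizes; the negligible
    contribution of the opposite rejection tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    # Standard error under the null hypothesis (pooled proportion)
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    # Standard error under the alternative hypothesis
    se_alt = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    return z.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


# Hypothetical mortality rates, for illustration only
CONTROL_MORTALITY = 0.50
TREATED_MORTALITY = 0.40

for n in (25, 100, 1000):  # hypothetical patients per group
    power = power_two_proportions(CONTROL_MORTALITY, TREATED_MORTALITY, n)
    print(f"n per group = {n:5d}: power = {power:.2f}, "
          f"type II error risk = {1 - power:.2f}")
```

Under these assumed figures, a comparison with a few dozen patients per group would miss a true 10 percentage point difference in mortality far more often than it would detect it, so a nonsignificant P value from such a comparison says little about whether a difference exists.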

Conclusion
Surveys and audits are very important ways of documenting routine clinical practice and of confirming the implementation of guidelines based on the results of RCTs. Surveys and audits are also useful for generating hypotheses, but they are not a substitute for RCTs when proving or disproving hypotheses. This is because surveys are not based on an experimental design. Messori and colleagues have raised interesting concerns but they have not provided answers to those concerns.