Theoretical considerations: when is a gene-association hypothesis supported?
The P value is the probability of obtaining a result equal to or more extreme than what was actually observed, assuming that the null hypothesis (i.e., no difference between groups) is in fact true and that all model assumptions are satisfied (i.e., there is no selection, attrition, analysis, or reporting bias, so that only chance is operating) [9, 10]. The “P value fallacy” is “the illusion that conclusions can be produced with certain ‘error rates’ without consideration of information from outside the experiment” [9]. The fallacy is to think that the P value is the probability of a hypothesis, which would require inductive reasoning back from the evidence (observations) to the underlying truth [9–11]. This leads to misinterpretations of the P value (Table 2). Making the inductive inference about the probability of a hypothesis requires Bayesian methods.
Bayesian methods are conceptually simple: (Prior odds of null hypothesis) × (Bayes factor) = (Posterior odds of null hypothesis) [9]. The prior odds are based on evidence external to the study concerning the plausibility of the null hypothesis; in a field of study, the pre-study odds of an association are the ratio of the number of “true relationships” to “no relationships” among those tested in the field [12], and the prior odds of the null hypothesis are the reciprocal of this ratio. The Bayes factor (BF) measures the relative support, from the observed evidence, for two hypotheses: (Probability of the data given the null hypothesis)/(Probability of the data given the alternative hypothesis). The BF converts the prior odds of the null hypothesis into the posterior odds, and hence into the posterior probability of the null hypothesis (or, if one reverses the numerator and denominator, into the post-study probability that there is a true association: the positive predictive value (PPV)). One can calculate, from the same numbers used to calculate the P value, the minimum BF: the strongest evidence against the null hypothesis, obtained by using the best-supported hypothesis (the observed association) as the alternative hypothesis [11]. One can also calculate the PPV of a statistically significant finding from the prior probability of an association and the BF based on power and alpha [BF = α/(1 − β), i.e., the probability of a significant result under the null hypothesis divided by its probability under the alternative hypothesis, where 1 − β is the power], together with an allowance for bias (which affects the accuracy of the alpha and also reflects our estimate of the prior odds) [12–14]. The PPV is lowered by low study power (smaller studies with small expected effect sizes), low pre-study odds (hypothesis-generating experiments), bias (flexibility in designs, definitions, outcomes, and analytic modes), and a larger number of teams working in the field (hotter scientific fields) [12]. There are some surprising results of Bayesian methods (Table 2).
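The odds-form update and the minimum BF described above can be illustrated with a short calculation. The following is a minimal sketch, not the method of any cited study: it assumes that the minimum BF takes the commonly used form exp(−z²/2) for a two-sided P value under a normal approximation, and the function names and the illustrative prior of 0.99 for the null hypothesis are our own assumptions.

```python
import math
from statistics import NormalDist


def min_bayes_factor(p_two_sided: float) -> float:
    """Minimum Bayes factor for a two-sided P value.

    Assumes the form exp(-z^2 / 2) under a normal approximation,
    where z is the test statistic corresponding to the P value.
    """
    z = NormalDist().inv_cdf(1.0 - p_two_sided / 2.0)
    return math.exp(-z * z / 2.0)


def posterior_prob_null(prior_prob_null: float, bayes_factor: float) -> float:
    """Posterior probability of the null hypothesis.

    Implements (prior odds of null) x (Bayes factor) = (posterior odds of null),
    converting probabilities to odds and back.
    """
    prior_odds = prior_prob_null / (1.0 - prior_prob_null)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Illustrative prior only: 0.99 for the null, i.e. 0.01 for an association.
    for p in (0.05, 0.01, 0.001):
        bf = min_bayes_factor(p)
        post = posterior_prob_null(0.99, bf)
        print(f"P = {p}: minimum BF = {bf:.4f}, posterior P(null) at least {post:.3f}")
```

Because the minimum BF is the most favorable case for the alternative hypothesis, the posterior probability of the null hypothesis computed this way is a lower bound under the stated assumptions; any other alternative hypothesis would leave the null hypothesis more probable.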
Empirical considerations: when is a gene-association hypothesis supported?
There is evidence to support the predictions from Bayesian methods in interpreting study results (Table 2). This is particularly so in genetic-association studies, where the expected true odds ratios (i.e., when there is a true association) for common SNPs in common complex diseases (such as ARDS) are repeatedly found to be 1.1–1.4; this means that studies have low power unless there are >1000 subjects [12, 15]. Bayesian methods keep statistical evidence (conveyed traditionally by the P value and more usefully by the BF) distinct from inductive inferences about hypotheses. The empirical evidence (Table 2) suggests that these methods are useful because they incorporate data external to the study (estimation of priors) in order to arrive at a conclusion about a hypothesis (the posterior probability that the probed association is true) [9–12].
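The link between small odds ratios and low power can be checked with a rough calculation. The sketch below is illustrative only and rests on assumptions that are not taken from the cited studies: an allele-based two-proportion z-test, two independent alleles per subject, a control allele frequency of 0.3, equal numbers of cases and controls, and a two-sided alpha of 0.05. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def allele_test_power(odds_ratio: float, control_allele_freq: float,
                      n_cases: int, n_controls: int,
                      alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test comparing allele frequencies."""
    p0 = control_allele_freq
    # Allele frequency in cases implied by the allelic odds ratio.
    p1 = odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)
    # Each subject contributes two alleles (assumed independent).
    se = math.sqrt(p0 * (1.0 - p0) / (2 * n_controls)
                   + p1 * (1.0 - p1) / (2 * n_cases))
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    z_effect = abs(p1 - p0) / se
    # Probability that the test statistic falls in either rejection region.
    return nd.cdf(z_effect - z_crit) + nd.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    # Total subjects = cases + controls; odds ratios span the 1.1-1.4 range.
    for odds_ratio in (1.1, 1.2, 1.4):
        for n_per_group in (250, 500, 2000):
            power = allele_test_power(odds_ratio, 0.3, n_per_group, n_per_group)
            print(f"OR = {odds_ratio}, subjects = {2 * n_per_group}: power = {power:.2f}")
```

Changing the assumed allele frequency or the alpha shifts the numbers, but the qualitative pattern is the one described above: at odds ratios near the lower end of the range, power stays low until the sample is large.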
Interpreting ARDS gene-association studies
Using the growing cohort of patients, six ARDS gene-association studies have been published by this group (Table 1) [2–7]. These reports were well done according to reporting guidelines [15]. We ask three questions to improve interpretation of these (and, in general, future) gene-association studies in critical care.
1. Priors: how likely is an association to be expected given information external to the study? Considerations are listed in Table 3. In gene-association studies for complex diseases, the prior is usually in the range of 0.001 (SNPs with only limited prior evidence) to 0.1 (SNPs that already show fairly compelling evidence for association), and in non-replication studies is likely closer to 0.001 [14].
2. Minimum BF and PPV: what does the evidence from the study show us? Considerations are listed in Table 3. Assuming a P value threshold of 0.01 to 0.001 and a power of 0.5, the PPV is 38–82% for a prior of 0.01 and 8–40% for a prior of 0.001 [14].
3. Bias: do we need to modify these estimates for study bias? Considerations are listed in Table 3. If bias is low (0.05–0.2, where bias is “the proportion of probed analyses that would not have been ‘research findings’, but nevertheless end up presented and reported as such, because of bias”), power is 50%, and the prior is very high (0.1), the probability (PPV) that a statistically significant finding is true is 35–55% [12].
Table 3 Considerations relevant to interpretation of the results of gene-association studies for the risk of ARDS
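The three questions above can be combined in a single calculation of the PPV from the prior, alpha, power, and bias. The sketch below assumes the widely used expression PPV = ([1 − β]R + uβR)/(R + α − βR + u − uα + uβR), where R is the pre-study odds of a true association and u is the bias proportion; we take this to be the form underlying [12], which is an assumption on our part. The function name, and the alpha of 0.05 used in the bias scenario, are also assumptions. The percentages quoted in the list come from the cited sources, whose inputs may differ, so this sketch is not expected to reproduce them exactly.

```python
def ppv(prior_prob: float, alpha: float, power: float, bias: float = 0.0) -> float:
    """PPV of a statistically significant finding.

    prior_prob: prior probability that the association is true.
    alpha:      significance threshold (type I error rate).
    power:      1 - beta.
    bias:       assumed proportion u of analyses reported as findings
                only because of bias.
    """
    r = prior_prob / (1.0 - prior_prob)  # pre-study odds R
    beta = 1.0 - power
    # Significant findings arising from true associations.
    true_pos = (1.0 - beta) * r + bias * beta * r
    # Significant findings arising when there is no association.
    false_pos = alpha + bias * (1.0 - alpha)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Questions 1 and 2: priors and P value thresholds, power 0.5, no bias.
    for prior in (0.01, 0.001):
        for alpha in (0.01, 0.001):
            print(f"prior = {prior}, alpha = {alpha}: PPV = {ppv(prior, alpha, 0.5):.2f}")
    # Question 3: prior 0.1, power 0.5, assumed alpha 0.05, low bias.
    for u in (0.05, 0.2):
        print(f"prior = 0.1, bias = {u}: PPV = {ppv(0.1, 0.05, 0.5, bias=u):.2f}")
```

With bias set to zero, the expression reduces to multiplying the pre-study odds by power/alpha, which is the odds-form Bayesian update for a finding classified only as statistically significant or not.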