Interpretation of gene associations with risk of acute respiratory distress syndrome: P values, Bayes factors, positive predictive values, and need for replication

Rimpau, Sebastian; Joffe, Ari R.

doi:10.1186/s13054-016-1550-8

Critical Care

Table 2 Some surprising statements about P values, results of Bayesian methods, and empirical evidence supporting the predictions of Bayesian methods

Surprising P-value misinterpretations (false statements)	Correction	Reference
The P value is the probability that the null hypothesis is true	The P value assumes the null hypothesis is true	[10]
P ≤ 0.05 means the null hypothesis is false, or should be rejected	P ≤ 0.05 simply flags the data as being unusual if all the assumptions used to compute it were correct	[10]
P > 0.05 means the null hypothesis is true, or should be accepted	P > 0.05 only suggests that the data are not unusual if all the assumptions used to compute it were correct; the same data would also not be unusual under many other hypotheses	[10]
If you reject the null hypothesis because P ≤ 0.05, the chance your “significant finding” is a false positive is 5%	The P value only refers to how often you would be in error over very many uses of the test across different studies, and not in a single use of the test	[10]
Surprising results of Bayesian methods
In late-phase clinical trials with equipoise (the prior probability of the null hypothesis is 50%), a study with P = 0.05 makes the posterior probability of the null hypothesis no less than 13%		[11]
In more exploratory research (the prior probability of the null hypothesis is, say, 75%), a study with P = 0.05 or P = 0.01 makes the posterior probability of the null hypothesis no less than 31% and 10%, respectively		[11]
An adequately powered (80%) exploratory epidemiologic (prior 1:10, bias 0.3, α = 0.05) study with a statistically significant finding has a positive predictive value (PPV) of 20% and, if underpowered (20%), a PPV of 10%		[12]
In large traditional cohort studies (prior 1:20, bias 0.1, α = 0.05, power 90%), the false positive to false negative ratio of findings is 32:1		[13]
In a well-done (power 95%, α = 0.05) cohort study testing SNPs with less than compelling evidence (prior 1:100), a statistically significant finding (P = 0.05 or P = 0.01) has a PPV of 16.1% and <60%, respectively; even with fairly compelling prior evidence (prior 1:10), the PPV is 67.9% and <90%, respectively		[14]
Surprising empirical evidence supporting the predictions of Bayesian methods
In traditional genome epidemiology [a “few candidate risk factors are selected based on diverse considerations” (low prior); small sample size (low power, given the small size of expected effect); “discovery hunting using conventional levels” of statistical significance, confounding, selective reporting (bias)], the crude replication rate of statistically significant genetic associations is ~1.2%		[13]
Hallmarks of discovery exploratory research (low priors, low Bayes factors [BF], high bias): “vibration of effects” (evidence of inflated early effect sizes in epidemiologic associations), “Proteus phenomenon” (a rapid early sequence of extreme, opposite results in retrospective hypothesis-generating molecular research), and “winner's curse” (the first positive study provides inflated estimates compared to reality)		[12, 13]
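
The posterior probabilities quoted in the Bayesian rows of Table 2 that cite [11] (13%, 31%, and 10%) can be approximately reproduced with a short calculation. The sketch below assumes the minimum Bayes factor exp(−z²/2), where z is the standard normal quantile for a two-sided P value; whether [11] uses exactly this bound is an assumption, and the function names are illustrative only.

```python
import math
from statistics import NormalDist


def min_bayes_factor(p_value: float) -> float:
    """Smallest Bayes factor (null vs. alternative) for a two-sided P value.

    Assumption: the bound exp(-z^2 / 2), with z the standard normal
    quantile that corresponds to the two-sided P value.
    """
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return math.exp(-z * z / 2.0)


def min_posterior_null(prior_null: float, p_value: float) -> float:
    """Lower bound on the posterior probability of the null hypothesis."""
    prior_odds = prior_null / (1.0 - prior_null)       # odds in favor of the null
    posterior_odds = prior_odds * min_bayes_factor(p_value)
    return posterior_odds / (1.0 + posterior_odds)     # convert odds back to probability


if __name__ == "__main__":
    # (prior probability of the null, P value) pairs taken from the table rows
    for prior_null, p_value in [(0.50, 0.05), (0.75, 0.05), (0.75, 0.01)]:
        bound = min_posterior_null(prior_null, p_value)
        print(f"prior={prior_null:.2f}, P={p_value}: posterior >= {bound:.3f}")
```

Under this assumption the three bounds round to the 13%, 31%, and 10% given in the table. Because the bound uses the Bayes factor most favorable to the alternative, any realistic alternative leaves the null more probable than these figures.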

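The PPV and false positive to false negative figures in the rows citing [12], [13], and [14] follow from simple arithmetic on prior, power, α, and bias. The sketch below is one way to check them, under stated assumptions: bias is modeled as the fraction of otherwise non-significant analyses that end up reported as significant, "prior" is read as pre-study odds in the rows citing [12] and [13], and as a prior probability (0.01 or 0.10) in the row citing [14]. These readings and all function names are assumptions, not taken from the cited references.

```python
def ppv_with_bias(power: float, prior_odds: float, bias: float,
                  alpha: float = 0.05) -> float:
    """PPV of a significant finding when a fraction `bias` of analyses that
    would otherwise be non-significant are reported as significant."""
    beta = 1.0 - power
    true_pos = (power + bias * beta) * prior_odds   # true relationships flagged
    false_pos = alpha + bias * (1.0 - alpha)        # null relationships flagged
    return true_pos / (true_pos + false_pos)


def fp_fn_ratio(power: float, prior_odds: float, bias: float,
                alpha: float = 0.05) -> float:
    """Ratio of false positive to false negative findings under the same model."""
    false_pos = alpha + bias * (1.0 - alpha)
    false_neg = (1.0 - bias) * (1.0 - power) * prior_odds
    return false_pos / false_neg


def ppv_no_bias(power: float, prior_prob: float, alpha: float = 0.05) -> float:
    """PPV without bias, with the prior given as a probability."""
    true_pos = power * prior_prob
    false_pos = alpha * (1.0 - prior_prob)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    print(f"adequately powered exploratory study: {ppv_with_bias(0.80, 1 / 10, 0.3):.3f}")
    print(f"underpowered exploratory study:       {ppv_with_bias(0.20, 1 / 10, 0.3):.3f}")
    print(f"large cohort FP:FN ratio:             {fp_fn_ratio(0.90, 1 / 20, 0.1):.1f}")
    print(f"SNP study, prior 0.01:                {ppv_no_bias(0.95, 0.01):.3f}")
    print(f"SNP study, prior 0.10:                {ppv_no_bias(0.95, 0.10):.3f}")
```

With these assumptions the adequately powered case gives a PPV of about 0.20, the cohort ratio comes to about 32:1, and the SNP cases give about 0.161 and 0.679, matching the table. The underpowered case works out to roughly 0.12 under this particular model, near the 10% quoted. The P = 0.01 bounds (<60% and <90%) are not reproduced by this simple formula and presumably rest on a different calculation in [14].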
ISSN: 1364-8535