Interpretation of gene associations with risk of acute respiratory distress syndrome: P values, Bayes factors, positive predictive values, and need for replication

The Original Article was published on 05 September 2016

Single nucleotide polymorphisms (SNPs) in certain genes play a role in the observed variability in development and severity of acute respiratory distress syndrome (ARDS). Identified SNPs can direct future studies aiming to target diagnostic, preventive, and therapeutic interventions for the complex pathophysiology of ARDS [1]. For example, CFTR is involved in fluid absorption from alveoli and in negatively modulating the inflammatory response [2, 3], and Perez-Marques et al. report that SNPs in DNA for proteins involved in splicing in the exon 9 region of CFTR mRNA were independently associated with risk for ARDS [2]. The same group has identified other statistically significant candidate gene associations with the risk for pediatric ARDS (Table 1) [2–7]. In adults, other candidate gene associations with risk for development of and outcome from ARDS have been suggested [1, 8]. How should these gene-association hypotheses be interpreted?

Table 1 Studies to identify risk, understand pathophysiology, and provide insight into newer individualized therapies for severity of CAP-induced ARDS in children

Theoretical considerations: when is a gene-association hypothesis supported?

The P value is the probability, assuming that the null hypothesis (i.e., no difference between groups) is in fact true and that all model assumptions (i.e., no selection, attrition, analysis, or reporting bias; only chance is operating) are satisfied, of obtaining a result equal to or more extreme than what was actually observed [9, 10]. The “P value fallacy” is “the illusion that conclusions can be produced with certain ‘error rates’ without consideration of information from outside the experiment” [9]. The fallacy is to think that the P value refers to a hypothesis probability, which would involve inductive reasoning back from evidence (observations) to underlying truth [9–11]. This leads to misinterpretations of the P value (Table 2). Making the inductive inference about hypothesis probability requires Bayesian methods.

Table 2 Some surprising statements about P values, results of Bayesian methods, and empirical evidence supporting the predictions of Bayesian methods

Bayesian methods are conceptually simple: (Prior-odds of null hypothesis)(Bayes factor) = (Posterior-odds of null hypothesis) [9]. The prior-odds are based on evidence external to the study concerning the plausibility of the null hypothesis; in a field of study, the prior-odds of a true association are the ratio of the number of “true relationships” to “no relationships” among those tested in the field [12], and the prior-odds of the null hypothesis are the inverse of this ratio. The Bayes factor (BF) measures the relative support, from the observed evidence, for two hypotheses: (Probability of the data given the null hypothesis)/(Probability of the data given the alternative hypothesis). The BF modifies the prior probability to give the posterior probability of the null hypothesis (or, if one reverses the numerator and denominator, the post-study probability that there is a true association: positive predictive value (PPV)). One can calculate, from the same numbers used to calculate the P value, the minimum BF: the strongest evidence against the null hypothesis, using the best supported hypothesis (the observed association) as the alternative hypothesis [11]. One can also calculate the PPV of a statistically significant finding using the prior probability of an association, the BF based on power and alpha [BF = αβ/(1 − α)(1 − β)], in addition to bias (affecting the accuracy of the alpha and also reflecting our estimate of the prior-odds) [12–14]. The PPV is lowered by low study power (smaller studies with small expected effect sizes), low pre-study odds (hypothesis-generating experiments), bias (flexibility in designs, definitions, outcomes, and analytic modes), and a larger number of teams working in the field (hotter scientific fields) [12]. There are some surprising results of Bayesian methods (Table 2).
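The odds-form calculation described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: it assumes a two-sided P value from an approximately Gaussian test statistic and uses the Gaussian form of the minimum BF, exp(−z²/2), described by Goodman [11]. The function names and the example priors are hypothetical choices made for the illustration.

```python
import math
from statistics import NormalDist


def minimum_bayes_factor(p_value: float) -> float:
    """Minimum BF exp(-z^2 / 2) for a two-sided P value (Gaussian assumption)."""
    z = NormalDist().inv_cdf(1 - p_value / 2)  # z-score matching the P value
    return math.exp(-z * z / 2)


def posterior_prob_association(prior_prob: float, bayes_factor: float) -> float:
    """Posterior probability that the association is true.

    bayes_factor is P(data | null) / P(data | alternative), so it multiplies
    the prior odds of the NULL hypothesis, as in the text.
    """
    prior_odds_null = (1 - prior_prob) / prior_prob
    posterior_odds_null = prior_odds_null * bayes_factor
    return 1 / (1 + posterior_odds_null)


if __name__ == "__main__":
    # Example priors are illustrative assumptions, not values from a study.
    for p in (0.05, 0.01, 0.001):
        bf = minimum_bayes_factor(p)
        for prior in (0.1, 0.01, 0.001):
            upper = posterior_prob_association(prior, bf)
            print(f"P={p}: min BF={bf:.4f}, prior={prior}, posterior at most {upper:.3f}")
```

Because the minimum BF is the strongest evidence the data can provide against the null hypothesis, the posterior probability computed from it is an upper bound; any other alternative hypothesis would give a lower value.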

Empirical considerations: when is a gene-association hypothesis supported?

There is evidence to support the predictions from Bayesian methods in interpreting study results (Table 2). This is particularly so in genetic-association studies, where the expected true odds ratios (when there is a true association) for common SNPs with common complex diseases (such as ARDS) are repeatedly found to be 1.1–1.4; this means that studies have low power unless there are >1000 subjects [12, 15]. This empirical evidence (Table 2) suggests that Bayesian methods, which keep statistical evidence (conveyed traditionally by the P value and more usefully by the BF) distinct from inductive inferences about hypotheses, are useful because they incorporate data external to the study (estimation of priors) in order to arrive at a conclusion about a hypothesis (posterior probability of the probed association being true) [9–12].
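To see why odds ratios of 1.1–1.4 imply low power in small studies, a rough power calculation can help. The sketch below is not taken from the cited studies: it assumes a simple case-control comparison of carrier frequencies with equal numbers of cases and controls, a two-sided normal-approximation test, and a hypothetical carrier frequency of 0.3 in controls. The function name and all default values are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

_NORMAL = NormalDist()


def approx_power(odds_ratio: float, control_freq: float,
                 n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-proportion z-test.

    control_freq is the carrier frequency in controls (an assumed value);
    the carrier frequency in cases is derived from the odds ratio.
    """
    odds_cases = odds_ratio * control_freq / (1 - control_freq)
    case_freq = odds_cases / (1 + odds_cases)
    se = sqrt(control_freq * (1 - control_freq) / n_per_group
              + case_freq * (1 - case_freq) / n_per_group)
    z_crit = _NORMAL.inv_cdf(1 - alpha / 2)
    # Probability of exceeding the critical value in the expected direction;
    # the negligible opposite-tail contribution is ignored.
    return _NORMAL.cdf(abs(case_freq - control_freq) / se - z_crit)


if __name__ == "__main__":
    for n in (100, 500, 1000, 2000, 5000):
        row = ", ".join(f"OR {o}: {approx_power(o, 0.3, n):.2f}"
                        for o in (1.1, 1.2, 1.4))
        print(f"n per group = {n}: {row}")
```

Under these assumptions, power rises slowly with sample size when the odds ratio is near 1.1, which is the pattern the paragraph above describes. Different allele frequencies, genetic models, or significance thresholds would change the numbers.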

Interpreting ARDS gene-association studies

Using the growing cohort of patients, six ARDS gene-association studies have been published by this group (Table 1) [2–7]. These reports were well done according to reporting guidelines [15]. We ask three questions to improve interpretation of these (and, in general, future) gene-association studies in critical care.

  1. Priors: how likely is an association to be expected given information external to the study? Considerations are listed in Table 3. In gene-association studies for complex diseases, the prior is usually in the range of 0.001 (SNPs with only limited prior evidence) to 0.1 (SNPs that already show fairly compelling evidence for association), and in non-replication studies is likely closer to 0.001 [14].

     Table 3 Considerations relevant to interpretation of the results of gene-association studies for the risk of ARDS

  2. Minimum BF and PPV: what does the evidence from the study show us? Considerations are listed in Table 3. Assuming a P value threshold 0.01 to 0.001, for a prior of 0.01 the PPV is (assuming power of 0.5) 38–82%, and for a prior of 0.001, 8–40% [14].

  3. Bias: do we need to modify these estimates for study bias? Considerations are listed in Table 3. If bias is low (0.05–0.2 = “the proportion of probed analyses that would not have been ‘research findings’, but nevertheless end up presented and reported as such, because of bias”), power is 50%, and prior is very high (0.1), the PPV that a statistically significant finding is true is 35–55% [12].
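The way prior, power, alpha, and bias combine in the three questions above can be sketched with the PPV expression given by Ioannidis [12], PPV = ([1 − β]R + uβR)/(R + α − βR + u − uα + uβR), where R is the pre-study odds and u is the bias proportion. This is a minimal illustration, not a reanalysis: treating the stated prior as a probability and converting it to odds is an assumption, the function name is hypothetical, and the printed values need not reproduce the ranges quoted from [12, 14], which depend on the assumptions made in those papers.

```python
def ppv(prior_prob: float, power: float, alpha: float, bias: float = 0.0) -> float:
    """Positive predictive value of a statistically significant finding.

    prior_prob: prior probability of a true association (converted to odds R)
    power:      1 - beta
    alpha:      significance threshold (type I error rate)
    bias:       u, the proportion of analyses reported as findings only
                because of bias (0 means no bias)
    """
    r = prior_prob / (1 - prior_prob)  # pre-study odds R
    beta = 1 - power
    true_findings = (1 - beta) * r + bias * beta * r
    false_findings = alpha + bias * (1 - alpha)
    return true_findings / (true_findings + false_findings)


if __name__ == "__main__":
    # Illustrative inputs only; the bias values are those named in the text.
    for prior in (0.1, 0.01, 0.001):
        for alpha in (0.05, 0.01, 0.001):
            for u in (0.0, 0.05, 0.2):
                print(f"prior={prior}, alpha={alpha}, bias={u}: "
                      f"PPV={ppv(prior, 0.5, alpha, u):.2f}")
```

With bias set to zero the expression reduces to (1 − β)R/(R − βR + α), so the same function covers both the second and third questions.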

Conclusions

The observed evidence (the P value, or better yet, the BF) can be combined with prior considerations of plausibility to determine how well two hypotheses are supported (posterior probability, PPV). The posterior probability (PPV) that there is an association between exploratory SNPs and severity of ARDS in children is low given the low prior probability, the modest BF (reflected in modest P values and power), and potential for bias. This is not necessarily a problem if our interest is in generating hypotheses for further scientific study [13]. An interesting hypothesis has been suggested (i.e., a gene association) and warrants further investigation; we should wait for replication in additional larger studies before accepting this hypothesis. These future studies will have a prior probability that is closer to 0.1 (the posterior probability after the current study), and thus replication would move us much further toward accepting the hypothesis [13–15]. Overall, caution is warranted: most genetic associations for ARDS in adults have not replicated [8].

Abbreviations

ARDS: Acute respiratory distress syndrome

BF: Bayes factor

CAP: Community-acquired pneumonia

CELF2: Elav-like family member 2

CFTR: Cystic fibrosis transmembrane conductance regulator

PPV: Positive predictive value

SNP: Single nucleotide polymorphism

TIA-1: T-cell intracellular antigen-1

References

  1. Sapru A, Flori H, Quasney MW, Dahmer MK, for the Pediatric Acute Lung Injury Consensus Conference Group. Pathobiology of acute respiratory distress syndrome. Pediatr Crit Care Med. 2015;16:S6–S22.

  2. Perez-Marques F, Simpson P, Yan K, Quasney MW, Halligan N, Merchant D, Dahmer MK. Association of polymorphisms in genes of factors involved in regulation of splicing of cystic fibrosis transmembrane conductance regulator mRNA with acute respiratory distress syndrome in children with pneumonia. Crit Care. 2016;20:281.

  3. Baughn JM, Quasney MW, Simpson P, Merchant D, Li S, Levy H, Dahmer MK. Association of cystic fibrosis transmembrane conductance regulator gene variants with acute lung injury in African American children with pneumonia. Crit Care Med. 2012;40:3042–9.

  4. Patwari PP, O’Cain P, Goodman DM, Smith M, Krushkal J, Liu C, Somes G, Quasney MW, Dahmer MK. Interleukin-1 receptor antagonist intron 2 variable number of tandem repeats polymorphism and respiratory failure in children with community-acquired pneumonia. Pediatr Crit Care Med. 2008;9:553–9.

  5. Russell R, Quasney MW, Halligan N, Li S, Simpson P, Waterer G, Wunderink RG, Dahmer MK. Genetic variation in MYLK and lung injury in children and adults with community-acquired pneumonia. Pediatr Crit Care Med. 2010;11:731–6.

  6. Dahmer MK, O’Cain P, Patwari PP, Simpson P, Li S, Halligan N, Quasney MW. The influence of genetic variation in surfactant protein B on severe lung injury in African American children. Crit Care Med. 2011;39:1138–44.

  7. Chen J, Wilson ES, Dahmer MK, Quasney MW, Waterer GW, Feldman C, Wunderink RG. Lack of association of the caspase-12 long allele with community-acquired pneumonia in people of African descent. PLoS One. 2014;9(2):e89194.

  8. Tejera P, Meyer NJ, Chen F, Feng R, Zhao Y, O’Mahony DS, et al. Distinct and replicable genetic risk factors for acute respiratory distress syndrome of pulmonary or extrapulmonary origin. J Med Genet. 2012;49:671–80.

  9. Goodman SN. Toward evidence-based medical statistics. 1: the p value fallacy. Ann Intern Med. 1999;130:995–1004.

  10. Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, Altman DG. Statistical tests, P-values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol. 2016;31:337.

  11. Goodman SN. Toward evidence-based medical statistics. 2: the Bayes factor. Ann Intern Med. 1999;130:1005–13.

  12. Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2(8):e124.

  13. Ioannidis JPA, Tarone R, McLaughlin JK. The false-positive to false-negative ratio in epidemiologic studies. Epidemiology. 2011;22:450–6.

  14. Broer L, Lill CM, Schuur M, Amin N, Roehr JT, Bertram L, et al. Distinguishing true from false positives in genomic studies: p values. Eur J Epidemiol. 2013;28:131–8.

  15. Little J, Higgins JPT, Ioannidis JPA, Moher D, Gagnon F, von Elm E, et al. STrengthening the REporting of Genetic Association Studies (STREGA): an extension of the STROBE statement. PLoS Med. 2009;6(2):e1000022.

Acknowledgements

Not applicable.

Funding

There was no funding for this work.

Availability of data and materials

Not applicable.

Authors’ contributions

SR and ARJ made substantial contributions to conception and design and interpretation of data; participated sufficiently in the work to take public responsibility for the content and agreed to be accountable for all aspects of the work. ARJ wrote the first draft of the manuscript. SR revised the manuscript critically for important intellectual content. SR and ARJ have given final approval of the version to be published.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.

Author information

Corresponding author

Correspondence to Ari R. Joffe.

Additional information

See related research by Perez-Marques et al., https://ccforum.biomedcentral.com/articles/10.1186/s13054-016-1454-7 This comment refers to the article available at: http://dx.doi.org/10.1186/s13054-016-1454-7.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Rimpau, S., Joffe, A.R. Interpretation of gene associations with risk of acute respiratory distress syndrome: P values, Bayes factors, positive predictive values, and need for replication. Crit Care 20, 402 (2016). https://doi.org/10.1186/s13054-016-1550-8

  • DOI: https://doi.org/10.1186/s13054-016-1550-8