A significant p value is not equivalent to the superiority of one test index over another

The Original Article was published on 11 September 2019

Dear Editor,

We read with great interest the study by Kazune and colleagues [1], which reported that higher mottling scores are associated with lower microcirculatory oxygen saturation. However, the authors inappropriately concluded that microcirculatory oxygen saturation was more specific than the mottling score in predicting 28-day mortality. This conclusion was based on a multivariable regression model in which microcirculatory oxygen saturation was significantly associated with 28-day mortality (p = 0.008), whereas the mottling score was not associated with mortality after adjustment (p = 0.61, 0.11, and 0.29 for mottling scores of 1, 2, and 3, respectively, as compared with a score of 0). We must point out that the p values in a multivariable logistic regression model cannot be interpreted as measures of the strength of association, and that the diagnostic performance of two indices cannot be compared by means of these p values. The global measure of the diagnostic performance of an index is the area under the receiver operating characteristic curve (AUROC) or concordance index. Statistical inference on the difference between two AUROCs can be carried out using DeLong’s method [2]. Sensitivity and specificity change as the cutoff value separating event from non-event is varied; thus, without specified cutoff values for the mottling score and microcirculatory oxygen saturation, it does not make sense to compare their sensitivities and specificities.
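
To illustrate the comparison described above, the following is a minimal Python sketch of DeLong's nonparametric test [2] for two correlated AUROCs, together with a helper showing how sensitivity and specificity depend on the chosen cutoff. It is not the analysis of the original study: the data are synthetic, the names `marker_a`, `marker_b`, and `died` are hypothetical, and higher marker values are assumed to indicate higher risk.

```python
import math
import numpy as np

def placement_values(pos, neg):
    """Per-case (V10) and per-control (V01) placement values for one marker."""
    # psi = 1 if the case scores higher than the control, 0.5 if tied, 0 otherwise
    diff = pos[:, None] - neg[None, :]
    psi = (diff > 0) + 0.5 * (diff == 0)
    return psi.mean(axis=1), psi.mean(axis=0)

def delong_test(y, score_a, score_b):
    """Two-sided DeLong test for two AUROCs measured on the same patients."""
    y = np.asarray(y, dtype=bool)
    aucs, v10, v01 = [], [], []
    for s in (np.asarray(score_a, float), np.asarray(score_b, float)):
        a, b = placement_values(s[y], s[~y])
        v10.append(a)
        v01.append(b)
        aucs.append(a.mean())  # mean placement value equals the AUROC
    m, n = y.sum(), (~y).sum()
    # 2x2 covariance matrix of the two AUROC estimates
    cov = np.cov(np.vstack(v10)) / m + np.cov(np.vstack(v01)) / n
    var_diff = cov[0, 0] + cov[1, 1] - 2 * cov[0, 1]
    z = (aucs[0] - aucs[1]) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return aucs[0], aucs[1], z, p

def sens_spec(y, score, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' predicts the event."""
    y = np.asarray(y, dtype=bool)
    pred = np.asarray(score) >= cutoff
    return (pred & y).sum() / y.sum(), (~pred & ~y).sum() / (~y).sum()

# Synthetic example (assumed data, for illustration only)
rng = np.random.default_rng(0)
died = rng.random(200) < 0.4
severity = rng.normal(size=200) + 1.0 * died
marker_a = severity + rng.normal(scale=0.5, size=200)
marker_b = severity + rng.normal(scale=1.0, size=200)

auc_a, auc_b, z, p = delong_test(died, marker_a, marker_b)
print(f"AUROC A = {auc_a:.3f}, AUROC B = {auc_b:.3f}, z = {z:.2f}, p = {p:.3f}")

# The same marker yields different sensitivity/specificity at each cutoff
for cutoff in (0.0, 0.5, 1.0):
    se, sp = sens_spec(died, marker_a, cutoff)
    print(f"cutoff {cutoff:.1f}: sensitivity = {se:.2f}, specificity = {sp:.2f}")
```

The p value returned here refers to the difference between the two AUROCs, which is the quantity relevant to a claim that one index outperforms another; it is unrelated to the p values of regression coefficients.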

Furthermore, the authors made no assumptions about the causal relationships between these variables and simply entered all of them into a multivariable regression model. Distinguishing mediators from confounders is important. For example, microcirculatory oxygen saturation may be a direct cause of vascular injury, as measured by elevated biomarkers such as intercellular adhesion molecule-1 and vascular cell adhesion molecule-1, and thus have an indirect effect on mortality. Such an association is better analyzed using causal mediation analysis [3]. Another important issue we must point out is the influence of vasopressors on the mottling score. Pathophysiologically, vasopressors can influence the microcirculation in the skin, as well as the microcirculatory oxygen saturation as measured in the study [4]. In the framework of causal inference [5], vasopressor use can be an important confounder because it has a direct causal effect on the mottling score, microcirculatory oxygen saturation, and the outcome. Such a confounder can be handled in a multivariable model or within the counterfactual framework. However, the authors failed to do so.
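
The role of a confounder can be illustrated with a small simulation. The sketch below is a toy example under stated assumptions, not a reanalysis of the study: the variables `vasopressor`, `mottling`, and `outcome` are simulated, the effect sizes are arbitrary, and a linear model with a continuous outcome is used only to keep the example dependency-free (a logistic model would be the natural choice for 28-day mortality).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical confounder: affects both the exposure and the outcome
vasopressor = rng.normal(size=n)
# Exposure is partly driven by the confounder
mottling = 0.8 * vasopressor + rng.normal(size=n)
# Outcome depends on the exposure (true effect 0.5) and on the confounder
true_effect = 0.5
outcome = true_effect * mottling + 1.0 * vasopressor + rng.normal(size=n)

def ols(columns, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Coefficient of the exposure without and with adjustment for the confounder
crude = ols([mottling], outcome)[1]
adjusted = ols([mottling, vasopressor], outcome)[1]

print(f"crude estimate:    {crude:.2f}")
print(f"adjusted estimate: {adjusted:.2f} (true effect {true_effect})")
```

In this simulation the unadjusted coefficient overstates the effect of the exposure because part of the confounder's effect is attributed to it, whereas the adjusted coefficient lies close to the simulated true value. Adjusting for a mediator in the same way would instead remove part of the effect of interest, which is why the causal role of each variable has to be specified beforehand.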

Authors’ response

We thank Drs Ji and Li for their interest in our article [1]. Drs Ji and Li question our conclusions regarding the accuracy of microcirculatory oxygen saturation and mottling score in predicting 28-day mortality. We would like to point out that we used the C-statistic to assess the diagnostic performance of microcirculatory oxygen saturation and mottling score, as suggested by Drs Ji and Li in their letter. The C-statistic values of the univariate prediction models using microcirculatory oxygen saturation and mottling score were 0.76 and 0.79, respectively, and both are clearly stated in the results section. In the results section of our article, we also stated the cutoff values of both tests (a score of 2 or more for mottling and 26% for microcirculatory oxygen saturation). Our conclusions regarding the accuracy of microcirculatory oxygen saturation and mottling score in predicting 28-day mortality are based on the above data rather than on p values in the multivariable logistic regression model.

Another point raised by Drs Ji and Li regarding our study is the selection of independent variables for inclusion into the multivariable regression model. Although explanatory models that consider causal relationships between variables are used in etiological research, there are other purposes for the use of such models [6]. In our study, we built a descriptive model to capture the association between 28-day survival, biochemical markers of endothelial dysfunction, and skin microcirculatory hypoperfusion resulting from endothelial dysfunction.

We would also like to disagree with the objections of Drs Ji and Li regarding vasopressors as an important confounder in predicting mortality when using markers of tissue hypoperfusion such as mottling score, as this contradicts previous findings that mottling score is predictive of death irrespective of vasopressor dose [7].

Availability of data and materials

No data for the work.


  1. Kazune S, Caica A, Volceka K, Suba O, Rubins U, Grabovskis A. Relationship of mottling score, skin microcirculatory perfusion indices and biomarkers of endothelial dysfunction in patients with septic shock: an observational study. Crit Care. 2019;23(1):311.

  2. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837.

  3. Zhang Z, Zheng C, Kim C, Van Poucke S, Lin S, Lan P. Causal mediation analysis in the context of clinical research. Ann Transl Med. 2016;4:425–5.

  4. Zhang Z, Chen K. Vasoactive agents for the treatment of sepsis. Ann Transl Med. 2016;4:333–3.

  5. Robins JM, Hernán MÁ, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11:550–60.

  6. Shmueli G. To explain or to predict? Stat Sci. 2010;25(3):289–310.

  7. Dumas G, Lavillegrand J-R, Joffre J, Bigé N, de Moura EB, Baudel J-L, et al. Mottling score is a strong predictor of 14-day mortality in septic patients whatever vasopressor doses and other tissue perfusion parameters. Crit Care. 2019;23(1):211.

No funding.

Author information

QJ conceived the idea and drafted the manuscript; WL helped interpret the results. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Weimin Li.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Ji, Q., Li, W. A significant p value is not equivalent to the superiority of one test index over another. Crit Care 23, 359 (2019).
