A significant p value is not equivalent to the superiority of one test index over another

Qiaoying Ji and Weimin Li
This comment refers to the article available at https://doi.org/10.1186/s13054-019-2589-0.

Dear Editor,
We read with great interest the study by Kazune and colleagues [1], which reported that higher mottling scores are associated with lower microcirculatory oxygen saturation. However, the authors inappropriately concluded that microcirculatory oxygen saturation was more specific than the mottling score in predicting 28-day mortality. This conclusion was based on a multivariable regression model in which microcirculatory oxygen saturation was significantly associated with 28-day mortality (p = 0.008), whereas the mottling score was not associated with mortality after adjustment (p = 0.61, 0.11, and 0.29 for mottling scores of 1, 2, and 3, respectively, compared with a score of 0). We must point out that p values in a multivariable logistic regression model cannot be interpreted as the strength of an association, and the diagnostic performance of two indices cannot be compared using these p values. The global measure of the diagnostic performance of an index is the area under the receiver operating characteristic curve (AUROC), or concordance index. Whether two AUROCs differ can be tested statistically with DeLong's method [2]. Sensitivity and specificity change with the cutoff value used to separate events from non-events; thus, without specified cutoff values for the mottling score and microcirculatory oxygen saturation, it does not make sense to compare their sensitivities or specificities.
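To make the suggested comparison concrete, the following is a minimal sketch of a DeLong-type test for two correlated AUROCs measured on the same patients, together with a helper showing how sensitivity and specificity move with the cutoff. It is an illustration only, not the analysis used in either publication: the function names (`delong_test`, `sens_spec`) and all data values are hypothetical, and a higher score is assumed to mean higher risk, so oxygen saturation is negated before use.

```python
import math

def _psi(x, y):
    """Mann-Whitney kernel: 1 if x > y, 0.5 if tied, 0 otherwise."""
    if x > y:
        return 1.0
    return 0.5 if x == y else 0.0

def _placements(scores, labels):
    """Return (AUROC, V10, V01) for one marker; higher score = higher risk."""
    pos = [s for s, d in zip(scores, labels) if d == 1]
    neg = [s for s, d in zip(scores, labels) if d == 0]
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return sum(v10) / len(v10), v10, v01

def _cov(a, b):
    """Sample covariance with an (n - 1) denominator."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(scores_a, scores_b, labels):
    """Two-sided DeLong test; returns (auc_a, auc_b, z, p)."""
    auc_a, v10_a, v01_a = _placements(scores_a, labels)
    auc_b, v10_b, v01_b = _placements(scores_b, labels)
    m, n = len(v10_a), len(v01_a)  # numbers of events and non-events
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b)
            - 2 * _cov(v10_a, v10_b)) / m
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b)
              - 2 * _cov(v01_a, v01_b)) / n)
    if var == 0:
        raise ValueError("variance of the AUROC difference is zero")
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return auc_a, auc_b, z, p

def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' counts as positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, d in pairs if d == 1 and s >= cutoff)
    fn = sum(1 for s, d in pairs if d == 1 and s < cutoff)
    tn = sum(1 for s, d in pairs if d == 0 and s < cutoff)
    fp = sum(1 for s, d in pairs if d == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data: 1 = died within 28 days, 0 = survived.
died = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
mottling = [3, 2, 2, 1, 2, 1, 1, 0, 0, 0]
saturation = [20, 24, 30, 22, 35, 28, 40, 45, 33, 38]  # percent

auc_m, auc_s, z, p = delong_test(mottling, [-s for s in saturation], died)
print(f"AUROC mottling={auc_m:.3f}, saturation={auc_s:.3f}, z={z:.2f}, p={p:.3f}")
for cutoff in (1, 2, 3):
    se, sp = sens_spec(mottling, died, cutoff)
    print(f"mottling >= {cutoff}: sensitivity={se:.2f}, specificity={sp:.2f}")
```

On these invented values the AUROCs are 0.875 for the mottling score and about 0.958 for saturation, and the sensitivity and specificity of the mottling score shift as the cutoff moves from 1 to 3, which is why a comparison of specificity is only meaningful at stated cutoffs.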
Furthermore, the authors did not state any assumptions about the causal relationships between these variables and simply entered all of them into a multivariable regression model. Distinguishing mediators from confounders is important. For example, microcirculatory oxygen saturation may be a direct cause of vascular injury, as measured by elevated biomarkers such as intercellular adhesion molecule-1 and vascular cell adhesion molecule-1, and thereby have an indirect effect on mortality. Such an association is better analyzed using causal mediation analysis [3]. Another important issue is the influence of vasopressors on the mottling score. Pathophysiologically, vasopressors can influence the microcirculation in the skin, as well as the microcirculatory oxygen saturation as measured in the study [4]. In the framework of causal inference [5], vasopressor use can be an important confounder because it has a direct causal effect on the mottling score, microcirculatory oxygen saturation, and the outcome. Such a confounder can be handled in a multivariable model or in the counterfactual framework. However, the authors failed to do so.
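The effect of an unadjusted confounder can be shown with a small worked example. The sketch below uses stratification with the Mantel-Haenszel pooled odds ratio, a simpler stand-in for the multivariable or counterfactual approaches mentioned above, and it addresses confounding only, not mediation. The counts are entirely invented for illustration, and the function names are hypothetical. Each table is `(a, b, c, d)` = mottled and died, mottled and survived, not mottled and died, not mottled and survived.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of death for mottled versus non-mottled patients."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel odds ratio pooled across strata of a confounder."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

def collapse(tables):
    """Sum the strata into one crude 2x2 table (ignores the confounder)."""
    return tuple(sum(t[i] for t in tables) for i in range(4))

# Invented counts, stratified by vasopressor use.
strata = [
    (30, 30, 10, 10),  # on vasopressors: high mortality, mottling common
    (2, 18, 8, 72),    # no vasopressors: low mortality, mottling rare
]

crude = odds_ratio(*collapse(strata))
adjusted = mantel_haenszel_or(strata)
print(f"crude OR = {crude:.2f}, vasopressor-adjusted OR = {adjusted:.2f}")
```

In this constructed case the mottling score has no association with death within either stratum (odds ratio 1.0 in both), yet the crude odds ratio is about 3.0, purely because vasopressor use raises both the chance of mottling and the risk of death.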

S. Kazune, A. Caica, K. Volceka and A. Grabovskis
We thank Drs Ji and Li for their interest in our article [1]. Drs Ji and Li question our conclusions regarding the accuracy of microcirculatory oxygen saturation and mottling score in predicting 28-day mortality. We would like to point out that we used the C-statistic to assess the diagnostic performance of microcirculatory oxygen saturation and mottling score, as suggested by Drs Ji and Li in their letter. The C-statistic values of the univariate prediction models using microcirculatory oxygen saturation and mottling score were 0.76 and 0.79, respectively, and both are clearly stated in the results section. In the results section of our article, we also stated the cutoff values of both tests (a score of 2 or more for mottling and 26% for microcirculatory oxygen saturation). Our conclusions regarding the accuracy of microcirculatory oxygen saturation and mottling score in predicting 28-day mortality are based on these data rather than on p values in the multivariable logistic regression model.
Another point raised by Drs Ji and Li regarding our study is the selection of independent variables for inclusion into the multivariable regression model. Although explanatory models that consider causal relationships between variables are used in etiological research, there are other purposes for the use of such models [6]. In our study, we built a descriptive model to capture the association between 28-day survival, biochemical markers of endothelial dysfunction, and skin microcirculatory hypoperfusion resulting from endothelial dysfunction.
We also disagree with the objection of Drs Ji and Li that vasopressors are an important confounder when mortality is predicted from markers of tissue hypoperfusion such as the mottling score, as this contradicts previous findings that the mottling score is predictive of death irrespective of vasopressor dose [7].