Discordant identification of pediatric severe sepsis by research and clinical definitions in the SPROUT international point prevalence study

Introduction: Consensus criteria for pediatric severe sepsis have standardized enrollment for research studies. However, the extent to which critically ill children identified by consensus criteria reflect physician diagnosis of severe sepsis, which underlies external validity for pediatric sepsis research, is not known. We sought to determine the agreement between physician diagnosis and consensus criteria to identify pediatric patients with severe sepsis across a network of international pediatric intensive care units (PICUs).
Methods: We conducted a point prevalence study involving 128 PICUs in 26 countries across 6 continents. Over the course of 5 study days, 6925 PICU patients <18 years of age were screened, and 706 with severe sepsis defined either by physician diagnosis or on the basis of 2005 International Pediatric Sepsis Consensus Conference consensus criteria were enrolled. The primary endpoint was agreement of pediatric severe sepsis between physician diagnosis and consensus criteria as measured using Cohen’s κ. Secondary endpoints included characteristics and clinical outcomes for patients identified using physician diagnosis versus consensus criteria.
Results: Of the 706 patients, 301 (42.6 %) met both definitions. The inter-rater agreement (κ ± SE) between physician diagnosis and consensus criteria was 0.57 ± 0.02. Of the 438 patients with a physician’s diagnosis of severe sepsis, only 69 % (301 of 438) would have been eligible to participate in a clinical trial of pediatric severe sepsis that enrolled patients based on consensus criteria. Patients with physician-diagnosed severe sepsis who did not meet consensus criteria were younger and had lower severity of illness and lower PICU mortality than those meeting consensus criteria or both definitions. After controlling for age, severity of illness, number of comorbid conditions, and treatment in developed versus resource-limited regions, patients identified with severe sepsis by physician diagnosis alone or by consensus criteria alone did not have PICU mortality significantly different from that of patients identified by both physician diagnosis and consensus criteria.
Conclusions: Physician diagnosis of pediatric severe sepsis achieved only moderate agreement with consensus criteria, with physicians diagnosing severe sepsis more broadly. Consequently, the results of a research study based on consensus criteria may have limited generalizability to nearly one-third of PICU patients diagnosed with severe sepsis.
Electronic supplementary material: The online version of this article (doi:10.1186/s13054-015-1055-x) contains supplementary material, which is available to authorized users.


Introduction
Sepsis is a leading cause of death in children worldwide, responsible for an estimated 75,000 hospitalizations annually in the United States and nearly 50 % of all childhood hospital deaths worldwide [1][2][3][4][5]. Within the spectrum of this syndrome, severe sepsis refers to children with shock or other organ dysfunction and is the high-risk group targeted for interventional studies in the pediatric intensive care unit (PICU) [6].
Investigators in clinical trials of severe sepsis face important challenges that have contributed to high failure rates for many promising novel therapies, with few attempts to include children [7]. One fundamental issue is that the sepsis syndrome is characterized by non-specific physiologic abnormalities that encompass a heterogeneous population. Consensus criteria for pediatric sepsis were therefore established to facilitate consistent enrollment across research studies [6]. Many of these criteria have since been adopted for use in clinical practice [8]; however, published reports have demonstrated only moderate overlap of physician diagnosis of severe sepsis with consensus criteria [9,10]. These findings raise concern that many children diagnosed and treated for severe sepsis in clinical practice may have important physiologic and outcome differences from those studied in interventional trials [9,10].
The degree to which a study population is representative of patients diagnosed and treated in clinical practice has a major impact on the external validity of a study [11][12][13]. Although physician diagnosis serves clinical practice and consensus criteria were intended primarily for research, the alignment of these two methods to identify children with severe sepsis will significantly impact the extent to which study results translate into effective care at the bedside. Moreover, criteria that purposely define a disorder as a decompensated state at one extreme of the entire spectrum, as the existing pediatric consensus definitions currently do for severe sepsis and septic shock, may hinder early clinical diagnosis or delay enrollment in a clinical trial. Although the inherent challenge between research efficacy and clinical effectiveness is not limited to sepsis [14][15][16][17], understanding the extent to which consensus criteria for pediatric severe sepsis are in agreement with clinical practice will help to improve understanding of the utility of these criteria and to identify ways to improve both physician diagnosis and consensus definitions. To date, the agreement of physician diagnosis with consensus criteria for pediatric severe sepsis has not been evaluated in a large-scale setting.
The Sepsis PRevalence, OUtcomes, and Therapies (SPROUT) study researchers screened nearly 7000 PICU patients for severe sepsis at 128 sites across 26 countries using consensus criteria [18]. In addition, the attending physician caring for each patient provided an independent diagnostic assessment for severe sepsis. Using these data, we determined the level of agreement between consensus criteria [6] and attending physician diagnostic assessment ("physician diagnosis") for pediatric severe sepsis. We hypothesized that agreement would be moderate at best, and thus we aimed to compare differences in patient characteristics, treatment strategies, and outcomes for children identified as having severe sepsis by consensus criteria versus physician diagnosis.

Study design
SPROUT was a prospective point prevalence study performed at 128 PICUs in 26 countries over the course of 5 study days spaced over 1 year [18]. Sites were recruited by open invitation through established research networks, and participation was voluntary. Ethical approval was obtained at all sites, and waiver of informed consent was granted at all but three sites, at which written consent was required for data collection (see Additional file 1 for a list of all approving ethical bodies). The details of the SPROUT study methodology have been published previously [18].

Study population
All patients <18 years of age being treated in a participating PICU at 9:00 AM local time on each study day were screened for severe sepsis using a standardized form incorporating the 2005 International Pediatric Sepsis Consensus Conference criteria: (1) at least two systemic inflammatory response syndrome criteria; (2) confirmed or suspected invasive infection; and (3) cardiovascular dysfunction, acute respiratory distress syndrome (ARDS), or at least two organ dysfunctions [6]. The subset of patients with septic shock defined by cardiovascular dysfunction was included within the spectrum of severe sepsis. Only data available within the 24 h preceding the 9:00 AM study day time were considered for screening, yielding a study cohort with active severe sepsis. Patients who were ≥18 years of age, had a corrected gestational age <42 weeks, or had undergone surgery involving cardiopulmonary bypass in the preceding 5 days were excluded.
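The screening logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record type, field names, and function names are hypothetical and are not taken from the SPROUT case report forms, and the individual organ dysfunction definitions from the consensus conference [6] are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreeningRecord:
    """Hypothetical summary of one patient's data from the preceding 24 h."""
    age_years: float
    corrected_gestational_age_weeks: Optional[float]  # None when not applicable
    bypass_surgery_within_5_days: bool
    sirs_criteria_met: int                 # number of SIRS criteria met
    infection_confirmed_or_suspected: bool
    cardiovascular_dysfunction: bool
    ards: bool
    organ_dysfunction_count: int           # total organ dysfunctions present


def is_excluded(rec: ScreeningRecord) -> bool:
    """Apply the three exclusion rules stated in the text."""
    too_old = rec.age_years >= 18
    too_premature = (
        rec.corrected_gestational_age_weeks is not None
        and rec.corrected_gestational_age_weeks < 42
    )
    return too_old or too_premature or rec.bypass_surgery_within_5_days


def meets_consensus_criteria(rec: ScreeningRecord) -> bool:
    """All three consensus components must be present in a non-excluded patient."""
    if is_excluded(rec):
        return False
    has_sirs = rec.sirs_criteria_met >= 2
    has_organ_dysfunction = (
        rec.cardiovascular_dysfunction
        or rec.ards
        or rec.organ_dysfunction_count >= 2
    )
    return has_sirs and rec.infection_confirmed_or_suspected and has_organ_dysfunction
```

Keeping the exclusion check separate from the three-part criteria mirrors the order in which the text presents them; physician diagnosis was collected independently and is deliberately not part of this function.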
To ascertain physician diagnosis of severe sepsis, an investigator at each site provided a list of patients, using a standardized form, to the attending physician of record on each study day. Attending physicians were instructed as follows: "In your clinical opinion, does this patient (yes or no) meet criteria for severe sepsis and/or septic shock when considering data only from the past 24 h (i.e., 9:00 AM yesterday to 9:00 AM today)?" Attending physicians who provided diagnoses were not involved with screening of patients for severe sepsis by consensus criteria. Similarly, site investigators screening for consensus criteria were blinded to physician diagnosis.

Data collection
Data were collected about demographics, comorbid conditions, admission source, laboratory results, and therapies within a 48-h window around the study day (9:00 AM before to 9:00 AM after the study day). Definitions for the primary site of infection were adapted from published criteria [18]. The day of severe sepsis recognition was identified as the first calendar day on which a patient met consensus conference criteria for severe sepsis [6], and the presence of new or progressive multiorgan dysfunction syndrome (NPMODS) was measured for 7 days following severe sepsis recognition [19][20][21]. New MODS was defined as no or one organ dysfunction at sepsis recognition with subsequent development of at least two organ dysfunctions. Progressive MODS was defined as existing multiorgan dysfunction syndrome (MODS) (at least two organ dysfunctions) at sepsis recognition with development of at least one other concurrent organ dysfunction. For severe sepsis screening, organ dysfunctions were defined using pediatric sepsis consensus conference criteria [6], whereas more stringent criteria were used to define NPMODS [21]. For severity of illness, the Pediatric Index of Mortality (PIM)-3 score [22] was calculated at PICU admission, whereas the Pediatric Logistic Organ Dysfunction (PELOD) score [23] was calculated on the study day.
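The NPMODS definitions above can be illustrated with a short Python sketch. It assumes, as a simplification, that each patient is summarized by two hypothetical counts (the number of organ dysfunctions on the day of sepsis recognition and the peak number of concurrent organ dysfunctions over the following 7 days); the more stringent organ dysfunction criteria used for NPMODS [21] are not reproduced, and the function and label names are illustrative only.

```python
def classify_npmods(count_at_recognition: int, peak_count_within_7_days: int) -> str:
    """Classify new or progressive MODS from organ dysfunction counts.

    Assumption: counts are non-negative integers already derived using the
    NPMODS organ dysfunction criteria; this function only applies the
    new/progressive rule described in the text.
    """
    if count_at_recognition < 0 or peak_count_within_7_days < 0:
        raise ValueError("organ dysfunction counts must be non-negative")

    if count_at_recognition <= 1:
        # New MODS: no or one dysfunction at recognition, then at least two.
        if peak_count_within_7_days >= 2:
            return "new MODS"
        return "none"

    # Progressive MODS: MODS already present (at least two dysfunctions),
    # then at least one additional concurrent dysfunction.
    if peak_count_within_7_days > count_at_recognition:
        return "progressive MODS"
    return "none"
```

A patient is counted as having NPMODS when the result is either "new MODS" or "progressive MODS".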
Patients with severe sepsis were followed for 90 days or until death or hospital discharge. Outcomes included vasoactive- and ventilator-free days from day of severe sepsis recognition through day 28, NPMODS, change in functional status from admission to hospital discharge (using the Pediatric Overall Performance Category [POPC] 1-6 ordinal scale [24]), and all-cause mortality at PICU discharge. Patients surviving to hospital discharge were classified as having "at least mild disability" for any increase in POPC and "at least moderate disability" if discharge POPC was 3 or higher and increased by at least 1 from baseline [25].
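The two disability categories can be expressed as a small Python function. This is a sketch assuming that baseline and discharge POPC values are available as integers on the 1-6 scale for a patient who survived to hospital discharge; the function name and return format are illustrative and not part of the study's data dictionary.

```python
from typing import Tuple


def classify_disability(baseline_popc: int, discharge_popc: int) -> Tuple[bool, bool]:
    """Return (at_least_mild, at_least_moderate) for a hospital survivor.

    Assumption: both scores are integers on the POPC 1-6 ordinal scale.
    """
    for score in (baseline_popc, discharge_popc):
        if not 1 <= score <= 6:
            raise ValueError("POPC must be between 1 and 6")

    increase = discharge_popc - baseline_popc
    at_least_mild = increase >= 1                              # any increase in POPC
    at_least_moderate = discharge_popc >= 3 and increase >= 1  # discharge POPC >= 3 with an increase
    return at_least_mild, at_least_moderate
```

By construction, every patient classified as having at least moderate disability is also classified as having at least mild disability.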
All data were recorded and managed using standardized case report forms within the web-based Research Electronic Data Capture (or REDCap) system, a secure database that provides an intuitive interface for validated data entry and audit trails for tracking data [26]. The methods used to ensure data quality and confidentiality have been published previously [18]. The primary outcome was the level of agreement in identification of pediatric severe sepsis between physician diagnosis and consensus criteria. Secondary outcomes included differences in patient characteristics, therapies used, and clinical outcomes between patients identified using physician diagnosis versus consensus criteria.

Statistical analysis
Analyses were performed using STATA software (version 12.1; College Station, TX, USA). Data are presented as medians with interquartile ranges for continuous variables and frequencies with percentages for categorical variables. Comparisons between groups were performed using the Kruskal-Wallis test for continuous variables, and categorical variables were compared using Fisher's exact test. For variables in which significant differences were identified across all three groups, pairwise comparisons were performed using Wilcoxon rank-sum or Fisher's exact tests for continuous or categorical variables, respectively. The level of agreement between the two methods to identify patients with severe sepsis was quantified using percentage agreement and Cohen's κ. Multivariable logistic regression was used to test the independent association of each definition of severe sepsis with PICU mortality after controlling for patient-level variables found to be significantly different across groups. Status as a developed (North America, Australia/New Zealand, and Europe) or resource-limited (Asia, Africa, and South America) region did not provide significant effect modification when tested as an interaction term with definition of severe sepsis (p = 0.36), but it was included as a confounder in the final logistic regression model. Statistical significance was defined as a p value <0.05.
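For reference, percentage agreement and Cohen's κ for two binary classifications can be written as follows. The cell labels are notation introduced here for illustration and do not appear in the original analysis: a is the number of screened patients identified by both methods, b by physician diagnosis only, c by consensus criteria only, and d by neither, with n = a + b + c + d. Here p_o is the observed proportion of agreement and p_e is the agreement expected by chance from the marginal totals.

```latex
\[
p_o = \frac{a + d}{n}, \qquad
p_e = \frac{(a + b)(a + c) + (c + d)(b + d)}{n^{2}}, \qquad
\kappa = \frac{p_o - p_e}{1 - p_e}
\]
```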

Results
Over the 5 study days, 6925 PICU patients were screened across all sites. In total, 706 were identified with severe sepsis by either consensus criteria or physician diagnosis, with 137 patients (19.4 %) identified by physician diagnosis only, 301 (42.6 %) by both definitions, and 268 (38.0 %) by consensus criteria only (Fig. 1). The percentage agreement between consensus criteria and physician diagnosis for identifying severe sepsis in the entire cohort of screened patients was 94 %, but it was only 43 % for identifying severe sepsis among the 706 patients with sepsis by either definition. The inter-rater agreement between physician diagnosis and consensus criteria achieved a moderate κ ± standard error of 0.57 ± 0.02. The percentage agreement between physician diagnosis and consensus criteria varied significantly across geographic regions (p < 0.001). Agreement was lowest in North America (31 % agreement of 444 patients identified with severe sepsis by any criteria); moderate in Australia and New Zealand (45 %) and Europe (51 %); and highest in Asia (72 %), Africa (72 %), and South America (85 %). The percentage agreement was 44 %, 49 %, 45 %, 38 %, and 39 % across study days 1-5, respectively, suggesting that clinicians did not alter their diagnostic assessment to better comply with consensus criteria over time.
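The agreement statistics reported above can be reproduced from the published counts with a short Python calculation. The number of patients identified by neither method is not stated directly and is derived here as the screened total minus the 706 identified patients; the function and variable names are illustrative, and the sketch does not attempt to reproduce the standard error reported by the statistical software.

```python
def cohen_kappa(both: int, first_only: int, second_only: int, neither: int) -> float:
    """Cohen's kappa for two binary raters from the four cells of a 2 x 2 table."""
    n = both + first_only + second_only + neither
    observed = (both + neither) / n
    first_yes = both + first_only
    second_yes = both + second_only
    # Chance agreement from the marginal totals of each rater.
    expected = (first_yes * second_yes + (n - first_yes) * (n - second_yes)) / n**2
    return (observed - expected) / (1 - expected)


screened = 6925
physician_only, both, consensus_only = 137, 301, 268
identified = physician_only + both + consensus_only  # 706
neither = screened - identified                      # derived, not reported directly

kappa = cohen_kappa(both, physician_only, consensus_only, neither)
overall_agreement = (both + neither) / screened
agreement_among_identified = both / identified

print(f"kappa = {kappa:.2f}")                                        # kappa = 0.57
print(f"overall agreement = {overall_agreement:.0%}")                # 94%
print(f"agreement among identified = {agreement_among_identified:.0%}")  # 43%
```

The contrast between 94 % and 43 % arises because most screened patients were identified by neither method, which inflates raw agreement; κ corrects for this chance agreement.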
Of the 438 total patients with a physician diagnosis of severe sepsis, 31 % were not concurrently identified with severe sepsis by consensus criteria and thus would have been ineligible to participate in a clinical trial that enrolled patients based on the criteria established by consensus guidelines. Moreover, 47 % of the patients identified as having severe sepsis by consensus criteria, and thus potentially eligible for a clinical trial, were not simultaneously considered to have severe sepsis by physician diagnosis.
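These two proportions follow directly from the group counts reported in the Results. The brief Python check below uses illustrative variable names; the total identified by consensus criteria (569) is derived here by summing the "both" and "consensus criteria only" groups.

```python
physician_only, both, consensus_only = 137, 301, 268

physician_total = physician_only + both   # 438 with a physician diagnosis
consensus_total = both + consensus_only   # 569 meeting consensus criteria (derived)

# Physician-diagnosed patients who did not meet consensus criteria.
physician_not_consensus = physician_only / physician_total
# Consensus-identified patients without a corroborating physician diagnosis.
consensus_not_physician = consensus_only / consensus_total

print(f"{physician_not_consensus:.0%}")  # 31%
print(f"{consensus_not_physician:.0%}")  # 47%
```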
Characteristics of 704 of the patients identified with severe sepsis by physician diagnosis alone, by both definitions, or by consensus criteria alone are shown in Table 1. Data could not be obtained for two patients (both identified by consensus criteria alone) who did not consent to data collection at one site that required informed consent for this purpose. Patients identified only by physician diagnosis were younger (p = 0.008), had a lower severity of illness (lower PIM-3 [p < 0.001] and PELOD [p < 0.001] scores), and had a lower proportion of cardiovascular (p = 0.006) and hematologic dysfunction (p < 0.001) at sepsis recognition than patients identified by consensus criteria alone or both definitions. Patients identified by consensus criteria alone were less likely to be previously healthy (p = 0.002) and more likely to have neuromuscular comorbidities (p < 0.001) than patients identified by physician diagnosis alone or both definitions. Patients identified by consensus criteria alone were more likely to have respiratory (p = 0.004) or unknown primary site of infection (p = 0.002) and less likely to have primary bloodstream infection (p < 0.001).
Patients identified concurrently by both physician diagnosis and consensus criteria had lower platelet counts and higher blood lactate, C-reactive protein, and procalcitonin levels than patients identified by either criterion alone (all p < 0.01) (Table 2). Similarly, patients identified by both physician diagnosis and consensus criteria were also more likely to be treated with vasoactive infusions, albumin, blood products, granulocyte/granulocyte macrophage colony-stimulating factor, intravenous immunoglobulin, and renal replacement therapy than patients identified by either criterion alone (all p < 0.05) (Table 3).
PICU mortality was significantly lower for patients with only a physician diagnosis of severe sepsis (18 %) than with both definitions (27 %; p = 0.02) and trended lower than for consensus criteria alone (21 %; p = 0.29; Table 4). The proportion with NPMODS, at least mild disability, and at least moderate disability all trended lower for patients identified with severe sepsis by physician diagnosis alone, but none reached statistical significance. After controlling for age, PIM-3 score, number of comorbid conditions, and treatment in developed versus resource-limited regions, patients identified with severe sepsis by physician diagnosis only or by consensus criteria only did not have PICU mortality different from that of patients identified by both physician diagnosis and consensus criteria (Table 5).

Discussion
Across this large international network of PICUs, physician diagnosis and consensus criteria achieved only a moderate level of agreement in identifying critically ill children with severe sepsis. These findings suggest that the results of a research study with enrollment based only on current consensus criteria may not be generalizable to nearly one-third of pediatric patients diagnosed with severe sepsis in a PICU. Moreover, for nearly half of the patients identified with severe sepsis by consensus criteria, and thus eligible for clinical trials, a diagnosis of severe sepsis was not corroborated by the treating attending physician.
The data in this study demonstrate that PICU physicians worldwide commonly diagnose critically ill children with severe sepsis who do not meet consensus criteria. This finding is consistent with a prior report of a single-center study in the United States [10]. In general, patients with physician-diagnosed severe sepsis tended to be younger, less severely ill, and less likely to receive several sepsis-related therapies than patients who also met consensus criteria. These observations likely reflect the intention of the International Pediatric Sepsis Consensus Conference to identify a more severely ill subgroup of critically ill children with severe sepsis for enrollment in clinical trials [6]. It may therefore be appropriate for physicians to diagnose severe sepsis more often than consensus criteria, particularly in children with a lower severity of illness who may need fewer adjunctive sepsis-related therapies. Physicians may also have been recognizing and treating children with sepsis earlier in their illness course, at a point when they may not yet have manifested the physiologic and laboratory derangements required by consensus criteria. The loss of a statistical difference in PICU mortality between patients identified by physician diagnosis alone and those identified by both definitions after correcting for severity of illness is consistent with these suppositions. A lower threshold for diagnosing severe sepsis in clinical practice likely accounts for the generally better outcome metrics identified in epidemiologic studies using administrative databases [1, 2, 5] than has been reported in clinical trials [20][27][28][29]. Our study therefore suggests caution when extrapolating results from epidemiologic, observational, and interventional studies to this less severely ill subgroup of patients who are diagnosed on the basis of physician impression and do not meet consensus criteria.
Conversely, the absence of certain features, such as two or more systemic inflammatory response syndrome criteria [30,31], hypotension only after 40 ml/kg of intravenous fluid, and laboratory derangements beyond arbitrary cutpoints (e.g., international normalized ratio >2), should not preclude the clinical diagnosis of severe sepsis and/or septic shock. A refined consensus definition that more closely matches the clinical definition and establishes a spectrum of risk concordant with outcome, similar to that of the recent at-risk, mild, moderate, and severe pediatric ARDS definitions [32], would benefit both research scientists and clinicians.
By quantifying the extent to which physician diagnosis and consensus criteria overlap, one can better understand the potential loss of efficacy as research findings are adopted at the bedside. Previous studies show that efficacious therapies in clinical trials are often extended to a broader patient population, with a more variable and perhaps even a distinct pathophysiology [14][15][16][17]. To our knowledge, this is the first multicenter, international study of the generalizability of consensus criteria for pediatric severe sepsis. Because only 69 % of patients with a physician's diagnosis of severe sepsis met consensus criteria, it may not be appropriate to apply the results of a clinical trial to up to one-third of children treated for severe sepsis in a general PICU practice. Although a potential decrement from efficacy to effectiveness may not dissuade widespread implementation of a new therapy, enthusiasm could be altered if concerns over adverse effects, resource utilization, or costs were taken into consideration.
It is also noteworthy that nearly half of patients in the SPROUT study were identified as having severe sepsis by consensus criteria without corroboration by physician diagnosis. This likely reflects that both physician diagnosis and consensus criteria rely on relatively non-specific changes in physiology that are commonly found in critically ill children in general, not just in children with sepsis. Still, this observation highlights an opportunity to refine future iterations of consensus criteria and address variability in physician diagnosis to identify sepsis. Patients with preexisting neuromuscular disorders, such as cerebral palsy and epilepsy, were particularly likely to meet consensus criteria for severe sepsis despite an alternative clinical diagnosis. Many of these patients have dysautonomia or altered physiologic responses to illness that make it challenging to apply predefined rigid criteria. The higher rate of unknown primary site of infection suggests that consensus criteria may be less specific than physician diagnosis in identifying patients with a true infection. It is also possible that consensus criteria include PICU patients with a high severity of illness, namely those with shock and multiorgan system dysfunction caused by non-septic insults, who are commonly treated empirically for infection even if clinical suspicion is low. Alternatively, if physician diagnosis does not corroborate consensus criteria despite true presence of severe sepsis, there may be a higher rate of parental or clinician refusal to participate in research because the clinicians managing the patient may not agree on the diagnosis of concern.
The incorporation of new biomarkers or novel microbiologic detection systems that improve early sensitivity and specificity for invasive infections into future revisions of the consensus criteria may improve alignment with physician diagnosis.
There are several limitations to this study. First, physician diagnosis and the application of consensus criteria to identify patients with severe sepsis could not be independently verified. The reasons why patients were not identified as having severe sepsis by one or both definitions were also not available. Although these concerns may have led to misclassification of some patients, it is unlikely that such a bias would have preferentially affected one group. Moreover, because we included only physicians practicing in a PICU setting, 89 % of whom were trained in pediatric critical care medicine, the level of agreement for other providers who commonly treat sepsis (e.g., emergency physicians) may differ from our results. A second limitation is that criteria used for physician diagnosis were not standardized. Although this may have increased variability in physician diagnosis across providers and sites, such an approach reflects that physicians often differ in their diagnostic approach. Third, the higher proportion of patients with available laboratory measures in the consensus criteria only and both definition groups may itself have increased the likelihood of identification by consensus criteria because, in the absence of available data, these measures were assumed to be within normal limits. Although it is more likely that sicker patients had directed laboratory testing performed than that unsuspected organ dysfunction was discovered through routine laboratory testing, differential availability of laboratory results may have been a source of misclassification bias. Finally, because a "gold standard" does not exist, it is not possible to determine the degree to which either definition correctly identified pediatric patients with severe sepsis or to compare the sensitivity and specificity of the two definitions.

Conclusions
Physician diagnosis of pediatric severe sepsis achieved only a moderate level of agreement with consensus criteria across this multicenter, international point prevalence study. The results of a research study based on current consensus criteria may not be generalizable to nearly one-third of pediatric patients diagnosed with severe sepsis in a PICU. Further research is needed to understand the extent to which the specificity of consensus criteria and variability in physician diagnosis may be contributing to this discordant identification of pediatric severe sepsis. Attempts to better align consensus definitions with clinical diagnosis by establishing a continuous spectrum of illness severity are needed.

Key messages
Although consensus criteria for pediatric sepsis were established to facilitate consistent enrollment across research studies, the extent to which these criteria reflect physician diagnosis of severe sepsis, which underlies external validity for pediatric sepsis research, is not known.
Of 6925 PICU patients screened at 128 PICUs in 26 countries, 706 patients were identified by physician diagnosis and/or consensus criteria as having severe sepsis.
Only 301 patients (42.6 %) were identified by both physician diagnosis and consensus criteria (κ 0.57 ± 0.02).
The 31 % of patients with physician-diagnosed severe sepsis who did not meet consensus criteria were younger, had a lower severity of illness, and had a lower PICU mortality than those who met consensus criteria or both definitions.
The results of a research study based on consensus criteria may have limited generalizability to nearly one-third of PICU patients diagnosed with severe sepsis.

Additional file
Additional file 1: Approving ethical bodies at each study site. (DOCX 16 kb)