Candidemia on presentation to the hospital: development and validation of a risk score

Introduction Candidemia results in substantial morbidity and mortality, especially if initial antifungal therapy is delayed or is inappropriate; however, candidemia is difficult to diagnose because of its nonspecific presentation.
Methods To develop a risk score for identifying hospitalized patients with candidemia, we performed a retrospective analysis of a large database of 176 acute-care hospitals in the United States. We studied 64,019 patients with bloodstream infection (BSI) on presentation from 2000 through 2005 (derivation cohort) and 24,685 from 2006 to 2007 (validation cohort). We used recursive partitioning (RPART) to identify the best discriminators for Candida as the cause of BSI. As a sensitivity analysis, we compared three sets of models: an equal-weight model, an unequal-weight model, and a full model with additional variables from a logistic regression model.
Results RPART identified 6 variables as the best discriminators: age < 65 years, temperature ≤ 98°F or severe altered mental status, cachexia, previous hospitalization within 30 days, admission from another healthcare facility, and need for mechanical ventilation. The proportions of patients presenting with 0 through 6 risk factors in the derivation cohort were 28.7%, 38.8%, 21.8%, 8.3%, 2.1%, 0.3%, and < 0.1%, respectively. The corresponding candidemia rates were 0.4% (69/18,355), 0.8% (196/24,811), 1.6% (229/13,984), 3.2% (168/5,330), 4.2% (58/1,371), 9.6% (15/157), and 27.3% (3/11), respectively (P < 0.0001). Findings were similar in the validation cohort (P < 0.0001). The equal-weight risk score model, which assigned 1 point to each risk factor, yielded good discrimination in both cohorts, with areas under the receiver operating characteristic curve (AUROCs) of 0.70 and 0.71 (derivation and validation, respectively). AUROC values were similar for the unequal-weight model, which assigned a different weight to each risk factor based on its multivariable logistic regression coefficient (AUROCs, 0.70-0.72). Both the equal-weight and unequal-weight models were well calibrated (all Hosmer-Lemeshow P > 0.10, indicating that predicted and observed candidemia rates did not differ significantly across the 7 risk strata). The full model with 16 risk factors had slightly higher AUROCs (0.74 and 0.73 for derivation and validation, respectively); however, 7 variables were no longer significant in the recalibrated model for the validation cohort, indicating that the additional items did not materially enhance the model.
Conclusions A simple equal-weight risk score differentiated patients' risk for candidemia in a graded fashion upon hospital presentation.



Introduction
Candidemia represents the fourth most common type of hospital-acquired bloodstream infection (BSI) [1][2][3]. More importantly, candidemia results in substantial morbidity [4][5][6][7][8] and mortality [7][8][9][10], especially if initial antifungal therapy is delayed or is not appropriate [5,11,12]. Delaying therapy by as little as 12 hours after obtaining a blood culture can double the risk of death [11]. Therefore, prompt initiation of antifungal therapy is a key determinant of outcome. Efforts to identify subjects at risk for candidemia are complicated by the expansion of healthcare delivery beyond the hospital and the evolving recognition of distinct healthcare-associated infection syndromes [13][14][15]. As a result, Candida may now represent a cause of BSI in patients presenting to the hospital [8].
Given the need to ensure appropriate and timely antifungal therapy and to optimally separate patients at low risk for candidemia from those at high risk, some form of risk stratification for candidemia becomes imperative. This is particularly true for those with candidemia on admission to the hospital because clinicians rarely consider this diagnosis in this setting. The nonspecific signs and symptoms of candidemia further frustrate efforts at early patient identification [16]. Although biomarkers such as (1→3)-β-D-glucan are being investigated [17], they are not likely to prove useful in patients presenting to the hospital. The traditional approach to assessing the probability of Candida as a cause of nosocomial BSI has relied upon assessing the number and type of risk factors (e.g., corticosteroid therapy, total parenteral nutrition); however, this strategy has proven to have little utility in critically ill patients, and proposed schemata for risk stratification have yet to be well validated.
We hypothesized that, despite frustration with clinical risk stratification paradigms for inpatients, assessment of select characteristics could identify patients presenting to the hospital who are at heightened risk for candidemia. We further theorized that these select characteristics could be used to develop a prediction rule to indicate which patients are likely to have BSI due to Candida as opposed to a bacterial pathogen.

Design
To develop a clinical risk score for identifying patients with BSI likely to be caused by Candida spp. upon hospital presentation, we performed a retrospective analysis of patients discharged from 176 acute-care hospitals in the United States from 2000 to 2005. We validated the risk score with discharge data from the same hospitals from 2006 to 2007.
The database comprises acute-care admissions at participating hospitals, including electronically imported or manually abstracted demographic, clinical (e.g., comorbidities, vital signs, laboratory values, other clinical findings), and administrative data (e.g., diagnosis). The underlying data for this study are a limited data set with all patient-specific information anonymized. This study was reviewed and approved by the New England Institutional Review Board/Human Subjects Research Committee (Wellesley, MA, USA). It was conducted in compliance with US federal regulations, the Health Insurance Portability and Accountability Act, and the Helsinki Declaration.
The outcome for deriving the risk score was BSI due to Candida spp., defined by a blood culture positive for Candida together with a concomitant primary or secondary diagnostic code (International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM)) indicative of candidemia. We required that blood samples had been drawn within one day before or within two days after hospital admission. The database undergoes multiple quality assurance assessments with periodic data auditing. To limit coding bias, we required the concomitant presence of an ICD-9 code for candidemia and a positive blood culture. We did not explore other forms of invasive candidiasis.

Variables
Candidate variables were selected a priori based on the biologic plausibility of their explaining risk for candidemia. Specifically, we explored demographic factors (age, gender), vital signs, mental status, laboratory test results, and underlying comorbid conditions. Vital signs included pulse, blood pressure, temperature, and respiratory rate. Altered mental status (AMS) was categorized as mild (Glasgow Coma Scale (GCS) score of 10 to 14, or disorientation/lethargy), moderate (GCS 5 to 9), or severe (GCS less than 5, or a designation of 'coma' as charted by a physician). Laboratory testing included serum albumin; blood urea nitrogen (BUN); creatinine; sodium; potassium; glucose; hemoglobin; white blood cell count; and other routine chemistry, hematology, blood gas, and metabolic results. Comorbid conditions included cachexia (ICD-9 secondary diagnosis code), history of malignancy, diabetes, chronic heart failure, and other chronic conditions abstracted through chart review or secondary ICD-9 diagnostic codes. In addition, we explored variables that are pertinent to candidemia and were available in the database, such as hemodialysis, immunosuppressive medication, previous hospitalization within 30 days, transfer from another healthcare facility, and mechanical ventilation on admission. Certain patient characteristics were not available in this database; for example, utilization of parenteral nutrition outside the hospital and prior antibiotic exposure are not recorded. Vital signs and other patient-specific characteristics were obtained within one day of admission. For each vital sign and laboratory test result, we used the worst value obtained in the emergency department or, if not available, on the day of admission.
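The AMS categories defined above can be expressed as a small lookup. The following is a minimal sketch, not the study's abstraction logic: the function name, argument names, the "none" category for patients meeting no criterion, and the rule that the most severe matching category wins are all assumptions made for illustration.

```python
from typing import Optional


def classify_ams(gcs: Optional[int],
                 charted_coma: bool = False,
                 disoriented_or_lethargic: bool = False) -> str:
    """Map a Glasgow Coma Scale score and chart flags to an AMS category.

    Thresholds follow the text: GCS < 5 or charted 'coma' is severe,
    GCS 5-9 is moderate, GCS 10-14 or disorientation/lethargy is mild.
    Assumption: the most severe matching category takes precedence, and
    anything else (for example GCS 15 with no flags) is labeled 'none'.
    """
    if charted_coma or (gcs is not None and gcs < 5):
        return "severe"
    if gcs is not None and 5 <= gcs <= 9:
        return "moderate"
    if disoriented_or_lethargic or (gcs is not None and 10 <= gcs <= 14):
        return "mild"
    return "none"


print(classify_ams(4))                        # severe
print(classify_ams(None, charted_coma=True))  # severe
print(classify_ams(12))                       # mild
```

Only the severe category enters the final risk score, so in practice a single boolean would suffice; the full mapping is shown because it mirrors the candidate-variable definition.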

Risk score development
To identify risk factors that optimally separate patients at low risk for candidemia from those at high risk, we used a recursive partitioning (RPART) approach [23]. Also referred to as classification and regression tree analysis [24], RPART has been used to derive prediction rules for acute chest pain [25], heart failure [26], and other conditions [27,28]. RPART first identifies the variable with the highest discrimination for the outcome of interest (node) and then repeats the process to partition subsequent nodes. RPART yields a tree-like algorithm with numerous nodes. To improve ease of use, we simplified the algorithm to a count of the risk factors present, giving equal weight (one point) to each risk factor identified by RPART (equal-weight risk score).
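As an illustration of the equal-weight score, the sketch below counts one point for each of the six risk factors reported in the abstract. The class and field names are hypothetical and chosen only for readability; they are not taken from the study database.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Findings at hospital presentation (field names are illustrative)."""
    age_years: float
    temperature_f: float
    severe_ams: bool
    cachexia: bool
    hospitalized_prior_30_days: bool
    admitted_from_healthcare_facility: bool
    mechanical_ventilation: bool


def equal_weight_score(p: Presentation) -> int:
    """Count the risk factors present; each contributes exactly one point."""
    factors = [
        p.age_years < 65,
        # Low temperature and severe AMS form a single combined factor.
        p.temperature_f <= 98.0 or p.severe_ams,
        p.cachexia,
        p.hospitalized_prior_30_days,
        p.admitted_from_healthcare_facility,
        p.mechanical_ventilation,
    ]
    return sum(factors)


patient = Presentation(age_years=58, temperature_f=97.4, severe_ams=False,
                       cachexia=False, hospitalized_prior_30_days=True,
                       admitted_from_healthcare_facility=False,
                       mechanical_ventilation=False)
print(equal_weight_score(patient))  # 3
```

Because temperature and mental status are combined into one factor, a patient who has both still receives a single point, which keeps the score in the range 0 to 6.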

Risk score validation
To validate the model, we applied the derived risk score to patients in the validation cohort. We compared the distribution of candidemia prevalence by risk-score stratum between the validation and derivation cohorts and performed the Cochran-Armitage test to assess trend [29]. We used the area under the receiver operating characteristic curve (AUROC) to assess the discrimination of the model and the Hosmer-Lemeshow test to assess model calibration. A higher P value for the Hosmer-Lemeshow test indicates better model fit.
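Both the discrimination and the trend statistics can be reproduced approximately from the stratum counts given in the abstract for the derivation cohort. The sketch below is illustrative only: it is not the authors' SAS code, the function names are hypothetical, and it assumes the standard tie-corrected (Mann-Whitney) form of the AUROC and the usual Cochran-Armitage z statistic with scores 0 to 6.

```python
import math

# Derivation-cohort candidemia cases and totals by risk score 0..6 (abstract).
CASES = [69, 196, 229, 168, 58, 15, 3]
TOTALS = [18355, 24811, 13984, 5330, 1371, 157, 11]


def auroc_from_strata(cases, totals):
    """AUROC of an ordinal score: P(case > non-case) + 0.5 * P(tie)."""
    noncases = [n - r for r, n in zip(cases, totals)]
    below = 0          # non-cases with a strictly lower score
    concordant = 0.0
    for r, m in zip(cases, noncases):
        concordant += r * (below + 0.5 * m)
        below += m
    return concordant / (sum(cases) * sum(noncases))


def cochran_armitage_z(cases, totals):
    """Cochran-Armitage trend z statistic using scores 0, 1, 2, ..."""
    n_total = sum(totals)
    p_bar = sum(cases) / n_total
    t_stat = sum(t * (r - n * p_bar)
                 for t, (r, n) in enumerate(zip(cases, totals)))
    sum_nt = sum(n * t for t, n in enumerate(totals))
    sum_nt2 = sum(n * t * t for t, n in enumerate(totals))
    variance = p_bar * (1 - p_bar) * (sum_nt2 - sum_nt ** 2 / n_total)
    return t_stat / math.sqrt(variance)


auc = auroc_from_strata(CASES, TOTALS)
z = cochran_armitage_z(CASES, TOTALS)
print(f"AUROC = {auc:.2f}, trend z = {z:.1f}")
```

With these counts the AUROC comes out at about 0.70, in line with the value reported for the derivation cohort, and the trend statistic is large and positive, consistent with the reported P < 0.0001.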

Sensitivity analysis
Using AUROC and Hosmer-Lemeshow goodness-of-fit statistics, we compared the discrimination and calibration of the simpler versus more complex models. Specifically, we fit three sets of logistic regression models. The first was the equal-weight risk-score model, a univariate logistic regression model using a single continuous variable, the number of risk factors present (ranging from 0 to 6). This model gave the same weight to each risk factor present. The second was the unequal-weight risk factor model, a multivariable logistic regression model using each of the variables in the equal-weight risk score as a separate covariate. The unequal-weight model assigned a different weight to each variable according to its multivariable logistic regression coefficient. The third was the full risk factor model, which was generated from a stepwise multivariable logistic regression analysis that retained additional variables that were significant (P < 0.05).
Statistical analyses were performed using Statistical Analysis Software (SAS, version 9.01; SAS Institute Inc., Cary, NC, USA). Two-sided P values < 0.05 were considered statistically significant.
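To make the equal-weight model concrete, the sketch below refits a univariate logistic regression of candidemia on the risk-factor count using the grouped derivation-cohort counts from the abstract. It is an illustration under stated assumptions, not the authors' SAS analysis: the Newton-Raphson fitting routine and all names are hypothetical, and the resulting coefficients are not values reported in the paper.

```python
import math

# Derivation-cohort candidemia cases and totals by risk score 0..6 (abstract).
CASES = [69, 196, 229, 168, 58, 15, 3]
TOTALS = [18355, 24811, 13984, 5330, 1371, 157, 11]


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_equal_weight_logistic(cases, totals, max_iter=50, tol=1e-10):
    """Fit logit(p) = b0 + b1 * score by Newton-Raphson on grouped data."""
    p_bar = sum(cases) / sum(totals)
    b0, b1 = math.log(p_bar / (1 - p_bar)), 0.0   # start at the overall rate
    for _ in range(max_iter):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for t, (r, n) in enumerate(zip(cases, totals)):
            p = expit(b0 + b1 * t)
            w = n * p * (1 - p)
            g0 += r - n * p            # score (gradient) for the intercept
            g1 += t * (r - n * p)      # score (gradient) for the slope
            i00 += w                   # information matrix entries
            i01 += w * t
            i11 += w * t * t
        det = i00 * i11 - i01 * i01
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


b0, b1 = fit_equal_weight_logistic(CASES, TOTALS)
for score, n in enumerate(TOTALS):
    print(score, f"predicted {expit(b0 + b1 * score):.4f}",
          f"observed {CASES[score] / n:.4f}")
```

Comparing predicted with observed rates per stratum is the same comparison that the Hosmer-Lemeshow test formalizes. An unequal-weight model would replace the single slope with one coefficient per risk factor, which requires patient-level data that are not available from the published counts.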

Baseline characteristics of derivation and validation cohorts
The derivation cohort included 64,019 admissions and the validation cohort included 24,685 (Table 1) [see Additional data file 1]. Many between-cohort differences in demographics, laboratory findings, vital signs, comorbidities, and other variables were statistically significant. For example, the derivation cohort had a smaller proportion of patients aged 64 years or younger, smaller proportion of men, and higher in-hospital mortality. Approximately 10% of patients needed mechanical ventilation on admission, including 9.2% of those in the derivation cohort and 10.9% of those in the validation cohort. Among patients needing mechanical ventilation, candidemia occurred in 2.3% of those in the derivation cohort and in 3.1% of those in the validation cohort (Table 2) [see Additional data file 2].

Derivation and validation of candidemia risk score
Univariate analysis revealed that the following variables were associated with candidemia: age younger than 65 years; cachexia; deranged albumin, arterial pH, and electrolytes; temperature of 98°F or less, or severe altered mental status; previous hospitalization within 30 days; admitted from other healthcare facility; and mechanical ventilation at admission (all P  0.001; Table 2). These associations were similar in the derivation and validation cohorts.
In the derivation cohort, an overall score of 1 or more had a sensitivity of 90.7% and a negative predictive value (NPV) of 99.6% for the presence of candidemia. The specificity was more limited at 28.9%. The NPV remained above 99% as long as the number of risk factors present was less than 3. These findings were similar in the validation cohort. In other words, a low score nearly excluded candidemia. In patients with a score of zero, who accounted for nearly 30% of all subjects evaluated, there were very few cases of candidemia, corresponding to an NPV of 99.6%.
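The sensitivity, specificity, and NPV quoted above follow directly from the derivation-cohort counts in the abstract. The sketch below recomputes them for a cutoff of 1 or more risk factors; the function name and the dictionary keys are illustrative choices.

```python
# Derivation-cohort candidemia cases and totals by risk score 0..6 (abstract).
CASES = [69, 196, 229, 168, 58, 15, 3]
TOTALS = [18355, 24811, 13984, 5330, 1371, 157, 11]


def screening_metrics(cases, totals, threshold):
    """Treat score >= threshold as a positive screen and summarize accuracy."""
    tp = sum(cases[threshold:])                      # cases flagged
    fn = sum(cases[:threshold])                      # cases missed
    tn = sum(n - r for r, n in zip(cases[:threshold], totals[:threshold]))
    fp = sum(n - r for r, n in zip(cases[threshold:], totals[threshold:]))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
    }


metrics = screening_metrics(CASES, TOTALS, threshold=1)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
# sensitivity: 90.7%
# specificity: 28.9%
# npv: 99.6%
```

Raising the threshold trades sensitivity for specificity; calling `screening_metrics(CASES, TOTALS, threshold=3)` shows that the NPV still exceeds 99% when scores of 0 to 2 are treated as negative.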

Sensitivity analysis
The equal-weight risk model was associated with discrimination similar to that of the unequal-weight model (Table 3). Both the equal-weight and unequal-weight models provided good calibration of predicted versus observed candidemia across low- and high-risk strata, as demonstrated by nonsignificant P values in both cohorts (all Hosmer-Lemeshow chi-squared test P > 0.10; a larger P value is better because it suggests closer agreement between predicted and observed incidence rates across low- and high-risk strata). The full model also provided good calibration in the derivation cohort (P = 0.74) but not in the validation cohort (P = 0.02), suggesting over- or underprediction in some risk strata when additional variables were added to the model.

Discussion
Our analysis demonstrates that a simple equal-weight risk stratification score can assess the potential for candidemia in newly hospitalized patients with BSI. We validated our model using a cohort of patients discharged during the two consecutive years after the derivation cohort. The cohorts had a similar graded risk of candidemia that increased with the number of risk factors. The equal-weight risk-score model provided similar between-cohort discrimination for the risk of candidemia and goodness of model fit, indicating the stability of our risk score. In a sensitivity analysis, the equal-weight risk-score model provided nearly identical discrimination and goodness of fit compared with the unequal-weight model. A full 16-risk factor model provided slightly better discrimination but was less robust. Importantly, the equal-weight risk-score model is easier to apply than the other two models. The need for a risk stratification scheme is pressing. Although Candida may be an infrequent cause of BSI on admission, epidemiologic data indicate that its frequency is likely to increase. The expansion of healthcare delivery beyond the hospital continues apace, and multiple studies now document the evolution of healthcare-associated infections that are distinct from community-acquired or nosocomial infections [13][14][15]. The likelihood of an increasing prevalence of candidemia at admission, along with the need to ensure that such patients receive early and appropriate antifungal therapy, underscores the anticipated benefit of easy-to-use risk stratification. Prior efforts at risk stratification for candidemia as a cause of nosocomial BSI have been largely unsuccessful due to the lack of a large clinical data set to model such infrequent events. Our effort builds on earlier analyses [30,31] by focusing on a distinct cohort of patients and by using multiple statistical methods to cross-validate the algorithms.
Moreover, many adjuncts to a clinically based risk stratification scheme, such as relying on the colonization index or serodiagnostic testing, are less likely to be available in patients presenting to the hospital.
Our risk score comprised six demographic, patient-history, and clinical findings that are routinely available in any acute-care hospital setting and that were previously shown to be associated with adverse outcomes [8,32]. To minimize the time needed to assess the risk of candidemia, we excluded variables requiring laboratory testing.
Our risk score offers several advantages over previous models [30,31]. First, as noted above, our variables were routinely available at presentation and did not require cultures or other tests to confirm the presence of colonization, sepsis, or other conditions. This increases the score's practical value for rapid assessment of risk for candidemia. Second, the accuracy and robustness of our risk score were supported by derivation from a cohort comprising 64,019 patients and validation in a different cohort comprising 24,685 patients from a different time period. Most previous studies of risk assessment in candidemia did not include any retrospective or prospective validation. Third, our results are likely to be generalizable to a broad range of patients presenting to acute-care hospitals because they are derived from teaching and non-teaching hospitals and from urban and rural hospitals, and are not limited to patients in intensive care units. Fourth, we used the concomitant presence of a candidemia code and a positive blood culture to identify candidemia cases and included the acute clinical presentation on admission among the candidate variables. This is likely to be a strength of our paper because many large-scale databases contain only the results of administrative coding and lack actual culture confirmation.
Our risk score seems consistent with the pathogenesis of candidemia, which includes: increased fungal burden or colonization, often due to broad-spectrum antibacterial therapy or previous healthcare exposure; disruption of mucosal and skin barriers, often due to indwelling vascular catheters, surgery, trauma, or chemotherapy-related mucositis; and immune dysfunction, which allows dissemination of fungal colonies [16]. For example, previous admission within 30 days and admission from another healthcare facility, which were important in our model, likely represent markers for the first and second steps in the pathogenesis of candidemia. In addition, the relationship between the need for mechanical ventilation and candidemia has been confirmed by others [32]. Although previous studies found that age was not an independent risk factor for candidemia [30,33], our analyses revealed that among patients with BSIs the younger ones appear potentially more iatrogenically immunosuppressed. For example, patients aged less than 65 years were more likely to be on immunosuppressive therapy (17.0% versus 12.6%; P < 0.0001), to be on hemodialysis (4.5% versus 2.7%; P < 0.0001), or to have metastatic cancer (6.0% versus 4.7%; P < 0.0001). Similarly, cachexia was associated with metastatic cancer (6.7% versus 4.9%; P < 0.0001), immunosuppressed status, or other severe clinical conditions that make patients prone to repeated hospitalization and infections. Furthermore, hypothermia is a risk factor for greater mortality with infection and may suggest that fungal infections are often more severe when detected, or more likely to have a delay in therapy resulting in hypothermia and potentially worse outcomes [34]. In total, our risk score probably captured composite measures of exposure to healthcare delivery and its associated risks for candidemia, such as underlying immunosuppression and severity of illness, both of which are expected risk factors for candidemia. Hence the model appeared robust overall when applied to a separate patient population in a different time period in the validation cohort.
The high NPV of a low score indicates that the clinical value of the equal-weight score lies in its ability to identify a group of patients at an exceedingly low risk for candidemia. Given an overall prevalence of 1.2%, which essentially represents the pre-test probability of candidemia in these patients, application of the risk score selects for a group of patients in whom the risk of candidemia approximates zero. In these subjects antifungal therapy can likely be withheld safely because a low score essentially rules out candidemia. More importantly, this very-low-risk group comprises the bulk of the subjects. Alternatively, although the prevalence of candidemia in the higher-risk groups remains limited, the score at least can serve to remind clinicians to consider candidemia and to weigh the potential for this along with the presence or absence of other clinical factors.

Figure 1
Distribution of overall cases and candidemia cases by the equal-weight Candidemia Risk Score.

Figure 2
Receiver operating characteristic curves for the equal-weight Candidemia Risk Score by cohort. The area under the receiver operating characteristic curve was 0.70 for the derivation cohort and 0.71 for the validation cohort.
Our model had several limitations. First, the study was retrospective, and the score needs to be validated in a prospective study. However, only large databases provide a sufficiently large sample to identify enough candidemia cases for multivariable modeling.
To address issues related to bias from utilization of ICD-9 coding, we required culture evidence of candidemia. Second, we limited our population to patients with candidemia diagnosed within two days of admission. Extending the observation period may have changed our model. Therefore, our findings are not necessarily applicable for suspected nosocomial candidemia. Similarly, we likely missed cases present at admission but not diagnosed until later during hospitalization because cultures are not always obtained upon admission. Third, information was lacking on some specific risk factors for candidemia. For example, we did not have data on whether patients were receiving total parenteral nutrition on admission, had central venous catheters in place, had been exposed to antimicrobial therapy, or had recently undergone surgery [30,31,33,35]. Nevertheless, we included previous hospitalization within 30 days, immunosuppression status, and cachexia as candidate variables, which were likely to be associated with those known risk factors identified in the previous literature. Our score is meant to serve as an adjunct to clinical decision-making, which might incorporate knowledge of all potential risk factors. It is not meant in any way to supplant bedside decision-making. Finally, our analysis focused on subjects presenting to the hospital. Therefore, this score does not necessarily apply in cases of suspected nosocomial candidemia.

Conclusions
In conclusion, we derived and validated a simple risk-score model that stratifies patients by risk for candidemia, which may help clinicians to rule out candidemia and to shorten the time required to identify patients at increased risk for this disease. It may also help researchers to stratify clinical trials or other outcome studies by baseline risk. Although prospective validation is required, six easy-to-determine characteristics categorize candidemia risk early in hospitalization.