Clinical review: Sepsis and septic shock - the potential of gene arrays

Over the past decade several investigators have applied microarray technology and related bioinformatic approaches to clinical sepsis and septic shock, thus allowing for an assessment of how, or if, this branch of genomic medicine has meaningfully impacted the field of sepsis research. The ability to simultaneously and efficiently measure the steady-state mRNA abundance of thousands of transcripts from a given tissue source (that is, 'transcriptomics') has provided an unprecedented opportunity to gain a broader, genome-level 'picture' of complex and heterogeneous clinical syndromes such as sepsis. A transcriptomic approach to sepsis and septic shock is technically challenging on multiple levels, but nonetheless modest, tangible advances are being realized. These include a genome-level understanding of the complexity of sepsis and septic shock, identification of novel candidate pathways and targets for potential intervention, discovery of novel, candidate diagnostic and stratification biomarkers, and the ability to stratify patients into clinically relevant, expression-based subclasses. The challenges moving forward include robust validation studies, standardization of technical approaches, standardization and further development of analytical algorithms, and large-scale collaborations.


Introduction
Over the past decade several investigators have applied microarray technology and related bioinformatic approaches to clinical sepsis and septic shock, thus allowing for an assessment of how, or if, this branch of genomic medicine has meaningfully impacted the sepsis field. This review will first provide an overview of the gene microarray approach, including limitations and study design considerations. Subsequently, the review will focus on the potential translational application of microarray data and genome-wide expression profiling to the sepsis field. Four broad areas will be discussed: genome-level understanding of sepsis, biomarker discovery, gene expression-based identification of septic shock subclasses, and discovery of novel targets and pathways.

Technology, approaches, and limitations
Microarray-related technology, approaches, and limitations have been extensively reviewed elsewhere [1-5], and will be summarized below. Notably, there is now an emerging technology, RNA sequencing (RNA-Seq) [6], that has potentially intriguing applications for the field, but will not be further discussed as there are no RNA-Seq data specifically related to sepsis.
The fundamental technical innovation of microarray technology is the ability to simultaneously measure mRNA abundance of thousands of transcripts (transcriptomics). The technique generally involves reverse transcription of RNA into cDNA, with the inclusion of a labeling molecule for detection. The labeled cDNA (targets) is subsequently applied to a support surface arrayed with nucleotide sequences corresponding to specific genes (probes). The probes and targets hybridize via standard nucleic acid interactions and the amount of hybridization reflects the abundance of a specific mRNA species. The supporting surface is subsequently washed and scanned to provide raw mRNA abundance data. An important limitation of transcriptomics is that it solely provides a 'snapshot' of steady-state mRNA abundance. The degree of mRNA abundance is influenced by multiple factors, and does not provide any direct information about gene end products (proteins), nor post-translational modifiers of protein function, such as phosphorylation or glycation.
One major consideration in designing a microarray experiment involves the RNA source. Ideally, the RNA source should be relatively homogenous and closely represent the disease/condition biology of interest. For example, the discovery of neutrophil gelatinase-associated lipocalin as a biomarker for acute kidney injury included microarray-based analysis of kidneys from rodents subjected to renal ischemia [7]. Most of the studies described below have used the blood compartment as the RNA source. Reliance on the blood compartment has obvious limitations with regard to specific organ perturbations in clinical sepsis, but also reflects the practical limitations of tissue sampling in clinical research and does provide a broad picture of a systemic response. Blood-derived RNA can come from either whole blood (a mixed population of blood cells), or following the isolation of specific blood cells. The whole-blood approach facilitates the procurement of samples from multiple centers, without the requirement for cell separation expertise, and has the potential to provide a comprehensive picture. However, the whole-blood approach has the potential to confound data interpretation due to heterogeneous blood cell populations. The cell-specific RNA approach provides a more homogenous RNA source, but has the potential to miss biologically relevant expression signatures from cells that are excluded from the experimental approach. For example, a study that focuses exclusively on peripheral blood mononuclear cells will not account for the potentially important response of neutrophils.
Another important consideration in designing a microarray experiment involves the reference (control) group to which gene expression in the population of interest will be compared. For example, if one is interested in studying gene expression patterns in sepsis, relative to a normal state, then comparison to normal controls is appropriate. In contrast, if one is interested in discovering gene expression patterns that distinguish sepsis from 'sterile inflammation', then a more appropriate control group would consist of patients who are not infected, but meet criteria for systemic inflammatory response syndrome (SIRS).
The heterogeneity and complexity that characterize clinical sepsis present an important challenge to clinical microarray studies. From one perspective, one could say that the comprehensive nature of a microarray approach is ideally suited for studying such a heterogeneous and complex syndrome. From another perspective, the heterogeneity and complexity are potentially profound confounders for data interpretation. Accordingly, it is critical that microarray data be interpreted in the context of robust clinical/biological data that can influence gene expression patterns. These include, but are not limited to, race, gender, age, co-morbidities, infecting pathogen class, state of immune competence, and therapy.
Analysis of microarray data is an evolving and complex field. A universal initial step involves data normalization, which allows valid comparisons across samples by reducing technical variations not directly related to biological variation [5]. A typical next step involves statistical comparisons across groups of interest using either parametric or non-parametric analysis of variance. Unfortunately, there is no clear consensus as to which statistical test is most appropriate for a given data set, and it is particularly troubling that lists of 'differentially regulated genes', from the same data set, can substantially vary based on the statistical test [8,9]. Regardless of what statistical test one uses, it is imperative that the statistical test incorporates corrections for multiple comparisons to account for a substantial risk of false positives. One common filter applied to microarray data is an expression filter that compares mRNA abundance of specific gene probes in one cohort versus a reference cohort. Expression filters are useful to assess 'magnitude of effect' and to reduce the number of comparisons for a subsequent statistical test, but they are not valid substitutes for formal statistical testing. Finally, there is the issue of statistical power in microarray experiments, which can be calculated, but is dependent on assumptions that can be difficult to derive objectively [10]. In general, a heterogeneous study cohort will require substantially more independent samples, compared to a more homogenous cohort.
The statistical tests described above typically yield large lists of differentially regulated genes, thus leaving one with the challenge of assigning biological meaning to these gene lists. One approach to data interpretation involves the generation of 'heat maps', which statistically cluster genes and samples based on similarity of expression. Heat maps provide a broad picture of gene expression patterns and allow for the discovery of disease 'subclasses' based on differential gene expression [11]. Another approach to viewing large microarray data sets involves the generation of gene expression 'mosaics' based on a 'self-organizing map' algorithm [12,13]. These gene expression mosaics provide microarray data with a 'face' that is recognizable via intuitive pattern recognition, and were recently applied to allocate patients with septic shock into clinically relevant subclasses [14,15].
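To make the statistical filtering steps discussed in this section concrete, the following is a minimal sketch, not a method taken from any of the cited studies. It assumes that each gene already has a log2 fold change (the expression filter) and a raw P-value from some upstream test, and it uses the Benjamini-Hochberg false discovery rate procedure as one common choice of multiple-comparison correction. The gene names, thresholds, and function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_genes(genes, fdr=0.05, min_abs_log2_fc=1.0):
    """genes: list of (name, log2 fold change, raw p-value) tuples.

    A gene is kept only if it passes BOTH the corrected statistical test
    and the expression (magnitude of effect) filter.
    """
    adjusted = benjamini_hochberg([p for _, _, p in genes])
    return [
        name
        for (name, log2_fc, _), q in zip(genes, adjusted)
        if q <= fdr and abs(log2_fc) >= min_abs_log2_fc
    ]


# Hypothetical toy input: (gene, log2 fold change vs. controls, raw p-value).
toy_genes = [
    ("GENE_A", 2.3, 0.0001),
    ("GENE_B", 0.4, 0.0004),
    ("GENE_C", -1.8, 0.020),
    ("GENE_D", 1.2, 0.030),
    ("GENE_E", 0.1, 0.600),
]
print(select_genes(toy_genes))  # ['GENE_A', 'GENE_C', 'GENE_D']
```

In this toy input GENE_B is statistically significant but fails the expression filter, which illustrates why the two criteria are complementary rather than interchangeable.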
Beyond these global assessments of gene expression patterns there exist a number of public and proprietary databases allowing for the assignment of biological function to gene lists. These databases examine uploaded gene lists and determine whether the gene list is enriched for genes that are biologically related, based on the established literature. The outputs from these databases range from generic (for example, 'immune response') to specific (for example, 'antigen presentation') biological processes. Furthermore, the outputs from these databases provide an estimate of significance (P-values) indicating how likely a gene list is enriched for a given biological function by chance alone. The level of significance increases with the number of genes in the list that correspond to the given biological function, and decreases with the total number of genes in the list. A related approach to assigning biological meaning to gene lists involves the generation of gene networks based on known, direct and indirect, interactions between genes [16,17].
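The enrichment P-values described above are often computed with a hypergeometric (one-sided Fisher) test, although the exact statistic differs between databases. The sketch below assumes that model; the function name and all counts are hypothetical and are not drawn from any specific annotation database.

```python
from math import comb


def enrichment_p_value(genome_size, annotated_in_genome, list_size, annotated_in_list):
    """Upper-tail hypergeometric probability.

    Probability of seeing at least `annotated_in_list` annotated genes when
    `list_size` genes are drawn at random, without replacement, from a
    background of `genome_size` genes of which `annotated_in_genome` carry
    the annotation of interest.
    """
    total = comb(genome_size, list_size)
    upper = min(annotated_in_genome, list_size)
    tail = sum(
        comb(annotated_in_genome, k)
        * comb(genome_size - annotated_in_genome, list_size - k)
        for k in range(annotated_in_list, upper + 1)
    )
    return tail / total


# Hypothetical example: 200 of 20,000 background genes are annotated to a
# function; a 100-gene list contains either 10 or 5 of them.
p_ten_hits = enrichment_p_value(20000, 200, 100, 10)
p_five_hits = enrichment_p_value(20000, 200, 100, 5)
print(p_ten_hits < p_five_hits)  # True: more hits in the list, smaller P-value
```

Under this model, a fixed number of annotated hits becomes less surprising as the uploaded list grows, which matches the dependence on list size noted in the text.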

Genome-level understanding of sepsis
Microarray-based expression profiling has provided an unprecedented opportunity to gain a broader, genome-level 'picture' of complex and heterogeneous clinical syndromes such as sepsis. In addition, this genome-level approach has the potential to reduce investigator bias, and thus increase discovery capability, inasmuch as all genes are potentially interrogated, rather than a specific set of genes chosen by the investigator based on a priori and potentially biased assumptions.
Many of the fundamental physiologic and biologic principles of the sepsis paradigms are derived from experiments involving human volunteers subjected to intravenous endotoxin challenge [18-21]. More recently, the genome-level response during experimental human endotoxemia has been studied using microarray technology [16,22,23]. Talwar and colleagues [22] compared eight volunteers challenged with intravenous endotoxin to four controls challenged with saline. Mononuclear cell-specific RNA was obtained at four different time points after endotoxin challenge and analyzed via microarray. As expected, a large number of transcripts related to inflammation and innate immunity were substantially up-regulated in response to endotoxin challenge. Interestingly, the peak transcriptomic response to the single endotoxin challenge occurred within 6 hours and mRNA levels generally returned to control levels within 24 hours. The investigators also reported endotoxin-mediated differential regulation of over 100 genes not typically associated with acute inflammation (for example, cathepsin H, sialidase 1, UDP-glucose dehydrogenase, zinc finger protein 266, homeo box B2). Finally, and of relevance to subsequent sections of this review, endotoxin challenge also led to repression of several gene programs directly related to adaptive immunity (for example, interleukin-7 receptor, T cell receptor α locus, zeta-chain T cell receptor associated protein kinase 70 kDa, T cell receptor γ locus).
Calvano and colleagues [16] also studied normal volunteers subjected to a single endotoxin challenge, but applied a (then) novel approach to microarray data analysis centered on knowledge-based interactive gene networks. Again, the maximal up-regulation of gene networks corresponding to inflammation and innate immunity occurred at approximately 6 hours after the endotoxin challenge, and generally returned to baseline by 24 hours. Perhaps the most interesting finding from this network-centered analysis, however, was the widespread and early repression of gene networks related to mitochondrial energy production (for example, NADH dehydrogenase 1, pyruvate dehydrogenase, ATP synthase) and protein synthesis (for example, ribosomal protein L3, ribosomal protein S8, eukaryotic translation initiation factor). Tang and colleagues [24] have corroborated the repression of mitochondrial energy production-related genes in a study focused on neutrophil-specific gene expression in critically ill patients with sepsis.
The human endotoxemia studies described above provide a highly controlled and reproducible experimental setting to explore sepsis biology at the level of the entire transcriptome, but as with all sepsis models, this model does not fully replicate the complex and heterogeneous syndrome seen at the bedside following infection with live microbes [25]. Consequently, several investigators have attempted microarray-based studies in critically ill patients with sepsis and septic shock. These studies present considerable experimental challenges due to the inherent heterogeneity of clinical sepsis and septic shock. Nonetheless, several studies have provided novel insight into the overall genome-level response to sepsis [9,17,24,26-34]. A common theme across many of these studies is the massive up-regulation of inflammation- and innate immunity-related genes in patients with sepsis and septic shock. These observations are not intrinsically novel, but they are consistent with the long-standing sepsis paradigms centered on a hyperactive inflammatory response, and thus provide an important layer of biological plausibility with regard to overall microarray data output in the context of clinical sepsis.
Another common paradigm in the sepsis field involves a two-phase model consisting of an initial hyper-inflammatory phase followed by a compensatory anti-inflammatory phase, but this has been recently challenged, in large part due to the multiple failures of interventional clinical trials founded on this paradigm [35-37]. Recently, Tang and colleagues [3] conducted a formal systematic review of a carefully selected group of microarray-based human sepsis studies. A major conclusion of this systematic review is that, in aggregate, the transcriptome-level data do not consistently separate sepsis into distinct pro- and anti-inflammatory phases. This conclusion has been questioned [38], but is supported by several recent cytokine- and inflammatory mediator-based studies in clinical and experimental sepsis [39-41].
Another prevailing paradigm in the sepsis field involves the concept of immune-paralysis, which frames sepsis as more of an adaptive immune problem (rather than just an overactive innate immune system) and the inability to adequately clear infection [42,43]. Recently, this paradigm was elegantly corroborated in mice subjected to sepsis and rescued by administration of IL-7, an anti-apoptotic cytokine essential for lymphocyte survival and expansion [44,45]. As mentioned previously, studies in human volunteers challenged with endotoxin revealed early repression of gene programs related to adaptive immunity [22]. In studies focused on mononuclear cell-specific expression profiles, Tang and colleagues [30,31] have also reported early repression of adaptive immunity genes in patients with sepsis. Finally, multiple studies in children with septic shock have reported, and validated, early and persistent repression of adaptive immunity-related gene programs (for example, genes corresponding to the T cell receptor) [9,11,14,15,17,32-34]. Thus, the concept of adaptive immune dysfunction as an early and prominent feature of clinical sepsis and septic shock seems to be well supported by the available genome-wide expression data.
Developmental age is thought to be a major contributor to sepsis heterogeneity. Recently, a microarray-based study in children with septic shock corroborated this concept at the genomic level [46]. Four developmental age groups of children were compared based on whole-blood-derived gene expression profiles. Children in the 'neonate' group (<28 days of age) demonstrated a unique expression profile relative to older children. For example, children in the neonate group demonstrated widespread repression of genes corresponding to the triggering receptor expressed on myeloid cells 1 (TREM1) pathway. TREM1 is critical for amplification of the inflammatory response to microbial products and there has been recent interest in blockade of the TREM1 signaling pathway in septic shock [47]. The observation that TREM1 signaling may not be relevant in neonates with septic shock illustrates how some potential therapeutic strategies for septic shock may not have biological plausibility in certain developmental age groups.

Biomarker discovery
A daily conundrum in the intensive care unit is distinguishing which patients who meet criteria for SIRS are infected from those who are not. Accordingly, there are ongoing efforts to discover diagnostic biomarkers for sepsis (SIRS secondary to infection), and microarray approaches have the potential to enhance these efforts. Several investigators have reported genome-level signatures that can distinguish patients with SIRS from patients with sepsis [26,29,31,48]. A substantial amount of work, including validation, remains to be done in order to leverage these datasets into clinically applicable diagnostic biomarkers, but the datasets nonetheless provide a foundation for the derivation and development of diagnostic biomarkers for sepsis.
Investigators have also applied microarray technology to address other important clinical challenges directly related to infection. Cobb and colleagues [49,50] have reported an expression signature (the 'ribonucleogram') having the potential to predict ventilator-associated pneumonia in critically ill blunt trauma patients up to 4 days before traditional clinical recognition. Similarly, Ramilo and colleagues [51] have reported expression signatures that can distinguish influenza A infection from bacterial infection, and Escherichia coli infection from Staphylococcus aureus infection, in hospitalized febrile children. In contrast, Tang and colleagues [30] were unable to define organism-specific gene expression signatures (Gram-positive versus Gram-negative bacteria) in critically ill adults with sepsis.
Another aspect of biomarker development in sepsis surrounds stratification biomarkers, particularly to predict outcome. Theoretically, any gene that is consistently differentially regulated between survivors and non-survivors in a microarray dataset may warrant further investigation as a potential outcome biomarker. For example, a microarray study by Pachot and colleagues [27,52] identified CX3CR1 (fractalkine receptor) as a potential outcome biomarker in sepsis. Similarly, Nowak and colleagues [53] have leveraged microarray data to identify chemokine (C-C motif) ligand 4 (CCL4) as an outcome biomarker in children with septic shock. Both candidate stratification biomarkers, however, require further validation.
IL-8 has emerged as a robust stratification biomarker in children with septic shock [54], and the rationale for pursuing it stemmed directly from microarray-based studies identifying IL-8 as one of the more highly expressed genes in pediatric non-survivors of septic shock, compared to survivors [34]. Subsequent studies demonstrated that serum IL-8 protein levels, measured within 24 hours of presentation to the intensive care unit with septic shock, could predict survival in pediatric septic shock with a probability of 95% [54]. The ability of IL-8 to serve as a stratification biomarker was subsequently validated in a completely independent cohort of children with septic shock. Consequently, it has been proposed that IL-8 could be used in future pediatric septic shock interventional trials to exclude patients having a high likelihood of survival with standard care, thereby improving the risk-to-benefit ratio of a given intervention. This type of stratification strategy would be particularly applicable for an intervention that carries more than minimal risk. Interestingly, it appears that IL-8-based stratification may not perform in a similarly robust manner in adults with septic shock [55], thus providing another example of how developmental age contributes to septic shock heterogeneity.
Currently, there is an ongoing effort to derive and validate a multi-biomarker sepsis outcome risk model in pediatric septic shock. The foundation of this effort is the relatively unbiased selection of a panel of candidate outcome biomarkers using microarray data from a large cohort of children with septic shock [56,57].

Gene-expression-based identification of septic shock subclasses
Viewing septic shock as a highly heterogeneous syndrome implies the existence of 'disease subclasses', in an analogous manner to that encountered in the oncology field [37]. Recently, there has been an attempt to identify septic shock subclasses in children based on genome-wide expression profiling [11]. Complete microarray data from a large cohort of children with septic shock, representing the first 24 hours of admission, were used to identify septic shock subclasses. A heat map of over 6,000 differentially regulated genes was generated using an unsupervised clustering algorithm. Patients were then classified into one of three subclasses (subclasses 'A', 'B', or 'C') based on statistically similar gene expression patterns, as determined by the first and second order branching patterns of the heat map. Subsequently, the clinical database was mined to determine if there were any phenotypic differences between the three subclasses. Patients in subclass A had a significantly higher level of illness severity as measured by mortality, organ failure, and illness severity score.
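As an illustration of how unsupervised clustering can partition patients by expression similarity, the following is a minimal sketch only. It assumes average-linkage agglomerative clustering on a correlation distance and simply stops merging once three groups remain, which is a simplification of reading subclasses from the first and second order branches of a dendrogram. It is not the algorithm, distance measure, or data of the cited study; patient identifiers and expression values are hypothetical.

```python
from statistics import mean


def correlation_distance(a, b):
    """1 - Pearson correlation between two expression profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / (var_a * var_b) ** 0.5


def cluster_patients(profiles, n_clusters=3):
    """Average-linkage agglomerative clustering of patients.

    profiles: dict mapping patient id -> list of expression values.
    Returns a list of clusters, each a sorted list of patient ids.
    """
    clusters = [[patient_id] for patient_id in profiles]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        return mean(
            correlation_distance(profiles[p], profiles[q]) for p in c1 for q in c2
        )

    while len(clusters) > n_clusters:
        pairs = [
            (i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))
        ]
        # Merge the two closest clusters.
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: six patients, four genes, three expression patterns.
toy_profiles = {
    "P1": [5, 5, 1, 1],
    "P2": [6, 5, 1, 2],
    "P3": [1, 1, 5, 5],
    "P4": [2, 1, 6, 5],
    "P5": [1, 5, 1, 5],
    "P6": [1, 6, 2, 5],
}
print(cluster_patients(toy_profiles))
# [['P1', 'P2'], ['P3', 'P4'], ['P5', 'P6']]
```

The clustering itself uses no clinical information; as in the study described above, any association between the resulting groups and illness severity has to be examined afterwards against the clinical database.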
The gene expression patterns that distinguished the subclasses were distilled to a 100-gene expression signature by conducting a leave-one-out cross-validation procedure and selecting the 100 genes having the greatest subclass prediction capability. These 100 genes were then uploaded to a gene expression database that identified enrichment for genes corresponding to adaptive immunity, glucocorticoid receptor signaling, and the peroxisome proliferator-activated receptor-α signaling pathway. Of note, the genes corresponding to these functional annotations were generally repressed in the subclass of patients with the higher level of illness severity (that is, subclass A patients).
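The idea of ranking genes by their subclass prediction capability under leave-one-out cross-validation can be sketched as follows. This is an illustrative assumption, not the procedure used in the cited study: it scores each gene individually with a nearest-centroid classifier, and the expression values, labels, and function names are hypothetical.

```python
def nearest_centroid_predict(train, sample):
    """train: list of (profile, label). Return the label of the closest centroid."""
    by_label = {}
    for profile, label in train:
        by_label.setdefault(label, []).append(profile)
    best_label, best_dist = None, float("inf")
    for label, members in by_label.items():
        centroid = [sum(column) / len(members) for column in zip(*members)]
        dist = sum((x - c) ** 2 for x, c in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(data, gene_indices):
    """Leave-one-out accuracy using only the genes listed in gene_indices."""
    reduced = [([profile[g] for g in gene_indices], label) for profile, label in data]
    correct = 0
    for i, (profile, label) in enumerate(reduced):
        # Hold out patient i, train on all remaining patients.
        train = reduced[:i] + reduced[i + 1:]
        correct += nearest_centroid_predict(train, profile) == label
    return correct / len(reduced)


def rank_genes(data, n_genes):
    """Order gene indices from most to least predictive (ties broken by index)."""
    scores = [(loocv_accuracy(data, [g]), g) for g in range(n_genes)]
    return [g for _, g in sorted(scores, key=lambda s: (-s[0], s[1]))]


# Hypothetical toy data: three genes measured in six patients of two subclasses.
toy_data = [
    ([9.0, 5.0, 3.0], "A"),
    ([8.5, 2.0, 3.5], "A"),
    ([9.2, 4.0, 6.0], "A"),
    ([2.0, 5.5, 6.5], "B"),
    ([1.5, 2.5, 7.0], "B"),
    ([2.2, 4.5, 3.2], "B"),
]
print(rank_genes(toy_data, 3))  # [0, 2, 1]
```

Keeping the top-ranked genes yields a reduced signature. Because the genes are selected on the same patients used to score them, the resulting accuracy is optimistic, which is one reason an independent validation cohort matters.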
In a subsequent study, the expression patterns of the 100 subclass-defining genes were depicted using visually intuitive gene expression mosaics and shown to a panel of clinicians with no formal bioinformatic training and blinded to the actual patient subclasses (Figure 1). The clinicians were able to allocate patients into the respective subclasses with a high degree of sensitivity and specificity [15]. The ability to identify a subclass of children with a higher illness severity was further corroborated when the gene-expression-based subclassification strategy was applied to a separate validation cohort of children with septic shock [14]. Collectively, these studies demonstrate the feasibility of subclassifying patients with septic shock, in a clinically relevant manner, based on the expression patterns of a discrete set of genes having relevance to sepsis biology. The availability of clinical microfluidics [58] and digital mRNA measurement technology [59] may allow for clinical feasibility of measuring the 100 class-defining genes in a timely manner that is suitable to direct patient care or for clinical trial stratification.

Discovery of novel targets and pathways
The potential to interrogate the entire genome in a relatively unbiased manner provides an opportunity to discover previously unrecognized, or unconsidered, targets and pathways relevant to sepsis biology. This is a daunting task in the context of a highly heterogeneous syndrome such as clinical sepsis, and the many unavoidable confounding factors inherent to clinical sepsis microarray studies. Nonetheless, several studies illustrate the potential of genome-wide expression profiling in the discovery of novel targets and pathways.
For example, using a combination of expression profiling and in vitro approaches, Pathan and colleagues [60] have identified interleukin-6 as a major contributor to myocardial depression in patients with meningococcal sepsis. This is a particularly intriguing and robust study because the study population is relatively homogeneous (that is, exclusively patients with meningococcal sepsis) compared to the majority of sepsis microarray studies that have enrolled patients with heterogeneous sepsis etiologies.
In one of the earliest clinical sepsis microarray studies, Pachot and colleagues [27] identified a set of genes differentially regulated between survivors and non-survivors. The gene most highly expressed in survivors, relative to non-survivors, was that encoding the chemokine receptor CX3CR1 (fractalkine receptor). In a subsequent validation study, these same investigators provided further evidence supporting the novel concept that dysregulation of CX3CR1 in monocytes contributes to immune-paralysis in human sepsis [52]. These studies further demonstrate the potential to discover novel pathways through discovery-oriented expression profiling.
Several studies in children with septic shock have documented early and persistent repression of gene programs directly related to zinc homeostasis, in combination with low serum zinc concentrations [9,11,17,32,34]. Since normal zinc homeostasis is absolutely critical to normal immune function [61], these observations have raised the possibility of zinc supplementation as a potentially safe and low-cost therapeutic strategy in clinical septic shock and other forms of critical illness [62-64]. Importantly, Knoell and colleagues [65,66] have independently corroborated that zinc supplementation is a highly beneficial strategy in experimental sepsis. Additional studies by Knoell and colleagues [67] have corroborated decreased plasma zinc concentrations in patients with sepsis, and that low plasma zinc concentrations correlate with higher illness severity. Furthermore, plasma zinc concentrations correlate inversely with monocyte expression of the zinc transporter gene SLC39A8 (also known as ZIP8) [67,68]. Interestingly, microarray-based studies in children with septic shock have reported high levels of SLC39A8 expression in non-survivors, relative to survivors [34]. Despite the intriguing convergence of these data from independent laboratories, the safety and efficacy of zinc supplementation in clinical sepsis remain to be directly demonstrated and are a current area of active investigation. One consideration for these studies will be the incorporation of transcriptomic analyses to determine if zinc supplementation influences the zinc-related gene repression patterns described above.
In the aforementioned studies involving children with septic shock, matrix metalloproteinase-8 (MMP-8) has consistently been the highest expressed gene in patients with septic shock, relative to normal controls [9,11,17,32-34,46]. In addition, MMP-8 is more highly expressed in patients with septic shock compared to patients with sepsis, and in septic shock non-survivors compared to septic shock survivors [69]. MMP-8 is also known as neutrophil collagenase because it is a neutrophil-derived protease that cleaves collagen in the extracellular matrix, but MMP-8 is also known to have other cellular sources and non-extracellular matrix substrates, including chemokines and cytokines [70]. The consistently high level of expression of MMP-8 in clinical septic shock recently stimulated the formal study of MMP-8 in experimental sepsis. These studies demonstrated that either genetic ablation of MMP-8 or pharmacologic inhibition of MMP-8 activity confers a significant survival advantage in a murine model of sepsis [69]. While these studies require further development and validation, the findings are intriguing given that there exist a number of drugs to effectively inhibit MMP-8 activity in the clinical setting [71].

Conclusion
Despite the tremendous methodological challenges that come with translational research involving humans with sepsis, microarray technology and complex bioinformatic approaches are beginning to provide novel insights into this complex syndrome. Progress, albeit slow, has been realized with regard to our understanding of the genome-level response during sepsis, the identification of potential novel targets and pathways, discovery of candidate diagnostic and stratification biomarkers, and the possibility of clinically relevant and clinically feasible gene-expression-based subclassification. The challenges ahead include robust validation studies, standardization of technical approaches, standardization and further development of analytical algorithms, and large-scale collaborations.

Competing interests
The Cincinnati Children's Hospital Research Foundation and the author have a provisional patent for the use of interleukin-8 as a stratification biomarker for septic shock.