Study design and research governance
This was a multi-centre, prospective, observational clinical trial conducted across four tertiary critical care settings in Australia from November 2007 to November 2009. Athlomics Pty Ltd sponsored the trial and has registered this product as the SeptiCyte® Lab test. The sponsor initiated and designed the trial in collaboration with clinical investigators. The study protocol was approved by institutional review boards (IRBs)/Human Research Ethics Committees (HRECs) from Mater Health Services (MHS), Uniting Care, the Royal Brisbane & Women's Hospital and the Nepean Hospital Human Research Ethics Committee, prior to the recruitment of study volunteers. Independent clinical research organisations contracted by the sponsor were responsible for the monitoring and management of clinical data including verification with source notes.
Data collected from the aforementioned clinical trial were used to perform microarray studies to define a gene set on which to focus the MT-PCR studies. Subsequently, an a priori panel of gene expression biomarkers was applied to MT-PCR data from the clinical trial to create a diagnostic rule. The MT-PCR data were randomly partitioned into a training set and a test set. The diagnostic rule was generated from the training set and then applied to the test set in a blinded fashion. MT and GS performed bioinformatics and statistical analyses in accordance with details presented in the study protocol and statistical analysis plan.
Study population and criteria for inclusion and exclusion
All study participants were 18 years or older, had a body mass index < 40, and provided written informed consent. Patients were recruited as being likely to enter the sepsis cohort if they met the criteria of the ACCP/SCCM Consensus Statement [6] and had a clinical suspicion of systemic infection based on microbiological diagnoses. A definitive diagnosis of sepsis was unlikely to be known at the time patients were enrolled in the study; thus, confirmation of sepsis and assignment of patients to the sepsis cohort were made retrospectively.
Potential sepsis participants admitted to the ICU, and patients admitted for planned major open surgery, were excluded from the study if they had any systemic immunological disorder, including Systemic Lupus Erythematosus, Crohn's disease and Insulin-Dependent Diabetes Mellitus (Type 1 diabetes); were transplant recipients; or were currently receiving chemotherapy treatment for cancer.
Twenty-seven blood culture-positive sepsis patients with community-acquired infections were enrolled into this clinical trial as soon as practicable on admission to the ICU. Specifically, sepsis patients were enrolled within 24 hours of admission and, on average, were in the ICU for five days. Participants in the post-surgical (PS) cohort (n = 38) were recruited pre-operatively, and blood samples were collected within 24 hours following surgery. Furthermore, 20 healthy adult control (HC) participants were recruited from among the Mater Adult Hospital staff, on the basis that they had no concurrent illnesses at the time of blood collection and no past history of immunological dysfunction. All participants or their surrogate decision-makers provided written informed consent prior to the collection of any study data or biological samples.
Collection of data
Demography, vital sign measurements (blood pressure, heart rate, respiratory rate, tympanic temperature), haematology (full blood count), clinical chemistry (urea, electrolytes, liver function enzymes, blood glucose) and microbial status were recorded. Blood was drawn into at least two PAXgene tubes (PreAnalytix, Feldbachstrasse, Hombrechtikon, Switzerland; 5 ml total) for gene expression analyses using the SeptiCyte Lab test.
Gene expression assays
RNA isolation was performed using PAXgene Blood RNA kits (PreAnalytix, a Qiagen/BD Company, Feldbachstrasse, Switzerland), following the standard instructions recommended by the manufacturer. RNA quality was determined using automated electrophoresis stations (Experion, BioRad, Gladesville, New South Wales, Australia; and BioAnalyser, Agilent, Forest Hill, Victoria, Australia). The 260 nm/280 nm ratios for all samples were > 1.9. Once total RNA was extracted, gene expression was assessed using the Affymetrix HGU133 Plus 2.0 GeneChip® (Santa Clara, CA, USA) and the MT-PCR SeptiCyte Lab assay.
Microarray studies were conducted on total RNA extracted from a subset of participants comprising 20 HC, 11 post-surgical and 10 sepsis samples, using HGU133 Plus 2.0 GeneChips (Affymetrix, Santa Clara, CA, USA). From the 145 gene expression biomarkers derived from the early sepsis equine model, 408 probesets were derived. This panel was selected because it had been demonstrated a priori to contain information relevant to the separation of the three groups. In brief, 10 μg of total RNA was processed using the 3' Amplification One-Cycle Target Labelling kit (Affymetrix) and 20 μg of the generated cRNA was injected into a GeneChip cartridge. The GeneChip array was incubated at 45°C for at least 16 hr in a rotating oven at 60 rpm. GeneChips were then washed and stained with Streptavidin Phycoerythrin using the Affymetrix-supplied wash protocol EukGE-Ws2v5.
HGU133 fluorescent images were acquired in a GeneChip® Scanner 3000 7G (Affymetrix). Affymetrix Power Tools, R and Perl scripts were used to filter background noise and normalise data based on a detection metric used to identify perfect match probes relative to other background probes. Differentially expressed genes were compared if the signal was > 100 and the fold change was > 2.0. All HGU133 Plus 2.0 GeneChip data are available from the GEO repository (GSE28750).
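The signal and fold-change thresholds described above can be illustrated with a minimal Python sketch. It is not the authors' Affymetrix Power Tools, R or Perl pipeline. It assumes linear-scale signals, that the signal criterion applies to the larger of the two group means, and that fold change is the ratio of group means in whichever direction is at least 1. The function name and example values are hypothetical.

```python
from statistics import mean

def passes_filter(group_a, group_b, min_signal=100.0, min_fold=2.0):
    """Return True if a probeset meets the signal and fold-change thresholds.

    Assumptions (not stated in the source): signals are linear-scale
    intensities, the signal criterion is applied to the larger group mean,
    and fold change is taken in whichever direction is >= 1.
    """
    mean_a, mean_b = mean(group_a), mean(group_b)
    low, high = min(mean_a, mean_b), max(mean_a, mean_b)
    if low <= 0:
        return False  # fold change is undefined for non-positive signals
    return high > min_signal and (high / low) > min_fold

# Hypothetical probeset signals for two groups.
print(passes_filter([400.0, 420.0], [150.0, 160.0]))
```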
The MT-PCR approach, first described by Stanley and Szewczuk [15], combines reverse transcription and preliminary amplification in Step 1, followed by final target amplification in Step 2. In brief, a mastermix was prepared that contained, in part, lyophilized primers for Step 1 of this method and 10 ng of RNA template. Primer sets were designed by AusDiagnostics Pty Ltd (Sydney, NSW, Australia) and validated for clinical performance by Mater Pathology. Inner or nested amplicons were approximately 70 to 90 base pairs (bp) and the outer amplicons were up to 150 bp. All primers spanned intron-exon boundaries and were preferentially designed toward the 3' end.
In this reverse transcription and amplicon enrichment stage, reactions were performed in 20 µl volumes consisting of 10 µl of Step 1 mastermix, 8 µl of molecular biology grade diethylpyrocarbonate (DEPC) water, and 2 µl of RNA template (minimally 10 µg). The final target amplification of multiplexed amplicons (Step 2) involved preparation of the following PCR mix: 10 µl of the Step 1 reaction was added to 550 µl of DEPC water and 560 µl of mastermix. Twenty-microliter aliquots of this Step 2 PCR mix were then added to the corresponding gene disk containing the lyophilized inner or nested primers. Amplification was performed on the RG6000 (Qiagen) thermal cycler.
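As a worked check of the Step 2 volumes stated above, the following Python sketch computes the total mix volume, the number of whole 20-microlitre aliquots it could supply, and the resulting dilution of the Step 1 reaction. It ignores pipetting dead volume, and the function name is hypothetical.

```python
def step2_mix(step1_ul=10.0, water_ul=550.0, mastermix_ul=560.0, aliquot_ul=20.0):
    """Return (total volume in ul, whole aliquots available, Step 1 dilution factor)."""
    total_ul = step1_ul + water_ul + mastermix_ul
    n_aliquots = int(total_ul // aliquot_ul)  # whole aliquots only
    dilution = total_ul / step1_ul            # fold dilution of the Step 1 product
    return total_ul, n_aliquots, dilution

print(step2_mix())  # (1120.0, 56, 112.0)
```

With the stated volumes, the mix totals 1120 microlitres, which is enough for at most 56 aliquots, and the Step 1 product is diluted 112-fold.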
Gene expression levels, expressed as "relative fold change", were determined using a method comparable to the 2^(-ΔΔCt) method described by Pfaffl [16], where the point of peak cycling acceleration was extrapolated to represent maximum fold change. JAG1, FUK and PRDM8 were used as normalisation control genes in the calculation of gene expression fold changes. Five negative controls (using DEPC water as a template) and five positive controls (using commercially available Universal RNA as a template) were included as references.
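For orientation, the standard 2^(-ΔΔCt) calculation can be sketched in Python as follows. This is the textbook form, not the authors' variant based on the point of peak cycling acceleration. It assumes approximately 100% amplification efficiency and uses the arithmetic mean Ct of the normalisation genes as the reference. The function name and all Ct values are hypothetical.

```python
from statistics import mean

def relative_fold_change(target_ct_sample, ref_cts_sample,
                         target_ct_control, ref_cts_control):
    """Standard 2^(-ddCt) relative fold change (assumes ~100% PCR efficiency).

    ref_cts_* hold the Ct values of the normalisation genes (for example one
    value each for three reference genes); their arithmetic mean is used.
    """
    d_ct_sample = target_ct_sample - mean(ref_cts_sample)
    d_ct_control = target_ct_control - mean(ref_cts_control)
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)

# Hypothetical Ct values: the target crosses threshold two cycles earlier in
# the sample than in the control, with identical reference gene Cts.
print(relative_fold_change(22.0, [20.0, 20.0, 20.0], 24.0, [20.0, 20.0, 20.0]))  # 4.0
```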
Bioinformatics and statistical analysis of data
Four multi-feature classifiers were used to separate HC versus Mixed Inflammation (MI, comprising both the PS and sepsis groups), as well as PS versus sepsis, using the HGU133 Plus 2.0 gene expression data. These classification techniques were Recursive Partitioning, Figueiredo's method, the "least absolute shrinkage and selection operator" (LASSO), and Logistic Regression on Principal Components. In particular, the first three of these four classifiers were applied because they have the capacity to identify small subsets of biomarkers from a high-throughput platform. Individual genes were examined using an empirical Bayes adjusted linear model [17], with P-values adjusted for multiple comparisons using Holm's method [18].
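Holm's step-down adjustment mentioned above can be written in a few lines of Python. The sketch below is a generic implementation for illustration, not the authors' code, and the example P-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values for three genes; adjusted values are
# approximately 0.03, 0.06 and 0.06.
print(holm_adjust([0.01, 0.04, 0.03]))
```

A gene would then be called significant when its adjusted P-value falls below the chosen family-wise error rate.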
In the absence of a formal 'validation set' of samples to which to apply the algorithm created from the 'training set' of data, the Leave-One-Out Cross Validation technique was used. To estimate the error rate using this technique, a sample was removed from the original set, the model was rebuilt using the same procedure as before (including re-running of the pre-selection step), and this model was used to predict the 'left out' sample. This was repeated so that each sample was left out in turn, and the error rate was computed across all left-out samples.
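The key point of the procedure above, that feature pre-selection is re-run inside every fold, can be sketched in Python as follows. The mean-difference feature ranking and the nearest-centroid classifier are simple stand-ins chosen for brevity, not the four classifiers used in the study. The sketch assumes exactly two classes with at least two samples each, and all names and data are hypothetical.

```python
from statistics import mean

def select_features(X, y, k):
    """Rank features by absolute difference in class means; keep the top k."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        m0 = mean(row[j] for row, lab in zip(X, y) if lab == classes[0])
        m1 = mean(row[j] for row, lab in zip(X, y) if lab == classes[1])
        scores.append((abs(m0 - m1), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]

def fit_centroids(X, y, features):
    """Nearest-centroid stand-in classifier: one mean vector per class."""
    return {
        c: [mean(row[j] for row, lab in zip(X, y) if lab == c) for j in features]
        for c in set(y)
    }

def predict(centroids, features, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((row[j] - mu) ** 2 for j, mu in zip(features, centroids[c]))
    return min(centroids, key=dist)

def loocv_error(X, y, k=1):
    """Leave-one-out error rate with pre-selection repeated in every fold."""
    errors = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        features = select_features(X_train, y_train, k)  # re-run pre-selection
        centroids = fit_centroids(X_train, y_train, features)
        errors += predict(centroids, features, X[i]) != y[i]
    return errors / len(X)

# Hypothetical expression values: feature 0 separates the groups, feature 1 does not.
X = [[1.0, 5.0], [1.2, 4.0], [0.9, 6.0], [3.0, 5.5], [3.1, 4.5], [2.9, 5.0]]
y = ["HC", "HC", "HC", "MI", "MI", "MI"]
print(loocv_error(X, y, k=1))  # 0.0
```

Selecting features once on the full data set and only then cross-validating would let the left-out sample influence the model and bias the error estimate downward, which is why the selection step sits inside the loop.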
To further validate the microarray data, the SeptiCyte Lab signature was applied to all currently available Gene Expression Omnibus (GEO) HGU133 Plus 2.0 GeneChip data derived from human whole blood leucocyte-based gene expression studies. The GEO database is an international, publicly available archive of GeneChip data that can be used in research and development to interrogate proprietary gene signatures in support of assessments of diagnostic utility. Following a review of this public database, an additional set of controls (n = 164) with no known systemic inflammatory or immunological conditions was added to the microarray study cohort.
For the MT-PCR analyses, a panel of up to 42 genes (also known as SeptiCyte Lab) was used to generate a diagnostic classifier using a LogitBoost machine learning algorithm [19]. The data were randomly partitioned into a training set and a validation set. The LogitBoost algorithm was applied to the training set to generate a classifier, which was then applied to the validation set. Posterior probabilities of each condition were obtained for the validation set. These posterior probabilities were used as a diagnostic index, and diagnostic performance was assessed using a receiver operating characteristic (ROC) curve [20, 21]. After the initial area under the ROC curve was calculated, the procedure was repeated such that the data were randomly partitioned 500 times into training and validation sets, and the ROC area was calculated for each random partition. The procedure was also conducted with a random permutation of group labels in the training and validation sets, and the distribution of ROC areas was calculated under this permutation scheme.
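The repeated random partitioning and label permutation described above can be sketched in Python as follows. The sketch does not implement LogitBoost: `fit_centroid_scorer` is a deliberately simple stand-in that returns a continuous score in place of a posterior probability. The 50/50 split, the 0/1 label coding, the skipping of partitions that lack one class, and all names and data are assumptions made for illustration.

```python
import random
from statistics import mean

def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_centroid_scorer(X, y):
    """Stand-in for LogitBoost: project onto the difference of class centroids."""
    n_feat = len(X[0])
    mu1 = [mean(r[j] for r, lab in zip(X, y) if lab == 1) for j in range(n_feat)]
    mu0 = [mean(r[j] for r, lab in zip(X, y) if lab == 0) for j in range(n_feat)]
    w = [a - b for a, b in zip(mu1, mu0)]
    return lambda row: sum(wj * xj for wj, xj in zip(w, row))

def repeated_split_aucs(X, y, n_repeats=500, train_fraction=0.5,
                        permute_labels=False, seed=0):
    """ROC areas over repeated random training/validation partitions.

    Assumes both classes (coded 0 and 1) are present with several samples each.
    """
    rng = random.Random(seed)
    labels = list(y)
    aucs = []
    while len(aucs) < n_repeats:
        if permute_labels:
            rng.shuffle(labels)  # break the link between data and group labels
        idx = list(range(len(X)))
        rng.shuffle(idx)
        cut = int(len(idx) * train_fraction)
        train, valid = idx[:cut], idx[cut:]
        # Skip partitions in which either set lacks one of the two classes.
        if len({labels[i] for i in train}) < 2 or len({labels[i] for i in valid}) < 2:
            continue
        scorer = fit_centroid_scorer([X[i] for i in train], [labels[i] for i in train])
        aucs.append(auc([scorer(X[i]) for i in valid], [labels[i] for i in valid]))
    return aucs

# Hypothetical, perfectly separated one-gene data set.
X = [[0.1 * i] for i in range(10)] + [[5.0 + 0.1 * i] for i in range(10)]
y = [0] * 10 + [1] * 10
observed = repeated_split_aucs(X, y, n_repeats=50)
null = repeated_split_aucs(X, y, n_repeats=50, permute_labels=True)
print(mean(observed), mean(null))
```

Comparing the observed distribution of ROC areas with the distribution obtained under permuted labels indicates how much of the apparent performance could arise by chance.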
It should be noted that no tabulations of sensitivity or specificity have been provided, since these parameters have meaning only in the context of a well-defined clinical population, where the population has been sampled more extensively than that described in this study. Furthermore, sensitivity and specificity may always be traded, that is, specificity may be increased by reducing sensitivity or vice versa. Thus, meaningful claims about sensitivity and specificity can only be made in late-stage clinical research, when the diagnostic thresholds have been established using large cohorts of blinded samples.
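The trade-off between sensitivity and specificity noted above can be shown with a short Python sketch. The scores, labels and thresholds are hypothetical and are not study data. A sample is called positive when its diagnostic index is at or above the threshold.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when score >= threshold is called positive."""
    tp = sum(s >= threshold and lab == 1 for s, lab in zip(scores, labels))
    fn = sum(s < threshold and lab == 1 for s, lab in zip(scores, labels))
    tn = sum(s < threshold and lab == 0 for s, lab in zip(scores, labels))
    fp = sum(s >= threshold and lab == 0 for s, lab in zip(scores, labels))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical diagnostic index values: four controls (0) and four cases (1).
scores = [0.1, 0.3, 0.45, 0.6, 0.4, 0.55, 0.7, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
for threshold in (0.35, 0.5, 0.65):
    print(threshold, sensitivity_specificity(scores, labels, threshold))
# 0.35 (1.0, 0.5)
# 0.5 (0.75, 0.75)
# 0.65 (0.5, 1.0)
```

Raising the threshold moves the same classifier from high sensitivity to high specificity, so a single reported pair of values depends on the threshold chosen.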