A comprehensive time-course based multicohort analysis of ... · infectious inflammation....

16
SEPSIS A comprehensive time-coursebased multicohort analysis of sepsis and sterile inflammation reveals a robust diagnostic gene set Timothy E. Sweeney, 1,2 * Aaditya Shidham, 2 Hector R. Wong, 3,4 Purvesh Khatri 2,5 * Although several dozen studies of gene expression in sepsis have been published, distinguishing sepsis from a sterile systemic inflammatory response syndrome (SIRS) is still largely up to clinical suspicion. We hypothesized that a multicohort analysis of the publicly available sepsis gene expression data sets would yield a robust set of genes for distinguishing patients with sepsis from patients with sterile inflammation. A comprehensive search for gene expression data sets in sepsis identified 27 data sets matching our inclusion criteria. Five data sets (n = 663 samples) compared patients with sterile inflammation (SIRS/trauma) to time-matched patients with infections. We applied our multicohort analysis framework that uses both effect sizes and P values in a leave-one-data set-out fashion to these data sets. We identified 11 genes that were differentially expressed (false discovery rate 1%, interdata set heterogeneity P > 0.01, summary effect size >1.5-fold) across all discovery cohorts with excellent diagnostic power [mean area under the receiver operating characteristic curve (AUC), 0.87; range, 0.7 to 0.98]. We then validated these 11 genes in 15 independent cohorts comparing (i) time-matched infected versus noninfected trauma patients (4 cohorts), (ii) ICU/trauma patients with infections over the clinical time course (3 cohorts), and (iii) healthy subjects versus sepsis patients (8 cohorts). In the discovery Glue Grant cohort, SIRS plus the 11-gene set im- proved prediction of infection (compared to SIRS alone) with a continuous net reclassification index of 0.90. Overall, multicohort analysis of time-matched cohorts yielded 11 genes that robustly distinguish sterile inflammation from infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills about 750,000 people in the United States every year (1). It is also the single most expensive condition treated in the United States, costing the healthcare system more than $20 billion annually (2). Prompt diagnosis and treatment of sepsis is crucial to reducing mor- tality, with every hour of delay increasing mortality risk (3). Sepsis is defined by the presence of the systemic inflammatory response syn- drome (SIRS), in addition to a known or suspected source of infection (1). However, SIRS is not specific for sepsis, because sterile inflamma- tion can arise as a nonspecific response to trauma, surgery, thrombo- sis, and other noninfectious insults. Thus, sepsis can be difficult to distinguish clinically from systemic inflammation caused by non- infectious sources, such as tissue trauma (4). There is no gold stan- dardblood test for distinguishing patients with infections at the time of diagnosis before results become available from standard micro- biological cultures. One of the most common biomarkers of infection, procalcitonin, has a summary area under the receiver operating char- acteristic (ROC) curve (AUC) of 0.78 (range, 0.66 to 0.90) (5). Several groups have evaluated whether cytokine or gene expression arrays can accurately diagnose sepsis; however, because of the highly variable na- ture of host response and human genetics, no robust diagnostic signa- ture has been found (610). Both infections and tissue trauma activate many of the same innate immune receptor families, such as the Toll-like receptors (TLRs) and NOD-like receptors (NLRs), and consequently activate largely over- lapping transcriptional pathways. Thus, distinguishing conserved downstream effects attributable solely to infections has been exceed- ingly difficult. Recent work has shown that there are pattern recog- nition receptors potentially specific to pathogen response, including several glycan receptor families (11). Hence, it may be possible that an infection-specific immune response could be differentiated from sterile inflammation. The ongoing search for new therapies for sepsis, and for new prognostic and diagnostic biomarkers, has generated several dozen microarray-based genome-wide expression studies over the past dec- ade, variously focusing on diagnosis, prognosis, pathogen response, and underlying sepsis pathophysiology (10). Despite tremendous gains in the understanding of gene expression in sepsis, few insights have trans- lated to improvements in clinical practice. Many of these studies have been deposited into public repositories such as the National Institutes of Health Gene Expression Omnibus (GEO) and ArrayExpress, and thus, there is now a wealth of publicly available data in sepsis. In par- ticular, there are several studies comparing patients with sepsis to pa- tients with noninfectious inflammation (such as SIRS) that occurs after major surgery, traumatic injury, or in nonsepsis-related inten- sive care unit (ICU) admission (thrombosis, respiratory failure, etc.). One data set in particular, the Inflammation and Host Response to Injury Program (Glue Grant) (6, 12, 13), has yielded several important findings about the effects of time on gene expression after trauma and in sepsis. One part of the Glue Grant longitudinally examined gene 1 Department of Surgery, Stanford University School of Medicine, Palo Alto, CA 94305, USA. 2 Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA 94305, USA. 3 Division of Critical Care Medicine, Cincinnati Childrens Hospital Medical Center and Cincinnati Childrens Research Foundation, 3333 Burnet Avenue, Cincinnati, OH 45223, USA. 4 Department of Pediatrics, University of Cincinnati College of Medicine, 231 Albert Sabin Way, Cincinnati, OH 45267, USA. 5 Stanford Institute for Immunity, Transplantation and Infection, Stanford University School of Medicine, Palo Alto, CA 94305, USA. *Corresponding author. E-mail: [email protected] (T.E.S.); [email protected] (P.K.) RESEARCH ARTICLE www.ScienceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 1 by guest on June 21, 2020 http://stm.sciencemag.org/ Downloaded from

Transcript of A comprehensive time-course based multicohort analysis of ... · infectious inflammation....

Page 1: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

SEPS I S

A comprehensive time-course–based multicohortanalysis of sepsis and sterile inflammation reveals arobust diagnostic gene setTimothy E. Sweeney,1,2* Aaditya Shidham,2 Hector R. Wong,3,4 Purvesh Khatri2,5*

http://stm.sc

Dow

nloaded from

Although several dozen studies of gene expression in sepsis have been published, distinguishing sepsis from asterile systemic inflammatory response syndrome (SIRS) is still largely up to clinical suspicion. We hypothesizedthat a multicohort analysis of the publicly available sepsis gene expression data sets would yield a robust set ofgenes for distinguishing patients with sepsis from patients with sterile inflammation. A comprehensive searchfor gene expression data sets in sepsis identified 27 data sets matching our inclusion criteria. Five data sets (n = 663samples) compared patients with sterile inflammation (SIRS/trauma) to time-matched patients with infections. Weapplied our multicohort analysis framework that uses both effect sizes and P values in a leave-one-data set-outfashion to these data sets. We identified 11 genes that were differentially expressed (false discovery rate ≤1%,inter–data set heterogeneity P > 0.01, summary effect size >1.5-fold) across all discovery cohorts with excellentdiagnostic power [mean area under the receiver operating characteristic curve (AUC), 0.87; range, 0.7 to 0.98]. Wethen validated these 11 genes in 15 independent cohorts comparing (i) time-matched infected versus noninfectedtrauma patients (4 cohorts), (ii) ICU/trauma patients with infections over the clinical time course (3 cohorts), and (iii)healthy subjects versus sepsis patients (8 cohorts). In the discovery Glue Grant cohort, SIRS plus the 11-gene set im-proved prediction of infection (compared to SIRS alone) with a continuous net reclassification index of 0.90. Overall,multicohort analysis of time-matched cohorts yielded 11 genes that robustly distinguish sterile inflammation frominfectious inflammation.

ien

by guest on June 21, 2020

cemag.org/

INTRODUCTION

Sepsis, a syndrome of systemic inflammation in response to infection,kills about 750,000 people in the United States every year (1). It is alsothe single most expensive condition treated in the United States,costing the healthcare system more than $20 billion annually (2).Prompt diagnosis and treatment of sepsis is crucial to reducing mor-tality, with every hour of delay increasing mortality risk (3). Sepsis isdefined by the presence of the systemic inflammatory response syn-drome (SIRS), in addition to a known or suspected source of infection(1). However, SIRS is not specific for sepsis, because sterile inflamma-tion can arise as a nonspecific response to trauma, surgery, thrombo-sis, and other noninfectious insults. Thus, sepsis can be difficult todistinguish clinically from systemic inflammation caused by non-infectious sources, such as tissue trauma (4). There is no “gold stan-dard” blood test for distinguishing patients with infections at the timeof diagnosis before results become available from standard micro-biological cultures. One of the most common biomarkers of infection,procalcitonin, has a summary area under the receiver operating char-acteristic (ROC) curve (AUC) of 0.78 (range, 0.66 to 0.90) (5). Severalgroups have evaluated whether cytokine or gene expression arrays canaccurately diagnose sepsis; however, because of the highly variable na-

1Department of Surgery, Stanford University School of Medicine, Palo Alto, CA 94305,USA. 2Stanford Center for Biomedical Informatics Research, Stanford University, PaloAlto, CA 94305, USA. 3Division of Critical Care Medicine, Cincinnati Children’s HospitalMedical Center and Cincinnati Children’s Research Foundation, 3333 Burnet Avenue,Cincinnati, OH 45223, USA. 4Department of Pediatrics, University of Cincinnati Collegeof Medicine, 231 Albert Sabin Way, Cincinnati, OH 45267, USA. 5Stanford Institute forImmunity, Transplantation and Infection, Stanford University School of Medicine, PaloAlto, CA 94305, USA.*Corresponding author. E-mail: [email protected] (T.E.S.); [email protected] (P.K.)

www.Sci

ture of host response and human genetics, no robust diagnostic signa-ture has been found (6–10).

Both infections and tissue trauma activatemany of the same innateimmune receptor families, such as the Toll-like receptors (TLRs) andNOD-like receptors (NLRs), and consequently activate largely over-lapping transcriptional pathways. Thus, distinguishing conserveddownstream effects attributable solely to infections has been exceed-ingly difficult. Recent work has shown that there are pattern recog-nition receptors potentially specific to pathogen response, includingseveral glycan receptor families (11). Hence, it may be possible that aninfection-specific immune response could be differentiated from sterileinflammation.

The ongoing search for new therapies for sepsis, and for newprognostic and diagnostic biomarkers, has generated several dozenmicroarray-based genome-wide expression studies over the past dec-ade, variously focusing on diagnosis, prognosis, pathogen response, andunderlying sepsis pathophysiology (10). Despite tremendous gains inthe understanding of gene expression in sepsis, few insights have trans-lated to improvements in clinical practice. Many of these studies havebeen deposited into public repositories such as the National Institutesof Health Gene Expression Omnibus (GEO) and ArrayExpress, andthus, there is now a wealth of publicly available data in sepsis. In par-ticular, there are several studies comparing patients with sepsis to pa-tients with noninfectious inflammation (such as SIRS) that occursafter major surgery, traumatic injury, or in non–sepsis-related inten-sive care unit (ICU) admission (thrombosis, respiratory failure, etc.).

One data set in particular, the Inflammation and Host Response toInjury Program (Glue Grant) (6, 12, 13), has yielded several importantfindings about the effects of time on gene expression after trauma andin sepsis. One part of the Glue Grant longitudinally examined gene

enceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 1

Page 2: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

Dow

nloaded from

expression in patients after severe traumatic injuries. Several groupshave examined these data with respect to time; notable findings arethat (i) more than 80% of expressed genes show differential expressionafter traumatic injury (6); (ii) different clusters of genes recover overmarkedly different time periods (12); (iii) differing scenarios of inflam-mation such as trauma, burns, and endotoxicosis exhibit similar geneexpression changes (12); and (iv) the extent to which posttrauma geneexpression profiles differ from those of healthy controls and their de-gree of gene expression recovery over time are correlated with clinicaloutcomes (13, 14). There is thus growing understanding of the impor-tance of the changes that underlie recovery from trauma and their im-pact on specific clinical outcomes.

We hypothesized that only time-matched comparisons, such asthose that compare SIRS/trauma to sepsis at the same clinical timepoints, would yield genes robustly diagnostic of sepsis. We carriedout a comprehensive, time-course–based multicohort analysis of thepublicly available gene expression data in sepsis to identify a conserved11-gene set that can robustly distinguish noninfectious inflammation(such as SIRS, trauma, and ICU admissions) from inflammation dueto acute infections, as in sepsis. This 11-gene set had excellent diag-nostic power in the discovery cohorts and was then validated in 15 in-dependent cohorts.

by guest on June http://stm

.sciencemag.org/

RESULTS

Comprehensive search and labeled principal componentsanalysis visualizationsWe identified 27 independent gene expression data sets that sat-isfied our criteria in GEO and ArrayExpress, from which we includeda total of 2903 microarrays (table S1) (7, 8, 15–40). These 27 data setscomprised only 22 independent cohorts, because the six data sets fromthe Genomics of Pediatric SIRS/Septic Shock Investigators (GPSSSI)were combined into a single cohort containing 219 patients with SIRSor sepsis at ICU day 1 (15–20). Many of the samples used were from theGlue Grant trauma data sets, which have a total of 333 patients sampledat up to eight time points (1301 samples used here) after traumatic injury.

21, 2020

These 27 data sets contain cohorts of chil-dren and adults, men and women, with amix of community- and hospital-acquiredsepsis, sampled from whole blood, neutro-phils, and peripheral blood mononuclearcells (PBMCs).

First, we sought to use the simplest pos-sible methods to see whether noninfectedSIRS/trauma patients and sepsis/infectionpatients could be separated by gene ex-pression. We thus co-normalized all avail-able data sets comparing SIRS/trauma withsepsis/infection in a single matrix. Labeledprincipal components analysis (PCA) (using168 genes identified by 10-fold cross-validatedLasso-penalized logistic regression) showedthat SIRS/trauma patients can be separatedfrom sepsis patients with modest overlap(Fig. 1A). Next, we labeled each sampleas “early” (within 48 hours of admission) or“late” (more than 48 hours after admis-

www.Sci

sion). Most of the nonseparable samples were the late samples (Fig.1B). This finding remained true even when we included healthy con-trols as a separate class (fig. S1). Previous work has shown that gene ex-pression after trauma, burns, or endotoxemia changes nonlinearly overtime (6, 12, 14, 35). This continuous change in expression after initial insultcould explain the inability to distinguish noninfected SIRS/trauma fromsepsis in the late samples if all time points are treated as equal.

Therefore, we sought to get a qualitative sense of whether gene ex-pression during the hospital course after injury is similar among differ-ent cohorts. We included all peripheral blood data sets that examinedgene expression longitudinally over time after admission for nonsepticevents. We used CUR matrix decomposition to identify the 100 genesthat were most orthogonal to each other and used these to perform labeledPCA with classes determined by days after injury. Reassuringly, the geneexpression at each time point was closest to the time points by whichit was bounded (for example, the days [1,2) group was preceded by days[0,1) and followed by days [2,3); Fig. 2 and movie S1). Furthermore,changes in expression over time explained most variance in the data sets,as evidenced by the different day groups changing in each of the firstthree labeled principal components. In summary, our analysis showedthat the changes in gene expression after trauma/ICU admissions (i) pro-ceed in a nonlinear fashion over time and (ii) show similar changes overtime across data sets.

Time-matched multicohort analysisBecause changes in gene expression after admission for trauma ex-plain a large amount of variance in the data set, and because these changesproceed nonlinearly, direct comparisons of a patient at admission withthat same patient several days later at the time of infection would beconfounded by “normal” changes in expression due to recovery fromthe inciting event, as well as any “abnormal” changes due to the hospital-acquired infection. It would be extremely difficult to disentangle thesechanges, if not impossible. Consequently, comparisons that do not takeclinical time into account will not yield biomarkers that can robustlydiscriminate infected from noninfected patients (Fig. 1). Therefore, wefocused only on infection data sets that also included a time-matchednoninfected cohort (to allow for direct time-matched comparisons).

A B

t

Fig. 1. Labeled PCA comparing sterile SIRS/trauma versus sepsis patients. (A) Sterile SIRS/traumaand sepsis patients appear to be largely separable in the transcriptomic space, with only a minimal non-

separable set. (B) The same labeled PCA is shown, with labels updated to reflect patients in recovery fromnoninfectious SIRS/trauma and patients with hospital-acquired sepsis; the late group (>48 hours after hos-pital admission) is much harder to separate. n = 1094 combined from 15 studies.

enceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 2

Page 3: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

We thus separated the data sets into two groups: (i) data sets compar-ing patients at hospital admission for trauma, surgery, or critical illnessversus patients at admission to the hospital for sepsis [GSE28750 (27),GSE32707 (31), GSE40012 (26), and the GPSSSI unique combined datasets (n = 408 samples) (15–20)] and (ii) the Glue Grant data sets con-taining patients with hospital-acquired infections and day-matchednoninfected patients, from which we used only patients in the buffy coat

www.Sci

sample cohort (Table 1). The Glue Grant trauma cohorts were sampledat roughly 0.5, 1, 4, 7, 14, 21, and 28 days after injury; these cohortswere thus divided into their sampling time bins, creating subgroupsin which patients diagnosed with an infection in a given time bin canbe compared to noninfected patients in the same time bin. For the buffycoat samples, there were at least 10 patients present in five time bins,and these were thus taken for further study. Thus, we used a total of

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

Table 1. Publicly available gene expression data sets comparing SIRS/ICU/trauma to sepsis/infections. CAP, community-acquired pneumonia;ARDS, acute respiratory distress syndrome.

Data set

Year Submitting

author

Paperreferencenumber

Control used here

enc

Condition used here

eTranslationalMedicine.o

Sampletype used

rg 13 May 2015

Platform

Vol 7 Issue

Time pointspresent in

dataset (days)

GPSSSI unique 2

006–2011 Wong 15–20 P ediatric ICU-SIRS S epsis and septic shock(bacterial infectionsonly)

Wholeblood

GPL570

1–3

GSE28750

2011 Sutherland 27 2 4 hours post–“majorsurgery”

C

ommunity-acquiredsepsis

Whole blood

GPL570 Admission

GSE32707

2012 Dolinay 31 M ICU patients ± SIRS,nonseptic

S

epsis, sepsis + ARDS Whole blood GPL10558 0 and 7

GSE40012

2012 Parnell 26 S IRS (66% trauma) S epsis from CAP (bacterialinfections only)

Whole blood

GPL6947 D ays 1–5 for bothSIRS and sepsis

Glue Grant–Trauma–Buffycoat

2

004–2006 Multiple 6, 12–13 T rauma patientswithout infection

T

rauma patients ± 24hours from diagnosisof infection

Buffy coat

GPL570 0.5, 1, 4, 7, 14,21, and 28

A B [0,1)

[1,2)

[2,3)

[3,4)

[4,5)

[5,6)

[6,10)

[10,20)

[20,40)

Day

s si

nce

ad

mis

sio

n

Fig. 2. Two views of the first three principal components of labeledPCA of time-course data sets. Five peripheral whole-blood gene expres-

(A and B) The three-dimensional plots of the first three principal componentsdemonstrate that changes by day explain most variance in the data sets,

sion data sets were combined and matched for common genes. The geneswith the top 100 orthogonality scores were selected via CUR matrix de-composition, and labeled PCA was performed, broken into classes by day.

different data sets show similar changes over time, and the changes overtime proceed in a nonlinear fashion. Parts (A) and (B) show two differentviews of the same data; also see movie S1.

287 287ra71 3

Page 4: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

http://stm.science

Dow

nloaded from

nine cohorts comparing time-matched SIRS/trauma to sepsis/infection,comprising 663 samples (326 SIRS/trauma controls and 337 sepsis/infection cases; Table 2 shows the cohorts in the multicohort analysis;table S2 shows the individual microarray design matrix).

We then applied our previously described (41, 42) multicohort geneexpression analysis framework to compare SIRS/trauma with sepsis/infection, including all nine cohorts in a leave-one-data set-out fashion.The output from this analysis underwent a three-step thresholding pro-cess [false discovery rate (FDR) <1% for both pooled effect size andFischer’s method, inter–data set heterogeneity P > 0.01, and absolutesummary effect size fold change > 1.5], which yielded 82 genes differ-entially expressed between SIRS/trauma and sepsis patients across alltime points (summary statistics for all 82 genes shown in table S3).To obtain the most parsimonious set of significant genes that best dis-criminates between classes, we carried out a greedy forward search toidentify which combination of the 82 genes produced the best improve-ments in AUC across all discovery data sets. Here, discrimination is basedon an “infection z score” that combines gene expression levels (using thedifference of geometric means between positive and negative genes) into astandardized score for each sample in each data set. This yielded a finalset of 11 genes (6 overexpressed and 5 underexpressed in sepsis; Table 3and Fig. 3). Table S4 shows probe-level expression data for these 11 genesin the discovery cohorts. The mean ROC AUC of this 11-gene set in thenine discovery cohorts was 0.87 (range, 0.70 to 0.98; Fig. 4A and fig. S2).

Glue Grant sorted-cells cohort validationThe Glue Grant trauma cohorts have two independent subcohorts: oneis the buffy coat cohort (samples processed from 2004 to 2006 on Af-

www.Sci

by guest on June 21, 2020m

ag.org/

fymetrix array GPL570), and the other is the sorted-cells cohort, whichincluded neutrophils, monocytes, and T cells [samples processed from2008 to 2011 on custom Glue Grant Human (GGH) arrays; Table 4].These cohorts are separate patients, separated in time and profiled usingdifferent technologies. Although their inclusion criteria and enrollingsites are largely the same, they are otherwise independent. We thus val-idated our 11-gene signature in the Glue Grant sorted-cells cohorts. Here,we split the sorted-cells cohorts into the same time bins as the discoverybuffy coat cohorts and treated each time bin separately.

From the sorted-cells subcohort, we expected the neutrophil set toperform most similarly to a whole-blood sample, because neutrophilsmake up 75 to 85% of the total leukocyte pool after trauma in bothinfected and noninfected patients (and hence most of the gene expres-sion present in peripheral blood) (fig. S3). Indeed, the 11-gene set per-formed very well at separating time-matched noninfected traumapatients from septic trauma patients (4 cohorts; 218 samples; meanAUC, 0.83; range, 0.73 to 0.89) (Fig. 4B). Surprisingly, the 11-gene setalso showed discriminatory power in the monocytes and T cells fromthese same patients (monocytes AUC range, 0.71 to 0.97; T cells AUCrange, 0.69 to 0.9) (figs. S4 and S5). Because we excluded any sorted-cellsdata sets from the multicohort analysis, we did not expect diagnosticcapability in these cell types. In the sorted-cells cohort, AUC increasedwith greater time since initial trauma; this may suggest that inflamma-tion due to infection is easier to discriminate as the “genomic storm” oftraumatic injury begins to recover.

Examination of the 11-gene set in the Glue Grant cohortsAs expected, in the Glue Grant buffy coat cohort, patients within±24 hours of diagnosis of infection have significantly higher infectionz scores at all time points as compared to time-matched patients with-out infection; this was validated in the neutrophils cohort [repeated-measures analysis of variance (ANOVA) P < 0.0001; Fig. 4, C and D,and table S5A]. Comparison of the infection z score by time sinceinjury in the buffy coat cohort shows a significant decline over time(repeated-measures ANOVA change over time P < 0.0001), but thereappears to be a lesser (though still significant) effect in the neutrophilsvalidation cohort (repeated-measures ANOVA change over time P <0.05) (Fig. 4, C and D, and table S5A). The interaction of group withtime since injury was not significant in either discovery or validationcohorts, suggesting that the decline in infection z scores over time forboth groups is likely due to recovery from traumatic injury resulting inreduced inflammation (table S5A).

Next, we analyzed how infection z scores changed in infected pa-tients before and after diagnosis of infection (samples that were notincluded in identifying the 11-gene set). We grouped the samples frompatients who were ever diagnosed with infection on the same hospitalstay into four groups according to their time from diagnosis of infec-tion (either greater than 5 days before infection, 5-to-1 days beforeinfection, within ±1 day of diagnosis of infection, or 2-to-5 days afterdiagnosis of infection, where no group besides the ±1 day of diagnosisof infection was included in the multicohort analysis for discovery ofthe 11-gene set). We further divided these groups into bins accordingto days since injury. Within each time bin, the infection z scores for thediagnostic groups increased significantly as they progressed towardinfection for both the discovery buffy coat cohort and the validationneutrophils cohort [Jonckheere trend (JT) test P < 0.01; Fig. 4, E andF]. Furthermore, in all cohorts, the infection z score declined in thegroups that were 2 to 5 days after infection diagnosis, when patients

Table 2. All data sets used in the multicohort analysis. The numbersafter the Glue Grant cohort titles indicate days since infection in the givencohort (for instance, [1,3) are patients from 1 to 3 days since injury).

Cohort

SIRS/trauma

controls (n)

Sepsis/infectioncases (n)

Total(n)

Admissioncomparisons

GSE28750

11 10 21

GSE32707

55 48 103

GSE40012

24 41 65

GPSSSI unique

30 189 219

Hospital-acquiredcomparisons (GlueGrant buffy coatcohorts)

Glue Grantbuffy coat

[1,3)

65

9 74

Glue Grantbuffy coat

[3,6)

63

17 80

Glue Grantbuffy coat

[6,10)

50

15 65

Glue Grantbuffy coat[10,18)

22

4 26

Glue Grantbuffy coat[18,24)

6

4 10

Total used in multicohort analysis

326 337 663

enceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 4

Page 5: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

were beginning to recover from infection, presumably due to antibiotictreatment. This may also explain the increase in diagnostic power astime increases since initial injury. We emphasize that the resulting“peak” in infection z score around the time of infection diagnosis vali-dates the association of the infection z score with clinical infection,because neither the >5 days prior cohorts, the 5-to-1 days prior co-horts, nor the 2-to-5 days after cohorts were included in the multico-hort analysis, but still shows the hypothesized trends in both thediscovery buffy coat cohort and the validation neutrophils cohort. Simi-lar results were seen in the monocytes and T cells samples (same pa-tients as the neutrophils validation cohorts; figs. S4B and S5B).

The infection z scores for patients who were later infected duringtheir hospital stays were significantly higher in buffy coat samples atthe time of admission than for those never infected during their hos-pital admission (P < 0.01; neutrophils validation group P = 0.05; Fig. 4,E and F). One possibility is that there was a baseline difference in in-jury severity, and that this might influence the infection z score. Se-verely injured patients are known to be more susceptible to infection(43). To test this hypothesis, we used linear regression of eventualhospital-acquired infection status, injury severity score, and their inter-action to predict infection z score as the independent variable (tableS5B). Both eventual hospital-acquired infection status and injury se-verity score were independently significant in predicting infection zscore at admission, indicating that injury severity alone does not ex-plain these effects. The interaction term was significant and negativein both the discovery buffy coat cohort and the validation neutrophils

www.Sci

cohort samples, perhaps suggesting that higher infection z score at ad-mission may indicate greater susceptibility to later infection. Furtherstudies are needed to examine this observation.

Clinical utility in the Glue GrantTo test whether the infection z score might add to the clinical deter-minations of infection, we compared logistic regression using SIRScriteria alone to that using SIRS criteria plus our infection z scorein discriminating Glue Grant trauma patients (both buffy coat andneutrophils cohorts) with and without infection. The logistic regres-sion model using SIRS criteria alone had an AUC of 0.64, whereasSIRS criteria plus the infection z score had an overall AUC (using asingle coefficient for infections at all time points) of 0.81 (fig. S6). Thecontinuous net reclassification index (NRI) is a measure of how manypatients would be correctly reclassified by improving a disease marker;here, the continuous NRI of adding the infection z score to SIRS alonewas 0.90 (95% confidence interval, 0.62 to 1.17), where a continuousNRI greater than 0.6 is associated with “strong” improvement in pre-diction (44).

Independent validation of the infection z scoreNext, we validated our score in three independent longitudinal cohortsthat included only trauma or ICU patients who eventually acquired in-fections: GSE6377 (35), GSE12838, and EMEXP3001 (23) (Table 5). Allthree cohorts followed patients from the day of admission at least throughthe day of infection diagnosis (mostly VAP). Because all patients in

Table 3. The 11-gene set that separates SIRS/trauma from sepsis. Included are meta-analysis effect sizes, errors, and heterogeneity analyses.

Gene symbol

Full name Pooledeffectsize

P

s

ooledeffectize, SE

Effectsize, P s

enceT

Effectize, Q (FDR)

S

ranslationalMe

um of logs,Q (FDR)

C

dicine.org 13 M

ochran’sQ

ay 2015

t2 h

Vol 7 Issue

Interstudyeterogeneity,

P

CEACAM1 C

arcinoembryonic antigen-related cell adhesionmolecule 1

0.778

0.073 1.9 × 10−26 6.5 × 10−22 7.2 × 10−14 16.43 0.023 0.037

ZDHHC19 Z

inc finger, DHHC-typecontaining 19

1.083

0.130 6.6 × 10−17 3.3 × 10−13 7.1 × 10−22 18.36 0.078 0.019

C9orf95 N

icotinamide ribosidekinase 1

0.598

0.102 4.6 × 10−9 1.4 × 10−6 2.9 × 10−12 12.64 0.032 0.125

GNA15 G

uanine nucleotide bindingprotein (G protein), a-15

0.603

0.119 4.0 × 10−7 4.3 × 10−5 4.7 × 10−8 10.66 0.030 0.222

BATF B

asic leucine zippertranscription factor,ATF-like

1.053

0.163 9.4 × 10−11 6.2 × 10−8 1.6 × 10−19 17.49 0.115 0.025

C3AR1 C

omplement component 3areceptor 1

0.643

0.097 3.7 × 10−11 3.0 × 10−8 9.3 × 10−8 3.99 0.000 0.858

KIAA1370 F

amily with sequence similarity214, member A

−0.664

0.148 7.8 × 10−6 4.5 × 10−4 2.1 × 10−10 18.79 0.095 0.016

TGFBI T

ransforming growth factorb–induced, 68 kD

−0.730

0.108 1.2 × 10−11 1.1 × 10−8 2.0 × 10−10 9.12 0.013 0.333

MTCH1 M

itochondrial carrier 1 −0.686 0.135 4.1 × 10−7 4.3 × 10−5 7.9 × 10−10 13.04 0.058 0.111

RPGRIP1 R

etinitis pigmentosa GTPaseregulator interactingprotein 1

−0.694

0.156 8.5 × 10−6 4.7 × 10−4 5.2 × 10−9 16.93 0.103 0.031

HLA-DPB1 M

ajor histocompatibilitycomplex, class II, DP b1

−0.659

0.157 2.6 × 10−5 1.1 × 10−3 8.6 × 10−9 17.38 0.107 0.026

287 287ra71 5

Page 6: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

each of the three cohorts acquired infections, they did not have time-matched noninfected controls. To compare the validation cohort in-fection cases with noninfected trauma patients, we used Glue Grantbuffy coat noninfected controls. We internally normalized each cohort

www.Sci

using housekeeping genes and then co-normalized with the Glue Grantbuffy coat patients using empiric Bayes batch correction. Then, we com-pared the validation cohorts to the Glue Grant noninfected patients atmatched time points as a variable reference. Comparing trauma/ICU

−1.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 −1.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 −1.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0

−1.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 −1.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 −1.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0

−4 −3 −2 −1 0 1 −4 −3 −2 −1 0 1 −4 −3 −2 −1 0 1

−4 −3 −2 −1 0 1 −4 −3 −2 −1 0 1

GSE28750GSE32707GSE40012GPSSSI.uniqueGlue.buffy[1,3)Glue.buffy[3,6)Glue.buffy[6,10)Glue.buffy[10,18)Glue.buffy.[18,24)

Summary effect

GSE28750GSE32707GSE40012GPSSSI.uniqueGlue.buffy[1,3)Glue.buffy[3,6)Glue.buffy[6,10)Glue.buffy[10,18)Glue.buffy.[18,24)

Summary effect

GSE28750GSE32707GSE40012GPSSSI.uniqueGlue.buffy[1,3)Glue.buffy[3,6)Glue.buffy[6,10)Glue.buffy[10,18)Glue.buffy.[18,24)

Summary effect

GSE28750GSE32707GSE40012GPSSSI.uniqueGlue.buffy[1,3)Glue.buffy[3,6)Glue.buffy[6,10)Glue.buffy[10,18)Glue.buffy.[18,24)

Summary effect

Standardized mean difference (log2 scale)

Standardized mean difference (log2 scale)

Standardized mean difference (log2 scale)

Standardized mean difference (log2 scale)

KIAA1370 TGFBI MTCH1

GNA15 BATF C3AR1

CEACAM1 ZDHHC19 C9orf95

RPGRIP1 HLA-DPB1

Fig. 3. Effect sizes of the 11-gene set. Forest plots for random effects model estimates of effect size of the positive genes, comparing SIRS/trauma/ICU to infection/sepsis patients in each of the discovery cohorts.

enceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 6

Page 7: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

patients to a time-matched baseline is necessary because our earlierfindings (Fig. 4, C to F) showed a change over time in infection zscore in the noninfected patients (table S5A). The three indepen-dent longitudinal trauma/ICU cohorts show that patients within±1 day of infection are generally separable from time-matched non-infected Glue Grant patients, with ROC AUCs ranging from 0.68 to0.84 (Fig. 5).

www.Sci

We further validated the 11-gene set in eight additional indepen-dent data sets that compared healthy controls to those with bacte-rial or viral sepsis at admission using whole-blood samples [n = 446:GSE11755 (38), GSE13015 (7), GSE20346 (37), GSE21802 (22), GSE25504(39), GSE27131 (32), GSE33341 (30), and GSE40396 (25); Table 6].The infection z scores for all eight data sets were combined in a single violinplot, showing excellent separation (Wilcoxon P < 1 × 10−63; Fig. 6A).

A B

C D

E F

9273410

6547279

63372917

50231015

22854

6114

8351310

56292110

55132110

4613714

24133

0 0 15 19 9 2 0 0 11 19 5

n n nn n n nn n nn

-.-

)

-

-

)-,

.--)-

--

r

-

Fig. 4. Results of the 11-geneset in the discovery and neu

trophils validation data sets(A) ROC curves shown for separating sterile SIRS/ICU/traumapatients from those with sepsisin the discovery data sets. (BROC curves shown for separatingtrauma patients with infectionsfrom time-matched trauma patients without infection in theGlue Grant neutrophils validation data sets. (C and D) GlueGrant buffy coat discovery (Cand neutrophils validation samples (D) after >1 day since injuryshowing average infection zscore in noninfected patientsversus patients within ±24 hoursof diagnosis. In both cases, thereis a significant effect due toboth time and infection status(E and F) Box plots of infection z score by time since injury for buffy coat discovery (Eand neutrophils validation samples (F): patients never infectedare compared to patients >5 daysbefore infection, 5-to-1 days before infection, ±1 day of diagnosis (cases), and 2-to-5 daysafter infection diagnosis. JT trendtest was significant (P < 0.01) foan increasing trend from neverinfected to ±1 day of infection for each time point afteradmission.

enceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 7

Page 8: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

The mean ROC for separating healthy and septic patients is 0.98 (range,0.94 to 1.0; Fig. 6B).

Our results provide strong evidence that the infection z score de-clines over time since admission/injury in whole blood, buffy coat, neu-trophils, and monocytes. We have also shown that non–time-matchedcomparison yields inaccurate classification of infection, especially forlate-acquired infections in SIRS/trauma patients. Hence, comparing in-fection z scores of SIRS/trauma patients at admission with those of late-acquired sepsis/infection patients would be an inaccurate measure ofdiagnostic power. However, because the effect of the decrease in infec-tion z score over time is relatively monotonous, comparison of admis-sion SIRS/trauma/surgery patients with late-acquired sepsis/infectionwould provide a lower limit on detection of ROC AUC for the infectionz scores. That is, because the infection z score decreases over time, if thenoninfected patients tested at admission had been sampled later (atmatched times to the sepsis patients), their infection z scores wouldbe lower at that later time (and hence more easily separable from thehigher infection z scores in the septic patients). Using this inference, weexamined four independent data sets that compared SIRS/trauma/surgery patients either to the same patients later in their hospital courseat onset of sepsis or to a mixed cohort of patients with community-and hospital-acquired sepsis. These data sets included whole blood[EMTAB1548 (21)], neutrophils [GSE5772 (29)], and PBMCs [GSE9960(8); EMEXP3621 (40)] (Table 7). In each of these four data sets, theinfection z score separated late-acquired infections from admissionSIRS or trauma, with ROC AUCs ranging from 0.48 to 0.76 in PBMCsto 0.86 in whole blood (fig. S7). We emphasize that these AUCs areexpected to be lower due to their time-mismatched comparison andare essentially the lower limits of what properly time-matched infectionz scores would be in each of these cell compartments.

www.Sci

Finally, we examined our 11-gene set in one data set comparinghealthy controls or those with autoimmune inflammation to acute bac-terial infections after diagnosis confirmation (GSE22098, n = 274) (33).Exact sampling times are not available, but typically, confirmation ofinfection takes 24 to 72 hours, so these infection samples are expectedto show lower z scores than at the time of diagnosis. Still, the infection zscore was able to discriminate healthy and autoimmune inflammationpatients from those with acute infections (ROC AUC, 0.72; fig. S8). Con-sidering that cohorts with autoimmune inflammation were not includedin our discovery set, this provides validation of the specificity of the in-fection z score for infectious inflammation.

The effect of infection type on infection z scoreTo examine whether there were any infection type–specific differencesin the infection z score, we compared patients infected with Gram-positive versus Gram-negative bacteria, as well as those with viral in-fections to those with bacterial infections. The Glue Grant patientswere not analyzed, because there were too few time-matched infectionpatients in each subcohort. Four data sets had information on Gram-positive versus Gram-negative infection, and four had data on bacte-rial versus viral infections; in neither case was there a clear trend ofdifferences in infection z score based on infection subtype (table S6).

Gene set pathway evaluation and transcriptionfactor analysisHaving validated the 11-gene set, we examined whether any mecha-nism might explain why these genes were acting in concert. We ana-lyzed the 11-gene set with Ingenuity Pathway Analysis, which showedthat several of the genes are downstream of IL-6 and JUN (fig. S9). All11 genes identified by the multicohort analysis were tested with

Table 5. Publicly available gene expression time-course data sets of trauma patients who develop infections. VAP, ventilator-associatedpneumonia; VAT, ventilator-associated tracheobronchitis; N/A, not available.

Data set

Year Submitting

author

Paperreferencenumber

Controlusedhere

Condition usedhere

enc

Sampletype used

eTranslationalMe

Platform

dicine.org

n,control

13 May 20

n,case

15 V

Samplesused here

ol 7 Issue 287

Time pointspresent indata set(days)

GSE6377

2008 Cobb 35 None I CU patients witheventual VAP

Buffy coat

GPL201 0 11 99 1–21

GSE12838

2008 Cobb N/A None I CU patients witheventual VAP

Buffy coat andneutrophils

GPL570

0 4 60 1–11

E-MEXP-3001

2011 Lopez 23 None I CU patients withacquired VAPor VAT

Whole blood

A-Agil-28(GPL6480)

0

8 56 1–7

Table 4. The Glue Grant sorted-cells cohort.

Data set

Year Submitting

author

Paperreferencenumber

Controlused here

Conditionused here

Sampleused here

Platform

Dayssinceinjury

n,control

287ra

n,case

Glue Grantsorted-cells

2008–2011

Multiple 6 Trauma patientswithoutinfection

Trauma patients ±24 hours fromdiagnosis ofinfection

Neutrophils,monocytes,and T cells

GGH-1,GGH-2 (27)

[1,3)

56 10

[3,6)

55 10

[6,10)

46 14

[10,18)

24 3

71 8

Page 9: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

both EncodeQT and PASTAA (chosen for a mix of experimental resultsand in silico transcription factor predictions). EncodeQT found only onesignificant transcription factor among the positive genes (MAX) andnone for the negative genes (EncodeQT Q ≤ 0.01, table S7A). PASTAAshowed enrichment for well-known proinflammatory transcriptionfactors, such as nuclear factor kB (NF-kB) member c-REL, STAT5,and interferon response factors (IRF) 1 and 10 (table S7B).

Because we did not find an obvious network driver, we next studiedwhether the genes were enriched in certain immune cell types that mightexplain their relation to sepsis. We searched GEO for human immune celltype–specific gene expression profiles and found 277 samples from 18

www.Sci

data sets matching our criteria (table S8). We aggregated these into broadimmune cell type signatures using mean gene expression scores. Wethen calculated standardized enrichment scores using the same meth-od as the infection z score (difference of geometric means betweenpositive and negative genes). We did this both for the initial set of82 genes found to be significantly enriched in the multicohort analysisand for the 11-gene set found after forward search (the genes includedin the infection z score) (Fig. 7). The set of all 82 significant genes wasfound to be highly enriched in band cells only (>4 SDs above the mean;P < 1 × 10−6). The 11-gene set was significantly enriched (>2 SDs abovethe mean; P = 0.015) in band cells but also showed up-regulation in

−1

0

1

Time since admission

Infe

ctio

n sc

ore

(z s

core

)

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75 1.00False-positive rate (1−specificity)

Tru

e-po

sitiv

e ra

te (

sens

itivi

ty)

Glue Grant (noninfected)

EMEXP3001>1 day before infectionEMEXP3001±1 day of infection

−1

0

1

2

Time since admission

Glue Grant (noninfected)

GSE6377 >1 day before infectionGSE6377±1 day of infection

−1

0

1

2 4 6 8Time since admission

A B

C D

EMEXP3001: AUC = 0.84 (95% CI 0.89 – 0.78)

GSE6377: AUC = 0.68 (95% CI 0.75 – 0.6)GSE12838: AUC = 0.72 (95% CI 0.83 – 0.61)

Glue Grant (noninfected)

GSE12838 >1 day before infectionGSE12838 ±1 day of infection

2 4 6 8 2 4 6 8

Infe

ctio

n sc

ore

(z s

core

)

Infe

ctio

n sc

ore

(z s

core

)

Fig. 5. No-controls data sets of trauma/ICU patients who develop VAP.These data sets did not include noninfected patients, so they were em-

only the first 8 days since admission are shown, and patients are cen-sored >1 day after diagnosis of infection. (D) ROC curves compare

piric Bayes co-normalized with time-matched Glue Grant patients. Or-ange line shows Glue Grant loess curve. (A) EMEXP3001. (B) GSE6377.(C) GSE12838, both neutrophils and whole-blood samples. In all cases,

patients within ±1 day of diagnosis (blue points in A to C) with time-matched noninfected Glue Grant patients. See Table 5 for further dataset details.

enceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 9

Page 10: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

regulatory T cells (Tregs) and down-regulation in dendritic cells. Thissuggests that one driving force in differential gene expression betweensterile SIRS and sepsis is the presence of band cells; however, the best setof genes for diagnosis contains information that may incorporatemultiple cell type shifts at once. Finally, we checked whether therewas a difference in band counts (where present) between acutely in-fected and noninfected patients in the Glue Grant trauma cohort, but

www.ScienceTranslationalMedicine.org

there was no significant difference (meannoninfected = 2.13; mean infected = 2.74;P = 0.49).

DISCUSSION

The dozens of studies that we examinedhere have reported valuable insights intochanges in gene expression that occur inresponse to SIRS, trauma, surgery, andsepsis; one key insight is that time afterinjury is an important factor in gene ex-pression after injury (6, 12, 14, 35). Acrossmultiple independent cohorts, we showedthat changes in gene expression over timeduring recovery are nonlinear but followa similar trajectory (Fig. 2). Therefore, acomparison of gene expression across earlyand late time points in the same patientwill yield a large number of differentiallyexpressed genes solely due to the recoveryprocess. It is thus very difficult to identify

relatively small changes in gene expression due to late complicationsuch as infection from the large changes caused by recovery. Therefore,we separated longitudinal studies into subcohorts of patients at matchedtime points. We used an integrated, time-course–based multicohortanalysis (41, 42) to evaluate differential gene expression between sterileSIRS/trauma and sepsis/infection patients. We then used a forward searchto select a parsimonious set of differentially expressed genes optimized

Table 6. Publicly available gene expression data sets in whole blood or neutrophils comparing healthy controls to septic patients.

Data set Y

ear Submitting

author

Paperreferencenumber

Controlused here

Conditionused here

Sample typeused

Platform

13 May

n,control

2015 Vo

n,case

l 7 Issu

Time pointspresent in

data set (days)

GSE11755 2

008 Emonts 38 Healthychildren

Children withmeningococcal

sepsis

Whole blood

GPL570 3 8 0, 0.33, 1,and 3

GSE13015 2

008 Chaussabel 7 Healthy ortype 2diabetic

Community-acquired ornosocomial

sepsis

Whole blood

GPL6106 andGPL6947

20

63 Mixed:admissionor sepsisonset

GSE20346 2

011 Parnell 37 Pre-/post-influenza vaccine

Bacterialpneumonia andsevere influenza

Whole blood

GPL6947 36 45 1–5

GSE21802 2

011 Bermejo-Martin 22 Healthy Severe influenza Whole blood GPL6102 4 12 Early versuslate sepsis

GSE25504 2

014 Smith 39 Healthyneonates

Neonates withbacterialinfection

Whole blood

GPL570,GPL6947,

and GPL13667

44

44 Infectiononset

GSE27131 2

011 Berdal 32 Healthy Severe influenza A Whole blood GPL6244 7 14 0–6

GSE33341 2

011 Ahn 30 Healthy Sepsis frombacterial CAP

Whole blood

GPl571 43 51 Admission

GSE40396 2

013 Hu 25 Healthychildren

after surgery

Children withinfection + fever

Whole blood

GPL10558 22 30 Admission

−2

0

2

4

P = 3.73 x 10−64

Infe

ctio

n sc

ore

(zsc

ore)

Healthy Sepsis

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75 1.00False-positive rate (1−specificity)

Tru

e-po

sitiv

e ra

te (s

ensi

tivity

)

GSE33341: A

A B

GSE11755.WB: AUC = 1 (95% CI 1 − 1)

GSE3015.both: AUC = 1 (95% CI 1 − 0.99)

GSE20346: AUC = 0.97 (95% CI 0.98 − 0.95)

GSE21802.viral: AUC = 0.99 (95% CI 1 −0.97)

GSE25504.all: AUC = 0.94 (95% CI 0.96−0.91)

GSE27131.viral: AUC = 1 (95% CI 1 −1)

UC = 1 (95% CI 1 −1)

GSE40396: AUC = 0.96 (95% CI 0.99 − 0.94)

Fig. 6. Discrimination of healthy versus sepsis. Eight independent validation data sets that metinclusion criteria (peripheral whole blood or neutrophils, sampled within 48 hours of sepsis diagnosis)

were tested with the infection z score. (A) Infection z scores for all patients (n = 446) were combined ina single violin plot; error bars show middle quartiles. P values calculated with Wilcoxon rank-sum test.(B) Separate ROC curves for each of the eight data sets discriminating sepsis patients from healthycontrols. Mean ROC AUC = 0.98. See Table 6 for further data set details.

e 287 287ra71 10

Page 11: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

for discriminatory power for sepsis. An infection z score, derived fromthe geometric mean of the 11-gene set, had a mean ROC AUC of 0.87in the discovery cohorts for distinguishing SIRS/trauma from sepsis/infection patients.

We validated the 11-gene set in an independent group of patientsfrom the Glue Grant. The mean AUC for distinguishing sepsis fromnoninfectious inflammation was 0.83 in the neutrophils validation co-hort, with a trend toward better diagnostic power with greater timesince initial injury, when initial traumatic inflammation wanes andhospital-acquired infections manifest (43). Although we expect the

www.Scie

whole-blood transcriptional profiles to be largely driven by neutro-phils, the signal in sorted cells will certainly differ from whole blood.Despite this limitation, the infection z scores performed comparably invalidation cohorts. We further validated the infection z score in severaladditional external data sets, which included three longitudinal co-horts of ICU/trauma patients who developed VAP/VAT; eight cohortsof healthy controls compared to patients with bacterial or viral sepsis;four cohorts of admission SIRS/trauma patients compared to patientsat mixed or later time points using whole blood, neutrophils, and PBMCs;and one cohort of patients with autoimmune inflammation compared

Table 7. Publicly available gene expression data sets comparing sterile SIRS/trauma patients to later or non–time-matched sepsis/infectionpatients.

Data set

Year Submitting

author

Paperreferencenumber

Controlused here

Conditionused here

nceTra

Sampletype used

nslationalMedi

Platform

cine.org 13 M

n,control

ay 2015 Vol

n,case

7 Issue

Time pointspresent indata set

GSE5772

2007 Tang 29 ICU nonsepsis(43% trauma)

Sepsis—early + late

Neutrophils

GPL4274 23 70 Mixed: admissionor sepsis onset

GSE9960

2009 Tang 8 ICU nonsepsis Sepsis Monocytes GPL570 16 54 Mixed: admissionor sepsis onset

E-MEXP-3621

2012 Harokopos 40 Trauma atadmission

Trauma atonset of sepsis

Monocytes

A-Affy-37(GPL571)

5

5 Admission versusonset of sepsis

E-MTAB-1548

2014 Almansa 21 Postsurgery(average 2 days)

Sepsis aftersurgery(average4 days)

Whole blood

A-MEXP-2183(GPL10332)

34postsurgery

39

Postsurgery(average

2 days), onsetof sepsis(average4 days)

A B

z z

Fig. 7. Cell type enrichment analyses. (A and B) Standardized enrichment scores (z scores, dots) for human immune cell types for both (A) theentire set of 82 genes found to be significant in multicohort analysis and (B) the 11-gene set found after forward search (subset of the 82 genes).

Part (B) also shows a box plot of distributions of z scores.

287 287ra71 11

Page 12: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

to patients with acute infection. Finally, we showed that the infec-tion z score does not have systematic trends with regard to infec-tion type (Gram-positive versus Gram-negative and bacterial versusviral) across those data sets for which infection type information isavailable.

Using the extensive clinical phenotype data available for patients inthe Glue Grant, we illustrated two important points about the appli-cation of the infection z score. First, the infection z score showed adecline over time since injury that was similar in both infected and non-infected patients. Thus, for maximal discriminatory power, if the infec-tion z score were to be tested in a longitudinal study, the diagnosticthresholds would need to be a function of the time since initial injury/event. Second, the infection z scores increased over the days before in-fection, peaked within 1 day of diagnosis, and decreased afterwards(presumably due to treatment of infection). This observation raisesthe possibility that earlier diagnosis or stratification of patients at riskof developing sepsis may be possible using the 11-gene set. In particular,we note that the early rise in infection z score that precedes a clinicaldiagnosis of infection is not a false positive but an “early positive” result.

In the Glue Grant buffy coat cohort, SIRS binary parameters aloneperformed poorly in discriminating patients at time of infection fromnoninfected patients. SIRS criteria plus the infection z score with aglobal cutoff (that is, not broken into separate time bins) increasedthe discriminatory power with a continuous NRI of 0.9. However, SIRSis only one of several criteria used to diagnose sepsis. Procalcitonin is awell-studied biomarker for differentiating sepsis from SIRS, with a sum-mary ROC AUC of 0.78 (range, 0.66 to 0.90) (5). The average AUC inour discovery cohorts was 0.87 and the time-matched neutrophils val-idation cohort had a mean AUC of 0.83, both of which are thus at leastcomparable to procalcitonin. None of the publicly available data setsincluded procalcitonin levels, so no direct comparison is available.We emphasize, however, that each of these markers need not be usedseparately; any prospective study of the infection z score should alsoinclude known biomarkers to test for better diagnostic performanceusing biomarker combinations and for head-to-head comparisons.

Both infectious and noninfectious inflammation can lead to SIRSthrough activation of the same innate immune pathways [TLRs, RIG-like receptor (RLRs), NLRs, etc.], so the “typical” proinflammatorygenes and cytokines (such as tumor necrosis factor and the inter-leukins) are generally expressed in both sterile and infectious inflam-mation (45). For instance, one recent study showed high correlation ingene expression between sterile inflammation (Glue Grant burns co-hort) and four independent sepsis data sets, with as much as 93% ofthe genes changing in the same direction in the two conditions (12).Thus, a standard hypothesis-driven approach in the search of biomar-kers specifically differentially expressed between sterile SIRS and sepsisis unlikely to succeed, given that the “standard” suite of cytokines andchemokines known to be expressed in sepsis is mostly also activated insterile SIRS. However, several protein families have been shown to havespecificity for pathogen-associated molecular patterns, thus giving riseto the possibility of infection-specific innate immune signaling path-ways (11). Our data-driven, unbiased approach searched specificallyfor genes that are homogeneously statistically differentially expressedbetween sterile SIRS/trauma patients and sepsis patients across multi-ple cohorts.

Some of the genes in the sepsis-specific 11-gene set, such asCEACAM1,C3AR1, GNA15, and HLA-DPB1, have been previously associated withsepsis or infections (46, 47). The regulatory control of these genes may

www.Scie

be enriched for several proinflammatory factors, but no single com-mon factor explained the network. The gene sets found here may bebetter explained by cell type enrichment analyses. We show that bandcells and the myeloid cell line are highly enriched for the gene setsfound to be significantly differentially expressed between sterile SIRSand sepsis. The finding of enrichment in band cells is particularly in-triguing, because bands have previously been shown to help differentiatesterile SIRS and sepsis (48). However, there is very high variability inband counts both by automatic blood counters and by hand (49), andno good serum marker exists. The 11-gene set may distinguish sepsisfrom sterile SIRS at least in part because it also includes informationon increased Tregs and decreased dendritic cells, both of which have pre-viously been implicated in sepsis (50, 51). In particular, the joint findingsthat the 11-gene set is overexpressed in bands but underexpressed inadaptive immune cells are remarkably similar to the phenotype of in-creased immature granulocytes and decreased adaptive immunity causedby myeloid-derived suppressor cells in infection and chronic critical ill-ness (52, 53). The connection between the 11-gene set and different im-mune cell types may help explain some sepsis biology, but certainly, these11 genes require further study.

Our study has some limitations. First, although we validated the11-gene set in all available independent data sets, prospective valida-tion is required. Second, the Glue Grant buffy coat and neutrophilscohorts were incorporated in a way that treated different periods oftime since injury as different data sets, though the noninfected con-trols came from the same patient cohort at different sampling times(with some dropouts due to injury, recovery, or missed sampling). Thesetime-based control subsets are thus not independent of one another,which may lead to underestimation of effect size variance; however, thiswas only the case for the two Glue Grant data sets and not the rest ofthe data sets.

This work presents several future directions. First, both the 11-geneset and the protein products of these genes will need to be tested pro-spectively in a time-matched manner. Although protein assays are fasterthan transcript quantitation assays, a number of advances in polymerasechain reaction technology have brought assay times down toward therange of clinical applicability (54). Second, our results showed that thechanges in gene expression due to normal recovery from a traumatic event(such as injury or surgery) mean that time must be properly accountedfor in any gene expression study of acute illness. Our search found severalstudies that examine time course after SIRS/trauma (GSE6377, GSE12838,GSE40012, and EMEXP3001) and several that examine the time coursesince onset of sepsis/infection (GSE20346, GSE2713, GSE40012, andEMEXP3850). However, we found only one publicly available microar-ray study (the Glue Grant) that examined a cohort of patients over time,where some of the cohorts develop infection and some do not. Thus, onthe basis of our results, we recommend that future studies of sepsis di-agnostics should be designed with longitudinal cohorts both with andwithout infection to enable appropriate time-matched comparisons(9, 10).

Overall, our comprehensive analysis of publicly available gene ex-pression data in SIRS/trauma and sepsis has yielded a parsimonious11-gene set with excellent discriminatory power in both the discoverycohorts and in 15 independent cohorts. Optimizing a clinical assay forthis gene set to get results within a window of clinical relevance shouldbe feasible. Further study will be needed both to confirm our clinical find-ings in a prospective manner and to investigate the molecular pathwaysupstream of these genes.

nceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 12

Page 13: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

http:D

ownloaded from

SUPPLEMENTARY MATERIALS

www.sciencetranslationalmedicine.org/cgi/content/full/7/287/287ra71/DC1Materials and MethodsFig. S1. Labeled PCA comparing healthy controls, SIRS/trauma patients, and sepsis patients.Fig. S2. Violin plots for the data sets included in the discovery multicohort analysis.Fig. S3. Neutrophil percentages for the Glue Grant patients with both complete blood countand microarray data.Fig. S4. Performance of the infection z score in the sorted monocytes from the Glue Grantcohort.Fig. S5. Performance of the infection z score in the sorted T cells from the Glue Grant cohort.Fig. S6. Linear models of SIRS criteria and the infection z score.Fig. S7. The infection z score in non–time-matched data sets.Fig. S8. Comparison of the infection z scores in patients with acute infections to healthycontrols and patients with autoimmune diseases.Fig. S9. Ingenuity Pathway Analysis results for the 11-gene set.Fig. S10. Schematic of the entire integrated multicohort analysis.Table S1. Summary spreadsheet of all data sets referenced in the manuscript.Table S2. Design matrix of individual phenotypes for multicohort analysis.Table S3. Summary statistics for the 82 genes that passed significance, heterogeneity, andeffect size filtering after multicohort analysis.Table S4. Probe-level data for all 11 genes in the diagnostic set for all patients in the multi-cohort analysis.Table S5. Linear models of infection score in the Glue Grant data.Table S6. Comparison of infection z score across infection types.Table S7. In silico transcription factor binding analyses for the 11-gene set.Table S8. Design matrix for cell type enrichment analyses.Movie S1. Rotation of a time-course–labeled PCA of trauma patients.References (55–79)

by guest on June 21, 2020//stm

.sciencemag.org/

REFERENCES AND NOTES

1. D. C. Angus, W. T. Linde-Zwirble, J. Lidicker, G. Clermont, J. Carcillo, M. R. Pinsky, Epidemi-ology of severe sepsis in the United States: Analysis of incidence, outcome, and associatedcosts of care. Crit. Care Med. 29, 1303–1310 (2001).

2. C. A. Torio, R. A. Andrews (Agency for Healthcare Research and Quality, Rockville, MD,2013).

3. R. Ferrer, I. Martin-Loeches, G. Phillips, T. M. Osborn, S. Townsend, R. P. Dellinger, A. Artigas,C. Schorr, M. M. Levy, Empiric antibiotic treatment reduces mortality in severe sepsis andseptic shock from the first hour: Results from a guideline-based performance improve-ment program. Crit. Care Med. 42, 1749–1755 (2014).

4. B. Coburn, A. M. Morris, G. Tomlinson, A. S. Detsky, Does this adult patient with suspectedbacteremia require blood cultures? JAMA 308, 502–511 (2012).

5. B. M. Tang, G. D. Eslick, J. C. Craig, A. S. McLean, Accuracy of procalcitonin for sepsis diag-nosis in critically ill patients: Systematic review and meta-analysis. Lancet Infect. Dis. 7, 210–217(2007).

6. W. Xiao, M. N. Mindrinos, J. Seok, J. Cuschieri, A. G. Cuenca, H. Gao, D. L. Hayden, L. Hennessy,E. E. Moore, J. P. Minei, P. E. Bankey, J. L. Johnson, J. Sperry, A. B. Nathens, T. R. Billiar, M. A. West,B. H. Brownstein, P. H. Mason, H. V. Baker, C. C. Finnerty, M. G. Jeschke, M. C. López, M. B. Klein,R. L. Gamelli, N. S. Gibran, B. Arnoldo, W. Xu, Y. Zhang, S. E. Calvano, G. P. McDonald-Smith,D. A. Schoenfeld, J. D. Storey, J. P. Cobb, H. S. Warren, L. L. Moldawer, D. N. Herndon, S. F. Lowry,R. V. Maier, R. W. Davis, R. G. Tompkins; Inflammation and Host Response to Injury Large-ScaleCollaborative Research Program, A genomic storm in critically injured humans. J. Exp. Med. 208,2581–2590 (2011).

7. R. Pankla, S. Buddhisa, M. Berry, D. M. Blankenship, G. J. Bancroft, J. Banchereau, G. Lertmemongkolchai,D. Chaussabel, Genomic transcriptional profiling identifies a candidate blood biomarker sig-nature for the diagnosis of septicemic melioidosis. Genome Biol. 10, R127 (2009).

8. B. M. Tang, A. S. McLean, I. W. Dawes, S. J. Huang, R. C. Lin, Gene-expression profiling ofperipheral blood mononuclear cells in sepsis. Crit. Care Med. 37, 882–888 (2009).

9. S. B. Johnson, M. Lissauer, G. V. Bochicchio, R. Moore, A. S. Cross, T. M. Scalea, Gene expres-sion profiles differentiate between sterile SIRS and early sepsis. Ann. Surg. 245, 611–621(2007).

10. D. M. Maslove, H. R. Wong, Gene expression profiling in sepsis: Timing, tissue, andtranslational considerations. Trends Mol. Med. 20, 204–213 (2014).

11. Y. van Kooyk, G. A. Rabinovich, Protein-glycan interactions in the control of innate andadaptive immune responses. Nat. Immunol. 9, 593–601 (2008).

12. J. Seok, H. S. Warren, A. G. Cuenca, M. N. Mindrinos, H. V. Baker, W. Xu, D. R. Richards,G. P. McDonald-Smith, H. Gao, L. Hennessy, C. C. Finnerty, C. M. López, S. Honari, E. E. Moore,

www.Scie

J. P. Minei, J. Cuschieri, P. E. Bankey, J. L. Johnson, J. Sperry, A. B. Nathens, T. R. Billiar, M. A. West,M. G. Jeschke, M. B. Klein, R. L. Gamelli, N. S. Gibran, B. H. Brownstein, C. Miller-Graziano,S. E. Calvano, P. H. Mason, J. P. Cobb, L. G. Rahme, S. F. Lowry, R. V. Maier, L. L. Moldawer,D. N. Herndon, R. W. Davis, W. Xiao, R. G. Tompkins; Inflammation and Host Response to Injury,Large Scale Collaborative Research Program, Genomic responses in mouse models poorlymimic human inflammatory diseases. Proc. Natl. Acad. Sci. U.S.A. 110, 3507–3512 (2013).

13. H. S. Warren, C. M. Elson, D. L. Hayden, D. A. Schoenfeld, J. P. Cobb, R. V. Maier, L. L. Moldawer,E. E. Moore, B. G. Harbrecht, K. Pelak, J. Cuschieri, D. N. Herndon, M. G. Jeschke, C. C. Finnerty,B. H. Brownstein, L. Hennessy, P. H. Mason, R. G. Tompkins; Inflammation and the Host Re-sponse to Injury Large Scale Collaborative Research Program, A genomic score prognostic ofoutcome in trauma patients. Mol. Med. 15, 220–227 (2009).

14. K. H. Desai, C. S. Tan, J. T. Leek, R. V. Maier, R. G. Tompkins, J. D. Storey; Inflammation andthe Host Response to Injury Large Scale Collaborative Research Program, Dissecting inflam-matory complications in critically injured patients by within-patient gene expression changes:A longitudinal clinical genomics study. PLOS Med. 8, e1001093 (2011).

15. N. Cvijanovich, T. P. Shanley, R. Lin, G. L. Allen, N. J. Thomas, P. Checchia, N. Anas, R. J. Freishtat,M. Monaco, K. Odoms, B. Sakthivel, H. R. Wong; Genomics of Pediatric SIRS/Septic Shock In-vestigators, Validating the genomic signature of pediatric septic shock. Physiol. Genomics34, 127–134 (2008).

16. T. P. Shanley, N. Cvijanovich, R. Lin, G. L. Allen, N. J. Thomas, A. Doctor, M. Kalyanaraman,N. M. Tofil, S. Penfil, M. Monaco, K. Odoms, M. Barnes, B. Sakthivel, B. J. Aronow, H. R. Wong,Genome-level longitudinal expression of signaling pathways and gene networks in pediatricseptic shock. Mol. Med. 13, 495–508 (2007).

17. H. R. Wong, T. P. Shanley, B. Sakthivel, N. Cvijanovich, R. Lin, G. L. Allen, N. J. Thomas, A. Doctor,M. Kalyanaraman, N. M. Tofil, S. Penfil, M. Monaco, M. A. Tagavilla, K. Odoms, K. Dunsmore,M. Barnes, B. J. Aronow; Genomics of Pediatric SIRS/Septic Shock Investigators, Genome-level expression profiles in pediatric septic shock indicate a role for altered zinc homeostasisin poor outcome. Physiol. Genomics 30, 146–155 (2007).

18. H. R. Wong, N. Cvijanovich, G. L. Allen, R. Lin, N. Anas, K. Meyer, R. J. Freishtat, M. Monaco,K. Odoms, B. Sakthivel, T. P. Shanley; Genomics of Pediatric SIRS/Septic Shock Investigators,Genomic expression profiling across the pediatric systemic inflammatory response syndrome,sepsis, and septic shock spectrum. Crit. Care Med. 37, 1558–1566 (2009).

19. H. R. Wong, R. J. Freishtat, M. Monaco, K. Odoms, T. P. Shanley, Leukocyte subset-derivedgenomewide expression profiles in pediatric septic shock. Pediatr. Crit. Care Med. 11, 349–355(2010).

20. H. R. Wong, N. Z. Cvijanovich, G. L. Allen, N. J. Thomas, R. J. Freishtat, N. Anas, K. Meyer,P. A. Checchia, R. Lin, T. P. Shanley, M. T. Bigham, D. S. Wheeler, L. A. Doughty, K. Tegtmeyer,S. E. Poynter, J. M. Kaplan, R. S. Chima, E. Stalets, R. K. Basu, B. M. Varisco, F. E. Barr, Validation ofa gene expression-based subclassification strategy for pediatric septic shock. Crit. Care Med.39, 2511–2517 (2011).

21. R. Almansa, E. Tamayo, M. Heredia, S. Gutierrez, P. Ruiz, E. Alvarez, E. Gomez-Sanchez,D. Andaluz-Ojeda, R. Ceña, L. Rico, V. Iglesias, J. I. Gomez-Herreras, J. F. Bermejo-Martin,Transcriptomic evidence of impaired immunoglobulin G production in fatal septic shock.J. Crit. Care 29, 307–309 (2014).

22. J. F. Bermejo-Martin, I. Martin-Loeches, J. Rello, A. Antón, R. Almansa, L. Xu, G. Lopez-Campos,T. Pumarola, L. Ran, P. Ramirez, D. Banner, D. C. Ng, L. Socias, A. Loza, D. Andaluz, E. Maravi,M. J. Gómez-Sánchez, M. Gordón, M. C. Gallegos, V. Fernandez, S. Aldunate, C. León, P. Merino,J. Blanco, F. Martin-Sanchez, L. Rico, D. Varillas, V. Iglesias, M. Marcos, F. Gandía, F. Bobillo,B. Nogueira, S. Rojo, S. Resino, C. Castro, R. Ortiz de Lejarazu, D. Kelvin, Host adaptive immunitydeficiency in severe pandemic influenza. Crit. Care 14, R167 (2010).

23. I. Martin-Loeches, E. Papiol, R. Almansa, G. López-Campos, J. F. Bermejo-Martin, J. Rello, Intu-bated patients developing tracheobronchitis or pneumonia have distinctive complement sys-tem gene expression signatures in the pre-infection period: A pilot study. Med. Intensiva 36,257–263 (2012).

24. E. Tamayo, A. Fernández, R. Almansa, E. Carrasco, L. Goncalves, M. Heredia, D. Andaluz-Ojeda,G. March, L. Rico, J. I. Gómez-Herreras, R. O. de Lejarazu, J. F. Bermejo-Martin, Beneficial role ofendogenous immunoglobulin subclasses and isotypes in septic shock. J. Crit. Care 27, 616–622(2012).

25. X. Hu, J. Yu, S. D. Crosby, G. A. Storch, Gene expression profiles in febrile children with definedviral and bacterial infection. Proc. Natl. Acad. Sci. U.S.A. 110, 12792–12797 (2013).

26. G. P. Parnell, A. S. McLean, D. R. Booth, N. J. Armstrong, M. Nalos, S. J. Huang, J. Manak, W. Tang,O.-Y. Tam, S. Chan, B. M. Tang, A distinct influenza infection signature in the blood transcriptomeof patients with severe community-acquired pneumonia. Crit. Care 16, R157 (2012).

27. A. Sutherland, M. Thomas, R. A. Brandon, R. B. Brandon, J. Lipman, B. Tang, A. McLean, R. Pascoe,G. Price, T. Nguyen, G. Stone, D. Venter, Development and validation of a novel molecular bio-marker diagnostic test for the early detection of sepsis. Crit. Care 15, R149 (2011).

28. Y. Tang, H. Xu, X. Du, L. Lit, W. Walker, A. Lu, R. Ran, J. P. Gregg, M. Reilly, A. Pancioli, J. C. Khoury,L. R. Sauerbeck, J. A. Carrozzella, J. Spilker, J. Clark, K. R. Wagner, E. C. Jauch, D. J. Chang, P. Verro,J. P. Broderick, F. R. Sharp, Gene expression in blood changes rapidly in neutrophils and mono-cytes after ischemic stroke in humans: A microarray study. J. Cereb. Blood Flow Metab. 26,1089–1102 (2006).

nceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 13

Page 14: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from

29. B. M. Tang, A. S. McLean, I. W. Dawes, S. J. Huang, R. C. Lin, The use of gene-expressionprofiling to identify candidate genes in human sepsis. Am. J. Respir. Crit. Care Med. 176,676–684 (2007).

30. S. H. Ahn, E. L. Tsalik, D. D. Cyr, Y. Zhang, J. C. van Velkinburgh, R. J. Langley, S. W. Glickman,C. B. Cairns, A. K. Zaas, E. P. Rivers, R. M. Otero, T. Veldman, S. F. Kingsmore, J. Lucas, C. W. Woods,G. S. Ginsburg, V. G. Fowler Jr., Gene expression-based classifiers identify Staphylococcus aureusinfection in mice and humans. PLOS One 8, e48979 (2013).

31. T. Dolinay, Y. S. Kim, J. Howrylak, G. M. Hunninghake, C. H. An, L. Fredenburgh, A. F. Massaro,A. Rogers, L. Gazourian, K. Nakahira, J. A. Haspel, R. Landazury, S. Eppanapally, J. D. Christie,N. J. Meyer, L. B. Ware, D. C. Christiani, S. W. Ryter, R. M. Baron, A. M. Choi, Inflammasome-regulated cytokines are critical mediators of acute lung injury. Am. J. Respir. Crit. Care Med.185, 1225–1234 (2012).

32. J.-E. Berdal, T. E. Mollnes, T. Wæhre, O. K. Olstad, B. Halvorsen, T. Ueland, J. H. Laake, M. T. Furuseth,A. Maagaard, H. Kjekshus, P. Aukrust, C. M. Jonassen, Excessive innate immune response andmutant D222G/N in severe A (H1N1) pandemic influenza. J. Infect. 63, 308–316 (2011).

33. M. P. Berry, C. M. Graham, F. W. McNab, Z. Xu, S. A. Bloch, T. Oni, K. A. Wilkinson, R. Banchereau,J. Skinner, R. J. Wilkinson, C. Quinn, D. Blankenship, R. Dhawan, J. J. Cush, A. Mejias, O. Ramilo,O. M. Kon, V. Pascual, J. Banchereau, D. Chaussabel, A. O’Garra, An interferon-inducibleneutrophil-driven blood transcriptional signature in human tuberculosis. Nature 466,973–977 (2010).

34. K. Fredriksson, I. Tjäder, P. Keller, N. Petrovic, B. Ahlman, C. Schéele, J. Wernerman, J. A. Timmons,O. Rooyackers, Dysregulation of mitochondrial dynamics and the muscle transcriptome in ICUpatients suffering from sepsis induced multiple organ failure. PLOS One 3, e3686 (2008).

35. J. E. McDunn, K. D. Husain, A. D. Polpitiya, A. Burykin, J. Ruan, Q. Li, W. Schierding, N. Lin, D. Dixon,W. Zhang, C. M. Coopersmith, W. M. Dunne, M. Colonna, B. K. Ghosh, J. P. Cobb, Plasticity of thesystemic inflammatory response to acute infection during critical illness: Development of theriboleukogram. PLOS One 3, e1564 (2008).

36. T. P. Chung, J. M. Laramie, D. J. Meyer, T. Downey, L. H. Tam, H. Ding, T. G. Buchman, I. Karl,G. D. Stormo, R. S. Hotchkiss, J. P. Cobb, Molecular diagnostics in sepsis: From bedside tobench. J. Am. Coll. Surg. 203, 585–598 (2006).

37. G. Parnell, A. McLean, D. Booth, S. Huang, M. Nalos, B. Tang, Aberrant cell cycle and apoptoticchanges characterise severe influenza A infection—A meta-analysis of genomic signatures incirculating leukocytes. PLOS One 6, e17186 (2011).

38. M. Emonts, thesis, Erasmus University Rotterdam (2008).39. C. L. Smith, P. Dickinson, T. Forster, M. Craigon, A. Ross, M. R. Khondoker, R. France, A. Ivens,

D. J. Lynn, J. Orme, A. Jackson, P. Lacaze, K. L. Flanagan, B. J. Stenson, P. Ghazal, Identifi-cation of a human neonatal immune-metabolic network associated with bacterial infec-tion. Nat. Commun. 5, 4649 (2014).

40. A. G. Vassiliou, N. A. Maniatis, S. E. Orfanos, Z. Mastora, E. Jahaj, T. Paparountas, A. Armaganidis,C. Roussos, V. Aidinis, A. Kotanidou, Induced expression and functional effects of aquaporin-1 in human leukocytes in sepsis. Crit. Care 17, R199 (2013).

41. P. Khatri, S. Roedder, N. Kimura, K. De Vusser, A. A. Morgan, Y. Gong, M. P. Fischbein, R. C. Robbins,M. Naesens, A. J. Butte, M. M. Sarwal, A common rejection module (CRM) for acute rejectionacross multiple organs identifies novel therapeutics for organ transplantation. J. Exp. Med. 210,2205–2221 (2013).

42. R. Chen, P. Khatri, P. K. Mazur, M. Polin, Y. Zheng, D. Vaka, C. D. Hoang, J. Shrager, Y. Xu, S. Vicent,A. J. Butte, E. A. Sweet-Cordero, A meta-analysis of lung cancer gene expression identifies PTK7as a survival gene in lung adenocarcinoma. Cancer Res. 74, 2892–2902 (2014).

43. F. Hietbrink, L. Koenderman, M. Althuizen, J. Pillay, V. Kamp, L. P. Leenen, Kinetics of the innateimmune response after trauma: Implications for the development of late onset sepsis. Shock40, 21–27 (2013).

44. M. J. Pencina, R. B. D’Agostino Sr., O. V. Demler, Novel metrics for evaluating improvement indiscrimination: Net reclassification and integrated discrimination improvement for normalvariables and nested models. Stat. Med. 31, 101–113 (2012).

45. K. Newton, V. M. Dixit, Signaling in innate immunity and inflammation. Cold Spring Harb.Perspect. Biol. 4, a006049 (2012).

46. S. A. Madsen-Bouterse, R. Romero, A. L. Tarca, J. P. Kusanovic, J. Espinoza, C. J. Kim, J.-S. Kim,S. S. Edwin, R. Gomez, S. Draghici, The transcriptome of the fetal inflammatory responsesyndrome. Am. J. Reprod. Immunol. 63, 73–92 (2010).

47. A. Kwan, M. Hubank, A. Rashid, N. Klein, M. J. Peters, Transcriptional instability during evol-ving sepsis may limit biomarker based risk stratification. PLOS One 8, e60501 (2013).

48. R. Cavallazzi, C. L. Bennin, A. Hirani, C. Gilbert, P. E. Marik, Is the band count useful in thediagnosis of infection? An accuracy study in critically ill patients. J. Intensive Care Med. 25,353–357 (2010).

49. P. J. Cornbleet, Clinical utility of the band count. Clin. Lab. Med. 22, 101–136 (2002).50. F. Venet, C.-S. Chung, G. Monneret, X. Huang, B. Horner, M. Garber, A. Ayala, Regulatory T cell

populations in sepsis and trauma. J. Leukoc. Biol. 83, 523–535 (2008).51. D. Grimaldi, S. Louis, F. Pène, G. Sirgo, C. Rousseau, Y. E. Claessens, L. Vimeux, A. Cariou, J. P. Mira,

A. Hosmalin, J. D. Chiche, Profound and persistent decrease of circulating dendritic cells isassociated with ICU-acquired infection in patients with septic shock. Intensive Care Med. 37,1438–1446 (2011).

www.Scie

52. D. I. Gabrilovich, S. Nagaraj, Myeloid-derived suppressor cells as regulators of the immunesystem. Nat. Rev. Immunol. 9, 162–174 (2009).

53. E. L. Vanzant, C. M. Lopez, T. Ozrazgat-Baslanti, R. Ungaro, R. Davis, A. G. Cuenca, L. F. Gentile,D. C. Nacionales, A. L. Cuenca, A. Bihorac, C. Leeuwenburgh, J. Lanz, H. V. Baker, B. McKinley,L. L. Moldawer, F. A. Moore, P. A. Efron, Persistent inflammation, immunosuppression, andcatabolism syndrome after severe blunt trauma. J. Trauma Acute Care Surg. 76, 21–30; dis-cussion 29–30 (2014).

54. S. Park, Y. Zhang, S. Lin, T.-H. Wang, S. Yang, Advances in microfluidic PCR for point-of-careinfectious disease diagnostics. Biotechnol. Adv. 29, 830–839 (2011).

55. W. Xu, J. Seok, M. N. Mindrinos, A. C. Schweitzer, H. Jiang, J. Wilhelmy, T. A. Clark, K. Kapur,Y. Xing, M. Faham, J. D. Storey, L. L. Moldawer, R. V. Maier, R. G. Tompkins, W. H. Wong,R. W. Davis, W. Xiao; Inflammation and Host Response to Injury Large-Scale Collaborative Re-search Program, Human transcriptome array for high-throughput clinical studies. Proc. Natl.Acad. Sci. U.S.A. 108, 3707–3712 (2011).

56. Y. Koren, L. Carmel, Robust linear dimensionality reduction. IEEE Trans. Vis. Comput. Graph.10, 459–470 (2004).

57. A. Bodor, I. Csabai, M. W. Mahoney, N. Solymosi, rCUR: An R package for CUR matrixdecomposition. BMC Bioinformatics 13, 103 (2012).

58. J. Vandesompele, K. De Preter, F. Pattyn, B. Poppe, N. Van Roy, A. De Paepe, F. Speleman,Accurate normalization of real-time quantitative RT-PCR data by geometric averaging ofmultiple internal control genes. Genome Biol. 3, RESEARCH0034–research0034.11 (2002).

59. J. T. Leek, W. E. Johnson, H. S. Parker, A. E. Jaffe, J. D. Storey, The sva package for removingbatch effects and other unwanted variation in high-throughput experiments. Bioinformatics28, 882–883 (2012).

60. R. K. Auerbach, B. Chen, A. J. Butte, Relating genes to function: Identifying enriched tran-scription factors using the ENCODE ChIP-Seq significance tool. Bioinformatics 29, 1922–1924(2013).

61. H. G. Roider, T. Manke, S. O’Keeffe, M. Vingron, S. A. Haas, PASTAA: Identifying transcriptionfactors associated with sets of co-regulated genes. Bioinformatics 25, 435–442 (2009).

62. K. L. Jeffrey, T. Brummer, M. S. Rolph, S. M. Liu, N. A. Callejas, R. J. Grumont, C. Gillieron,F. Mackay, S. Grey, M. Camps, C. Rommel, S. D. Gerondakis, C. R. Mackay, Positive regulationof immune cell function and inflammatory responses by phosphatase PAC-1. Nat. Immunol.7, 274–283 (2006).

63. F. O. Martinez, S. Gordon, M. Locati, A. Mantovani, Transcriptional profiling of the humanmonocyte-to-macrophage differentiation and polarization: New molecules and patterns ofgene expression. J. Immunol. 177, 7303–7311 (2006).

64. S. Radom-Aizik, F. Zaldivar Jr., S.-Y. Leu, P. Galassetti, D. M. Cooper, Effects of 30 min ofaerobic exercise on gene expression in human neutrophils. J. Appl. Physiol. 104, 236–243(2008).

65. F. He, H. Chen, M. Probst-Kepper, R. Geffers, S. Eifes, A. Del Sol, K. Schughart, A.-P. Zeng, R. Balling,PLAU inferred from a correlation network is critical for suppressor function of regulatory T cells.Mol. Syst. Biol. 8, 624 (2012).

66. M. Giefing, S. Winoto-Morbach, J. Sosna, C. Döring, W. Klapper, R. Küppers, S. Böttcher, D. Adam,R. Siebert, S. Schütze, Hodgkin-Reed-Sternberg cells in classical Hodgkin lymphoma show altera-tions of genes encoding the NADPH oxidase complex and impaired reactive oxygen speciessynthesis capacity. PLOS One 8, e84928 (2013).

67. J. A. Meyers, D. W. Su, A. Lerner, Chronic lymphocytic leukemia and B and T cells differ in theirresponse to cyclic nucleotide phosphodiesterase inhibitors. J. Immunol. 182, 5400–5411(2009).

68. S. Eckerle, V. Brune, C. Döring, E. Tiacci, V. Bohle, C. Sundström, R. Kodet, M. Paulli, B. Falini,W. Klapper, A. B. Chaubert, K. Willenbrock, D. Metzler, A. Bräuninger, R. Küppers, M.-L. Hansmann,Gene expression profiling of isolated tumour cells from anaplastic large cell lymphomas: In-sights into its cellular origin, pathogenesis and relation to Hodgkin lymphoma. Leukemia 23,2129–2138 (2009).

69. K. A. Stegmann, N. K. Björkström, H. Veber, S. Ciesek, P. Riese, J. Wiegand, J. Hadem, P. V. Suneetha,J. Jaroszewicz, C. Wang, V. Schlaphoff, P. Fytili, M. Cornberg, M. P. Manns, R. Geffers, T. Pietschmann,C. A. Guzmán, H. G. Ljunggren, H. Wedemeyer, Interferon-a–induced TRAIL on natural killercells is associated with control of hepatitis C virus infection. Gastroenterology 138, 1885–1897(2010).

70. D. C. Vinh, S. Y. Patel, G. Uzel, V. L. Anderson, A. F. Freeman, K. N. Olivier, C. Spalding,S. Hughes, S. Pittaluga, M. Raffeld, L. R. Sorbara, H. Z. Elloumi, D. B. Kuhns, M. L. Turner,E. W. Cowen, D. Fink, D. Long-Priel, A. P. Hsu, L. Ding, M. L. Paulson, A. R. Whitney, E. P. Sampaio,D. M. Frucht, F. R. DeLeo, S. M. Holland, Autosomal dominant and sporadic monocytopeniawith susceptibility to mycobacteria, fungi, papillomaviruses, and myelodysplasia. Blood 115,1519–1529 (2010).

71. P. Ancuta, K.-Y. Liu, V. Misra, V. S. Wacleche, A. Gosselin, X. Zhou, D. Gabuzda, Transcriptionalprofiling reveals developmental relationship and distinct biological functions of CD16+ andCD16- monocyte subsets. BMC Genomics 10, 403 (2009).

72. N. Novershtern, A. Subramanian, L. N. Lawton, R. H. Mak, W. N. Haining, M. E. McConkey, N. Habib,N. Yosef, C. Y. Chang, T. Shay, G. M. Frampton, A. C. Drake, I. Leskov, B. Nilsson, F. Preffer,D. Dombkowski, J. W. Evans, T. Liefeld, J. S. Smutko, J. Chen, N. Friedman, R. A. Young, T. R. Golub,

nceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 14

Page 15: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

R E S EARCH ART I C L E

http://sD

ownloaded from

A. Regev, B. L. Ebert, Densely interconnected transcriptional circuits control cell states in humanhematopoiesis. Cell 144, 296–309 (2011).

73. F. Allantaz, D. T. Cheng, T. Bergauer, P. Ravindran, M. F. Rossier, M. Ebeling, L. Badi, B. Reis,H. Bitter, M. D’Asaro, A. Chiappe, S. Sridhar, G. D. Pacheco, M. E. Burczynski, D. Hochstrasser,J. Vonderscher, T. Matthes, Expression profiling of human immune cell subsets identifiesmiRNA-mRNA regulatory relationships correlated with cell type specific expression. PLOSOne 7, e29979 (2012).

74. E. Tsitsiou, A. E. Williams, S. A. Moschos, K. Patel, C. Rossios, X. Jiang, O.-D. Adams, P. Macedo,R. Booton, D. Gibeon, K. F. Chung, M. A. Lindsay, Transcriptome analysis shows activation ofcirculating CD8+ T cells in patients with severe asthma. J. Allergy Clin. Immunol. 129, 95–103(2012).

75. M. Frankenberger, T. P. Hofer, A. Marei, F. Dayyani, S. Schewe, C. Strasser, A. Aldraihim,F. Stanzel, R. Lang, R. Hoffmann, O. Prazeres da Costa, T. Buch, L. Ziegler-Heitbrock, Transcriptprofiling of CD16-positive monocytes reveals a unique molecular fingerprint. Eur. J. Immunol.42, 957–974 (2012).

76. N.-Y. Huen, A. L. Pang, J. A. Tucker, T.-L. Lee, M. Vergati, C. Jochems, C. Intrivici, V. Cereda,W.-Y. Chan, O. M. Rennert, R. A. Madan, J. L. Gulley, J. Schlom, K. Y. Tsang, Up-regulation of pro-liferative and migratory genes in regulatory T cells from patients with metastatic castration-resistant prostate cancer. Int. J. Cancer 133, 373–382 (2013).

77. K. C. Malcolm, E. M. Nichols, S. M. Caceres, J. E. Kret, S. L. Martiniano, S. D. Sagel, E. D. Chan,L. Caverly, G. M. Solomon, P. Reynolds, D. L. Bratton, J. L. Taylor-Cousar, D. P. Nichols,M. T. Saavedra, J. A. Nick, Mycobacterium abscessus induces a limited pattern of neutrophilactivation that promotes pathogen survival. PLOS One 8, e57402 (2013).

78. N. Rapin, F. O. Bagger, J. Jendholm, H. Mora-Jensen, A. Krogh, A. Kohlmann, C. Thiede,N. Borregaard, L. Bullinger, O. Winther, K. Theilgaard-Mönch, B. T. Porse, Comparing cancervs normal gene expression profiles identifies new disease entities and common transcrip-tional programs in AML patients. Blood 123, 894–904 (2014).

79. N. A. Mabbott, J. K. Baillie, H. Brown, T. C. Freeman, D. A. Hume, An expression atlas of humanprimary cells: Inference of gene function from coexpression networks. BMC Genomics 14, 632(2013).

www.Scie

Acknowledgments:We would like to thank T. Roth for the help in curating the public immunecell profiles; T. Chakraborty, J. P. Cobb, H. Hossain, M. Lissauer, G. Parnell, and B. Tang forhelpful discussions concerning their publicly available data; P. Mason for assistance with ac-cessing the Glue Grant data; and the Glue Grant investigators for sharing their data publicly;they are supported in this by National Institute of General Medical Sciences Glue Grant LegacyAward R24GM102656. Funding: T.E.S is funded by National Library of Medicine grant2T15LM007033, a Stanford Child Health Research Institute Young Investigator Award (throughthe Institute for Immunity, Transplantation and Infection), and the Stanford Department ofSurgery. P.K. is funded by National Institute of Allergy and Infectious Diseases grants1U19AI109662, U19AI057229, U54I117925, and U01AI089859. Author contributions: Studyconception and design: T.E.S. and P.K.; contributed materials and methods: A.S. and H.R.W.;performed the analysis: T.E.S. and P.K.; drafted manuscript: T.E.S. and P.K.; and critical revision:T.E.S., A.S., H.R.W., and P.K. Competing interests: The 11-gene set has been disclosed for pos-sible patent protection to Stanford Office of Technology and Licensing by T.E.S. and P.K. Dataand materials availability: Links to the data sets used here are available in table S1; patientdesign matrix and probe-level data are available in tables S2 and S3. The GPSSSI unique dataset has been deposited in GEO at accession number GSE66099. The Glue Grant data are pub-licly available, pending Institutional Review Board approval per the instructions found athttps://www.gluegrant.org/. The data and code necessary to recreate the multicohort analysishave been deposited online at http://khatrilab.stanford.edu/sepsis; access to the data will begranted after approval by the Glue Grant consortium.

Submitted 2 January 2015Accepted 11 March 2015Published 13 May 201510.1126/scitranslmed.aaa5993

Citation: T. E. Sweeney, A. Shidham, H. R. Wong, P. Khatri, A comprehensive time-course–based multicohort analysis of sepsis and sterile inflammation reveals a robust diagnosticgene set. Sci. Transl. Med. 7, 287ra71 (2015).

tm

nceTranslationalMedicine.org 13 May 2015 Vol 7 Issue 287 287ra71 15

by guest on June 21, 2020.sciencem

ag.org/

Page 16: A comprehensive time-course based multicohort analysis of ... · infectious inflammation. INTRODUCTION Sepsis, a syndrome of systemic inflammation in response to infection, kills

inflammation reveals a robust diagnostic gene setbased multicohort analysis of sepsis and sterile−A comprehensive time-course

Timothy E. Sweeney, Aaditya Shidham, Hector R. Wong and Purvesh Khatri

DOI: 10.1126/scitranslmed.aaa5993, 287ra71287ra71.7Sci Transl Med

disease.any attempts to identify sepsis patients by gene expression would need to consider the time since the onset of two conditions. The authors also discovered that gene expression in these conditions changes over time, and thuscollection of data from patients with sepsis or sterile inflammation and identified a set of genes that distinguish the

evaluated a largeet al.misdiagnosed with sepsis exposes the patient to side effects of the drugs. Sweeney sepsis and initiation of antibiotics can be deadly. Conversely, preemptively starting antibiotics in a patientcauses of inflammation, which can be a nonspecific response to noninfectious conditions. A delay in recognition of

Sepsis, or systemic inflammation resulting from infection, is notoriously difficult to distinguish from otherUncovering the tracks of sepsis

ARTICLE TOOLS http://stm.sciencemag.org/content/7/287/287ra71

MATERIALSSUPPLEMENTARY http://stm.sciencemag.org/content/suppl/2015/05/11/7.287.287ra71.DC1

CONTENTRELATED

http://stm.sciencemag.org/content/scitransmed/12/539/eabb2575.fullhttp://stm.sciencemag.org/content/scitransmed/12/539/eaaz3833.fullhttp://stm.sciencemag.org/content/scitransmed/11/518/eaax9000.fullhttp://stm.sciencemag.org/content/scitransmed/9/412/eaan5689.fullhttp://science.sciencemag.org/content/sci/358/6359/42.fullhttp://science.sciencemag.org/content/sci/358/6359/111.fullhttp://stm.sciencemag.org/content/scitransmed/8/346/346ra91.fullhttp://stm.sciencemag.org/content/scitransmed/8/335/335ra55.fullhttp://stm.sciencemag.org/content/scitransmed/8/335/335fs9.fullhttp://stm.sciencemag.org/content/scitransmed/7/303/303ra140.fullhttp://stm.sciencemag.org/content/scitransmed/7/299/299ra122.fullhttp://stke.sciencemag.org/content/sigtrans/8/381/ra59.fullhttp://stm.sciencemag.org/content/scitransmed/2/48/48ra65.fullhttp://stm.sciencemag.org/content/scitransmed/5/195/195ra95.full

REFERENCES

http://stm.sciencemag.org/content/7/287/287ra71#BIBLThis article cites 77 articles, 11 of which you can access for free

PERMISSIONS http://www.sciencemag.org/help/reprints-and-permissions

Terms of ServiceUse of this article is subject to the

registered trademark of AAAS. is aScience Translational MedicineScience, 1200 New York Avenue NW, Washington, DC 20005. The title

(ISSN 1946-6242) is published by the American Association for the Advancement ofScience Translational Medicine

Copyright © 2015, American Association for the Advancement of Science

by guest on June 21, 2020http://stm

.sciencemag.org/

Dow

nloaded from