A Burkholderia pseudomallei protein microarray reveals serodiagnostic and cross-reactive antigens

Philip L. Felgner a,b,c,1, Matthew A. Kayala b,d, Adam Vigil a, Chad Burk a, Rie Nakajima-Sasaki a, Jozelyn Pablo a, Douglas M. Molina c, Siddiqua Hirst a, Janet S. W. Chew e, Dongling Wang e, Gladys Tan e, Melanie Duffield f, Ron Yang g, Julien Neel b,d, Narisara Chantratita h, Greg Bancroft i, Ganjana Lertmemongkolchai j, D. Huw Davies a, Pierre Baldi b,d,k, Sharon Peacock h,l, and Richard W. Titball g,i

a Department of Medicine, Division of Infectious Diseases, University of California, Irvine, CA 92697; b Institute for Genomics and Bioinformatics, d Department of Computer Science, and k Department of Biological Chemistry, University of California, Irvine, CA 92067; c Antigen Discovery, Inc., Irvine, CA 92618; e Defence Medical & Environmental Research Institute, DSO National Laboratories, 27 Medical Drive, #13-01, Singapore 117510; f Defence Science and Technology Laboratory, Porton Down, Salisbury SP4 0JQ, United Kingdom; g School of Biosciences, Geoffrey Pope Building, University of Exeter, Exeter EX4 4QD, United Kingdom; i Department of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, United Kingdom; j Department of Clinical Immunology, Khon Kaen University, Khon Kaen 40002, Thailand; h Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok 10400, Thailand; and l Center for Clinical Vaccinology and Tropical Medicine, Nuffield Department of Clinical Medicine, University of Oxford, Churchill Hospital, Oxford CB2 0QQ, United Kingdom

Edited by Peter Palese, Mount Sinai School of Medicine, New York, NY, and approved June 19, 2009 (received for review December 8, 2008)

Understanding the way in which the immune system responds to infection is central to the development of vaccines and many diagnostics. To provide insight into this area, we fabricated a protein microarray containing 1,205 Burkholderia pseudomallei proteins, probed it with 88 melioidosis patient sera, and identified 170 reactive antigens. This subset of antigens was printed on a smaller array and probed with a collection of 747 individual sera derived from 10 patient groups including melioidosis patients from Northeast Thailand and Singapore, patients with different infections, healthy individuals from the USA, and from endemic and nonendemic regions of Thailand. We identified 49 antigens that are significantly more reactive in melioidosis patients than healthy people and patients with other types of bacterial infections. We also identified 59 cross-reactive antigens that are equally reactive among all groups, including healthy controls from the USA. Using these results we were able to devise a test that can classify melioidosis positive and negative individuals with sensitivity and specificity of 95% and 83%, respectively, a significant improvement over currently available diagnostic assays. Half of the reactive antigens contained a predicted signal peptide sequence and were classified as outer membrane, surface structures or secreted molecules, and an additional 20% were associated with pathogenicity, adaptation or chaperones. These results show that microarrays allow a more comprehensive analysis of the immune response on an antigen-specific, patient-specific, and population-specific basis, can identify serodiagnostic antigens, and contribute to a more detailed understanding of immunogenicity to this pathogen.

antigen discovery | melioidosis | diagnostic | antigen prediction

Author contributions: P.L.F., A.V., G.T., G.B., G.L., D.H.D., P.B., S.P., and R.W.T. designed research; M.A.K., A.V., R.N.-S., J.P., D.M.M., S.H., J.S.W.C., D.W., and N.C. performed research; P.L.F., M.A.K., A.V., C.B., G.T., M.D., R.Y., J.N., G.L., D.H.D., P.B., S.P., and R.W.T. analyzed data; and P.L.F., A.V., P.B., S.P., and R.W.T. wrote the paper.

Conflict of interest statement: P.L.F. and D.H.D. have patent applications related to protein microarray fabrication and have stock positions with Antigen Discovery, Inc. D.M.M. is an employee with Antigen Discovery, Inc.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.

1 To whom correspondence should be addressed at: Department of Medicine, Division of Infectious Diseases, University of California, 3501 Hewitt Hall, Irvine, CA 92697. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/cgi/content/full/0812080106/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.0812080106 | PNAS | August 11, 2009 | vol. 106 | no. 32 | 13499–13504 | MEDICAL SCIENCES

Understanding the interaction of the immune system with bacteria is central to an understanding of the pathogenesis of infectious disease, and also to the development of diagnostics and vaccines. Yet there is an unmet need for a comprehensive and unambiguous approach for quantifying the immune response to infection in an antigen-specific manner and on a genome-wide scale. The development of antibodies to individual components of bacterial pathogens is 1 important element of the immune response, and is especially relevant to the development of diagnostics and for therapies where antibodies play direct or indirect roles in the killing of pathogens or pathogen-infected cells. Although protein antigens are generally assumed to make up the majority of the bacterial antigens that are recognized by antibodies, the reasons why only some proteins evoke antibody responses are poorly understood. There is a general and intuitive assumption that proteins that are displayed on the surface of the bacterium are more likely to evoke an antibody response. In the case of proteins able to induce protective immunity there is some evidence that this assumption is correct. An analysis of 72 bacterial proteins included in vaccines or shown to induce protective immunity in animal models of disease showed that 52 (72%) possessed signal sequences and were therefore likely to be located outside of the cell membrane (1). The assumption that proteins able to induce protective immunity possess signal sequences is exploited in reverse vaccinology, where candidate protective antigens are predicted from the genome sequence (2).

However, these studies fail to provide insight into the broader question of the differential abilities of bacterial proteins to induce antibody responses in a population. To a large extent, the difficulties in addressing this question are a reflection of the limitations of existing technologies for mapping the complete subset of proteins involved in the immune response against the infectious agent. These technologies rely on chromatographic or electrophoretic separation of proteins from bacteria or the screening of expression libraries. These approaches are limited by the different levels of expression of proteins in the native or recombinant hosts and by specific bacterial growth conditions (if the organism is cultivable), and they are not conducive to high-throughput screening of a large collection of serum samples.

We have previously developed array technology that allows protein microarrays to be constructed from the predicted proteome of a microorganism (3–7). These arrays can be used to address basic questions about the interactions of a given pathogen with the host immune system (8–10), and allow the identification of proteins which can be used as diagnostic reagents or for inclusion in vaccines (3). The empirical data gathered from this type of array can also be used to evaluate and improve the accuracy of in silico antigen prediction of a proteome.

We report here the development of a Burkholderia pseudomallei protein array. B. pseudomallei is the causative agent of melioidosis, a serious and often fatal infectious disease of humans. It is an important medical problem across Southeast


Asia and northern Australia, and is increasingly recognized in other tropical areas of the world, including areas of South America (11, 12). The global incidence of disease is not known, but in northeast Thailand, the disease accounts for 40% of all deaths from community-acquired septicemia. The potential for the bacterium to cause disease after inhalation has also resulted in the inclusion of this pathogen on the CDC category B list of potential biological warfare and bioterrorism agents (12). Here we have used the protein array to map the antibody response in 747 serum samples from well-defined melioidosis positive and negative patients.

Results

Protein Microarray Design and Construction. We devised a protein array of 1,205 proteins, including proteins predicted to be surface located using PSORTb (13), components of the 3 different type III systems, components of the flagella, proteins identified as immunoreactive from 2D gels, and 672 proteins selected at random. This array was probed with a collection of 88 sera from melioidosis patients in Singapore (SI Text). One hundred seventy antigens with average signal intensity greater than 2.5 times the standard deviation of the average negative control spots were considered seroreactive. Of these, 80 proteins were encoded on chromosome 1, and 90 on chromosome 2. Three of the seroreactive proteins were encoded in genomic islands (BPSL1705 in GI8, BPSS0663 in GI4, and BPSS1068 in GI15), and 2 proteins were encoded in possible genomic islands (BPSS0402 and BPSL0739). Some large CDSs (coding sequences) were expressed as overlapping fragments and for many of these, we found reactivity against several of the fragments, providing evidence that epitopes are distributed throughout these proteins. For example, 6 polypeptides from the BPSS2053 CDS, encoding a 3,103 amino acid protein similar to the Ralstonia solanacearum probable hemagglutinin-related protein, all reacted with melioidosis sera. Four polypeptides that form the BPSS1434 CDS, encoding a 2,178 amino acid protein similar to a Streptococcus pneumoniae cell wall surface anchor family protein, all reacted with patient sera. Nine polypeptides from the BPSL1661 CDS, encoding a 3,229 amino acid protein similar to a Ralstonia solanacearum putative hemagglutinin/hemolysin-related protein, all reacted with melioidosis sera. In a previous study, 12 B. pseudomallei proteins were identified as immunoreactive by 2D gels (14), and all of these proteins were identified as immunoreactive using the array.
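The seroreactivity cutoff described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name, the data layout, and the example intensities are hypothetical. The text can be read either as a cutoff of 2.5 times the standard deviation of the negative control spots, or as the control mean plus 2.5 standard deviations, so the sketch exposes that choice as the assumed parameter `include_mean`.

```python
from statistics import mean, stdev


def seroreactive_antigens(signals, negative_controls, k=2.5, include_mean=False):
    """Return antigen IDs whose average signal exceeds the control-based cutoff.

    signals: dict mapping antigen ID -> list of per-serum signal intensities.
    negative_controls: intensities of the negative control spots.
    include_mean: assumption switch; if True the cutoff is mean + k * SD
                  of the controls, otherwise k * SD alone.
    """
    cutoff = k * stdev(negative_controls)
    if include_mean:
        cutoff += mean(negative_controls)
    return sorted(ag for ag, values in signals.items() if mean(values) > cutoff)


# Hypothetical intensities, for illustration only.
controls = [100, 120, 80, 110, 90, 100]
signals = {"BPSL2697": [900, 1500], "hypothetical_low": [20, 30]}
print(seroreactive_antigens(signals, controls))
```

Both readings of the cutoff select the same antigen in this toy data; on real arrays the two definitions could differ, which is why the switch is kept explicit.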

Mapping the Antigenic Profile. The seroreactive antigens were printed onto smaller arrays (Fig. 1), which we used to probe the entire collection of 747 serum samples from patients with melioidosis and control sera from individuals in Singapore, Thailand, and the USA (Table S1 and SI Text). The Singapore and Thailand melioidosis-positive and -negative cases were patients entering the hospital with symptoms of melioidosis that were later confirmed to be either positive or negative for melioidosis. The Thailand cases were from Ubon Ratchathani in Northeast Thailand, with healthy controls from the same region, where the disease is endemic. There were also healthy controls from Southern Thailand (outside the endemic region) and patients diagnosed with leptospirosis, other bacteremia, or fungemia.

The reactivity of 747 sera from different individuals is shown as a heatmap (Fig. 2) and as a histogram (Fig. S1) with patient samples grouped according to their clinical description. Thirty-one of the most reactive serodiagnostic antigens and 31 cross-reactive antigens are shown. ‘Serodiagnostic’ antigens are defined as significantly differentially reactive between the Singapore melioidosis-positive and -negative groups with Benjamini and Hochberg adjusted Cyber-T P values <0.05, and ‘cross-reactive’ antigens had P values that did not fall below 0.05 (Fig. 3). Analysis of variance was performed to detect differences in signal intensity between groups. For the serodiagnostic antigens there is a significant difference when comparing melioidosis-positive patients to all other groups (P value 3.6 × 10^-7), but the Singapore and Thailand melioidosis patients are not different from each other (P value 0.55), and all melioidosis negative patients are not different from each other (P value 0.91) (Fig. S1). All of the sera react similarly to the cross-reactive antigens whether from melioidosis-positive or -negative individuals, healthy subjects from endemic or nonendemic areas, or patients with other infections (P value 0.08).
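The Benjamini and Hochberg adjustment used to label antigens as serodiagnostic can be sketched in a few lines of Python. This is a generic implementation of the standard step-up procedure, not the authors' pipeline; the per-antigen raw P values are assumed to have been computed already (the Cyber-T test itself is not reproduced here), and the example values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four antigens.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
labels = ["serodiagnostic" if p < 0.05 else "cross-reactive" for p in adj]
print(list(zip(adj, labels)))
```

In this toy example only the first antigen keeps an adjusted P value below 0.05, which shows how the correction can move borderline antigens into the cross-reactive group.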

Determining a Set of Serodiagnostic Antigens. There were 2 distinct collections of melioidosis-positive and -negative specimens used for this work, from different clinical investigators and clinical sites in Thailand and Singapore, and different definitions of the cases and controls were used in each location. The incidence of melioidosis is relatively low in Singapore, and exposure to B. pseudomallei is considered to be an infrequent event. The Singaporean cases were clinically confirmed melioidosis and positive by the indirect hemagglutination assay (IHA), and about 40% of these samples were proven culture positive. All of the controls for this collection lacked clinical signs of melioidosis disease and were also IHA negative. The Northeast Thailand collection is from a region where the disease is considered endemic. All of the cases are proven culture positive, and the controls are culture negative, but IHA positive subjects are not excluded, because it is known that healthy individuals in the general population in Northeast Thailand can be IHA positive. For all of the work reported here we used the Singaporean cases and controls to develop a classifier that can distinguish melioidosis-positive and -negative subjects, and tested sensitivity and specificity of the classifier independently on the Thailand collection.

[Fig. 1 image: representative array images for a melioidosis-positive and a melioidosis-negative serum, with spot labels No DNA, Blank, HMIg, EBNA, and Expression Control.]

Fig. 1. Construction of a B. pseudomallei Microarray. Arrays were printed containing 214 B. pseudomallei proteins, positive and negative control spots. The arrays were read in a laser confocal scanner, analyzed, and the data normalized as described in the Materials and Methods section. Protein expression efficiency was determined to be 99.2% by probing against a carboxy-terminal HA tag for quality control. Each array contains positive control spots printed from 4 serial dilutions of human IgG (HMIg), and the intensity of these spots was similar for both serum samples. Each array also contained 6 "No DNA" negative control spots, and the reactivity of these spots was low for both serum samples. There are also 4 serially diluted EBNA1 (EBNA) protein control spots that are reactive to varying degrees in different subjects, as expected, and provide a methodological control. The remaining spots on the array are in vitro transcription/translation reactions expressing 183 different B. pseudomallei proteins that were selected from our primary array analysis. The signal intensity of each antigen is represented by rainbow palette of blue, green, red, and white by increasing signal intensity. Representative microarray immunofluorescence images of individual patient sera are displayed.

Our goal was to establish a collection of antigens that could be used as a multiplex set to accurately distinguish the melioidosis cases from controls. As such, we studied the discriminatory power of different sets of ORFs using receiver operating characteristic (ROC) curves. First, ROC curves were generated for individual serodiagnostic antigens to assess their ability to separate the control and disease cases. The serodiagnostic antigens were ranked by decreasing single antigen area under the ROC curve (AUC) (Table S2). The top 5 ORFs all have an AUC greater than 0.81, with GroEL (BPSL2697; AUC 0.89; Benjamini and Hochberg adjusted Cyber-T P value <10e-14) giving the best single antigen discrimination. Heat shock proteins like GroEL are often dismissed a priori for serodiagnostic utility because of perceived cross-reactivity with heat shock proteins from other organisms that would theoretically result in low specificity of the test; here, we find the heat shock protein GroEL is the most significantly differentially reactive antigen and have similarly found serodiagnostic power from heat shock proteins from other organisms (7). The 31st antigen has an AUC of 0.622, which still exceeds the upper 95% confidence interval for random expectations for the AUC. To extend the analysis to combinations of antigens, we used kernel methods and support vector machines (15, 16) to build linear and nonlinear classifiers. As input to the classifier, we used the highest-ranking 1, 2, 5, 10, 20, 25, 30, and 31 ORFs on the basis of either P value or single antigen AUC, and the results were validated with 10 runs of 3-fold cross-validation. The boxplots for these predictions were plotted, and the results show that increasing the antigen number from 1 to 2, 2 to 5, and 5 to 10 improved the classifier, whereas accuracy declined as the number of antigens increased above 20, owing to over-fitting (Fig. 4A and Table S3).
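The single-antigen ranking step can be sketched as follows. The AUC is computed here through its rank interpretation, the probability that a randomly chosen positive serum gives a higher signal than a randomly chosen negative serum, with ties counted as one half. The function names and the toy intensities are hypothetical, and the support vector machine stage is not reproduced.

```python
def auc(pos, neg):
    """AUC as P(positive > negative), counting ties as 0.5."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_antigens(pos_by_antigen, neg_by_antigen):
    """Return (antigen, AUC) pairs sorted by decreasing single-antigen AUC."""
    scored = [(a, auc(pos_by_antigen[a], neg_by_antigen[a])) for a in pos_by_antigen]
    return sorted(scored, key=lambda pair: -pair[1])


# Hypothetical signal intensities for two made-up antigens.
pos = {"ag_strong": [5, 6, 7], "ag_weak": [1, 4, 2]}
neg = {"ag_strong": [1, 2, 3], "ag_weak": [3, 2, 5]}
for antigen, score in rank_antigens(pos, neg):
    print(antigen, round(score, 3))
```

The top-k antigens from such a ranking would then serve as the input features for a multi-antigen classifier, as the text describes for k = 1, 2, 5, 10, 20, 25, 30, and 31.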

The classifier threshold that yielded the highest accuracy in the Singapore data set was then used to predict the samples in the Thailand collection. Using 20 antigens the classifier predicts 86% of the true positives and 98% of the true negatives from Singapore, and 76% and 72% of the Thailand positives and negatives, respectively. The healthy controls from North and South Thailand are predicted with 78% and 86% accuracy, respectively. Patients with other infections are predicted as melioidosis negative with 69% to 100% accuracy, depending on the group.
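The train-on-one-collection, test-on-another design can be sketched with a simple threshold rule. This is an illustration only: it assumes each serum has already been reduced to a single classifier score, and the scores, labels, and function names below are hypothetical rather than taken from the study.

```python
def best_threshold(scores, labels):
    """Pick the cutoff among observed scores with the highest training accuracy."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(scores)):
        # A sample is called positive when its score is at or above the cutoff.
        correct = sum((s >= cut) == bool(y) for s, y in zip(scores, labels))
        acc = correct / len(scores)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut


def sensitivity_specificity(scores, labels, cut):
    """Return (sensitivity, specificity) for a fixed cutoff."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cut)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cut)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cut)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical training (threshold selection) and independent test sets.
train_scores, train_labels = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1], [1, 1, 1, 0, 0, 0]
test_scores, test_labels = [0.75, 0.6, 0.95, 0.72, 0.4, 0.1], [1, 1, 1, 0, 0, 0]

cut = best_threshold(train_scores, train_labels)
print(cut, sensitivity_specificity(test_scores, test_labels, cut))
```

Fixing the threshold on the training collection before touching the test collection is what keeps the reported sensitivity and specificity independent of the data used to tune the classifier.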

Improving Serodiagnostic Accuracy. To test the feasibility of using the serodiagnostic antigens on an alternative analytical platform, 10 serodiagnostic proteins were printed onto an Optitran nitrocellulose membrane (typically used for western blots) using a BioDot jet dispenser (SI Text). These 10 serodiagnostic proteins were chosen for being highly significant and highly seroreactive in the protein microarray assay. The paper was then cut into 3 mm strips to produce ‘immunostrips’ (also called ‘line blots’) (Fig. 4B). The individual strips were probed with 60 different melioidosis positive sera and 67 melioidosis negative sera from the Singapore collection. Reactive bands were visualized after incubation with alkaline phosphatase conjugated anti-human secondary antibody followed by substrate, and the band intensities quantified with ImageJ (17). Melioidosis patients reacted strongly against the serodiagnostic antigens, although the intensity pattern varied depending on the patient. Naïve and healthy control subjects had low reactivity against these serodiagnostic antigens.

[Fig. 2 heatmap. Column labels (serum groups): Singapore Positive, Singapore Negative, Thailand Positive, Thailand Negative, North Thailand Healthy, South Thailand Healthy, US Healthy, Bacteremia, Fungaemia, Leptospirosis. Row groups: serodiagnostic and cross-reactive antigens. Color scale: signal intensity, 0 to 200,000.]

Fig. 2. Probing a collection of B. pseudomallei infected, uninfected, and healthy control sera from Singapore, Thailand, and the USA. Arrays containing 214 B. pseudomallei proteins were probed with 747 melioidosis and nonmelioidosis sera organized into 11 groups as described in the text. The normalized intensity is shown according to the colorized scale with red strongest, bright green weakest, and black in between. The antigens are in rows and are grouped according to serodiagnostic and cross-reactive. The patient samples are in columns and sorted left to right by increasing average intensity to serodiagnostic antigens.

[Fig. 3 chart. Left axis: Antigen Intensity, 0 to 250,000. Right axis: p-value, 1.E-11 to 1.E-02. Groups: Serodiagnostic, Cross-reactive. Legend: Singapore Positive, Singapore Negative, p-value. Antigen labels along the horizontal axis, in extracted order: BPSL2697, BPSS0477, BPSS1532, BPSS1512, BPSL3319, BPSS1492, BPSS1599, BPSL2520, BPSL2522, BPSL0280, BPSL1445, BPSS1525, BPSL1913, BPSS1516, BPSL2096, BPSS0530, BPSS1385, BPSS1722, BPSS2141, BPSS0542, BPSL2698, BPSS0476, BPSL3247, BPSL2030, BPSS1588, BPSL1937, BPSS0752, BPSL2017, BPSL3222, BPSS2013, BPSS1617, BPSL1901, BPSL0999, BPSL2063, BPSL2765, BPSS2053, BPSS1531, BPSL2765, BPSS2053, BPSL3398, BPSL1465, BPSS1434, BPSL2615, BPSS2053, BPSL1902, BPSL1661, BPSL1661, BPSS1993, BPSL2827, BPSS0088, BPSL0014, BPSS1600, BPSS1974, BPSS0933, BPSS0734, BPSL1631, BPSL1661, BPSL1600, BPSS0109, BPSS0940, BPSS1742.]

Fig. 3. Serodiagnostic antigen discovery of melioidosis-positive patients from Singapore. The mean sera reactivity of the 214 antigens was compared between the Singapore melioidosis-positive and Singapore melioidosis-negative groups. Antigens with Benjamini Hochberg corrected P value less than 0.05 are organized to the left and cross-reactive antigens to the right. The 31 most reactive serodiagnostic and 31 of the most reactive cross-reactive antigens are shown.


A classifier was trained to accurately diagnose the melioidosis-positive and -negative samples from Singapore, and the classifier was tested against strips probed with the independent collection of positive and negative samples from Thailand. ROC curves were generated and compared to the microarray results (Fig. 4C). The area under the immunostrip ROC curve was greater than that of the microarray, indicating that a more accurate test can be obtained by transferring the serodiagnostic antigens discovered by microarray to the immunostrip format. The results are summarized in the contingency table (Fig. 4D). Increasing the number of antigens from 1 to 5 on the microarray results in a more accurate discrimination between positive and negative samples, with maximum sensitivity and specificity of 90% and 55%, respectively, for the Thailand samples. However, immunostrips containing the top 10 serodiagnostic antigens discovered by microarray discriminate the Thailand positive and negative patient samples with 95% sensitivity and 83% specificity. This accuracy is greater than previously published assays using crude B. pseudomallei antigen ELISA and affinity-purified antigen ELISA, and a substantial improvement over the clinical standard IHA assay (Table S3). This proof-of-concept diagnostic assay validates the set of serodiagnostic antigens and demonstrates the feasibility of transferring the antigens discovered by microarray to the immunostrip format to correctly classify B. pseudomallei positive sera using 10 differentially reactive serodiagnostic antigens.

Functional Classification of the Reactive Antigens. We next classified the serodiagnostic and cross-reactive antigens according to annotated and computationally predicted features. The complete analysis is in Table S4 and a summary is in Table 1. Annotated features were taken from the Artemis database on the Sanger website, which has 2 categories of functional definitions called ‘Colour’ and ‘Class.’ Each of the 13 colors has a functional definition, and each of the 5,853 proteins in the database is assigned exactly 1 color. The results in Table 1 show that only 2 of the 13 color classifications are significantly enriched in both the serodiagnostic and cross-reactive antigen sets. ‘Pathogenicity/adaptation/chaperone’ and ‘surface’ molecules account for 75% of all of the reactive antigens, whereas these functional categories account for only 38% of the proteins printed on the array (and 32% of the whole proteome). There are 180 class definitions in Artemis, and proteins can belong to more than 1 class; proteins classified as ‘chaperones’ and ‘outer membrane localized’ are significantly enriched in the serodiagnostic antigen set by 12.3- and 4.6-fold, respectively. Proteins classified as membrane/exported/lipoproteins are enriched among the cross-reactive antigens, but not the serodiagnostic antigens. The PSORTb computational predictor shows that cytoplasmic proteins are significantly underrepresented in the cross-reactive antigen set and cytoplasmic membrane proteins are significantly enriched. Proteins predicted to be extracellular and outer membrane localized are significantly enriched, and the presence of a signal peptide is a highly significant enrichment feature. We have expressed and probed 20% of the B. pseudomallei proteome and identified 49 serodiagnostic antigens and 59 cross-reactive antigens. By probing the remaining 4,648 proteins in the complete proteome, we may expect to identify a total of 295 cross-reactive and 240 serodiagnostic antigens. Pathogenicity-associated molecules and membrane structures account for 32% of the entire proteome, but these 2 categories account for 74% of the reactive antigens. Thus, by fabricating protein arrays containing 1,901 molecules in only these 2 categories, we expect to identify more than 70% of the reactive antigens.
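The fold enrichment values in Table 1 are consistent with the ratio of a category's share of the hits to its share of the proteins on the array; for example, (13/59)/(106/1205) ≈ 2.50 for the cross-reactive pathogenicity/adaptation/chaperone proteins. The sketch below computes this ratio and, as one common way of attaching a P value to such an enrichment, a one-sided hypergeometric tail probability. The choice of test is an assumption made here for illustration; this section does not state which test the authors used, and the function names are hypothetical.

```python
from math import comb


def fold_enrichment(hits_in_category: int, total_hits: int,
                    category_on_chip: int, chip_size: int) -> float:
    """Share of hits falling in the category divided by the
    category's share of all proteins printed on the chip."""
    return (hits_in_category / total_hits) / (category_on_chip / chip_size)


def hypergeom_upper_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X >= k) when n proteins are drawn without replacement from a
    chip of N proteins, K of which belong to the category.
    Assumed test for illustration; not confirmed as the authors' method."""
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


if __name__ == "__main__":
    # Counts taken from Table 1 (large chip of 1,205 proteins).
    print(fold_enrichment(13, 59, 106, 1205))   # cross-reactive, pathogenicity colour
    print(fold_enrichment(3, 49, 6, 1205))      # serodiagnostic, chaperone class
    print(hypergeom_upper_tail(3, 6, 49, 1205)) # tail probability for the chaperone class
```

A fold enrichment above 1 indicates overrepresentation among the hits and a value below 1 indicates underrepresentation, which is how the cytoplasmic proteins in Table 1 appear.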

The majority of these serodiagnostic antigens (36/49) were encoded on chromosome 2 of B. pseudomallei K96243 (Table S2). Chromosome 2 has previously been suggested to be associated with genes required for adaptation and survival in different niches (18), and our finding that serodiagnostic antigens were preferentially encoded on chromosome 2 is consistent with this hypothesis. The serodiagnostic proteins included a range of known or putative virulence-associated proteins, including components of TTSS3 (BPSS1532, BipB; and BPSS1525, BopE) (19–21) and a type IV pilus protein, PilO (BPSS1599) (22).

[Fig. 4 graphics omitted; recoverable panel labels and values follow.]

Panel A: Diagnosis accuracy (axis from 0.85 to 1.00) versus number of diagnostic antigens (1, 2, 5, 10, 15, 20, 25, 30, 31).

Panel B: Immunostrips for Mel. Negative and Mel. Positive sera. Stripe labels: BPSL2697, BPSS1532, BPSS0477, BPSS1512, BPSS1492, BPSL2522, BPSS1525, BPSL2698, BPSS0476, BPSL3222, no DNA, no DNA, 6 ng/ml VIG, 17 ng/ml VIG, 50 ng/ml VIG.

Panel C: ROC curves (true positive rate versus false positive rate, both from 0.0 to 1.0) for Immunostrip and Microarray.

Panel D:

Number of diagnostic antigens          Singapore Pos.  Singapore Neg.  Thailand Pos.  Thailand Neg.
1                                      76%             92%             75%            66%
2                                      89%             87%             90%            53%
5                                      90%             93%             90%            55%

Diagnostic assay                                                       Thailand Pos.  Thailand Neg.
immunostrip                                                            95%            83%
affinity-purified antigen ELISA                                        87%            88%
crude B. pseudomallei antigen ELISA                                    86%            83%
indirect hemagglutination assay                                        79%            72%

Fig. 4. Development of a melioidosis classifier for improved diagnosis. (A) The graph shows 9 boxplots for nonlinear classifiers with increasing number of antigens. As the number of antigens increases up to 5 antigens, the classifier becomes more accurate. (B) Ten serodiagnostic antigens were printed onto nitrocellulose paper in adjacent stripes using a BioDot jet dispenser (SI Text). Strips were probed with patient sera diluted 1/200 followed by alkaline phosphatase-conjugated secondary antibody and enzyme substrate. Weak reactivity in the naïve healthy controls can be readily distinguished from the strong reactivity in infected subjects. Ten representative strips for each group are shown. (C) The graph shows the immunostrip ROC curves compared to the microarray ROC curve generated with the same 10 antigens. (D) Percent diagnostic accuracy for each assay is listed. Microarray accuracy was calculated for 1, 2, 5, and 10 antigens. Thailand sera are from a previously well-characterized cohort of patients and represent a selected population of samples with definite melioidosis compared to patients with an alternative diagnosis. Thailand serum samples were used for direct comparison of the microarray, ELISA, IHA, and immunostrip assays.

13502 | www.pnas.org/cgi/doi/10.1073/pnas.0812080106 Felgner et al.


We have also interfaced our dataset with data on genes that are up-regulated in the hamster model of melioidosis (23). Genes encoding 6 of the 31 diagnostic signature proteins (BPSS0476: GroES; BPSL1445: putative lipoprotein; BPSL2017: di-heme cytochrome c peroxidase; BPSL2697: GroEL; BPSL3319: flagellin; and BPSL2520: putative exported protein) were up-regulated in vivo. Chaperones and stress response proteins from a range of pathogens have frequently been found to react with the appropriate convalescent sera (14, 24–28). Both GroEL and DnaK have previously been identified as reactive with sera from melioidosis patients (14, 25).

Discussion
This study is the most extensive use of a diverse collection of patient sera on a protein microarray. Of 1,205 proteins probed, 49 antigens were identified that are significantly more reactive in patients with melioidosis, and 59 antigens were cross-reactive in healthy people and patients with other infections. Because melioidosis is confined to relatively small localized regions of the world, it might have been anticipated that immunoreactivity against this species would be localized as well. Our finding that these cross-reactive antigens also reacted equally with sera from individuals in the USA suggests that cross-reactive antibodies may be responsible for these signals. This finding contrasts markedly with our previous studies with Francisella tularensis (7, 29), Borrelia burgdorferi (5), Plasmodium falciparum (6), and vaccinia (8), where reactivity of arrayed proteins with naïve patient sera was minimal or not seen.

The antibodies in melioidosis-negative sera may be indicative of past exposure to bacteria related to B. pseudomallei that possess similar proteins. For example, BPSS0542 (glycosyl hydrolase), BPSL1207 (polyribonucleotide nucleotidyltransferase), and BPSS2136 (family S43 nonpeptidase homologue) are identical in a range of Burkholderia species such as Burkholderia thailandensis, B. vietnamiensis, and B. cenocepacia. B. thailandensis is found in many regions of the world, including SE Asia and the USA (30), and the B. pseudomallei and B. thailandensis genomes are highly syntenic (31). Bacteria of the B. cepacia complex are also widely distributed worldwide and are frequently the cause of complications in cystic fibrosis. All of these bacteria rarely cause disease in healthy individuals, but it is possible that subclinical disease results in the induction of an immune response to these bacteria and the development of antibodies that react with the protein array. Even though cross-reactivity is widespread, some individuals within any region were essentially immunologically naïve. The results reported here suggest widespread exposure and cross-reactivity to nonpathogenic Burkholderia sp., but these conclusions warrant corroborating data from additional endemic and nonendemic regions and comparative data on arrays from nonpathogenic strains.

The information from the array was used to devise an accurate prototype diagnostic test, using differentially reactive antigens while avoiding the cross-reactive ones. The microarray platform may not necessarily be the most sensitive platform for producing the most accurate test results. Serodiagnostic antigens discovered by microarray were transferred to the immunostrip platform, the Singaporean specimen collection was used to train the classifier, and the classifier was tested on the Thailand collection. The immunostrip test allowed for the correct identification of 95% of melioidosis-positive samples and 83% of melioidosis-negative samples, a major improvement over the most commonly used diagnostic test (the IHA test), which identifies 79% and 72% of positive and negative samples, respectively. We regard the microarray as a serodiagnostic antigen discovery platform. Antigens discovered by microarray can be transferred to other, more sensitive platforms to give more accurate test results.

The data we report here also provide a broader insight into the antigens that are recognized by the immune system after natural infection. Reactivity was not evenly distributed across the proteome, and the cross-reactive and serodiagnostic antigens are selectively enriched with proteins from specific functional categories. Membrane and secreted molecules accounted for 52% of the reactive antigens, but this class of molecule constitutes only 29% of the proteins on the array. Similarly, pathogenicity-island related proteins and chaperones (as annotated in the Artemis database) accounted for 22% of the reactive antigens, but only 9% of the proteins printed on the large array were from this category. Three of the 6 annotated chaperones printed on the array were serologically reactive, and all 3 of these were serodiagnostic for melioidosis. PSORTb predicts 82 proteins (7% of the large array) to be extracellular or outer membrane localized, and 32 (39%) are antigenic. Proteins with a signal peptide are enriched on the reactive antigen lists, and proteins lacking a signal peptide sequence are underrepresented. Proteins involved in central and intermediary metabolism are also significantly underrepresented on the reactive antigen list. These

Table 1. Enrichment table

                                            Counts              Cross-reactive               Serodiagnostic
Category                                    Whole     Large     Hits    Fold     P value     Hits    Fold     P value
                                            proteome  chip              enrich                       enrich

Artemis ’colour’ definitions
pathogenicity/adaption/chaperones           368       106       13 *    2.50     1.2E-03     11 *    2.55     2.4E-03
Surface (outer membrane, inner membrane,    1533      352       31 *    1.80     1.8E-04     25 *    1.75     1.1E-03
  secreted, surface structures)
All other categories                        3952      747       15      -        -           13      -        -
Total                                       5853      1205      59                           49

Artemis ’class’ definitions
Chaperones                                  30        6         0       0.00     1.0E+00     3 *     12.30    1.2E-03
Membrane/exported/lipoproteins              397       174       15 *    1.76     2.1E-02     12      1.70     5.8E-02
Outer membrane                              81        48        2       0.85     1.0E+00     9 *     4.61     7.2E-05

Computational predictions
PSORTb cytoplasmic                          1975      293       2 *     0.14     1.6E-05     7       0.59     1.2E-01
PSORTb cytoplasmic membrane                 1054      94        11 *    2.39     4.3E-03     5       1.31     5.8E-01
PSORTb extracellular                        37        11        5 *     9.28     8.8E-05     3 *     6.71     8.3E-03
PSORTb outer membrane                       150       71        10 *    2.88     1.6E-03     14 *    4.85     2.2E-07
PSORTb periplasmic                          130       30        5 *     3.40     1.3E-02     2       1.64     3.5E-01
PSORTb unknown                              2507      706       26 *    0.75     2.2E-02     18 *    0.63     1.8E-03
SignalP (0.7 score cutoff)                  1041      340       40 *    2.40     1.1E-10     30 *    2.16     9.7E-07

Felgner et al. PNAS | August 11, 2009 | vol. 106 | no. 32 | 13503


examples illustrate that molecular recognition by the immune system is not stochastic. There is a preference for recognition of surface molecules, chaperones, and pathogenicity-related factors, and a PSORTb prediction of outer membrane or extracellular localization is predictive of antigenicity. These results classifying antigenicity by protein function are not out of line with expectations, but they are far more quantitative and informative than previously reported.

Seventy-four percent of the antigens fall into functional annotation categories that are enriching features, but the other 26% of the reactive antigens are from categories with proteomic features that are underrepresented on the immunoreactive antigen list. We have not yet identified predictive features of these molecules that selectively target them for immune recognition. Some of the molecules with proteomic features that are underrepresented on the hit list may be expressed at high levels in vivo, making them targets for immune recognition independent of their functional category. Although certain protein categories are enriched in the immunoreactive antigen list, there is no category that is entirely reactive. For example, there are 458 proteins on the array in the pathogenicity and outer membrane categories, but only 17% of these molecules are reactive. Part of the explanation for nonreactivity may be that the proteins expressed in the cell-free in vitro expression system are not in the proper conformation to be recognized by the antibodies. For example, we have shown that proteins requiring disulfide bonds for their native conformation may not be seen unless in vitro expression is carried out under oxidizing conditions (4). Lack of posttranslational modification, incomplete folding of proteins expressed in vitro, or unfolding of proteins bound to nitrocellulose are other factors that could result in an underestimate of the total number of reactive antigens determined by this method, but our experience with other agents indicates that this effect is actually marginal (4). At this point, any in silico algorithm for predicting the antigenic profile based on sequence data alone will be imperfect because the majority of proteins predicted are nonreactive. Furthermore, there are categories of reactive molecules with proteomic features that are not enriched in the antigenic profile, so they would not be predicted. The data presented here highlight the need to improve in silico prediction algorithms for antigen discovery and vaccine development.
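The limitation described above can be expressed as the precision and recall of a predictor that simply labels every arrayed protein in the two enriched Artemis ‘colour’ categories as antigenic. The sketch below uses the counts from Table 1 and assumes the 59 cross-reactive and 49 serodiagnostic antigens are disjoint sets; the function name is hypothetical.

```python
from typing import Tuple

# Counts from Table 1 (large chip), for the 'pathogenicity/adaption/chaperones'
# and 'surface' colour categories combined.
PREDICTED_ON_CHIP = 106 + 352          # proteins the category rule would flag
REACTIVE_IN_PREDICTED = 13 + 31 + 11 + 25  # cross-reactive + serodiagnostic hits flagged
ALL_REACTIVE = 59 + 49                 # assumes the two antigen sets do not overlap


def precision_recall(true_positives: int, predicted: int,
                     actual_positives: int) -> Tuple[float, float]:
    """Precision = flagged proteins that are reactive / all flagged proteins.
    Recall = flagged reactive proteins / all reactive proteins."""
    return true_positives / predicted, true_positives / actual_positives


if __name__ == "__main__":
    precision, recall = precision_recall(
        REACTIVE_IN_PREDICTED, PREDICTED_ON_CHIP, ALL_REACTIVE)
    print(f"precision={precision:.1%} recall={recall:.1%}")
```

Under these assumptions the precision corresponds to the roughly 17% of flagged molecules that are reactive, and the recall corresponds to the roughly 74% of reactive antigens that fall in the enriched categories.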

These results provide significant insight into the antigenicity of a bacterial pathogen, while both confirming and overturning assumptions that have previously been made about the nature of immunoreactive proteins. In addition to contributing to a more profound understanding of the humoral immune response to infection, this work could lead to the development of more accurate tests for clinical diagnosis and for worldwide disease surveillance.

Materials and Methods
Protein Microarray Chip Fabrication and Probing Methods. Protein microarray chips consisting of 1,205 B. pseudomallei antigens were fabricated as described previously (SI Text) (3, 4, 7).

SI. Descriptions of the serum samples, immunostrip fabrication and probing, and data analysis methods are in the SI Text.

ACKNOWLEDGMENTS. We thank Eng-Eong Ooi and Jimmy Loh (Defence Medical and Environmental Research Institute, Singapore) for providing serum samples; Allen C. Cheng (University of Melbourne) for sera collection; Mrs. Vanaporn Wuthiekanun and Dr. Narisara Chantratita (Mahidol-Oxford Tropical Medicine Research Unit, Bangkok) for providing samples from Thailand and for ELISA and IHA assays, respectively; Drs. Gavin Koh and Rapeephan Rattanawongnara for collection of clinical data; and Xiaolin Tan for assistance with ANOVA analysis. The primer design and statistical analyses of this work were supported primarily by National Institutes of Health Biomedical Informatics Training Program Grant 5T15LM007743 (to M.K.) and National Science Foundation Grants MRI EIA-0321390 and NSF 0513376 (to P.B.) and in part by National Institutes of Health/National Institute of Allergy and Infectious Diseases Grants U01AI061363 and U54065359 (to P.F.).

1. Mayers C, et al. (2003) Analysis of known bacterial protein vaccine antigens reveals biased physical properties and amino acid composition. Comp Funct Genomics 4:468–478.

2. Rappuoli R (2001) Reverse vaccinology, a genome-based approach to vaccine development. Vaccine 19:2688–2691.

3. Davies DH, et al. (2005) Profiling the humoral immune response to infection by using proteome microarrays: High-throughput vaccine and diagnostic antigen discovery. Proc Natl Acad Sci USA 102:547–552.

4. Davies DH, et al. (2007) Proteome-wide analysis of the serological response to vaccinia and smallpox. Proteomics 7:1678–1686.

5. Barbour AG, et al. (2008) A genome-wide proteome array reveals a limited set of immunogens in natural infections of humans and white-footed mice with Borrelia burgdorferi. Infect Immun 76:3374–3389.

6. Doolan DL, et al. (2009) Profiling humoral immune responses to P. falciparum infection with protein microarrays. Proteomics, in press.

7. Eyles JE, et al. (2007) Immunodominant Francisella tularensis antigens identified using proteome microarray. Proteomics 7:2172–2183.

8. Davies DH, et al. (2008) Antibody profiling by proteome microarray reveals the immunogenicity of the attenuated smallpox vaccine modified vaccinia virus ankara is comparable to that of Dryvax. J Virol 82:652–663.

9. Sette A, et al. (2008) Selective CD4+ T cell help for antibody responses to a large viral pathogen: Deterministic linkage of specificities. Immunity 28:847–858.

10. Benhnia MR, et al. (2008) Redundancy and plasticity of neutralizing antibody responses are cornerstone attributes of the human immune response to the smallpox vaccine. J Virol 82:3751–3768.

11. Dance DA (1991) Melioidosis: The tip of the iceberg? Clin Microbiol Rev 4:52–60.

12. Dance DA (2002) Melioidosis. Curr Opin Infect Dis 15:127–132.

13. Gardy JL, et al. (2005) PSORTb v. 2.0: Expanded prediction of bacterial protein subcellular localization and insights gained from comparative proteome analysis. Bioinformatics 21:617–623.

14. Harding SV, et al. (2007) The identification of surface proteins of Burkholderia pseudomallei. Vaccine 25:2664–2672.

15. Baldi P, Brunak SR (2001) in Bioinformatics: The Machine Learning Approach (MIT Press, Cambridge, MA), 2nd Ed, pp xxi, 452.

16. Vapnik V (1995) in The Nature of Statistical Learning Theory (Springer, New York).

17. Abramoff MD, Magelhaes PJ, Ram SJ (2004) Image Processing with ImageJ. Biophotonics International 11:36–42.

18. Holden MTG, et al. (2004) Genomic plasticity of the causative agent of melioidosis, Burkholderia pseudomallei. Proc Natl Acad Sci USA 101:14240–14245.

19. Warawa J, Woods DE (2005) Type III secretion system cluster 3 is required for maximal virulence of Burkholderia pseudomallei in a hamster infection model. FEMS Microbiol Lett 242:101–108.

20. Stevens MP, et al. (2004) Attenuated virulence and protective efficacy of a Burkholderia pseudomallei bsa type III secretion mutant in murine models of melioidosis. Microbiology 150:2669–2676.

21. Burtnick MN, et al. (2008) Burkholderia pseudomallei type III secretion system mutants exhibit delayed vacuolar escape phenotypes in RAW 264.7 murine macrophages. Infect Immun 76:2991–3000.

22. Essex-Lopresti AE, et al. (2005) A type IV pilin, PilA, contributes to adherence of Burkholderia pseudomallei and virulence in vivo. Infect Immun 73:1260–1264.

23. Tuanyok A, Tom M, Dunbar J, Woods DE (2006) Genome-wide expression analysis of Burkholderia pseudomallei infection in a hamster model of acute melioidosis. Infect Immun 74:5465–5476.

24. Amemiya K, et al. (2007) Detection of the host immune response to Burkholderia mallei heat-shock proteins GroEL and DnaK in a glanders patient and infected mice. Diagn Microbiol Infect Dis 59:137–147.

25. Woo PC, Leung PK, Wong SS, Ho PL, Yuen KY (2001) groEL encodes a highly antigenic protein in Burkholderia pseudomallei. Clin Diagn Lab Immunol 8:832–836.

26. Sanchez-Campillo M, et al. (1999) Identification of immunoreactive proteins of Chlamydia trachomatis by western blot analysis of a two-dimensional electrophoresis map with patient sera. Electrophoresis 20:2269–2279.

27. Lemos JA, Giambiagi-Demarval M, Castro AC (1998) Expression of heat-shock proteins in Streptococcus pyogenes and their immunoreactivity with sera from patients with streptococcal diseases. J Med Microbiol 47:711–715.

28. Hinode D, Grenier D, Mayrand D (1995) Purification and characterization of a DnaK-like and a GroEL-like protein from Porphyromonas gingivalis. Anaerobe 1:283–290.

29. Sundaresh S, et al. (2007) From protein microarrays to diagnostic antigen discovery: a study of the pathogen Francisella tularensis. Bioinformatics 23:i508–i518.

30. Glass MB, et al. (2006) Pneumonia and septicemia caused by Burkholderia thailandensis in the United States. J Clin Microbiol 44:4601–4604.

31. Yu Y, et al. (2006) Genomic patterns of pathogen evolution revealed by comparison of Burkholderia pseudomallei, the causative agent of melioidosis, to avirulent Burkholderia thailandensis. BMC Microbiol 6:46.
