Identification and Classification of bcl Genes and Proteins ...Identification and Classification...

10
APPLIED AND ENVIRONMENTAL MICROBIOLOGY, Nov. 2009, p. 7163–7172 Vol. 75, No. 22 0099-2240/09/$12.00 doi:10.1128/AEM.01069-09 Copyright © 2009, American Society for Microbiology. All Rights Reserved. Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application in Bacillus anthracis Detection and Fingerprinting Tomasz A. Leski, 1 Clayton C. Caswell, 2 ‡ Marcin Pawlowski, 5 David J. Klinke, 2,3 Janusz M. Bujnicki, 5,6 Sean J. Hart, 7 and Slawomir Lukomski 2,4 * Nova Research, Inc., Alexandria, Virginia 22308 1 ; Department of Microbiology, Immunology, and Cell Biology, 2 Department of Chemical Engineering, 3 and Mary Babb Randolph Cancer Center, 4 West Virginia University, Morgantown, West Virginia 26506; Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, 02-109 Warsaw, 5 and Bioinformatics Laboratory, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, 61-614 Poznan, 6 Poland; and Bio/Analytical Chemistry, Chemistry Division, Code 6112, Naval Research Laboratory, Washington, DC 20375 7 Received 9 May 2009/Accepted 10 September 2009 The Bacillus cereus group includes three closely related species, B. anthracis, B. cereus, and B. thuringiensis, which form a highly homogeneous subdivision of the genus Bacillus. One of these species, B. anthracis, has been identified as one of the most probable bacterial biowarfare agents. Here, we evaluate the sequence and length polymorphisms of the Bacillus collagen-like protein bcl genes as a basis for B. anthracis detection and finger- printing. Five genes, designated bclA to bclE, are present in B. anthracis strains. Examination of bclABCDE sequences identified polymorphisms in bclB alleles of the B. cereus group organisms. These sequence polymor- phisms allowed specific detection of B. anthracis strains by PCR using both genomic DNA and purified Bacillus spores in reactions. By exploiting the length variation of the bcl alleles it was demonstrated that the combined bclABCDE PCR products generate markedly different fingerprints for the B. anthracis Ames and Sterne strains. Moreover, we predict that bclABCDE length polymorphism creates unique signatures for B. anthracis strains, which facilitates identification of strains with specificity and confidence. Thus, we present a new diagnostic concept for B. anthracis detection and fingerprinting, which can be used alone or in combination with previously established typing platforms. The Bacillus cereus group includes three closely related spe- cies, B. anthracis, B. cereus, and B. thuringiensis, as well as the more distantly related species B. mycoides and B. weihen- stephanensis. These gram-positive, spore-forming bacteria form a highly homogeneous subdivision of the genus Bacillus, which also contains several other organisms belonging to the B. sub- tilis group. The importance and public awareness of B. cereus group organisms are associated with their distinct phenotypes and pathological effects. B. anthracis is the causative agent of anthrax, a disease that affects humans and animals worldwide and has also been developed as a biological warfare agent (17, 25). B. cereus is an opportunistic human pathogen which is responsible mainly for gastrointestinal illnesses resulting from food contamination (9), whereas B. thuringiensis is an insect pathogen whose toxin is a biological pesticide widely used in global agriculture (38). The systematics of the members of the B. cereus group poses significant challenges due to very high level of chromosomal synteny and protein identity (33). In- tense efforts have focused on overcoming these challenges, and there has been a particular focus on developing methods for specific detection of B. anthracis and for differentiating among strains of these closely related organisms. Biodefense and forensic needs prompted large-scale se- quencing of multiple bacillus genomes in a search for polymor- phic sites for use in typing procedures (33). One type of poly- morphism involves variation in the number of repeating nucleotide units that are referred to as variable-number tan- dem repeats (VNTRs). The resulting variation in the length and mass of the PCR products of these units can be demon- strated by gel and capillary electrophoresis (20), mass spec- trometry (29), or microchannel fluidics (30). To date, several different VNTRs have been identified and tested. For example, Keim et al. studied the genetic relationship among a large collection of B. anthracis isolates based on the VNTRs found in the vrr genes (19, 20). Using a similar approach, Valjevac et al. used VNTRs of Bcms loci as markers to assess the phylogeny of members of the B. cereus group (46). Finally, length varia- tion of the collagen-like (CL) region of the bclA gene was employed to differentiate among B. anthracis strains (6, 42). The CL sequences, which are composed of Gly-Xaa-Yaa (i.e., a glycine followed by two additional residues; GXY) re- peats, have been identified in silico in more than 100 prokary- otic proteins (34). Recent studies demonstrated that some bacterial CL proteins (CLPs), such as streptococcal protein Scl and BclA, can form the collagen triple helix (4, 14, 48). Bac- terial CLPs are typically surface exposed and are found in * Corresponding author. Mailing address: Department of Microbi- ology, Immunology, and Cell Biology, West Virginia University HSC, 2095 Health Sciences North, Box 9177, Morgantown, WV 26506. Phone: (304) 293-6405. Fax: (304) 293-7823. E-mail: slukomski@hsc .wvu.edu. ‡ Present address: Department of Microbiology and Immunology, Brody School of Medicine, East Carolina University, Greenville, NC. † Supplemental material for this article may be found at http://aem .asm.org/. Published ahead of print on 18 September 2009. 7163 on September 5, 2020 by guest http://aem.asm.org/ Downloaded from

Transcript of Identification and Classification of bcl Genes and Proteins ...Identification and Classification...

Page 1: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

APPLIED AND ENVIRONMENTAL MICROBIOLOGY, Nov. 2009, p. 7163–7172 Vol. 75, No. 220099-2240/09/$12.00 doi:10.1128/AEM.01069-09Copyright © 2009, American Society for Microbiology. All Rights Reserved.

Identification and Classification of bcl Genes and Proteins of Bacillus cereusGroup Organisms and Their Application in Bacillus anthracis

Detection and Fingerprinting�†Tomasz A. Leski,1 Clayton C. Caswell,2‡ Marcin Pawlowski,5 David J. Klinke,2,3 Janusz M. Bujnicki,5,6

Sean J. Hart,7 and Slawomir Lukomski2,4*Nova Research, Inc., Alexandria, Virginia 223081; Department of Microbiology, Immunology, and Cell Biology,2 Department ofChemical Engineering,3 and Mary Babb Randolph Cancer Center,4 West Virginia University, Morgantown, West Virginia 26506;

Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology,02-109 Warsaw,5 and Bioinformatics Laboratory, Institute of Molecular Biology and Biotechnology, Faculty of

Biology, Adam Mickiewicz University, 61-614 Poznan,6 Poland; and Bio/Analytical Chemistry,Chemistry Division, Code 6112, Naval Research Laboratory, Washington, DC 203757

Received 9 May 2009/Accepted 10 September 2009

The Bacillus cereus group includes three closely related species, B. anthracis, B. cereus, and B. thuringiensis,which form a highly homogeneous subdivision of the genus Bacillus. One of these species, B. anthracis, has beenidentified as one of the most probable bacterial biowarfare agents. Here, we evaluate the sequence and lengthpolymorphisms of the Bacillus collagen-like protein bcl genes as a basis for B. anthracis detection and finger-printing. Five genes, designated bclA to bclE, are present in B. anthracis strains. Examination of bclABCDEsequences identified polymorphisms in bclB alleles of the B. cereus group organisms. These sequence polymor-phisms allowed specific detection of B. anthracis strains by PCR using both genomic DNA and purified Bacillusspores in reactions. By exploiting the length variation of the bcl alleles it was demonstrated that the combinedbclABCDE PCR products generate markedly different fingerprints for the B. anthracis Ames and Sterne strains.Moreover, we predict that bclABCDE length polymorphism creates unique signatures for B. anthracis strains,which facilitates identification of strains with specificity and confidence. Thus, we present a new diagnosticconcept for B. anthracis detection and fingerprinting, which can be used alone or in combination withpreviously established typing platforms.

The Bacillus cereus group includes three closely related spe-cies, B. anthracis, B. cereus, and B. thuringiensis, as well as themore distantly related species B. mycoides and B. weihen-stephanensis. These gram-positive, spore-forming bacteria form ahighly homogeneous subdivision of the genus Bacillus, whichalso contains several other organisms belonging to the B. sub-tilis group. The importance and public awareness of B. cereusgroup organisms are associated with their distinct phenotypesand pathological effects. B. anthracis is the causative agent ofanthrax, a disease that affects humans and animals worldwideand has also been developed as a biological warfare agent (17,25). B. cereus is an opportunistic human pathogen which isresponsible mainly for gastrointestinal illnesses resulting fromfood contamination (9), whereas B. thuringiensis is an insectpathogen whose toxin is a biological pesticide widely used inglobal agriculture (38). The systematics of the members of theB. cereus group poses significant challenges due to very highlevel of chromosomal synteny and protein identity (33). In-

tense efforts have focused on overcoming these challenges, andthere has been a particular focus on developing methods forspecific detection of B. anthracis and for differentiating amongstrains of these closely related organisms.

Biodefense and forensic needs prompted large-scale se-quencing of multiple bacillus genomes in a search for polymor-phic sites for use in typing procedures (33). One type of poly-morphism involves variation in the number of repeatingnucleotide units that are referred to as variable-number tan-dem repeats (VNTRs). The resulting variation in the lengthand mass of the PCR products of these units can be demon-strated by gel and capillary electrophoresis (20), mass spec-trometry (29), or microchannel fluidics (30). To date, severaldifferent VNTRs have been identified and tested. For example,Keim et al. studied the genetic relationship among a largecollection of B. anthracis isolates based on the VNTRs found inthe vrr genes (19, 20). Using a similar approach, Valjevac et al.used VNTRs of Bcms loci as markers to assess the phylogenyof members of the B. cereus group (46). Finally, length varia-tion of the collagen-like (CL) region of the bclA gene wasemployed to differentiate among B. anthracis strains (6, 42).

The CL sequences, which are composed of Gly-Xaa-Yaa(i.e., a glycine followed by two additional residues; GXY) re-peats, have been identified in silico in more than 100 prokary-otic proteins (34). Recent studies demonstrated that somebacterial CL proteins (CLPs), such as streptococcal protein Scland BclA, can form the collagen triple helix (4, 14, 48). Bac-terial CLPs are typically surface exposed and are found in

* Corresponding author. Mailing address: Department of Microbi-ology, Immunology, and Cell Biology, West Virginia University HSC,2095 Health Sciences North, Box 9177, Morgantown, WV 26506.Phone: (304) 293-6405. Fax: (304) 293-7823. E-mail: [email protected].

‡ Present address: Department of Microbiology and Immunology,Brody School of Medicine, East Carolina University, Greenville, NC.

† Supplemental material for this article may be found at http://aem.asm.org/.

� Published ahead of print on 18 September 2009.

7163

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from

Page 2: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

microorganisms pathogenic to humans and animals. BclA (Ba-cillus CLP of B. anthracis) is a major spore surface protein (41)and is found in all members of the B. cereus group (6; thisstudy). A second CLP, designated BclB (47), was identified asa component of the B. anthracis exosporium; however, its dis-tribution and structural properties have not been well charac-terized. Likewise, two closely related proteins, ExsH and ExsJ,contain GXY CL repeats and are presumably located in theexosporium of Bacillus strains (45).

In this work we investigated in silico the occurrence anddistribution of the bcl genes, presumably encoding CLPs, in allmembers of the B. cereus group. A new classification of theresulting Bcl protein variants is proposed based on the domaincomposition and folding of these proteins. As many as 10 bclgenes were found in a single B. cereus strain. Five genes wereconsistently observed in B. anthracis strains and designatedbclA to bclE. We further analyzed sequence polymorphismsamong these bcl genes and assessed use of them for B. anthra-cis detection and strain fingerprinting. Representative mem-bers of the B. cereus group and less closely related controlbacilli were used to demonstrate specific bclB gene-based de-tection of B. anthracis spores. Finally, a combination of exper-iments and mathematical modeling was used to demonstratehow combined use of the bclABCDE sequence polymorphismscan be a powerful tool for strain fingerprinting in biodefenseand forensic applications.

MATERIALS AND METHODS

Bioinformatic analyses. Sequence searches for collagen homologs were car-ried out using PSI-BLAST (1) and the NCBI nonredundant database. Severalindependent searches were performed using representative sequences of mem-bers of the collagen family (PF01391) in the PFAM database (2). The gappedBLAST algorithm (blastpgp) was used with default parameters (BLOSUM62substitution matrix; gap open penalty, 11; gap extension penalty, 1; number ofiterations, up to 5; expectation value threshold, 0.0001). CLANS (cluster analysisof sequences) (12) was used to identify (sub)families of closely related sequencesand to visualize similarities within and between Bcl proteins. CLANS is a Javautility based on the Fruchterman-Reingold graph layout algorithm, which usesthe P values of high-scoring segment pairs obtained from an N�N BLAST searchto compute attractive and repulsive forces between each pair of sequences in auser-defined data set. A two-dimensional representation was obtained by seedingsequences randomly in the arbitrary distance space. The sequences were thenmoved within this environment according to the force vectors resulting from allpairwise interactions, and the process was repeated to convergence. Groups ofsequences (i.e., clans) were extracted from the CLANS output. A multiple-sequence alignment of the retrieved sequences was constructed using MAFFT(18) and optimized manually. In addition, representatives of each clan wereanalyzed using GeneSlico MetaServer (22), a gateway to a variety of computa-tional methods for protein structure prediction, including sequence comparisonsand secondary structure prediction, as well as tertiary fold recognition. Finally,the fold recognition methods were compared, evaluated, and ranked by thePCONS server (28) to identify the preferred modeling templates and the con-sensus alignment.

Bacterial strains. The strains used in this study are listed in Table 1. B.anthracis avirulent strain Sterne lacking the pXO2 plasmid was obtained fromthe Colorado Serum Company, Denver, CO. Genomic DNA of B. anthracisstrain Ames was kindly provided by B. Lin of the Center for Bio/MolecularScience and Engineering at the Naval Research Laboratory, Washington, DC.Most Bacillus strains were obtained from the American Type Culture Collection,Manassas, VA.

Sporulation and spore preparation. Spores were prepared as described pre-viously (15). Briefly, Bacillus strains were grown overnight on Trypticase soy agarat 30°C. Colonies of each strain were suspended in phosphate-buffered saline(pH 7.0) and plated on the following sporulation agar media: Schaeffer medium(37) for B. mycoides, 2� SG (24) for B. anthracis and B. cereus, and NSM (32) forB. thuringiensis. The plates were incubated at 37°C, except for the B. mycoides

plates, which was incubated at 30°C. The sporulation process was monitoredusing phase-contrast microscopy. Spores were collected when the cultures con-tained �95% phase-bright spores, typically after 4 days, and were suspended in2 ml of sterile ice-cold MilliQ water. The suspensions were centrifuged at4,000 � g for 5 min at 4°C, and the resulting spore pellets were resuspended infresh water. Washing was repeated four more times to remove the remainingdebris, and spore suspensions were stored at 4°C. The spore concentration wastested by plating spores on growth media. The purity of spore preparations wasevaluated by phase-contrast microscopy with oil immersion using a Nikon Op-tiphot-2 optical microscope (Nikon Inc., Melville, NY) equipped with a Plan�100 objective. For PCR amplification, aliquots of spore preparations werediluted in water to obtain the desired concentrations and used without anyfurther processing.

DNA isolation and purification. Bacteria were grown overnight in Trypticasesoy broth at 30°C. To isolate genomic DNA, 0.5-ml cultures were used, and DNAwas extracted and purified using an IT 1-2-3 R.A.P.I.D. DNA purification kitaccording to the manufacturer’s recommendations (Idaho Technology, Inc., SaltLake City, UT). A bead-beating step was conducted with a BIO101/FastPrepFP120 homogenizer (Thermo Fisher Scientific, Inc., Waltham, MA) for 45 s atspeed 5.5.

PCR amplification. PCR amplification was performed using a DNA EngineTetrad 2 (MJ Research, Inc., Waltham, MA) and the following cycling protocol:initial denaturation at 94°C for 1 min, followed by 31 cycles of denaturation at94°C for 45 s, annealing at 59°C for 45 s, and elongation at 72°C for 1 min 45 sand then a final extension step consisting of 5 min at 72°C. Bacillus total DNAtemplates were used at a final concentration of approximately 15 ng/�l of reac-tion buffer (10 mM Tris-HCl, 1.5 mM MgCl2, 50 mM KCl; pH 8.3). For PCRamplification with spores, approximately 104 spores per reaction mixture wereused. For multiplex PCR, the same cycling protocol was employed with a tem-perature gradient from 50 to 65°C for primer annealing and an Mg2� concen-tration range of 1.5 to 6.5 mM. Primers employed for PCR amplification arelisted in Table 2. The PCR products were analyzed on 2% Invitrogen ultrapureagarose (Invitrogen Corp., Carlsbad, CA) containing 1 �g/ml ethidium bromide.Electrophoresis was carried out in 1� Tris-acetate-EDTA buffer at 95 V for 2 h.The DNA size standard used was the 2-log DNA ladder (New England BioLabs,Inc., Beverly, MA). Gel images were captured using the UVP Bio-Doc-It system(UVP LLC, Upland, CA) and were processed using iPhoto ’08 v.7.1.4 (AppleInc., Cupertino, CA) and Canvas 9 (ACD Systems of America, Miami, FL).

Classification methods. We proposed that the bclABCDE gene products canbe used to classify B. anthracis samples as different strains with confidence. Theaccuracy of a PCR assay for classifying samples correctly based on the gelmigration patterns of the bclABCDE gene products was estimated using boot-strap resampling (10). P values were calculated from the resulting probabilitydensity functions using a nonparametric approach (13), as described in Fig. S1 inthe supplemental material. Bootstrap resampling was used to create syntheticreplicates of the bclABCDE gene products for the eight different B. anthracisstrains studied. For each pairwise comparison, the lowest level of confidence (i.e.,highest P value) was used to label a dendrogram representation of the differences

TABLE 1. Bacterial strains

Species or serovar Strain Source

B. anthracis Sterne 34F2 Colorado SerumCompanya

B. anthracis (DNA) Ames NRLb

B. cereus ATCC 4342 ATCCc

B. cereus ATCC 14579 (type strain) ATCCB. cereus ATCC 13061 ATCCB. thuringiensis serovar

kurstakiATCC 33679 ATCC

B. thuringiensis serovarkurstaki

ATCC 33680 ATCC

B. subtilis ATCC 6051 (type strain) ATCCB. subtilis ATCC 21332 ATCCB. subtilis ATCC 31028 ATCCB. megaterium ATCC 14581 (type strain) ATCCB. mycoides ATCC 6462 (type strain) ATCC

a Colorado Serum Company, Denver, CO.b NRL, Naval Research Laboratory, Washington, DC.c ATCC, American Type Culture Collection, Manassas, VA.

7164 LESKI ET AL. APPL. ENVIRON. MICROBIOL.

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from

Page 3: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

between strains. A P value corresponded to the likelihood that a sample drawnfrom one population was misclassified as a member of another population. A Pvalue of 0.05 was considered significant.

RESULTS

Identification of Bacillus CLPs. First, the Bacillus CLPs wereidentified using a bioinformatic approach. The PFAM family ofcollagen sequences (http://pfam.sanger.ac.uk/family?PF01391)contains 9,744 sequences of CL domains, defined as a regionconsisting of 60 amino acids, and often individual CL proteinsharbor more than one collagenous domain. The BLASTCLUSTprogram was used to identify 37 sequences with �55% se-quence similarity that represent best the PF01391 family.These sequences were used as queries in independent PSI-BLAST searches of a nonredundant database, which led toidentification of 4,214 full-length proteins with CL sequences.From this set of identified proteins, a subset of 236 sequenceswas extracted that were annotated as derived from the genusBacillus, and a list of putative Bcl proteins was created. Thishigh number of various Bcl proteins identified was not ex-

pected given the high level of chromosomal synteny and pro-tein similarity among the members of the B. cereus group (33).

Classification of the Bcl proteins. Next, we performed com-putational analyses to understand relationships between Bclproteins. The noncollagen regions of Bcl sequences were clus-tered based on their pairwise BLAST similarity scores, usingCLANS (12). We have experimentally found that for thisgroup of sequences a P value threshold of 10�6 produces thebest qualitative results. Lower P values resulted in disconnec-tion of the most divergent sequences, while higher values re-sulted in overcompacting of the whole data set into a singleclan with only a few outliers. CLANS identified 10 main sub-families (clans) for the Bcl proteins (Fig. 1). A total of 171 Bclproteins were clustered into one of these clans, while 65 of theBcl proteins were not classified. In contrast, efforts to clusterthe Bcl proteins based on comparison of their CL regions wereinconclusive.

Structural organization of the Bcl proteins. Prediction ofthe detailed protein structure was performed using the Gene-Silico MetaServer (22) for all Bcl proteins grouped into clans1 to 10. We focused on (i) primary structure (e.g., domainprediction and identification), (ii) secondary structure (e.g.,helices, strands, loops, transmembrane helices, and disorderedregions), and (iii) fold recognition. In addition, groups of full-length sequences were extracted that formed clusters in theCLANS output, and multiple-sequence alignments were con-structed for detection of the structural organization of Bclsequences (Table 3). In summary, the domain architecture ofBcl proteins comprised (i) a short N-terminal region (N re-gion) that occurs as 1 of 11 variants, (ii) a linker region (Lregion) that contains five conserved helices and is present insome Bcls, (iii) a highly variable CL region that is composed of9 to 386 GXY triplets, and (iv) a C-terminal domain (CTD)that occurs in one of six folds that include the three knownfolds (a cupredoxin-like fold, a tumor necrosis factor/C1q-likefold, and a seven-blade beta-propeller fold).

Distribution of Bcl proteins among Bacillus species. There isno strict correlation between the distribution of a given clan’smembers and Bacillus species, which suggests that horizontalgene transfers occurred many times during the evolution of Bclproteins (see Fig. S2 in the supplemental material). Neverthe-

FIG. 1. Two-dimensional projection of CLANS. Clustering resultswere obtained for the noncollagen regions of Bcl proteins. Connec-tions between points represent the degree of pairwise sequence simi-larity, as quantified by BLAST P values (the darker the line, the higherthe level of similarity).

TABLE 2. Primers used for PCR amplification ofbclABCDE sequences

PCR target Primer Sequence (5�–3�)

bclA genea bclA F1 GAATCTTTATCAGCTAGTGCATTTG

bclA R1 AAGCAACTTTTTCAATAATAATGGATG

bclB genea bclB F1 GGCCCAGAAAATATTGGACCTAC

bclB R1 ATTAGACGATATTAAGACCTGCGC

bclC genea bclC F1 CCATGCTTTCCAAGTAGCGCTGG

bclC R1 ATTAAGCGATTCTAAATACAGTTAG

bclD genea bclD F1 TGTAATAATCAAAATGGTGTACATC

bclD R1 ATCAACTTAACCTTATTATCGTTAAC

bclE genea bclE F1 AGTCCATTAAATTCTAATTTCAAGAT

bclE R1 ATTAACTCAATCTAATAATCGTTAAAG

bclA CL regionb bclA F2 ATGAATCTTTATCAGCTAGTGCbclA R3 AAATGCATATAGTCCTGCTGG

bclB CL regionb bclB F2 AGGCCCAGAAAATATTGGACbclB R2 AGGCTGACTAAATCCAAATCC

bclC CL regionb bclC F3 AATAGCAGCATTACAAACCGCbclC R3 CTGTATCTGTGCCGTTGAAG

bclD CL regionb bclD F2 GGTGTACATGTTGATTCATGCbclD R2 TGCATACGTCGCCGTAATAC

bclE CL regionb bclE F3 CCTACTTTTCCTCCAGTTCCbclE R4 GTACGTTGCAGTTGTACTAAC

bclB of B.anthracisc

bclB F2 AGGCCCAGAAAATATTGGACbclB R4 GAGTTCCTCCCACACCTGG

Non-B. anthracisbclB

bclB F3 CCAAGTRATGCAAACAGATTAG

bclB R5 TACTWGTTCCACCHGTTAA

a The primers amplify bclABCDE genes.b The primers amplify mainly the fragment encoding the CL regions of

bclABCDE.c The primers specifically amplify the bclB gene of B. anthracis.

VOL. 75, 2009 bcl GENE POLYMORPHISM 7165

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from

Page 4: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

less, clans 1, 2b, and 3 are the clans most characteristic of B.cereus. An exception is the BclB variant that forms clan 2a,which has been found only in B. anthracis. Based on the dis-tribution pattern for members of the clans mentioned above, itis possible to distinguish B. anthracis from other Bacillus spe-cies. However, distinguishing between B. cereus and B. thurin-giensis is difficult. To facilitate the classification of Bcl proteins,as well as strain identification, we developed a simple webserver that is available at http://kudlaty.genesilico.pl/bcl/index.py. The query sequence submitted by a user is compared tohidden Markov model profiles of Bcl subfamilies with theHHsearch method (39). The ranking of “hits” against all of theclans characterized here is displayed, and the tentative classi-fication of the query sequence is indicated graphically. Thiswebsite also provides multiple-sequence alignments of mem-bers for each clan studied here. We believe that this resourcewill be useful for researchers interested in Bacillus organisms.

Detection and characterization of the bclABCDE genes in B.anthracis: proof of principle. Five variable bcl genes have beenfound in the genome of B. anthracis strain Sterne (Fig. 2). Inaddition to the previously reported bclA (41) and bclB (47)genes, both of which encode exosporial proteins, three addi-tional genes that encode presumed CLPs were identified.These genes were, by convention, designated bclC, bclD, andbclE consecutively in order of their clockwise localizationaround the chromosome (Fig. 2A). The coding sequences ofthe bclABC genes are located on a plus strand of the chromo-some, while the coding sequences of bclD and bclE are locatedon a minus strand. Primers were designed to amplify each bclgene using chromosomal DNA of the Sterne strain of B. an-thracis as the template (Table 2). As expected, PCR amplifi-cations yielded single-band products of the predicted lengthsfor each bcl gene (Fig. 2B). The largest amplified fragment

from the Sterne strain of B. anthracis was that of the bclE gene,which was approximately 1.9 kb long. bclD, the smallest of theCLP-encoding genes, yielded an �0.9-kb amplified fragment.The remaining genes, bclA, bclB, and bclC were amplified as1.2-, 1.0-, and 1.4-kb fragments, respectively.

Two additional CLPs were identified in the B. anthracisgenomes analyzed, and both of them contained short CL re-gions. The first protein, designated BclF (locus BAS3290),belongs to clan 10 and contains 12 GXY repeats interruptedby two 2-amino-acid insertions. The second protein, BclG(BA2449), belongs to clan 6b and contains nine GXY triplets.The apparent lack of length variation in the CL regions ofthese proteins among B. anthracis strains differed from thevariation in BclA to BclE, and therefore, they were not in-cluded in the subsequent analyses. Nevertheless, there is sig-nificant length variation in the CL regions of both the BclF (upto 43 GXY repeats) and BclG (up to 66 GXY repeats) proteinsin other members of the B. cereus group (Table 3), which couldbe used in typing of these organisms.

The bclABCDE genes were all found in the chromosomes ineight complete genomes of B. anthracis strains (Sterne, Ames,Australia 94, CNEVA-9066, A1055, Vollum, USA6153, andKruger) and were characterized by significant length variation,especially in their CL regions (Fig. 2C). Each of the bcl genespotentially encodes a protein with an N region composed of 25to 41 amino acids. The length of the central CL region in BclA,BclB, BclC, BclD, and BclE varies significantly, ranging from18 to 594 amino acids (6 to 198 GXY repeats). The BclCprotein is unique in that it contains the 132-amino-acid Lregion between the N and CL regions. The lengths of theCTDs in BclA, BclB, BclC, BclD, and BclE ranged from 130 to162 amino acids. In summary, the genomes of B. anthracisstrains contained five distinct bcl open reading frames encod-

TABLE 3. Architecture of the Bcl proteins

ClanNo. of amino acid residues in Bcl region or domain (predicted structure) No. of

BclsbN region L region CL region CTDa

1 33 (disordered) None 145–228 87 or 88 (cupredoxin-like fold) 132a 33 (disordered) None 168–222 162 (unknown fold, 11 strands, and three helices) 62b 30 (disordered) 122 (five helices and

unknown fold)111–129 161 or 162 (unknown fold, 11 strands, and three helices) 11

3 15 or 21 (variable,disordered)

None 57–335 162 or 163 (unknown fold and 11 strands) 7

4a 25 (coiled coils) 126 (five helices andunknown fold)

33–447 131 or 132 (complement component C1q domain TNF-like fold; 26% sequence identity with PDB 1wck A)

13

4b 25 (coiled coils) 132 (five helices andunknown fold)

192–999 127–130 (complement component C1q domain TNF-likefold; 23% sequence identity with PDB 1wck A)

20

5 25 (disordered) 134 (five helices andunknown fold)

66–225 301 (WD40 domain seven-blade beta-propeller fold; 45%sequence identity with PDB 1l0q A)

5

6a 36–41 (variable) None 78–1,158 135 (complement component C1q domain TNF-like fold;22% sequence identity with PDB 1wck A)

45

6b 8 or 29 (variable) None 27–198 136–141 (complement component C1q domain TNF-likefold; 22% sequence identity with PDB 1wck A)

9

7 40 (disordered) None 114–144 132 (complement component C1q domain TNF-like fold;97% sequence identity with PDB 1wck A)

15

8 32–38 (disordered) None 96–105 161–165 (unknown fold, two helices, and nine strands) 49 19 or 21 (variable) None 417–675 133 (complement component C1q domain TNF-like fold;

17% sequence identity with PDB 1wck A)14

10 32 (disordered) None 45–129 134 (complement component C1q domain TNF-like fold;17% sequence identity with PDB 1pk6 A)

9

a The PDB codes indicate the best predicted templates.b Numbers of Bcl proteins that cluster in the clans.

7166 LESKI ET AL. APPL. ENVIRON. MICROBIOL.

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from

Page 5: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

ing CLPs; however, significant size variation was observed forbcl alleles of different B. anthracis strains.

bclB gene-based detection of B. anthracis. The bclABCDEgenes were found in the genomes of other members of the B.cereus group, including B. cereus and B. thuringiensis, as well asin the genomes of B. mycoides and B. weihenstephanensis (seeFig. S2 in the supplemental material). However, although theCTD regions of all 25 available BclB sequences (clan 2) werehighly conserved among the members of the B. cereus group,the BclB N regions in all 14 B. anthracis strains (clan 2a) varied

significantly from 11 BclB sequences of B. cereus and B. thu-ringiensis strains (clan 2b). In addition, members of clan 2b alsocontain the L region composed of 122 amino acids, which is notpresent in BclB from clan 2a strains (Table 3). PCR primerswere designed (Table 2) based on nucleotide sequence align-ments of the 5� ends of the bclB genes (see Fig. S3 in thesupplemental material) to test the hypothesis that bclB-basedamplification can be used to specifically detect DNA of B.anthracis (Fig. 3). As predicted, PCR amplification with prim-ers bclB F2 and bclB R4 using DNA templates from B. anthra-

FIG. 2. Identification, detection, and characterization of the bcl genes and Bcl proteins of B. anthracis. Five genes designated bclA to bclE wereidentified in the genomes of B. anthracis strains. (A) Graphic representation of the B. anthracis strain Sterne chromosome showing the locations andorientations of the bclABCDE genes. Basic characteristics are shown in the table. (B) Detection of the bclABCDE genes in the genome of B. anthracisSterne. The bclABCDE genes were PCR amplified, and resulting products were analyzed in a 2% agarose gel. Lanes M, 2-log DNA ladder. (C) Schematicrepresentation (not to scale) and characterization of the BclABCDE proteins. The data are based on sequences obtained from the genomes of eight B.anthracis strains (Sterne, Ames, Australia 94, CNEVA-9066, A1055, Vollum, USA 6153, and Kruger). Unless indicated otherwise, Bcl protein regionswere arbitrarily designated the N region (or amino-terminal region), L region, CL region, and CTD (or carboxyl-terminal region) (4). The table showsthe ranges of the numbers of amino acids in the different regions of the BclABCDE proteins.

FIG. 3. Specific detection of B. anthracis by PCR of the bclB gene. (A) Schematic representation (not to scale) of two main BclB variants ofclan 2a and 2b proteins. Primers bclB F2 and bclB R4 (Table 2) were designed to differentiate between the bclB allele present in B. anthracis strainsfrom the alleles present in other members of the B. cereus group, which are detected with primers bclB F3 and bclB R5. (B) PCR amplificationusing B. anthracis-specific primers. PCR products were analyzed by 2% agarose gel electrophoresis. The DNA templates used for PCR weregenomic DNAs from B. anthracis (Ba) Sterne and Ames; B. cereus (Bc) strains ATCC 14579, ATCC 4342, and ATCC 13061; B. thuringiensis (Bt)strains ATCC 33679 and ATCC 33680; B. subtilis (Bs) strains ATCC 6051, ATCC 21332, and ATCC 31028; B. mycoides (Bmy) strain ATCC 6462;and B. megaterium (Bme) strain ATCC 14581. Lane M, 2-log DNA ladder.

VOL. 75, 2009 bcl GENE POLYMORPHISM 7167

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from

Page 6: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

cis strains Sterne and Ames yielded single products of theexpected sizes, 645 bp and 699 bp, respectively, that werededuced from sequence data. Conversely, none of the PCRsthat used as templates DNA from three B. cereus strains, twoB. thuringiensis strains, and one B. mycoides strain resulted inamplification of the bclB genes of these strains. As expected,PCR was negative for DNA templates from the control strainsof B. subtilis (n � 3) and B. megaterium (n � 1), which do notharbor the bclB gene. Altogether, our bioinformatic analysesof the bclB sequences that were obtained from 25 distinctmembers of the B. cereus group led to results indicating that

bclB polymorphisms can provide quick and specific detectionof anthrax etiology.

Next, we performed PCR amplification using intact spores toassess the feasibility of a bclB-based method to detect B. an-thracis (Fig. 4). B. anthracis spores, as well as control spores ofB. cereus, B. thuringiensis, and B. mycoides, were prepared, andthe purity of each preparation was evaluated with a light mi-croscope (data not shown). Equal amounts of spores from theBacillus strains were added to PCR mixtures, and amplificationwas carried out either with primers that were specific for thebclB gene of B. anthracis (bclB F2 and bclB R4) or with controlprimers specific for the bclB gene of non-B. anthracis organ-isms (bclB F3 and bclB R5) belonging to the B. cereus group(Table 2). PCR amplification using B. anthracis-specific prim-ers and �104 spores of B. anthracis Sterne in the reactionmixture yielded the expected DNA product, while PCR ampli-fication using spores of B. cereus, B. thuringiensis, and B. my-coides did not (Fig. 4, upper panel). Importantly, all of thelatter spores yielded DNA products in control reactions withthe non-B. anthracis-specific primers, whereas B. anthracisspores did not (Fig. 4, bottom panel). These data demonstratethat amplification of the bclB gene can specifically differentiateB. anthracis spores from spores of other members of the B.cereus group; however, experiments using a large panel ofspores obtained from various Bacillus strains are necessary tovalidate bclB-based detection performed directly in the field.

Fingerprinting of B. anthracis strains based on bclABCDElength polymorphism. Significant sequence length polymor-phism was observed in the CL regions of BclA to BclE. Thevariability in the length of the CL region encoded by variousbclA alleles of several B. anthracis strains has previously beenused for strain differentiation (6, 42). Here, we significantlyimproved the discriminatory power by simultaneous analysis ofthe lengths of the CL regions of all five bcl genes examined(bclABCDE). First, PCR amplifications were performed withgenomic DNAs from B. anthracis strains Sterne and Amesusing primers flanking the bclABCDE CL regions (Table 2).PCR products were found in 2% agarose gels as single DNA

FIG. 4. Amplification of the bclB gene using Bacillus sp. spores.Spores were prepared from B. anthracis Sterne (Ba), B. cereus ATCC13061 (Bc), B. thuringiensis ATCC 33679 (Bt), and B. mycoides ATCC6462 (Bm). Specific detection of B. anthracis spores (�104 spores perreaction) by PCR was performed with B. anthracis-specific bclB prim-ers bclB F2 and bclB R4 (upper panel). A control positive PCR am-plification was performed with primers bclB F3 and bclB R5, whichamplify the bclB gene from the non-B. anthracis Bacillus species B.cereus, B. thuringiensis, and B. mycoides (lower panel). PCR productswere analyzed by 2% agarose gel electrophoresis. Lane M, 2-log DNAladder.

FIG. 5. bcl-based fingerprinting of B. anthracis strains. Fragments of individual bclABCDE genes from B. anthracis strains Sterne and Ameswere PCR amplified using primers flanking the CL regions (left panel). Expected sizes of PCR products are indicated below the gel. The individualPCR samples of bclABCDE for either strain Sterne or Ames were combined, and the sets were each loaded into single wells (middle panel). Thesolid squares indicate PCR bands in the gels. Multiplex PCR was performed with combined bclB primers using DNA from strain Sterne as thetemplate (right panel). All PCR products were analyzed by 2% agarose gel electrophoresis. Lane M, 2-log DNA ladder.

7168 LESKI ET AL. APPL. ENVIRON. MICROBIOL.

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from

Page 7: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

bands at the predicted sizes deduced from sequence data (Fig.5, left panel). We next loaded side by side combined samplescontaining bclABCDE gene products obtained from each straininto single wells, and band patterns were resolved by agarosegel electrophoresis (Fig. 5, middle panel). The results showthat the fingerprints generated for B. anthracis strain Sterneand strain Ames were significantly different and consisted offive and four bands, respectively. The bclABCDE amplificationproducts of strain Sterne were separated from each other, butthe amplification products of bclA (728 bp) and bclC (743 bp)were not resolved in a sample from strain Ames.

Finally, multiplex PCR with all five primer pairs was at-tempted with DNA of the Sterne strain as the template byusing a temperature gradient from 50 to 65°C for primer an-nealing and an Mg2� concentration range of 1.5 to 6.5 mM inthe buffer (Fig. 5, right panel). The bclABCDE genes were allamplified with an annealing temperature of 50°C and an Mg2�

concentration of 1.8 mM, although the intensities of the bclAand bclE bands were relatively low. Together, these data dem-onstrate that significant length variation in the CL regions ofthe bclABCDE genes that are present in the genomes of allavailable B. anthracis strains can be a valuable tool in strainfingerprinting.

Discriminating among B. anthracis strains using bclABCDE-based fingerprinting. A computational approach was used toestablish the feasibility of discriminating among B. anthracisstrains using length polymorphism in the CL regions of thebclABCDE genes. The first step was to develop and calibrate aquantitative relationship between experimentally measured gelmigration patterns of the bclABCDE gene products PCR am-plified from B. anthracis strains Sterne and Ames, and thetheoretical fragment lengths that were inferred from the se-quence data (see Fig. S1 in the supplemental material). Thecalibrated model was next used to predict the fragment sizesamplified by PCR for each of the bclABCDE genes present inthe genomes of six additional B. anthracis strains. The uncer-tainty associated with strain fingerprinting using multivariate

measurement of the amplified fragments derived from thebclABCDE genes was estimated using bootstrap resampling(10). Bootstrap resampling was used to create a population ofsynthetic replicates. The ability to distinguish among the strainsusing the bclABCDE genes was represented in two dimensionsusing multidimensional scaling (Fig. 6A). The levels of confi-dence associated with distinguishing among these strains areshown in an annotated dendrogram in Fig. 6B. The eightstrains clustered into four distinct groups. The Kruger andCNEVA-9066 strains clustered together, while the A1055, Vol-lum, and USA6153 strains formed a separate group (P 0.0001). The Ames strain appeared to be distinct from theother strains (P 0.0001). Only the Sterne and Australia 94strains, which exhibited the highest level of similarity, couldnot be distinguished from each other (P � 0.377). Hence, wepredicted that under the experimental conditions used here,we would be able to differentiate with confidence strains of B.anthracis, with the exception of the Sterne and Australia 94strains, using a multilocus typing approach based on bclAB-CDE length polymorphism.

bcl gene-based fingerprinting of the B. cereus group organ-isms. Determination of the origin of certain spores may also beimportant for non-B. anthracis Bacillus species in the event ofa hoax, a blunder by a perpetrator, or psychological terrorism.Primer pairs that were optimized for the bclABCDE genes of B.anthracis were used to generate fingerprints using DNA templatesfrom three B. cereus strains, one B. thuringiensis strain, and one B.mycoides strain (Fig. 7). Not all primer pairs yielded bclABCDEgene products with all DNA templates (see Fig. S4 in the supple-mental material). Despite partial amplification of three or fourbands, the combined PCR samples generated unique fingerprintpatterns for the strains analyzed. Inclusion of the bclF and bclGgenes in the fingerprint analysis, as well as primer optimization,should significantly improve the discriminating power. This testdemonstrates that bcl-based fingerprinting could also be em-ployed in forensic applications for differentiation of strains of allmembers of the B. cereus group.

FIG. 6. Mathematical modeling of bclABCDE-based fingerprinting of B. anthracis strains. (A) Multidimensional scaling representation of thesimilarity among the bootstrap replicates of eight B. anthracis strains based on multivariate PCR measurement of bclABCDE gene fragments.(B) Dendrogram with confidence levels associated with correct classification of the eight B. anthracis strains.

VOL. 75, 2009 bcl GENE POLYMORPHISM 7169

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from

Page 8: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

DISCUSSION

The dissemination of B. anthracis spores to government of-fices and media outlets in the United States in late 2001 height-ened public awareness of the potential for biological attacks,and since that time much emphasis has been placed on micro-bial forensic techniques that could determine the identity andorigin of organisms used as biological weapons (3, 7); however,sequencing of B. anthracis genomes was required for theseanalyses (11). This is because the chromosomes of B. cereusgroup organisms are virtually identical and interspecies differ-ences are largely determined by plasmid contents. The presentwork was initiated to identify and characterize potential mark-ers specific to B. anthracis, as well as to individual B. anthracisstrains. Here, we describe significant diversity and sequencepolymorphisms in the Bacillus bcl genes and use of these genesfor B. anthracis detection and strain fingerprinting.

The CLPs identified in our searches yielded proteins thatwere classified into 10 clans based on CLANS (see Fig. S2 inthe supplemental material). The Bcl proteins of B. anthracisbelong to clans 2a (BclB), 4a (BclC), 6a (BclD and BclE), and7 (BclA). In addition to B. anthracis, the BclA, BclB, BclC,BclD, and BclE proteins are found in many other members ofthe B. cereus group. Despite the uniqueness of proteins thataccounts for clan groups, several of the clans share protein foldpredictions. Clans 4, 6, 7, 9, and 10 all have a predicted C-terminal TNF/C1q-like domain, although the level of predictedhomology varies for each clan. The crystal structure of theBclA CTD (clan 7) was recently solved, and the authors re-ported that this domain is strikingly similar to the C-terminal

globular domain of C1q (35). The mammalian proteins belong-ing to the TNF/C1q superfamily are involved in many diversefunctions, including inflammation, autoimmunity, host de-fense, and apoptosis (21). The BclA protein is found in theexosporium and affects the hydrophobicity and adhesive prop-erties of B. anthracis spores (5). A recent study showed thatBclA interacts with the integrin receptor Mac-1 present onphagocytic and nonphagocytic cells and that this interactionaffects spore uptake and infectivity in mice (31). Here, wepredict that a TNF/C1q-like fold may be widespread in Bclproteins. Other Bcl CTDs, such as those in clan 1, are pre-dicted to contain a cupredoxin-like fold. The cupredoxins are agroup of copper-containing proteins that are found in numer-ous organisms and function in a wide variety of cellular pro-cesses, including many enzymatic reactions and aerobic andanaerobic respiration (36). Additionally, Bcl proteins in clan 5organisms contain a predicted WD40 domain, which is alsofound in the eukaryotic cell cycle protein CDC20 and in severalG-protein subunits, where it functions in protein-proteininteractions (26, 49). Until this work, cupredoxin-like andWD40 domains in Bcl proteins had been not reported or in-vestigated, and their biological significance is not known.Nonetheless, our analyses indicate that Bcl proteins are com-mon in the members of the B. cereus group and can be classi-fied into distinct clan groups based on the predicted proteinfolds, which are often shared with protein folds found in mam-malian proteins.

Both BclA and BclB were identified as components of theoutermost spore layer called the exosporium, which is charac-teristic of the spores of B. cereus group organisms but not thespores of B. subtilis group organisms (16). Sequences of bclgenes and Bcl proteins were found in the genomes and pro-teomes of B. cereus group organisms, such as B. anthracis, B.cereus, and B. thuringiensis, as well as two related species, B.mycoides and B. weihenstephanensis. In contrast, they were notpresent in B. subtilis group organisms. Considering the fact thatBcl proteins have a common architecture, it is tempting tospeculate that all of them are associated with the exosporium.Recently, an exosporium-targeting sequence motif was identi-fied in the N-terminal domain of BclA and BclB (44), as wellas two other proteins designated BAS3290 and BAS4623,which we refer to here as a protein belonging to clan 10 (BclF)and BclE, respectively. The N-terminal domain of BclA isproteolytically cleaved, and the processed mature protein isinserted into the exosporium. However, the BclC and BclDproteins, as well as several other Bcl proteins, lack the consen-sus targeting sequence, and their association with the exospo-rium remains to be verified experimentally.

In addition to variation within the noncollagenous domainsof Bcl proteins, there are both length and sequence polymor-phisms in their collagenous domains. The CL regions of pro-teins BclA to BclE in B. anthracis Sterne consist of 28 distinctGXY triplets (Fig. S5). However, there is a strong preponder-ance of GXT triplets that account for about 97, 92, 49, 98, and96% of all CL repeats in BclA to BclE, respectively. It has beenshown that both BclA (40, 41) and BclB (47) are glycoproteinsand that threonine residues are O glycosylated (8), which mayexplain the observed high GXT repeat content. Bcl CL glyco-sylation is not unique to B. anthracis; rather, it is an intrinsicproperty of the Bcl proteins in bacilli (43). Glycosylation of

FIG. 7. bcl-based fingerprinting of non-B. anthracis Bacillus spe-cies. PCR amplification of the bclABCDE genes was performed usingDNA templates from the following strains: B. cereus (Bc) strainsATCC 14579, ATCC 4342, and ATCC 13061; B. thuringiensis (Bt)ATCC 33679; and B. mycoides (Bm) ATCC 6462. Individually ampli-fied bclABCDE gene products were combined and analyzed by 2%agarose gel electrophoresis. The estimated products sizes are indicatedbelow the gel. Some primer pairs did not yield a detectable DNAproduct (ND). Lane M, 2-log DNA ladder.

7170 LESKI ET AL. APPL. ENVIRON. MICROBIOL.

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from

Page 9: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

other Bcl proteins has yet to be confirmed. The BclC CL regionis unique and is characterized by a lower frequency of GXTrepeats (�49%), which is accompanied by a high frequency ofGXQ triplets (�32%), which are not found in other Bcls.These differences provide an additional basis for differentiat-ing Bcl variants. For example, BclD and BclE have high CTDsequence identity and are both grouped in clan 6a; however,the BclE CL region contains several triplets (GST, GET, GNT,GTT, GGT, GMT, GSA, GSI, GSM, GNM, GPM, GDT, andGVS) that are not present in the BclD CL region (see Fig. S5in the supplemental material).

Here, we identified sequence polymorphisms that occur inthe bclB alleles as a way to discriminate between B. anthracisand other members of the B. cereus group. Primers were de-signed that specifically amplified the bclB gene product whenchromosomal DNA of B. anthracis was used as a template forPCR but not when DNAs of the closely related species B.cereus, B. thuringiensis, and B. mycoides, which contain differentbclB alleles, were used as templates. Importantly, the sameresults were obtained when spores were used as the PCRtemplates. While we demonstrated the feasibility of our ap-proach, the bclB-based detection system could be developed toidentify B. anthracis in the field using portable PCR deviceswith speed and sensitivity.

Although bclB alone serves as a B. anthracis genetic identi-fier, in aggregate, the bcl genes exhibit significant diversity,which could be used to generate B. anthracis strain “finger-prints” (Fig. 5 and 6). It has been shown that length polymor-phism in the CL region of bclA could differentiate some B.anthracis strains (42), while length variation in the bclB CLregion in B. anthracis has not been explored (47). Since bclA ispresent in the genomes of other B. cereus group members (B.anthracis, B. cereus, and B. thuringiensis), PCR amplification ofbclA was coupled with various electrophoretic separationmethods to discriminate among B. cereus members at the strainlevel (6). The present work employed three additional bclgenes (bclC, bclD, and bclE) that are present in the genomes ofB. anthracis strains and are characterized by significant lengthvariation in their CL regions. This method should allow evengreater discrimination between strains because it employs fivevariables (bclA, bclB, bclC, bclD, and bclE) rather than a singlevariable (bclA), as proposed previously. Other methods havebeen proposed that leverage the previously used multiple-locusVNTR analysis (MLVA) method to differentiate strains of B.anthracis (20, 23, 27). Using eight marker loci, Keim et al.analyzed a large worldwide collection of B. anthracis strainsthat clustered into six major genetic groups (20). A subsequentstudy by Lista et al., employing 25-locus MLVA for typing oflarge French and Italian collections, as well as reference B.anthracis strains, showed increased discriminatory power (27).Interestingly, some of the repeats used in that study (Bams13and Bams30) are associated with bcl genes. Our bootstrapresampling analysis of a limited number of B. anthracis refer-ence strains, which was based on bclABCDE length polymor-phism, predicted strain clusters that had some differences fromand some similarities to clusters obtained using the 25-locusMLVA criteria. For example, B. anthracis Volum (cluster A4),Sterne (cluster A3b), and Ames (cluster A3b) were on thesame main branch as determined by both methods, althoughour analysis discriminated strains Sterne and Ames better. On

the other hand, the Australia 94 and Sterne strains clusteredseparately in clusters A3a and A3b, respectively, based on the25-locus MLVA criteria, while we could not discriminate be-tween them based on bclABCDE polymorphism.

Here we used agarose gel electrophoresis to separate the bclPCR products; however, this technique may not be optimalconsidering the relatively large sizes of the DNA fragmentsanalyzed. Alternative separation techniques could significantlyimprove our results (6). The size and number of DNA bandscould also be altered by incorporating endonuclease digestionof PCR products. An increase in the number of bands accom-panied by a decrease in the band sizes would likely result inincreased discriminatory power. This alternative protocol, in-cluding the digestion of PCR products, can be readily incor-porated into the mathematical approach employed in thisstudy. Together, our “proof of principle” experiments deter-mined the feasibility of bcl fingerprinting, but further work isneeded to validate our model.

In conclusion, we describe use of the bcl genes of B. cereusgroup organisms as chromosomal genetic identifiers of thespecies, as well as a means of discrimination at the strain level.Incorporation of bcl-based identification into existing systemsmay improve B. anthracis detection, and it may also improvemicrobial forensic techniques used to identify individuals re-sponsible for biological attacks. In addition to using bcl genepolymorphisms as species- and strain-identifying markers, it isimportant that the Bcl proteins be assessed in terms of theirbiology. Additional research is needed to determine the role ofBcl in the pathogenesis of B. cereus group organisms. Thepresent work proposes the first comprehensive classification ofthe CLPs of bacilli, which is based on the predicted structuralcharacteristics of these proteins. This study increases our un-derstanding of the diversity of this unique family of prokaryoticproteins that have been found in many pathogenic bacteria.

ACKNOWLEDGMENTS

We are grateful to Baochuan Lin for providing the genomic DNA ofB. anthracis strain Ames. We thank Chris Garton and StephanieBoblett for assistance with some experiments.

Chris Garton and Stephanie Boblett were supported by NIH grant5P20RR016477 to the West Virginia IDeA Network for BiomedicalResearch Excellence. This work was supported in part by NIH grantsAI50666 (S.L.) and CA132124 (D.J.K.), by the PhRMA Foundation(D.J.K.), and by a West Virginia University Health Science Centerinternal grant from the Office of Research and Graduate Education(S.L.). J.M.B. and M.P. were supported by NIH grant GM081680-01.

REFERENCES

1. Altschul, S., T. Madden, A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D.Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of pro-tein database search programs. Nucleic Acids Res. 25:3389–3402.

2. Bateman, A., E. Birney, R. Durbin, S. R. Eddy, K. L. Howe, and E. L. L.Sonnhammer. 2000. The Pfam protein families database. Nucleic Acids Res.28:263–266.

3. Bhattacharjee, Y., and M. Enserink. 2008. Anthrax investigation. FBI dis-cusses microbial forensics—but key questions remain unanswered. Science321:1026–1027.

4. Boydston, J. A., P. Chen, C. T. Steichen, and C. L. Turnbough, Jr. 2005.Orientation within the exosporium and structural stability of the collagen-like glycoprotein BclA of Bacillus anthracis. J. Bacteriol. 187:5310–5317.

5. Brahmbhatt, T. N., B. K. Janes, E. S. Stibitz, S. C. Darnell, P. Sanz, S. B.Rasmussen, and A. D. O’Brien. 2007. Bacillus anthracis exosporium proteinBclA affects spore germination, interaction with extracellular matrix pro-teins, and hydrophobicity. Infect. Immun. 75:5233–5239.

6. Castanha, E. R., R. R. Swiger, B. Senior, A. Fox, L. N. Waller, and K. F. Fox.2006. Strain discrimination among B. anthracis and related organisms by

VOL. 75, 2009 bcl GENE POLYMORPHISM 7171

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from

Page 10: Identification and Classification of bcl Genes and Proteins ...Identification and Classification of bcl Genes and Proteins of Bacillus cereus Group Organisms and Their Application

characterization of bclA polymorphisms using PCR coupled with agarose gelor microchannel fluidics electrophoresis. J. Microbiol. Methods 64:27–45.

7. Dance, A. 2008. Anthrax case ignites new forensics field. Nature 454:813.8. Daubenspeck, J., H. Zeng, P. Chen, S. Dong, C. Steichen, N. Krishna, D.

Pritchard, and C. Turnbough. 2004. Novel oligosaccharide side-chains of thecollagen-like region of BclA, the major glycoprotein of the Bacillus anthracisexosporium. J. Biol. Chem. 279:30945–30953.

9. Drobniewski, F. A. 1993. Bacillus cereus and related species. Clin. Microbiol.Rev. 6:324–338.

10. Efron, B., and R. Tibshirani. 1986. Bootstrap methods for standard errors,confidence intervals, and other measures of statistical accuracy. Stat. Sci.1:54–77.

11. Enserink, M. 2008. Anthrax investigation. Full-genome sequencing pavedthe way from spores to a suspect. Science 321:898–899.

12. Frickey, T., and A. Lupas. 2004. CLANS: a Java application for visualizingprotein families based on pairwise similarity. Bioinformatics 20:3702–3704.

13. Hall, P., and M. P. Wand. 1988. On nonparametric discrimination usingdensity differences. Biometrica 75:541–547.

14. Han, R., A. Zwiefka, C. C. Caswell, Y. Xu, D. R. Keene, E. Lukomska, Z.Zhao, M. Hook, and S. Lukomski. 2006. Assessment of prokaryotic collagen-like sequences derived from streptococcal Scl1 and Scl2 proteins as a sourceof recombinant GXY polymers. Appl. Microbiol. Biotechnol. 72:109–115.

15. Hart, S. J., A. Terray, T. A. Leski, J. Arnold, and R. Stroud. 2006. Discoveryof a significant optical chromatographic difference between spores of Bacillusanthracis and its close relative, Bacillus thuringiensis. Anal. Chem. 78:3221–3225.

16. Henriques, A. O., J. Moran, and P. Charles. 2007. Structure, assembly, andfunction of the spore surface layers. Annu. Rev. Microbiol. 61:555–588.

17. Inglesby, T. V., D. A. Henderson, J. G. Bartlett, M. S. Ascher, E. Eitzen,A. M. Friedlander, J. Hauer, J. McDade, M. T. Osterholm, T. O’Toole, G.Parker, T. M. Perl, P. K. Russell, K. Tonat, and the Working Group onCivilian Biodefense. 1999. Anthrax as a biological weapon: medical andpublic health management. JAMA 281:1735–1745.

18. Katoh, K., K. Misawa, K.-I. Kuma, and T. Miyata. 2002. MAFFT: a novelmethod for rapid multiple sequence alignment based on fast Fourier trans-form. Nucleic Acids Res. 30:3059–3066.

19. Keim, P., A. M. Klevytska, L. B. Price, J. M. Schupp, G. Zinser, K. L. Smith,M. E. Hugh-Jones, R. Okinaka, K. K. Hill, and P. J. Jackson. 1999. Molec-ular diversity in Bacillus anthracis. J. Appl. Microbiol. 87:215–217.

20. Keim, P., L. B. Price, A. M. Klevytska, K. L. Smith, J. M. Schupp, R.Okinaka, P. J. Jackson, and M. E. Hugh-Jones. 2000. Multiple-locus vari-able-number tandem repeat analysis reveals genetic relationships withinBacillus anthracis. J. Bacteriol. 182:2928–2936.

21. Kishore, U., C. Gaboriaud, P. Waters, A. K. Shrive, T. J. Greenhough,K. B. M. Reid, R. B. Sim, and G. J. Arlaud. 2004. C1q and tumor necrosisfactor superfamily: modularity and versatility. Trends Immunol. 25:551–561.

22. Kurowski, M. A., and J. M. Bujnicki. 2003. GeneSilico protein structureprediction meta-server. Nucleic Acids Res. 31:3305–3307.

23. Le Fleche, P., Y. Hauck, L. Onteniente, A. Prieur, F. Denoeud, V. Ramisse,P. Sylvestre, G. Benson, F. Ramisse, and G. Vergnaud. 2001. A tandemrepeats database for bacterial genomes: application to the genotyping ofYersinia pestis and Bacillus anthracis. BMC Microbiol. 1:2.

24. Leighton, T. J., and R. H. Doi. 1971. The stability of messenger ribonucleicacid during sporulation in Bacillus subtilis. J. Biol. Chem. 246:3189–3195.

25. Lew, D. P. 2000. Bacillus anthracis (anthrax), p. 2215–2220. In G. L. Mandell,J. E. Bennett, and R. Dolin (ed.), Mandell, Douglas, and Bennett’s principlesand practice of infectious diseases, vol. 2. Churchill Livingstone, Philadel-phia, PA.

26. Li, D., and R. Roberts. 2001. WD-repeat proteins: structure characteristics,biological function, and their involvement in human diseases. Cell. Mol. LifeSci. 58:2085–2097.

27. Lista, F., G. Faggioni, S. Valjevac, A. Ciammaruconi, J. Vaissaire, C. leDoujet, O. Gorge, R. De Santis, A. Carattoli, A. Ciervo, A. Fasanella, F.Orsini, R. D’Amelio, C. Pourcel, A. Cassone, and G. Vergnaud. 2006. Geno-typing of Bacillus anthracis strains based on automated capillary 25-locimultiple locus variable-number tandem repeats analysis. BMC Microbiol.6:33.

28. Lundstrom, J., L. Rychlewski, J. Bujnicki, and A. Elofsson. 2001. Pcons: a

neural-network-based consensus predictor that improves fold recognition.Protein Sci. 10:2354–2362.

29. Muddiman, D. C., D. S. Wunschel, C. Liu, L. Pasa-Tolic, K. F. Fox, A. Fox,G. A. Anderson, and R. D. Smith. 1996. Characterization of PCR productsfrom bacilli using electrospray ionization FTICR mass spectrometry. Anal.Chem. 68:3705–3712.

30. Mueller, O., K. Hahnenberger, M. Dittmann, H. Yee, R. Dubrow, R. Nagle,and D. Ilsley. 2000. A microfluidic system for high-speed reproducible DNAsizing and quantitation. Electrophoresis 21:128–134.

31. Oliva, C. R., M. K. Swiecki, C. E. Griguer, M. W. Lisanby, D. C. Bullard,C. L. Turnbough, Jr., and J. F. Kearney. 2008. The integrin Mac-1 (CR3)mediates internalization and directs Bacillus anthracis spores into profes-sional phagocytes. Proc. Natl. Acad. Sci. USA 105:1261–1266.

32. Phillips, A. P., and J. W. Ezzell. 1989. Identification of Bacillus anthracis bypolyclonal antibodies against extracted vegetative cell antigens. J. Appl.Bacteriol. 66:419–432.

33. Rasko, D. A., M. R. Altherr, C. S. Han, and J. Ravel. 2005. Genomics of theBacillus cereus group of organisms. FEMS Microbiol. Rev. 29:303–329.

34. Rasmussen, M., M. Jacobsson, and L. Bjorck. 2003. Genome-based identi-fication and analysis of collagen-related structural motifs in bacterial andviral proteins. J. Biol. Chem. 278:32313–32316.

35. Rety, S., S. Salamitou, I. Garcia-Verdugo, D. J. S. Hulmes, F. Le Hegarat, R.Chaby, and A. Lewit-Bentley. 2005. The crystal structure of the Bacillusanthracis spore surface protein BclA shows remarkable similarity to mam-malian proteins. J. Biol. Chem. 280:43073–43078.

36. Savelieff, M. G., T. D. Wilson, Y. Elias, M. J. Nilges, D. K. Garner, and Y. Lu.2008. Experimental evidence for a link among cupredoxins: red, blue, andpurple copper transformations in nitrous oxide reductase. Proc. Natl. Acad.Sci. USA 105:7919–7924.

37. Schaeffer, P., J. Millet, and J. P. Aubert. 1965. Catabolic repression ofbacterial sporulation. Proc. Natl. Acad. Sci. USA 54:704–711.

38. Schnepf, E., N. Crickmore, J. Van Rie, D. Lereclus, J. Baum, J. Feitelson,D. R. Zeigler, and D. H. Dean. 1998. Bacillus thuringiensis and its pesticidalcrystal proteins. Microbiol. Mol. Biol. Rev. 62:775–806.

39. Soding, J. 2005. Protein homology detection by HMM-HMM comparison.Bioinformatics 21:951–960.

40. Steichen, C., P. Chen, J. F. Kearney, and C. L. Turnbough, Jr. 2003. Iden-tification of the immunodominant protein and other proteins of the Bacillusanthracis exosporium. J. Bacteriol. 185:1903–1910.

41. Sylvestre, P., E. Couture-Tosi, and M. Mock. 2002. A collagen-like surfaceglycoprotein is a structural component of the Bacillus anthracis exosporium.Mol. Microbiol. 45:169–178.

42. Sylvestre, P., E. Couture-Tosi, and M. Mock. 2003. Polymorphism in thecollagen-like region of the Bacillus anthracis BclA protein leads to variationin exosporium filament length. J. Bacteriol. 185:1555–1563.

43. Tamborrini, M., M. A. Oberli, D. B. Werz, N. Schurch, J. Frey, P. H.Seeberger, and G. Pluschke. 2009. Immuno-detection of anthrose containingtetrasaccharide in the exosporium of Bacillus anthracis and Bacillus cereusstrains. J. Appl. Microbiol. 106:1618–1628.

44. Thompson, B., and G. Stewart. 2008. Targeting of the BclA and BclB pro-teins to the Bacillus anthracis spore surface. Mol. Microbiol. 70:421–434.

45. Todd, S. J., A. J. G. Moir, M. J. Johnson, and A. Moir. 2003. Genes ofBacillus cereus and Bacillus anthracis encoding proteins of the exosporium. J.Bacteriol. 185:3373–3378.

46. Valjevac, S., V. Hilaire, O. Lisanti, F. Ramisse, E. Hernandez, J.-D. Cavallo,C. Pourcel, and G. Vergnaud. 2005. Comparison of minisatellite polymor-phisms in the Bacillus cereus complex: a simple assay for large-scale screen-ing and identification of strains most closely related to Bacillus anthracis.Appl. Environ. Microbiol. 71:6613–6623.

47. Waller, L., M. Stump, K. Fox, W. Harley, A. Fox, G. Stewart, and M.Shahgholi. 2005. Identification of a second collagen-like glycoprotein pro-duced by Bacillus anthracis and demonstration of associated spore-specificsugars. J. Bacteriol. 187:4592–4597.

48. Xu, Y., D. R. Keene, J. M. Bujnicki, M. Hook, and S. Lukomski. 2002.Streptococcal Scl1 and Scl2 proteins form collagen-like triple helices. J. Biol.Chem. 277:27312–27318.

49. Yu, H. 2007. Cdc20: a WD40 activator for a cell cycle degradation machine.Mol. Cell 27:3–16.

7172 LESKI ET AL. APPL. ENVIRON. MICROBIOL.

on Septem

ber 5, 2020 by guesthttp://aem

.asm.org/

Dow

nloaded from