Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete

7
DOI: 10.1126/science.281.5375.375 , 375 (1998); 281 Science et al. Claire M. Fraser Spirochete , the Syphilis Treponema pallidum Complete Genome Sequence of This copy is for your personal, non-commercial use only. clicking here. colleagues, clients, or customers by , you can order high-quality copies for your If you wish to distribute this article to others here. following the guidelines can be obtained by Permission to republish or repurpose articles or portions of articles ): November 8, 2012 www.sciencemag.org (this information is current as of The following resources related to this article are available online at http://www.sciencemag.org/content/281/5375/375.full.html version of this article at: including high-resolution figures, can be found in the online Updated information and services, http://www.sciencemag.org/content/281/5375/375.full.html#ref-list-1 , 42 of which can be accessed free: cites 77 articles This article 525 article(s) on the ISI Web of Science cited by This article has been http://www.sciencemag.org/content/281/5375/375.full.html#related-urls 100 articles hosted by HighWire Press; see: cited by This article has been http://www.sciencemag.org/cgi/collection/molec_biol Molecular Biology subject collections: This article appears in the following registered trademark of AAAS. is a Science 1998 by the American Association for the Advancement of Science; all rights reserved. The title Copyright American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the Science on November 8, 2012 www.sciencemag.org Downloaded from

Transcript of Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete

Page 1: Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete

DOI: 10.1126/science.281.5375.375, 375 (1998);281 Science

et al.Claire M. FraserSpirochete

, the SyphilisTreponema pallidumComplete Genome Sequence of

This copy is for your personal, non-commercial use only.

clicking here.colleagues, clients, or customers by , you can order high-quality copies for yourIf you wish to distribute this article to others

  here.following the guidelines

can be obtained byPermission to republish or repurpose articles or portions of articles

  ): November 8, 2012 www.sciencemag.org (this information is current as of

The following resources related to this article are available online at

http://www.sciencemag.org/content/281/5375/375.full.htmlversion of this article at:

including high-resolution figures, can be found in the onlineUpdated information and services,

http://www.sciencemag.org/content/281/5375/375.full.html#ref-list-1, 42 of which can be accessed free:cites 77 articlesThis article

525 article(s) on the ISI Web of Sciencecited by This article has been

http://www.sciencemag.org/content/281/5375/375.full.html#related-urls100 articles hosted by HighWire Press; see:cited by This article has been

http://www.sciencemag.org/cgi/collection/molec_biolMolecular Biology

subject collections:This article appears in the following

registered trademark of AAAS. is aScience1998 by the American Association for the Advancement of Science; all rights reserved. The title

CopyrightAmerican Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by theScience

on

Nov

embe

r 8,

201

2w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

Page 2: Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete

Complete Genome Sequenceof Treponema pallidum, the

Syphilis SpirocheteClaire M. Fraser,* Steven J. Norris, George M. Weinstock,

Owen White, Granger G. Sutton, Robert Dodson,Michelle Gwinn, Erin K. Hickey, Rebecca Clayton,

Karen A. Ketchum, Erica Sodergren, John M. Hardham,Michael P. McLeod, Steven Salzberg, Jeremy Peterson,Hanif Khalak, Delwood Richardson, Jerrilyn K. Howell,

Monjula Chidambaram, Teresa Utterback, Lisa McDonald,Patricia Artiach, Cheryl Bowman, Matthew D. Cotton,Claire Fujii, Stacey Garland, Bonnie Hatch, Kurt Horst,

Kevin Roberts, Mina Sandusky, Janice Weidman,Hamilton O. Smith, J. Craig Venter

The complete genome sequence of Treponema pallidum was determined andshown to be 1,138,006 base pairs containing 1041 predicted coding sequences(open reading frames). Systems for DNA replication, transcription, translation,and repair are intact, but catabolic and biosynthetic activities are minimized.The number of identifiable transporters is small, and no phosphoenolpyruvate:phosphotransferase carbohydrate transporters were found. Potential virulencefactors include a family of 12 potential membrane proteins and several putativehemolysins. Comparison of the T. pallidum genome sequence with that ofanother pathogenic spirochete, Borrelia burgdorferi, the agent of Lyme disease,identified unique and common genes and substantiates the considerable di-versity observed among pathogenic spirochetes.

Venereal syphilis was first reported in Europein the late 1400s (1), coincident with thereturn of Columbus from the New World.The disease quickly reached epidemic pro-portions in Europe and spread across theworld during the early 16th century with theage of exploration. Syphilis was ubiquitousby the 19th century and has been called theacquired immune deficiency syndrome ofthat era (2). Syphilis is characterized by mul-tiple clinical stages and long periods of latent,asymptomatic infection. The primary infec-tion is localized, but organisms rapidly dis-seminate and cause manifestations through-out the body, including the cardiovascular

and nervous systems (3). Although effectivetherapies have been available since the intro-duction of penicillin in the mid-20th century,syphilis remains an important global healthproblem.

Treponema pallidum is the causativeagent of syphilis. It is a spirochete, a helicalto sinusoidal bacterium with outer and cyto-plasmic membranes, a thin peptidoglycanlayer, and flagella that lie in the periplasmicspace and extend from both ends toward themiddle of the organism. Recent pulsed-fieldgel electrophoresis studies (4) have shownthat T. pallidum contains a circular chromo-some of about 1000 kilobase pairs, making itone of the smallest prokaryotic genomes. De-spite its importance as an infectious agent,relatively little is known about T. pallidum incomparison with other bacterial pathogens(5). The organism is an obligate human par-asite that cannot be cultured continuously invitro (6). Mechanisms of T. pallidum patho-genesis are poorly understood. No knownvirulence factors have been identified, andthe outer membrane is mostly lipid with apaucity of proteins (7). Consequently, exist-ing diagnostic tests for syphilis are subopti-mal, and no vaccine against T. pallidum isavailable.

Spirochetes represent a phylogeneticallyancient and distinct bacterial group. Both T.

pallidum and Borrelia burgdorferi, the caus-ative agent of Lyme disease, are similar inhaving relatively small genomes and surviv-ing only in association with a host. However,they are not closely related and probablyevolved independently from a more complexancestor by loss of unnecessary genes andacquisition of new functions that promotedsurvival in the host environment. Comparisonof the T. pallidum and B. burgdorferi ge-nomes (8) allows assessment of biologicaldiversity within this group of bacteria.

Genome analysis. The genome of T. pal-lidum subsp. pallidum (Nichols) was se-quenced by the whole genome random se-quencing method as described (8–10). The T.pallidum genome is a circular chromosome of1,138,006 base pairs with an average G 1 Ccontent of 52.8% (Figs. 1 and 2). There are atotal of 1041 predicted open reading frames(ORFs), with an average size of 1023 bp,representing 92.9% of total genomic DNA.Predicted biological roles were assigned to577 ORFs (55%) by the classification schemeadopted from Riley (11); 177 ORFs (17%)match hypothetical proteins from other spe-cies, and 287 ORFs (28%) have no databasematch and presumably represent novel genes(Fig. 1 and Table 1). Ninety T. pallidumORFs of unknown function match chromo-some-encoded proteins in B. burgdorferi (8);however, no T. pallidum ORFs match B.burgdorferi plasmid-encoded proteins, sug-gesting that the plasmid proteins may beunique to Borrelia species (8). The averagesize of the predicted proteins in T. pallidum is37,771 daltons, ranging from 3235 to172,869 daltons, and the mean isoelectricpoint for all predicted proteins is 8.1, rangingfrom 3.9 to 12.3, values similar to thoseobserved in other bacterial species (8, 9).

Forty-two paralogous gene families con-taining a total of 129 ORFs (12%) wereidentified in T. pallidum (Fig. 1). Fifteenfamilies contain 44 genes that have no as-signed biological role. Thirty families haveonly two members. The largest family, with14 members, consists of proteins with aden-osine triphosphate (ATP)–binding cassettesin ABC transport systems. Within 13 genefamilies are 16 clusters of adjacent genes thatmay represent duplications in the T. pallidumgenome.

All 61 triplet codons are used in T. palli-dum. There is a bias for G or C in the thirdcodon position in T. pallidum, in contrast toan A or T bias in this position in B. burgdor-feri. This observation is consistent with the G1 C content in the T. pallidum genome beingalmost twice that in the B. burgdorferi ge-nome. The disparate G 1 C content betweenthe spirochete genomes creates a bias in over-all codon usage, resulting in a difference inamino acid composition in the predicted cod-ing sequences.

C. M. Fraser, O. White, G. G. Sutton, R. Dodson, M.Gwinn, E. K. Hickey, R. Clayton, K. A. Ketchum, S.Salzberg, J. Peterson, H. Khalak, D. Richardson, T.Utterback, L. McDonald, P. Artiach, C. Bowman, M. D.Cotton, C. Fujii, S. Garland, B. Hatch, K. Horst, K.Roberts, M. Sandusky, J. Weidman, H. O. Smith, andJ. C. Venter are with The Institute for Genomic Re-search, 9712 Medical Center Drive, Rockville, MD20850, USA. S. J. Norris, G. M. Weinstock, E. Soder-gren, J. M. Hardham, M. P. McLeod, J. K. Howell, and M.Chidambaram are at the University of Texas HealthScience Center, Departments of Microbiology andMolecular Genetics, Pathology and Laboratory Medi-cine, and the Center for the Study of Emerging andReemerging Pathogens, Post Office Box 20708, Hous-ton, TX 77225, USA.

*To whom correspondence should be addressed. E-mail: [email protected]

R E S E A R C H A R T I C L E S

www.sciencemag.org SCIENCE VOL 281 17 JULY 1998 375

on

Nov

embe

r 8,

201

2w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

Page 3: Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete

Origin of replication. Two criteria wereused to identify a replication origin in T.pallidum: the co-localization of genes (dnaA,dnaN, recF, and gyrA) often found near theorigin in prokaryotic genomes and GC skew(12) (Fig. 2). On the basis of these results, wedesignated base pair 1 of the T. pallidum ge-nome in an intergenic region of the chromo-some that is located within the putative originof replication.

Sixty-four percent of the coding se-quences in the T. pallidum genome arealigned in the direction of replication, withthe point of transcriptional divergence lo-cated near the putative origin between clpPand dnaA (Figs. 1 and 2). A number ofcodons occur in coding sequences alignedin the direction of replication at significant-ly higher frequencies than expected (P ,5.3e227), including TTG (Leu), GCG (Ala),CGT (Arg), GTG (Val), and TGT (Cys).Codons that are overrepresented are alsofound in the most highly skewed oligomers(GGAGCGTG, TGTGTGTG, GTGTGTGC,TTTTTTGT, and GGTGTGTG).

Codon adaptation index (CAI), which isdesigned to be a relative measure of transla-tional efficiency (13), was computed for T.pallidum with the codon frequencies from theribosomal proteins, the translation elongationfactors, and glyceraldehyde-3-phosphate de-hydrogenase. Proteins with a high CAI arepresumably highly expressed in exponentialgrowth (13). The distribution of CAI scoresin T. pallidum ORFs (13) exhibits a strand-dependent switch in magnitude around theorigin of replication (Fig. 2). In both T. pal-lidum and B. burgdorferi, there is a markeddifference in CAI values (high versus low)for genes on opposite strands of the chromo-some, with genes transcribed in the directionof replication exhibiting a high CAI.

Transcription and translation. Trepo-nema pallidum contains a basic set of genesfor transcription and translation that includeshomologs to the a, b, and b9 subunits of thecore RNA polymerase, five sigma factors(s24, s28, s43, s54, and s70), and five genesthat encode proteins involved in transcriptelongation and termination (nusA, nusB,nusG, greA, and rho). Treponema pallidum ismissing both a recognizable s38 (rpoS),which is the major sigma factor in stationaryphase activated in response to oxidative andosmotic stress, and a s32, which is involvedin transcription of heat shock proteins.

Forty-four tRNA species, organized intoeight clusters containing 25 genes plus 19single genes, were identified (Figs. 1 and 2).Two ribosomal RNA (rRNA) operons arepresent in the genome. Their organization isthe same as that commonly found in eubac-teria (16S-tRNA-23S-5S) (14), in contrastwith the unusual arrangement seen in B.burgdorferi (8, 15). Both T. pallidum rRNA

operons are transcribed in the direction ofreplication.

All tRNA synthetase genes were identi-fied with the exception of glutaminyl-tRNAsynthetase, similar to B. burgdorferi (8). It islikely that glutamyl-tRNA synthetase amino-acylates tRNAGln with glutamate followed bytransamidation by Glu-tRNA amidotrans-ferase (16). Two distinct lysyl-tRNA syn-thetase (LysS) species are present in T. pal-lidum, a class I type most similar to those ineuryarchaea and B. burgdorferi (17) and aclass II type most similar to those in eubac-teria and eukaryotes. The class II LysS in T.pallidum represents a COOH-terminal frag-ment of Escherichia coli LysS. A region nearthe NH2-terminus of LysS binds the antico-don of the tRNA and is crucial for its activity(18). Thus, it is likely that the class II LysS isnonfunctional and may be in the process ofbeing lost from the genome.

Replication, repair, recombination, andrestriction-modification systems. The com-plement of genes for DNA replication in T.pallidum is similar to that in other minimalgenomes such as Mycoplasma genitalium andB. burgdorferi. Orthologs for the a, b, «, g,and t subunits of E. coli DNA polymerase IIIare present. Treponema pallidum has ho-mologs of one type I topoisomerase (topA)and one type II topoisomerase (gyrAB), butunlike B. burgdorferi, it is missing topoisom-erase IV, which is involved in chromosomesegregation. However, chromosome segrega-tion in T. pallidum may proceed by an alter-native mechanism that involves the bindingof hemimethylated DNA to the cytoplasmicmembrane. This idea is supported by thepresence of DNA adenine methyltransferase(dam) in T. pallidum but not in B. burgdorferi(19).

DNA repair in T. pallidum includes themajor known pathways of uvr excision re-pair, mutL/mutS mismatch repair, mutY, anddat. The T. pallidum genome encodes ho-mologs of the recF pathway of recombination(recFGJNR) but lacks homologs to sbcB(exoI) as well as recB, recC, and recD. Thus,homologous recombination resembles therecF pathway of E. coli. The converse is truein B. burgdorferi, where there are homologsof recBCD but not the recF pathway genes(8). Treponema pallidum contains an A- orG-specific adenine glycosylase (mutY), rec-ognizes GA mismatches in duplex DNA, andexcises adenine. No enzyme with similar ac-tivity has been identified in either M. geni-talium or B. burgdorferi. This differencemay, in part, explain the lower G 1 C contentof the B. burgdorferi and M. genitalium ge-nomes as compared with T. pallidum. Norecognizable genes encoding restriction ormodification enzymes were found.

Biosynthetic pathways. Treponema pal-lidum is an obligate parasite of humans. Con-

sistent with this property, previous physio-logic studies have shown that it has limitedbiosynthetic capabilities and requires multi-ple nutrients from the host (20) (Fig. 3). TheT. pallidum genome encodes a pathway forthe conversion of phosphoenolpyruvate orpyruvate through oxaloacetate to aspartate (atthe expense of glutamate), in accordance withthe previous observation that most of the[14C]glucose incorporated into amino acidswas in the form of aspartate (21). Predictedpathways for the interconversion of aspartateand glutamine to glutamate, aspartate to as-paragine, glutamate to proline, and serine toglycine are also present. Treponema pallidumis unable to synthesize enzyme co-factors,fatty acids, and nucleotides de novo, similarto M. genitalium and B. burgdorferi. Deoxy-ribonucleotides can be obtained by reductionof ribonucleoside diphosphates through theaction of ribonucleotide diphosphate reduc-tase and thioredoxin reductase.

Transport. An organism such as T. pal-lidum, with limited biosynthetic capabilities,must have a repertoire of transport proteinswith broad substrate specificity to obtain thenecessary nutrients from the environment.The T. pallidum genome contains 57 ORFs(5% of the total) that encode 18 distincttransporters with predicted specificity foramino acids, carbohydrates, and cations (Fig.3 and Table 1). For the most part, thesetransport systems are of similar specificity tothose found in M. genitalium and B. burgdor-feri (8); however, several important differ-ences are seen.

Treponema pallidum has a broad spec-trum of amino acid transporters, althoughthese transporters are different from those inB. burgdorferi. For example, a transporter forglutamate or aspartate in T. pallidum is mostsimilar to mammalian glutamate transporters.There are no phosphoenolpyruvate:phospho-transferase (PTS) systems in T. pallidum forthe import of carbohydrates, in contrast toother bacterial species whose genome se-quences have been determined (8, 9). Ge-nome analysis predicts that T. pallidumhas three ATP-binding cassette transporterswith specificity for galactose (mglBAC) (22,23), ribose (rbsAC), and multiple sugars(y4oQRS), respectively; however, these threetransporters may display a broader substratespecificity. In E. coli, the mgl transporterdisplays affinity not only for galactose butalso for glucose (24), and its expression isup-regulated in glucose-limiting conditionsbut repressed at high glucose concentrations(24). Treponema pallidum may also requirean environment with limiting glucose con-centrations for maximal expression of thistransporter. Treponema pallidum has no rec-ognizable inorganic phosphate (Pi) uptakesystem, unlike other bacteria studied bywhole-genome analysis to date; therefore, up-

R E S E A R C H A R T I C L E S

17 JULY 1998 VOL 281 SCIENCE www.sciencemag.org376

on

Nov

embe

r 8,

201

2w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

Page 4: Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete

take of glycerol-3-phosphate through themultiple sugar transporter may represent theprimary means whereby T. pallidum obtainsPi (Fig. 3).

Treponema pallidum contains an ATP-binding cassette transporter with specificityfor thiamine. Both thiamine and thiaminepyrophosphate (TPP) are substrates for thethiamine transporter in E. coli (25). This find-ing is of interest because T. denticola, T.vincentii, and Leptospira species require TPPfor growth in vitro (26), which suggests thatT. pallidum may also exhibit a growth de-pendency on TPP. The only recognizableTPP-dependent enzyme present in T. palli-dum is transketolase, which creates a linkbetween the pentose phosphate pathway andglycolysis.

Energy metabolism. The complement oftransport proteins in T. pallidum suggests thatit may use several carbohydrates as energysources, including glucose, galactose, mal-

tose, and glycerol. Experimental evidence hasdemonstrated that only glucose, mannose,and maltose support the multiplication of T.pallidum in a tissue culture system (27). It isnot known whether T. pallidum can use ami-no acids as a source of carbon and energy;however, the lack of necessary catabolic andanabolic pathway genes suggests that itwould not be able to use such alternativecompounds.

Metabolic pathway analysis reveals thatgenes encoding all of the enzymes of theglycolytic pathway are present in T. pallidum,including hexokinase, which phosphorylatesglucose and other hexose sugars (Fig. 3).Both M. genitalium and B. burgdorferi lackhexokinase; however, in these organisms,phosphorylation of hexoses is an integral partof the PTS uptake mechanism. Instead of thetypical eubacterial phosphofructokinase andpyruvate kinase, T. pallidum contains ho-mologs of these enzymes that use pyro-

phosphate. Similar inorganic pyrophosphate(PPi)–dependent enzymes have been de-scribed in some bacteria, protists, protozoa,and plants (28). None of the genes encodingcomponents of the tricarboxylic acid cycle oroxidative phosphorylation were identified,contrary to previous reports of the presenceof cytochromes, flavoproteins, and some ofthe tricarboxylic acid cycle enzymes (20, 29);these may have represented contaminatingrabbit components. Reducing power is prob-ably generated through the oxidative branchof the pentose phosphate pathway. This sim-plified metabolic strategy is similar to thatseen in both M. genitalium (9) and B. burg-dorferi (8).

Treponema pallidum, like B. burgdorferi,lacks a respiratory electron transport chain;therefore, ATP production must be accom-plished by substrate-level phosphorylation.As a result, membrane potential must be es-tablished by the reverse reaction of the ATP

cationstroABCD

K+

ntpJ Cu+

TpF1

Na+

oadAB

Mg++

mgtCE

ribose/galactose

rbsAC

glucose/ galactose/glycerol-P(?)

galactosemglABC

malate /succinate/fumarate

dctM

thiamine

D-alanine/glycinedagA

neutral amino acids

glutamate

spermidine /putrescinepotABCD

glutamate/aspartate

H+

V-type ATPase

ATP

ADP + Pi

cationsP-type ATPasecarnitine

alanine

ENERGY PRODUCTION

Glucose-6-P

Glucose

PEP

Pyruvate

Lactate

Acetyl-CoA

6-P-gluconate

Ribulose-5-P

Ribose-5-P

Glyc-3-P

GLYCOLYSIS

PENTOSE PHOSPHATE PATHWAY

+Oxaloacetate

Acetyl P

Acetate

Fatty acid

L-glutamate

L-aspartate + α-ketoglutarate

L-asparagine

L-glutamine

+

L-proline

+

Phosphatidic acid

Phosphatidyl glyc -P

L-serine L-glycine

V-type ATPase

Na +

ATP

ADP + P i

Phosphatidyl glycerol

?

Glycerol-3-P

N-acetyl-Gln-1-P

UDP-N-acetyl-D-Gln

?

UDP-N-acetyl-Gln

8 murein synthesis genes

L-glutamate D-glutamate

L-alanine D-alanine

D-alanyl-D-alanine

sec protein excretion &

leader peptidases

CELL WALL

22 putative lipoproteins

dNDPs

dNTPs

rNDPs

rNTPs

rAMP, rCMP, rUMPdAMP, dCMP, dTMP

dUMP

PRPPadenine

PPi

MUREIN SYNTHESIS

PROTEIN SECRETION

NUCLEOSIDES/NUCLEOTIDES ?

Fig. 3. Solute transport and metabolic pathways in T. pallidum. A schematicdiagram of a T. pallidum cell providing an integrated view of the transportersand the main components of the metabolism of this organism, as deducedfrom the genes identified in the genome. Presumed transporter specificity isindicated. Question marks indicate where particular uncertainties exist or

expected activities were not found. r, ribo; d, deoxy; AMP, adenosine mono-phosphate; CMP, cytosine monophosphate; NDP, nucleotide diphosphate;NTP, nucleotide triphosphate; TMP, thymidine monophosphate; UMP, uri-dine monophosphate; ADP, adenosine diphosphate; CoA, coenzyme A; UDP,uridine diphosphate; PRPP, phosphoribosyl-pyrophosphate.

R E S E A R C H A R T I C L E S

www.sciencemag.org SCIENCE VOL 281 17 JULY 1998 377

on

Nov

embe

r 8,

201

2w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

Page 5: Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete

synthase. In both spirochetes, the ATP syn-thase is of the V1V0 type, most similar tothose found in eukaryotic vacuoles and inarchaea (30). Treponema pallidum has twoV1V0-type ATP synthase operons, each con-taining seven genes (Table 1). The gene orderin one operon (subunit E–ORF–subunitA–subunit B–subunit D–subunit I–subunit K)is identical to that seen in the ATP synthaseoperon in B. burgdorferi (8). The secondoperon in T. pallidum contains ATP synthasesubunits A, B, D, E, F, I, and K. The differ-ence in subunit gene composition betweenthese operons suggests that the ATP syn-thases may have different functions in thecell.

One clue as to the functional role for thetwo ATP synthases is the presence of anoxaloacetate decarboxylase transporter thatmay be involved in extrusion of Na1 fromthe cell, creating a Na1 gradient (31). Such agradient can be used to drive Na1-dependenttransporters similar to the amino acid trans-porters that are found in T. pallidum. Alter-natively, the Na1 gradient could be used tosynthesize ATP in the same manner as theH1 gradient is used by an F1F0-type ATPsynthase. Two ATP synthases have beenidentified in Enterococcus hirae, with speci-ficity for Na1 and H1, respectively (32).

Cellular processes. Treponema pallidumis microaerophilic and grows only at reducedconcentrations of molecular oxygen (33).This most likely reflects a balance betweenan oxygen requirement for energy productionand defects in protective mechanisms againstreactive oxygen intermediates. Unlike B.burgdorferi, which is also microaerophilic, T.pallidum apparently lacks genes encoding su-peroxide dismutase, catalase, or peroxidase

activities that protect against oxygen toxicity.NADH oxidase is the only enzyme identifiedthus far that can account for O2 utilization byT. pallidum.

Treponema pallidum contains a basic setof heat shock proteins but lacks s32, which isresponsible for transcription of heat shockgenes in other bacteria. This lack is consistentwith previous reports that T. pallidum lacks adetectable heat shock response. There is nochange in the amounts of GroEL or otherproteins at increased temperatures (34). Atleast two heat shock proteins in T. pallidum(GroEL and DnaK) appear to be constitutive-ly expressed at high levels, which may miti-gate the need for a typical heat shock re-sponse (35). However, the observed thermalsensitivity of T. pallidum (6 ) may reflect theabsence of a robust heat shock response inthis organism. It is of interest that B. burg-dorferi, which also lacks a recognizable s32,exhibits a heat shock response. The differen-tial response of the two spirochetes to in-creased temperatures suggests that a proteinor proteins of unknown biological func-tion may be involved in this process in B.burgdorferi.

Regulatory functions. Treponema palli-dum contains a minimal set of regulatorygenes that encode two response-regulatortwo-component systems and several puta-tive transcriptional repressors of unknownspecificity.

Although T. pallidum does not have asugar-specific PTS system, it does contain ahomolog of enzyme I (ptsI), a phosphocarrierprotein HPr (ptsH), an HPr(Ser) kinase(ptsK), and two ptsN genes, which suggeststhat these proteins may function mainly asregulators. Gram-positive bacteria have aspecific ATP-dependent protein kinase(ptsK) that phosphorylates HPr on a serineresidue (36). HPr(Ser;P) and a DNA-bind-

ing protein then interact to mediate repressionby binding specifically to DNA sequences,catabolite responsive elements found in thecontrol regions of catabolite-sensitive oper-ons (36). These proteins in T. pallidum mayfunction in a manner similar to that observedin Gram-positive bacteria.

Escherichia coli and other Gram-negativeorganisms coordinate nitrogen and carbonutilization so that mechanisms of carbon re-pression do not block the uptake and use oforganic nitrogen sources. Nitrogen-carbonutilization in E. coli is modulated by a regu-latory protein, PtsN, that displays similarityto the PTS enzymes IIA specific for fruc-tose and mannitol (37). Biochemical datasuggest that PtsN does not phosphorylate car-bohydrates but instead serves as a positiveregulator of organic nitrogen metabolism.Under such conditions, phosphoenolpyruvate(PEP)-dependent phosphorylation of PtsNoccurs through the transfer of a phosphategroup from PEP to enzyme I, then to a histi-dine residue on HPr, and finally to PtsN (37).The gene content of T. pallidum suggests thatboth ATP- and PEP-dependent protein phos-phorylation of HPr may integrate intracellularsignals reflecting the metabolic state of thecell. However, these hypotheses remain to bedemonstrated experimentally. These proteinsmay play alternative regulatory roles in T.pallidum as this organism displays limitedtransport and metabolic capacities.

Motility and chemotaxis. Motility-asso-ciated genes are highly conserved in both T.pallidum and B. burgdorferi, consistent withthe importance of this activity in these highlyinvasive spirochetes (23, 38). The 36 genesencoding proteins involved in flagellar struc-ture and function in T. pallidum are mostsimilar to those in B. burgdorferi (8). Theydiffer only in the number of proteins in theperiplasmic flagellar filaments; T. pallidum

Fig. 4. Dendrogram of members of the tprprotein family of T. pallidum. This unrooteddistance dendrogram was generated from amultiple sequence alignment of the 12 T. pal-lidum tpr paralogs described herein and Mspsequences from three Treponema denticolastrains (tdA, T. denticola OTK; tdB, T. denticola35405; tdC, T. denticola 33520) (42).

Fig. 5. Gene clusters found in both the T. pallidum and B. burgdorferi genomes. Graph valuesrepresent the number of T. pallidum gene clusters in which orthologous genes have the sameorganization in B. burgdorferi.

R E S E A R C H A R T I C L E S

17 JULY 1998 VOL 281 SCIENCE www.sciencemag.org378

on

Nov

embe

r 8,

201

2w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

Page 6: Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete

has three core proteins (FlaB1, FlaB2, andFlaB3), a sheath protein (FlaA), and twouncharacterized proteins (39), and B. burg-dorferi has a single core protein and sheathprotein, whereas most other bacteria haveonly a core protein. Both spirochetes containtwo copies of the flagellar motor switch pro-tein, FliG; however, the importance of thisgene duplication is not known. Most of theflagellar genes in T. pallidum are found infour operons that contain between 2 and 16genes, most similar to the arrangement seenin B. burgdorferi. Treponema pallidum hasretained a s28 ortholog and class II and classIII motility promoters, whereas motility genesin B. burgdorferi appear to be transcribedthrough s70 initiation (8, 40). Treponema pal-lidum contains 13 chemotaxis genes that in-clude four methyl-accepting chemotaxis pro-teins with putative specificity for amino acids(aspartate, glutamate, and histidine) or carbohy-drates (glucose, ribose, and galactose).

Membrane proteins and lipoproteins.Freeze fracture studies (7) have shown thatthe outer membrane of T. pallidum contains arelatively small number of integral membraneproteins, a feature that may permit the organ-ism to evade the human immune response.Two candidate outer membrane proteins havebeen identified, but the cellular location andfunction of these proteins are a subject ofsome controversy (41). Although it is diffi-cult to identify outer membrane proteins withcertainty, genome analysis of T. pallidumindicates the presence of only 22 putativelipoproteins, as compared with 105 in B.burgdorferi, consistent with results from ul-trastructural studies.

Potential virulence factors. Treponemapallidum contains a large family of duplicat-ed genes (paralogs) (tprA-L) that encode pu-tative membrane proteins that may functionas porins and adhesins (Figs. 2 and 4). Thishypothesis is based on pair-wise and multiplesequence alignments of the T. pallidum genefamily to a major outer sheath protein (Msp)from T. denticola that represents an abundant,highly immunogenic, pore-forming adhesinin the outer membrane (42). It is not yetknown whether the tpr genes are expressedindividually or coordinately or to what extenteach gene is expressed. This gene family in T.pallidum is reminiscent of a 32-memberparalogous gene family encoding outer mem-brane proteins in Helicobacter pylori (omp)(9). The two gene families share featuresbesides possible porin and adhesin functions,the most striking being that both have mem-bers with regions of extensive sequence iden-tity. However, in both organisms, the homol-ogous regions do not always encompass theentire gene, so that some regions are identi-cal, but others are variable. As in the H.pylori family, one of the T. pallidum genes(tprA and L) contains a frameshift within a

small dinucleotide repeat that might be cor-rected by slipped-strand mispairing. Multiplecopies of the tpr genes may represent a mech-anism for generation of antigenic variation inT. pallidum as is found in other pathogenicbacteria, including Neisseria gonorrhoeae,M. genitalium, relapsing fever borreliae, andB. burgdorferi. Identification of the tpr fam-ily of putative outer membrane proteins mayprovide new targets for vaccine development.

Previous studies have indicated that T.pallidum does not produce lipopolysaccha-ride or potent exotoxins, although cytotoxicactivity against neuroblasts and other celltypes has been observed at extremely highconcentrations of the bacterium (43). Ge-nome analysis has revealed five genes encod-ing proteins similar to bacterial hemolysins.These putative hemolysin orthologs alsoshare varying degrees of amino acid sequencesimilarity with B. burgdorferi predicted pro-teins (8). None of the predicted hemolysinswith similarity to T. pallidum sequences havebeen shown to be cytolytic in their purifiedstate, and it will be necessary to perform suchstudies with the T. pallidum proteins before acytotoxic function can be assigned definitive-ly. A B. burgdorferi protein with hemolyticactivity (BlyA) was described recently (44),but orthologous sequences are not present inT. pallidum.

Comparative genomics. Four hundredseventy-six ORFs in T. pallidum (46%) haveorthologs in B. burgdorferi; 76% of theseORFs have a predicted biological function.More than 40% of the orthologous genes in T.pallidum and B. burgdorferi are highly con-served in other bacteria (8, 9) and are in-volved in housekeeping functions such astranscription, translation, DNA replication,basic energy metabolism, flagellar structureand function, cell division, and protein secre-tion. Some of the genes of unknown functionthat are conserved in the spirochetes but notrecognized in the other available genome se-quences are likely to represent “spirochete-specific” genes that contribute to the unusualstructural properties of these bacteria.

One hundred fifteen ORFs shared by T.pallidum and B. burgdorferi encode proteinsof unknown biological function; and almost50% of these appear to be unique to thespirochete group. This set of proteins with alimited phylogenetic distribution may includeimportant determinants of spirochete struc-ture and physiology and may, for example, beinvolved in the ability of both T. pallidumand B. burgdorferi to infect humans andcause chronic, disseminated disease.

Three hundred four of the ORFs shared bythe two spirochetes are located in gene clus-ters with conserved gene order (Fig. 5). Sev-eral conserved clusters contain ORFs encod-ing ribosomal proteins, including the largestcluster containing 27 ORFs, whereas other

important clusters encode proteins for theflagella, adenosine triphosphatases (ATPases),and cell division, as well as groups of pro-teins that are not obviously related. Furtherstudy of the arrangement of these clusters inthe two genomes may provide insight intothe evolution of the chromosomes of theseorganisms.

Of the 572 T. pallidum ORFs (54% oftotal) that are not shared with B. burgdorferi,more than 80% are of unknown biologicalfunction. This finding lends support to theconcept of diversity within a single group ofbacteria and underscores the fact that a con-siderable amount of T. pallidum biology isyet to be elucidated.

Conclusion. Treponema pallidum hasbeen a difficult organism to study experimen-tally because of its absolute dependence on amammalian host for sustained growth andviability. The genomic sequence of T. palli-dum offers a wealth of basic information thatwould be difficult, if not impossible, to obtainby any other approach. A more completeunderstanding of the biochemstry of this or-ganism derived from genome analysis mayprovide a foundation for the development of aculture medium for T. pallidum, which opensup the possibility of future genetic studies.

References and Notes1. F. F. Cartwright, Disease and History (Dorset, New

York, 1972).2. S. H. Barondes, Molecules and Mental Illness (Scien-

tific American Library, New York, 1993).3. E. N. Robinson Jr. et al., in Syphilis: Disease with a

History. Mechanisms of Microbial Disease, M. Schae-chter, G. Medoff, B. I. Einstein, Eds. (Williams &Wilkins, Baltimore, MD, 1993), pp. 334–342; S. Selland S. J. Norris, Int. Rev. Exp. Pathol. 24, 204 (1983).

4. E. M. Walker, J. K. Arnett, J. D. Heath, S. J. Norris,Infect. Immun. 59, 2476 (1991); E. M. Walker et al., J.Bacteriol. 177, 1797 (1995).

5. S. J. Norris et al., Microbiol. Rev. 57, 750 (1993).6. A. H. Fieldsteel, D. L. Cox, R. A. Moeckli, Infect.

Immun. 32, 908 (1981); S. J. Norris and D. G. Ed-mondson, ibid. 53, 534 (1986); D. L. Cox, MethodsEnzymol. 236, 390 (1994).

7. J. D. Radolf, M. V. Norgard, W. W. Shulz, Proc. Natl.Acad. Sci. U.S.A. 86, 2051 (1989); E. M. Walker, L. A.Borenstein, D. R. Blanco, J. N. Miller, M. A. Lovett, J.Bacteriol. 173, 5585 (1991); D. L. Cox, P. Chang, A. W.McDowall, J. D. Radolf, Infect. Immun. 60, 1076(1992).

8. C. M. Fraser et al., Nature 390, 580 (1997).9. R. D. Fleischmann et al., Science 269, 496 (1995);

C. M. Fraser et al., ibid. 270, 397 (1995); C. J. Bult etal., ibid. 273, 1058 (1996); J.-F. Tomb et al., Nature388, 539 (1997); H.-P. Klenk et al., ibid. 390, 364(1997).

10. The Nichols strain of T. pallidum subsp. pallidum wasoriginally isolated from the cerebrospinal fluid of aneurosyphilis patient in 1912 [H. A. Nichols and W. H.Hough, J. Am. Med. Assoc. 60, 108 (1913)]. Thisstrain has retained its virulence and appears to beclosely related to more recent isolates. Freshly ex-tracted T. pallidum subsp. pallidum Nichols was in-oculated into the testes of adult male New Zealandwhite rabbits (;5 3 107 T. pallidum per testis) thatwere housed at 16° to 18°C and were providedantibiotic-free feed and water. Animals were treatedon days 0 and 7 with triamcinolone acetonide (6mg/kg) to inhibit the host immune response andthereby increase bacterial yields. Rabbits were killedon day 11, and the testes were removed aseptically,

R E S E A R C H A R T I C L E S

www.sciencemag.org SCIENCE VOL 281 17 JULY 1998 387

on

Nov

embe

r 8,

201

2w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

Page 7: Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete

rinsed in phosphate-buffered saline (PBS), andminced. The tissue was extracted for 30 min withconstant stirring under an atmosphere of 1.5% O2,5% CO2, balance N2 in 30 ml of PBS containing 10%heat-inactivated normal rabbit serum, 1.0 mM dithio-threitol, and heparin (1.5 U/ml). The extract wasremoved and saved and the extraction process re-peated. The resulting suspensions were centrifugedtwice at 500g for 7 min at 4°C to remove tissuedebris, and the resulting supernatant was centrifugedat 20,000g for 30 min at 4°C to sediment the T.pallidum. Pellets from each extraction were washedonce with PBS, resuspended in 2 ml of PBS, andlayered onto a 34-ml 10 to 60% continuous gradientof Hypaque in PBS [J. B. Baseman, J. C. Nichols, O.Rumpp, N. S. Hayes, Infect. Immun. 10, 1062 (1974)].Gradients were centrifuged at 87,000g for 80 min at20°C. Bands corresponding to purified T. pallidumwere removed and washed once with 30 ml of PBS(20,000g for 30 min at 4°C) and twice in a micro-centrifuge (1.5 ml, 13,000g for 10 min at 4°C). About1.9 3 1010 purified organisms were recovered.Treponema pallidum in 1 ml of PBS was lysed byaddition of 1 ml of lysis solution [tris-EDTA (TE)10 :100 0.5% SDS, proteinase K (100 mg/ml)] andincubation at 55°C for 2 hours. The lysate was re-peatedly extracted with an equal volume of phenol:chloroform:isoamyl alcohol until the interface wasfree of visible debris. After a final extraction withchloroform and addition of 1 M ammonium acetate,the nucleic acid was precipitated from the aqueousphase by addition of two volumes of ethanol, pel-leted by centrifugation, washed once with 70% eth-anol, air-dried for 15 min at 37°C, resuspended in 100ml of TE buffer (10 :1), and stored at 4°C. DNAconcentration was determined by fluorometry. Elec-trophoresis of the DNA in 0.6% agarose gel (0.53tris–boric acid–EDTA buffer) indicated that the DNAwas .23 kb in size. Purity of the T. pallidum DNAwas assessed by treatment of 3 ml of the sample withSal I, electrophoresis, Southern (DNA) blotting, andhybridization with a rabbit mitochondrial DNA probe.Although hybridizing bands were detectable, theamount of contaminating rabbit mitochondrial DNAwas estimated to be ,1% of the total DNA. Forpreparation of a small insert library in pUC 18 vector,;10 mg of purified T. pallidum DNA was sheared bynebulization at 1.5 3 1023 N/m2 for 2 min and sizefractionated on a 1.2% agarose gel. A library with anaverage insert size between 1.6 and 2.0 kb wasprepared, and the T. pallidum genome was sequencedby a whole-genome random sequencing method pre-viously applied to other microbial genomes (8, 9). Anabout sevenfold genome coverage was achieved bygenerating 14,464 sequences with an average editedlength of 525 bases. Sequences were assembled withTIGR Assembler, resulting in a total of 130 assembliescontaining at least two sequences, which were clus-tered into 30 groups on the basis of linking informa-tion from forward and reverse sequence reads. All T.pallidum sequences that had been mapped previouslywere searched against the assemblies in an attemptto order the contig groups. Sequence and physicalgaps for the chromosome were closed as previouslydescribed (8, 9). A total of 676 sequencing reactionswere necessary to complete genome closure. At thecompletion of the project, less than 1% of the chro-mosome had single-fold coverage. Coding regions(ORFs) were identified with compositional analysisthat used an interpolated Markov model based onvariable length oligomers [S. Salzberg, A. Delcher, S.Kasif, O. White, Nucleic Acids Res. 26, 544 (1998)].ORFs of .600 bp were used to train the Markovmodel, as well as T. pallidum ORFs from GenBank.Once trained, the model was applied to the completeT. pallidum genome sequence. ORFs that overlappedwere visually inspected and, in some cases, removed.All putative ORFs were searched against a nonredun-

dant amino acid database as previously described (8,9). ORFs were also analyzed with 527 hidden Markovmodels constructed for a number of conserved pro-tein families (pfam v2.0) with hmmer [E. L. Sonham-mer, S. R. Eddy, R. Durbin, Proteins 28, 405 (1997)].Families of paralogous genes were constructed bypair-wise searches of proteins with FASTA. Matchesthat spanned at least 60% of the smaller of theprotein pair were retained and visually inspected. Atotal of 42 paralogous gene families containing 129genes were identified (Fig. 1). TopPred was used toidentify potential membrane-spanning domains(MSD) in proteins [M. G. Claros and G. vonHeijne,Comput. Appl. Biosci. 10, 685 (1994)]. Five hundredthirty-five proteins containing at least one putativeMSD were identified, of which 246 were predicted tohave more than one MSD. The presence of signalpeptides and the probable position of a cleavage sitein secreted proteins were detected with Signal-P [H.Neilsen, J. Engelbrecht, S. Brunak, G. von Heijne,Protein Eng. 10, 1 (1997)]; 155 proteins were predict-ed to have a signal peptide. Lipoproteins were iden-tified by scanning for a lipobox in the first 30 aminoacids of every protein. Twenty-two putative lipopro-teins were identified, and five of these were ORFs ofunknown function. All features identified with theseprograms were visually inspected to eliminate falsepositives.

11. M. Riley, Microbiol. Rev. 57, 862 (1993).12. J. R. Lobry, Mol. Biol. Evol. 13, 660 (1996); M. P.

Francino and H. Ochman, Trends Genet. 13, 240(1997).

13. P. M. Sharp and W.-H. Li, Nucleic Acids Res. 15, 1281(1987); J. O. McInerney, Microbiol. Comp. Genomics2, 89 (1997).

14. B. J. Paster, F. E. Dewhirst, W. G. Weisburg, J. Bacte-riol. 173, 6101 (1991); A. Centurion-Lara, C. Castro,W. C. van Voorhis, S. Lukehart, FEMS Microbiol. Lett.143, 235 (1996).

15. C. Ojaimi, B. E. Davidson, I. Saint Girons, I. G. Old,Microbiology 140, 2931 (1994).

16. W. Freist, D. H. Gauss, M. Ibba, D. Soll, Biol. Chem.378, 1103 (1997).

17. M. Ibba, J. L. Bono, P. A. Rosa, D. Soll, Proc. Natl. Acad.Sci. U.S.A. 94, 14383 (1997); M. Ibba et al., Science278, 1119 (1997).

18. S. Commans, P. Plateau, S. Blanquet, F. Dardel, J. Mol.Biol. 253, 100 (1995); S. Cusack, A. Yaremchuk, M.Tukalo, EMBO J. 15, 6321 (1996).

19. L. V. Stamm, S. R. Greene, N. Y. Barnes, H. L. Bergen,J. M. Hardham, FEMS Microbiol. Lett. 155, 115(1997).

20. C. D. Cox, in Pathogenesis and Immunology of Trepo-nemal Infection, R. F. Schell and D. M. Musher, Eds.(Dekker, New York, 1983), pp. 57–70.

21. J. T. Barbieri, F. E. Austin, C. D. Cox, Infect. Immun. 31,1071 (1981).

22. L. V. Stamm, N. R. Young, J. G. Frye, J. M. Hardham,DNA Sequence 6, 293 (1996).

23. S. F. Porcella et al., Gene 177, 115 (1996).24. A. Death and T. Ferenci, Res. Microbiol. 144, 529

(1993); T. Ferenci, FEMS Microbiol. Rev. 18, 301(1996).

25. A. Matsuura, A. Iwashima, Y. Nose, J. Vitaminol. 18,29 (1972).

26. K. G. Van Horn and R. M. Smibert, Can. J. Microbiol.29, 1141 (1983); R. C. Johnson, in The Prokaryotes,M. P. Starr et al., Eds. (Springer-Verlag, Berlin, 1981),pp. 582–591.

27. S. J. Norris et al., unpublished data.28. W. E. O’Brien, S. Bowien, H. G. Wood, J. Biol. Chem.

250, 8690 (1975); R. E. Reeves, R. Serrano, D. J. South,ibid. 251, 2958 (1976); R. G. Kemp and R. L. Tripathi,J. Bacteriol. 175, 5723 (1993); I. Bruchhaus, T. Jacobs,M. Denart, E. Tannich, Biochem. J. 316, 57 (1996).

29. P. G. Lysko and C. D. Cox, Infect. Immun. 21, 462(1978); ibid. 16, 885 (1977); N. L. Schiller and C. D.Cox, ibid., p. 60.

30. M. Forgac, Physiol. Rev. 69, 765 (1989); P. L. Pedersonand E. Carafoli, Trends Biochem. Sci. 12, 146 (1987);J. Konishi, T. Wakagi, T. Oshima, M. Yoshida, J. Bio-chem. 102, 1379 (1987).

31. E. Schwarz, D. Oesterhelt, H. Reinke, K. Beyreuther, P.Dimroth, J. Biol. Chem. 263, 9640 (1988); G.Woehlke, K. Wifling, P. Dimroth, ibid. 267, 22798(1992).

32. C. Shibata, T. Ehara, K. Tomura, K. Igarashi, H. Koba-yashi, J. Bacteriol. 174, 6117 (1992); K. Takase et al.,J. Biol. Chem. 269, 11037 (1994).

33. C. D. Cox and M. K. Barber, Infect. Immun. 10, 123(1974); W. H. Cover et al., Sex. Transm. Dis. 9, 1(1982); D. L. Cox et al., Appl. Environ. Microbiol. 56,3063 (1990).

34. L. V. Stamm, F. C. Gherardini, E. A. Parrish, C. R.Moomaw, Infect. Immun. 59, 1572 (1991).

35. S. J. Norris, unpublished data.36. J. Reizer et al., Mol. Microbiol. 27, 1157 (1998); J.

Deutscher and M. H. Saier Jr., Proc. Natl. Acad. Sci.U.S.A. 80, 6790 (1983); B. E. Jones et al., J. Biol.Chem. 272, 26530 (1997); J. Reizer et al., EMBO J. 8,2111 (1989).

37. B. S. Powell et al., J. Biol. Chem. 270, 4822 (1995); J.Reizer, A. Reizer, M. H. Saier Jr., G. R. Jacobsen, ProteinSci. 1, 722 (1992); M. H. Saier Jr. and J. Reizer, Mol.Microbiol. 13, 755 (1994).

38. K. Doetsch, Nature 255, 656 (1975); N. W. Charon,E. P. Greenberg, M. B. Koopman, R. J. Limberger, Res.Microbiol. 143, 597 (1992); S. R. Greene, L. V. Stamm,J. M. Hardham, N. R. Young, J. G. Frye, DNA Sequence7, 267 (1997); K. E. Hagman, S. F. Porcella, T. G.Popova, M. V. Norgard, Infect. Immun. 65, 1701(1997).

39. S. J. Norris, N. W. Charon, R. G. Cook, M. D. Fuentes,R. J. Limberger, J. Bacteriol. 170, 4072 (1988); R. D.Isaacs et al., Infect. Immun. 57, 3403 (1989); R. D.Isaacs and J. D. Radolf, ibid. 58, 2025 (1990); L.Pallesen and P. Hindersson, ibid. 57, 2166 (1989); C. I.Champion, J. N. Miller, M. A. Lovett, D. R. Blanco, ibid.58, 1697 (1990).

40. R. J. Limberger, L. L. Slivienski, M. C. El-Afandi, L. A.Dantuono, J. Bacteriol. 178, 4628 (1996); Y. Ge andN. W. Charon, FEMS Microbiol. Lett. 153, 425 (1997).

41. J. D. Radolf, Mol. Microbiol. 16, 1067 (1995); D. R.Blanco et al., J. Bacteriol. 177, 3556 (1995); D. R.Blanco et al., ibid. 178, 6685 (1996); D. R. Blanco,J. N. Miller, M. A. Lovett, Emerg. Infect. Dis. 3, 11(1997); D. R. Akins et al., J. Bacteriol. 179, 5076(1997); C. I. Champion et al., ibid., p. 1230; J. M.Hardham et al., Gene 197, 47 (1997).

42. J. C. Fenno et al., J. Bacteriol. 179, 1082 (1997); D. A.Mathers, W. K. Leung, J. C. Fenno, Y. Hong, B. C.McBride, Infect. Immun. 64, 2904 (1996); J. C. Fenno,K. H. Muller, B. C. McBride, J. Bacteriol. 178, 2489(1996).

43. T. J. Fitzgerald, L. A. Repesh, S. G. Oakes, Br. J. Vener.Dis. 58, 1 (1982); S. G. Oakes, L. A. Repesh, R. S.Pozos, T. J. Fitzgerald, ibid., p. 220; G. H. Wong, B. M.Steiner, S. Graves, Infect. Immun. 41, 636 (1983).

44. T. Guina and D. B. Oliver, Mol. Microbiol. 24, 1201(1997).

45. We thank A. R. Kerlavage for preparation of Fig. 2;D. Soll and M. H. Saier Jr. for helpful discussions; M.Heaney, J. Scott, and A. Saeed for software anddatabase support; and B. Vincent, D. Maas, and V.Sapiro for computer system support. Supported byNIH grant AI31068 to GMW. Requests for reprintsshould be sent to C.M.F. (e-mail: [email protected]).The annotated genome sequence and gene familyalignments are available on the World Wide Webat www.tigr.org/tdb/mdb/tpdb/tpdb.html. Addi-tional information regarding the T. pallidum ge-nome is available at utmmg.med.uth.tmc.edu/treponema/tpall.html. The genome sequence hasbeen deposited with GenBank under accessionnumber AE000520.

6 March 1998; accepted 6 May 1998

R E S E A R C H A R T I C L E S

17 JULY 1998 VOL 281 SCIENCE www.sciencemag.org388

on

Nov

embe

r 8,

201

2w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from