Analysis of putative transcription-associated proteins (TAPs) · annotation (Supplementary Table 4)...



Intron length distribution

During annotation it was observed that many S. mansoni genes show a bias in their intron

sizes, with the size of introns markedly increasing from the 5’ end to the 3’ end. In particular,

small introns (<30 nt) are numerous at 5’ ends and introns of several kilobases are numerous

at 3’ ends of transcripts. To test this distribution, we extracted intron sequences from genes in

the S. mansoni genome annotation (v4). A large number of genes span gaps, preventing an

accurate determination of intron size. The analysis was, therefore, limited to 5,906 gene

models (5,890 introns; Supplementary Table 4), in which the start and stop codons for a gene

were present within a single contig. Despite limiting the analysis to approximately one-half

of the total predicted gene set, a size bias can clearly be seen (Figure 2, Supplementary Table

5) for introns at positions 1-6, relative to the 5’ or 3’ ends of each transcript. Because genes

with different sizes were compared directly alongside each other, the signal of skewed size

distribution is overtaken by noise after the first 5 or 6 introns.
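The position-wise comparison described above can be sketched as follows. This is a minimal illustration, assuming each gene model is given as an ordered list of exon coordinates on a single contig; the function names, the use of the median as the summary statistic and the toy coordinates are assumptions made here, not details taken from the original analysis.

```python
from collections import defaultdict
from statistics import median


def intron_lengths(exons):
    """Intron lengths, 5'->3', for one transcript.

    `exons` is a list of (start, end) pairs, 1-based inclusive, already
    ordered 5'->3' along the transcript (plus-strand coordinates assumed).
    """
    return [next_start - prev_end - 1
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


def median_by_position(genes, max_pos=6):
    """Median intron length at positions 1..max_pos from the 5' and 3' ends."""
    from_5p, from_3p = defaultdict(list), defaultdict(list)
    for exons in genes.values():
        lengths = intron_lengths(exons)
        for pos, length in enumerate(lengths[:max_pos], start=1):
            from_5p[pos].append(length)
        # Reverse so that position 1 is the intron nearest the 3' end.
        for pos, length in enumerate(lengths[::-1][:max_pos], start=1):
            from_3p[pos].append(length)

    def summarise(by_pos):
        return {pos: median(vals) for pos, vals in sorted(by_pos.items())}

    return summarise(from_5p), summarise(from_3p)


# Hypothetical toy gene models, not taken from the annotation.
toy_genes = {
    "geneA": [(1, 100), (126, 200), (1201, 1300)],
    "geneB": [(1, 50), (81, 150), (651, 700), (3701, 3800)],
}
five_prime, three_prime = median_by_position(toy_genes)
print(five_prime)
print(three_prime)
```

Because transcripts with different intron counts are pooled, later positions are supported by fewer genes, which is consistent with the signal fading after the first 5 or 6 introns.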

The reason for the distribution remains to be determined but a recent study of human-

chimpanzee orthologue pairs found that more 5' introns are longer than 3' introns, because

they contain more regulatory elements1. Using version 53 of the Ensembl database2, we

extracted intron sizes from the C. elegans (release 190) and human genome (NCBI 36)

annotation (Supplementary Table 4) and confirmed that in both C. elegans and human, there

is a pronounced but opposite bias with considerably larger introns at the 5’ end of each

transcript (Supplementary Table 5).

Analysis of putative transcription-associated proteins (TAPs)

Schistosome transcription factor families were identified by searching against a customised

transcription factor database and clustered using the TRIBE-MCL algorithm. The clustering

analysis detected 362 protein families containing Schistosoma sequences, including 27

singletons, defined as clusters containing only a single member (Supplementary Fig. 2). The

335 S. mansoni TAP clusters were analysed with respect to the origin of the members from

the other 18 metazoan genomes queried with the HMMs (Supplementary Table 6). On

SUPPLEMENTARY INFORMATION doi: 10.1038/nature08160

www.nature.com/nature 1


average, 43% of the clusters contain chordate sequences, compared with only 27% for

nematodes. Furthermore, the schistosome DNA-binding proteins appear to

display the greatest similarity with those of pufferfish (49% and 47% of the TAP families

have members originating from Tetraodon nigroviridis and Takifugu rubripes, respectively).
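One way such per-taxon percentages could be derived from the clustering output is sketched below. It assumes the clusters are available as a mapping from cluster identifier to (species, sequence identifier) pairs and that the group figure is an average over the species in that group; the data structure, function names and toy clusters are illustrative assumptions only.

```python
def fraction_with_species(clusters, species):
    """Fraction of clusters that contain at least one sequence from `species`."""
    hits = sum(1 for members in clusters.values()
               if any(sp == species for sp, _ in members))
    return hits / len(clusters)


def mean_fraction_for_group(clusters, group_species):
    """Average of the per-species fractions over a group of genomes."""
    fractions = [fraction_with_species(clusters, sp) for sp in group_species]
    return sum(fractions) / len(fractions)


# Hypothetical clusters, each already known to contain a schistosome sequence.
toy_clusters = {
    "c1": [("S. mansoni", "a"), ("T. rubripes", "b")],
    "c2": [("S. mansoni", "c")],
    "c3": [("S. mansoni", "d"), ("T. rubripes", "e"), ("C. elegans", "f")],
    "c4": [("S. mansoni", "g"), ("H. sapiens", "h")],
}
print(fraction_with_species(toy_clusters, "T. rubripes"))
print(mean_fraction_for_group(toy_clusters, ["T. rubripes", "H. sapiens"]))
```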

Evolution of distinct tissues

Schistosomes possess all the components of the Scribble, Par and Crumbs complexes3 that

interact directly with the cytoskeleton to specify where adherens and septate/tight junctions

form, thus controlling cell polarity (Supplementary Table 9). The presence of Claudin and

Zonula Occludens-1, critical proteins of the septate/tight junction, suggests that schistosome

epithelia are organised in a similar fashion to Drosophila while genes encoding other septate

junction proteins, e.g. Neurexin IV, Neuroglian, and Gliotactin4 accord with the ultrastructural

descriptions of septate structures in sensory nerve endings5. Cell-cell contact interactions

responsible for denoting tissue boundary lines may be specified by Notch signalling, and

homologues of the Notch-processing protease, Kuzbanian, are present, while the array of at

least ten Notch/Delta-like genes may be linked to the morphological diversity of distinct

phenotypes in the life cycle.

Schistosomes possess transcription factors (Supplementary Table 9), which, in insects

and vertebrates, mediate neurogenesis and act to pattern the nerve cord, supporting the ancient

origins of neural patterning6. These genes, well studied in vertebrates, act in concert with

Max-interacting protein upon neurogenin to influence Delta and promote lateral inhibition;

thus cells are selected from a pool of neural progenitors to become neurons. The use of

attractive or repulsive guidance cues in nervous system development, best characterised in C.

elegans7, directs nerves to their synaptic partners. In addition to Netrin and its membrane

receptor UNC-5 (ref. 8), homologues of the antagonistic slit and roundabout proteins indicate that

schistosomes have all the tools needed to control axon growth cones and migration of neural

cells.


Nuclear receptors

Nuclear receptors (NRs) are important transcriptional modulators in metazoans, which

regulate homeostasis, differentiation, metamorphosis and reproduction. There are 21 members

found in S. mansoni, more than half of which evolved from a second gene duplication that is

independent of that in vertebrates. These data challenge the current theory of NR evolution9.

Two NRs are homologues of the thyroid hormone receptor that previously were thought to be

restricted to chordates10. Interestingly, a novel family of NR was identified in S. mansoni and

subsequently demonstrated to exist in other platyhelminths, mollusks and crustaceans9. In S.

mansoni there are three members of the family that each possesses two DNA-binding

domains (DBD) in tandem followed by a ligand-binding domain. The unique structure of

having two DBDs suggests a novel mechanism for regulation of target gene(s).

Redox biochemistry

Schistosome parasites reside in an aerobic environment and are vulnerable to damaging

reactive oxygen species produced by their own aerobic respiration as well as by the host

immune assault; schistosome redox enzymes have been shown to play a protective role11.

The redox biochemistry of S. mansoni is similar to but condensed relative to human pathways

(Supplementary Table 10 and Supplementary Fig. 3). Superoxide generated by the partial

reduction of molecular oxygen is converted to H2O2 by superoxide dismutase; H2O2 is

primarily reduced by peroxiredoxins, with electrons supplied by either thioredoxin or

glutathione12, which are reduced by a single, multifunctional protein thioredoxin glutathione

reductase, an essential protein and promising target for antischistosome drug development13.

Many redox proteins commonly found in other eukaryotes are apparently absent from the

genome (Supplementary Table 10). In addition, NADPH oxidase (superoxide generation) and

nitric oxide synthase (NO generation) are absent from the S. mansoni genome. Although there

are reports indicating the presence of nitric oxide synthase in worms14,15, diaphorase activity

in the parasite attributed to nitric oxide synthase could potentially be due to the activity of

ferredoxin NADP(H) oxidoreductase16 or perhaps to the putative cytochrome P450 reductase

(Smp_030760). NADPH oxidases are widespread in nature and are thought to have emerged


before the divergence of fungi, plants and animals17; their apparent absence from the S.

mansoni genome is remarkable.

Lipid Metabolism

Lipid metabolism is an area in which previous biochemical studies have shown

schistosomes to be deficient but a genetic explanation for their inability to synthesise sterols

or free fatty acids (FFA) de novo was lacking18. The mevalonate pathway, a precursor to

steroid synthesis, is completely represented in the S. mansoni genome (Supplementary Table

11). However, there is scant evidence for the presence of enzymes that convert its products,

via squalene and lanosterol, to sterols19,20. Similarly, S. mansoni’s inability to synthesise FFA

is confirmed by loss of genes encoding both the cytosolic and mitochondrial fatty acid

synthase complexes. Only an acyl carrier protein, acetyl-CoA carboxylase and the

mitochondrial enzyme beta-ketoacyl synthase are present. As schistosomes can elongate

existing FFA derived from the host, these genes may still perform a function, acting in

concert with several homologues of yeast fatty acid elongation proteins. It should be noted

that the usually biochemically inactive beta oxidation pathway, for which S. mansoni

possesses a complete complement of requisite enzymes, might operate in reverse to perform a

similar function21.

As compensation, schistosomes do have the enzymes necessary to incorporate FFA or

sterols into, or derive them from, more complex lipid molecules. It appears that S. mansoni

depends upon a plethora of receptors and transport proteins dedicated to lipid uptake from the

host: at least three low-density lipoprotein and two high-density lipoprotein receptors for the

uptake of esterified sterols, triacylglycerol (TAG) and phospholipids; at least four isoforms

of saposin, a protein involved in intracellular lipid transfer and the hydrolysis of sphingolipids in

lysosomes; Niemann–Pick type C proteins, which traffic cholesterol and glycosphingolipids;

and several phosphatidylinositol, phosphatidylethanolamine and phosphatidylcholine transfer

proteins. Together with previously identified LDL receptor molecules8, these complete the

schistosome’s extensive lipid-carrying protein repertoire.


Two isoforms of sterol esterase and several phospholipase homologues may provide

alternative sources of FFA, cholesterol or esterified intermediates such as lysophosphatidic

acid for membrane synthesis. With regard to esterified lipid precursors, it should be noted that

only one glycerol-3-phosphate O-acyltransferase, a microsomal isoform22, appears capable of

producing lysophosphatidic acid from FFA and glycerol 3-phosphate; the mitochondrial

isoform is absent. TAG plays an uncertain role in the schistosome’s life cycle, despite

constituting 40% or more of the lipid content of adult worms18. TAGs are slow to turn over and do

not contribute to the formation of other lipids such as phosphatidylcholine18, and the apparent

inactivity of beta oxidation makes their use as an energy store doubtful21. Nevertheless, S.

mansoni possesses lipases capable of breaking TAG down, so the molecule may have

functions beyond preventing intracellular FFA concentrations from rising too high18.

Pathways responsible for synthesizing the phospholipid components of membranes are

well represented with two notable exceptions. Phosphatidylcholine cannot be converted from

phosphatidylethanolamine but must be derived from diacylglycerol23 and, lacking inositol-3-

phosphate synthase, the parasite must depend on its host as a source of inositol. Unlike

phosphatidylcholine, phosphatidylserine synthesis proceeds via a pathway similar to that

utilised by mammals. There is no evidence that it can be synthesised directly from

diacylglycerol.

The kinome and signalling

The TGF beta signalling pathway has been shown to play a potential role in host-

parasite interactions by transducing a signal from host TGF beta 1, via a parasite signalling

pathway, to regulate gene expression24 and reproductive development25. In schistosomes the

TGFβ pathway has been conserved and, as in Drosophila melanogaster and Caenorhabditis

elegans26, a limited number of members are responsible for the diversity of

genes that are regulated. To date, two members of the schistosome TGFβ family of ligands,

two type II TGF-beta receptors, four type I TGF-beta receptors (SmTβRI and three activin

receptor-like type I kinases), five members of the schistosome Smad family, and ten


scaffolding/regulatory proteins and two target genes have been identified27 (Supplementary

Figure 5 and Supplementary Table 15). Both the TGF beta /Activin and BMP pathways are

represented by ligands and Smads. The type II and type I receptors for the BMP pathway have

not been identified.

In S. mansoni, few signalling molecules have been characterised, beyond those of the TGFβ

pathway. The study of the kinase complement (kinome) is of major importance for

understanding schistosome physiology and provides insights into how to disrupt its

interactions with its host. The S. mansoni genome encodes 249 kinases, including 22 genes

with alternative splicing. A dendrogram using the kinase domain shows that the main

schistosome groups CAMK (calcium/calmodulin-dependent protein kinase), AGC, STE

(homologues of yeast sterile kinases) and CMGC are well separated (Supplementary Fig. 8),

although several kinase families fell outside of these clusters. All of the main CMGC families

are conserved between yeast, nematodes, mammals and S. mansoni, with the exception of the

RCK family, which is absent from S. mansoni and yeast. The S. mansoni MAPKs include

representatives of the ERK (extracellular signal–regulated kinase) and JNK (c-Jun NH2-terminal

kinase) types, but appear to lack members of the p38 type (also not identified in S.

japonicum28). However, the presence of MAP kinases responsible for the activation of p38

(in humans and other species), suggests that this pathway (Supplementary Fig. 9) is active but

may have a different activator of the effector proteins, perhaps JNK. The second largest S.

mansoni kinase group, the AGC family, contains representatives of all family members

present in C. elegans and D. melanogaster. Interestingly, the most highly transcribed kinases

are 3 members of the AGC group that belong to the GRK family of G protein signalling

regulators29 (Supplementary Table 16 and Supplementary Figure 8).

The least represented groups observed within the schistosome genome are the casein

kinase (CK1) and Receptor Guanylate Cyclase (RGC) families with only 7 and 3 members,

respectively. In contrast, CK1 is the largest group in C. elegans with 87 members and RGC

has 27 members. The CK1 and CMGC group members expressed in sperm or during


spermatogenesis in C. elegans are missing in S. mansoni, perhaps indicating distinct

regulatory mechanisms in this species.

The STE kinases, which function in the MAP kinase activation cascade, are also

comparatively abundant in the schistosome genome. In S. mansoni, this group includes 7

STE7 (MAPKK) kinases, 4 STE11 (MAPKKK) kinases and 13 STE20 (MAPKKKK) kinases.

Crosstalk occurs between protein kinases functioning at different levels of the MAPK

cascade. The large number of STE family kinases could translate into an enormous potential

for upstream signal specificity and diversity. Vertebrate EGF activates S. mansoni EGFR and

the downstream classical ERK pathway, indicating the conservation of EGFR function in S.

mansoni. Moreover, human EGF has been shown to increase protein and DNA synthesis as

well as protein phosphorylation in parasites, supporting the hypothesis that host EGF could

regulate schistosome development30.

The Degradome

ESTs from GenBank, dbEST, and the Wellcome Trust Sanger Institute

(ftp://ftp.sanger.ac.uk/pub/pathogens/Schistosoma/mansoni/ESTs/) were aligned to the S.

mansoni genome using BLAT (minimum score of 30 and 90% sequence similarity) to obtain

information related to life cycle stage and adult worm gender from the original library

description31. Of the 335 protease sequences, 80% have at least one matching EST

(Supplementary Table 18).
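The thresholds quoted above could be applied to BLAT output in PSL format roughly as sketched below. The source does not state whether the cut-offs were set through BLAT options or by post-filtering, so this post-filter is an assumption; the score and identity calculations are simplified approximations, and the function names and example line are hypothetical.

```python
def parse_psl_line(line):
    """Parse the PSL columns needed for filtering (tab-separated, no header)."""
    f = line.rstrip("\n").split("\t")
    return {
        "matches": int(f[0]),
        "mismatches": int(f[1]),
        "rep_matches": int(f[2]),
        "q_num_insert": int(f[4]),
        "t_num_insert": int(f[6]),
        "q_name": f[9],   # EST identifier
        "t_name": f[13],  # genomic contig or scaffold
    }


def score(hit):
    """Simplified alignment score: matches minus mismatches and gap openings."""
    return (hit["matches"] + hit["rep_matches"] // 2
            - hit["mismatches"] - hit["q_num_insert"] - hit["t_num_insert"])


def percent_identity(hit):
    """Simplified identity over aligned bases, ignoring gaps."""
    aligned = hit["matches"] + hit["rep_matches"] + hit["mismatches"]
    if aligned == 0:
        return 0.0
    return 100.0 * (hit["matches"] + hit["rep_matches"]) / aligned


def keep(hit, min_score=30, min_identity=90.0):
    """Apply the two thresholds quoted in the text."""
    return score(hit) >= min_score and percent_identity(hit) >= min_identity


# Hypothetical PSL record: 95 matches and 5 mismatches over a 100-nt EST.
example = "\t".join(["95", "5", "0", "0", "0", "0", "0", "0", "+", "EST1", "100",
                     "0", "100", "contig1", "5000", "10", "110", "1", "100,",
                     "0,", "10,"])
hit = parse_psl_line(example)
print(hit["q_name"], score(hit), percent_identity(hit), keep(hit))
```

Retained hits can then be joined to the library description of each EST to recover life cycle stage and worm gender.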

Twenty-one percent of the degradome comprises inactive protease homologues (IPHs),

i.e., proteases that contain mutations in one or more of the critical active site residues

(Supplementary Table 18). These IPHs are found in all five mechanistic classes, and some

have already been confirmed as being actively transcribed32,33. Though IPHs constitute a

significant proportion of the total degradome in other organisms (e.g., 15% in humans34), their

function(s) is unclear. They may regulate or inhibit, perhaps by a dominant-negative effect,

protease activity or titrate endogenous protein inhibitors, thereby increasing overall

proteolytic capacity34,35.


Metabolic chokepoints

A total of 607 enzymatic reactions could be placed in pathways and 120 of these

enzymes were identified as chokepoints based on the capability to uniquely generate specific

products or utilise specific substrates (Supplementary Table 21). As validation of the

approach, the list of chokepoints includes many that are drug targets in S. mansoni or other

organisms. For instance, in the mevalonate pathway, almost all reactions are chokepoints. The

3-hydroxy-3-methylglutaryl-coenzyme A reductase (HMG-CoA reductase) is well characterized:

its inhibition is accompanied by a cessation of egg production by the female parasite and a

reduced ability of the parasite to properly glycosylate proteins36. That the enzyme is

significantly different from its mammalian orthologue and that its inhibitor, mevinolin, has

anti-schistosomal activity37 provide further validation for HMG-CoA reductase as a target.

The chokepoint enzyme, L-DOPA decarboxylase, plays an important role in catecholamine

biosynthesis in S. mansoni, decarboxylating 5-hydroxytryptophan and L-DOPA 38. The

therapeutic potential of the enzyme is suggested by the drug Methyldopa, which is commonly

used to treat hypertension and inhibits L-DOPA decarboxylase in mammals, and which

inhibits the enzyme activity in schistosome extracts38. Glycolysis contains 4 chokepoint

reactions, one of which, phosphofructokinase, has been extensively studied39,40, and a

relationship has been established between its inhibition by trivalent organic antimonials and the

inhibition of schistosome growth in vitro41,42.
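The chokepoint criterion stated above (a reaction that is the only consumer of a substrate or the only producer of a product) can be illustrated with a small sketch. The network representation, function name and toy reactions are assumptions for illustration and do not reproduce the pipeline used for Supplementary Table 21.

```python
from collections import defaultdict


def find_chokepoints(reactions):
    """Return reactions that uniquely consume or uniquely produce a metabolite.

    `reactions` maps a reaction name to a (substrates, products) pair of sets.
    """
    consumers = defaultdict(set)
    producers = defaultdict(set)
    for name, (substrates, products) in reactions.items():
        for metabolite in substrates:
            consumers[metabolite].add(name)
        for metabolite in products:
            producers[metabolite].add(name)

    chokepoints = set()
    for index in (consumers, producers):
        for reaction_names in index.values():
            if len(reaction_names) == 1:
                chokepoints |= reaction_names
    return chokepoints


# Hypothetical toy network: r1 and r2 are redundant, r3 is unique.
toy_network = {
    "r1": ({"A"}, {"B"}),
    "r2": ({"A"}, {"B"}),
    "r3": ({"B"}, {"C"}),
}
print(find_chokepoints(toy_network))
```

In this toy network r1 and r2 back each other up, whereas r3 is the only reaction consuming B and the only one producing C, so blocking it would be expected to cut the pathway.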

1. Gazave, E., Marques-Bonet, T., Fernando, O., Charlesworth, B. & Navarro, A. Patterns and rates of intron divergence between humans and chimpanzees. Genome biology 8, R21 (2007).

2. Hubbard, T.J. et al. Ensembl 2009. Nucleic Acids Res 37, D690-697 (2009).

3. Humbert, P.O., Dow, L.E. & Russell, S.M. The Scribble and Par complexes in polarity and migration: friends or foes? Trends Cell Biol 16, 622-630 (2006).

4. Furuse, M. & Tsukita, S. Claudins in occluding junctions of humans and flies. Trends Cell Biol 16, 181-188 (2006).

5. Nuttman, C.J. Fine Structure of Ciliated Nerve Endings in Cercaria of Schistosoma-Mansoni. Journal of Parasitology 57, 855-& (1971).

6. Cowden, J. & Levine, M. Ventral dominance governs sequential patterns of gene expression across the dorsal-ventral axis of the neuroectoderm in the Drosophila embryo. Dev Biol 262, 335-349 (2003).


7. Dickson, B.J. Molecular mechanisms of axon guidance. Science 298, 1959-1964 (2002).

8. Verjovski-Almeida, S. et al. Transcriptome analysis of the acoelomate human parasite Schistosoma mansoni. Nat Genet 35, 148-157 (2003).

9. Wu, W., Niles, E.G., Hirai, H. & LoVerde, P.T. Evolution of a novel subfamily of nuclear receptors with members that each contain two DNA binding domains. BMC Evol Biol 7, 27 (2007).

10. Wu, W., Niles, E.G. & LoVerde, P.T. Thyroid hormone receptor orthologues from invertebrate species with emphasis on Schistosoma mansoni. BMC Evol Biol 7, 150 (2007).

11. Loverde, P.T. Do antioxidants play a role in schistosome host-parasite interactions? Parasitol Today 14, 284-289 (1998).

12. Sayed, A.A., Cook, S.K. & Williams, D.L. Redox balance mechanisms in Schistosoma mansoni rely on peroxiredoxins and albumin and implicate peroxiredoxins as novel drug targets. J Biol Chem 281, 17001-17010 (2006).

13. Sayed, A.A. et al. Identification of oxadiazoles as new drug leads for the control of schistosomiasis. Nat Med 14, 407-412 (2008).

14. Kohn, A.B., Moroz, L.L., Lea, J.M. & Greenberg, R.M. Distribution of nitric oxide synthase immunoreactivity in the nervous system and peripheral tissues of Schistosoma mansoni. Parasitology 122 Pt 1, 87-92 (2001).

15. Kohn, A.B., Lea, J.M., Moroz, L.L. & Greenberg, R.M. Schistosoma mansoni: use of a fluorescent indicator to detect nitric oxide and related species in living parasites. Exp Parasitol 113, 130-133 (2006).

16. Girardini, J.E., Dissous, C. & Serra, E. Schistosoma mansoni ferredoxin NADP(H) oxidoreductase and its role in detoxification. Mol Biochem Parasitol 124, 37-45 (2002).

17. Bedard, K., Lardy, B. & Krause, K.H. NOX family NADPH oxidases: not just in mammals. Biochimie 89, 1107-1112 (2007).

18. Brouwers, J.F., Smeenk, I.M., van Golde, L.M. & Tielens, A.G. The incorporation, modification and turnover of fatty acids in adult Schistosoma mansoni. Mol Biochem Parasitol 88, 175-185 (1997).

19. Herman, G.E. Disorders of cholesterol biosynthesis: prototypic metabolic malformation syndromes. Hum Mol Genet 12 Spec No 1, R75-88 (2003).

20. Lees, N.D., Skaggs, B., Kirsch, D.R. & Bard, M. Cloning of the late genes in the ergosterol biosynthetic pathway of Saccharomyces cerevisiae--a review. Lipids 30, 221-226 (1995).

21. Barrett, J., Biochemistry of Parasitic Helminths. (Macmillan Publishers, London, 1981).

22. Chen, Y.Q. et al. AGPAT6 is a novel microsomal glycerol-3-phosphate acyltransferase. J Biol Chem 283, 10048-10057 (2008).

23. de Kroon, A.I. Metabolism of phosphatidylcholine and its implications for lipid acyl chain composition in Saccharomyces cerevisiae. Biochim Biophys Acta 1771, 343-352 (2007).

24. Osman, A., Niles, E.G., Verjovski-Almeida, S. & LoVerde, P.T. Schistosoma mansoni TGF-beta receptor II: role in host ligand-induced regulation of a schistosome target gene. PLoS Pathog 2, e54 (2006).

25. Freitas, T.C., Jung, E. & Pearce, E.J. TGF-beta signaling controls embryo development in the parasitic flatworm Schistosoma mansoni. PLoS Pathog 3, e52 (2007).


26. Ruvkun, G. & Hobert, O. The taxonomy of developmental control in Caenorhabditis elegans. Science 282, 2033-2041 (1998).

27. Loverde, P.T., Osman, A. & Hinck, A. Schistosoma mansoni: TGF-beta signaling pathways. Exp Parasitol 117, 304-317 (2007).

28. Wang, L. et al. Reconstruction and in silico analysis of the MAPK signaling pathways in the human blood fluke, Schistosoma japonicum. FEBS Lett 580, 3677-3686 (2006).

29. Metaye, T., Perdrisot, R. & Kraimps, J.L. [GRKs and arrestins: the therapeutic pathway?]. Med Sci (Paris) 22, 537-543 (2006).

30. Vicogne, J. et al. Conservation of epidermal growth factor receptor function in the human parasitic helminth Schistosoma mansoni. J Biol Chem 279, 37407-37414 (2004).

31. Kent, W.J. BLAT--the BLAST-like alignment tool. Genome Res 12, 656-664 (2002).

32. Merckelbach, A., Hasse, S., Dell, R., Eschlbeck, A. & Ruppel, A. cDNA sequences of Schistosoma japonicum coding for two cathepsin B-like proteins and Sj32. Trop Med Parasitol 45, 193-198 (1994).

33. Delcroix, M., Medzihradsky, K., Caffrey, C.R., Fetter, R.D. & McKerrow, J.H. Proteomic analysis of adult S. mansoni gut contents. Mol Biochem Parasitol 154, 95-97 (2007).

34. Puente, X.S., Sanchez, L.M., Overall, C.M. & Lopez-Otin, C. Human and mouse proteases: a comparative genomic approach. Nat Rev Genet 4, 544-558 (2003).

35. Lopez-Otin, C. & Overall, C.M. Protease degradomics: a new challenge for proteomics. Nat Rev Mol Cell Biol 3, 509-519 (2002).

36. Vandewaa, E.A., Mills, G., Chen, G.Z., Foster, L.A. & Bennett, J.L. Physiological role of HMG-CoA reductase in regulating egg production by Schistosoma mansoni. Am J Physiol 257, R618-625 (1989).

37. Chen, G.Z., Foster, L. & Bennett, J.L. Antischistosomal action of mevinolin: evidence that 3-hydroxy-methylglutaryl-coenzyme a reductase activity in Schistosoma mansoni is vital for parasite survival. Naunyn Schmiedebergs Arch Pharmacol 342, 477-482 (1990).

38. Catto, B.A. Schistosoma mansoni: decarboxylation of 5-hydroxytryptophan, L-dopa, and L-histidine in adult and larval schistosomes. Exp Parasitol 51, 152-157 (1981).

39. Ding, J., Su, J.G. & Mansour, T.E. Cloning and characterization of a cDNA encoding phosphofructokinase from Schistosoma mansoni. Mol Biochem Parasitol 66, 105-110 (1994).

40. Mansour, J.M., McCrossan, M.V., Bickle, Q.D. & Mansour, T.E. Schistosoma mansoni phosphofructokinase: immunolocalization in the tegument and immunogenicity. Parasitology 120 ( Pt 5), 501-511 (2000).

41. Su, J.G., Mansour, J.M. & Mansour, T.E. Purification, kinetics and inhibition by antimonials of recombinant phosphofructokinase from Schistosoma mansoni. Mol Biochem Parasitol 81, 171-178 (1996).

42. Bueding, E. & Mansour, J.M. The relationship between inhibition of phosphofructokinase activity and the mode of action of trivalent organic antimonials on Schistosoma mansoni. Br J Pharmacol Chemother 12, 159-165 (1957).


Supplementary Figure 1. Idiograms of S. mansoni chromosomes 1-7, W and Z. S. mansoni BAC clones were mapped to the karyotype of S. mansoni by FISH. The solid black areas are heterochromatin and the open areas are euchromatin. The BAC clones are identified by BAC numbers. Arrows identify the approximate location of the BAC on the chromosome. The scale is an arbitrary measure of chromosome length and allows positioning of the BAC.


Supplementary Figure 2. S. mansoni transcription-associated protein (TAP) families displayed as a 2D hierarchical clustering. Rows are labelled with species names and columns represent clusters that have schistosome sequences as members. A cell in the matrix is yellow if the cluster contains no sequences from that genome and orange if it does. The darker the shade of orange, the greater the number of sequences from that genome (after normalising for genome size).
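
The colouring rule in this caption can be illustrated with a minimal sketch. It is not the authors' plotting code: the species names, counts and genome sizes below are made up, and dividing each count by a per-species "genome size" value is only one plausible reading of the normalisation. The hierarchical clustering step itself is not reproduced.

```python
from typing import Dict, List


def normalised_matrix(
    counts: Dict[str, List[int]], genome_sizes: Dict[str, int]
) -> Dict[str, List[float]]:
    """Divide each species' per-cluster sequence count by that species' genome size."""
    return {
        species: [count / genome_sizes[species] for count in row]
        for species, row in counts.items()
    }


def cell_colour(value: float) -> str:
    """Yellow for a cell with no sequences, orange otherwise.

    In the figure the shade of orange deepens with the normalised value;
    here only the yellow/orange distinction is modelled.
    """
    return "yellow" if value == 0 else "orange"


# Hypothetical toy data: two species (rows) across three clusters (columns).
counts = {"species_A": [0, 2, 10], "species_B": [1, 0, 4]}
genome_sizes = {"species_A": 10_000, "species_B": 20_000}  # assumed units

matrix = normalised_matrix(counts, genome_sizes)
colours = {sp: [cell_colour(v) for v in row] for sp, row in matrix.items()}
```

With these toy inputs, the only yellow cells are the two whose raw count is zero; every other cell is orange, with a larger normalised value corresponding to a darker shade.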


[Supplementary Figure 3 diagram. Compartments shown: gut (hemoglobin digestion), cytoplasm and mitochondria. Pathway panel: GSH biosynthesis (L-glutamate + L-cysteine → γ-L-glutamyl-L-cysteine via γ-Glu-Cys synthase; + glycine → GSH via GSH synthase). Listed as absent: GSH reductase, Trx reductase, MsrA, sulfiredoxin, NADPH oxidase, NO synthase, catalase (1.11.1.6), 1-Cys Prx, GST classes (a, k, p, s, z), GPx (1.11.1.9), heme oxygenase.]

Supplementary Figure 3. Enzymes and proteins involved in redox metabolism in the S. mansoni genome. Proteins are shown in red boxes; reactive oxygen species are shown in yellow. Abbreviations: Trx, thioredoxin; GSH, glutathione; Prx, peroxiredoxin; TGR, thioredoxin glutathione reductase; SOD, superoxide dismutase; GPx, GSH peroxidase; Grx, glutaredoxin; FDR, ferredoxin NADP(H) oxidoreductase.


Supplementary Figure 4. The conserved cys-loop region of the putative cys-loop family ligand-gated ion channels from S. mansoni. The top ten sequences are nicotinic acetylcholine receptor subunits, and the bottom four sequences are from the non-nicotinic cys-loop groups, potentially GABA, glycine or acetylcholine-gated anion channels. The alignment of this segment was generated with ClustalX 2.0.9 with default parameters.

 


[Supplementary Figure 5 diagram. Legend: proteins/enzymes found in S. mansoni; proteins/enzymes not found in S. mansoni but found in humans. Linked pathways labelled in the diagram: MAPK signaling pathway; ubiquitin mediated proteolysis. Edge labels used: +p, -p, +u.]

Supplementary Figure 5. Schistosoma mansoni TGF-beta signaling pathway. Diagram showing the TGF-beta pathway of S. mansoni. The components of the pathway (identified in Supplementary Table 15) found in the schistosome genome are shown in blue. The white boxes depict components that are present in the human TGF-beta pathway but absent from the S. mansoni pathway.


Supplementary Figure 6. Relative size of the S. mansoni kinome. A total of 249 kinases were identified in the predicted proteome of the parasite. For comparison, the percentage of the total predicted proteome made up of kinases is shown for Plasmodium falciparum, Homo sapiens, Trypanosoma cruzi, Trypanosoma brucei, Caenorhabditis elegans, Leishmania major and Mus musculus (see Supplementary Methods for data sources).
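
The quantity plotted is a simple ratio, sketched below for illustration. Only the kinase count of 249 comes from the caption; the proteome size used in the example call is a hypothetical placeholder, not a value taken from this document.

```python
def kinome_percent(n_kinases: int, proteome_size: int) -> float:
    """Return kinases as a percentage of the total predicted proteome."""
    if proteome_size <= 0:
        raise ValueError("proteome_size must be positive")
    return 100.0 * n_kinases / proteome_size


S_MANSONI_KINASES = 249          # from the caption
EXAMPLE_PROTEOME_SIZE = 10_000   # hypothetical value, for illustration only

example = kinome_percent(S_MANSONI_KINASES, EXAMPLE_PROTEOME_SIZE)
```

Expressing the kinome as a percentage rather than an absolute count is what makes organisms with very different proteome sizes comparable on one axis.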


Supplementary Figure 7. Distribution of kinase groups in S. mansoni and model organisms. S. mansoni kinases were classified according to KinBase (www.kinase.com/kinbase). A complete description of the kinome classification can be found in Supplementary Table 16. The group classification of kinases from C. elegans, D. melanogaster, S. cerevisiae and H. sapiens is also shown. Groups: PTK, Protein Tyrosine Kinase; TKL, Tyrosine Kinase Like; AGC, cAMP-dependent protein kinase/protein kinase G/protein kinase C extended; CaMK, Calcium/Calmodulin regulated kinases; CMGC, Cyclin-dependent Kinases and other close relatives; CK1, Cell Kinase I; STE, MAP Kinase cascade kinases; RGC, Receptor Guanylate Cyclases; and Other.


Supplementary Figure 8. Dendrogram of the kinases of S. mansoni. The phylogenetic tree of the kinase domains of the S. mansoni kinome was constructed by neighbor joining with 500 bootstrap replicates in CLC Main Workbench (CLCbio V. 4.1.1, Denmark) and viewed in FigTree V1.2 (http://tree.bio.ed.ac.uk/software/figtree/). The major groups are labeled and color coded: blue, CMGC; orange, STE; red, AGC; green, CAMK; pink, TK; grey, CK1; black, RGC; light blue, TKL; yellow, Other; light pink, Unknown. The arrows point to the three most transcribed kinases according to EST evidence (Supplementary Table 16).


[Supplementary Figure 9 diagram. Panels: classical MAP kinase pathway; JNK and p38 MAP kinase pathway; ERK5 pathway. Cascade tiers: MAPKKKK, MAPKKK, MAPKK, MAPK, transcription factor. Linked pathways labelled in the diagram: p53 signaling pathway, Wnt signaling pathway, phosphatidylinositol signaling system. Edge labels used: +p, -p. Legend: proteins/enzymes found in S. mansoni; proteins/enzymes not found in S. mansoni but found in human.]

Supplementary Figure 9. Schistosoma mansoni MAPK Signaling Pathway. Diagram showing the MAPK pathway of S. mansoni. The components of the pathway found in the schistosome genome are shown in blue. The white boxes depict the components present in the human MAPK pathway but not present in the S. mansoni pathway.
