Prof. A.S. Kolaskar Vice Chancellor University of Pune
Prof. A.S. Kolaskar, Vice Chancellor
University of Pune
Bioinformatics for Parasitic diseases: Malaria
Life cycle of Plasmodium falciparum
Human
Mosquito
Countries endemic to Malaria
Drug-Resistance in Malaria endemic-countries
Source: National Centre for Infectious Diseases, CDC, Atlanta
WHO/TDR: Focus on Malaria
Information & Resources for Malaria: 1
Malaria Focus: Bill & Melinda Gates Foundation
Information & Resources for Malaria: 2
$50 million grant for malaria research
Malaria Focus: Wellcome Trust Foundation
Partly funded the Plasmodium sequencing project
Information & Resources for Malaria: 3
• Text search
• Sequence search
MR4@ATCC
Deposit OR Order Culture
Information & Resources for Malaria: 4
National Institute of Allergy & Infectious Diseases
Information & Resources for Malaria: 5
CDC Home
Information & Resources for Malaria: 6
Division of Parasitic Diseases: Information on Malaria
CDC: Division of Vector-borne infectious diseases
Complete details regarding the life history of the mosquito, the vector for many infectious diseases
Genomes: the current status
• Published complete genomes: 169
– Archaeal: 17
– Bacterial: 131
– Eukaryal: 21
• Completed Viral genomes: >1400
• Prokaryotic ongoing genomes: 428
• Eukaryotic ongoing genomes: 360
As of January 13, 2004
Highly voluminous data: needs to be analyzed for knowledge generation
Genome database: Plasmodium
Genome database: Anopheles
Genome Organisation of Homo sapiens, Anopheles gambiae and Plasmodium falciparum
Organism                      Genome size   Number of chromosomes   Number of predicted genes
Homo sapiens (Hs)             3 Gb          23                      ~24,000-40,000
Anopheles gambiae (Ag)        0.27 Gb       3                       ~12,000
Plasmodium falciparum (Pf)    23 Mb         14                      ~5000
Approaches to mine genomes of host, vector & parasite
1. Chromosome-wise comparison
2. Comparison of pathway-specific genes
3. Stage-specific comparison
Rate limiting factors:
• Extent of annotation of genomic data
• Lack of complete connectivity between genomic and derived databases
• Need to define appropriate cutoffs to detect similarities between phylogenetically diverse organisms
Chromosomes of Homo sapiens
Chr Size (bp)
Chromosomes of Plasmodium falciparum
Chr Size (bp)
Chromosomes of Anopheles gambiae
Chr Size (bp)
X 24,902,716
2R 78,412,699
2L 52,393,056
3R 64,548,413
3L 56,406,562
Chromosome-wise Comparisons of Proteomes: Program & parameters
• Data sources:
– P. falciparum: PlasmoDB
– H. sapiens : RefSeq@NCBI
– A. gambiae : ENSEMBL
• Program : BLASTP
• Sequence identity: >20%
• Alignment length: >50aa
• E value: zero, or a value with a negative power of ten (for example 3e-014)
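The cutoffs listed on this slide can be applied mechanically to BLASTP results. The following is a minimal sketch, assuming the hits were saved in BLAST's 12-column tabular format (query, subject, % identity, alignment length, ..., E value, bit score). The names `Hit`, `parse_tabular` and `is_significant` are illustrative and not from the original analysis, and reading "zero or with negative powers" as E < 1 is an assumption. In the first sample row only the identifiers, % identity, alignment length, E value and score come from the slides; every other field and both other rows are placeholders.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    pct_identity: float
    aln_length: int
    evalue: float


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column tabular BLAST output, skipping blank and comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Columns used: 0 query, 1 subject, 2 % identity, 3 length, 10 E value.
        yield Hit(fields[0], fields[1], float(fields[2]),
                  int(fields[3]), float(fields[10]))


def is_significant(hit: Hit, min_identity: float = 20.0,
                   min_length: int = 50, max_evalue: float = 1.0) -> bool:
    """Apply the slide's cutoffs; E < 1 is an assumed reading of the E-value rule."""
    return (hit.pct_identity > min_identity
            and hit.aln_length > min_length
            and hit.evalue < max_evalue)


# Placeholder zeros stand in for the mismatch, gap and coordinate columns.
SAMPLE = [
    "\t".join(["g23509043", "XP_029431", "44.0", "84",
               "0", "0", "0", "0", "0", "0", "3e-014", "71"]),
    "\t".join(["pf_hypothetical_1", "hs_hypothetical_1", "18.5", "120",
               "0", "0", "0", "0", "0", "0", "1e-20", "60"]),  # identity too low
    "\t".join(["pf_hypothetical_2", "hs_hypothetical_2", "60.0", "30",
               "0", "0", "0", "0", "0", "0", "1e-05", "40"]),  # alignment too short
]

kept = [hit for hit in parse_tabular(SAMPLE) if is_significant(hit)]
for hit in kept:
    print(hit.query, hit.subject, hit.pct_identity, hit.aln_length, hit.evalue)
```

Keeping the thresholds as keyword arguments makes it easy to test how sensitive the match counts are to the chosen cutoffs.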
Comparison of Proteomes of H. sapiens & A. gambiae
Ag\Hs 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X
2L 607 518 512 435 406 448 459 408 406 379 480 506 218 282 317 409 473 228 444 332 180 330 391
2R 790 649 559 561 538 550 496 503 491 503 585 580 266 423 444 499 545 245 579 387 200 402 476
3L 532 421 378 343 315 404 352 344 334 307 391 363 169 193 220 334 328 172 372 250 114 239 296
3R 588 517 467 389 401 437 411 370 371 382 457 485 216 286 339 391 414 216 456 333 165 358 372
X 239 179 171 164 168 166 176 144 136 154 181 197 78 112 116 146 201 78 165 119 61 105 155
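[Bar chart: H. sapiens chromosomes 1-22, X, Y on the x-axis; scale 0-1000 on the y-axis; one series per A. gambiae chromosome arm (2L, 2R, 3L, 3R, X)]
[Bar chart: chromosomes 1-14 on the x-axis; scale 0-700 on the y-axis; one series per A. gambiae chromosome arm (2L, 2R, 3L, 3R, X)]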
Comparison of Proteomes of P. falciparum & A. gambiae
Pf\Hs 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
1 25 28 18 13 17 25 16 14 16 18 15 20 14 14 21 24 17 13 12 17 7 13 17 2
2 51 40 27 19 31 31 21 26 31 39 36 31 27 17 41 30 22 15 16 24 12 16 26 6
3 56 53 43 41 51 48 42 29 29 39 41 46 40 34 42 36 43 23 40 29 20 24 33 12
4 47 45 29 27 29 34 28 20 22 27 31 28 20 23 36 25 30 19 26 30 11 19 28 6
5 77 65 60 44 41 55 58 36 47 49 52 50 43 46 53 55 54 31 47 45 29 29 50 8
6 58 49 39 36 39 45 37 30 35 50 33 44 32 34 42 38 45 15 37 34 17 25 38 6
7 40 42 33 24 27 38 28 28 22 35 28 28 23 24 35 27 27 17 23 32 14 26 36 5
8 63 62 49 43 35 50 40 29 46 46 49 46 36 38 49 40 42 18 42 32 21 20 39 8
9 72 58 40 41 41 55 54 32 38 41 38 50 36 33 36 50 56 19 44 32 17 26 38 9
10 68 75 51 47 40 59 42 37 52 51 53 62 41 41 51 53 43 21 54 54 25 30 44 11
11 118 96 85 66 70 91 77 60 85 85 85 82 61 62 86 87 72 49 67 81 37 48 71 20
12 101 111 80 67 70 78 72 54 68 87 73 86 65 61 81 84 76 37 61 78 33 43 68 14
13 141 125 104 80 77 94 87 60 76 84 103 97 88 72 103 91 95 43 82 82 42 53 80 25
14 145 128 118 84 87 110 100 77 100 98 101 105 80 85 112 109 122 64 90 99 58 46 94 29
Comparison of Proteomes of Homo sapiens & Plasmodium falciparum
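The slides do not say how hits were tallied into a chromosome-by-chromosome matrix. One plausible way, sketched here under that assumption, is to map each protein to its chromosome and count the filtered hit pairs. The function name `tally_by_chromosome`, the protein identifiers and the chromosome assignments below are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tally_by_chromosome(
    hits: Iterable[Tuple[str, str]],
    query_chr: Dict[str, str],
    subject_chr: Dict[str, str],
) -> Counter:
    """Count hits per (query chromosome, subject chromosome) pair."""
    counts: Counter = Counter()
    for query_id, subject_id in hits:
        counts[(query_chr[query_id], subject_chr[subject_id])] += 1
    return counts


# Hypothetical identifiers and chromosome assignments, for illustration only.
pf_chr = {"pf_a": "1", "pf_b": "1", "pf_c": "12"}
hs_chr = {"hs_x": "2", "hs_y": "2", "hs_z": "18"}
hits = [("pf_a", "hs_x"), ("pf_b", "hs_y"), ("pf_c", "hs_z")]

matrix = tally_by_chromosome(hits, pf_chr, hs_chr)
print(matrix[("1", "2")])    # hits between Pf chromosome 1 and Hs chromosome 2
print(matrix[("12", "18")])  # hits between Pf chromosome 12 and Hs chromosome 18
```

A `Counter` returns 0 for chromosome pairs with no hits, so every cell of the matrix can be read without special-casing empty pairs.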
Pf chr1 vs Hs chr2: 28 significant matches
[Diagram: chromosomes in Pf (1-14) linked to chromosomes in Hs (1-22, X, Y) by significant matches]
List of significant matches
• Proteins that are part of eukaryotic transcriptional and translational machinery
• Heat shock proteins: molecular chaperones
• Histones
• Actin and tubulin: cytoskeletal proteins
• Ornithine aminotransferase: involved in the inter-conversion of arginine, proline and glutamate residues and the synthesis of polyamines. Polyamines are implicated to have a role in cell proliferation.
List of significant matches (contd.)
• Polyubiquitin: involved in the ATP-dependent selective degradation of cellular proteins, maintenance of chromatin structure, regulation of gene expression, stress response and ribosome biogenesis
• Proteasomes: large barrel-like bodies which contain proteolytic enzymes on their inner surface
• DEAD family RNA helicases
• Histone deacetylases: critical mediators of transcription repression
Pf_c12 with Hs_c18
• Clustered asparagine rich protein (CARP)
• Function not clearly known
• Expressed in different stages of the life-cycle of the parasite
• Immunogenic (Kuma et al., 1990)
• Bruno like 4 RNA binding protein
• Transcriptional regulator
gi_pf: g23509043
Seq_pf: 459
Acc_hs: XP_029431
Seq_hs: 269
Score: 71
E value: 3e-014
Aln_len: 37/84
%id: 44
Func_hs: bruno-like 4, RNA binding protein
Func_pf: clustered-asparagine-rich protein
A search against the Pfam database revealed that CARP has matches to RNA binding domains at both its N and C termini. A protein with a probable house-keeping gene activity is known to be immunogenic….
Case study: 1
• Hypothetical protein
• BLAST against the nr database shows significant matches towards the N terminus with zinc-finger containing proteins of higher eukaryotes like mammals and fishes.
• No significant matches to other protozoans
• Zinc-finger binding domain.
• Transcriptional factor
• Binds both to RNA and DNA.
Pf_c10 vs Hs_c18
Case Study
May have been acquired from the host….
• Expressed in the intraerythrocytic stage (up to 0.5% of parasite protein) and the schizont stage of the parasite.
• Dual function of protein folding and signal transduction.
• Cyclophilin is also present in other parasites like T. gondii, Brugia malayi etc.
• Receptor for the immunosuppressive drug cyclosporin A
• Known to be present in higher eukaryotes including plants (involved in handling stress response).
Case Study
Pf_c12 vs Hs_c21
Pf_c2 with Hs_c6
• Putative helicase
• Involved in a number of cellular functions including translation, RNA splicing, and ribosome assembly.
• Located within human major histocompatibility complex class III region.
gi_pf: g16804988
Seq_pf: 457
Acc_hs: XP_041840
Seq_hs: 428
Score: 513
E value: 1e-147
Aln_len: 263/453
%id: 58
Func_hs: HLA-B associated transcript-1
Func_pf: eIF-4A-like DEAD family RNA helicase
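The %id value in these match records follows from the Aln_len field, which gives identical residues over aligned positions. A small sketch of that arithmetic, using the two records quoted in the slides (the helper name `percent_identity` is illustrative, and rounding to the nearest integer is an assumption about how the slide values were produced):

```python
def percent_identity(aln_len: str) -> int:
    """Convert an 'identical/aligned' string such as '37/84' to a rounded percentage."""
    identical, aligned = (int(part) for part in aln_len.split("/"))
    return round(100 * identical / aligned)


# Aln_len values taken from the two match records in the slides.
print(percent_identity("37/84"))    # CARP vs bruno-like 4
print(percent_identity("263/453"))  # DEAD family RNA helicase vs HLA-B associated transcript-1
```

Both results agree with the %id values reported in the records (44 and 58).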
Malaria parasite pathways: Hagai Ginsburg
URL: http://sites.huji.ac.il/malaria/
Metabolome of Plasmodium falciparum
• Metabolic pathways of Plasmodium falciparum are known to be stage-specific.
• Asexual blood-stage parasites depend on glycolysis and conversion of pyruvate to lactate to derive energy.
• MS-MS studies carried out by Florens et al. (2002) revealed that gametocyte and sporozoite stages of the malarial parasite contain peptides of enzymes known to be involved in the mitochondrial TCA cycle and oxidative phosphorylation.
In Plasmodium falciparum
Chromosomal locations of TCA cycle-enzymes
Enzyme                                              Hs    Ag    Pf
Citrate synthase                                    12    3L    10
Aconitase                                           22    3R    13
Isocitrate dehydrogenase                            2     2L    13
Alpha-ketoglutarate dehydrogenase (E1)              7     2R    8
Alpha-ketoglutarate dehydrogenase (E2)              14    3L    13
Alpha-ketoglutarate dehydrogenase (E3)              7     3L    12
Succinyl CoA ligase                                 13    2L    14
Succinate dehydrogenase (Cyt b560) (SDHC)           1     3L    -
Succinate dehydrogenase (Cyt b small) (SDHD)        11    X     -
Succinate dehydrogenase (flavoprotein) (SDHA)       5     3L    10
Succinate dehydrogenase (iron-sulfur) (SDHB)        1     2L    12
Fumarase                                            1*    2R*   9**
Malate dehydrogenase                                7     3R    6
• * Class II non-iron dependent Fumarase
• ** Class I iron-dependent Fumarase
Comparison of TCA cycle enzymes of
Plasmodium falciparum-Anopheles gambiae-Homo sapiens
[Bar chart: Comparison of TCA cycle. x-axis: Enzyme; y-axis: Sequence identity (0-90). Series: Query: Human, Database: Anopheles; Query: Human, Database: Plasmodium; Query: Anopheles, Database: Plasmodium]
Plasmodium contains only two SDH subunits in contrast to 4 SDH subunits in human & anopheles
Fumarase class I is present in Plasmodium whereas Fumarase class II is present in human & anopheles
TCA cycle: Comparison of proteome of host, vector and parasite revealed…
• TCA cycle-specific enzymes of Homo sapiens and Anopheles gambiae have a high degree of sequence identity.
• Aconitase and Fumarase enzymes of Plasmodium falciparum show very low similarity with their human and mosquito counterparts.
• An iron regulatory protein that has a C terminal domain similar to Aconitase is present in Plasmodium and it likely carries out the function of the Aconitase enzyme.
• Fumarase (Class I), an iron-dependent enzyme, is present in Plasmodium whereas Fumarase (Class II), a non-iron dependent enzyme, is present in human and mosquito.
• Succinate dehydrogenase in Plasmodium contains only two subunits in contrast to its human & mosquito counterparts, which have four subunits.
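The succinate dehydrogenase observation can be read directly off the chromosomal-location table. As a rough sketch, the table can be held as a dictionary with `None` for entries marked "-", and the enzymes without a location in a given organism listed. The data below are transcribed from the table of TCA cycle enzyme locations; the names `TCA_LOCATIONS` and `missing_in` are illustrative.

```python
from typing import Dict, List, Optional, Tuple

# (Hs, Ag, Pf) chromosomal locations; None marks "-" in the slide table.
TCA_LOCATIONS: Dict[str, Tuple[Optional[str], Optional[str], Optional[str]]] = {
    "Citrate synthase": ("12", "3L", "10"),
    "Aconitase": ("22", "3R", "13"),
    "Isocitrate dehydrogenase": ("2", "2L", "13"),
    "Alpha-ketoglutarate dehydrogenase (E1)": ("7", "2R", "8"),
    "Alpha-ketoglutarate dehydrogenase (E2)": ("14", "3L", "13"),
    "Alpha-ketoglutarate dehydrogenase (E3)": ("7", "3L", "12"),
    "Succinyl CoA ligase": ("13", "2L", "14"),
    "Succinate dehydrogenase (Cyt b560)": ("1", "3L", None),
    "Succinate dehydrogenase (Cyt b small)": ("11", "X", None),
    "Succinate dehydrogenase (flavoprotein)": ("5", "3L", "10"),
    "Succinate dehydrogenase (iron-sulfur)": ("1", "2L", "12"),
    "Fumarase": ("1", "2R", "9"),
    "Malate dehydrogenase": ("7", "3R", "6"),
}

ORGANISMS = ("Hs", "Ag", "Pf")


def missing_in(organism: str) -> List[str]:
    """List enzymes with no chromosomal location recorded for the organism."""
    index = ORGANISMS.index(organism)
    return [name for name, locs in TCA_LOCATIONS.items() if locs[index] is None]


print(missing_in("Pf"))
print(missing_in("Hs"), missing_in("Ag"))
```

For Pf this returns the two cytochrome b subunits of succinate dehydrogenase, leaving the flavoprotein and iron-sulfur subunits as the two that are present.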
Homology models of Isocitrate dehydrogenase
High sequence identity
Structural similarities
Chromosomal location of enzymes involved in Purine Biosynthesis
Enzyme Chromosome
Hs Ag Pf
Adenosine deaminase 22 2L 10
Adenylate kinase 1 2L 10
Adenylosuccinate lyase 1 2R 2
Adenylosuccinate synthetase 1 2R 13
DNA polymerase 1 2 2L 9,10,14
DNA-directed RNA polymerase II 11 3R 2
GMP synthetase 3 3R 10
Guanylate kinase 1 3L 9
Hypoxanthine phosphoribosyltransferase X - 10
Inosine-5'-monophosphate dehydrogenase 7 3L 9
Nucleoside diphosphate kinase 17 2L 6
Purine nucleoside phosphorylase 14 add 5
Ribonucleotide reductase 8 3R 10,14
Thioredoxin reductase 22 X 9
Enzyme Sequence Identity
Hs Ag
Adenosine deaminase 26 30
Adenylate kinase 52 51
Adenylosuccinate lyase 22 24
Adenylosuccinate synthetase 46 45
DNA polymerase 1 26 24
DNA-directed RNA polymerase II 54 52
GMP synthetase 30 30
Guanylate kinase 37 40
Hypoxanthine phosphoribosyltransferase 49 -
Inosine-5'-monophosphate dehydrogenase 49 48
Nucleoside diphosphate kinase 61 60
Purine nucleoside phosphorylase - -
Ribonucleotide reductase 60 64
Thioredoxin reductase 45 44
Sequence similarity of Enzymes involved in Purine Biosynthesis using Pf as a reference
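A table like this is easier to interrogate once it is in a data structure. The sketch below transcribes the purine identity values (Pf as reference, `None` for "-") and lists the enzymes whose identity to the human protein meets a threshold. The 50% threshold and the names `PURINE_IDENTITY` and `at_least` are arbitrary choices for illustration, not cutoffs used in the original analysis.

```python
from typing import Dict, List, Optional, Tuple

# (identity to Hs, identity to Ag) in percent, with Pf as the reference.
PURINE_IDENTITY: Dict[str, Tuple[Optional[int], Optional[int]]] = {
    "Adenosine deaminase": (26, 30),
    "Adenylate kinase": (52, 51),
    "Adenylosuccinate lyase": (22, 24),
    "Adenylosuccinate synthetase": (46, 45),
    "DNA polymerase 1": (26, 24),
    "DNA-directed RNA polymerase II": (54, 52),
    "GMP synthetase": (30, 30),
    "Guanylate kinase": (37, 40),
    "Hypoxanthine phosphoribosyltransferase": (49, None),
    "Inosine-5'-monophosphate dehydrogenase": (49, 48),
    "Nucleoside diphosphate kinase": (61, 60),
    "Purine nucleoside phosphorylase": (None, None),
    "Ribonucleotide reductase": (60, 64),
    "Thioredoxin reductase": (45, 44),
}


def at_least(threshold: int) -> List[str]:
    """Enzymes whose Pf-vs-Hs identity is known and >= threshold."""
    return [name for name, (hs, _ag) in PURINE_IDENTITY.items()
            if hs is not None and hs >= threshold]


def lowest_vs_human() -> str:
    """Enzyme with the lowest known Pf-vs-Hs identity."""
    known = {name: hs for name, (hs, _ag) in PURINE_IDENTITY.items() if hs is not None}
    return min(known, key=known.get)


print(at_least(50))
print(lowest_vs_human())
```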
Chromosomal location of enzymes involved in Pyrimidine Biosynthesis
Enzyme                                              Hs    Ag    Pf
Aspartate carbamoyltransferase                      2     UNK   13
Carbamoyl phosphate synthetase                      2     UNK   13
Cytidine triphosphate synthetase                    1     3R    14
Deoxyuridine 5'-triphosphate nucleotidohydrolase    15    UNK   11
Thymidylate synthase                                18    2L    4
Dihydroorotase                                      2     X     14
Dihydroorotate dehydrogenase                        16    2R    6
DNA polymerase 1                                    3     2L    6
DNA-directed RNA polymerase II                      16    3R    2
Nucleoside diphosphate kinase                       17    2L    6
Orotate phosphoribosyltransferase                   3     2R    5
Orotidine-monophosphate-decarboxylase               3     2R    10
Ribonucleotide reductase                            2     2L    10,14
Serine hydroxymethyltransferase                     12    X     12
Thioredoxin reductase                               22    X     9
Thymidylate kinase                                  2     2L    12
• Multi-domain protein in Hs & Ag
• Single-domain proteins in Pf
• Located on different chromosomes
Enzyme                                              Sequence identity
                                                    Hs    Ag
Aspartate carbamoyltransferase                      37    38
Carbamoyl phosphate synthetase                      45    47
Cytidine triphosphate synthetase                    44    43
Deoxyuridine 5'-triphosphate nucleotidohydrolase    36    35
Thymidylate synthase                                55    42
Dihydroorotase                                      -     -
Dihydroorotate dehydrogenase                        36    35
DNA polymerase 1                                    26    24
DNA-directed RNA polymerase II                      30    52
Nucleoside diphosphate kinase                       61    60
Orotate phosphoribosyltransferase                   27    29
Orotidine-monophosphate-decarboxylase               27    -
Ribonucleotide reductase                            60    64
Serine hydroxymethyltransferase                     45    47
Thioredoxin reductase                               45    44
Thymidylate kinase                                  39    42
Sequence similarity of Enzymes involved in Pyrimidine Biosynthesis using Pf as a reference
• Present in Hs, Pf & Ag
• No sequence similarity between Pf and Hs or between Pf and Ag
Chromosomal location of enzymes involved in Hemoglobin degradation pathway
Enzyme Chromosome
Hs Ag Pf
Aspartyl protease 11 UNK 13,14
Aspartic hemoglobinase - - 14
Leucine aminopeptidase 4 2R 14
Methionine aminopeptidase 12,4 3R,2R 10,13,14
O-sialoglycoprotein endopeptidase 4 2L 7
Papain family cysteine protease 9 3L 9
Pepsinogen 11 UNK 8
SERA antigen/papain-like proteinase with active Cys 9 3L 2
Serine protease 4 UNK 5
Zinc-metallopeptidase 17 UNK 13
Enzyme Sequence Identity
Hs Ag
Aspartyl protease 34 32
Aspartic hemoglobinase - -
Leucine aminopeptidase 39 30
Methionine aminopeptidase 58 56
O-sialoglycoprotein endopeptidase 32 30
Papain family cysteine protease 26 27
Pepsinogen 31 30
SERA antigen/papain-like proteinase with active Cys 23 24
Serine protease 22 31
Zinc-metallopeptidase 29 24
Sequence similarity of Enzymes involved in Hemoglobin digestion using Pf as a reference
Comparing pathways: Lessons learnt
Purine Biosynthesis
Human:
• de novo
• Salvage
Anopheles:
• de novo
• Salvage
HGPRT: Missing
[Hypoxanthine Guanine Phosphoribosyltransferase]
Plasmodium:
• Salvage
HGPRT: Present
• well-studied drug target
Stage-specific comparison of P. falciparum proteins with Human proteome
Stage          No. of proteins    No. of matches    % matches
Sporozoite     1025               198               19
Merozoite      828                284               34
Trophozoite    1024               338               33
Transition stage: Mosquito to Human
Human: Liver specific
Human: RBC specific
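The "% matches" column in the stage-specific table is the number of matches divided by the number of proteins for that stage. A short check of that arithmetic, using the counts from the table (the names `STAGES` and `percent_matches` are illustrative, and rounding to the nearest integer is an assumption):

```python
# (stage, number of proteins, number of matches) as given in the table.
STAGES = [
    ("Sporozoite", 1025, 198),
    ("Merozoite", 828, 284),
    ("Trophozoite", 1024, 338),
]


def percent_matches(proteins: int, matches: int) -> int:
    """Share of stage-specific proteins with a match in the human proteome."""
    return round(100 * matches / proteins)


for stage, proteins, matches in STAGES:
    print(f"{stage}: {matches}/{proteins} = {percent_matches(proteins, matches)}%")
```

The computed values reproduce the table: 19% for sporozoite, 34% for merozoite and 33% for trophozoite proteins.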