Protein Analysis 4
http://www.bioinfo.biocenter.helsinki.fi:8080/downloads/teaching/spring2005/proteiinianalyysi
Similarity
• Based on sequence
• Based on structure
  – in evolution, structure is conserved longer than sequence
  – comparison
    • rigid-body
    • distance matrix
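The practical difference between the two comparison styles can be illustrated with a minimal sketch. A rigid-body comparison needs the two structures superposed first, whereas a matrix of intramolecular distances does not change when the structure is rotated or translated. The coordinates and function names below are made up for illustration and are not taken from the slides.

```python
import math

def distance_matrix(coords):
    """Return the matrix of pairwise Euclidean distances between points."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def rotate_z(coords, angle):
    """Rotate 3D points about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

def max_abs_difference(m1, m2):
    """Largest element-wise absolute difference between two equal-sized matrices."""
    return max(abs(a - b) for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))

# Hypothetical trace of four residues (coordinates in Å, invented for the example).
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0), (8.0, 4.0, 2.0)]
# The same structure in a different orientation and position.
structure_b = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in rotate_z(structure_a, 1.0)]

diff = max_abs_difference(distance_matrix(structure_a), distance_matrix(structure_b))
print(f"largest distance-matrix difference: {diff:.2e}")
```

The raw coordinates of the two copies differ, but their distance matrices agree up to floating-point rounding, which is why distance-matrix methods can skip the superposition step.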
Dali
• Distance-matrix ALIgnment
Some similarities are readily apparent, others are more subtle
• Easy: globins, 125 res., ~1.5 Å
• Tricky: Ig C & V, 85 res., ~3 Å
• Very subtle: G3P-dehydrogenase, C-term. domain, >5 Å
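The Å figures above are presumably root-mean-square deviations (RMSD) over the aligned residues after superposition; the slide does not say so explicitly, so treat that reading as an assumption. A minimal sketch of the RMSD calculation, assuming the two coordinate lists are already superposed and residue-by-residue equivalenced (the coordinates are invented):

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of already superposed 3D points."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical aligned atoms; every atom of model_b is displaced by 1.5 Å along x.
model_a = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 4.0, 0.0)]
model_b = [(x + 1.5, y, z) for x, y, z in model_a]

print(f"RMSD = {rmsd(model_a, model_b):.2f} Å")
```

A real comparison would first find the optimal superposition and the residue equivalences; this sketch covers only the final averaging step.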
Same fold, same superfamily
Fold space graph
• all-against-all structure comparison
• first
  – redundancy
  – domains
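An all-against-all comparison can be sketched as a loop over every unordered pair, keeping the pairs whose similarity passes a threshold as edges of a graph. The domain names, scores, and threshold below are hypothetical placeholders; a real run would call a structure comparison program in place of `toy_score`.

```python
from itertools import combinations

def all_against_all(items, score, threshold):
    """Compare every unordered pair once; return (edges, number of comparisons)."""
    edges = {}
    n_comparisons = 0
    for a, b in combinations(sorted(items), 2):
        n_comparisons += 1
        s = score(a, b)
        if s >= threshold:
            edges[(a, b)] = s
    return edges, n_comparisons

# Hypothetical similarity scores between four made-up domains.
toy_scores = {
    frozenset(("globin_a", "globin_b")): 20.0,
    frozenset(("ig_c", "ig_v")): 6.0,
    frozenset(("globin_a", "ig_c")): 0.5,
}

def toy_score(a, b):
    """Look up a made-up score; unlisted pairs count as dissimilar."""
    return toy_scores.get(frozenset((a, b)), 0.0)

domains = ["globin_a", "globin_b", "ig_c", "ig_v"]
edges, n_comparisons = all_against_all(domains, toy_score, threshold=2.0)
print(n_comparisons, edges)
```

The number of comparisons grows as n(n-1)/2, which is one reason to reduce redundancy before the all-against-all step.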
Protein domains/modules
• globular
• independently foldable
• occur in different contexts
Domains via the contact matrix
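One simple way to read domains off a contact matrix is to look for the chain position where cutting the sequence in two leaves the fewest contacts between the halves. The sketch below illustrates that idea only; it is not the method shown on the slides. The 8 Å cutoff, the minimum segment size, and the toy coordinates (two compact clusters, not a realistic chain geometry) are all assumptions.

```python
import math

def contact_matrix(coords, cutoff=8.0):
    """contacts[i][j] is True when residues i and j (i != j) lie within `cutoff` Å."""
    n = len(coords)
    return [[i != j and math.dist(coords[i], coords[j]) <= cutoff
             for j in range(n)] for i in range(n)]

def inter_segment_contacts(contacts, k):
    """Count contacts between residues [0, k) and residues [k, n)."""
    n = len(contacts)
    return sum(contacts[i][j] for i in range(k) for j in range(k, n))

def best_single_cut(contacts, min_size=2):
    """Cut position that minimises contacts between the two resulting segments."""
    n = len(contacts)
    return min(range(min_size, n - min_size + 1),
               key=lambda k: inter_segment_contacts(contacts, k))

# Toy data: residues 0-4 form one compact cluster, residues 5-9 another, 30 Å away.
cluster = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 3.0, 0.0), (0.0, 3.0, 0.0), (1.5, 1.5, 3.0)]
coords = cluster + [(x + 30.0, y, z) for x, y, z in cluster]

contacts = contact_matrix(coords)
cut = best_single_cut(contacts)
print(f"suggested domain boundary before residue {cut}")
```

In the contact matrix this corresponds to two dense blocks on the diagonal with an empty off-diagonal block between them.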
Analogy / Homology
‘superfold’
‘superfamily’
Dendrogram / homologues
Structure similarity
Comparative genomics
• after classification, the composition of proteomes can be compared with one another
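Once every protein has been assigned to a family, comparing two proteomes reduces to comparing per-family counts. The sketch below uses four rows from the Escherichia coli / Methanococcus jannaschii table later in this transcript (the first number in each cell); the function name and the output layout are illustrative choices, not part of the slides.

```python
def compare_proteomes(counts_a, counts_b):
    """Return (family, count_a, count_b, total) rows sorted by total, largest first."""
    rows = []
    for family in sorted(set(counts_a) | set(counts_b)):
        a = counts_a.get(family, 0)
        b = counts_b.get(family, 0)
        rows.append((family, a, b, a + b))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows

# Family counts copied from the E. coli / M. jannaschii table in this transcript.
e_coli = {"Fer4": 63, "ABC_tran": 95, "TPR": 14, "HTH_AraC": 53}
m_jannaschii = {"Fer4": 106, "ABC_tran": 20, "TPR": 52}

rows = compare_proteomes(e_coli, m_jannaschii)
only_in_e_coli = [family for family, a, b, _ in rows if a > 0 and b == 0]

for family, a, b, total in rows:
    print(f"{family:10s} {a:4d} {b:4d} {total:4d}")
print("found only in E. coli:", only_in_e_coli)
```

Raw counts are not normalised for proteome size here; a fairer comparison would divide by the number of proteins in each genome, which this transcript does not give.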
Universal protein families

| Protein functional class | Number of families appearing in all known genomes |
|---|---|
| translation, incl. ribosome structure | 53 |
| transcription | 4 |
| replication, recombination, repair | 5 |
| metabolism | 9 |
| cellular processes (chaperones, secretion, cell division, cell wall biosynthesis) | 9 |
Distribution of probable homologues of predicted human proteins
• Vertebrates only: 22 %
• Vertebrates and other animals: 24 %
• Animals and other eukaryotes: 32 %
• Eukaryotes and prokaryotes: 21 %
• No homologues in animals: 1 %
• Prokaryotes only: 1 %
Largest Pfam families

| Name | Type | No. seed | No. full | Av. len | Av. %id | 3D | Description |
|---|---|---|---|---|---|---|---|
| GP120 | Family | 24 | 41447 | 153.5 | 56 | 1gc1 | Envelope glycoprotein GP120 |
| zf-C2H2 | Repeat | 197 | 28442 | 23.4 | 35 | 1zaa | Zinc finger, C2H2 type |
| LRR | Repeat | 2652 | 28207 | 23.9 | 27 | 1bnh | Leucine Rich Repeat |
| RVT | Family | 177 | 25771 | 160.8 | 68 | 1hmv | Reverse transcriptase (RNA-dependent DNA polymerase) |
| RVP | Domain | 53 | 21864 | 94.4 | 86 | 1ida | Retroviral aspartyl protease |
| Cytochrom_B_N | Domain | 8 | 19592 | 154.8 | 68 | 3bcc | Cytochrome b(N-terminal)/b6/petB |
| WD40 | Repeat | 1923 | 16338 | 38.7 | 19 | 1gp2 | WD domain, G-beta repeat |
| Ank | Repeat | 1182 | 15497 | 29.9 | 28 | 1awc | Ankyrin repeat |
| COX1 | Family | 24 | 14643 | 226.9 | 48 | 1occ | Cytochrome C and Quinol oxidase polypeptide I |
| ig | Domain | 113 | 14032 | 63.6 | 19 | 8fab | Immunoglobulin domain |
| Oxidored_q1 | Family | 33 | 12646 | 220.7 | 29 | | NADH-Ubiquinone/plastoquinone (complex I), various chains |
| Cytochrom_B_C | Domain | 9 | 11999 | 88.5 | 74 | 1bcc | Cytochrome b(C-terminal)/b6/petD |
| ABC_tran | Domain | 63 | 11725 | 184.1 | 27 | 1b0u | ABC transporter |
| Pkinase | Domain | 67 | 11451 | 216.9 | 23 | 1apm | Protein kinase domain |
| RuBisCO_large | Domain | 17 | 10485 | 282.4 | 81 | 3rub | Ribulose bisphosphate carboxylase large chain, catalytic domain |
| RuBisCO_large_N | Domain | 17 | 10205 | 117.2 | 83 | 3rub | Ribulose bisphosphate carboxylase large chain, N-terminal domain |
| TPR | Repeat | 575 | 9756 | 33.8 | 18 | 1a17 | TPR Domain |
| PPR | Family | 560 | 8851 | 32.9 | 20 | | PPR repeat |
| RVT_thumb | Domain | 42 | 8064 | 50.1 | 88 | | Reverse transcriptase thumb domain |
| HCV_NS1 | Family | 10 | 7147 | 74.2 | 47 | | Hepatitis C virus non-structural protein E2/NS1 |
Largest Pfam families without a known structure (many TM proteins)

| Name | Type | No. seed | No. full | Av. len | Av. %id | Feature | Description |
|---|---|---|---|---|---|---|---|
| Oxidored_q1 | Family | 33 | 12646 | 220.7 | 29 | TM | NADH-Ubiquinone/plastoquinone (complex I), various chains |
| PPR | Family | 560 | 8851 | 32.9 | 20 | | PPR repeat |
| RVT_thumb | Domain | 42 | 8064 | 50.1 | 88 | | Reverse transcriptase thumb domain |
| HCV_NS1 | Family | 10 | 7147 | 74.2 | 47 | | Hepatitis C virus non-structural protein E2/NS1 |
| MatK_N | Family | 22 | 5299 | 243.3 | 59 | | MatK/TrnK amino terminal region |
| Intron_maturas2 | Family | 26 | 4902 | 117.7 | 64 | | Type II intron maturase |
| BPD_transp_1 | Family | 92 | 4489 | 210.1 | 15 | TM | Binding-protein-dependent transport system inner membrane component |
| Oxidored_q1_N | Family | 32 | 3241 | 60.9 | 55 | TM | NADH-Ubiquinone oxidoreductase (complex I), chain 5 N-terminus |
| NADH_dehy_S2_C | Family | 77 | 3072 | 54.5 | 43 | TM NG | NADH dehydrogenase subunit 2 C-terminus |
| Oxidored_q1_C | Family | 72 | 3060 | 232.5 | 58 | TM | NADH-Ubiquinone oxidoreductase (complex I), chain 5 C-terminus |
| Sugar_tr | Family | 49 | 2800 | 327.0 | 19 | TM | Sugar (and other) transporter |
| TT_ORF1 | Family | 6 | 2736 | 111.7 | 56 | | TT viral orf 1 |
| vMSA | Family | 4 | 2492 | 182.7 | 70 | TM | Major surface antigen from hepadnavirus |
| Mito_carr | Family | 210 | 2262 | 92.7 | 22 | | Mitochondrial carrier protein |
| ABC_membrane | Family | 73 | 2240 | 259.8 | 14 | TM | ABC transporter transmembrane region |
| Radical_SAM | Domain | 651 | 2220 | 168.6 | 14 | | Radical SAM superfamily |
| NADH5_C | Family | 85 | 2075 | 167.2 | 40 | TM | NADH dehydrogenase subunit 5 C-terminus |
| Glycos_transf_1 | Family | 78 | 1924 | 171.5 | 19 | | Glycosyl transferases group 1 |
| Poty_coat | Family | 34 | 1780 | 205.7 | 55 | | Potyvirus coat protein |
| DUF6 | Family | 105 | 1767 | 125.6 | 15 | TM | Integral membrane protein DUF6 |
The largest families in human and mouse are often involved in protein interactions

| Family | Description | Homo sapiens (Human) | Mus musculus (Mouse) | Total |
|---|---|---|---|---|
| zf-C2H2 | Zinc finger, C2H2 type | 6942 (880) | 4910 (672) | 11852 (1552) |
| LRR | Leucine Rich Repeat | 1826 (273) | 1796 (253) | 3622 (526) |
| ig | Immunoglobulin domain | 2152 (777) | 1363 (712) | 3515 (1489) |
| WD40 | WD domain, G-beta repeat | 1716 (351) | 1541 (313) | 3257 (664) |
| Ank | Ankyrin repeat | 1611 (290) | 1340 (242) | 2951 (532) |
| 7tm_1 | 7 transmembrane receptor (rhodopsin family) | 818 (817) | 1451 (1450) | 2269 (2267) |
| EGF | EGF-like domain | 1088 (218) | 1073 (196) | 2161 (414) |
| fn3 | Fibronectin type III domain | 924 (205) | 663 (193) | 1587 (398) |
| Cadherin | Cadherin domain | 781 (138) | 700 (135) | 1481 (273) |
| Collagen | Collagen triple helix repeat (20 copies) | 700 (95) | 666 (96) | 1366 (191) |
| Pkinase | Protein kinase domain | 679 (643) | 622 (583) | 1301 (1226) |
| TPR | TPR Domain | 688 (140) | 538 (115) | 1226 (255) |
| efhand | EF hand | 580 (243) | 497 (207) | 1077 (450) |
| RRM_1 | RNA recognition motif (a.k.a. RRM, RBD, or RNP domain) | 520 (292) | 487 (267) | 1007 (559) |
| Kelch | Kelch motif | 424 (94) | 356 (74) | 780 (168) |
| Sushi | Sushi domain (SCR repeat) | 404 (75) | 339 (76) | 743 (151) |
| Spectrin | Spectrin repeat | 414 (35) | 241 (30) | 655 (65) |
| SH3 | SH3 domain | 343 (261) | 310 (244) | 653 (505) |
| PDZ | PDZ domain (also known as DHR or GLGF) | 339 (193) | 303 (170) | 642 (363) |
| PH | PH domain | 336 (300) | 289 (257) | 625 (557) |
The largest families in E. coli and the archaeon are typically involved in metabolism

| Family | Description | Escherichia coli | Methanococcus jannaschii | Total |
|---|---|---|---|---|
| Fer4 | 4Fe-4S binding domain | 63 (38) | 106 (38) | 169 (76) |
| ABC_tran | ABC transporter | 95 (78) | 20 (17) | 115 (95) |
| Hexapep | Bacterial transferase hexapeptide (three repeats) | 68 (16) | 15 (3) | 83 (19) |
| TPR | TPR Domain | 14 (5) | 52 (8) | 66 (13) |
| BPD_transp_1 | Binding-protein-dependent transport system inner membrane component | 51 (50) | 4 (4) | 55 (54) |
| HTH_AraC | Bacterial regulatory helix-turn-helix proteins, araC family | 53 (27) | 0 | 53 (27) |
| CBS | CBS domain | 18 (10) | 34 (15) | 52 (25) |
| RHS_repeat | RHS Repeat | 51 (6) | 0 | 51 (6) |
| Radical_SAM | Radical SAM superfamily | 19 (19) | 32 (32) | 51 (51) |
| HTH_1 | Bacterial regulatory helix-turn-helix protein, lysR family | 46 (46) | 2 (2) | 48 (48) |
| LysR_substrate | LysR substrate binding domain | 45 (45) | 1 (1) | 46 (46) |
| Response_reg | Response regulator receiver domain | 38 (38) | 0 | 38 (38) |
| HATPase_c | Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase | 34 (34) | 1 (1) | 35 (35) |
| Sugar_tr | Sugar (and other) transporter | 32 (32) | 1 (1) | 33 (33) |
| Acetyltransf_1 | Acetyltransferase (GNAT) family | 24 (24) | 4 (4) | 28 (28) |
| Hydrolase | haloacid dehalogenase-like hydrolase | 23 (23) | 4 (4) | 27 (27) |
| Helicase_C | Helicase conserved C-terminal domain | 18 (18) | 9 (9) | 27 (27) |
| Fimbrial | Fimbrial protein | 25 (25) | 0 | 25 (25) |
| AA_permease | Amino acid permease | 22 (22) | 2 (2) | 24 (24) |
| HAMP | HAMP domain | 23 (23) | 0 | 23 (23) |