ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING
![Page 1: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/1.jpg)
ANALYSIS OF GENETIC NETWORKS USING
ATTRIBUTED GRAPH MATCHING
![Page 2: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/2.jpg)
BACKGROUND
• Completion of sequencing projects
• Need for functional discovery
• Emerging area of study: large-scale genomic analysis
• Similarity of living systems
![Page 3: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/3.jpg)
GENETIC NETWORKS
• Modelling genetic networks
• Interaction of genes and proteins
• Relationship between topology and function
![Page 4: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/4.jpg)
MOTIVATION
• Common biological processes
• Comparison of networks
• Discovering missing interactions
• Discovering missing genes
![Page 5: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/5.jpg)
GRAPH MATCHING
[Figure: two graphs, G1 and G2, to be matched. Node labels: mpn132, mpn124, mpn141, mpn145, mpn134, mpn133, mge234, mge235, mge236, mge312, mge314, mge310, mge313, mge336, mge337. Annotations: Search-based Algorithm, Pruning Techniques]
![Page 6: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/6.jpg)
ROADMAP
• Scale-Free Networks
• Modelling Genetic Networks
• Graph Matching
• Algorithm
• Results
![Page 7: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/7.jpg)
SCALE-FREE NETWORKS
![Page 8: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/8.jpg)
COMPLEX NETWORKS
• Small-world model
 – WWW
 – Human acquaintances network
 – Citation networks
 – Biological networks
![Page 9: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/9.jpg)
SMALL-WORLD
• Features:
 – Characteristic path length
 – Clustering coefficient
 – Sparseness
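The three features listed above can be computed directly from an edge list. The following is a minimal sketch, assuming an undirected, unweighted graph; the function names (`build_adjacency`, `characteristic_path_length`, `clustering_coefficient`, `density`) are illustrative and do not come from the slides.

```python
from collections import deque


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbour sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(adj):
    """Average, over nodes, of (links among neighbours) / (possible links)."""
    coeffs = []
    for nbs in adj.values():
        k = len(nbs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a in nbs for b in nbs if a < b and b in adj[a])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


def density(adj):
    """Fraction of possible links that are present (low means sparse)."""
    n = len(adj)
    m = sum(len(nbs) for nbs in adj.values()) // 2
    return 2.0 * m / (n * (n - 1)) if n > 1 else 0.0


# Toy graph: a triangle a-b-c with one extra node d attached to c.
toy = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
print(characteristic_path_length(toy), clustering_coefficient(toy), density(toy))
```

A small-world network would show a short characteristic path length together with a high clustering coefficient at low density.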
![Page 10: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/10.jpg)
SMALL-WORLD
• Somewhere in between regular & random graphs
![Page 11: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/11.jpg)
SMALL-WORLD
• Highly clustered
• Short diameter
![Page 12: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/12.jpg)
SCALE-FREE NETWORKS
• Complex networks: biological, social, WWW, power grid, citation, etc.
• Power-law connectivity: $P(k) = k^{-\gamma}$
• Hubs - authorities
![Page 13: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/13.jpg)
SCALE-FREE NETWORKS
• Application for testing scale-free behavior
• Yeast
• Helicobacter pylori
• Mycoplasma pneumoniae
• Mycoplasma genitalium
• Linear log-log graph
• Slope = $-\gamma$
![Page 14: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/14.jpg)
SCALE-FREE NETWORKS
• Slope is calculated by the least-squares method
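The scale-free test described above (a linear log-log plot whose slope is fitted by least squares) might look like the sketch below. It assumes an undirected edge list as input and uses an ordinary least-squares line through the points (log k, log P(k)); the function names are illustrative, and the slides do not say how degrees were binned or whether zero-degree nodes were included.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Return {k: P(k)}, the fraction of nodes having degree k."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    n = len(degree)
    return {k: c / n for k, c in counts.items()}


def loglog_slope(pk):
    """Least-squares slope of log10 P(k) against log10 k."""
    ks = sorted(pk)
    if len(ks) < 2:
        raise ValueError("need at least two distinct degrees to fit a slope")
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(pk[k]) for k in ks]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# A distribution that follows P(k) = k^-2 exactly gives a slope close to -2.
exact = {1: 1.0, 2: 0.25, 4: 0.0625}
print(loglog_slope(exact))
```

For a network with power-law connectivity, the fitted slope estimates the negative of the exponent.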
![Page 15: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/15.jpg)
TOPOLOGY & FUNCTIONALITY
• Small diameter
 – Ease of dissemination of information
 – Ease of restoring after disturbance
• Cliquishness
 – Alternate paths are found
• Heterogeneity
 – Random removal does not affect the network
 – Hubs are vulnerable to attack
![Page 16: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/16.jpg)
BIOLOGICAL ASPECTS
• Multifunctionality
 – Grouped into functional units
• Stability
• Reason: most of the interactions are between hubs and authorities
![Page 17: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/17.jpg)
MODELLING GENETIC NETWORKS
![Page 18: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/18.jpg)
TYPES OF GENETIC NETWORKS
• Categorized by data sources
 – Metabolic pathways
 – Gene expression arrays
 – Protein interactions
 – Gene interactions
![Page 19: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/19.jpg)
INTERACTION MAPS
• High-level perspective
 – Nodes: genes or proteins
 – Edges: presence of an interaction
• Data sources
 – Two-hybrid analysis
 – Fusion analysis
 – Chromosomal proximity
 – Phylogenetic analysis
![Page 20: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/20.jpg)
GRAPH MATCHING
![Page 21: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/21.jpg)
PROBLEM DEFINITION
Attributed Relational Graph (ARG)
G = { V, E, X}.
V = {v1, v2, …, vn} Nodes
E = {e1, e2, …, em} Edges
X = {x1, x2,…,xn} Attributes
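One way to hold an attributed relational graph G = {V, E, X} in code is sketched below. This is an assumption about representation, not the authors' tool: the class name `AttributedGraph`, its methods, and the placeholder attribute values are all hypothetical; only the node labels are taken from the earlier example figure.

```python
from dataclasses import dataclass, field


@dataclass
class AttributedGraph:
    """Undirected graph whose nodes each carry one attribute value."""
    nodes: list = field(default_factory=list)       # V
    edges: set = field(default_factory=set)         # E, as frozenset pairs
    attributes: dict = field(default_factory=dict)  # X, keyed by node name

    def add_node(self, name, attribute):
        if name not in self.attributes:
            self.nodes.append(name)
        self.attributes[name] = attribute

    def add_edge(self, u, v):
        if u not in self.attributes or v not in self.attributes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.add(frozenset((u, v)))

    def neighbors(self, name):
        return {other for e in self.edges if name in e
                for other in e if other != name}

    def degree(self, name):
        return len(self.neighbors(name))


# Attribute values here are placeholders, not measured data.
g = AttributedGraph()
g.add_node("mge234", [0.1, 0.9])
g.add_node("mge235", [0.4, 0.6])
g.add_node("mge236", [0.5, 0.5])
g.add_edge("mge234", "mge235")
g.add_edge("mge235", "mge236")
print(g.degree("mge235"), sorted(g.neighbors("mge235")))
```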
![Page 22: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/22.jpg)
INEXACT SUBGRAPH MATCHING
Allow for:
• Mismatching attribute values
• Missing nodes
• Missing links
Also called error-correcting subgraph isomorphism
NP-Complete
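The three tolerated error types can be folded into a single cost for a candidate mapping. The sketch below is one possible formulation under stated assumptions: attributes are single numbers, a node mapped to `None` counts as missing, and the penalty weights `node_penalty` and `link_penalty` are hypothetical parameters that the slides do not specify.

```python
def matching_cost(edges1, edges2, attrs1, attrs2, mapping,
                  node_penalty=1.0, link_penalty=1.0):
    """Cost of mapping graph 1 onto graph 2; lower means a better match.

    mapping: dict from graph-1 node to graph-2 node, or None if the
    node has no counterpart (a missing node).
    """
    edge_set2 = {frozenset(e) for e in edges2}
    cost = 0.0
    for node, image in mapping.items():
        if image is None:
            cost += node_penalty                         # missing node
        else:
            cost += abs(attrs1[node] - attrs2[image])    # attribute mismatch
    for u, v in edges1:
        iu, iv = mapping.get(u), mapping.get(v)
        if iu is None or iv is None:
            continue  # already charged through the missing node
        if frozenset((iu, iv)) not in edge_set2:
            cost += link_penalty                         # missing link
    return cost


edges1 = [("a", "b"), ("b", "c")]
edges2 = [("x", "y")]
attrs1 = {"a": 0.10, "b": 0.20, "c": 0.30}
attrs2 = {"x": 0.10, "y": 0.25, "z": 0.30}
print(matching_cost(edges1, edges2, attrs1, attrs2,
                    {"a": "x", "b": "y", "c": "z"}))
```

An error-correcting search would look for the mapping that minimizes such a cost; because the problem is NP-complete, exhaustive minimization is only feasible on small graphs without pruning.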
![Page 23: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/23.jpg)
SEARCH TECHNIQUES
• Cost function
• Pruning (Structure Constraints)
• Backtracking
![Page 24: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/24.jpg)
ATTRIBUTED GRAPH MATCHING TOOL
![Page 25: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/25.jpg)
ATTRIBUTE MATCHING
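• Amino acid sequence content composition
 – Array of 20: percentage of each amino acid
 – Amino acids grouped into classes: array of 6
 – Amino acid triples grouped into classes: array of 216 (6 x 6 x 6)
Example sequence: MKVLNKNEL
$$S = \sum_{i=1}^{216} \left[ X_1(i) - X_2(i) \right]^2$$
[Second equation, defining the attribute vector X(a) from occurrences in the sequence A: not recoverable from the transcript]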
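The three attribute vectors and the squared-difference score can be sketched as follows. Several details are assumptions, since the slides do not give them: the six-class grouping in `CLASSES` is a hypothetical physicochemical grouping, triples are counted with an overlapping sliding window, and all vectors are expressed as percentages.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical six-class grouping; the slides do not name the classes.
CLASSES = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "positive": "KR",
    "negative": "DE",
    "special": "GP",
}
CLASS_OF = {aa: idx for idx, members in enumerate(CLASSES.values())
            for aa in members}


def composition(seq):
    """Array of 20: percentage of each amino acid."""
    return [100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def class_composition(seq):
    """Array of 6: percentage of residues falling in each class."""
    counts = [0] * 6
    for aa in seq:
        counts[CLASS_OF[aa]] += 1
    return [100.0 * c / len(seq) for c in counts]


def triple_composition(seq):
    """Array of 216 (6 x 6 x 6): percentage of each class triple."""
    counts = [0] * 216
    total = max(len(seq) - 2, 0)
    for i in range(total):  # overlapping window of three residues
        a, b, c = (CLASS_OF[aa] for aa in seq[i:i + 3])
        counts[a * 36 + b * 6 + c] += 1
    return [100.0 * n / total if total else 0.0 for n in counts]


def score(x1, x2):
    """Sum of squared differences between two attribute vectors."""
    return sum((a - b) ** 2 for a, b in zip(x1, x2))


seq = "MKVLNKNEL"  # example sequence from the slide
print(len(composition(seq)), len(class_composition(seq)),
      len(triple_composition(seq)))
```

A score of zero means the two vectors are identical; larger values indicate more dissimilar sequence content.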
![Page 26: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/26.jpg)
ATTRIBUTE MATCHING
[Figure: difference in amino acid composition values of gene pairs for M. genitalium and M. pneumoniae. Axes: score, observations]
![Page 27: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/27.jpg)
STRUCTURAL CONSTRAINTS
• Effect of scale-free behaviour
 – Connectivity information: highly heterogeneous, thus start with the most connected node and work around it
 – Pruning strategy: compatibility is determined by the power law
$$\frac{y_2 - y_1}{x_2 - x_1} = \frac{\log P(k_2) - \log P(k_1)}{\log k_2 - \log k_1}$$
![Page 28: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/28.jpg)
STRUCTURAL CONSTRAINTS
• Neighborhood connectivity
 – Choose the neighbor at the next stage
• Backtracking
 – Component by component
 – Go back to the neighbor with the most connectivity within the component
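A search that combines these structural constraints could be sketched as below. This is a simplified illustration, not the tool presented in the slides: it starts from the most connected node, expands through neighbors, prunes candidates by degree and attribute difference, and uses plain chronological backtracking rather than returning to the most connected neighbor within the component. It also requires every mapped link to exist in the second graph, so it does not tolerate missing links. The thresholds `max_attr_diff` and `max_degree_diff` are hypothetical, and attributes are single numbers.

```python
def match_graphs(adj1, adj2, attrs1, attrs2,
                 max_attr_diff=0.1, max_degree_diff=1):
    """Map every node of graph 1 to a distinct node of graph 2, or None."""
    # Visiting order: most connected node first, then the most connected
    # node adjacent to what has already been placed.
    order, remaining = [], set(adj1)
    while remaining:
        frontier = [n for n in remaining
                    if any(nb in order for nb in adj1[n])]
        pool = frontier or remaining
        nxt = max(pool, key=lambda n: (len(adj1[n]), n))
        order.append(nxt)
        remaining.remove(nxt)

    def compatible(u, v, mapping):
        if abs(len(adj1[u]) - len(adj2[v])) > max_degree_diff:
            return False  # pruned: connectivity too different
        if abs(attrs1[u] - attrs2[v]) > max_attr_diff:
            return False  # pruned: attributes too different
        # Links to already-mapped neighbors must be preserved.
        return all(mapping[nb] in adj2[v]
                   for nb in adj1[u] if nb in mapping)

    def extend(index, mapping, used):
        if index == len(order):
            return dict(mapping)
        u = order[index]
        candidates = sorted(
            (v for v in adj2 if v not in used and compatible(u, v, mapping)),
            key=lambda v: abs(attrs1[u] - attrs2[v]))
        for v in candidates:
            mapping[u] = v
            used.add(v)
            result = extend(index + 1, mapping, used)
            if result is not None:
                return result
            del mapping[u]  # backtrack and try the next candidate
            used.remove(v)
        return None

    return extend(0, {}, set())


adj1 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
adj2 = {"x": {"y"}, "y": {"x", "z", "w"}, "z": {"y"}, "w": {"y"}}
attrs1 = {"a": 0.10, "b": 0.50, "c": 0.90}
attrs2 = {"x": 0.12, "y": 0.52, "z": 0.88, "w": 0.30}
print(match_graphs(adj1, adj2, attrs1, attrs2))
```

Starting from hubs is attractive in a scale-free network because few nodes have high degree, so the degree check removes most candidates early.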
![Page 29: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/29.jpg)
TEST CASE
• Mycoplasma genitalium:
 – Smallest genome (470 ORFs)
• Mycoplasma pneumoniae:
 – Very similar, superset (688 ORFs)
![Page 30: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/30.jpg)
TEST CASE...
• Mycoplasma genitalium:
 – 232 nodes
 – 211 links
• Mycoplasma pneumoniae:
 – 267 nodes
 – 257 links
• Inputs:
 – MGE links
 – MPN links
 – MGE synonyms
 – MPN synonyms
 – MGE amino acid sequence
 – MPN amino acid sequence
![Page 31: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/31.jpg)
RESULTS
[Figure: matched networks. Panel titles: MGE, MPN]
![Page 32: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/32.jpg)
DISCOVERY OF MISSING DATA
• Missing link
• The link between MPN632 and MPN637 is missing in our data but exists in the literature
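Once a node mapping between the two networks is available, candidate missing links can be listed by checking which links of one graph have no counterpart in the other. The sketch below assumes the mapping is given as a dictionary; the function name `missing_links` and the toy node labels are hypothetical and do not reproduce the actual MGE or MPN data.

```python
def missing_links(edges1, edges2, mapping):
    """Links of graph 1 whose mapped endpoints are not linked in graph 2.

    Returns the candidate links expressed in graph-2 node names.
    """
    edge_set2 = {frozenset(e) for e in edges2}
    candidates = []
    for u, v in edges1:
        if u in mapping and v in mapping:
            image = (mapping[u], mapping[v])
            if frozenset(image) not in edge_set2:
                candidates.append(image)
    return candidates


# Toy data: graph 1 is a chain of three genes, graph 2 lacks one link.
edges1 = [("g1_a", "g1_b"), ("g1_b", "g1_c")]
edges2 = [("g2_a", "g2_b")]
mapping = {"g1_a": "g2_a", "g1_b": "g2_b", "g1_c": "g2_c"}
print(missing_links(edges1, edges2, mapping))
```

Each returned pair is only a hypothesis; as on the slide, it would need to be checked against the literature.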
![Page 33: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/33.jpg)
DISCOVERY OF MISSING DATA
• Missing node with known COG
MPN236 --- MPN237 --- MPN238 --- MPN678
MG098 --- MG099 --- MG100 --- MG459
MG459 is the ortholog of MPN678
![Page 34: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/34.jpg)
DISCOVERY OF MISSING DATA
• Missing node without known ortholog
![Page 35: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/35.jpg)
CONCLUSION
• Large-scale genomics
• Interaction data captures system structure and dynamics
• Graph matching exploits the scale-free characteristics
• Novel interactions and genes can be identified
![Page 36: ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING](https://reader034.fdocuments.us/reader034/viewer/2022042822/5681366c550346895d9df8c6/html5/thumbnails/36.jpg)
ACKNOWLEDGEMENT
• YASEMİN TÜRKELİ