Post on 15-Jan-2016
Exploiting indirect neighbors and topological weight to predict protein function from protein–protein interactions
Hon Nian Chua, Wing-Kin Sung and Limsoon Wong
Motivation
• Predicting protein function from protein-protein interaction data.
• Previous studies consider only level-1 neighbors.
• Can level-2 neighbors play a significant role in this prediction?
Summarizing the output of the study
• Level-2 neighbors do show functional association.
• A significant number of proteins were observed to have associations with level-2 neighbors but not with level-1 neighbors.
• A prediction algorithm:
• 1) Weight level-1 and level-2 neighbors based on functional similarity.
• 2) Score each function based on its weighted frequency among the neighbors.
Conventional approaches
• Using only direct interactions, i.e. level-1 neighbors.
• Considering a radius in the interaction neighborhood network.
• Calculating a functional distance and using clustering to form functional classes.
Protein-Protein interactions as an undirected graph
• G = (V, E)
• u and v are two protein nodes.
• An edge e between them represents an interaction.
• u and v are level-k neighbors if there is a path with k edges between u and v.
• Set of level-k neighbors: S_k
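The level-k definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the helper names `build_graph` and `level_k_neighbors` and the toy edge list are assumptions. It follows walks of exactly k edges and excludes the starting protein itself, which for k = 1 and k = 2 coincides with simple paths.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency map from (u, v) interaction pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # assumption: self-interactions are ignored in this sketch
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)

def level_k_neighbors(adj, u, k):
    """Proteins reachable from u by a walk of exactly k edges, excluding u itself."""
    frontier = {u}
    for _ in range(k):
        # Advance every protein in the frontier by one interaction edge.
        frontier = {w for p in frontier for w in adj.get(p, ())}
    return frontier - {u}

# Hypothetical toy network: A-B, B-C, C-D, A-C
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]
adj = build_graph(edges)
s1 = level_k_neighbors(adj, "A", 1)  # direct neighbors of A
s2 = level_k_neighbors(adj, "A", 2)  # level-2 neighbors of A
only_level_2 = s2 - s1               # reachable at level 2 but not level 1
```

Note that a protein can be both a level-1 and a level-2 neighbor of the same protein (B and C in the toy network), so the level-1 and level-2 sets may overlap.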
Indirect Functional Association
Significance
• Out of 4162 annotated proteins, only 1999 (48%) share some function with their level-1 neighbors.
Sets of neighborhood pairs
Simple neighbor counting
• Discuss M and N:
• M: total number of functions predicted
• N: total number of functions known
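A rough sketch of neighbor counting, assuming the usual formulation in which each candidate function is ranked by how often it occurs among a protein's level-1 neighbors. The function names, the `top` cutoff, and the toy annotations are hypothetical. The second helper assumes M and N feed into the standard precision (matched / M) and recall (matched / N) ratios.

```python
from collections import Counter

def neighbor_counting(adj, annotations, u, top=3):
    """Rank functions by their frequency among the direct neighbors of u."""
    counts = Counter()
    for v in adj.get(u, ()):
        counts.update(annotations.get(v, ()))
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top]

def precision_recall(predicted, known):
    """Precision = matched / M (predicted), recall = matched / N (known)."""
    predicted, known = set(predicted), set(known)
    matched = len(predicted & known)
    m, n = len(predicted), len(known)
    return (matched / m if m else 0.0, matched / n if n else 0.0)

# Hypothetical toy data: protein P with three annotated neighbors.
adj = {"P": {"A", "B", "C"}, "A": {"P"}, "B": {"P"}, "C": {"P"}}
annotations = {
    "A": {"transport", "binding"},
    "B": {"transport"},
    "C": {"binding", "transport"},
}
ranked = neighbor_counting(adj, annotations, "P", top=2)
predicted = [name for name, _ in ranked]
precision, recall = precision_recall(predicted, {"transport", "signaling", "folding"})
```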
The Algorithm
• 1) Functional similarity weight
• Previous approaches use the Czekanowski-Dice (CD) distance between proteins u and v.
A simple example
• When a fraction x of protein u's neighbors is shared with protein v's neighbors, x is proportional to the probability that u's functions are shared with v through the common neighbors (and vice versa for the fraction y of v's neighbors shared with u's neighbors).
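The two measures can be compared with a small sketch. It assumes the commonly cited forms: CD-distance as the size of the symmetric difference of the two neighborhoods divided by the sum of the sizes of their union and intersection, and FS-weight as the product of the two shared fractions x and y, each with a +1 term in the denominator that penalizes proteins with very few neighbors. Each neighborhood is assumed to include the protein itself. Function names and the toy network are hypothetical.

```python
def closed_neighborhood(adj, p):
    """Neighbors of p plus p itself (assumption: N_p includes p)."""
    return set(adj.get(p, ())) | {p}

def cd_distance(adj, u, v):
    """Czekanowski-Dice distance: 0 for identical neighborhoods, 1 for disjoint ones."""
    nu, nv = closed_neighborhood(adj, u), closed_neighborhood(adj, v)
    return len(nu ^ nv) / (len(nu | nv) + len(nu & nv))

def fs_weight(adj, u, v):
    """Functional similarity weight as the product of the two shared fractions."""
    nu, nv = closed_neighborhood(adj, u), closed_neighborhood(adj, v)
    common = 2 * len(nu & nv)
    x = common / (len(nu - nv) + common + 1)  # share seen from u's side
    y = common / (len(nv - nu) + common + 1)  # share seen from v's side
    return x * y

# Hypothetical toy network: u and v interact and share neighbors a and b.
adj = {
    "u": {"a", "b", "v"},
    "v": {"a", "b", "c", "u"},
    "a": {"u", "v"},
    "b": {"u", "v"},
    "c": {"v"},
}
cd = cd_distance(adj, "u", "v")
fs = fs_weight(adj, "u", "v")
```

Unlike a distance, a higher FS-weight means stronger expected functional similarity, and the value is symmetric in u and v.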
• 2) Integrating reliability of experimental sources: prediction results can be improved by taking differences in the reliability of sources into account. The reliability of the interaction between u and v is estimated from:
• i: source number
• E_uv: set of sources that report the interaction between u and v
• n: number of times the interaction between u and v was observed
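One way to combine these quantities, sketched here under the assumption that the sources are treated as independent evidence: an interaction is unreliable only if every observation of it is wrong. The per-source reliability values and source names below are made up for illustration.

```python
def interaction_reliability(source_reliability, observations):
    """Estimate the reliability of one interaction from its supporting sources.

    source_reliability: {source: estimated reliability r_i in [0, 1]}
    observations: {source: n, the number of times the interaction was observed}
    Assumption: r_uv = 1 - product over sources of (1 - r_i) ** n.
    """
    p_all_wrong = 1.0
    for source, n in observations.items():
        p_all_wrong *= (1.0 - source_reliability[source]) ** n
    return 1.0 - p_all_wrong

# Hypothetical reliabilities for two experimental sources.
source_reliability = {"two_hybrid": 0.5, "affinity": 0.8}
# The interaction (u, v) was seen twice by one source and once by the other.
r_uv = interaction_reliability(source_reliability, {"two_hybrid": 2, "affinity": 1})
```

With this form, repeated observations and additional sources can only increase the estimated reliability.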
• The FS-weight is then combined with these reliability estimates into one integrated equation.
Transitive functional Association
• If u is similar to w and w is similar to v, then some similarity between u and v can be inferred through w.
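A minimal sketch of the transitive idea, assuming the transitive weight is the larger of the direct FS-weight and the best product of FS-weights through an intermediate neighbor w. The weight table and the `s_fs` lookup are hypothetical stand-ins for a real FS-weight function.

```python
def transitive_fs_weight(s_fs, adj, u, v):
    """Larger of the direct weight and the best two-step weight through a neighbor of u."""
    direct = s_fs(u, v)
    via_neighbor = max(
        (s_fs(u, w) * s_fs(w, v) for w in adj.get(u, ()) if w not in (u, v)),
        default=0.0,
    )
    return max(direct, via_neighbor)

# Hypothetical symmetric FS-weights for three proteins.
weights = {
    frozenset(("u", "w")): 0.9,
    frozenset(("w", "v")): 0.8,
    frozenset(("u", "v")): 0.5,
}

def s_fs(a, b):
    """Look up the toy FS-weight; unknown pairs get weight 0."""
    return weights.get(frozenset((a, b)), 0.0)

adj = {"u": {"w", "v"}, "w": {"u", "v"}, "v": {"u", "w"}}
s_tr = transitive_fs_weight(s_fs, adj, "u", "v")
```

Here the route through w (0.9 × 0.8) outweighs the direct weight of 0.5, so the transitive weight is taken from the indirect path.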
Functional Similarity Weighted Averaging
• The likelihood of protein p having function x is computed from:
• STR(u,v): transitive FS-weight
• r_int: fraction of all the proteins that share the function considered
• Sigma(p,x): 1 if p has function x, otherwise 0
• Pi_x: frequency of function x in proteins
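A hedged sketch of how these terms might fit together. It assumes a weighted average in which the background term r_int × Pi_x is added to the transitive weights of the level-1 and level-2 neighbors that carry function x, and the total is normalized by 1 plus the sum of all weights. The parameter `lam`, the normalizer `z`, and the choice to count each neighbor once are assumptions of this sketch, not details taken from the slides.

```python
def function_score(p, x, adj, s_tr, annotations, r_int, pi_x, lam=1.0):
    """Weighted-average score for protein p having function x."""
    level1 = set(adj.get(p, ()))
    level2 = {w for v in level1 for w in adj.get(v, ())}
    neighbors = (level1 | level2) - {p}  # each neighbor counted once (assumption)

    total = lam * r_int * pi_x  # background contribution of function x
    z = 1.0                     # normalizer: 1 + sum of neighbor weights
    for q in neighbors:
        weight = s_tr(p, q)
        sigma = 1.0 if x in annotations.get(q, ()) else 0.0  # Sigma(q, x)
        total += weight * sigma
        z += weight
    return total / z

# Hypothetical toy network: p - a - b, so b is a level-2 neighbor of p.
adj = {"p": {"a"}, "a": {"p", "b"}, "b": {"a"}}
tr_weights = {frozenset(("p", "a")): 0.8, frozenset(("p", "b")): 0.5}

def s_tr(u, v):
    """Look up the toy transitive FS-weight; unknown pairs get weight 0."""
    return tr_weights.get(frozenset((u, v)), 0.0)

annotations = {"a": {"x"}, "b": set()}
score = function_score("p", "x", adj, s_tr, annotations, r_int=0.5, pi_x=0.1)
```

Functions can then be ranked per protein by this score, with the highest-scoring functions taken as predictions.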
Results
• 1) Original neighbor counting
• 2) Neighbor counting with FS-weight
• 3) The scheme in (2), with level-2 neighbors also considered
Comparison with other schemes
Improvements?
• Threshold at level-2..