
Exploiting indirect neighbors and topological weight to predict protein function from protein–protein interactions

Hon Nian Chua, Wing-Kin Sung and Limsoon Wong

Motivation

Predicting protein function from protein-protein interaction data.

• Previous studies consider only level-1 neighbors.

• Can level-2 neighbors play a significant role in this prediction?

Summarizing the output of the study

• Level-2 neighbors do show functional association.

• A significant number of proteins were observed to have associations with level-2 neighbors but not with level-1 neighbors.

• A prediction algorithm:

• 1) Weight level-1 and level-2 neighbors based on functional similarity.

• 2) Assign each function a score based on its weighted frequency among the neighbors.

Conventional approaches

• Use only direct interactions, i.e., level-1 neighbors

• Consider a radius in the interaction neighborhood network

• Calculate a functional distance and use clustering to group proteins into functional classes.

Protein-Protein interactions as an undirected graph

• G = (V, E)

• u and v are two protein nodes

• An edge e between them represents an interaction

• u and v are level-k neighbors if there is a path with k edges between u and v.

• The set of level-k neighbors is denoted S_k
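
The neighbor definitions above can be sketched in a few lines of Python. This is only an illustration under assumptions not stated on the slide: the graph is stored as a plain adjacency dictionary, and the helper names (`level1`, `level2`, `neighbor_pair_sets`) and the toy graph are hypothetical. A protein can be both a level-1 and a level-2 neighbor of another, so the sketch also splits the neighbors into three groups: level-1 only, both, and level-2 only.

```python
def level1(adj, u):
    """Level-1 neighbors: proteins joined to u by a single edge."""
    return set(adj.get(u, ()))


def level2(adj, u):
    """Level-2 neighbors: proteins reachable from u by a path of two edges."""
    reached = set()
    for w in adj.get(u, ()):
        reached |= set(adj.get(w, ()))
    reached.discard(u)  # u -> w -> u is not a path to another protein
    return reached


def neighbor_pair_sets(adj, u):
    """Split u's neighbors into (level-1 only, both levels, level-2 only)."""
    s1, s2 = level1(adj, u), level2(adj, u)
    return s1 - s2, s1 & s2, s2 - s1


# Toy undirected graph (hypothetical data): A-B, A-C, B-C, C-D
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
only1, both, only2 = neighbor_pair_sets(adj, "A")
```

Here D is reachable from A only through C, so it is a level-2-only neighbor, while B and C are both level-1 and level-2 neighbors of A.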

Indirect Functional Association

Significance

• Out of 4162 annotated proteins, only 1999 (48%) share some function with their level-1 neighbors.

Sets of neighborhood pairs

Simple neighbor counting

•

• Discuss M and N

• M: total number of functions predicted

• N: total number of functions known
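
A minimal sketch of simple neighbor counting, together with precision and recall computed from the totals M and N defined above. The function names, the `top` cutoff, and the toy annotations are assumptions made for illustration; the slide does not say how many top-ranked functions are kept.

```python
from collections import Counter


def neighbor_counting(adj, annotations, u, top=1):
    """Rank functions by how often they occur among u's level-1 neighbors."""
    counts = Counter()
    for v in adj.get(u, ()):
        counts.update(annotations.get(v, ()))
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [function for function, _ in ranked[:top]]


def precision_recall(predicted, known):
    """Precision = correct / M (predicted); recall = correct / N (known)."""
    correct = sum(len(set(predicted[p]) & set(known[p])) for p in predicted)
    m = sum(len(set(predicted[p])) for p in predicted)
    n = sum(len(set(known[p])) for p in predicted)
    return (correct / m if m else 0.0, correct / n if n else 0.0)


# Toy data (hypothetical): protein p has three annotated neighbors.
adj = {"p": {"a", "b", "c"}}
annotations = {"a": {"f1"}, "b": {"f1", "f2"}, "c": {"f3"}}
predicted = {"p": neighbor_counting(adj, annotations, "p", top=2)}
known = {"p": {"f1", "f3"}}
precision, recall = precision_recall(predicted, known)
```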

The Algorithm

• 1) Functional similarity weight

Previous approaches use the Czekanowski-Dice distance (CD-distance) between proteins u and v, given by

A simple example

• When a fraction x of protein u's neighbors is shared with protein v, x is proportional to the probability that u's functions are shared with v through the common neighbors (and vice versa for the fraction y of v's neighbors shared with u).
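
The two measures can be compared with a short sketch. It assumes the usual definition of the CD-distance, |N_u Δ N_v| / (|N_u ∪ N_v| + |N_u ∩ N_v|), and an FS-weight formed as the product of the two shared fractions x and y described above, with common neighbors counted twice as in the CD-distance. It also assumes that N_u contains u itself along with its level-1 neighbors. These formulas do not appear on the slide, so treat them, and the function names, as assumptions.

```python
def closed_neighborhood(adj, u):
    """N_u: u together with its level-1 neighbors (assumed convention)."""
    return set(adj.get(u, ())) | {u}


def cd_distance(adj, u, v):
    """Czekanowski-Dice distance: 0 means identical neighborhoods."""
    nu, nv = closed_neighborhood(adj, u), closed_neighborhood(adj, v)
    return len(nu ^ nv) / (len(nu | nv) + len(nu & nv))


def fs_weight(adj, u, v):
    """Product of the shared fractions x (seen from u) and y (seen from v)."""
    nu, nv = closed_neighborhood(adj, u), closed_neighborhood(adj, v)
    common = len(nu & nv)
    if common == 0:
        return 0.0
    x = 2 * common / (len(nu - nv) + 2 * common)
    y = 2 * common / (len(nv - nu) + 2 * common)
    return x * y


# Toy graph (hypothetical): u and v interact and share neighbors a and b;
# v has one extra neighbor c.
adj = {
    "u": {"a", "b", "v"},
    "v": {"a", "b", "u", "c"},
    "a": {"u", "v"},
    "b": {"u", "v"},
    "c": {"v"},
}
distance = cd_distance(adj, "u", "v")  # 1 / 9
weight = fs_weight(adj, "u", "v")      # 1 * 8/9
```

In this example all of u's neighborhood is shared with v (x = 1), while v has one neighbor that u lacks (y = 8/9), so the weight is slightly below 1.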

•

• 2) Integrating the reliability of experimental sources: prediction results can be improved by taking differences in the reliability of sources into account. Between u and v, the reliability of the interaction is estimated as:

• i: experimental source index; E_uv: set of sources in which the interaction between u and v is observed; n: number of times the interaction between u and v was observed
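
One way to combine the quantities defined above is a noisy-OR estimate: the interaction is considered unreliable only if every observation of it is wrong, which gives r_uv = 1 - Π_{i ∈ E_uv} (1 - r_i)^n. The formula itself is missing from the slide, so this form, the per-source reliabilities r_i, and the names below are assumptions used for illustration.

```python
def interaction_reliability(source_reliability, observations):
    """Noisy-OR reliability of one interaction.

    source_reliability: reliability r_i of each experimental source i.
    observations: for each source in E_uv, the number of times n the
    interaction between u and v was observed in that source.
    """
    prob_all_wrong = 1.0
    for source, n in observations.items():
        prob_all_wrong *= (1.0 - source_reliability[source]) ** n
    return 1.0 - prob_all_wrong


# Hypothetical reliabilities and counts, for illustration only.
source_reliability = {"two_hybrid": 0.5, "affinity": 0.8}
observations = {"two_hybrid": 2, "affinity": 1}
r_uv = interaction_reliability(source_reliability, observations)
# 1 - (0.5 ** 2) * (0.2 ** 1) = 0.95
```

Repeated observations and additional sources can only raise the estimate, which matches the intuition that independent confirmations make an interaction more trustworthy.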

The integrated equation then becomes

Transitive functional Association

• If u is similar to w and w is similar to v, then a similarity between u and v can be inferred, given by:
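
A sketch of one plausible form of this rule: the transitive weight is the larger of the direct FS-weight and the best product of weights along a two-step path through a neighbor w of u. The slide's equation is not preserved, so this max-of-products form is an assumption, and `fs` stands in for any pairwise similarity function.

```python
def transitive_weight(fs, adj, u, v):
    """Max of the direct weight and the best two-step product through w."""
    direct = fs(u, v)
    via_neighbor = max(
        (fs(u, w) * fs(w, v) for w in adj.get(u, ())),
        default=0.0,
    )
    return max(direct, via_neighbor)


# Hypothetical pairwise weights, stored on unordered pairs.
weights = {
    frozenset(("u", "w")): 0.9,
    frozenset(("w", "v")): 0.8,
    frozenset(("u", "v")): 0.5,
}


def fs(a, b):
    """Look up the weight of an unordered pair; unknown pairs weigh 0."""
    return weights.get(frozenset((a, b)), 0.0)


adj = {"u": {"w", "v"}, "w": {"u", "v"}, "v": {"u", "w"}}
s_tr = transitive_weight(fs, adj, "u", "v")  # max(0.5, 0.9 * 0.8)
```

Here the indirect route through w (0.72) is stronger than the direct weight (0.5), so the transitive weight takes the indirect value.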

•

Functional Similarity Weighted Averaging

• The likelihood of protein p having function x:

• S_TR(u, v): transitive FS weight

• r_int: fraction of all the proteins who share this considered function

• Sigma(p, x) = 1 if p has function x, else 0

• Pi_x: frequency of function x in proteins
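
The scoring step can be sketched as a weighted average over the level-1 and level-2 neighbors of p, with a background term built from r_int and Pi_x. The exact equation is not preserved on the slide, so the form below is an assumption: the prior weight `lam`, the normalizer `z`, and summing each neighbor exactly once are simplifications chosen for illustration.

```python
def function_score(p, x, neighbors, s_tr, has_function, pi_x, r_int, lam=1.0):
    """Weighted-average score for protein p having function x.

    neighbors: level-1 and level-2 neighbors of p, each listed once.
    s_tr(p, q): transitive FS weight between p and q.
    has_function(q, x): the indicator Sigma(q, x), True or False.
    lam: assumed weight of the background term r_int * pi_x.
    """
    weighted_votes = sum(s_tr(p, q) for q in neighbors if has_function(q, x))
    z = 1.0 + sum(s_tr(p, q) for q in neighbors)  # assumed normalizer
    return (lam * r_int * pi_x + weighted_votes) / z


# Hypothetical weights and annotations, for illustration only.
weights = {"a": 0.5, "b": 0.25, "c": 0.25}
annotated = {"a": {"x"}, "b": set(), "c": {"x"}}

score = function_score(
    "p",
    "x",
    neighbors=["a", "b", "c"],
    s_tr=lambda p, q: weights[q],
    has_function=lambda q, x: x in annotated[q],
    pi_x=0.1,
    r_int=0.5,
)
# (0.5 * 0.1 + 0.5 + 0.25) / (1 + 1.0) = 0.4
```

Neighbors with a higher transitive weight contribute more to the score, and the background term keeps a function with no supporting neighbors from scoring exactly zero.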

Results

• 1) Original neighbor counting

• 2) Neighbor counting with FS-weight

• 3) Scheme (2) with level-2 neighbors also considered.

Comparison with other schemes

Improvements?

• Apply a threshold at level 2.
