CIS 4930/6930 – Recent Advances in Bioinformatics Spring 2014
Network models
Tamer Kahveci
2
Graphs
• Useful for describing networks.
• G = (V, E) with
  – V = set of nodes
  – E = set of edges
• Topological models
  – Directed/Undirected
  – Weighted/Unweighted
  – Deterministic/Probabilistic (G = (V, E, P))
• Concepts
  – Degree (indegree/outdegree), path
3
Topological properties
• Degree distribution, P(k) of G = (V, E)
  – Deg(k) = number of nodes in G with degree = k.
  – P(k) = Deg(k)/|V| = probability that a random node in G has degree = k.
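The two definitions above can be sketched in a few lines of Python. This is a minimal sketch, assuming an undirected graph given as a node list and an edge list; the function name `degree_distribution` and the 4-node example graph are hypothetical and not taken from the slides.

```python
from collections import Counter

def degree_distribution(nodes, edges):
    """Return {k: P(k)} for an undirected graph (assumed input format)."""
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    deg_count = Counter(degree.values())      # Deg(k): number of nodes with degree k
    return {k: count / len(nodes) for k, count in deg_count.items()}  # P(k) = Deg(k)/|V|

# Hypothetical example graph: a triangle a-b-c plus a pendant node d attached to a.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
P = degree_distribution(nodes, edges)
```

Here the node degrees are 3, 2, 2, and 1, so P(2) is one half and P(1) and P(3) are one quarter each.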
[Figure: H. pylori. Todor et al. TCBB. 10:4. 2013]
[Figure: small example graph; labels 3, 2, 2, 1]
4
Topological properties
• Neighbors of node v, N(v) = set of nodes adjacent to v.
• Clustering coefficient of node v, Cv, shows the connectivity of N(v).
  Cv = (# edges among N(v)) / (max # edges possible among N(v))
• Slightly different denominator for directed vs undirected graphs: with n = |N(v)|, the maximum is n(n-1)/2 for undirected and n(n-1) for directed graphs.
• C(k) = average clustering coefficient of all nodes with k edges.
• Network's clustering coefficient = average clustering coefficient of all nodes in G = (∑ Cv) / |V|
[Figure: example node with Cv = 2/6]
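A minimal sketch of Cv and the network clustering coefficient, assuming an undirected graph stored as a dictionary from each node to its set of neighbors. The function names, the convention Cv = 0 for nodes with fewer than two neighbors, and the example graph are assumptions made for illustration.

```python
from itertools import combinations

def clustering_coefficient(adj, v):
    """Cv for an undirected graph; adj maps node -> set of neighbors."""
    nbrs = adj[v]
    n = len(nbrs)
    if n < 2:
        return 0.0  # assumed convention: no pair of neighbors, so Cv = 0
    # Count edges among the neighbors of v.
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (n * (n - 1) / 2)  # undirected maximum: n(n-1)/2

def network_clustering(adj):
    """Average of Cv over all nodes: (sum of Cv) / |V|."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)

# Hypothetical example: v has 4 neighbors with 2 edges among them (a-b, b-c).
adj = {
    "v": {"a", "b", "c", "d"},
    "a": {"v", "b"},
    "b": {"v", "a", "c"},
    "c": {"v", "b"},
    "d": {"v"},
}
cv = clustering_coefficient(adj, "v")   # 2 edges out of 6 possible
```

For a directed graph the denominator would be n(n-1) instead, and the edge count would have to respect direction.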
5
Centrality of a node
• Centrality of a node v in graph G = (V, E) indicates the relative importance of v in G with respect to the rest of the nodes in G. Let's denote it with f(v | G) or simply f(v).
• Many centrality measures exist
  – Degree centrality
    • How popular am I?
    • fDeg(v) = Deg(v)
  – Closeness centrality
  – Betweenness centrality
6
Closeness Centrality
• How close am I to everyone else?
• Given G = (V, E)
• Dist(u,v) = shortest path length from u to v in G
• fClose(u) = ∑v in G Dist(u, v) (a smaller sum means u is closer to everyone; the commonly used form is the reciprocal, 1 / ∑v Dist(u, v))
• Alternative (for disconnected networks)
  – fClose(u) = ∑v in V-{u} 1/ Dist(u, v)
  – 1/inf = 0
• How do I find the shortest paths?
  – Floyd-Warshall algorithm
  – Johnson's algorithm
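Both closeness variants can be sketched as follows. This is a minimal sketch under stated assumptions: the graph is undirected and unweighted, so breadth-first search stands in for Floyd-Warshall or Johnson's algorithm, and the function names and example graphs are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness_sum(adj, u):
    """fClose(u) = sum of Dist(u, v); infinite if some node is unreachable."""
    dist = bfs_distances(adj, u)
    if len(dist) < len(adj):
        return float("inf")
    return sum(dist.values())

def closeness_harmonic(adj, u):
    """Alternative: sum of 1/Dist(u, v) over v != u; unreachable nodes add 0."""
    dist = bfs_distances(adj, u)
    return sum(1 / d for v, d in dist.items() if v != u)

# Hypothetical examples: a path 1-2-3, and the same path plus an isolated node 4.
path_graph = {1: {2}, 2: {1, 3}, 3: {2}}
with_isolate = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
```

On the disconnected example the plain sum is infinite for every node, while the 1/Dist form still ranks node 2 above nodes 1 and 3.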
7
Betweenness Centrality
• How many pairs of nodes use me on the cheapest route to communicate?
• gst = number of shortest paths between s and t.
• gst(v) = number of shortest paths between s and t that contain v.
• fBetween(v) = (∑s,t gst(v) / gst) / (number of s,t pairs in V - {v}).
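The formula above can be evaluated directly by counting shortest paths with breadth-first search. This is a brute-force sketch, assuming an undirected, unweighted graph, unordered s,t pairs, and that disconnected pairs contribute 0; the function names and the example graphs are hypothetical.

```python
from collections import deque
from itertools import combinations

def count_shortest_paths(adj, source):
    """BFS from source: dist[v] and sigma[v] = number of shortest source-v paths."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]  # every shortest path to u extends to w
    return dist, sigma

def betweenness(adj, v):
    """fBetween(v): average of gst(v)/gst over unordered pairs s,t in V - {v}."""
    info = {s: count_shortest_paths(adj, s) for s in adj}
    pairs = list(combinations([x for x in adj if x != v], 2))
    total = 0.0
    for s, t in pairs:
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s or v not in dist_s or v not in dist_t:
            continue  # assumption: unreachable pairs contribute 0
        # v lies on a shortest s-t path iff Dist(s,v) + Dist(v,t) = Dist(s,t).
        if dist_s[v] + dist_t[v] == dist_s[t]:
            total += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return total / len(pairs) if pairs else 0.0

# Hypothetical examples: a star with center "c", and a 4-cycle a-b-c-d.
star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
```

In the star every pair of leaves communicates through the center, so its score is 1. In the 4-cycle node a carries one of the two shortest b-d paths, which gives 0.5 out of 3 pairs.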
8
Floyd-Warshall: shortest path
for k = 1 to n do                      // use node k on path
  for i = 1 to n do                    // origin i
    for j = 1 to n do                  // destination j
      if (d[i,k] + d[k,j] < d[i,j]) {
        d[i,j] = d[i,k] + d[k,j]       // shorter path length
        visit[i,j] = k                 // new path goes through k
      }
Given G = (V, E, w)
Distance(i, j, 0) = w(i, j)
Distance(i, j, k+1) = min{Distance(i, j, k), Distance(i, k+1, k) + Distance(k+1, j, k)}
[Figure: path from i to j through node k+1; intermediate nodes drawn from V' = {1, 2, …, k}]
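A runnable Python version of the triple loop, as a minimal sketch. It assumes nodes are numbered 0..n-1, edges are directed and given as a dictionary of weights, and there are no negative cycles; the names `floyd_warshall`, `via`, and `expand_path` are hypothetical (`via` plays the role of the `visit` table).

```python
import math

def floyd_warshall(n, weights):
    """weights: {(i, j): w} for directed edges on nodes 0..n-1. Returns (d, via)."""
    # Distance(i, j, 0) = w(i, j); missing edges start at infinity.
    d = [[0 if i == j else weights.get((i, j), math.inf) for j in range(n)]
         for i in range(n)]
    via = [[None] * n for _ in range(n)]
    for k in range(n):              # allow node k as an intermediate node
        for i in range(n):          # origin i
            for j in range(n):      # destination j
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]   # shorter path length
                    via[i][j] = k                 # new path goes through k
    return d, via

def expand_path(via, i, j):
    """Node sequence of the shortest i-j path (assumes i != j and j reachable)."""
    k = via[i][j]
    if k is None:
        return [i, j]               # direct edge
    return expand_path(via, i, k)[:-1] + expand_path(via, k, j)

# Hypothetical example graph.
weights = {(0, 1): 4, (0, 2): 1, (2, 1): 2, (1, 3): 5}
d, via = floyd_warshall(4, weights)
route = expand_path(via, 0, 3)
```

In this example the direct edge 0 to 1 costs 4, but the detour through node 2 costs 3, so the shortest route from 0 to 3 is 0, 2, 1, 3 with length 8.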
9
Key network models
• Erdos-Renyi
• Small world
• Scale free
10
Erdos-Renyi
• Totally uniformly random distribution of edges
• Construction
  – Given two parameters (n = # of nodes, p = probability that an edge exists)
  – For all pairs of nodes (u,v)
    • Create an edge (u,v) with probability p.
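The construction above maps almost line for line onto code. A minimal sketch, assuming an undirected graph on nodes 0..n-1; the function name and the `seed` parameter are hypothetical additions for reproducibility.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Return the edge list of a random graph on nodes 0..n-1."""
    rng = random.Random(seed)
    # For every unordered pair (u, v), create the edge with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

edges = erdos_renyi(n=20, p=0.2, seed=1)
```

With p = 0 the graph has no edges and with p = 1 it is complete; in between, the expected number of edges is p · n(n-1)/2.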
11
Small World (Watts-Strogatz)
• Everyone tends to be close to each other.
• As the number of nodes (N) in the network grows, the distance between two random nodes grows with the logarithm of N.
• Construction
  – Given three parameters:
    • N = # of nodes
    • K = average degree
    • p = rewiring probability
  – Construct a ring lattice
    • Connect each ith node to nodes {i-1, i-2, …, i-K/2} and {i+1, i+2, …, i+K/2} with an edge
  – For each node u
    • For each edge (u, v)
      – Randomly pick a node v' from V-{u}
      – Replace (u, v) with (u, v') with probability p
…
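A minimal sketch of the ring lattice and rewiring steps, assuming K is even and smaller than N. It adds one rule the slide does not state: v' is chosen only among nodes not already adjacent to u, so no duplicate edges appear. The function name and the `seed` parameter are hypothetical.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """n nodes, even k < n, rewiring probability p. Returns a set of frozenset edges."""
    rng = random.Random(seed)
    edges = set()
    # Ring lattice: connect node i to its k/2 nearest neighbors on each side.
    for i in range(n):
        for offset in range(1, k // 2 + 1):
            edges.add(frozenset((i, (i + offset) % n)))
    # Rewiring: each lattice edge (u, v) is replaced by (u, v') with probability p.
    for u in range(n):
        for offset in range(1, k // 2 + 1):
            if rng.random() >= p:
                continue
            old = frozenset((u, (u + offset) % n))
            # Assumption: skip self-loops and edges that already exist.
            candidates = [w for w in range(n)
                          if w != u and frozenset((u, w)) not in edges]
            if candidates:
                edges.remove(old)
                edges.add(frozenset((u, rng.choice(candidates))))
    return edges

lattice = watts_strogatz(n=12, k=4, p=0.0, seed=0)
rewired = watts_strogatz(n=12, k=4, p=1.0, seed=0)
```

Rewiring never changes the number of edges, so the average degree stays at K for every p.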
12
Scale-Free
• A lot of poor work for a few super rich (many low-degree nodes, a few high-degree hubs)
• Probability that a node has degree k drops as a power law in k.
  – P(k) ~ k^(-γ)
• Construction (preferential attachment, or "rich get richer")
  – Given two parameters (n = # of nodes, k = average degree)
  – Build a small network (e.g. two nodes and one edge)
  – Repeat
    • Insert a new node v
    • Insert k edges from v to existing nodes. Existing node u gets an edge with probability pu = Deg(u) / ∑i Deg(i)
  – Until we have n nodes
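The repeat loop above can be sketched as follows. This is a minimal sketch with two assumptions the slide leaves open: the k targets of a new node are distinct, and a new node gets fewer than k edges while fewer than k nodes exist. The function name and the `seed` parameter are hypothetical.

```python
import random

def preferential_attachment(n, k, seed=None):
    """Grow a graph to n nodes; returns (edge list, degree dictionary)."""
    rng = random.Random(seed)
    edges = [(0, 1)]                 # small initial network: two nodes, one edge
    degree = {0: 1, 1: 1}
    while len(degree) < n:
        v = len(degree)              # label of the new node
        existing = list(degree)
        weights = [degree[u] for u in existing]   # proportional to Deg(u)
        m = min(k, len(existing))
        targets = set()
        while len(targets) < m:      # redraw until m distinct targets (assumption)
            targets.add(rng.choices(existing, weights=weights)[0])
        degree[v] = 0
        for u in targets:
            edges.append((v, u))
            degree[u] += 1
            degree[v] += 1
    return edges, degree

edges, degree = preferential_attachment(n=50, k=2, seed=7)
```

Nodes inserted early have more chances to collect edges, which is what produces the few high-degree hubs.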
13
Hierarchical
• Similar to fractals
• Scale-free networks with high clustering.
• Construction
  – Create an initial network (seed) with t peripheral nodes
  – Create t copies of this network and connect each of them to the central node.
[Figure: fractal]
14
Probabilistic
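[Figure: graph on nodes a, b, c with two probabilistic edges (probabilities 0.3 and 0.6), followed by its four possible deterministic instances]
• Neither edge present: (1-0.6) x (1-0.3) = 0.28
• Only the 0.3 edge present: 0.3 x (1-0.6) = 0.12
• Only the 0.6 edge present: (1-0.3) x 0.6 = 0.42
• Both edges present: 0.3 x 0.6 = 0.18
• 0.28 + 0.12 + 0.42 + 0.18 = 1
G = (V, E, P)
P: E -> (0, 1]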
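The possible instances of a probabilistic graph can be enumerated directly. This is a minimal sketch, assuming edges exist independently of each other; the function name is hypothetical, and which edge carries 0.3 and which carries 0.6 is an assumed assignment for the example.

```python
from itertools import product

def possible_worlds(edge_probs):
    """edge_probs: {edge: P(edge)}. Returns [(set of present edges, probability)]."""
    edges = list(edge_probs)
    worlds = []
    for mask in product([False, True], repeat=len(edges)):
        prob = 1.0
        for e, present in zip(edges, mask):
            # Independent edges: multiply P(e) if present, 1 - P(e) if absent.
            prob *= edge_probs[e] if present else 1 - edge_probs[e]
        present_edges = frozenset(e for e, present in zip(edges, mask) if present)
        worlds.append((present_edges, prob))
    return worlds

# Assumed assignment: edge (a, b) has probability 0.3, edge (a, c) has 0.6.
edge_probs = {("a", "b"): 0.3, ("a", "c"): 0.6}
worlds = dict(possible_worlds(edge_probs))
```

A graph with m probabilistic edges has 2^m such instances, so full enumeration is only practical for small graphs.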