Lecture6 Topic1: Graph spectral analysis/Graph spectral clustering and its application to metabolic networks Topic 2: Different centrality measures of nodes


Page 1:

Lecture 6
Topic 1: Graph spectral analysis/Graph spectral clustering and its application to metabolic networks
Topic 2: Different centrality measures of nodes

Page 2:

Graph spectral analysis/

Graph spectral clustering

Page 3:

PROTEIN STRUCTURE: INSIGHTS FROM GRAPH THEORY

by SARASWATHI VISHVESHWARA, K. V. BRINDA and N. KANNAN

Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India

Page 4:

Laplacian matrix: L = D - A

where A is the adjacency matrix and D is the degree matrix.
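The construction L = D - A can be sketched in a few lines of NumPy. The 5-node edge list below is a hypothetical example chosen for illustration (it is not the graph drawn on the slide), and the names `edges`, `A`, `D` and `L` are assumptions of this sketch.

```python
import numpy as np

# Hypothetical undirected graph on 5 nodes (0-based labels); not taken from the slides.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
n = 5

# Adjacency matrix: A[u, v] = 1 if u and v share an edge.
A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[u, v] = 1
    A[v, u] = 1

# Degree matrix: diagonal matrix holding the number of edges at each node.
D = np.diag(A.sum(axis=1))

# Laplacian matrix.
L = D - A

# The Laplacian is symmetric, so eigvalsh applies; eigenvalues come back in ascending order.
eigenvalues = np.linalg.eigvalsh(L)
print(L)
print(eigenvalues)
```

Every row of L sums to zero, so the smallest eigenvalue is 0 (with the all-ones vector as its eigenvector).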

Page 5:

Eigenvalues and eigenvectors

The eigenvalues of a matrix A are the roots of the following equation:

|A-λI|=0, where I is the identity matrix

Let λ be an eigenvalue of A and let x be a nonzero vector such that

\[ A x = \lambda x \qquad \text{-----(1)} \]

where A is N×N and both x and λx are N×1. Then x is an eigenvector of A corresponding to λ.

Page 6:

Node 1 has 3 edges, nodes 2, 3 and 4 have 2 edges each and node 5 has only one edge. The magnitudes of the components of the eigenvector belonging to the largest eigenvalue of the adjacency matrix reflect this observation.
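The observation can be checked numerically. The graph on the slide is not recoverable from the transcript, so the sketch below assumes a hypothetical edge list with the same degree sequence (3, 2, 2, 2, 1); nodes are relabelled 0 to 4, so node 0 plays the role of node 1.

```python
import numpy as np

# Hypothetical graph with degrees 3, 2, 2, 2, 1 (0-based labels); an assumption,
# not the graph shown on the slide.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
n = 5
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

# eigh returns eigenvalues in ascending order, so the last column of the
# eigenvector matrix belongs to the largest eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(A)
lam = eigenvalues[-1]
x = np.abs(eigenvectors[:, -1])  # all components share one sign, so abs() keeps it an eigenvector

# Check equation (1): A x = lambda x.
residual = np.linalg.norm(A @ x - lam * x)

# Nodes ordered from largest to smallest component.
ranking = np.argsort(-x)
print(lam, x, ranking)
```

In this assumed graph the degree-3 node has the largest component and the degree-1 node the smallest; the three degree-2 nodes do not all share the same value, because each component also depends on the neighbours' components.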

Page 7:

Node 1 has 3 edges, nodes 2, 3 and 4 have 2 edges each and node 5 has only one edge. The magnitudes of the components of the eigenvector belonging to the largest eigenvalue of the Laplacian matrix also reflect this observation.

Page 8:

The largest eigenvalue (lev) depends upon the highest degree in the graph. For any k-regular graph G (a graph in which every vertex has degree k), the eigenvalue with the largest absolute value is k. A corollary to this theorem is that the lev of a clique of n vertices is n − 1. In a general connected graph, the lev is always less than or equal to the largest degree in the graph. In a graph with n vertices, the absolute value of the lev decreases as the degree of the vertices decreases. The lev of a clique with 11 vertices is 10 and that of a linear chain with 11 vertices is 1.932.

a linear chain with 11 vertices
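The two numerical values quoted for 11 vertices can be reproduced with a short sketch. The helper name `largest_eigenvalue` is an assumption of this example.

```python
import numpy as np

def largest_eigenvalue(adjacency):
    """Largest eigenvalue (lev) of a symmetric adjacency matrix."""
    return float(np.linalg.eigvalsh(adjacency)[-1])

n = 11

# Clique: every pair of distinct vertices is connected.
clique = np.ones((n, n)) - np.eye(n)

# Linear chain: vertex i is connected to vertex i + 1 only.
chain = np.zeros((n, n))
for i in range(n - 1):
    chain[i, i + 1] = chain[i + 1, i] = 1.0

lev_clique = largest_eigenvalue(clique)
lev_chain = largest_eigenvalue(chain)
print(round(lev_clique, 3), round(lev_chain, 3))
```

Both values stay at or below the largest degree of the graph (10 for the clique, 2 for the chain), in line with the bound stated above.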

Page 9:

In graphs 5(a)-5(e), the highest degree is 6. In graphs 5(f)-5(i), the highest degree is 5, 4, 3 and 2 respectively.

Page 10:

It can be noticed that the lev is generally higher if the graph contains vertices of high degree. The lev decreases gradually from the graph with highest degree 6 to the one with highest degree 2. In graphs 5(a)-5(e) there is one common vertex with degree 6 (the highest degree), while the degrees of the other vertices differ (less than 6 in all cases), and the lev values differ accordingly; i.e. the lev also depends on the degree of the vertices adjoining the highest degree vertex.

Page 11:

This paper combines graph 4(a) and graph 4(b) and constructs a Laplacian matrix with edge weights 1/dij, where dij is the distance between vertices i and j. The distance between any vertex of graph 4(a) and any vertex of graph 4(b) is taken to be very large (say 100), so every matrix element that links a vertex from graph 4(a) to a vertex from graph 4(b) has the very small value 0.01. The resulting Laplacian matrix of 8 vertices is diagonalized, and its eigenvalues and the corresponding vector components are given in Table 3.

Page 12:

The vector components corresponding to the second smallest eigenvalue contain the desired information about clustering: residues that form a cluster have identical values. In Fig. 4, nodes 1-5 form one cluster (cluster 1) and nodes 6-8 form another (cluster 2).
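A minimal sketch of this clustering step follows. The internal edges of graphs 4(a) and 4(b) are not given in the transcript, so the sketch assumes that every pair of nodes inside a cluster is at distance 1 (weight 1), and uses the stated weight 0.01 between clusters. Nodes are labelled 0 to 7, so nodes 0-4 stand for nodes 1-5 and nodes 5-7 for nodes 6-8.

```python
import numpy as np

cluster_1 = [0, 1, 2, 3, 4]
cluster_2 = [5, 6, 7]
n = 8

# Weight matrix with entries 1/d_ij. Inter-cluster distance 100 gives weight 0.01.
W = np.full((n, n), 0.01)

# Assumption of this sketch: all intra-cluster distances equal 1.
for cluster in (cluster_1, cluster_2):
    for i in cluster:
        for j in cluster:
            W[i, j] = 1.0
np.fill_diagonal(W, 0.0)

# Weighted Laplacian: weighted degrees on the diagonal minus the weight matrix.
L = np.diag(W.sum(axis=1)) - W

# Eigenvalues in ascending order; column 1 belongs to the second smallest eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(L)
second_vector = eigenvectors[:, 1]

# Nodes with the same sign (here: the same value) fall into the same cluster.
labels = (second_vector > 0).astype(int)
print(np.round(eigenvalues, 3))
print(np.round(second_vector, 3), labels)
```

Under these assumptions the components are exactly constant within each cluster and of opposite sign between the two clusters. With less regular intra-cluster weights the values would only be approximately equal.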

Page 13:

Metabolome Based Reaction Graphs of M. tuberculosis and M. leprae: A Comparative Network Analysis

by Ketki D. Verkhedkar (1), Karthik Raman (2), Nagasuma R. Chandra (2), Saraswathi Vishveshwara (1*)

(1) Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
(2) Bioinformatics Centre, Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India

PLoS ONE | www.plosone.org | September 2007 | Issue 9 | e881

Page 14:

Construction of network

[Figure: example reactions R1, R2, R3 and R4 and the corresponding stoichiometric matrix]

Following this method the networks of metabolic reactions corresponding to 3 organisms were constructed

Page 15:
Page 16:

Analysis of network parameters

Page 17:

Giant component of the reaction network of E. coli

Page 18:

Giant components of the reaction networks of M. tuberculosis and M. leprae

Page 19:

Analyses of sub-clusters in the giant component

Graph spectral analysis was performed to detect sub-clusters of reactions in the giant component. To obtain the eigenvalue spectra of the graph, the adjacency matrix of the graph is converted to a Laplacian matrix (L), by the equation:

L = D - A

where D, the degree matrix of the graph, is a diagonal matrix in which the ith element on the diagonal is equal to the number of connections that the ith node makes in the graph.

It is observed that reactions belonging to fatty acid biosynthesis and the FAS-II cycle of the mycolic acid pathway in M. tuberculosis form distinct, tightly connected sub-clusters.

Page 20:
Page 21:
Page 22:
Page 23:

Identification of hubs in the reaction networks

In biological networks, the hubs are thought to be functionally important and phylogenetically oldest.

The largest vector component of the highest eigenvalue of the Laplacian matrix of the graph corresponds to the node with high degree as well as low eccentricity. Two parameters, degree and eccentricity, are involved in the identification of graph spectral (GS) hubs.
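As a rough illustration of GS hub identification, the sketch below ranks nodes by the magnitude of their component in the eigenvector of the largest Laplacian eigenvalue. The star graph is a hypothetical toy case chosen because its centre clearly has both the highest degree and the lowest eccentricity; it is not one of the reaction networks from the paper.

```python
import numpy as np

# Hypothetical star graph: node 0 is the centre, nodes 1..5 are leaves.
n_leaves = 5
n = n_leaves + 1
A = np.zeros((n, n))
A[0, 1:] = 1.0
A[1:, 0] = 1.0

# Laplacian L = D - A.
L = np.diag(A.sum(axis=1)) - A

# eigh sorts eigenvalues in ascending order; the last column belongs to the largest one.
eigenvalues, eigenvectors = np.linalg.eigh(L)
top_vector = np.abs(eigenvectors[:, -1])

# GS ranking: largest magnitude first. Degree ranking: highest degree first.
gs_ranking = np.argsort(-top_vector)
degree_ranking = np.argsort(-A.sum(axis=1))
print(eigenvalues[-1], gs_ranking[0], degree_ranking[0])
```

In this toy case both rankings put the centre first. On larger networks the two rankings can differ, which is the point made on the following slide.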

Page 24:

Identification of hubs in the reaction networks

Alternatively, hubs can be ranked based on their connectivity alone (degree hubs).

It was observed that the top 50 degree hubs in the reaction networks of the three organisms comprised reactions involving the metabolite L-glutamate as well as reactions involving pyruvate. However, the top 50 GS hubs of M. tuberculosis and M. leprae exclusively comprised reactions involving L-glutamate while the top GS hubs in E. coli only consisted of reactions involving pyruvate.

The difference in the degree and GS hubs suggests that the most highly connected reactions are not necessarily the most central reactions in the metabolome of the organism

Page 25:
Page 26:

Centrality measures of nodes

Page 27:

Centrality measures

Within graph theory and network analysis, there are various measures of the centrality of a vertex within a graph that determine the relative importance of a vertex within the graph.

We will discuss the following centrality measures:

•Degree centrality

•Betweenness centrality

•Closeness centrality

•Eigenvector centrality

•Subgraph centrality

Page 28:

Degree centrality

Degree centrality is defined as the number of links incident upon a node, i.e. the degree of the node.

Degree centrality is often interpreted in terms of the immediate risk of the node for catching whatever is flowing through the network (such as a virus, or some information).

The degree centrality of the blue nodes is higher.
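Counting incident links is straightforward to sketch. The edge list below is a hypothetical example (the coloured graph on the slide is not recoverable from the transcript).

```python
from collections import defaultdict

# Hypothetical undirected graph given as an edge list.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

# Each edge contributes one link to both of its end nodes.
degree = defaultdict(int)
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Nodes sorted from highest to lowest degree centrality.
ranked = sorted(degree.items(), key=lambda item: -item[1])
print(ranked)
```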

Page 29:

Betweenness centrality

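The vertex betweenness centrality BC(v) of a vertex v is defined as follows:

\[ BC(v) = \sum_{u \neq v \neq w} \frac{\sigma_{uw}(v)}{\sigma_{uw}} \]

Here σuw is the total number of shortest paths between nodes u and w, and σuw(v) is the number of shortest paths between nodes u and w that pass through node v. The sum runs over all pairs of vertices u, w different from v.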

Vertices that occur on many shortest paths between other vertices have higher betweenness than those that do not.

Page 30:

Betweenness centrality

[Figure: example graph with nodes a, b, c, d, e and f]

Calculation for node c

Pair (u,w)   σuw   σuw(c)   σuw(c)/σuw
(a,b)        1     0        0
(a,d)        1     1        1
(a,e)        1     1        1
(a,f)        1     1        1
(b,d)        1     1        1
(b,e)        1     1        1
(b,f)        1     1        1
(d,e)        1     0        0
(d,f)        1     0        0
(e,f)        1     0        0

Betweenness centrality of node c = 6

Betweenness centrality of node a = 0
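The calculation for node c can be reproduced with a small breadth-first-search sketch. The figure itself is lost in the transcript, so the edge list below is a hypothetical graph chosen to be consistent with the table (one shortest path per pair, node c on the paths listed); the function names `bfs_counts` and `betweenness` are assumptions of this example.

```python
from collections import deque
from itertools import combinations

# Hypothetical edge list consistent with the table; the slide's figure is not available.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

def bfs_counts(source):
    """Return (dist, sigma): hop distances and numbers of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:               # first time nb is reached
                dist[nb] = dist[node] + 1
                sigma[nb] = 0
                queue.append(nb)
            if dist[nb] == dist[node] + 1:   # node precedes nb on a shortest path
                sigma[nb] += sigma[node]
    return dist, sigma

info = {s: bfs_counts(s) for s in graph}

def betweenness(v):
    """Sum of sigma_uw(v) / sigma_uw over unordered pairs u, w different from v."""
    total = 0.0
    for u, w in combinations(sorted(graph), 2):
        if v in (u, w):
            continue
        dist_u, sigma_u = info[u]
        dist_v, sigma_v = info[v]
        # v lies on a shortest u-w path exactly when the distances add up.
        if dist_u[v] + dist_v[w] == dist_u[w]:
            total += sigma_u[v] * sigma_v[w] / sigma_u[w]
    return total

print({v: betweenness(v) for v in sorted(graph)})
```

The sketch assumes a connected graph and counts each unordered pair once, as the table does.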

Page 31:

Hue (from red=0 to blue=max) shows the node betweenness.

Betweenness centrality

•Nodes of high betweenness centrality are important for transport.

•If they are blocked, transport becomes less efficient and on the other hand if their capacity is improved transport becomes more efficient.

•Using a similar concept edge betweenness is calculated.

http://en.wikipedia.org/wiki/Betweenness_centrality#betweenness

Page 32:

Closeness centrality

The farness of a vertex is the sum of the shortest-path distances from that vertex to every other vertex in the graph. The reciprocal of farness is the closeness centrality (CC):

\[ CC(v) = \frac{1}{\sum_{t \in V \setminus v} d(v,t)} \]

Here, d(v,t) is the shortest distance between vertex v and vertex t.

Closeness centrality can be viewed as the efficiency of a vertex in spreading information to all other vertices.
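A minimal sketch of the closeness calculation for an unweighted, connected graph is given below. The three-node path a-b-c is a hypothetical example, and the function name `closeness` is an assumption of this sketch.

```python
from collections import deque

# Hypothetical path graph a - b - c, stored as adjacency sets.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

def closeness(graph, v):
    """Reciprocal of the sum of shortest-path distances from v to all other vertices."""
    dist = {v: 0}
    queue = deque([v])
    while queue:                      # breadth-first search gives hop distances
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    farness = sum(dist.values())      # d(v, v) = 0 contributes nothing
    return 1.0 / farness

print({v: closeness(graph, v) for v in sorted(graph)})
```

The middle vertex b has farness 1 + 1 = 2 and the end vertices have farness 1 + 2 = 3, so b has the highest closeness.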

Page 33:

Eigenvector centrality

Let A be the adjacency matrix of a graph, λ the largest eigenvalue of A, and x the corresponding eigenvector. Then

\[ A x = \lambda x \qquad \text{-----(1)} \]

where A is N×N and x is N×1. The eigenvalues are the roots of |A-λI|=0, where I is the identity matrix.

The ith component of the eigenvector x then gives the eigenvector centrality score of the ith node in the network.

From (1),

\[ x_i = \frac{1}{\lambda} \sum_{j=1}^{N} A_{i,j} x_j \]

•Therefore, for any node, the eigenvector centrality score is proportional to the sum of the scores of all nodes which are connected to it.

•Consequently, a node has a high value of EC either if it is connected to many other nodes or if it is connected to others that themselves have high EC.
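The self-consistent relation between a node's score and its neighbours' scores suggests an iterative computation. The sketch below uses power iteration on A + I (the shift by the identity is a choice of this example: it leaves the eigenvectors unchanged and avoids oscillation on bipartite graphs). The 5-node edge list is hypothetical.

```python
import numpy as np

# Hypothetical connected graph on 5 nodes (0-based labels).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
n = 5
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

# Power iteration: repeatedly replace each score by its own value plus the
# sum of its neighbours' scores, then renormalise.
x = np.ones(n)
for _ in range(1000):
    x_next = A @ x + x
    x = x_next / np.linalg.norm(x_next)

# Rayleigh quotient gives the eigenvalue that goes with the converged vector.
lam = float(x @ A @ x)

# Each score should equal (1 / lambda) times the sum of the neighbours' scores.
neighbour_sums = A @ x / lam
print(lam, np.round(x, 3))
```

Node 0 has the most neighbours and ends up with the highest score; node 4, whose only neighbour has a modest score, ends up lowest.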

Page 34:

Subgraph centrality

The number of closed walks of length k starting and ending on vertex i in the network is given by the local spectral moment μk(i), which is simply the ith diagonal entry of the kth power of the adjacency matrix A:

\[ \mu_k(i) = (A^k)_{ii} \]

Closed walks can be trivial or nontrivial and are directly related to the subgraphs of the network.

Subgraph Centrality in Complex Networks, Physical Review E 71, 056103(2005)

Page 35:

Subgraph centrality

Adjacency matrix: Muv = 1 if there is an edge between nodes u and v and 0 otherwise.

M =

0 1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 1 0 0 0 0 0 0 0 0
0 1 0 1 1 1 0 0 0 0 0 0 0 0
0 1 1 0 1 1 0 1 0 0 0 0 0 0
0 0 1 1 0 1 0 0 0 0 0 0 0 0
0 1 1 1 1 0 1 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 1 0 1 1
0 0 0 0 0 0 1 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1 1 0 1 0 1
0 0 0 0 0 0 0 0 1 1 0 0 1 0

Page 36:

Subgraph centrality

M^2 =

1 0 1 1 0 1 0 0 0 0 0 0 0 0
0 4 2 2 3 2 1 1 0 0 0 0 0 0
1 2 4 3 2 3 1 1 0 0 0 0 0 0
1 2 3 5 2 3 1 0 1 0 0 0 0 0
0 3 2 2 3 2 1 1 0 0 0 0 0 0
1 2 3 3 2 5 0 1 0 0 1 0 0 0
0 1 1 1 1 0 2 0 0 1 0 0 0 0
0 1 1 0 1 1 0 2 0 1 0 0 1 1
0 0 0 1 0 0 0 0 4 2 1 1 2 2
0 0 0 0 0 0 1 1 2 4 0 1 2 2
0 0 0 0 0 1 0 0 1 0 2 0 1 1
0 0 0 0 0 0 0 0 1 1 0 1 0 1
0 0 0 0 0 0 0 1 2 2 1 0 4 2
0 0 0 0 0 0 0 1 2 2 1 1 2 3

(M^2)uv for u ≠ v represents the number of common neighbours of the nodes u and v.

The diagonal entries (M^2)uu are the local spectral moments of order 2, i.e. the number of closed walks of length 2 at node u, which equals the degree of u.
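The squared matrix can be recomputed from the adjacency matrix as a check. The edge list below was transcribed from the nonzero entries of M on the previous slide (1-based node labels); the variable names are assumptions of this sketch.

```python
import numpy as np

# Edges read off the 14 x 14 adjacency matrix M (1-based node labels).
edges = [(1, 2), (2, 3), (2, 4), (2, 6), (3, 4), (3, 5), (3, 6), (4, 5),
         (4, 6), (4, 8), (5, 6), (6, 7), (7, 11), (8, 9), (9, 10), (9, 13),
         (9, 14), (10, 11), (10, 13), (10, 14), (12, 13), (13, 14)]
n = 14

M = np.zeros((n, n), dtype=int)
for u, v in edges:
    M[u - 1, v - 1] = M[v - 1, u - 1] = 1

# Square of the adjacency matrix.
M2 = M @ M

# Diagonal of M^2: closed walks of length 2, i.e. the local spectral moment of order 2.
mu_2 = np.diag(M2)
degree = M.sum(axis=1)

# Off-diagonal entry: number of common neighbours, e.g. nodes 2 and 5.
common_2_5 = int(M2[1, 4])
print(mu_2.tolist(), common_2_5)
```

Nodes 2 and 5 share the neighbours 3, 4 and 6, which matches the entry 3 in row 2, column 5 of M^2.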

Page 37:

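Subgraph centrality

The subgraph centrality of the node i is given by

\[ SC(i) = \sum_{k=0}^{\infty} \frac{\mu_k(i)}{k!} \]

Let λ be the main eigenvalue of the adjacency matrix A. It can be shown that

\[ \mu_k(i) \le \lambda^k \]

Thus, the subgraph centrality of any vertex i is bounded above by

\[ SC(i) \le e^{\lambda} \]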

Page 38:

Table 2. Summary of results of eight real-world complex networks.