Community Detection Algorithms
DIRECTED BY : ALIREZA ANDALIB
Member-Based Community Detection
1- Node similarity: nodes with similar characteristics are more often in the same community
Important node features: node similarity, node degree (familiarity), and node reachability
Similarity is based on the overlap between the neighborhoods of two nodes
Two Methods to find similarity:
The similarity values between nodes v2 and v5 are :
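The two similarity methods are not named in the surviving slide text. A minimal sketch, assuming they are the Jaccard and cosine similarities over node neighborhoods (two common choices for neighborhood overlap). The graph below is a hypothetical example, not the graph from the slide.

```python
import math

# Hypothetical undirected graph as an adjacency dict (not the slide's graph).
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}

def jaccard(g, u, v):
    """Shared neighbors divided by all neighbors of either node."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0

def cosine(g, u, v):
    """Shared neighbors divided by the geometric mean of the two degrees."""
    denom = math.sqrt(len(g[u]) * len(g[v]))
    return len(g[u] & g[v]) / denom if denom else 0.0

# b and d have identical neighborhoods {a, c}; a and c share only {b, d}.
sim_bd = (jaccard(graph, "b", "d"), cosine(graph, "b", "d"))
sim_ac = (jaccard(graph, "a", "c"), cosine(graph, "a", "c"))
```

Both measures depend only on the neighbor sets, so two nodes can score high without being adjacent to each other.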
Member-Based Community Detection
2- Node degree: the most common subgraph searched for based on node degrees is a clique
We can cut the graph into complete subgraphs (cliques), but finding them is NP-hard. Options: use brute force, relax the problem to a polynomially solvable one, or use cliques as the core of a community
Brute-force clique identification method -> can find all maximal cliques in a graph
Clique percolation method -> CPM
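A minimal sketch of brute-force maximal clique identification, assuming the simplest possible strategy: test every subset of nodes. This is exponential and only meant for tiny graphs; it is not necessarily the procedure the slide had in mind. The graph is a hypothetical example.

```python
from itertools import combinations

# Hypothetical graph: a triangle 1-2-3 with a tail 2-4-5.
graph = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2},
    4: {2, 5},
    5: {4},
}

def is_clique(g, nodes):
    """True if every pair of the given nodes is connected."""
    return all(v in g[u] for u, v in combinations(nodes, 2))

def maximal_cliques(g):
    """Enumerate all cliques, then keep those not contained in a larger one."""
    cliques = [
        set(c)
        for r in range(1, len(g) + 1)
        for c in combinations(sorted(g), r)
        if is_clique(g, c)
    ]
    return [c for c in cliques if not any(c < other for other in cliques)]

result = maximal_cliques(graph)
```

The maximal cliques found this way can then serve as cores that a method such as clique percolation grows into communities.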
Though sharing no neighborhood overlap, the social circles of these players (coach, players, fans, etc.) might look quite similar due to their social status. In other words, nodes are regularly equivalent when they are connected to nodes that are themselves similar (a self-referential definition).
Member-Based Community Detection
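3- Node reachability: the two extremes of reachability
(1) There is a path between the two nodes (regardless of the distance)
BFS and DFS can find such groups (connected components) -> not useful in large networks, where most nodes fall into one large component
(2) The nodes are so close that they are immediate neighbors
In this case the community is a clique: the shortest path between any two of its nodes has length 1
But there are also predefined subgraphs between these two extremes, with roots in community analysis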
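The first extreme (any path, regardless of distance) amounts to finding connected components. A minimal BFS sketch on a hypothetical graph:

```python
from collections import deque

# Hypothetical graph with three components: a path, a pair, and an isolated node.
graph = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}, 6: set()}

def connected_components(g):
    """Group nodes so that any two nodes in a group are joined by some path."""
    seen, components = set(), []
    for start in g:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), set()
        while queue:
            u = queue.popleft()
            comp.add(u)
            for v in g[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(comp)
    return components

components = connected_components(graph)
```

On a large social network this typically returns one very large component plus many tiny ones, which is why this extreme says little about community structure.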
Group-Based Community Detection
In graph-based clustering, we cut the graph into several partitions
Cut size = the number of edges that are cut or, in a weighted graph, the sum of their weights
Minimum cut: not perfect, because it often finds partitions that contain a singleton node
[Figure: minimum cut, balanced cut, and more balanced cut]
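Group-Based Community Detection
1- Balanced partitioning model:
Partition graph G = (V, E) (vertices, edges) into k partitions, where partition i holds the vertex set $P_i$
$P = (P_1, P_2, P_3, \dots, P_k)$, $P_i \cap P_j = \emptyset$ for $i \neq j$, $\bigcup_{i=1}^{k} P_i = V$, $\bar{P}_i = V - P_i$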
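The balanced objective itself did not survive on this slide. A sketch under the assumption that it is the usual pair of ratio cut (cut size divided by partition size) and normalized cut (cut size divided by partition volume, the sum of degrees), each averaged over the k partitions. The function names and the graph are hypothetical.

```python
from fractions import Fraction

# Hypothetical graph: triangles {1,2,3} and {4,5,6} joined by the edge (3,4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]

def cut_size(edges, part):
    """Number of edges with exactly one endpoint in `part`."""
    return sum(1 for u, v in edges if (u in part) != (v in part))

def volume(edges, part):
    """Sum of the degrees of the nodes in `part`."""
    return sum((u in part) + (v in part) for u, v in edges)

def ratio_cut(edges, partitions):
    k = len(partitions)
    return sum(Fraction(cut_size(edges, p), len(p)) for p in partitions) / k

def normalized_cut(edges, partitions):
    k = len(partitions)
    return sum(Fraction(cut_size(edges, p), volume(edges, p)) for p in partitions) / k

balanced = [{1, 2, 3}, {4, 5, 6}]
lopsided = [{1}, {2, 3, 4, 5, 6}]
scores = {
    "balanced": (ratio_cut(edges, balanced), normalized_cut(edges, balanced)),
    "lopsided": (ratio_cut(edges, lopsided), normalized_cut(edges, lopsided)),
}
```

Both objectives penalize the singleton partition, which is the behavior a plain minimum cut lacks.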
Group-Based Community Detection
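1- Balanced partitioning model in matrix format:
Let X be the matrix with $X_{i,j} = 1$ if node i is in community j, and $X_{i,j} = 0$ otherwise
Let $D = \mathrm{diag}(d_1, d_2, \dots, d_n)$ be the degree matrix
The i-th diagonal entry of $X^T A X$ counts the edges inside community i (each undirected edge is counted twice, once from each endpoint)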
[Figure: graph G and its adjacency matrix A; graph G with 3 communities, its community matrix X, and its degree matrix D]
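The matrix quantities can be checked by hand on a small case. A sketch in plain Python lists (no external libraries), using a hypothetical 6-node graph with 2 communities rather than the graph in the figure:

```python
# Hypothetical graph: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n, k = 6, 2
membership = [0, 0, 0, 1, 1, 1]  # community index of each node

# Adjacency matrix A, community matrix X, degree matrix D.
A = [[0] * n for _ in range(n)]
for u, v in edges:
    A[u][v] = A[v][u] = 1
X = [[1 if membership[i] == j else 0 for j in range(k)] for i in range(n)]
D = [[sum(A[i]) if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(P, Q):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(P[i][t] * Q[t][j] for t in range(len(Q))) for j in range(len(Q[0]))]
        for i in range(len(P))
    ]

def transpose(M):
    return [list(row) for row in zip(*M)]

Xt = transpose(X)
XtAX = matmul(matmul(Xt, A), X)  # diagonal: 2 * edges inside each community
XtDX = matmul(matmul(Xt, D), X)  # diagonal: volume (degree sum) of each community
cut_sizes = [XtDX[i][i] - XtAX[i][i] for i in range(k)]
```

The difference of the two diagonals, which is the diagonal of $X^T (D - A) X$, gives the cut size of each community.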
Group-Based Community Detection
Robust Communities:
The goal is to find subgraphs robust enough that removing some edges or nodes does not disconnect the subgraph
k-vertex connected graph -> k is the minimum number of nodes that must be removed to disconnect the graph
In such a graph, the minimum degree of any node is not less than k
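A brute-force sketch of the k-vertex connectivity check, assuming the graph is small enough to try every set of removed nodes. For a complete graph on n nodes it returns n - 1 by the usual convention. The example graphs are hypothetical.

```python
from itertools import combinations

def is_connected(g, removed=frozenset()):
    """DFS over the nodes that remain after deleting `removed`."""
    nodes = [v for v in g if v not in removed]
    if not nodes:
        return True
    stack, seen = [nodes[0]], {nodes[0]}
    while stack:
        u = stack.pop()
        for v in g[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)

def vertex_connectivity(g):
    """Smallest number of nodes whose removal disconnects the graph."""
    n = len(g)
    for k in range(n - 1):
        for removed in combinations(g, k):
            if not is_connected(g, set(removed)):
                return k
    return n - 1  # complete graph: no removal set disconnects it

cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
bowtie = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {2, 4}, 4: {2, 3}}
k_cycle = vertex_connectivity(cycle)
k_bowtie = vertex_connectivity(bowtie)
```

In both examples the minimum node degree is at least the connectivity found, in line with the degree condition above.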
Group-Based Community Detection
Modular Communities:
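Modularity measures how likely it is that the community structure found was created at random (community structures must be far from random)
Consider G(V, E) with |E| = m, where the degrees are known but the edges are not
Consider nodes $v_i$, $v_j$ with degrees $d_i$, $d_j$: for a random edge leaving $v_i$, P(connect $v_i$ to $v_j$) = $\frac{d_j}{\sum_l d_l} = \frac{d_j}{2m}$
So the expected number of edges between $v_i$ and $v_j$ -> $d_i \cdot \frac{d_j}{2m} = \frac{d_i d_j}{2m}$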
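Group-Based Community Detection
Modular Communities:
Modularity maximization tries to maximize the distance between the actual and the expected number of edges
Consider graph G = (V, E) (vertices, edges) divided into k partitions with vertex sets $P_i$, $P = (P_1, P_2, P_3, \dots, P_k)$
For partition $P_x$ this distance can be defined as $\sum_{v_i, v_j \in P_x} \left( A_{ij} - \frac{d_i d_j}{2m} \right)$, where $A_{ij}$ is the adjacency matrix entry and $\frac{d_i d_j}{2m}$ is the expected number of edges between $v_i$ and $v_j$
This generalizes to a partitioning P with k partitions: $\sum_{x=1}^{k} \sum_{v_i, v_j \in P_x} \left( A_{ij} - \frac{d_i d_j}{2m} \right)$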
Group-Based Community Detection
Modular Communities:
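Over the whole graph this distance is normalized by 2m, which gives the modularity: $Q = \frac{1}{2m} \sum_{x=1}^{k} \sum_{v_i, v_j \in P_x} \left( A_{ij} - \frac{d_i d_j}{2m} \right)$
And in matrix form: $Q = \frac{1}{2m} \mathrm{Tr}(X^T B X)$, where $B = A - \frac{d d^T}{2m}$ is the modularity matrix and d is the vector of node degrees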
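A sketch of the modularity computation, assuming the standard definition: for every pair of nodes in the same partition, take the actual adjacency minus the expected value (degree product over 2m), sum, and divide by 2m. The function name and the graph are hypothetical.

```python
from fractions import Fraction

# Hypothetical graph: triangles {1,2,3} and {4,5,6} joined by the edge (3,4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]

def modularity(edges, partitions):
    """Actual minus expected edges within partitions, normalized by 2m."""
    m = len(edges)
    degree, adjacent = {}, set()
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        adjacent.add((u, v))
        adjacent.add((v, u))
    total = Fraction(0)
    for part in partitions:
        for i in part:
            for j in part:
                a_ij = 1 if (i, j) in adjacent else 0
                expected = Fraction(degree.get(i, 0) * degree.get(j, 0), 2 * m)
                total += a_ij - expected
    return total / (2 * m)

q_two_triangles = modularity(edges, [{1, 2, 3}, {4, 5, 6}])
q_single_group = modularity(edges, [{1, 2, 3, 4, 5, 6}])
```

Putting every node in one partition gives a modularity of zero, since the actual and expected edge totals cancel; the two-triangle split scores higher.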
Group-Based Community Detection
Dense Communities:
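Cliques, clubs, and clans are examples of connected dense subgraphs
Here we focus on dense subgraphs that may be disconnected
We can utilize the brute-force clique identification algorithm to find such subgraphs
Density: $\gamma = \frac{|E|}{\binom{|V|}{2}}$, the fraction of possible edges that are present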
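The density formula is not shown in the surviving text. A sketch assuming the standard definition, the number of edges present divided by the number of possible node pairs, applied to a subgraph induced by a set of nodes. The graph is a hypothetical example.

```python
from itertools import combinations

# Hypothetical graph: triangle 1-2-3 plus a pendant node 4 attached to 2.
graph = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2}}

def density(g, nodes):
    """Edges present among `nodes` divided by the number of possible pairs."""
    nodes = list(nodes)
    possible = len(nodes) * (len(nodes) - 1) // 2
    present = sum(1 for u, v in combinations(nodes, 2) if v in g[u])
    return present / possible if possible else 0.0

triangle_density = density(graph, {1, 2, 3})  # a clique has density 1
whole_density = density(graph, graph)         # 4 of 6 possible edges
```

A clique has density 1, so a density threshold below 1 relaxes the clique requirement.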
Group-Based Community Detection
Hierarchical Communities:
A community can have sub-communities and super-communities. The Girvan-Newman algorithm is designed for divisive hierarchical clustering
Girvan-Newman uses a measure called “edge betweenness” and removes the edges with the highest edge betweenness
For an edge e, edge betweenness is defined as the number of shortest paths between node pairs ($v_i$, $v_j$) such that the shortest path between $v_i$ and $v_j$ passes through e
Group-Based Community Detection
Hierarchical Communities (Girvan-Newman Algorithm):
1. Calculate edge betweenness for all edges in the graph.
2. Remove the edge with the highest betweenness.
3. Recalculate betweenness for all edges affected by the edge removal.
4. Repeat until all edges are removed.
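The four steps above can be sketched in plain Python. This is an illustrative version, not a tuned one: it recomputes betweenness for every edge after each removal instead of only the affected ones, and it uses a Brandes-style BFS accumulation as an assumed way to count shortest paths. The graph is a hypothetical example.

```python
from collections import deque

def edge_betweenness(g):
    """Shortest-path counts per undirected edge (Brandes-style accumulation)."""
    score = {frozenset((u, v)): 0.0 for u in g for v in g[u]}
    for s in g:
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in g[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for u in preds[w]:
                c = sigma[u] / sigma[w] * (1.0 + delta[w])
                score[frozenset((u, w))] += c
                delta[u] += c
    # Every node pair was visited from both endpoints, so halve the totals.
    return {e: b / 2.0 for e, b in score.items()}

def components(g):
    """Connected components via DFS."""
    seen, comps = set(), []
    for start in g:
        if start in seen:
            continue
        stack, comp = [start], {start}
        seen.add(start)
        while stack:
            u = stack.pop()
            for v in g[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps

def girvan_newman(g):
    """Return the successive partitions produced as edges are removed."""
    g = {u: set(vs) for u, vs in g.items()}  # work on a copy
    levels = [components(g)]
    while any(g.values()):                   # step 4: until no edges remain
        bet = edge_betweenness(g)            # steps 1 and 3
        u, v = max(bet, key=bet.get)         # step 2: highest betweenness
        g[u].discard(v)
        g[v].discard(u)
        comps = components(g)
        if len(comps) > len(levels[-1]):
            levels.append(comps)
    return levels

# Hypothetical graph: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
levels = girvan_newman(graph)
```

Each entry of `levels` is one level of the hierarchy, from the whole graph down to singleton nodes; the first split removes the bridging edge, which carries every shortest path between the two triangles.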
Group-Based Community Detection
Hierarchical Communities: