gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a...
![Page 1: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/1.jpg)
gSkeletonClu [1]: Revealing density-based clustering structure from the core-connected tree of a network
[1] Huang, J., Sun, H., Song, Q., Deng, H., & Han, J. (2013). Revealing density-based clustering structure from the core-connected tree of a network. IEEE Transactions on Knowledge and Data Engineering, 25(8), 1876–1889. http://doi.org/10.1109/TKDE.2012.100
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6200274&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F69%2F4358933%2F06200274.pdf%3Farnumber%3D6200274
![Page 2: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/2.jpg)
Abstract
Objective: identify communities and vertex roles (hubs and outliers) in a weighted network
![Page 3: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/3.jpg)
Overview
Given a weighted network…
1 - Calculate its CCMST (Core-Connected Maximal Spanning Tree) using the Core-Connectivity Similarity
2 - Find the Structure Core-Connected components
● Components that contain the cores
3 - Attach the vertices classified as borders
4 - Identify the hubs and outliers
![Page 4: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/4.jpg)
Def1) Neighborhood
Neighborhood of n1:
r(n1) = {n1, n2, n3, n8}
![Page 5: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/5.jpg)
Def2) Structural Similarity
num = 2*weight(n1, n8) = 20
denA = denB = 1
num += 10*10 → num = 120
denA += sqrt[(10*10) + (10*10)] = 14.14
denB += sqrt[(10*10) + (10*10) + (10*10) + (5*5)] = 18.03
σ(n1, n8) = num/(denA*denB)
*Note: the initial values of num and den are a mystery
![Page 6: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/6.jpg)
Def2) Structural Similarity
σ(n1, n8) = 0.47
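The calculation above can be sketched in Python. This is a minimal sketch, assuming the weighted cosine form of structural similarity in which every vertex belongs to its own neighborhood with a self-weight of 1 (under that assumption the two self terms add 2*weight(u, v) to the numerator, which may be where the initial value of num comes from). The `weights` dictionary, the function names and the toy network are hypothetical; the toy network is not the one pictured on the slide.

```python
import math

def neighborhood(weights, u):
    """r(u): u itself plus every vertex adjacent to u."""
    return {u} | set(weights[u])

def weight(weights, u, x, self_weight=1.0):
    """Edge weight between u and x; the self-weight of 1 is an assumption."""
    return self_weight if u == x else weights[u][x]

def structural_similarity(weights, u, v):
    """Weighted cosine similarity between the neighborhoods of u and v."""
    r_u, r_v = neighborhood(weights, u), neighborhood(weights, v)
    num = sum(weight(weights, u, x) * weight(weights, v, x) for x in r_u & r_v)
    den_u = math.sqrt(sum(weight(weights, u, x) ** 2 for x in r_u))
    den_v = math.sqrt(sum(weight(weights, v, x) ** 2 for x in r_v))
    return num / (den_u * den_v)

# Hypothetical toy network: a triangle a-b-c plus a pendant vertex d on c.
weights = {
    "a": {"b": 2.0, "c": 2.0},
    "b": {"a": 2.0, "c": 2.0},
    "c": {"a": 2.0, "b": 2.0, "d": 1.0},
    "d": {"c": 1.0},
}
sim_ab = structural_similarity(weights, "a", "b")  # inside the triangle
sim_cd = structural_similarity(weights, "c", "d")  # pendant edge
```

Vertices inside the triangle come out more similar to each other than the pendant vertex is to its only neighbor, which is the behavior the definition is meant to capture.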
![Page 7: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/7.jpg)
Def3) Ɛ-Neighborhood
Ɛ-Neighborhood for n1:
● Ɛ = 0.47
● rƐ(n1) = {n1, n2, n8}
![Page 8: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/8.jpg)
Def4) Core
If |rƐ(u)| ≥ μ, then u is a core. This is denoted by Kε,μ(u).
Considering:
● Ɛ = 0.47
● μ = 3
so...
● Kε,μ(n1) holds: n1 is a core, because |rƐ(n1)| = 3 ≥ μ
![Page 9: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/9.jpg)
Def5) Directly Structure-Reachable
v is directly structure-reachable from u if u is a core AND v belongs to rƐ(u).
This is denoted by:
● u ⟼ε,μ v
○ Example: n1 ⟼ε,μ n8
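Def3 to Def5 translate almost directly into code. The sketch below assumes the similarities between n1 and its neighbors are already known; the values for n2 and n3 are taken from the Def7 slide later in the deck, and the function and variable names are hypothetical.

```python
def eps_neighborhood(u, sim_to, eps):
    """Def3: u plus every neighbor whose similarity to u is at least eps."""
    return {u} | {x for x, s in sim_to.items() if s >= eps}

def is_core(u, sim_to, eps, mu):
    """Def4: u is a core when its eps-neighborhood has at least mu vertices."""
    return len(eps_neighborhood(u, sim_to, eps)) >= mu

def directly_reachable(u, v, sim_to, eps, mu):
    """Def5: v is directly structure-reachable from u."""
    return is_core(u, sim_to, eps, mu) and v in eps_neighborhood(u, sim_to, eps)

# Similarities between n1 and its neighbors, as listed on the slides.
sim_n1 = {"n2": 0.68, "n3": 0.43, "n8": 0.47}
eps, mu = 0.47, 3

nbrs = eps_neighborhood("n1", sim_n1, eps)
core = is_core("n1", sim_n1, eps, mu)
reach_n8 = directly_reachable("n1", "n8", sim_n1, eps, mu)
reach_n3 = directly_reachable("n1", "n3", sim_n1, eps, mu)
```

With Ɛ = 0.47 and μ = 3 this gives the same answers as the slides: the Ɛ-neighborhood is {n1, n2, n8}, n1 is a core, and n8 is directly structure-reachable from n1 while n3 is not.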
![Page 10: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/10.jpg)
Def6) Hubs and Outliers
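A vertex h is a hub if:
● h does not belong to any cluster
AND
● h bridges multiple clusters, i.e. there are vertices u and v in different clusters such that h ∈ r(u) ∧ h ∈ r(v)
A vertex that does not belong to any cluster and is not a hub is an outlier.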
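As a rough illustration of Def6, the sketch below labels every vertex that has no cluster as a hub or an outlier. The adjacency lists and cluster labels are hypothetical (they are not read from the slide's figure), and "bridges multiple clusters" is taken to mean "has neighbors in at least two different clusters".

```python
def classify_unclustered(adjacency, cluster_of):
    """Label each vertex without a cluster as 'hub' or 'outlier'."""
    roles = {}
    for v, neighbors in adjacency.items():
        if v in cluster_of:
            continue  # clustered vertices are neither hubs nor outliers
        touched = {cluster_of[x] for x in neighbors if x in cluster_of}
        roles[v] = "hub" if len(touched) >= 2 else "outlier"
    return roles

# Hypothetical network: n0 touches clusters A and B, n7 touches only B.
adjacency = {
    "n0": ["n1", "n4"],
    "n1": ["n0", "n2"],
    "n2": ["n1"],
    "n4": ["n0", "n6"],
    "n6": ["n4", "n7"],
    "n7": ["n6"],
}
cluster_of = {"n1": "A", "n2": "A", "n4": "B", "n6": "B"}
roles = classify_unclustered(adjacency, cluster_of)
```

Here n0 comes out as a hub and n7 as an outlier.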
![Page 11: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/11.jpg)
![Page 12: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/12.jpg)
Def7) Structure Core-Similarity
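CS(n1) candidates (similarity between n1 and each neighbor):
1. (n1, n0) - 0.08
2. (n1, n2) - 0.68
3. (n1, n3) - 0.43
4. (n1, n8) - 0.47
With μ = 3 (counting n1 itself), the largest Ɛ that still keeps n1 a core is 0.47, so CS(n1) = 0.47.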
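One way to compute the core-similarity is to sort the candidates and take the μ-th largest. The sketch below assumes the vertex counts toward its own neighborhood with similarity 1, and that a vertex with fewer than μ candidates gets a core-similarity of 0; both are assumptions, and the function name is hypothetical.

```python
def core_similarity(sim_to, mu):
    """Largest eps at which the vertex is still a core (0.0 if it never is)."""
    # The vertex itself is counted with similarity 1.0 (assumption).
    ranked = sorted([1.0, *sim_to.values()], reverse=True)
    return ranked[mu - 1] if len(ranked) >= mu else 0.0

# Candidates for n1 as listed on the slide.
sim_n1 = {"n0": 0.08, "n2": 0.68, "n3": 0.43, "n8": 0.47}
cs_n1 = core_similarity(sim_n1, mu=3)
```

For the slide's candidates and μ = 3 the ranked list is 1.0, 0.68, 0.47, 0.43, 0.08, so the third entry, 0.47, is the result.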
![Page 13: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/13.jpg)
![Page 14: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/14.jpg)
Def8) Reachability-Similarity
RS(n6, n7) = min {0.51, 0.1} = 0.1
RS(n6, n4) = 0.51
RS(n6, n5) = 0.55
---
RS(n7, n6) = min {0, 0.1} = 0
RS is asymmetric: RS(n6, n7) ≠ RS(n7, n6)
![Page 15: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/15.jpg)
Def9) Core Connectivity Similarity
CCS(n6, n4) = 0.51
CCS(n6, n5) = 0.51
CCS(n6, n7) = 0
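The numbers on the Def8 and Def9 slides combine as minimums. The sketch below assumes RS(u, v) = min{CS(u), σ(u, v)} and CCS(u, v) = min{RS(u, v), RS(v, u)}, which is consistent with the n6/n7 values shown; the function names are hypothetical.

```python
def reachability_similarity(cs_u, sigma_uv):
    """RS(u, v): limited by the core-similarity of the source vertex u."""
    return min(cs_u, sigma_uv)

def core_connectivity_similarity(cs_u, cs_v, sigma_uv):
    """CCS(u, v): the smaller of the two directed reachability values."""
    return min(reachability_similarity(cs_u, sigma_uv),
               reachability_similarity(cs_v, sigma_uv))

# Values read from the slides: CS(n6) = 0.51, CS(n7) = 0, sigma(n6, n7) = 0.1.
rs_67 = reachability_similarity(0.51, 0.1)
rs_76 = reachability_similarity(0.0, 0.1)
ccs_67 = core_connectivity_similarity(0.51, 0.0, 0.1)
```

Taking the minimum of both directions is what makes CCS symmetric even though RS is not.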
![Page 16: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/16.jpg)
![Page 17: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/17.jpg)
Def10) Structure Core-Connected
Given Ɛ ∈ ℝ, μ ∈ ℕ and u, v ∈ V, the vertices u and v are directly core-connected with each other if and only if:
● Kε,μ(u) ∧ Kε,μ(v) ∧ u ⟼ε,μ v
This is denoted by: u ⟷ε,μ v
gSkeletonClu first finds the structures that satisfy this definition, then attaches the "borders" (vertices that are directly structure-reachable from a core but do not satisfy the definition above). At the end, gSkeletonClu separates the clusters, hubs and outliers.
![Page 18: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/18.jpg)
CCMST - Core-Connected Maximal Spanning Tree
Instead of using the complete network, the authors proved that the Structure Core-Connected components can be identified from the CCMST alone, taking CCS(u, v) as the edge weight.
Ɛ-Candidates:
● 0.51
● 0.47
● 0.43
● 0.08
● 0
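A maximal spanning tree can be built with any standard spanning-tree algorithm run on descending weights. The sketch below uses Kruskal's algorithm with a union-find structure; this is an illustrative choice, not necessarily the construction used in the paper. The Ɛ-candidates are then the distinct edge weights of the tree. The toy edge list and all names are hypothetical.

```python
def maximal_spanning_tree(vertices, edges):
    """Kruskal's algorithm over (u, v, ccs) edges, heaviest first."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for u, v, ccs in sorted(edges, key=lambda e: e[2], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((u, v, ccs))
    return tree

def eps_candidates(tree):
    """Distinct CCS weights on the tree, largest first."""
    return sorted({ccs for _, _, ccs in tree}, reverse=True)

# Hypothetical CCS-weighted network.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b", 0.5), ("b", "c", 0.4), ("a", "c", 0.3), ("c", "d", 0.0)]
tree = maximal_spanning_tree(vertices, edges)
candidates = eps_candidates(tree)
```

The edge (a, c) is skipped because a and c are already joined through b by heavier edges, so its weight never becomes an Ɛ-candidate.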
![Page 19: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/19.jpg)
Core-Connected Components from CCMST
[Figure panels: core-connected components for Ɛ = 0.51, Ɛ = 0.47, Ɛ = 0.43 and Ɛ = 0.08]
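Reading the components off the tree for a given Ɛ amounts to keeping only the tree edges whose CCS is at least Ɛ and collecting what stays connected. The sketch below assumes the tree is a list of (u, v, ccs) triples and that the core-similarity of every vertex is known; groups that contain no core at that Ɛ are discarded. The toy data and all names are hypothetical.

```python
def core_connected_components(tree, core_similarity, eps):
    """Groups of vertices joined by tree edges with CCS >= eps."""
    parent = {v: v for v in core_similarity}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, ccs in tree:
        if ccs >= eps:  # the edge survives the cut at eps
            parent[find(u)] = find(v)

    groups = {}
    for v in core_similarity:
        groups.setdefault(find(v), set()).add(v)
    # Keep only the groups that contain at least one core at this eps.
    return [g for g in groups.values()
            if any(core_similarity[x] >= eps for x in g)]

# Hypothetical tree and core-similarities.
tree = [("a", "b", 0.5), ("b", "c", 0.4), ("c", "d", 0.0)]
core_sim = {"a": 0.5, "b": 0.5, "c": 0.4, "d": 0.0}
at_045 = core_connected_components(tree, core_sim, 0.45)
at_040 = core_connected_components(tree, core_sim, 0.40)
```

Lowering Ɛ from 0.45 to 0.40 lets the edge (b, c) survive, so c joins the component of a and b.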
![Page 20: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/20.jpg)
Attracting Indices for Attaching Borders
RS(2,3) = 0.55
RS(1,3) = 0.43
--
AS(3) = max{0.55, 0.43} = 0.55
![Page 21: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/21.jpg)
Attracting Indices for Attaching Borders
AS(3) = 0.55
Ɛ = 0.47
if AS(3) > Ɛ:
n3 is attached to the cluster that contains n2.
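The attachment rule can be sketched as follows, assuming the attracting index of a border vertex is the largest RS value pointing at it and that the vertex joins the cluster of the core providing that value. The strict comparison follows the slide. The cluster label "A" and the function names are hypothetical.

```python
def attracting_index(rs_to_v):
    """Return (attractor, AS): the core with the largest RS toward v."""
    attractor = max(rs_to_v, key=rs_to_v.get)
    return attractor, rs_to_v[attractor]

def attach_border(rs_to_v, cluster_of, eps):
    """Cluster that the border vertex joins, or None if it stays unattached."""
    if not rs_to_v:
        return None
    attractor, strength = attracting_index(rs_to_v)
    # Strict comparison, as written on the slide.
    return cluster_of[attractor] if strength > eps else None

# RS values toward n3 from the slide: RS(2,3) = 0.55 and RS(1,3) = 0.43.
rs_to_n3 = {"n2": 0.55, "n1": 0.43}
cluster_of = {"n1": "A", "n2": "A"}  # hypothetical label
attractor, as_n3 = attracting_index(rs_to_n3)
joined = attach_border(rs_to_n3, cluster_of, eps=0.47)
```

With Ɛ = 0.47, n2 is the attractor and n3 joins its cluster; with a larger Ɛ such as 0.6 the same call returns None and n3 stays unattached.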
![Page 22: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/22.jpg)
So what…? Let's execute from scratch!
![Page 23: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/23.jpg)
Step 1 - Prepare your weapons!!
Calculate the Weighted Core-Similarity Network
Ɛ = 0.47
μ = 3
[Figure panels: Weighted Network; Weighted Core-Similarity Network]
![Page 24: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/24.jpg)
Step 2 - Point your weapons...
Calculate the CCMST
![Page 25: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/25.jpg)
Step 3A - Fire!
Detect Core-Connected Components...
Ɛ = 0.47
μ = 3
![Page 26: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/26.jpg)
Step 3B - Fire again !
Attach the borders!
Ɛ= 0.47
![Page 27: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/27.jpg)
Step 3C - Kill it, before it kills you!
Detect clusters, hubs and outliers
n0 is a hub because:
● n0 does not belong to any cluster
● n0 bridges the clusters A and B.
n7 is an outlier because:
● it does not belong to any cluster and it is not a hub =(
![Page 28: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/28.jpg)
Results - Guard the guns… You are the winner! (or just a survivor...)
![Page 29: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/29.jpg)
Clustering with Automatically Selected Ɛ
If you have the Ɛ candidates extracted from the CCMST…
AND...
If you adopt a way to measure which Ɛ is the best...
Then, you can automatically select the Ɛ parameter.
One possible choice is to use the modularity Q as a quality measure of the network clustering. For a reasonable clustering Q falls in [0,1] (it can be negative when the partition is worse than random). The closer the value is to 1, the better the clustering result.
In a nutshell… You should run gSkeletonClu for all Ɛ candidates and, based on a quality index, choose the best partition!!!
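The selection loop can be sketched as below. It assumes the standard weighted Newman modularity and treats every unclustered vertex (hub or outlier) as its own singleton group, which is one possible convention and not necessarily the paper's. `cluster_at` stands in for a full gSkeletonClu run at a given Ɛ; it, the toy network and the two precomputed partitions are hypothetical.

```python
from collections import defaultdict

def modularity(edges, cluster_of):
    """Weighted Newman modularity of a clustering over (u, v, w) edges."""
    total = sum(w for _, _, w in edges)
    inside = defaultdict(float)   # weight of edges inside each cluster
    degree = defaultdict(float)   # summed weighted degree of each cluster
    for u, v, w in edges:
        cu = cluster_of.get(u, ("unassigned", u))
        cv = cluster_of.get(v, ("unassigned", v))
        degree[cu] += w
        degree[cv] += w
        if cu == cv:
            inside[cu] += w
    return sum(inside[c] / total - (degree[c] / (2 * total)) ** 2
               for c in degree)

def select_epsilon(candidates, cluster_at, edges):
    """Return (eps, Q) for the candidate with the highest modularity."""
    scored = ((eps, modularity(edges, cluster_at(eps))) for eps in candidates)
    return max(scored, key=lambda pair: pair[1])

# Hypothetical network: two unit-weight triangles joined by one edge.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
         ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
         ("c", "d", 1.0)]
# Hypothetical clustering results for two eps candidates.
partitions = {
    0.5: {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1},
    0.1: {v: 0 for v in "abcdef"},
}
best_eps, best_q = select_epsilon([0.5, 0.1], partitions.get, edges)
```

In this toy case the candidate that splits the two triangles wins, since putting everything in one cluster gives Q = 0.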
![Page 30: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/30.jpg)
Did you like it?
There is more!
From the CCMST it is possible to extract the clustering hierarchy… (next opportunity)
Limitation
● gSkeletonClu can only be applied to networks!
● In the authors' gSkeletonClu paper, the tests show that it is slower than SCAN…
● It may not work on BIG networks (more than 1 million vertices).
○ The SCAN++ authors (Shiokawa, 2015) [1][2] ran tests on BIG networks and could not run gSkeletonClu on them…
Have fun!
[1] http://www.vldb.org/pvldb/vol8/p1178-shiokawa.pdf
[2] http://pt.slideshare.net/LazyShion/scan-efficient-algorithm-for-finding-clusters-hubs-and-outliers-on-largescale-graphs-vldb-2015
![Page 31: gSkeletonClu - Revealing density-based clustering structure from the core-connected tree of a network](https://reader031.fdocuments.us/reader031/viewer/2022021419/58830c871a28ab31068b484d/html5/thumbnails/31.jpg)
Presentation created by: Danilo Amaral de Oliveira
Thank you!