http://linc.ucy.ac.cy Andreas Papadopoulos - andpapad@cs.ucy.ac.cy [DEXA 2015]
Clustering Attributed Multi-graphs with Information Ranking
26th International Conference on Database and Expert Systems Applications
Sep. 1-4, 2015 Valencia, Spain
Andreas Papadopoulos, Dimitrios Rafailidis, George Pallis, Marios D. Dikaiakos
The Real World: Information Networks
[Figure: an information network whose vertices are connected by two edge types, Friendship and Coauthor]
Challenges
• Identify the importance of each edge-type/attribute property
  • For instance, when clustering a bibliography network:
    • Attribute ‘area of interest’ is important
    • Attributes ‘name’ and ‘gender’ may introduce noise and reduce the clustering accuracy
• Combine the attribute and structural vertex properties
  • Edges and attributes are of different types
Related Work
Model-Based
• BAGC [SIGMOD ’12, TKDD ’14]
• CESNA [ICDM ’13]
Distance-Based
• SACluster [VLDB ’09, TKDD ’11]
• PICS [SDM ’12]
• HASCOP [WI ’13]
Limitations
• Limited attention to the different importance of attributes/edge-types
  • Weights are mainly updated at each iteration
• Ignore the existence of multiple edge-types
  • Increases computational cost and complexity
• Spectral clustering is not used for clustering attributed graphs
  • Used to identify dense clusters in attribute subspaces
Proposed Approach: CAMIR
• Clustering Attributed Multi-graphs with Information Ranking: CAMIR
1. Rank edge-type and attribute properties
2. Construct a unified similarity matrix
3. Adopt a spectral clustering technique to generate the final clusters
Presentation Outline
• Motivation
• Problem Definition
• Related Work
• Background
• Proposed Approach: CAMIR
• Evaluation
• Summary
Background: Graph Partitioning
• An edge represents the similarity of the two connected vertices
• Find the minimum cut of a graph
  • Minimizes inter-cluster similarities
  • Identifies an optimal partitioning of the graph
• Identifying a minimum cut is computationally difficult
  • Efficient approximations using linear algebra
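To make the cut objective concrete, here is a minimal sketch that computes the weight of a cut, i.e. the total similarity on edges crossing between two vertex groups, and finds the best balanced split by brute force. The toy matrix `W` and the helper name `cut_weight` are assumptions for illustration and do not come from the slides.

```python
from itertools import combinations

def cut_weight(W, part_a):
    """Sum of similarities on edges that cross from part_a to the other vertices."""
    part_a = set(part_a)
    part_b = set(range(len(W))) - part_a
    return sum(W[i][j] for i in part_a for j in part_b)

# Hypothetical symmetric similarity matrix: two tight pairs {0, 1} and {2, 3}
# joined by a single weak edge between vertices 0 and 2.
W = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.8, 0.0],
]

# Try every split into two groups of two vertices and keep the lightest cut.
best = min(
    (frozenset(a) for a in combinations(range(4), 2)),
    key=lambda a: cut_weight(W, a),
)
print(sorted(best), cut_weight(W, best))
```

Enumerating every split is only feasible on tiny graphs, which is why the slide points to linear-algebra approximations instead.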
Background: Spectral Clustering
• Based on the graph Laplacian, or Laplacian matrix
• Given a similarity matrix $W$ with diagonal degree matrix $D$, the normalized symmetric Laplacian $L$ is defined as $L = I - D^{-1/2} W D^{-1/2}$
• The eigenvectors corresponding to the top k eigenvalues are the projection of the graph into $\mathbb{R}^{|V| \times k}$
  • The data are then easily separable into clusters, e.g. using k-means
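As a minimal sketch, the normalized symmetric Laplacian can be computed in its standard form, L = I - D^{-1/2} W D^{-1/2}, where W is the similarity matrix and D is the diagonal matrix of its row sums. The function name, the symbols W and D, and the toy graph below are assumptions for illustration.

```python
import math

def normalized_laplacian(W):
    """Return L = I - D^{-1/2} W D^{-1/2} for a symmetric similarity matrix W."""
    n = len(W)
    degree = [sum(row) for row in W]
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            identity = 1.0 if i == j else 0.0
            if degree[i] > 0 and degree[j] > 0:
                L[i][j] = identity - W[i][j] / math.sqrt(degree[i] * degree[j])
            else:
                # Isolated vertex: D^{-1/2} is taken as 0, leaving the identity entry.
                L[i][j] = identity
    return L

# Hypothetical unweighted graph with edges 0-1, 0-2, 0-3, 0-4 and 1-2,
# so the vertex degrees are 4, 2, 2, 1 and 1.
W = [
    [0, 1, 1, 1, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
L = normalized_laplacian(W)
for row in L:
    print([round(x, 3) for x in row])
```

For an unweighted graph each off-diagonal entry is -1/sqrt(d_i * d_j), so an edge between vertices of degree 4 and 2 gives about -0.354 and one between degrees 4 and 1 gives -0.5.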
Background: Spectral Clustering
[Figure: example graph with 10 vertices, shown with its adjacency matrix and its normalized Laplacian matrix]
Top 3 eigenvectors
| Vertex | U1 | U2 | U3 |
|---|---|---|---|
| 1 | -0.659 | -0.705 | 0.263 |
| 2 | -0.620 | 0.747 | 0.241 |
| 3 | -0.595 | -0.486 | -0.640 |
| 4 | -0.668 | 0.711 | -0.221 |
| 5 | -0.723 | 0.395 | 0.566 |
| 6 | -0.669 | 0.414 | -0.617 |
| 7 | -0.332 | -0.486 | -0.808 |
| 8 | -0.668 | 0.711 | -0.221 |
| 9 | -0.379 | -0.491 | 0.784 |
| 10 | -0.659 | -0.705 | 0.263 |
How do we define the similarity matrix
for an attributed multi-graph?
Background: Similarity Matrices
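[Figure: each vertex property yields its own N × N similarity matrix with entries in [0,1]: an attribute with values such as IR, DM and AI, an attribute with numeric values compared through a Gaussian kernel, and the edges]
• In total: #Edge types + #Attributes symmetric, non-negative similarity matrices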
How do we efficiently combine the similarity matrices?
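Before the matrices can be combined, each one has to be built. The sketch below shows one plausible construction: a Gaussian kernel for a numeric attribute and exact matching for a label attribute. The kernel width `sigma`, the exact-match rule, the function names and the sample values are all assumptions; the slides only name the Gaussian kernel and the [0,1] value range.

```python
import math

def gaussian_similarity(values, sigma):
    """Similarity exp(-(x_i - x_j)^2 / (2 * sigma^2)) for every pair of vertices."""
    n = len(values)
    return [
        [math.exp(-((values[i] - values[j]) ** 2) / (2.0 * sigma ** 2)) for j in range(n)]
        for i in range(n)
    ]

def match_similarity(labels):
    """Similarity 1.0 when two vertices share a label and 0.0 otherwise (assumed rule)."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

# Hypothetical attribute values for five vertices.
numeric_attribute = [5, 1, 7, 12, 19]
area_of_interest = ["IR", "DM", "DM", "AI", "AI"]

S_numeric = gaussian_similarity(numeric_attribute, sigma=5.0)
S_area = match_similarity(area_of_interest)
print([round(x, 3) for x in S_numeric[0]])
print(S_area[1])
```

Both constructions yield symmetric, non-negative matrices with entries in [0,1], matching the properties listed on the slide.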
CAMIR Overview
1. Rank vertex properties and calculate their weights accordingly
  • By considering the agreement among vertex properties
2. Compute a unified similarity matrix
  • By combining all vertex properties based on their ranking
3. Generate the final clusters
  • By adopting a spectral clustering approach
Presentation Outline
• Motivation
• Problem Definition
• Related Work
• Background
• Proposed Approach: CAMIR
  1. Information Ranking
  2. Unified Similarity Matrix
  3. Generate the final clusters
• Evaluation
• Summary
Information Ranking
Rank attribute and edge-type properties: iteratively select the most informative property from the set of unranked properties.
• Most informative property [NIPS ’11]:
  • Has the highest ‘agreement’ with other properties
  • Two properties ‘agree’ when they assign vertices the same cluster labels when used individually
Information Ranking
• From the set of properties, the most informative property p is identified as in [NIPS ’11]
• The highest rank (equal to the number of properties) is assigned to the most informative property
  • i.e. the property that best separates the vertices
• The lowest rank (1.0) is assigned to the property that is selected last
  • i.e. the property that does not ‘agree’ with the rest of the properties
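The iterative selection can be sketched as follows. This is only an illustration under stated assumptions: each property is represented by the cluster labels it yields when used on its own, and ‘agreement’ is measured with the Rand index over vertex pairs, which is a stand-in and not necessarily the measure of [NIPS ’11]. The function names and the sample labelings are hypothetical.

```python
from itertools import combinations

def pair_agreement(labels_a, labels_b):
    """Rand index: share of vertex pairs that both labelings treat the same way."""
    pairs = list(combinations(range(len(labels_a)), 2))
    same = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return same / len(pairs)

def rank_properties(labelings):
    """Map each property to a rank; the first property selected gets the highest rank."""
    def total_agreement(p):
        return sum(pair_agreement(labelings[p], labelings[q]) for q in labelings if q != p)

    unranked = set(labelings)
    ranks = {}
    next_rank = float(len(labelings))
    while unranked:
        # Select the unranked property that agrees most with the other properties.
        best = max(sorted(unranked), key=total_agreement)
        ranks[best] = next_rank
        unranked.remove(best)
        next_rank -= 1.0
    return ranks

# Hypothetical per-property cluster labels for six vertices.
labelings = {
    "coauthor": [0, 0, 0, 1, 1, 1],
    "area_of_interest": [0, 0, 0, 1, 1, 1],
    "gender": [0, 1, 0, 1, 0, 1],
}
ranks = rank_properties(labelings)
print(ranks)
```

In this toy input the ‘gender’ labeling disagrees with the other two, so it is selected last and receives the lowest rank, 1.0.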
Unified Similarity Matrix
• Combines the multiple edge-type and attribute properties with respect to the identified ranking
• Defined as the weighted sum of the individual similarity matrices
• Weights are defined by normalizing the rankings
• Contains all the similarity information about the network under study
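A minimal sketch of the weighted sum is shown below. It assumes that "normalizing the rankings" means dividing each rank by the sum of all ranks so that the weights add up to 1; the slides do not spell out the normalization, and the function name and sample matrices are hypothetical.

```python
def unified_similarity(matrices, ranks):
    """Weighted sum of per-property similarity matrices, with weights rank / sum(ranks)."""
    total = sum(ranks.values())
    weights = {p: r / total for p, r in ranks.items()}  # assumed normalization
    n = len(next(iter(matrices.values())))
    S = [[0.0] * n for _ in range(n)]
    for p, M in matrices.items():
        for i in range(n):
            for j in range(n):
                S[i][j] += weights[p] * M[i][j]
    return S, weights

# Hypothetical similarity matrices for three vertices and two properties.
matrices = {
    "coauthor": [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    "gender": [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]],
}
ranks = {"coauthor": 2.0, "gender": 1.0}

S, weights = unified_similarity(matrices, ranks)
print(weights)
print([[round(x, 3) for x in row] for row in S])
```

Because the weights sum to 1 and every input matrix has entries in [0,1], the unified matrix stays in [0,1] as well.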
Generating the Final Clusters
• Calculate the normalized Laplacian of the Unified Similarity Matrix
• Perform eigendecomposition
• Apply k-means to the eigenspace of the top k eigenvectors
• Generate the final clusters
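The steps above can be sketched with NumPy as follows. Several choices here are assumptions rather than details from the slides: the Laplacian is taken as I - D^{-1/2} S D^{-1/2}, so the relevant eigenvectors are those of its k smallest eigenvalues (equivalently the top k of D^{-1/2} S D^{-1/2}); the embedding rows are scaled to unit length; and k-means is a small Lloyd loop with farthest-point initialization. The function names and the toy matrix are hypothetical.

```python
import numpy as np

def spectral_embedding(S, k):
    """Rows of the k leading eigenvectors of the normalized Laplacian, unit-normalized."""
    d = S.sum(axis=1)
    inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(np.where(d > 0, d, 1.0)), 0.0)
    L = np.eye(len(S)) - inv_sqrt[:, None] * S * inv_sqrt[None, :]
    _, eigenvectors = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    U = eigenvectors[:, :k]
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    return U / np.where(norms > 0, norms, 1.0)

def kmeans(X, k, iterations=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(dist))])
    centers = np.array(centers)
    for _ in range(iterations):
        sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq_dist, axis=1)
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Hypothetical unified similarity matrix: two groups of three vertices,
# similarity 0.9 inside a group, 0.05 across groups, 1.0 on the diagonal.
S = np.full((6, 6), 0.05)
S[:3, :3] = 0.9
S[3:, 3:] = 0.9
np.fill_diagonal(S, 1.0)

labels = kmeans(spectral_embedding(S, k=2), k=2)
print(labels)
```

On this toy input the embedding collapses each group onto a single point, so k-means separates the two groups.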
CAMIR Clustering Process Diagram
[Diagram: three-stage pipeline]
Step 1. Identify importance of vertex properties
  • Properties ranking: iteratively select the most informative property
Step 2. Efficiently combine vertex properties
  • Unified Similarity Matrix: normalize rankings and compute the Unified Similarity Matrix
Step 3. Cluster the attributed multi-graph
  • Generate the final clusters: apply spectral clustering to obtain Cluster 1, Cluster 2, …, Cluster k
Evaluation - Datasets
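• Real-World Datasets
  • DBLP: Bibliography Networks
  • GoogleSP23: Google Software Packages
• Synthetic Datasets

| Dataset | DBLP-1K | DBLP-10K | GoogleSP-23 | Synthetic Datasets (varying nodes) | Synthetic Datasets (varying attributes) |
|---|---|---|---|---|---|
| Nodes | 1 000 | 10 000 | 1 297 | {100, 500, 1 000, 5 000, 10 000} | 1 000 |
| Edges | 17 128 | 65 734 | 268 956 | {1 000 – 1 230 000} | ~ 40 000 |
| Attributes | 2 | 2 | 5 | 4 | {2, 4, 8, 16, 32} |
| Edge Types | 1 | 1 | 2 | 1 | 1 |
| Total Vertex Properties | 3 | 3 | 7 | 5 | {3, 5, 9, 17, 33} |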
Evaluation Measures
• Entropy
  • Low entropy corresponds to high attribute homogeneity
• Normalized Mutual Information (NMI)
  • High NMI corresponds to high similarity between the resulting clustering and the ground truth
  • An NMI value of 1 indicates a perfect match
• Runtime
  • Measured on a quad-core i7 at 2.8 GHz with 8 GB RAM
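The two quality measures can be sketched as below. The exact definitions used in the evaluation are not given on the slides, so this assumes a size-weighted average of per-cluster attribute entropy (base-2 logarithm) and an NMI normalized by the geometric mean of the two label entropies; function names and sample data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def cluster_attribute_entropy(clusters, attribute):
    """Size-weighted average entropy of one attribute inside each cluster."""
    n = len(clusters)
    total = 0.0
    for c in set(clusters):
        members = [attribute[i] for i in range(n) if clusters[i] == c]
        total += len(members) / n * entropy(members)
    return total

def nmi(a, b):
    """Mutual information divided by sqrt(H(a) * H(b)) (assumed normalization)."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    mutual = sum(
        (c / n) * math.log2(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in Counter(zip(a, b)).items()
    )
    denom = math.sqrt(entropy(a) * entropy(b))
    # Degenerate case (a labeling with a single cluster): return 0.0 by convention here.
    return mutual / denom if denom > 0 else 0.0

# Hypothetical clustering, ground truth and attribute values for six vertices.
clusters = [0, 0, 0, 1, 1, 1]
ground_truth = [1, 1, 1, 0, 0, 0]
area_of_interest = ["AI", "AI", "AI", "DM", "DM", "IR"]

print(nmi(clusters, ground_truth))
print(round(cluster_attribute_entropy(clusters, area_of_interest), 3))
```

The clustering matches the ground truth up to a relabeling, so its NMI is 1, while the mixed second cluster keeps the attribute entropy above zero.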
State-of-the-Art Competitors
• SACluster [VLDB 2009]
  • Similarity is defined as the Random Walk distance in the augmented graph
• BAGC [SIGMOD 2012]
  • Uses Bayesian inference to update the parameters of the cluster distributions
• PICS [SDM 2012]
  • Compresses adjacency and attribute matrices
• HASCOP [WI 2013]
  • Heuristic distance-based
  • Applies to attributed multi-graphs
Evaluation - Synthetic Datasets
• CAMIR entropy is always less than 0.5
  • High attribute homogeneity
• CAMIR NMI is at least 0.8 in all experiments
  • High quality results
• Similar behavior as we increase the number of attributes
Evaluation - Synthetic Datasets
• CAMIR is the 2nd fastest algorithm
  • Less than 10 secs for up to 5000 vertices
• CAMIR on average outperforms almost all its competitors
Evaluation - Real-world Datasets
[Figure panels: DBLP-1K, DBLP-10K]
• CAMIR achieves the lowest entropy among its competitors
  • Efficiently ranks and combines vertex properties
  • Identifies clusters of arbitrary shapes and sizes (spectral clustering)
Evaluation - Real-world Datasets
[Figure panels: GoogleSP-23, GoogleSP-23]
• CAMIR achieves low entropy
• CAMIR achieves high NMI
  • Identifies a high percentage of software packages
Evaluation - Runtime and Entropy

| Algorithm | DBLP-1K Runtime (secs) | DBLP-1K Entropy | DBLP-10K Runtime (secs) | DBLP-10K Entropy | GoogleSP23 Runtime (secs) | GoogleSP23 Entropy |
|---|---|---|---|---|---|---|
| CAMIR | 1.20 | 0.299 | 520.48 | 0.255 | 5.98 | 0.387 |
| BAGC | 0.15 | 1.448 | 0.35 | 1.649 | 0.81 | 1.573 |
| SACluster | 3.22 | 0.729 | 433.228 | 1.066 | 30.57 | 1.513 |
| PICS | 4.87 | 1.280 | 495.17 | 1.877 | 476.49 | 2.178 |
| HASCOP | 882.17 | 0.838 | 32957 | 1.306 | 4675 | 0.061 |

• CAMIR requires:
  • Less than 6 secs for ~1000 vertices
  • About 8.7 minutes (520.48 secs) for 10000 vertices
• CAMIR achieves on average 55% time and 60% entropy improvement
• BAGC is the fastest method, but achieved limited clustering quality
• HASCOP achieved lower entropy than CAMIR on GoogleSP23 (0.061 vs. 0.387), but it is the slowest method
Summary
• A new approach for Clustering Attributed Multi-graphs with Information Ranking: CAMIR
• A new mechanism to rank and weigh vertex properties
  • Identifies the importance of each attribute and edge-type property
• A unified similarity matrix for attributed multi-graphs
  • Efficiently combines vertex properties
• Identifies clusters of arbitrary sizes and shapes
  • Effective in terms of clustering accuracy and computational time
Andreas Papadopoulos, Dimitrios Rafailidis, George Pallis, Marios D. Dikaiakos
Department of Computer Science, University of Cyprus
Thank You!