A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case...

60
Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT) Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012 Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 Michael Hamann, Tanja Hartmann, Dorothea Wagner | February 2012 Institute of Theoretical Informatics, Prof. Dr. Dorothea Wagner Karlsruhe Institute of Technology (KIT) Tanja Hartmann · Karlsruhe Institute of Technology (KIT) Consistent Labeling of Rotating Maps · EuroCG 2011 t α 1 α 2 α 3 α 0 s s t α i α i-1 s t Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT) Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Transcript of A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case...

Page 1: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Complete Hierarchical Cut-Clustering:A Case Study on Modularity and ExpansionDIMACS 2012Michael Hamann, Tanja Hartmann, Dorothea Wagner | February 2012

Institute of Theoretical Informatics, Prof. Dr. Dorothea WagnerKarlsruhe Institute of Technology (KIT)

Tanja Hartmann · Karlsruhe Institute of Technology (KIT)

Consistent Labeling of Rotating Maps · EuroCG 2011

t

α1

α2

α3

α0

s

s

tαi

αi−1

s

t

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Page 2: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Introduction

(weighted) graph with a particular edge structure

identify subsets of vertices that are significantly related

weak external relations → more significance

The informal aim of graph clustering:

Page 3: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Introduction

(weighted) graph with a particular edge structure

identify subsets of vertices that are significantly related

weak external relations → more significance

clustering is a decomposition into such subsets,

clusters

The informal aim of graph clustering:

induced subgraphs are called clusters

Page 4: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Introduction

(weighted) graph with a particular edge structure

identify subsets of vertices that are significantly related

weak external relations → more significance

clustering is a decomposition into such subsets,

Many formalizations ofsignificant relations

→ many measures for quality→ many heuristics

clusters

The informal aim of graph clustering:

induced subgraphs are called clusters

Page 5: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Introduction

Many formalizations ofsignificant relations

→ many measures for quality→ many heuristics

clusters

Modularity

widely used

expresses significance of a clustering Ccompared to a random clustering

M(C) :=∑C∈C

c(EC )c(E)

−∑C∈C

(∑v∈C dc(v))2

4c(E)2

cov(C) E(cov(C))

{ {

Page 6: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Introduction

Many formalizations ofsignificant relations

→ many measures for quality→ many heuristics

clusters

Cut-Clustering by Flake et al. [2004]

exploits properties of minimum s-t-cuts

guarantees a lower bound on intra-cluster expansion (NP-hard)

Ψ(C) := minC∈C{minS⊂Cc(S,S)

min{|S|,|S|}}

Page 7: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Introduction

Cut-Clustering by Flake et al. [2004]

exploits properties of minimum s-t-cuts

guarantees a lower bound on intra-cluster expansion (NP-hard)

Ψ(C) := minC∈C{minS⊂Cc(S,S)

min{|S|,|S|}}

significance due to clearly indicated membership of vertices to clusters

s

t

for all U ⊂ C \ {s}: c(U,C) ≥ c(U, V \ C)

→ source-community of s

Page 8: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Outline

The Cut-Clustering Algorithm

Parametric Search Approach

depends on a parameter steering the coarsenessclusterings are hierarchically nested for different parameter valuesparameter constitutes guarantee on intra-cluster expansion

returns a complete hierarchy of cut-clusteringssimple and efficient (brief running time experiment)

Modularity Analysis of Cut-Clustering Algotithm

Expansion Analysis of Cut-Clustering Algorithm

compared to a modularity-based greedy approach

comparing the guarantee to trivial and non-trivial bounds

Page 9: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion

Page 10: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion

Page 11: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t

parameter α

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 12: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

2.5 α

parameter α

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 13: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

d2.5 α

parameter α

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 14: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

d2.5 α

parameter α

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 15: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

d2.5 α

parameter α

Community Cut:minimum s-t-cut with source side of minimum size

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 16: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

d2.5 α

uniqueparameter α

Community Cut:minimum s-t-cut with source side of minimum size

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 17: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

2.5 α

a

uniqueparameter α

Community Cut:minimum s-t-cut with source side of minimum size

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 18: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

2.5 α

a

uniqueparameter α

Community Cut:minimum s-t-cut with source side of minimum size

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 19: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

2.5 αc

uniqueparameter α

Community Cut:minimum s-t-cut with source side of minimum size

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 20: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

2.5 αc

uniqueparameter α

Community Cut:minimum s-t-cut with source side of minimum size

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 21: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

2.5 αc

unique non-crossingparameter α

Community Cut:minimum s-t-cut with source side of minimum size

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 22: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

a b

cd

t 1.5 α

10 α

0.5 α

2.5 αc

unique non-crossingparameter α

Community Cut:minimum s-t-cut with source side of minimum size

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 23: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

Flake et al. [2004]artificial vertex, artificial edgesspecial minimum s-t-cuts (so called community cuts)

t

unique non-crossingparameter α

a b

cd

−→ Cut-Clustering with two clusters

Community Cut:minimum s-t-cut with source side of minimum size

clusters are source-communities ⇒ good significance of clustering

guarantees bound on intra-cluster expansion Ψ(C) ≥ α

Page 24: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

t

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

Page 25: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

t

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

Page 26: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

Page 27: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

s

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

Source Community of sin underlying graph

Page 28: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

Page 29: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α2

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

Page 30: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α2

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

s

Source Community of sin contracted graph

Page 31: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α2

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

uniquenon-crossing

s

Source Community of sin contracted graph

community-cuts are

Page 32: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α2

vertically nested s

t

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

uniquenon-crossing

αi

αi−1

s

Source Community of sin contracted graph

community-cuts are

Page 33: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α2

vertically nested s

t

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

uniquenon-crossing

αi

αi−1

community-cuts are

Page 34: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α2

α3

vertically nested s

t

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

uniquenon-crossing

αi

αi−1

community-cuts are

Page 35: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α2

α3

vertically nested s

t

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

uniquenon-crossing

αi

αi−1

Source Community of sin underlying graph

s

community-cuts are

Page 36: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

The Cut-Clustering Algorithm

tα1

α2

α3

α0

Flake et al. [2004]iterative application of the Cut-Clustering approachwith different values α1 > · · · > αkα0 = 0, singleton clustering C0

For i = 1, . . . , k doreplace αi−1 by αicontract each cluster C ∈ Ci−1

calculate clustering Ci

Naive idea: Binary search

needs discretizationtoo low → missing levels

too high → extensive running time

Problem: Choice of α

Page 37: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

New Parametric Search Approach

bases on cut-cost function

ωC : R+0 → [c(C, V \ C),∞) ⊂ R+

0

ωC(α) := c(C, V \ C) + |C|α

Page 38: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

New Parametric Search Approach

bases on cut-cost function

ωC : R+0 → [c(C, V \ C),∞) ⊂ R+

0

ωC(α) := c(C, V \ C) + |C|αslope

α

ωC

c(C, V\C)

Page 39: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

New Parametric Search Approach

bases on cut-cost function

ωC : R+0 → [c(C, V \ C),∞) ⊂ R+

0

ωC(α) := c(C, V \ C) + |C|αslope

α

ωC

c(C, V\C)

Let Ci < Cj denote two different clusterings withassociated parameter values αi > αj .In time O(|Ci|) a parameter value αmwith αj < αm ≤ αi can be computed such that

1) Ci ≤ Cm < Cj and2) Cm = Ci implies that αm is the breakpoint between Ci and Cj .

Theorem:

Page 40: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

New Parametric Search Approach

bases on cut-cost function

ωC : R+0 → [c(C, V \ C),∞) ⊂ R+

0

ωC(α) := c(C, V \ C) + |C|α

Let Ci < Cj denote two different clusterings withassociated parameter values αi > αj .In time O(|Ci|) a parameter value αmwith αj < αm ≤ αi can be computed such that

1) Ci ≤ Cm < Cj and2) Cm = Ci implies that αm is the breakpoint between Ci and Cj .

αm := minCj∈Cj{maxCi⊂Cj α′i}

with α′i the intersection point of ωCj and ωCi

α

ωCi

ωCj

Ci

Cj

slope

Theorem:

Page 41: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

New Parametric Search Approach

bases on cut-cost function

ωC : R+0 → [c(C, V \ C),∞) ⊂ R+

0

ωC(α) := c(C, V \ C) + |C|α

Let Ci < Cj denote two different clusterings withassociated parameter values αi > αj .In time O(|Ci|) a parameter value αmwith αj < αm ≤ αi can be computed such that

1) Ci ≤ Cm < Cj and2) Cm = Ci implies that αm is the breakpoint between Ci and Cj .

αm := minCj∈Cj{maxCi⊂Cj α′i}

with α′i the intersection point of ωCj and ωCi

α

ωCi

ωCj

Ci

Cj

slope

Theorem:

⇒ at most 2h iterations, returns complete hierarchy

Page 42: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Running Time Experiment

graph n m h ParS [m:s] BinS [fac]jazz 198 2742 3 0.062 7.871celegans metabolic 453 2025 8 0.300 8.380celegansneural 297 2148 17 0.406 9.919email 1133 5451 4 1.116 8.758netscience 1589 2742 38 4.310 11.952bo cluster 2114 2277 19 4.355 12.800data 2851 15093 4 11.506 9.620dokuwiki org 4416 12914 18 39.815 15.571power 4941 6594 66 1:25.736 15.742hep-th 8361 15751 56 6:26.213 18.183PGPgiantcompo 10680 24316 94 13:25.121 18.575as-22july06 22963 48436 33 39:54.495 20.583cond-mat 16726 47594 80 44:15.317 27.425astro-ph 16706 121251 60 98:25.791 24.825rgg n 2 15 32768 160240 46 245:25.644 22.573cond-mat-2003 31163 120029 74 268:14.601 20.933G n pin pout 100000 501198 4 369:29.033 *cond-mat-2005 40421 175691 82 652:32.163 21.446

Page 43: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Running Time Experiment

graph n m h ParS [m:s] BinS [fac]jazz 198 2742 3 0.062 7.871celegans metabolic 453 2025 8 0.300 8.380celegansneural 297 2148 17 0.406 9.919email 1133 5451 4 1.116 8.758netscience 1589 2742 38 4.310 11.952bo cluster 2114 2277 19 4.355 12.800data 2851 15093 4 11.506 9.620dokuwiki org 4416 12914 18 39.815 15.571power 4941 6594 66 1:25.736 15.742hep-th 8361 15751 56 6:26.213 18.183PGPgiantcompo 10680 24316 94 13:25.121 18.575as-22july06 22963 48436 33 39:54.495 20.583cond-mat 16726 47594 80 44:15.317 27.425astro-ph 16706 121251 60 98:25.791 24.825rgg n 2 15 32768 160240 46 245:25.644 22.573cond-mat-2003 31163 120029 74 268:14.601 20.933G n pin pout 100000 501198 4 369:29.033 *cond-mat-2005 40421 175691 82 652:32.163 21.446

Page 44: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Modularity Analysiscut-clustering algorithm vs. modularity-based greedy approach

0.9

0.6

0.3

0.0

mo

du

lari

ty

cut-clustering modularity-based clustering

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

*ce

leg

an

sneu

ral

*le

smis

do

ku

wik

ior

g

*n

etsc

ien

ce

po

lblo

gs

pow

er

*ka

rate

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

*ja

zz

Gn

pin

po

ut

*d

ata

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

*p

olb

oo

ks

PG

Pg

ian

tco

mp

o

*d

ela

un

ayn

10

*d

ela

un

ayn

11

*d

ela

un

ayn

12

foo

tba

ll

modClus ≥ cutClus

Page 45: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Modularity Analysiscut-clustering algorithm vs. modularity-based greedy approach

0.9

0.6

0.3

0.0

mo

du

lari

ty

cut-clustering modularity-based clustering

20

10

5

0

clu

ster

size

s 15

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

del

au

nay

n1

0

del

au

nay

n1

1

del

au

nay

n1

2

foo

tba

ll

modClus ≥ cutClus

cutClus increaseswith coarseness

Page 46: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Modularity Analysiscut-clustering algorithm vs. modularity-based greedy approach

0.9

0.6

0.3

0.0

mo

du

lari

ty

cut-clustering modularity-based clustering

0.9

0.6

0.3

clu

ster

edve

rtic

es

20

10

5

0

clu

ster

size

s 15

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

del

au

nay

n1

0

del

au

nay

n1

1

del

au

nay

n1

2

foo

tba

ll

modClus ≥ cutClus

cutClus increaseswith coarseness

Page 47: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Modularity Analysiscut-clustering algorithm vs. modularity-based greedy approach

0.9

0.6

0.3

0.0

mo

du

lari

ty

cut-clustering modularity-based clustering

0.9

0.6

0.3

clu

ster

edve

rtic

es

20

10

5

0

clu

ster

size

s 15

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

del

au

nay

n1

0

del

au

nay

n1

1

del

au

nay

n1

2

foo

tba

ll

modularity-based

Page 48: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Modularity Analysiscut-clustering algorithm vs. modularity-based greedy approach

0.9

0.6

0.3

0.0

mo

du

lari

ty

cut-clustering modularity-based clustering

0.9

0.6

0.3

clu

ster

edve

rtic

es

20

10

5

0

clu

ster

size

s 15

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

del

au

nay

n1

0

del

au

nay

n1

1

del

au

nay

n1

2

foo

tba

ll

significant clusters?

Page 49: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Modularity Analysis

0.9

0.6

0.3

0.0

mo

du

lari

ty

cut-clustering modularity-based clustering

0.9

0.6

0.3

clu

ster

edve

rtic

es

20

10

5

0

clu

ster

size

s 15

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

del

au

nay

n1

0

del

au

nay

n1

1

del

au

nay

n1

2

foo

tba

ll

no

nS

Ccl

ust

ers

modClus ≥ cutClus

cutClus increaseswith coarseness

modClus often missessource-communityproperty

Source-community S of s:∀ U ⊂ S\{s}: c(U, S\U) ≥ c(U, V \S)

Page 50: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Modularity Analysis

modClus ≥ cutClus

cutClus increaseswith coarseness

modClus often missessource-communityproperty

0.9

0.6

0.3

0.1

mo

du

lari

tyn

on

SC

clu

ster

s

275 snapshots of the email network of the Department of Informarics at KIT.

0.9

0.6

0.3

clu

ster

edve

rtic

es

cut-clustering modularity-based clustering

Source-community S of s:∀ U ⊂ S\{s}: c(U, S\U) ≥ c(U, V \S)

Page 51: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Modularity Analysis

modClus ≥ cutClus

cutClus increaseswith coarseness

modClus often missessource-communityproperty

0.9

0.6

0.3

0.1

mo

du

lari

tyn

on

SC

clu

ster

s

275 snapshots of the email network of the Department of Informarics at KIT.

0.9

0.6

0.3

clu

ster

edve

rtic

es

cut-clustering modularity-based clustering

modClus closer tosource-communityproperty

cutClus close tomodClus

same trend for both

Source-community S of s:∀ U ⊂ S\{s}: c(U, S\U) ≥ c(U, V \S)

Page 52: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Modularity Analysis

modClus ≥ cutClus

cutClus increaseswith coarseness

modClus often missessource-communityproperty

0.9

0.6

0.3

0.1

mo

du

lari

tyn

on

SC

clu

ster

s

275 snapshots of the email network of the Department of Informarics at KIT.

0.9

0.6

0.3

clu

ster

edve

rtic

es

cut-clustering modularity-based clustering

modClus closer tosource-communityproperty

cutClus close tomodClus

same trend for bothhigh dependence ongraph structure

Conjecture:Good cutClusif significantclustering exists

Source-community S of s:∀ U ⊂ S\{s}: c(U, S\U) ≥ c(U, V \S)

Page 53: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Expansion Analysisintra-cluster expansion is

NP-hard to compute

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

275 snapshots of the email network of the Department of Informarics at KIT.

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

Ψg = α

Page 54: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Expansion Analysisintra-cluster expansion is

NP-hard to compute

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

275 snapshots of the email network of the Department of Informarics at KIT.

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

Ψg = α

trivial lower bound: Ψ`(C) := minC∈C{ c(S,C\S)|C|/2

} with S global min-cut in C

Ψ`cut

true gain ofknowledge

Page 55: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Expansion Analysisintra-cluster expansion is

NP-hard to compute

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

275 snapshots of the email network of the Department of Informarics at KIT.

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

Ψg = α

trivial lower bound: Ψ`(C) := minC∈C{ c(S,C\S)|C|/2

} with S global min-cut in C

Ψ`cut

nontrivial lower bound Ψa: individually applying cutClus to Clusters

Ψacut

true gain ofknowledge

trueexpansioneven better

Page 56: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Expansion Analysisintra-cluster expansion is

NP-hard to compute

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

275 snapshots of the email network of the Department of Informarics at KIT.

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

Ψg = α

trivial lower bound: Ψ`(C) := minC∈C{ c(S,C\S)|C|/2

} with S global min-cut in C

Ψ`cut

nontrivial lower bound Ψa: individually applying cutClus to Clusters

Ψumod

trivial upper bound: Ψu(C) := minC∈C{ c(S,C\S)min{|S|,|C\S|}} with S global min-cut in C

Ψacut

true gain ofknowledge

trueexpansioneven better

Page 57: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Expansion Analysisintra-cluster expansion is

NP-hard to compute

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

275 snapshots of the email network of the Department of Informarics at KIT.

2.0

1.5

2.5

1.0

0.5

0.0exp

an

sio

nb

ou

nd

s

cele

ga

ns

met

ab

.

as2

2ju

ly0

6

cele

ga

nsn

eura

l

lesm

is

do

ku

wik

ior

g

net

scie

nce

po

lblo

gs

pow

er

kara

te

bo

clu

ster

con

d-m

at

ast

ro-p

h

hep

-th

ema

il

ad

jno

un

do

lph

ins

jazz

Gn

pin

po

ut

da

ta

con

d-m

at-

20

05

con

d-m

at-

20

03

rgg

n2

15

s0

po

lbo

ok

s

PG

Pg

ian

tco

mp

o

Ψg = α

trivial lower bound: Ψ`(C) := minC∈C{ c(S,C\S)|C|/2

} with S global min-cut in C

Ψ`cut

nontrivial lower bound Ψa: individually applying cutClus to Clusters

Ψ`mod

Ψamod

Ψacut

true gain ofknowledge

trueexpansioneven better

cutClus ≥modClus

Page 58: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Conclusion

Analysis of the Hierarchical Cut-Clustering Algorithm of Flake et al. [2004]

Efficient Parametric Search Approach→ complete hierarchy

Comparison to modularity-based greedy approach→ competes well if graph structure allows

Study of guarantee on intra-cluster expansion→ true gain of knowledge

Page 59: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Conclusion

Analysis of the Hierarchical Cut-Clustering Algorithm of Flake et al. [2004]

Efficient Parametric Search Approach→ complete hierarchy

Comparison to modularity-based greedy approach→ competes well if graph structure allows

Study of guarantee on intra-cluster expansion→ true gain of knowledge

source-communities, which ensure a good significance, withdecent modularity values anda guaranteed intra-cluster expansion.

Hierarchical Cut-Clustering Algorithm fairly well combines

Page 60: A Case Study on Modularity and Expansion Complete ...Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion DIMACS 2012 The Cut-Clustering Algorithm Flake et

Michael Hamann, Tanja Hartmann, Dorothea Wagner · Karlsruhe Institute of Technology (KIT)

Complete Hierarchical Cut-Clustering: A Case Study on Modularity and Expansion · DIMACS 2012

Conclusion

Analysis of the Hierarchical Cut-Clustering Algorithm of Flake et al. [2004]

Efficient Parametric Search Approach→ complete hierarchy

Comparison to modularity-based greedy approach→ competes well if graph structure allows

Study of guarantee on intra-cluster expansion→ true gain of knowledge

source-communities, which ensure a good significance, withdecent modularity values anda guaranteed intra-cluster expansion.

Thank You

Hierarchical Cut-Clustering Algorithm fairly well combines