
Segmentation

Graph-Theoretic Clustering

Outline

Graph theory basics
Eigenvector methods for segmentation

Graph Theory Terminology

Graph G: Set of vertices V and edges E connecting pairs of vertices

Each edge is represented by the vertices (a, b) it joins

A weighted graph has a weight associated with each edge w(a, b)

Connectivity

Two vertices are connected if there is a sequence of edges joining them
A graph is connected if all of its vertices are connected
Any graph can be partitioned into connected components (CCs) such that each CC is a connected graph and there are no edges between vertices in different CCs (a small sketch follows)
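
As a concrete illustration, here is a minimal sketch that labels connected components with a breadth-first search; it assumes a symmetric weight matrix W (defined on a later slide), with W[i, j] > 0 meaning vertices i and j are joined.

import numpy as np
from collections import deque

def connected_components(W):
    """Label the connected components of an undirected graph.
    W: symmetric weight matrix; W[i, j] > 0 means an edge between i and j."""
    n = W.shape[0]
    labels = np.full(n, -1)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        queue = deque([start])
        while queue:                      # breadth-first search from 'start'
            u = queue.popleft()
            for v in range(n):
                if W[u, v] > 0 and labels[v] == -1:
                    labels[v] = current
                    queue.append(v)
        current += 1
    return labels                         # labels[i] = index of i's component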

Graphs for Clustering

Tokens are vertices
Weights on edges are proportional to token similarity
Cut: the total weight of the edges joining two sets of vertices, cut(A, B) = Σ_{a ∈ A, b ∈ B} w(a, b)
Segmentation: Look for a minimum cut in the graph; recursively cut components until the regions are uniform enough (see the sketch after the figure below)

[Figure: a graph partitioned by a cut into vertex sets A and B]
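
A minimal sketch of the cut weight, assuming a symmetric weight matrix W and index arrays for the two vertex sets (the names below are illustrative, not from the slides). A minimum cut searches over partitions for the split that minimizes this quantity.

import numpy as np

def cut_weight(W, set_a, set_b):
    """Total weight of the edges crossing between vertex sets A and B."""
    return W[np.ix_(set_a, set_b)].sum()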

Representing Graphs As Matrices

Use an N x N matrix W for an N-vertex graph
Entry W(i, j) is the weight on the edge between vertices i and j
Undirected graphs have symmetric weight matrices

from Forsyth & Ponce

[Figure: example graph on numbered vertices and its weight matrix]
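
As an illustrative sketch (the edge list below is made up, not the graph in the figure), a symmetric weight matrix can be built from weighted edges like this:

import numpy as np

def weight_matrix(n, weighted_edges):
    """Build the N x N weight matrix of an undirected graph.
    weighted_edges: iterable of (a, b, w) triples with 0-based vertex indices."""
    W = np.zeros((n, n))
    for a, b, w in weighted_edges:
        W[a, b] = w
        W[b, a] = w          # undirected graph => symmetric matrix
    return W

# Hypothetical example: a 4-vertex graph with three weighted edges
W = weight_matrix(4, [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.1)])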

Affinity Measures

Affinity A(i, j) between tokens i and j should be proportional to similarity

Based on a metric over some visual feature(s):
Position: e.g., A(i, j) = exp[ -(x - y)^T (x - y) / (2σ_d²) ], where x and y are the positions of tokens i and j
Intensity
Color
Texture

These are weights in an affinity graph A over tokens
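
A minimal numpy sketch of such a position-based affinity, assuming an array of token positions and a scale parameter sigma (both names are hypothetical):

import numpy as np

def position_affinity(positions, sigma):
    """Gaussian affinity A(i, j) = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    positions: (n, d) array of token coordinates."""
    diffs = positions[:, None, :] - positions[None, :, :]    # pairwise differences
    sq_dists = (diffs ** 2).sum(axis=-1)                     # squared distances
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

The same recipe applies to intensity, color, or texture descriptors; the scale sigma plays the role illustrated on the next slide.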

Affinity by distance

Choice of Scale

[Figure: affinity matrices computed at scales σ = 0.1, σ = 0.2, and σ = 1]

Eigenvectors and Segmentation

Given k tokens with affinities defined by A, we want a partition into c clusters
For a particular cluster n, denote the membership weights of the tokens by the vector w_n
Require normalized weights, so that w_n^T w_n = 1
The "best" assignment of tokens to cluster n is achieved by selecting the w_n that maximizes the objective function w_n^T A w_n (highest intra-cluster affinity), subject to the weight vector normalization constraint
Using the method of Lagrange multipliers, this yields the system of equations A w_n = λ w_n, which means that w_n is an eigenvector of A, and a solution is obtained from the eigenvector with the largest eigenvalue (a short derivation follows)
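
For completeness, a sketch of the Lagrange-multiplier step in standard notation (this is the usual Rayleigh-quotient argument, not reproduced from the slides):

\[
\max_{w_n}\; w_n^{\top} A\, w_n
\quad \text{subject to} \quad w_n^{\top} w_n = 1 .
\]
Form the Lagrangian and set its gradient to zero:
\[
\mathcal{L}(w_n, \lambda) = w_n^{\top} A\, w_n - \lambda\,(w_n^{\top} w_n - 1),
\qquad
\nabla_{w_n}\mathcal{L} = 2 A w_n - 2\lambda w_n = 0
\;\Longrightarrow\;
A w_n = \lambda w_n .
\]
At such a stationary point the objective equals \(w_n^{\top} A w_n = \lambda\), so the maximum is attained by the eigenvector with the largest eigenvalue.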

Eigenvectors and Segmentation

Note that an appropriate rearrangement of the affinity matrix leads to a block structure indicating clusters
The largest eigenvectors of A tend to correspond to eigenvectors of the blocks
So interpret the biggest c eigenvectors as cluster membership weight vectors; quantize the weights to 0 or 1 to make memberships definite (see the sketch below)
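
A minimal numpy sketch of this recipe, assuming an affinity matrix A and a chosen cluster count c; the argmax over eigenvector magnitudes is one simple way to quantize the weights (the slides only say they are rounded to 0 or 1):

import numpy as np

def eigenvector_clusters(A, c):
    """Assign each token to one of c clusters using the top eigenvectors of A."""
    vals, vecs = np.linalg.eigh(A)            # eigenvalues in ascending order
    top = vecs[:, -c:]                        # c eigenvectors with largest eigenvalues
    # Treat each column as a cluster membership weight vector and quantize:
    return np.abs(top).argmax(axis=1)         # cluster index per token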

from Forsyth & Ponce

[Figure: clustering example using the dataset of Fig. 14.18]

Next 3 Eigenvectors

Number of Clusters

Potential Problem

Normalized Cuts

The previous approach doesn't work when the eigenvalues of the blocks are similar
Just using within-cluster similarity doesn't account for between-cluster differences
There is no encouragement of larger cluster sizes
Define the association between a vertex subset A and the full vertex set V as assoc(A, V) = Σ_{a ∈ A, v ∈ V} w(a, v)
Before, we just maximized assoc(A, A); now we also normalize by assoc(A, V). Define the normalized cut as ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V) (a small sketch follows)
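
A minimal sketch that evaluates this quantity for a candidate two-way split, assuming a symmetric weight matrix W and a boolean mask in_a marking the vertices of A (both names are illustrative):

import numpy as np

def normalized_cut_value(W, in_a):
    """ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)."""
    in_b = ~in_a
    cut_ab = W[np.ix_(in_a, in_b)].sum()   # weight of edges crossing the split
    assoc_av = W[in_a].sum()               # all edge weights incident to A
    assoc_bv = W[in_b].sum()               # all edge weights incident to B
    return cut_ab / assoc_av + cut_ab / assoc_bv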

Normalized Cut Algorithm

Define the diagonal degree matrix D with D(i, i) = Σ_j A(i, j)
Define an integer membership vector x over all the vertices such that each element is 1 if the vertex belongs to cluster A and -1 if it belongs to B (i.e., just two clusters)
Define a real-valued approximation y to x (its elements may vary continuously, subject to the constraint y^T D 1 = 0)
This yields the following objective function to minimize: y^T (D - A) y / (y^T D y)
which sets up the generalized eigenvalue system (D - A) y = λ D y
The eigenvector with the second smallest eigenvalue is the solution (the smallest eigenvalue is always 0)
Continue partitioning the clusters if the normalized cut value is over some threshold (a solver sketch follows)
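
A minimal sketch of one two-way normalized-cut split, assuming scipy is available and using its generalized symmetric eigensolver; thresholding y at 0 is one common way to recover the discrete split, since the slides do not specify the splitting point.

import numpy as np
from scipy.linalg import eigh

def ncut_split(A):
    """Two-way split via the relaxed normalized cut."""
    D = np.diag(A.sum(axis=1))          # diagonal degree matrix
    L = D - A
    vals, vecs = eigh(L, D)             # solves (D - A) y = lambda D y, ascending
    y = vecs[:, 1]                      # eigenvector with second smallest eigenvalue
    return y > 0                        # boolean membership: True = cluster A

In a full segmentation pipeline this split would be applied recursively, stopping whenever the resulting normalized cut value exceeds the chosen threshold.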

Example: Fig. 14.23

Example: Fig. 14.24