
Bipartite spectral graph partitioning to co-cluster varieties and sound correspondences

Martijn Wieling

Department of Computational Linguistics, University of Groningen

Seminar in Methodology and Statistics - May 20, 2009

Martijn Wieling 1/32


Goal

Making the title of this presentation understandable!

Bipartite spectral graph partitioning to co-cluster varieties and sound correspondences


Overview

Why co-clustering?

Method
- Introduction to eigenvalues and eigenvectors
- Simple clustering
- Co-clustering

Complete dataset

Results

Conclusions


Why co-clustering?

Research interest: language and dialectal variation
Important method: cluster similar (dialectal) varieties together
Problem: clustering varieties does not yield a linguistic basis
Previous solutions: investigate sound correspondences post hoc (e.g., Heeringa, 2004)
Co-clustering: clusters varieties and sound correspondences simultaneously

Eigenvalues and eigenvectors are central in this approach


Graphs and matrices

A graph is a set of vertices connected with edges:

A graph can also be represented by its adjacency matrix A:

     A  B  C  D
A    0  1  1  1
B    1  0  0  0
C    1  1  0  0
D    0  1  1  0


Eigenvalues and eigenvectors

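The eigenvalues λ and the eigenvectors x of a square matrix A are defined as follows:

$$A x = \lambda x \quad [\Rightarrow (A - \lambda I)\,x = 0]$$

In matrix form:

$$\begin{bmatrix} a_{11} - \lambda & a_{12} \\ a_{21} & a_{22} - \lambda \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

This is solved when:

$$(a_{11} - \lambda)\,x_1 + a_{12}\,x_2 = 0$$

$$a_{21}\,x_1 + (a_{22} - \lambda)\,x_2 = 0$$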
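One step is implicit in the two equations above: they only have a solution other than x = 0 when the matrix A − λI is singular, so its determinant must vanish. For the 2×2 case this gives a quadratic in λ. The following is standard linear algebra, added as a worked step rather than taken from the slides:

```latex
\det(A - \lambda I)
  = (a_{11} - \lambda)(a_{22} - \lambda) - a_{12}\,a_{21}
  = \lambda^2 - (a_{11} + a_{22})\,\lambda + (a_{11}a_{22} - a_{12}a_{21})
  = 0
```

Each root λ of this quadratic is an eigenvalue; substituting it back into either of the two linear equations gives the direction of the corresponding eigenvector.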


Example of calculating eigenvalues and eigenvectors

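Consider the following example:

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$$

Using $(A - \lambda I)\,x = 0$ we get:

$$\begin{bmatrix} 1 - \lambda & 2 \\ 2 & 1 - \lambda \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Solved when $\det(A - \lambda I) = 0$: $(1 - \lambda)^2 - 4 = 0$

Using $\lambda_1 = 3$ and $\lambda_2 = -1$ we obtain

$$x = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{and} \quad x = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$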
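The hand calculation can be checked numerically. This is a minimal sketch assuming NumPy is available; `np.linalg.eigh` is used because this A is symmetric, and it returns unit-length eigenvectors, so [1, 1] appears scaled to length one (possibly with its sign flipped).

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(A)

# Each column of `eigenvectors` is one unit-length eigenvector
for lam, x in zip(eigenvalues, eigenvectors.T):
    # Defining property: A x = lambda x
    assert np.allclose(A @ x, lam * x)
    print(f"lambda = {lam:.0f}, x = {np.round(x, 3)}")
```

The eigenvalues come back as -1 and 3, with eigenvectors proportional to [1, -1] and [1, 1], in line with the slide.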


Spectrum of a graph

The spectrum of a graph is the set of eigenvalues of the adjacency matrix A of the graph

The spectrum is considered to capture important structural properties of a graph (Chung, 1997)

Some interesting applications of eigenvalues and eigenvectors:
- Principal Component Analysis (PCA; Duda et al., 2001: 114–117)
- PageRank (Google; Brin and Page, 1998)
- Partitioning (i.e. clustering; Von Luxburg, 2007)


Example of spectral graph clustering (1/8)

Consider the matrix A with sound correspondences:

           [a]/[i]  [2]/[i]  [r]/[x]  [k]/[x]  [r]/[ö]  [r]/[K]
[a]/[i]          0        1        1        0        0        0
[2]/[i]          1        0        1        0        0        0
[r]/[x]          1        1        0        1        0        0
[k]/[x]          0        0        1        0        1        1
[r]/[ö]          0        0        0        1        0        1
[r]/[K]          0        0        0        1        1        0

In graph-form:


Example of spectral graph clustering (2/8)

To partition this graph, we have to determine the optimal cut:

The optimal cut yielding balanced clusters is obtained by finding the eigenvectors of the normalized Laplacian $L_n = D^{-1} L$, with $L = D - A$ and $D$ the degree matrix of $A$ (Shi and Malik, 2000; Von Luxburg, 2007).


Example of spectral graph clustering (3/8)

The adjacency matrix A:

           [a]/[i]  [2]/[i]  [r]/[x]  [k]/[x]  [r]/[ö]  [r]/[K]
[a]/[i]          0        1        1        0        0        0
[2]/[i]          1        0        1        0        0        0
[r]/[x]          1        1        0        1        0        0
[k]/[x]          0        0        1        0        1        1
[r]/[ö]          0        0        0        1        0        1
[r]/[K]          0        0        0        1        1        0


Example of spectral graph clustering (4/8)

The Laplacian matrix L:

           [a]/[i]  [2]/[i]  [r]/[x]  [k]/[x]  [r]/[ö]  [r]/[K]
[a]/[i]          2       -1       -1        0        0        0
[2]/[i]         -1        2       -1        0        0        0
[r]/[x]         -1       -1        3       -1        0        0
[k]/[x]          0        0       -1        3       -1       -1
[r]/[ö]          0        0        0       -1        2       -1
[r]/[K]          0        0        0       -1       -1        2


Example of spectral graph clustering (5/8)

The normalized Laplacian matrix L_n:

           [a]/[i]  [2]/[i]  [r]/[x]  [k]/[x]  [r]/[ö]  [r]/[K]
[a]/[i]          1     -0.5     -0.5        0        0        0
[2]/[i]       -0.5        1     -0.5        0        0        0
[r]/[x]      -0.33    -0.33        1    -0.33        0        0
[k]/[x]          0        0    -0.33        1    -0.33    -0.33
[r]/[ö]          0        0        0     -0.5        1     -0.5
[r]/[K]          0        0        0     -0.5     -0.5        1
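The three matrices of slides 3/8 to 5/8 follow mechanically from one another. A minimal sketch, assuming NumPy; the variable names are illustrative and not from the slides:

```python
import numpy as np

# Adjacency matrix of the six sound correspondences, rows in the slide order:
# [a]/[i], [2]/[i], [r]/[x], [k]/[x], [r]/[ö], [r]/[K]
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

degrees = A.sum(axis=1)             # number of edges attached to each vertex
D = np.diag(degrees)                # degree matrix
L = D - A                           # Laplacian
L_n = np.diag(1.0 / degrees) @ L    # normalized Laplacian D^-1 L

print(np.round(L_n, 2))
```

Multiplying by D^-1 on the left simply divides row i of L by the degree of vertex i, which is why the rows for the degree-3 vertices contain -0.33 and the others -0.5.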


Example of spectral graph clustering (6/8)

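The eigenvalues λ and eigenvectors x of L_n (i.e. L_n x = λx):

λ1 = 0 with x = [-0.41 -0.41 -0.41 -0.41 -0.41 -0.41]^T
λ2 = 0.21 with x = [0.46 0.46 0.27 -0.27 -0.46 -0.46]^T
λ3 = 1.17 with x = [0.36 0.36 -0.49 -0.49 0.36 0.36]^T
...

The first eigenvector (belonging to the smallest eigenvalue) is constant and does not yield clustering information. Does the second? Yes! Its entries are positive for the first three vertices and negative for the last three. The sign changes between [r]/[x] (0.27) and [k]/[x] (-0.27), which is exactly the edge that is cut.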
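These eigenvectors can be reproduced numerically. The sketch below assumes NumPy and makes one substitution that is not on the slides: instead of decomposing the non-symmetric D^-1 L directly, it decomposes the symmetric matrix D^-1/2 L D^-1/2, which has the same eigenvalues, and maps its eigenvectors back with D^-1/2. This is a common way to keep the computation numerically well behaved.

```python
import numpy as np

A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

d = A.sum(axis=1)
d_inv_sqrt = 1.0 / np.sqrt(d)

# Symmetric form I - D^-1/2 A D^-1/2 (same eigenvalues as D^-1 (D - A))
L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
eigenvalues, Y = np.linalg.eigh(L_sym)    # ascending eigenvalues

# If L_sym y = lambda y, then x = D^-1/2 y satisfies D^-1 L x = lambda x
X = d_inv_sqrt[:, None] * Y
X /= np.linalg.norm(X, axis=0)            # rescale every eigenvector to unit length

second = X[:, 1]                          # eigenvector of the second-smallest eigenvalue
labels = (second > 0).astype(int)         # split the vertices by sign

print(np.round(eigenvalues[:3], 3))
print(np.round(second, 2), labels)
```

The overall sign of an eigenvector is arbitrary, so the printed vector may be the negative of the one on the slide; the resulting two groups are the same either way. The second eigenvalue comes out at about 0.205 (the slide lists 0.21).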


Example of spectral graph clustering (7/8)

If we use the k-means algorithm (i.e. minimize the within-cluster sum of squares; Lloyd, 1982) to cluster the eigenvector in two groups, we obtain the following partitioning: {[a]/[i], [2]/[i], [r]/[x]} and {[k]/[x], [r]/[ö], [r]/[K]}

To cluster in k > 2 groups we use the second to k-th (smallest) eigenvectors


Example of spectral graph clustering (8/8)

To cluster in k = 3 groups, we use:

λ2 = 0.21 with x = [0.46 0.46 0.27 -0.27 -0.46 -0.46]^T
λ3 = 1.17 with x = [0.36 0.36 -0.49 -0.49 0.36 0.36]^T

We obtain the following clustering:
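The clustering step treats each vertex as a point whose coordinates are its entries in the two eigenvectors, and runs k-means on those points. Below is a minimal sketch of Lloyd's iterations in NumPy, using the rounded values from the slide as input. The function name `kmeans` and the deterministic farthest-point initialization are assumptions made for reproducibility; the slides do not say how the centers were initialized.

```python
import numpy as np

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point start (illustrative)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its points
        # (an empty cluster keeps its previous center)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Second and third eigenvectors as listed on the slide
x2 = np.array([0.46, 0.46, 0.27, -0.27, -0.46, -0.46])
x3 = np.array([0.36, 0.36, -0.49, -0.49, 0.36, 0.36])
points = np.column_stack([x2, x3])    # one row (point) per sound correspondence

labels, centers = kmeans(points, k=3)
print(labels)
```

With these inputs the vertices are grouped in pairs: [a]/[i] with [2]/[i], [r]/[x] with [k]/[x], and [r]/[ö] with [r]/[K]. In practice a library implementation with several random restarts would be the more usual choice.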


Bipartite graphs

A bipartite graph is a graph whose vertices can be divided in two disjoint sets where every edge connects a vertex from one set to a vertex in the other set. Vertices within a set are not connected.

A matrix representation of a bipartite graph:

            [a]/[i]  [2]/[i]  [r]/[x]  [k]/[x]  [r]/[ö]  [r]/[K]
Appelscha         1        1        1        0        0        0
Oudega            1        1        1        0        0        0
Zoutkamp          0        0        1        1        0        0
Kerkrade          0        0        0        1        1        1
Vaals             0        0        0        1        1        1


Example of co-clustering a bipartite graph (1/6)

The (naive) co-clustering procedure is equal to clustering in one dimension (i.e. cluster eigenvector(s) of the normalized Laplacian)

Consider the following graph:


Example of co-clustering a bipartite graph (2/6)

The adjacency matrix A:

          appelscha   oudega zoutkamp kerkrade    vaals  [a]/[i]  [2]/[i]  [r]/[x]  [k]/[x]  [r]/[ö]  [r]/[K]
appelscha         0        0        0        0        0        1        1        1        0        0        0
oudega            0        0        0        0        0        1        1        1        0        0        0
zoutkamp          0        0        0        0        0        0        0        1        1        0        0
kerkrade          0        0        0        0        0        0        0        0        1        1        1
vaals             0        0        0        0        0        0        0        0        1        1        1
[a]/[i]           1        1        0        0        0        0        0        0        0        0        0
[2]/[i]           1        1        0        0        0        0        0        0        0        0        0
[r]/[x]           1        1        1        0        0        0        0        0        0        0        0
[k]/[x]           0        0        1        1        1        0        0        0        0        0        0
[r]/[ö]           0        0        0        1        1        0        0        0        0        0        0
[r]/[K]           0        0        0        1        1        0        0        0        0        0        0
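The full adjacency matrix has a block structure: the variety-by-sound-correspondence matrix sits in the upper right, its transpose in the lower left, and the two diagonal blocks are zero because vertices within one set are never connected. A minimal NumPy sketch (the name `B` for the rectangular matrix is an assumption for illustration):

```python
import numpy as np

varieties = ["appelscha", "oudega", "zoutkamp", "kerkrade", "vaals"]
sounds = ["[a]/[i]", "[2]/[i]", "[r]/[x]", "[k]/[x]", "[r]/[ö]", "[r]/[K]"]

# n x m matrix from the "Bipartite graphs" slide: rows are varieties,
# columns are sound correspondences
B = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
])
n, m = B.shape

# (n + m) x (n + m) adjacency matrix of the bipartite graph
A = np.block([
    [np.zeros((n, n), dtype=int), B],
    [B.T, np.zeros((m, m), dtype=int)],
])

degrees = A.sum(axis=1)
for name, deg in zip(varieties + sounds, degrees):
    print(f"{name:10s} degree {deg}")
```

The degrees printed here are the diagonal entries of the Laplacian on the next slide.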


Example of co-clustering a bipartite graph (3/6)

The Laplacian matrix L:

          appelscha   oudega zoutkamp kerkrade    vaals  [a]/[i]  [2]/[i]  [r]/[x]  [k]/[x]  [r]/[ö]  [r]/[K]
appelscha         3        0        0        0        0       -1       -1       -1        0        0        0
oudega            0        3        0        0        0       -1       -1       -1        0        0        0
zoutkamp          0        0        2        0        0        0        0       -1       -1        0        0
kerkrade          0        0        0        3        0        0        0        0       -1       -1       -1
vaals             0        0        0        0        3        0        0        0       -1       -1       -1
[a]/[i]          -1       -1        0        0        0        2        0        0        0        0        0
[2]/[i]          -1       -1        0        0        0        0        2        0        0        0        0
[r]/[x]          -1       -1       -1        0        0        0        0        3        0        0        0
[k]/[x]           0        0       -1       -1       -1        0        0        0        3        0        0
[r]/[ö]           0        0        0       -1       -1        0        0        0        0        2        0
[r]/[K]           0        0        0       -1       -1        0        0        0        0        0        2


Example of co-clustering a bipartite graph (4/6)

The normalized Laplacian matrix L_n:

          appelscha   oudega zoutkamp kerkrade    vaals  [a]/[i]  [2]/[i]  [r]/[x]  [k]/[x]  [r]/[ö]  [r]/[K]
appelscha         1        0        0        0        0    -0.33    -0.33    -0.33        0        0        0
oudega            0        1        0        0        0    -0.33    -0.33    -0.33        0        0        0
zoutkamp          0        0        1        0        0        0        0     -0.5     -0.5        0        0
kerkrade          0        0        0        1        0        0        0        0    -0.33    -0.33    -0.33
vaals             0        0        0        0        1        0        0        0    -0.33    -0.33    -0.33
[a]/[i]        -0.5     -0.5        0        0        0        1        0        0        0        0        0
[2]/[i]        -0.5     -0.5        0        0        0        0        1        0        0        0        0
[r]/[x]       -0.33    -0.33    -0.33        0        0        0        0        1        0        0        0
[k]/[x]           0        0    -0.33    -0.33    -0.33        0        0        0        1        0        0
[r]/[ö]           0        0        0     -0.5     -0.5        0        0        0        0        1        0
[r]/[K]           0        0        0     -0.5     -0.5        0        0        0        0        0        1


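Example of co-clustering a bipartite graph (5/6)

To cluster in k = 2 groups, we use:

λ2 = .057, x = [-.32 -.32 0 .32 .32 -.34 -.34 -.23 .23 .34 .34]^T

We obtain the following co-clustering:

[Figure: the bipartite graph with every vertex labelled by its entry in x]

Appelscha  -0.32
Oudega     -0.32
Zoutkamp    0
Kerkrade    0.32
Vaals       0.32
[a]/[i]    -0.34
[2]/[i]    -0.34
[r]/[x]    -0.23
[k]/[x]     0.23
[r]/[ö]     0.34
[r]/[K]     0.34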


Example of co-clustering a bipartite graph (6/6)

To cluster in k = 3 groups, we use:

λ2 = .057, x = [-.32 -.32 0 .32 .32 -.34 -.34 -.23 .23 .34 .34]^T
λ3 = .53, x = [.12 .12 -.7 .12 .12 .25 .25 -.33 -.33 .25 .25]^T

We obtain the following co-clustering:

[Figure: the bipartite graph with every vertex labelled by its coordinates in the two eigenvectors]

Appelscha  (-0.32, 0.12)
Oudega     (-0.32, 0.12)
Zoutkamp   (0, -0.7)
Kerkrade   (0.32, 0.12)
Vaals      (0.32, 0.12)
[a]/[i]    (-0.34, 0.25)
[2]/[i]    (-0.34, 0.25)
[r]/[x]    (-0.23, -0.33)
[k]/[x]    (0.23, -0.33)
[r]/[ö]    (0.34, 0.25)
[r]/[K]    (0.34, 0.25)


A faster method

The previous method is relatively slow due to the use of the large (sparse) matrix A of size (n + m) × (n + m)

The matrix A′ contains the same information, but is denser (size: n × m):

            [a]/[i]  [2]/[i]  [r]/[x]  [k]/[x]  [r]/[ö]  [r]/[K]
Appelscha         1        1        1        0        0        0
Oudega            1        1        1        0        0        0
Zoutkamp          0        0        1        1        0        0
Kerkrade          0        0        0        1        1        1
Vaals             0        0        0        1        1        1

The singular value decomposition (SVD) of A′_n, the normalized version of A′, yields equivalent clustering information and is quicker to compute (Dhillon, 2001)
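A sketch of the SVD route in NumPy follows. It assumes that A′_n denotes the normalization used by Dhillon (2001), A′_n = D1^-1/2 A′ D2^-1/2, where D1 and D2 are the diagonal matrices of row sums (variety degrees) and column sums (sound-correspondence degrees). The slide does not spell this out, so treat that reading and all variable names as assumptions.

```python
import numpy as np

# n x m matrix A' (rows: Appelscha, Oudega, Zoutkamp, Kerkrade, Vaals)
A_prime = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
], dtype=float)

d1 = A_prime.sum(axis=1)    # degrees of the varieties
d2 = A_prime.sum(axis=0)    # degrees of the sound correspondences

# Entry (i, j) is divided by sqrt(d1[i] * d2[j]), i.e. D1^-1/2 A' D2^-1/2
A_n = A_prime / np.sqrt(np.outer(d1, d2))

U, s, Vt = np.linalg.svd(A_n, full_matrices=False)   # singular values in descending order

# The second left/right singular vectors, rescaled by D^-1/2, give the same
# vector as the second eigenvector of the (n + m) x (n + m) normalized Laplacian
z = np.concatenate([U[:, 1] / np.sqrt(d1), Vt[1, :] / np.sqrt(d2)])
z /= np.linalg.norm(z)

print(round(1 - s[1], 3))   # corresponds to lambda_2 of the normalized Laplacian
print(np.round(z, 2))
```

Up to an overall sign, `z` reproduces the vector used on the k = 2 co-clustering slide, and 1 minus the second singular value matches λ2 = .057. Only a 5 × 6 matrix had to be decomposed instead of an 11 × 11 one.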


Complete dataset

Alignments of pronunciations of 562 words for 424 varieties in the Netherlands against a reference pronunciation
- Pronunciations originate from the GTRP (Goeman and Taeldeman, 1996; Van den Berg, 2003; Wieling et al., 2007)
- The pronunciations of Delft were used as the reference
- Alignments were obtained using the Levenshtein algorithm with a Pointwise Mutual Information procedure (Wieling et al., 2009)

We generated a bipartite graph of varieties v and sound correspondences s
- There is an edge between v_i and s_j iff freq(s_j in v_i) > 0

All the following co-clustering results are obtained by applying the fast SVD method
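The edge rule amounts to thresholding a table of counts. A tiny sketch with NumPy; the counts below are made-up toy numbers for three hypothetical varieties and three hypothetical sound correspondences, not values from the GTRP data:

```python
import numpy as np

# Toy counts (hypothetical): freq[i, j] = how often sound correspondence s_j
# was observed in the alignments for variety v_i
freq = np.array([
    [12, 0, 3],
    [0, 5, 0],
    [7, 0, 0],
])

# Edge between v_i and s_j iff freq(s_j in v_i) > 0
A_prime = (freq > 0).astype(int)
print(A_prime)
```

Note that the rule discards how often a correspondence occurs: a single occurrence creates the same edge as a hundred.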


Distribution of sites


Results: {2,3,4} co-clusters in the data


Results: {2,3,4} clusters of varieties


Results: {2,3,4} clusters of sound correspondences

Sound correspondences specific for the Frisian area

Reference  [2]  [2]  [a]  [o]  [u]  [x]  [x]  [r]
Frisian    [I]  [i]  [i]  [E]  [E]  [j]  [z]  [x]

Sound correspondences specific for the Limburg area

Reference  [r]  [r]  [k]  [n]  [n]  [w]
Limburg    [ö]  [K]  [x]  [ö]  [K]  [f]

Sound correspondences specific for the Low Saxon area

Reference  [@]  [@]  [@]  [-]  [a]
Low Saxon  [m]  [N]  [ð]  [P]  [e]


To conclude

Bipartite spectral graph partitioning is a very useful method to simultaneously cluster
- varieties and sound correspondences
- words and documents
- genes and conditions
- ... and ...

Do you now understand the title?

Bipartite spectral graph partitioning to co-cluster varieties and sound correspondences


Any questions?

Thank You!


References

Boudewijn van den Berg. 2003. Phonology & Morphology of Dutch & Frisian Dialects in 1.1 million transcriptions. Goeman-Taeldeman-Van Reenen project 1980–1995, Meertens Instituut Electronic Publications in Linguistics 3. Meertens Instituut (CD-ROM), Amsterdam.

Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1–7):107–117.

Fan Chung. 1997. Spectral Graph Theory. American Mathematical Society.

Inderjit Dhillon. 2001. Co-clustering documents and words using bipartite spectral graph partitioning. Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 269–274. ACM, New York.

Richard Duda, Peter Hart, and David Stork. 2001. Pattern Classification. Wiley, New York.

Ton Goeman and Johan Taeldeman. 1996. Fonologie en morfologie van de Nederlandse dialecten. Een nieuwe materiaalverzameling en twee nieuwe atlasprojecten. Taal en Tongval, 48:38–59.

Wilbert Heeringa. 2004. Measuring Dialect Pronunciation Differences using Levenshtein Distance. Ph.D. thesis, Rijksuniversiteit Groningen.

Stuart Lloyd. 1982. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137.

Ulrike von Luxburg. 2007. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416.

Jianbo Shi and Jitendra Malik. 2000. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905.

Martijn Wieling, Wilbert Heeringa, and John Nerbonne. 2007. An aggregate analysis of pronunciation in the Goeman-Taeldeman-Van Reenen-Project data. Taal en Tongval, 59(1):84–116.

Martijn Wieling, Jelena Prokic, and John Nerbonne. 2009. Evaluating the pairwise string alignment of pronunciations. In: Lars Borin and Piroska Lendvai (eds.) Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities, and Education (LaTeCH - SHELT&R 2009) Workshop at the 12th Meeting of the European Chapter of the Association for Computational Linguistics. Athens, 30 March 2009, pp. 26–34.
