Transcript of: Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

Neural Computation, June 2003; 15(6):1373-1396. Presentation for CSE291 sp07.

M. Belkin (1), P. Niyogi (2)

(1) University of Chicago, Department of Mathematics
(2) University of Chicago, Departments of Computer Science and Statistics

5/10/07

M. Belkin P. Niyogi () Laplacian eigs for dim reduction 5/10/07 CSE291 1 / 24


Outline

1. Introduction: Motivation; The Algorithm

2. Explaining the Algorithm: The optimization; Continuous Manifolds - The Heat Kernel

3. Results: The Swiss roll; Syntactic structure of words



The problem

In AI, information retrieval, and data mining we often get intrinsically low m-dimensional data lying in a high N-dimensional space. Recovering the intrinsic dimensionality results in:

- Smaller data sets
  - Faster computations
  - Reduced space needed
- More meaningful representations
- ... and much more


Non-Linearity

PCA, MDS, and their variations create linear embeddings of the data. For non-linear structures these linear embeddings fail. IsoMap, LLE, and Laplacian Eigenmaps partially solve this problem. The heart of these approaches is some kind of non-linear method:

- IsoMap: an adjacency graph with edges between "neighbor" nodes
- LLE: each node is expressed by a limited number of neighbor nodes
- Laplacian Eigenmaps: an adjacency graph with edges between "neighbor" nodes


Step 1. Construct the Graph

We construct the adjacency graph A, putting an edge (i, j) if xi and xj are "close". Close may mean:

- ε-neighborhoods: ||xi − xj||² < ε
- n nearest neighbors
- A combination of the above (at most n nearest neighbors with ||xi − xj||² < ε)
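These neighborhood rules can be sketched in code. The following is a brute-force illustration; the `knn_adjacency` helper and the toy data are made up here, not taken from the paper, which instead points to efficient approximate nearest-neighbor techniques:

```python
import numpy as np

def knn_adjacency(X, n_neighbors):
    """0/1 adjacency matrix: edge (i, j) if j is among the n nearest
    neighbors of i, or vice versa (brute-force O(n^2) version)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    n = len(X)
    A = np.zeros((n, n))
    for i in range(n):
        # argsort index 0 is the point itself (distance 0), so skip it.
        for j in np.argsort(sq[i])[1:n_neighbors + 1]:
            A[i, j] = A[j, i] = 1.0   # symmetrize: keep the edge if either end chose it
    return A

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))          # 20 made-up points in 3-D
A = knn_adjacency(X, n_neighbors=4)
```

The ε-neighborhood rule would instead threshold the same distance matrix, `A = (sq < eps) - np.eye(n)`.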


Step 2. Choose the Weights

The edges have weights that can be:

- Simple-minded: 1 if connected, 0 otherwise
- Heat kernel: Wij = e^(−||xi − xj||² / t)

We note that with t = ∞ we get the simple-minded approach. The second method includes more information in our model.
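A minimal sketch of the heat-kernel weighting on a made-up toy graph; it also illustrates the remark above, since letting t grow recovers the simple-minded 0/1 weights:

```python
import numpy as np

def heat_kernel_weights(X, A, t):
    """Weight each edge of adjacency matrix A by exp(-||xi - xj||^2 / t)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return A * np.exp(-sq / t)

# Three made-up points; a fully connected toy graph.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
A = np.ones((3, 3)) - np.eye(3)
W = heat_kernel_weights(X, A, t=1.0)
# As t grows, the weights approach the simple-minded 0/1 adjacency.
W_inf = heat_kernel_weights(X, A, t=1e12)
```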


Step 3. The Laplacian eigenvalues

Let A be the adjacency (weight) matrix and D the degree matrix of the graph. We introduce the Laplacian matrix L = D − A and solve the generalized eigenvector problem

Lf = λDf

We sort the eigenvectors according to their eigenvalues, 0 = λ0 ≤ λ1 ≤ λ2 ≤ ... ≤ λk−1. We leave out f0 and use the next m eigenvectors for embedding in m-dimensional Euclidean space:

xi → (f1(i), ..., fm(i))
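The generalized problem Lf = λDf can be reduced to an ordinary symmetric eigenproblem, which is one standard way to compute this embedding. A sketch on a toy 4-cycle graph; the reduction via D^(−1/2) is an implementation choice, not something the slides prescribe:

```python
import numpy as np

def laplacian_embedding(W, m):
    """Solve L f = lambda D f with L = D - W, drop the trivial f0,
    and use the next m eigenvectors as coordinates."""
    deg = W.sum(axis=1)
    L = np.diag(deg) - W
    # Symmetric reduction: with S = D^(-1/2) L D^(-1/2), if S u = lambda u
    # then f = D^(-1/2) u solves L f = lambda D f.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    S = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    vals, U = np.linalg.eigh(S)        # eigenvalues in ascending order
    F = d_inv_sqrt[:, None] * U
    return F[:, 1:m + 1]               # skip f0 (lambda0 = 0)

# Toy graph: a 4-cycle with unit weights.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
Y = laplacian_embedding(W, m=2)        # each row: 2-D coordinates of a node
```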


A quick overview

- There are several efficient approximate techniques for finding the nearest neighbors.
- The algorithm consists of a few local computations and one sparse eigenvalue problem.
- It is a member of the IsoMap/LLE family of algorithms.
- Like LLE and IsoMap, it does not give good results for non-convex manifolds.


Our minimization problem for 1 Dimension

Let y = (y1, y2, ..., yn)ᵀ be our mapping.

We would like to minimize

∑ij (yi − yj)² Wij

under appropriate constraints.

∑ij (yi − yj)² Wij = ∑ij (yi² + yj² − 2 yi yj) Wij = ∑i yi² Dii + ∑j yj² Djj − 2 ∑ij yi yj Wij = 2 yᵀ L y

So we need to find argminy yᵀ L y.
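The identity derived above can be checked numerically; a quick sketch with a random symmetric weight matrix (made up for illustration):

```python
import numpy as np

# Numeric check of the identity sum_ij (yi - yj)^2 Wij = 2 y^T L y,
# with L = D - W the graph Laplacian.
rng = np.random.default_rng(1)
W = rng.random((5, 5))
W = (W + W.T) / 2                  # symmetric weights
np.fill_diagonal(W, 0.0)           # no self-loops
D = np.diag(W.sum(axis=1))
L = D - W                          # graph Laplacian
y = rng.normal(size=5)             # an arbitrary 1-D mapping

lhs = sum((y[i] - y[j]) ** 2 * W[i, j] for i in range(5) for j in range(5))
rhs = 2 * (y @ L @ y)
```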


Our Constraints

The constraint

yᵀ D y = 1

removes an arbitrary scaling factor in the embedding. Solving argminy yᵀ L y subject to yᵀ D y = 1 is exactly the generalized eigenvalue problem Ly = λDy. The additional constraint yᵀ D 1 = 0 removes the trivial solution 1 (with λ0 = 0). So the final problem is:

argminy yᵀ L y, subject to yᵀ D y = 1 and yᵀ D 1 = 0

The general problem for m dimensions is similar and gives exactly the eigenvectors of the first m smallest nonzero eigenvalues.
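On a toy graph (made up for illustration) we can confirm that the generalized eigenvectors can be scaled to satisfy yᵀ D y = 1, and that the nontrivial ones automatically satisfy yᵀ D 1 = 0, since generalized eigenvectors are D-orthogonal and the trivial solution is the constant vector:

```python
import numpy as np

# Generalized eigenvectors of L y = lambda D y via the symmetric
# reduction S = D^(-1/2) L D^(-1/2).
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # a small connected graph
deg = W.sum(axis=1)
D = np.diag(deg)
L = D - W
d_inv_sqrt = 1.0 / np.sqrt(deg)
S = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
vals, U = np.linalg.eigh(S)
Y = d_inv_sqrt[:, None] * U                  # generalized eigenvectors
Y = Y / np.sqrt((Y * (D @ Y)).sum(axis=0))   # enforce y^T D y = 1 (already ~1 here)
ones = np.ones(4)
```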


Sum Up

So each eigenvector is a function from the nodes to R such that "close by" points are assigned "close by" values.

The eigenvalue of each eigenfunction gives a measure of how "close by" the values of close-by points are.

By using the first m eigenfunctions to determine our m dimensions, we have our solution.


Continuous Manifolds

When we are using continuous spaces, the graph Laplacian becomes the Laplace-Beltrami operator. (More on this by Nakul.)

And our optimization problem is to find functions f that map the manifold points to R in a way that

∫M ||∇f(x)||² is minimal.

Intuitively, minimizing the gradient minimizes the difference between the values assigned to close-by points.


Heat Kernels

Continuous manifolds give the idea for the heat kernel, that is, assigning weights

Wij = e^(−||xi − xj||² / t)

The heat kernel is the fundamental solution of the heat equation associated with the Laplace-Beltrami operator; as a result, by assigning the weights according to it we get a better approximation of the "ideal" infinite-points case.


One more Swiss roll!


Unfolding the Swiss roll - the parameters t,N

N = number of nearest neighbors, t = the heat kernel parameter.
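The whole pipeline on a small synthetic Swiss roll might look as follows; the sample size and the values of N and t below are made-up illustration settings, not the ones used in the paper's figures:

```python
import numpy as np

# Swiss roll sample: a 2-D sheet (angle x height) rolled into 3-D.
rng = np.random.default_rng(0)
n = 300
theta = 1.5 * np.pi * (1 + 2 * rng.random(n))    # roll angle
height = 21 * rng.random(n)
X = np.c_[theta * np.cos(theta), height, theta * np.sin(theta)]

# Step 1-2: N-nearest-neighbor graph with heat-kernel weights.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
N, t = 10, 25.0                                  # neighbors, heat-kernel parameter
A = np.zeros((n, n))
for i in range(n):
    A[i, np.argsort(sq[i])[1:N + 1]] = 1.0
A = np.maximum(A, A.T)                           # symmetrize
W = A * np.exp(-sq / t)

# Step 3: generalized eigenvectors of L f = lambda D f, trivial f0 dropped.
deg = W.sum(axis=1)
L = np.diag(deg) - W
d_inv_sqrt = 1.0 / np.sqrt(deg)
S = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
vals, U = np.linalg.eigh(S)
Y = (d_inv_sqrt[:, None] * U)[:, 1:3]            # 2-D embedding
```

Plotting `Y` colored by `theta` would show the roll unfolded, as in the slide's figure.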


Understanding syntactic structure of words

Input: the 300 most frequent words of the Brown corpus, each represented by the frequencies of its left and right neighbors (a 600-dimensional space).

The algorithm was run with N = 14, t = ∞.


Understanding the syntactic structure of words

Three parts of the output: we can see that verbs, prepositions, and auxiliary and modal verbs are grouped together.


Discussion

- Laplacian Eigenmaps is part of the IsoMap/LLE family.
- The authors actually prove that LLE approximately minimizes the same value.
- Many times we need to skip eigenvectors, as eigenvectors with bigger eigenvalues give better results; variations for doing so exist.
- Although all these algorithms give good results for convex non-linear structures, they do not give much better results in other cases.
