choo dimred lecture 2014 - Visualizationpoloclub.gatech.edu/cse6242/2014spring/lectures/... ·...
Dimension Reduction
CSE 6242 A / CX 4242 DVA March 6, 2014
Guest Lecturer: Jaegul Choo

Data is Too Big To Analyze
• Limited memory size
  • Data may not fit into the memory of your machine
• Slow computation
  • 10^6-dim vs. 10-dim vectors for Euclidean distance computation
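
To make the cost argument concrete, here is a minimal sketch in Python (standard library only). The function name `euclidean_distance` and the test vectors are chosen here for illustration; the point is only that the work grows linearly with the number of dimensions.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors; cost is O(len(a))."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# A 10-dim pair needs 10 subtract/square/add steps; a 10^6-dim pair needs 10^6.
low_a, low_b = [0.0] * 10, [1.0] * 10
high_a, high_b = [0.0] * 10**6, [1.0] * 10**6

d_low = euclidean_distance(low_a, low_b)
d_high = euclidean_distance(high_a, high_b)
```

Each distance in the 10^6-dim case touches 100,000 times more numbers than in the 10-dim case, which is the motivation for reducing dimensions before distance-heavy computations.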

Two Axes of Data Set
• No. of data items
  • How many data items?
• No. of dimensions
  • How many dimensions represent each item?
[Figure: data matrix with axes "Data item index" and "Dimension index"]
Columns as data items vs. rows as data items
We will use this during lecture

Dimension Reduction: Let’s Reduce Data (along Dimension Axis)
[Diagram: "Dimension Reduction" box. Inputs: high-dim data, no. of dimensions, additional info about data, other parameters. Outputs: low-dim data, dim-reducing transformer for new data. A legend marks some items as user-specified.]

What You Get from DR
Obviously,
• Less storage
• Faster computation
More importantly,
• Noise removal (improving quality of data)
  • Leads to better performance on tasks
• 2D/3D representation
  • Enables visual data exploration

Applications
Traditionally,
• Microarray data analysis
• Information retrieval
• Face recognition
• Protein disorder prediction
• Network intrusion detection
• Document categorization
• Speech recognition
More interestingly,
• Interactive visualization of high-dimensional data

Visualization
Visualizing “Map of Science”
http://www.mapofscience.com

Two Main Techniques
1. Feature selection
  • Selects a subset of the original variables as reduced dimensions
  • For example, the number of genes responsible for a particular disease may be small
2. Feature extraction
  • Each reduced dimension involves multiple original dimensions
  • Active research area
Note that Feature = Variable = Dimension

Feature Selection
What is the optimal subset of m features that maximizes a given criterion?
• Widely-used criteria
  • Information gain, correlation, …
• Typically combinatorial optimization problems
• Therefore, greedy methods are popular
  • Forward selection: Empty set → add one variable at a time
  • Backward elimination: Entire set → remove one variable at a time
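
Greedy forward selection can be sketched as follows. This is only an illustration under stated assumptions: `toy_criterion` (mean absolute correlation with the target minus mean absolute correlation among the selected features) is a hypothetical criterion made up for this example, and the feature names and data are invented.

```python
import math

def pearson(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0

def toy_criterion(subset, features, target):
    """Hypothetical criterion: mean relevance to the target minus mean redundancy."""
    relevance = sum(abs(pearson(features[f], target)) for f in subset) / len(subset)
    pairs = [(f, g) for i, f in enumerate(subset) for g in subset[i + 1:]]
    if not pairs:
        return relevance
    redundancy = sum(abs(pearson(features[f], features[g])) for f, g in pairs) / len(pairs)
    return relevance - redundancy

def forward_selection(features, target, m, criterion=toy_criterion):
    """Start from the empty set and add the best remaining variable each round."""
    selected, remaining = [], list(features)
    while remaining and len(selected) < m:
        best = max(remaining, key=lambda f: criterion(selected + [f], features, target))
        selected.append(best)
        remaining.remove(best)
    return selected

a = [1, 2, 3, 4, 5, 6]
b = [1, -1, -1, -1, -1, 1]                      # uncorrelated with a
features = {"a": a, "a_copy": [2 * x for x in a], "b": b}
target = [2 * x + y for x, y in zip(a, b)]

selected = forward_selection(features, target, m=2)
```

With this toy criterion the second pick is `b` rather than the redundant copy of `a`, even though the copy is individually more correlated with the target. Backward elimination would run the same loop in reverse, starting from all features and dropping the one whose removal hurts the criterion least.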

From now on, we will discuss only feature extraction

Aspects of DR
• Linear vs. Nonlinear
• Unsupervised vs. Supervised
• Global vs. Local
• Feature vectors vs. Similarity (as an input)

Aspects of DR: Linear vs. Nonlinear
Linear
• Represents each reduced dimension as a linear combination of original dimensions
  • e.g., Y1 = 3*X1 – 4*X2 + 0.3*X3 – 1.5*X4,
          Y2 = 2*X1 + 3.2*X2 – X3 + 2*X4
• Naturally capable of mapping new data to the same space
[Diagram: high-dim data → Dimension Reduction → low-dim data]
High-dim data:
      D1     D2
X1    1      1
X2    1      0
X3    0      2
X4    1      1
Low-dim data:
      D1     D2
Y1    1.75   -0.27
Y2    -0.21  0.58
Nonlinear
• More complicated, but generally more powerful
• Recently popular topics
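
A linear method is just a matrix applied to each data item, which is why new data can be mapped with no extra work. The sketch below uses the example coefficients for Y1 and Y2 and the two 4-dimensional items D1 and D2 from the example table; `reduce_linear` and `new_item` are names and values invented here. The low-dim numbers shown in the example table are not reproduced by these coefficients, so the formula and the table are presumably separate illustrations.

```python
# Rows of W hold the example coefficients for Y1 and Y2.
W = [
    [3.0, -4.0, 0.3, -1.5],   # Y1 = 3*X1 - 4*X2 + 0.3*X3 - 1.5*X4
    [2.0, 3.2, -1.0, 2.0],    # Y2 = 2*X1 + 3.2*X2 - X3 + 2*X4
]

def reduce_linear(W, x):
    """Map one high-dimensional vector x to the reduced space: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Columns as data items: D1 and D2 read off the 4-dimensional example table.
D1 = [1.0, 1.0, 0.0, 1.0]
D2 = [1.0, 0.0, 2.0, 1.0]

y1 = reduce_linear(W, D1)
y2 = reduce_linear(W, D2)

# A previously unseen item (hypothetical values) maps into the same space with the same W.
new_item = [0.0, 1.0, 1.0, 0.0]
y_new = reduce_linear(W, new_item)
```

Nonlinear methods generally have no such fixed matrix, so mapping a new item usually requires rerunning the method or an additional out-of-sample step.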

Aspects of DR: Unsupervised vs. Supervised
Unsupervised
• Uses only the input data
Supervised
• Uses the input data + additional info
  • e.g., grouping label
[Diagram: the Dimension Reduction box from the overview slide, repeated with its "Additional info about data" input]

Aspects of DR: Global vs. Local
Dimension reduction typically tries to preserve all the relationships/distances in data
• Information loss is unavoidable!
Then, what would you care about?
Global
• Treats all pairwise distances as equally important
  • Tends to care more about larger distances
Local
• Focuses on small distances, neighborhood relationships
• Active research area, a.k.a. manifold learning

Aspects of DR: Feature vectors vs. Similarity (as an input)
• Typical setup (feature vectors as an input)
• Some methods take a similarity matrix instead
  • (i,j)-th component indicates similarity between i-th and j-th data items
• Some methods internally convert feature vectors to a similarity matrix before performing dimension reduction
[Diagram: high-dim data → similarity matrix → Dimension Reduction → low-dim data; the step from similarity matrix to low-dim data is a.k.a. Graph Embedding]
Why is it called graph embedding?
• Similarity matrix can be viewed as a graph where similarity represents edge weight
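
The conversion from feature vectors to a similarity matrix, and the graph view of that matrix, can be sketched as below. The Gaussian (RBF) kernel and the `sigma` parameter are assumptions made for this example; methods differ in which similarity they use. The function names are also invented here.

```python
import math

def similarity_matrix(points, sigma=1.0):
    """Gaussian similarity: S[i][j] = exp(-||xi - xj||^2 / (2 sigma^2))."""
    n = len(points)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            S[i][j] = math.exp(-sq / (2.0 * sigma ** 2))
    return S

def to_weighted_edges(S, threshold=0.0):
    """View the similarity matrix as a graph: one weighted edge per pair i < j."""
    n = len(S)
    return [(i, j, S[i][j]) for i in range(n) for j in range(i + 1, n)
            if S[i][j] > threshold]

points = [[0.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
S = similarity_matrix(points)
edges = to_weighted_edges(S)
```

Items 0 and 1 are close, so their edge weight is much larger than either edge to item 2. A method that takes similarity input only ever sees `S` (or `edges`), never the original coordinates.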

Methods
• Traditional
  • Principal component analysis (PCA)
  • Multidimensional scaling (MDS)
  • Linear discriminant analysis (LDA)
• Advanced (nonlinear, kernel, manifold learning)
  • Isometric feature mapping (Isomap)
  • t-distributed stochastic neighbor embedding (t-SNE)
* Matlab codes are available at http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html

Principal Component Analysis
• Finds the axis showing the greatest variation, and projects all points onto this axis
• Reduced dimensions are orthogonal
• Algorithm: Eigen-decomposition
• Pros: Fast
• Cons: Limited performance
[Figure: point cloud with principal axes PC1 and PC2; source: http://en.wikipedia.org/wiki/Principal_component_analysis]
Aspects: Linear, Unsupervised, Global, Feature vectors
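
A minimal sketch of the first principal component, assuming rows are data items (the lecture may use the opposite layout). To stay dependency-free it finds the top eigenvector of the covariance matrix by power iteration instead of a full eigen-decomposition; a real implementation would normally call a linear algebra library. All names and the sample data are invented here.

```python
import math

def center(data):
    """Subtract the per-dimension mean (rows are data items in this sketch)."""
    n, d = len(data), len(data[0])
    mean = [sum(row[k] for row in data) / n for k in range(d)]
    return [[row[k] - mean[k] for k in range(d)] for row in data], mean

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]

def top_eigenvector(C, iters=500):
    """Power iteration: converges to the eigenvector of the largest eigenvalue."""
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Points scattered around the line y = 2x.
pca_data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]]
centered, mean = center(pca_data)
C = covariance(centered)
pc1 = top_eigenvector(C)

# Project every centered point onto PC1 to get its 1D coordinate.
scores = [sum(r[k] * pc1[k] for k in range(2)) for r in centered]
```

The direction `pc1` comes out close to the line the points were scattered around, and the variance of `scores` is at least as large as the variance along either original axis.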

Principal Component Analysis: Document Visualization

Multidimensional Scaling (MDS)
Intuition
• Tries to preserve given ideal pairwise distances in low-dimensional space
[Figure labels: ideal distance, actual distance]
• Metric MDS
  • Preserves given ideal distance values
• Nonmetric MDS
  • When you only know/care about ordering of distances
  • Preserves only the orderings of distance values
• Algorithm: gradient-descent type
  cf. classical MDS (with Euclidean distances) is the same as PCA
Aspects: Nonlinear, Unsupervised, Global, Similarity input
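
A gradient-descent-type metric MDS can be sketched as below. The objective used here, the sum of squared differences between actual and ideal distances, is one common choice and is an assumption of this example, as are the step size, iteration count, and all names.

```python
import math
import random

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def stress(Y, ideal):
    """Sum over pairs of (actual distance - ideal distance)^2."""
    n = len(Y)
    return sum((dist(Y[i], Y[j]) - ideal[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def mds_gradient_descent(ideal, Y0, steps=1000, lr=0.02):
    """Plain gradient descent on the stress, starting from layout Y0."""
    Y = [row[:] for row in Y0]
    n, dim = len(Y), len(Y[0])
    for _ in range(steps):
        grads = [[0.0] * dim for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = dist(Y[i], Y[j]) or 1e-12   # guard against coincident points
                coeff = 2.0 * (d - ideal[i][j]) / d
                for k in range(dim):
                    grads[i][k] += coeff * (Y[i][k] - Y[j][k])
        Y = [[Y[i][k] - lr * grads[i][k] for k in range(dim)] for i in range(n)]
    return Y

# Ideal distances taken from the four corners of a unit square.
corners = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
ideal = [[dist(p, q) for q in corners] for p in corners]

rng = random.Random(0)
Y0 = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in corners]
Y = mds_gradient_descent(ideal, Y0)
```

Only the matrix `ideal` is used, never the corner coordinates, which is what "similarity input" means in practice. The recovered layout is determined only up to rotation, reflection, and translation, and gradient descent may stop in a local minimum.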

Multidimensional Scaling: Sammon’s mapping
Sammon’s mapping
• Local version of MDS
• Down-weights errors in large distances
• Algorithm: gradient-descent type
Aspects: Nonlinear, Unsupervised, Local, Similarity input
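
The down-weighting can be seen directly in Sammon's error, which divides each squared error by the ideal distance. The sketch below compares it with an unweighted stress on two hand-made cases; the distance matrices and function names are invented for illustration.

```python
def pairs(n):
    return [(i, j) for i in range(n) for j in range(i + 1, n)]

def raw_stress(ideal, actual):
    """Unweighted sum of squared distance errors."""
    return sum((ideal[i][j] - actual[i][j]) ** 2 for i, j in pairs(len(ideal)))

def sammon_stress(ideal, actual):
    """Sammon's error: each squared error is divided by the ideal distance."""
    ps = pairs(len(ideal))
    scale = sum(ideal[i][j] for i, j in ps)
    return sum((ideal[i][j] - actual[i][j]) ** 2 / ideal[i][j] for i, j in ps) / scale

ideal = [[0.0, 1.0, 10.0], [1.0, 0.0, 10.0], [10.0, 10.0, 0.0]]
# Same absolute error of 0.5, once on the small distance and once on a large one.
off_on_small = [[0.0, 1.5, 10.0], [1.5, 0.0, 10.0], [10.0, 10.0, 0.0]]
off_on_large = [[0.0, 1.0, 10.5], [1.0, 0.0, 10.0], [10.5, 10.0, 0.0]]

e_small = sammon_stress(ideal, off_on_small)
e_large = sammon_stress(ideal, off_on_large)
```

The unweighted stress cannot tell the two cases apart, while Sammon's error penalizes the mistake on the small distance ten times more, since the ideal distances differ by a factor of ten.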

Multidimensional Scaling: Force-directed graph layout
Force-directed graph layout
• Rooted in graph visualization, but essentially a variant of metric MDS
• Spring-like attractive + repulsive forces between nodes
• Algorithm: gradient-descent type
• Widely used in visualization
  • Aesthetically pleasing results
  • Simple and intuitive
  • Interactivity
Aspects: Nonlinear, Unsupervised, Global, Similarity input

Multidimensional Scaling: Force-directed graph layout
Demos
• Prefuse
  • http://prefuse.org/gallery/graphview/
• D3: http://d3js.org/
  • http://bl.ocks.org/4062045

Multidimensional Scaling
In all variants,
• Pros: widely used (works well in general)
• Cons: slow
  • Nonmetric MDS is much slower than metric MDS

Linear Discriminant Analysis
What if clustering information is available?
LDA tries to separate clusters by
• Placing different clusters as far apart as possible
• Making each cluster as compact as possible
[Figure: panels (a) and (b)]

Linear Discriminant Analysis vs. Principal Component Analysis
2D visualization of a mixture of 7 Gaussians in 1000 dimensions
[Figure: two panels, "Linear discriminant analysis (Supervised)" and "Principal component analysis (Unsupervised)"]

Linear Discriminant Analysis
Maximally separates clusters by
• Placing different clusters as far apart as possible
• Making each cluster as compact as possible
• Algorithm: generalized eigendecomposition
• Pros: shows cluster structure better
• Cons: may distort original relationship of data
Aspects: Linear, Supervised, Global, Feature vectors
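
For exactly two clusters, the generalized eigenproblem has a closed-form solution: the discriminant direction is proportional to Sw^{-1}(m_a - m_b), where Sw is the within-cluster scatter and m_a, m_b are the cluster means. The sketch below covers only that two-cluster, 2D special case; the data and names are invented for illustration.

```python
def mean2(vs):
    n = len(vs)
    return [sum(v[k] for v in vs) / n for k in range(2)]

def scatter2(vs, m):
    """2x2 scatter matrix of one cluster around its mean m."""
    S = [[0.0, 0.0], [0.0, 0.0]]
    for v in vs:
        d = [v[0] - m[0], v[1] - m[1]]
        for a in range(2):
            for b in range(2):
                S[a][b] += d[a] * d[b]
    return S

def fisher_direction(class_a, class_b):
    """Two-cluster LDA direction: w = Sw^{-1} (m_a - m_b)."""
    ma, mb = mean2(class_a), mean2(class_b)
    Sa, Sb = scatter2(class_a, ma), scatter2(class_b, mb)
    Sw = [[Sa[i][j] + Sb[i][j] for j in range(2)] for i in range(2)]
    det = Sw[0][0] * Sw[1][1] - Sw[0][1] * Sw[1][0]
    inv = [[Sw[1][1] / det, -Sw[0][1] / det],
           [-Sw[1][0] / det, Sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

# Two clusters stretched along x but separated along y.
class_a = [[0.0, 0.0], [2.0, 0.2], [4.0, -0.1], [6.0, 0.1], [8.0, 0.0]]
class_b = [[0.0, 1.0], [2.0, 1.2], [4.0, 0.9], [6.0, 1.1], [8.0, 1.0]]

w = fisher_direction(class_a, class_b)
proj_a = [w[0] * x + w[1] * y for x, y in class_a]
proj_b = [w[0] * x + w[1] * y for x, y in class_b]
```

The greatest variation in this data is along x, which is the axis an unsupervised method like PCA would favor, yet x carries no cluster information. The direction `w` points almost entirely along y, and the 1D projections of the two clusters do not overlap.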

Manifold Learning: Swiss Roll Data
Swiss roll data
• Originally in 3D
• What is the intrinsic dimensionality? (allowing flattening)
  → 2D
intrinsic ≈ semantic
What if your data has low intrinsic dimensionality but resides in high-dimensional space?

Manifold Learning: Goal and Approach
Manifold
• “Curvi-linear” low-dimensional structure of your data based on intrinsic dimensionality
Manifold learning
• Match intrinsic dimensions to axes of dimension-reduced output space
How?
• Each piece of manifold is approx. linear
• Utilize local neighborhood information
  • e.g., for a particular point,
    • Who are my neighbors?
    • How closely am I related to neighbors?
Demo available at http://www.math.ucla.edu/~wittman/mani/

Isomap (Isometric Feature Mapping)
Let’s preserve pairwise geodesic distance (along manifold)
• Compute geodesic distance as the shortest path length from k-nearest neighbor (k-NN) graph
• *Eigen-decomposition on pairwise geodesic distance matrix to obtain embedding that best preserves given distances
* Recall eigen-decomposition is the main algorithm of PCA
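
The first Isomap step, geodesic distance as shortest path length over a k-NN graph, can be sketched as follows. The half-circle data, the choice k=2, and the use of Floyd-Warshall for shortest paths are assumptions of this example; the second step (eigen-decomposition of the geodesic distance matrix) is left out.

```python
import math

def euclid(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_graph(points, k):
    """Symmetric k-NN graph as a dense matrix; math.inf means 'no edge'."""
    n = len(points)
    G = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        G[i][i] = 0.0
        neighbors = sorted((euclid(points[i], points[j]), j) for j in range(n) if j != i)
        for d, j in neighbors[:k]:
            G[i][j] = G[j][i] = d
    return G

def geodesic_distances(G):
    """All-pairs shortest paths (Floyd-Warshall) over the k-NN graph."""
    n = len(G)
    D = [row[:] for row in G]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][m] + D[m][j] < D[i][j]:
                    D[i][j] = D[i][m] + D[m][j]
    return D

# 11 points on a half circle of radius 1: a 1D manifold curled up in 2D.
n_pts = 11
arc = [[math.cos(math.pi * t / (n_pts - 1)), math.sin(math.pi * t / (n_pts - 1))]
       for t in range(n_pts)]

D = geodesic_distances(knn_graph(arc, k=2))
straight = euclid(arc[0], arc[-1])   # distance through the ambient space
along = D[0][-1]                     # distance along the manifold
```

The two ends of the arc are 2 apart in a straight line, but the path through neighbors is a bit over 3 (approaching pi as more points are sampled). Preserving `along` rather than `straight` is what lets Isomap unroll a curled structure. If k is too small the graph can become disconnected, leaving `math.inf` entries; if k is too large, edges can shortcut across the manifold.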

Isomap (Isometric Feature Mapping)
• Algorithm: all-pair shortest path computation + eigen-decomposition
• Pros: performs well in general
• Cons: slow (shortest path), sensitive to parameters
Aspects: Nonlinear, Unsupervised, Global (all pairwise distances are considered), Feature vectors

Isomap: Facial Data Example
Which one do you think is the best? (k is the value in k-NN graph)
[Figure: three Isomap embeddings of facial image data for k=8, k=22, and k=49, with annotations "Cluster structure", "Angle", and "Person"]

t-SNE (t-distributed Stochastic Neighbor Embedding)
Made specifically for visualization! (in very low dimension)
• Can reveal clusters without any supervision
  • e.g., spoken letter data
[Figure: two panels comparing PCA and t-SNE]
Official website: http://homepage.tudelft.nl/19j49/t-SNE.html

t-SNE (t-distributed Stochastic Neighbor Embedding)
How it works
• Converts distance into probability
  • Farther distance gets lower probability
• Then, minimizes differences in probability distribution between high- and low-dimensional spaces
  • KL divergence naturally focuses on neighborhood relationships
• Difference from SNE
  • t-SNE uses heavy-tailed t-distribution instead of Gaussian.
  • Suitable for dimension reduction to a very low dimension
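
The distance-to-probability idea and the KL objective can be sketched as below. This is a simplification, not the actual t-SNE procedure: it uses one global Gaussian bandwidth `sigma` and normalizes over all pairs at once, whereas t-SNE sets a per-point bandwidth from a perplexity parameter and symmetrizes conditional probabilities. It also only evaluates the objective for two hand-made layouts rather than optimizing it. All names and data are invented here.

```python
import math

def sq_dists(X):
    n = len(X)
    return [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
            for i in range(n)]

def gaussian(d2, sigma=1.0):
    return math.exp(-d2 / (2.0 * sigma ** 2))

def student_t(d2):
    return 1.0 / (1.0 + d2)   # heavy-tailed: decays polynomially, not exponentially

def pair_probabilities(X, kernel):
    """Turn distances into a probability distribution over pairs i != j."""
    D2 = sq_dists(X)
    n = len(X)
    W = [[kernel(D2[i][j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    total = sum(map(sum, W))
    return [[w / total for w in row] for row in W]

def kl_divergence(P, Q, eps=1e-12):
    return sum(p * math.log((p + eps) / (q + eps))
               for rp, rq in zip(P, Q) for p, q in zip(rp, rq) if p > 0)

X = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 5.0, 5.0]]   # items 0 and 1 are neighbors
Y_good = [[0.0, 0.0], [1.0, 0.0], [6.0, 6.0]]             # keeps 0 and 1 together
Y_bad = [[0.0, 0.0], [6.0, 6.0], [1.0, 0.0]]              # pulls 0 and 1 apart

P = pair_probabilities(X, gaussian)
kl_good = kl_divergence(P, pair_probabilities(Y_good, student_t))
kl_bad = kl_divergence(P, pair_probabilities(Y_bad, student_t))
```

Because each term is weighted by the high-dimensional probability `p`, pairs that are neighbors in the original space dominate the sum, so the layout that separates items 0 and 1 gets a much larger divergence. The heavy tail of `student_t` means moderately distant pairs still receive noticeable probability in the low-dimensional space, where the Gaussian would give them almost none.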

t-SNE (t-distributed Stochastic Neighbor Embedding)
• Algorithm: gradient-descent type
• Pros: works surprisingly well in 2D/3D visualization
• Cons: very slow
Aspects: Nonlinear, Unsupervised, Local, Similarity input

DR in Interactive Visualization
What can you do with a visualization obtained via dimension reduction?
• e.g., Multidimensional scaling applied to document data

DR in Interactive Visualization
As more data items are involved, analysis gets harder
• For n data items, users are given O(n^2) relations spatially encoded in visualization
  • Too many to understand in general
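
The quadratic growth comes from counting unordered pairs of items, n(n-1)/2. A small sketch (the function name is chosen here for illustration):

```python
def pairwise_relations(n):
    """Number of unordered pairs among n items: n(n-1)/2, i.e. O(n^2)."""
    return n * (n - 1) // 2

counts = {n: pairwise_relations(n) for n in (10, 100, 1000)}
```

Ten times more items means roughly a hundred times more pairwise relations encoded in the same plot.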

DR in Interactive Visualization: What to first look at?
Thus, people tend to look for a small number of objects that perceptually/visually stand out, e.g.,
• Outliers (if any)
More commonly,
• Subgroups/clusters
  • However, one cannot expect DR to always reveal clusters

DR in Interactive Visualization: What to first look at?
What if DR cannot reveal subgroups/clusters clearly? Or even worse, what if our data do not originally have any?
• Often, pre-defined grouping information is injected and color-coded.
• Such grouping information is usually obtained as
  • Pre-given labels along with data
  • Computed labels by clustering

Dimension Reduction in Action: Handwritten Digit Data Visualization
Visualization of handwritten digit data
Now we can obtain
• Cluster/data relationship
• Subcluster/outlier
[Figure annotations: Subcluster #1 in ‘5’; Subcluster #2 in ‘5’; Major data in ‘7’; Minor group #1 in ‘7’; Minor group #2 in ‘7’]
• Treating two subclusters in digit ‘5’ as separate clusters
  • Classification accuracy improved from 89% to 93% (LDA+k-NN)

Practitioner’s Guide: Caveats
Can you trust dimension reduction results?
• Expect significant distortion/information loss in 2D/3D
• What the algorithm thinks is best may not be what we think is best, e.g., PCA visualization of facial image data
[Figure: two panels, "(1, 2)-dimension" and "(3, 4)-dimension"]

Practitioner’s Guide: Caveats
How would you determine the best method and its parameters for your needs?
• Unlike typical data mining problems where only one shot is allowed, you can freely try out different methods with different parameters
• Basic understanding of methods will greatly help in applying them properly
  • What is a particular method trying to achieve? And how suitable is it to your needs?
  • What are the effects of increasing/decreasing parameters?

Practitioner’s Guide: General Recommendation
Want something simple and fast to visualize data?
• PCA, force-directed layout
Want to first try some manifold learning methods?
• Isomap
  • If it doesn’t show anything good, probably neither will anything else.
Have cluster labels to use? (pre-given or computed)
• LDA (supervised)
  • Supervised approach is sometimes the only viable option when your data do not have clearly separable clusters
No labels, but still want some clusters to be revealed? Or simply, want some state-of-the-art method for visualization?
• t-SNE (but, may be slow)

Practitioner’s Guide: Results Still Not Good?
Pre-process data properly as needed
• Data centering
  • Subtract the global mean from each vector
• Normalization
  • Make each vector have unit Euclidean norm
  • Otherwise, a few outliers can affect dimension reduction significantly
• Application-specific pre-processing
  • Document: TF-IDF weighting, remove too rare and/or short terms
  • Image: histogram normalization
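
Centering and normalization are each a few lines. The sketch below applies them separately to a small hand-made data set that contains one large-magnitude outlier; the names and data are invented for illustration, and rows are assumed to be data items.

```python
import math

def center(vectors):
    """Subtract the global mean vector from each vector."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[k] for v in vectors) / n for k in range(d)]
    return [[v[k] - mean[k] for k in range(d)] for v in vectors]

def normalize(vectors):
    """Scale each vector to unit Euclidean norm (zero vectors are left as-is)."""
    out = []
    for v in vectors:
        norm = math.sqrt(sum(x * x for x in v))
        out.append([x / norm for x in v] if norm > 0 else list(v))
    return out

raw = [[1.0, 2.0], [3.0, 4.0], [500.0, 600.0]]   # the last item is an outlier in scale

centered = center(raw)
unit = normalize(raw)
```

After normalization every item has length 1, so the third item can no longer dominate distance or variance computations purely because of its magnitude.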

Practitioner’s Guide: Too Slow?
• Apply PCA to reduce to an intermediate number of dimensions before the main dimension reduction step
  • t-SNE does it by default
  • The results may even be improved due to noise removed by PCA
• See if there is any approximated but faster version
  • Landmarked versions (only using a subset of data items)
    • e.g., landmarked Isomap
  • Linearized versions (the same criterion, but only allow linear mapping)
    • e.g., Laplacian Eigenmaps → Locality preserving projection

Practitioner’s Guide: Still need more?
Be creative! And feel free to tweak dimension reduction
• Play with its algorithm, convergence criteria, etc.
  • See if you can impose label information
  • Restrict the number of iterations to save computational time.
[Figure: two panels, "Original t-SNE" and "t-SNE with simple modification"]
The raison d’etre of DR is to serve us in exploring data and solving complicated real-world problems

Useful Resource
Nice review article by L.J.P. van der Maaten et al.
• http://www.iai.uni-bonn.de/~jz/dimensionality_reduction_a_comparative_review.pdf
Matlab toolbox for dimension reduction
• http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html
Matlab manifold learning demo
• http://www.math.ucla.edu/~wittman/mani/

Useful Resource: FODAVA Testbed Software
Available at http://fodava.gatech.edu/fodava-testbed-software
For a recent version, contact me at [email protected]