1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline...
Transcript of 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline...
![Page 1: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/1.jpg)
Warm-up as you walk in
1
1. https://www.sporcle.com/games/MrChewypoo/minimalist_disney
2. https://www.sporcle.com/games/Stanford0008/minimalist-cartoons-slideshow
3. https://www.sporcle.com/games/MrChewypoo/minimalist
![Page 2: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/2.jpg)
PCA
2
10-601 Introduction to Machine Learning
Pat Virtue
Machine Learning DepartmentSchool of Computer ScienceCarnegie Mellon University
Slide credits CMU MLD
![Page 3: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/3.jpg)
Welcome Micah!!
3
![Page 4: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/4.jpg)
Learning Paradigms
4
![Page 5: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/5.jpg)
PCA Outline
• Dimensionality Reduction– High-dimensional data– Learning (low dimensional) representations
• Principal Component Analysis (PCA)– Examples: 2D and 3D– Data for PCA– PCA Definition– Objective functions for PCA– PCA, Eigenvectors, and Eigenvalues– Algorithms for finding Eigenvectors / Eigenvalues
• PCA Examples– Image Compression– MRI Image Reconstruction
5
![Page 6: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/6.jpg)
DIMENSIONALITY REDUCTION
6
![Page 7: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/7.jpg)
High Dimension Data
Examples of high dimensional data:
– High resolution images (millions of pixels)
7
![Page 8: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/8.jpg)
High Dimension Data
Examples of high dimensional data:
– Multilingual News Stories (vocabulary of hundreds of thousands of words)
8
![Page 9: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/9.jpg)
High Dimension Data
Examples of high dimensional data:
– Brain Imaging Data (100s of MBs per scan)
9Image from https://pixabay.com/en/brain-mrt-magnetic-resonance-imaging-1728449/
Image from (Wehbe et al., 2014)
![Page 10: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/10.jpg)
PCA, Kernel PCA, ICA: Powerful unsupervised learning techniques for extracting hidden (potentially lower dimensional) structure from high dimensional datasets.
Learning Representations
Useful for:
• Visualization
• Further processing by machine learning algorithms
• More efficient use of resources (e.g., time, memory, communication)
• Statistical: fewer dimensions → better generalization
• Noise removal (improving data quality)
Slide from Nina Balcan
![Page 11: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/11.jpg)
PRINCIPAL COMPONENT ANALYSIS (PCA)
11
![Page 12: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/12.jpg)
Principal Component Analysis (PCA)
In case where data lies on or near a low d-dimensional linear subspace, axes of this subspace are an effective representation of the data.
Identifying the axes is known as Principal Components Analysis, and can be obtained by using classic matrix computation tools (Eigen or Singular Value Decomposition).
Slide from Nina Balcan
![Page 13: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/13.jpg)
2D Gaussian dataset
Slide from Barnabas Poczos
![Page 14: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/14.jpg)
1st PCA axis
Slide from Barnabas Poczos
![Page 15: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/15.jpg)
2nd PCA axis
Slide from Barnabas Poczos
![Page 16: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/16.jpg)
16GLBC – MSK Image Analysis
April 23, 2010
Growth Plate ImagingGrowth Plate Disruption and Limb Length Discrepancy
Images Courtesy H. Potter, H.S.S.
8 year-old boy with previous fracture and 4cm leg length discrepancy
![Page 17: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/17.jpg)
17GLBC – MSK Image Analysis
April 23, 2010
Growth Plate ImagingGrowth Plate Disruption and Limb Length Discrepancy
Images Courtesy H. Potter, H.S.S.
8 year-old boy with previous fracture and 4cm leg length discrepancy
![Page 18: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/18.jpg)
18GLBC – MSK Image Analysis
April 23, 2010
Growth Plate ImagingArea Measurement
![Page 19: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/19.jpg)
19GLBC – MSK Image Analysis
April 23, 2010
Growth Plate ImagingArea Measurement
Flatten Growth Plate to Enable 2D Area Measurement
![Page 20: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/20.jpg)
Data for PCA
We assume the data is centered
20
Q: What if your data is
not centered?
A: Subtract off the
sample mean
Slide from Matt Gormley
![Page 21: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/21.jpg)
Sample Covariance Matrix
The sample covariance matrix is given by:
21
Since the data matrix is centered, we rewrite as:
Slide from Matt Gormley
![Page 22: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/22.jpg)
Projections
Quiz: What is the projection of point 𝒙 onto vector 𝒗, assuming that ‖𝑣‖2 = 1?
A. 𝒗𝒙
B. 𝒗𝑇𝒙
C. 𝒗𝑇𝒙 𝒗
D. 𝒗𝑇𝒙 𝒙𝑇𝒗
22
![Page 23: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/23.jpg)
Principal Component Analysis (PCA)
Whiteboard
– PCA Sketch
– Objective functions for PCA
23
![Page 24: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/24.jpg)
Maximizing the Variance
Quiz: Consider the two projections below2. Which maximizes the variance?3. Which minimizes the reconstruction error?
24
Option A Option B
Slide from Matt Gormley
![Page 25: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/25.jpg)
Principal Component Analysis (PCA)
Whiteboard
– PCA, Eigenvectors, and Eigenvalues
– Algorithms for finding Eigenvectors / Eigenvalues
25
![Page 26: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/26.jpg)
PCA
26
Equivalence of Maximizing Variance and Minimizing Reconstruction Error
![Page 27: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/27.jpg)
PCA: the First Principal Component
27
![Page 28: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/28.jpg)
SVD for PCA
28
![Page 29: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/29.jpg)
Principal Component Analysis (PCA)
![Page 30: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/30.jpg)
Principal Component Analysis (PCA)
XTX v = λv , so v (the first PC) is the eigenvector of
sample correlation/covariance matrix 𝑋𝑇𝑋
Sample variance of projection v𝑇𝑋𝑇𝑋 v = 𝜆v𝑇v = 𝜆
Thus, the eigenvalue 𝜆 denotes the amount of variability captured along that dimension (aka amount of energy along
that dimension).
Eigenvalues 𝜆1 ≥ 𝜆2 ≥ 𝜆3 ≥ ⋯
• The 1st PC 𝑣1 is the the eigenvector of the sample covariance matrix𝑋𝑇𝑋 associated with the largest eigenvalue
• The 2nd PC 𝑣2 is the the eigenvector of the sample covariance matrix𝑋𝑇𝑋 associated with the second largest eigenvalue
• And so on …
Slide from Nina Balcan
![Page 31: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/31.jpg)
• For M original dimensions, sample covariance matrix is MxM, and has up to M eigenvectors. So M PCs.
• Where does dimensionality reduction come from?Can ignore the components of lesser significance.
0
5
10
15
20
25
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Var
ian
ce (
%)
How Many PCs?
© Eric Xing @ CMU, 2006-2011 31
• You do lose some information, but if the eigenvalues are small, you don’t lose much– M dimensions in original data – calculate M eigenvectors and eigenvalues– choose only the first D eigenvectors, based on their eigenvalues– final data set has only D dimensions
Variance (%) = ratio of variance along given principal component to total
variance of all principal components
![Page 32: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/32.jpg)
PCA EXAMPLES
32
![Page 33: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/33.jpg)
Projecting MNIST digits
33
Task Setting:1. Take 28x28 images of digits and project them down to K components2. Report percent of variance explained for K components3. Then project back up to 28x28 image to visualize how much information was preserved
![Page 34: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/34.jpg)
Projecting MNIST digits
34
Task Setting:1. Take 28x28 images of digits and project them down to 2 components2. Plot the 2 dimensional points
![Page 35: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/35.jpg)
Projecting MNIST digits
35
Task Setting:1. Take 28x28 images of digits and project them down to 2 components2. Plot the 2 dimensional points
![Page 36: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/36.jpg)
MRI Image Reconstruction
36
![Page 37: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/37.jpg)
MRI Image Reconstruction
37Heart MR Image: Akçakaya, et al, BM3D (2011)
Lots of redundant structure at patch level
![Page 38: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/38.jpg)
MRI Image Reconstruction
• Image Denoising
38
Noisy
α1 + α2 + α3 =
Cleanv1 v2 v3
![Page 39: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/39.jpg)
MRI Image Reconstruction
Quiz: If I have a 10x10 patch from a noisy image, which number of principal components kept will allow more noise in the resulting patch?
A. 100
B. 10
39
![Page 40: 1. disney 2. ...mgormley/courses/10601-f19/slides/... · 2020-03-05 · PCA Outline •Dimensionality Reduction –High-dimensional data –Learning (low dimensional) representations](https://reader034.fdocuments.us/reader034/viewer/2022052611/5f0514b17e708231d4112a51/html5/thumbnails/40.jpg)
Learning Objectives
Dimensionality Reduction / PCAYou should be able to…1. Define the sample mean, sample variance, and sample
covariance of a vector-valued dataset2. Identify examples of high dimensional data and common use
cases for dimensionality reduction3. Draw the principal components of a given toy dataset4. Establish the equivalence of minimization of reconstruction
error with maximization of variance5. Given a set of principal components, project from high to low
dimensional space and do the reverse to produce a reconstruction
6. Explain the connection between PCA, eigenvectors, eigenvalues, and covariance matrix
7. Use common methods in linear algebra to obtain the principal components
40