Machine Learning Unit 5, Introduction to Artificial Intelligence, Stanford online course
Transcript of Machine Learning Unit 5, Introduction to Artificial Intelligence, Stanford online course
Machine Learning
Unit 5, Introduction to Artificial Intelligence, Stanford online course
Made by: Maor Levy, Temple University, 2012
The unsupervised learning problem
Many data points, no labels
K-Means
Many data points, no labels
Choose a fixed number of clusters.
Choose cluster centers and point-cluster allocations to minimize the error:

$$\sum_{i \in \text{clusters}} \;\; \sum_{j \in \text{elements of cluster } i} \left\lVert x_j - \mu_i \right\rVert^2$$

We can't do this by exhaustive search, because there are too many possible allocations.
Algorithm:
◦ Fix cluster centers; allocate each point to the closest cluster.
◦ Fix the allocation; compute the best cluster centers.
x could be any set of features for which we can compute a distance (be careful about scaling).
* From Marc Pollefeys COMP 256 2003
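The two alternating steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's code: initialization is a simple random pick of `k` data points, and the restarts one would use in practice are omitted.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Alternate the two k-means steps until the centers stop moving."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # pick k points as initial centers
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Fix centers: allocate each point to the closest cluster
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Fix allocation: the best center for a cluster is the mean of its points
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

Each iteration can only lower (or keep) the squared error above, which is why the alternation converges, though possibly to a local minimum.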
K-Means
K-Means
* From Marc Pollefeys COMP 256 2003
K-means clustering using intensity alone and color alone.
[Figure panels: original image; clusters on intensity; clusters on color]
* From Marc Pollefeys COMP 256 2003
Results of K-Means Clustering:
Is an approximation to EM:
◦ Model (hypothesis space): mixture of N Gaussians
◦ Latent variables: correspondence of data and Gaussians
We notice:
◦ Given the mixture model, it's easy to calculate the correspondence.
◦ Given the correspondence, it's easy to estimate the mixture models.
K-Means
Data generated from mixture of Gaussians
Latent variables: Correspondence between Data Items and Gaussians
Expectation Maximization: Idea
Generalized K-Means (EM)
Gaussians
ML Fitting Gaussians
Learning a Gaussian Mixture (with known covariance)

E-step: compute the expected assignments of points to Gaussians:

$$E[z_{ij}] = \frac{p(x = x_i \mid \mu = \mu_j)}{\sum_{n=1}^{k} p(x = x_i \mid \mu = \mu_n)} = \frac{e^{-\frac{1}{2\sigma^2}(x_i - \mu_j)^2}}{\sum_{n=1}^{k} e^{-\frac{1}{2\sigma^2}(x_i - \mu_n)^2}}$$

M-step: re-estimate each mean from the expected assignments:

$$\mu_j \leftarrow \frac{\sum_i E[z_{ij}]\, x_i}{\sum_i E[z_{ij}]}$$
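The E-step and M-step for this known-variance, equal-weight 1-D mixture can be sketched directly from the formulas. This is a minimal illustration; the function name `em_step` and the equal-mixing-weight simplification are my own, not from the slides.

```python
import numpy as np

def em_step(x, mu, sigma2):
    """One EM iteration for a 1-D mixture of k Gaussians with known,
    shared variance sigma2 and equal mixing weights."""
    # E-step: responsibility E[z_ij] of Gaussian j for point x_i
    d2 = (x[:, None] - mu[None, :]) ** 2          # (n, k) squared distances
    w = np.exp(-d2 / (2.0 * sigma2))
    resp = w / w.sum(axis=1, keepdims=True)       # normalize over the k Gaussians
    # M-step: each mean becomes the responsibility-weighted average of the data
    mu_new = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
    return mu_new, resp
```

Iterating `em_step` from a rough initial guess for the means plays the same role as the two alternating steps of k-means, but with soft rather than hard assignments.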
Converges! Proof [Neal/Hinton, McLachlan/Krishnan]:
◦ The E/M step does not decrease the data likelihood.
◦ Converges to a local optimum or saddle point.
But it is subject to local optima.
Expectation Maximization
EM Clustering: Results
http://www.ece.neu.edu/groups/rpl/kmeans/
Number of clusters unknown.
Suffers (badly) from local minima.
Algorithm:
◦ Start a new cluster center if many points are “unexplained”.
◦ Kill a cluster center that doesn't contribute.
◦ (Use the AIC/BIC criterion for all this, if you want to be formal.)
Practical EM
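The BIC criterion mentioned above trades data fit against model complexity, so adding a cluster must "pay for itself" in likelihood. A minimal sketch (the function name and interface are my own, not from the slides):

```python
import numpy as np

def bic(log_likelihood, n_params, n_points):
    """Bayesian Information Criterion (lower is better): penalizes
    extra parameters, so a larger k must improve the fit enough to win."""
    return -2.0 * log_likelihood + n_params * np.log(n_points)
```

In practice one would fit EM for each candidate number of clusters, count the free parameters (k means, plus weights and covariances if they are free), and keep the k with the lowest BIC.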
Spectral Clustering
The Two Spiral Problem
Spectral Clustering: Overview
Data Similarities Block-Detection
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
Block matrices have block eigenvectors:

    1  1  0  0                     .71    0
    1  1  0  0    eigensolver      .71    0
    0  0  1  1        →             0    .71
    0  0  1  1                      0    .71

    λ1 = 2,  λ2 = 2,  λ3 = 0,  λ4 = 0

Near-block matrices have near-block eigenvectors: [Ng et al., NIPS 02]

    1   1  .2   0                  .71    0
    1   1   0 -.2    eigensolver   .69  -.14
    .2  0   1   1        →         .14   .69
    0  -.2  1   1                   0    .71

    λ1 = 2.02,  λ2 = 2.02,  λ3 = -0.02,  λ4 = -0.02

Eigenvectors and Blocks
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
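The near-block example above is easy to reproduce with a standard symmetric eigensolver. One caveat (an observation of mine, not from the slides): the top eigenvalue is repeated, so a solver may return any rotation of the slide's two eigenvectors within the dominant subspace; what is stable is that the subspace itself is spanned by near-block-indicator vectors.

```python
import numpy as np

# The near-block similarity matrix from the slide
A = np.array([[1.0,  1.0, 0.2,  0.0],
              [1.0,  1.0, 0.0, -0.2],
              [0.2,  0.0, 1.0,  1.0],
              [0.0, -0.2, 1.0,  1.0]])

vals, vecs = np.linalg.eigh(A)        # symmetric eigensolver, ascending order
order = np.argsort(-vals)             # re-sort by decreasing eigenvalue
vals, vecs = vals[order], vecs[:, order]
```

Projecting the block-indicator vector (1, 1, 0, 0)/√2 onto the span of the top two eigenvectors recovers almost all of its length, which is the "near-block eigenvectors" claim in numerical form.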
Can put items into blocks by eigenvectors:
Resulting clusters independent of row ordering:
Spectral Space
    1   1  .2   0        e1 = (.71, .69, .14, 0)
    1   1   0 -.2        e2 = (0, -.14, .69, .71)
    .2  0   1   1
    0  -.2  1   1

[Figure: the two blocks separate cleanly when points are plotted by their (e1, e2) coordinates]

    1  .2   1   0        e1 = (.71, .14, .69, 0)
    .2  1   0   1        e2 = (0, .69, -.14, .71)
    1   0   1 -.2
    0   1 -.2   1

[Figure: the same clusters appear in (e1, e2) coordinates despite the reordered rows]
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
The key advantage of spectral clustering is the spectral space representation:
The Spectral Advantage
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
Measuring Affinity
Intensity:

$$\mathrm{aff}(x, y) = \exp\left\{-\frac{1}{2\sigma_I^{2}}\,\lVert I(x) - I(y)\rVert^{2}\right\}$$

Distance:

$$\mathrm{aff}(x, y) = \exp\left\{-\frac{1}{2\sigma_d^{2}}\,\lVert x - y\rVert^{2}\right\}$$

Texture:

$$\mathrm{aff}(x, y) = \exp\left\{-\frac{1}{2\sigma_t^{2}}\,\lVert c(x) - c(y)\rVert^{2}\right\}$$
* From Marc Pollefeys COMP 256 2003
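All three affinities share the same Gaussian form, differing only in which features are compared and at what scale σ. A minimal sketch of that shared form (the function name and vectorized layout are my own):

```python
import numpy as np

def affinity(F, sigma):
    """Gaussian affinity aff(x, y) = exp(-||f(x) - f(y)||^2 / (2 sigma^2))
    for feature vectors (intensity, position, texture, ...) stacked as rows of F."""
    # Pairwise squared distances between all rows of F
    d2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

The role of σ previews the next slide: a small σ makes affinity fall off quickly, so only very close points look similar, while a large σ makes nearly everything similar.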
Scale affects affinity
* From Marc Pollefeys COMP 256 2003
* From Marc Pollefeys COMP 256 2003
The Space of Digits (in 2D)
M. Brand, MERL
Dimensionality Reduction with PCA
Fit a multivariate Gaussian.
Compute the eigenvectors of its covariance.
Project onto the eigenvectors with the largest eigenvalues.
Linear: Principal Components
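The three steps above can be sketched directly in NumPy. This is a minimal illustration of PCA, not the course's code; the function name and return values are my own.

```python
import numpy as np

def pca(X, m):
    """Project X (n points as rows) onto the m eigenvectors of the
    sample covariance with the largest eigenvalues."""
    mu = X.mean(axis=0)                       # fit the Gaussian: mean ...
    C = np.cov(X - mu, rowvar=False)          # ... and covariance
    vals, vecs = np.linalg.eigh(C)            # eigenvalues in ascending order
    W = vecs[:, np.argsort(-vals)[:m]]        # top-m principal directions
    return (X - mu) @ W, W                    # projected data, projection basis
```

Data stretched mostly along one direction is then summarized by a single coordinate along that direction, which is exactly the dimensionality reduction the slide describes.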
Other examples of unsupervised learning
Mean face (after alignment)
Slide credit: Santiago Serrano
Eigenfaces
Slide credit: Santiago Serrano
Isomap; Locally Linear Embedding
Non-Linear Techniques
Isomap, Science, M. Balasubramanian and E. L. Schwartz
SCAPE (Drago Anguelov et al.)
References:
◦ Sebastian Thrun and Peter Norvig, Artificial Intelligence, Stanford University.
http://www.stanford.edu/class/cs221/notes/cs221-lecture6-fall11.pdf