![Page 1: MIT 6.899 Learning and Inference in Vision Prof. Bill Freeman, wtf@mit.edu MW 2:30 – 4:00 Room: 34-301 Course web page:](https://reader035.fdocuments.us/reader035/viewer/2022081512/56649c745503460f94927481/html5/thumbnails/1.jpg)
MIT 6.899 Learning and Inference in Vision
• Prof. Bill Freeman, wtf@mit.edu
• MW 2:30 – 4:00
• Room: 34-301
• Course web page: http://www.ai.mit.edu/courses/6.899/
Reading class
• We’ll cover about 1 paper each class.
• Seminal or topical research papers in the intersection of machine learning and vision.
• One student will present each paper. Then we’ll discuss the paper as a class.
• One student will write a computer example illustrating the paper’s main idea.
Learning and Inference
• “Learning”: learn the parameter values or structure of a probabilistic model.
  – Look at many examples of people walking, and build up a probabilistic model relating video images to 3-d motions.
• “Inference”: infer hidden variables, given observations.
  – E.g., given a particular video of someone walking, infer their motions in 3-d.
Statistical dependencies between variables
Learning and Inference
Observed variables: y1, y2
Unobserved variables: x1, x2
Statistical dependencies between variables
Learning and Inference
Observed variables
Unobserved variables
“Learning”: learn this model, and the form of the statistical dependencies.
Statistical dependencies between variables
Learning and Inference
Observed variables: y1, y2
Unobserved variables: x1, x2
“Learning”: learn this model, and the form of the statistical dependencies.
“Inference”: given this model, and the observations, y1 & y2, infer x1 & x2, or their conditional distribution.
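As a toy illustration of the inference step, the sketch below computes the conditional distribution of two binary hidden variables x1, x2 given observations y1, y2 by brute-force enumeration. The chain structure (x1 coupled to x2, each yi depending only on xi) and every probability table are assumptions made up for illustration; the slide's figure does not specify them.

```python
from itertools import product

# Hypothetical model (all numbers are illustrative assumptions):
# a prior on x1, a coupling between x1 and x2, and a noisy observation of each xi.
P_X1 = {0: 0.5, 1: 0.5}
P_X2_GIVEN_X1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
P_Y_GIVEN_X = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}


def joint(x1, x2, y1, y2):
    """p(x1, x2, y1, y2) under the assumed chain-structured model."""
    return (P_X1[x1] * P_X2_GIVEN_X1[x1][x2]
            * P_Y_GIVEN_X[x1][y1] * P_Y_GIVEN_X[x2][y2])


def posterior(y1, y2):
    """Return p(x1, x2 | y1, y2) as a dict keyed by (x1, x2)."""
    unnorm = {(x1, x2): joint(x1, x2, y1, y2)
              for x1, x2 in product((0, 1), repeat=2)}
    z = sum(unnorm.values())  # this normalizer is p(y1, y2)
    return {xs: p / z for xs, p in unnorm.items()}


def map_estimate(y1, y2):
    """Most probable hidden configuration given the observations."""
    post = posterior(y1, y2)
    return max(post, key=post.get)


if __name__ == "__main__":
    for xs, p in sorted(posterior(1, 1).items()):
        print(xs, round(p, 3))
```

Enumeration is only feasible for tiny models like this one; the number of hidden configurations grows exponentially with the number of variables, which is one motivation for methods such as belief propagation and particle filters.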
Cartoon history of speech recognition research
• 1960’s, 1970’s, 1980’s: lots of different approaches; “hey, let’s try this”.
• 1980’s: Hidden Markov Models (HMM), a statistical approach, took off.
• 1990’s and beyond: HMM’s now the dominant approach. “The person with the best training set wins”.
Same story for document understanding
• The person with the best training set wins.
Computer vision is ready to make that transition
• Machine learning approaches are becoming dominant.
• We get to make and watch the transition to a principled, statistical approach happen.
• It’s not trivial: issues of representation, robustness, generalization, speed, …
Categories of the papers
1. Learning image representations
2. Learning manifolds
3. Linear and bilinear models
4. Learning low-level vision
5. Graphical models, belief propagation
6. Particle filters and tracking
7. Face and object recognition
8. Learning models of object appearance
1 Learning image representations
Example training image
From http://www.amsci.org/amsci/articles/00articles/olshausencap1.html
1 Learning image representations
From: http://www.cns.nyu.edu/pub/eero/simoncelli01-reprint.pdf
2 Learning manifolds
From: http://www.sciencemag.org/cgi/content/full/290/5500/2319
Joshua B. Tenenbaum, Vin de Silva, John C. Langford
2 Learning manifolds
From: http://www.sciencemag.org/cgi/content/full/290/5500/2319
2 Learning manifolds
From: http://www.sciencemag.org/cgi/content/full/290/5500/2319
3 Linear and bilinear models
From: http://www-psych.stanford.edu/~jbt/NC120601.pdf
4 Learning low-level vision
From Y. Weiss, http://www.cs.berkeley.edu/~yweiss/iccv01.ps.gz
Images, under different lighting
reflectance illumination
5 Graphical models, belief propagation
From: http://www.cs.berkeley.edu/~yweiss/nips96.pdf
6 Particle filters and tracking
From: http://www.robots.ox.ac.uk/~ab/abstracts/eccv96.isard.html
7 Face and object recognition
From Viola and Jones, http://www.ai.mit.edu/people/viola/research/publications/ICCV01-Viola-Jones.ps.gz
7 Face and object recognition
From Viola and Jones, http://www.ai.mit.edu/people/viola/research/publications/ICCV01-Viola-Jones.ps.gz
7 Face and object recognition
From: Pinar Duygulu, Kobus Barnard, Nando deFreitas, and David Forsyth,
8 Learning models of object appearance
Weber, Welling, and Perona, http://www.gatsby.ucl.ac.uk/~welling/papers/ECCV00_fin.ps.gz
Images containing the object
Images not containing the object
8 Learning models of object appearance
Test images
Weber, Welling, and Perona, http://www.gatsby.ucl.ac.uk/~welling/papers/ECCV00_fin.ps.gz
Contains the object?
Contains the object?
8 Learning models of object appearance
Weber, Welling, and Perona, http://www.gatsby.ucl.ac.uk/~welling/papers/ECCV00_fin.ps.gz
Guest lecturers/discussants
• Andrew Blake (Condensation, Oxford/Microsoft)
• Baback Moghaddam (Bayesian face recognition, MERL)
• Paul Viola (Fast face recognition, MERL)
Class requirements
1. Read each paper. Think about them. Discuss in class.
2. Present one paper to the class.
3. Present one computer example to the class.
4. Final project: write a conference paper related to vision and learning.
1. Read the papers, discuss them
• Write down 3 insights about the paper that you might want to share with the class in discussion.
• Turn them in on a sheet of paper.
2. Presentations about a paper
• About 15 minutes long. Set the stage for discussions.
• Review the paper. Summarize its contributions. Give relevant background. Discuss how it relates to other papers we’ve read.
• Meet with me two days before to go over your presentation about the paper.
3. Programming example
• Present a computer implementation of a toy example that illustrates the main idea of the paper.
• Show trade-offs in parameter settings, or in training sets.
• Goal: help us build up intuition about these techniques.
• Ok to use on-line code. Then focus on creating informative toy training sets.
Toy problems
• Simple summaries of the main idea.
• Identify an informative idea from the paper
• Make a simple example using it.
• Play with it.
Toy problem
by Ted Adelson
Toy problem
“If you can make a system to solve this, I’ll give you a PhD”
by Ted Adelson
Particle filter for inferring human motion in 3-d
From: Hedvig Sidenbladh’s thesis, http://www.nada.kth.se/~hedvig/publications/thesis.pdf
Particle filter toy example
From: Hedvig Sidenbladh’s thesis, http://www.nada.kth.se/~hedvig/publications/thesis.pdf
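In the same spirit, a 1-d particle filter can be sketched in a few lines. The random-walk motion model, Gaussian observation noise, particle count, and all parameter values below are illustrative assumptions, not taken from the thesis.

```python
import math
import random


def gaussian_likelihood(y, x, sigma):
    """Unnormalized Gaussian likelihood of observation y given state x."""
    return math.exp(-0.5 * ((y - x) / sigma) ** 2)


def particle_filter(observations, n_particles=500, process_sigma=0.5,
                    obs_sigma=1.0, seed=0):
    """Bootstrap particle filter for a 1-d random-walk state.

    Assumed model (illustrative only):
        x_t = x_{t-1} + N(0, process_sigma^2)
        y_t = x_t     + N(0, obs_sigma^2)
    Returns the posterior-mean estimate of x_t at every time step.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]  # broad prior
    estimates = []
    for y in observations:
        # 1. Predict: push each particle through the motion model.
        particles = [p + rng.gauss(0.0, process_sigma) for p in particles]
        # 2. Weight: score each particle against the new observation.
        weights = [gaussian_likelihood(y, p, obs_sigma) for p in particles]
        total = sum(weights)
        if total == 0.0:  # every particle is far from y; fall back to uniform
            weights = [1.0] * n_particles
            total = float(n_particles)
        weights = [w / total for w in weights]
        # 3. Estimate: weighted mean of the particle set.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # 4. Resample: draw a new, equally weighted particle set.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


def rmse(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


if __name__ == "__main__":
    rng = random.Random(1)
    truth, x = [], 0.0
    for _ in range(50):
        x += rng.gauss(0.0, 0.5)
        truth.append(x)
    noisy = [t + rng.gauss(0.0, 1.0) for t in truth]
    est = particle_filter(noisy)
    print("RMSE of raw observations:", round(rmse(noisy, truth), 3))
    print("RMSE of filter estimates:", round(rmse(est, truth), 3))
```

Varying `n_particles`, `process_sigma`, and `obs_sigma` is one way to explore the parameter trade-offs that the programming examples are meant to expose.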
What we’ll have at the end of the class
Code examples:
• Non-negative matrix factorization example
• 1-d particle filtering example
• Boosting for face recognition
• Example of belief propagation for scene understanding.
• Manifold learning comparisons.
• …
4. Final project: write a conference paper
• Submitting papers to conferences, you get just one shot, so it’s important to learn how to make good submissions.
• We’ll discuss many papers, and what’s good and bad about them, during the class.
• I’ll give a lecture on “how to write a good conference paper”.
• Subject of the paper can be:
  – A project from your own research.
  – A project you undertake for the class.
    • Your idea
    • One I suggest to you
Feedback options
• At the end of the course: “it would have been better if we had done this…”
  – Somewhat helpful
• During the course: “I find this useful; I don’t find that useful…”
  – Very helpful
What background do you need?
• Be able to read and understand the papers
  – Linear algebra
  – Familiarity with estimation theory
  – Image filtering
• Background in machine learning and computer vision.
Auditing versus credit
• If you’re a student and want to take the class, sign up for credit.
  – You’ll stay more engaged.
  – Makes it more probable that I can offer the class again.
• But if you do audit:
  – Please don’t come to class if you haven’t read the paper.
  – I may ask you to present to the class, anyway.
First paper
• Monday, Feb. 11.
• Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Olshausen BA, Field DJ (1996) Nature, 381: 607-609
• Presenter: Bill Freeman
• Computational demonstration: need volunteer (software is available: http://redwood.ucdavis.edu/bruno/sparsenet.html)
Second paper
• Wednesday, Feb. 13.
• Learning the parts of objects by non-negative matrix factorization, D. D. Lee and H. S. Seung, Nature 401, 788-791 (1999), and commentary by Mel.
• Presenter: need volunteer
• Computational demonstration: need volunteer
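A possible starting point for a non-negative matrix factorization demonstration is sketched below. It uses the multiplicative updates for squared reconstruction error from Lee and Seung's later paper "Algorithms for Non-negative Matrix Factorization", which may differ from the update rules in the Nature paper. The helper names, toy matrix, rank, and iteration count are all made up for illustration.

```python
import random

EPS = 1e-9  # guards against division by zero in the updates


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0):
    """Factor non-negative V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + EPS) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + EPS) for a in range(rank)]
             for i in range(n)]
    return w, h


if __name__ == "__main__":
    # Toy data: each row of V mixes two hypothetical non-negative "parts".
    parts = [[1, 1, 0, 0], [0, 0, 1, 1]]
    mix = [[1, 0], [0, 1], [2, 1], [1, 3]]
    v = matmul(mix, parts)
    w, h = nmf(v, rank=2)
    print("reconstruction error:", frobenius_error(v, w, h))
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, so no explicit projection step is needed.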