CS 764 Seminar in Computer Vision
CORNELL UNIVERSITY
CS 764 Seminar in Computer Vision
Ramin Zabih
Fall 1998
2
Course mechanics
Meeting time will be Tue/Thu 11-12, here
• Starting a week from today
Home page is now up: www/CS764
Assignment: present one paper
• You’ll have a lot of freedom, but you need to talk to me in advance
• Some possible papers will be posted shortly
3
Topic of this seminar
The use of “knowledge” in the analysis of visual data
• Sometimes called “context”
Clearly this is vital
• On both psychological and technical grounds
• But how? No one has much of an idea…
What is the interface between reasoning and perception? (Or, mind and body?)
4
What is the visual system’s “contract”?
Two standard (bad) answers
Answer 1: describe the scene in terms of surfaces [low-level vision]
• There is a green patch 2” wide 1’ away
Answer 2: describe the scene in terms of objects [model-based recognition]
• Start with a set of 3D models (modelbase)
• Determine position and pose
5
Why are these answers wrong?
They are almost purely data-driven
• Bottom-up (from the data) versus top-down (from somewhere else)
They report “objective fact”, with no room for the task at hand
• For a given image, there is only one right answer
Other problems as well
• Not very useful, etc.
6
Technical and psychological arguments
There are technical arguments against this
• Vision is an inverse problem
– Many 3D scenes could explain a single 2D image
• On engineering grounds, this makes no sense
– Ultimately, perception is used for some task
The human perceptual system has both top-down and bottom-up elements
• Various optical illusions
– Two people can look at the same picture and see something completely different
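To make the "inverse problem" point on this slide concrete, here is a minimal sketch, not taken from the lecture. It assumes an idealized pinhole camera; the function name `project` and the `focal_length` parameter are hypothetical. Two different 3D points on the same viewing ray land on the same 2D image point, so the image alone cannot tell them apart.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (x, y, z), with z > 0, onto the image plane.

    Assumption: an idealized pinhole camera at the origin looking down +z.
    """
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# Two distinct 3D points on the same ray through the camera center.
near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # (3.0, 6.0, 12.0): three times farther away

# Both project to the same image location, so depth is lost in the image.
assert near != far
assert project(near) == project(far)
```

Any point scaled along that ray gives the same image coordinates, which is one simple way to see why recovering the scene requires extra assumptions.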
10
Your vision system doesn’t listen
11
It makes “reasonable” assumptions
12
Low-level vision has its solution
Inverse problems require assumptions
The assumptions for low-level vision are extremely general (i.e., weak)
• Reflect the physics of the visible world
• For example, motion or depth or intensity tend to be “coherent”
– Saying that every pixel is moving differently from its neighbors is a very unlikely answer
– The world we live in tends not to do that
– Helmholtz’s “unconscious inference”
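One way to read the "coherence" assumption above is as a penalty on neighboring values that disagree. The sketch below is an illustration under that assumption, not the method from the lecture. The function name `incoherence` and the sum-of-squared-differences cost are hypothetical choices, applied to a 1D row of per-pixel motion values.

```python
def incoherence(flow):
    """Sum of squared differences between neighboring values.

    Lower means more coherent. This is an illustrative smoothness-style
    penalty, chosen for simplicity.
    """
    return sum((a - b) ** 2 for a, b in zip(flow, flow[1:]))


# Hypothetical per-pixel motion estimates along one image row.
coherent = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]    # every pixel moves the same way
erratic = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # every pixel disagrees with its neighbor

# Under this penalty, the erratic explanation is the much less likely one.
assert incoherence(coherent) < incoherence(erratic)
```

Given two candidate explanations of the same image data, a system using this kind of weak assumption would prefer the one with the lower penalty.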
13
We’ll need high-level vision
Most of the field is low-level vision or model-based recognition
• Partly to avoid the confusion CS764 is about
Key question: how to avoid brittleness?
• Can make the visual system compute just what we need for our task (i.e., berries)
• But how to handle the unexpected (i.e., lions)?
14
A short historical perspective
1960’s vision was completely task-specific
• A black blob in the center of the image is a telephone
• These efforts are now considered “hacks”
1970’s vision became completely general
• Marr pushed the field towards precise technical questions
• Low-level vision and recognition became dominant
15
Tasks strike back
In the mid-1980’s, several attempts were made to re-introduce a notion of task
• Active/animate/purposive vision
These attempts are widely viewed as failures, for good reasons
• We’ll look at them a bit next week
It’s not enough to have good intuitions
• There needs to be technical merit as well
16
Desiderata
Technical solutions (algorithms) that are very roughly consistent with human data
• Goal is not AI, psychology or philosophy
Provide visual summaries useful for tasks, but degrade gracefully
• Handle open/unstructured environments
• Deal with expectations and breakdown
17
Our path for 764
No good computational work to read
• Perhaps Vera will fix this?
We will examine papers along these lines:
• Computational approaches that failed
• Psychological data that is highly suggestive
• Neurologically inspired architectures
• Cognitive scientists and philosophers
– Their goal is argument, not algorithm!
– They’ve thought the most about these issues