PowerPoint Presentation… · • a-Pascal – 20 categories from PASCAL 2008 trainval dataset...

Page 4:

Frederick Kingdom

Page 5:

Frederick Kingdom

Page 6:

Photo: Georges Jansoon. Illusion: Frederick Kingdom, Ali Yoonessi and Elena Gheorghiu

Page 7:

Frederick Kingdom

Page 8:

Project 3: Camera calibration

Page 9:

Project 4: Scene Recognition

Page 10:

Normalization – why?

Required by some underlying property of the learning mechanism.

• E.g., removing hyperplane bias in SVM to aid fitting.

Also called ‘feature scaling’.

• Many methods, e.g.,

Normalization can be implemented wrt. other data points, and sometimes wrt. other features.
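
The list of methods on this slide did not survive extraction. As a minimal sketch, two widely used feature-scaling options are min-max rescaling and z-score standardization. The weight values below are illustrative: only 115 and 175 appear in the slides, and 140 is made up.

```matlab
% Illustrative weights; 140 is an assumed value, not taken from the slides.
w = [115; 140; 175];

% Min-max rescaling: the smallest value maps to 0 and the largest to 1.
w_minmax = (w - min(w)) ./ (max(w) - min(w));

% Z-score standardization: zero mean and unit (sample) standard deviation.
w_zscore = (w - mean(w)) ./ std(w);

disp(w_minmax');
disp(w_zscore');
```

Both variants normalize one feature with respect to the other data points; which one suits a given learner depends on the property that learner requires.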

Page 11:

Wrt. other data points – human weight

Udacity - ML

175 115

Page 12:

Tiny Image as a feature vector

• Each pixel is treated as a different feature.

• Feature vector is matrix reshaped into an array.

Page 13:

Normalization – how?

function image_feats = get_tiny_images( image_paths )
    dim = 16;  % tiny-image side length (not named 'size', which would shadow the built-in)
    N = size(image_paths, 1);
    image_feats = zeros(N, dim * dim);
    for i = 1:N
        img = im2double( imread(image_paths{i, 1}) );
        img = imresize( img, [dim, dim] );
        fv = reshape( img, [1, dim * dim] );   % matrix reshaped into a row vector
        fv = fv - mean( fv );
        image_feats(i, :) = fv ./ norm( fv );
    end

Page 15:

Mean across features

function image_feats = get_tiny_images( image_paths )
    dim = 16;  % tiny-image side length (not named 'size', which would shadow the built-in)
    N = size(image_paths, 1);
    image_feats = zeros(N, dim * dim);
    for i = 1:N
        img = im2double( imread(image_paths{i, 1}) );
        img = imresize( img, [dim, dim] );
        fv = reshape( img, [1, dim * dim] );
        fv = fv - mean( fv ); % Mean across features (pixels) per data point
        image_feats(i, :) = fv ./ norm( fv );
    end

Page 16:

Mean across data points

function image_feats = get_tiny_images( image_paths )
    dim = 16;  % tiny-image side length (not named 'size', which would shadow the built-in)
    N = size(image_paths, 1);
    image_feats = zeros(N, dim * dim);
    for i = 1:N
        img = im2double( imread(image_paths{i, 1}) );
        img = imresize( img, [dim, dim] );
        image_feats(i, :) = reshape( img, [1, dim * dim] );  % store the raw feature vector
    end
    mean_img = mean(image_feats); % Mean of each feature (pixel) across data points
    std_img = std(image_feats);   % Standard deviation of each feature across data points
    for i = 1:N
        image_feats(i,:) = ( image_feats(i,:) - mean_img ) ./ std_img;
    end
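
The two variants differ only in the axis along which the statistics are taken. The sketch below shows both on a made-up toy matrix (rows are data points, columns are features) without loops. It assumes implicit expansion, which is available in MATLAB R2016b and later and in Octave.

```matlab
% Toy feature matrix: 3 data points (rows) x 4 features (columns). Values are made up.
X = [1 2 3 6;
     2 4 4 2;
     0 9 5 2];

% Per data point: subtract each row's mean, then scale each row to unit length.
% (Relies on implicit expansion of the 3x1 column against the 3x4 matrix.)
rows_centered = X - mean(X, 2);
per_point = rows_centered ./ sqrt(sum(rows_centered.^2, 2));

% Per feature: subtract each column's mean, then divide by each column's
% standard deviation.
per_feature = (X - mean(X, 1)) ./ std(X, 0, 1);
```

A constant row or a constant column would cause a division by zero here, so real data may need a small guard on the denominator.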

Page 17:

Friday: CV for Social Good Bad

• We saw how dataset bias can introduce error.

• But what about the underlying feature representation itself?

• How might I discover why my CV system is bad?

• How might I explain the failure?

Page 18:

Vondrick et al.

Page 19:

Car

Vondrick et al.

Page 20:

Car

Vondrick et al.

Page 21:

Car

Vondrick et al.

Page 22:

What information is lost?

Vondrick et al.

Page 23:

What information is lost?

Vondrick et al.

Page 24:

HOG = ɸ

Many-to-one function

No inverse

Vondrick et al.
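
To make the many-to-one point concrete, here is a toy sketch in which two different images map to the same feature. The feature is a plain 8-bin histogram of gradient orientations, used as a simplified stand-in for HOG (no cells, blocks, or magnitude weighting); the helper names `gx`, `gy`, `bins`, and `phi` are hypothetical.

```matlab
nbins = 8;

% Central-difference gradients on the interior pixels of image I.
gx = @(I) I(2:end-1, 3:end) - I(2:end-1, 1:end-2);
gy = @(I) I(3:end, 2:end-1) - I(1:end-2, 2:end-1);

% Map each gradient orientation in [-pi, pi] to a bin index in 1..nbins.
bins = @(I) min(floor((atan2(gy(I), gx(I)) + pi) / (2*pi) * nbins) + 1, nbins);

% Toy feature: count how many pixels fall into each orientation bin.
phi = @(I) accumarray(reshape(bins(I), [], 1), 1, [nbins 1]);

A = magic(6);   % a small made-up "image"
B = A + 10;     % same image, brighter: gradients are unchanged

same_feature    = isequal(phi(A), phi(B));   % the features coincide
different_image = ~isequal(A, B);            % although the images differ
```

Because the absolute brightness never reaches the feature, no inverse can recover it from the feature alone; it is one example of information that such a descriptor discards.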

Page 25:

What information is lost?

Vondrick et al.

Page 26:

HOGgles (Vondrick et al. ICCV 2013)

Page 27:

Method: Paired Dictionary

Vondrick et al.

Page 28:

Vondrick et al.

Page 29:

Vondrick et al.

Page 30:

Vondrick et al.

Page 32:

Human Vision vs. HOG Vision

Vondrick et al.

Page 33:

Vondrick et al.

Page 34:

Car

Why did the detector fail?

Vondrick et al.

Page 35:

Car

Why did the detector fail?

Vondrick et al.

Page 36:

Car

Why did the detector fail?

Vondrick et al.

Page 37:

Code Available

Try it on your project 5!

http://web.mit.edu/vondrick/ihog/

ihog = invertHOG(feat);

Page 38:

Friday: CV for Social Good Bad

• We saw how dataset bias can introduce error.

• …and how features can be ambiguous.

• What about label bias errors?

• How might I move towards a more flexible label system?

Page 39:

Describing Objects by their Attributes

Ali Farhadi, Ian Endres, Derek Hoiem, David Forsyth

CVPR 2009

Page 41:

What do we want to know about this object?

Page 42:

What do we want to know about this object?

Object recognition expert:

“Dog”

Page 43:

What do we want to know about this object?

Object recognition expert:

“Dog”

Person in the Scene:

“Big pointy teeth”, “Can move fast”, “Looks angry”

Page 44:

Goal: Infer Object Properties

Is it alive? Can I draw with it? Can I put stuff in it?

What shape is it? Is it soft?

Does it have a tail? Will it blend?

Page 45:

Why Infer Properties

1. We want detailed information about objects

“Dog”

vs.

“Large, angry animal with pointy teeth”

Page 46:

Why Infer Properties

2. We want to be able to infer something about unfamiliar objects – “zero shot learning”

Familiar Objects New Object

Page 47:

Why Infer Properties

2. We want to be able to infer something about unfamiliar objects – “zero shot learning”

Cat Horse Dog ???

If we can infer category names…

Familiar Objects New Object

Page 48:

Why Infer Properties

2. We want to be able to infer something about unfamiliar objects – “zero shot learning”

Has Stripes

Has Ears

Has Eyes

….

Has Four Legs

Has Mane

Has Tail

Has Snout

….

Brown

Muscular

Has Snout

….

Has Stripes (like cat)

Has Mane and Tail (like horse)

Has Snout (like horse and dog)

Familiar Objects New Object

If we can infer properties…
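
The idea on this slide can be sketched as a lookup: once attributes are predicted for an unfamiliar object, each one can be related back to the familiar categories that share it. The attribute vocabulary, the category signatures in `S`, and the predictions in `new_obj` are all made up for illustration and loosely mirror the slide.

```matlab
% Hypothetical attribute vocabulary and familiar categories.
attrs = {'stripes', 'mane', 'tail', 'snout'};
cats  = {'cat', 'horse', 'dog'};

% Assumed binary signatures: rows are categories, columns are attributes.
S = [1 0 0 0;    % cat
     0 1 1 1;    % horse
     0 0 0 1];   % dog

% Attribute predictions for the new object (assumed to come from attribute classifiers).
new_obj = [1 1 1 1];

% Describe the new object by relating each predicted attribute to familiar categories.
descriptions = {};
for a = find(new_obj)
    shared = cats(S(:, a)' == 1);
    descriptions{end+1} = sprintf('Has %s (like %s)', attrs{a}, strjoin(shared, ' and '));
end
fprintf('%s\n', descriptions{:});
```

No category name for the new object is needed at any point, which is what lets the description extend to categories never seen in training.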

Page 49:

Why Infer Properties

3. We want to make comparisons between objects or categories

What is unusual about this dog?

What is the difference between horses and zebras?

Page 50:

Strategy 1: Category Recognition

Object Image → classifier → Category: “Car” → associated properties: Has Wheels, Used for Transport, Made of Metal, Has Windows

Category Recognition: PASCAL 2008

Category Attributes: ??

Page 51:

Strategy 2: Exemplar Matching

Object Image → similarity function → Similar Image → associated properties: Has Wheels, Used for Transport, Made of Metal, Old

Malisiewicz and Efros 2008

Hays and Efros 2008

Efros et al. 2003

Page 52:

Strategy 3: Infer Properties Directly

Object Image → classifier for each attribute → No Wheels, Old, Brown, Made of Metal

See also Lampert et al. 2009

Gibson’s affordances

Page 53:

The Three Strategies

Category: Object Image → classifier → Category “Car” → associated properties

Exemplar: Object Image → similarity function → Similar Image → associated properties

Direct: Object Image → classifier for each attribute

Example properties shown: Has Wheels, Used for Transport, Made of Metal, Has Windows, Old, No Wheels, Brown

Page 54:

Candidate attributes

• Visible parts: “wheels”, “snout”, “eyes”

• Visible materials or material properties: “made of metal”, “shiny”, “clear”, “made of plastic”

• Shape: “3D boxy”, “round”

Page 55:

Attribute Examples

Shape: Horizontal Cylinder

Part: Wing, Propeller, Window, Wheel

Material: Metal, Glass

Shape:

Part: Window, Wheel, Door, Headlight,

Side Mirror

Material: Metal, Shiny

Page 56:

Attribute Examples

Shape:

Part: Head, Ear, Nose,

Mouth, Hair, Face,

Torso, Hand, Arm

Material: Skin, Cloth

Shape:

Part: Head, Ear, Snout,

Eye

Material: Furry

Shape:

Part: Head, Ear, Snout,

Eye, Torso, Leg

Material: Furry

Page 57:

Datasets

• a-Pascal

– 20 categories from PASCAL 2008 trainval dataset (10K object images)

• airplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train, tv monitor

– ‘Ground truth’ for 64 attributes

– Annotation via Amazon’s Mechanical Turk

• a-Yahoo

– 12 new categories from Yahoo image search

• bag, building, carriage, centaur, donkey, goat, jet ski, mug, monkey, statue of person, wolf, zebra

– Categories chosen to share attributes with those in Pascal

• Attribute labels are somewhat ambiguous

– Agreement among “experts”: 84.3%

– Between experts and Turk labelers: 81.4%

– Among Turk labelers: 84.1%

Page 58:

Annotation on Amazon Turk

Page 59:

Approach

Page 60:

Features + classifiers

Spatial pyramid histograms of quantized

– Color and texture for materials

– Histograms of gradients (HOG) for parts

– Canny edges for shape

Learn presence / absence of attribute.

– Train one classifier (linear SVM) per attribute
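
A minimal sketch of the "one classifier per attribute" setup follows. The data are made up, and a least-squares linear classifier is used as a simple stand-in for the linear SVM named on the slide, so that the example needs no toolbox.

```matlab
% Toy data: 6 objects (rows) x 2 features (columns); values are made up.
X = [0 0; 0 1; 1 0; 1 1; 2 0; 2 1];

% Presence (+1) or absence (-1) of two hypothetical attributes (columns).
Y = [-1 -1;
     -1  1;
     -1 -1;
     -1  1;
      1 -1;
      1  1];

Xb = [X, ones(size(X, 1), 1)];   % append a bias column

% Train one independent linear classifier per attribute.
% Least squares replaces the linear SVM here (an assumption of this sketch).
W = zeros(size(Xb, 2), size(Y, 2));
for a = 1:size(Y, 2)
    W(:, a) = Xb \ Y(:, a);
end

pred = sign(Xb * W);   % predicted presence/absence of each attribute per object
```

Each column of `W` is trained without reference to the others, so attributes can be added or removed without retraining the rest.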

Page 61:

Average ROC Area

Test Objects    Parts    Materials    Shape
a-PASCAL        0.794    0.739        0.739
a-Yahoo         0.726    0.645        0.677

Trained on a-PASCAL objects

Page 62:

Describing Objects by their Attributes

No examples from these object categories were seen during training

Page 64:

Category Recognition

• Semantic attributes not enough

– 74% accuracy even with ground truth attributes

• Introduce discriminative attributes

– Trained by selecting subset of classes and features

• Dogs vs. sheep using color

• Cars and buses vs. motorbikes and bicycles using edges

– Train 10,000 and select 1,000 most reliable, according to a validation set

Page 65:

Attributes not big help when sufficient data

PASCAL 2008                  Base Features    Semantic Attributes    All Attributes
Classification Accuracy      58.5%            54.6%                  59.4%
Class-normalized Accuracy    35.5%            28.4%                  37.7%

• Use attribute predictions as features

• Train linear SVM to categorize objects
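
The gap between the two accuracy rows is typical of imbalanced data. The sketch below assumes that class-normalized accuracy means the per-class accuracies averaged over classes; the labels are made up.

```matlab
% Made-up labels for an imbalanced three-class problem.
truth = [1 1 1 1 1 1 2 2 3 3];
pred  = [1 1 1 1 1 1 1 2 1 1];

% Classification accuracy: fraction of all examples labeled correctly.
overall_acc = mean(pred == truth);

% Class-normalized accuracy: accuracy within each class, averaged over classes.
classes = unique(truth);
per_class = zeros(size(classes));
for k = 1:numel(classes)
    in_class = (truth == classes(k));
    per_class(k) = mean(pred(in_class) == classes(k));
end
class_norm_acc = mean(per_class);

fprintf('overall %.2f, class-normalized %.2f\n', overall_acc, class_norm_acc);
```

A classifier that favors the frequent class scores well on the first measure and poorly on the second, which is the pattern in the table above.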

Page 66:

Absence of typical attributes

752 reports

68% are correct

Page 67:

Presence of atypical attributes

951 reports

47% are correct

Page 68:

Visual Recognition with Humans in the Loop

Steve Branson, Catherine Wah, Florian Schroff, Boris Babenko, Peter Welinder, Pietro Perona, Serge Belongie

Part of the Visipedia project

Slides from Brian O’Neil

Page 69:

Introduction:

Computers starting to get good at this.

If it’s hard for humans, it’s probably too hard for computers.

Semantic feature extraction difficult for computers.

Combine strengths to solve this problem.

Page 70:

The Approach: What is progress?

• Supplement visual recognition with the human capacity for visual feature extraction to tackle difficult (fine-grained) recognition problems.

• Typical progress is viewed as increasing data difficulty while maintaining full autonomy.

• Here, progress is instead a reduction in human effort on difficult data.

Page 71:

The Approach: 20 Questions

• Ask the user a series of discriminative visual questions to make the classification.

Page 72:

Which 20 questions?

• At each step, exploit the image itself and the user response history to select the most informative question to ask next.

Flowchart: Image $x$ → $p(c \mid U^{t-1}, x)$ → Ask user a question → $p(c \mid U^{t}, x)$ → Stop?

No: ask another question.

Yes: output $\max_c \, p(c \mid U^{t}, x)$.

Page 73:

Which question to ask?

• The question that will reduce entropy the most, taking into consideration the computer vision classifier confidences for each category.
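
One way to sketch this selection rule is to compute, for every candidate question, the expected entropy of the class posterior after hearing the answer, and to ask the question with the smallest value. The sketch is deliberately simplified: answers are yes/no with no confidence levels, the answer is assumed to depend only on the true class, and both the posterior `p_c` and the user model `P_yes` are made-up numbers.

```matlab
% Current class posterior, e.g. from the vision classifier (made-up numbers).
p_c = [0.5 0.3 0.2];

% Hypothetical user model: P_yes(q, c) is the probability that a user answers
% "yes" to question q when the true class is c.
P_yes = [0.9 0.9 0.1;    % question 1
         0.9 0.1 0.5;    % question 2
         0.5 0.5 0.5];   % question 3 (uninformative)

% Shannon entropy in bits, treating 0*log2(0) as 0.
H = @(p) -sum(p(p > 0) .* log2(p(p > 0)));

expected_H = zeros(size(P_yes, 1), 1);
for q = 1:size(P_yes, 1)
    for answer = 0:1
        % Likelihood of this answer under each class, p(u | c).
        lik = answer * P_yes(q, :) + (1 - answer) * (1 - P_yes(q, :));
        p_u = sum(lik .* p_c);          % marginal probability of the answer
        post = (lik .* p_c) / p_u;      % updated posterior by Bayes' rule
        expected_H(q) = expected_H(q) + p_u * H(post);
    end
end

% Smallest expected entropy is the largest expected entropy reduction.
[~, best_q] = min(expected_H);
```

Because `p_c` starts from the classifier's confidences, a question that only separates classes the classifier has already ruled out scores poorly, which is how the vision system reduces the number of questions asked.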

Page 74:

The Dataset: Birds-200

• 6033 images of 200 species

Page 75:

Implementation

• Assembled 25 visual questions encompassing 288 visual attributes extracted from www.whatbird.com

• Mechanical Turk users asked to answer questions and provide confidence scores.

Page 76:

User Responses.

Page 77:

Visual recognition

• Any vision system that can output a probability distribution across classes will work.

• Authors used Andrea Vedaldi’s code.

– Color/gray SIFT

– VQ geometric blur

– 1-vs-all SVM

• Authors added full image color histograms and VQ color histograms

Page 78:

Experiments

• Two stop criteria:

– Fixed number of questions – evaluate accuracy

– User stops when bird identified – measure number of questions required.

Flowchart: Image $x$ → $p(c \mid U^{t-1}, x)$ → Ask user a question → $p(c \mid U^{t}, x)$ → Stop?

No: ask another question.

Yes: output $\max_c \, p(c \mid U^{t}, x)$.

Page 79:

Results

• Average number of questions to make ID reduced from 11.11 to 6.43

• Method allows CV to handle the easy cases, consulting with users only on the more difficult cases.

Page 80:

Key Observations

• Visual recognition reduces labor over a pure “20 Q” approach.

• Visual recognition improves performance over a pure “20 Q” approach. (69% vs 66%)

• User input dramatically improves recognition results. (66% vs 19%)

Hays

Page 81:

Strengths and weaknesses

• Handles very difficult data and yields excellent results.

• Plug-and-play with many recognition algorithms.

• Requires significant user assistance.

• Reported results assume humans are perfect verifiers

• Is the reduction from 11 questions to 6 really that significant?

Hays