Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf ·...

76
Lecture 12 Recognition Davide Scaramuzza Institute of Informatics – Institute of Neuroinformatics 1

Transcript of Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf ·...

Page 1: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Lecture 12Recognition

Davide Scaramuzza

Institute of Informatics – Institute of Neuroinformatics

1

Page 2: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Lab exercise today replaced by Deep Learning Tutorial

Room ETH HG E 1.1 from 13:15 to 15:00

Optional lab exercise is online: K-means clustering and Bag of Words place recognition

4

Page 3: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Outline

• Recognition applications and challenges

• Recognition approaches

• Classifiers

• K-means clustering

• Bag of words

5

Page 4: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Application: large-scale image retrieval

Query image Closest results from a database of 100 million images

6

Page 5: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

• Smartphone:– Lincoln Microsoft Research– Point & Find, Nokia– SnapTell.com (Amazon)– Google Goggles

Application: recognition for mobile phones

7

Page 6: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Application: Face recognition

See iPhoto, Google Photos, Facebook

Detection Recognition “Sally”

8

Page 7: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Application: Face recognition

• Detection works by using four basic types of feature detectors

– The white areas are subtracted from the black ones.

– A special representation of the sample called the integral image makes feature extraction faster.

P. Viola, M. Jones: Robust Real-time Object Detection, Int. Journal of Computer Vision 2001 9

Page 8: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Application: Optical character recognition (OCR)

Technology to convert scanned docs to text• If you have a scanner, it probably came with OCR software

Digit recognition, AT&T labs, using CNN,

by Yann LeCun (1993)

http://yann.lecun.com/

License plate readershttp://en.wikipedia.org/wiki/Automatic_number_plate_recognition

10

Page 9: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Application: pedestrian recognition

• Detector: Histograms of oriented gradients (HOG)

Credit: Van Gool’s lab, ETH Zurich 11

Page 10: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Challenges: object intra-class variations

• How to recognize ANY car

• How to recognize ANY cow

12

Page 11: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Challenges: object intra-class variations

• How to recognize ANY chair

15

Page 12: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Challenges: context and human experience

16

Page 13: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Outline

• Recognition applications and challenges

• Recognition approaches

• Classifiers

• K-means clustering

• Bag of words

17

Page 14: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

1960-1990

Polygonal objects

2000-today

Any kind of object1990-2000

Faces, characters,

planar objects

Research progress in recognition

18

Page 15: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Two schools of approaches

• Model based – Tries to fit a model (2D or 3D)

using a set of corresponding features (lines, point features)

• Example: SIFT matching and RANSAC for model validation

• Appearance based– The model is defined by a set of

images representing the object• Example: template matching can be

thought as a simple object recognition algorithm (the template is the object to recognize

19

Page 16: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Example of 2D model-based approachQ: Is this Book present in the Scene?

Scene Book

21

Page 17: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Example of 2D model-based approachQ: Is this Book present in the Scene?

Extract keypoints in both images

22

Page 18: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Most of the Book’s keypoints are present in the Scene

A: The Book is present in the Scene

Example of 2D model-based approachQ: Is this Book present in the Scene?

Look for corresponding matches

23

Page 19: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Example of appearance-based approach:Simple 2D template matching

• The model of the object is simply an image

• A simple example: Template matching

– Shift the template over the image and compare (e.g. NCC or SSD)

– Problem: works only if template and object are identical

24

Page 20: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Example of appearance-based approach:Simple 2D template matching

• The model of the object is simply an image

• A simple example: Template matching

– Shift the template over the image and compare (e.g. NCC or SSD)

– Problem: works only if template and object are identical

25

Page 21: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Outline

• Recognition applications and challenges

• Recognition approaches

• Classifiers

• K-means clustering

• Bag of words

26

Page 22: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

What is the goal of object recognition?

Goal: classify!

• Binary classifier

– say yes/no as to whether an object is present in an image

• Multi-class classifier

– categorize an object: determine what class it belongs to (e.g., car, apple, etc.)

How to display the result to the user

• Bounding box on object

• Full segmentation

Is this or is this not a car? Bounding box on object Full segmentation 27

Page 23: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Car/non-car

Classifier

Yes, car.No, not a car.

Basic component: a binary classifier

Detection via classification: Main idea

28

Page 24: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Detection via classification: Main idea

Car/non-car

Classifier

Feature

extraction

Training examples

1. Obtain training data2. Define features3. Define classifier

More in detail, we need to:

29

Page 25: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

• Consider all subwindows in an image

– Sample at multiple scales and positions

• Make a decision per window:

– “Does this contain object X or not?”

Detection via classification: Main idea

Car/non-car

Classifier

30

Page 26: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Generalization: the machine learning approach

31

Page 27: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

• Apply a prediction function to a feature representation of the image to get the desired output:

f( ) = “apple”

f( ) = “tomato”

f( ) = “cow”

Generalization: the machine learning approach

32

Page 28: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

The machine learning framework

𝑦 = 𝑓(𝒙)

• Training: given a training set of labeled examples{(𝒙1, 𝒚1), … , (𝒙𝑁, 𝒚𝑁)}, estimate the prediction function 𝑓 by minimizing the prediction error (𝑙𝑜𝑠𝑠 = 𝒚 − 𝑓(𝒙)) on the training set

• Testing: apply 𝑓 to a never-before-seen test example 𝒙 and output the predicted value 𝒚 = 𝑓(𝒙)

output prediction

function

Input: Image

features

33

Page 29: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

• Images in the training set must be labeled with the “correct answer” that the model is expected to produce

Contains a motorbike

Recognition task and supervision

34

Page 30: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Examples of possible features

• Blob features

• Histograms of oriented gradients (HOG)• Image Histograms

37

Page 31: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Classifiers: Nearest neighbor

𝑓(𝒙) = label of the training example nearest to 𝒙

• No training required!

• All we need is a distance function for our inputs

• Problem: need to compute distances to all training examples! (what if you have 1 million training images and 1 thousand features per image?)

Test

exampleTraining

examples

from class 1

Training

examples

from class 2

Features are represented in the descriptor space (Ex. What is the dimensionality of the descriptor space for SIFT features?)

38

Page 32: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Classifiers: Linear

• Find a linear function to separate the classes:

𝑓(𝒙) = 𝑠𝑔𝑛(𝒘 𝒙 + 𝑏)

39

Page 33: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Classifiers: non-linear

Bad classifier (over fitting)Good classifier

40

Page 34: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Outline

• Recognition applications and challenges

• Recognition approaches

• Classifiers

• K-means clustering

• Bag of words

41

Page 35: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

How do we define a classifier?

• We first need to cluster the training data

• Then, we need a distance function to determine to which cluster the query image belongs to

42

Page 36: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

K-means clustering

• k-means clustering is an algorithm to partition 𝑛 observations into 𝑘 clusters in which each observation 𝒙belongs to the cluster 𝑺𝒊 with center 𝒎𝑖

• It minimizes the sum of squared Euclidean distances between points 𝒙 and their nearest cluster centers 𝒎𝑖

𝐷 𝑋,𝑀 =

𝑖=1

𝑘

𝑥∈𝑆𝑖

(𝑥 − 𝑚𝑖)2

Algorithm:

• Randomly initialize 𝑘 cluster centers

• Iterate until convergence:

– Assign each data point 𝒙𝑗 to the nearest center 𝒎𝑖

– Recompute each cluster center as the mean of all points assigned to it

43

Page 37: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

K-means demo

Source: http://shabal.in/visuals/kmeans/1.html44

Page 38: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Outline

• Recognition applications and challenges

• Recognition approaches

• Classifiers

• K-means clustering

• Bag of words

45

Page 39: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Application: large-scale image retrieval

Query image Results on a database of 100 million images

46

Page 40: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Slide Slide Credit: Nister47

Page 41: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Slide Slide Credit: Nister48

Page 42: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Slide Slide Credit: Nister49

Page 43: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Fast visual search

“Scalable Recognition with a Vocabulary Tree”, Nister and Stewenius, CVPR 2006.

“Video Google”, Sivic and Zisserman, ICCV 2003

• Query of 1 image in a database of 100 million images in 6 seconds

50

Page 44: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Bag of Words• Extension to scene/place recognition:

– Is this image in my database?

– Robot: Have I been to this place before?

51

Page 45: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

University of Zurich – Robotics and Perception Group - rpg.ifi.uzh.ch

Visual Place Recognition

Goal: find the most similar images of a query image in a database of 𝑵 images

Complexity:𝑁2∙𝑀2

2feature comparisons (assumes each image has M features)

Each image must be compared with all other images!

𝑁 is the number of all images collected by a robot

- Example: assume your Roomba robot takes 1 image every meter to cover a

100 m2 house; assume 100 SIFT features per image → M = 100, 𝑁 = 100 →

𝑁2𝑀2/2= ~ 50 𝑀𝑖𝑙𝑙𝑖𝑜𝑛 feature comparisons!

Solution: Use an inverted file index! Complexity reduces to 𝑵 ∙ 𝑴

[“Video Google”, Sivic & Zisserman, ICCV’03][“Scalable Recognition with a Vocabulary Tree”, Nister & Stewenius, CVPR’06]

See also FABMAP and Galvez-Lopez’12’s (DBoW2)]

Page 46: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Indexing local features: inverted file text

• For text documents, an efficient way to find all pages in which a word occurs is to use an index

• We want to find all images in which a feature occurs

• How many distinct SIFT or BRISK features exist?

– SIFT → Infinite– BRISK-128 → 2128 = 3.4 ∙ 1038

• Since the number of image features may be infinite, before we build our visual vocabulary we need to map our features to “visual words”

• Using analogies from text retrieval, we should:

– Define a “Visual Word”– Define a “vocabulary” of Visual Words– This approach is known as “Bag of

Words” (BOW)

53

Page 47: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Building the Visual Vocabulary

Image Collection Extract Features Cluster Descriptors

Descriptors space

Examples of

Features belonging

to the same

clusters

What is a visual word? A visual word is the

centroid of a cluster!

54

Page 48: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Inverted File index• Inverted File Index lists all visual words in the vocabulary (extracted at training time)

• Each word points to a list of images, from the all image Data Base (DB), in which that word appears. The DB grows as the robot navigates and collects new images.

• Voting array: has as many cells as the images in the DB. Each word in the query image votes for an image

101103105105180180180

…Voting Array for Q

0

…101102103104105

Inverted File Index

Query image Q

Visual words in Q

+1 +1 +1

Visual word List of images in which this word appears

56

Page 49: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Populating the vocabulary

Feature descriptor space

57

Page 50: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Populating the vocabulary

58

Page 51: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Populating the vocabulary

59

Page 52: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Populating the vocabulary

60

Page 53: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Populating the vocabulary

61

Page 54: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Populating the vocabulary

62

Page 55: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

63

Page 56: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

64

Page 57: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

65

Page 58: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

66

Page 59: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

67

Page 60: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

68

Page 61: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

69

Page 62: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

70

Page 63: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

71

Page 64: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

72

Page 65: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

73

Page 66: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

74

Page 67: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Building the inverted file index

75

Page 68: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Building the inverted file index

76

Page 69: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Building the inverted file index

77

Page 70: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Building the inverted file index

78

Page 71: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Recognition

79

Page 72: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Robust object/scene recognition

• Visual Vocabulary discards the spatial relationships between features

– Two images with the same features shuffled around will return a 100% match when using only appearance information.

• This can be overcome using geometric verification

– Test the ℎ most similar images to the query image for geometric consistency (e.g. using 5- or 8-point RANSAC) and retain the image with the smallest reprojection error and largest number of inliers

– Further reading (out of scope of this course):

• [Cummins and Newman, IJRR 2011]

• [Stewénius et al, ECCV 2012]

80

Page 73: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Video Google System

1. Collect all words within query region

2. Inverted file index to find relevant frames

3. Compare word counts

4. Spatial verification

Sivic & Zisserman, ICCV 2003

• Demo online at : http://www.robots.ox.ac.uk/~vgg/research/vgoogle/

Query region

Retrieved frames

81

Page 74: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Improves

Retrieval

Improves

Speed

Branch factor

More words is better

83

Page 75: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

FABMAP [Cummins and Newman IJRR 2011]

• Place recognition for robot localization

• Uses training images to build the BoW database

• Captures the spatial dependencies of visual words to distinguish the most characteristic structure of each scene

• Probabilistic model of the world. At a new frame, compute:

– P(being at a known place)

– P(being at a new place)

• Very high performance

• Binaries available online

• Open FABMAP

86

Page 76: Lecture 12 Recognition - Davide Scaramuzzarpg.ifi.uzh.ch/docs/teaching/2017/12_recognition.pdf · 2017-12-07 · • Detection works by using four basic types of feature detectors

Things to remember

• Appearance-based object recognition– Classifiers

– K-means clustering

• Bag of Words approach– What is visual word

– Inverted file index

– How it works

• Chapter 14 of the Szeliski’sbook

93