Visual Tracking with Online Multiple Instance...

Visual Tracking with Online Multiple

Instance Learning

Kelsie Zhao

Boris Babenko, Ming-Hsuan Yang, Serge Belongie

Content

• Goal

• Background: Tracking by Detection

• Previous Work

• New Tracking Solution • MILTrack

• Online MILBoost

• Experiments & Results

Goal

Track one arbitrary object in video, given its

location in first frame

Background: Tracking by detection

• Frame 1 is labeled, tracker location known


• Crop one positive and some negative patches

near tracker

Negative

Positive x2

x1 x3


• Use patches to train the classifier

Negative

Positive

Classifier

{(x1, 1), (x2, 0), (x3, 0)}

x2

x1 x3


• Frame 2 comes

Classifier


• Calculate classifier response within a range of

the old tracker location

Classifier

X

Old location


• Find the maximum response location

Classifier

X

Old location

X

New location


• Move tracker

Frame 1 Frame 2


• Repeat

Negative

Positive

Frame 2


• Problem: If tracker location is not precise, might

select bad training examples




Model start to degrade!




• How to select good training

examples?

Model start to degrade!

Previous Work

• Solution 1: multiple positive examples around

tracker location

x4

x2 x5

x1

x3

{(x1, 1), (x2, 1), (x3, 1), (x4, 0), (x5, 0)}

Classifier

Previous Work

• Solution 1

Might confuse classifier!

x4

x2 x5 x1 x3

{(x1, 1), (x2, 1), (x3, 1), (x4, 0), (x5, 0)}

Classifier

Previous Work

• Solution 2: Multiple Instance Learning (MIL)

[Keeler ‘90, Dietterich et al. ‘97]

Previous Work: Multiple Instance Learning

• Multiple examples in one bag


x13

x12

x11

x21 x31


X3 X2

X1



x13

x12

x11

x21 x31


(X3 , 0) (X2 , 0)

(X1 , 1)


• One bag one label


x13

x12

x11

x21 x31


(X3 , 0) (X2 , 0)

(X1 , 1)


• One bag one label

• Bag Positive if at least one example is Positive


x13

x12

x11

x21 x31


{(X1, 1), (X2, 0), (X3, 0)}

Classifier

(X3 , 0) (X2 , 0)

(X1 , 1)


x13

x12

x11

x21 x31


MIL training input:

𝑿𝟏, 𝑦1 … 𝑿𝒏, 𝑦𝑛 ,

where,

𝑿𝒊 = 𝑥𝑖1 …𝑥𝑖𝑚 ,

𝑦𝑖 = 𝑚𝑎𝑥𝑗 𝑦𝑖𝑗

(X3 , 0) (X2 , 0)

(X1 , 1)

x13

x12

x11

x21 x31


x13

x12

x11


MIL training input:

𝑿𝟏, 𝑦1 … 𝑿𝒏, 𝑦𝑛 ,

where,

𝑿𝒊 = 𝑥𝑖1 …𝑥𝑖𝑚 ,

𝑦𝑖 = 𝑚𝑎𝑥𝑗 𝑦𝑖𝑗

(X3 , 0) (X2 , 0)

(X1 , 1)

Bag babel is 1

if at least one instance is 1 [Keeler ‘90, Dietterich et al. ‘97]

x21 x31

Now we have training examples!

How to train the classifier?

Previous Work: MILBoost

• MIL + boosting

Train a boosting classifier that maximizes log

likelihood of bags

𝑙𝑜𝑔𝐿 = log (𝑝 𝑦𝑖 𝑿𝒊))

𝑖

where,

𝑝 𝑦𝑖 𝑿𝒊) = 1 − (1 − 𝑝 𝑦𝑖 𝑥𝑖𝑗))

𝑗

[Viola et al. ‘05]


• MIL + boosting

Train a boosting classifier that maximizes log

likelihood of bags

𝑙𝑜𝑔𝐿 = log (𝑝 𝑦𝑖 𝑿𝒊))

𝑖

where,

𝑝 𝑦𝑖 𝑿𝒊) = 1 − (1 − 𝑝 𝑦𝑖 𝑥𝑖𝑗))

𝑗


1 ~1


• Problem: need all training examples



But in tracking, only current frame available



But in tracking, only current frame available

Need an online training algorithm for MIL


Main Contribution of this paper

• Online-MILBoost: Online training for MIL-based classifier

• MILTrack New tracking solution using Online-MILBoost

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

X

Babenko et al., 09

1: s = 35 in authors’ experiment

MILTrack workflow

New frame comes in:


X


Babenko et al., 09

MILTrack workflow

New frame comes in:


X

Babenko et al., 09


MILTrack workflow

New frame comes in:


Babenko et al., 09


MILTrack workflow

New frame comes in:

1. Crop out a set of image patches

𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

2. Use MIL classifier to find new tracker location

𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))

Babenko et al., 09

MILTrack workflow

New frame comes in:





3. 1) Crop positive examples 𝑋𝑟 = {𝑥| < 𝑟1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

Babenko et al., 09

1: r = 5 in authors’ experiment

MILTrack workflow

(X1 , 1)

Babenko et al., 09

New frame comes in:





3. 1) Crop positive examples 𝑋𝑟 = {𝑥| < 𝑟1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

1: r = 5 in authors’ experiment

x13

x12

x11

MILTrack workflow

Babenko et al., 09

New frame comes in:





3. 1) Crop positive examples

𝑋𝑟 = {𝑥| < 𝑟 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

3. 2) Crop Negative examples 𝑋𝑟, 𝛽 = 𝑥 𝑟 𝑡𝑜 𝛽1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑎𝑤𝑎𝑦 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

1: 𝛽 = 50 in authors’ experiment

MILTrack workflow

(X3 , 0)

x31

(X2 , 0)

x21

Babenko et al., 09

New frame comes in:





3. 1) Crop positive examples

𝑋𝑟 = {𝑥| < 𝑟 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

3. 2) Crop Negative examples 𝑋𝑟, 𝛽 = 𝑥 𝑟 𝑡𝑜 𝛽1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑎𝑤𝑎𝑦 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

1: 𝛽 = 50 in authors’ experiment

MILTrack workflow

New frame comes in:





3. Crop positive and negative examples near new

object location

4. Online MILBoost: Update MIL classifier with positive and

negative example bags

(X3 , 0) (X2 , 0)

(X1 , 1)

Babenko et al., 09

x31 x21

x13

x12

x11

Classifier

Online-MILBoost:

f1

f2

f3

…

Image patch x

Babenko et al., 09

Online-MILBoost:

• ℎ𝑘: a weak classifier using one feature

Image patch x

Babenko et al., 09

f1

f2

f3

…

f9

…

ℎ1(𝑥)

ℎ2(𝑥)

Online-MILBoost:


ℎ𝑘(𝑥) = log𝑝(𝑦 = 1|𝑓𝑘(𝑥))

𝑝(𝑦 = 0|𝑓𝑘(𝑥))

Image patch x

Babenko et al., 09

f1

f2

f3

…

ft

…

ℎ𝑘(𝑥)

Online-MILBoost:


ℎ𝑘(𝑥) = log𝑝(𝑦 = 1|𝑓𝑘(𝑥))

𝑝(𝑦 = 0|𝑓𝑘(𝑥))

with, 𝑝 𝑓𝑘 𝑥 𝑦 = 1 ~ 𝒩(𝜇1, 𝜎1) 𝑝 𝑓𝑘 𝑥 𝑦 = 0 ~ 𝒩(𝜇0, 𝜎0) 𝑝 𝑦 = 1 = 𝑝 𝑦 = 0

Image patch x

Babenko et al., 09

f1

f2

f3

…

ft

…

ℎ𝑘(𝑥)

Online-MILBoost:

• 𝑯(𝒙) : the MIL classifier made from weak

classifiers

𝑯 𝒙 = ℎ𝑘(𝑥)

𝐾

𝑘=1

Babenko et al., 09 K = 50 in authors’ experiment

Image patch x

ℎ1(𝑥)

ℎ𝑘(𝑥)

…

…

f1

f2

f3

…

ft

…

Online-MILBoost:

• Always keep a pool of M >> K weak classifier

candidates

h3

h1

h2 …

hM

Babenko et al., 09 M = 250 & K = 50 in authors’ experiment

Online-MILBoost:

• Update all M weak classifiers with positive and

negative bags

X3

{(X1, 1), (X2, 0), (X3, 0)}

X2 X1 x13

x12

x11

x21 x31

Classifier h1 Classifier h2 Classifier hM …

Babenko et al., 09

Online-MILBoost:

• Pick best K weak classifiers to form 𝑯(𝒙), where

ℎ𝑘 = argmaxℎ ∈ {ℎ1

… ℎ𝑀}log 𝐿(𝐻𝑘 − 1 + ℎ)


𝐾

𝑘=1

where 𝐻𝑘 − 1 is the classifier made up of the first

𝑘 − 1 weak classifiers

Babenko et al., 09

Online-MILBoost:

• Prediction : 𝑝 𝑦 = 1 𝑥 = 𝜎(𝑯(𝒙))

𝜎 𝑥 = 1

1 + 𝑒 − 𝑥

Babenko et al., 09

Online-MILBoost:


𝜎 𝑥 = 1

1 + 𝑒 − 𝑥

Babenko et al., 09

ℎ1 𝑥 = 2, ℎ2 𝑥 = 1.8, ℎ3 𝑥 = 0.6

f1

f2

f3

ℎ1(𝑥)

ℎ2(𝑥)

ℎ3(𝑥)

Online-MILBoost:


𝜎 𝑡 = 1

1 + 𝑒 − 𝑡

Babenko et al., 09


𝐾

𝑘=1

= 4.4

f1

f2

f3

ℎ1(𝑥)

ℎ2(𝑥)

ℎ3(𝑥)

𝑝 𝑦 = 1 𝑥 = 𝜎 𝑯 𝒙 = 0.99

ℎ1 𝑥 = 2, ℎ2 𝑥 = 1.8, ℎ3 𝑥 = 0.6

MILTrack workflow

New frame comes in:





3. Crop positive and negative examples near new object location

4. Online MILBoost:

Update MIL classifier with positive and negative

example bags

Babenko et al., 09

Experiments

Datesets: 8 publicly available videos, • Grayscale, 320 x 240 pixels

• Ground truth labeled every 5 frames by hand

−

Babenko et al., 2009

Coke can Girl Occluded face

David

Occluded face 2

Tiger 1 Sylvester Tiger 2

Experiments

Compared with: • OAB1

Online AdaBoost w/ 1 positive example per frame

• OAB5

Online AdaBoost w/ 45 positive examples per frame

• SemiBoost

Label in 1st frame only.

• FragTrack

Static appearance model


Experiments

Evaluation criterion:

Tracker position error (pixels) w.r.t. Ground truth


X X

Results

Video David

Babenko et al., 09

https://www.youtube.com/watch?v=dx7PZzQh0ik

Results

Position error versus Frame #, Video David


Results

Video Occluded Face

Babenko et al., 09

https://www.youtube.com/watch?v=IrBe203nkF8

Results

Position error versus Frame #,

Video Occluded Face


Results

Average Center location errors (pixels)

Best

Second Best


Conclusion

• Online MILBoost: Online algorithm to update MIL-based classifier

• Performance of “MILTrack” is stable


Discussion

• Why it can handle occlusion?

• Possible improvements

• Motion Model

• Features

• Part based representation


Note…

Wu et al., 2013

Note…

Wu et al., 2013 Project 2 use this!

Thank You!

Q&A

Visual Tracking with Online Multiple Instance...

Documents

Transcript of Visual Tracking with Online Multiple Instance...