Visual Tracking with Online Multiple Instance...

68
Visual Tracking with Online Multiple Instance Learning Kelsie Zhao Boris Babenko, Ming-Hsuan Yang, Serge Belongie

Transcript of Visual Tracking with Online Multiple Instance...

Page 1: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Visual Tracking with Online Multiple

Instance Learning

Kelsie Zhao

Boris Babenko, Ming-Hsuan Yang, Serge Belongie

Page 2: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Content

• Goal

• Background: Tracking by Detection

• Previous Work

• New Tracking Solution • MILTrack

• Online MILBoost

• Experiments & Results

Page 3: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Goal

Track one arbitrary object in video, given its

location in first frame

Page 4: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Frame 1 is labeled, tracker location known

Page 5: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Crop one positive and some negative patches

near tracker

Negative

Positive x2

x1 x3

Page 6: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Use patches to train the classifier

Negative

Positive

Classifier

{(x1, 1), (x2, 0), (x3, 0)}

x2

x1 x3

Page 7: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Frame 2 comes

Classifier

Page 8: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Calculate classifier response within a range of

the old tracker location

Classifier

X

Old location

Page 9: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Find the maximum response location

Classifier

X

Old location

X

New location

Page 10: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Move tracker

Frame 1 Frame 2

Page 11: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Repeat

Negative

Positive

Frame 2

Page 12: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Problem: If tracker location is not precise, might

select bad training examples

Page 13: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Problem: If tracker location is not precise, might

select bad training examples

Model start to degrade!

Page 14: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Background: Tracking by detection

• Problem: If tracker location is not precise, might

select bad training examples

• How to select good training

examples?

Model start to degrade!

Page 15: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work

• Solution 1: multiple positive examples around

tracker location

x4

x2 x5

x1

x3

{(x1, 1), (x2, 1), (x3, 1), (x4, 0), (x5, 0)}

Classifier

Page 16: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work

• Solution 1

Might confuse classifier!

x4

x2 x5 x1 x3

{(x1, 1), (x2, 1), (x3, 1), (x4, 0), (x5, 0)}

Classifier

Page 17: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work

• Solution 2: Multiple Instance Learning (MIL)

[Keeler ‘90, Dietterich et al. ‘97]

Page 18: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: Multiple Instance Learning

• Multiple examples in one bag

[Keeler ‘90, Dietterich et al. ‘97]

x13

x12

x11

x21 x31

Page 19: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: Multiple Instance Learning

X3 X2

X1

• Multiple examples in one bag

[Keeler ‘90, Dietterich et al. ‘97]

x13

x12

x11

x21 x31

Page 20: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: Multiple Instance Learning

(X3 , 0) (X2 , 0)

(X1 , 1)

• Multiple examples in one bag

• One bag one label

[Keeler ‘90, Dietterich et al. ‘97]

x13

x12

x11

x21 x31

Page 21: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: Multiple Instance Learning

(X3 , 0) (X2 , 0)

(X1 , 1)

• Multiple examples in one bag

• One bag one label

• Bag Positive if at least one example is Positive

[Keeler ‘90, Dietterich et al. ‘97]

x13

x12

x11

x21 x31

Page 22: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: Multiple Instance Learning

{(X1, 1), (X2, 0), (X3, 0)}

Classifier

(X3 , 0) (X2 , 0)

(X1 , 1)

[Keeler ‘90, Dietterich et al. ‘97]

x13

x12

x11

x21 x31

Page 23: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: Multiple Instance Learning

MIL training input:

𝑿𝟏, 𝑦1 … 𝑿𝒏, 𝑦𝑛 ,

where,

𝑿𝒊 = 𝑥𝑖1 …𝑥𝑖𝑚 ,

𝑦𝑖 = 𝑚𝑎𝑥𝑗 𝑦𝑖𝑗

(X3 , 0) (X2 , 0)

(X1 , 1)

x13

x12

x11

x21 x31

[Keeler ‘90, Dietterich et al. ‘97]

Page 24: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

x13

x12

x11

Previous Work: Multiple Instance Learning

MIL training input:

𝑿𝟏, 𝑦1 … 𝑿𝒏, 𝑦𝑛 ,

where,

𝑿𝒊 = 𝑥𝑖1 …𝑥𝑖𝑚 ,

𝑦𝑖 = 𝑚𝑎𝑥𝑗 𝑦𝑖𝑗

(X3 , 0) (X2 , 0)

(X1 , 1)

Bag babel is 1

if at least one instance is 1 [Keeler ‘90, Dietterich et al. ‘97]

x21 x31

Page 25: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Now we have training examples!

How to train the classifier?

Page 26: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: MILBoost

• MIL + boosting

Train a boosting classifier that maximizes log

likelihood of bags

𝑙𝑜𝑔𝐿 = log (𝑝 𝑦𝑖 𝑿𝒊))

𝑖

where,

𝑝 𝑦𝑖 𝑿𝒊) = 1 − (1 − 𝑝 𝑦𝑖 𝑥𝑖𝑗))

𝑗

[Viola et al. ‘05]

Page 27: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: MILBoost

• MIL + boosting

Train a boosting classifier that maximizes log

likelihood of bags

𝑙𝑜𝑔𝐿 = log (𝑝 𝑦𝑖 𝑿𝒊))

𝑖

where,

𝑝 𝑦𝑖 𝑿𝒊) = 1 − (1 − 𝑝 𝑦𝑖 𝑥𝑖𝑗))

𝑗

[Viola et al. ‘05]

1 ~1

Page 28: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: MILBoost

• Problem: need all training examples

[Viola et al. ‘05]

Page 29: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: MILBoost

But in tracking, only current frame available

[Viola et al. ‘05]

Page 30: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Previous Work: MILBoost

But in tracking, only current frame available

Need an online training algorithm for MIL

[Viola et al. ‘05]

Page 31: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Main Contribution of this paper

• Online-MILBoost: Online training for MIL-based classifier

• MILTrack New tracking solution using Online-MILBoost

Page 32: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

X

Babenko et al., 09

1: s = 35 in authors’ experiment

Page 33: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

X

Babenko et al., 09

1: s = 35 in authors’ experiment

Page 34: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

X

1: s = 35 in authors’ experiment

Babenko et al., 09

Page 35: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

X

Babenko et al., 09

1: s = 35 in authors’ experiment

Page 36: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

Babenko et al., 09

1: s = 35 in authors’ experiment

Page 37: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches

𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

2. Use MIL classifier to find new tracker location

𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))

Babenko et al., 09

Page 38: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches

𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

2. Use MIL classifier to find new tracker location

𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))

3. 1) Crop positive examples 𝑋𝑟 = {𝑥| < 𝑟1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

Babenko et al., 09

1: r = 5 in authors’ experiment

Page 39: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

(X1 , 1)

Babenko et al., 09

New frame comes in:

1. Crop out a set of image patches

𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

2. Use MIL classifier to find new tracker location

𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))

3. 1) Crop positive examples 𝑋𝑟 = {𝑥| < 𝑟1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

1: r = 5 in authors’ experiment

x13

x12

x11

Page 40: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

Babenko et al., 09

New frame comes in:

1. Crop out a set of image patches

𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

2. Use MIL classifier to find new tracker location

𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))

3. 1) Crop positive examples

𝑋𝑟 = {𝑥| < 𝑟 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

3. 2) Crop Negative examples 𝑋𝑟, 𝛽 = 𝑥 𝑟 𝑡𝑜 𝛽1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑎𝑤𝑎𝑦 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

1: 𝛽 = 50 in authors’ experiment

Page 41: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

(X3 , 0)

x31

(X2 , 0)

x21

Babenko et al., 09

New frame comes in:

1. Crop out a set of image patches

𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

2. Use MIL classifier to find new tracker location

𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))

3. 1) Crop positive examples

𝑋𝑟 = {𝑥| < 𝑟 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

3. 2) Crop Negative examples 𝑋𝑟, 𝛽 = 𝑥 𝑟 𝑡𝑜 𝛽1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑎𝑤𝑎𝑦 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

1: 𝛽 = 50 in authors’ experiment

Page 42: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches

𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

2. Use MIL classifier to find new tracker location

𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))

3. Crop positive and negative examples near new

object location

4. Online MILBoost: Update MIL classifier with positive and

negative example bags

(X3 , 0) (X2 , 0)

(X1 , 1)

Babenko et al., 09

x31 x21

x13

x12

x11

Classifier

Page 43: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

f1

f2

f3

Image patch x

Babenko et al., 09

Page 44: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• ℎ𝑘: a weak classifier using one feature

Image patch x

Babenko et al., 09

f1

f2

f3

f9

ℎ1(𝑥)

ℎ2(𝑥)

Page 45: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• ℎ𝑘: a weak classifier using one feature

ℎ𝑘(𝑥) = log𝑝(𝑦 = 1|𝑓𝑘(𝑥))

𝑝(𝑦 = 0|𝑓𝑘(𝑥))

Image patch x

Babenko et al., 09

f1

f2

f3

ft

ℎ𝑘(𝑥)

Page 46: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• ℎ𝑘: a weak classifier using one feature

ℎ𝑘(𝑥) = log𝑝(𝑦 = 1|𝑓𝑘(𝑥))

𝑝(𝑦 = 0|𝑓𝑘(𝑥))

with, 𝑝 𝑓𝑘 𝑥 𝑦 = 1 ~ 𝒩(𝜇1, 𝜎1) 𝑝 𝑓𝑘 𝑥 𝑦 = 0 ~ 𝒩(𝜇0, 𝜎0) 𝑝 𝑦 = 1 = 𝑝 𝑦 = 0

Image patch x

Babenko et al., 09

f1

f2

f3

ft

ℎ𝑘(𝑥)

Page 47: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• 𝑯(𝒙) : the MIL classifier made from weak

classifiers

𝑯 𝒙 = ℎ𝑘(𝑥)

𝐾

𝑘=1

Babenko et al., 09 K = 50 in authors’ experiment

Image patch x

ℎ1(𝑥)

ℎ𝑘(𝑥)

f1

f2

f3

ft

Page 48: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• Always keep a pool of M >> K weak classifier

candidates

h3

h1

h2 …

hM

Babenko et al., 09 M = 250 & K = 50 in authors’ experiment

Page 49: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• Update all M weak classifiers with positive and

negative bags

X3

{(X1, 1), (X2, 0), (X3, 0)}

X2 X1 x13

x12

x11

x21 x31

Classifier h1 Classifier h2 Classifier hM …

Babenko et al., 09

Page 50: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• Pick best K weak classifiers to form 𝑯(𝒙), where

ℎ𝑘 = argmaxℎ ∈ {ℎ1

… ℎ𝑀}log 𝐿(𝐻𝑘 − 1 + ℎ)

𝑯 𝒙 = ℎ𝑘(𝑥)

𝐾

𝑘=1

where 𝐻𝑘 − 1 is the classifier made up of the first

𝑘 − 1 weak classifiers

Babenko et al., 09

Page 51: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• Prediction : 𝑝 𝑦 = 1 𝑥 = 𝜎(𝑯(𝒙))

𝜎 𝑥 = 1

1 + 𝑒 − 𝑥

Babenko et al., 09

Page 52: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• Prediction : 𝑝 𝑦 = 1 𝑥 = 𝜎(𝑯(𝒙))

𝜎 𝑥 = 1

1 + 𝑒 − 𝑥

Babenko et al., 09

ℎ1 𝑥 = 2, ℎ2 𝑥 = 1.8, ℎ3 𝑥 = 0.6

f1

f2

f3

ℎ1(𝑥)

ℎ2(𝑥)

ℎ3(𝑥)

Page 53: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Online-MILBoost:

• Prediction : 𝑝 𝑦 = 1 𝑥 = 𝜎(𝑯(𝒙))

𝜎 𝑡 = 1

1 + 𝑒 − 𝑡

Babenko et al., 09

𝑯 𝒙 = ℎ𝑘(𝑥)

𝐾

𝑘=1

= 4.4

f1

f2

f3

ℎ1(𝑥)

ℎ2(𝑥)

ℎ3(𝑥)

𝑝 𝑦 = 1 𝑥 = 𝜎 𝑯 𝒙 = 0.99

ℎ1 𝑥 = 2, ℎ2 𝑥 = 1.8, ℎ3 𝑥 = 0.6

Page 54: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

MILTrack workflow

New frame comes in:

1. Crop out a set of image patches

𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}

2. Use MIL classifier to find new tracker location

𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))

3. Crop positive and negative examples near new object location

4. Online MILBoost:

Update MIL classifier with positive and negative

example bags

Babenko et al., 09

Page 55: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Experiments

Datesets: 8 publicly available videos, • Grayscale, 320 x 240 pixels

• Ground truth labeled every 5 frames by hand

Babenko et al., 2009

Coke can Girl Occluded face

David

Occluded face 2

Tiger 1 Sylvester Tiger 2

Page 56: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Experiments

Compared with: • OAB1

Online AdaBoost w/ 1 positive example per frame

• OAB5

Online AdaBoost w/ 45 positive examples per frame

• SemiBoost

Label in 1st frame only.

• FragTrack

Static appearance model

Babenko et al., 2009

Page 57: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Experiments

Evaluation criterion:

Tracker position error (pixels) w.r.t. Ground truth

Babenko et al., 2009

X X

Page 58: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Results

Video David

Babenko et al., 09

Page 59: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Results

Position error versus Frame #, Video David

Babenko et al., 2009

Page 60: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Results

Video Occluded Face

Babenko et al., 09

Page 61: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Results

Position error versus Frame #,

Video Occluded Face

Babenko et al., 2009

Page 62: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Results

Average Center location errors (pixels)

Best

Second Best

Babenko et al., 2009

Page 63: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Conclusion

• Online MILBoost: Online algorithm to update MIL-based classifier

• Performance of “MILTrack” is stable

Babenko et al., 2009

Page 64: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Discussion

• Why it can handle occlusion?

• Possible improvements

• Motion Model

• Features

• Part based representation

Babenko et al., 2009

Page 65: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Note…

Wu et al., 2013

Page 66: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Note…

Wu et al., 2013

Page 67: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Note…

Wu et al., 2013 Project 2 use this!

Page 68: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution

Thank You!

Q&A