E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do...

38
EVENT DETECTION USING AN ATTENTION-BASED TRACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson

Transcript of E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do...

Page 1: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

EVENT DETECTION USING AN ATTENTION-BASED TRACKER

22 Outubro 2007. Universidade Federal do Paraná.Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson

Page 2: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Glasgow Airport

4 cameras

9 video clips 1 for training 8 for testing

Dataset Description

2

Page 3: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Loitering Staying in the scene for > 1 minute

Theft Taking someone else’s baggage

Left luggage Leaving the scene without your luggage

Evaluation Correct event times Correct 3D locations Without false detections

Challenge Problem

3

Page 4: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Basic tracking first

Detect interesting tracks

Improve interesting tracks to detect events

Our Approach

4

Page 5: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Processing Pipeline

5

Tracking & Event

Detection

Input Data

Background Subtraction

BackgroundModelingIllumination

Normalization

Page 6: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Illumination Changes

6

BA

CK

GR

OU

ND

126 999

2999

S08

Page 7: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Illumination Changes by Clip

7

0

20

40

60

80

100

120

Intra-Clip Time

Mea

n In

tens

ity

Page 8: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Background Modeling

8

Tracking & Event

Detection

Input Data

Background Subtraction

Robust Gaussian Fit

Ground-Plane Registration

Background Model Search

NormalizedBACKGROUND

NormalizedS01-S08

BackgroundModeling

Background Models

IlluminationNormalization

Page 9: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Fundamental assumption foreground is rare at every pixel

Reality for PETS 2007… background is rare for the pixels we care about

most Some pixels: foreground as much as 90% of the time

Breaking Adaptive Background Subtraction

9

0

2000

3000

4000

Page 10: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Robust Gaussian Fit (per pixel)

10

BACKGROUND clip Foreground is rare

everywhere

Fit a Gaussian Refit to inliers

Page 11: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Need for Model Adaptation

11

Another clip (S02) BACKGROUND’s

model: Suboptimal fit

BACKGROUND’s Gaussian modelBackground samplesForeground samples

BACKGROUND’s Gaussian modelBackground samplesForeground samples

Page 12: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Model Adaptation

12

Until convergence Find inliers Shift Gaussian center

Page 13: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Improvement

13

Adaptation Is robust Improves FG/BG

classification rates

Before adaptationAfter adaptationBefore adaptationAfter adaptation

Page 14: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Background Subtraction

14

Tracking & Event

Detection

Input Data

Background Subtraction

BackgroundModelingIllumination

Normalization

MRF blobsBlob

Extraction

Mahalanobis Distance

Background Models

NormalizedS01-S08

Page 15: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Background Subtraction

15

Frame 2000 With Foreground Mask Foreground Mask

Mahalanobis DistancesAdapted ModelNormalized

- =

Page 16: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Tracking & Event Detection

16

Tracking & Event

Detection

Input Data

Background Subtraction

BackgroundModelingIllumination

Normalization

alerts

blobsKalman Tracking

Meanshift

Tracking

Event Detection

Page 17: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Idea: Focus on tracking what we care about. Loitering humans Dropped luggage that becomes dissociated from

its owner

Kalman tracking Constant velocity Low false positive rate

Blob Tracking

17

Page 18: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Loitering humans Remain in the scene for a long time Likely to create isolated tracks

Dispossessed luggage Likely to create at least an isolated blob detection

Detecting Humans and Luggage

18

Object Type

Min. Blob Area (% of

frame)

Max. Blob Area (% of

frame)

Min. Blob Track Length

humans 1.5% 3.0% 16s

luggage 0.2% 1.0% 1 frame

Page 19: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Blob tracking Yields high-quality tracks (good) Requires isolated blobs (bad)

Meanshift tracker Learn a model (color histogram) from the good blob

tracks Tracks through occlusions

Humans Find scene entry/exit times

Luggage Find drop/pickup times Associate with human owners

Mean-Shift Tracking

19

Page 20: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Results

20

Page 21: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

No events occur

None detected

S00 – No Defined Behavior

21

Page 22: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Staged loitering 5.1s late

S01 – General Loitering 1 (Easy)

Page 23: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Staged loitering 1.4s late

S02 – General Loitering 2 (Hard)

23

Page 24: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Staged loitering 8.2s late

Dropped luggage Should not trigger an alarm No alarm triggered

Staged loitering man near purple-outlined woman Missed

Unscripted loitering Detected

S03 – Bag Swap 1 (Easy)

24

Page 25: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

S04 – Bag Swap 2 (Hard)

25

Stay close to each other the whole time

Page 26: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

S05 – Theft 1 (Easy)

26

Victim enters 19.2s late

Luggage stolen 0.08s late

Thief exits 0.08s late

Page 27: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

S06 – Theft 2 (Hard)

27

Thief

Luggage (never isolated)

Victims

Assistant thief

Page 28: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

S07 – Left Luggage 1 (Easy)

28

Luggage dropped 0.12s late

Owner tracked Luggage taken

0.08s late By owner

Page 29: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

S08 – Left Luggage 2 (Hard)

29

Owner drops: 0.2s late

Owner picks up: 0.12s late

Page 30: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

PETS 2007 Problem description, datasets, etc. http://www.pets2007.net

My Website The paper http://people.csail.mit.edu/dalleyg

Websites

30

Page 31: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Perguntas

31

Page 32: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Illumination Normalization

32

~ci ;t = § ¡ 12

t (ci ;t ¡ ¹ct);~ci ;t = § ¡ 12

t (ci ;t ¡ ¹ct);

where

ci ;t = the color of pixel i in frame t;

¹ct =1N

NX

i=1

ci ;t; and

§ t =1

N ¡ 1

NX

i=1

(ci ;t ¡ ¹ct)2:

ci ;t = the color of pixel i in frame t;

¹ct =1N

NX

i=1

ci ;t; and

§ t =1

N ¡ 1

NX

i=1

(ci ;t ¡ ¹ct)2:

Page 33: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Region-of-Interest Masks

33

BG-S04

S06

S05

S07-S08

Page 34: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Background Subtraction

34

Tracking & Event

Detection

Input Data

Background Subtraction

BackgroundModelingIllumination

Normalization

BG-Biased MRF

FG-Biased MRF

Blob Extractio

n

fragmented blobs

merged blobs

Blob Extractio

n

Mahalanobis Distance

Background Models

NormalizedS01-S08

Page 35: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Dual Background Subtraction

35

Low fragmentation But blobs merged Good for human

tracking

Mahalanobis Distance Map

Background-Biased Blobs

Foreground-Biased Blobs

Sharp boundaries But fragmented

blobs Good for

dropped luggage detection

Page 36: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Tracking & Event Detection

36

Tracking & Event

Detection

Input Data

Background Subtraction

BackgroundModelingIllumination

Normalization

alerts

fragmented blobs

mergedblobs

Kalman Tracking

Kalman Tracking

Meanshift

Tracking

Event Detection

Page 37: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Luggage: it often doesn’t travel far from the owner, so we need BG-biased

to avoid merging the dropped luggage blob with the owner The floor is more boring (except for specularities), so

camouflaging doesn’t occur much there, relative to the vertical surfaces

Easy to tell people fragments from luggage: small people fragments move They move when they’re isolated They move before and after isolation

Humans With the busyness of the scene, a BG-biased MRF produces a lot

of fragments and many-to-many blob matching quickly becomes impractical

A FG-biased MRF avoids the fragmentation issue but merges lots of blobs We only care about loiterers

Loiterers are in the scene for a long time They’re likely to be isolated from other people at least at some point in time

Why Dual Trackers/Motion Blobs

37

Page 38: E VENT D ETECTION U SING AN A TTENTION -B ASED T RACKER 22 Outubro 2007. Universidade Federal do Paraná. Gerald Dalley, Xiaogang Wang, and W. Eric L. Grimson.

Oriented Ellipsoids

38