MURI Annual Review, Vanderbilt, Sep 8th, 2009. Heterogeneous Sensor Webs for Automated Target Recognition and Tracking in Urban Terrain (W911NF-06-1-0076). Motion Pattern Analysis with(out) Trajectories. John Fisher, MIT CSAIL.

Page 1:

MURI Annual Review, Vanderbilt, Sep 8th, 2009

Heterogeneous Sensor Webs for Automated Target Recognition and Tracking in Urban Terrain

(W911NF-06-1-0076)

Motion Pattern Analysis with(out) Trajectories

John Fisher

MIT CSAIL

Page 2:

People interacting with people

What was he thinking?

Hard questions; most 9-year-olds wouldn’t know the answer anyway.

What was he reacting to?

Perhaps a little easier.

Can we analyze the dependency between the player interactions?

Page 3:

People interacting with people

Two camera angles provide noisy position estimates.

Can derive noisy estimates of kinematic state (e.g. position, velocity)

Could use other features as well (e.g. body position)

This gives multiple time series.

One for each player plus the ball.

Interaction can be cast as inference over the structure of influence between time-series.
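A minimal sketch of how a kinematic state might be derived from a position track, assuming uniformly sampled 2-D positions and plain finite differences (the slides do not say which estimator was used; `kinematic_state` and its arguments are hypothetical names):

```python
def kinematic_state(positions, dt):
    """Turn a list of (x, y) positions into (x, y, vx, vy) states.

    Velocity uses a central difference in the interior and a one-sided
    difference at the two ends. Assumes at least two samples, dt > 0.
    """
    n = len(positions)
    states = []
    for t in range(n):
        lo, hi = max(t - 1, 0), min(t + 1, n - 1)
        span = (hi - lo) * dt                      # time between the two samples used
        vx = (positions[hi][0] - positions[lo][0]) / span
        vy = (positions[hi][1] - positions[lo][1]) / span
        states.append((positions[t][0], positions[t][1], vx, vy))
    return states


# One time series per tracked object (each player plus the ball).
track = [(0.5 * t, 1.0 * t) for t in range(5)]     # synthetic straight-line track
states = kinematic_state(track, dt=0.5)
```

With noisy positions, differencing amplifies the noise, so in practice some smoothing or filtering would likely be applied first.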

Page 4:

People interacting with their environments

Object tracking is challenging in these scenarios.

Can we derive aggregate models of behavior?

Assume there is a persistent or slowly varying motion pattern.

Can we estimate aggregate motion patterns without tracking each individual?

Page 5:

Motion Pattern Analysis

Inference over aggregate properties of a dynamic scene

– Scenarios in which tracking each object is intractable
– Challenging when the properties live in a curved space

Integration of two mathematical formalisms

– Lie-algebraic representations of motion/deformation
– Variational inference in graphical models

Inference over the structure of interactions between multiple time-series

Suppose we are only interested in the graph describing interactions

– Complexity of inference over structure is super-exponential, $O(N^N)$, in the number of objects
– If the structure of interaction varies dynamically, complexity is exponential in the duration, $O((N^N)^T)$

Page 6:

Lie Groups and Lie Algebras

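Connected by matrix exponentiation and logarithm

$$G \;\underset{\exp}{\overset{\log}{\rightleftarrows}}\; \mathrm{Lie}(G)$$

$$T = \exp(X) = I + \sum_{n=1}^{\infty} \frac{X^n}{n!}$$

$$X = \log(T) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} (T - I)^n$$

Other properties

– Identity transform corresponds to zero vector: $I \leftrightarrow 0$
– Inverse transform corresponds to negation: $T^{-1} \leftrightarrow -X$
– Commutative multiplication corresponds to addition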
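The exponential and logarithm series for matrices, and the correspondences identity ↔ zero and inverse ↔ negation, can be checked numerically. The following is a pure-Python sketch using truncated series; the helper names and the example matrix `X` are illustrative assumptions, and the logarithm series only converges for transforms close to the identity.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_axpy(a, b, scale=1.0):
    """Return a + scale * b, elementwise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_exp(x, terms=30):
    """Truncated series exp(X) = I + sum_{n>=1} X^n / n!."""
    n = len(x)
    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, x)]  # X^k / k!
        result = mat_axpy(result, term)
    return result

def mat_log(t, terms=200):
    """Truncated series log(T) = sum_{n>=1} (-1)^(n+1) (T - I)^n / n."""
    n = len(t)
    d = mat_axpy(t, identity(n), scale=-1.0)       # T - I
    power = identity(n)
    result = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        power = mat_mul(power, d)                  # (T - I)^k
        result = mat_axpy(result, power, scale=(-1.0) ** (k + 1) / k)
    return result

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical generator: a small rotation combined with a small scaling.
X = [[0.1, -0.3], [0.3, 0.1]]
T = mat_exp(X)
neg_X = [[-v for v in row] for row in X]

round_trip_error = max_abs_diff(mat_log(T), X)                          # log(exp(X)) vs X
inverse_error = max_abs_diff(mat_mul(mat_exp(neg_X), T), identity(2))   # exp(-X) exp(X) vs I
```

A library routine (for example a matrix exponential from a numerical package) would normally replace the hand-written series; the series form is used here only because it mirrors the formulas on the slide.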

Page 7:

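Lie Group Action on Images

Lie group action

$$\circ : G \times M \to M, \quad (T, p) \mapsto T \circ p, \quad \text{s.t.} \quad I \circ p = p, \quad (T_1 T_2) \circ p = T_1 \circ (T_2 \circ p)$$

Two roles of the transform T

– Acting on geometric points: $x \mapsto T x$
– Acting on images: $T : I \mapsto T \circ I, \quad (T \circ I)(x) = I(T^{-1} x)$

Twofold role is key for estimating T directly from images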

Page 8:

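Action of the infinitesimal generator

T: transform, X: infinitesimal generator

$$X = \left. \frac{d}{dt} \exp(tX) \right|_{t=0}$$

$$X \circ I = \left. \frac{d}{dt} \big( \exp(tX) \circ I \big) \right|_{t=0}$$

[Figure: images at times $t_0, t_1, t_2, t_3, t_4$, with $X \circ I_{t_0}$ labeled.]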

Page 9:

The key relation

We derive the following relation between the motion space and the space of image changes

$$\Big( \sum_{k=1}^{n} \alpha_k E_k \Big) \circ I = \sum_{k=1}^{n} \alpha_k \, (E_k \circ I)$$

Left-hand side: the decomposition of motion. Right-hand side: the decomposition of image changes.

The motion can be inferred by decomposing the image changes

Page 10:

Computation

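Assume that the motion is characterized by

$$X = \sum_{k=1}^{K} \alpha_k E_k$$

$$\frac{\partial I_t}{\partial t} = X \circ I_t = \sum_{k=1}^{K} \alpha_k \, (E_k \circ I_t)$$

The left-hand side is approximated by finite difference:

$$\frac{\partial I_t}{\partial t} \approx \frac{1}{\Delta t} \, (I_{t+\Delta t} - I_t)$$

Each term on the right is the negated pointwise inner product of the gradient and the induced velocity:

$$(E_k \circ I_t)(x) = -\nabla I_t(x)^T V_k(x) = -\nabla I_t(x)^T E_k x$$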
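A minimal 1-D sketch of this computation, under assumptions that are not from the slides: a synthetic Gaussian "image", two hypothetical generators (translation with induced velocity V_1(x) = 1 and scaling with V_2(x) = x), and ordinary least squares solved through the 2x2 normal equations. The image change is approximated by a finite difference in time, each basis response is the negated product of the spatial gradient and the induced velocity, and the motion coefficients are fitted by regression.

```python
import math

def estimate_two_coefficients(img0, img1, xs, dt, v1, v2):
    """Least-squares fit of dI/dt ~ a1 * (E1 o I) + a2 * (E2 o I) on a 1-D grid."""
    h = xs[1] - xs[0]
    s11 = s12 = s22 = r1 = r2 = 0.0
    for i in range(1, len(xs) - 1):
        grad = (img0[i + 1] - img0[i - 1]) / (2.0 * h)   # spatial gradient
        b1 = -grad * v1(xs[i])                           # (E1 o I)(x) = -grad * V1(x)
        b2 = -grad * v2(xs[i])                           # (E2 o I)(x) = -grad * V2(x)
        y = (img1[i] - img0[i]) / dt                     # finite difference in time
        s11 += b1 * b1
        s12 += b1 * b2
        s22 += b2 * b2
        r1 += b1 * y
        r2 += b2 * y
    det = s11 * s22 - s12 * s12                          # 2x2 normal equations
    return (r1 * s22 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det


# Synthetic data: velocity field v(x) = a + b * x, so the flow for time dt maps
# x0 to (x0 + a/b) * exp(b * dt) - a/b, and (T o I)(x) = I(T^{-1} x).
a_true, b_true, dt = 0.5, -0.2, 1e-3
xs = [-4.0 + 0.05 * i for i in range(161)]

def image(x):
    return math.exp(-0.5 * x * x)

def inverse_flow(x):
    return (x + a_true / b_true) * math.exp(-b_true * dt) - a_true / b_true

img0 = [image(x) for x in xs]
img1 = [image(inverse_flow(x)) for x in xs]

a_hat, b_hat = estimate_two_coefficients(
    img0, img1, xs, dt, v1=lambda x: 1.0, v2=lambda x: x)
```

The recovered `a_hat` and `b_hat` should be close to `a_true` and `b_true`; the residual gap comes from the finite-difference approximations in time and space.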

Page 11:

Extension to Triangular Meshes

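– Extend the formalism with a triangle mesh
– Consistency at boundary: at a vertex $x$ shared by triangles with transforms $A_1, \ldots, A_6$, require $A_1 x = A_2 x = \cdots = A_6 x$

Consistent subspace (m triangles, n vertices), where $m_i$ is the number of triangles sharing vertex $i$:

$$\text{joint dim} = 6m$$

$$\text{constraints} = \sum_{i=1}^{n} 2(m_i - 1) = 6m - 2n$$

$$\text{subspace dim} = 6m - (6m - 2n) = 2n$$

The key relation still applies with this extension.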
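The dimension count can be checked on a small mesh. The sketch below assumes each triangle carries a 6-parameter affine transform and that every shared vertex contributes 2(m_i - 1) independent constraints, where m_i is the number of triangles meeting at vertex i; the function name and the hexagonal fan mesh are illustrative.

```python
def mesh_dimensions(triangles):
    """Return (joint_dim, constraints, subspace_dim, n_vertices) for a triangle list."""
    incident = {}
    for tri in triangles:
        for v in tri:
            incident[v] = incident.get(v, 0) + 1       # m_i: triangles touching vertex i
    m, n = len(triangles), len(incident)
    joint_dim = 6 * m                                   # one affine transform per triangle
    constraints = sum(2 * (m_i - 1) for m_i in incident.values())
    return joint_dim, constraints, joint_dim - constraints, n


# Hexagonal fan: center vertex 0 surrounded by rim vertices 1..6 (six triangles).
fan = [(0, i, i % 6 + 1) for i in range(1, 7)]
joint_dim, constraints, subspace_dim, n_vertices = mesh_dimensions(fan)
```

Because every triangle has three vertices, the m_i sum to 3m, which is why the constraint count equals 6m - 2n and the consistent subspace has dimension 2n (two degrees of freedom per vertex).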

Page 12:

Efficient Inference Over Deformations

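[Figure: computation pipeline. The temporal derivative $\partial I_t / \partial t$ is approximated by the finite difference $(I_{t+\Delta t} - I_t)/\Delta t$. The gradient $\nabla I_t$ is combined with each generator $E_1, \ldots, E_K$ by pointwise dot product to give $E_1 \circ I_t, \ldots, E_K \circ I_t$. The coefficients $\alpha_1, \ldots, \alpha_K$ are estimated by regression.]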

Page 13:

Multi-flow probabilistic model

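[Figure: graphical model with M flow models $G$. Site $i$ has variables $g_i$, $y_i$, $c_i$, $v_i$ and a flow indicator $z_i$; site $j$ has $g_j$, $y_j$, $c_j$, $v_j$ and a flow indicator $z_j$. The links are labeled MRF.]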

Page 14:

People interacting with their environments (Lie group)

Page 15:

People interacting with their environments (Lie group)

Page 16:

People interacting with their environments (optical flow)

Page 17:

Object Interaction Analysis

N tracked objects (cars, people, genes, consumers, etc.)

Observe T noisy samples of state (e.g. position/velocity)

Can we infer properties of the interaction?

– Who is interacting with whom?
– Who is the probable leader?
– Is the nature of interaction changing over time?

Page 18:

Outline

Challenges of Dependence Analysis

– Static Dependence
– Dynamic Dependence

Factorization Model

Temporal Interaction Model

Conclusion

Page 19:

Static Dependence Analysis

Given

– Observations:

– Generative Model:

– Prior:

Find

Page 20:

Challenges of Static Analysis

Structural specification

Unknown parameters

Number of structures

– e.g. we explore more than $10^9$ structures

Page 21:

Temporal Interaction Model

[Figure: interaction graph over three time-series x1, x2, x3 (nodes 1, 2, 3).]

Page 22:

Conjugate Prior on Parameters

Given the structure, parameters are independent:

– and modular:

is the same for all such that

[Figure: two example structures over nodes 1, 2, 3.]

Page 23:

Conjugate Prior on Structures

The prior on structure factorizes as a product of weights on parent sets:

[Figure: example structure priors: Uniform, Dense, Sparse.]

Page 24:

Posterior

Posterior on structure is a simple update to the parent set weights

where:

Matrix-T

Matrix Normal-Inverse-Wishart

Matrix Normal

Page 25:

Computing the Partition Function

How many structures are there?

– Each time-series has possible parents

– time-series

Super-exponential in N

parent sets

structures

Page 26:

Computing the Partition Function

All Directed Structures

Super-exponential to Exponential
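One way to read the reduction from super-exponential to exponential: if the unnormalized weight of a structure is a product of per-series parent-set weights, the sum over all structures factorizes into a product of per-series sums. The sketch below checks this on a toy problem. It assumes every subset of the N series (self included) is an allowed parent set and uses random weights; the function names and weights are hypothetical, not the ones used in the talk.

```python
import random
from itertools import combinations, product

def all_parent_sets(n):
    """Every subset of {0, ..., n-1}: 2^n candidate parent sets."""
    return [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]

def partition_brute_force(weights, parent_sets):
    """Sum over all (2^n)^n structures of the product of parent-set weights."""
    n = len(weights)
    total = 0.0
    for structure in product(parent_sets, repeat=n):    # one parent set per series
        term = 1.0
        for i, s in enumerate(structure):
            term *= weights[i][s]
        total += term
    return total

def partition_factorized(weights):
    """Same quantity as a product of n sums, each over 2^n parent sets."""
    z = 1.0
    for w_i in weights:
        z *= sum(w_i.values())
    return z


random.seed(0)
n_series = 3
parent_sets = all_parent_sets(n_series)
weights = [{s: random.uniform(0.1, 1.0) for s in parent_sets} for _ in range(n_series)]

z_slow = partition_brute_force(weights, parent_sets)    # 8^3 = 512 terms
z_fast = partition_factorized(weights)                  # 3 * 8 = 24 terms
```

The factorized form still touches every parent set of every series, so its cost grows exponentially in N unless the parent sets are restricted.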

Page 27:

Bounded Parent Sets

Assume

– structures (still super-exponential)

Polynomial-time computation
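A small sketch of why bounding the parent-set size helps, assuming the bound is a maximum of K parents per time-series drawn from N candidates (the symbol K and the function name are assumptions, since the slide's formula did not survive). The number of candidate parent sets per series is then a sum of binomial coefficients, which is polynomial in N for fixed K.

```python
from math import comb

def num_parent_sets(n, k_max=None):
    """Number of subsets of n candidates with at most k_max elements."""
    if k_max is None:
        k_max = n                                   # unbounded: all 2^n subsets
    return sum(comb(n, k) for k in range(k_max + 1))


n_series = 11                                       # illustrative value only
unbounded = num_parent_sets(n_series)               # 2^11
bounded = num_parent_sets(n_series, k_max=3)        # 1 + 11 + 55 + 165
n_structures_bounded = bounded ** n_series          # still a very large number
```

The number of structures remains enormous, but a factorized partition function only needs one sum per series over the bounded list of parent sets, which is where the polynomial-time computation comes from.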

Page 28:

Switching Vector Autoregressive Tree

SVART(1)

Page 29:

Bayesian Reasoning over Interaction Structures

Page 30:

Basketball

Page 31:

Basketball Results

Using a STIM with 10 states with

[Figure panels: Team A on Offense; Team B on Offense; Team B trans. to Offense; Team A trans. to Offense.]

Page 32:

Basketball Event Probabilities

[Figure panels: Team A on Offense; Team B on Offense; Team B trans. to Offense; Team A trans. to Offense.]

Page 33:

Expected Number of Children

Top 4

– Ball (1.8)
– Point Guard A (1.7)
– Forward 1 A (1.1)
– Forward 1 B (1.0)

Page 34:

Influence of Point Guard A

[Figure: scale with ticks at 0.0, 0.5, 1.0.]

Page 35:

Prior and Posterior Expectations

Can tractably compute expectations of:

– Multiplicative functions on structure
– Additive functions on structure

Allows one to calculate:

– Expected number of children (How influential?)
– Expected number of parents (How impressionable?)
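Expected numbers of children and parents are additive functions of the structure, so by linearity of expectation they reduce to sums of per-edge probabilities. The sketch below assumes the distribution over structures factorizes into independent per-series weights on parent sets; the toy weights and function names are hypothetical, and self-edges are counted as children here.

```python
def edge_probability(weights_i, j):
    """P(j is a parent of series i) under normalized parent-set weights."""
    z = sum(weights_i.values())
    return sum(w for s, w in weights_i.items() if j in s) / z

def expected_children(weights, j):
    """How influential is j: sum over series i of P(j is a parent of i)."""
    return sum(edge_probability(w_i, j) for w_i in weights)

def expected_parents(weights, i):
    """How impressionable is i: expected size of its parent set."""
    z = sum(weights[i].values())
    return sum(len(s) * w for s, w in weights[i].items()) / z


# Toy weights for two series (0 and 1); keys are candidate parent sets.
weights = [
    {frozenset(): 1.0, frozenset({0}): 1.0, frozenset({1}): 1.0, frozenset({0, 1}): 1.0},
    {frozenset(): 0.0, frozenset({0}): 3.0, frozenset({1}): 1.0, frozenset({0, 1}): 0.0},
]

children_0 = expected_children(weights, 0)   # 0.5 + 0.75
children_1 = expected_children(weights, 1)   # 0.5 + 0.25
parents_0 = expected_parents(weights, 0)
parents_1 = expected_parents(weights, 1)
```

Every edge has one parent end and one child end, so the expected children summed over all series equals the expected parents summed over all series.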

Page 36:

Comments

Focus was on vision sensors, but both methods have wider application

– Seismic data (petroleum exploration)
– Audio-visual association (multi-media annotation)

Thank you