MURI Update 7/26/12 UC Berkeley Prof. Trevor Darrell (UCB) Prof. Michael Jordan (UCB) Prof. Brian Kulis (fmr UCB postdoc, now Asst. Prof. OSU) Dr. Stefanie Jegelka (UCB postdoc)

Page 1:

MURI Update 7/26/12

UC Berkeley

Prof. Trevor Darrell (UCB)

Prof. Michael Jordan (UCB)

Prof. Brian Kulis (fmr UCB postdoc, now Asst. Prof. OSU)

Dr. Stefanie Jegelka (UCB postdoc)

Page 2:

Recent Effort

• NPB inference of visual structures
– Objects
– Trajectories

• Develop richer representations (appearance, shape)

• Efficient implementation in constrained environments

• Distributed and decentralized variants

Page 3:

NPB Trajectory Models

• Extend NPB models to trajectory domains:
– find structure in motion trajectories
– identify anomalies
– considering Bluegrass data and possibly ARGUS track data via LLNL collaboration

• HDP and hard clustering variants

• Decentralized / distributed implementation


Page 4:

New Algorithms for Hard Clustering via Bayesian Nonparametrics

Mixture of Gaussians → k-means (fix covariances, take limit)

Mixture of Gaussians → DP mixture (make Bayesian)

DP mixture → ?? (take limit)

Generalized to Hard HDP as well [Kulis and Jordan, ICML 2012]

• Kulis Presentation

Page 5:

Small Variance Asymptotics, Bayesian Nonparametrics, and k-means

Brian Kulis

Ohio State University

Joint work with Michael Jordan (Berkeley), Ke Jiang (OSU), and Tamara Broderick (Berkeley)

Page 6:

Generalizing k-means

Mixture of Gaussians → k-means (fix covariances, take limit)

Mixture of Gaussians → DP mixture (make Bayesian)

DP mixture → ???? (take limit)

[Kulis and Jordan, ICML 2012]

Page 7:

Why Should We Care?

• For “k-means people”
– Bayesian techniques permit great flexibility
– Extensions to multiple data sets would not be obvious without this connection

• For Bayesians
– Hard clustering methods scale better
– Connections to graph cuts and spectral methods
– k-means just works

Page 8:

Gaussian Mixture Models

•No closed-form for optimizing likelihood

• Typically resort to EM algorithm

Page 9:

EM Algorithm

E-Step

M-Step
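
The two steps named on this slide can be sketched as follows for a mixture of Gaussians whose components share a fixed spherical covariance sigma^2 I. The slide's own equations did not survive extraction, so this is a minimal sketch under that assumption; the names `e_step`, `m_step`, and `fit_gmm` and the random initialization are illustrative, not taken from the talk.

```python
import numpy as np

def e_step(X, centers, weights, sigma):
    """Responsibilities r[i, c]: posterior probability that point i came from component c."""
    # Squared Euclidean distance from every point to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    log_r = np.log(weights)[None, :] - d2 / (2.0 * sigma ** 2)
    log_r -= log_r.max(axis=1, keepdims=True)   # numerical stability
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)

def m_step(X, r):
    """Re-estimate mixing weights and means from the responsibilities."""
    nk = r.sum(axis=0) + 1e-12                  # guard against empty components
    weights = nk / len(X)
    centers = (r.T @ X) / nk[:, None]
    return centers, weights

def fit_gmm(X, k, sigma=1.0, n_iter=50, seed=0):
    """Alternate E-step and M-step for a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        r = e_step(X, centers, weights, sigma)
        centers, weights = m_step(X, r)
    return centers, weights, r
```

The covariance is held fixed rather than re-estimated because the talk goes on to study what happens as sigma shrinks.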

Page 10:

k-means

Mixture of Gaussians → k-means (fix covariances, take limit)

In the limit as sigma goes to 0, this value is 1 if centroid c is the closest, and 0 otherwise
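
The limit can be checked numerically. The sketch below assumes "this value" is the E-step responsibility of a Gaussian mixture with equal mixing weights and shared covariance sigma^2 I; the function name `responsibilities` and the example points are illustrative.

```python
import numpy as np

def responsibilities(x, centers, sigma):
    """Soft assignment of point x to each center, assuming equal mixing weights."""
    d2 = ((centers - x) ** 2).sum(axis=1)
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max()          # avoid underflow as sigma shrinks
    w = np.exp(logits)
    return w / w.sum()

centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
x = np.array([1.0, 0.5])

# As sigma decreases, the weight concentrates on the nearest center.
for sigma in (2.0, 0.5, 0.05):
    print(sigma, np.round(responsibilities(x, centers, sigma), 4))
```

With hard 0/1 responsibilities, the E-step becomes the k-means assignment step and the M-step becomes the k-means mean update.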

Page 11:

Linear-Gaussian Latent Feature Models

For mixture of Gaussians:

Latent Variables:

Data:

Likelihood:

Page 12:

Log-Likelihood Asymptotics

Page 13:

Other Linear-Gaussian Models

(Essentially) probabilistic PCA

Let Z have a continuous distribution

As sigma goes to 0, get standard PCA

Page 14:

The Polya Urn

•Imagine an urn with theta balls for each of k colors

•Pick a ball from the urn, replace the ball and add another ball of that color to the urn

•Repeat n times

•Induces a distribution over Z
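
The urn scheme above can be simulated directly. This is a small sketch; the function name `polya_urn` and the use of a seeded generator are illustrative choices, and theta is allowed to be non-integer by treating ball counts as weights.

```python
import random

def polya_urn(k, theta, n, seed=0):
    """Draw n colors from an urn that starts with theta balls of each of k colors."""
    rng = random.Random(seed)
    counts = [float(theta)] * k      # current number of balls per color
    z = []
    for _ in range(n):
        # Pick a ball with probability proportional to its color's count.
        color = rng.choices(range(k), weights=counts)[0]
        counts[color] += 1.0         # replace the ball and add one more of that color
        z.append(color)
    return z

print(polya_urn(k=3, theta=1.0, n=20))
```

The returned list plays the role of the assignment vector Z: colors drawn early become more likely later, which is the rich-get-richer effect.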

Page 15:

The Dirichlet-Multinomial Distribution

(If theta is an integer, the gamma functions reduce to factorials.)

Page 16:

Small-Variance Asymptotics

Nothing interesting happens unless:

In this case, we obtain:

Page 17:

The Chinese Restaurant Process

•Customers sequentially enter the restaurant

•First customer sits at first table

•Subsequent customers sit at occupied table with probability proportional to number of occupants

•Start a new table with probability proportional to theta

[Figure: customers 1 through 6 seated at tables]
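
The seating rule on this slide translates into a short simulation. This is a minimal sketch; the function name `crp` and its return format are illustrative.

```python
import random

def crp(n, theta, seed=0):
    """Seat n customers; return each customer's table index and the table sizes."""
    rng = random.Random(seed)
    table_sizes = []     # number of occupants at each table
    seating = []
    for _ in range(n):
        # Occupied tables are weighted by size, a new table by theta.
        weights = table_sizes + [theta]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)        # open a new table
        else:
            table_sizes[table] += 1
        seating.append(table)
    return seating, table_sizes

seating, sizes = crp(n=20, theta=1.0)
print(seating, sizes)
```

Unlike the Polya urn with a fixed number of colors, the number of tables is not set in advance and grows with n at a rate controlled by theta.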

Page 18:

The Chinese Restaurant Process

The exchangeable partition probability function (EPPF):

Selecting theta as before yields, asymptotically:

Page 19:

Algorithmic Perspective: Gibbs Sampling

•Suppose we want to sample from p(x), where

•Repeatedly sample as follows
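
The slide's equations did not survive extraction. As a generic illustration of the idea (resample each coordinate in turn from its conditional given the others), here is a Gibbs sampler for a standard bivariate Gaussian with correlation rho. The target distribution and the name `gibbs_bivariate_normal` are assumptions chosen for the example, not the model from the talk.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampling for (x1, x2) ~ N(0, [[1, rho], [rho, 1]])."""
    rng = random.Random(seed)
    x1, x2 = 0.0, 0.0
    cond_sd = math.sqrt(1.0 - rho ** 2)      # std. dev. of each conditional
    samples = []
    for t in range(burn_in + n_samples):
        x1 = rng.gauss(rho * x2, cond_sd)    # sample x1 given x2
        x2 = rng.gauss(rho * x1, cond_sd)    # sample x2 given x1
        if t >= burn_in:
            samples.append((x1, x2))
    return samples
```

Each update only needs a one-dimensional conditional, which is what makes the scheme practical for the cluster-assignment posteriors considered next.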

Page 20:

Collapsed Gibbs Sampling for CRP/DP Mixtures

•Want to sample from the posterior:

•Need the following to do Gibbs sampling:

Page 21:

Asymptotics of the Gibbs Sampler

• Now, we would like to see what happens when sigma goes to 0

• Need one additional thing

• theta must be a function of sigma:

DP Mixture → ????

Page 22:

Asymptotics of the Gibbs Sampler

Existing Clusters

Start a New Cluster

In the limit:

Page 23:

DP-means

Page 24:

Underlying Objective Function

Theorem: The DP-means algorithm monotonically minimizes this objective until local convergence.
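
The objective formula itself is missing from this transcript. The sketch below assumes the form given in the cited ICML 2012 paper: the k-means cost plus a penalty lambda for each cluster, with a new cluster opened whenever a point's squared distance to every existing center exceeds lambda. The function names and the handling of empty clusters are illustrative.

```python
import numpy as np

def dp_means(X, lam, max_iter=100):
    """Hard clustering in which opening a new cluster costs lam."""
    centers = [X.mean(axis=0)]               # start with one global cluster
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        old_labels = labels.copy()
        for i, x in enumerate(X):
            d2 = [float(((x - c) ** 2).sum()) for c in centers]
            j = int(np.argmin(d2))
            if d2[j] > lam:                  # too far from every center
                centers.append(x.copy())
                j = len(centers) - 1
            labels[i] = j
        # Drop empty clusters, then recompute each center as its cluster mean.
        used = np.unique(labels)
        labels = np.searchsorted(used, labels)
        centers = [X[labels == c].mean(axis=0) for c in range(len(used))]
        if np.array_equal(labels, old_labels):
            break
    return np.array(centers), labels

def dp_means_objective(X, centers, labels, lam):
    """Sum of squared distances to assigned centers, plus lam per cluster."""
    return float(((X - centers[labels]) ** 2).sum() + lam * len(centers))
```

A larger lam yields fewer clusters; as lam grows without bound the method returns a single cluster at the global mean.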

Page 25:

3 Gaussians

Page 26:

Clustering with Multiple Data Sets

Want to cluster each data set, but also want to share cluster structure

Use the Hierarchical Dirichlet Process (HDP)!

Page 27:

The Hard Gaussian HDP: Objective

Theorem: The Hard HDP algorithm monotonically minimizes this objective until local convergence.

Page 28:

Exponential Families

•What happens when we replace the Gaussian likelihood with an arbitrary exponential family distribution?

•Do asymptotics where the mean is fixed but the covariance goes to 0:

Page 29:

Exponential Families

• Use the standard EF conjugate prior

• Utilize Bregman divergences:

• Asymptotics of Gibbs require Laplace’s method on the marginal likelihood

• End up with same result as before, with Bregman divergence replacing squared Euclidean distance
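
For reference, a Bregman divergence is defined from a convex function phi as D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>. The sketch below (function names are illustrative) shows two standard special cases: the squared norm recovers squared Euclidean distance, and negative entropy recovers KL divergence on the probability simplex.

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return float(phi(x) - phi(y) - grad_phi(y) @ (x - y))

def sq_norm(v):
    """phi(v) = ||v||^2, which generates squared Euclidean distance."""
    return float(v @ v)

def sq_norm_grad(v):
    return 2.0 * v

def neg_entropy(p):
    """phi(p) = sum p log p, which generates KL divergence on the simplex."""
    return float((p * np.log(p)).sum())

def neg_entropy_grad(p):
    return np.log(p) + 1.0

x, y = np.array([1.0, 2.0]), np.array([3.0, 5.0])
print(bregman(sq_norm, sq_norm_grad, x, y), float(((x - y) ** 2).sum()))
```

Swapping the divergence in the assignment step while keeping the mean as the cluster representative is the pattern familiar from Bregman hard clustering.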

Page 30:

Illustration: Hard Topic Models

Page 31:

Overlapping Clusters / Binary Feature Models

So far, we have considered single-assignment models

What if each point can be assigned to multiple clusters?

Page 32:

Overlapping Clusters

Columns: Single Assignment | Multi Assignment

Rows: Non-Bayesian | Bayesian | Bayesian Nonparametric

Can perform analogous small-variance asymptotics using the above multi-assignment distributions

Page 33:

Indian Buffet Process

•First customer samples dishes

•Subsequent customer i samples existing dishes with probability equal to fraction of previous customers sampling that dish

•Also samples new dishes
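
The buffet metaphor can be simulated as below. The slide omits the rates, so this sketch assumes the standard IBP parameterization: customer i takes an existing dish with probability m / i, where m is the number of earlier customers who took it, and then samples Poisson(alpha / i) new dishes. The parameter `alpha` and the function name `indian_buffet` are not from the slide.

```python
import numpy as np

def indian_buffet(n, alpha, seed=0):
    """Return a binary matrix Z (customers x dishes) drawn from the IBP."""
    rng = np.random.default_rng(seed)
    dish_counts = []      # number of customers who have sampled each dish
    rows = []
    for i in range(1, n + 1):
        # Existing dish k is taken with probability dish_counts[k] / i.
        row = [int(rng.random() < m / i) for m in dish_counts]
        for k, taken in enumerate(row):
            dish_counts[k] += taken
        new = int(rng.poisson(alpha / i))    # new dishes for customer i
        row.extend([1] * new)
        dish_counts.extend([1] * new)
        rows.append(row)
    # Pad the ragged rows into a rectangular binary matrix.
    Z = np.zeros((n, len(dish_counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

print(indian_buffet(n=6, alpha=2.0))
```

Each row of Z can contain several ones, which is what allows a point to belong to multiple clusters (features) at once.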

Page 34:

Indian Buffet Process Small-Variance Asymptotics

Number of points possessing feature c

Number of new dishes sampled by customer i

Page 35:

Indian Buffet Process Small-Variance Asymptotics

(Reduces to DP-means objective for single assignments)

Page 36:

Spectral Clustering Connections

•Standard spectral relaxation

•Write as

•Relax Y to be an arbitrary orthogonal matrix

•Take matrix of top k eigenvectors of K

Page 37:

Spectral Relaxation for DP-means

•How do we extend this?

•Write as

•Relax Y to be an arbitrary orthogonal matrix

•Take matrix of eigenvectors of K whose eigenvalues are greater than lambda
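
Both relaxations come down to selecting eigenvectors of the kernel matrix K: a fixed count k for k-means, or an eigenvalue threshold lambda for DP-means. The sketch below assumes K is symmetric and that the relaxed DP-means problem maximizes tr(Y^T (K - lambda I) Y) over matrices Y with orthonormal columns; the function name `spectral_relaxation` is illustrative.

```python
import numpy as np

def spectral_relaxation(K, k=None, lam=None):
    """Select eigenvectors of symmetric K by count (k) or by threshold (lam)."""
    eigvals, eigvecs = np.linalg.eigh(K)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if k is not None:
        keep = np.arange(len(eigvals)) < k      # k-means: top k eigenvectors
    else:
        keep = eigvals > lam                    # DP-means: eigenvalues above lam
    return eigvecs[:, keep], eigvals[keep]

K = np.diag([5.0, 3.0, 0.5])
Y, vals = spectral_relaxation(K, lam=1.0)
print(Y.shape, vals)
```

Under the assumed objective, each retained eigenvector contributes its eigenvalue minus lambda, so only eigenvalues above lambda improve the objective and the number of columns is chosen by the data rather than fixed in advance.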

Page 38:

Graph Clustering

•Can further take advantage of connections between k-means and graph clustering

•Generalize hard DP objective to weighted and kernel forms

•Special cases lead to graph clustering objectives that penalize the number of clusters in the graph

Page 39:

Conclusions and Open Questions

• Focus was on obtaining non-probabilistic models from probabilistic ones

• Small-variance asymptotics for Bayesian nonparametrics yields models which regularize / penalize
– Number of clusters
– Number of features
– Number of topics

• Also yields algorithmic insights

Page 40:

Conclusions and Open Questions

• Spectral or semidefinite relaxation for the hard HDP?

• Better algorithms?
– Local Search
– Multilevel Methods
– Split/Merge
– Incorporate ideas from other inference schemes

• Applications?

• Other models?

Page 41:

Thanks!

Page 42:

2) Decentralized models

• Distributed Hard Topic Models

• Jegelka presentation

Page 43:

Decentralized k-means algorithms

• Data distributed

• Clusters shared globally
– Cluster assignment
– Update centers

Communication

Where to store centers?

Page 44:

Decentralized k-means

• Local copies, gossip [Datta et al,…]
– Local assignment & update
– Sharing & averaging with neighbor(s)
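
The two sub-steps above can be illustrated with a toy simulation. This is only a sketch of the general gossip pattern, not the cited authors' algorithm: every node keeps its own copy of the centers, runs one k-means step on its local data, then averages its centers with one randomly chosen neighbor. The unweighted average, the function names, and the data layout are all simplifying assumptions.

```python
import numpy as np

def local_update(X, centers):
    """One local k-means step on a node's own data."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    new_centers = centers.copy()
    for c in range(len(centers)):
        if np.any(labels == c):              # keep the old center if no points
            new_centers[c] = X[labels == c].mean(axis=0)
    return new_centers

def gossip_kmeans(node_data, init_centers, neighbors, n_rounds=20, seed=0):
    """node_data: list of arrays; neighbors: list of neighbor-index lists."""
    rng = np.random.default_rng(seed)
    centers = [init_centers.copy() for _ in node_data]
    for _ in range(n_rounds):
        # Local assignment and update on every node.
        centers = [local_update(X, c) for X, c in zip(node_data, centers)]
        # Sharing and averaging with one random neighbor per node.
        for i in range(len(node_data)):
            j = int(rng.choice(neighbors[i]))
            avg = 0.5 * (centers[i] + centers[j])
            centers[i], centers[j] = avg, avg.copy()
    return centers
```

No node ever sees another node's raw data; only the small matrix of centers is exchanged, which is the communication saving this family of methods aims for.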

Page 45:

Decentralized k-means

• Local copies, gossip [Datta et al,…]

• Single copy of each mean
– Summaries & pruning for restricted comparisons [Papatreou et al.]

Page 46:

New questions

• Decentralized cluster creation?
– Sequential vs. parallel

• Naturally hierarchical

• Exploit structure to reduce communication
– Partial sharing, e.g. locality
– Integration into model?

• Generalization to IBP, PYP, …

Page 47:

Decentralized clustering

• Observe trajectories in a scene

• Cluster locations by local traffic behavior: HDP

anomalies, traffic prediction, …

• Clusters are not omnipresent: partial sharing

Page 48:

3) Trajectory Data

• Scalable Topic Models for Trajectories

• Apply current results on scalable topic modeling to problems in vision
– find structure in motion trajectories
– identify anomalies
– considering Bluegrass data and possibly ARGUS track data via LLNL collaboration

Page 49:

Trajectory Video

Page 50:

Experimental Results

Road Network

Track Fragment Vocabulary (Colored by Speed)

Page 51:

Experimental Results

Road network

Query trajectory

Query trajectory with vocabulary below

Highlights the vocabulary assigned to the query trajectory

Page 52:

Querying Results

Note this track “word”

Northbound Trajectory stops at intersection, turns left, continues East

Page 53:

Querying Results

Eastbound Trajectory

Page 54:

Querying Results

Eastbound Trajectory, Stops at Traffic Light

Page 55:

Querying Results

Westbound Trajectory, Stops at Traffic Light

Page 56:

Right-Hand Turn

K-Nearest Neighbors

Trajectories on Map

QUERY

Trajectory segments are 10 seconds in length, sampled every 2.5 seconds (75% overlap).
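
The segmentation described here (10-second windows advanced by 2.5 seconds) is a sliding window. A small sketch follows; the sampling rate `sample_hz` and the function name `segment_trajectory` are assumptions, since the slides do not state the track sampling rate.

```python
def segment_trajectory(points, sample_hz, window_s=10.0, stride_s=2.5):
    """Split a list of track samples into overlapping fixed-length segments."""
    window = max(1, int(round(window_s * sample_hz)))   # samples per segment
    stride = max(1, int(round(stride_s * sample_hz)))   # samples between starts
    return [points[s:s + window]
            for s in range(0, len(points) - window + 1, stride)]

# A 20-second track sampled at a hypothetical 2 Hz.
track = list(range(40))
segments = segment_trajectory(track, sample_hz=2.0)
print(len(segments), [seg[0] for seg in segments])
```

Consecutive segments share 7.5 of their 10 seconds, which is the 75% overlap quoted on the slide.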

[Figure panels labeled 1, 2, 3, 4, 5, 10, 20]

Page 57:

U-Turn

K-Nearest Neighbors

Trajectories on Map

QUERY

Trajectory segments are 10 seconds in length, sampled every 2.5 seconds (75% overlap).

[Figure panels labeled 1, 2, 3, 4, 5, 10, 20]

Page 58:

Coda: Timely Detection

[Karayev and Darrell, in review]