Inference in generative models of images and video John Winn MSR Cambridge May 2004.


Overview

Generative vs. conditional models

Combined approach

Inference in the flexible sprite model

Extending the model

We have an image I and latent variables H which we wish to infer, e.g. object position, orientation, class. There will also be other sources of variability, e.g. illumination, parameterised by θ.

Generative vs. conditional models

Generative model: P(H, θ, I)

Conditional model: P(H, θ|I) or P(H|I)

Conditional models use features

Features are functions of I which aim to be informative about H but invariant to θ.

Edge features Corner features

Blob features

Conditional models

Using features f(I), train a conditional model, e.g. using labelled data:

P(H|I) = g(f(I))

Example: Viola & Jones face detection using rectangle features and AdaBoost
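A minimal sketch of the form P(H|I) = g(f(I)), assuming a single two-rectangle feature for f and a logistic function for g. The weight and bias are hypothetical hand-set values, and this is not the Viola & Jones detector, which combines many such features with AdaBoost. It also shows the feature staying unchanged under an additive brightness offset, one simple kind of θ.

```python
import math

def rectangle_feature(image, top, left, height, width):
    """Two-rectangle feature: sum of the left half minus sum of the right half."""
    half = width // 2
    left_sum = sum(image[r][c] for r in range(top, top + height)
                   for c in range(left, left + half))
    right_sum = sum(image[r][c] for r in range(top, top + height)
                    for c in range(left + half, left + 2 * half))
    return left_sum - right_sum

def g(feature_value, weight, bias):
    """Logistic link mapping a feature value to P(H = 1 | I)."""
    return 1.0 / (1.0 + math.exp(-(weight * feature_value + bias)))

# Toy 4x4 images: a vertical edge (bright left, dark right) and a flat patch.
edge_image = [[1, 1, 0, 0] for _ in range(4)]
flat_image = [[1, 1, 1, 1] for _ in range(4)]

# Hypothetical parameters; in practice they would be learned from labelled data.
weight, bias = 1.0, -4.0

p_edge = g(rectangle_feature(edge_image, 0, 0, 4, 4), weight, bias)
p_flat = g(rectangle_feature(flat_image, 0, 0, 4, 4), weight, bias)

# Adding a constant brightness offset leaves the feature value unchanged.
brighter_edge = [[v + 5 for v in row] for row in edge_image]
same_feature = (rectangle_feature(brighter_edge, 0, 0, 4, 4)
                == rectangle_feature(edge_image, 0, 0, 4, 4))
```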

Conditional models

Advantages

Simple - only model variables of interest

Inference is fast - due to use of features and simple model

Disadvantages

Non-robust

Difficult to compare different models

Difficult to combine different models

Generative models

A generative model defines a process of generating the image pixels I from the latent variables H and θ, giving a joint distribution over all variables: P(H, θ, I)

Learning and inference carried out using standard machine learning techniques e.g. Expectation Maximisation, MCMC, variational methods.

No features!
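A toy sketch of a joint P(H, θ, I) and of inference by brute-force enumeration, which is feasible only because this example is tiny. Everything here is an assumption for illustration: a 1-D image, H as the position of a 2-pixel object, θ as a discrete illumination gain, uniform priors and Gaussian pixel noise.

```python
import math

N = 8                      # pixels in a 1-D "image"
POSITIONS = range(N - 1)   # H: left edge of a 2-pixel-wide object
GAINS = [0.5, 1.0, 2.0]    # θ: discrete illumination gains
SIGMA = 0.2                # assumed pixel noise standard deviation

def render(h, gain):
    """Mean image generated by an object at position h under the given gain."""
    return [gain if h <= x < h + 2 else 0.0 for x in range(N)]

def log_joint(h, gain, image):
    """log P(H=h, θ=gain, I=image) with uniform priors and Gaussian pixel noise."""
    log_prior = -math.log(len(POSITIONS)) - math.log(len(GAINS))
    mean = render(h, gain)
    log_lik = sum(-0.5 * ((i - m) / SIGMA) ** 2
                  - math.log(SIGMA * math.sqrt(2 * math.pi))
                  for i, m in zip(image, mean))
    return log_prior + log_lik

def posterior_over_position(image):
    """P(H | I), obtained by summing the joint over θ and normalising."""
    joint = {h: sum(math.exp(log_joint(h, gain, image)) for gain in GAINS)
             for h in POSITIONS}
    evidence = sum(joint.values())
    return {h: p / evidence for h, p in joint.items()}

observed = render(3, 2.0)   # noise-free image: object at position 3, gain 2
posterior = posterior_over_position(observed)
best_position = max(posterior, key=posterior.get)
```

No features are computed at any point: the pixels are scored directly under the model, and θ is summed out rather than engineered away.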

Generative models

Example: image modeled as layers of ‘flexible’ sprites.

Generative models

Advantages

Accurate – as the entire image is modeled

Can compare different models

Can combine different models

Can generate new images

Disadvantages

Inference is difficult due to local minima

Inference is slower due to complex model

Limitations on model complexity

Combined approach

Use a generative model, but speed up inference using proposal distributions given by a conditional model.

A proposal R(X) suggests a new distribution over some of the latent variables X ⊆ {H, θ}.

Inference is extended to allow accepting or rejecting the proposal e.g. depending on whether it improves the model evidence.
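A minimal sketch of the accept-or-reject step, assuming a point proposal rather than a full distribution R(X). The functions log_score and conditional_proposal are hypothetical stand-ins: the first plays the role of the model evidence, the second the role of a fast bottom-up detector.

```python
def log_score(position, image, sigma=0.2):
    """Stand-in for the model evidence: Gaussian log-likelihood (up to a constant)
    of a 2-pixel-wide object whose left edge is at `position`."""
    mean = [1.0 if position <= x < position + 2 else 0.0 for x in range(len(image))]
    return sum(-0.5 * ((i - m) / sigma) ** 2 for i, m in zip(image, mean))

def conditional_proposal(image):
    """Hypothetical bottom-up detector: propose the index of the brightest pixel."""
    return max(range(len(image)), key=lambda x: image[x])

def accept_if_better(current, proposed, image):
    """Keep the proposal only if it does not lower the score."""
    if log_score(proposed, image) >= log_score(current, image):
        return proposed, True
    return current, False

image = [0.0, 0.1, 0.0, 0.0, 0.9, 1.0, 0.1, 0.0]
current = 0                                  # poor initialisation
proposed = conditional_proposal(image)       # fast but only approximate
current, accepted = accept_if_better(current, proposed, image)

# The generative model then refines the accepted proposal with a cheap local search.
neighbours = [p for p in (current - 1, current, current + 1)
              if 0 <= p <= len(image) - 2]
refined = max(neighbours, key=lambda p: log_score(p, image))
```

In this toy run the detector is off by one pixel, yet the proposal is still accepted because it beats the initialisation, and the local search under the generative model corrects it.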

Using proposals in an MCMC framework

[Figure panels: proposals for text and faces; accepted proposals. From Tu et al, 2003.]

Generative model: textured regions combined with face and text models

Conditional model: face and text detector using AdaBoost (Viola & Jones)
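One standard way to use proposals inside MCMC is the Metropolis-Hastings independence sampler, sketched below on a four-state toy problem. This is a generic textbook scheme, not the algorithm of Tu et al; the target and proposal tables are hypothetical numbers, with the proposal standing in for a detector's output.

```python
import random

def metropolis_hastings(target, proposal, steps, seed=0):
    """Independence Metropolis-Hastings over a finite state space.
    `target` and `proposal` map each state to an (unnormalised) probability."""
    rng = random.Random(seed)
    states = list(proposal)
    weights = [proposal[s] for s in states]
    current = rng.choices(states, weights)[0]
    samples = []
    for _ in range(steps):
        candidate = rng.choices(states, weights)[0]
        # Acceptance ratio corrects for the mismatch between proposal and target.
        ratio = (target[candidate] * proposal[current]) / (
            target[current] * proposal[candidate])
        if rng.random() < min(1.0, ratio):
            current = candidate
        samples.append(current)
    return samples

# Hypothetical posterior under the generative model, and a detector-style proposal.
target = {0: 0.05, 1: 0.05, 2: 0.8, 3: 0.1}
proposal = {0: 0.1, 1: 0.1, 2: 0.6, 3: 0.2}

samples = metropolis_hastings(target, proposal, steps=5000)
share_at_mode = samples.count(2) / len(samples)
```

The closer the proposal is to the target, the more candidates are accepted, which is why a good conditional model speeds up sampling.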

[Figure panels: proposals for text and faces; reconstructed image. From Tu et al, 2003.]

Proposals in the flexible sprite model

Flexible sprite model

Set of images

e.g. frames from a video

Sprite shape and appearance

Sprite transform for this image (discretised)

Transformed mask instance for this image

Background
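A sketch of how the listed components could combine into one frame. It assumes a mask-weighted blend of sprite over background, which the slides do not spell out, and it uses 1-D signals, translation as the only transform, and no noise term.

```python
def translate(values, shift, fill=0.0):
    """Shift a 1-D list right by `shift` pixels, padding with `fill` (no wrap-around)."""
    n = len(values)
    return [values[x - shift] if 0 <= x - shift < n else fill for x in range(n)]

def compose(appearance, mask, background, shift):
    """Blend the translated sprite over the background using the translated mask."""
    moved_mask = translate(mask, shift)
    moved_appearance = translate(appearance, shift)
    return [m * a + (1.0 - m) * b
            for m, a, b in zip(moved_mask, moved_appearance, background)]

appearance = [0.9, 0.8, 0.0, 0.0, 0.0, 0.0]   # sprite appearance
mask       = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]   # sprite shape (1 = sprite, 0 = transparent)
background = [0.2] * 6

frame = compose(appearance, mask, background, shift=3)
```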

Inference method & problems

Apply variational inference with factorised Q distribution

Slow, since we have to search entire discrete transform space

Limited size of transform space, e.g. translations only (160×120)

Many local minima

Proposals in the flexible sprite model

We wish to create a proposal R(T).

Cannot use features of the image directly until object appearance found.

Use features of the inferred mask.


Moment-based features

Use the first and second moments of the inferred mask as features. Learn a proposal distribution R(T).
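A short sketch of computing the moment features from a soft mask stored as a list of rows. The function name and the toy mask are illustrative assumptions; how the features are mapped to R(T) is learned and is not shown here.

```python
def mask_moments(mask):
    """Return the centre of gravity (first moments) and the second central
    moments (sxx, sxy, syy) of a non-negative mask given as a list of rows."""
    total = sum(sum(row) for row in mask)
    cx = sum(x * v for row in mask for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(mask) for v in row) / total
    sxx = sum((x - cx) ** 2 * v for row in mask for x, v in enumerate(row)) / total
    syy = sum((y - cy) ** 2 * v for y, row in enumerate(mask) for v in row) / total
    sxy = sum((x - cx) * (y - cy) * v
              for y, row in enumerate(mask) for x, v in enumerate(row)) / total
    return (cx, cy), (sxx, sxy, syy)

# Toy inferred mask: a 2-row by 3-column block of ones inside a 5x5 grid.
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
(cx, cy), (sxx, sxy, syy) = mask_moments(toy_mask)
```

The first moments give a location estimate and the second moments summarise the extent and orientation of the mask, all without needing the object appearance.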

[Figure: contour of proposal distribution over object location, with the true location and a centre-of-gravity (C-of-G) marker.]

Can also use R to get a probabilistic bound on T.
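A sketch of turning a proposal into a bound on T, assuming R(T) is an axis-aligned Gaussian over translations and that the bound keeps everything within k standard deviations. The mean, standard deviations and k are hypothetical values chosen only for illustration; the 160×120 grid is the translation space quoted earlier.

```python
WIDTH, HEIGHT = 160, 120   # size of the discrete translation space

def candidates_within_bound(mean, std, k=3.0):
    """Translations (tx, ty) whose Mahalanobis distance from `mean` is at most k,
    assuming an axis-aligned Gaussian proposal R(T)."""
    (mx, my), (sx, sy) = mean, std
    return [(tx, ty)
            for ty in range(HEIGHT) for tx in range(WIDTH)
            if ((tx - mx) / sx) ** 2 + ((ty - my) / sy) ** 2 <= k ** 2]

# Hypothetical proposal parameters.
kept = candidates_within_bound(mean=(80.0, 60.0), std=(3.0, 2.0))
fraction_searched = len(kept) / (WIDTH * HEIGHT)
```

Only the translations in kept need to be evaluated under the generative model, so the cost of each inference step scales with the spread of the proposal rather than with the size of the whole transform space.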

[Figures: iterations #1 to #7.]

Results on scissors video.

On average, ~1% of transform space searched. Always converges, independent of initialisation.

[Figure panels: original; reconstruction; foreground only.]

Beyond translation

Extended transform space

Normalised video

Learned sprite appearance

Corner features

Learned sprite appearance

Masked normalised image

Corner feature proposals

Preliminary results

Future directions

Extensions to the generative model

Very wide range of possible extensions:

Local appearance model e.g. patch-based

Multiple layered objects

Object classes

Illumination modelling

Incorporation of object-specific models e.g. faces

Articulated models

Further investigation of using proposals

Investigate other bottom-up features, including:

Optical flow

Color/texture

Use of standard invariant features e.g. SIFT

Discriminative models for particular object classes e.g. faces, text
