The Shape Boltzmann Machine

A Strong Model of Object Shape

S. M. Ali Eslami, Nicolas Heess, John Winn

CVPR 2012, Providence, Rhode Island



What do we mean by a model of shape?

A probability distribution:

Defined on binary images,

Of objects, not patches,

Trained using limited training data.

Weizmann horse dataset

Sample training images (327 images)

What can one do with an ideal shape model?

Segmentation (due to its probabilistic nature)

Image completion (due to its generative nature)

Computer graphics (due to its generative nature)

What is a strong model of shape?

We define a strong model of object shape as one that meets two requirements:

Realism: generates samples that look realistic.

Generalization: can generate samples that differ from the training images.

[Figure: training images, the real distribution and the learned distribution.]

Existing shape models: a comparison

[Table: models rated on Realism (globally, locally) and Generalization. Rows: Mean, Factor Analysis, Fragments, Grid MRFs/CRFs, High-order potentials, Database, ShapeBM. Cell entries are not reproduced in this transcript.]

Existing shape models: most commonly used architectures

[Figure: MRF and Mean models, each with a sample from the model.]

Shallow and deep architectures: modeling high-order and long-range interactions

[Figure: MRF, RBM and DBM architectures.]

Deep Boltzmann Machines

Probabilistic, generative, powerful.

Typically trained with many examples. We only have datasets with few training examples.

[Figure: DBM architecture.]
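
For reference, a DBM with two hidden layers assigns every joint configuration of visible units v and hidden units h1, h2 an energy, and the probability of a configuration is proportional to exp(-energy). The sketch below writes out the standard energy function for a plain, fully connected DBM; the argument names are illustrative assumptions rather than notation taken from the slides.

```python
import numpy as np

def dbm_energy(v, h1, h2, W1, W2, b, c1, c2):
    """Energy of one joint configuration of a two-hidden-layer DBM.

    v, h1, h2 : binary state vectors of the three layers
    W1, W2    : weights between v-h1 and h1-h2 (no within-layer links)
    b, c1, c2 : biases of v, h1 and h2
    All names are hypothetical; lower energy means higher probability.
    """
    pairwise = v @ W1 @ h1 + h1 @ W2 @ h2
    unary = b @ v + c1 @ h1 + c2 @ h2
    return -(pairwise + unary)
```

Because there are no connections within a layer, the units of one layer are conditionally independent given the neighbouring layers, which is what makes block sampling possible.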

From the DBM to the ShapeBM: restricted connectivity and sharing of weights

[Figure: DBM and ShapeBM architectures.]

Limited training data, therefore reduce the number of parameters:

Restrict connectivity,

Tie parameters,

Restrict capacity.

Shape Boltzmann Machine: architecture in 2D
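
To see how restricting connectivity and tying parameters cuts the parameter count, here is a minimal sketch of a first hidden layer that only looks at overlapping image patches and reuses one filter bank for every patch. The image size (32x32), the 2x2 patch grid, the 4-pixel overlap and the 500 hidden units per patch are illustrative assumptions (chosen so that four patches give 2000 first-layer units); the slides do not state these values.

```python
import numpy as np

def shared_first_layer(image_size=32, overlap=4, hidden_per_patch=500, seed=0):
    """Create one filter bank shared by all four patches (assumed 2x2 grid)."""
    rng = np.random.default_rng(seed)
    patch = image_size // 2 + overlap // 2      # 18 pixels for the defaults
    starts = [0, image_size - patch]            # two overlapping windows per axis
    W_patch = 0.01 * rng.standard_normal((patch * patch, hidden_per_patch))
    return W_patch, patch, starts

def h1_input(v, W_patch, patch, starts):
    """Weighted input to every first-layer hidden unit for one image v.

    Each group of hidden units sees only its own patch (restricted
    connectivity) and every group uses the same W_patch (tied parameters).
    Biases are omitted to keep the sketch short.
    """
    blocks = [v[r:r + patch, c:c + patch].reshape(-1) @ W_patch
              for r in starts for c in starts]
    return np.concatenate(blocks)
```

With these assumed defaults the shared filter bank holds 18 x 18 x 500 = 162,000 weights, whereas a dense layer from 32 x 32 pixels to 2000 hidden units would need 2,048,000.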

Top hidden units capture object pose.

Given the top units, middle hidden units capture local (part) variability.

Overlap helps prevent discontinuities at patch boundaries.

ShapeBM inference: block-Gibbs MCMC
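
Block-Gibbs sampling in a three-layer Boltzmann machine can be sketched as follows. Given v and h2, the middle-layer units are conditionally independent, and given h1, the visible and top-layer units are conditionally independent, so one sweep samples the two blocks in turn. This is a minimal sketch for a plain DBM with dense weights; all names and shapes are illustrative assumptions, and the ShapeBM's patch structure is not modelled here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bernoulli(p, rng):
    """Draw independent binary samples with success probabilities p."""
    return (rng.random(p.shape) < p).astype(float)

def block_gibbs_sweep(v, h2, W1, W2, b, c1, c2, rng):
    """One sweep: sample h1 given (v, h2), then v and h2 given h1.

    v  : (n, D)  visible states      W1 : (D, H1)
    h2 : (n, H2) top-layer states    W2 : (H1, H2)
    """
    # Block 1: the middle layer receives input from both neighbours.
    h1 = bernoulli(sigmoid(v @ W1 + h2 @ W2.T + c1), rng)
    # Block 2: v and h2 depend only on h1 and are sampled together.
    v = bernoulli(sigmoid(h1 @ W1.T + b), rng)
    h2 = bernoulli(sigmoid(h1 @ W2 + c2), rng)
    return v, h1, h2
```

Repeating the sweep from a training image yields reconstructions at first and, after many sweeps, approximate samples from the model.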

[Figure: image, reconstruction, sample 1, ..., sample n.]

Fast: ~500 samples per second.

ShapeBM learning: stochastic gradient descent

Maximize the log-likelihood of the training data with respect to the model parameters.

Pre-training: greedy, layer-by-layer, bottom-up, with a persistent CD MCMC approximation to the gradients.

Joint training: variational + persistent chain approximations to the gradients; separates learning of local and global shape properties.
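
The persistent-chain idea used during pre-training can be illustrated on a single RBM layer. In the sketch below, the positive statistics come from the data, the negative statistics come from Gibbs chains that are kept alive between updates instead of being restarted at the data, and the difference gives a stochastic estimate of the log-likelihood gradient. The function name, learning rate and array shapes are assumptions made for illustration; this is not the authors' training code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pcd_step(W, b, c, v_data, v_chain, rng, lr=0.01):
    """One persistent-CD update for an RBM with weights W, biases b (visible), c (hidden).

    v_data  : (n, D) mini-batch of binary training images
    v_chain : (m, D) current states of the persistent Gibbs chains
    Returns updated parameters and the advanced chain states.
    """
    # Positive phase: hidden probabilities given the data.
    h_data = sigmoid(v_data @ W + c)
    # Negative phase: advance each persistent chain by one Gibbs sweep.
    h_prob = sigmoid(v_chain @ W + c)
    h_samp = (rng.random(h_prob.shape) < h_prob).astype(float)
    v_prob = sigmoid(h_samp @ W.T + b)
    v_chain = (rng.random(v_prob.shape) < v_prob).astype(float)
    h_chain = sigmoid(v_chain @ W + c)
    # Gradient estimate: data statistics minus chain statistics.
    W = W + lr * (v_data.T @ h_data / len(v_data) - v_chain.T @ h_chain / len(v_chain))
    b = b + lr * (v_data.mean(axis=0) - v_chain.mean(axis=0))
    c = c + lr * (h_data.mean(axis=0) - h_chain.mean(axis=0))
    return W, b, c, v_chain
```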

Training takes ~2-6 hours on the small datasets that we consider.

Results

Sampled shapes: evaluating the Realism criterion (Weizmann horses, 327 images, 2000+100 hidden units)

Data

FA: incorrect generalization.

RBM: failure to learn variability.

ShapeBM: natural shapes, variety of poses, sharply defined details, correct number of legs (!).

This is great, but has it just overfit?

Sampled shapes: evaluating the Generalization criterion (Weizmann horses, 327 images, 2000+100 hidden units)

[Figure columns: sample from the ShapeBM; closest image in the training dataset; difference between the two images.]
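
The overfitting check described here can be sketched in a few lines: for each sample, find the closest training image and look at the pixels where the two disagree. The slides do not say which distance is used; the sketch assumes Hamming distance (the number of differing pixels) on binary masks, and the function name is hypothetical.

```python
import numpy as np

def closest_training_image(sample, train):
    """Return (index, nearest training image, difference mask).

    sample : (H, W) binary image drawn from the model
    train  : (N, H, W) stack of binary training images
    Distance is the Hamming distance, which is an assumption of this sketch.
    """
    sample = np.asarray(sample).astype(bool)
    train = np.asarray(train).astype(bool)
    dists = np.sum(train != sample, axis=(1, 2))
    idx = int(np.argmin(dists))
    diff = np.logical_xor(train[idx], sample)
    return idx, train[idx], diff
```

A sample that has merely been memorised gives an empty difference mask; a sample that generalizes differs from its nearest neighbour in a coherent region, such as a limb in a new position.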

Interactive GUI: evaluating Realism and Generalization (Weizmann horses, 327 images, 2000+100 hidden units)

Further results: sampling and completion (Caltech motorbikes, 798 images, 1200+50 hidden units)

[Figure panels: training images, ShapeBM samples, sample generalization, shape completion.]

Imputation scores: quantitative comparison (Weizmann horses, 327 images, 2000+100 hidden units)

Collect 25 unseen horse silhouettes,

Divide each into 9 segments,

Estimate the conditional log probability of a segment under the model given the rest of the image,

Average over images and segments.
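
The imputation procedure above can be sketched for the simplest baseline, the Mean model, where every pixel is an independent Bernoulli variable and the conditional log probability of a segment given the rest of the image reduces to a sum over the segment's pixels. For the ShapeBM the conditional has to be estimated by sampling, which this sketch does not attempt. The 3x3 grid used to obtain 9 segments and the clipping constant `eps` are assumptions; the slides do not specify how the segments are cut.

```python
import numpy as np

def segment_slices(height, width, grid=3):
    """Cut an image into grid x grid rectangular segments (assumed layout)."""
    rows = np.linspace(0, height, grid + 1).astype(int)
    cols = np.linspace(0, width, grid + 1).astype(int)
    return [(slice(rows[i], rows[i + 1]), slice(cols[j], cols[j + 1]))
            for i in range(grid) for j in range(grid)]

def mean_model_imputation_score(train, test, eps=1e-3):
    """Average segment log probability under a per-pixel Bernoulli model.

    train : (N, H, W) binary training images used to fit the pixel means
    test  : (M, H, W) unseen binary images to score
    """
    p = np.clip(train.mean(axis=0), eps, 1 - eps)   # avoid log(0)
    log_on, log_off = np.log(p), np.log1p(-p)
    scores = []
    for img in test:
        for rs, cs in segment_slices(*img.shape):
            seg = img[rs, cs]
            scores.append(np.sum(seg * log_on[rs, cs] + (1 - seg) * log_off[rs, cs]))
    # Average over images and segments.
    return float(np.mean(scores))
```

Scores are log probabilities, so they are never positive, and a score closer to zero means the model predicts the held-out segments better.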

Model      Score
Mean      -50.72
RBM       -47.00
FA        -40.82
ShapeBM   -28.85

Multiple object categories: simultaneous detection and completion (Caltech-101 objects, 531 images, 2000+400 hidden units)

Train jointly on 4 categories without knowledge of class:

[Figure panels: shape completion, sampled shapes.]

What does h2 do?

Weizmann horses: pose information.

Multiple categories: class label information.

[Plot: accuracy against number of training images.]

Summary

Shape models are essential in applications such as segmentation, detection, in-painting and graphics.

The ShapeBM characterizes a strong model of shape:

Samples are realistic,

Samples generalize from training data.

The ShapeBM learns distributions that are qualitatively and quantitatively better than other models for this task.

Questions

MATLAB GUI available at http://arkitus.com/Ali/

"The Shape Boltzmann Machine: a Strong Model of Object Shape", S. M. Ali Eslami, Nicolas Heess and John Winn (2012), Computer Vision and Pattern Recognition (CVPR), Providence, USA

Shape completion: evaluating Realism and Generalization (Weizmann horses, 327 images, 2000+100 hidden units)

Constrained shape completion: evaluating Realism and Generalization (Weizmann horses, 327 images, 2000+100 hidden units)

[Figure: ShapeBM and NN completions.]

Further results: constrained completion (Caltech motorbikes, 798 images, 1200+50 hidden units)

[Figure: ShapeBM and NN completions.]