
Page 1: Computer vision: models, learning and inference
Chapter 12: Models for Grids
©2011 Simon J.D. Prince

Page 2: Models for grids

• Consider models with one unknown world state at each pixel in the image; the model takes the form of a grid.

• The graphical model contains loops, so we cannot use dynamic programming or belief propagation.

• Define probability distributions that favour certain configurations of world states
  – called Markov random fields
  – inference uses a set of techniques called graph cuts

Page 3: Binary Denoising

Figure: image before and after corruption. The image is represented as binary discrete variables; some proportion of pixels have randomly changed polarity.

Page 4: Multi-label Denoising

Figure: image before and after corruption. The image is represented as discrete variables encoding intensity; some proportion of pixels have been randomly changed according to a uniform distribution.

Page 5: Denoising Goal

Figure: observed data and uncorrupted image.

Page 6: Denoising Goal

Figure: observed data and uncorrupted image.

• Most of the pixels stay the same.
• The observed image is not as smooth as the original.

Now consider a pdf over binary images that encourages smoothness: a Markov random field.

Page 7: Markov random fields

The Markov property (each pixel is conditionally independent of the rest of the image given its neighbours) is just the typical property of an undirected model, so we'll continue the discussion in terms of undirected models.

Page 8: Markov random fields

\Pr(\mathbf{w}) = \frac{1}{Z} \prod_{c=1}^{C} \phi_c[\mathbf{w}_c]

where Z is the normalizing constant (partition function), each potential function \phi_c returns a positive number, and \mathbf{w}_c is a subset of the variables (a clique).

Page 9: Markov random fields

Equivalently, written in terms of cost functions:

\Pr(\mathbf{w}) = \frac{1}{Z} \exp\left[ -\sum_{c=1}^{C} \psi_c[\mathbf{w}_c] \right]

where Z is the normalizing constant (partition function), each cost function \psi_c returns any real number, \mathbf{w}_c is a subset of the variables (a clique), and the relationship to the potentials is \phi_c[\mathbf{w}_c] = \exp(-\psi_c[\mathbf{w}_c]).

Page 10: Smoothing Example

(figure only)

Page 11: Smoothing Example

Smooth solutions (e.g. 0000, 1111) have high probability. Z was computed by summing the 16 un-normalized probabilities; a brute-force computation is sketched below.
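For a model this small, Z really can be computed by brute force. A minimal sketch in Python, assuming a chain of four binary variables and hypothetical potential values that favour smoothness:

import itertools

# Hypothetical pairwise potential favouring equal neighbouring labels.
phi = {(0, 0): 2.0, (1, 1): 2.0, (0, 1): 0.5, (1, 0): 0.5}

# Four binary variables with pairwise cliques (w1,w2), (w2,w3), (w3,w4).
states = list(itertools.product((0, 1), repeat=4))
unnorm = {w: phi[w[0], w[1]] * phi[w[1], w[2]] * phi[w[2], w[3]]
          for w in states}
Z = sum(unnorm.values())  # sum over all 16 configurations

# Smooth configurations come out far more probable than alternating ones.
print(unnorm[0, 0, 0, 0] / Z, unnorm[0, 1, 0, 1] / Z)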

Page 12: Smoothing Example

Samples from a larger grid are mostly smooth. The partition function Z cannot be computed here; the sum over all configurations is intractable.

Page 13: Denoising Goal

Figure: observed data and uncorrupted image.

Page 14: Denoising overview

Bayes' rule relates the posterior over the uncorrupted image to two terms:

• Likelihoods: the probability of each pixel flipping polarity
• Prior: a Markov random field (smoothness)

MAP inference is performed with graph cuts. The corresponding equations are reconstructed below.
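The slide's equations are not in the transcript; a standard reconstruction for the binary case, writing \rho for the probability that a pixel flips polarity, is

\Pr(\mathbf{w}\,|\,\mathbf{x}) \propto \Pr(\mathbf{x}\,|\,\mathbf{w})\,\Pr(\mathbf{w}),
\qquad
\Pr(x_n\,|\,w_n) = \begin{cases} 1-\rho & \text{if } x_n = w_n \\ \rho & \text{if } x_n \neq w_n \end{cases}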

Page 15: Denoising with MRFs

Figure: observed image x, original image w. The model combines an MRF prior over w (pairwise cliques) with per-pixel likelihoods; inference recovers w from x. The resulting posterior is reconstructed below.
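The equations themselves are missing from the transcript; assembling the pieces named on the slide in the obvious way, the posterior has the form

\Pr(\mathbf{w}\,|\,\mathbf{x}) \propto \prod_{n=1}^{N} \Pr(x_n\,|\,w_n) \cdot \frac{1}{Z} \exp\left[ -\sum_{(m,n)\in\mathcal{E}} \psi(w_m, w_n) \right]

where the product is over the per-pixel likelihoods and the sum runs over pairs of neighbouring pixels (the pairwise cliques).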

Page 16: MAP Inference

The cost function to be minimized contains unary terms (compatibility of the data with label w) and pairwise terms (compatibility of neighbouring labels). The full cost function is reconstructed below.
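The formula is missing from the transcript; taking the negative log of the posterior gives the standard form (notation assumed):

\hat{\mathbf{w}} = \operatorname*{argmin}_{\mathbf{w}} \left[ \sum_{n=1}^{N} U_n(w_n) + \sum_{(m,n)\in\mathcal{E}} P_{mn}(w_m, w_n) \right]

where the unary costs U_n come from the likelihoods and the pairwise costs P_{mn} come from the MRF prior.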

Page 17: Graph Cuts Overview

Graph cuts are used to optimise this cost function, made up of unary terms (compatibility of the data with the label) and pairwise terms (compatibility of neighbouring labels). There are three main cases:

• Binary MRFs, submodular (exact solution)
• Multi-label MRFs, submodular (exact solution)
• Multi-label MRFs, non-submodular (approximate solution)

Page 18: Graph Cuts Overview

Graph cuts are used to optimise this cost function.

Approach: convert the minimization into the form of a standard computer-science problem, MAXIMUM FLOW or MINIMUM CUT on a graph. Polynomial-time methods for solving this problem are known.

Page 19: Max-Flow Problem

Goal: push as much 'flow' as possible through the directed graph from the source to the sink, without exceeding the (non-negative) capacities c_ij associated with each edge.

Page 20: Saturated Edges

When we are pushing the maximum amount of flow:

• There must be at least one saturated edge on any path from source to sink (otherwise we could push more flow).
• The set of saturated edges hence separates the source and sink.

Page 21: Augmenting Paths

Figure: the two numbers on each edge represent current flow / total capacity.

Page 22: Augmenting Paths

Choose any route from source to sink with spare capacity, and push as much flow as you can. One edge (here 6-t) will saturate.

Page 23: Augmenting Paths

Choose another route, respecting the remaining capacity. This time edge 6-5 saturates.

Page 24: Augmenting Paths

A third route. Edge 1-4 saturates.

Page 25: Augmenting Paths

A fourth route. Edge 2-5 saturates.

Page 26: Augmenting Paths

A fifth route. Edge 2-4 saturates.

Page 27: Augmenting Paths

There is now no further route from source to sink: there is a saturated edge along every possible route (highlighted arrows).

Page 28: Augmenting Paths

The saturated edges separate the source from the sink and form the min-cut solution. Each node connects either to the source side or to the sink side. A code sketch of the procedure follows.
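The augmenting-paths procedure just walked through is easy to state in code. A minimal Edmonds-Karp sketch in Python (shortest augmenting path by breadth-first search); the dict-of-dicts graph representation and the example capacities at the end are illustrative assumptions, not the graph from the slides:

from collections import deque

def max_flow(capacity, source, sink):
    """Repeatedly find a route with spare capacity and saturate its
    bottleneck edge, until no route from source to sink remains."""
    # Residual copy of the graph, with every reverse edge present at 0.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u in list(residual):
        for v in residual[u]:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:  # a saturated edge blocks every route
            return flow
        # Recover the path and its bottleneck (minimum spare) capacity.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        # Push that much flow; reverse capacities let later paths undo it.
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        flow += push

# Tiny hypothetical example (not the slides' graph):
graph = {'s': {'1': 3, '2': 2}, '1': {'2': 1, 't': 2}, '2': {'t': 3}}
print(max_flow(graph, 's', 't'))  # prints 5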

Page 29: Graph Cuts: Binary MRF

Graph cuts are used to optimise the cost function of unary terms (compatibility of data with label w) and pairwise terms (compatibility of neighbouring labels). First work with the binary case (i.e. the true label w is 0 or 1), and constrain the pairwise costs to be "zero-diagonal", i.e. P(0,0) = P(1,1) = 0.

Page 30: Graph Construction

• One node per pixel (here a 3×3 image)
• An edge from the source to every pixel node
• An edge from every pixel node to the sink
• Reciprocal edges between neighbours

Note that in the minimum cut EITHER the edge connecting a pixel to the source will be cut OR the edge connecting it to the sink, but NOT BOTH (cutting both is unnecessary). This choice determines whether we give that pixel label 1 or label 0, so there is a one-to-one mapping between possible labellings and possible minimum cuts.

Page 31: Graph Construction

Now add capacities so that the minimum cut minimizes our cost function:

• Unary costs U(0), U(1) are attached to the links to the source and sink; either one or the other is paid.
• Pairwise costs are placed on the links between pixel nodes as shown.

Why? It is easiest to understand with some worked examples; a code sketch of the construction also follows.
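A minimal sketch of this construction in Python, compatible with the max_flow sketch above. The label convention (source side of the cut = label 1) and the single Potts-style pairwise cost theta are illustrative assumptions, not the slides' exact setup:

import numpy as np

def binary_mrf_capacities(U0, U1, theta):
    """Build s-t capacities whose minimum cut cost equals
    sum_n U_n(w_n) + theta * (number of disagreeing neighbour pairs)."""
    H, W = U0.shape
    node = lambda r, c: r * W + c
    cap = {'s': {}}
    for r in range(H):
        for c in range(W):
            n = node(r, c)
            cap.setdefault(n, {})
            cap['s'][n] = float(U0[r, c])  # cut iff pixel takes label 0
            cap[n]['t'] = float(U1[r, c])  # cut iff pixel takes label 1
            # Reciprocal edges to the right and lower neighbours.
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < H and cc < W:
                    m = node(rr, cc)
                    cap[n][m] = theta  # cut iff labels are (1, 0)
                    cap.setdefault(m, {})[n] = theta  # cut iff (0, 1)
    return cap

# Hypothetical 2x2 example: feed the capacities to any max-flow solver.
U0 = np.array([[1.0, 0.0], [0.0, 1.0]])
U1 = np.array([[0.0, 1.0], [1.0, 0.0]])
caps = binary_mrf_capacities(U0, U1, theta=0.5)

After solving, pixels still reachable from the source in the residual graph take label 1 and the rest take label 0, so every labelling corresponds to exactly one cut and vice versa.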

Page 32: Example 1

(figure only)

Page 33: Example 2

(figure only)

Page 34: Example 3

(figure only)

Page 35: (figure only)

Page 36: Graph Cuts: Binary MRF

Graph cuts optimise the cost function of unary and pairwise terms. Summary of the approach:

• Associate each possible solution with a minimum cut on a graph
• Set the capacities on the graph so that the cost of a cut matches the cost function
• Use augmenting paths to find the minimum cut
• This minimizes the cost function and finds the MAP solution

Page 37: General Pairwise costs

Modify the graph to handle non-zero diagonal costs:

• Add P(0,0) to edge s-b; this implies that solutions 0,0 and 1,0 also pay this cost.
• Subtract P(0,0) from edge b-a, so that solution 1,0 has this cost removed again.
• A similar approach handles P(1,1).

Page 38: Reparameterization

The max-flow / min-cut algorithms require that all of the capacities are non-negative. However, because we subtract on edge a-b, we cannot guarantee this, even if all of the original unary and pairwise costs were positive.

The solution is reparameterization: find a new graph where the costs (capacities) are different but the choice of minimum solution is the same (usually just by adding a constant to each solution).

Page 39: Reparameterization 1

The minimum cut chooses the same links in these two graphs.

Page 40: Reparameterization 2

The minimum cut chooses the same links in these two graphs.

Page 41: Submodularity

Figure: the reparameterization subtracts a constant β from some capacities and adds β to others. Requiring the resulting capacities to be non-negative and adding the constraints together implies the condition below.
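The inequality itself is missing from the transcript; the standard condition that this derivation yields for the binary case is

P(1,0) + P(0,1) \;\ge\; P(0,0) + P(1,1)

which, for the zero-diagonal costs used earlier, reduces to P(1,0) + P(0,1) \ge 0.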

Page 42: Submodularity

If this condition is obeyed, the problem is said to be "submodular" and can be solved in polynomial time. If it is not obeyed, the problem is NP-hard. Usually this is not a problem, as we tend to favour smooth solutions.

Page 43: Denoising Results

Figure: original image, followed by results with the pairwise costs increasing.

Page 44: Plan of Talk

• Denoising problem
• Markov random fields (MRFs)
• Max-flow / min-cut
• Binary MRFs: submodular (exact solution)
• Multi-label MRFs: submodular (exact solution)
• Multi-label MRFs: non-submodular (approximate)

Page 45: Multiple Labels

Construction for two pixels (a and b) and four labels (1, 2, 3, 4). There are 5 nodes for each pixel, and the 4 edges between them carry the unary costs for the 4 labels. One of these edges must be cut in the min-cut solution, and the choice determines which label we assign.

Page 46: Constraint Edges

The edges with infinite capacity pointing upwards are called constraint edges. They prevent solutions that cut the chain of edges associated with a pixel more than once (which would give an ambiguous labelling).

Page 47: Multiple Labels

Inter-pixel edges have costs defined as shown in the figure; the definition involves superfluous terms, for all i, j, where K is the number of labels.

Page 48: Example Cuts

The cut must sever the links that run from before the cut on pixel a to after the cut on pixel b.

Page 49: Pairwise Costs

If pixel a takes label I and pixel b takes label J, the cut must sever the links running from before the cut on pixel a to after the cut on pixel b. The costs were carefully chosen so that the sum over these links gives the appropriate pairwise term.

Page 50: Reparameterization

(figure only)

Page 51: Submodularity

We require the remaining inter-pixel links to be positive. By mathematical induction this local condition gives a more general result (reconstructed below).
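The inequalities are missing from the transcript; assuming the labels are ordered, the standard local condition and the general result it implies by induction are

P(i+1, j) + P(i, j+1) - P(i, j) - P(i+1, j+1) \;\ge\; 0 \quad \text{for all } i, j

and hence, for any labels with \alpha < \beta and \gamma < \delta,

P(\beta, \gamma) + P(\alpha, \delta) \;\ge\; P(\alpha, \gamma) + P(\beta, \delta).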

Page 52: Submodularity

If the problem is not submodular, then it is NP-hard.

Page 53: Convex vs. non-convex costs

• Quadratic: convex, submodular
• Truncated quadratic: not convex, not submodular
• Potts model: not convex, not submodular

Their functional forms are reconstructed below.
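The functional forms appear only in the figure; the standard definitions, with \kappa a scaling constant and \tau a truncation threshold (both symbols assumed here), are

\text{Quadratic:}\quad P(w_m, w_n) = \kappa\,(w_m - w_n)^2

\text{Truncated quadratic:}\quad P(w_m, w_n) = \kappa\,\min\left( (w_m - w_n)^2,\; \tau \right)

\text{Potts:}\quad P(w_m, w_n) = \kappa\,[w_m \neq w_n]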

Page 54: What is wrong with convex costs?

• We pay a lower price for many small changes than for one large one.
• Result: blurring at large changes in intensity.

Figure: observed noisy image and denoised result.

Page 55: Plan of Talk

• Denoising problem
• Markov random fields (MRFs)
• Max-flow / min-cut
• Binary MRFs: submodular (exact solution)
• Multi-label MRFs: submodular (exact solution)
• Multi-label MRFs: non-submodular (approximate)

Page 56: Alpha Expansion Algorithm

• Break the multi-label problem into a series of binary problems.
• At each iteration, pick a label α and "expand": every pixel either retains its original label or changes to α.

Figure: initial labelling, then iteration 1 (orange), iteration 2 (yellow), iteration 3 (red). A sketch of the outer loop follows.
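A minimal sketch of this outer loop in Python. The binary_solver argument is a hypothetical function (it could be built from the max-flow construction above) that returns the optimal keep-or-switch mask for one expansion move:

import numpy as np

def alpha_expansion(unary, binary_solver, n_sweeps=3):
    """unary: H x W x K array of unary costs for labels 0..K-1.
    binary_solver(w, alpha): boolean H x W mask, True where switching
    to alpha is optimal for the current labelling w (one exact graph cut)."""
    w = unary.argmin(axis=-1)  # initialise each pixel with its best unary label
    K = unary.shape[-1]
    for _ in range(n_sweeps):   # coordinate descent in label space
        for alpha in range(K):  # one exact binary problem per label
            switch = binary_solver(w, alpha)
            w = np.where(switch, alpha, w)
    return w

Each inner move is solved exactly; only the overall sequence of moves is approximate.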

Page 57: Alpha Expansion Ideas

• For every iteration
  – for every label: expand that label using the optimal graph-cut solution

This is coordinate descent in label space. Each step is optimal, but the overall global optimum is not guaranteed; the result is proved to be within a factor of 2 of the global optimum. It requires that the pairwise costs form a metric (conditions reconstructed below).
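The metric conditions are not in the transcript; the standard requirements on the pairwise costs are

P(\alpha, \beta) = 0 \;\Leftrightarrow\; \alpha = \beta,
\qquad
P(\alpha, \beta) = P(\beta, \alpha) \ge 0,
\qquad
P(\alpha, \beta) \le P(\alpha, \gamma) + P(\gamma, \beta).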

Page 58: Alpha Expansion Construction

Binary graph cut: either cut the link to the source (assign label α) or cut the link to the sink (retain the current label). Unary costs are attached appropriately to the links between the source, the sink and the pixel nodes.

Page 59: Alpha Expansion Construction

The graph is dynamic: the structure of the inter-pixel links depends on α and on the current choice of labels. There are four cases.

Page 60: Alpha Expansion Construction

Case 1: adjacent pixels both have label α already. The pairwise cost is zero, so there is no need for extra edges.

Page 61: Alpha Expansion Construction

Case 2: adjacent pixels are α, β. The result is either:
• α, α (no cost and no new edge)
• α, β (cost P(α, β); add a new edge)

Page 62: Alpha Expansion Construction

Case 3: adjacent pixels are β, β. The result is either:
• β, β (no cost and no new edge)
• α, β (cost P(α, β); add a new edge)
• β, α (cost P(β, α); add a new edge)

Page 63: Alpha Expansion Construction

Case 4: adjacent pixels are β, γ. The result is either:
• β, γ (cost P(β, γ); add a new edge)
• α, γ (cost P(α, γ); add a new edge)
• β, α (cost P(β, α); add a new edge)
• α, α (no cost and no new edge)

Page 64: Example Cut 1

(figure only)

Page 65: Example Cut 1 (Important!)

(figure only)

Page 66: Example Cut 2

(figure only)

Page 67: Example Cut 3

(figure only)

Page 68: Denoising Results

(figure only)

Page 69: Conditional Random Fields

Page 70: Directed model for grids

Graph cuts cannot be used here because the model contains three-wise terms, but it is easy to draw samples.

Page 71: Applications

• Background subtraction

Page 72: Applications

• Grab cut

Page 73: Applications

• Stereo vision

Page 74: Applications

• Shift-map image editing

Page 75: (figure only)

Page 76: Applications

• Shift-map image editing

Page 77: Applications

• Super-resolution

Page 78: Applications

• Texture synthesis

Page 79: Image Quilting

Page 80: Synthesizing faces

Page 81: Graphical models