Segmentation

Introduction

Slides adapted from Svetlana Lazebnik

Image segmentation

The goals of segmentation
- Group together similar-looking pixels for efficiency of further processing
- "Bottom-up" process
- Unsupervised

Figure: "superpixels"
X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.

The goals of segmentation
- Separate image into coherent "objects"
- "Bottom-up" or "top-down" process?
- Supervised or unsupervised?

Berkeley segmentation database:
http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/

Figure: image and human segmentation

Inspiration from psychology
- The Gestalt school: Grouping is key to visual perception

Figure: The Müller-Lyer illusion
http://en.wikipedia.org/wiki/Gestalt_psychology

The Gestalt school
- Elements in a collection can have properties that result from relationships
- "The whole is greater than the sum of its parts"

Figure panels: subjective contours, occlusion, familiar configuration
http://en.wikipedia.org/wiki/Gestalt_psychology

Emergence

http://en.wikipedia.org/wiki/Gestalt_psychology

Multistability

Gestalt factors


Grouping phenomena in real life
Forsyth & Ponce, Figure 14.7
The story is in the book (figure 14.7)

Grouping phenomena in real life
Forsyth & Ponce, Figure 14.7

Invariance

Gestalt factors

These factors make intuitive sense, but are very difficult to translate into algorithms

Segmentation as clustering

Source: K. Grauman

Figure panels: image, intensity-based clusters, color-based clusters

Segmentation as clustering
- K-means clustering based on intensity or color is essentially vector quantization of the image attributes
- Clusters don't have to be spatially coherent

Note: I gave each pixel the mean intensity or mean color of its cluster; this is basically just vector quantizing the image intensities/colors. Notice that there is no requirement that clusters be spatially localized, and they're not.

Segmentation as clustering

Source: K. Grauman

Segmentation as clustering
- Clustering based on (r,g,b,x,y) values enforces more spatial coherence

K-Means for segmentation
Pros
- Very simple method
- Converges to a local minimum of the error function
Cons
- Memory-intensive
- Need to pick K
- Sensitive to initialization
- Sensitive to outliers
- Only finds "spherical" clusters
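
A minimal sketch of k-means used as a segmenter, assuming a tiny grayscale image stored as a list of rows. The names `pixel_features` and `kmeans` and the toy image are illustrative only and do not come from the slides. Passing `use_position=True` appends (x, y) to each feature vector, which is the grayscale analogue of clustering on (r,g,b,x,y).

```python
import random

def pixel_features(image, use_position=False):
    """Turn a grayscale image (list of rows) into one feature tuple per pixel."""
    feats = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            feats.append((value, x, y) if use_position else (value,))
    return feats

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length feature tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # the result depends on this initialization
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center in squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

# Toy image (hypothetical data): dark left half, bright right half.
image = [
    [10, 12, 200, 202],
    [11, 13, 201, 203],
    [14, 16, 204, 206],
    [15, 17, 205, 207],
]
labels, centers = kmeans(pixel_features(image), k=2)
# Vector quantization: replace every pixel by the intensity of its cluster center.
quantized = [centers[lab][0] for lab in labels]
```

The spatial coordinates are left unscaled here; in practice their weight relative to intensity or color is a design choice that controls how much spatial coherence is enforced.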


http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

Mean shift clustering and segmentation
- An advanced and versatile technique for clustering-based segmentation
D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.

Mean shift algorithm
- The mean shift algorithm seeks modes or local maxima of density in the feature space

Figure: image and its feature space (L*u*v* color values)

Mean shift
Figure sequence (labels: search window, center of mass, mean shift vector)
Slide by Y. Ukrainitz & B. Sarel

Mean shift clustering
- Cluster: all data points in the attraction basin of a mode
- Attraction basin: the region for which all trajectories lead to the same mode
Slide by Y. Ukrainitz & B. Sarel

Mean shift clustering/segmentation
- Find features (color, gradients, texture, etc)
- Initialize windows at individual feature points
- Perform mean shift for each window until convergence
- Merge windows that end up near the same "peak" or mode
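
The window-shifting procedure can be sketched as follows, assuming a flat (uniform) kernel and a handful of 2-D feature points. The function names, the merge radius of half a bandwidth, and the sample points are assumptions made for illustration, not details taken from the slides or from Comaniciu and Meer.

```python
def mean_shift(points, bandwidth, iters=50, tol=1e-6):
    """Flat-kernel mean shift: returns one converged window center per input point."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    modes = []
    for start in points:
        window = start  # initialize a search window at each feature point
        for _ in range(iters):
            # Points that fall inside the search window.
            inside = [p for p in points if dist2(p, window) <= bandwidth ** 2]
            # Center of mass of the window; moving there is the mean shift step.
            center = tuple(sum(c) / len(inside) for c in zip(*inside))
            if dist2(center, window) < tol ** 2:
                break
            window = center
        modes.append(window)
    return modes

def merge_modes(modes, bandwidth):
    """Label windows that ended up near the same peak with the same cluster index."""
    peaks, labels = [], []
    for m in modes:
        for idx, p in enumerate(peaks):
            if sum((u - v) ** 2 for u, v in zip(m, p)) <= (bandwidth / 2) ** 2:
                labels.append(idx)
                break
        else:
            peaks.append(m)
            labels.append(len(peaks) - 1)
    return labels, peaks

# Hypothetical feature points forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, peaks = merge_modes(mean_shift(points, bandwidth=1.0), bandwidth=1.0)
```

The number of clusters is not specified anywhere: it falls out of how many distinct peaks the windows converge to, which is controlled only by the bandwidth (window size).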


Mean shift segmentation results
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

More results

Mean shift pros and cons
Pros
- Does not assume spherical clusters
- Just a single parameter (window size)
- Finds variable number of modes
- Robust to outliers
Cons
- Output depends on window size
- Computationally expensive
- Does not scale well with dimension of feature space

Images as graphs
- Node for every pixel
- Edge between every pair of pixels (or every pair of "sufficiently close" pixels)
- Each edge is weighted by the affinity or similarity of the two nodes

Figure: nodes i and j joined by an edge of weight w_ij
Source: S. Seitz

Segmentation by graph partitioning
- Break graph into segments
  - Delete links that cross between segments
  - Easiest to break links that have low affinity
    - similar pixels should be in the same segments
    - dissimilar pixels should be in different segments

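Figure: graph partitioned into segments A, B, C, with edge weight w_ij between nodes i and j
Source: S. Seitz

Measuring affinity
- Suppose we represent each pixel by a feature vector x, and define a distance function appropriate for this feature representation
- Then we can convert the distance between two feature vectors into an affinity with the help of a generalized Gaussian kernel:

$$w_{ij} = \exp\left(-\frac{1}{2\sigma^2}\,\mathrm{dist}(x_i, x_j)^2\right)$$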

Scale affects affinity
- Small σ: group only nearby points
- Large σ: group far-away points
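
A small sketch of how the kernel scale changes affinities, assuming Euclidean distance between feature vectors and the common form exp(-dist^2 / (2 sigma^2)) for the Gaussian kernel. The function name `affinity` and the sample points are illustrative assumptions.

```python
import math

def affinity(xi, xj, sigma):
    """Gaussian-kernel affinity between two feature vectors (Euclidean distance assumed)."""
    d2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-d2 / (2 * sigma ** 2))

near = affinity((0.0, 0.0), (0.5, 0.0), sigma=0.5)       # close pair, small scale
far_small = affinity((0.0, 0.0), (3.0, 0.0), sigma=0.5)   # distant pair, small scale
far_large = affinity((0.0, 0.0), (3.0, 0.0), sigma=10.0)  # distant pair, large scale
```

With the small scale the distant pair gets an affinity close to zero, so only nearby points are linked; with the large scale the same pair is still strongly connected.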

Graph cut
- Set of edges whose removal makes a graph disconnected
- Cost of a cut: sum of weights of cut edges
- A graph cut gives us a segmentation
  - What is a "good" graph cut and how do we find one?
Figure: cut separating segments A and B
Source: S. Seitz

Minimum cut
- We can do segmentation by finding the minimum cut in a graph
- Efficient algorithms exist for doing this

Figure: minimum cut example

Normalized cut
- Drawback: minimum cut tends to cut off very small, isolated components
Figure labels: ideal cut; cuts with lesser weight than the ideal cut
* Slide from Khurram Hassan-Shafique, CAP5415 Computer Vision 2003
- This can be fixed by normalizing the cut by the weight of all the edges incident to the segment
- The normalized cut cost is:

$$\mathrm{ncut}(A, B) = \frac{w(A, B)}{w(A, V)} + \frac{w(A, B)}{w(B, V)}$$

w(A, B) = sum of weights of all edges between A and B
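
To make the drawback of the minimum cut concrete, here is a hedged sketch on a hypothetical 7-node graph: two tight triangles joined by a weak bridge, plus one weakly attached node. It assumes the normalized cut cost of Shi and Malik, w(A,B)/w(A,V) + w(A,B)/w(B,V), where V is the set of all nodes. The helper names and the edge weights are made up for illustration.

```python
def build_graph(n, edges):
    """Symmetric affinity matrix from a list of (i, j, weight) edges."""
    W = [[0.0] * n for _ in range(n)]
    for i, j, weight in edges:
        W[i][j] = W[j][i] = weight
    return W

def w(W, A, B):
    """Sum of weights of all edges between node sets A and B."""
    return sum(W[i][j] for i in A for j in B)

def ncut_cost(W, A, B):
    """Cut weight normalized by the total edge weight incident to each segment."""
    V = range(len(W))
    cut = w(W, A, B)
    return cut / w(W, A, V) + cut / w(W, B, V)

W = build_graph(7, [
    (0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),   # first tight group
    (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),   # second tight group
    (2, 3, 0.3),                             # bridge between the groups
    (5, 6, 0.2),                             # weakly attached single node
])

isolate = ([6], [0, 1, 2, 3, 4, 5])          # cuts off the single node
balanced = ([0, 1, 2], [3, 4, 5, 6])         # splits the two groups

plain_costs = (w(W, *isolate), w(W, *balanced))
ncut_costs = (ncut_cost(W, *isolate), ncut_cost(W, *balanced))
```

On this graph the plain cut cost is lower for isolating the single node, while the normalized cost is lower for the balanced split between the two groups.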

J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000

Normalized cut
- Let W be the adjacency matrix of the graph
- Let D be the diagonal matrix with diagonal entries $D(i, i) = \sum_j W(i, j)$
- Then the normalized cut cost can be written as

$$\frac{y^T (D - W)\, y}{y^T D\, y}$$

where y is an indicator vector whose value should be 1 in the ith position if the ith feature point belongs to A and a negative constant otherwise
J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000

Normalized cut
- Finding the exact minimum of the normalized cut cost is NP-complete, but if we relax y to take on arbitrary values, then we can minimize the relaxed cost by solving the generalized eigenvalue problem $(D - W)y = \lambda D y$
- The solution y is given by the generalized eigenvector corresponding to the second smallest eigenvalue
- Intuitively, the ith entry of y can be viewed as a "soft" indication of the component membership of the ith feature
  - Can use 0 or median value of the entries as the splitting point (threshold), or find threshold that minimizes the Ncut cost

Normalized cut algorithm
1. Represent the image as a weighted graph G = (V,E), compute the weight of each edge, and summarize the information in D and W
2. Solve $(D - W)y = \lambda D y$ for the eigenvector with the second smallest eigenvalue
3. Use the entries of the eigenvector to bipartition the graph

To find more than two clusters:
- Recursively bipartition the graph
- Run k-means clustering on values of several eigenvectors
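
A dependency-free sketch of the bipartition step for the relaxed problem (D - W)y = lambda D y. Instead of calling an eigensolver, it assumes a small connected graph with positive degrees and uses power iteration on I + D^(-1/2) W D^(-1/2), projecting out the trivial eigenvector D^(1/2) 1; the dominant remaining eigenvector z gives y = D^(-1/2) z for the second smallest eigenvalue. The function name, iteration count, and toy graph are assumptions for illustration. A real implementation would typically use a sparse eigensolver.

```python
import math

def ncut_bipartition(W, iters=500):
    """Approximate the second generalized eigenvector and threshold it at 0."""
    n = len(W)
    d = [sum(row) for row in W]                  # D(i, i) = sum_j W(i, j)
    s = [1.0 / math.sqrt(di) for di in d]        # diagonal of D^(-1/2)
    # M = I + D^(-1/2) W D^(-1/2): same eigenvectors as the normalized Laplacian,
    # with the order of eigenvalues reversed and shifted to be non-negative.
    M = [[(1.0 if i == j else 0.0) + s[i] * W[i][j] * s[j] for j in range(n)]
         for i in range(n)]
    norm_u = math.sqrt(sum(d))
    u = [math.sqrt(di) / norm_u for di in d]     # trivial eigenvector D^(1/2) 1
    z = [float(i) for i in range(n)]             # arbitrary deterministic start
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(z, u))
        z = [a - dot * b for a, b in zip(z, u)]  # remove the trivial component
        z = [sum(M[i][j] * z[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in z))
        z = [a / norm for a in z]
    y = [s[i] * z[i] for i in range(n)]          # back to the generalized eigenvector
    return [1 if v > 0 else 0 for v in y], y

# Hypothetical graph: two triangles joined by one weak edge.
W = [[0.0] * 6 for _ in range(6)]
for i, j, weight in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                     (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]:
    W[i][j] = W[j][i] = weight

labels, y = ncut_bipartition(W)
```

The sign of y is arbitrary, so which triangle receives label 1 is not meaningful; only the grouping is.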

Example result

Segmentation
Many slides from Svetlana Lazebnik, Anat Levin

Challenge
- How to segment images that are a "mosaic of textures"?

Using texture features for segmentation
- Convolve image with a bank of filters
- Find textons by clustering vectors of filter bank outputs
- The final texture feature is a texton histogram computed over image windows at some "local scale"
Figure: image and texton map
J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for Image Segmentation". IJCV 43(1), 7-27, 2001.
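
The last step, the texton histogram over a local window, can be sketched as below. It assumes the texton map (one texton index per pixel) has already been produced by the filter bank and clustering steps; the function name, the square window, and the toy map are illustrative assumptions rather than the setup of Malik et al.

```python
def texton_histogram(texton_map, row, col, radius, num_textons):
    """Normalized histogram of texton labels in a square window around (row, col)."""
    hist = [0] * num_textons
    height, width = len(texton_map), len(texton_map[0])
    # Clip the window at the image border.
    for r in range(max(0, row - radius), min(height, row + radius + 1)):
        for c in range(max(0, col - radius), min(width, col + radius + 1)):
            hist[texton_map[r][c]] += 1
    total = sum(hist)
    return [count / total for count in hist]

# Hypothetical texton map: a checkerboard of textons 0/1 on the left, texton 2 on the right.
texton_map = [[(r + c) % 2 if c < 3 else 2 for c in range(6)] for r in range(4)]

left = texton_histogram(texton_map, 1, 1, radius=1, num_textons=3)
right = texton_histogram(texton_map, 1, 4, radius=1, num_textons=3)
border = texton_histogram(texton_map, 1, 3, radius=1, num_textons=3)
```

Windows that lie inside one texture give clean histograms, while a window straddling the boundary between the two regions mixes textons from both.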

Pitfall of texture features
- Possible solution: check for "intervening contours" when computing connection weights

J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for Image Segmentation". IJCV 43(1), 7-27, 2001.

Example results

Results: Berkeley Segmentation Engine

http://www.cs.berkeley.edu/~fowlkes/BSE/

Normalized cuts: Pro and con
Pros
- Generic framework, can be used with many different features and affinity formulations
Cons
- High storage requirement and time complexity
- Bias towards partitioning into equal segments

Segments as primitives for recognition
J. Tighe and S. Lazebnik, ECCV 2010

Bottom-up segmentation
- Normalized cuts
- Mean shift

Bottom-up approaches: Use low level cues to group similar pixels

Bottom-up segmentation is ill posed
Some segmentation example (maybe horses from Eran's paper)

Many possible segmentations are equally good based on low-level cues alone.
Images from Borenstein and Ullman 02

Top-down segmentation
- Class-specific, top-down segmentation (Borenstein & Ullman, ECCV 02)
- Winn and Jojic 05
- Leibe et al 04
- Yuille and Hallinan 02
- Liu and Sclaroff 01
- Yu and Shi 03

Combining top-down and bottom-up segmentation
Find a segmentation that:
- Is similar to the top-down model
- Aligns with image edges

Why learn top-down and bottom-up models simultaneously?

- Large number of degrees of freedom in the tentacle configuration: requires a complex deformable top-down model
- On the other hand: rather uniform colors, so low-level segmentation is easy

Combined Learning Approach
- Learn top-down and bottom-up models simultaneously
- Reduces at run time to energy minimization with binary labels (graph min cut)

Energy model

- Consistency with fragments segmentation
- Segmentation alignment with image edges
Figure: resulting min-cut segmentation

Learning from segmented class images
Training data:

Learn fragments for an energy function

Fragments selection
Candidate fragments pool:
Greedy energy design:

Fragments selection challenges
- Straightforward computation of likelihood improvement is impractical
- 2000 fragments × 50 training images × 10 fragment selection iterations = 1,000,000 inference operations!

Fragments selection
First order approximation to log-likelihood gain (annotated terms):
- Fragment with low error on the training set
- Fragment not accounted for by the existing model

Fragments selection
First order approximation to log-likelihood gain:
- Requires a single inference process on the previous iteration energy to evaluate approximations with respect to all fragments
- First order approximation evaluation is linear in the fragment size

Training horses model

Training horses model: one fragment

Training horses model: two fragments

Training horses model: three fragments

Results: horses dataset
Plot axes: number of fragments vs. percentage of mislabeled pixels
- Comparable to previous but with far fewer fragments

Results: artificial octopi

Top-down segmentation
- E. Borenstein and S. Ullman, "Class-specific, top-down segmentation", ECCV 2002
- A. Levin and Y. Weiss, "Learning to Combine Bottom-Up and Top-Down Segmentation", ECCV 2006

Visual motion

Many slides adapted from S. Seitz, R. Szeliski, M. Pollefeys

Motion and perceptual organization
- Sometimes, motion is the only cue

Motion and perceptual organization
- Even "impoverished" motion data can evoke a strong percept
G. Johansson, "Visual Perception of Biological Motion and a Model For Its Analysis", Perception and Psychophysics 14, 201-211, 1973.

Uses of motion
- Estimating 3D structure
- Segmenting objects based on motion cues
- Learning and tracking dynamical models
- Recognizing events and activities

Motion field
- The motion field is the projection of the 3D scene motion into the image

Motion field and parallax
- X(t) is a moving 3D point
- Velocity of scene point: V = dX/dt
- x(t) = (x(t), y(t)) is the projection of X in the image
- Apparent velocity v in the image: given by components v_x = dx/dt and v_y = dy/dt
- These components are known as the motion field of the image
Figure labels: x(t), x(t+dt), X(t), X(t+dt), V, v

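To find image velocity v, differentiate x = (x, y) with respect to t (using quotient rule). Assuming perspective projection with focal length f:

$$x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}$$

$$v_x = f\,\frac{Z V_x - V_z X}{Z^2} = \frac{f V_x - V_z x}{Z}, \qquad v_y = f\,\frac{Z V_y - V_z Y}{Z^2} = \frac{f V_y - V_z y}{Z}$$

In vector form, $v = (v_0 - V_z x)/Z$ with $v_0 = (f V_x, f V_y)$.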

Image motion is a function of both the 3D motion (V) and the depth of the 3D point (Z)

Quotient rule for differentiation: D(f/g) = (g f' - g' f)/g^2

Motion field and parallax
- Pure translation: V is constant everywhere
- The length of the motion vectors is inversely proportional to the depth Z
- V_z is nonzero: every motion vector points toward (or away from) the vanishing point of the translation direction
  - At the vanishing point of the translation direction, the visual motion is zero
  - We have v = 0 when V_z x = v_0 = (f V_x, f V_y), that is, x = (f V_x / V_z, f V_y / V_z)
- V_z is zero: motion is parallel to the image plane, all the motion vectors are parallel
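
The pure-translation cases can be checked numerically with a short sketch. It assumes perspective projection x = f X / Z with focal length f, which gives image velocity ((f V_x - V_z x) / Z, (f V_y - V_z y) / Z); the function name and the sample velocities are illustrative assumptions.

```python
def motion_field(x, y, Z, V, f=1.0):
    """Image velocity at image point (x, y) for a scene point at depth Z moving with velocity V."""
    Vx, Vy, Vz = V
    return ((f * Vx - Vz * x) / Z, (f * Vy - Vz * y) / Z)

V = (1.0, 0.5, 2.0)                          # hypothetical translation with nonzero V_z
vanishing_point = (V[0] / V[2], V[1] / V[2])  # (f V_x / V_z, f V_y / V_z) with f = 1

at_vp = motion_field(vanishing_point[0], vanishing_point[1], 3.0, V)
near = motion_field(1.0, 1.0, 2.0, V)        # depth Z = 2
far = motion_field(1.0, 1.0, 4.0, V)         # same image point, depth Z = 4

V_parallel = (1.0, 0.5, 0.0)                 # V_z = 0: motion parallel to the image plane
a = motion_field(-1.0, 2.0, 2.0, V_parallel)
b = motion_field(3.0, -1.0, 5.0, V_parallel)
```

Here `at_vp` is zero, `far` is half of `near` because the depth doubled, and `a` and `b` point in the same direction even though they were evaluated at different image points and depths.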

Optical flow based segmentation
Segmentation of flow vectors using the above techniques:
- mean shift
- normalized cuts
- top-down approaches
