Bi-layer segmentation of binocular stereo video


V. Kolmogorov, A. Criminisi, A. Blake, G. Cross, C. Rother

Microsoft Research, Cambridge, UK
Computer Vision Group

Problem

• Two layers (Bg and Fg)
– Scenarios: video-conferencing, live bluescreening without bluescreen
• Two cameras
• Background may not be static
• Goal: accurately segment foreground object in real-time
– Applications: background substitution, ...

Left input Right input

Sources of information

• Stereo
– Foreground object has larger disparity
• Colour
– Background and foreground have distinct colour distributions
• Contrast
– There is image gradient at Bg/Fg transition
• Spatial coherence
– MRF model
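The contrast and spatial-coherence cues are often combined in graph-cut segmentation as an Ising-style penalty that is discounted across strong image edges. The sketch below illustrates that general idea only. The exponential form and the parameter names `beta` and `gamma` are assumptions, not taken from these slides.

```python
import math

def contrast_weight(z_p, z_q, beta, gamma=1.0):
    """Penalty for a label change between neighbouring pixels p and q.

    z_p, z_q: colour tuples of the two pixels.
    beta, gamma: hypothetical contrast sensitivity and coherence strength.
    """
    sq_diff = sum((a - b) ** 2 for a, b in zip(z_p, z_q))
    return gamma * math.exp(-beta * sq_diff)

def pairwise_cost(x_p, x_q, z_p, z_q, beta, gamma=1.0):
    """Zero if the labels agree, otherwise the contrast-discounted penalty."""
    if x_p == x_q:
        return 0.0
    return contrast_weight(z_p, z_q, beta, gamma)
```

With this form a Bg/Fg transition is cheap where the image gradient is large and expensive inside uniform regions, so label boundaries are encouraged to follow image edges.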

Previous work

• Colour/Contrast (+2D coherence)
– Graph cuts (Boykov et al. ’01, Rother et al. ’04)
• Stereo (+1D coherence)
– Dynamic programming (Ohta et al. ’85, Cox et al. ’96, Criminisi et al. ’03)

Left input

Fusing colour/contrast and stereo

• Colour and stereo complement each other
• Result from fusion: [figure: results from our technique]
• Ideally: [figure: segmentation (Bg/Fg/Occ), disparity]

Fusing colour/contrast and stereo

• Construct p(x, d | z) from the data
• Prior p(x, d): 2D MRF model
• Marginalise d out to get p(x | z)
• Compute x as MAP configuration
• Intractable!
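Written out, the steps above amount to the following, where x is the segmentation labelling, d the disparities and z the image data. The likelihood symbol p(z | x, d) is an assumption introduced here to make the Bayes step explicit.

```latex
p(x, d \mid z) \;\propto\; p(z \mid x, d)\, p(x, d), \qquad
p(x \mid z) \;=\; \sum_{d} p(x, d \mid z), \qquad
\hat{x} \;=\; \arg\max_{x}\, p(x \mid z)
```

The sum runs over every joint disparity configuration of the image, and the 2D MRF prior couples neighbouring pixels, which is why the exact computation is out of reach.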

Approximations

• Simplify the model to get real-time performance
• Two different approaches:
– Layered Dynamic Programming (LDP)
– Layered Graph Cut (LGC)
• Probabilistic formulation
– Parameters can mostly be set automatically
• Very similar error statistics
– Consistently better than colour/contrast or stereo alone

Layered Dynamic Programming (LDP)

• Approximation: in the prior neglect coupling between scanlines

• 1D MRF model (Markov chain)

• Compute path from (0,0) to (N,N) and label {Fg,Bg,Occ} for each vertex

[Figure axes: Left, Right, Disparity]
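Once coupling between scanlines is dropped, each scanline can be optimised exactly by dynamic programming. The sketch below is a simplification: it runs Viterbi over per-pixel labels on a single scanline, whereas the method on the slide finds a path through the left/right matching grid from (0,0) to (N,N). The cost tables and the single `switch_cost` parameter are hypothetical.

```python
LABELS = ("Fg", "Bg", "Occ")

def viterbi_scanline(unary, switch_cost=1.0):
    """Minimum-cost labelling of one scanline under a Markov-chain prior.

    unary: list of dicts, unary[i][label] = data cost of label at pixel i.
    switch_cost: hypothetical penalty for a label change between neighbours.
    """
    if not unary:
        return []
    cost = {l: unary[0][l] for l in LABELS}
    back = []  # back[i-1][l] = best predecessor label of l at pixel i
    for i in range(1, len(unary)):
        new_cost, ptr = {}, {}
        for l in LABELS:
            # Cheapest way to arrive at label l from any previous label.
            prev = min(LABELS,
                       key=lambda k: cost[k] + (0.0 if k == l else switch_cost))
            step = 0.0 if prev == l else switch_cost
            new_cost[l] = cost[prev] + step + unary[i][l]
            ptr[l] = prev
        cost = new_cost
        back.append(ptr)
    # Trace the optimal path backwards from the cheapest final label.
    path = [min(LABELS, key=lambda l: cost[l])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path
```

Because every scanline is solved independently, the per-scanline calls can run in parallel.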

Layered Graph Cut (LGC)

• Approximation: in the prior neglect conditioning of disparity d_p on disparities of neighbours
– conditioned only on segmentation label x_p ∈ {Bg, Fg, Occ}
• Marginalise disparities out
• Energy minimisation problem with 3 labels
• Solve it using 2 graph cut computations
– Approx. 20 frames per second (320 x 240, 3GHz)
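When each disparity depends only on its own pixel's label, the marginalisation reduces to an independent sum per pixel. A minimal sketch of that step follows, assuming a discrete disparity range. The inputs `match_lik` and `disp_prior` are hypothetical names, and the special handling an occluded pixel would need is left out.

```python
import math

def marginal_label_costs(match_lik, disp_prior):
    """Negative log-likelihood of each label with disparity summed out.

    match_lik: list, match_lik[d] = p(z_p | d_p = d), a stereo-match likelihood.
    disp_prior: dict, disp_prior[label][d] = p(d_p = d | x_p = label).
    """
    costs = {}
    for label, prior in disp_prior.items():
        # p(z_p | x_p) = sum over d of p(z_p | d) * p(d | x_p)
        p = sum(lik * pr for lik, pr in zip(match_lik, prior))
        costs[label] = -math.log(max(p, 1e-300))  # guard against log(0)
    return costs

# Toy pixel whose best stereo match is at the largest disparity.
costs = marginal_label_costs(
    match_lik=[0.1, 0.1, 0.8],
    disp_prior={"Fg": [0.0, 0.2, 0.8], "Bg": [0.8, 0.2, 0.0]},
)
```

The resulting per-pixel costs would serve as the data terms of the 3-label energy. Each pixel is processed independently, so this step maps naturally onto parallel hardware.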

Setting parameters

• Two approaches:
– Generative: from physics (e.g. from average width of occluded regions)
– Discriminative: minimize error rates
• Consistent results!
– See technical report 2005 (http://research.microsoft.com/vision/cambridge/i2i/)

Experiments: ground truth data

Ground-Truth segmentation at http://research.microsoft.com/vision/cambridge/i2i/

• 19 calibrated stereo sequences
• 6 with ground truth segmentation
– Every 5th or 10th frame
– Pixels marked as “Bg”, “Fg” or “Unknown”

Original Ground-truth segmentation
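Given labels of this kind, a segmentation error can be computed as the percentage of misclassified pixels. The sketch below assumes that pixels marked “Unknown” are left out of the count. The slides do not spell out the exact error measure, so treat this as one plausible reading.

```python
def segmentation_error(predicted, ground_truth):
    """Percentage of misclassified pixels, skipping those marked "Unknown".

    predicted, ground_truth: equally sized 2D lists of label strings.
    """
    counted = wrong = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for pred, gt in zip(pred_row, gt_row):
            if gt == "Unknown":
                continue  # assumption: unknown pixels are not scored
            counted += 1
            if pred != gt:
                wrong += 1
    return 100.0 * wrong / counted if counted else 0.0

gt = [["Bg", "Fg", "Unknown"], ["Fg", "Fg", "Bg"]]
pred = [["Bg", "Bg", "Fg"], ["Fg", "Fg", "Bg"]]
error = segmentation_error(pred, gt)  # 1 wrong out of 5 scored pixels
```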

Accuracy of segmentation

LGC Segmentation

Accuracy of segmentation

Conclusion

• Two algorithms based on different approximations
– Fuse colour/contrast, stereo, and spatial coherence
– Probabilistic formulation
– Capable of real-time performance
• Similar error statistics
– Consistently better than state-of-the-art techniques
• Different characteristics
– LDP: Parallelisable (scanlines processed independently)
– LGC: Marginalisation could be done on GPU

MSN: i2i cambridge

Segmentation errors (LGC)
