Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University.

Post on 20-Jan-2016

Stereo Vision

ECE 847: Digital Image Processing

Stan Birchfield, Clemson University

Modeling from multiple views

[Chart: axes are time and # cameras. Entries: photograph, binocular stereo, trinocular stereo, multi-baseline stereo, camcorder, human vision, camera dome, two frames, ...]

Stereo: Greek for "solid"

Stereoscope

Invented by Wheatstone in 1838

Modern version

Can you fuse these?

left right

No special instrument needed

Just relax your eyes

L

R

Random dot stereogram

Invented by Bela Julesz in 1959

http://www.magiceye.com/faq.htm

Autostereogram

Do you see the shark?

http://en.wikipedia.org/wiki/Autostereogram

Can you cross-fuse these?

right left

Note: Cross-fusion is necessary if the distance between the images is greater than the inter-ocular distance

L

R

R

L

impossible: instead, trick the brain:

Tsukuba stereo images courtesy of Y. Ohta and Y. Nakamura at the University of Tsukuba

Human stereo geometry

http://webvision.med.utah.edu/space_perception.html

fixation point

corresponding points

aR

aL

disparity

Horopter

• Horopter: surface where disparity is zero

• For a round retina, the theoretical horopter is a circle (Vieth-Müller circle)

http://webvision.med.utah.edu/space_perception.html

Cyclopean image

http://webvision.med.utah.edu/space_perception.html

http://bearah718.tripod.com/sitebuildercontent/sitebuilderpictures/cyclops.jpg

Panum’s fusional area (volume)

• The human visual system can fuse the two images only within a narrow range of disparities around the fixation point

• This area (volume) is Panum’s fusional area

• Outside this area we get double-vision (diplopia)

http://www.allaboutvision.com/conditions/double-vision.htm

Human visual pathway

photos courtesy California Academy of Science

Prey and predator

Cheetah: more accurate depth estimation

Antelope: larger field of view

Stereo geometry for pinhole cameras with flat retinas

C,C’,x,x’ and X are coplanar

Left camera Right camera

world point

center of projection

epipolar plane

epipolar line for x

epipole

baseline

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

epipoles e, e′ = intersection of baseline with image plane = projection of projection center in other image = vanishing point of camera motion direction

an epipolar plane = plane containing baseline (1-D family)

an epipolar line = intersection of epipolar plane with image (always come in corresponding pairs)

Epipolar geometry

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

What if only C,C’,x are known?

Epipolar geometry

epipole

center of projection

baseline

All points on the epipolar plane π project onto l and l′

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Family of planes π and lines l and l′; intersection in e and e′

Epipolar geometry

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Example: Converging cameras

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

e

e’

Example: Forward motion

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Epipolar geometry and Fundamental matrix

epipolar line

(epipole: intersection of all epipolar lines)

(computed with 20 points)

Epipolar geometry and Fundamental matrix

epipolar line

(epipole: intersection of all epipolar lines)

(computed with 50 points)

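Fundamental matrix

$\mathbf{x}^T F \mathbf{x}' = 0$

$\mathbf{x} = (x, y, 1)^T$: point in image 1

$\mathbf{x}' = (x', y', 1)^T$: point in image 2

$F$: fundamental matrix (3×3)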

Epipolar lines (1)

$\mathbf{l}' = F^T \mathbf{x}$: epipolar line in image 2 associated with (x,y) in image 1

Epipolar lines (2)

$\mathbf{l} = F \mathbf{x}'$: epipolar line in image 1 associated with (x′,y′) in image 2

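Computing the fundamental matrix

Each correspondence gives one equation $\mathbf{x}^T F \mathbf{x}' = 0$ that is linear in the entries of F: the coefficients (products of image coordinates) are known, and the nine entries of F are unknown. Stacking n correspondences gives $A\mathbf{f} = 0$, with A of size n×9.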

Computing the fundamental matrix

1. Construct A (n×9) from correspondences

2. Compute SVD of A: A = UΣV^T

3. Unit-norm solution of Af = 0 is given by the last column of V (the right singular vector of A with the smallest singular value)

4. Reshape f into F1

5. Compute SVD of F1: F1 = U_F Σ_F V_F^T

6. Set Σ_F(3,3) = 0 to get Σ_F′ (this enforces rank(F) = 2)

7. Reconstruct F = U_F Σ_F′ V_F^T

8. Now x^T F is the epipolar line in image 2, and F x′ is the epipolar line in image 1
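The steps above can be sketched in Python with NumPy. This is a minimal sketch rather than a definitive implementation: the function name `estimate_fundamental` and the row-major ordering of f are assumptions made here, and the coordinate normalization that is usually applied before step 1 is omitted.

```python
import numpy as np

def estimate_fundamental(pts1, pts2):
    """Estimate F such that x^T F x' = 0 (x in image 1, x' in image 2).

    pts1, pts2: (n, 2) arrays of corresponding points, n >= 8.
    Hypothetical helper; no coordinate normalization is applied.
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    x, y = pts1[:, 0], pts1[:, 1]
    xp, yp = pts2[:, 0], pts2[:, 1]
    ones = np.ones_like(x)
    # Step 1: one row per correspondence, f = (F11, F12, ..., F33) row-major.
    A = np.column_stack([x * xp, x * yp, x, y * xp, y * yp, y, xp, yp, ones])
    # Steps 2-3: right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    # Step 4: reshape f into the 3x3 matrix F1.
    F1 = Vt[-1].reshape(3, 3)
    # Steps 5-7: zero the smallest singular value to enforce rank 2.
    U, S, Vt_f = np.linalg.svd(F1)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt_f
```

With noisy correspondences the residuals x^T F x′ are only approximately zero, which is one reason more points (50 rather than 20) tend to give more stable epipolar lines.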

(simple for stereo rectification)

Example: Motion parallel to image plane

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Example: Motion parallel to image scanlines

Epipoles are at infinity

Scanlines are the epipolar lines

In this case, the images are said to be “rectified”

Tsukuba stereo images courtesy of Y. Ohta and Y. Nakamura at the University of Tsukuba

Standard (rectified) stereo geometry

pure translation along X-axis

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Perspective projection

[Figure: world point X projecting to image point x.]

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Rectified geometry

[Figure labels: image coordinates xL and xR, world coordinates XL and −XR, baseline b.]

Standard stereo geometry

• disparity is inversely proportional to depth

• stereo vision is less useful for distant objects

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
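The inverse relation between disparity and depth can be illustrated with the standard rectified-stereo formula Z = f·b/d. The sketch below assumes a focal length f in pixels, a baseline b in scene units, and a disparity d in pixels; the function and parameter names are chosen here for illustration.

```python
def depth_from_disparity(disparity_px, focal_px, baseline):
    """Depth Z = f * b / d for rectified cameras.

    Returns infinity for zero (or negative) disparity, i.e. a point
    at infinity under this model.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline / disparity_px
```

A one-pixel disparity error matters far more for distant points: with f = 500 and b = 0.1, going from d = 2 to d = 1 changes the depth from 25 to 50, while going from d = 51 to d = 50 changes it by less than 0.02.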

Binocular rectified stereo

left right

disparity map, depth discontinuities

epipolar constraint

Matching scanlines

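[Figure: intensity profiles of the matching left (L) and right (R) scanlines, and the resulting disparity vs. pixel plot, with the lamp and wall regions labeled.]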

Stereo matching

• Search is limited to epipolar line (1D)

• Look for most similar pixel

?

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Aggregation

• Use more than one pixel

• Assume neighbors have similar disparities*

– Use correlation window containing pixel

– Allows use of SSD, ZNCC, Census, etc.

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Block matching

Dissimilarity measures

Connection between SSD and cross correlation:

Also normalized correlation, rank, census, sampling-insensitive ...

Most common:

More efficient implementation

Key idea: Summation over window is correlation with box filter, which is separable

Running sum improves efficiency even more

Note: w is half-width
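The separable box filter with a running sum can be sketched as follows. This is a plain-Python illustration under the slide's convention that w is the half-width (window width 2w+1); the function names are hypothetical, and only windows that fit entirely inside the data are returned.

```python
def box_sum_1d(values, w):
    """Sum of values[i-w .. i+w] for every center i whose window fits.

    A running sum adds the entering sample and subtracts the leaving
    one, so the cost per output is constant regardless of w.
    """
    width = 2 * w + 1
    if len(values) < width:
        return []
    running = sum(values[:width])
    out = [running]
    for i in range(width, len(values)):
        running += values[i] - values[i - width]
        out.append(running)
    return out

def box_sum_2d(image, w):
    """Separable box filter: horizontal pass, then vertical pass."""
    rows = [box_sum_1d(row, w) for row in image]
    cols = [box_sum_1d(list(col), w) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

Applied to a per-pixel dissimilarity image for one candidate disparity, `box_sum_2d` yields the windowed cost for every pixel at once.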

Compare intensities pixel-by-pixel

Comparing image regions

I(x,y) I´(x,y)

Sum of Squared Differences (SSD)

Dissimilarity measures

Note: SAD is a fast approximation (replace the square with the absolute value)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
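A winner-take-all block matcher using SSD might look like the sketch below. It assumes rectified grayscale images stored as lists of rows, the left image as reference, and the convention that a left pixel at x matches a right pixel at x − d with d ≥ 0; these conventions and the function names are assumptions for illustration, and border pixels are simply left at zero.

```python
def ssd(left, right, x, y, d, w):
    """SSD over a (2w+1)x(2w+1) window between left(x, y) and right(x-d, y)."""
    total = 0
    for j in range(-w, w + 1):
        for i in range(-w, w + 1):
            diff = left[y + j][x + i] - right[y + j][x + i - d]
            total += diff * diff
    return total

def block_match(left, right, max_disp, w):
    """For each interior pixel, keep the disparity with the lowest SSD."""
    height, width = len(left), len(left[0])
    disp = [[0] * width for _ in range(height)]
    for y in range(w, height - w):
        # Start at w + max_disp so every candidate window stays in bounds.
        for x in range(w + max_disp, width - w):
            costs = [ssd(left, right, x, y, d, w) for d in range(max_disp + 1)]
            disp[y][x] = costs.index(min(costs))
    return disp
```

Replacing `diff * diff` with `abs(diff)` gives the SAD variant mentioned in the note above.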

Dissimilarity measures

If energy does not change much, then minimizing SSD equals maximizing cross-correlation:

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
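The connection can be made explicit by expanding the square over the window (written here as W, a symbol assumed for this sketch):

```latex
\sum_{(i,j) \in W} \bigl( I(i,j) - I'(i,j) \bigr)^2
  = \sum_{(i,j) \in W} I(i,j)^2
  + \sum_{(i,j) \in W} I'(i,j)^2
  - 2 \sum_{(i,j) \in W} I(i,j)\, I'(i,j)
```

The first two sums are the window energies. If they stay roughly constant as the candidate window moves, only the last term varies, so the SSD is smallest where the cross-correlation is largest.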

Zero-mean Normalized Cross Correlation

Similarity measures

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
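Zero-mean normalized cross correlation can be sketched as below for two windows flattened into equal-length lists. The function name and the choice to return 0 for a constant (zero-variance) window are assumptions made here.

```python
import math

def zncc(a, b):
    """Zero-mean normalized cross correlation of two equal-length windows.

    Returns a value in [-1, 1]; 1 means the windows agree up to a gain
    and offset change in intensity.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    den = math.sqrt(var_a * var_b)
    return num / den if den > 0 else 0.0
```

Because the mean is subtracted and the result is normalized, the score is unchanged when one window is brightened or has its contrast scaled, which SSD does not tolerate.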

Census

Similarity measures

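Example 3×3 window (center 128) and its bit signature (1 where the neighbor is greater than the center):

125 126 125 → 0 0 0

127 128 130 → 0 · 1

129 132 135 → 1 1 1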

(Real-time chip from TYZX based on Census)

only compare bit signature

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
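The census signature of the 3×3 example above, and the comparison of two signatures, can be sketched as follows. The function names are hypothetical, and the comparison rule (bit is 1 when the neighbor is strictly greater than the center) is inferred from the example values.

```python
def census_signature(window):
    """Bit signature of a 3x3 window: 1 where a neighbor exceeds the center."""
    center = window[1][1]
    return [
        1 if value > center else 0
        for r, row in enumerate(window)
        for c, value in enumerate(row)
        if (r, c) != (1, 1)
    ]

def hamming_distance(sig_a, sig_b):
    """Number of differing bits; used as the dissimilarity between windows."""
    return sum(p != q for p, q in zip(sig_a, sig_b))

example = [[125, 126, 125],
           [127, 128, 130],
           [129, 132, 135]]
print(census_signature(example))  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Since only the ordering of intensities matters, the signature is unchanged by any monotonic brightness change, and comparing bit strings is cheap in hardware.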

Sampling-Insensitive Pixel Dissimilarity

Our dissimilarity measure: d(xL, xR) = min{ d̄(xL, xR), d̄(xR, xL) }

[Birchfield & Tomasi 1998]

[Figure: intensity functions IL and IR sampled at xL and xR.]

Dissimilarity Measure Theorems

Given: An interval A such that [xL – ½, xL + ½] ⊆ A and [xR – ½, xR + ½] ⊆ A

Theorem 1: If |xL – xR| ≤ ½, then d(xL, xR) = 0 (when the intensity function is convex or concave on A)

Theorem 2: |xL – xR| ≤ ½ iff d(xL, xR) = 0 (when the intensity function is linear on A)
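One common way to compute this measure on a scanline is sketched below. It assumes linear interpolation between neighboring samples and clamps at the scanline ends; the function names are hypothetical and the code is an illustration of the idea, not the authors' implementation.

```python
def half_sample_dissimilarity(I_a, x_a, I_b, x_b):
    """How far I_a[x_a] lies from the range of the linearly interpolated
    I_b over the interval [x_b - 1/2, x_b + 1/2]."""
    center = I_b[x_b]
    left = 0.5 * (center + I_b[max(x_b - 1, 0)])
    right = 0.5 * (center + I_b[min(x_b + 1, len(I_b) - 1)])
    lo = min(left, center, right)
    hi = max(left, center, right)
    return max(0, I_a[x_a] - hi, lo - I_a[x_a])

def bt_dissimilarity(I_L, x_L, I_R, x_R):
    """Symmetric measure: the minimum of the two one-sided dissimilarities."""
    return min(half_sample_dissimilarity(I_L, x_L, I_R, x_R),
               half_sample_dissimilarity(I_R, x_R, I_L, x_L))
```

For a linear ramp sampled half a pixel apart in the two images, for example `[0, 10, 20, 30, 40]` and `[5, 15, 25, 35, 45]`, the absolute difference at matching indices is 5 but this measure is 0, which is the behavior the theorems describe.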

Aggregation window sizes

Small windows

• disparities similar

• more ambiguities

• accurate when correct

Large windows

• larger disp. variation

• more discriminant

• often more robust

• use shiftable windows to deal with discontinuities

(Illustration from Pascal Fua)

Occlusions

(Slide from Pascal Fua)

Left-right consistency check

• Search left-to-right, then right-to-left

• Retain a disparity only if the two searches agree

xL

d

Do minima coincide?
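The check can be sketched for a single scanline as follows. It assumes integer disparities with the convention that left pixel x matches right pixel x − d, and it marks rejected pixels with `None`; the function name and the `tol` parameter are assumptions for illustration.

```python
def left_right_check(disp_left, disp_right, tol=0):
    """Keep disp_left[x] only if the right-to-left search agrees.

    disp_left[x]: disparity found for left pixel x.
    disp_right[x]: disparity found for right pixel x.
    """
    n = len(disp_left)
    checked = []
    for x, d in enumerate(disp_left):
        x_right = x - d  # where the left pixel lands in the right image
        if 0 <= x_right < n and abs(disp_right[x_right] - d) <= tol:
            checked.append(d)
        else:
            checked.append(None)
    return checked
```

Pixels that fail the check are typically occluded in one image or poorly textured, so discarding them trades density for reliability.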

Results: correlation

left, disparity map

with left-right consistency check

Constraints

• Epipolar – match must lie on epipolar line

• Piecewise constancy – neighboring pixels should usually have same disparity

• Piecewise continuity – neighboring pixels should usually have similar disparity

• Disparity – impose allowable range of disparities (Panum’s fusional area)

• Disparity gradient – restricts slope of disparity

• Figural continuity – disparity of edges across scanlines

• Uniqueness – each pixel has no more than one match (violated by windows and mirrors)

• Ordering – disparity function is monotonic (precludes thin poles)

Exploiting scene constraints

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Ordering constraint

[Figure: a surface slice and the same surface as a path in the left vs. right matching space. Points 1, 2, 3, (4,5), 6 in one image correspond to 1, (2,3), 4, 5, 6 in the other; the merged points are labeled "occlusion right" and "occlusion left".]

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Uniqueness constraint

• In an image pair each pixel has at most one corresponding pixel

– In general one corresponding pixel

– In case of occlusion there is none

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Disparity constraint

[Figure: surface slice and surface as a path, showing the bounding box, the disparity band, and constant disparity surfaces.]

Use reconstructed features to determine bounding box

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Figural continuity constraint

right left

[University of Tsukuba]

Cooperative algorithm

Disparity space image

Dynamic Programming: 1D Search

String editing (c a t → c a r t):

        ""   c   a   r   t
   t     3   2   1   1   1
   a     2   1   0   1   2
   c     1   0   1   2   3
   ""    0   1   2   3   4

Penalties: mismatch = 1, insertion = 1, deletion = 1

Stereo matching:

[Figure: the disparity map as a path through the LEFT scanline vs. RIGHT scanline grid, with an occlusion and a depth discontinuity marked.]
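The string-editing table above can be reproduced with a short dynamic program. This sketch uses the slide's unit penalties as defaults; the function name is chosen here, and row i of the result corresponds to the first i characters of the first string.

```python
def edit_distance_table(a, b, mismatch=1, insertion=1, deletion=1):
    """Fill the dynamic-programming table for turning string a into b.

    D[i][j] is the cheapest cost of editing a[:i] into b[:j].
    """
    D = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        D[i][0] = i * deletion
    for j in range(1, len(b) + 1):
        D[0][j] = j * insertion
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            substitute = D[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            D[i][j] = min(substitute,
                          D[i - 1][j] + deletion,
                          D[i][j - 1] + insertion)
    return D

table = edit_distance_table("cat", "cart")
print(table[-1][-1])  # 1
```

In the stereo analogy, the two strings are the left and right scanlines, a mismatch penalty becomes a pixel dissimilarity, and insertions and deletions become occlusions.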

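Minimizing a 2D Cost Functional

Minimize:

$E = E_{\text{data}} + E_{\text{smoothness}} = \sum_{p} d(p, \cdot) + \sum_{\{p,q\} \in N} u(l_{p,q})$

Discontinuity penalty: $u(l_{p,q})$

[Figure: disparity vs. pixel diagrams of the 1D and 2D problems with the terms d(p, ·) and l_{p,q} marked, labeled GLOBAL and LOCAL, and annotated "Global", "Local (GOOD)", and "(BAD)".]

minimum cut = disparity surface; minimum cut solves the case where $u(l_{p,q})$ is a per-pair multiple of $l_{p,q}$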

Multiway-Cut: 2D Search

[Figure: graphs of pixels and labels.]

[Boykov, Veksler, Zabih 1998]

Multiway-Cut Algorithm

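Minimizes

$\sum_{\mathbf{x}} g(\mathbf{x}, f(\mathbf{x})) + \sum_{(\mathbf{x},\mathbf{x}')} w(\mathbf{x},\mathbf{x}')\,[f(\mathbf{x}) \neq f(\mathbf{x}')]$

$g(\mathbf{x}, f(\mathbf{x}))$: cost of assigning label to pixel

$w(\mathbf{x},\mathbf{x}')$: cost of label discontinuity

[Figure: graph with pixel nodes between a source label and a sink label; the minimum cut assigns labels to pixels.]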
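To make the two terms of the energy concrete, the sketch below evaluates it for a labeling of a single scanline and finds the minimum by brute force. It does not implement the multiway cut itself; the function names, the use of only left-right neighbors, and the single constant `weight` for every neighbor pair are simplifying assumptions.

```python
from itertools import product

def labeling_energy(data_cost, labels, weight):
    """Data term plus a constant penalty for each label change.

    data_cost[i][l]: cost of assigning label l to pixel i.
    labels[i]: label chosen for pixel i.
    """
    data = sum(data_cost[i][label] for i, label in enumerate(labels))
    smooth = sum(weight for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth

def brute_force_labeling(data_cost, weight):
    """Exhaustive minimizer; only feasible for tiny examples."""
    num_labels = len(data_cost[0])
    candidates = product(range(num_labels), repeat=len(data_cost))
    return min(candidates, key=lambda labels: labeling_energy(data_cost, labels, weight))
```

With `data_cost = [[0, 5], [4, 3], [0, 5]]`, a weight of 0 picks the per-pixel best labels (0, 1, 0), while a weight of 2 makes the smooth labeling (0, 0, 0) cheaper. Exhaustive search grows exponentially with the number of pixels, which is why graph-cut methods are used in practice.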

Energy minimization

(Slide from Pascal Fua)

Graph Cut

(Slide from Pascal Fua)

(general formulation requires multi-way cut!)

(Boykov et al ICCV‘99)

(Roy and Cox ICCV‘98)

Simplified graph cut

Correspondence as Segmentation

• Problem: disparities (fronto-parallel) O( ); surfaces (slanted) O( 2 n) => computationally intractable!

• Solution: iteratively determine which labels to use

[Diagram: loop alternating between "label pixels" using multiway-cut (Expectation) and "find affine parameters of regions" using Newton-Raphson (Maximization).]

Stereo Results (Dynamic Programming)

Stereo Results (Multiway-Cut)

Stereo Results on Middlebury Database

[Figure rows: image; Birchfield-Tomasi 1999; Hong-Chen 2004]

Untextured regions remain a challenge

Multiway-cut, Dynamic programming

Results: dynamic programming

disparity map

[Bobick & Intille]

left

Results: multiway cut

left, disparity map

[Kolmogorov & Zabih]

Results: multiway cut (untextured)

disparity map

Multi-camera configurations

Okutomi and Kanade

(illustration from Pascal Fua)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Example: Tsukuba

Tsukuba dataset

Real-time stereo on GPU

• Computes Sum-of-Square-Differences (use pixel shader)

• Hardware mip-map generation for aggregation over window

• Trade-off between small and large support window

(Yang and Pollefeys, CVPR2003)

290M disparity hypotheses/sec (Radeon 9800 Pro), e.g. 512x512x36 disparities at 30Hz
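The quoted example can be checked against the throughput figure with a line of arithmetic (the variable names are chosen here for illustration):

```python
width, height, disparities, frames_per_second = 512, 512, 36, 30

# One hypothesis is one (pixel, disparity) pair evaluated in one frame.
hypotheses_per_second = width * height * disparities * frames_per_second
print(hypotheses_per_second)  # 283115520
```

That is roughly 283M hypotheses per second, just under the stated 290M.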

GPU is great for vision too!

Stereo matching

Optimal path (dynamic programming)

Similarity measure (SSD or NCC)

Constraints

• epipolar

• ordering

• uniqueness

• disparity limit

Trade-off

• Matching cost (data)

• Discontinuities (prior)

Consider all paths that satisfy the constraints

pick best using dynamic programming

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
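A minimal scanline matcher in this spirit is sketched below. It assumes absolute intensity difference as the matching cost, a single constant `occlusion_cost` as the discontinuity prior, and disparity defined as left index minus right index; no disparity limit is imposed, and the function name is hypothetical.

```python
def dp_scanline(left, right, occlusion_cost):
    """Match two scanlines under the ordering and uniqueness constraints.

    Returns (total_cost, disparity), where disparity[i] = i - j for a left
    pixel i matched to right pixel j, or None if pixel i is occluded.
    """
    n, m = len(left), len(right)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:  # match left[i-1] with right[j-1]
                c = cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1])
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "match"
            if i > 0:  # left[i-1] has no match in the right scanline
                c = cost[i - 1][j] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "skip_left"
            if j > 0:  # right[j-1] has no match in the left scanline
                c = cost[i][j - 1] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "skip_right"
    # Trace the optimal path back from the end of both scanlines.
    disparity = [None] * n
    i, j = n, m
    while i > 0 or j > 0:
        step = back[i][j]
        if step == "match":
            disparity[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif step == "skip_left":
            i -= 1
        else:
            j -= 1
    return cost[n][m], disparity
```

Because the path may only move forward in both scanlines, the ordering constraint is built in, and because each step consumes a pixel at most once, so is uniqueness. Raising `occlusion_cost` shifts the trade-off toward fewer discontinuities.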

Hierarchical stereo matching

Downsampling (Gaussian pyramid)

Disparity propagation

Allows faster computation

Deals with large disparity ranges

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Disparity map

image I(x,y)

image I´(x´,y´)

Disparity map D(x,y)

(x´,y´)=(x+D(x,y),y)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
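The relation (x´,y´) = (x+D(x,y), y) can be used to reconstruct image I from image I´ by looking up each pixel's match. The sketch below assumes integer disparities and images stored as lists of rows; the function name and the `fill` value for pixels whose match falls outside the image are assumptions.

```python
def warp_with_disparity(image_prime, disparity, fill=None):
    """Reconstruct I from I' using I(x, y) = I'(x + D(x, y), y)."""
    height, width = len(disparity), len(disparity[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            x_prime = x + disparity[y][x]
            if 0 <= x_prime < width:
                out[y][x] = image_prime[y][x_prime]
    return out
```

Comparing the warped result with the actual image I gives a simple visual check on the quality of a disparity map.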

Example: reconstruct image from neighboring images

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Stereo matching with general camera configuration

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Image pair rectification

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Planar rectification

Bring two views to standard stereo setup (moves epipole to ∞) (not possible when in/close to image)

~ image size (calibrated)

Distortion minimization (uncalibrated)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

polar rectification

planar rectification

original image pair

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Stereo camera configurations

(Slide from Pascal Fua)

More cameras

Multi-baseline stereo

[Okutomi & Kanade]