

Shape and Motion under Varying Illumination: Unifying Structure from Motion, Photometric Stereo, and Multi-view Stereo

Li Zhang* Brian Curless* Aaron Hertzmann** Steven M. Seitz*

*University of Washington, Seattle, WA, USA **University of Toronto, Toronto, ON, Canada

Introduction

Goal

Dense shape reconstruction of moving objects under varying illumination from a single video.

Example

A hand-held figurine rotating in front of a fixed camera under static lighting.

Standard methods

Our solution

Optical flow under varying illumination

Contributions

Using both spatial and temporal brightness variations

♦ dense structure from motion
♦ stereo matching under lighting changes
♦ photometric stereo for moving scenes

⇒ dense surface reconstruction in both textured and textureless regions

spatial brightness variation → motion cue → surface position

temporal brightness variation → photometric cue → surface orientation

Mathematical Formulation

Brightness-varying flow:

$$I_t(x_{t,p}) = \gamma_{t,p} \cdot I_0(x_{0,p})$$

Assume a Lambertian object moving rigidly in front of an orthographic camera under static distant light.

Multi-point multi-frame optical flow can be computed by minimizing:

$$\Phi(\{x_{t,p}, \gamma_{t,p}\}) = \sum_{t,p} \left( I_t(x_{t,p}) - \gamma_{t,p} \cdot I_0(x_{0,p}) \right)^2$$
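To make the objective Φ concrete, here is a minimal NumPy sketch that evaluates it for given tracks and irradiance ratios. The function names, the array layouts, and the use of bilinear interpolation for sub-pixel sampling are assumptions made for illustration; the poster does not specify how images are sampled.

```python
import numpy as np

def bilinear_sample(image, xy):
    """Sample a 2-D image at sub-pixel positions xy, a (P, 2) array of (x, y)."""
    h, w = image.shape
    x = np.clip(xy[:, 0], 0, w - 1)
    y = np.clip(xy[:, 1], 0, h - 1)
    # Top-left corner of the interpolation cell, kept inside the image.
    x0 = np.minimum(np.floor(x).astype(int), w - 2)
    y0 = np.minimum(np.floor(y).astype(int), h - 2)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0, x0] + fx * image[y0, x0 + 1]
    bottom = (1 - fx) * image[y0 + 1, x0] + fx * image[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom

def flow_objective(frames, positions, gamma):
    """Phi = sum over t, p of (I_t(x_{t,p}) - gamma_{t,p} * I_0(x_{0,p}))^2.

    frames:    (T+1, H, W) grayscale frames; frame 0 is the reference
    positions: (T+1, P, 2) positions x_{t,p} as (x, y) pixel coordinates
    gamma:     (T+1, P) irradiance ratios; row 0 is unused
    """
    reference = bilinear_sample(frames[0], positions[0])
    total = 0.0
    for t in range(1, frames.shape[0]):
        current = bilinear_sample(frames[t], positions[t])
        total += np.sum((current - gamma[t] * reference) ** 2)
    return total
```

With γ fixed to 1 this reduces to an ordinary brightness-constancy cost, which is why the same objective can cover the constant-lighting case.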

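[Figure: a surface point $s_p$ with normal $n_p$ on the object surface, observed at $x_{0,p}$ in frame 0 and at $x_{t,p}$ in frame $t$; the light and camera are shown for both frames, related by the relative light and camera motion.]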

Image formation: Let $l_t$ and $b_t$ be the directional and ambient light at frame $t$, and $n_p$ and $\alpha_p$ be the normal and albedo of the point $p$, then

$$I_t(x_{t,p}) = \alpha_p \cdot ( l_t^T n_p + b_t )$$

Irradiance ratio:

$$\gamma_{t,p} = \frac{I_t(x_{t,p})}{I_0(x_{0,p})} = \frac{l_t^T n_p + b_t}{l_0^T n_p + b_0}$$
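The image formation model and the irradiance ratio can be checked numerically. The sketch below follows the two formulas as written (no clamping for attached shadows); the function names and argument layouts are illustrative assumptions. It shows that the albedo cancels in the ratio, so γ depends only on the normal and the lighting.

```python
import numpy as np

def render_intensity(albedo, normals, light_dir, ambient):
    """I_t(x_{t,p}) = alpha_p * (l_t^T n_p + b_t) for every point p.

    albedo: (P,), normals: (P, 3), light_dir: (3,), ambient: scalar
    """
    return albedo * (normals @ light_dir + ambient)

def irradiance_ratio(normals, light_t, ambient_t, light_0, ambient_0):
    """gamma_{t,p} = (l_t^T n_p + b_t) / (l_0^T n_p + b_0); albedo-free."""
    return (normals @ light_t + ambient_t) / (normals @ light_0 + ambient_0)

# Two points with different albedos and normals (made-up values).
normals = np.array([[0.0, 0.0, 1.0], [0.6, 0.0, 0.8]])
albedo = np.array([0.3, 0.9])
l0, b0 = np.array([0.0, 0.0, 1.0]), 0.2
lt, bt = np.array([0.0, 0.6, 0.8]), 0.1

ratio_from_images = (render_intensity(albedo, normals, lt, bt)
                     / render_intensity(albedo, normals, l0, b0))
ratio_from_model = irradiance_ratio(normals, lt, bt, l0, b0)
print(np.allclose(ratio_from_images, ratio_from_model))
```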

♦ Structure from Motion: only feature points
♦ Multi-view Stereo: only constant lighting
♦ Photometric Stereo: only static objects

[Figure label: $R_t$, $o_t$]

Reconstruction algorithm

Results

Conclusion

Reconstruction of the figurine

Three example frames. Optical flow is ambiguous within constant brightness regions; however, the temporal brightness variations uniquely determine surface orientation.

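| | Structure from Motion | Multi-view Stereo | Photometric Stereo |
|---|---|---|---|
| Known | $X$, $Y$ | $R_x$, $O_x$, $R_y$, $O_y$, $\Gamma = 1$ | $\Gamma$, constant $X$, $Y$ |
| Unknown | $R_x$, $O_x$, $R_y$, $O_y$, $S$ | $S$ | $N$, $L$ |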

♦ The motion cue constrains 3D positions inaccurately for low-texture pixels
♦ The photometric cue reveals normals accurately even for moving scenes, despite the noisy motion estimation
♦ Combining both cues recovers moving shape densely

Constraints on brightness-varying flow

Our formulation extends [Irani99] and applies to features, edges, and textureless regions. Our formulation also subsumes Structure from Motion, Multi-view Stereo, and Photometric Stereo as special cases.

Initialization: Track features, compute $R_x$, $O_x$, $R_y$, $O_y$, estimate feature normals, and initialize $L$.
Step 1: Fix $R_x$, $O_x$, $R_y$, $O_y$, and $L$, and update $S$ and $N$.
Step 2: Correct $S$ with large uncertainties by integrating $N$.
Step 3: Fix $R_x$, $O_x$, $R_y$, $O_y$, and $S$, and update $L$.

The algorithm is implemented in a coarse-to-fine manner, and two iterations are computed at each level of detail.
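The alternation between the shape and lighting updates can be illustrated on the photometric factorization Γ = LN alone. The sketch below is a deliberately simplified stand-in, not the poster's algorithm: it assumes the irradiance ratios Γ are already known, ignores the flow term, the shape S, and the normal-integration step, and replaces each update with an ordinary linear least-squares solve. The function names and the default of two iterations (echoing the "two iterations" per level) are illustrative assumptions.

```python
import numpy as np

def update_normals(L, Gamma):
    """Fix L (T x 4) and solve Gamma = L N for N (4 x P) in least squares."""
    N, *_ = np.linalg.lstsq(L, Gamma, rcond=None)
    return N

def update_light(N, Gamma):
    """Fix N (4 x P) and solve Gamma = L N for L (T x 4) in least squares."""
    L_transposed, *_ = np.linalg.lstsq(N.T, Gamma.T, rcond=None)
    return L_transposed.T

def alternate(Gamma, L_init, iterations=2):
    """Alternate the two linear updates, starting from an initial L."""
    L = L_init
    for _ in range(iterations):
        N = update_normals(L, Gamma)
        L = update_light(N, Gamma)
    return L, N
```

Note that a factorization Γ = LN is only determined up to an invertible 4 x 4 transform (L A, A⁻¹ N), so a sketch like this recovers the product, not the individual factors; pinning down L and N requires additional constraints such as the consistency with S.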

The coarse-to-fine reconstruction, using both motion and photometric cues

Reconstruction using only motion cue

A profile view of the reconstruction

The profile view with recovered albedo map

Reconstruction of a nearly textureless object

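Reconstruction using only motion cues

Reconstruction using both motion and photometric cues

Rank 3 constraint on $\{x_{t,p}, y_{t,p}\}$ [Tomasi&Kanade90]:

$$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,P} \\ \vdots & & \vdots \\ x_{T,1} & \cdots & x_{T,P} \end{bmatrix} = \begin{bmatrix} \vdots \\ r_{xt}^T \\ \vdots \end{bmatrix} \begin{bmatrix} \cdots & s_p & \cdots \end{bmatrix} + \begin{bmatrix} o_{x1} & \cdots & o_{x1} \\ \vdots & & \vdots \\ o_{xT} & \cdots & o_{xT} \end{bmatrix} = R_x S + O_x$$

$Y = R_y S + O_y$, similarly.

Rank 4 constraint on $\{\gamma_{t,p}\}$ [Basri&Jacob01]:

$$\Gamma = \begin{bmatrix} \gamma_{1,1} & \cdots & \gamma_{1,P} \\ \vdots & & \vdots \\ \gamma_{T,1} & \cdots & \gamma_{T,P} \end{bmatrix} = \begin{bmatrix} \vdots & \vdots \\ l_t^T & b_t \\ \vdots & \vdots \end{bmatrix} \begin{bmatrix} \cdots & n_p / \beta_p & \cdots \\ \cdots & 1 / \beta_p & \cdots \end{bmatrix} = LN$$

where $\beta_p = l_0^T n_p + b_0$.

Consistency constraint on $S$ and $N$:

$$\frac{\partial S}{\partial x} \perp N, \quad \frac{\partial S}{\partial y} \perp N.$$

Multi-point multi-frame brightness-varying optical flow:

$$\min \Phi(X, Y, \Gamma) \quad \text{s.t.} \quad X = R_x S + O_x, \; Y = R_y S + O_y, \; \Gamma = LN, \; \frac{\partial S}{\partial x} \perp N, \; \frac{\partial S}{\partial y} \perp N.$$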
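The rank and consistency constraints can be verified on synthetic data. The sketch below builds random rigid motion, shape, normals, and lighting (all values, sizes, and variable names are made up for illustration), forms X, Y, and Γ from the factorizations above, and checks their numerical rank. The consistency check assumes the surface is locally a height field z(x, y), which is one convenient parameterization and not something the poster states.

```python
import numpy as np

rng = np.random.default_rng(0)
T, P = 12, 40  # hypothetical numbers of frames and points

def random_rotation(rng):
    """Draw a 3x3 rotation matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs
    if np.linalg.det(q) < 0:         # flip one column to get det = +1
        q[:, 0] = -q[:, 0]
    return q

# Rank 3: X = R_x S + O_x and Y = R_y S + O_y under orthographic projection.
S = rng.normal(size=(3, P))
rotations = np.stack([random_rotation(rng) for _ in range(T)])
R_x, R_y = rotations[:, 0, :], rotations[:, 1, :]
offsets = rng.normal(size=(T, 2))
X = R_x @ S + offsets[:, :1]
Y = R_y @ S + offsets[:, 1:]
W = np.vstack([X, Y])
# Subtracting each row's mean cancels the per-frame offsets O_x and O_y.
W_centered = W - W.mean(axis=1, keepdims=True)
rank_motion = np.linalg.matrix_rank(W_centered)

# Rank 4: Gamma = L N with columns [n_p / beta_p; 1 / beta_p].
normals = rng.normal(size=(3, P))
normals /= np.linalg.norm(normals, axis=0)
l0, b0 = np.array([0.0, 0.0, 1.0]), 2.0   # ambient chosen so beta_p > 0
beta = l0 @ normals + b0
N = np.vstack([normals / beta, 1.0 / beta])
L = np.hstack([rng.normal(size=(T, 3)), rng.uniform(0.5, 1.5, size=(T, 1))])
Gamma = L @ N
rank_photometric = np.linalg.matrix_rank(Gamma)

# Consistency: for a height field z(x, y), the tangents dS/dx = (1, 0, z_x)
# and dS/dy = (0, 1, z_y) are orthogonal to the normal (-z_x, -z_y, 1).
z_x, z_y = 0.3, -0.7
tangent_x = np.array([1.0, 0.0, z_x])
tangent_y = np.array([0.0, 1.0, z_y])
normal = np.array([-z_x, -z_y, 1.0])

print(rank_motion, rank_photometric, tangent_x @ normal, tangent_y @ normal)
```

On noise-free data like this, the centered tracking matrix should have rank 3 and Γ rank 4; with real measurements the constraints hold only approximately, which is why they are imposed inside the minimization of Φ.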