Class 4: Multi-linear Systems and Invariant Theory in the Context of Computer Vision and Graphics

Page 1: Multi-linear Systems and Invariant Theory in the Context of Computer Vision and Graphics

Class 4: Multi-View 3D-from-2D

CS329, Stanford University

Amnon Shashua

Page 2: Material We Will Cover Today

• Epipolar Geometry and Fundamental Matrix

• The plane+parallax model and relative affine structure

• Why 3 views?

• Trifocal Tensor

Page 3: Reminder (from class 1)

$$p \cong [I;\, 0]\,P, \qquad p' \cong [H;\, e']\,P, \qquad P = (x, y, 1, \rho)^\top$$

$$p' \cong Hp + \rho e'$$

$H$ stands for the family of 2D projective transformations between two fixed images induced by a plane in space:

$$H_\pi = H + e' n^\top, \qquad p' \cong [H + e' n^\top;\, e']\,P$$

Figure labels: $P$, $p$, $p'$.

Page 4: Plane + Parallax

$$p \cong [I,\, 0]\,P, \qquad P = (x, y, 1, \rho)^\top$$

$$p' \cong [H\;\, e']\,P \quad\Longrightarrow\quad p' \cong Hp + \rho e'$$

Figure labels: $p$, $p'$, $Hp$, $e'$.

• what does $\rho$ stand for?

• what would we obtain after eliminating $\rho$?

Page 5: Reminder (from class 1)

$$p \cong K[I;\, 0]\,P, \qquad p' \cong K'[R;\, t]\,P, \qquad P = (X, Y, Z, 1)^\top$$

$$p' \cong K' R K^{-1} p + \frac{1}{Z} K' t$$

$$H_\infty = K' R K^{-1}, \qquad e' \cong K' t$$

$$H_\pi = K'\left(R + \frac{1}{d_\pi}\, t\, n^\top\right) K^{-1}$$

Here $n^\top (X, Y, Z)^\top = d_\pi$ is the reference plane $\pi$ expressed in the first camera frame.

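Page 6:

$$p' \cong K' R K^{-1} p + \frac{1}{Z} K' t, \qquad H = K'\left(R + \frac{1}{d_\pi}\, t\, n^\top\right) K^{-1}$$

Substituting $K' R K^{-1} = H - \frac{1}{d_\pi} e' n^\top K^{-1}$:

$$p' \cong Hp - \frac{1}{d_\pi}\,(n^\top K^{-1} p)\, e' + \frac{1}{Z} e'$$

$$p' \cong Hp + \frac{d_\pi - Z\, n^\top K^{-1} p}{Z\, d_\pi}\, e'$$

Recall:

$$p = \frac{1}{Z} K \begin{pmatrix} X \\ Y \\ Z \end{pmatrix}$$

Let: $d = d_\pi - n^\top K^{-1} Z p$, the distance of $P$ from the reference plane (for a unit normal $n$). Then

$$p' \cong Hp + \frac{d}{Z\, d_\pi}\, e'$$

$$p' \cong Hp + \rho e', \qquad \rho = \frac{d}{Z\, d_\pi}$$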
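The decomposition above can be checked numerically. The following is a minimal sketch, assuming made-up calibration matrices, rotation, translation and reference plane (none of these values come from the slides); the name `d_pi` for the plane's distance parameter is also an assumption of this sketch.

```python
import numpy as np

# All numeric values below are made up for the check.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K_prime = np.array([[750.0, 0.0, 300.0], [0.0, 750.0, 250.0], [0.0, 0.0, 1.0]])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.5, -0.2, 0.1])

# Reference plane n^T X = d_pi in the first camera frame (n is a unit normal).
n = np.array([0.1, -0.2, 1.0])
n = n / np.linalg.norm(n)
d_pi = 5.0

# Plane homography and epipole as defined on the slides.
H = K_prime @ (R + np.outer(t, n) / d_pi) @ np.linalg.inv(K)
e_prime = K_prime @ t

X = np.array([0.3, -0.4, 4.0])          # a 3D point off the plane
Z = X[2]
p = K @ X / Z                           # p = (x, y, 1)
p_prime = K_prime @ (R @ X + t)         # homogeneous image of X in view 2

d = d_pi - n @ X                        # signed distance of X from the plane
rho = d / (Z * d_pi)
parallax_form = H @ p + rho * e_prime   # expected to equal p_prime / Z
```

In this sketch `Z * parallax_form` reproduces `p_prime`, so the two sides agree up to the scale $1/Z$, which is what $\cong$ allows.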

Page 7:

Note that $H$, $e'$ are determined (each) up to a scale.

Let $p_0, p_0'$ be any "reference" point not arising from $\pi$:

$$p_0' \cong H p_0 + \rho_0 e'$$

$$p_0' \cong \frac{1}{\rho_0} H p_0 + e'$$

Let $\frac{1}{\rho_0} H$ be the homography we will use.

Page 8:

$$p' \cong \frac{1}{\rho_0} H p + \frac{\rho}{\rho_0} e'$$

Recall: $\rho = \dfrac{d}{Z\, d_\pi}$, where $d$ is the distance of $P$ from the reference plane and $d_\pi$ is the plane's distance parameter. Therefore

$$\frac{\rho}{\rho_0} = \frac{Z_0}{Z}\,\frac{d}{d_0}$$

Figure labels: $P$, $P_0$, $Z$, $Z_0$, $d$, $d_0$.

Page 9: Plane + Parallax

$$p' \cong Hp + \rho e', \qquad \rho = \frac{Z_0}{Z}\,\frac{d}{d_0}$$

Figure labels: $P$, $P_0$, $Z$, $Z_0$, $d$, $d_0$.

We have used 4 space points for a basis: 3 for the reference plane, 1 for the reference point (scaling).

Since 4 points determine an affine basis, $\rho$ is called "relative affine structure".

Note: we need 5 points for a projective basis. The 5th point is the first camera center.

Page 10: Note: A projective invariant

For the reference plane $\pi$:

$$p' \cong Hp + \rho e', \qquad \rho = \frac{Z_0}{Z}\,\frac{d}{d_0}$$

For a second reference plane $\hat\pi$:

$$p' \cong \hat{H} p + \hat\rho\, e', \qquad \hat\rho = \frac{Z_0}{Z}\,\frac{\hat d}{\hat d_0}$$

$$\frac{\rho}{\hat\rho} = \frac{d}{\hat d}\,\frac{\hat d_0}{d_0}$$

Figure labels: $P$, $P_0$, $Z$, $Z_0$, $d$, $d_0$, $\hat d$, $\hat d_0$.

This invariant ("projective depth") is independent of both camera positions, and is therefore projective.

5 basis points: 4 non-coplanar points define two planes, and a 5th point is used for scaling.

Page 11: Note: An Affine Invariant

Figure labels: $P$, $P_0$, $Z$, $Z_0$, $d$, $d_0$.

What happens when the camera center is at infinity? (parallel projection)

$$Z, Z_0 \to \infty: \qquad \rho = \frac{Z_0}{Z}\,\frac{d}{d_0} \;\to\; \frac{d}{d_0}$$

This invariant is independent of both camera positions, and is affine.

Page 12: Fundamental Matrix

Figure labels: $p$, $p'$, $Hp$, $e'$; $P = (x, y, 1, \rho)^\top$.

$$p' \cong Hp + \rho e'$$

$$\operatorname{rank}\,[\,p',\; Hp,\; e'\,] = 2$$

$${p'}^\top (e' \times Hp) = 0$$

$${p'}^\top ([e']_\times H)\, p = 0 \quad\Longrightarrow\quad {p'}^\top F p = 0$$

Page 13: Fundamental Matrix

$${p'}^\top ([e']_\times H)\, p = 0 \quad\Longrightarrow\quad {p'}^\top F p = 0, \qquad F = [e']_\times H$$

This defines a bilinear matching constraint whose coefficients depend only on the camera geometry (shape was eliminated).

• $F$ does not depend on the choice of the reference plane:

$$[e']_\times H_\pi = [e']_\times (H + e' n^\top) = [e']_\times H$$
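As an illustration of the two statements above, the sketch below builds $F = [e']_\times H$ from a random stand-in homography and epipole (assumed values, not data from the lecture), checks the bilinear constraint on a synthetic match, and checks that switching to another plane homography $H + e' n^\top$ leaves $F$ unchanged. The helper `skew` is a name introduced here.

```python
import numpy as np

def skew(v):
    """Return the 3x3 matrix [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

rng = np.random.default_rng(1)
H = rng.normal(size=(3, 3))        # stand-in plane homography (assumed)
e_prime = rng.normal(size=3)       # stand-in epipole (assumed)
F = skew(e_prime) @ H

# A synthetic match generated by p' = H p + rho e'.
p = np.array([0.2, -0.5, 1.0])
rho = 0.7
p_prime = H @ p + rho * e_prime
residual = p_prime @ F @ p         # bilinear constraint p'^T F p

# Homography of a different reference plane: H + e' n^T.
n = rng.normal(size=3)
F_other_plane = skew(e_prime) @ (H + np.outer(e_prime, n))
```

The second check works because $[e']_\times e' = 0$, so the term $e' n^\top$ is annihilated.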

Page 14: Epipoles from F

Note: any homography matrix maps between epipoles:

$$He \cong e'$$

Figure labels: camera centers $c$, $c'$; epipoles $e$, $e'$.

Page 15: Epipoles from F

$$Fe = 0: \qquad [e']_\times H e \cong [e']_\times e' = 0$$

$$F^\top e' = 0: \qquad -H^\top [e']_\times e' = 0$$
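In practice the two null vectors can be read off a singular value decomposition. The sketch below assumes a synthetic $F = [e']_\times H$ built from random stand-in values; `null_vector` and `skew` are helper names introduced here.

```python
import numpy as np

def skew(v):
    """Return [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def null_vector(M):
    """Unit vector spanning the (assumed one-dimensional) right null space of M."""
    _, _, Vt = np.linalg.svd(M)
    return Vt[-1]

rng = np.random.default_rng(2)
H = rng.normal(size=(3, 3))               # stand-in homography (assumed)
e_prime_true = rng.normal(size=3)         # stand-in epipole (assumed)
F = skew(e_prime_true) @ H

e = null_vector(F)                        # F e = 0      (epipole in image 1)
e_prime = null_vector(F.T)                # F^T e' = 0   (epipole in image 2)

# Parallelism checks via cross products (zero when vectors agree up to scale).
recovered_vs_true = np.cross(e_prime, e_prime_true)
mapped_vs_recovered = np.cross(H @ e, e_prime)   # H e should be parallel to e'
```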

Page 16: Estimating F from matching points

Linear solution:

$${p_i'}^\top F p_i = 0, \qquad i = 1, \ldots, 8$$

Non-linear solution:

$${p_i'}^\top F p_i = 0, \qquad i = 1, \ldots, 7, \qquad \det(F) = 0$$

$\det(F) = 0$ is cubic in the elements of $F$, thus we should expect 3 solutions.
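The linear solution amounts to finding the null vector of an $N \times 9$ system, one row per match. The following is a minimal sketch on noise-free synthetic matches; the function name `estimate_F_linear` and all test data are assumptions made here. With noisy pixel coordinates one would usually also normalize the coordinates and enforce rank 2 afterwards, which this sketch omits.

```python
import numpy as np

def skew(v):
    """Return [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def estimate_F_linear(pts1, pts2):
    """Linear estimate of F from N >= 8 matches.

    pts1, pts2: (N, 3) arrays of homogeneous points with pts2[i]^T F pts1[i] = 0.
    Each match contributes the row kron(p', p), matching a row-major vec(F).
    """
    A = np.stack([np.kron(q, p) for p, q in zip(pts1, pts2)])
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)

# Synthetic, noise-free matches from p' = A p + rho e' (all values assumed).
rng = np.random.default_rng(3)
A_mat = rng.normal(size=(3, 3))
e_prime = rng.normal(size=3)
pts1 = np.column_stack([rng.uniform(-1, 1, size=(8, 2)), np.ones(8)])
rho = rng.uniform(-1, 1, size=8)
pts2 = pts1 @ A_mat.T + rho[:, None] * e_prime

F_est = estimate_F_linear(pts1, pts2)
F_true = skew(e_prime) @ A_mat

residuals = np.array([q @ F_est @ p for p, q in zip(pts1, pts2)])
# |cosine| between vec(F_est) and vec(F_true); 1 means equal up to scale.
cosine = abs(np.sum(F_est * F_true)) / (np.linalg.norm(F_est) * np.linalg.norm(F_true))
```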

Page 17: Estimating F from Homographies

$H^\top F$ is skew-symmetric (i.e. provides 6 constraints on $F$). For any plane homography $H_\pi = H + e' n^\top$:

$$H_\pi^\top F = (H + e' n^\top)^\top [e']_\times H = H^\top [e']_\times H$$

$$F^\top H_\pi = -H^\top [e']_\times (H + e' n^\top) = -H^\top [e']_\times H$$

$$H^\top F + F^\top H = 0$$

2 homography matrices are required for a solution for $F$.
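The constraint $H^\top F + F^\top H = 0$ is linear in the entries of $F$, so two homographies can be stacked into one homogeneous system. The sketch below is one possible way to set this up, on synthetic stand-in data; `skew_constraints` is a helper name introduced here.

```python
import numpy as np

def skew(v):
    """Return [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def skew_constraints(H):
    """Six rows expressing (H^T F + F^T H)[a, b] = 0, a <= b, in row-major vec(F)."""
    rows = []
    for a in range(3):
        for b in range(a, 3):
            coeff = np.zeros((3, 3))
            coeff[:, b] += H[:, a]   # term sum_c H[c, a] F[c, b]
            coeff[:, a] += H[:, b]   # term sum_c H[c, b] F[c, a]
            rows.append(coeff.ravel())
    return np.array(rows)

rng = np.random.default_rng(4)
H1 = rng.normal(size=(3, 3))                 # homography of one plane (assumed)
e_prime = rng.normal(size=3)
n = rng.normal(size=3)
H2 = H1 + np.outer(e_prime, n)               # homography of a second plane
F_true = skew(e_prime) @ H1

S = H2.T @ F_true                            # expected to be skew-symmetric

system = np.vstack([skew_constraints(H1), skew_constraints(H2)])  # 12 x 9
_, _, Vt = np.linalg.svd(system)
F_est = Vt[-1].reshape(3, 3)
cosine = abs(np.sum(F_est * F_true)) / (np.linalg.norm(F_est) * np.linalg.norm(F_true))
```

One homography gives only 6 equations for the 9 entries of $F$; the second one brings the null space down to a single direction in this generic example.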

Page 18: F Induces a Homography

Figure labels: $p$, $Fp$.

$[\delta]_\times F$ is a homography matrix induced by the plane defined by the join of the image line $\delta$ and the camera center.

Page 19: Projective Reconstruction

1. Solve for $F$ via the system ${p_i'}^\top F p_i = 0$ (8 points or 7 points)

2. Solve for $e'$ via the system $F^\top e' = 0$

3. Select an arbitrary vector $\delta$ with $\delta^\top e' \neq 0$

4. $[I\;\, 0]$ and $[[\delta]_\times F\;\, e']$ are a pair of camera matrices.

$$p' \cong [\delta]_\times F p + \rho e'$$
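Steps 2 to 4 can be sketched as follows, assuming $F$ has already been estimated (here a synthetic rank-2 stand-in is used). The check at the end relies on the identity $[e']_\times [\delta]_\times F = -(\delta^\top e')\,F$, which holds because ${e'}^\top F = 0$; the symbol `delta` and the particular offset used to build it are choices made for this sketch.

```python
import numpy as np

def skew(v):
    """Return [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

rng = np.random.default_rng(5)
# Step 1 is assumed done; a synthetic rank-2 matrix stands in for F.
F = skew(rng.normal(size=3)) @ rng.normal(size=(3, 3))

# Step 2: e' spans the null space of F^T (unit vector from the SVD).
_, _, Vt = np.linalg.svd(F.T)
e_prime = Vt[-1]

# Step 3: any delta with delta^T e' != 0; this choice keeps the product >= 0.75.
delta = e_prime + np.array([0.1, 0.2, -0.1])

# Step 4: the camera pair [I 0] and [[delta]_x F  e'].
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([skew(delta) @ F, e_prime[:, None]])

# The fundamental matrix of this pair is [e']_x [delta]_x F.
F_from_cameras = skew(e_prime) @ skew(delta) @ F
scale = -(delta @ e_prime)
```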

Page 20: Trifocal Geometry

The three fundamental matrices completely describe the trifocal geometry (as long as the three camera centers are not collinear).

Figure labels: views 1, 2, 3; epipoles $e_{12}$, $e_{32}$, $e_{13}$, $e_{23}$, $e_{21}$, $e_{31}$.

$$F_{12}\, e_{31} \cong e_{12} \times e_{32}$$

$$e_{32}^\top F_{12}\, e_{31} = 0$$

Likewise:

$$e_{13}^\top F_{23}\, e_{12} = 0$$

$$e_{23}^\top F_{13}\, e_{21} = 0$$

Each constraint is non-linear in the entries of the fundamental matrices (because the epipoles are the respective null spaces).

Page 21: Trifocal Geometry

$$e_{32}^\top F_{12}\, e_{31} = 0, \qquad e_{13}^\top F_{23}\, e_{12} = 0, \qquad e_{23}^\top F_{13}\, e_{21} = 0$$

3 fundamental matrices provide 21 parameters. Subtracting the 3 constraints, the trifocal geometry is determined by 18 parameters.

This is consistent with the straightforward counting:

$$3 \times 11 - 15 = 18$$

(3 camera matrices provide 33 parameters, minus the projective basis)

Page 22: What Goes Wrong with 3 views?

Figure labels: views 1, 2, 3 with collinear camera centers; epipoles $e_{12}$, $e_{32}$, $e_{13}$, $e_{23}$, $e_{21}$, $e_{31}$.

$$e_{31} = e_{21}, \qquad e_{12} = e_{32}, \qquad e_{13} = e_{23}$$

2 constraints each, thus we have 21 - 6 = 15 parameters.

Page 23: What Goes Wrong with 3 views?

Figure labels: views 1, 2, 3; epipoles $e_{12}$, $e_{32}$, $e_{13}$, $e_{23}$, $e_{21}$, $e_{31}$; translations $t_1$, $t_2$, $t_3$, with $t_3$ expressed in terms of $t_1$ and $t_2$.

Thus, to represent $t_3$ we need only 1 parameter (instead of 3).

18 - 2 = 16 parameters are needed to represent the trifocal geometry in this case, but the pairwise fundamental matrices can account for only 15!

Page 24: What Else Goes Wrong: Reprojection

Figure labels: views 1, 2, 3; points $p$, $p'$, $p''$; epipolar lines $F_{13}\, p$ and $F_{23}\, p'$.

$$p'' \cong F_{13}\, p \times F_{23}\, p'$$

Given $p$, $p'$ and the pairwise F-mats, one can directly determine the position of the matching point $p''$.

This fails when the 3 camera centers are collinear, because all three lines of sight are coplanar, and thus there is only one epipolar line!
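The reprojection rule can be tried on synthetic cameras. In the sketch below the three camera matrices are random stand-ins (so the centers are not collinear in general), and `fundamental_from_cameras` is a helper introduced here that uses the standard construction $F = [e]_\times P_b P_a^{+}$ with $e = P_b C_a$.

```python
import numpy as np

def skew(v):
    """Return [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fundamental_from_cameras(Pa, Pb):
    """F with x_b^T F x_a = 0 for 3x4 cameras Pa (source view) and Pb (target view)."""
    _, _, Vt = np.linalg.svd(Pa)
    center_a = Vt[-1]                      # Pa @ center_a = 0
    epipole_b = Pb @ center_a              # image of camera a's center in view b
    return skew(epipole_b) @ Pb @ np.linalg.pinv(Pa)

rng = np.random.default_rng(6)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = rng.normal(size=(3, 4))               # stand-in cameras (assumed values)
P3 = rng.normal(size=(3, 4))

F13 = fundamental_from_cameras(P1, P3)
F23 = fundamental_from_cameras(P2, P3)

X = np.array([0.4, -0.3, 1.0, 0.6])        # a space point (homogeneous)
p, p_prime, p_dprime = P1 @ X, P2 @ X, P3 @ X

# Intersection of the two epipolar lines in image 3.
p_dprime_pred = np.cross(F13 @ p, F23 @ p_prime)

# Zero when prediction and true point agree up to scale.
mismatch = np.linalg.norm(np.cross(p_dprime_pred / np.linalg.norm(p_dprime_pred),
                                   p_dprime / np.linalg.norm(p_dprime)))
```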

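Page 25: The Trifocal Constraints

$$p \cong [I\;\, 0]\,P, \qquad p' \cong [A\;\, e']\,P, \qquad p'' \cong [B\;\, e'']\,P$$

Two lines $s_1$, $s_2$ through $p' = (x', y', 1)^\top$:

$$s_1 = \begin{pmatrix} -1 \\ 0 \\ x' \end{pmatrix}, \qquad s_2 = \begin{pmatrix} 0 \\ -1 \\ y' \end{pmatrix}, \qquad s_1^\top p' = 0, \qquad s_2^\top p' = 0$$

Two lines $r_1$, $r_2$ through $p'' = (x'', y'', 1)^\top$:

$$r_1 = \begin{pmatrix} -1 \\ 0 \\ x'' \end{pmatrix}, \qquad r_2 = \begin{pmatrix} 0 \\ -1 \\ y'' \end{pmatrix}, \qquad r_1^\top p'' = 0, \qquad r_2^\top p'' = 0$$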

Page 26: The Trifocal Constraints

$$p \cong [I\;\, 0]\,P, \qquad p' \cong [A\;\, e']\,P, \qquad p'' \cong [B\;\, e'']\,P$$

$$s_1^\top p' = 0 \;\Longrightarrow\; s_1^\top [A\;\, e']\,P = 0, \qquad s_2^\top p' = 0 \;\Longrightarrow\; s_2^\top [A\;\, e']\,P = 0$$

$$r_1^\top p'' = 0 \;\Longrightarrow\; r_1^\top [B\;\, e'']\,P = 0, \qquad r_2^\top p'' = 0 \;\Longrightarrow\; r_2^\top [B\;\, e'']\,P = 0$$

$$(-1,\; 0,\; x,\; 0)\,P = 0, \qquad (0,\; -1,\; y,\; 0)\,P = 0$$

Page 27: The Trifocal Constraints

$$\begin{bmatrix} (-1,\; 0,\; x) & 0 \\ (0,\; -1,\; y) & 0 \\ s_1^\top A & s_1^\top e' \\ s_2^\top A & s_2^\top e' \\ r_1^\top B & r_1^\top e'' \\ r_2^\top B & r_2^\top e'' \end{bmatrix}_{6 \times 4} P = 0$$

Every 4x4 minor must vanish!

12 of those involve all 3 views; they are arranged in 3 groups, depending on which view is the reference view.

Page 28: The Trifocal Constraints

$$\begin{bmatrix} (-1,\; 0,\; x) & 0 \\ (0,\; -1,\; y) & 0 \\ s_1^\top A & s_1^\top e' \\ s_2^\top A & s_2^\top e' \\ r_1^\top B & r_1^\top e'' \\ r_2^\top B & r_2^\top e'' \end{bmatrix}$$

Rows 1 and 2: the reference view.

Rows 3 and 4: choose 1 row from here.

Rows 5 and 6: choose 1 row from here.

We should expect to have 4 matching constraints $f_i(p, p', p'') = 0$.

Page 29: The Trifocal Constraints

Expanding the determinants:

$$p' \cong Ap + \rho e' \;\Longrightarrow\; s_i^\top A p + \rho\, s_i^\top e' = 0, \qquad i = 1, 2$$

$$p'' \cong Bp + \rho e'' \;\Longrightarrow\; r_j^\top B p + \rho\, r_j^\top e'' = 0, \qquad j = 1, 2$$

Eliminate $\rho$:

$$\rho = -\frac{s_i^\top A p}{s_i^\top e'} = -\frac{r_j^\top B p}{r_j^\top e''}$$

$$(r_j^\top e'')(s_i^\top A p) = (s_i^\top e')(r_j^\top B p), \qquad i, j = 1, 2$$

Page 30: The Trifocal Constraints

What is going on geometrically:

$s^\top [A\;\, e']\,P = 0$ says that $P$ lies on the plane with coordinate vector $(s^\top A,\; s^\top e')$.

4 planes intersect at $P$!

Figure labels: $P$; camera centers $C$, $C'$, $C''$; point $p$ with the two lines $(-1, 0, x)$ and $(0, -1, y)$ through it; lines $s$ and $r$.

Page 31: The Trifocal Tensor

$$(r_j^\top e'')(s_i^\top A p) = (s_i^\top e')(r_j^\top B p)$$

New index notation: $i$ - image 1, $j$ - image 2, $k$ - image 3

$$s^\top A p + \rho\, s^\top e' = 0 \quad\Longrightarrow\quad s_j\, a^j_i\, p^i + \rho\, s_j\, {e'}^j = 0$$

$s_j$ is a line in image 2

$p^i$ is a point in image 1

${e'}^j$ is a point in image 2

Page 32: The Trifocal Tensor

$$s^l_j\, a^j_i\, p^i + \rho\, s^l_j\, {e'}^j = 0$$

$s^l_j$, $l = 1, 2$, are the two lines coincident with $p'$, i.e. $s^l_j\, {p'}^j = 0$

$r^m_k$, $m = 1, 2$, are the two lines coincident with $p''$, i.e. $r^m_k\, {p''}^k = 0$

$$r^m_k\, b^k_i\, p^i + \rho\, r^m_k\, {e''}^k = 0$$

Eliminate $\rho$:

$$(s^l_j\, {e'}^j)(r^m_k\, b^k_i\, p^i) - (r^m_k\, {e''}^k)(s^l_j\, a^j_i\, p^i) = 0$$

Page 33: The Trifocal Tensor

$$(s^l_j\, {e'}^j)(r^m_k\, b^k_i\, p^i) - (r^m_k\, {e''}^k)(s^l_j\, a^j_i\, p^i) = 0$$

Rearrange terms:

$$p^i\, s^l_j\, r^m_k\, ({e'}^j b^k_i - {e''}^k a^j_i) = 0, \qquad l, m = 1, 2$$

The trifocal tensor is:

$$T_i^{jk} = {e'}^j b^k_i - {e''}^k a^j_i$$

Page 34: The Trifocal Tensor

$$p^i\, s^l_j\, r^m_k\, T_i^{jk} = 0$$

$$s^l_j = \begin{bmatrix} -1 & 0 & x' \\ 0 & -1 & y' \end{bmatrix}, \qquad r^m_k = \begin{bmatrix} -1 & 0 & x'' \\ 0 & -1 & y'' \end{bmatrix}$$

The four "trilinearities":

$$x''\, T_i^{13} p^i - x'' x'\, T_i^{33} p^i + x'\, T_i^{31} p^i - T_i^{11} p^i = 0$$

$$y''\, T_i^{13} p^i - y'' x'\, T_i^{33} p^i + x'\, T_i^{32} p^i - T_i^{12} p^i = 0$$

$$x''\, T_i^{23} p^i - x'' y'\, T_i^{33} p^i + y'\, T_i^{31} p^i - T_i^{21} p^i = 0$$

$$y''\, T_i^{23} p^i - y'' y'\, T_i^{33} p^i + y'\, T_i^{32} p^i - T_i^{22} p^i = 0$$
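The four trilinearities can be evaluated in one contraction. The sketch below assumes synthetic camera blocks $A$, $B$ and epipoles $e'$, $e''$ (random perturbations of simple values, not data from the lecture); the array layout `T[i, j, k]` and the function name `trifocal_tensor` are conventions chosen here.

```python
import numpy as np

def trifocal_tensor(A, B, e1, e2):
    """T[i, j, k] = e'^j b^k_i - e''^k a^j_i, with a^j_i = A[j, i], b^k_i = B[k, i]."""
    return np.einsum('j,ki->ijk', e1, B) - np.einsum('k,ji->ijk', e2, A)

rng = np.random.default_rng(7)
A = np.eye(3) + 0.1 * rng.normal(size=(3, 3))   # camera 2 is [A e'] (assumed values)
B = np.eye(3) + 0.1 * rng.normal(size=(3, 3))   # camera 3 is [B e'']
e1 = 0.1 * rng.normal(size=3)                   # e'
e2 = 0.1 * rng.normal(size=3)                   # e''
T = trifocal_tensor(A, B, e1, e2)

# A matching triplet generated from P = (x, y, 1, rho).
p = np.array([0.3, -0.2, 1.0])
rho = 0.8
p1 = A @ p + rho * e1
p1 = p1 / p1[2]                                 # p'  = (x', y', 1)
p2 = B @ p + rho * e2
p2 = p2 / p2[2]                                 # p'' = (x'', y'', 1)

# Rows are the two lines through p' and through p'' used on the slide.
S = np.array([[-1.0, 0.0, p1[0]], [0.0, -1.0, p1[1]]])
Rl = np.array([[-1.0, 0.0, p2[0]], [0.0, -1.0, p2[1]]])

# trilinearities[l, m] = p^i s^l_j r^m_k T_i^{jk}; all four should vanish.
trilinearities = np.einsum('i,lj,mk,ijk->lm', p, S, Rl, T)
```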

Page 35: The Trifocal Tensor

Any line $s$ through $p'$ and any line $r$ through $p''$ is a linear combination of the two basis lines:

$$s_j = \lambda_1 s^1_j + \lambda_2 s^2_j, \qquad r_k = \mu_1 r^1_k + \mu_2 r^2_k$$

Figure labels: $p'$, $s^1$, $s^2$, $s$.

$$p^i\, s_j\, r_k\, T_i^{jk} = p^i\, (\lambda_1 s^1_j + \lambda_2 s^2_j)(\mu_1 r^1_k + \mu_2 r^2_k)\, T_i^{jk} = 0$$

A trilinearity is a contraction with a point-line-line where the lines are coincident with the respective matching points.

Page 36: Slices of the Trifocal Tensor

Now that we have an explicit form of the tensor, what can we do with it?

$$p^i\, s_j\, T_i^{jk} = \;?$$

The result must be a contravariant vector (a point). This point is coincident with $r$ for all lines $r$ coincident with $p''$:

$$p^i\, s_j\, T_i^{jk} \cong {p''}^k, \qquad s \neq e' \times p'$$

This is the point reprojection equation (it works when the camera centers are collinear as well).

Note: reprojection is possible after observing 7 matching points (because one needs 7 matching triplets to solve for the tensor). This is in contrast to reprojection using pairwise fundamental matrices, which requires 8 matching points (in order to solve for the F-mats).
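A small numerical illustration of the point reprojection equation follows, under the same kind of synthetic setup (assumed $A$, $B$, $e'$, $e''$; the `T[i, j, k]` layout is a convention of this sketch). It also shows what happens when $s$ is chosen to be the epipolar line $e' \times p'$: the contraction collapses to the zero vector.

```python
import numpy as np

rng = np.random.default_rng(8)
A = np.eye(3) + 0.1 * rng.normal(size=(3, 3))   # camera 2 is [A e'] (assumed values)
B = np.eye(3) + 0.1 * rng.normal(size=(3, 3))   # camera 3 is [B e'']
e1 = 0.1 * rng.normal(size=3)                   # e'
e2 = 0.1 * rng.normal(size=3)                   # e''

# T[i, j, k] = e'^j b^k_i - e''^k a^j_i
T = np.einsum('j,ki->ijk', e1, B) - np.einsum('k,ji->ijk', e2, A)

p = np.array([0.3, -0.2, 1.0])
rho = 0.8
p1 = A @ p + rho * e1                           # p'  (homogeneous)
p2 = B @ p + rho * e2                           # p'' (homogeneous), ground truth

# Any line through p' other than the epipolar line: join p' with another point.
s = np.cross(p1, np.array([1.0, 2.0, 1.0]))
p2_pred = np.einsum('i,j,ijk->k', p, s, T)      # p^i s_j T_i^{jk}

mismatch = np.linalg.norm(np.cross(p2_pred / np.linalg.norm(p2_pred),
                                   p2 / np.linalg.norm(p2)))

# Degenerate choice: the epipolar line through p'.
s_epipolar = np.cross(e1, p1)
degenerate = np.einsum('i,j,ijk->k', p, s_epipolar, T)
```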

Page 37: Slices of the Trifocal Tensor

Figure labels: views 1, 2, 3; point $p$ in image 1, line $s$ through $p'$ in image 2, point $p''$ in image 3.

$$p^i\, s_j\, T_i^{jk} \cong {p''}^k$$

Page 38: Slices of the Trifocal Tensor

$$s_j\, r_k\, T_i^{jk} = \;?$$

The result must be a line.

$$s_j\, r_k\, T_i^{jk} \cong q_i$$

This is the line reprojection equation.

Figure labels: camera centers $O$, $O'$, $O''$; lines $q_i$, $s_j$, $r_k$.

13 matching lines are necessary for solving for the tensor (compared to 7 matching points).
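The line reprojection equation can be checked by imaging a 3D line, given by two space points, in all three views. This is a sketch on assumed synthetic cameras $[I\; 0]$, $[A\; e']$, $[B\; e'']$; the `T[i, j, k]` layout is a convention of this sketch.

```python
import numpy as np

rng = np.random.default_rng(9)
A = np.eye(3) + 0.1 * rng.normal(size=(3, 3))   # camera 2 is [A e'] (assumed values)
B = np.eye(3) + 0.1 * rng.normal(size=(3, 3))   # camera 3 is [B e'']
e1 = 0.1 * rng.normal(size=3)                   # e'
e2 = 0.1 * rng.normal(size=3)                   # e''

# T[i, j, k] = e'^j b^k_i - e''^k a^j_i
T = np.einsum('j,ki->ijk', e1, B) - np.einsum('k,ji->ijk', e2, A)

def project(P):
    """Images of P = (x, y, 1, rho) in views 1, 2, 3."""
    x, rho = P[:3], P[3]
    return x, A @ x + rho * e1, B @ x + rho * e2

# Two space points spanning a 3D line (assumed values).
Pa = np.array([0.3, -0.2, 1.0, 0.8])
Pb = np.array([-0.4, 0.5, 1.0, 0.3])
a1, a2, a3 = project(Pa)
b1, b2, b3 = project(Pb)

q = np.cross(a1, b1)        # image of the line in view 1 (ground truth)
s = np.cross(a2, b2)        # image of the line in view 2
r = np.cross(a3, b3)        # image of the line in view 3

q_pred = np.einsum('j,k,ijk->i', s, r, T)       # s_j r_k T_i^{jk}

mismatch = np.linalg.norm(np.cross(q_pred / np.linalg.norm(q_pred),
                                   q / np.linalg.norm(q)))
```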

Page 39: Slices of the Trifocal Tensor

$$\delta_k\, T_i^{jk} = \;?$$

The result must be a matrix.

$$H^j_i = \delta_k\, T_i^{jk}$$

$H$ is a homography matrix.

$\delta_k\, T_i^{jk}$ is a family of homography matrices (from 1 to 2) induced by the family of planes coincident with the 3rd camera center.

$p^i\, \delta_k\, T_i^{jk}$ is the reprojection equation.

Figure labels: views 1, 2, 3; $p$, $Hp$.

Page 40: Slices of the Trifocal Tensor

$s_j\, T_i^{jk}$ is the homography matrix from 1 to 3 induced by the plane defined by the image line $s$ and the second camera center.

$$\delta^i\, T_i^{jk} = \;?$$

$\delta^i\, s_j\, T_i^{jk}$ is the reprojection equation.

The result is a point on the epipolar line of $\delta$ on image 3.

Figure labels: views 1, 2, 3; line $s$; epipolar line $F_{13}\, \delta$.

Page 41: Slices of the Trifocal Tensor

$$\delta^i\, T_i^{jk} = G^{jk}$$

Figure labels: views 1, 2, 3; line $s$; epipolar line $F_{13}\, \delta$.

$Gs$ is a point on the epipolar line $F_{13}\, \delta$.

$$\operatorname{rank}(G) = 2$$

(because it maps the dual plane onto collinear points)

$$\operatorname{null}(G) = F_{13}\, \delta, \qquad \operatorname{null}(G^\top) = F_{12}\, \delta$$

Page 42: 18 Parameters for the Trifocal Tensor

$$T_i^{jk} = {e'}^j b^k_i - {e''}^k a^j_i$$

Changing the reference plane leaves the tensor unchanged:

$${e'}^j (b^k_i + n_i\, {e''}^k) - {e''}^k (a^j_i + n_i\, {e'}^j) = T_i^{jk} + n_i\, {e'}^j {e''}^k - n_i\, {e''}^k {e'}^j = T_i^{jk}$$

$T_i^{jk}$ has 24 parameters (9+9+3+3),
minus 1 for global scale,
minus 2 for scaling $e'$, $e''$ to be unit vectors,
minus 3 for setting $n$ such that $B$ has a vanishing column,
= 18 independent parameters.

We should expect to find 9 non-linear constraints among the 27 entries of the tensor (admissibility constraints).
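The invariance used in this count can be verified directly: replacing $A$, $B$ by $A + e' n^\top$, $B + e'' n^\top$ gives the same tensor. The sketch below uses random stand-in values; `trifocal_tensor` and the `T[i, j, k]` layout are conventions chosen here.

```python
import numpy as np

def trifocal_tensor(A, B, e1, e2):
    """T[i, j, k] = e'^j b^k_i - e''^k a^j_i, with a^j_i = A[j, i], b^k_i = B[k, i]."""
    return np.einsum('j,ki->ijk', e1, B) - np.einsum('k,ji->ijk', e2, A)

rng = np.random.default_rng(10)
A = rng.normal(size=(3, 3))      # stand-in values (assumed)
B = rng.normal(size=(3, 3))
e1 = rng.normal(size=3)          # e'
e2 = rng.normal(size=3)          # e''
n = rng.normal(size=3)           # arbitrary change of reference plane

T = trifocal_tensor(A, B, e1, e2)
T_other_plane = trifocal_tensor(A + np.outer(e1, n), B + np.outer(e2, n), e1, e2)

# The 3 free parameters in n can be used to zero one column of B,
# e.g. the first: choose n[0] so that B[:, 0] + e2 * n[0] is as small as possible.
parameter_count = 9 + 9 + 3 + 3 - 1 - 2 - 3
```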

Page 43: 18 Parameters for the Trifocal Tensor

What happens when the 3 camera centers are collinear?

(we saw that pairwise F-mats account for 15 parameters).

$$A^{-1} e' \cong B^{-1} e''$$

Figure labels: views 1, 2, 3; $e'$, $e''$, $A^{-1} e'$, $B^{-1} e''$.

This provides two additional (non-linear) constraints, thus 18 - 2 = 16.

Page 44: Items not Covered in Class

• Degenerate configurations (Linear Line Complex, Quartic Curve)

• The source of the 9 admissibility constraints (come from the homography slices).

• Concatenation of trifocal tensors along a sequence

• Quadrifocal tensor (and its relation to the homography tensor)