![Page 1: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/1.jpg)
Geometry 3: Stereo Reconstruction
Introduction to Computer Vision
Ronen Basri
Weizmann Institute of Science
![Page 2: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/2.jpg)
Material covered
• Pinhole camera model, perspective projection
• Two view geometry, general case:
  • Epipolar geometry, the essential matrix
  • Camera calibration, the fundamental matrix
• Two view geometry, degenerate cases:
  • Homography (planes, camera rotation)
  • A taste of projective geometry
• Stereo vision: 3D reconstruction from two views
• Multi-view geometry, reconstruction through factorization
![Page 3: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/3.jpg)
Summary of last lecture
| | Homography | Perspective (calibrated) | Perspective (uncalibrated) | Orthographic |
|---|---|---|---|---|
| Form | $p' \cong Hp$ | $p'^T E p = 0$ | $p'^T F p = 0$ | $\ldots = 0$ |
| Properties | One-to-one (group) | Concentric epipolar lines | Concentric epipolar lines | Parallel epipolar lines |
| DOFs | 8(5) | 8(5) | 8(7) | 4 |
| Eqs/pnt | 2 | 1 | 1 | 1 |
| Minimal configuration | 4 | 5+ (8, linear) | 7+ (8, linear) | 4 |
| Depth | No | Yes, up to scale | Yes, projective structure | Affine structure (third view required for Euclidean structure) |
![Page 4: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/4.jpg)
Camera rotation
• Images obtained by rotating the camera about its optical axis are related by a homography: $p' \cong Hp$
• Verify that $H$ does not depend on the depth $Z$
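One way to carry out this verification is sketched below. It assumes a pure rotation $R$ with no translation and the same calibration matrix $K$ for both images; the symbols $K$, $P$, $Z$ and $Z'$ are notation chosen here, with $p = (x, y, 1)^T$.

```latex
P' = R P, \qquad Z\,p = K P, \qquad Z'\,p' = K P'
\;\Longrightarrow\;
Z'\,p' = K R P = K R K^{-1} (Z\,p)
\;\Longrightarrow\;
p' \cong H p, \qquad H = K R K^{-1}
```

The depth enters only through the scale factor $Z / Z'$, which the projective equality $\cong$ absorbs, so $H$ is the same for every scene point.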
![Page 5: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/5.jpg)
Planar scene
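• For a planar scene $n^T P = d$, the two images are related by a homography $p' \cong Hp$, with $H = R + \frac{1}{d} t n^T$ for calibrated cameras related by $P' = RP + t$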
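A sketch of the standard argument for the planar case, under assumed notation: the plane is $n^T P = d$ in the first camera frame, the second camera frame is given by $P' = RP + t$, and image points are in calibrated coordinates.

```latex
n^T P = d \;\Longrightarrow\; \frac{n^T P}{d} = 1
\;\Longrightarrow\;
P' = R P + t\,\frac{n^T P}{d} = \Big( R + \tfrac{1}{d}\, t\, n^T \Big) P
\;\Longrightarrow\;
p' \cong H p, \qquad H = R + \tfrac{1}{d}\, t\, n^T
```

As in the rotation case, $H$ depends on the motion and the plane but not on the depth of the individual point.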
![Page 6: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/6.jpg)
Epipolar lines
[Figure: two cameras with centers O and O′ joined by the baseline; the epipolar plane cuts each image plane in an epipolar line]
$p'^T E p = 0$
![Page 7: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/7.jpg)
Rectification
• Rectification: rotation and scaling of each camera’s coordinate frame to make the epipolar lines horizontal and of equal height, by bringing the two image planes to be parallel to the baseline
• Rectification is achieved by applying a homography to each of the two images
![Page 8: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/8.jpg)
Rectification
[Figure: cameras O and O′ on the baseline, with homographies $H_l$ and $H_r$ applied to the two images]
$q'^T H_l^{-T} E H_r^{-1} q = 0$
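The rectified constraint follows by substitution into the epipolar constraint $p'^T E p = 0$. The sketch below assumes the rectified points are $q' = H_l p'$ and $q = H_r p$; this pairing of homographies with images is inferred from the formula rather than stated on the slide.

```latex
p' = H_l^{-1} q', \qquad p = H_r^{-1} q
\;\Longrightarrow\;
0 = p'^T E\, p = \big( H_l^{-1} q' \big)^T E \big( H_r^{-1} q \big)
  = q'^T H_l^{-T} E\, H_r^{-1} q
```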
![Page 9: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/9.jpg)
Cyclopean coordinates
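• In a rectified stereo rig with baseline of length $b$, we place the origin at the midpoint between the camera centers.
• A point $P = (X, Y, Z)$ is projected to:
  • Left image: $x_l = \frac{f (X + b/2)}{Z}$, $y_l = \frac{f Y}{Z}$
  • Right image: $x_r = \frac{f (X - b/2)}{Z}$, $y_r = \frac{f Y}{Z}$
• Cyclopean coordinates: $X = \frac{b (x_l + x_r)}{2 (x_l - x_r)}$, $Y = \frac{b (y_l + y_r)}{2 (x_l - x_r)}$, $Z = \frac{b f}{x_l - x_r}$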
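The projection and its inverse can be sketched in a few lines of Python. The function names `project` and `reconstruct`, the focal length `f` and the baseline symbol `b` are assumptions made for this illustration, with the left camera placed at $X = -b/2$ and the right camera at $X = +b/2$.

```python
def project(X, Y, Z, f, b):
    """Project a point given in the cyclopean frame into a rectified stereo pair.

    The left camera sits at X = -b/2 and the right camera at X = +b/2.
    Returns ((x_l, y_l), (x_r, y_r)).
    """
    x_left = f * (X + b / 2) / Z
    x_right = f * (X - b / 2) / Z
    y = f * Y / Z  # identical in both images after rectification
    return (x_left, y), (x_right, y)


def reconstruct(left, right, f, b):
    """Recover cyclopean coordinates (X, Y, Z) from a pair of matching image points."""
    (x_left, y_left), (x_right, y_right) = left, right
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    Z = f * b / disparity
    X = b * (x_left + x_right) / (2 * disparity)
    Y = b * (y_left + y_right) / (2 * disparity)
    return X, Y, Z
```

Projecting a point with `project` and feeding both image points to `reconstruct` returns the original coordinates up to floating-point error.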
![Page 10: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/10.jpg)
Disparity
• Disparity is inversely proportional to depth
• Constant disparity ⇔ constant depth
• Larger baseline, more stable reconstruction of depth (but more occlusions, and correspondence is harder)
(Note that disparity is defined in a rectified rig in a cyclopean coordinate frame)
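The effect of the baseline on stability can be illustrated numerically. The sketch below assumes the relation $Z = f b / d$ between depth $Z$, focal length $f$, baseline $b$ and disparity $d$; the function name `depth_error` and the example numbers are illustrative only.

```python
def depth_error(Z, f, b, disparity_error):
    """Absolute depth error caused by a fixed error in the measured disparity.

    Assumes depth and disparity are related by Z = f * b / d.
    """
    true_disparity = f * b / Z
    estimated_Z = f * b / (true_disparity + disparity_error)
    return abs(estimated_Z - Z)


# Same scene point and same one-pixel matching error, two different baselines.
narrow = depth_error(Z=5.0, f=500.0, b=0.1, disparity_error=1.0)
wide = depth_error(Z=5.0, f=500.0, b=0.4, disparity_error=1.0)
```

With these numbers the wide baseline yields the smaller depth error, because the same pixel error is a smaller fraction of a larger disparity.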
![Page 12: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/12.jpg)
The correspondence problem
• Stereo matching is ill-posed:
  • Matching ambiguity: different regions may look similar
  • Specular reflectance: multiple depth values
![Page 13: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/13.jpg)
Random dot stereogram
• Depth is perceived from a pair of random dot images
• Stereo perception is based solely on local information (low level)
![Page 14: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/14.jpg)
Moving random dots
![Page 15: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/15.jpg)
Compared elements for correspondence
• Single pixel intensities
• Pixel color
• Small window, often using normalized correlation to offset gain
• Features and edges
• Mini segments
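Normalized correlation between two windows can be sketched as follows. The function name `ncc` and the flat-list window representation are assumptions for this example; subtracting the mean and dividing by the norm is what makes the score insensitive to gain and offset changes.

```python
import math


def ncc(window_a, window_b):
    """Normalized cross-correlation of two equally sized windows (flat lists).

    Returns a value in [-1, 1]; 1 means the windows agree up to gain and offset.
    """
    if len(window_a) != len(window_b) or not window_a:
        raise ValueError("windows must be non-empty and of equal size")
    mean_a = sum(window_a) / len(window_a)
    mean_b = sum(window_b) / len(window_b)
    dev_a = [v - mean_a for v in window_a]
    dev_b = [v - mean_b for v in window_b]
    norm_a = math.sqrt(sum(v * v for v in dev_a))
    norm_b = math.sqrt(sum(v * v for v in dev_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a constant window carries no texture to correlate
    return sum(a * b for a, b in zip(dev_a, dev_b)) / (norm_a * norm_b)
```

A window compared with a brighter, higher-contrast copy of itself (for example `2 * v + 15` for every pixel `v`) scores 1 up to rounding, whereas a plain sum of squared differences would report a large mismatch.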
![Page 16: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/16.jpg)
Dynamic programming
• Each pair of epipolar lines is compared independently
• Local cost: sum of a unary term and a binary term
  • Unary term: cost of a single match
  • Binary term: cost of a change of disparity (occlusion)
• Analogous to string matching (‘diff’ in Unix)
![Page 17: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/17.jpg)
String matching
• Swing → String
[Figure: alignment grid with “String” along one axis and “Swing” along the other; a path runs from Start to End]
![Page 18: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/18.jpg)
String matching
• Cost: #substitutions + #insertions + #deletions
[Figure: alignment grid for “String” and “Swing”]
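The cost above is the classic edit distance, which a short dynamic program computes row by row. This is a minimal sketch with unit cost for every operation; the function name `edit_distance` is chosen here.

```python
def edit_distance(source, target):
    """Minimal number of substitutions, insertions and deletions turning source into target."""
    # previous[j] holds the cost of converting the processed prefix of source
    # into the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # deleting the first i characters of source
        for j, t_char in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,                       # delete s_char
                current[j - 1] + 1,                    # insert t_char
                previous[j - 1] + (s_char != t_char),  # match or substitute
            ))
        previous = current
    return previous[-1]


print(edit_distance("Swing", "String"))  # prints 2: substitute w -> t, insert r
```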
![Page 19: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/19.jpg)
![Page 20: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/20.jpg)
Stereo with dynamic programming
• Shortest path in a grid
• Diagonals: constant disparity
• Moving along the diagonal: pay unary cost (cost of pixel match)
• Moving sideways: pay binary cost, i.e. disparity change (occlusion, right or left)
• Cost prefers fronto-parallel planes. A penalty is paid for tilted planes
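The grid formulation can be sketched for one pair of scanlines as follows. This is a simplified illustration rather than the lecture's exact cost: the unary term is the absolute intensity difference, every sideways step pays a constant `occlusion_cost`, and the name `scanline_stereo` is chosen here. It fills an (n+1) × (m+1) table, so it runs in O(nm) time for scanlines of lengths n and m.

```python
def scanline_stereo(left, right, occlusion_cost):
    """Match two rectified scanlines by a shortest path through the matching grid.

    Returns (disparity, total_cost). disparity[i] is x_left - x_right for each
    matched left pixel, or None where the left pixel is declared occluded.
    """
    n, m = len(left), len(right)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * occlusion_cost
    for j in range(1, m + 1):
        cost[0][j] = j * occlusion_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]),  # diagonal: match
                cost[i - 1][j] + occlusion_cost,  # sideways: left pixel unmatched
                cost[i][j - 1] + occlusion_cost,  # sideways: right pixel unmatched
            )
    # Trace the optimal path back from the far corner of the grid.
    disparity = [None] * n
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]):
            disparity[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + occlusion_cost:
            i -= 1
        else:
            j -= 1
    return disparity, cost[n][m]
```

For example, with `left = [5, 5, 9, 3, 7, 5, 5]`, `right = [5, 9, 3, 7, 5, 5, 5]` and an occlusion cost of 2, the textured pixels 9, 3, 7 are matched with disparity 1 at the price of two occlusions.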
![Page 21: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/21.jpg)
Dynamic programming on a grid
[Figure: shortest path through the matching grid, beginning at Start]
Complexity?
![Page 22: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/22.jpg)
Probability interpretation: the Viterbi algorithm
• Markov chain
• States: discrete set of disparities
• Log probabilities: product → sum
![Page 23: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/23.jpg)
Probability interpretation: the Viterbi algorithm
• Markov chain
• States: discrete set of disparities
• Maximum likelihood: minimize sum of negative logs
• Viterbi algorithm: equivalent to shortest path
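The equivalence can be made concrete with a min-sum Viterbi sketch over a chain. Here `unary[i][s]` stands for the negative log likelihood of state (disparity) `s` at position `i`, and `pairwise(s, t)` for the negative log transition probability; both names and the cost values in the usage note are assumptions for illustration.

```python
def viterbi(unary, pairwise):
    """Minimum-cost state sequence on a chain (min-sum form of the Viterbi algorithm).

    unary[i][s]: cost of state s at position i (a negative log probability).
    pairwise(s, t): cost of moving from state s to state t.
    Returns (path, total_cost).
    """
    n_states = len(unary[0])
    cost = list(unary[0])
    back_pointers = []
    for position_costs in unary[1:]:
        new_cost, pointers = [], []
        for t in range(n_states):
            # Best predecessor for state t at this position.
            best = min(range(n_states), key=lambda s: cost[s] + pairwise(s, t))
            pointers.append(best)
            new_cost.append(cost[best] + pairwise(best, t) + position_costs[t])
        cost = new_cost
        back_pointers.append(pointers)
    state = min(range(n_states), key=lambda s: cost[s])
    total = cost[state]
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, total
```

With `unary = [[0, 2, 3], [2, 0, 3], [3, 1, 0], [3, 2, 0]]` and `pairwise = lambda s, t: abs(s - t)`, the cheapest sequence is `[0, 1, 2, 2]` with total cost 2. Exponentiating the negated total gives the probability of the most likely sequence under the corresponding chain model.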
![Page 25: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/25.jpg)
Dynamic programming: pros and cons
• Advantages:
  • Simple, efficient
  • Achieves global optimum
  • Generally works well
• Disadvantages:
  • Works separately on each epipolar line, does not enforce smoothness across epipolar lines
  • Prefers fronto-parallel planes
  • Too local? (considers only immediate neighbors)
![Page 26: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/26.jpg)
Markov random field
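• Graph $G = (\mathcal{V}, \mathcal{E})$. In our case the graph is a 4-connected grid representing one image
• States: disparity
• Minimize energy of the form $E(d) = \sum_{p \in \mathcal{V}} D_p(d_p) + \sum_{(p,q) \in \mathcal{E}} V_{pq}(d_p, d_q)$
• Interpreted as negative log probabilities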
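Evaluating such an energy on a 4-connected grid is straightforward: add one unary (data) term per pixel and one binary (smoothness) term per grid edge. The callable signatures `data_cost(row, col, d)` and `smooth_cost(d1, d2)` are assumptions made for this sketch.

```python
def mrf_energy(disparity, data_cost, smooth_cost):
    """Energy of a labeling on a 4-connected grid.

    disparity: 2D list of labels.
    data_cost(row, col, d): unary cost of label d at a pixel.
    smooth_cost(d1, d2): binary cost on an edge between neighboring pixels.
    """
    rows, cols = len(disparity), len(disparity[0])
    energy = 0
    for r in range(rows):
        for c in range(cols):
            d = disparity[r][c]
            energy += data_cost(r, c, d)
            # Count each edge once: to the right neighbor and to the neighbor below.
            if c + 1 < cols:
                energy += smooth_cost(d, disparity[r][c + 1])
            if r + 1 < rows:
                energy += smooth_cost(d, disparity[r + 1][c])
    return energy
```

For the 2×2 labeling `[[0, 0], [0, 1]]` with zero data cost and a smoothness cost of 1 per label change, the energy is 2, since two of the four grid edges join different labels.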
![Page 27: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/27.jpg)
Iterated conditional modes (ICM)
• Initialize states (= disparities) for every pixel
• Repeatedly update each pixel to the most likely disparity given the values assigned to its neighbors: $d_p \leftarrow \arg\min_{d} \big[ D_p(d) + \sum_{q \in N(p)} V_{pq}(d, d_q) \big]$
• Markov blanket: the state of a pixel only depends on the states of its immediate neighbors
• Similar to Gauss-Seidel iterations
• Slow convergence to an (often bad) local minimum
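A minimal ICM sketch on a 4-connected grid follows. It assumes a symmetric smoothness cost and labels `0 .. n_labels - 1`; the function name `icm` and the callable signatures are chosen here. Pixels are updated in place in raster order, which is what makes the scheme resemble Gauss-Seidel.

```python
def icm(initial, n_labels, data_cost, smooth_cost, max_sweeps=20):
    """Iterated conditional modes on a 4-connected grid.

    initial: 2D list of starting labels (not modified).
    data_cost(row, col, d) and smooth_cost(d1, d2) define the energy.
    Returns the labeling after convergence or max_sweeps sweeps.
    """
    labels = [row[:] for row in initial]
    rows, cols = len(labels), len(labels[0])
    for _ in range(max_sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbors = [
                    labels[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]

                def local_cost(d):
                    # Only the pixel's own data term and its incident edges matter.
                    return data_cost(r, c, d) + sum(smooth_cost(d, q) for q in neighbors)

                best = min(range(n_labels), key=local_cost)
                if local_cost(best) < local_cost(labels[r][c]):
                    labels[r][c] = best  # strict improvement only, so energy never rises
                    changed = True
        if not changed:
            break
    return labels
```

Starting from a 3×3 binary observation with a single noisy pixel in the center, an absolute-difference data term and a unit cost per label change, one sweep is enough to remove the outlier. On harder inputs the result depends strongly on the initialization.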
![Page 28: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/28.jpg)
Graph cuts: expansion moves
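• Assume $D_p$ is non-negative and $V_{pq}$ is a metric: $V_{pq}(\alpha, \beta) = 0 \Leftrightarrow \alpha = \beta$, $V_{pq}(\alpha, \beta) = V_{pq}(\beta, \alpha) \ge 0$, $V_{pq}(\alpha, \beta) \le V_{pq}(\alpha, \gamma) + V_{pq}(\gamma, \beta)$
• We can then apply semi-global moves using minimal s-t cuts
• Converges faster to a better (local) minimum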
![Page 29: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/29.jpg)
α-Expansion
• In any one round, an expansion move allows each pixel to either
  • change its state to α, or
  • maintain its previous state
  Each round is implemented via max flow/min cut
• One iteration: apply expansion moves sequentially with all possible disparity values
• Repeat until convergence
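The outer loop of α-expansion can be sketched on a 1D chain of pixels. Note the swapped-in technique: on a chain, the optimal expansion move can be found exactly by dynamic programming over the binary keep/switch choice, so this sketch uses that instead of the max flow/min cut construction needed on a 2D grid. All function names and signatures are assumptions for illustration.

```python
def chain_energy(labels, data_cost, smooth_cost):
    """Energy of a labeling on a 1D chain: unary terms plus neighbor terms."""
    unary = sum(data_cost(i, d) for i, d in enumerate(labels))
    binary = sum(smooth_cost(a, b) for a, b in zip(labels, labels[1:]))
    return unary + binary


def expansion_move(labels, alpha, data_cost, smooth_cost):
    """Best labeling in which every pixel either keeps its label or switches to alpha.

    Solved by dynamic programming over the chain (a stand-in for min cut).
    """
    options = [(d, alpha) for d in labels]  # index 0: keep, index 1: switch
    cost = [data_cost(0, d) for d in options[0]]
    back_pointers = []
    for i in range(1, len(labels)):
        new_cost, pointers = [], []
        for d in options[i]:
            k = min((0, 1), key=lambda k: cost[k] + smooth_cost(options[i - 1][k], d))
            pointers.append(k)
            new_cost.append(cost[k] + smooth_cost(options[i - 1][k], d) + data_cost(i, d))
        cost = new_cost
        back_pointers.append(pointers)
    k = min((0, 1), key=lambda k: cost[k])
    choice = [k]
    for pointers in reversed(back_pointers):
        k = pointers[k]
        choice.append(k)
    choice.reverse()
    return [options[i][k] for i, k in enumerate(choice)]


def alpha_expansion(initial, n_labels, data_cost, smooth_cost):
    """Cycle over all labels, applying expansion moves until none lowers the energy."""
    labels = list(initial)
    energy = chain_energy(labels, data_cost, smooth_cost)
    improved = True
    while improved:
        improved = False
        for alpha in range(n_labels):
            candidate = expansion_move(labels, alpha, data_cost, smooth_cost)
            candidate_energy = chain_energy(candidate, data_cost, smooth_cost)
            if candidate_energy < energy:
                labels, energy, improved = candidate, candidate_energy, True
    return labels, energy
```

Unlike ICM, a single expansion move may relabel many pixels at once. With observations `[0, 0, 3, 0, 0, 2, 2, 2]`, an absolute-difference data term and a cost of 2 per label change, the isolated 3 is relabeled to 0 while the run of 2s is kept.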
![Page 30: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/30.jpg)
α-Expansion
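• Every round achieves a globally optimal solution over one expansion move
• Energy is monotonically non-increasing between rounds
• At convergence the energy is optimal with respect to all expansion moves, and within a scale factor of the global optimum: $E(\hat{d}) \le 2c\,E(d^*)$, where $\hat{d}$ is the labeling at convergence, $d^*$ is the global optimum, and $c = \max_{p,q} \frac{\max_{\alpha \ne \beta} V_{pq}(\alpha, \beta)}{\min_{\alpha \ne \beta} V_{pq}(\alpha, \beta)}$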
![Page 31: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/31.jpg)
α-Expansion (1D example)
![Page 32: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/32.jpg)
α-Expansion (1D example)
[Figure: a 1D row of pixels connected to two terminal nodes, $\alpha$ and $\bar{\alpha}$]
![Page 33: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/33.jpg)
α-Expansion (1D example)
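[Figure: neighboring pixels $p$ and $q$ both switch to $\alpha$; the cut pays $D_p(\alpha) + D_q(\alpha)$, and $V_{pq}(\alpha, \alpha) = 0$]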
![Page 34: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/34.jpg)
α-Expansion (1D example)
[Figure: $p$ and $q$ both keep their labels; the cut pays $D_p(d_p) + D_q(d_q)$]
But what about $V_{pq}(d_p, d_q)$?
![Page 35: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/35.jpg)
α-Expansion (1D example)
[Figure: $p$ and $q$ both keep their labels; the cut pays $D_p(d_p) + D_q(d_q) + V_{pq}(d_p, d_q)$]
![Page 36: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/36.jpg)
α-Expansion (1D example)
[Figure: $p$ keeps $d_p$ and $q$ switches to $\alpha$; the cut pays $D_p(d_p) + V_{pq}(d_p, \alpha) + D_q(\alpha)$]
![Page 37: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/37.jpg)
α-Expansion (1D example)
[Figure: $p$ switches to $\alpha$ and $q$ keeps $d_q$; the cut pays $D_p(\alpha) + V_{pq}(\alpha, d_q) + D_q(d_q)$]
![Page 38: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/38.jpg)
α-Expansion (1D example)
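[Figure: a cut through all three edges $V_{pq}(\alpha, d_q)$, $V_{pq}(d_p, \alpha)$ and $V_{pq}(d_p, d_q)$]
Such a cut cannot be obtained due to the triangle inequality: $V_{pq}(d_p, d_q) \le V_{pq}(d_p, \alpha) + V_{pq}(\alpha, d_q)$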
![Page 39: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/39.jpg)
Common metrics
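• Potts model: $V(\alpha, \beta) = K \cdot [\alpha \ne \beta]$
• Truncated absolute difference: $V(\alpha, \beta) = \min(K, |\alpha - \beta|)$
• Truncated squared difference is not a metric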
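These claims can be checked by brute force on a small label set. The sketch below assumes integer labels 0 to 4 and a truncation constant K = 4; the function names are chosen here. The truncated squared difference fails because V(0, 2) = 4 exceeds V(0, 1) + V(1, 2) = 2.

```python
from itertools import product


def is_metric(V, labels):
    """Brute-force check of the metric axioms for V on a finite label set."""
    labels = list(labels)
    for a, b in product(labels, repeat=2):
        if V(a, b) < 0 or V(a, b) != V(b, a):
            return False  # not non-negative or not symmetric
        if (V(a, b) == 0) != (a == b):
            return False  # zero cost must mean identical labels
    # Triangle inequality over all label triples.
    return all(V(a, b) <= V(a, c) + V(c, b) for a, b, c in product(labels, repeat=3))


def potts(a, b, K=4):
    return K * int(a != b)


def truncated_abs(a, b, K=4):
    return min(K, abs(a - b))


def truncated_squared(a, b, K=4):
    return min(K, (a - b) ** 2)


print(is_metric(potts, range(5)))              # True
print(is_metric(truncated_abs, range(5)))      # True
print(is_metric(truncated_squared, range(5)))  # False
```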
![Page 40: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/40.jpg)
Reconstruction with graph-cuts
[Figure: original image, graph-cut result, and ground truth]
![Page 41: Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of Science.](https://reader038.fdocuments.us/reader038/viewer/2022103100/56649ed45503460f94be4dbf/html5/thumbnails/41.jpg)
A different application: detect skyline
• Input: one image, oriented with sky above
• Objective: find the skyline in the image
• Graph: grid
• Two states: sky, ground
• Unary (data) term:
  • State = sky: low if blue, otherwise high
  • State = ground: high if blue, otherwise low
• Binary term for vertical connections:
  • If state(node) = sky then state(node above) = sky (infinity if not)
  • If state(node) = ground then state(node below) = ground
• Solve with an expansion move. This is a two-state problem, so graph cut finds the global optimum in one expansion move
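Because the binary terms listed act only along vertical connections, each column can be treated on its own: the infinite penalty forces sky above ground, so a column's labeling is fixed by a single boundary row. The sketch below exploits this with a direct per-column search, a simpler stand-in for the graph cut that reaches the same optimum under these assumptions. The name `skyline` and the cost-array layout (row 0 at the top) are chosen for illustration.

```python
def skyline(sky_cost, ground_cost):
    """Per-column optimal sky/ground split under the 'sky above ground' constraint.

    sky_cost[r][c] and ground_cost[r][c] are the unary costs of labeling pixel
    (r, c) as sky or ground, with row 0 at the top of the image.
    Returns, for each column, the number of rows labeled sky.
    """
    rows, cols = len(sky_cost), len(sky_cost[0])
    boundaries = []
    for c in range(cols):
        # Start with the whole column labeled ground (zero sky rows).
        total = sum(ground_cost[r][c] for r in range(rows))
        best_total, best_k = total, 0
        for k in range(1, rows + 1):
            # Relabel row k - 1 from ground to sky.
            total += sky_cost[k - 1][c] - ground_cost[k - 1][c]
            if total < best_total:
                best_total, best_k = total, k
        boundaries.append(best_k)
    return boundaries


# Toy "blueness" image: 1 = blue pixel, 0 = not blue.
blue = [[1, 1, 1],
        [1, 0, 1],
        [0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
sky = [[1 - b for b in row] for row in blue]  # sky is cheap where the pixel is blue
print(skyline(sky, blue))  # prints [2, 4, 3]
```

The single non-blue pixel inside the sky of the middle column is absorbed, since labeling it as sky costs less than breaking the vertical ordering.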