IEEE Transactions on Circuits and Systems for Video Technology, 2011
Real-time Stereo Matching on CUDA using an Iterative Refinement Method for Adaptive Support-Weight Correspondences
University of Nebraska-Lincoln
Jedrzej Kowalczuk, Eric T. Psota, and Lance C. Pérez
Outline
• Introduction
• Related work
• Iterative model
• Implement on parallel hardware
• Result
• Conclusion
Introduction
• A novel real-time stereo matching method is presented that uses
▫ a two-pass approximation of adaptive support-weight aggregation.
▫ a low-complexity iterative disparity refinement technique.
• The refinement technique is constructed using a probabilistic framework.
Introduction
• The two-pass method
▫ produces an accurate approximation of the support weights.
▫ reduces the complexity of aggregation.
• This method has been implemented on massively parallel graphics hardware using the CUDA computing engine.
Introduction
• In this paper, a real-time stereo matching method is introduced that uses
▫ window-based cost aggregation.
▫ a low-complexity iterative technique implemented on CUDA.
Introduction
• Many real-time methods focus on reducing complexity at the expense of accuracy.
• The proposed approach takes full advantage of the GTX 580's computing capabilities to produce a highly accurate stereo matching method.
Outline
• Introduction
• Related work
• Iterative model
• Implement on parallel hardware
• Result
• Conclusion
Related work
• Adaptive support-weight
▫ mimics the process of visual grouping in the human visual system (HVS).
▫ The support weight decreases as the geometric distance between p and q increases.
▫ It relies on the observation that typical scene surfaces have locally consistent color.
Adaptive Support-Weight
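The equations on this slide did not survive in the transcript. As a hedged illustration only, the sketch below uses the support-weight form commonly given for adaptive support-weight matching, w(p, q) = exp(-(Δc/γc + Δg/γg)), where Δc is a color distance and Δg the geometric distance between p and q. The function name, the Euclidean RGB color distance, and the sample pixel values are assumptions; the default γ values are the aggregation parameters quoted in the Result slide.

```python
import math

def support_weight(color_p, color_q, pos_p, pos_q, gamma_c=30.91, gamma_g=28.21):
    """Support weight of pixel q in the window centred on pixel p.

    Assumed form: exp(-(dc / gamma_c + dg / gamma_g)), with dc the Euclidean
    colour distance and dg the Euclidean pixel distance (both assumptions).
    """
    dc = math.dist(color_p, color_q)   # colour dissimilarity
    dg = math.dist(pos_p, pos_q)       # geometric distance in pixels
    return math.exp(-(dc / gamma_c + dg / gamma_g))

# Same colour, adjacent pixel: the weight stays close to 1.
w_near = support_weight((120, 80, 60), (120, 80, 60), (10, 10), (10, 11))
# Same colour, farther away: the weight decreases with distance.
w_far = support_weight((120, 80, 60), (120, 80, 60), (10, 10), (10, 20))
# Very different colour at the near position: the weight drops sharply.
w_diff = support_weight((120, 80, 60), (20, 200, 220), (10, 10), (10, 11))
```

Pixels that are close to p and similar in color receive weights near 1, so they dominate the aggregated cost, while pixels across a color edge contribute very little.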
Adaptive Support-Weight
• The complexity of ASW makes it unsuitable for cost aggregation in real-time applications.
• It is necessary to reduce the complexity of raw adaptive support-weight cost aggregation.
▫ two-pass adaptive support weights [21]
▫ approximated joint bilateral filtering [22]
▫ exponential step-size adaptive weights [9]
▫ cross-based support weight [11]
Two-pass Adaptive Support-Weight
• Instead of using square windows for matching, the two-pass approach approximates the ASW by performing cost aggregation along the vertical and then the horizontal direction.
• Complexity is reduced from O(n^2) to O(n).
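A minimal sketch of the two-pass idea, not the authors' implementation: costs are first averaged along each column and the result is then averaged along each row, so a window of side n = 2 * radius + 1 needs about 2n weight evaluations per pixel instead of n^2. The reading of n as the window side length, the grayscale image, the single-image weights, the per-pass normalization, and all function names are assumptions made for illustration.

```python
import math

def weight(img, y0, x0, y1, x1, gamma_c=30.91, gamma_g=28.21):
    """Assumed support weight between two pixels of a grayscale image."""
    dc = abs(img[y0][x0] - img[y1][x1])
    dg = math.hypot(y1 - y0, x1 - x0)
    return math.exp(-(dc / gamma_c + dg / gamma_g))

def aggregate_1d(img, cost, radius, dy, dx):
    """One pass: weighted average of `cost` along the direction (dy, dx)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for t in range(-radius, radius + 1):
                yy, xx = y + t * dy, x + t * dx
                if 0 <= yy < h and 0 <= xx < w:
                    wt = weight(img, y, x, yy, xx)
                    num += wt * cost[yy][xx]
                    den += wt
            out[y][x] = num / den   # den >= 1 because the centre pixel has weight 1
    return out

def two_pass_aggregate(img, cost, radius):
    """Vertical pass, then a horizontal pass over the vertical result."""
    vertical = aggregate_1d(img, cost, radius, dy=1, dx=0)
    return aggregate_1d(img, vertical, radius, dy=0, dx=1)

# Toy example: a dark region next to a bright column with a much higher cost.
img = [[10, 10, 200] for _ in range(3)]
cost = [[1.0, 1.0, 9.0] for _ in range(3)]
agg = two_pass_aggregate(img, cost, radius=1)
```

In the toy example the high cost of the bright column barely leaks into the dark region, because the color term keeps the cross-edge weights very small.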
Two-pass Adaptive Support-Weight
• The two-pass approach fails to accurately approximate the support weights under certain conditions.
Compare the Four Modifications
Two-pass
Bilateral Filtering
ESAW
Cross-based
Outline
• Introduction
• Related work
• Iterative model
• Implement on parallel hardware
• Result
• Conclusion
Flow Diagram
Iterative model
• The goal is to improve the accuracy of adaptive support-weight stereo matching.
• Let … denote a probabilistic event.
Iterative model
• Bayes' theorem
Iterative model
• Stereo matching is performed using an additive distance metric, arbitrarily denoted by δ(q, q̄).
Iterative model
Iterative Disparity Refinement
• Let D_p^i be the disparity estimate for pixel p obtained in the ith iteration of matching.
• Let F_p^i express the confidence level associated with the disparity estimate of pixel p.
Iterative Disparity Refinement
• Penalty function
Iterative Disparity Refinement
• After the matching costs are computed, the minimum-cost matches are found for both the reference and target images using the winner-take-all (WTA) decision criterion.
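The WTA step can be sketched on a single scanline as follows. This is an illustration under stated assumptions rather than the authors' code: `cost[x][d]` is taken to be the aggregated cost of matching reference pixel x with target pixel x - d, both disparity maps are read from the same cost table, and the function name and sample costs are hypothetical.

```python
def wta_scanline(cost):
    """Winner-take-all disparities for the reference and target views.

    cost[x][d] is assumed to hold the cost of matching reference pixel x
    with target pixel x - d on the same scanline.
    """
    width, n_disp = len(cost), len(cost[0])
    ref = []
    for x in range(width):
        # Only disparities whose target pixel lies inside the image.
        valid = [d for d in range(n_disp) if x - d >= 0]
        ref.append(min(valid, key=lambda d: cost[x][d]))
    tgt = []
    for xt in range(width):
        # Target pixel xt is matched by reference pixel xt + d.
        valid = [d for d in range(n_disp) if xt + d < width]
        tgt.append(min(valid, key=lambda d: cost[xt + d][d]))
    return ref, tgt

# Four pixels, two candidate disparities (0 and 1).
cost = [[0.2, 9.9],
        [0.9, 0.1],
        [0.8, 0.3],
        [0.1, 0.7]]
ref_disp, tgt_disp = wta_scanline(cost)
```

Each pixel simply keeps the candidate with the lowest cost; no smoothness term is involved at this stage.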
Iterative Disparity Refinement
• If p̄ = m(p) and p' = m(p̄), then
▫ disparity d(p, p̄) is assigned to the reference disparity map.
▫ disparity d(p', p̄) is assigned to the target disparity map.
• If |d(p, p̄) - d(p', p̄)| > 1, then the confidence F_p^i is set to zero.
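The consistency rule above can be sketched for one scanline as shown below. The convention that reference pixel x with disparity d matches target pixel x - d, the treatment of matches that fall outside the image, the function name, and the sample values are all assumptions; how the confidence is computed before this check is not recoverable from the transcript, so it is passed in as an input.

```python
def cross_check(ref_disp, tgt_disp, confidence):
    """Zero the confidence of reference pixels that fail the left-right check.

    A pixel fails when its disparity differs by more than 1 from the
    disparity stored at its matched pixel in the target disparity map.
    """
    checked = list(confidence)
    for x, d in enumerate(ref_disp):
        xt = x - d                     # matched target pixel (assumed convention)
        if xt < 0 or abs(d - tgt_disp[xt]) > 1:
            checked[x] = 0.0
    return checked

ref_disp = [0, 1, 2, 0, 3]
tgt_disp = [1, 0, 0, 0, 0]
confidence = [0.9, 0.8, 0.7, 0.6, 0.5]
checked = cross_check(ref_disp, tgt_disp, confidence)
```

Only the last pixel is rejected here: its disparity of 3 points to a target pixel whose own disparity is 0, a difference larger than the tolerance of 1.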
Outline
• Introduction
• Related work
• Iterative model
• Implement on parallel hardware
▫ CUDA execution model
▫ stereo matching on CUDA
▫ complexity and runtime distribution
• Result
• Conclusion
Flow Diagram
CUDA execution model
• A block of threads is an abstract representation of a multiprocessor and is capable of performing operations in parallel.
▫ The threads are executed on the graphics device equipped with a GPU.
▫ At runtime, each block of threads is mapped to a single multiprocessor on the device.
CUDA execution model
• The implementation of the proposed method uses the NVIDIA GeForce GTX 580 GPU, equipped with 512 CUDA cores.
• The device code is encapsulated in special functions called kernels, which are invoked by the host and executed in parallel by multiple threads.
Stereo Matching on CUDA
• The kernels are designed such that each thread within a block is responsible for computing the matching cost for a single pair of pixels.
• This granularity of computation allows the threads in each warp to take advantage of memory coalescing.
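To make the coalescing argument concrete, the following Python sketch emulates only the index arithmetic of a one-thread-per-pixel launch; it is not CUDA code and not the authors' kernel. The one-dimensional block layout, the block size of 128, the image width of 384, and the function name are hypothetical choices; the warp size of 32 is the standard value on NVIDIA GPUs.

```python
WARP_SIZE = 32

def thread_to_address(block_idx, thread_idx, block_dim, width):
    """Map a (block, thread) pair to a pixel and its row-major offset.

    Assumes a 1D launch in which global thread i handles pixel i.
    """
    gid = block_idx * block_dim + thread_idx   # global thread index
    y, x = divmod(gid, width)                  # pixel row and column
    return (y, x), y * width + x

# Offsets touched by the 32 threads of the first warp of block 0.
offsets = [thread_to_address(0, t, block_dim=128, width=384)[1]
           for t in range(WARP_SIZE)]
contiguous = all(b - a == 1 for a, b in zip(offsets, offsets[1:]))
```

Because neighbouring threads read neighbouring addresses, the hardware can serve a warp's loads with a small number of wide memory transactions, which is the benefit the slide refers to.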
Stereo Matching on CUDA
Complexity and Runtime Distribution
• Complexity of the matching cost volume is O(mnwr/s).
• Complexity of iterative refinement is O(mnwk/s).
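The transcript does not define the symbols in these expressions. The sketch below assumes m × n is the image size, w the window size, r the disparity range, k the number of refinement iterations, and s the number of parallel processors, and it ignores the constant factors hidden by the O-notation, which may differ between the two stages. Under those assumptions the two operation counts differ only by the factor k / r. All numeric values are hypothetical except k = 6 (the iteration count quoted in the Conclusion) and s = 512 (the CUDA core count of the GTX 580).

```python
def matching_ops(m, n, w, r, s):
    """Operation-count model for the cost volume, up to a constant factor."""
    return m * n * w * r / s

def refinement_ops(m, n, w, k, s):
    """Operation-count model for iterative refinement, up to a constant factor."""
    return m * n * w * k / s

# Hypothetical problem size: 375 x 450 image, window size 33, 60 disparities.
m, n, w, r, k, s = 375, 450, 33, 60, 6, 512
ratio = refinement_ops(m, n, w, k, s) / matching_ops(m, n, w, r, s)
```

With these example values the ratio is k / r = 6 / 60, so under this simple model refinement adds roughly a tenth of the cost of building the cost volume.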
Percentages of the total execution time
Outline
• Introduction
• Related work
• Iterative model
• Implement on parallel hardware
• Result
• Conclusion
Result
• γc = 30.91 and γg = 28.21 for matching cost aggregation.
• γc = 10.94 and γg = 118.78 for iterative disparity refinement, and the disparity penalty was set to α = 0.085.
Result
Result
Outline
• Introduction
• Related work
• Iterative model
• Implement on parallel hardware
• Result
• Conclusion
Conclusion
• The refinement technique iteratively improves the accuracy of the disparity map and typically converges after only six iterations.
• The added complexity associated with iterative refinement is shown both analytically and experimentally to be relatively small.