Multi-view real-time depth estimation based on combination of visual-hull and hybrid recursive matching

HHI

Wolfgang Waizenegger

Overview

• Field of application: 3D Presence
  – 2D Videoconferencing
  – 3D Videoconferencing
  – 3D Presence concept and 3D displays
  – The camera system
• 3D Analysis
  – 3D algorithmic chain
  – Hybrid recursive matching (HRM)
  – Visual Hull (VH)
  – HRM and VH combination
• Results
• Hardware
• Conclusion and Outlook

3D Presence Consortium

SoA of Telepresence Systems

Polycom TPX System

Telepresence System by CISCO

HP Halo Telepresence System

Drawbacks of conventional telepresence systems

• Drawbacks:
  – No eye contact, e.g. it is hard to recognize who is talking to whom
  – Misleading gestures and body language
• Ideal situation: every local participant has their own view of each remote conferee
• Solution: immersive 3D videoconferencing

Missing eye contact (CISCO system)

SoA of 3D Videoconferencing

MultiView by Univ. of California, Berkeley, 2004

Virtue/im.point by Fraunhofer HHI, 2003/2004

Real Meet Room, France Telecom R&D, 2001

The concept of 3D Presence

Three parties, two conferees per party

• Multi-party 3D videoconferencing
• 3D multi-user auto-stereoscopic display technology
• Multi-party eye contact and gesture-based interaction

Replace remote conferees by 3D displays

Multi-View 3D Displays

Multiple 3D views from different perspectives

Advantages:
- Own view for each local conferee
- Adapted viewing perspective
- 3D impression
- Multiple views allow conferees to switch perspective by moving the head

multiple viewing cones

Multi-View 3D Display

The Multi-View Camera System

Narrow baseline system
• Robust disparity estimation
• Consistency check by trifocal matching

Wide baseline system
• Increased depth resolution
• Option to combine with Visual Hull

[Figure: combined trifocal system with baseline labels b, b and kb; components: horizontal narrow baseline system, vertical narrow baseline system, horizontal wide baseline system, vertical wide baseline system (labeled twice)]
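
The link between baseline and depth resolution follows from the standard rectified-stereo relation Z = f·b/d. The sketch below is a minimal illustration under a pinhole model with the focal length given in pixels. All numeric values (focal length, baseline b, factor k, conferee distance) are illustrative assumptions and are not taken from the slides.

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Rectified stereo: Z = f * b / d."""
    return f_px * baseline_m / disparity_px


def depth_step(f_px, baseline_m, depth_m, disparity_step_px=1.0):
    """Approximate depth change caused by one disparity step at depth Z.

    Differentiating Z = f*b/d gives |dZ| ~= Z**2 / (f*b) * |dd|.
    """
    return depth_m ** 2 / (f_px * baseline_m) * disparity_step_px


# Hypothetical values chosen only for illustration.
F_PX = 1000.0   # focal length in pixels
B = 0.10        # narrow baseline b in metres
K = 3.0         # wide baseline is k * b
Z = 2.0         # conferee distance in metres

narrow_step = depth_step(F_PX, B, Z)
wide_step = depth_step(F_PX, K * B, Z)
print(f"narrow baseline: {narrow_step * 100:.2f} cm per disparity step")
print(f"wide baseline:   {wide_step * 100:.2f} cm per disparity step")
```

With these assumed numbers, one disparity step corresponds to about 4 cm of depth for the narrow baseline and about 1.3 cm for the wide one, i.e. the depth step shrinks by the factor k.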

The Mock-up for Camera Configuration Testing

3D Analysis Chain

[Diagram blocks: n stereo streams; segmentation; disparity estimation; volumetric reconstruction; head tracking; hand tracking; data fusion; depth maps; 3D modeling data; occlusion information etc.; video + depth (n)]

Hybrid-Recursive Matching (HRM)

[Diagram labels: left image; right image; disparity memory; block recursion (3 candidates); start vector; pixel recursion; update vector; choice of best disparity; disparity vector]
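
The labels on this slide suggest a two-stage loop: a block recursion picks a start vector from three candidates held in the disparity memory, a pixel recursion proposes an update, and the better of the two is stored. The sketch below is a heavily simplified illustration of that idea on a single rectified scanline, not the authors' implementation. The choice of candidates (left neighbour, scanline above, previous frame), the SAD cost, and the plus/minus one pixel search that stands in for the gradient-based pixel recursion are all assumptions.

```python
def sad(left, right, x, d, half=1):
    """Sum of absolute differences between windows at left[x] and right[x - d]."""
    cost = 0
    for o in range(-half, half + 1):
        xl = min(max(x + o, 0), len(left) - 1)       # clamp at the borders
        xr = min(max(x + o - d, 0), len(right) - 1)
        cost += abs(left[xl] - right[xr])
    return cost


def match_scanline(left, right, above, previous):
    """Estimate one integer disparity per pixel of a rectified scanline.

    above    -- disparities of the scanline above (spatial candidate)
    previous -- disparities of this scanline in the previous frame (temporal candidate)
    """
    result = []
    for x in range(len(left)):
        # Block recursion: test three candidates taken from the disparity memory.
        candidates = [result[x - 1] if x > 0 else 0, above[x], previous[x]]
        start = min(candidates, key=lambda d: sad(left, right, x, d))
        # Stand-in for pixel recursion: try small updates around the start vector.
        # The start vector is listed first so that ties keep it unchanged.
        updates = [start, start - 1, start + 1]
        best = min(updates, key=lambda d: sad(left, right, x, d))
        result.append(best)
    return result


# Toy data: the left scanline is the right scanline shifted by 2 pixels.
right_line = [0, 10, 40, 20, 70, 30, 90, 50, 80, 60]
left_line = [right_line[max(x - 2, 0)] for x in range(len(right_line))]
print(match_scanline(left_line, right_line, [1] * 10, [1] * 10))
```

Because each pixel only evaluates a handful of candidates instead of a full disparity search range, the cost per pixel stays small, which is the property that makes this family of methods attractive for real-time use.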

Trifocal system

[Figure panels: vertical narrow baseline; horizontal narrow baseline; after consistency check]
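
The slide does not state the criterion used for the consistency check. One simple possibility, assuming that the horizontal and vertical narrow-baseline pairs share a reference camera and have equal baseline and focal length, is that a correct match yields the same disparity magnitude in both pairs. The sketch below keeps a pixel only when the two estimates agree within a tolerance; the function name, the tolerance, and the averaging of agreeing values are hypothetical choices.

```python
def consistency_check(disp_h, disp_v, tol=1.0, invalid=None):
    """Keep a disparity only where horizontal and vertical estimates agree.

    disp_h, disp_v -- disparity maps (lists of rows) for the same reference view
    tol            -- maximum allowed difference in pixels (assumed value)
    """
    checked = []
    for row_h, row_v in zip(disp_h, disp_v):
        out_row = []
        for dh, dv in zip(row_h, row_v):
            if abs(dh - dv) <= tol:
                out_row.append(0.5 * (dh + dv))  # agreeing estimates are averaged
            else:
                out_row.append(invalid)          # mismatch: mark as unreliable
        checked.append(out_row)
    return checked
```

Pixels marked invalid would have to be filled later, for example from neighbouring values or from another depth source.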

Multi-View Video Analysis Chain

[Diagram blocks: n stereo streams; segmentation; disparity estimation; volumetric reconstruction; head tracking; hand tracking; data fusion; depth maps; 3D modeling data; occlusion information etc.; video + depth (n)]

Colored Visual Hull reconstruction

Visual Hull Techniques

• Polygonal
• Volume-based space carving (VH)
• Image-based (IBVH)

3D Presence demands real-time processing!

Parallelization of the last two approaches on graphics hardware is straightforward!
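
Volume-based space carving keeps a voxel only if it projects inside the silhouette in every view. The sketch below illustrates this with three axis-aligned orthographic views, which is a simplifying assumption; a real setup would project each voxel with calibrated perspective cameras. The function and parameter names are hypothetical.

```python
def carve(n, sil_xy, sil_xz, sil_yz):
    """Return the set of voxels (x, y, z) in an n^3 grid that survive carving.

    sil_xy[x][y] -- silhouette seen along the z axis
    sil_xz[x][z] -- silhouette seen along the y axis
    sil_yz[y][z] -- silhouette seen along the x axis
    """
    hull = set()
    for x in range(n):
        for y in range(n):
            if not sil_xy[x][y]:
                continue  # the whole column along z is carved away
            for z in range(n):
                if sil_xz[x][z] and sil_yz[y][z]:
                    hull.add((x, y, z))
    return hull
```

Every voxel is tested independently of all others, which is the property that makes this approach easy to parallelize.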

IBVH Algorithm

Our implementation is based on the initial work of Matusik et al. (2000).

Advantages of our algorithm:
• Improved caching strategy that allows pixel pre-selection, which significantly speeds up the computation
• GPU-only implementation using CUDA
• Establishes an interconnection to voxel-based implementations by placing cameras at infinity
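
In image-based visual hulls as described by Matusik et al. (2000), the viewing ray of each output pixel is intersected with the silhouette of every other view, which yields a list of intervals along the ray per view; intersecting these lists gives the part of the ray inside the hull. The sketch below covers only that interval-intersection step and assumes the per-view intervals are already available as sorted, disjoint (start, end) pairs. The function names are hypothetical, and the caching and pixel pre-selection mentioned on the slide are not modelled.

```python
def intersect_intervals(a, b):
    """Intersect two sorted lists of disjoint (start, end) intervals."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            result.append((lo, hi))
        # Advance the interval that ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def ray_depth(per_view_intervals):
    """Intersect the intervals of all views and return the nearest hull entry point."""
    current = per_view_intervals[0]
    for intervals in per_view_intervals[1:]:
        current = intersect_intervals(current, intervals)
    return current[0][0] if current else None
```

For example, `ray_depth([[(1, 4), (6, 9)], [(3, 7)], [(5, 10)]])` returns 6, the start of the only interval common to all three views; a ray that misses the hull returns `None`.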

IBVH interconnection to voxel based methods

VH vs. IBVH

Timings for two GPU-based implementations at different resolutions. The image upload time is included.

• Volume-based approach from Ladikos et al. 2008 (VH_Lad)
• Our image-based approach (PPSIBVH with pixel pre-selection, IBVH without)

Input: Middlebury dinoRig dataset (48 images, 640 x 480)

Method    Hardware      128³       256³        512³
VH_Lad    4 x 8800GTX   99.89 ms   296.71 ms   -
IBVH      1 x GTX280    47.9 ms    82.5 ms     280.6 ms
PPSIBVH   1 x GTX280    41.6 ms    60.9 ms     150.6 ms
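
To make the effect of pixel pre-selection easier to read off, the short script below computes speedup ratios from the timings in the table above. The dictionary layout and the helper name are illustrative choices; the numbers are copied from the slide.

```python
# Timings in milliseconds copied from the table above; None marks a missing entry.
timings = {
    "VH_Lad": {128: 99.89, 256: 296.71, 512: None},
    "IBVH": {128: 47.9, 256: 82.5, 512: 280.6},
    "PPSIBVH": {128: 41.6, 256: 60.9, 512: 150.6},
}


def speedup(baseline, method, resolution):
    """Return how many times faster `method` is than `baseline`, or None if unknown."""
    t_base = timings[baseline][resolution]
    t_new = timings[method][resolution]
    if t_base is None or t_new is None:
        return None
    return t_base / t_new


for res in (128, 256, 512):
    gain = speedup("IBVH", "PPSIBVH", res)
    print(f"{res}^3: pixel pre-selection speedup {gain:.2f}x")
```

The gain from pixel pre-selection grows with resolution, from roughly 1.15x at 128³ to roughly 1.86x at 512³. Ratios against VH_Lad can be computed the same way, but they compare different hardware (4 x 8800GTX versus 1 x GTX280) and should be read with that in mind.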

IBVH result for the dinoRig dataset

Left: voxel representation of the IBVH result (512³). Right: image-based depth map.

IBVH result for a 3D Presence conferee

Timing for a typical 3D Presence setup with depth maps of 192 x 256 and 8 Visual Hull cameras: 10–20 msec on a single GTX280.

Soares et al. use an eight-CPU dual Opteron 2.2 GHz machine to achieve almost the same results with 5 cameras and an octree-based Visual Hull algorithm.

Combination HRM and VH

Result for the combination of HRM and VH

Combination HRM and VH (cont.)

Realization: Hardware Overview for the 3D Presence setup

• 5 PCs with dual Nehalem Xeon CPUs
• 2 x GeForce GTX295 per cluster node
• InfiniBand 40 Gbit/s interconnection

3D Presence System Architecture

[Diagram nodes: Node_VH, Node_0, Node_1, Node_2, Node_3]

Node_N
- Capture (4 cameras)
- Segmentation
- Lens un-distortion
- Rectification
- HRM (trifocal)
- Bilateral filtering
- Virtual view generation
- Encoding (video+depth)
- Networking

Indispensability of GPUs

• Hardware:
  – CPU: Intel 3.0 GHz (single-core computation)
  – GPU: GeForce GTX280
• Input:
  – Images: 1024 x 768, RGB24
  – Depth maps: 1024 x 768, float
• GPU results include upload and download times

Task                                  GPU       CPU
Lens un-distortion + rectification    2 msec    68 msec
Bilateral filtering of depth map      11 msec   1000 msec
Virtual view synthesis (RGB)          1 msec    150 msec
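
Bilateral filtering is the most expensive step in the comparison above. As a reminder of what it computes, the sketch below is a plain reference version of a standard bilateral filter on a depth map: each output value is a weighted mean of its neighbours, with weights that fall off with both spatial distance and depth difference, so depth edges are preserved. The window radius and both sigma values are assumed; the slides do not give the parameters or the variant actually used.

```python
import math


def bilateral_filter(depth, radius=1, sigma_space=1.0, sigma_range=0.1):
    """Smooth a 2D depth map (list of rows) while preserving depth edges."""
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = depth[y][x]
            weight_sum = 0.0
            value_sum = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < h and 0 <= xx < w):
                        continue  # ignore samples outside the image
                    neighbor = depth[yy][xx]
                    w_space = math.exp(-(dx * dx + dy * dy) / (2 * sigma_space ** 2))
                    w_range = math.exp(-((neighbor - center) ** 2) / (2 * sigma_range ** 2))
                    weight_sum += w_space * w_range
                    value_sum += w_space * w_range * neighbor
            out[y][x] = value_sum / weight_sum
    return out
```

Each output pixel depends only on a small input neighbourhood and can be computed independently, which fits the large GPU advantage reported for this step.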

Demo

Virtual view generation based on estimated depth maps
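
A common way to generate a virtual view from a rectified image and its disparity map is to shift each pixel along the scanline in proportion to its disparity and to let nearer pixels win where several land on the same position. The sketch below shows this forward warping for one scanline. It is an illustration of the general technique, not the method used in the demo; the sign convention, the rounding to integer positions, and the absence of hole filling are assumptions.

```python
def warp_scanline(colors, disparities, alpha, hole=None):
    """Forward-warp one rectified scanline to a virtual camera.

    alpha -- position of the virtual camera as a fraction of the baseline
             (0 = reference view, 1 = the other camera of the pair)
    """
    width = len(colors)
    out = [hole] * width
    best_disp = [float("-inf")] * width
    for x in range(width):
        target = int(round(x - alpha * disparities[x]))
        if 0 <= target < width and disparities[x] > best_disp[target]:
            # Larger disparity means closer to the camera, so it occludes.
            out[target] = colors[x]
            best_disp[target] = disparities[x]
    return out
```

Positions that receive no pixel stay at `hole`; these disocclusions are where additional cameras or occlusion information are needed.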

Conclusion and Outlook

• Three-party immersive 3D videoconferencing system
• Real-time 3D analysis for a 16-camera setup
• Fast IBVH algorithm that runs entirely on a single GPU
• Combination of trifocal HRM and VH significantly improves the results
• All processing runs in real time on only 5 PCs
• System allows rapid testing of various camera configurations
• First real-time demonstrator prototype available by October 2009
• Future: Full HD real-time 3D processing chain

Thank you!

Contact: [email protected]

Web: www.3dpresence.eu

References

Atzpadin, N., Kauff, P., and Schreer, O.: Stereo Analysis by Hybrid Recursive Matching for Real-Time Immersive Video Conferencing. IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on Immersive Telecommunications, vol. 14, no. 3, pp. 321-334, January 2004.

Matusik, W., Buehler, C., Raskar, R., Gortler, S. J., and McMillan, L.: Image-Based Visual Hulls. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, 2000.

Ladikos, A., Benhimane, S., and Navab, N.: Efficient Visual Hull Computation for Real-Time 3D Reconstruction using CUDA. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Workshop on Visual Computer Vision on GPUs (CVGPU), Anchorage, Alaska (USA), June 2008.

Soares, L., Menier, C., Raffin, B., and Roch, J.L.: Parallel adaptive octree carving for real-time 3D modeling. Poster at IEEE VR 2007 - Virtual Reality, Charlotte, North Carolina, USA, March 2007.