
Procedia Engineering 48 (2012) 479 – 488

1877-7058 © 2012 Published by Elsevier Ltd. doi:10.1016/j.proeng.2012.09.542

MMaMS 2012

Tilting optimization of fixed stereovision cameras

Petr Novak a, Vladimír Mostyn a, Václav Krys a, Zdenko Bobovský b,*

a Department of Robotics, Faculty of Mechanical Engineering, VŠB-TU Ostrava, 17. listopadu 15, Ostrava-Poruba, 708 33, Czech Republic

b Department of Applied Mechanics and Mechatronics, Faculty of Mechanical Engineering, TU-Košice, Letná 9, Košice, 040 01, Slovak Republic

Abstract

The article describes a stereoscopic camera system based on a pair of separate cameras. The cameras are stationary and tilted towards each other. Their tilting angle is derived with regard to the utilization of the chip area. By optimizing the tilting, the chip area can be used to the maximum, without unused parts. This increases the size of the cropped left and right images as well as their resolution. The results were tested and verified on a real stereoscopic system used for an exploratory mobile robot. © 2012 The Authors. Published by Elsevier Ltd.

Selection and/or peer-review under responsibility of the Branch Office of Slovak Metallurgical Society at Faculty of Metallurgy and Faculty of Mechanical Engineering, Technical University of Košice.

Keywords: mobile robots, stereovision, camera tilting, optimisation

1. Introduction

Stereovision - also called stereoscopic or 3-D imaging - is any technique capable of recording three-dimensional visual information or creating the illusion of depth in an image. Stereoscopy enhances the illusion of depth in a photo, movie, or other two-dimensional image by presenting a slightly different image to each eye, thereby adding the binocular cue of stereopsis. Several basic methods exist for obtaining pairs of stereoscopic images.

The first one is based on a pair of cameras that tilt against each other. The cameras must be capable of tilting so that their optical axes cross on the object of interest (OI). The principle is that the centre of the image of the actual scene must be projected onto the centres of both image sensors. The disadvantage of this solution is the need for a mechanical positioning device, together with an increase in size, weight and price and a decrease in reliability. An example is presented in [7].

The second method uses a single image sensor and optics. Part of the optical system is a camera lens that divides the image into two parts, each projected onto the left or right half of a common image sensor. The disadvantages are the portrait image format, which in practice requires additional cropping, a mostly poor aperture (suitable only for good light conditions), fixed focus, a large depth of field, etc. This method was earlier used mainly for taking static 3D images, not video. An example of this system is described in [9].

Another system is based on a pair of parallel cameras, i.e. lenses with parallel optical axes. The image sensors are movable and must be positioned so that the centre of the image of the actual scene is projected onto the centres of both image sensors – see [10]. The disadvantage of this system is the need for special cameras, special optics and very fine mechanics.

* Corresponding author. Tel.: +421-55-602-2469. E-mail address: zdenko.bobovsky@tuke.sk.



The fourth method uses a fixed tilt of a pair of cameras. The mutual tilt must ensure that the cameras' optical axes cross on the OI. The advantage is simplicity and the use of the whole image sensor area. The disadvantage of this method is that the OI must lie at a constant distance from the cameras. A similar principle based on a pair of tilted cameras, following Cameron's invention, is described in [8].

The fifth method of capturing stereovision is similar to the previous one and also consists of a pair of cameras with a fixed tilt. It allows changing the distance of the object of interest as well as the convergence angle. The drawback of this solution is that only part of the image sensor area is used, which decreases the resolution in the horizontal direction. The advantage of this system is the capability to take stereovision images (video) at various distances from the object of interest (various convergence angles). The disadvantage, in the case of a non-optimal tilt, is a significant decrease of the horizontal resolution. A detailed description of this system is presented in [11].

2. Parallel camera configuration

First, let us suppose a parallel camera configuration for stereovision without the possibility of reciprocal tilting. Whenever the OI does not lie on the optical axis (see axis o in Fig. 2), it is projected outside the image sensor centre – see point B in Fig. 2. When the OI lies on optical axis o, it is projected into the centre of the image sensor – see the position of the dashed sensor in Fig. 2. It is therefore necessary to use only part of the image sensor and to shift this cropped part – see the detail in Fig. 1. The same can be derived for the right camera. The final images will have a size in the y axis identical with the resolution of the image sensor (the number of rows is identical). The resolution in the x axis (number of columns) will equal the cropped image width and will always be smaller than the resolution of the original chip. The following consideration supposes the OI positioned on the central axis m of the stereoscopic system. Axis m lies in the middle between both cameras and is perpendicular to the camera plane – see Fig. 5.

Fig. 1. Image sensor (blue) and movable cropped image (green); x – size of the image sensor in the x axis, v – width of the cropped image

The ratio p of the horizontal chip size x to the width v of the cropped image, or, more practically, the ratio of their pixel counts in the x axis, is:

$$p = \frac{x}{v} \qquad (1)$$

Let us derive the minimal distance g between the camera plane and an OI lying between the cameras on axis m for a selected ratio p.

The distance e between the chip centre and the centre of its cropped part is:

$$e = \frac{x}{2} - \frac{v}{2} = \frac{x}{2} - \frac{x}{2p} = \frac{x}{2}\left(1 - \frac{1}{p}\right), \quad \text{for } x \ge v \qquad (2)$$


Fig. 2. The detail of left camera optics in parallel configuration

The distance b follows from the triangles P1-P2-f and P3-P2-B (see Fig. 2); the angles can be expressed and compared with each other:

$$\operatorname{tg}\beta = \frac{f}{G} = \frac{b}{G + e} \qquad (3)$$

The distance b between the image sensor and optical centre:

$$b = \frac{f}{G}\,(G + e) = f\left(1 + \frac{e}{G}\right) \qquad (4)$$

Substituting e we obtain:

$$b = f\left(1 + \frac{x}{2G} - \frac{x}{2pG}\right) \qquad (5)$$


The resultant distance g between the camera plane and an OI lying on axis m is:

$$\frac{1}{f} = \frac{1}{b} + \frac{1}{g} \;\rightarrow\; g = \frac{b\cdot f}{b - f} \qquad (6)$$
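The chain of equations (2), (5) and (6) is easy to check numerically. Below is a minimal Python sketch (function and variable names are illustrative, not from the paper); the sample values anticipate the example in Sections 3 and 4 (x = 3.6 mm, p = 2, f = 4 mm, G = s = 45 mm).

```python
# Minimal numerical sketch of equations (2), (5) and (6).
# All names are illustrative; distances are in millimetres.

def crop_offset_e(x, p):
    """Eq. (2): distance between the chip centre and the cropped-image centre."""
    return (x / 2.0) * (1.0 - 1.0 / p)

def image_distance_b(f, x, p, G):
    """Eq. (5): distance between the image sensor and the optical centre."""
    return f * (1.0 + x / (2.0 * G) - x / (2.0 * p * G))

def object_distance_g(f, b):
    """Eq. (6): thin-lens relation solved for the object distance."""
    return b * f / (b - f)

x, p, f, G = 3.6, 2.0, 4.0, 45.0                         # chip width, crop ratio, focal length, G = s
b = image_distance_b(f, x, p, G)
print(round(crop_offset_e(x, p), 2))                     # eq. (2): e = 0.9 mm
print(round(b, 2), round(object_distance_g(f, b), 1))    # eqs. (5)-(6): approx. 4.08 mm and 204 mm
```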

3. An example

We assume the real configuration of a stereovision system based on a pair of DFK 21BU04 USB cameras from The Imaging Source [5]. Each camera uses a 1/4" Sony ICX098BQ image sensor [1] and a T 0412 FICS-3 lens [3] with the following basic parameters:

Table 1. The parameters of the used stereovision system

sensor size (active): 3.6 (H) x 2.7 (V) mm
unit cell size: 5.6 µm (H) x 5.6 µm (V)
aspect ratio: 4:3
resolution: 640 (H) x 480 (V) pixels
distance between cameras: 2s = 90 mm
fixed focal length: f = 4 mm
MOD (Minimal Object Distance): 200 mm

Fig. 3. The implemented stereovision system based on DFK 21BU04 cameras with T 0412 FICS-3 lenses. (There is a laser distance sensor in the middle between the cameras.)

4. Substituting:

Number of pixels (x axis) of the cropped image: v = 340 (specified in our case). The OI lies in the middle between the cameras on axis m, therefore s = G.

$$p = \frac{640}{340} \approx 2$$

$$b = f\left(1 + \frac{x}{2s} - \frac{x}{2sp}\right) = 4\left(1 + \frac{3.6}{2\cdot 45} - \frac{3.6}{2\cdot 45\cdot 2}\right) = 4.08\ \text{mm} \qquad (7)$$


$$g = \frac{b\cdot f}{b - f} = \frac{4.08\cdot 4}{4.08 - 4} = 204\ \text{mm} \qquad (8)$$

The calculated shortest distance from the camera plane to the OI, with regard to the pre-specified size of the cropped image v (ratio p = 2), is approximately 200 mm. Since the used lens has MOD = 200 mm, the result is acceptable – it lies within the lens focusing range. In other words, the described system is capable of working with an OI situated on the central axis between both cameras at a distance from 0.2 m to infinity. The minimal distance between the camera plane and an OI lying on axis m is restricted by:

1. the parameter MOD of the used lens,
2. the size of the cropped image v, which must have a reasonable value.

Let us derive the size of the cropped image v, or rather the ratio p, for the case g = MOD (the OI lies at the MOD). It holds:

$$b = \frac{f\cdot MOD}{MOD - f} \qquad (9)$$

and also:

$$\frac{e}{G} = \frac{b}{g} \;\rightarrow\; e = \frac{s\cdot b}{MOD}, \quad \text{for } g = MOD \text{ and } G = s \qquad (10)$$

By substituting (9) into (10):

$$e = \frac{s\cdot f}{MOD - f} \qquad (11)$$

Expressing p from (2) and substituting e from (11):

$$p = \frac{x}{v} = \frac{x}{x - 2e} = \left(1 - \frac{2e}{x}\right)^{-1} = \left(1 - \frac{2\,s\,f}{x\,(MOD - f)}\right)^{-1} \qquad (12)$$

Let us evaluate the configuration MOD =200 mm, f=4mm and s=45mm. The result will be the appropriate size of cropped image v (and ratio p).

Substituting into (12):

$$p = \left(1 - \frac{2\cdot 45\cdot 4}{3.6\cdot(200 - 4)}\right)^{-1} = 2.04 \qquad (13)$$

Size of cropped image v :

$$v = \frac{x}{p} = \frac{3.6}{2.04} = 1.76\ \text{mm} \qquad (14)$$

or more practically in pixels:

$$v = \frac{640}{2.04} \cong 314\ \text{pixels} \qquad (15)$$
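These steps can be reproduced with a short script. The following is a minimal sketch of equations (11), (12), (14) and (15) under the parameter values above (variable names are illustrative; the rounding follows the worked example).

```python
# Minimal sketch of equations (11), (12), (14) and (15) for the case g = MOD.
# Variable names are illustrative; all lengths are in millimetres.
x, s, f, mod_mm = 3.6, 45.0, 4.0, 200.0   # chip width, half baseline, focal length, MOD

e = s * f / (mod_mm - f)                  # eq. (11): approx. 0.92 mm
p = round(1.0 / (1.0 - 2.0 * e / x), 2)   # eq. (12): 2.04
print(round(x / p, 2))                    # eq. (14): cropped width, approx. 1.76 mm
print(round(640 / p))                     # eq. (15): approx. 314 pixel columns
print(round(100 * e / x))                 # approx. 26 % of the chip is never used at infinity
```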


For the given lens parameters, MOD = 200 mm, focal length f = 4 mm, chip size x = 3.6 mm and camera distance 2s = 90 mm, the system is able to visualize an OI placed in the middle between both cameras from the MOD to an infinite distance. The size of the cropped image is 314 columns. In real conditions the size of the cropped image will be set to 320 pixels, which slightly increases the minimal OI distance.

From the above result it is evident that the cropped image covers only 49% (1.76 mm) of the original chip width (3.6 mm). When the OI lies at infinity – see Fig. 4 – the right part of the left chip (and similarly for the right chip) will never be used – the hatched part in Fig. 4. This amounts to about 0.92 mm (by substitution into (11)), i.e. about 26% of the chip area.

Fig. 4. OI lying at infinity

5. Tilted camera configuration

If the accent is put on maximal usage of the chip area, this can be achieved by tilting both cameras towards each other. From the right part of Fig. 5 it is visible that the tilt of the camera is half of the angle γ between optical axis o of the untilted camera (lens) and the line connecting the centre of the lens with an OI lying in the middle between both cameras at the MOD distance – see Fig. 5. The tilt angle is denoted ε. At first glance it is evident that the ratio of the cropped image to the chip size is significantly better – see the green cropped image of the blue highlighted chip and the green cropped image of the red highlighted chip. The positions of the crop rectangles also utilize the full chip area.

The size of tilt angle is:

$$\operatorname{tg}\gamma = \frac{s}{a} \;\rightarrow\; \varepsilon = \frac{\gamma}{2} = \frac{1}{2}\operatorname{arctg}\frac{s}{a} \qquad (16)$$

For example, we suppose the T 0412 FICS-3 lens (see [3]) and the system based on Table 1. After substitution:

$$\varepsilon = \frac{1}{2}\operatorname{arctg}\frac{45}{200} = 6.34^\circ \approx 6^\circ 20' \qquad (17)$$
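As a quick check, the tilt angle can be computed in a couple of lines; a sketch with illustrative names, taking s = 45 mm and a = MOD = 200 mm:

```python
# Tilt angle from equations (16)-(17); names are illustrative.
import math

s, a = 45.0, 200.0                       # half baseline and object distance (a = MOD), in mm
eps = 0.5 * math.degrees(math.atan(s / a))
print(round(eps, 2))                     # 6.34 degrees
```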


Fig. 5. Tilted camera – only the left camera is illustrated; left: before tilting, right: after tilting

From Fig. 5 it is possible to derive:

$$\cos\varepsilon = \frac{s}{2G} \;\rightarrow\; G = \frac{s}{2\cos\varepsilon} \qquad (18)$$

as well as:

$$\cos\varepsilon = \frac{g}{a + s\cdot\operatorname{tg}\varepsilon} \;\rightarrow\; g = (a + s\cdot\operatorname{tg}\varepsilon)\cos\varepsilon \qquad (19)$$

The following holds, derived from (9) and (19):

$$b = \frac{f\cdot g}{g - f} = \frac{f\,(a + s\cdot\operatorname{tg}\varepsilon)\cos\varepsilon}{(a + s\cdot\operatorname{tg}\varepsilon)\cos\varepsilon - f}, \qquad a = MOD \qquad (20)$$

and next:

$$\frac{e}{G} = \frac{b}{g} \;\rightarrow\; e = \frac{G\cdot b}{g} \qquad (21)$$

After substituting (18), (19) and (20) into (21) we get:


$$e = \frac{\dfrac{s}{2\cos\varepsilon}\cdot\dfrac{f\,(a + s\cdot\operatorname{tg}\varepsilon)\cos\varepsilon}{(a + s\cdot\operatorname{tg}\varepsilon)\cos\varepsilon - f}}{(a + s\cdot\operatorname{tg}\varepsilon)\cos\varepsilon} = \frac{s\,f}{2\cos\varepsilon\,\big[(a + s\cdot\operatorname{tg}\varepsilon)\cos\varepsilon - f\big]} \qquad (22)$$

By substituting the real values (properties) of the given system: (a=MOD)

$$e = \frac{1}{2\cos 6.34^\circ}\cdot\frac{45\cdot 4}{(200 + 45\cdot\operatorname{tg} 6.34^\circ)\cos 6.34^\circ - 4} = \frac{1}{2\cdot 0.99}\cdot\frac{45\cdot 4}{199.75} = 0.45\ \text{mm}$$

The ratio p, obtained by modifying (12) and substituting:

$$p = \frac{x}{x - 2e} = \left(1 - \frac{2e}{x}\right)^{-1} = \left(1 - \frac{2\cdot 0.45}{3.6}\right)^{-1} = 1.33 \qquad (23)$$

The size of cropped image v then will be:

$$v = \frac{x}{p} = \frac{3.6}{1.33} \cong 2.7\ \text{mm}$$

or more practically in pixels:

$$v = \frac{640}{1.33} \cong 480\ \text{pixels}$$

The result shows that the cropped image covers about 75% of the chip area. Compared with the result achieved by the parallel configuration, the crop rectangle grows by 50% of the chip size and the resolution increases accordingly. The original resolution (parallel cameras) was 320 x 480 pixels; with the cameras tilted by angle ε the resolution is 480 x 480 pixels. At the same time, the full chip area is used – see Fig. 6.
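The whole tilted-configuration chain (16)-(23) can be verified numerically; the sketch below uses the parameter values of Table 1 with a = MOD (names are illustrative, and e is rounded to 0.45 mm as in the text above).

```python
# Minimal sketch of the tilted-configuration chain, equations (16)-(23).
# Names are illustrative; all lengths are in millimetres.
import math

def tilted_offset_e(s, f, a):
    """Offset e of the cropped-image centre for the tilted configuration."""
    eps = 0.5 * math.atan(s / a)                        # eq. (16)
    G = s / (2.0 * math.cos(eps))                       # eq. (18)
    g = (a + s * math.tan(eps)) * math.cos(eps)         # eq. (19)
    b = f * g / (g - f)                                 # eq. (20)
    return G * b / g                                    # eqs. (21)-(22)

x = 3.6
e = round(tilted_offset_e(45.0, 4.0, 200.0), 2)         # 0.45 mm, rounded as in the text
p = 1.0 / (1.0 - 2.0 * e / x)                           # eq. (23): 1.33
print(e, round(p, 2), round(640 / p))                   # 0.45 1.33 480
```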

Fig. 6. OI lying at infinity with a tilted camera


When the OI lies at infinity, b = f is valid – the surface of the image chip is at distance f from the optical centre of the lens. Let us check the changes:

First we recalculate distance e (see Fig. 6):

$$e = f\cdot\operatorname{tg}\varepsilon = 4\cdot\operatorname{tg} 6.34^\circ = 0.44\ \text{mm}$$

By substitution into (23) we get ratio p:

$$p = 1.32$$

The cropped image width v is:

$$v = \frac{3.6}{1.32} \cong 2.72\ \text{mm}$$

or more clearly in pixels of the x axis:

$$v = \frac{640}{1.32} \cong 485\ \text{pixels}$$

When the OI lies at infinity, the cropped image – now projected closer to the optical centre (at distance f) – is 485 pixels wide, compared to 480 pixels in the x axis for an OI lying at the MOD.
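A corresponding sketch for the infinity case (b = f), with the same intermediate rounding as the text above, reproduces these numbers; names are again illustrative.

```python
# Infinity case for the tilted configuration (b = f); names are illustrative.
import math

x, f = 3.6, 4.0                                    # chip width and focal length in mm
eps = 0.5 * math.atan(45.0 / 200.0)                # tilt angle from eq. (16)
e = round(f * math.tan(eps), 2)                    # 0.44 mm, rounded as in the text
p = round(1.0 / (1.0 - 2.0 * e / x), 2)            # eq. (23): 1.32
print(round(640 / p))                              # approx. 485 pixel columns
```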

In practice, these 5 extra columns of pixels (for example, 2 at the left and 3 at the right border of the cropped image) will not be used, so about 3 columns of the chip remain unused. A tilt angle giving full use of the chip area could be derived, but the calculation would be more complex.

In a similar way, we can derive the equations for the tilting angle of cameras equipped with a special lens focusing not from MOD to infinity, but from MOD to MAX. For example, the lens H0514-MP [4] can focus from 0.1 to 0.9 m, which means MOD = 100 mm and MAX = 0.9 m. Its focal length is f = 5 mm and it is intended for a 1/2" chip.

6. Conclusion

Some stereoscopic camera systems are based on a pair of cameras. The cameras can be arranged in parallel or tilted towards each other by angle ε. By optimizing the tilting we can maximally utilize the chip area, without unused parts, and increase the size of the cropped left and right images as well as their resolution. The equations derived above, together with the examples, present in detail the procedure for optimizing the camera tilting. The results were verified on a real stereovision system (see Fig. 3) and they confirm – see Table 2 – the positive effect of the optimized tilting of the cameras. In the case of parallel cameras the size of the cropped image is approximately 320 (H) x 480 (V) pixels and the efficiency of chip area utilization is about 74%. In the case of the optimized tilt ε = 6.34° the resolution is higher – the size of the cropped image is approximately 480 (H) x 480 (V) pixels and the efficiency of chip area utilization is close to 100%.

Table 2. The influence of tilting on the efficiency of the chip area and cropped image resolution

lens: T 0412 FICS-3; chip: 3.6 x 2.7 mm / 640 x 480 pixels; f = 4 mm; MOD = 200 mm; MAX = Inf.

tilt 0°: cropped image 49% / 320 x 480 pixels, efficiency 74%
tilt 6.34°: cropped image 75% / 480 x 480 pixels, efficiency 100%

According to the derived equations for optimal camera tilting, a stereoscopic system for a mobile robot surveyor was developed. The stereoscopic system is mounted on the robot's arm (see Fig. 7) and the output video signal, after processing, is displayed in a 3D VR helmet [13].


Fig. 7. Application of the stereoscopic system on a mobile robot (inspection of the armoured carrier - NATO days, Ostrava, Czech Republic 2010)

Acknowledgements

This article was compiled as a part of project FT-TA3/014, supported by the Fund for University Development from the Ministry of Industry and Trade. Also this article is the result of the project implementation: Centre for research of control of technical, environmental and human risks for permanent development of production and products in mechanical engineering (ITMS:26220120060) supported by the Research & Development Operational Programme funded by the ERDF.

References

[1] Sony CCD image sensor ICX098BQ: <http://www.unibrain.com/download/pdfs/Fire-i_Board_Cams/ICX098BQ.pdf> [2] <http://www.theimagingsource.com/en_US/products/optics/lenses/> [3] Lens T 0412 FICS-3 <http://www.theimagingsource.com/downloads/t0412fics3.en_US.pdf> [4] Lens H0514-MP <http://www.theimagingsource.com/downloads/h0514mp.en_US.pdf> [5] Camera DKF21BU04: <http://www.theimagingsource.com/en_US/products/cameras/usb-ccd-color/dfk21bu04/> [6] Wikipedia: Image sensor format <http://en.wikipedia.org/wiki/Image_sensor_format)> [7] United States Patent 6701081: Dual camera mount for stereo imaging, <http://www.freepatentsonline.com/6701081.pdf> [8] United States Patent 7643748: Platform for stereoscopic image acquisition, <http://www.freepatentsonline.com/7643748.pdf> [9] United States Patent 4464028: Motion picture system for single strip 3-D filming, <http://www.freepatentsonline.com/4464028.pdf> [10] United States Patent 5063441: Stereoscopic video cameras with image sensors having variable effective position,

<http://www.freepatentsonline.com/5063441.pdf)> [11] United States Patent Application 0040070667: Electronic stereoscopic imaging system, <http://www.freepatentsonline.com/20040070667.pdf> [12] Erthardt, A., 2000. Ferron: Theory and Applications of Digital Image Processing. University of Applied Sciences Offenburg, 2000, 32pp. [13] NOVÁK, P. , TVAR ŽKA, A., 2006. Control and stereovision subsystems for remote controlled mobile robot surveyor. Transactions of the VŠB-

TU of Ostrava, vol. LII, Mechanical series, no. 1, pp. 145-156. ISSN 1210–0471. [14] ŠPA EK, P., NOVÁK, P., MOSTÝN, V., 2009. Visualisation of distances determined by stereometry. Transactions of the VŠB-TU of Ostrava, vol.

LV, Mechanical series, No. 1, pp. 251-256. ISSN 1210–0471. [15] ŠPA EK, P., NOVÁK, P., 2008. Determine accuracy of measuring by camera. Transactions of the ERIN 2008. Bratislava : SjF STU Bratislava,

ISBN 978-80-227-2849-2.