Tilting Optimization of Fixed Stereovision Cameras
Procedia Engineering 48 ( 2012 ) 479 – 488
1877-7058 © 2012 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of the Branch Office of Slovak Metallurgical Society at Faculty of Metallurgy and Faculty of Mechanical Engineering, Technical University of Košice. doi: 10.1016/j.proeng.2012.09.542
MMaMS 2012
Tilting optimization of fixed stereovision cameras
Petr Novak a, Vladimír Mostyn a, Václav Krys a, Zdenko Bobovský b*
a Department of Robotics, Faculty of Mechanical Engineering, VŠB-TU Ostrava, 17. listopadu 15, Ostrava-Poruba, 708 33, Czech Republic
bDepartment of Applied Mechanics and Mechatronics, Faculty of Mechanical Engineering, TU-Košice, Letná 9, Košice, 040 01, Slovak Republic
Abstract
The article describes a stereoscopic camera system based on a pair of separate cameras. The cameras are stationary and tilted towards each other. The value of their tilt angle is derived with regard to the utilization of the chip area. By optimizing the tilt, the chip area can be utilized maximally, without unused parts. This increases the size of the cropped left and right images as well as their resolution. The results were tested and verified on a real stereoscopic system used on an exploratory mobile robot.
Keywords: mobile robots, stereovision, camera tilting, optimisation
1. Introduction
Stereovision - also called stereoscopic or 3-D imaging - is any technique capable of recording three-dimensional visual information or creating the illusion of depth in an image. Stereoscopy is the enhancement of the illusion of depth in a photo, movie, or other two-dimensional image by presenting a slightly different image to each eye, thereby adding the first of these cues (stereopsis). Several basic methods exist for obtaining pairs of stereoscopic images.
The first one is based on a pair of cameras that tilt towards each other. The cameras must be capable of tilting because their optical axes have to cross on the object of interest (OI). The principle is that the centre of the image of the actual scene must be projected onto the centres of both image sensors. The disadvantage of this solution is the need for a mechanical positioning device, which increases size, weight and price and decreases reliability. An example is presented in [7].
The second method uses a single image sensor and special optics. A part of the optical system, composed with the camera lens, divides the image into two parts; each part is projected onto the left or right half of the common image sensor. The disadvantages are the portrait image format (in practice requiring additional cropping), a mostly poor aperture (suitable only for good light conditions), fixed focus, a large depth of field, etc. This method was earlier used mainly for taking static 3D images, not video. An example of this system is described in [9].
Another system is based on a pair of parallel cameras, i.e. lenses with parallel optical axes. The image sensors are moveable and must be positioned so that the centre of the image of the actual scene is projected onto the centres of both image sensors – see [10]. The disadvantage of this system is the use of special cameras and optics and very fine mechanics.
* Corresponding author. Tel.: +00421-55-602-2469. E-mail address: zdenko.bobovsky@tuke.sk.
The fourth method uses a fixed tilt of a pair of cameras. The mutual tilt must ensure that the cameras' optical axes cross on the OI. The advantage is simplicity and usage of the whole image sensor area. The disadvantage of this method is that the OI must lie at a constant distance from the cameras. A similar principle of a pair of tilted cameras is based on James Cameron's invention and is described in [8].
The fifth method of capturing stereovision is similar to the previous one: it also consists of a pair of fixed, tilted cameras, but it is possible to change the distance of the object of interest as well as the convergence angle. The advantage of this system is the capability to take stereovision images (video) at various distances from the object of interest (various convergence angles). The disadvantage – in case of a non-optimal tilt – is that only part of the image sensor area is used and the horizontal resolution decreases significantly. A detailed description of this system is presented in [11].
2. Parallel camera configuration
First, let us suppose a parallel camera configuration for stereovision, without the possibility of reciprocal tilting. Whenever the OI does not lie on the optical axis (see axis o in Fig. 2), it is projected outside the image sensor centre – see point B in Fig. 2. When the OI lies on the optical axis o, it is projected onto the centre of the image sensor – see the position of the dashed sensor in Fig. 2. It is therefore necessary to use only a part of the image sensor and to move this cropped region – see the detail in Fig. 1. The same can be derived for the right camera. The final images will have a size in the y axis identical to the image sensor resolution (the number of rows is unchanged). The resolution in the x axis (number of columns) will be equal to the cropped image width and will always be smaller than the resolution of the original chip. The following consideration supposes the OI on the central axis m of the stereoscopic system. Axis m lies in the middle between both cameras and is perpendicular to the camera plane – see Fig. 5.
Fig. 1. Image sensor (blue) and moveable cropped image (green), where x is the width of the image sensor and v the width of the cropped image
The crop ratio p is defined as the ratio of the chip width x (H) to the cropped image width v, or – more practically – the ratio of their pixel counts in the x axis:

p = x/v (1)
Let us derive the minimal distance g between the camera plane and an OI lying between the cameras on axis m for a selected crop ratio p.
The distance e between the chip centre and the centre of its cropped part is:

e = x/2 − v/2 = x/2 − x/(2p) = (x/2)·(1 − 1/p), for x ≥ v (2)
Fig. 2. The detail of the left camera optics in the parallel configuration
The distance b: from the triangles P1-P2-f and P3-P2-B in Fig. 2 it is possible to express the angle β in two ways and compare:

tan β = f/G = b/(G + e) (3)
The distance b between the image sensor and the optical centre:

b = f·(G + e)/G = f·(1 + e/G) (4)
Substituting e from (2) we obtain:

b = f·(1 + x/(2G) − x/(2pG)) (5)
The resultant distance g between the camera plane and the OI lying on axis m follows from the thin-lens equation:

1/f = 1/b + 1/g → g = b·f/(b − f) (6)
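As a quick numerical check (ours, not part of the original paper), equations (2), (4) and (6) can be sketched in Python; the function and variable names are our own, and the sample values are taken from the example in the next section (x = 3.6 mm, p = 2, f = 4 mm, G = 45 mm):

```python
def crop_offset(x, p):
    """Eq. (2): distance e between the chip centre and the centre of its cropped part."""
    return (x / 2.0) * (1.0 - 1.0 / p)

def image_distance(f, G, e):
    """Eq. (4): distance b between the image sensor and the optical centre."""
    return f * (1.0 + e / G)

def object_distance(f, b):
    """Eq. (6): distance g of the OI from the camera plane (thin-lens equation)."""
    return b * f / (b - f)

e = crop_offset(3.6, 2.0)         # 0.9 mm
b = image_distance(4.0, 45.0, e)  # 4.08 mm
g = object_distance(4.0, b)       # 204 mm
print(e, b, g)
```

The same functions reproduce the values b = 4.08 mm and g = 204 mm obtained in equations (7) and (8) below.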
3. An example
We assume a real configuration of the stereovision system based on a pair of DKF21BU04 USB cameras from The Imaging Source [5]. Each camera uses a 1/4“ Sony ICX098BQ image sensor [1] and a T 0412 FICS-3 lens [3] with the following basic parameters:
Table 1. The parameters of the used stereovision system

sensor size (active): 3.6 (H) × 2.7 (V) mm
unit cell size: 5.6 µm (H) × 5.6 µm (V)
aspect ratio: 4:3
resolution: 640 (H) × 480 (V) pixels
distance between cameras: 2s = 90 mm
fixed focal length: f = 4 mm
MOD (Minimum Object Distance): 200 mm
Fig. 3. The implemented stereovision system based on DKF21BU04 cameras with T 0412 FICS-3 lenses. (There is a laser distance sensor in the middle between the cameras.)
4. Substituting

The number of pixels (x axis) of the cropped image is chosen as v = 320, and the OI lies in the middle between the cameras on axis m, so G = s = 45 mm. Then:

p = 640/320 = 2

b = f·(1 + x/(2s) − x/(2ps)) = 4·(1 + 3.6/(2·45) − 3.6/(2·2·45)) = 4.08 mm (7)
g = b·f/(b − f) = 4.08·4/(4.08 − 4) = 204 mm (8)
The calculated shortest distance between the OI and the camera plane, with regard to the pre-specified cropped image size v (ratio p = 2), is approx. 200 mm. Since the used lens has MOD = 200 mm, the result is acceptable – it lies within the lens focusing range. In other words, the described system is capable of working with an OI situated on the central axis of both cameras at a distance from 0.2 m to infinity. The minimal distance between the camera plane and the OI (lying on axis m) is restricted by:
1. the MOD parameter of the used lens,
2. the size of the cropped image v, which must keep a reasonable value.

Let us derive the size of the cropped image v – or rather the ratio p – for the case g = MOD (the OI lies at the MOD). It is valid:
b = MOD·f/(MOD − f) (9)
and also:

e/G = b/g; for an OI on axis m, g = MOD and G = s, so e = s·b/MOD (10)
By substituting (9) into (10):

e = s·f/(MOD − f) (11)
Expressing p from (2) and substituting e from (11):

p = x/(x − 2e) = (1 − 2e/x)^(−1) = (1 − 2·s·f/((MOD − f)·x))^(−1) (12)
Let us evaluate the configuration MOD = 200 mm, f = 4 mm and s = 45 mm. The result will be the appropriate size of the cropped image v (and ratio p).
Substituting into (12):

p = (1 − 2·45·4/((200 − 4)·3.6))^(−1) = 2.04 (13)
Size of the cropped image v:

v = x/p = 3.6/2.04 = 1.76 mm (14)
or more practically in pixels:

v = 640/2.04 ≅ 314 pixels (15)
For the given parameters – MOD = 200 mm, focal length f = 4 mm, chip size x = 3.6 mm and camera distance 2s = 90 mm – the system is able to visualize an OI placed in the middle between both cameras from the MOD to an infinite distance. The width of the cropped image is 314 columns. In practice, the cropped image width will be set to 320 pixels, which slightly increases the minimal OI distance.
From the above result it is evident that the width of the cropped image is only 49 % (1.76 mm) of the original (3.6 mm). When the OI lies at infinity – see Fig. 4 – the right part of the left chip (and, in a similar way, part of the right chip) will never be used – the hatched part in Fig. 4. This amounts to about 0.92 mm (by substitution into (11)), i.e. about 26 % of the chip area.
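A short Python sketch (ours, not from the paper) evaluates equations (11), (12), (14) and (15) for this configuration; note that the exact value of v comes out near 313 columns, which the paper rounds to 314 via p ≈ 2.04:

```python
MOD = 200.0   # minimal object distance [mm]
f = 4.0       # focal length [mm]
s = 45.0      # half of the camera baseline (2s = 90 mm) [mm]
x = 3.6       # chip width [mm]
cols = 640    # chip resolution in the x axis [pixels]

e = s * f / (MOD - f)          # eq. (11): about 0.92 mm
p = 1.0 / (1.0 - 2.0 * e / x)  # eq. (12): about 2.04
v_mm = x / p                   # eq. (14): about 1.76 mm
v_px = cols / p                # eq. (15): about 313-314 columns
print(round(e, 2), round(p, 2), round(v_mm, 2), round(v_px))
```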
Fig. 4. OI lying at infinity
5. Tilt camera configuration
If the accent is put on maximal usage of the chip area, this can be achieved by tilting both cameras towards each other. From the right part of Fig. 5 it is visible that the tilt ε of the camera is half of the angle γ between the optical axis o of the untilted camera (lens) and the line connecting the centre of the lens with the OI lying in the middle between both cameras at the MOD distance – see Fig. 5. At first glance it is evident that the ratio of the chip size to the cropped image is significantly better – see the green cropped image of the blue highlighted chip and the green cropped image of the red highlighted chip. The positions of the crop rectangles also utilize the full chip area.
The size of the tilt angle ε is:

tan γ = s/a → ε = (1/2)·arctan(s/a) (16)
For example, let us suppose the T 0412 FICS-3 lens (see [3]) and the system based on Tab. 1. After substitution:
ε = (1/2)·arctan(45/200) = 6°20′ ≅ 6.34° (17)
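Equation (17) can be verified with a short Python check (ours):

```python
import math

s, a = 45.0, 200.0  # half of the camera baseline and MOD [mm]
eps_deg = math.degrees(0.5 * math.atan(s / a))  # eq. (16)
print(round(eps_deg, 2))  # about 6.34 degrees
```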
Fig. 5. Tilted camera – only the left camera is illustrated. Left: before tilting; right: after tilting
From Fig. 5 it is possible to derive:

cos ε = s/(2G) → G = s/(2·cos ε) (18)
as well as:

cos ε = g/(a + s·tan ε) → g = (a + s·tan ε)·cos ε (19)
With a = MOD, it holds – derived from (9) and (19):

b = g·f/(g − f) = f·(a + s·tan ε)·cos ε / ((a + s·tan ε)·cos ε − f) (20)
and next:

e/G = b/g → e = G·b/g (21)
And after substituting (18), (19) and (20) into (21) we get:

e = s·f / (2·cos ε·((a + s·tan ε)·cos ε − f)) (22)
By substituting the real values (properties) of the given system (a = MOD):

e = 45·4 / (2·cos 6.34°·((200 + 45·tan 6.34°)·cos 6.34° − 4)) = 45·4/(2·0.99·199.75) ≅ 0.45 mm
Ratio p – by modification of (12) and by substitution:

p = x/(x − 2e) = 3.6/(3.6 − 2·0.45) = 1.33 (23)
The size of the cropped image v then will be:

v = x/p = 3.6/1.33 = 2.7 mm
or more practically in pixels:

v = 640/1.33 ≅ 480 pixels
The result shows that the width of the cropped image is about 75 % of the chip. Compared with the result achieved by parallel cameras, the crop rectangle is 50 % wider and the resolution rises accordingly: the original resolution (parallel cameras) was 320×480 pixels, while with the cameras tilted by angle ε it is 480×480 pixels. At the same time, the full chip area is used – see Fig. 6.
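The tilted-camera chain (16), (19), (22) and (23) can be evaluated numerically with the following Python sketch (ours, not from the paper). Exact evaluation gives p ≈ 1.337 and v ≈ 479 columns; the paper's figures 1.33 and 480 follow from intermediate rounding:

```python
import math

f, s, x, a, cols = 4.0, 45.0, 3.6, 200.0, 640  # a = MOD [mm]

eps = 0.5 * math.atan(s / a)                 # eq. (16), about 6.34 deg
g = (a + s * math.tan(eps)) * math.cos(eps)  # eq. (19)
e = s * f / (2.0 * math.cos(eps) * (g - f))  # eq. (22), about 0.45 mm
p = x / (x - 2.0 * e)                        # eq. (23), about 1.33-1.34
v_px = cols / p                              # about 479-480 columns
print(round(e, 2), round(p, 3), round(v_px))
```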
Fig. 6. OI lying at infinity with a tilted camera
When the OI lies at infinity, b = f is valid – the surface of the image chip is at distance f from the centre of the lens optical system. Let us check the changes:
First we recalculate the distance e (see Fig. 6):

e = f·tan ε = 4·tan 6.34° = 0.44 mm
By substitution into (23) we get the ratio p:

p = 1.32
The cropped image width v is:

v = 3.6/1.32 ≅ 2.72 mm
or more clearly in pixels of the x axis:

v = 640/1.32 ≅ 485 pixels
When the OI lies at infinity, the recalculated width of the cropped image – which now lies closer to the optical centre (the distance equals f) – is 485 pixels, compared to 480 pixels in the x axis for an OI lying at the MOD. In practice, these 5 extra columns of pixels (for example 2 at the left and 3 at the right border of the cropped image) will not be used, and a few columns of the chip remain unused. A camera tilt angle exploiting the chip area fully could be derived, but at the cost of a considerably more complex calculation.
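For the OI at infinity, b = f and the offset reduces to e = f·tan ε. A Python check (ours) gives v ≈ 482 columns when evaluated exactly; the paper's 485 comes from rounding p to 1.32:

```python
import math

f, s, x, a, cols = 4.0, 45.0, 3.6, 200.0, 640  # a = MOD [mm]

eps = 0.5 * math.atan(s / a)   # tilt angle, about 6.34 deg
e_inf = f * math.tan(eps)      # about 0.44 mm
p_inf = x / (x - 2.0 * e_inf)  # about 1.32-1.33
v_inf = cols / p_inf           # about 482 columns
print(round(e_inf, 2), round(p_inf, 2), round(v_inf))
```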
In a similar way, we can derive the equations for the tilt angle of cameras equipped with a special lens focusing not from the MOD to infinity, but from the MOD to a maximal distance MAX. For example, the lens H0514-MP [4] can focus from 0.1 to 0.9 m, which means MOD = 100 mm and MAX = 0.9 m. Its focal length is f = 5 mm and it is intended for a 1/2“ chip.
6. Conclusion
Some stereoscopic camera systems are based on a pair of cameras. They can be arranged in parallel, or they can be tilted towards each other by an angle ε. By optimizing the tilt we can utilize the chip area maximally, without unused parts, and increase the size of the cropped left and right images as well as their resolution. The equations derived above, together with the examples, present in detail the procedure for optimizing the camera tilt. The results were verified on a real stereovision system (see Fig. 3) and confirm – see Table 2 – the positive effect of the optimized tilting of the cameras. In the case of parallel cameras, the size of the cropped image is approx. 320(H) × 480(V) pixels and the efficiency of chip area utilization is about 74 %. In the case of the optimized tilt ε = 6.34°, the resolution is higher – the size of the cropped image is approx. 480(H) × 480(V) pixels and the efficiency of chip area utilization is approx. 100 %.
Table 2. The influence of tilting on the efficiency of chip area utilization and on the cropped image resolution

lens: T 0412 FICS-3; chip: 3.6 × 2.7 mm / 640×480 pixels; f = 4 mm; MOD = 200 mm; MAX = inf.

tilt [°]   cropped image [%] / [pixels]   efficiency [%]
0          49 / 320×480                   74
6.34       75 / 480×480                   100
According to the derived equations for optimal camera tilting, a stereoscopic system for a mobile surveying robot was developed. The stereoscopic system is mounted on the robot's arm (see Fig. 7) and the output video signal – after processing – is displayed in a 3D VR helmet [13].
Fig. 7. Application of the stereoscopic system on a mobile robot (inspection of the armoured carrier - NATO days, Ostrava, Czech Republic 2010)
Acknowledgements
This article was compiled as a part of project FT-TA3/014, supported by the Fund for University Development from the Ministry of Industry and Trade. Also this article is the result of the project implementation: Centre for research of control of technical, environmental and human risks for permanent development of production and products in mechanical engineering (ITMS:26220120060) supported by the Research & Development Operational Programme funded by the ERDF.
References
[1] Sony CCD image sensor ICX098BQ: <http://www.unibrain.com/download/pdfs/Fire-i_Board_Cams/ICX098BQ.pdf>
[2] <http://www.theimagingsource.com/en_US/products/optics/lenses/>
[3] Lens T 0412 FICS-3: <http://www.theimagingsource.com/downloads/t0412fics3.en_US.pdf>
[4] Lens H0514-MP: <http://www.theimagingsource.com/downloads/h0514mp.en_US.pdf>
[5] Camera DKF21BU04: <http://www.theimagingsource.com/en_US/products/cameras/usb-ccd-color/dfk21bu04/>
[6] Wikipedia: Image sensor format <http://en.wikipedia.org/wiki/Image_sensor_format>
[7] United States Patent 6701081: Dual camera mount for stereo imaging, <http://www.freepatentsonline.com/6701081.pdf>
[8] United States Patent 7643748: Platform for stereoscopic image acquisition, <http://www.freepatentsonline.com/7643748.pdf>
[9] United States Patent 4464028: Motion picture system for single strip 3-D filming, <http://www.freepatentsonline.com/4464028.pdf>
[10] United States Patent 5063441: Stereoscopic video cameras with image sensors having variable effective position, <http://www.freepatentsonline.com/5063441.pdf>
[11] United States Patent Application 20040070667: Electronic stereoscopic imaging system, <http://www.freepatentsonline.com/20040070667.pdf>
[12] Erhardt-Ferron, A., 2000. Theory and Applications of Digital Image Processing. University of Applied Sciences Offenburg, 32 pp.
[13] Novák, P., Tvarůžka, A., 2006. Control and stereovision subsystems for remote controlled mobile robot surveyor. Transactions of the VŠB-TU of Ostrava, vol. LII, Mechanical series, no. 1, pp. 145-156. ISSN 1210-0471.
[14] Špaček, P., Novák, P., Mostýn, V., 2009. Visualisation of distances determined by stereometry. Transactions of the VŠB-TU of Ostrava, vol. LV, Mechanical series, no. 1, pp. 251-256. ISSN 1210-0471.
[15] Špaček, P., Novák, P., 2008. Determine accuracy of measuring by camera. Transactions of the ERIN 2008. Bratislava: SjF STU Bratislava, ISBN 978-80-227-2849-2.