Squaring the Circle in Panoramas Abstract

in: Proceedings of the Tenth IEEE International Conference on Computer Vision, pp. 1292-1299, Beijing, China, October 15-21, 2005.

Squaring the Circle in Panoramas

Lihi Zelnik-Manor¹, Gabriele Peters², Pietro Perona¹

1. Dept. of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
2. Informatik VII (Graphische Systeme), Universität Dortmund, Dortmund, Germany

http://www.vision.caltech.edu/lihi/SquarePanorama.html

Abstract

Pictures taken by a rotating camera cover the viewing sphere surrounding the center of rotation. Having a set of images registered and blended on the sphere, what is left to be done in order to obtain a flat panorama is projecting the spherical image onto a picture plane. This step is unfortunately not obvious: the surface of the sphere may not be flattened onto a page without some form of distortion. The objective of this paper is to discuss the difficulties and opportunities that are connected to the projection from viewing sphere to image plane. We first explore a number of alternatives to the commonly used linear perspective projection. These are ‘global’ projections and do not depend on image content. We then show that multiple projections may coexist successfully in the same mosaic: these projections are chosen locally and depend on what is present in the pictures. We show that such multi-view projections can produce more compelling results than the global projections.

1 Introduction

As we explore a scene we turn our eyes and head and capture images in a wide field of view. For millennia painters and (more recently) photographers have grappled with the problem of creating pictures that render the visual impression of ‘being there’. Recent advances in storage, computation and display technology have made it possible to develop ‘virtual reality’ environments where the user feels ‘immersed’ in a virtual scene and can explore it by moving within it. However, the humble still picture, painted or printed on a flat surface, is still a popular medium: it is inexpensive to reproduce, easy and convenient to carry, store and display. Even more importantly, it has unrivaled size, resolution and contrast. Furthermore, the advent of inexpensive digital cameras, their seamless integration with computers, and recent progress in detecting and matching informative image features [4], together with the development of good blending techniques [7, 5], have made it possible for any amateur photographer to automatically produce mosaics of photographs covering very wide fields of view and conveying the vivid visual impression of large panoramas, something that so far was the exclusive preserve of the artist. Such mosaics are superior to panoramic pictures taken with conventional fish-eye lenses in many respects: they may span wider fields of view, they have unlimited resolution, they make use of cheaper optics and they are not restricted to the projection geometry imposed by the lens.

The geometry of single view point panoramas has long been well understood [12, 21]. This has been used for mosaicing of video sequences (e.g., [13, 20]) as well as for obtaining super-resolution images (e.g., [6, 23]). By contrast, when the point of view changes the mosaic is ‘impossible’ unless the structure of the scene is very special. Let us explore for a moment the ‘easy’ case, where all pictures share the same center of projection C. If we consider the viewing sphere, i.e., the unit sphere centered at C, we may identify each pixel in each picture with the ray connecting C with that pixel and passing through the surface of the viewing sphere, as well as through the physical point in the scene that is imaged by that pixel. By detecting and matching visual features in different images we may automatically register the images with respect to each other. We may then map every pixel of every image we collected to the corresponding point of the viewing sphere and obtain a spherical image that summarizes all our information on the scene. This spherical image is the most natural representation: we may represent this way a scene of arbitrary angular width, and if we place our head at C, the center of the sphere, we may rotate it around and capture the same images as if we were in the scene.
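The mapping from a pixel to a point of the viewing sphere can be sketched as follows. This is a minimal illustration assuming an ideal pinhole camera; the intrinsics (`focal`, `cx`, `cy`) and the per-image `rotation` matrix are hypothetical inputs that would normally come from the registration step, and none of these names appear in the paper.

```python
import math

def pixel_to_sphere(u, v, focal, cx, cy, rotation):
    """Map pixel (u, v) to a unit vector on the viewing sphere.

    focal, cx, cy: assumed pinhole intrinsics, in pixels.
    rotation: assumed 3x3 camera-to-sphere rotation (nested lists).
    """
    # Ray through the pixel in camera coordinates (z is the optical axis).
    ray = (u - cx, v - cy, focal)
    length = math.sqrt(sum(c * c for c in ray))
    ray = [c / length for c in ray]
    # Rotate the ray into the common frame of the viewing sphere.
    return tuple(sum(rotation[i][j] * ray[j] for j in range(3)) for i in range(3))

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# A 90 degree pan: the optical axis (0, 0, 1) is sent to (1, 0, 0).
PAN_90 = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]]
```

Every image contributes rays on the same unit sphere, so once the rotations are known the individual pictures no longer need to be treated separately.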

What is left to be done, in order to obtain our panorama-on-a-page, is projecting the spherical image onto a picture plane. This step is unfortunately not obvious: the surface of the sphere may not be flattened onto a page without some form of distortion. The choice of projection from the sphere to the plane has been dealt with extensively by painters and cartographers. An excellent review is provided in [9].

The best known projection is linear perspective (also called ‘gnomonic’ and ‘rectilinear’). It may be obtained by projecting the relevant points of the viewing sphere onto a tangent plane, by means of rays emanating from the center of the sphere C. Linear perspective became popular amongst painters during the Renaissance. Brunelleschi is credited with being the first to use correct linear perspective. Alberti wrote the first textbook on linear perspective describing the main construction methods [1]. It is believed by many to be the only ‘correct’ projection because it maps lines in 3D space to lines on the 2D image plane and because when the picture is viewed from one special point, the ‘center of projection’ of the picture, the retinal image that is obtained is the same as when observing the original scene. A further, somewhat unexpected, virtue is that perspective pictures look ‘correct’ even if the viewer moves away from the center of projection, a very useful phenomenon called ‘robustness of perspective’ [18, 22].

Unfortunately, linear perspective has a number of drawbacks. First of all, it may only represent scenes that are at most 180° wide: as the field of view becomes wider, the area of the tangent plane dedicated to representing one degree of visual angle in the peripheral portion of the picture becomes very large compared to the center, and eventually becomes unbounded. Second, there is an even more stringent limit to the size of the visual field that may be represented successfully using linear perspective: beyond widths of 30°-40° architectural structures (parallelepipeds) appear to be distorted, despite the fact that their edges are straight [18, 14]. Furthermore, spheres that are not in the center of the viewing field project to ellipses onto the image plane and appear unnatural and distorted [18] (see Fig. 1). A similar phenomenon affects cylinders. Renaissance painters knew of these shortcomings and adopted a number of corrective measures [14], some of which we will discuss later.
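The growth of the peripheral scale can be checked numerically. The sketch below assumes a unit viewing sphere and a picture plane tangent at longitude 0, latitude 0; the function names are illustrative and not taken from the paper. A point at eccentricity θ lands at distance tan θ from the tangent point, so the radial scale is 1/cos²θ, which diverges as θ approaches 90°.

```python
import math

def gnomonic(lon, lat):
    """Linear perspective onto the plane tangent at lon = 0, lat = 0 (radians)."""
    cos_c = math.cos(lat) * math.cos(lon)  # cosine of the eccentricity
    if cos_c <= 1e-12:
        raise ValueError("point is 90 degrees or more from the tangent point")
    x = math.cos(lat) * math.sin(lon) / cos_c
    y = math.sin(lat) / cos_c
    return x, y

def radial_scale(ecc):
    """Plane length per unit of visual angle, d(tan ecc)/d(ecc)."""
    return 1.0 / math.cos(ecc) ** 2

for deg in (0, 20, 40, 60, 80):
    print(deg, round(radial_scale(math.radians(deg)), 2))
```

At 60° eccentricity one degree of visual angle already takes about four times the length it takes at the center of the picture.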

The objective of this paper is to discuss the difficulties and opportunities that are connected to the projection from viewing sphere to image plane, in the context of digital image mosaics. We first explore a number of alternatives to linear perspective which were developed by painters and cartographers. These are ‘global’ projections and do not depend on image content. We explore experimentally the tradeoffs of these projections: how they distort architecture and people and how well they tolerate wide fields of view. We then show that multiple projections may coexist successfully in the same mosaic: these projections are chosen locally and depend on what is seen in the pictures that form the mosaic. We conclude with a discussion of the work that lies ahead.

In this paper we do not address issues of image registration and image blending and instead rely on the code by Brown and Lowe [4, 2] for our experiments.

Figure 1: Perspective distortions. Left: Five photographs of the same person taken by a rotating camera, after rectification (removing spherical lens distortion). Right: An overlay of the five photographs after blackening everything but the person’s face. This shows that spherical objects look distorted under perspective projection even at mild viewing angles. For example, in the above figure, the centers of the faces in the corners are at ∼20° horizontal eccentricity.

2 Global Projections

What are the alternatives to linear perspective?

An important drawback of linear perspective is the excessive scaling of sizes at high eccentricities. Consider a painter taking measurements in the scene by using her thumb and using these measurements to scale objects on the canvas. She takes angular measurements in the scene and translates them into linear measurements onto the canvas. This construction is called Postel projection [9]. It avoids the ‘explosion’ of sizes in the periphery of the picture. Along lines radiating from the point where the picture plane touches the viewing sphere, it actually maps lengths on the sphere to equal lengths in the image. Lines that run orthogonal to those (i.e., concentric circles around the tangent point) will be magnified at higher eccentricities, but much less than by linear perspective. The Postel projection is close to the cartographic stereographic projection. The stereographic projection is obtained by using the pole opposite to the point of tangency as the center of projection.
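The three azimuthal projections mentioned so far differ only in how the eccentricity θ (the angle from the tangent point) is turned into a radius on the picture. A small sketch, assuming a unit viewing sphere and using the standard textbook radial formulas (the function names are illustrative):

```python
import math

def radius_perspective(ecc):
    """Linear perspective: projection from the sphere center."""
    return math.tan(ecc)

def radius_postel(ecc):
    """Postel: arc length on the sphere equals length on the picture."""
    return ecc

def radius_stereographic(ecc):
    """Stereographic: projection from the pole opposite the tangent point."""
    return 2.0 * math.tan(ecc / 2.0)

def tangential_scale(radius_fn, ecc):
    """Magnification of small circles concentric with the tangent point."""
    return radius_fn(ecc) / math.sin(ecc)

for deg in (20, 40, 60, 80):
    e = math.radians(deg)
    print(deg, round(radius_postel(e), 3), round(radius_stereographic(e), 3),
          round(radius_perspective(e), 3))
```

For every eccentricity below 90° the Postel radius is the smallest and the perspective radius the largest, with the stereographic radius in between and close to Postel at moderate angles.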

Figure 2: Spherical projections (panels, left to right: Perspective, Geographic, Mercator, Transverse Mercator, Stereographic). Figures taken from Matlab’s help pages visualizing the distortions of various projections. Grid lines correspond to longitude and latitude lines. Small circles are placed at regular intervals across the globe. After projection, the small circles appear as ellipses (called Tissot indicatrices) of various sizes, elongations, and orientations. The sizes and shapes of the ellipses reflect the projection distortions.

Consider now the situation in which we wish to represent a very wide field of view. A viewer contemplating a wide panorama will rotate his head around a vertical axis in order to take in the full view. Suppose now that the view has been transformed into a flat picture hanging on a wall and consider a viewer exploring that picture: the viewer will walk in front of the picture with a translatory motion that is parallel to the wall. If we replace rotation around a vertical axis with sideways translation in front of the picture we obtain a family of projections which are popular with cartographers. Wrap a sheet of paper around the viewing sphere forming a cylinder that touches the sphere at the equator. One may project the meridians onto the cylinder by maintaining lengths along vertical lines, thus obtaining the geographic projection. Alternatively, one may want to vary locally the scale of the meridians so that they keep in proportion with the parallels. This is the Mercator projection (for mathematical definitions of these projections see [16]).
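The two cylindrical projections can be written down in a few lines. This is a sketch using the standard spherical formulas for a unit sphere with angles in radians; the function names are illustrative. Both map longitude directly to the horizontal coordinate and differ only in how latitude is stretched.

```python
import math

def geographic(lon, lat):
    """Geographic (equirectangular): lengths along meridians are preserved."""
    return lon, lat

def mercator(lon, lat):
    """Mercator: meridians are stretched by 1/cos(lat) to match the parallels."""
    if abs(lat) >= math.pi / 2:
        raise ValueError("the poles map to infinity in the Mercator projection")
    return lon, math.log(math.tan(math.pi / 4 + lat / 2))

for deg in (0, 20, 40, 60, 80):
    lat = math.radians(deg)
    print(deg, round(geographic(0.0, lat)[1], 3), round(mercator(0.0, lat)[1], 3))
```

Near the equator the two vertical coordinates are almost identical; they diverge as the latitude grows.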

Figure 2 visualizes the properties of these projections. In this visualization grid lines correspond to longitude and latitude lines. When projecting images onto the sphere, vertical lines are projected onto longitude lines. Horizontal lines are not projected onto latitude lines but rather onto tilted great circles, thus the visualization of the latitude lines does not convey what happens to horizontal image lines. All of these projections are global and are independent of the image content.

Figure 3 illustrates the above projections on a panorama constructed of images taken at an indoor scene. This is a typical example of panoramas of man-made environments which usually contain many straight lines. Selecting from the above projections implies bending either the horizontal lines, the vertical lines, or both. In most cases a better choice is to keep vertical lines straight as this results in a panorama where narrow vertical slits look correct. This matches the observations in [22], which shows that our perception of a picture is affected by the fact that normally people shift their gaze horizontally and rarely shift it vertically. Shifting one’s gaze horizontally across a panorama looks best when vertical lines are not bent. This motivates the use of either the Geographic or the Mercator projections, as both keep vertical lines straight. In both these projections the rotation of the camera is transformed into sideways motion of the observer.

When the camera performs mostly pan motion, i.e., when the vertical angle is small, both projections produce practically the same result. However, for larger tilt angles the Geographic projection distorts circles, i.e., it does not maintain correct proportions, while the Mercator does maintain conformality, thus the Mercator projection is a better option (see Figure 4). Note that conformality implies that in the Mercator projection spherical and cylindrical objects, such as people, are not distorted but the background is; see for example Figure 8.
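The difference in how the two projections treat small circles can be measured directly. The sketch below is an illustration (not the authors' code): it estimates by finite differences the width-to-height ratio of the image of a tiny circle on the unit sphere, using the standard formulas for the two cylindrical projections. A ratio of 1 means the circle stays a circle.

```python
import math

def geographic(lon, lat):
    return lon, lat

def mercator(lon, lat):
    return lon, math.log(math.tan(math.pi / 4 + lat / 2))

def tissot_aspect(project, lat, h=1e-6):
    """Width/height of the image of a tiny circle of radius h at latitude lat."""
    x0, y0 = project(0.0, lat)
    # An eastward step of arc length h changes the longitude by h / cos(lat).
    x1, _ = project(h / math.cos(lat), lat)
    # A northward step of arc length h changes the latitude by h.
    _, y1 = project(0.0, lat + h)
    return (x1 - x0) / (y1 - y0)

for deg in (0, 30, 60):
    lat = math.radians(deg)
    print(deg, round(tissot_aspect(geographic, lat), 3),
          round(tissot_aspect(mercator, lat), 3))
```

Under these formulas the Geographic ratio grows as 1/cos(latitude), so circles far from the equator are flattened into wide ellipses, while the Mercator ratio stays at 1.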

An important issue in all cylindrical projections is the choice of equator. Once the images are on the sphere one can rotate the sphere in any desired way before projecting to the plane. In other words, the cylinder wrapping the sphere can touch the sphere along an equator of choice. When a wrong equator is selected, vertical lines in 3D space will not be projected onto vertical lines in the panorama (see left panel of Figure 5). Finding the correct equator is easy. The user is requested to mark a single vertical line and a horizon point in one (or two) of the input images. The sphere is then rotated so that the projection of the marked vertical line aligns with a longitude line and the equator goes through the selected horizon point. This results in a straightened panorama; see, for example, the right panel of Figure 5.
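One possible way to compute such a rotation is sketched below. It is an assumption about the implementation, not the authors' code: the marked vertical line is given by two unit rays (`p_bottom`, `p_top`) and the horizon point by one unit ray (`horizon`), all on the viewing sphere. The rays of a vertical line span a plane that contains the true ‘up’ direction, and ‘up’ is also perpendicular to any horizon ray, so it can be recovered with two cross products.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def equator_rotation(p_bottom, p_top, horizon):
    """Rows of a rotation taking 'up' to (0, 0, 1) and horizon to (1, 0, 0)."""
    plane_normal = cross(p_bottom, p_top)   # normal of the plane of the marked line
    up = cross(plane_normal, horizon)       # lies in that plane, orthogonal to horizon
    if math.sqrt(dot(up, up)) < 1e-9:
        # Degenerate when the horizon point is 90 degrees in azimuth from the line.
        raise ValueError("horizon point and vertical line do not constrain 'up'")
    up = normalize(up)
    if dot(up, p_top) < dot(up, p_bottom):  # make 'up' point towards the top ray
        up = tuple(-c for c in up)
    x_axis = normalize(horizon)
    y_axis = cross(up, x_axis)
    return (x_axis, y_axis, up)

def rotate(rotation, p):
    return tuple(dot(row, p) for row in rotation)
```

After applying `rotate` to every ray, the marked line lies on a single longitude and the horizon point lies on the equator, so any of the cylindrical projections can then be applied unchanged.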

Should other projections be considered? Yes, we think so. The Transverse Mercator projection is known in the mapping world as an excellent choice for mapping areas that are elongated north-to-south. This corresponds to panoramas with little pan motion and large tilt motion. The bending of vertical lines is small near the meridian; thus, when the pan angle is small we are better off using the Transverse Mercator projection, which keeps the horizontal lines straight. This is illustrated in Figures 4 and 6.

For far away outdoor scenes almost any projection looks good as the scenes rarely contain any straight lines. Nevertheless, too much bending might disturb the eye even on free-form objects like clouds. This suggests using the stereographic projection, which bends both vertical and horizontal lines but less than the cylindrical projections.
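A sketch of the Transverse Mercator projection, using the standard spherical formulas with the central meridian at longitude 0 (function names are illustrative). The helper `direction_to_lonlat` assumes a frame with x to the right, y up and z forward. With these formulas, a 3D line parallel to the x axis keeps a constant vertical coordinate, which is the sense in which horizontal lines stay straight.

```python
import math

def direction_to_lonlat(x, y, z):
    """Longitude and latitude of a ray (x right, y up, z forward)."""
    length = math.sqrt(x * x + y * y + z * z)
    return math.atan2(x, z), math.asin(y / length)

def transverse_mercator(lon, lat):
    """Spherical Transverse Mercator with central meridian lon = 0 (radians)."""
    b = math.cos(lat) * math.sin(lon)
    if abs(b) >= 1.0:
        raise ValueError("points at lon = +-90 degrees on the equator map to infinity")
    x = math.atanh(b)
    y = math.atan2(math.sin(lat), math.cos(lat) * math.cos(lon))
    return x, y

# Points along a horizontal 3D line at height 0.5 and depth 1.
for t in (-1.0, 0.0, 2.0):
    print(transverse_mercator(*direction_to_lonlat(t, 0.5, 1.0)))
```

All three points share the same second coordinate, so the line is drawn as a straight horizontal segment, whereas vertical lines away from the central meridian are bent.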

3 Multi View Projection

The projections explored in Section 2 are ‘global’, in thatonce a tangent point or a tangent line is chosen, the pro-jection is completely determined by this parameter. This isby no means a necessary property for a good projection. Wemay instead tailor the projection locally to the content of theimages in order to improve the final effect. We next explorea few options for such multi-view projections.

Figure 3: Spherical projections (panels: Perspective, Transverse Mercator, Mercator, Stereographic, Geographic, Multi-Plane). There are many spherical projections. Each has its pros and cons.

Figure 4: Preserving proportions. In the Geographic projection the circular pot at the bottom of the panorama is distorted into an ellipse. In the Mercator projection this does not happen.

3.1 Multi-Plane Perspective Projection

As was shown in Section 2, a global projection of wide panoramas bends lines, which is unpleasant to the eye. To obtain both a rectilinear appearance and a large field of view we suggest using a multi-plane perspective projection. Such multi-plane projections were suggested by Greene [11] for rendering textured surfaces. Rather than projecting the sphere onto a single plane, multiple tangent planes to the sphere are used. Each projection is linear perspective. The tangent planes have to be arranged so that they may be unfolded into a flat surface without distortion, e.g., the points of tangency belong to a great circle. One may think of the intersections of the tangent planes being fitted with hinges that allow flattening. The projection onto each plane is perspective and covers only a limited field of view, thus it is pleasant to the eye.
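The unfolding can be sketched for the case where all points of tangency lie on the equator and the hinges are vertical. This is an illustrative reading of the construction, not the authors' implementation: `seams` and `centers` are hypothetical lists of longitudes (radians), and each face gets a scale factor chosen so that the picture stays continuous across every hinge, which is an assumption needed when a seam is not halfway between two tangent points.

```python
import math

def build_faces(seams, centers):
    """seams: s_0 < ... < s_n; centers: c_i with s_i <= c_i <= s_(i+1)."""
    faces, offset, scale = [], 0.0, 1.0
    for i, c in enumerate(centers):
        left, right = seams[i], seams[i + 1]
        if max(abs(left - c), abs(right - c)) >= math.pi / 2:
            raise ValueError("each face must cover less than 90 degrees per side")
        if i > 0:
            prev = faces[-1]
            # Match the vertical scale of the previous face along the shared seam.
            scale = prev["scale"] * math.cos(left - c) / math.cos(left - prev["center"])
        x_left = scale * math.tan(left - c)
        width = scale * math.tan(right - c) - x_left
        faces.append({"left": left, "right": right, "center": c,
                      "scale": scale, "offset": offset - x_left})
        offset += width  # the next face starts where this one ends
    return faces

def multi_plane(lon, lat, faces):
    """Perspective projection onto the face containing lon, then unfolded."""
    for f in faces:
        if f["left"] <= lon <= f["right"]:
            d = lon - f["center"]
            x = f["offset"] + f["scale"] * math.tan(d)
            y = f["scale"] * math.tan(lat) / math.cos(d)
            return x, y
    raise ValueError("longitude is outside all faces")
```

For example, `build_faces` with seams at -90°, -30°, 40°, 90° and tangent points at -60°, 0°, 65° covers a 180° field of view with three faces, each spanning at most 70°, so no face suffers the extreme stretching of a single perspective plane.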

This process introduces large orientation discontinuities at the intersection between the projection planes; however, in many man-made environments these discontinuities will not be noticed if they occur along natural discontinuities. The tangent planes must therefore be chosen in a way that fits the geometry of the scene, e.g., so that the vertical edges of a room project onto the seams and each projection plane corresponds to a single wall. Orientation discontinuities caused by the projection this way co-occur with orientation discontinuities in the scene and therefore they are visually unnoticeable (see Figures 3, 6 and 8). Sometimes no seam may be found that completely corresponds to discontinuities in the scene: for example, in Figure 9 the chair on the right is clearly distorted. Another caveat is that some arrangements will cause a loss in the impression of depth: for example, when projecting a panorama of a standard room onto a square prism (see left panel of Figure 7). Most often the sensation of depth can be maintained by an appropriate choice of the projection planes (see right panel of Figure 7).

We have currently implemented a simple user interface to allow choosing the position of the multiple tangent planes. We assume that the hinges between tangent planes are associated with either vertical or horizontal lines: the user is presented with the Geographic projection of the panorama and clicks once anywhere on a single vertical line to choose a seam and once again to choose the point of tangency of each projection plane. Automating this operation is an interesting exercise which we leave for the future.

Figure 5: Choice of equator (panels: Mercator With Wrong Equator, Mercator With Correct Equator). Panoramas of the Pantheon. A wrong choice of the equator results in tilted vertical lines. The columns on the right and left appear converging. Correcting the equator selection results in columns standing upright.

Figure 6: Vertical panoramas (columns: Perspective, Geographic, Mercator, Transverse Mercator, Multi-Plane; panels: Uncropped, Cropped). Left and right panels show results before and after cropping (see Section 4 for further details). For wide angle panoramas, perspective cannot capture the full range, thus the photographer’s legs are excluded. Geographic distorts proportions (see how squashed the legs look). Mercator stretches the legs across the bottom. Transverse Mercator captures both the sculpture and the photographer, which suggests it is the best global projection option for narrow vertical panoramas. Multi-Plane does even better.

3.2 Preserving Foreground Objects

The multi-plane perspective projection takes us back to the second challenge presented in Section 1. Recall that even for small fields of view nearby (foreground) objects are often perceived as distorted. Our solution to this problem draws its inspiration from the Renaissance artists.

During the Renaissance the rules of perspective were understood, and linear perspective was used to produce pictures that had a realistic look. Painters noticed early on that spheres and cylinders (and therefore people) would appear distorted if they were painted according to the rules of a global perspective projection (a sphere will project to an ellipse). It thus became common practice to paint people, spheres and cylinders by using linear perspective centered around each object (see for example The School of Athens by Raphael [18, 14]). This results in paintings with multiple view points. There is one global view point used for the background and an additional view point for each foreground person/object.

Renaissance paintings look good precisely because they are constructed using a multiplicity of projections. Each projection is chosen in order to minimize the apparent distortion of either the ambient architecture, or of a specific person/object. We follow this example and adopt the multi-view point approach to construct realistic looking panoramas. We first separate the background and foreground objects. A panorama is constructed from the background by using a global projection: perspective for fields of view that are narrower than, say, 40° and Multi-Plane otherwise. The foreground objects are projected using a ‘local’ perspective projection, with a central line of sight going through the center of each object, and then they are pasted onto the background. In more detail:

(1) Obtain a foreground-background segmentation for each image and cut out the foreground objects [15, 19]. Currently we use the GIMP [10] implementation of Intelligent Scissors [17], which requires manual interaction; we found it to take less than a minute per image.

(2) Fill in the holes in the background caused by cutting out the foreground objects using a texture propagation technique (e.g., [8, 3]). We used our implementation of [8]. Note that the hole filling need not be perfect as most of it will eventually be covered by the repasting of the foreground objects. As we are most sensitive to distortions of people, one could acquire each picture containing a person a second time, once the person has moved. In that case hole filling is not required.

(3) Construct a panorama of the filled background images.

(4) Overlay foreground objects on top of the background panorama. For each foreground object, find its bounding box in the original image and the bounding box it would have in the panorama if it were projected along with the background. Rescale the cut-out object to have the same height as its projection (note that the width will be different). Paste the object so that the centers of the bounding boxes align.
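The geometry of step (4) can be sketched as follows. This is an illustration under one assumption not spelled out in the text: the cut-out is rescaled uniformly (same factor in both directions), which is why its width differs from the width of the projected bounding box. Boxes are hypothetical `(left, top, right, bottom)` tuples in pixels.

```python
def paste_rectangle(original_box, projected_box):
    """Return (scale, left, top, width, height) for pasting a cut-out.

    original_box: bounding box of the object in the input image.
    projected_box: bounding box the object would have in the panorama
                   if it were projected along with the background.
    """
    orig_w = original_box[2] - original_box[0]
    orig_h = original_box[3] - original_box[1]
    proj_h = projected_box[3] - projected_box[1]
    # Uniform scale (assumed): match the projected height, keep the aspect ratio.
    scale = proj_h / orig_h
    new_w, new_h = orig_w * scale, orig_h * scale
    # Align the center of the rescaled cut-out with the center of the projection.
    center_x = (projected_box[0] + projected_box[2]) / 2.0
    center_y = (projected_box[1] + projected_box[3]) / 2.0
    return scale, center_x - new_w / 2.0, center_y - new_h / 2.0, new_w, new_h

print(paste_rectangle((10, 20, 50, 100), (200, 50, 300, 210)))
```

In this example the object is 40 by 80 pixels in the input image and its projection spans 100 by 160 pixels; the cut-out is doubled in size to 80 by 160 and centered on the projected box, so it is narrower than the stretched projection it replaces.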

This process is illustrated in Figure 10. Five frames were taken out of a video sequence showing a child walking from right to left, while facing the camera. The child was cut out from each image, texture propagation was used to fill in the holes and a perspective panorama of the background was constructed (see Figure 10 top). The cut-outs of the child were then pasted onto the background in two ways. First, we applied the same perspective projection used for the background, which distorted the child’s head into a variety of elliptical shapes (see Figure 10 middle). Second, we used the multi-view approach described above, which produced a significantly better looking result, removing all the head distortions (see Figure 10 bottom). Another example is displayed in Figure 11 (for this example we had available clear background images so hole filling was not required). Figure 9 displays our full solution including both multi-plane projection for the background and multi-view projection to correct the chair in the foreground.

4 Results

In all the experiments displayed in this paper the computation of the transformations of the input images to the sphere was done using Matthew Brown’s Autostitch software [4, 2].

When the images do not cover the full viewing sphere, the boundaries of the panorama can have all sorts of shapes, depending on the projection (e.g., see the left panel of Figure 6). Thus, for visualization purposes, the panoramas were cropped to display a complete rectangular portion. This results in different coverage areas for each projection. The uncropped panoramas, as well as more results, are provided in the attached supplemental material.

5 Discussion & Conclusions

The challenge of constructing panoramas from images taken from a single viewpoint goes beyond image matching and image blending. The choice of the mapping between the viewing sphere and the image plane is an interesting problem in itself. Artists and cartographers have explored this problem and have proposed a number of useful global projections. Additionally, artists have developed a practice of using multiple local projections which are guided by the content of the images. Inspired by the artists we have proposed a new set of projections which incorporate multiple local projections with multiple view points into the same panorama to produce more compelling results. Further automating this process is a worthwhile challenge for machine vision researchers.

6 Acknowledgements

This research was supported by MURI award number AS3318 and the Center of Neuromorphic Systems Engineering award EEC-9402726. We also wish to acknowledge our useful conversations with Pat Hanrahan, Jan Koenderink, Marty Banks, Bill Freeman, Ged Ridgway and David Lowe and to thank Matthew Brown for providing his Autostitch software.

References

[1] Leon Battista Alberti. On Painting. First appeared 1435-36. Translated with Introduction and Notes by John R. Spencer. New Haven: Yale University Press, 1970.

[2] Autostitch. http://www.autostitch.net/.

[3] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester. Image inpainting. In Proceedings of SIGGRAPH, New Orleans, USA, July 2000.

[4] M. Brown and D. Lowe. Recognising panoramas. In Proceedings of the 9th International Conference on Computer Vision, volume 2, pages 1218–1225, Nice, October 2003.

[5] P. J. Burt and Edward H. Adelson. A multiresolution spline with application to image mosaics. ACM Trans. Graph., 2(4):217–236, 1983.

[6] D. Capel and A. Zisserman. Automatic mosaicing with super-resolution zoom. In CVPR ’98: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, page 885. IEEE Computer Society, 1998.

Figure 7: Multi-Plane projection (panels: Straight Projection, Oblique Projection). In each panel the top figure displays the geographic projection and the interaction required by the user: definition of the intersection lines between the tangent planes (marked in blue) and the center of projection for each tangent plane (marked in green and red). The middle figure displays a top view of the projection. The bottom figure displays the final result.

[7] P. E. Debevec and J. Malik. Recovering high dynamic range radiance maps from photographs. In Proceedings of SIGGRAPH, August 1997.

[8] A. A. Efros and Thomas K. Leung. Texture synthesis by non-parametric sampling. In IEEE International Conference on Computer Vision, pages 1033–1038, Corfu, Greece, September 1999.

[9] A. Flocon and A. Barre. Curvilinear Perspective, From Visual Space to the Constructed Image. University of California Press, 1987.

[10] The GIMP. http://www.gimp.org/.

[11] N. Greene. Environment mapping and other applications of world projections. IEEE Computer Graphics and Applications, 6(11):21–29, November 1986.

[12] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2000.

[13] M. Irani, B. Rousso, and S. Peleg. Computing occluding and transparent motions. Int. J. Comput. Vision, 12(1):5–16, 1994.

[14] M. Kubovy. The Psychology of Perspective and Renaissance Art. Cambridge University Press, 1986.

[15] Y. Li, J. Sun, C. K. Tang, and H. Shum. Lazy snapping. In Proceedings of SIGGRAPH, 2004.

[16] MathWorld. http://mathworld.wolfram.com/.

[17] E. N. Mortensen and W. A. Barrett. Intelligent scissors for image composition. In SIGGRAPH ’95: Proceedings of the 22nd annual conference on Computer graphics and interactive techniques, pages 191–198. ACM Press, 1995.

[18] M. H. Pirenne. Optics, Painting & Photography. Cambridge University Press, 1970.

[19] C. Rother, V. Kolmogorov, and A. Blake. Grabcut - interactive foreground extraction using iterated graph cuts. Proc. ACM Siggraph, 2004.

[20] H. S. Sawhney and R. Kumar. True multi-image alignment and its application to mosaicing and lens distortion correction. IEEE Trans. Pattern Anal. Mach. Intell., 21(3):235–243, 1999.

[21] R. Szeliski and H. Shum. Creating full view panoramic image mosaics and environment maps. Computer Graphics, 31(Annual Conference Series):251–258, 1997.

[22] D. Vishwanath, A. R. Girshick, and M. S. Banks. Why pictures look right when viewed from the wrong place. Personal communication. (Manuscript accepted for publication).

[23] A. Zomet and S. Peleg. Applying super-resolution to panoramic mosaics. In WACV ’98: Proceedings of the 4th IEEE Workshop on Applications of Computer Vision (WACV’98), page 286. IEEE Computer Society, 1998.

Figure 8: Architecture vs. spherical objects (panels: Perspective, Mercator, Multi-Plane). The perspective projection distorts people at large viewing angles. The Mercator projection keeps the people undistorted, but distorts the wall and white-board in the background. The Multi-Plane projection provides the most compelling result with no noticeable distortions in either background or people.

Figure 9: Multi-Plane Multi-View (panels: Mercator, Multi-Plane, Multi-Plane Multi-View). The multi-plane projection rectified the background but the chair on the right got distorted. Using the Multi-View approach the chair is undistorted.

Figure 10: Correcting perspective distortions (panels: Background, Perspective, Multi-View). Top: Panorama of the background only. Artifacts in the hole filling are visible, but are inessential as they will eventually be covered by the foreground object. Center: A global perspective projection of both background and foreground. The child’s head appears distorted. Bottom: A multi-view point panorama providing the most compelling look with no head distortions.

Figure 11: Correcting perspective distortions (panels: Perspective, Multi-View). In the Perspective panorama the person’s head is highly distorted. A Multi-View panorama provides a more compelling look, removing all distortions.
