
IEEE DISTRIBUTED SYSTEMS ONLINE 1541-4922 © 2005 Published by the IEEE Computer Society Vol. 6, No. 3; March 2005

SLAM Dunk? Real-Time Mapping Technology Could Take Gaming, Movies to New Level

Benjamin Alfonsi

Combine a video camera's mechanical utility with the technology that draws instant-replay diagrams on your television screen during sporting events, and you'd have a good idea what researchers at Oxford University are cooking up.

Ian Reid and Andrew Davison, of Oxford's engineering department, are developing a technology to simultaneously localize and map a single camera's surroundings in an unknown environment. Called Visual SLAM (Simultaneous Localization and Mapping), the technology will let users combine real-life and computer-generated images in real time.

"The problem is deceptively easy to state but encompasses major, difficult research challenges," says Mark Hylton, the research portfolio manager for the Engineering and Physical Sciences Research Council, the organization that's funding the project for the next three years.

Until now, it has only been possible to interweave real-life and computer-generated images in a studio, during postproduction.

A new dimension for SLAM

Along with an array of possible applications, from interior decorating to TV broadcasting, Visual SLAM could revolutionize the film, robotics, and computer gaming industries if it can deliver on its promise.

At its core, the technology is an algorithm that processes images from a video camera in real time (that is, 30 frames per second). Using motion from image features automatically extracted in each frame, the algorithm works out the camera position and the features' 3D locations at each instant in time.
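The article doesn't spell out the estimation machinery, but Davison's earlier real-time single-camera work was built around an extended Kalman filter over the joint state of the camera and the mapped features. As a rough illustration only, not the Oxford code, a per-frame predict/update loop in that style might look like the following Python sketch, where the motion model f, the image-projection model h, and their Jacobians are supplied by the caller:

    import numpy as np

    # Illustrative EKF skeleton for single-camera SLAM (not the Oxford code).
    # The state vector x stacks the camera pose and velocity with the 3D
    # positions of the mapped features; P is their joint covariance.

    def ekf_predict(x, P, f, F_jac, Q):
        """Propagate state and covariance through the motion model f."""
        x_pred = f(x)                   # predicted state at the next frame
        F = F_jac(x)                    # Jacobian of f at the current state
        P_pred = F @ P @ F.T + Q        # grow uncertainty by process noise Q
        return x_pred, P_pred

    def ekf_update(x, P, z, h, H_jac, R):
        """Correct the prediction with one measured image feature z."""
        H = H_jac(x)                    # Jacobian of the projection model h
        y = z - h(x)                    # innovation: measured minus predicted pixel
        S = H @ P @ H.T + R             # innovation covariance (R = pixel noise)
        K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
        x_new = x + K @ y
        P_new = (np.eye(len(x)) - K @ H) @ P
        return x_new, P_new

Running the prediction once per frame and the update once per matched feature is what keeps both the camera pose and the 3D map consistent at 30 frames per second.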


Figure 1. The Visual SLAM technology (a) builds a 3D map from the camera's view of its surroundings, a kitchen in this case, then (b) overlays virtual shelves and a table in the scene.
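In general terms, the overlay in Figure 1b is possible because every frame comes with an estimated camera pose: any virtual 3D point, such as a shelf corner, can be projected into the live image with a standard pinhole camera model. A minimal sketch follows; the intrinsic matrix and the corner coordinates are invented values for illustration, not numbers from the project:

    import numpy as np

    def project_point(X_world, R, t, K):
        """Project a 3D world point to pixel coordinates using the
        estimated camera pose (rotation R, translation t) and intrinsics K."""
        X_cam = R @ X_world + t        # world -> camera coordinates
        if X_cam[2] <= 0:
            return None                # behind the camera, not visible
        u = K @ (X_cam / X_cam[2])     # perspective divide, then intrinsics
        return u[:2]

    # Example: a corner of a virtual shelf one metre in front of the camera.
    K = np.array([[500.0,   0.0, 320.0],
                  [  0.0, 500.0, 240.0],
                  [  0.0,   0.0,   1.0]])
    corner = np.array([0.2, -0.1, 1.0])
    print(project_point(corner, np.eye(3), np.zeros(3), K))  # -> [420. 190.]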


Researchers believe that by adding a visual component, Visual SLAM stands to take SLAM research to the next level. In the past, SLAM has been applied primarily to mobile robotics.

"Most mobile robots operate in a flat environment, and so typically can cope with three degrees of freedom, often getting away with essentially 2D maps. We need to deal with six degrees of freedom and full 3D maps," Reid says. "In addition, mobile robots typically make controlled motions e.g., 'move forward by this much.' In contrast, our system is controlled by whoever happens to be holding the camera, and thus we have no prior information about the control inputs to the system."

Possible applications

It's not that Visual SLAM advancements can't be applied to mobile robotics, but researchers say they can be applied to a host of other fields as well.

"The most immediate application is probably where mapping and localization are required in a small environment such as for gaming or wearable or assistive robotics," explains Reid. "To address mobile robotics we need to address the issues that arise when parts of the environment may be invisible for large periods. This forms a major part of our ongoing research."

He presents the possibility of a robot that maps its environment as it explores. "Such a scenario would probably be more useful in the context of personal and assistive robotics rather than factory ones. Home, office, or hospital environments are typically much less controlled and more dynamic than factories. Therefore, the need for mapmaking on the fly is potentially greater."

Home gaming with interactive virtual elements is another possibility. "Imagine a game system with virtual reality goggles providing virtual monsters coming out of cupboards in users' living rooms!" Reid says.

Albeit much further down the road, camera phones are another consideration. "We are perhaps some way off in terms of the processing power and video capabilities required," Davison says. "Applications of our technology [to camera phones] are less clear than in some other areas."


From Oxford to Hollywood?

One area where the technology's potential seems certain is in virtual and augmented environments for film and TV.

"The most exciting impact may be on vision systems being able to cope with high uncertainty in real-time situations or, in an application example, augmented reality for live cameras in television broadcasts," says Hylton.

"An immediate application to the film industry is to enable a director to view the virtualaugmented environment from the 'beauty camera' during filming," says Davison. "This

scenario, in which the augmented scene is provided primarily for the director's own visualization, is less demanding and requires less robustness than the broadcast scenario, in which the augmented scene is for the audience's viewing."

According to Dave Thomas, a Los Angeles-based commercial director, "The availability of this technology would take a great percentage of the guesswork out of high-end visual effects shots and ultimately allow directors, during the process, to make better judgments on the overall quality of the final product."

"Currently, visual effect shots that contain live-action and computer-generated elements require a complicated process of tracking," Thomas says. "Most of this tracking comes in the post-production process, long after the live-action portion has been shot. This technology would allow directors to view CG/live-action combined shots while they are actually shooting the live-action shot."

For Thomas, this translates into better, cheaper shots, which is music to most film producers' ears. "A majority of projects today contain at least a few, if not many, visual-effects shots, and this technology could offer directors more accurate, more dynamic, and ultimately less expensive shots," he says. "Previsualizing the shot, with greater effects to be added later, would then require less imagination, not to mention money."


Conclusion

So when will this technology be ready to market to Hollywood or elsewhere? And when will official tests begin?

"We have a demonstration prototype which has applications in various areas," says Davison of the project that commenced in January 2005. A standard PC with a FireWire webcam is enough to successfully run the software in its present form.

"But this is research in progress," he says. "We have had, and continue to have, discussions with various companies involved in a variety of applications ranging from mobile robotics to gaming to film and broadcasting. But at this point in time, we have no commitment to any one company or application."

Cite this article: Benjamin Alfonsi, "SLAM Dunk? Real-Time Mapping Technology Could Take Gaming, Movies to New Level," IEEE Distributed Systems Online, vol. 6, no. 3, 2005.
