
This paper is an extended version of the paper with the same title published in the proceedings of the ACM/IEEE 1st Annual Conference on Human-Robot Interaction, Salt Lake City, Utah, USA, March 2-4, 2006.

Human Telesupervision of a Fleet of Autonomous Robots for Safe and Efficient Space Exploration

Gregg Podnar, John Dolan, Alberto Elfes*, Marcel Bergerman, H. Benjamin Brown, Alan D. Guisewite

Carnegie Mellon University

5000 Forbes Ave. Pittsburgh PA 15213-3890 USA

[email protected]

*Jet Propulsion Laboratory 4800 Oak Grove Drive

Pasadena CA 91109-8099 USA [email protected]

ABSTRACT
When Gene Cernan stepped off the surface of the Moon in December 1972 he said, “… we leave as we came, and God willing, as we shall return, with peace and hope for all mankind.” In January 2004, NASA began a bold enterprise to return to the Moon, and with the technologies and expertise gained, press on to Mars.

The underlying Vision for Space Exploration calls for a sustained and affordable human and robotic program to explore the solar system and beyond; to conduct human expeditions to Mars after successfully demonstrating sustained human exploration missions to the Moon. The approach is to “send human and robotic explorers as partners, leveraging the capabilities of each where most useful.”

Human-robot interfacing technologies for this approach are required at readiness levels above any available today. In this paper, we describe the HRI aspects of a robot supervision architecture we are developing under NASA’s auspices, based on the authors’ extensive experience with field deployment of ground, underwater, lighter-than-air, and inspection autonomous and semi-autonomous robotic vehicles and systems.

1. INTRODUCTION
Space exploration is one of the costliest and riskiest activities human beings pursue. It is estimated that, for each astronaut who walks on the surface of Mars, it will be necessary to lift some 500,000 tons of cargo, at a cost of several hundred billion dollars.

Because these missions require extended stays, the use of local resources is compulsory to provide building materials, breathable atmosphere, and even fuel.

Once there, an astronaut will explore and work, often performing extra-vehicular activities (EVA) wearing a space suit that is heavy and cumbersome. The duration of an EVA is limited by the amount of life-support consumables that can be carried, such as energy and oxygen. More importantly, experience both in orbit and on the Moon shows that after about five hours of work in a space suit, astronauts must return to their base habitat, exhausted by the intensive effort required.

Robots, on the other hand, have been proven to work very well in extraterrestrial environments, being able to travel autonomously, explore, photograph, collect samples, and even perform limited scientific analysis of materials. The Spirit and Opportunity Mars Exploration Rovers, for example, have together logged more than 1,100 Martian days and over 10.5 km of travel, and are still operating.

It is therefore desirable that robotic systems, with their proven resilience to the harsh space environment and autonomous capabilities, be used to augment the most precious resource, the human beings and their reasoning abilities.

The decision to send human explorers back to the Moon, and then on to Mars for extended missions, results in a setting very different from robotic exploration such as that on Mars with Spirit and Opportunity. Human interaction with the Mars Exploration Rovers is restricted by round-trip communication delays of eight to forty minutes. When humans and robots are near¹ each other the communication delays become insignificant, and a human telesupervisor can take direct control of a robot when necessary, as if the human were immediately present.
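As a quick sanity check (ours, not from the original paper), a one-quarter-second round-trip delay at the speed of light corresponds to a one-way separation of roughly

$$d \approx \frac{c\,t_{rt}}{2} = \frac{299{,}792\ \text{km/s} \times 0.25\ \text{s}}{2} \approx 37{,}500\ \text{km},$$

consistent with the ≈37,400 km “near” threshold given in the footnote.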

By exploiting this scenario of human-supervised robotic exploration and work, we can optimize the human’s time by deploying a fleet of robots that operate as autonomously as possible, and have the human supervisor provide direct control only when necessary. This paradigm has significant benefits: the robot fleet multiplies the effectiveness of one human; the human supervisor remains in a relatively safe, shirtsleeve environment; and the weight lifted from Earth can be reduced by orders of magnitude.

This telesupervision/teleoperation combination, also referred to in the literature as “sliding autonomy” or “variable autonomy”, requires suitable human-robot interfaces, including those for mission planning and monitoring, robot monitoring, and high-fidelity telepresence.

NASA inaugurated the implementation of the Vision for Space Exploration by, among other actions, funding key R&D efforts to mature a portion of the technologies needed in the near and medium term. The authors were awarded funding for the development of a general and widely applicable architecture for human supervision of a fleet of autonomous robots in support of sustained, affordable, and safe space exploration. To realistically achieve this goal, it is necessary to prove the supervision architecture in a complete working system (including all subsystems) and to use this to assess the performance improvement from one person working EVA in a spacesuit to one remote operator at base supervising a fleet of robots.

¹ The definition of ‘near’ is determined by maximum round-trip communication delays of about one-quarter second, corresponding to a distance of about 23,250 miles (37,400 km).

In the first year we are testing and demonstrating the architecture’s feasibility and usefulness by tasking a fleet of adapted robot vehicles with wide-area prospecting for mineral resources for in situ utilization. This initial instantiation of the telesupervised autonomous robotic technologies is referred to as PROSPECT: Planetary Robots Organized for Safety and Prospecting Efficiency via Cooperative Telesupervision (Figure 1).

Figure 1. Artist’s conception of telesupervised wide-area robotic prospecting on the Moon.

We note that, in addition to mineral prospecting, there are many areas in which multiple robots will be deployed and which can benefit from the efficiencies our architecture enables. Mining, transporting, and construction are a few of the planetary surface applications. Similarly, on-orbit applications such as construction, inspection, and repair of large space structures and satellites would also benefit.

In this paper, we focus on the human-robot fleet interaction aspects of our architecture, which were designed primarily based on the authors’ extensive experience with field deployment of ground, underwater, lighter-than-air, and inspection autonomous and semi-autonomous robotic vehicles and systems² [3], [8], [12], [13].

The paper is organized as follows: in Section 2 we present a literature review, focusing on those systems which bear some degree of similarity to ours. In Section 3 we present a brief overview of the Robot Supervision Architecture, delineating how the HRI aspects are integrated with the autonomy subsystems; in Section 4 we describe human interaction modes with individual robots and the robot fleet through the telesupervisor workstation; in Section 5 we present the Telepresence and Teleoperation System (TTS), especially the geometrically-correct binocular stereoscopic video system designed, built, and already in use for local and remote robot teleoperation. We conclude by presenting our immediate and long-term future work and discussing the applicability of our architecture to other important space applications.

² Readers interested in the autonomous navigation and hazard and assistance detection aspects are welcome to contact the authors.

2. LITERATURE REVIEW
The current state-of-the-art in space exploration, as seen in the Mars Exploration Rover (MER) mission, requires several humans to telesupervise each robot [10]. In contrast, our approach provides for one human telesupervising a fleet of numerous robots. Although many of the underlying technologies needed have reached high readiness levels³, no system has yet demonstrated an integrated approach at TRL 6, the last level prior to system demonstration in space. In this section we review the main systems reported in the literature with some similarity to ours, and we point out the main differences in approach or overall goals.

MER is, of course, one notable program which shares many technologies with the system we are developing. It is, however, focused on telecommanding a pair of rovers across a very large communication delay via command sequences, and therefore is substantially different from our work with respect to the functionalities provided. To mention one particular subsystem, our ability to take direct telecontrol of a vehicle is not relevant for MER, since the rovers can never be teleoperated directly from Earth.

Robonaut [1] is a space robot “designed to approach the dexterity of a space suited astronaut.” Its main similarity with our work is the telepresence capability implemented with stereo cameras. It does not, however, take advantage of geometrically-correct stereo, and it requires the use of complex telepresence equipment, while ours is built on top of simple, inexpensive eyewear as described in Section 5.

Nourbakhsh et al. [11] have created a human-multirobot architecture for urban search and rescue that allows simultaneous operation of real-world and simulated entities. Sierhuis et al. [14] have created a Mobile Agents Architecture (MAA) integrating diverse mobile entities in a wide-area wireless system for lunar and planetary surface operations. Our system is similar to these in intent, but focuses on human safety and efficiency in a planetary exploration environment by providing high-fidelity telepresence and a hazard and assistance detection methodology that seeks to optimize the use of human attention resources given a set of robot assistance requests.

Brookshire et al. [2] are working with coordinated teams of robots for assembly tasks. Their architecture implements sliding autonomy with which “the operator can augment autonomous control by providing input to help the system recover from unexpected errors and increase system efficiency.” While their main focus is on the varying degrees of autonomy each robot should maintain during the coordinated task, ours is on augmenting a human’s capability to efficiently perform exploration and operations in space. Finally, it is worth mentioning that the concept of geometrically-correct stereoscopic imagery is not new, and has in fact been applied in computer vision and robotic inspection applications [13]. To the best of our knowledge, however, it has not been integrated into a system which also supports human supervised autonomous fleet operations at TRL 6. This, and the Robot Supervision Architecture, are the main contributions of our work to the state-of-the-art.

³ For a review of technology readiness levels (TRL) we recommend the NASA white paper [9].


3. ROBOT SUPERVISION ARCHITECTURE
At its highest level of abstraction the Robot Supervision Architecture (RSA) supports human supervision of a fleet of autonomous robots, as schematically depicted in Figure 2.

Figure 2. High-level Robot Supervision Architecture concept.

The human interacts with the system in three ways: making high-level plans and assignments to the robots; monitoring the progress as the robots carry out their assigned plans autonomously; and intervening when a robot requires assistance, whether navigational, or with respect to its assigned task. These are taken care of by the three top-level processes in Figure 3 – Task Planning and Monitoring, Robot Telemonitoring, and Telepresence and Teleoperation – which are described in more detail in Section 5.

Figure 3. RSA system-level block diagram.
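A minimal, purely illustrative Python sketch of how these three interaction modes and the intervention hand-off could be organized (the class, field, and mode names below are our own assumptions, not part of the RSA implementation):

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class Mode(Enum):
    # Mirrors the three top-level RSA processes in Figure 3.
    TASK_PLANNING_AND_MONITORING = auto()
    ROBOT_TELEMONITORING = auto()
    TELEPRESENCE_AND_TELEOPERATION = auto()

@dataclass
class Robot:
    name: str
    needs_assistance: bool = False          # e.g. flagged by hazard/assistance detection
    mode: Mode = Mode.ROBOT_TELEMONITORING

def supervise(fleet: list[Robot]) -> Optional[Robot]:
    """Monitor all robots; hand the first one requesting help to the teleoperator."""
    for robot in fleet:
        if robot.needs_assistance:
            robot.mode = Mode.TELEPRESENCE_AND_TELEOPERATION
            return robot                     # operator now drives this robot directly
    return None                              # all nominal; keep monitoring

fleet = [Robot("rover-1"), Robot("rover-2", needs_assistance=True)]
print(supervise(fleet))                      # -> the robot handed off for teleoperation
```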

4. HUMAN INTERACTION MODES
The main physical embodiment of the RSA’s human-robot interaction aspect is the telesupervisor workstation, depicted in Figure 4.

At the center of the Telesupervisor Workstation, located in a “shirt-sleeve environment” human habitat base, is the live stereoscopic video display, which faithfully reproduces the scene as if the operator were viewing through a window on one of the robot vehicles.

The workstation also provides two other groupings of displays: those which support the telesupervisor’s interaction with the Robot Supervision Architecture for task planning and monitoring; and those which allow continuous monitoring of each robot through a “dashboard” providing low-bandwidth imagery and telemetry.

Task Planning and Monitoring
To the left in Figure 4 are displays that allow monitoring of and interaction with the high-level task planning and monitoring. It is here that inference grids⁴ are displayed and manipulated, including being overlaid on an image of the terrain. These displays provide the graphical user interface for mission task design, which for the test mission of wide-area mineral prospecting consists of a list of prospecting tasks to be performed at specific prospecting sites. They also provide the feedback needed for successful mission monitoring and replanning, as necessary. As one example, the lower-left monitor shows each of four robots, their navigated paths, and the points at which prospecting data have been collected.

Dashboards
Each robot is constantly monitored at low bandwidth with imagery and data updated regularly over a wireless Ethernet link. As shown on the right of Figure 4, each robot dashboard includes an image from one of the robot’s cameras, and graphical depictions of status data such as battery charge, attitude, motor temperatures, and any other telemetry of this nature provided by the robot that the operator would monitor.

When the Hazard and Assistance Detection (HAD) subsystem monitoring a robot determines a condition of which the telesupervisor should be made aware, it is on that robot’s dashboard that the condition is indicated. This is shown for robot #2 in Figure 4 by an orange surround of the robot view, and a similar indication on one of the graphical dashboard displays.

Teleoperation Controls
We make a distinction between telepresence and teleoperation, considering telepresence (being present at a distance) as involving primarily telesensory aspects, and teleoperation (acting at a distance) as primarily involving telecontrol. Whereas monitoring is supported by simultaneous low-bandwidth data streams from each of the robots, telepresence is supported by high-bandwidth stereoscopic imagery and other telesensory modalities for one robot at a time. It provides not only stereoscopic visual feedback, but also aural and attitude-proprioceptive feedback that allows for more immersive telepresence.
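As a purely illustrative sketch (field names and values are our assumptions, not the actual RSA telemetry schema), a per-robot dashboard record with a HAD alert flag might look like:

```python
from dataclasses import dataclass

@dataclass
class DashboardStatus:
    """One robot's low-bandwidth dashboard record (illustrative fields only)."""
    robot_id: int
    battery_charge: float      # fraction, 0.0-1.0
    roll_deg: float            # attitude
    pitch_deg: float
    motor_temp_c: float
    had_alert: bool = False    # set when Hazard and Assistance Detection flags a condition

def dashboard_highlight(status: DashboardStatus) -> str:
    """Return the surround color for this robot's dashboard tile."""
    return "orange" if status.had_alert else "neutral"

# Example: robot #2 has been flagged by HAD, as in Figure 4.
print(dashboard_highlight(DashboardStatus(2, 0.8, 1.5, -3.0, 42.0, had_alert=True)))
```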

⁴ Inference grids represent the main data structures of the RSA. We refer the reader to [4] for details on this stochastic lattice model approach to inferencing.


Figure 4. Workstation for human interface with the RSA. This instantiation applies to the wide area prospecting task.

Joysticks, a keyboard, and mouse are depicted in Figure 4 as the primary input peripherals for taking manual control and for interacting with the RSA, such as modifying a navigation waypoint. Touch screens can also be employed.

The telesupervisor will teleoperate two primary sub-systems: the vehicle itself, when a prospecting vehicle must be remotely driven rather than operating under its Autonomous Navigation System; and the prospecting tools, when they are to be operated manually.

Driving the vehicle is accomplished with a joystick, the output of which is converted to the appropriate commands for the particular robot vehicle. Operating the prospecting tools can be done by joystick or a simpler input device, depending on the complexity of the mechanism which deploys them.
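A hypothetical sketch of the joystick-to-command conversion described above (axis conventions, speed limits, and the command structure are our assumptions, not the actual vehicle interface):

```python
from dataclasses import dataclass

MAX_SPEED_MPS = 0.5      # assumed rover speed limit
MAX_TURN_RATE_RPS = 0.3  # assumed yaw-rate limit, rad/s

@dataclass
class DriveCommand:
    linear_mps: float    # forward/backward speed
    angular_rps: float   # turn rate (positive = counter-clockwise)

def joystick_to_drive(x_axis: float, y_axis: float) -> DriveCommand:
    """Map normalized joystick axes in [-1, 1] to a velocity command for the vehicle."""
    y = max(-1.0, min(1.0, y_axis))   # forward/back stick deflection
    x = max(-1.0, min(1.0, x_axis))   # left/right stick deflection
    return DriveCommand(linear_mps=y * MAX_SPEED_MPS,
                        angular_rps=-x * MAX_TURN_RATE_RPS)

print(joystick_to_drive(0.2, 0.8))    # gentle right turn while driving forward
```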

Proprioception
The ability to orient the human stereoscopic camera to allow the telesupervisor to see a portion of the robot vehicle or arm is analogous to an astronaut in a suit viewing these same parts through a helmet visor or vehicle window. This visual proprioception enhances the ability to drive the vehicle as if the operator were sitting on or in it.

Secondary Hand-Off to a Distant Expert
A key aspect of any robot deployment task is identifying a situation beyond the automatic abilities of the robot, and presenting relevant information and control to a human telesupervisor. Depending on the task involved, the telesupervisor may encounter a situation beyond the telesupervisor’s training or experience (e.g., an unusual rock formation), and may decide to request consultation with an expert. Additional communication from the Telesupervisor Workstation base, say from the Moon down to Earth, will support a secondary telesupervisor assistance level, providing access to consultation with distant experts. Although these Earth-based consultants will be at a communications delay disadvantage (minimum round-trip delays: Moon = 2.5 seconds, Mars = 6-42 minutes), their ability to “look over the shoulder” of the telesupervisor allows them to provide valuable insight and decisions when presented with unusual circumstances or curious geological formations.

Communication

The system is implemented over a redundant, dual-path data communication infrastructure: a digital, low-bandwidth radio Ethernet path used for communication of commands and telemetry data; and an analog, high-bandwidth path used for communication of the stereoscopic video streams to the telesupervisor at base. The radio Ethernet supports a possible low-bandwidth fallback should the analog video transmitters experience signal loss. The video systems can also be used to carry telemetry data.
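A minimal sketch, under our own assumptions about naming, of the routing and fallback behavior the dual-path infrastructure implies:

```python
def route_streams(ethernet_up: bool, analog_video_ok: bool) -> dict[str, str]:
    """Choose carriers for each stream over the redundant dual-path link.

    Illustrative only: commands and telemetry normally ride the digital radio
    Ethernet; high-bandwidth stereo video rides the analog path, falling back
    to low-bandwidth imagery on the Ethernet (or telemetry onto the video path)
    when one side degrades.
    """
    return {
        "commands_telemetry": "radio_ethernet" if ethernet_up else "analog_video_path",
        "stereo_video": "analog_video_path" if analog_video_ok else "radio_ethernet_lowres",
    }

print(route_streams(ethernet_up=True, analog_video_ok=False))
```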

5. TELEPRESENCE AND TELEOPERATION SYSTEM
Founded on our previous work on telepresence, navigation, and proprioception for mobile robots, and remote 3D aircraft skin inspection [13], and on our understanding of the human binocular visual system, we have designed a geometrically correct binocular stereoscopic vision system with a stereoscopic camera for each fleet robot, and a stereoscopic display for the operator's workstation.

The major challenge is to provide an end-to-end telepresence vision system which faithfully reproduces the scene as if it were viewed by the operator's naked eyes. We advance these technologies by first addressing the geometric issues and then incorporating a binocular stereoscopic camera and computer-controlled mount onto each robot vehicle.

Binocular Stereoscopic Cameras
The telesupervisor's remote video systems must faithfully reproduce a view analogous to that seen by uninstrumented eyes. Each variety of distortion introduced impairs the operator's ability to work precisely, and can cause substantial fatigue with prolonged use.

Camera Geometry
Traditional binocular video systems, which converge the optical axes or “toe in” the two cameras, result in horizontal and vertical misalignment distortions that increase as the gaze moves away from the center of the scene. As depicted in Figure 5, when two conventional cameras have their optical axes converged, a number of geometric distortions are introduced when these two images are presented as coplanar by the viewing system. The distortions are more pronounced as the viewer’s gaze approaches the corners of the scene.


Figure 5. Converged camera geometry.

The test subject is represented by a flat, rectangular grid of dots set at a distance from the camera where the two centers of view coincide. When the two images are displayed together on a monitor screen (with the necessary multiplexing hardware to present the “left” and “right” images to the correct eyes), although the entire grid should appear coincident, there is only one line where the two images actually coincide: a vertical line at the center, here identified as the “Line of Coincidence”. To clarify, the green grid represents the image seen by the left eye, while the red grid represents that seen by the right.

The human visual system recovers depth information from horizontal disparities, so horizontal disparity errors result in depth distortions. More importantly, the vertical misalignments place unnatural demands on the human visual system to continually change vertical vergence as the gaze traverses the scene. These artificially-introduced vertical disparity distortions are fatiguing, can cause headaches and nausea, and can leave the viewer with temporary residual vertical phoria (misalignment). There are many other distortions which can be introduced, but they need not be exhaustively covered here.

In Figure 6, the geometries have been corrected. The image sensors are now coplanar (i.e., their normals are parallel). With the same interpupillary separation as shown in Figure 5, the optical axes are now parallel. To achieve coincident centers of view, instead of converging the optical axes, we independently shift the center of view of each camera by shifting each image sensor while keeping them coplanar.

This results in the images from both cameras being entirely coincident. This is depicted by a reproduction of the test subject grid where both eyes see the same image at all points across the monitor screen.

Figure 6. Parallel camera geometry.

For a camera system to be capable of presenting a binocular stereoscopic image analogous to gazing with human eyes, the interpupillary distance must match that of the human visual system. The interpupillary spacing of adult human eyes averages 63 mm [5]. Choosing an interpupillary distance substantially larger or smaller introduces stereoscopic scaling which departs from our primary goal of providing the telesupervisor a faithful reproduction of the scene as viewed with naked eyes.

The “area of coincidence” is determined by the lateral offset of the image sensors outboard from the optical axes. Using triangulation, the distance from the camera to the area of coincidence can be calculated based on the intersection of the lines which represent the “center of view” as depicted in Figure 6. This will be set based on the distance of the stereoscopic monitor viewing screen from the human viewer as described below.
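In symbols (our formulation, not taken from the paper), with camera spacing b, lens focal length f, and outboard sensor shift Δ per camera, the two center-of-view rays intersect at the area-of-coincidence distance D:

$$\tan\theta = \frac{\Delta}{f}, \qquad D = \frac{b/2}{\tan\theta} = \frac{b\,f}{2\Delta} \quad\Longleftrightarrow\quad \Delta = \frac{b\,f}{2D}.$$

For example, with b = 63 mm, D = 600 mm (the viewer distance used below), and an assumed example focal length of f = 7.2 mm, each sensor would be shifted outboard by Δ = 63 × 7.2 / (2 × 600) ≈ 0.38 mm.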

Our design for three-dimensional telesensory vision systems is geometrically analogous to the perception of uninstrumented eyes [7], thereby increasing operator accuracy and effectiveness and minimizing fatigue.

Camera Geometry Adjustments
The initial human stereoscopic camera design allows for a variety of adjustments. These include sensor shift (both horizontal and vertical), and independent sensor rotation to allow precise registration and prevent the introduction of torsional errors. Interpupillary spacing, however, is fixed at 63 mm, as explained above.

The Center of View axis is defined as the line running through the center of the image sensor and the optical center of the lens. Because the interpupillary distance of the lenses is fixed, the distance between the two optical centers is also fixed; and since the Center of View angle determines the distance to the Area of Coincidence, that distance is adjusted by shifting both image sensors outboard or inboard equally.

Viewing System Geometry
It is insufficient to consider only the camera in a geometrically-correct telepresence viewing system. To reproduce reality as if the viewer were gazing on the scene with uninstrumented eyes, the display system must also conform to specific geometric requirements. If the camera imagery is displayed on a video monitor, it is natural to consider this monitor as a window through which the viewer gazes.

By adhering to equivalent geometries for the camera’s direct view through a window and for the view of a virtual image “through” the screen of a stereoscopic display system, we can accurately reproduce the object scene.

In the left diagram of Figure 7, the interpupillary distance of the eyes is taken as fixed. The width of the window constrains the angle of view for each eye and defines the area of coincidence when we position the eyes such that a line drawn through the two pupils is parallel with the window, and position the cyclopean point (the point between the two pupils) normal to the plane of the window and centered on the aperture of the window.

Figure 7. Direct view compared with camera and display geometries.

Selection of the effective window aperture is limited by the physical size of the display screen that will be used. Incorporating the distance of the viewer’s eyes from the display screen completes the system’s geometric constraints.

In the center diagram of Figure 7, the spacing of the cameras is set at the average adult human interpupillary distance of 63 mm. The area of coincidence is set at the distance of the viewer’s eyes from the display screen as shown in the right diagram of Figure 7. The camera lenses must then be chosen to equal the calculated angle of view. Given a viewer distance of 60 cm (24 in.) and a screen width of 40 cm (15.75 in.) (similar to a 21” CRT monitor) we calculate an angle of view as:

2 * arctan((0.5 * 40) / 60) ≈ 37˚ horizontal field of view
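The following short Python sketch (ours; the image-sensor width is a hypothetical example value, not a spec from the paper) reproduces this field-of-view figure and estimates the lens focal length that would match it:

```python
import math

VIEWER_DISTANCE_CM = 60.0   # eye-to-screen distance
SCREEN_WIDTH_CM = 40.0      # usable display width (the "window" aperture)
SENSOR_WIDTH_MM = 4.8       # assumed image-sensor active width (example only)

# Horizontal angle of view subtended by the screen at the viewer position.
fov_rad = 2 * math.atan((SCREEN_WIDTH_CM / 2) / VIEWER_DISTANCE_CM)
print(f"field of view ≈ {math.degrees(fov_rad):.1f} deg")     # ≈ 36.9 deg

# Lens focal length giving the same angle of view on the assumed sensor.
focal_mm = (SENSOR_WIDTH_MM / 2) / math.tan(fov_rad / 2)
print(f"matching lens focal length ≈ {focal_mm:.1f} mm")      # ≈ 7.2 mm
```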

Binocular Multiplexing
In the right diagram of Figure 7, a set of eyewear is indicated, interposed between the viewing screen and the viewer’s eyes. There are many ways of multiplexing the left and right images such that each eye sees the image intended for it. Our choice is a StereoGraphics Corporation Z-Screen combined with passive, circularly polarized eyewear, selected based on our previous positive experiences with it.

The monitor displays the left and right images in rapid alternation, taking advantage of the human visual system’s persistence of vision to provide an apparently continuous, flicker-free image to each eye. The Z-Screen is a specialized large-area active liquid crystal polarizer that is mounted in front of a high-refresh-rate CRT monitor with low phosphor persistence (to minimize visual “cross-talk”). The Z-Screen rapidly, and in synchronism with the monitor, alters the circular polarization of the light coming from the monitor.

One lens of the passive eyewear is circularly polarized in one direction, the other lens in the opposite direction. In this way, viewer comfort is maximized with lightweight eyewear, and a larger number of viewers may conveniently observe over the primary viewer’s shoulders when a demonstration is presented to a group.

Limitations of the System Geometry
Were we to track the head of the viewer, we could servo the other aspects of the system geometry to compensate. This is beyond the scope of the project in the near term. Fortunately, a telesupervisor can reasonably be expected to sit in one place. If the viewer leans back, there will be an apparent increase in the depth of the scene. In the same way, leaning closer to the display will appear to lessen the depth. Shifting to the side provides an unrealistic apparent change of perspective. This is primarily an “apparent” change, as the image on the screen does not change, but the natural expectation of the viewer is not realized, and objects at different depths appear to shift in the “wrong” direction.

Camera Support
On each prospecting robot vehicle the stereoscopic camera is supported on a mast near the aft, combined with a pan-and-tilt mount. This allows the vehicle itself to be brought into view. In the near term, because the prospecting tool is also a camera, its deployment mast will also serve as the human stereoscopic camera mast. The added articulation will allow the stereoscopic camera to attain a close-up view of each prospecting site. In the future, separate support will be provided for the prospecting sampling tools.

Figure 8 shows a schematic view of CMU’s K10 rover, built at NASA Ames Research Center and delivered to CMU for this project, with the added mast.

Figure 8. CMU K10 rover with articulated stereo camera mast.


System Prototype
Based on the principles laid out in the previous sections, we designed and built a stereo camera for high-fidelity telepresence. Figure 9 presents the camera CAD drawing and the actual unit.

The stereo camera has been successfully integrated with the rover and the prototype telesupervisor workstation, which currently allows for local teleoperation of the CMU K10 rover as well as for remote teleoperation of NASA JPL and NASA ARC rovers. Figure 10 presents the camera mounted on top of the K10. Figure 11 shows this initial instantiation of the workstation with the operator wearing the passive eyewear and driving the rover while looking at the stereoscopic view provided through the Z-Screen.

Ongoing Development
In the future we will expand the teleperceptual feedback with additional sensors on each robot vehicle for aural and attitude proprioception. Although there may be no atmosphere, if an operator were deployed in a human-habitat cabin of an on-site vehicle, the sounds of the vehicle's actuators, structural stresses, interactions with the surface or structure, and tools would all provide a rich orchestra of information that is naturally perceivable via sound. Stereophonic conduction microphones on each robot and audio reproducers at the Telesupervisor's Workstation will provide this additional and effective cue. We will also advance the camera systems, providing better depth information to the Autonomous Navigation System through elevating mounts and interocular expansion [6].

Figure 9. (a) Stereo camera assembly schematic design. (b) Initial unit built.

Figure 10. CMU’s K10 with stereo camera.

Figure 11. Initial workstation with stereoscopic telepresence capability.

The demands placed on a remote operator who must disconnect the proprioception provided by spatial orientation cues from visual cues can be fatiguing, even nauseating (especially when switching from one robot to another). By using attitude sensors on the vehicle to drive short-travel platform actuators at the Telesupervisor Workstation, a small amount of orientation feedback, this “seat-of-the-pants” information, can be relayed from the vehicle to the teleoperator. This is especially important when the horizon is occluded (e.g., by the rim of a crater) and the operator may be unaware of the robot's orientation on an incline.
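A minimal sketch, assuming a two-axis motion platform and purely illustrative gains and limits, of how robot attitude telemetry could be scaled into short-travel platform setpoints:

```python
def platform_setpoints(roll_deg: float, pitch_deg: float,
                       gain: float = 0.3, limit_deg: float = 5.0) -> tuple:
    """Scale and clamp vehicle attitude into short-travel workstation platform angles.

    Illustrative only: the gain, travel limit, and two-axis platform are our
    assumptions. A small, bounded fraction of the robot's roll/pitch gives the
    operator "seat-of-the-pants" orientation cues without large platform motion.
    """
    def clamp(v: float) -> float:
        return max(-limit_deg, min(limit_deg, v))
    return clamp(gain * roll_deg), clamp(gain * pitch_deg)

# Robot tilted 12 deg nose-down on a crater slope -> platform pitches ~3.6 deg.
print(platform_setpoints(roll_deg=2.0, pitch_deg=-12.0))
```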

Temporal context is also important. As the telesupervisor “switches in” to one of the robots, a history sequence of recent imagery and telemetry can be presented to help the telesupervisor gain the context of the particular robot vehicle more easily.
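As an illustrative sketch only (class and field names are ours, not the RSA's), such a "switch-in" history could be kept as a bounded ring buffer of recent low-bandwidth samples per robot:

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Snapshot:
    """One low-bandwidth history sample (fields are illustrative)."""
    timestamp: float
    thumbnail: bytes
    telemetry: dict

class RobotHistory:
    """Keep the most recent samples so the telesupervisor can 'switch in' with context."""
    def __init__(self, max_samples: int = 120):       # e.g. a few minutes of updates
        self._buffer = deque(maxlen=max_samples)

    def record(self, snapshot: Snapshot) -> None:
        self._buffer.append(snapshot)                  # oldest samples drop off automatically

    def replay(self) -> list:
        return list(self._buffer)                      # oldest-to-newest for quick review
```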

6. CONCLUSION
In this paper we reported on the authors’ ongoing work contracted by NASA to develop an architecture for human supervision of a fleet of space and planetary robots, in support of safe, effective, and efficient space exploration. We focused on the HRI aspects of the architecture; readers interested in the autonomous navigation and hazard and assistance detection aspects are encouraged to contact the authors.

We conclude by noting that the paradigm of a human-supervised fleet of autonomous robots is not limited to planetary prospecting; it is also directly applicable to a variety of tasks in other realms, such as space assembly, inspection, and maintenance, and mining and construction. To take inspection as an example: the comprehensive and periodic inspection of large space structures requires an inspector to visit each critical site of the structure. One astronaut in a maneuvering unit can inspect only a limited number of sites in one shift; to accomplish the inspection goal, the inspector spends most of the time unproductively flying from one site to the next. A fleet of autonomous inspection robots would make it possible to parallelize the task while improving safety and reducing fatigue for the astronaut. The same architectures for navigation, hazard and assistance detection, and hand-off to a teleoperator working through a high-fidelity telepresence system are applicable.

7. ACKNOWLEDGMENTS This work was supported by NASA under the Exploration Systems Mission Directorate’s Technology Maturation Program Cooperative Agreement No. NNA05CP96A.

8. REFERENCES
[1] R.O. Ambrose, R.T. Savely, S.M. Goza, et al. “Mobile Manipulation Using NASA’s Robonaut.” IEEE Intl. Conference on Robotics and Automation, New Orleans, USA, April 2004, pp. 2104-2109.

[2] J. Brookshire, S. Singh, R. Simmons, “Preliminary Results in Sliding Autonomy for Coordinated Teams.” Proceedings of The 2004 Spring Symposium Series, March, 2004.

[3] A. Elfes, S.S. Bueno, M. Bergerman, J.J.G. Ramos. “A semi-autonomous robotic airship for environmental monitoring missions.” IEEE Intl. Conference on Robotics and Automation, May 1998, pp. 3449-3455.

[4] A. Elfes. “Dynamic control of robot perception using multi-property inference grids.” IEEE Intl. Conference on Robotics and Automation, Nice, France, May 1992, pp. 2561-2567.

[5] N. A. Dodgson. “Variation and extrema of human interpupillary distance.” Stereoscopic Displays and Virtual Reality Systems XI, A. J. Woods, J. O. Merritt, S. A. Benton, and M. T. Bolas (eds.), January 2004, pp. 36–46.

[6] V. Grinberg, G. Podnar, and M. Siegel. “Geometry of Binocular Imaging II, The Augmented Eye.” IS&T/SPIE Symposium, Stereoscopic Displays and Applications VI, 1995.

[7] V. Grinberg, G. Podnar, and M. Siegel. “Geometry of Binocular Imaging.” IS&T/SPIE Symposium, Stereoscopic Displays and Applications V, 1994.

[8] T. Kampke and A. Elfes. “Optimal wind-assisted flight planning for planetary aerobots.” IEEE Intl. Conference on Robotics and Automation, April 2004, pp. 2542-2549.

[9] J.C. Mankins. Technology Readiness Levels: A White Paper. NASA Office of Space Access and Technology, April 1995. http://advtech.jsc.nasa.gov/downloads/TRLs.pdf.

[10] NASA Mars Exploration Rover Mission. http://marsrovers.jpl.nasa.gov/home/index.html.

[11] I. Nourbakhsh, K. Sycara, M. Koes, et al. “Human-Robot Teaming for Search and Rescue.” Pervasive Computing, Vol. 4, No. 1, Jan-Mar 2005, pp. 72-78.

[12] M. Saptharishi, C.S. Oliver, C.P. Diehl, et al. "Distributed Surveillance and Reconnaissance Using Multiple Autonomous ATVs: CyberScout.” IEEE Transactions on Robotics and Automation: Special Issue on Multi-Robot Systems, Vol. 18, No. 5, October, 2002, pp. 826-836.

[13] M. Siegel, P. Gunatilake, and G. Podnar. “Robotic Assistants for Aircraft Inspectors.” Instrumentation and Measurement Magazine, Vol. 1, No. 1, March 1998.

[14] M. Sierhuis, W.J. Clancey, R.L. Alena, et al. “NASA’s Mobile Agents Architecture: A Multi-Agent Workflow and Communication System for Planetary Exploration.” 8th iSAIRAS Conference, Munich, September 2005.