
Final Report for DARPA MARS 2020

USC Interaction Lab

July 2004

Maja Matarić - PI
Marcelo Kallmann (postdoc), Chad Jenkins (postdoc),

Amit Ramesh (PhD student), Nathan Miller (subcontracted engineer)

Abstract

This report summarizes the work conducted at the USC Interaction Lab within the scope of the DARPA MARS 2020 Program project “Acquisition of Autonomous Behaviors by Robotic Assistants”, via the NASA subcontract NAG9-1444 “Skill Learning by Primitives-Based Demonstration & Imitation”. It covers the full period of the project, from August 1, 2002 to August 31, 2004.

1. Introduction

The USC team performed research towards achieving autonomous robotic assistants, with a focus on applications using NASA’s Robonaut, a humanoid torso robot at the Johnson Space Center [Amb+00]. Our approach consists of learning skills from demonstration and imitation of primitive motions. In this context, we have achieved the following milestones and contributions:

Development of the USC motion suit, an IMU-based motion capture system for low-cost, lightweight, untethered collection of motion data. We present details about the equipment in this report, as well as results of using the motion suit to teleoperate Robonaut. The motion suit was developed by N. Miller, and M. Kallmann developed the simulation software used to control Robonaut during the teleoperation experiments.

Cross-kinematics metrics for imitation learning, allowing for comparison and mapping of motion data across various kinematic structures. This is an essential issue in the problem of learning from imitation, both for motion data acquisition and for behavior/primitive identification and selection. The main contributor to this topic is A. Ramesh.

Primitive motion controllers based on collected motion and sensory data. The use of demonstrated motion data for the derivation of controllers [JM03] [Jen03] [Ero+03] was presented in the DARPA MARS program preceding this project. New results [JM04b] [OMM04] [DCM04] were applied to build controllers for Robonaut. This research also developed methods for learning sensory structures from Robonaut teleoperation data. Relevant structures, indicating the correct accomplishment of a task (e.g., a successful grasp), are built by analyzing sensory data captured from Robonaut teleoperation. Extracted structures can then be used to guide sensory-motor controllers. The main contributor to this topic is C. Jenkins.

Primitive motion controllers based on randomized roadmaps. Two problems are addressed using randomized roadmap-based motion planners: (1) the generation of collision-free reaching motions for Robonaut in changing environments, and (2) the planning of sequences of parameterized motions (from controllers or collected motion data) towards achieving given tasks. The main contributor to this topic is M. Kallmann.

These four topics are presented in the following sections of this report. The diagram shown in Figure 1 illustrates how these topics are interconnected in order to achieve autonomous robotic assistants through learning from demonstrated motion data.

Figure 1. Diagram illustrating the interconnections between the involved research topics: the USC motion suit, cross-kinematics metrics for imitation learning (used for collecting motion from various types of kinematics), and other sources of motion collection feed a robot-ready motion database; motion controllers are derived from the collected motion and sensory data, and from collected motion and randomized roadmaps; learned skills are represented as parameterized motion controllers used for Robonaut control. Boxes with double-line frames are described in detail in this report.

2. USC Motion Suit

The ability to capture the kinematic motion of human subjects is emerging as an invaluable tool in the control of robots. Motion capture is used for teleoperation, implementing autonomous controllers, learning from demonstration, human-robot interaction, etc. Current motion capture systems, however, either impose considerable physical limitations (e.g., optical, exoskeleton, and electro-magnetic systems), do not provide accuracy and reliability suitable for robot control (e.g., vision-based and fiber-optic systems), or cannot be used for real-time applications (e.g., optical and vision-based systems). In addition, commercially available capture systems can be cost-prohibitive.


Our perspective is that in order for motion capture to be truly effective for robotics, it must perform accurately in unstructured environments without significantly encumbering the performer. The development of such systems would effectively bring motion capture out of the laboratory and into society. The benefits of untethered and unobtrusive motion capture are three-fold: (1) increased information for human-robot interaction, (2) more representative motion time-series data for robot learning, and (3) greater accessibility to robot control for non-technical people.

Towards this end, we have developed a “motion suit” comprised of small and lightweight independently functioning sensor units. Each sensor independently provides its global orientation using inertial measurement units (IMUs). Orientation information is communicated to a host computer to drive any desired kinematic model. The primary advantages in design of our IMU-based motion suit are that: (1) communication between sensors and a host computer does not represent a physical limitation, (2) the form factor for IMUs is small and progressively decreasing in size, and (3) the sensor topology is highly scalable with regard to adding, removing, or altering nodes. A depiction of the system is given in Figure 2.

Figure 2. Overview of the system. Untethered motion capture is achieved with the use of a wireless radio connection with the host computer.

Our motion capture system consists of 3 hardware subsystems: the computer interface, the capture subject interface, and the individual sensor units. These subsystems are described below.

The computer interface consists of a single Atmel Mega32 microcontroller, a Linx Technologies TR-916-SC-PA transceiver, and a DLP Design DLP-USB245M USB FIFO adapter. The microcontroller provides the low-level control functionality and filters incoming packets for correct structure. In the current implementation, this subsystem acts as a relay between the transceiver and USB in receive mode only.

The capture subject interface consists of a single Mega32 microcontroller, a TR-916-SC-PA transceiver, a 6 V NiCd battery, and five mini-DIN ports to which sensors may be connected. The microcontroller acts as the Serial Peripheral Interface (SPI) master, requesting orientation packets from the connected sensors. Once a valid packet is buffered, it is sent over the transceiver, which currently operates in send mode only. The radio subsystems operate reliably indoors at ranges of over 100 feet.


The inertial measurement units (IMUs) contain: a single Mega32 microcontroller; a 12-bit, 11-channel Texas Instruments ADC; a 3-axis magnetometer (one Honeywell HMC1051 and one Honeywell HMC1052); a 3-axis accelerometer (two Analog Devices ADXL311JE); and a 3-axis gyroscope (three Analog Devices ADXRS300ABG). Various passive components and a voltage regulator are also present. Each sensor is capable of resolving its global orientation relative to gravity and the Earth's magnetic field vector. The sensors communicate with the capture subject radio as SPI slaves. Two mini-DIN connectors allow multiple sensors to be chained together. The bare sensors have a 1.5 x 1.5 x 0.75 inch footprint (see Figure 3).

Figure 3. Hardware for an assembled IMU sensor unit (left image), and two sensors being used to capture arm motions (right image). The radio transmitter is a small black box on the user’s left hand.

Sensors can be added or removed as needed prior to a capture session. The limiting factor in our application is the bandwidth of the Linx radios (33,600 bps). Future plans include migrating to 802.11b, where the limiting factor will become the speed at which the SPI bus can operate. This is governed by the length of the physical connections as well as the Mega32's 16 MHz operating speed.

The capture subject radio, operating as SPI master, polls the SPI bus for the presence of a particular sensor id. Should a sensor with that id be attached to the system, that sensor will respond to the SPI master and the orientation information will be transferred. If the id is not found, the SPI master tries two additional times in case the sensor is in a critical section, then moves on to the next sensor id.
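For illustration, the polling logic described above can be sketched as follows. This is a hypothetical Python sketch rather than the actual Mega32 firmware; the request_packet callback and the MAX_ATTEMPTS constant stand in for the real SPI transaction and the retry policy (one initial request plus two retries).

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # one initial request plus two retries, as described above

def poll_sensors(sensor_ids, request_packet: Callable[[int], Optional[bytes]]):
    """Poll each sensor id on the SPI bus and collect valid orientation packets."""
    packets = {}
    for sid in sensor_ids:
        for _ in range(MAX_ATTEMPTS):
            packet = request_packet(sid)   # None if the sensor did not respond
            if packet is not None:
                packets[sid] = packet      # buffered packet is then sent over the radio
                break
        # if no response after MAX_ATTEMPTS, move on to the next sensor id
    return packets
```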

Our approach to fusing the sensors' information in order to estimate a 3-degree-of-freedom (DOF) orientation is to integrate gyroscope signals for orientation estimates and to correct drift errors with accelerometer and magnetometer readings. This approach is described in detail in a separate report [Mil+04].
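The sketch below only illustrates the general idea of such a complementary filter; it is not the algorithm of [Mil+04]. The TRIAD-style construction of the absolute orientation, the world-frame reference vectors, and the correction gain are illustrative assumptions.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def _triad(v1, v2):
    """Orthonormal frame (columns) built from two non-parallel vectors."""
    t1 = v1 / np.linalg.norm(v1)
    t2 = np.cross(v1, v2); t2 /= np.linalg.norm(t2)
    return np.column_stack([t1, t2, np.cross(t1, t2)])

def orientation_from_accel_mag(accel, mag,
                               g_world=np.array([0.0, 0.0, 1.0]),
                               m_world=np.array([1.0, 0.0, 0.0])):
    """Absolute (drift-free but noisy) orientation from gravity and magnetic field."""
    return Rotation.from_matrix(_triad(g_world, m_world) @ _triad(accel, mag).T)

def fuse_step(R_est, gyro, accel, mag, dt, gain=0.02):
    """One filter step: integrate the gyroscope, then nudge the result toward
    the accelerometer/magnetometer estimate to cancel gyroscope drift."""
    R_pred = R_est * Rotation.from_rotvec(np.asarray(gyro) * dt)
    err = (orientation_from_accel_mag(accel, mag) * R_pred.inv()).as_rotvec()
    return Rotation.from_rotvec(gain * err) * R_pred
```

The next section presents our experiments on teleoperating Robonaut using the motion suit.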

3. Motion Suit Experiments at NASA JSC

We have performed experiments using the motion suit to teleoperate Robonaut at NASA Johnson Space Center, Houston. In these experiments, we used two sensors to capture arm motions of the teleoperator: sensor 1 was placed on the upper arm, and sensor 2 on the lower arm of the teleoperator.

Individual sensors of the motion suit provide orientation with respect to a global frame. Consequently, the orientation information from each sensor needs to be converted to a local frame, according to the parent-child relationship of the sensor placement. Rotations expressed in their local frames then need to be decomposed into the correct Euler angle sequence in order to obtain the correct angular rotations for each of Robonaut's controlled arm joints.

Note that the three axes of rotation in Robonaut's shoulder do not coincide at a single point. Such an alignment does exist in the sensed rotations, so a correction would be required for a perfect match. However, the misalignment in Robonaut's shoulder is small; therefore, we did not need to include such an additional correction step.

Offsets must also be added to the obtained angular rotations in order to match motion suit postures to the target Robonaut postures. The offsets are defined by the joint values of the target structure (Robonaut's arm) required to achieve the same posture as the initialization posture defined for the motion suit. Some angular offsets must be negated in order to conform with the direction of rotation of the corresponding actuated joint.

Let θ1, θ2, and θ3 be the three final Euler angles obtained from sensor 1 after all of the corrections discussed above. Similarly, let θ4, θ5, and θ6 be the final angles obtained from sensor 2. Similar to a human arm, Robonaut’s arm has a total of 7 DOFs: 2 in the shoulder, 1 in the upper arm, 1 in the elbow, 1 in the lower arm, and 2 in the wrist. In the performed experiments, only 5 DOFs were used. Angles θ1, θ2, and θ3 are mapped to the 3 joints affecting the rotation of Robonaut’s shoulder. Angles θ4 and θ5 are mapped to the elbow flexion and lower arm twist (see Figure 4). A third sensor placed on the user’s hand would be required in order to teleoperate the remaining 2 joints of Robonaut’s wrist.
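This conversion can be summarized with the sketch below, assuming each sensor orientation is available as a rotation in a common global frame. The Euler orders, offsets, and sign flips are placeholders; the actual values depend on Robonaut's joint conventions and on the initialization posture discussed above.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Placeholder calibration constants: Euler decomposition orders, joint-angle
# offsets (radians), and sign flips matching the suit posture to Robonaut's.
SHOULDER_EULER_ORDER = "zyx"
FOREARM_EULER_ORDER = "zyx"
OFFSETS = np.zeros(5)
SIGNS = np.array([1.0, 1.0, 1.0, 1.0, -1.0])

def suit_to_joint_angles(R_upper: Rotation, R_lower: Rotation) -> np.ndarray:
    """Map the two global sensor orientations (upper arm, lower arm) to the
    5 controlled joint angles of Robonaut's arm."""
    theta_shoulder = R_upper.as_euler(SHOULDER_EULER_ORDER)          # angles θ1, θ2, θ3
    R_lower_local = R_upper.inv() * R_lower                          # child relative to parent
    theta_forearm = R_lower_local.as_euler(FOREARM_EULER_ORDER)[:2]  # angles θ4, θ5
    raw = np.concatenate([theta_shoulder, theta_forearm])
    return SIGNS * raw + OFFSETS
```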

Figure 4. Sensor mapping to Robonaut's arm.

Robonaut’s arm control interface requires the use of its built-in inverse kinematics (IK) mechanism. The IK mechanism accepts as commands the desired 6-DOF position and orientation of the end-effector in Robonaut’s coordinates (relative to a frame located within Robonaut’s chest). The Jacobian-based IK is then responsible for constructing a valid posture. We explicitly model Robonaut’s kinematics in order to translate motion capture data into IK commands for Robonaut. Figure 5 illustrates this mapping from the sensors to the intermediate model and finally to Robonaut, and Figure 6 shows snapshots of teleoperation trials.

Note that the adopted approach maps sensor data directly to the joint angles of the simulated Robonaut model (Figure 5). Another possible approach is to map the data to a model of the teleoperator’s arm, and then command Robonaut’s hand to achieve the same position as the teleoperator’s hand. This would result in a better match between the end-effector’s motions and the task, and such an approach might be better suited to teleoperating manipulation tasks. However, it requires a realistic model of the teleoperator’s arm in order to achieve realistic results. Our approach is more general, as it does not require a model of the teleoperator’s arm and automatically scales teleoperated motions into the reachable space of Robonaut.


Figure 5. Two sensors are placed on the right arm of the user; the radio unit can be seen in the user's left hand (left image). Joint angles are derived from the sensors and mapped to the arm of a simulated model of Robonaut (center image). The obtained end-effector position and orientation in our simulator are sent over the network to the Robonaut control interface for actuation (right image).

Figure 6. Snapshot sequences from Robonaut teleoperation trials.

Due to safety concerns, Robonaut does not accept control commands unless its actual end-effector position is matched, within some close threshold, by the desired end-effector position sent by the motion suit. This problem was addressed through reindexing. At initialization, reindexing saves the offset from the robot’s current posture to the motion suit’s current posture. During teleoperation, the offset is applied to the postures derived from the motion suit, and the robot is commanded so as to maintain the relative positioning of the operator’s and Robonaut’s arms.
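A minimal sketch of this reindexing scheme is given below, assuming end-effector poses are represented as a position vector plus a rotation. The class and method names are illustrative and do not correspond to Robonaut's actual control interface.

```python
import numpy as np
from scipy.spatial.transform import Rotation

class Reindexer:
    """Sketch of the reindexing scheme: store the offset between the robot's
    current end-effector pose and the suit-derived pose, then apply it to
    every subsequent suit-derived pose so that both arms move together while
    keeping their relative positioning."""

    def reindex(self, robot_pos, robot_rot: Rotation, suit_pos, suit_rot: Rotation):
        # called at initialization, or whenever the user requests a re-index
        self.dpos = np.asarray(robot_pos) - np.asarray(suit_pos)
        self.drot = robot_rot * suit_rot.inv()

    def command(self, suit_pos, suit_rot: Rotation):
        # desired end-effector pose to be sent to Robonaut's IK interface
        return np.asarray(suit_pos) + self.dpos, self.drot * suit_rot
```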

The implemented reindexing control interface not only addresses safety issues, but also allows for the simple implementation of interesting performer-robot mappings. For instance, the user can keep his/her hands in a comfortable location proximal to his/her body, while Robonaut’s end-effectors are displaced to a more distant location, one that would be too uncomfortable, or even inaccessible, for the user (see Figure 7).

Our motion suit has demonstrated reliable operation over 916 MHz radio communication. The suit has operated successfully for periods of up to 1 hour at distances of up to 100 feet. Our choice of IMUs and radios was driven more by cost than by accuracy. The equipment cost for each sensor is less than $300 USD, and the radio equipment costs less than $100 USD. The sensors, radios, and batteries were constructed from standard components, and therefore the form factor can potentially be reduced such that worn sensors can be hidden under clothing.


Figure 7. Reindexing relative control during teleoperation. The contracted (colored) arm represents Robonaut’s actual posture, and the extended arm represents the motion suit posture. Both arms move together maintaining their relative position, which can be “reindexed” at any time.

4. A Cross-Kinematics Metric for Imitation Learning

Imitation learning is a very common and natural form of learning, suitable for humanoid and other highly articulated robots that are expected to be versatile and adaptive. Imitation involves mapping the elements of an observed demonstration to one's own set of capabilities. A successful imitation can involve several different aspects and levels. First, the mapping from the limbs and joints of the demonstrator to those of the imitator needs to be established. This can be very challenging when the demonstrator and the imitator have dissimilar kinematics. Even if the kinematics are similar, there can still be issues arising from asymmetry in structure and from possible differences in size and scale. Once such a mapping has been established, it is essential to ascertain the mapping from the actions of the demonstrator to those of the imitator. It is not always clear what constitutes similar behavior on the part of the imitator. Often, the demonstrator has a goal in performing an action, and this goal may not be readily apparent. Owing to these and related issues, determining what constitutes a successful imitation, and what measures could be employed to evaluate imitation, is a very challenging problem.

Our goal in this topic is to develop a comprehensive correspondence metric that can provide a scalar measure of dissimilarity/distance between any pair of action sequences executed by agents with similar or different kinematic structure. By developing such a metric, we intend to provide a standardized means to quantitatively evaluate imitation. The metric can also be employed for action selection by the imitating agent. In addition, it can also benefit learning agents by serving as a gradient in action space. Apart from these benefits, such a metric can also have applications in a wider range of domains like animation and synthesis of believable motion, transfer of motion and action strategies from one configuration to another for a reconfigurable robot, monitoring the progress of a patient in physical therapy, etc.

We have developed a metric that geometrically compares any two action sequences and provides a scalar measure of dissimilarity. In order to make the metric comprehensive we propose the use of a weighting mechanism that can be suitably instantiated to bias the metric based on what is significant in the demonstrator's action sequence. To keep the metric sufficiently general, we also enforce two additional constraints, namely (1) reference frame independence: the two agents are not required to be in a common reference frame; and (2) scale invariance: scaling of any one agent does not affect the dissimilarity measure. The first property automatically implies translational and rotational invariance.


In order to develop such a comprehensive metric, we introduced the notion of a pose metric, which provides distance measures between static poses of two agents. This was then extended to measures of distance between actions. We define a kinematic tree, or k-tree, as a uniform representation to encode the structure of an agent. The notion of a formal pose, which incorporates the position and orientation information of an agent, is then defined. A pose matching algorithm takes two poses and provides a distance measure. The pose matching algorithm proceeds in several steps. It first projects the two poses onto a common reference frame by employing multi-dimensional scaling (MDS) followed by principal component analysis (PCA). A subset matching algorithm is then applied that maps groups of nodes from the pose of one agent to groups of nodes from the pose of the other. This mapping is then used to compute the pose distance between the two poses.

An action is treated as a sequence of frames. Actions of two agents are compared by using an interpolation and convolution based mechanism. The poses constituting the actions of each agent are treated as key frames and additional poses are introduced between these frames using interpolation. The number of additional frames introduced is a function of the pose distance between adjacent key frames. This operation is performed individually on both the actions such that they result in the same number of frames. This constitutes the interpolation step in the comparison. The convolution step involves minimizing the summation of the pose distances between the frames from the two actions for all corresponding positions of the frames.
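The following sketch illustrates the interpolation and convolution steps in a simplified form, assuming each pose is flattened into a fixed-length vector and that frames are resampled uniformly. The actual method inserts frames as a function of the pose distance between adjacent key frames, and uses the pose metric described above as pose_distance.

```python
import numpy as np

def resample(frames: np.ndarray, n: int) -> np.ndarray:
    """Linearly resample a (T, D) pose sequence to n frames (the 'interpolation' step)."""
    t_old = np.linspace(0.0, 1.0, len(frames))
    t_new = np.linspace(0.0, 1.0, n)
    return np.stack([np.interp(t_new, t_old, frames[:, d])
                     for d in range(frames.shape[1])], axis=1)

def action_distance(a: np.ndarray, b: np.ndarray, pose_distance, n: int = 100) -> float:
    """'Convolution' step: after resampling both actions to n frames, slide one
    action over the other and return the minimum average pose distance over all
    alignments of corresponding frames."""
    a, b = resample(a, n), resample(b, n)
    best = np.inf
    for shift in range(-n + 1, n):
        pairs = [(i, i + shift) for i in range(n) if 0 <= i + shift < n]
        d = sum(pose_distance(a[i], b[j]) for i, j in pairs) / len(pairs)
        best = min(best, d)
    return best
```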

The achieved results are available in two publications [Am04a] [Am04b]. The comparison of actions is currently being pursued. The pose distance measure has been successfully employed in the comparison of dissimilarly embodied agents, namely (1) a human, (2) the Sony AIBO robotic dog, and (3) a simulated dolphin-like skeletal structure. Figure 8 shows an example pose for each of these agents.

Figure 8. Example poses of the human, the Sony AIBO, and the dolphin-like skeleton.

The pairwise distances between various poses of these agents were computed and a symmetric dissimilarity matrix was created. The MDS algorithm was applied to this dissimilarity matrix and an embedding in a 2D Euclidean space was calculated. Figure 9 shows the embedding of all the poses. It can be seen that similar poses are closer to each other than less similar ones.
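As an illustration of this step, the sketch below builds the symmetric dissimilarity matrix from an arbitrary pose_distance function (standing in for the pose metric of this section) and computes the 2D embedding with a standard metric MDS implementation. The specific MDS package shown is an assumption of the sketch.

```python
import numpy as np
from sklearn.manifold import MDS

def embed_poses(poses, pose_distance):
    """Build the symmetric dissimilarity matrix from pairwise pose distances and
    embed it in a 2D Euclidean space with metric MDS (as in Figure 9)."""
    n = len(poses)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = pose_distance(poses[i], poses[j])
    mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
    return mds.fit_transform(D)   # (n, 2) coordinates, one point per pose
```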


Figure 9. Embedding of poses on 2D Euclidean space.

5. Motion Controllers Based on Collected Motion

We have evaluated the utility of motion primitives derived via ST-Isomap [JM04a] in several contexts described in [JM04b] and [Jen03]. Our primitives were combined with a task-level arbitration mechanism for incremental on-line trajectory formation for humanoid control. This control structure was used to successfully control Robosim, NASA’s simulation tool for testing controllers for Robonaut (see Figure 10). Our derivation method was further evaluated by using primitive-controlled motion as input to our derivation procedure. Derived primitives were used in the application example described in Section 9, and have also been used as predictors for classifying human motion and imitation.

Figure 10. Snapshots from on-line right-arm control of a “cabbage patch” dance on the Robonaut simulator (Robosim).

6. Learning Sensory Structures from Robonaut Teleoperation

In collaboration with Alan Peters of Vanderbilt University, we have applied sequentially continuous ST-Isomap [Jen03] [JM04a] to sensory data collected from Robonaut. For this joint work, Robonaut was teleoperated to grasp a horizontal wrench at nine different locations within its workspace. Robonaut continuously publishes its sensory and motor information to programs that record this information for further use. We applied sequentially continuous ST-Isomap on sensory data from five of the teleoperated grasps in an attempt to uncover the spatio-temporal structure of the grasp behavior. Data vectors recorded from Robonaut consist of 110 variables for both motor and sensory data. Motor data, including motor actuation forces and joint position and velocity, were zeroed out.


The remaining 57 non-zeroed variables contain sensory data, consisting of tactile sensors on the fingers and force sensors on various positions of the robot. Each of these variables was normalized to a common range across all variables.
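A minimal sketch of this normalization, assuming a simple per-variable min-max scaling of the (frames x variables) data matrix; the exact range used in our preprocessing may differ.

```python
import numpy as np

def normalize_common_range(X: np.ndarray) -> np.ndarray:
    """Scale each sensory variable (column) of the data matrix to [0, 1] so that
    tactile and force channels with different units become comparable."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for constant channels
    return (X - lo) / span
```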

The embedding of this sensory data by ST-Isomap is shown in Figure 11, together with a comparison to an embedding by PCA. The structure of the grasps can only be vaguely interpreted in the PCA embedding. In contrast, the structure of the grasps is apparent in the ST-Isomap embedding as two loops. The smaller loop is indicative of reaching from and returning to the zero posture of the robot. The larger loop is indicative of grasp closure and release around the wrench. The points occurring during the grasp are within the smaller cluster.


Figure 11. Two views of the PCA embedding for the grasp data from Robonaut teleoperation (a,b). Two views of the same data embedded by sequentially continuous ST-Isomap (c,d). Distance matrix for the ST-Isomap embedding (e).

The structure uncovered for the grasp provides a model describing sensor values during a grasp. This model can also serve to describe the sensory data of grasps not included in producing the embedding. To test this hypothesis, we selected and normalized data from a grasp not used for training. Given the sensory data for the test grasp, the training grasps, and the embedded training grasps, Shepard's interpolation [She68] was used to map the test grasp onto the structure found in the embedding space. The mapped test grasp is shown in Figure 12.
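For reference, Shepard's interpolation [She68] is the inverse-distance-weighted scheme sketched below; here it maps a new sensory vector into the embedding space using the training grasps and their ST-Isomap coordinates. The exponent p and the per-frame treatment are assumptions of the sketch.

```python
import numpy as np

def shepard_map(x, train_X, train_Y, p=2.0, eps=1e-9):
    """Map a query sensory vector x into the embedding space via Shepard's
    inverse-distance-weighted interpolation over the training sensory vectors
    train_X and their embedding coordinates train_Y."""
    d = np.linalg.norm(np.asarray(train_X) - np.asarray(x), axis=1)
    if np.any(d < eps):                      # query coincides with a training point
        return np.asarray(train_Y)[np.argmin(d)]
    w = 1.0 / d**p
    return (w[:, None] * np.asarray(train_Y)).sum(axis=0) / w.sum()
```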


Figure 12. A test grasp mapped via Shepard's interpolation onto the grasp structure in the ST-Isomap embedding (a,b).

7. Planning Collision-Free Reaching Motions

Robotic algorithms are not yet capable of producing arm motions as efficiently as humans do, and efficient motion planning [Lat91] for complex manipulators such as humanoid robots remains an open research problem. In this context, we have investigated the use of randomized roadmaps [Kav+96] for the on-line planning of collision-free arm motions among obstacles for Robonaut.

We consider the obstacles in the workspace to be static; however, because we plan collision-free motions on-line, obstacles are allowed to change position between motion queries. A typical target application is manipulation tasks in environments with obstacles: the objects are static and the motion planner provides collision-free arm motions for reaching the desired objects to be grasped. The on-line planner can then be called again to relocate the grasped object to another place. This kind of scenario appears in several maintenance tasks suitable for robotic operation, and when the tools, controls, handles, etc., to be reached are situated in difficult locations, the use of a motion planner becomes essential.

We have employed the Rapidly-exploring Random Tree (RRT) roadmap generation method [Lav98], in its on-line bi-directional version [KL00], to generate on-line motions for both arms of Robonaut. In simple environments, such as the one illustrated in Figure 13, collision-free motions can be obtained in less than 2 seconds. This time includes the entire planning process, which consists of three sequential steps: roadmap generation, path retrieval from the roadmap, and final path optimization (smoothing).


Figure 13. Example of a planned collision-free motion between two possible pre-grasp postures for a box. This motion was generated in less than 2 seconds and controls 17 DOFs: 7 in each arm, and 3 at the base (base rotation was only slightly used in this example).

In order to address complex environments with acceptable computation times, we have investigated the use of pre-computed dynamic roadmaps. The idea is to encode valid motions in a pre-computed roadmap, and to use a cell decomposition of the workspace to map, for each cell overlapped by obstacles, the edges and nodes of the roadmap that are affected. Each time obstacle changes are perceived, the affected workspace cells provide the corresponding edges and nodes of the roadmap to be re-validated on-line. Such roadmap maintenance avoids constructing a roadmap from scratch for each motion planning query, eliminating the on-line roadmap construction step. The resulting Dynamic Roadmap (DRM) is thus able to cope with dynamic changes in the environment.
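The bookkeeping behind this roadmap maintenance can be sketched as follows; the cell-to-node and cell-to-edge maps are computed off-line, and the node_collides/edge_collides callbacks (hypothetical names) check a whole roadmap element against the current obstacles. A path query then searches only the currently valid nodes and edges.

```python
class DynamicRoadmap:
    """Sketch of the dynamic roadmap (DRM) bookkeeping: each workspace cell
    stores which roadmap nodes and edges pass through it, so that only those
    elements are re-validated when obstacle changes are perceived."""

    def __init__(self, nodes, edges, cell_to_nodes, cell_to_edges):
        self.cell_to_nodes = cell_to_nodes   # cell id -> node ids (computed off-line)
        self.cell_to_edges = cell_to_edges   # cell id -> edge ids (computed off-line)
        self.valid_nodes = set(nodes)
        self.valid_edges = set(edges)

    def update(self, changed_cells, node_collides, edge_collides):
        """Re-validate only the roadmap elements mapped to cells whose occupancy changed."""
        for cell in changed_cells:
            for n in self.cell_to_nodes.get(cell, ()):
                if node_collides(n):
                    self.valid_nodes.discard(n)
                else:
                    self.valid_nodes.add(n)
            for e in self.cell_to_edges.get(cell, ()):
                if edge_collides(e):
                    self.valid_edges.discard(e)
                else:
                    self.valid_edges.add(e)
```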

There is, however, a maintenance time cost that needs to be added to the planning process. We have performed several experiments with different simulated scenarios and manipulator types in order to determine the situations in which DRMs are valuable. Three different scenarios were used (see Figure 14) and the results are reported in detail in a recent publication [KM04].

Our experiments show that the maintenance of pre-computed roadmaps can lead to faster and more accurate results than performing single-query motion planning alone. In the Robonaut scenario (Figure 14c), the method achieved only a modest speed gain. However, the much better performance (up to 8 times faster) obtained in the planar scenarios (Figure 14a,b) motivates further experimentation with the method. Factors such as the frequency of queries, the complexity of the environment, the number of DOFs, and the number and size of the dynamic obstacles have an important impact on the final performance of the method.



Figure 14. Three different scenarios were used for comparing the maintenance of DRMs against using an on-line RRT planner: a 4 DOF manipulator arm (a), a 7 DOF manipulator with two arms (b), and a Robonaut model with 17 DOF (7 for each arm and 3 at the base) (c). One example planned motion in the Robonaut scenario is also shown (d).

8. A Planner for Sequencing Parameterized Motions

The control of complex robot motions remains a key challenge in robotics. While the number of degrees of freedom (DOF) can characterize the complexity of a robot, the complexity of its motions is further influenced by the constraints they are subjected to.

Evidence from neuroscience supports the idea that complex and adaptive motor behavior might be obtained through the combination of motor primitives [TS00]. In robotics, most work in this domain has focused on the design, learning, and combination of different kinds of motor primitives. In contrast, we focus here on the problem of automatic sequencing of movement primitives in order to satisfy a given motion task in an unknown environment.

We consider that a movement primitive affects the configuration of a robot through a proper parameterization, respecting a set of motion constraints. Our method is then able to plan motions traversing different configuration sub-spaces, each being covered by a single primitive. A movement primitive can be a motion segment captured from teleoperation, or any parameterized motion controller. We present here some results obtained for planning statically-stable walking motions for a biped robot. A detailed report of this work can be found in a recent publication [KBM04].

The benefits of our approach are twofold: 1) planning complex motions becomes more efficient in the reduced dimensionality of each movement primitive, and 2) entire motions containing heterogeneous types of constraints (e.g., collision avoidance, balance, alternating support contacts) can be planned. The technique is directly applicable to several other domains, for instance, determining the correct sequence of manipulations to relocate a complex object from one place to another, or synchronizing the arm control of a humanoid using two different tools, one in each hand, towards the achievement of a task.

The given vocabulary of movement primitives is responsible for reducing the planning complexity, which is related both to the number of DOF to be controlled and to the diversity of motion constraints to be satisfied. If the vocabulary of movement primitives is able to express the motions required to satisfy the task, the planner will easily find connection points between primitives, allowing the search tree to approach a solution. The chosen planner dictates how movement primitives are adapted to overcome obstacles towards connection points.

In our test example, we have used the algorithm for the successful planning of statically stable walking motions for a biped robot moving in a planar environment containing polygonal obstacles. Obstacles are avoided during motion and are also used as support, allowing the generation of climbing sequences.

The designed biped robot has a total of 9 DOF: the first two specify the position of the body center in the Cartesian plane. The remaining DOF are rotation angles: one specifies the orientation of the body, and three control the articulations of each leg. These articulations are controlled only through three pre-defined primitive movement controllers, which ensure the following validity constraints: compliance with articulation limits, collision avoidance, and balance, i.e., the center of mass projects inside the support segment of the robot. A summary of these primitive movement controllers is shown in Table 1.

Movement Primitive | Instantiation Condition | Type of Motion | Parametric Space Dimension
PL | support on left foot | moves right leg articulations and body rotation | 4
PB | support on both feet | moves body, legs fixed with IK | 3
PR | support on right foot | moves left leg articulations and body rotation | 4

Table 1. Summary of the movement primitives used (primitive illustrations omitted).

Our method finds the sequencing of primitives by means of a search tree in which nodes are configurations reachable by more than one movement primitive, and edges represent valid paths connecting parent and child nodes. An on-line bi-directional RRT sampling-based motion planner [KL00] operating in the parameter space of a single movement primitive determines each valid path. Therefore, edges represent primitive motions leading to nodes that serve as connection points allowing primitive change. The tree is expanded with A*-like best-first search using greedy problem-specific heuristics [RN95], until the desired task is satisfied.
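The expansion loop can be summarized by the following sketch, in which expand(config) is a placeholder for running the single-primitive RRT planner from a configuration and returning the reachable configurations where a primitive change is possible, together with the primitive used and the path cost; goal_reached and cost_to_go stand in for the problem-specific goal test and heuristic.

```python
import heapq

def plan_primitive_sequence(start_config, goal_reached, cost_to_go, expand):
    """A*-like best-first expansion of the primitive-sequencing search tree.
    expand(config) yields (next_config, primitive, path_cost) tuples, one for
    each configuration reachable by a single movement primitive where a
    primitive change is possible."""
    counter = 0                     # tie-breaker so heap entries never compare configurations
    frontier = [(cost_to_go(start_config), 0.0, counter, start_config, [])]
    while frontier:
        _, g, _, config, plan = heapq.heappop(frontier)
        if goal_reached(config):
            return plan             # list of (primitive, configuration) connection points
        for next_config, primitive, path_cost in expand(config):
            counter += 1
            g_next = g + path_cost
            heapq.heappush(frontier, (g_next + cost_to_go(next_config), g_next,
                                      counter, next_config, plan + [(primitive, next_config)]))
    return None                     # task could not be satisfied with the given primitives
```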

Figure 15 illustrates the expansion process. Figure 15a shows the roadmap constructed for the robot in a configuration with both feet in support. Each node in the roadmap represents a full configuration, but in this image only the position of the body center is used to draw the roadmap. The roadmap in Figure 15b shows a mark (a cross) on each node allowing a primitive change; these are the nodes that become new leaves in the search tree. Figure 15c shows the robot configuration at the highest-priority leaf, which is the one selected for node expansion, now with single-leg support. Priorities are set following an A*-like heuristic, which adds the cost-to-come to a heuristic estimate of the distance to the goal. The roadmap in Figure 15d shows the coverage of the tip of the free foot in the free space. The free foot is the one not used as support in the current primitive controller. The figure shows marks on two configurations again allowing a primitive change, and thus becoming new leaves in the next expansion of the tree. In all images, the circle identifies the root of the roadmaps. The search finishes when the next leaf to be expanded represents a configuration close enough to the target location. Figure 16 shows an example of an obtained motion.


Figure 15. Node expansion example starting with both feet in support.

Figure 16. An example of a planned motion for climbing stairs.

9. An Integrated Application Example

We have evaluated the research topics presented in this report through the implementation of a cooperative human-robot example application. This example was tested with Robosim, NASA’s official Robonaut simulator. The experiment was limited to the simulator due to the difficulty of scheduling all the required work with Robonaut at NASA JSC before the end of the MARS 2020 program.

The experiment consists of learning scooping motions to allow Robonaut to perform autonomous scooping in a tray full of sand, according to the user’s instructions. The result is the ability to perform tasks requiring human-robot cooperation.

The learning step can be summarized in three parts:

1. Demonstration of example motions to Robonaut using the USC motion suit.

2. Segmentation of the demonstrated motions into meaningful example motions and mapping of the example motions onto Robonaut’s kinematics.

3. Derivation of a primitive controller able to interpolate the example motions in order to obtain a scooping motion at any position inside the tray.

After the learning step has been accomplished, Robonaut is able to interpolate the example motions on-line in order to scoop at the exact locations pointed to by the user. Pointed locations can be captured with the motion suit or with any simple vision system. At this point, Robonaut is able to scoop at any location determined by the user. A further possibility is to command Robonaut to scoop in a different tray or to change tools. Such commands can be given with simple voice commands, and the required arm motions are generated using the randomized motion planners described in this report.
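As an illustration of the on-line interpolation step, the sketch below blends the demonstrated scooping trajectories according to how close their scoop locations are to the location pointed to by the user. The inverse-distance weighting is an assumption of the sketch; the actual primitive controller derived in the learning step may interpolate the examples differently.

```python
import numpy as np

def interpolate_scoop(target_xy, example_locations, example_trajectories, p=2.0, eps=1e-9):
    """Blend demonstrated scooping trajectories (all resampled to the same number
    of frames) according to the distance between their scoop locations and the
    location pointed to by the user."""
    d = np.linalg.norm(np.asarray(example_locations) - np.asarray(target_xy), axis=1)
    if np.any(d < eps):                       # target coincides with a demonstrated location
        return np.asarray(example_trajectories)[int(np.argmin(d))]
    w = 1.0 / d**p
    w /= w.sum()
    # weighted average of the (n_frames, n_joints) example trajectories
    return np.tensordot(w, np.asarray(example_trajectories), axes=1)
```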

Figure 17 illustrates the experiments performed in simulation towards the achievement of an integrated demonstration. The simulations were performed with Robosim, which has the same control interface as Robonaut. Our successful teleoperation experiments with the motion suit demonstrate that our methods can be integrated with Robonaut towards the implementation of real human-robot cooperative applications, such as the one described in this section.

Figure 17. The left image shows our model of Robonaut performing scooping motions at the desired locations in the tray. The right image shows Robosim being correctly controlled to perform the same motions.

10. Summary

Our research achievements contribute significantly to increasing the autonomy of robotic assistants. The presented methods allow robots to move a step beyond direct teleoperation. We have presented methods that allow robots to learn motion tasks from demonstrated example motions. Such approaches are particularly suitable for tasks involving human-robot collaboration, which is the next big challenge in robotics: making robots part of society as real and efficient assistants for a variety of tasks.

In this context, we have achieved the following milestones and contributions:

Development of a motion capture system for low cost, lightweight, wireless, real-time motion collection. The system was successfully used to teleoperate Robonaut.

Cross-kinematics analysis for learning motion primitives from various sources.

Prediction of the expected sensory information for successful grasping, from the analysis and interpolation of collected Robonaut sensory data.

The use of randomized motion planning for reaching and for motion sequencing.

The use of parameterized controllers allowing the implementation of human-robot cooperation applications for a variety of tasks. An application example was demonstrated using Robosim (NASA’s Robonaut simulator).

References


[Mil+04] N. Miller, O. C. Jenkins, M. Kallmann, and M. Matarić, “Motion Capture from Inertial Sensing for Untethered Humanoid Teleoperation”, submitted to the IEEE-RAS/RSJ International Conference on Humanoid Robots, 2004.

[Am04a] R. Amit, “A Metric for the Evaluation of Imitation”, to appear in the Doctoral Consortium, Proceedings of the 19th National Conference on Artificial Intelligence (AAAI'04), San Jose, California, July 25-29, 2004.

[Am04b] R. Amit, “A Correspondence Metric for Imitation”, to appear in the Poster Abstracts, Proceedings of the 19th National Conference on Artificial Intelligence (AAAI'04), San Jose, California, July 25-29, 2004.

[JM03] O. C. Jenkins and M. Matarić, “Automated Derivation of Behavior Vocabularies for Autonomous Humanoid Motion”, Second International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 225-232, Melbourne, Australia, July 2003.

[Jen03] O. C. Jenkins, “Data-driven Derivation of Skills for Autonomous Humanoid Agents”, Ph.D. dissertation, The University of Southern California, 2003.

[JM04a] O. C. Jenkins, M. Matarić, “A Spatio-temporal Extension to Isomap Nonlinear Dimension Reduction”, to appear in the International Conference On Machine Learning (ICML), 2004.

[DCM04] E. Drumwright, O. C. Jenkins, and M. J. Matarić, “Exemplar-Based Primitives for Humanoid Movement Classification and Control”, In IEEE International Conference on Robotics and Automation (ICRA), pages 140-145, Apr 2004.

[Ero+03] D. Erol, J. Park, E. Turkay, K. Kawamura, O. C. Jenkins, and M. J. Matarić, “Motion Generation for Humanoid Robots with Automatically Derived Behaviors”, IEEE Systems Man and Cybernetics (SMC 2003), pp 1816-1822, October, 2003, Washington, D.C., USA.

[Amb+00] R. O. Ambrose, H. Aldridge, R. S. Askew, R. R. Burridge, W. Bluethmann, M. Diftler, C. Lovchik, D. Magruder, and F. Rehnmark, “ROBONAUT: NASA's Space Humanoid”, IEEE Intelligent Systems, 15(4):57-63, 2000.

[She68] D. Shepard, “A two-dimensional interpolation function for irregularly-spaced data”, Proceedings of the ACM National Conference, pp. 517-524, 1968.

[JM04b] O. C. Jenkins and M. J. Matarić, “Performance-Derived Behavior Vocabularies: Data-driven Acquisition of Skills from Motion”, to appear in the International Journal of Humanoid Robotics, 2004.

[KBM04] M. Kallmann, R. Bargmann, and M. Matarić, “Planning the Sequencing of Movement Primitives”, Proceedings of the International Conference on Simulation of Adaptive Behavior (SAB), Los Angeles, 2004.

[KM04] M. Kallmann and M. Matarić, “Motion Planning Using Dynamic Roadmaps”, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), New Orleans, 2004, pp. 4399-4404.

[Kav+96] L. E. Kavraki, P. Svestka, J.-C. Latombe, and M. H. Overmars, “Probabilistic roadmaps for fast path planning in high-dimensional configuration spaces”, IEEE Transactions on Robotics and Automation, 12(4):566-580, June 1996.

[KL00] J. J. Kuffner and S. M. LaValle, “RRT-connect: an efficient approach to single-query path planning”, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), San Francisco, 2000, pp. 995-1001.


[Lat91] J.-C. Latombe, “Robot Motion Planning”, Kluwer Academic Publishers, ISBN 0-7923-9206-X, 1991.

[Lav98] S. M. LaValle, “Rapidly-exploring random trees: a new tool for path planning”, Technical Report TR 98-11, Computer Science Dept., Iowa State University, Oct. 1998.

[TS00] K. A. Thoroughman and R. Shadmehr, “Learning of action through combination of motor primitives”, Nature 407:742–747, 2000.

[RN95] S. J. Russell and P. Norvig, “Artificial intelligence: a modern approach”, Prentice Hall, Englewood Cliffs, NJ, 1995.

[OMM04] O. C. Jenkins, M. N. Nicolescu, and M. J. Matarić, “Autonomy and Supervision for Robot Skills and Tasks Learned from Demonstration”, to appear in the AAAI-04 Workshop on Supervisory Control of Learning and Adaptive Systems, 2004.
