dodoi-enia-03


  • 8/7/2019 dodoi-enia-03


    High level techniques for self-repairing robotic systems

    Claus C. Aranha, Jacques Wainer, Andre Covic Bastos

    Instituto de Computação, Universidade Estadual de Campinas

    Avenida Albert Einstein, 1251, Caixa Postal 6176 13083-970 Campinas, SP - Brasil

    [email protected], [email protected], [email protected]

    Abstract. Robotic fault-tolerance techniques usually refer to methods that isolate and treat faults individually. In this work we propose high level (planning) techniques for a new approach, in which multiple, possibly heterogeneous devices in a robotic system cooperate to diagnose faults and to take over a faulty device's functions in the system. The technique uses pre-defined replacement plans to deal with faults, avoiding costly online replanning.

    1. Introduction

    Robots are usually employed to replace humans in hazardous tasks, or in tasks in hazardous environments. These missions usually strain the robot's circuitry to its limits, making component failures more common than on robots that perform assembly line tasks. Therefore, the ability to avoid and tolerate failures has become an important factor in evaluating a robotic system's performance on a given mission. Another desirable feature of such an autonomous system is the ability to recover and reconfigure itself after detecting an error, so that it does not lose its functionality, even if at some performance cost. Both should be done in a reasonable amount of time for a robot to survive in a dynamic environment.

    Fault tolerant and self reconfigurable systems are especially useful for extending a robot's autonomy time, that is, how long the system is able to run without direct or indirect support from humans. This is necessary for robots used in space exploration, where the time lag between base and robot can be very long, or when communication with the base is not reliable. Even for earth-based robots, a large degree of autonomy is desirable when human support is not available, as in underwater environments or emergency situations. Our aim, however, is not only to make robots last long enough to be eventually repaired by humans, but eventually to make robots that are capable of repairing themselves.

    While, in the field of robotics, the terms fault tolerance, self repair, and autonomous robots have all been used to name quite different lines of research, in this work we focus on a high level (planning level) approach, which we call the Replacement process. Given an action $a$ belonging to a plan $P$ needed for a robot to accomplish a mission, it consists of finding a group of actions $a'$ to use in place of $a$, so that the preconditions and postconditions of $a$ still hold ($P$ still accomplishes the mission).

    Supported by FAPESP


    While the obvious application of this technique (and the one upon which we build this work) is to use the replacement process to maintain the functionality of a robot that has suffered faults in some of its devices, it does have other uses in robotics. For instance, a robot that runs self-diagnostic procedures online might need to make the system being checked unavailable to the planner. Another possibility is that the robot needs to execute two tasks at the same time, making some of its resources unavailable for a while. Finally, we propose that this approach, while leading to a non-optimal plan when compared with replanning techniques, will make the robot's response to faults faster.

    The next section of this article discusses previous work that influenced our current research. In the following section we present the technique we are currently working on. A theoretical discussion is followed by a report on preliminary experiments, performed on a simulated platform. Following that, we make proposals on where to continue the research efforts. In the last section, we introduce the issues still left open in our work, along with some interesting questions.

    2. Related Works

    One of the main trends in robotic fault tolerance is the use of low level techniques for diagnosis and isolation of faults in robotic components [Visinsky et al., 1994]. While the basic idea of comparing a robotic device's internal sensor feedback to its expected values [Visinsky, 1991] still remains, current developments of this approach include the use of neural networks to avoid mistaking data noise for faults [Tinos and Terra, 2001, Terra et al., 2001] and the use of more elaborate mathematics to obtain more information from relatively little sensory data, called Analytical Redundancy [Leuschen et al., 2002, Visinsky et al., 1994]. These works, however, do little more than say that the planner will take over from there, after they lock out a faulty device in a robotic system. That is where we intend to pick up.

    Another approach to the robotic fault tolerance problem comes from self-configurable robotic systems [Kotay et al., 1998]. These are systems composed of many very simple robots (such as 1-DoF robots), which try to behave like multicellular biological systems. It is proposed [Ortega and Tyrrel, 2000, Tyrrell, 1999] that these systems' ability to change their own configuration leads to high fault-recovery capabilities. In this direction, evolutionary algorithms have also been employed to develop fault tolerant hardware [Thompson, 1995].

    Benso et al. [Benso et al., 2001] have proposed a fault tolerance approach for microprocessors on which we base our robotic proposal. In their work, replacement tables are used to replace a faulty device within a processor with a new command set that reaches the same results, exchanging performance for reliability. This is the idea we want to use to provide fault tolerance abilities in robots in this work.

    In [Parker, 1998], a system quite similar to the one we present in this paper is proposed. The main difference is that our approach is initially focused on single, complex robot systems, while being extensible to multi-robot domains. Also, we delve a little deeper into the software functions that can be performed by a single robot. Still, the two works are different approaches to handling the same problem.


    3. Replacement System

    Let us call a plan a set of actions $P = \{a_1, \ldots, a_n\}$ which we expect to take us from an initial state $s_i$ to a final state $s_f$ in which target variables $v \in V$, from the set $V$ of the variables that define a state in our world, have a desired target value.

    For a given plan, we define a replacement for a plan $P$ on an action $a$ ($a \in P$) as a second plan, $P'$, such that the initial and final states of $P'$ are the same as those of $P$, but $a \notin P'$.

    As an example, let us define a simple robot. It has three possible actions: go forward, turn left and turn right ($F$, $L$, $R$). Let $P$ be a plan to make the robot take one step left: $P = (L, F)$ (turn left, then go forward). A replacement plan for $P$ on $L$ ($P'$) would be $(R, R, R, F)$ (three turns right, then go forward). On the other hand, there is no replacement plan for $P$ on $F$ with the available actions; that would be different if, for instance, the robot had a go backwards action.
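The definition can be checked mechanically for the three-action robot. The following is a minimal sketch, assuming an obstacle-free grid and a hypothetical state encoding `(x, y, heading)`; the names `step`, `run` and `is_replacement` are illustrative and do not come from the paper.

```python
from typing import Sequence, Tuple

# Hypothetical state encoding: (x, y, heading), heading 0=N, 1=E, 2=S, 3=W.
State = Tuple[int, int, int]
MOVES = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}


def step(state: State, action: str) -> State:
    """Apply one action: F (go forward), L (turn left) or R (turn right)."""
    x, y, h = state
    if action == "F":
        dx, dy = MOVES[h]
        return (x + dx, y + dy, h)
    if action == "L":
        return (x, y, (h - 1) % 4)
    if action == "R":
        return (x, y, (h + 1) % 4)
    raise ValueError(f"unknown action: {action}")


def run(state: State, plan: Sequence[str]) -> State:
    """Execute every action of a plan in order and return the final state."""
    for action in plan:
        state = step(state, action)
    return state


def is_replacement(plan: Sequence[str], candidate: Sequence[str],
                   action: str, start: State) -> bool:
    """True if candidate replaces plan on action: the action occurs in plan,
    does not occur in candidate, and both plans end in the same state."""
    return (action in plan
            and action not in candidate
            and run(start, candidate) == run(start, plan))


start = (0, 0, 0)
plan = ["L", "F"]
print(is_replacement(plan, ["R", "R", "R", "F"], "L", start))  # True
print(is_replacement(plan, ["L", "L", "L", "F"], "L", start))  # False
```

The second call fails on both counts: the candidate still uses the action being replaced, and three left turns leave the robot facing the opposite way from a single left turn.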

    We call a Replacement System $S$ a set of plans such that for every action $a$ there is at least one plan $P_a \in S$ such that $P_a$ is a replacement for the subplan composed of only the action $a$. For instance, a very simple replacement system for the above robot would look like the one described in Table 1.

    Action        Replacement
    Go forward    No replacement for this robot
    Turn right    Turn left, turn left, turn left
    Turn left     Turn right, turn right, turn right

    Table 1: A Simple Sample Replacement System
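As an illustration, Table 1 can be stored as a mapping from each action to a list of replacement subplans. This is only a sketch; the dictionary layout and the helper `uncovered_actions` are assumptions made here, not the data structure used in the experiments.

```python
from typing import Dict, List

ACTIONS = ["forward", "turn_left", "turn_right"]

# Table 1 as a mapping from an action to its replacement subplans.
REPLACEMENTS: Dict[str, List[List[str]]] = {
    "forward": [],                        # no replacement for this robot
    "turn_right": [["turn_left"] * 3],
    "turn_left": [["turn_right"] * 3],
}


def uncovered_actions(actions: List[str],
                      table: Dict[str, List[List[str]]]) -> List[str]:
    """Actions with no replacement subplan that avoids the action itself."""
    return [a for a in actions
            if not any(a not in sub for sub in table.get(a, []))]


print(uncovered_actions(ACTIONS, REPLACEMENTS))  # ['forward']
```

A check like this makes explicit which actions are left without cover, such as go forward in Table 1.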

    We can use such a system to provide a robot with plan repair capabilities at runtime. A robot running with a replacement system would begin its mission by doing the usual planning. Then it would run the plan's instructions. Whenever the robot detects an actuator failure, it would mark the corresponding actions as unavailable. If the plan later required a marked-off action, the robot would replace that action with an equivalent plan from the Replacement System.
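The runtime procedure described above can be sketched as a simple execution loop. Everything here is an assumption for illustration: `try_action` stands in for a hypothetical actuator call that returns False when a failure is detected, and the table layout is the one used for Table 1.

```python
from typing import Callable, Dict, List, Sequence, Set

TABLE: Dict[str, List[List[str]]] = {
    "forward": [],
    "turn_right": [["turn_left"] * 3],
    "turn_left": [["turn_right"] * 3],
}


def execute_with_replacement(plan: Sequence[str],
                             table: Dict[str, List[List[str]]],
                             try_action: Callable[[str], bool]) -> List[str]:
    """Run a plan, substituting actions whose actuator has failed.

    Returns the list of actions actually executed, or raises RuntimeError
    when no usable replacement exists (the point where replanning is needed).
    """
    unavailable: Set[str] = set()   # actions marked off after a failure
    executed: List[str] = []
    queue = list(plan)
    while queue:
        action = queue.pop(0)
        if action not in unavailable and try_action(action):
            executed.append(action)
            continue
        unavailable.add(action)
        # Keep only subplans that avoid every action marked off so far.
        usable = [sub for sub in table.get(action, [])
                  if not unavailable.intersection(sub)]
        if not usable:
            raise RuntimeError(f"no replacement for {action!r}")
        queue = list(usable[0]) + queue
    return executed


broken = {"turn_left"}
print(execute_with_replacement(["turn_left", "forward"], TABLE,
                               lambda a: a not in broken))
# ['turn_right', 'turn_right', 'turn_right', 'forward']
```

Because a substituted subplan never contains an action already marked unavailable, every new substitution marks off at least one further action, so the loop terminates.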

    It can easily be noted that this procedure can recurse. While the robot is executing the subplan, it must check the subplan's actions for faults, so that it can replace the replacement actions themselves. If done without care, this could lead the replacement system into a deadlock. Different safeguards can be used to avoid this situation. We could make the replacement system directed, with higher level actions that are replaced by lower level actions, as done in [Benso et al., 2001]. While this is simple, it reduces our replacement ability (in our simple example above, the turn left and turn right actions need to form a cycle, or one of them will lose its replacement). Another solution would be to store which actions we have already replaced, and avoid using them again in the replacement process.
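The second safeguard, remembering which actions are already being replaced, can be sketched as a recursive expansion with a guard set. The function name `expand` and the table layout are illustrative assumptions.

```python
from typing import Dict, FrozenSet, List, Optional, Set


def expand(action: str,
           table: Dict[str, List[List[str]]],
           faulty: Set[str],
           in_progress: FrozenSet[str] = frozenset()) -> Optional[List[str]]:
    """Return a sequence of working actions equivalent to `action`,
    or None if every replacement path runs into a faulty action."""
    if action not in faulty:
        return [action]
    if action in in_progress:
        return None  # already being replaced: refuse to loop
    blocked = in_progress | {action}
    for subplan in table.get(action, []):
        result: List[str] = []
        for sub in subplan:
            expanded = expand(sub, table, faulty, blocked)
            if expanded is None:
                break  # this subplan cannot be used, try the next one
            result.extend(expanded)
        else:
            return result
    return None


table = {
    "forward": [],
    "turn_right": [["turn_left"] * 3],
    "turn_left": [["turn_right"] * 3],
}
print(expand("turn_left", table, {"turn_left"}))
# ['turn_right', 'turn_right', 'turn_right']
print(expand("turn_left", table, {"turn_left", "turn_right"}))
# None
```

With both turning actions faulty, the turn left and turn right entries point at each other; the guard set stops the recursion instead of letting it cycle.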

    Therefore, before using a replacement plan, we need to check whether it is valid. A replacement subplan is considered valid when: 1- starting from the current state, all its actions are valid (for instance, it will not try to do something the environment would usually prevent, like going through a wall); 2- it does not contain actions that are themselves currently being replaced (as described in the previous paragraph).
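The two validity conditions can be written as a single check. This sketch assumes a grid world with hypothetical compass-style movement actions and a set of wall cells; none of these names come from the simulator used in the paper.

```python
from typing import Sequence, Set, Tuple

Cell = Tuple[int, int]

# Hypothetical movement actions on a grid, given as (dx, dy) offsets.
DELTAS = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}


def is_valid_subplan(subplan: Sequence[str], start: Cell,
                     walls: Set[Cell], being_replaced: Set[str]) -> bool:
    """Condition 2: no action of the subplan is currently being replaced.
    Condition 1: every step, simulated from `start`, stays out of the walls."""
    if any(action in being_replaced for action in subplan):
        return False
    x, y = start
    for action in subplan:
        dx, dy = DELTAS[action]
        x, y = x + dx, y + dy
        if (x, y) in walls:
            return False
    return True


walls = {(1, 0)}
print(is_valid_subplan(["north", "east"], (0, 0), walls, set()))           # True
print(is_valid_subplan(["north", "east", "south"], (0, 0), walls, set()))  # False
print(is_valid_subplan(["north", "east"], (0, 0), walls, {"east"}))        # False
```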

    4. Experimental results

    We propose that the use of a replacement system can reduce the processing time per action when it is able to avoid or postpone the need for replanning in a faulty environment. Although this also means that the substitute plan might have more actions than one obtained by replanning, for a robot in a dynamic environment the response time is more important than finding an optimal solution.

    To validate our proposal, we have developed a simple robot simulator. It simulates a four wheeled robot, where each wheel is independent of the others and capable of going forward, stopping and going backwards. Therefore, the simulated robot is capable of 3^4 = 81 different actions. This robot's environment is a simple maze which the robot must navigate. During the simulation, we can inject faults in the robot's wheels, making each of them unable to go forward, to go backwards, or to stop.
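The count of 81 actions follows from three commands on each of four independent wheels. A short sketch, with command names and the `(wheel, command)` fault encoding chosen here purely for illustration, enumerates the actions and shows how injected faults shrink the usable set.

```python
from itertools import product
from typing import List, Set, Tuple

WHEEL_COMMANDS = ("forward", "stop", "backward")

# One action assigns a command to each of the four wheels.
Action = Tuple[str, str, str, str]
actions: List[Action] = list(product(WHEEL_COMMANDS, repeat=4))
print(len(actions))  # 81


def usable_actions(all_actions: List[Action],
                   faults: Set[Tuple[int, str]]) -> List[Action]:
    """Drop every action that asks a wheel for a command it can no longer
    perform. `faults` holds (wheel_index, command) pairs."""
    return [a for a in all_actions
            if not any((i, cmd) in faults for i, cmd in enumerate(a))]


print(len(usable_actions(actions, {(0, "forward")})))                   # 54
print(len(usable_actions(actions, {(0, "forward"), (1, "backward")})))  # 36
```

A single fault removes one of three commands on one wheel, leaving 2 x 27 = 54 actions; two faults on different wheels leave 2 x 2 x 9 = 36.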

    The experiment consisted of running the robot on four different mazes, four times on each maze (with different starting and ending points). This was meant to represent short and long, simple and complex, narrow and wide paths for the robot to traverse. Each of the sixteen paths was run under twelve different sets of failures, each set containing up to two failures. For this experiment, all failures happen in the simulation's first turn, are readily detected by the robot, and are permanent (they last for the entire simulation).

    We implemented a simple replacement system for the simulated robot. In this system's replacement table, each action had 5 replacement plans, which we defined manually. The robot would check this table whenever it tried to perform a movement for which one of the engines was damaged. If any of the subplans for that particular movement was composed of non-damaged movements only, and the subplan was itself valid in that state (it would not bump into a wall), the robot would replace the damaged movement with it. Otherwise, it would replan the path all over again with the available movements.

    We compared this robot's performance to that of a simulated robot without the replacement system. This second robot would simply replan everything from scratch whenever it tried to perform a faulty movement. We used a simple A* search for the planning part.
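For readers unfamiliar with A*, the following is a minimal sketch of the search on a 4-connected grid maze with unit step costs and a Manhattan-distance heuristic. The grid model and the function name `astar` are simplifying assumptions; the planner used in the experiments works over the robot's wheel actions and is not reproduced here.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]


def astar(start: Cell, goal: Cell, walls: Set[Cell],
          width: int, height: int) -> Optional[List[Cell]]:
    """Return a shortest path from start to goal as a list of cells,
    or None when the goal cannot be reached."""
    def h(cell: Cell) -> int:
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]            # entries are (f, g, cell)
    came_from: Dict[Cell, Optional[Cell]] = {start: None}
    cost: Dict[Cell, int] = {start: 0}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if g > cost[current]:
            continue                             # stale queue entry
        if current == goal:
            path: List[Cell] = []
            node: Optional[Cell] = current
            while node is not None:              # walk back to the start
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (current[0] + dx, current[1] + dy)
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            if nxt in walls:
                continue
            new_cost = g + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = current
                heapq.heappush(frontier, (new_cost + h(nxt), new_cost, nxt))
    return None


path = astar((0, 0), (2, 0), walls={(1, 0), (1, 1)}, width=3, height=3)
print(path)
```

In this 3x3 example the only route goes around the two wall cells through the top row.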

    The preliminary test runs revealed many issues concerning the implementation of a replacement system. The first of them was the need to take the possibility of replacing actions into account during the planning stage. The first A* planner used would find optimal paths that bordered any walls between the starting and ending points. However, such an optimal path would rule out many possible replacement actions, which would bump the robot into the wall (as in Figure 1). We therefore slightly changed the search's heuristics so that the robot would find a path that avoided being too near the walls whenever possible. After that, a slight increase in successfully replaced actions could be observed.
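The text does not say exactly how the search was changed. One possible way to obtain wall-avoiding paths, shown here only as an assumption, is to add a penalty to the cost of stepping into a cell for each wall cell in its eight-cell neighbourhood. The names `step_cost` and `path_cost` and the penalty value are hypothetical.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]


def wall_neighbours(cell: Cell, walls: Set[Cell]) -> int:
    """Number of wall cells among the eight neighbours of `cell`."""
    x, y = cell
    return sum((x + dx, y + dy) in walls
               for dx in (-1, 0, 1) for dy in (-1, 0, 1)
               if (dx, dy) != (0, 0))


def step_cost(cell: Cell, walls: Set[Cell], penalty: float = 0.5) -> float:
    """Cost of entering `cell`: one unit plus a penalty per nearby wall."""
    return 1.0 + penalty * wall_neighbours(cell, walls)


def path_cost(path: List[Cell], walls: Set[Cell], penalty: float = 0.5) -> float:
    """Total cost of a path; the start cell itself is not charged."""
    return sum(step_cost(cell, walls, penalty) for cell in path[1:])


walls = {(x, 0) for x in range(5)}                      # a wall along y = 0
hugging = [(0, 1), (1, 1), (2, 1), (3, 1), (4, 1)]      # 4 steps beside the wall
detour = [(0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (4, 2), (4, 1)]  # 6 steps
print(path_cost(hugging, walls))  # 9.5
print(path_cost(detour, walls))   # 7.0
```

With this cost the longer detour becomes cheaper than the path that hugs the wall, so a planner minimising it would leave room for replacement subplans.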

    Another important issue regarded the implementation of the algorithm which would search the replacement system for a suitable subplan for a faulty movement. A very simple approach, where a table simply listed the possible replacements, and each would be checked for suitability and then used, could be executed in constant time. However, its success rate would depend entirely on the cleverness of the table it was based on. As suggested in the previous section, some recursive action could improve this. However, we then open the question of how to implement this recursion without losing the constant, small time needed to replace an action using the table directly.

    Figure 1: When walking too close to the wall, movement constraints prevent some of the replacement plans from working.

    In the experiments for this paper we tried two different implementations to solve this issue. First we tried separating the actions into two kinds: those that did not need replacements with more than two different actions, and those which could not do without them. The first type we called primitive actions, and the second, complex actions. When testing a substitute plan for fitness, complex actions would not be checked for errors unless they were the first action in the plan, so that they could be recursively replaced when they were used by the robot. The first-action requirement was there to guarantee the stop condition. This solution did not work very well in the experiments, mainly due to the simplified replacement table that was used. The other way to solve the recursion problem was described in the previous section, and consists of storing which actions are currently being replaced, and not using subplans composed of those actions.

    After those considerations, the simulated robot performed well, spending a negligible amount of time per action whenever it could find a substitute for all faulty actions in its plan, and a noticeably reduced amount of replanning time when it could substitute most actions in its plan. After each testing round, adjusting the replacement table entries for the actions that most often could not be replaced would yield better results, which indicated the need to generate a broader, more comprehensive replacement table.

    5. Conclusion and Future work

    In this paper, we intended to present the basic idea of a replacement system and discuss

    the main issues regarding it.

    The first problem we need to address from now on is the automatic generation of a replacement table. The experience with the current work showed us that human designed tables, besides taking too long to make, are very prone to errors and to missing key replacement subplans. Having acknowledged the need for an automatically generated replacement table, we face the problem of how to do it. Taylor [Taylor, 1992] proposes a method to eliminate action sequences which lead to identical states in depth first search. This method could be used to generate a replacement table by turning the redundant sequences into replacement subplans. Machine learning techniques, like reinforcement learning, also look promising with respect to finding a good replacement table for a given robotic system.
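To make the idea of turning redundant sequences into replacement subplans concrete, the sketch below brute-forces all short action sequences for the three-action robot of Section 3 and keeps those that reach the same state as a single action without using it. This is plain enumeration, not Taylor's pruning method, and it assumes an obstacle-free world in which checking equivalence from one start state is enough. All names are illustrative.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

State = Tuple[int, int, int]  # (x, y, heading), heading 0=N, 1=E, 2=S, 3=W
MOVES = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}


def step(state: State, action: str) -> State:
    """Apply F (go forward), L (turn left) or R (turn right)."""
    x, y, h = state
    if action == "F":
        dx, dy = MOVES[h]
        return (x + dx, y + dy, h)
    if action == "L":
        return (x, y, (h - 1) % 4)
    if action == "R":
        return (x, y, (h + 1) % 4)
    raise ValueError(f"unknown action: {action}")


def run(state: State, plan: Sequence[str]) -> State:
    for action in plan:
        state = step(state, action)
    return state


def generate_table(actions: Sequence[str], max_len: int) -> Dict[str, List[List[str]]]:
    """For each action, list every sequence of the other actions, up to
    max_len long, that ends in the same state as the action alone."""
    start: State = (0, 0, 0)
    table: Dict[str, List[List[str]]] = {a: [] for a in actions}
    for a in actions:
        target = step(start, a)
        others = [b for b in actions if b != a]
        for n in range(1, max_len + 1):
            for seq in product(others, repeat=n):
                if run(start, seq) == target:
                    table[a].append(list(seq))
    return table


print(generate_table(["F", "L", "R"], max_len=3))
# {'F': [], 'L': [['R', 'R', 'R']], 'R': [['L', 'L', 'L']]}
```

For this robot the enumeration recovers exactly the entries of Table 1. The number of candidate sequences grows exponentially with `max_len`, which is the cost that pruning or learning techniques would need to reduce.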

    However, further study will be needed to balance the replacement table's size and scope against the cost of finding a suitable replacement in it. The use of special algorithms for replacement finding could also play a part in this. The replacement process itself could also be refined. Instead of directly replacing faulty instructions, the system could use a lookahead, in which it would replace the faulty action together with actions before and/or after it, so that it could use a more efficient replacement subplan.

    Finally, we can extend the ideas presented here to a multi-robot system, where a faulty action in one of the robots could be replaced by a subplan composed of actions of different robots belonging to the system. For this, the replacement system idea should be revised to work on higher level actions and replacement plans. This higher level replacement system might then treat not only failures due to system faults, but also problems related to a dynamic environment.

    References

    Benso, A., Chiusano, S., and Prinetto, P. (2001). A self-repairing execution unit for microprogrammed processors. IEEE Micro, pages 16-21.

    Kotay, K., Rus, D., Vona, M., and McGray, C. (1998). The self reconfiguring robotic molecule. In Proceedings of IEEE International Conference on Robotics and Automation.

    Leuschen, M. L., Walker, I. D., and Cavallaro, J. R. (2002). Robotic fault detection using nonlinear analytical redundancy. In IEEE International Conference on Robotics and Automation.

    Ortega, C. and Tyrrel, A. (2000). Reliability analysis in self-repairing embryonic systems.

    Parker, L. E. (1998). ALLIANCE: An architecture for fault tolerant multi-robot cooperation. IEEE Transactions on Robotics and Automation, 14(2):220-240.

    Taylor, L. A. (1992). Pruning duplicate nodes in depth-first search. Technical report, University of California.

    Terra, M. H., Bergerman, M., Tinos, R., and Siqueira, A. A. G. (2001). Controle tolerante a falhas de robôs manipuladores. SBA Controle e Automação, 12(2):73-92.

    Thompson, A. (1995). Evolving fault tolerant systems. In Proc. 1st IEE/IEEE Int. Conf. on Genetic Algorithms in Engineering Systems: Innovations and Applications (GALESIA'95), pages 524-529. IEE Conf. Publication No. 414.

    Tinos, R. and Terra, M. H. (2001). Fault detection and isolation in robotic manipulators using a multilayer perceptron and an RBF network trained by Kohonen's self-organizing map. Revista Controle & Automação, 12(1):11-18.


    Tyrrell, A. (1999). Computer know thy self!: A biological way to look at fault tolerance.

    Visinsky, M. L. (1991). Fault detection and fault tolerance methods for robotics. Master's thesis, Rice University.

    Visinsky, M. L., Cavallaro, J. R., and Walker, I. D. (1994). Robotic fault detection and fault tolerance: a survey. Reliability Eng. and System Safety, 46:139-158.