dodoi-enia-03

High level techniques for self-repairing robotic systems

Claus C. Aranha, Jacques Wainer, Andre Covic Bastos

Instituto de Computação, Universidade Estadual de Campinas

Avenida Albert Einstein, 1251, Caixa Postal 6176, 13083-970 Campinas, SP - Brasil

[email protected], [email protected], [email protected]

Abstract. Robotic fault-tolerance techniques usually refer to methods that isolate and treat faults individually. In this work we propose high-level (planning) techniques for a new approach, in which multiple, possibly heterogeneous devices in a robotic system cooperate to diagnose faults and to take over a faulty device's functions in the system. This technique uses predefined replacement plans to deal with faults, avoiding costly online replanning.

1. Introduction

Robots are usually employed to replace humans in hazardous tasks, or in tasks in hazardous environments. These missions usually strain the robot's circuitry to its limits, making component failures more common than in robots that perform assembly-line tasks. Therefore, the ability to avoid and tolerate failures has become an important factor in evaluating a robotic system's performance on a given mission. Another desirable feature of such an autonomous system is the ability to recover and reconfigure itself after detecting an error, so that it does not lose its functionality, even if at some performance cost. Both must be done in a reasonable amount of time for a robot to survive in a dynamic environment.

Fault-tolerant and self-reconfigurable systems are especially useful for extending a robot's autonomy time, that is, how long the system is able to run without direct or indirect support from humans. This is necessary for robots used in space exploration, where the time lag between base and robot can be very long, or where communication with the base is unreliable. Even for earth-based robots, a large degree of autonomy is desirable when they face missions where human support is not available, as in underwater environments or emergency situations. Our aim, however, is not only to make robots last long enough to be repaired by humans, but eventually to make robots that are capable of repairing themselves.

While, in the field of robotics, the terms fault tolerance, self-repair, and autonomous robots have been used to name quite different lines of research, in this work we focus on a high-level (planning-level) approach, which we call the replacement process. Given an action $a$ belonging to a plan $P$ needed for a robot to accomplish a mission, it consists of finding a group of actions $a'_1, \ldots, a'_n$ so that the preconditions and postconditions of $a$ still hold ($P$ still accomplishes the mission).

Supported by FAPESP


While the obvious application of this technique (and the one on which we build this work) is to use the replacement process to maintain the functionality of a robot that has suffered faults in some of its devices, it has other uses in robotics. For instance, a robot that runs self-diagnostic procedures online might need to make the system being checked unavailable to the planner. Another possibility is that the robot needs to execute two tasks at the same time, which makes some of its resources unavailable for a while. Finally, we propose that this approach, while leading to a non-optimal plan when compared with replanning techniques, will make the robot's response to faults faster.

The next section of this article discusses previous works that influenced our current research. The following section presents the technique we are currently working on. A theoretical discussion is followed by a report on preliminary experiments performed on a simulated platform. After that, we propose directions in which to continue the research efforts. In the last section, we also introduce the issues still left open in our work and some interesting questions.

2. Related Works

One of the main trends in robotic fault tolerance is low-level techniques for the diagnosis and isolation of faults in robotic components [Visinsky et al., 1994]. While the basic idea of comparing a robotic device's internal sensor feedback to its expected values [Visinsky, 1991] remains, current developments of this approach include the use of neural networks to avoid mistaking data noise for faults [Tinos and Terra, 2001, Terra et al., 2001] and the use of more elaborate mathematics to obtain more information from relatively little sensory data, called Analytical Redundancy [Leuschen et al., 2002, Visinsky et al., 1994]. These works, however, do little more than state that the planner will take over from there, after they lock out a faulty device in a robotic system. That is where we intend to pick up and carry on.

Another approach to the robotic fault tolerance problem comes from self-configurable robotic systems [Kotay et al., 1998]. These are systems composed of many very simple robots (such as 1-DoF robots), which try to behave like multicellular biological systems. It has been proposed [Ortega and Tyrrell, 2000, Tyrrell, 1999] that these systems' ability to change their own configuration leads to high fault-recovery capabilities. In this direction, evolutionary algorithms have also been employed to develop fault-tolerant hardware [Thompson, 1995].

Benso et al. [Benso et al., 2001] have proposed a fault tolerance approach for microprocessors, on which we base our robotic proposal. In their work, replacement tables are used to replace a faulty device within a processor with a new command set that reaches the same results, trading performance for reliability. This is the idea we use in this work to provide robots with fault tolerance abilities.

In [Parker, 1998], a system quite similar to the one presented in this paper is proposed. The main difference is that our approach is initially focused on single, complex robot systems, while remaining extensible to multi-robot domains. Also, we delve a little deeper into the software functions that can be performed by a single robot. Still, the two works are different approaches to handling the same problem.


3. Replacement System

Let us call a plan $P$ an ordered set of actions $a_1, \ldots, a_n$ which we expect to take us from an initial state $s_0$ to a final state $s_f$ in which the target variables $v_1, \ldots, v_k$, from the set $V$ of variables that define a state in our world, have their desired target values.

For a given plan $P$, we define a replacement for $P$ on an action $a$ ($a \in P$) as a second plan $P'$ whose initial and final states are the same as those of $P$, but with $a \notin P'$.

As an example, let us define a simple robot. It has three possible actions: go forward, turn left and turn right ($F$, $L$, $R$). Let $P$ be a plan to make the robot take one step left: $P = (L, F)$ (turn left, then go forward). A replacement plan for $P$ on $L$ ($L \in P$) would be $P' = (R, R, R, F)$ (three turns right, then go forward). On the other hand, there is no replacement plan for $P$ on $F$ with the available actions; that would be different if, for instance, the robot had a go backwards action.
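The replacement condition can be checked mechanically. The sketch below is only an illustration under stated assumptions: the grid-pose state `(x, y, heading)`, the action letters and the helper names `step`, `run` and `is_replacement` are hypothetical and do not come from the simulator used in this work. It compares the final states of two plans from a single start pose, which is enough here because the robot moves on an open grid with no obstacles.

```python
# Hypothetical model of the three-action robot: a state is (x, y, heading).
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # north, east, south, west


def step(state, action):
    """Apply one action: 'F' go forward, 'L' turn left, 'R' turn right."""
    x, y, h = state
    if action == "F":
        dx, dy = HEADINGS[h]
        return (x + dx, y + dy, h)
    if action == "L":
        return (x, y, (h - 1) % 4)
    if action == "R":
        return (x, y, (h + 1) % 4)
    raise ValueError(f"unknown action: {action}")


def run(plan, state=(0, 0, 0)):
    """Execute a whole plan and return the final state."""
    for action in plan:
        state = step(state, action)
    return state


def is_replacement(candidate, plan, action, start=(0, 0, 0)):
    """True if `candidate` ends where `plan` ends without using `action`."""
    return (action in plan
            and action not in candidate
            and run(candidate, start) == run(plan, start))


plan = ["L", "F"]
print(is_replacement(["R", "R", "R", "F"], plan, "L"))  # True
print(is_replacement(["L", "L", "L", "F"], plan, "L"))  # False
```

The second call fails for two reasons: the candidate still uses the faulty action, and three left turns do not leave the robot facing the same way as one left turn.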

We call a Replacement System $S$ a set of plans such that, for every action $a$, there is at least one plan $P_a \in S$ that is a replacement for the subplan composed of only the action $a$. For instance, a very simple replacement system for the above robot would look like the one described in Table 1.

Action        Replacement
Go forward    No replacement for this robot
Turn right    Turn left, Turn left, Turn left
Turn left     Turn right, Turn right, Turn right

Table 1: A Simple Sample Replacement System

We can use such a system to provide a robot with plan repair capabilities at runtime. A robot running with a replacement system would begin its mission by doing the usual planning. Then it would execute the plan's instructions. Whenever the robot detects an actuator failure, it marks the corresponding actions as unavailable. If the plan later requires a marked-off action, the robot replaces that action with an equivalent plan from the Replacement System.
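As a rough illustration of this runtime procedure, the following sketch encodes the sample replacement system as a dictionary and substitutes unavailable actions while a plan is being executed. The names `REPLACEMENTS` and `execute` and the string action labels are assumptions made for this example only; no recursion or environment check is attempted here.

```python
# The sample replacement system as a dictionary (illustrative encoding).
REPLACEMENTS = {
    "forward": [],  # no replacement for this robot
    "right": [["left", "left", "left"]],
    "left": [["right", "right", "right"]],
}


def execute(plan, failed, table=REPLACEMENTS):
    """Return the actions actually run, substituting unavailable ones.

    `failed` is the set of actions marked unavailable after a detected fault.
    Raises LookupError when no usable replacement exists; a real robot would
    fall back to replanning at that point.
    """
    executed = []
    for action in plan:
        if action not in failed:
            executed.append(action)
            continue
        for subplan in table.get(action, []):
            # Use the first subplan made only of working actions.
            if not any(a in failed for a in subplan):
                executed.extend(subplan)
                break
        else:
            raise LookupError(f"no replacement for {action!r}")
    return executed


print(execute(["left", "forward"], failed={"left"}))
# ['right', 'right', 'right', 'forward']
```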

It is easy to see that this procedure can recurse. While the robot is executing the subplan, it must check the subplan's actions for faults, so that it can replace the replacement actions themselves. If done without care, this could lead the replacement system into a deadlock. Several safeguards can avoid this situation. We could make the replacement system directed, with higher-level actions that are replaced by lower-level actions, as done in [Benso et al., 2001]. While this is simple, it reduces our replacement ability (in our simple example above, the turn left and turn right actions need to form a cycle, or one of them will lose its replacement). Another solution is to store which actions have already been replaced, and avoid using them again in the replacement process.

Therefore, before using a replacement plan, we need to check whether it is valid. A replacement subplan is considered valid when: (1) starting from the current state, all of its actions are valid (for instance, it will not try to do something the environment would usually prevent, like going through a wall); and (2) it does not contain actions that are themselves currently being replaced (as described in the previous paragraph).
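The two validity conditions, together with the bookkeeping of actions that are currently being replaced, can be sketched as a recursive expansion. This is a minimal illustration, not the implementation used in the experiments: the function name `expand`, its parameters and the `is_valid` callback (standing in for the environment check of condition 1) are hypothetical, and the sketch checks each candidate subplan before expanding it rather than re-checking the fully expanded result.

```python
def expand(action, failed, table, is_valid, in_progress=frozenset()):
    """Return a list of working actions equivalent to `action`, or None.

    failed       set of actions marked unavailable
    table        dict mapping an action to a list of candidate subplans
    is_valid     callable(subplan) -> bool, the environment check (condition 1)
    in_progress  actions currently being replaced (condition 2)
    """
    if action not in failed:
        return [action]
    blocked = in_progress | {action}
    for subplan in table.get(action, []):
        # Skip subplans that reuse an action already under replacement,
        # or that the environment would not allow from the current state.
        if any(a in blocked for a in subplan) or not is_valid(subplan):
            continue
        result = []
        for a in subplan:
            part = expand(a, failed, table, is_valid, blocked)
            if part is None:
                break
            result.extend(part)
        else:
            return result
    return None


table = {
    "forward": [],
    "right": [["left", "left", "left"]],
    "left": [["right", "right", "right"]],
}
always_ok = lambda subplan: True  # stand-in for the wall check

print(expand("left", {"left"}, table, always_ok))
# ['right', 'right', 'right']
print(expand("left", {"left", "right"}, table, always_ok))
# None
```

In the second call both turning actions are faulty. Without the `in_progress` set the left and right entries would keep replacing each other forever; with it, the search stops and reports that no replacement exists, at which point the robot would have to replan.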

4. Experimental results

We propose that the use of a replacement system can reduce the processing time per action when it is able to avoid or postpone the need for replanning in a faulty environment. Although this also means that the substitute plan might have more actions than one obtained by replanning, for a robot in a dynamic environment the response time is more important than finding an optimal solution.

To validate our proposal, we developed a simple robot simulator. It simulates a four-wheeled robot in which each wheel is independent of the others and capable of going forward, stopping and going backwards. Therefore, the simulated robot is capable of $3^4 = 81$ different actions. The robot's environment is a simple maze which it must navigate. During the simulation, we can inject faults in the robot's wheels, making each of them unable to go forward, to go backwards, or to stop.
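The count of 81 actions follows from three commands on each of four independent wheels. The short sketch below enumerates them and shows how an injected fault shrinks the set of usable actions; the command names and the `available` helper are assumptions made for illustration, not part of the simulator described here.

```python
from itertools import product

WHEEL_COMMANDS = ("forward", "stop", "backward")

# One action is a tuple with one command per wheel.
ACTIONS = list(product(WHEEL_COMMANDS, repeat=4))
print(len(ACTIONS))  # 81


def available(actions, faults):
    """Keep the actions that avoid every faulty (wheel, command) pair.

    `faults` maps a wheel index to the set of commands it can no longer do.
    """
    return [
        action for action in actions
        if all(cmd not in faults.get(wheel, ()) for wheel, cmd in enumerate(action))
    ]


# Wheel 0 can no longer go forward: 2 * 3 * 3 * 3 actions remain.
print(len(available(ACTIONS, {0: {"forward"}})))  # 54
```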

The experiment consisted of running the robot on four different mazes, four times on each maze (with different starting and ending points). This was meant to represent short and long, simple and complex, narrow and wide paths for the robot to travel. Each of the sixteen paths was run under twelve different sets of failures, each set containing up to two failures. For this experiment, all failures happen in the simulation's first turn, are readily detected by the robot, and are permanent (they last for the entire simulation).

We implemented a simple replacement system for the simulated robot. In this system's replacement table, each action had 5 replacement plans, which we defined manually. The robot would check this table whenever it tried to perform a movement for which one of the engines was damaged. If any of the subplans for that particular movement was composed of non-damaged movements only, and the subplan was itself valid in that state (it would not bump into a wall), the robot would replace the damaged movement with it. Otherwise, it would replan the whole path with the available movements.

We compared this robot's performance to that of a simulated robot without the replacement system. This second robot would simply replan everything from scratch whenever it tried to perform a faulty movement. We used a simple A* search for the planning part.

The preliminary test runs revealed many issues concerning the implementation of a replacement system. The first was the need to take the possibility of replacing actions into account during the planning stage. The first A* planner we used would find optimal paths that hugged any walls between the starting and ending points. However, such an optimal path would rule out many possible replacement actions, which would bump the robot into the wall (as in Figure 1). We therefore slightly changed the search's heuristics so that the robot would find a path that avoided being too near the walls whenever possible. After that, a slight increase in successfully replaced actions could be observed.
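One way to obtain this kind of wall-avoiding behaviour is sketched below. It is not the planner used in the experiments, whose exact change to the heuristics is not detailed here: as a stand-in, the sketch runs A* on a hypothetical character grid (`#` marks a wall) and adds an assumed `wall_penalty` to the cost of every step that enters a cell touching a wall or the maze border, while keeping the Manhattan distance as the heuristic.

```python
import heapq


def near_wall(grid, r, c):
    """True if any neighbour of (r, c) is a wall or lies outside the grid."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < len(grid) and 0 <= cc < len(grid[0])
            if not inside or grid[rr][cc] == "#":
                return True
    return False


def astar(grid, start, goal, wall_penalty=0.0):
    """A* over free cells; steps into cells near a wall cost extra."""
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        _, cost, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        if cost > best.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if not inside or grid[nr][nc] == "#":
                continue
            extra = wall_penalty if near_wall(grid, nr, nc) else 0.0
            new_cost = cost + 1.0 + extra
            if new_cost < best.get((nr, nc), float("inf")):
                best[(nr, nc)] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + h((nr, nc)), new_cost, (nr, nc), path + [(nr, nc)]),
                )
    return None


grid = [
    ".......",
    ".......",
    "..###..",
    ".......",
    ".......",
    ".......",
    ".......",
]
short = astar(grid, (3, 1), (3, 5))
safe = astar(grid, (3, 1), (3, 5), wall_penalty=2.0)
print(len(short), len(safe))  # 5 7
```

With no penalty the path runs straight along the wall; with the penalty it drops one row away from the wall and is two cells longer, trading path length for room to execute replacement subplans.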

Figure 1: When walking too close to the wall, movement constraints prevent some of the replacement plans from working.

Another important issue concerned the implementation of the algorithm that searches the replacement system for a suitable subplan for a faulty movement. A very simple approach, in which a table simply lists the possible replacements and each one is checked for suitability and then used, can be executed in constant time. However, its success rate depends entirely on the quality of the table it is based on. As suggested in the previous section, some recursive action could improve this. However, this opens the question of how to implement that recursion without losing the small, constant time needed to replace an action using the table directly.

In the experiments for this paper we tried two different implementations to solve this issue. First, we tried separating the actions into two kinds: those that did not need replacements with more than two different actions, and those that could not do without them. We called the first type primitive actions and the second complex actions. When testing a substitute plan for fitness, complex actions would not be checked for errors unless they were the first action in the plan, so that they could be recursively replaced when they were used by the robot. The first-action requirement was there to guarantee the stop condition. This solution did not work very well in the experiments, mainly due to the simplified replacement table that was used. The other way to solve the recursion problem was described in the previous section, and consists of storing which actions are currently being replaced and not using subplans composed of those actions.

After those considerations, the simulated robot performed well, spending a negligible amount of time per action whenever it could find a substitute for every faulty action in its plan, and a noticeably reduced amount of replanning time when it could substitute most actions in its plan. After each testing round, adjusting the table entries for the actions that most often could not be replaced yielded better results, which indicates the need to generate a broader, more comprehensive replacement table.

5. Conclusion and Future work

In this paper, we presented the basic idea of a replacement system and discussed the main issues regarding it.

The first problem we need to address from now on is the automatic generation of a replacement table. The experience with the current work showed us that human-designed tables, besides taking too long to make, are very prone to errors and to missing key replacement subplans. Having acknowledged the need for an automatically generated replacement table, we face the problem of how to build it. Taylor [Taylor, 1992] proposes a method to eliminate action sequences which lead to identical states in depth-first search. This method could be used to generate a replacement table by turning the redundant sequences into replacement subplans. Machine learning techniques, like reinforcement learning, also look promising for finding a good replacement table for a given robotic system.
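To make the idea of turning redundant sequences into replacement subplans concrete, here is a brute-force sketch for the simple three-action robot. It is not Taylor's method and not something evaluated in this work: it simply enumerates short sequences of the other actions and keeps those that end in the same state as the action being replaced. The pose model and the names `step`, `run` and `build_table` are hypothetical.

```python
from itertools import product

HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # north, east, south, west


def step(state, action):
    """Apply 'F' (go forward), 'L' (turn left) or 'R' (turn right)."""
    x, y, h = state
    if action == "F":
        dx, dy = HEADINGS[h]
        return (x + dx, y + dy, h)
    if action == "L":
        return (x, y, (h - 1) % 4)
    if action == "R":
        return (x, y, (h + 1) % 4)
    raise ValueError(f"unknown action: {action}")


def run(plan, state=(0, 0, 0)):
    """Execute a whole plan and return the final state."""
    for action in plan:
        state = step(state, action)
    return state


def build_table(actions=("F", "L", "R"), max_len=4):
    """Map each action to the shortest equivalent sequences of other actions."""
    table = {}
    for action in actions:
        target = run([action])
        others = [a for a in actions if a != action]
        found = []
        for length in range(1, max_len + 1):
            found = [list(seq) for seq in product(others, repeat=length)
                     if run(seq) == target]
            if found:
                break  # keep only the shortest replacements
        table[action] = found
    return table


print(build_table())
# {'F': [], 'L': [['R', 'R', 'R']], 'R': [['L', 'L', 'L']]}
```

For this robot the search recovers the hand-written sample table, including the missing entry for going forward. The enumeration grows exponentially with `max_len` and with the number of actions, which is the cost that a duplicate-pruning or learning-based method would have to bring down for the 81-action simulated robot.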

However, further study will be needed to balance the replacement table's size and scope against the cost of finding a suitable replacement in it. The use of special algorithms for finding replacements could also play a part here. The replacement process itself could also be improved. Instead of directly replacing faulty instructions, the system could use a lookahead, in which it would replace the faulty action together with the actions before and/or after it, so that it could use a more efficient replacement subplan.

Finally, we can extend the ideas presented here to a multi-robot system, where a faulty action in one of the robots could be replaced by a subplan composed of actions of different robots belonging to the system. For this, the replacement system idea should be revised to work on higher-level actions and replacement plans. This higher-level replacement system might then handle not only failures due to system faults, but also problems related to a dynamic environment.

References

Benso, A., Chiusano, S., and Prinetto, P. (2001). A self-repairing execution unit for microprogrammed processors. IEEE Micro, pages 16-21.

Kotay, K., Rus, D., Vona, M., and McGray, C. (1998). The self reconfiguring robotic molecule. In Proceedings of IEEE International Conference on Robotics and Automation.

Leuschen, M. L., Walker, I. D., and Cavallaro, J. R. (2002). Robotic fault detection using nonlinear analytical redundancy. In IEEE International Conference on Robotics and Automation.

Ortega, C. and Tyrrell, A. (2000). Reliability analysis in self-repairing embryonic systems.

Parker, L. E. (1998). ALLIANCE: An architecture for fault tolerant multi-robot cooperation. IEEE Transactions on Robotics and Automation, 14(2):220-240.

Taylor, L. A. (1992). Pruning duplicate nodes in depth-first search. Technical report, University of California.

Terra, M. H., Bergerman, M., Tinos, R., and Siqueira, A. A. G. (2001). Controle tolerante a falhas de robôs manipuladores. SBA Controle e Automação, 12(2):73-92.

Thompson, A. (1995). Evolving fault tolerant systems. In Proc. 1st IEE/IEEE Int. Conf. on Genetic Algorithms in Engineering Systems: Innovations and Applications (GALESIA'95), pages 524-529. IEE Conf. Publication No. 414.

Tinos, R. and Terra, M. H. (2001). Fault detection and isolation in robotic manipulators using a multilayer perceptron and a RBF network trained by the Kohonen's self-organizing map. Revista Controle & Automação, 12(1):11-18.

Tyrrell, A. (1999). Computer know thy self!: A biological way to look at fault tolerance.

Visinsky, M. L. (1991). Fault detection and fault tolerance methods for robotics. Master's thesis, Rice University.

Visinsky, M. L., Cavallaro, J. R., and Walker, I. D. (1994). Robotic fault detection and fault tolerance: a survey. Reliability Eng. and System Safety, 46:139-158.
