Search algorithm for optimal execution of incident commander guidance in macro action planning


Reza Nourjou* and Hirokazu Tatano
Disaster Prevention Research Institute (DPRI), Kyoto University, Kyoto 611-0011, Japan
Email: [email protected]
Email: [email protected]
*Corresponding author

Hossein Aghamohammadi

Department of Remote Sensing and GIS, Science and Research Branch, IAU, Tehran 1477893855, Iran
Email: [email protected]

Abstract: This paper presents a state space search algorithm that solves the optimal execution problem of the incident commander's guidance during disaster emergency management. To achieve a joint goal, the IC should select the best choice, as an optimal strategic decision, from the available alternatives in a definite time. A strategic decision coordinates/controls macro actions of a team of field units by constraining a subteam to a subgoal in a sublocation in a time window; moreover, a sequence of strategic decisions generates a macro action plan that defines how to reach the goal. Three results are achieved by running this algorithm for a scenario: (1) calculate an optimal macro action plan; (2) estimate a minimum total time to achieve a joint goal and (3) reason about the best choice. We applied our approach to develop an intelligent software system (autonomous agent) for assisting the human in crisis response to an earthquake disaster.

Keywords: state space search algorithm; multi-agent planning; disaster crisis response; incident commander; optimal decision making; human strategy execution.

Reference to this paper should be made as follows: Nourjou, R., Tatano, H. and Aghamohammadi, H. (2015) 'Search algorithm for optimal execution of incident commander guidance in macro action planning', Int. J. Intelligent Systems Technologies and Applications, Vol. 14, Nos. 3/4, pp.354–384.

Biographical notes: Reza Nourjou received his Master degree in GIS and his PhD in Informatics in 2006 and 2014, respectively. He was a Research Student and a Researcher at the Disaster Prevention Research Institute of Kyoto University during 2009–2014 and 2014, respectively. Since 2006, he has been contributing to the public safety domain (disaster emergency management, crisis response, rescue and relief operations) by applying GIS, autonomous software agents and multi-agent systems, automated planning and scheduling algorithms, spatial agent-based modelling and geospatial simulation. His current work/research fields include distributed autonomous GIS, machine learning, software application development (mobile app development, software system engineering, cloud communication platforms, and location-based web services), and the internet of things.

Hirokazu Tatano is a Professor at the Disaster Prevention Research Institute and the Director of the Social Systems for Disaster Risk Governance Lab at Kyoto University, Japan. He is a Co-Founder of the IDRiM Society (International Society for Integrated Disaster Risk Management). His fields of specialisation include disaster risk management, economic analysis of disasters, and infrastructure economics.

Hossein Aghamohammadi holds a Master degree and a PhD in GIS from the K.N. Toosi University of Technology. Since 2007, he has taught remote sensing, GIS, database, and geomatics courses at Azad University.

Copyright © 2015 Inderscience Enterprises Ltd.

1 Introduction

Action planning is a key research topic in the emergency management of disasters, especially in the search and rescue (SAR) domain, where the commander of a team aims to control and coordinate the actions of field units. Crisis response to an earthquake disaster starts with SAR carried out by a team of field units.

Effective coordination is an essential ingredient for efficient emergency response management (Chen et al., 2008). Coordination is the act of managing interdependencies among activities performed to achieve a goal. Inefficient coordination causes the team to fail to reach their goal. At least five reasons necessitate coordination among field units in the SAR problem domain:

• The 'enabling' dependency between two tasks states that accomplishment of the predecessor enables field units to start carrying out the successor. For example, a search task and a rescue task that are associated with a victim should be done sequentially. Field units can do tasks whose state is 'enabled'. Accomplishing a task reveals a new 'enabled' task or changes the state of a 'not yet enabled' task to 'enabled'. Not yet enabled tasks result in idle and inactive field units.

• The 'enabling' dependency between field units' actions states that the sub-actions of a field unit may be dependent on the outcomes of actions of other field units. A field unit might be capable of doing only a subset of tasks because of the distribution of capabilities (and also resources and expertise) among field units. Therefore, this field unit is enabled to do a subset of 'enabled' tasks

I which have already been revealed by actions of other field units or by itself

II for which this field unit can provide the capability requirements.

It is reasonable to involve all field units in performing tasks by decreasing the amount of time a field unit is idle.

• Redundant actions result in anarchy or chaos. Because a subset of field units may possess overlapping capabilities, conflict arises between field units who intend to do the same tasks, while a single field unit is enough to accomplish an instance of these tasks.

• Information sharing allows field units and also the commander to have a better perception of the world state in real time and in the future. Information would include task information, task schedules, or action plans. This enables them to make better action plans.

• Task assignment and scheduling deal with a field unit as a machine that should perform tasks with regard to some optimisation criteria. To optimise the global utility, it is crucial for the team to define what tasks a certain field unit should do, when and where.

To sharpen our understanding, we consider some characteristics and assumptions as follows:

• A single objective: The global goal is to carry out SAR by a team of field units. The optimisation criterion is to minimise the overall time, called the makespan, needed to achieve this joint objective through the field units' actions.

• Commander as human decision-maker: The commander is the team planner. This paper focuses on the role of the commander in making an action plan for field units. Field units are considered autonomous entities that are capable of reasoning about their own actions, taking into account the action plan made by the commander.

• Centralised approach: A centralised approach is required to calculate a central plan. The action plan, which is made by the commander in a centralised manner, will be executed by field units in a distributed and decentralised approach.

• Location-based temporal macro tasks: The incident commander has a big picture of the task environment, which is formed by location-based temporal macro tasks (LoTeM tasks). See Section 2 for details.

• Macro action: The plan that the commander makes has a macro character. A macro action plan cannot specify the actions of field units in detail. It partially specifies the actions of field units and delegates these intelligent agents to autonomously and explicitly plan (reason and decide) their own micro actions. A macro action states what subset of field units should execute what subset of actions (or what subset of tasks) within what sub-area during what time window. A micro action specifies a definite action which a definite field unit is planning to execute at a definite time at a definite location. See Section 2 for more details; a small illustrative sketch of the two notions follows this list.

• Human supervisor: Humans should be involved in the planning loop and should have a supervisor role in the planning process. Because of the complexity of commanding crisis response operations, a human planner cannot be completely replaced by a fully automated system. In fact, it is not feasible for fully automated planning systems to effectively plan for field units by reasoning about all the possibilities that might arise during the execution of tasks in a complex environment. This is true especially in operation centres.

• Partial observation of the task environment: The commander has a partial view of the task environment, because tasks are discovered (revealed, observed) by field units' actions over time. There is uncertainty in

1 outcomes of field units’ actions

2 amount of time in which a task is accomplished.

Therefore, the commander should act according to a loop consisting of five steps:

1 estimate tasks and integrate them with observed data

2 plan, re-plan, or revise (adapt, refine) the current plan to new situations in a timely manner

3 disseminate information of the plan for execution

4 continuously monitor how the plan is executed by the field units, gather and integrate data reported by field units to have a timely situational awareness

5 learn to act better.

This loop is continuously repeated over time until ultimate goals are achieved.

• Quick answer: The commander should make the action plan for the field units quickly, because planning is done under time pressure.

• Time: Executing tasks is a time-consuming operation. Therefore, we should determine the amount of time that an action being done by a definite field unit needs to accomplish a definite task.

• Planning versus scheduling: The commander aims to define what macro action should be done in real time and to remake a new decision at the right time for revising the previous one. The thread assignment problem is not a scheduling problem that is capable of changing the task environment. In fact, task/action scheduling is done under an already-set strategic decision.
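As a rough illustration of the macro/micro distinction above, the following C# sketch models a macro action as a constraint (subteam, subgoal, sublocation, time window) and a micro action as a concrete commitment by a single field unit. All type and member names here are illustrative assumptions of ours and are not part of the paper's data model.

using System;
using System.Collections.Generic;

// Illustrative types only: a macro action constrains a subteam to a subgoal in a sublocation
// for a time window; a micro action is a definite unit doing a definite task at a definite
// time and location.
record MacroAction(HashSet<string> Subteam, HashSet<string> TaskTypes,
                   HashSet<string> Zones, double StartTime, double EndTime);

record MicroAction(string FieldUnit, string TaskType, string Zone, double StartTime);

class MacroMicroDemo
{
    static void Main()
    {
        var macro = new MacroAction(
            new HashSet<string> { "a0", "a2" },
            new HashSet<string> { "T0", "T1" },
            new HashSet<string> { "s3", "s4", "s5" },
            StartTime: 0, EndTime: 96);

        // A field unit autonomously plans a micro action that respects the macro constraint.
        var micro = new MicroAction("a2", "T1", "s5", StartTime: 12);

        bool consistent = macro.Subteam.Contains(micro.FieldUnit)
                          && macro.TaskTypes.Contains(micro.TaskType)
                          && macro.Zones.Contains(micro.Zone)
                          && micro.StartTime >= macro.StartTime
                          && micro.StartTime < macro.EndTime;
        Console.WriteLine($"Micro action consistent with macro action: {consistent}");
    }
}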

This paper addresses the problem of optimal execution of a human strategy, in which the best choice should be made from a set of presented alternatives. The selected choice is the best strategic decision the commander should make for specifying macro actions of field units in SAR.

This is a difficult problem for the human, because it is not feasible for the commander, as the human planner, to quickly and appropriately reason in real time about which alternative might be the best. In fact, it is beyond human capabilities in the problem addressed in this paper. In addition, a review of the related works shows that they have neither thoroughly addressed the stated problem, nor has a proper solution been developed for solving this type of problem. As a result, it is necessary to assist the commander and support the human planner in the strategic decision making procedure by optimally executing the human strategy. Our motivation in this paper is to propose an ideal approach for solving this problem.

Our objective in this paper is to apply artificial intelligence techniques in problem solving. The paper aims to design and implement a heuristic search algorithm that is capable of automated reasoning to select the best choice. In addition, this algorithm is capable of calculating an optimal macro action plan by which the goal will be satisfied in the shortest amount of time.

2 Background to execution problem of incident commander guidance

SAR plays a major role in the response to disasters occurring in urban areas. SAR is concerned with reducing the number of fatalities in the first few days after the occurrence of a disaster. Tasks that should be done in the SAR problem domain are categorised into four task types:

1 conduct reconnaissance and assessment by collecting information on the extent of damage

2 search and locate victims trapped in collapsed structures

3 extract and rescue trapped victims

4 transport injured survivors to hospitals or refuges.

A disaster-affected area contains a number of tasks geographically dispersed in the area, and task information is associated with geographic objects such as buildings, road segments, and city blocks. Tasks entail inter-dependencies; that is, completing a task might enable (discover, release, reveal) another task. To save one victim, who is located at a damaged building in a certain spatial location, a sequence of four tasks should be accomplished. A task to be performed requires one or several capabilities synchronously and a considerable amount of time. Not all tasks are known in advance, and accomplishing tasks reveals new tasks and changes the state of tasks over time. Figure 1 presents a simple task tree that is associated with saving a person. It is notable that the attributes of these tasks may differ from each other and are changeable over time.

Field units are hierarchically organised as a team. A team consists of

I a commander situated at the top level called the strategic level

II field units at the lower level called the tactical level.

This paper refers to a team as a society of cooperative intelligent agents. The team is faced with the problem of carrying out geographically dispersed tasks, under evolving execution circumstances, in a manner that achieves global objectives. Field units, which are spatially distributed in the geographic area, might be robots or humans that are responsible for doing SAR. A field unit, which can be displaced from one location to another:

• would possess different capabilities that are required by tasks

• perceives its local environment

• cooperates and coordinates with other field units in order to maximise the global utility

• might have a set of actions, each of which is associated with a definite speed and a set of capabilities, and autonomously reasons about its actions in order to select and execute its own actions

• has a partial and local view of the world state

• reports to the operations centre

• executes the commander's orders (action plan).

In the SAR domain, field units are categorised into several types according to capabilities such as

1 reconnaissance

2 canine search

3 electronic search

4 light rescue

5 medium rescue

6 heavy rescue

7 volunteer.

Figure 1 A simple task structure consisting of three tasks, which should be accomplished in order to save a victim in the SAR domain (see online version for colours)

The role of the commander is to control and command the field units. The commander has a large, global picture of the world's state, which enables him or his unit to define global objectives for the team and make strategic decisions for the field units. One example of a joint objective is to finish all rescue tasks located within a zone by all field units in minimum time. Figure 2 presents a team of four field units that is commanded by an incident commander. We study this team in detail in Section 3.

Figure 2 Structure of a team of four heterogeneous field units (see online version for colours)

The global perception, which the team commander has of the task environment, is formed by information on location-based temporal macro tasks (LoTeM tasks). A LoTeM task states how many basic tasks of a certain type are spatially located within a definite geographic area during a definite time. The concept of a LoTeM task is shown in Figure 3. Three LoTeM tasks are spatially contained within a geographic area, such as a road segment, city block, etc. These tasks have dynamic and temporal attributes.

Planning and scheduling techniques are two major coordination mechanisms in multi-agent systems. The problem of how agents should get from the current world's state to the desired goal state through a sequence of actions (an action plan) represents a multi-agent planning problem. An action plan specifies a sequence of actions that a definite agent should do. Multi-agent scheduling is the problem of assigning limited resources (agents) to time-consuming tasks within a defined time window while coping with a set of constraints and requirements over time in order to maximise an optimisation criterion.

Macro action planning for a team of field units responding to a disaster crisis is important to the team's commander in order to achieve a global/joint objective, e.g., accomplishing search and rescue tasks in a minimum overall time.

Figure 3 A LoTeM task structure of three task types which are spatially contained within a geographic area at a time (see online version for colours)

Strategic decision making is a technique by which the commander can coordinate and control the actions of field units in the SAR domain by making strategic decisions with regard to the defined requirements. Figure 4 briefly presents the strategic decision making process, which consists of four main phases as follows:

1 Specify a strategy: Strategy specification enables the commander, as a human, to express and encode his or her intuition for action planning. A strategy decomposes a problem (e.g., carrying out SAR by a team within an operational area) into a finite set of small sub-problems, each called a thread. A strategy is a set of prioritised threads that are ordered from high to low according to their importance under human supervision (a small code sketch of this encoding follows the list below). A thread consists of:

I a subset of task types (a subgoal)

II a subset of zones (a sublocation)

III a subset of field units (a subteam).

A strategy might include a field unit in several threads. If we completely ignore the commander, a strategy will be defined using a single thread that includes all task types, the whole operational area, and all field units.

2 Calculate a set of alternatives: Execution of a strategy is the problem of appropriately assigning field units to the threads of a human strategy according to the world's state in real time. A strategy is required to be executed or re-executed whenever a new set of field units enters one thread or several threads. A thread receives a new set of field units from two sources:

I the commander, who can directly send a set of (free) field units to this thread, or

II the higher thread, which releases a set of field units that were assigned to or entered into it and sends this set into the lower thread during the adaptation phase (phase 4).

Human strategy execution is done in two steps. The first step is to calculate the possible alternatives that present feasible choices for solving the thread assignment problem within a time frame. An alternative presents which field units are assigned to which threads, provided that a field unit cannot be assigned to more than one thread in an alternative. The calculated alternatives are input data for the next phase.

3 Make a choice: The second step in the execution of the human strategy is to make a choice by selecting an alternative from the calculated alternatives. The selected choice is referred to as a strategic decision that the commander should select for controlling and coordinating field units. With regard to the strategy definition, a strategic decision constrains a subteam to a subtype of task in a subarea. In other words, a strategic decision assigns a field unit to either a thread or no thread. A field unit which is assigned to a thread is constrained to the thread definition and is delegated to do any task contained by the thread. The selected strategic decision specifies macro actions of field units for a time window; therefore, field units are required to specify and execute their actions taking this decision into account. As a result, a strategic decision is considered a macro action.

4 Adapt the strategic decision in a timely manner: A strategic decision, made in the previous phase at a certain time and being executed by field units, is valid for a limited time. Therefore, the commander is required to adapt (revise, refine) this strategic decision at the right time. The adaptation of a strategic decision includes

I identifying a subset of field units that should be released, at the right time, from the threads to which they are assigned via the strategic decision

II sending the released field units to the lower threads. The adaptation of a strategic decision results in re-executing phase 2.

In the strategic decision making chain, strategic decisions are made sequentially and in a timely manner until the desirable goal is met. These decisions generate a macro action plan that states how and when the team reaches a pre-defined goal. The plan is initiated from one of the presented alternatives at the current time and then evolves over time whenever a new decision should be made in Phase 3. It is obvious that, because of the existence of different choices for making a strategic decision at a given time, several different plans may be generated, each having a strong effect on the efficiency of the SAR operation. Among these plans, there is an optimal plan that guarantees the team's ability to maximise the joint objective. One of the alternatives included in the optimal macro action plan is considered the best strategic decision the commander should select. The optimal execution of the commander's strategy aims to reason about which choice could be the best strategic decision.
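To make the strategy/thread vocabulary of phase 1 concrete, the C# sketch below encodes the example guidance that appears later in Table 5 (Section 3.2) as an ordered, prioritised list of threads. The types are simplified illustrations, not the paper's SAP data model classes.

using System;
using System.Collections.Generic;

// Illustrative type: a thread constrains a subteam to a subgoal (task types) in a sublocation.
record GuidanceThread(int Id, string[] Zones, string[] TaskTypes, string[] Subteam);

class StrategyDemo
{
    static void Main()
    {
        // The strategy is an ordered list of threads, highest priority first (values from Table 5).
        var strategy = new List<GuidanceThread>
        {
            new GuidanceThread(1, new[] { "s3", "s4", "s5" }, new[] { "T0", "T1" },       new[] { "a0", "a2", "a6", "a7" }),
            new GuidanceThread(2, new[] { "s3", "s4", "s5" }, new[] { "T2" },             new[] { "a2", "a6", "a7" }),
            new GuidanceThread(3, new[] { "s1", "s2" },       new[] { "T0", "T1", "T2" }, new[] { "a0", "a2", "a6", "a7" })
        };

        foreach (var t in strategy)
            Console.WriteLine($"Thread {t.Id}: do {{{string.Join(",", t.TaskTypes)}}} " +
                              $"in {{{string.Join(",", t.Zones)}}} with {{{string.Join(",", t.Subteam)}}}");
    }
}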

Figure 4 Strategic decision-making process and role of step 3 in this process as the problem addressed by this paper (see online version for colours)

3 A simulated scenario

3.1 SAR scenario

To develop a better understanding of the problem and how to address it, this section is dedicated to the explanation of a simple simulated SAR scenario to which a team has been assigned.

Imagine that an earthquake disaster has occurred in an urban area and a team has departed to this area to engage in SAR. Figure 5 shows a map, created with a geographic information system (GIS), that visualises the location of four field units and the spatial distribution of five highlighted road segments, which serve as five operational zones at time 0. This map provides timely situational awareness for the commander of SAR.

Figure 5 A simulated SAR scenario (see online version for colours)

Table 1 lists three task types with associated properties to present the SAR domain data. For example, doing an instance of task type T0, which presents one reconnaissance task, requires one capability of type C0 and a duration of 5 min. The domain data are essentially defined and modified by the commander.

Table 1 Matrix of task types to present the SAR domain data

Task-type   ∆t (min)   Capability requirements
                       C0   C1   C2
T0          5          1    0    0
T1          20         0    1    0
T2          60         0    0    1

Capabilities description: C0: Reconnaissance; C1: Search; C2: Light rescue.
Task types description: T0: Reconnaissance; T1: Search; T2: Light rescue.

The team includes four field units and a commander. We assume that all of the field units are free (or idle) at time 0. The action matrix shown in Table 2 lists these field units with an action set associated with each field unit. For example, field unit a2 has two actions and is capable of doing one of them at a time. The second action provides one unit of the capability type C1 with a speed two times faster than the basic level. In summary, by this action, this field unit can carry out one unit of the T1 task within 10 min.
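The 10 min figure above is simply the basic task duration divided by the action speed. A minimal check in C#, with the numbers taken from Tables 1 and 2:

using System;

class DurationDemo
{
    static void Main()
    {
        double baseDurationT1 = 20.0; // min, basic duration of one T1 (search) task, from Table 1
        double actionSpeed = 2.0;     // a2's second action provides C1 at twice the basic speed

        Console.WriteLine($"a2 carries out one T1 task in {baseDurationT1 / actionSpeed} min"); // 10
    }
}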

Table 2 Matrix of field units’ actions

                               Number of capabilities
Field unit ID   Action speed   C0   C1   C2
a0              2              1    0    0
a2              1              1    0    0
                2              0    1    0
a6              1              0    1    0
                2              0    0    1
a7              1              0    1    0
                2              0    0    1

Table 3 shows the shortest distances among six locations at time 0. We assume that the average moving speed of all field units equals 20 metres per minute through the road network. To keep the problem simple, we assume that the information provided by this table does not change over time. In a real situation, this table would be calculated and updated by a team of road-clearing vehicles.
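Given the assumed moving speed of 20 m/min, travel times follow directly from the distances in Table 3; a small illustrative calculation in C#:

using System;

class TravelTimeDemo
{
    static void Main()
    {
        double speed = 20.0;     // metres per minute, assumed for all field units
        double distS2toS1 = 225; // metres, shortest distance s2-s1 from Table 3

        Console.WriteLine($"Travel time s2 -> s1: {distS2toS1 / speed} min"); // 11.25 min
    }
}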

A set of 12 LoTeM tasks, which are observed or estimated at time 0, are geo-located in five road segments. Table 4 presents the state of the task environment at time 0. For example, the 10th item states that, in the proximity of road s5, five tasks of the light rescue type are estimated to be revealed in the future and six tasks of the same type have been revealed and are ready to be carried out by field units. Because of the 'enabling' dependency among tasks, e.g., between the 9th task and the 10th task, if the whole 9th task (2 not yet enabled plus 5 enabled) is completely done, it is estimated that five tasks of the light rescue type will be revealed in the proximity of road s5. From another point of view, six civilians have been successfully located (searched) under debris and now need to be rescued at location s5. In addition, five persons are estimated to be under debris, and to rescue them, 10 tasks of type T0 and seven tasks of type T1 should be completely carried out in the same geographic area. Over time, estimated information is replaced with real and observed information.

These data are used to provide timely situational awareness of the world's state and form a big picture of the crisis situation for the commander.

Table 3 The shortest distances (given in metres) among the six locations visualised in Figure 5

     s2    s1    s3    s4    s5    s6
s2   0     225   447   764   364   625
s1   225   0     370   687   418   548
s3   447   370   0     343   452   221
s4   764   687   343   0     618   224
s5   364   418   452   618   0     476
s6   625   548   221   224   476   0

Table 4 State of a set of 12 LoTeM tasks geo-located in five road segments at time 0

LoTeM task Id   Location (Road S.)   Task-type   Not yet enabled amount   Enabled amount
1               s1                   T0          0                        25
2               s1                   T2          0                        4
3               s2                   T1          0                        5
4               s2                   T2          10                       5
5               s3                   T0          0                        15
6               s3                   T1          8                        0
7               s3                   T2          2                        8
8               s5                   T0          0                        10
9               s5                   T1          2                        5
10              s5                   T2          5                        6
11              s4                   T1          0                        18
12              s4                   T2          10                       2

3.2 A scenario of strategic decision making process

This subsection presents a simple scenario of the strategic decision making process, in which the commander of the team is faced with the problem of making the best strategic decision.

The commander first specifies a strategy, as Table 5 presents. The strategy in this scenario is composed of three threads that partition the objective into three sub-problems. Thread 1 states that the first and highest priority for the team is to do two task types, {T0, T1}, at three geographic zones, {s3, s4, s5}. Any appropriate and available subset of the four field units {a0, a2, a6, a7} can be assigned to this thread in order to do the tasks that this thread will contain after the strategy is executed in real time. The human strategy has defined field unit a2 in all three threads; this means that this field unit is capable of being assigned to one of the three threads via a strategic decision. In addition, abstractly, the task environment defined by thread 2 is completely dependent on thread 1. We assume the commander has sent all four field units, as free ones, to thread 1 at time 0.

Table 5 An example of incident commander guidance (human strategy)

Thread Id   A sub-location   A sub-goal (task type)   A sub-team
1           s3, s4, s5       T0, T1                   a0, a2, a6, a7
2           s3, s4, s5       T2                       a2, a6, a7
3           s1, s2           T0, T1, T2               a0, a2, a6, a7

Strategy execution is the next step, and it includes two phases. First, the whole problem needs to be partitioned into three sub-problems, taking into account the human strategy and the real-time state of the world (see Table 6).

Table 6 Partitioning the whole problem into three subproblems during strategy execution at time 0

Thread Id   LoTeM task Id    Field units assigned to   Field units sent or released into
1           5, 6, 8, 9, 11                             a0, a2, a6, a7
2           7, 10, 12
3           1, 2, 3, 4

Calculation of the feasible alternatives results in the 10 choices presented in Table 7. Each alternative is considered a potential candidate for the strategic decision that the commander can select/make at time 0.

Table 7 A set of 10 alternatives (choices) calculated for making a strategic decision at time 0

Alternative no.   Assignment to Thread 1   Assignment to Thread 2   Assignment to Thread 3
1                 a2                       a7                       a0, a6
2                 a2                       a6, a7                   a0
3                 a0, a7                   a6                       a2
4                 a2, a7                   a6                       a0
5                 a0, a2                   a7                       a6
6                 a0, a2                   a6, a7
7                 a0, a2, a6               a7
8                 a2, a6, a7                                        a0
9                 a0, a2, a7               a6
10                a0, a2, a6, a7

Phase 2 in strategy execution is to select the best choice from the 10 alternatives; this choice is the optimal strategic decision made by the commander at time 0. Therefore, the key question is: which choice can be the optimal strategic decision?


4 Literature review

Good works focused on the optimisation of emergency response operations, taking into account a centralised approach, have been done by different schools. This section reviews some of them. A review of the related works shows that they have not thoroughly addressed the stated problem and that a proper solution has not been developed for solving this type of problem.

A system of distributed autonomous GIS (DAGIS) is proposed by Nourjou and Gelernter (2015) to solve the coalition formation problem within a human team for public safety applications via an automated mechanism. DAGIS implies a system (social network) of autonomous software agents that carry out the coalition formation method on behalf of human users with some degree of independence or autonomy and, in so doing, automatically communicate with each other and employ problem-solving algorithms. DAGIS runs on mobile devices such as smart phones, and an instance of DAGIS is used by one type of human user: field unit, civilian, or incident commander. A DAGIS that interacts with the incident commander only follows the human's guidance without optimising his decision.

The incident command system (ICS) is the official designation for a particular approach used by many public safety professions (e.g., firefighters and police) to assemble and control the temporary systems they employ to manage personnel and equipment at a wide range of emergencies, such as fires, multi-casualty accidents (air, rail, water, roadway), natural disasters, hazardous materials spills, and so forth (Bigley et al., 2001). The Incident Commander, the highest-ranking position within the ICS, is ultimately responsible for all activities that take place at an incident, including the development and implementation of strategic decisions and the ordering and releasing of resources. The planning section, one of the sections that reports directly to the incident commander, develops the action plan to accomplish the system's objectives. It collects, evaluates, and disseminates information about the development of the incident and the status of resources. Information is needed to understand the situation, predict probable courses of events, prepare alternative strategies, and control operations. An action plan is developed in five phases:

• understand the situation

• establish incident objectives (priorities, objectives, strategies, tactics/tasks)

• develop an action plan

• prepare and disseminate the plan

• continually execute, evaluate, and revise the plan (FEMA, 2012).

The ICS uses the strategic planning approach for the coordination of emergency operations by goal selection, goal decomposition, grouping people into units, and assigning units to subgoals. Unfortunately, FEMA provides a set of useful guidelines about practices but does not explicitly identify the algorithms and design requirements for information systems that would make incident action plans. It is not clear how the incident commander is involved in the action planning process and how the information system assists the incident commander via a mixed-initiative planning system.

STaC addresses multi-agent planning problems in dynamic environments where most goals are revealed during execution, where uncertainty in the duration and outcome of actions plays a significant role, and where unexpected events can cause large disruptions to existing plans (Maheswaran et al., 2011). STaC is composed of a strategy specification language that captures human-generated high-level strategies and corresponding algorithms that execute them in dynamic and uncertain settings. This partitions the problem into strategy generation, designed by humans and understood by the system, and tactics, orchestrated by the system with information to and from responders on the ground. STaC gives the ability to create changing subteams with task threads under constraints (e.g., focus on the injured). The connection between a STaC strategy and the STaC execution algorithm is the notion of capabilities: agents have capabilities; tasks require capabilities. STaC dynamically updates the total capability requirements for the tasks in the strategy and assigns agents to tasks during execution following the human guidance. The big inefficiency in this approach is that the execution algorithm searches for and assigns a subteam providing the minimum total capability requirements for a thread, while the best decision may be to select a subteam providing the maximum one. This paper proposes an AI problem-solving technique that aims to optimise the decision-making problem by identifying the best choice from a set of alternatives effectively and efficiently.

DEFACTO is a multi-agent based tool for training incident commanders for large scale disasters (man-made or natural) (Schurr and Tambe, 2008). One key aspect of the proxy-based coordination is 'adjustable autonomy', which refers to an agent's ability to dynamically change its own autonomy, possibly to transfer control over a decision to a human or another agent. A transfer-of-control strategy is a pre-planned sequence of actions to transfer control over a decision among multiple entities. For example, an AH1H2 strategy implies that an agent (A) attempts a decision; if the agent fails in the decision, then control over the decision is passed to a human H1, and if H1 cannot reach a decision, then control is passed to H2. The adjustable autonomy concept is different from the strategic decision-making problem discussed here. Therefore, this approach cannot be used to solve the problem stated by this paper.

The RoboCup Rescue Simulation program aims to advance research in the area of disaster management and SAR (Kitano and Tadokoro, 2001). It provides a platform for disaster management where heterogeneous field agents (police, fire brigades, and ambulances) coordinate with each other to deal with a simulated disaster scenario. Police agents have to clear road blockades to provide access to the disaster sites, ambulance agents have to rescue civilians, and fire brigade agents have to control the spread of fire and extinguish it. The simulator also provides centres, a Police Office, a Fire Station, and an Ambulance Centre, to help the field agents coordinate. A partitioning strategy can partition/divide the disaster space among agents in a pre-determined and homogeneous fashion (Paquet et al., 2004), or another strategy (Nanjanath et al., 2010) allows the centres to partition the city into clusters of roads and buildings using the k-means algorithm and to assign each cluster to an agent. These strategies are considered simplified versions of the strategy proposed in this paper. Unfortunately, the approaches that have been proposed for the RoboCup Rescue Simulation do not take into account the macro action planning problem and human guidance.

One of the most widely-used techniques for problem-solving in artificial intelligence is state-space search (Tambe and Norvig, 2009). Formulation of a problem in a state-space search framework requires four basic components: state representation, initial state, expansion operator, and goal state. The objective is to find a sequence of actions that transforms the start state into a goal state and also optimises some measure of the quality of the solution (Kwok and Ishfaq Ahmad, 2005). Heuristic searches, such as A* search, are highly popular means of finding least-cost plans due to their generality, strong theoretical guarantees on completeness and optimality, and simplicity of implementation (Cohen et al., 2010). In this algorithm, a cost function f(s) is attached to each state s in the search space, and the algorithm always chooses the state with the minimum value of f(s) for expansion. The function f(s) can be decomposed into two components g(s) and h(s) such that f(s) = g(s) + h(s), where g(s) is the cost from the initial state to state s, and h(s) (which is also called the heuristic function) is the estimated cost from state s to a goal state. Since g(s) represents the actual cost of reaching a state, it is h(s) where the problem-dependent heuristic information is captured. Indeed, h(s) is only an estimate of the actual cost from state s to a goal state, denoted by h*(s). An h(s) is called admissible if it satisfies h(s) <= h*(s), which in turn implies f(s) <= f*(s). Many problems that can be formalised in the state-space search model are solvable by using different versions of this technique. Some problems include the optimal task assignment/allocation problem (Kwok and Ishfaq Ahmad, 2005; Shahul et al., 2010), finding the shortest paths on real road networks (Zeng and Church, 2009), and action planning (Bulitko and Lee, 2006; Bonet and Geffner, 2001). A specific search algorithm, which has been designed for a specific problem, is appropriate for solving a specific type of problem. As a result, a few search algorithms have been developed for different problems. This paper aims to apply the state-space search technique to the optimal execution of the incident commander's strategy.

The Markov decision process (MDP) is used for problems of planning under uncertainty (Boutilier et al., 2011). An MDP models problems of sequential decision making that include actions that transform a state into one of several possible successor states, with each possible state transition occurring with some probability. The MDP guarantees an optimal solution, but it does not work for large problems because of its high time and space complexity, since it requires calculating all feasible states. This paper proposes a faster approach with a further-reduced state space.

In mixed-initiative planning systems, humans and machine collaborate in the development and management of plans, with each side providing the capabilities at which it does best (Burstein and McDermott, 1996). Human-system interaction generates and refines plans by adding and removing activities during execution, while minimising changes to a reference plan or schedule (Ai-Chang et al., 2004). In strategic decision making, the human provides high-level strategy guidance for the software system to use during real-time execution to produce concrete plans and satisfy particular goals that are not revealed a priori but are revealed during execution.

5 Methodology

We formulate the stated problem as a search problem. This paper presents a state-space search algorithm that intends to calculate/make an optimal macro action plan that minimises the total time required by the team to achieve the goal.

The calculated plan is composed of a sequence of strategic decisions, and the first one is the alternative the commander should select for optimal execution of his strategy at that time. In addition, a time window is calculated and associated with each strategic decision to show when that strategic decision starts and when it finishes.

Designing the algorithm requires two essential steps: problem statement and problem formulation. See Nourjou et al. (2014a) for more detailed information.


5.1 Problem statement

The problem addressed by this paper is stated as follows:

argmin_P f(P) = ∆t                                                    (1)

subject to:
P = {d1, d2, . . . , dm}
T = {0, t2, t3, . . . , ∆t}
d1 ∈ A
A = {a1, a2, . . . , an}

where:
∆t: the total time to reach the goal
P: an optimal macro action plan that minimises ∆t
f: the goal is achieved under the plan P
ti: the time of making decision di
d1: the best choice at time 0
A: the set of available alternatives.

5.2 Problem modelling

A complete data model is required to formulate and model the problem. Moreover, this data model enables us to design and implement the algorithm. The SAP data model was used in this paper to develop the algorithm. Figure 6 shows a part of the SAP data model that presents the classes of strategy, thread, node, thread assignment, and alternative. The SAP data model is discussed completely in Nourjou et al. (2014a).

6 Algorithm

The state-space search algorithm, which is presented in Algorithm 1, is proposed by this paper for solving the optimal execution problem of incident commander guidance. This algorithm combines three key algorithms:

1 the A* search algorithm

2 the human guidance execution algorithm (Nourjou et al., 2014d)

3 LoTeM task assignment algorithm (Nourjou et al., 2014c).

Because of limited space, we are not concerned here with the details of Algorithms 2 and 3.

This algorithm generates new nodes and calculates the attributes associated with each node. The computation starts from the initial node, which models the world's state at time 0, the time of strategy execution. A sub-algorithm is responsible for generating a new node per alternative available at that time. For a newly generated node, LoTeM task scheduling and task execution are done under the strategic decision encoded by this node until this strategic decision needs to be adapted. The adaptation time is the node's finish time, and this interval is the time window during which the node's strategic decision is valid. Then, the attribute h is calculated. Finally, all these nodes are added to the state space in order to extend the search tree.

Figure 6 A part of the SAP data model which is used in this paper for problem modelling (see online version for colours)

Source: Nourjou et al. (2014a)

A sub-algorithm searches the state space to select the node that contains the minimum f. If the goal is not met by this node, a new set of alternatives, as feasible choices for revising the node's strategic decision, will be calculated, and the algorithm deals with these alternatives as it dealt with the initial ones. If this node reaches the goal, the search tree will be explored from this leaf to the root node to extract a macro action plan as the optimal one.

The proposed algorithm integrates a number of sub-algorithms to solve the main problem. The following subsections are dedicated to the methods/algorithms used in the main algorithm.

6.1 Subalgorithm: generate a new node

The objective of this algorithm is to generate a new node per alternative using the parent node. This alternative plays the role of a strategic decision for the node and remains fixed during the node's lifetime. The attributes associated with the stateNode class are described as follows:

• g0 is the time at which the node is generated. The g2 attribute of the parent node is assigned to this attribute.

• g2 is the time at which this node ends (or finishes). This parameter indicates when the node's strategic decision needs to be adapted. These parameters are calculated in the LoTeM task assignment algorithm.

• f is an overall time from the root node to the goal node.

• h is an estimation of cost (amount of time) that is required by the team to reach the goal state from the current node.


• ThreadAssignments_node comprises a set of instances of the threadAssignment class. An instance of the 'threadAssignment' class states what field units are assigned to what thread. This property presents a strategic decision associated with this node. It is valid during the lifetime of this node.

• Segments_node encodes LoTeM tasks.

If we execute this algorithm on the problem stated in Sections 1 and 2, 10 new nodes will be generated. The g0 property of all these nodes is set to 0, which points to the initial time.
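A minimal sketch of how these attributes might be grouped in code is given below. The types are simplified stand-ins that we introduce for illustration; the actual stateNode, threadAssignment and segment classes of the SAP data model (Figure 6) are richer.

using System;
using System.Collections.Generic;

// Simplified stand-ins for the SAP data model classes; names follow the text loosely.
class ThreadAssignment
{
    public int ThreadId;
    public List<string> FieldUnits = new List<string>();
}

class StateNode
{
    public double G0;      // time at which the node is generated (the parent's g2)
    public double G2;      // time at which the node's strategic decision must be adapted
    public double H;       // estimated remaining time to reach the goal
    public double F;       // overall time from the root to the goal through this node
    public StateNode Parent;                           // link used later to extract the plan
    public List<ThreadAssignment> ThreadAssignments_node = new List<ThreadAssignment>();
    public object Segments_node;                       // encodes the LoTeM tasks (kept abstract here)
}

class StateNodeDemo
{
    static void Main()
    {
        // Per the text, every node generated from the initial node at time 0 has g0 = 0.
        var root = new StateNode { G0 = 0, G2 = 0 };
        var child = new StateNode { G0 = root.G2, Parent = root };
        Console.WriteLine($"child.g0 = {child.G0}");
    }
}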

Algorithm 1 The heuristic search algorithm to make the best choice among a set of alternatives available for optimal execution of human strategy in macro action planning

Data: A : a set of available alternatives.
Data: n0 : the initial node, an instance of the "stateNode" class, that models the world's state at time 0.
Result: d1 : the best alternative as the best strategic decision.
Result: t : a minimum overall time.
Result: p : the optimal macro action plan.
StateSpace ← ∅;
while true do
    for a ∈ A do
        n ← Generate_aNew_Node(a, n0);
        Assign_LoTeMTasks_toFieldUnits(n);
        n.h ← Calculate_h(n);
        StateSpace ← StateSpace ∪ {n};
    end
    n ← Select_a_Node(StateSpace);
    if n.g2 = null then
        d1 ← null;
        return [null, null, null];
    end
    if Is_the_goalNode(n) = true then
        p ← Extract_thePlan(n, StateSpace);
        d1 ← Select_theFirst_Member_from(p);
        t ← n.g2;
        return [d1, p, t];
    end
    n0 ← n;
    A ← Calculate_aNewSet_Alternatives(n0);
end

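For readers who prefer code to pseudocode, the fragment below is a compact, hedged C# skeleton of the loop in Algorithm 1. The Node type and every stub delegate are placeholders of ours; the real sub-algorithms (node generation, LoTeM task assignment, calculation of h, calculation of new alternatives) are those of Algorithms 2 and 3 and are not reproduced here.

using System;
using System.Collections.Generic;
using System.Linq;

class Node
{
    public double? G2;       // null when the node's decision cannot be completed
    public double H;
    public double F;
    public bool IsGoal;
    public Node Parent;
    public object Decision;  // the alternative (strategic decision) encoded by this node
}

class SearchSkeleton
{
    static void Main()
    {
        // Stub sub-algorithms, standing in for Algorithms 2 and 3 and the node generator.
        Func<object, Node, Node> generateNode =
            (alt, parent) => new Node { Parent = parent, Decision = alt, G2 = 10, IsGoal = true };
        Action<Node> assignLoTeMTasks = node => { };   // would schedule LoTeM tasks and set node.G2
        Func<Node, double> calculateH = node => 0;     // would aggregate the total dependent durations
        Func<Node, List<object>> newAlternatives = node => new List<object>();

        var alternatives = new List<object> { "alt-1", "alt-2" };
        var n0 = new Node { G2 = 0 };
        var stateSpace = new List<Node>();

        while (true)
        {
            foreach (var a in alternatives)
            {
                var n = generateNode(a, n0);
                assignLoTeMTasks(n);
                n.H = calculateH(n);
                n.F = (n.G2 ?? 0) + n.H;                 // f = g2 + h, as in Section 6.3
                stateSpace.Add(n);
            }

            var best = stateSpace.OrderBy(x => x.F).First();  // node with minimum f (Section 6.5)
            if (best.G2 == null) { Console.WriteLine("no solution"); return; }
            if (best.IsGoal)
            {
                // Walk parent links back to the root to extract the macro action plan (Section 6.6).
                var plan = new List<object>();
                for (var cur = best; cur.Parent != null; cur = cur.Parent) plan.Insert(0, cur.Decision);
                Console.WriteLine($"plan length = {plan.Count}, total time = {best.G2}");
                return;
            }
            n0 = best;
            alternatives = newAlternatives(n0);
        }
    }
}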

6.2 Subalgorithm: assign LoTeM tasks to field units (Nourjou et al., 2014c)

Algorithm 2 presents a heuristic algorithm that intends to assign LoTeM tasks to field units and execute LoTeM tasks under the node's strategic decision. In addition, this algorithm calculates the time at which the node's strategic decision needs to be revised. This algorithm identifies a subset of field units that should be released from threads. The 'Segments_node' and 'g2' properties of the node will be updated by this algorithm. This algorithm has been discussed in detail by Nourjou et al. (2014c).

Algorithm 2 Heuristic algorithm for dynamic assignment of LoTeM tasks to field units within a node

g2 ← g0;
L ← Create_emptyset_of_legalAssignment();
while true do
    Select_Efficient_Agents();
    T ← Select_Active_MacroTasks();
    A ← Select_Idle_Agents(g2);
    L ← L ∪ A;
    if Identify_theRelease_Time() = true then
        return;
    end
    while true do
        if |L| = 0 or |T| = 0 then
            break;
        end
        T2 ← Nominate_MacroTasks();
        U ← Calculate_Utilities();
        u ← Find_theHighest_Utilities();
        if u.benefitRation >= 10 then
            Assign_Agents_toMacroTasks();
            continue;
        end
        else
            T ← T − T2;
            continue;
        end
    end
    if |L| = 0 and |T| > 0 then
        g2 ← Calculate_earliestFinishTime();
        continue;
    end
    else if |L| > 0 and |T| = 0 then
        if Identify_theRelease_Time() = true then
            return;
        end
        else
            g2 ← Update_ProblemState();
            if g2 = null then
                return;
            end
            else
                continue;
            end
        end
    end
end

Source: Nourjou et al. (2014c)

If this algorithm is executed on the 6th node, which contains the 6th alternative, the task schedule presented in Figure 7 will be calculated. This algorithm has calculated that field unit a0 needs to be released from its thread at time 96, because thread 1 does not need to keep this field unit anymore. As a result, the g2 attribute will be updated to 96. Table 8 shows the state of the LoTeM tasks at time 96 as calculated by this algorithm.

Algorithm 3 Automated algorithm for calculation of a new set of feasible alternatives in incident commander guidance execution

Data: n : an entity of the "stateNode" class of the data model.
Data: S : an entity of the "strategy" class.
Data: D : the problem domain.
Data: p : type of selection method.
Result: N : a set of entities of the "stateNode" class that present feasible alternatives.
for i ← 1 to |S.Threads| do
    t ← S.Threads[i];
    ta ← n.ThreadAssignments_node[i];
    ta.MacroTasks_ofThread ← f_Calculate_MacroTasks(n.Segments_node, t, D);
end
Na ← ∅;
Nb ← ∅;
Nb ← Nb ∪ {n};
for i ← 1 to |S.Threads| do
    t ← S.Threads[i];
    for nb ∈ Nb do
        ta ← nb.ThreadAssignments_node[i];
        A1 ← f_Identify_Agents_ResidentIn(ta);
        A2 ← f_Identify_Agents_ReceivedBy(ta);
        for m0 ∈ ta.macroTask_ofThread do
            tm0 ← m0.TemporalMacroTasks.Last();
            tm0.LegalAssignments ← f_Select_Efficient_Agents(m0, t, A1, A2);
        end
        M ← ta.macroTask_ofThread;
        C1 ← f_Form_EfficientCoalitions(M, A1, A2);
        C2 ← f_Purify_Coalitions(C1, M);
        C3 ← f_Select_Coalitions(C2, p);
        for j ← 1 to |C3| do
            Na ← Na ∪ {f_Generate_NewNode(C3[j], nb)};
        end
    end
    Nb ← Na;
    Na ← ∅;
end
N ← Nb;

Source: Nourjou et al. (2014d)

6.3 Subalgorithm: calculate h

Figure 8 shows a heuristic algorithm that aims to calculate the 'h' and 'f' properties of a node at time g2. First, for each LoTeM task of the task environment formulated by the node, the parameter TDD (total dependent duration) is calculated using two kinds of information:

1 the number of enabled tasks

2 the number of tasks dependent on this LoTeM task.

If a set of field units is assigned to a LoTeM task, this algorithm will take the finish time of this task into account in the calculation. Then, the 'h' variable of the node is the aggregation of all total dependent durations of the LoTeM tasks. Consequently, the 'f' property of this node will be the aggregation of the 'h' and 'g2' properties. Table 9 is obtained if we execute this algorithm for the LoTeM tasks shown in Table 8.
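The exact computation is given in Figure 8 and is not reproduced here. The C# fragment below shows only the simplest case we can infer from the text and from Tables 8 and 9 (enabled amount times basic duration, plus the same product for the tasks that depend on the LoTeM task); it deliberately ignores the finish-time adjustment applied when field units are already assigned, so it is a hedged illustration rather than the paper's formula.

using System;
using System.Collections.Generic;

class TddDemo
{
    static void Main()
    {
        // Basic durations per task type, from Table 1 (minutes).
        var duration = new Dictionary<string, double> { ["T0"] = 5, ["T1"] = 20, ["T2"] = 60 };

        // LoTeM task 3 at s2 (Table 8): 5 enabled T1 tasks; LoTeM task 4 depends on it and
        // contributes its 10 not-yet-enabled T2 tasks in the same road segment.
        double own = 5 * duration["T1"];
        double dependent = 10 * duration["T2"];

        Console.WriteLine($"TDD(LoTeM task 3) = {own + dependent} min"); // 700, matching Table 9
    }
}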

Figure 7 Task/action scheduling in the 6th node (see online version for colours)

Table 8 State of the LoTeM task environment forecast at time 96 in the 6th node

No.   Location (Road S.)   Task-type   Not yet enabled amount   Enabled amount
1     s1                   T0          0                        25
2     s1                   T2          0                        4
3     s2                   T1          0                        5
4     s2                   T2          10                       5
5     s3                   T0          0                        0
6     s3                   T1          0                        8
7     s3                   T2          2                        5
8     s5                   T0          0                        0
9     s5                   T1          0                        7
10    s5                   T2          5                        3
11    s4                   T1          0                        9
12    s4                   T2          5                        7

6.4 How to expand the state space

The search space is a search tree in which each node presents a definite strategic decision. The search tree is expanded by new nodes that are generated from the input alternatives. The LoTeM task assignment algorithm cannot produce new nodes. If the 10 nodes, which were newly generated and calculated, are added to the search space, the result will be the search tree shown in Figure 9.

To reach a quick solution, the search tree is expanded only with new alternatives calculated by executing or re-executing the human strategy. The strategic decision making problem is prioritised higher than the task scheduling problem. As a result, a heuristic algorithm that does not produce any nodes was used in the presented algorithm.

Figure 8 Heuristic algorithm for calculating the 'Total Dependent Duration' parameter which is associated with three LoTeM tasks located in the same geographic area (see online version for colours)

Table 9 The ‘total dependent duration’ variable calculated for LoTeM tasks of Table 8

LoTeM task no.   Total dependent duration
1                125
2                240
3                700
4                300
5                0
6                280
7                150
8                0
9                440
10               90
11               390
12               420

6.5 Subalgorithm: select a node

The purpose of this algorithm is to select a node from the state space. There may be different methods to select a node according to some criteria. A selected node will be used for two purposes:

• to select as the goal state or

• to expand the search tree.

Our method in this paper is to select the node with the smallest f from the search space. This method is used in A* search algorithms. Execution of this method on the calculated search tree resulted in the selection of node 6.

6.6 Sub-algorithm: extract the plan

This algorithm will be executed if the selected node is recognised as the goal node, in which the global goal is satisfied (or met). The objective of this algorithm is to extract the optimal macro action plan from the search tree using this selected node (the leaf node). The leaf node indicates the time at which the goal is met, and the root node indicates the best choice for making the strategic decision at time 0. Therefore, the 'g2' property of the selected node presents the minimum overall time (cost) in which the team can reach the objective from the initial state.

Figure 9 The search tree (state space) that includes 10 newly generated nodes

6.7 Subalgorithm: calculate a new set of alternatives

This algorithm is run if the selected node is not identified as the goal node. Algorithm 3 is used to calculate a set of new alternatives for revising the selected node's strategic decision at time ‘g2’. The ‘g2’ property indicates the time at which a subset of field units, which are assigned to a subset of threads via the strategic decision, should be released (freed) from these threads; a released field unit is then sent into the lower thread.

At this time, we are faced with the problem of re-assigning new field units to a thread; these new field units have entered the thread from the higher thread. A thread of this type can select and keep any sufficient subset of the field units and send the unwanted ones into the lower thread, but there may be several such subsets available to this thread. Consequently, a number of scenarios may exist that represent different distributions of field units among threads at a given time. An alternative is a candidate for making the final strategic decision. This algorithm has been discussed in detail by Nourjou et al. (2014d).
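The core combinatorial idea can be sketched as follows, with hypothetical identifiers and without the feasibility checks of Algorithm 3 (Nourjou et al., 2014d): when units arrive at a thread from the higher thread, every subset of the arrivals is a candidate set for the thread to keep, and the complement would be passed to the lower thread.

```csharp
// Rough sketch only: enumerate the subsets of arriving field units that a thread
// could keep; the complement of each subset would be sent to the lower thread.
// The "sufficient subset" feasibility check is deliberately omitted here.
using System.Collections.Generic;
using System.Linq;

public static class AlternativeSketch
{
    public static IEnumerable<List<string>> KeepableSubsets(IReadOnlyList<string> arrivingUnits)
    {
        int subsetCount = 1 << arrivingUnits.Count;
        for (int mask = 0; mask < subsetCount; mask++)
            yield return Enumerable.Range(0, arrivingUnits.Count)
                                   .Where(i => (mask & (1 << i)) != 0)
                                   .Select(i => arrivingUnits[i])
                                   .ToList();
    }
}
```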

We saw in Figure 7 that the strategic decision of the 6th node should release field unit ‘a0’ from thread 1 at time 96 (Nourjou et al., 2014d). Figure 10 presents the set of new feasible alternatives calculated by running this algorithm for this node. The newly calculated alternatives are used to generate new nodes. Figure 11 shows the state space expanded by the new alternatives.

7 Implementation

The algorithm presented in the previous section was implemented in GICoordinator. The design of the GICoordinator highlights the significant issue addressed by this algorithm: collaboration between the human (incident commander) and an intelligent software agent. The human is equipped with a computer running an instance of GICoordinator. This system is a GIS-based intelligent assistant that collaborates with the commander and supports the IC in executing his guidance (Nourjou et al., 2014b). The C# programming language was used to implement the core of the GICoordinator.

Figure 10 Adaptation of the strategic decision associated with the 6th node at time 96 (see online version for colours)

Source: Nourjou et al. (2014d)

Figure 11 The search tree which is expanded from node 6 (see online version for colours)

The proposed algorithm was executed on the scenario stated in Section 3. To evaluate the proposed approach, we applied the algorithm to various strategies and to teams of different sizes. For each scenario, we recorded the computation time, the number of generated nodes, the minimum total time to reach the goal, and the type of search method.


8 Results

Tables 10 and 11 show the results calculated by executing the presented algorithm on the scenario stated in Section 3. Alternative 6, shown in Table 7, was selected as the best choice among the 10 alternatives for optimal execution of the human strategy in the strategic decision-making process. This choice is the best strategic decision that the commander can select at time 0 for macro action planning. Time 894 is the minimum total time (cost) in which the goal state is estimated to be achieved via the optimal macro action plan. Figure 12 presents the resulting strategic decision in the GIS-based interface for the human.

Table 10 The best alternative (choice) selected for optimally executing the human strategy stated in Section 3

Alternative no. Minimum total time (in minutes)

6 894

Table 11 The optimal macro action plan calculated for the scenario stated in Section 3

Strategic            Field units assigned  Field units assigned  Field units assigned  Start time
decision no.         to Thread 1           to Thread 2           to Thread 3           (assignment time)
1 (Alternative 6)    a0, a2                a6, a7                –                     0
2                    a2                    a6, a7                a0                    96
3                    a2                    a6, a7                –                     180
4                    –                     a6, a7                a2                    393
5                    –                     a6, a7                –                     463
6                    –                     –                     a6, a7                576
7                    –                     –                     –                     894

In addition, the optimal macro action plan was calculated by this algorithm. A sequence of seven strategic decisions forms this plan. The first one is Alternative 6, and the others are calculated by the algorithm. A strategic decision specifies which field units are assigned to which threads and for what lifetime.

8.1 Evaluation

To evaluate the efficiency of the proposed algorithm, we executed it on nine scenarios (three strategy types combined with three team sizes), giving 27 examples when each scenario is run with each of the three search methods.


Figure 12 Display of the best strategic decision made for the scenario stated in Section 3 (see online version for colours)

The task environment of the SAR scenario stated in Section 3 was used to generate the 27 examples in this evaluation. Three human strategies were defined:

1 strategy type I, composed of three threads that are independent in their field units

2 strategy type II, composed of three semi-independent threads

3 strategy type III, a complex strategy composed of three threads that contain all field units.

The three search methods were as follows (a sketch of one possible reading of the α parameter follows the list):

1 the A* search method with α = 1.0

2 the A* search method with α = 0.3

3 the Breadth-First algorithm, which corresponds to the A* search method with α = 0.0.
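The evaluation formula is not restated here, so the following sketch reflects one plausible reading, assumed for illustration: α scales the heuristic term h, so α = 1.0 uses the full (possibly inadmissible) heuristic and expands the fewest nodes, smaller values rely on it less and explore more of the tree, and α = 0.0 ignores the heuristic entirely, which matches the Breadth-First case above.

```csharp
// Assumed reading of the α parameter (an interpretation, not taken verbatim from the paper):
// f = g2 + α·h, so α = 0.0 reduces node selection to ordering by g2 alone.
public static class WeightedEvaluation
{
    public static double F(double g2, double h, double alpha) => g2 + alpha * h;
}
```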

Table 12 shows complete, feasible, and optimal solutions for the simulated scenarios. An optimal macro action plan was calculated for each simulated scenario. Three types of information were computed for each scenario’s plan:

1 the computation time (run time),

2 number of generated nodes, and

3 the minimum overall time.

The results show that the breadth-first search method can guarantee the optimality of the solution, but it must generate a large state space containing a huge number of nodes in order to find the goal node. Consequently, it cannot quickly calculate a plan for a complex SAR problem. The A* search method finds a semi-optimal solution in a reasonable computation time, but for a complex problem an appropriate value of α must be chosen.


Table 12 Evaluation of the proposed algorithm using the simulated scenarios

Team  Strategy  Minimum total      Computation time    Number of        Search
size  type      time (in minutes)  (in milliseconds)   generated nodes  method
4     I         1074               5                   94               A* with α = 1.0
4     I         1074               5                   93               A* with α = 0.3
4     I         1074               5                   91               Breadth-First
4     II        1104               7                   95               A* with α = 1.0
4     II        1104               7                   97               A* with α = 0.3
4     II        1074               10                  100              Breadth-First
4     III       931                9                   102              A* with α = 1.0
4     III       931                9                   97               A* with α = 0.3
4     III       931                22                  141              Breadth-First
8     I         725                12                  99               A* with α = 1.0
8     I         530                12                  107              A* with α = 0.3
8     I         530                25                  138              Breadth-First
8     II        541                12                  105              A* with α = 1.0
8     II        541                12                  103              A* with α = 0.3
8     II        540                32                  167              Breadth-First
8     III       1082               9                   109              A* with α = 1.0
8     III       729                15                  121              A* with α = 0.3
8     III       558                38                  168              Breadth-First
12    I         714                12                  95               A* with α = 1.0
12    I         714                12                  102              A* with α = 0.3
12    I         384                27                  158              Breadth-First
12    II        374                12                  106              A* with α = 1.0
12    II        374                12                  106              A* with α = 0.3
12    II        372                30                  179              Breadth-First
12    III       1072               9                   178              A* with α = 1.0
12    III       661                17                  230              A* with α = 0.3
12    III       403                59                  314              Breadth-First

9 Conclusion

This paper applied search algorithms, which belong to artificial intelligence, to solve the optimal decision-making problem addressed throughout this presentation. The presented algorithm is capable of reasoning about the best choice among a set of alternatives to optimally execute the commander’s strategy in the strategic decision-making problem in the SAR domain. The algorithm calculates a complete, feasible and optimal solution for the problem addressed by this paper. The three results achieved by executing this algorithm on a simulated scenario are:

1 a macro action plan made

2 an overall time calculated

3 an alternative selected.


These results support the commander’s decisions and assist the human in the effective control and coordination of field units’ actions through macro action planning.

Incident commanders need a quick and feasible solution, and optimisation of the human strategy execution problem is difficult. We certainly do not claim optimality of the calculated results in the real world. We made some assumptions and simplifications to model a complex problem and to design and run the problem-solving algorithm in a simulated environment. SAR carried out in a real disaster has more sophisticated characteristics: road blockage disrupts SAR; synchronous actions are required to accomplish a specific task; uncertainty in task duration, task state, and field units’ capabilities disrupts an action plan; a bad strategy defined by an unprofessional commander results in a secondary disaster; and there are different task types and different interdependencies among tasks.

To decrease the size of the search space, two techniques were used in our algorithm. The strategy adaptation algorithm keeps only two alternatives for each thread, and the task assignment algorithm is a greedy, heuristic algorithm that does not produce any node.

The proposed algorithm is a search-based planning algorithm that makes a macro action plan according to optimisation criteria and the human strategy. It calculates a measurement (a total time) associated with the choices available at time 0.

In addition, decision-theoretic planning, as planning under uncertainty, takes a huge amount of time. A selected strategic decision constrains the field units’ actions for real-time execution, and the commander will monitor the world state to adapt this decision or remake it at the right time. As a result, better strategic decisions should be made over time.

This paper introduced the strategy definition to encode human initiative in the decision-making process and to keep humans in the loop in the SAR domain. This approach needs to be refined to match the real requirements that commanders of different teams face in the real world.

Learning from past experiences enables us to develop robust algorithms. Learning how to adjust the task-type matrix, the matrix of field units’ actions, the parameter h of the search algorithm, or a strategy defined by humans is considered an important issue.

This algorithm can be used to develop an autonomous software system for automated execution of the human strategy. The aim of such a system is to execute the human strategy and continuously monitor the execution of the strategic decisions made until the goal is met.

The proposed algorithm can be refined to address new requirements and demands. A joint objective can maximise the global utility within a definite time, e.g., maximise the number of rescued people within 72 h, or minimise the number of human fatalities during 72 h. Multiple objectives can be considered in this algorithm. Human strategy definition and execution can also be used for the resource allocation problem, e.g., allocating refuges or emergency transportation.

The presented algorithm enables a commander to specify different strategies and execute the algorithm on each of them. The results provide a measurement with which the human can evaluate and assess the quality of the defined strategy.

In this paper, a simplified scenario was simulated and used to present the application of the proposed algorithm. Crisis response to a real disaster is a complex environment that incident commanders must handle. Our algorithm appears to contribute to real SAR by addressing the commander’s demands and requirements stated in this paper.

This paper focused on Phase 3 of the strategic decision-making technique stated in Sections 1 and 2. It is notable that other parameters are also important in the optimisation problem of SAR. Specifying a good strategy is a considerable issue because our algorithm is executed under a human-defined strategy as human guidance/supervision.

Future work includes at least two directions:

1 develop a smart algorithm that aims to adjust the human strategy and recommend a better one

2 apply machine learning techniques to refine the matrix of task types and the matrix of field units’ actions, and to correctly estimate the variable h in the real world.

Acknowledgement

We would like to thank Mrs. Denise Badolato for improving the language of this paper.

References

Ai-Chang, M., Bresina, J., Charest, L., Chase, A., Hsu, J.C.J., Jonsson, A., Kanefsky, B., Morris, P., Rajan, K., Yglesias, J. and Chafin, B.G. (2004) ‘Mapgen: mixed-initiative planning and scheduling for the Mars Exploration Rover mission’, Intelligent Systems, IEEE, Vol. 19, No. 1, pp.8–12.

Bigley, G.A. and Roberts, K.H. (2001) ‘The incident command system: high-reliability organizing for complex and volatile task environments’, Academy of Management Journal, Vol. 44, No. 6, pp.1281–1299.

Bonet, B. and Geffner, H. (2001) ‘Planning as heuristic search’, Artificial Intelligence, Vol. 129, No. 1, pp.5–33.

Boutilier, C., Dean, T. and Hanks, S. (1999) ‘Decision-theoretic planning: structural assumptions and computational leverage’, Journal of Artificial Intelligence Research, Vol. 11, No. 1, p.94.

Bulitko, V. and Lee, G. (2006) ‘Learning in real-time search: a unifying framework’, Journal of Artificial Intelligence Research (JAIR), Vol. 25, pp.119–157.

Burstein, M.H. and McDermott, D.V. (1996) ‘Issues in the development of human-computer mixed-initiative planning’, Advances in Psychology, Vol. 113, pp.285–303.

Chen, R., Sharman, R., Raghav Rao, H. and Upadhyaya, S.J. (2008) ‘Coordination in emergency response management’, Communications of the ACM, Vol. 51, No. 5, pp.66–73.

Cohen, B.J., Chitta, S. and Likhachev, M. (2010) ‘Search-based planning for manipulation with motion primitives’, 2010 IEEE International Conference on Robotics and Automation (ICRA), IEEE, May, Anchorage, Alaska, pp.2902–2908.

FEMA (2012) FEMA Incident Action Planning Guide, http://www.uscg.mil/hq/cg5/cg534/nsarc/FEMA%20Incident%20Action%20Planning%20Guide%20(IAP).pdf (Accessed September 2014).

Kitano, H. and Tadokoro, S. (2001) ‘RoboCup Rescue: a grand challenge for multiagent and intelligent systems’, AI Magazine, Vol. 22, No. 1, p.39.

Kwok, Y-K. and Ahmad, I. (2005) ‘On multiprocessor task scheduling using efficient state space search approaches’, Journal of Parallel and Distributed Computing, Vol. 65, No. 12, pp.1515–1532.

Maheswaran, R.T., Szekely, P. and Sanchez, R. (2011) ‘Automated adaptation of strategic guidance in multiagent coordination’, Agents in Principle, Agents in Practice, Springer Berlin Heidelberg, pp.247–262.

Myers, K.L., Jarvis, P., Tyson, M. and Wolverton, M. (2003) ‘A mixed-initiative framework for robust plan sketching’, ICAPS, Trento, Italy, pp.256–266.


Nanjanath, M., Erlandson, A.J., Andrist, S., Ragipindi, A., Mohammed, A.A., Sharma, A.S. and Gini, M. (2010) ‘Decision and coordination strategies for RoboCup Rescue agents’, Simulation, Modeling, and Programming for Autonomous Robots, Springer Berlin Heidelberg, pp.473–484.

Nourjou, R., Szekely, P., Hatayama, M., Ghafory-Ashtiany, M. and Smith, S.F. (2014a) ‘Data model of the strategic action planning and scheduling problem in a disaster response team’, Journal of Disaster Research, Vol. 9, No. 3, pp.381–399.

Nourjou, R., Hatayama, M., Smith, S.F., Sadeghi, A. and Szekely, P. (2014b) Design of a GIS-based Assistant Software Agent for the Incident Commander to Coordinate Emergency Response Operations, arXiv preprint arXiv:1401.0282.

Nourjou, R., Smith, S.F., Hatayama, M., Okada, N. and Szekely, P. (2014c) ‘Dynamic assignment of geospatial-temporal macro tasks to agents under human strategic decisions for centralized scheduling in multi-agent systems’, International Journal of Machine Learning and Computing (IJMLC), Vol. 4, No. 1, pp.39–46.

Nourjou, R., Smith, S.F., Hatayama, M. and Szekely, P. (2014d) ‘Intelligent algorithm for assignment of agents to human strategy in centralized multi-agent coordination’, Journal of Software, Vol. 9, No. 10, pp.2586–2597.

Nourjou, R. and Gelernter, J. (2015) ‘Distributed autonomous GIS to form teams for public safety’, MobiGIS ’15: Proceedings of the 4th ACM SIGSPATIAL International Workshop on Mobile Geographic Information Systems, ACM, Bellevue, WA, USA.

Paquet, S., Bernier, N. and Chaib-draa, B. (2004) ‘Comparison of different coordination strategies for the RoboCupRescue simulation’, Innovations in Applied Artificial Intelligence, Springer Berlin Heidelberg, pp.987–996.

Russell, S. and Norvig, P. (2009) Artificial Intelligence: A Modern Approach, 3rd ed., Pearson, p.1152.

Schurr, N. and Tambe, M. (2008) ‘Using multi-agent teams to improve the training of incident commanders’, Defence Industry Applications of Autonomous Agents and Multi-Agent Systems, Birkhäuser Basel, pp.151–166.

Shahul, A., Semar, Z. and Sinnen, O. (2010) ‘Scheduling task graphs optimally with A*’, The Journal of Supercomputing, Vol. 51, No. 3, pp.310–332.

Zeng, W. and Church, R.L. (2009) ‘Finding shortest paths on real road networks: the case for A*’, International Journal of Geographical Information Science, Vol. 23, No. 4, pp.531–543.