Honeywell Laboratories ~sa-circa/talks/circa-overview-for-cmu-12-04 1 Honeywell Labs Achieving...

Honeywell Laboratories ~sa-circa/talks/circa-overview-for-cmu-12-04 1

Honeywell LabsHoneywell Labs

Achieving Intelligent Real-Time Control

Dr. David J. Musliner

[email protected]

(612) 951-7599


OutlineOutline

• Motivation for this research and this talk.

• Approach:

– Overview of CIRCA, the Cooperative Intelligent Real-Time Control Architecture.

– Brief details on executive.

– Details on planner.

– Plan verification.


Characteristics of Motivating ProblemsCharacteristics of Motivating Problems

• UAVs, UGVs, UUVs, spacecraft, rovers…

• Complex dynamic environments:

– Require intelligence.

• Critical environments:

– Require predictable performance.

– Logical correctness.

– Timeliness guarantees.

• Resource limitations:

– Bounded reactivity.

– Bounded rationality.

• Sometimes distributed.


CIRCA Motivation: Real-Time Intelligent ControlCIRCA Motivation: Real-Time Intelligent Control

• Time-critical, hazardous situations --- Possibly catastrophic.

– CIRCA guarantees that it will respond in a timely way to threats in its environment.

• Flexible systems --- Many alternative actions/behaviors.

– CIRCA automatically selects behaviors and reconfigures itself while it is operating.

• Limited resources --- Bounded reactivity & rationality.

– CIRCA dynamically synthesizes controllers (plans) for only the immediately relevant parts of the situation. CIRCA does this introspectively, reasoning about resource limits.


Component TechnologiesComponent Technologies

• Reactive controller execution to ensure real-time responsiveness and opportunistic behavior.

• Automatic controller synthesis (planning) to adapt to changing conditions.

– This work formed the motivation and precursors to Häkan’s thesis.

• Task-based negotiation to distribute roles/responsibilities.

• Decision-theoretic deliberation scheduling to optimize benefits of negotiation and synthesis activity.


CIRCA ArchitectureCIRCA Architecture

Adaptive Mission Planner: Divides an overall mission into multiple control problems, with limited performance goals designed to make the controller synthesis problem solvable with available time and available execution resources. Deliberation scheduling.

Controller Synthesis Module: For each control problem, synthesizes a real-time reactive controller according to the constraints sent from AMP. Planning.

Real Time Subsystem: Continuously executes synthesized control reactions in hard real-time environment; does not “pause” waiting for new controllers.

Adaptive Mission Planner

Controller Synthesis Module

Real Time

Subsystem


Generate controller


State Space

Planner

Real Time

System

Break down mission

Generate controller

Execute controllerif (state-1) then action-1if (state-2) then action-2

...

Generate controller


Extending Performance Guarantees to Multi-Agent TeamsExtending Performance Guarantees to Multi-Agent Teams

Adaptive Mission Planner: Explicitly manages complexity of planning and negotiation processes that dynamically distribute roles/responsibilities.

Controller Synthesis Module: Builds controllers that include coordinated actions by multiple agents.

Real Time Subsystem: Executes coordinated controllers predictably, including distributed sensing and acting.

Only system to guarantee end-to-end multi-agent coordinated behaviors



Real Time

System

Roles, Goals

Real-Time Reactions

Planned Actions,Planned Negotiations



Real Time

System


CIRCA Architecture - RTSCIRCA Architecture - RTS



Real Time

System

Subgoals, Configurations

Reactive PlansFeedback Data

Feedback Data

The World


Real Time Subsystem (RTS)Real Time Subsystem (RTS)

test1 action1 test2 action2 test1 action1 test3 action3 test1 action1 test4 action4

• The RTS executes loops of Test-Action Pairs (TAPs).

• Each TAP takes a snapshot of world state and then executes test expression.

• Testi tests for a group of states in which Actioni was planned.

• Actioni may take up to its worst-case execution time to execute.

• The Controller Synthesis Module is responsible for building the TAP loop.


Key RTS FeaturesKey RTS Features

• The RTS executes in parallel with the other CIRCA modules.

• Parallel execution permits re-planning using computationally-expensive algorithms while preserving platform safety.

• Special-purpose TAPs used to download and switch to next controller.

• RTS includes multiple TAP schedule caches to hold controllers before they are activated.

action1 action2 action1 action3 action1test1 test2 test1 test3 test1 test4 action4


Cautionary Example: Ghosting Can Ruin Your DayCautionary Example: Ghosting Can Ruin Your Day

• All real executives take non-zero time to select and perform a planned action: the “sense-act gap”.

– CIRCA’s RTS takes a “snapshot” of consistent world state, executes boolean test expression, and performs action if expression returns true.

• If the world changes between snapshot and action, the action may occur in a different state than originally intended.

• Planner must understand this “ghosting”.

(light green)(status stopped)

(light green)(status moving)

Accel

(light red)(status stopped)

Light-changes (light red)

(status moving)

Accel

(ghost)


CIRCA Architecture - CSMCIRCA Architecture - CSM



Real Time

System



Feedback Data

The World


Available actions

“Non-volitional” transitions

Goal state description

Initial state description

TimedAutomataControllerDesign&ExecutableReactiveController

Controller Synthesis Module (CSM)Controller Synthesis Module (CSM)

Transition-based input model similar to classical planners, but with temporal characteristics and non-volitional transitions.

Controller SynthesisModule

TAP Compiler

Scheduler

State Space PlannerVerifier


CSM FunctionalityCSM Functionality

• State Space Planner predicts future threats and opportunities, plans actions with timing constraints for future states.

• Verifier reasons about complex temporal model to ensure that all failures are preempted.

• TAP compiler reduces timed automata controller model to time-constrained reactions (Test-Action Pairs).

• Scheduler builds executable cycle of TAPs to meet time constraints.

Controller SynthesisModule

TAP Compiler

Scheduler

State Space PlannerVerifier


Domain ModelsDomain Models

• Domain models implicitly describe a state-space world model by describing its transitions.

• Transitions are STRIPS-like (pre- and post-conditions).

• Transitions may be controllable (actions) or uncontrollable.

• Transitions have associated time bounds:

– Lower bounds for uncontrollable transitions;

– Upper bounds for actions;

– Both for “reliable temporals” used to capture continuous processes.


Objectives of CIRCA plansObjectives of CIRCA plans

• Maintain safety of agent while attempting to achieve goals.

• Exert supervisory control of continuous processes in the environment and the plant.

• Maintain safety by preempting bad events.

OK Threatened

Failure

Safe

Build and execute these plans (controllers) on the fly.


Cassini Spacecraft ExampleCassini Spacecraft Example

• Saturn orbit insertion was a mission-critical, one-shot engine burn opportunity.

• Dual Inertial Reference Units (IRUs).

• If one IRU fails during engine burn, system has only milliseconds to switch to a hot backup.

• IRUs take much longer to “warm up”.

• Hence both IRUs must be warmed up well in advance of critical engine burns.


Cassini Spacecraft ExampleCassini Spacecraft Example

(make-instance 'action :name "start_IRU1_warm_up" :preconds '((IRU1 off)) :postconds '((IRU1 warming)) :delay (make-range 0 1))

(make-instance 'temporal :name "fail_if_burn_with_broken_IRU1" :preconds '((engine on)

(active_IRU IRU1) (IRU1 broken))

:postconds '((failure T)) :delay (make-range 5 )

(make-instance 'reliable-temporal :name "warm_up_IRU1" :preconds '((IRU1 warming)) :postconds '((IRU1 on)) :delay (make-range 45 60))


A Simple PreemptionA Simple Preemption

State Space Planner (SSP) has planned action select_IRU2 to preempt temporal transition to failure.


Temporal Reasoning IssueTemporal Reasoning Issue

• A single action may not “solve” a temporal transition to failure; TTF may persist across multiple states.

• Non-Markovian representation of temporal information.

– Path-dependent nature of transitions: how long was this transition enabled in prior states?

– Need for efficient representation of only critical timing value distinctions leads to timed automata theory.

• Approach: use timed automata methods to handle almost all temporal reasoning.

Preparing SafeThreatened

Failure


Why Use Non-Markovian?Why Use Non-Markovian?

• Efficient representation of temporal, concurrent, partially-controllable worlds.

– All processes/actions have temporal extent and constraints.

– Exogenous processes run concurrently with own actions.

– They interact.

• To write down in Markovian form, need to write state of all different process clocks into the state space.

• Using compact non-Markovian representation allows planner to make decisions that span broad swaths of clock space.

• But planner (and exec) cannot make time-dependent decisions that are not flagged by some discretized process state change.


Verifying ControllersVerifying Controllers

• Translate state-space model of current plan into a timed automata model.

– Introduce failure transitions for real failures.

– Introduce failure transitions for states which the planner believes it has preempted.

– Treat any as yet unplanned states as safe sink states.

• Invoke verifier to test for the reachability of failure from the initial state.

• Verifier enumerates all possible futures in a timed simulation.

• Verification assures controller avoids known failures when given a correct domain model.


What are Timed Automata?What are Timed Automata?

• State machines with transitions that depend on synchronously-updated timers (called “clocks”).

– All clocks update at same rate.

• States can have “invariant” expressions on clocks that force automata to leave state before invariant would become false.

• Transitions can have “guard” expressions that restrict when they can be followed (preconditions).

• Transitions can have “reset” expressions that set some of the clocks back to zero.


IRU1 brokenActive-IRU IRU1

Engine onInvariant: Cr < 3

FAILURE

Select-IRU2

Fail-if-burn-with-broken-IRU1Guard: Cr > 5

Timed Automata for Preemption Timed Automata for Preemption

IRU failsReset: Cr=0

This planned action causes this invariant


Multi-Machine MappingMulti-Machine Mapping

• Base machine captures structure of SSP state space using edges labeled with synchronization tags.

• Each SSP transition has separate machine moving between enabled and disabled states.

• Base machine synchronizes with individual transition machines via tags/labels.

• Model checker builds “cross product”.

Transition machines

Base machine

Planner’s state space machine


Base Model for State SpaceBase Model for State Space

• Base model captures states of world and links parallel models for each planned or uncontrollable transition.

• Labels sync parallel automata.


Model for Uncontrollable EventModel for Uncontrollable Event

• Edges with shared labels can fire only when both models are able to traverse edge at same time.

• No timing info: event has zero delay from system’s perspective.


Model for Temporal Transition to Failure Model for Temporal Transition to Failure

• Clock Ct constrains when failure can happen.


Executive Model for ActionExecutive Model for Action

• Clock Ca times planned select-IRU2 action.

• Executive may latch sensors and test in any state of world (base model); only commits if in state 41.

• Constraint on Clock Ca enforces preemption.

Here’s where ghosting happens


Model Checking VerifiersModel Checking Verifiers

• Use advanced techniques to find equivalence regions in the space of continuous clock values.

• Exhaustively enumerate the possible system traces, modulo clock region equivalence.

• Produce example system trace to failure when verification fails.

• Kronos (from VERIMAG).

• CIRCA-Specific Verifier (HTC):

– Optimized for CIRCA problems, implicit transition-based representation of state spaces.

– Incremental version saves verifier state during plan generation and revision, reducing effort dramatically.


CSM Algorithm in a NutshellCSM Algorithm in a Nutshell

• A search algorithm that– Assigns an action to each reachable state that:

- Preserves safety and- If possible, moves towards a goal state;

– Re-computes the set of reachable states as actions are chosen and uncontrolled processes projected forward.

– Invokes a timed-automaton verifier after each decision to determine whether safety is preserved (is a failure state reachable?).- If partial plan is unsafe, use verifier trace information to

guide backjumping.• Similar to timed game-theoretic approaches [Asarin, Maler,

Pneuli]: choose a move for each discrete state that will avoid a victory by nature.


CIRCA Architecture - AMPCIRCA Architecture - AMP



Real Time

System



Feedback Data

The World


Subdividing the MissionSubdividing the Mission


Adaptive Mission PlannerAdaptive Mission Planner

• Divide mission into phases, subdividing them as necessary to handle resource restrictions.

• Build problem configurations for each phase, to drive CSM.

• Modify problem configurations, both internally and via negotiation with other AMPs, to handle resource limitations.

– Capabilities (assets).

– Bounded rationality: deliberation resources.

– Bounded reactivity: execution resources.


Deliberation SchedulingDeliberation Scheduling

• AMP directs and monitors controller synthesis for each mission phase.

• Attempts to maximize expected utility of total mission by selecting and ordering planning tasks.

• Ensures timeliness by ordering planning tasks so that a safe plan is ready for the next phase.

• Currently deliberation scheduling performed based on myopic approximate solutions to optimal MDP policies; ongoing research.



Adaptive Mission Planner: Explicitly manages complexity of negotiation processes that dynamically distribute roles/responsibilities.





Real Time

System

Roles, Goals

Real-Time Reactions




Real Time

System


AMP NegotiationAMP Negotiation

• Contract Net style system for allocation of threats and goals to highest bidder.

– Contracts arrive at any site.

– Multiple concurrent negotiations.

– Bids are commitments.

• CN cycle: announce, bid, award, succeed/fail.


Multi-Agent Self-Adaptive CIRCAMulti-Agent Self-Adaptive CIRCAApproach: • Automatic synthesis and adaptation of guaranteed real-time

controllers.

Performance:• Reactive control responses to threats and contingencies in

milliseconds.

• Coordinated multi-agent behaviors in tens of milliseconds.

• Dynamic reconfiguration of team mission plan in less than 10 seconds.

• Demonstrations in simulated UAV team domains: coordinated defense, dynamic replanning for contingencies.

Impact:• Robust UAVs that rebuild their own control systems in response to

contingencies (e.g., damage, target of opportunity).

• Smart UAV teams that actively coordinate distributed capabilities/resources to maximize mission effectiveness.

• Sponsor: NSF; Honeywell; DARPA: SAFER, ANTS, MICA.

Goal: Adaptive real-time coordination and control of multi-UAV teams.


CIRCADIA: Synthesizing Security Control SystemsCIRCADIA: Synthesizing Security Control SystemsApproach: • Use CIRCA controller synthesis to automatically generate security

controllers.

• Tailor responses automatically according to available resources, varying threat levels & security policies.

Performance:• Fully autonomous operations defeating attacks in milliseconds.

• Rapid reconfiguration for dynamic network assets, security state, threat profile.

• Demonstrations in real computer networks.

Impact:• Real-time responses defeat manual and automated attack scripts.

• Automatic tradeoffs of security vs. service level and accessibility.

• System derives responses for novel attacks built from known components.

Sponsor: DARPA: CyberPanel.

Goal: Automatic real-time response to computer security intrusions.

Computing services

Active Security ControllerExecutive

Controller Synthesis ModuleNetworks, Computers

Attacks, intrusions

Intrusion Assessment

Security Tradeoff Planner


SummarySummary

If your planner doesn’t really understand your executive, it doesn’t really understand its plans.

Planning for real world autonomy requires consideration of time, exogenous processes, and concurrency.

Reliable autonomy is hard and important.


Related WorkRelated Work

• Executive models: not much.

• Planning in timed automata worlds:

– SimPlan (Kabanza).

– Synthesis of controllers via backwards fixpoint computations: Asarin, Maler, Pneuli, Sifakis.

- Least restrictive controller design.

• Efficient representations of state spaces using BDDs, planning as model checking w/ BDDs: Giunchiglia, Traverso.

• Lots of work on timed automata verification (Yovine, Dill, …).

• Also recent work on more complex forms of hybrid automata and relatively small amount of synthesis work for them: Henzinger, Sastry, Wong-Toi ...


THE END

Honeywell Technology CenterHoneywell Technology Center


Incremental VerificationIncremental Verification

• CIRCA calls verifier many times to check partial plans.

• Problem: Model checkers can take a long time to explore all possible paths. Most model checkers designed for batch operations.

• Key idea: Retain information about prior verification runs to make subsequent verification runs more efficient.

• Issues:

– Dependency-directed backtracking changes plan and verification model in non-monotonic fashion.

- Solution: invalidate verifier cache and start from scratch on backtrack.


Incremental Verification ResultsIncremental Verification Results• Extremely valuable in low-backtracking domains: up to 97% speedup.

• Low overhead to invalidate cache.

• Patent application filed October 2001.

10

100

1000

10000

100000

1e+006

1e+007

0 5 10 15 20

Pla

nn

er

Tim

e (

mill

ise

con

ds)

Domain

RTAincremental RTA

0

10

20

30

40

50

60

70

80

90

100

0 5 10 15 20

Pe

rce

nta

ge

sp

ee

du

p

Domain

41% average speedup on 21 regression test domains.17.5 minutes -->

1.4 minutes


Planning with Accurate Executive ModelPlanning with Accurate Executive Model

• Theme: pay attention to the plan executive… if your planner doesn’t understand the execution semantics, its plans may be incorrect.

• Approach:

– Simplify executive.

– Model executive explicitly in planner.

- Use increasing model resolution as plan nears completion.

• Result:

– Planner really understands what its plans will do, accounting for uncontrollable events and time.

– Performance guarantees.


Cautionary Example: Ghosting Can Ruin Your DayCautionary Example: Ghosting Can Ruin Your Day

• All real executives take non-zero time to select and perform a planned action: the “sense-act gap”.

– CIRCA’s RTS takes a “snapshot” of consistent world state, executes boolean test expression, and performs action if expression returns true.

• If the world changes between snapshot and action, the action may occur in a different state than originally intended.

• Planner must understand this “ghosting”.

(light green)(status stopped)

(light green)(status moving)

Accel

(light red)(status stopped)

Light-changes (light red)

(status moving)

Accel

(ghost)


Inflammatory Position Inflammatory Position • Plan executives are getting increasingly complicated to account for

world dynamics and uncertainty, and to encode human procedural knowledge conveniently.

– Examples include RAPS/3T, Remote Agent Exec.

• But,

– Executives with multiple methods and persistent goals can have unmodeled side effects during plan execution.

- Resource consumption, clobbering.

– The whole point of planning is to give system the base knowledge to plan its way out of unusual situations.

– Reactivity is relatively easy to plan.

• So, as planner scalability improves, let’s get back to using strong planning with a simpler execution model.

• And be sure planner understands its executive!


Executive Models to the RescueExecutive Models to the Rescue

• Verify plans using models of executive, representing sense-act gap explicitly.

• Model checking efficiently examines all possible executions including those with ghosting effects.


CIRCA’s Controller Synthesis ModuleCIRCA’s Controller Synthesis Module

Available actions

Uncontrollabletransitions

Goal state descriptions

Initial state descriptions

AutomataControllerDesignincludingtimingconstraints


TAP Compiler

Scheduler

State SpacePlanner

Verifier


Three Executive ModelsThree Executive Models

• During action admission: Simple delay reactive engine.

– Actions occur with some maximum delay.

– Ignores sensing delays, ghosting, etc.

• After action selection: Committing reactive engine.

– Commit to action based on sensed state, react after delay.

– Action may “ghost” into unplanned state.

– Estimated reaction time b/c no schedule yet.

• After planning finished and TAP schedule generated: full TAP loop model.

– Scheduled reactions: reaction time depends on other planned reactions.


Verification of Timed AutomataVerification of Timed Automata

• The CIRCA temporal model can be mapped into timed automata.

• We can use verification tools (model checkers) to prove safety of CIRCA plans.

• Counter-examples generated by verification tool can guide intelligent backjumping to direct search for viable solution.


What are Timed Automata?What are Timed Automata?

• State machines with transitions that depend on synchronously-updated timers (called “clocks”).

– All clocks update at same rate.

• States can have “invariant” expressions on clocks that force automata to leave state before invariant would become false.

• Transitions can have “guard” expressions that restrict when they can be followed (preconditions).

• Transitions can have “reset” expressions that set some of the clocks back to zero.


THE END


So What? Is CIRCA The Answer?So What? Is CIRCA The Answer?

• Maybe so: online planning with good executive models, verification, and predictable reactive execution will meet most needs.

• But: planners need a lot more work.• Future Directions:

– Quantified uncertainty: probabilistic domains see Häkan’s thesis; it started with probabilistic versions of CIRCA planning problems.

– Integrated spatial reasoning: vehicles always need it.– Learning.– More multi-agent work.– Still more planning… scalability, heuristics…


Temporal Reasoning IssueTemporal Reasoning Issue

• A single action may not “solve” a temporal transition to failure; TTF may persist across multiple states.

• Non-Markovian representation of temporal information.

– Path-dependent nature of transitions: how long was this transition enabled in prior states?

– Need for efficient representation of only critical timing value distinctions leads to timed automata theory.

• Approach: use timed automata methods to handle almost all temporal reasoning.


Using Model CheckingUsing Model Checking

• Existing search loop iteratively selects a state and chooses action for that state.

– Heuristics guide choice.

– Approximations indicate timing will work.

• Inserted formal reachability analysis call after each action choice.

– Confirms timing characteristics, ensuring planned preemptions actually will occur.

• If model checker finds failure reachable, path to failure can be used to backjump to most recent decision related to any state on the path.



Adaptive Mission Planner: Explicitly manages complexity of negotiation processes that dynamically distribute roles/responsibilities.



Only system to guarantee end-to-end multi-agent coordinated behaviors



Real Time

System

Roles, Goals

Real-Time Reactions




Real Time

System


OutlineOutline

• Research summary.

• CIRCA, the Cooperative Intelligent Real-time Control Architecture:

– Reactive plan execution.

– Controller synthesis (planning).

– Deliberation scheduling.

– Demo in simulated UAV domain.

• CIRCADIA: CIRCA for Dynamic Information Assurance.

– Controller synthesis with probabilities.

• Future directions.


AMP ResponsibilitiesAMP Responsibilities

• Divide mission into phases, subdividing them as necessary to handle resource restrictions.

• Build problem configurations for each phase, to drive CSM.

• Modify problem configurations, both internally and via negotiation with other AMPs, to handle resource limitations.

– Capabilities (assets).

– Bounded rationality: deliberation resources.

– Bounded reactivity: execution resources.


AMP OverviewAMP Overview


AlgorithmControls

Executable TAP schedule

AlgorithmPerformance

Problem ConfigurationProblem Configurations

AMP Synthesis Control(Negotiation)

MissionThreats, Goals


AMP NegotiationAMP Negotiation

• Contract Net style system for allocation of threats and goals to highest bidder.

• Based on Honeywell’s prior IR&D-funded Contract Net implementation from distributed scheduling of satellite data processing.

– Contracts arrive at any site.

– Multiple concurrent negotiations.

– Bids are commitments.

• CN cycle: announce, bid, award, succeed/fail.


Demonstration GoalsDemonstration Goals

To illustrate:

• Multi-agent negotiation of responsibilities.

– Both before and during mission.

• Dynamic synthesis of controllers.

– Both before and during mission.

• Real-time reactive response to events.


Mission OverviewMission Overview

Ingress

Attack

Egress

0

1

2

3

45

6

IR

IR

Unknown radar threat


Dynamically Negotiated ResponsibilitiesDynamically Negotiated Responsibilities

Master Wing1 Wing2 Wing3 Wing4

Ingress

Attack

Egress

IR threat

IR threat

IR threat

Radar threat

Radar threat

Destroy target

Destroy target

IR threat


DEMODEMO


AMP Deliberation SchedulingAMP Deliberation Scheduling

• Seeking principled, practical method for AMP to adjust CSM problem configurations and algorithm parameters to maximize expected utility of deliberation.

• Different CSM problem configuration operators yield different types of plan improvements.

– Defeating threats improves survival probability.

– Achieving goals improves expected reward.

• Configuration operators have different expected resource requirements (computation time).


Markov Model for Deliberation SchedulingMarkov Model for Deliberation Scheduling

Markov chain behavior in the mission phases:

• Probability of surviving vs. entering absorbing failure state.

• Reward expectations unevenly distributed.

Phase1

Phase2

FAILURE

Phase4

n

i

i

jji sRRU

2

1

11

s1

1-s1

R3

s2 Phase5

Phase3 R5

s3 s4


Example Deliberation Scheduling MDP ModelExample Deliberation Scheduling MDP Model


Deliberation Scheduling StrategiesDeliberation Scheduling Strategies

• Optimal policy would account for all nondeterministic outcomes of deliberation and world state.

– Can be found, but computation is intractable; impractical.

• Bounded horizon “greedy” or “myopic” strategy:

– Only assign limited future deliberation time, in discretized intervals, to maximize expected utility of deliberation.

– Execute one or more of the scheduled deliberation activities (CSM methods) and then re-derive schedule.

• Greedy approach reduces complexity of deliberation management to permit real-time response.

• Reacts effectively to actual outcome of CSM processing.

• Time-discounting useful to avoid getting distracted by high-value operators that affect phases far in the future.


Runtime Comparison of Optimal and GreedyRuntime Comparison of Optimal and Greedy

10

100

1000

10000

100000

1e+06

1e+07

0 20 40 60 80 100 120 140 160 180 200

Ru

nti

me

(m

se

c.)

Experiment

OptimalGreedy

• Note logarithmic scale of runtime! • Simple Greedy runtime = Discounted Greedy runtime.• Optimal time is lower bound: omits state-space enumeration time.


Result Quality ComparisonResult Quality Comparison

0.8

1

1.2

1.4

1.6

1.8

2

2.2

2.4

2.6

0 20 40 60 80 100 120 140 160 180 200

Ex

pe

cte

d U

tilit

y

Experiment

OptimalGreedy

Discounted greedy agent averages 75-92% of optimal expected utility, depending on domain difficulty.


SummarySummary

• Multi-agent negotiation of responsibility for threats/goals.

• Dynamic synthesis of real-time reactive controllers tailored to negotiated responsibilities.

• Active management of deliberation (synthesis) process using greedy decision-theoretic techniques.

– Demo of online deliberation scheduling also available.


Coordinated Real-Time PlansCoordinated Real-Time Plans

• Real-time performance guarantees that allow one agent to rely on another in mission-critical situations. End-to-end performance guarantees. Example: “You sense, I’ll act”.

– One platform can detect a threat.

– One platform can respond to the threat.

– Coordinated plan includes communication action under time constraints, and shared commitments.

• AMP allocates coordination roles based on capabilities.

• AMP configures planning models with built-in coordination requirements, enforced by pseudo-failures.

• CSM builds reactive plans that include communication to meet end-to-end reaction deadlines.

• RTSs communicate during plan execution to enforce timing constraints on notification and reaction.


Dynamic Abstraction PlanningDynamic Abstraction Planning

• Start with abstract states omitting all non-goal features.

• Incrementally and non-uniformly add features to states when required:

– When no safe actions are applicable.

– When goal achievement heuristic indicates.

• Result: planner decides what it needs to think about, when.

• Recent advance: partial abstraction, allowing refinement to specify only a single predicate value and leave other values abstracted.


Optimizing Plans in GSMPsOptimizing Plans in GSMPs

• Adding probabilistic delay distributions to timed automata yields Generalized Semi-Markov Process model:

– Efficient for representing uncertain, concurrent real world.

– No analytic solutions available.

• Adding reward model gives opportunity for decision-theoretic solution criterion: maximize expected utility.

• Approach: generate plans and assess EU dominance using Monte Carlo sampling of GSMP executions.

– Backjump based on sample traces.

• Recent advances:

– Local search.

– GA search in reaction space.


Deliberation SchedulingDeliberation Scheduling

• MDP model of mission phases with non-uniform reward and absorbing failure state.

• Approximate performance profiles of CSM planning based on syntactic domain analysis (goal & threat structure).

• Optimal solution is intractable.

• Myopic approach with constant updates can yield most (90%+) of expected deliberation utility.

Phase1

Phase2

FAILURE

Phase4

s1

1-s1

R3

s2 Phase5

Phase3

R5

s3 s4

0

10

20

30

40

50

60

70

5 10 15 20 25 30 35 40 45 50

Fre

qu

en

cy

Planner Time (secs)

100 Samples Per Threat Group

0.5 Second Wide Sample Bins

1 Threat 2 Threats3 Threats4 Threats


Adjustable Autonomy: Tasking InterfacesAdjustable Autonomy: Tasking Interfaces

Approach: • Playbook user interface simplifies rapid tasking of asset teams.

• MACBeth constraint-based planner builds multi-agent plans tailored for situation.

Performance:• Command complex team behaviors with a single interaction

(click).

• Constraints tailor behaviors.

• System fills in all remaining details to build executable plan that meets constraints.

• Demonstrations with mobile robot teams.

Impact:• Efficient command and control of autonomous assets.

• Improved utility, reduced human workload, increased ratio of assets/people.

Sponsor: DARPA: TMR, MICA.

Goal: Variable initiative command and control of autonomous asset teams.


Related Research AreasRelated Research Areas

• Formal verification (model checking) for timed automata, including Honeywell avionics software products.

• Hybrid systems: modeling, analysis, synthesis.

• Planning with model checking, incl. BDD representations.

• Procedure automation for refineries.

• Dynamic procedures for mixed-initiative industrial operations.

• Multi-agent negotiation and distributed scheduling for air traffic control and EOSDIS processing.

Airspace Model

Hardhats! Nomex!


OutlineOutline

• Research summary.

• CIRCA, the Cooperative Intelligent Real-time Control Architecture:

– Reactive plan execution.

– Controller synthesis (planning).

– Deliberation scheduling.

• Demo in simulated UAV domain.

• CIRCADIA: CIRCA for Dynamic Information Assurance.

– Controller synthesis with probabilities.

• Future directions.


Controller Synthesis for Over-Constrained DomainsController Synthesis for Over-Constrained Domains

• CIRCA ideally builds controllers that cover all reachable states of the world.

• But for real domains:

- You can’t afford to think about every possible future state of the world.

- You can’t monitor for every state, so you couldn’t execute an exhaustive controller even if you could synthesize it.

• We can’t worry about everything.

– We have to ignore some things…

– Approach: ignore least-probable states.


Probabilistic Controller SynthesisProbabilistic Controller Synthesis

• Add transition probabilities to state model.

– World transitions and controlled actions.

• Sample simulated executions of the current controller to estimate probability of reaching different states.

• Build controllers that handle most-probable states.

• Allows CIRCADIA to trade off planning time and controller complexity against system safety.


Probabilistic World Model DynamicsProbabilistic World Model Dynamics

• The world model is a generalized semi-Markov process (GSMP).

• The world occupies a single state at any point in time.

• Enabled transitions in the current state compete to trigger.

• One transition triggers in each state, determining the next state.

• Non-Markovian because trigger distributions depend on holding times.

• There are no analytic solutions for unrestricted GSMPs.

• Must use a sampling-based approach to estimate state probabilities.

– Simpler: estimate whether failure is too likely.


Acceptance SamplingAcceptance Sampling

• Let pF be the failure probability of a plan.

• Want to specify failure threshold such that:

– Plan is accepted if pF

– Plan is rejected if pF >

• Use acceptance sampling to decide whether to accept a plan.

• Exhaustive sampling is impossible, so we must expect errors:

– Type I error: reject acceptable plan.

– Type II error: accept rejectable plan.

• Want to bound probability of error.


Sequential SamplingSequential Sampling

• Single sampling plan always requires fixed number of samples.

• Sequential sampling plan decides whether to generate more samples based on samples seen so far.

– Define acceptance number an and rejection number r

n at stage

n.

– Accept plan if observed failures are at most an

– Reject plan if observed failures are at least rn

• Intuitive sequential sampling:

– If you’ve already seen c failures at any iteration, then reject (r

n = c).

– If you cannot possibly see c failures in remaining iterations, then accept (a

n = c+i -n-1 ).


Number of Samples RequiredNumber of Samples Required

Wald acceptance sampling requires significantly fewer samples.

Actual failure probability

Static sampling plan

Wald sequential sampling plan

threshold


PerformancePerformance

• Expected number of required samples only depends on failure probability and threshold, not state space size!

• Domain-dependent factors affecting time to generate each sample:

– Time period considered (tmax

).

– Mean values of the distribution functions F.

• In practice, this allows us to generate probabilistically-verified plans for very large domains that cannot be handled by complete (non-probabilistic) model-checking approaches.


Simulation Demonstration SummarySimulation Demonstration Summary

• Automatic construction and execution of reactive computer security controllers, including:

• Dynamic adjustment of security monitoring and logging.

– Alerts can trigger additional sensing, according to automatically synthesized control policy.

• Dynamic adjustment of network connectivity.

– Defenses raised and lowered in response to runtime events.

• Various sensing modalities included: event-driven, anomaly-based, etc.


Verification of Timed AutomataVerification of Timed Automata

• Prove hypotheses about possible executions of a timed automata model.

• Commonly include reachability, liveness, etc.

• For safety-critical aspects of CIRCA plans, we only care about reachability: is failure reachable or not?

• The CIRCA temporal model can be mapped into timed automata.

• We can use existing verification tools to prove safety of CIRCA plans.

• Counter-examples generated by verification tool can guide intelligent backjumping to direct search for viable solution.


AMP OverviewAMP Overview

• Mission is the main input: threats and goals, specific to different mission phases (e.g., ingress, attack, egress).

– Threats are safety-critical: must guarantee to maintain safety in worst case, using real-time reactions.

– Goals are best-effort: don’t need to guarantee.

• Each mission phase requires a plan (or controller), built by the CSM to handle a problem configuration.

• Agents negotiate to assign responsibilities for threats/goals and build customized controllers within time bounds.

• Changes in capabilities, mission, environment can lead to need for additional negotiation and controller synthesis.


AMP Information Display (AID)AMP Information Display (AID)

We are in the Ingress phase

No outgoing communication now (else green)

Another AMP is handling this threat/goal.

I have been awarded this contract, and am

planning now.

I have a plan for this threat/goal contract.

Heartbeat shows I’m alive


Wing3 deploys flares, defending against IR missile

(threat)

RTS Information Display (RID)RTS Information Display (RID)

Other UAVs testing to see if reached

next waypoint (goal)

Wing2 attempting to locate target to destroy (goal)


Related WorkRelated Work

• Lots of work on timed automata verification (Yovine, Dill, …).

• Synthesis of timed automata controllers via backwards fixpoint computations: Asarin, Maler, Pneuli, Sifakis, Tripakis, Altisen.

– Least restrictive controller design.

• Efficient representations of state spaces using BDDs, planning as model checking w/ BDDs: Giunchiglia, Traverso.

• Also recent work on more complex forms of hybrid automata and relatively small amount of synthesis work for them: Henzinger, Sastry, Wong-Toi …

– Honeytech tool developed to test Wong-Toi fixpoint.


What is “Real-Time”?What is “Real-Time”?

A common misconception about real-time computing is that it is equivalent to “fast computing.”

- John Stankovic

What it really means is “predictably fast enough.”

- Dave Musliner


CSM AlgorithmCSM Algorithm

• CSM essentially determines a strategy in a timed game against a worst-case adversary.

• Search loop iteratively selects a state and chooses action for that state.

– Heuristics guide choice for safety and goal achievement.

– Approximations indicate that timing will work.

– Formal reachability analysis called after each action choice, to confirm that all planned preemptions will occur.

– If failure reachable, path to failure can be used to backjump to most recent decision related to any state on the path.


Why Verification?Why Verification?

• Planner’s state-space model is non-Markov.

• Time is not explicitly represented in state description.

– Simplifies planner’s representation.

– Limits search space.

– Models the clockless executive.

• But, there are many paths to the same state, and

• Time remaining before a transition can fire depends on the path.

Honeywell Laboratories ~sa-circa/talks/circa-overview-for-cmu-12-04 1 Honeywell Labs Achieving...

Documents

Transcript of Honeywell Laboratories ~sa-circa/talks/circa-overview-for-cmu-12-04 1 Honeywell Labs Achieving...