Towards Model-lite Planning: A Proposal for Learning & Planning with Incomplete Domain Models
Sungwook Yoon, Subbarao Kambhampati
Supported by DARPA Integrated Learning Program
A Planning Problem
Towards Model-lite Planning - Sungwook Yoon
Suppose you have a super fast planner and a target application. What is the first problem you have to solve? Is it a problem from the application?
Domain Engineering is hard
Model-lite Planning
Snapshot of the talk
• This is a proposal. We formulate learning and planning problems and solution methods for them. We tested our ideas on some problems, but verification is still in progress.
• We propose
– Representation for model-lite planning
  • Probabilistic logic; incompleteness is quantified
  • Explicit consideration of domain invariants
– Learning of the domain model
  • Updating the probabilities and finding new axioms
– Planning with the model
  • A deterministic planning domain needs probabilistic planning
  • Find the most plausible plan that respects the current domain model
Representation
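• Precondition Axiom: p_Ai, A → pre_i (action A requires precondition pre_i with probability p_Ai)
• Uncertainty is quantified as a probability
• Effect Axiom: e_Ai, A → effect_i (action A produces effect_i with probability e_Ai)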
• Facilitates learning
Domain Model - Blocksworld
Precondition Axiom: Relates actions to current-state facts
• 0.9, Pickup(x) -> armempty()
• 1, Pickup(x) -> clear(x)
• 1, Pickup(x) -> ontable(x)
Effect Axiom: Relates actions to next-state facts
• 0.8, Pickup(x) -> holding(x)
• 0.8, Pickup(x) -> not armempty()
• 0.8, Pickup(x) -> not ontable(x)
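One way to hold such a model in code is as a flat list of weighted axioms. The sketch below is only illustrative: the `WeightedAxiom` class, its field names and the helper functions are hypothetical and do not come from the talk; only the six Pickup axioms and their probabilities are taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedAxiom:
    """Hypothetical container for one probabilistic axiom (not from the talk)."""
    prob: float   # probability attached to the axiom
    action: str   # action schema, e.g. "Pickup(x)"
    literal: str  # fact implied in the current state (precondition) or next state (effect)
    kind: str     # "precondition" or "effect"


# The six Pickup axioms listed on the slide.
BLOCKSWORLD_PICKUP = [
    WeightedAxiom(0.9, "Pickup(x)", "armempty()", "precondition"),
    WeightedAxiom(1.0, "Pickup(x)", "clear(x)", "precondition"),
    WeightedAxiom(1.0, "Pickup(x)", "ontable(x)", "precondition"),
    WeightedAxiom(0.8, "Pickup(x)", "holding(x)", "effect"),
    WeightedAxiom(0.8, "Pickup(x)", "not armempty()", "effect"),
    WeightedAxiom(0.8, "Pickup(x)", "not ontable(x)", "effect"),
]


def axioms_of(kind, axioms):
    """Return the axioms of one kind ("precondition" or "effect")."""
    return [a for a in axioms if a.kind == kind]


def uncertain(axioms):
    """Return the axioms whose probability is below 1, i.e. the incomplete part of the model."""
    return [a for a in axioms if a.prob < 1.0]


preconditions = axioms_of("precondition", BLOCKSWORLD_PICKUP)
still_uncertain = uncertain(BLOCKSWORLD_PICKUP)
```

Keeping precondition and effect axioms as separate records mirrors the point that they can be learned separately.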
Representation
• One modeling problem
• A conjunction of effects has different semantics if the probability of each effect is specified independently
• Add a hidden variable O with the axiom (e, A → O), then add a deterministic axiom for each effect: (1, O → eff1), (1, O → eff2), …
• We can also alleviate this problem with explicit domain invariant properties
• Writing explicit domain invariant properties is easier than writing an initial state generator and a set of operators that respect such properties
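The difference between the two readings can be checked numerically. The sketch below assumes, for illustration only, that "independently specified" means each effect fires on its own with probability 0.8, and that when the hidden variable O is false none of the effects occur; both assumptions and all function names are hypothetical.

```python
from itertools import product

P_EFFECT = 0.8
EFFECTS = ["holding(x)", "not armempty()", "not ontable(x)"]


def outcome_distribution_independent(p, n):
    """Each of the n effects fires independently with probability p."""
    dist = {}
    for outcome in product([True, False], repeat=n):
        prob = 1.0
        for fired in outcome:
            prob *= p if fired else (1.0 - p)
        dist[outcome] = prob
    return dist


def outcome_distribution_hidden(p, n):
    """Hidden variable O is true with probability p; O implies every effect with probability 1.

    Assumption for this sketch: when O is false, no effect occurs.
    """
    return {(True,) * n: p, (False,) * n: 1.0 - p}


def partial_effect_mass(dist, n):
    """Probability mass on outcomes where only some of the effects occur."""
    return sum(
        prob for outcome, prob in dist.items()
        if outcome not in ((True,) * n, (False,) * n)
    )


n = len(EFFECTS)
independent = outcome_distribution_independent(P_EFFECT, n)
hidden = outcome_distribution_hidden(P_EFFECT, n)

all_effects_independent = independent[(True,) * n]  # 0.8 ** 3
all_effects_hidden = hidden[(True,) * n]            # 0.8
```

Under the independent reading, a sizeable share of the probability goes to states where, for example, the block is held but the arm is still empty; the hidden variable removes those outcomes.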
Effect Axiom: Relates actions to next-state facts
• 0.8, Pickup(x) -> holding(x)
• 0.8, Pickup(x) -> not armempty()
• 0.8, Pickup(x) -> not ontable(x)
Static Property: Relates facts within a state
• 1, holding(x) -> not armempty()
• 1, holding(x) -> not ontable(x)
Learning the domain model
• Given a trajectory of states and actions, S1, A1, S2, A2, …, Sn, An, Sn+1
– We can learn precondition axioms from (S1, A1), (S2, A2), …, (Sn, An)
– We can learn effect axioms from (A1, S2), (A2, S3), …, (An, Sn+1)
– We can learn domain invariant properties from each state (S1), …, (Sn+1)
– The weights (probabilities) of the axioms can be updated with a simple perceptron update
• There are readily available packages for weighted logic learning
– Alchemy (MLN)
– ProbLog
• Structure learning
– Alchemy provides structure learning too
– We can also enumerate all the possible axioms (very costly for planning)
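The way a trajectory is cut into training examples can be sketched as follows. This is a minimal illustration, not the talk's learner: states are assumed to be sets of ground facts, the function names are hypothetical, and plain frequency counting stands in for the perceptron update or Alchemy weight learning mentioned above.

```python
def training_pairs(states, actions):
    """Split S1, A1, ..., Sn, An, Sn+1 into the three example sets."""
    if len(states) != len(actions) + 1:
        raise ValueError("expected one more state than actions")
    precondition_examples = list(zip(states[:-1], actions))  # (S_i, A_i)
    effect_examples = list(zip(actions, states[1:]))          # (A_i, S_i+1)
    invariant_examples = list(states)                         # each S_i on its own
    return precondition_examples, effect_examples, invariant_examples


def effect_frequency(effect_examples, action, fact):
    """Fraction of times `fact` holds in the state right after `action`.

    Frequency counting is a stand-in for the weight update; returns None
    if the action never occurs in the examples.
    """
    hits = 0
    total = 0
    for taken, next_state in effect_examples:
        if taken == action:
            total += 1
            hits += fact in next_state
    return hits / total if total else None


# Toy trajectory (made up for illustration): the second pickup fails.
states = [
    {"armempty", "ontable_a", "clear_a"},
    {"holding_a"},
    {"armempty", "ontable_a", "clear_a"},
    {"armempty", "ontable_a", "clear_a"},
]
actions = ["pickup_a", "putdown_a", "pickup_a"]

pre_ex, eff_ex, inv_ex = training_pairs(states, actions)
estimate = effect_frequency(eff_ex, "pickup_a", "holding_a")
```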
Model-lite Planning: Probabilistic Planning
• As stated before, with incomplete domain knowledge, a deterministic planning domain should be treated as a probabilistic domain
• The resulting plan should be maximally consistent with the current domain model
• We develop a planning technique for this purpose
– A plan that is maximally plausible, given the probabilistic axioms, initial state and goal
  • MPE solution to a Bayes net problem
– Builds on the plangraph
Probabilistic Plangraph
[Figure: plangraph for a two-block Blocksworld problem with blocks A and B.
First proposition layer: clear_a, clear_b, armempty, ontable_a, ontable_b
First action layer: pickup_a, pickup_b, plus a noop for each proposition in the first layer
Second proposition layer: adds holding_a, holding_b
Second action layer: pickup_a, pickup_b, stack_a_b, stack_b_a, plus a noop for each proposition in the second layer
Third proposition layer: adds on_a_b, on_b_a
Probability label shown on the graph: 0.8. Red lines indicate mutexes.]
• How do we generate a weighted clause? 0.95, pickup_b’ v holding_b
• Domain invariant properties can be asserted too
Can we view the probabilistic plangraph as a Bayes net?
[Figure: the two-block Blocksworld plangraph (blocks A and B; propositions clear, armempty, ontable, holding, on; actions pickup, stack and noops), with evidence variables marked and probability labels 0.8 and 0.5.]
• How do we find a solution? MPE (most probable explanation). There are existing solvers for this.
• Domain invariant properties can be asserted too, with probability 0.9
MPE as Maxsat
• There is prior work on this by James D. Park, AAAI 2002
• Set –log(P) as the weight of each clause
A  B  P
T  T  0.7
F  T  0.3
T  F  0.2
F  F  0.8

Weighted Clauses
-log 0.7   -A v -B
-log 0.3    A v -B
-log 0.2   -A v B
-log 0.8    A v B
Intuitive explanation: violating a clause is cheaper for high-probability instantiations, so the MaxSat solution gives the highest-probability instantiation.
For a deterministic rule A -> B: (A = T, B = T) has P = 1 and (A = T, B = F) has P = 0, so the clause -A v B gets infinite weight, which complies with our intuitive understanding.
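The table above is small enough to check the encoding by brute force. The sketch below builds one weighted clause per table row (violated exactly by that row, at cost -log P) and enumerates all assignments instead of calling a real MaxSat solver; the function names and the clause representation are hypothetical.

```python
import math
from itertools import product

# Table from the slide: (A, B) -> P
TABLE = {
    (True, True): 0.7,
    (False, True): 0.3,
    (True, False): 0.2,
    (False, False): 0.8,
}


def clauses_from_table(table):
    """One clause per row; the clause is violated only by that row and costs -log(P)."""
    clauses = []
    for (a, b), p in table.items():
        weight = math.inf if p == 0 else -math.log(p)
        # A literal (var, value) is satisfied when the variable takes that value,
        # so negating the row's values gives the clause that only this row violates.
        clause = (("A", not a), ("B", not b))
        clauses.append((weight, clause))
    return clauses


def penalty(assignment, clauses):
    """Total weight of the clauses the assignment violates."""
    total = 0.0
    for weight, clause in clauses:
        if not any(assignment[var] == value for var, value in clause):
            total += weight
    return total


def solve_by_enumeration(clauses, variables=("A", "B")):
    """Minimise the violated weight by trying every assignment (stand-in for a MaxSat solver)."""
    best = None
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        cost = penalty(assignment, clauses)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


cost, assignment = solve_by_enumeration(clauses_from_table(TABLE))
recovered_probability = math.exp(-cost)
```

Because each assignment violates exactly one clause here, the cheapest assignment is the row with the largest P, and exponentiating the negated cost recovers that probability.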
Probabilistic Plangraph to MaxSat
[Figure: the two-block Blocksworld plangraph (blocks A and B; propositions clear, armempty, ontable, holding, on; actions pickup, stack and noops), with evidence variables marked and weight labels -log0.8 and -log0.5.]
• For each probabilistic weight, we give –log(1-p)! That’s it.
• Domain invariant properties can be asserted too, -log0.9
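The weighting rule can be sketched as a small helper. This is a reading of the rule under stated assumptions, not the talk's encoder: it assumes the weight –log(1-p) is attached to the clause (not action) v effect, which is violated exactly when the action is taken and the effect does not follow; the function names and the "-" prefix for negation are hypothetical.

```python
import math


def clause_weight(p):
    """Weight -log(1 - p) for violating an axiom that holds with probability p.

    p = 1 yields an infinite weight, i.e. a hard clause.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return math.inf if p == 1.0 else -math.log(1.0 - p)


def effect_axiom_to_clause(p, action, effect):
    """Encode the axiom (p, action -> effect) as the weighted clause (-action v effect)."""
    return clause_weight(p), [f"-{action}", effect]


weight, clause = effect_axiom_to_clause(0.8, "pickup_a", "holding_a")
hard_weight, hard_clause = effect_axiom_to_clause(1.0, "holding_a", "-armempty")
```

Higher-probability axioms get larger weights, so a solver that minimises the violated weight prefers plans that break only the least certain axioms.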
Exploding Blocksworld
Current Status (ongoing)
• Learning test
– Generated Blocksworld random-wandering data and fed it to Alchemy together with correct and incorrect axioms
– Alchemy assigned higher weights to the correct axioms and lower weights to the incorrect axioms
• Planning test
– Tested on probabilistic planning problems
– Hand tested on a couple of instances of the Slippery Gripper domain
  • Hand encoded the clauses and assigned the weights
  • Passed the resulting clauses to a MaxSat solver
  • Got the desired results
– On Exploding Blocksworld
  • Implemented a generic MaxSat encoder for probabilistic planning problems
  • Tested on a couple of problems from Exploding Blocksworld
  • Finds the desired output frequently (not always)
Summary
• We can learn precondition axioms and effect axioms separately
– A -> Prec, A -> Effect
– This facilitates learning
• Domain axioms or invariant properties can be provided, learned and used explicitly
– This is easier for the domain modeler
• For planning, we can apply the probabilistic plangraph approach
– We proposed using MaxSat to solve probabilistic planning problems
– An interesting parallel to solving deterministic planning as SAT
Domain Learning – Related Work
• Logical Filtering (Chang & Eyal, ICAPS ’06)
– Updates the belief state and the domain transition model
– Experiments involved planning
• Probabilistic operator learning (Zettlemoyer, Pasula and Kaelbling, AAAI ’05)
– Experiments involved planning
• ARMS (Yang, Wu and Jiang, ICAPS ’05)
– No observation besides initial state and goal
Probabilistic Planning in Plangraph – Related Work
• PGraphplan, Paragraph
• Both search for plans in the Graphplan framework
• PGraphplan searches for a consistent plan that maximizes the goal-reaching probability
– Forward probability propagation
• Paragraph searches for a plan that minimizes the cost to reach the goal
– Backward plan search