Approximate Solutions for Factored Dec-POMDPs with Many...
May 8, 2013 1 / 33Oliehoek, Whiteson & Spaan - Factored Dec-POMDPs with Many Agents
Approximate Solutions for Factored Dec-POMDPs with Many Agents
Frans A. Oliehoek, Shimon Whiteson, & Matthijs T.J. Spaan
AAMAS, Wednesday May 8, 2013
Visual Roadmap
model
background: FSPC
Factored FSPC
practical factored FFSPC
results
Multiagent decision making under Uncertainty
Outcome Uncertainty
Partial Observability
Multiagent Systems: uncertainty about others
Formal Model: A Dec-POMDP
$\langle S, A, P_T, O, P_O, R, h \rangle$
n agents
$S$ – set of states
$A$ – set of joint actions, $a = \langle a_1, a_2, \ldots, a_n \rangle$
$P_T$ – transition function, $P(s' \mid s, a)$
$O$ – set of joint observations, $o = \langle o_1, o_2, \ldots, o_n \rangle$
$P_O$ – observation function, $P(o \mid a, s')$
$R$ – reward function, $R(s, a) = E_{s'}\, R(s, a, s')$
$h$ – horizon (finite)
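The expected-reward definition R(s, a) = E_{s'} R(s, a, s') can be read as a weighted sum over successor states. The following is a minimal sketch of that sum, assuming tabular models stored in plain dictionaries; the state names, action names, and numbers are hypothetical and not taken from the talk.

```python
from typing import Dict, Tuple

State = str
JointAction = Tuple[str, ...]

def expected_reward(
    s: State,
    a: JointAction,
    transition: Dict[Tuple[State, JointAction], Dict[State, float]],
    reward: Dict[Tuple[State, JointAction, State], float],
) -> float:
    """R(s, a) = sum over s' of P(s' | s, a) * R(s, a, s')."""
    return sum(p * reward[(s, a, s_next)]
               for s_next, p in transition[(s, a)].items())

# Hypothetical toy model with two agents and two states.
joint_action = ("fight", "fight")
transition = {("burning", joint_action): {"burning": 0.25, "safe": 0.75}}
reward = {
    ("burning", joint_action, "burning"): -1.0,
    ("burning", joint_action, "safe"): 0.0,
}
r = expected_reward("burning", joint_action, transition, reward)
print(r)
```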
Factored Dec-POMDPs
[Figure: houses 1 2 3 4 with state factors h1, h2, h3, h4 at stage t]
state variables or 'factors'
E.g., 'Firefighting Graph':
H houses h1...hH
H-1 agents
actions: left or right
observations: flames or not
[Figure, built up over slides 6-10: two-stage network from t to t+1 over state factors h1...h4 and h1'...h4', adding in turn the actions a1...a3, the observations o1...o3, and the reward R1]
CPTs (for each factor)
R1: reward for house 1; expected reward for house 1
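With one CPT per factor, the joint transition probability is a product of local terms. The sketch below illustrates this for a firefighting-style problem with H = 4 houses and 3 agents. The binary burning/not-burning state, the CPT numbers, and the helper names `agents_at` and `p_burning_next` are assumptions made for illustration, not the benchmark's actual parameters.

```python
from itertools import product

H = 4  # houses 0..3; the H - 1 agents sit between neighbouring houses

def agents_at(house, joint_action):
    """Count agents at `house`; agent i goes to house i ("left") or i + 1 ("right")."""
    return sum((i if act == "left" else i + 1) == house
               for i, act in enumerate(joint_action))

def p_burning_next(house, state, joint_action):
    """Hypothetical CPT for one factor: P(h_i' = burning | local parents)."""
    n = agents_at(house, joint_action)
    neighbour_burning = any(state[j] for j in (house - 1, house + 1) if 0 <= j < H)
    if n >= 2:
        return 0.0                       # assumed: two agents always extinguish
    if state[house]:
        return 0.5 if n == 1 else 1.0    # assumed: one agent succeeds half the time
    if n == 0 and neighbour_burning:
        return 0.25                      # assumed: fire may spread from a neighbour
    return 0.0

def transition_prob(state, joint_action, next_state):
    """P(s' | s, a) as a product of per-factor CPT entries."""
    p = 1.0
    for house in range(H):
        pb = p_burning_next(house, state, joint_action)
        p *= pb if next_state[house] else 1.0 - pb
    return p

state = (True, False, False, False)
action = ("left", "left", "left")
p_all_out = transition_prob(state, action, (False, False, False, False))
total = sum(transition_prob(state, action, s2)
            for s2 in product([False, True], repeat=H))
print(p_all_out, total)
```

Each factor only looks at its own house, the neighbouring houses, and the adjacent agents, which is the locality that the factored representation exploits.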
Background: FSPC – 1
Goal: scalability w.r.t. the number of agents
Forward-sweep policy computation (FSPC):
for each t = 0,...,h-1, compute the best joint decision rule δt given the past joint policy φt
[Figure, built up over slides 12-16: the forward sweep]
φ0 = ( ) → δ0 → φ1 = (δ0) → δ1 → φ2 = (δ0, δ1)
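The forward sweep can be sketched as a simple loop that extends the past joint policy by one decision rule per stage. This is only a skeleton: `best_decision_rule` is a hypothetical callback standing in for whatever stage-wise solver is used, and the stub below merely returns labels.

```python
def forward_sweep(horizon, best_decision_rule):
    """Build (δ0, ..., δ_{h-1}) stage by stage, given a stage-wise solver."""
    past_policy = ()                                   # φ0 = ( )
    for t in range(horizon):
        delta_t = best_decision_rule(t, past_policy)   # best δt given φt
        past_policy = past_policy + (delta_t,)         # φ_{t+1} = (δ0, ..., δt)
    return past_policy

# Hypothetical stub solver: returns a label instead of a real decision rule.
def stub_solver(t, past_policy):
    assert len(past_policy) == t   # φt contains exactly t decision rules
    return f"delta{t}"

policy = forward_sweep(3, stub_solver)
print(policy)
```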
Background: FSPC – 2
Problem at stage t: select $\delta = \langle \delta_1, \ldots, \delta_n \rangle$
$\delta_i$ : histories → actions, i.e. $\delta_i^t(\vec{\theta}_i^t) = a_i^t$
As collaborative Bayesian games (CBGs):
agents, actions
types $\theta_i$ ↔ histories
probabilities: $P(\theta)$
payoffs: $Q(\theta, a)$
[Figure: past joint policies φ0, φ1, φ2, each with its CBG B(φ0), B(φ1), B(φ2)]
Background: CBGs
[Figure: example CBG payoff table for agent 1 and agent 2 (agents 1 2 3 shown). Rows: agent 1's types (No Flames, Flames), each with actions H1 and H2. Columns: agent 2's types (No Flames, Flames), each with actions H2 and H3. Visible entries: +2 and -1 in an H1 row; all other entries are elided as "..."]
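A CBG asks for one decision rule per agent (a map from its types to its actions) that maximizes the expected payoff, i.e. the sum over joint types θ of P(θ) Q(θ, δ(θ)). The brute-force sketch below enumerates all joint decision rules; it is meant only to make the objective concrete and does not scale. The uniform type distribution and the payoff table are hypothetical.

```python
from itertools import product

def solve_cbg(types, actions, prob, payoff):
    """Exhaustively maximise sum_theta P(theta) * Q(theta, delta(theta))."""
    n = len(types)
    # Every decision rule of agent i is a mapping from its types to its actions.
    rules_per_agent = [
        [dict(zip(types[i], choice))
         for choice in product(actions[i], repeat=len(types[i]))]
        for i in range(n)
    ]
    best_value, best_rule = float("-inf"), None
    for joint_rule in product(*rules_per_agent):
        value = 0.0
        for joint_type, p in prob.items():
            joint_action = tuple(joint_rule[i][joint_type[i]] for i in range(n))
            value += p * payoff[(joint_type, joint_action)]
        if value > best_value:
            best_value, best_rule = value, joint_rule
    return best_value, best_rule

# Hypothetical two-agent game: each agent earns 1 for its "good" action per type.
types = [["flames", "no_flames"], ["flames", "no_flames"]]
actions = [["H1", "H2"], ["H2", "H3"]]
good1 = {"flames": "H1", "no_flames": "H2"}
good2 = {"flames": "H3", "no_flames": "H2"}
prob = {jt: 0.25 for jt in product(*types)}
payoff = {
    (jt, ja): float(ja[0] == good1[jt[0]]) + float(ja[1] == good2[jt[1]])
    for jt in product(*types) for ja in product(*actions)
}
best_value, best_rule = solve_cbg(types, actions, prob, payoff)
print(best_value, best_rule)
```

The number of joint decision rules grows exponentially in the number of agents and types, which is why exploiting structure matters for many agents.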
This paper: Factored FSPC
Factored FSPC, basic idea: exploit independence between agents in FSPC
if the value function is factored, $Q(x, \vec{\theta}, a) = \sum_e Q_e(x_e, \vec{\theta}_e, a_e)$
→ replace CBGs with collaborative graphical Bayesian games (CGBGs)
[Figure: CGBG over agent 1, agent 2 and agent 3]
Total payoff is sum of components
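The factored payoff Q(x, θ, a) = Σ_e Q_e(x_e, θ_e, a_e) can be evaluated by restricting the state, the joint type, and the joint action to each component's scope. Below is a minimal sketch under assumed scopes for a 3-agent chain; `toy_component` and all numbers are hypothetical.

```python
def total_payoff(components, x, joint_type, joint_action):
    """Sum local payoffs Q_e over components, each restricted to its own scope."""
    total = 0.0
    for state_scope, agent_scope, q_e in components:
        x_e = tuple(x[k] for k in state_scope)
        theta_e = tuple(joint_type[i] for i in agent_scope)
        a_e = tuple(joint_action[i] for i in agent_scope)
        total += q_e(x_e, theta_e, a_e)
    return total

def toy_component(x_e, theta_e, a_e):
    """Hypothetical local payoff: +2 if both agents meet at their shared house,
    otherwise -1, minus 1 per burning house in scope (types unused here)."""
    meet = a_e == ("right", "left")
    return (2.0 if meet else -1.0) - sum(x_e)

# Two components over agents (0, 1) and (1, 2), with assumed state scopes.
components = [((0, 1), (0, 1), toy_component),
              ((2, 3), (1, 2), toy_component)]
x = (1, 0, 1, 0)
joint_type = ("flames", "no_flames", "flames")
joint_action = ("right", "left", "left")
value = total_payoff(components, x, joint_type, joint_action)
print(value)
```

Because each component involves only a few agents, a solver can work on the local terms instead of a table over all joint types and joint actions.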
Factored FSPC for Many Agents
Improving scalability w.r.t. the number of agents...
1) Approximate structure of Q*: predetermined scope structure
2) Compute CGBG payoff functions: Transfer Planning (TP)
3) Inference techniques to construct CGBGs: extension of factored frontier [Murphy & Weiss, UAI 2001]
4) Efficient solutions of CGBGs: Max-plus to ATI-FG [OWS, UAI 2012]
[Figure: the CBGs B(φ0), B(φ1), B(φ2)]
Rest of this talk: items 1) and 2). For items 3) and 4), see paper.
Accumulation of Reward over Time
[Figure: three-stage network (t=0, t=1, t=2) over state factors h1...h4, actions a1...a3 and observations o1...o3, showing the reward R1 and the component $Q_{e=1}(x_e, \vec{\theta}_e, a_e)$]
[Figure, built up over slides 24-25: the same three-stage network, highlighting R1 and then Q1]
Still factored, but components not local!
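One way to see why the components stop being local is to trace which state factors and agents can influence a local reward a few stages later. The sketch below does this under an assumed dependency structure (each house depends on itself, its neighbours, and the adjacent agents) and assumes that R1 depends on the first house only. It illustrates scope growth and is not the paper's construction.

```python
H = 4  # houses 0..3 and agents 0..2 (0-based indices)

def state_parents(house):
    """Assumed 2-stage dependencies: a house depends on itself and its neighbours."""
    return {j for j in (house - 1, house, house + 1) if 0 <= j < H}

def agent_parents(house):
    """Assumed: only the agents adjacent to a house can act on it."""
    return {j for j in (house - 1, house) if 0 <= j < H - 1}

def backed_up_scope(reward_factors, steps):
    """State factors and agents that can influence the reward `steps` stages later."""
    factors, agents = set(reward_factors), set()
    for _ in range(steps):
        agents |= {a for f in factors for a in agent_parents(f)}
        factors |= {p for f in factors for p in state_parents(f)}
    return factors, agents

for steps in range(4):
    print(steps, backed_up_scope({0}, steps))
```

Under these assumptions the scope covers all four houses and all three agents after three stages, which motivates restricting the components to smaller, predetermined scopes.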
1) Predetermined Scope Structure
Use smaller scopes
[Figure, built up over slides 26-27: the three-stage network (t=0, t=1, t=2) with component Q1, then components Q1, Q2, Q3 and Q4]
2) Transfer Planning
Define a source problem for each component involving fewer agents
Solve those exactly or approximately (QMDP, etc.)
[Figure: the three-stage network (t=0, t=1, t=2) with components Q2 and Q3, and two source problems labeled "1 2 3" and "2 3 4"]
Results – Compared to Optimal
Factored FSPC + TP + QBG performs optimally
Results – vs. Approximate
FFSPC: TP vs. ADP, NR
Non-factored FSPC, DICE [Oliehoek et al. 2008, Informatica]
TP achieves (near-)best value...
...and superior scaling behavior
Results – Many Agents
Conclusions
Factored FSPC with transfer planning:
approximates factored Dec-POMDPs with multiple abstractions involving subsets of agents
Unprecedented scalability for this class: results up to 1000 agents
Future work:
scale to higher horizon
understand such abstractions: empirically verified near-optimal quality; formal understanding; influence-based abstraction [AAAI '12]
References
R. Emery-Montemerlo, G. Gordon, J. Schneider, and S. Thrun. Approximate solutions for partially observable stochastic games with common payoffs. In AAMAS, 2004.
K. P. Murphy and Y. Weiss. The factored frontier algorithm for approximate inference in DBNs. In UAI, 2001.
F. A. Oliehoek, J. F. Kooi, and N. Vlassis. The cross-entropy method for policy search in decentralized POMDPs. Informatica, 32:341–357, 2008.
F. A. Oliehoek, M. T. J. Spaan, S. Whiteson, and N. Vlassis. Exploiting locality of interaction in factored Dec-POMDPs. In AAMAS, 2008.
F. A. Oliehoek, S. Witwicki, and L. P. Kaelbling. Influence-Based Abstraction for Multiagent Systems. In AAAI, 2012.
F. A. Oliehoek, S. Whiteson, and M. T. J. Spaan. Exploiting structure in cooperative Bayesian games. In UAI, 2012.
D. Szer, F. Charpillet, and S. Zilberstein. MAA*: A heuristic search algorithm for solving decentralized POMDPs. In UAI, 2005.