Reactive Bandits with Attitude - Pedro A. OrtegaReactive Bandits with Attitude Pedro A. Ortega,...

1

Reactive Bandits with Attitude Pedro A. Ortega, Kee-Eung Kim and Daniel D. Lee GRASP Laboratory, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, U.S.A. Reactive Bandits Bandit Algorithms Learning Conclusions 2. The bandit can react to the player's strategy by choosing a location. 1. The possible locations depend on the attitude parameter. 1 0 -3 +3 0 The optimum switches between arms. In the limit, the arm with highest variance is chosen. 1 0 -0.09 -1 2e5 4e5 6e5 8e5 1 0 -0.09 2e5 4e5 6e5 8e5 1 0 1 0 UCB1's empirical frequencies do not converge to the optimum. 0.2 0.0 -0.2 -0.4 1 0 -0.09 2e4 4e4 6e4 8e4 1 0 True parameter is learned very quickly. The mixed strategy converges to the optimal strategy. Motivation Optimal Strategy

Upload
others
Category

Documents
view
5
download
0

Embed Size (px):

Transcript of Reactive Bandits with Attitude - Pedro A. OrtegaReactive Bandits with Attitude Pedro A. Ortega,...

Page 1: Reactive Bandits with Attitude - Pedro A. OrtegaReactive Bandits with Attitude Pedro A. Ortega, Kee-Eung Kim and Daniel D. Lee GRASP Laboratory, School of Engineering and Applied Sciences,

Reactive Bandits with AttitudePedro A. Ortega, Kee-Eung Kim and Daniel D. Lee

GRASP Laboratory, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, U.S.A.

Reactive Bandits

Bandit Algorithms

Learning

Conclusions

2. The bandit can react to the player's strategy by choosing a location.1. The possible locations

depend on the attitude parameter.

1

0

-3 +30

The optimum switchesbetween arms. In the limit, the arm withhighest variance is

chosen.

1

0

-0.09-12e5 4e5 6e5 8e5

1

0

-0.092e5 4e5 6e5 8e5

1

0

1

0

UCB1's empirical frequencies do not

converge to the optimum.

0.2

0.0

-0.2

-0.4

1

0

-0.092e4 4e4 6e4 8e4

1

0

True parameter is learned very quickly.

The mixed strategy converges to the optimal

strategy.

Motivation

Optimal Strategy

BREADCRUMBS & BANDITS

BREADCRUMBS & BANDITS

Multi-Armed Bandits in Metric Spaces - Brown …cs.brown.edu/~eli/papers/bandits-lip-full.pdfMulti-Armed Bandits in Metric Spaces Robert Kleinbergy Aleksandrs Slivkinsz Eli Upfalx

Multi-Armed Bandits in Metric Spaces - Brown …cs.brown.edu/~eli/papers/bandits-lip-full.pdfMulti-Armed Bandits in Metric Spaces Robert Kleinbergy Aleksandrs Slivkinsz Eli Upfalx

Power-Constrained Bandits

Power-Constrained Bandits

Foraging and Multi-armed Bandits Optimal Foraging and Multi-armed Bandits Vaibhav ...vaibhav/talks/2013a.pdf · 2016-08-16 · Optimal Foraging and Multi-armed Bandits Vaibhav Srivastava

Foraging and Multi-armed Bandits Optimal Foraging and Multi-armed Bandits Vaibhav ...vaibhav/talks/2013a.pdf · 2016-08-16 · Optimal Foraging and Multi-armed Bandits Vaibhav Srivastava

Bandits Stretchable Organizers

Bandits Stretchable Organizers

Bandits in the Roman Empire

Bandits in the Roman Empire

Instituto de Sistemas e Robótica...Instituto de Sistemas e Robótica Pólo de Lisboa Attitude Control Strategies for Small Satellites Paulo Tabuada, Pedro Alves, Pedro Tavares, Pedro

Instituto de Sistemas e Robótica...Instituto de Sistemas e Robótica Pólo de Lisboa Attitude Control Strategies for Small Satellites Paulo Tabuada, Pedro Alves, Pedro Tavares, Pedro

CHESAPEAKE BANDITS PARENTS MEETING 2013

CHESAPEAKE BANDITS PARENTS MEETING 2013

Vision-Aided Complementary Filters for Attitude and Position … · 2020. 10. 28. · Vision-Aided Complementary Filters for Attitude and Position Estimation of UAVs João Pedro Dias

Vision-Aided Complementary Filters for Attitude and Position … · 2020. 10. 28. · Vision-Aided Complementary Filters for Attitude and Position Estimation of UAVs João Pedro Dias

Eric Hobsbawm - Social Bandits- Reply

Eric Hobsbawm - Social Bandits- Reply

PowerPoint Presentation - CHESAPEAKE BANDITS PARENTS … · Family Dinner at Chilis TODAY Bandits Apparel - Signature Sportswear Order now with uniforms Bandits Book (need a VOLUNTEER

PowerPoint Presentation - CHESAPEAKE BANDITS PARENTS … · Family Dinner at Chilis TODAY Bandits Apparel - Signature Sportswear Order now with uniforms Bandits Book (need a VOLUNTEER

Structural Causal Bandits: Where to Intervene?papers.nips.cc/paper/7523-structural-causal-bandits...Structural Causal Bandits: Where to Intervene? Sanghack Lee Department of Computer

Structural Causal Bandits: Where to Intervene?papers.nips.cc/paper/7523-structural-causal-bandits...Structural Causal Bandits: Where to Intervene? Sanghack Lee Department of Computer

Bethlehem Bandits Script - musiclinedirect.com . 8 Bethlehem Bandits ... Flip chart (See Production ... The Bethlehem Bandits should wear full length smocks tied around the waist with

Bethlehem Bandits Script - musiclinedirect.com . 8 Bethlehem Bandits ... Flip chart (See Production ... The Bethlehem Bandits should wear full length smocks tied around the waist with

CHESAPEAKE BANDITS PARENTS MEETING

CHESAPEAKE BANDITS PARENTS MEETING

the bandits of Rio Frio

the bandits of Rio Frio

CHESAPEAKE BANDITS PARENTS MEETING 2012

CHESAPEAKE BANDITS PARENTS MEETING 2012

CHESAPEAKE BANDITS PARENTS MEETING 2014

CHESAPEAKE BANDITS PARENTS MEETING 2014

Quad Cities River Bandits Motivational

Quad Cities River Bandits Motivational

Brooks Bandits Wheelchair Rugby Manual

Brooks Bandits Wheelchair Rugby Manual

Languages

Pages

Legal

Copyright © 2022 FDOCUMENTS