1 Project Ideas. 2 Algorithmic Evaluations/Comparisons Compare variants of (nested) policy rollout...


Project Ideas


Algorithmic Evaluations/Comparisons

- Compare variants of (nested) policy rollout using different bandit algorithms
- Compare some variants of Monte-Carlo tree search
- Implement an algorithm from the literature and attempt to replicate results, e.g.
  - Forward Search Sparse Sampling (a type of Monte-Carlo tree search algorithm)
  - Anytime AO*
  - Least-Squares Policy Iteration
  - I could give other pointers depending on interests
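As a rough starting point for the first idea, the sketch below treats each action at the current state as a bandit arm and lets an interchangeable selection rule decide which action to simulate next. Everything here is an illustrative assumption rather than part of the course framework: the function names, the `simulate(state, action, rng)` interface (take the action, follow the base policy to the end, return the total reward) and the toy simulator are all hypothetical. A nested variant would call `rollout_action` itself from inside `simulate`.

```python
import math
import random

def ucb1_select(counts, sums, t, c=math.sqrt(2)):
    """UCB1: try every arm once, then pick the highest upper confidence bound."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: sums[a] / counts[a] + c * math.sqrt(math.log(t) / counts[a]),
    )

def uniform_select(counts, sums, t):
    """Uniform allocation: cycle through the arms in round-robin order."""
    return (t - 1) % len(counts)

def rollout_action(state, actions, simulate, select, budget, rng):
    """One step of policy rollout with a pluggable bandit rule.

    `simulate(state, action, rng)` is assumed to return the (noisy) return of
    taking `action` in `state` and then following the base policy.
    """
    counts = [0] * len(actions)
    sums = [0.0] * len(actions)
    for t in range(1, budget + 1):
        arm = select(counts, sums, t)
        sums[arm] += simulate(state, actions[arm], rng)
        counts[arm] += 1
    # Recommend the action with the best empirical mean return.
    best = max(
        range(len(actions)),
        key=lambda a: sums[a] / counts[a] if counts[a] else float("-inf"),
    )
    return actions[best], counts

def toy_simulate(state, action, rng):
    """Hypothetical stand-in for a real domain simulator plus base policy."""
    mean = {"a": 0.1, "b": 0.9, "c": 0.2}[action]
    return mean + rng.uniform(-0.05, 0.05)

if __name__ == "__main__":
    rng = random.Random(0)
    for rule in (uniform_select, ucb1_select):
        action, counts = rollout_action(None, ["a", "b", "c"], toy_simulate, rule, 300, rng)
        print(rule.__name__, action, counts)
```

Swapping the `select` argument is the whole comparison: the simulation budget and the final recommendation rule stay fixed, so any difference in decision quality comes from how the bandit rule allocates simulations.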


Algorithmic Comparisons

- Compare some reinforcement learning algorithms across some interesting problems
  - E.g. compare TD-based vs. Policy Gradient based
  - You could use the domains I have in the Java framework for evaluation
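To make the shape of such a comparison concrete, here is a minimal sketch that trains a TD-based learner (tabular Q-learning) and a policy-gradient learner (REINFORCE with a tabular softmax policy and no baseline) on the same toy problem and scores both with the same evaluation routine. The five-state corridor, the hyperparameters and all function names are assumptions made for illustration; the sketch is written in Python rather than against the Java framework, and how well REINFORCE does here will depend on the seed and learning rate.

```python
import math
import random

N_STATES, GOAL, MAX_STEPS, GAMMA = 5, 4, 50, 0.95  # toy corridor (assumed values)

def step(state, action):
    """Action 0 moves left, 1 moves right; reward 1.0 only on reaching GOAL."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def q_learning(episodes, rng, alpha=0.5, eps=0.1):
    """TD-based learner: tabular Q-learning with epsilon-greedy exploration."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(MAX_STEPS):
            explore = rng.random() < eps or q[s][0] == q[s][1]
            a = rng.randrange(2) if explore else int(q[s][1] > q[s][0])
            s2, r, done = step(s, a)
            target = r if done else r + GAMMA * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q

def greedy_policy(q):
    """Turn Q-values into a deterministic policy table (ties go right)."""
    return [[0.0, 1.0] if row[1] >= row[0] else [1.0, 0.0] for row in q]

def reinforce(episodes, rng, lr=0.1):
    """Policy-gradient learner: REINFORCE, one gradient step per episode."""
    theta = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, traj = 0, []
        for _ in range(MAX_STEPS):
            a = int(rng.random() < softmax(theta[s])[1])
            s2, r, done = step(s, a)
            traj.append((s, a, r))
            s = s2
            if done:
                break
        grad = [[0.0, 0.0] for _ in range(N_STATES)]
        g = 0.0
        for s, a, r in reversed(traj):
            g = r + GAMMA * g  # return from this time step onward
            probs = softmax(theta[s])
            for b in range(2):
                # Gradient of log softmax; the gamma**t weighting is omitted.
                grad[s][b] += g * ((1.0 if b == a else 0.0) - probs[b])
        for s in range(N_STATES):
            for b in range(2):
                theta[s][b] += lr * grad[s][b]
    return [softmax(row) for row in theta]

def average_return(policy, rng, episodes=200):
    """Mean discounted return from state 0 when sampling actions from `policy`."""
    total = 0.0
    for _ in range(episodes):
        s, discount = 0, 1.0
        for _ in range(MAX_STEPS):
            a = int(rng.random() < policy[s][1])
            s, r, done = step(s, a)
            total += discount * r
            discount *= GAMMA
            if done:
                break
    return total / episodes

if __name__ == "__main__":
    rng = random.Random(0)
    print("Q-learning:", average_return(greedy_policy(q_learning(300, rng)), rng))
    print("REINFORCE: ", average_return(reinforce(300, rng), rng))
```

For a real project the comparison would need several random seeds, learning curves rather than a single final score, and more than one domain; the point of the sketch is only that both learners share one environment interface and one evaluation function.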


Solve a Particular Problem

- Pick a challenging sequential decision making problem
  - Apply one or more of our planning/learning approaches to it and evaluate

- Problems from past projects:
  - Games
    - Tetris
    - Pokemon
    - Blokus
    - Chess
    - Backgammon
    - Othello
    - Clue
    - Space Wars (Galcon Fusion)
    - StarCraft
    - Pac-Man


Solve a Particular Problem

- Problems from past projects:
  - Compiler scheduling
  - Adaptive Java program optimization
  - Forest Fire Management
  - Crop Management
  - Optimizing Policies for Network Protocols
  - Controllers for Real-Time Strategy Games
    - Subproblems of the game
  - Optimizing file sharing policies
- Reinforcement learning and Monte-Carlo were the most commonly applied solution approaches