Learning to Navigate … at City Scale · Piotr Mirowski*, Razvan Pascanu*, Fabio Viola, Hubert...
[BBH Brazil for Renault / Art: Pedro Utzeri]
Learning to Navigate … at City Scale
Raia Hadsell Senior Research Scientist
Where am I? Where am I going?
Where did I start? How distant is A from B? What is the shortest path from A to B? Have I been here before? How long until we get there?
Navigation
Raia Hadsell - Learning to Navigate - 2018
Exploration: Multi-task prediction of sensory data
Representation: Grounding in neuroscience
Memory: One-shot navigation in unseen environment
Real world: Modularity and transfer learning
Can we teach agents to explore partially observed environments?
Piotr Mirowski*, Razvan Pascanu*, Fabio Viola, Hubert Soyer, Andy Ballard, Andrea Banino, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharsh Kumaran and Raia Hadsell
arxiv.org/abs/1611.03673 (ICLR 2017)
Learning to Navigate in Complex Environments
[MIT News / Photo: Mark Ostow]
Navigation mazes
+10 +1
Within episode:
● Fixed goal (static or randomly changing between episodes)
● Random respawns
[Beattie et al (2016), “DeepMind Lab”, github.com/deepmind/lab]
Given sparse rewards… explore and learn spatial knowledge
● Accelerate reinforcement learning through auxiliary losses
● Derive spatial knowledge from auxiliary tasks:
  ○ Depth prediction
  ○ Local loop closure prediction
● Assess navigation skills through position decoding
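The auxiliary-loss idea above can be sketched as a weighted sum of losses. This is a minimal illustration, not the authors' implementation. It assumes depth is quantized into bins and predicted as a classification target, that loop closure is a binary label, and that `beta_depth` and `beta_loop` are hypothetical weighting hyperparameters. None of these details appear on the slide.

```python
import math


def cross_entropy(probs, target_index):
    """Classification loss for one prediction (e.g. one quantized depth bin)."""
    return -math.log(probs[target_index])


def binary_cross_entropy(p, label):
    """Bernoulli loss for a yes/no prediction (e.g. 'have I been here before?')."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


def total_loss(rl_loss, depth_probs, depth_bin, loop_prob, loop_label,
               beta_depth=1.0, beta_loop=1.0):
    """RL loss plus weighted auxiliary losses (weights are assumptions)."""
    depth_loss = cross_entropy(depth_probs, depth_bin)
    loop_loss = binary_cross_entropy(loop_prob, loop_label)
    return rl_loss + beta_depth * depth_loss + beta_loop * loop_loss


# Uninformative predictions: each auxiliary term contributes log(2).
example = total_loss(rl_loss=0.0, depth_probs=[0.5, 0.5], depth_bin=0,
                     loop_prob=0.5, loop_label=1)
```

Because the auxiliary terms share the network's representation with the policy, their gradients can supply a learning signal even when the task reward is sparse.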
Agent training
Advantage actor critic reinforcement learning [Mnih, Badia et al (2016), “Asynchronous Methods for Deep Reinforcement Learning”]
● Agent observes state $s_t$ and takes action $a_t$
● Value and policy are updated with an estimate of the policy gradient given by the k-step advantage function $A$
● Policy term: $\nabla_\theta \log \pi(a_t \mid s_t; \theta)\, A(s_t, a_t; \theta_V)$
[Diagram labels: CNN, policy LSTM, outputs v and π]
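The k-step advantage in the policy term can be sketched in plain Python. This is an illustrative sketch under assumptions, not the training code behind the talk: the discount `gamma`, the bootstrap value, and the function names are hypothetical, and a real implementation would compute gradients with an autodiff library.

```python
import math


def k_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Return A_t = R_t - V(s_t), where R_t is the discounted k-step return
    bootstrapped from the value estimate of the state after the last step."""
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return [ret - value for ret, value in zip(returns, values)]


def policy_loss(log_probs, advantages):
    """Surrogate loss: its gradient w.r.t. the policy parameters is
    -sum_t grad log pi(a_t|s_t) * A_t, with advantages held constant."""
    return -sum(lp * adv for lp, adv in zip(log_probs, advantages))


rewards = [0.0, 0.0, 1.0]          # sparse reward arrives on the last step
values = [0.5, 0.6, 0.7]           # critic estimates V(s_t)
advantages = k_step_advantages(rewards, values, bootstrap_value=0.0, gamma=0.9)
loss = policy_loss([math.log(0.5)] * 3, advantages)
```

Positive advantages push up the log-probability of the actions taken; negative ones push it down.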
Navigation agent architectures
[Diagram labels: CNN; policy LSTM; Long Short-Term Memory (LSTM); hiddens; inputs reward_{t-1}, velocity_t, action_{t-1}; outputs v, π, and depth]
Results on large static mazes
Importance of auxiliary tasks
[Plots: reward at goal vs. environment steps]
Depth prediction as auxiliary task outperforms using depth as inputs
Mirowski, Pascanu et al (2017), “Learning to Navigate in Complex Environments”
• 3D, first person environment
• partially observed
• procedural variations
… but it’s not real
Can we solve navigation tasks in the real world?
Piotr Mirowski*, Matthew Koichi Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu,
Andrew Zisserman and Raia Hadsell
arxiv.org/abs/1804.00168
Learning to Navigate in Cities Without a Map
Street View
Street View as an RL environment: StreetLearn
Google Maps graph
Street View image
RGB panoramic image (we crop it and render at 84x84)
Actions: move to the next node, turn left/right
New York, London, Paris
● 14,000 to 60,000 nodes (panoramas) per “city”, covering a range of 3.5-5 km
● Discrete action space allows rotating in place and stepping to next node
● Multi-city dataset and RL environment will be released later this year
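The graph-plus-discrete-actions setup described above can be illustrated with a toy environment. This is a hypothetical sketch and not the StreetLearn API: the class name, the turn angle, the heading tolerance, and the rule for picking the next node are all assumptions, and image observations are left out entirely.

```python
import math


class ToyStreetGraphEnv:
    """Hypothetical toy stand-in: nodes are panoramas, edges are street links."""

    ACTIONS = ("rotate_left", "rotate_right", "step_forward")

    def __init__(self, positions, edges, start, heading_deg=0.0,
                 turn_deg=22.5, tolerance_deg=30.0):
        self.positions = positions             # node id -> (x, y) in metres
        self.neighbours = {node: set() for node in positions}
        for a, b in edges:                     # undirected street segments
            self.neighbours[a].add(b)
            self.neighbours[b].add(a)
        self.node = start
        self.heading = heading_deg % 360.0     # 0 = +x axis, counter-clockwise
        self.turn_deg = turn_deg
        self.tolerance_deg = tolerance_deg

    def _bearing(self, a, b):
        (ax, ay), (bx, by) = self.positions[a], self.positions[b]
        return math.degrees(math.atan2(by - ay, bx - ax)) % 360.0

    def step(self, action):
        if action == "rotate_left":
            self.heading = (self.heading + self.turn_deg) % 360.0
        elif action == "rotate_right":
            self.heading = (self.heading - self.turn_deg) % 360.0
        elif action == "step_forward":
            # Move to the neighbour best aligned with the current heading,
            # or stay put if none lies within the tolerance cone.
            best, best_diff = None, self.tolerance_deg
            for candidate in sorted(self.neighbours[self.node]):
                bearing = self._bearing(self.node, candidate)
                diff = abs((bearing - self.heading + 180.0) % 360.0 - 180.0)
                if diff <= best_diff:
                    best, best_diff = candidate, diff
            if best is not None:
                self.node = best
        else:
            raise ValueError(f"unknown action: {action}")
        return self.node, self.heading


env = ToyStreetGraphEnv(
    positions={0: (0.0, 0.0), 1: (10.0, 0.0), 2: (0.0, 10.0)},
    edges=[(0, 1), (0, 2)],
    start=0,
)
```

In this toy, stepping forward while facing no street leaves the agent in place, so it has to rotate before it can move.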
The Courier Task
The Knowledge
● Test to get a black cab license in London
● Candidates study for 3-4 years
● Memorize 25,000 roads and 20,000 named locations
● By the time they’ve passed the exam, their hippocampuses are ‘significantly enlarged’.
Woollett & Maguire (2011), “Acquiring ‘the Knowledge’ of London’s Layout Drives Structural Brain Changes”, Current Biology
The Courier Task
● Random start and target
● Navigation without a map
● Reward shaped when close to goal (<200m)
● Actions: rotate left, right, or step forward
● Inputs for the agent at every time point t:
  ○ 84x84 RGB image observations
  ○ landmark-based goal description
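Two pieces of the task setup above lend themselves to a small sketch: a landmark-based goal description and a reward shaped near the goal. Both functions are assumptions for illustration only. The slide does not say how the goal is encoded or what the shaped reward looks like, so the softmax-style weighting, the scale `alpha`, and the linear ramp inside 200 m are hypothetical choices.

```python
import math


def landmark_goal_description(goal_xy, landmarks_xy, alpha=0.002):
    """Encode a goal as normalized weights over fixed landmarks.
    Landmarks nearer to the goal receive larger weights."""
    distances = [math.dist(goal_xy, landmark) for landmark in landmarks_xy]
    weights = [math.exp(-alpha * d) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def shaped_reward(distance_m, goal_reward=1.0, shaping_radius_m=200.0):
    """Zero reward far from the goal, rising linearly inside the radius."""
    if distance_m >= shaping_radius_m:
        return 0.0
    return goal_reward * (1.0 - distance_m / shaping_radius_m)


landmarks = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]   # metres
description = landmark_goal_description((100.0, 0.0), landmarks)
```

Describing the goal relative to landmarks means the agent never receives map coordinates of its own position, which fits the "navigation without a map" setting.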
Architecture
[Mnih, Badia et al (2016), “Asynchronous Methods for Deep Reinforcement Learning”]
Successful learning on all 3 cities
[Plots: reward at goal vs. environment steps; panels: New York City around NYU, Central London]
Examples of 1000-step episodes
Analysis of goal acquisition
Examples of value function for the same target
Generalization on new goal areas
Goal locations held out during training, and landmark locations
Multi-city modular transfer
Given a sequence of cities (regions of NYC), compare the following training regimes: single, joint, modular transfer
Successful navigation in target city, even though the convnet and policy LSTM are frozen and only the goal LSTM is trained.
Moreover, we note that the transfer success is correlated with the number of cities seen during pre-training.
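The freezing scheme described above (convnet and policy LSTM fixed, only the goal LSTM trained) can be sketched as a parameter update that skips frozen modules. This is a toy illustration with made-up module names, weights, and gradients, and plain SGD stands in for whatever optimizer was actually used.

```python
def sgd_step(params, grads, trainable, lr=0.1):
    """Return updated parameters; modules not listed in `trainable` stay frozen."""
    updated = {}
    for module, weights in params.items():
        if module in trainable:
            updated[module] = [w - lr * g for w, g in zip(weights, grads[module])]
        else:
            updated[module] = list(weights)    # frozen: copied unchanged
    return updated


# Hypothetical pre-trained parameters, grouped by module.
params = {"convnet": [1.0, 2.0], "policy_lstm": [0.5], "goal_lstm": [0.0, 0.0]}
grads = {"convnet": [0.1, 0.1], "policy_lstm": [0.2], "goal_lstm": [1.0, -1.0]}

# Transfer to a new city: only the city-specific goal LSTM is trained.
updated = sgd_step(params, grads, trainable={"goal_lstm"})
```

Keeping the shared modules frozen is what makes the result on the slide notable: only the city-specific module is trained on the target city.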
• Learning to navigate in complex environments (ICLR 2017)
  Piotr Mirowski*, Razvan Pascanu*, Fabio Viola, Hubert Soyer, Andy Ballard, Andrea Banino, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharsh Kumaran and Raia Hadsell
• Learning to navigate in cities without a map (NIPS 2018)
  Piotr Mirowski*, Matthew Koichi Grimes, Keith Anderson, Denis Teplyashin, Mateusz Malinowski, Karl Moritz Hermann, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell
Many thanks to many collaborators!
www.deepmind.com www.raiahadsell.com