Embodied Learning of Qualitative Models
Jure Žabkar
Exploration and Curiosity in Robot Learning and Inference, DAGSTUHL, March 2011
joint work with xpero partners
problem
“How should a robot choose its actions and experiences so as to maximize the effectiveness of its learning?”
goals
• to learn comprehensible models
• no extrinsic reward
• intrinsic reward: improved prediction model about the environment
our way
• learning from scratch (no explicit background knowledge, but given a learning algorithm)
• real robots, real-time learning
learning loop
1. observe the environment (collect data)
2. learn a model
3. use the model to predict the effect of each action
4. choose the best action (w.r.t. the active learning strategy)
5. observe the environment and check whether the predictions match new observations
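The loop above can be sketched in Python. This is only an illustration under stated assumptions: `ToyEnvironment`, the frequency-table "model" and the "try the least-explored action" strategy are all hypothetical stand-ins, not the XPERO implementation.

```python
import random
from collections import Counter, defaultdict


class ToyEnvironment:
    """Hypothetical stand-in for the robot's world: a 1-D position in 0..5."""

    def __init__(self):
        self.position = 2

    def observe(self):
        return self.position

    def execute(self, action):
        step = {"forward": 1, "back": -1}[action]
        self.position = max(0, min(5, self.position + step))


def qualitative_change(before, after):
    """Map a numeric change to '+', '-' or '0'."""
    return "+" if after > before else "-" if after < before else "0"


def learning_loop(env, actions, steps, rng):
    # model[(state, action)] counts the qualitative effects seen so far
    model = defaultdict(Counter)
    correct = 0
    state = env.observe()  # 1. observe the environment
    for _ in range(steps):
        # 2.-3. the table is the "model"; predict the most frequent effect
        predictions = {}
        for a in actions:
            seen = model[(state, a)]
            predictions[a] = seen.most_common(1)[0][0] if seen else None
        # 4. assumed active-learning strategy: try the least-explored action
        tried = {a: sum(model[(state, a)].values()) for a in actions}
        fewest = min(tried.values())
        action = rng.choice([a for a in actions if tried[a] == fewest])
        env.execute(action)
        # 5. observe again and compare the prediction with the outcome
        new_state = env.observe()
        effect = qualitative_change(state, new_state)
        correct += predictions[action] == effect
        model[(state, action)][effect] += 1
        state = new_state
    return model, correct
```

Calling `learning_loop(ToyEnvironment(), ["forward", "back"], steps=200, rng=random.Random(0))` returns the learned table and the number of correct predictions. Because the toy world is deterministic, a prediction fails only the first time a state-action pair is tried.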
starting scenario
Q: how does the area of the ball (as observed by the robot) change w.r.t. the robot's actions?
area := #pixels of the red blob in the image from the robot's camera
actions: sL, sR (the distance of the L/R wheel)
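As a minimal sketch of the `area` definition, assuming the image is given as rows of `(r, g, b)` tuples and that "red" can be decided by simple channel thresholds (the function names and the threshold values are made up for this example):

```python
def is_red(pixel, min_red=150, max_other=80):
    """Assumed colour test: strong red channel, weak green and blue."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def blob_area(image):
    """Count the red pixels in an image given as rows of (r, g, b) tuples."""
    return sum(is_red(pixel) for row in image for pixel in row)


RED, GREY = (200, 30, 30), (120, 120, 120)
image = [
    [GREY, RED, RED, GREY],
    [GREY, RED, GREY, GREY],
    [GREY, GREY, GREY, GREY],
]
area = blob_area(image)
```

The sketch counts every red pixel, so it matches the definition only when the ball is the single red object in view.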
area = area(sL,sR)
task: find the appropriate model
equation discovery?
we tried several algorithms, no success
motivation
why learn qualitative relations?
people most often reason qualitatively
AI: robots should mimic human intelligence
the area problem, qualitatively
if action=forward then the area increases until it becomes constant (blob occupies the whole image)
if orientation<0 and action=left (increasing the absolute value of the angle) then the area decreases until it becomes constant (zero)
...
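The two rules stated above can be written as a small predictor. This is a sketch only: the function name, the `'+'`/`'-'`/`'0'` encoding and the `image_size` parameter are assumptions, and the rules elided on the slide are deliberately left out.

```python
def predict_area_change(action, orientation, area, image_size):
    """Predict the qualitative change of the blob area.

    Returns '+', '-' or '0', or None when neither of the two rules applies.
    """
    if action == "forward":
        # the area grows until the blob occupies the whole image
        return "0" if area >= image_size else "+"
    if action == "left" and orientation < 0:
        # turning further away from the ball: the area shrinks until it is zero
        return "0" if area <= 0 else "-"
    return None  # the remaining rules are not given on the slide
```

Note that the prediction is a direction of change, not a number of pixels.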
qualitative rules
the prediction model gets much more accurate, but the predictions are less precise.
methods
• active learning + planning
• learning methods:
– Padé: Žabkar, Možina, Bratko, Demšar. Learning Qualitative Models from Numerical Data, AIJ, 2011
– STRUDEL: Košmerlj, Bratko, Žabkar. Embodied Concept Discovery through Qualitative Action Models, IJUFKS, 2011
– Qube: Žabkar et al. Preference Learning from Qualitative Partial Derivatives, ECML Preference Learning Workshop, 2010
– Hyper (with predicate invention mechanism): Leban, Žabkar, Bratko. An experiment in robot discovery with ILP, Proc. ILP 2008
• tested on simulated (billiards) and real data (medical application, robotics)
ceteris paribus ("all other things being equal")
• e.g. partial differentiation
• observe a qualitative relation between two selected features, other features held constant
• qualitative relations of 3 types:
– x increases ⇒ f(x) increases (Padé)
– preference relation: x ≻ y ⇒ f(x) ≻ f(y)
– structural: on(A,B,t1), on(A,C,t2)
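To make the ceteris paribus idea concrete, here is a naive sketch that estimates the sign of a partial derivative from numerical samples. It is not the Padé algorithm, only a pairwise-comparison stand-in; the function name, the data layout and the `tolerance` parameter are assumptions.

```python
from itertools import combinations


def qualitative_partial_derivative(samples, feature, tolerance=0.1):
    """Estimate the sign of df/d(feature) from (features_dict, f_value) samples.

    Only pairs of samples whose other features differ by at most `tolerance`
    are compared (ceteris paribus); each such pair votes for '+' or '-'.
    Returns '+', '-' or None when the votes are tied or there is no evidence.
    """
    votes = {"+": 0, "-": 0}
    for (xa, fa), (xb, fb) in combinations(samples, 2):
        others_equal = all(
            abs(xa[k] - xb[k]) <= tolerance for k in xa if k != feature
        )
        dx = xb[feature] - xa[feature]
        df = fb - fa
        if not others_equal or dx == 0 or df == 0:
            continue
        votes["+" if dx * df > 0 else "-"] += 1
    if votes["+"] == votes["-"]:
        return None
    return "+" if votes["+"] > votes["-"] else "-"


# Toy data sampled from f(x, y) = 2x - y on a 3 x 3 grid
samples = [({"x": x, "y": y}, 2 * x - y) for x in range(3) for y in range(3)]
sign_x = qualitative_partial_derivative(samples, "x")
sign_y = qualitative_partial_derivative(samples, "y")
```

On this toy data `sign_x` is `'+'` and `sign_y` is `'-'`, i.e. f increases with x and decreases with y when the other feature is held constant.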
qualitative models
data → [Padé, Qube, STRUDEL] → qualitative changes → [machine learning, statistics] → qualitative models
learning with structured data
• ILP with predicate invention is too complex for real-time learning
• we use ILP to learn smaller subtasks – structural qualitative changes
www.ailab.si/xpero
the concept "movable"
the discovered condition which distinguishes different effects of actions:
p1(Obj) :-
    at(T1, Obj, Pos1),
    at(T2, Obj, Pos2),
    neq_pos(Pos1, Pos2).
p1 is true if the object was observed at two different positions
the discovered effects of actions:
move(T, Obj) :- p1(Obj), f1(T, Obj).
move(T, Obj) :- not p1(Obj), f2(T, Obj).
f1(T1, Obj) :-
    at(T1, Obj, Pos1),
    at(T2, Obj, Pos2),
    Pos1 \== Pos2,
    {T2 = T1+1}.
f2(T, Obj) :- not f1(T, Obj).
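For readers less familiar with Prolog, the clauses can be read procedurally. Below is a rough Python transliteration, assuming the `at(T, Obj, Pos)` facts are given as `(t, obj, pos)` tuples with hashable positions; the helper `positions_by_object` and the example data are hypothetical.

```python
def positions_by_object(observations):
    """Group at(T, Obj, Pos) facts, given as (t, obj, pos), into {obj: {t: pos}}."""
    table = {}
    for t, obj, pos in observations:
        table.setdefault(obj, {})[t] = pos
    return table


def p1(table, obj):
    """True if obj was observed at two different positions ('movable')."""
    return len(set(table[obj].values())) > 1


def f1(table, t, obj):
    """True if obj changes position between time t and t + 1."""
    track = table[obj]
    return t in track and t + 1 in track and track[t] != track[t + 1]


def move(table, t, obj):
    """Mirror the two move clauses: movable objects change position, others do not."""
    return f1(table, t, obj) if p1(table, obj) else not f1(table, t, obj)


observations = [
    (1, "ball", (0, 0)), (2, "ball", (1, 0)), (3, "ball", (1, 0)),
    (1, "wall", (5, 5)), (2, "wall", (5, 5)),
]
table = positions_by_object(observations)
```

Here `p1(table, "ball")` holds and `p1(table, "wall")` does not, so the invented predicate separates the movable object from the immovable one.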