Qualitative Induction for Behavioral Cloning
Dorian Šuc and Ivan Bratko
AI Lab, Faculty of Computer and Information Science
University of Ljubljana, Slovenia
Qualitative Learning in Behavioral Cloning
Dorian Šuc and Ivan Bratko
Behavioral cloning
Operator
Dynamic system: crane, aircraft, acrobot, ...
Control trace
Machine learning
Operator's double ("clone")
Approaches to cloning
The trace is a sequence: (State1, Action1), (State2, Action2), ...
"Direct controller": induce a mapping States --> Actions, Action = f(State)
"Indirect controller": two learning problems
• learning the operator's trajectory
• learning the system dynamics
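As a minimal sketch of the direct-controller idea, the code below induces Action = f(State) from a trace of (State, Action) pairs. The slides do not say which learner is used, so the 1-nearest-neighbour rule, the toy trace, and the names `induce_direct_controller`, `trace` and `clone` are illustrative assumptions.

```python
import math

def induce_direct_controller(trace):
    """Return f with Action = f(State), induced from (state, action) pairs.

    Assumption: a 1-nearest-neighbour rule stands in for the learner.
    """
    def controller(state):
        # Copy the action recorded for the closest state in the trace.
        _, action = min(trace, key=lambda pair: math.dist(pair[0], state))
        return action
    return controller

# Hypothetical trace: states are (X, dX) tuples, actions are force values.
trace = [((0.0, 0.0), 1.0), ((30.0, 2.0), 0.0), ((60.0, 0.5), -1.0)]
clone = induce_direct_controller(trace)
print(clone((28.0, 1.8)))  # action copied from the closest recorded state
```

Any regression or classification learner could replace the nearest-neighbour rule; the point is only that a direct controller maps the current state straight to an action.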
Using indirect controllers
Indirect controller = "generalized trajectory" + system dynamics
1. Compute Error = diff(CurrentState, GeneralTrajectory)
2. Using the dynamics, determine the next action Action such that Action reduces Error
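The two steps can be sketched as follows. This is only an illustration under stated assumptions: the state is a dict with a position and a velocity, the generalized trajectory gives a desired velocity for each position, the dynamics model is a one-step predictor, and the controller picks the candidate action with the smallest predicted error (a stronger choice than merely reducing it). All function names and numbers are hypothetical.

```python
def indirect_control_step(state, general_trajectory, dynamics, candidate_actions):
    """Pick the action whose predicted next state is closest to the trajectory."""
    def error(s):
        # Step 1: Error = diff(CurrentState, GeneralTrajectory)
        return abs(s["velocity"] - general_trajectory(s["position"]))
    # Step 2: use the dynamics to choose the action with the smallest predicted error
    return min(candidate_actions, key=lambda a: error(dynamics(state, a)))

# Hypothetical generalized trajectory: desired velocity as a function of position.
def general_trajectory(position):
    return 2.0 if position < 30.0 else 0.0

# Hypothetical dynamics: the action is an acceleration applied for one time step.
def dynamics(state, action, dt=1.0):
    velocity = state["velocity"] + action * dt
    return {"position": state["position"] + velocity * dt, "velocity": velocity}

action = indirect_control_step({"position": 0.0, "velocity": 0.0},
                               general_trajectory, dynamics, [-1.0, 0.0, 1.0])
print(action)
```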
Comparison of direct and indirect controllers
Experimental findings:
Indirect controllers:
- are more robust
- allow the skill to be explained in terms of the operator's subgoals
- give better insight into the operator's subconscious skill
This paper
• Induction of indirect controllers
• Qualitative learning of trajectories
• QUIN: a program for inducing qualitative trees from numerical data
• Application to crane control
Example of a qualitative relation
Behavior of a gas
Quantitative law:
Pressure * Volume / Temperature = const.
Qualitative law:
Pressure = M+,-(Temperature, Volume)
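A small numeric check of how the qualitative law follows from the quantitative one, assuming positive temperatures and volumes. The function name `pressure`, the constant 1.0 and the sample values are illustrative only.

```python
def pressure(temperature, volume, const=1.0):
    # Quantitative law rearranged: Pressure = const * Temperature / Volume
    return const * temperature / volume

# M+,-: raising Temperature (Volume fixed) raises Pressure,
# raising Volume (Temperature fixed) lowers Pressure.
assert pressure(310.0, 2.0) > pressure(300.0, 2.0)
assert pressure(300.0, 2.5) < pressure(300.0, 2.0)

# When both arguments increase, the qualitative law alone does not
# determine the direction of change; only the numbers do.
print(pressure(310.0, 2.5) - pressure(300.0, 2.0))
```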
The QUIN program: QUalitative INduction
Numerical examples --> QUIN --> Qualitative tree
Qualitative tree: similar to a decision tree, but with qualitative constraints in the leaves
Example problem for QUIN
Noisy examples: z = x^2 - y^2 + noise (std. dev. 50)
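Data of this kind could be generated as sketched below. Only the function z = x^2 - y^2 and the noise standard deviation of 50 come from the slide; the sampling range [-20, 20], the uniform sampling, the sample size and the seed are assumptions made for illustration.

```python
import random

def make_examples(n, noise_std=50.0, lo=-20.0, hi=20.0, seed=0):
    """Sample (x, y, z) with z = x^2 - y^2 + Gaussian noise."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        z = x * x - y * y + rng.gauss(0.0, noise_std)
        examples.append((x, y, z))
    return examples

examples = make_examples(200)
print(examples[0])
```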
Qualitative patterns in the data
z = x*x - y*y, noisy examples (std. dev. = 50)
[Plot of the noisy examples; x = 0 and y = 0 marked]
x > 0 & y > 0 => z = M+,-(x,y)
Induced qualitative tree for z = x^2 - y^2
x ≤ 0:
    y ≤ 0: z = M-,+(x,y)
    y > 0: z = M-,-(x,y)
x > 0:
    y ≤ 0: z = M+,+(x,y)
    y > 0: z = M+,-(x,y)    (z monotonically increasing with x and monotonically decreasing with y)
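A qualitative tree can be represented much like a decision tree whose leaves hold QCF signs. The sketch below encodes the tree for z = x^2 - y^2, with leaf signs matching the signs of the partial derivatives 2x and -2y in each quadrant. The tuple/dict encoding, the name `leaf_qcf` and the choice to send values equal to the threshold down the first branch are assumptions for illustration.

```python
# Inner node: (attribute, threshold, subtree_if_le, subtree_if_gt); leaf: QCF signs.
TREE = ("x", 0.0,
        ("y", 0.0, {"x": "-", "y": "+"}, {"x": "-", "y": "-"}),
        ("y", 0.0, {"x": "+", "y": "+"}, {"x": "+", "y": "-"}))

def leaf_qcf(tree, example):
    """Follow the splits until a leaf (a dict of QCF signs) is reached."""
    while isinstance(tree, tuple):
        attribute, threshold, le_branch, gt_branch = tree
        tree = le_branch if example[attribute] <= threshold else gt_branch
    return tree

# For x > 0 and y > 0 the leaf says: z increases with x and decreases with y.
print(leaf_qcf(TREE, {"x": 3.0, "y": 5.0}))
```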
Qualitatively Constrained Functions, QCF
M^(s1, ..., sm): R^m --> R, si = + or -
Signs si indicate directions of change:
If si = +: the function monotonically increases in the i-th attribute (the function is "positively related" to the i-th attribute)
If si = -: the function monotonically decreases in the i-th attribute (the function is "negatively related" to the i-th attribute)
QCF consistency with examples
• Each pair of examples (e,f) defines a qualitative change vector q with respect to a no-change threshold
• A QCF is consistent with (e,f) if the QCF permits q
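The sketch below computes a qualitative change vector for a pair of examples and checks it against a QCF. The slides do not spell out when a QCF "permits" a change vector, so the rule in `is_consistent` is an assumed reading: the QCF is violated only when every changing attribute predicts the opposite class change. The function names and the example values are hypothetical.

```python
def qualitative_change(e, f, threshold=0.0):
    """Qualitative change vector from example e to example f.

    Each example is a tuple (attr_1, ..., attr_m, cls). Entries are
    +1 (increase), -1 (decrease) or 0 (difference within the threshold).
    """
    def q(delta):
        return 0 if abs(delta) <= threshold else (1 if delta > 0 else -1)
    return tuple(q(b - a) for a, b in zip(e, f))

def is_consistent(signs, q):
    """Assumed rule: the QCF permits q unless every changing attribute
    predicts the opposite class change."""
    *attr_changes, class_change = q
    predictions = {s * c for s, c in zip(signs, attr_changes)} - {0}
    return class_change == 0 or not predictions or class_change in predictions

# M+,-: x increases and y decreases, so an increase in the class is permitted.
q = qualitative_change((1.0, 4.0, 10.0), (2.0, 3.0, 15.0))
print(q, is_consistent((+1, -1), q))
```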
QCF ambiguity
• A QCF may be consistent with a qualitative change vector q and still be ambiguous w.r.t. q
• A QCF is ambiguous w.r.t. q if it also permits qualitative changes in the class other than the one in q
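Ambiguity can be illustrated with a similar check. The rule used here is an assumption, not taken from the slides: the QCF is treated as ambiguous when it does not predict a single direction of class change, either because the attributes pull in opposite directions or because no attribute changes. The name `is_ambiguous` is hypothetical.

```python
def is_ambiguous(signs, attr_changes):
    """Assumed rule: ambiguous unless all changing attributes predict
    the same direction of class change."""
    predictions = {s * c for s, c in zip(signs, attr_changes)} - {0}
    return len(predictions) != 1

# M+,-(x, y): x and y both increase, so the class may go either way.
print(is_ambiguous((+1, -1), (+1, +1)))
# x increases and y decreases: both predict an increase in the class.
print(is_ambiguous((+1, -1), (+1, -1)))
```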
Error-cost of QCF
• The error-cost of a QCF w.r.t. an example set is defined as a weighted encoding length
• The error-cost of a QCF considers: encoding of the QCF + encoding of inconsistent predictions by the QCF + encoding of ambiguous predictions by the QCF
• Weighted by the proximity of the examples concerned
The QUIN algorithm
• Top-down greedy algorithm that induces qualitative trees
• For every possible split (node), find the "most consistent" QCF (minimum cost) for each subset of examples
• Find the best attribute (best split) according to MDL
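One greedy step of this kind can be sketched as below. This is a simplified stand-in, not QUIN itself: the cost counts 1 per inconsistent example pair and 0.5 per ambiguous pair instead of the weighted encoding length, there is no MDL-based comparison against not splitting, and the recursion over subtrees is omitted. All names and the toy data are hypothetical.

```python
from itertools import combinations, product

def sign(delta):
    return (delta > 0) - (delta < 0)

def qcf_cost(examples, signs):
    """Stand-in cost (NOT the MDL error-cost): 1 per inconsistent pair,
    0.5 per ambiguous pair. Examples are (attribute_tuple, class_value)."""
    cost = 0.0
    for (xa, za), (xb, zb) in combinations(examples, 2):
        dz = sign(zb - za)
        preds = {s * sign(b - a) for s, a, b in zip(signs, xa, xb)} - {0}
        if dz == 0 or not preds:
            continue
        if dz not in preds:
            cost += 1.0          # QCF predicts the opposite class change
        elif len(preds) > 1:
            cost += 0.5          # QCF permits both directions
    return cost

def best_qcf(examples, n_attrs):
    """Try every sign combination and keep the cheapest one."""
    return min((qcf_cost(examples, s), s) for s in product((1, -1), repeat=n_attrs))

def best_split(examples, n_attrs):
    """Greedy step: choose the (attribute, threshold) with the lowest total cost."""
    best = None
    for i in range(n_attrs):
        values = sorted({x[i] for x, _ in examples})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [e for e in examples if e[0][i] <= t]
            right = [e for e in examples if e[0][i] > t]
            cost = best_qcf(left, n_attrs)[0] + best_qcf(right, n_attrs)[0]
            if best is None or cost < best[0]:
                best = (cost, i, t)
    return best

# Noise-free toy data for z = x^2 with a single attribute x.
data = [((x,), x * x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
print(best_qcf(data, 1))    # no single QCF fits the whole set without errors
print(best_split(data, 1))  # (cost, attribute index, threshold)
```

On the toy data a split near x = 0 removes all inconsistencies, because M-(x) fits the left subset and M+(x) fits the right one.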
Experimental evaluation
• On a set of artificial domains:
– QUIN performs well on noisy data
– QUIN finds qualitative relations that match intuition
• QUIN in behavioral cloning:
– QUIN was used to learn the operator's control strategy
– Experiments in the crane domain
Application in behavioral cloning
• Domain: crane control
• Goal of cloning: successful and comprehensible clones
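Container crane
Control task: move the load from the start position to the goal position
[Diagram: trolley at position X carrying the load on a rope of length L; start X0 = 0, L0 = 20; goal Xg = 60, Lg = 32]
Control forces: Fx, FL
State: X, dX, Φ, dΦ, L, dL (Φ is the rope angle)
Based on earlier work by T. Urbančič (94)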
QUIN in skill modeling, crane domain
• Qualitative trees induced for trolley control and rope control
• Traces of two operators with very different control styles
Trolley control, operator S
desired_velocity = f(X, Φ, dΦ)
X < 20.7
    yes: M+(X)    First the trolley velocity is increasing
    no: X < 60.1
        yes: M-(X)    From about the middle distance from the goal the trolley velocity is decreasing
        no: M+(Φ)    At the goal, reduce the swing of the rope (by accelerating the trolley when the rope angle increases)
Crane control: comparison of operators
Comparison of differences in control style
Operator S:
X < 20.7
    yes: M+(X)
    no: X < 60.1
        yes: M-(X)
        no: M+(Φ)
Operator L:
[Qualitative tree for operator L: splits X < 29.3 and dΦ < -0.02; leaves M-(X), M-,+(X,Φ), M+,+,-(X,Φ,dΦ)]
Transforming a qualitative strategy into a quantitative one
• By concretizing QCFs into real-valued functions, e.g. M+(X) --> f+(X), a randomly generated function that satisfies the QCF
• Domain knowledge can be used:
– maximum and minimum values of the state variables
– the trolley must start moving at the beginning
– the trolley must stop at the goal
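One way to concretize M+(X) is to draw a random piecewise-linear function with positive increments, pinned to known minimum and maximum values. The sketch below does this. The construction, the name `random_increasing_function`, the number of knots and the output range 0 to 3 are assumptions for illustration; the slides do not say how the random functions are generated.

```python
import bisect
import random

def random_increasing_function(x_min, x_max, y_min, y_max, n_knots=10, seed=None):
    """Return a random piecewise-linear f, strictly increasing on [x_min, x_max],
    with f(x_min) = y_min and f(x_max) = y_max up to rounding.
    This is one possible concretization of M+(X)."""
    rng = random.Random(seed)
    steps = [rng.uniform(0.1, 1.0) for _ in range(n_knots)]  # positive increments
    total = sum(steps)
    xs = [x_min + (x_max - x_min) * i / n_knots for i in range(n_knots + 1)]
    ys = [y_min]
    for step in steps:
        ys.append(ys[-1] + (y_max - y_min) * step / total)

    def f(x):
        x = min(max(x, x_min), x_max)                      # clamp to the known range
        i = min(bisect.bisect_right(xs, x), n_knots) - 1   # segment containing x
        frac = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + frac * (ys[i + 1] - ys[i])
    return f

# Hypothetical use: desired trolley velocity rising from 0 to 3 over X in [0, 20.7].
f = random_increasing_function(0.0, 20.7, 0.0, 3.0, seed=1)
print(f(0.0), f(10.0), f(20.7))
```

Fixing the endpoints is where domain knowledge such as minimum and maximum values of the state variables would enter; a decreasing function for M-(X) can be obtained by swapping `y_min` and `y_max`.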
QUIN in skill modeling
Induced control strategies:
• Comprehensible and very successful
• Provide insight into the differences between individual control styles
QUIN was able to detect very hidden aspects of the human's subconscious skill (aspects that were not known before QUIN was applied)
Related work in qualitative reasoning
• In qualitative reasoning: our QCFs are inspired by qualitative proportionalities (Q+) in QPT (Forbus) and monotonicity relations (M+) in QSIM (Kuipers)
• In learning qualitative models of dynamic systems: Mozetic; Coiera; Bratko et al.; Varsek; Richards et al.; Dzeroski, Todorovski
• Distinguishing features of QUIN: models of static systems, qualitative trees, takes numerical examples directly