International Graduate School of Dynamic Intelligent Systems

Machine Learning

RG Knowledge Based Systems, University of Paderborn
Hans Kleine Büning
9 January 2009
Outline
- Learning by Example: Motivation, Decision Trees, ID3, Overfitting, Pruning, Exercise
- Reinforcement Learning: Motivation, Markov Decision Processes, Q-Learning, Exercise
Motivation
- Partly inspired by human learning
- Objectives: classify entities according to given examples; find structures in big databases; gain new knowledge from the samples
- Input: learning examples with assigned attributes and assigned classes
- Output: a general classifier for the given task
Classifying Training Examples
Training Example for EnjoySport
General Training Examples
Attributes & Classes
- Attribute: Ai; number of different values for Ai: |Ai|
- Class: Ci; number of different classes: |C|
- Premises: n > 2, and consistent examples (no two objects with the same attributes but different classes)
Possible Solutions
- Decision Trees: ID3, C4.5, CART
- Rule Based Systems
- Clustering
- Neural Networks: Backpropagation, Neuroevolution
Decision Trees
Idea: classify entities using if-then rules.

Example: classifying mushrooms. Attributes: Colour, Size, Points. Classes: eatable, poisonous.

Resulting rules:
if (Colour = red) and (Size = small) then poisonous
if (Colour = green) then eatable
…
Colour  Size   Points  Class
red     small  yes     poisonous
brown   small  no      eatable
brown   big    yes     eatable
green   small  no      eatable
red     big    no      eatable
[Decision tree: Colour at the root; red → Size (small → poisonous/1, big → eatable/1); green → eatable/1; brown → eatable/2]
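The tree's rules can be written down directly as code. A minimal sketch (the function name and data encoding are illustrative, not from the slides):

```python
# Mushroom examples from the table above: (colour, size, points) -> class.
EXAMPLES = [
    (("red",   "small", "yes"), "poisonous"),
    (("brown", "small", "no"),  "eatable"),
    (("brown", "big",   "yes"), "eatable"),
    (("green", "small", "no"),  "eatable"),
    (("red",   "big",   "no"),  "eatable"),
]

def classify(colour, size, points):
    """The if-then rules read off the decision tree."""
    if colour == "red" and size == "small":
        return "poisonous"
    return "eatable"

# The rules reproduce every training example:
print(all(classify(*attrs) == cls for attrs, cls in EXAMPLES))  # → True
```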
Decision Trees
There exist different decision trees for the same task. On average, the left tree decides earlier.
[Left tree: Colour at the root; red → Size (small → poisonous/1, big → eatable/1); green → eatable/1; brown → eatable/2]
[Right tree: Size at the root; small → Points (yes → poisonous/1, no → eatable/2); big → eatable/2]
How to measure tree quality?
- Number of leaves? = number of generated rules
- Tree height? = maximum rule length
- External path length? = sum of the lengths of all paths from root to leaf = amount of memory needed for all rules
- Weighted external path length: like external path length, but each path is weighted by the number of objects it represents
Back to the Example
[Left tree: Colour at the root; red → Size (small → poisonous/1, big → eatable/1); green → eatable/1; brown → eatable/2]
[Right tree: Size at the root; small → Points (yes → poisonous/1, no → eatable/2); big → eatable/2]

Criterion                       Left Tree   Right Tree
number of leaves                4           5
height                          2           2
external path length            6           5
weighted external path length   7           8
Weighted External Path Length
Idea from information theory. Given: a text which should be compressed and the probabilities of character occurrence. Result: a coding tree.

Example: text "eeab" with p(e) = 0.5, p(a) = 0.25, p(b) = 0.25. With the codes e = 1, a = 00, b = 01, "eeab" is encoded as 110001.

Build the tree according to the information content.
[Coding tree: root; branch 1 → e; branch 0 → (0 → a, 1 → b)]
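The example can be checked with a few lines of code; the code assignment e = 1, a = 00, b = 01 is one assignment consistent with the stated encoding 110001 (an assumption, since the tree figure did not survive the transcript):

```python
# Prefix code consistent with the example: "eeab" -> "110001".
CODES = {"e": "1", "a": "00", "b": "01"}

def encode(text):
    """Concatenate the code word of each character."""
    return "".join(CODES[ch] for ch in text)

print(encode("eeab"))  # → 110001
```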
Entropy
Entropy = measure of the mean information content.

In general: H(C) = − Σj p(Cj) · log2 p(Cj)

This is the mean number of bits needed to encode each element with an optimal encoding (= mean height of the theoretically optimal encoding tree).
[Plot: entropy as a function of the class probability]
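The entropy formula can be sketched as a small helper (names are illustrative):

```python
from math import log2

def entropy(probs):
    """H = -sum_j p_j * log2(p_j), skipping zero probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Entropy of the "eeab" distribution p(e)=0.5, p(a)=0.25, p(b)=0.25:
print(entropy([0.5, 0.25, 0.25]))  # → 1.5
```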
Information Gain
Information gain = expected reduction of entropy due to sorting on an attribute.

Conditional entropy: H(C|A) = Σi p(ai) · H(C|A = ai)

Information gain: Gain(C, A) = H(C) − H(C|A)
Entropy & Decision Trees

Use conditional entropy and information gain for selecting split attributes.

For a chosen split attribute Ak with possible values a1, …, a|Ak|:
- xi – number of objects with value ai for Ak
- xi,j – number of objects with value ai for Ak and class Cj
- p(ai) = xi / n – probability that one of the objects has attribute value ai
- p(Cj|ai) = xi,j / xi – probability that an object with attribute value ai has class Cj
Decision Tree Construction
Choose the split attribute Ak which gives the highest information gain or, equivalently, the smallest conditional entropy H(C|Ak).

Example: colour

Colour  Size   Points  Class
red     small  yes     poisonous
brown   small  no      eatable
brown   big    yes     eatable
green   small  no      eatable
red     big    no      eatable

H(C|Acolour) = 2/5 · H(C|red) + 2/5 · H(C|brown) + 1/5 · H(C|green) = 2/5 · 1 + 2/5 · 0 + 1/5 · 0 = 0.4
Decision Tree Construction (2)
Analogously:
H(C|Acolour) = 0.4
H(C|Asize) ≈ 0.551
H(C|Apoints) = 0.4

Choose colour or points as the first split criterion, and recursively repeat this procedure.
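The conditional entropies can be recomputed with a short script over the table (a sketch; names are illustrative):

```python
from collections import Counter
from math import log2

# Mushroom examples from the table: (colour, size, points) -> class.
EXAMPLES = [
    (("red",   "small", "yes"), "poisonous"),
    (("brown", "small", "no"),  "eatable"),
    (("brown", "big",   "yes"), "eatable"),
    (("green", "small", "no"),  "eatable"),
    (("red",   "big",   "no"),  "eatable"),
]

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

def conditional_entropy(attr):
    """H(C|A) = sum_i (x_i / n) * H(C | A = a_i)."""
    n = len(EXAMPLES)
    values = Counter(attrs[attr] for attrs, _ in EXAMPLES)  # a_i -> x_i
    h = 0.0
    for value, x_i in values.items():
        classes = Counter(c for attrs, c in EXAMPLES if attrs[attr] == value)
        h += (x_i / n) * entropy([x_ij / x_i for x_ij in classes.values()])
    return h

print(round(conditional_entropy(0), 4))  # colour → 0.4
print(round(conditional_entropy(1), 4))  # size   → 0.551
print(round(conditional_entropy(2), 4))  # points → 0.4
```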
Split on Points:

Points = yes:
Colour  Size   Class
red     small  poisonous
brown   big    eatable

Points = no:
Colour  Size   Class
red     big    eatable
brown   small  eatable
green   small  eatable
Decision Tree Construction (3)
The right side (Points = no) is trivial: all three objects are eatable, so it becomes the leaf eatable/3.

On the left side (Points = yes), both remaining attributes have the same information gain:
Colour  Size   Class
red     small  poisonous
brown   big    eatable

[Resulting tree: Points at the root; no → eatable/3; yes → Colour (red → poisonous/1, brown → eatable/1)]
Generalisation
The classifier should also be able to handle unknown data. The classifying model is often called a hypothesis.

Testing generality: divide the samples into
- a training set
- a validation (or test) set

Learn from the training set; test generality on the validation set.

Error computation: for a test set X and hypothesis h, error(X, h) is a function that is monotonically increasing in the number of examples in X wrongly classified by h.
Overfitting
The learnt hypothesis performs well on the training set but poorly on the validation set.

Formally: h is overfitted if there exists a hypothesis h' with error(D, h) < error(D, h') and error(X, h) > error(X, h'), where D is the training set and X the validation set.
Avoiding Overfitting
Stopping: don't split further if some criterion holds. Examples:
- Size of node n: don't split if n contains fewer examples than some threshold.
- Purity of node n: don't split if the purity gain is not big enough.

Pruning: reduce the decision tree after training. Examples:
- Reduced Error Pruning
- Minimal Cost-Complexity Pruning
- Rule Post-Pruning
Pruning
Pruning notation:
- T – a decision tree; n – a non-leaf node of T
- Tn – the branch (subtree) of T rooted at n
- T/Tn – the tree obtained from T by pruning Tn, i.e. replacing it by a leaf

If T' was produced by (repeated) pruning on T, we call T' a pruned version of T.
Maximum Tree Creation
Before pruning we need a maximum tree Tmax. What is a maximum tree? One in which
- all leaf nodes contain fewer examples than some threshold, or
- all leaf nodes represent only one class, or
- all leaf nodes contain only objects with the same attribute values.

Tmax is then pruned starting from the leaves.
Reduced Error Pruning
1. Consider a branch Tn of T.
2. Replace Tn by a leaf with the class most frequently associated with Tn.
3. If error(X, h(T)) < error(X, h(T/Tn)), take back the decision.
4. Go back to 1 until all non-leaf nodes have been considered.
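The four steps can be sketched as follows. The tree encoding, toy data, and names are assumptions for illustration; pruning a node is modelled by a `pruned` flag rather than rebuilding the tree:

```python
# Reduced Error Pruning sketch. Internal nodes are dicts; leaves are class
# labels. A node with pruned=True behaves like a leaf with its majority class.

def classify(node, example):
    while isinstance(node, dict):
        if node["pruned"]:
            return node["majority"]
        node = node["branches"][example[node["attr"]]]
    return node

def error(tree, validation):
    """Number of wrongly classified validation examples."""
    return sum(classify(tree, x) != c for x, c in validation)

def reduced_error_prune(tree, validation, node=None):
    node = tree if node is None else node
    if not isinstance(node, dict):
        return
    for child in node["branches"].values():   # bottom-up: children first
        reduced_error_prune(tree, validation, child)
    before = error(tree, validation)
    node["pruned"] = True                     # step 2: replace branch by a leaf
    if error(tree, validation) > before:      # step 3: take back if error grows
        node["pruned"] = False

# Toy tree: Colour at the root, a Size subtree under "red".
size_node = {"attr": "size",
             "branches": {"small": "poisonous", "big": "eatable"},
             "majority": "eatable", "pruned": False}
tree = {"attr": "colour",
        "branches": {"red": size_node, "green": "eatable", "brown": "eatable"},
        "majority": "eatable", "pruned": False}
validation = [({"colour": "red", "size": "small"}, "eatable"),
              ({"colour": "green", "size": "big"}, "eatable")]
reduced_error_prune(tree, validation)
print(classify(tree, {"colour": "red", "size": "small"}))  # → eatable
```

Keeping a prune that leaves the validation error unchanged (the `>` comparison) prefers the smaller tree, which matches step 3: the decision is taken back only when the unpruned tree is strictly better.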
Exercise
Fred wants to buy a VW Beetle and classifies all offerings into the classes interesting and uninteresting. Help Fred by creating a decision tree using the ID3 algorithm.

Colour  Year of Construction  Mileage       Class
red     1975                  > 200 000 km  interesting
blue    1980                  > 200 000 km  uninteresting
green   1975                  < 200 000 km  interesting
red     1975                  > 200 000 km  interesting
green   1970                  < 200 000 km  uninteresting
blue    1975                  > 200 000 km  uninteresting
yellow  1970                  < 200 000 km  interesting
Reinforcement Learning: The Idea
A way of programming agents by reward and punishment without specifying how the task is to be achieved
Learning to Balance on a Bicycle

States:
- Angle of the handle bars
- Angular velocity of the handle bars
- Angle of the bicycle to vertical
- Angular velocity of the bicycle to vertical
- Acceleration of the angle of the bicycle to vertical

Actions:
- Torque to be applied to the handle bars
- Displacement of the centre of mass from the bicycle's plane (in cm)
Reward:
- If the angle of the bicycle to vertical is greater than 12°: reward = −1
- Otherwise: reward = 0
Reinforcement Learning: Applications
- Board Games: TD-Gammon, a program based on reinforcement learning, has become a world-class backgammon player
- Control: mobile robots, learning to drive a bicycle, navigation, pole-balancing, acrobot, robot soccer
- Learning to Control Sequential Processes: elevator dispatching
Hans Kleine Büning9 January 2009
33RG Knowledge Based SystemsUniversity of Paderborn
International GraduateSchool of DynamicIntelligent Systems
Deterministic Markov Decision Process
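The slide's formal content did not survive the transcript; for reference, a standard formulation of a deterministic MDP (the notation is an assumption):

```latex
\text{A deterministic Markov decision process is a tuple } (S, A, \delta, r) \text{ with}\\
S \text{ a finite set of states,} \quad A \text{ a finite set of actions,}\\
\delta \colon S \times A \to S \text{ a deterministic transition function,}\\
r \colon S \times A \to \mathbb{R} \text{ a reward function.}\\
\text{A policy is a mapping } \pi \colon S \to A.
```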
Value of Policy and Agent’s Task
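The formulas on this slide are likewise missing from the transcript; the standard definitions they presumably showed (an assumption):

```latex
V^{\pi}(s_t) \;=\; r_t + \gamma\, r_{t+1} + \gamma^2 r_{t+2} + \dots
\;=\; \sum_{i=0}^{\infty} \gamma^i\, r_{t+i}, \qquad 0 \le \gamma < 1,\\[4pt]
\text{and the agent's task is to find } \pi^{*} \;=\; \operatorname*{argmax}_{\pi} V^{\pi}(s) \quad \text{for all } s.
```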
Nondeterministic Markov Decision Process
[Figure: a nondeterministic transition with probabilities P = 0.8, P = 0.1, P = 0.1]
Methods
Model (reward function and transition probabilities) is known:
- discrete states: Dynamic Programming
- continuous states: Value Function Approximation + Dynamic Programming

Model (reward function or transition probabilities) is unknown:
- discrete states: Reinforcement Learning
- continuous states: Value Function Approximation + Reinforcement Learning
Q-learning Algorithm
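The algorithm itself appears only as a figure in the original slides. A tabular Q-learning sketch for a deterministic world, using the update Q(s,a) ← r + γ · max_a' Q(s',a'); the toy corridor environment is an assumption, not the slides' example:

```python
import random

GAMMA = 0.9
N = 4                    # states 0..3; state 3 is the goal (terminal)
ACTIONS = (-1, +1)       # move left / move right

def step(s, a):
    """Deterministic transition; reward 1 only on entering the goal."""
    s2 = min(max(s + a, 0), N - 1)
    return s2, (1.0 if s2 == N - 1 else 0.0)

def q_learning(episodes=200, epsilon=0.2):
    Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != N - 1:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s2, r = step(s, a)
            # deterministic-world update: Q(s,a) <- r + gamma * max_a' Q(s',a')
            Q[(s, a)] = r + GAMMA * max(Q[(s2, a2)] for a2 in ACTIONS)
            s = s2
    return Q

random.seed(0)
Q = q_learning()
print(round(Q[(2, +1)], 3), round(Q[(1, +1)], 3), round(Q[(0, +1)], 3))
```

After convergence the Q-values of the "right" action form the discounted chain 1, γ, γ² = 1, 0.9, 0.81, as the update rule predicts.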
Example
Example: Q-table Initialization
Example: Episode 1
Example: Q-table
Example: Episode 2
Example: Q-table after Convergence
Example: Value Function after Convergence
Example: Optimal Policy
Q-learning
Convergence of Q-learning
Blackjack
- Standard rules of blackjack hold
- State space:
  - element[0] – current value of the player's hand (4-21)
  - element[1] – value of the dealer's face-up card (2-11)
  - element[2] – player does not have a usable ace (0/1)
- Starting states: the player has any 2 cards (uniformly distributed), the dealer has any 1 card (uniformly distributed)
- Actions: HIT, STICK
- Rewards: −1 for a loss, 0 for a draw, 1 for a win
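The state and reward encoding can be sketched as follows (the names and tuple layout are illustrative assumptions):

```python
HIT, STICK = 0, 1  # the two actions

def make_state(player_sum, dealer_card, ace_flag):
    """element[0]: player's hand value (4-21); element[1]: dealer's
    face-up card (2-11); element[2]: the 0/1 ace flag described above."""
    assert 4 <= player_sum <= 21 and 2 <= dealer_card <= 11
    return (player_sum, dealer_card, int(ace_flag))

def reward(outcome):
    """-1 for a loss, 0 for a draw, +1 for a win."""
    return {"loss": -1, "draw": 0, "win": 1}[outcome]

print(make_state(12, 10, True), reward("win"))  # → (12, 10, 1) 1
```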
Blackjack: Optimal Policy
Exercise: