
Introduction to Machine Learning

Laurent Orseau
AgroParisTech
laurent.orseau@agroparistech.fr

EFREI 2010-2011
Based on slides by Antoine Cornuejols


Overview

• Introduction to Induction (Laurent Orseau)
  • Neural Networks
  • Support Vector Machines
  • Decision Trees
• Introduction to Data-Mining (Christine Martin)
  • Association Rules
  • Clustering
  • Genetic Algorithms


Overview: Introduction

• Introduction to Induction
  • Examples of applications
  • Learning types
    • Supervised Learning
    • Reinforcement Learning
    • Unsupervised Learning

• Machine Learning Theory
  • What questions to ask?


Introduction


What is Machine Learning?

• Memory (knowledge acquisition; neurosciences)
  • Short-term (working): keeps 7±2 objects at a time
  • Long-term
    • Procedural
      » Action sequences
    • Declarative
      » Semantic (concepts)
      » Episodic (facts)

• Learning types
  • By heart
  • From rules
  • By imitation / demonstration
  • By trial & error

• Knowledge reuse
  • In similar situations

Introduction


What is Machine Learning?

"The field of study that gives computers the ability to learn without being explicitly programmed"

Arthur Samuel, 1959

Samuel's Checkers → Schaeffer 2007 (solved)
+ TD-Gammon, Tesauro 1992

Introduction


What is Machine Learning?

Given: an experience E, a class of tasks T, and a performance measure P,

A computer is said to learn if

its performance on a task of T

measured by P

increases with experience E

Tom Mitchell, 1997

Introduction


Terms related to Machine Learning

• Robotics / automation: Google cars, Nao

• Prediction / forecasting: stock exchange, pollution peaks, …

• Recognition: faces, language, writing, moves, …

• Optimization: subway speed, traveling salesman, …

• Regulation: heating, traffic, fridge temperature, …

• Autonomy: robots, hand prostheses

• Automatic problem solving
• Adaptation: user preferences, robot in a changing environment
• Induction
• Generalization
• Automatic discovery
• …

Introduction


Some applications


Learning to cook

• Learning by imitation / demonstration
• Procedural learning (motor precision)
• Object recognition

Applications


DARPA Grand Challenge (2005)

Applications


200km of desert

Natural and artificial dangers

No driver

No remote control


Applications > DARPA Grand Challenge


5 Finalists

Applications > DARPA Grand Challenge


Recognition of the road

Applications > DARPA Grand Challenge


Learning to label images: Face recognition

“Face Recognition: Component-based versus Global Approaches” (B. Heisele, P. Ho, J. Wu and T. Poggio), Computer Vision and Image Understanding, Vol. 91, No. 1/2, 6-21, 2003.

Applications


Applications > Image recognition

Feature combinations


Hand prosthesis

• Recognition of pronator and supinator signals
  • Imperfect sensors
  • Noise
  • Uncertainty

Applications


Autonomous robot rover on Mars

Applications


Supervised Learning

Learning by heart? Unexploitable → generalize

How to encode forms (e.g., the character "b")?


Introduction to Machine Learning Theory


Introduction to Machine Learning theory

• Supervised Learning

• Reinforcement Learning

• Unsupervised Learning (CM)

• Genetic Algorithms (CM)


Supervised Learning

• Set of examples xi labeled ui

• Find a hypothesis h so that:

h(xi) = ui ?

h(xi): predicted label

• Best hypothesis h* ?


Supervised Learning: 1st Example

• Houses: price / m²

• Searching for h: nearest neighbors? linear or polynomial regression? (see the sketch below)

• More information: localization ((x, y)? or a symbolic variable?), age of building, neighborhood, swimming-pool, local taxes, temporal evolution, …?
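A minimal sketch of the two hypothesis spaces mentioned above, on invented toy data (surfaces and prices per m² are hypothetical, as is the variable naming): a 1-nearest-neighbor predictor versus a least-squares linear fit.

import numpy as np

# Hypothetical toy data: surface in m^2 -> price per m^2 (invented values).
X = np.array([30.0, 45.0, 60.0, 80.0, 100.0, 120.0])
y = np.array([9000., 8500., 8200., 7800., 7500., 7300.])

def nearest_neighbor(x_new):
    """Predict with the label of the closest training example."""
    i = np.argmin(np.abs(X - x_new))
    return y[i]

# Linear hypothesis space h(x) = a*x + b, parameters fitted by least squares.
a, b = np.polyfit(X, y, deg=1)

x_new = 70.0
print("1-NN prediction:   ", nearest_neighbor(x_new))
print("Linear prediction: ", a * x_new + b)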

Supervised Learning


Problem

Predicting the price per m² of a given house.

1) Modeling

2) Data gathering

3) Learning

4) Validation

5) Use in real case

Supervised Learning

[Diagram: ideal (sequential) process vs. practice (iterating back over the steps)]


1) Modeling

• Input space: what is the meaningful information? Variables

• Output space: what is to be predicted?

• Hypothesis space: Input → (computation) → Output. What (kind of) computation?

Supervised Learning


1-a) Input space: Variables

• What is the meaningful information?
• Should we get as much as possible?
• Information quality?
  • Noise
  • Quantity

• Cost of information gathering? Economic, time, risk (invasive?), ethics, law (CNIL)

• Definition domain of each variable? Symbolic, bounded numeric, unbounded, etc.

Supervised Learning > 1) Modeling


Price per m²: Variables

• Localization: continuous (x, y) longitude/latitude? symbolic: city name?

• Age of building: year of construction? relative to the present or to the construction date?

• Nature of soil

• Swimming-pool?

Supervised Learning > 1) Modeling > a) Variables


1-b) Output space

• What do we want as output?
  • Symbolic classes? (classification)
    • Boolean Yes/No (concept learning)
    • Multi-valued A/B/C/D/…
  • Numeric? (regression)
    • [0 ; 1]?
    • [-∞ ; +∞]?

• How many outputs? Multi-valued, multi-class?
  • 1 output for each class: learn a model for each output?
  • More "free": learn 1 model for all outputs?
    • Each model can use the others' information

Supervised Learning > 1) Modeling


1-c) Hypothesis space

• Critical!

• Depends on the learning algorithm
  • Linear regression: space of h(x) = ax + b
    • Parameters: a and b
  • Polynomial regression
    • # parameters = polynomial degree
  • Neural Networks, SVM, Genetic Algorithms, …

Supervised Learning > 1) Modeling


Choice of hypothesis space

[Figure: total error decomposed into approximation error and estimation error]


Choice of hypothesis space

• Space too "poor": inadequate solutions. Ex: modeling sin(x) with y = ax + b

• Space too "rich": risk of overfitting
• Defined by a set of parameters
  • High # of parameters → learning more difficult

• But prefer a richer hypothesis space!
  • Use generic methods
  • Add regularization

Supervised Learning > 1) Modeling > c) Hypothesis space


2) Data gathering

• Gathering: electronic sensors, simulation, polls, automated collection on the Internet, …

• Get the highest quantity of data (vs. collection cost)

• Data as "pure" as possible: avoid all noise
  • Noise in variables
  • Noise in labels!

• 1 example = 1 value for each variable
  • Missing value = useless example?

Supervised Learning


Gathered data

            x1     x2     x3       u
Example 1   Yes    1.5    Green    -
Example 2   No     1.4    Orange   +
Example 3   Yes    3.7    Orange   -
…           …      …      …        …

x1, x2, x3: inputs / measured variables
u: output / class / label
But the true label y is unreachable!

Supervised Learning > 2) Data gathering


Data preprocessing

• Clean up data. Ex: reduce background noise

• Transform data into a final format adapted to the task. Ex: Fourier transform of a radio signal (see the sketch below):
  time/amplitude → frequency/amplitude
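A minimal preprocessing sketch on a hypothetical signal (the sampling rate, tone frequency, and noise level are invented for illustration), using NumPy's FFT to go from a time/amplitude representation to a frequency/amplitude one.

import numpy as np

# Hypothetical raw signal: 1 second sampled at 1 kHz, a 50 Hz tone plus noise.
fs = 1000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 50 * t) + 0.3 * np.random.randn(fs)

# Fourier transform: time/amplitude -> frequency/amplitude.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

print("Dominant frequency:", freqs[np.argmax(spectrum)], "Hz")  # ~50 Hz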

Supervised Learning > 2) Data gathering


3) Learning

a) Choice of program parameters

b) Choice of inductive test

c) Running the learning program

d) Performance test

If bad, return to a)…

Supervised Learning


a) Choice of program parameters

• Max allocated computation time

• Max accepted error

• Learning parameters (specific to the model)

• Knowledge introduction: initialize parameters to "ok" values?

• …

Supervised Learning > 3) Learning


b) Choice of inductive test

Goal: find the hypothesis h ∈ H minimizing the real risk (risk expectancy, generalization error):

R(h) = ∫_{X×Y} ℓ(h(x), y) dP(x, y)

• h(x): predicted label
• y: true label (or u: desired label)
• ℓ: loss function
• P: joint probability law over X × Y

Supervised Learning > 3) Learning


Real risk

• Goal: Minimize real risk

• Real risk is not known, in particular P(X,Y).

Supervised Learning > 3) Learning > b) Inductive test

• Discrimination (0/1 loss):
  ℓ(h(xi), ui) = 0 if ui = h(xi), 1 if ui ≠ h(xi)

• Regression (quadratic loss):
  ℓ(h(xi), ui) = (h(xi) − ui)²

R(h) = ∫_{X×Y} ℓ(h(x), y) dP(x, y)


Empirical Risk Minimization

• ERM principle: find h ∈ H minimizing the empirical risk

• Least error on the training set

R_Emp(h) = (1/m) Σ_{i=1}^{m} ℓ(h(xi), ui)

Supervised Learning > 3) Learning > b) Inductive test
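A minimal sketch of the empirical risk above, on hypothetical data and a hand-picked hypothesis h (all values and names are invented for illustration); both the quadratic and the 0/1 loss are shown.

import numpy as np

# Hypothetical labeled sample (x_i, u_i) and a fixed hypothesis h.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
u = np.array([0.1, 0.9, 2.2, 2.8, 4.1])
h = lambda x: 1.05 * x          # some hypothesis from a linear space

def empirical_risk(h, X, u, loss):
    """Mean loss over the training set."""
    return np.mean([loss(h(x), ui) for x, ui in zip(X, u)])

quadratic = lambda pred, true: (pred - true) ** 2            # regression loss
zero_one  = lambda pred, true: 0.0 if pred == true else 1.0  # discrimination loss

print("Quadratic empirical risk:", empirical_risk(h, X, u, quadratic))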


Learning curve

• Data quantity is important!

[Figure: learning curve — "error" as a function of training set size]

Supervised Learning > 3) Learning > b) Inductive test > Empirical risk
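A sketch of how such a learning curve can be traced, on a synthetic regression problem (the data generator and subset sizes are assumptions for illustration): fit on growing subsets of the training set and measure the error on held-out data.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression problem: y = 3x + 1 + noise.
X = rng.uniform(0, 10, 200)
y = 3 * X + 1 + rng.normal(0, 2, 200)
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

for m in [5, 10, 20, 50, 100, 150]:
    a, b = np.polyfit(X_train[:m], y_train[:m], deg=1)     # fit on the first m examples
    test_error = np.mean((a * X_test + b - y_test) ** 2)   # "error" on held-out data
    print(f"m = {m:3d}  test error = {test_error:.2f}")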


Test / Validation

• Measures overfitting / generalization: can the acquired knowledge be reused in new circumstances? Do NOT validate on the training set!

• Validation on an additional test set

• Cross-validation: useful when there is little data (leave-p-out); see the sketch below

Supervised Learning > 3) Learning > b) Inductive test > Empirical risk
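A minimal cross-validation sketch (toy data and a 1-nearest-neighbor predictor, both invented for illustration): leave-one-out, i.e. the p = 1 case of leave-p-out, where each example is held out once and predicted from the others.

import numpy as np

# Toy labeled data (hypothetical values).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
u = np.array([0,   0,   0,   1,   1,   1  ])

def predict_1nn(X_train, u_train, x_new):
    """Predict with the label of the nearest training example."""
    return u_train[np.argmin(np.abs(X_train - x_new))]

# Leave-one-out: each example is held out once and predicted from the rest.
errors = []
for i in range(len(X)):
    keep = np.arange(len(X)) != i
    pred = predict_1nn(X[keep], u[keep], X[i])
    errors.append(pred != u[i])

print("Leave-one-out error rate:", np.mean(errors))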


Overfitting

[Figure: real risk vs. empirical risk as a function of data quantity; the gap between them is overfitting]

Supervised Learning > 3) Learning > b) Inductive test > Empirical risk


Regularization

• Limit overfitting before measuring it on the test set

• Add a penalization term to the inductive test (see the sketch below). Ex:
  • Penalize a large number of parameters
  • Penalize resource use
  • …

Supervised Learning > 3) Learning > b) Inductive test > Empirical risk
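A sketch of a penalized inductive test, under assumptions not in the slide: an L2 penalty on the parameters (ridge-style) with a hypothetical weight lambda, applied to a deliberately rich degree-9 polynomial model on toy data. Larger lambda shrinks the parameters and raises the training error, which is the intended trade-off.

import numpy as np

rng = np.random.default_rng(1)
X = np.linspace(0, 1, 20)
u = np.sin(2 * np.pi * X) + rng.normal(0, 0.1, 20)   # noisy toy targets

# Rich hypothesis space: degree-9 polynomial, features phi(x) = [1, x, ..., x^9].
Phi = np.vander(X, 10, increasing=True)

def fit_penalized(lam):
    """Minimize empirical quadratic risk + lam * ||w||^2 (normal equations)."""
    d = Phi.shape[1]
    A = Phi.T @ Phi + lam * np.eye(d)
    return np.linalg.lstsq(A, Phi.T @ u, rcond=None)[0]

for lam in [0.0, 1e-3, 1.0]:
    w = fit_penalized(lam)
    train_error = np.mean((Phi @ w - u) ** 2)
    print(f"lambda = {lam:6}: train error = {train_error:.4f}, ||w|| = {np.linalg.norm(w):.1f}")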


Maximum a posteriori

• Bayesian approach
• We suppose there exists a prior probability distribution over the space H: p_H(h)

• Maximum A Posteriori (MAP) principle: search for the most probable h after observing the data S

• Ex: observation of sheep colors, h = "A sheep is white" (see the sketch below)

Supervised Learning > 3) Learning > b) Inductive test
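A toy MAP sketch for the sheep example: pick the hypothesis with the highest posterior p(h | S) ∝ p(S | h) · p_H(h). The two hypotheses, their priors, and the likelihood model are all invented for illustration.

# Two competing hypotheses about sheep color (priors invented for illustration).
prior = {"all sheep are white": 0.7, "5% of sheep are black": 0.3}

def likelihood(h, observations):
    """Probability of the observed colors under each hypothesis."""
    p_black = 0.0 if h == "all sheep are white" else 0.05
    p = 1.0
    for color in observations:
        p *= p_black if color == "black" else (1.0 - p_black)
    return p

S = ["white"] * 20 + ["black"]   # observed data: 20 white sheep, then 1 black one

posterior = {h: likelihood(h, S) * prior[h] for h in prior}   # unnormalized p(h | S)
h_map = max(posterior, key=posterior.get)
print("MAP hypothesis:", h_map)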


Minimum Description Length Principle

• Occam's Razor: "Prefer the simplest hypotheses"

• Simplicity: size of h → maximum compression

• Maximum a posteriori with p_H(h) = 2^(−d(h))
• d(h): length in bits of h

• Compression ⇒ generalization

Supervised Learning > 3) Learning > b) Inductive test
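A rough sketch of the MDL criterion as a two-part code: choose the h minimizing d(h) + d(data | h), which matches MAP with the prior p_H(h) = 2^(−d(h)). The hypotheses and their description lengths below are purely hypothetical numbers.

# Hypothetical two-part description lengths, in bits (illustrative numbers only).
hypotheses = {
    # d_h: bits to encode the hypothesis; d_data_given_h: bits to encode the data given h
    "h1: low-degree polynomial": {"d_h": 30, "d_data_given_h": 180},
    "h2: memorize every example": {"d_h": 400, "d_data_given_h": 0},
}

# MDL principle: prefer the hypothesis with the shortest total description.
def total_length(entry):
    return entry["d_h"] + entry["d_data_given_h"]

best = min(hypotheses, key=lambda h: total_length(hypotheses[h]))
for h, entry in hypotheses.items():
    print(f"{h}: {total_length(entry)} bits  (prior p_H = 2^-{entry['d_h']})")
print("MDL choice:", best)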


c) Running the learning program

• Search for h

• Use the examples of the training set: one by one, or all together

• Minimize the inductive test

Supervised Learning > 3) Learning


Finding the parameters of the model

• Explore the hypothesis space H: which hypothesis is best given the inductive test? Fundamentally depends on H

a) Structured exploration

b) Local exploration

c) No exploration

Supervised Learning > 3) Learning > c) Running the program


Structured exploration

• Structured by a generality relation (partial order)
  • Version space
  • ILP (Inductive Logic Programming)
  • EBL (Explanation-Based Learning)
  • Grammatical inference
  • Program enumeration

[Diagram: two hypotheses hi and hj in H, with gms(hi, hj) and smg(hi, hj)]

Supervised Learning > 3) Learning > c) Running the program > Exploring H


Representation of the version space

Structured by:
• Upper bound: G-set = set of all most general hypotheses consistent with the known examples
• Lower bound: S-set = set of all most specific hypotheses consistent with the known examples

[Diagram: hypothesis space H with the G-set, the S-set, and hypotheses hi, hj between them]

Supervised Learning > 3) Learning > c) Running the program > Exploring H


Learning…

… by iterated updates of the version space

Idea: update the S-set and the G-set after each new example

Candidate elimination algorithm

Example: rectangles (cf. blackboard…)

Supervised Learning > 3-c) > Exploring H > Version space


Candidate Elimination algorithm

Initialize S (resp. G):
  set of the most specific (resp. most general) hypotheses consistent with the 1st example

For each new example (+ or -)

update S

update G

Until convergence

or until S = G = Ø

Supervised Learning > 3-c) > Exploring H > Version space


Updating S and G: xi is positive

• Updating S: generalize the hypotheses in S that do not cover xi, just enough to cover it.
  Then eliminate the hypotheses in S
  • covering one or more negative examples
  • more general than another hypothesis in S

• Updating G: eliminate the hypotheses in G that do not cover xi

Supervised Learning > 3-c) > Exploring H > Version space


Updating S and G: xi is negative

• Updating S: eliminate the hypotheses in S (wrongly) covering xi

• Updating G: specialize the hypotheses in G covering xi, just enough not to cover it.
  Then eliminate the hypotheses in G
  • not more general than at least one element of S
  • more specific than at least one other hypothesis of G

(a code sketch of the full algorithm follows below)

Supervised Learning > 3-c) > Exploring H > Version space
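A sketch of the candidate elimination algorithm, in a simpler setting than the rectangles example from the slides: conjunctive hypotheses over nominal attributes, where each attribute is either constrained to one value or left free ("?"). The attributes, domains, and examples are invented, and the first example is assumed positive, matching the initialization rule above.

import numpy as np

def covers(h, x):
    """A conjunctive hypothesis covers x if every constraint matches (or is '?')."""
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general(h1, h2):
    """h1 is more general than (or equal to) h2."""
    return all(c1 == "?" or c1 == c2 for c1, c2 in zip(h1, h2))

def candidate_elimination(examples, domains):
    """Version-space sketch; assumes the first example is positive."""
    (x0, label0), rest = examples[0], examples[1:]
    assert label0 == "+"
    S = [tuple(x0)]                      # most specific consistent with 1st example
    G = [tuple("?" for _ in domains)]    # most general hypothesis
    negatives = []

    for x, label in rest:
        if label == "+":
            G = [g for g in G if covers(g, x)]                 # drop g not covering x
            S = [tuple(s_i if s_i == x_i else "?"              # minimal generalization
                       for s_i, x_i in zip(s, x)) for s in S]
            S = [s for s in S if not any(covers(s, n) for n in negatives)]
        else:
            negatives.append(x)
            S = [s for s in S if not covers(s, x)]             # drop s covering x
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)
                    continue
                for i, value in enumerate(x):                  # minimal specializations
                    if g[i] != "?":
                        continue
                    for v in domains[i]:
                        if v == value:
                            continue
                        spec = g[:i] + (v,) + g[i + 1:]
                        if any(more_general(spec, s) for s in S):
                            new_G.append(spec)
            # keep only the maximally general members of G
            G = [g for g in new_G if not any(h != g and more_general(h, g) for h in new_G)]
    return S, G

# Toy attributes: (color, size), with small hypothetical domains.
domains = [("red", "green"), ("small", "big")]
examples = [(("red", "small"), "+"), (("green", "big"), "-"), (("red", "big"), "+")]
S, G = candidate_elimination(examples, domains)
print("S-set:", S)
print("G-set:", G)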


Candidate Elimination Algorithm

Updating S and G

[Diagram: successive updates of the S-set and G-set in H as positive and negative examples arrive]

Supervised Learning > 3-c) > Exploring H > Version space


Local exploration

• Only a neighborhood notion in H: "gradient" methods
  • Neural Networks
  • SVM
  • Simulated annealing / simulated evolution

• /!\ Local minima (see the sketch below)

Supervised Learning > 3) Learning > c) Running the program > Exploring H

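A minimal gradient-descent sketch of local exploration of a parameterized hypothesis space (toy data, a linear hypothesis h(x) = a·x + b, and a hypothetical step size): each step moves to a neighbor of the current parameters that lowers the empirical quadratic risk. On a non-convex objective, this kind of search can get stuck in a local minimum.

import numpy as np

# Toy data and a linear hypothesis h(x) = a*x + b, parameters theta = (a, b).
X = np.array([0.0, 1.0, 2.0, 3.0])
u = np.array([1.0, 2.9, 5.1, 7.0])

def risk_gradient(theta):
    """Gradient of the empirical quadratic risk w.r.t. (a, b)."""
    a, b = theta
    residual = a * X + b - u
    return np.array([np.mean(2 * residual * X), np.mean(2 * residual)])

theta = np.zeros(2)          # start from some point of H
lr = 0.05                    # step size (hypothetical choice)
for step in range(500):
    theta -= lr * risk_gradient(theta)   # move to a neighbor that lowers the risk

print("a, b =", theta)       # close to the least-squares solution (~2, ~1)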


Exploration without hypothesis space

• No hypothesis space: use the examples directly, and the example space
  • Nearest Neighbors methods (Case-Based Reasoning / Instance-based learning)
  • Notion of distance

• Example: k Nearest Neighbors; optional: vote weighted by distance (see the sketch below)

Supervised Learning > 3) Learning > c) Running the program > Exploring H
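A minimal k-nearest-neighbors sketch on hypothetical 2-D examples, with the optional vote weighted by (inverse) distance.

import numpy as np
from collections import Counter

# Toy 2-D labeled examples (hypothetical values).
X = np.array([[1.0, 1.0], [1.2, 0.8], [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]])
labels = np.array(["-", "-", "+", "+", "+"])

def knn_predict(x_new, k=3, weighted=False):
    """k nearest neighbors by Euclidean distance, optional distance-weighted vote."""
    dists = np.linalg.norm(X - x_new, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter()
    for i in nearest:
        votes[labels[i]] += 1.0 / (dists[i] + 1e-9) if weighted else 1.0
    return votes.most_common(1)[0][0]

print(knn_predict(np.array([2.5, 2.5])))                 # plain majority vote
print(knn_predict(np.array([2.5, 2.5]), weighted=True))  # vote weighted by distance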


Inductive bias

• A priori preference for some hypotheses: depends on H and on the search algorithm

• Whatever the inductive test:
  • ERM: implicit in H
  • MAP: explicit, chosen by the user
  • MDL: explicit (length in bits)
  • Nearest neighbors (PPV): notion of distance

• What justification?

Supervised Learning


Supervised Learning

Less frequent learning types


Incremental Learning

• Examples are given/taken one after the other
  • Incremental update of the best hypothesis
  • Use the acquired knowledge to
    • learn better
    • learn faster

• Data is no longer i.i.d.! (i.i.d.: independent and identically distributed = sampled uniformly from a non-changing example generator)
  • Dependence on time / sequence

• Ex: mobile phone users' preferences, learning to program, …

Supervised Learning


Active Learning

• Set of unlabeled examples; labeling an example is expensive

• Choose an example to be labeled: how to choose?

• Data is not i.i.d.

• Ex: video sequence labeling

Supervised Learning


Other types of Machine Learning

Reinforcement Learning

Unsupervised Learning


Reinforcement Learning

• Pavlov
  • Bell: trigger
  • Dog bowl: reward
  • Salivating: action
  • Association bell ↔ bowl
  • Reinforcement of "salivation"

[Diagram: agent ↔ environment loop — perception, action, reward/punishment]

• Control behavior with rewards/punishments


Reinforcement Learning

• The agent must discover the right behavior and optimize it: maximize the expected reward

st: state at time t
rt: reward received at time t

• Action selection: at := argmax_a Q(st, a)

• Updating values:
  Q(st, at) ← α Q(st, at) + (1 − α) [ rt+1 + γ max_a Q(st+1, a) ]
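A minimal tabular Q-learning sketch of the action-selection and update rules above, on an invented toy chain environment; α, γ, ε and the environment itself are assumptions for illustration, and the update follows the slide's α / (1 − α) convention.

import numpy as np

# Toy chain environment: states 0..4, actions 0 = left, 1 = right.
# Reaching state 4 gives reward 1; every other step gives reward 0.
n_states, n_actions = 5, 2

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # hypothetical learning parameters
rng = np.random.default_rng(0)

for episode in range(200):
    s = 0
    for t in range(100):                 # cap episode length
        # epsilon-greedy version of a_t := argmax_a Q(s_t, a), with random tie-breaking
        greedy_actions = np.flatnonzero(Q[s] == Q[s].max())
        a = rng.integers(n_actions) if rng.random() < epsilon else int(rng.choice(greedy_actions))
        s_next, r = step(s, a)
        # Q(s_t, a_t) <- alpha * Q(s_t, a_t) + (1 - alpha) * [r_{t+1} + gamma * max_a Q(s_{t+1}, a)]
        Q[s, a] = alpha * Q[s, a] + (1 - alpha) * (r + gamma * np.max(Q[s_next]))
        s = s_next
        if s == n_states - 1:
            break

print(np.round(Q, 2))   # "go right" should dominate in every state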


Unsupervised Learning

• No class, no output, no reward
• Goal: group similar examples together (see the sketch below)

• Notion of distance
• Inductive bias
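A minimal clustering sketch (plain k-means on invented 2-D data) showing how a distance notion alone lets us group similar examples without any labels; the number of clusters k = 2 is itself an inductive bias.

import numpy as np

rng = np.random.default_rng(0)

# Toy unlabeled data: two blobs in 2-D (hypothetical values).
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(4, 0.5, (20, 2))])

# Plain k-means: group similar examples using Euclidean distance.
k = 2
centers = X[rng.choice(len(X), k, replace=False)]
for _ in range(10):
    # assign each example to its nearest center
    assign = np.argmin(np.linalg.norm(X[:, None] - centers[None, :], axis=2), axis=1)
    # move each center to the mean of its group
    centers = np.array([X[assign == j].mean(axis=0) for j in range(k)])

print("Cluster centers:\n", np.round(centers, 2))   # near (0, 0) and (4, 4)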


Conclusion

• Induction: find a general hypothesis from examples

• Avoid overfitting
• Choose the right hypothesis space
  • Not too small (bad induction)
  • Not too large (overfitting)

• Use an algorithm adequate for the data and for the hypothesis space


What to remember

• Supervised learning is the most studied setting

• Learning is always biased

• Learning depends on the structure of the hypothesis space
  • No structure: interpolation methods
  • Local structure: gradient methods (approximation)
  • Partial order relation: guided exploration (exploration)