I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine...

27
1 Machine Learning – Introduction I Why Machine Learning? I What is a well-defined learning problem? I An example: learning to play checkers I What questions should we ask about Machine Learning?

Transcript of I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine...

Page 1: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

1 Machine Learning – Introduction

I Why Machine Learning?

I What is a well-defined learning problem?

I An example: learning to play checkers

I What questions should we ask about Machine Learning?

Page 2: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

1 Machine Learning – Introduction

I Why Machine Learning?

I What is a well-defined learning problem?

I An example: learning to play checkers

I What questions should we ask about Machine Learning?

I Tom M. Mitchell: Machine Learning. McGraw-Hill: 1997.

1 Machine Learning – Introduction (4th April 2006) 1

Page 3: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Why Machine Learning?

I Recent progress in algorithms and theory

I Growing flood of online data

I Computational power is available

I Budding industry

Page 4: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Why Machine Learning?

I Recent progress in algorithms and theory

I Growing flood of online data

I Computational power is available

I Budding industry

I Niches for machine learning:

. Data mining: using historical data to improve decisions

. medical records medical knowledge

. learning to program: we can’t program everything by hand

. autonomous driving

. speech recognition

. Self customizing programs

. Newsreader that learns user interests

1 Machine Learning – Introduction (4th April 2006) 2

Page 5: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Typical Datamining Task

I Data:Patient103 Patient103Patient103 ...time=1 time=2 time=n

Age: 23FirstPregnancy: noAnemia: noDiabetes: noPreviousPrematureBirth: no

...

Elective C−Section: ?Emergency C−Section: ?

Age: 23FirstPregnancy: noAnemia: no

PreviousPrematureBirth: noDiabetes: YES

...Emergency C−Section: ?

Ultrasound: abnormal

Elective C−Section: no

Age: 23FirstPregnancy: noAnemia: no

PreviousPrematureBirth: no

...

Elective C−Section: no

Ultrasound: ?

Diabetes: no

Emergency C−Section: Yes

Ultrasound: ?

I Given:

. 9714 patient records, each describing a pregnancy and birth

. Each patient record contains 215 features

I Learn to predict:

. Classes of future patients at high risk for Emergency Cesarean Section

1 Machine Learning – Introduction (4th April 2006) 3

Page 6: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Datamining ResultI Data:

Patient103 Patient103Patient103 ...time=1 time=2 time=n

Age: 23FirstPregnancy: noAnemia: noDiabetes: noPreviousPrematureBirth: no

...

Elective C−Section: ?Emergency C−Section: ?

Age: 23FirstPregnancy: noAnemia: no

PreviousPrematureBirth: noDiabetes: YES

...Emergency C−Section: ?

Ultrasound: abnormal

Elective C−Section: no

Age: 23FirstPregnancy: noAnemia: no

PreviousPrematureBirth: no

...

Elective C−Section: no

Ultrasound: ?

Diabetes: no

Emergency C−Section: Yes

Ultrasound: ?

I One of 18 learned rules:

If No previous vaginal delivery, andAbnormal 2nd Trimester Ultrasound, andMalpresentation at admission

Then Probability of Emergency C-Section is 0.6

Over training data: 26/41 = .63,Over test data: 12/20 = .60

1 Machine Learning – Introduction (4th April 2006) 4

Page 7: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Credit Risk AnalysisI Data:

Customer103: Customer103: Customer103:(time=t0) (time=t1) (time=tn)...

...

Own House: Yes

Other delinquent accts: 2

Loan balance: $2,400Income: $52k

Max billing cycles late: 3

Years of credit: 9

Profitable customer?: ?

...

Own House: Yes

Years of credit: 9

Profitable customer?: ?

...

Own House: Yes

Years of credit: 9Loan balance: $3,250Income: ?

Other delinquent accts: 2Max billing cycles late: 4

Loan balance: $4,500Income: ?

Other delinquent accts: 3Max billing cycles late: 6Profitable customer?: No

I Rules learned from synthesized data:

If Other-Delinquent-Accounts > 2, andNumber-Delinquent-Billing-Cycles > 1

Then Profitable-Customer\TR{?} = No[Deny Credit Card application]

If Other-Delinquent-Accounts = 0, and(Income > $30k) OR (Years-of-Credit > 3)

Then Profitable-Customer\TR{?} = Yes[Accept Credit Card application]

1 Machine Learning – Introduction (4th April 2006) 5

Page 8: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Other Prediction Problems

I Customer purchase behavior:

Customer103: Customer103: Customer103:(time=t0) (time=t1) (time=tn)...

...

Sex: MAge: 53Income: $50kOwn House: Yes

MS Products: WordComputer: 386 PC

Purchase Excel?: ?

...

Sex: MAge: 53Income: $50kOwn House: Yes

MS Products: Word

...

Sex: MAge: 53Income: $50kOwn House: Yes

Purchase Excel?: ?

MS Products: WordComputer: Pentium Computer: Pentium

Purchase Excel?: Yes

Page 9: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Other Prediction Problems

I Customer purchase behavior:

Customer103: Customer103: Customer103:(time=t0) (time=t1) (time=tn)...

...

Sex: MAge: 53Income: $50kOwn House: Yes

MS Products: WordComputer: 386 PC

Purchase Excel?: ?

...

Sex: MAge: 53Income: $50kOwn House: Yes

MS Products: Word

...

Sex: MAge: 53Income: $50kOwn House: Yes

Purchase Excel?: ?

MS Products: WordComputer: Pentium Computer: Pentium

Purchase Excel?: Yes

I Process optimization:

(time=t0) (time=t1) (time=tn)...Product72: Product72: Product72:

...

Viscosity: 1.3

... ...

Viscosity: 1.3

Product underweight?: ?? Product underweight?:

Viscosity: 3.2

Yes

Fat content: 15%

Stage: mixMixing−speed: 60rpm

Density: 1.1

Stage: cookTemperature: 325

Fat content: 12%

Density: 1.2

Stage: coolFan−speed: medium

Fat content: 12%

Spectral peak: 3200Density: 2.8Spectral peak: 2800 Spectral peak: 3100

Product underweight?: ??

1 Machine Learning – Introduction (4th April 2006) 6

Page 10: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Problems Too Difficult to Program by Hand

I ALVINN [Pomerleau] drives 70 mph on highways

Sharp Left

SharpRight

4 Hidden Units

30 Output Units

30x32 Sensor Input Retina

Straight Ahead

1 Machine Learning – Introduction (4th April 2006) 7

Page 11: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Software that Customizes to User

1 Machine Learning – Introduction (4th April 2006) 8

Page 12: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Where Is this Headed?

I Today: tip of the iceberg

. First-generation algorithms: neural nets, decision trees, regression, etc.

. Applied to well-formated database

. Budding industry

I Opportunity for tomorrow: enormous impact

. Learn across full mixed-media data

. Learn across multiple internal databases, plus the web and newsfeeds

. Learn by active experimentation

. Learn decisions rather than predictions

. Cumulative, lifelong learning

. Programming languages with embedded learning

1 Machine Learning – Introduction (4th April 2006) 9

Page 13: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Relevant Disciplines

I Artificial intelligence

I Bayesian methods

I Computational complexity theory

I Control theory

I Information theory

I Philosophy

I Psychology and neurobiology

I Statistics

I . . .

1 Machine Learning – Introduction (4th April 2006) 10

Page 14: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

What is the Learning Problem?

I Learning = Improving with experience at some task.

. Improve over task T ,

. with respect to performance measure P ,

. based on experience E.

I E.g., Learn to play checkers:

. T : play checkers,

. P : percentage of games won in world tournament,

. E: opportunity to play against self.

I E.g., Learning to drive:

. T : driving on public four-lane highway using vision sensors,

. P : average distance traveled before an error,

. E: a sequence of images and steering commandsrecorded while observing a human driver.

1 Machine Learning – Introduction (4th April 2006) 11

Page 15: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Learning to Play Checkers

I T : play checkers

I P : percentage of games won in world tournament

Page 16: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Learning to Play Checkers

I T : play checkers

I P : percentage of games won in world tournament

I What experience?

I What exactly should be learned?

I How shall it be represented?

I What specific algorithm to learn it?

1 Machine Learning – Introduction (4th April 2006) 12

Page 17: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Type of Training Experience

I Direct or indirect?

I The problem of credit assignment.

I Teacher or not?

Page 18: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Type of Training Experience

I Direct or indirect?

I The problem of credit assignment.

I Teacher or not?

I Problem is training experience representative of performance goal?

1 Machine Learning – Introduction (4th April 2006) 13

Page 19: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Choose the Target Function

I ChooseMove : Board→Move ?

I V : Board→ R ?

I ...

1 Machine Learning – Introduction (4th April 2006) 14

Page 20: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Possible Definition for Target Function V

I if b is a final board state that is won, then V (b) = 100

I if b is a final board state that is lost, then V (b) = −100

I if b is a final board state that is drawn, then V (b) = 0

I if b is a not a final state in the game, then V (b) = V (b′), where b′ is the bestfinal board state that can be achieved starting from b and playing optimally untilthe end of the game.

. This gives correct values, but is not operational.

. Ultimate goal Find an operational description of the ideal target function V .

. But we can often only acquire some approximation V̂ .

1 Machine Learning – Introduction (4th April 2006) 15

Page 21: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Choose Representation for Target Function

I Collection of rules?

I Neural networks ?

I Polynomial function of board features?

I etc.

1 Machine Learning – Introduction (4th April 2006) 16

Page 22: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

A Representation for a Learned Function

w0 + w1 · bp(b) + w2 · rp(b) + w3 · bk(b) + w4 · rk(b) + w5 · bt(b) + w6 · rt(b)

I bp(b): number of black pieces on board b

I rp(b): number of red pieces on b

I bk(b): number of black kings on b

I rk(b): number of red kings on b

I bt(b): number of red pieces threatened by black(i.e., which can be taken on black’s next turn)

I rt(b): number of black pieces threatened by red

1 Machine Learning – Introduction (4th April 2006) 17

Page 23: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Obtaining Training Examples

I V (b): the true target function

I V̂ (b): the learned function

I Vtrain(b): the training value

Page 24: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Obtaining Training Examples

I V (b): the true target function

I V̂ (b): the learned function

I Vtrain(b): the training value

I One rule for estimating training values:

. Vtrain(b)← V̂ (Successor(b))

1 Machine Learning – Introduction (4th April 2006) 18

Page 25: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Choose Weight Tuning Rule

I Least mean square weight update rule:

I Do repeatedly:

. Select a training example b at random

. Compute error(b):

error(b) = Vtrain(b)− V̂ (b)

. For each board feature fi, update weight wi:

wi ← wi + c · fi · error(b)

I c is some small constant, say 0.1, to moderate the rate of learning

1 Machine Learning – Introduction (4th April 2006) 19

Page 26: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Design Choices

DetermineTarget Function

Determine Representationof Learned Function

Determine Type of Training Experience

DetermineLearning Algorithm

Games against self

Games against experts Table of correct

moves

Linear functionof six features

Artificial neural network

Polynomial

Gradient descent

Board Ý value

BoardÝ move

Completed Design

...

...

Linear programming

...

...

1 Machine Learning – Introduction (4th April 2006) 20

Page 27: I Why Machine Learning? - Computational Logic · 1 Machine Learning – Introduction IWhy Machine Learning? IWhat is a well-defined learning problem? IAn example: learning to play

Some Issues in Machine Learning

I What algorithms can approximate functions well (and when)?

I How does the number and distribution of the training examplesinfluence the accuracy?

I How does the complexity of the hypothesis representation impact it?

I How does noisy data influence accuracy?

I What are the theoretical limits of learnability?

I How can prior knowledge help the learner?

I What clues can we get from biological learning systems?

I How can systems alter their own representations?

1 Machine Learning – Introduction (4th April 2006) 21