Rule Based Classifier

L19-03-09-09 1 RULE-BASED CLASSIFIER Lecture 19/07-09-09 (Monday) (this lecture was supposed to happen on 03-09-09, DB2 WS)


Transcript of Rule Based Classifier

Page 1: Rule Based Classifier

L19-03-09-09 1

RULE-BASED CLASSIFIER

Lecture 19/07-09-09 (Monday) (this lecture was supposed to happen on 03-09-09, DB2 WS)

Page 2: Rule Based Classifier

Rule-Based Classifier

• Classify records by using a collection of “if…then…” rules
• Rule: (Condition) → y, where
    – Condition is a conjunction of attribute tests
    – y is the class label
• LHS: rule antecedent or condition
• RHS: rule consequent
• Examples of classification rules:
    – (Blood Type=Warm) ∧ (Lay Eggs=Yes) → Birds
    – (Taxable Income < 50K) ∧ (Refund=Yes) → Evade=No

Page 3: Rule Based Classifier

Rule-based Classifier (Example)

R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

Name           Blood Type  Give Birth  Can Fly  Live in Water  Class
human          warm        yes         no       no             mammals
python         cold        no          no       no             reptiles
salmon         cold        no          no       yes            fishes
whale          warm        yes         no       yes            mammals
frog           cold        no          no       sometimes      amphibians
komodo         cold        no          no       no             reptiles
bat            warm        yes         yes      no             mammals
pigeon         warm        no          yes      no             birds
cat            warm        yes         no       no             mammals
leopard shark  cold        yes         no       yes            fishes
turtle         cold        no          no       sometimes      reptiles
penguin        warm        no          no       sometimes      birds
porcupine      warm        yes         no       no             mammals
eel            cold        no          no       yes            fishes
salamander     cold        no          no       sometimes      amphibians
gila monster   cold        no          no       no             reptiles
platypus       warm        no          no       no             mammals
owl            warm        no          yes      no             birds
dolphin        warm        yes         no       yes            mammals
eagle          warm        no          yes      no             birds

Page 4: Rule Based Classifier

Application of Rule-Based Classifier

• A rule r covers an instance x if the attributes of the instance satisfy the condition of the rule

R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

The rule R1 covers the hawk => Bird
The rule R3 covers the grizzly bear => Mammal

Name          Blood Type  Give Birth  Can Fly  Live in Water  Class
hawk          warm        no          yes      no             ?
grizzly bear  warm        yes         no       no             ?
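The notion of a rule covering an instance can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the dictionary encoding of rules and the names `RULES`, `covers`, and `covering_rules` are assumptions made for the example.

```python
# Each rule is (name, antecedent, class label). The antecedent is a dict of
# attribute -> required value and is read as a conjunction (all must hold).
# This encoding is an assumption made for illustration.
RULES = [
    ("R1", {"give_birth": "no", "can_fly": "yes"}, "Birds"),
    ("R2", {"give_birth": "no", "live_in_water": "yes"}, "Fishes"),
    ("R3", {"give_birth": "yes", "blood_type": "warm"}, "Mammals"),
    ("R4", {"give_birth": "no", "can_fly": "no"}, "Reptiles"),
    ("R5", {"live_in_water": "sometimes"}, "Amphibians"),
]


def covers(antecedent, record):
    """Return True if the record satisfies every condition of the antecedent."""
    return all(record.get(attr) == value for attr, value in antecedent.items())


def covering_rules(record):
    """Return the names of all rules whose antecedent the record satisfies."""
    return [name for name, antecedent, _ in RULES if covers(antecedent, record)]


hawk = {"blood_type": "warm", "give_birth": "no",
        "can_fly": "yes", "live_in_water": "no"}
grizzly_bear = {"blood_type": "warm", "give_birth": "yes",
                "can_fly": "no", "live_in_water": "no"}

print(covering_rules(hawk))          # ['R1']
print(covering_rules(grizzly_bear))  # ['R3']
```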

Page 5: Rule Based Classifier

Rule Coverage and Accuracy

• Coverage of a rule:
    – Fraction of records that satisfy the antecedent of the rule.
    – Given a dataset D and a rule r: A → y,
          Coverage(r) = |A| / |D|
      where |A| is the number of records that satisfy the antecedent A and |D| is the total number of records.
• Accuracy of a rule:
    – Fraction of the records covered by the rule (those that satisfy the antecedent) that also satisfy the consequent.
          Accuracy(r) = |A ∩ y| / |A|
      where |A ∩ y| is the number of records that satisfy both the antecedent and the consequent.

Tid  Refund  Marital Status  Taxable Income  Class
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

(Status=Single) → No: Coverage = 4/10 = 40%, Accuracy = 2/4 = 50%
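The two measures can be recomputed directly from the ten-record table. The sketch below is illustrative only; the tuple layout and the helper name `coverage_and_accuracy` are assumptions, and the antecedent is passed as a plain Python predicate.

```python
# Records copied from the table above:
# (tid, refund, marital_status, taxable_income_in_K, class)
RECORDS = [
    (1, "Yes", "Single", 125, "No"),
    (2, "No", "Married", 100, "No"),
    (3, "No", "Single", 70, "No"),
    (4, "Yes", "Married", 120, "No"),
    (5, "No", "Divorced", 95, "Yes"),
    (6, "No", "Married", 60, "No"),
    (7, "Yes", "Divorced", 220, "No"),
    (8, "No", "Single", 85, "Yes"),
    (9, "No", "Married", 75, "No"),
    (10, "No", "Single", 90, "Yes"),
]


def coverage_and_accuracy(records, antecedent, label):
    """Coverage = |A| / |D|; accuracy = |A and y| / |A|."""
    covered = [r for r in records if antecedent(r)]      # records satisfying A
    correct = [r for r in covered if r[4] == label]      # ... that also have class y
    coverage = len(covered) / len(records)
    accuracy = len(correct) / len(covered) if covered else 0.0
    return coverage, accuracy


# Rule: (Status=Single) -> No
cov, acc = coverage_and_accuracy(RECORDS, lambda r: r[2] == "Single", "No")
print(cov, acc)  # 0.4 0.5
```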

Page 6: Rule Based Classifier

The rule:

(Give Birth = yes) ∧ (Blood Type = warm) → Mammals

Coverage = (6/20) × 100 = 30% and
Accuracy = (6/6) × 100 = 100%

Name           Blood Type  Give Birth  Can Fly  Live in Water  Class
human          warm        yes         no       no             mammals
python         cold        no          no       no             reptiles
salmon         cold        no          no       yes            fishes
whale          warm        yes         no       yes            mammals
frog           cold        no          no       sometimes      amphibians
komodo         cold        no          no       no             reptiles
bat            warm        yes         yes      no             mammals
pigeon         warm        no          yes      no             birds
cat            warm        yes         no       no             mammals
leopard shark  cold        yes         no       yes            fishes
turtle         cold        no          no       sometimes      reptiles
penguin        warm        no          no       sometimes      birds
porcupine      warm        yes         no       no             mammals
eel            cold        no          no       yes            fishes
salamander     cold        no          no       sometimes      amphibians
gila monster   cold        no          no       no             reptiles
platypus       warm        no          no       no             mammals
owl            warm        no          yes      no             birds
dolphin        warm        yes         no       yes            mammals
eagle          warm        no          yes      no             birds

Page 7: Rule Based Classifier

How does Rule-based Classifier Work?

R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

A lemur triggers rule R3, so it is classified as a mammal
A turtle triggers both R4 and R5
A dogfish shark triggers none of the rules

Name           Blood Type  Give Birth  Can Fly  Live in Water  Class
lemur          warm        yes         no       no             ?
turtle         cold        no          no       sometimes      ?
dogfish shark  cold        yes         no       yes            ?

Page 8: Rule Based Classifier

Characteristics of Rule-Based Classifier

• Mutually exclusive (ME) rules
    – A classifier contains mutually exclusive rules if the rules are independent of each other.
    – The rules in a rule set R are ME if no two rules in R are triggered by the same record.
    – This property ensures that every record is covered by at most one rule.
• Exhaustive rules
    – A classifier has exhaustive coverage if it accounts for every possible combination of attribute values.
    – This property ensures that each record is covered by at least one rule.

Page 9: Rule Based Classifier

R1: (body_temp = cold-blooded) → non-mammals
R2: (body_temp = warm-blooded) ∧ (gives_birth = yes) → mammals
R3: (body_temp = warm-blooded) ∧ (gives_birth = no) → non-mammals

• This rule set is ME and exhaustive.
• Together, these two properties ensure that every record is covered by exactly one rule.
• Many rule-based classifiers (including the one shown previously) do not have these properties.
• What if a rule set is not exhaustive?
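For the three-rule set above, both properties can be checked mechanically by enumerating every combination of attribute values and counting how many rules each combination triggers. This is only a sketch: it assumes each attribute takes exactly the two values that appear in the rules (`DOMAINS` is that assumption), and the names `RULES` and `trigger_counts` are made up for the example.

```python
from itertools import product

# Assumed attribute domains: only the values that appear in the rules.
DOMAINS = {
    "body_temp": ["cold-blooded", "warm-blooded"],
    "gives_birth": ["yes", "no"],
}

# (antecedent, class label); the antecedent is a conjunction of equality tests.
RULES = [
    ({"body_temp": "cold-blooded"}, "non-mammals"),
    ({"body_temp": "warm-blooded", "gives_birth": "yes"}, "mammals"),
    ({"body_temp": "warm-blooded", "gives_birth": "no"}, "non-mammals"),
]


def trigger_counts(rules, domains):
    """For every possible record, count how many rules it triggers."""
    names = list(domains)
    counts = []
    for values in product(*(domains[n] for n in names)):
        record = dict(zip(names, values))
        triggered = sum(
            all(record[attr] == value for attr, value in antecedent.items())
            for antecedent, _ in rules
        )
        counts.append(triggered)
    return counts


counts = trigger_counts(RULES, DOMAINS)
mutually_exclusive = max(counts) <= 1   # no record triggers two rules
exhaustive = min(counts) >= 1           # every record triggers some rule
print(mutually_exclusive, exhaustive)   # True True
```

Exhaustive enumeration is only practical for small, discrete attribute domains; it is used here because the example has four possible records.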

Page 10: Rule Based Classifier

• If a rule set is not exhaustive, a default rule is added:
      r_d: () → y_d
  The default rule has an empty antecedent. It covers the records that are not covered by any other rule and assigns them the default class label y_d.
• If a rule set is not ME, a record may be covered by more than one rule, and those rules may predict conflicting class labels.

Page 11: Rule Based Classifier

Rules

• Non mutually exclusive rules
    – A record may trigger more than one rule
    – Solution? Ordered rule set
• Non exhaustive rules
    – A record may not trigger any rules
    – Solution? Use a default class

Page 12: Rule Based Classifier

Two ways to overcome the problem when rules are not ME

• Ordered rule set
    – Rules are ordered in decreasing order of their priority.
    – Priority can be defined in many ways: based on accuracy, coverage, or the order in which the rules were generated.
• Unordered rule set
    – A test record may trigger multiple classification rules; the consequent of each triggered rule counts as a vote for a particular class.
    – The votes are combined using a voting scheme.
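A voting scheme for an unordered rule set might look like the sketch below, which uses the five vertebrate rules R1 to R5 from the earlier slides. The function name `tally_votes` and the optional per-rule `weights` are assumptions for illustration; the slides do not specify how votes are weighted or how ties are broken.

```python
from collections import Counter

RULES = [
    ("R1", {"give_birth": "no", "can_fly": "yes"}, "Birds"),
    ("R2", {"give_birth": "no", "live_in_water": "yes"}, "Fishes"),
    ("R3", {"give_birth": "yes", "blood_type": "warm"}, "Mammals"),
    ("R4", {"give_birth": "no", "can_fly": "no"}, "Reptiles"),
    ("R5", {"live_in_water": "sometimes"}, "Amphibians"),
]


def tally_votes(record, rules, weights=None):
    """Each triggered rule votes for its class; weights default to 1 per rule."""
    tally = Counter()
    for name, antecedent, label in rules:
        if all(record.get(a) == v for a, v in antecedent.items()):
            tally[label] += (1 if weights is None else weights[name])
    return tally


turtle = {"blood_type": "cold", "give_birth": "no",
          "can_fly": "no", "live_in_water": "sometimes"}

# Unweighted: R4 and R5 both fire, so the vote is tied one to one.
print(dict(tally_votes(turtle, RULES)))  # {'Reptiles': 1, 'Amphibians': 1}

# Weighted: the weights below are illustrative values assumed for this sketch.
weighted = tally_votes(turtle, RULES, weights={"R4": 0.4, "R5": 0.5})
print(max(weighted, key=weighted.get))   # Amphibians
```

The tie in the unweighted case shows why a plain majority vote is not enough on its own; some tie-breaking or weighting policy has to be chosen.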

Page 13: Rule Based Classifier

Ordered Rule Set

• Rules are rank-ordered according to their priority
    – An ordered rule set is known as a decision list.
• When a test record is presented to the classifier
    – It is assigned the class label of the highest-ranked rule it triggers
    – If none of the rules fires, it is assigned the default class

R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

Name    Blood Type  Give Birth  Can Fly  Live in Water  Class
turtle  cold        no          no       sometimes      ?
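A decision list can be sketched as a loop that returns the label of the first rule that fires and falls back to a default class otherwise. The rule order R1 to R5 is taken as the priority order, which is an assumption, and the default label "Unknown" is hypothetical, since the slides do not name a default class.

```python
# Rules in priority order (highest first); order R1..R5 is assumed here.
DECISION_LIST = [
    ("R1", {"give_birth": "no", "can_fly": "yes"}, "Birds"),
    ("R2", {"give_birth": "no", "live_in_water": "yes"}, "Fishes"),
    ("R3", {"give_birth": "yes", "blood_type": "warm"}, "Mammals"),
    ("R4", {"give_birth": "no", "can_fly": "no"}, "Reptiles"),
    ("R5", {"live_in_water": "sometimes"}, "Amphibians"),
]
DEFAULT_CLASS = "Unknown"  # hypothetical default label, not from the slides


def classify(record, decision_list=DECISION_LIST, default=DEFAULT_CLASS):
    """Return (label, rule name) of the first rule that fires, else the default."""
    for name, antecedent, label in decision_list:
        if all(record.get(a) == v for a, v in antecedent.items()):
            return label, name
    return default, None


turtle = {"blood_type": "cold", "give_birth": "no",
          "can_fly": "no", "live_in_water": "sometimes"}
dogfish_shark = {"blood_type": "cold", "give_birth": "yes",
                 "can_fly": "no", "live_in_water": "yes"}

print(classify(turtle))         # ('Reptiles', 'R4')
print(classify(dogfish_shark))  # ('Unknown', None)
```

The turtle triggers both R4 and R5, but R4 is ranked higher, so the conflict is resolved in favour of Reptiles. The dogfish shark triggers no rule and receives the default class.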

Page 14: Rule Based Classifier

Rule Ordering Schemes

• Rule-based ordering
• Class-based ordering

Page 15: Rule Based Classifier

• Rule-based ordering
    – Individual rules are ranked based on their quality.
    – This scheme ensures that every test record is classified by the “best” rule covering it.
    – If the number of rules is large, interpreting the lower-ranked rules becomes cumbersome.

Page 16: Rule Based Classifier

• Class-based ordering
    – Rules that belong to the same class appear together in the rule set R.
    – The relative ordering among the rules from the same class doesn’t matter.
    – Most of the well-known rule-based classifiers (like C4.5 rules and RIPPER) employ class-based ordering.

Page 17: Rule Based Classifier


Rule-based Ordering

(Refund=Yes) ==> No

(Refund=No, Marital Status={Single,Divorced},Taxable Income<80K) ==> No

(Refund=No, Marital Status={Single,Divorced},Taxable Income>80K) ==> Yes

(Refund=No, Marital Status={Married}) ==> No

Class-based Ordering

(Refund=Yes) ==> No

(Refund=No, Marital Status={Single,Divorced},Taxable Income<80K) ==> No

(Refund=No, Marital Status={Married}) ==> No

(Refund=No, Marital Status={Single,Divorced},Taxable Income>80K) ==> Yes
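The class-based list above can be obtained from the rule-based list by grouping rules by their class while keeping the original order within each class. The sketch below does this with a stable sort; the helper name `class_based_order` and the explicit `class_order` argument are assumptions for illustration.

```python
# Rules in rule-based order, as (antecedent text, class label).
RULE_BASED = [
    ("Refund=Yes", "No"),
    ("Refund=No, Marital Status={Single,Divorced}, Taxable Income<80K", "No"),
    ("Refund=No, Marital Status={Single,Divorced}, Taxable Income>80K", "Yes"),
    ("Refund=No, Marital Status={Married}", "No"),
]


def class_based_order(rules, class_order):
    """Group rules by class in the given class order.

    sorted() is stable, so rules of the same class keep their relative order.
    """
    rank = {label: position for position, label in enumerate(class_order)}
    return sorted(rules, key=lambda rule: rank[rule[1]])


CLASS_BASED = class_based_order(RULE_BASED, class_order=["No", "Yes"])
for antecedent, label in CLASS_BASED:
    print(f"({antecedent}) ==> {label}")
```

With the class order No, then Yes, the three No rules come first and the single Yes rule comes last, matching the class-based list above.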

Page 18: Rule Based Classifier

From Decision Trees To Rules

[Figure: decision tree. Root node Refund: Yes → NO; No → Marital Status. Marital Status: {Married} → NO; {Single, Divorced} → Taxable Income. Taxable Income: < 80K → NO; > 80K → YES.]

Classification Rules

(Refund=Yes) ==> No

(Refund=No, Marital Status={Single,Divorced},Taxable Income<80K) ==> No

(Refund=No, Marital Status={Single,Divorced},Taxable Income>80K) ==> Yes

(Refund=No, Marital Status={Married}) ==> No

Rules are mutually exclusive and exhaustive

Rule set contains as much information as the tree
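One way to see why the rule set carries the same information as the tree is to generate one rule per root-to-leaf path. The sketch below encodes the tree as nested tuples and dictionaries, which is an assumed representation, and `extract_rules` and `format_rule` are hypothetical helper names.

```python
# Internal node: (attribute, {branch test: child}); leaf: class label string.
# The encoding is an assumption made for this sketch.
TREE = ("Refund", {
    "=Yes": "No",
    "=No": ("Marital Status", {
        "={Single,Divorced}": ("Taxable Income", {"<80K": "No", ">80K": "Yes"}),
        "={Married}": "No",
    }),
})


def extract_rules(node, path=()):
    """Return one (conditions, label) pair per root-to-leaf path."""
    if isinstance(node, str):            # leaf: the path so far is the antecedent
        return [(path, node)]
    attribute, branches = node
    rules = []
    for test, child in branches.items():
        rules.extend(extract_rules(child, path + (attribute + test,)))
    return rules


def format_rule(conditions, label):
    """Render a rule in the '(a, b) ==> y' notation used on the slides."""
    return f"({', '.join(conditions)}) ==> {label}"


rules = extract_rules(TREE)
for conditions, label in rules:
    print(format_rule(conditions, label))
```

Because each record follows exactly one path from the root to a leaf, the rules produced this way are mutually exclusive and cover every record the tree can classify.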