Transcript of "Ensemble with Neighbor Rules Voting" (http://majorplus.nerbnerb.com), 26 pages.

Page 1: Ensemble with Neighbor Rules Voting

Itt Romneeyangkurn, Sukree Sinthupinyo
Faculty of Computer Science
Thammasat University

Page 2: Outline

- Introduction and Motivation
- Preliminaries
  - Decision Tree
  - Ensemble of Decision Trees
  - Simple Majority Rule
  - Simple Majority Class
- Experiment Process
  - Making Rules into Groups
  - Majority Rule+ and Majority Class+
- Results of Experiment
- Conclusions and Future Work

Page 3: Introduction to Decision Trees

Decision trees are widely used in data mining and machine learning; well-known algorithms include CART, C4.5, and ID3.

Advantages:
- Simple to understand and interpret
- Requires little data preparation
- Able to handle both numerical and categorical data
- Uses a white-box model
- Possible to validate the model using statistical tests
- Robust; performs well on large data sets in a short time
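As an illustrative sketch (not part of the original slides), the white-box property above can be seen by printing a trained tree's rules; scikit-learn's CART stands in for C4.5/ID3 here:

```python
# A minimal sketch, assuming scikit-learn (CART stands in for C4.5/ID3).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3).fit(iris.data, iris.target)

# White-box model: the learned decision rules can be printed and inspected.
print(export_text(tree, feature_names=iris.feature_names))
```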

Page 4: Introduction to Decision Trees

[Figure: an example decision tree for the play-tennis data. The root attribute Outlook branches on the values Sunny, Overcast, and Rain. Sunny leads to a Humidity test (High → No, Normal → Yes), Overcast leads directly to Yes, and Rain leads to a Wind test (Strong → No, Weak → Yes). Internal nodes are attributes, branches are attribute values, and leaves are classes.]

Page 5: Decision Tree Ensemble

From a single classifier to an ensemble of classifiers:
- More accurate than an individual classifier
- Examples: AdaBoost, Bagging, and Random Forest

Page 6: Decision Tree Ensemble

[Figure: an individual classifier versus an ensemble of classifiers. Left: the original training set trains a single decision tree DT, which produces a result R. Right: training sets 1 to n (TrainingSet 1 for Tree 1, ..., TrainingSet n for Tree n) train decision trees DT1-DTn, producing results R1-Rn.]

Page 7: Bootstrapping

- Also called replications
- Created by uniformly sampling m times, with replacement, from a dataset of size m
- Used to train the multiple classifiers (e.g., CART, nearest-neighbor classifiers, and C4.5)
- This work uses 10 bootstrap replications

Page 8: Bootstrapping

Example of Original Dataset

Original Dataset: 1,2,3,4,5,6,7,8

Example of Bootstrap Replications

1st Bootstrap: 2,7,8,3,7,6,3,1

2nd Bootstrap: 7,8,5,6,4,2,7,1

3rd Bootstrap: 3,6,2,7,5,6,2,2

4th Bootstrap: 4,5,1,4,6,4,3,8
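A minimal NumPy sketch of this sampling (illustrative only; the seed and output will not reproduce the slide's exact replications):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
original = np.arange(1, 9)      # the dataset 1..8, size m = 8
m = len(original)

# Each bootstrap replication samples m items uniformly with replacement,
# so some items repeat and others are left out.
for i in range(1, 5):
    replication = rng.choice(original, size=m, replace=True)
    print(f"Bootstrap {i}: {replication.tolist()}")
```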

Page 9: Simple Majority Vote (Bagging)

Advantages:
- Improves classification accuracy
- Reduces variance
- Helps avoid over-fitting

Method:
- Generate T bootstrap samples
- Train one classifier on each sample
- The majority vote among the resulting T decision trees is the final output
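A hedged sketch of the method (scikit-learn's CART trees stand in for the paper's C4.5; NumPy arrays with integer class labels are assumed):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_predict(X_train, y_train, X_test, T=10, seed=0):
    """Train T trees on bootstrap samples; the per-instance majority
    vote among the T trees is the final output."""
    rng = np.random.default_rng(seed)
    m = len(X_train)
    votes = []
    for _ in range(T):
        idx = rng.integers(0, m, size=m)            # one bootstrap sample
        tree = DecisionTreeClassifier(random_state=0)
        votes.append(tree.fit(X_train[idx], y_train[idx]).predict(X_test))
    votes = np.asarray(votes)                       # shape (T, n_test)
    # Majority vote per test instance (labels assumed to be 0..k-1 ints).
    return np.array([np.bincount(col).argmax() for col in votes.T])
```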

Page 10: Simple Majority Vote (Bagging)

[Figure: a worked example. A test instance is passed to trees DT1-DT10; seven trees vote A and three vote B (A = 7, B = 3), so the simple majority vote outputs A.]

Page 11: Simple Majority Class

Based on bagging; the difference is in the voting: for each of the T decision trees, take the rule that classifies the test instance and count the classes of the training-set instances that match that rule. The majority class over all T trees is the final output.
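A hedged sketch of this voting scheme, using scikit-learn leaf nodes as a stand-in for C4.5 rules (the original uses C4.5 rule sets, so this is an approximation):

```python
import numpy as np

def majority_class_predict(trees, X_orig, y_orig, x_test, n_classes):
    """For each tree, find the rule (here: leaf) that classifies x_test,
    count the classes of the original training instances matching that
    rule, and sum the counts over all trees; the majority class wins."""
    counts = np.zeros(n_classes, dtype=int)
    for tree in trees:
        leaf = tree.apply(x_test.reshape(1, -1))[0]   # rule firing on x_test
        train_leaves = tree.apply(X_orig)             # rule matched by each training instance
        matched_classes = y_orig[train_leaves == leaf]
        counts += np.bincount(matched_classes, minlength=n_classes)
    return int(counts.argmax())
```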

Page 12: Simple Majority Class

[Figure: a worked example. For each tree DT1-DT10, the rule that classifies the test instance is matched against the original training set (Data 1: A, Data 2: A, Data 3: B, Data 4: A, ..., Data n: B) and the matching instances' classes are counted (e.g., A: 8, B: 3 for DT1). Summing over all ten trees gives A = 69 and B = 28, so the simple majority class outputs A.]

Page 13: Similarity Between Rules

Continuous attribute: use the overlap between the two rules' ranges,

$$\mathrm{sim}_{\text{continuous}} = \frac{\text{overlap between the ranges of the two rules}}{\text{range of each rule individually}}$$

Discrete attribute: use the number of discrete attribute values in common between both rules.

For the rule as a whole, take the percentage of attributes shared by both rules:

$$\mathrm{sim} = \frac{\text{attributes shared by both rules}}{\text{average of the attributes covered by the two rules individually}}$$

For more information about similarity between rules, see "Bootstrapping Rule Induction to Achieve Rule Stability and Reduction".
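A hedged sketch of the two per-attribute measures as described above (the rule representation and the exact normalization are assumptions; the cited paper gives the precise definition):

```python
def continuous_similarity(range_a, range_b):
    """Overlap between two rules' ranges, taken relative to each rule's
    own range and averaged (this normalization is an assumption).
    Assumes non-degenerate ranges (positive width)."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    overlap = max(0.0, high - low)
    width_a = range_a[1] - range_a[0]
    width_b = range_b[1] - range_b[0]
    return 0.5 * (overlap / width_a + overlap / width_b)

def discrete_similarity(values_a, values_b):
    """Discrete attribute values in common between both rules, relative
    to the average number of values each rule covers individually."""
    shared = len(set(values_a) & set(values_b))
    avg_covered = (len(set(values_a)) + len(set(values_b))) / 2
    return shared / avg_covered
```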

Page 14: Experiment

- Nine well-known benchmark data sets
- Ten-fold cross-validation
- Based on C4.5, run in default mode with pruning enabled

Data Set        Instances   Attributes   Classes
Balance-Scale         625            4         3
Bridges               105           10         6
Car                 1,728            6         4
Dermatology           366           34         6
Hayes-Roth            132            4         3
Labor-Neg              40           16         2
Soybean               307           35        19
TAE                   151            3         3
Zoo                   101           16         7

Page 15: Experiment

1. Generate 10 bootstrap samples of the original training set.
2. Generate 10 classifiers using C4.5.
3. Find the similarity between the rules from all classifiers.
4. Make all rules into groups (a sketch of this step follows below).
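A hedged sketch of the grouping step; `rule_similarity` is a hypothetical function combining the per-attribute measures from the previous slide:

```python
def neighbor_group(rule, all_rules, rule_similarity, threshold=0.8):
    """A rule's group holds the rule itself plus every rule from any
    tree whose similarity to it meets the threshold (0.8 here)."""
    return [r for r in all_rules
            if r is rule or rule_similarity(rule, r) >= threshold]

def make_groups(all_rules, rule_similarity, threshold=0.8):
    """One neighbor group per rule, over the rules of all 10 trees."""
    return {rule: neighbor_group(rule, all_rules, rule_similarity, threshold)
            for rule in all_rules}
```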

Page 16: Group of Rules with Similarity Value 0.8

[Figure: grouping example. Each tree DT1-DT10 contributes rules R(1,1), R(1,2), R(1,3), R(2,1), R(2,2), ..., R(10,3). Similarities are computed between R(1,1) and every other rule, e.g., R(1,2): 0.2302, R(1,3): -0.7495, R(2,1): 0.9454, R(2,2): 0.7382. With the threshold 0.8, the group of rule R(1,1) is {R(1,1), R(2,1), R(3,2), R(5,4), R(6,2), R(7,1), R(10,3)}.]

Page 17: Majority Rule+

Based on bagging; the difference is in the voting: the majority vote among the classes of the firing rule's group members (its neighbor rules) is the final output.
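A hedged sketch of the vote; `groups` would come from the grouping step, and `rule_class` is a hypothetical mapping from each rule to the class it predicts:

```python
from collections import Counter

def majority_rule_plus(firing_rules, groups, rule_class):
    """One rule fires per tree; every member of that rule's neighbor
    group votes with its own class, and the class with the most votes
    over all ten trees is the final output."""
    votes = Counter()
    for rule in firing_rules:
        for member in groups[rule]:
            votes[rule_class[member]] += 1
    return votes.most_common(1)[0][0]
```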

Page 18: Majority Rule+

[Figure: a worked example. For each tree DT1-DT10, the rule that fires on the test instance brings in its neighbor group; e.g., in DT1 the firing rule is Rule 2, whose group {R(1,2), R(2,2), R(3,1), R(4,2), R(6,2), R(8,2), R(9,1)} votes A, A, B, A, B, B, B, giving A: 3, B: 4. Summing the group votes over all ten trees gives A = 27 and B = 36, so Majority Rule+ outputs B.]

Page 19: Majority Class+

Based on bagging; the difference is in the voting: the majority vote among the classes of the training-set instances that match the rule group's members is the final output.
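A hedged sketch; `training_counts` is a hypothetical mapping from each rule to a Counter of the classes of the original training instances that match it:

```python
from collections import Counter

def majority_class_plus(firing_rules, groups, training_counts):
    """Like Majority Rule+, but each neighbor-group member contributes
    the class counts of the training instances matching it, rather
    than a single vote; the top summed class is the final output."""
    totals = Counter()
    for rule in firing_rules:
        for member in groups[rule]:
            totals.update(training_counts[member])
    return totals.most_common(1)[0][0]
```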

Page 20: Majority Class+

[Figure: a worked example. For each tree DT1-DT10, every member of the firing rule's neighbor group (e.g., Group of Rule 2 in DT1: R(1,2), R(2,2), R(3,1), R(4,2), R(6,2), R(8,2), R(9,1)) is matched against the original training set (Data 1: A, Data 2: A, Data 3: B, Data 4: A, ..., Data n: B) and the matched instances' classes are counted (e.g., A: 2, B: 10 for DT1). Summing over all ten trees gives A = 33 and B = 92, so Majority Class+ outputs B.]

Page 21: Comparing Bagging and Majority Rule+

Data Set       Bagging       Majority Rule+ (similarity value)
                             0.6            0.7            0.8            0.9
Balance-Scale  78.58±4.09    78.09±3.43     79.21±3.66     79.70±4.79 ⊕   80.01±4.54 ⊕
Bridges        59.91±14.86   61.82±15.68 ⊕  59.00±18.20    61.73±16.57    61.73±16.57
Car            93.81±1.37    89.76±2.00 ⊖   90.34±2.33 ⊖   93.17±1.87     94.33±1.97
Dermatology    95.36±4.04    95.90±3.29 ⊕   95.91±3.90 ⊕   96.19±3.88 ⊕   95.63±3.69
Hayes-Roth     74.89±8.65    74.89±13.02    75.66±11.94    74.95±11.96    73.41±11.60
Labor-Neg      67.50±22.50   67.50±22.50    67.50±22.50    67.50±22.50    67.50±22.50
Soybean        85.67±6.39    86.62±5.62     85.02±5.45     86.62±5.75     86.96±4.85
TAE            43.00±15.95   45.00±14.08    45.00±14.08    43.00±15.95    43.00±15.95
Zoo            92.00±7.48    92.00±7.48     92.00±7.48     94.00±6.63 ⊕   94.00±6.63 ⊕

(⊕: significantly better than Bagging; ⊖: significantly worse)

Page 22: Comparing Simple Majority Class and Majority Class+

Data Set       Simple Majority Class   Majority Class+ (similarity value)
                                       0.6             0.7             0.8             0.9
Balance-Scale  81.45±5.47              80.81±4.81 ⊖    80.81±4.32      81.29±5.68      81.45±5.74
Bridges        63.64±19.13             59.82±17.00 ⊖   62.64±19.37     63.64±17.78     64.55±18.89
Car            94.68±1.28              91.67±1.29 ⊖    93.40±1.19 ⊖    94.45±1.48      94.56±1.77
Dermatology    91.51±6.23              91.25±6.29      93.44±4.65 ⊕    93.71±5.37 ⊕    93.71±4.05 ⊕
Hayes-Roth     70.38±10.69             73.30±12.73     74.12±11.15 ⊕   72.64±10.52 ⊕   71.15±9.71
Labor-Neg      67.50±22.50             67.50±22.50     67.50±22.50     67.50±22.50     67.50±22.50
Soybean        87.28±5.41              63.94±11.39 ⊖   78.84±6.99 ⊖    85.00±3.38 ⊖    87.60±3.87
TAE            43.67±14.49             39.67±14.64 ⊖   44.33±15.06     44.33±15.06     43.67±14.49
Zoo            91.00±11.36             40.64±15.20 ⊖   84.09±13.61 ⊖   92.00±6.00      92.00±6.00

(⊕: significantly better than Simple Majority Class; ⊖: significantly worse)

Page 23: Appropriate Similarity Value

Comparing Bagging and Majority Rule+

Similarity Value   Significantly Better   Significantly Worse
0.6                2                      1
0.7                1                      1
0.8                3                      0
0.9                2                      0

Comparing Simple Majority Class and Majority Class+

Similarity Value   Significantly Better   Significantly Worse
0.6                0                      6
0.7                2                      3
0.8                2                      1
0.9                1                      0

The best similarity value between rules is 0.8.

Page 24: Conclusions

- Majority vote with neighbor rules improves accuracy over the traditional simple majority vote.
- The best similarity value between rules is 0.8.

Page 25: Future Work

- Run experiments on 10-15 more data sets from the UCI repository.
- Cluster the rules using the similarity value and derive a single classifier, reducing time and resources.
- Apply the 0.8 similarity value to other decision-tree ensemble methods, such as AdaBoost.

Page 26: The End