Learning Fuzzy Association Rules and Associative Classification Rules
Jianchao Han
Computer Science Department, California State University Dominguez Hills
July 19, 2006, WCCI 2006
Agenda
Introduction
Traditional Association Rules
Positive and Negative Fuzzy Association Rules
An Illustrative Example
Positive and Negative Fuzzy Associative Classification Rules
Implementation Algorithms
Conclusion
Introduction
Association: a relationship between data items
Sales data association
– If a set of items A occurs in a sale transaction, then another set of items B will likely also occur in the same transaction
Limitations
– Data are described by binary attribute values
– Only positive associations are pursued
Solutions
– Fuzzy attribute values
– Negative associations
Traditional Association Rules
Basket data
– I = {I_1, I_2, …, I_m}, a set of possible items
– D = {t_1, t_2, …, t_n}, a database of transactions
– Each t ∈ D is represented as a binary vector, with
  t[I_k] = 1 if t contains I_k
  t[I_k] = 0 if t does not contain I_k
Support of an itemset
– For any X ⊂ I, t satisfies X if t[I_k] = 1 for all I_k ∈ X
– The support of X in D is defined as Supp(X) = |{t ∈ D | t satisfies X}|, that is, the number of transactions that satisfy X
Traditional Association Rules
Itemset (binary) association rules
– For any X, Y ⊂ I with X ∩ Y = ∅, X → Y is an association rule
– The support of the rule, Supp(X → Y), is the probability of occurrence of X ∪ Y in D:
$$\mathrm{Supp}(X \rightarrow Y) = \frac{\mathrm{Supp}(X \cup Y)}{|D|}$$
– The confidence of the rule, Conf(X → Y), is the conditional probability of Y given X:
$$\mathrm{Conf}(X \rightarrow Y) = \frac{\mathrm{Supp}(X \cup Y)}{\mathrm{Supp}(X)}$$
Mining association rules
– Look for all possible associations X → Y such that Supp(X → Y) ≥ α and Conf(X → Y) ≥ β, where α and β are given thresholds
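The two measures above can be sketched in a few lines of Python. The basket data and the function names (`supp_count`, `rule_support`, `rule_confidence`) are invented for illustration and do not come from the talk.

```python
def supp_count(itemset, db):
    """Supp(X): number of transactions that contain every item of X."""
    items = frozenset(itemset)
    return sum(1 for t in db if items <= t)


def rule_support(x, y, db):
    """Supp(X -> Y) = Supp(X u Y) / |D|."""
    return supp_count(set(x) | set(y), db) / len(db)


def rule_confidence(x, y, db):
    """Conf(X -> Y) = Supp(X u Y) / Supp(X)."""
    return supp_count(set(x) | set(y), db) / supp_count(x, db)


# Hypothetical basket data, made up for this sketch.
db = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]

print(rule_support({"bread"}, {"milk"}, db))     # 2 of the 4 transactions
print(rule_confidence({"bread"}, {"milk"}, db))  # 2 of the 3 bread transactions
```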
Association Rules Mining Algorithm
Two steps
– Discovering all frequent itemsets, that is, those with support ≥ α
– Generating association rules
  Partition each frequent itemset into two parts, X and Y
  Test Conf(X → Y)
Level-wise algorithm
– Observation: if X is a frequent itemset, all of its subsets are frequent as well
– Test all 1-item itemsets
– Test all 2-item itemsets that are supersets of frequent 1-item itemsets
– Repeat until no new frequent itemsets are found
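A minimal sketch of the two steps, assuming crisp (binary) transactions stored as Python sets and an absolute count threshold `min_count`. The function names and the toy data are illustrative assumptions; the talk does not give an implementation.

```python
from itertools import combinations


def frequent_itemsets(db, min_count):
    """Level-wise search: a k-itemset is counted only if every one of its
    (k-1)-subsets was found frequent at the previous level."""
    def count(itemset):
        return sum(1 for t in db if itemset <= t)

    singles = {frozenset([i]) for t in db for i in t}
    level = {s for s in singles if count(s) >= min_count}
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = count(s)
        k += 1
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune by the subset observation, then test the support.
        level = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
            and count(c) >= min_count
        }
    return frequent


def rules(frequent, min_conf):
    """Partition each frequent itemset into X and Y and test Conf(X -> Y)."""
    found = []
    for itemset, cnt in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                x = frozenset(lhs)
                conf = cnt / frequent[x]
                if conf >= min_conf:
                    found.append((x, itemset - x, conf))
    return found


# Hypothetical basket data, made up for this sketch.
db = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]
freq = frequent_itemsets(db, min_count=2)
strong = rules(freq, min_conf=0.9)
print(strong)
```

In this toy run the 3-itemset is never counted, because its subset {milk, butter} already failed the threshold at level 2.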
Fuzzy Association Rules
Binary value is extended to the interval [0,1]
Example: item Tomato belongs to Vegetable to some degree, say 0.7
Itemset A = {A_1, A_2, …, A_l} ⊂ I, where each A_i is a fuzzy subset of I
Support of an itemset A is defined as
$$\mathrm{Supp}(A) = \sum_{t \in D} \prod_{i=1}^{l} \mu_{A_i}(t)$$
where μ_x(t) ∈ [0,1] denotes the membership degree of item x in transaction t
Support of a rule A → B is
$$\mathrm{Supp}(A \rightarrow B) = \frac{\sum_{t \in D} \prod_{x \in A \cup B} \mu_x(t)}{|D|}$$
Confidence of a rule A → B is
$$\mathrm{Conf}(A \rightarrow B) = \frac{\sum_{t \in D} \prod_{x \in A \cup B} \mu_x(t)}{\sum_{t \in D} \prod_{x \in A} \mu_x(t)}$$
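As a rough illustration of these fuzzy measures, the sketch below multiplies the membership degrees within each transaction and sums over the database. The three transactions and the item names are hypothetical; only the 0.7 degree echoes the Tomato example.

```python
from math import prod


def fuzzy_supp(itemset, db):
    """Supp(A): sum over transactions of the product of membership degrees."""
    return sum(prod(t[x] for x in itemset) for t in db)


def fuzzy_rule_supp(a, b, db):
    """Supp(A -> B): fuzzy support of A u B divided by |D|."""
    return fuzzy_supp(set(a) | set(b), db) / len(db)


def fuzzy_rule_conf(a, b, db):
    """Conf(A -> B): fuzzy support of A u B divided by fuzzy support of A."""
    return fuzzy_supp(set(a) | set(b), db) / fuzzy_supp(a, db)


# Hypothetical membership degrees, made up for this sketch.
db = [
    {"vegetable": 0.7, "fruit": 0.3},
    {"vegetable": 1.0, "fruit": 0.0},
    {"vegetable": 0.2, "fruit": 0.9},
]

print(fuzzy_supp(["vegetable"], db))
print(fuzzy_rule_supp(["vegetable"], ["fruit"], db))
print(fuzzy_rule_conf(["vegetable"], ["fruit"], db))
```

With all degrees restricted to 0 or 1, `fuzzy_supp` reduces to the transaction count used for traditional rules.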
Positive vs. Negative Association Rules
Positive association rules
– Like A → B
Negative association rules
– Like ¬A → B, ¬A → ¬B, A → ¬B
Different rule-interest measures exist for negative association rules, e.g.
– A negative example of A → B is a positive example of B → A
– A → ¬B, if
  A ∪ B is infrequent
  A ∪ ¬B is frequent
  Supp(A ∪ ¬B) − Supp(A)Supp(¬B) ≥ α
  Supp(A ∪ ¬B)/Supp(A) ≥ β
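Fuzzy Positive Association Rules
Simple fuzzy extension to traditional association rules
A → B is a fuzzy positive association rule, if
1) A ∩ B = ∅
2) Supp(A → B) reaches the support threshold, where
$$\mathrm{Supp}(A \rightarrow B) = \frac{\sum_{t \in D} \prod_{x \in A} \mu_x(t) \prod_{y \in B} \mu_y(t)}{|D|}$$
3) Conf(A → B) reaches the confidence threshold, where
$$\mathrm{Conf}(A \rightarrow B) = \frac{\sum_{t \in D} \prod_{x \in A} \mu_x(t) \prod_{y \in B} \mu_y(t)}{\sum_{t \in D} \prod_{x \in A} \mu_x(t)}$$
Here μ_x(t) is the membership degree of item x in transaction t.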
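Fuzzy Negative Association Rules
A → ¬B is a negative association rule if
1) A ∩ B = ∅
2) Supp(A) ≥ α
3) Supp(B) ≥ α
4) Supp(A → B) < α
5) Supp(A → ¬B) reaches the support threshold, where
$$\mathrm{Supp}(A \rightarrow \neg B) = \frac{\sum_{t \in D} \prod_{x \in A} \mu_x(t) \prod_{y \in B} (1 - \mu_y(t))}{|D|}$$
6) Conf(A → ¬B) reaches the confidence threshold, where
$$\mathrm{Conf}(A \rightarrow \neg B) = \frac{\sum_{t \in D} \prod_{x \in A} \mu_x(t) \prod_{y \in B} (1 - \mu_y(t))}{\sum_{t \in D} \prod_{x \in A} \mu_x(t)}$$
Here μ_x(t) is the membership degree of item x in transaction t.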
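Fuzzy Negative Association Rules
¬A → B is a negative association rule if
1) A ∩ B = ∅
2) Supp(A) ≥ α
3) Supp(B) ≥ α
4) Supp(A → B) < α
5) Supp(¬A → B) reaches the support threshold, where
$$\mathrm{Supp}(\neg A \rightarrow B) = \frac{\sum_{t \in D} \prod_{x \in A} (1 - \mu_x(t)) \prod_{y \in B} \mu_y(t)}{|D|}$$
6) Conf(¬A → B) reaches the confidence threshold, where
$$\mathrm{Conf}(\neg A \rightarrow B) = \frac{\sum_{t \in D} \prod_{x \in A} (1 - \mu_x(t)) \prod_{y \in B} \mu_y(t)}{\sum_{t \in D} \prod_{x \in A} (1 - \mu_x(t))}$$
Here μ_x(t) is the membership degree of item x in transaction t.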
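Fuzzy Negative Association Rules
¬A → ¬B is a negative association rule if
1) A ∩ B = ∅
2) Supp(A) ≥ α
3) Supp(B) ≥ α
4) Supp(A → B) < α
5) Supp(¬A → ¬B) reaches the support threshold, where
$$\mathrm{Supp}(\neg A \rightarrow \neg B) = \frac{\sum_{t \in D} \prod_{x \in A} (1 - \mu_x(t)) \prod_{y \in B} (1 - \mu_y(t))}{|D|}$$
6) Conf(¬A → ¬B) reaches the confidence threshold, where
$$\mathrm{Conf}(\neg A \rightarrow \neg B) = \frac{\sum_{t \in D} \prod_{x \in A} (1 - \mu_x(t)) \prod_{y \in B} (1 - \mu_y(t))}{\sum_{t \in D} \prod_{x \in A} (1 - \mu_x(t))}$$
Here μ_x(t) is the membership degree of item x in transaction t.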
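The six conditions for a rule of the form A → ¬B can be checked mechanically. The sketch below makes several assumptions that are not stated on the slides: the supports in conditions 2) to 4) are compared after dividing by |D|, the support threshold for the negative rule (`neg_alpha`) may differ from α, and the toy data and all names are invented.

```python
from math import prod


def supp(items, db):
    """Fuzzy support: sum over transactions of the product of memberships."""
    return sum(prod(t[x] for x in items) for t in db)


def joint(a, b, db, neg_b):
    """Sum over t of prod_{x in A} mu_x(t) * prod_{y in B} mu_y(t),
    using 1 - mu_y(t) for every y in B when neg_b is True."""
    return sum(
        prod(t[x] for x in a) * prod((1 - t[y]) if neg_b else t[y] for y in b)
        for t in db
    )


def is_negative_rule(a, b, db, alpha, neg_alpha, beta):
    """Test the conditions for A -> not B (thresholds are assumptions)."""
    n = len(db)
    if set(a) & set(b):                                         # 1) disjoint
        return False
    if supp(a, db) / n < alpha or supp(b, db) / n < alpha:      # 2), 3)
        return False
    if joint(a, b, db, neg_b=False) / n >= alpha:               # 4) A u B infrequent
        return False
    s = joint(a, b, db, neg_b=True)
    return s / n >= neg_alpha and s / supp(a, db) >= beta       # 5), 6)


# Hypothetical data: items "a" and "b" rarely occur together.
db = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.8, "b": 0.2},
    {"a": 0.1, "b": 0.9},
    {"a": 0.2, "b": 0.8},
]
print(is_negative_rule(["a"], ["b"], db, alpha=0.4, neg_alpha=0.25, beta=0.7))
```

The other two forms, ¬A → B and ¬A → ¬B, would follow the same pattern with 1 − μ applied on the antecedent side as well.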
Algorithm for Mining both Positive and Negative Fuzzy Rules
Two steps
– Generating all frequent and infrequent itemsets
– Extracting fuzzy association rules
Positive rules are extracted from the frequent itemsets
Negative rules are extracted from the infrequent itemsets
An Example
Transaction Database
Trans.  i1   i2   i3   i4   i5   i6
t1      1.0  0.7  0.2  0.0  1.0  1.0
t2      0.8  0.0  0.6  0.8  0.4  0.2
t3      0.5  0.8  0.0  0.8  0.8  0.0
t4      0.7  0.2  1.0  0.9  1.0  0.8
t5      0.4  0.4  0.0  0.6  0.8  0.9
t6      0.8  0.0  0.1  1.0  0.1  0.8
t7      0.9  0.9  0.8  0.2  1.0  1.0
t8      0.6  0.1  0.1  0.8  0.7  0.8
Frequent vs. Infrequent Itemsets (support threshold 40%)
1-itemsets         2-itemsets         3-itemsets
itemset  support   itemset  support   itemset     support
i1       5.7/8     i1, i4   3.37/8    i1, i4, i5  1.99/8
i2       3.1/8     i1, i5   4.14/8    i1, i5, i6  3.21/8
i3       2.8/8     i1, i6   4.10/8
i4       5.1/8     i4, i5   3.20/8
i5       5.8/8     i4, i6   3.06/8
i6       5.5/8     i5, i6   4.24/8
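The itemset table can be reproduced with a level-wise search over the transaction database, taking the fuzzy support as the sum over transactions of the product of membership degrees. This is a sketch under that reading, not the author's implementation; the function names and the small floating-point tolerance are assumptions.

```python
from itertools import combinations
from math import prod

ITEMS = ["i1", "i2", "i3", "i4", "i5", "i6"]
ROWS = [
    [1.0, 0.7, 0.2, 0.0, 1.0, 1.0],
    [0.8, 0.0, 0.6, 0.8, 0.4, 0.2],
    [0.5, 0.8, 0.0, 0.8, 0.8, 0.0],
    [0.7, 0.2, 1.0, 0.9, 1.0, 0.8],
    [0.4, 0.4, 0.0, 0.6, 0.8, 0.9],
    [0.8, 0.0, 0.1, 1.0, 0.1, 0.8],
    [0.9, 0.9, 0.8, 0.2, 1.0, 1.0],
    [0.6, 0.1, 0.1, 0.8, 0.7, 0.8],
]
DB = [dict(zip(ITEMS, row)) for row in ROWS]


def supp(itemset):
    """Fuzzy support: sum over transactions of the product of memberships."""
    return sum(prod(t[x] for x in itemset) for t in DB)


def mine(min_ratio):
    """Level-wise search that records failed candidates as infrequent."""
    threshold = min_ratio * len(DB) - 1e-9  # tolerance for float rounding
    frequent, infrequent = {}, {}
    level = [frozenset([i]) for i in ITEMS]
    k = 1
    while level:
        passed = []
        for c in level:
            s = supp(c)
            if s >= threshold:
                frequent[c] = s
                passed.append(c)
            else:
                infrequent[c] = s
        k += 1
        candidates = {a | b for a in passed for b in passed if len(a | b) == k}
        # Keep only candidates whose (k-1)-subsets are all frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))]
    return frequent, infrequent


frequent, infrequent = mine(0.40)
for name, table in (("frequent", frequent), ("infrequent", infrequent)):
    for itemset, s in sorted(table.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(name, sorted(itemset), f"{s:.2f}/8")
```

Candidates such as {i1, i4, i6} are pruned before counting because they contain the infrequent pair {i4, i6}, which matches their absence from the table.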
An Example: Positive Fuzzy Association Rules
itemset      association    support  confidence
i1, i4       i1 → i4        3.37/8   59.1%
             i4 → i1                 66.1%
i1, i5       i1 → i5        4.14/8   72.6%
             i5 → i1                 71.4%
i1, i6       i1 → i6        4.10/8   71.9%
             i6 → i1                 74.5%
i4, i5       i4 → i5        3.20/8   62.7%
             i5 → i4                 55.2%
i5, i6       i5 → i6        4.24/8   73.1%
             i6 → i5                 77.1%
i1, i5, i6   i1, i5 → i6    3.21/8   77.6%
             i1, i6 → i5             78.3%
             i5, i6 → i1             75.8%
             i1 → i5, i6             56.4%
             i5 → i1, i6             55.4%
             i6 → i1, i5             58.4%
Support threshold: 40%, confidence threshold: 75%
Support threshold: 50%, confidence threshold: 70%
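The confidence column can be recomputed from the transaction database by dividing the fuzzy support of the whole itemset by that of the antecedent. The sketch below does this for the rules drawn from {i1, i5, i6}; `rules_from` is a hypothetical helper name.

```python
from itertools import combinations
from math import prod

ITEMS = ["i1", "i2", "i3", "i4", "i5", "i6"]
ROWS = [
    [1.0, 0.7, 0.2, 0.0, 1.0, 1.0],
    [0.8, 0.0, 0.6, 0.8, 0.4, 0.2],
    [0.5, 0.8, 0.0, 0.8, 0.8, 0.0],
    [0.7, 0.2, 1.0, 0.9, 1.0, 0.8],
    [0.4, 0.4, 0.0, 0.6, 0.8, 0.9],
    [0.8, 0.0, 0.1, 1.0, 0.1, 0.8],
    [0.9, 0.9, 0.8, 0.2, 1.0, 1.0],
    [0.6, 0.1, 0.1, 0.8, 0.7, 0.8],
]
DB = [dict(zip(ITEMS, row)) for row in ROWS]


def supp(itemset):
    """Fuzzy support: sum over transactions of the product of memberships."""
    return sum(prod(t[x] for x in itemset) for t in DB)


def rules_from(itemset, min_conf):
    """Split the itemset into every (lhs, rhs) pair and keep the confident rules."""
    items = sorted(itemset)
    whole = supp(items)
    found = []
    for r in range(1, len(items)):
        for lhs in combinations(items, r):
            rhs = tuple(x for x in items if x not in lhs)
            conf = whole / supp(lhs)
            if conf >= min_conf:
                found.append((lhs, rhs, round(100 * conf, 1)))
    return found


confident = rules_from({"i1", "i5", "i6"}, min_conf=0.75)
for lhs, rhs, pct in confident:
    print(", ".join(lhs), "->", ", ".join(rhs), f"{pct}%")
```

Under the 75% confidence threshold, only the three rules with a two-item antecedent survive.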
An Example: Negative Fuzzy Association Rules
Support threshold: 25%
Confidence threshold: 70%
itemset      association         support   confidence
i4, i6       i4 → ¬i6            2.04/8    35.8%
             ¬i6 → i4                      73.0%
             i6 → ¬i4            2.44/8    44.4%
             ¬i4 → i6                      84.1%
             ¬i4 → ¬i6           0.46/8    15.9%
             ¬i6 → ¬i4                     18.4%
i1, i4, i5   i1, i4 → ¬i5        1.376/8   40.8%
             ¬i5 → i1, i4                  62.5%
             i1, i5 → ¬i4        2.146/8   51.8%
             ¬i4 → i1, i5                  74.0%
             i4, i5 → ¬i1        1.206/8   37.6%
             ¬i1 → i4, i5                  52.4%
             i1 → ¬(i4, i5)      0.184/8   3.20%
             ¬(i4, i5) → i1                61.3%
             i4 → ¬(i1, i5)      0.524/8   10.3%
             ¬(i1, i5) → i4                81.9%
             i5 → ¬(i1, i4)      0.454/8   7.80%
             ¬(i1, i4) → i5                79.6%
             ¬i1 → ¬(i4, i5)     0.116/8   5.00%
             ¬(i4, i5) → ¬i1               38.7%
             ¬i4 → ¬(i1, i5)               4.00%
             ¬(i1, i5) → ¬i4               18.1%
             ¬i5 → ¬(i1, i4)               5.30%
             ¬(i1, i4) → ¬i5               20.4%
A negated itemset such as ¬(i4, i5) contributes the product of 1 − μ over its items.
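Several rows of this table can be recomputed from the transaction database by replacing μ with 1 − μ on the negated side of a rule. The sketch below assumes that per-item reading of negation; `measures` and `degree` are hypothetical names.

```python
from math import prod

ITEMS = ["i1", "i2", "i3", "i4", "i5", "i6"]
ROWS = [
    [1.0, 0.7, 0.2, 0.0, 1.0, 1.0],
    [0.8, 0.0, 0.6, 0.8, 0.4, 0.2],
    [0.5, 0.8, 0.0, 0.8, 0.8, 0.0],
    [0.7, 0.2, 1.0, 0.9, 1.0, 0.8],
    [0.4, 0.4, 0.0, 0.6, 0.8, 0.9],
    [0.8, 0.0, 0.1, 1.0, 0.1, 0.8],
    [0.9, 0.9, 0.8, 0.2, 1.0, 1.0],
    [0.6, 0.1, 0.1, 0.8, 0.7, 0.8],
]
DB = [dict(zip(ITEMS, row)) for row in ROWS]


def degree(t, items, negated):
    """Product of mu (or of 1 - mu when this side of the rule is negated)."""
    return prod((1 - t[x]) if negated else t[x] for x in items)


def measures(a, b, neg_a=False, neg_b=False):
    """Return (unnormalised support, confidence) of the possibly negated rule."""
    num = sum(degree(t, a, neg_a) * degree(t, b, neg_b) for t in DB)
    den = sum(degree(t, a, neg_a) for t in DB)
    return num, num / den


# i1, i5 -> not i4  and  not i4 -> i1, i5
s1, c1 = measures(["i1", "i5"], ["i4"], neg_b=True)
s2, c2 = measures(["i4"], ["i1", "i5"], neg_a=True)
print(f"{s1:.3f}/8 {100 * c1:.1f}%  {100 * c2:.1f}%")

# i6 -> not i4  and  not i4 -> i6
s3, c3 = measures(["i6"], ["i4"], neg_b=True)
s4, c4 = measures(["i4"], ["i6"], neg_a=True)
print(f"{s3:.2f}/8 {100 * c3:.1f}%  {100 * c4:.1f}%")
```

With the 25% support and 70% confidence thresholds, ¬i4 → i1, i5 and ¬i4 → i6 both qualify in this recomputation.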
Associative Classification Rules
Associative classification rules are a special subset of association rules whose right-hand side is restricted to the class labels.
In classification, data attributes are partitioned into two categories: condition attributes and decision attributes.
For simplicity, decision attributes are converted into decision attribute-value pairs that are indicated as class labels.
Thus, class labels are also items in the database, but separate from condition items.
Two Constraints
The left-hand side of a classification rule must be a frequent itemset of condition attributes, or the negation of an infrequent conditional itemset
The class labels that appear on the right-hand side of classification rules must also be frequent 1-itemsets
Positive Fuzzy Associative Classification Rules
Let A ⊂ I be an itemset and c ∈ C be a class label. The relationship A → c is a positive fuzzy associative classification rule if the following conditions hold:
1) A ∪ {c} is a frequent itemset in D: Supp(A ∪ {c})/|D| ≥ minsupp
2) A → c is confident: Conf(A → c) = Supp(A ∪ {c})/Supp(A) ≥ minconf
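The two conditions translate directly into a small check. In the sketch below the class label is stored as one more item with a crisp 0/1 membership; the weather-style data and the function names are hypothetical.

```python
from math import prod


def supp(items, db):
    """Fuzzy support: sum over transactions of the product of memberships."""
    return sum(prod(t[x] for x in items) for t in db)


def is_positive_class_rule(a, c, db, minsupp, minconf):
    """Check conditions 1) and 2) for the rule A -> c."""
    both = supp(list(a) + [c], db)
    frequent = both / len(db) >= minsupp      # 1) A u {c} is frequent
    confident = both / supp(a, db) >= minconf  # 2) A -> c is confident
    return frequent and confident


# Hypothetical data: condition items "hot", "humid"; class label "rain".
db = [
    {"hot": 0.9, "humid": 0.8, "rain": 1.0},
    {"hot": 0.7, "humid": 0.9, "rain": 1.0},
    {"hot": 0.8, "humid": 0.1, "rain": 0.0},
    {"hot": 0.2, "humid": 0.7, "rain": 0.0},
]
print(is_positive_class_rule(["hot", "humid"], "rain", db, 0.3, 0.8))
```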
Negative Fuzzy Associative Classification Rules
We only consider the format ¬A → c, where
– A is a frequent itemset,
– {c} is a frequent class label,
– A ∪ {c} is infrequent
¬A → c is a negative fuzzy associative classification rule if
1) Supp(A) ≥ minsupp;
2) Supp({c}) ≥ minsupp;
3) Supp(A ∪ {c})/|D| < minsupp;
4) Supp(¬A ∪ {c})/|D| ≥ minsupp;
5) Conf(¬A → c) = Supp(¬A ∪ {c})/Supp(¬A) ≥ minconf.
Learning Algorithm
Step 1: Finding the set of frequent conditional itemsets for associative classification rules
Step 2: Inducing both positive and negative fuzzy associative classification rules
– Add each frequent class label c to each frequent itemset X
  If X ∪ {c} is still frequent, then test if X → c is a positive fuzzy association rule;
  If X ∪ {c} is infrequent, then test if ¬X → c is a negative fuzzy association rule.
– A frequent itemset Y is partitioned into two subsets A and B, and the associations A ∪ ¬B → c and ¬A ∪ B → c are tested against the support threshold and confidence threshold.
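Step 2 might look roughly like the sketch below. It assumes Step 1 has already produced the frequent conditional itemsets, reads the negative test as Supp(¬X ∪ {c})/|D| ≥ minsupp and Supp(¬X ∪ {c})/Supp(¬X) ≥ minconf with 1 − μ standing in for negation, and leaves out the partitioned forms from the last bullet. The data, thresholds, and names are hypothetical.

```python
from math import prod


def supp(items, db, negated=()):
    """Fuzzy support; items listed in `negated` contribute 1 - mu."""
    return sum(
        prod((1 - t[x]) if x in negated else t[x] for x in items)
        for t in db
    )


def induce_rules(freq_itemsets, labels, db, minsupp, minconf):
    """Try every frequent class label against every frequent conditional itemset."""
    n = len(db)
    found = []
    for x in freq_itemsets:
        for c in labels:
            if supp([c], db) / n < minsupp:
                continue  # only frequent class labels are considered
            both = list(x) + [c]
            if supp(both, db) / n >= minsupp:
                # X u {c} is frequent: test the positive rule X -> c.
                if supp(both, db) / supp(x, db) >= minconf:
                    found.append(("+", tuple(x), c))
            else:
                # X u {c} is infrequent: test the negative rule (not X) -> c.
                neg = supp(both, db, negated=x)
                if neg / n >= minsupp and neg / supp(x, db, negated=x) >= minconf:
                    found.append(("-", tuple(x), c))
    return found


# Hypothetical data: condition items "hot", "humid"; class labels "rain", "dry".
db = [
    {"hot": 0.9, "humid": 0.8, "rain": 1.0, "dry": 0.0},
    {"hot": 0.7, "humid": 0.9, "rain": 1.0, "dry": 0.0},
    {"hot": 0.8, "humid": 0.1, "rain": 0.0, "dry": 1.0},
    {"hot": 0.2, "humid": 0.7, "rain": 0.0, "dry": 1.0},
]
freq_itemsets = [("hot",), ("humid",), ("hot", "humid")]  # assumed output of Step 1
found = induce_rules(freq_itemsets, ["rain", "dry"], db, minsupp=0.28, minconf=0.7)
print(found)
```

In this toy run "+" marks a positive rule (hot, humid → rain) and "-" marks a negative one (¬humid → dry).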
Conclusion
Traditional association rules
Fuzzy extensions and negative rules
Fuzzy associative classification rules
An example
Algorithms