Data Mining Techniques Association Rule. What Is Association Mining? Association Rule Mining –...


What Is Association Mining?

• Association Rule Mining
– Finding frequent patterns, associations, correlations, or causal structures among item sets in transaction databases, relational databases, and other information repositories
• Applications
– Market basket analysis (marketing strategy: items to put on sale at reduced prices), cross-marketing, catalog design, shelf space layout design, etc.
• Examples
– Rule form: Body ⇒ Head [Support, Confidence]
– buys(x, “Computer”) ⇒ buys(x, “Software”) [2%, 60%]
– major(x, “CS”) ^ takes(x, “DB”) ⇒ grade(x, “A”) [1%, 75%]

Market Basket Analysis

Typically, association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold.

Rule Measures: Support and Confidence

• With minimum support 50% and minimum confidence 50%, we have
– A ⇒ C [50%, 66.6%]
– C ⇒ A [50%, 100%]

Transaction ID   Items Bought
1000             A, B, C
2000             A, C
3000             A, D
4000             B, E, F
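The two rule measures can be checked directly against the four transactions in the table. The following is a minimal sketch; the function names `support` and `confidence` are chosen here for illustration and do not come from the slides.

```python
# Transactions from the table above, keyed by transaction ID.
transactions = {
    1000: {"A", "B", "C"},
    2000: {"A", "C"},
    3000: {"A", "D"},
    4000: {"B", "E", "F"},
}

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= items for items in db.values()) / len(db)

def confidence(body, head, db):
    """support(body and head together) / support(body)."""
    return support(set(body) | set(head), db) / support(body, db)

for body, head in (("A", "C"), ("C", "A")):
    s = support(set(body) | set(head), transactions)
    c = confidence(body, head, transactions)
    print(f"{body} => {head}: support={s:.1%}, confidence={c:.1%}")
```

Both rules share the same support because support depends only on the combined itemset {A, C}; the confidence differs because the denominator is the support of the rule body.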

Support & Confidence

Association Rule: Basic Concepts

• Given
– (1) a database of transactions
– (2) each transaction is a list of items (purchased by a customer in a visit)
• Find all rules that correlate the presence of one set of items with that of another set of items
• Find all the rules A ⇒ B with minimum confidence and support
– support, s: P(A ∪ B), the probability that a transaction contains every item of both A and B
– confidence, c: P(B|A), the probability that a transaction containing A also contains B
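Written as formulas over a transaction database, the two measures are commonly defined as below. The symbol D for the database, the helper notation supp(X), and the use of |·| for counting transactions are assumptions added here rather than notation taken from the slides.

```latex
\[
\operatorname{supp}(X) = \frac{\left|\{\, t \in D : X \subseteq t \,\}\right|}{|D|}
\]
\[
s(A \Rightarrow B) = \operatorname{supp}(A \cup B),
\qquad
c(A \Rightarrow B) = \frac{\operatorname{supp}(A \cup B)}{\operatorname{supp}(A)}
\]
```

Here A ∪ B is the union of the two itemsets, so a transaction counts toward the rule's support only if it contains all items of A and all items of B.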

Terminologies

• Item
– I1, I2, I3, …
– A, B, C, …
• Itemset
– {I1}, {I1, I7}, {I2, I3, I5}, …
– {A}, {A, G}, {B, C, E}, …
• 1-Itemset
– {I1}, {I2}, {A}, …
• 2-Itemset
– {I1, I7}, {I3, I5}, {A, G}, …

Terminologies

• K-Itemset
– An itemset that contains K items
• Frequent (Large) K-Itemset
– A K-itemset that satisfies a minimum support threshold
• Association Rule
– A rule that satisfies both a minimum support threshold and a minimum confidence threshold

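Analysis

• The number of possible itemsets grows exponentially with the number of items: n distinct items give C(n, k) itemsets of cardinality k and 2^n − 1 non-empty itemsets in total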
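To get a feel for this growth, the counts can be computed directly. This is a small illustrative sketch; the function names and the sample values of n are chosen here and are not from the slides.

```python
from math import comb

def count_k_itemsets(n_items, k):
    """Number of distinct itemsets of cardinality k over n_items items."""
    return comb(n_items, k)

def count_all_itemsets(n_items):
    """Number of non-empty itemsets over n_items items."""
    return 2 ** n_items - 1

for n in (5, 10, 20, 100):
    print(f"n={n}: 2-itemsets={count_k_itemsets(n, 2)}, "
          f"3-itemsets={count_k_itemsets(n, 3)}, "
          f"all itemsets={count_all_itemsets(n)}")
```

Counting the support of every possible itemset is therefore impractical for realistic catalogs, which is the motivation for pruning the search space.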

Fast Algorithms for Mining Association Rules

Mining Association Rules: Apriori Principle

• For rule A ⇒ C:
– support = support({A, C}) = 50%
– confidence = support({A, C}) / support({A}) = 66.6%
• The Apriori principle:
– Any subset of a frequent itemset must be frequent

Transaction ID   Items Bought
1000             A, B, C
2000             A, C
3000             A, D
4000             B, E, F

Frequent Itemset   Support
{A}                75%
{B}                50%
{C}                50%
{A, C}             50%

Min. support 50%
Min. confidence 50%
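The frequent-itemset table and the Apriori principle can both be verified by brute force on this small database. The sketch below enumerates every possible itemset, which is only feasible because there are six items; it is a check of the table, not the Apriori algorithm itself, and all names are illustrative.

```python
from itertools import combinations

transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
MIN_SUPPORT = 0.5

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(set(itemset) <= t for t in db) / len(db)

def frequent_itemsets_bruteforce(db, min_support):
    """Test every non-empty itemset over the items that occur in db."""
    items = sorted(set().union(*db))
    result = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            s = support(candidate, db)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result

frequent = frequent_itemsets_bruteforce(transactions, MIN_SUPPORT)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{s:.0%}")

# Apriori principle: every non-empty proper subset of a frequent itemset is frequent.
apriori_holds = all(
    frozenset(sub) in frequent
    for itemset in frequent
    for k in range(1, len(itemset))
    for sub in combinations(itemset, k)
)
print("Apriori principle holds:", apriori_holds)
```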

Mining Frequent Itemsets: the Key Step

• Find the frequent itemsets: the sets of items that have minimum support
– A subset of a frequent itemset must also be a frequent itemset
  • i.e., if {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets
– Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets)
• Use the frequent itemsets to generate association rules
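The second step, turning frequent itemsets into rules, can be sketched as follows. The input dictionary holds the frequent itemsets of the four-transaction example (minimum support 50%); the function name `generate_rules` and the tuple layout of the result are assumptions made for this illustration.

```python
from itertools import combinations

# Frequent itemsets and their supports from the four-transaction example.
frequent = {
    frozenset("A"): 0.75,
    frozenset("B"): 0.50,
    frozenset("C"): 0.50,
    frozenset("AC"): 0.50,
}

def generate_rules(frequent, min_confidence):
    """Split each frequent itemset into body => head and keep confident rules."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty body and a non-empty head
        for k in range(1, len(itemset)):
            for body in map(frozenset, combinations(sorted(itemset), k)):
                head = itemset - body
                # The body is itself frequent (Apriori principle), so its
                # support is already available in the dictionary.
                conf = sup / frequent[body]
                if conf >= min_confidence:
                    rules.append((body, head, sup, conf))
    return rules

rules = generate_rules(frequent, min_confidence=0.5)
for body, head, sup, conf in rules:
    print(f"{sorted(body)} => {sorted(head)} [{sup:.0%}, {conf:.1%}]")
```

No further database scan is needed at this stage, because every confidence value is a ratio of supports that were already counted while finding the frequent itemsets.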

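Example

Database D (one transaction per line)
1 3 4
2 3 5
1 2 3 5
2 5

Scan D, count C1

C1 Itemset   Count
{1}          2
{2}          3
{3}          3
{4}          1
{5}          3

Generate L1: {1}, {2}, {3}, {5}

Generate C2: {1 2}, {1 3}, {1 5}, {2 3}, {2 5}, {3 5}

Scan D, count C2

C2 Itemset   Count
{1 2}        1
{1 3}        2
{1 5}        1
{2 3}        2
{2 5}        3
{3 5}        2

Generate L2: {1 3}, {2 3}, {2 5}, {3 5}

Generate C3: {2 3 5}

Scan D, count C3

C3 Itemset   Count
{2 3 5}      2

Generate L3: {2 3 5}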
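The level-wise trace above can be reproduced with a short program. This is a minimal sketch rather than a reference implementation: the minimum support count of 2 is an assumption inferred from the trace (item 4, with count 1, is dropped while item 1, with count 2, is kept), and candidate generation uses a simple union-based join followed by subset pruning instead of the prefix-based self-join.

```python
from itertools import combinations

D = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
MIN_COUNT = 2  # assumption inferred from the trace, not stated on the slide

def count_support(candidates, db):
    """Scan the database once and count each candidate itemset."""
    return {c: sum(c <= t for t in db) for c in candidates}

def generate_candidates(prev_frequent, k):
    """Join (k-1)-itemsets into k-itemsets, then prune any candidate
    that has an infrequent (k-1)-subset."""
    joined = {a | b for a in prev_frequent for b in prev_frequent if len(a | b) == k}
    return {c for c in joined
            if all(frozenset(s) in prev_frequent for s in combinations(c, k - 1))}

def apriori(db, min_count):
    """Return [L1, L2, ...], each a dict mapping itemset -> support count."""
    candidates = {frozenset([item]) for t in db for item in t}  # C1
    levels = []
    k = 1
    while candidates:
        counts = count_support(candidates, db)
        frequent = {c for c, n in counts.items() if n >= min_count}
        if not frequent:
            break
        levels.append({c: counts[c] for c in frequent})
        k += 1
        candidates = generate_candidates(frequent, k)
    return levels

levels = apriori(D, MIN_COUNT)
for k, level in enumerate(levels, start=1):
    print(f"L{k}:", sorted((sorted(c), n) for c, n in level.items()))
```

Each pass of the loop corresponds to one "scan D" step in the trace, so the database is read once per itemset size.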

Example of Generating Candidates

• L3={abc, abd, acd, ace, bcd}

• Self-joining: L3*L3

– abcd from abc and abd

– acde from acd and ace

• Pruning:

– acde is removed because ade is not in L3

• C4={abcd}
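The self-join and pruning steps of this example can be sketched in code. Itemsets are represented as sorted tuples so that two itemsets are joined only when they agree on everything except their last item; the function name `apriori_gen` and its three-part return value are choices made for this illustration.

```python
from itertools import combinations

def apriori_gen(prev_frequent):
    """Build the candidates C(k+1) from the frequent k-itemsets L(k).

    Returns (joined, candidates, pruned) so both steps can be inspected.
    """
    prev = sorted(tuple(sorted(s)) for s in prev_frequent)
    prev_set = set(prev)
    k = len(prev[0]) if prev else 0

    # Self-join: combine two k-itemsets that share their first k-1 items.
    joined = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            if a[:-1] == b[:-1]:
                joined.append(a + (b[-1],))

    # Prune: drop a candidate if any of its k-subsets is not in L(k).
    candidates, pruned = [], []
    for c in joined:
        if all(sub in prev_set for sub in combinations(c, k)):
            candidates.append(c)
        else:
            pruned.append(c)
    return joined, candidates, pruned

L3 = ["abc", "abd", "acd", "ace", "bcd"]
joined, C4, pruned = apriori_gen(L3)
print("after self-join:", ["".join(c) for c in joined])
print("pruned:", ["".join(c) for c in pruned])
print("C4:", ["".join(c) for c in C4])
```

Pruning happens before any database scan, so candidates such as acde are discarded without their support ever being counted.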

Example

Apriori Algorithm

Exercise 4

min-sup = 20%
min-conf = 80%

Demo: IBM Intelligent Miner

Demo Database

Multi-Dimensional Association

• Single-Dimensional (Intra-Dimension) Rules: a single dimension (predicate) with multiple occurrences
– buys(X, “milk”) ⇒ buys(X, “bread”)
• Multi-Dimensional Rules: two or more dimensions (predicates)
– Inter-dimension association rules (no repeated predicates)
  age(X, “19-25”) ^ occupation(X, “student”) ⇒ buys(X, “coke”)
– Hybrid-dimension association rules (repeated predicates)
  age(X, “19-25”) ^ buys(X, “popcorn”) ⇒ buys(X, “coke”)
• Categorical (Nominal) Attributes
– finite number of possible values, no ordering among values
• Quantitative Attributes
– numeric, implicit ordering among values
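One common way to handle several predicates, sketched here under stated assumptions, is to encode each (predicate, value) pair as a single item so that the same support and confidence computations apply. The customer records below are hypothetical and invented purely for illustration, and the age attribute is assumed to have been binned into ranges already.

```python
# Hypothetical customer records, invented purely for illustration.
records = [
    {"age": "19-25", "occupation": "student", "buys": ["coke", "popcorn"]},
    {"age": "19-25", "occupation": "student", "buys": ["coke"]},
    {"age": "26-35", "occupation": "engineer", "buys": ["milk", "bread"]},
    {"age": "19-25", "occupation": "clerk", "buys": ["popcorn"]},
]

def to_transaction(record):
    """Turn each (predicate, value) pair into one item such as 'age=19-25'."""
    items = set()
    for predicate, value in record.items():
        values = value if isinstance(value, list) else [value]
        items.update(f"{predicate}={v}" for v in values)
    return items

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(set(itemset) <= t for t in db) / len(db)

transactions = [to_transaction(r) for r in records]

# Inter-dimension rule: age(X, "19-25") ^ occupation(X, "student") => buys(X, "coke")
body = {"age=19-25", "occupation=student"}
head = {"buys=coke"}
rule_support = support(body | head, transactions)
rule_confidence = rule_support / support(body, transactions)
print(f"support={rule_support:.0%}, confidence={rule_confidence:.0%}")
```

A repeated predicate such as buys simply contributes several items to the same transaction, which is how a hybrid-dimension rule would be represented in this encoding.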

Exercise 5

min-sup = 20%
min-conf = 80%

Research Topics

• Quantitative Association Rules
– buys (bread, 5) ⇒ buys (milk, 3)
• Weighted Association Rules
• High Utility Association Rules
• Non-redundant Association Rules
• Constrained Association Rules Mining
• Multi-dimensional Association Rules
• Generalized Association Rules
• Negative Association Rules
• Incremental Mining of Association Rules
• Data Stream Association Rule Mining
• Interactive Mining of Association Rules