Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Yun Chi, Haixun Wang, Philip S. Yu, Richard R. Muntz, ICDM 2004.
Adviser: Jia-Ling Koh
Speaker: Shu-Ning Shin
Date: 2005.5.6


Transcript of Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Page 1: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Yun Chi, Haixun Wang, Philip S. Yu, Richard R. Muntz, ICDM 2004.

Adviser: Jia-Ling Koh
Speaker: Shu-Ning Shin
Date: 2005.5.6

Page 2: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Introduction

• Algorithm Moment: mines closed frequent itemsets in the most recent N transactions of a data stream.

• Data structure: the closed enumeration tree (CET), which maintains:
– the closed frequent itemsets,
– the boundary between the closed frequent itemsets and the rest of the itemsets.

Page 3: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

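Problem

• Lexicographic order: a total order assumed on the items.
• Closed frequent itemset: a frequent itemset none of whose proper supersets has the same support.
• Items Σ = {A, B, C, D}, window size N = 4, minimum support s = ½.

[Table: the transactions in the example window. The rows did not survive extraction; only the fragment "CDACABCAB" remains.]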
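The definition of a closed frequent itemset can be checked by brute force on a small window. The sketch below is only a reference computation, not the Moment algorithm. The four transactions are an assumption (the slide's table is not recoverable), chosen so that the supports agree with the node labels shown on the later slides. The function name `closed_frequent_itemsets` is hypothetical.

```python
from itertools import combinations

def closed_frequent_itemsets(window, min_count):
    """Brute-force reference: return {itemset: support} for closed frequent itemsets."""
    items = sorted(set().union(*window))
    support = {}
    # Count the support of every non-empty itemset and keep the frequent ones.
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            count = sum(1 for t in window if set(combo) <= t)
            if count >= min_count:
                support[frozenset(combo)] = count
    # A frequent itemset is closed if no proper superset has the same support.
    # (A superset with equal support is itself frequent, so it is in `support`.)
    closed = {}
    for itemset, count in support.items():
        if not any(itemset < other and support[other] == count for other in support):
            closed[itemset] = count
    return closed

# Assumed example window (tids 1..4); N = 4 and s = 1/2 give a threshold of 2.
window = [{"C", "D"}, {"A", "B"}, {"A", "B", "C"}, {"A", "B", "C"}]
result = closed_frequent_itemsets(window, min_count=2)
for itemset, count in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(itemset)), count)
```

Under this assumed window the closed frequent itemsets are AB, ABC and C. A and B are frequent but not closed because AB has the same support.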

Page 4: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

CET (1)

• Four types of itemset nodes:

– Infrequent:
• Infrequent gateway node, dashed circle. Example: D.

– Frequent but not closed:
• Unpromising gateway node, dashed rectangle. Example: AC.
• Intermediate node. Example: A.

– Closed:
• Closed node, solid rectangle. Example: ABC.

Page 5: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

CET (2)

• Property 1: if nI is an infrequent gateway node, then any node nJ where J ⊃ I represents an infrequent itemset.

• Property 2: if nI is an unpromising gateway node, then nI is not closed, and none of nI's descendants is closed.

• Property 3: if nI is an intermediate node, then nI is not closed and nI has closed descendants.

Page 6: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Moment: Build CET (1)

• Node nI stores: itemset I, node type, support, tid_sum (the sum of the ids of the transactions that contain I).

• Hash table:
– stores all closed frequent itemsets;
– is used to check whether nI is an unpromising gateway node, that is, whether there exists a closed node nJ where J ≺ I, J ⊃ I, and J has the same support as I;
– is hashed on the (support, tid_sum) of nI.
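The lookup on (support, tid_sum) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class `ClosedItemsetTable`, its method names, and the example window are assumptions. The ordering condition J ≺ I is not checked explicitly here; the sketch assumes that only itemsets explored earlier have been inserted.

```python
from collections import defaultdict

def support_and_tid_sum(itemset, window):
    """Return (support, tid_sum) of an itemset over a {tid: items} window."""
    tids = [tid for tid, items in window.items() if itemset <= items]
    return len(tids), sum(tids)

class ClosedItemsetTable:
    """Hypothetical table of closed itemsets, bucketed by (support, tid_sum)."""

    def __init__(self):
        self._buckets = defaultdict(list)

    def insert(self, itemset, support, tid_sum):
        self._buckets[(support, tid_sum)].append(frozenset(itemset))

    def has_closed_superset(self, itemset, support, tid_sum):
        # Only itemsets in the same bucket can be supersets with equal support,
        # because equal support plus containment implies the same set of tids.
        bucket = self._buckets.get((support, tid_sum), ())
        return any(itemset < closed for closed in bucket)

# Assumed example window, keyed by transaction id.
window = {1: {"C", "D"}, 2: {"A", "B"}, 3: {"A", "B", "C"}, 4: {"A", "B", "C"}}

table = ClosedItemsetTable()
abc = frozenset("ABC")
table.insert(abc, *support_and_tid_sum(abc, window))   # ABC: support 2, tid_sum 7

ac = frozenset("AC")
ac_unpromising = table.has_closed_superset(ac, *support_and_tid_sum(ac, window))
c = frozenset("C")
c_unpromising = table.has_closed_superset(c, *support_and_tid_sum(c, window))
print(ac_unpromising, c_unpromising)
```

With this window, AC lands in the same bucket as the closed itemset ABC and is reported as unpromising, while C falls in a different bucket and is not.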

Page 7: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Moment: Build CET (2)

Page 8: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Moment: Build CET (3)

• Items Σ = {A, B, C, D}; call Explore(n{i}) for each i in Σ.

[Figure: root node ψ with children A, B, C, D, each labeled 0.]
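The build step, calling Explore on each single-item node, might look like the sketch below. It is written from the node types and the hash-table check described on the earlier slides; the details (joining a node only with its frequent right siblings, the `Node` fields, the names `explore` and `build_cet`, and the example window) are assumptions rather than a transcription of the slide's pseudocode.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    itemset: tuple
    support: int
    tid_sum: int
    kind: str = ""
    children: list = field(default_factory=list)

def stats(itemset, window):
    tids = [tid for tid, items in window.items() if set(itemset) <= items]
    return len(tids), sum(tids)

def explore(node, right_siblings, window, min_count, closed_table):
    if node.support < min_count:
        node.kind = "infrequent gateway"
        return
    key = (node.support, node.tid_sum)
    # Unpromising if an already-found closed itemset is a superset with equal support.
    if any(set(node.itemset) < set(j) for j in closed_table.get(key, [])):
        node.kind = "unpromising gateway"
        return
    # Create children by joining with frequent right siblings (same prefix).
    for sib in right_siblings:
        if sib.support >= min_count:
            joined = node.itemset + sib.itemset[-1:]
            node.children.append(Node(joined, *stats(joined, window)))
    for i, child in enumerate(node.children):
        explore(child, node.children[i + 1:], window, min_count, closed_table)
    if any(c.support == node.support for c in node.children):
        node.kind = "intermediate"
    else:
        node.kind = "closed"
        closed_table.setdefault(key, []).append(node.itemset)

def build_cet(window, items, min_count):
    root = Node((), len(window), sum(window))
    root.children = [Node((i,), *stats((i,), window)) for i in items]
    closed_table = {}
    for i, child in enumerate(root.children):
        explore(child, root.children[i + 1:], window, min_count, closed_table)
    return root, closed_table

def node_kinds(root):
    kinds, stack = {}, list(root.children)
    while stack:
        n = stack.pop()
        kinds["".join(n.itemset)] = n.kind
        stack.extend(n.children)
    return kinds

# Assumed example window; threshold s * N = 2.
window = {1: {"C", "D"}, 2: {"A", "B"}, 3: {"A", "B", "C"}, 4: {"A", "B", "C"}}
root, closed_table = build_cet(window, ["A", "B", "C", "D"], min_count=2)
kinds = node_kinds(root)
for name in sorted(kinds):
    print(name, kinds[name])
```

On the assumed window this labels D as an infrequent gateway, AC as an unpromising gateway, A as intermediate and ABC as closed, which matches the examples given for the four node types.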

Page 9: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Moment: Add CET (1)

Page 10: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Moment: Add CET (2)

• Adding a transaction tid 5 = {A, C, D}:
• Call Addition(nψ, t5, D, minsup)

[Figure: CET after the addition. Recoverable labels: A 4, C 4, D 2, AC 3; F = {D}; new nodes AD and CD, shown first with support 0 and then as AD 1 and CD 2.]
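The bookkeeping part of an addition, updating support and tid_sum for the nodes contained in the new transaction and collecting the newly frequent ones, can be sketched with a flat scan. This is not the tree-based Addition procedure itself. The starting window, the set of tracked itemsets, the function name `apply_addition`, and the fixed threshold of 2 (s × N with N = 4) are assumptions.

```python
def stats(itemset, window):
    """Return [support, tid_sum] of an itemset over a {tid: items} window."""
    tids = [tid for tid, items in window.items() if itemset <= items]
    return [len(tids), sum(tids)]

def apply_addition(nodes, tid, new_items, min_count):
    """Update every tracked itemset contained in the new transaction.

    Returns the itemsets that were infrequent before and are frequent now.
    """
    newly_frequent = set()
    for itemset, stat in nodes.items():
        if itemset <= new_items:          # only these nodes are affected
            was_frequent = stat[0] >= min_count
            stat[0] += 1                  # support
            stat[1] += tid                # tid_sum
            if not was_frequent and stat[0] >= min_count:
                newly_frequent.add(itemset)
    return newly_frequent

# Assumed window before the addition (tids 1..4).
window = {1: {"C", "D"}, 2: {"A", "B"}, 3: {"A", "B", "C"}, 4: {"A", "B", "C"}}
tracked = [frozenset(s) for s in ("A", "B", "C", "D", "AB", "AC", "ABC")]
nodes = {s: stats(s, window) for s in tracked}

F = apply_addition(nodes, 5, {"A", "C", "D"}, min_count=2)
print({"".join(sorted(s)) for s in F})
print({"".join(sorted(s)): stat[0] for s, stat in nodes.items()})
```

Under these assumptions the supports after adding t5 are A 4, C 4, D 2 and AC 3, and D is the only newly frequent itemset, in line with the labels recoverable from the slide.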

Page 11: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Moment: Delete CET (1)

Page 12: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Moment: Delete CET (2)

• Deleting a transaction tid 1:

[Figure: CET after the deletion. Recoverable labels: C 3, D 1; F = {D}.]

Page 13: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Moment: Update CET (3)

• Deleting a transaction tid 2:

[Figure: CET after the deletion. Recoverable labels: A 3, AB 2, B 2.]
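The effect of evicting the oldest transaction on the stored counts can be sketched in the same flat style. Again this is only the counting side, not the tree-based deletion procedure. The five-transaction starting window (the state after t5 has been added), the tracked itemsets, the name `evict_oldest`, and the fixed threshold of 2 are assumptions.

```python
def stats(itemset, window):
    """Return [support, tid_sum] of an itemset over a {tid: items} window."""
    tids = [tid for tid, items in window.items() if itemset <= items]
    return [len(tids), sum(tids)]

def evict_oldest(window, nodes, min_count):
    """Remove the transaction with the smallest tid and update affected itemsets.

    Returns (tid, itemsets that were frequent before and are infrequent now).
    """
    tid = min(window)
    old_items = window.pop(tid)
    newly_infrequent = set()
    for itemset, stat in nodes.items():
        if itemset <= old_items:
            was_frequent = stat[0] >= min_count
            stat[0] -= 1                  # support
            stat[1] -= tid                # tid_sum
            if was_frequent and stat[0] < min_count:
                newly_infrequent.add(itemset)
    return tid, newly_infrequent

# Assumed window after t5 = {A, C, D} has been added and before t1 is removed.
window = {1: {"C", "D"}, 2: {"A", "B"}, 3: {"A", "B", "C"},
          4: {"A", "B", "C"}, 5: {"A", "C", "D"}}
tracked = [frozenset(s) for s in ("A", "B", "C", "D", "AB", "AC", "CD")]
nodes = {s: stats(s, window) for s in tracked}

tid_1, gone_1 = evict_oldest(window, nodes, min_count=2)   # deletes tid 1
after_1 = {"".join(sorted(s)): stat[0] for s, stat in nodes.items()}
tid_2, gone_2 = evict_oldest(window, nodes, min_count=2)   # deletes tid 2
after_2 = {"".join(sorted(s)): stat[0] for s, stat in nodes.items()}
print(after_1, after_2)
```

Under these assumptions, deleting tid 1 leaves C with support 3 and D with support 1, and deleting tid 2 leaves A 3, AB 2 and B 2, matching the labels recoverable from the two deletion slides. The flat scan also reports CD as newly infrequent; in a tree, that node sits below C rather than among the top-level children.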

Page 14: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Experiment (1)

• Dataset: T20I4D100K
• Window size N = 100000

Page 15: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Experiment (2)

Page 16: Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding window

Experiment (3)

• Real dataset: BMS-WebView-1
• Items: 497, transactions: 59602
• Window size N = 50000