FS-Miner: Efficient and Incremental Mining of Frequent Sequence Patterns in Web Logs
Maged EL-Sayed, Carolina Ruiz, and Elke A. Rundensteiner
6th ACM International Workshop on Web Information and Data Management (WIDM 2004), pp. 128-135, 2004
November 12-13, 2004, Washington, DC, USA
Advisor: Professor Hsin-Hsi Chen
Reporter: Clarence Min-Chi Hsieh
Natural Language Processing Laboratory, Dept. of Computer Science and Info. Engineering, NTU
2005/10/11
Slide - 2
Copyright © Natural Language Processing Lab., NTU, 2005
Reporter: Clarence Min-Chi Hsieh
FS-Miner: Efficient and Incremental Mining of Frequent Sequence Patterns in Web Logs
Outline
Introduction
FS-Tree Construction
Mining the FS-Tree
Maintaining the FS-Tree Incrementally
Mining the FS-Tree Incrementally
Interactive Mining
Experimental Evaluation
Conclusions
Slide - 3
Introduction
Path Traversal Pattern
– FS, SS
– ABC, BCD…
Web Traversal Pattern
– IPA, MFTP
– ABDCA, CACADB…
Slide - 4
Introduction (Cont.)
Consider Backward Traversal
Subsequence
– Need Continuous
MSuppR_link: System Defined
MSuppR_seq: User Defined
MSuppC_link
MSuppC_seq
Slide - 5
FS-Tree Construction
SID  InSeq
1    dgi
2    dg
3    cdehi
4    cde
5    cbcdg
6    cb
7    abcdgi
8    abcd
9    bdehi
10   bdeh
11   cdebfabc
12   cdefabc
13   aic
14   die
15   igdba
16   efa
17   ef
18   efab
MSuppC_link = 2
MSuppC_seq = 3
Total # of links = 50 (SIDs 1-15)
MSuppR_link = 4%
MSuppR_seq = 6%
System Defined: MSuppR_link
User Defined: MSuppR_seq
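The relation between the relative thresholds (MSuppR) and the count thresholds (MSuppC) on this slide can be checked with a small calculation. This is a minimal sketch: the function name is hypothetical, and rounding up is an assumption, since the slide only shows values that divide evenly.

```python
def to_support_count(percent: int, total_links: int) -> int:
    """Convert a relative support (whole percent) into an absolute link count.

    Rounding up is an assumption; the slide only shows cases that divide evenly.
    """
    return (percent * total_links + 99) // 100


TOTAL_LINKS = 50  # total number of links stated on the slide

msuppc_link = to_support_count(4, TOTAL_LINKS)  # from MSuppR_link = 4%
msuppc_seq = to_support_count(6, TOTAL_LINKS)   # from MSuppR_seq = 6%
print(msuppc_link, msuppc_seq)
```

Integer arithmetic is used so that the result does not depend on floating-point rounding.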
Slide - 6
FS-Tree Construction (Cont.)
Link  Count
d-g   4
g-i   2
c-d   7
d-e   6
e-h   3
h-i   2
c-b   2
b-c   5
f-a   2
a-b   4
b-d   2
e-b   1
b-f   1
e-f   1
a-i   1
i-c   1
d-i   1
i-e   1
i-g   1
g-d   1
d-b   1
b-a   1
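The link counts in this table can be reproduced by counting every pair of consecutive pages in sequences 1-15. The sketch below assumes each page is a single character, as in the slide's example; the names `SEQUENCES` and `count_links` are illustrative.

```python
from collections import Counter

# Input sequences 1-15 from the slide (16-18 arrive later as the increment).
SEQUENCES = {
    1: "dgi", 2: "dg", 3: "cdehi", 4: "cde", 5: "cbcdg",
    6: "cb", 7: "abcdgi", 8: "abcd", 9: "bdehi", 10: "bdeh",
    11: "cdebfabc", 12: "cdefabc", 13: "aic", 14: "die", 15: "igdba",
}


def count_links(sequences):
    """Count each link (ordered pair of consecutive pages) over all sequences."""
    counts = Counter()
    for seq in sequences.values():
        for src, dst in zip(seq, seq[1:]):
            counts[f"{src}-{dst}"] += 1
    return counts


link_counts = count_links(SEQUENCES)
print(link_counts["c-d"], sum(link_counts.values()))
```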
Slide - 7
FS-Tree Construction (Cont.)
The link table is split into two tables when the FS-tree is built.

Header Table (HT)
Link  Count  ListH
d-g   4
g-i   2
c-d   7
d-e   6
e-h   3
h-i   2
c-b   2
b-c   5
f-a   2
a-b   4
b-d   2

Non-Frequent Links Table (NFLT)
Link  Count  SID
e-b   1      11
b-f   1      11
e-f   1      12
a-i   1      13
i-c   1      13
d-i   1      14
i-e   1      14
i-g   1      15
g-d   1      15
d-b   1      15
b-a   1      15
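The split into HT and NFLT can be sketched as follows. This is a minimal illustration under the slide's threshold MSuppC_link = 2; the dictionary layout (count plus list of SIDs for the NFLT) and the function name are assumptions, not the paper's data structures.

```python
SEQUENCES = {
    1: "dgi", 2: "dg", 3: "cdehi", 4: "cde", 5: "cbcdg",
    6: "cb", 7: "abcdgi", 8: "abcd", 9: "bdehi", 10: "bdeh",
    11: "cdebfabc", 12: "cdefabc", 13: "aic", 14: "die", 15: "igdba",
}


def build_tables(sequences, msuppc_link):
    """Return (header_table, nflt) for the given link-support count threshold."""
    counts, sids = {}, {}
    for sid, seq in sequences.items():
        for src, dst in zip(seq, seq[1:]):
            link = f"{src}-{dst}"
            counts[link] = counts.get(link, 0) + 1
            sids.setdefault(link, []).append(sid)
    # Links meeting the threshold go to the HT; the rest keep their SIDs in the NFLT.
    header_table = {l: c for l, c in counts.items() if c >= msuppc_link}
    nflt = {l: (c, sids[l]) for l, c in counts.items() if c < msuppc_link}
    return header_table, nflt


header_table, nflt = build_tables(SEQUENCES, msuppc_link=2)
print(sorted(header_table))
print(nflt["e-f"])
```

Keeping the SIDs with each non-frequent link is what later allows a sequence to be retrieved from the original database when such a link becomes frequent.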
Slide - 8
FS-Tree Construction (Cont.)
SID 1, InSeq dgi
[Figure: Header Table (HT) beside the FS-tree. New branch Root → d → g → i with d-g:1 and g-i:1; node i marks the end of sequence 1.]
Slide - 9
FS-Tree Construction (Cont.)
SID 2, InSeq dg
[Figure: Header Table (HT) beside the FS-tree. Branch Root → d → g → i now has d-g:2 and g-i:1; node g marks the end of sequence 2.]
Slide - 10
FS-Tree Construction (Cont.)
SID 3, InSeq cdehi
[Figure: Header Table (HT) beside the FS-tree. New branch Root → c → d → e → h → i with c-d:1, d-e:1, e-h:1, h-i:1; node i marks the end of sequence 3.]
Slide - 11
FS-Tree Construction (Cont.)
SID 4, InSeq cde
[Figure: Header Table (HT) beside the FS-tree. On the branch Root → c → d → e → h → i, c-d and d-e rise to 2; node e marks the end of sequence 4.]
Slide - 12
FS-Tree Construction (Cont.)
SID 5, InSeq cbcdg
[Figure: Header Table (HT) beside the FS-tree. New branch below the root child c: c → b → c → d → g with c-b:1, b-c:1, c-d:1, d-g:1; node g marks the end of sequence 5.]
Slide - 13
FS-Tree Construction (Cont.)
SID 6, InSeq cb
[Figure: Header Table (HT) beside the FS-tree. On the branch c → b → c → d → g, c-b rises to 2; node b marks the end of sequence 6.]
Slide - 14
FS-Tree Construction (Cont.)
SID 7, InSeq abcdgi
[Figure: Header Table (HT) beside the FS-tree. New branch Root → a → b → c → d → g → i with a-b:1, b-c:1, c-d:1, d-g:1, g-i:1; node i marks the end of sequence 7.]
Slide - 15
FS-Tree Construction (Cont.)
SID 8, InSeq abcd
[Figure: Header Table (HT) beside the FS-tree. On the branch Root → a → b → c → d → g → i, a-b, b-c, and c-d rise to 2; node d marks the end of sequence 8.]
Slide - 16
FS-Tree Construction (Cont.)
SID 9, InSeq bdehi
[Figure: Header Table (HT) beside the FS-tree. New branch Root → b → d → e → h → i with b-d:1, d-e:1, e-h:1, h-i:1; node i marks the end of sequence 9.]
Slide - 17
FS-Tree Construction (Cont.)
SID 10, InSeq bdeh
[Figure: Header Table (HT) beside the FS-tree. On the branch Root → b → d → e → h → i, b-d, d-e, and e-h rise to 2; node h marks the end of sequence 10.]
Slide - 18
FS-Tree Construction (Cont.)
SID 11, InSeq cdebfabc (the non-frequent links e-b and b-f split it into cde and fabc)
[Figure: Header Table (HT) beside the FS-tree. On the branch Root → c → d → e, c-d and d-e rise to 3. New branch Root → f → a → b → c with f-a:1, a-b:1, b-c:1.]
Slide - 19
FS-Tree Construction (Cont.)
SID 12, InSeq cdefabc (the non-frequent link e-f splits it into cde and fabc)
[Figure: Header Table (HT) beside the FS-tree. On the branch Root → c → d → e, c-d and d-e rise to 4. On the branch Root → f → a → b → c, f-a, a-b, and b-c rise to 2.]
Slide - 20
FS-Tree Construction (Cont.)
SID 13, InSeq aic
[Figure: Header Table (HT) beside the FS-tree. The tree is unchanged from the previous slide: both links of the sequence (a-i, i-c) are non-frequent.]
Slide - 21
FS-Tree Construction (Cont.)
SID 14, InSeq die
[Figure: Header Table (HT) beside the FS-tree. The tree is unchanged from the previous slide: both links of the sequence (d-i, i-e) are non-frequent.]
Slide - 22
FS-Tree Construction (Cont.)
SID 15, InSeq igdba
[Figure: Header Table (HT) beside the FS-tree. The tree is unchanged from the previous slide: all links of the sequence (i-g, g-d, d-b, b-a) are non-frequent.]
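The insertion steps on the construction slides can be sketched in code. This is a simplified illustration, not the paper's algorithm as published: it assumes single-character pages, stores each link count on the child node, cuts a sequence wherever a link is not in the HT (as the slides do for SIDs 11 and 12), and omits the end-of-sequence markers and the ListH pointers. All names are hypothetical.

```python
class Node:
    """One page in the FS-tree; `count` belongs to the link parent -> this page."""

    def __init__(self, page):
        self.page = page
        self.count = 0
        self.children = {}


def split_at_nonfrequent(seq, frequent_links):
    """Cut a sequence wherever a link is missing from the header table."""
    segments, start = [], 0
    for i in range(len(seq) - 1):
        if f"{seq[i]}-{seq[i + 1]}" not in frequent_links:
            segments.append(seq[start:i + 1])
            start = i + 1
    segments.append(seq[start:])
    return [s for s in segments if len(s) >= 2]  # need at least one link


def insert(root, segment):
    node = root
    for depth, page in enumerate(segment):
        child = node.children.setdefault(page, Node(page))
        if depth > 0:  # the edge Root -> first page carries no link count
            child.count += 1
        node = child


def build_fs_tree(sequences, frequent_links):
    root = Node(None)
    for seq in sequences:
        for segment in split_at_nonfrequent(seq, frequent_links):
            insert(root, segment)
    return root


def count_along(root, path):
    """Count of the last link on the tree path spelled by `path`."""
    node = root
    for page in path:
        node = node.children[page]
    return node.count


SEQS = ["dgi", "dg", "cdehi", "cde", "cbcdg", "cb", "abcdgi", "abcd",
        "bdehi", "bdeh", "cdebfabc", "cdefabc", "aic", "die", "igdba"]
HT = {"d-g", "g-i", "c-d", "d-e", "e-h", "h-i", "c-b", "b-c", "f-a", "a-b", "b-d"}

tree = build_fs_tree(SEQS, HT)
print(sorted(tree.children), count_along(tree, "cde"), count_along(tree, "fabc"))
```

Sequences 13-15 contribute nothing here because every one of their links is non-frequent, which matches the unchanged tree on the last three construction slides.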
Slide - 23
Mining the FS-Tree
Step 1: Extracting Derived Paths
Step 2: Constructing Conditional Sequence Base
Step 3: Constructing Conditional FS-Tree
Step 4: Extracting Frequent Sequences
Slide - 24
Mining the FS-Tree (Cont.)
Step 1
[Figure: Header Table (HT) beside the complete FS-tree. Numbers in parentheses are link counts; numbers in brackets are the SIDs that end at a node. The two branches starting with c share the root child c.]
Root
  d →(2) g[2] →(1) i[1]
  c →(4) d →(4) e[4] →(1) h →(1) i[3]
  c →(2) b[6] →(1) c →(1) d →(1) g[5]
  a →(2) b →(2) c →(2) d[8] →(1) g →(1) i[7]
  b →(2) d →(2) e →(2) h[10] →(1) i[9]
  f →(2) a →(2) b →(2) c
Slide - 25
Mining the FS-Tree (Cont.)
Step 2
[Figure: the two derived paths for link e-h in the FS-tree: Root → c → d → e → h with c-d:4, d-e:4, e-h:1, and Root → b → d → e → h with b-d:2, d-e:2, e-h:2.]
Conditional Sequence base:
(c-d:1, d-e:1), (b-d:2, d-e:2)
Step 3
Conditional FS-Tree:
[Figure: the conditional FS-tree built in reverse order, shown step by step. Final tree: Root → e → d, with d-e:3, and below d the children c (c-d:1) and b (b-d:2).]
Slide - 26
Mining the FS-Tree (Cont.)
Step 4
Depth-first traversal
[Figure: conditional FS-tree for link e-h: Root → e → d, with d-e:3, and below d the children c (c-d:1) and b (b-d:2).]
Output: <deh : 3>
Slide - 27
Mining the FS-Tree (Cont.)
The Answers
Link d-g
  Derived Paths: (d-g:2); (c-b:2, b-c:1, c-d:1, d-g:1); (a-b:2, b-c:2, c-d:2, d-g:1)
  Conditional Sequence bases: (c-b:1, b-c:1, c-d:1); (a-b:1, b-c:1, c-d:1)
  Conditional FS-Trees: none
  Frequent Sequences: none
Link c-d
  Derived Paths: (c-d:4); (c-b:2, b-c:1, c-d:1); (a-b:2, b-c:2, c-d:2)
  Conditional Sequence bases: (c-b:1, b-c:1); (a-b:2, b-c:2)
  Conditional FS-Trees: (b-c:3)
  Frequent Sequences: <bcd : 3>
Slide - 28
Mining the FS-Tree (Cont.)
The Answers
Link e-h
  Derived Paths: (c-d:4, d-e:4, e-h:1); (b-d:2, d-e:2, e-h:2)
  Conditional Sequence bases: (c-d:1, d-e:1); (b-d:2, d-e:2)
  Conditional FS-Trees: (d-e:3)
  Frequent Sequences: <deh : 3>
Link d-e
  Derived Paths: (c-d:4, d-e:4); (b-d:2, d-e:2)
  Conditional Sequence bases: (c-d:4); (b-d:2)
  Conditional FS-Trees: (c-d:4)
  Frequent Sequences: <cde : 4>
Slide - 29
Mining the FS-Tree (Cont.)
The Answers
Link a-b
  Derived Paths: (a-b:2); (f-a:2, a-b:2)
  Conditional Sequence bases: (f-a:2)
  Conditional FS-Trees: none
  Frequent Sequences: none
Link b-c
  Derived Paths: (c-b:2, b-c:1); (a-b:2, b-c:2); (f-a:2, a-b:2, b-c:2)
  Conditional Sequence bases: (c-b:1); (a-b:2); (f-a:2, a-b:2)
  Conditional FS-Trees: (a-b:4)
  Frequent Sequences: <abc : 4>
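Steps 2 to 4 can be reproduced for the answers above with a short sketch. It starts from derived paths typed in from the slides (Step 1 is not implemented), represents the conditional FS-tree as nested dictionaries, and stops the depth-first traversal at the first link below MSuppC_seq. Emitting every frequent chain found on the way down is an assumption; the slides only show cases with a single result. All names are hypothetical.

```python
def mine_link(target, derived_paths, msuppc_seq):
    """Return {page_sequence: count} for frequent sequences ending with `target`."""
    # Step 2: each prefix link inherits the count of the target link on its path.
    base = []
    for path in derived_paths:
        *prefix, (last, count) = path
        assert last == target
        if prefix:
            base.append([(link, count) for link, _ in prefix])

    # Step 3: insert every base sequence in reverse order, summing counts.
    tree = {}
    for seq in base:
        level = tree
        for link, count in reversed(seq):
            entry = level.setdefault(link, [0, {}])
            entry[0] += count
            level = entry[1]

    # Step 4: depth-first traversal, following only links that meet MSuppC_seq.
    results = {}

    def dfs(level, pages):
        for link, (count, children) in level.items():
            if count >= msuppc_seq:
                longer = link.split("-")[0] + pages
                results[longer] = count
                dfs(children, longer)

    dfs(tree, target.replace("-", ""))
    return results


# Derived paths for e-h and b-c as listed on the slides
# (b-d:2 as in the header table).
E_H = [[("c-d", 4), ("d-e", 4), ("e-h", 1)],
       [("b-d", 2), ("d-e", 2), ("e-h", 2)]]
B_C = [[("c-b", 2), ("b-c", 1)],
       [("a-b", 2), ("b-c", 2)],
       [("f-a", 2), ("a-b", 2), ("b-c", 2)]]

print(mine_link("e-h", E_H, msuppc_seq=3))
print(mine_link("b-c", B_C, msuppc_seq=3))
```

Because sequences are inserted in reverse, paths that share a suffix next to the target link share a branch, so the summed count on each node is the support of the contiguous sequence that ends at the target link.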
Slide - 30
Maintaining the FS-Tree Incrementally
New input sequences:
SID  InSeq
16   efa
17   ef
18   efab
Link counts added by the new sequences: e-f:3, f-a:2, a-b:1
MSuppC_link = 2
MSuppC_seq = 3
e-f in the NFLT becomes frequent; move it to the HT.
Non-Frequent Links Table (NFLT) entry:
Link  Count  SID
e-f   1      12
Retrieve the sequence from the original DB:
SID  InSeq
12   cdefabc
Delete this record from the NFLT (move to HT).
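The table update described on this slide can be sketched as below. It is a simplified illustration: the table layout (plain dictionaries, with the NFLT holding a count and a list of SIDs) and the function name are assumptions, and only the tables are updated, not the tree.

```python
def apply_increment(header_table, nflt, new_sequences, msuppc_link):
    """Count the links of the new sequences and promote NFLT links that become frequent.

    Returns (header_table, nflt, promoted), where `promoted` maps each promoted
    link to the old SIDs that must be re-read from the original database.
    """
    ht = dict(header_table)
    nf = {link: (count, list(sids)) for link, (count, sids) in nflt.items()}
    for sid, seq in new_sequences.items():
        for src, dst in zip(seq, seq[1:]):
            link = f"{src}-{dst}"
            if link in ht:
                ht[link] += 1
            else:
                count, sids = nf.get(link, (0, []))
                nf[link] = (count + 1, sids + [sid])
    promoted = {}
    for link in list(nf):
        count, sids = nf[link]
        if count >= msuppc_link:
            ht[link] = count
            promoted[link] = [s for s in sids if s not in new_sequences]
            del nf[link]  # the record leaves the NFLT
    return ht, nf, promoted


HT = {"d-g": 4, "g-i": 2, "c-d": 7, "d-e": 6, "e-h": 3, "h-i": 2,
      "c-b": 2, "b-c": 5, "f-a": 2, "a-b": 4, "b-d": 2}
NFLT = {"e-b": (1, [11]), "b-f": (1, [11]), "e-f": (1, [12]),
        "a-i": (1, [13]), "i-c": (1, [13]), "d-i": (1, [14]), "i-e": (1, [14]),
        "i-g": (1, [15]), "g-d": (1, [15]), "d-b": (1, [15]), "b-a": (1, [15])}
NEW = {16: "efa", 17: "ef", 18: "efab"}

new_ht, new_nflt, promoted = apply_increment(HT, NFLT, NEW, msuppc_link=2)
print(new_ht["e-f"], new_ht["f-a"], new_ht["a-b"], promoted)
```

In this example only sequence 12 has to be fetched again, because it is the only earlier sequence that contains the promoted link e-f.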
Slide - 31
Maintaining the FS-Tree Incrementally (Cont.)
Header Table (HT) after the update: f-a rises to 4, a-b rises to 5, and e-f is added with count 4; all other counts are unchanged.
SID 12, InSeq cdefabc (re-inserted without a split, since e-f is now frequent)
[Figure: Header Table (HT) beside the FS-tree. The earlier fragment fabc of sequence 12 is deleted from the branch Root → f → a → b → c, so f-a, a-b, and b-c drop from 2 to 1. Below node e of the branch Root → c → d → e, the new nodes f → a → b → c are added with e-f:1, f-a:1, a-b:1, b-c:1; node c marks the end of sequence 12.]
Slide - 32
Maintaining the FS-Tree Incrementally (Cont.)
SID 16, InSeq efa
[Figure: Header Table (HT) beside the FS-tree. New branch Root → e → f → a with e-f:1 and f-a:1; node a marks the end of sequence 16.]
Slide - 33
Maintaining the FS-Tree Incrementally (Cont.)
SID 17, InSeq ef
[Figure: Header Table (HT) beside the FS-tree. On the branch Root → e → f → a, e-f rises to 2 and f-a stays at 1; node f marks the end of sequence 17.]
Slide - 34
Maintaining the FS-Tree Incrementally (Cont.)
SID 18, InSeq efab
[Figure: Header Table (HT) beside the FS-tree. The branch becomes Root → e → f → a → b with e-f:3, f-a:2, a-b:1; node b marks the end of sequence 18.]
Slide - 35
Mining the FS-Tree Incrementally
Type 1:
– Mine for those links if they are affected
Type 2 and 4:
– Mine for these links
Type 3 and 5:
– Delete previously discovered patterns that include these links
Type 6, 7, 8, and 9:
– Do nothing
[Figure: transitions numbered 1 to 9 among three groups of links: Frequent Links, Potentially Frequent Links, and Non-Frequent Links, stored in the Header Table (HT) and the Non-Frequent Links Table (NFLT).]
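The rule on this slide amounts to a lookup from transition type to action. The sketch below encodes only that lookup; how a link is assigned a type number is defined by the figure and is not reproduced here, so the link names and type numbers in the example are placeholders.

```python
# Action per transition type, as listed on the slide.
ACTION_BY_TYPE = {
    1: "mine if affected",
    2: "mine", 4: "mine",
    3: "delete patterns", 5: "delete patterns",
    6: "do nothing", 7: "do nothing", 8: "do nothing", 9: "do nothing",
}


def plan_incremental_mining(link_types):
    """Group links by the action their transition type calls for."""
    plan = {}
    for link, transition_type in link_types.items():
        plan.setdefault(ACTION_BY_TYPE[transition_type], []).append(link)
    return plan


# Placeholder links and type numbers, for illustration only.
example = {"x-y": 1, "p-q": 2, "r-s": 5, "u-v": 8}
print(plan_incremental_mining(example))
```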
Slide - 36
Mining the FS-Tree Incrementally (Cont.)
The Answers
Link a-b
  Derived Paths: (c-d:4, d-e:4, e-f:1, f-a:1, a-b:1); (a-b:2); (f-a:1, a-b:1); (e-f:3, f-a:2, a-b:1)
  Conditional Sequence bases: (c-d:1, d-e:1, e-f:1, f-a:1); (f-a:1); (e-f:1, f-a:1)
  Conditional FS-Trees: (f-a:3)
  Frequent Sequences: <fab : 3>

Header Table (HT)
Link  Count  ListH
d-g   4
g-i   2
c-d   7
d-e   6
e-h   3
h-i   2
c-b   2
b-c   5
f-a   4
a-b   5
b-d   2
e-f   4
Slide - 37
Mining the FS-Tree Incrementally (Cont.)
The Answers
Link e-f
  Derived Paths: (c-d:4, d-e:4, e-f:1); (e-f:3)
  Conditional Sequence bases: (c-d:1, d-e:1)
  Conditional FS-Trees: none
  Frequent Sequences: none
Link f-a
  Derived Paths: (c-d:4, d-e:4, e-f:1, f-a:1); (f-a:1); (e-f:3, f-a:2)
  Conditional Sequence bases: (c-d:1, d-e:1, e-f:1); (e-f:2)
  Conditional FS-Trees: (e-f:3)
  Frequent Sequences: <efa : 3>
Slide - 38
Interactive Mining
Setting MSuppC_link to a Small Enough Value
– Enough information is kept in the FS-tree
– No need to reference the original database
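The idea can be illustrated on the header table alone. In this minimal sketch the structure is built once with MSuppC_link = 2, and the user then tries different values of MSuppC_seq without touching the database; a value below MSuppC_link is rejected, since links under that threshold were never stored in the tree. Only the link-level filter is shown, not the full mining, and the function name is hypothetical.

```python
HT = {"d-g": 4, "g-i": 2, "c-d": 7, "d-e": 6, "e-h": 3, "h-i": 2,
      "c-b": 2, "b-c": 5, "f-a": 2, "a-b": 4, "b-d": 2}


def frequent_links(header_table, msuppc_seq, msuppc_link):
    """Links that can take part in frequent sequences at the requested threshold."""
    if msuppc_seq < msuppc_link:
        raise ValueError("threshold below MSuppC_link: the tree was not built for it")
    return sorted(link for link, count in header_table.items() if count >= msuppc_seq)


for threshold in (3, 5):
    print(threshold, frequent_links(HT, threshold, msuppc_link=2))
```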
Slide - 39
Experimental Evaluation
MS Data Set
– Microsoft Anonymous Web Data Set
– 32,711 sessions
  1 up to 35 page references
– 294 distinct pages
MSNBC Data Set
– MSNBC Anonymous Web Data Set
– 989,818 sessions
  1 up to several thousand page references
– 17 distinct pages
http://kdd.ics.uci.edu
Experimental Evaluation (Cont.)
Scalability with the Number of Input Sessions
– MS Data Set
No MSuppR_seq??
Experimental Evaluation (Cont.)
Scalability with the Number of Input Sessions
– MSNBC Data Set
No MSuppR_seq??
Experimental Evaluation (Cont.)
Scalability with Support Threshold
– MS Data Set
Experimental Evaluation (Cont.)
Scalability with Support Threshold
– MSNBC Data Set
Experimental Evaluation (Cont.)
Incremental Mining
– MS Data Set
Experimental Evaluation (Cont.)
Incremental Mining
– MSNBC Data Set
Conclusions
– Two Scans of the Input Database
– Allows for Incremental Discovery of Frequent Sequences when the Input Database is Updated
– Allows Interactive Response to Changes to the Minimum Support