Fast Mining and Forecasting of Complex Time-Stamped Events
Yasuko Matsubara (Kyoto University), Yasushi Sakurai (NTT), Christos Faloutsos (CMU), Tomoharu Iwata (NTT), Masatoshi Yoshikawa (Kyoto University)
KDD 2012 1 Y. Matsubara et al.
Motivation
Complex time-stamped events consist of {timestamp + multiple attributes}
e.g., web click events: {timestamp, URL, user ID, access device, HTTP referrer, …}

| Timestamp | URL | User | Device |
|---|---|---|---|
| 2012-08-01-12:00 | CNN.com | Smith | iPhone |
| 2012-08-02-15:00 | YouTube.com | Brown | iPhone |
| 2012-08-02-19:00 | CNET.com | Smith | Mac |
| 2012-08-03-11:00 | CNN.com | Johnson | iPad |
| … | … | … | … |

Q1. Are there any topics? (news, tech, media, sports, etc.)
e.g., CNN.com, CNET.com -> news topic; YouTube.com -> media topic
Q2. Can we group URLs/users accordingly?
e.g., CNN.com & CNET.com (related to news topic); Smith & Johnson (related to news topic)
Q3. Can we forecast future events?
- How many clicks from ‘Smith’ tomorrow?
- How many clicks to ‘CNN.com’ over the next 7 days?

Future clicks?

| Timestamp | URL | User | Device |
|---|---|---|---|
| 2012-08-05-12:00 | CNN.com | Smith | iPhone |
| 2012-08-05-19:00 | CNET.com | Smith | iPhone |

Web click events: can we see any trends?
Original access counts of each URL
- 100 random users
- 1 week (window size = 1 hour)
[Figure: hourly access counts for two URLs, a blog site and a money site]
We cannot see any trends! The sequences are bursty, noisy, and sparse.
Outline
- Motivation
- Problem definition
- Proposed method: TriMine
- TriMine-F forecasting
- Experiments
- Conclusions
Problem definition
Given: a set of complex time-stamped events (original web-click events)
1. Find major topics/trends
2. Forecast future events
“Hidden topics” with respect to each aspect (URL, user, time):
- URL in topic space
- User in topic space
- Time evolution
Main idea (1): M-way analysis
Complex time-stamped events, e.g., web clicks

| Time | URL | User |
|---|---|---|
| 08-01-12:00 | CNN.com | Smith |
| 08-02-15:00 | YouTube.com | Brown |
| 08-02-19:00 | CNET.com | Smith |
| 08-03-11:00 | CNN.com | Johnson |
| … | … | … |

Represent as an Mth-order tensor X (M = 3) with modes object/URL (size u), actor/user (size v), and time (size n)
Element x: number of events
e.g., (‘Smith’, ‘CNN.com’, ‘Aug 1, 10pm’): 21 times
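The tensor construction can be sketched in a few lines of Python. This is a minimal sketch, assuming events arrive as (timestamp, URL, user) tuples and that the tensor is stored sparsely as a dictionary of counts; the function name `build_event_tensor`, the hourly window, and the second toy event are illustrative assumptions, not part of the slides.

```python
from collections import Counter
from datetime import datetime

def build_event_tensor(events, window_hours=1):
    """Count events per (object, actor, time-window) cell.

    Returns a sparse tensor as a Counter keyed by (url, user, tick),
    where tick is the index of the time window since the first event.
    """
    parsed = [(datetime.strptime(ts, "%Y-%m-%d-%H:%M"), url, user)
              for ts, url, user in events]
    start = min(ts for ts, _, _ in parsed)
    window_seconds = window_hours * 3600
    tensor = Counter()
    for ts, url, user in parsed:
        tick = int((ts - start).total_seconds() // window_seconds)
        tensor[(url, user, tick)] += 1
    return tensor

events = [
    ("2012-08-01-12:00", "CNN.com", "Smith"),
    ("2012-08-01-12:30", "CNN.com", "Smith"),   # hypothetical extra click
    ("2012-08-02-15:00", "YouTube.com", "Brown"),
    ("2012-08-02-19:00", "CNET.com", "Smith"),
    ("2012-08-03-11:00", "CNN.com", "Johnson"),
]
X = build_event_tensor(events)
```

Only non-zero cells are stored, which matters because most (URL, user, hour) combinations never occur.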
Undesirable properties of the event tensor:
• High dimension
• Categorical data
• Sparse tensor
• Looks like noise, e.g., x = {0, 1, 0, 2, 0, 0, 0, …}
Question: How to find meaningful patterns?
A. Decompose into a set of 3 topic vectors: object vector, actor vector, time vector
[Figure: the web-click tensor (object/URL × actor/user × time, sizes u × v × n) split into topics: Topic 1 (business), Topic 2 (news), Topic 3 (media), …]
e.g., business topic vectors:
- Object/URL: Money.com, CNN.com
- Actor/user: Smith, Johnson
- Time: Mon-Fri, Sat-Sun
Higher value: highly related to the topic
A set of 3 topic vectors = 3 topic matrices
• [O] Object-topic matrix (u x k)
• [A] Actor-topic matrix (k x v)
• [C] Time-topic matrix (k x n)
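One way to read the three matrices is as a sum of k topics, each contributing the product of its object, actor, and time weights to a tensor cell. The slides do not spell out this formula, so the sketch below is an assumption (a CP-style reconstruction), and `estimate_count` and the toy values are hypothetical.

```python
def estimate_count(O, A, C, i, j, t):
    """Approximate x[i][j][t] as the sum over topics r of O[i][r] * A[r][j] * C[r][t]."""
    k = len(A)
    return sum(O[i][r] * A[r][j] * C[r][t] for r in range(k))

# Toy matrices: u = 2 objects, v = 2 actors, n = 3 time ticks, k = 2 topics.
O = [[1.0, 0.0],
     [0.5, 0.5]]          # object-topic matrix, u x k
A = [[2.0, 0.0],
     [0.0, 4.0]]          # actor-topic matrix, k x v
C = [[1.0, 0.0, 1.0],
     [0.0, 3.0, 0.0]]     # time-topic matrix, k x n

estimate = estimate_count(O, A, C, i=1, j=1, t=1)
```

In this toy case only topic 1 links object 1, actor 1, and tick 1, so the estimate comes entirely from that topic.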
Main idea (1): M-way analysis (details)
M-way decomposition (M = 3)
[Gibbs sampling] infer k hidden topics for each non-zero element of X, according to probability p
[Figure: tensor X (objects × actors j × time t, sizes u × v × n/l0) ⇒ matrices O (u × k), A (k × v), C (k × n/l0)]
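The sampling step can be illustrated with an LDA-style collapsed Gibbs sampler. This is a rough sketch under stated assumptions, not the authors' exact procedure: the slides do not define the probability p, so the conditional used here (topic given object, actor given topic, time given topic), the hyperparameters `alpha`, `beta`, `gamma`, and the name `gibbs_sketch` are all hypothetical. It assigns one topic to each event occurrence.

```python
import random

def gibbs_sketch(events, u, v, n, k, iters=50,
                 alpha=0.5, beta=0.1, gamma=0.1, seed=0):
    """events: list of (object i, actor j, tick t) index triples, one per event.

    Returns count matrices O (u x k), A (k x v), C (k x n).
    """
    rng = random.Random(seed)
    O = [[0] * k for _ in range(u)]   # events of object i assigned to topic r
    A = [[0] * v for _ in range(k)]   # events of topic r performed by actor j
    C = [[0] * n for _ in range(k)]   # events of topic r at tick t
    total = [0] * k                   # events assigned to topic r overall

    def bump(i, j, t, r, delta):
        O[i][r] += delta
        A[r][j] += delta
        C[r][t] += delta
        total[r] += delta

    z = [rng.randrange(k) for _ in events]        # random initial topics
    for (i, j, t), r in zip(events, z):
        bump(i, j, t, r, +1)

    for _ in range(iters):
        for e, (i, j, t) in enumerate(events):
            bump(i, j, t, z[e], -1)               # remove current assignment
            # Assumed conditional, up to a constant factor.
            weights = [
                (O[i][q] + alpha)
                * (A[q][j] + beta) / (total[q] + beta * v)
                * (C[q][t] + gamma) / (total[q] + gamma * n)
                for q in range(k)
            ]
            z[e] = rng.choices(range(k), weights=weights)[0]
            bump(i, j, t, z[e], +1)               # record the new assignment
    return O, A, C

toy_events = [(0, 0, 0), (0, 0, 1), (0, 0, 0), (1, 1, 2), (1, 1, 3)]
O, A, C = gibbs_sketch(toy_events, u=2, v=2, n=4, k=2)
```

Each sweep touches every event once, so the cost per iteration grows with the number of events rather than with the full u × v × n tensor size.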
Main idea (2): Multi-scale analysis
Q: What is the right window size to capture meaningful patterns? … minute? hourly? … daily?
A. Our solution: multiple window sizes (l0, l1, l2)
Main idea (2): Multi-scale analysis (details)
Tensors with multiple window sizes
[Figure: tensors X = X(0), X(1), X(2) with window sizes l0, l1, l2 (hourly, daily, weekly patterns) are decomposed into shared matrices O (u × k) and A (k × v) plus one time-topic matrix per level: C(0) (k × n/l0), C(1) (k × n/l1), C(2) (k × n/l2). TriMine-single uses only X(0); TriMine uses all levels.]
1. Infer O, A, C at the highest level
2. Share O & A for all levels
3. Compute C for each level
TriMine is linear in the input size N, i.e., O(N log n) → O(N)
(N: count of events in X, n: duration of X)
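Building the coarser levels amounts to summing counts over consecutive ticks. A minimal sketch, assuming the aggregation is a plain block sum along the time axis; the helper `aggregate` and the window sizes 1, 2, 4 are illustrative (the slides use hourly, daily, and weekly windows).

```python
def aggregate(sequence, window):
    """Sum consecutive blocks of `window` ticks (the last block may be shorter)."""
    return [sum(sequence[s:s + window]) for s in range(0, len(sequence), window)]

hourly = [0, 1, 0, 2, 0, 0, 0, 3, 1, 0, 0, 4]    # toy counts at the finest level
levels = {w: aggregate(hourly, w) for w in (1, 2, 4)}
# levels[2] == [1, 2, 0, 3, 1, 4]
# levels[4] == [3, 3, 5]
```

Coarser levels are shorter and less sparse, which is why patterns that are invisible at the hourly level can show up at the daily or weekly level.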
TriMine-Forecasts
Final goal: “forecast future events”
Q. How can we generate realistic events?
e.g., estimate the number of clicks for user “Smith” to URL “CNN.com” for the next 10 days
Why not naïve?
Individual-sequence forecasting
- Create a set of (u * v) sequences of length n
- Apply the forecasting algorithm to each sequence
Problems:
- Scalability: time complexity is at least O(uvn)
- Accuracy: each sequence “looks” like noise (e.g., {0, 0, 0, 1, 0, 0, 2, 0, 0, …}) -> hard to forecast
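To get a feel for the O(uvn) cost, the arithmetic can be worked out for the WebClick dataset used in the experiments (1,797 URLs, 10,000 users, one month). The hourly tick and the 30-day month are assumptions made only for this illustration.

```python
u, v = 1_797, 10_000        # URLs and users in the WebClick dataset
n = 30 * 24                 # assumed: one month of hourly ticks

sequences = u * v           # one sequence per (URL, user) pair
cells = sequences * n       # values the naive approach has to scan at least once

print(f"{sequences:,} sequences, {cells:,} cells")
# prints "17,970,000 sequences, 12,938,400,000 cells"
```

Almost all of these cells are zero, which is the accuracy problem noted above.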
TriMine-F
Our approach:
- [Step 1] Forecast the time-topic matrix: Ĉ
- [Step 2] Generate events using the 3 matrices O, A, Ĉ
[Figure: tensor X is decomposed into O, A, C; C is extended to Ĉ, and O, A, Ĉ generate the future events]
[Step 1] Forecast ‘time-topic matrix’ (details)
Q. How to capture multi-scale dynamics? e.g., bursty patterns, noise, multi-scale periods
A. Multi-scale forecasting: forecast using multiple levels of matrices
[Figure: the forecasted value c(0) at (r, t) is computed from the preceding values (t−1, t−2, …) of the topic-r sequences c_r(0), c_r(1), c_r(2), with w = 1, 2, 4]
(Details in paper)
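The slides defer the forecasting model to the paper, so the following is only a generic stand-in: a linear autoregression whose inputs are the means of the last w values for w = 1, 2, 4, fitted by least squares. The function names and the feature definition are assumptions, not the authors' formulation.

```python
def features(seq, t, windows=(1, 2, 4)):
    """Mean of the last w values before tick t, one feature per window size."""
    return [sum(seq[t - w:t]) / w for w in windows]

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * m
    for r in reversed(range(m)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - s) / aug[r][r]
    return x

def fit_multiscale(seq, windows=(1, 2, 4)):
    """Least-squares weights for seq[t] ~ sum over w of weight_w * mean(last w values)."""
    start = max(windows)
    rows = [features(seq, t, windows) for t in range(start, len(seq))]
    targets = seq[start:]
    d = len(windows)
    # Normal equations: (F^T F) weights = F^T y
    FtF = [[sum(row[a] * row[b] for row in rows) for b in range(d)] for a in range(d)]
    Fty = [sum(row[a] * y for row, y in zip(rows, targets)) for a in range(d)]
    return solve(FtF, Fty)

def forecast_next(seq, weights, windows=(1, 2, 4)):
    """One-step-ahead forecast from the most recent values of seq."""
    return sum(w * f for w, f in zip(weights, features(seq, len(seq), windows)))
```

In this reading, each row of C (one topic over time) would be fitted separately, and calling `forecast_next` repeatedly while appending its output would extend the row into Ĉ.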
[Step 2] Generate events using O A Ĉ (details)
We propose 2 solutions:
A1. Count estimation: use the O, A, Ĉ matrices
A2. Complex event generation: use a sampling-based approach
(Details in paper)
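A sampling-based generator in the spirit of A2 can be sketched as follows. It assumes non-negative matrices and the sum-over-topics reading of O, A, Ĉ; the paper's actual procedure may differ, and `sample_events` is a hypothetical name.

```python
import random

def sample_events(O, A, c_hat, num_events, seed=0):
    """Draw (object, actor) index pairs for one future tick.

    O: u x k, A: k x v, c_hat: length-k forecasted topic weights for the tick.
    Each event picks a topic r, then an object and an actor given r, so pairs
    are drawn in proportion to sum_r O[i][r] * A[r][j] * c_hat[r].
    """
    rng = random.Random(seed)
    u, k, v = len(O), len(A), len(A[0])
    object_mass = [sum(O[i][r] for i in range(u)) for r in range(k)]
    actor_mass = [sum(A[r]) for r in range(k)]
    topic_weights = [c_hat[r] * object_mass[r] * actor_mass[r] for r in range(k)]
    events = []
    for _ in range(num_events):
        r = rng.choices(range(k), weights=topic_weights)[0]
        obj = rng.choices(range(u), weights=[O[i][r] for i in range(u)])[0]
        act = rng.choices(range(v), weights=A[r])[0]
        events.append((obj, act))
    return events

# Toy case: topic 0 only involves object 0 and actor 0, and only topic 0 is active.
O = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 0.0],
     [0.0, 1.0]]
future = sample_events(O, A, c_hat=[1.0, 0.0], num_events=5)
```

The number of events to draw per tick is left as an input here; it could come from the count estimate in A1.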
Experimental evaluation
The experiments were designed to answer:
• Effectiveness: Q1. How successful is TriMine in spotting patterns?
• Forecasting accuracy: Q2. How well does TriMine forecast events?
• Scalability: Q3. How does TriMine scale with the dataset size?
Datasets
- WebClick data
  Click: {URL, user ID, time}
  1,797 URLs, 10,000 heavy users, one month
- Ondemand TV data
  View: {channel ID, viewer ID, time}
  13,231 TV programs, 100,000 users, 6 months
Q1. Effectiveness
Result of three matrices O, A, C
Visualization: “TriMine-plots”
• URL-topic matrix O
• User-topic matrix A
• Time-topic matrix C
Q1-1. WebClick data
URL-topic matrix (O)
Three hidden topics: “drive”, “business”, “media”
* Red point: each web site
- Money site & finance site have similar trends
- Car & bike site is related to travel site
User-topic matrix (A)
Three hidden topics: “drive”, “business”, “media”
* Red point: each user
- Very clear user groups along the spokes
Time-topic matrix (C)
Three hidden topics: “drive”, “business”, “media”
* Each sequence: each topic over time
- “Drive” topic: spikes during the weekend
- “Business” topic: less access during the weekend
Other topics
Three topics: “communication”, “food”, “blog”
URL-topic matrix O
Three related sites: route-map, diet, restaurant, i.e., users check out
1. Restaurants
2. Route map in their area
3. Calories of their meals
Time-topic matrix C
- 4pm: food-related sites, visited in the early evening before users go out
- 11pm: communication sites, used in the late evening for private purposes
Q1-2. Ondemand TV data
TV program-topic matrix (O)
Three topics: “sports”, “action”, “romance”
* Red point: each TV program
- Several clusters (LOST, tennis, etc.)
Time-topic matrix (C)
Three hidden topics: “sports”, “action”, “romance”
* Each sequence: each topic over time
- Daily & weekly periodicities
- “Action”: high peaks on weekends
Q2-1. Forecasting accuracy
Temporal perplexity (entropy for each time-tick)
Lower perplexity: higher predictive accuracy
T2: [Hong et al. KDD’11]
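For reference, perplexity in its standard form is the exponential of the average negative log-likelihood of the observed events. The sketch below uses that standard definition; the exact per-time-tick variant used in the paper is not given on the slide, so treat this as an assumption.

```python
import math

def perplexity(probabilities):
    """exp of the average negative log-likelihood of the observed events."""
    nll = -sum(math.log(p) for p in probabilities) / len(probabilities)
    return math.exp(nll)

# A model that gives each observed event probability 0.25 is as uncertain
# as a uniform choice among 4 outcomes.
score = perplexity([0.25, 0.25, 0.25, 0.25])
```

A temporal version would apply the same function to the events of each time tick separately.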
Q2-2. Forecasting accuracy
Accuracy of event forecasting
RMSE between original and forecasted events (lower is better)
PLiF: [Li et al. VLDB’10], T2: [Hong et al. KDD’11]
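RMSE here is the usual root-mean-square error between two count sequences of equal length. A minimal sketch with toy values (the sequences are made up for illustration):

```python
import math

def rmse(actual, forecast):
    """Root-mean-square error between two equally long sequences."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))

error = rmse([0, 1, 0, 2], [0, 1, 0, 0])   # one cell is off by 2
```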
Q2-3. Forecasting accuracy
Benefit of multiple time-scale forecasting
[Figure: “business” and “drive” topic sequences in three panels]
- Original sequence of matrix C
- Forecast C’ using a single level -> failed
- Multi-scale forecast -> captured cyclic patterns
Q3. Scalability
Computation cost (vs. AR)
TriMine provides a reduction in computation time (up to 74x)
Conclusions
TriMine has the following properties:
• Effective: it finds meaningful patterns in real datasets
• Accurate: it enables forecasting
• Scalable: it is linear in the database size
Thank you
[Figure: URL matrix, user matrix, time matrix]
Code: http://www.kecl.ntt.co.jp/csl/sirg/people/yasuko/software.html
Email: matsubara.yasuko lab.ntt.co.jp