Time Series Shapelets: A New Primitive for Data Mining
Lexiang Ye and Eamonn Keogh, University of California, Riverside
KDD 2009
Presented by: Zhenhui Li
Classification in Time Series
• Applications: finance, medicine
• 1-Nearest Neighbor
  – Pros: accurate, robust, simple
  – Cons: time and space complexity (lazy learning); results are not interpretable
[Figure: an example time series]
Solution
• Shapelets
  – a time series subsequence
  – representative of a class
  – discriminative from other classes
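The distance between a time series and a shapelet is the minimum Euclidean distance over all sliding-window alignments of the shapelet against the series. A minimal Python sketch (the function name is illustrative, not from the paper):

```python
import math

def subsequence_dist(series, shapelet):
    """Distance from a time series to a shapelet candidate: the minimum
    Euclidean distance over all sliding-window alignments."""
    m = len(shapelet)
    best = float("inf")
    for start in range(len(series) - m + 1):
        d = math.sqrt(sum((series[start + j] - shapelet[j]) ** 2
                          for j in range(m)))
        best = min(best, d)
    return best
```

A series containing the shapelet exactly has distance 0; the smaller the distance, the better the match.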
MOTIVATING EXAMPLE
[Figure: leaf outlines of stinging nettles and false nettles converted to time series; an extracted shapelet distinguishes the two classes. The Shapelet Dictionary and Leaf Decision Tree split the classes at shapelet distance 5.1.]
BRUTE-FORCE ALGORITHM
• Candidates pool: extract subsequences of all possible lengths from every training series
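The candidates pool can be sketched as a triple loop over training series, subsequence lengths, and start positions (illustrative names; minlen/maxlen bound the lengths considered):

```python
def generate_candidates(dataset, minlen, maxlen):
    """Candidates pool: every subsequence of every training series,
    for all lengths from minlen to maxlen."""
    pool = []
    for series in dataset:
        for l in range(minlen, min(maxlen, len(series)) + 1):
            for start in range(len(series) - l + 1):
                pool.append(series[start:start + l])
    return pool
```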
Testing the utility of a candidate shapelet
• Arrange the time series objects based on their distance from the candidate
• Find the optimal split point (maximal information gain)
• Pick the candidate achieving the best utility as the shapelet
[Figure: objects ordered on a line by distance to the candidate, with the optimal split point marked]
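Finding the optimal split point amounts to sorting the objects by their distance to the candidate and scanning every threshold between consecutive distinct distances, keeping the one with maximal information gain. A sketch (helper names are illustrative):

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    ent = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        ent -= p * math.log2(p)
    return ent

def best_split(distances, labels):
    """Sort objects by distance to the candidate, then scan every split
    point and keep the threshold with maximal information gain."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    d = [distances[i] for i in order]
    y = [labels[i] for i in order]
    base = entropy(y)
    best_gain, best_thresh = 0.0, None
    for i in range(1, len(y)):
        if d[i] == d[i - 1]:
            continue  # no threshold can separate equal distances
        gain = base - (i * entropy(y[:i]) + (len(y) - i) * entropy(y[i:])) / len(y)
        if gain > best_gain:
            best_gain, best_thresh = gain, (d[i - 1] + d[i]) / 2
    return best_thresh, best_gain
```

A candidate that perfectly separates the two classes achieves the maximal gain (1 bit for balanced two-class data).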
Problem
• Total number of candidates: Σ_{l=MINLEN}^{MAXLEN} Σ_{T_i ∈ D} (|T_i| − l + 1)
• Each candidate: compute the distance between this candidate and each training sample
• Trace dataset
  – 200 instances, each of length 275
  – 7,480,200 shapelet candidates
  – approximately three days
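The candidate count can be checked directly from the formula. MINLEN = 3 and MAXLEN = 275 are assumptions here (this choice reproduces the 7,480,200 figure quoted for the Trace dataset):

```python
def num_candidates(series_lengths, minlen, maxlen):
    """Total candidates: sum over series T_i and lengths l of (|T_i| - l + 1)."""
    return sum(max(0, n - l + 1)
               for n in series_lengths
               for l in range(minlen, maxlen + 1))

# Trace dataset: 200 instances, each of length 275 (MINLEN/MAXLEN assumed).
print(num_candidates([275] * 200, 3, 275))  # 7480200
```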
Speedup
• Distance calculations from time series objects to shapelet candidates are the most expensive part
• Reduce the time in two ways
  – Distance Early Abandon: reduce the distance computation time between two time series
  – Admissible Entropy Pruning: reduce the number of distance calculations
DISTANCE EARLY ABANDON
[Figure: candidate S slid along time series T. Top: the full computation finds the best matching location with Dist = 0.4. Bottom: a later alignment is abandoned as soon as the partial distance exceeds 0.4.]
Distance Early Abandon
• We only need the minimum Dist
• Method
  – Keep the best-so-far distance
  – Abandon the calculation if the current distance is larger than the best so far
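Early abandon can be sketched by accumulating the squared error of each alignment incrementally and breaking out of the window as soon as the partial sum already exceeds the best complete distance so far (names are illustrative):

```python
def subsequence_dist_early_abandon(series, shapelet):
    """Minimum Euclidean distance over all alignments, abandoning any
    window whose partial squared error exceeds the best so far."""
    m = len(shapelet)
    best_sq = float("inf")
    for start in range(len(series) - m + 1):
        acc = 0.0
        for j in range(m):
            acc += (series[start + j] - shapelet[j]) ** 2
            if acc >= best_sq:
                break  # abandon: this alignment cannot beat the best so far
        else:
            best_sq = acc  # full window computed; it is the new best
    return best_sq ** 0.5
```

The result is identical to the full computation; only wasted work on hopeless alignments is skipped.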
ADMISSIBLE ENTROPY PRUNING
Admissible Entropy Pruning
• We only need the best shapelet for each class
• For a candidate shapelet
  – We don't need to calculate the distance for every training sample
  – After calculating distances for some training samples, check the upper bound of the achievable information gain
  – If that upper bound is below the information gain of the best shapelet found so far, stop the calculation and try the next candidate
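The pruning bound can be illustrated for two classes: place every object whose distance has not yet been computed optimistically at one end of the distance-sorted line and take the best gain achievable over all split points. This is a simplified sketch of the idea, not the paper's exact procedure:

```python
import math

def entropy_counts(counts):
    """Shannon entropy from per-class counts."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

def gain_upper_bound(done, remaining_a, remaining_b):
    """Optimistic information-gain bound (two-class sketch).

    done: distance-sorted labels ('a'/'b') of already-computed objects.
    remaining_a/remaining_b: counts of not-yet-computed objects per class,
    optimistically placed entirely before or after all computed objects.
    """
    best = 0.0
    for first, last in (("a", "b"), ("b", "a")):
        n_first = remaining_a if first == "a" else remaining_b
        n_last = remaining_b if first == "a" else remaining_a
        seq = [first] * n_first + list(done) + [last] * n_last
        n = len(seq)
        total = [seq.count("a"), seq.count("b")]
        base = entropy_counts(total)
        left = [0, 0]
        for i in range(1, n):
            left[0 if seq[i - 1] == "a" else 1] += 1
            right = [total[0] - left[0], total[1] - left[1]]
            gain = base - (i * entropy_counts(left)
                           + (n - i) * entropy_counts(right)) / n
            best = max(best, gain)
    return best
```

If this bound is already below the gain of the best shapelet found so far, the remaining distance computations for the candidate can be skipped.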
[Figure: entropy pruning on the nettle data. After computing distances for only some objects, the remaining objects are placed optimistically on the distance line, giving information-gain upper bounds I = 0.42 and I = 0.29; a candidate whose bound falls below the best gain so far is abandoned. The resulting Leaf Decision Tree again splits at shapelet distance 5.1.]
EXPERIMENTAL EVALUATION
Performance Comparison
Original Lightning Dataset: Length 2000; Training 2000; Testing 18000
Projectile Points
[Figure: Shapelet Dictionary with shapelets I (Clovis) and II (Avonlea), and the Arrowhead Decision Tree separating Clovis from Avonlea]

Method Accuracy Time
Shapelet 0.80 0.33
Rotation Invariant Nearest Neighbor 0.68 1013
Wheat Spectrography
[Figure: one sample from each class]
Wheat Dataset: Length 1050; Training 49; Testing 276
[Figure: Shapelet Dictionary with shapelets I–VI and the Wheat Decision Tree over classes 0–6]

Method Accuracy Time
Shapelet 0.720 0.86
Nearest Neighbor 0.543 0.65
The Gun/NoGun Problem

Method Accuracy Time
Shapelet 0.933 0.016
Rotation Invariant Nearest Neighbor 0.913 0.064

[Figure: Shapelet Dictionary (the "No Gun" shapelet) and the Gun Decision Tree separating Gun from No Gun]
Conclusions
• Interpretable results
• More accurate/robust
• Significantly faster at classification
Discussions - Comparison
Hong Cheng, Xifeng Yan, Jiawei Han, and Chih-Wei Hsu, “Discriminative Frequent Pattern Analysis for Effective Classification” (ICDE'07)
Hong Cheng, Xifeng Yan, Jiawei Han, and Philip S. Yu, "Direct Discriminative Pattern Mining for Effective Classification", (ICDE'08)
Similarities:
• motivation: discriminative frequent pattern = shapelet
• technique: use the upper bound of information gain to speed up
Differences:
• application: general feature selection vs. time series (no explicit features)
• split node: binary (contains/does not contain a pattern) vs. numeric value (smaller/larger than a threshold)
Discussions – other topics
• Similar ideas could be applied to other research topics– graph– image– spatio-temporal– social network– ….
Discussions – other topics
• Graph classification:
Xifeng Yan, Hong Cheng, Jiawei Han, and Philip S. Yu, “Mining Significant GraphPatterns by Scalable Leap Search”, Proc. 2008 ACM SIGMOD Int. Conf. onManagement of Data (SIGMOD'08), Vancouver, BC, Canada, June 2008.
Discussions – other topics
• moving object classification
Discriminative sub-movement
Discussions – other topics
• Social network
  – classify normal/spamming users
  – How to find discriminative features on a social network?
    • social network structure
    • user behaviour
Discussions – other topics
• For different applications, this idea could be adapted to improve performance, but the adaptation is not straightforward.
Thank You
Questions?