Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.
-
Upload
curtis-barker -
Category
Documents
-
view
228 -
download
0
Transcript of Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.
![Page 1: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/1.jpg)
1
Discovering Deformable Motifs in Time Series Data
Jin ChenCSE 891-001
2012 Fall
![Page 2: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/2.jpg)
2
Background• Time series data are collected in multiple domains, including surveillance,
pose tracking, ICU patient monitoring and finance. • These time series often contain motifs - “segments that repeat within and
across different series”.• Discovering these repeated segments can provide primitives that are
useful for domain understanding and as higher-level, meaningful features that can be used to segment time series or discriminate among time series data from different groups.
• Therefore, we need a tool to explain the entire data in terms of repeated, warped motifs interspersed with non-repeating segments.
![Page 3: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/3.jpg)
3
Motif Detection Approaches• A motif is defined via a pair of windows of the same length that are closely
matched in terms of Euclidean distance. • Such pairs are identified via a sliding window approach followed by
random projections to identify highly similar pairs that have not been previously identified.
• However, this method is not geared towards finding motifs that can exhibit significant deformation.
Mueen et al., SDM 2009
![Page 4: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/4.jpg)
4
Motif Detection Approaches• Discover regions of high density in the space of all subsequences via
clustering. • These works define a motif as a vector of means and variances over the
length of the window, a representation that also is not geared to capturing deformable motifs.
• To addresses deformation, dynamic time warping is used to measure warped distance. However, motifs often exhibit structured transformations, where the warp changes gradually over time.
Minnen et. al. AAAI 2007
![Page 5: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/5.jpg)
5
Motif Detection Approaches• The work of focus on
developing a probabilistic model for aligning sequences that exhibit variability.
• However, these methods rely on having a segmentation of the time series into corresponding motifs. This assumption allows relatively few constraints on the model, rendering them highly under-constrained in certain settings.
Listgarten et. al. NIPS. 2005
![Page 6: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/6.jpg)
6
Motif Detection Approaches
![Page 7: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/7.jpg)
7
Continuous Shape Template Model (CSTM) for Deformable Motif Discovery
• CSTM is a hidden, segmental Markov model, in which each state either generates a motif or samples from a non-repeated random walk.
• A motif is represented by smooth continuous functions that are subject to non-linear warp and scale transformations.
• CSTM learns both the motifs and their allowed warps in an unsupervised way from unsegmented time series data.
• A demonstration on three distinct real-world domains shows that CSTM achieves considerably better performance than previous methods.
![Page 8: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/8.jpg)
8
CSTM – Generative Model• CSTM assumes that the observed time series is generated by switching
between a state that generates nonrepeating segments and states that generate repeating (structurally similar) segments.
• Motifs are generated as samples from a shape template that can undergo nonlinear transformations such as shrinkage, amplification or local shifts. The transformations applied at each observed time t for a sequence are tracked via latent states, the distribution over which is inferred.
• Simultaneously, the canonical shape template and the likelihood of possible transformations for each template are learned from the data.
• The random-walk state generates trajectory data without long-term memory. Thus, these segments lack repetitive structural patterns.
![Page 9: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/9.jpg)
9
Canonical Shape Template (CST)• Each shape template, indexed by k, is
represented as a continuous function sk(l) where l in (0,Lk] and Lk is the length of the kth template.
• In many domains, motifs appear as smooth functions. A promising representation is piecewise Bezier splines. Shape templates of varying complexity are intuitively represented by using fewer or more pieces.
• A third order Bezier curve is parameterized by four points pi (i in {0,1,2,3}), and over two dimensions, where the 1st dimension is the time t and the 2nd dimension is the signal value.
![Page 10: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/10.jpg)
10
Shape Transformation Model - Warping
• Motifs are generated by non-uniform sampling and scaling of sk. Temporal warp can be introduced by moving slowly or quickly through sk.
• The allowable temporal warps are specified as an ordered set {w1, … ,wn} of time increments that determines the rate at which we advance through sk. A template-specific warp transition matrix Mk
w specifies the probability of transitions between warp states.
• To generate an observation series y1, … , yT , let wt in {w1,… ,wn} be the random variable tracking the warp and t be the position within the template sk at time t. Then, yt+1 would be generated from the value sk(Pt+1) where Pt+1 = Pt+wt+1 and wt+1 ~ Mk
w(wt)
![Page 11: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/11.jpg)
11
Shape Transformation Model - Rescale
• The set of allowable scaling coefficients are maintained as the set {c1,…, cn}. Let ϕt+1 in {c1, …, cn} be the sampled scale value at time t+1, sampled from the scale transition matrix Mk
ϕ.
• Thus, the observation yt+1 would be generated around the value ϕt+1sk(Pt+1), a scaled version of the value of the motif at Pt+1, where ϕt+1~Mk
ϕ(ϕt).
• An additive noise value vt+1 ~ N(0, σ) models small shifts.
• In summary, putting together all three possible deformations, we have that yt+1 = vt+1 +ϕt+1sk(Pt+1).
![Page 12: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/12.jpg)
12
Non-repeating Random Walk (NRW)
• The NRW model is used to capture data not generated from the templates.
• If this data has different noise characteristics, the task becomes simpler as the noise characteristics can help disambiguate between motif generated segments and NRW segments.
• The generation of smooth series can be modeled using an autoregressive process.
![Page 13: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/13.jpg)
13
Template Translations• Transitions between generating NRW data and motifs from the CSTs are
modeled via a transition matrix T of size (K+1)X(K+1) where the number of CSTs is K.
• The random variable t tracks the template for an observed series. Transitions into and out of templates are only allowed at the start and end of the template, respectively.
• Thus, when the position within the template is at the end, we have that Kt~T(Kt−1), otherwise Kt = Kt−1.
• For T, we fix the self-transition parameter for the NRW state as λ, a pre-specified input. Different settings of λ allows control over the proportion of data assigned to motifs versus NRW.
![Page 14: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/14.jpg)
14
Learning the Model• The canonical shape templates, their template-specific
transformation models, the NRW model, the template transition matrix and the latent states for the observed series are all inferred from the data using hard EM.
• Coordinate ascent is used to update model parameters in the M-step.
• In the E-step, given the model parameters, Viterbi is used for inferring the latent trace.
![Page 15: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/15.jpg)
15
Learning the Model
![Page 16: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/16.jpg)
16
Results
![Page 17: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/17.jpg)
17
Results
![Page 18: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/18.jpg)
18
Results
![Page 19: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/19.jpg)
19
Results
![Page 20: Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall 1.](https://reader030.fdocuments.us/reader030/viewer/2022033021/56649f125503460f94c253e6/html5/thumbnails/20.jpg)
20
Conclusions• Warp invariant signatures can be used for a forward lookup within beam
pruning to significantly speed up inference when K, the number of templates is large.
• A Bayesian nonparametric prior is another approach that could be used to systematically control the number of classes based on model complexity.
• A different extension could build a hierarchy of motifs, where larger motifs are comprised of multiple occurrences of smaller motifs, thereby possibly providing an understanding of the data at different time scales.
• This work can serve as a basis for building nonparametric priors over deformable multivariate curves.