Unsupervised Mining of Statistical Temporal Structures in Video
![Page 1: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/1.jpg)
Unsupervised Mining of Statistical Temporal Structures in Video
Liu ze yuan
May 15, 2011
![Page 2: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/2.jpg)
What purpose does Markov Chain Monte Carlo (MCMC) serve in this chapter?
Quiz of the Chapter
![Page 3: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/3.jpg)
1 Introduction
1.1 Keywords
1.2 Examples
1.3 Structure discovery problem
1.4 Characteristics of video structure
1.5 Approach
2 Methods
Hierarchical Hidden Markov Models
Learning HHMM parameters with EM
Bayesian model adaptation
Feature selection for unsupervised learning
3 Experiments & Results
4 Conclusion
Agenda
![Page 4: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/4.jpg)
Algorithms for discovering statistical structures and finding informative features from videos in an unsupervised setting.
Effective solutions to video indexing require detection and recognition of structures and events.
We focus on temporal structures.
1 Introduction
![Page 5: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/5.jpg)
Hierarchical Hidden Markov Model (HHMM)
Hidden Markov Model (HMM)
Markov Chain Monte Carlo (MCMC)
Dynamic Bayesian Network (DBN)
Bayesian Information Criterion (BIC)
Maximum Likelihood (ML)
Expectation Maximization (EM)
1.1 Introduction: keywords
![Page 6: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/6.jpg)
General to various domains and applicable at different levels
At the lowest level, repeating color schemes in a video
At the mid level, seasonal trends in web traffic
At the highest level, genetic functional regions
1.2 Introduction: examples
![Page 7: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/7.jpg)
The problem of identifying structure consists of two parts: finding and locating.
The former is referred to as training, while the latter is referred to as classification.
The Hidden Markov Model (HMM) is a discrete state-space stochastic model with efficient learning algorithms; it works well for temporally correlated data streams and has been applied successfully. However, to overcome domain restrictions, we propose a new algorithm that uses fully unsupervised statistical techniques.
1.3 Introduction: the structure discovery problem
![Page 8: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/8.jpg)
Fixed domain: audio-visual streams
The structures have the following properties:
Video structures are in a discrete state space
Features are stochastic
Sequences are correlated in time
Focus on dense structures
Assumptions:
Within events, states are discrete and Markov
Observations are associated with states under Gaussian distributions
1.4 Introduction: Characteristics of Video Structure
![Page 9: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/9.jpg)
Model the temporal dependencies in video and the generic structure of events in a unified statistical framework
Model recurring events in each video with HMMs and HHMMs, where state inference and parameter estimation are carried out using EM
Develop algorithms to address the model selection and feature selection problems
Bayesian learning techniques for model complexity
Bayesian Information Criterion as model posterior
Filter-wrapper method for feature selection
1.5 Introduction: Approach
![Page 10: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/10.jpg)
Use a two-level hierarchical hidden Markov model
Higher-level elements correspond to semantic events and lower-level elements represent variations
Special case of a Dynamic Bayesian Network
Could be extended to more levels, and the feature distribution is not constrained to a mixture of Gaussians
2 Hierarchical Hidden Markov Models (HHMM)
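To make the two-level idea concrete, here is a minimal generative sketch: a top-level state (a semantic event) selects a sub-HMM, whose lower-level states emit 1-D Gaussian observations until an exit hands control back to the top level. The function name, the per-state `exit_prob` table, and all numbers are assumptions for illustration, not the chapter's parameterization.

```python
import random

def sample_two_level_hhmm(T, top_trans, sub_trans, exit_prob, means, stds, seed=0):
    """Draw T observations from a toy two-level HHMM.

    top_trans[q][q2]   : top-level transition weights (semantic events)
    sub_trans[q][s][s2]: lower-level transition weights inside event q
    exit_prob[q][s]    : chance that sub-state s ends event q (assumed form)
    means/stds[q][s]   : 1-D Gaussian emission parameters
    """
    rng = random.Random(seed)
    n_top, n_sub = len(top_trans), len(sub_trans[0])
    q, s = rng.randrange(n_top), rng.randrange(n_sub)
    states, obs = [], []
    for _ in range(T):
        states.append((q, s))
        obs.append(rng.gauss(means[q][s], stds[q][s]))
        if rng.random() < exit_prob[q][s]:
            # The sub-HMM exits: the top level picks the next event,
            # and the new sub-HMM starts in a uniformly chosen sub-state.
            q = rng.choices(range(n_top), weights=top_trans[q])[0]
            s = rng.randrange(n_sub)
        else:
            s = rng.choices(range(n_sub), weights=sub_trans[q][s])[0]
    return states, obs

# Hypothetical parameters: 2 events, 2 variations per event.
top_trans = [[0.2, 0.8], [0.7, 0.3]]
sub_trans = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
exit_prob = [[0.05, 0.10], [0.10, 0.05]]
means = [[0.0, 1.0], [5.0, 6.0]]
stds = [[0.5, 0.5], [0.5, 0.5]]
states, obs = sample_two_level_hhmm(50, top_trans, sub_trans, exit_prob, means, stds)
```

Each entry of `states` is a (top, bottom) pair, which is the kind of hierarchical configuration that learning has to recover from `obs` alone.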
![Page 11: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/11.jpg)
2 Hierarchical Hidden Markov Models: Graphical Representation
![Page 12: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/12.jpg)
Generalization of the HMM with a hierarchical control structure.
Bottom-up structure
2 Hierarchical Hidden Markov Models: Structure of HHMM
![Page 13: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/13.jpg)
(1) supervised learning
(2) unsupervised learning
(3) a mixture of the above
2 Hierarchical Hidden Markov Models: Structure of HHMM: applications
![Page 14: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/14.jpg)
Multi-level hidden state inference with HHMM is O(T^3); however, this is not optimal, since other algorithms achieve O(T).
A generalized forward-backward algorithm for hidden state inference
A generalized EM algorithm for parameter estimation with O(DT·|Q|^(2D)) complexity.
2 Complexity of Inferencing and Learning with HHMM
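As a point of reference for the linear-in-T cost, the sketch below runs the scaled forward recursion on a plain, non-hierarchical HMM with 1-D Gaussian emissions. It is not the generalized forward-backward algorithm of the chapter; it only illustrates that one pass costs O(T·|Q|^2) for |Q| flat states. The function names and parameter layout are assumptions.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def forward_log_likelihood(obs, init, trans, means, stds):
    """Scaled forward pass for a flat HMM; returns log P(obs | model).

    One sweep over T observations, with |Q|^2 work per step.
    Assumes every observation has nonzero probability under the model.
    """
    n = len(init)
    alpha = [init[i] * gaussian_pdf(obs[0], means[i], stds[i]) for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n))
                * gaussian_pdf(obs[t], means[j], stds[j])
                for j in range(n)
            ]
        scale = sum(alpha)                   # P(obs[t] | obs[:t])
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]   # rescale to avoid underflow
    return loglik
```

The returned log-likelihood is the quantity EM increases at each iteration.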
![Page 15: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/15.jpg)
Representations of states and parameter set of an HHMM
The scope of EM is basic parameter estimation
Model size is given, and the model is learned over a predefined feature set
2 Learning HHMM parameters with EM
![Page 16: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/16.jpg)
2 Learning HHMM parameters with EM: representing an HHMM
The entire configuration of the hierarchical states, from top to bottom, is coded as an N-ary, D-digit integer.
The whole parameter set Θ of an HHMM is written in terms of this state coding.
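The N-ary, D-digit coding of a hierarchical state can be sketched as below. The helper names are hypothetical, and digits are taken as 0-based with the top level as the most significant digit, which is an assumption about the convention.

```python
def encode_state(digits, n):
    """Pack per-level states (top to bottom, each in 0..n-1) into one integer."""
    code = 0
    for d in digits:
        if not 0 <= d < n:
            raise ValueError("each digit must lie in 0..n-1")
        code = code * n + d
    return code

def decode_state(code, n, depth):
    """Recover the per-level states (top to bottom) from the packed integer."""
    digits = []
    for _ in range(depth):
        code, d = divmod(code, n)
        digits.append(d)
    return digits[::-1]
```

With N = 3 and D = 2, the configuration (top = 2, bottom = 1) becomes 2·3 + 1 = 7, and decoding 7 gives back [2, 1].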
![Page 17: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/17.jpg)
2 Learning HHMM parameters with EM: Overview of EM algorithm
![Page 18: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/18.jpg)
Parameter learning for HHMM using EM is known to converge only to a local maximum, and it requires a predefined model structure.
Because of these drawbacks, we adopt Bayesian model adaptation.
Use Markov Chain Monte Carlo (MCMC) to maximize the Bayesian Information Criterion (BIC)
2 Bayesian Model adaptation
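A small sketch of how BIC trades fit against model size, using the generic "higher is better" form: weighted log-likelihood minus half the parameter count times log T. The `weight` argument and all candidate numbers are made up for illustration; the chapter's exact weighting is not reproduced here.

```python
import math

def bic_score(log_likelihood, num_params, num_samples, weight=1.0):
    """Generic BIC in 'higher is better' form (weight is an assumed knob)."""
    return weight * log_likelihood - 0.5 * num_params * math.log(num_samples)

# Hypothetical candidates: (log-likelihood, number of free parameters).
T = 1000
candidates = {
    "3 states": (-1250.0, 12),
    "4 states": (-1210.0, 20),
    "5 states": (-1205.0, 30),
}
scores = {name: bic_score(ll, k, T) for name, (ll, k) in candidates.items()}
best = max(scores, key=scores.get)
```

Here the 5-state model fits slightly better than the 4-state one, but its extra parameters cost more than the likelihood gain, so `best` is "4 states".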
![Page 19: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/19.jpg)
A class of algorithms designed to solve high-dimensional optimization problems
MCMC iterates between two steps:
Proposal step draws a new model sample based on the current model and statistics of the data
Decision step computes an acceptance probability based on the fitness of the proposed new model
Converges to the global optimum in probability, under suitable conditions
2 Overview of MCMC
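The propose/decide loop can be sketched on a toy problem. The version below uses a Metropolis-style acceptance rule, min(1, exp(score difference)), with a symmetric proposal; the `score` and `propose` callables, the toy score, and the range of model sizes are all assumptions for illustration, not the chapter's proposal distributions.

```python
import math
import random

def mcmc_search(score, start, propose, steps=500, seed=0):
    """Generic propose/decide loop that tracks the best model seen."""
    rng = random.Random(seed)
    current, cur_score = start, score(start)
    best, best_score = current, cur_score
    for _ in range(steps):
        cand = propose(current, rng)          # proposal step
        cand_score = score(cand)
        # Decision step: always accept improvements, sometimes accept worse.
        accept_prob = math.exp(min(0.0, cand_score - cur_score))
        if rng.random() < accept_prob:
            current, cur_score = cand, cand_score
            if cur_score > best_score:
                best, best_score = current, cur_score
    return best, best_score

def toy_score(k):
    """Hypothetical fitness of a model with k states, peaked at k = 4."""
    return -float((k - 4) ** 2)

def toy_propose(k, rng):
    """Symmetric proposal: k - 1 or k + 1; out-of-range moves stay put."""
    cand = k + rng.choice([-1, 1])
    return cand if 1 <= cand <= 10 else k

best_k, best_val = mcmc_search(toy_score, start=1, propose=toy_propose)
```

Accepting an occasional worse model is what lets the chain leave a local maximum, which plain EM cannot do.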
![Page 20: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/20.jpg)
Model adaptation for HHMM involves an iterative procedure.
Based on the current model, compute a probability profile over the moves EM, split(d), merge(d), and swap(d)
An acceptance formula determines whether a proposed move is accepted
2 MCMC for HHMM
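Drawing the next move from a probability profile can be sketched as follows. The profile values are placeholders; how the chapter actually computes the profile from the current model is not shown here, and the level argument d of split, merge, and swap is left out for brevity.

```python
import random

def choose_move(profile, rng):
    """Sample one move name according to its probability in the profile."""
    moves = list(profile)
    weights = [profile[m] for m in moves]
    return rng.choices(moves, weights=weights)[0]

# Hypothetical profile over the four move types.
profile = {"EM": 0.4, "split": 0.2, "merge": 0.2, "swap": 0.2}
rng = random.Random(0)
sequence = [choose_move(profile, rng) for _ in range(10)]
```

An EM move only re-estimates parameters, while split, merge, and swap change the state structure, so the profile controls how often the structure is perturbed.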
![Page 21: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/21.jpg)
Select a relevant and compact feature subset that fits the HHMM model
The task of feature selection is divided into two aspects:
Eliminating irrelevant features and eliminating redundant ones
2 Feature selection for unsupervised learning
![Page 22: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/22.jpg)
Suppose the feature pool is a discrete set, e.g. F = {f1, …, fD}
Markov blanket filtering to eliminate redundant features
A human operator is needed to decide whether to iterate
2 Feature selection for unsupervised learning: feature selection algorithm
![Page 23: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/23.jpg)
2 Feature selection for unsupervised learning: evaluating information gain
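One way to score how much two features agree is the mutual information between the state label sequences they induce. The sketch below uses plain mutual information (in nats) as a stand-in for the slide's information gain criterion; the function name and the choice of this measure are assumptions.

```python
import math
from collections import Counter

def mutual_information(labels_a, labels_b):
    """Mutual information (nats) between two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
    return mi
```

Identical labelings give the entropy of the labels (ln 2 for a balanced binary sequence), while statistically independent labelings give 0, so features whose labelings share high mutual information can be grouped together.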
![Page 24: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/24.jpg)
After the wrapper step with the information gain criterion, we are left with possible redundancy.
Markov blanket filtering is needed to resolve this
Iterative algorithm with a threshold of less than 5%
2 Feature selection for unsupervised learning: finding a Markov blanket
![Page 25: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/25.jpg)
Computes a value that influences the decision on whether to accept a candidate.
Initialization and convergence issues exist, so randomization is used.
2 Feature selection for unsupervised learning: normalized BIC
![Page 26: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/26.jpg)
3 Experiments & Results
Sports videos represent an interesting domain for structure discovery
![Page 27: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/27.jpg)
We compare the learning accuracy of four different learning schemes against the ground truth:
Supervised HMM
Supervised HHMM
Unsupervised HHMM without model adaptation
Unsupervised HHMM with model adaptation
EM MCMC
3 Experiments & Results: parameter and structure learning
![Page 28: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/28.jpg)
3 Experiments & Results: parameter and structure learning
Run each of the four algorithms 15 times with random starting points
![Page 29: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/29.jpg)
Test the performance of the automatic feature selection method on the two video clips
For the Spain clip, the evaluation has an accuracy of 74.8%, and the Korea clip achieves an accuracy of 74.5%
3 Experiments & Results: feature selection
![Page 30: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/30.jpg)
Test on a baseball video clip, which comes from a different domain
HHMM learning with full model adaptation
Results are consistent and agree with intuition
3 Experiments & Results: testing on a different domain
![Page 31: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/31.jpg)
Simplified HHMM: each sub-HMM boils down to a left-to-right model with skips
Fully connected general 2-level HHMM model
Results show the constrained model is 2.3% lower in accuracy than the fully connected model, which has more modeling power
3 Experiments & Results: comparing to HHMM with simplifying constraints
![Page 32: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/32.jpg)
In this chapter, we proposed algorithms for unsupervised discovery of structure from video sequences.
We model video structures using HHMM with parameters learned using EM and MCMC.
We test them on two different video clips and achieve results comparable to their supervised learning counterparts.
The approach is applicable to many other domains and to models with simplifying constraints.
4 Conclusion
![Page 33: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/33.jpg)
It serves to solve high-dimensional optimization problems.
Solution to the Quiz
![Page 34: Unsupervised Mining of Statistical Temporal Structures in Video](https://reader036.fdocuments.us/reader036/viewer/2022062305/568164d8550346895dd71cf5/html5/thumbnails/34.jpg)
Questions?
Q&A