TIME SERIES ANALYSIS: AN OVERVIEW
Hernando Ombao
University of California at Irvine
November 26, 2012
OUTLINE OF TALK
1 TIME SERIES DATA
2 OVERVIEW OF TIME DOMAIN ANALYSIS
3 OVERVIEW OF SPECTRAL ANALYSIS
Outline of Talk Time Series Data Overview of Time Domain Analysis Overview of Spectral Analysis
TIME SERIES DATA
Visual-motor electroencephalogram (HAND EEG)
Global temperature series
Seismic recordings
LA county environmental data (mortality, pollution, temperature)
NEUROSCIENCE DATA AND STATISTICAL GOALS
Electrophysiologic data: multi-channel EEG, local field potentials
Hemodynamic data: fMRI time series at several ROIs
NEUROSCIENCE DATA AND STATISTICAL GOALS
Multi-channel (multivariate)
Two movement conditions: leftward vs. rightward
NEUROSCIENCE DATA AND STATISTICAL GOALS
External Stimulus
Visual, Auditory, Somatosensory, Stress
Personality traits, Genes, Socio-Environmental Factors
Unobserved: brain network/cell assemblies
Brain Signals (indirect measures of neuronal activity)
Functional: fMRI, EEG, MEG, PET
Anatomical: DTI
Acute Outcomes
Emotion, Skin conductance, Motor response
NEUROSCIENCE DATA AND STATISTICAL GOALS
[Diagram: Stimulus, Neuronal Response, Brain Signals, Behavior]
Moderators / Modifiers: Genes, Trait, Socio-Environment
NEUROSCIENCE DATA AND STATISTICAL GOALS
Changes in the mean
Changes in variance
Changes in cross-dependence
NEUROSCIENCE DATA AND STATISTICAL GOALS
Characterize dependence in a brain network
Temporal: Y1(t) ∼ [Y1(t − 1),Y2(t − 1), . . .]′
Spectral: interactions between oscillatory activities at Y1,Y2
Develop estimation and inference methods for connectivity
Investigate potential for connectivity as a biomarker
Predicting behavior
Motor intent (left vs. right movement) [Brain-Computer-Interface]
State of learning
Level of mental fatigue
Differentiating patient groups (bipolar vs. healthy children)
Connectivity between left DLPFC ⇆ right STG is greater for bipolar than healthy
NEUROSCIENCE DATA AND STATISTICAL GOALS
New dependence measures must be easily interpretable
Models must incorporate information across trials and across subjects
Models must account for differences in brain networks between conditions
Take advantage of multi-modal data (EEG, fMRI, DTI)
Models should be informed by physiology and physics
Dimension reduction: extract the information from massive data that is most relevant for estimating dependence
Develop formal statistical inference procedures
NEUROSCIENCE DATA AND STATISTICAL GOALS
Selected References
Automatic methods: SLEX Transform (Smooth Localized Complex EXponentials)
Ombao et al. (2001, JASA)
Ombao et al. (2001, Biometrika)
Ombao et al. (2002, Ann Inst Stat Math)
Huang, Ombao and Stoffer (2004, JASA)
Ombao et al. (2005, JASA)
Böhm, Ombao et al. (2010, JSPI)
Massive data; Complex-dependence; Mixed Effects
Bunea, Ombao and Auguste (2006, IEEE Trans Sig Proc)
Ombao and Van Bellegem (2008, IEEE Trans Sig Proc)
Freyermuth, Ombao, von Sachs (2009, JASA)
Fiecas and Ombao (2011, Annals of Applied Statistics)
Gorrostieta, Ombao et al. (2012, NeuroImage)
Kang, Ombao et al. (2012, JASA)
GLOBAL TEMPERATURE SERIES
[Figure: Global Temperature Deviation versus Time, 1880 to 2000; vertical axis from −0.4 to 0.6]
GLOBAL TEMPERATURE SERIES
Model: Y(t) = µ(t) + ε(t)
Deterministic part: µ(t) = β0 + β1 t
Stochastic part: ε(t) colored noise
E ε(t) = 0, Cov(ε(r), ε(s)) = γ(r, s)
Inference on the trend β1
Is this simply a "local" trend?
What is the impact of man's activities on temperature fluctuations?
What is the impact of temperature increases on natural calamities?
Statistical challenge: develop a model that captures (a) the complexities in the spatio-temporal covariance structure and (b) causal relationships
SEISMIC RECORDINGS
Two classes: Π1 (earthquake) and Π2 (explosion)
Discrimination
What "features" separate Π1 and Π2?
Classification
Given a new time series x∗, classify into Π1 or Π2
D(x∗, f1) vs D(x∗, f2)
Background: ban on nuclear testing (classify the Novaya Zemlya event of unknown origin)
Seminal work: Shumway (1982, 1998, 2003)
Statistical challenges: feature extraction, feature selection from massive data
LA COUNTY MORTALITY AND ENVIRONMENTAL DATA
[Figure: three time series panels, 1970 to 1980: Cardiovascular Mortality, Temperature, Particulates]
LA COUNTY MORTALITY AND ENVIRONMENTAL DATA
Shumway, Azari and Pawitan (1988); Shumway and Stoffer (2010)
LA County
Weekly data on mortality, temperature and pollution levels
Model mortality ∼ (temperature + pollution)
Granger causality: does past knowledge of temperature and pollution help improve prediction for mortality?
Causation vs Association
Practical issues
Hospitalization (rather than mortality)
Effect of pollution might be long term (rather than short term)
COVARIANCE AND CORRELATION
Bivariate time series Y(t) = [Y1(t),Y2(t)]′
Auto-covariance γℓℓ(s, t) = Cov[Yℓ(s),Yℓ(t)], ℓ = 1,2
Variance γℓℓ(t , t) = Cov[Yℓ(t),Yℓ(t)], ℓ = 1,2
Auto-correlation $\rho_{\ell\ell}(s,t) = \dfrac{\gamma_{\ell\ell}(s,t)}{\sqrt{\gamma_{\ell\ell}(s,s)\,\gamma_{\ell\ell}(t,t)}}$
COVARIANCE AND CORRELATION
Cross-covariance γpq(s, t) = Cov[Yp(s),Yq(t)]
Cross-correlation $\rho_{pq}(s,t) = \dfrac{\gamma_{pq}(s,t)}{\sqrt{\gamma_{pp}(s,s)\,\gamma_{qq}(t,t)}}$
COVARIANCE AND CORRELATION
Trivariate time series: X,Y,Z
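Cross-correlation $\rho(X,Y) = \dfrac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}\,X\;\mathrm{Var}\,Y}}$
Partial cross-correlation between X and Y given Z
Remove Z from X: $\epsilon_X = X - \beta_X Z$
Remove Z from Y: $\epsilon_Y = Y - \beta_Y Z$
$\rho(X,Y \mid Z) = \dfrac{\mathrm{Cov}(\epsilon_X,\epsilon_Y)}{\sqrt{\mathrm{Var}\,\epsilon_X\;\mathrm{Var}\,\epsilon_Y}}$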
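The two-step recipe (remove Z from X and from Y, then correlate the residuals) can be sketched in plain Python. The simulated data are an assumption made only for illustration: X and Y each depend on Z plus independent noise, so they are cross-correlated but should show almost no partial cross-correlation given Z. The helper names (`residual`, `partial_corr`) are hypothetical, and the regression includes an intercept, which the slide leaves implicit.

```python
import math
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (divides by n)."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

def corr(a, b):
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))

def residual(a, z):
    """Residual of a after least-squares regression on z (with intercept)."""
    beta = cov(a, z) / cov(z, z)
    ma, mz = mean(a), mean(z)
    return [(p - ma) - beta * (q - mz) for p, q in zip(a, z)]

def partial_corr(x, y, z):
    """Correlation between X and Y after removing Z from each."""
    return corr(residual(x, z), residual(y, z))

# Assumed toy model: X and Y are linked only through Z.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [zi + rng.gauss(0.0, 1.0) for zi in z]
y = [zi + rng.gauss(0.0, 1.0) for zi in z]

print("cross-correlation:", corr(x, y))
print("partial cross-correlation given Z:", partial_corr(x, y, z))
```

Under this toy model the theoretical cross-correlation is 0.5 and the theoretical partial cross-correlation is 0, so the two printed values should land near those numbers.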
COVARIANCE AND CORRELATION
             Model A   Model B
Cross-Corr   Yes       Yes
Partial CC   No        Yes
WEAK STATIONARITY
Bivariate time series Y(t) = [Y1(t),Y2(t)]′
Under weak stationarity, the following quantities do not change with time t:
EY(t) = [µ1, µ2]′,
Cov[Yp(s), Yq(t)] = λpq(|s − t|)
Var Yp(t) = λpp(0)
TIME DOMAIN MODELS
Univariate time series Y (t)
Moving Average (MA)
Auto-Regressive (AR)
Auto-Regressive Moving Average (ARMA)
Multivariate time series Y(t)
Vector Moving Average (VMA)
Vector Auto-Regressive (VAR)
Vector ARMA (VARMA)
VARMA with exogenous series (VARMAX)
WHITE NOISE
{W (t)} is a white-noise time series if
EW (t) = 0 for all t
{W (t)} is uncorrelated
$\mathrm{Cov}[W(t), W(s)] = \begin{cases} \sigma_W^2, & s = t \\ 0, & s \neq t \end{cases}$
MOVING AVERAGE (MA) TIME SERIES
Let {W(t)} be a white noise time series
Y(t) is a first-order MA [MA(1)] if it can be expressed as
Y(t) = W(t) + θ1 W(t − 1)
MA(Q) time series
$Y(t) = \sum_{q=0}^{Q} \theta_q W(t-q)$ where $\theta_0 = 1$.
Linear system where the input is white noise; the output is the moving average time series
MA gives a one-sided linear combination of present and past white noise
Consider the case θ0 = . . . = θQ = 1; then Y(t) is a "summed" or "smoothed" version of the white noise
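To make the "white noise in, moving average out" picture concrete, here is a minimal sketch that filters simulated white noise through MA coefficients and compares sample autocovariances with the standard closed form γ(h) = σ²_W Σ_q θ_q θ_{q+h}. The Gaussian noise, the value θ1 = 0.5 and the function names are illustrative assumptions, not taken from the talk.

```python
import random

def ma_series(w, theta):
    """Y(t) = sum_{q=0}^{Q} theta[q] * W(t - q), for every t with a full window."""
    Q = len(theta) - 1
    return [sum(theta[q] * w[t - q] for q in range(Q + 1))
            for t in range(Q, len(w))]

def ma_autocov(theta, sigma2, h):
    """Theoretical autocovariance at lag h >= 0 (zero once h exceeds Q)."""
    return sigma2 * sum(theta[q] * theta[q + h] for q in range(len(theta) - h))

def sample_autocov(y, h):
    """Sample autocovariance at lag h >= 0 (divides by the series length)."""
    n = len(y)
    m = sum(y) / n
    return sum((y[t] - m) * (y[t + h] - m) for t in range(n - h)) / n

rng = random.Random(1)
w = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # white noise with variance 1
theta = [1.0, 0.5]                               # theta_0 = 1, theta_1 = 0.5 (assumed)
y = ma_series(w, theta)

for h in range(3):
    print(h, ma_autocov(theta, 1.0, h), sample_autocov(y, h))
```

The autocovariance of an MA(1) series vanishes beyond lag 1, which is the cutoff the last column should reflect up to sampling error.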
AUTO-REGRESSIVE TIME SERIES (AR)
Backshift Operator B
BY (t) = Y (t − 1)
Let k be a positive integer. Then $B^k Y(t) = Y(t-k)$
Let $\Phi(B) = 1 - \phi_1 B - \ldots - \phi_P B^P$. Then
$\Phi(B)Y(t) = Y(t) - \phi_1 Y(t-1) - \ldots - \phi_P Y(t-P)$
Let $\Phi(B) = 1 - \phi_1 B$. Then
$\Phi(B)Y(t) = Y(t) - \phi_1 Y(t-1)$
$\Phi^{-1}(B) = \sum_{\ell=0}^{\infty} \phi_1^{\ell} B^{\ell}$ when $|\phi_1| < 1$
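A short worked step shows why the geometric series acts as the inverse of 1 − φ1B. The truncation point L is a symbol introduced here only for the argument; the slide states the limit directly.

```latex
\begin{aligned}
\Big(\sum_{\ell=0}^{L}\phi_1^{\ell}B^{\ell}\Big)(1-\phi_1 B)
  &= \sum_{\ell=0}^{L}\phi_1^{\ell}B^{\ell} - \sum_{\ell=1}^{L+1}\phi_1^{\ell}B^{\ell} \\
  &= 1 - \phi_1^{L+1}B^{L+1}.
\end{aligned}
```

Applied to a stationary series, the remainder term is φ1^{L+1} Y(t − L − 1), whose variance shrinks to zero as L grows when |φ1| < 1, which is the sense in which the infinite sum inverts Φ(B).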
AUTO-REGRESSIVE TIME SERIES (AR)
Let {W(t)} be a white noise series
Let φ1 ∈ (−1, 1)
{Y(t)} is a stationary first-order auto-regressive [AR(1)] series if
Y(t) = φ1 Y(t − 1) + W(t)
One can also express Y(t) as an infinite-order MA process [MA(∞)]
W(t) = Y(t) − φ1 Y(t − 1)
W(t) = [1 − φ1 B] Y(t)
$Y(t) = \left[\sum_{\ell=0}^{\infty} \phi_1^{\ell} B^{\ell}\right] W(t)$
$Y(t) = \sum_{\ell=0}^{\infty} \phi_1^{\ell} W(t-\ell)$
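The equivalence between the AR(1) recursion and its MA(∞) form can be checked numerically. This sketch assumes the series is started from zero before the first observation, so the infinite sum is truncated at the start of the sample; with that convention the two constructions should agree up to floating-point error. Function names and φ1 = 0.9 are illustrative choices.

```python
import random

def ar1_recursion(w, phi):
    """Y(t) = phi * Y(t-1) + W(t), started from Y = 0 before the sample (assumption)."""
    y, prev = [], 0.0
    for wt in w:
        prev = phi * prev + wt
        y.append(prev)
    return y

def ar1_as_ma(w, phi):
    """Y(t) = sum_{l=0}^{t} phi**l * W(t-l): the MA sum truncated at the sample start."""
    return [sum(phi ** l * w[t - l] for l in range(t + 1)) for t in range(len(w))]

rng = random.Random(2)
w = [rng.gauss(0.0, 1.0) for _ in range(200)]
phi = 0.9

y_rec = ar1_recursion(w, phi)
y_ma = ar1_as_ma(w, phi)
max_gap = max(abs(a - b) for a, b in zip(y_rec, y_ma))
print("largest difference between the two forms:", max_gap)
```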
AUTO-REGRESSIVE TIME SERIES (AR)
{Y(t)} is an AR(P) time series if it can be expressed as
$Y(t) = \sum_{p=1}^{P} \phi_p Y(t-p) + W(t)$
$\Phi(B) = 1 - \phi_1 B - \ldots - \phi_P B^P$ is called the AR polynomial
Y(t) is stationary and causal if the roots of Φ(z) lie outside of the unit circle
Example: AR(1)
$\Phi(B) = 1 - \phi_1 B$
The solution to $\Phi(z) = 0$ is $z = 1/\phi_1$, with $|z| > 1$ when $|\phi_1| < 1$
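The root condition can be checked directly for low orders. The sketch below handles only AR(1) and AR(2), solving Φ(z) = 0 in closed form; the function names and the example coefficients are assumptions chosen for illustration, and a general AR(P) would need a numerical polynomial root finder instead.

```python
import cmath

def ar_roots(phi1, phi2=0.0):
    """Roots of Phi(z) = 1 - phi1*z - phi2*z**2 for an AR(1) or AR(2) model."""
    if phi2 == 0.0:
        if phi1 == 0.0:
            return []                      # Phi(z) = 1 has no roots
        return [complex(1.0 / phi1)]
    # Phi(z) = 0 is equivalent to phi2*z**2 + phi1*z - 1 = 0
    disc = cmath.sqrt(phi1 ** 2 + 4.0 * phi2)
    return [(-phi1 + disc) / (2.0 * phi2), (-phi1 - disc) / (2.0 * phi2)]

def is_causal(phi1, phi2=0.0):
    """True when every root of Phi(z) lies strictly outside the unit circle."""
    return all(abs(z) > 1.0 for z in ar_roots(phi1, phi2))

print(ar_roots(0.9), is_causal(0.9))            # AR(1) with phi1 = 0.9
print(ar_roots(1.1), is_causal(1.1))            # AR(1) with phi1 = 1.1
print(ar_roots(0.5, 0.3), is_causal(0.5, 0.3))  # AR(2) example
print(ar_roots(0.5, 0.6), is_causal(0.5, 0.6))  # AR(2) example
```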
AUTO-REGRESSIVE TIME SERIES (AR)
On ESTIMATION AND INFERENCE
Y (t) = φ1Y (t − 1) + W (t), φ1 ∈ (−1,1)
Impose $W(t) \sim N(0, \sigma_W^2)$
$Y(t) \sim N\!\left(0, \dfrac{\sigma_W^2}{1-\phi_1^2}\right)$
$Y(t) \mid Y(t-1), \ldots, Y(1) \sim N(\phi_1 Y(t-1), \sigma_W^2)$
Data: {Y (1), . . . ,Y (T )}
AUTO-REGRESSIVE TIME SERIES (AR)
Conditional likelihood
$L_C(\phi_1, \sigma_W^2) = f(y(2) \mid y(1)) \cdots f(y(T) \mid y(1), \ldots, y(T-1))$
$= \left(\dfrac{1}{\sqrt{2\pi\sigma_W^2}}\right)^{T-1} \exp\left(-\dfrac{1}{2\sigma_W^2} \sum_{t=2}^{T} \big(y(t) - \phi_1 y(t-1)\big)^2\right)$
AUTO-REGRESSIVE TIME SERIES (AR)
Full likelihood
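$L(\phi_1, \sigma_W^2) = f(y(1))\, L_C(\phi_1, \sigma_W^2)$
$= \dfrac{\sqrt{1-\phi_1^2}}{\left(\sqrt{2\pi\sigma_W^2}\right)^{T}} \exp\left(-\dfrac{1}{2\sigma_W^2}\left[(1-\phi_1^2)\,y(1)^2 + \sum_{t=2}^{T}\big(y(t)-\phi_1 y(t-1)\big)^2\right]\right).$
The conditional likelihood is asymptotically equivalent to the full likelihood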
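As an illustration of the two likelihoods, the sketch below simulates a Gaussian AR(1) series, maximizes the conditional likelihood in closed form (it reduces to least squares of y(t) on y(t − 1)), and evaluates both log-likelihoods at the estimate. The true values φ1 = 0.6 and σ_W = 1, the sample size and all function names are assumptions made for the example.

```python
import math
import random

def simulate_ar1(T, phi, sigma, rng):
    """Draw Y(1) from N(0, sigma^2 / (1 - phi^2)), then apply the AR(1) recursion."""
    y = [rng.gauss(0.0, sigma / math.sqrt(1.0 - phi ** 2))]
    for _ in range(T - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, sigma))
    return y

def cond_loglik(y, phi, sigma2):
    """log L_C: Gaussian log-density of y(2..T) given the preceding value."""
    rss = sum((y[t] - phi * y[t - 1]) ** 2 for t in range(1, len(y)))
    return -0.5 * (len(y) - 1) * math.log(2.0 * math.pi * sigma2) - rss / (2.0 * sigma2)

def full_loglik(y, phi, sigma2):
    """log L = log f(y(1)) + log L_C, with y(1) ~ N(0, sigma2 / (1 - phi^2))."""
    v1 = sigma2 / (1.0 - phi ** 2)
    log_f1 = -0.5 * math.log(2.0 * math.pi * v1) - y[0] ** 2 / (2.0 * v1)
    return log_f1 + cond_loglik(y, phi, sigma2)

def cond_mle(y):
    """Maximizer of L_C: least-squares slope and the mean squared residual."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    phi_hat = num / den
    rss = sum((y[t] - phi_hat * y[t - 1]) ** 2 for t in range(1, len(y)))
    return phi_hat, rss / (len(y) - 1)

rng = random.Random(3)
y = simulate_ar1(5000, 0.6, 1.0, rng)
phi_hat, sigma2_hat = cond_mle(y)
print("estimates:", phi_hat, sigma2_hat)
print("conditional vs full log-likelihood:",
      cond_loglik(y, phi_hat, sigma2_hat), full_loglik(y, phi_hat, sigma2_hat))
```

The two log-likelihoods differ only by the single term log f(y(1)), which becomes negligible relative to the sum over T − 1 conditional terms as T grows.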
THE SPECTRUM
X (t) STATIONARY TEMPORAL PROCESS
Cramér Representation
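$X(t) = \int_{-0.5}^{0.5} \exp(i2\pi\omega t)\, dZ(\omega)$, $t = 0, \pm 1, \pm 2, \ldots$
Basis: Fourier waveforms exp(i2πωt), ω ∈ (−0.5, 0.5)
Random coefficients dZ(ω): increment random process
$E\, dZ(\omega) = 0$ and $\mathrm{Cov}[dZ(\omega), dZ(\lambda)] = \delta(\omega - \lambda)\, f(\omega)\, d\omega\, d\lambda$
$\mathrm{Var}\, dZ(\omega) = f(\omega)\, d\omega$
Spectrum f(ω)
DECOMPOSITION OF VARIANCE of X(t): $\mathrm{Var}\, X(t) = \int_{-0.5}^{0.5} f(\omega)\, d\omega$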
THE SPECTRUM
Stochastic Regression Model
[Figure: two waveforms, Wave (2 oscillations) and Wave (10 oscillations), each shown with the distribution of its random coefficient on an axis from −4 to 4]
THE SPECTRUM
SPECTRUM – decomposition of variance
X = [X (1), . . . ,X (T )]′ - zero mean stationary time series
Φ - columns are the orthonormal Fourier waveforms
d = [d(ω0), . . . ,d(ωT−1)]′ - Fourier coefficients
X = Φd
$X'X = d'd$ (Parseval's identity)
$\frac{1}{T} E\, X'X = \frac{1}{T} E\, d'd$
$\mathrm{Var}\, X(t) \approx \int f(\omega)\, d\omega$
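Parseval's identity is easy to verify numerically with an orthonormal discrete Fourier transform. In this sketch the series is simulated white noise of length 64 and time is indexed from 0 rather than 1; both are assumptions for convenience. Because the Fourier coefficients are complex, the frequency-side energy is computed as the sum of squared moduli.

```python
import cmath
import math
import random

def fourier_coefficients(x):
    """d(omega_k) = T**-0.5 * sum_t x(t) * exp(-i*2*pi*k*t/T), for k = 0..T-1."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / T) for t in range(T)) / math.sqrt(T)
            for k in range(T)]

rng = random.Random(4)
x = [rng.gauss(0.0, 1.0) for _ in range(64)]
d = fourier_coefficients(x)

energy_time = sum(v * v for v in x)          # X'X
energy_freq = sum(abs(c) ** 2 for c in d)    # sum over k of |d(omega_k)|^2
print(energy_time, energy_freq)
```

Dividing both sides by T turns the identity into the statement that the average squared coefficient matches the sample second moment, which is the finite-sample version of the variance decomposition.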
THE SPECTRUM
A more formal derivation ...
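$X(t) = \int_{-0.5}^{0.5} \exp(i2\pi\omega t)\, dZ(\omega)$
$\gamma(h) = \mathrm{Cov}[X(t+h), X(t)]$
$f(\omega) = \sum_{h=-\infty}^{\infty} \gamma(h) \exp(-i2\pi\omega h)$
$\gamma(h) = \int_{-0.5}^{0.5} f(\omega) \exp(i2\pi\omega h)\, d\omega$
$\gamma(0) = \int_{-0.5}^{0.5} f(\omega)\, d\omega$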
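For an AR(1) process these relations can be checked against a known closed form. The sketch uses the standard AR(1) spectrum f(ω) = σ²_W / (1 − 2φ1 cos(2πω) + φ1²), which follows from summing γ(h) = σ²_W φ1^|h| / (1 − φ1²) in the formula for f(ω), and integrates it numerically to recover γ(0). The coefficients ±0.9 mirror the AR(1) examples used elsewhere in the talk; σ²_W = 1 and the midpoint rule are assumptions.

```python
import math

def ar1_spectrum(omega, phi, sigma2=1.0):
    """f(omega) = sigma2 / (1 - 2*phi*cos(2*pi*omega) + phi**2)."""
    return sigma2 / (1.0 - 2.0 * phi * math.cos(2.0 * math.pi * omega) + phi ** 2)

def integrate_spectrum(phi, sigma2=1.0, n=4000):
    """Midpoint rule for the integral of f over (-0.5, 0.5)."""
    step = 1.0 / n
    return step * sum(ar1_spectrum(-0.5 + (k + 0.5) * step, phi, sigma2)
                      for k in range(n))

for phi in (0.9, -0.9):
    gamma0 = 1.0 / (1.0 - phi ** 2)   # Var X(t) for unit innovation variance
    print(phi,
          "f(0) =", ar1_spectrum(0.0, phi),
          "f(0.5) =", ar1_spectrum(0.5, phi),
          "integral =", integrate_spectrum(phi),
          "gamma(0) =", gamma0)
```

With φ1 = 0.9 the spectrum is largest at ω = 0 (low-frequency oscillations), and with φ1 = −0.9 it is largest at ω = 0.5 (high-frequency oscillations), while both integrate to the same variance.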
SPECTRUM = VARIANCE DECOMPOSITION
AR(1): $X_t = 0.9 X_{t-1} + \epsilon_t$
Low Frequency Oscillations
[Figure: simulated series versus Time, 0 to 1000; vertical axis from −10 to 8]
SPECTRUM = VARIANCE DECOMPOSITION
Spectrum of AR(1) with φ = 0.9
[Figure: axes labelled Time (0 to 1) and Frequency (0 to 0.5)]
SPECTRUM = VARIANCE DECOMPOSITION
AR(1): $X_t = -0.9 X_{t-1} + \epsilon_t$
High Frequency Oscillations
[Figure: simulated series versus Time, 0 to 1000; vertical axis from −8 to 6]
SPECTRUM = VARIANCE DECOMPOSITION
Spectrum of AR(1) with φ = −0.9
[Figure: axes labelled Time (0 to 1) and Frequency (0 to 0.5)]
SPECTRUM = VARIANCE DECOMPOSITION
Mixture: Low + High Frequency Signal
[Figure: series versus Time, 0 to 1000; vertical axis from −15 to 20]
SPECTRUM = VARIANCE DECOMPOSITION
Spectrum of the mixed signal
[Figure: axes labelled Time (0 to 1) and Frequency (0 to 0.5)]
CROSS-COHERENCE - A MEASURE OF DEPENDENCE
An Illustration: Interactions between oscillatory components
Latent Signals
U1(t) - low frequency signal
U2(t) - high frequency signal
Observed Signals
X(t) = U1(t) + U2(t) + Z1(t)
Y(t) = U1(t + ℓ) + Z2(t)
X and Y are linearly related through U1.
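The illustration can be simulated to see coherence pick out the shared low-frequency component. Everything specific here is an assumption: U1 and U2 are taken to be AR(1) series with coefficients 0.9 and −0.9, Z1 and Z2 are unit-variance white noise, the lag ℓ is 2, and coherence is estimated by averaging raw Fourier coefficients over non-overlapping segments of length 64. This simple untapered estimator is a stand-in for illustration, not the estimator used in the talk.

```python
import cmath
import math
import random

def ar1(T, phi, rng):
    """AR(1) path with unit-variance innovations, after a burn-in of 100 steps."""
    y, prev = [], 0.0
    for t in range(T + 100):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        if t >= 100:
            y.append(prev)
    return y

def dft_at(seg, omega):
    """Fourier coefficient of one segment at frequency omega (cycles per sample)."""
    return sum(v * cmath.exp(-2j * math.pi * omega * t) for t, v in enumerate(seg))

def coherence(x, y, omega, seg_len=64):
    """Squared coherence at omega, averaging over non-overlapping segments."""
    sxy, sxx, syy = 0j, 0.0, 0.0
    for start in range(0, len(x) - seg_len + 1, seg_len):
        dx = dft_at(x[start:start + seg_len], omega)
        dy = dft_at(y[start:start + seg_len], omega)
        sxy += dx * dy.conjugate()
        sxx += abs(dx) ** 2
        syy += abs(dy) ** 2
    return abs(sxy) ** 2 / (sxx * syy)

rng = random.Random(5)
T, lag = 2048, 2
u1 = ar1(T + lag, 0.9, rng)    # low-frequency latent signal U1
u2 = ar1(T, -0.9, rng)         # high-frequency latent signal U2
x = [u1[t] + u2[t] + rng.gauss(0.0, 1.0) for t in range(T)]   # X = U1 + U2 + Z1
y = [u1[t + lag] + rng.gauss(0.0, 1.0) for t in range(T)]     # Y = U1(t + lag) + Z2

coh_low = coherence(x, y, 1 / 64)     # a low-frequency bin
coh_high = coherence(x, y, 30 / 64)   # a high-frequency bin
print("coherence at low frequency:", coh_low)
print("coherence at high frequency:", coh_high)
```

Because X and Y share only U1, the estimate should be close to 1 in the low-frequency bin and close to 0 in the high-frequency bin, even though X carries substantial high-frequency power from U2.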
CROSS-COHERENCE - A MEASURE OF DEPENDENCE
[Figure: X1(t) and X2(t), with frequency bands labelled Theta, Delta, Alpha, Beta, Gamma]
CROSS-COHERENCE - A MEASURE OF DEPENDENCE
Time series at 3 channels: X,Y,Z
Cross-correlation $\rho(X,Y) = \dfrac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}\,X\;\mathrm{Var}\,Y}}$
Partial cross-correlation between X and Y given Z
Remove Z from X: $\epsilon_X = X - \beta_X Z$
Remove Z from Y: $\epsilon_Y = Y - \beta_Y Z$
$\rho(X,Y \mid Z) = \dfrac{\mathrm{Cov}(\epsilon_X,\epsilon_Y)}{\sqrt{\mathrm{Var}\,\epsilon_X\;\mathrm{Var}\,\epsilon_Y}}$
CROSS-COHERENCE - A MEASURE OF DEPENDENCE
             Model A   Model B
Cross-Corr   Yes       Yes
Partial CC   No        Yes
CROSS-COHERENCE - A MEASURE OF DEPENDENCE
When ρ(X,Y|Z) ≠ 0, we want to identify the frequency bands that drive the direct linear association.
When ρ(X,Y) ≠ 0, we want to identify the frequency bands that drive the linear association.
Notation
$U(t) = \begin{pmatrix} X(t) \\ Y(t) \\ Z(t) \end{pmatrix}, \qquad dZ(\omega) = \begin{pmatrix} dZ_X(\omega) \\ dZ_Y(\omega) \\ dZ_Z(\omega) \end{pmatrix}$
Spectral representation of a stationary process
$U(t) = \int_{-0.5}^{0.5} \exp(i2\pi\omega t)\, dZ(\omega).$
Spectral matrix: $\mathrm{Cov}\, dZ(\omega) = f(\omega)\, d\omega$
Formal definition of coherence
$\rho_{X,Y}(\omega) = |\mathrm{Corr}(dZ_X(\omega), dZ_Y(\omega))|^2$
CROSS-COHERENCE: AN INTUITIVE INTERPRETATION
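Ombao and Van Bellegem (2008, IEEE Trans Signal Processing)
Filtered Signals
$X_\omega(t) = F_\omega X(t)$, $Y_\omega(t) = F_\omega Y(t)$, $Z_\omega(t) = F_\omega Z(t)$
Coherence at frequency band around ω
$\rho_{X,Y}(\omega) \approx |\mathrm{Corr}(X_\omega(t), Y_\omega(t))|^2$
Partial coherence
Remove $Z_\omega(t)$ from $X_\omega(t)$: $\xi^X_\omega(t) = X_\omega(t) - \beta_X Z_\omega(t)$
Remove $Z_\omega(t)$ from $Y_\omega(t)$: $\xi^Y_\omega(t) = Y_\omega(t) - \beta_Y Z_\omega(t)$
$\rho_{X,Y|Z}(\omega) = \left| \dfrac{\mathrm{Cov}(\xi^X_\omega(t), \xi^Y_\omega(t))}{\sqrt{\mathrm{Var}\,\xi^X_\omega(t)\;\mathrm{Var}\,\xi^Y_\omega(t)}} \right|^2$
Relevant work: Pupin (1898)
Estimator for $f_X(\omega)$ is $\mathrm{Var}\, X_\omega(t)$.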