Stochastic Process Theory and Spectral Estimation Bijan Pesaran Center for Neural Science New York...

Post on 18-Jan-2018


Stochastic Process Theory and Spectral Estimation
Bijan Pesaran
Center for Neural Science
New York University

Data is modeled as a stochastic process
[Figure: example spike train and LFP trace; axes: Amplitude (mV) vs. Time (s)]
Similar considerations apply to EEG, MEG, ECoG, intracellular membrane potentials, intrinsic and extrinsic optical images, 2-photon line scans and so on.

Overview
Stochastic process theory
Spectral estimation

Stochastic process theory
Defining stochastic processes
Time translation invariance; ergodicity
Moments (correlation functions) and spectra
Example Gaussian processes

Stochastic processes
Each time series is a realization of a stochastic process.
Given a sequence of observations $x_t$ at times $t = 1, \dots, T$, a stochastic process is characterized by the probability distribution $p(x_1, x_2, \dots, x_T)$.
Akin to rolling a die for each time series
  Probability distribution for time series
Alternative is a deterministic process
  No stochastic variability

Defining stochastic processes
High dimensional random variables: $p(x_1, x_2, \dots, x_T) = p(\mathbf{x})$
  Rolling one die picks a point in high dimensional space.
  Function in N-dimensional space.
Indexed families of random variables: $x_t$
  Roll many dice.

Challenge of data analysis
We can never know the full probability distribution of the data.
  Curse of dimensionality
Parametric methods
  Parametric methods infer the PDF by considering a parameterized subspace.
  Employ relatively strong models of the underlying process.
Non-parametric methods
  Non-parametric methods use the observed data to infer statistical properties of the PDF.
  Employ relatively weak models of the underlying process.

Stationarity
Stochastic processes don't exactly repeat themselves: $x(t + T) \neq x(t)$
They have statistical regularities (stationarity):
$$E[x(t + T)] = E[x(t)]$$
$$E[x(t_1 + T)\, x(t_2 + T)] = E[x(t_1)\, x(t_2)]$$
$$E[x(t_1 + T)\, x(t_2 + T)\, x(t_3 + T)] = E[x(t_1)\, x(t_2)\, x(t_3)]$$

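Ergodicity
Ensemble averages are equivalent to time averages:
$$\lim_{T \to \infty} \frac{1}{T} \int_0^T x(t)\, dt = E[x(t)]$$
Often assumed in experimental work
More stringent than stationarity
  $x(t) = c$, with the constant $c$ drawn at random for each realization, is not ergodic unless only one value of $c$ is possible.
  Is activity with a time-varying constant ergodic?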
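As a rough numerical illustration of the ergodicity statement, the sketch below compares a time average with an ensemble average for two synthetic processes. The processes, sample sizes and variable names are assumptions chosen for the example and are not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_samples = 200, 5000

# Ergodic example (assumed): white Gaussian noise around a fixed mean of 1.0.
ergodic = 1.0 + rng.standard_normal((n_trials, n_samples))

# Non-ergodic example: x(t) = c, with c drawn once per realization.
c = rng.standard_normal(n_trials)
constant = np.repeat(c[:, None], n_samples, axis=1)

# Time average of one realization vs. ensemble average at one time point.
time_avg_ergodic = ergodic[0].mean()
ensemble_avg_ergodic = ergodic[:, 0].mean()
time_avg_constant = constant[0].mean()          # equals c[0], not E[c]
ensemble_avg_constant = constant[:, 0].mean()   # close to E[c] = 0
```

For the noise process both averages land near 1.0; for the constant process the time average returns whichever value of c that realization happened to draw.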

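Gaussian processes
$$p(x_1, x_2, \dots, x_N) = \frac{1}{(2\pi)^{N/2} \sqrt{\det C}} \exp\left(-\frac{1}{2} \sum_{ij} x_i\, C^{-1}_{ij}\, x_j\right)$$
$$C_{ij} = E[x(t_i)\, x(t_j)]$$
where $x_i = x(t_i)$.
Examples: Ornstein Uhlenbeck process, Wiener process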
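A minimal sketch of what the Gaussian density means in practice: a zero-mean Gaussian process is fully specified by its covariance matrix, so realizations can be drawn from a Cholesky factor of C. The exponential covariance, the time constant `tau` and the helper `log_density` are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N, tau = 5, 2.0
t = np.arange(N)

# Assumed covariance: C_ij = exp(-|t_i - t_j| / tau).
C = np.exp(-np.abs(t[:, None] - t[None, :]) / tau)

# Draw realizations x = L z with C = L L^T and z standard normal.
L = np.linalg.cholesky(C)
samples = (L @ rng.standard_normal((N, 20000))).T

# The sample second moment should approach C.
C_hat = samples.T @ samples / samples.shape[0]

def log_density(x, C):
    """Log of the zero-mean multivariate Gaussian density p(x_1, ..., x_N)."""
    _, logdet = np.linalg.slogdet(C)
    quad = x @ np.linalg.solve(C, x)
    return -0.5 * (quad + logdet + len(x) * np.log(2 * np.pi))
```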

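Fourier Transform
$$X(f) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\, x(t)\, dt$$
$$x(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, X(f)\, df$$
Parseval's Theorem (total power is conserved):
$$\int |x(t)|^2\, dt = \int |X(f)|^2\, df$$
Real functions: $X(-f) = X^*(f)$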
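Parseval's theorem and the conjugate symmetry of real signals can be checked numerically on sampled data. This is a sketch that approximates the continuous transform by `dt * FFT`; the sampling interval and signal length are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, dt = 1024, 0.001              # number of samples and sampling interval (s)
x = rng.standard_normal(n)       # a real-valued test signal

X = dt * np.fft.fft(x)           # Riemann-sum approximation to X(f)
df = 1.0 / (n * dt)              # frequency spacing of the FFT grid

power_time = np.sum(np.abs(x) ** 2) * dt   # approximates the integral of |x(t)|^2 dt
power_freq = np.sum(np.abs(X) ** 2) * df   # approximates the integral of |X(f)|^2 df

# For real x, the bin at -f holds the complex conjugate of the bin at +f.
k = np.arange(1, n)
symmetric = np.allclose(X[n - k], np.conj(X[k]))
```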

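Examples of Fourier Transforms (with $\omega = 2\pi f$)
Time domain $\leftrightarrow$ Frequency domain
$\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{t^2}{2\sigma^2}\right) \leftrightarrow \exp\left(-\frac{1}{2}\sigma^2\omega^2\right)$
$\int x(t')\, h(t - t')\, dt' \leftrightarrow X(\omega)\, H(\omega)$
$1 \leftrightarrow 2\pi\,\delta(\omega)$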

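Time translation invariance
Leads directly to spectral analysis
The Fourier basis is the eigenbasis of the time translation operator $\mathcal{T}_T$:
$$\mathcal{T}_T\, x(t) = x(t + T)$$
$$\mathcal{T}_T\, e^{at} = e^{a(t + T)} = e^{aT}\, e^{at}$$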

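Implications for second moment
If the process is stationary, the second moment is time translation invariant.
Because translating $x(t)$ by $T$ multiplies its Fourier transform by a phase factor, $X(f) \to e^{2\pi i f T} X(f)$, stationarity requires
$$E[X(f)\, X^*(f')] = e^{2\pi i (f - f') T}\, E[X(f)\, X^*(f')]$$
for every $T$.
Hence, for $f \neq f'$,
$$E[X(f)\, X^*(f')] = 0$$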

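Stationarity
Stationarity means neighboring frequencies are uncorrelated.
Not true for neighboring times: $E[x(t)\, x(t')] \neq 0$ (in general)
Also due to stationarity,
$$E[X(f)\, X^*(f')] = S(f)\, \delta(f - f')$$
$$C(\tau) = \int \exp(2\pi i f \tau)\, S(f)\, df$$
where $C(\tau) = E[x(t + \tau)\, x(t)]$ is the correlation function and $S(f)$ is the spectrum.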

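Ornstein Uhlenbeck Process
Exponentially decaying correlation function:
$$C(t, t') = \frac{\sigma^2}{2\tau}\, e^{-|t - t'|/\tau}$$
Obtained by passing white noise through a 'leaky' integrator:
$$\tau \frac{d}{dt} x(t) = -x(t) + \xi(t), \qquad E[\xi(t)\, \xi(t')] = \sigma^2\, \delta(t - t')$$
Spectrum is Lorentzian:
$$S(f) = \frac{\sigma^2}{1 + (2\pi f \tau)^2}$$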
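The exponentially decaying correlation function can be checked by simulation. The sketch below assumes the leaky integrator is written as tau dx/dt = -x + xi with white-noise strength sigma^2, for which the stationary variance is sigma^2 / (2 tau); the parameter values and the exact one-step update are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(3)
tau, sigma, dt, n = 0.05, 1.0, 0.001, 400_000

rho = np.exp(-dt / tau)            # one-step correlation of the sampled process
var = sigma ** 2 / (2 * tau)       # stationary variance under the assumed scaling

# Exact discretization: x[i] = rho * x[i-1] + Gaussian innovation.
noise = rng.standard_normal(n) * np.sqrt(var * (1 - rho ** 2))
x = np.empty(n)
x[0] = rng.standard_normal() * np.sqrt(var)
for i in range(1, n):
    x[i] = rho * x[i - 1] + noise[i]

def autocov(x, lag):
    """Sample autocovariance at a non-negative integer lag."""
    x = x - x.mean()
    return np.mean(x[: len(x) - lag] * x[lag:])

lags = np.array([0, 25, 50, 100])                 # in samples
c_hat = np.array([autocov(x, k) for k in lags])
c_theory = var * np.exp(-lags * dt / tau)         # exponential decay with time constant tau
```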

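Ornstein Uhlenbeck process
[Figure: Lorentzian spectrum $S(f) = \sigma^2 / (1 + (2\pi f \tau)^2)$ with its low- and high-frequency asymptotes]
$S(f) \sim \sigma^2$ for $f\tau \ll 1$
$S(f) \sim \sigma^2 / (2\pi f \tau)^2$ for $f\tau \gg 1$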

Markovian process
"The future is independent of the past, given the present"
Simplifies the joint probability density:
$$p(x(t_n) \mid x(t_{n-1}), x(t_{n-2}), \dots, x(t_1)) = p(x(t_n) \mid x(t_{n-1}))$$
$$p(x(t_n), x(t_{n-1}) \mid x(t_{n-2})) = p(x(t_n) \mid x(t_{n-1}))\; p(x(t_{n-1}) \mid x(t_{n-2}))$$
[Diagram: chain of states $x(t_n)$, $x(t_{n-1})$, $x(t_{n-2})$, $x(t_{n-3})$]

Wiener process

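Cross-spectrum and coherence
$$S_{XY}(f)\, \delta(f - f') = E[X(f)\, Y^*(f')]$$
$$S_{XY}(f) = \int \exp(-2\pi i f \tau)\, E[x(t + \tau)\, y(t)]\, d\tau$$
$$C_{XY}(f) = \frac{S_{XY}(f)}{\sqrt{S_X(f)\, S_Y(f)}}$$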

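Coherence
Coherence measures the linear association between two time series.
The cross-spectrum is the Fourier transform of the cross-correlation function.
Example: a scaled, delayed copy of $x$ plus independent noise $\eta$, with Fourier transform $N(f)$:
$$y(t) = a\, x(t - \tau) + \eta(t)$$
$$Y(f) = a\, X(f)\, e^{-2\pi i f \tau} + N(f)$$
Coherence:
$$C_{XY}(f) = \frac{a\, e^{2\pi i f \tau}}{\sqrt{a^2 + S_\eta(f) / S_X(f)}}$$
Frequency-dependent time delay, from the phase $\phi(f)$ of $C_{XY}(f)$:
$$\tau(f) = \frac{1}{2\pi} \frac{d\phi}{df}$$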
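The delayed-copy model can be simulated to see how the coherency phase encodes the delay. This is a sketch under assumed settings: white-noise input, unit-variance additive noise, an integer delay `d` in samples, and a plain segment-averaged estimator with the convention S_XY = E[X Y*]. None of these choices come from the slides.

```python
import numpy as np

rng = np.random.default_rng(4)
L, K = 256, 400                  # segment length and number of segments
a, d = 1.0, 5                    # gain and delay in samples

x = rng.standard_normal(L * K)
y = a * np.roll(x, d) + rng.standard_normal(L * K)   # y(t) = a x(t - d) + noise

# Segment-wise Fourier transforms.
X = np.fft.rfft(x.reshape(K, L), axis=1)
Y = np.fft.rfft(y.reshape(K, L), axis=1)
f = np.fft.rfftfreq(L)           # cycles per sample

# Averaged cross-spectrum and spectra, then coherency.
S_xy = np.mean(X * np.conj(Y), axis=0)
S_x = np.mean(np.abs(X) ** 2, axis=0)
S_y = np.mean(np.abs(Y) ** 2, axis=0)
coherency = S_xy / np.sqrt(S_x * S_y)

# The phase grows as 2 pi f d, so its slope over frequency gives the delay.
band = slice(1, -1)              # drop the DC and Nyquist bins, which are real
phase = np.unwrap(np.angle(coherency[band]))
delay_est = np.polyfit(f[band], phase, 1)[0] / (2 * np.pi)
mean_coh = np.abs(coherency[band]).mean()
```

With equal signal and noise power the coherence magnitude should sit near 1/sqrt(2), slightly reduced because the delay pushes a few samples across each segment boundary.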

Advantages of coherence functions
Neighboring bins are uncorrelated
  Error bars relatively easy to calculate
  Stable statistical estimators
Separate signals that have different frequencies
Normalized quantities
  Allow averaging and comparisons

Spectral estimation for continuous processes

Spectral estimation for continuous processes
Spectral estimation: periodogram
  Bias
  Variance
Nonparametric quadratic estimators: tapering
  Multitaper estimates using Slepians
Spectrum and coherence

Example LFP spectrum
[Figure: two panels. Periodogram, single trial; multitaper estimate, single trial, 2NT = 10]

Spectral estimation problem
The Fourier transform requires an infinite sequence of data:
$$X(f) = \sum_{t=-\infty}^{\infty} e^{-2\pi i f t}\, x(t)$$
In reality, we only have finite sequences of data, and so we calculate the truncated DFT:
$$X_T(f) = \sum_{t=-T/2}^{T/2} e^{-2\pi i f t}\, x(t)$$

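What happens if we have a finite sequence of data?
$$X_T(f) = \sum_{t=-T/2}^{T/2} e^{-2\pi i f t}\, x(t) = \sum_{t=-T/2}^{T/2} e^{-2\pi i f t} \int_{-1/2}^{1/2} df'\, e^{2\pi i f' t}\, X(f')$$
$$D_T(f - f') = \sum_{t=-T/2}^{T/2} \exp\left(-2\pi i (f - f')\, t\right)$$
$$X_T(f) = \int_{-1/2}^{1/2} df'\, D_T(f - f')\, X(f')$$
A finite sequence means the DFT is the convolution of $D_T(f)$ and $X(f)$.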

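The Fourier transform of a rectangular window is the Dirichlet kernel $D_T(f)$:
$$D_T(f) = \sum_{t} \exp(-2\pi i f t)\, w(t) = \frac{\sin(\pi f (T + 1))}{\sin(\pi f)}$$
where $w(t) = 1$ for $|t| \le T/2$ and $w(t) = 0$ otherwise.
Convolution in frequency = product in time:
$$\sum_{t=-T/2}^{T/2} \exp(-2\pi i f t)\, x(t) = \sum_{t=-\infty}^{\infty} \exp(-2\pi i f t)\, x(t)\, w(t)$$
$$X_T(f) = \int_{-1/2}^{1/2} df'\, X(f')\, D_T(f - f')$$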
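The closed form of the Dirichlet kernel can be verified against the direct sum over a rectangular window. The sketch assumes an even T, so that t runs over the T + 1 integers from -T/2 to T/2, and picks T = 50 arbitrarily.

```python
import numpy as np

T = 50
t = np.arange(-T // 2, T // 2 + 1)        # T + 1 samples, symmetric about 0
f = np.linspace(-0.5, 0.5, 2001)
f = f[np.abs(f) > 1e-9]                   # skip f = 0, where the closed form is 0/0

# Direct sum over the rectangular window.
D_sum = np.exp(-2j * np.pi * np.outer(f, t)).sum(axis=1)

# Closed form: sin(pi f (T + 1)) / sin(pi f).
D_closed = np.sin(np.pi * f * (T + 1)) / np.sin(np.pi * f)

# Height of the largest sidelobe relative to the peak value T + 1.
outside_main_lobe = np.abs(f) > 1.0 / (T + 1)
sidelobe_ratio = np.abs(D_closed[outside_main_lobe]).max() / (T + 1)
```

The sidelobe ratio comes out a little above 0.2, which is the source of broadband bias for the untapered estimate.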

Bias
Bias is the difference between the expected value of an estimator and the true value.
The Dirichlet kernel is not a delta function, therefore the sample estimate is biased and doesn't equal the true value.
$$\mathrm{BIAS} = E[\hat{X}(f)] - X(f)$$

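Normalized Dirichlet kernel
[Figure: normalized Dirichlet kernel; central lobe of width $\Delta f = 2/T$, first sidelobe at about 20% of the peak height]
Narrowband bias: local bias due to the central lobe
Broadband bias: bias from distant frequencies due to sidelobes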

Data tapers
We can do better than multiplying the data by a rectangular kernel.
Choose a function that tapers the data to zero towards the edge of the segment.
Many choices of data taper exist: Hanning taper, Hamming taper, triangular taper and so on.

Triangular taper
$$w(t) = 1 - \frac{2|t|}{T + 1}$$
[Figure: Fejér kernel, for the triangular taper, compared with the Dirichlet kernel, for the rectangular taper]
Reduces sidelobes
Broadens central lobe
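The trade-off between sidelobe height and central-lobe width can be measured directly from the taper transforms. The sketch below assumes a triangular taper of the form w(t) = 1 - 2|t|/(T + 1) on t = -T/2, ..., T/2; the helper `lobe_stats` and the zero-padding length are choices made for this example.

```python
import numpy as np

def lobe_stats(w, nfft=1 << 14):
    """Return (central-lobe half-width in cycles/sample, peak sidelobe / main peak)."""
    W = np.abs(np.fft.rfft(w, nfft))
    W /= W[0]
    # The first point where |W| starts rising again marks the edge of the central lobe.
    edge = np.nonzero(np.diff(W) > 0)[0][0]
    return edge / nfft, W[edge:].max()

T = 50
t = np.arange(-T // 2, T // 2 + 1)
rect = np.ones(T + 1)
tri = 1.0 - 2.0 * np.abs(t) / (T + 1)

rect_width, rect_side = lobe_stats(rect)
tri_width, tri_side = lobe_stats(tri)
```

The triangular taper should show sidelobes several times lower than the rectangular window, at the cost of a central lobe roughly twice as wide.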

Spectral concentration problem
Tapering the data reduces sidelobes but broadens the central lobes.
Are there "optimal" tapers?
Find strictly time-localized functions, $w(t)$, $t = 1, \dots, T$, whose Fourier transforms are maximally localized on the frequency interval [-W, W].

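Optimal tapers
The DFT, $U(f)$, of a finite series, $w(t)$:
$$U(f) = \sum_{t=1}^{T} w(t)\, e^{-2\pi i f t}$$
Find the series that maximizes the energy in a [-W, W] frequency band:
$$\lambda = \frac{\int_{-W}^{W} |U(f)|^2\, df}{\int_{-1/2}^{1/2} |U(f)|^2\, df}$$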
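The concentration ratio can be evaluated numerically for any taper. This sketch compares a rectangular window with the first Slepian sequence as returned by SciPy's `scipy.signal.windows.dpss`; the helper `concentration`, the values T = 64 and W = 4/64, and the FFT grid used to approximate the integrals are assumptions for the example.

```python
import numpy as np
from scipy.signal.windows import dpss

def concentration(w, W, nfft=8192):
    """Fraction of the energy of |U(f)|^2 that falls inside [-W, W]."""
    U = np.fft.fft(w, nfft)
    f = np.fft.fftfreq(nfft)             # cycles per sample, in [-1/2, 1/2)
    energy = np.abs(U) ** 2
    return energy[np.abs(f) <= W].sum() / energy.sum()

T, W = 64, 4 / 64                         # samples and half-bandwidth
rect = np.ones(T)
slepian = dpss(T, T * W, Kmax=1)[0]       # first discrete prolate spheroidal sequence

lam_rect = concentration(rect, W)
lam_slepian = concentration(slepian, W)
```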

Discrete Prolate Spheroidal Sequences
Solved by Slepian, Landau and Pollak
Solutions are an orthogonal family of sequences that satisfy the following eigenvalue equation:
$$\sum_{t'=1}^{T} \frac{\sin 2\pi W (t - t')}{\pi (t - t')}\, w(t') = \lambda\, w(t)$$
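The eigenvalue equation can be solved directly as a symmetric matrix problem and compared with SciPy's `dpss`, which returns the same sequences together with their concentration ratios. The sizes T = 64 and W = 4/64 are illustrative choices; the diagonal of the kernel is taken as its limit 2W.

```python
import numpy as np
from scipy.signal.windows import dpss

T, W = 64, 4 / 64                         # samples and half-bandwidth (cycles/sample)
t = np.arange(T)
d = t[:, None] - t[None, :]

# Kernel sin(2 pi W d) / (pi d), written with np.sinc so the diagonal equals 2W.
A = 2 * W * np.sinc(2 * W * d)

eigvals, eigvecs = np.linalg.eigh(A)
order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

K = int(2 * T * W) - 1                    # 2WT - 1 well-concentrated sequences
tapers, ratios = dpss(T, T * W, Kmax=K, return_ratios=True)

# Each SciPy taper should match an eigenvector up to sign.
overlap = np.abs(np.sum(tapers * eigvecs[:, :K].T, axis=1))
```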

Slepian functions
Eigenvectors of the eigenvalue equation
Orthonormal on [-1/2, 1/2]
Orthogonal on [-W, W]
K = 2WT-1 eigenvalues are close to 1, the rest are close to 0.
  Correspond to 2WT-1 functions within [-W, W]
Power of the kth Slepian function within the bandwidth [-W, W]

Comparing Slepian functions
Systematic trade-off between narrowband and broadband bias

Advantages of Slepian tapers
Using multiple tapers recovers the edge of the time window
[Figure: $\sum_k |w_k(t)|^2$ and $\sum_k |U_k(f)|^2$ for 2WT = 6]

Multitaper spectral estimation
Each data taper provides an uncorrelated estimate. Average over them to get the spectral estimate.
Treat different trials as additional tapers and average over them as well.
$$S_X^{MT}(f) = \frac{1}{K} \sum_{k=1}^{K} |X_k(f)|^2$$
$$X_k(f) = \sum_{t=1}^{T} w_k(t)\, x(t)\, \exp(-2\pi i f t)$$
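A minimal sketch of the multitaper estimate for a single trial, using SciPy's `dpss` for the tapers. The function name `multitaper_spectrum`, its arguments and the white-noise test signal are assumptions for illustration, not the author's implementation; frequencies are in cycles per sample.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_spectrum(x, TW, K):
    """Average |X_k(f)|^2 over K Slepian tapers with time-bandwidth product TW."""
    x = np.asarray(x, dtype=float)
    T = x.shape[-1]
    tapers = dpss(T, TW, Kmax=K)              # shape (K, T), each with unit energy
    Xk = np.fft.rfft(tapers * x, axis=-1)     # tapered transforms X_k(f)
    return np.fft.rfftfreq(T), np.mean(np.abs(Xk) ** 2, axis=0)

# Usage: white noise with variance 4 has a flat spectrum at that level.
rng = np.random.default_rng(5)
x = 2.0 * rng.standard_normal(1024)
freqs, S = multitaper_spectrum(x, TW=5, K=9)
```

Averaging over trials as well would amount to stacking the tapered transforms from every trial before taking the mean.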

Cross-spectrum and coherency
Cross-spectrum:
$$S_{XY}^{MT}(f) = \frac{1}{K} \sum_{k=1}^{K} X_k(f)\, Y_k^*(f)$$
Coherency:
$$C_{XY}^{MT}(f) = \frac{S_{XY}^{MT}(f)}{\sqrt{S_X^{MT}(f)\, S_Y^{MT}(f)}}$$
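The multitaper coherency follows the same pattern, with the cross-spectrum averaged over tapers before normalizing. This is a sketch with an assumed function name and test signals; it uses the convention S_XY = X Y*.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_coherency(x, y, TW, K):
    """Complex coherency between x and y from K Slepian tapers."""
    tapers = dpss(len(x), TW, Kmax=K)
    Xk = np.fft.rfft(tapers * x, axis=-1)
    Yk = np.fft.rfft(tapers * y, axis=-1)
    S_xy = np.mean(Xk * np.conj(Yk), axis=0)
    S_x = np.mean(np.abs(Xk) ** 2, axis=0)
    S_y = np.mean(np.abs(Yk) ** 2, axis=0)
    return np.fft.rfftfreq(len(x)), S_xy / np.sqrt(S_x * S_y)

rng = np.random.default_rng(6)
x = rng.standard_normal(1024)
noise = rng.standard_normal(1024)

_, c_same = multitaper_coherency(x, x, TW=5, K=9)        # identical signals
_, c_indep = multitaper_coherency(x, noise, TW=5, K=9)   # independent signals
```

Identical signals give a magnitude of 1 at every frequency. Independent signals give a small but nonzero magnitude, because the estimate is biased upward when only a few tapers are averaged.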

Advantages of multiple tapers
Increasing the number of tapers reduces the variance of spectral estimators:
$$V\left[S_X^{MT}(f)\right] = \frac{1}{K} \left(E\left[S_X^{MT}(f)\right]\right)^2$$
Explicitly control the trade-off between narrowband bias, broadband bias and variance
  "Better microscope"
Local frequency basis for analyzing signals
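The 1/K variance scaling can be checked by simulation. The sketch evaluates the tapered transforms of many white-noise trials at a single frequency and compares the relative variance of the estimate for K = 1, 3 and 9 tapers; the trial count, the record length and the frequency `f0` are arbitrary choices for the example.

```python
import numpy as np
from scipy.signal.windows import dpss

rng = np.random.default_rng(7)
T, TW, K_max, n_trials = 256, 5, 9, 2000
tapers = dpss(T, TW, Kmax=K_max)                  # shape (K_max, T)
x = rng.standard_normal((n_trials, T))            # white noise, flat spectrum

f0 = 0.25                                         # cycles/sample, away from 0 and 1/2
phasor = np.exp(-2j * np.pi * f0 * np.arange(T))
Xk = (x * phasor) @ tapers.T                      # X_k(f0) per trial, (n_trials, K_max)
power = np.abs(Xk) ** 2

# Relative variance of the K-taper estimate; the theory predicts 1/K.
rel_var = {}
for K in (1, 3, 9):
    est = power[:, :K].mean(axis=1)
    rel_var[K] = est.var() / est.mean() ** 2
```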

Time-frequency resolution
Control resolution in the time-frequency plane using the parameters T and W of the Slepians
[Figure: time-frequency plane (axes: Time, Frequency) with a resolution cell of duration T and bandwidth 2W]

Example LFP spectrograms
[Figure: two spectrograms; axes: Time (s) vs. Frequency (Hz). Panel titles: Multitaper estimate, T = 0.5s, W = 10Hz; Multitaper estimate, T = 0.2s, W = 25Hz]

Summary
Time series present particular challenges for statistical analysis.
Spectral analysis is a valuable form of time series analysis.