Lecture series: Data analysis

Lectures: Each Tuesday at 16:00

(First lecture: May 21, last lecture: June 25)

Thomas Kreuz, ISC, CNR [email protected]

http://www.fi.isc.cnr.it/users/thomas.kreuz/



• Lecture 1: Example (Epilepsy & spike train synchrony), Data acquisition, Dynamical systems

• Lecture 2: Linear measures, Introduction to non-linear dynamics

• Lecture 3: Non-linear measures

• Lecture 4: Measures of continuous synchronization

• Lecture 5: Measures of discrete synchronization (spike trains)

• Lecture 6: Measure comparison & Application to epileptic seizure prediction

Schedule

• Introduction to data / time series analysis
• Univariate: Measures for individual time series
  - Linear time series analysis: Autocorrelation, Fourier spectrum
  - Non-linear time series analysis: Entropy, Dimension, Lyapunov exponent
• Bivariate: Measures for two time series
  - Measures of synchronization for continuous data (e.g., EEG): cross correlation, coherence, mutual information, phase synchronization, non-linear interdependence
  - Measures of directionality: Granger causality, transfer entropy
  - Measures of synchronization for discrete data (e.g., spike trains): Victor-Purpura distance, van Rossum distance, event synchronization, ISI-distance, SPIKE-distance
• Applications to electrophysiological signals (in particular single-unit data and EEG from epilepsy patients)
  Epilepsy – “window to the brain”

Overview of lecture series

• Example: Epileptic seizure prediction

• Data acquisition

• Introduction to dynamical systems

First lecture

Non-linear model systems

Linear measures

Introduction to non-linear dynamics

Non-linear measures

- Introduction to phase space reconstruction

- Lyapunov exponent

Second lecture

Non-linear measures

- Dimension

[ Excursion: Fractals ]

- Entropies

- Relationships among non-linear measures

Third lecture

Motivation

Measures of synchronization for continuous data

• Linear measures: Cross correlation, coherence

• Mutual information

• Phase synchronization (Hilbert transform)

• Non-linear interdependences

Measure comparison on model systems

Measures of directionality

• Granger causality

• Transfer entropy

Fourth lecture

Motivation and examples

Measures of synchronization for discrete data (here: spike trains, but in principle can be any other kind of discrete data)

• Victor-Purpura distance

• Van Rossum distance

• Schreiber correlation measure

• ISI-distance

• SPIKE-distance (& Applications)

Fifth lecture

Spikes / Spike trains

Spike: Action potential (event in which the membrane potential of a neuron rapidly rises and falls).

Spike train: Temporal sequence of spikes.

Basic assumptions:

All-or-none law: “There is no such thing as half a spike.” Either full response or no response at all (depending on whether the firing threshold is crossed or not).

Spikes are stereotypical. Shape does not carry information.

Background activity carries minimal information. Only spike times matter.

Motivation: Spike train (dis)similarity

Three different scenarios:

1. Simultaneous recording of population

Neuronal correlations, pathology (e.g. epilepsy)

2. Repeated presentation of just one stimulus

Reliability

3. Repeated presentation of different stimuli

Stimulus discrimination, neural coding

• Monkey retina (functioning in vitro for ~ 15h)

• Multi-Electrode Array (MEA) recordings (512 electrodes)

• Complete populations of retinal ganglion cells (~ 100 RGCs)

1. Simultaneous recording: Example

[Figure: raster plot of spike times; x-axis: Time [s] (0 to 2), y-axis: # Trial (0 to 60)]

One neuron, 60 repetitions: High reliability

2. Repeated stimulus presentation: Example

3. Different stimuli: Neural coding

Neural coding: Relationship between the stimulus and the individual or ensemble neuronal responses

Neural encoding: Map from stimulus to response Aim: Response prediction

Neural decoding: Map from response to stimulus Aim: Stimulus reconstruction

[Diagram: Encoding maps Stimulus to Response; Decoding maps Response to Stimulus]

Neural coding schemes

Labelled line coding: Individual neurons code on their own. Identity of the neuron that fires a spike matters.

Population coding: Joint activities of a number of neurons. Identity of the neuron is irrelevant. All that is important is that the spike is fired as part of the population response, not which neuron fired it.
Advantages: Individual neurons are noisy, summed population is robust. Multi-coding possible. Faster.

See also: Sparseness vs. distributed representation in memory and recognition

Extreme sparseness: Grandmother cell / Jennifer Aniston neuron (concept cell)

Jennifer Aniston neuron

[Quian Quiroga et al. Nature (2005)]

Sensory-motor system: Cortical homunculus

[Wilder Penfield: Epilepsy and the Functional Anatomy of the Human Brain. 1954]

Primary somatosensory cortex Primary motor cortex

Neural coding schemes

Rate coding: Most (if not all) information about the stimulus is contained in the firing rate of the neuron

Edgar Adrian 1929 (Nobel Prize 1932): Firing rate of stretch receptor neurons in the muscles is related to the force applied to the muscle.

Temporal coding: Precise spike timing carries information

Many studies: Temporal resolution on millisecond time scale
No absolute time reference in the nervous system: relative timing to stimulus onset / other spikes, but also with respect to ongoing brain oscillation

(special cases: Latency code, Pattern code, Coincidence code)

Measures of spike train (dis)similarity

- Victor-Purpura distance (Victor & Purpura, 1996)
- van Rossum distance (van Rossum, 2001)
- Event synchronization (Quian Quiroga et al., 2002)
- Schreiber correlation measure (Schreiber et al., 2003)
- Hunter-Milton similarity (Hunter & Milton, 2003)
- ISI-distance (ISI = Inter-spike interval) (Kreuz et al., 2007)
- SPIKE-distance (Kreuz et al., 2013)

Overview and comparison:
Kreuz T, Haas J, Morelli A, Abarbanel HDI, Politi A: Measuring spike train synchrony. J Neurosci Methods 165, 151 (2007)
Kreuz T, Chicharro D, Houghton C, Andrzejak RG, Mormann F: Monitoring spike train synchrony. J Neurophysiol 109, 1457 (2013)

Victor-Purpura: Sequence of elementary steps

[Figure: panels Input, Convolution, Diff², Output; x-axis: Time [sec]]

Van Rossum: DR(τR=0.1)=1.61
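The panel sequence Input, Convolution, Diff² can be sketched as follows. This is a minimal illustration, not the lecture's own code. It assumes a causal exponential kernel exp(-t/τR) and the normalization D² = (1/τR) ∫ (f_a − f_b)² dt; for exponential kernels this integral has a closed form, so no time grid is needed. Other normalizations exist, so the value DR = 1.61 quoted on the slide is not necessarily reproduced. The function name `van_rossum_distance` is hypothetical.

```python
import math


def van_rossum_distance(train_a, train_b, tau):
    """Van Rossum-type distance between two spike trains (lists of spike times).

    Assumed convention: each spike is convolved with the causal kernel
    exp(-t / tau), and D^2 = (1 / tau) * integral (f_a(t) - f_b(t))^2 dt.
    For this kernel the integral of two kernels centred at x and y equals
    (tau / 2) * exp(-|x - y| / tau), which gives the sums below.
    """
    def kernel_sum(xs, ys):
        return sum(math.exp(-abs(x - y) / tau) for x in xs for y in ys)

    d_squared = 0.5 * (kernel_sum(train_a, train_a)
                       + kernel_sum(train_b, train_b)
                       - 2.0 * kernel_sum(train_a, train_b))
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(d_squared, 0.0))
```

With a small τR only near-coincident spikes cancel each other; with a large τR the result is dominated by the difference in spike counts.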

[Figure: panels Input, ISIs, Ratio (range -1 to 1), Output; x-axis: Time [s]]

ISI-distance: DI=0.06
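The steps ISIs, Ratio, Output can be sketched in code. The sketch assumes the definition from the cited Kreuz et al. (2007) paper as recalled here: with x1(t) and x2(t) the lengths of the inter-spike intervals containing t, the ratio is I(t) = x1/x2 − 1 if x1 ≤ x2 and −(x2/x1 − 1) otherwise, and DI is the time average of |I(t)|. The function names and the midpoint sampling grid are illustrative choices.

```python
from bisect import bisect_right


def current_isi(spikes, t):
    """Length of the inter-spike interval of the sorted list `spikes` containing t."""
    k = bisect_right(spikes, t)
    if k == 0 or k == len(spikes):
        raise ValueError("t must lie between the first and last spike")
    return spikes[k] - spikes[k - 1]


def isi_ratio(spikes1, spikes2, t):
    """ISI ratio I(t) in (-1, 1); negative when train 1 has the shorter interval."""
    x1 = current_isi(spikes1, t)
    x2 = current_isi(spikes2, t)
    if x1 <= x2:
        return x1 / x2 - 1.0
    return -(x2 / x1 - 1.0)


def isi_distance(spikes1, spikes2, t_start, t_end, n_samples=1000):
    """Time average of |I(t)|, approximated on a grid of interval midpoints."""
    dt = (t_end - t_start) / n_samples
    times = [t_start + (i + 0.5) * dt for i in range(n_samples)]
    return sum(abs(isi_ratio(spikes1, spikes2, t)) for t in times) / n_samples
```

For a train firing at twice the rate of the other, for example spikes at 0, 1, 2, 3, 4 versus 0, 2, 4, the ratio is constant at -0.5 and the distance is 0.5. No time-scale parameter has to be chosen.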

[Figure: spike trains with dissimilarity profiles Ia and Sa; x-axis: Time [ms]]

Motivation: SPIKE-distance

ISI-Distance

SPIKE-Distance

[Figure: two spike trains (1 and 2); x-axis: Time [arbitrary unit]. Marked for each train n = 1, 2 at time t: previous spike time t_P^(n)(t), following spike time t_F^(n)(t), inter-spike interval x_ISI^(n)(t), and the intervals x_P^(n)(t) and x_F^(n)(t) to the previous and following spike]

SPIKE-distance

Visualization: Dissimilarity profile

[Figure: two spike trains and dissimilarity profile S; x-axis: Time [ms]]

[Figure: 50 spike trains and profile Sra (range 0 to 0.5); x-axis: Time [arbitrary units]]

Causal (real-time) SPIKE-distance

Instantaneous clustering

[Figure: raster plot of 40 spike trains (x-axis: Time [ms]); pairwise dissimilarity matrices S and Sr (both axes: Spike trains; colour scale 0 to 1)]

Selected averaging

[Figure: raster plot of 40 spike trains (x-axis: Time [ms]); pairwise dissimilarity matrices S and Sr (both axes: Spike trains; colour scale 0 to 1)]

Population averages

[Figure: raster plot of 40 spike trains in four groups G1 to G4 (x-axis: Time [ms]); pairwise dissimilarity matrices S and Sr (both axes: Spike trains); group-averaged matrices <S>G and <Sr>G (both axes: G1 to G4; colour scale 0 to 1); spike train groups ordered 2, 3, 1, 4]

Internally triggered averaging

[Figure: raster plot of 20 spike trains (x-axis: Time [ms], 0 to 100); pairwise dissimilarity matrix S (both axes: Spike trains; colour scale 0 to 1); spike trains ordered 1 16 11 19 8 4 12 9 20 6 5 18 14 2 10 15 17 3 13 7]

Application to continuous data

[Figure: panels Data (10 time series), Sa, Sra; x-axis: Time [s]]

Representations

Dissimilarity matrix of size N^2 * #(t):

• Full representation (as seen in movie)

• Instantaneous dissimilarity (one frame of movie)

• Temporal averaging (selective, triggered)

• Spatial averaging - Synchronization among spike train groups (or full population: measure profile)

• Temporal and spatial averaging: Overall synchrony
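The representations listed above are different reductions of the same array of pairwise dissimilarities. The sketch below assumes the array is stored as a list of N x N frames, one per equally spaced time sample; this storage layout and the function names are illustrative, not taken from the lecture's Matlab code.

```python
def temporal_average(frames):
    """Average the N x N dissimilarity matrix over all time samples."""
    n_t, n = len(frames), len(frames[0])
    return [[sum(f[i][j] for f in frames) / n_t for j in range(n)]
            for i in range(n)]


def spatial_average(frame):
    """Average one instantaneous matrix over all pairs i < j."""
    n = len(frame)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(frame[i][j] for i, j in pairs) / len(pairs)


def measure_profile(frames):
    """Spatial average at every time sample: one value per frame."""
    return [spatial_average(f) for f in frames]


def overall_synchrony(frames):
    """Temporal and spatial averaging combined: a single number."""
    profile = measure_profile(frames)
    return sum(profile) / len(profile)
```

Selective or triggered temporal averaging would pass only the chosen subset of frames to `temporal_average`; averaging over spike train groups would restrict the pairs in `spatial_average` accordingly.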

Advantages

• Perfect time resolution, no binning, no parameter

• Not invariant to shuffling of spikes among spike trains (in contrast to peri-stimulus time histogram, PSTH)

• Time-scale independence

• Computational efficiency

• Online monitoring (Real-time SPIKE-distance)
  Applications:
  - Epilepsy
  - Brain-machine interfacing

• Application to continuous data (e.g. EEG)

• Papers and Matlab source codes:

http://www.fi.isc.cnr.it/users/thomas.kreuz/sourcecode.html

Comparison of spike train distances

• Capability to reproduce known clustering

Comparison of continuous measures of synchronization

• Application to epileptic seizure prediction

• Predictive performance

Statistical validation

• Secondary time series analysis / Analysis of measure profiles

• The method of measure profile surrogates

Today’s lecture

Measure comparison

- Associate neuronal network („Black box“)

- Time series from 29 neurons (each 32768 points)
- Two synaptically coupled clusters of 13 neurons (1 and 2); the remaining 3 neurons are coupled to all others (shared, S)

Validation: Hindmarsh-Rose simulations

[Figure: ISIs of neurons 1b and 1c and profile I(t) (range -1 to 1); x-axis: Data points (×10^4)]

HR spike trains from cluster 1: DI=0.019

[Figure: ISIs of neurons 1a and 2a and profile I(t) (range -1 to 1); x-axis: Data points (×10^4)]

HR spike trains from clusters 1 and 2: DI=0.032

• Minimum cost DV of transforming one train into the other

• Only three possible transformations:
  - Adding a spike (cost 1)
  - Deleting a spike (cost 1)
  - Shifting a spike (parameter: cost cV per unit of time shifted)

• Low cV: DV ~ Difference in spike count (rate code distance)

• High cV: DV ~ # non-aligned spikes (coincidence distance)

• $D_V \in [0, \infty)$

Reminder: Victor & Purpura distance DV

[Victor & Purpura, J Neurophysiol 76, 1310 (1996)]
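The minimum over all sequences of the three elementary steps can be found by dynamic programming, in the same way as an edit distance between strings. The sketch below assumes sorted spike times and a shift cost of cV times the time shifted; the function name is hypothetical.

```python
def victor_purpura_distance(train_a, train_b, cost):
    """Minimum cost of transforming train_a into train_b.

    Allowed steps: add a spike (cost 1), delete a spike (cost 1),
    shift a spike by dt (cost `cost` * |dt|). Spike times must be sorted.
    """
    n, m = len(train_a), len(train_b)
    # d[i][j]: minimum cost of turning the first i spikes of train_a
    # into the first j spikes of train_b.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # delete all i spikes
    for j in range(1, m + 1):
        d[0][j] = float(j)  # add all j spikes
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            shift = cost * abs(train_a[i - 1] - train_b[j - 1])
            d[i][j] = min(d[i - 1][j] + 1.0,      # delete spike i of train_a
                          d[i][j - 1] + 1.0,      # add spike j of train_b
                          d[i - 1][j - 1] + shift)  # shift spike i onto spike j
    return d[n][m]
```

Shifting a spike is never worth more than 2, the price of deleting it and adding a new one. This is why the distance counts non-aligned spikes for high cost and reduces to the difference in spike count for cost 0.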

[Figure: DV versus cV (log-log axes) for the neuron pair types 1-1, 2-2, S-S, 1-S, 2-S, 1-2, with DI (range 0 to 0.5) for comparison]

Time scale dependence: DV

[Figure: 29 × 29 matrix of pairwise DI values; both axes: Neuron (1a to 1m, Sa to Sc, 2a to 2m); colour scale 0 to 0.45]

Distance matrix (pairwise similarities): DI

[Figure: dendrogram over the 29 neurons with branches labelled CS, C2, C1; x-axis: Neuron (leaf order Sa Sb Sc 2a 2e 2m 2b 2j 2i 2k 2d 2h 2l 2f 2g 2c 1a 1c 1d 1i 1j 1b 1h 1m 1f 1l 1g 1k 1e), y-axis: Distance (0 to 0.3)]

Hierarchical cluster tree (dendrogram): DI

Single linkage algorithm

• First, the closest pair of spike trains $S_i, S_j$ is identified and linked by a Π-shaped line, where the height of the connection measures the mutual distance $d(S_i, S_j)$.

• These two spike trains are merged into a single element $C$, and the next closest pair of elements is then identified and connected.

• The procedure is repeated iteratively until a single cluster remains.

• Distance between a pair of clusters: $d(C, C') = \min\{ d(S_i, S_j) \},\ S_i \in C,\ S_j \in C'$
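The single linkage procedure described above can be sketched directly on a matrix of pairwise distances. This is a minimal, unoptimized illustration with hypothetical names; it assumes a symmetric distance matrix given as a list of lists and returns the merges in the order they occur, together with the linking height of each.

```python
def single_linkage(dist):
    """Agglomerative single-linkage clustering.

    dist: symmetric matrix of pairwise distances (list of lists).
    Returns a list of merges (members_a, members_b, height), where height
    is the smallest distance between any member of a and any member of b.
    """
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Cluster distance: minimum over all cross-cluster pairs.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges
```

For four elements at positions 0, 1, 5 and 7 on a line, the first two merge at height 1, the last two at height 2, and the two resulting clusters at height 4, the gap between positions 1 and 5.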

• Confusion matrix $N_{ij}$: number of spike trains from cluster $i$ classified as belonging to cluster $j$

• Correct clustering: diagonal

• Quantification: Normalized confusion entropy

• For H=1: Cluster separation

Assessing cluster quality

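• $H \in [0, 1]$

• Cluster separation $F \in [0, 1]$, computed from the dendrogram lengths $L_1$, $L_2$, $L_S$, $L_M$ and $L_A$

[Figure: dendrogram with spike trains Si, Sj, Sk, clusters CS, C2, C1 and lengths LS, L2, L1, LM, LA; x-axis: Neuron, y-axis: Distance]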

Quantifying clustering performance

[Figure: DV versus cV (log-log axes) for the neuron pair types 1-1, 2-2, S-S, 1-S, 2-S, 1-2, with DI (range 0 to 0.5) for comparison]

Time scale dependence: DV

[Figure: clustering performance (H and F, range 0 to 1) versus cV (log scale, 10^-5 to 10^0)]

Clustering performance: Parameter dependence

Performance comparison: Clustering

[Figure: matrix of pairwise correlations among the measures DI, DV, DR, DS, DH, DQ; both axes: Measure; colour scale 0.91 to 1]

Correlation among spike train distances

[Figure: dendrogram of the measures (leaf order DV, DR, DQ, DS, DH, DI); x-axis: Measure, y-axis: Distance (0.01 to 0.04)]

Clustering of spike train distances

Epileptic seizure prediction

Lecture 6b

• Introduction to data / time series analysis
• Univariate: Measures for individual time series
  - Linear time series analysis: Autocorrelation, Fourier spectrum
  - Non-linear time series analysis: Entropy, Dimension, Lyapunov exponent
• Bivariate: Measures for two time series
  - Measures of synchronization for continuous data (e.g., EEG): cross correlation, coherence, mutual information, phase synchronization, non-linear interdependence
  - Measures of directionality: Granger causality, transfer entropy
  - Measures of synchronization for discrete data (e.g., spike trains): Victor-Purpura distance, van Rossum distance, event synchronization, ISI-distance, SPIKE-distance
• Applications to electrophysiological signals (in particular single-unit data and EEG from epilepsy patients)
  Epilepsy – “window to the brain”

Overview of lecture series

Thanks a lot for your patience and attention!
