Lecture series: Data analysis

Lectures: Each Tuesday at 16:00 (First lecture: May 21, last lecture: June 25). Thomas Kreuz, ISC, CNR, [email protected], http://www.fi.isc.cnr.it/users/thomas.kreuz/


Page 1: Lecture series: Data analysis

Lecture series: Data analysis

Lectures: Each Tuesday at 16:00

(First lecture: May 21, last lecture: June 25)

Thomas Kreuz, ISC, CNR [email protected]

http://www.fi.isc.cnr.it/users/thomas.kreuz/

Page 2: Lecture series: Data analysis

• Lecture 1: Example (Epilepsy & spike train synchrony), Data acquisition, Dynamical systems

• Lecture 2: Linear measures, Introduction to non-linear dynamics

• Lecture 3: Non-linear measures

• Lecture 4: Measures of continuous synchronization

• Lecture 5: Measures of discrete synchronization (spike trains)

• Lecture 6: Measure comparison & Application to epileptic seizure prediction

Schedule

Page 3: Lecture series: Data analysis

• Introduction to data / time series analysis

• Univariate: Measures for individual time series
 - Linear time series analysis: Autocorrelation, Fourier spectrum
 - Non-linear time series analysis: Entropy, Dimension, Lyapunov exponent

• Bivariate: Measures for two time series
 - Measures of synchronization for continuous data (e.g., EEG): cross correlation, coherence, mutual information, phase synchronization, non-linear interdependence
 - Measures of directionality: Granger causality, transfer entropy
 - Measures of synchronization for discrete data (e.g., spike trains): Victor-Purpura distance, van Rossum distance, event synchronization, ISI-distance, SPIKE-distance

• Applications to electrophysiological signals (in particular single-unit data and EEG from epilepsy patients)

Epilepsy – “window to the brain”

Overview of lecture series

Page 4: Lecture series: Data analysis

• Example: Epileptic seizure prediction

• Data acquisition

• Introduction to dynamical systems

First lecture

Page 5: Lecture series: Data analysis

Non-linear model systems

Linear measures

Introduction to non-linear dynamics

Non-linear measures

- Introduction to phase space reconstruction

- Lyapunov exponent

Second lecture

Page 6: Lecture series: Data analysis

Non-linear measures

- Dimension

[ Excursion: Fractals ]

- Entropies

- Relationships among non-linear measures

Third lecture

Page 7: Lecture series: Data analysis

Motivation

Measures of synchronization for continuous data

• Linear measures: Cross correlation, coherence

• Mutual information

• Phase synchronization (Hilbert transform)

• Non-linear interdependences

Measure comparison on model systems

Measures of directionality

• Granger causality

• Transfer entropy

Fourth lecture

Page 8: Lecture series: Data analysis

Motivation and examples

Measures of synchronization for discrete data (here: spike trains, but in principle can be any other kind of discrete data)

• Victor-Purpura distance

• Van Rossum distance

• Schreiber correlation measure

• ISI-distance

• SPIKE-distance (& Applications)

Fifth lecture

Page 9: Lecture series: Data analysis

Spikes / Spike trains

Spike: Action potential (event in which the membrane potential of a neuron rapidly rises and falls)

Spike train: Temporal sequence of spikes.

Basic assumptions:

All-or-none law: “There is no such thing as half a spike.” Either full response or no response at all (depending on whether firing threshold is crossed or not)

Spikes are stereotypical. Shape does not carry information.

Background activity carries minimal information. Only spike times matter.

Page 10: Lecture series: Data analysis

Motivation: Spike train (dis)similarity

Three different scenarios:

1. Simultaneous recording of population

Neuronal correlations, pathology (e.g. epilepsy)

2. Repeated presentation of just one stimulus

Reliability

3. Repeated presentation of different stimuli

Stimulus discrimination, neural coding

Page 11: Lecture series: Data analysis

• Monkey retina (functioning in vitro for ~ 15h)

• Multi-Electrode Array (MEA) recordings (512 electrodes)

• Complete populations of retinal ganglion cells (~ 100 RGCs)

1. Simultaneous recording: Example

Page 12: Lecture series: Data analysis

2. Repeated stimulus presentation: Example

[Figure: raster plot; x-axis: Time [s] (0 to 2), y-axis: # Trial (0 to 60)]

One neuron, 60 repetitions: High reliability

Page 13: Lecture series: Data analysis

3. Different stimuli: Neural coding

Neural coding: Relationship between the stimulus and the individual or ensemble neuronal responses

Neural encoding: Map from stimulus to response. Aim: Response prediction

Neural decoding: Map from response to stimulus. Aim: Stimulus reconstruction

[Diagram: Stimulus → Response (Encoding); Response → Stimulus (Decoding)]

Page 14: Lecture series: Data analysis

Neural coding schemes

Labelled line coding: Individual neurons code on their own. Identity of the neuron that fires a spike matters.

Population coding: Joint activities of a number of neurons. Identity of the neuron is irrelevant. All that is important is that the spike is fired as part of the population response, not which neuron fired it.

Advantages: Individual neurons are noisy, summed population is robust. Multi-coding possible. Faster.

See also: Sparseness vs. distributed representation in memory and recognition

Extreme sparseness: Grandmother cell, Jennifer Aniston neuron (concept cell)

Page 15: Lecture series: Data analysis

Jennifer Aniston neuron

[Quian Quiroga et al. Nature (2005)]

Page 16: Lecture series: Data analysis

Sensory-motor system: Cortical homunculus

[Wilder Penfield: Epilepsy and the Functional Anatomy of the Human Brain. 1954]

Primary somatosensory cortex Primary motor cortex

Page 17: Lecture series: Data analysis

Neural coding schemes

Rate coding: Most (if not all) information about the stimulus is contained in the firing rate of the neuron

Edgar Adrian 1929 (Nobel Prize 1932): Firing rate of stretch receptor neurons in the muscles is related to the force applied to the muscle.

Temporal coding: Precise spike timing carries information

Many studies: Temporal resolution on millisecond time scale. No absolute time reference in the nervous system: relative timing to stimulus onset / other spikes, but also with respect to ongoing brain oscillation

(special cases: Latency code, Pattern code, Coincidence code)

Page 18: Lecture series: Data analysis

Measures of spike train (dis)similarity

- Victor-Purpura distance (Victor & Purpura, 1996)
- van Rossum distance (van Rossum, 2001)
- Event synchronization (Quian Quiroga et al., 2002)
- Schreiber correlation measure (Schreiber et al., 2003)
- Hunter-Milton similarity (Hunter & Milton, 2003)
- ISI-distance (ISI = Inter-spike interval) (Kreuz et al., 2007)
- SPIKE-distance (Kreuz et al., 2013)

Overview and comparison:

Kreuz T, Haas J, Morelli A, Abarbanel HDI, Politi A: Measuring spike train synchrony. J Neurosci Methods 165, 151 (2007)

Kreuz T, Chicharro D, Houghton C, Andrzejak RG, Mormann F: Monitoring spike train synchrony. J Neurophysiol 109, 1457 (2013)

Page 19: Lecture series: Data analysis

Victor-Purpura: Sequence of elementary steps

Page 20: Lecture series: Data analysis

Van Rossum: DR(τR=0.1)=1.61

[Figure: panels Input, Convolution, Diff², Output; x-axis: Time [sec] (0 to 9)]
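The Convolution and Diff² steps named on this slide can be sketched in Python as follows. This is a minimal sketch, not the code used for the figure: the causal exponential kernel, the 1/τ normalization, the time grid, and the function name `van_rossum_distance` are assumptions made for illustration, and conventions differ between papers.

```python
import numpy as np

def van_rossum_distance(train_a, train_b, tau=0.1, t_end=10.0, dt=0.001):
    """Squared van Rossum-style distance between two spike trains (times in seconds)."""
    t = np.arange(0.0, t_end, dt)

    def filtered(train):
        # Convolution step: every spike at time s adds exp(-(t - s) / tau) for t >= s.
        out = np.zeros_like(t)
        for s in train:
            mask = t >= s
            out[mask] += np.exp(-(t[mask] - s) / tau)
        return out

    diff = filtered(train_a) - filtered(train_b)
    # Diff² step: integrate the squared difference over time.
    # Dividing by tau is an assumed normalization.
    return np.sum(diff ** 2) * dt / tau
```

With this normalization a single unmatched spike contributes about 0.5, so two trains with one spike each, far apart relative to τ, give a value close to 1, while identical trains give 0.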

Page 21: Lecture series: Data analysis

ISI-distance: DI=0.06

[Figure: panels Input, ISIs, Ratio (-1 to 1), Output; x-axis: Time [s] (0 to 9)]
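The ISIs and Ratio panels can be illustrated with a short Python sketch. It assumes the definition from Kreuz et al. (2007): at each time the current interspike intervals of the two trains are compared through a ratio normalized to lie between -1 and 1, and DI is the time average of its absolute value. Treating the start and end of the recording as auxiliary spikes, the sampling step `dt`, and the function name `isi_distance` are assumptions of this sketch.

```python
import numpy as np

def isi_distance(train_a, train_b, t_start, t_end, dt=0.001):
    """Time-averaged ISI-distance of two spike trains on [t_start, t_end]."""
    # Sample at the midpoints of a regular grid.
    t = np.arange(t_start, t_end, dt) + dt / 2

    def current_isi(train):
        # Assumption: the recording edges act as auxiliary spikes.
        edges = np.unique(np.concatenate(([t_start], train, [t_end])))
        idx = np.searchsorted(edges, t, side="right") - 1
        idx = np.clip(idx, 0, len(edges) - 2)
        return edges[idx + 1] - edges[idx]

    x_a, x_b = current_isi(train_a), current_isi(train_b)
    # Ratio I(t): 0 for equal ISIs, approaching -1 or +1 for very different ISIs.
    ratio = np.where(x_a <= x_b, x_a / x_b - 1.0, -(x_b / x_a - 1.0))
    return float(np.mean(np.abs(ratio)))
```

For example, a train firing once per second compared with a train firing every two seconds has an ISI ratio of 1/2 throughout, which gives a distance of 0.5.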

Page 22: Lecture series: Data analysis

Motivation: SPIKE-distance

[Figure: spike trains (1 to 10) with the profiles Ia and Sa, labelled ISI-Distance and SPIKE-Distance; x-axis: Time [ms] (0 to 800)]

Page 23: Lecture series: Data analysis

SPIKE-distance

[Figure: two spike trains (1, 2) around a time instant t; x-axis: Time [arbitrary unit] (0 to 11). Labelled quantities for each train n = 1, 2: t_P^(n)(t), t_F^(n)(t), x_ISI^(n)(t), x_P^(n)(t), x_F^(n)(t)]

Page 24: Lecture series: Data analysis

Visualization: Dissimilarity profile

[Figure: spike trains (1, 2) and dissimilarity profile S (0 to 0.4); x-axis: Time [ms] (0 to 1200)]

Page 25: Lecture series: Data analysis

Causal (real-time) SPIKE-distance

[Figure: spike trains (1 to 50) and profile Sra (0 to 0.5); x-axis: Time [arbitrary units] (0 to 4000)]

Page 26: Lecture series: Data analysis

Instantaneous clustering

[Figure: raster plot of spike trains (1 to 40), x-axis: Time [ms] (0 to 2000); below, rows of pairwise dissimilarity matrices S and Sr (Spike trains × Spike trains, 10 to 40) with color scale 0 to 1]

Page 27: Lecture series: Data analysis

Selected averaging

[Figure: raster plot of spike trains (1 to 40), x-axis: Time [ms] (0 to 4000); below, rows of pairwise dissimilarity matrices S and Sr (Spike trains × Spike trains, 10 to 40) with color scale 0 to 1]

Page 28: Lecture series: Data analysis

Population averages

[Figure: raster plot of spike trains (1 to 40) in four groups G1 to G4, x-axis: Time [ms] (0 to 4000); pairwise dissimilarity matrices S and Sr (Spike trains × Spike trains); group-averaged matrices <S>G and <Sr>G (G1 to G4 × G1 to G4) with color scale 0 to 1; plots over spike train groups in the order 2, 3, 1, 4]

Page 29: Lecture series: Data analysis

Internally triggered averaging

[Figure: raster plot of spike trains (1 to 20), x-axis: Time [ms] (0 to 100); pairwise dissimilarity matrix S (Spike trains × Spike trains) with color scale 0 to 1; plot over spike trains in the order 1 16 11 19 8 4 12 9 20 6 5 18 14 2 10 15 17 3 13 7]

Page 30: Lecture series: Data analysis

Application to continuous data

[Figure: panels Data, Sa, Sra; x-axis: Time [s] (0 to 80)]

Page 31: Lecture series: Data analysis

Representations

Dissimilarity matrix of size N^2 * #(t):

• Full representation (as seen in movie)

• Instantaneous dissimilarity (one frame of movie)

• Temporal averaging (selective, triggered)

• Spatial averaging: Synchronization among spike train groups (or over the full population: measure profile)

• Temporal and spatial averaging: Overall synchrony
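The representations listed above all amount to slicing or averaging one array of shape N × N × #(t). The Python sketch below illustrates this on random placeholder values. The array `S`, its contents, and the chosen time window are hypothetical and stand in for dissimilarities computed from real spike trains.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trains, n_times = 4, 100

# Hypothetical pairwise dissimilarities in [0, 1], shape (N, N, #t):
# symmetric in the two spike train indices, zero on the diagonal.
raw = rng.random((n_trains, n_trains, n_times))
S = (raw + raw.transpose(1, 0, 2)) / 2
S[np.arange(n_trains), np.arange(n_trains), :] = 0.0

# Instantaneous dissimilarity: one frame of the movie (N x N matrix).
instantaneous = S[:, :, 50]

# Selective temporal averaging over a chosen interval (N x N matrix).
temporal_avg = S[:, :, 20:60].mean(axis=2)

# Spatial averaging over all distinct pairs: one value per time step (measure profile).
pairs = np.triu_indices(n_trains, k=1)
profile = S[pairs[0], pairs[1], :].mean(axis=0)

# Temporal and spatial averaging: a single number for overall synchrony.
overall = profile.mean()
```

Averaging over a subset of rows and columns instead of all pairs would give the synchronization among spike train groups.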

Page 32: Lecture series: Data analysis

Advantages

• Perfect time resolution, no binning, no parameter

• Not invariant to shuffling of spikes among spike trains (in contrast to peri-stimulus time histogram, PSTH)

• Time-scale independence

• Computational efficiency

• Online monitoring (Real-time SPIKE-distance). Applications:
 - Epilepsy
 - Brain-machine interfacing

• Application to continuous data (e.g. EEG)

• Papers and Matlab source codes:

http://www.fi.isc.cnr.it/users/thomas.kreuz/sourcecode.html

Page 33: Lecture series: Data analysis

Comparison of spike train distances

• Capability to reproduce known clustering

Comparison of continuous measures of synchronization

• Application to epileptic seizure prediction

• Predictive performance

Statistical validation

• Secondary time series analysis / Analysis of measure profiles

• The method of measure profile surrogates

Today’s lecture

Page 34: Lecture series: Data analysis

Measure comparison

Page 35: Lecture series: Data analysis

Validation: Hindmarsh-Rose simulations

- Associate neuronal network (“Black box”)

- Time series from 29 neurons (each 32768 points)

- Two synaptically coupled clusters of 13 neurons (1 and 2), remaining 3 neurons are coupled to all others (shared, S)

Page 36: Lecture series: Data analysis

HR spike trains from cluster 1: DI=0.019

[Figure: spike trains 1b and 1c, ISI (0 to 1431), and the profile I(t) (-1 to 1); x-axis: Data points (0.5 to 3, x 10^4)]

Page 37: Lecture series: Data analysis

HR spike trains from clusters 1 and 2: DI=0.032

[Figure: spike trains 1a and 2a, ISI (0 to 1428), and the profile I(t) (-1 to 1); x-axis: Data points (0.5 to 3, x 10^4)]

Page 38: Lecture series: Data analysis

• Minimum cost DV of transforming one train into the other

• Only three possible transformations:
 - Adding a spike (cost 1)
 - Deleting a spike (cost 1)
 - Shifting a spike by Δt (cost cV·|Δt|; cV is the parameter)

• Low cV: DV ~ Difference in spike count (rate code distance)

• High cV: DV ~ # non-aligned spikes (coincidence distance)

• $D_V \in [0, \infty)$

Reminder: Victor & Purpura distance DV

[Victor & Purpura, J Neurophysiol 76, 1310 (1996)]
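The minimum cost over the three transformations can be computed with dynamic programming, in the same way as an edit distance between strings. The Python sketch below is a minimal illustration under that standard formulation. The function name `victor_purpura_distance` and the argument name `cost` (standing for cV) are chosen here and do not refer to the published Matlab code.

```python
def victor_purpura_distance(train_a, train_b, cost):
    """Minimum cost of transforming train_a into train_b (spike times in any common unit)."""
    a, b = sorted(train_a), sorted(train_b)
    n, m = len(a), len(b)
    # d[i][j]: minimum cost of transforming the first i spikes of a
    # into the first j spikes of b.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # delete i spikes
    for j in range(1, m + 1):
        d[0][j] = float(j)  # add j spikes
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # delete spike a[i-1]
                d[i][j - 1] + 1.0,  # add spike b[j-1]
                d[i - 1][j - 1] + cost * abs(a[i - 1] - b[j - 1]),  # shift
            )
    return d[n][m]
```

With `cost=0` shifts are free and the result is the difference in spike count. With a very high cost any shift is more expensive than deleting and re-adding a spike, so the result counts the non-aligned spikes.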

Page 39: Lecture series: Data analysis

Time scale dependence: DV

[Figure: DV (10^0 to 10^3) versus cV (10^-4 to 10^0) for the pair types 1-1, 2-2, S-S, 1-S, 2-S, 1-2; second axis: DI (0 to 0.5)]

Page 40: Lecture series: Data analysis

Distance matrix (pairwise similarities): DI

[Figure: Neuron × Neuron matrix for neurons 1a to 1m, Sa to Sc, 2a to 2m; color scale 0 to 0.45]

Page 41: Lecture series: Data analysis

Hierarchical cluster tree (dendrogram): DI

[Figure: dendrogram; y-axis: Distance (0 to 0.3); x-axis: Neuron, in the order Sa Sb Sc 2a 2e 2m 2b 2j 2i 2k 2d 2h 2l 2f 2g 2c 1a 1c 1d 1i 1j 1b 1h 1m 1f 1l 1g 1k 1e; clusters labelled CS, C2, C1]

Page 42: Lecture series: Data analysis

Hierarchical cluster tree (dendrogram): DI

Single linkage algorithm

• First, the closest pair of spike trains $S_i, S_j$ is identified and linked by a Π-shaped line, where the height of the connection measures the mutual distance $d(S_i, S_j)$.

• These two spike trains are merged into a single element $C$, and the next closest pair of elements is then identified and connected.

• The procedure is repeated iteratively until a single cluster remains.

• Distance between a pair of clusters: $d(C, C') = \min\{ d(S_i, S_j) \},\ S_i \in C,\ S_j \in C'$
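The single linkage procedure can be sketched in a few lines of Python. This is a naive illustration for small distance matrices, not an efficient implementation. The function name `single_linkage` and the example matrix are made up for this sketch.

```python
def single_linkage(dist):
    """Return the merge steps (members_a, members_b, height) for a symmetric distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                # Distance between two clusters: minimum over all member pairs.
                d = min(dist[i][j] for i in clusters[p] for j in clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        # The height of the connecting line in the dendrogram is d.
        merges.append((clusters[p], clusters[q], d))
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return merges

# Hypothetical distances between four spike trains (two tight pairs).
example = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.85],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.85, 0.2, 0.0],
]
merges = single_linkage(example)
```

In this example trains 0 and 1 are linked first, then trains 2 and 3, and the two pairs are joined last at the smallest distance between any of their members.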

Page 43: Lecture series: Data analysis

Assessing cluster quality

• Confusion matrix $N$: $N_{ij}$ = # spike trains from cluster $i$ classified as belonging to cluster $j$

• Correct clustering: $N$ is diagonal

• Quantification: Normalized confusion entropy $H \in [0,1]$

• For H=1: Cluster separation $F \in [0,1]$, defined from the heights $L_1$, $L_2$, $L_S$, $L_M$, $L_A$

Page 44: Lecture series: Data analysis

Quantifying clustering performance

[Figure: schematic dendrogram; y-axis: Distance; x-axis: Neuron (Si, Sj, Sk); clusters C1, C2, CS; labelled heights L1, L2, LS, LM, LA]

Page 45: Lecture series: Data analysis

Time scale dependence: DV

[Figure: DV (10^0 to 10^3) versus cV (10^-4 to 10^0) for the pair types 1-1, 2-2, S-S, 1-S, 2-S, 1-2; second axis: DI (0 to 0.5)]

Page 46: Lecture series: Data analysis

Clustering performance: Parameter dependence

[Figure: Clustering Performance (0 to 1), H and F, versus cV (10^-5 to 10^0)]

Page 47: Lecture series: Data analysis

Performance comparison: Clustering

Page 48: Lecture series: Data analysis

Correlation among spike train distances

[Figure: Measure × Measure matrix for DI, DV, DR, DS, DH, DQ; color scale 0.91 to 1]

Page 49: Lecture series: Data analysis

Clustering of spike train distances

[Figure: dendrogram; y-axis: Distance (0.01 to 0.04); x-axis: Measure, in the order DV DR DQ DS DH DI]

Page 50: Lecture series: Data analysis

Epileptic seizure prediction

Page 51: Lecture series: Data analysis

Lecture 6b

Page 52: Lecture series: Data analysis

• Introduction to data / time series analysis

• Univariate: Measures for individual time series
 - Linear time series analysis: Autocorrelation, Fourier spectrum
 - Non-linear time series analysis: Entropy, Dimension, Lyapunov exponent

• Bivariate: Measures for two time series
 - Measures of synchronization for continuous data (e.g., EEG): cross correlation, coherence, mutual information, phase synchronization, non-linear interdependence
 - Measures of directionality: Granger causality, transfer entropy
 - Measures of synchronization for discrete data (e.g., spike trains): Victor-Purpura distance, van Rossum distance, event synchronization, ISI-distance, SPIKE-distance

• Applications to electrophysiological signals (in particular single-unit data and EEG from epilepsy patients)

Epilepsy – “window to the brain”

Overview of lecture series

Page 53: Lecture series: Data analysis

Thanks a lot for your patience and attention!