Lecture series: Data analysis
Lectures: Each Tuesday at 16:00
(First lecture: May 21, last lecture: June 25)
Thomas Kreuz, ISC, CNR
http://www.fi.isc.cnr.it/users/thomas.kreuz/
• Lecture 1: Example (Epilepsy & spike train synchrony), Data acquisition, Dynamical systems
• Lecture 2: Linear measures, Introduction to non-linear dynamics
• Lecture 3: Non-linear measures
• Lecture 4: Measures of continuous synchronization
• Lecture 5: Measures of discrete synchronization (spike trains)
• Lecture 6: Measure comparison & Application to epileptic seizure prediction
Schedule
• Introduction to data / time series analysis
• Univariate: Measures for individual time series
- Linear time series analysis: Autocorrelation, Fourier spectrum
- Non-linear time series analysis: Entropy, Dimension, Lyapunov exponent
• Bivariate: Measures for two time series
- Measures of synchronization for continuous data (e.g., EEG): cross correlation, coherence, mutual information, phase synchronization, non-linear interdependence
- Measures of directionality: Granger causality, transfer entropy
- Measures of synchronization for discrete data (e.g., spike trains): Victor-Purpura distance, van Rossum distance, event synchronization, ISI-distance, SPIKE-distance
• Applications to electrophysiological signals (in particular single-unit data and EEG from epilepsy patients)
Epilepsy – “window to the brain”
Overview of lecture series
• Example: Epileptic seizure prediction
• Data acquisition
• Introduction to dynamical systems
First lecture
Non-linear model systems
Linear measures
Introduction to non-linear dynamics
Non-linear measures
- Introduction to phase space reconstruction
- Lyapunov exponent
Second lecture
Non-linear measures
- Dimension
[ Excursion: Fractals ]
- Entropies
- Relationships among non-linear measures
Third lecture
Motivation
Measures of synchronization for continuous data
• Linear measures: Cross correlation, coherence
• Mutual information
• Phase synchronization (Hilbert transform)
• Non-linear interdependences
Measure comparison on model systems
Measures of directionality
• Granger causality
• Transfer entropy
Fourth lecture
Motivation and examples
Measures of synchronization for discrete data (here: spike trains, but in principle can be any other kind of discrete data)
• Victor-Purpura distance
• Van Rossum distance
• Schreiber correlation measure
• ISI-distance
• SPIKE-distance (& Applications)
Fifth lecture
Spikes / Spike trains
Spike: Action potential (event in which the membrane potential of a neuron rapidly rises and falls).
Spike train: Temporal sequence of spikes.
Basic assumptions:
All-or-none law: “There is no such thing as half a spike.” Either full response or no response at all (depending on whether the firing threshold is crossed or not).
Spikes are stereotypical. Shape does not carry information.
Background activity carries minimal information. Only spike times matter.
Motivation: Spike train (dis)similarity
Three different scenarios:
1. Simultaneous recording of population
Neuronal correlations, pathology (e.g. epilepsy)
2. Repeated presentation of just one stimulus
Reliability
3. Repeated presentation of different stimuli
Stimulus discrimination, neural coding
• Monkey retina (functioning in vitro for ~ 15h)
• Multi-Electrode Array (MEA) recordings (512 electrodes)
• Complete populations of retinal ganglion cells (~ 100 RGCs)
1. Simultaneous recording: Example
[Figure: raster plot, trial number (0 to 60) vs. time [s] (0 to 2)]
One neuron, 60 repetitions: High reliability
2. Repeated stimulus presentation: Example
3. Different stimuli: Neural coding
Neural coding: Relationship between the stimulus and the individual or ensemble neuronal responses
Neural encoding: Map from stimulus to response. Aim: Response prediction
Neural decoding: Map from response to stimulus. Aim: Stimulus reconstruction
[Figure: Stimulus → Response (encoding), Response → Stimulus (decoding)]
Neural coding schemes
Labelled line coding: Individual neurons code on their own. Identity of the neuron that fires a spike matters.
Population coding: Joint activities of a number of neurons. Identity of the neuron is irrelevant. All that is important is that the spike is fired as part of the population response, not which neuron fired it.
Advantages: Individual neurons are noisy, summed population is robust. Multi-coding possible. Faster.
See also: Sparseness vs. distributed representation in memory and recognition
Extreme sparseness: Grandmother cell, Jennifer Aniston neuron (concept cell)
Jennifer Aniston neuron
[Quian Quiroga et al. Nature (2005)]
Sensory-motor system: Cortical homunculus
[Wilder Penfield: Epilepsy and the Functional Anatomy of the Human Brain. 1954]
Primary somatosensory cortex Primary motor cortex
Neural coding schemes
Rate coding: Most (if not all) information about the stimulus is contained in the firing rate of the neuron
Edgar Adrian 1929 (Nobel Prize 1932): Firing rate of stretch receptor neurons in the muscles is related to the force applied to the muscle.
Temporal coding: Precise spike timing carries information
Many studies: Temporal resolution on millisecond time scale. No absolute time reference in the nervous system: relative timing to stimulus onset / other spikes, but also with respect to ongoing brain oscillations
(special cases: Latency code, Pattern code, Coincidence code)
Measures of spike train (dis)similarity
- Victor-Purpura distance (Victor & Purpura, 1996)
- van Rossum distance (van Rossum, 2001)
- Event synchronization (Quian Quiroga et al., 2002)
- Schreiber correlation measure (Schreiber et al., 2003)
- Hunter-Milton similarity (Hunter & Milton, 2003)
- ISI-distance (ISI = Inter-spike interval) (Kreuz et al., 2007)
- SPIKE-distance (Kreuz et al., 2013)
Overview and comparison:
Kreuz T, Haas J, Morelli A, Abarbanel HDI, Politi A: Measuring spike train synchrony. J Neurosci Methods 165, 151 (2007)
Kreuz T, Chicharro D, Houghton C, Andrzejak RG, Mormann F: Monitoring spike train synchrony. J Neurophysiol 109, 1457 (2013)
Victor-Purpura: Sequence of elementary steps
[Figure: van Rossum distance: input spike trains, convolution, squared difference (Diff²) and output, vs. time [sec] (0 to 9)]
Van Rossum: DR(τR=0.1)=1.61
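The steps named in the figure (convolution, squared difference, integration) can be sketched as follows. This is a minimal discretized sketch, not the lecture's Matlab code: the causal exponential kernel, the normalization by the time constant, the grid step `dt` and the function names are assumptions made for illustration.

```python
import math

def filter_train(spikes, tau, dt, n_steps):
    """Convolve a spike train with a causal exponential kernel exp(-t/tau) on a time grid."""
    counts = [0] * n_steps
    for s in spikes:
        k = int(round(s / dt))          # grid bin that holds this spike
        if 0 <= k < n_steps:
            counts[k] += 1
    trace, value, decay = [], 0.0, math.exp(-dt / tau)
    for c in counts:
        value = value * decay + c       # decay the old trace, add new spikes
        trace.append(value)
    return trace

def van_rossum_distance(train1, train2, tau, t_end, dt=0.001):
    """Square root of the tau-normalized integral of the squared difference of the filtered trains."""
    n_steps = int(round(t_end / dt))
    f = filter_train(train1, tau, dt, n_steps)
    g = filter_train(train2, tau, dt, n_steps)
    squared = sum((a - b) ** 2 for a, b in zip(f, g)) * dt / tau
    return math.sqrt(squared)
```

With this normalization, a single unmatched spike contributes a squared distance of about 1/2, up to the discretization error of the grid.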
[Figure: ISI-distance: input spike trains, ISIs, ISI ratio (-1 to 1) and output, vs. time [s] (0 to 9)]
ISI-distance: DI=0.06
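The ISI-distance compares, at each time instant, the lengths of the interspike intervals that currently contain that instant in the two trains. The sketch below assumes the ratio definition of Kreuz et al. (2007): the ratio of the two intervals is mapped to the range (-1, 1) and its absolute value is averaged over time. The restriction to the window in which both trains have a defined interval, and all function names, are illustrative assumptions.

```python
import bisect

def isi_at(spikes, t):
    """Length of the interspike interval of a sorted spike train that contains time t."""
    k = bisect.bisect_right(spikes, t)      # index of the first spike after t
    return spikes[k] - spikes[k - 1]

def isi_ratio(x1, x2):
    """Signed ISI ratio: negative if train 1 has the shorter interval, positive otherwise."""
    return x1 / x2 - 1.0 if x1 <= x2 else -(x2 / x1 - 1.0)

def isi_distance(train1, train2):
    """Time average of |I(t)| over the window covered by both (sorted, >= 2 spikes) trains."""
    start = max(train1[0], train2[0])
    end = min(train1[-1], train2[-1])
    # I(t) is piecewise constant between consecutive spikes of the pooled train
    edges = sorted({t for t in list(train1) + list(train2) if start <= t <= end})
    total = 0.0
    for left, right in zip(edges, edges[1:]):
        mid = 0.5 * (left + right)
        ratio = isi_ratio(isi_at(train1, mid), isi_at(train2, mid))
        total += abs(ratio) * (right - left)
    return total / (end - start)
```

For example, a regular train with interval 1 compared with a regular train with interval 2 gives a ratio of -0.5 everywhere, so the distance is 0.5. No binning and no time-scale parameter are needed.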
[Figure: spike trains with the profiles Ia and Sa below, vs. time [ms] (0 to 800)]
Motivation: SPIKE-distance
ISI-Distance
SPIKE-Distance
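[Figure: two spike trains (1 and 2) vs. time [arbitrary unit] (0 to 11). For a time instant t and each train n = 1, 2, the figure marks the previous spike t_P^(n)(t) and the following spike t_F^(n)(t), the interspike interval x_ISI^(n)(t), and the intervals x_P^(n)(t) and x_F^(n)(t) from t to the previous and the following spike.]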
SPIKE-distance
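The quantities marked in the figure (previous and following spikes, interspike intervals, intervals to the previous and following spike) can be combined into an instantaneous dissimilarity S(t). The sketch below follows the definition of Kreuz et al. (2013) as far as it can be reconstructed here; the transcript shows only the symbols, so the combination rule, the handling of the edges and the function names should be read as assumptions.

```python
import bisect

def _neighbours(spikes, t):
    """Previous spike t_P (<= t) and following spike t_F (> t) of a sorted train."""
    k = bisect.bisect_right(spikes, t)
    if k == 0 or k == len(spikes):
        raise ValueError("t must lie between the first and the last spike")
    return spikes[k - 1], spikes[k]

def _nearest(time, spikes):
    """Distance from a given spike time to the nearest spike of the other train."""
    return min(abs(time - s) for s in spikes)

def spike_dissimilarity(t, train1, train2):
    """Instantaneous SPIKE-distance S(t) for two sorted spike trains."""
    local = []
    for own, other in ((train1, train2), (train2, train1)):
        t_p, t_f = _neighbours(own, t)
        x_isi = t_f - t_p                   # current interspike interval
        x_p, x_f = t - t_p, t_f - t         # intervals to previous / following spike
        dt_p = _nearest(t_p, other)         # how well the previous spike is matched
        dt_f = _nearest(t_f, other)         # how well the following spike is matched
        # weight each spike's mismatch by the proximity of t to that spike
        s_n = (dt_p * x_f + dt_f * x_p) / x_isi
        local.append((s_n, x_isi))
    (s1, isi1), (s2, isi2) = local
    mean_isi = 0.5 * (isi1 + isi2)
    # cross-weighting by the other train's ISI, normalized by the squared mean ISI
    return (s1 * isi2 + s2 * isi1) / (2.0 * mean_isi ** 2)
```

Averaging `spike_dissimilarity` over a set of time instants gives one overall value. Evaluating it on a fine grid gives a dissimilarity profile.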
Visualization: Dissimilarity profile
[Figure: two spike trains and the dissimilarity profile S (0 to 0.4), vs. time [ms] (0 to 1200)]
[Figure: 50 spike trains and the profile Sra (0 to 0.5), vs. time [arbitrary units] (0 to 4000)]
Causal (real-time) SPIKE-distance
Instantaneous clustering
[Figure: raster plot of 40 spike trains vs. time [ms] (0 to 2000); 40 × 40 matrices of pairwise dissimilarities (spike trains vs. spike trains) for S (four panels) and Sr (three panels), color scale 0 to 1]
Selected averaging
[Figure: raster plot of 40 spike trains vs. time [ms] (0 to 4000); 40 × 40 matrices of pairwise dissimilarities (spike trains vs. spike trains) for S (four panels) and Sr (three panels), color scale 0 to 1]
Population averages
[Figure: raster plot of 40 spike trains in four groups G1 to G4 vs. time [ms] (0 to 4000); pairwise dissimilarity matrices (spike trains vs. spike trains) for S and Sr; group-averaged matrices < S >G and < Sr >G (G1 to G4), color scale 0 to 1; dendrograms of the spike train groups (leaf order 2 3 1 4)]
Internally triggered averaging
[Figure: raster plot of 20 spike trains vs. time [ms] (0 to 100); 20 × 20 matrix of pairwise dissimilarities S, color scale 0 to 1; dendrogram of the spike trains (leaf order 1 16 11 19 8 4 12 9 20 6 5 18 14 2 10 15 17 3 13 7)]
Application to continuous data
[Figure: continuous data (10 traces) with the profiles Sa (0 to 0.4) and Sra (0 to 0.2) below, vs. time [s] (0 to 80)]
Representations
Dissimilarity matrix of size N^2 * #(t):
• Full representation (as seen in movie)
• Instantaneous dissimilarity (one frame of movie)
• Temporal averaging (selective, triggered)
• Spatial averaging: Synchronization among spike train groups (or full population: measure profile)
• Temporal and spatial averaging: Overall synchrony
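The representations listed above are different reductions of one array of pairwise dissimilarities. A minimal sketch, assuming the profile is stored as a nested list `profile[k][i][j]` (time frame k, spike trains i and j) on an equally spaced time grid; this storage layout and the function names are hypothetical.

```python
def instantaneous(profile, k):
    """One frame of the movie: the N x N dissimilarity matrix at time index k."""
    return profile[k]

def temporal_average(profile, frames):
    """N x N matrix averaged over a selected set of time indices (equal weights)."""
    n = len(profile[0])
    return [[sum(profile[k][i][j] for k in frames) / len(frames)
             for j in range(n)] for i in range(n)]

def spatial_average(matrix):
    """Mean over all pairs i < j of one N x N matrix: one value of the measure profile."""
    n = len(matrix)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(matrix[i][j] for i, j in pairs) / len(pairs)

def overall_synchrony(profile):
    """Temporal and spatial averaging combined: a single number."""
    return spatial_average(temporal_average(profile, range(len(profile))))

# Toy profile: 3 spike trains, 2 time frames
profile = [
    [[0.0, 0.2, 0.4], [0.2, 0.0, 0.6], [0.4, 0.6, 0.0]],
    [[0.0, 0.4, 0.8], [0.4, 0.0, 0.6], [0.8, 0.6, 0.0]],
]
measure_profile = [spatial_average(frame) for frame in profile]
```

Selective or triggered averaging corresponds to choosing which time indices are passed as `frames`. Group-wise spatial averaging would restrict the pairs to members of given groups.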
Advantages
• Perfect time resolution, no binning, no parameter
• Not invariant to shuffling of spikes among spike trains (in contrast to peri-stimulus time histogram, PSTH)
• Time-scale independence
• Computational efficiency
• Online monitoring (Real-time SPIKE-distance). Applications: Epilepsy, Brain-machine interfacing
• Application to continuous data (e.g. EEG)
• Papers and Matlab source codes:
http://www.fi.isc.cnr.it/users/thomas.kreuz/sourcecode.html
Comparison of spike train distances
• Capability to reproduce known clustering
Comparison of continuous measures of synchronization
• Application to epileptic seizure prediction
• Predictive performance
Statistical validation
• Secondary time series analysis / Analysis of measure profiles
• The method of measure profile surrogates
Today’s lecture
Measure comparison
- Associate neuronal network („Black box“)
- Time series from 29 neurons (each 32768 points)
- Two synaptically coupled clusters of 13 neurons (1 and 2); the remaining 3 neurons are coupled to all others (shared, S)
Validation: Hindmarsh-Rose simulations
[Figure: spike trains of neurons 1c and 1b, their ISIs (0 to 1431) and the ISI ratio I(t) (-1 to 1), vs. data points (0 to 3 × 10^4)]
HR spike trains from cluster 1: DI=0.019
[Figure: spike trains of neurons 2a and 1a, their ISIs (0 to 1428) and the ISI ratio I(t) (-1 to 1), vs. data points (0 to 3 × 10^4)]
HR spike trains from clusters 1 and 2: DI=0.032
• Minimum cost DV of transforming one train into the other
• Only three possible transformations:
- Adding a spike (cost 1)
- Deleting a spike (cost 1)
- Shifting a spike (Parameter: Cost cV)
• Low cV: DV ~ Difference in spike count (rate code distance)
• High cV: DV ~ # non-aligned spikes (coincidence distance)
• $D_V \in [0, \infty]$
Reminder: Victor & Purpura distance DV
[Victor & Purpura, J Neurophysiol 76, 1310 (1996)]
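The minimum-cost transformation can be computed with a dynamic-programming table, much like an edit distance between strings. A minimal sketch, assuming the standard formulation in which shifting a spike by Δt costs cV · |Δt|; the function and variable names are illustrative.

```python
def victor_purpura_distance(train1, train2, cost):
    """Minimum cost of transforming sorted train1 into sorted train2.

    Adding or deleting a spike costs 1; shifting a spike by dt costs cost * |dt|.
    """
    n, m = len(train1), len(train2)
    # table[i][j]: minimal cost of matching the first i spikes of train1
    # to the first j spikes of train2
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = float(i)              # delete i spikes
    for j in range(1, m + 1):
        table[0][j] = float(j)              # add j spikes
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            shift = cost * abs(train1[i - 1] - train2[j - 1])
            table[i][j] = min(table[i - 1][j] + 1.0,       # delete a spike
                              table[i][j - 1] + 1.0,       # add a spike
                              table[i - 1][j - 1] + shift) # shift a spike
    return table[n][m]
```

The two limits from the slide can be checked directly: with `cost=0` shifts are free and only the difference in spike count remains, while for a large cost it is cheaper to delete and re-insert every spike that is not exactly aligned.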
[Figure: DV (log scale, 10^0 to 10^3) as a function of cV (log scale, 10^-4 to 10^0) for the pair types 1-1, 2-2, S-S, 1-S, 2-S and 1-2; DI (0 to 0.5) shown for comparison]
Time scale dependence: DV
[Figure: 29 × 29 matrix of pairwise DI values, neuron vs. neuron (1a to 1m, Sa to Sc, 2a to 2m), color scale 0 to 0.45]
Distance matrix (pairwise similarities): DI
[Figure: dendrogram, distance (0 to 0.3) vs. neuron, leaf order Sa Sb Sc 2a 2e 2m 2b 2j 2i 2k 2d 2h 2l 2f 2g 2c 1a 1c 1d 1i 1j 1b 1h 1m 1f 1l 1g 1k 1e; clusters CS, C2 and C1]
Hierarchical cluster tree (dendrogram): DI
Single linkage algorithm
• First, the closest pair of spike trains $S_i, S_j$ is identified and linked by a П-shaped line, where the height of the connection measures the mutual distance $d(S_i, S_j)$.
• These two spike trains are merged into a single element $C$, and the next closest pair of elements is then identified and connected.
• The procedure is repeated iteratively until a single cluster remains.
• Distance between a pair of clusters: $d(C, C') = \min\{ d(S_i, S_j) : S_i \in C, S_j \in C' \}$
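The iterative procedure can be sketched in a few lines. This is a brute-force illustration for small distance matrices, not an optimized implementation; the function name and the return format (a list of merges with their linkage heights) are illustrative choices.

```python
def single_linkage(dist):
    """Agglomerative single-linkage clustering on a symmetric distance matrix.

    Returns the merges in order as (members_a, members_b, height), where height
    is the smallest pairwise distance between the two merged clusters.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: cluster distance = minimum over all member pairs
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

# Toy example: four points on a line stand in for four spike trains
points = [0.0, 1.0, 5.0, 7.0]
dist = [[abs(p - q) for q in points] for p in points]
merges = single_linkage(dist)
```

The heights in `merges` are the heights of the П-shaped connections in the dendrogram. In practice the matrix `dist` would hold one of the spike train distances, for example DI.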
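• Confusion matrix $N_{ij}$: # spike trains from cluster $i$ classified as belonging to cluster $j$
• Correct clustering: diagonal
• Quantification: Normalized confusion entropy $H \in [0, 1]$
• For H=1: Cluster separation $F \in [0, 1]$ [Equation: F in terms of the dendrogram heights $L_1$, $L_2$, $L_S$, $L_M$ and $L_A$]
Assessing cluster quality
[Figure: dendrogram (distance vs. neuron) with spike trains Si, Sj, Sk, clusters CS, C2 and C1, and the heights LS, L2, L1, LM and LA]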
Quantifying clustering performance
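A confusion matrix and an entropy-based score of the kind named above can be sketched as follows. The exact definition and normalization of H used in the lecture are not given in the transcript. The version here, the information shared between true and assigned cluster labels divided by the entropy of the true labels, is an assumption chosen so that a purely diagonal matrix gives H = 1; all function names are hypothetical.

```python
import math

def confusion_matrix(true_labels, assigned_labels, n_clusters):
    """counts[i][j]: number of spike trains from cluster i assigned to cluster j."""
    counts = [[0] * n_clusters for _ in range(n_clusters)]
    for t, a in zip(true_labels, assigned_labels):
        counts[t][a] += 1
    return counts

def normalized_confusion_entropy(counts):
    """Assumed normalization: shared information / entropy of the true labels."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    info = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:
                info += n / total * math.log2(n * total / (row_sums[i] * col_sums[j]))
    max_info = -sum(r / total * math.log2(r / total) for r in row_sums if r > 0)
    return info / max_info if max_info > 0 else 0.0

# Cluster sizes as in the validation data: 13 + 13 + 3 neurons
true_labels = [0] * 13 + [1] * 13 + [2] * 3
perfect = confusion_matrix(true_labels, true_labels, 3)
collapsed = confusion_matrix(true_labels, [0] * 29, 3)
```

With these definitions a correct clustering gives a diagonal matrix and H = 1, while assigning every spike train to the same cluster gives H = 0.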
[Figure: DV (log scale, 10^0 to 10^3) as a function of cV (log scale, 10^-4 to 10^0) for the pair types 1-1, 2-2, S-S, 1-S, 2-S and 1-2; DI (0 to 0.5) shown for comparison]
Time scale dependence: DV
[Figure: clustering performance (H and F, 0 to 1) as a function of cV (log scale, 10^-5 to 10^0)]
Clustering performance: Parameter dependence
Performance comparison: Clustering
[Figure: matrix of pairwise correlations among the measures DI, DV, DR, DS, DH and DQ (measure vs. measure), color scale 0.91 to 1]
Correlation among spike train distances
[Figure: dendrogram of the measures (leaf order DV, DR, DQ, DS, DH, DI), distance 0.01 to 0.04]
Clustering of spike train distances
Epileptic seizure prediction
Lecture 6b
Thanks a lot for your patience and attention!