STATISTICS FOR HIGH DIMENSIONAL BIOLOGICAL RECORDINGS

Dr Cyril Pernet,

Centre for Clinical Brain Sciences

Brain Research Imaging Centre

[email protected]

http://www.sbirc.ed.ac.uk/cyril/

Biological Recordings

• Behavioural / Electrophysiology / MRI images

• 1D: Single channel (time / freq)

• 2D: Classification ‘images’ (can actually be spectrograms)

• 3D: MRI (xyz) and MEEG (channels x time / freq / trials)

• 4D: fMRI (time x xyz) and MEEG (channels x freq x time x trials)

Biological Recordings

Often we want:

To ensure data are OK for analyses: high-dimensional outlier detection, weighting, etc.

To analyse each ‘cell’ in the data matrix = ‘massive univariate analyses’: multiple comparisons issue

To find features in the data that distinguish conditions / groups: dimension reduction (ICA), classification (MVPA)
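The multiple comparisons issue can be made concrete with a little arithmetic. The sketch below assumes independent tests, which real neighbouring ‘cells’ are not, so it only illustrates the order of magnitude; the function names are hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    # One test per 'cell': the family-wise rate climbs quickly with matrix size.
    for n_tests in (1, 10, 100, 10000):
        print(n_tests, familywise_error_rate(n_tests), bonferroni_alpha(n_tests))
```

With even a modest channels x time matrix, an uncorrected threshold almost guarantees a false positive somewhere, which is why the cluster-based corrections in Example 2 matter.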

My toys

• General linear model (WLS, IRLS)

• Robust statistics (trimmed means, winsorized variance, skipped correlations, half-space / minimum covariance determinant, MAD, S-outliers, etc.)

• Bootstrap and permutations

• Cross-validation
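A minimal sketch of three of the robust tools listed above (trimmed mean, winsorized variance, MAD-based outlier flags), using only the Python standard library. The 20% trimming and the 2.24 cut-off are common conventions assumed here, not values given in the slides, and the function names are hypothetical.

```python
import statistics


def trimmed_mean(x, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of the values."""
    xs = sorted(x)
    g = int(proportion * len(xs))
    return statistics.fmean(xs[g:len(xs) - g])


def winsorized_variance(x, proportion=0.2):
    """Sample variance after replacing each tail by its nearest retained value."""
    xs = sorted(x)
    n = len(xs)
    g = int(proportion * n)
    if g > 0:
        xs = [xs[g]] * g + xs[g:n - g] + [xs[n - g - 1]] * g
    return statistics.variance(xs)


def mad_outliers(x, threshold=2.24):
    """Flag values further than threshold * normalised MAD from the median."""
    med = statistics.median(x)
    mad = 1.4826 * statistics.median([abs(v - med) for v in x])
    if mad == 0:
        return [False] * len(x)
    return [abs(v - med) / mad > threshold for v in x]


if __name__ == "__main__":
    sample = [2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
    print(trimmed_mean(sample), winsorized_variance(sample), mad_outliers(sample))
```

The single extreme value (50) pulls the ordinary mean well away from the bulk of the data, while the trimmed mean and the MAD rule are barely affected by it.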

Example 1: EEG outlier detection

• Weighted least squares (WLS) of MEEG

-> weights based on time course similarity:

1. dimension reduction (PCA)

2. outlier detection (MAD)

3. weighting (WLS)

OLS: face 1 vs 2 seem a bit different

WLS: face 1 vs 2 seem identical

Bias: trial variability in face 2 leads to a small difference in OLS
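The three steps of Example 1 (PCA, MAD, WLS) can be sketched as follows. This is a simplified illustration with NumPy, not the implementation behind the slides: the 99% variance cut-off, the 2.24 threshold, the down-weighting rule and the function names are all assumptions made for the example.

```python
import numpy as np


def trial_weights(trials, var_explained=0.99, threshold=2.24):
    """Return one weight in (0, 1] per trial; trials has shape (n_trials, n_times)."""
    # 1. Dimension reduction: PCA of the mean-centred trials via SVD.
    centred = trials - trials.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    n_keep = min(len(s), int(np.searchsorted(explained, var_explained)) + 1)
    scores = u[:, :n_keep] * s[:n_keep]

    # 2. Outlier detection: MAD rule on each trial's distance to the median trial.
    dist = np.linalg.norm(scores - np.median(scores, axis=0), axis=1)
    centre = np.median(dist)
    mad = 1.4826 * np.median(np.abs(dist - centre))
    weights = np.ones(len(dist))
    if mad == 0:
        return weights
    z = (dist - centre) / mad

    # 3. Weighting: typical trials keep weight 1, atypical ones are shrunk.
    atypical = z > threshold
    weights[atypical] = threshold / z[atypical]
    return weights


def wls_fit(design, data, weights):
    """Weighted least squares: scale rows by sqrt(weight), then solve by OLS."""
    root_w = np.sqrt(weights)[:, None]
    beta, *_ = np.linalg.lstsq(design * root_w, data * root_w, rcond=None)
    return beta
```

With a design matrix holding one column per condition (face 1, face 2), `wls_fit` returns one weighted time course per condition, so a few atypical trials in one condition no longer drive the apparent difference between them.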

Example 2: MCC

• Threshold-Free Cluster Enhancement (width^e x height^h)

• Smith and Nichols 2009: integrate the cluster mass at multiple thresholds; used for fMRI/TBSS
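To make "integrate the cluster mass at multiple thresholds" concrete, here is a minimal one-dimensional sketch of the TFCE score, where each point accumulates width^E x height^H over all thresholds below its value. The defaults E = 0.5 and H = 2 follow Smith and Nichols 2009; the step size dh, the restriction to positive values and the function name are assumptions for this illustration.

```python
def tfce_1d(stat, E=0.5, H=2.0, dh=0.1):
    """TFCE scores for a 1D list of positive statistics (e.g. t values over time)."""
    scores = [0.0] * len(stat)
    max_stat = max(stat)
    n_steps = int(max_stat / dh) if max_stat > 0 else 0
    for step in range(1, n_steps + 1):
        h = step * dh  # current threshold (height)
        i = 0
        while i < len(stat):
            if stat[i] >= h:
                # Walk to the end of the contiguous supra-threshold cluster.
                j = i
                while j < len(stat) and stat[j] >= h:
                    j += 1
                extent = j - i  # cluster width at this threshold
                for p in range(i, j):
                    scores[p] += (extent ** E) * (h ** H) * dh
                i = j
            else:
                i += 1
    return scores


if __name__ == "__main__":
    # A wide cluster and an isolated point of the same height.
    print(tfce_1d([0.0, 2.0, 2.0, 2.0, 2.0, 0.0, 2.0, 0.0]))
```

Points inside the wide cluster receive a higher score than the isolated point of equal height, which is the behaviour that lets TFCE favour spatially or temporally extended effects without fixing a cluster-forming threshold. Negative effects would be handled by applying the same function to the negated statistics.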

Example 2: MCC for N dimensions

• Threshold-Free Cluster Enhancement

• Pernet et al. 2014: validation for electrophysiology to optimize parameter selection

Example 3: ICA – correction factors?

Decompose spatial or temporal patterns into independent sources:

Can we test all sources of interest simultaneously and still control the type I error rate?
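One standard way to approach this question, offered here as an assumption rather than as the answer given in the talk, is a maximum-statistic permutation test: every source is tested against a null distribution built from the largest statistic across all sources. The sketch below uses sign-flipping for a one-sample design; the data layout and function names are hypothetical.

```python
import random
import statistics


def t_stat(values):
    """One-sample t statistic against zero."""
    n = len(values)
    return statistics.fmean(values) / (statistics.stdev(values) / n ** 0.5)


def max_stat_sign_flip(data, n_perm=1000, alpha=0.05, seed=0):
    """data[subject][source] -> (observed t per source, corrected |t| threshold)."""
    rng = random.Random(seed)
    n_sources = len(data[0])
    observed = [t_stat([row[s] for row in data]) for s in range(n_sources)]
    max_null = []
    for _ in range(n_perm):
        # Flip the sign of a whole subject so dependence between sources is kept.
        signs = [rng.choice((-1, 1)) for _ in data]
        flipped = [[sign * v for v in row] for sign, row in zip(signs, data)]
        max_null.append(
            max(abs(t_stat([row[s] for row in flipped])) for s in range(n_sources))
        )
    max_null.sort()
    threshold = max_null[min(n_perm - 1, int((1 - alpha) * n_perm))]
    return observed, threshold
```

A source is declared significant when its observed |t| exceeds the returned threshold; because the threshold comes from the maximum over sources, the family-wise type I error rate is controlled across all sources tested together.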