
06/11/09 1JB Poline

Databasing, neuroimaging and genetics

Jean-Baptiste Poline

Thanks: A. Barbot, B. Thyreau, Y. Schwartz, A. Moreno, B. Thirion, V. Frouin, E. Duchesnay, P. Pinel and many others

Outline

• Motivation
• Databasing and neuroimaging: a quick review and taxonomy
• Genetic databases: a very brief word
• Neuroimaging and genetics: new needs
• Two imaging genetics examples:
– Saguenay study
– Imagen study
• Conclusion and perspective

Motivation

• Imaging genetic studies can be divided into
– Small groups selected for a specific polymorphism
– Large studies involving hundreds / thousands of subjects for sensitivity / exploratory approaches (cf. GWAS)
• Sharing:
– several partners are involved, or the NIH requires it
• Data protection
• Updating / data versioning
– Increasing the number of subjects, or information
• Queries made simpler / quicker / possible through the web or on disk
• Cost: databasing reduces cost
– Acquisition / maintenance

You cannot handle large heterogeneous data without serious tools

• Excel files will kill you

Databasing and neuroimaging: a quick review and taxonomy

See also…

• The bibliography database
– BrainMap
• Networks / databases
– BIRN, ADNI, Brainscape, fMRIDC…
• Knowledge based
– Ontology projects / Xceed
• Processing…
– LONI pipelines, NAMIC, BrainVISA, Neurogrid, Fiswidgets, …
• Each large project has its DB
– Imagen, Saguenay, many others…

BrainMap http://brainmap.org/

Three BrainMap applications:

1. Database searches and Talairach coordinate plotting (Sleuth)
2. Meta-analyses via the activation likelihood estimation (ALE) method (GingerALE)
3. Entry of published functional neuroimaging papers with coordinate results (Scribe)

Not a resource for raw data, but may contain contrast maps


The BIRN

BIRN Data Repository

• Sharing data through the BDR to capture, curate, store, query, view, and download imaging and related data.
• Enable the sharing of existing, published data
• BDR as a mechanism to facilitate collaborations
• Appropriate timeline for public release
– versioning
• A rich curatorial environment, built on the BIRN portal foundation, data submission process and subsequent sharing.
• XNAT remains a possibility

NAMIC
• creating a medical image computing platform
• research on novel image analysis algorithms
• deploying these capabilities

• ADNI methods available for non-ADNI studies:
– imaging protocol
– image corrections
– ADNI phantom and analysis software
• http://www.adni-info.org
• Access through request


ADNI uses LONI Image Data Archive

LONI

The LONI Image Data Archive: an environment for
• safely archiving
• querying
• visualizing
• sharing

The archive facilitates
• de-identification and pooling of data from multiple institutions
• protection from unauthorized access
• the ability to share data among collaborating investigators


Extensible Neuroimaging Archive Toolkit

Neurogrid - http://www.neurogrid.ac.uk/

• A Grid-based network of neuroimaging centres and a neuroimaging tool-kit. Sharing data and expertise to facilitate the archiving, curation, retrieval and analysis of imaging data
• Enable multi-site, large-scale clinical studies
• Practicalities:
– Set up a secured account
– Upload your brain image (T1, DTI)
– Download results

Outstanding questions

• Databases are still about large projects, but local organisation is needed
– How to reconcile local needs with a real DB?
• Most of the tools from large projects require IT support (system manager + knowledge of neuroimaging), often even if they pretend otherwise…
• Results are too rarely entered in the DB after analyses: ontology issues
• Large project publications: are those projects the most efficient with respect to the current success criteria?
– BIRN: about 80 publications in 5 years
– ADNI: about 15? (PubMed)

Some thoughts on neuroimaging and databasing

• Sharing data is not yet common but should be in the future
– NIH trend, cost, recruitment of specific populations
• Remote computing is getting more common (cloud computing) but tools are still too difficult for the average lab
• Reproducibility / provenance tracking of results may eventually impose a databasing solution
• Could be a cost-effective solution…

• Gene Database
A new database of genes and associated information is available for searching in Entrez.
• RefSeq
Reference sequences of chromosomes, genomic contigs, mRNAs, and proteins for human and major model organisms.
• OMIM
A guide to human genes and inherited disorders maintained by Johns Hopkins University and collaborators.
• dbSNP
A database of single nucleotide polymorphisms (SNPs) and other nucleotide variations.

NCBI (National Center for Biotechnology Information) Genome Resource guides http://www.ncbi.nlm.nih.gov/genome/guide/


See also the …. … resources

dbSNP:

• SNP rs2396753: variations can be used for gene mapping, definition of population structure, and performance of functional studies.
– dbSNP


Mapview


Hapmap / Haploview

Summarizing the needs

• Data protection / Backup / Archiving
• Data (pseudo) anonymisation – de-identification
– The story of the pseudocode 2 and how it can be broken
• Data entering and download
– User login/password based access
– User specific view of the data
• Data versioning
• Quality check – Data curation
• Querying the data (Gene/Img/Behav): Interface + scripting
– Different level: x,y,z? Whole image / run?
• Sharing the results (results re-entered)
• Visualization
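
A minimal sketch of the pseudo-anonymisation need listed above, and of why a weak code can be broken. The slides do not say which coding scheme either study used, so everything here is an assumption: HMAC-SHA256 is used purely as an illustration, and the function names, the key and the subject ID format are hypothetical.

```python
import hashlib
import hmac


def naive_code(subject_id):
    """Unkeyed hash of the ID (hypothetical weak scheme, for illustration only)."""
    return hashlib.sha256(subject_id.encode("utf-8")).hexdigest()[:12]


def keyed_code(subject_id, secret_key):
    """Keyed pseudonym: stable for a given key, not reproducible without it."""
    mac = hmac.new(secret_key, subject_id.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:12]


def break_naive_code(code, candidate_ids):
    """Dictionary attack: hash every candidate ID and compare with the code."""
    for candidate in candidate_ids:
        if naive_code(candidate) == code:
            return candidate
    return None


# Hypothetical ID format: a site prefix followed by a running number.
candidates = ["site1-%04d" % n for n in range(1000)]
leaked = naive_code("site1-0042")
recovered = break_naive_code(leaked, candidates)  # the original ID is found again
```

The keyed variant only moves the problem: the key itself then has to be protected and backed up like any other sensitive data.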

Example 1: Saguenay Youth Study

A genetic study of long-term effects of prenatal exposure to maternal cigarette smoking

On:
* Brain Structure
* Brain Function
* Cardiovascular Function
* Body Fat/Metabolism

In:
* Human Subjects (500 sibpairs)
* Recombinant Inbred Strains of Rats

Funded by CIHR (PIs: T. Paus and Z. Pausova)


Saguenay-Lac-Saint-Jean region

Saguenay Youth Study

• Genome-wide scan with sib-pair linkage analysis
• Fine mapping with family-based association analyses
• 500 sib-pairs (+ parental DNA)
• Age: 12-18 years
• French-Canadian origin

250 non-exposed, 250 exposed

Matched by:
• Maternal education
• School attended

Pausova et al. Human Brain Mapping 28:502-518, 2007

Saguenay Youth Study: Data Collection

III. Telephone Interview
• Life habits of mother during pregnancy and now
• Medical history of children, mother and father
(30 min)

IV. Home Visit
• School performance, activities at school, feelings at school, life at home (ECOBES, students)
• Your children and school, your education, your family life (ECOBES, parents)
• Screen for psychiatric disorders (DISC Predictive Scale for adolescents)
• Puberty development; risky behaviors (cigarettes, drugs, alcohol); hyperactivity, conduct disorder, aggression, anxiety, and depression; delinquency (GRIP, adolescents)
• Cigarettes, drugs, and alcohol abuse; anxiety, depression, and anti-social behavior (GRIP, parents)
• Drawing a blood sample (parents)
(2 h)

V. Laboratory
Neuro-psychological Assessment
• IQ assessment (WISC-III)
• Academic achievement (Woodcock-Johnson)
• Memory (Children’s Memory Scale)
• Motor skills (pegboard, tapping, bi-manual coordination)
• Executive functions (interference, word fluency, working memory)
• Emotion/Motivation (faces, voices, gambling, RFT)
• Language (FM threshold, phonological awareness, DAF, phonetic learning)
(6 h)

MRI scan
• Brain
• Abdomen: fat and kidneys

Diet and Physical Activity
• Twenty-four-hour food recall
• Food frequency questionnaire
• Physical activity questionnaire

VI. Hospital Session
Body composition
• Anthropometry
• Bioimpedance
• MRI (fat)
Blood pressure, cardiovascular reactivity, and salivary cortisol (Finometer: beat-to-beat, respiration)
• Resting
• In response to postural change
• In response to mental stress
(4 h)

VII. School Session
Fasting Blood Sample
• Glucose and lipid metabolism
• Low-grade inflammation, endothelial and fibrinolytic dysfunctions, HPA activity
Sexual maturation
Smoking habits
Nutrition
(1 h)

VIII. Genotyping: Candidate Genes and Total Genome Scan

Structural Magnetic Resonance Imaging: T1-weighted, T2-weighted, Proton Density, Magnetization Transfer Ratio (durations shown: 15 min, 15 min, 15 min)

MR Pipeline: Quality Control

One result

• White Matter volume
• Magnetization Transfer ratio

Testosterone influenced WM volume to a greater extent in males with the more “efficient” AR (short AR gene), compared with those with a less efficient AR (long AR gene)

Lessons from the Saguenay study

• Home made database (PHP, 1py)
• Contains all variables (phone interview, etc.) but not the imaging data
• No mechanism to share data
• Home design for web pages for specific datasets (~versioning)
• Semi automatic analysis pipeline, results re-entered in the DB

• The use of a specific population
• Very large amount of behavioural or biological data
• No tool easy to re-use

Example 2: Imagen project and database: a brief review

• Genetically influenced individual differences in brain responses to reward, punishment and emotional cues in adolescents mediate risk for mental disorders
• Neuroimaging: measurement of specific brain functions implicated in the etiology of mental disorders, linking them to genetic and behavioural variations
• The goal of the present study is to identify the neurobiological and genetic basis of these traits and to assess their relevance for mental disorder. Means: a multicentre functional and structural genetic-neuroimaging study of a cohort of 2000+ 14-year-old adolescents. Intermediate phenotypes of risk for adolescent mental illness will be explored.

European partners

1. Berlin: A. Heinz
2. Cambridge: replaced by Dresden, M. Smolka
3. Dublin: H. Garavan
4. Hamburg: C. Buechel
5. London: G. Schumann, L. Reed
6. Mannheim: H. Flor
7. Nottingham: T. Paus
8. Orsay: JL Martinot
Also: T. Robbins (Cam.), P. Conrod (IOP), …

Work packages (WP) and timeline

• WP 01: Behavioural analysis of animal models (3 years); implementation Year 2-3
• WP 02: Behavioural tasks in humans (4 years); implementation Year 2-4
• WP 03: Gene identification; Month 19-Year 4 (2.5 years)
• WP 04: Recruitment and characterisation; Year 2-4 (3 years)
• WP 05: Neuroimaging standardisation; Year 1 (1 year)
• WP 06: Neuroimaging; Year 2-4 (3 years)
• WP 07: Bioinformatics and Biostatistics; Year 1-5 (5 years)
• WP 08: DNA bank, SNP detection and genotyping; Year 2-5 (4 years)
• WP 09: Ethics IMAGEN; Year 1-5 (5 years)
• WP 10: Training and dissemination; Year 1-5 (5 years)
• WP 11: Project Management; Year 1-5 (5 years)
• Preparation phases shown in the chart: Year 1, Year 1, 6 months


Step 1: One site collection and transfer (Scito, NNL)


Step 2: data anonymisation and package handling


Step 3: including data

Work-Package 07 – Central Database

XNAT: a database tool [Marcus et al. 2007] (also used in BIRN)

• XML schemas define database structure (easy database modification)
• Auto-generated tools:
– Web portal
– Command line
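
To illustrate how an XML description can drive the database structure, here is a minimal sketch that reads one subject record into a Python dictionary. The element and attribute names (`subject`, `session`, `score`, `quality`) are hypothetical and are not taken from the actual XNAT or Imagen schemas; only the standard-library XML parser is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout, invented for this example.
RECORD = """
<subject id="000001">
  <site>Orsay</site>
  <session modality="T1" quality="good"/>
  <session modality="DTI" quality="poor"/>
  <score instrument="X">7.5</score>
</subject>
"""


def parse_subject(xml_text):
    """Turn one XML subject record into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "site": root.findtext("site"),
        # one quality label per imaging modality
        "sessions": {s.get("modality"): s.get("quality")
                     for s in root.findall("session")},
        # one numeric score per behavioural instrument
        "scores": {s.get("instrument"): float(s.text)
                   for s in root.findall("score")},
    }


subject = parse_subject(RECORD)
```

Adding a new kind of measurement then means adding an element to the record description rather than rewriting the storage code, which is the "easy database modification" point made above.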


Data included, use XML schema for DB Ontology

Web-based quality check


(Pre-)Processing

• T1
– SPM8 new segment
– BrainVISA pipeline
– DARTEL / FreeSurfer have been tried out
• T2*
– SPM8 preprocessing of all available EPI data
– Strategy: movement correction; reslicing; fMRI -> MPRAGE long, MPRAGE long -> MNI template for each session
– Homogenizing the log files to get fMRI protocols (dealing with varying numbers of runs, …)
– Fitting the model intra subject (SPM)
– Inter subject: in house (mixed effect + permutation)
• DTI
– FSL
– In house
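
The in-house inter-subject code is not shown in the slides. As a rough illustration of the permutation part only (the mixed-effect model is left out), here is a sign-flipping permutation test on one contrast value per subject. The function name, the number of permutations and the example values are assumptions made for this sketch.

```python
import random
from statistics import mean


def sign_flip_p_value(values, n_perm=2000, seed=0):
    """Two-sided permutation p-value for 'the group mean is zero'.

    Under the null hypothesis each subject's value is equally likely
    to be positive or negative, so signs are flipped at random.
    """
    rng = random.Random(seed)
    observed = abs(mean(values))
    count = 1  # the observed labelling counts as one permutation
    for _ in range(n_perm):
        flipped = [v if rng.random() < 0.5 else -v for v in values]
        if abs(mean(flipped)) >= observed:
            count += 1
    return count / (n_perm + 1)


# Hypothetical contrast values for 12 subjects at one voxel.
effect = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.1, 0.9, 1.2, 0.8]
p_effect = sign_flip_p_value(effect)
```

In a whole-brain setting the same flips would be applied to every voxel at once and the maximum statistic kept for each permutation, which is one common way to control for multiple comparisons.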


Queries

• Give me the T1 images normalized to MNI for subjects whose score X is above 5
• Give me the behavioural scores of instruments X and Y for subjects with T2* image quality above Z
• Give me the genotypes of subjects with both behavioural score X and DTI images of good quality
• Download results
• API for scripts
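
Two of the queries above can be sketched against a toy relational layout. This is not the Imagen database or the XNAT API: the table and column names, the quality scale and the file names are all hypothetical, and an in-memory SQLite database stands in for the real store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subjects (id TEXT PRIMARY KEY, score_x REAL, genotype TEXT);
CREATE TABLE images (subject_id TEXT, modality TEXT, space TEXT,
                     quality REAL, path TEXT);
""")
conn.executemany("INSERT INTO subjects VALUES (?, ?, ?)", [
    ("s01", 7.0, "G/G"), ("s02", 3.0, "G/T"), ("s03", 9.0, "T/T"),
])
conn.executemany("INSERT INTO images VALUES (?, ?, ?, ?, ?)", [
    ("s01", "T1", "MNI", 0.90, "s01_t1_mni.nii"),
    ("s02", "T1", "MNI", 0.80, "s02_t1_mni.nii"),
    ("s03", "T1", "native", 0.70, "s03_t1.nii"),
    ("s03", "DTI", "native", 0.95, "s03_dti.nii"),
    ("s01", "DTI", "native", 0.40, "s01_dti.nii"),
])


def t1_mni_for_score_above(conn, threshold):
    """T1 images normalized to MNI for subjects whose score X exceeds threshold."""
    rows = conn.execute(
        "SELECT i.path FROM images i JOIN subjects s ON s.id = i.subject_id "
        "WHERE i.modality = 'T1' AND i.space = 'MNI' AND s.score_x > ? "
        "ORDER BY i.path", (threshold,))
    return [row[0] for row in rows]


def genotypes_with_good_dti(conn, min_quality):
    """Genotypes of subjects having both a score X and a good-quality DTI image."""
    rows = conn.execute(
        "SELECT s.id, s.genotype FROM subjects s "
        "JOIN images i ON i.subject_id = s.id "
        "WHERE i.modality = 'DTI' AND i.quality >= ? "
        "AND s.score_x IS NOT NULL ORDER BY s.id", (min_quality,))
    return rows.fetchall()
```

Wrapping each query in a function is what an "API for scripts" amounts to in practice: the same question can be asked from the web interface or from an analysis script.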


Automatic Quality check


Neuroimaging scores for QC

fMRI Movement estimated

T1 mask and template overlap

Intra volume variance variation
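
Two of these scores can be sketched in a few lines. The slides do not give the exact definitions used in the project, so this assumes a Dice coefficient for the mask / template overlap and the largest absolute translation for the movement score; masks are flat sequences of 0/1 values and movement parameters are (tx, ty, tz) rows, both hypothetical input formats.

```python
def dice_overlap(mask_a, mask_b):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    both = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # two empty masks are treated as identical
    return 2.0 * both / (size_a + size_b)


def max_translation(params):
    """Largest absolute translation over all volumes, in the input units."""
    return max(abs(t) for row in params for t in row)


# Hypothetical values: a 4-voxel mask pair and two volumes of movement estimates.
overlap = dice_overlap([1, 1, 0, 0], [1, 0, 0, 0])
movement = max_translation([(0.1, -0.5, 0.2), (0.0, 0.3, -1.2)])
```

Scores of this kind can be stored next to each image so that queries can filter on data quality.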


A few words on data analysis

Neuroimaging and WGA

[Figure: SNP genotypes (G/G, T/G, T/T, …), imaging data (aMRI, dMRI, fMRI) and clinical / behavioural data: find statistical links, or predict]

Finding out the good analysis strategies

Data dimensions across subjects:
• SNP: 1M, + CNV
• Transcriptome: 50k
• Images: 200k-50k
• Behaviour: <200

Issues:
• Data dimension reduction
• Multiple comparison problem
• Inhomogeneous data

Candidate SNPs vs. all image

For each voxel: voxel = f(genotype), with genotypes such as G/G, C/G, C/C, C/G, G/G, giving a statistical map

Methods:
- VBM, group fMRI, etc...

Complexity / multiple comparison issue:
- ~10^6 tests or estimated parameters

One image region vs. all SNPs

For one voxel: f( ) = SNP

Method known as WGAS
Multiple comparison: ~10^6 tests

PLINK?
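
To give a sense of scale for these test counts, here is the simplest possible correction, Bonferroni, applied to the orders of magnitude quoted above. Bonferroni is used here only as an illustration; the slides do not state which correction is applied, and the variable names are made up for this sketch.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold keeping the family-wise error rate at alpha."""
    return alpha / n_tests


n_voxels = 10**6  # order of magnitude of voxel-wise tests
n_snps = 10**6    # order of magnitude of SNP-wise tests

one_scan = bonferroni_threshold(0.05, n_snps)               # one region against all SNPs
full_search = bonferroni_threshold(0.05, n_voxels * n_snps)  # every voxel against every SNP
```

Bonferroni ignores the correlation between tests (linkage disequilibrium between SNPs, spatial covariance between voxels), so these thresholds are conservative.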

Feature selection approach

Selection (shown twice in the diagram), then Gene-Image analysis on the reduced data

Or multivariate approaches

• Consider LD / spatial covariance / covariance of behavioural tests

Sentence reading circuit – checkerboards
Correlation: lateralisation / pseudoword reading speed

Score = pseudoword reading speed / whole-brain analysis, p=0.01, 40 voxels
Excluding outliers at least two standard deviations from the mean

Sentence listening circuit
Correlation: lateralisation / pseudoword reading speed

Score = pseudoword reading speed / whole-brain analysis, p=0.01, 40 voxels
Excluding outliers at least two standard deviations from the mean

Sentence reading circuit – checkerboards
Lateralisation difference – gene KIAA / SNP: rs155089 6>8

Sentence reading circuit – checkerboards
Lateralisation difference – gene KIAA / SNP: rs7761100 7>6

Type             G/G    G/T    T/T
nb. sbj           32     28      7
age               23     23     22
men               39     40     50
educ (y)         3.4    3.1    3.5
Dysl. (%)         10      7      0
Substr. (%)       70     79     81
Pseudow (ms/w)   930    895    850

Plots: % dyslexic, pseudoword reading speed, subtraction score

Type             C/C    C/T    T/T
nb. sbj            2     21     34
age                -     25     24
men                -     44     36
educ (y)           -    3.1    3.4
Dysl. (%)          -      6     10
Substr. (%)        -     80     73
Pseudow (ms/w)     -    855    924


Conclusion:

• A lot to be done: combining two complex and powerful data types for
– Better understanding of brain mechanisms
– Better understanding of the impact of genetic variations
– Better risk factor prediction…
• Visualisation and interaction: see Abstract
• Multiple analysis strategies are needed to face the huge multiple comparison problem


L Shen, S Kim, J D West, A J Saykin