Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

37
Integrated transcriptional profiling and linkage analysis for mapping disease genes and regulatory gene networks analysis Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine [email protected]

description

Integrated transcriptional profiling and linkage analysis for mapping disease genes and regulatory gene networks analysis. Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine [email protected]. Outline. Introduction: the biological framework - PowerPoint PPT Presentation

Transcript of Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Page 1: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Integrated transcriptional profiling and linkage analysis for mapping

disease genes and regulatory gene networks analysis

Enrico PetrettoResearch Fellow in Genomic Medicine

Imperial College Faculty of [email protected]

Page 2: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Outline• Introduction: the biological framework

– Expression QTL mapping using animal models

– eQTL analysis in multiple tissues

• Integrating genome-wide eQTL data to identify gene association networks

– Data mining of eQTLs

– Graphical Gaussian models (GGMs)

– Example of identification of disregulated pathway

– Master transcriptional regulator

Page 3: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

quantitative variation of mRNA levelsin a segregating population

Genetical Genomics

Genetic mapping

Expression QTLsgenetic determinants

of gene expression

model organisms

Page 4: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

The rat is among the leading model species for researchin physiology, pharmacology, toxicology

and for the study of genetically complex human diseases

Spontaneously Hypertensive Rat (SHR):A model of the metabolic syndrome

• Spontaneous hypertension• Decreased insulin action• Hyperinsulinaemia• Central obesity• Defective fatty acid metabolism• Hypertriglyceridaemia

Page 5: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Specialized tools for genetic mapping: Rat Recombinant Inbred (RI) strains

HXB1 HXB2 HXB3 HXB4 HXB5 HXB6 HXB7 …

F1

F2

Pravenec et al. J Hypertension, 1989

Mate two inbred strains

F1 offspring are identical

F2 offspring are different(due to recombination)

Brother sister mating over >20 generations to achieve

homozygosity at all genetic loci RI strains

Spontaneously

Hypertensive Rat

Normotensive

Rat (BN)

Page 6: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

SHR BN

F1

F2

H B BH B H HStrain Distribution Patternfor Gene X

Gene X

GenotypeB

Genotype H

Cumulative, renewable resource for phenotypes and genetic mapping

RI strains

Page 7: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Gene X

B H BB B H HSDP for Gene X

Mapping of QTLscompare strain distribution pattern of

markers and traits

RI strains

obesity

mR

NA

Linkage

Linkage

Page 8: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Gene expression analysis in the Rat

4 animals per strain (no pooling)

30 RI strains + 2 parental strains

Affymetrix RAE230A

Affymetrix RAE230_2

640 microarray data sets~ 16,000 probe sets per array (fat, kidney, adrenal) ~ 30,000 probe sets per array (heart, skeletal muscle)

Fat

Heart

Expression profiling

Skeletal muscle

Page 9: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

eQTL Linkage Analysis

Multiple testing issues

1,011 genetic markers 15,923 probe sets

Evaluate the linkage statistics for each genetic marker and use permutation testing to provide

genome-wide corrected P-values

* Storey 2000

Expected proportion of false positives among the probe sets called significant in the linkage

analysis (False Discovery Rate*)

For each probe set on the microarray, expression profiles were regressed against all 1,011 genetic markers

Page 10: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

cis-acting trans-actingeQTLgene

eQTLgene

cis- and trans-acting eQTLs

Candidate genes for physiological traits

Regulatory gene networks

Page 11: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Fat

Genomewide significance of the eQTL

eQTL datasets in the rat model system

In collaboration with Dr SA Cook (Molecular Cardiology, MRC Clinical Sciences Centre), Dr M Pravenec (Czech Academy of Sciences, Prague) and Prof N Hubner (MDC, Berlin)

Skeletal muscle

brain

Heart

Tissue

Rat genome

Cis-acting eQTL

Trans-acting eQTL

Page 12: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

+ cis-eQTL

+ trans-eQTL

Heart

Genetic architecture of genetic variation in gene expression

Petretto et al. 2006 PLoS Genet

trans-eQTLs: small genetic effect

cis-eQTLs: big genetic effect highly heritable

FatHeart

Page 13: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

FDR for cis- and trans-eQTLs

heart fat kidney adrenal

homogeneous tissues heterogeneous tissues

FDRFDR

Petretto et al. 2006 PLoS Genet

Page 14: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

trans-eQTLs hot-spotsRat chromosome 8

not tissue-specific cluster

heart

fat

kidney

adrenal

PGW<0.05tissue-specific clusters

Trans-eQTLs

Master transcriptional

regulator ?

Page 15: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Strategy to identify master transcriptional regulators

eQTLs

cis

Genetic markers

Data mining

Association networks

Model for master transcriptional regulator

genetic variant

cis-linked gene

Transcription Factor (TF) activity profile

Expression of trans-linked genes

Multi-tissues

TF binding data

Downstream functional validation in the lab

(Dr Cook / Prof Aitman)

Gene expression

trans

GGMs

FunctionalAnalysis

(GSEA, etc.)

Page 16: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

GGMs• Partial correlation matrix = (ij)

• Inverse of variance covariance matrix P = (ij) = P-1

• small n, large p• Regularized covariance matrix estimator by

shrinkage (Ledoit-Wolf approach)

• Guarantees positive definiteness

ij = - ij / (ii jj )½

Schafer and Strimmer 2004, Rainer and Strimmer 2007

Page 17: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Partial correlation graphs• Multiple testing on all partial correlations

– Fitting a mixture distribution to the observed partial correlations (p)

f (p) = 0 f0 (p;) + A fA (p)

uniform [-1, 1]

0 ,

Prob (non-zero edge|p) = 1 - 0 f0 (p;)

f (p)

0 +A =1, 0 >> A

Schafer and Strimmer 2004, Rainer and Strimmer 2007

Page 18: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

GGMs Infer partial ordering of the node

• Standardized partial variances (SPVi)

• Proportion of the variance that remains unexplained after regressing against all other variables

• Log-ratios of standardized partial variances B = (SPVi / SPVj)½

Log(B) |rest = 0 Log(B) |rest ≠ 0

undirected directed

ij ij

endogenous variable

smaller SPV

exogenous variable

bigger SPV

Inclusion of a directed edge into the network is conditional on a non-zero partial correlation coefficient

Schafer and Strimmer 2004, Rainer and Strimmer 2007

Page 19: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Hypothesis driven analysis

Graphical Gaussian models

• Detect conditionally dependent trans-eQTL genes

• Infer partial ordering of the nodes (directed edges)

1. Gene expression levels under genetic control (i.e., ‘structural’ genetic perturbation)

2. Co-expression of trans-eQTLs point to common regulation by a single gene

Page 20: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

0

20

40

60

80

100

120

140

160c1

7.6

c17

.38

c15

.10

8c1

5.1

1c1

6.0

c11

.31

c15

.75

c6.1

36

c4.9

3c1

5.7

8c1

.87

c10

.25

c11

.32

c4.1

48

c8.4

5c8

.87

c8.5

3c4

.91

c4.1

61

c10

.21

c4.1

51

c16

.46

c15

.80

c17

.40

c8.9

c16

.50

c3.4

1c2

0.4

4c3

.11

2c8

.49

c13

.9c1

7.8

7c3

.13

0c5

.15

1c7

.14

2c8

.32

c15

.58

c1.2

48

c8.3

8c1

.90

c12

.7c3

.12

9c6

.13

1

kidney

heart

fat

adrenal

Locus (chromosome.Mb)

Chromosome 15, 108 Mb, D15Rat29

trans-eQTLs hot spots

Page 21: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

posterior probability for non-zero edge 0.8

Heart tissue, trans-eQTLs hot-spot (chromosome 15)

Page 22: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

posterior probability for non-zero edge 0.8 posterior probability for directed edge 0.8

Heart tissue, trans-eQTLs hot-spot (chromosome 15)

Interferon Regulatory Factor 8

IFN-gamma-inducible

Implicated in immune and inflammatory responses

Overexpression of IRF8 greatly enhances IFN-gamma

Enrichment for NF-kappa-B transcription factor binding sites

Page 23: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

posterior probability for non-zero edge 0.70.7

posterior probability for directed edge 0.8

Relaxing the threshold…

Signal transducer / activator of transcriptionIFN gamma activated, drive expression of the target genes, inducing a cellular antiviral state

Involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum forassociation with MHC class I molecules

MHC class I antigen antigen processing and

presentation

degradation of cytoplasmic antigens for MHC class I antigen presentation pathways

Page 24: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Is this association graph tissue specific?

Page 25: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

C15.108

C15.108 C15.108C15.108

C15.108

C15.108

C15.108

C15.108

C15.108

kidney, all trans-eQTLs, posterior probability 0.95

Page 26: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Adrenal, all trans-eQTLs, posterior probability 0.95

C15.108

C15.108C15.108

C15.108

C15.108

C15.108

C15.108C15.108

C15.108

Page 27: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Trans-eQTL genes detected

in multiple tissues

IRF - transcription factorinflammatory response

adrenalheartkidney

interferon-stimulated transcription factor

FC

in R

I st

rain

sF

C in

the

par

enta

l str

ains

type I interferon (IFN)inducible gene

Microarray data:dysregulated genes

Page 28: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Trans cluster

Cis eQTLs

Transcripts representing Dock9 gene

Pearson Correlation 100,000 permutations Bonferroni corrected

cis-acting eQTLs within the cluster region

Model for master transcriptional regulator

genetic variant

cis-linked gene

Transcription Factor (TF) activity profile

Expression of trans-linked genes

Page 29: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Gene Set Enrichment Analysis

Correlation between Dock9 and all trans-eQTLs (heart)

LCP2IRF8TAP1PSMB9PSMB8PSMB10IRGMIFIT3STAT1USP18IFI35IRF7LGALS3BP

Transcript 1370905_atEnrichment Score -0.73Normalized Enrichment Score -0.93p-value 0.004FDR q-value 3%

Transcript 1385378_atEnrichment Score -0.69Normalized Enrichment Score -1.85Nominal p-value 0.015FDR q-value 7%

IRF8PSMB8PSMB10TAP1PSMB9IRGMSTAT1USP18IFIT3IFI35IRF7LGALS3BP

Functionalgene-sets correlated with Dock9

Genes whose expression is altered greater than twofold in mouse livers experiencing graft-versus-host disease (GVHD)

as a result of allogenic bone marrow transplantation…

Page 30: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Other examples

Page 31: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

ATP binding and ion transporter activityCalcium signaling pathway

Heart tissue, trans-eQTLs hot-spot (chromosome 15, 78Mb)

posterior probability for non-zero edge 0.8

posterior probability for directed edge 0.8

Page 32: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Fat tissue specific, trans-eQTLs hot-spot (chromosome 17)

posterior probability for non-zero edge 0.8

posterior probability for directed edge 0.8

Page 33: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Summary

• Genome-wide eQTL data provide new insights into gene regulatory networks

• GGMs applied to trans-eQTL hotspots identified dysregulated pathway related to inflammation

• Hypothesis-driven inference can be a powerful approach to dissect regulatory networks

Page 34: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Acknowledgments

Sylvia Richardson Stuart CookTim Aitman Jonathan Mangion

Rizwan Sarwar

collaborators:

Norbert Hubner (MDC, Berlin)

Michael Pravenec (Institute of Physiology, Prague)

Page 35: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Extra slides

Page 36: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Chr 15 qRT-PCR validation in RI strains

1

2

3

4

Rarresin1_pred Irf7_pred Stat1

Fo

ld c

han

ge

Array qRT-PCR

Gene Rarresin1_pred Irf7_pred Stat1

Array 2.28 3.06 1.63P 4.0E-05 8.6E-05 1.4E-04

qRT-PCR 1.36 1.91 1.90P 0.039 0.004 0.036

Page 37: Enrico Petretto Research Fellow in Genomic Medicine Imperial College Faculty of Medicine

Rpt4 mRNA

Control Alpha Beta Gamma

+256

+64

+16

+4

±1

Interferon

Fo

ld c

han

ge

Irf7 mRNA

Control Alpha Beta Gamma

+256

+64

+16

+4

±1

Interferon

Fo

ld c

han

ge

Rpt4 and Irf7 mRNA levels increase in response to interferon

• H9c2 cells (rat cardiac embryonic myoblast)• Stimulated with recombinant rat interferon for 3 hours• RNA extracted, assayed by qRT-PCR (SYBR Green I)• 3 independent expts, 3 biological replicates