SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala

Page 1:

SYSTEMS BIOLOGY

Lukasz Huminiecki, DPhil

Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala

Page 2:

Please, tell me who you are!

Raise your hand if you are:

Computer scientist/mathematician

Computational biologist/bioinformatician

Experimental biologist

Undergraduate

Postgraduate

Post-doc

Page 3:

WHAT IS “SYSTEMS BIOLOGY”?

“Systems biology is the coordinated study of biological systems by (1) investigating the components of cellular networks and their interactions, (2) applying experimental high-throughput and whole-genome techniques, and (3) integrating computational methods with experimental efforts.” (first sentence of the Preface to Klipp E et al., “Systems Biology in Practice”, WILEY-VCH, 2005)

What do you think?

Page 4:

Back to the Roots?

In fact, early critics argued that molecular approaches are too reductionist, attempting to explain complex biological phenomena through the actions of a few genes or proteins.

There is a cyclical element to all progress!

Before the era of the molecular revolution, physiology-oriented biologists were much more used to looking at living things as systems.

Page 5:

Four areas of systems biology on which I will focus today

• Analysis of expression patterns

• Mathematical modeling

• Phylogenetics

• Web-resources and data integration

Page 6:

PART 1: EXPRESSION PATTERN EVOLUTION

Page 7:

Classic view of evolution through gene duplication

• Susumu Ohno, 1970. Evolution by Gene Duplication. Springer, Berlin

• “Natural selection merely modified while redundancy created"

• The neo-functionalization model

Page 8:

Genome-scale tests (1)

Page 9:

Genome-scale tests (2)

• Nembaware et al. 2002: Impact of the presence of paralogs on sequence divergence in a set of mouse-human orthologs. Genome Research

Page 10:

Gene Expression Atlas

• http://expression.gnf.org

• 101 human (microchip U95A) and 89 mouse (microchip U74A) Affymetrix experiments

• Huminiecki L, Lloyd AT, Wolfe KH. Congruence of tissue expression profiles from Gene Expression Atlas, SAGEmap and TissueInfo databases. BMC Genomics. 2003 Jul 29;4(1):31

• Mapping to Ensembl via LocusLink

• TRIBE families and Ka/Ks calculations using yn00 from PAML

Page 11:

Huminiecki et al. “Congruence of tissue expression profiles from GEA, SAGEmap and TissueInfo databases”. BMC Genomics

Page 12:

R vs. Ks in paralogs

Page 13:

One-to-one orthologs

Page 14:

Human or mouse duplication

Page 15:

Cumulative plots

Page 16:

Randomisation test

                     R > 0.6            R > 0.7            R > 0.8            R > 0.9
Human duplication    91% (p = 0.37)     58% (p = 0.0042)   52% (p = 0.0043)   60% (p = 0.038)
Mouse duplication    61% (p = 0.0111)   48% (p = 0.0027)   36% (p = 0.0015)   24% (p = 0.0018)

The percentages indicate the ratios between the fractions of genes having a particular R-value in sets of orthologues with the human (163 sets) or mouse (139 sets) duplication versus the group of one-to-one orthologues (1,324 pairs).
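The slide does not spell out how the randomisation was carried out, so the following is only a minimal sketch of one common way to do it: pool the R-values, shuffle the group labels, and count how often a shuffled ratio is at least as low as the observed one. The function names, the one-sided direction of the test and the toy R-values are all assumptions made for illustration; they are not the data or the code behind the table.

```python
import random


def fraction_above(r_values, threshold):
    """Fraction of genes whose expression correlation R exceeds the threshold."""
    return sum(r > threshold for r in r_values) / len(r_values)


def ratio_and_p_value(duplicated, one_to_one, threshold, n_perm=2000, seed=0):
    """Ratio of fractions (duplicated vs one-to-one) and a one-sided p-value.

    Hypothetical procedure: labels are shuffled over the pooled R-values and
    the p-value is the share of shuffles with a ratio <= the observed ratio.
    """
    observed = fraction_above(duplicated, threshold) / fraction_above(one_to_one, threshold)
    rng = random.Random(seed)
    pooled = list(duplicated) + list(one_to_one)
    n_dup = len(duplicated)
    as_low = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        reference = fraction_above(pooled[n_dup:], threshold)
        if reference == 0:
            continue  # ratio undefined for this shuffle; treated as not extreme
        shuffled = fraction_above(pooled[:n_dup], threshold) / reference
        if shuffled <= observed + 1e-12:
            as_low += 1
    return observed, (as_low + 1) / (n_perm + 1)


# Toy R-values, invented purely for illustration.
dup = [0.1] * 8 + [0.9] * 2
ref = [0.9] * 30 + [0.1] * 10
ratio, p = ratio_and_p_value(dup, ref, threshold=0.6)
```

A ratio below 100% with a small p-value would then mean that highly correlated expression profiles are rarer among orthologues with a lineage-specific duplication than among one-to-one orthologues.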

Page 17:

Sub-functionalisation

• Force et al. argue that neo-functionalisation alone could not account for the high accumulation of duplicated genes in eukaryotes

• Duplication-degeneration-complementation (DDC)

• Should lead to tissue-specific expression!

Page 18:

Tissue-specific genes evolve faster and are more likely to belong to large gene families

Page 19:

Gene expression patterns are, in evolutionary perspective, surprisingly labile!

Page 20:

Literature

• Khaitovich P, Weiss G, Lachmann M, Hellmann I, Enard W, Muetzel B, Wirkner U, Ansorge W, Paabo S. A neutral model of transcriptome evolution. PLoS Biol. 2004 May;2(5):E132. Epub 2004 May 11.

• Huminiecki L, Wolfe KH. Divergence of spatial gene expression profiles following species-specific gene duplications in human and mouse. Genome Res. 2004 Oct;14(10A):1870-9.

• Jordan IK, Marino-Ramirez L, Koonin EV. Evolutionary significance of gene expression divergence. Gene. 2005 Jan 17;345(1):119-26. Epub 2004 Dec 29.

• Khaitovich P, Paabo S, Weiss G. Toward a neutral evolutionary model of gene expression. Genetics. 2005 Jun;170(2):929-39. Epub 2005 Apr 16.

• Liao BY, Zhang J. Evolutionary conservation of expression profiles between human and mouse orthologous genes. Mol Biol Evol. 2006 Mar;23(3):530-40. Epub 2005 Nov 9.

Page 21:

The take home message

• An entirely new paradigm is emerging in evolutionary biology: expression patterns can change dramatically in the course of evolution.

• This impacts on our understanding of biodiversity, human origins, and drug discovery.

Page 22:

Broad goals of collaboration with Pfizer

We aim towards a set of heuristic rules to identify the most “druggable” GPCRs and the best model species in which to conduct preclinical tests. By “druggable” it is meant those which possess any single or combination of characteristics favourable to drug development, such as: (1) conserved sequence, (2) tissue-specificity, and (3) expression domain not overlapping with other members of the family.

Conserved sequence suggests that function is the same, and that drugs will have similar efficacy. A tissue-specific gene facilitates targeting into specific organs or tumour types, and is less likely to engage in multiple functions - both of these features are likely to result in advantageous toxicological profiles. Non-overlapping expression domain minimises the possibility of functional redundancy. Finally, the best animal model for preclinical trials is likely to be the species with the most “human” expression pattern of the target gene, especially in tissues directed for therapeutic intervention, as well as in toxicologically important organs, such as heart, lung, liver, kidney, and brain.

Page 23:

Specific goals of collaboration with Pfizer

• Generate high quality RNA preparations from 20 organs from duplicate male and female rat, guinea pig and dog samples, for comparison with commercial human RNA samples.

• Using qPCR techniques, determine the expression profile of at least 25 genes (with representatives from the histaminergic, serotonergic, and adrenergic GPCR families) in each of these tissues.

• Analyse data to consider congruence in expression profiles between species from an evolutionary bioinformatics perspective, in addition to gaining a deeper understanding into the degree of human-animal model translation and therefore into the suitability of animal species used for functional efficacy and toxicological studies at Pfizer.

Page 24:

Results: RNA isolation

[Figure: panels a), b), c)]

Polytron/RNeasy with additional acid phenol step and DNAaway for difficult tissues

Page 25:

The Database

[Diagram: database schema linking the tables cumulative, genes, RT_samples, RT_summary, tissue and preps. Field names shown include run, gene, tissue, ct, id, assay, symbol, prep, species, rt, date, technician, kit, samples, description, dilution, housekeep_actb, housekeep_hprt, tissue_index, tissue_name, donor, ratio, yield, count and dev]

Page 26:

The Ct value

• Two-tube comparative method with “virtual” housekeeping gene

• Amplification assumed to be exponential with 100% efficiency, Cts scaled accordingly

• a) histogram of Ct-values for over 6000 reactions; b) standard deviations in triplicates; c) ACTB plotted against HPRT1.

[Figure: panels a), b), c)]
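The comparative Ct calculation can be sketched as follows. With 100% efficiency the product doubles every cycle, so a Ct difference of one cycle corresponds to a two-fold difference in template. How the “virtual” housekeeping gene was built is not stated on the slide; taking the mean of the ACTB and HPRT1 Ct values is an assumption made here, and the function names and the example Ct values are hypothetical.

```python
def virtual_housekeeping_ct(ct_actb, ct_hprt1):
    """Assumed definition: mean Ct of the two housekeeping genes."""
    return (ct_actb + ct_hprt1) / 2.0


def relative_expression(ct_gene, ct_actb, ct_hprt1):
    """Expression relative to the virtual housekeeping gene.

    Assumes exponential amplification with 100% efficiency (doubling per
    cycle), so each cycle of delay halves the inferred template amount.
    """
    delta_ct = ct_gene - virtual_housekeeping_ct(ct_actb, ct_hprt1)
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles after the reference.
example = relative_expression(ct_gene=24.0, ct_actb=20.0, ct_hprt1=22.0)
```

In this example the target is inferred to be eight times less abundant than the reference, since it needs three more doublings to reach the threshold.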

Page 27:

Results overview

• Tissue RNAs from rat, guinea pig and dog were isolated. Human RNAs were purchased from Clontech.

• Human, rat and canine expression profiles of just under 40 genes have been examined thus far. Approximately 8 thousand assays have been performed.

• A number of striking differences in expression patterns have been revealed.

• Thus far, the most remarkable expression shifts have been observed in heart and aorta, among histamine, prostacyclin and adrenergic beta receptors. Numerous changes were also localised to the uterus.

• Apart from divergent expression patterns, mean expression levels also appeared rather different for many genes.

• Differences in expression between prostanoid receptors may have implications for the pharmacology of troublesome COX-2 inhibitors (such as Celebrex, Bextra, and Vioxx).

Page 28:

PART 2: MODELING

Page 29:

Mathematical Modeling of Metabolism and Gene Expression

• Dr. Edda Klipp

• Kinetic Modeling Group

• Lecture in the series “Gene und Genome: die Zukunft der Biologie” (“Genes and Genomes: the Future of Biology”)

Page 30:

What is a model?

Yeast, mouse as models for human

Verbal explanation

A sequence of letters, ATTCGAGGTATA, for a DNA sequence

Wiring scheme

Mathematical description: Boolean network, differential equations, stochastic equations

- Abstraction
- (Simplified) representation allowing for understanding

Edda Klipp, Kinetic Modeling Group

Page 31:

Why modeling?

Even the behavior of simple systems can usually not be predicted intuitively and from experience.

The behavior of complex dynamical processes cannot be predicted with sufficient precision just from experience.

For prediction and explanation of processes one needs a model.

Experimental observations: many simple and complex processes
- isolated enzymatic reaction
- temporal processes in metabolic networks
- pattern of gene expression and regulation

Page 32:

Why modeling?

Advantages

- Time scales may be stretched or compressed.
- Solution algorithms / computer programmes can often be used independently of the actually modeled system.
- Costs of modeling are lower than for experiments.
- Representation of quantities that are experimentally hidden.
- No risk for real systems, no interaction between investigation and system.

Page 33:

Why modeling?

Burning questions

- How is cellular response to environmental changes and stress regulated?
- How should a cell be treated to yield a high output of a desired product (Biotechnology)?
- Where should a drug operate to cure a disease (Health care)?
- Is our knowledge about a network/pathway complete?

Page 34:

Structure of the system

[Diagram: reaction chain Sext, S1, S2, S3, S4, S5, Smito with a branch to S6; steps marked fast and slow]

Variables, parameters, constants

State variables: set of variables describing the system completely

Dimension of the system = number of independent state variables

How many variables are used in my model?
- too few: system is under-determined
- too many: system is over-determined and may be contradictory

Do the units of variables, parameters etc. fit together?

Boundary of the system

Page 35:

Biological processes are complex phenomena

Central dogma of molecular biology:

Gene → mRNA → Proteins → Cellular processes

Page 36:

Direction of discovery

known                            to be predicted

Structure                        Function
Protein interactions             Biochemical action
Metabolic pathways               Concentration changes
Enzyme sets                      Influence of perturbations
                                 Possible behavior, bifurcations

Function                         Structure
Transmission of a signal         Sequence of signaling compounds
Time course of concentrations    Possible protein interactions

Page 37:

Concept of state

The state of a system is a snapshot of the system at a given time that contains enough information to predict the behaviour of the system for all future times. The state of the system is described by the set of variables that must be kept track of in a model.

Different models of gene regulation have different representations of the state:

Boolean model: a state is a list recording, for each gene involved, whether it is expressed (“1”) or not expressed (“0”)

Differential equation model: a list of concentrations of each chemical entity

Probabilistic model: a current probability distribution and/or a list of actual numbers of molecules of a type

Each model defines what it means by the state of a system. Given the current state, the model predicts what state/s can occur next.

Page 38:

Kinetics: change of state

A → B (rate constant k)

Deterministic, continuous time and state: e.g. ODE model. The concentration of A decreases and the concentration of B increases. The concentration change per time interval dt is given by

$\frac{dB}{dt} = k \cdot A$

Probabilistic, discrete time and state: transformation of a molecule of type A into a molecule of type B. The probability of this event in a time interval dt is given by

$P(a - 1, t + dt \mid a, t) = a \cdot k \cdot dt$

a: number of molecules of type A

Deterministic, discrete time and state: e.g. Boolean network model. Presence (or activity) of B at time t+1 depends on presence (or activity) of A at time t:

$B(t + 1) = f(A(t))$
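The three descriptions of the single step A → B can be put side by side in a short sketch. It is illustrative only: the explicit Euler scheme, the fixed-step stochastic update and the choice of the identity for the Boolean function f are assumptions made here, as are all function names and parameter values.

```python
import math
import random


def ode_euler(a0, k, dt, steps):
    """Deterministic, continuous state: dB/dt = k*A, integrated by explicit Euler."""
    a, b = a0, 0.0
    for _ in range(steps):
        converted = k * a * dt
        a -= converted
        b += converted
    return a, b


def stochastic(a0, k, dt, steps, seed=0):
    """Probabilistic, discrete state: in each interval dt one molecule of A
    becomes B with probability a*k*dt (dt must be small enough that a*k*dt < 1)."""
    rng = random.Random(seed)
    a, b = a0, 0
    for _ in range(steps):
        if rng.random() < a * k * dt:
            a -= 1
            b += 1
    return a, b


def boolean_step(a_present):
    """Deterministic, discrete state: B(t+1) = f(A(t)), with f assumed to be identity."""
    return a_present


a_ode, b_ode = ode_euler(a0=1.0, k=1.0, dt=0.001, steps=1000)
a_sto, b_sto = stochastic(a0=100, k=1.0, dt=0.001, steps=1000)
```

The ODE run should end close to the analytic value exp(-k t), while the stochastic run gives a whole number of molecules that fluctuates from seed to seed around the same decay.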

Page 39:

Boolean Models (discrete, deterministic)

(George Boole, 1815-1864)

Each gene can assume one of two states: expressed (“1”) or not expressed (“0”)

Background:
- Not enough information for more detailed description
- Increasing complexity and computational effort for more specific models

Replacement of continuous functions (e.g. Hill function) by step function

Page 40:

Boolean Models

A Boolean network is characterized by
- the number of nodes (“genes”): N
- the number of inputs per node (regulatory interactions): k

The dynamics are described by rules:

“if input value/s at time t is/are ..., then output value at t+1 is ...”

Boolean networks always have a finite number of possible states and, therefore, a finite number of state transitions.

[Diagram: example network topologies on the nodes A, B, C, D, including a linear chain and a ring]

Page 41:

Boolean Models

Truth functions

One input (rule 1 is p, rule 2 is not p):

input p | output for rule
        |  0   1   2   3
   0    |  0   0   1   1
   1    |  0   1   0   1

Two inputs (rule 1 is And, rule 7 is Or, rule 8 is Nor):

input | output for rule
p  q  |  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
0  0  |  0   0   0   0   0   0   0   0   1   1   1   1   1   1   1   1
0  1  |  0   0   0   0   1   1   1   1   0   0   0   0   1   1   1   1
1  0  |  0   0   1   1   0   0   1   1   0   0   1   1   0   0   1   1
1  1  |  0   1   0   1   0   1   0   1   0   1   0   1   0   1   0   1

Example with nodes A and B: B(t+1) = not (A(t)), i.e. rule 2
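The rule numbers in the truth tables follow a simple pattern: reading a rule's output column from the top row to the bottom row gives the rule number in binary. The sketch below rebuilds any truth function from its number under that reading; the function name `truth_function` is hypothetical.

```python
from itertools import product


def truth_function(rule, n_inputs):
    """Map every input combination to the output bit of the numbered rule.

    Rows are ordered 00..0 to 11..1; the top row holds the most significant
    bit of the rule number, the bottom row the least significant bit.
    """
    rows = list(product((0, 1), repeat=n_inputs))
    n_rows = len(rows)
    return {row: (rule >> (n_rows - 1 - i)) & 1 for i, row in enumerate(rows)}


and_rule = truth_function(1, 2)   # output column 0 0 0 1
or_rule = truth_function(7, 2)    # output column 0 1 1 1
nor_rule = truth_function(8, 2)   # output column 1 0 0 0
not_rule = truth_function(2, 1)   # output column 1 0
```

With k inputs there are 2^k rows and therefore 2^(2^k) possible rules, which gives the 4 one-input and 16 two-input rules of the tables.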

Page 42:

Boolean Models

[Diagram: genes a, b, c, d are transcribed and translated into proteins A, B, C, D, which activate (+) or repress other genes; the wiring is summarised as a Boolean network on the nodes a, b, c, d]

Boolean network

a(t+1) = a(t)

b(t+1) = (not c(t)) and d(t)

c(t+1) = a(t) and b(t)

d(t+1) = not c(t)

State transitions (state abcd → successor state):

0000 → 0001    1000 → 1001
0001 → 0101    1001 → 1101
0010 → 0000    1010 → 1000
0011 → 0000    1011 → 1000
0100 → 0001    1100 → 1011
0101 → 0101    1101 → 1111
0110 → 0000    1110 → 1010
0111 → 0000    1111 → 1010

Steady state: 0101

Cycle: 1000 → 1001 → 1101 → 1111 → 1010 → 1000
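The four update rules are small enough to simulate directly. The following sketch encodes them with the state written as the tuple (a, b, c, d), tabulates all 16 transitions, and follows a trajectory until it repeats; the helper names `step` and `attractor` are hypothetical.

```python
from itertools import product


def step(state):
    """Apply the four update rules of the slide once; state is (a, b, c, d)."""
    a, b, c, d = state
    return (
        a,                        # a(t+1) = a(t)
        int((not c) and d),       # b(t+1) = (not c(t)) and d(t)
        int(a and b),             # c(t+1) = a(t) and b(t)
        int(not c),               # d(t+1) = not c(t)
    )


def attractor(state):
    """Iterate until a state repeats; return the steady state or cycle reached."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return seen[seen.index(state):]


# Full transition table for all 2^4 states.
transitions = {s: step(s) for s in product((0, 1), repeat=4)}

steady = attractor((0, 0, 0, 0))
cycle = attractor((1, 0, 0, 0))
```

Because a never changes, the state space splits into two halves: every state with a = 0 should end in the single steady state, and every state with a = 1 should end in the five-state cycle.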

Page 43:

Boolean Models

- The number of states is finite, $2^N$, as well as the number of state changes.
- The system may reach steady states or cycles.
- Not every state can be reached from every other state.
- The successor state is unique, the predecessor state is not.

Advantages: easy description with simple rules, no parameters, computationally not demanding

Drawbacks: no intermediate values

Page 44:

Description with Differential Equations

X + DNA → X-DNA (rate constant k1)

X-DNA → X + DNA (k-1)

Nucleic acids + DNA → mRNA + DNA (k1)

mRNA → Nucleic acids (k-1)

Amino acids + mRNA → Proteins + mRNA (k2)

Proteins → Amino acids (k-2)

$\frac{d[X\text{-}DNA]}{dt} = k_1 [X][DNA] - k_{-1} [X\text{-}DNA]$

$\frac{dS}{dt} = f(S)$

S: vector of concentrations
f: function(s), often non-linear
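As an illustration of how such an equation is used, the binding step X + DNA ⇌ X-DNA can be integrated numerically. This is a minimal sketch with explicit Euler steps; the function name, the step size and all parameter values are hypothetical and chosen only so that the result can be checked by hand.

```python
def simulate_binding(x0, dna0, k1, k_minus1, dt=0.001, t_end=20.0):
    """Integrate d[X-DNA]/dt = k1*[X]*[DNA] - k_-1*[X-DNA] with explicit Euler."""
    x, dna, bound = x0, dna0, 0.0
    for _ in range(int(round(t_end / dt))):
        rate = k1 * x * dna - k_minus1 * bound   # net formation rate of X-DNA
        bound += rate * dt
        x -= rate * dt                            # each complex consumes one X
        dna -= rate * dt                          # and one free DNA site
    return x, dna, bound


# Hypothetical values: one unit each of X and DNA, both rate constants equal to 1.
x, dna, bound = simulate_binding(x0=1.0, dna0=1.0, k1=1.0, k_minus1=1.0)
```

After a sufficiently long time the net rate vanishes, so k1·[X]·[DNA] = k-1·[X-DNA]; for these values that gives [X-DNA] = (3 − √5)/2 ≈ 0.382.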

Page 45:

Basic Elements of Biochemical Networks

[Diagram: network of the metabolites S1, S2, S3, S4 connected by the reactions v1 to v5]

[Equations: $dS_1/dt$, $dS_2/dt$, $dS_3/dt$, $dS_4/dt$ as functions of $S_1, \dots, S_4$ and the parameters $p_1, \dots, p_5$]

Initial conditions: S1[0] = 0, S2[0] = 0, S3[0] = 0, S4[0] = 1

Parameters: p1 = 1, p2 = 1, p3 = 1, p4 = 0.5, p5 = 0.5

[Plot: S[t] against Time (0 to 5) for S1, S2, S3, S4]

Systems equations

$\frac{dS_i}{dt} = \sum_{j=1}^{r} n_{ij} v_j$

r: number of reactions
S_i: metabolite concentrations
v_j: reaction rates
n_ij: stoichiometric coefficients

Network properties: $N = \{n_{ij}\}$ (dynamics, admissible steady state fluxes, conservation relations)

Individual reaction properties: $v = v(S, p)$ (kinetics)

Page 46:

ODE: concept of steady state

$\frac{dS}{dt} = 0$ or $N \, v(S, p) = 0$

• no change of concentrations
• but (usually) non-vanishing fluxes or rates

To restrict modeling to the main aspects, often the asymptotic behaviour of dynamic systems is analyzed (behavior after sufficiently long time). It may be
- oscillatory
- chaotic
- in many relevant situations the system will reach a steady state.

[Plot: variable against time]
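A small worked example may make the steady-state condition concrete. The two-metabolite chain below (constant influx, then two first-order steps) is hypothetical and not taken from the slides, as are the parameter names and values; it only shows that N·v = 0 can hold while every individual flux is non-zero.

```python
# Hypothetical pathway:  --v1--> S1 --v2--> S2 --v3-->
# Stoichiometric matrix N: one row per metabolite, one column per reaction.
N = [
    [1, -1, 0],   # S1 is produced by v1 and consumed by v2
    [0, 1, -1],   # S2 is produced by v2 and consumed by v3
]


def rates(s, p):
    """Reaction rates v(S, p): constant influx, then first-order steps."""
    s1, s2 = s
    return [p["v_in"], p["k2"] * s1, p["k3"] * s2]


def ds_dt(s, p):
    """Right-hand side dS/dt = N * v(S, p)."""
    v = rates(s, p)
    return [sum(n_ij * v_j for n_ij, v_j in zip(row, v)) for row in N]


params = {"v_in": 2.0, "k2": 4.0, "k3": 0.5}
# Setting each row of N*v to zero gives S1 = v_in/k2 and S2 = v_in/k3.
steady_state = [params["v_in"] / params["k2"], params["v_in"] / params["k3"]]
steady_fluxes = rates(steady_state, params)
```

At the steady state all three fluxes equal the influx, so material keeps flowing through the pathway although no concentration changes.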

Page 47:

Databases

GO (Gene Ontology) http://www.geneontology.org , functional description of gene products

KEGG (Kyoto Encyclopedia of Genes and Genomes) http://www.genome.ad.jp/kegg/ , reference knowledge base offering information about genes and proteins, biochemical compounds and reactions, and pathways

BRENDA (Comprehensive Enzyme Information System) http://www.brenda.uni-koeln.de , curated database containing functional data for individual enzymes

NCBI (National Center for Biotechnology Information) http://www.ncbi.nlm.nih.gov/ , provides several databases:
- molecular databases, with information about nucleotide sequences, proteins, genes, molecular structures, and gene expression
- taxonomy database: names and lineages of more than 130,000 organisms

SPAD (Signaling PAthway Database) http://www.grt.kyushu-u.ac.jp/spad/index.html , information about signaling pathways (schemes, links)

JWS Online, model database http://jjj.biochem.sun.ac.za/database/index.html , published models, implemented in Mathematica®. Models can be simulated.

Biomodels, model database http://www.biomodels.net/ , published models, implemented in SBML

Page 48:

Modeling Tools

•BALSA•BASIS•BIOCHAM•BioCharon•biocyc2SBML•BioGrid•BioModels•BioNetGen•BioPathway Explorer•Bio Sketch Pad•BioSens•BioSPICE Dashboard•BioSpreadsheet•BioTapestry•BioUML•BSTLab•CADLIVE•CellDesigner•Cellerator•CellML2SBML•Cellware•CL-SBML•COPASI

•Cytoscape•DBsolve•Dizzy•E-CELL•ecellJ•ESS•FluxAnalyzer•Fluxor•Gepasi•INSILICO discovery•JACOBIAN•Jarnac•JDesigner•JigCell•JWS Online•Karyote*•KEGG2SBML•Kinsolver*•libSBML•MathSBML•MesoRD•MetaboLogica•MetaFluxNet

•MMT2•Modesto•Moleculizer•Monod•Narrator•NetBuilder•Oscill8•PANTHER Pathway•PathArt•PathScout•PathwayLab•Pathway Tools•PathwayBuilder•PaVESy•PNK•Reactome•ProcessDB•PROTON•pysbml•PySCeS•runSBML•SBML ODE Solver•SBMLeditor

•SBMLmerge•SBMLR•SBMLSim•SBMLToolbox•SBToolbox•SBW•SCIpath•Sigmoid*•SigPath•SigTran•SIMBA•SimBiology•Simpathica•SimWiz•SmartCell•SRS Pathway Editor•StochSim•STOCKS•TERANODE Suite•Trelis•Virtual Cell•WinSCAMP•XPPAUT

http://sbml.org

Page 49:

Conclusions

• Mathematical models of cellular processes allow for a testable representation of experimental knowledge.

• Models clarify systemic and dynamic properties of the investigated object.

• Models allow simulating processes independent of the experiment.

• Modeling reveals regulatory properties of cellular networks. Osmostress response:
  - The role of channel Fps1 in osmoresponse
  - The ability to respond to repeated stimulation and the contribution of phosphatases
  - Feedback loops / signal integration and separation

• Models can have predictive value
  - Mutant phenotypes
  - Effect of intervention
  - Integration of external signals to cell cycle progression
  - Critical cell size for G1/S transition

Page 50:

Process of model development

- Analysis of the objects to be modeled
- Formulation of the scientific PROBLEMS
- Design of a simple model: as a “cartoon”, in mathematical terms
- Solving the respective (mathematical) problems
- Comparison of results with the real system (EXPERIMENT)
- Difference: iterative enhancement of the models (structure, parameters, …)

Distribution of molecules on both sides of a membrane

[Diagram: Ai and Ao on either side of the membrane]

dAi/dt = f(Ai, Ao, C, p)

If we did not make models, then we would not know why they are wrong

Page 51:

Modeling

Mathematical Models for Cellular Processes

[Diagram with the elements: Metabolic and Regulatory Networks; structural knowledge + experimental data; ODE systems; system analysis, simulation, parameter identification; system understanding + prediction]

Page 52:

Basic Elements of Biochemical Networks

Glucose 1-P → Glucose 6-P → Fructose 6-P

Metabolites: Glucose 1-P, Glucose 6-P, Fructose 6-P

Reactions: v1 (phosphoglucomutase), v2 (glucose-phosphate isomerase)

Design of structured metabolic models

1. Determination of system limits

G1P (extern) → G6P (system) → F6P (extern), with reactions v1 and v2

2. Balancing

Concentration change = Production - Degradation + Transport

$\frac{d[G6P]}{dt} = v_1 - v_2 + v_{Transport}$

3. Assignment of kinetics

Rate as function of concentrations and parameters

$v_1 = \frac{V_{max} \, [G1P]}{K_M + [G1P]}$
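The three design steps can be followed through for the G1P → G6P → F6P example in a short sketch. The assumptions are stated here and in the code: G1P is treated as an external metabolite held constant, the transport term is set to zero, both reactions get irreversible Michaelis-Menten rate laws, and every numeric parameter value is invented for illustration.

```python
def mm_rate(substrate, v_max, k_m):
    """Irreversible Michaelis-Menten rate law: v = Vmax * S / (KM + S)."""
    return v_max * substrate / (k_m + substrate)


def simulate_g6p(g1p, g6p0, p, dt=0.01, t_end=50.0):
    """Balance equation d[G6P]/dt = v1 - v2 (transport assumed to be zero).

    G1P lies outside the system boundary and is held constant; F6P is an
    external product and is not tracked.
    """
    g6p = g6p0
    for _ in range(int(round(t_end / dt))):
        v1 = mm_rate(g1p, p["vmax1"], p["km1"])   # phosphoglucomutase
        v2 = mm_rate(g6p, p["vmax2"], p["km2"])   # glucose-phosphate isomerase
        g6p += (v1 - v2) * dt                      # explicit Euler step
    return g6p


# Hypothetical parameters, chosen so the steady state can be checked by hand.
params = {"vmax1": 1.0, "km1": 1.0, "vmax2": 2.0, "km2": 1.0}
g6p_end = simulate_g6p(g1p=1.0, g6p0=0.0, p=params)
```

With these numbers v1 is fixed at 0.5, and v2 reaches 0.5 when [G6P] = 1/3, which is the level the simulation should settle at.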

Page 53:

Hypothesis Generation

- establish a mathematical model of the network
- define a performance function
- calculate parameters optimizing the performance function
- compare prediction with experimental data

Possible theoretical approaches:

Structure → Function: Modelling of Systems Dynamics

Function → Structure: Evolutionary Optimization

[Diagram labels: Homeostasis, Appropriate Response, Experimental data; Network, Control pattern, Parameters]

Page 54:

Model examples: Metabolism

In Vivo Analysis of Metabolic Dynamics in S. cerevisiae: M. Rizzi, M. Baltes, U. Theobald, M. Reuss. Biotechnol Bioeng. 55: 592-608, 1997.

Representation of metabolism in the KEGG database, www.kegg/kegg2.jp

Page 55:

Model examples: Signaling pathways

[Diagram: G-protein. A signal and the activated receptor Ra* drive the exchange between GDP-bound and GTP-bound G-protein; GTP is hydrolysed to GDP with release of P]

[Diagram: MAP K cascade. Signal, MAP KKKK; MAP KKK, MAP KKK-P, MAP KKK-PP; MAP KK, MAP KK-P, MAP KK-PP; MAP K, MAP K-P, MAP K-PP; each phosphorylation step converts ATP to ADP]

[Diagram: Phosphorelay system. Signal; A and A-P, B and B-P, C and C-P; ATP to ADP; release of P; rate constants k1 to k4]

Page 56:

Common properties

The cellular network has a high degree of connectivity.

The processes are reactions and molecular interactions:
- binding
- intramolecular transformations
- release

Differences in the modeling of different parts are due to appropriate approximations.

Page 57:

Concentrations

Signalling

Proteins (catalysts and substrates): low, ~ 100-300 nmol/L (~ 10^3-10^4 molecules per cell)

Metabolism

Enzymes: low

Metabolites: higher

ATP ~ 2 mmol/L

Page 58:

Network Characteristics

Signaling

Reactions can be
- catalysed by enzymes
- autocatalytic.

The network is given by the existing proteins and their interactions.

Metabolism

All reactions are catalysed by enzymes.

The network is determined by the existing enzymes (which do not necessarily interact).

Metabolites need not be there initially.

Page 59:

Network Characteristics

Signaling

[Diagram: MAP K, MAP K-P, MAP K-PP; each phosphorylation converts ATP to ADP, each dephosphorylation releases P]

State changes: change in phosphorylation states

Coding of information

But: conservation (MAPK + MAPK-P + MAPK-PP) in the considered time window

Metabolism

[Diagram: Glucose, Gluc 6-P, Fruc 6-P, Fruc 1,6-PP; the phosphorylation steps convert ATP to ADP]

Important feature: flux through the pathway, (final) transformation of metabolites

Phosphorylation: energy transfer

Page 60: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Rate Equations ... Are a Choice of the Modeler

Signaling

MAPK → MAPK-P (ATP → ADP), catalysed by MAPKK-PP

Catalyst and substrate have about the same concentration ($E \approx S$).

Binding is slow compared to intramolecular rearrangements.

First order kinetics (mass action kinetics):

$v = k \cdot E \cdot S, \quad k = k(\mathrm{ATP})$

Metabolism

Glucose → Gluc 6-P (ATP → ADP), catalysed by hexokinase (Mg2+)

Typical choice: Michaelis-Menten kinetics

$E + S \rightleftharpoons ES \rightarrow E + P$ (binding fast, conversion slow)

Requirement: $E \ll S$

$v = \frac{V_{max}\, S}{S + K_M}, \quad V_{max} = k_2\, E_{tot}$
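The two rate laws on this slide can be compared numerically. The following is a minimal sketch; the function names and all parameter values are illustrative assumptions and are not taken from the slides.

```python
def mass_action_rate(k, enzyme, substrate):
    """Mass action rate v = k * E * S (signalling-style choice)."""
    return k * enzyme * substrate


def michaelis_menten_rate(k2, e_tot, substrate, km):
    """Michaelis-Menten rate v = Vmax * S / (S + KM), with Vmax = k2 * E_tot."""
    v_max = k2 * e_tot
    return v_max * substrate / (substrate + km)


# Illustrative values only (arbitrary units).
K2, E_TOT, KM = 10.0, 0.1, 2.0

for s in (0.02, 2.0, 200.0):
    v = michaelis_menten_rate(K2, E_TOT, s, KM)
    print(f"S = {s:7.2f}  ->  v = {v:.4f}")
```

At S much smaller than KM the Michaelis-Menten rate grows almost linearly with S, at S = KM it equals half of Vmax, and at S much larger than KM it saturates near Vmax. The mass action rate has no such saturation, which is one reason the choice of rate law matters.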

Edda Klipp, Kinetic Modeling Group

Page 61: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Spatial effects

Signaling

„well stirred“ ???

Low number of molecules, highly organised complexes, often membrane-bound.

Spatial effects should be considered (problem with ODEs), at least as „compartmentalisation“.

Metabolism

„well stirred“

Molecules are considered to meet with probability according to their concentration (mass action).

Spatial effects usually neglected.

Edda Klipp, Kinetic Modeling Group

Page 62: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

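Temporal characterisation

Metabolism

Time constants for reactions, for $A \rightleftharpoons B$ with rate constants $k_+$ and $k_-$:

$\tau = \frac{1}{k_+ + k_-}$

Time constants for metabolites:

$\tau_i = \frac{S_i}{\sum_j n_{ij} v_j}$, with $n_{ij}$ - stoichiometric coefficients

Definition acc. to Llorens et al. 1998:

[equation: $T_c$ as a ratio of two time integrals from 0 to $\infty$, the numerator weighted by $t$; integrand not recoverable]

Signaling (Heinrich et al., 2002)

Transition time:

$\tau_i = \frac{\int_0^\infty t\, X_i(t)\, dt}{\int_0^\infty X_i(t)\, dt}$

Duration:

$\vartheta_i = \sqrt{\frac{\int_0^\infty t^2\, X_i(t)\, dt}{\int_0^\infty X_i(t)\, dt} - \tau_i^2}$

Amplitude:

$S_i = \frac{\int_0^\infty X_i(t)\, dt}{2\, \vartheta_i}$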
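The three signalling measures named on this slide (transition time, duration, amplitude) can be computed numerically from a sampled time course. The sketch below uses the definitions usually attributed to Heinrich et al. 2002: the transition time is the first moment of X(t) divided by its integral, the duration is the square root of the second moment ratio minus the squared transition time, and the amplitude is the integral divided by twice the duration. The function name, the trapezoidal integration and the exponential test signal are illustrative assumptions.

```python
import math


def signal_measures(times, values):
    """Return (transition_time, duration, amplitude) for a sampled signal."""

    def integrate(samples):
        # Trapezoidal rule over the sampled time points.
        total = 0.0
        for k in range(len(times) - 1):
            dt = times[k + 1] - times[k]
            total += 0.5 * dt * (samples[k] + samples[k + 1])
        return total

    area = integrate(values)
    first_moment = integrate([t * x for t, x in zip(times, values)])
    second_moment = integrate([t * t * x for t, x in zip(times, values)])

    tau = first_moment / area
    theta = math.sqrt(second_moment / area - tau ** 2)
    amplitude = area / (2.0 * theta)
    return tau, theta, amplitude


# Test signal: exponential decay X(t) = exp(-lam * t), sampled on [0, 100].
lam = 0.5
times = [0.01 * k for k in range(10001)]
values = [math.exp(-lam * t) for t in times]

tau, theta, amplitude = signal_measures(times, values)
print(tau, theta, amplitude)
```

For this exponential test signal the integrals can be done by hand: the transition time and the duration both equal 1/lam, and the amplitude equals 0.5, which gives a quick check of the numerical result.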

Edda Klipp, Kinetic Modeling Group

Page 63: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Conclusions

Models for metabolism and signaling can use the same design principles.

Metabolism and signaling may take place in
- different areas of the cell
- different regions of the concentration space
- different time scales

Signaling models have to account for the hierarchy in the system

Regulatory couplings (feedback) distribute control in both cases.

Edda Klipp, Kinetic Modeling Group

Page 64: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

EXAMPLE TGFbeta signal transduction:

the SMAD engine

Page 65: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Overview of the pathway

• Ligand dimer binds to receptor heterotetramer (type I and II receptors, both Ser/Thr kinases)

• r-SMAD1/5/8 versus r-SMAD 2/3

• Phosphorylated r-SMAD binds SMAD4 and travels to the nucleus

• Ubiquitylation (SMURF1-dependent and independent)

Page 66: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

LET'S LOOK UP THE TGFbeta PATHWAY!

www.reactome.org

Page 67: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Example: Vilar et al. 2006, PLoS Computational Biology

Signal Processing in the TGFbeta Superfamily Ligand/Receptor Network

Page 68: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

From Vilar et al. 2006, 2:1 0036-0045, PLoS Computational Biology

14 ligands, 5 type II and 7 type I receptors – this results in 50 different ligand/receptor complexes

Figure 2

Page 69: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Unusual features of the TGFbeta pathway

Simple core transduction engine (two SMAD channels: 2/3 and 1/5/8) but very complex, diverse responses (42 ligands, 5 type II and 7 type I receptors, 300 target genes)

Receptors are constitutively internalised and recycled – only approx. 10% are present on the plasma membrane at any time

Comparatively late activation peak: approx. 60 minutes (compare with EGFR, only 5 minutes)

Several negative feedback loops, including:
- constitutive degradation
- ligand-induced degradation (Smad7-Smurf2)

Page 70: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

From Vilar et al. 2006, 2:1 0036-0045, PLoS Computational Biology

Ki = 1/3 min

30 min

60 min

Klid = 1/4 min

Figure 3

Page 71: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Sources of experimental data

Mitchell H, Choudhury A, Pagano RE, Leof EB. Ligand-dependent and -independent transforming growth factor-beta receptor recycling regulated by clathrin-mediated endocytosis and Rab11. Mol Biol Cell, 2004, 15: 4166-4178:

• Recycling rate – Figure 3 (approx. 30 min)
• Internalisation rate – Figure 4

Di Guglielmo GM, Le Roy C, Goodfellow AF, Wrana JL. Distinct endocytic pathways regulate TGF-beta receptor signalling and turnover. Nat Cell Biol, 2003, 5: 410-421:

• Internalisation rate – Table 1 - receptors are internalised through the clathrin pathway and lipid-caveolar compartments with similar rates

• Degradation rate – Figure 3 – approx. 400 min

Page 72: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Figure 3, Mitchell et al.

Figure 3. TGF-beta receptors recycle at the same rate in the presence and absence of ligand. (A) Mb202 1-18 cells were processed for imaging and fluorescence quantitation as in Figure 2, B and C, except 10 ng/ml GM-CSF was included in both incubations. Bar, 10 µm. (B) Cultures were labeled with 125I-Fab anti-GM-CSF receptor- for 2 h at 4°C in the presence ( ) or absence ( ) of 10 ng/ml GM-CSF. After washing and incubation at 37°C for 30 min (in the presence or absence of 10 ng/ml GM-CSF), labeled receptor antibody was removed by acid wash and the cultures returned to 37°C. (…) Results are expressed as percentage of the total cell-associated radioactive counts after the first acid strip and before further incubation at 37°C, and indicate the mean ± SD of two experiments done in duplicate.

Page 73: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Figure 4, Mitchell et al.

• Figure 4. TGF-beta receptors internalize at the same rate regardless of activation state. Mb202 1-18 cells were prebound with radiolabeled antibody in the presence ( ) or absence ( ) of 10 ng/ml GM-CSF as in Figure 3B and then incubated at 37°C for the indicated times. Surface antibody was removed by acid treatment at 4°C, after which cells were processed to determine internalized radioactivity (see Materials and Methods). Results are expressed as percentage of total cell-associated radioactive counts before incubation at 37°C and indicate the mean ± SD of two experiments done in duplicate.

Page 74: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Table 1, Di Guglielmo et al. Quantitation of TGF-beta receptor distribution by

immunoelectron microscopy

Page 75: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

From Vilar et al. 2006, 2:1 0036-0045, PLoS Computational Biology

Figure 4

Page 76: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Plasma membrane concentrations

[lRiRii] - ligand/heterotetramer receptor complex
[l] - ligand
[Ri] - receptor type i
[Rii] - receptor type ii

kα - ligand/receptor complex formation rate
kcd - constitutive degradation rate
klid - ligand induced degradation rate
ki - internalisation rate

Page 77: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Endosomal concentrations

[lRiRii] - ligand/heterotetramer receptor complex
[Ri] - receptor type i
[Rii] - receptor type ii

ki - internalisation rate
kr - recycling rate
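How the listed species and rates could fit together is sketched below as a small simulation. The equations in this sketch are an assumed form, not a transcription of the model on the slides: complex formation on the membrane proportional to [l][Ri][Rii], internalisation at ki, recycling from the endosome at kr, constitutive degradation kcd on the membrane, ligand-induced degradation klid acting on the membrane complex, and a constant receptor production rate. Only ki = 1/3, kr = 1/30 and klid = 1/4 (read here as per-minute rates) appear on the slides; ka, kcd, the production rate, the function name and the forward-Euler scheme are illustrative assumptions.

```python
def simulate_trafficking(ligand, minutes=120.0, dt=0.01,
                         ka=0.01, kcd=1 / 36, klid=1 / 4,
                         ki=1 / 3, kr=1 / 30, production=1.0):
    """Forward-Euler integration of an assumed receptor trafficking model.

    Returns a list of (time, surface_complex, endosomal_complex,
    surface_receptor, endosomal_receptor) tuples. Type i and type ii
    receptors are treated symmetrically, so one value stands for both.
    """
    # Start from the ligand-free steady state of the assumed equations.
    r_surf = production / kcd
    r_endo = ki / kr * r_surf
    c_surf, c_endo = 0.0, 0.0

    history = []
    for step in range(int(round(minutes / dt)) + 1):
        history.append((step * dt, c_surf, c_endo, r_surf, r_endo))

        formation = ka * ligand * r_surf * r_surf  # [l][Ri][Rii] with Ri = Rii
        d_c_surf = formation - (kcd + klid + ki) * c_surf
        d_c_endo = ki * c_surf - kr * c_endo
        # Recycled complexes are assumed to return free receptors to the surface.
        d_r_surf = (production - formation - (kcd + ki) * r_surf
                    + kr * r_endo + kr * c_endo)
        d_r_endo = ki * r_surf - kr * r_endo

        c_surf += dt * d_c_surf
        c_endo += dt * d_c_endo
        r_surf += dt * d_r_surf
        r_endo += dt * d_r_endo
    return history


resting = simulate_trafficking(ligand=0.0)
_, _, _, r_s, r_e = resting[-1]
surface_fraction = r_s / (r_s + r_e)  # kr / (ki + kr) in this sketch

stimulated = simulate_trafficking(ligand=1.0)
peak_time, peak_level = max(((t, ce) for t, _, ce, _, _ in stimulated),
                            key=lambda pair: pair[1])
print(f"surface fraction without ligand: {surface_fraction:.3f}")
print(f"endosomal complex peaks at t = {peak_time:.1f} min")
```

With ki = 1/3 and kr = 1/30 the ligand-free surface fraction in this sketch is kr / (ki + kr) = 1/11, which is in line with the roughly 10% surface pool quoted earlier in the deck. The timing of the endosomal peak depends on the assumed parameters and should not be read as a reproduction of the published figures.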

Page 78: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

From Vilar et al. 2006, 2:1 0036-0045, PLoS Computational Biology

Late and long

Slower rates for internalisation and recycling: ki = 1/10 min, kr = 1/100 min

Figure 5

Page 79: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

From Vilar et al. 2006, 2:1 0036-0045, PLoS Computational Biology

CIR makes the difference

Figure 6

Page 80: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

PART 3: PHYLOGENETICS

Page 81: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

WHAT I WILL TALK ABOUT

• A BIT OF THEORY

• EXAMPLES (CRISPs AND SMADs)

• MEGA PACKAGE - HOWTO

• INTERPRETATIONS (what can a simple BLAST search, multiple sequence alignment, or a tree, tell me about BIOLOGY)

Page 82: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

First Things First (definitions)

• Phylogenetic analysis
• Phylogenetic tree
  – rooted
  – unrooted
• Homology
  – paralogy
  – orthology
    • one-to-one
    • co-orthology
• Nucleotide substitutions
  – synonymous
  – non-synonymous

Page 83: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

A phylogenetic analysis of a family of related nucleic acid or protein sequences is a determination of how the family members might have been derived during evolution.

Phylogenetic tree – a graphical representation that depicts evolutionary relationships between a set of related sequences. Most-alike sequences are placed at the outer ends of two branches that are joined below into a lower common branch, representing their derivation from an ancestral sequence. An unrooted tree does not provide information on the common ancestor of the group.

What is phylogenetics?

Page 84: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

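The simplest tree

[Figure: two rooted trees drawn along an axis of evolutionary time. Species tree: Ancestral species, Species A, Species B. Gene tree: Ancestral gene, Gene A, Gene B. Labelled parts: branches, node, root.]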

Page 85: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Homologs. Genes whose sequences are so similar that they almost certainly arose from a common ancestor gene

(1) Orthologs are genes in different species that arose from a single gene in the most recent common ancestor of those species – that is, by a process of speciation

(2) Paralogs, on the other hand, are genes in the same species that arose from a single gene in an ancestral species by a process of duplication

Who is Who of -ologs

Page 86: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

[Figure: gene tree drawn along an axis of evolutionary time. Root: Ancestral gene. Leaves: Gene A1, Gene A2, Gene A2b, Gene B1, Gene B2. Marked relationships: paralogs, 1:1 orthologs, co-orthologs.]

Page 87: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Non-synonymous substitution – a nucleotide substitution that results in an amino acid change (dn)

Synonymous substitution – a ”silent” nucleotide substitution, often in the third codon position, that does not result in an amino acid change (ds)

dn/ds – the simplest test for the rate of evolution: dn/ds < 1 (purifying selection), dn/ds = 1 (neutral evolution), dn/ds > 1 (positive selection)

Synonymous or non-synonymous?
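The distinction between the two kinds of substitution can be made concrete with the standard genetic code. The sketch below classifies codon differences between two aligned coding sequences and returns raw counts. It is an illustration only: the function names and example sequences are hypothetical, and raw counts are not dn/ds, which additionally requires normalising by the numbers of synonymous and non-synonymous sites and correcting for multiple hits.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify(codon_a, codon_b):
    """Classify a codon pair as identical, synonymous or non-synonymous."""
    if codon_a == codon_b:
        return "identical"
    if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
        return "synonymous"
    return "non-synonymous"


def count_differences(seq_a, seq_b):
    """Return (synonymous, non_synonymous) codon differences.

    Both sequences must be aligned, in frame, gap-free and of equal length.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    synonymous = non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        kind = classify(seq_a[i:i + 3], seq_b[i:i + 3])
        if kind == "synonymous":
            synonymous += 1
        elif kind == "non-synonymous":
            non_synonymous += 1
    return synonymous, non_synonymous


# Hypothetical example: ATG CTT GAA versus ATG CTC GAT.
print(count_differences("ATGCTTGAA", "ATGCTCGAT"))
```

In the example, CTT and CTC both encode leucine, so the third-position change is silent, while GAA (glutamate) to GAT (aspartate) changes the amino acid.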

Page 88: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

EXAMPLE

cysteine-rich secretory proteins (CRISPs)

Page 89: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

There are three CRISP genes in human, rat and mouse. However, their nomenclature is misleading

• None of the genes are simple one-to-one orthologs

• A single ancestral gene at the base of the vertebrate lineage was most likely subject to two rounds of gene duplication before the human/rodent split, but the picture is complicated by species-specific duplications and lineage-specific losses

• A surprisingly high number of changes in gene expression patterns have occurred during the evolution of the CRISP family. For detailed discussion, please see: (Huminiecki and Wolfe, Genome Research, 2004)

Page 90: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.
Page 91: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

EXAMPLE TGFbeta signal transduction:

the SMAD engine

Page 92: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Overview of the pathway

• Ligand dimer binds to receptor heterotetramer (type I and II receptors, both Ser/Thr kinases)

• r-SMAD1/5/8 versus r-SMAD 2/3

• Phosphorylated r-SMAD binds SMAD4 and travels to the nucleus

• Ubiquitylation (SMURF1-dependent and independent)

Page 93: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Interesting phylogenetic phenomena

• DPP/BMP Type-1 receptor and an r-SMAD found in non-bilaterian cnidarian (Acropora millepora) – has the pathway evolved in a context other than dorsoventral patterning?

• Two SMAD4 in frogs: XSMAD4α and XSMAD4β. Also worms could have two co-SMADs (Sma-4 and Daf-3) but only one SMAD4 expected in mammals!

Page 94: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

What is the ancestral SMAD?

• Hypothesis: an ancestral SMAD – CoRe-SMAD – worked as a homodimer. The gene duplicated and gave rise to an r-SMAD and a co-SMAD

• But where did the i-SMADs come from?
  – i-SMADs evolve faster (evidence: average dn/ds, length of protein branches, missing phosphorylation motif, and L3 sequence not conserved between DAD and i-SMAD6, 7);
  – (((mad, dsmad2), medea),dad)
  – (((((SMAD1,SMAD5), SMAD9),SMAD2, SMAD3), SMAD4), SMAD6, SMAD7)
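The two bracketed topologies above are written in a Newick-like notation, in which each pair of parentheses encloses the descendants of one internal node. A minimal sketch of reading such a string into nested lists follows. The parser is a hypothetical helper written for this illustration; it handles names only, with no branch lengths or support values.

```python
def parse_newick(text):
    """Parse a label-only Newick-style string into nested lists of names."""
    text = text.replace(" ", "").rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
            return children
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]

    tree = parse_node()
    if pos != len(text):
        raise ValueError("unexpected trailing characters")
    return tree


def leaves(node):
    """Return all leaf names below a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    names = []
    for child in node:
        names.extend(leaves(child))
    return names


fly = parse_newick("(((mad, dsmad2), medea),dad)")
vertebrate = parse_newick(
    "(((((SMAD1,SMAD5), SMAD9),SMAD2, SMAD3), SMAD4), SMAD6, SMAD7)")
print(fly)
print(leaves(vertebrate))
```

Read this way, the fly topology groups mad with dsmad2 first, then joins medea, with dad branching off outside that group.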

Page 95: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

[Figure: neighbour joining tree, amino-acid PAM matrix; scale bar 0.5. Labels, in order of appearance: vertebrate SMAD1,5,9 and D. melanogaster Mad; vertebrate receptor SMADs and D. melanogaster dSMAD2; sma-2; daf-8; sma-3; daf-3; sma-4; vertebrate co-SMADs and D. melanogaster Med (Medea, dSMAD4); daf-14; tag-68; D. melanogaster Dad; vertebrate SMAD7; vertebrate SMAD6.]

Fascinating C. elegans SMADs

Page 96: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Positive selection in sma/daf branches?

Sma genes control body size, while daf genes control dauer formation. Lengths of protein branches suggested that daf genes underwent a period of very fast protein evolution. Could it be positive selection in response to environmental change? dn/ds test positive!

Daf      corresponding SMAD   evidence
Daf-3    co-SMAD(?)           nj_PAM, newfeld2_MH1_ml
         i-SMAD               newfeld2_p-loop_degenerate
Daf-8    r-SMAD               nj_PAM, newfeld2_p-loop
Daf-14   co-SMAD(?)           nj_PAM
         co-SMAD              newfeld2_p-loop(2S)
Tag-68   i-SMAD               nj_PAM

Page 97: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Interpretations of phylogenies

How could all this help in my project?

I will propose just a few ideas – please, join in, voice your suggestions, discuss your favourite gene family!!!

Page 98: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Application 1: ”Evolutionary Saga”, or my gene family over the eons

Is the gene family present in bacteria, yeast, plants, non-bilaterian animals? To find out, just run a BLAST search against GenBank and read the names of the species with hits. Can one infer from this how old the family is?

How many gene duplication events were there, and when did they occur? Have there been any deletions? Has the intron number changed, or are there no introns (suggestive of retroposition)?

Can these events be correlated with the development of a new body plan, new organs, or novel physiology? Is this correlation supported by the sites of expression?

Page 99: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

[Figure: metazoan phylogeny. Taxa: Porifera (sponges); Cnidaria (jellyfish, coral); Flatworms; Molluscs (gastropods); Annelids (leeches); Arthropods (insects); Vertebrates (fish, birds, mammals); Urochordates; Cephalochordates; Hemichordates; Echinoderms (sea urchins, starfish); Nematodes (?). The Bilateria clade is marked. Annotations on the tree: Wnt, TGFbeta, FGF, 2R?]

Page 100: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Expansion of the signal transduction toolkit

           Cnidaria and porifera   C. elegans   Drosophila   Human
TGFbeta    1(?)                    4            -            27
Wnt        >1                      5            7            18
FGF        -                       1            1            23

Increased anatomical complexity (diversification of body plans and body parts)

Page 101: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Application 2: ”My Gene and the Genome”, or how does my favourite gene compare to other members of the gene family?

How many related genes, how similar, and in what physical location in the genome (most duplications are tandem, head-to-tail)

Evidence for functional redundancy? (important for knockouts)

Tissue-specific expression patterns, or do they overlap (expression.gnf.org)?

Genomic context (www.ensembl.org)

Page 102: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Application 3: ”Special Sites in my Gene”

Multiple sequence alignment:
- regions of conservation
- regions of change

Important for the design of my next deletion mutant, hybridization probe, or a set of primers

Visual inspection of the multiple sequence alignment will be sufficient in most cases (check out Pfam or ENSEMBL for precomputed alignments of your favourite family – www.ensembl.org)

Page 103: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Reference: Bioinformatics: Sequence and Genome Analysis

David W. Mount

CSHL lab manual series

Great introduction to the field

Page 104: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Reference: Molecular Evolution and Phylogenetics

Masatoshi Nei, Sudhir Kumar

Nuts and bolts of tree drawing methods

Page 105: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

Reference: From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design

S. Carroll, J. Grenier, S. Weatherbee

Interpretations

Page 106: SYSTEMS BIOLOGY Lukasz Huminiecki, DPhil Nobel medical institute, Karolinska, Stockholm & Ludwig Institute for Cancer Research, Uppsala.

THANK YOU!