Biological Networks and their Evolution -...
Transcript of Biological Networks and their Evolution -...
RNA dynamics and Biomolecular Systems Lab, CNRS−UMR168
Hervé ISAMBERT
Institut Curie, Paris, France
Biological Networks and their Evolution
2nd IMSc Complex Systems School, The Institute of Mathematical Sciences, Jan 14 & 15, 2010
I. Introduction: from Genomes to Networks
III. Properties of Biological Networks
II. Different Types of Biological Networks
IV.Network Evolution: from Genes to Organisms
V. Models of Biological Network Evolution
Biological Networks and their Evolution
Hervé ISAMBERT, Institut Curie, Paris
Biological Networks and their Evolution
Hervé ISAMBERT, Institut Curie, Paris
III. Properties of Biological Networks
II. Different Types of Biological Networks
IV.Network Evolution: from Genes to Organisms
V. Models of Biological Network Evolution
I. Introduction: from Genomes to Networks
Biological Networks and their Evolution
Hervé ISAMBERT, Institut Curie, Paris
I. Introduction: from Genomes to Networks
III. Properties of Biological Networks
IV.Network Evolution: from Genes to Organisms
V. Models of Biological Network Evolution
II. Different Types of Biological Networks
Biological Networks and their Evolution
Hervé ISAMBERT, Institut Curie, Paris
I. Introduction: from Genomes to Networks
II. Different Types of Biological Networks
IV.Network Evolution: from Genes to Organisms
V. Models of Biological Network Evolution
III. Properties of Biological Networks
... except in the light of evolution""Nothing in biology makes sense...
but is it for FUNCTIONAL reason ??
or is it because they CANNOT be randomby CONSTRUCTION ??
Then what sense does it make to compare them to randomized graphs ??
Very Preliminary Conclusion
Theodosius Dobzhansky (1973)
Biological Networks are NOT Random Graphs !
Biological Networks and their Evolution
Hervé ISAMBERT, Institut Curie, Paris
I. Introduction: from Genomes to Networks
III. Properties of Biological Networks
II. Different Types of Biological Networks
V. Models of Biological Network Evolution
IV.Network Evolution: from Genes to Organisms
Mechanisms of Genome Evolution
Long suspected... recently proved!
shuffling of protein domains
Implies important genetic modifications!!
Mechanisms of Genome Evolution
shuffling of protein domains
Kellis et al. 2004Whole Genome Duplication in Yeast Genome
Kellis et al. 2004Whole Genome Duplication in Yeast Genome
bird
s 1
0,00
0
rept
iles
8,0
00di
nosa
urs
dino
saur
s,
m
amm
als
5,5
00
−20−30
arch
eaba
cter
ia>
10,0
00,0
00 ?
?
x 2x 2
bony fish 24,000tetrapods 28,000
othe
rs (
prot
ists
) 3
0,00
0
plan
ts
330
,000
S. cerev.K. lactis
flowering275,000
eukaryotes
Paramecium
x 2
x 2
A. thalianaP. trichocarpa
x 2
x 2
Carp
X. laevis
x 2
vertebrates 52,000
amph
ibia
6,
000
B. napus
x 2
othe
r 5
5,00
0
x 2TetraodonX. tropicalis
x 2
x 2
−150
−250
−300
−350
viru
ses−1,000
MYA
−450
−500
LampreyLanceletBdelloid T. barreraeH. sapiensNadisBulinus M.incognita
∗< −3,500 ??
chordates
x 2
fung
i 1
00,0
00
sea
urch
in,
star
fish
6,0
00
spon
ges
7,0
00
jelly
fish,
ane
mon
es 1
0,00
0
x 2
jaw
less
fish
100
ceph
aloc
hord
ates
30
spid
ers,
cra
bs,
inse
cts,
800
,000
x 2ro
tifer
a >
2,20
0 nem
atod
a >
25,0
00
x 2x 2
mol
lusc
a 5
0,00
0
bilateria
animals 1,200,000
Whole Genome Duplications in Evolution
−Population bottleneck−Speciation eventsWhole Genome Duplications promote:
x 2x 2
1+1+1
bird
s 1
0,00
0
rept
iles
8,0
00di
nosa
urs
dino
saur
s,
m
amm
als
5,5
00
−20−30
arch
eaba
cter
ia>
10,0
00,0
00 ?
?
x 2x 2
bony fish 24,000tetrapods 28,000
othe
rs (
prot
ists
) 3
0,00
0
plan
ts
330
,000
S. cerev.K. lactis
flowering275,000
eukaryotes
Paramecium
x 2
x 2
A. thalianaP. trichocarpa
x 2
x 2
Carp
X. laevis
x 2
vertebrates 52,000
amph
ibia
6,
000
B. napus
x 2
othe
r 5
5,00
0
x 2TetraodonX. tropicalis
x 2
x 2
−150
−250
−300
−350
viru
ses−1,000
MYA
−450
−500
LampreyLanceletBdelloid T. barreraeH. sapiensNadisBulinus M.incognita
∗< −3,500 ??
chordates
x 2
fung
i 1
00,0
00
sea
urch
in,
star
fish
6,0
00
spon
ges
7,0
00
jelly
fish,
ane
mon
es 1
0,00
0
x 2
jaw
less
fish
100
ceph
aloc
hord
ates
30
spid
ers,
cra
bs,
inse
cts,
800
,000
x 2ro
tifer
a >
2,20
0 nem
atod
a >
25,0
00
x 2x 2
mol
lusc
a 5
0,00
0
bilateria
animals 1,200,000
Whole Genome Duplications in Evolution
−Population bottleneck−Speciation eventsWhole Genome Duplications promote:
x 2x 2
1+1+1
Evolution by Duplication Divergence
Evolution by Duplication Divergence
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istr
ibut
ion
Number of shared partners
Nod
e p
air
dis
trib
utio
n w
ith s
hare
d p
artn
ers
WGD dupli > 20 x random pairs
WGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
Yeast 6000 genes : 4000 in PPI network
PPI network
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istr
ibut
ion
Number of shared partners
Nod
e p
air
dis
trib
utio
n w
ith s
hare
d p
artn
ers
WGD dupli > 20 x random pairs
WGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
Yeast 6000 genes : 4000 in PPI network
PPI network
NetworkDuplicationunder WGD
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istr
ibut
ion
Number of shared partners
Nod
e p
air
dis
trib
utio
n w
ith s
hare
d p
artn
ers
WGD dupli > 20 x random pairs
WGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
PPI network
Yeast 6000 genes : 4000 in PPI network
Divergence
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istr
ibut
ion
Number of shared partners
Nod
e p
air
dis
trib
utio
n w
ith s
hare
d p
artn
ers
WGD dupli > 20 x random pairs
WGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
PPI network
Yeast 6000 genes : 4000 in PPI network
Divergence
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istr
ibut
ion
Number of shared partners
Nod
e p
air
dis
trib
utio
n w
ith s
hare
d p
artn
ers
WGD dupli > 20 x random pairs
WGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
PPI network
Yeast 6000 genes : 4000 in PPI network
"Rewiring" ??
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istr
ibut
ion
Number of shared partners
Nod
e p
air
dis
trib
utio
n w
ith s
hare
d p
artn
ers
WGD dupli > 20 x random pairs
WGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
PPI network
Yeast 6000 genes : 4000 in PPI network
Shared Partners
Biological Networks and their Evolution
Hervé ISAMBERT, Institut Curie, Paris
I. Introduction: from Genomes to Networks
III. Properties of Biological Networks
II. Different Types of Biological Networks
IV.Network Evolution: from Genes to Organisms
V. Models of Biological Network Evolution
Hervé Isambert, UMR168, Institut Curie, Section de Recherche, Paris
Transcription Networks:
PNAS, 104, 5516−5520 (2007)
Marco Cosentino Lagomarsino
Mol BioSyst, 5, 170−179 (2009)
BMC Syst Biol 1:49 (2007)
Kirill Evlampiev
PNAS, 105, 9863−9868 (2008)
Prot−Prot Interaction Networks:
Human Frontier, Ministere de la Recherche, ANR
Institut Curie, CNRS, Fondation PG de Gennes(from tolweb site)
Tree of Life
Richard Stein
Signaling Networks:
Evolution and Topology of Large Biological Networks
Biol Direct, 4, 28 (2009)
time−lineargrowth
Evolution of PPI Networks
growthexponentialEvlampiev, Isambert BMC Syst. Bio. (2007)
Evlampiev, Isambert PNAS USA (2008)
time−lineargrowth
Evolution of PPI Networks
growthexponentialEvlampiev, Isambert BMC Syst. Bio. (2007)
Evlampiev, Isambert PNAS USA (2008)
single
Symmetric divergence
new
dupli
Duplication
Partial Genome
interaction
proteins
n=1
General Duplication−Divergence Model of PPI Network Evolution
4’
41 3
2
2’1−q q
1 2 3 4
Genome
P−P Interaction Network
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev1
3
2’ 2
4’4
1
4
2
3
General Duplication−Divergence Model of PPI Network Evolution
γon
γsn
soγ
γss
γ
γoo
nn
n=1
Duplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
1 2 3 4
K. Evlampiev
Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
1
4’4
3
2’ 21
4
2
3
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
IterationDuplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
General Duplication−Divergence Model of PPI Network Evolution
1 2 3 4
Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
1
4’4
3
2’ 21
4
2
3
Overview of the Evolutionary Regimes
Overview of the Evolutionary Regimes
with respect to p(k)
Overview of the Evolutionary Regimes
Overview of the Evolutionary Regimes
Overview of the Evolutionary Regimes
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
IterationDuplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
General Duplication−Divergence Model of PPI Network Evolution
1 2 3 4
Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
1
4’4
3
2’ 21
4
2
3
M’ >1 Scale−freeM’ <1 Exponential
3Network Topology Index iΓ
i=s,o,nM’= max( ) > M
Γik
iio
γin
γΓio
ns
kq
i
1−q
i=s,o,n
= (1−q) + q( + )isγNode Degree Growth Rate
Γ >1 Growing NetworkΓ <1 Vanishing NetworksΓ nΓoΓΓ = (1−q) + q + q Network Link Growth Rate
<1 Nonconserved NetworkM>1 Conserved NetworkM
sΓ
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
IterationDuplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
1
2
= (1−q) + q M oΓ < ΓProtein Conservation Index
General Duplication−Divergence Model of PPI Network Evolution
1 2 3 4
Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
1
4’4
3
2’ 21
4
2
3
ΓM < 1 < Γ1 < M <
10
11
1
Expanding PPI Networks
Non−Conserved Networks Conserved Networks M
Γ
Γ
1
n oo
s
s
s
o
o
o
oo
s
o
s
o
o
s s
n
Network Conservation (M) and Expansion ( )Γ
2
0
1
3
4
5
6
7
8
9
duplication
Vanishing
M < < 1
ΓM < 1 < Γ1 < M <
10
11
1
Expanding PPI Networks
Non−Conserved Networks Conserved Networks M
Γ
Γ
1
n oo
s
s
s
o
o
o
oo
s
o
s
o
o
s s
n
Network Conservation (M) and Expansion ( )Γ
2
0
1
3
4
5
6
7
8
9
duplication
Vanishing
M < < 1
ΓM < 1 < Γ1 < M <
10
11
1
Expanding PPI Networks
Non−Conserved Networks Conserved Networks M
Γ
Γ
1
n o
s
s
s
o
o
o
oo
s
Network Conservation (M) and Expansion ( )Γ
2
0
1
3
4
5
6
7
8
9
duplication
Vanishing
M < < 1
ΓM < 1 < Γ1 < M <
10
11
1
Expanding PPI Networks
Non−Conserved Networks Conserved Networks M
Γ
Γ
1
Network Conservation (M) and Expansion ( )Γ
2
0
1
3
4
5
6
7
8
9
duplication
Vanishing
M < < 1
M’ >1 Scale−freeM’ <1 Exponential
3Network Topology Index iΓ
i=s,o,nM’= max( ) > M
Γik
iio
γin
γΓio
ns
kq
i
1−q
i=s,o,n
= (1−q) + q( + )isγNode Degree Growth Rate
Γ >1 Growing NetworkΓ <1 Vanishing NetworksΓ nΓoΓΓ = (1−q) + q + q Network Link Growth Rate
<1 Nonconserved NetworkM>1 Conserved NetworkM
sΓ
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
IterationDuplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
1
2
= (1−q) + q M oΓ < ΓProtein Conservation Index
General Duplication−Divergence Model of PPI Network Evolution
1 2 3 4
Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
1
4’4
3
2’ 21
4
2
3
M’ >1 Scale−freeM’ <1 Exponential
3Network Topology Index iΓ
i=s,o,nM’= max( ) > M
Γik
iio
γin
γΓio
ns
kq
i
1−q
i=s,o,n
= (1−q) + q( + )isγNode Degree Growth Rate
Γ >1 Growing NetworkΓ <1 Vanishing NetworksΓ nΓoΓΓ = (1−q) + q + q Network Link Growth Rate
<1 Nonconserved NetworkM>1 Conserved NetworkM
sΓ
(M’ > M > 1)necessaryConserved Networks are also Scale−free
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
IterationDuplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
1
2
= (1−q) + q M oΓ < ΓProtein Conservation Index
General Duplication−Divergence Model of PPI Network Evolution
1 2 3 4
Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
1
4’4
3
2’ 21
4
2
3
γss
soγ
γsn
γoo γ
on
γnn
of Linkswith proba
Deletion
1−γ1−µ
proteins
k k+1
Duplication−Divergence Models Including Self−links
interaction
µnµ
s
µon
µo
Iteration
Duplication
Partial Genome
Self−links leads to time−linear (not exponential) connectivity expansionk k Γ
Self−links do not affect evolutionary regimes butare essential contributors to certain network motifs
x (1+q)
1 2’ 21 2
3 3
4
4’
41
Conservation of Network Motifs
This is in broad agreement with empirical observations !
Eventually: structural orthology functional orthology
M1
γγ
γ
γγ
γ
γγγ 23
Motif conservation index
s/o
s/o
s/o
Network Motifs are NOT Conserved underWhile Proteins are typically Conserved (if M>1)
Long−term Duplication−Divergence Evolution
RalGDS RGL3
GAP
RalGPS1RalGPS2
GAP GAP
GAP GAP82%
60%
H. sapiens
RalBP1 Exocyst
x 2
RGL1
GAP GAP
RGL2RGL4
x 2
D. melanogaster
CG5522
?
28%
GAP
GAP
RalBP1 Exocyst
RGL
D. mel H. sap
x 2
D. rer
x 2x 2
Whole Genome Duplications
RalGDS
RGL1
RGL3
RASGRF1RASGRF2
KSR1
KSR2
ERAS
RIT2
RIT1
STK38
RasGRP3
RalB
RalA
RalGPS1
RGL4
RalGPS2
RalBP1Raf1
BRAF
ARAF
NRAS
HRAS
MRAS
RRAS
RGL2
KRASRRAS2
STK38L RasGRP1
RasGRP2
RasGRP4MAP4K1
MAP4K2
Ras−Ralpathways
RASRAS RAS
RalA RalB
RAS RAS
RalGEF
RAS
RAL
Ral Effectuor
RAS
Ral
Kirill Evlampiev
<N >/<N >=(n)(n+1) ∆(n)
h( )=α∆ = +q Γ(1−q) +q Γα α α
Γs o n
Asymptotic "Characteristic Equation"Network Node Growth Rate
<1{α}0<
αh( )
p ~ 1/k α+1k
Γ
Γ
i
i
h( )=α +q Γ(1−q) +q Γα α α
Γs o n
α∗ >1
<1{α}0<
Asymptotic Topology of PPI Networks
α10
h(1)
Dense
h(1)
Exponential Degree Distribution
h(0)
α∗ >1
Scale−free Degree Distribution
M’=max( ) > 1
kp ~ exp(− k)M’=max( ) < 1 µ
Convex Characteristic Function
<N >/<N >=(n)(n+1) ∆(n)
h( )=α∆ = +q Γ(1−q) +q Γα α α
Γs o n
Asymptotic "Characteristic Equation"Network Node Growth Rate
<1{α}0<
αh( )
p ~ 1/k α+1k
Γ
Γ
i
i
h( )=α +q Γ(1−q) +q Γα α α
Γs o n
α∗ >1
<1{α}0<
oΓiΓ (n)
Conserved Networks are also Scale−freenecessary (M’ > M > 1)
GDD Model + Fluctuations = max( )Πn=1
R R
sΓΠ (n)(n)
n=1
(n)(n)> [(1−q ) + q ] =M’ M
Scale−free
Exponential
[
Asymptotic Topology of PPI Networks
α10
h(1)
Dense
h(1)
Exponential Degree Distribution
h(0)
α∗ >1
Scale−free Degree Distribution
M’=max( ) > 1
kp ~ exp(− k)M’=max( ) < 1 µ
Convex Characteristic Function
= max( )iΓi=s,o,n
M’Asymptotic Network TopologyM’ >1M’ <1
3
]
0.01 0.1 10
0.4
0.8
1.2
0.01 0.1 10
0.4
0.8
1.2
0.01 0.1 10
0.4
0.8
1.2
L/<L> or N/<N> L/<L> or N/<N> L/<L> or N/<N>
20% Growth Rate 50% Growth Rate ~80% Growth Rate
Linear regime N ~ L NL regime N < L
Distribution of Network Sizes
Genome and PPI Network Evolutionunder Protein Domain Shuffling
Domain−Domain Interaction Network
Direct interaction bwn protein domains
Multidomain proteins (random shuffling)
domain1
domain2
Redefining PPI Networks as Domain Interaction Networks
XOR
XOR
protein1=domain1
Genome and PPI Network Evolutionunder Protein Domain Shuffling
Domain−Domain Interaction Network
Direct interaction bwn protein domains
Multidomain proteins (random shuffling)
protein2=
domain2A + domain2B
Redefining PPI Networks as Domain Interaction Networks
XOR
XOR
protein1=domain1
Genome and PPI Network Evolutionunder Protein Domain Shuffling
Domain−Domain Interaction Network
Indirect interaction within complexes
Direct interaction bwn protein domains
Multidomain proteins (random shuffling)
Adding some "combinatorial logic" through multidomain proteins
Redefining PPI Networks as Domain Interaction Networks
protein2=
AND
domain2A + domain2BXOR
XOR
1 10 1001e-6
1e-4
0.01
1
kp
kk 2 k
.g
kp
(+/− 2σ)WGD Model
(+/− 2σ)
Yeast direct int. data
Yeast direct int. dataWGD Model
γon
γoo=0.1 ( =1 γ
nn )=0γ
on
Node Degree
Protein direct connectivity
Average Connectivityof Neighbours k.g
kDistribution
λ =1.5 prot. bind. domains per protein
Comparison with Yeast Direct Interaction Data
Topology of direct interaction network
A two−parameter Whole Genome Duplication Model
with Random Protein Domain Shuffling λ
1 10 1001e-6
1e-4
0.01
1
kp
kk 2 k
.g
kp
(+/− 2σ)WGD Model
(+/− 2σ)
Yeast direct int. data
Yeast direct int. dataWGD Model
γon
γoo=0.1 ( =1 γ
nn )=0γ
on
Γ = Γ + Γ = 1 + 2 = 20%γono n
PPI Network growth rate
Seasquirt−Human (2WGD): 25%
Seasquirt−Fugu (3WGD): 11%
Node Degree
Protein direct connectivity
Average Connectivityof Neighbours k.g
kDistribution
λ =1.5 prot. bind. domains per protein
Comparison with Yeast Direct Interaction Data
Topology of direct interaction network
A two−parameter Whole Genome Duplication Model
with Random Protein Domain Shuffling λ
1 10 1001e-6
1e-4
0.01
1
1 10 1001e-6
1e-4
0.01
1
kp
kk 2 k
.g
kp
kk 2 k
.g
kp
(+/− 2σ)WGD Model
(+/− 2σ)
Yeast direct int. data
Yeast direct int. dataWGD Model
(+/− 2σ)WGD Model
γon
γoo=0.1 ( =1 γ
nn )=0γ
onwith Random Protein Domain Shuffling λA two−parameter Whole Genome Duplication Model
Topology of direct interaction network Topology of direct & indirect interaction network
Comparison with Yeast Direct + Indirect Interaction Data
Node Degree
Protein direct connectivity
Average Connectivityof Neighbours
Direct & indirect "matrix" connectivity
k.gk
Distribution
(+/− 2σ)WGD Model
Yeast dir. & indir. int. data
Yeast dir. & indir. int. data
λ =1.5 prot. bind. domains per protein
1 10 1001e-6
1e-4
0.01
1
1 10 1001e-3
0.01
0.1
1
1 10 1001e-3
0.01
0.1
1
kp
γnn
k.gk
Node Degree Distribution Average Connectivity of Neighbours
Fit of Yeast Data with a one−parameter WGD Model γ(onγ
oo=1 )=0=0.26
kp
Protein direct connectivity
kk 2 k
.g
Protein direct connectivity
(+/− 2σ)
phys. interaction dataYeast two−hybrid and
WGD Model
Comparison with Yeast Direct Interaction Data
Yeast two−hybrid and
WGD Model (+/− 2σ)
phys. interaction data
1
2
3
4
cspA
gyrA hns
caiF flhDC
fliAZY
nhaA osmC rcsAB stpA
flgAMN
flgBCDEFGHIJK flhBAE fliEfliLMNOPQR flgMN fliC fliDST motABcheAW tarTapcheRBYZ
Evolution and Topology of Transcription Networks
E.coli shallow Transcription Network
No large scale feedbacks
but 60% autoregulators (AR)
Transcription Factor Families
Babu et al., NAR 2006
Babu et al., NAR 2006
TF Family Expansion by Duplication
crp fnrC
G
AT
C
G
A
T
G
CTAG
C
AT
C
GATCGATC
AGT
C
T
AG
G
CAT
G
CATC
GATG
C
A
T
C
G
T
A
G
C
TAG
C
AT
G
ATC
G
T
CATGCAC
G
TA
C
G
A
T
GC
AT
-1 -0.5 0 0.5 1specificity
0
0.5
1
1.5
2
2.5
3
3.5
norm
aliz
ed h
its
autoregulatory sites
CRP
-1 -0.5 0 0.5 1specificity
0
1
2
3
4
norm
aliz
ed h
its
autoregulatory sites
FNR
CGTAC
GTA
CGAT
C
GAT
C
AGT
CAGT
AC
GT
C
TAGGCTAC
G
A
T
C
T
A
T
C
T
G
A
A
C
G
AT
AGCT
GTAC
GTCAG
TAC
GCTAG
CAT
GCAT
crp / fnr example
elimination of crosstalks
AR duplication
Evolutionary decoupling of duplicated ARs?
Regulatoryconflict !
Duplication / Divergence Asymmetry
Cosentino−Lagomarsino, Jona, Bassetti & Isambert,PNAS, 104, 5516 (2007) arxiv.org/abs/q.bio/0701035
Hierarchy and Feedback in the Evolutionof Bacterial Transcription Networks
Iteration
γon
γsn
soγ
γss
ioγ
inγΓi is
γ= (1−q) + q( + )q andΓik
i
Duplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
γ
γoo
nn
Inhomogeneous Models of Duplication−Divergence Evolution
While the general "multicolor" inhomogeneous model cannot be solved exactly,
o
ns
kq
i
1−q
i=s,o,n
(n) (n) (n) (n) (n)
one can use a posteriori choices of evolutionary parameters’ values
Deletion
of Links
δ=1−γ
The approach becomes exact for two−color bipartite networks or oriented networks
This approach can model for instance the adaptative expansion of gene families
to mimick positive / negative selection of combinatorial features of networks (approximation)
1 2 3 4
Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
1
4
2
3
1
4’4
3
2’ 2
Bacteria versus Eukaryote Transcriptional Hierarchy
Cosentino−Lagomarsino et al. PNAS 2007 Sellerio et al., submitted 2008
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
> 50% of TF ~ 10% of TF
Long
est S
hort
est P
aths
Bacteria versus Eukaryote Transcriptional Hierarchy
Cosentino−Lagomarsino et al. PNAS 2007 Sellerio et al., submitted 2008
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
Long
est S
hort
est P
aths
Bacteria versus Eukaryote Transcriptional Hierarchy
Cosentino−Lagomarsino et al. PNAS 2007 Sellerio et al., submitted 2008
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
Long
est S
hort
est P
aths
Bacteria versus Eukaryote Transcriptional Hierarchy
Cosentino−Lagomarsino et al. PNAS 2007 Sellerio et al., submitted 2008
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
Adaptative Selection ? Random drift ?
Long
est S
hort
est P
aths
OR non−adaptative constraints on bacterial genomes ?? Isambert & Stein, Biol Direct (2009)
Bacteria versus Eukaryote Transcriptional Hierarchy
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
Adaptative Selection ? Random drift ?
PopulationDynamics
N.s > 1 N.s << 1N population size
Long
est S
hort
est P
aths
Transcription / translation coupling versus mRNA export out of the nucleus
Apparent genome size constraint ?
Alternatively... Specific Evolutionary Constraints of Bacteria?
but gene conservation required "mutualized" gene pool + gene transfers
Encephalitozoon intestinalis
Selaginella moellendorffii
(M. hapla)
nematode H. sapiensC. elegans D. melanogaster
ANIMALS(Pratylenchus coffeae)
Trichoplax adhaerens
P. aethiopicus,lungfish
tick, argasid
carnivorous plant
G. margaretae
A. thaliana
PLANTS F. assyriaca,lily
rice, O. sativa T. aestivumwheat,
S. castanea(P. carinii)
A. gossypii S. cerevisae
FUNGI
P. ubique E. coli
FREE−LIVINGPROKARYOTES
10 10 10 101010 1010 10
(N. equitans) ARCHAEA
(G. theta)
T. thermophila
PROTISTS
G. polyedra
S. cellulosum(C. ruddii)
green algae
I. hospitalis
M. acetivorans
8 9 10 11 1210 2
(HIV)(HDV)
43 6 75
10 100 1,0001 ~10,000
10
approx. number of genes~20,000 ~30,000
Genome size (bp)
(mamavirus)
(phage T4)
Genome Size Restriction
of Free−living Prokaryotes
FREE−LIVING EUKARYOTESOBLIGATE PARASITES and SYMBIONTS
O. tauri P. tetraurelia A. dubia
EUBACTERIA
VIROIDS VIRUSES
Transcription / translation coupling versus mRNA export out of the nucleus
Bacterial genome structure : Apparent genome size constraint ?
Alternatively... Specific Evolutionary Constraints of Bacteria?
but gene conservation required "mutualized" gene pool + gene transfers
Encephalitozoon intestinalis
Selaginella moellendorffii
(M. hapla)
nematode H. sapiensC. elegans D. melanogaster
ANIMALS(Pratylenchus coffeae)
Trichoplax adhaerens
P. aethiopicus,lungfish
tick, argasid
carnivorous plant
G. margaretae
A. thaliana
PLANTS F. assyriaca,lily
rice, O. sativa T. aestivumwheat,
S. castanea(P. carinii)
A. gossypii S. cerevisae
FUNGI
P. ubique E. coli
FREE−LIVINGPROKARYOTES
10 10 10 101010 1010 10
(N. equitans) ARCHAEA
(G. theta)
T. thermophila
PROTISTS
G. polyedra
S. cellulosum(C. ruddii)
green algae
I. hospitalis
M. acetivorans
8 9 10 11 1210 2
(HIV)(HDV)
43 6 75
10 100 1,0001 ~10,000
10
approx. number of genes~20,000 ~30,000
Genome size (bp)
(mamavirus)
(phage T4)
Genome Size Restriction
of Free−living Prokaryotes
FREE−LIVING EUKARYOTESOBLIGATE PARASITES and SYMBIONTS
O. tauri P. tetraurelia A. dubia
EUBACTERIA
VIROIDS VIRUSES
but gene conservation required "mutualized" gene pool + gene transfers
Gene duplications + genome size constraint gene turnover
Transcription / translation coupling versus mRNA export out of the nucleus
Bacterial genome structure : Apparent genome size constraint ?
Alternatively... Specific Evolutionary Constraints of Bacteria?
RNA Dynamics and Biomolecular Systems Lab
5’ 3’
A
GCAGG
AGGACG
A
A
G
CAUCCUCCUACG
AAGGUAGGGGGAUGG
AAA
AC C
GUCC
CCCUGC
C
A
AG
C
AGGAGGA
CG
AAGCA
UCCUCCU
ACGA
AGGUAGGGGGAUGG
AAA
A
CCGUCCCCCUGCC
A
5’ 3’
cis / transcontrol
Duplicationof genes
Deletionof Links
Iteration
4
2 1
4’4
3
2’ 21
3
100 nm
II. Evolution of Biological NetworksModeling biological networks under duplication−divergence evolution
Self−assembly of ncRNARNA switches and networks
I. RNA Regulatory and Physical Networks
$$. Human Frontier, ANR, Ministère de la Recherche, Institut Curie, CNRS, Fondation PG de Gennes
Institut Curie, CNRS, Paris