Dealing with composition convergence to place plastids among Cyanobacteria

Blaise Li

Centro de Ciências do Mar, Universidade do Algarve, Portugal

Institut für Populationsgenetik - 18/04/2013

Blaise Li Plastids, Cyanobacteria and composition biases

The endosymbiotic origin of plastids

[Figure (after Keeling, 2010): Cyanobacteria give rise, through primary endosymbiosis, to the plastids of Glaucophyta, red algae and green algae (the latter including land plants); secondary endosymbioses then lead to the plastids of chromalveolates... and euglenids.]

The endosymbiotic origins of plastids

[Figure: reproduction of Figure 2 from P. J. Keeling, "Review. The origin and fate of plastids", Phil. Trans. R. Soc. B (2010), p. 732. The diagram links primary, secondary, serial secondary (green alga) and tertiary (diatom, cryptomonad, haptophyte) endosymbioses. Labelled lineages: glaucophytes, red algae, green algae, land plants, Paulinella, euglenids, chlorarachniophytes, cryptomonads, haptophytes, stramenopiles, ciliates, Apicomplexa, dinoflagellates, Dinophysis, Lepidodinium, Durinskia, Karlodinium. One branch is marked "?".]

Text visible on the reproduced page (Keeling, 2010): "There is, however, a large number of endosymbiotic relationships seemingly based on photosynthesis that are less well understood and vary across the entire spectrum of integration, from passing associations to long term and seemingly well-developed partnerships (e.g. Rumpho et al. 2008). Indeed, the line between what is an organelle and what is an endosymbiont is an arbitrary one. There are a few different, specific criteria that have been argued to distinguish the two, the most common being the genetic integration of the two partners, and the establishment of a protein-targeting system. Most photosynthetic endosymbionts probably [...]"

The endosymbiotic origin of plastids

[Figure: diagram labelled "plastids", "section I", "section III" and "section IV" (cyanobacterial sections).]

Phylogenetic difficulties

Old events are generally difficult to resolve:

- mutational saturation
- changes in evolution modalities
- enough time for divergences and convergences in these modalities or in their consequences

→ Simple evolutionary models may not be appropriate.
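As an illustration of mutational saturation, here is a minimal sketch under the Jukes-Cantor (JC69) model. JC69 is an assumption chosen for its simplicity and is not a model used in the talk. The expected proportion of differing sites, p = 3/4 (1 − exp(−4d/3)), levels off near 0.75 as the true distance d (substitutions per site) grows, so old divergences become hard to tell apart.

```python
import math

def jc69_expected_p(d):
    """Expected proportion of differing sites between two sequences
    separated by d substitutions per site, under JC69."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

# The observed difference grows almost linearly at first, then plateaus.
for d in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"true distance {d:>3}: expected observed difference {jc69_expected_p(d):.3f}")
```

Beyond a few substitutions per site, doubling the true distance barely changes the observed difference. This is one reason why simple models struggle with ancient events.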

Phylogenetic difficulties

Difficulty amplified because of endosymbiosis:

- simplification
- gene relocation
- changes in biochemical context

→ sequences missing or with modified evolutionary trends (potentially misleading)

Phylogenetic difficulties

Overly straightforward analyses give conflicting results.

- rDNA and amino-acid data: early divergence of plastids
- protein-coding gene data: plastids close to pluricellular Cyanobacteria

What is the cause of this incongruence?

→ We studied the phenomenon on a dataset of protein-coding genes from plastids (or relocated in the plant host nucleus) and their (cyano)bacterial homologues.

Dataset

- 42 taxa, including 8 outgroup (non-cyano)bacteria, 16 Cyanobacteria, and plastids from 1 Glaucophyta, 4 Rhodophyta (red algae) and 13 Viridiplantae (green plants)
- Cyanobacteria groups present:
  - NOST-1 (section IV)
  - OSC-2 (section III)
  - SPM-3, SO-6, GBACT, UNIT+ (section I)
- 75 protein-coding genes, but 452 missing sequences (i.e. 14% overall, and up to 38 genes missing for one of the outgroup taxa)
- Concatenated dataset (cg75) and its translation (cp75)
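The figures on this slide can be checked with a few lines of arithmetic. The sketch below only restates the numbers given above; the variable names are illustrative.

```python
# Taxon sampling as listed on the slide.
taxa = {
    "outgroup bacteria": 8,
    "Cyanobacteria": 16,
    "Glaucophyta plastids": 1,
    "Rhodophyta plastids": 4,
    "Viridiplantae plastids": 13,
}
n_taxa = sum(taxa.values())
n_genes = 75
n_missing = 452

# Each taxon could contribute one sequence per gene.
n_cells = n_taxa * n_genes
missing_fraction = n_missing / n_cells
worst_taxon_fraction = 38 / n_genes

print(f"{n_taxa} taxa x {n_genes} genes = {n_cells} possible sequences")
print(f"missing overall: {missing_fraction:.1%}")
print(f"missing for the worst outgroup taxon: {worst_taxon_fraction:.1%}")
```

The taxa sum to 42, and 452 missing sequences out of 3150 possible ones is about 14.3%, consistent with the 14% quoted. The most incomplete outgroup taxon lacks roughly half of the genes.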

First ML bootstrap analyses

- Analyses using RAxML
- GTR+I+Γ for nucleotides
- CPREV+I+Γ for amino-acids
- 200 bootstrap pseudo-replicates
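To make "bootstrap pseudo-replicates" concrete, here is a minimal sketch of the non-parametric bootstrap on a toy alignment: columns are drawn with replacement until the replicate is as long as the original. The function name `bootstrap_alignment` and the toy data are hypothetical. RAxML performs this resampling internally, and this is not its implementation.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate of an alignment (dict: name -> sequence).

    Columns are sampled with replacement, and the same column indices are
    used for every taxon so that homologous sites stay aligned.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}

# Toy alignment (made-up sequences, for illustration only).
toy = {"taxonA": "ATGGCTAAA", "taxonB": "ATGGCAAAG", "taxonC": "ATGTCTAAA"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(200)]
print(len(replicates), "pseudo-replicates of length", len(replicates[0]["taxonA"]))
```

A tree is then inferred from each pseudo-replicate, and the support of a branch is the proportion of replicate trees that contain it.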

First ML bootstrap analyses

[Figure: two ML bootstrap trees, one for cg75 and one for cp75 (the translation of cg75), with annotations added on successive overlays.

cg75 (nucleotides), leaves in displayed order: Firmicutes, Chloroflexi, Chlorobi, Proteobacteria, Gloeobacter_violaceus (GBACT), Synechococcus_JA33Ab (GBACT), SO-6, Thermosynechococcus_elongatus (UNIT+), Cyanothece_PCC7425 (UNIT+), Acaryochloris_marina (UNIT+), SPM-3, NOST-1, OSC-2, Glaucophyta, Rhodophyta, Streptophyta, Chlorophyta. Branch supports range from 0.59 to 1.00. Annotations: "basal" GBACT; grade of Cyanobacteria; pluricellulars.

cp75 (amino acids), leaves in displayed order: Firmicutes, Chloroflexi, Chlorobi, Proteobacteria, Gloeobacter_violaceus (GBACT), Synechococcus_JA33Ab (GBACT), Glaucophyta, Rhodophyta, Streptophyta, Chlorophyta, UNIT+, OSC-2, SO-6, NOST-1, SPM-3. Branch supports range from 0.70 to 1.00. Annotations: "basal" GBACT; "core" Cyanobacteria.]

First ML bootstrap analyses

- cp75 is a direct translation of cg75
  → The trees should be the same.
- But the analyses conflict in the identification of the plastid sister-group.
  → Something is not well modelled.

→ Can we have confidence in one of these trees?
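What "direct translation" means can be sketched as follows: a small translator built on the standard genetic code, which has the same codon assignments as the table shown later in the talk. The function name `translate` and the toy sequences are hypothetical, and the code table actually used to produce cp75 is not stated in the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(nucleotides):
    """Translate an in-frame nucleotide sequence codon by codon.

    Gap codons ('---') give '-', unrecognised codons give 'X',
    and trailing bases that do not fill a codon are ignored.
    """
    seq = nucleotides.upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        protein.append("-" if codon == "---" else CODON_TABLE.get(codon, "X"))
    return "".join(protein)

# Two different nucleotide sequences with the same translation.
print(translate("ATGTTACTG"), translate("ATGTTGCTC"))
```

Both toy sequences translate to the same peptide although they differ at four nucleotide sites. The nucleotide alignment therefore carries extra signal, including any composition bias at synonymous sites, that the amino-acid alignment does not see.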

Nucleotides or amino-acids?

- Here, low bootstrap suggests conflicting signals for nucleotides
- Nucleotide sequences are more likely to randomize with time
  - codon degeneracy → lowered selective pressure
  - only 4 states → convergence likely
- Selection on protein function stabilizes the amino-acid sequence
- But estimation of substitution matrix is easier for nucleotides (fewer states)
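The last point can be quantified with a short sketch that counts the free parameters of a general time-reversible substitution model as a function of the number of states. The function name is illustrative.

```python
def free_parameters(n_states):
    """Free parameters of a general time-reversible model on n_states states.

    Returns (exchangeabilities, frequencies): one exchangeability is fixed
    to set the rate scale, and the frequencies must sum to one.
    """
    exchangeabilities = n_states * (n_states - 1) // 2 - 1
    frequencies = n_states - 1
    return exchangeabilities, frequencies

for label, n in (("nucleotides", 4), ("amino acids", 20)):
    ex, fr = free_parameters(n)
    print(f"{label}: {ex} exchangeabilities + {fr} frequencies = {ex + fr}")
```

With 4 states there are 8 free parameters, against 208 for 20 states. This is why amino-acid analyses commonly rely on empirical matrices such as CPREV, whose exchangeabilities are fixed in advance rather than estimated from the data at hand.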

Nucleotide composition attraction

We focus on a particular type of reconstruction artefact: nucleotide composition attraction.

- mutation can be variously biased across taxa
- codon preference also
- this influences the composition of the genomes
- sites under lower selection constraint tend to conform to that composition

→ similar mutation biases and codon preferences may induce convergence in the nucleotide sequence, especially at 3rd codon position
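One way to look for such a bias is to measure the G+C proportion separately at each codon position. The sketch below assumes an in-frame coding sequence. The function name and the toy sequences are hypothetical.

```python
def gc_by_codon_position(sequence):
    """G+C proportion at codon positions 1, 2 and 3 of an in-frame sequence.

    Gaps and ambiguous characters are ignored; a position with no usable
    base yields NaN.
    """
    seq = sequence.upper()
    proportions = []
    for pos in range(3):
        bases = [b for b in seq[pos::3] if b in "ACGT"]
        if bases:
            proportions.append(sum(b in "GC" for b in bases) / len(bases))
        else:
            proportions.append(float("nan"))
    return proportions

# Two toy sequences encoding the same peptide (Ala-Lys-Leu) with
# different synonymous codons.
print(gc_by_codon_position("GCTAAATTA"))
print(gc_by_codon_position("GCCAAGCTG"))
```

The two toy sequences encode the same peptide, yet their third-position G+C proportions are 0 and 1. Taxa that share such a bias can look similar at synonymous sites without being closely related.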

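Nucleotide composition attraction

The genetic code (rows: first codon position; columns: second codon position; within each block the third position runs T, C, A, G; Ter = stop codon):

        T          C          A          G
T    TTT Phe    TCT Ser    TAT Tyr    TGT Cys
     TTC Phe    TCC Ser    TAC Tyr    TGC Cys
     TTA Leu    TCA Ser    TAA Ter    TGA Ter
     TTG Leu    TCG Ser    TAG Ter    TGG Trp

C    CTT Leu    CCT Pro    CAT His    CGT Arg
     CTC Leu    CCC Pro    CAC His    CGC Arg
     CTA Leu    CCA Pro    CAA Gln    CGA Arg
     CTG Leu    CCG Pro    CAG Gln    CGG Arg

A    ATT Ile    ACT Thr    AAT Asn    AGT Ser
     ATC Ile    ACC Thr    AAC Asn    AGC Ser
     ATA Ile    ACA Thr    AAA Lys    AGA Arg
     ATG Met    ACG Thr    AAG Lys    AGG Arg

G    GTT Val    GCT Ala    GAT Asp    GGT Gly
     GTC Val    GCC Ala    GAC Asp    GGC Gly
     GTA Val    GCA Ala    GAA Glu    GGA Gly
     GTG Val    GCG Ala    GAG Glu    GGG Gly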
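The table shows why third codon positions are the least constrained. As a rough sketch, one can count, for each codon position, how many single-nucleotide changes between sense codons leave the amino acid unchanged. The function name is illustrative, and stop codons are excluded by choice.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def synonymous_changes_by_position():
    """Count synonymous single-nucleotide changes per codon position,
    over all sense codons (each direction of change counted once)."""
    counts = [0, 0, 0]
    for codon, aa in CODE.items():
        if aa == "*":
            continue  # skip stop codons
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == aa:
                    counts[pos] += 1
    return counts

first, second, third = synonymous_changes_by_position()
print(f"synonymous changes: 1st={first}, 2nd={second}, 3rd={third}")
```

Almost all synonymous changes fall on the third position, a few on the first (within Leu and Arg), and none on the second. Third positions are therefore freest to drift towards the composition favoured by mutation bias.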

Composition and codon usage biases

[Figure: cg75 tree (Firmicutes, Chloroflexi, Chlorobi, Proteobacteria, Gloeobacter_violaceus (GBACT), Synechococcus_JA33Ab (GBACT), SO-6, Thermosynechococcus_elongatus (UNIT+), Cyanothece_PCC7425 (UNIT+), Acaryochloris_marina (UNIT+), SPM-3, NOST-1, OSC-2, Glaucophyta, Rhodophyta, Streptophyta, Chlorophyta) with node support values, annotated per taxon with codon usage log10 ratios (LeuT/LeuC, SerAG/SerTC, ArgA/ArgC) and G+C proportion by codon position (1: −, 2: +, 3: ×). Successive overlays highlight 3rd pos. G+C, 1st pos. G+C, ArgA bias and LeuT bias.]
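The codon usage log10 ratios shown in the figure can be sketched as follows. This is an illustration under stated assumptions, not the code behind the slides: the family definitions (LeuT = TTR, LeuC = CTN, SerAG = AGY, SerTC = TCN, ArgA = AGR, ArgC = CGN) are inferred from the labels, and the function name `codon_usage_log_ratios` is hypothetical.

```python
import math
from collections import Counter

# Assumed codon families; the slide gives only the labels.
FAMILIES = {
    "LeuT": {"TTA", "TTG"},
    "LeuC": {"CTT", "CTC", "CTA", "CTG"},
    "SerAG": {"AGT", "AGC"},
    "SerTC": {"TCT", "TCC", "TCA", "TCG"},
    "ArgA": {"AGA", "AGG"},
    "ArgC": {"CGT", "CGC", "CGA", "CGG"},
}
RATIOS = [("LeuT", "LeuC"), ("SerAG", "SerTC"), ("ArgA", "ArgC")]


def codon_usage_log_ratios(cds: str) -> dict[str, float]:
    """Return log10(count of family A / count of family B) for each pair."""
    cds = cds.upper()
    codons = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    totals = {name: sum(codons[c] for c in members)
              for name, members in FAMILIES.items()}
    ratios = {}
    for num, den in RATIOS:
        key = f"{num}/{den}"
        if totals[num] and totals[den]:
            ratios[key] = math.log10(totals[num] / totals[den])
        else:
            ratios[key] = float("nan")  # undefined when a family is absent
    return ratios


if __name__ == "__main__":
    print(codon_usage_log_ratios("TTA" * 10 + "CTG" + "AGT" + "TCT"))
```

A positive value means the first family of the pair is used more often than the second; a value of 0 means equal usage.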

3rd codon position removal

- One frequent approach: removing 3rd codon positions when doing large-scale phylogeny
- A less frequent approach is to use a model that acknowledges these composition bias differences

I will present a series of analyses starting from the application of the first approach to our nucleotide dataset.


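3rd codon position removal

Genetic code with the 3rd codon position removed (rows: 1st codon position; columns: 2nd codon position):

1st  T          C          A          G
T    TT- Phe    TC- Ser    TA- Tyr    TG- Cys
T    TT- Phe    TC- Ser    TA- Tyr    TG- Cys
T    TT- Leu    TC- Ser    TA- Ter    TG- Ter
T    TT- Leu    TC- Ser    TA- Ter    TG- Trp
C    CT- Leu    CC- Pro    CA- His    CG- Arg
C    CT- Leu    CC- Pro    CA- His    CG- Arg
C    CT- Leu    CC- Pro    CA- Gln    CG- Arg
C    CT- Leu    CC- Pro    CA- Gln    CG- Arg
A    AT- Ile    AC- Thr    AA- Asn    AG- Ser
A    AT- Ile    AC- Thr    AA- Asn    AG- Ser
A    AT- Ile    AC- Thr    AA- Lys    AG- Arg
A    AT- Met    AC- Thr    AA- Lys    AG- Arg
G    GT- Val    GC- Ala    GA- Asp    GG- Gly
G    GT- Val    GC- Ala    GA- Asp    GG- Gly
G    GT- Val    GC- Ala    GA- Glu    GG- Gly
G    GT- Val    GC- Ala    GA- Glu    GG- Gly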
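Removing 3rd codon positions amounts to dropping every third column of a codon alignment. A minimal sketch, assuming the alignment is held as a plain dictionary of in-frame sequences; the function names are hypothetical and no particular alignment library is implied.

```python
def drop_third_positions(cds: str) -> str:
    """Keep codon positions 1 and 2 of an in-frame sequence (gaps included)."""
    usable = len(cds) - len(cds) % 3  # a trailing partial codon is discarded
    return "".join(cds[i:i + 2] for i in range(0, usable, 3))


def drop_third_positions_alignment(alignment: dict[str, str]) -> dict[str, str]:
    """Apply the same column removal to every aligned sequence."""
    return {name: drop_third_positions(seq) for name, seq in alignment.items()}


if __name__ == "__main__":
    print(drop_third_positions("ATGGCTTAA"))
    print(drop_third_positions_alignment({"a": "ATG---", "b": "ATGGCT"}))
```

Because the same columns are dropped from every sequence, the result is still a valid alignment, one third shorter.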

3rd codon position removal

[Figure: cg75 tree (all codon positions) compared with cg75_no3 tree (3rd codon positions removed), with node support values. In cg75 the UNIT+ taxa (Thermosynechococcus_elongatus, Cyanothece_PCC7425, Acaryochloris_marina) are listed separately; in cg75_no3, UNIT+ appears as a single group.]

3rd codon position removal

- UNIT+ monophyly restored
- But some signal not corresponding to synonymous substitutions was lost
- This signal can be saved by recoding instead of removing


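3rd codon position recoding

Genetic code with the 3rd codon position recoded as IUPAC degenerate symbols (rows: 1st codon position; columns: 2nd codon position):

1st  T          C          A          G
T    TTY Phe    TCN Ser    TAY Tyr    TGY Cys
T    TTY Phe    TCN Ser    TAY Tyr    TGY Cys
T    TTN Leu    TCN Ser    TAR Ter    TGR Ter
T    TTN Leu    TCN Ser    TAR Ter    TGG Trp
C    CTN Leu    CCN Pro    CAY His    CGN Arg
C    CTN Leu    CCN Pro    CAY His    CGN Arg
C    CTN Leu    CCN Pro    CAR Gln    CGN Arg
C    CTN Leu    CCN Pro    CAR Gln    CGN Arg
A    ATH Ile    ACN Thr    AAY Asn    AGN Ser
A    ATH Ile    ACN Thr    AAY Asn    AGN Ser
A    ATH Ile    ACN Thr    AAR Lys    AGN Arg
A    ATG Met    ACN Thr    AAR Lys    AGN Arg
G    GTN Val    GCN Ala    GAY Asp    GGN Gly
G    GTN Val    GCN Ala    GAY Asp    GGN Gly
G    GTN Val    GCN Ala    GAR Glu    GGN Gly
G    GTN Val    GCN Ala    GAR Glu    GGN Gly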
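The recoding in this table can be reproduced by replacing the 3rd nucleotide of each codon with the IUPAC symbol covering all 3rd-position nucleotides used by the encoded amino acid. The sketch below is one way to do it and is not necessarily how the cg75_degen3 dataset was built; it assumes the standard genetic code, and the function name `degenerate_third` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}

# For each amino acid (and stop), the symbol covering all its 3rd-position nucleotides.
THIRD = {
    aa: IUPAC[frozenset(c[2] for c, a in CODE.items() if a == aa)]
    for aa in set(CODE.values())
}


def degenerate_third(cds: str) -> str:
    """Recode the 3rd position of every unambiguous codon; leave others untouched."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    out = []
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        aa = CODE.get(codon)  # None for gaps or already ambiguous codons
        out.append(codon[:2] + THIRD[aa] if aa else codon)
    out.append(cds[usable:])  # keep any trailing partial codon as is
    return "".join(out)


if __name__ == "__main__":
    print(degenerate_third("TTATTTATAATGAGTTGA"))
```

Unlike removal, this keeps the alignment length and leaves 3rd-position differences between non-synonymous codon families visible to the model (for example Y versus R in the two-fold families).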

3rd codon position recoding

[Figure: cg75 tree compared with cg75_degen3 tree (degenerate at 3rd pos., 27.35% recoded), with node support values. In cg75_degen3, UNIT+ appears as a single group.]

3rd codon position recoding

I Similar effect as no3: UNIT+ monophyly restored

I But codon degeneracy exists at other positions, associatedwith switches between Leu, Arg and Ser families.→ We looked at the effect of this signal by selectivelyremoving it.


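Degenerating 1st and 2nd codon positions

Genetic code with Leu, Ser and Arg codons degenerated at the 1st and 2nd positions (rows: 1st codon position; columns: 2nd codon position):

1st  T          C          A          G
T    TTT Phe    WST Ser    TAT Tyr    TGT Cys
T    TTC Phe    WSC Ser    TAC Tyr    TGC Cys
T    YTA Leu    WSA Ser    TAA Ter    TGA Ter
T    YTG Leu    WSG Ser    TAG Ter    TGG Trp
C    YTT Leu    CCT Pro    CAT His    MGT Arg
C    YTC Leu    CCC Pro    CAC His    MGC Arg
C    YTA Leu    CCA Pro    CAA Gln    MGA Arg
C    YTG Leu    CCG Pro    CAG Gln    MGG Arg
A    ATT Ile    ACT Thr    AAT Asn    WST Ser
A    ATC Ile    ACC Thr    AAC Asn    WSC Ser
A    ATA Ile    ACA Thr    AAA Lys    MGA Arg
A    ATG Met    ACG Thr    AAG Lys    MGG Arg
G    GTT Val    GCT Ala    GAT Asp    GGT Gly
G    GTC Val    GCC Ala    GAC Asp    GGC Gly
G    GTA Val    GCA Ala    GAA Glu    GGA Gly
G    GTG Val    GCG Ala    GAG Glu    GGG Gly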
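The table above only touches the three six-codon families, so the recoding can be written as an explicit lookup: Leu codons get the prefix YT, Ser codons WS and Arg codons MG, with the 3rd nucleotide kept. A minimal sketch assuming the standard genetic code; the name `degenerate_12` and the returned count of recoded codons are illustrative choices, and the count is not claimed to match the percentages quoted on the slides.

```python
LEU = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}
SER = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}
ARG = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}

# Degenerate prefix (positions 1 and 2) for each codon of the three families.
PREFIX = {
    **dict.fromkeys(LEU, "YT"),
    **dict.fromkeys(SER, "WS"),
    **dict.fromkeys(ARG, "MG"),
}


def degenerate_12(cds: str, families: dict[str, str] = PREFIX) -> tuple[str, int]:
    """Recode positions 1 and 2 of the listed codons; return (sequence, codons recoded)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    out, changed = [], 0
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        prefix = families.get(codon)
        if prefix:
            out.append(prefix + codon[2])  # 3rd position is left as observed
            changed += 1
        else:
            out.append(codon)
    out.append(cds[usable:])  # keep any trailing partial codon as is
    return "".join(out), changed


if __name__ == "__main__":
    print(degenerate_12("TTAAGTCGAATGTTT"))
    # Restricting the lookup to serine only:
    print(degenerate_12("TTAAGT", dict.fromkeys(SER, "WS")))
```

Passing a smaller lookup, as in the second call, restricts the recoding to a single family without changing the function.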

Degenerating 1st and 2nd codon positions

[Figure: cg75 tree compared with cg75_degenerate12 tree (degenerate at 1st and 2nd pos., 7.62% recoded), with node support values. In cg75_degenerate12, the UNIT+ taxa (Acaryochloris_marina, Thermosynechococcus_elongatus, Cyanothece_PCC7425) and the SO-6 taxa (Synechococcus_elongatus, Synechococcus_RCC307, Prochlorococcus_marinus) are listed separately.]

Degenerating 1st and 2nd codon positions

- UNIT+ monophyly not restored.
- SO-6 split.
  → Is the removed signal actually useful? (more later)
- Lower supports suggest conflicting signals. (Prochlorococcus and Rhodophyta misplacement)

→ Let's try to neutralize all synonymous substitutions...


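Degenerating all synonymous codon positions

Genetic code with all synonymous codon positions degenerated (rows: 1st codon position; columns: 2nd codon position):

1st  T          C          A          G
T    TTY Phe    WSN Ser    TAY Tyr    TGY Cys
T    TTY Phe    WSN Ser    TAY Tyr    TGY Cys
T    YTN Leu    WSN Ser    TAR Ter    TGR Ter
T    YTN Leu    WSN Ser    TAR Ter    TGG Trp
C    YTN Leu    CCN Pro    CAY His    MGN Arg
C    YTN Leu    CCN Pro    CAY His    MGN Arg
C    YTN Leu    CCN Pro    CAR Gln    MGN Arg
C    YTN Leu    CCN Pro    CAR Gln    MGN Arg
A    ATH Ile    ACN Thr    AAY Asn    WSN Ser
A    ATH Ile    ACN Thr    AAY Asn    WSN Ser
A    ATH Ile    ACN Thr    AAR Lys    MGN Arg
A    ATG Met    ACN Thr    AAR Lys    MGN Arg
G    GTN Val    GCN Ala    GAY Asp    GGN Gly
G    GTN Val    GCN Ala    GAY Asp    GGN Gly
G    GTN Val    GCN Ala    GAR Glu    GGN Gly
G    GTN Val    GCN Ala    GAR Glu    GGN Gly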

Degenerating all synonymous codon positions

[Figure: cg75 tree compared with cg75_degen tree (all synonymous codon positions degenerated), with node support values. Taxon labels in cg75_degen, in order: Firmicutes, Chloroflexi, Chlorobi, Proteobacteria, Gloeobacter_violaceus (GBACT), Synechococcus_JA33Ab (GBACT), Glaucophyta, Rhodophyta, Streptophyta, Chlorophyta, UNIT+, OSC-2, SO-6, NOST-1, SPM-3.]

Degenerating all synonymous codon positions

- Core Cyanobacteria sister to plastids, as when using amino acids
- 1st and 2nd position signal actually contributes to composition attraction. (Its neutralization helps when 3rd position synonymous signal is also neutralized.)
- What happens? Is it "good" or "bad" signal?


Degenerating serine codon positions 1 and 2

Serine synonymy is different from other synonymies

- AGY (Ser) ↔ ACY (Thr) ↔ TCY (Ser)
- AGY (Ser) ↔ TGY (Cys) ↔ TCY (Ser)

Either a double simultaneous substitution, or a non-Ser intermediate
→ might lend itself less to composition convergence than other synonymous substitutions and therefore may contain useful signal
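The two paths listed above can be verified by enumerating the codons that sit one substitution away from both an AGY and a TCY serine codon. A small sketch assuming the standard genetic code; the helper names `hamming` and `one_step_intermediates` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order, one-letter amino acids; "*" marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def one_step_intermediates(start: str, end: str) -> dict[str, str]:
    """Codons (with their amino acid) one substitution away from both start and end."""
    return {
        codon: aa
        for codon, aa in CODE.items()
        if hamming(start, codon) == 1 and hamming(codon, end) == 1
    }


if __name__ == "__main__":
    # Serine: the two families are two substitutions apart.
    print(hamming("AGT", "TCT"), one_step_intermediates("AGT", "TCT"))
    # Leucine and arginine: the two families are one substitution apart.
    print(hamming("TTA", "CTA"), hamming("AGA", "CGA"))
```

For serine the only single-step intermediates are a Thr codon and a Cys codon, whereas the Leu and Arg families can be connected by a single synonymous substitution at the 1st position.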


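Degenerating serine codon positions 1 and 2

Genetic code with Ser codons degenerated at the 1st and 2nd positions (rows: 1st codon position; columns: 2nd codon position):

1st  T          C          A          G
T    TTT Phe    WST Ser    TAT Tyr    TGT Cys
T    TTC Phe    WSC Ser    TAC Tyr    TGC Cys
T    TTA Leu    WSA Ser    TAA Ter    TGA Ter
T    TTG Leu    WSG Ser    TAG Ter    TGG Trp
C    CTT Leu    CCT Pro    CAT His    CGT Arg
C    CTC Leu    CCC Pro    CAC His    CGC Arg
C    CTA Leu    CCA Pro    CAA Gln    CGA Arg
C    CTG Leu    CCG Pro    CAG Gln    CGG Arg
A    ATT Ile    ACT Thr    AAT Asn    WST Ser
A    ATC Ile    ACC Thr    AAC Asn    WSC Ser
A    ATA Ile    ACA Thr    AAA Lys    AGA Arg
A    ATG Met    ACG Thr    AAG Lys    AGG Arg
G    GTT Val    GCT Ala    GAT Asp    GGT Gly
G    GTC Val    GCC Ala    GAC Asp    GGC Gly
G    GTA Val    GCA Ala    GAA Glu    GGA Gly
G    GTG Val    GCG Ala    GAG Glu    GGG Gly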

Blaise Li Plastids, Cyanobacteria and composition biases

Page 95: Dealing with composition convergence to place plastids among … · 2013. 4. 16. · FirstMLbootstrapanalyses Firmicutes Chloroflexi Chlorobi Proteobacteria Gloeobacter_violaceus(GBACT)

Degenerating serine codon positions 1 and 2

FirmicutesChloroflexi

ChlorobiProteobacteria

Gloeobacter_violaceus (GBACT)Synechococcus_JA33Ab (GBACT)

Synechococcus_elongatus (SO-6)Synechococcus_RCC307 (SO-6)

Thermosynechococcus_elongatus (UNIT+)Cyanothece_PCC7425 (UNIT+)Acaryochloris_marina (UNIT+)

SPM-3NOST-1

OSC-2Prochlorococcus_marinus (SO-6)

RhodophytaGlaucophyta

StreptophytaChlorophyta

0.990.99

1.001.001.00

0.90

0.92

0.97

0.910.920.910.920.920.780.821.00

10-1-2-3codon usage log10 ratios ( LeuTLeuC, SerAGSerTC

, ArgAArgC

):

10.50G+C proportion by codon position (1: −, 2: +, 3: ×):

−+ ×−+ ×

−+ ×−+ ×

−+×

−+ ×

−+×

−+ ×

−+×−+×

−+×

−+ ×

−+×

−+ ×

−+ ×

−+ ×−+ ×

−+×

−+ ×

cg75_degen12SBlaise Li Plastids, Cyanobacteria and composition biases

Page 96: Dealing with composition convergence to place plastids among … · 2013. 4. 16. · FirstMLbootstrapanalyses Firmicutes Chloroflexi Chlorobi Proteobacteria Gloeobacter_violaceus(GBACT)

Degenerating serine codon positions 1 and 2

[Figure: tree for dataset cg75_degen12S with node support values. Tips: Firmicutes, Chloroflexi, Chlorobi, Proteobacteria, Gloeobacter_violaceus (GBACT), Synechococcus_JA33Ab (GBACT), Synechococcus_elongatus (SO-6), Synechococcus_RCC307 (SO-6), Thermosynechococcus_elongatus (UNIT+), Cyanothece_PCC7425 (UNIT+), Acaryochloris_marina (UNIT+), SPM-3, NOST-1, OSC-2, Prochlorococcus_marinus (SO-6), Rhodophyta, Glaucophyta, Streptophyta, Chlorophyta. Side panels: codon usage log10 ratios (LeuT/LeuC, SerAG/SerTC, ArgA/ArgC; axis from 1 to −3) and G+C proportion by codon position (1: −, 2: +, 3: ×; axis from 1 to 0). Successive overlays highlight 3rd pos. G+C, then ArgA bias, then 1st pos. G+C and LeuT bias.]
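The two quantities plotted beside the tree (codon usage log10 ratios and G+C proportion by codon position) can be computed roughly as follows. This is only a minimal sketch, not the code used for the talk: the codon family definitions are assumed from the labels (LeuT = TTR, LeuC = CTN, SerAG = AGY, SerTC = TCN, ArgA = AGR, ArgC = CGN), and the function names are hypothetical.

```python
import math
from collections import Counter

# Assumed codon families, inferred from the labels on the slide.
FAMILIES = {
    "LeuT": {"TTA", "TTG"},
    "LeuC": {"CTT", "CTC", "CTA", "CTG"},
    "SerAG": {"AGT", "AGC"},
    "SerTC": {"TCT", "TCC", "TCA", "TCG"},
    "ArgA": {"AGA", "AGG"},
    "ArgC": {"CGT", "CGC", "CGA", "CGG"},
}
RATIOS = [("LeuT", "LeuC"), ("SerAG", "SerTC"), ("ArgA", "ArgC")]


def codons(seq):
    """Split an in-frame coding sequence into complete, unambiguous codons."""
    seq = seq.upper()
    triplets = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return [c for c in triplets if set(c) <= set("ACGT")]


def codon_usage_log_ratios(seq):
    """Return log10(count(X) / count(Y)) per family pair (None if a count is zero)."""
    counts = Counter()
    for codon in codons(seq):
        for name, members in FAMILIES.items():
            if codon in members:
                counts[name] += 1
    result = {}
    for num, den in RATIOS:
        key = f"{num}/{den}"
        if counts[num] and counts[den]:
            result[key] = math.log10(counts[num] / counts[den])
        else:
            result[key] = None  # ratio undefined for this sequence
    return result


def gc_by_position(seq):
    """Return the G+C proportion at codon positions 1, 2 and 3."""
    cods = codons(seq)
    if not cods:
        raise ValueError("no complete unambiguous codon in sequence")
    return [sum(c[i] in "GC" for c in cods) / len(cods) for i in range(3)]
```

Applied to the concatenated coding sequences of each taxon, a negative LeuT/LeuC value means that C-starting Leu codons dominate in that taxon.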

Do SerAG ↔ SerTC bring accurate information?

When SerAG ↔ SerTC is removed, composition biases at third and first codon positions seem to lead to more artefacts.

Two hypotheses:
1. SerAG ↔ SerTC carries significant historical signal
2. SerAG ↔ SerTC carries an important but conflicting misleading signal

Do SerAG ↔ SerTC bring accurate information?

The literature says that SerAG ↔ SerTC occurs easily through Thr and Cys intermediates. Is that the case here?

• Recoding of the data into 23 amino-acid states (distinct states for the codon families of Leu, Arg and Ser)
• MCMC under a GTR+I+Γ model, with the topology fixed to the one obtained with the normal amino-acid data
• Inferred substitution matrix: the highest rates are ArgA ↔ ArgC, LeuC ↔ LeuT, and SerAG ↔ SerTC.

→ SerAG ↔ SerTC is more frequent than any non-synonymous substitution.
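The 23-state recoding can be illustrated with a short sketch. It assumes the standard genetic code and that the Leu, Arg and Ser families are split by the first base of the codon; the state labels and the function name `recode_23` are hypothetical, not those of the original analysis.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def recode_23(seq):
    """Translate an in-frame sequence into 23 states.

    The 17 amino acids with a single codon family keep their one-letter code;
    Leu, Arg and Ser are split into two states each (hypothetical labels).
    Stop codons and ambiguous codons are returned as "?".
    """
    seq = seq.upper()
    states = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon)
        if aa is None or aa == "*":
            states.append("?")
        elif aa == "L":
            states.append("LeuT" if codon[0] == "T" else "LeuC")
        elif aa == "R":
            states.append("ArgA" if codon[0] == "A" else "ArgC")
        elif aa == "S":
            states.append("SerAG" if codon[0] == "A" else "SerTC")
        else:
            states.append(aa)
    return states
```

With this coding, a SerAG ↔ SerTC change becomes a visible state change, so its rate can be estimated alongside the ordinary amino-acid exchange rates.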

Do SerAG ↔ SerTC bring accurate information?

The relatively high frequency of SerAG ↔ SerTC may be associated with composition biases.

• AG vs. TC at positions 1 and 2 → no correlation with global G+C % expected
• A vs. T at first position
• G vs. C at second position

Which groupings of taxa may be favoured by such biases?

Composition at position 1

[Figure: scatter plot of taxa by A/(A+T) and G/(C+G) at codon position 1. Labelled points: Trichodesmium erythraeum (OSC-2) and Prochlorococcus marinus (SO-6).]

Composition at position 2

[Figure: scatter plot of taxa by A/(A+T) and G/(C+G) at codon position 2. Labelled points: Trichodesmium erythraeum (OSC-2) and Prochlorococcus marinus (SO-6).]
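The two axes of these composition plots, A/(A+T) and G/(C+G) at a given codon position, could be obtained per taxon with something like the following. It is a sketch under the assumption that each sequence is in frame and starts at a first codon position; the function name is hypothetical.

```python
def composition_ratios(seq, position):
    """Return (A/(A+T), G/(C+G)) at codon position 1, 2 or 3.

    A ratio is None when its denominator is zero.
    """
    if position not in (1, 2, 3):
        raise ValueError("position must be 1, 2 or 3")
    # Every third base, starting at the requested codon position.
    bases = seq.upper()[position - 1::3]
    a, t, g, c = (bases.count(b) for b in "ATGC")
    at_ratio = a / (a + t) if (a + t) else None
    gc_ratio = g / (c + g) if (c + g) else None
    return at_ratio, gc_ratio
```

Computing the pair for every taxon in the alignment gives one point per taxon, which is enough to see whether a taxon stands out at that position.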

Do SerAG ↔ SerTC bring accurate information?

• Prochlorococcus (SO-6) and Trichodesmium (OSC-2) are likely attracted to plastids because of low G+C % at 1st and 3rd positions.
• But they do not stand out when it comes to biases possibly associated with Ser codon degeneracy at 1st and 2nd positions.
• So the signal associated with these positions will not reinforce their tendency to be artefactually placed, and may even contribute to placing them at more correct positions.

→ What happens if we remove all synonymy-associated signal except at the first and second positions of Ser codons?

Degenerating all but Ser codon positions 1 and 2

Standard genetic code (rows: first base; columns: second base):

        T          C          A          G
T    TTT Phe    TCT Ser    TAT Tyr    TGT Cys
     TTC Phe    TCC Ser    TAC Tyr    TGC Cys
     TTA Leu    TCA Ser    TAA Ter    TGA Ter
     TTG Leu    TCG Ser    TAG Ter    TGG Trp
C    CTT Leu    CCT Pro    CAT His    CGT Arg
     CTC Leu    CCC Pro    CAC His    CGC Arg
     CTA Leu    CCA Pro    CAA Gln    CGA Arg
     CTG Leu    CCG Pro    CAG Gln    CGG Arg
A    ATT Ile    ACT Thr    AAT Asn    AGT Ser
     ATC Ile    ACC Thr    AAC Asn    AGC Ser
     ATA Ile    ACA Thr    AAA Lys    AGA Arg
     ATG Met    ACG Thr    AAG Lys    AGG Arg
G    GTT Val    GCT Ala    GAT Asp    GGT Gly
     GTC Val    GCC Ala    GAC Asp    GGC Gly
     GTA Val    GCA Ala    GAA Glu    GGA Gly
     GTG Val    GCG Ala    GAG Glu    GGG Gly

Degenerating all but Ser codon positions 1 and 2

Same table after degeneration (each codon replaced by its degenerate code):

        T          C          A          G
T    TTY Phe    TCN Ser    TAY Tyr    TGY Cys
     TTY Phe    TCN Ser    TAY Tyr    TGY Cys
     YTN Leu    TCN Ser    TAR Ter    TGR Ter
     YTN Leu    TCN Ser    TAR Ter    TGG Trp
C    YTN Leu    CCN Pro    CAY His    MGN Arg
     YTN Leu    CCN Pro    CAY His    MGN Arg
     YTN Leu    CCN Pro    CAR Gln    MGN Arg
     YTN Leu    CCN Pro    CAR Gln    MGN Arg
A    ATH Ile    ACN Thr    AAY Asn    AGN Ser
     ATH Ile    ACN Thr    AAY Asn    AGN Ser
     ATH Ile    ACN Thr    AAR Lys    MGN Arg
     ATG Met    ACN Thr    AAR Lys    MGN Arg
G    GTN Val    GCN Ala    GAY Asp    GGN Gly
     GTN Val    GCN Ala    GAY Asp    GGN Gly
     GTN Val    GCN Ala    GAR Glu    GGN Gly
     GTN Val    GCN Ala    GAR Glu    GGN Gly
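The recoding shown in the degenerated codon table can be sketched as a codon-by-codon substitution. The mapping below is copied from that table; the function name `degenerate` and the choice to leave gapped, ambiguous or incomplete codons unchanged are assumptions made for the illustration, not the author's implementation.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Degenerate code per amino acid, as listed in the table (Ser and Ter handled below).
DEGEN = {
    "F": "TTY", "L": "YTN", "Y": "TAY", "C": "TGY", "W": "TGG",
    "P": "CCN", "H": "CAY", "Q": "CAR", "R": "MGN", "I": "ATH",
    "M": "ATG", "T": "ACN", "N": "AAY", "K": "AAR", "V": "GTN",
    "A": "GCN", "D": "GAY", "E": "GAR", "G": "GGN",
}


def degenerate(seq):
    """Recode an in-frame sequence, keeping Ser codon positions 1 and 2."""
    seq = seq.upper()
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon)
        if aa is None:
            out.append(codon)            # gap, ambiguity or partial codon: unchanged
        elif aa == "S":
            out.append(codon[:2] + "N")  # TCN or AGN: the two Ser families stay distinct
        elif aa == "*":
            out.append(codon[:2] + "R")  # TAR or TGR, as in the table
        else:
            out.append(DEGEN[aa])
    return "".join(out)
```

After this recoding, the only synonymous difference left in the alignment is the one between the TCN and AGN serine families.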

Degenerating all but Ser codon positions 1 and 2

[Figure: two trees with node support values, one for dataset cg75_degen and one for dataset cg75_degenLR3. Tips in both trees: Firmicutes, Chloroflexi, Chlorobi, Proteobacteria, Gloeobacter_violaceus (GBACT), Synechococcus_JA33Ab (GBACT), Glaucophyta, Rhodophyta, Streptophyta, Chlorophyta, UNIT+, OSC-2, SO-6, NOST-1, SPM-3.]

Degenerating all but Ser codon positions 1 and 2

• Not much change
• Signal associated with switches between Ser families has a mitigating effect on artefacts associated with G+C composition biases.
• But it has no visible effect on the topology when combined with data already purged of potentially misleading signal
• (In the present study...)

Conclusions

• Incongruence between nucleotide and amino-acid data is mainly due to G+C convergence biases. It is likely that plastids diverged early from the Cyanobacteria.
• rDNA has direct selective constraints on its sequence, hence the results similar to amino-acid data.
• Codon-degeneracy recoding permits the removal of misleading signal while retaining a part of the signal not present at the amino-acid level. This could lead to more accurate results.

Modelling composition variations

• NDCH (node-discrete composition heterogeneity) model: composition can be different at different nodes of the tree.
• This mitigates the likelihood cost associated with grouping taxa with diverging composition.
• And so fewer composition convergence artefacts are expected.
• The number of composition vectors is increased until simulated data have a composition heterogeneity compatible with that of the real data.
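Comparing the composition heterogeneity of simulated and real data requires a summary statistic. The slides do not say which one was used; a common choice, assumed here purely for illustration, is the chi-square statistic of composition homogeneity across taxa. The function name and the alignment representation (a dict mapping taxon names to sequences) are hypothetical.

```python
from collections import Counter


def composition_chi2(alignment, alphabet="ACGT"):
    """Chi-square statistic of composition homogeneity across taxa.

    `alignment` maps taxon names to sequences. Characters outside
    `alphabet` (gaps, ambiguity codes) are ignored.
    """
    counts = {
        taxon: Counter(ch for ch in seq.upper() if ch in alphabet)
        for taxon, seq in alignment.items()
    }
    state_totals = Counter()
    for taxon_counts in counts.values():
        state_totals.update(taxon_counts)
    grand_total = sum(state_totals.values())
    if grand_total == 0:
        raise ValueError("alignment contains no character from the alphabet")

    chi2 = 0.0
    for taxon_counts in counts.values():
        n_taxon = sum(taxon_counts.values())
        for state in alphabet:
            # Expected count if this taxon had the overall composition.
            expected = n_taxon * state_totals[state] / grand_total
            if expected > 0:
                chi2 += (taxon_counts[state] - expected) ** 2 / expected
    return chi2
```

Under this assumption, one would add composition vectors until the statistic of the real alignment falls within the range of values obtained from data simulated under the fitted model.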

Modelling composition variations

[Figure: two trees with node support values, one for dataset cg75 and one for dataset cg75_p4CV2. Tips: Firmicutes, Chloroflexi, Chlorobi, Proteobacteria, Gloeobacter_violaceus (GBACT), Synechococcus_JA33Ab (GBACT), SO-6, SPM-3, NOST-1, OSC-2, Glaucophyta, Rhodophyta, Streptophyta, Chlorophyta and the UNIT+ taxa. In the cg75 tree the UNIT+ taxa (Thermosynechococcus_elongatus, Cyanothece_PCC7425, Acaryochloris_marina) are drawn as separate tips; in the cg75_p4CV2 tree they are drawn as a single UNIT+ tip.]

Modelling composition variations

• Similar effect to no3 and degen3: UNIT+ monophyly restored
  Support values are high because they are Bayesian posterior probabilities, not bootstrap supports.
• 2 composition vectors, corresponding to high and low G+C %
• Why is this not as efficient as full degeneracy?