David B. Levin (1) & Richard Sparling (2)
(1) Department of Biosystems Engineering & (2) Department of Microbiology
University of Manitoba
Winnipeg, MB
Canada
Microbial Genomics for the Development of Biocatalysts
for Lignocellulosic Biorefining and Biofuels production
Microbial Genomics of Biocatalysts for Biorefining
Slide 2
Outline
Biofuels from Direct Cellulose Fermentation: C. thermocellum
Improvements through medium optimization
Microbial Genomics and Metabolism
Comparative genomics: central metabolism
Bioinformatics & Proteomics reveal unexpected Pathways in C. thermocellum
Direct conversion of raw substrates
Comparative genomics: Towards high performance lignocellulose fermentation
through consolidated bioprocessing
Cellulosic Biofuels:
Current vs Alternative Approach
Slide 3
Clostridium thermocellum
Clostridium termitidis
Biofuels from Direct Cellulose Fermentation
Clostridium thermocellum: thermophilic, cellulolytic, gram +ve, anaerobic
Clostridium termitidis: mesophilic, cellulolytic, gram +ve, anaerobic
Degrade cellulose and synthesize: ethanol, H2 and CO2, VFAs - acetate (formate, lactate)
C. thermocellum possesses a high rate of cellulose degradation
C. termitidis cellulose hydrolysis is comparable to that of C. cellulolyticum
Slide 4
Clostridium thermocellum
End-Product Formation on cellulose: Medium optimization
Slide 5
Starting with a baseline medium for C. thermocellum,
one can alter the medium to enhance:
Growth rate and end-product formation rate from
cellulose
Shift production towards the generation of a specific
end-product
Clostridium thermocellum
End-Product Formation on cellulose: Batch Cultures
[Figure, panel A: total protein (µg) and gas produced (µmol; H2, CO2) vs. time (hours), log scale]
[Figure, panel B: end-product formation (µmol; acetate, formate, lactate, ethanol) vs. time (hours), log scale]
A) Total protein, hydrogen and carbon dioxide; B) lactate, acetate, formate and ethanol produced by C. thermocellum in 10 mL batch tubes of 1191 medium grown on 4.5 g/L α-cellulose incubated at 60 °C
Slide 6
Clostridium thermocellum
Medium optimization design
Slide 7
C. thermocellum
End-Product Formation: Medium optimization
Slide 8
Cellulose fermentation:
From test tube to genome to proteome
Slide 9
Medium optimization can only go so far without
-a deeper understanding of the genome,
-an understanding of the subset of genes actually used
under specific growth conditions (e.g. proteome)
Can lead to better medium design
Can lead to genetic engineering
Understanding the genome of cellulolytic fermentative
organisms: selection of organisms
End products are given in mol/mol hexose equivalents.
Organism | Optimum temp (°C) | H2 | CO2 | Acetate | Ethanol | Formate | Lactate | Growth condition | Ref
Ca. saccharolyticus DSM 8903 | 70 | 3.5 | NR | 2.1 | NR | NR | NR | Batch, 10 g l-1 sucrose | 19, 20
Ca. saccharolyticus DSM 8903 | 70 | 2.5 | 1.4 | 1.4 | ND | ND | 0.1 | Batch, 10 g l-1 glucose | 22
Ca. saccharolyticus DSM 8903 | 70 | 3.6 | 1.5 | 1.6 | ND | ND | ND | Continuous, 4.1 g l-1 glucose (D=0.1 h-1) | 23
Ca. saccharolyticus DSM 8903 | 70 | 4.0 | 1.8 | NR | ND | ND | ND | Continuous, 1.1 g l-1 glucose (D=0.09 h-1) | 23
A. thermophilum 75 ✓ ✓ ✓ ✓ 21 28
C. cellulolyticum H10 | 37 | 1.6 | 1.0 | 0.8 | 0.3 | ND | NR | Batch, 5 g l-1 cellulose | 24
C. cellulolyticum H10 | 37 | 1.8 | 1.1 | 0.8 | 0.4 | ND | NR | Batch, 5 g l-1 cellobiose | 24
C. phytofermentans ISDg | 35-37 | Major | Major | 0.6 | 1.4 | 0.1 | 0.3 | Batch, 34 g l-1 cellobiose | 18
C. phytofermentans ISDg | 35-37 | 1.0 | 0.9 | 0.6 | 0.5 | 0.1 | NR | Batch, 5 g l-1 cellulose | 24
C. phytofermentans ISDg | 35-37 | 1.6 | 1.2 | 0.6 | 0.6 | ND | NR | Batch, 5 g l-1 cellobiose | 24
C. thermocellum ATCC 27405 | 60 | 0.8 | 1.1 | 0.7 | 0.8 | 0.3 | ND | Batch, 1.1 g l-1 cellobiose | 1
C. thermocellum ATCC 27405 | 60 | 1.0 | 0.8 | 0.8 | 0.6 | 0.4 | 0.4 | Batch, 4.5 g l-1 cellobiose | 25
C. thermocellum JW20 | 60 | 1.8 | 1.7 | 0.9 | 0.8 | ND | 0.1 | Batch, 2 g l-1 glucose | 26
C. thermocellum JW20 | 60 | 0.6 | 1.8 | 0.3 | 1.4 | ND | 0.2 | Batch, 27 g l-1 cellobiose | 26
T. pseudethanolicus 39E
NR 0.1
NR ✓ 2.0
0.3* 0.2 ✓ 0.1
1.3* 0.8 0.4 1.95 1.45 1.8
NR NR NR
>0.1* 1.1 ✓ 0.1
1 g l-1 xylose Batch, 20 g l-1 xylose Batch, 20 g l-1 glucose Batch, 8 g l-1 glucose
29 27 27 30 31 32
Hydrogen Ethanol
ND - not detected
NR - not reported
* per xylose equivalent
Slide 10
Comparative Bioinformatics analysis
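Alcohol dehydrogenases
Gene (COG ID), followed by locus tags per genome; ND = not detected

adhE (COG 1454, 1012)
  Ca. saccharolyticus DSM 8903: ND
  A. thermophilum DSM 6725: ND
  C. cellulolyticum H10: Ccel_3198
  C. phytofermentans ISDg: Cphy_3925
  C. thermocellum ATCC 27405: Cthe_0423
  C. thermocellum DSM 4150: Cthe_C10_1096
  T. pseudethanolicus ATCC 33223: Teth_390206

Fe-adh (COG 1454)
  Ca. saccharolyticus DSM 8903: Csac_0407, Csac_0711, Csac_0622, Csac_1500
  A. thermophilum DSM 6725: Athe_2244, Athe_0928
  C. cellulolyticum H10: Ccel_1083, Ccel_0894, Ccel_3337
  C. phytofermentans ISDg: Cphy_2650, Cphy_1029, Cphy_1421, Cphy_2463
  C. thermocellum ATCC 27405: Cthe_0394, Cthe_2579, Cthe_0101
  C. thermocellum DSM 4150: Cthe_C9_2833, Cthe_C3_0189, Cthe_C25_0616
  T. pseudethanolicus ATCC 33223: Teth_391597, Teth_391979, Teth_390220

aldh (COG 1012)
  Ca. saccharolyticus DSM 8903: ND
  A. thermophilum DSM 6725: ND
  C. cellulolyticum H10: ND
  C. phytofermentans ISDg: Cphy_3041, Cphy_0958, Cphy_2418, Cphy_1178, Cphy_2642, Cphy_1428, Cphy_1416
  C. thermocellum ATCC 27405: Cthe_2238
  C. thermocellum DSM 4150: Cthe_C58_1042
  T. pseudethanolicus ATCC 33223: ND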
adhE encodes domains necessary for complete reduction of acetyl-CoA to ethanol
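A gene inventory like the one on this slide can be queried programmatically. The following is a minimal sketch, not the authors' workflow: the locus tags are copied from the adhE and aldh rows of the slide, while the names `ADH_INVENTORY` and `organisms_lacking` are hypothetical and introduced only for illustration.

```python
# Locus tags copied from the slide's alcohol dehydrogenase table.
# "ND" (not detected) is represented as an empty list.
# ADH_INVENTORY and organisms_lacking are hypothetical names for this sketch.
ADH_INVENTORY = {
    "adhE": {
        "Ca. saccharolyticus DSM 8903": [],
        "A. thermophilum DSM 6725": [],
        "C. cellulolyticum H10": ["Ccel_3198"],
        "C. phytofermentans ISDg": ["Cphy_3925"],
        "C. thermocellum ATCC 27405": ["Cthe_0423"],
        "C. thermocellum DSM 4150": ["Cthe_C10_1096"],
        "T. pseudethanolicus ATCC 33223": ["Teth_390206"],
    },
    "aldh": {
        "Ca. saccharolyticus DSM 8903": [],
        "A. thermophilum DSM 6725": [],
        "C. cellulolyticum H10": [],
        "C. phytofermentans ISDg": [
            "Cphy_3041", "Cphy_0958", "Cphy_2418", "Cphy_1178",
            "Cphy_2642", "Cphy_1428", "Cphy_1416",
        ],
        "C. thermocellum ATCC 27405": ["Cthe_2238"],
        "C. thermocellum DSM 4150": ["Cthe_C58_1042"],
        "T. pseudethanolicus ATCC 33223": [],
    },
}


def organisms_lacking(gene, inventory=ADH_INVENTORY):
    """Return the organisms with no annotated locus tag for `gene`."""
    return [org for org, tags in inventory[gene].items() if not tags]


if __name__ == "__main__":
    for gene in ADH_INVENTORY:
        print(gene, "not detected in:", organisms_lacking(gene))
```

Keeping "ND" as an empty list rather than a string makes the presence/absence test a simple truthiness check and lets the same structure hold any number of paralogs per genome.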
Hydrogen Ethanol
Slide 11
Comparative Bioinformatics
Hydrogenases
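Gene (COG ID), followed by locus tags per genome; ND = not detected

ech NiFe H2ase (Fd) (COG 3260, 3261)
  Ca. saccharolyticus DSM 8903: Csac_1534-1539
  A. thermophilum DSM 6725: Athe_1082-1087
  C. cellulolyticum H10: Ccel_3371-3366, Ccel_1691-1686
  C. phytofermentans ISDg: Cphy_1730-1735
  C. thermocellum ATCC 27405: Cthe_3024-3019
  C. thermocellum DSM 4150: Cthe_C50_2173-2168
  T. pseudethanolicus ATCC 33223: ND

Fe H2ase (NADH) (COG 4624 (1894))
  Ca. saccharolyticus DSM 8903: Csac_1860-1864
  A. thermophilum DSM 6725: Athe_1298-1299
  C. cellulolyticum H10: Ccel_2232-2233, Ccel_2303-2304
  C. phytofermentans ISDg: Cphy_3804-3805, Cphy_0088-0087
  C. thermocellum ATCC 27405: Cthe_0341-0342, Cthe_0429-0430
  C. thermocellum DSM 4150: Cthe_C10_1102-1103
  T. pseudethanolicus ATCC 33223: Teth_391457-391456

Fe H2ase (NADPH) (COG 4624 (0493))
  Ca. saccharolyticus DSM 8903: ND
  A. thermophilum DSM 6725: ND
  C. cellulolyticum H10: ND
  C. phytofermentans ISDg: ND
  C. thermocellum ATCC 27405: Cthe_3003-3004
  C. thermocellum DSM 4150: ND
  T. pseudethanolicus ATCC 33223: ND

Fe H2ase (Fd) (COG 4624)
  Ca. saccharolyticus DSM 8903: ND
  A. thermophilum DSM 6725: ND
  C. cellulolyticum H10: Ccel_2467
  C. phytofermentans ISDg: Cphy_2056
  C. thermocellum ATCC 27405: ND
  C. thermocellum DSM 4150: ND
  T. pseudethanolicus ATCC 33223: Teth_390221

rnf (COG 4656-4660, 2878)
  Ca. saccharolyticus DSM 8903: ND
  A. thermophilum DSM 6725: ND
  C. cellulolyticum H10: ND
  C. phytofermentans ISDg: Cphy_0211-0216
  C. thermocellum ATCC 27405: Cthe_2430-2435
  C. thermocellum DSM 4150: Cthe_C28_0369-0375
  T. pseudethanolicus ATCC 33223: Teth_392124-392119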
Hydrogen Ethanol
Slide 12
Comparative Bioinformatics
End-product synthesis (yes/no inventory)
[Figure: pathway map of pyruvate catabolism and end-product synthesis. Each enzyme is annotated with the letters of the organisms (A-H) whose genomes encode it; "x" marks an organism without the gene.]
Reactions and enzymes shown:
Pyruvate -> Lactate (LDH)
Pyruvate -> Acetyl-CoA + CO2 (PFO, reducing Fd)
Pyruvate -> Acetyl-CoA + Formate (PFL)
Acetyl-CoA -> Acetyl-P -> Acetate (PTA, ACK; ADP -> ATP)
Acetyl-CoA -> Acetaldehyde -> Ethanol (ALDH, ADH, or AdhE; 2 NADH -> 2 NAD+)
H2 synthesis: Ech, NADH H2ase, Fd H2ase, NADPH H2ase
Membrane components: RNF, Ech, ATPase (H+ and K+ translocation)
Organisms:
A = Ca. saccharolyticus DSM 8903
B = A. thermophilum DSM 6725
C = C. cellulolyticum H10
D = C. phytofermentans ISDg
E = C. thermocellum ATCC 27405
F = C. thermocellum DSM 4150
G = C. thermocellum JW20
H = T. pseudethanolicus 39E
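In the pathway map, every acetyl-CoA formed from pyruvate releases one C1 unit (CO2 via PFO or formate via PFL) and ends up as one C2 product (acetate or ethanol), while lactate bypasses acetyl-CoA. Reported yields can therefore be sanity-checked with a C1/C2 ratio, which should be close to 1. The sketch below is an illustration only and not part of the original slides: the function name `c1_c2_ratio` is hypothetical, ND is treated as 0, and the example yields assume that the flattened end-product table reads column by column for each organism.

```python
def c1_c2_ratio(co2, formate, acetate, ethanol):
    """Ratio of C1 products (CO2 + formate) to C2 products (acetate + ethanol).

    All arguments are yields in mol/mol hexose equivalents.
    """
    c2 = acetate + ethanol
    if c2 == 0:
        raise ValueError("no C2 products reported")
    return (co2 + formate) / c2


# Example yields read from the end-product table (row assignment is an
# assumption of this sketch; ND is entered as 0.0).
examples = {
    "C. thermocellum ATCC 27405, 1.1 g/l cellobiose":
        dict(co2=1.1, formate=0.3, acetate=0.7, ethanol=0.8),
    "C. thermocellum JW20, 2 g/l glucose":
        dict(co2=1.7, formate=0.0, acetate=0.9, ethanol=0.8),
}

if __name__ == "__main__":
    for label, yields in examples.items():
        print(f"{label}: C1/C2 = {c1_c2_ratio(**yields):.2f}")
```

A ratio far from 1 would point to an unmeasured product, a measurement problem, or a misread table row rather than to unusual metabolism.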
Slide 13
What genomics tell us about C. thermocellum
Slide 14
Multiple genes have same
putative annotated
function!?
Slide 15
Conclusions from central metabolism genomics in C. thermocellum:
•Glycolytic pathway utilizes both PPi-dependent and multiple ATP-dependent PFKs
•Multiple methods of interconverting PEP and pyruvate exist, but there is no genomic evidence of a pyruvate kinase AND no peptides corresponding to a clostridial PK
•MDH and ME may be used in transhydrogenation of NADH to NADPH, and could also be used for conversion of PEP to pyruvate
•Branched product pathway uses multiple PFOs and ADHs
•The absence of ALDH suggests AdhE is needed for EtOH synthesis
•Fd-dependent Ech hydrogenase and NADH-dependent hydrogenases are present
•What will the proteome say?
Slide 16
CAUTION: Proteomics is a tool, you need an experimental context!
[Figure: End-Product Synthesis and Cellobiose Consumption During Growth. One panel shows end-products (mM; H2, CO2, acetate, ethanol, formate) vs. time (h). The other shows biomass and substrate (mM; biomass, cellobiose used) and pH vs. time (h), with exponential (Exp) and stationary (Stat) phases marked.]
•C. thermocellum grown under carbon-limited conditions (2 g/l cellobiose) in closed batch cultures with no pH control. End-product profiles generally follow growth with a slight increase in ethanol:acetate ratio consistent
Slide 17
Proteomic Analysis (Shotgun and 4-Plex 2D-HPLC-MS/MS)
- Relative protein expression
  Based on spectral counts (SpC) in both shotgun and 4-plex 2D-HPLC-MS/MS runs
  Given as 'relative abundance index' (RAI) = peptide SpC / protein Mr
- Differential protein expression
  Sample labeling: iTRAQ (isobaric labelling)
  Tags 114 & 115 (exponential phase, biological replicates), Tags 116 & 117 (stationary phase, biological replicates)
  Given as total iTRAQ reporter ratios per protein (stationary/exponential)
  Significance of changes in expression based on 'vector difference' (Vdiff)
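The two quantities defined on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the group's analysis pipeline: the function names are hypothetical, "total" reporter ratio is assumed to mean summed reporter intensities of the stationary tags (116, 117) divided by summed intensities of the exponential tags (114, 115), and the example numbers are made up. Vdiff is not defined on the slide and is left out.

```python
def relative_abundance_index(spectral_count, mr):
    """RAI = peptide spectral count (SpC) / protein Mr, as defined on the slide."""
    if mr <= 0:
        raise ValueError("protein Mr must be positive")
    return spectral_count / mr


def itraq_stationary_over_exponential(reporters):
    """Stationary/exponential ratio from iTRAQ reporter intensities.

    `reporters` maps tag -> summed reporter intensity for one protein.
    Tags 114 and 115 are exponential phase, 116 and 117 stationary phase.
    Summing the replicates before dividing is an assumption of this sketch.
    """
    exponential = reporters[114] + reporters[115]
    stationary = reporters[116] + reporters[117]
    if exponential == 0:
        raise ValueError("no exponential-phase reporter signal")
    return stationary / exponential


if __name__ == "__main__":
    # Hypothetical protein: 120 spectra, Mr of 60 000.
    print(relative_abundance_index(120, 60000.0))
    # Hypothetical reporter intensities for one protein.
    print(itraq_stationary_over_exponential({114: 100.0, 115: 120.0, 116: 210.0, 117: 230.0}))
```

Dividing SpC by Mr compensates for larger proteins yielding more peptides, so RAI values are comparable across proteins within a run.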
Slide 18
Shopping list, good, but targeted analysis better!
Focus on core metabolism
Genomics tells us what organism can do,
not what it does do
Slide 19
Examples:
Fd (Ech) NiFe-Hydrogenase not expressed under tested growth condition
NADH-Fd bifurcating hydrogenase appears dominant
Pyruvate could be synthesized via pyruvate dikinase (PPi dependent) OR
malate shunt; both are highly expressed
PPi dependent phosphofructokinase is a major enzyme present for glycolysis
Possibility of major role for PPi in energy conservation in C. thermocellum
Need to check proteome of multiple organisms…
Bioinformatics & Proteomics Pathways of central
metabolism in C. termitidis
C. termitidis is a mesophilic, cellulolytic bacterium, isolated from the gut of the termite Nasutitermes lujae
Can use cellulose, cellobiose, and other hexose sugars
Major end-products: ethanol, acetate, H2, and CO2, but can synthesize lactate and formate under certain growth conditions
Reported to utilize xylose
Genome sequence analysis and annotation revealed genes for pentose and glucuronate interconversion
Slide 20
Slide 21
Clostridium termitidis
Protocol for genomic and proteomic characterization of novel
organisms
Biofuels from Direct Cellulose Fermentation:
Choice of substrate
Slide 22
While there are sources of “refined” cellulose wastes:
paper cups, paper plates, old newspapers, pulp waste
There are many sources of agricultural lignocellulosic wastes:
bagasse, wheat straw, flax shives, hemp hurds, sawmill waste
To what extent can raw substrates be fermented by consolidated
bioprocesses?
Substrates milled with 32-35 mesh to 0.5 mm particle size
α-cellulose
Biofuels from Direct Cellulose Fermentation:
Evaluation of Different Substrates
Slide 23
+ C. thermocellum
Biofuels from Direct Cellulose Fermentation:
Evaluation of Different Substrates
Available cellulose (mg): 19.4 11.5 23.0 46.0 15.7 15.8 18.3 15.4 17.3 7.7
Yields of fermentation end-products vary with % cellulose & substrate complexity
Normalized yields of ethanol and hydrogen in C. thermocellum fermentation reactions: 0.2% loading = 20 mg (2 g/L) of each substrate, at 60 °C for 24 h
Slide 24
No pre-treatment
Combinations of organisms enhance breakdown of some raw substrates
Slide 25
Magnified view of C. thermocellum
Structure of Cellulosome
Cellulosome Components:
Anchoring protein
Scaffoldin (cipA)
Cellulose Binding Motif (CBM)
Cohesin domains
Enzymatic subunits
Dockerin domains
Presence or Absence of Dockerin Domains in Conserved Glycoside Hydrolases
- 19 GHs conserved across 6 cellulolytic Clostridia:
Clocel – C. cellulovorans
Cther – C. thermocellum
Cter – C. termitidis
Cpap – C. papyrosolvens
Ccel – C. cellulolyticum
Cphy – C. phytofermentans
- C. stercorarium (Clst) genome contains only 13 of the 19 GHs found in other cellulolytic clostridia
- 6 GHs not detected (ND) in Clst genome
- “+” and “++” indicate GHs with dockerin domains
- Cphy and Clst GHs are NOT cellulosome associated
Presence or Absence of Selected Glycoside Hydrolases in Clst, Cter, and Cther
- GHs in Cther are mostly cellulosomal:
  - 12 GH cellulases; few xylanases
- All GHs in Cphy and Clst are acellulosomal
- GHs in Cter and Cther are a mixture of cellulosomal and acellulosomal:
  C - cellulosome-associated
  A - acellulosomal (no dockerin domains)
Remember: C. thermocellum encodes xylanases, but does not grow on xylan hydrolysis products
CAZyome Analysis of sequenced Thermoanaerobacter spp.
A cautionary tale Table
Clade 3 Clade 2 Clade 1
Strains (table columns, left to right):
SR4 = Thermoanaerobacter siderophilus SR4
WC1 = Thermoanaerobacter thermohydrosulfuricus WC1
Rt8.B1 = Thermoanaerobacter wiegelii Rt8.B1
Ab9 = Thermoanaerobacter italicus Ab9
A3 = Thermoanaerobacter mathranii A3
Ako-1 = Thermoanaerobacter brockii subsp. finnii Ako-1
39E = Thermoanaerobacter pseudethanolicus 39E
X513 = Thermoanaerobacter sp. X513
X514 = Thermoanaerobacter sp. X514

Category | Measure | SR4 | WC1 | Rt8.B1 | Ab9 | A3 | Ako-1 | 39E | X513 | X514
Total Sequences in CAZyome | | 58 | 55 | 50 | 60 | 59 | 44 | 42 | 45 | 42
Glycoside Hydrolases | Unique Gene Sequences | 29 | 26 | 22 | 32 | 30 | 21 | 24 | 23 | 24
Glycoside Hydrolases | Unique Classes | 17 | 15 | 15 | 19 | 19 | 15 | 16 | 16 | 16
Glycosyltransferases | Unique Gene Sequences | 15 | 11 | 15 | 15 | 13 | 11 | 11 | 11 | 12
Glycosyltransferases | Unique Classes | 6 | 6 | 7 | 6 | 7 | 5 | 5 | 5 | 6
Carbohydrate Esterases | Unique Gene Sequences | 4 | 5 | 4 | 4 | 6 | 3 | 3 | 3 | 3
Carbohydrate Esterases | Unique Classes | 2 | 2 | 2 | 3 | 5 | 2 | 2 | 2 | 2
Polysaccharide Lyases | Unique Gene Sequences | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
Polysaccharide Lyases | Unique Classes | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
Carbohydrate Binding Modules | Unique Gene Sequences | 6 | 8 | 6 | 5 | 5 | 5 | 0 | 5 | 0
Carbohydrate Binding Modules | Unique Modules | 4 | 4 | 4 | 4 | 4 | 3 | 0 | 3 | 0
Multi-Component Proteins | Unique Gene Sequences | 4 | 5 | 3 | 3 | 5 | 4 | 4 | 3 | 3
Multi-Component Proteins | Unique Combinations | 4 | 5 | 3 | 3 | 5 | 4 | 4 | 3 | 3
Extracellular Proteins* | | 5 | 7 | 3 | 5 | 6 | 2 | 3 | 2 | 2
*Excluding proteins annotated to be involved with cell wall hydrolysis
Extracellular glycoside hydrolases in the genus
Thermoanaerobacter
Slide 30
Microbial Genomics and Cellulosic Biofuels
Direct fermentation of cellulose by cellulolytic bacteria
Requires intimate knowledge of the metabolic pathways and their
regulation, in order to control the fermentation
Possible enhancements using molecular techniques
Ethanol is not the only or best possible biofuel… nor the only value
added product possible
Development of other products using molecular techniques
Combinations of organisms enhance the total glycoside hydrolase
cocktail available for deconstruction, and enhance fermentation
Slide 31
Acknowledgements
Faculty
David Levin
Richard Sparling
Nazim Cicek
MSc Graduate Students
Chris Dartiaihl
Hon Wai Wan
PhD Students
Val Agbor
Carlo Carere
Jilagamazhi Fu
Eftekhar Hossain
Rumana Islam
Matt McCandless
Riffat Munir
Umesh Ramachandran
Tom Rydzak
Ryan Sestric
Marcel Taillefer
Tobin Verbeke
Scott Wuske
Biofuels Research Group
Research Associates
Parveen Sharma
Jaime Park
Serpil Ozmihci
PDFs
John Schellenberg
Sadhana Lal
Slide 32
Manitoba Proteomics Centre
Vic Spicer
Oleg Krokhin
John Wilkins
Bioinformatics
Justin Xiang
Brian Firstensky
Thank-you
Questions?