Expression Operating Unit -...
Vivek K Mutalik
BioFAB Community meeting, July 19, 2010
biofab
Expression Operating Unit: Origins and Current Status
Predictive Synthetic Biology
What can we build today? Toggle switches, band-pass filters, oscillators, complex circuits, etc.
Building them involves fine-tuning parts to operate together. How predictable is that?
Prediction at each stage of composition...
C-dogma Challenge
Parts Devices Systems
Design Characterize Standardize Fabricate
Making biology easy to engineer
BioFAB: C-Dog Goals
• Performance can be predictable
• Reliable functional composition
C-Dog Goal and challenge
acgtcttaagacccactttcacatttaagttgtttttctaatccgcatatgatcaattcaaggccgaataagaaggctggctctgcaccttggtgatcaaataattcgatagcttgtcgtaataatggcggcatactatcagtagtaggtgtttccctttcttctttagcgacttgatgctcttgatcttccaatacgcaacctaaagtaaaatgccccacagcgctgagtgcatataatgcattctctagtgaaaaaccttgttggcataaaaaggctaattgattttcgagagtttcatactgtttttct
[Figure: expression cassette (Promoter, RBS, GFP*, T) with sketched plots of Response vs. Input (Stimulus) and Response vs. Time; levels 0.9 and 0.1 are marked]
Design, Assemble & characterize
Prediction & rational design in place of tuning part performance
Current goal: Express any number of proteins to a desired mean and variance
Future goal: Set the leak, maximum, temporal response and dose response for the production of an arbitrary protein from an inducible promoter
Part characterization
Transcription start site (TSS)
Relative Promoter strength
Transcription kinetics
Biochemical characterization
Terminator efficiency
Termination sites
Transcription pause
Kinetics
mRNA levels
mRNA Structure
mRNA stability
Transfer functions
Reporter levels
[Figure: test construct with Promoter, RBS, Reporter (GFP) and Terminator (T)]
Host Context & Phylogenetic Distance
• Salmonella
• E. coli TG1
• E. coli K12
• Small genetic variants, deletion/overexpression libraries
• Tagged libraries
Genomic Context
• Genome (ORI, insertion point)
• BAC (ORI)
• Plasmid (ORI)
Environmental Context
• Defined minimal vs. rich
• Different carbon sources
• Growth phase
• Stressors (T, NaCl, pH)
• 96-well plate, test tubes, flask
Context characterization
How does a part behave in different contexts?
• Predicting a part’s performance is a non-trivial task.
• Example: Promoter strength prediction from sequence
• The lessons learnt here are generally applicable to other promoters as well.
Lessons
Goals: Improve promoter predictions across genomes
Design promoters with specific strengths for use in genetic circuits
Outcome: By correlating promoter strength with sequence, we have been able to model promoters for 2 alternative sigma factors, σE and σ32
Improving Promoter Models
Virgil Rhodius
In-vitro studies and Homology modeling
Promoters of alternative sigmas are more highly conserved
Housekeeping sigma (σ)                               Alternative sigmas (σs)
Regulate 1000s of promoters                          Regulate 10s to 100s of promoters
Most promoters regulated by transcription factors    Few promoters regulated by transcription factors
Promoters poorly conserved                           Promoters relatively well-conserved
As a test case, we have modeled promoters of two E. coli alternative sigma factors: the membrane stress sigma factor σE and the heat shock sigma factor σ32
[Figure: RNA polymerase (αCTD, αNTD, β/β’, σ) bound to a promoter; labeled elements: UP, -35, -10, Discriminator, +1, Variable Spacers, AT rich]
Weakly conserved motifs separated by variable length spacer sequences – difficult to identify motifs
Why is predicting/modeling promoters difficult?
Promoters encode a multistep process
• Each kinetic step is encoded in the promoter sequence
• Position Weight Matrix models are used for predicting transcription factor binding sites
• These models are based on binding energies
• Promoters are more complex because they encode multiple steps
• Transcription initiation requires DNA binding, DNA melting and promoter escape
[Figure: kinetic scheme. RNA polymerase + promoter, bound complex (KB), open complex (kf), abortive initiation (NTPs), promoter escape + mRNA]
Approach – Refining Promoter Models
• Compare promoter strength with model promoter score
• Identify motifs that improve promoter strength
• Study outliers
Promoter prediction Model
Promoter strength
Promoter Score = PWM model score of homologous promoter sequences
Promoter Strength = rate of mRNA transcript production
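The approach above (compare strength with score, then study outliers) can be sketched in a few lines. This is only an illustration: the promoter names, scores and strengths are hypothetical placeholders, and the 1.5 standard deviation cutoff for calling an outlier is an assumption, not a value from the talk.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between model scores and measured strengths."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def linear_fit(xs, ys):
    """Ordinary least-squares line: strength = slope * score + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def find_outliers(names, xs, ys, n_sd=1.5):
    """Promoters whose residual from the fitted line exceeds n_sd residual SDs."""
    slope, intercept = linear_fit(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    sd = math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))
    return [name for name, r in zip(names, residuals) if abs(r) > n_sd * sd]

# Hypothetical data: PWM scores and measured strengths for six promoters.
names = ["P1", "P2", "P3", "P4", "P5", "P6"]
scores = [2, 4, 6, 8, 10, 12]
strengths = [1.0, 2.1, 2.9, 4.2, 1.0, 6.1]

r = pearson_r(scores, strengths)
flagged = find_outliers(names, scores, strengths)
print(f"r = {r:.2f}, outliers = {flagged}")
```

Promoters flagged this way (high score but low strength, or the reverse) are the ones worth inspecting for motifs the model is missing.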
Measuring strength of σE promoters
Library of 60 natural σE promoters (E. coli and Salmonella)
[Figure: promoter map on a position axis from -60 to +1 showing the UP element, -35 and -10 motifs; LONG promoters span -65 to +20; SHORT promoters are also shown]
In vivo promoter strength measurement: GFP reporter assay (σE overexpression)
In vitro promoter strength measurement: transcription from linear DNA templates
[Figure: promoter strength plotted for each promoter]
Rhodius & Mutalik, 2010a. PNAS 107:2854-9
σE Natural promoters (60)
In vivo and in vitro strength of σE promoters: first time for any sigma regulon
Active promoters
Weak/inactive promoters
Build PWM models from active SHORT promoters (-35 and -10 motifs)
Test models by cross-validation
Test the models' ability to distinguish weak promoters
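A minimal sketch of the build-and-test strategy above, using leave-one-out cross-validation: each active motif is scored with a PWM built from all the other active motifs, and the weak motifs are scored with the full model. The motifs are hypothetical placeholders rather than real σE promoter sequences, and the pseudocount of 0.5, natural log and uniform background of 0.25 are assumptions.

```python
import math

BASES = "ACGT"

def build_pwm(motifs, pseudocount=0.5, expected=0.25):
    """One dict of log-odds weights per aligned position (pseudocount is assumed)."""
    total = len(motifs) + pseudocount * len(BASES)
    return [
        {b: math.log((sum(m[i] == b for m in motifs) + pseudocount) / total / expected)
         for b in BASES}
        for i in range(len(motifs[0]))
    ]

def score(pwm, motif):
    """Additive score: sum of the weights of the observed base at each position."""
    return sum(column[base] for column, base in zip(pwm, motif))

def leave_one_out_scores(motifs):
    """Score each motif with a PWM built from all the other motifs."""
    scores = []
    for i, held_out in enumerate(motifs):
        training = motifs[:i] + motifs[i + 1:]
        scores.append(score(build_pwm(training), held_out))
    return scores

# Hypothetical aligned motifs, NOT real sigma-E promoter sequences.
active = ["GGAACT", "GGAACT", "GGAACC", "TGAACT", "GGAATT"]
weak = ["CTGTAG", "ATCGGA"]

cv_scores = leave_one_out_scores(active)
full_model = build_pwm(active)
weak_scores = [score(full_model, m) for m in weak]
print("lowest held-out active score:", round(min(cv_scores), 2))
print("highest weak score:", round(max(weak_scores), 2))
```

If the held-out active motifs still score well above the weak ones, the model is not simply memorizing its training set.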
Modeling Strategy for σE promoters
Promoter score using Position Weight Matrix models (PWMs)
1. Align motifs
2. Build frequency matrix
3. Build position weight matrix
4. Score motifs
For each promoter:
Short promoter score = PWM(-35) + PWM(-10) + PWM(+1) + spacer penalties

Aligned motifs:
G G A
G C A
G G T

Frequency matrix:
G  3  2  0
A  0  0  2
T  0  0  1
C  0  1  0

Position weight matrix:
G   1.0   0.7  -0.9
A  -0.9  -0.9   0.7
T  -0.9  -0.9   0.2
C  -0.9   0.2  -0.9

weight = log (observed freq / expected freq)
Score of GGT = 1.0 + 0.7 + 0.2 = 1.9

Assumptions:
• Weights represent the binding energies for DNA binding proteins
• Each position is additive
• Each position is independent
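The worked example above can be reproduced with a short sketch. The slide does not state the log base, background frequency or pseudocount; a natural log, a uniform expected frequency of 0.25 and a pseudocount of 0.5 per base are assumptions, chosen because they reproduce the weights 1.0, 0.7, 0.2 and -0.9 shown for counts of 3, 2, 1 and 0.

```python
import math

BASES = "GATC"  # same row order as the matrices above

def frequency_matrix(motifs):
    """Count each base at each aligned position."""
    length = len(motifs[0])
    return {b: [sum(m[i] == b for m in motifs) for i in range(length)] for b in BASES}

def weight_matrix(counts, n_seqs, pseudocount=0.5, expected=0.25):
    """weight = ln(observed freq / expected freq); the pseudocount is an assumption."""
    total = n_seqs + pseudocount * len(BASES)
    return {b: [math.log((c + pseudocount) / total / expected) for c in row]
            for b, row in counts.items()}

def score(pwm, motif):
    """Additive, position-independent score of one motif."""
    return sum(pwm[base][i] for i, base in enumerate(motif))

motifs = ["GGA", "GCA", "GGT"]
counts = frequency_matrix(motifs)
pwm = weight_matrix(counts, len(motifs))

for base in BASES:
    print(base, [round(w, 1) for w in pwm[base]])
print("GGT score:", round(score(pwm, "GGT"), 1))
```

The pseudocount keeps unobserved bases from receiving a weight of minus infinity, which matters when a model is built from only a handful of aligned promoters.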
Modeling Strategy for σE promoters
Sequence logo of 40 in vivo active σE promoters
Model for σE short promoters
Promoter module scores with total scores and strengths
Promoter strength summary
•Promoter strength can be modeled based on strong promoters
•Assumptions of PWMs generally true for core promoter sequences
•Minimum module scores required to distinguish promoter function
•Provides a strategy for improving promoter prediction models
Rhodius & Mutalik, 2010a. PNAS 107:2854-9
Rhodius & Mutalik, 2010b. Stay tuned!
Promoter strength summary
1. Can we build reliable “strong, medium and weak” synthetic promoters?
2. Lack of good-quality data for building models and deriving general rules
3. We do not even have simple data for different promoter-RBS-CDS combinations.
BioFAB: Pilot project
Widely used Promoters and RBS
1. What happens?
2. Are they independent?
3. Composition rules?
4. Predictive models?
Promoter-RBS combinations
10 famous promoters
10 used RBSs
[Figure: Promoter (-35, -10, +1), GFP, T construct flanked by CONTEXT]
GFP = f(P, RBS)
Promoter-RBS combinations
Assay. Strain: BW25113. Media: MOPS
Combinatorial library assembled: total 144
Plate reader
Flow cytometry
Data
Data analysis and modeling
Lance Martin
Promoter-RBS combinations: Results
[Figure: activity of each promoter-RBS combination; axes: Promoters, RBS; value: Activity]
Strategy for data analysis
Black box Seq-Activity models: sequence (P1, R1, ...) = Activity (A)
PWM, PLSR, HMM, NN, SVM models
Features Sequence-Activity models: features of P and R (labels as extracted: 35, 10, sp, T rate, dG, sp) = A
Recoding
Joao Guimaraes
Results
βP (P) * βR (R) = [Activity]

Recoding of promoter (P) and RBS (R) identity as indicator variables:
      P1  P2  P3  ---      R1  R2  R3  ---
1      1   0   0   0        1   0   0   0
2      0   1   0   0        0   1   0   0
3      0   0   0   1        0   1   0   0
---    1   0   0   0        0   0   0   1
A (activity) columns as extracted: 0, 0, 0, 1 and 100, 10, 50, 1

[Figure: Predicted Activity vs. Observed Activity, both axes 0 to 10; R² = 0.8239, Q² = 0.75]
Multivariate data analysis (partial least squares regression)
BioFAB et al., manuscript in preparation
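The slide reports R² and Q² without defining them. Assuming the usual conventions for partial least squares regression, R² measures how well the fitted model reproduces the data it was trained on, and Q² is its cross-validated counterpart. The symbols below are introduced here for illustration and are not from the slides: y_i is the observed activity, ŷ_i the fitted activity, ȳ the mean observed activity, and ŷ_(-i) the prediction for observation i from a model fitted without it (or without its cross-validation fold).

```latex
R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2},
\qquad
Q^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_{(-i)}\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
```

Under these definitions, a Q² close to R² suggests the model is predicting held-out combinations rather than overfitting.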
Results and Summary
• Varied promoters & RBSs were used as a test case for studying junctions
• Promoter & RBS regions appear to be independent
• A simple model explains more than 70% of the variance in the data
• We are testing the generality of the model for different reporters and building more sophisticated models
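The independence claim above can be illustrated with a minimal sketch. If promoter and RBS contributions multiply, log(activity) is the sum of a promoter term and an RBS term, and for a complete promoter-by-RBS table the least-squares fit of that additive model is given by row and column means. This is a simplified stand-in for the partial least squares regression used in the talk, and both activity tables are hypothetical.

```python
import math

def fit_independent_model(activity):
    """Fit log(A[i][j]) = grand + p_eff[i] + r_eff[j] to a complete table.

    activity[i][j] is the measured output (> 0) of promoter i with RBS j.
    Returns the promoter effects, RBS effects and R^2 on the log scale.
    """
    logs = [[math.log(a) for a in row] for row in activity]
    n_p, n_r = len(logs), len(logs[0])
    grand = sum(map(sum, logs)) / (n_p * n_r)
    p_eff = [sum(row) / n_r - grand for row in logs]
    r_eff = [sum(row[j] for row in logs) / n_p - grand for j in range(n_r)]
    ss_res = sum((logs[i][j] - (grand + p_eff[i] + r_eff[j])) ** 2
                 for i in range(n_p) for j in range(n_r))
    ss_tot = sum((logs[i][j] - grand) ** 2
                 for i in range(n_p) for j in range(n_r))
    return p_eff, r_eff, 1 - ss_res / ss_tot

# Hypothetical table where promoter and RBS effects multiply exactly.
independent = [[p * r for r in (1, 3, 9)] for p in (1, 2, 4)]
# Same table with one combination behaving unexpectedly (a junction effect).
coupled = [[1, 3, 9], [2, 6, 18], [4, 12, 1]]

_, _, r2_independent = fit_independent_model(independent)
_, _, r2_coupled = fit_independent_model(coupled)
print(round(r2_independent, 3), round(r2_coupled, 3))
```

The fraction of variance left unexplained by such a model is a rough measure of how much promoter-RBS junction effects matter.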
Predicting Promoter-RBS combination outputs
[Figure: extending the P1, R1 model to other reporters (GFP, RFP, mCherry, Cat, LacZ) and to further factors (Copy, Environment, RNA)]
Context characterization
How does a part behave in different contexts?
Expression operating unit (EOU)
How to insulate functionality of parts from context change?
How to improve predictability of performance?
EOU: higher resolution
Assembly methods & context change
• Specific restriction sites?
• BioBrick scars?
• No restriction sites?
3’ UTR
EOU # 1.0: Components
[Figure: EOU construct flanked by CONTEXT: Insulator, Promoter, GFP, T]
Component labels as extracted: pTac; Insulator; T; Pause; SspB region; T7A1; D111; rpoC Terminator; Bujard UTR; Reporter; dbI Terminator
Reporters: GFP, mRFP, mCherry, Gemini, LacZ
Break EOU
Improvise
Davis et al., Sauer lab
Mutalik et al., Arkin lab
Part Libraries
How do we design parts such that their performance is predictable?
Junction architectures
Statistical experimental design: extension
Libraries: promoter library, promoter insulators, RBS library, terminator library, reporter library
Operon designs and optimization
EXPRESSION OPERATING UNIT
Elements: Ins, Up, P, 5’, Tran, CDS, 3’, T, Ins
Junctions between adjacent elements: J1, J2, J3, J4, J5, J6, J7, J8
C-Dog Project scope
Work in progress
“C. Dog” Project Leadership
[Figure: Expression Operating Unit with elements Ins, Up, P, 5’, Tran, CDS, 3’, T, Ins and junctions J1 to J8]
Team Guillaume
Team Vivek
Vector & Chromosome
Data Analysis
Open technology and contribution
• Defining our progress/success criteria
• We will work together with academic and industry communities to propose, adopt, implement and refine best practices in characterization, tools and methods
Acknowledgements
UCB: Chris Anderson, John Dueber, Julius Lucks, Weston Whitaker, Stanley
JBEI: Nathan Hillson, Aindrila M, Masood Hadi, Hou Cheng Chu, Will Holtz, Jeff Dietrich, Greg Bokinsky, Adrienne McKee
UCSF: Virgil Rhodius, Athanasios Typas, Chris Voigt
BioFAB team: Joao and Lance
Adam Arkin & Drew Endy
Stanford: Christina Smolke
SynBERC: Kevin Costa, Leonard Katz
C-Dog: Puzzles & Opps
Puzzles
• EOU parts
• Insulators/Junctions
• Integration sites
Opportunities
• Assembly
• HT characterization (e.g., microfluidic)
• Context studies
• Biophysical models
• Internal ref standards