
AI for Synthetic Biology @ IJCAI 2016

Aaron Adler Fusun Yaman

This document does not contain technology or Technical Data controlled under either the U.S. International Traffic in Arms Regulations or the U.S. Export Administration Regulations.

Sponsors

Workshop Goals

•  Expose AI researchers to a new domain
•  Bring new tools and techniques to Synthetic Biologists to help address hard problems
•  Cross-pollinate the AI and SynBio communities
•  Develop collaborations between the communities
•  Discuss next steps

Schedule

Time | Title | Authors | Affiliations
1:30 | Welcome | A. Adler, F. Yaman |
1:40 | Introduction to Synthetic Biology | A. Adler, F. Yaman | BBN, USA
2:00 | AI for Synthetic Biology | F. Yaman, A. Adler | BBN, USA
2:25 | Automated reading, assembly and explanation to guide biological design | N. Miskov-Zivanov | University of Pittsburgh, USA
2:50 | Debugging Genetic Programs with Bayesian Networks | G. Karlebach, L. Woodruff, C. Voigt and B. Gordon | MIT Broad Foundry, USA
3:10 | Molecular Robots Obeying Asimov's Three Laws of Robotics | G. Kaminka, R. Spokoini-Stern, Y. Amir, N. Agmon and I. Bachelet | Bar Ilan University & Augmanity, Israel
3:30 | Coffee Break | |
4:00 | A Combinatorial Design Workflow for Search and Prioritization in Large-Scale Synthetic Biology Construct Assembly | J. Ng, A. Berliner, J. Lachoff, F. Mazzoldi, E. Groban | Autodesk Research, USA
4:20 | MDP-based Planning for Design of Gene-Repression of Circuits | T. Amimeur and E. Klavins | University of Washington, USA
4:40 | Using Machine Learning to Interpret Untargeted Metabolomics in the Context of Biological Samples | A. Tong, N. Alden, V. Porokhin, N. Hassanpour, K. Lee and S. Hassoun | Tufts University, USA
5:00 | Discussion and closing remarks | |
5:30 | Workshop ends | |


Introduction to Synthetic Biology

Aaron Adler Fusun Yaman

Work partially sponsored by DARPA under contract HR0011-10-C-0168. The views and conclusions contained in this document are those of the authors and not DARPA or the U.S. Government.

What is Synthetic Biology?

•  “… a maturing scientific discipline that combines science and engineering in order to design and build novel biological functions and systems” [SynBERC]

•  Synthetic biologists are working on diverse applications:
    –  New medical diagnostics and therapies
    –  Extract harmful pollutants from the ground
    –  Chemical production or detection

•  Synthetic Biology is at a crossroads: AI can help!

[Figure: Program; Cells Executing Program]

Synthetic Biology vs. Genetic Engineering

•  Genetic Engineering is the ability to read, copy, and edit DNA so that controlled changes can be made to organisms

•  Engineering is the application of scientific, economic, social, and practical knowledge in order to invent, design, build, maintain, and improve structures, machines, devices, systems, materials, and processes. (Wikipedia)

•  Understand the design enough to make a prediction about how it will behave


Synthetic Biology as an Engineering Discipline

•  Goal: Design sophisticated biological systems in a reliable, efficient, and predictable manner

•  Useful engineering practices:
    –  Libraries of “parts”, Component testing, Standards & interfaces, Decoupling, Modularity, Computer aided design
•  Issues in engineering biological systems:
    –  Device characterization, Impedance matching, Rules of composition, Noise, Cellular context, Environmental conditions, Rational design vs. directed evolution, Persistence, Mutations, Crosstalk, Cell death, Chemical diffusion, Motility, Incomplete models
•  A discipline that needs new engineering rules
    –  The rules don’t have to be identical to natural evolution

[Weiss]

Why is this Important?

•  Breaking the complexity barrier:
•  Multiplication of research impact
•  Reduction of barriers to entry

[Figure: Length in base pairs (log scale, 100 to 1,000,000) vs. Year (1975 to 2010); series: DNA synthesis, Circuit size*. Data labels as extracted: 207; 2,100; 2,700; 7,500; 14,600; 32,000; 583,000; 1,080,000; ?]

*Sampling of systems in publications with experimental circuits

[Purnick & Weiss, ‘09]

More Recent Advances

[Weiss]

From Idea to Implementation

Hierarchical Organization in Synthetic Biology

[Figure labels: System integration, Genetic parts, Modules, Applications]

DNA, RNA, Proteins

•  DNA (Deoxyribonucleic acid) is a double helix encoding genetic instructions
    –  Composed of nucleotides: adenine (A), cytosine (C), guanine (G), or thymine (T)
•  RNA (Ribonucleic acid) is usually single stranded
    –  Composed of nucleotides: adenine (A), cytosine (C), guanine (G), or uracil (U)
•  RNA can encode an amino acid sequence that can in turn produce a protein
•  Expression dependent on cellular platform, e.g., animal, yeast, bacteria, and cellular context, e.g., heart vs. skin cell

[Wikimedia]
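The DNA to RNA to protein flow described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `transcribe` and `translate` are hypothetical, the codon table is a small subset of the standard genetic code, and the input is assumed to be a coding-strand sequence.

```python
# Illustrative subset of the standard genetic code (RNA codons).
# A real table has 64 entries; only the codons used below are listed.
CODON_TABLE = {
    "AUG": "M",                          # methionine, the usual start codon
    "GCU": "A", "GCC": "A",              # alanine
    "UGG": "W",                          # tryptophan
    "AAA": "K",                          # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate codon by codon from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        # "?" marks a codon that is missing from this illustrative subset.
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "?")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGAAATAA")
protein = translate(mrna)
print(mrna, protein)
```

The sketch ignores everything the last bullet warns about: cellular platform and context decide whether and how strongly a sequence is actually expressed.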

Common Machinery: Transcriptional Logic

Transcription is the copying of a region of DNA, the molecule in which genetic information is encoded as a sequence of nucleotides, into a strand of RNA.

Translation is the decoding of the amino acid sequence of an RNA sequence to produce a protein, thereby increasing the concentration of that protein. Proteins are the main “machinery” of a cell. Among other things, they act as sensors, as actuators, and as regulators of other biological processes.

Degradation and Dilution are the processes by which the concentration of proteins and RNA transcripts decreases. Degradation is the chemical breakdown of a molecule by cellular processes or by its own instability. Dilution is the side effect of cells growing and dividing: the effective concentration of any molecule drops in proportion to the amount that the volume of the cell increases.

Regulation is the interaction of a protein with the promoter region of a DNA sequence, thereby modulating the rate at which transcription acts on the region of DNA controlled by the promoter. The protein may repress the promoter, inhibiting transcription, or it may activate the promoter, enhancing transcription.

Time constants of these processes are often quite slow, acting on the order of minutes, hours, or (in some cases) days.
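A common way to make these four processes concrete is a two-variable rate model, sketched below under stated assumptions. The model form (constant transcription, translation proportional to mRNA, first-order degradation plus dilution) and every parameter value are illustrative choices, not values from the slides; `simulate_expression` and `steady_state` are hypothetical names.

```python
def simulate_expression(k_tx, k_tl, deg_m, deg_p, growth, t_end, dt=0.01):
    """Euler integration of mRNA (m) and protein (p) concentrations.

    dm/dt = k_tx     - (deg_m + growth) * m   # transcription, loss
    dp/dt = k_tl * m - (deg_p + growth) * p   # translation, loss
    `growth` is the cell growth rate, which dilutes both species.
    """
    m = p = 0.0
    for _ in range(round(t_end / dt)):
        dm = k_tx - (deg_m + growth) * m
        dp = k_tl * m - (deg_p + growth) * p
        m += dm * dt
        p += dp * dt
    return m, p


def steady_state(k_tx, k_tl, deg_m, deg_p, growth):
    """Closed-form steady state of the same model."""
    m_ss = k_tx / (deg_m + growth)
    p_ss = k_tl * m_ss / (deg_p + growth)
    return m_ss, p_ss


# Assumed parameters, in units of "per minute" (illustrative only).
params = dict(k_tx=2.0, k_tl=5.0, deg_m=0.2, deg_p=0.01, growth=0.02)
m_sim, p_sim = simulate_expression(t_end=600.0, **params)
m_ss, p_ss = steady_state(**params)

# Same circuit in a non-growing cell: no dilution, so more protein accumulates.
_, p_ss_no_growth = steady_state(**{**params, "growth": 0.0})
print(m_sim, p_sim, p_ss_no_growth)
```

With these assumed rates the protein time constant is 1 / (0.01 + 0.02), about 33 minutes, which is consistent with the slide's point that these processes act over minutes to hours.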

Focus on Information / Control

[Figure: Structural, Chemical, and Informational views. Diagram labels: DNA, promoter, RNA polymerase, RNA, ribosome, Proteins, Degradation & Dilution; steps numbered 1 to 4]

Informational: Represent processes as digital logic, data flows
Chemical: Cellular reactions are fundamentally probabilistic and chemical
Structural: Reactions depend on the physical structure of DNA/RNA/Proteins/Cells

Most complex applications will require all three

System Design

High level goal: develop an engineering discipline for biology

[Figure: a synthetic circuit inside a cellular context, connected to the environment through SENSORS, PROCESSING, and ACTUATION]

SENSORS (temperature, pH, light, chemical signals, mechanical force)
•  Number?
•  Types?

PROCESSING
•  Sophistication?
    –  Timing
    –  States
    –  Lookup tables
    –  …

ACTUATION (fluorescence, movement, electrical activity, chemical products)
•  Number?
•  Types?

Building Blocks

•  Features (Parts) are previously identified DNA sequences that perform a specific biological function
    –  promoter initiates transcription
    –  coding sequence for a protein
    –  terminator that halts transcription
•  Parts used as basis for engineering
•  Fluorescent proteins can be observed and used to help understand what is going on in a cell

[Figure: Promoter, CDS, Terminator]

Genetic Regulatory Networks

•  A collection of DNA regions and their regulatory interactions is called a genetic regulatory network (GRN)
    –  Takes advantage of the modularity of the DNA molecule
    –  Design of a desired computation
•  The GRN may be designed as a single DNA sequence or as multiple separate sequences
    –  Can operate as an insertion into the organism’s existing DNA, as a virus, or as independent free-floating DNA loops (known as plasmids)

[Figure: example GRN. Key: Promoter, Protein, Repress, Activate. Labels: pHef1a, rtTA, Dox, pTre, CFP, LacI, pHef1a-LacO1Oid, EYFP]

If there is Dox Then glow Cyan Else glow Yellow
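The "If there is Dox Then glow Cyan Else glow Yellow" behavior can be written as Boolean logic. The sketch below is hedged: the wiring is inferred from the figure labels (rtTA expressed constitutively, Dox plus rtTA activating pTre, pTre driving CFP and LacI, LacI repressing the promoter that drives EYFP) and is an assumption, as are the function names.

```python
def dox_reporter_circuit(dox_present: bool) -> dict:
    """Boolean abstraction of the example GRN (wiring assumed, see lead-in)."""
    rtta = True                        # assumed constitutive expression from pHef1a
    ptre_active = rtta and dox_present # pTre needs both rtTA and Dox
    cfp = ptre_active                  # CFP is driven by pTre
    laci = ptre_active                 # LacI is driven by pTre as well
    eyfp = not laci                    # LacI represses pHef1a-LacO1Oid, which drives EYFP
    return {"CFP": cfp, "LacI": laci, "EYFP": eyfp}


def glow_color(dox_present: bool) -> str:
    """Return the reporter color the cell is expected to show."""
    state = dox_reporter_circuit(dox_present)
    return "cyan" if state["CFP"] else "yellow"


for dox in (False, True):
    print(dox, glow_color(dox))
```

A Boolean view hides the analog behavior of real cells (partial induction, leaky promoters, slow turnover of already-made EYFP), so it describes the design intent rather than measured output.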

Biological Circuits

•  Various forms of interaction can be used as computational building blocks for building more complex biological circuits
    –  Deliberately analogous to electronic circuits

•  Loops in the regulatory network can be used for feedback control or to create state memory

•  Allows an extremely wide variety of computational and control systems to be implemented as genetic regulatory networks
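One standard illustration of state memory from a regulatory loop is a pair of genes that repress each other, often called a toggle switch. The sketch below is not taken from the slides: the dimensionless equations, the parameter values (`alpha`, `n`), and the function name `simulate_toggle` are assumptions chosen to show bistability.

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2.0, t_end=50.0, dt=0.01):
    """Euler integration of two mutually repressing genes.

    du/dt = alpha / (1 + v**n) - u   # v represses production of u
    dv/dt = alpha / (1 + u**n) - v   # u represses production of v
    """
    u, v = u0, v0
    for _ in range(round(t_end / dt)):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u, v = u + du * dt, v + dv * dt
    return u, v


# The same circuit settles into different states depending on its history.
state_a = simulate_toggle(u0=5.0, v0=0.1)  # starts with u ahead
state_b = simulate_toggle(u0=0.1, v0=5.0)  # starts with v ahead
print(state_a, state_b)
```

Because the final state depends only on which gene started ahead, the loop stores one bit; a transient input that pushes the other gene up would flip it.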


Digital Logic in an Analog World

•  Biological processes can support digital logic devices!

[Figure: Output Concentration vs. Input Concentration, comparing the ideal response (step fn) with reality (sigmoidal); low and high ranges on each axis are labeled “0” and “1”]
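The contrast between the ideal step and the sigmoidal reality can be sketched with a Hill function, a common model for such response curves. The Hill form, the parameters `k` and `n`, and the 20% / 80% logic thresholds below are assumptions for illustration, not values from the slides.

```python
def hill_activation(x, k=1.0, n=4.0, y_max=1.0):
    """Sigmoidal response: output rises smoothly around input level k."""
    return y_max * x ** n / (k ** n + x ** n)


def step_response(x, k=1.0, y_max=1.0):
    """Ideal digital response: a hard switch at input level k."""
    return y_max if x >= k else 0.0


def to_logic(y, low=0.2, high=0.8, y_max=1.0):
    """Map an analog output to 0 or 1, or None inside the undefined band."""
    fraction = y / y_max
    if fraction <= low:
        return 0
    if fraction >= high:
        return 1
    return None


for x in (0.5, 1.0, 2.0):
    y = hill_activation(x)
    print(x, round(y, 3), step_response(x), to_logic(y))
```

Inputs well below or well above `k` map cleanly to “0” and “1”, while inputs near `k` land in the undefined band; a larger `n` narrows that band and brings the curve closer to the ideal step.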

Genetic Building Block – Digital Inverter

[Figure, two slides: an input protein (repressor) acts on promoter P, which controls a CDS; Transcription / Translation yields the output protein. Input 0 gives output 1; input 1 gives output 0]

Interacting with Cells – IMPLIES Gate

[Figure, two slides: an input protein (repressor) acts on promoter P, which controls a CDS; Transcription / Translation yields the output protein. In the second slide the repressor is shown as an inactive repressor]

Repressor | Inducer | Output
0 | 0 | 1
0 | 1 | 1
1 | 0 | 0
1 | 1 | 1
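The truth table above is the logical implication Repressor → Inducer. A minimal sketch, assuming the inducer works by inactivating the repressor (the mechanism suggested by the "inactive repressor" label); `implies_gate` is a hypothetical name.

```python
from itertools import product


def implies_gate(repressor: int, inducer: int) -> int:
    """Output is blocked only by a repressor that no inducer has inactivated."""
    active_repressor = bool(repressor) and not bool(inducer)
    return int(not active_repressor)


# Truth table as given on the slide: (repressor, inducer) -> output.
SLIDE_TRUTH_TABLE = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 1}

rows = [(r, i, implies_gate(r, i)) for r, i in product((0, 1), repeat=2)]
for row in rows:
    print(*row)
```

With the inducer held at 0 the gate reduces to the digital inverter from the previous slides: output 1 for repressor 0, output 0 for repressor 1.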

Abstract Genetic Regulatory Network (A-GRN)

•  Defines logical relationship between abstract parts
•  The GRN above
    –  Y induces and Z represses the transcription of X
•  The overall behavior depends on chemical properties
    –  degradation (γ), dissociation (D), fold activation (K), basal expression (α), cooperativity (H)

[Figure: network of Y, Z, and X, each annotated with its parameters: (Ky, Hy, Dy, αy, γy), (Kz, Hz, Dz, αz, γz), (Kx, Hx, Dx, αx, γx)]

Simulating System Behavior

•  Change in concentration of the chemicals approximated using differential equations

•  The input/output relationship between X&Y and X&Z

[Figure: X regulated by Y and Z, annotated with (Ky, Hy, Dy), (Kz, Hz, Dz), and (αx, γx)]
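The slide's equations did not survive extraction, so the following is only one plausible way to turn the listed parameters into a differential equation, offered as a sketch. The Hill-type regulation factors, the way K enters as a fold change, the parameter values, and all names (`Regulator`, `activation`, `repression`, `simulate_x`) are assumptions rather than the authors' model.

```python
from dataclasses import dataclass


@dataclass
class Regulator:
    K: float  # fold change at saturation
    H: float  # cooperativity (Hill coefficient)
    D: float  # dissociation constant (concentration at half effect)


def activation(conc: float, reg: Regulator) -> float:
    """Factor rising from 1 (no activator) toward K (saturating activator)."""
    s = (conc / reg.D) ** reg.H
    return (1.0 + reg.K * s) / (1.0 + s)


def repression(conc: float, reg: Regulator) -> float:
    """Factor falling from 1 (no repressor) toward 1/K (saturating repressor)."""
    s = (conc / reg.D) ** reg.H
    return (1.0 + s / reg.K) / (1.0 + s)


def simulate_x(y, z, reg_y, reg_z, alpha_x, gamma_x, x0=0.0, t_end=200.0, dt=0.01):
    """Euler integration of dX/dt = alpha_x * act(Y) * rep(Z) - gamma_x * X."""
    production = alpha_x * activation(y, reg_y) * repression(z, reg_z)
    x = x0
    for _ in range(round(t_end / dt)):
        x += (production - gamma_x * x) * dt
    return x


# Assumed parameter values, for illustration only.
reg_y = Regulator(K=50.0, H=2.0, D=1.0)
reg_z = Regulator(K=50.0, H=2.0, D=1.0)
common = dict(reg_y=reg_y, reg_z=reg_z, alpha_x=0.1, gamma_x=0.1)

x_basal = simulate_x(y=0.0, z=0.0, **common)       # neither regulator present
x_induced = simulate_x(y=10.0, z=0.0, **common)    # Y induces X
x_repressed = simulate_x(y=10.0, z=10.0, **common) # Z counteracts the induction
print(x_basal, x_induced, x_repressed)
```

Sweeping `y` or `z` over a range and recording the final `x` gives the input/output relationships between X&Y and X&Z that the second bullet refers to.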

DNA Assembly Techniques

•  BioBricks
•  Magnetic Beads
•  Gateway-Gibson
•  Golden Gate
•  Enzymes break DNA apart allowing parts to join together

[igem.org]

Getting the New DNA into Cells

•  How do you get new DNA into the cell?
    –  Transfection
        •  Chemical and non-chemical methods
    –  Lipofection
    –  Virus delivery
•  DNA can be:
    –  chromosomally integrated OR
    –  transiently transfected OR
    –  on a separate plasmid
•  How do you measure the results?
    –  fluorescence
    –  mass spectrometry
    –  antibody assays
    –  cells emit other chemicals
    –  RNA/DNA assays, …

Simple Circuit Example

“Fluoresce green when doxycycline is present”

[Figure: circuit with labels Dox, rtTA, GFP]

Currently, even something this simple isn’t easy…

Sense/Actuate Example

[Figure: cells under No Arabinose and High Dose Arabinose conditions. Circuit labels: Ara, AraC, pBAD, GFP, TetR, pTet, RFP, pBAD]

Synthetic Biology

[Figure: example systems] [Medford] [Weiss] [Levskaya] [Hasty]

Outlook for Applications

Microbial biochemical synthesis
•  artemisinin
•  other pharmaceuticals

Environmental applications
•  environmental remediation
•  toxin sensing
•  explosive sensing

Bioenergy production
•  biodiesel
•  hydrogen
•  methane
•  …

Biomedical applications
•  cancer therapeutic agents
•  artificial tissue homeostasis
•  programmed tissue regeneration
•  artificial immune system

Example genetic circuit applications

[Figure panels: Fermentation control; CAR T-cell Therapy]