Dynamic Programming


How to match up sequences and have the matches make sense and be quantitative

Question is

• How does a specific sequence compare to one other specific sequence?
  – Is it similar?
  – If so, at what level?

• Can’t compare every base to every other base: too complex

You are in the driver’s seat

• What is the most important?
  – Exact nucleotide match?
  – One-for-one (no gaps)?
  – Length?

Mathematical model

• Derive equation for each position, based on your value system

• Methodically go through each base for each sequence and calculate the value

• At the end, find the optimal path

Starting point: three possible scenarios for each position in sequences X and Y

• At a given position, the bases (Xm and Yn) are identical in X and Y

• At a given position, the base (Xm) in X is aligned with a gap in Y (and Yn appeared earlier)

• At a given position, the base in Y is aligned with a gap in X (and Xm appeared earlier)
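The three scenarios can be written as one recurrence, which is the "equation for each position" mentioned earlier. This is a sketch in the standard global-alignment form, not taken from the slides: the symbols F (best score so far), s (value for pairing two bases, covering both identical and mismatched pairs), and g (value for a gap) are names assumed here for illustration.

```latex
\[
F(m,n) = \max
\begin{cases}
F(m-1,\,n-1) + s(X_m, Y_n) & \text{$X_m$ aligned with $Y_n$} \\
F(m-1,\,n) + g & \text{$X_m$ aligned with a gap in $Y$} \\
F(m,\,n-1) + g & \text{$Y_n$ aligned with a gap in $X$}
\end{cases}
\]
\[
F(0,0) = 0, \qquad F(m,0) = m\,g, \qquad F(0,n) = n\,g
\]
```

Each cell depends only on its three neighbours (diagonal, above, left), so the whole table can be filled methodically, one position at a time.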

Assign a value to each situation

• Identical: +5

• Mismatch: -2

• Insertion or deletion: -6

(Could have others; could choose different values)
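With these values in hand, the whole procedure (score every position, then find the optimal path) can be sketched in a few lines of Python. This is a minimal illustration of standard global alignment under the scores listed above, not code from the presentation; the function name `align` and the constant names are assumptions made for this example.

```python
from typing import List, Tuple

MATCH = 5      # identical bases
MISMATCH = -2  # aligned but different bases
GAP = -6       # insertion or deletion


def align(x: str, y: str) -> Tuple[int, str, str]:
    """Globally align x and y; return (score, aligned_x, aligned_y)."""
    rows, cols = len(x) + 1, len(y) + 1
    # score[i][j] holds the best score for aligning x[:i] with y[:j].
    score: List[List[int]] = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * GAP
    for j in range(1, cols):
        score[0][j] = j * GAP

    # Fill the table: each cell is the best of the three scenarios.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = MATCH if x[i - 1] == y[j - 1] else MISMATCH
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # base in X aligned with base in Y
                score[i - 1][j] + GAP,       # base in X aligned with a gap in Y
                score[i][j - 1] + GAP,       # base in Y aligned with a gap in X
            )

    # Trace back from the bottom-right corner to recover one optimal path.
    out_x: List[str] = []
    out_y: List[str] = []
    i, j = len(x), len(y)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = MATCH if x[i - 1] == y[j - 1] else MISMATCH
            if score[i][j] == score[i - 1][j - 1] + pair:
                out_x.append(x[i - 1])
                out_y.append(y[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + GAP:
            out_x.append(x[i - 1])
            out_y.append("-")
            i -= 1
        else:
            out_x.append("-")
            out_y.append(y[j - 1])
            j -= 1

    return score[len(x)][len(y)], "".join(reversed(out_x)), "".join(reversed(out_y))


if __name__ == "__main__":
    total, top, bottom = align("ACGT", "AGT")
    print(total)
    print(top)
    print(bottom)
```

Changing the three constants changes which alignment wins, which is the sense in which the value system is yours to choose. When several paths tie, this sketch reports only one of them, preferring the diagonal step.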

http://www.acm.org/crossroads/xrds13-1/dna.html

Alpha-glucosidase in plants:

Enzymes sharing WIDMNE signature sequence

• alpha-glucosidase (all groups)

• alpha-xylosidase (plant, bacteria, archaea)

• Sucrase/Isomaltase (animal)

Related sequences with broad substrate specificity

[Figure: phylogenetic tree of the related sequences (scale bar 0.1), with groups labeled Fungi, Protista, Bacteria, Plantae, Animalia, and Archaea. Leaf labels: Tp GAA, Bh BAB0442, Bt Aglu-III, Ss xylS, Lp XylQ, Tm AAD3539, Aa GlcA, Sc CAB8890, Ce AAA8317, Lv GAA, Hs S/I-C, Hs S/I-N, Cj GAAI, Cj GAAII, Hs GAA, Hv Aglu, At Aglu-1, Bv Aglu, So Aglu, Pp BAB3946, St MAL2, AtXYL1, TmXYL, Mj Aglu, Pt Aglu, Sp Aglu, Anig aglA, An AgdA, Ca GAM1, Soc GAM1, An agdB.]

Plant α-amylases are located in different cellular compartments

• Plastids (chloroplasts, amyloplasts)

• Cytosol

• Apoplast (cell wall space)

What is the function of the non-plastid forms?

[Figure: phylogenetic tree of plant amylase sequences. Leaf labels: dodder, adzuki bean, morning glory, rice 2A, barley A, barley B, rice 3B, maize, rice 3E, rice XP_472377, cassava, apple 9, apple 8, potato, plantain, rice NP_916641, kiwifruit, apple 10, Arabidopsis AMY1, Arabidopsis AMY2, Arabidopsis AMY3.]

• Clade I: Secreted, 421-445 aa

• Clade II: Cytosolic, 407-414 aa

• Clade III: Plastidic, 877-906 aa

Homologous sequences (homologues)

• Share a common ancestor

• Paralogs
  – Homologues derived by gene duplication
  – Functions may vary
  – Look for differences

• Orthologs
  – Homologues derived by speciation
  – Common function
  – Look for similarities

Use alignments to look for:

• Structures important for common functions (orthologs)

• Structures important for unique functions (paralogs)

• Unusual structures
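Once sequences are aligned, two simple scans cover much of this list: columns where every sequence agrees (candidates for shared function) and runs of gaps in one sequence (candidate insertions or deletions). The sketch below is only an illustration; the function names `conserved_columns` and `gap_runs` and the toy sequences are made up for this example and are not real enzyme data.

```python
from typing import List, Tuple


def conserved_columns(aligned: List[str]) -> List[int]:
    """Return 0-based columns where all sequences share one residue and none has a gap."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    columns: List[int] = []
    for c in range(width):
        residues = {seq[c] for seq in aligned}
        if len(residues) == 1 and "-" not in residues:
            columns.append(c)
    return columns


def gap_runs(aligned_seq: str) -> List[Tuple[int, int]]:
    """Return (start, length) for each run of '-' in one aligned sequence."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, ch in enumerate(aligned_seq):
        if ch == "-":
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(aligned_seq) - start))
    return runs


if __name__ == "__main__":
    # Hypothetical toy alignment, not real sequences.
    toy = ["ACD-EFG", "ACDWEFG", "TCDWEYG"]
    print(conserved_columns(toy))
    print(gap_runs("AC---DEF-"))
```

A run of length three in one sequence against residues in the others is the kind of signal that would flag a three-residue deletion for closer inspection.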

AMY1 has a three amino acid deletion

[Figure: AtAMY1 drawn from N-terminus to C-terminus, with the three amino acid deletion marked. Red: NHDTGST. Blue: VAEIW.]

Barley α-amylase

Active site residues

Variation in the active site loop among plant and bacterial α-amylases

[Figure: active site loop comparison, including AtAMY1.]