Post on 19-Mar-2016
Dynamic Programming
How to match up sequences and have the matches make sense and be quantitative
Question is
• How does a specific sequence compare to one other specific sequence?
  – Is it similar?
  – If so, at what level?
• Can’t compare every base to every other base: too complex
You are in the driver’s seat
• What is the most important?
  – Exact nucleotide match?
  – One-for-one (no gaps)?
  – Length?
Mathematical model
• Derive equation for each position, based on your value system
• Methodically go through each base for each sequence and calculate the value
• At the end, find the optimal path
Starting point: three possible scenarios for each position in sequences X and Y
• At a given position, the bases (Xm and Yn) are identical in X and Y
• At a given position, the base (Xm) in X is aligned with a gap in Y (and Yn appeared earlier)
• At a given position, the base (Yn) in Y is aligned with a gap in X (and Xm appeared earlier)
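The three scenarios can be combined into one equation per position. The sketch below uses the standard global-alignment (Needleman-Wunsch) recurrence, which the slides do not name. The symbols F, s, and g are assumptions introduced here: F(m, n) is the best score for aligning the first m bases of X with the first n bases of Y, s(Xm, Yn) is the value for placing Xm opposite Yn, and g is the value for a gap.

```latex
\[
F(0,0) = 0, \qquad F(m,0) = m\,g, \qquad F(0,n) = n\,g
\]
\[
F(m,n) = \max
\begin{cases}
F(m-1,\,n-1) + s(X_m, Y_n) & \text{$X_m$ aligned with $Y_n$} \\
F(m-1,\,n) + g             & \text{$X_m$ aligned with a gap in $Y$} \\
F(m,\,n-1) + g             & \text{$Y_n$ aligned with a gap in $X$}
\end{cases}
\]
```

The score of the best full alignment is the value in the last cell, and the optimal path is recovered by tracing back from that cell through the choices that produced each maximum.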
Assign a value to each situation
• Identical: +5
• Mismatch: -2
• Insertion or deletion: -6
(Could have others; could choose different values)
http://www.acm.org/crossroads/xrds13-1/dna.html
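A minimal Python sketch of the procedure, using the values above (+5, -2, -6) as defaults. It assumes a global alignment with a simple per-position gap penalty; the function name `align` and its parameters are hypothetical and chosen only for illustration.

```python
def align(x, y, match=5, mismatch=-2, gap=-6):
    """Globally align x and y; return (score, aligned_x, aligned_y)."""
    rows, cols = len(x) + 1, len(y) + 1

    # score[i][j] = best value for aligning x[:i] with y[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # x[:i] aligned entirely with gaps
    for j in range(1, cols):
        score[0][j] = j * gap          # y[:j] aligned entirely with gaps

    # Methodically fill in the value at each position.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if x[i - 1] == y[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # base in x opposite base in y
                score[i - 1][j] + gap,       # base in x opposite a gap in y
                score[i][j - 1] + gap,       # base in y opposite a gap in x
            )

    # Trace back from the last cell to recover one optimal path.
    ax, ay = [], []
    i, j = len(x), len(y)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if x[i - 1] == y[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                ax.append(x[i - 1])
                ay.append(y[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            ax.append(x[i - 1])
            ay.append("-")
            i -= 1
        else:
            ax.append("-")
            ay.append(y[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(ax)), "".join(reversed(ay))


if __name__ == "__main__":
    total, top, bottom = align("ACGT", "AGT")
    print(total)
    print(top)
    print(bottom)
```

Changing the three values changes which alignment wins. For example, a less severe gap penalty makes the method more willing to open gaps rather than accept mismatches.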
Alpha-glucosidase in plants:
Enzymes sharing WIDMNE signature sequence
• alpha-glucosidase (all groups)
• alpha-xylosidase (plant, bacteria, archaea)
• Sucrase/Isomaltase (animal)
Related sequences with broad substrate specificity
[Figure: tree of alpha-glucosidase and related sequences, grouped as Fungi, Protista, Bacteria, Plantae, Animalia, and Archaea; scale bar 0.1]
Plant α-amylases are located in different cellular compartments
• Plastids (chloroplasts, amyloplasts)
• Cytosol
• Apoplast (cell wall space)
What is the function of the non-plastid forms?
[Figure: tree of plant α-amylase sequences, including Arabidopsis AMY1, AMY2, and AMY3, in three clades]
• Clade I: Secreted, 421-445 aa
• Clade II: Cytosolic, 407-414 aa
• Clade III: Plastidic, 877-906 aa
Homologous sequences (homologues)
• Share a common ancestor
Paralogs
• Homologues derived by gene duplication
• Functions may vary
• Look for differences
Orthologs
• Homologues derived by speciation
• Common function
• Look for similarities
Use alignments to look for:
• Structures important for common functions (orthologs)
• Structures important for unique functions (paralogs)
• Unusual structures
[Figure: AtAMY1, labeled N and C]
AMY1 has a three amino acid deletion
Red: NHDTGST Blue: VAEIW
Barley α-amylase
Active site residues
Variation in the active site loop among plant and bacterial α-amylases