© Wiley Publishing. 2007. All Rights Reserved. Building Multiple- Sequence Alignments.
Multiple Sequence Alignments
Aidan Budd, EMBL Heidelberg
Build a Sequence Alignment
Sequences are usually aligned automatically: MUSCLE, PRANK, CLUSTAL etc.
Also possible 'manually' using tools such as JalView
Hopefully, these demonstrations will highlight that
Alignment is "trivial" (at one level, at least): it only involves putting gap characters in the right places
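To make this point concrete, here is a minimal Python sketch. The sequences and the function names `insert_gaps` and `remove_gaps` are made up for illustration and do not come from any alignment tool. It shows that an aligned sequence is just the original sequence with gap characters placed at chosen columns, so removing the gaps recovers the input.

```python
GAP = "-"

def insert_gaps(sequence, gap_columns, length):
    """Place `sequence` into an alignment row of `length` columns,
    writing a gap character at every column listed in `gap_columns`."""
    gap_columns = set(gap_columns)
    if not all(0 <= col < length for col in gap_columns):
        raise ValueError("gap column outside the alignment")
    if length - len(gap_columns) != len(sequence):
        raise ValueError("residues plus gaps must fill the row exactly")
    residues = iter(sequence)
    return "".join(
        GAP if col in gap_columns else next(residues)
        for col in range(length)
    )

def remove_gaps(aligned):
    """Recover the unaligned sequence from an alignment row."""
    return aligned.replace(GAP, "")

# Two invented sequences; the second lacks one residue.
seq_a = "MKVLAT"
seq_b = "MKLAT"

aligned_a = insert_gaps(seq_a, [], 6)
aligned_b = insert_gaps(seq_b, [2], 6)

print(aligned_a)
print(aligned_b)
```

The hard part of alignment, which this sketch does not attempt, is deciding which columns should receive the gaps.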
Build an Automatic MSA
http://www.ebi.ac.uk/Tools/msa/muscle/
Search Internet for "EBI Muscle"
Build an Automatic MSA
Copy and paste sequences in FASTA format
Click "Submit"
http://www.embl.de/~seqanal/courses/commonCourseContent/sequences/verySimilarHemoglobins_unaligned.fasta
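For readers unfamiliar with FASTA format, the following Python sketch may help. It assumes the common convention of a header line starting with ">" followed by one or more lines of sequence; the function name `parse_fasta` and the example records are invented for illustration and are not the contents of the course file.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}.

    The name is taken as the first word after '>' (an assumption;
    some tools keep the whole header line instead).
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else f"unnamed_{len(records) + 1}"
            records[name] = []
        elif name is not None:
            # Sequence lines may be wrapped; collect and join them later.
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}

# Invented example records, not real hemoglobin sequences.
example = """>seq_one hypothetical example
MVLSPADKTN
VKAAW
>seq_two
MVLSGEDKSN
"""

sequences = parse_fasta(example)
for name, seq in sequences.items():
    print(name, len(seq))
```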
Build an Automatic MSA
Wait for result to be returned
Click "Download Alignment File" to reach plain-text version of alignment
Build an Automatic MSA
Download file or copy-paste text into text editor to store alignment on local computer
View alignment in MSA viewer (e.g. JalView) etc.
Choosing an MSA tool
•CLUSTALX, MUSCLE, PROBCONS: divergent protein sequences
•NAST: multiple alignment of 16S rRNA genes
•PRANK: multiple alignment of relatively similar DNA sequences in an evolutionary context
•EXPRESSO (3DCoffee): multiple alignment of protein sequences, some of which have 3D structural information
•MAUVE, Enredo: multiple alignment of genomes
• and many others...
Different tools designed for different tasks
Examining MSAs: Recognising Patterns
•Only one of many possible colouring schemes
•Good at highlighting variation in conservation between
•Designed for red/green colour-blindness
CLUSTALX Colouring Scheme
extract from an alignment of p53 proteins
Amino acids with similar properties drawn with the same colour
extract from an alignment of p53 proteins
e.g. basic residues arginine (R) and lysine (K)
CLUSTALX Colouring Scheme
extract from an alignment of p53 proteins
Residues only coloured if some proportion of residues in the column have the same property
e.g. lysine (K) in columns with:
•only "a few" other basic residues (uncoloured)
•"many" other basic residues (coloured)
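The idea of colouring a residue only when enough of its column shares the same property can be sketched in Python as follows. This is an illustration, not the CLUSTALX implementation: the threshold of 0.6, the choice to ignore gaps when computing the proportion, and the function name `colour_basic_residues` are all assumptions made for the example, and the columns are invented.

```python
BASIC = set("KR")

# Assumed value for illustration only; the real CLUSTALX thresholds
# are described in its help file.
HYPOTHETICAL_THRESHOLD = 0.6

def colour_basic_residues(column, threshold=HYPOTHETICAL_THRESHOLD):
    """Return one boolean per position in an alignment column:
    True where a basic residue (K or R) would be coloured."""
    residues = [r for r in column if r != "-"]  # gaps ignored (assumption)
    if not residues:
        return [False] * len(column)
    fraction_basic = sum(r in BASIC for r in residues) / len(residues)
    return [r in BASIC and fraction_basic >= threshold for r in column]

# Lysine with only "a few" other basic residues: left uncoloured.
few = colour_basic_residues("KAGLSTEV")

# Lysine with "many" other basic residues: coloured.
many = colour_basic_residues("KRKKRAKK")

print(few)
print(many)
```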
CLUSTALX Colouring Scheme
Hydrophobic: L V I M F W A C
Polar: N T S Q
Acidic: D E
Basic: K R
Secondary-structure breaking: G P
Large Aromatic Polar: H Y
(CLUSTALX help file fully describes the default colouring rules)
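The residue groups listed on this slide can be written down as a simple lookup table. The Python sketch below does only that; the names `PROPERTY_GROUPS`, `RESIDUE_TO_GROUP` and `classify` are invented for illustration, and no colours are assigned because the slide text does not list them.

```python
# Property groups exactly as listed on the slide.
PROPERTY_GROUPS = {
    "hydrophobic": "LVIMFWAC",
    "polar": "NTSQ",
    "acidic": "DE",
    "basic": "KR",
    "secondary-structure breaking": "GP",
    "large aromatic polar": "HY",
}

# Invert the table so each residue maps to its group.
RESIDUE_TO_GROUP = {
    residue: group
    for group, residues in PROPERTY_GROUPS.items()
    for residue in residues
}

def classify(sequence):
    """Return the property group of each character in `sequence`.
    Gaps and unrecognised symbols are reported as 'other'."""
    return [RESIDUE_TO_GROUP.get(char.upper(), "other") for char in sequence]

print(classify("KR-GY"))
```

Together the six groups cover all 20 standard amino acids, each exactly once.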
CLUSTALX Colouring Scheme
http://tardis.nibio.go.jp/cgi-bin/homstrad/showpage.cgi?family=response_reg&disp=str
Response regulator receiver domain
Common Patterns - Buried Beta-Strand
http://tardis.nibio.go.jp/cgi-bin/homstrad/showpage.cgi?family=response_reg&disp=str
Response regulator receiver domain
Common Patterns - Amphipathic Partially-Buried Alpha-Helices
ubiquitin conjugating enzyme
Common Patterns - Amphipathic Beta Strands
Different sequence composition, more strongly biased (away from equal representation of each of the 20 amino acids)
Sometimes more variable sequence (than globular/structured regions)
•more substitutions
•more gaps
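One possible way to put numbers on these two observations is sketched below in Python. Shannon entropy is used here as a stand-in measure of compositional bias; that choice, the function names, and the example sequences are assumptions made for this illustration, not something the slides prescribe. A sequence using all 20 amino acids equally has the maximum entropy of log2(20), about 4.32 bits, and more biased sequences score lower.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequence):
    """Fraction of each standard amino acid in a sequence (gaps ignored)."""
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        return {}
    counts = Counter(residues)
    return {aa: counts[aa] / len(residues) for aa in AMINO_ACIDS}

def composition_entropy(sequence):
    """Shannon entropy (bits) of the amino-acid composition.
    Lower values mean a more strongly biased composition."""
    freqs = composition(sequence)
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)

def gap_fraction(aligned_sequence):
    """Fraction of alignment columns that are gaps in one aligned row."""
    if not aligned_sequence:
        return 0.0
    return aligned_sequence.count("-") / len(aligned_sequence)

# Invented examples: an unbiased sequence and a strongly biased one.
print(composition_entropy(AMINO_ACIDS))
print(composition_entropy("SSSSPSSSPSSS"))
print(gap_fraction("MK--LA"))
```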
Common Patterns - Non-Globular Sequence
Identifying Mis-Aligned Regions
Identify a region of a sequence that you think is misaligned
Decide how you would "fix" this misalignment
Look at patterns of conservation, and sequences which
Unusual Sequences: Examples
With CLUSTALX "Quality" -> "Show Low-Scoring Segments" switched on
Short/fragmented sequences
Unusual pattern of "conservation"
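Short or fragmented sequences can also be flagged with a simple script before inspecting the alignment by eye. The Python sketch below compares each sequence's ungapped length with the median for the alignment. It is unrelated to the CLUSTALX quality display: the cutoff of half the median length, the function name `flag_short_sequences`, and the example alignment are all assumptions made for illustration.

```python
from statistics import median

def flag_short_sequences(alignment, min_fraction=0.5):
    """Return names of sequences whose ungapped length is below
    `min_fraction` of the median ungapped length in the alignment.

    `alignment` maps sequence names to aligned rows; the 0.5 default
    is an arbitrary cutoff chosen for this example.
    """
    if not alignment:
        return []
    lengths = {name: len(row.replace("-", "")) for name, row in alignment.items()}
    typical = median(lengths.values())
    return sorted(name for name, n in lengths.items() if n < min_fraction * typical)

# Invented alignment with one obvious fragment.
alignment = {
    "seq_a": "MKVLATGHEL",
    "seq_b": "MKVLSTGHEL",
    "seq_c": "MRVLATGHDL",
    "frag":  "----AT----",
}

print(flag_short_sequences(alignment))
```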
Using MSAs to Improve Prediction of Linear Motifs
Demonstration and Exercise