Motif discovery Tutorial 5. Motif discovery –MEME –MAST –TOMTOM –GOMO –PROSITE Multiple...

Date posted: 20-Dec-2015

Page 1: Motif discovery Tutorial 5. Motif discovery –MEME –MAST –TOMTOM –GOMO –PROSITE Multiple sequence alignments and motif discovery.

Motif discovery

Tutorial 5

Page 2:

• Motif discovery
  – MEME
  – MAST
  – TOMTOM
  – GOMO
  – PROSITE

Multiple sequence alignments and motif discovery

Page 3:

Can we find motifs using multiple sequence alignment?

Position   1    2    3    4    5    6    7    8    9    10

A          0    0    0    0    0    3/6  1/6  2/6  0    0
D          0    3/6  2/6  0    0    1/6  5/6  1/6  0    1/6
E          0    0    4/6  1    0    0    0    0    1    5/6
G          0    1/6  0    0    1    1/3  0    0    0    0
H          0    1/6  0    0    0    0    0    0    0    0
N          0    1/6  0    0    0    0    0    0    0    0
Y          1    0    0    0    0    0    0    3/6  0    0

Aligned sequences:

..YDEEGGDAEE..
..YDEEGGDAEE..
..YGEEGADYED..
..YDEEGADYEE..
..YNDEGDDYEE..
..YHDEGAADEE..

Motif: a widespread pattern with biological significance
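The per-position frequencies on this slide can be recomputed directly from the six aligned sequences. The following is a minimal sketch, not part of the original tutorial; the function name `frequency_matrix` is made up for illustration, and fractions are printed in reduced form (1/2 rather than 3/6).

```python
from collections import Counter
from fractions import Fraction

# The six aligned sequences from the slide (same length, no gaps).
sequences = [
    "YDEEGGDAEE",
    "YDEEGGDAEE",
    "YGEEGADYED",
    "YDEEGADYEE",
    "YNDEGDDYEE",
    "YHDEGAADEE",
]

def frequency_matrix(seqs):
    """Return one {residue: Fraction} dict per alignment column."""
    n = len(seqs)
    columns = zip(*seqs)  # transpose: one tuple of residues per position
    return [
        {res: Fraction(count, n) for res, count in Counter(col).items()}
        for col in columns
    ]

pfm = frequency_matrix(sequences)

# Print only the residues that actually occur at each position.
for position, column in enumerate(pfm, start=1):
    cells = ", ".join(f"{res}={freq}" for res, freq in sorted(column.items()))
    print(f"{position:2d}: {cells}")
```

Residues that never occur at a position are simply absent from that column's dictionary, which corresponds to the zeros in the table.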

Page 4:

Can we find motifs using multiple sequence alignment (MSA)?

YES! NO

Page 5:

Using MSA for motif discovery

Can only work if things align nicely on their own

For most motifs this is not the case!

Page 6:

ClustalW - Input

http://www.ebi.ac.uk/Tools/clustalw2/index.html

Input sequences

Gap scoring

Scoring matrix

Email address

Output format

Page 7:

http://www.ebi.ac.uk/Tools/muscle/index.html

Muscle

Input sequences

Email address

Output format

Page 8:

Motif search: from de-novo motifs to motif annotation

gapped motifs

Large DNA data

http://meme.sdsc.edu/

Page 9:

MEME – Multiple EM* for Motif Elicitation

http://meme.sdsc.edu/

• Motif discovery from unaligned sequences
  – Genomic or protein sequences

• Flexible model of motif presence (a motif can be absent in some sequences or appear several times in one sequence)

*EM = Expectation-maximization

Page 10:

MEME - Input

Email address

Input file (fasta file)

How many times in each sequence?

How many motifs?

How many sites?

Range of motif lengths
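The form fields above map roughly onto options of the `meme` command-line program from the MEME Suite. The sketch below only builds the argument list and does not run anything. It assumes a local MEME installation; the file name `sequences.fasta`, the output directory `meme_out` and the helper `build_meme_command` are hypothetical. Option names should be checked against the installed version.

```python
def build_meme_command(fasta_path, out_dir, alphabet="protein",
                       distribution="zoops", n_motifs=3,
                       min_width=6, max_width=50,
                       min_sites=None, max_sites=None):
    """Build an argument list for the `meme` command-line tool.

    distribution answers "how many times in each sequence?":
      "oops"  = exactly one site per sequence
      "zoops" = zero or one site per sequence
      "anr"   = any number of repetitions
    """
    if distribution not in {"oops", "zoops", "anr"}:
        raise ValueError(f"unknown distribution: {distribution!r}")
    if alphabet not in {"dna", "protein"}:
        raise ValueError(f"unknown alphabet: {alphabet!r}")

    cmd = [
        "meme", fasta_path,
        f"-{alphabet}",               # sequence type
        "-oc", out_dir,               # output directory (overwritten if present)
        "-mod", distribution,         # occurrences per sequence
        "-nmotifs", str(n_motifs),    # how many motifs to report
        "-minw", str(min_width),      # range of motif lengths
        "-maxw", str(max_width),
    ]
    # The number of sites is optional; omit the flags to keep MEME's defaults.
    if min_sites is not None:
        cmd += ["-minsites", str(min_sites)]
    if max_sites is not None:
        cmd += ["-maxsites", str(max_sites)]
    return cmd

cmd = build_meme_command("sequences.fasta", "meme_out",
                         n_motifs=3, min_width=6, max_width=12)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where `meme` is on the PATH.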

Page 11:

MEME - Output

Motif score

Page 12:

MEME - Output

Motif length

Number of times

Motif score

Page 13:

MEME - Output

Low uncertainty

=

High information content
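The relation between uncertainty and information content can be made concrete with Shannon entropy. The sketch below uses the common simplification "information = maximum entropy minus observed entropy" for a 20-letter protein alphabet, without the small-sample correction that logo tools usually apply. The two example columns are positions 1 and 2 of the alignment on Page 3; the function names are illustrative.

```python
import math

def column_entropy(freqs):
    """Shannon entropy (bits) of one motif column given residue frequencies."""
    return sum(-p * math.log2(p) for p in freqs.values() if p > 0)

def information_content(freqs, alphabet_size=20):
    """Maximum possible entropy minus the observed entropy, in bits."""
    return math.log2(alphabet_size) - column_entropy(freqs)

conserved = {"Y": 1.0}                              # position 1: always Y
variable = {"D": 3/6, "G": 1/6, "H": 1/6, "N": 1/6}  # position 2: four residues

for name, column in [("position 1", conserved), ("position 2", variable)]:
    print(f"{name}: uncertainty = {column_entropy(column):.2f} bits, "
          f"information = {information_content(column):.2f} bits")
```

The fully conserved column has zero uncertainty and therefore the maximum information content, which is why it appears as the tallest stack in a sequence logo.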

Page 14:

MEME - Output

Multilevel Consensus

Page 15:

Patterns can be presented as regular expressions

[AG]-x-V-x(2)-{YW}

[] - Either residue
x - Any residue
x(2) - Any residue in the next 2 positions
{} - Any residue except these

Examples: AYVACM, GGVGAA
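The pattern notation translates almost directly into an ordinary regular expression. The sketch below covers only the four elements explained on this slide; anchors and ranges such as x(2,4), which the full PROSITE syntax also allows, are not handled. The helper name `prosite_to_regex` and the non-matching peptide AYVACW are added for illustration.

```python
import re

# One pattern element: a [..] class, a {..} exclusion, or a single letter,
# optionally followed by a repeat count in parentheses.
ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)\))?")

def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, count = match.groups()
        if core == "x":
            regex = "."                       # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"   # any residue except these
        else:
            regex = core                      # literal residue or [..] class
        if count:
            regex += "{" + count + "}"        # e.g. x(2) -> .{2}
        parts.append(regex)
    return "".join(parts)

regex = prosite_to_regex("[AG]-x-V-x(2)-{YW}")
print(regex)  # [AG].V.{2}[^YW]

for peptide in ["AYVACM", "GGVGAA", "AYVACW"]:
    print(peptide, bool(re.fullmatch(regex, peptide)))
```

The two examples from the slide match, while AYVACW does not because its last residue is one of the excluded residues.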

Page 16:

MEME - Output

Sequence names

Position in sequence

Strength of match

Motif within sequence

Page 17:

MEME - Output

Sequence names

Overall strength of motif matches

Motif location in the input sequence

Page 18:

What can we do with motifs?

• MAST - Search for them in non-annotated sequence databases (protein and DNA).

• TOMTOM - Find the protein that binds a DNA motif.

• GOMO - Find putative target genes (DNA) of motifs and analyze their associated annotation terms.

• PROSITE - Search for them in annotated protein sequence databases.

Page 19:

MAST

• Searches for motifs (one or more) in sequence databases:
  – Like BLAST, but with motifs as input
  – Similar to iterations of PSI-BLAST

• Profile defines strength of match
  – Multiple motif matches per sequence
  – Combined E-value for all motifs

• MEME uses MAST to summarize results:
  – Each MEME result is accompanied by the MAST result for searching the discovered motifs on the given sequences.

http://meme.sdsc.edu/meme4_4_0/cgi-bin/mast.cgi
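The idea that a profile defines the strength of a match can be sketched as a sliding-window scan with a log-odds scoring matrix. This is a simplified illustration, not MAST's actual procedure: it assumes a uniform background and a pseudocount of 1, and it omits the p-values and combined E-values that MAST reports. The motif sites are the six sequences from Page 3; the target sequence is hypothetical.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def log_odds_matrix(sites, pseudocount=1.0):
    """Position-specific log-odds scores (bits) against a uniform background."""
    background = 1.0 / len(ALPHABET)
    matrix = []
    for column in zip(*sites):
        total = len(column) + pseudocount * len(ALPHABET)
        matrix.append({
            res: math.log2((column.count(res) + pseudocount) / total / background)
            for res in ALPHABET
        })
    return matrix

def scan(sequence, matrix):
    """Yield (start, score) for every window as wide as the motif."""
    width = len(matrix)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        yield start, sum(col[res] for col, res in zip(matrix, window))

sites = ["YDEEGGDAEE", "YDEEGGDAEE", "YGEEGADYED",
         "YDEEGADYEE", "YNDEGDDYEE", "YHDEGAADEE"]
pssm = log_odds_matrix(sites)

target = "MKTAYDEEGADYEELLK"  # hypothetical sequence with a motif-like stretch
best_start, best_score = max(scan(target, pssm), key=lambda hit: hit[1])
print(best_start, target[best_start:best_start + len(pssm)], round(best_score, 2))
```

Windows that resemble the motif get a high positive score, while unrelated windows score near or below zero. A real search turns these scores into statistical significance values.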

Page 20:

MAST - Input

Email address

Input file (motifs)

Database

Page 21:

MAST - Output

Input motifs

Presence of the motifs in a given database

Page 22:

TOMTOM

• Searches one or more query DNA motifs against one or more databases of target motifs, and reports for each query a list of target motifs, ranked by p-value.

• The output contains results for each query, in the order that the queries appear in the input file.

http://meme.sdsc.edu/meme/doc/tomtom.html

Page 23:

TOMTOM - Input

Input motif

Background frequencies

Database

Page 24:

DNA IUPAC* code

A --> adenosine
C --> cytidine
G --> guanine
T --> thymidine
M --> A C (amino)
S --> G C (strong)
W --> A T (weak)
R --> G A (purine)
Y --> T C (pyrimidine)
K --> G T (keto)
B --> G T C
D --> G A T
H --> A C T
V --> G C A
N --> A G C T (any)

Example: YCAY = [TC]CA[TC]

*IUPAC = International Union of Pure and Applied Chemistry
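The IUPAC codes can be expanded mechanically into a regular expression, as in the YCAY example. The sketch below is an illustration; the lookup table follows the code list on this slide, and the DNA string being searched is made up.

```python
import re

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def iupac_to_regex(motif):
    """Expand an IUPAC DNA motif into an equivalent regular expression."""
    parts = []
    for code in motif.upper():
        bases = IUPAC_DNA[code]  # raises KeyError for letters outside the table
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

pattern = iupac_to_regex("YCAY")
print(pattern)  # [CT]CA[CT]

# A lookahead finds overlapping occurrences as well.
dna = "GGTCATACCACTT"  # made-up example sequence
hits = [m.start() for m in re.finditer(f"(?={pattern})", dna)]
print(hits)  # [2, 7]
```

The character class [CT] is equivalent to the slide's [TC]; the order of letters inside the brackets does not matter.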

Page 25:

TOMTOM - Output

Input motif

Matching motifs

Page 26:

TOMTOM – Output

Wrong input, OK results

Page 27:

JASPAR

• Profiles
  – Transcription factor binding sites
  – Multicellular eukaryotes
  – Derived from published collections of experiments

• Open data access

Page 28:

score

organism

logo

Name of gene/protein

Page 29:

GOMO

• GOMO takes DNA binding motifs to find putative target genes and analyze their associated GO terms. A list of significant GO terms that can be linked to the given motifs will be produced.

• GOMO returns a list of GO-terms that are significantly associated with target genes of the motif.

• Gene Ontology provides a controlled vocabulary to describe gene and gene product attributes in any organism.

Page 30:

GOMO - Input

Email address

Input file (motifs)

Database

Page 31:

GOMO - Output

Input motifs

GO annotation

MF - Molecular function
BP - Biological process
CC - Cellular component

Page 32:

Prosite

http://www.expasy.org/tools/scanprosite

PROSITE is a database of protein domains and motifs that can be searched by either regular expression patterns or sequence profiles.

Page 34:

Prosite - Input

Input motif (a regular expression)

Database

Filters

Page 35:

Prosite - Output

Input motif

Location in the protein sequence

Protein