BLAST Tutorial 3 What is BLAST? Basic Local Alignment Search Tool Is a set of similarity search...

  • date post

    20-Dec-2015
  • Category

    Documents



Page 1: BLAST Tutorial 3 What is BLAST? Basic Local Alignment Search Tool Is a set of similarity search programs designed to explore sequence databases. What are.

BLAST

Tutorial 3

What is BLAST?
• Basic Local Alignment Search Tool
• A set of similarity search programs designed to explore sequence databases.

What are similarity searches good for?
• One sequence by itself is not informative; it must be analyzed by comparative methods against existing sequence databases to develop hypotheses concerning relatives and function.

[Diagram: Query → BLAST program → Database]

Page 2

Name      Query type           Database
blastn    Genomic              Genomic
blastp    Protein              Protein
blastx    Translated genomic   Protein
tblastn   Protein              Translated genomic
tblastx   Translated genomic   Translated genomic
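
The table above can be read as a lookup from (query type, database type) to program name. The sketch below encodes it that way. The function name and the type labels "nucleotide", "protein" and "translated" are assumptions made for illustration ("nucleotide" stands for the table's "Genomic"); only the five program names come from the slide.

```python
# Lookup built from the table: (query type, database type) -> program.
# "translated" means a nucleotide sequence that the program translates
# into protein in all six reading frames before comparing.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("translated", "protein"): "blastx",
    ("protein", "translated"): "tblastn",
    ("translated", "translated"): "tblastx",
}


def choose_program(query_type: str, database_type: str) -> str:
    """Return the BLAST program for a query/database combination.

    Hypothetical helper: raises ValueError for combinations that are
    not listed in the table.
    """
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"no BLAST program for query={query_type!r}, database={database_type!r}"
        ) from None


if __name__ == "__main__":
    # A protein query against a nucleotide database translated on the fly.
    print(choose_program("protein", "translated"))
```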

BLAST Databases

Page 3

http://www.ncbi.nlm.nih.gov/BLAST/

Page 4

Place Query

Choose Database


Page 5

BLASTN Databases

Database               Content
Gene collection        GenBank, EMBL, DDBJ, PDB and NCBI reference sequences (RefSeq)
Genomic + Transcript   Complete human and mouse genome + transcriptome
EST                    Expressed sequence tags
mito                   Mitochondrial sequences
vector                 Vector subset of GenBank
month                  GenBank, EMBL, DDBJ, PDB sequences from the last 30 days
Envi                   Environmental samples

http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#nucleotide_databases

Page 6

Place Query

Choose Database

Optimize similarity level of the search

Threshold for results significance

Limit output size

Primary word match (16-64 nt)

Reward and penalty for matching and mismatching bases

Cost to create and extend a gap

Remove low information content

Limit search to specific organism
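
The labels above annotate the NCBI web form. As a rough sketch, assuming the command-line `blastn` program from NCBI BLAST+ is used instead, each label corresponds to one option. The `BlastnOptions` class, the file name `query.fasta` and all default values are hypothetical and chosen only for illustration. The script builds the argument list and does not run a search.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BlastnOptions:
    """Hypothetical container; one field per label on the slide."""
    query: str = "query.fasta"       # Place Query (illustrative file name)
    db: str = "nt"                   # Choose Database
    task: str = "megablast"          # Optimize similarity level of the search
    evalue: float = 1e-6             # Threshold for results significance
    max_target_seqs: int = 100       # Limit output size
    word_size: int = 28              # Primary word match
    reward: int = 1                  # Reward for a matching base
    penalty: int = -2                # Penalty for a mismatching base
    gapopen: Optional[int] = None    # Cost to create a gap (None = program default)
    gapextend: Optional[int] = None  # Cost to extend a gap (None = program default)
    dust: bool = True                # Remove low information content
    organism: Optional[str] = None   # Limit search to specific organism


def build_blastn_command(opts: BlastnOptions) -> List[str]:
    """Translate the options into a blastn argument list (nothing is executed)."""
    cmd = [
        "blastn",
        "-query", opts.query,
        "-db", opts.db,
        "-task", opts.task,
        "-evalue", str(opts.evalue),
        "-max_target_seqs", str(opts.max_target_seqs),
        "-word_size", str(opts.word_size),
        "-reward", str(opts.reward),
        "-penalty", str(opts.penalty),
        "-dust", "yes" if opts.dust else "no",
    ]
    if opts.gapopen is not None and opts.gapextend is not None:
        cmd += ["-gapopen", str(opts.gapopen), "-gapextend", str(opts.gapextend)]
    if opts.organism is not None:
        # -entrez_query is only accepted for searches sent to NCBI with -remote.
        cmd += ["-remote", "-entrez_query", f"{opts.organism}[Organism]"]
    return cmd


if __name__ == "__main__":
    options = BlastnOptions(organism="Gallus gallus")
    print(shlex.join(build_blastn_command(options)))
```

Not every combination of reward, penalty and gap costs is accepted by the program, so the values here should be treated as placeholders.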


Page 7

Search for homologs of the chick “olfactory receptor 6” gene

Page 8

[Figure: a query sequence and the matched areas of database sequences, shown for global alignments and for local alignments]

Page 9

Sequence Identifier

Sequence description

Score(bits)

Coverage

Identity

E value

Page 10

Score and E value

Identities and gaps

Strand

Page 11

Multiple hits on the same subject

Page 12

Design of the BLAST survey

Consider your research question:

•Are you looking for a particular gene in a particular species? BLAST against the genome of that species.

•Are you looking for additional members of a gene family across all species? BLAST against the gene collection database.

•Are you looking for exact motif matches? Increase the gap penalty or use megablast.

Page 13

Score and E-value

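Score (S): (identities + mismatches) − gap penalties

E-value: E = K × m × n × e^(−λS)

• m × n depends on the search space: m is the query length (bp) and n is the database length (bp)
• K and λ depend on the scoring system
• S is the score

Bit Score (S’): S’ = (λS − ln K) / ln 2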
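
A short derivation, using the forms of S’ and E that the exercise on the later slides works with, suggests why the bit score is convenient: once S is converted to S’, the expected number of chance hits depends only on the search space. The symbols m (query length) and n (total database length) are assumed here; the slides name the quantities but not the letters.

```latex
\begin{align*}
S' &= \frac{\lambda S - \ln K}{\ln 2}
   \;\Longrightarrow\;
   2^{-S'} = e^{-(\lambda S - \ln K)} = K\,e^{-\lambda S} \\
E  &= K\,m\,n\,e^{-\lambda S} = m\,n\,2^{-S'}
\end{align*}
```

In this form, each additional bit of score halves the expected number of chance hits for a fixed search space.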

Page 14

Score and E-value

•The score is a measure of the similarity of the query to the sequence shown.

•The E-value is a measure of the reliability of the score.

•The definition of the E-value is: the number of alignments with a score equal to or greater than the given S score that are expected to occur by chance in a database search of this size.

Page 15

Score and E-value

The Size of the E-value

•The typical threshold for a good E-value from a BLAST search is E = 10^-6 (written 1e-6) or lower.

•The reason for such low values is database size: a chance-match probability of 0.001 per entry would still leave about 1000 chance hits in a million-entry database, whereas 10^-6 would leave only about one.

Page 16
Page 17

Exercise

Given the following parameters:
• Query length: 150
• λ = 1.37
• K = 0.711
• Average sequence length in database: 270
• Number of sequences in database: 4,554,026

Calculate the S, S’ and E for the following BLAST hit:

ACGTCGATCGAGCT
||||| ||||||||
AGGTCGTC-GAGGT

S: (Id + MM) − GP

S = 13 − 1 = 12
S’ = (1.37 × 12 − ln(0.711)) / ln(2)
S’ = (16.44 + 0.341) / 0.693
S’ = 24.2

Page 18

Exercise

Given the following parameters:
• Query length: 150
• λ = 1.37
• K = 0.711
• Average sequence length in database: 270
• Number of sequences in database: 4,554,026

Calculate the S, S’ and E for the following BLAST hit:

ACGTCGATCGAGCT
||||| ||||||||
AGGTCGTC-GAGGT

E = 0.711 × 150 × 270 × 4,554,026 × e^(−1.37 × 12)
E = 131135455683 × 7.24e-8
E ≈ 9504
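
The arithmetic of this exercise can be checked with a short script. This is a minimal sketch that assumes the simplified scoring used on the slides (each aligned column counts 1, each gap costs 1) and takes the database length as the average sequence length times the number of sequences. The function and constant names are illustrative.

```python
import math

# Parameters given in the exercise.
LAMBDA = 1.37
K = 0.711
QUERY_LENGTH = 150
AVG_SEQ_LENGTH = 270
NUM_SEQUENCES = 4_554_026


def raw_score(aligned_columns: int, gap_penalty: int) -> int:
    """S = (identities + mismatches) - gap penalty, as defined on the slides."""
    return aligned_columns - gap_penalty


def bit_score(s: float, lam: float = LAMBDA, k: float = K) -> float:
    """S' = (lambda * S - ln K) / ln 2."""
    return (lam * s - math.log(k)) / math.log(2)


def e_value(s: float, m: int, n: int, lam: float = LAMBDA, k: float = K) -> float:
    """E = K * m * n * exp(-lambda * S), with m = query length, n = database length."""
    return k * m * n * math.exp(-lam * s)


if __name__ == "__main__":
    s = raw_score(aligned_columns=13, gap_penalty=1)
    database_length = AVG_SEQ_LENGTH * NUM_SEQUENCES
    print(f"S  = {s}")
    print(f"S' = {bit_score(s):.1f}")
    print(f"E  = {e_value(s, QUERY_LENGTH, database_length):.0f}")
```

With these inputs the script gives a bit score of about 24.2 and an E-value of roughly 9.5 × 10^3, in line with the worked solution: a hit this short is expected thousands of times by chance in a database of this size.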

Page 19

Exercise

What is the minimal score needed to achieve a significant E value (1e-6, i.e. 10^-6)?

131135455683 × e^(−1.37S) = 10^-6

ln(131135455683 × e^(−1.37S)) = ln(10^-6)

ln(131135455683) + ln(e^(−1.37S)) = −13.81

25.6 − 1.37S = −13.81

S = (−13.81 − 25.6) / (−1.37)

S ≈ 28.77
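
The same rearrangement can be written as a small function. This is a sketch under the exercise's parameters; `min_score_for_e` is a hypothetical helper name. It solves K × m × n × e^(−λS) = E for S and then rounds up, since under the slides' simplified scoring a raw score is a whole number.

```python
import math


def min_score_for_e(target_e: float, k_m_n: float, lam: float) -> float:
    """Solve k_m_n * exp(-lam * S) = target_e for S."""
    return (math.log(k_m_n) - math.log(target_e)) / lam


# K * m * n from the exercise: 0.711 * 150 * (270 * 4,554,026).
K_M_N = 0.711 * 150 * 270 * 4_554_026
LAMBDA = 1.37

if __name__ == "__main__":
    s_min = min_score_for_e(1e-6, K_M_N, LAMBDA)
    print(f"S must be at least {s_min:.2f}, i.e. {math.ceil(s_min)} as a whole score")
```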

Page 20

1. Search for sequences homologous to the CFTR gene in human.

Page 21

2. Additional family members of the CFTR gene found in other organisms.

Page 22

3. Search for additional genes that are members of the ABC transporter family.