BLAST (Basic local alignment search Tool)

Transcript of BLAST (Basic local alignment search Tool)

Page 1: BLAST (Basic local alignment search Tool)

Welcome to the presentation on BLAST

Page 2: BLAST (Basic local alignment search Tool)

Introduction

Group Members: Shahida khatun MD. Firoz Ahmed MD. Shariful Islam Chandrima Das Shantonu Kumar Roy Merina Junaki Ikhtina Afroz Shanjida Afrin MST. Shahinur Akter MD. Ariful Islam Sagar

Page 3: BLAST (Basic local alignment search Tool)

BLAST (Basic Local Alignment Search Tool)

Page 4: BLAST (Basic local alignment search Tool)

Contents

• Definition
• Background
• Types of BLAST Program
• Algorithm
• BLAST Input-Output
• BLAST Search
• BLAST Function
• Objectives of BLAST

Page 5: BLAST (Basic local alignment search Tool)

Definition

The Basic Local Alignment Search Tool (BLAST) is a tool for comparing gene and protein sequences against others in public databases.

BLAST is a set of sequence comparison algorithms used to search databases for high-scoring local alignments to a query. It is a heuristic method, so it does not guarantee the mathematically optimal alignment.

Page 6: BLAST (Basic local alignment search Tool)

Definition

BLAST breaks the query and database sequences into fragments and seeks matches between them.

Before BLAST, nucleic acid and protein alignments were time-consuming because they were computed as full alignments using dynamic programming. BLAST is about 50 times faster than dynamic programming.

Page 7: BLAST (Basic local alignment search Tool)

Background

Since the 1970s, scientists have accumulated DNA and protein sequence data at an exponential rate; researchers currently have approximately 97 billion bases sequenced and over 93 million records.

Amazingly, this sequence data doubles every 18 months!

Page 8: BLAST (Basic local alignment search Tool)

Background

Today, one of the most commonly used tools to examine DNA and protein sequences is the Basic Local Alignment Search Tool, also known as BLAST.

BLAST is a computer algorithm that is available for use online at the National Center for Biotechnology Information (NCBI) website and many other sites.

Page 9: BLAST (Basic local alignment search Tool)

Types of BLAST

• Nucleotide-nucleotide BLAST (blastn) - Given a DNA query, this program returns the most similar DNA sequences from the DNA database that the user specifies.
• Protein-protein BLAST (blastp) - Given a protein query, this program returns the most similar protein sequences from the protein database that the user specifies.
• Position-Specific Iterative BLAST (PSI-BLAST) (blastpgp in legacy BLAST, psiblast in BLAST+) - This program is used to find distant relatives of a protein.

Page 10: BLAST (Basic local alignment search Tool)

Types of BLAST

• Nucleotide 6-frame translation-protein (blastx) - This program compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database.
• Nucleotide 6-frame translation-nucleotide 6-frame translation (tblastx) - This program compares the six-frame translations of a nucleotide query against the six-frame translations of a nucleotide sequence database. Its purpose is to find very distant relationships between nucleotide sequences.

Page 11: BLAST (Basic local alignment search Tool)

Types of BLAST

• Protein-nucleotide 6-frame translation (tblastn) - This program compares a protein query against all six reading frames of a nucleotide sequence database.
• Large numbers of query sequences (megablast) - When comparing large numbers of input sequences via the command-line BLAST, "megablast" is much faster than running BLAST multiple times. It is tuned for highly similar nucleotide sequences.

Page 12: BLAST (Basic local alignment search Tool)

Types of BLAST

Of these programs, blastn and blastp are the most commonly used because they compare sequences directly and do not require translation.

However, since protein sequences are better conserved evolutionarily than nucleotide sequences, tblastn, tblastx, and blastx produce more reliable and accurate results when dealing with coding DNA.
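
The six-frame translation that blastx, tblastn, and tblastx rely on can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code and an unambiguous A/C/G/T input; the function names are hypothetical and are not part of BLAST.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict[str, str]:
    """Translate three frames on each strand, keyed '+1'..'+3' and '-1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
print(frames["+1"])  # MA*
print(frames["-1"])  # LGH
```

Each of the six protein strings can then be compared like an ordinary protein sequence, which is why the translated searches cost more than blastn or blastp.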

Page 13: BLAST (Basic local alignment search Tool)

BLAST Algorithm

The BLAST algorithm is fast, accurate, and web-accessible.

It is faster than many other sequence similarity search tools.

The algorithm is complex: it involves multiple steps and many parameters.

Page 14: BLAST (Basic local alignment search Tool)

BLAST Algorithm

An overview of the BLAST algorithm (a protein-to-protein search) is as follows:

1. Remove low-complexity regions or sequence repeats in the query sequence.

2. Make a k-letter word list of the query sequence. Taking k = 3 as an example, we list the words of length 3 in the query protein sequence "sequentially", until the last letter of the query sequence is included (k is usually 11 for a DNA sequence).
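
The word-list step can be illustrated with a short Python sketch. The function name and the example query are made up for illustration; real BLAST builds its word list in optimized native code.

```python
def word_list(query: str, k: int = 3) -> list[str]:
    """Return every overlapping k-letter word of the query, left to right."""
    return [query[i:i + k] for i in range(len(query) - k + 1)]


words = word_list("PQGEFG", k=3)
print(words)  # ['PQG', 'QGE', 'GEF', 'EFG']
```

A query of length L yields L - k + 1 words, so the last word ends on the last letter of the query.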

Page 15: BLAST (Basic local alignment search Tool)

BLAST Algorithm

3. List the possible matching words. Each query word is compared with candidate words using a scoring matrix, and only the words that score at or above a threshold are kept.

4. Organize the remaining high-scoring words into an efficient search tree.

5. Repeat steps 3 to 4 for each k-letter word in the query sequence.

6. Scan the database sequences for exact matches with the remaining high-scoring words.

7. Extend the exact matches to high-scoring segment pairs (HSPs).
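
The scan-and-extend idea in steps 6 and 7 can be sketched as follows. This is a simplified illustration, not BLAST's implementation: it seeds only on exact query words (no scoring-matrix neighborhood), uses an assumed +1/-1 match/mismatch score, and stops extending once the running score falls more than a hypothetical `x_drop` below the best score seen. All names are invented for the example.

```python
from collections import defaultdict


def build_word_index(query: str, k: int) -> dict[str, list[int]]:
    """Map each k-letter query word to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)
    return index


def extend_ungapped(query, subject, q_pos, s_pos, k,
                    match=1, mismatch=-1, x_drop=2):
    """Extend a seed in both directions without gaps.

    Returns (score, query_start, subject_start, length).
    """
    def walk(q_i, s_i, step):
        best = running = 0
        best_len = length = 0
        while 0 <= q_i < len(query) and 0 <= s_i < len(subject):
            running += match if query[q_i] == subject[s_i] else mismatch
            length += 1
            if running > best:
                best, best_len = running, length
            elif best - running > x_drop:
                break  # score dropped too far: give up on this direction
            q_i += step
            s_i += step
        return best, best_len

    right_score, right_len = walk(q_pos + k, s_pos + k, +1)
    left_score, left_len = walk(q_pos - 1, s_pos - 1, -1)
    score = k * match + left_score + right_score
    return score, q_pos - left_len, s_pos - left_len, left_len + k + right_len


def find_hsps(query, subject, k=3, min_score=5):
    """Scan the subject for seed hits and keep extensions scoring >= min_score."""
    index = build_word_index(query, k)
    hsps = set()  # different seeds often extend to the same segment pair
    for s_pos in range(len(subject) - k + 1):
        for q_pos in index.get(subject[s_pos:s_pos + k], []):
            hsp = extend_ungapped(query, subject, q_pos, s_pos, k)
            if hsp[0] >= min_score:
                hsps.add(hsp)
    return sorted(hsps, reverse=True)


print(find_hsps("GATTACA", "CCGATTACACC"))  # [(7, 0, 2, 7)]
```

Collecting results in a set matters here: all five seeds on the shared diagonal extend to the same segment pair, which should be reported once.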

Page 16: BLAST (Basic local alignment search Tool)

BLAST Algorithm

8. List all of the HSPs in the database whose score is high enough to be considered.

9. Evaluate the significance of the HSP score.

10. Combine two or more HSP regions into a longer alignment.

11. Show the gapped Smith-Waterman local alignments of the query and each of the matched database sequences.

12. Report every match whose expect score is lower than a threshold parameter E.
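
For reference, the Smith-Waterman recurrence named in the final alignment step can be sketched as below. This is a minimal score-only version under assumed parameters (match +2, mismatch -1, linear gap -2); BLAST itself uses substitution matrices and affine gap penalties, and only runs the gapped stage around promising HSPs.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("ACGTT", "ACTT"))  # 6
```

The work grows with the product of the two sequence lengths, which is why BLAST applies this kind of dynamic programming only after the fast word-matching steps have narrowed the search.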

Page 17: BLAST (Basic local alignment search Tool)

BLAST Input-Output

Input

Input sequences are given in FASTA or GenBank format.

Output

BLAST output can be delivered in a variety of formats, including HTML, plain text, and XML. On NCBI's web page, the default output format is HTML.

A BLAST report contains:

• An introduction that tells where the search occurred and what database and query were compared

Page 18: BLAST (Basic local alignment search Tool)

BLAST Output

• A list of the sequences in the database containing segment pairs whose scores were least likely to occur by chance

• Alignments of the high-scoring segment pairs showing identical and similar residues

• A complete list of the parameter settings used for the search
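
On the input side, a FASTA record is a header line starting with ">" followed by one or more lines of sequence. The reader below is a minimal sketch for illustration; the function name and sample records are invented, and it ignores details such as duplicate headers.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {header: sequence}."""
    records: dict[str, list[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {h: "".join(parts) for h, parts in records.items()}


example = """>query1 hypothetical example
ATGGCC
TAA
>query2
GATTACA
"""
print(parse_fasta(example))
```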

Page 19: BLAST (Basic local alignment search Tool)

BLAST Output

E-value (expectation value)

The Expect value (E) is a parameter that describes the number of hits one can "expect" to see by chance when searching a database of a particular size.

It decreases exponentially as the Score (S) of the match increases.

Essentially, the E value describes the random background noise.

In general terms, the smaller E is, the more likely the match is significant.
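
The exponential relationship between E and S is usually written as E = K·m·n·e^(-λS), where m and n are the query and database lengths. The sketch below assumes illustrative values for the statistical parameters λ and K; in practice they depend on the scoring matrix and gap penalties, and BLAST uses effective rather than raw lengths.

```python
import math


def expect_value(raw_score: float, query_len: int, db_len: int,
                 lam: float = 0.318, k: float = 0.134) -> float:
    """E = K * m * n * exp(-lambda * S); lam and k are assumed example values."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Same query and database, increasing score: E shrinks exponentially.
for score in (40, 60, 80):
    print(score, expect_value(score, query_len=250, db_len=10**8))
```

Doubling the database size doubles E for the same score, which is why an E-value is only meaningful for the database it was computed against.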

Page 20: BLAST (Basic local alignment search Tool)

BLAST Output

The default E value for blastn, blastp, blastx, and tblastn is 10.

At this setting, 10 hits with scores equal to or better than the defined alignment score, S, are expected to occur by chance.

The E value can be increased or decreased to alter the stringency of the search.

Increase the E value when searching with a short query, since it is likely to be found many times by chance in a given database.

Page 21: BLAST (Basic local alignment search Tool)

BLAST Output

Bit Score

A bit score is another prominent statistical indicator used in addition to the E value in a BLAST output.

The bit score measures sequence similarity independent of query sequence length and database size, and is normalized based on the raw pairwise alignment score.
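
The normalization is commonly written as S' = (λS - ln K) / ln 2, after which the E value depends only on the bit score and the search-space size: E = m·n·2^(-S'). The sketch below uses the same kind of assumed, illustrative λ and K as any worked example would need; the function names are hypothetical.

```python
import math


def bit_score(raw_score: float, lam: float = 0.318, k: float = 0.134) -> float:
    """Normalize a raw score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_from_bits(bits: float, query_len: int, db_len: int) -> float:
    """E = m * n * 2^(-S'); no scoring-system parameters are needed here."""
    return query_len * db_len * 2.0 ** (-bits)


bits = bit_score(60)
print(round(bits, 1), expect_from_bits(bits, query_len=250, db_len=10**8))
```

Because λ and K are folded into the bit score, bit scores from searches with different scoring systems can be compared directly, which raw scores cannot.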

Page 22: BLAST (Basic local alignment search Tool)

BLAST Search

• Go to http://www.ncbi.nlm.nih.gov/
• Select BLAST program

Page 23: BLAST (Basic local alignment search Tool)

BLAST Search

Selecting the BLAST Database

Page 24: BLAST (Basic local alignment search Tool)

BLAST Search

• Entering sequence
• Submitting search

Page 25: BLAST (Basic local alignment search Tool)
Page 26: BLAST (Basic local alignment search Tool)

BLAST Function

BLAST can be used for several purposes. These include identifying species, locating domains, establishing phylogeny, DNA mapping, and comparison.

Identifying species - With BLAST, we can often identify a species correctly or find homologous species. This can be useful, for example, when we are working with a DNA sequence from an unknown species.

Page 27: BLAST (Basic local alignment search Tool)

BLAST Function

Locating domains - When working with a protein sequence, you can input it into BLAST to locate known domains within the sequence of interest.

Establishing phylogeny - Using the results received through BLAST, we can create a phylogenetic tree on the BLAST web page.

Page 28: BLAST (Basic local alignment search Tool)

BLAST Function

DNA mapping - When working with a known species and looking to sequence a gene at an unknown location, BLAST can compare the chromosomal position of the sequence of interest to relevant sequences in the database.

Comparison - When working with genes, BLAST can locate common genes in two related species and can be used to map annotations from one organism to another.

Page 29: BLAST (Basic local alignment search Tool)

Objectives of BLAST

• It is one of the most popular programs for sequence analysis.

• It enables a researcher to compare a query sequence with a library or database of sequences.

• It identifies library sequences that resemble the query sequence above a certain threshold.

• The objective is to find high-scoring ungapped segments among related sequences.

Page 30: BLAST (Basic local alignment search Tool)

Objectives of BLAST

• Alignments of the high-scoring segment pairs showing identical and similar residues.

• A complete list of the parameter settings used for the search.

That’s all from our presentation

Page 31: BLAST (Basic local alignment search Tool)

THANK YOU