Sequence Analysis. DNA and Protein sequences are biological information that are well suited for...

Sequence Analysis

• DNA and protein sequences are biological information that is well suited to computer analysis

• Fundamental Axiom: homologous sequences share an evolutionary ancestor and almost certainly perform the same or a similar function

Sequence Analysis topics for today

• Restriction enzyme sites for diagnostics and cloning

• Open reading frame analysis

• Conceptual translation

• Oligo primer design

• Sequence alignments
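
Three of the topics above (reverse complement, open reading frame analysis, and conceptual translation) can be sketched in a few lines of Python. This is only an illustration, not the tools used in class: the function names are hypothetical, the standard genetic code is assumed, and an ORF is taken to mean "first ATG in a frame up to the next in-frame stop codon".

```python
# Standard genetic code, with codon positions ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Complement every base, then reverse the string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Conceptual translation of frame 1; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(dna, min_length=2):
    """Return (strand, frame, protein) for ORFs in all six reading frames.

    Assumption for this sketch: an ORF runs from the first M in a
    stop-to-stop segment to the stop codon that ends that segment.
    """
    results = []
    forward = dna.upper()
    for strand, seq in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            segments = translate(seq[frame:]).split("*")
            # The last segment is not followed by a stop codon, so skip it.
            for segment in segments[:-1]:
                start = segment.find("M")
                if start != -1 and len(segment) - start >= min_length:
                    results.append((strand, frame + 1, segment[start:]))
    return results


print(reverse_complement("ATGC"))
print(find_orfs("CCATGGCTTAAGG"))
```

Frames are numbered 1 to 3 on each strand here; real ORF finders differ in how they number frames and in whether they require a start codon at all.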

Sequence Analysis

• Alignments document homologous relationships

• DNA sequence alignments: best for showing identity

• Protein sequence alignments: best for showing similarity
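
The difference between identity and similarity can be made concrete with a small Python sketch. It assumes the two sequences are already aligned (same length, "-" for gaps). The similarity groups below are a rough, hypothetical grouping of amino acids with related side chains, not a real substitution matrix, and columns containing a gap are left out of the percentage as one possible convention.

```python
# Illustrative groups of chemically similar amino acids (an assumption
# for this sketch; real programs use a scoring matrix instead).
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def _aligned_pairs(a, b):
    """Pair up aligned positions, ignoring columns that contain a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    return [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]


def percent_identity(a, b):
    """Percentage of ungapped columns where both residues are the same."""
    pairs = _aligned_pairs(a, b)
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def percent_similarity(a, b):
    """Percentage of ungapped columns that are identical or in one group."""
    pairs = _aligned_pairs(a, b)
    if not pairs:
        return 0.0
    similar = sum(
        x == y or any(x in group and y in group for group in SIMILAR_GROUPS)
        for x, y in pairs
    )
    return 100.0 * similar / len(pairs)


print(percent_identity("ATGC", "ATGA"))    # DNA: identical or not
print(percent_identity("MKVL", "MRIL"))    # protein identity
print(percent_similarity("MKVL", "MRIL"))  # protein similarity
```

In the protein example, K/R and V/I are different residues but fall in the same group, so similarity comes out higher than identity.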

Types of Alignments

In Class Tutorial

• Introduction to File Formats

– Examples of file formats

– Utilities to change formats

• Restriction Analysis

– Web tools for restriction analysis

– Local programs
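
As a rough idea of what a restriction analysis program does internally, the Python sketch below scans a sequence for three well-known recognition sites. The function name and the choice of enzymes are assumptions made for illustration; the web tools and local programs covered in class know many more enzymes and also report cut positions and fragment sizes.

```python
# Recognition sequences for three common enzymes. All three sites are
# palindromic, so searching one strand finds the sites on both strands.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(dna, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    dna = dna.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = dna.find(site)
        while start != -1:
            positions.append(start + 1)  # report 1-based coordinates
            start = dna.find(site, start + 1)
        hits[enzyme] = positions
    return hits


print(find_sites("AAGAATTCGGGGATCCTTGAATTC"))
```

An enzyme with an empty list does not cut the sequence, which is often what matters when choosing enzymes for cloning.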

In Class Tutorial

• Open reading frame analysis

• Reverse complement

• Capturing output to an MS Word doc

• Oligo primer design for PCR and sequencing

• Alignments

– global and local
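
The difference between global and local alignment can be sketched with one dynamic-programming function in Python. This computes the score only (no traceback). The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions chosen for illustration, not the settings of any program used in class.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-2):
    """Best alignment score of a and b.

    local=False: global alignment (both sequences aligned end to end).
    local=True:  local alignment (best-scoring pair of subsequences).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # In a global alignment, leading gaps are penalized.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                # A local alignment may restart anywhere, so never go below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


print(alignment_score("ACGT", "AGT"))
print(alignment_score("TTTTACGTTTTT", "GGGACGGGG", local=True))
print(alignment_score("TTTTACGTTTTT", "GGGACGGGG", local=False))
```

In the last two calls the sequences share only a short ACG core: the local score rewards that core, while the global score is dragged down by the unrelated flanking bases.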

Sequence File Formats

• FASTA

– Simplest format

– Easy to create by hand on a word processor

FASTA

• First line must start with > followed by the sequence name

• Second line to the end = sequence

• No numbers or spaces

• Sequence can be UPPER or lower case
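
The rules above are simple enough that a FASTA reader fits in a few lines of Python. This is a minimal sketch with a hypothetical function name. It also accepts files holding several records (each with its own > line), which goes slightly beyond the single-sequence case described on the slide.

```python
def parse_fasta(text):
    """Return {sequence name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            name = line[1:].strip()  # everything after '>' is the name
            records[name] = []
        elif name is None:
            raise ValueError("FASTA text must start with a '>' line")
        else:
            records[name].append(line)  # sequence may span many lines
    return {name: "".join(parts) for name, parts in records.items()}


example = """>my_gene
ATGGCTAGC
ttaggctaa
"""
print(parse_fasta(example))
```

Note that the sketch keeps upper and lower case exactly as written, and that two records with the same name would overwrite each other.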

File Formats

• Some sequence analysis programs take input sequences in FASTA format ONLY

• ReadSeq is a web-based utility that converts many file formats to FASTA

• More and more programs will accept multiple file formats as input

Mono-Space Fonts

• Every character uses the same space = mono space

• A, T, G, and C each use the same space on a line

• W and . use the same space on a line

• Critical for sequence alignments to stay aligned

Mono-Space Fonts

NOT a Monospace font

Primer Design

• Primers are chemically synthesized oligonucleotides

• Used for sequencing and PCR

• Bad primer design can result in reaction failures

Primer Design Matters

Primer Design

• Tm 55-60 °C; PCR primer pairs need to have similar Tm values

• GC content 40-60% (biased toward the 5’ end)

• Length = 17-25 nt

• Low self-complementarity (avoid palindromes)

• Fewer than 3 of the 5 bases at the 3’ end should be G or C (no GC clamp at the 3’ end)

• Low complementarity between primers (avoid primer dimers)

• BLAST-search primers to avoid repetitive DNA

• Small amplicon size increases PCR efficiency

• Avoid runs of one base
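
Several of these rules can be checked automatically. The Python sketch below is one possible way to do it, with some loudly stated assumptions: the function name is hypothetical, Tm is estimated with the simple Wallace rule (2 °C per A/T plus 4 °C per G/C), which is only a rough guide for short oligos and is not necessarily the method behind the 55-60 °C target, and the limit of 4 for a "run" is an arbitrary choice because the slide gives no number. Self-complementarity, primer dimers, and BLAST searches are not covered.

```python
def check_primer(primer, max_run=4):
    """Report basic statistics and rule violations for one primer."""
    p = primer.upper()
    if not p:
        raise ValueError("primer is empty")

    gc_count = sum(base in "GC" for base in p)
    at_count = sum(base in "AT" for base in p)
    gc_percent = 100.0 * gc_count / len(p)
    # Wallace rule estimate (assumption: rough guide only).
    wallace_tm = 2 * at_count + 4 * gc_count

    # Length of the longest stretch of one repeated base.
    longest = current = 1
    for previous, base in zip(p, p[1:]):
        current = current + 1 if base == previous else 1
        longest = max(longest, current)

    problems = []
    if not 17 <= len(p) <= 25:
        problems.append("length outside 17-25 nt")
    if not 40 <= gc_percent <= 60:
        problems.append("GC content outside 40-60%")
    if sum(base in "GC" for base in p[-5:]) >= 3:
        problems.append("3 or more G/C in the last 5 bases at the 3' end")
    if longest > max_run:
        problems.append("run of one base longer than %d" % max_run)

    return {
        "length": len(p),
        "gc_percent": gc_percent,
        "wallace_tm": wallace_tm,
        "problems": problems,
    }


print(check_primer("AGCTGACCTGAAGCTGATCA"))
print(check_primer("ATATATATATATGGGCC"))
```

The first primer passes every check in this sketch; the second is flagged for low GC content and for a G/C-rich 3’ end.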

Primer Design: GC Clamps cause false priming