Lecture #4 : Comparing genes


Page 1: Lecture #4 : Comparing genes

Lecture #4 : Comparing genes

9/14/09

Page 2: Lecture #4 : Comparing genes

This week

Homework #2 due on Wed
  Email with questions
  Email me answers or hand in in class
Wed - I will be at Dept of Biology retreat
  Lecture will be given by Kelly O’Quin - expert in phylogenetics
  He will go over homework so it must be done before class

Page 3: Lecture #4 : Comparing genes

Questions for today

0. More BLAST
1. Where do we get high quality gene sequences?
2. How do genes evolve?
3. How do we compare genes?

Page 4: Lecture #4 : Comparing genes

How to find genes

Start with genes which are known from model organisms

Use these to pull out genes from genomes

Compare genes to learn about sensory evolution

Page 5: Lecture #4 : Comparing genes

Blast - Genbank

What database do you want to search?

What do you want to compare?

What program do you want to do the searching?

Page 6: Lecture #4 : Comparing genes

Types of blast queries

Query                   Database                Type
Nucleotide              Nucleotide              Blastn, Megablast, Discont megablast
Protein                 Protein                 Blastp, Psi-blast, Phi-blast
Translated nucleotide   Protein                 Blastx
Protein                 Translated nucleotide   Tblastn
Translated nucleotide   Translated nucleotide   Tblastx
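The table of query and database types can be read as a simple lookup. The sketch below only restates the slide's table as a Python dictionary; the names `BLAST_PROGRAMS` and `programs_for` are made up for illustration and are not part of any BLAST software.

```python
# Lookup of BLAST program names by (query type, database type),
# copied from the "Types of blast queries" table.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): ["blastn", "megablast", "discontiguous megablast"],
    ("protein", "protein"): ["blastp", "psi-blast", "phi-blast"],
    ("translated nucleotide", "protein"): ["blastx"],
    ("protein", "translated nucleotide"): ["tblastn"],
    ("translated nucleotide", "translated nucleotide"): ["tblastx"],
}


def programs_for(query_type, database_type):
    """Return the program names listed for this combination, or an empty list."""
    return BLAST_PROGRAMS.get((query_type, database_type), [])


print(programs_for("protein", "translated nucleotide"))
```

For example, a protein query against a translated nucleotide database gives tblastn, which is the "translated blast" case used to search a genome with a known protein.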

Page 7: Lecture #4 : Comparing genes
Page 8: Lecture #4 : Comparing genes

Defaults

Database

Program

Confirm

Page 9: Lecture #4 : Comparing genes

Nucleotide BLAST = DNA nucleotide query vs nucleotide database

Page 10: Lecture #4 : Comparing genes

Choices for programs

Megablast
  Highly similar sequences >95%
  Word length 28
Discontiguous megablast
  Pretty similar seqs
  Word length 11
Blastn
  Dissimilar seqs
  Word length 11
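To see why a longer word length suits only highly similar sequences, it helps to count the exact words two sequences share. This is a toy sketch, not how BLAST is implemented: the function name `shared_words` is hypothetical, and the two 12-base sequences are the mutation example used later in this lecture.

```python
def shared_words(query, subject, word_length):
    """Return the exact words of the given length that occur in both sequences."""
    def words(seq):
        return {seq[i:i + word_length] for i in range(len(seq) - word_length + 1)}
    return words(query) & words(subject)


query = "ATGCCGTGACGT"
subject = "ATGCCTTGACGT"  # differs from the query at one base

# Short words still find exact matches on either side of the difference.
print(len(shared_words(query, subject, 4)))
# Longer words all overlap the difference here, so nothing matches exactly.
print(len(shared_words(query, subject, 7)))
```

With a single difference in 12 bases, five words of length 4 are shared but no word of length 7 is. The same logic, scaled up, is why a word length of 28 only works when sequences are nearly identical.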

Page 11: Lecture #4 : Comparing genes

Translated blast = protein query vs translated database

Page 12: Lecture #4 : Comparing genes

BLAST a genome

Request ID AWJ4D4B7012

Page 13: Lecture #4 : Comparing genes
Page 14: Lecture #4 : Comparing genes
Page 15: Lecture #4 : Comparing genes
Page 16: Lecture #4 : Comparing genes
Page 17: Lecture #4 : Comparing genes

BLASTing is fun

This is meant to be enjoyable
Be a genome explorer
  Find out what kind of data is out there
  Find out what kind of data isn’t there

QUESTIONS?????

Page 18: Lecture #4 : Comparing genes

Q1.

There is so much data in Genbank. How do you find GOOD data?

Example
  Bovine rhodopsin - 1st G protein coupled receptor to be sequenced
  Search Genbank with text
  49 entries

Page 19: Lecture #4 : Comparing genes

Bovine opsin

Page 20: Lecture #4 : Comparing genes

Bovine rhodopsin

Page 21: Lecture #4 : Comparing genes

Searching for genes

Searching by text is fraught with peril
  Genbank has too many links
  Pull up many things that are not what you want
BLAST is better approach
NCBI has also made records which combine all similar sequences into one

Page 22: Lecture #4 : Comparing genes
Page 23: Lecture #4 : Comparing genes
Page 24: Lecture #4 : Comparing genes

NCBI has done some of the work

They have hand-curated data for some species to make a set of reference sequences
  Nucleotide sequences - NM_xxxxxx
  Protein sequences - NP_xxxxxx

For human rhodopsin
  NM_000539
  NP_000530

These are the gold standard for sequences
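When sorting through search results it can help to pick out the reference sequences by their accession prefix. The sketch below is a minimal example covering only the two prefixes named on the slide (NM for nucleotide, NP for protein); the function name `refseq_type` is hypothetical, and the pattern accepts the accession written with or without an underscore after the prefix.

```python
import re

# Only the two prefixes mentioned in the lecture are handled here.
REFSEQ_PATTERN = re.compile(r"^(NM|NP)_?(\d+)$")
PREFIX_MEANING = {"NM": "nucleotide", "NP": "protein"}


def refseq_type(accession):
    """Return 'nucleotide' or 'protein' for NM/NP accessions, otherwise None."""
    match = REFSEQ_PATTERN.match(accession.strip().upper())
    if match is None:
        return None
    return PREFIX_MEANING[match.group(1)]


for accession in ["NM_000539", "NP_000530", "AWJ4D4B7012"]:
    print(accession, refseq_type(accession))
```

Anything that does not match, such as a BLAST request ID, comes back as `None` rather than being guessed at.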

Page 25: Lecture #4 : Comparing genes

Homologene

Page 26: Lecture #4 : Comparing genes

Homologs

Genes in two organisms which descend from the same gene in their common ancestor and are passed down

Implies genes perform same function in two organisms

Therefore they can be compared to learn about evolution

Page 27: Lecture #4 : Comparing genes

Human

Chimp

Macaque

Bushbaby

These 4 primates have many genes which are homologs and have been passed down from primate ancestor

Page 28: Lecture #4 : Comparing genes

Homologene search for rhodopsin

Page 29: Lecture #4 : Comparing genes

Homologene

Page 30: Lecture #4 : Comparing genes

Three primary sequence portals: 1. NCBI

Page 31: Lecture #4 : Comparing genes

3. DNA Data Bank of Japan

Page 32: Lecture #4 : Comparing genes

2. Ensembl - European Bioinformatics Institute (EBI)

Page 33: Lecture #4 : Comparing genes

Select just genes

Page 34: Lecture #4 : Comparing genes

Scroll down to find the gene you want

Page 35: Lecture #4 : Comparing genes

Location
Orthologues are predicted and linked
Links to transcript and protein

Page 36: Lecture #4 : Comparing genes
Page 37: Lecture #4 : Comparing genes

OMIM - Online Mendelian Inheritance in Man

Page 38: Lecture #4 : Comparing genes

Good places to find genes

Model organisms: NCBI homologene
Genes from models and other organisms: Sanger Ensembl gene families
  NOTE: These are often predicted from genome sequences
  If there is a sequence in NCBI homologene, it may be different (and more accurate) than Sanger predictions
OMIM is a good reference

Page 39: Lecture #4 : Comparing genes

Q2. How do genes change through time?

Change in actual sequence
  Mutation
  Recombination
Change in frequency of a sequence
  Selection - “survive” better
  Drift - get passed on by chance
  Migration - move between populations

Page 40: Lecture #4 : Comparing genes

Mutation vs selection

Mutation = sequence change
  ATGCCGTGACGT -> ATGCCTTGACGT

Selection/drift/migration = sequence frequency changes across a number of individuals

  ATGTG ATGTG ATGTG ATGTG ATGTG ATGTG
  ATGTG ATGTG ATGTG ATGTG ATGTG ATGTT

  ATGTG ATGTG ATGTG ATGTT ATGTT ATGTT
  ATGTG ATGTG ATGTG ATGTT ATGTT ATGTT
  ATGTT ATGTG ATGTG ATGTT ATGTT ATGTT
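The two ideas on this slide can be checked with a few lines of Python. This is a sketch under the assumption that the slide shows two groups of individuals, one of 12 sequences and one of 18, as the sequence counts suggest; the function names and the labels `earlier` and `later` are made up for illustration.

```python
from collections import Counter


def changed_positions(before, after):
    """Return 0-based positions where two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(before, after)) if x != y]


def frequencies(population):
    """Return the fraction of individuals carrying each sequence."""
    counts = Counter(population)
    total = len(population)
    return {seq: count / total for seq, count in counts.items()}


# Mutation: one base changes in one sequence.
print(changed_positions("ATGCCGTGACGT", "ATGCCTTGACGT"))

# Selection/drift/migration: the same two sequences, at different frequencies.
earlier = ["ATGTG"] * 11 + ["ATGTT"]
later = ["ATGTG"] * 8 + ["ATGTT"] * 10
print(frequencies(earlier)["ATGTT"], frequencies(later)["ATGTT"])
```

The mutation changes a single position, while the second part involves no new sequence at all: ATGTT simply goes from 1 of 12 individuals to 10 of 18.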

Page 41: Lecture #4 : Comparing genes

Evolution as tinkerer

Changes are typically small
Mutation is source of new sequence
  Not all mutations are created equal
  Some occur more often than others
Other forces shift frequency of particular sequence

Page 42: Lecture #4 : Comparing genes

Triplet amino acid code

F, phe TTT   S, ser TCT   Y, tyr TAT    C, cys TGT
F, phe TTC   S, ser TCC   Y, tyr TAC    C, cys TGC
L, leu TTA   S, ser TCA   O, stop TAA   J, stop TGA
L, leu TTG   S, ser TCG   B, stop TAG   W, trp TGG

L, leu CTT   P, pro CCT   H, his CAT    R, arg CGT
L, leu CTC   P, pro CCC   H, his CAC    R, arg CGC
L, leu CTA   P, pro CCA   Q, gln CAA    R, arg CGA
L, leu CTG   P, pro CCG   Q, gln CAG    R, arg CGG

I, ile ATT   T, thr ACT   N, asn AAT    S, ser AGT
I, ile ATC   T, thr ACC   N, asn AAC    S, ser AGC
I, ile ATA   T, thr ACA   K, lys AAA    R, arg AGA
M, met ATG   T, thr ACG   K, lys AAG    R, arg AGG

V, val GTT   A, ala GCT   D, asp GAT    G, gly GGT
V, val GTC   A, ala GCC   D, asp GAC    G, gly GGC
V, val GTA   A, ala GCA   E, glu GAA    G, gly GGA
V, val GTG   A, ala GCG   E, glu GAG    G, gly GGG

Page 43: Lecture #4 : Comparing genes

Mutation causes nucleotide change

What about AA sequence?
Synonymous change
  Syn = same
  AA stays same
Nonsynonymous change
  Not same
  AA changes
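Whether a change is synonymous can be worked out mechanically from the triplet code. The sketch below builds the standard genetic code in Python and compares the amino acid before and after a codon change. The names `CODON_TABLE` and `classify_change` are made up for illustration, and stop codons are written as `*` here rather than with the letters used in the lecture's table.

```python
from itertools import product

# Standard genetic code, listed with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(codon_before, codon_after):
    """Label a codon change by whether the encoded amino acid stays the same."""
    if CODON_TABLE[codon_before] == CODON_TABLE[codon_after]:
        return "synonymous"
    return "nonsynonymous"


# CCG and CCT both encode proline; CAG encodes glutamine.
print(classify_change("CCG", "CCT"))
print(classify_change("CCG", "CAG"))
```

The first change, CCG to CCT, is the one in the mutation example ATGCCGTGACGT to ATGCCTTGACGT when read in triplets from the first base, so that mutation leaves the amino acid sequence unchanged.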

Page 44: Lecture #4 : Comparing genes

Amino acid code

F, phe TTT   S, ser TCT   Y, tyr TAT    C, cys TGT
F, phe TTC   S, ser TCC   Y, tyr TAC    C, cys TGC
L, leu TTA   S, ser TCA   O, stop TAA   J, stop TGA
L, leu TTG   S, ser TCG   B, stop TAG   W, trp TGG

L, leu CTT   P, pro CCT   H, his CAT    R, arg CGT
L, leu CTC   P, pro CCC   H, his CAC    R, arg CGC
L, leu CTA   P, pro CCA   Q, gln CAA    R, arg CGA
L, leu CTG   P, pro CCG   Q, gln CAG    R, arg CGG

I, ile ATT   T, thr ACT   N, asn AAT    S, ser AGT
I, ile ATC   T, thr ACC   N, asn AAC    S, ser AGC
I, ile ATA   T, thr ACA   K, lys AAA    R, arg AGA
M, met ATG   T, thr ACG   K, lys AAG    R, arg AGG

V, val GTT   A, ala GCT   D, asp GAT    G, gly GGT
V, val GTC   A, ala GCC   D, asp GAC    G, gly GGC
V, val GTA   A, ala GCA   E, glu GAA    G, gly GGA
V, val GTG   A, ala GCG   E, glu GAG    G, gly GGG

Page 45: Lecture #4 : Comparing genes

Amino acid (AA) types

Non-polar    A, F, G, I, L, M, P, V, W
Polar        N, Q, S, T, Y
Charged, +   H, K, R
Charged, -   D, E
Other        C

Often changing AA within a group does not affect protein function
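The grouping above is easy to turn into a quick check of whether an amino acid change stays within a group. This is a minimal sketch that uses exactly the five groups listed on the slide; the names `AA_GROUPS`, `GROUP_OF`, and `same_group` are hypothetical.

```python
# Amino acid groups as listed on the slide (one-letter codes).
AA_GROUPS = {
    "non-polar": "AFGILMPVW",
    "polar": "NQSTY",
    "charged +": "HKR",
    "charged -": "DE",
    "other": "C",
}

# Reverse lookup: amino acid letter -> group name.
GROUP_OF = {aa: group for group, members in AA_GROUPS.items() for aa in members}


def same_group(aa1, aa2):
    """Return True if both amino acids fall in the same group."""
    return GROUP_OF[aa1] == GROUP_OF[aa2]


print(same_group("I", "V"))  # both non-polar
print(same_group("R", "K"))  # both positively charged
print(same_group("S", "K"))  # polar vs positively charged
```

A within-group change such as I to V or R to K is the kind that often leaves protein function intact, while S to K crosses groups.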

Page 46: Lecture #4 : Comparing genes

Selection

Stabilizing selection - Acts to keep protein function the same
Synonymous change more frequent than nonsynonymous
Amino acid changes within a group are much more common than changes between groups
  Non-polar -> non-polar
  Polar -> polar

Page 47: Lecture #4 : Comparing genes

Similarity matrix

A = alanine
C = cysteine
D = aspartic acid
E = glutamic acid
F = phenylalanine
G = glycine
H = histidine

Page 48: Lecture #4 : Comparing genes

Comparing sequences

Can do at either nucleotide or AA level

Gather sequences from a bunch of different organisms

Need to align them so that sites which perform the same function can be compared

Page 49: Lecture #4 : Comparing genes

Aligning sequences

Sequences may differ in length
  Often have differences at amino- or carboxy- terminus of the protein
Need a way to align parts of protein that are performing the same function
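One way to line up two sequences of different length is a global alignment that inserts gaps where needed. The sketch below uses the Needleman-Wunsch dynamic programming method with made-up scores (match +1, mismatch -1, gap -1). The lecture does not say which method or program was used, and real alignment tools use amino acid similarity matrices and align many sequences at once, so treat this as an illustration only. The two inputs are the first residues of the goldfish and medaka RH2 opsin sequences from the fish example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two residues
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("MNGTEG", "MENGTEG")
print(aligned_a)
print(aligned_b)
print(total)
```

The extra residue near the amino terminus of the longer sequence is absorbed by a single gap, so the shared NGTEG residues end up in the same columns.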

Page 50: Lecture #4 : Comparing genes

Example - RH2 opsin in fishes

Goldfish    MNGTEGNNFYVPLSNR
Medaka      MENGTEGKNFYIPMNNR
Zebrafish   MNGTEGSNFYIPMSNR
Killifish   MGYGPNGTEGNNFYIPMSNK
Trout       MQNGTEGSNFYIPMSNR
Halibut     MVWDGGIEPNGTEGKNFYIPMSNR
Cod         MRMEANGTEGKNFYIPMSNR
Tetraodon   MVWDGGIEPNGTEGKNFYIPMSNR

Page 51: Lecture #4 : Comparing genes

Align sequences

Zebrafish   M--------NGTEGSNFYIPMSNR
Trout       M------Q-NGTEGSNFYIPMSNR
Medaka      M------E-NGTEGKNFYIPMNNR
Cod         M----RMEANGTEGKNFYIPMSNR
Halibut     MVWDGGIEPNGTEGKNFYIPMSNR
Tetraodon   MVWDGGIEPNGTEGKNFYIPMSNR
Goldfish    M--------NGTEGNNFYVPLSNR
Killifish   M---GYG-PNGTEGNNFYIPMSNK
            *        *****.***:*:.*:

* identical   : conserved   . semi-conserved
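The `*` marks under the alignment can be reproduced by checking each column for a single shared residue. This sketch covers only the identical columns; the `:` and `.` marks depend on an amino acid similarity grouping that the slide does not spell out, so they are left blank here. The names `ALIGNMENT` and `identical_columns` are made up for illustration.

```python
# Aligned RH2 opsin sequences, copied from the slide.
ALIGNMENT = {
    "Zebrafish": "M--------NGTEGSNFYIPMSNR",
    "Trout":     "M------Q-NGTEGSNFYIPMSNR",
    "Medaka":    "M------E-NGTEGKNFYIPMNNR",
    "Cod":       "M----RMEANGTEGKNFYIPMSNR",
    "Halibut":   "MVWDGGIEPNGTEGKNFYIPMSNR",
    "Tetraodon": "MVWDGGIEPNGTEGKNFYIPMSNR",
    "Goldfish":  "M--------NGTEGNNFYVPLSNR",
    "Killifish": "M---GYG-PNGTEGNNFYIPMSNK",
}


def identical_columns(sequences):
    """Return a string with '*' under columns where every sequence has the same residue."""
    marks = []
    for column in zip(*sequences):
        same = len(set(column)) == 1 and "-" not in column
        marks.append("*" if same else " ")
    return "".join(marks)


for name, seq in ALIGNMENT.items():
    print(f"{name:<10}{seq}")
print(" " * 10 + identical_columns(list(ALIGNMENT.values())))
```

Columns containing a gap in any species are never marked, which is why only the first M and the residues after the variable amino terminus can score as identical.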

Page 52: Lecture #4 : Comparing genes

Amino acid (AA) types

Non-polar    A, F, G, I, L, M, P, V, W
Polar        N, Q, S, T, Y
Charged, +   H, K, R
Charged, -   D, E
Other        C

Often changing AA within a group does not affect protein function