Welcome to Introduction to Bioinformatics Wednesday, 25 January Introduction to Molecular Biology

Post on 10-Jan-2016

Welcome to Introduction to Bioinformatics

Wednesday, 25 January
Introduction to Molecular Biology

Part 2: DNA to protein

• Coming attractions!
• Significance
• Palindromes (SQ4)
• Why introns? (SQ8)
• Types of mutation (SQ12)

These discussion groups may be useful if they are utilized by others in the class.

I think the Blackboard discussions can be really helpful with using peers as resources

I am pretty certain it will if people actively read and use it.

little understanding of what the big picture is.

when explaining the concepts please also mention the practical applications

E. coli: What makes it kill?

Escherichia coli . . .

. . . very small lab rats

Courtesy of Kent State University Microbiology

E. coli: What makes it kill?

Escherichia coli . . .

haemorrhagic colitis

E. coli: What makes it kill?

E. coli K12 E. coli O157:H7

Gene finder Gene finder

TCTACTTATA TTCAATCCAC AGGGCTACACAAGAGTCTGT TGAATGAACA CATACATGGTTTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA

TCTACTTATA TTCAATCCAC AGGGCTACACAAGAGTCTGT TGAATGAACA CATACATGGTTTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA

E. coli: What makes it kill?

Similarity finder

Killer protein

Membrane protein, sodium transporter

Iron responsive transcriptional regulator

Calcium dependent protein kinase

Unknown protein

Unknown protein

Unknown protein

. . .

Killer functions

ideas for new antibiotics

How to find the set of O157:H7-specific proteins?

E. coli K12 E. coli O157:H7

Gene finder Gene finder

TCTACTTATA TTCAATCCAC AGGGCTACACAAGAGTCTGT TGAATGAACA CATACATGGTTTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA

TCTACTTATA TTCAATCCAC AGGGCTACACAAGAGTCTGT TGAATGAACA CATACATGGTTTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA

How to sift through thousands of proteins?

DEFINE K12-set AS Genes-of K12-DNA
DEFINE O157-set AS Genes-of O157-DNA
FOR EACH protein IN O157-set
    WHEN Constituent-of (K12-set, protein) = FALSE
        COLLECT protein

"Constituent-of"? "Same"?
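The pseudocode above leaves open what "Constituent-of" and "Same" should mean. Here is a minimal Python sketch of one possible reading. It assumes each protein is an amino-acid string and that two proteins count as the "same" when an identity score passes a threshold. The function names, the 0.9 threshold, and the toy sequences are all invented for illustration; a real comparison would rely on sequence alignment rather than position-by-position matching.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions; 0.0 if lengths differ (no alignment is attempted)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def constituent_of(protein_set, protein, threshold=0.9):
    """True if some member of protein_set is the 'same' as protein under the assumed threshold."""
    return any(identity(protein, other) >= threshold for other in protein_set)


def strain_specific(reference_set, query_set, threshold=0.9):
    """Collect proteins in query_set that have no counterpart in reference_set."""
    return [p for p in query_set if not constituent_of(reference_set, p, threshold)]


# Toy data (invented for illustration, not real E. coli proteins)
k12_set = ["MKTAYIAKQR", "MSLNEQWERT"]
o157_set = ["MKTAYIAKQR", "MSLNEQWERS", "MPPGHHCCDD"]

print(strain_specific(k12_set, o157_set))       # near-identical proteins count as the same
print(strain_specific(k12_set, o157_set, 1.0))  # only exact matches count as the same
```

Changing the threshold changes the answer, which is the point of the question on the slide: the result depends on how "same" is defined.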

How to find the basis of drought resistance?

Wild-type crops Drought-resistant

Gene finder Gene finder

Ray Wu, Cornell Univ.

How to find the basis of human traits?

Great apes Humans

Gene finder Gene finder

>NC_000913 (size:4639221) AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAA AAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAAT TAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATA GCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCAT TAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGG GCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCG ATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCT GCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCAT TAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTG CCGAACTTTTGACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCG CAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCT GCATGGCATTAGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGC TGATTTGCCGTGGCGAGAAAATGTCGATCGCCATTATGGCCGGCGTATTA GAAGCGCGCGGTCACAACGTTACTGTTATCGATCCGGTCGAAAAACTGCT GGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCTGAGTCCACCC GCCGTATTGCGGCAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCA GGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAA CGGTTCCGACTACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATT GTTGCGAGATTTGGACGGACGTTGACGGGGTCTATACCTGCGACCCGCGT CAGGTGCCCGATGCGAGGTTGTTGAAGTCGATGTCCTACCAGGAAGCGAT GGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGCACCATTACCC CCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCT CAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACC GGTCAAGGGCATTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCTG GTCCGGGGATGAAAGGGATGGTCGGCATGGCGGCGCGCGTCTTTGCAGCG ATGTCACGCGCCCGTATTTCCGTGGTGCTGATTACGCAATCATCTTCCGA ATACAGCATCAGTTTCTGCGTTCCACAAAGCGACTGTGTGCGAGCTGAAC GGGCAATGCAGGAAGAGTTCTACCTGGAACTGAAAGAAGGCTTACTGGAG CCGCTGGCAGTGACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGG TATGCGCACCTTGCGTGGGATCTCGGCGAAATTCTTTGCCGCACTGGCCC GCGCCAATATCAACATTGTCGCCATTGCTCAGGGATCTTCTGAACGCTCA ATCTCTGTCGTGGTAAATAACGATGATGCGACCACTGGCGTGCGCGTTAC TCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTG GCGTCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAA AGCTGGCTGAAGAATAAACATATCGACTTACGTGTCTGCGGTGTTGCCAA CTCGAAGGCTCTGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGC 
AGGAAGAACTGGCGCAAGCCAAAGAGCCGTTTAATCTCGGGCGCTTAATT

Information + Self-assembly

Structure + Function

DNA

Protein

CGACCATCGCCTTAGTAC

Study Question 11

How do genes exert control over cellular processes?

Structure

Transcriptomics

Systems

Genomics

Metabolomics

Proteomics

What does this have to do with bioinformatics?

Michael Montague, J. Craig Venter Institute

Wednesday, January 26, 12:30 (12:15 food)

Engineering West Building, Room 401

Science (17 Dec 2010) 330:1605

IV.G. How a greater ability to predict protein function from its primary structure would change the world we live in

Are there any programs that predict the conformation/chemical properties of proteins?

Study Question 12

when explaining the concepts please also mention the practical applications

Important applications?

What have we gotten from…

Michael Faraday

Unity of electricity and magnetism

Inst Chem, Hebrew Univ Jerusalem

What have we gotten from…

William Roentgen

Behavior of electricity in a vacuum

http://www.wired.com/science/discoveries/news/2007/11/dayintech_1108

I am hearing the term palindromic DNA for the first time. I was wondering if an example could be given of a palindromic DNA sequence.

Palindromic Sequences

What is it? Backwards = forwards: ROTATOR

What about with DNA? GCTATCG

• DNA is redundant

• DNA is double stranded

5’-TTAATGTGAGTTAGCTCACTCATT-3’
3’-AATTACACTCAATCGAGTGAGTAA-5’

• DNA has direction (read 5’->3’)

Each strand read 5’->3’:

TTAATGTGAGTTAGCTCACTCATT
AATGAGTGAGCTAACTCACATTAA
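The three bullets above suggest a working definition: a DNA sequence is palindromic when it equals its own reverse complement, that is, when the opposite strand read 5'->3' spells the same thing. Below is a minimal Python sketch of that check. The function names are illustrative, and GAATTC (the EcoRI recognition site) is added here as a standard textbook example; it does not appear on the slides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """A DNA palindrome equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


print(is_palindromic("GCTATCG"))  # letters read the same backwards, but this is not a DNA palindrome
print(is_palindromic("GAATTC"))   # EcoRI site, a textbook DNA palindrome

top = "TTAATGTGAGTTAGCTCACTCATT"
print(reverse_complement(top))    # the other strand of the slide's example, read 5'->3'
print(is_palindromic(top))        # the slide's example is only approximately palindromic
```

Biological palindromes are often imperfect, as with the slide's 24-base example, so a practical search would usually tolerate a few mismatches.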

Palindromic Sequences

Palindromic sequences as structural RNA

5’-TTAATGTGAGTTAGCTCACTCATT-3’
3’-AATTACACTCAATCGAGTGAGTAA-5’

DNA: cruciform

RNA: stem/loop (tRNA)

Palindromic Sequences

Palindromic sequences as protein binding sites

why [are] palindromes… targeted by DNA-binding proteins

TTAATGTGAGTTAGCTCACTCATT
AATGAGTGAGCTAACTCACATTAA

recognizes GTGAGTT

Palindromes serve as binding sites for dimeric proteins

Palindromic Sequences

Palindromic sequences as protein binding sites

GTA ..(8).. TAC

5’-GTA ..(8).. TACNNNNNNNNNNTANNNTNNNNNNNNNNNNNNNNNNNNNNNNNNNNATGNNNNNNNNNNNNNNNN
3’-CAT ..(8).. ATGNNNNNNNNNNATNNNANNNNNNNNNNNNNNNNNNNNNNNNNNNNTACNNNNNNNNNNNNNNNN

Figure labels: Transcription factor, RNA Polymerase, gene, RNA

Is the promoter a beginning string of nucleotides for RNA, and if so, can such a beginning string have uses beyond just defining the beginning of the promoter sequence?
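A binding-site pattern such as GTA ..(8).. TAC can be searched for with a regular expression. The sketch below reads "..(8).." as exactly eight nucleotides of any kind, which is an assumption about the slide's notation. The function name and the demo sequence are invented for illustration.

```python
import re

# GTA, then any 8 nucleotides, then TAC. The lookahead lets overlapping sites be reported too.
SITE = re.compile(r"(?=(GTA[ACGT]{8}TAC))")


def find_sites(seq: str):
    """Return (0-based position, matched site) for every occurrence of the pattern."""
    return [(m.start(), m.group(1)) for m in SITE.finditer(seq.upper())]


# Hypothetical test sequence: two flanking bases, one site, two flanking bases
demo = "CC" + "GTA" + "A" * 8 + "TAC" + "GG"
for position, site in find_sites(demo):
    print(position, site)
```

Because GTA and TAC are reverse complements of each other, a site that matches on one strand also matches on the other, so scanning a single strand is enough for this pattern.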

Structure of DNA

ATG GAG CCT …

Which strand determines the codons of the gene?

[Figure: double-stranded DNA with the 5’ and 3’ ends of each strand labeled]
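One way to work through the question above: by convention, the coding (sense) strand carries the same sequence as the mRNA, with T in place of U, while the template strand is its complement and is the one RNA polymerase reads. The Python sketch below applies this to the codons on the slide, using the standard genetic code. The function names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG ('*' marks a stop codon)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def template_strand(coding: str) -> str:
    """Complement of the coding strand, written 3'->5' so it lines up under the coding strand."""
    return coding.translate(str.maketrans("ACGT", "TGCA"))


def transcribe(coding: str) -> str:
    """The mRNA has the coding strand's sequence with U in place of T."""
    return coding.replace("T", "U")


def translate(coding: str) -> str:
    """Translate complete codons of the coding strand, read 5'->3' from its first base."""
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


coding = "ATGGAGCCT"               # ATG GAG CCT from the slide
print(template_strand(coding))     # strand read by RNA polymerase, shown 3'->5'
print(transcribe(coding))          # mRNA
print(translate(coding))           # one-letter amino acid codes
```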

Structure of DNA

Structure of the Genome

           10        20        30
5’-GGCAGAATGTGAAGCTAGGCATTGTACCTAG-3’
3’-CCGTCTTACACTTCGATCCGTAACATGGATC-5’

Where might be a start codon? (presume ATG)

[Figure: the same double-stranded sequence with four candidate sites labeled 1 to 4 and the 5’ and 3’ ends of each strand marked]

Where might be a start codon? (presume ATG)
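The start-codon question can be checked mechanically. Since DNA is read 5'->3', an ATG on the bottom strand shows up as CAT in the top strand. The Python sketch below is one way to list the candidates on both strands, reporting positions on the top strand's 1-based ruler. The function names are illustrative, and the code only locates ATG triplets; it does not decide whether a gene actually starts there.

```python
def find_all(seq: str, motif: str):
    """1-based positions of every (possibly overlapping) occurrence of motif in seq."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


def candidate_start_codons(top: str):
    """ATG sites readable 5'->3' on either strand, in top-strand coordinates.

    For the bottom strand, the position is the leftmost base of 'CAT' on the top
    strand; the codon itself is read right to left along the ruler.
    """
    return {"top": find_all(top, "ATG"), "bottom": find_all(top, "CAT")}


top = "GGCAGAATGTGAAGCTAGGCATTGTACCTAG"   # top strand from the slide, 5'->3'
print(candidate_start_codons(top))
```

The letters A-T-G also appear elsewhere in the two strands as drawn, but only when read 3'->5', so those occurrences are not candidates.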

I think it could be easier if we spoke with the students near us first (e.g., in little groups) for a minute or two, then asked for comments from the whole class.
