Yigal Bejerano and Randeep S. Bhatia Bell Laboratories, Lucent Technologies IEEE INFOCOM, 2004
Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim...
-
Upload
tyrone-foster -
Category
Documents
-
view
222 -
download
1
Transcript of Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim...
![Page 1: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/1.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 1
MW 11:00-12:15 in Beckman B302
Profs: Serafim Batzoglou, Gill Bejerano
TAs: Aaron Wenger & Jim Notwell
![Page 2: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/2.jpg)
Goals
http://cs273a.stanford.edu [Bejerano Fall11/12] 2
![Page 3: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/3.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 3
Goals
• Meet your genome (learn to surf, learn the surf)• Understand genomic tools (theory, applications)• DIY (pose questions, use tools, write code, get answers)
![Page 4: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/4.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 4
Materials
How is the class split between CS and BIO?We’ll have three Friday sessions starting this Friday 10am, in B-302:Bio Primer; Text Processing Primer; UCSC Genome Browser Primer.Strong programming background recommended (CS107 level)
Homework (schedule on website):Two individual homework assignments (theory + practice),
plus a group project.Instead of an exam we’ll have a milestone and a final
poster session.
Attendance is mandatory (for grade). You may skip 2 lectures without affecting your grade.
Get on the mailing list!
Reading Material: mostly journal papers
![Page 5: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/5.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 5
Topics
Topics will include:
(1) genome sequencing: technologies, assembly, personalized sequencing
(2) functional landscape: genes, regulatory modules, repeats, RNA genes, epigenetics
(3) genomic contribution to human disease and disease susceptibility
(4) genome evolution: evolutionary processes, comparative genomics, ultraconservation, exaptation
As time permits, we may cover population genetics and personalized genomics, ancient DNA, metagenomics, or other current topics.
Discuss difference from CS262 (Winter), CS374(Fall), Gene203(Fall)
![Page 6: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/6.jpg)
Asides
• Biology is very complex.• Beautiful truths can be gleaned with little knowledge.• We’ll start with a bird’s eye view and gradually dive in.• But we still won’t exhaust the depth of any single topic.
• Feedback always welcome.
http://cs273a.stanford.edu [Bejerano Fall11/12] 6
![Page 7: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/7.jpg)
Organism – Cell - Genome
http://cs273a.stanford.edu [Bejerano Fall11/12] 7
1013 different cells in an adult human.
The cell is the basic unit of life.
DNA = linear molecule inside the cell that carries instructions needed throughout the cell’s life ~ long string(s) over a small alphabet
Alphabet of four (nucleotides/bases) {A,C,G,T} Strings of length 104-1011
...ACGTACGACTGACTAGCATCGACTACGACTAGCAC...
“instruction”Genome:
![Page 8: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/8.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 8
One Cell, One Genome, One Replication
•Every cell holds a copy of all its DNA = its genome.•The human body is made of ~1013 cells.•All originate from a single cell through repeated cell divisions.
cell
genome =
all DNA
chicken ≈ 1013 copies(DNA) of egg (DNA)
chicken
egg egg
egg
cell
division
DNA strings =
Chromosomes
![Page 9: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/9.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 9
Lights, Action, Rolling
2001
HGC Celera
Getting the “blueprint of life”
![Page 10: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/10.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 10
TTATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATACATATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTCAGTAATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTCCGTGCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACTAGCTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATGATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAAAAGCTGCATAACCACTTTAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAAGTATCAACAAAAAATTGTTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAATGACTAAATCTCATTCAGAAGAAGTGATTGTACCTGAGTTCAATTCTAGCGCAAAGGAATTACCAAGACCATTGGCCGAAAAGTGCCCGAGCATAATTAAGAAATTTATAAGCGCTTATGATGCTAAACCGGATTTTGTTGCTAGATCGCCTGGTAGAGTCAATCTAATTGGTGAACATATTGATTATTGTGACTTCTCGGTTTTACCTTTAGCTATTGATTTTGATATGCTTTGCGCCGTCAAAGTTTTGAACGATGAGATTTCAAGTCTTAAAGCTATATCAGAGGGCTAAGCATGTGTATTCTGAATCTTTAAGAGTCTTGAAGGCTGTGAAATTAATGACTACAGCGAGCTTTACTGCCGACGAAGACTTTTTCAAGCAATTTGGTGCCTTGATGAACGAGTCTCAAGCTTCTTGCGATAAACTTTACGAATGTTCTTGTCCAGAGATTGACAAAATTTGTTCCATTGCTTTGTCAAATGGATCATATGGTTCCCGTTTGACCGGAGCTGGCTGGGGTGGTTGTACTGTTCACTTGGTTCCAGGGGGCCCAAATGGCAACATAGAAAAGGTAAAAGAAGCCCTTGCCAATGAGTTCTACAAGGTCAAGTACCCTAAGATCACTGATGCTGAGCTAGAAAATGCTATCATCGTCTCTAAACCAGCATTGGGCAGCTGTCTATATGAATTAGTCAAGTATACTTCTTTTTTTTACTTTGTTCAGAACAACTTCTCATTTTTTTCTACTCATAACTTTAGCATCACAAAATACGCAATAATAACGAGTAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAG...TTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTTTCCTACGCATAATAAGAATAGGAGGGAATATCAAGCCAGACAATCTATCATTACATTTAAGCGGCTCTTCAAAAAGATTGAACTCTCGCCAACTTATGGAATCTTCCAATGAGACCTTTGCGCCAAATAATGTGGATTTGGAAAAAGAGTATAAGTCATCTCAGAGTAATATAACTACCGAAGTTTATGAGGCATCGAGCTTTGAAGAAAAAGTAAGCTCAGAAAAACCTCAATACAGCTCATTCTGGAAGAAAATCTATTATGAATATGTGGTCGTTGACAAATCAATCTTGGGTGTTTCTATTCTGGATTCATTTATGTACAACCAGGACTTGAAGCCCGTCGAAAAAGAAAGGCGGGTTTGGTCCTGGTACAATTATTGTTACTTCTGGCTTGCTGAATGTTTCAATATCAACACTTGGCAAATTGCAGCTACAGGTCTACAACTGGGTCTAAATTGGTGGCAGTGTTGGATAACAATTTGGATTGGGTACGGTTTCGTTGGTGCTTTTGTTGTTTTGGCCTCTAGAGTTGGATCTGCTTATCATTTGTCATTCCCTATATCATCTAGAGCATCATTCGGTATTTTCTTCTCTTTATGGCCCGTTATTAACAGAGTCGTCATGGCCATCGTTTGGTATAGTGTCCAAGCTTATATTGCGGCAACTCCCGTATCATTAATGCTGAAATCTATCTTTGGAAAAGATTTACAATGATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAAT
![Page 11: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/11.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 11
DNA sequencing
How we obtain the sequence of nucleotides of a species
…ACGTGACTGAGGACCGTGCGACTGAGACTGACTGGGTCTAGCTAGACTACGTTTTATATATATATACGTCGTCGTACTGATGACTAGATTACAGACTGATTTAGATACCTGACTGATTTTAAAAAAATATT…
![Page 12: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/12.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 12
DNA Sequencing – OverviewGel electrophoresis
Predominant, old technology by F. Sanger
Whole genome strategiesPhysical mappingWalkingShotgun sequencing
Computational fragment assembly
The future—new sequencing technologiesPyrosequencing, single molecule methods, …“Next” Generation sequencing, Third Gen,Novel assembly techniques
Future variants of sequencingResequencing of humansMicrobial and environmental sequencingCancer genome sequencing
1975
2015
![Page 13: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/13.jpg)
http://cs273a.stanford.edu [Bejerano Fall09/10] 13
Steps to Assemble a Genome
1. Find overlapping reads
4. Derive consensus sequence ..ACGATTACAATAGGTT..
2. Merge some “good” pairs of reads into longer contigs
3. Link contigs to form supercontigs
Some Terminology
read a 500-900 long word that comes out of sequencer
mate pair a pair of reads from two endsof the same insert fragment
contig a contiguous sequence formed by several overlapping readswith no gaps
supercontig an ordered and oriented set(scaffold) of contigs, usually by mate
pairs
consensus sequence derived from thesequene multiple alignment of reads
in a contig
![Page 14: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/14.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 14
NGS: Next Generation (re)Sequencing
Output = massive amounts of short, lower quality reads.
New Technologies + New Algorithms = New Opportunities
![Page 15: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/15.jpg)
Third Generation Sequencing
http://www.mcb.harvard.edu/branton/index.htm
Just one example:
Output: very long reads of 10,000-100,000 basepairs each.
We’ll be able to sequence “anything” we like. In a lab.
15http://cs273a.stanford.edu [Bejerano Fall11/12]
![Page 16: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/16.jpg)
100 million species
7 billion individuals
1013 cells in a human
Genomes, sequences everywhere
or sequence just an active portion
16http://cs273a.stanford.edu [Bejerano Fall11/12]
![Page 17: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/17.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 17
“Unfinished Business in a Finished Genome”
341 remaining gaps:
33 Heterochromatic,
35 Euchromatic Boundaries,
273 Euchromatic Interior regions.
Centromeric, Telomeric gaps
Arcocentric, rDNA clusters:chr. 13,14,15,21,22
![Page 18: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/18.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 18
Copy Number Variation (CNVs)
so... how representative is the reference genome?
[Redon et al, 2006]
![Page 19: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/19.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 19
TTATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATACATATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTCAGTAATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTCCGTGCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACTAGCTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATGATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAAAAGCTGCATAACCACTTTAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAAGTATCAACAAAAAATTGTTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAATGACTAAATCTCATTCAGAAGAAGTGATTGTACCTGAGTTCAATTCTAGCGCAAAGGAATTACCAAGACCATTGGCCGAAAAGTGCCCGAGCATAATTAAGAAATTTATAAGCGCTTATGATGCTAAACCGGATTTTGTTGCTAGATCGCCTGGTAGAGTCAATCTAATTGGTGAACATATTGATTATTGTGACTTCTCGGTTTTACCTTTAGCTATTGATTTTGATATGCTTTGCGCCGTCAAAGTTTTGAACGATGAGATTTCAAGTCTTAAAGCTATATCAGAGGGCTAAGCATGTGTATTCTGAATCTTTAAGAGTCTTGAAGGCTGTGAAATTAATGACTACAGCGAGCTTTACTGCCGACGAAGACTTTTTCAAGCAATTTGGTGCCTTGATGAACGAGTCTCAAGCTTCTTGCGATAAACTTTACGAATGTTCTTGTCCAGAGATTGACAAAATTTGTTCCATTGCTTTGTCAAATGGATCATATGGTTCCCGTTTGACCGGAGCTGGCTGGGGTGGTTGTACTGTTCACTTGGTTCCAGGGGGCCCAAATGGCAACATAGAAAAGGTAAAAGAAGCCCTTGCCAATGAGTTCTACAAGGTCAAGTACCCTAAGATCACTGATGCTGAGCTAGAAAATGCTATCATCGTCTCTAAACCAGCATTGGGCAGCTGTCTATATGAATTAGTCAAGTATACTTCTTTTTTTTACTTTGTTCAGAACAACTTCTCATTTTTTTCTACTCATAACTTTAGCATCACAAAATACGCAATAATAACGAGTAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAG...TTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTTTCCTACGCATAATAAGAATAGGAGGGAATATCAAGCCAGACAATCTATCATTACATTTAAGCGGCTCTTCAAAAAGATTGAACTCTCGCCAACTTATGGAATCTTCCAATGAGACCTTTGCGCCAAATAATGTGGATTTGGAAAAAGAGTATAAGTCATCTCAGAGTAATATAACTACCGAAGTTTATGAGGCATCGAGCTTTGAAGAAAAAGTAAGCTCAGAAAAACCTCAATACAGCTCATTCTGGAAGAAAATCTATTATGAATATGTGGTCGTTGACAAATCAATCTTGGGTGTTTCTATTCTGGATTCATTTATGTACAACCAGGACTTGAAGCCCGTCGAAAAAGAAAGGCGGGTTTGGTCCTGGTACAATTATTGTTACTTCTGGCTTGCTGAATGTTTCAATATCAACACTTGGCAAATTGCAGCTACAGGTCTACAACTGGGTCTAAATTGGTGGCAGTGTTGGATAACAATTTGGATTGGGTACGGTTTCGTTGGTGCTTTTGTTGTTTTGGCCTCTAGAGTTGGATCTGCTTATCATTTGTCATTCCCTATATCATCTAGAGCATCATTCGGTATTTTCTTCTCTTTATGGCCCGTTATTAACAGAGTCGTCATGGCCATCGTTTGGTATAGTGTCCAAGCTTATATTGCGGCAACTCCCGTATCATTAATGCTGAAATCTATCTTTGGAAAAGATTTACAATGATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAAT
![Page 20: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/20.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 20
TTATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATACATATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTCAGTAATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTCCGTGCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACTAGCTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATGATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAAAAGCTGCATAACCACTTTAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAAGTATCAACAAAAAATTGTTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAATGACTAAATCTCATTCAGAAGAAGTGATTGTACCTGAGTTCAATTCTAGCGCAAAGGAATTACCAAGACCATTGGCCGAAAAGTGCCCGAGCATAATTAAGAAATTTATAAGCGCTTATGATGCTAAACCGGATTTTGTTGCTAGATCGCCTGGTAGAGTCAATCTAATTGGTGAACATATTGATTATTGTGACTTCTCGGTTTTACCTTTAGCTATTGATTTTGATATGCTTTGCGCCGTCAAAGTTTTGAACGATGAGATTTCAAGTCTTAAAGCTATATCAGAGGGCTAAGCATGTGTATTCTGAATCTTTAAGAGTCTTGAAGGCTGTGAAATTAATGACTACAGCGAGCTTTACTGCCGACGAAGACTTTTTCAAGCAATTTGGTGCCTTGATGAACGAGTCTCAAGCTTCTTGCGATAAACTTTACGAATGTTCTTGTCCAGAGATTGACAAAATTTGTTCCATTGCTTTGTCAAATGGATCATATGGTTCCCGTTTGACCGGAGCTGGCTGGGGTGGTTGTACTGTTCACTTGGTTCCAGGGGGCCCAAATGGCAACATAGAAAAGGTAAAAGAAGCCCTTGCCAATGAGTTCTACAAGGTCAAGTACCCTAAGATCACTGATGCTGAGCTAGAAAATGCTATCATCGTCTCTAAACCAGCATTGGGCAGCTGTCTATATGAATTAGTCAAGTATACTTCTTTTTTTTACTTTGTTCAGAACAACTTCTCATTTTTTTCTACTCATAACTTTAGCATCACAAAATACGCAATAATAACGAGTAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAG...TTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTTTCCTACGCATAATAAGAATAGGAGGGAATATCAAGCCAGACAATCTATCATTACATTTAAGCGGCTCTTCAAAAAGATTGAACTCTCGCCAACTTATGGAATCTTCCAATGAGACCTTTGCGCCAAATAATGTGGATTTGGAAAAAGAGTATAAGTCATCTCAGAGTAATATAACTACCGAAGTTTATGAGGCATCGAGCTTTGAAGAAAAAGTAAGCTCAGAAAAACCTCAATACAGCTCATTCTGGAAGAAAATCTATTATGAATATGTGGTCGTTGACAAATCAATCTTGGGTGTTTCTATTCTGGATTCATTTATGTACAACCAGGACTTGAAGCCCGTCGAAAAAGAAAGGCGGGTTTGGTCCTGGTACAATTATTGTTACTTCTGGCTTGCTGAATGTTTCAATATCAACACTTGGCAAATTGCAGCTACAGGTCTACAACTGGGTCTAAATTGGTGGCAGTGTTGGATAACAATTTGGATTGGGTACGGTTTCGTTGGTGCTTTTGTTGTTTTGGCCTCTAGAGTTGGATCTGCTTATCATTTGTCATTCCCTATATCATCTAGAGCATCATTCGGTATTTTCTTCTCTTTATGGCCCGTTATTAACAGAGTCGTCATGGCCATCGTTTGGTATAGTGTCCAAGCTTATATTGCGGCAACTCCCGTATCATTAATGCTGAAATCTATCTTTGGAAAAGATTTACAATGATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATTT
Promoter motifs
3’ UTR motifs
Exons
Introns
![Page 21: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/21.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 21
Genomes, Genes & Proteins
The most visible instructions in our genome are Genes.
Genes explain exactly HOW to synthesize any protein.
Proteins are the work horses of every living cell.
...ACGTACGACTGACTAGCATCGACTACGACTAGCAC...
geneGenome:
cellprotein
![Page 22: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/22.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 22
Portals to the Human Genome
GGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAGGCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCGAAAGACCTGTTGGAGGCTATGAATGCAATCAAGGTGACAGACAACTGGTGCAATGATGGTAGTGGAAATGGAGGAGAGGGGATTGATTCAAGATGCATTTAGGACCAAGAATCGGGAGCTTGTGAACGTGTGTATGAGTACTGTAGACGGAGTGGGTGTGTCATCAGAGAAGATCTGAGCATTTGGGCTTGCTCTCCTCAGAGGCCCTGCGAGTGGAGTTCAGCTTTTCCTCATGGGGCAAATCTCACTTTCGCTCCAGTTCCTGGGGCTCAGAGTCCCTGGCCCAGATGCCTCTTGCCATCTCATCTTCACCCTGCCTGGCTTCCCTTGCTTGTTCCAGGATTGTTTCATAAAGAGGGATGTGGTTGGTCTTTAACCCTATGAATGCTGGCTGAGGATGCCTGCGGAACCTGTAGTGAAGCTTTCAGGGGCTGCTCGGGTTCTGGCTGGTAGGTGAACACTGTCCATCTTGCCGGCTGGGACACAGTGACTCTGGGTAGTTGTGTAAGAGAGGGGCCCTTGGCAGACAAACAGGTTCTTCTCTGTTGGTGGGCCAGCCAGCAGGTCAGTGGGAAGGTTAAAGGTCATGGGGTTTGGGAGAACTGGGTGAGGAGTTCAGCCCCATCCCCCGTAAAGCTCCTGGGAAGCACTTCTCTACTGGGGCAGCCCCTGATACCAGGGCACTCATTAACCCTCTGGGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAGGCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGGAAAGACCTGTTGGAGGCTATGAATGCAATCAAGGTGACAGACAACTGGTGCAATGATGGTAGTGGAAATGGAGGAGAGGGGATTGATTCAAGATGCATTTAGGACCAAGAATCGGGAGCTTGTGAACGTGTGTATGAGTACTGTAGACGGAGTGGGTGTGTCATCAGAGAAGATCTGAGCATTTGGGCTTGCTCTCCTCAGAGGCCCTGCGAGTGGAGTTCAGCTTTTCCTCATGGGGCAAATCTCACTTTCGCTCCAGTTCCTGGGGCTCAGAGTCCCTGGCCCAGATGCCTCTTGCCATCTCATCTTCACCCTGCCTGGCTTCCCTTGCTTGTTCCAGGATTGTTTCATAAAGAGGGATGTGGTTGGTCTTTAACCCTATGAATGCTGGCTGAGGATGCCTGCGGAACCTGTAGTGAAGCTTTCAGGGGCTGCTCGGGTTCTGGCTGGTAGGTGAACACTGTCCATCTTGCCGGCTGGGACACAGTGACTCTGGGTAGTTGTGTAAGAGAGGGGCCCTTGGCAGACAAACAGGTTCTTCTCTGTTGGTGGGCCAGCCAGCAGGTCAGTGGGAAGGTTAAAGGTCATGGGGTTTGGGAGAAACTGGGTGAGGAGTTCAGCCCCATCCCCCGTAAAGCTCCTGGGAAGCACTTCTCTACTGGGGCAGCCCCTGATACCAGGGCACTCATTAACCCTCTGGGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAGGCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCGAAAGACCTGTTGGAGGCTATGAATGCAATCAAGGTGACAGACAACTGGTGCAATGATGGTAGTGGAAATGGAGGAGAGGGGATTGATTCAAGATGCATTTAGGACCAAGAATCGGGAGCTTGTGAACGTGTGTATGAGTACTGTAGACGGAGTGGGTGTGTCATCAGAGAAGATCTGAGCATTTGGGCTTGCTCTCCTCAGAGGCCCTGCGAGTGGAGTTCAGCTTTTCCTCATGGGGCAAATCTCACTTTCGCTCCAGTTCCTGGGGCTCAGAGTCCCTGGCCCAGATGCCTCTTGCCATCTCATCTTCACCCTGCCTGGCTTCCCTTGCTTGTTCCAGGATTGTTTCATAAAGAGGGATGTGGTTGGTCTTTAACCCTATGAATGCTGGCTGAGGATGCCTGCGGAACCTGTAGTGAAGCTTTCAGGGGCTGCTCGGGTTCTGGCTGGTAGGTGAACACTGTCCATCTTGCCGGCTGGGACACAGTGACTCTGGGTAGTTGTGTAAGAGAGGGGCCCTTGGCAGACAAACAGGTTCTTCTCTGTTGGTGGGCCAGCCAGCAGGTCAGTGGGAAGGTTAAAGGTCATGGGGTTTGGGAGAACTGGGTGAGGAGTTCAGCCCCATCCCCCGTAAAGCTCCTGGGAAGCACTTCTCTACTGGGGCAGCCCCTGATACCAGGGCACTCATTAACCCTCTGGGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAGGCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAG
Human Genome = three billion (3*109) basepairs:
![Page 23: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/23.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 23
Genome Browser Database
Primary table: positions, names, etc.
UnderlyingDatabase(MySQL)
Auxiliary table: related data
visualize search & download
![Page 24: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/24.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 24
The Human Gene Set
[HGC, 2001]
![Page 25: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/25.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 25
Gene Finding – The PracticeChallenge:
“The genes, the whole genes, and nothing but the genes”
![Page 26: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/26.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 26
Meet Your Genome
[Human Molecular Genetics, 3rd Edition]
![Page 27: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/27.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 27
Repeats / obile Elements ("selfish DNA")
HumanGenome:
3*109 letters1.5%
knownfunction >50%
junk
![Page 28: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/28.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 28
![Page 30: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/30.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 30
Transcripts, transcripts everywhere
Human Genome
Transcribed (Tx)
Tx from both strandsLeaky tx?
Functional?
![Page 31: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/31.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 31
Human Gene Regulation
All these cells have the same Genome.
Gene
Gene
Gene
Gene
20,000 Genes encode how to make proteins.
1,000,000 Genomic “switches” determinewhich and how much proteins to make.
1013 different cells in an adult human.
Hundreds of different cell types.
![Page 32: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/32.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 32
Combinatorial Regulatory Code
Gene
2,000 different proteins can bind specific DNA sequences.
A regulatory region encodes 3-10 such protein binding sites.
When all are bound by proteins the regulatory region turns “on”,and the nearby gene is activated to produce protein.
Proteins
DNA
DNA
Protein binding site
![Page 33: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/33.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 33
Cis-reg & Ultra elements from obile Elements
[Yass is a small town in New South Wales, Australia.]
Co-option event, probably due to favorable genomic context
All other copies are destined to decay over time at a neutral rate
[Bejerano et al., Nature 2006]
![Page 34: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/34.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 34
unicellular
multicellular
Unicellular vs. Multicellular
![Page 35: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/35.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 35
Signal Transduction
![Page 36: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/36.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 36
Histone Code
![Page 37: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/37.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 37
Every Genome is Different
DNA Replication is imperfect – between individuals of the same species, even between the cells of an individual.
...ACGTACGACTGACTAGCATCGACTACGA...
chicken
egg ...ACGTACGACTGACTAGCATCGACTACGA...
functionaljunk
TT CAT
“anything
goes”
many changes
are not toleratedchicken
This has bad implications – disease, and good implications – evolution.
![Page 38: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/38.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 38
TTATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATACATATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTCAGTAATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTCCGTGCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACTAGCTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATGATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAAAAGCTGCATAACCACTTTAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAAGTATCAACAAAAAATTGTTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAATGACTAAATCTCATTCAGAAGAAGTGATTGTACCTGAGTTCAATTCTAGCGCAAAGGAATTACCAAGACCATTGGCCGAAAAGTGCCCGAGCATAATTAAGAAATTTATAAGCGCTTATGATGCTAAACCGGATTTTGTTGCTAGATCGCCTGGTAGAGTCAATCTAATTGGTGAACATATTGATTATTGTGACTTCTCGGTTTTACCTTTAGCTATTGATTTTGATATGCTTTGCGCCGTCAAAGTTTTGAACGATGAGATTTCAAGTCTTAAAGCTATATCAGAGGGCTAAGCATGTGTATTCTGAATCTTTAAGAGTCTTGAAGGCTGTGAAATTAATGACTACAGCGAGCTTTACTGCCGACGAAGACTTTTTCAAGCAATTTGGTGCCTTGATGAACGAGTCTCAAGCTTCTTGCGATAAACTTTACGAATGTTCTTGTCCAGAGATTGACAAAATTTGTTCCATTGCTTTGTCAAATGGATCATATGGTTCCCGTTTGACCGGAGCTGGCTGGGGTGGTTGTACTGTTCACTTGGTTCCAGGGGGCCCAAATGGCAACATAGAAAAGGTAAAAGAAGCCCTTGCCAATGAGTTCTACAAGGTCAAGTACCCTAAGATCACTGATGCTGAGCTAGAAAATGCTATCATCGTCTCTAAACCAGCATTGGGCAGCTGTCTATATGAATTAGTCAAGTATACTTCTTTTTTTTACTTTGTTCAGAACAACTTCTCATTTTTTTCTACTCATAACTTTAGCATCACAAAATACGCAATAATAACGAGTAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAG...TTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTTTCCTACGCATAATAAGAATAGGAGGGAATATCAAGCCAGACAATCTATCATTACATTTAAGCGGCTCTTCAAAAAGATTGAACTCTCGCCAACTTATGGAATCTTCCAATGAGACCTTTGCGCCAAATAATGTGGATTTGGAAAAAGAGTATAAGTCATCTCAGAGTAATATAACTACCGAAGTTTATGAGGCATCGAGCTTTGAAGAAAAAGTAAGCTCAGAAAAACCTCAATACAGCTCATTCTGGAAGAAAATCTATTATGAATATGTGGTCGTTGACAAATCAATCTTGGGTGTTTCTATTCTGGATTCATTTATGTACAACCAGGACTTGAAGCCCGTCGAAAAAGAAAGGCGGGTTTGGTCCTGGTACAATTATTGTTACTTCTGGCTTGCTGAATGTTTCAATATCAACACTTGGCAAATTGCAGCTACAGGTCTACAACTGGGTCTAAATTGGTGGCAGTGTTGGATAACAATTTGGATTGGGTACGGTTTCGTTGGTGCTTTTGTTGTTTTGGCCTCTAGAGTTGGATCTGCTTATCATTTGTCATTCCCTATATCATCTAGAGCATCATTCGGTATTTTCTTCTCTTTATGGCCCGTTATTAACAGAGTCGTCATGGCCATCGTTTGGTATAGTGTCCAAGCTTATATTGCGGCAACTCCCGTATCATTAATGCTGAAATCTATCTTTGGAAAAGATTTACAATGATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAAT
![Page 39: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/39.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 39
Single Base Changes
![Page 40: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/40.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 40
Larger Size Mutations
[de Kok et al, 1996]
![Page 41: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/41.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 41
Genome Wide Association Studies
![Page 42: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/42.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 42
TTATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATACATATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTCAGTAATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTCCGTGCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACTAGCTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATGATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAAAAGCTGCATAACCACTTTAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAAGTATCAACAAAAAATTGTTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAATGACTAAATCTCATTCAGAAGAAGTGATTGTACCTGAGTTCAATTCTAGCGCAAAGGAATTACCAAGACCATTGGCCGAAAAGTGCCCGAGCATAATTAAGAAATTTATAAGCGCTTATGATGCTAAACCGGATTTTGTTGCTAGATCGCCTGGTAGAGTCAATCTAATTGGTGAACATATTGATTATTGTGACTTCTCGGTTTTACCTTTAGCTATTGATTTTGATATGCTTTGCGCCGTCAAAGTTTTGAACGATGAGATTTCAAGTCTTAAAGCTATATCAGAGGGCTAAGCATGTGTATTCTGAATCTTTAAGAGTCTTGAAGGCTGTGAAATTAATGACTACAGCGAGCTTTACTGCCGACGAAGACTTTTTCAAGCAATTTGGTGCCTTGATGAACGAGTCTCAAGCTTCTTGCGATAAACTTTACGAATGTTCTTGTCCAGAGATTGACAAAATTTGTTCCATTGCTTTGTCAAATGGATCATATGGTTCCCGTTTGACCGGAGCTGGCTGGGGTGGTTGTACTGTTCACTTGGTTCCAGGGGGCCCAAATGGCAACATAGAAAAGGTAAAAGAAGCCCTTGCCAATGAGTTCTACAAGGTCAAGTACCCTAAGATCACTGATGCTGAGCTAGAAAATGCTATCATCGTCTCTAAACCAGCATTGGGCAGCTGTCTATATGAATTAGTCAAGTATACTTCTTTTTTTTACTTTGTTCAGAACAACTTCTCATTTTTTTCTACTCATAACTTTAGCATCACAAAATACGCAATAATAACGAGTAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAG...TTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTTTCCTACGCATAATAAGAATAGGAGGGAATATCAAGCCAGACAATCTATCATTACATTTAAGCGGCTCTTCAAAAAGATTGAACTCTCGCCAACTTATGGAATCTTCCAATGAGACCTTTGCGCCAAATAATGTGGATTTGGAAAAAGAGTATAAGTCATCTCAGAGTAATATAACTACCGAAGTTTATGAGGCATCGAGCTTTGAAGAAAAAGTAAGCTCAGAAAAACCTCAATACAGCTCATTCTGGAAGAAAATCTATTATGAATATGTGGTCGTTGACAAATCAATCTTGGGTGTTTCTATTCTGGATTCATTTATGTACAACCAGGACTTGAAGCCCGTCGAAAAAGAAAGGCGGGTTTGGTCCTGGTACAATTATTGTTACTTCTGGCTTGCTGAATGTTTCAATATCAACACTTGGCAAATTGCAGCTACAGGTCTACAACTGGGTCTAAATTGGTGGCAGTGTTGGATAACAATTTGGATTGGGTACGGTTTCGTTGGTGCTTTTGTTGTTTTGGCCTCTAGAGTTGGATCTGCTTATCATTTGTCATTCCCTATATCATCTAGAGCATCATTCGGTATTTTCTTCTCTTTATGGCCCGTTATTAACAGAGTCGTCATGGCCATCGTTTGGTATAGTGTCCAAGCTTATATTGCGGCAACTCCCGTATCATTAATGCTGAAATCTATCTTTGGAAAAGATTTACAATGATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAAT
![Page 43: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/43.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 43
human
mouse
rat
chimp
chicken
fugu
zfish
dog
tetra
Intelligent Designer
human
mouse
rat
chimp
chicken
fugu
zfish
dog
tetra
opossum
cow
macaque
platypus
opossum
cow
macaque
platypus
Comparative Genomics
“Nothing in Biology Makes Sense Except in the Light of Evolution” Theodosius Dobzhansky
t[Adam Siepel, Cornell]
![Page 44: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/44.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 44
Fixation, Positive & Negative Selection
Neutral DriftNeutral Drift Positive SelectionPositive SelectionNegative SelectionNegative Selection
How can we detect negative
selection?
How can we detect negative
selection?
How can we detect positive
selection?
How can we detect positive
selection?
![Page 45: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/45.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 4545
PhenotypeGenotype
![Page 46: Http://cs273a.stanford.edu [Bejerano Fall11/12] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Aaron Wenger & Jim Notwell.](https://reader035.fdocuments.us/reader035/viewer/2022062304/56649f155503460f94c2a0ed/html5/thumbnails/46.jpg)
http://cs273a.stanford.edu [Bejerano Fall11/12] 46
To Be Continued…To Be Continued…