Rec DNA II.1 The Human Genome Project Rec DNA II.2 2003 Completion of the Human Genome Programe...
Rec DNA II. 1
The Human Genome Project
Rec DNA II. 2
2003: Completion of the Human Genome Project
Start of the "post-genomic era"
2001: First draft of the human genome
HGP (Human Genome Project): Francis Collins
NCBI: National Center for Biotechnology Information
Celera Genomics (private sector): Craig Venter
Rec DNA II. 3
GTCCGGTCCC GGGACCCCCT GCCCAGGGTC AGAGGGGCGC CTACCTAGCT CACGGTCTTG
GGCCGGAGGG AATGGAGGAG GGAGCGGGGT CGACCGCTCA GCTGTCCGCC CAGTTTCGGA
GGCGGCCACG CGAGGATCAA CTGTGCAACG GGTGGGGCCG CGGCTGACCG TGGTGGTCGC
GGGGGCTGAG GGCCAGAGGC TGCGGGGGGG GGGCGGCGGG ATGAGCTAGG CGTCGGCGGT
TGAGTCGGGC GCGGAGTCGG GGGCAGGGGG AGCGGGCGTG GAGGGCGCGC ACGAGGTCGA
GGCGAGTCCG CGGGGGAGGC GGGCAGAGCC TGAGCTCAGG TCTTTCTGCG TCTGGCGGAA
CGGGCCTGGG AGGGAGGTTT TGCCAGATAC CAGGTGGACT AGGGTGAGCG CCCGAGGGCC
GGGACGCACG CACGGGCCGG GTAGGATGGC GCTGGCGTCG ATGCCCGCGC GCTTCAGGGC
CTGGTCTGGC CGCCCCTCCA TCCTTGTCGG TTTCTCGGGT CGCGGACCCC GCGCGGCGCC
GGGCGATGCT GGCCTGCCCG TGGCCACCAC CTCGCTTCAT TCCCGTCTCT TTGGGCCGCC
GCATTCGTCC ACGTGCCCGT CTCTCCCTGC GCAAAATTCC AAGATGAGCA AATACTGGGC
TCACGGTGGA GCGCCGCGGG GGCCCCCCTG AGCCGGGGCG GGTCGGGGGC GGGACCAGGG
TCCGGCCGGG GCGTGCCCGA GGGGAGGGAC TCCCCGGCTT GCGACCCGGC GTTGTCCGCG
J. Watson, first director of the HGP
The Human Genome Project
‘clone by clone’ technique:
- Parallel construction of genetic and physical maps
- Representation of the genome in ordered libraries
Rec DNA II. 4
Mapping strategies: Physical Maps
Listed from low to high resolution:
Cytogenetic (chromosomal) maps: banding pattern
Cosmid contig maps: libraries of ordered, overlapping clones
Restriction maps: sites of known restriction enzymes
DNA sequences
Rec DNA II. 5
The ‘clone by clone’ technique
1st aim:
Find 30,000 markers (at an average spacing of 150,000 bp)
Marker: a unique sequence
2nd aim:
- Isolate chromosomes
- Cleave them with an endonuclease (150,000 bp fragments)
- Clone them (Bacterial Artificial Chromosome, BAC, clones)
Rec DNA II. 6
The ‘clone by clone’ technique
3rd aim:
Map the BAC clones with restriction endonucleases
Put them in order!
Ordered BAC libraries
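The ordering step can be sketched as follows. This is a minimal illustration, not the HGP's actual mapping software: the clone names and fragment sizes are hypothetical, and the brute-force search only works for a handful of clones. Clones that share restriction fragments are assumed to overlap, so the best order is the one in which neighbours share the most fragments.

```python
from itertools import permutations

# Hypothetical fingerprints: restriction fragment sizes (kb) seen for each BAC clone.
fingerprints = {
    "BAC_C": {7, 11, 19, 23},
    "BAC_A": {3, 5, 8, 12},
    "BAC_B": {8, 12, 7, 11},
}


def shared(a, b):
    """Number of restriction fragments two clones have in common."""
    return len(fingerprints[a] & fingerprints[b])


def order_clones(names):
    """Return the clone order that maximizes shared fragments between neighbours.

    Brute force over all permutations: acceptable for a few clones only.
    """
    best = max(
        permutations(sorted(names)),
        key=lambda order: sum(shared(x, y) for x, y in zip(order, order[1:])),
    )
    return list(best)


print(order_clones(fingerprints))
```

The reverse order scores equally well; a fingerprint map alone does not fix the orientation of the contig.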
Rec DNA II. 7
Sequencing the BAC clones
150 000 bp (BAC)
1500 bp fragments (overlapping)
Sequence the ends:
GCCGAATCCAATTAGAAAAT
TAGAAAATCACATTTACCAGTCTGA
CCAGTCTGACCCCGCAAACGGGTTT
Align the sequences:
GCCGAATCCAATTAGAAAAT
            TAGAAAATCACATTTACCAGTCTGA
                            CCAGTCTGACCCCGCAAACGGGTTT
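Aligning the three reads from this slide can be sketched in a few lines. This is a simplified illustration that assumes the reads are already in the right order and free of errors; real assemblers have to handle neither assumption. The function names and the minimum overlap of 5 bases are choices made for this example.

```python
def overlap(left, right, min_len=5):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def assemble(reads, min_len=5):
    """Merge reads in the given order, joining each read on its overlap with the contig."""
    contig = reads[0]
    for read in reads[1:]:
        k = overlap(contig, read, min_len)
        if k == 0:
            raise ValueError(f"no overlap of at least {min_len} bases for {read}")
        contig += read[k:]  # append only the part not already covered
    return contig


reads = [
    "GCCGAATCCAATTAGAAAAT",
    "TAGAAAATCACATTTACCAGTCTGA",
    "CCAGTCTGACCCCGCAAACGGGTTT",
]
contig = assemble(reads)
print(contig)
print(len(contig))
```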
Rec DNA II. 8
Celera: The "shotgun" method
Craig Venter
2000 bp and 10000 bp fragments
Sequencing of the ends and aligning by computer:
AAGGACTTATG____________________GGACACAGGTTATGG
GACTTA_____CGTTGGAGAGAGGACACA________________CGTTATATTG
Only physical maps
Rec DNA II. 9
Representation of the human genome
1. Databases (‘in silico’)
HGP: http://www.ncbi.nlm.nih.gov/
Celera: http://www.celera.com/
2. A series of bacterial colonies (BAC libraries)
Rec DNA II. 10
The ENTREZ database
http://www.ncbi.nlm.nih.gov/Entrez/
National Center for Biotechnology Information, USA
Surfing on the Net
Rec DNA II. 11
http://www.ncbi.nlm.nih.gov/mapview/
Search for Homo sapiens, DRD4 (dopamine D4 receptor gene)
Surfing on the Net
Rec DNA II. 12
http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9606&query=DRD4
Surfing on the Net
Chromosomal localization of the DRD4 gene
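The search address on this slide encodes the query in two parameters. The sketch below only takes the URL apart with the Python standard library; it does not contact NCBI, and whether the address still resolves today is not assumed. 9606 is the NCBI taxonomy identifier of Homo sapiens.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9606&query=DRD4"

parts = urlsplit(url)
params = parse_qs(parts.query)  # each value is a list, because a key may repeat

print(parts.netloc)  # host name
print(parts.path)    # script that handles the search
print(params)        # organism (taxid) and gene symbol (query)
```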
Rec DNA II. 13
http://www.ncbi.nlm.nih.gov/mapview/maps.cgi?taxid=9606&chr=11&MAPS=genec,ugHs,genes-r&cmd=focus&fill=40&query=uid(1641)&QSTR=DRD4
Surfing on the Net
Zoom
Search gene
sequence
Rec DNA II. 14
NCBI Entrez Gene
Rec DNA II. 15
http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=60521
Online Mendelian Inheritance in Man (OMIM)
OMIM: database of mutations and diseases
Known function of genes
Surfing on the Net
Review of the literature, references
Rec DNA II. 16
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=retrieve&dopt=default&list_uids=1815
Surfing on the Net
Exon – intron structure
Exon (red box) – intron (red line) structure of a gene
Direction of transcription
Rec DNA II. 17
The “useful information” of the genome
About 20,000 genes
Less than 5% of the genome ???
The ‘extra’ (‘junk’) DNA - Repeat sequences
45% of the human genome consists of “jumping genes” (transposons)
• LINEs (long interspersed elements): 6 kb, about 850,000 copies, 25% of our genome; replicate by reverse transcription; many truncated (inactive) forms
• SINEs (short interspersed elements): 100-300 bp, 1.5 million copies, 13% of our genome; replicate by using the LINE machinery
Others
• Duplicated human genes (pseudogenes)
• Simple repeats (e.g. AAAAAAAAAAAAAA…)
Rec DNA II. 18
Protein databases
Universal Protein Resource (UniProt: a merger of Swiss-Prot, TrEMBL and PIR)
http://www.expasy.uniprot.org/
NCBI Entrez Protein database
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein&itool=toolbar
Surfing on the Net
Rec DNA II. 19
Databases of transcription factors
http://www.gene-regulation.com/pub/databases.html
http://www.gene-regulation.com/pub/databases.html#transcompel
2 transcription factors together
http://www.cbil.upenn.edu/tess/
Surfing on the Net
Rec DNA II. 20
The polymorphic nature of the human genome
Approx. 0.5% variation (15 million base pairs)
Rec DNA II. 21
“Similarity” in terms of gene sequence
Unrelated humans: share 99.9% (the difference is about 3 × 10^6 bp)
Mutations & Polymorphisms
GAGGGAGCGC
GAGGGAGCGC
GAGGGAGCGC
GAGGGTGCGC
GAGGGTGCGC
GAGGGTGCGC
Humans & apes: share ~95%
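"Similarity" here is simply the share of identical positions. The sketch below computes it for the two 10-base variants on this slide and redoes the arithmetic behind the 3 × 10^6 bp figure. The genome size of 3 × 10^9 bp is an assumption; it is not stated on the slide but follows from the numbers given.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equally long sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# The two variants shown on the slide differ at a single position.
print(percent_identity("GAGGGAGCGC", "GAGGGTGCGC"))

# Assumption: a genome of about 3 x 10^9 bp.
genome_bp = 3_000_000_000
# 99.9% shared means 0.1% differs, i.e. 1 base in 1000.
differing_bp = genome_bp // 1000
print(differing_bp)
```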
Rec DNA II. 22
GTCCGGTCCC GGGACCCCCT GCCCAGGGTC AGAGGGGCGC CTACCTAGCT CACGGTCTTG
GGCCGGAGGG AATGGAGGAG GGAGCGGGGT CGACCGCTCA GCTGTCCGCC CAGTTTCGGA
GGCGGCCACG CGAGGATCAA CTGTGCAACG GGTGGGGCCG CGGCTGACCG TGGTGGTCGC
GGGGGCTGAG GGCCAGAGGC TGCGGGGGGG GGGCGGCGGG ATGAGCTAGG CGTCGGCGGT
TGAGTCGGGC GCGGAGTCGG GGGCAGGGGG AGCGGGCGTG GAGGGCGCGC ACGAGGTCGA
GGCGAGTCCG CGGGGGAGGC GGGCAGAGCC TGAGCTCAGG TCTTTCTGCG TCTGGCGGAA
CGGGCCTGGG AGGGAGGTTT TGCCAGATAC CAGGTGGACT AGGGTGAGCG CCCGAGGGCC
GGGACGCACG CACGGGCCGG GTAGGATGGC GCTGGCGTCG ATGCCCGCGC GCTTCAGGGC
CTGGTCTGGC CGCCCCTCCA TCCTTGTCGG TTTCTCGGGT CGCGGACCCC GCGCGGCGCC
GGGCGATGCT GGCCTGCCCG TGGCCACCAC CTCGCTTCAT TCCCGTCTCT TTGGGCCGCC
GCATTCGTCC ACGTGCCCGT CTCTCCCTGC GCAAAATTCC AAGATGAGCA AATACTGGGC
TCACGGTGGA GCGCCGCGGG GGCCCCCCTG AGCCGGGGCG GGTCGGGGGC GGGACCAGGG
TCCGGCCGGG GCGTGCCCGA GGGGAGGGAC TCCCCGGCTT GCGACCCGGC GTTGTCCGCG
Mutations: rare allele variations (in less than 1% of the human population), usually causing monogenic disorders
when the “misprint” is fatal
Identified diseases with monogenic inheritance
Sickle cell anemia
Rec DNA II. 23
Genetic polymorphisms: variations with a frequency over 1% in humans
“… harmless misprints”
SNP (Single Nucleotide Polymorphism)
G C A C T A C C
C G T G A T G G

G C A T T A C C
C G T A A T G G
VNTR (Variable Number of Tandem Repeats)
2 repeats
3 repeats
4 repeats
5 repeats
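Finding the SNP in the slide's example comes down to comparing the two alleles position by position. The sketch below uses the double-stranded sequences shown on the slide; the function names are invented for this example, and positions are counted from 0.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Base-by-base complement of a strand (no reversal)."""
    return "".join(COMPLEMENT[base] for base in strand)


def find_snps(ref, alt):
    """Return (position, ref_base, alt_base) for every mismatching position."""
    return [(i, r, a) for i, (r, a) in enumerate(zip(ref, alt)) if r != a]


# (top strand, bottom strand) of the two alleles shown on the slide
allele_1 = ("GCACTACC", "CGTGATGG")
allele_2 = ("GCATTACC", "CGTAATGG")

# Each bottom strand is the complement of its top strand.
print(complement(allele_1[0]) == allele_1[1])
print(complement(allele_2[0]) == allele_2[1])

# The alleles differ at one position: C/T on the top strand, G/A on the bottom.
print(find_snps(allele_1[0], allele_2[0]))
print(find_snps(allele_1[1], allele_2[1]))
```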
Rec DNA II. 24
Polymorphism - Mutation
Frequency: polymorphism more than 1%; mutation less than 1%
Effect: polymorphism neutral (???) or risk factor; mutation disease
Single Nucleotide Polymorphisms, SNPs (pronounced “snips”)
• 90% of the known variations
• most SNPs have only two alleles
Length polymorphism: repeat sequences
Rec DNA II. 25
What is next?
Rec DNA II. 26
What is next?
Rec DNA II. 27
“Human - ape genome: 95% similarity. What is the difference?”
Rec DNA II. 28
High-throughput methods in genome analysis: Automated DNA sequencing
‘Color sequencing’
Based on dideoxy chain termination (see also: Lehninger)
...3’ C A A G T C A C C T T G
C A A G
A ddA
Terminating positions
Sequencing reaction mixture:
- All four dNTPs
- All four ddNTPs, each with a different fluorescent dye
- DNA polymerase, primer
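How the terminated fragments give the sequence can be sketched with a toy simulation. It is a strong simplification under stated assumptions: the primer is ignored, exactly one fragment is produced per terminating position, and the dye is represented by the base letter itself. The template is the one drawn on the slide, written 3' to 5'.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def terminated_fragments(template_3to5):
    """One (length, dye) pair per position where a ddNTP stops the new strand."""
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)  # grows 5' -> 3'
    return [(i + 1, new_strand[i]) for i in range(len(new_strand))]


def read_sequence(fragments):
    """Sort fragments by length (as electrophoresis does) and read the dyes in order."""
    return "".join(dye for _, dye in sorted(fragments))


template = "CAAGTCACCTTG"  # 3' -> 5', as on the slide
fragments = terminated_fragments(template)

# In the tube the fragments are mixed; separation by size restores the order.
random.Random(0).shuffle(fragments)
print(read_sequence(fragments))
```

The read is the strand complementary to the template, in 5' to 3' direction.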
Rec DNA II. 29
Sequencing results:
Rec DNA II. 30
DNA chip (oligonucleotide array) 1. Mutation analysis
Chip size: 1.2 cm, ~60 000 positions
Size of one position: 50 µm
One position: 1 000 000 molecules
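The figure of about 60 000 positions can be checked with a short calculation. It assumes a square chip of 1.2 cm per side, tiled without gaps by square positions of 50 µm per side; the slide does not state the geometry explicitly.

```python
# Assumption: square chip, square positions, no gaps between positions.
chip_side_um = 12_000      # 1.2 cm expressed in micrometres
position_side_um = 50      # side length of one position

positions_per_side = chip_side_um // position_side_um
positions = positions_per_side ** 2
molecules = positions * 1_000_000  # 1 000 000 oligonucleotide molecules per position

print(positions_per_side)  # 240
print(positions)           # 57600, i.e. roughly 60 000
print(molecules)           # 57600000000
```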
Rec DNA II. 31
The oligonucleotide array
Example: mutation analysis of a 4 000 bp gene (e.g. CFTR)
Arrays of 20 bp oligos: 1–20, 2–21, 3–22, ...
4000 bp length: 4000 oligos
4 variations in the middle base: 12 000 additional oligos (the 3 alternative bases at each position)
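The tiling scheme can be sketched as follows. This is an illustration under assumptions, not a chip design tool: the "middle base" of a 20-mer is taken to be index 10, probes are written on the gene strand rather than as its complement, and the 24-base test sequence is just the start of the sequence shown earlier in the slides.

```python
BASES = "ACGT"


def tiling_probes(gene, length=20):
    """Build 4 probes per window: the window itself plus the 3 middle-base variants.

    Returns tuples (start, middle_base, probe, is_wild_type); start is 1-based,
    matching the 1-20, 2-21, 3-22 numbering of the windows.
    """
    mid = length // 2  # assumption: index 10 counts as the middle of a 20-mer
    probes = []
    for start in range(len(gene) - length + 1):
        window = gene[start:start + length]
        for base in BASES:
            probe = window[:mid] + base + window[mid + 1:]
            probes.append((start + 1, base, probe, base == window[mid]))
    return probes


gene = "GTCCGGTCCCGGGACCCCCTGCCC"  # 24 bases, so 5 windows of 20
probes = tiling_probes(gene)

print(len(probes))                          # 5 windows x 4 variants
print(sum(is_wt for *_, is_wt in probes))   # one wild-type probe per window
print(4000 - 20 + 1)                        # exact window count for a 4000 bp gene
```

Strictly, a 4000 bp gene gives 3981 full-length windows, so the round numbers on the slide are approximations.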
Rec DNA II. 32
The result
Sample
Control (no mutation)
Comparison by computer
Rec DNA II. 33
DNA-chip 2: Expression Analysis by Micro-arrays