Human Genome Sequence and Variability
![Page 1: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/1.jpg)
Human Genome Sequence
and Variability
Gabor T. Marth, D.Sc.
Department of Biology, Boston College
Medical Genomics Course – Debrecen, Hungary, May 2006
![Page 2: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/2.jpg)
Lecture overview
1. Genome sequencing strategies, sequencing informatics
2. Genome annotation, functional and structural features in the human genome
3. Genome variability, DNA nucleotide, structural, and epigenetic variations
![Page 3: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/3.jpg)
1. The Human genome sequence
![Page 4: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/4.jpg)
The nuclear genome (chromosomes)
![Page 5: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/5.jpg)
The genome sequence
• the primary template on which to outline functional features of our genetic code (genes, regulatory elements, secondary structure, tertiary structure, etc.)
![Page 6: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/6.jpg)
Completed genomes
~1 Mb
~100 Mb
>100 Mb
~3,000 Mb
![Page 7: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/7.jpg)
Main genome sequencing strategies
Clone-based shotgun sequencing (Human Genome Project)
Whole-genome shotgun sequencing (Celera Genomics, Inc.)
![Page 8: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/8.jpg)
Hierarchical genome sequencing
BAC library construction
clone mapping
shotgun subclone library construction
sequencing
sequence reconstruction (sequence assembly)
Lander et al. Nature 2001
![Page 9: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/9.jpg)
Clone mapping – “sequence ready” map
![Page 10: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/10.jpg)
Hierarchical genome sequencing
BAC library construction
clone mapping
shotgun subclone library construction
sequencing/read processing
sequence reconstruction (sequence assembly)
Lander et al. Nature 2001
![Page 11: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/11.jpg)
Shotgun subclone library construction
BAC primary clone
cloning vector
sequencing vector
subclone insert
![Page 12: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/12.jpg)
Hierarchical genome sequencing
BAC library construction
clone mapping
shotgun subclone library construction
sequencing/read processing
sequence reconstruction (sequence assembly)
Lander et al. Nature 2001
![Page 13: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/13.jpg)
Sequencing
![Page 14: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/14.jpg)
Robotic automation
Lander et al. Nature 2001
![Page 15: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/15.jpg)
Base calling
PHRED
base = A, Q = 40
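The quality value shown for the base call follows the standard Phred definition, Q = -10 · log10(P_error), so Q = 40 corresponds to an estimated error probability of 1 in 10,000 for that base. A minimal sketch of the conversion is given here; the function names are illustrative and are not part of the PHRED program itself.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality value Q into an estimated base-call error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an estimated error probability into a Phred quality value."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Q = 40 is the value shown on the slide.
    print(phred_to_error_prob(40))
    # An error rate of 1 in 1,000 maps back to a quality value close to 30.
    print(error_prob_to_phred(0.001))
```

Each step of 10 in Q is a tenfold drop in the estimated error rate, which is why quality thresholds are usually quoted as Q20, Q30, or Q40.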
![Page 16: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/16.jpg)
Vector clipping
![Page 17: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/17.jpg)
Hierarchical genome sequencing
BAC library construction
clone mapping
shotgun subclone library construction
sequencing/read processing
sequence reconstruction (sequence assembly)
Lander et al. Nature 2001
![Page 18: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/18.jpg)
Sequence assembly
PHRAP
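To illustrate the idea of reconstructing a sequence from overlapping shotgun reads, here is a toy greedy overlap-merge sketch. It is not the PHRAP algorithm (which also uses base quality values and handles sequencing errors); the function names, the `min_len` threshold, and the three example reads are assumptions made for illustration only.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (>= min_len), else 0."""
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest exact overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more: leave the rest as separate contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    # Hypothetical error-free reads covering a short made-up region.
    example_reads = ["GATGTCTACG", "ATGGCACCACC", "CACCACCGATGTC"]
    print(greedy_assemble(example_reads))
```

Because the merge decision rests only on exact suffix-prefix matches, two reads from different copies of a repeated sequence would look like a valid overlap to this sketch, which is one reason real assemblers need additional evidence.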
![Page 19: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/19.jpg)
Repetitive DNA may confuse assembly
![Page 20: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/20.jpg)
Sequence completion (finishing)
CONSED, AUTOFINISH
gap
region of low sequence coverage and/or quality
![Page 21: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/21.jpg)
2. Human genome annotation
![Page 22: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/22.jpg)
Genome annotation – Goals
protein coding genes
RNA genes
repetitive elements
GC content
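Of the annotation goals listed on this slide, GC content is the simplest to compute directly from the sequence. The sketch below shows one way to do it, both for a whole sequence and in sliding windows; the function names and the window and step sizes are illustrative assumptions, not values from the lecture.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous (A, C, G, T) bases of seq."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return gc / acgt if acgt else 0.0


def windowed_gc(seq, window, step):
    """GC content of each full window of the given size, as (start, fraction) pairs."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    print(gc_content("ATGC"))
    print(windowed_gc("AAGGCCTT", window=4, step=2))
```

Ambiguous characters such as N or X are ignored in the denominator, so partially unresolved sequence does not drag the estimate toward zero.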
![Page 23: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/23.jpg)
The starting material
AGCGTGGTAGCGCGAGTTTGCGAGCTAGCTAGGCTCCGGATGCGACCAGCTTTGATAGATGAATATAGTGTGCGCGACTAGCTGTGTGTTGAATATATAGTGTGTCTCTCGATATGTAGTCTGGATCTAGTGTTGGTGTAGATGGAGATCGCGTAGCGTGGTAGCGCGAGTTTGCGAGCTAGCTAGGCTCCGGATGCGACCAGCTTTGATAGATGAATATAGTGTGCGCGACTAGCTGTGTGTTGAATATATAGTGTGTCTCTCGATATGTAGTCTGGATCTAGTGTTGGTGTAGATGGAGATCGCGTGCTTGAGTCGTTCGTTTTTTTATGCTGATGATATAAATATATAGTGTTGGTGGGGGGTACTCTACTCTCTCTAGAGAGAGCCTCTCAAAAAAAAAGCTCGGGGATCGGGTTCGAAGAAGTGAGATGTACGCGCTAGXTAGTATATCTCTTTCTCTGTCGTGCTGCTTGAGATCGTTCGTTTTTTTATGCTGATGATATAAATATATAGTGTTGGTGGGGGGTACTCTACTCTCTCTAGAGAGAGCCTCTCAAAAAAAAAGCTCGGGGATCGGGTTCGAAGAAGTGAGATGTACGCGCTAGXTAGTATATCTCTTTCTCTGTCGTGCT
![Page 24: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/24.jpg)
Coding genes – ab initio predictions
ATGGCACCACCGATGTCTACGTGGTAGGGGACTATAAAAAAAAAAA
Open Reading Frame = ORF
Start codon
Stop codon
PolyA signal
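The open reading frame in the example sequence on this slide can be located with a very small scan: look for an ATG in each of the three forward frames and extend it to the first in-frame stop codon. This is only a sketch of the ORF idea, not how the gene finders named in this lecture work; it ignores the reverse strand, introns, and any statistical model, and the function name is an assumption.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq):
    """Return (start, end) pairs for ATG...stop ORFs in the three forward frames.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon of a new ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3))
                start = None
    return orfs


if __name__ == "__main__":
    slide_seq = "ATGGCACCACCGATGTCTACGTGGTAGGGGACTATAAAAAAAAAAA"
    for start, end in find_orfs(slide_seq):
        print(start, end, slide_seq[start:end])
```

On the slide sequence this reports a single ORF running from the initial ATG to the TAG stop codon, that is, positions 0 to 27.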
![Page 25: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/25.jpg)
Ab initio predictions
Gene structure
![Page 26: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/26.jpg)
Ab initio predictions
…AGAATAGGGCGCGTACCTTCCAACGAAGACTGGG…
splice donor site
splice acceptor site
![Page 27: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/27.jpg)
Ab initio predictions
Genscan, Grail, Genie, GeneFinder, Glimmer, etc.
EST_genome, Sim4, Spidey, EXALIN
![Page 28: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/28.jpg)
Homology based predictions
ATGGCACCACCGATGTCTACGTGGTAGGGGACTATAAAAAAAAAAA
ACGGAAGTCT
known coding sequence from another organism
GGACTATAAA
expressed sequence
genes predicted by homology
Genomescan, Twinscan, etc.
![Page 29: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/29.jpg)
Consolidation – gene prediction systems
Otto
Ensembl
FgenesH
Genscan
Grail
Genewise
Sim4 dbEst
![Page 30: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/30.jpg)
ncRNA genes
prediction based on structure (e.g. tRNAs)
for other novel ncRNAs, only homology-based predictions have been successful
![Page 31: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/31.jpg)
Repeat annotations
Repeat annotations are based on sequence similarity to known repetitive elements in a repeat sequence library
![Page 32: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/32.jpg)
The landscape of the human genome
![Page 33: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/33.jpg)
Gene annotations – # of coding genes
Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001
![Page 34: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/34.jpg)
Gene annotations – gene length
Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001
![Page 35: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/35.jpg)
Gene annotations – gene function
Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001
![Page 36: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/36.jpg)
GC content and coding potential
Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001
![Page 37: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/37.jpg)
ncRNAs
Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001
![Page 38: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/38.jpg)
Segmental duplications
Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001
![Page 39: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/39.jpg)
Repeat elements
Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001
![Page 40: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/40.jpg)
Genes and repeats
![Page 41: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/41.jpg)
Physical vs. genetic map (Mb/cM)
0.4 cM 1.3 cM 0.7 cM
0.4 Mb 0.7 Mb 0.3 Mb
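The two rows of numbers can be turned into the Mb/cM ratio named in the slide title. The sketch below assumes that the genetic and physical distances pair up column by column (the figure that would confirm this is not part of the transcript), so the pairing and the variable names are assumptions.

```python
def mb_per_cm(genetic_cm, physical_mb):
    """Physical distance (Mb) spanned per unit of genetic distance (cM)."""
    return physical_mb / genetic_cm


# (genetic distance in cM, physical distance in Mb), paired column-wise from the slide.
# ASSUMPTION: the first cM value belongs with the first Mb value, and so on.
intervals = [(0.4, 0.4), (1.3, 0.7), (0.7, 0.3)]

ratios = [mb_per_cm(cm, mb) for cm, mb in intervals]

if __name__ == "__main__":
    for (cm, mb), ratio in zip(intervals, ratios):
        print(f"{cm} cM over {mb} Mb -> {ratio:.2f} Mb/cM")
```

Under that pairing the three intervals give roughly 1.00, 0.54, and 0.43 Mb/cM, which illustrates that the relationship between physical and genetic distance is not constant along a chromosome.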
![Page 42: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/42.jpg)
3. Human genome variability
![Page 43: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/43.jpg)
DNA sequence variations
• the reference human genome sequence is about 99.9% identical to the genome of any individual human
• sequence variations make our genetic makeup unique
SNP
• the most abundant human variations are single-nucleotide polymorphisms (SNPs) – 10 million SNPs are currently known
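As a rough illustration of these figures, the sketch below does two things: it estimates how many single-base differences a 99.9% identity implies for a genome of about 3,000 Mb, and it lists mismatching positions between two aligned sequences, which is the basic signal behind a SNP call. The function names and the two short sequences are hypothetical; real SNP discovery also has to account for alignment and sequencing error.

```python
def expected_differences(genome_size_bp, identity):
    """Approximate number of differing positions for a given fractional identity."""
    return genome_size_bp * (1 - identity)


def find_mismatches(reference, sample):
    """List (position, reference_base, sample_base) where two aligned sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, r, s)
        for pos, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


if __name__ == "__main__":
    # ~3,000 Mb genome at 99.9% identity: on the order of 3 million differences.
    print(round(expected_differences(3_000_000_000, 0.999)))
    # Hypothetical aligned fragments differing at one position.
    print(find_mismatches("ACGTACGT", "ACGTTCGT"))
```

The estimate of about 3 million differences applies to one genome compared with the reference; the larger count of known SNPs reflects variation pooled across many individuals.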
![Page 44: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/44.jpg)
DNA sequence variations
insertion-deletion (INDEL) polymorphisms
![Page 45: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/45.jpg)
Structural variations
Speicher & Carter, NRG 2005
![Page 46: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/46.jpg)
Structural variations
Feuk et al. Nature Reviews Genetics 7, 85–97 (February 2006) | doi:10.1038/nrg1767
![Page 47: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/47.jpg)
Detection of structural variants
Feuk et al. Nature Reviews Genetics 7, 85–97 (February 2006) | doi:10.1038/nrg1767
![Page 48: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/48.jpg)
Epigenetic changes: chromatin structure
Sproul, NRG 2005
![Page 49: Human Genome Sequence and Variability](https://reader035.fdocuments.us/reader035/viewer/2022062517/56813b10550346895da3ba16/html5/thumbnails/49.jpg)
Epigenetic changes: DNA methylation
Laird, NRC 2003