Human Genome Structure and Organization Bert Gold, Ph.D., F.A.C.M.G.
Human Genome Structure and Organization
Bert Gold, Ph.D., F.A.C.M.G.
Genetic Variation
Phenotype: Expression of the genotype (modified by the environment). The structural or functional nature of an individual. Includes:
• appearance, physical features, organ structure
• biochemical, physiologic nature
Genotype: Genetic status, the alleles an individual carries.
Learning Objectives
Recap and Update Public and Private Human Genome Project Status
Provide Reminders of Necessary Background for Genetic Disease Association and Linkage Studies
Definitions
• Penetrance - The probability that an individual who is ‘at-risk’ for the disorder (i.e., carries the gene) develops (expresses) the condition. May be age dependent.
• Expression - The characteristics of a trait or disease that are outwardly expressed, e.g., myotonic dystrophy: myotonia, cataracts, narcolepsy, frontal balding, infertility.
• Ascertainment – The method used in gathering genetic data. Study conclusions differ depending on how affected individuals entered the study.
• Phenocopy – Individuals whose phenotype, under the influence of non-genetic agents, has become like the one normally caused by a specific genotype in the absence of non-genetic agents.
• Pleiotropy - The quality of an allele to produce more than one effect, i.e., to manifest its expression in the structure and/or function of more than one organ system or tissue.
• Recurrence Risk – Likelihood that a relative of a proband for a rare disease will have the same disease.
Penetrance and Expressivity
• Penetrance: Proportion that expresses a trait
– Complete: P=1.0 or 100%
– Incomplete (“reduced”): P<1.0 or <100%
• Expressivity: Severity of the phenotype
– Expressivity may vary between families (interfamilial) or within families (intrafamilial)
• TRY NOT TO CONFUSE “VARIABLE EXPRESSIVITY” WITH “INCOMPLETE PENETRANCE”
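As a rough illustration of the penetrance definition, the sketch below estimates P as the fraction of at-risk genotype carriers who express the trait. The function name and the counts are hypothetical and are not taken from the slides.

```python
def penetrance(carriers_affected: int, carriers_total: int) -> float:
    """Estimate penetrance as the proportion of carriers who express the trait."""
    if carriers_total <= 0:
        raise ValueError("carriers_total must be positive")
    if not 0 <= carriers_affected <= carriers_total:
        raise ValueError("carriers_affected must lie between 0 and carriers_total")
    return carriers_affected / carriers_total


def describe(p: float) -> str:
    """Label penetrance as complete (P = 1.0) or incomplete (P < 1.0)."""
    return "complete" if p == 1.0 else "incomplete"


# Hypothetical counts: 80 of 100 known carriers show the phenotype.
p = penetrance(80, 100)
print(f"P = {p:.2f} ({describe(p)})")
```

Note that this measures only whether carriers express the trait at all. How severely they express it is expressivity, which this calculation does not capture.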
Chromosomes, Genes and Proteins
Genes are on Chromosomes
Genes may encode proteins or RNA
Non-coding RNA ‘genes’
• tRNAs (497 genes were counted; 821 when genes and pseudogenes are counted together)
– tRNAs found are consistent with wobble
– Codon bias only roughly correlated with tRNA distribution
• rRNAs
• small nucleolar RNAs (snoRNAs)
• snRNAs (spliceosome constituents)
• 7SL RNA
• telomerase RNA
• Xist transcript
• Vault RNA
tRNAs
Some chromosomes are richer in genes than others
[Figure: bar chart of the number of nucleotides in exons (y-axis, 0 to 3,500,000) by chromosome (x-axis, 1 to 21 and X).]
HOXA, HOXB, HOXC and HOXD are in regions with a particularly low density of repeats. This is believed to result from the presence of cis-acting elements in this vicinity.
Proteins demonstrate patterns and similarity of function
Functionally and Structurally similar proteins are organized into families
e.g., E.C., SWISS-PROT, TrEMBL
In silico approaches to characterize genes include:
• PFAM, searchable via HMMER
• Other in silico collections include:
– PRINTS
– PROSITE
– SMART
– BLOCKS
• Creation of an Integrated Protein Index (IPI)
How many genes are there?
Estimates from the Public Program
– RefSeq
– Exons
– Introns
– Average Sizes
– Coding Sequences (CDS)
– Alternative splice products (about 3%)
– Creation of an Integrated Gene Index (IGI)
– Genscan to Ensembl to Pfam via GeneWise (31,778)
– Could be as low as 24,500 using overprediction corrections.
Estimates from Celera
• 25,086 in Assembly 3
Pre-existing estimates
• W. Gilbert’s back of the envelope calculation
• Reassociation Kinetics
• Estimates from Double Twist using Promoter Inspector plus
• Unpublished estimates from Human Genome Sciences
Size of Genes:
• Largest: Dystrophin, 2.4 Mb
• Titin
– 80,780 bp coding
– 178 exons
– largest single exon 17,106 bp
GENE HOMOLOGS, ORTHOLOGS, PARALOGS
• Vacuolar sorting machinery in yeast
• ABC gene superfamily
• Ig gene superfamily
• FGF superfamily
• Intermediate filament superfamily
• PROTEIN FAMILY EXPANSION APPEARS TO BE A PRIMARY EVOLUTIONARY MECHANISM
The proteome
• Functional categories
• PRINTS
• Prosite
• Pfam
• Interpro (http://www.ebi.ac.uk/interpro/)
GENE ONTOLOGY
• Standard Vocabulary
• Hierarchy of terms (Directed ACYCLIC Graph)
• Ashburner Nature Genetics 25:25-29 (2000)
• ‘Bushy’ model
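To make the "directed acyclic graph" point concrete, here is a minimal sketch of a term hierarchy in which a term may have more than one parent, which is what distinguishes a DAG from a simple tree. The term names and parent links are an illustrative toy and are not copied from the actual Gene Ontology files.

```python
# Hypothetical miniature ontology: each term maps to its list of parent terms.
PARENTS = {
    "biological_process": [],
    "metabolism": ["biological_process"],
    "cell growth and maintenance": ["biological_process"],
    "DNA metabolism": ["metabolism"],
    "cell cycle": ["cell growth and maintenance"],
    # Two parents: this is why the structure is a DAG rather than a tree.
    "DNA replication": ["DNA metabolism", "cell cycle"],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


print(sorted(ancestors("DNA replication")))
```

A gene product annotated with a specific term is implicitly annotated with all of that term's ancestors, so queries at a broad level still find it.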
Horizontal Transfer controversy
• One of the major conclusions of the Public Genome effort, published in the Feb. 15, 2001 issue of Nature, was: “Hundreds of human genes appear likely to have resulted from horizontal transfer from bacteria at some point in the vertebrate lineage. Dozens of genes appear to have been derived from transposable elements”
• This has now been widely disputed and is believed to result from:
– Microbial contaminants in the sequence.
– Bacterial gene integration into pre-vertebrates
– And: “The more probable explanation for the existence of genes shared by humans and prokaryotes, but missing in nonvertebrates, is a combination of evolutionary rate variation, the small sample of nonvertebrate genomes, and gene loss in the nonvertebrate lineages.” (Salzberg et al., Science)
Splice Pattern, 98% GT-AG
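The GT-AG rule refers to the first two and last two nucleotides of an intron. The sketch below, which assumes intron sequences are given 5' to 3' on the coding strand, reports those boundary dinucleotides and the canonical fraction in a set of introns. The sequences are made-up toy strings, not real human introns, and the function names are hypothetical.

```python
def splice_boundary(intron: str) -> str:
    """Return the donor and acceptor dinucleotides, e.g. 'GT-AG'."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron must be at least 4 nucleotides long")
    return f"{seq[:2]}-{seq[-2:]}"


def fraction_canonical(introns) -> float:
    """Fraction of introns whose boundaries follow the GT-AG pattern."""
    if not introns:
        raise ValueError("need at least one intron")
    hits = sum(1 for seq in introns if splice_boundary(seq) == "GT-AG")
    return hits / len(introns)


# Toy sequences for illustration only.
toy_introns = ["GTAAGTCCTTTCAG", "gtgagtttttgcag", "GCAAGTCTCTCCAG", "ATATCCTTTTCCAC"]
print([splice_boundary(s) for s in toy_introns])
print(fraction_canonical(toy_introns))
```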
Chromatin Structure
• Euchromatin
• Heterochromatin
• Nucleosomes
Chromosome Facts
• Chromosomes replicate during S phase
• Chromosomes recombine during Pachytene
• Recombination is an obligate activity
• Sex chromosomes recombine with each other
Cytogenetics is done by Karyotyping
• Chromosomes are chemically frozen in metaphase
• Must be carried out on dividing cells
• Microfilament inhibitors
• Microtubule inhibitors
• Membrane lysis
• Pronase, trypsin digest
• Giemsa stain
• G-bands correspond to regions of relatively low GC content
http://genome.ucsc.edu/goldenPath/mapPlots/
http://genome.ucsc.edu/goldenPath/hgTracks.html
Cell Division: Meiosis
– Segregation
• Defined: Alleles are paired; gametes receive one of each.
• Exceptions: trisomy and uniparental disomy
– Independent Assortment
• Gene pairs segregate independently
• Exception: linkage
Meiosis Creates Gametes
And provides a basis for genetic recombination!
Genetic Recombination
• Crossing Over
• Resolution
• Recombinant Chromosomes
– OBLIGATE ACTIVITY
– FEMALE RECOMB. RATES HIGHER THAN MALE
– INCREASED RATES AT TELOMERES
– PARADOX: SHORT ARMS SHOW MORE THAN LONG ARMS
– 1 cM is 1 Mb on long arms, but short arms are 2 cM per Mb and the Yp-Xp pseudoautosomal region is 20 cM per Mb.
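The rates quoted above can be turned into a quick physical-to-genetic distance conversion. The following is a minimal sketch that treats each rate as a constant within its region, which is only a coarse approximation; the dictionary keys and function name are hypothetical labels chosen for this example.

```python
# Approximate recombination rates from the slide, in cM per Mb.
RATE_CM_PER_MB = {
    "long_arm": 1.0,
    "short_arm": 2.0,
    "pseudoautosomal_Xp_Yp": 20.0,
}


def genetic_distance_cm(physical_mb: float, region: str) -> float:
    """Convert a physical interval (Mb) to an approximate genetic distance (cM)."""
    if physical_mb < 0:
        raise ValueError("physical_mb must be non-negative")
    return physical_mb * RATE_CM_PER_MB[region]


# The same 2.5 Mb interval gives very different map distances by region.
for region in RATE_CM_PER_MB:
    print(region, genetic_distance_cm(2.5, region), "cM")
```

The linear conversion is best read as a rule of thumb for short intervals, since local rates vary along each arm.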
INCREASED RATES AT TELOMERES
PARADOX: SHORT ARMS SHOW MORE THAN LONG ARMS
Genes
• Units of heredity
• Encode proteins (and some RNAs)
• Human genetics is the study of gene variation in humans
• ‘Gene’ as a term is used ambiguously to refer both to the ‘locus’ and the ‘allele’, i.e., there is only one locus but two alleles in a given individual.
• Sequencing in both genome projects took place upon multiple alleles; this has led to some assembly confusions.
• Ultimately want a haploid genome map.
The Human Genome Project
• International public effort commencing in 1990 to sequence the entire human genome by 2005.
• STS approach chosen in 1991
• Private effort launched in 1998 by Celera using ‘Shotgun’ cloning
BAC clones, sequenced into BAC end reads, and assembled into ‘contigs’
Markerless ‘contigs’ in the Celera assembly are called ‘Scaffolds’
Markers are BAC ends in the ‘shotgun’
Mate pair reads provided the core of Celera sequence
Draft human genome sequences complete by February 2001.
• Published simultaneously in Feb. 2001
– Public Sequence in NATURE (409: 745-964)
– Celera Sequence in SCIENCE (291: 1145-1434)
Greater than 50% of sequence is repetitive
45% of the human genome is derived from transposable elements
• Long Interspersed Elements: LINEs (21% of genome)
– LINE1 – Some still active, autonomous, consist of two ORFs (one is a pol).
– LINE2
– LINE3
• Short Interspersed Elements: SINEs (13% of genome)
– ALU – Some still active, use L1 enzymes to replicate
– MIR
– Ther2/MIR3
• LTR Retroposons
– Consist of gag and pol
– Protease, RT, RNase H, integrase all encoded
– Reverse transcription occurs cytoplasmically, using a tRNA to prime replication
• DNA Transposons
98.5 % of sequence is non-coding.
Approximately 1/3 of the human genome is transcribed (public guess).
Allelism
• Alternate forms of a gene
• Recessive disease, e.g., Sickle Cell, CFTR
• Dominant disease, e.g., Achondroplasia, Tuberous Sclerosis
Heterozygote or Homozygote
• Genotype 1,2 (heterozygote) or 1,1 (homozygote)
• Homozygosity: homogeneity of alleles at a locus
Genetic Markers
• RFLPs
• VNTRs (STRs)
• Microsatellites
• STSs
• SNPs
• “Tools” used to find disease genes
• “Flags” with locations throughout the genome
Polymorphism Information Content versus Heterozygosity (PIC vs. het)
• Determining heterozygosity from SNP rare allele frequency
• Information Content in SNPs versus STRs
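A small sketch may help compare the two measures. It assumes Hardy-Weinberg proportions, takes expected heterozygosity as 1 minus the sum of squared allele frequencies, and uses the commonly cited Botstein et al. (1980) form of PIC, which subtracts a further term for uninformative matings. The allele frequencies below are hypothetical examples, not data from the slides.

```python
from itertools import combinations


def heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC: heterozygosity minus sum over i<j of 2 * p_i^2 * p_j^2."""
    correction = sum(2 * (pi ** 2) * (pj ** 2) for pi, pj in combinations(freqs, 2))
    return heterozygosity(freqs) - correction


# Biallelic SNP: heterozygosity follows directly from the rare allele frequency.
for maf in (0.05, 0.15, 0.5):
    snp = [maf, 1 - maf]
    print(f"SNP maf={maf}: het={heterozygosity(snp):.3f} PIC={pic(snp):.3f}")

# Hypothetical STR with 10 equally frequent alleles.
str_marker = [0.1] * 10
print(f"STR: het={heterozygosity(str_marker):.3f} PIC={pic(str_marker):.3f}")
```

Under these assumptions a biallelic SNP cannot exceed a heterozygosity of 0.5, whereas a multi-allelic STR can approach 1, which is one way to see why a single STR tends to carry more information than a single SNP.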
Typology of SNPs
• Type I- Coding, non-synonymous, non-conservative
• Type II- Coding, non-synonymous, conservative
• Type III- Coding, synonymous
• Type IV- Non-coding, 5’-UTR
• Type V- Non-coding, 3’-UTR
• Type VI- Other non-coding
• Type I and Type II SNPs have lower heterozygosity than other SNPs, presumably as a result of selective pressure.
– About 25% of type I and type II SNPs have minor allele frequencies > 15%
– About 60% have minor allele frequencies < 5%
Mutation
• Occurs more often during male meiosis
• Occurs more often in ‘long genes’
• More easily detected in Dominant Diseases
– Achondroplasia
– Duchenne Muscular Dystrophy
• May often involve CpG mutating to TpG
Autosomal Recessive Inheritance
• Two copies of the mutant allele are required to be affected
• Carriers have one copy of the mutation and are unaffected
• On average, 25% of offspring of two carriers will be affected
• Males and females affected in equal number
• E.g., Sickle Cell, beta-thal., CF
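The 25% figure for two carriers can be checked by enumerating a Punnett square. In this sketch each parent is written as a two-letter genotype, with "A" as the normal allele and "a" as the recessive disease allele. The notation and the function name are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1: str, parent2: str):
    """Enumerate equally likely allele pairings and return genotype probabilities."""
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Carrier x carrier cross for an autosomal recessive condition.
cross = offspring_genotypes("Aa", "Aa")
for genotype, prob in sorted(cross.items()):
    print(genotype, prob)
```

Here "aa" offspring are affected, "Aa" offspring are unaffected carriers, and "AA" offspring are unaffected non-carriers. The probabilities apply to each pregnancy independently.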
X Linked Recessive (Sex Linked)
• Females rarely affected
• No male to male transmission
• Affected males transmit gene to all daughters
• Eg- Duchenne Muscular Dystrophy, Hemophilia A
Autosomal Dominant Inheritance
• Each child at 50% risk
• Does not skip generations
• Often, lethal in double dose
• Large genetic load
X-linked Dominant Pedigree
• Example is Hypophosphatemic, Vitamin D Resistant Rickets
• Distinguished from Autosomal Dominant by:
– No male-to-male transmission
– All daughters of affected fathers are affected
IMPORTANT NOTE:
Dominant and Recessive refer to the phenotypic expression of alleles, NOT to intrinsic characteristics of gene loci.
Inheritance Pattern Complexities
• Pseudodominant Transmission of a Recessive
• Pseudorecessive Transmission of a Dominant
– Misassigned paternity, causal heterogeneity, incomplete penetrance, germline mosaicism
• Mosaicism
• Mitochondrial Inheritance
• Penetrance and Expressivity
– Semi-dominant, gender-influenced, age-related, transmission-related, imprinting
• Uniparental Disomy (UPD)
• Environmental effects, phenocopies
Preview of linkage analysis
• Characterizing Human Genetics:
– Long generation time
– Inability to control matings
– Inability to control study population
– Inability to control exposures to environmental conditions
– It is possible to define phenotypes well!
– Can study genetic structures through family history
– Link phenotypes and genetic structures through statistical methods