BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and...

17
BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size i.e. total number of DNA bp Varies widely - WHY? C- paradox i.e., what is the source of the differences? Do the number of genes required vary so much? (how many “phyla” are represented at the right?) Phylum Chordata Phylum Arthropoda

Transcript of BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and...

Page 1: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 1 ©copyright Bruce Blumberg 2004. All rights reserved

Organization and Structure of Genomes (contd)

• Genome size– i.e. total number of DNA bp – Varies widely - WHY?

• C- paradox– i.e., what is the source of the

differences?• Do the number of genes required

vary so much?– (how many “phyla” are represented at

the right?)

Phylum Chordata

Phylum Arthropoda

Page 2: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 2 ©copyright Bruce Blumberg 2004. All rights reserved

Organization and Structure of Genomes (contd)

• How to measure genome complexity?– Hybridization kinetics– Shear and melt DNA– Allow to hybridize and measure ds

vs ss by spectrophotometry

• Cot½ - measures genome size and complexity– Larger value – longer to hybridize

• Smaller k– Longer to hybridize – more unique

sequences, larger genome– Much of what we knew about

genome size and complexity comes from these studies

Page 3: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 3 ©copyright Bruce Blumberg 2004. All rights reserved

Organization and Structure of Genomes (contd)

• Assumptions

– Cot½ measures rate of association of sequences

– Simple curves at right suggest simple composition

• No repetitive sequences• (graphed inverse to book)

• What would a more complex genome look like?– Would it be just shifted

further to the right?

– Or ?

Page 4: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 4 ©copyright Bruce Blumberg 2004. All rights reserved

Organization and Structure of Genomes (contd)

• Measure eukaryotic DNA– Multiple components– Can calculate more than

1 Cot½ value

– Either means starting material is not pure (i.e., multiple types of DNA)

– Or means different frequency classes of DNA

• Highly repetitive• Moderately repetitive• Unique

– Very big surprise

Page 5: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 5 ©copyright Bruce Blumberg 2004. All rights reserved

Organization and Structure of Genomes (contd)

• What does it mean?

Genetic complexity is not directly proportional to genome size!

• Increase in C is not alwaysaccompanied by proportionalincrease in number of genes– Note incorrect numbers

in chart• Drosophila

– ~14,000• human

– Controversial– ~30-50K

Page 6: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 6 ©copyright Bruce Blumberg 2004. All rights reserved

Organization and Structure of Genomes (contd)

• What can we learn by hybridizing RNA back to the genomic DNA?– Label RNA and hybridize with

excess DNA – measure formationof hybrids over time

– Rot½ analysis shows that RNA doesnot hybridize with highly repetitive DNA

– What does this mean?• Most of mRNA is transcribed

from non-repetitive DNA• Moderately repetitive DNA is

transcribed• Highly repetitive DNA is

probably not transcribed into mRNA– Key argument why

genome sequencers do not bother with “difficult” regions of repetitive DNA

Page 7: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 7 ©copyright Bruce Blumberg 2004. All rights reserved

Organization and Structure of Genomes (contd)

• Gene content is proportional to single copy DNA– Amount of non-repetitive DNA has a

maximum,total genome size does not– What is all the extra DNA, i.e., what is it

good for?

– Where did all this junk come from and why is it still around?

• Repetitive DNA• Telomeres• Centromeres• Transposons• Junk of all sorts

• DNA replication is very accurate• Selective advantage?

• OR

Page 8: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 8 ©copyright Bruce Blumberg 2004. All rights reserved

Organization and Structure of Genomes (contd)

• What is this highly repetitive DNA?

• Selfish DNA?– Parasitic sequences that exist solely

to replicate themselves?

• Or evolutionary relics?– Produced by recombination,

duplication, unequal crossing over

• Probably both– Transposons exemplify “selfish DNA”

• Akin to viruses?

– Crossing over and other forms of recombination lead to large scale duplications

Page 9: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 9 ©copyright Bruce Blumberg 2004. All rights reserved

Transcription of Prokaryotic vs Eukaryotic genomes

• Prokaryotic genes are expressedin linear order on chromosome– mRNA corresponds directly

to gDNA

• Most eukaryotic genes are interrupted by non-coding sequences– Introns (Gilbert 1978)– These are spliced out after

transcription and priorto transport out of nucleus

– Posttranscriptional processingin an important feature ofeukaryotic gene regulation

• Why do eukaryotes have introns?– What are they good for?

Page 10: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 10 ©copyright Bruce Blumberg 2004. All rights reserved

Introns and splicing

• Alternative splicing can generate protein diversity– Many forms of alternative splicing seen– Some genes have numerous alternatively spliced forms

• Dozens are not uncommon, e.g., cytochrome P450s

Page 11: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 11 ©copyright Bruce Blumberg 2004. All rights reserved

Introns and splicing

• Alternative splicing can generate protein diversity (contd)– Others show sexual dimorphisms

• Sex-determining genes• Classic chicken/egg paradox

– how do you determine sex if sex determines which splicing occurs and spliced form determines sex?

Page 12: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 12 ©copyright Bruce Blumberg 2004. All rights reserved

Origins of intron/exon organization

• Introns and exons tend to be short but can vary considerably– “Higher” organisms tend to have longer lengths in both– First introns tend to be much larger

than others – WHY?

• Often contain regulatory elements– Enhancers– Alternative promoters– etc

Page 13: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 13 ©copyright Bruce Blumberg 2004. All rights reserved

Origins of intron/exon organization

• Exon number tends to increase with increasing organismal complexity– Possible reasons?

• Longer time to accumulate introns?• Genomes are more recombinogenic due to repeated sequences?• Selection for increased protein complexity

– Gene number does not correlate with complexity– Ergo, it must come from somewhere

Page 14: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 14 ©copyright Bruce Blumberg 2004. All rights reserved

Origins of intron/exon organization

• When did introns arise– Introns early – Walter Gilbert

• There from the beginning, lost in bacteria and many simpler organisms

– Introns late – Cavalier-Smith, Ford Doolittle, Russell Doolittle• Introns acquired over time as a result of transposable

elements, aberrant splicing, etc• If introns benefit protein evolution – why would they be lost?

– Which is it?

• Introns late(at the moment)

Page 15: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 15 ©copyright Bruce Blumberg 2004. All rights reserved

Evolution of gene clusters

• Many genes occur as multigene families (e.g., actin, tubulin, globins, Hox)– Inference is that they evolved from a common ancestor– Families can be

• clustered - nearby on chromosomes (α-globins, HoxA)• Dispersed – on various chromosomes (actin, tubulin)• Both – related clusters on different chromosomes (α,β-globins,

HoxA,B,C,D)– Members of clusters may show stage or

tissue-specific expression • Implies means for coregulation as well

as individual regulation

Page 16: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 16 ©copyright Bruce Blumberg 2004. All rights reserved

Evolution of gene clusters (contd)

• multigene families (contd)– Gene number tends to increase with

evolutionary complexity• Globin genes increase in number

from primitive fish to humans

– Clusters evolve by duplication and divergence

Page 17: BioSci D145 lecture 1 page 1 © copyright Bruce Blumberg 2004. All rights reserved Organization and Structure of Genomes (contd) Genome size –i.e. total.

BioSci D145 lecture 1 page 17 ©copyright Bruce Blumberg 2004. All rights reserved

Evolution of gene clusters (contd)

• History of gene families can be traced by comparing sequences– Molecular clock model holds that rate of change within a group is

relatively constant• Not totally accurate – check rat genome sequence paper

– Distance between related sequences combined with clock leads to inference about when duplication took place