Comparative Genomics: Analysis of the Mouse...

Page 1: Comparative Genomics: Analysis of the Mouse Genomebioinformatica.uab.cat/base/documents/masterGP/Genoma del rató… · Expected proportion of the ancestral genome retained in both

Comparative Genomics: Analysis of the Mouse Genome

Initial sequencing and comparative analysis of the mouse genome.

Mouse Genome Sequencing Consortium 2002, Nature 420:520-562.

Mouse/human genome comparison

• Conservation of synteny: number of chromosome rearrangements

• Repeats

• Evolution of orthologues. Ratio Ka/Ks

• Evolution of gene families

• Selection

Purpose

Highlights

• Genome 14% smaller than human (2.5 Gb vs 2.9 Gb).

• 90% corresponds to regions of conserved synteny.

• 40% can be aligned at the nucleotide level.

• ~0.5 nucleotide substitutions per site since the divergence of the two species.

• 5% under purifying selection.

More highlights

• Various measures of divergence show substantial variation across the genome.

• 30,000 protein-coding genes.

• Dozens of local gene expansions.

• Estimation of the rate of protein evolution in mammals. Certain classes of secreted proteins under positive selection.

• Marked differences in activity but similar types of repeat sequences.

• 80,000 SNPs identified.

Divergence time

p.521

Sequencing strategy

p.522

88 mapped ultracontigs with N50 length = 50.6 Mb
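The N50 figure above can be illustrated with a minimal sketch. It assumes the usual definition: the length L such that contigs of length L or longer together cover at least half of the assembly. The function name `n50` and the example lengths are hypothetical and are not the actual mouse ultracontig sizes.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is taken here as the length of the contig at which the running
    total, accumulated from the longest contig downwards, first reaches
    half of the summed length of all contigs.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths in Mb (illustration only).
example_lengths_mb = [80, 70, 50, 30, 20, 10]
print(n50(example_lengths_mb))
```

A high N50 such as 50.6 Mb means that half of the assembled sequence lies in very long contiguous pieces.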

Syntenic segments and syntenic blocks

Size distribution of segments and blocks with conserved synteny between mouse and human

Composition of repeats in the mouse and human genome (% of genome)

TEs     Mouse   Lineage specific   Human   Lineage specific
LINEs   19.2    16.5               21.0    7.9
SINEs   8.2     7.6                13.6    10.7
LTR     9.9     8.7                8.6     4.1
DNA     0.9     0.4                3.0     1.0
Total   38.6    33.5               46.4    24.0

Fraction of lineage-specific repeats

Ancestral repeats: mouse 5%, human 22%
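The ancestral-repeat percentages follow from the totals in the table: ancestral repeats are what remains after subtracting the lineage-specific fraction from the total repeat content. A small sketch of that arithmetic, using only the "Total" row (the dictionary names are illustrative):

```python
# Total repeat content and lineage-specific repeat content, in % of genome,
# taken from the "Total" row of the repeat composition table.
total_repeats = {"mouse": 38.6, "human": 46.4}
lineage_specific = {"mouse": 33.5, "human": 24.0}

# Ancestral repeats = total repeats minus lineage-specific repeats.
ancestral = {
    species: round(total_repeats[species] - lineage_specific[species], 1)
    for species in total_repeats
}

for species, pct in ancestral.items():
    print(f"{species}: {pct}% of genome in ancestral repeats")
```

The result is about 5% for mouse and 22% for human, which matches the figures quoted on the slide.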

Twofold higher nucleotide substitution rate in the mouse lineage (estimated from comparison of ancestral repeats)

H: 0.17 substitutions per site

Mouse: 0.34 substitutions per site
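A quick check of the two branch estimates, assuming that the total mouse/human divergence is the sum of the two lineage-specific values (variable names are illustrative):

```python
# Substitutions per site accumulated along each lineage since the
# mouse/human split, as estimated from ancestral repeats.
human_branch = 0.17
mouse_branch = 0.34

# Relative rate: how much faster the mouse lineage has accumulated changes.
rate_ratio = mouse_branch / human_branch

# Total divergence between the two species is the sum of both branches.
total_divergence = human_branch + mouse_branch

print(f"mouse/human rate ratio: {rate_ratio:.1f}")
print(f"total divergence: {total_divergence:.2f} substitutions per site")
```

The sum, about 0.5 substitutions per site, is in line with the overall divergence quoted in the highlights.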

Age distribution of interspersed repeats in the mouse genome

Pseudogenes in mouse genome: ~14,000. More than half are processed pseudogenes.

Gapdh: a single functional gene and ~400 pseudogenes distributed across 19 of the mouse chromosomes.

Comparison of 12,845 1:1 orthologues

Evolution of Cytochrome P450 gene families in mouse

Changes in genome size

Ancestor 2.9 Gb; Human 2.9 Gb; Mouse 2.5 Gb

Mouse lineage:

Lineage-specific repeats +900 Mb

Deletion -1,300 Mb

-----------------------------------------------

Net change -400 Mb

Human lineage:

Lineage-specific repeats +700 Mb

Deletion -700 Mb

-----------------------------------------------

Net change 0 Mb

Expected proportion of the ancestral genome retained in both species

76% (human) x 55% (mouse) = 42%
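The genome-size balance and the 42% figure can be reproduced with a short sketch. It assumes a 2,900 Mb ancestral genome, that the block with a net change of -400 Mb is the mouse lineage and the block with no net change is the human lineage, and that deletions in the two lineages are independent. The variable and function names are illustrative.

```python
ANCESTOR_MB = 2900  # ancestral genome size, in Mb

# Gains (lineage-specific repeats) and losses (deletions) per lineage, in Mb.
lineages = {
    "mouse": {"repeats": 900, "deleted": 1300},
    "human": {"repeats": 700, "deleted": 700},
}


def present_size(changes):
    """Present-day genome size: ancestor plus repeats minus deletions."""
    return ANCESTOR_MB + changes["repeats"] - changes["deleted"]


def retained_fraction(changes):
    """Fraction of the ancestral genome that escaped deletion."""
    return (ANCESTOR_MB - changes["deleted"]) / ANCESTOR_MB


sizes = {name: present_size(c) for name, c in lineages.items()}
retained = {name: retained_fraction(c) for name, c in lineages.items()}

# Expected fraction of ancestral sequence still present in both species,
# treating deletions in the two lineages as independent events.
retained_in_both = retained["human"] * retained["mouse"]

print(sizes)
print({name: f"{frac:.0%}" for name, frac in retained.items()})
print(f"retained in both: {retained_in_both:.0%}")
```

The expected 42% is close to the roughly 40% of the genome that can be aligned at the nucleotide level.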

Neutral substitution rate

• Ancestral repeat sequence.

66.7% nucleotide identity; 0.46-0.47 substitutions per site

• Fourfold degenerate sites in codons of genes

67% nucleotide identity; 0.46-0.47 substitutions per site

Example: n = 100; μ = 0.667 (genome-wide average); p = 0.8; S = 2.8
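The numbers in the example are consistent with a normalised conservation score of the form S = sqrt(n) (p - mu) / sqrt(mu (1 - mu)), where n is the number of aligned sites in a window, p is the fraction of identical sites in that window, and mu is the neutral identity (0.667, the genome-wide average). The sketch below assumes that form; the function name is hypothetical.

```python
import math


def conservation_score(n, p, mu):
    """Normalised excess identity of a window over the neutral expectation.

    n  : number of aligned nucleotides in the window
    p  : observed fraction of identical nucleotides in the window
    mu : expected identity under neutrality (e.g. from ancestral repeats)

    Assumed form: the binomial z-score sqrt(n) * (p - mu) / sqrt(mu * (1 - mu)).
    """
    return math.sqrt(n) * (p - mu) / math.sqrt(mu * (1 - mu))


# Values from the slide's example.
s = conservation_score(n=100, p=0.8, mu=0.667)
print(f"S = {s:.1f}")
```

With the slide's values this gives S of about 2.8: a 100-bp window at 80% identity sits nearly three standard deviations above the neutral expectation.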

Proportion of mammalian genome under evolutionary selection for biological function

Score distributions: Sneutral, Sgenome, Sselected

20.8% of the windows are under selection

25.2% of human genome contained in windows

20.8% x 25.2% = 5.2% of genome under selection
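The final estimate is the product of the two fractions above, which can be checked directly (variable names are illustrative):

```python
# Fraction of analysed windows inferred to be under selection.
windows_under_selection = 0.208

# Fraction of the human genome contained in the analysed windows.
genome_in_windows = 0.252

# Share of the whole genome under selection.
genome_under_selection = windows_under_selection * genome_in_windows

print(f"{genome_under_selection:.2%} of the genome under selection")
```

The product comes to about 5.2%, which is the source of the "5% under purifying selection" figure in the highlights.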

Proportion of genome under selection

p. 552

• 1.5% protein-coding regions of genes

• 1% UTR of protein-coding genes

• Regulatory regions that control gene expression

• Non-protein coding RNAs (ncRNAs)

• Chromosomal structural elements

• Recent pseudogenes

• Other?
