Modeling and Displaying* Synteny

A cautionary tale

* in GBrowse

Overview

• Getting familiar with SynView
• Modeling the data
• Serving the data to GBrowse/SynView
• The perils of artifacts

Synteny of 5 Plasmodium species against P. falciparum chr 4 (reference)

-- chromosome view --

[Figure: chromosome view. P.f. (human parasite) is finished. Labels: 8x, 5x, Primate parasites, Rodent parasites.]

Coloring of rodent contigs indicates mapping to “chromosomes” in a composite genome created by assembling all three species together.

Syntenic map computed by the Mercator program, designed to do many-way multiple genome alignments.

C. Dewey (2007) Aligning multiple whole genomes with Mercator and MAVID. In N. Bergman, editor, Comparative Genomics, volume 395 of Methods in Molecular Biology. Humana Press.

Synteny of 5 Plasmodium species against P. falciparum chr 4 (reference)

-- gene view --

[Figure: gene view, 11 tracks.]

Gene orthologs computed by OrthoMCL

Chen, F., Mackey, A.J., Stoeckert, C.J. Jr., Roos, D.S. (2006) OrthoMCL-DB: Querying A Comprehensive Collection of Ortholog Groups. Nucleic Acids Research Vol. 34: D363-8, Jan 2006.

Synteny view in GBrowse by SynView

Wang H et al. (2006) SynView: a GBrowse-compatible approach to visualizing comparative genome data. Bioinformatics 22: 2308-9

Computing the display

Four queries:
1. Syntenic span
2. Syntenic genes
3. And their exons
4. Gene orthology

[Figure: P. falciparum Chr 4 (reference) above a P. vivax contig (syntenic). Coordinate labels: 600k, 700k, 1100k, 1250k.]

Naïve scaling: syntenic/reference

[Figure: the same region under naïve scaling, with an 80k insertion marked. Coordinate labels: 600k, 700k, 1100k, 1250k.]

Artifact! It appears to the user that the Pf genes have no orthologs! … they do, but they are off-screen!

[Figure labels: 40k, 40k.]
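To make the artifact concrete, here is a minimal Python sketch of naïve scaling. It assumes the reference span is 600k to 700k and the syntenic span is 1100k to 1250k, which is one reading of the figure labels. The gene positions, the zoom window, and the function name `naive_to_reference` are hypothetical and chosen only for illustration.

```python
def naive_to_reference(syn_pos, syn_span, ref_span):
    """Place a syntenic coordinate on the reference axis using one global ratio."""
    syn_start, syn_end = syn_span
    ref_start, ref_end = ref_span
    # Multiply before dividing to keep the arithmetic exact for these inputs.
    offset = (syn_pos - syn_start) * (ref_end - ref_start) / (syn_end - syn_start)
    return ref_start + offset


REF_SPAN = (600_000, 700_000)      # assumed reference span
SYN_SPAN = (1_100_000, 1_250_000)  # assumed syntenic span

# Hypothetical gene: its ortholog sits at 610k on the reference, but an 80k
# insertion upstream pushes the syntenic copy out to 1190k on the contig.
true_ref_pos = 610_000
syn_gene_pos = 1_190_000

drawn_at = naive_to_reference(syn_gene_pos, SYN_SPAN, REF_SPAN)

view = (600_000, 620_000)  # hypothetical zoomed-in window on the reference
on_screen = view[0] <= drawn_at <= view[1]
print(drawn_at, on_screen)
```

Under these assumed numbers the syntenic gene is drawn far to the right of its ortholog, outside the zoomed window, so the reference gene looks as if it has no ortholog.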

Solution: scale locally

Define anchors along reference:
• start of a syntenic segment
• start of genes
• end of genes
• end of a syntenic segment

Use left- and right-most visible anchors as scaling bounds

[Figure labels: 40k, 160k.]
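The anchor-based scaling described above can be sketched in Python as follows. This is an illustration under stated assumptions, not SynView's actual code: an anchor is modeled as a `(reference_position, syntenic_position)` pair, the list is assumed to be sorted by reference position, and all coordinates and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def visible_bounds(anchors, view_start, view_end):
    """Return the left- and right-most anchors inside the reference window.

    `anchors` is a list of (ref_pos, syn_pos) pairs sorted by ref_pos.
    Returns None when fewer than two anchors are visible.
    """
    ref_positions = [ref for ref, _ in anchors]
    lo = bisect_left(ref_positions, view_start)
    hi = bisect_right(ref_positions, view_end)
    visible = anchors[lo:hi]
    if len(visible) < 2:
        return None
    return visible[0], visible[-1]


def local_to_reference(syn_pos, bounds):
    """Interpolate a syntenic coordinate between the two bounding anchors."""
    (ref_l, syn_l), (ref_r, syn_r) = bounds
    return ref_l + (syn_pos - syn_l) * (ref_r - ref_l) / (syn_r - syn_l)


# Hypothetical anchors: segment start, gene start, gene end, segment end.
anchors = [
    (600_000, 1_100_000),
    (608_000, 1_188_000),
    (612_000, 1_192_000),
    (700_000, 1_250_000),
]

bounds = visible_bounds(anchors, 605_000, 620_000)
drawn_at = local_to_reference(1_190_000, bounds)
print(bounds, drawn_at)
```

With the window bounded by the two gene anchors, a position in the middle of the syntenic gene lands in the middle of the reference gene, so the ortholog stays on screen.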

Another artifact causes this bug

Contig ctg_6899 shown in duplicate

The source of the problem is a gene model discrepancy, leading to one Pf gene with two Pv orthologs.

This makes ambiguous anchors.

Solution: lose internal anchors
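One possible reading of "lose internal anchors" is sketched below in Python: gene anchors whose reference position maps to more than one syntenic position are treated as ambiguous and dropped, while segment-boundary anchors are kept. The `Anchor` type, its `kind` field, and the coordinates are assumptions made for illustration; the slides do not say exactly which anchors SynView discards.

```python
from collections import defaultdict
from typing import NamedTuple


class Anchor(NamedTuple):
    ref_pos: int
    syn_pos: int
    kind: str  # "segment" for segment boundaries, "gene" for internal anchors


def drop_ambiguous_internal(anchors):
    """Remove gene anchors whose reference position has several syntenic positions."""
    syn_by_ref = defaultdict(set)
    for a in anchors:
        if a.kind == "gene":
            syn_by_ref[a.ref_pos].add(a.syn_pos)
    ambiguous = {ref for ref, syns in syn_by_ref.items() if len(syns) > 1}
    return [a for a in anchors if a.kind != "gene" or a.ref_pos not in ambiguous]


# Hypothetical case: the Pf gene starting at 608k has two Pv orthologs.
anchors = [
    Anchor(600_000, 1_100_000, "segment"),
    Anchor(608_000, 1_188_000, "gene"),
    Anchor(608_000, 1_189_500, "gene"),
    Anchor(612_000, 1_192_000, "gene"),
    Anchor(700_000, 1_250_000, "segment"),
]

kept = drop_ambiguous_internal(anchors)
print([a.ref_pos for a in kept])
```

The trade-off is that dropping anchors coarsens the local scale around the affected gene, in exchange for a single, unambiguous placement of the contig.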

Another problem arises when there is only one anchor on screen

Synteny between a chromosome and a contig, with a deletion in the chromosome

[Figure: chromosome and contig. Coordinate labels: 500k, 520k, 240k, 200k. Zoomed view labels: 515k, 516k.]

Scale the syntenic gene 1:1 so that it shows up…

But what about the syntenic span?

Only two choices, each with artifacts:

1. Scale 1:1: the syntenic span might not be in the display

2. Use the full map: doesn’t agree with the genes

Another artifact. This one may be unavoidable.
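The two choices can be compared with a small Python sketch. It assumes the chromosome region runs from 500k to 520k, the contig region from 200k to 240k, and the view from 515k to 516k (one reading of the figure labels). The single anchor position and both function names are hypothetical.

```python
def syntenic_window_one_to_one(anchor, view):
    """Contig interval shown when scaling 1:1 around a single visible anchor."""
    ref_a, syn_a = anchor
    view_start, view_end = view
    return (syn_a - (ref_a - view_start), syn_a + (view_end - ref_a))


def syntenic_window_full_map(ref_span, syn_span, view):
    """Contig interval shown when the whole syntenic map sets the scale."""
    ratio = (syn_span[1] - syn_span[0]) / (ref_span[1] - ref_span[0])
    return tuple(syn_span[0] + (pos - ref_span[0]) * ratio for pos in view)


REF_SPAN = (500_000, 520_000)  # assumed chromosome region
SYN_SPAN = (200_000, 240_000)  # assumed contig region
VIEW = (515_000, 516_000)      # assumed zoomed window on the chromosome

# Hypothetical single anchor: a gene start at 515.2k whose ortholog starts at 235.2k.
anchor = (515_200, 235_200)

one_to_one = syntenic_window_one_to_one(anchor, VIEW)
full_map = syntenic_window_full_map(REF_SPAN, SYN_SPAN, VIEW)
gene_in_full_map = full_map[0] <= anchor[1] <= full_map[1]
print(one_to_one, full_map, gene_in_full_map)
```

Under these assumptions the 1:1 window contains the syntenic gene, while the full-map window covers a different stretch of the contig that does not contain it, which is the disagreement the slide points out.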

Synteny schema

[Figure: schema diagram with the tables SyntenyAnchor and Synteny.]

Acknowledgements

• Synteny viewer
– Haiming Wang (UGA)
– Aaron Mackey (GSK)
– Brian Brunk (UPenn)
– Many others on the ApiDB team

• OrthoMCL
– Feng Chen (UPenn)

• Mercator
– Colin Dewey (Wisc)

• Vision
– David Roos (UPenn)
– Chris Stoeckert (UPenn)
– Jessie Kissinger (UGA)