RESEARCH at GENOME BIOINFORMATICS LAB
Josep F. Abril Ferrando and
Genís Parra Farré
Genome BioInformatics Research Lab
RGBI @ ( IMIM – UPF – CRG )
SUMMARY
Introduction
Visualization of Genomic Annotations
Comparative Genomics
Human and Mouse Genomes
Exon Structural Selection
BIOINFORMÀTICA UPF T23 – 2003/03/06 – J.F. Abril and G. Parra @ Genome BioInformatics Lab – RGBI (IMIM-UPF-CRG)
Computational Analysis of Genomic Sequences
DNA SEQUENCE
Sequencing
ASSEMBLED SEQUENCE
Assembling
ANNOTATED SEQUENCE
Analyzing
From Genes to Genomes: Single Genes
From Genes to Genomes: Chromosomes
From Genes to Genomes: Whole Genomes
Comparative Genomics: Single Genes
Comparative Genomics: Syntenic Regions
Programming in PostScript (I)
%!PS
%
%% Variable Definition: $counter = 0
/counter 0 def
%
%% Function Definition: sub box(x,y) {...}
/box { %%% y x box
gsave %
20 mul % y X
0 % y X 0
moveto % y
20 mul % Y
dup % Y Y
10 0 % Y Y 10 0
rlineto % Y Y
0 % Y Y 0
exch % Y 0 Y
rlineto % Y
-10 0 % Y -10 0
rlineto % Y
neg % -Y
0 % -Y 0
exch % 0 -Y
rlineto %
closepath %
0 1 0 % 0 1 0
setrgbcolor % "green-color"
fill %
grestore %
} def %
Vector graphics language
Postfix notation
Stacks: exec, paths, dicts, ...
Dictionaries: identifier → object
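The points above (postfix notation, the operand stack, dictionaries) can be seen in a few lines. This is a minimal sketch that is not taken from the slides; the name total is a hypothetical identifier chosen for illustration.

```postscript
%!PS
% Postfix notation: operands are pushed first, the operator comes last.
3 4 add              % operand stack: 7
2 mul                % operand stack: 14
% Dictionaries: "def" binds a name to an object in the current dictionary.
/total exch def      % stack was 14 /total, exch gives /total 14; stack is now empty
% Executing a name looks it up in the dictionary stack and pushes its value.
total 14 eq          % operand stack: true
{ (total is 14) = } if   % "=" writes the string to standard output
```

Run under an interpreter such as Ghostscript, the last line should write the message to standard output; nothing is drawn because no page is shown.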
Programming in PostScript (II)
%
%% Initialization
100 100 translate % New Coords Origin
2 5 scale % Re-scaling: x-axis * 2, y-axis * 5
%
%% BaseLine
gsave %
0 0 moveto %
90 0 lineto %
0 setgray %
1 setlinewidth %
stroke %
grestore %
%
%% Main Loop
mark % mark
0.25 0.35 0.15 % mark 0.25 0.35 0.15
counttomark % mark 0.25 0.35 0.15 3
{ %%%%%%%%%%%%%% begin loop (x3)
/counter %%
counter %%
1 add %%
def %% $counter = $counter + 1
counter %
% 1st loop: mark 0.25 0.35 0.15 counter==1
% 2nd loop: mark 0.25 0.35 counter==2
% 3rd loop: mark 0.25 counter==3
box % mark ...
} repeat %%%%%%%%%%%%%% finish loop (x3)
pop % clean up stack (removes "mark")
%
showpage
%%EOF%%
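The main loop above takes the bar heights straight from the operand stack. As an alternative sketch (not the slides' method), the same three bars can be drawn by keeping the heights in an array and iterating with forall. The names heights, slot and bar are hypothetical; bar is a condensed restatement of box so that the file runs on its own.

```postscript
%!PS
% Heights listed left to right, matching the order in which the
% original loop pops them from the stack (0.15 first).
/heights [ 0.15 0.35 0.25 ] def

/bar {                   % height slot bar -
  gsave
  20 mul 0 moveto        % height   ; current point = (slot*20, 0)
  20 mul                 % H        ; H = height*20
  10 0 rlineto           % H        ; bottom edge
  dup 0 exch rlineto     % H        ; right edge, up by H
  -10 0 rlineto          % H        ; top edge
  pop                    % -        ; closepath draws the left edge
  closepath
  0 1 0 setrgbcolor      % green
  fill
  grestore
} def

100 100 translate        % same origin and scaling as the slide
2 5 scale

/slot 0 def
heights {                % forall pushes one height per iteration
  /slot slot 1 add def   % slot = slot + 1
  slot bar               % consumes the height and the slot number
} forall

showpage
```

With forall the stack holds only one height at a time, so no mark and no final pop are needed.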
GFF2PS and GFF2APLOT
Visualizing Genomic Annotations
J.F. Abril and R. Guigó.
"gff2ps: visualizing genomic annotations"
Bioinformatics 16(8):743-744 (2000).
M.G. Reese, G. Hartzell, N.L. Harris, U. Ohler, J.F. Abril and S.E. Lewis.
"Genome Annotation Assessment in Drosophila melanogaster"
Genome Research 10(4):483-501 (2000).
M.D. Adams et al. (including J.F. Abril).
"The Genome Sequence of Drosophila melanogaster"
Science 287(5461):2185-2195 (2000).
J.C. Venter et al. (including J.F. Abril and R. Guigó).
"The Sequence of the Human Genome"
Science 291(5507):1304-1351 (2001).
R.A. Holt et al. (including J.F. Abril and R. Guigó).
"The Genome Sequence of the Malaria Mosquito Anopheles gambiae"
Science 298(5591):129-149 (2002).
http://genome.imim.es/software/gfftools/GFF2PS.html
Whole Genome Gene-Finding
Homo sapiens
GENES
ab initio
DATABASE
homology
Whole Genome Gene-Finding: Comparative Approach
GENES
Homo sapiens
Mus musculus
GENES
homology
gene prediction
gene prediction
homology
Whole Genome Gene-Finding Results Analysis
Human and Mouse Comparative Genomics
Mouse Genome Sequencing Consortium (including J.F. Abril, G. Parra and R. Guigó).
"Initial sequencing and comparative analysis of the mouse genome"
Nature 420(6915):520-562 (2002).
G. Parra, P. Agarwal, J.F. Abril, T. Wiehe, J.W. Fickett and R. Guigó.
"Comparative gene prediction in human and mouse"
Genome Research 13(1):108-117 (2003).
R. Guigó, E.T. Dermitzakis, P. Agarwal, C.P. Ponting, G. Parra, A. Reymond, J.F. Abril, E. Keibler, R. Lyle, C. Ucla, S.E. Antonarakis and M.R. Brent.
"Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes"
PNAS 100(3):1140-1145 (2003).
Predicting “Novel” Genes in the Mouse Genome (I)
golden path annotations
golden path annotations
additional blastn matches to ENSEMBL + REFSEQ
additional blastn matches to ENSEMBL + REFSEQ
Predicting “Novel” Genes in the Mouse Genome (II)
tblastx
geneid exons
tblastx
sgp genes
additional blastn matches to ENSEMBL + REFSEQ
Homology and Gene Structure Filtering
Homo sapiens Predictions
Mus musculus Predictions
Homology: Blastp
Structural Alignment: Exstral
GENES: Enriched Pool
Exon Structure over an Alignment
RT-PCR Validation
Results of the Experimental Validation
            Number of predictions   Tested   Success rate
Enriched    1428                    214      62.15%
Similar     2125                    38       10.53%
Other       3659                    63       3.17%
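As a consistency check on the table, assume the success rate is the fraction of tested predictions that gave a positive RT-PCR result. The positive counts used below (133, 4 and 2) are inferred from the reported percentages and are not stated on the slide.

```latex
\[
\text{success rate} = \frac{\text{positive RT-PCR tests}}{\text{tested predictions}}:
\qquad
\frac{133}{214} \approx 62.15\%, \quad
\frac{4}{38} \approx 10.53\%, \quad
\frac{2}{63} \approx 3.17\%
\]
```

Under that assumption, each reported rate corresponds to a whole number of positive tests, so the three percentages are internally consistent with the Tested column.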
Example of a Bash Script
http://genome.imim.es/