Raw nuclear DNA Sequence


Transcript of Raw nuclear DNA Sequence

Page 1: Raw nuclear DNA Sequence

<<comment>>Raw nuclear DNA Sequence

Pipeline stages: Repeat masking, EST Alignments, Protein alignments, Ab initio gene detection

Programs: RepeatMasker, TRF, RepeatScout, BLAT, EST2Genome, GeneWise, SNAP, GAZE

<<comment>>Chromosome I 690 curated gene models

<<comment>>426,440 EST of apicomplexa

<<comment>>44,545 EST have been aligned

<<comment>>1.2% of repeat sequences (75 kbp)

<<comment>>Uniprot

<<comment>>5,419 gene models

<<comment>>1,426 gene models

Figure S1

Page 2: Raw nuclear DNA Sequence

<<comment>>Raw nuclear DNA Sequence

Snap detection, EST/Prot mapping, GlimmerHMM, GAZE

BlastX homology saturation

Delete CDS without homology overlapping CDS with homology

Create CDS presenting homology and delete overlapping CDS without homologies

Calculate start codon position by comparison with three best homologues

Delete in-frame introns (mostly from GAZE)

Create intron to cover all ORFs in frames presenting BLASTX alignments from homologues

Split CDS with BLASTX alignment corresponding to different homologues
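
The gene selection rule listed above (delete CDS without homology that overlap CDS with homology) can be illustrated with a minimal Python sketch. The `CDS` record, the `has_homology` flag, and the choice to ignore strand are assumptions made for illustration only; the slide does not specify how overlap is measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CDS:
    name: str
    start: int          # 1-based, inclusive (assumed coordinate convention)
    end: int            # 1-based, inclusive
    has_homology: bool  # True if a BLASTX hit supports this model (hypothetical flag)


def overlaps(a: CDS, b: CDS) -> bool:
    """Return True when the two closed intervals share at least one base."""
    return a.start <= b.end and b.start <= a.end


def select_cds(models: list[CDS]) -> list[CDS]:
    """Drop unsupported models that overlap a homology-supported model."""
    supported = [m for m in models if m.has_homology]
    kept = []
    for m in models:
        if not m.has_homology and any(overlaps(m, s) for s in supported):
            continue  # unsupported and overlapping a supported model: delete
        kept.append(m)
    return kept


models = [
    CDS("g1", 100, 900, True),
    CDS("g2", 850, 1500, False),   # overlaps g1 and has no homology: removed
    CDS("g3", 2000, 2600, False),  # no overlap with a supported model: kept
]
print([m.name for m in select_cds(models)])  # ['g1', 'g3']
```

Unsupported models that do not overlap a supported one are kept here, since the slide only describes deleting the overlapping ones.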

<<comment>>Similar to MegaBlast

<<comment>>Gene fusion event is evaluated

<<comment>>Homologues should have conserved ATG position

Stage labels: Gene structure, Gene selection

Translate CDS in amino acids sequence
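
As a rough illustration of the translation step, the sketch below converts a CDS into an amino acid sequence. It assumes the standard genetic code (NCBI translation table 1), which the slide does not state, and the function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code (assumed), with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a CDS codon by codon; unknown codons (e.g. with N) become 'X'."""
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3))


print(translate("ATGGCTAAATGA"))  # MAK*
```

The stop codon is reported as `*` rather than trimmed, so a premature stop inside a gene model stays visible.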

Best bidirectional hit and KEGG ID identification
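
A best bidirectional hit can be sketched as follows: a gene and a reference entry are paired only when each is the other's top-scoring match. The score dictionaries, gene names, and reference identifiers below are hypothetical placeholders, not real KEGG IDs, and the scoring scheme is an assumption.

```python
def best_hits(scores: dict[tuple[str, str], float]) -> dict[str, str]:
    """Map each query to its highest-scoring subject (first seen wins on ties)."""
    best: dict[str, tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {q: s for q, (s, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    fwd = best_hits(a_vs_b)
    rev = best_hits(b_vs_a)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Hypothetical alignment scores: genome proteins against reference entries and back.
a_vs_b = {("geneA1", "ref1"): 250.0, ("geneA1", "ref2"): 90.0,
          ("geneA2", "ref2"): 120.0, ("geneA3", "ref1"): 200.0}
b_vs_a = {("ref1", "geneA1"): 245.0, ("ref1", "geneA3"): 198.0,
          ("ref2", "geneA2"): 118.0}

print(bidirectional_best_hits(a_vs_b, b_vs_a))
# [('geneA1', 'ref1'), ('geneA2', 'ref2')]
```

Here `geneA3` gets no pairing: its best hit is `ref1`, but `ref1` matches `geneA1` better in the reverse direction.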

<<comment>>Annotated DNA Sequence

3,513 gene models

Figure S2

Stage label: Automatic detection

<<comment>>3,923 gene models

<<comment>>1,426 gene models

Page 3: Raw nuclear DNA Sequence

Figure S3 (panels A to E)

Page 4: Raw nuclear DNA Sequence

Figure S4 (panels A and B)

Scale bars: 0 to 1300 Kbp (ticks at 0, 500, 1000, 1300) and 0 to 3000 Kbp (ticks at 0, 1000, 2000, 3000)

Chromosomes shown: Chromosome I, Chromosome II, Chromosome III

Chromosome segments: Ia, Ib; IIa, IIb; IIIa, IIIb, IIIc

Legend and feature labels: GAPDH, CRMP, FBA, CEN, REP, rDNA, N1-15, Tandem repeats, Gap

Page 5: Raw nuclear DNA Sequence

Figure S5

Genes: Bmldh, Bmtpk

Lane labels (shown twice): cDNA, gDNA, RT-, RT+

Size marker: 1000 bp

Page 6: Raw nuclear DNA Sequence

Figure S6A

Clade labels: MDH-like LDH of apicomplexa, LDH-like MDH, LDH

Scale bar: 0.2

Node values shown: 82 to 100

Page 7: Raw nuclear DNA Sequence

Figure S6B

Group labels: Eukaryota, Bacteria, Bartonella

Node values shown: 88 to 100