Chani & Malki present:


Page 1: Chani & Malki  present:

Chani & Malki present:

Project adviser: Dr. Ron Wides

The OdzFinder

Page 2: Chani & Malki  present:

WANTED

Name: Odz

a.k.a: Ten-m

Family: pair-rule gene

Length: 10,000 bp

Page 3: Chani & Malki  present:

Getting to Know Odz…

Discovered in D. melanogaster in 1994

Belongs to the pair-rule gene family

Odz protein is expressed in neurons, the developing brain and the hindgut

Odz protein is expressed during segmentation

Plays a crucial role in the CNS during fetal development

Page 4: Chani & Malki  present:

The Odz Family

Odz gene orthologs have been found in 3 phyla:

Vertebrates: Ten-m1, Ten-m2, Ten-m3, Ten-m4

Arthropods: Ten-a, Ten-m

Nematodes: Ten-m

Page 5: Chani & Malki  present:

The Odz Protein

2731 amino acids

The only pair rule gene that encodes a protein!

Contains 3 domains:

I. extracellular EGF-like repeats

II. tyrosine kinase phosphorylation sites

III. hydrophobic sequences, probably a transmembrane sequence

[Diagram: ODZ protein, labeled with the EGF-like domain and the intracellular kinase substrate domain]

Page 6: Chani & Malki  present:

EGF-like Repeats

x(4)-C-x(0,48)-C-x(3,12)-C-x(1,70)-C-x(1,6)-C-x(2)-G-a-x(0,21)-G-x(2)-C-x

EGF-like domain:

30 - 40 amino acid residues

Significant homology to epidermal growth factor (EGF)

Has been found in single or multiple copies in a number of other proteins

Generally found in the extracellular domain of membrane proteins or secreted proteins

Involved in receptor-ligand interactions

Includes 6 conserved cysteine residues involved in disulfide bonds
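The consensus pattern on this slide can be turned into a regular expression to scan a protein sequence for EGF-like repeats. The following is a minimal Python sketch for illustration only (the OdzFinder itself is a Perl program). It assumes that "x" means any residue and that "a" means an aromatic residue, taken here as F, W or Y; the function name and the toy sequence are hypothetical. Because some positions in the consensus are only "often" conserved, a strict match like this one may miss real repeats.

```python
import re

# Regex form of the consensus pattern shown on the slide.
# Assumption: "a" is an aromatic residue, approximated here as F, W or Y.
EGF_PATTERN = re.compile(
    r".{4}C.{0,48}C.{3,12}C.{1,70}C.{1,6}C.{2}G[FWY].{0,21}G.{2}C."
)


def find_egf_like(protein):
    """Return (start, end) spans of non-overlapping matches (0-based, end exclusive)."""
    return [m.span() for m in EGF_PATTERN.finditer(protein.upper())]


# Synthetic sequence built to satisfy the pattern; it is not a real protein.
toy = "MK" + "AAAACAACAAACACACAAGYGAACA" + "LL"
print(find_egf_like(toy))
```

The six cysteines in the pattern correspond to the six conserved cysteine residues listed above.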

Page 7: Chani & Malki  present:

The lab’s goals:

Genomics:

To find a broad family of Odz genes

Phylogenetic trees to discover segmentation mechanism

Massive alignment to find conserved regions

Biological in-vivo experiments to change regions

Proteomics:

The protein’s role

How the protein functions

The protein’s interactions with other proteins (e.g., Notch)

Page 8: Chani & Malki  present:

Finding Odz Genes

Three sources feed the Odz database:

BLASTing existing databases (databases)

Extracting DNA from various innocent creatures (sequences discovered in the lab)

BLASTing new EST libraries (EST libraries)

Page 9: Chani & Malki  present:

Odz Database

The collected data was organized by Michal Markovitz in a relational database.

The database consists of 10 different tables.

For example:

Page 10: Chani & Malki  present:

2 problems remained:

1. BLAST results include many non-Odz hits:

• prokaryotic hits

• non-metazoan hits

• EGF region hits

• low similarity

[Bar chart of hit categories: low score, prokaryotic, non-metazoan, Odz, EGF-like; axis from 0 to 80]

2. Every day…

• New sequences are added to the existing databases

• New EST libraries are released

We need a program to automatically extract Odz hits from NCBI BLAST results!!!

Page 11: Chani & Malki  present:

The OdzFinder

A Perl program that automatically extracts Odz hits from NCBI BLAST results.

Page 12: Chani & Malki  present:

S.O.F.T - Screen Odz Flow Template

[Flowchart. Input: Blast Report and Tax Report, merged in a Combination step. Decision nodes: Prokaryote?, Metazoan?, EGF? (branches: No EGF, All EGF, Mixed EGF), Look up table, Score>x?, Evalue<y?. Hits that answer "yes" at the final checks are labeled Odz and passed to Update Database.]

Page 13: Chani & Malki  present:

>gi|163076235|gb|AC765764.7 Apis mellifera BAC clone RP11-18D7 , complete sequence

Length = 184032  

Score = 153 bits (328), Expect = 3e-36 Identities = 59/59 (100%), Positives = 59/59 (100%)

Frame = +3 / +3  

Query: 3 IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH 179

IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH

Subjct: 3 IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH 179 

Input: Blast Report

BLASTs are performed on the Odz orthologs.

The results are sent to the OdzFinder program to be filtered.

The program extracts relevant information from each hit, such as the one above.
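The extraction step can be sketched as follows. This is a Python illustration, not the OdzFinder's own Perl code, and the slide does not list which fields the program keeps. The sketch assumes the gi number, species, bit score, E-value, percent identity and query coordinates are the relevant ones; the function name `parse_hit` and the dictionary keys are hypothetical. Taking the first two words after the accession as the species is a heuristic.

```python
import re

# Example hit copied from the slide.
HIT = """>gi|163076235|gb|AC765764.7 Apis mellifera BAC clone RP11-18D7 , complete sequence
Length = 184032
Score = 153 bits (328), Expect = 3e-36 Identities = 59/59 (100%), Positives = 59/59 (100%)
Frame = +3 / +3
Query: 3 IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH 179
Subjct: 3 IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH 179
"""


def parse_hit(text):
    """Pull a few fields out of one text-format BLAST hit."""

    def grab(pattern):
        m = re.search(pattern, text)
        if m is None:
            raise ValueError("pattern not found: " + pattern)
        return m

    evalue = grab(r"Expect\s*=\s*([0-9.eE+-]+)").group(1)
    if evalue.startswith("e"):  # BLAST may print "e-100" for 1e-100
        evalue = "1" + evalue
    ident = grab(r"Identities\s*=\s*(\d+)/(\d+)")
    # Only the first Query line is read; a multi-line alignment would need
    # the minimum start and maximum end over all Query lines.
    query = grab(r"Query:\s*(\d+)\s+\S+\s+(\d+)")
    return {
        "gi": grab(r">gi\|(\d+)\|").group(1),
        "species": grab(r">gi\|\d+\|\w+\|\S+\s+(\w+ \w+)").group(1),
        "score_bits": float(grab(r"Score\s*=\s*([0-9.]+)\s*bits").group(1)),
        "evalue": float(evalue),
        "identity_pct": 100.0 * int(ident.group(1)) / int(ident.group(2)),
        "query_start": int(query.group(1)),
        "query_end": int(query.group(2)),
    }


hit = parse_hit(HIT)
print(hit["gi"], hit["species"], hit["score_bits"], hit["evalue"])
```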

Page 14: Chani & Malki  present:

>gi|163076235|gb|AC765764.7 Apis mellifera BAC clone RP11-18D7 , complete sequence

Length = 184032  

Score = 153 bits (328), Expect = 3e-36 Identities = 59/59 (100%), Positives = 59/59 (100%)

Frame = +3 / +3  

Query: 3 IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH 179

IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH

Subjct: 3 IQHKTFKFHGNYIKQRFHPRIYK*RYKYQRFHPRIYK*NLNLYRVCCSHIILECLQTAH 179 

Input: Blast Report + Tax Report

Taxonomy Report

Eukaryota .................................. 2502 hits 41 orgs [root; cellular organisms]
. Bilateria ................................ 2421 hits 33 orgs [Fungi/Metazoa group; Metazoa; Eumetazoa]
. . Coelomata .............................. 2396 hits 31 orgs
. . . Deuterostomia ........................ 2322 hits 23 orgs
. . . . Chordata ........................... 2296 hits 22 orgs
. . . . . Euteleostomi ..................... 2236 hits 21 orgs [Craniata; Vertebrata; Gnathostomata; Teleostomi]
. . . . . . Tetrapoda ...................... 2022 hits 14 orgs [Sarcopterygii]
. . . . . . . Amniota ...................... 1908 hits 12 orgs
. . . . . . . . Eutheria ................... 1634 hits 10 orgs [Mammalia; Theria]

Search for eukaryotic and metazoan results.

Build a prokaryotic database for possible future use.

Evolutionary distance becomes relevant when dealing with EGF-like repeats.

The program will receive the BLAST hit’s Taxonomy Report and manipulate it into a manageable hash table.

A default Taxonomy Report will be available when BLASTing against ESTs.

Example lineage:

root; cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; Bilateria; Coelomata; Protostomia; Panarthropoda; Arthropoda; Mandibulata; Pancrustacea; Hexapoda; Insecta; Dicondylia; Pterygota; Neoptera; Endopterygota; Hymenoptera; Apocrita; Aculeata; Apoidea; Apidae; Apinae; Apini; Apis
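One way to turn a lineage like the one above into a hash table and screen it is sketched below in Python (illustrative only; the OdzFinder is written in Perl, and the slides do not show its data layout). The sketch assumes the hash maps each taxon name to its depth in the lineage, and that a hit counts as prokaryotic when "Bacteria" or "Archaea" appears in it. The function names and return labels are hypothetical.

```python
# Lineage of the Apis mellifera hit, as shown on the slide.
LINEAGE = (
    "root ;cellular organisms ;Eukaryota ;Fungi/Metazoa group ;Metazoa ;"
    "Eumetazoa ;Bilateria ;Coelomata ;Protostomia ;Panarthropoda ;"
    "Arthropoda ;Mandibulata ;Pancrustacea ;Hexapoda ;Insecta ;Dicondylia ;"
    "Pterygota ;Neoptera ;Endopterygota ;Hymenoptera ;Apocrita ;Aculeata ;"
    "Apoidea; Apidae; Apinae; Apini; Apis"
)


def parse_lineage(text):
    """Split a semicolon-separated lineage into a clean list of taxon names."""
    return [name.strip() for name in text.split(";") if name.strip()]


def lineage_table(lineage):
    """Hash table: taxon name -> depth below the root (assumed layout)."""
    return {name: depth for depth, name in enumerate(lineage)}


def classify_lineage(lineage):
    """Label a hit by the broad group its lineage falls into."""
    table = lineage_table(lineage)
    if "Metazoa" in table:
        return "metazoan"
    if "Eukaryota" in table:
        return "non-metazoan eukaryote"
    if "Bacteria" in table or "Archaea" in table:
        return "prokaryote"
    return "unknown"


taxa = parse_lineage(LINEAGE)
print(taxa[-1], classify_lineage(taxa))
```

Keeping the depth of each taxon makes it possible to compare two lineages later, for example to find how far down they agree.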

Page 15: Chani & Malki  present:

Tenascin-m (odz) includes 8 EGF-like repeats

The conserved EGF region gave problematic results.

Many hits appear only due to their similarity to the EGF region.

[Diagram: query and subject aligned over the EGF region; labels: "EGF?" and "High score!!!"]

Page 16: Chani & Malki  present:

There are three possible positions of the hit relative to the query’s EGF-like region:

I. The hit is completely outside the query’s EGF-like region

II. The hit is completely inside the query’s EGF-like region

III. The hit is partially inside the query’s EGF-like region

[Diagrams: each case shows the hit aligned against the query, whose EGF-like region spans positions 525 to 804; one diagram also carries the label 2750.]
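The three positions reduce to an interval comparison between the hit's query coordinates and the EGF-like region. Below is a minimal Python sketch (for illustration; the OdzFinder is a Perl program). It takes the region boundaries 525 and 804 from the slide and returns the flowchart's branch labels "No EGF", "All EGF" and "Mixed EGF"; the function name is hypothetical, and coordinates are assumed to be inclusive query positions.

```python
EGF_START, EGF_END = 525, 804  # query EGF-like region, from the slide


def egf_position(hit_start, hit_end, egf_start=EGF_START, egf_end=EGF_END):
    """Classify a hit by where its query coordinates fall relative to the EGF region."""
    lo, hi = sorted((hit_start, hit_end))  # tolerate reversed coordinates
    if lo >= egf_start and hi <= egf_end:
        return "All EGF"    # completely inside the region
    if hi < egf_start or lo > egf_end:
        return "No EGF"     # completely outside the region
    return "Mixed EGF"      # partial overlap


for coords in [(600, 700), (900, 2000), (700, 1200)]:
    print(coords, egf_position(*coords))
```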

Page 17: Chani & Malki  present:

Get a better picture..

Page 18: Chani & Malki  present:

Position I:

The hit is completely outside the query’s EGF-like region

Score & E-value are examined

Set low thresholds to ensure that very small hits are not missed; sometimes they are translocations

[Flow: No EGF -> Score>x? and Evalue<y? -> yes -> Odz]
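The Position I check amounts to two comparisons. A minimal Python sketch follows (illustrative; the OdzFinder is written in Perl). The slides name the thresholds only as x and y, so the numbers in the usage lines are placeholders chosen for the example, not the lab's settings, and the function name is hypothetical.

```python
def passes_thresholds(score_bits, evalue, min_score, max_evalue):
    """Accept a hit when its score is above x and its E-value is below y."""
    return score_bits > min_score and evalue < max_evalue


# Placeholder thresholds for illustration only (x = 40 bits, y = 1e-5).
X, Y = 40.0, 1e-5
print(passes_thresholds(153.0, 3e-36, X, Y))  # the example hit from the slides
print(passes_thresholds(25.0, 0.5, X, Y))     # a weak hit
```

Lower E-values are more significant, which is why the E-value test uses "less than" while the score test uses "greater than".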

Page 19: Chani & Malki  present:

Position II:

The hit is completely inside the query’s EGF-like region

To prevent acceptance of non-Odz hits that score highly only because of their EGF region, a look-up table was established:

Evolutionarily close query & subject: high identity % demanded

Evolutionarily distant query & subject: low identity % demanded

Look-up table example:

Query          Hit                        Odz ortholog   Odz paralog
Mus musculus   Homo sapiens               95%            70%
Mus musculus   Drosophila melanogaster    75%            55%

[Flow: All EGF -> look-up table, Score>x? and Evalue<y? -> yes -> Odz]
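The look-up table can be sketched as a dictionary keyed by the (query species, hit species) pair. This Python example is illustrative only (the OdzFinder is Perl), and the slides do not say exactly how the two columns are applied. The sketch assumes a hit at or above the ortholog percentage is called an ortholog, one at or above the paralog percentage a paralog, and anything lower is rejected. The function name and return labels are hypothetical; the percentages are the two rows shown on the slide.

```python
# Minimum identity (%) demanded for hits inside the EGF-like region,
# keyed by (query species, hit species). Values are from the slide.
LOOKUP = {
    ("Mus musculus", "Homo sapiens"): {"ortholog": 95.0, "paralog": 70.0},
    ("Mus musculus", "Drosophila melanogaster"): {"ortholog": 75.0, "paralog": 55.0},
}


def classify_egf_hit(query_species, hit_species, identity_pct):
    """Return a label for an all-EGF hit, or None if the species pair is not in the table."""
    row = LOOKUP.get((query_species, hit_species))
    if row is None:
        return None
    if identity_pct >= row["ortholog"]:
        return "Odz ortholog"
    if identity_pct >= row["paralog"]:
        return "Odz paralog"
    return "not Odz"


print(classify_egf_hit("Mus musculus", "Homo sapiens", 80.0))
print(classify_egf_hit("Mus musculus", "Drosophila melanogaster", 80.0))
```

The same 80% identity gives a different answer for the two species pairs, which is the point of demanding more from evolutionarily close species.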

Page 20: Chani & Malki  present:

Position III:

The hit is partially inside the query’s EGF-like region

2 possibilities:

A. False call! An EGF hit with insignificant similarity outside of the EGF domains.

B. The real thing! EGF with adjacent regions of significant similarity.

Mixed EGF: is it more like A or like B?

A: treat like II

B: treat like I
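The slides do not state how the program decides between A and B. One possible criterion, shown here purely as an assumption, is the number of aligned query positions that fall outside the EGF-like region: a long stretch outside suggests case B, a short one case A. The Python sketch below (illustrative; the OdzFinder is Perl) uses the region boundaries 525 and 804 from the earlier slide. The function names and the 50-position cutoff are hypothetical.

```python
def outside_length(hit_start, hit_end, egf_start=525, egf_end=804):
    """Count query positions of the hit that lie outside the EGF-like region."""
    lo, hi = sorted((hit_start, hit_end))
    left = max(0, min(hi, egf_start - 1) - lo + 1)   # part before the region
    right = max(0, hi - max(lo, egf_end + 1) + 1)    # part after the region
    return left + right


def route_mixed_hit(hit_start, hit_end, min_outside=50):
    """Assumed rule: enough alignment outside the EGF region means case B."""
    if outside_length(hit_start, hit_end) >= min_outside:
        return "treat like I"    # B: the real thing
    return "treat like II"       # A: false call, go through the look-up table


print(route_mixed_hit(700, 1200))
print(route_mixed_hit(780, 810))
```

A fuller version would look at the similarity of the outside stretch, not just its length, since the slide speaks of "significant similarity".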

Page 21: Chani & Malki  present:

Update Database: DBI

A database interface module for Perl

Enables Perl applications to access multiple database types

Provides a consistent database interface independent of the actual database being used

Data flow through DBI: Perl script -> DBI -> DBD::mysql -> MySQL RDBMS
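The update step can be illustrated with a short sketch. The OdzFinder uses Perl's DBI with MySQL; the Python example below swaps in the standard-library `sqlite3` module and an in-memory database so that it runs without a server. It is a stand-in for the same idea, not the project's code. The table name `odz_hits` and its three columns (gi, score, species) are assumptions based on the results table later in the deck.

```python
import sqlite3


def update_database(conn, hits):
    """Insert accepted hits, replacing any earlier row with the same gi."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS odz_hits ("
        "gi TEXT PRIMARY KEY, score REAL, species TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO odz_hits (gi, score, species) VALUES (?, ?, ?)",
        [(h["gi"], h["score"], h["species"]) for h in hits],
    )
    conn.commit()


# In-memory database for the example; the hit is the one shown on the slides.
conn = sqlite3.connect(":memory:")
update_database(conn, [{"gi": "163076235", "score": 153.0, "species": "Apis mellifera"}])
rows = conn.execute("SELECT gi, score, species FROM odz_hits").fetchall()
print(rows)
```

Using placeholders (`?`) instead of string formatting mirrors the bound-parameter style that DBI also supports, and keeps sequence descriptions with odd characters from breaking the SQL.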

Page 22: Chani & Malki  present:

Results!

[Table columns: gi, score, species; the gi and score values are fused in this transcript]

49256537140 Xenopus

48096180637 Apis mellifera

45382362619 Gallus gallus

42658224125 Homo sapiens

34932761384 Rattus norvegicus

38087011463 Mus musculus

45446084419 Drosophila melanogaster

325657151604 Caenorhabditis elegans

41469033760 Gasterosteus aculeatus

[Two charts of hit categories: EGF, Odz, not Metazoa, Prokaryotic]

Page 23: Chani & Malki  present:

Special thanks to our project adviser

Dr. Ron Wides

For his guidance, patience & Krispy Kreme donuts