DRASTIC Database Resource for Analysis of Signal Transduction in Cells Gary Lyon Interrogating the...

Post on 17-Dec-2015


DRASTIC Database Resource for Analysis of Signal Transduction in Cells

www.drastic.org.uk

Gary Lyon

Interrogating the DRASTIC Gene Expression Database

30 April 2004

Aim of DRASTIC

To understand signal transduction in response to plant pathogens and other environmental stresses.

To assist with putting into context the results of our own gene discovery work within the PPI Programme

and

Publicity !

Why do we need ‘DRASTIC’?

• Published gene expression data is not searchable.

• Too much data to remember e.g. microarray data.

• Cannot match ‘unknown’ genes with prior expression data (14.2% of entries in the database are ‘unknown’).

• Gene names associated with certain accession numbers change with time.

• Cell biology is complex. [Simple answers to complex problems are always wrong]

For example

• One gene can have a variety of names: HBZip homeobox domain HD-zip homeobox protein homeobox domain zipper protein transcription factor, homeobox protein

• Names can be wrong:
  ‘HB AtHB-14 like’ should be ‘AtHB-9’
  ‘Htf9C’ should be ‘RNA methyltransferase-related’
  ‘endo 1,4-beta-mannosidase like’ should be ‘protein kinase family’

• Names can be confusing: ‘HSR201 like’, ‘RSH2 :Rel-SpoT homology’
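One simple way to handle names that are known to be wrong is a lookup table from the published name to the corrected one. The sketch below uses only the three corrections listed on this slide. The dictionary and the `current_name` helper are illustrative assumptions, not a description of how DRASTIC itself stores names.

```python
# Corrections taken from the examples on this slide. The lookup approach is
# an assumption for illustration, not DRASTIC's actual implementation.
NAME_CORRECTIONS = {
    "HB AtHB-14 like": "AtHB-9",
    "Htf9C": "RNA methyltransferase-related",
    "endo 1,4-beta-mannosidase like": "protein kinase family",
}


def current_name(published_name: str) -> str:
    """Return the corrected name if one is known, else the name unchanged."""
    cleaned = published_name.strip()
    return NAME_CORRECTIONS.get(cleaned, cleaned)


print(current_name("Htf9C"))
print(current_name("HSR201 like"))  # no correction known, returned as-is
```

Names without a known correction pass through unchanged, so confusing names such as ‘HSR201 like’ still need a human curator.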


Access database

• Incorporates published data from microarrays and Northerns of ESTs regulated by various treatments

(i) Environmental stress e.g. drought, NaCl, high and low temperatures

(ii) Pathogens and elicitors (salicylic acid, ethylene, jasmonates)

• 424 references

• 266 treatments

• 67 plant species

• 10,193 gene accessions

Selection by Gene name

[Figure: grid with columns treatment 1, treatment 2, treatment 3 and rows 1 to 7]

Potential signalling networks

Funded by a 1 year PGRA grant from Carnegie Trust awarded to:

University of Abertay
– Dr Les Ball, Dr Louis Natanson (Computing)
– Prof Kevan Gartland, Dr Jill Gartland (Biotech.)
– Davina Button (RA)

University of Edinburgh
– Prof Peter Ghazal (GTI; Scottish Centre for Genomic Technology and Informatics)

University of St Andrews
– Dr Ishbel Duncan (Computer Science)

Aim:

–To build an intelligent and generic system for new hypothesis formulation from complex biochemical pathway databases.

Davina Button

‘Road Map’

Options with the new database

Genes induced by BTH

pathogen induced – incompatible (Arabidopsis)

Pathways e.g. glycolysis enzymes

Conversion of glucose to pyruvate

Possible interpretations:-

• Wrong pathway

• Insufficient data

• Some errors (different time points? low homology!)

• Evidence of another pathway

1. Les Ball (Abertay),

2. Prof Bonnie Webber (School of Informatics, Edinburgh University),

3. CABI.

• Data input and

• Data analysis

Could be used to provide a putative relationship between genes/proteins based on existing knowledge in the literature. This model could be combined with information in the gene expression database to provide a draft version of a regulatory gene network.

Text mining

Web stats - Location of users

Impact factors ?!

DRASTIC

Database Resource for Analysis of Signal Transduction in Cells

SCRI

Gary Lyon

Adrian Newton

Bruce Marshall

University of Abertay

Les Ball

Louis Natanson

Alasdair Houston


Can we group treatments?

Genes up-regulated by Sulphur depletion

Another example

The same gene can have different accession numbers – a big problem with genes of unknown function.

However, by converting accession numbers into AGI numbers we have shown that for the following ESTs

down-regulated by: chitin (viz H37231, R90140, T41806), drought (viz AV823744), ethylene (viz R90140), low oxygen (At2g10940) or sodium chloride (AV823744),

or up-regulated by salicylic acid (R90140, H37231)

are all the same gene viz At2g10940
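The consolidation described above can be sketched in a few lines. The accession numbers, treatments and directions are transcribed from this slide. The data layout and the `group_by_agi` helper are assumptions made for illustration, not the database's actual schema.

```python
from collections import defaultdict

# Accession -> AGI mapping as stated on this slide (all resolve to At2g10940).
ACCESSION_TO_AGI = {
    "H37231": "At2g10940",
    "R90140": "At2g10940",
    "T41806": "At2g10940",
    "AV823744": "At2g10940",
    "At2g10940": "At2g10940",
}

# (treatment, direction, accession) records transcribed from the slide.
RECORDS = [
    ("chitin", "down", "H37231"),
    ("chitin", "down", "R90140"),
    ("chitin", "down", "T41806"),
    ("drought", "down", "AV823744"),
    ("ethylene", "down", "R90140"),
    ("low oxygen", "down", "At2g10940"),
    ("sodium chloride", "down", "AV823744"),
    ("salicylic acid", "up", "R90140"),
    ("salicylic acid", "up", "H37231"),
]


def group_by_agi(records, mapping):
    """Collect the set of (treatment, direction) pairs reported per AGI number."""
    grouped = defaultdict(set)
    for treatment, direction, accession in records:
        agi = mapping.get(accession)
        if agi is None:
            continue  # accession not yet resolved; left out of the grouping
        grouped[agi].add((treatment, direction))
    return dict(grouped)


profile = group_by_agi(RECORDS, ACCESSION_TO_AGI)
for treatment, direction in sorted(profile["At2g10940"]):
    print(f"At2g10940 {direction}-regulated by {treatment}")
```

Once the records are keyed by AGI number, reports that looked like four or five unrelated ESTs collapse into a single expression profile for one gene.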

Number of entries in the Gene expression Database - examples

                                up-regulated  down-regulated
Plants
  Arabidopsis                           5052            1246
  potato                                 168               8
  tomato                                 393             213
  Nicotiana tabacum                      258              87
  pepper                                 113               0
  rice                                   234              43
Treatments
  ethylene                               105              20
  salicylic acid                         330             146
  jasmonates (methyl)                    344             135
  jasmonic acid                           78               2
Pathogens
  Ecc                                     35               0
  Eca                                      3               0
  P. infestans (incompatible)             15               1
  P. infestans (compatible)               51               3
Environmental stresses
  cold                                   436             187
  drought                                690             263
  sodium chloride                        546             248
  wounding                               510              63
Abscisic acid                            359              46
Total in database                       7127            1828

What else could we do with the data?

• Identify potato and barley orthologs of stress induced genes

• Map the position of the stress inducible genes

• Statistical analysis of signal transduction genes

• What are the differences between plant tissues, e.g. roots v. leaves?

Information from Maleck et al., Nature Genetics (Dec 2000) 26, 403-410

Out of 50 accession numbers checked (March 2004):-

• 26 (52%) were correctly identified

• 3 (6%) were wrongly identified (though 2 of these could be classed as ‘additional information being made available’, with only 1 really wrong)

• 13 (26%) are newly identified with a gene name (these were originally described as ‘no homology’)

• 8 (16%) remain unknown but have an AGI number (these were originally described as ‘no homology’)
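As a quick check on the figures above, the four counts can be tallied and converted to percentages. The counts are those reported on this slide; the variable names are illustrative only.

```python
# Counts as reported on this slide for the 50 accession numbers checked.
counts = {
    "correctly identified": 26,
    "wrongly identified": 3,
    "newly identified with a gene name": 13,
    "remain unknown but have an AGI number": 8,
}

total = sum(counts.values())
percentages = {label: 100 * n / total for label, n in counts.items()}

print(f"total checked: {total}")
for label, pct in percentages.items():
    print(f"{label}: {pct:.0f}%")
```

The four categories account for all 50 accession numbers, and the percentages match those quoted in the bullets.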