Page 1

The myGrid Information Model

Nick Sharman, Nedim Alpdemir, Justin Ferris, Mark Greenwood, Peter Li, Chris Wroe

AHM2004, 1 September 2004

www.mygrid.org.uk

Page 2

Outline
• myGrid in context
• Science & e-Science
• The information model
• Next steps
• Conclusions

Page 3

myGrid in Context

[Figure: myGrid architecture diagram]
• Third-party tools: Utopia; Haystack (IBM); LSID Launchpad (IBM)
• myGrid information model
• Applications: Web portals; Taverna e-Science workbench
• External Services: Legacy apps; Web Services; OGSA-DAI databases; Websites
• Core Services
  – Service & workflow discovery: Feta semantic discovery; View federated UDDI+
  – Workflow enactment: Freefluo workflow engine
  – Metadata Management: RDF-based Metadata store; Provenance capture tool; myGrid ontology
  – Data Management: myGrid information repository
• Also shown: Notification service; LSID support; AMBIT text extraction service; Soaplab; Gowlab; OGSA-DAI DQP service
• Web Service (Grid Service) communication fabric

Page 4

Science and e-Science
• The scientific process:
1. Observe and describe phenomena and study existing knowledge
2. Formulate a hypothesis to explain the phenomena
3. From the hypothesis, predict other phenomena
4. Develop and perform repeatable experiments that test the predictions
• E-science parallels:
1. Search online repositories, with workflows and queries
2. Create domain ontologies to express hypotheses and …
3. … predictions
4. Workflows and queries can be preserved, shared, re-enacted

Page 5

Aspects of the model
• Based on the CCLRC Scientific Metadata Model (Matthews & Sufi)
• Programmes, studies & experiments
• People & organizations
• Data types
• Provenance metadata
• Annotation & argumentation

Page 6

Programmes, studies & experiments

[Figure: UML class diagram]
• Classes: LabBookView, StudyRole, StudyParticipationEpisode, PersonStudyParticipation, ExperimentDesign, InvestigationProgramme, Study, ProgrammeResource, Operation, ExperimentInstance
• Example operation types: Workflow, WebServiceOperation
• Associations (multiplicities at each end): has participants (1, 0..*); uses (1, 0..*); contains (1, 0..*); method (0..*, 1); episodes (1, 0..*); lab books (1, 0..*); participates in (1, 0..*); acts in (0..*, 1); selected studies (0..*); instances (1, 0..*); initiates (1, 0..*)
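As a rough illustration of how a fragment of this diagram might map onto code, the sketch below uses class names from the slide (Operation, Workflow, WebServiceOperation, ExperimentDesign, ExperimentInstance). Which classes each association connects is an assumption here: "method" is taken to link an ExperimentDesign to exactly one Operation, and "instances" to link a design to its 0..* ExperimentInstances. The field names and the `run` helper are hypothetical and not part of the myGrid model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operation:
    """Something that can be enacted; the slide's example types subclass it."""
    name: str


@dataclass
class Workflow(Operation):
    pass


@dataclass
class WebServiceOperation(Operation):
    pass


@dataclass
class ExperimentInstance:
    """One enactment of a design (assumed far end of 'instances')."""
    design: "ExperimentDesign"
    label: str


@dataclass
class ExperimentDesign:
    """Assumed shape: one method (an Operation) and 0..* instances."""
    title: str
    method: Operation
    instances: List[ExperimentInstance] = field(default_factory=list)

    def run(self, label: str) -> ExperimentInstance:
        # Hypothetical helper: record a new instance of this design.
        instance = ExperimentInstance(design=self, label=label)
        self.instances.append(instance)
        return instance


design = ExperimentDesign("Find similar sequences", Workflow("blast_workflow"))
first = design.run("run-1")
second = design.run("run-2")
print(len(design.instances), type(design.method).__name__)
```

Keeping the design separate from its instances mirrors the slide's distinction between an experiment's definition and its individual runs; everything beyond that distinction is guesswork.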

Page 7

Provenance metadata

[Figure: UML class diagram]
• Classes: StudyParticipation, LifeScienceDocument, ActualInputParameter, ActualOutputParameter, OperationTrace, DirectCreation, CreationTypeDataProvenance, InvestigationExperimentInstance
• Example trace types: WorkflowTrace, WebServiceTrace
• Associations (multiplicities at each end): created via (1, 1); created by (0..*, 1); outputs (1, 0..*); inputs (1, 0..*); includes (0..1, 1); value (1); value (1); has provenance trace (0..1, 1); initiates (1, 0..*)
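To show what such provenance records could be used for, here is a minimal sketch that walks operation traces backwards to find everything a data item was derived from. The class names OperationTrace, ActualInputParameter and ActualOutputParameter come from the slide; their fields, the `derived_from` function, the operation names and the identifier ending in `:11` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class ActualInputParameter:
    name: str
    value: str  # identifier of the data item passed in (assumed to be an LSID)


@dataclass(frozen=True)
class ActualOutputParameter:
    name: str
    value: str  # identifier of the data item produced


@dataclass(frozen=True)
class OperationTrace:
    operation: str
    inputs: Tuple[ActualInputParameter, ...]
    outputs: Tuple[ActualOutputParameter, ...]


def derived_from(item: str, traces: List[OperationTrace]) -> Set[str]:
    """Return every data item that `item` was directly or indirectly derived from."""
    produced_by: Dict[str, OperationTrace] = {
        out.value: trace for trace in traces for out in trace.outputs
    }
    ancestors: Set[str] = set()
    pending = [item]
    while pending:
        trace = produced_by.get(pending.pop())
        if trace is None:
            continue  # no trace recorded: treat the item as directly created
        for param in trace.inputs:
            if param.value not in ancestors:
                ancestors.add(param.value)
                pending.append(param.value)
    return ancestors


RAW = "urn:lsid:taverna:datathing:11"     # hypothetical identifier
MASKED = "urn:lsid:taverna:datathing:13"
REPORT = "urn:lsid:taverna:datathing:15"

traces = [
    OperationTrace("mask_sequence",
                   (ActualInputParameter("sequence", RAW),),
                   (ActualOutputParameter("masked", MASKED),)),
    OperationTrace("blast",
                   (ActualInputParameter("query", MASKED),),
                   (ActualOutputParameter("report", REPORT),)),
]

print(sorted(derived_from(REPORT, traces)))
```

Items with no recorded trace simply end the walk, which loosely corresponds to the slide's distinction between data created by an operation and data created directly.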

Page 8

Annotation & argumentation

[Figure: a BLAST report and a nucleotide sequence, annotated with an RDF graph]

BLAST report:
19747251 AC005089.3 831 Homo sapiens BAC clone CTA-315H11 from 7, complete sequence
15145617 AC073846.6 815 Homo sapiens BAC clone RP11-622P13 from 7, complete sequence
15384807 AL365366.20 46.1 Human DNA sequence from clone RP11-553N16 on chromosome 1, complete sequence
7717376 AL163282.2 44.1 Homo sapiens chromosome 21 segment HS21C082
16304790 AL133523.5 44.1 Human chromosome 14 DNA sequence BAC R-775G15 of library RPCI-11 from chromosome 14 of Homo sapiens (Human), complete sequence
34367431 BX648272.1 44.1 Homo sapiens mRNA; cDNA DKFZp686G08119 (from clone DKFZp686G08119)
5629923 AC007298.17 44.1 Homo sapiens 12q22 BAC RPCI11-256L6 (Roswell Park Cancer Institute Human BAC Library) complete sequence
34533695 AK126986.1 44.1 Homo sapiens cDNA FLJ45040 fis, clone BRAWH3020486
20377057 AC069363.10 44.1 Homo sapiens chromosome 17, clone RP11-104J23, complete sequence
4191263 AL031674.1 44.1 Human DNA sequence from clone RP4-715N11 on chromosome 20q13.1-13.2 Contains two putative novel genes, ESTs, STSs and GSSs, complete sequence
17977487 AC093690.5 44.1 Homo sapiens BAC clone RP11-731I19 from 2, complete sequence
17048246 AC012568.7 44.1 Homo sapiens chromosome 15, clone RP11-342M21, complete sequence
14485328 AL355339.7 44.1 Human DNA sequence from clone RP11-461K13 on chromosome 10, complete sequence
5757554 AC007074.2 44.1 Homo sapiens PAC clone RP3-368G6 from X, complete sequence
4176355 AC005509.1 44.1 Homo sapiens chromosome 4 clone B200N5 map 4q25, complete sequence
2829108 AF042090.1 44.1 Homo sapiens chromosome 21q22.3 PAC 171F15, complete sequence

Nucleotide sequence:
>gi|19747251|gb|AC005089.3| Homo sapiens BAC clone CTA-315H11 from 7, complete sequence
AAGCTTTTCTGGCACTGTTTCCTTCTTCCTGATAACCAGAGAAGGAAAAGATCTCCATTTTACAGATGAGGAAACAGGCTCAGAGAGGTCAAGGCTCTGGCTCAAGGTCACACAGCCTGGGAACGGCAAAGCTGATATTCAAACCCAAGCATCTTGGCTCCAAAGCCCTGGTTTCTGTTCCCACTACTGTCAGTGACCTTGGCAAGCCCTGTCCTCCTCCGGGCTTCACTCTGCACACCTGTAACCTGGGGTTAAATGGGCTCACCTGGACTGTTGAGCG

RDF graph:
• Data nodes: urn:lsid:taverna:datathing:15; urn:lsid:taverna:datathing:13
• Types (rdf:type): ..BLAST_Report; ..nucleotide_sequence
• Other nodes: service invocation; workflow invocation; workflow definition; experiment definition; project; person; group; service description; organisation
• Edge labels: ..similar_sequences_to; ..created_by; ..described_by; ..run_during; ..invocation_of; ..part_of; ..works_for; ..author; ..run_for; ..masked_sequence_of; ..filtered_version_of
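The annotation graph on this slide can be pictured as a set of subject, predicate, object triples keyed by LSIDs. The sketch below is illustrative only: it assumes that datathing:15 is the BLAST report, that datathing:13 is the nucleotide sequence, and that `similar_sequences_to` points from the report to the sequence. The helper names `parse_lsid` and `objects` are hypothetical, and plain tuples stand in for a real RDF store. LSIDs have the general form `urn:lsid:authority:namespace:object`, with an optional revision that this sketch ignores.

```python
from typing import List, NamedTuple, Tuple


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str


def parse_lsid(text: str) -> LSID:
    """Split an LSID into its authority, namespace and object parts."""
    parts = text.split(":")
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    return LSID(parts[2], parts[3], parts[4])


Triple = Tuple[str, str, str]

REPORT = "urn:lsid:taverna:datathing:15"
SEQUENCE = "urn:lsid:taverna:datathing:13"

# Assumed pairing of the node and edge labels visible on the slide.
triples: List[Triple] = [
    (REPORT, "rdf:type", "BLAST_Report"),
    (SEQUENCE, "rdf:type", "nucleotide_sequence"),
    (REPORT, "similar_sequences_to", SEQUENCE),
]


def objects(subject: str, predicate: str, graph: List[Triple]) -> List[str]:
    """Return the objects of all triples matching subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


print(parse_lsid(REPORT))
print(objects(REPORT, "similar_sequences_to", triples))
```

Because every data item is named by an LSID, the same identifiers can be handed to third-party LSID-aware tools, which is the benefit the deck attributes to LSID support.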

Page 9

Next Steps: modelling e-science events

[Figure: the myGrid architecture diagram from "myGrid in Context" (third-party tools, myGrid information model, applications, core services, external services, Web Service (Grid Service) communication fabric), with these labels added: e-Science coordination; e-Science Mediator; e-Science process patterns; e-Science events]

Page 10

Conclusions
• Builds on existing work
  – CCLRC Scientific Metadata Model
  – LSID: gives access via third party tools
  – Semantic web
• Persistent types almost implemented
• Transient types in progress
• Needs validating
  – Early versions already doing useful work

Page 11

The myGrid team

Matthew Addis, Nedim Alpdemir, Pinar Alper, Rich Cawley, Neil Davis, Vijay Dialani, Stefan Egglestone, Alvaro Fernandes, Justin Ferris, Rob Gaizauskas, Kevin Glover, Carole Goble (director), Chris Greenhalgh, Mark Greenwood, Yikun Guo, Ananth Krishna, Peter Li, Xiaojian Liu, Phil Lord, Darren Marvin, Karon Mee, Simon Miles, Luc Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay, Norman Paton, Terry Payne, Steve Pettifer, Milena Radenkovic, Peter Rice, Angus Roberts, Alan Robinson, Martin Senger, Nick Sharman, Robert Stevens, Victor Tan, Paul Watson, Anil Wipat, Chris Wroe & Jun Zhao.