Protocols in Workflow. Membership First session –Chair: John Brooke –Note Takers: Raj Bose,...

Post on 28-Mar-2015

Protocols in Workflow

Membership

• First session
  – Chair: John Brooke
  – Note Takers: Raj Bose, Mario Antonioletti
  – Geoff Lusted
  – Michael Burns

• Second Session (above plus)
  – Peter Furniss
  – Jon Blower
  – Denise Ecklund
  – Martin Craig

Scope

• What do we mean by a protocol?
  – SOAP is not sufficient for using large data sets within Web/Grid services context
  – Concentrate on scientific based workflow
  – Protocols used to communicate data between activities
    • e.g. GridFTP, http, ftp, sockets …
  – Context
    • The context determines some of the detail

Data sources/sinks

• Databases
  – accessing
  – querying
  – transferring
    • data rates
    • state (of transmission)

• Streaming
  – connection (possibly using sockets)
  – data rates

• Files & Formats
  – transferring
  – state
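
The "state" and "data rates" items above can be made concrete with a small sketch. It is illustrative only: the state names, the allowed transitions and the names Transfer, move_to, record and data_rate are all assumptions, not taken from GridFTP, OGSA-DAI or any other system discussed by the group.

```python
from dataclasses import dataclass
from enum import Enum


class TransferState(Enum):
    """Hypothetical states a data transfer could report to the workflow."""
    PENDING = "pending"
    ACTIVE = "active"
    DONE = "done"
    FAILED = "failed"


# Assumed transition table: DONE and FAILED are terminal.
ALLOWED = {
    TransferState.PENDING: {TransferState.ACTIVE, TransferState.FAILED},
    TransferState.ACTIVE: {TransferState.DONE, TransferState.FAILED},
    TransferState.DONE: set(),
    TransferState.FAILED: set(),
}


@dataclass
class Transfer:
    total_bytes: int
    sent_bytes: int = 0
    state: TransferState = TransferState.PENDING

    def move_to(self, new_state: TransferState) -> None:
        """Change state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state

    def record(self, nbytes: int) -> None:
        """Account for bytes sent; mark the transfer DONE once all have gone."""
        if self.state is not TransferState.ACTIVE:
            raise ValueError("transfer is not active")
        self.sent_bytes += nbytes
        if self.sent_bytes >= self.total_bytes:
            self.move_to(TransferState.DONE)


def data_rate(nbytes: int, seconds: float) -> float:
    """Average rate in bytes per second over an observed interval."""
    return nbytes / seconds
```

A workflow engine could poll such an object to decide when a downstream activity may start, which is one way the data and control sides become coupled.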

Control and data protocols

• We distinguished between protocols to do with workflow and control and protocols to do with data transfer.

• However the more we looked at it the more we realised that these are inherently linked in scientific workflows.

• We concentrated for a long time on data transfer between components of the workflow.

Security Issues

• Trust delegation
  – Globus model with proxy certificates
  – Unicore where all activities are pre-signed

• Static model
  – Need to specify the level of security of a workflow
    • sub-workflow?
    • whole of the workflow?
  – Trust chain in the workflow and encapsulation
    • Trust between members in the chain

• Sometimes it's not the data that needs to be moved but the computation
  – Algorithms may be commercially sensitive

More Issues

• Level of encapsulation of the workflow
  – Level of workflow granularity required of the enactment, e.g. the security/state/type of data transfer between services

• Push vs Pull data transfer models
  – Implies an ordering in the process flow

• Does WS* have the capability to express everything that people want to do over the Grid?
  – State issues, e.g. status of file transfer tasks.
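
The point that push and pull imply an ordering in the process flow can be sketched as follows. This is a minimal illustration with in-process function calls standing in for a real transfer protocol; produce, push_transfer and pull_transfer are hypothetical names introduced here.

```python
from typing import Callable, Iterator, List


def produce(log: List[str], n: int) -> Iterator[int]:
    """Hypothetical upstream activity that yields n data chunks."""
    for i in range(n):
        log.append(f"produce {i}")
        yield i


def push_transfer(log: List[str], n: int, sink: Callable[[int], None]) -> None:
    """Push: the producer drives and hands every chunk to the sink."""
    for chunk in produce(log, n):
        sink(chunk)


def pull_transfer(log: List[str], source: Iterator[int], wanted: int) -> List[int]:
    """Pull: the consumer drives and requests only the chunks it wants."""
    received = []
    for _ in range(wanted):
        log.append("request")
        chunk = next(source)  # nothing is produced until it is asked for
        log.append(f"consume {chunk}")
        received.append(chunk)
    return received


push_log: List[str] = []
push_transfer(push_log, 2, lambda c: push_log.append(f"consume {c}"))

pull_log: List[str] = []
pulled = pull_transfer(pull_log, produce(pull_log, 5), wanted=2)
```

In the push log every "produce" precedes its "consume" with no request step, whereas in the pull log each "request" comes first and the three unrequested chunks are never produced. The choice of model therefore fixes which activity must be ready first.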

Even More Issues

• Provenance

• Metadata

• Failure recovery/notification

The issue here is how far into the workflow protocols these considerations should extend. Such protocols are often built on lower-level protocols, which are themselves built on still lower-level ones, and so on, as in Simon’s fleas.

Next steps??

• Need to examine more workflows. We looked at OGSA-DAI, Unicore and the GADS data-serving web service, but there are many more.

• We suggest that the transfer of very large amounts of data in different ways is a differentiating area for scientific workflows.

• Security is also viewed differently, so this should be looked at as well.

• Considerable overlap with concerns of other breakout groups.