Workflow Management in GridMiner Günter Kickinger, Jürgen Hofer, Peter Brezany, A Min Tjoa Institute for Software Science University of Vienna The 3rd Cracow Grid Workshop

Workflow Management in GridMiner

Günter Kickinger, Jürgen Hofer, Peter Brezany, A Min Tjoa

Institute for Software Science, University of Vienna

The 3rd Cracow Grid Workshop

Outline

• Overview
• The Knowledge Discovery Process
• GridMiner Architecture
• Collaboration of Services
• Workflows
• Dynamic Service Composition

Overview

• GridMiner
  – Service-oriented, grid-aware data mining system
  – cope with
    • very large data sets
    • high-dimensional data sets
    • geographically distributed data sets
    • different types of data sets
  – implemented on top of Globus Toolkit 3.0

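The Knowledge Discovery Process

[Figure: the knowledge discovery process, leading from the data warehouse (DWH) to knowledge through four stages: Cleaning and Integration, Selection and Transformation, Data Mining, Evaluation and Presentation.]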

GridMiner Architecture

[Figure: layered architecture diagram. Layer labels: GridMiner Workflow, GridMiner Core, GridMiner Base, Grid Core, Fabric.]

• GridMiner services
  – GMMS: Mediation
  – GMPPS: Pre Processing
  – GMDMS: Data Mining
  – GMPRS: Presentation
  – GM DSCE: Dynamic Service Control
  – GMDIS: Integration
  – GMOMS: OLAM
  – GMCMS: OLAP / Cubes
  – GMIS: Information
  – GMRB: Resource Broker
• Grid Core
  – Grid Core Services
  – Security
  – File and Database Access Service
  – Replica Management
• Fabric
  – Grid Resources
  – Data Source

Collaboration of GM-Services

Simple Scenario:

[Figure: simple scenario. Services: GMDIS (Integration), GMPPS (Pre Processing), GMDMS (Data Mining), GMPRS (Presentation). Data: Data Sources, Intermediate Result 1, Intermediate Result 2 (e.g. "flat table"), Intermediate Result 3 (e.g. PMML), Final Result.]

Collaboration (2)

Complex Scenarios:

[Figure: complex scenarios. Several workflow graphs that combine multiple instances of GMDIS, GMPPS, GMDMS, GMPRS, GMCMS and GMOMS.]

Workflow Management

• Motivation
  – highly complex and dynamic process
    • order of service execution
    • selection of services
    • sequential and parallel execution
  – long-running process
    • termination of client would terminate the workflow

=> Additional workflow layer needed!

Workflow Models

• Static workflows
• Dynamic workflows

Dynamic Workflows

[Figure: a DSCL document is passed to the DSCE, shown together with Service A, Service B, Service C and Service D.]

• Dynamic Service Control Language (DSCL)
  – based on XML
  – easy to use
• Dynamic Service Control Engine (DSCE)
  – processes workflow according to DSCL

Dynamic Service Control Language

• Features
  – Control flow
    • parallel execution of activities
    • sequential execution of activities
  – Activities
    • creation of new Grid Service Instances
    • invoking operations on Grid Service Instances
    • querying SDEs of Grid Service Instances
    • assigning and copying variables
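
The two control-flow features can be illustrated with a small Python sketch. This is not DSCL and not the DSCE: the names Activity, Sequence, Parallel and run are hypothetical, chosen only to show how sequential and parallel execution of activities might be composed. Each activity here is a plain Python callable standing in for a call to a Grid Service Instance.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class Activity:
    """A single unit of work (hypothetical stand-in for a service call)."""
    name: str
    action: Callable[[], object]


@dataclass
class Sequence:
    """Children run one after another, in list order."""
    children: List["Node"] = field(default_factory=list)


@dataclass
class Parallel:
    """Children run concurrently; the block ends when all have finished."""
    children: List["Node"] = field(default_factory=list)


Node = Union[Activity, Sequence, Parallel]

_results_lock = threading.Lock()


def run(node: Node, results: Dict[str, object]) -> None:
    """Execute a workflow tree and store each activity's result by name."""
    if isinstance(node, Activity):
        value = node.action()
        with _results_lock:
            results[node.name] = value
    elif isinstance(node, Sequence):
        for child in node.children:
            run(child, results)
    elif isinstance(node, Parallel):
        workers = max(1, len(node.children))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(run, child, results) for child in node.children]
            for future in futures:
                future.result()  # re-raises an exception from a failed child
    else:
        raise TypeError(f"unsupported node type: {type(node).__name__}")


# Illustrative workflow: one step, then two steps in parallel, then a final step.
workflow = Sequence([
    Activity("integrate", lambda: "flat table"),
    Parallel([
        Activity("mine_a", lambda: "model A"),
        Activity("mine_b", lambda: "model B"),
    ]),
    Activity("present", lambda: "report"),
])
results: Dict[str, object] = {}
run(workflow, results)
```

Nesting Sequence and Parallel nodes is one simple way to express both kinds of execution in a single description.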

[Figure: DSCL document structure (* = zero or more, ? = optional)]

dscl
  variables
    variable *
      value ?
  composition
    activity *
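
As a rough illustration of this structure, the following Python sketch builds and checks an XML document with the same element nesting. Only the element names dscl, variables, variable, value and composition come from the slide. The "name" attribute, the variable content, the activity element names and the rule that composition is mandatory are assumptions made for the example; the real DSCL syntax may differ.

```python
import xml.etree.ElementTree as ET
from typing import List


def build_example() -> ET.Element:
    """Build a small document following the dscl/variables/composition nesting."""
    root = ET.Element("dscl")
    variables = ET.SubElement(root, "variables")
    # The "name" attribute and the value text are hypothetical.
    variable = ET.SubElement(variables, "variable", {"name": "inputTable"})
    ET.SubElement(variable, "value").text = "patients"
    composition = ET.SubElement(root, "composition")
    # Hypothetical activity element names, one per activity kind.
    for tag in ("createService", "invoke", "querySDE"):
        ET.SubElement(composition, tag)
    return root


def check_structure(root: ET.Element) -> List[str]:
    """Return a list of structural problems; an empty list means none found."""
    problems: List[str] = []
    if root.tag != "dscl":
        problems.append("root element must be <dscl>")
    variables = root.find("variables")
    if variables is not None:
        for child in variables:
            if child.tag != "variable":
                problems.append(f"unexpected <{child.tag}> inside <variables>")
            elif len(child.findall("value")) > 1:
                problems.append("a <variable> may carry at most one <value>")
    # Assumption: a document without a composition is treated as incomplete.
    if root.find("composition") is None:
        problems.append("missing <composition>")
    return problems


document = build_example()
print(ET.tostring(document, encoding="unicode"))
print(check_structure(document))
```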

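DSCL - Example

[Figure: example DSCL document. A dscl root with variables and a composition; the composition holds three groups of activities:
  – createService, invoke, query SDE
  – createService, invoke, query SDE
  – createService, invoke]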

Dynamic Service Control Engine

• Features
  – processing of a DSCL document
  – parallelism
  – hiding complexity
  – delivery of intermediate results
  – status of executed service
  – caching mechanism included

Dynamic Service Control Engine

• Implementation
  – transient, stateful OGSA Grid Service
  – Operations
    • updateDSCL()
    • start()
    • stop()
    • resume()
  – SDE
    • activities
      – results, failures, states for each activity
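
The four operations suggest a simple lifecycle, sketched below in Python. This is a toy model and not the DSCE implementation: the class WorkflowEngine, the State values, the record and query_sde helpers, and the rule that the description can only be replaced while the workflow is not running are all assumptions made for illustration. The per-activity dictionary loosely mirrors the SDE with results, failures and states for each activity.

```python
from enum import Enum
from typing import Dict, Optional


class State(Enum):
    CREATED = "created"
    RUNNING = "running"
    STOPPED = "stopped"


class WorkflowEngine:
    """Toy stand-in for a workflow engine instance (hypothetical)."""

    def __init__(self) -> None:
        self.state = State.CREATED
        self.document: Optional[str] = None
        self._activities: Dict[str, dict] = {}

    def update_dscl(self, document: str) -> None:
        # Assumption: the description may only be replaced while not running.
        if self.state is State.RUNNING:
            raise RuntimeError("stop the workflow before updating its description")
        self.document = document

    def start(self) -> None:
        if self.document is None:
            raise RuntimeError("no workflow description has been supplied")
        if self.state is not State.CREATED:
            raise RuntimeError("workflow was already started")
        self.state = State.RUNNING

    def stop(self) -> None:
        if self.state is not State.RUNNING:
            raise RuntimeError("workflow is not running")
        self.state = State.STOPPED

    def resume(self) -> None:
        if self.state is not State.STOPPED:
            raise RuntimeError("only a stopped workflow can be resumed")
        self.state = State.RUNNING

    def record(self, activity: str, state: str,
               result: object = None, failure: Optional[str] = None) -> None:
        """Store the latest state, result and failure for one activity."""
        self._activities[activity] = {
            "state": state, "result": result, "failure": failure,
        }

    def query_sde(self) -> Dict[str, dict]:
        """Return a copy of the per-activity status, like reading an SDE."""
        return {name: dict(entry) for name, entry in self._activities.items()}


engine = WorkflowEngine()
engine.update_dscl("<dscl><composition/></dscl>")
engine.start()
engine.record("createService", "finished", result="handle-1")
engine.stop()
engine.resume()
print(engine.state, engine.query_sde())
```

Because the engine keeps its own state, a client could in principle disconnect after start() and query the activity status later, which matches the motivation for a separate workflow layer.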

DSCE - Architecture

[Figure: DSCE architecture. Components shown: Service Interface, Factory Interface, DSC Engine, DGS Invocation, Dynamic Invoker, Axis 1.1, Globus 3.0.]

Current and Future Work

• This is work in progress
• Additional Features
  – Notification Model
  – Exception Handling

Related Work

• BPEL4WS: Business Process Execution Language (BEA, IBM, Microsoft, SAP, Siebel)

• GSFL: Grid Services Flow Language (Krishnan, Wagstrom, Laszewski)

• Data Mining: Concepts and Techniques (Han)

• Anatomy of the Grid (Foster, Kesselman, Tuecke)

• Physiology of the Grid (Foster, Kesselman, Nick, Tuecke)

• Open Grid Services Infrastructure (Tuecke, Czajkowski, Foster)

Conclusions

• Dynamic Service Control is an approach that allows the service consumer to specify a workflow

• General approach – not only restricted to GridMiner