GRID superscalar: a programming paradigm for GRID applications


Transcript of GRID superscalar: a programming paradigm for GRID applications

Page 1: GRID superscalar: a programming paradigm  for GRID applications

SC 2004, Pittsburgh, Nov. 6-12

GRID superscalar: a programming paradigm for GRID applications

CEPBA-IBM Research Institute

Rosa M. Badia, Jesús Labarta, Josep M. Pérez, Raül Sirvent

Page 2: GRID superscalar: a programming paradigm  for GRID applications


Outline

• Objective
• The essence
• User’s interface
• Automatic code generation
• Run-time features
• Programming experiences
• Ongoing work
• Conclusions

Page 3: GRID superscalar: a programming paradigm  for GRID applications


Objective

• Ease the programming of GRID applications

• Basic idea:

[Figure: block diagram of a superscalar processor (L3 directory/control, L2 caches, LSU/IFU/BXU/IDU/FPU/FXU/ISU units) next to a Grid of machines — the same run-time concepts, but operations take nanoseconds in the processor and seconds/minutes/hours on the Grid.]

Page 4: GRID superscalar: a programming paradigm  for GRID applications


Outline

• Objective
• The essence
• User’s interface
• Automatic code generation
• Current run-time features
• Programming experiences
• Future work
• Conclusions

Page 5: GRID superscalar: a programming paradigm  for GRID applications


The essence

• Assembly language for the GRID
  – Simple sequential programming, well-defined operations and operands
  – C/C++, Perl, …
• Automatic run-time “parallelization”
  – Use architectural concepts from microprocessor design
    • Instruction window (DAG), dependence analysis, scheduling, locality, renaming, forwarding, prediction, speculation, …

Page 6: GRID superscalar: a programming paradigm  for GRID applications


The essence

• Example main program — the task parameters are input/output files:

for (int i = 0; i < MAXITER; i++) {

newBWd = GenerateRandom();

subst (referenceCFG, newBWd, newCFG);

dimemas (newCFG, traceFile, DimemasOUT);

post (newBWd, DimemasOUT, FinalOUT);

if(i % 3 == 0) Display(FinalOUT);

}

fd = GS_Open(FinalOUT, R);

printf("Results file:\n"); present (fd);

GS_Close(fd);

Page 7: GRID superscalar: a programming paradigm  for GRID applications


The essence

[Figure: as the main program runs, the run-time builds the task graph — chains of Subst → DIMEMAS → EXTRACT tasks, periodic Display tasks and a final GS_open — and maps the tasks onto the CIRI Grid.]

Page 8: GRID superscalar: a programming paradigm  for GRID applications


The essence

[Figure: continuation of the previous slide — the same Subst/DIMEMAS/EXTRACT/Display task graph being executed on the CIRI Grid.]

Page 9: GRID superscalar: a programming paradigm  for GRID applications


Outline

• Objective
• The essence
• User’s interface
• Automatic code generation
• Run-time features
• Programming experiences
• Ongoing work
• Conclusions

Page 10: GRID superscalar: a programming paradigm  for GRID applications


User’s interface

• Three components:
  – Main program
  – Subroutines/functions
  – Interface Definition Language (IDL) file
• Programming languages: C/C++, Perl

Page 11: GRID superscalar: a programming paradigm  for GRID applications


User’s interface

• A typical sequential program
  – Main program:

for (int i = 0; i < MAXITER; i++) {
    newBWd = GenerateRandom();
    subst (referenceCFG, newBWd, newCFG);
    dimemas (newCFG, traceFile, DimemasOUT);
    post (newBWd, DimemasOUT, FinalOUT);
    if (i % 3 == 0) Display(FinalOUT);
}
fd = GS_Open(FinalOUT, R);
printf("Results file:\n");
present (fd);
GS_Close(fd);

Page 12: GRID superscalar: a programming paradigm  for GRID applications


User’s interface

• A typical sequential program
  – Subroutines/functions:

void dimemas(in File newCFG, in File traceFile, out File DimemasOUT)
{
    char command[500];

    putenv("DIMEMAS_HOME=/usr/local/cepba-tools");
    sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s", DimemasOUT, newCFG);
    GS_System(command);
}

void display(in File toplot)
{
    char command[500];

    sprintf(command, "./display.sh %s", toplot);
    GS_System(command);
}

Page 13: GRID superscalar: a programming paradigm  for GRID applications


User’s interface

• GRID superscalar programming requirements
  – Main program: open/close files with
    • GS_FOpen, GS_Open, GS_FClose, GS_Close
    • Currently required; next versions will implement a version of the C library functions with GRID superscalar semantics
  – Subroutines/functions (a hedged sketch follows this list)
    • Temporal files on the local directory, or ensure uniqueness of the name per subroutine invocation
    • GS_System instead of system
    • All input/output files required must be passed as arguments
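A minimal sketch of a worker function that follows these rules. Nothing below comes from the slides: the function name, the helper tool "mytool" and the temporary-file naming scheme are assumptions, and the GS_System prototype is taken to come from the GRID superscalar headers.

#include <stdio.h>
#include <unistd.h>

/* Hypothetical worker function; in the IDL it would be declared as
 *   void filter(in File input, out File result);
 * The generated stubs pass File parameters to the worker as file-name strings
 * (see the worker main file later). GS_System is declared in the GRID
 * superscalar headers (assumed). */
void filter(char *input, char *result)
{
    static int invocation = 0;
    char tmpName[64];
    char command[512];

    /* Temporal file kept in the local working directory, name unique per invocation */
    sprintf(tmpName, "tmp_filter_%d_%d.dat", (int)getpid(), invocation++);

    /* GS_System instead of system(), as required above; "mytool" is a placeholder */
    sprintf(command, "./mytool %s %s %s", input, tmpName, result);
    GS_System(command);
}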

Page 14: GRID superscalar: a programming paradigm  for GRID applications


User’s interface

• Gridifying the sequential program
  – CORBA-IDL-like interface:
    • In/Out/InOut files
    • Scalar values (in or out)
  – The subroutines/functions listed in this file will be executed in a remote server in the Grid

interface MC {
    void subst(in File referenceCFG, in double newBW, out File newCFG);
    void dimemas(in File newCFG, in File traceFile, out File DimemasOUT);
    void post(in File newCFG, in File DimemasOUT, inout File FinalOUT);
    void display(in File toplot);
};

Page 15: GRID superscalar: a programming paradigm  for GRID applications


Outline

• Objective
• The essence
• User’s interface
• Automatic code generation
• Run-time features
• Programming experiences
• Ongoing work
• Conclusions

Page 16: GRID superscalar: a programming paradigm  for GRID applications


Automatic code generation

[Diagram: gsstubgen reads app.idl and generates the client-side files app-stubs.c, app.h, app_constraints.cc, app_constraints_wrapper.cc and app_constraints.h, the server-side file app-worker.c, and app.xml; the user writes app.c (client) and app-functions.c (server).]

Page 17: GRID superscalar: a programming paradigm  for GRID applications


Sample stubs file

#include <stdio.h>
…
int gs_result;

void Subst(file referenceCFG, double seed, file newCFG)
{
    /* Marshalling/demarshalling buffers */
    char *buff_seed;

    /* Allocate buffers */
    buff_seed = (char *)malloc(atoi(getenv("GS_GENLENGTH")) + 1);

    /* Parameter marshalling */
    sprintf(buff_seed, "%.20g", seed);

    Execute(SubstOp, 1, 1, 1, 0, referenceCFG, buff_seed, newCFG);

    /* Deallocate buffers */
    free(buff_seed);
}
…

Page 18: GRID superscalar: a programming paradigm  for GRID applications


Sample worker main file

#include <stdio.h>
…
int main(int argc, char **argv)
{
    enum operationCode opCod = (enum operationCode)atoi(argv[2]);

    IniWorker(argc, argv);

    switch (opCod) {
        case SubstOp:
        {
            double seed;

            seed = strtod(argv[4], NULL);
            Subst(argv[3], seed, argv[5]);
        }
        break;
        …
    }

    EndWorker(gs_result, argc, argv);
    return 0;
}

Page 19: GRID superscalar: a programming paradigm  for GRID applications


Sample constraints skeleton file

#include "mcarlo_constraints.h"#include "user_provided_functions.h"

string Subst_constraints(file referenceCFG, double seed, file newCFG) {

string constraints = "";

return constraints;}

double Subst_cost(file referenceCFG, double seed, file newCFG) {

return 1.0;}…
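The user is meant to fill this skeleton in. Below is a hedged example written in the style of the Dimemas cost and constraint functions shown later in the deck: the constraint string and the cost formula are invented for illustration, while GS_Filesize and GS_GFlops are the run-time helpers that the later Dimemas example uses.

string Subst_constraints(file referenceCFG, double seed, file newCFG)
{
    /* Made-up requirement: a resource that advertises Perl in its SoftNameList */
    return "(member(\"Perl560\", other.SoftNameList))";
}

double Subst_cost(file referenceCFG, double seed, file newCFG)
{
    /* Rough, invented estimate: proportional to the size of the configuration file */
    return (GS_Filesize(referenceCFG) / 1000000.0) / GS_GFlops();
}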

Page 20: GRID superscalar: a programming paradigm  for GRID applications


Sample constraints wrapper file (1)

#include <stdio.h>
…

typedef ClassAd (*constraints_wrapper)(char **_parameters);
typedef double (*cost_wrapper)(char **_parameters);

// Prototypes
ClassAd Subst_constraints_wrapper(char **_parameters);
double Subst_cost_wrapper(char **_parameters);
…

// Function tables
constraints_wrapper constraints_functions[4] = { Subst_constraints_wrapper, … };
cost_wrapper cost_functions[4] = { Subst_cost_wrapper, … };

Page 21: GRID superscalar: a programming paradigm  for GRID applications


Sample constraints wrapper file (2)

ClassAd Subst_constraints_wrapper(char **_parameters)
{
    char **_argp;

    // Generic buffers
    char *buff_referenceCFG;
    char *buff_seed;

    // Real parameters
    char *referenceCFG;
    double seed;

    // Read parameters
    _argp = _parameters;
    buff_referenceCFG = *(_argp++);
    buff_seed = *(_argp++);

    // Datatype conversion
    referenceCFG = buff_referenceCFG;
    seed = strtod(buff_seed, NULL);

    string _constraints = Subst_constraints(referenceCFG, seed);

    ClassAd _ad;
    ClassAdParser _parser;
    _ad.Insert("Requirements", _parser.ParseExpression(_constraints));

    // Free buffers
    return _ad;
}

Page 22: GRID superscalar: a programming paradigm  for GRID applications


Sample constraints wrapper file (3)

double Subst_cost_wrapper(char **_parameters)
{
    char **_argp;

    // Generic buffers
    char *buff_referenceCFG;
    char *buff_seed;

    // Real parameters
    char *referenceCFG;
    double seed;

    // Read parameters
    _argp = _parameters;
    buff_referenceCFG = *(_argp++);
    buff_seed = *(_argp++);

    // Datatype conversion
    referenceCFG = buff_referenceCFG;
    seed = strtod(buff_seed, NULL);

    double _cost = Subst_cost(referenceCFG, seed);

    // Free buffers
    return _cost;
}

Page 23: GRID superscalar: a programming paradigm  for GRID applications


Binary building

[Diagram: the client binary is built from app.c, app-stubs.c, app_constraints.cc and app_constraints_wrapper.cc, linked with the GRID superscalar run-time on top of GT2; each server binary is built from app-functions.c and app-worker.c and relies on the GT2 services gsiftp and gram.]

Page 24: GRID superscalar: a programming paradigm  for GRID applications


Calls sequence without GRID superscalar

[Diagram: without GRID superscalar, app.c calls the functions in app-functions.c directly on the LocalHost.]

Page 25: GRID superscalar: a programming paradigm  for GRID applications


Calls sequence with GRID superscalar

[Diagram: with GRID superscalar, app.c calls app-stubs.c, which hands tasks to the GRID superscalar run-time (using app_constraints.cc and app_constraints_wrapper.cc); the run-time uses GT2 to reach the RemoteHost, where app-worker.c invokes the functions in app-functions.c.]

Page 26: GRID superscalar: a programming paradigm  for GRID applications


Outline

• Objective
• The essence
• User interface
• Automatic code generation
• Run-time features
• Programming experiences
• Ongoing work
• Conclusions

Page 27: GRID superscalar: a programming paradigm  for GRID applications


Run-time features

• Previous prototype over Condor and MW
• Current prototype over Globus 2.x, using the API
• File transfer, security, … provided by Globus
• Run-time implemented primitives (a minimal usage sketch follows)
  – GS_On, GS_Off
  – Execute
  – GS_Open, GS_Close, GS_FClose, GS_FOpen
  – GS_Barrier
  – Worker side: GS_System
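A minimal sketch of how these primitives might frame a main program. The argument lists of GS_On, GS_Off and GS_Barrier are assumptions (the slides only list the primitive names), the file names are placeholders, and the loop simply reuses the dimemas task from the earlier example.

#include <stdio.h>
#include "app.h"   /* header generated by gsstubgen (see the code generation slide) declares the task stubs */

int main(int argc, char **argv)
{
    char out[64];
    int i;

    GS_On();                      /* start the GRID superscalar run-time (assumed no-argument form) */

    for (i = 0; i < 10; i++) {
        sprintf(out, "dim_out_%d.txt", i);
        dimemas("nsend.cfg", "trace.trf", out);   /* each call becomes a task in the DAG */
    }

    GS_Barrier();                 /* block until all generated tasks have finished */

    GS_Off();                     /* shut the run-time down (assumed no-argument form) */
    return 0;
}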

Page 28: GRID superscalar: a programming paradigm  for GRID applications


Run-time features

• Data dependence analysis

• Renaming

• File forwarding

• Shared disks management and file transfer policy

• Resource brokering

• Task scheduling

• Task submission

• End of task notification

• Results collection

• Explicit task synchronization

• File management primitives

• Checkpointing at task level

• Deployer

• Exception handling

• Current prototype over Globus 2.x, using the API
• File transfer, security, … provided by Globus

Page 29: GRID superscalar: a programming paradigm  for GRID applications


Data-dependence analysis

• Data dependence analysis
  – Detects RaW, WaR and WaW dependencies based on file parameters
• Oriented to simulations, FET solvers, bioinformatics applications
  – Main parameters are data files
• The tasks’ Directed Acyclic Graph (DAG) is built from these dependencies (see the sketch after the figure)

[Figure: task DAG of the Monte Carlo example — Subst → DIMEMAS → EXTRACT chains, one per iteration, feeding a Display task every third iteration.]
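A minimal sketch (not the run-time's actual code) of how such a DAG can be built from each task's input and output file lists: the last writer of a file gives RaW and WaW edges, earlier readers give WaR edges. Task and file names loosely follow the Monte Carlo example; everything else is an assumption.

#include <cstdio>
#include <map>
#include <set>
#include <string>
#include <vector>

// One node of the task DAG: the files it reads and writes, and the tasks it depends on.
struct Task {
    std::string name;
    std::vector<std::string> inFiles;
    std::vector<std::string> outFiles;
    std::set<int> predecessors;          // DAG edges (indices of earlier tasks)
};

// Register a task: the last writer of each input gives a RaW edge, the last writer of
// each output a WaW edge, and earlier readers of each output give WaR edges.
static void addTask(std::vector<Task>& dag,
                    std::map<std::string, int>& lastWriter,
                    std::map<std::string, std::set<int> >& readers,
                    Task t)
{
    const int id = (int)dag.size();
    for (const std::string& f : t.inFiles) {                       // RaW
        std::map<std::string, int>::iterator w = lastWriter.find(f);
        if (w != lastWriter.end()) t.predecessors.insert(w->second);
        readers[f].insert(id);
    }
    for (const std::string& f : t.outFiles) {                      // WaW + WaR
        std::map<std::string, int>::iterator w = lastWriter.find(f);
        if (w != lastWriter.end()) t.predecessors.insert(w->second);
        for (int r : readers[f])
            if (r != id) t.predecessors.insert(r);
        lastWriter[f] = id;
        readers[f].clear();
    }
    dag.push_back(t);
}

int main()
{
    std::vector<Task> dag;
    std::map<std::string, int> lastWriter;
    std::map<std::string, std::set<int> > readers;

    // One iteration of the Monte Carlo example: Subst -> DIMEMAS -> EXTRACT
    addTask(dag, lastWriter, readers, Task{"Subst",   {"ref.cfg"},              {"new.cfg"},   {}});
    addTask(dag, lastWriter, readers, Task{"DIMEMAS", {"new.cfg", "trace.trf"}, {"dim.out"},   {}});
    addTask(dag, lastWriter, readers, Task{"EXTRACT", {"new.cfg", "dim.out"},   {"final.out"}, {}});

    for (std::size_t i = 0; i < dag.size(); i++) {
        std::printf("%s depends on:", dag[i].name.c_str());
        for (int p : dag[i].predecessors)
            std::printf(" %s", dag[p].name.c_str());
        std::printf("\n");
    }
    return 0;
}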

Page 30: GRID superscalar: a programming paradigm  for GRID applications


File-renaming

• WaW and WaR dependencies are avoidable with renaming (a sketch follows the figure)

while (!end_condition()) {
    T1 (…, …, "f1");
    T2 ("f1", …, …);
    T3 (…, …, …);
}

[Figure: each iteration i spawns tasks T1_i, T2_i, T3_i; T1 writes "f1" and T2 reads it, so successive iterations are linked by WaR and WaW dependences on "f1" — renaming each written version ("f1_1", "f1_2", …) removes them and lets the iterations overlap.]
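A hedged sketch of the renaming idea, not the actual run-time code: every write to a logical file gets a fresh version name, so later writers and readers no longer collide, while reads still see the latest version (the true RaW dependence).

#include <cstdio>
#include <map>
#include <string>

// Version counter and latest physical name per logical file.
static std::map<std::string, int> version;
static std::map<std::string, std::string> currentName;

// A task that writes "f1" gets a fresh physical name: "f1_1", "f1_2", ...
// This removes the WaW and WaR dependences between iterations.
static std::string renameOnWrite(const std::string& logical)
{
    const int v = ++version[logical];
    currentName[logical] = logical + "_" + std::to_string(v);
    return currentName[logical];
}

// A task that reads "f1" gets the latest version, keeping only the true RaW dependence.
static std::string nameOnRead(const std::string& logical)
{
    std::map<std::string, std::string>::const_iterator it = currentName.find(logical);
    return it != currentName.end() ? it->second : logical;
}

int main()
{
    // Two iterations of the loop above: T1 writes "f1", then T2 reads it.
    for (int i = 0; i < 2; i++) {
        std::printf("T1 writes %s\n", renameOnWrite("f1").c_str());
        std::printf("T2 reads  %s\n", nameOnRead("f1").c_str());
    }
    return 0;
}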

Page 31: GRID superscalar: a programming paradigm  for GRID applications


File forwarding

• File forwarding reduces the impact of RaW data dependencies

[Figure: instead of T2 waiting for the complete file f1 to be written and transferred, f1 is forwarded from T1 to T2 by a socket.]

Page 32: GRID superscalar: a programming paradigm  for GRID applications


File transfer policy

[Figure: client and two servers, each with its own working directory — T1 runs on server1 (f1 → f4) and T6 on server2 (f4 → f7); the input f1 is sent from the client to server1, the intermediate file f4 is moved from server1 to server2, and the result f7 is brought back to the client.]

Page 33: GRID superscalar: a programming paradigm  for GRID applications


Shared working directories

[Figure: when the servers share a working directory, T1 and T6 read and write f1, f4 and f7 in place and no transfers between the servers are needed.]

Page 34: GRID superscalar: a programming paradigm  for GRID applications


Shared input disks

[Figure: client and servers all mount shared input directories, so input files are read directly from the shared disk instead of being transferred.]

Page 35: GRID superscalar: a programming paradigm  for GRID applications


Disks configuration file

Shared directories:

khafre.cepba.upc.es    SharedDisk0  /app/DB/input_data
kandake0.cepba.upc.es  SharedDisk0  /usr/DB/inputs
kandake1.cepba.upc.es  SharedDisk0  /usr/DB/inputs

Working directories:

kandake0.cepba.upc.es  DiskLocal0   /home/ac/rsirvent/matmul-perl/worker_perl
kandake1.cepba.upc.es  DiskLocal0   /home/ac/rsirvent/matmul-perl/worker_perl
khafre.cepba.upc.es    DiskLocal1   /home/ac/rsirvent/matmul_worker/worker

Page 36: GRID superscalar: a programming paradigm  for GRID applications


Resource Broker

• Resource brokering
  – Currently not a main project goal
  – Interface between run-time and broker
  – A Condor resource ClassAd is built for each resource

Broker configuration file — one line per machine with the fields Machine, LimitOfJobs, Queue, WorkingDirectory, Arch, OpSys, GFlops, Mem, NCPUs, SoftNameList:

khafre.cepba.upc.es  3  none   /home/ac/rsirvent/DEMOS/mcarlo           i386     Linux  1.475  2587  4   Perl560 Dimemas23   (worker)
kadesh.cepba.upc.es  0  short  /user1/uni/upc/ac/rsirvent/DEMOS/mcarlo  powerpc  AIX    1.5    8000  16  Perl560 Dimemas23   (worker)
kandake.cepba.upc.es          /home/ac/rsirvent/McarloClAds                                                                  (localhost)

Page 37: GRID superscalar: a programming paradigm  for GRID applications


Resource selection (1)

• Cost and constraints are specified by the user, per IDL task
• The cost (time) of each task instance is estimated:

double Dimem_cost(file cfgFile, file traceFile)
{
    double time;

    time = (GS_Filesize(traceFile)/1000000) * f(GS_GFlops());
    return (time);
}

• A task ClassAd is built at run time for each task instance:

string Dimem_constraints(file cfgFile, file traceFile)
{
    return "(member(\"Dimemas\", other.SoftNameList))";
}

Page 38: GRID superscalar: a programming paradigm  for GRID applications


Resource selection (2)

• The broker receives requests from the run-time
  – The ClassAd library is used to match resource ClassAds with task ClassAds
  – If more than one resource matches, the broker selects the one that minimizes (a small sketch follows)

    f(t, r) = FT(r) + ET(t, r)

  – FT(r): file transfer time to resource r
  – ET(t, r): execution time of task t on resource r (using the user-provided cost function)
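A minimal sketch of that selection rule, assuming the sum above; the candidate list and the timing numbers are placeholders, and the real broker works on ClassAds rather than on a plain vector.

#include <cstdio>
#include <string>
#include <vector>

struct Candidate {
    std::string resource;
    double FT;   // estimated file transfer time to the resource (s)
    double ET;   // estimated execution time of the task there (user cost function)
};

// Pick the matching resource that minimizes f(t, r) = FT(r) + ET(t, r).
static const Candidate* selectResource(const std::vector<Candidate>& matches)
{
    const Candidate* best = 0;
    for (const Candidate& c : matches)
        if (best == 0 || c.FT + c.ET < best->FT + best->ET)
            best = &c;
    return best;
}

int main()
{
    // Placeholder numbers, just to exercise the rule.
    std::vector<Candidate> matches;
    matches.push_back(Candidate{"khafre.cepba.upc.es", 2.0, 30.0});
    matches.push_back(Candidate{"kadesh.cepba.upc.es", 8.0, 20.0});

    const Candidate* r = selectResource(matches);
    if (r) std::printf("selected %s (f = %.1f s)\n", r->resource.c_str(), r->FT + r->ET);
    return 0;
}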

Page 39: GRID superscalar: a programming paradigm  for GRID applications


Task scheduling

• Distributed between the Execute call, the callback function and the GS_Barrier call
• Possibilities
  – The task can be submitted immediately after being created
  – Task waiting for a resource
  – Task waiting for a data dependency
• A GS_Barrier primitive before ending the program waits for all tasks

Page 40: GRID superscalar: a programming paradigm  for GRID applications


Task submission

• A task is submitted for execution as soon as its data dependencies are solved, if resources are available
• Composed of
  – File transfer
  – Task submission
• All specified in RSL
• A temporal directory is created in the server working directory for each task
• Calls to Globus:
  – globus_gram_client_job_request
  – globus_gram_client_callback_allow
  – globus_poll_blocking

Page 41: GRID superscalar: a programming paradigm  for GRID applications


End of task notification

• Asynchronous state-change callback monitoring system
  – globus_gram_client_callback_allow()
  – callback_func function
• Data structures are updated in the Execute function, the GRID superscalar primitives and GS_Barrier

Page 42: GRID superscalar: a programming paradigm  for GRID applications


Results collection

• Collection of output parameters which are not files (a hedged sketch follows)
  – Partial barrier synchronization (task generation from the main code cannot continue until this scalar result value is available)
• Socket and file mechanisms provided
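A hedged sketch of what this looks like from the main program. The task getStats, its IDL declaration and the pointer-based C signature for the out scalar are assumptions made for illustration; only the partial-barrier behaviour is taken from the slide.

#include <stdio.h>

/* Stand-in for a stub that gsstubgen would generate from a hypothetical IDL entry
 *   void getStats(in File results, out double mean);
 * Here it just fills a dummy value so the sketch is self-contained. */
void getStats(char *results, double *mean)
{
    *mean = 42.0;
}

int main(void)
{
    double mean;

    getStats("final_result.txt", &mean);

    /* Because mean is an out scalar, the run-time performs a partial barrier here:
     * the main program does not generate further tasks until the value has been
     * collected from the worker (by socket or by file). */
    printf("mean = %f\n", mean);
    return 0;
}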

Page 43: GRID superscalar: a programming paradigm  for GRID applications


GS_Barrier

• Implicit task synchronization – GS_Barrier
  – Inserted in the user main program when required
  – Main program execution is blocked
  – globus_poll_blocking() is called
  – Once all tasks are finished, the program may resume

Page 44: GRID superscalar: a programming paradigm  for GRID applications


File management primitives

• GRID superscalar file management API primitives: GS_FOpen, GS_FClose, GS_Open, GS_Close
• Mandatory for file management operations in the main program (a hedged usage sketch follows)
• Opening a file with the write option
  – Data dependence analysis
  – Renaming is applied
• Opening a file with the read option
  – Partial barrier until the task that generates that file as an output finishes
• Internally, file management functions are handled as local tasks
  – Task node inserted
  – Data-dependence analysis
  – Function locally executed
• Future work: offer a C library with GRID superscalar semantics (source code with the typical calls could then be used unchanged)
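A hedged usage sketch of these primitives in a main program. Only GS_Open(FinalOUT, R) appears verbatim in the slides; the W mode constant, GS_FOpen returning a stdio FILE* handle and GS_Open returning an integer descriptor are assumptions about the GRID superscalar headers.

#include <stdio.h>

void summarize(void)
{
    FILE *log;
    int fd;

    /* Write-open: handled as a local task — dependence analysis and renaming apply */
    log = GS_FOpen("summary.txt", W);
    fprintf(log, "run finished\n");
    GS_FClose(log);

    /* Read-open: partial barrier until the task producing the file has finished */
    fd = GS_Open("FinalOUT", R);
    /* … read/present the results through fd … */
    GS_Close(fd);
}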

Page 45: GRID superscalar: a programming paradigm  for GRID applications


Task level checkpointing

• Inter-task checkpointing
• Recovers sequential consistency in the out-of-order execution of tasks

[Figure: successful execution — tasks 0–6 in program order; tasks up to the checkpoint are committed, the following ones are completed or still running.]

Page 46: GRID superscalar: a programming paradigm  for GRID applications


Task level checkpointing

• Inter-task checkpointing
• Recovers sequential consistency in the out-of-order execution of tasks

[Figure: failing execution — when a task fails, the tasks that finished correctly before the checkpoint remain committed and the later completed or running tasks are cancelled.]

Page 47: GRID superscalar: a programming paradigm  for GRID applications


Task level checkpointing

• Inter-task checkpointing
• Recovers sequential consistency in the out-of-order execution of tasks

[Figure: restart — execution resumes after the last committed task, the tasks that had finished correctly are not redone, and execution continues normally.]

Page 48: GRID superscalar: a programming paradigm  for GRID applications


Checkpointing

• On fail: from N versions of a file to one version (last committed version)

• Transparent to application developer

Page 49: GRID superscalar: a programming paradigm  for GRID applications


Deployer

• Java-based GUI
• Allows worker specification: host details, libraries location, …
• Selection of the Grid configuration
• Grid configuration checking process:
  – Aliveness of the host (ping)
  – The Globus service is checked by submitting a simple test
  – Sends a remote job that copies the code needed on the worker and compiles it
• Automatic deployment
  – Sends and compiles the code on the remote workers and the master
• Configuration files generation

Page 50: GRID superscalar: a programming paradigm  for GRID applications


Deployer (2)

• Automatic deployment

Page 51: GRID superscalar: a programming paradigm  for GRID applications


Exception handling

• GS_Speculative_End(func) / GS_Throw

while (j < MAX_ITERS) {
    getRanges(Lini, BWini, &Lmin, &Lmax, &BWmin, &BWmax);
    for (i = 0; i < ITERS; i++) {
        L[i]  = gen_rand(Lmin, Lmax);
        BW[i] = gen_rand(BWmin, BWmax);
        Filter("nsend.cfg", L[i], BW[i], "tmp.cfg");
        Dimemas("tmp.cfg", "nsend_rec_nosm.trf", Elapsed_goal, "dim_out.txt");
        Extract("tmp.cfg", "dim_out.txt", "final_result.txt");
    }
    getNewIniRange("final_result.txt", &Lini, &BWini);
    j++;
}
GS_Speculative_End(my_func);   /* my_func: the function executed when an exception is thrown */

void Dimemas(char *cfgFile, char *traceFile, double goal, char *DimemasOUT)
{
    …
    putenv("DIMEMAS_HOME=/aplic/DIMEMAS");
    sprintf(aux, "/aplic/DIMEMAS/bin/Dimemas -o %s %s", DimemasOUT, cfgFile);
    gs_result = GS_System(aux);

    distance_to_goal = distance(get_time(DimemasOUT), goal);
    if (distance_to_goal < goal * 0.1) {
        printf("Goal Reached!!! Throwing exception.\n");
        GS_Throw;
    }
}

Page 52: GRID superscalar: a programming paradigm  for GRID applications


Exception handling (2)

• Any worker can call GS_Throw at any moment
• The task that raises the GS_Throw is the last valid task (all sequential tasks after it must be undone)
• The speculative part runs from the task that throws the exception until the GS_Speculative_End (no need for a Begin clause)
• Possibility of calling a local function when the exception is detected

Page 53: GRID superscalar: a programming paradigm  for GRID applications


Putting all together: involved files

• User provided files: app.idl, app.c, app-functions.c
• Files generated from the IDL: app.h, app-stubs.c, app-worker.c, app_constraints.h, app_constraints.cc, app_constraints_wrapper.cc
• Files generated by the deployer: broker.cfg, diskmaps.cfg

Page 54: GRID superscalar: a programming paradigm  for GRID applications


Outline

• Objective
• The essence
• User’s interface
• Automatic code generation
• Run-time features
• Programming experiences
• Ongoing work
• Conclusions

Page 55: GRID superscalar: a programming paradigm  for GRID applications


Programming experiences

• Performance modelling (Dimemas, Paramedir)
  – Algorithm flexibility
• NAS Grid Benchmarks
  – Improved component programs flexibility
  – Reduced Grid-level source code lines
• Bioinformatics application (production)
  – Improved portability (Globus vs. just LoadLeveler)
  – Reduced Grid-level source code lines
• Pblade solution for bioinformatics

Page 56: GRID superscalar: a programming paradigm  for GRID applications


Programming experiences

• fastDNAml
  – Computes the likelihood of various phylogenetic trees, starting with aligned DNA sequences from a number of species (Indiana University code)
  – Sequential and MPI (grid-enabled) versions available
  – Ported to GRID superscalar
    • Lower pressure on communications than MPI
    • Simpler code than MPI

[Figure: rounds of parallel tree-evaluation tasks separated by barriers.]

Page 57: GRID superscalar: a programming paradigm  for GRID applications


NAS Grid Benchmarks

[Figure: task graphs of the NAS Grid Benchmarks — each graph starts at a Launch task and ends at a Report task, with BT, SP, LU, MG and FT solver tasks connected through MF (file transfer) tasks in different patterns: independent chains of SP tasks, repeated BT → MG → FT chains, and LU/MG/FT and BT/SP/LU stages linked by MF transfers.]

Page 58: GRID superscalar: a programming paradigm  for GRID applications


NAS Grid Benchmarks

• All of them implemented with GRID superscalar
• Run with classes S, W, A
• Results scale as expected
• When several servers are used, ASCII mode is required

[Chart: MB.S — elapsed time (s) versus number of tasks on the Khafre and Kadesh8 machines.]

Page 59: GRID superscalar: a programming paradigm  for GRID applications


Programming experiences

• Performance analysis
  – GRID superscalar run-time instrumented
  – Paraver tracefiles from the client side
  – Measures of task execution time in the servers

Page 60: GRID superscalar: a programming paradigm  for GRID applications


Programming experiences

• Overhead of GRAM Job Manager polling interval

[Chart: Globus overhead for VP.W — per-task time (s) broken into task duration, "Active to Done" and "Request to Active" phases, showing the cost of the GRAM Job Manager polling interval.]

Page 61: GRID superscalar: a programming paradigm  for GRID applications


Programming experiences

• VP.S task assignment

[Figure: VP.S task assignment — BT, MG and FT tasks scheduled across the Kadesh and Khafre machines, with MF tasks performing the remote file transfers between them.]

Page 62: GRID superscalar: a programming paradigm  for GRID applications


Outline

• Objective
• The essence
• User’s interface
• Automatic code generation
• Run-time features
• Programming experiences
• Ongoing work
• Conclusions

Page 63: GRID superscalar: a programming paradigm  for GRID applications


Ongoing work

• OGSA-oriented resource broker, based on Globus Toolkit 3.x
• Bindings to Ninf-G2
• Binding to ssh/rsh/scp
• New language bindings (shell script)
• And more future work:
  – Bindings to other basic middlewares
    • GAT, …
  – Enhancements in the run-time performance, guided by the performance analysis

Page 64: GRID superscalar: a programming paradigm  for GRID applications


Conclusions

• Presentation of the ideas of GRID superscalar

• A viable way exists to ease the programming of Grid applications
• The GRID superscalar run-time enables
  – Use of the resources in the Grid
  – Exploiting the existing parallelism

Page 65: GRID superscalar: a programming paradigm  for GRID applications


More information

• GRID superscalar home page:

http://people.ac.upc.es/rosab/index_gs.htm

• Rosa M. Badia, Jesús Labarta, Raül Sirvent, Josep M. Pérez, José M. Cela, Rogeli Grima, “Programming Grid Applications with GRID Superscalar”, Journal of Grid Computing, Volume 1 (Number 2): 151-170 (2003).