SubmitR


• Introduction
• Brief overview of “SubmitR”
• Disclaimers:
– New to the R language
● Don't be surprised if confused by a question.
– SubmitR at version 1.0
● Open to feedback.
• SubmitR is designed to ease the task of running R scripts on a remote HPC system.
• Accessed by way of DiaGrid.
• Before we get into SubmitR, let's take a very quick look at DiaGrid.


• http://diagrid.org

• Gateway to computing resources

• Based on HUBzero platform

• Hub = Community:

– Wish list

– Forum

– Issue tracker

– Tools

• This site acts as a gateway to computing resources.
• Uses the HUBzero platform.
• Enables a community by providing things like forums and ticketing systems.
• Also provides for tools.


Hub Tools

• GUI programs

• Solve workflow problems

• Can be made easily available to non-Purdue users

• You can create tools

• Easily developed with Rappture

• Tools are applications that run embedded within the hub website itself.

• They can be published to a wide audience of users.

• SubmitR is a hub tool.

Use SubmitR for...

• Long runner – Submit and forget
• Parallel execution – Requires parallel library
• Parameter sweeps – Including Monte Carlo simulations

• Here are some of the use cases we see for SubmitR.
• Scripts with extreme runtimes.
• Scripts that need access to clusters for parallel execution.
• Or, if you want to invoke a large number of runs that sweep over a range of values.
• One thing we've been looking at is Monte Carlo simulations.
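As a rough illustration of the Monte Carlo use case, here is a minimal sketch of a script that a sweep could run once per seed. The function name `estimate_pi`, the seed values, and the draw count are all made up for this example. The loop at the bottom only stands in for what would be separate, isolated runs in a sweep.

```r
# Hypothetical Monte Carlo task: estimate pi from random points in the unit square.
# Each sweep run would get its own seed so the runs are independent and repeatable.
estimate_pi <- function(n_draws, seed) {
  set.seed(seed)
  x <- runif(n_draws)
  y <- runif(n_draws)
  # Fraction of points inside the quarter circle, scaled up to the full circle
  4 * mean(x^2 + y^2 <= 1)
}

# Stand-in for a sweep: in a real sweep each seed would be a separate run.
seeds <- 1:5
estimates <- vapply(seeds, function(s) estimate_pi(100000, s), numeric(1))
print(data.frame(seed = seeds, estimate = estimates))
```

Because each run depends only on its own seed, the runs never need to talk to each other, which is what makes this kind of job a sweep rather than a parallel job.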

Job Types

• Single: one process
• Parallel: multiple processes communicating with each other
• Sweep: many isolated processes
– Different parameter values on the command line, within a data file, or both

• And based on that, we split out some different job types:
– A single job for long runners and simple tests.
– A parallel run for processes that need to talk to each other...
● Of course you need to use a supporting parallel library.
– And a parameter sweep job...
● With sweep values that change either on the command line, or within a data file, or both.
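To give a feel for what "a supporting parallel library" looks like in a script, here is a minimal sketch using R's `parallel` package, which appears in the list of available libraries. The worker count of 2 and the `slow_square` function are arbitrary choices for illustration. The slides do not say how SubmitR maps workers onto cluster nodes, so treat this only as the general shape of a parallel script.

```r
library(parallel)

# Hypothetical unit of work; the sleep stands in for a real computation.
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Start two local worker processes (the count is an arbitrary choice here).
cl <- makeCluster(2)

# Spread the inputs over the workers; always shut the workers down afterwards.
results <- tryCatch(
  parLapply(cl, 1:8, slow_square),
  finally = stopCluster(cl)
)

print(unlist(results))
```

This is a simple master/worker pattern: the workers report back to the process that started them. Scripts whose processes exchange data directly would use a library that supports that.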

SubmitR

[Diagram: R Script & Files (individually, .tar.gz, or .zip) go in to SubmitR; Results / Output come back as results, logs, and uploaded files within a single .zip file]

• Here's the basic flow of SubmitR.
• Starting in the lower left...
• Use a browser to upload your R script and any input files.
• SubmitR then uses the hub's “submit” service to execute the script on available DiaGrid resources.
• When it's completed, the output is returned to SubmitR and then you can download it.

• Uploading files:
• Bring in your R script or scripts and any input files you might need.
• Upload files individually or in an archive file (zipped/tarred/gzipped). They are automatically extracted.
• Of course there's a practical limit to how much data you can upload and store.
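If you prefer to upload one archive instead of many files, you can build it from R itself. The sketch below is only an illustration: the file names, the `upload` folder, and the temporary staging directory are all made up. The slides do not say whether SubmitR expects files at the top level of the archive, so check how the extracted files look after uploading.

```r
# Stage some hypothetical files in a temporary directory so the sketch is self-contained.
staging <- tempfile("submitr_")
dir.create(file.path(staging, "upload"), recursive = TRUE)
old_wd <- setwd(staging)

writeLines("cat('hello from R\\n')", file.path("upload", "analysis.R"))
write.csv(data.frame(x = 1:3, y = c(2.5, 3.1, 4.8)),
          file.path("upload", "input.csv"), row.names = FALSE)

# Bundle the folder into a gzipped tar file using R's built-in tar implementation.
tar("upload.tar.gz", files = "upload", compression = "gzip", tar = "internal")

# List what ended up in the archive before uploading it.
archive_contents <- untar("upload.tar.gz", list = TRUE, tar = "internal")
setwd(old_wd)
print(archive_contents)
```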

• Setting up the job:
• Select the R script to run and specify any extra R command line options.
• This includes options for R itself and command line arguments that the script might use.
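For the script side of this, here is a minimal sketch of reading command line arguments with `commandArgs()`. The `name=value` argument format, the `parse_args` helper, and the parameter names `n` and `rate` are assumptions made for this example, not something SubmitR requires.

```r
# Hypothetical helper: turn arguments like "n=50" into a named list of strings.
parse_args <- function(args) {
  parts <- strsplit(args, "=", fixed = TRUE)
  values <- lapply(parts, function(p) p[2])
  names(values) <- vapply(parts, function(p) p[1], character(1))
  values
}

# Defaults apply when the script is run without arguments.
defaults <- list(n = "1000", rate = "0.5")

# trailingOnly = TRUE drops R's own options and keeps only the script's arguments.
args <- commandArgs(trailingOnly = TRUE)
params <- modifyList(defaults, parse_args(args))

n <- as.integer(params$n)
rate <- as.numeric(params$rate)
cat("n =", n, "rate =", rate, "\n")
```

A script written this way also works for sweeps, since each run can be handed different values for the same arguments.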

• Submit the job for execution.
• This sends the job out to queue up for a run.
• The log area indicates what we've done so far.
• The status line is updated periodically, every minute or so.
• It is OK to log off of diagrid.org and check back later. Just make sure to press “Keep for later” (instead of “Terminate”)!

• Review output of completed job:
• Standard output is collected and displayed in the output tab.
• If the job had spawned multiple runs, the output of each one would also appear here.

• Download everything in one zip file:
• This includes the redirected standard output as well as any output files and the uploaded R scripts and input files.
• Important! After the files are collected and downloaded, they're all deleted from the server.
• This clears the app so it can be set up for another run.

Available R Libraries

ElectroGraph GWASExactHW KernSmooth MASS Matrix PBSmapping base boot class cluster codetools compiler cubature datasets deldir foreign grDevices graphics
grid igraph lattice maptools methods mgcv mvtnorm ncf nlme nnet np parallel plotrix plyr qtl raster rgdal rgeos
rpart snow snowfall sp spatial spatstat splancs splines stats stats4 stpp stringr survival tcltk tools utils

• Process for requesting installation of new libraries
• (List as of Nov. 2012. More to be added soon.)
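Since the list changes over time, a script can check for the packages it needs before doing any real work. This is a minimal sketch; the helper name `missing_packages` and the package names in `required` are just examples.

```r
# Hypothetical helper: return the names of packages that cannot be loaded.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Example list of packages a script might depend on.
required <- c("parallel", "stats", "utils")
not_found <- missing_packages(required)

if (length(not_found) > 0) {
  cat("Packages not available:", paste(not_found, collapse = ", "), "\n")
} else {
  cat("All required packages are available.\n")
}
```

Running a check like this at the top of a script means a missing package shows up right away in the standard output instead of partway through a long run.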


• http://diagrid.org

[email protected]

• Questions?

Thank you!