Cluster Computing at IQSS
Alex Storer, Research Technology Consultant
What is the RCE?
• Research Computing Environment
• For research in the social sciences
• Get an account! [email protected]
How do I access it?
What are your needs?
• Gigantic Process
• Many Processes
Gigantic Process: RCE Powered Applications
• Request up to 256 GB of RAM
• Run a job for up to 5 days*
• Graphical/windowed experience of Stata, Matlab, etc.
Many Processes
• One process: Input → Output
• Many processes: Input 1 → Output 1, Input 2 → Output 2, Input 3 → Output 3, Input 4 → Output 4, Input 5 → Output 5
Many Processes: Condor
• Schedules which jobs go to which available machines
• Called from the command line
• Reads in 'submit' files
Example: Simulating
• You have a model that takes 30 minutes to run and computes a result
• You want to establish confidence intervals for this number by running it many times
Example .submit file
• What command do I run?
• What arguments do I give to the command?
• What input do I give to the command?
• Where do I save the outputs?
• How many times do I run this?
Example .submit file
/usr/bin/R --no-save --vanilla < simulate.R > out.0
/usr/bin/R --no-save --vanilla < simulate.R > out.1
...
/usr/bin/R --no-save --vanilla < simulate.R > out.9
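The questions above map onto individual lines of a submit file. The following is a minimal sketch rather than the original slide: the Universe, Error, and Log lines follow the pattern of the Stata example in these slides, and the file names error.$(Process) and simulate.log are assumptions chosen for illustration.

```
# simulate.submit (hypothetical reconstruction, not the original slide)
Universe   = vanilla
# What command do I run?
Executable = /usr/bin/R
# What arguments do I give to the command?
Arguments  = --no-save --vanilla
# What input do I give to the command? (sent to R on standard input)
Input      = simulate.R
# Where do I save the outputs? $(Process) is 0, 1, 2, ... for each queued job
Output     = out.$(Process)
# Assumed names for the error and log files
Error      = error.$(Process)
Log        = simulate.log
# How many times do I run this?
Queue 10
```

With Queue 10, $(Process) takes the values 0 through 9, so standard output lands in out.0 through out.9.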
Unix Standard Input/Output
How to submit a file
condor_submit simulate.submit
Your out.0 file
It's just everything that R writes to the screen from the script!
Example: Simulating
• Instead of using the out.$(Process) structure, you can save the data in your script
• You cannot expect the processes to complete in order!
• You shouldn't write to the same file until all processes are complete
• Instead of calling a script, use a function
Example: Simulating (with a function)
• procid is an input
• We tell the function what to save and where to save it
Example: submitting a function in batch
• Execute this command in R. Specifically, run the simfunction.R file which we defined on the previous slide.
• Call the function sim.function with the input as $(Process): sim.function(0), sim.function(1), …
• We are no longer using the standard input and standard output, so we can leave these blank.
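One way the submit file for this function call could look is sketched below. It is an assumption, not the original slide: it uses R's -e option to run an expression, the submit file name and the Error and Log names are hypothetical, and simfunction.R is assumed to sit in the directory the job is submitted from. Inside the double-quoted Arguments string, the R expression is wrapped in single quotes and each literal single quote is doubled.

```
# simfunction.submit (hypothetical name; a sketch, not the original slide)
Universe   = vanilla
Executable = /usr/bin/R
# -e runs the quoted R expression: load the file, then call the function
# with this job's process number
Arguments  = "--no-save --vanilla -e 'source(''simfunction.R''); sim.function($(Process))'"
# Input and Output are left out because the function saves its own results
Error      = error.$(Process)
Log        = simfunction.log
Queue 10
```

Each of the 10 queued jobs then calls sim.function with a different value, from sim.function(0) to sim.function(9).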
MATLAB Example
• An example where we need to do the same thing to a number of data files and write out the results
• Call a function which knows how to map the process ID to the data to load
MATLAB function
• We will pass $(Process) as the function input
• Try to load: data_0.mat, data_1.mat, etc.
• Compute the relevant result.
• Save the results as: result_0.mat, result_1.mat, etc.
.submit File Example
The Arguments section is the most important; let's look at each piece individually.
Arguments = "-nodisplay -singleCompThread -r ''"
• Start the arguments with double quotes (").
• -nodisplay tells Matlab not to pop up the GUI.
• -singleCompThread tells Matlab to use only one core (this is what Condor expects).
• -r tells Matlab to execute whatever commands come next.
• Put the commands to run in single quotes.
.submit File Example
• Do NOT try to write your entire Matlab script in the submit file!
• Some arguments must be executed before calling your script, however…
Arguments = "-nodisplay -singleCompThread -r ''"
• The commands to Matlab go inside the single quotes. They must be on a single line!
.submit File Example
Arguments = "-nodisplay -singleCompThread -r ''"
setenv(''HOME'',''nfs/home/A/astorer''); cd(''/nfs/home/A/astorer/Work/outreach/matlab''); mytest($(PROCESS))
• setenv is required for Matlab to load your local preferences. You must use two single quotes instead of one single quote. Remember to set your own home directory, e.g. nfs/home/J/jdoe
• Change to the directory that contains the script you want to run.
• Finally, run the function on the $(PROCESS) variable.
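Putting the pieces together, a complete submit file might look like the sketch below. The Arguments line assembles the flags and commands described above, with the HOME value and directory copied as written on the slides. The submit file name, the Matlab executable path, the Output, Error, and Log names, and the Queue count are all assumptions for illustration; substitute the values for your own system.

```
# mytest.submit (hypothetical name; a sketch assembled from the pieces above)
Universe   = vanilla
# Assumed location of the Matlab binary; check the path on your system
Executable = /usr/local/bin/matlab
# Whole value in double quotes, Matlab commands in single quotes,
# literal single quotes doubled, everything on one line
Arguments  = "-nodisplay -singleCompThread -r 'setenv(''HOME'',''nfs/home/A/astorer''); cd(''/nfs/home/A/astorer/Work/outreach/matlab''); mytest($(PROCESS))'"
# Assumed names for the screen output, error, and log files
Output     = mytest.$(Process).out
Error      = mytest.$(Process).err
Log        = mytest.log
# Assumed number of data files; jobs are numbered 0 through 4
Queue 5
```

With Queue 5, Condor runs mytest(0) through mytest(4), one call per job.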
Example submission
• Because our function reads in data, we generate the data ahead of time
• This is what is in our directory before submitting (ls lists directory contents)
• Notice that we count starting from 0!
Example submission
• Use condor_submit to submit the file
• Depending on the job, this may take some time to complete!
• Use condor_q <username> to check the status of your jobs
• When this returns with no result, your jobs are complete.
• Use condor_rm <username> to clear your jobs.
Example submission
Results!
Stata Example
Universe = vanilla
Executable = /usr/local/bin/stata-mp
Arguments = do
notification = Complete
notify_user = [email protected]
input = Test.do
output = Test.out
error = Test.err
Log = Test.log
Queue 1
This is like running stata-mp do Test.do
Notification!
notification = Complete
notify_user = [email protected]
You can get e-mails when your job is done!
Debugging
• If your results aren't as expected, first check the error files
My jobs never finish?!
• Sometimes, jobs aren't well formed and Condor won't know what to do
• Condor will hold these jobs
• Your submit file is probably wrong somehow; try looking at the log file as well as the submit file
• In the condor_q status column, H stands for "Held"