Tutorial CST simulations on HTCondor

Tutorial CST simulations on HTCondor Sebastien Joly 06/04/2020


Page 2: Tutorial CST simulations on HTCondor

First connect to lxplus

You need to work in AFS!

On Linux, use the command: ssh -Y [email protected]

On Windows with PuTTY, use hostname: [email protected], port: 22

Create a directory named "HTCondor" in your AFS home folder for later use (command: mkdir HTCondor).

Example paths:

/afs/cern.ch/user/s/sjoly/HTCondor /!\ Max 10 GB quota in …/user/…

/afs/cern.ch/work/s/sjoly/HTCondor /!\ Max 100 GB quota in …/work/…

Then create 4 folders inside this directory, named output, error, log and results:

mkdir output error log results

Launch HTCondor jobs from this directory.
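The directory layout described on this slide can also be created in one step. The following is a minimal sketch and not part of the original slides: setup_condor_dirs is a hypothetical helper name, and the base directory is passed as an argument so that it can point to either the …/user/… or the …/work/… area.

```shell
# Sketch: create the HTCondor working directory and its four subfolders.
# setup_condor_dirs is a hypothetical helper name, not an HTCondor command.
setup_condor_dirs() {
    base_dir=$1
    for sub in output error log results; do
        # -p creates missing parent folders and does not fail if the folder exists
        mkdir -p "$base_dir/HTCondor/$sub" || return 1
    done
}

# Example call (path taken from the slide, adapt it to your own account):
# setup_condor_dirs /afs/cern.ch/user/s/sjoly
```

Because of mkdir -p, calling the helper a second time on the same directory is harmless.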

Page 3: Tutorial CST simulations on HTCondor

What you need for a HTCondor job

To run a job on HTCondor you need:

• CST file (no parameter sweep)

• Executable file (.sh file)

• Submit file (.sub file)

• (This tutorial)

Page 4: Tutorial CST simulations on HTCondor

How to export CST results as text files

Open the Post-processing top tab, then the Result Templates tab under Tools.

Page 5: Tutorial CST simulations on HTCondor

How to export CST results as text files

In the template based post-processing tab, select General 1D in the first drop-down list, then ASCII Export in the second drop-down list.

Tick Export All 1D Results and add a postfix of .txt.

CST then writes the results as .txt files into a ./simulation_name/Export folder, which is easier to access for transferring.

/!\ DO THIS STEP IF YOU WANT RESULTS TEXT FILES /!\

Page 6: Tutorial CST simulations on HTCondor

Executable (.sh) file

This file is a shell script that lists all the commands HTCondor will run.

You can run it locally in a terminal, but the simulation will take a very long time.

To understand it, let's look at an example.

Page 7: Tutorial CST simulations on HTCondor

Example .sh file (Part 1/3)

#!/bin/bash
export EOS_MGM_URL=root://eosuser.cern.ch
export HOME='/afs/cern.ch/user/s/sjoly/HTCondor'
cernbox_folder='/eos/user/s/sjoly/HTCondor'
file_name="example"

Don't forget the opening lines (#!/bin/bash and the EOS_MGM_URL export), they are needed to make the script work!

Change both HOME and cernbox_folder to your own AFS and EOS paths.

file_name should be the name of both the CST file and its folder.
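Both paths in this part of the script follow the same pattern: the first letter of the user name, then the user name itself. As a sketch under that assumption (it is inferred from the sjoly example only, and afs_dir and eos_dir are hypothetical helper names), the two values could be derived like this:

```shell
# Sketch: build the AFS and EOS paths used in the .sh file from a user name.
# Assumes the <first letter>/<user name> layout seen in the sjoly example.
afs_dir() {
    initial=$(printf '%s' "$1" | cut -c1)
    printf '/afs/cern.ch/user/%s/%s/HTCondor\n' "$initial" "$1"
}

eos_dir() {
    initial=$(printf '%s' "$1" | cut -c1)
    printf '/eos/user/%s/%s/HTCondor\n' "$initial" "$1"
}

afs_dir sjoly   # prints /afs/cern.ch/user/s/sjoly/HTCondor
eos_dir sjoly   # prints /eos/user/s/sjoly/HTCondor
```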

Page 8: Tutorial CST simulations on HTCondor

Example .sh file (Part 2/3)

# Copy the CST file from your EOS folder to your AFS folder
eos cp "$cernbox_folder"/"$file_name".cst "$HOME"/"$file_name".cst

# Change permissions on the CST file
chmod +x "$HOME"/"$file_name".cst

# Open the file with the CST PARTICLE STUDIO wakefield solver (-t -tw arguments)
/afs/cern.ch/project/parc/cst2018/cst_design_environment -t -tw "$HOME"/"$file_name".cst --numthreads 24

More CST arguments are listed on a later slide.

/!\ --numthreads MUST match the number of requested CPUs, here 24.
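A stray space inside a quoted variable silently changes a path, so it can help to print the commands before running them for real. The sketch below is an illustration only: the run wrapper, DRY_RUN and job_home are hypothetical names (job_home stands in for HOME so that the example does not change the HOME of your shell).

```shell
# Sketch: print the commands of the .sh file instead of executing them.
DRY_RUN=${DRY_RUN:-1}          # 1 = only print, 0 = really execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"     # show the command after variable expansion
    else
        "$@"
    fi
}

job_home='/afs/cern.ch/user/s/sjoly/HTCondor'
cernbox_folder='/eos/user/s/sjoly/HTCondor'
file_name="example"

run eos cp "$cernbox_folder"/"$file_name".cst "$job_home"/"$file_name".cst
run chmod +x "$job_home"/"$file_name".cst
```

With DRY_RUN left at 1, a misplaced space shows up directly in the printed path, for example as "HTCondor /example.cst".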

Page 9: Tutorial CST simulations on HTCondor

Example .sh file (Part 3/3)

# Optional: copy data files in text format to a separate directory for easy access
cd "$HOME"/results/
mkdir "$file_name"
cp "$HOME"/"$file_name"/Export/Particle* "$HOME"/results/"$file_name"

# Archive the output and copy it back to EOS
cd "$HOME"
tar -cvf "$file_name"_HTCondor.tar ./"$file_name" ./"$file_name".cst
eos cp ./"$file_name"_HTCondor.tar "$cernbox_folder"/"$file_name"_HTCondor.tar

exit 0

The first block copies all impedance and wake text files to a results folder.

The second block bundles the CST file and its folder into a tar archive (tar -cvf archives without compressing) and copies it back to the EOS folder.
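Before the archive is copied back to EOS, its contents can be listed to confirm that both the CST file and its folder were included. The sketch below reproduces the tar step in a throwaway directory with dummy files, so it runs without CST or EOS; the directory and the file contents are placeholders, not real simulation output.

```shell
# Sketch: reproduce the archive step on dummy data and inspect the result.
work_dir=$(mktemp -d)          # throwaway stand-in for "$HOME"
file_name="example"

mkdir -p "$work_dir/$file_name/Export"
echo "dummy" > "$work_dir/$file_name/Export/Particle_dummy.txt"
echo "dummy" > "$work_dir/$file_name.cst"

# Run tar in a subshell so the current directory of the caller is unchanged
(
    cd "$work_dir" || exit 1
    tar -cf "$file_name"_HTCondor.tar ./"$file_name" ./"$file_name".cst
)

# -t lists the archive contents without extracting anything
tar -tf "$work_dir/$file_name"_HTCondor.tar
```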

Page 10: Tutorial CST simulations on HTCondor

CST arguments

-m Starts CST MICROWAVE STUDIO

-t Starts CST PARTICLE STUDIO

-e Starts the eigenmode solver; only valid with -m

-tw Starts the wakefield solver; only valid with -t

Example to launch the CST PARTICLE STUDIO wakefield solver:

/afs/cern.ch/project/parc/cst2018/cst_design_environment -t -tw file_name.cst

For more arguments, run this line in an lxplus console:

/afs/cern.ch/project/parc/cst2018/cst_design_environment --help
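The flag pairs listed on this slide can be wrapped in a small lookup so that the solver is chosen by name. This is a sketch only: cst_flags is a hypothetical helper, and it covers just the two combinations named on the slide.

```shell
# Sketch: map a solver name to the CST flags listed on the slide.
cst_flags() {
    case "$1" in
        wakefield) echo "-t -tw" ;;   # CST PARTICLE STUDIO + wakefield solver
        eigenmode) echo "-m -e" ;;    # CST MICROWAVE STUDIO + eigenmode solver
        *) echo "unknown solver: $1" >&2; return 1 ;;
    esac
}

cst=/afs/cern.ch/project/parc/cst2018/cst_design_environment
# Print the command line that would be run (nothing is executed here)
echo "$cst $(cst_flags wakefield) file_name.cst"
```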

Page 11: Tutorial CST simulations on HTCondor

Submit (.sub) file

This file tells HTCondor to run the commands of your .sh file on its clusters.

To submit a job, use the command:

condor_submit example.sub

To understand it, let's look at an example.

Page 12: Tutorial CST simulations on HTCondor

Example .sub file (Complete)

executable = example.sh
RequestCpus = 24
+WCKey = CST
+JobFlavour = "workday"
environment = CST_INSTALLPATH="/afs/cern.ch/project/parc/cst2018"; CST_WAIT_FOR_LICENSE=on; CST_LICENSE_SERVER="1705@lxlicen01","1705@lxlicen02","1705@lxlicen03"; HOME="/afs/cern.ch/user/s/sjoly/HTCondor"
transfer_output_files = ""
output = output/output.$(ProcId).out
error = error/error.$(ProcId).err
log = log/log.$(ProcId).log
queue

RequestCpus is the number of requested CPUs; it should match --numthreads in the .sh file.

+JobFlavour sets the maximum run time of the job; more info on a later slide.

Don't forget to change the AFS path in HOME on the environment line.
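Since RequestCpus and --numthreads live in two different files, a quick comparison before submitting can catch a mismatch. The following is a minimal sketch: check_threads is a hypothetical helper, and it assumes the spellings used in this tutorial (a RequestCpus = N line in the .sub file and --numthreads N in the .sh file).

```shell
# Sketch: compare RequestCpus in the .sub file with --numthreads in the .sh file.
# check_threads is a hypothetical helper name, not an HTCondor command.
check_threads() {
    sub_file=$1
    sh_file=$2
    # First number following "RequestCpus =" in the submit file
    cpus=$(sed -n 's/^[[:space:]]*[Rr]equest[Cc]pus[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$sub_file" | head -n 1)
    # First number following "--numthreads" in the executable file
    threads=$(sed -n 's/.*--numthreads[[:space:]][[:space:]]*\([0-9][0-9]*\).*/\1/p' "$sh_file" | head -n 1)
    if [ -n "$cpus" ] && [ "$cpus" = "$threads" ]; then
        echo "OK: $cpus CPUs requested, $threads threads used"
    else
        echo "MISMATCH: RequestCpus=${cpus:-missing}, --numthreads=${threads:-missing}"
        return 1
    fi
}

# Example call, from the directory holding both files:
# check_threads example.sub example.sh
```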

Page 13: Tutorial CST simulations on HTCondor

CPUs and memory

A CPU core corresponds to 2 GB of memory and 20 GB of disk space.

By default, a job gets one slot with one CPU core.

To ask for more CPUs, add the following to the submit file:

RequestCpus = 4

This results in a slot of 4 CPUs, 8 GB of memory and 20 GB of disk.

It's possible to ask for more memory (value in MB), but I haven't tried it:

RequestMemory = 14000
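As a worked example of the figures on this slide, the memory that comes with a slot can be computed from the number of requested cores. This sketch assumes memory scales linearly at 2 GB per core, as the 4 CPU example suggests; slot_memory_gb is a hypothetical helper name.

```shell
# Sketch: memory that comes with a slot, using the 2 GB per core figure.
slot_memory_gb() {
    echo $(( $1 * 2 ))
}

slot_memory_gb 4    # 4 cores, as in the RequestCpus = 4 example: prints 8
slot_memory_gb 24   # 24 cores, as in the example .sub file: prints 48
```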

Page 14: Tutorial CST simulations on HTCondor

Job flavours

espresso = 20 minutes

microcentury = 1 hour

longlunch = 2 hours

workday = 8 hours

tomorrow = 1 day

testmatch = 3 days

nextweek = 1 week

To set the maximum runtime manually, place the following in your submit file:

+MaxRuntime = Number of seconds

(This allows runtimes longer than 1 week.)
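Because +MaxRuntime is given in seconds, it can be convenient to convert the flavour durations listed on this slide. The lookup below is a sketch: flavour_seconds is a hypothetical helper, and the durations are simply the ones from the slide multiplied out.

```shell
# Sketch: convert a job flavour name into its duration in seconds.
flavour_seconds() {
    case "$1" in
        espresso)     echo $(( 20 * 60 )) ;;
        microcentury) echo $(( 60 * 60 )) ;;
        longlunch)    echo $(( 2 * 60 * 60 )) ;;
        workday)      echo $(( 8 * 60 * 60 )) ;;
        tomorrow)     echo $(( 24 * 60 * 60 )) ;;
        testmatch)    echo $(( 3 * 24 * 60 * 60 )) ;;
        nextweek)     echo $(( 7 * 24 * 60 * 60 )) ;;
        *) echo "unknown flavour: $1" >&2; return 1 ;;
    esac
}

# Two weeks expressed in seconds, as a value for +MaxRuntime
echo "+MaxRuntime = $(( 2 * $(flavour_seconds nextweek) ))"
```

The last line prints +MaxRuntime = 1209600, which is the kind of value needed for a job longer than the nextweek flavour.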

Page 15: Tutorial CST simulations on HTCondor

Output folders

• The "output" folder contains all the messages that CST would otherwise have printed in its "Progress" box. Very useful to see if a problem was caused by CST.

• The "error" folder contains any errors that would normally have been printed to the terminal if you had run the script yourself from the command line. A good place to check whether the .sh file triggered an error.

• The "log" folder contains the information HTCondor tracks for each job, including when it was submitted, started and stopped, its resource use, and where it ran. Can be interesting, but is less clear to read.
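When many jobs have run, most files in the "error" folder may be empty, and only the non-empty ones are worth opening. The sketch below lists them; list_job_errors is a hypothetical helper, and it assumes the error/error.N.err naming used in the example .sub file.

```shell
# Sketch: list the non-empty files in the error folder of a job directory.
# list_job_errors is a hypothetical helper name.
list_job_errors() {
    job_dir=$1
    found=0
    for f in "$job_dir"/error/*.err; do
        # -s is true only for files that exist and are not empty
        if [ -s "$f" ]; then
            echo "$f"
            found=1
        fi
    done
    # Succeed only if at least one non-empty error file was found
    [ "$found" -eq 1 ]
}

# Example call, from the HTCondor directory:
# list_job_errors . && echo "open the files listed above"
```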

Page 16: Tutorial CST simulations on HTCondor

Good practices

• Change permissions on both your executable and submit files:

chmod +x example.sh

chmod +x example.sub

• Try running your .sh file on lxplus before submitting it, to check that it works:

./example.sh

You can stop it once it reaches the CST step.

• You can follow the CST progress with:

nano ./file_name/Result/progress.log
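The checks on this slide can be combined with the folder layout from the start of the tutorial into one pre-submission test. This is a sketch under stated assumptions: preflight is a hypothetical helper, and it only verifies what the tutorial asks for (the four folders, an executable .sh file and a .sub file).

```shell
# Sketch: verify the job directory before calling condor_submit.
# preflight is a hypothetical helper name.
preflight() {
    job_dir=$1
    name=$2
    status=0
    for sub in output error log results; do
        [ -d "$job_dir/$sub" ] || { echo "missing folder: $sub"; status=1; }
    done
    [ -x "$job_dir/$name.sh" ]  || { echo "not executable: $name.sh"; status=1; }
    [ -f "$job_dir/$name.sub" ] || { echo "missing file: $name.sub"; status=1; }
    return "$status"
}

# Example call, from the HTCondor directory:
# preflight . example && condor_submit example.sub
```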

Page 17: Tutorial CST simulations on HTCondor

Useful HTCondor commands

Submit a job: condor_submit example.sub

Check the status of the job execution: condor_q

Cancel a job: condor_rm JOBID

Access the job's local files while it is running: condor_ssh_to_job JOBID

Check all the details about the job: condor_q -better JOBID

Check how the job was executed once it finishes: condor_q JOBID -l

Check how long the job was running: condor_history JOBID -limit 1 -af:h RemoteWallClockTime

