High Throughput Parallel Computing (HTPC)
Dan Fraser, UChicago
Greg Thain, Uwisc
The two familiar HPC Models

- High Throughput Computing: run ensembles of single-core jobs
- Capability Computing: a few jobs parallelized over the whole system, using whatever parallel s/w is on the system
HTPC – an emerging model

- Ensembles of small-way parallel jobs (10's – 1000's)
- Use whatever parallel s/w you want (it ships with the job)
Current HTPC Mechanism
- Submit jobs that consume an entire processor, typically 8 cores (today)
- Package jobs with a parallel library (schedd)
- HTPC jobs are as portable as any other job: MPI, OpenMP, your own scripts, …
- Parallel libraries can be optimized for on-board memory access; all memory is available for efficient utilization
- Some jobs just need more memory
- Submit jobs via WMS glidein on the OSG; a separate list of resources for HTPC (by VO)

https://twiki.grid.iu.edu/bin/view/Documentation/HighThroughputParallelComputing
Where we are heading
Condor 7.6.x now supports:

- Node draining
- Partial node usage (e.g. 4 out of 8 cores)
- A fix for a job priority issue with HTPC jobs

Now being tested at Wisconsin (CHTC). The plan is to create a "best practice" template to help with Condor configuration; it will be added to the RPM CE doc and is currently in the Pacman OSG CE doc.
Configuring Condor for HTPC
Two strategies:

- Suspend/drain jobs to open HTPC slots
- Hold empty cores until an HTPC slot is open

http://condor-wiki.cs.wisc.edu

See the HTPC twiki for the straightforward PBS setup:
https://twiki.grid.iu.edu/bin/view/Documentation/HighThroughputParallelComputing
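For the suspend-style strategy, a condensed sketch of the startd configuration, loosely based on the whole-machine recipe on the condor-wiki page above. The doubled resource counts, the slot-9 numbering, and the exact SUSPEND/CONTINUE expressions are assumptions for an 8-core node, not a drop-in config:

# condor_config sketch (assumes an 8-core node; see the condor-wiki
# recipe for the full, authoritative version)

# Double-count resources so eight single-core slots and one
# whole-machine slot can overlap on the same hardware.
NUM_CPUS = $(DETECTED_CORES)*2
MEMORY = $(DETECTED_MEMORY)*2

# Slot type 1: eight single-core slots. Slot type 2: one slot
# spanning the whole machine (it becomes slot 9 here).
SLOT_TYPE_1 = cpus=1
NUM_SLOTS_TYPE_1 = $(DETECTED_CORES)
SLOT_TYPE_2 = cpus=$(DETECTED_CORES), mem=$(DETECTED_MEMORY)
NUM_SLOTS_TYPE_2 = 1

# Advertise the attribute the submit files match against.
CAN_RUN_WHOLE_MACHINE = True
STARTD_ATTRS = $(STARTD_ATTRS) CAN_RUN_WHOLE_MACHINE

# Whole-machine jobs may only match the whole-machine slot, and vice versa.
START = ($(START)) && ifThenElse(SlotID == 9, \
    TARGET.RequiresWholeMachine =?= True, \
    TARGET.RequiresWholeMachine =!= True)

# Publish slot state across slots, then suspend single-core jobs
# while the whole-machine slot is busy, resuming them afterwards.
# (Simplified: this replaces, rather than extends, the default policy.)
STARTD_SLOT_ATTRS = State, Activity
SUSPEND = (SlotID != 9) && (Slot9_Activity =?= "Busy")
CONTINUE = (Slot9_Activity =!= "Busy")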
How to submit
universe = vanilla
requirements = (CAN_RUN_WHOLE_MACHINE =?= TRUE)
+RequiresWholeMachine = true
executable = some_job
arguments = arguments
should_transfer_files = yes
when_to_transfer_output = on_exit
transfer_input_files = inputs
queue
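Before submitting, it may help to confirm that some machines in the pool actually advertise the attribute the requirements expression asks for. A quick check along these lines (the submit-file name is hypothetical):

# list slots advertising the whole-machine attribute
condor_status -constraint 'CAN_RUN_WHOLE_MACHINE =?= TRUE'
# then submit the file above
condor_submit whole_machine_job.sub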
MPI on Whole machine jobs
universe = vanilla
requirements = (CAN_RUN_WHOLE_MACHINE =?= TRUE)
+RequiresWholeMachine = true
executable = mpiexec
arguments = -np 8 real_exe
should_transfer_files = yes
when_to_transfer_output = on_exit
transfer_input_files = real_exe
queue

Whole-machine MPI submit file
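MPI is not required: since the job owns the whole machine, the same submit pattern works for OpenMP or plain multi-threaded codes. A hedged sketch, where the executable name and thread count are assumptions rather than anything from the slides:

universe = vanilla
requirements = (CAN_RUN_WHOLE_MACHINE =?= TRUE)
+RequiresWholeMachine = true
# hypothetical multi-threaded binary; OpenMP reads the
# thread count from the environment
executable = openmp_app
environment = "OMP_NUM_THREADS=8"
should_transfer_files = yes
when_to_transfer_output = on_exit
queue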
How to submit to OSG
universe = grid
# gt2 gatekeeper syntax; "some_grid_host" stands in for the site's CE
grid_resource = gt2 some_grid_host/jobmanager
# MagicRSL is a placeholder, filled in per site (see "What's the magic RSL?" below)
globus_rsl = MagicRSL
executable = wrapper.sh
arguments = arguments
should_transfer_files = yes
when_to_transfer_output = on_exit
transfer_input_files = inputs
transfer_output_files = output
queue
Chemistry Usage Example of HTPC
What’s the magic RSL?
It is site-specific; documentation is in the OSG CE docs.

PBS: (host_xcount=1)(xcount=8)(queue=?)
LSF: (queue=?)(exclusive=1)
Condor: (condorsubmit=('+WholeMachine' true))
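Putting the grid-universe submit file together with the PBS entry above, the filled-in version might look like the following sketch (gatekeeper hostname and jobmanager name are hypothetical; the RSL is the PBS line from this slide):

universe = grid
grid_resource = gt2 ce.example.edu/jobmanager-pbs
globus_rsl = (host_xcount=1)(xcount=8)
executable = wrapper.sh
arguments = arguments
should_transfer_files = yes
when_to_transfer_output = on_exit
transfer_input_files = inputs
transfer_output_files = output
queue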
What’s with the wrapper?
- chmod the executable
- Create output files

#!/bin/sh
# make the shipped binary executable, pre-create the output file,
# then run the bundled MPI launcher on all 8 cores
chmod 0755 real.ex
touch output
./mpiexec -np 8 real.ex
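Since the wrapper runs ./mpiexec, the MPI launcher has to travel with the job rather than being found on the worker node; a sketch of the matching transfer line in the submit file (file names taken from the script above):

transfer_input_files = mpiexec, real.ex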
Next steps …?