Frank Mueller North Carolina State University. 2 PIs & Funding NSF funding level: $550k NCSU: $60k...
Frank Mueller
North Carolina State University
PIs & Funding
NSF funding level: $550k
NCSU: $60k (ETF) + $20+k (CSC)
NVIDIA: donations ~$30k
PIs/co-PIs:
— Frank Mueller
— Vincent Freeh
— Helen Gu
— Xuxian Jiang
— Xiaosong Ma
Contributors:
— Nagiza Samatova
— George Rouskas
ARC Cluster: In the News
“NC State is Home to the Most Powerful Academic HPC in North Carolina” (CSC News, Feb 2011)
“Crash-Test Dummy For High-Performance Computing” (NCSU, The Abstract, Apr 2011)
“Supercomputer Stunt Double” (insideHPC, Apr 2011)
Purpose
Create a mid-size computational infrastructure to support research in areas such as:
Researchers Already Active
In the first week of public access:
Groups from within NCSU:
— CSC, ECE, Chem/Bio Engineering, Materials, Operations Research
ORNL
Tsinghua University, Beijing, China
System Overview
[System diagram. Labels: Front Tier, Mid Tier, Back Tier; Head/Login Nodes; Compute/Spare Nodes; I/O Nodes; Storage Array; SSD+SATA; Interconnect; IB Switch Stack; GEther Switch Stack; PFS Switch Stack]
Hardware
108 Compute Nodes
— 2-way SMPs with AMD Opteron 6128 processors with 8 cores per socket
— 16 cores per node!
— 32 GB DRAM per node
1728 compute cores available
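The core total follows directly from the node configuration. As a quick check in shell arithmetic (the variable names are illustrative only, and the aggregate DRAM figure is derived here rather than stated on the slide):

```shell
#!/bin/sh
# Figures taken from the slide; variable names are illustrative.
nodes=108
sockets_per_node=2
cores_per_socket=8
dram_per_node_gb=32

cores_per_node=$((sockets_per_node * cores_per_socket))
total_cores=$((nodes * cores_per_node))
total_dram_gb=$((nodes * dram_per_node_gb))

echo "cores per node: $cores_per_node"
echo "total cores:    $total_cores"
echo "total DRAM:     $total_dram_gb GB"
```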
Interconnects
Gigabit Ethernet
— interactive jobs, ssh, service
— Home directories
40 Gbit/s InfiniBand (OFED stack)
— MPI Communication
— Open MPI, MVAPICH
— IP over IB
GPUs
NVIDIA Tesla C2050 (1 login + 36 nodes)
— 448 Compute cores per GPU
— Peak GigaFLOPS 1030 SP / 515 DP
— Memory Amount 3GB
— Memory Interface 384-bit
— Memory Bandwidth (GB/sec) 144
NVIDIA GTX480 (10 nodes)
— 480 Compute cores per GPU
— Peak GigaFLOPS 1344.96 SP / 168 DP
— Memory Amount 3GB
— Memory Interface 384-bit
— Memory Bandwidth (GB/sec) 177.4
NVIDIA Tesla C2070 (2 nodes)
— 448 Compute cores per GPU
— Peak GigaFLOPS 1030 SP / 515 DP
— Memory Amount 6GB
— Memory Interface 384-bit
— Memory Bandwidth (GB/sec) 144
NVIDIA 1060 GTX (1 node)
NVIDIA 8800 GTX (1 node)
Solid State Drives
All 108 compute nodes equipped with OCZ RevoDrive 120GB SSD
— Read: Up to 540 MB/s
— Write: Up to 480 MB/s
— Sustained Write: Up to 400 MB/s
— Random Write 4KB (Aligned): 75,000 IOPS
File Systems
Available Today:
— NFS home directories over Gigabit Ethernet
— Local per-node scratch on spinning disks (ext3)
— Local per-node 120GB SSD (ext2)
In the future:
— Parallel File Systems
– Lustre
– Separate dedicated nodes are available for parallel filesystems
– 1 MDS + 4 clients
Are you interested in helping us set this up for your research projects?
Power Monitoring
Watts Up Pro
— Serial and USB available.
Connected in groups of:
— Mostly 4 nodes (sometimes just 3)
— 2x 1 node
– 1 w/ GPU
– 1 w/o GPU
Software Stack
Additional packages and libraries
— upon request but…
— Not free? You need to pay
— License required? You need to sign it
— Installation required? You need to
– Test it
– Provide install script
Check the ARC website (constantly changing)
Base System
64-bit Rocks 5.3 (based on CentOS)
Batch system:
— Torque/Maui (PBS)
All compilers and tools are available on the login nodes.
— gcc, gfortran, …
— Compute nodes share the same base OS and libraries as the login nodes.
MPI
Open MPI
— Operates over InfiniBand
— Integrated with BLCR
— Already in your default PATH
– mpicc
MVAPICH
— InfiniBand support
— Requires changes to your path. See ARC site.
OpenMP
The "#pragma omp" directive in C programs works.
gcc -fopenmp -o fn fn.c
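A minimal end-to-end sketch, assuming gcc with OpenMP support is on the PATH as the slides state. The file name fn.c matches the compile line above; the program body and the thread count of 4 are illustrative and not taken from the slides.

```shell
#!/bin/sh
# Write a tiny OpenMP test program (illustrative content) to fn.c.
cat > fn.c <<'EOF'
#include <stdio.h>
#include <omp.h>

int main(void) {
    /* Each thread in the parallel team prints its own id. */
    #pragma omp parallel
    {
        printf("hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}
EOF

# Compile with the flag from the slide; run only if the build succeeded.
if command -v gcc >/dev/null 2>&1 && gcc -fopenmp -o fn fn.c; then
    # OMP_NUM_THREADS sets the team size (4 is an arbitrary choice here).
    OMP_NUM_THREADS=4 ./fn
else
    echo "gcc with OpenMP support not available here" >&2
fi
```

Without -fopenmp the pragma is ignored and the omp_* calls fail to link, so a link error is usually the first sign that the flag was dropped.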
CUDA SDK
Ensure you are using a node with a GPU
— Several types available to fine-tune for your application's needs:
– Well-performing single or double precision devices.
Requires environment changes:
export PATH=".:~/bin:/usr/local/bin:/usr/bin:$PATH"
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:/usr/local/cuda/lib:$LD_LIBRARY_PATH"
export MANPATH="/usr/share/man:$MANPATH"
Or see site to make sure you have the latest paths…
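If these exports go into a shell startup file, re-sourcing it keeps prepending the same directories. One way to avoid that is a small helper that prepends a directory only when it is not already listed. This is a sketch: the function name prepend_path is hypothetical, and the CUDA directories are the ones shown on this slide, which may have changed since (check the ARC site).

```shell
#!/bin/sh
# prepend_path VAR DIR: put DIR at the front of the colon-separated
# variable VAR unless it is already listed. (Hypothetical helper; it
# uses the global scratch variables var, dir and cur.)
prepend_path() {
    var=$1
    dir=$2
    eval "cur=\${$var:-}"
    case ":$cur:" in
        *":$dir:"*) ;;                                # already present
        *) eval "$var=\"\$dir\${cur:+:\$cur}\"" ;;    # prepend
    esac
    export "$var"
}

prepend_path PATH /usr/local/cuda/bin
# lib first, then lib64, so that lib64 ends up in front as on the slide.
prepend_path LD_LIBRARY_PATH /usr/local/cuda/lib
prepend_path LD_LIBRARY_PATH /usr/local/cuda/lib64
```

Calling the helper twice with the same directory leaves the variable unchanged, so the file can be sourced repeatedly.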
PGI Compiler (Experimental)
Awaiting site license update.
export PATH=".:~/bin:/usr/local/bin:/usr/bin:$PATH"
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:/usr/local/cuda/lib"
export MANPATH="/usr/share/man"
Virtualization
Goal: To allow a user to request VMs from the batch system just like they would any other resource
— User gets full root access to each VM requested with complete control over that VM.
— VMs will share the same network or may be grouped together into private networks across single or multiple nodes.
Elegant VM creation scripts in place allow entire machine creation in a single line.
Job Submission
You cannot SSH to a compute node
You must use PBS to submit jobs
— Either as batch
— or interactively
Presently there are “hard” limits for job times and sizes. In general, please be considerate of other users and do not abuse the system.
There are special queues for nodes with a GPU
— As we add additional specialized resources even more queues will become available.
PBS Basics
On the login node:
— to submit a job: qsub …
— to list your jobs: qstat
— to list everyone’s jobs: qstat -a
— to delete/cancel/stop your job: qdel …
— to check node status: pbsnodes
qsub Basics
qsub -q cuda ... # job submitted to GPU/CUDA queue
qsub -l ncpus=4 ... # ask for four tasks (processors) -- packed as up to 16 tasks per node
qsub -l nodes=4:ppn=16 ... # job for four nodes with 16 processors on each node (64 tasks)
qsub -l nodes=2:ppn=1 -q cuda ... # job for two tasks on two nodes with GPU/CUDA support
qsub -l nodes=2,cput=00:05:00 ... # job for two tasks + 5 minutes CPU time
to submit interactive: qsub -I # one node, shell will open up
to submit interactive: qsub -I -l nodes=20 # two nodes w/ 20 tasks
to submit interactive: qsub -I -l host=compute-0-54.local #specifically on node 54
to submit interactive: qsub -I -l host=compute-0-54.local+compute-0-55.local #on 54+55
to submit interactive with X11: qsub -I -X ...
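The same resource options can be placed in a batch script as #PBS directive lines instead of on the command line. The sketch below writes such a script and shows the submit command in a comment. The job name, the file name job.pbs, the walltime value, and the program name ./my_mpi_app are placeholders; the actual queue limits on ARC are not given in these slides.

```shell
#!/bin/sh
# Write an example Torque/PBS batch script (file name is a placeholder).
cat > job.pbs <<'EOF'
#!/bin/sh
#PBS -N example_job
#PBS -l nodes=4:ppn=16
#PBS -l walltime=00:10:00
#PBS -j oe

# PBS starts the job in the home directory; move to the submit directory.
cd "$PBS_O_WORKDIR"

# $PBS_NODEFILE lists one line per allocated processor slot.
echo "running on $(sort -u "$PBS_NODEFILE" | wc -l) node(s)"

# Placeholder program; Open MPI picks up the allocation from PBS.
mpirun ./my_mpi_app
EOF

# Submit from the login node with:
#   qsub job.pbs
# and add "-q cuda" (or a "#PBS -q cuda" line) for GPU nodes.
```

Keeping the request in the script makes a run easier to repeat than retyping the -l options for each submission.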
Listing your nodes
Once your job begins, $PBS_NODEFILE points to a file that contains a list of your requested nodes.
Open MPI is already integrated with PBS. Simply using mpirun … will automatically use all requested processes directly from PBS.
For example, a CUDA programmer who wants to use 4 GPU nodes:
[dfiala@login-0-0 ~]$ qsub -I -lnodes=4:ppn=1 -qcuda
qsub: waiting for job 1774.arcs.csc.ncsu.edu to start
qsub: job 1774.arcs.csc.ncsu.edu ready
[dfiala@compute-0-2 ~]$ cat $PBS_NODEFILE
compute-0-2.local
compute-0-32.local
compute-0-35.local
compute-0-38.local
---SSHing between these nodes FROM the PBS session is allowed---
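Inside a job, the node file can be processed with ordinary shell tools. The sketch below defines two small helpers (the names list_nodes and count_nodes are hypothetical) and demonstrates them on a stand-in file, since $PBS_NODEFILE only exists inside a running job. With ppn greater than 1, a host appears once per processor slot, hence the sort -u.

```shell
#!/bin/sh
# list_nodes FILE: print each allocated host once.
list_nodes() {
    sort -u "$1"
}

# count_nodes FILE: number of distinct hosts in the allocation.
count_nodes() {
    list_nodes "$1" | wc -l | tr -d ' '
}

# Stand-in for $PBS_NODEFILE (host names copied from the session above,
# with one host repeated to mimic ppn > 1).
nodefile=$(mktemp)
printf '%s\n' compute-0-2.local compute-0-2.local \
              compute-0-32.local compute-0-35.local > "$nodefile"

echo "distinct nodes: $(count_nodes "$nodefile")"

# Inside a real job one could visit each node, for example:
#   for n in $(list_nodes "$PBS_NODEFILE"); do ssh "$n" hostname; done
```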
Handling problems
If you find a node that is giving you trouble please report it to the mailing list.
As a workaround, you can keep that node busy by queuing an empty job:
echo sleep 600 | qsub -l host=compute-0-100,walltime=1000
Hardware in Action
4 racks in server room
Running Large Jobs (and keeping cool)
While our new cluster is surely state of the art…
The cooling system isn’t.
Our “dual action” cooling solution for the state of the art cluster
State of the art cluster
Temperature Monitoring
It is the user’s responsibility to maintain room temperatures below 80 degrees while utilizing the cluster.
— ARC website has links to online browser-based temperature monitors.
— And the building staff have pagers that will alarm 24/7 when temperatures exceed the limit.
Connecting to ARC
ARC access is restricted to on-campus IPs only.
— If you are ever unable to log in (connection gets dropped immediately before authentication) then this is likely the cause.
Non-NCSU users may request remote access by providing a remote machine that their connections must originate from.
Summary
Your ARC Cluster@Home: What can I do with it?
Primary purpose: Advance Computer Science Research (HPC and beyond)
— Want to run a job over the entire machine?
— Want to replace parts of the software stack?
Secondary purpose: Service to sciences, engineering & beyond
— Vision: Have domain scientists work w/ Computer Scientists on code
http://moss.csc.ncsu.edu/~mueller/cluster/arc/
Equipment donations welcome
Ideas how to improve ARC? Let us know
— Qs? Send to mailing list (once you have an account)
— Request an account: email dfiala<at>ncsu.edu
– Research topic, abstract, and compute requirements/time
– Must include your unity ID
– NCSU Students: Advisor sends email as means of their approval
– Non-NCSU: same + preferred username + hostname (your remote login location)
Slides provided by David Fiala
Edited by Frank Mueller
Current as of May 11, 2011.