Introduction to Scientific Computing
Shubin Liu, Ph.D.
Renaissance Computing Institute (RENCI)
University of North Carolina at Chapel Hill
9/17/2007 Scientific Computing @ UNC 2
Agenda
• Introduction to High-Performance Computing
• Hardware Available
  – Servers, storage, file systems, etc.
• How to Access
• Programming Tools Available
  – Compilers & debugger tools
  – Utility libraries
  – Parallel computing
• Scientific Packages Available
• Job Management
• Hands-on Exercises
9/17/2007 Scientific Computing @ UNC 3
Course Goals
• An introduction to high-performance computing and UNC Research Computing
• Available Research Computing hardware facilities
• Available software packages and serial/parallel programming tools and utilities/libraries
• How to efficiently make use of Research Computing facilities on campus
9/17/2007 Scientific Computing @ UNC 4
Pre-requisites
• An account on the Emerald cluster
• UNIX Basics
Getting started: http://help.unc.edu/?id=5288
Intermediate: http://help.unc.edu/?id=5333
vi Editor: http://help.unc.edu/?id=152
Customizing: http://help.unc.edu/?id=208
Shells: http://help.unc.edu/?id=5290
ne Editor: http://help.unc.edu/?id=187
Security: http://help.unc.edu/?id=217
Data Management: http://help.unc.edu/?id=189
Scripting: http://help.unc.edu/?id=213
HPC Application: http://help.unc.edu/?id=4176
9/17/2007 Scientific Computing @ UNC 5
About Us
• ITS
  – http://its.unc.edu
  – Physical locations: 401 West Franklin Street; 211 Manning Drive
  – 12 Divisions: IT Infrastructure and Operations; Research Computing; Teaching and Learning; Technology Planning and Special Projects; Telecommunications; User Support and Engagement; Office of the Vice Chancellor; Communications; Enterprise Applications; Enterprise Data Management; Financial Planning and Human Resources; Information Security
• RENCI
  – http://www.renci.org/
  – Anchor site: 100 Europa Drive, Suite 540, Chapel Hill
  – A number of virtual sites on the campuses of Duke, NCSU, and UNC-Chapel Hill, and regional facilities across the state
  – Mission: to foster multidisciplinary collaborations; to enable advancements in science, industry, education, the humanities, and the arts; to provide technical leadership and expertise; to work hand-in-hand with businesses and communities to utilize advanced technologies
9/17/2007 Scientific Computing @ UNC 6
About Us
• Where/who are we and what do we do?
  – ITS Manning: 211 Manning Drive
  – Website: http://www.renci.org/unc/computing/
  – Groups: Infrastructure, Engagement, User Support
9/17/2007 Scientific Computing @ UNC 7
About Myself
• Ph.D. in Chemistry, UNC-CH
• Currently Senior Computational Scientist @ RENCI Engagement Center at UNC-Chapel Hill
• Responsibilities:
  – Support Comp Chem/Phys/Material Science software
  – Support programming (FORTRAN/C/C++) tools, code porting, parallel computing, etc.
  – Conduct research and engagement projects in Computational Chemistry
    • DFT theory and concepts
    • Systems in biological and material science
9/17/2007 Scientific Computing @ UNC 8
About You
• Name, department, interest?
• Any experience before with high-performance computing?
• What do you expect to use the Research Computing facilities for?
• What do you expect from this training course?
9/17/2007 Scientific Computing @ UNC 9
What is Scientific Computing?
• Simply put
  – To use high-performance computing (HPC) facilities to solve real scientific problems.
• From Wikipedia
  – Scientific computing (or computational science) is the field of study concerned with constructing mathematical models and numerical solution techniques and using computers to analyze and solve scientific and engineering problems. In practical use, it is typically the application of computer simulation and other forms of computation to problems in various scientific disciplines.
9/17/2007 Scientific Computing @ UNC 10
What is Scientific Computing?
[Diagram] From the scientific-discipline viewpoint, scientific computing sits at the intersection of the engineering sciences, the natural sciences, computer science, and applied mathematics. From the operational viewpoint, it spans an application layer, a theory/model layer, an algorithm layer, and a hardware/software layer. From the computing perspective, scientific computing is carried out through high-performance computing and parallel computing.
9/17/2007 Scientific Computing @ UNC 11
What is HPC?
• Computing resources which provide more than an order of magnitude more computing power than current top-end workstations or desktops – a generic, widely accepted definition.
• HPC ingredients:
  – large capability computers (fast CPUs)
  – massive memory
  – enormous (fast & large) data storage
  – highest-capacity communication networks (Myrinet, 10 GigE, InfiniBand, etc.)
  – specifically parallelized codes (MPI, OpenMP)
  – visualization
9/17/2007 Scientific Computing @ UNC 12
Why HPC?
• What are the three-dimensional structures of all of the proteins encoded by an organism's genome, and how does structure influence function, both spatially and temporally?
• What patterns of emergent behavior occur in models of very large societies?
• How do massive stars explode and produce the heaviest elements in the periodic table?
• What sort of abrupt transitions can occur in Earth's climate and ecosystem structure?
• How do these occur and under what circumstances?
• If we could design catalysts atom-by-atom, could we transform industrial synthesis?
• What strategies might be developed to optimize management of complex infrastructure systems?
• What kind of language processing can occur in large assemblages of neurons?
• Can we enable integrated planning and response to natural and man-made disasters that prevent or minimize the loss of life and property?
  – From the NSF Program Solicitation on HPC System Acquisition: Towards a Petascale Computing Environment for Science and Engineering, 2006
    http://www.nsf.gov/pubs/2005/nsf05625/nsf05625.htm
9/17/2007 Scientific Computing @ UNC 13
Machine                               LINPACK   Peak Performance
Intel Pentium 4 (2.53 GHz)             2355        5060
NEC SX-6/1 (1 proc., 2.0 ns)           7575        8000
HP rx5670 Itanium2 (1 GHz)             3528        4000
IBM eServer pSeries 690 (1300 MHz)     2894        5200
Cray SV1ex-1-32 (500 MHz)              1554        2000
Compaq ES45 (1000 MHz)                 1542        2000
AMD Athlon MP1800+ (1530 MHz)          1705        3060
Intel Pentium III (933 MHz)             507         933
SGI Origin 2000 (300 MHz)               533         600
Intel Pentium II Xeon (450 MHz)         295         450
Sun UltraSPARC (167 MHz)                237         333

1 CPU, units in MFLOPS
Reference: http://performance.netlib.org/performance/html/linpack.data.col0.html
MFLOPS: Measure of Performance
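As a rough illustration of where the peak numbers come from (peak = clock rate × floating-point operations per cycle): the 2.53 GHz Pentium 4 can retire 2 flops per cycle, giving 2 × 2530 = 5060 MFLOPS peak, of which the LINPACK run achieves only about half (2355 MFLOPS).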
9/17/2007 Scientific Computing @ UNC 14
TOP500
• A list of the 500 most powerful computer systems in the world
• Established in June 1993
• Compiled twice a year (June & November)
• Uses the LINPACK benchmark code (solving the linear algebra equation Ax = b)
• Organized by world-wide HPC experts, computational scientists, manufacturers, and the Internet community
• Homepage: http://www.top500.org
9/17/2007 Scientific Computing @ UNC 15
TOP500: June 2007

Rank  Installation Site / Year                                               Computer / Procs / Manufacturer                                                Rmax      Rpeak
1     DOE/NNSA/LLNL, United States / 2005                                    BlueGene/L – eServer Blue Gene Solution / 131,072 / IBM                        280,600   367,000
2     Oak Ridge National Laboratory, United States / 2006                    Jaguar – Cray XT4/XT3 / 23,016 / Cray Inc.                                     101,700   119,350
3     NNSA/Sandia National Laboratories, United States / 2006                Red Storm – Sandia/Cray Red Storm, Opteron 2.4 GHz dual core / 26,544 / Cray Inc.  101,400   127,411
4     IBM Thomas J. Watson Research Center, United States / 2005             BGW – eServer Blue Gene Solution / 40,960 / IBM                                 91,290   114,688
5     Stony Brook/BNL, New York Center for Computational Sciences, US / 2007 New York Blue – eServer Blue Gene Solution / 36,864 / IBM                       82,161   103,219
25    University of North Carolina, United States / 2007                     Topsail – PowerEdge 1955, 2.33 GHz, Cisco/Topspin InfiniBand / 4,160 / Dell     28,770    38,821.1

Top 5 plus UNC's Topsail; units in GFLOPS (1 GFLOPS = 1000 MFLOPS). Rmax is the achieved LINPACK performance; Rpeak is the theoretical peak.
9/17/2007 Scientific Computing @ UNC 16
Shared/Distributed-Memory Architecture
• Shared memory – single address space. All processors have access to a pool of shared memory. (Examples: chastity/zephyr, happy/yatta, cedar/cypress, sunny.) Methods of memory access: bus and crossbar.
• Distributed memory – each processor has its own local memory; message passing must be used to exchange data between processors. (Examples: Baobab, the new Dell cluster.)
[Diagram: shared memory – several CPUs attached to one memory over a common bus; distributed memory – CPU/memory pairs connected through a network]
9/17/2007 Scientific Computing @ UNC 17
What is a Beowulf Cluster?
• A Beowulf system is a collection of personal computers constructed from commodity off-the-shelf hardware components, interconnected with a system-area network, and configured to operate as a single unit, parallel computing platform (e.g., via MPI), using an open-source network operating system such as LINUX.
• Main components:
  – PCs running the LINUX OS
  – Inter-node connection with Ethernet, Gigabit Ethernet, Myrinet, InfiniBand, etc.
  – MPI (Message Passing Interface)
9/17/2007 Scientific Computing @ UNC 18
LINUX Beowulf Clusters
9/17/2007 Scientific Computing @ UNC 19
What is Parallel Computing?
• Concurrent use of multiple processors to process data
  – Running the same program on many processors
  – Running many programs on each processor
9/17/2007 Scientific Computing @ UNC 20
Advantages of Parallelization
• Cheaper, in terms of price/performance ratio
• Faster than equivalently expensive uniprocessor machines
• Handles bigger problems
• More scalable: the performance of a particular program may be improved by execution on a larger machine
• More reliable: in theory, if processors fail we can simply use others
9/17/2007 Scientific Computing @ UNC 21
Catch: Amdahl's Law
• Speedup = 1/(s + p/n), where s is the fraction of the code that must run serially, p = 1 - s is the fraction that can be parallelized, and n is the number of processors.
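For example, if 90% of a code can be parallelized (p = 0.9, s = 0.1), then on n = 16 processors the speedup is 1/(0.1 + 0.9/16) ≈ 6.4, and no matter how many processors are added the speedup can never exceed 1/s = 10.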
9/17/2007 Scientific Computing @ UNC 22
Parallel Programming Tools
• Shared-memory architecture
  – OpenMP
• Distributed-memory architecture
  – MPI, PVM, etc.
9/17/2007 Scientific Computing @ UNC 23
OpenMP
• An Application Program Interface (API) that may be used to explicitly direct multi-threaded, shared-memory parallelism
• What does OpenMP stand for?
  – Open specifications for Multi Processing, via collaborative work between interested parties from the hardware and software industry, government, and academia
• Comprised of three primary API components:
  – Compiler directives
  – Runtime library routines
  – Environment variables
• Portable:
  – The API is specified for C/C++ and Fortran
  – Implemented on multiple platforms, including most Unix platforms and Windows NT
• Standardized:
  – Jointly defined and endorsed by a group of major computer hardware and software vendors
  – Expected to become an ANSI standard eventually
9/17/2007 Scientific Computing @ UNC 24
OpenMP Example (FORTRAN)

      PROGRAM HELLO
      INTEGER NTHREADS, TID, OMP_GET_NUM_THREADS,
     +        OMP_GET_THREAD_NUM
C     Fork a team of threads, giving them their own copies of variables
!$OMP PARALLEL PRIVATE(TID)
C     Obtain and print thread id
      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Hello World from thread = ', TID
C     Only master thread does this
      IF (TID .EQ. 0) THEN
        NTHREADS = OMP_GET_NUM_THREADS()
        PRINT *, 'Number of threads = ', NTHREADS
      END IF
C     All threads join master thread and disband
!$OMP END PARALLEL
      END
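For C/C++ users, a minimal C translation of the same example might look like the following (a sketch only, not part of the original slides; it assumes an OpenMP-capable C compiler, e.g. icc with the -openmp flag mentioned later):

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int tid, nthreads;

    /* Fork a team of threads; each thread gets a private copy of tid */
    #pragma omp parallel private(tid)
        {
            /* Obtain and print the thread id */
            tid = omp_get_thread_num();
            printf("Hello World from thread = %d\n", tid);

            /* Only the master thread reports the team size */
            if (tid == 0) {
                nthreads = omp_get_num_threads();
                printf("Number of threads = %d\n", nthreads);
            }
        }   /* All threads join the master thread and disband */
        return 0;
    }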
9/17/2007 Scientific Computing @ UNC 25
The Message Passing Model
• Parallelization scheme for distributed memory
• Parallel programs consist of cooperating processes, each with its own memory
• Processes send data to one another as messages
• Messages can be passed around among compute processes
• Messages may have tags that may be used to sort messages
• Messages may be received in any order
9/17/2007 Scientific Computing @ UNC 26
MPI: Message Passing Interface
• Message-passing model
• Standard (specification)
  – Many implementations (almost every vendor has one)
  – MPICH and LAM/MPI from the public domain are most widely used
  – GLOBUS MPI for grid computing
• Two phases:
  – MPI 1: traditional message passing
  – MPI 2: remote memory, parallel I/O, and dynamic processes
• Online resources
  – http://www-unix.mcs.anl.gov/mpi/index.htm
  – http://www-unix.mcs.anl.gov/mpi/mpich/
  – http://www.lam-mpi.org/
  – http://www.mpi-forum.org
  – http://www-unix.mcs.anl.gov/mpi/tutorial/learning.html
9/17/2007 Scientific Computing @ UNC 27
A Simple MPI Code

C version:

    #include "mpi.h"
    #include <stdio.h>

    int main( int argc, char **argv )
    {
        MPI_Init( &argc, &argv );
        printf( "Hello world\n" );
        MPI_Finalize();
        return 0;
    }

FORTRAN version:

          include 'mpif.h'
          integer myid, ierr, numprocs
          call MPI_INIT( ierr )
          call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
          call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
          write(*,*) 'Hello from ', myid
          write(*,*) 'Numprocs is ', numprocs
          call MPI_FINALIZE( ierr )
          end
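The examples above only initialize and finalize MPI. As a sketch of actual message passing (not from the original slides), the following C fragment sends one integer from process 0 to process 1 when run with at least two processes (e.g., mpirun -np 2):

    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, number;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            number = 42;   /* arbitrary payload for illustration */
            /* send one int to destination rank 1 with tag 0 */
            MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* receive one int from rank 0 with matching tag 0 */
            MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            printf("Process 1 received %d from process 0\n", number);
        }

        MPI_Finalize();
        return 0;
    }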
9/17/2007 Scientific Computing @ UNC 28
Other Parallelization Models
• VIA: Virtual Interface Architecture – standards-based cluster communications
• PVM: a portable message-passing programming system, designed to link separate host machines to form a "virtual machine", which is a single, manageable computing resource. It is largely an academic effort, and there has been little development since the 1990s.
• BSP: Bulk Synchronous Parallel model, a generalization of the widely researched PRAM (Parallel Random Access Machine) model
• Linda: a concurrent programming model from Yale, with the primary concept of "tuple space"
• HPF: High Performance Fortran, a standard parallel programming language for shared- and distributed-memory systems (supported by PGI's pghpf compiler)
9/17/2007 Scientific Computing @ UNC 29
Research Computing Servers @ UNC-CH
• IBM P690 – SMP, 32 CPUs, happy/yatta
• SGI Altix 3700 – SMP, 128 CPUs, cedar/cypress
• AMD & Xeon LINUX cluster – distributed memory, 352 CPUs, Emerald (old Baobab)
• Dell LINUX cluster – distributed memory, 4160 CPUs, Topsail
9/17/2007 Scientific Computing @ UNC 30
IBM P690 (Regatta)
• IBM pSeries 690 Model 6C4, Power4+ Turbo, 32 × 1.7 GHz processors
• 32-CPU shared memory
• 128 GB memory
• 217 GFLOPS
• 8 × 146.8 GB local disk drives
• Access to 4 TB of NetApp NAS RAID array used for scratch space, mounted as /nas and /netscr
• OS: IBM AIX 5.3 Maintenance Level 04
• Login node: happy.isis.unc.edu
• Compute node: yatta.isis.unc.edu
• Will be retired soon
9/17/2007 Scientific Computing @ UNC 31
SGI Altix 3700
• Servers for scientific applications such as Gaussian, Amber, and custom code
• Login node: cedar.isis.unc.edu
• Compute node: cypress.isis.unc.edu
• Cypress: SGI Altix 3700bx2 – 128 Intel Itanium2 processors (1600 MHz), each with 16 KB L1 data cache, 16 KB L1 instruction cache, 256 KB L2 cache, 6 MB L3 cache, and 4 GB of shared memory (512 GB total memory)
• Two 70 GB SCSI system disks as /scr
9/17/2007 Scientific Computing @ UNC 32
SGI Altix 3700
• Cedar: SGI Altix 350 – 8 Intel Itanium2 processors (1500 MHz), each with 16 KB L1 data cache, 16 KB L1 instruction cache, 256 KB L2 cache, 4 MB L3 cache, and 1 GB of shared memory (8 GB total memory); two 70 GB SATA system disks
• RHEL 3 with Propack 3, Service Pack 3
• No AFS (HOME & pkg space) access
• Scratch disks: /netscr, /nas, /scr
9/17/2007 Scientific Computing @ UNC 33
Emerald Cluster
• Dual AMD Athlon, 48 CPUs, 1.4 GHz, 1 GB/CPU memory, Myrinet connection
• Dual AMD Athlon, 20 CPUs, 1.6 GHz, 1 GB/CPU memory, Myrinet connection
• IBM BladeCenter, 50 CPUs, 2.4 GHz, 1.25 GB/CPU memory, Gigabit Ethernet connection
• IBM BladeCenter, 156 CPUs, 2.8 GHz, 1 GB/CPU memory, Gigabit Ethernet connection
• 2 login nodes: IBM BladeCenter, one Xeon 2.4 GHz with 2.5 GB RAM and one Xeon 2.8 GHz with 2.5 GB RAM
• Login: emerald.isis.unc.edu
• Access to 7 TB of NetApp NAS RAID array used for scratch space, mounted as /nas and /scr
• OS: RedHat Enterprise Linux 3.0
• TOP500: 395th place in the June 2003 release
9/17/2007 Scientific Computing @ UNC 34
9/17/2007 Scientific Computing @ UNC 35
Dec. 2006 Baobab/Emerald Usage
[Charts: December 2006 Baobab/Emerald usage broken down by user and by queue (bat_1h, bat_24h, bat_96h, bat_14d, bat_30d, par_24h_4c, par_24h_8c, par_48h_8c, par_96h_4c, par_12h_32c, patrons)]
9/17/2007 Scientific Computing @ UNC 36
New Dell LINUX Cluster, Topsail
• 520 dual nodes (4160 CPUs), Xeon (EM64T), 3.6 GHz, 2 MB L2 cache, 2 GB memory per CPU
• InfiniBand inter-node connection
• Not AFS mounted, not open to the general public
• Access based on peer-reviewed proposal
• HPL: 6.252 teraflops; 74th in the June 2006 TOP500 list, 104th in the November 2006 list, and 25th in the June 2007 list (after upgrade)
9/17/2007 Scientific Computing @ UNC 37
Original Topsail
• Compute nodes:
  – 1024 CPUs
  – 3.6 GHz Intel EM64T
  – 2 MB L2 cache
  – 4 GB memory
  – 90 nm technology
  – 800 MHz FSB
9/17/2007 Scientific Computing @ UNC 38
NEW Topsail
• Intel 5300 processor series, 65 nm technology
• 28.77 teraflops after upgrade
9/17/2007 Scientific Computing @ UNC 39
New Topsail: Chip/Block Diagram
9/17/2007 Scientific Computing @ UNC 40
Benchmark: Latency
9/17/2007 Scientific Computing @ UNC 41
Benchmark: Bandwidth
9/17/2007 Scientific Computing @ UNC 42
Benchmarks: GAMESS
9/17/2007 Scientific Computing @ UNC 43
File Systems
• AFS (Andrew File System): AFS is a distributed network file system that enables files from any AFS machine across the campus to be accessed as easily as files stored locally.
  – Serves as the ISIS HOME for all users with an ONYEN
  – Limited quota: 250 MB for most users [type "fs lq" to view]
  – Current production version: openafs-1.3.8.6
  – Files backed up daily [ ~/OldFiles ]
  – Directory/file tree: /afs/isis/home/o/n/onyen
    • For example: /afs/isis/home/m/a/mason, where "mason" is the ONYEN of the user
  – Accessible from chastity/zephyr, happy/yatta, baobab
  – But not from cedar/cypress or the new Dell cluster
  – Recommended to compile and run I/O-intensive jobs on /scr or /netscr instead
  – More info: http://help.unc.edu/?id=215#d0e24
9/17/2007 Scientific Computing @ UNC 44
Basic AFS Commands
• To add or remove packages– ipm add pkg_name, ipm remove pkg_name
• To find out space quota/usage– fs lq
• To see and renew AFS tokens (read/write-able)
• Over 200 packages installed in AFS pkg space– /afs/isis/pkg/
• More info available at– http://its.unc.edu/dci/dci_components/afs/
9/17/2007 Scientific Computing @ UNC 45
Data Storage
• Local scratch: /scr
  – Cedar/cypress: 4 × 70 GB SCSI system disks
  – Chastity/zephyr: 550 GB Fibre-Channel RAID array
  – Happy/yatta: 8 × 36.4 GB disk drives
  – For running jobs and temporary data storage; not backed up
• Network Attached Storage (NAS): /nas/uncch, /netscr
  – 7 TB of NetApp NAS RAID array used for scratch space, mounted as /nas and /scr
  – For running jobs and temporary data storage; not backed up
  – Shared by all login and compute nodes (cedar/cypress, chastity/zephyr, happy/yatta, baobab)
• Mass Storage (MS)
  – Never run jobs using files in ~/ms (compute nodes do not have ~/ms access)
  – Mounted for long-term data storage on all scientific computing servers' login nodes as ~/ms ($HOME/ms)
9/17/2007 Scientific Computing @ UNC 46
Subscription of Services
• Have an ONYEN ID
  – The Only Name You'll Ever Need
• Eligibility: faculty, staff, postdocs, and graduate students
• Go to http://onyen.unc.edu
9/17/2007 Scientific Computing @ UNC 47
Subscription of Services
9/17/2007 Scientific Computing @ UNC 48
Access to Servers
• To cedar– ssh cedar.isis.unc.edu
• To Emerald
  – ssh emerald.isis.unc.edu (formerly baobab.isis.unc.edu)
• To Topsail– ssh topsail.unc.edu
9/17/2007 Scientific Computing @ UNC 49
Programming Tools
• Compilers
  – FORTRAN 77/90/95
  – C/C++
• Utility Libraries
  – BLAS, LAPACK, FFTW, SCALAPACK
  – IMSL, NAG
  – NetCDF, GSL, PETSc
• Parallel Computing
  – OpenMP
  – PVM
  – MPI (MPICH, LAM/MPI, MPICH-GM)
9/17/2007 Scientific Computing @ UNC 50
Compilers: SMP machines
• Cedar/Cypress – SGI Altix 3700, 128 CPUs
  – 64-bit Intel compilers, versions 8.0, 8.1, and 9.0, in /opt/intel
    • FORTRAN 77/90/95: ifort/ifc/efc
    • C/C++: icc/ecc
  – 64-bit GNU compilers
    • FORTRAN 77: f77/g77
    • C and C++: gcc/cc and g++/c++
• Happy/Yatta – IBM P690, 32 CPUs
  – XL FORTRAN 77/90 8.1.0.3: xlf, xlf90
  – C and C++ AIX 6.0.0.4: xlc, xlC
9/17/2007 Scientific Computing @ UNC 51
Compilers: LINUX Cluster
• Absoft ProFortran Compilers
  – Package name: profortran
  – Current version: 7.0
  – FORTRAN 77 (f77): Absoft FORTRAN 77 compiler version 5.0
  – FORTRAN 90/95 (f90/f95): Absoft FORTRAN 90/95 compiler version 3.0
• GNU Compilers
  – Package name: gcc
  – Current version: 3.4.3
  – FORTRAN 77 (g77/f77): 3.4.3
  – C (gcc): 3.4.3
  – C++ (g++/c++): 3.4.3
• Intel Compilers
  – Package names: intel_fortran, intel_CC
  – Current version: 8.1
  – FORTRAN 77/90 (ifc): Intel LINUX compiler versions 8.0, 8.1, 9.0
  – C/C++ (icc): Intel LINUX compiler versions 8.0, 8.1, 9.0
• Portland Group Compilers
  – Package name: pgi
  – Current version: 5.2
  – FORTRAN 77 (pgf77): The Portland Group, Inc. pgf77 5.2-4
  – FORTRAN 90 (pgf90): The Portland Group, Inc. pgf90 5.2-4
  – High Performance FORTRAN (pghpf): The Portland Group, Inc. pghpf 5.2-4
  – C (pgcc): The Portland Group, Inc. pgcc 5.2-4
  – C++ (pgCC): The Portland Group, Inc. pgCC 5.2-4
9/17/2007 Scientific Computing @ UNC 52
LINUX Compiler Benchmark

Benchmark                       Absoft ProFortran 90   Intel FORTRAN 90   Portland Group FORTRAN 90   GNU FORTRAN 77
Molecular Dynamics (CPU time)   4.19 (4)               2.83 (2)           2.80 (1)                    2.89 (3)
Kepler (CPU time)               0.49 (1)               0.93 (2)           1.10 (3)                    1.24 (4)
Linpack (CPU time)              98.6 (4)               95.6 (1)           96.7 (2)                    97.6 (3)
Linpack (MFLOPS)                182.6 (4)              183.8 (1)          183.2 (3)                   183.3 (2)
LFK (CPU time)                  89.5 (4)               70.0 (3)           68.7 (2)                    68.0 (1)
LFK (MFLOPS)                    309.7 (3)              403.0 (2)          468.9 (1)                   250.9 (4)
Total rank                      20                     11                 12                          17

For reference only; numbers in parentheses are ranks (1 = best). Performance is code and compilation-flag dependent. For each benchmark, three identical runs were performed and the best CPU timing among the three was chosen and listed in the table. Optimization flags: Absoft -O, Portland Group -O4 -fast, Intel -O3, GNU -O.
9/17/2007 Scientific Computing @ UNC 53
Profilers & Debuggers
• SMP machines– Happy: dbx, prof, gprof– Cedar: gprof
• LINUX Cluster– PGI: pgdebug, pgprof, gprof– Absoft: fx, xfx, gprof– Intel: idb, gprof– GNU: gdb, gprof
9/17/2007 Scientific Computing @ UNC 54
Utility Libraries
• Mathematical Libraries
  – IMSL, NAG, etc.
• Scientific Computing
  – Linear Algebra
    • BLAS, ATLAS
    • EISPACK
    • LAPACK
    • SCALAPACK
  – Fast Fourier Transform: FFTW
  – The GNU Scientific Library: GSL
  – Utility libraries: netCDF, PETSc, etc.
9/17/2007 Scientific Computing @ UNC 55
Utility Libraries
• SMP Machines
  – Happy/Yatta: ESSL (Engineering and Scientific Subroutine Library), -lessl
    • BLAS
    • LAPACK
    • EISPACK
    • Fourier transforms, convolutions and correlations, and related computations
    • Sorting and searching
    • Interpolation
    • Numerical quadrature
    • Random number generation
    • Utilities
9/17/2007 Scientific Computing @ UNC 56
Utility Libraries
• SMP Machines
  – Cedar/Cypress: MKL (Intel Math Kernel Library) 8.0
    -L/opt/intel/mkl721/lib/64 -lmkl -lmkl_lapack -lsolver -lvml -lguide
    • BLAS
    • LAPACK
    • Sparse solvers
    • FFT
    • VML (Vector Math Library)
    • Random-number generators
9/17/2007 Scientific Computing @ UNC 57
Utility Libraries for Emerald Cluster
• Mathematical Libraries
  – IMSL
    • The IMSL Libraries are a comprehensive set of mathematical and statistical functions
    • From Visual Numerics, http://www.vni.com
    • Functions include: optimization, FFTs, interpolation, differential equations, correlation, regression, time series analysis, and many more
    • Available in FORTRAN and C
    • Package name: imsl
    • Required compiler: Portland Group compiler, pgi
    • Installed in AFS ISIS package space, /afs/isis/pkg/imsl
    • Current default version 4.0, latest version 5.0
    • To subscribe to IMSL, type "ipm add pgi imsl"
    • To compile a C code, code.c, using IMSL:
      pgcc -O $CFLAGS code.c -o code.x $LINK_CNL_STATIC
9/17/2007 Scientific Computing @ UNC 58
Utility Libraries for Emerald Cluster
• Mathematical Libraries
  – NAG
    • NAG produces and distributes numerical, symbolic, statistical, visualisation and simulation software for the solution of problems in a wide range of applications in such areas as science, engineering, financial analysis and research
    • From the Numerical Algorithms Group, http://www.nag.co.uk
    • Functions include: optimization, FFTs, interpolation, differential equations, correlation, regression, time series analysis, multivariate factor analysis, linear algebra, random number generators
    • Available in FORTRAN and C
    • Package name: nag
    • Available platforms: SGI IRIX, SUN Solaris, IBM AIX, LINUX
    • Installed in AFS ISIS package space, /afs/isis/pkg/nag
    • Current default version 6.0
    • To subscribe to NAG, type "ipm add nag"
9/17/2007 Scientific Computing @ UNC 59
Utility Libraries for Emerald Cluster
• Scientific Libraries
  – Linear Algebra
    • BLAS, LAPACK, LAPACK90, LAPACK++, ATLAS, SPARSE-BLAS, SCALAPACK, EISPACK, FFTPACK, LANCZOS, HOMPACK, etc.
    • Source code downloadable from http://www.netlib.org/liblist.html
    • Compiler dependent
    • BLAS and LAPACK available in AFS ISIS package space for all 4 compilers: gcc, profortran, intel, and pgi
    • SCALAPACK available for the pgi and intel compilers
    • Assistance available if other versions are needed
9/17/2007 Scientific Computing @ UNC 60
Utility Libraries for Emerald Cluster
• Scientific Libraries
  – Other libraries: not fully implemented yet, so please be cautious and patient when using them
    • FFTW     http://www.fftw.org/
    • GSL      http://www.gnu.org/software/gsl/
    • NetCDF   http://www.unidata.ucar.edu/software/netcdf/
    • NCO      http://nco.sourceforge.net/
    • HDF      http://hdf.ncsa.uiuc.edu/hdf4.html
    • OCTAVE   http://www.octave.org/
    • PETSc    http://www-unix.mcs.anl.gov/petsc/petsc-as/
    • ……
  – If you think more libraries are of broad interest, please recommend them to us
9/17/2007 Scientific Computing @ UNC 61
Parallel Computing
• SMP Machines:
  – OpenMP
    • Compilation:
      – Use the "-qsmp=omp" flag on happy
      – Use the "-openmp" flag on cedar
    • Environment variable setup (see the sketch after this list):
      – setenv OMP_NUM_THREADS n
  – MPI
    • Compilation:
      – Use the "-lmpi" flag on cedar
      – Use MPI-capable compilers, e.g., mpxlf, mpxlf90, mpcc, mpCC
  – Hybrid (OpenMP and MPI): do both!
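As an illustration (a sketch only, assuming the Fortran OpenMP hello-world shown earlier is saved as hello.f), a typical compile-and-run sequence on cedar might be:

    ifort -openmp hello.f -o hello
    setenv OMP_NUM_THREADS 4
    ./hello

On happy the corresponding compile line would use the -qsmp=omp flag with xlf instead.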
9/17/2007 Scientific Computing @ UNC 62
Parallel Computing With Emerald Cluster
• Setup

MPI implementation                 MPICH                        MPI-LAM
Package to be "ipm add"-ed         mpich                        mpi-lam
Language support                   F77   F90   C    C++         F77   F90   C    C++
GNU Compilers                       √     –    √     √           √     –    √     √
Absoft ProFortran Compilers         √     √    √     √           √     √    √     √
Portland Group Compilers            √     √    √     √           √     √    √     √
Intel Compilers                     √     √    √     √           √     √    √     √

(The GNU 3.4.3 suite has no Fortran 90 compiler, hence the gaps.)
9/17/2007 Scientific Computing @ UNC 63
Parallel Computing With Emerald Cluster
• Setup

Vendor \ Language      Package Name               FORTRAN 77   FORTRAN 90   C      C++
GNU                    gcc                        g77          –            gcc    g++
Absoft ProFortran      profortran                 f77          f95          –      –
Portland Group         pgi                        pgf77        pgf90        pgcc   pgCC
Intel                  intel_fortran, intel_CC    ifc          ifc          icc    icc

Commands for parallel MPI compilation (mpich or mpi-lam): mpif77, mpif90, mpicc, mpiCC
9/17/2007 Scientific Computing @ UNC 64
Parallel Computing With Emerald Cluster
• Setup – AFS packages to be "ipm add"-ed
  – Notice the order: the compiler is always added first
  – Add ONLY ONE compiler into your environment

COMPILER             MPICH                                   MPI-LAM
GNU                  ipm add gcc mpich                       ipm add gcc mpi-lam
Absoft ProFortran    ipm add profortran mpich                ipm add profortran mpi-lam
Portland Group       ipm add pgi mpich                       ipm add pgi mpi-lam
Intel                ipm add intel_fortran intel_CC mpich    ipm add intel_fortran intel_CC mpi-lam
9/17/2007 Scientific Computing @ UNC 65
Parallel Computing With Emerald Cluster
• Compilation
  – To compile an MPI Fortran 77 code, code.f, and form an executable, exec:
      %mpif77 -O -o exec code.f
  – For a Fortran 90/95 code, code.f90:
      %mpif90 -O -o exec code.f90
  – For a C code, code.c:
      %mpicc -O -o exec code.c
  – For a C++ code, code.cc:
      %mpiCC -O -o exec code.cc
9/17/2007 Scientific Computing @ UNC 66
Scientific Packages
• Available in AFS package space
• To subscribe to a package, type "ipm add pkg_name", where "pkg_name" is the name of the package. For example, "ipm add gaussian"
• To remove it, type "ipm remove pkg_name"
• All packages are installed in the /afs/isis/pkg/ directory. For example, /afs/isis/pkg/gaussian
• Categories of scientific packages include:
  – Quantum Chemistry
  – Molecular Dynamics
  – Material Science
  – Visualization
  – NMR Spectroscopy
  – X-Ray Crystallography
  – Bioinformatics
  – Others
9/17/2007 Scientific Computing @ UNC 67
Scientific Package: Quantum Chemistry
Software Package Name Platforms Current Version Parallel
ABINIT abinit IRIX/LINUX 4.3.3 YES (MPI)
ADF adf LINUX 2002.02 Yes (PVM)
Cerius2 cerius2 IRIX/LINUX 4.10 Yes (MPI)
GAMESS gamess IRIX/LINUX 2001.9.6 Yes (MPI)
Gaussian gaussian IRIX/LINUX 03C02 Yes (OpenMP)
MacroModel macromodel IRIX 7.1 No
MOLFDIR molfdir IRIX 2001 NO
Molpro molpro IRIX/LINUX 2002.6 Yes (MPI)
NWChem nwchem IRIX/LINUX 4.7 Yes (MPI)
MaterialStudio materisalstudio LINUX 3.2 Yes (MPI)
CPMD cpmd IRIX/LINUX 3.0 YES (MPI)
ACES2 aces2 IRIX 4.1.2 No
9/17/2007 Scientific Computing @ UNC 68
Scientific Package: Molecular Dynamics
Software Package Name Platforms Current Version Parallel
Amber amber IRIX/LINUX 8.0 MPI
NAMD/VMD namd,vmd IRIX/LINUX 2.5 MPI
Gromacs gromcs IRIX/LINUX 3.2.1 MPI
InsightII insightII IRIX 2000.3 --
MacroModel macromodel IRIX 7.1 --
PMEMD pmemd IRIX/LINUX 3.0.0 MPI
Quanta quanta IRIX 2005 MPI
Sybyl sybyl IRIX/LINUX 7.1 --
CHARMM charmm IRIX 3.0B1 MPI
TINKER tinker LINUX 4.2 --
O o IRIX 9.0.7 --
9/17/2007 Scientific Computing @ UNC 69
Molecular & Scientific Visualization
Software Package Name Platforms Current Version
AVS avs IRIX 5.6
AVS Express Avs-express IRIX 6.2
Cerius2 cerius2 IRIX/LINUX 4.9
DINO dino IRIX 0.8.4
ECCE ecce IRIX 2.1
GaussView gaussian IRIX/LINUX/AIX 3.0.9
GRASP grasp IRIX 1.3.6
InsightII insightII IRIX/LINUX 2000.3
MOIL-VIEW Moil-view IRIX 9.1
MOLDEN molden IRIX/LINUX 4.0
MOLKEL molkel IRIX 4.3
MOLMOL molmol IRIX 2K.1
MOLSCRIPT molscript IRIX 2.1.2
MOLSTAR molstar IRIX/LINUX 1.0
9/17/2007 Scientific Computing @ UNC 70
Molecular & Scientific Visualization
Software Package Name Platforms Current Version
MOVIEMOL moviemol IRIX 1.3.1
NBOView nbo IRIX/LINUX 5.0
QUANTA quanta IRIX/LINUX 2005
RASMOL rasmol IRIX/LINUX/AIX 2.7.3
RASTER3D raster3d IRIX/LINUX 2.7c
SPARTAN spartan IRIX 5.1.3
SPOCK spock IRIX 1.7.0p1
SYBYL sybyl IRIX/LINUX 7.1
VMD vmd IRIX/LINUX 1.8.2
XtalView xtalview IRIX 4.0
XMGR xmgr IRIX 4.1.2
GRACE grace IRIX/LINUX 5.1.2
IMAGEMAGICK Imagemagick IRIX/LINUX/AIX 6.2.1.3
GIMP gimp IRIX/LINUX/AIX 1.0.2
XV xv IRIX/LINUX/AIX 3.1.0a
9/17/2007 Scientific Computing @ UNC 71
NMR & X-Ray Crystallography
Software Package Name Platforms Current Version
CNSsolve cnssolve IRIX/LINUX 1.1
AQUA aqua IRIX/LINUX 3.2
BLENDER blender IRIX 2.28a
BNP bnp IRIX/LINUX 0.99
CAMBRIDGE cambridge IRIX 5.26
CCP4 ccp4 IRIX/LINUX 4.2.2
CNX cns IRIX/LINUX 2002
FELIX felix IRIX/LINUX 2004
GAMMA gamma IRIX 4.1.0
MOGUL mogul IRIX/LINUX 1.0
Phoelix phoelix IRIX 1.2
TURBO turbo IRIX 5.5
XPLOR-NIH Xplor_nih IRIX/LINUX 2.11.2
XtalView xtalview IRIX 4.0
9/17/2007 Scientific Computing @ UNC 72
Scientific Package: Bioinformatics
Software Package Name Platforms Current Version
BIOPERL bioperl IRIX 1.4.0
BLAST blast IRIX/LINUX 2.2.6
CLUSTALX clustalx IRIX 8.1
EMBOSS emboss IRIX 2.8.0
GCG gcg LINUX 11.0
Insightful Miner iminer IRIX 3.0
Modeller modeller IRIX/LINUX 7.0
PISE pise LINUX 5.0a
SEAVIEW seaview IRIX/LINUX 1.0
AUTODOCK autodock IRIX 3.05
DOCK dock IRIX/LINUX 5.1.1
FTDOCK ftdock IRIX 1.0
HEX hex IRIX 2.4
9/17/2007 Scientific Computing @ UNC 73
Packages on Cedar/Cypress
• No access to AFS packages
• A separate pool of packages is installed at /opt
• Available packages include:
  – Amber
  – CPMD
  – Gaussian
  – GROMACS
  – MOLPRO
  – NAMD
  – NWChem
  – PMEMD
9/17/2007 Scientific Computing @ UNC 74
Why Do We Need Job Management Systems?
• "Whose job you run, in addition to when and where it is run, may be as important as how many jobs you run!"
• Effectively optimizes the utilization of resources
• Effectively optimizes the sharing of resources
• Often referred to as Resource Management Software, Queuing Systems, Job Management Systems, etc.
9/17/2007 Scientific Computing @ UNC 75
Job Management Tools
• PBS – Portable Batch System
  – Open-source product developed at NASA Ames Research Center
• DQS – Distributed Queuing System
  – Open-source product developed by SCRI at Florida State University
• LSF – Load Sharing Facility
  – Commercial product from Platform Computing, already deployed on UNC-CH ITS computing servers
• Codine/Sun Grid Engine
  – Commercial version of DQS from Gridware, Inc., now owned by Sun
• Condor
  – A restricted-source "cycle stealing" product from the University of Wisconsin
• Others too numerous to mention
9/17/2007 Scientific Computing @ UNC 76
Operation of LSF
[Diagram: a job submitted with "bsub app" on a submission host passes through the Batch API to the Master Batch Daemon (MBD) on the master host, which collects load information from the LIMs on all hosts, places the job in a queue, and dispatches it to an execution host, where the Slave Batch Daemon (SBD) starts the user job through the Remote Execution Server (RES).]
Legend: LIM – Load Information Manager; MLIM – Master LIM; MBD – Master Batch Daemon; SBD – Slave Batch Daemon; RES – Remote Execution Server
9/17/2007 Scientific Computing @ UNC 77
Cool Things about LSF
• It provides user access to dynamic load-balancing, load-sharing, and job queuing.
• It includes LSF JobScheduler, LSF Make, LSF Global Intelligence (which allows for complex analysis of system activity data), LSF MultiCluster, and Platform HPC.
• Platform HPC includes the Parallel Application Manager (PAM) and integration with a large set of numerical and scientific computing applications.
• Users of the cluster can define what resources they need for a given problem (number of CPUs, special software licenses, CPU time, disk space, memory, etc.) and let the resource management facilities of LSF determine where and when the job should run.
• If the cluster becomes overloaded, Platform LSF acts as a "traffic controller", ensuring that work continues to flow without the system crashing.
• For programs written with MPI calls, Platform HPC provides the Parallel Application Manager (PAM) and scripts that integrate with the MPI libraries and binaries.
• These features manage the execution of the code and provide housekeeping services, such as assigning CPUs to the program when it starts and graceful termination in the event of an error.
9/17/2007 Scientific Computing @ UNC 78
Common LSF Commands
• lsid
  – A good LSF command to start with
• lshosts/bhosts
  – Show all of the nodes that the LSF system is aware of
• bsub
  – Submits a job interactively or in batch using the LSF batch scheduling and queue layer of the LSF suite
• bjobs
  – Displays information about a recently run job. You can use the -l option to view a more detailed accounting
• bqueues
  – Displays information about the batch queues. Again, the -l option will display a more thorough description
• bkill <job ID#>
  – Kills the job with job ID number #
• bhist -l <job ID#>
  – Displays historical information about jobs. The -a flag displays information about both finished and unfinished jobs
• bpeek -f <job ID#>
  – Displays the stdout and stderr output of an unfinished job with a job ID of #
• bhpart
  – Displays information about host partitions
• bstop
  – Suspends an unfinished job
• bswitch
  – Switches unfinished jobs from one queue to another
9/17/2007 Scientific Computing @ UNC 79
More about LSF
• Type "jle" to check job efficiency
• Type "bqueues" for all queues on one cluster/machine (-m); type "bqueues -l queue_name" for more info about the queue named "queue_name"
• Type "busers" for user job slot limits
• Specific to Baobab:
  – cpufree – to check how many free/idle CPUs are available on Baobab
  – pending – to check how many jobs are still pending
9/17/2007 Scientific Computing @ UNC 80
LSF Clusters
• At ITS UNC-CH, we have three LSF clusters
  – coral
    • Emerald/Baobab LINUX cluster
  – fleet
    • Happy/yatta, sunny, etc.
  – conifers
    • Cedar/cypress
• One cannot submit jobs across LSF clusters! Each cluster is self-contained.
9/17/2007 Scientific Computing @ UNC 81
LSF Queues
• The LSF queues implement our job scheduling and control policies. Their names reflect the characteristics of the jobs each queue accepts, including the job type, the run time, and the number of CPUs requested.
• There are three job types:
  – batch (serial)
  – interactive
  – parallel
9/17/2007 Scientific Computing @ UNC 82
LSF Queues for the fleet/conifers Clusters

Queue     Description
int       Interactive jobs
now       Preemptive debugging queue, 10 min wall-clock limit, 2 CPUs
week      Default queue, one-week wall-clock limit, up to 32 CPUs/user
month     Long-running serial-job queue, one-month wall-clock limit, up to 4 jobs per user
staff     ITS Research Computing staff queue
manager   For use by LSF administrators
Run Jobs on LSF fleet/conifers Clusters
• Jobs to Interactive Queue
bsub -q int -m cedar -Ip my_interactive_job • Serial Jobs
bsub -q week -m cypress my_batch_job
• Parallel OpenMP Jobssetenv OMP_NUM_THREADS 4
bsub -q week -n 4 -m cypress my_parallel_job
• Parallel MPI Jobsbsub -q week -n 4 -m cypress mpirun -np 4 my_parallel_job
9/17/2007 Scientific Computing @ UNC 84
LSF Queues on Emerald
QUEUE_NAME PRIO STATUS MAX JL/U JL/P JL/H NJOBS PEND RUN SUSP
patrons 90 Open:Active - - - - 0 0 0 0
int 80 Open:Active 16 2 - - 1 0 1 0
now 70 Open:Active - 2 - - 0 0 0 0
week 50 Open:Active - 32 - - 13 0 13 0
month 40 Open:Active 32 4 - - 4 0 4 0
staff 30 Open:Active - - - - 0 0 0 0
manager 20 Open:Active - - - - 0 0 0 0
idle 10 Open:Active - - - - 9 0 9 0
9/17/2007 Scientific Computing @ UNC 85
LSF Queues on Emerald
$ bqueues -l idle

QUEUE: idle -- jobs that may be preempted by jobs from the patrons queue

PARAMETERS/STATISTICS
PRIO NICE STATUS      MAX JL/U JL/P JL/H NJOBS PEND RUN SSUSP USUSP RSV
10   0    Open:Active  -   -    -    -    8     0    8    0     0    0

SCHEDULING PARAMETERS
           r15s r1m r15m ut pg io ls it tmp swp mem
loadSched   -    -   -   -  -  -  -  -  -   -   -
loadStop    -    -   -   -  -  -  -  -  -   -   -

           gm_ports
loadSched   -
loadStop    -

SCHEDULING POLICIES: NO_INTERACTIVE
USERS: all
HOSTS: donors/
POST_EXEC: /opt/lsf/etc/post_exec
RES_REQ: select[type==any] same[model]
JOB_STARTER: /opt/lsf/common/etc/job_starter.pl
9/17/2007 Scientific Computing @ UNC 86
Peculiars of Baobab Cluster (LSF sciclus Cluster)

CPU Type               Resources (-R)
AMD Athlon 1.4 GHz     athlon14
AMD Athlon 1.6 GHz     athlon16
Xeon 2.4 GHz           xeon24, blade, lammpi
Xeon 2.8 GHz           xeon28, blade, lammpi

Parallel job submission (esub, -a)    Wrapper
lammpi                                lammpirun_wrapper
mpich                                 mpichp4_wrapper

Notice that the -R and -a flags are mutually exclusive in one command line.
9/17/2007 Scientific Computing @ UNC 87
Run Jobs on Emerald LINUX Cluster
• Interactive jobs
    bsub -q int_8h -R athlon14 -Ip my_interactive_job
• The syntax for submitting a serial job is:
    bsub -q queuename -R resources executable
  – For example:
    bsub -q bat_96h -R blade my_executable
• To run an MPICH parallel job on AMD Athlon machines with, say, 4 CPUs:
    bsub -q par_24h_4c -n 4 -a mpich mpirun.lsf my_par_job
• To run LAM/MPI parallel jobs on IBM BladeCenter machines with, say, 4 CPUs:
    bsub -q par_24h_4c -n 4 -a lammpi mpirun.lsf my_par_job
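The bsub options above can also be embedded in a submission script, which is often more convenient for reuse. The following is a minimal sketch (not from the original slides; the file name myjob.csh is hypothetical, and the queue and -a value are taken from the MPICH example above); standard LSF reads the #BSUB lines when the script is fed to bsub on stdin:

    #!/bin/csh
    # Sample LSF submission script (hypothetical file: myjob.csh)
    #BSUB -q par_24h_4c
    #BSUB -n 4
    #BSUB -a mpich
    #BSUB -o out.%J
    #BSUB -e err.%J
    mpirun.lsf ./my_par_job

Submit it with: bsub < myjob.csh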
9/17/2007 Scientific Computing @ UNC 88
Final Friendly Reminders
• Never run jobs on login nodes
  – Login nodes are for file management, coding, compilation, etc., only
• Never run jobs outside LSF
  – Fair sharing
• Never run jobs on your AFS ISIS home or ~/ms; instead, use /scr, /netscr, or /nas
  – Slow I/O response, limited disk space
• Move your data to mass storage after jobs are finished and remove all temporary files on scratch disks
  – Scratch disks are not backed up; efficient use of limited resources
  – Old files will automatically be deleted without notification
9/17/2007 Scientific Computing @ UNC 89
Online Resources
• Getting started with Research Computing:
  http://www.unc.edu/atn/hpc/getting_started/index.shtml?id=4196
• Programming tools:
  http://www.unc.edu/atn/hpc/programming_tools/index.shtml
• Scientific packages:
  http://www.unc.edu/atn/hpc/applications/index.shtml?id=4237
• Job management:
  http://www.unc.edu/atn/hpc/job_management/index.shtml?id=4484
• Benchmarks:
  http://www.unc.edu/atn/hpc/performance/index.shtml?id=4228
• High-performance computing:
  http://www.beowulf.org
  http://www.top500.org
  http://www.linuxhpc.org
  http://www.supercluster.org/
9/17/2007 Scientific Computing @ UNC 90
Online Training Resources• Ohio Supercomputer Center (OSC)
http://www.osc.edu/hpc/training/• Texas Advanced Computing Center (TACC)
http://www.tacc.utexas.edu/services/training/• Maui High Performance Computing Center (MHPCC)
http://www.mhpcc.edu/training/tutorials/• National Center for Supercomputing Applications (NCSA)
http://www.ncsa.uiuc.edu/Divisions/eot/Training/• Lawrence Livermore National Laboratory (LLNL)
http://www.llnl.gov/computing/hpc/training/• National Energy Research Scientific Computing Center (NERSC)
http://www.nersc.gov/nusers/services/training/classes/• University of Minnesota Supercomputing Institute (UMSI)
http://www.msi.umn.edu/tutorial/• San Diego Supercomputer Center (SDSC)
http://www.sdsc.edu/user_services/training/
9/17/2007 Scientific Computing @ UNC 91
QUESTIONS & COMMENTS?
Please direct comments/questions about research computing to
E-mail: [email protected]
Please direct comments/questions pertaining to this presentation to
E-Mail: [email protected]
9/17/2007 Scientific Computing @ UNC 92
Hands-on Exercises
• If you haven't done so yet:
  – Subscribe to the Research Computing services
  – Access chastity, happy, baobab, etc. via SecureCRT or X-Win32
  – Create a working directory for yourself on /netscr or /scr
  – Get to know basic AFS and UNIX commands
  – Get to know the Baobab Beowulf cluster
• Compile one serial and one parallel (MPI) code on Baobab
• Get familiar with basic LSF commands
• Get to know the packages available in AFS package space
• Submit jobs via LSF using a serial or parallel queue