Welcome New Users!
Getting Started with HPC
Erin Shaw & Cesar Sul, Advanced Cyberinfrastructure Research & Education Facilitators (ACI-REF)
USC Center for High-Performance Computing (HPC)
Spring 2017
1. What is HPC?
§ HPC is USC's Center for High-Performance Computing.
• HPC advances USC's mission by providing the infrastructure and support necessary for research computing.
• It exists to help advance scientific discovery at USC.
§ HPC is a world-class supercomputing center!
• As part of "standing up" an upgraded system, HPC runs and publishes standard performance benchmarks.
• It is currently ranked the 12th fastest academic supercomputer in the U.S. by TOP500.org, the international supercomputer ranking site.
ITS Information Technology Services
Douglas Shook, USC CIO
Maureen Dougherty, USC HPC Director
Randolph Hall, USC VP of Research & Faculty Executive Director, HPC*
*The HPC Faculty Advisory Committee advises the CIO about the faculty's research needs related to the university's HPC resources.
ITS Data Center
HPC
HPC User Base
§ HPC is a USC-wide resource.
• HPC resources are available at no charge to USC faculty, researchers and students.
• Most users are from Dornsife, Viterbi and Keck.
• Others are from business, psychology, cinema, pharmacy and elsewhere.
• There are 11 class accounts.
§ HPC is housed within the ITS data center and is monitored around the clock by ITS staff.
HPC Facilitation
§ Request assistance
• Email [email protected] (email again!)
• Drop in to Office Hours
- Every Tuesday @ 2:30pm (UPC LVL 3M)
- After workshops, when scheduled, Wednesdays @ 4:00pm (HSC NML 203)
• Request a lab/individual consultation
§ Learn more!
• Visit https://hpcc.usc.edu
• Attend a workshop (when scheduled)
- UPC Fridays, 2:30-4:30pm, VPD 106
- HSC Wednesdays, 1:30-3:30pm, MCA 249
2. HPC Accounts
§ HPC accounts are project-based; faculty, staff, researchers and graduate students can apply for up to two HPC project accounts.
• HPC refers to the applicant as the PI of the project.
• The PI can add members to their project group (or be the sole member).
- e.g., a professor adds students to a class project
- e.g., an investigator adds graduate students to a research project
• Members can belong to multiple projects, including their own.
§ Projects are allocated a core-hours and disk space quota.
• The PI can ask HPC to increase hours and space through the website.
• Use $ mybalance / $ myquota to monitor compute hours / disk space.
2. HPC Special Accounts
§ Class accounts
• Instructors can create class accounts for their students, for teaching and class assignments.
- We had eleven courses across the academic year.
§ Secure data accounts - NEW!
• As of January 2017, HPC can now be used with sensitive and restricted-access data.
• Previously, you could not store or process data or documents on HPC if they belonged to a category of legally protected or high-risk information.
3. HPC Computing Cluster
§ A computer cluster… consists of connected computers (nodes) that work together… nodes are connected to each other through fast local area networks… with each node running its own instance of an operating system… usually includes software for high-performance distributed computing.
3. HPC Computing Cluster
[Photos: a simple, home-built cluster; one rack in the HPC cluster; network cables; multiple racks in the HPC cluster; rows of racks in the HPC cluster!]
3. HPC Computing Cluster
[Diagram: Network Firewall → Head Nodes → Compute Nodes]
Compute Nodes (HPC Cluster): >2,700 nodes, >32K cores (CentOS)
Fast Networks: Myrinet (10 Gbit/sec), Infiniband (56 Gbit/sec)
Data Storage: 2.4 PB (total), 328 TB (/staging)
Head Nodes*: hpc-login2, hpc-login3
Data Transfer Node*: hpc-transfer
*Only head nodes can access the Internet.
*Head nodes and the DTN are shared by all users.
4. Working on HPC from a Laptop or Desktop
§ A secure network is required
• Use USC Secure Wireless or USC Ethernet to connect from USC
• Use a Virtual Private Network (VPN) client to connect from outside USC
§ A secure shell (ssh) is required (a shell is Linux's command line interface)
• On Macs, use Terminal, a native application
- Additionally, install XQuartz (www.xquartz.com) for GUI viewing
• On Windows, install X-Win32 from software.usc.edu
- Or install another personal favorite, e.g., PuTTY, SecureShell, etc.
§ To connect
• In a Mac Terminal, type "ssh -X <YourUSCNetID>@hpc-login2.usc.edu"
• On Windows, configure the ssh connection for hpc-login2.usc.edu
4. Working on HPC from a Laptop or Desktop
§ A secure file transfer protocol (sftp) is required for transferring data files
• Use the Linux commands scp and rsync for command line transfers
• Use one of the many sftp client applications available, e.g.,
- Filezilla is available for both Mac and Windows
- Choose your favorite (e.g., SecureShell supports both login and file transfer)
- See https://itservices.usc.edu/sftp/ for options
§ To connect
• Configure the sftp connection for hpc-transfer.usc.edu
- hpc-transfer is a dedicated DTN (data transfer node)
5. HPC File System
§ File systems control how data is stored and used on a disk
§ Data is presented to the HPC cluster from path "/home"
• "root", denoted by a forward slash ("/"), is the top level of the Linux operating system's file system directory hierarchy
- Type the commands "cd /" then "ls" to view directories under root

/ (root)
├ auto
├ bin
├ home     ← HPC user home and project directories
├ lib
├ lib64
├ mnt
├ sbin
├ staging  ← HPC data staging directories
├ tmp
├ usr      ← HPC-maintained software is in /usr/usc
5. HPC File System
§ The locations of your project, applications, codes, libraries, data, etc. are all specified by unique paths.
To find your current path, type pwd (print your working directory):
$ pwd
/home/rcf-proj/T/trojan
rcf-proj indicates that this is a project directory, where "T" is the project and "trojan" is the user.
$ pwd
/home/rcf-40/trojan
rcf-xx indicates that this is a user's home directory, where "trojan" is the user.
5. HPC File System
§ Paths are either absolute or relative.
Absolute paths contain root, or a symbol that expands to a full path. Examples:
/home/rcf-proj/hpc/hpcuser    '/' starts at the top (root) level
./mycatphotos/cat.jpg         '.' resolves to the path of your current directory
~/.bashrc                     '~' resolves to the path of your home directory
All other paths are interpreted relative to your current directory. Examples:
$ cd mycatphotos              change directory to 'mycatphotos' (in the current dir)
$ cat mycatphotos/cat.dat     display contents of file cat.dat in 'mycatphotos'
[Diagram: /home/rcf-xx/ containing user directories csul/, shaw/, chris/]
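The absolute/relative path rules above can be tried anywhere; this runnable sketch uses made-up names (pathdemo, mycatphotos, cat.dat) under /tmp.

```shell
# Absolute vs. relative paths in practice (all names invented for the example)
rm -rf /tmp/pathdemo
mkdir -p /tmp/pathdemo/mycatphotos
cd /tmp/pathdemo
echo "meow" > mycatphotos/cat.dat        # relative path: resolved against the current dir
cat /tmp/pathdemo/mycatphotos/cat.dat    # absolute path: starts at '/' (root)
cat ./mycatphotos/cat.dat                # '.' expands to the current directory
```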
5. File System: Home Directory
§ Users log in to their home directory
/home/rcf-40/<user_name>
- the value of environment variable $HOME
• Private directory; only the user can modify files
• Backed up daily
§ User quotas
• 1 GB of disk quota and 100,000 files of file quota
- applications may install hidden files here
§ Used for
• Logging in, setting up your environment and storing personal files
- not for installation, computation or large storage
[Diagram: /home/rcf-proj/ containing proj1/ (member jimi/) and proj2/ (members csul/, shaw/, chris/)]
5. Project Directory
§ Every project has its own directory
/home/rcf-proj/<project_name>
• PI is the owner; every member has a subdirectory
• The project quota (max 2 TB) is shared among all members
• Backed up daily
§ Used for
• Installing software, running jobs and storing data
- only the PI can create shared subdirectories and install software at the top level
§ Permissions
• By default, member subdirectories have group read access
- members can make them private; never set permissions so others can write
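The permission rules above can be sketched with chmod; /tmp/permdemo/shared stands in for a member subdirectory under /home/rcf-proj (the path is invented for the example).

```shell
# Group permissions on a (stand-in) member subdirectory
rm -rf /tmp/permdemo
mkdir -p /tmp/permdemo/shared
chmod 700 /tmp/permdemo/shared      # start fully private (owner only)
chmod g+rx /tmp/permdemo/shared     # give the group read/enter access (the HPC default)
chmod o-rwx /tmp/permdemo/shared    # never open write (or anything else) to others
stat -c '%A' /tmp/permdemo/shared   # shows drwxr-x---
```

Use `chmod g-rx` instead of `g+rx` to make the directory private again, as the slide suggests.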
5. Staging Directory
§ Every project has a staging directory
/staging/<project_name>
- same structure as the project directory
- for staging data for jobs (copy data/results to/from)
• Lots of space (~328 TB)
- no quotas!
- /staging is cleared during semi-annual downtimes
• Data is not backed up
- store a copy of your data somewhere else
§ Staging is a parallel file system
• It has faster read/write access rates than the project file system
[Diagram: /staging/ containing proj1/ (member jimi/) and proj2/ (members csul/, shaw/, chris/)]
5. Temporary Storage on Compute Nodes
§ Every job has access to local storage (~60 GB - 1.8 TB)
$TMPDIR
- equal to /tmp/{your_job_id}
/scratch
- combines $TMPDIR from the first 20 nodes
- /scratch is available to all nodes of the job
• Fastest read/write rates
• Only accessible while on the compute node
§ Data is not backed up!
• Compute node directories are cleaned at the end of every job
- copy data back to your project or staging directories
[Diagram: /(root) → $TMPDIR/{your_data} and /(root) → scratch/{your_data}, labeled "one or more nodes" and "one node only"]
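The stage-in / compute / stage-out pattern described above can be sketched as follows. On a compute node the scheduler sets $TMPDIR to /tmp/{your_job_id}; here /tmp/job_demo simulates it and /tmp/proj_results stands in for a project or staging directory, so the sketch is runnable anywhere.

```shell
# Simulated node-local job directory and result destination
rm -rf /tmp/job_demo /tmp/proj_results
mkdir -p /tmp/job_demo /tmp/proj_results

# 1. Stage input data onto the fast local disk
echo "input" > /tmp/job_demo/input.dat

# 2. Run the computation against the local copy (placeholder command)
tr a-z A-Z < /tmp/job_demo/input.dat > /tmp/job_demo/output.dat

# 3. Copy results back before the job ends; local disk is wiped afterwards
cp /tmp/job_demo/output.dat /tmp/proj_results/
cat /tmp/proj_results/output.dat
```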
6. HPC Computing Resources
§ HPC has two computing clusters
• ~1,700 nodes on the original Myrinet (10 Gbps) interconnect cluster
• ~1,300 nodes on the newer Infiniband (56 Gbps) interconnect cluster
- 264 Hewlett-Packard SL250, dual Xeon 8-core 2.6 GHz, dual NVIDIA K20 GPUs containing 2,496 cores, each with 64 GB memory
- 448 Hewlett-Packard SL230, dual Xeon 8-core 2.6 GHz CPUs, with 64 GB memory
- 288 Lenovo nx360 m5, dual Xeon 8-core 2.6 GHz CPUs, with 64 GB memory
- 19 Lenovo nx360 m5, 2.6 GHz, dual NVIDIA K40 GPUs containing 2,880 cores, each with 64 GB memory
- 5 Lenovo nx360 m5, 2.6 GHz, dual NVIDIA K80 GPUs containing 2x2,496 cores, each with 64 GB memory (condo'd by a research group, not public)
§ Run jobs on compute nodes!
6. HPC Computing Cluster (June 2016)

Index  Vendor  Model     CPU                 Core Type   Speed    Memory  GPU Number & Type
1      Dell    R910      Quad Intel Xeon     Decacore    2.0 GHz  1 TB
2      HP      SL160     Dual Intel Xeon     Hexcore     3.0 GHz  24 GB
3      HP      DL165     Dual Intel Xeon     Dodecacore  2.3 GHz  48 GB
4      Oracle  X2200     Dual AMD Opteron    Dualcore    2.3 GHz  16 GB
5      Dell    PE1950    Dual Intel Xeon     Quadcore    2.5 GHz  12 GB
6      Oracle  X2200     Dual AMD Opteron    Quadcore    2.3 GHz  16 GB
7      IBM     DX360     Dual Intel Xeon     Hexcore     2.6 GHz  24 GB
8      HP      SL250S    Dual Intel SB Xeon  Octocore    2.6 GHz  64 GB   Dual NVIDIA K20
9      HP      SL230S    Dual Intel SB Xeon  Octocore    2.6 GHz  64 GB
10     Lenovo  NX360 M5  Dual Intel Xeon     Octocore    2.6 GHz  64 GB
11     Lenovo  NX360 M5  Dual Intel Xeon     Octocore    2.6 GHz  64 GB   Dual NVIDIA K40
queue             nodes  ppn  gpus  avx       /tmp     core type   cpu      model    network  node names (node type)
large-mem         4      40   -     -         1.8 TB   decacore    xeon     r910     myri     hpc-1t-1, hpc-1t-2, hpc-1t-3, hpc-1t-4 (1)
large/main/quick  8      12   -     -         140 GB   hexcore     xeon     sl160    myri     hpc0965-0972 (2)
large/main/quick  67     24   -     -         895 GB   dodecacore  xeon     dl165    myri     hpc0981-1021, hpc1044-1050, hpc1123-1128, hpc1196-1200, hpc1223-1230 (3)
large/main/quick  26     8    -     -         60 GB    dualcore    opteron  x2200    myri     hpc1723-1728, hpc1734-1739, hpc1741-1742, hpc1744-1754, hpc1756 (4)
large/main/quick  54     8    -     -         60 GB    quadcore    xeon     pe1950   myri     hpc2283-2318, hpc2320-2337 (5)
large/main/quick  138    8    -     -         60 GB    quadcore    opteron  x2200    myri     hpc2349-2370, hpc2470, hpc2472-2481, hpc2483, hpc2486-2505, hpc2510-2544, hpc2546-2559, hpc2561-2580, hpc2582-2597, hpc2600 (6)
large/main/quick  4      12   -     -         200 GB   hexcore     xeon     dx360    myri     hpc2758-2761 (7)
large/main/quick  237    16   2     avx       850 GB   octocore    xeon     sl250s   IB       hpc3025-3027, hpc3031-3264 (8)
large/main/quick  45     16   -     avx       5500 GB  octocore    xeon     sl230s   IB       hpc3648-3688, hpc3695, hpc3766-3768 (9)
large/main/quick  217    16   -     avx/avx2  5500 GB  octocore    xeon     nx360m5  IB       hpc3769-3792, hpc3888-4056, hpc4081-4104 (10)
large/main/quick  19     16   2     avx/avx2  5500 GB  octocore    xeon     nx360m5  IB       hpc3817-3834, hpc3852 (11)
7. Running Jobs on the Cluster
Let's do our CGA homework… Let's test this on the cluster…

$ qsub -I -l nodes=2:ppn=8
(interactive job) $ myprogram

[Diagram: Head Nodes (hpc-login2, hpc-login3) → Job Scheduler → Compute Nodes; jobs wait in queue]

*HPC uses the TORQUE/PBS resource manager and the Moab cluster scheduler. Jobs are scheduled based on order submitted, number & types of nodes requested and time required.
7. Running Jobs on the Cluster
Let's submit this to the cluster…
(non-interactive batch job, results will be printed to file)

Job Scheduler*
$ qsub myjob.pbs
(batch job) $ myprogram
7. Submitting a Job - Batch Mode
§ Use a PBS script to submit a job to the HPC cluster
1. Add PBS computing resource requests
2. Add shell commands
3. Submit your job to the queue: $ qsub myjob.pbs
§ Example: myjob.pbs

#!/bin/bash
#PBS -l nodes=2:ppn=16
#PBS -l walltime=02:00:00

# change directory
cd /path/to/myproject

# set path/environment variables
source /usr/usc/sas/default/setup.sh

# run program
sas my.sas
7. Submitting a Job - Interactive Mode
§ PBS has a special job submission mode that allows you to access allocated computing resources interactively, for testing only.
Example: Request 1 node with 8 processors per node for one hour
$ qsub -I -l nodes=1:ppn=8 -l walltime=01:00:00
§ When an interactive job is accepted, a new login shell will start on the first compute node
• You can run programs as many times as you want until the requested time expires (usually up to two hours)
• Extremely useful for compiling/debugging/testing your code and preparing your PBS scripts
7. Submitting a Job - PBS Commands
§ Job control commands
qsub                  submit a job
qdel                  delete a job
§ Job monitoring commands
qstat -u <user_id>    show my queue status
showstart <job_id>    show queuing schedule
checkjob <job_id>     show job statistics
§ See Adaptive Computing's Torque (PBS) documentation
7. Submitting a Job - Queues
§ There are four queues available to the public
§ Each queue has different constraints
• Number of queued jobs, nodes, "walltime", simultaneous jobs
• The large queue is only for highly parallel jobs
§ By default, a queue will be selected for you based on walltime
• Some research labs have their own nodes and queues (-q)
Queue Name  Maximum Jobs Queued  Maximum Node Count  Maximum Wall Time  Maximum Jobs per User
main        1000                 99                  24 hours           10
quick       100                  4                   1 hour             10
large       100                  256                 24 hours           1
largemem    100                  1                   336 hours          1
8. Installed Software
§ HPC maintains software in /usr/usc/
• Includes compilers; statistical, mathematical, and simulation programs; numerical libraries; licensed applications and more…
§ You can also install software in your project directory
• HPC can help with this

$ ls /usr/usc
acml/    fftw/      imp/    mpich2/    qespresso/
amber/   gaussian/  intel/  mpich-mx/  qiime/
aspera/  gflags/    iperf/  mvapich2/  R/
bbcp/    git/       java@   NAMD/      root/
bin/     globus/    jdk/    ncview/    sas/
(many more)
8. Installed Software
§ To use software in /usr/usc
• Select a version
- use "default" for the most recent version
• In each version directory are two setup scripts:
- setup.sh (for use with the bash shell, which is the default)
- setup.csh (for use with the "t" or "c" shells)
• Type the following to set up the environment for the software:
- $ source setup.sh

$ ls /usr/usc/python
2.6.5/  2.7.6/  2.7.8/  3.3.3/  3.4.3/  3.4.5/  3.5.1/  3.5.2/  default@

$ ls /usr/usc/python/default
bin/  include/  lib/  man/  setup.csh  setup.sh  share/
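A setup.sh script of this kind typically works by prepending the package's bin directory to your PATH. The mock below illustrates the mechanism; everything under /tmp/pkg_demo is invented for the example and is not HPC's actual setup script.

```shell
# Build a mock package with its own setup.sh and a tool in bin/
rm -rf /tmp/pkg_demo
mkdir -p /tmp/pkg_demo/bin
printf 'export PATH=/tmp/pkg_demo/bin:$PATH\n' > /tmp/pkg_demo/setup.sh
printf '#!/bin/sh\necho from-demo-tool\n' > /tmp/pkg_demo/bin/demo-tool
chmod +x /tmp/pkg_demo/bin/demo-tool

# Sourcing the script modifies the *current* shell's PATH
. /tmp/pkg_demo/setup.sh    # on HPC: source /usr/usc/<package>/default/setup.sh
demo-tool                   # now found via the updated PATH
```

This is why the script must be sourced rather than executed: running it in a subshell would change the subshell's PATH and then discard it.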
9. HPC Policies
§ Required reading
• https://hpcc.usc.edu/support/accounts/hpcc-policies/
§ Resource limits
• On head nodes, jobs (24 hours), allocations (core hours, disk space)
§ Scheduled downtimes
• Twice a year, for upgrades and maintenance
§ Protected data allowed within HPC Secure Data Accounts (only)
• HPC is now HIPAA-compliant
§ Play well with others; practice safe computing
• i.e., share public nodes, not private data!
9. USC Policies
§ Recommended reading
• It is your responsibility to abide by these
§ Information Security
• http://policy.usc.edu/info-security/
§ Network Infrastructure Use
• http://policy.usc.edu/network-infrastructure/
§ Privacy of Personal Information
• http://policy.usc.edu/info-privacy/
§ Digital Millennium Copyright Act Compliance
• http://cio.usc.edu/copyright/policy/
10. Linux Commands & Concepts
Environment: bash shell, .bashrc, environment variables ($PATH)

Keyboard navigation:
<up>/<down>: show prev/next cmd
<ctl-a>/<ctl-e>: move to beginning/end of line
<alt-f>/<alt-b>: move forward/back a word
<tab>, <tab-tab>: autocomplete

Special characters:
"*": wildcard
"/", "~": root dir, home dir
".", "..": current dir, parent dir
">", "<": redirect output/input
"|": pipe output to input of next cmd

Navigation: $ pwd, $ cd, $ ls (-alh), $ cp / $ mv, $ touch / $ rm (-i), $ mkdir / $ rmdir, $ history
Reading/editing files: $ cat, $ less, $ nano ($ vi, $ emacs)
Permissions: $ chmod, $ chgrp
Information: $ mybalance -h, $ myquota, $ top, $ du -h, $ man, $ echo [$PATH], $ wc, $ sort
Tools: $ alias, $ wget, $ for (loop)
10. Linux Commands & Concepts (applied)

$ mybalance -h
$ myquota
$ top
$ pwd
$ ls, ls -l, ls -F --color
$ man ls
$ echo hello
$ <up> <up> <up> <down> <down> <down>
$ echo $PATH
$ cd .., pwd, ls
$ cd /home/rcf-proj/<myproj>/<mydir>
$ mkdir test, ls, rmdir test, ls
$ mkdir workshop, ls
$ cd workshop, pwd
$ touch a.a, ls
$ cp a.a b.b, ls
$ mv b.b c.c, ls
$ rm c.c, ls
$ alias rm='rm -i'
$ rm a.a, ls
10. Linux Commands & Concepts (applied)

$ alias cdp='cd /path/to/proj'
$ cd ~, pwd
$ cdp, pwd
$ cd ~, ls -alh .bash*
$ cat .bashrc
$ cp .bashrc .bashrc_ori
$ nano ~/.bashrc

alias rm='rm -i'
alias cdp='cd /path/to/proj'
alias ll='ls -hlt'

$ cdp
$ ll workshop
$ chmod g+w workshop, ls -l
$ chmod g-rw workshop, ls -l
$ cd workshop
$ wget http://hpcc.usc.edu, ls
$ wc -l index.html
$ cat index.html | grep @
$ cat index.html | grep jpg

$ ls /usr/usc/mat
<tab> to autocomplete (fails), <tab><tab> to show candidates
$ ls /usr/usc/matl
<tab> to autocomplete (succeeds), <tab><tab> to show candidates
$ ls /usr/usc/matlab/

$ ls *
$ for i in *; do echo $i; done
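The wildcard/for-loop pattern above can be tried in any scratch directory; this runnable sketch uses an invented path under /tmp.

```shell
# '*' expands to every (non-hidden) name in the current directory,
# and the for loop visits each expansion in turn.
rm -rf /tmp/loopdemo
mkdir -p /tmp/loopdemo
cd /tmp/loopdemo
touch a.txt b.txt c.txt
ls *
for i in *; do echo "$i"; done
```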
10. Linux Commands & Concepts (applied)

$ history
$ history >> history.out
$ less history.out
$ less history.out | grep wget
$ ls -t /usr/usc/ > ls.out
$ less ls.out
$ less ls.out | sort
$ less ls.out | sort -f | head -n 5
$ du -h * | sort -n
$ du -h --summarize
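The redirect/pipe/sort patterns above work on any text; this runnable sketch uses a small invented file in place of history.out or a downloaded index.html.

```shell
# '>' redirects output into a file; '|' feeds one command's output to the next
printf 'banana\napple\ncherry\n' > /tmp/fruit.txt
wc -l < /tmp/fruit.txt            # count lines in the file
sort /tmp/fruit.txt | head -n 1   # first line alphabetically
grep an /tmp/fruit.txt            # only lines containing "an"
```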
10. Linux References
§ HPC workshop
• Introduction to Linux, PBS & the HPC Cluster
§ Lynda video (access via USC)
• https://www.lynda.com/Linux-tutorials/Learn-Linux-Command-Line-Basics/435539-2.html
§ Software Carpentry tutorial
• http://swcarpentry.github.io/shell-novice/
§ O'Reilly Books directory
• http://www.linuxdevcenter.com/cmd/
§ Many, many websites… use search
Appendix I - DDDT!*
§ Don't share your password
• Goes without saying!
§ Don't set permissions so others can write to your directory
• Makes it easy for others to overwrite and delete your files
• Create a group-shared directory for your group, instead
§ Don't run compute-intensive jobs on head nodes
• Use a compute node. Everyone is watching ($ top).
§ Don't read/write/copy zillions of tiny files
• Use a database (lmdb, mysql) or a large file to combine data.
*Don't Do Dumb Things!