DeepSense Computing Platform
Agenda
• System Overview
• File Systems
• Data Transfer
• IBM Spectrum LSF
• Conductor with Spark
• Technical Support
System Overview
• Compute Nodes
  • 20 large-memory nodes
    - 20 cores, 512 GB memory
  • 4 huge-memory nodes
    - 20 cores, 1 TB memory
  • 10 GPU nodes
    - 2x P100, 20 cores, 512 GB memory
• Operating System
  • Red Hat Enterprise Linux 7.5
Heterogeneous Computing with GPU
NVIDIA P100 GPU
• Cores: 3584
• Memory: 16 GB HBM2
• Memory bandwidth: 720 GB/s
• FLOPS (single precision): 9.3 TFLOPS
• FLOPS (double precision): 4.7 TFLOPS
• FLOPS (half precision): 18.7 TFLOPS
• Power consumption: 250 W
• CUDA compute capability: 6.0
IBM Power 8 with NVIDIA P100
CPU-GPU Systems Connected via PCI-e
NVLink Enables Fast Unified Memory Access between CPU and GPU memory
Applications
• Domain
  - Ocean data products, Ship Building, Fisheries and Aquaculture, Seaport and Logistics, Security and Defense, Marine Risk, …
• Data Source
  - Sensor logs, text, image, video, web traffic, geospatial, AIS, …
• Analytics
  - Image processing, Time-Series, Predictive Analytics, Machine Learning, Deep Learning, Distributed Computing, …
File System
File System  Directory                Purpose      Quota                         Backed up?  Purged?
Home         /dshome/subdir/username  development  1 TB, 500k files per user     yes         no
Data         /data/projectname        development  2 TB, 500k files per project  yes         no
Scratch      /scratch/username        computation  2 TB, 1M files per user       no          yes
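
The quotas above limit file count as well as size. A minimal sketch of how a user might check a directory against a file-count limit, assuming standard `find`, `wc` and `du`. The helper names `count_files` and `check_file_quota` are hypothetical and not part of any DeepSense tooling; the limits in the usage comments come from the table above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not DeepSense tools): compare the number of files
# in a directory against a file-count limit taken from the quota table.

count_files() {
    # Count regular files under the given directory, recursively.
    # File names containing newlines would be miscounted.
    find "$1" -type f 2>/dev/null | wc -l | tr -d ' '
}

check_file_quota() {
    # usage: check_file_quota <directory> <max_files>
    local dir="$1"
    local limit="$2"
    local n
    n="$(count_files "$dir")"
    if [ "$n" -gt "$limit" ]; then
        echo "OVER: $dir has $n files (limit $limit)"
        return 1
    fi
    echo "OK: $dir has $n files (limit $limit)"
}

# Example (scratch allows 1M files per user):
#   check_file_quota "/scratch/$USER" 1000000
#   du -sh "/scratch/$USER"    # total size, to compare against the 2 TB limit
```

Counting files on a large tree can take a while, so this is better run occasionally than in a loop.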
Data Transfer
• Two protocol nodes
  o protocol1.deepsense.ca
  o protocol2.deepsense.ca
• Connect using Samba:
  - smb://protocol1.deepsense.ca
  - use your DeepSense account
Data Backups
• DeepSense is a platform for data analytics. It is not meant for long-term storage.
  • Users should ensure their original data is backed up at their own site.
• We do have daily backups
  • /dshome, /data, /software are backed up every evening
  • The backup keeps 7 versions of files
  • Once a file is deleted, its backup is kept for 30 days, after which it is no longer accessible
  • If you need to restore a file, please let us know
IBM Spectrum LSF
• Workload management platform
• Maximize utilization for distributed high-performance computing
• GPU Support
• Execute batch/interactive jobs
• Containerized workloads
LSF Access and Login
• User account => DeepSense account
• Login nodes:
  • login1.deepsense.ca
  • login2.deepsense.ca
• Example connection:
  • ssh <username>@login1.deepsense.ca
  • on a Mac or Linux client, use a terminal
  • on a Windows client, use PuTTY or MobaXterm
• If you are off campus, you need a Dalhousie VPN connection
Submitting Jobs to LSF
• Development/test jobs
  • For testing/dev use the login nodes
  • Shared with all users
• Batch jobs
  • Command: bsub
  • With bsub options, specify: input/output files, GPU option, CPU/memory limits, etc.
• Interactive jobs
  • Command: bsub -I
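
A batch submission along these lines could look like the sketch below. It writes a job script with embedded `#BSUB` directives and submits it only if `bsub` is on the path. The job name, file names, resource values and the `train.py` workload are illustrative assumptions, not site defaults. The units of `-M` depend on the site's LSF configuration, and the `-gpu` option assumes the GPU syntax is enabled on the cluster, so check the platform wiki before relying on either.

```shell
#!/usr/bin/env bash
# Write a sample LSF batch script. All names and values are illustrative.
cat > myjob.lsf <<'EOF'
#!/bin/bash
# Job name
#BSUB -J myjob
# Number of job slots (cores)
#BSUB -n 4
# Run time limit, [hours:]minutes
#BSUB -W 01:00
# Memory limit; units depend on the site's LSF configuration
#BSUB -M 8000000
# Standard output and error files; %J expands to the job ID
#BSUB -o myjob.%J.out
#BSUB -e myjob.%J.err
# Request one GPU (assumes the -gpu syntax is enabled on the cluster)
#BSUB -gpu "num=1"

# Hypothetical workload
python train.py
EOF

# bsub reads the #BSUB directives when the script is passed on stdin.
if command -v bsub >/dev/null 2>&1; then
    bsub < myjob.lsf
else
    echo "bsub not found; wrote myjob.lsf without submitting"
fi

# Interactive shell on a compute node (-Is adds a pseudo-terminal to -I):
#   bsub -Is -n 1 bash
```

Keeping the options in the script rather than on the command line makes a run easier to reproduce later.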
LSF Monitoring/Cancelling jobs
• Check running jobs
  o bjobs -l
  o bjobs -l <jobid> (for job details)
• Control job execution
  o Suspend job: bstop <jobid>
  o Resume job: bresume <jobid>
  o Kill job: bkill <jobid>
• Check available hosts
  o bhosts
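
The commands above all take a job ID, which `bsub` prints when a job is accepted. A minimal sketch for capturing that ID in a script, assuming the usual confirmation line of the form `Job <12345> is submitted to default queue <normal>.` The helper `parse_job_id` and the script name `myjob.lsf` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the numeric job ID from bsub's confirmation
# line, read from stdin. Prints nothing if no such line is found.
parse_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Typical use on a login node (requires LSF):
#   jobid="$(bsub < myjob.lsf | parse_job_id)"
#   bjobs -l "$jobid"     # detailed status
#   bstop "$jobid"        # suspend
#   bresume "$jobid"      # resume
#   bkill "$jobid"        # kill
```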
IBM Spectrum Conductor with Spark (CWS)
• Spark integration and lifecycle management platform
• Support for multiple Spark versions
• Integrated application platform
• Notebooks, Deep Learning packages
• Simplified administration
Accessing CWS
• Management Console
  o Go to URL: https://ds-mgm-02.deepsense.cs.dal.ca:8443
  o Log in using your DS account
• Command Line Option
  o From a login node, ssh to:
    ssh ds-cmhm-02.deepsense.cs.dal.ca
  o Source the environment
  o Log in to CWS using your DS account
CWS - Spark Instance Group
• From the dashboard go to:
  o Workload -> Spark -> Spark Instance Group
  o Specify name, directory and user
• Choose Spark version
  o Spark 2.3.1, Spark 2.2.0, Spark 2.1.1, Spark 1.6.1
• Optional: choose Notebook
Technical support
• Documentation
  o DeepSense computing platform wiki page
    https://docs.deepsense.ca
  o IBM Knowledge Center
    https://www.ibm.com/support/knowledgecenter/
• Troubleshooting/technical questions
  o Send email to [email protected]
Questions?