JEREMY EDER - RED HAT PERFORMANCE ENGINEERING
Enabling GPU-as-a-Service Providers with Red Hat OpenShift
@jeremyeder, Senior Principal Software Engineer, Red Hat
March 2018
Agenda
● OpenShift Cluster Overview
● Infrastructure Abstraction
● High Performance Features
● GPU Overview
Community Powered Innovation
What does an OpenShift Cluster look like?
[Diagram: a Red Hat Enterprise Linux master (API/authentication, data store, scheduler, health/scaling) alongside RHEL nodes running containers (C), with a service layer, a routing layer, persistent storage, and a registry. Underlying infrastructure: physical, virtual, private, public, hybrid.]
Abstract away any infrastructure
[Diagram: service layer and routing layer over physical, virtual, private, public, and hybrid infrastructure]
● Bare Metal
● RHV
● OpenStack
● VMware
● GCE
● Azure
● AWS
● BYO nodes...
One Platform to...
OpenShift is the single platform to run any application:
● Old or new
● Monolithic/Microservice
[Diagram labels: Big Data, NFV, FSI, Animation, ISVs, HPC, Machine Learning]
High Performance RFEs by Vertical
Feature                                   FSI    NFV    ISV    BD/ML  ANIM   HPC
NUMA (cpuset.cpus and cpuset.mems)        Yes    Yes    Yes    Maybe  Maybe  Yes
Device Passthrough (NIC/Disk/GPU etc...)  Yes    Yes    Yes    Maybe  Maybe  Yes
sysctl Support (non-namespaced too)       Yes    Yes    Yes    Yes    Yes    Yes
Separation of control- and data-plane     Yes    Yes    Yes    Yes    Yes    Yes
Node “fitness” (extended health info)     Yes    Yes    Maybe  Maybe  Maybe  Yes
Multi-homed pods                          Yes    Yes    Maybe  Yes    Yes    Yes
Kernel Modules (DKMS-ish)                 Yes    Yes    Maybe  Maybe  Yes    Maybe
Hugepages                                 Yes    Yes    Yes    Yes    Maybe  Maybe
Why do this?
Enable containerization of Infrastructure Software
● Software-defined Storage and Networking
● Packet switching and routing tiers
● Multi-workloads (very different) within a single cluster
  ○ Layered schedulers (HPC/grid)
● Many more...
Enable containerization of Red Hat’s products
● Gluster/Container Native Storage
● Ceph
● OpenStack
● radanalytics
● KubeVirt
Upstream First: Kubernetes Working Groups
● Resource Management Working Group
  ○ Features Delivered
    ■ Device Plugins (GPU/Bypass/FPGA)
    ■ CPU Manager (exclusive cores)
    ■ Huge Pages Support
  ○ Extensive Roadmap
● Intel, IBM, Google, NVIDIA, Red Hat, many more...
Upstream First: Kubernetes Working Groups
● Network Plumbing Working Group
  ○ Formalized Dec 2017
● Goal is to implement an out-of-tree, pseudo-standard collection of CRDs for multiple networks, owned by sig-network
● Separate control- and data-plane, Overlapping IPs, Fast Data-plane
● IBM, Intel, Red Hat, Huawei, Cisco, Tigera...at least.
GPU CLUSTER TOPOLOGY
OpenShift Cluster Topology
[Diagram: Control Plane (three "master and etcd" hosts), Infrastructure (three "registry and router" hosts), a load balancer (LB), and Compute Nodes and Storage Tier]
OpenShift Cluster Topology
Compute Nodes...
● How to enable software to take advantage of “special” hardware
● Create Node Pools
  ○ Mark them as “special”
  ○ Taints/Tolerations
  ○ ExtendedResourceToleration
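The taint/toleration mechanism named on this slide can be sketched in a few lines of Python. This is a simplified model of the matching rules, not the scheduler's code; the taint key `nvidia.com/gpu`, its value, and the function names are assumptions chosen for illustration. The ExtendedResourceToleration admission plugin, when enabled, adds a toleration of this kind automatically to pods that request an extended resource whose name is used as the taint key.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if one toleration matches one taint (simplified model)."""
    # An empty effect on the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    key = toleration.get("key", "")
    if toleration.get("operator", "Equal") == "Exists":
        # Exists ignores the value; an empty key matches every taint key.
        return key == "" or key == taint["key"]
    # Equal (the default operator): key and value must both match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def schedulable(pod_tolerations: list, node_taints: list) -> bool:
    """A pod may land on a node only if every blocking taint is tolerated."""
    # PreferNoSchedule is a soft preference, so it never blocks placement.
    blocking = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in pod_tolerations) for taint in blocking)


# Hypothetical taint marking a node pool as "special" (GPU) hardware.
gpu_taint = {"key": "nvidia.com/gpu", "value": "true", "effect": "NoSchedule"}
# Hypothetical toleration carried by a GPU workload.
gpu_toleration = {"key": "nvidia.com/gpu", "operator": "Exists", "effect": "NoSchedule"}

print(schedulable([], [gpu_taint]))                # ordinary pod is kept off the pool
print(schedulable([gpu_toleration], [gpu_taint]))  # GPU pod is allowed on the pool
```

Tainting the pool keeps ordinary pods away from expensive nodes; the toleration only permits placement, so a GPU pod still needs a resource request or node selector to be steered onto the pool.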
OpenShift Cluster Topology
Compute Nodes...
● How to enable software to take advantage of “special” hardware
● Tune/Configure the OS
  ○ Tuned Profiles
  ○ CPU Isolation
  ○ sysctls
OpenShift Cluster Topology
In OpenShift, there are three “types” of sysctls
Safe
● Enabled by default
● kernel.shm_rmid_forced
● net.ipv4.ip_local_port_range
● net.ipv4.tcp_syncookies
Unsafe
● Experimental Kubelet Flag
● kernel.sem*
● kernel.shm*
● kernel.msg*
● fs.mqueue.*
● net.*
Node-level
● Can’t set from a pod
● Potentially affects other pods
● Many interesting sysctls
● Use TuneD
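The three categories above can be expressed as a small lookup. The following Python sketch simply mirrors the lists on this slide (as presented in March 2018, not necessarily what a current release allows); the function name `classify_sysctl` and the rule that anything unlisted is treated as node-level are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Names and patterns copied from the slide.
SAFE = {
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_syncookies",
}
UNSAFE_PATTERNS = ["kernel.sem*", "kernel.shm*", "kernel.msg*", "fs.mqueue.*", "net.*"]


def classify_sysctl(name: str) -> str:
    """Classify a sysctl as 'safe', 'unsafe', or 'node-level'."""
    # Check the safe list first: its entries also match the broader
    # unsafe patterns (for example kernel.shm* and net.*).
    if name in SAFE:
        return "safe"
    if any(fnmatchcase(name, pattern) for pattern in UNSAFE_PATTERNS):
        return "unsafe"
    # Assumption: everything else cannot be set from a pod.
    return "node-level"


for name in ("kernel.shm_rmid_forced", "kernel.shmmax", "net.core.somaxconn", "vm.swappiness"):
    print(name, "->", classify_sysctl(name))
```

The ordering matters: checking the safe set before the wildcard patterns is what keeps `kernel.shm_rmid_forced` from being swallowed by `kernel.shm*`.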
OpenShift Cluster Topology
Compute Nodes...
● How to enable software to take advantage of “special” hardware
● Optimize your workload
  ○ Dedicate CPU cores
  ○ Consume hugepages
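Dedicating CPU cores relies on the Kubernetes CPU Manager, whose static policy hands out exclusive cores only to containers in Guaranteed QoS pods that request a whole number of CPUs. The Python sketch below checks that condition for a single-container pod. It is a simplified illustration: the function names and the example resource values are assumptions, and memory quantities are compared as plain strings rather than parsed.

```python
def cpu_millis(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '2', '0.5', or '1500m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def gets_exclusive_cores(container: dict) -> bool:
    """True if a single-container pod would qualify for exclusive cores."""
    resources = container.get("resources", {})
    limits = resources.get("limits", {})
    requests = resources.get("requests", {})
    # Guaranteed QoS: cpu and memory limits are set and equal the requests
    # (a missing request defaults to the limit).
    if "cpu" not in limits or "memory" not in limits:
        return False
    if cpu_millis(requests.get("cpu", limits["cpu"])) != cpu_millis(limits["cpu"]):
        return False
    if requests.get("memory", limits["memory"]) != limits["memory"]:
        return False
    # Exclusive cores require a whole number of CPUs, at least one.
    millis = cpu_millis(limits["cpu"])
    return millis >= 1000 and millis % 1000 == 0


# Hypothetical container that dedicates 4 cores and consumes 2 MiB hugepages.
pinned = {"resources": {"limits": {"cpu": "4", "memory": "8Gi", "hugepages-2Mi": "1Gi"}}}
# Fractional CPU: Guaranteed QoS, but served from the shared pool.
shared = {"resources": {"limits": {"cpu": "1500m", "memory": "8Gi"}}}

print(gets_exclusive_cores(pinned))
print(gets_exclusive_cores(shared))
```

Hugepages are requested the same way as other resources, through a `hugepages-<size>` entry in the limits, as in the `pinned` example.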
OpenShift Cluster Topology
Compute Nodes...
● How to enable software to take advantage of “special” hardware
● Enable the Hardware
  ○ Install drivers
  ○ Deploy Device Plugin
OpenShift Cluster Topology
Compute Nodes...
● How to enable software to take advantage of “special” hardware
● Consume the Device
  ○ KubeFlow Template deployment
Kubernetes Deployment for STAC-A2
● All-in-One Kubernetes Installation
● (hack/local-up-cluster.sh)
● Node labeled
● Containers:
  ○ RHEL7+CUDA9
  ○ RHEL7+CUDA9+DEVICE-PLUGIN
  ○ RHEL7+CUDA9+STAC-A2
● CUDA 9
● 8 x NVIDIA Tesla V100 (Volta) GPUs
● HPE Apollo 6500 w/XL270d Gen9
● Red Hat Enterprise Linux 7.4
● Kubernetes 1.8 (setup info)
● nvidia-smi --applications-clocks=877,1380
● https://rhelblog.redhat.com/2017/11/21/red-hat-and-partners-deliver-new-performance-records-on-prominent-risk-analytics-benchmark/
● https://news.developer.nvidia.com/a-new-stac-a2-record/
Kubernetes Deployment for STAC-A2
[Diagram: kubectl create, Kube Scheduler, Kubelet, Device Plugin (daemonset), eight Volta GPUs, and the Benchmark (pod) with resources: limits: nvidia.com/gpu: 8]
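The benchmark pod's request for `nvidia.com/gpu: 8` can be written out as a complete manifest. The Python sketch below builds one and prints it as JSON; only the resource limit comes from the slide, while the pod name, container name, image reference, and the `build_gpu_pod` helper are hypothetical placeholders.

```python
import json


def build_gpu_pod(name: str, image: str, gpus: int) -> dict:
    """Build a minimal pod manifest that asks the device plugin for `gpus` GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",  # run the benchmark once, do not restart it
            "containers": [
                {
                    "name": "benchmark",
                    "image": image,
                    # Extended resources are requested through limits;
                    # the value must be a whole number of devices.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Hypothetical name and image; the slide only lists RHEL7+CUDA9+STAC-A2 as a container.
manifest = build_gpu_pod(
    "stac-a2-benchmark",
    "registry.example.com/rhel7-cuda9-stac-a2:latest",
    8,
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl create -f -`. The scheduler then only considers nodes whose device plugin advertises at least eight free GPUs.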
Recent GPU-related work on OpenShift
● Early KubeFlow involvement
● radanalytics templates for ML-workflow on OpenShift
● Machine-Learning OpenShift Commons
● Demo Repositories
  ○ https://github.com/zvonkok/nvidia-k8s
  ○ https://github.com/redhat-performance/openshift-psap
THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews
Commoditizing GPU-as-a-Service Providers with Red Hat OpenShift
Tuesday, Mar 27, 1:00 PM - 1:25 PM, Room 210E
Red Hat OpenShift Container Platform, with Kubernetes at its core, can play an important role in building flexible hybrid cloud infrastructure. By abstracting infrastructure away from developers, workloads become portable across any cloud. With NVIDIA Volta GPUs now available in every public cloud [1], as well as from every computer maker, an abstraction layer like OpenShift becomes even more valuable. Through demonstrations, this session will introduce you to declarative models for consuming GPUs via OpenShift, as well as the two-level scheduling decisions that provide fast placement and stability.