Enabling GPU-as-a-Service Providers with Red Hat OpenShift

JEREMY EDER - RED HAT PERFORMANCE ENGINEERING
@jeremyeder
Senior Principal Software Engineer, Red Hat
March 2018


Agenda

● OpenShift Cluster Overview
● Infrastructure Abstraction
● High Performance Features
● GPU Overview

Community Powered Innovation

What does an OpenShift Cluster look like?

[Diagram. Labels: master on Red Hat Enterprise Linux (API/authentication, data store, scheduler, health/scaling); RHEL nodes, each running containers (C); service layer; routing layer; persistent storage; registry; infrastructure: physical, virtual, private, public, hybrid.]

Abstract away any infrastructure

[Diagram. Labels: service layer; routing layer; infrastructure: physical, virtual, private, public, hybrid.]

● Bare Metal
● RHV
● OpenStack
● VMware
● GCE
● Azure
● AWS
● BYO nodes...

One Platform to...

OpenShift is the single platform to run any application:
● Old or new
● Monolithic/Microservice

[Diagram. Labels: Big Data, NFV, FSI, Animation, ISVs, HPC, Machine Learning.]

High Performance RFEs by Vertical

Feature                                    FSI   NFV   ISV    BD/ML  ANIM   HPC
NUMA (cpuset.cpus and cpuset.mems)         Yes   Yes   Yes    Maybe  Maybe  Yes
Device Passthrough (NIC/Disk/GPU etc...)   Yes   Yes   Yes    Maybe  Maybe  Yes
sysctl Support (non-namespaced too)        Yes   Yes   Yes    Yes    Yes    Yes
Separation of control- and data-plane      Yes   Yes   Yes    Yes    Yes    Yes
Node “fitness” (extended health info)      Yes   Yes   Maybe  Maybe  Maybe  Yes
Multi-homed pods                           Yes   Yes   Maybe  Yes    Yes    Yes
Kernel Modules (DKMS-ish)                  Yes   Yes   Maybe  Maybe  Yes    Maybe
Hugepages                                  Yes   Yes   Yes    Yes    Maybe  Maybe

Why do this?

Enable containerization of Infrastructure Software
● Software-defined Storage and Networking
● Packet switching and routing tiers
● Multi-workloads (very different) within a single cluster
  ○ Layered schedulers (HPC/grid)
● Many more...

Enable containerization of Red Hat’s products

● Gluster/Container Native Storage
● Ceph
● OpenStack
● radanalytics
● KubeVirt

Upstream First: Kubernetes Working Groups

● Resource Management Working Group
  ○ Features Delivered
    ■ Device Plugins (GPU/Bypass/FPGA)
    ■ CPU Manager (exclusive cores)
    ■ Huge Pages Support
  ○ Extensive Roadmap
● Intel, IBM, Google, NVIDIA, Red Hat, many more...

Upstream First: Kubernetes Working Groups

● Network Plumbing Working Group
  ○ Formalized Dec 2017
● Goal is to implement an out-of-tree, pseudo-standard collection of CRDs for multiple networks, owned by sig-network
● Separate control- and data-plane, Overlapping IPs, Fast Data-plane
● IBM, Intel, Red Hat, Huawei, Cisco, Tigera...at least.

GPU CLUSTER TOPOLOGY

OpenShift Cluster Topology

[Diagram. Labels: Control Plane (three "master and etcd" nodes); Infrastructure (three "registry and router" nodes and a load balancer, LB); Compute Nodes and Storage Tier.]

OpenShift Cluster Topology

Compute Nodes...

● How to enable software to take advantage of “special” hardware
● Create Node Pools
  ○ Mark them as “special”
  ○ Taints/Tolerations
  ○ ExtendedResourceToleration
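The taint and toleration mechanism named above can be sketched in a few lines of Python. This is a simplified model, not Kubernetes code: the `Taint` and `Toleration` classes and every function name are hypothetical, and the rule used to recognize an extended resource (any name containing "/" outside `kubernetes.io/`) is an assumption made for brevity. The last function imitates the idea behind the ExtendedResourceToleration admission plugin, which adds a toleration keyed on each extended resource a pod requests.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str           # e.g. "NoSchedule"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""
    operator: str = "Equal"   # "Equal" or "Exists"
    value: str = ""
    effect: str = ""          # an empty effect matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if a single toleration matches a single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    # "Equal": key and value must both match; an empty key only works with "Exists".
    return bool(tol.key) and tol.value == taint.value


def schedulable(taints: List[Taint], tolerations: List[Toleration]) -> bool:
    """A pod fits a node only if every NoSchedule/NoExecute taint is tolerated."""
    blocking = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in blocking)


def add_extended_resource_tolerations(limits: dict,
                                      tolerations: List[Toleration]) -> List[Toleration]:
    """Add one "Exists" toleration per extended resource in the pod's limits."""
    extra = [
        Toleration(key=name, operator="Exists", effect="NoSchedule")
        for name in sorted(limits)
        # Assumption: treat any domain-prefixed name outside kubernetes.io/ as extended.
        if "/" in name and not name.startswith("kubernetes.io/")
    ]
    return tolerations + extra


# A node pool marked as "special" by tainting it with the resource name.
gpu_node = [Taint(key="nvidia.com/gpu", effect="NoSchedule")]
plain_pod: List[Toleration] = []
gpu_pod = add_extended_resource_tolerations({"nvidia.com/gpu": 1}, [])

print(schedulable(gpu_node, plain_pod))  # False
print(schedulable(gpu_node, gpu_pod))    # True
```

Keying the taint on the resource name keeps ordinary pods off the GPU pool without any per-pod configuration, while pods that request the resource pick up the matching toleration automatically.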

OpenShift Cluster Topology

Compute Nodes...

● How to enable software to take advantage of “special” hardware
● Tune/Configure the OS
  ○ Tuned Profiles
  ○ CPU Isolation
  ○ sysctls

OpenShift Cluster Topology

In OpenShift, there are three “types” of sysctls

Safe
● Enabled by default
● kernel.shm_rmid_forced
● net.ipv4.ip_local_port_range
● net.ipv4.tcp_syncookies

Unsafe
● Experimental Kubelet Flag
● kernel.sem*
● kernel.shm*
● kernel.msg*
● fs.mqueue.*
● net.*

Node-level
● Can’t set from a pod
● Potentially affects other pods
● Many interesting sysctls
● Use TuneD
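The three categories can be expressed as a small lookup. The sketch below is illustrative only: the function name `classify_sysctl` is hypothetical, the name lists are copied from the slide rather than from any particular OpenShift release, and anything matching neither list is assumed to be node-level.

```python
from fnmatch import fnmatch

# Safe sysctls as listed on the slide (enabled by default).
SAFE = {
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_syncookies",
}

# Unsafe patterns as listed on the slide (require the experimental kubelet flag).
UNSAFE_PATTERNS = ["kernel.sem*", "kernel.shm*", "kernel.msg*", "fs.mqueue.*", "net.*"]


def classify_sysctl(name: str) -> str:
    """Classify a sysctl name as "safe", "unsafe", or "node-level"."""
    # Check the safe set first: its entries also match the broader unsafe patterns.
    if name in SAFE:
        return "safe"
    if any(fnmatch(name, pattern) for pattern in UNSAFE_PATTERNS):
        return "unsafe"
    # Assumption: everything else cannot be set from a pod and belongs in a TuneD profile.
    return "node-level"


for name in ("net.ipv4.tcp_syncookies", "kernel.shmmax", "net.core.somaxconn", "vm.swappiness"):
    print(f"{name}: {classify_sysctl(name)}")
```

The order of the checks matters because `kernel.shm_rmid_forced` would otherwise be caught by `kernel.shm*`.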

OpenShift Cluster Topology

Compute Nodes...

● How to enable software to take advantage of “special” hardware
● Optimize your workload
  ○ Dedicate CPU cores
  ○ Consume hugepages
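As a rough illustration of what "dedicate CPU cores" asks of a workload, the sketch below checks whether a container's resource stanza would qualify for exclusive cores under the CPU Manager's static policy: requests equal to limits for CPU and memory, and a whole number of CPUs. It assumes a single-container pod and compares memory quantities as plain strings; `parse_cpu` and `gets_exclusive_cores` are hypothetical helper names, and the resource values are made up.

```python
from decimal import Decimal


def parse_cpu(quantity: str) -> Decimal:
    """Parse a CPU quantity such as "2", "0.5" or "500m" into cores."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def gets_exclusive_cores(resources: dict) -> bool:
    """Simplified eligibility check for a single-container pod."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    if "cpu" not in limits or "memory" not in limits:
        return False
    # A request left unset defaults to the limit.
    if parse_cpu(requests.get("cpu", limits["cpu"])) != parse_cpu(limits["cpu"]):
        return False
    # Simplification: memory quantities are compared as strings, not parsed.
    if requests.get("memory", limits["memory"]) != limits["memory"]:
        return False
    cpu = parse_cpu(limits["cpu"])
    return cpu >= 1 and cpu == cpu.to_integral_value()


# Hypothetical stanzas; hugepages are requested through the hugepages-2Mi resource.
dedicated = {"limits": {"cpu": "4", "memory": "8Gi", "hugepages-2Mi": "1Gi"}}
shared = {
    "requests": {"cpu": "500m", "memory": "1Gi"},
    "limits": {"cpu": "2", "memory": "1Gi"},
}

print(gets_exclusive_cores(dedicated))  # True
print(gets_exclusive_cores(shared))     # False
```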

OpenShift Cluster Topology

Compute Nodes...

● How to enable software to take advantage of “special” hardware
● Enable the Hardware
  ○ Install drivers
  ○ Deploy Device Plugin
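Once a device plugin is running, placement happens at two levels: the scheduler compares the number of devices a pod wants with the number a node has free, and the node then picks concrete devices for the pod. The toy model below sketches that split. It is not the device plugin API; the `Node` class, the function names, and the device IDs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    device_ids: List[str]   # devices reported by the device plugin on this node
    allocated: Dict[str, List[str]] = field(default_factory=dict)  # pod -> device IDs

    def free_devices(self) -> List[str]:
        used = {d for ids in self.allocated.values() for d in ids}
        return [d for d in self.device_ids if d not in used]


def pick_node(nodes: List[Node], gpus_wanted: int) -> Optional[Node]:
    """Level 1 (scheduler): compare counts only, never individual devices."""
    for node in nodes:
        if len(node.free_devices()) >= gpus_wanted:
            return node
    return None


def allocate(node: Node, pod: str, gpus_wanted: int) -> List[str]:
    """Level 2 (node): choose concrete device IDs for the pod."""
    chosen = node.free_devices()[:gpus_wanted]
    if len(chosen) < gpus_wanted:
        raise RuntimeError("not enough free devices")
    node.allocated[pod] = chosen
    return chosen


nodes = [Node("cpu-only", []), Node("gpu-node", [f"GPU-{i}" for i in range(8)])]
target = pick_node(nodes, 8)
devices = allocate(target, "benchmark", 8)
print(target.name, devices)
print(pick_node(nodes, 1))  # None: all eight GPUs are already allocated
```

Keeping the first level count-based means the scheduler needs no knowledge of the hardware, while device-specific choices stay on the node that owns the devices.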

OpenShift Cluster Topology

Compute Nodes...

● How to enable software to take advantage of “special” hardware
● Consume the Device
  ○ KubeFlow Template deployment

Kubernetes Deployment for STAC-A2

● All-in-One Kubernetes Installation (hack/local-up-cluster.sh)
● Node labeled
● Containers:
  ○ RHEL7+CUDA9
  ○ RHEL7+CUDA9+DEVICE-PLUGIN
  ○ RHEL7+CUDA9+STAC-A2

● CUDA 9
● 8 x NVIDIA Tesla V100 (Volta) GPUs
● HPE Apollo 6500 w/XL270d Gen9
● Red Hat Enterprise Linux 7.4
● Kubernetes 1.8 (setup info)
● nvidia-smi --applications-clocks=877,1380

● https://rhelblog.redhat.com/2017/11/21/red-hat-and-partners-deliver-new-performance-records-on-prominent-risk-analytics-benchmark/
● https://news.developer.nvidia.com/a-new-stac-a2-record/

Kubernetes Deployment for STAC-A2

[Diagram. Labels: kubectl create; Kube Scheduler; Kubelet; Device Plugin (daemonset); eight Volta GPUs; Benchmark (pod).]

Benchmark (pod)
resources:
  limits:
    nvidia.com/gpu: 8
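The benchmark pod's GPU request can be written out as a complete manifest. The sketch below builds one as a Python dictionary and prints it as JSON; only the `nvidia.com/gpu: 8` limit comes from the slide. The pod name, the image reference, the restart policy, and the helper name `gpu_pod_manifest` are placeholders chosen for illustration.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int) -> dict:
    """Build a minimal Pod manifest that asks for `gpus` NVIDIA GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",  # assumption: a benchmark runs once
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # The declarative GPU request shown on the slide.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# The image reference is a placeholder, not the image used in the talk.
manifest = gpu_pod_manifest("benchmark", "example.invalid/rhel7-cuda9-stac-a2:latest", 8)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl create -f -` on a cluster where the device plugin is deployed.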


Recent GPU-related work on OpenShift

● Early KubeFlow involvement
● radanalytics templates for ML-workflow on OpenShift
● Machine-Learning OpenShift Commons
● Demo Repositories
  ○ https://github.com/zvonkok/nvidia-k8s
  ○ https://github.com/redhat-performance/openshift-psap

THANK YOU

plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews

Commoditizing GPU-as-a-Service Providers with Red Hat OpenShift
Tuesday, Mar 27, 1:00 PM - 1:25 PM, Room 210E

Red Hat OpenShift Container Platform, with Kubernetes at its core, can play an important role in building flexible hybrid cloud infrastructure. By abstracting infrastructure away from developers, workloads become portable across any cloud. With NVIDIA Volta GPUs now available in every public cloud [1], as well as from every computer maker, an abstraction library like OpenShift becomes even more valuable. Through demonstrations, this session will introduce you to declarative models for consuming GPUs via OpenShift, as well as the two-level scheduling decisions that provide fast placement and stability.