Making workload nomadic when accelerated


Transcript of Making workload nomadic when accelerated

Page 1: Making workload nomadic when accelerated

Making Workload Nomadic When Accelerated: Nomad Project Introduction

Zhipeng Huang - Huawei
Michele Paolino - Virtual Open Systems
Hui Deng - China Mobile

Page 2: Making workload nomadic when accelerated

Outline

❏ Motivation
❏ Interesting Use Cases
❏ OPNFV Requirements
❏ Nomad Introduction
❏ Future Plan

Page 3: Making workload nomadic when accelerated

Motivation

Page 4: Making workload nomadic when accelerated

Motivation - Why?

OpenStack currently lacks a set of features with respect to portable hardware acceleration*:

● Accelerator life-cycle management
● Accelerator resource discovery
● Migration support for reconfigurable FPGAs, GPUs, and other accelerators; ease of use, etc.

* these features have been highlighted in the OPNFV OpenStack GAP Analysis document.

The main purpose of Nomad is to implement these features, providing all OpenStack users with a portable, powerful, and easy way to use hardware acceleration.

Page 5: Making workload nomadic when accelerated

Motivation - What?

Three models, depending on how the accelerator relates to a more traditional CPU as part of the workload (a rough sketch of these placement classes follows the list):

Very Close - in the CPU chipset or on the board (e.g., Intel’s Skylake staging)

● Suitable for the offload model, and for inline if the associated interface is also in place
● With optimal sharing of resources, can provide excellent processing gains
● Limited in horizontal scale, but can be leveraged as a unit of management like the associated CPU

Nearby - attached via a bus or similar (e.g., PCIe or within a chassis assembly)

● Suitable for both offload and inline models
● Susceptible to negative impact if the interface across the bus is chatty
● Larger scale possible, particularly in chassis configurations

Far - reachable by TCP/IP or other communication protocol

● Suitable for offload, and for inline if latency is not a concern
● Largest horizontal scale flexibility
● Much more suited to a standalone function model
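As a rough illustration of the three placement models above, here is a minimal Python sketch of how an accelerator’s placement and its fit for the offload/inline models might be represented. The class names, fields, and the latency threshold are assumptions for illustration, not Nomad code.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    VERY_CLOSE = "very_close"   # in the CPU chipset or on the board
    NEARBY = "nearby"           # attached via PCIe or within a chassis
    FAR = "far"                 # reachable over TCP/IP or another protocol


@dataclass
class Accelerator:
    name: str
    placement: Placement
    latency_us: float           # rough interconnect latency (illustrative)

    def suits_offload(self) -> bool:
        # All three placement classes can serve the offload model.
        return True

    def suits_inline(self, max_latency_us: float = 50.0) -> bool:
        # Inline processing sits in the data path, so Far placements only
        # qualify when the application tolerates the extra latency.
        if self.placement is Placement.FAR:
            return self.latency_us <= max_latency_us
        return True


if __name__ == "__main__":
    fpga = Accelerator("pcie-fpga-0", Placement.NEARBY, latency_us=5.0)
    remote = Accelerator("rack-gpu-3", Placement.FAR, latency_us=400.0)
    print(fpga.suits_inline(), remote.suits_inline())   # True False
```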

Page 6: Making workload nomadic when accelerated

Motivation - How?

Two candidate approaches (a hypothetical sketch of the dedicated-management interaction follows the comparison):

Nova extension only

● Target: best performance, no portability
● Accelerator access: direct management
● Pros: direct interaction between the compute node and the accelerators could provide slightly better performance
● Cons: portability/migration hard to support; code complexity: specific code needed for each accelerator type, with impact on the project’s performance, security, maintainability, etc.

Nomad (dedicated management function)

● Target: best performance/portability trade-off
● Accelerator access: management through a portability layer
● Pros: resource discovery, scheduling, setup, etc.; support for automatic, integrated acceleration management for accelerated VM migration; hardware portability and independence
● Cons: the accelerator allocation phase might take time, as a handshake procedure has to be put in place; scalability issues
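To make the "dedicated management function" column concrete, the sketch below shows what a client-side allocation through a portability layer might look like, including the handshake/polling step listed under the cons. The endpoint, payload fields, and service URL are hypothetical; this is not an existing Nomad API.

```python
import time
import requests

NOMAD_URL = "http://nomad-api.example:9666/v1"   # illustrative endpoint


def allocate_accelerator(capability: str, instance_id: str) -> dict:
    """Request any accelerator that advertises `capability` for an instance."""
    resp = requests.post(f"{NOMAD_URL}/accelerators/allocate",
                         json={"capability": capability,
                               "instance": instance_id})
    resp.raise_for_status()
    allocation = resp.json()

    # The allocation phase involves a handshake with the accelerator backend,
    # so poll until the device (or a slot on it) is actually ready.
    while allocation["status"] == "pending":
        time.sleep(1)
        allocation = requests.get(
            f"{NOMAD_URL}/allocations/{allocation['id']}").json()
    return allocation


if __name__ == "__main__":
    alloc = allocate_accelerator("crypto:aes-gcm", "vm-1234")
    print("attached device handle:", alloc.get("handle"))
```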

Page 7: Making workload nomadic when accelerated

Outline

✓ Motivation
❏ Interesting Use Cases
❏ OPNFV Requirements
❏ Nomad Introduction
❏ Future Plan

Page 8: Making workload nomadic when accelerated

Interesting Use Cases - NFVIaaS

● For people familiar with the ETSI NFV standard, NFVIaaS was among the NFV use cases published in the Phase 1 documents
● However, few of us grasped what this use case actually meant for business, until now
● Many operators are now beginning to build their own Public Cloud. An operator with a multi-site Public Cloud would be able to offer NFVIaaS to NFV app companies that have their own content, VNFs, or MANO, but no NFVI. These NFV apps could then be deployed on the Public Cloud
● There is still a problem for the operator who owns the Public Cloud: without acceleration, there is little room for further service classification on NFVIaaS

Acceleration makes NFVIaaS a possibility

Page 9: Making workload nomadic when accelerated

Interesting Use Cases - accelerated vSwitch

The next generation of virtual switch technologies will combine data-plane and low-level software acceleration with network engines, PCI accelerators, etc.:

● Dynamically allocating (reconfigurable) accelerators and supporting VM migration

● Dispatching packet-processing workloads on the available resources (see the sketch after this slide)

VOSYSwitch* is a user-space virtual switch for NFV (and not only), based on the Snabb framework (snabb.io), and thus independent of Intel/DPDK:

● It inherits from Snabb the LuaJIT trace compiler, which provides performance optimization at runtime based on the current traffic

● Supports x86/ARM architectures and hardware accelerators through ODP

● Its roadmap includes FPGA/GPU/OpenCL support, and integration with OpenStack Nomad

* http://www.virtualopensystems.com/en/products/vosyswitch-nfv-virtual-switch/
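As a toy illustration of dispatching packet-processing workloads onto whatever resources a node currently exposes, here is a minimal Python sketch with a preference-ordered backend choice and a software fallback. The backend names and registry shape are assumptions; VOSYSwitch itself builds on Snabb/LuaJIT rather than Python.

```python
from typing import Callable, Dict, List

# Ordered by preference: hardware engines first, software data plane last.
PREFERENCE = ["fpga", "gpu", "smartnic", "cpu"]


def choose_backend(available: Dict[str, Callable[[List[bytes]], List[bytes]]]):
    """Return the most preferred packet-processing function that is present."""
    for kind in PREFERENCE:
        if kind in available:
            return kind, available[kind]
    raise RuntimeError("no packet-processing backend available")


def cpu_pipeline(packets: List[bytes]) -> List[bytes]:
    # Plain software path, always available.
    return packets


if __name__ == "__main__":
    backends = {"cpu": cpu_pipeline}       # e.g. no accelerator discovered
    kind, pipeline = choose_backend(backends)
    print("dispatching on:", kind)         # -> cpu
```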

Page 10: Making workload nomadic when accelerated

Outline

✓ Motivation
✓ Interesting Use Cases
❏ OPNFV Requirements
❏ Nomad Introduction
❏ Future Plan

Page 12: Making workload nomadic when accelerated

Outline

✓ Motivation
✓ Interesting Use Cases
✓ OPNFV Requirements
❏ Nomad Introduction
❏ Future Plan

Page 13: Making workload nomadic when accelerated

Nomad Introduction - some really high level stuff

Design goals:

● Loosely coupled (more than Neutron)
● DB-oriented management
● Network: DAG based
● Storage: affinity driven
● Compute: capability-based scheduling (see the sketch below)
● Big Data: FPGA/GPU programming
● Fine-grained fault monitoring and management (bloom filter?)
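A minimal sketch of what "capability-based scheduling" for compute could mean: hosts advertise accelerator capabilities and a request is matched against them. The data model and matching rule are illustrative assumptions, not nomad-compute code.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class AccelHost:
    name: str
    # e.g. {"fpga:reconfig", "gpu:opencl", "crypto:aes"}
    capabilities: Set[str] = field(default_factory=set)


def schedule(request: Set[str], hosts: List[AccelHost]) -> AccelHost:
    """Pick the first host whose advertised capabilities cover the request."""
    for host in hosts:
        if request <= host.capabilities:
            return host
    raise LookupError(f"no host offers {request}")


if __name__ == "__main__":
    hosts = [AccelHost("node-1", {"gpu:opencl"}),
             AccelHost("node-2", {"fpga:reconfig", "crypto:aes"})]
    print(schedule({"fpga:reconfig"}, hosts).name)   # -> node-2
```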

Page 14: Making workload nomadic when accelerated

Nomad introduction - first stab at nomad-compute

● Blueprint (BP) driven; everything starts from scratch

● Target code available in August

● Try to make OPNFV Colorado release if possible

Page 15: Making workload nomadic when accelerated

Outline

✓ Motivation
✓ Interesting Use Cases
✓ OPNFV Requirements
✓ Nomad Introduction
❏ Future Plan

Page 16: Making workload nomadic when accelerated

Nomad future plans

● Develop storage and network related features if there are volunteers who are interested

● Try to make Nomad less VM centric
  ○ Consider an FPGA sitting out on PCI as an independent resource for anything that may be deployable on it, like VMs on CPUs
  ○ Good for portability and scalability

● Work with existing teams (e.g. EPA)

● Fault management

Page 17: Making workload nomadic when accelerated

Outline

✓ Motivation
✓ Interesting Use Cases
✓ OPNFV Requirements
✓ Nomad Introduction
✓ Future Plan

Page 18: Making workload nomadic when accelerated

BoF Discussion

https://etherpad.openstack.org/p/AUS-BoF-NOMAD

Page 19: Making workload nomadic when accelerated

The missing piece and Ironic

When it comes to NFV and the use of VNFs that have been created and are managed in this way, they are not visible to Neutron and are not captured in Nova as consumable functions.

Ironic has the role of discovering and initializing “bare metal” devices and exposing them to the rest of the OpenStack system. However, there is no requirement that all resources used by Nova, for example, be bare-metal based.

We need a public API that allows the dynamic registration of resources that happen to be hosted on acceleration hardware (a hypothetical sketch follows below).
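As a sketch of what such a public registration API could look like, the snippet below posts a resource record describing a function hosted on acceleration hardware so that it becomes visible as a consumable resource. The service name, endpoint, and payload fields are hypothetical; this is not an existing OpenStack API.

```python
import requests

REGISTRY_URL = "http://accel-registry.example:8080/v1/resources"  # illustrative


def register_accelerated_function(name: str, device: str, traits: list) -> str:
    """Register a function hosted on an accelerator and return its resource id."""
    payload = {
        "name": name,                 # e.g. "vFirewall-fastpath"
        "backing_device": device,     # e.g. "pci:0000:3b:00.0"
        "traits": traits,             # e.g. ["CUSTOM_FPGA", "CUSTOM_INLINE"]
    }
    resp = requests.post(REGISTRY_URL, json=payload)
    resp.raise_for_status()
    return resp.json()["id"]


if __name__ == "__main__":
    rid = register_accelerated_function(
        "vFirewall-fastpath", "pci:0000:3b:00.0", ["CUSTOM_FPGA"])
    print("registered as", rid)
```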

Page 20: Making workload nomadic when accelerated

Backup Slides

Page 21: Making workload nomadic when accelerated

What is Application Acceleration (Harm)

Typically, when you run something on alternate hardware as a subcomponent of the application, we call it an accelerated application, and the specific function is often called an accelerated function. This is commonly called the “offload” model. It is sometimes referred to as co-processing, and analytics is an example of this type of workload.

The “inline” model frequently puts a specialized accelerator between an application and an interface to other systems. We all know graphics “acceleration” as an example, or wetware interface pre-processors used in genomics. An inline function may be standalone and not have any external processing dependencies.
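The two models can be summarized in a small sketch: in the offload model the application hands a sub-task to an accelerator (falling back to the CPU path if none is attached), while in the inline model the accelerator sits as a stage in the data path. The function names and handles are illustrative; real offload would go through an OpenCL/FPGA/SmartNIC runtime rather than plain Python calls.

```python
from typing import Callable, List, Optional

PacketFn = Callable[[bytes], bytes]


def offload(cpu_impl: PacketFn, data: bytes,
            accelerator: Optional[PacketFn] = None) -> bytes:
    """Offload model: hand the sub-task to the accelerator if one is
    attached, otherwise run the CPU implementation."""
    return accelerator(data) if accelerator else cpu_impl(data)


def inline_pipeline(stages: List[PacketFn], data: bytes) -> bytes:
    """Inline model: the accelerator is one stage in the data path between
    the application and its external interface."""
    for stage in stages:
        data = stage(data)
    return data


if __name__ == "__main__":
    to_upper = lambda b: b.upper()   # stand-in for an accelerated function
    reverse = lambda b: b[::-1]      # stand-in for another pipeline stage
    print(offload(to_upper, b"abc"))                     # CPU fallback -> b'ABC'
    print(inline_pipeline([to_upper, reverse], b"abc"))  # -> b'CBA'
```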

A few platforms have emerged to support this, namely GPUs and FPGAs, which, along with even more specialized hardware, are commonly connected to a more traditional CPU via PCI or similar technology.

As these patterns and specific workloads have become highly popular, we have seen general CPU vendors add acceleration platforms to the chipset. Graphics, communications, and encryption are all examples.

So how do we deploy and manage acceleration hardware? And how do we manage it separately from the services that exploit it?

Page 22: Making workload nomadic when accelerated

OpenStack by principle - Nova (Harm)

Nova, by definition, manages the allocation of compute resources.

● Through metadata, Nova is dynamically aware of a compute node and its characteristics. This may include some close or nearby resources.

● Nova can be taught, with alternate metadata, about what look like standalone compute nodes but are in fact acceleration devices.

● By providing additional filters and automation scripts, Nova can manage a standalone acceleration device just like a general CPU device (a filter sketch follows below).
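A minimal sketch of the "additional filters" idea, written against Nova's filter-scheduler interface (BaseHostFilter.host_passes). Treating a flavor extra spec such as accel:type as the way to request an accelerator, and an accel_types entry in the host stats as the way a node advertises one, are assumptions for illustration only.

```python
from nova.scheduler import filters


class AcceleratorFilter(filters.BaseHostFilter):
    """Pass only hosts that advertise the accelerator type the flavor asks for."""

    def host_passes(self, host_state, spec_obj):
        wanted = spec_obj.flavor.extra_specs.get("accel:type")
        if not wanted:
            # The request does not ask for an accelerator at all.
            return True
        # Assumed convention: the compute node reports a comma-separated
        # list of accelerator types in its host stats.
        offered = host_state.stats.get("accel_types", "")
        return wanted in offered.split(",")
```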

Page 23: Making workload nomadic when accelerated

OpenStack by principle - Glance (Harm)

Glance is used to manage the life cycle of artifacts used for provisioning.

● Glance understands the artifacts through metadata associated with a resource

● Just like VMs need images, accelerated devices need to be loaded with bitstreams

● By providing additional artifact types and metadata, Glance can be utilized to manage the artifacts needed for acceleration device life-cycle management (see the sketch below)
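A minimal sketch of the "additional artifact types and metadata" idea against the Glance v2 images API, which accepts extra custom properties on image create. The property names (accel_artifact_type, accel_target) and the endpoint/token handling are assumptions for illustration, not an existing Glance convention.

```python
import requests

GLANCE = "http://controller:9292/v2"            # illustrative endpoint
HEADERS = {"X-Auth-Token": "<keystone-token>"}  # token obtained from Keystone


def upload_bitstream(name: str, path: str, target: str) -> str:
    """Register an FPGA bitstream in Glance and upload its binary payload."""
    body = {
        "name": name,
        "disk_format": "raw",
        "container_format": "bare",
        # Custom properties (assumed naming convention, not a Glance standard):
        "accel_artifact_type": "fpga_bitstream",
        "accel_target": target,                 # e.g. an FPGA board/part id
    }
    resp = requests.post(f"{GLANCE}/images", json=body, headers=HEADERS)
    resp.raise_for_status()
    image_id = resp.json()["id"]

    with open(path, "rb") as f:
        requests.put(f"{GLANCE}/images/{image_id}/file", data=f,
                     headers={**HEADERS,
                              "Content-Type": "application/octet-stream"})
    return image_id
```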

Page 24: Making workload nomadic when accelerated

OpenStack by principle - Neutron/Cinder (Harm)

Neutron and Cinder manage traditional data center devices and do not understand how they are implemented. The device is just something to be configured, and separately monitored and managed.

● The devices are consumed much like a PaaS or SaaS level service: an appliance model. Whether the appliance is implemented with CPUs, GPUs, FPGAs, discrete hardware, or aliens does not matter.

● The manager holds operational state and configuration data about the devices, just as Nova understands the number of vCPUs and how many have been allocated.

Page 25: Making workload nomadic when accelerated

Conclusions (certainly from Harm ;-))

● This completes the puzzle and separates out the concerns of creating and consuming VNFs to build NFV and NFVI.

● This approach provides a more generalized way to manage acceleration hardware while still separating life cycle from specific function.

● By supporting all three placement configurations, as well as both the inline and offload models, any specific performance optimization for each can still be applied without affecting the alternatives.

While steps are currently under way in Nova to support acceleration, and in Nomad to support the domain-specific notions of NFV…

This approach completes the picture

Page 26: Making workload nomadic when accelerated

OpenStack to the rescue (Harm)

By leveraging Glance and Nova to manage the provisioning of acceleration hardware in all models, the accelerated application/function can be adopted directly into any automation a customer needs.

Functions provided by these accelerated systems can be combined and consumed like any PaaS or SaaS service

Page 27: Making workload nomadic when accelerated

Not sure how to mix this in the flow (Harm)

The following set of slides provides a bit of background and then the reasoning that leads to the proposed approaches. Since Nomad already has a path forward, I leave it to the Nomad veterans to determine whether this is of interest and how to weave in the ideas.