
Horizontal Decoupling of Cloud Orchestration for Stabilizing Cloud Operation and Maintenance

Wenbo Mao

DaoliCloud Company

Beijing and Shanghai, China

www.daolicloud.com

February 8, 2015

In the plain and understandable pursuit of economies of scale, a cloud orchestration software system should be capable of managing a huge farm of hardware servers. However, even with the most advanced configuration and management tools, the field has learned through trial and error that the distribution scale of a cloud orchestrator cannot be too large. For example, VMware, probably among the most experienced players in the trade, stipulates a rule-of-thumb upper bound for its orchestrator vRealize: no more than 1,000 servers per vRealize instance, even when the software is installed on top-quality hardware. Scaled beyond that level, cloud operation and maintenance become unstable and incur sharply rising costs. Recent achievements in highly efficient CPU virtualization by Docker have ignited further orders-of-magnitude growth in the number of micro-servicing CPUs, which will only worsen the scalability problem in cloud orchestration. The current poor scalability of cloud orchestration means that today's clouds exist as small, isolated scatters and patches, and therefore cannot efficiently tap the potential of economies of scale.

The essential problem behind poor scalability in cloud orchestration is that all cloud orchestrators, whether commercial offerings or open source projects, conventionally evolve from a horizontally tightly coupled architecture. A horizontally tightly coupled orchestrator is a set of software components that are host-knowledge interwoven. By "host-knowledge interwoven" we mean that the software components of a cloud orchestrator know the existence, roles, and duties of one another from the moment they are installed on a farm of server hosts, and throughout their entire lifecycles afterwards. When a farm grows large, some queues of events and messages inevitably become long; write-lock mechanisms for consistency protection and copy-on-write database accesses further slow down responsiveness; and an occasional failure, even a benign timeout, at one point in the farm is highly likely to pull down other knowledge-interwoven parts. In fact, every cloud service or hosting provider of any size has to rely on human operation and maintenance teams guarding the farm 24x7, playing the role of firefighters!

We present Network Virtualization Infrastructure (NVI) technology to horizontally decouple cloud orchestration. NVI minimizes the size of a cloud orchestration region down to a single hardware server, e.g., an OpenStack all-in-one installation. An orchestrator managing only one server host of course has no knowledge whatsoever of any other orchestrator managing another server. Thus, no server host in an NVI farm has any software knowledge of any other server host in the farm. While this obviously maximizes stability for cloud operation and maintenance, the overlay cloud resources pooled by NVI retain unbounded scalability. This is because NVI connects overlay nodes across orchestrators in user mode, and only when one node initiates communication with another (think of an HTTP connection!). NVI can connect various kinds of virtual CPUs over independent and heterogeneous cloud orchestrators, e.g., lightweight micro-servicing Docker containers and heavy-duty hypervisor VMs that are independently orchestrated by, say, Kubernetes and OpenStack. Moreover, NVI can transparently link different cloud service providers, also in user mode.
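This abstract does not describe the connection workflow in detail. The following Python sketch, with hypothetical names such as TenantDirectory and connect_on_first_packet that are not DaoliCloud's actual API, only illustrates our reading of the idea: the mapping from overlay nodes to hosting servers lives with the tenant in user mode, forwarding state is created only when one node first contacts another, and no orchestrator ever learns of the others.

    # Hypothetical sketch: user-mode, on-demand connection of overlay nodes
    # that live under independent orchestrators.  All names and the directory
    # mechanism are illustrative assumptions, not DaoliCloud's API.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Endpoint:
        overlay_ip: str    # address seen inside the tenant's overlay network
        underlay_ip: str   # address of the hosting server on the physical net
        orchestrator: str  # e.g. "OpenStack all-in-one, Beijing"

    class TenantDirectory:
        """User-mode registry owned by the tenant.  Each orchestrator only
        registers its own workloads here and learns nothing about the rest."""
        def __init__(self):
            self._by_overlay_ip = {}

        def register(self, ep: Endpoint) -> None:
            self._by_overlay_ip[ep.overlay_ip] = ep

        def lookup(self, overlay_ip: str) -> Endpoint:
            return self._by_overlay_ip[overlay_ip]

    def connect_on_first_packet(directory: TenantDirectory,
                                src_ip: str, dst_ip: str) -> None:
        """Invoked only when src first sends traffic to dst (compare opening
        an HTTP connection).  Until that moment no host holds any state about
        the other, and no orchestrator is consulted."""
        src, dst = directory.lookup(src_ip), directory.lookup(dst_ip)
        # Forwarding state would be installed on src's and dst's hosts here,
        # e.g. with header-rewriting OpenFlow rules as sketched further below.
        print(f"connect {src.overlay_ip}@{src.underlay_ip} ({src.orchestrator})"
              f" -> {dst.overlay_ip}@{dst.underlay_ip} ({dst.orchestrator})")

    if __name__ == "__main__":
        d = TenantDirectory()
        d.register(Endpoint("10.0.0.2", "203.0.113.10", "OpenStack all-in-one, Beijing"))
        d.register(Endpoint("10.0.0.3", "198.51.100.20", "Kubernetes, Shanghai"))
        connect_on_first_packet(d, "10.0.0.2", "10.0.0.3")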

The key enabler for two mutually unaware orchestrators to serve user-mode connections between their respectively orchestrated overlay nodes is a novel OpenFlow formulation for forwarding trans-orchestrator underlay packets. This new SDN formulation constructs overlay networks at any OSI layer and in any form without packet encapsulation, i.e., without using any of the trans-host-network protocols such as VLAN, VXLAN, VPN, MPLS, GRE, NVGRE, LISP, STT, or Geneve (or any others we may have missed from the enumeration). With trans-host packet encapsulation avoided, there is no need for the participating orchestrators to know one another in host mode, neither at installation time nor at any later point in their lifecycles. It is by this simple principle that the SDN innovation of NVI achieves complete horizontal decoupling of cloud orchestration. With connections taking place only in user mode, cloud deployment, operation, maintenance, system upgrading, and so on can become 100% automated. It also follows directly that the NVI technology supports inter-cloud patching, again in user mode.
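The concrete OpenFlow rules are not disclosed in this abstract. As an illustration only, the sketch below shows one plausible way, under our own assumptions, to forward a trans-host overlay packet by plain header rewriting on Open vSwitch (using the standard mod_nw_src/mod_nw_dst actions) instead of wrapping it in VXLAN, GRE, or any other tunnel. All addresses and port numbers are made up.

    # Hypothetical illustration of encapsulation-free forwarding with plain
    # OpenFlow header rewriting on Open vSwitch.  The rule strings below are
    # our own assumption of one possible realization, not DaoliCloud's rules.

    def egress_rule(overlay_src, overlay_dst, underlay_src, underlay_dst, uplink_port):
        """On the sending host: map overlay addresses to underlay addresses so
        the packet travels the physical network as an ordinary IP packet --
        no VXLAN/GRE/STT header is added."""
        return (f"priority=100,ip,nw_src={overlay_src},nw_dst={overlay_dst},"
                f"actions=mod_nw_src:{underlay_src},mod_nw_dst:{underlay_dst},"
                f"output:{uplink_port}")

    def ingress_rule(overlay_src, overlay_dst, underlay_src, underlay_dst, vm_port):
        """On the receiving host: restore the overlay addresses before the
        packet is delivered to the destination VM or container.  A real
        deployment would need extra match fields (e.g. transport ports or MAC
        addresses) to demultiplex several overlay flows between the same pair
        of hosts; they are omitted here for brevity."""
        return (f"priority=100,ip,nw_src={underlay_src},nw_dst={underlay_dst},"
                f"actions=mod_nw_src:{overlay_src},mod_nw_dst:{overlay_dst},"
                f"output:{vm_port}")

    if __name__ == "__main__":
        # Overlay 10.0.0.2 -> 10.0.0.3, hosted on underlay servers
        # 203.0.113.10 and 198.51.100.20 respectively (made-up addresses).
        print("ovs-ofctl add-flow br-int '"
              + egress_rule("10.0.0.2", "10.0.0.3",
                            "203.0.113.10", "198.51.100.20", 1) + "'")
        print("ovs-ofctl add-flow br-int '"
              + ingress_rule("10.0.0.2", "10.0.0.3",
                             "203.0.113.10", "198.51.100.20", 5) + "'")

The point of the sketch is that the rewritten packet crosses the physical network as an ordinary IP packet, so neither endpoint's orchestrator needs any host-mode knowledge of the other.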

With the NVI architecture solving the problem of truly scalable cloud orchestration, DaoliCloud attempts to contribute a new production line to the cloud industry: "Build, ship and low-cost operate any cloud, any scale", a new frontier extending the great inspiration of Docker's "Build, ship and run any app, anywhere". With a single, small, fixed size for orchestrator installation and configuration, building, shipping, operating, and maintaining any cloud, private or public, becomes low cost and fast, thanks to that one fixed size and to automation.

The website http://www.daolicloud.com presents, in "for dummies" simplicity, a near-product-quality prototype of our new cloud orchestration technology, which horizontally decouples globally located orchestrators. These globally distributed orchestrators, unaware of one another in host mode yet well organized in user mode, are independent all-in-one OpenStack hosts in Beijing, Shanghai, and N. Virginia. We cordially invite the much respected reviewers of this abstract, and hopefully many interested trial users in the audience of the forthcoming OpenStack Summit, to sign up at the above URL for a trial. We humbly hope that some trial users will come to appreciate that this architectural work on scalable cloud orchestration can indeed enable a number of previously unavailable and useful cloud properties, made possible by the new architecture both in cloud orchestration as a specific application and in network virtualization as a more general pursuit of knowledge and technology advances.