MODELING OF DECENTRALIZED AND RECONFIGURABLE CLOUDS
Martin Korling, Chris Hogue (Ericsson)
© Ericsson | De-centralized Re-configurable Cloud | 2016-11-01 | Page 2
Outline
› Intro
› Background
› De-centralization
› Re-configurability (disaggregation)
› Dynamic model (work-in-progress)
› Cost-benefit (work-in-progress)
› Conclusion
Intro
› Infrastructure for society-critical applications, different from today’s
› Use cases: …
› New flexibilities: de-centralization, re-configurability
› Double relevance of Big Data -> Big Control/Decision:
  – Big control is a workload
  – Big control is used as a control mechanism within the infrastructure
Intro, problem statement
› how to use decentralization, cost-benefit
› how to use hardware re-configurability, cost-benefit
› interface between application and infrastructure, service contract parameters
› how to design control and management planes
› hyperconverged and hyperscale architectures, e.g. data locality
› isolation strategies, trade-off with resource fragmentation
› policy architecture, space of constraints, beyond labels
› serverless/event-based in distributed scenarios
Background, prior work
› Simulations
  – ROSS, Time-warp
› Object-oriented with message passing
  – CloudSim
› Object-oriented
› Infrastructure modelling
  – Google B4
  – Google Job packing
  – Google Borg
› Utilization is not useful
  – Scaling
  – Inflating
  – Shrinking
Abhishek Verma, Madhukar Korupolu, John Wilkes, Evaluating job packing in warehouse-scale computing
Abhishek Verma, Madhukar Korupolu, John Wilkes, et al., Large-scale cluster management at Google with Borg
Adrian Cockcroft, “Utilization is Virtually Useless as a Metric!”
Background, prior work: the Google cluster trace
› Heterogeneity important
› Class/Priority used extensively
› Constraints important
› Network traffic information missing!
J. Wilkes and C. Reiss. Details of the ClusterData-2011-1 trace, 2011. https://code.google.com/p/googleclusterdata/wiki/ClusterData2011_1
Charles Reiss et al., “Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis”
Sheng Di, Derrick Kondo, Walfredo Cirne. Characterization and Comparison of Google Cloud Load versus Grids. 2012
De-centralization
Levels: global/public level, country level, metro level, on-prem level, device level
There are important:
• system domain borders
• business domain borders
• legal domain borders
De-centralization
› DEVICE: machine, car, train, …
› ON-PREM: factory, office, power-station, …
› LOCAL/METRO: central office, base station, …
› COUNTRY: datacenter, …
› GLOBAL/CENTRAL: datacenter, …
Drivers: resilience, bandwidth; scale; regulatory compliance, multicloud; control, security, low latency; control, extreme low latency; ”micro datacenters”
It’s not only about latency
Applications
Levels: DEVICE (machine, car, train, …), ON-PREM (factory, office, power-station, …), LOCAL/METRO (central office, base station, …), COUNTRY (datacenter, …), GLOBAL/CENTRAL (datacenter, …)
› “CLIENT-SERVER”: Web, “Cloud”, video upload
› CDN: Video streaming
› HYBRID CLOUD: Resource offload
› COMPLIANT DATA: Personal data storage
Applications
Levels: DEVICE (machine, car, train, …), ON-PREM (factory, office, power-station, …), LOCAL/METRO (central office, base station, …), COUNTRY (datacenter, …), GLOBAL/CENTRAL (datacenter, …)
› UPSTREAM VIDEO: Upstream video processing
  – Flow labels: Video stream; Metadata, events
› CONTROL SYSTEMS: Extreme low latency
  – Flow labels: Control loop; Intermediate control; Metadata, events
Re-configurability with HDS 8000 (Rackscale design (RSD))
› Chassis: $2809^{4}$ variants, $282^{2}$ variants
› Rack: $(2809^{4} + 282^{2})^{20} = 7.658 \times 10^{275}$
› Disaggregated RackScale Components
  – Compute Sleds (2809 variations)
  – Storage Sleds (282 variations)
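The rack figure can be checked with exact integer arithmetic. The sketch below assumes the exponents 4, 2 and 20 are the superscripts that were flattened in the slide text (they are consistent with the stated 7.658 x 10^275); the constant and function names are illustrative and not from the talk.

```python
# Sled variation counts as stated on the slide.
COMPUTE_SLED_VARIATIONS = 2809
STORAGE_SLED_VARIATIONS = 282

# Assumption: these exponents are the superscripts flattened in the slide text.
COMPUTE_EXPONENT = 4
STORAGE_EXPONENT = 2
RACK_EXPONENT = 20


def chassis_variants() -> int:
    """Chassis term of the slide's expression: 2809^4 + 282^2."""
    return (COMPUTE_SLED_VARIATIONS ** COMPUTE_EXPONENT
            + STORAGE_SLED_VARIATIONS ** STORAGE_EXPONENT)


def rack_variants() -> int:
    """Rack term: (2809^4 + 282^2)^20, computed exactly with Python integers."""
    return chassis_variants() ** RACK_EXPONENT


def scientific(n: int, digits: int = 3) -> str:
    """Format a positive integer as 'd.ddd x 10^k', truncating (not rounding)."""
    s = str(n)
    return f"{s[0]}.{s[1:1 + digits]} x 10^{len(s) - 1}"


if __name__ == "__main__":
    print(scientific(rack_variants()))
```

Python integers are arbitrary precision, so the full 276-digit value is available without floating-point loss; only the printed form is shortened.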
From Disaggregated Hardware To “Composite” Compute Nodes
› Physical Sleds to Logical Servers
› 4 “bootable” servers shown here
  – 4 CSU
  – 6 SSU in SAS Daisy Chain
  – What is the Storage in each Server?
  – Depends on
    › Discs inside each SSU
    › Zone Partitioning
Figure labels: SSD, HDD, HDD
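How much storage each logical server ends up with depends on the discs in each SSU and on the zone partitioning. A minimal sketch of that dependency follows; the disc capacities, the `zones` mapping, and the CSU/SSU identifiers are hypothetical values chosen for illustration, not HDS 8000 data or its API.

```python
from collections import defaultdict

# Hypothetical disc inventory: SSU id -> list of (disc type, capacity in TB).
ssu_discs = {
    "ssu1": [("SSD", 1.0), ("HDD", 4.0)],
    "ssu2": [("HDD", 4.0), ("HDD", 4.0)],
}

# Hypothetical zone partitioning: (SSU id, disc index) -> CSU that owns the disc.
zones = {
    ("ssu1", 0): "csu1",
    ("ssu1", 1): "csu2",
    ("ssu2", 0): "csu2",
    ("ssu2", 1): "csu3",
}


def storage_per_server(ssu_discs, zones):
    """Sum zoned disc capacity (TB) per CSU, split by disc type."""
    totals = defaultdict(lambda: defaultdict(float))
    for (ssu, index), csu in zones.items():
        disc_type, capacity_tb = ssu_discs[ssu][index]
        totals[csu][disc_type] += capacity_tb
    # Convert to plain dicts so the result prints and compares cleanly.
    return {csu: dict(by_type) for csu, by_type in totals.items()}


if __name__ == "__main__":
    for csu, by_type in sorted(storage_per_server(ssu_discs, zones).items()):
        print(csu, by_type)
```

Changing only the `zones` mapping changes what each server sees, without touching the physical inventory, which is the point of composing logical servers from disaggregated sleds.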
NOW: FAST, DETAILED HARDWARE MODEL GENERATION
› Requirements/Constraints: Budget, Storage Capacity, vCPUs, …
› Pre-computed Configurations
› FAST ITERATIONS
› 100’s of Racks in Complex Configurations
› DAS, SDS, BOM, Price, Illustrations, Cabling, Installation Details
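Matching requirements against pre-computed configurations can be pictured as a simple filter. The sketch below is an assumption about how such a lookup might work, not the generator described in the talk; `RackConfig`, `feasible`, and every catalogue value are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackConfig:
    """One pre-computed configuration (all fields are illustrative)."""
    name: str
    price_usd: float
    storage_tb: float
    vcpus: int


def feasible(configs, budget_usd, min_storage_tb, min_vcpus):
    """Return configurations meeting all constraints, cheapest first."""
    matches = [
        c for c in configs
        if c.price_usd <= budget_usd
        and c.storage_tb >= min_storage_tb
        and c.vcpus >= min_vcpus
    ]
    return sorted(matches, key=lambda c: c.price_usd)


# Made-up catalogue entries, for illustration only.
CATALOGUE = [
    RackConfig("small", 100_000, 200, 512),
    RackConfig("medium", 250_000, 800, 2048),
    RackConfig("large", 600_000, 2400, 8192),
]

if __name__ == "__main__":
    for config in feasible(CATALOGUE, budget_usd=300_000,
                           min_storage_tb=500, min_vcpus=1024):
        print(config.name)
```

Because the catalogue is computed ahead of time, each change to the constraints is only a re-filter, which is one way to get fast iterations.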
Detailed Data Model & Config Generator
› Go-generated JSON & SQL forms
› Star-Schema
  – Logical (graph)
    › Pods, Nodes
    › Devices, Connects
  – Physical (hardware)
    › Racks, Chassis
    › Switches, Management
    › CSUs, SSUs
    › DAS, SDS
    › Cables, Discs
GOAL: Simulation including Workloads
› Requirements/Constraints: Budget, Storage Capacity, vCPUs, Floor Space, Power, Mass
› Simulation: Workloads, Users, Traffic, Bursts, SLA, Growth
› Datacenter Models, Optimized Configuration
› REFINEMENT
› MODEL ENTERPRISE WORKLOAD
RUNTIME Data model
Diagram entities: Asset, Site, Transport, Pools, Workload, Dataset, Component, Systems, Machine, Connection, Flow, Job, CPU, Mem, Disk, Link
Dynamic modeling
Still rough
• machine borders oversimplified
• aggregation level, just an example
Observations
• workload components co-varying
• workloads competing
• stranded resources
• datasets impose constraints
The problem
It’s not bin-packing
Figure labels: resources and quality (repeated panels), Quality, Capacity, $, micro behaviors, macro behavior
The problem
Sure, it’s a
Simplified model
Elements: Data sources, Access links, Metro/Micro Datacenter, Aggregations/Metro links, Central Datacenter
› ~1 customer per link l_a
› ~10k customers per link l_m
› ~1M customers
› ~100 metro areas
› ~10 metro sites in each
› ~10 central sites
CENTRALIZED: computer vision, anomaly detection
DE-CENTRALIZED: computer vision, anomaly detection
Workload description
› w, [1/w] = bitrate/CPU/time
Cost factors
› p(l) price of link resource, l_m
› p(c) price of compute resource, c_m
Simplified model
Workload description
› w, [w] = instructions/bits, [1/w] = bitrate/MIPS
Cost factors
› p(l) price of link resource, l_m
› p(c) price of compute resource, c_m
Quality (SLA)
› throughput: sustained processing bitrate
Chart: relative cost and relative quality as a function of w and p(c)/p(l), with threshold 1/w > p(c)/p(l)
› Cost regions: de-centralized more costly, de-centralized less costly
› Quality regions: de-centralized worse quality, de-centralized better quality
Example numbers
› p(l) axis (USD/100Mbps/T): 100, 1000, 10000
› p(c) axis (USD/CPU/T): 10, 100, 1000
› p(c)/p(l) lines: 0.1M, 1M, 10M, 100M
› 1/w = 1Mbps/0.2CPU = 5M
BACK-OF-ENVELOPE, PRELIMINARY, SUBJECT TO SIMPLIFYING ASSUMPTIONS
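The threshold 1/w > p(c)/p(l) can be evaluated for the example numbers once both sides are put in the same unit. The sketch below assumes the "M" values mean bit/s per CPU, so p(l) quoted per 100 Mbps is rescaled accordingly; this reading reproduces the 0.1M corner of the chart (p(c) = 10, p(l) = 10000). The function names are illustrative, and the code only reports whether the inequality holds, without deciding which cost region that corresponds to.

```python
LINK_UNIT_BPS = 100e6  # assumption: p(l) is quoted per 100 Mbps, as on the axis label


def price_ratio_bps_per_cpu(p_c: float, p_l: float) -> float:
    """p(c)/p(l) in bit/s per CPU, with p(c) in USD/CPU/T and p(l) in USD/100Mbps/T."""
    return p_c * LINK_UNIT_BPS / p_l


def inverse_w(bitrate_bps: float, cpus: float) -> float:
    """1/w: sustained bitrate handled per CPU, in bit/s per CPU."""
    return bitrate_bps / cpus


def threshold_holds(p_c: float, p_l: float, inv_w: float) -> bool:
    """Evaluate the slide's inequality 1/w > p(c)/p(l)."""
    return inv_w > price_ratio_bps_per_cpu(p_c, p_l)


if __name__ == "__main__":
    inv_w = inverse_w(1e6, 0.2)  # slide example: 1 Mbps per 0.2 CPU
    for p_c in (10, 100, 1000):
        for p_l in (100, 1000, 10000):
            ratio = price_ratio_bps_per_cpu(p_c, p_l)
            holds = threshold_holds(p_c, p_l, inv_w)
            print(f"p(c)={p_c:>4} p(l)={p_l:>5} ratio={ratio:.1e} holds={holds}")
```

The grid mirrors the tick values on the two price axes, so each printed row corresponds to one point on the chart relative to the 1/w = 5M line.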
Tentative cost-benefit
› de-centralized: improve isolation, regulatory compliance
› de-centralized: improve quality / cost (trade-off link and compute bottlenecks)
› de-centralized: improve response time / latency
› de-centralized: risk of resource fragmentation
› re-configurability: isolation
› re-configurability: mitigate resource fragmentation
Summary
› Intro
› Background
› De-centralization
› Re-configurability (disaggregation)
› Dynamic model (work-in-progress)
› Cost-benefit (work-in-progress)
› Conclusion