Deterministic capacity planning for OpenStack as elastic cloud infrastructure
Deterministic capacity planning for OpenStack
Keith Basil, Principal Product Manager, Red Hat
Sean Cohen, Principal Product Manager, Red Hat
Tushar Katarki, Principal Product Manager, Red Hat
http://sharpwriter.deviantart.com/art/Welcome-to-the-Internet-Please-Follow-me-322248378
http://creativecommons.org/licenses/by-nc-nd/3.0/
devOps headband, BOFH Slayer gun handle and OpenStack unicorn branding added for effect. Not for redistribution.
AGENDA
✦ OpenStack as an Elastic Cloud
✦ Determinism in Infrastructure
✦ Compute for Elastic Clouds
✦ Storage for Elastic Clouds
✦ Networking for Elastic Clouds
✦ Putting It All Together
Keith Basil
personal: Virginia hare scrambler, plays chess
professional: Red Hat; Cloudscaling, Time Warner Cable, FederalCloud.com, Cisco and a couple of startups
blended: skype/twitter/github/irc, life: noslzzp
Sean Cohen
personal: Jazzman, oil painting & tennis
professional: Red Hat; Dot Hill Systems, Cloverleaf Communications, VerticalNet
blended: skype: sean.redhat, irc: scohen
Tushar Katarki
personal: Two kids and the wife, squash, hike/bike
professional: Red Hat; 15 years in IT infrastructure development; Sun Microsystems, Oracle
Hello.. I’m Your Elastic Cloud.
HELLO, my name is OpenStack
OpenStack ...
✦Is open source software with a vibrant community
✦Provides a framework for an elastic cloud
✦Benefits from deterministic deployment approaches
Elastic Cloud != Enterprise Virtualization
Elastic Cloud Workloads
✦Applications expect failure
✦Smaller stateless VMs
✦Applications scale out horizontally with VMs of predetermined capacity
✦Lifecycle measured in hours to minutes
Enterprise Virt Workloads
✦Workloads NOT designed to tolerate failure
✦Larger stateful VMs
✦Workloads scale up within custom VMs (more vCPU, vRAM)
✦Lifecycle measured in years
Scale Up: Servers are like pets.
Scale Out: Servers are like cattle.
Difference in the resource requests?
"I want 6 vCPUs, 4 GB of RAM and a 120 GB disk, please."
"I want an m1.small, please."
One is user determined. One is provider determined.
OpenStack in 2 Minutes!
I would like an m1.medium VM please!
Keystone: Umm, do I know you? I need to see some papers!!
Nova: OK, we need to find a place to build this VM.
Tag - you're it! (the instance goes to a node with capacity)
Nova: Papers are good. Time to get to work!
Node: Neutron, I need a network with all the trimmings!
Neutron: Here's your IP, default route and FW settings.
Node: Cinder, have that volume ready for me?
Cinder: Indeed I do. Don't forget to mount it!
Node (to Glance/Swift): Hey Glance, can I get the RHEL 6.4 image?
Thank you OpenStack!!
It's rendering time!
Your Mission, Should You Choose to Accept It..
“If you’re going to do operations reliably, you need to make it reproducible and programmatic.”
“Applications are what matter. Anything that gets apps deployed faster and helps companies manage the proliferation of apps is good. Hence, DevOps.”
- Mark Imbriaco, VP of Ops, Digital Ocean
- Mike Loukides, What is DevOps?
The goal is to keep your devOps heroes in play!
Determinism in Infrastructure
Let's Break The Myth...
There is no such thing as “infinite scale” in cloud computing
All computing requests, even for virtualized resources, ultimately map to a physical device -> finite resources
✦ Every provider has limits, even if they’re massive.
✦ Adding the word Cloud simply squeezes the limit balloon
✦ It doesn’t eliminate the issue, even with “elasticity.”
✦ The service provider is responsible for risk mitigation of the capacity it rents.
Capacity Planning in the Cloud
Infrastructure as “building” code
Why History matters..
✦Capacity planning and performance monitoring in the context of public providers:
  ✦Can be done only by understanding the history of a specific cloud provider.
  ✦Requires a cloud performance application to understand both:
    ✦The current state of the provider
    ✦Its performance history over a given period of time
Cloud tenants have a service level expectation
Cloud Operators have business constraints
Implicit contract
[Figure: the operator (uid=0) on one side and tenants on the other, bound by an implicit contract]
Capacity Planning in the Cloud
• Cloud users buy services based on capacity, protected by an SLA
• Cloud providers need deterministic capacity planning to support elastic growth
Deterministic Capacity Planning
✦Determinism is the best measure we have for predicting the effort and expense of making a process consistently performant
✦When your service becomes a critical part of a customer’s infrastructure, their fate becomes wedded to the SLAs you deliver.
✦In Cloud Computing, the service’s performance will not be measured by its average speed but by the consistency of its speed
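To make the "average versus consistency" point concrete, here is a minimal sketch that compares two latency series with the same mean. The function name `consistency_report` and the sample values are illustrative assumptions, not measurements from any provider.

```python
import statistics


def consistency_report(latencies_ms):
    """Summarize the average and the spread of a series of latency samples."""
    mean = statistics.mean(latencies_ms)
    stdev = statistics.pstdev(latencies_ms)
    return {
        "mean_ms": mean,
        "stdev_ms": stdev,
        # Coefficient of variation: spread relative to the mean.
        # Lower means more consistent.
        "cv": stdev / mean,
        "worst_ms": max(latencies_ms),
    }


# Two made-up services with an identical average latency.
steady = [10, 11, 9, 10, 10, 11, 9, 10]
spiky = [2, 2, 2, 2, 2, 2, 2, 66]

steady_report = consistency_report(steady)
spiky_report = consistency_report(spiky)

for label, report in (("steady", steady_report), ("spiky", spiky_report)):
    print(label, report)
```

Both series average the same, yet only the first one is a service a tenant could plan around; the coefficient of variation and the worst case expose the difference the mean hides.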
Modeling Performance
✦Using this information, we’re able to more accurately determine the capacity of a public provider
✦Monitoring performance spikes and valleys over time.
✦This means we can more accurately model for performance, and thus capacity.
Benchmarks can provide useful insight for performance analysis and capacity planning
http://cloudharmony.com/benchmarks
Deterministic Concepts & Goals
AWS and GCE as models
You want 2048, not Tetris®
✦ Scheduling made easy
✦ Scaling made easy
✦ Optimal hardware use (no holes or hot spots)
✦ Performance consistency
How do we achieve determinism for these core OpenStack services?
Compute for Elastic Clouds
Compute Instance Family
Solving resource contention in Compute
CPU, Disk, Memory
Fraction   n1-standard class   m1 class    Generic
1/1        n1-standard-8       m1.xlarge   xlarge
1/2        n1-standard-4       m1.large    large
1/4        n1-standard-2       m1.medium   medium
1/8        n1-standard-1       m1.small    small
Public Cloud VM Instances Exposed!
We can take this approach with OpenStack
xlarge, large, medium, small
Solve for the biggest VM in the class
We can easily derive the entire instance family because smaller instances are fractional proportions of the largest.
This facilitates efficient hardware use and scheduling.
xlarge = 1/1, large = 1/2, medium = 1/4, small = 1/8
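A minimal sketch of "solve for the biggest VM in the class": given the largest instance, the rest of the family falls out of the 1/1, 1/2, 1/4, 1/8 fractions. The `XLARGE` sizes and the `derive_family` helper are hypothetical, chosen only so the numbers divide evenly.

```python
from fractions import Fraction

# Hypothetical largest instance in the class; sizes are illustrative only.
XLARGE = {"vcpus": 8, "ram_mb": 16384, "disk_gb": 160}

# Each smaller instance is a fixed share of the largest.
FAMILY_FRACTIONS = {
    "xlarge": Fraction(1, 1),
    "large": Fraction(1, 2),
    "medium": Fraction(1, 4),
    "small": Fraction(1, 8),
}


def derive_family(largest, fractions=FAMILY_FRACTIONS):
    """Scale every resource of the largest instance by each family fraction."""
    family = {}
    for name, share in fractions.items():
        flavor = {}
        for resource, amount in largest.items():
            scaled = amount * share
            # Refuse sizes that would leave a fractional vCPU, MB or GB.
            if scaled.denominator != 1:
                raise ValueError(f"{name}: {resource} does not divide evenly")
            flavor[resource] = int(scaled)
        family[name] = flavor
    return family


family = derive_family(XLARGE)
for name, flavor in family.items():
    print(name, flavor)
```

Because every size is an exact fraction of the largest, any mix of instances that sums to one xlarge consumes exactly one xlarge slot on the hardware.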
Efficient Bin-Packing with Fractional Proportions
Compute Hardware Node (general compute instance family)
128GB memory, (16) 1TB disks, (2) E5-2670 CPU
[Figure: the node filled edge to edge with xlarge, large, medium and small instances]
Given the machine config above, it would support:
(4) n1-standard-8-d, (8) n1-standard-4-d, (16) n1-standard-2-d or (32) n1-standard-1-d
(8) m1.xlarge, (16) m1.large, (32) m1.medium or (64) m1.small
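The per-node counts can be approximated with a short calculation: find how many of the largest instance fit, then multiply by the family fractions. This is a sketch under stated assumptions, not the method behind the slide: the flavor sizes in `XLARGE` are illustrative values chosen to resemble an m1.xlarge, the node is assumed to expose 32 hardware threads as vCPUs, and no overcommit or hypervisor reserve is modeled.

```python
def max_instances(node, flavor):
    """Return how many copies of `flavor` fit on `node`, and the limiting resource."""
    counts = {resource: node[resource] // need for resource, need in flavor.items()}
    limiting = min(counts, key=counts.get)
    return counts[limiting], limiting


# Node from the slide: 128GB memory, (16) 1TB disks, (2) E5-2670 CPUs.
# 32 vCPUs assumes 2 sockets x 8 cores x 2 threads with no overcommit.
NODE = {"ram_gb": 128, "disk_gb": 16 * 1000, "vcpus": 32}

# Assumed size of the largest instance in the family (illustrative).
XLARGE = {"ram_gb": 15, "disk_gb": 1690, "vcpus": 4}

# How many of each smaller instance replace one xlarge (1/2, 1/4, 1/8).
PER_XLARGE = {"xlarge": 1, "large": 2, "medium": 4, "small": 8}

xlarge_count, limiting_resource = max_instances(NODE, XLARGE)
capacity = {name: xlarge_count * per for name, per in PER_XLARGE.items()}

print("limiting resource:", limiting_resource)
print(capacity)
```

Only the largest instance needs to be solved against the hardware; everything else is multiplication, which is what keeps scheduling free of holes and hot spots.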
Efficient Scheduling with Fractional Proportions
GENERAL COMPUTE NODE
General Purpose Instance Families
✦ n1-standard
✦ m1
✦ A1 - A4
MEMORY OPTIMIZED NODE
Memory Optimized Instance Families
✦ n1-highmem
✦ m2, cr1
✦ A5 - A7
CPU OPTIMIZED NODE
CPU Optimized Instance Families
✦ n1-highcpu
✦ c1, cc2, c3
[Figure: scheduling places xlarge, large, medium and small instances of each family on the matching node type]
Compute Calculator Intro
Designed to help determine optimal compute hardware configurations
✦Visually shows resource constraints
✦Allows custom instance families
✦Walk through
Storage for Elastic Clouds
Block Storage Volume Types
Solving resource contention in Block Storage
Throughput, General Storage, Performance (IOPS/latency)
What Are the Public Clouds Doing with Storage?
Performance Optimized:
✦ guaranteed IOPS (SSDs)
✦ IOPS per GB with low latency
✦ for I/O intensive workloads
✦ Billed by size and IO usage
Capacity Optimized (standard):
✦ no IOPS guarantees
✦ workloads with moderate IO
✦ Billed by size and IO usage
Blended Approach (Performance Scaled with Capacity):
✦ Ephemeral disks deprecated!
✦ IOPS scale with volume size
✦ Attached volume limits
✦ Billed by size only
Block Storage Classes in OpenStack
PERFORMANCE OPTIMIZED STORAGE NODE
Performance Optimized Storage
✦ all SSDs
THROUGHPUT OPTIMIZED STORAGE NODE
Throughput Optimized Storage
✦ fast SAS drives with RAID 5/6
✦ throughput tuned network
✦ high bandwidth internal bus
GENERAL STORAGE NODE
Capacity (General) Optimized Storage
✦ larger SATA HDDs
[Figure: Cinder scheduling places volumes on the SSD and HDD storage nodes]
Storage Tiers with OpenStack Cinder
OPERATOR
1. Define storage back ends
2. Create Volume Types
   ✦ General
   ✦ Performance
   ✦ Throughput
TENANT
3. Create Volumes
   # cinder create \
       --volume_type IOPS_OPTIMIZED_TYPE \
       --display_name volume-1 50
Capacity (General) Optimized Storage
✦ Raw capacity of the storage
✦ Replication
✦ RAID type
RAID TYPE   2-Way Replication   3-Way Replication
RAID5       2.2                 3.3
RAID6       2.4                 3.6
RAID10      4                   n/a
Example:
Twelve (12) 1TB disks, configured for RAID6 and 2-way replication, would yield 5.0TB of usable capacity.
12TB / 2.4 = 5.0TB net usable capacity.
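The worked example can be written as a small helper that divides raw capacity by the factor from the RAID and replication table. The function and constant names are hypothetical; the divisors are copied from the slide.

```python
# Divisors from the slide, keyed by (RAID type, replication factor).
# RAID10 with 3-way replication is listed as n/a, so it is left out.
CAPACITY_DIVISORS = {
    ("RAID5", 2): 2.2,
    ("RAID5", 3): 3.3,
    ("RAID6", 2): 2.4,
    ("RAID6", 3): 3.6,
    ("RAID10", 2): 4.0,
}


def usable_capacity_tb(disk_count, disk_size_tb, raid_type, replicas):
    """Net usable capacity after RAID and replication overhead."""
    key = (raid_type, replicas)
    if key not in CAPACITY_DIVISORS:
        raise ValueError(f"no divisor for {raid_type} with {replicas}-way replication")
    raw_tb = disk_count * disk_size_tb
    return raw_tb / CAPACITY_DIVISORS[key]


# Twelve 1TB disks, RAID6, 2-way replication.
usable = usable_capacity_tb(12, 1.0, "RAID6", 2)
print(f"{usable:.1f} TB usable")
```

Run in reverse, the same divisor tells an operator how much raw disk to procure for a target usable capacity.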
Performance Optimized Storage
✦ IOPS scale linearly with VM count
✦ Limits should be seen as triggers for storage scale out
[Figure: write latency and read latency charts]
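One way to treat limits as scale-out triggers is to size the storage tier against a fraction of each node's IOPS ceiling rather than the ceiling itself. This is a sketch only: the per-VM IOPS figure, the node limit and the 80% trigger are hypothetical planning inputs, not numbers from the talk.

```python
def storage_nodes_needed(vm_count, iops_per_vm, node_iops_limit, trigger_percent=80):
    """Storage nodes required to keep every node below its scale-out trigger."""
    # IOPS demand is modeled as linear in the number of VMs.
    demand = vm_count * iops_per_vm
    # Plan against the trigger, not the hard limit, to leave headroom.
    usable_per_node = node_iops_limit * trigger_percent // 100
    # Integer ceiling division.
    return -(-demand // usable_per_node)


# Hypothetical inputs: 100 IOPS per VM, 50,000 IOPS per storage node.
for vms in (400, 401, 500):
    print(vms, "VMs ->", storage_nodes_needed(vms, 100, 50_000), "node(s)")
```

Crossing the trigger adds a node before tenants see latency climb, which is the provider's side of the implicit contract.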
Throughput Optimized Storage
✦ Throughput response matters
✦ The Read/Write mix matters
✦ Influenced by RAID type
Storage Planning
● Step 0: What is my Cloud Storage offering?
  ● Capacity Based
  ● Performance (IOPS) Based
  ● Throughput (Bandwidth) Based
● Step 1: What Storage Tiers do I need?
  ● Capacity Optimized, Performance Optimized, Throughput Optimized
● Step 2: Storage Capacity Planning
  ● Workload projections
  ● Performance Observations, Metrics to be optimized, and Calculators
● Step 3: Procure and Deploy
● Step 4: Manage and Steer
  ● Schedulers
Networking for Elastic Clouds
Core Network
Solving resource contention for the Network
Throughput, Resiliency, Latency
Enterprise vs Cloud Fabric
[Figure: Traditional Enterprise Topology vs. Modern Cloud Friendly Topology]
Network diagrams referenced from http://cto.vmware.com/is-your-cloud-ready-for-big-data/
Network Elasticity is Required..
Elastic Cloud Resource Map
[Figure: a growing grid of NODE and BLOCKSTORE units]
Because your cloud will grow..
Each unit here could be a server, or a rack of servers.
Core Fabric Requirements
OpenStack friendly networking features:
✦Availability and Resiliency (multi-path, per-flow routing)
✦Resource Node (compute/storage) Data Throughput
✦Network Latency
✦Congestion Management
Spine and Leaf Topology
Ask your friendly network vendor for guidance
Cisco, ARISTA, Brocade, Juniper, Force10, etc.
http://bradhedlund.com/2012/01/25/construct-a-leaf-spine-design-with-40g-or-10g-an-observation-in-scaling-the-fabric/
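As a rough planning aid for a spine and leaf fabric, the sketch below computes a leaf switch's oversubscription ratio and the number of servers a two-tier fabric can attach. It assumes every leaf has one uplink to every spine; the port counts and speeds are hypothetical, and your network vendor's guidance should take precedence.

```python
from fractions import Fraction


def leaf_oversubscription(down_ports, down_gbps, up_ports, up_gbps):
    """Ratio of server-facing bandwidth to spine-facing bandwidth on one leaf."""
    return Fraction(down_ports * down_gbps, up_ports * up_gbps)


def fabric_size(spine_ports, leaf_down_ports):
    """Maximum leaves and servers in a two-tier fabric.

    Each leaf uses one port on every spine, so the port count of a
    spine switch caps the number of leaves.
    """
    max_leaves = spine_ports
    return max_leaves, max_leaves * leaf_down_ports


# Hypothetical leaf: 48 x 10G to servers, 4 x 40G to four spines.
ratio = leaf_oversubscription(48, 10, 4, 40)
leaves, servers = fabric_size(spine_ports=32, leaf_down_ports=48)

print(f"oversubscription {ratio}:1, up to {leaves} leaves and {servers} servers")
```

A fixed ratio per leaf keeps node throughput predictable as racks are added, which matches the determinism goals set for compute and storage.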
Putting it All Together
Remember our Hero!
Plan for the Resource Service Level
[Figure: the Resource Service Level spans Compute/Storage, the Network Fabric and the Cloud Controller]
High level architecture
✦ Deterministic Network
✦ OpenStack Core Services: Core services
✦ Deterministic Resources: General Purpose Compute, Performance Storage, General (Capacity) Storage
Scale Out (as needed)
Questions?
Resources
✦ https://github.com/noslzzp/cloud-resource-calculator
✦ What is DevOps? http://oreil.ly/1jBcsAu - free!
Open source tools include:
✦ Graphite
✦ Ganglia
Public Cloud Benchmarks
✦ Cloudharmony.com
✦ Cloudsleuth.com (Global Provider View)
Thank You!
Check out these sessions!
✦ Red Hat Enterprise Linux OpenStack Platform High Availability. Arthur Berezin, Technical Product Manager, Red Hat. Wednesday, April 16, 2:30 pm - 3:30 pm
✦ Deploying Red Hat Enterprise Linux OpenStack Platform in the enterprise with FlexPod. Arthur Enright, Field Product Manager, Red Hat; NetApp and Cisco. Wednesday, April 16, 3:40 pm - 4:40 pm
✦ Deep dive: OpenStack Compute. Steve Gordon, Technical Product Manager, Red Hat. Thursday, April 17, 9:45 am - 10:45 am