VMworld 2013: Building the Management Stack for Your Software Defined Data Center
Transcript of VMworld 2013: Building the Management Stack for Your Software Defined Data Center
Building the Management Stack for Your
Software Defined Data Center
Bernd Harzog, The Virtualization Practice
Mark Leake, VMware
VCM4869
#VCM4869
Bernd Harzog, Virtualization Performance and Capacity Management Analyst
• Analyst and consultant focused upon:
  – Infrastructure Performance and Capacity Management of Virtualized Systems
  – Application Performance Management
  – Transaction Performance Management
  – End User Experience Management
• Clients include:
  – Enterprises seeking virtualization performance management solutions
  – Vendors offering solutions
• Key Findings
  – Virtualization introduces sharing and dynamic behavior
  – Agile Development produces rapidly changing applications
  – Both combine to require new tools, organizations and management processes
Key Trends
• The demand for business functionality implemented in software is infinite (therefore so is the backlog)
• More and more software – from different sources, from different tools, in different languages, on different run times
• Scaled out commodity deployment platforms
• Distribution of applications across data centers and private/public clouds
• Virtualization of business critical and performance critical applications
• More than one hypervisor in the enterprise
• Rapidly changing applications running on dynamic platforms
• The Software Defined Data Center will deliver dramatic benefits and create significant management challenges
Virtualization is Progressing – Business Critical
Apps are now VMware’s Focus
Your New World
• Agile Development creates rapidly changing applications
• Built in diverse languages and running on diverse language runtimes
• Running on next generation deployment platforms
• Deployed on multiple virtualization platforms
• Running on scaled out commodity hardware
• Located in multiple clouds with multiple owners
Your Cloud Public Cloud Hybrid Cloud
The Principles of the SDDC
Network Virtualization
What’s Different about a Software Defined Data
Center?
• Configuration, management, and some of the functional execution of CPU, memory, networking and storage is done in the SDDC software
  – Example: configuration of a virtual tunnel between two VMs across clusters or even virtual data centers
• Some of the services currently performed by dedicated hardware appliances will be performed by software plug-ins to the SDDC
  – Example: load balancing and security
• Almost all of the configuration for a VM or for an N-tier application system can be done in one place, and will follow that workload around
• Since all of this configuration will be done in the SDDC software layer, and since it will all be exposed via APIs, configuration changes will occur much more frequently and easily
• Private clouds will be able to address a broad range of business critical applications since the required resources will be able to be automatically marshaled by the Cloud Management platform from the SDDC
• An SDDC supporting a private cloud will be a highly dynamic computing platform that relies on automation to continuously execute a variety of actions
Management Principles for the Software Defined
Data Center
• Start Over – Start with a new Reference Architecture - do not assume that any tool you have purchased automatically makes the cut
• Insist upon easy to try, easy to buy, easy to manage, and results in production before purchase
• Organize for the successful virtualization of business critical applications
• Define Performance as Latency and Response time, not Resource Utilization
• Manage every application for performance, not just the 5% most painful and important ones
• Get Real Time, Deterministic and Comprehensive about Data Collection
• Design your management architecture for the distributed cloud case even if you are not there yet
If you Don’t Believe Me!
Statistics Collection & Telemetry
Another area of focus for an open networking ecosystem should be defining a framework for common storage and query of real time and historical performance data and statistics gathered from all devices and functional blocks participating in the network. This is an area that doesn’t exist today. Similar to Quantum, the framework should provide for vendor specific extensions and plug-ins. For example, a fabric vendor might be able to provide telemetry for fabric link utilization, failure events and the hosts affected, and supply a plug-in for a Tool vendor to query that data and subscribe to network events.
http://cto.vmware.com/open-source-open-interfaces-and-open-networking/
The Worse than Useless Test
Apply this test to every single management product in your company
1. Does it operate on a real-time, continuous, and deterministic basis?
2. Does it support workloads distributed across data centers (yours and ones you rent (cloud))?
3. Does it work across your virtualization and cloud vendor environments?
4. Can it re-configure itself every time you change something in the environment or in the applications?
5. Can you support it and use it without the continuous presence of on premise consultants from the vendor of the tool?
6. If it is a monitoring tool, does it focus upon response time and latency?
7. Can you try it, for free, in production, before you buy it or more of it?
If the answer is not “Yes” to all seven, junk the tool and start over
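The seven questions above can be recorded as a simple checklist per tool. The following is a minimal sketch, not part of the original deck: the class name, field names, and helper functions are illustrative assumptions, and question 6 is assumed to be answered "yes" for tools that are not monitoring tools.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ToolAssessment:
    """Yes/no answers to the seven questions (field names are hypothetical)."""
    real_time_continuous_deterministic: bool
    supports_distributed_workloads: bool
    works_across_virtualization_and_cloud_vendors: bool
    reconfigures_itself_on_change: bool
    usable_without_vendor_consultants: bool
    # Assumption: set to True for tools that are not monitoring tools.
    focuses_on_response_time_and_latency: bool
    free_production_trial: bool


def failed_questions(assessment: ToolAssessment) -> List[str]:
    """Return the names of every question answered 'no'."""
    return [f.name for f in fields(assessment) if not getattr(assessment, f.name)]


def passes_test(assessment: ToolAssessment) -> bool:
    """A tool passes only if all seven answers are 'yes'."""
    return not failed_questions(assessment)
```

Keeping the failed question names, rather than a single pass/fail flag, makes it easier to explain to a vendor or an internal team why a tool was rejected.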
Starting Over – Rethink ITIL and the CMDB
• ITIL is designed to get you to document and slow down the rate of change
• Don’t tell the Change Control Committee about vMotion!
• Your CMDB will never be able to keep up with the rate of change in a Software Defined Data Center
• Every configuration change needs to be tracked in real time, and cross-correlated with performance degradations and resource contention
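One way to picture the cross-correlation described in the last bullet is to pair each performance degradation with the configuration changes that preceded it on the same object. This is only a sketch under stated assumptions: the event types, the same-target matching rule, and the five-minute window are all hypothetical, and a real system would also need topology to relate changes on one object to symptoms on another.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    timestamp: float   # seconds since epoch
    target: str        # hypothetical object name, e.g. a VM or host
    description: str


@dataclass(frozen=True)
class Degradation:
    timestamp: float
    target: str
    latency_ms: float


def correlate(changes: List[ChangeEvent],
              degradations: List[Degradation],
              window_s: float = 300.0) -> List[Tuple[ChangeEvent, Degradation]]:
    """Pair each degradation with changes to the same target in the preceding window."""
    pairs = []
    for d in degradations:
        for c in changes:
            elapsed = d.timestamp - c.timestamp
            if c.target == d.target and 0 <= elapsed <= window_s:
                pairs.append((c, d))
    return pairs
```

The nested loop is fine for illustration; at SDDC event rates an index by target and time would be needed.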
Starting Over – Rethink ITIL Business Service
Management
There will be no time to “Design Services”. They will need to be discovered automatically as they are put into production
Starting Over – Legacy Management Solutions will Never be able to Cope with the SDDC
• A Software Defined Data Center changes too frequently for legacy management solutions to be able to keep up.
• Legacy solutions cannot be incrementally modified to be able to cope with the SDDC
• Gluing a new product from an acquired startup to the side of a legacy management solution cannot fix a fundamentally broken approach.
• Put the dino in a cage and do not let him out – build a new management stack for your SDDC – isolate the dino to your legacy physical environment
Blind Dinosaur
Gartner is Not Going to Be Much Help Either
• Gartner used to cover “Operations Management” tools in its “IT Event Correlation and Analysis” Magic Quadrant
• That MQ was last published in December 2012, and was retired in 2013
• Gartner has not yet come up with a replacement MQ that includes legacy vendors like IBM, BMC, HP and CA, as well as newcomers like VMware vCenter Operations, Dell vFoglight, Microsoft SCVMM, VMTurbo, etc.
Insist upon the New Way of Trying,
Implementing, and Buying Management Software
The Old Way
• Rep takes the CIO to play golf
• Enterprise software deal gets signed
• Some products work, others don’t
• People go around the ELA to get the tools they need
The New Way
• You get to download and use the software in production first
• You prove to yourself that it really does work and add value in your environment
• Then (and only then) do you buy it
Organize for Virtualization of Critical Applications,
Agility, and Success
Virtualization is Just One Team
Data Center Operations
LAN Team
Windows Server Team
Linux Server Team
WAN Team
Database Team
Java Server Team
Web Server Team
SAN Team
Storage Team
Programmer/Analyst Team
Virtual Operations
Tier 3 Support
Tier 2 Support
Tier 1 Help Desk
Application Operations Support
Systems Engineering
Virtualization and Application Operations are THE Teams
• The existing IT Operations Organization will not be able to cope with the SDDC or the clouds that run on it
• Virtualization pervades IT Operations, and becomes Virtual Operations
• Application Operations is responsible for the performance of every application in production (purchased and custom developed)
Performance ≠ Resource Utilization
Performance = Response Time & Latency
The Root of All Evil
• CPU and Memory are horrible indicators of performance
Latency is the appropriate measure of infrastructure performance
Response Time is the appropriate measure of application performance
Pick The Right Vendors
A Reference Architecture for your SDDC Management Stack
The SDDC Management Stack
• App Performance Mgmt
• Automation & Orchestration
• Infrastructure Perf. Mgmt
• Cloud Management
• Security*
• Operations Mgmt
• Big Data Repository
• Self-Learning Analytics
• Data Protection*
* Not Covered in this Presentation
Surgeon General’s Warning
Trying to use software products that do not exist (or do not work yet) is bad for your health
Big Data Repository
Potential Vendors
• VMware (Log Insight)
• Splunk
• Cloudera (Hadoop)
• 10gen (MongoDB)
• NuoDB
• Pivotal (HVE)
We Need a Multi-Vendor Management Data Store!
• App Performance Mgmt
• Automation & Orchestration
• Infrastructure Perf. Mgmt
• Cloud Management
• Security
• Operations Mgmt
• Big Data Repository
• Self-Learning Analytics
• Data Protection
Key Functionality
• All management products should feed one data store
• One version of the truth as to the state of the SDDC
• Since the SDDC is one “Domain”
• The only feasible way to do “entire-Domain” root cause and reporting
• The only feasible way to do “entire domain” analytics
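As a rough illustration of "one data store" fed by every management product, the sketch below uses a single record shape and an in-memory store. Everything here is an assumption made for illustration: the `Metric` fields, the `ManagementDataStore` class, and the source names are hypothetical and do not describe any vendor's schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Metric:
    source: str       # product that emitted the record (hypothetical names)
    entity: str       # VM, host, datastore, application, ...
    name: str         # metric name, e.g. "latency_ms"
    timestamp: float  # seconds since epoch
    value: float


class ManagementDataStore:
    """In-memory stand-in for a shared, multi-vendor management data store."""

    def __init__(self) -> None:
        self._by_entity: Dict[str, List[Metric]] = defaultdict(list)

    def ingest(self, metric: Metric) -> None:
        self._by_entity[metric.entity].append(metric)

    def timeline(self, entity: str) -> List[Metric]:
        """All records for one entity, from every source, in time order."""
        return sorted(self._by_entity.get(entity, []), key=lambda m: m.timestamp)

    def sources_for(self, entity: str) -> Set[str]:
        return {m.source for m in self._by_entity.get(entity, [])}
```

Because every product writes the same record shape, a single query such as `timeline("vm-1")` returns infrastructure and application data side by side, which is the precondition for the "entire domain" root cause analysis the slide calls for.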
Operations Management
Key Features
1. Host and guest resource utilization monitoring
2. Capacity Mgmt & Planning
3. Used by IT Operations
Example Vendors
• Cirba
• CloudPhysics
• HP (VPV)
• ManageEngine
• Quest (vOperations)
• Reflex Systems
• Solarwinds
• Splunk
• Veeam
• VMTurbo
• VMware vC OPS
• Zenoss
Key Criteria for Resource Based Performance and
Capacity Monitoring
• Out of the box value – if it is not providing value in 10 minutes junk it and find something else (auto-discovery is key)
• Collect data from vCenter AND the other virtualization platforms that you support or plan to support
• Look for the integration of performance management, capacity management, and configuration management
• Collecting, dashboarding, alerting, and reporting on vCenter data is commodity functionality – look for value in analytics and automation
Infrastructure Performance (Latency) Management
• Servers
• Storage
• SAN Fabric
• Network Fabric
Key Features
1. Understanding of end-to-end infrastructure performance
2. Capacity management and planning
3. Infrastructure response time is the key metric
4. Used by the team supporting the virtual infrastructure
Example Vendors
• AppNeta
• ExtraHop Networks
• Riverbed
• SevOne
• Virtual Instruments
• Xangati
Key Criteria for Infrastructure Response Time
Solutions
• Measure IRT – Monitor how long it takes the infrastructure to respond to requests for work, not how much resource it takes
• Deterministic – Get the real data, not a synthetic transaction, or an average
• Real Time – Get the data when it happens, not seconds or minutes later
• Comprehensive – Get all of the data, not a periodic sample of the data
• Zero-Configuration (Discovery) – Discover the environment and its topology, and keep this up to date in real time
• Application (or VM) Aware – Understand where the load is coming from and where it is going
• Application Agnostic – Work for every workload or VM type in the environment irrespective of how the application is built or deployed
Example - Infrastructure Performance
Management & Real Time Metrics
• Knowing whether performance is good or not all of the time requires measuring performance in a comprehensive, deterministic, and real time manner
• Averaging good transactions with bad transactions obscures the true nature and impact of the bad transactions
VMware vCenter: 5 Minute Average Data
Virtual Instruments VirtualWisdom: Real Time Data
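The effect of averaging can be shown with a few lines of arithmetic. The sketch below uses made-up latency samples (98 fast operations and 2 very slow ones in one interval) and compares the mean with the tail; the function name and the nearest-rank percentile method are choices made here, not something taken from either product.

```python
import statistics
from typing import Dict, List


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Return the mean alongside tail statistics for one interval of samples."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(latencies_ms)
    # Nearest-rank 99th percentile, using integer math for the ceiling.
    rank = (99 * len(ordered) + 99) // 100
    return {
        "mean": statistics.fmean(ordered),
        "p99": ordered[rank - 1],
        "max": ordered[-1],
    }


# Made-up interval: 98 fast operations and 2 very slow ones.
samples = [5.0] * 98 + [500.0] * 2
summary = summarize(samples)
```

For these samples the mean comes out under 15 ms while the two slow operations took 500 ms each, so an averaged view looks healthy even though some requests were a hundred times slower than the rest.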
Application Performance Management
Key Features
1. Understanding of app response time across the application system
2. Used by Operations and Application Support
Example Vendors
• AppEnsure
• AppDynamics
• AppFirst
• AppNeta
• BlueStripe
• Boundary
• Confio Software
• Correlsense
• Compuware (dynaTrace)
• ExtraHop Networks
• HP (Performance Anywhere)
• New Relic
• Quest (Foglight)
• Riverbed
APM is not just for Custom Applications
Apps Ops = Every Application!
Custom Developed Apps (DevOps), Legacy
• CA/Wily
• HP Diagnostics
• IBM ITCAM
• Precise
Custom Developed Apps (DevOps), Modern
• AppDynamics
• AppNeta (TraceView)
• Compuware
• HP (Perf. Anywhere)
• New Relic
• Quest (Foglight)
Every App (AppOps), Legacy
• BMC Patrol
• NetIQ
• HP BAC
• CA Unicenter/Spectrum
Every App (AppOps), Modern
• AppEnsure
• AppFirst
• BlueStripe
• Boundary
• Confio Software
• Correlsense
• ExtraHop
• Riverbed
Key Criteria for Application Response Time
Solutions
• Measure Actual Application Response Time – How long did it take, not how much resource it used
• Breadth of Application Support – Ideally support every application running in the environment automatically (conflicts with depth)
• Depth of Root Cause Diagnostics – Provide deep analysis into the application stack for root cause (conflicts with breadth)
• Deterministic – Get the real data, not a synthetic transaction, or an average
• Real Time – Get the data when it happens, not seconds or minutes later
• Comprehensive – Get all of the data, not a periodic sample of the data
• Application Discovery and Topology Mapping – Automatically discover new applications and their topology and keep this up to date automatically and continuously
• Analytics and Baselining – Avoid manual thresholds, learn normal behavior and alarm based upon deviations from normal
• Public Cloud Ready – Allow applications to be distributed across organizational boundaries, and have monitoring work with no firewall work
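The "Analytics and Baselining" criterion above can be illustrated with a very small baseline check. This is a deliberately simple stand-in (mean and standard deviation over recent history, with an assumed tolerance of three standard deviations); the deck does not describe how any listed product learns normal behavior, and the function name is hypothetical.

```python
import statistics
from typing import List


def deviates_from_baseline(history: List[float], value: float, k: float = 3.0) -> bool:
    """Return True when value is more than k standard deviations from the historical mean."""
    if len(history) < 2:
        return False  # not enough data to have learned a baseline yet
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return value != mean
    return abs(value - mean) > k * stdev


# Made-up response times (ms) observed for one transaction type.
recent = [100.0, 102.0, 98.0, 101.0, 99.0]
```

With this history, a new sample of 103 ms stays within the learned band while 120 ms is flagged, and nobody had to pick a static threshold for this application.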
Examples – Dynamic, Continuous, Real-Time
Application Response Time
AppEnsure
AppDynamics
BlueStripe
dynaTrace
Cloud Management
Key Features
1. Automated Provisioning of Services
2. Presentation of Services in a Service Catalog
Example Vendors
• BMC CLM
• Cisco (Cloupia)
• Citrix (Cloud.com)
• CloudBolt Software
• Embotics
• Eucalyptus
• FluidOps
• Piston Cloud (OpenStack)
• ServiceMesh
• VirtuStream
• VMware vCAC
The Three Phases of Cloud Management
1) AWS Clone Phase (Self-Service from IT)
  – Let IT offer what AWS offers
  – Probably not as easy
  – Probably not as flexible
  – Probably not as cheap
  – Why the first generation of Cloud Management failed
2) Tactical IT Agility Phase (Automated Provisioning)
  – Automates provisioning of tactical and simple production applications
  – Does not address anything that really matters to the business
  – Where we are now
3) Enterprise Application Phase (Lifecycle Management)
  – Automate the management of the applications that matter (DevOps, SAP)
  – Address the core of what IT does day in and day out
  – The strategy for the enterprise capable Cloud Management vendors
IT Automation in Your SDDC
Puppet Chef
vFabric AppDirector
Legacy Automation Process
Populate the Image
Assemble the Application
Self-Learning Analytics – The Only Way to Keep
up with your SDDC
Self-Learning
Analytics
• The right organization, the right tools, and the right data
• Combined with the right self-learning Analytics
• Leads to an automated “entire stack” Root Cause Analysis Process
App Performance Mgmt
Automation & Orchestration
Infrastructure Perf. Mgmt
Cloud Management
Security
Operations Mgmt
Big Data Repository
Data Protection
Real Time, Deterministic and Comprehensive Data
Prelert
Netuitive
VMW vC Ops
Before You Try to be Predictive….
• Instrument your infrastructure for end-to-end latency (Infrastructure Performance Management)
• Implement a real-time operational data store that can keep up with the rate of change in your virtual environment
• Implement a modern Developer focused APM solution for your critical custom developed applications
• Implement an Operations focused APM solution to measure response time for every application
• Get as real time, deterministic, and comprehensive as possible with all of your response time and latency metrics
• Reorganize and implement an Application Operations function staffed with application domain experts
• Operationalize finding and fixing problems in real time
• Then and only then – try to get truly predictive
Evaluation Criteria for Performance Analytics
• How automated is the learning (really)
• Diversity of accepted data (time series, events)
• Frequency and quantity of data inputs
• Breadth of plug-ins to the monitoring products you own, or are going to own
• Process for learning (handling) “normal” events
• Tradeoffs between false positives (false alarms) and false negatives (you missed something)
• Ease of implementation (time and cost)
• Quality of the Analysis (can you trust it?)
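The tradeoff between false positives and false negatives in the criteria above can be made concrete by sweeping an alert threshold over scored intervals. The sketch below assumes the analytics product emits an anomaly score per interval and that you have labeled which intervals were real incidents; the scores, labels, and function name are made up for illustration.

```python
from typing import List, Tuple


def false_alarm_and_miss_counts(scores: List[float],
                                is_incident: List[bool],
                                threshold: float) -> Tuple[int, int]:
    """Return (false positives, false negatives) when alerting on score >= threshold."""
    false_pos = sum(1 for s, y in zip(scores, is_incident) if s >= threshold and not y)
    false_neg = sum(1 for s, y in zip(scores, is_incident) if s < threshold and y)
    return false_pos, false_neg


# Made-up anomaly scores and ground truth for four intervals.
scores = [0.1, 0.4, 0.6, 0.9]
is_incident = [False, True, False, True]

sweep = {t: false_alarm_and_miss_counts(scores, is_incident, t) for t in (0.3, 0.5, 0.95)}
```

Lowering the threshold to 0.3 catches both incidents at the cost of one false alarm, while raising it to 0.95 removes the false alarm but misses both incidents. Running this kind of sweep on your own labeled data during a trial is one way to judge whether the analysis can be trusted.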
The Reference Architecture with VMware Management Solutions
The SDDC Management Stack: The VMware Implementation
• App Performance Mgmt: Partner Solutions
• Automation & Orchestration: vC Orchestrator and Puppet
• Infrastructure Perf. Mgmt: Future Networking Instrumentation
• Cloud Management: vCloud Automation Center
• Security: vShield
• Operations Mgmt: vCenter Operations Manager
• Big Data Repository: Log Insight
• Self-Learning Analytics: vC Ops & Log Insight Analytics
• Data Protection: VDP & SRM
The first vendor of an SDDC (VMware) will be the first vendor of an SDDC Management Stack (VMware)
A Reference Architecture for your SDDC Management Stack
The SDDC Management Stack
• App Performance Mgmt: AppDynamics, AppEnsure, AppFirst, AppNeta, BlueStripe, Boundary, Compuware, Correlsense, ExtraHop, INETCO, New Relic, Riverbed
• Infrastructure Perf. Mgmt: Confio, ExtraHop, GigaMon, Virtual Instruments, Xangati
• Cloud Management: CloudBolt, Embotics, FluidOps, ServiceMesh, VirtuStream
• Security*
• Operations Mgmt: Cirba, CloudPhysics, Dell, HP, Hotlink, VMTurbo, Zenoss
• Big Data Repository: Splunk
• Self-Learning Analytics: Netuitive, Prelert
• Data Protection*
• Automation & Orchestration: Puppet, Chef, Cloud Sidekick, Intigua
* Not Covered in this Presentation
Candidate Vendors to Manage Your SDDC
Virtualization Platform (vSphere, vCloud, Hyper-V, KVM, XenServer)
Cloud Management: BMC CLM, Cisco/Cloupia, Citrix (Cloud.com), CloudBolt, Embotics, Eucalyptus, FluidOps, Piston Cloud, ServiceMesh, vCAC, VirtuStream
Perf. & Cap Mgmt: Cirba, CloudPhysics, Hotlink, HP (VPV), ManageEngine, Quest, Reflex Systems, SolarWinds, Splunk, vC Operations, Veeam, VMTurbo, Zenoss
Infr. Perf. Mgmt: AppNeta, ExtraHop Networks, Riverbed, Sevone, Virtual Instruments, Xangati
App Perf. Mgmt: AppDynamics, AppEnsure, AppFirst, AppNeta, BlueStripe, Boundary, Compuware, Confio Software, Correlsense, ExtraHop, HP (Perf. Anywhere), New Relic, Quest (Foglight), Riverbed
Automation: App Director, Intigua, Opscode (Chef), Puppet, ScaleXtreme
Self-Learning Analytics: Netuitive, Prelert, vCenter Operations
One Final Point (Wrap Up)
• In this industry we are great at inventing things to solve problems that we did not know that we had
• The PC, the LAN, Client/Server, the Internet, Java, Server Virtualization, VDI, Clouds and Smartphones are all innovations that targeted previously unknown problems
• We are very good at propagating these innovations throughout enterprise organizations worldwide
• Every time we do this we forget about managing the innovation before we deploy it
• If you buy the right management products at the right time you can avoid repeating this mistake with your SDDC
Thank You
Building a New Management Stack for your
Software Defined Data Center (and your Cloud)
Bernd Harzog
Analyst, Virtualization Performance and Capacity Management
Other VMware Activities Related to This Session
HOL:
HOL-SDC-1301 Applied Cloud Operations
HOL-SDC-1313 vCloud Suite Use Cases - Infrastructure Provisioning (IaaS)
HOL-SDC-1314 vCloud Suite Use Cases Application Provisioning (PaaS)
HOL-SDC-1307 Enable Hybrid Cloud Automation & Governance with vCAC
Group Discussions:
VCM1002-GD, VCM1004-GD Cloud Operations with Hicham Mourad or Sam McBride
VCM1003-GD Cloud Automation with Naomi Sullivan