Big Data for Better Datacenters

Balaji Parimi (@vimAPIGuru) and Krishna Raj Raja (@esxtopGuru), Strata 2014, Santa Clara

description

Despite recent advances in operations management, data center management today remains largely a black art. Administrators have limited visibility into their data center operations, yet they must make important operations management decisions every day. A typical data center generates about a billion data points every day. A lot of insight could be gathered from this data, but because of its volume and scale, on-premise software solutions collect only a limited subset of it, which restricts them to a very narrow view of the data center. We at CloudPhysics have taken a different approach to this problem. We created an analytics platform in the cloud that provides the ability to query, slice and dice, and mash up the data with multiple data sources. This approach not only yields incredible insights but also solves many of the nagging operations management problems that have not been solved before. In this talk we give an overview of data center metadata and provide details on how CloudPhysics handles this data at scale using its platform.

Transcript of Big Data for Better Datacenters

Page 1: Big Data for Better Datacenters

Big Data for Better Datacenters

Balaji Parimi @vimAPIGuru

Krishna Raj Raja @esxtopGuru

Strata 2014, Santa Clara

Page 2: Big Data for Better Datacenters

World Today

Page 3: Big Data for Better Datacenters

The Irony

Page 4: Big Data for Better Datacenters

Analytics

Page 5: Big Data for Better Datacenters

1990s

Networking becomes mainstream

No effective means to pool storage and compute capacity

100s of silos to manage

[Diagram: two separate stacks, each with its own Application, OS, Compute, and Storage, connected by a Network]

Page 6: Big Data for Better Datacenters

2000s

Virtualization becomes mainstream

Improved resource pooling

Increased utilization

[Diagram: VMs running on Hypervisors over Compute hosts, with shared Network and Storage]

Page 7: Big Data for Better Datacenters

2010s

Cloud has become mainstream

Centralized management

[Diagram: Software Defined Datacenter, with VMs running over Compute, Network, and Storage]

70% of all the x86 OS instances run in virtual machines, 80% by 2016 (Gartner, 2012)

Page 8: Big Data for Better Datacenters

Data Centers Have Evolved But IT Operations Has Not

Page 9: Big Data for Better Datacenters

IT Operations Today

Retrofitted

Page 10: Big Data for Better Datacenters

IT Operations Today

Tedious

Page 11: Big Data for Better Datacenters

IT Operations Today

Paralyzing

Page 12: Big Data for Better Datacenters

IT Operations Today

Lot of Guesswork!

Page 13: Big Data for Better Datacenters

IT Operations Should Be Data-Driven, “Intuition and Guesswork” Should be Replaced By “Analytics and Predictive Intelligence”

Page 14: Big Data for Better Datacenters

Datacenter Efficiency

Statistically Significant Machine Metadata Available for Data Scientists

Page 15: Big Data for Better Datacenters

Half Million Private Clouds

Can Private Clouds Match The Efficiency?

Data Scientists + Machine Metadata

Page 16: Big Data for Better Datacenters

Operational Metadata Per Day

Private Cloud: 500 Virtual Machines, 50 Servers

1 Billion Data Points

25 GB
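As a rough sanity check, the per-day totals on this slide can be converted into an average ingest rate and an average size per data point. The sketch below uses only the slide's figures; reading "25GB" as 25 × 10^9 bytes and assuming data points arrive evenly through the day are simplifications made for illustration, and the function names are hypothetical.

```python
# Figures taken from the slide: one private cloud, per day.
DATA_POINTS_PER_DAY = 1_000_000_000
BYTES_PER_DAY = 25 * 10**9  # assumption: "25GB" read as decimal gigabytes
VIRTUAL_MACHINES = 500
SERVERS = 50

SECONDS_PER_DAY = 24 * 60 * 60


def points_per_second(points_per_day: int) -> float:
    """Average ingest rate, assuming points are spread evenly over the day."""
    return points_per_day / SECONDS_PER_DAY


def bytes_per_point(bytes_per_day: int, points_per_day: int) -> float:
    """Average storage footprint of a single data point."""
    return bytes_per_day / points_per_day


def points_per_object(points_per_day: int, objects: int) -> float:
    """Average daily data points per monitored object (VMs plus servers)."""
    return points_per_day / objects


if __name__ == "__main__":
    print(f"{points_per_second(DATA_POINTS_PER_DAY):,.0f} data points per second")
    print(f"{bytes_per_point(BYTES_PER_DAY, DATA_POINTS_PER_DAY):.0f} bytes per data point")
    objects = VIRTUAL_MACHINES + SERVERS
    print(f"{points_per_object(DATA_POINTS_PER_DAY, objects):,.0f} data points per object per day")
```

Under these assumptions a single private cloud averages roughly 11,600 data points per second at about 25 bytes each.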

Page 17: Big Data for Better Datacenters

Massive Scale

650 Million Users: 60 million tweets

1 Billion Users: 2.5 billion pieces of content

500,000 Datacenters

50 Million VMs

Quadrillion datapoints (10^15)

Page 18: Big Data for Better Datacenters

Challenges

Management Driven by Collective Intelligence

[Diagram: a Data Center of VMs on Servers with a Management Server, linked to a Control Plane by two flows labeled (Collect) and (Predict/Alert)]

Page 19: Big Data for Better Datacenters

Data Model

Entities: ORG, MGMT SERVER, CLUSTER, SERVER, DISK STORE, STORAGE CLUSTER, VDISK, VM, VNIC, VSWITCH, VPORTGROUP

Relationships: One to Many
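One way to picture the one-to-many relationships is as nested containers, where each parent holds a list of children. The sketch below is illustrative only: the slide names the entities, but this transcript does not preserve which entity is the parent of which, so the nesting chosen here (org, management server, cluster, server, VM, then virtual disks and NICs) and every class and field name are assumptions.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class VDisk:
    name: str
    capacity_gb: float


@dataclass
class VNic:
    name: str
    portgroup: str  # hypothetical: name of the vPortGroup this NIC attaches to


@dataclass
class VM:
    name: str
    vdisks: List[VDisk] = field(default_factory=list)  # one VM, many disks
    vnics: List[VNic] = field(default_factory=list)    # one VM, many NICs


@dataclass
class Server:
    name: str
    vms: List[VM] = field(default_factory=list)


@dataclass
class Cluster:
    name: str
    servers: List[Server] = field(default_factory=list)


@dataclass
class ManagementServer:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Org:
    name: str
    mgmt_servers: List[ManagementServer] = field(default_factory=list)


def iter_vms(org: Org) -> Iterator[VM]:
    """Walk the assumed hierarchy from the org down to every VM."""
    for mgmt in org.mgmt_servers:
        for cluster in mgmt.clusters:
            for server in cluster.servers:
                yield from server.vms


def provisioned_gb(org: Org) -> float:
    """Total virtual disk capacity across all VMs in the org."""
    return sum(disk.capacity_gb for vm in iter_vms(org) for disk in vm.vdisks)
```

With a hierarchy like this, a question such as "how much disk is provisioned across the organization" becomes a simple walk down the tree.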

Page 20: Big Data for Better Datacenters

Types of Data Streams

CONFIGURATION

PERFORMANCE

TASK & EVENTS

EXTERNAL DATA SOURCES

Page 21: Big Data for Better Datacenters

[Diagram: platform architecture. Components shown: On-Demand Analytics Engine; Prediction and Simulation Engine; Pre-Alerts and Notification; Distributed Datastore; Complex Stream Processing; Data Stream. Also shown: Collector, Data Analyst, Administrator, Partner]

Page 22: Big Data for Better Datacenters

Extract / Transform

Persist

Model

Stream Mashup

Prediction

Alerting

Job Scheduling

Resource Management

Enabling Private Clouds

Collective Intelligence
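To make a few of these stages concrete, here is a minimal sketch of Extract / Transform, Stream Mashup, and Alerting chained together as Python generators. It is not the CloudPhysics implementation: the `vm,metric,value` line format, the `mem_used_mb` and `mem_mb` names, and the 90% threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class PerfSample:
    vm: str
    metric: str
    value: float


def extract(raw_lines: Iterable[str]) -> Iterator[PerfSample]:
    """Extract / Transform: parse 'vm,metric,value' lines, skipping malformed ones."""
    for line in raw_lines:
        parts = line.strip().split(",")
        if len(parts) != 3:
            continue
        vm, metric, raw_value = parts
        try:
            value = float(raw_value)
        except ValueError:
            continue
        yield PerfSample(vm, metric, value)


def mashup(samples: Iterable[PerfSample],
           config: Dict[str, Dict[str, float]]) -> Iterator[dict]:
    """Stream Mashup: enrich each performance sample with configuration data."""
    for sample in samples:
        vm_config = config.get(sample.vm)
        if vm_config is None:
            continue  # no configuration known for this VM
        yield {"vm": sample.vm, "metric": sample.metric,
               "value": sample.value, **vm_config}


def memory_alerts(records: Iterable[dict], threshold_pct: float = 90.0) -> List[str]:
    """Alerting: flag VMs whose memory use exceeds a share of configured memory."""
    flagged = []
    for record in records:
        if record["metric"] != "mem_used_mb":
            continue
        used_pct = 100.0 * record["value"] / record["mem_mb"]
        if used_pct > threshold_pct:
            flagged.append(record["vm"])
    return flagged


if __name__ == "__main__":
    raw = ["vm-1,mem_used_mb,3900", "vm-2,mem_used_mb,1024", "not a sample"]
    config = {"vm-1": {"mem_mb": 4096}, "vm-2": {"mem_mb": 4096}}
    print(memory_alerts(mashup(extract(raw), config)))
```

The point of the mashup step is that neither stream is enough alone: the performance stream gives memory used, the configuration stream gives memory allocated, and only the join yields a utilization figure worth alerting on.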

Page 23: Big Data for Better Datacenters

Use Cases

Page 24: Big Data for Better Datacenters

Predicting Out of Disk Space Event

[Chart: Datastore Capacity (0% to 100%) by month, April to August, for Datastore 1, Datastore 2, and Datastore 3]

Disk Space Usage is Hard to Predict

Page 25: Big Data for Better Datacenters

Predicting Out of Disk Space

[Chart: Probability (0% to 100%) of a datastore being 80% Full, 90% Full, and 100% Full within < 1 Month, 6 Months, and 1 Year]

Probability Distribution Using Monte Carlo Simulation
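The slide does not describe how the simulation is built, so the following is only a sketch of one common way to apply Monte Carlo simulation to this problem: resample historical daily growth at random to generate many possible futures, then count how often usage crosses a fullness threshold within a time horizon. The function name, parameters, and sample history are all hypothetical.

```python
import random
from typing import Sequence


def prob_reaching(used_gb: float,
                  capacity_gb: float,
                  daily_growth_gb: Sequence[float],
                  threshold: float,
                  horizon_days: int,
                  trials: int = 2000,
                  seed: int = 0) -> float:
    """Estimate P(usage >= threshold * capacity within horizon_days).

    Each trial replays the horizon day by day, drawing that day's growth
    at random (with replacement) from the observed history.
    """
    if not daily_growth_gb:
        raise ValueError("need at least one historical growth observation")
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    limit = threshold * capacity_gb
    hits = 0
    for _ in range(trials):
        usage = used_gb
        for _day in range(horizon_days):
            usage = max(0.0, usage + rng.choice(daily_growth_gb))
            if usage >= limit:
                hits += 1
                break
    return hits / trials


if __name__ == "__main__":
    # Hypothetical history: GB added per day (negative values are deletions).
    history = [-2.0, 0.0, 0.5, 1.0, 3.0, 8.0]
    for horizon in (30, 180, 365):
        for threshold in (0.8, 0.9, 1.0):
            p = prob_reaching(600.0, 1000.0, history, threshold, horizon)
            print(f"{horizon:>3} days, {threshold:.0%} full: {p:.1%}")
```

Reporting a probability per threshold and horizon, rather than a single projected date, fits the previous slide's observation that disk space usage is hard to predict.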

Page 26: Big Data for Better Datacenters

Discovering Causality

1. Virtual Machine 18
2. Virtual Machine 23
3. Virtual Machine 108

Page 27: Big Data for Better Datacenters

Predicting Potential Outage

Page 28: Big Data for Better Datacenters

Predicting SSD Performance Impact

[Chart: Latency Reduction (0% to 100%) versus Solid State Drive Cache Per VM (16 GB, 32 GB, 64 GB, 128 GB) for VM 1, VM 2, and VM 3, with a Target Latency line and cache sizes of 62 GB and 24 GB called out]

Page 29: Big Data for Better Datacenters

Predicting Configuration Outliers

Page 30: Big Data for Better Datacenters

Predicting Performance Outliers

[Chart: CPU Co-efficient (0 to 90) versus Virtual Machine # (0 to 12) for Organization A, Organization B, and Organization C]
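The slides do not say how outliers are identified, so as an illustration only, the sketch below flags them with a standard robust statistic: the modified z-score, based on the median and the median absolute deviation (MAD), with the conventional cutoff of 3.5. The metric values and VM names are made up.

```python
from statistics import median
from typing import Dict, List


def mad_outliers(values: Dict[str, float], cutoff: float = 3.5) -> List[str]:
    """Return names whose modified z-score exceeds the cutoff.

    Modified z-score = 0.6745 * |x - median| / MAD. The median and MAD are
    used instead of the mean and standard deviation because a few extreme
    values barely move them.
    """
    center = median(values.values())
    mad = median(abs(v - center) for v in values.values())
    if mad == 0:
        return []  # at least half the values are identical; the score is undefined
    return sorted(name for name, v in values.items()
                  if 0.6745 * abs(v - center) / mad > cutoff)


if __name__ == "__main__":
    # Hypothetical per-VM metric values within one organization.
    org_a = {"vm-1": 20.0, "vm-2": 22.0, "vm-3": 21.0,
             "vm-4": 23.0, "vm-5": 19.0, "vm-6": 85.0}
    print(mad_outliers(org_a))
```

Comparing each VM against its own organization's median, rather than a global threshold, keeps one organization's unusual baseline from hiding or creating outliers in another.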

Page 31: Big Data for Better Datacenters

Some Closing Thoughts

Page 32: Big Data for Better Datacenters

Towards a New World

Page 33: Big Data for Better Datacenters

Thank You

Balaji Parimi @vimAPIGuru

Krishna Raj Raja @esxtopGuru