Scheduling and Guided Search for Cloud Applications

Cristiana Amza


Transcript of Scheduling and Guided Search for Cloud Applications

Scheduling Bio-Medical Applications on Clusters

Scheduling and Guided Search for Cloud Applications
Cristiana Amza

Big Data is Here
- Data growth (by 2015) = 100x in ten years [IDC 2012]
- Population growth = 10% in ten years
- Monetizing data for commerce, health, science, services, ...
[source: Economist; slide courtesy of Babak Falsafi]

Data Growing Faster than Technology
- WinterCorp Survey, www.wintercorp.com
- Growing technology gap
[slide courtesy of Babak Falsafi]

Challenge 1: Costs of a Datacenter
- Estimated costs of a datacenter: 46,000 servers, $3,500,000 per month to run
- Data courtesy of James Hamilton [SIGMOD11 keynote]; 3-year server and 10-year infrastructure amortization
- Server & power are 88% of the total cost
(Speaker notes: James Hamilton from Amazon provided this cost estimate as part of his SIGMOD11 keynote. He describes a typical datacenter consisting of 46 thousand servers that costs $3.5 million per month to run, and breaks the costs down into several dimensions: servers, power, power distribution, and networking. Server and power costs are 88 percent of the total operating costs, so improving the resource utilization of servers is crucial for improving the cost-efficiency and lowering the costs of the datacenter.)

Datacenter Energy Not Sustainable
- Modern datacenters: 20 MW!
- In the modern world, 6% of all electricity, growing at >20%!

[Chart: datacenter energy, billion kilowatt-hours/year, 2001-2017]

A Modern Datacenter

- 17x a football stadium, $3 billion
- 50 million homes
[slide courtesy of Babak Falsafi]
(Speaker notes: How many homes?)

"Amazon Can't Recover All Its Cloud Data From Outage" (Max Eddy, 27 April 2011, www.geekosystem.com)
"When the Cloud Fails: T-Mobile, Microsoft Lose Sidekick Customer Data" (Om Malik, 10 October 2009, gigaom.com)
"Whoops: Facebook loses 1 billion photos" (Chris Keall, 10 March 2009, The National Business Review)
"Cloud Storage Often Results in Data Loss" (Chad Brooks, 10 October 2011, www.businessnewsdaily.com)

Cloudy with a chance of failure
[slide courtesy of Haryadi S. Gunawi]

Challenge 2: Data Management (Anomalies)
- Problems are entrenched: I have been working in this area since 2001, and the problems have only grown more complex/intractable
- Same old distributed-systems problems
- New: levels of indirection (remote processing, deep software stacks, VMs, etc.)
- E.g., cloud monitoring and logging data (terabytes per day), but no notable success stories with analyzing such data

[Diagram: three Map tasks feeding three Reduce tasks]

Challenge 3: Paradigm Limitations
- MapReduce parallelism: embarrassing/simplistic
- Works for aggregate ops
- Simple scheduling

Hadoop/Enterprise: Separate Storage Silos
- Hardware $$$
- Periodic data ingest
- Cross-silo data management $$$
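The "embarrassing" parallelism and aggregate-op style that the slide attributes to MapReduce can be illustrated with a toy word count; this is a generic sketch, not code from the talk:

```python
from collections import defaultdict

def map_phase(chunks):
    # Map: each chunk independently emits (word, 1) pairs; chunks can be
    # processed in parallel with no coordination (embarrassingly parallel).
    for chunk in chunks:
        for word in chunk.split():
            yield word, 1

def reduce_phase(pairs):
    # Reduce: a simple aggregate op (sum) per key; the scheduler only needs
    # to hash keys onto reducers, hence the "simple scheduling".
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

chunks = ["big data is here", "big data grows"]
counts = reduce_phase(map_phase(chunks))
```

Because the reduce side is a pure aggregation, the paradigm stays simple to schedule, which is also why richer workloads (like the guided searches discussed later) outgrow it.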

What Can We Do? Find Meaningful Apps
- We can produce/find tons of data
- Need to analyze something of vital importance to justify draining vital resources
- Otherwise, the simplest solution is to stop creating the problem(s)

What Can We Do? Consolidate Research Agendas
- Find overarching, mission-critical paradigms; the state of the art (MapReduce) is too simplistic
- Develop standards, common tools, and benchmarks
- Integrate solutions, think holistically
- Enforce accountability for the data center / cloud provider

Opportunity 1: The Brain Challenge
- Started to explore neuroscience workloads in 2010, after a Brain Summit/Workshop held at IBM TJ Watson
- Started a collaboration with Stephen Strother at Baycrest a year later
- An application that is both data- and compute-intensive
- Boils down to an optimization problem in a highly parametrized search space

Opportunity 2: Guided Modeling
- Performance modeling, energy modeling, anomaly modeling, biophysical modeling: all tend to be interpolations/searches/optimizations in highly parametrized spaces
- Key idea: develop a common framework that works for all
- Extend the way MapReduce standardized aggregation ops
- Guidance: operator reduction, linear interpolation, etc.

Building Models Takes Time

[3D plot: avg. latency (low to high) vs. DB memory and storage memory, 0-16 GB each]
- 32x32 = 1024 sampling points
- Actuate a live system and take experimental samples
- Sample in 512 MB chunks; 15 minutes for each point
- Exhaustive sampling takes ~11 days!

Goal: Reduce Time by Model Reuse
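The exhaustive-sampling cost quoted on the slide follows from quick arithmetic:

```python
# Back-of-envelope check of the slide's exhaustive-sampling cost:
# a 32x32 grid over (DB memory, storage memory), 15 minutes per sample point.
points = 32 * 32           # 1024 sampling points
minutes = points * 15      # total sampling minutes
days = minutes / 60 / 24   # ~10.7 days, i.e. the slide's "11 days"
```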

[3D plot: avg. latency vs. DB resources and storage resources]
Model reuse across management tasks:
- Dynamic Resource Allocation [FAST09]
- Capacity Planning [SIGMOD13]
- What-if Queries [SIGMETRICS10]
- Anomaly Detection [SIGMETRICS10]
- Towards a Virtual Brain Model [HPCS14]

So What Are Performance Models?
- They provide a resource-to-performance mapping

Service Provider / Customer Management Interactions
- Use fewer resources: the customer wants 1000 TPS. What is the most efficient configuration (e.g., CPU/memory) to deliver it?
- Share resources: can I place customer A's DB alongside customer B's DB? Will their service levels be met?
- Use the right amount of resources: what will the performance (e.g., query latency) be if I use 8 GB of RAM instead of 16 GB?
- Solve performance problems (customer DBA): "I'm only getting 500 TPS. What's wrong? Is the cloud to blame?"

We need to build performance models to understand these questions, plus libraries/archives of models.

Black-box Models (data-driven)
- Minimal assumptions
- Need lots of samples
- Could over-fit

Analytical Models (knowledge-driven)
- No samples required
- Difficult to derive
- Fragile to maintain

Gray-box Models
- Few samples needed
- Can be adapted
- Still need to derive

=> Use an ensemble of models.
(Speaker notes: Performance models come in many types; some are data-driven and others are more knowledge-driven.)

Model Ensemble Approach
1. Guidance as trends and patterns (candidate x-y curve shapes); use data
2. Automatically tune the models; test & rank
3. Rank the models, use a blend
Repeat (if needed).
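A minimal sketch of the three-step ensemble loop, with hypothetical data and only two candidate shapes (the real system tunes and blends a richer catalog):

```python
import numpy as np

# Hypothetical data: latency vs. allocated memory (names are illustrative).
x = np.linspace(1.0, 16.0, 20)
y = 5.0 / x + 0.1                     # ground truth is roughly inverse

# Step 1: guidance, i.e. candidate trend shapes proposed by the operator.
candidates = {
    "linear":  lambda x, a, b: a * x + b,
    "inverse": lambda x, a, b: a / x + b,
}

# Step 2: automatically tune each candidate by least squares on its basis.
def fit(shape, x, y):
    basis = {"linear": x, "inverse": 1.0 / x}[shape]
    A = np.column_stack([basis, np.ones_like(x)])
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a, b

# Step 3: test & rank by mean squared error (a blend would weight by rank;
# here we simply pick the best-ranked shape).
errors = {}
for name in candidates:
    a, b = fit(name, x, y)
    errors[name] = np.mean((candidates[name](x, a, b) - y) ** 2)

best = min(errors, key=errors.get)
```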

How to Specify Guidance
- SelfTalk: a language to describe relationships
- Provides a catalog of common functions
- Specifies model inputs and parameters; curve-fitting and validation algorithms
- Details in the SIGMETRICS10 paper

    HINT myHint
    RELATION LINEAR(x,y)
    METRIC (x,y) {
      x.name=MySQL.CPU
      y.name=MySQL.QPS
    }
    CONTEXT (a) {
      a.name=MySQL.BufPoolAlloc
      a.value >= 512MB
    }

Refine Models Using Data
- Use hints to link relations to metrics
- The hint says CPU is linearly correlated to QPS, but the working set should be in RAM (the CONTEXT clause)
- Learn parameters using data (or request more data)

Rank Models and Blend
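The HINT above declares a LINEAR relation between MySQL.CPU and MySQL.QPS, valid while the buffer pool is at least 512 MB. A sketch of how such a hint could be curve-fitted and validated against samples (the data points are illustrative, not real measurements):

```python
import numpy as np

# Illustrative samples of (MySQL.CPU %, MySQL.QPS), taken while
# MySQL.BufPoolAlloc >= 512 MB, i.e. while the hint's CONTEXT holds.
cpu = np.array([10.0, 20.0, 40.0, 60.0, 80.0])
qps = np.array([105.0, 198.0, 405.0, 595.0, 810.0])

# Curve-fit the declared LINEAR(x, y) relation: qps ~ slope * cpu + intercept.
slope, intercept = np.polyfit(cpu, qps, 1)

# Validate: accept the hint only if the fit explains the data (R2 close to 1);
# otherwise the engine would request more data or reject the relation.
pred = slope * cpu + intercept
r2 = 1 - np.sum((qps - pred) ** 2) / np.sum((qps - qps.mean()) ** 2)
hint_holds = r2 > 0.95
```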

[3D plot: avg. latency vs. DB memory and storage memory, 0-16 GB]
1. Divide the search space into regions
2. n-fold cross-validation to rank models
3. Associate the best model to each region

Prototype
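The region-partition-plus-ranking steps could look like this on a 1-D slice of the search space; the region boundary, candidate models, and data are all illustrative:

```python
import numpy as np

# Illustrative latency curve vs. memory: flat while the working set fits
# in RAM, then rising linearly (a common shape on these slides).
x = np.linspace(0.0, 16.0, 64)
y = np.where(x < 8.0, 1.0, 1.0 + 2.0 * (x - 8.0))

def cv_error(xs, ys, degree, folds=4):
    # n-fold cross-validation error for a polynomial model of given degree.
    idx = np.arange(xs.size)
    err = 0.0
    for f in range(folds):
        test = idx % folds == f
        coef = np.polyfit(xs[~test], ys[~test], degree)
        err += np.mean((np.polyval(coef, xs[test]) - ys[test]) ** 2)
    return err / folds

# 1. Divide the search space into regions; 2. rank candidate models per
# region by CV error; 3. associate the best-ranked model (here: polynomial
# degree) to each region.
regions = [x < 8.0, x >= 8.0]
best_degree = [min((0, 1), key=lambda d: cv_error(x[r], y[r], d)) for r in regions]
```

The flat region is best served by a constant model and the rising region by a linear one, which is the per-region association the slide describes.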

- SelfTalk, a catalog of models, and data
- MySQL & storage server

How should I partition resources between two applications A and B?

Runtime Engine
[Diagram: Model Matching -> Model Repository (model a new workload; reuse similar data/models) -> Expand Samples -> Model Validation and Refinement (refine if necessary); a selective, iterative, ensemble learning process]
(Speaker notes: We have already introduced the main functionalities of SelfTalk; this slide walks through the modeling process of the runtime engine. First, when we start to model a new workload, the model-matching module retrieves similar data or models from the Chorus repository. To refine the model, the runtime expands the sampling set by actuating the live system, and then ranks models for each region. If the model is ready, it is put into the Chorus repository for later use. If the accuracy requirement is not met, the steps are repeated. This is a selective, iterative, and ensemble learning process, explained in the following slides.)

(Speaker notes: Chorus is an iterative, interactive model-building process: the sysadmin can pose new model templates and new inquiries during the iterative process. In other words, Chorus gives the sysadmin an interactive way to take part in the modeling process.)

Ex 1: Predicting Buffer Pool Latencies

- Analytical model replaced with data-driven models

Ex 2: Model Transformation

Guidance: step function (3D-to-2D reduction)

Cache sizes:
          L1       L2       L3
Core i7   256 kB   1024 kB  8192 kB
Xeon      256 kB   2048 kB  20480 kB

- Read throughput drops at the L1 cache size for both CPUs
- L2 does not have a significant effect on either CPU
- Read throughput drops again at the L3 cache size for both CPUs; the Xeon's L3 is larger than the Core i7's, so the Xeon keeps up its read performance for the 16M data set

With minimal new samples.
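A sketch of the step-function guidance: given a throughput-vs-working-set curve (synthetic numbers shaped like the slide's Core i7 plot), recover the breakpoints as the largest drops, which land at the L1 and L3 cache sizes:

```python
import numpy as np

# Illustrative read-throughput curve vs. working-set size (kB), shaped like
# the slide: plateaus with drops at the L1 and L3 sizes of a Core i7-like CPU.
sizes = np.array([64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768])
throughput = np.array([9.0, 9.0, 9.0, 5.0, 5.0, 5.0, 5.0, 5.0, 2.0, 2.0])

# Step-function guidance: model throughput as piecewise-constant and recover
# the breakpoints as the two largest downward jumps between adjacent samples.
drops = throughput[:-1] - throughput[1:]
breakpoints = sizes[np.argsort(drops)[-2:]]  # sizes just before the 2 biggest drops
```

The recovered breakpoints (256 kB and 8192 kB) match the Core i7's L1 and L3 sizes, while L2 contributes no visible step, mirroring the slide's observation.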

Ex 3: Modeling and Job Scheduling for the Brain
- Data centers usually have a heterogeneous structure: a variety of multicores, GPUs, etc.
- Different stages of the application have different resource demands (CPU- versus data-intensive)
- Job scheduling onto available resources becomes non-trivial
- Guided modeling helps

Functional MRI

- Goal: studying brain functionality
- Procedure: ask patients (subjects) to perform a task and capture brain slices measuring blood-oxygen level; correlate images to identify brain activity

Overall Pipeline: Subject Selection & Experimental Design -> Data Acquisition -> Data Preprocessing -> Analysis Model -> Results

Functional MRI [image slide]

NPAIRS As Our Application
- NPAIRS goal: processing images to find image correlations
- Feature extraction: a common technique in image-processing applications (e.g., face recognition)
- Uses Principal Component Analysis to extract eigenvectors
- Finds a set of eigenvectors that is a good representative of the whole set of subjects
- Machine-learning methods, heuristic search, etc.
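PCA-based feature extraction can be sketched with an SVD on toy data standing in for scan vectors (the dimensions and latent patterns are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in for fMRI scans: 50 "scans", each a 200-voxel vector,
# generated from 3 latent spatial patterns plus a little noise.
patterns = rng.normal(size=(3, 200))
weights = rng.normal(size=(50, 3))
scans = weights @ patterns + 0.01 * rng.normal(size=(50, 200))

# PCA via SVD of the mean-centred data: the rows of vt are the eigenvectors
# (principal components) that form a compact basis for the whole set.
centred = scans - scans.mean(axis=0)
u, s, vt = np.linalg.svd(centred, full_matrices=False)
explained = (s ** 2) / np.sum(s ** 2)

# Keep the few eigenvectors that represent most of the variance.
k = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
basis = vt[:k]
```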

[Diagram: NPAIRS split-half procedure. The full data (scans + design data) for split J is divided into split-half 1 (subjects SJ1) and split-half 2 (subjects SJ2); each half yields a statistical parametric map (SPM), and the correlation between the two maps gives the reproducibility estimate r.]

Output of NPAIRS
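The reproducibility estimate r reduces to a correlation between the two split-half SPMs; a toy sketch with synthetic maps (sizes and noise levels are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy stand-in: a true activation map over 500 voxels, observed through two
# disjoint split-halves of the subjects, each adding independent noise.
true_spm = rng.normal(size=500)
spm_half1 = true_spm + 0.3 * rng.normal(size=500)
spm_half2 = true_spm + 0.3 * rng.normal(size=500)

# NPAIRS-style reproducibility estimate: the correlation r between the two
# statistical parametric maps computed from the independent halves.
r = np.corrcoef(spm_half1, spm_half2)[0, 1]
```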

NPAIRS Flowchart [figure]

NPAIRS Profiling Results [figure]

GPU Execution Profile [figure]

NPAIRS Execution on Different Nodes [figure]

Job Modeling: Exhaustive Sampling

Sample set: 1 to 99

Fitness Score R2: 0.995

Total Run Time: 64933

Uniform Sampling

Sample set: 2 12 22 32 42 52 62 72 82 92

Using 5-fold cross validation

Fitness Score R2: 0.990

Total Run Time: 6368

Guidance: Step Function + Fast Sampling

Sample set: 2 4 8 12 16 20 24 32 48 96
Fitness Score R2: 0.993

Total Run Time: 5313 (16.6% time saving!)

Heterogeneous, CPU Only (3 Fat + 5 Light Nodes) [figure]
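The three sampling strategies can be mimicked on a synthetic runtime curve (the function, noise, and costs are illustrative, not the talk's measurements): fit the same model from each sparse sample set and score it against the exhaustive sweep.

```python
import numpy as np

rng = np.random.default_rng(3)

def measure(n):
    # Illustrative job-runtime curve vs. a job parameter n: a fixed overhead
    # plus a part that shrinks with n, observed with mild noise.
    return 20.0 + 600.0 / n + rng.normal(0.0, 0.5, np.shape(n))

n_all = np.arange(1, 100)                                    # exhaustive: 1..99
n_uniform = np.arange(2, 100, 10)                            # 2 12 22 ... 92
n_guided = np.array([2, 4, 8, 12, 16, 20, 24, 32, 48, 96])  # dense where the curve bends

truth = 20.0 + 600.0 / n_all          # noiseless reference used for scoring

def fit_and_score(n_sample):
    y = measure(n_sample)              # pay the sampling cost: run each point once
    A = np.column_stack([np.ones(n_sample.size), 1.0 / n_sample])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = coef[0] + coef[1] / n_all
    r2 = 1 - np.sum((truth - pred) ** 2) / np.sum((truth - truth.mean()) ** 2)
    return r2, y.sum()                 # fit quality and total sampling time

r2_uniform, cost_uniform = fit_and_score(n_uniform)
r2_guided, cost_guided = fit_and_score(n_guided)
cost_exhaustive = truth.sum()
```

Both sparse strategies recover essentially the exhaustive fit for a small fraction of the total sampling time, which is the point the three slides make.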

3 Fat, 3 Light, 3 GPU Nodes [figure]

Resource Utilization [figure]

Overall Execution Time Comparison [figure]

Conclusions
- Big Data processing is driving a quantum leap in IT, but is hampered by slow progress in data-center management
- We propose to investigate guided modeling
- Promising preliminary results with neuroscience workloads: 7x speedup of NPAIRS on a small CPU+GPU cluster

Backup Slides

Modeling Procedure
- Get the sample set and split it using 5-fold cross-validation
- Fit the model using 4 folds of sample data
- Test the model using the remaining fold
- Try all 5 splits, and sum the model error
- If the error is less than the threshold, stop: we have found the model
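The backup-slide procedure, sketched on synthetic data (the model family, noise level, and error threshold are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sample set: (x, runtime) pairs from a linear trend plus noise.
x = np.linspace(1.0, 50.0, 25)
y = 3.0 * x + 10.0 + rng.normal(0.0, 1.0, x.size)

def five_fold_error(x, y, degree):
    # Fit on 4 folds, test on the held-out fold; try all 5 splits and sum
    # the model error, exactly as the slide describes.
    idx = np.arange(x.size)
    total = 0.0
    for fold in range(5):
        test = idx % 5 == fold
        coef = np.polyfit(x[~test], y[~test], degree)
        total += np.sum((np.polyval(coef, x[test]) - y[test]) ** 2)
    return total

# Stop when the summed error falls below a tolerated threshold (illustrative).
threshold = 2.0 * y.size
model_found = five_fold_error(x, y, degree=1) < threshold
```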

Notes
- Total run time is the sum of the total sampling time; modeling time is negligible
- Use the exhaustive data set as the true values and the fitted model to predict values, then compute the coefficient of determination R2
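The R2 computation described in the note, written out as a small helper:

```python
# Coefficient of determination R2 as used on the slides: the exhaustive
# samples are treated as ground truth, the fitted model supplies predictions.
def r_squared(truth, pred):
    truth_mean = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, pred))
    ss_tot = sum((t - truth_mean) ** 2 for t in truth)
    return 1 - ss_res / ss_tot

# A perfect prediction gives R2 = 1; always predicting the mean gives R2 = 0.
truth = [1.0, 2.0, 3.0, 4.0]
r2_perfect = r_squared(truth, truth)
r2_mean = r_squared(truth, [2.5, 2.5, 2.5, 2.5])
```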