Wes Lloyd, Shrideep Pallickara, Olaf David, James Lyon, Mazdak Arabi, Ken Rojas

Performance Modeling to Support Multi-Tier Application Deployment to Infrastructure-as-a-Service Clouds
Wes Lloyd, Shrideep Pallickara, Olaf David, James Lyon, Mazdak Arabi, Ken Rojas
November 6, 2012
Colorado State University, Fort Collins, Colorado USA
UCC 2012: 5th IEEE/ACM International Conference on Utility and Cloud Computing


Cloud Services Innovation Platform (CSIP)

Performance Modeling to Support Multi-Tier Application Deployment to Infrastructure-as-a-Service Clouds

Wes Lloyd, Shrideep Pallickara, Olaf David, James Lyon, Mazdak Arabi, Ken Rojas

November 6, 2012

Colorado State University, Fort Collins, Colorado USA
UCC 2012: 5th IEEE/ACM International Conference on Utility and Cloud Computing

Outline
- Background
- Research Problem
- Research Questions
- Experimental Setup
- Experimental Results
- Conclusions

Background

Traditional Application Deployment
[Figure: object store, single server]

Application Deployment
[Figure: object store, geospatial DB, rDBMS, NoSQL DB, file server, services, distributed cache, logging server, Apache Tomcat; IaaS cloud]

Application Component Deployment
[Figure: component deployment maps application components from the application stack (app server, file server, log server, load balancer, rDBMS r/o, rDBMS write, dist. cache) onto virtual machine (VM) images, Image 1 through Image n; outcome labeled PERFORMANCE]

Application Deployments
n = # components; k = # components per set
- Permutations
- Combinations
But neither describes partitions of a set!

Bell's Number
Number of ways a set of n elements can be partitioned into non-empty subsets.
- n = # components in the application stack (Model, Database, File Server, Log Server, ...)
- k = # configs (VM deployments)
- 1 VM : 1..n components
[Figure: component deployment of M, D, F, and L onto VMs, config 1 through config n]

# of Configurations
n     k
4     15
5     52
6     203
7     877
8     4,140
9     21,147
n     ...
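The configuration counts on this slide (15 for n = 4 up to 21,147 for n = 9) can be reproduced with a short computation. The sketch below is illustrative and not taken from the slides: it computes Bell numbers with the Bell triangle and enumerates every way to group four components labeled M, D, F, and L onto VMs. The function names bell_number and set_partitions are hypothetical.

```python
def bell_number(n):
    """Return the number of ways to partition a set of n elements."""
    if n == 0:
        return 1
    row = [1]  # first row of the Bell triangle
    for _ in range(n - 1):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for above in row:
            # Each further entry is the entry to its left plus the entry above that one.
            new_row.append(new_row[-1] + above)
        row = new_row
    return row[-1]


def set_partitions(items):
    """Yield every partition of items as a list of blocks (one block = one VM)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Option 1: the first component gets a VM of its own.
        yield [[first]] + partition
        # Option 2: the first component joins one of the existing VMs.
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]


if __name__ == "__main__":
    for n in range(4, 10):
        print(n, bell_number(n))
    for config in set_partitions(["M", "D", "F", "L"]):
        print(config)
```

Each yielded partition corresponds to one candidate deployment, for example [["M", "D"], ["F", "L"]] for two VMs hosting two components each.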

Research Problem

Problem Statement
How should application components be deployed to VMs?
- Provide high throughput (requests/sec)
- With low resource costs (# of VMs)
- To guide VM image composition
- Avoid resource contention from interfering components

[Figure: VMs sharing physical machine (PM) resources; PERFORMANCE depends on resource contention vs. resource surplus]

Resource Utilization Statistics
CPU
- CPU time
- CPU time in user mode
- CPU time in kernel mode
- CPU idle time
- # of context switches
- CPU time waiting for I/O
- CPU time serving soft interrupts
- Load average (# proc / 60 secs)

Disk
- Disk sector reads
- Disk sector reads completed
- Merged adjacent disk reads
- Time spent reading from disk
- Disk sector writes
- Disk sector writes completed
- Merged adjacent disk writes
- Time spent writing to disk

Network
- Network bytes sent
- Network bytes received

[Figure: VMs hosted on PMs]
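Counters like these are exposed by the Linux kernel under /proc. The slides do not say how the capture script read them, so the following is only a sketch under that assumption: it parses text in the /proc/stat and /proc/diskstats formats and reports the change between two snapshots. The function names, the field labels, and the sample text are hypothetical.

```python
# Field order of the aggregate "cpu" line in /proc/stat (values are in clock ticks).
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")

# Fields 4 through 11 of a /proc/diskstats line.
DISK_FIELDS = (
    "reads_completed", "reads_merged", "sectors_read", "ms_reading",
    "writes_completed", "writes_merged", "sectors_written", "ms_writing",
)


def parse_proc_stat(text):
    """Extract aggregate CPU counters and the context switch count."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "cpu":  # aggregate line only; per-core lines are cpu0, cpu1, ...
            for name, value in zip(CPU_FIELDS, parts[1:]):
                stats[name] = int(value)
        elif parts[0] == "ctxt":
            stats["context_switches"] = int(parts[1])
    return stats


def parse_diskstats(text, device):
    """Extract read/write counters for one block device, or {} if absent."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 11 and parts[2] == device:
            return dict(zip(DISK_FIELDS, map(int, parts[3:11])))
    return {}


def delta(before, after):
    """Resource usage between two snapshots of the same counters."""
    return {name: after[name] - before[name] for name in after if name in before}


# Made-up sample text in the kernel's format, for illustration only.
SAMPLE_STAT = "cpu  100 0 50 1000 20 0 5 0 0 0\ncpu0 100 0 50 1000 20 0 5 0 0 0\nctxt 12345\n"
SAMPLE_DISK = "   8       0 sda 10 2 80 15 20 4 160 30 0 40 45\n"

if __name__ == "__main__":
    print(parse_proc_stat(SAMPLE_STAT))
    print(parse_diskstats(SAMPLE_DISK, "sda"))
```

On a live Linux host the same functions could be fed the contents of /proc/stat and /proc/diskstats read before and after a model run, with delta giving the usage attributable to that run.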

Can Resource Utilization Statistics Model Application Performance?

Research Questions
RQ1) Which resource utilization statistics are the best predictors?
RQ2) How should resource utilization data be treated for use in models?
RQ3) Which modeling techniques are best for predicting application performance and ranking performance of service compositions?

Experimental Setup

RUSLE2 Model
- Revised Universal Soil Loss Equation
- Combines empirical and process-based science
- Prediction of rill and interrill soil erosion resulting from rainfall and runoff
- USDA-NRCS agency standard model
- Used by 3,000+ field offices
- Helps inventory erosion rates
- Sediment delivery estimation
- Conservation planning tool

RUSLE2 Web Service
- Multi-tier client/server application
- RESTful, JAX-RS/Java using JSON objects
- Surrogate for common architectures
[Figure: OMS3, RUSLE2, PostgreSQL, PostGIS; 1.7+ million shapes; 57k XML files, 305 MB]

Eucalyptus 2.0 Private Cloud
- (9) Sun X6270 blade servers
- Dual Intel Xeon 4-core 2.8 GHz CPUs
- 24 GB RAM, 146 GB 15k rpm HDDs
- CentOS 5.6 x86_64 (host OS)
- Ubuntu 9.10 x86_64 (guest OS)
- Eucalyptus 2.0
- Amazon EC2 API support
- 8 Nodes (NC), 1 Cloud Controller (CLC, CC, SC)
- Managed mode networking with private VLANs
- XEN hypervisor v 3.4.3, paravirtualization

RUSLE2 Components
Virtual Machine: Description
- M (Model): Apache Tomcat 6.0.20, Wine 1.0.1, RUSLE2 Model, Object Modeling System (OMS 3.0)
- D (Database): Postgresql-8.4 and PostGIS 1.4.0-2
  - soil data: 1.7 million shapes, 167 million points
  - management data: 98 shapes, 489k points
  - climate data: 31k shapes, 3 million points
  - 4.6 GB for the state of TN
- F (File Server): nginx http server 0.7.62; 57,185 XML files consisting of 305 MB
- L (Logger): Codebeamer 5.5 running 32-bit Apache Tomcat 6.0; custom REST/JSON logging service as wrapper

(15) Tested Component Deployments
[Figure: the 15 tested service compositions, SC1 through SC15, each a different grouping of the M, D, F, and L components onto VMs]
- Each VM deployed to separate physical machine
- All components installed on composite image
- Script enabled/disabled components to achieve configs

RUSLE2 Application Variants
            Model   Database   File I/O   Overhead   Logging
D-bound     21%     77%        .75%       1%         .1%
M-bound     73%     1%         18%        8%         1%
- D-bound: join w/ a nested query
- M-bound: standard model

Resource Utilization Variance for Component Deployments
[Figure: box plots for SC1 through SC15 of CPU time, disk sector reads, disk sector writes, net bytes rcvd, and net bytes sent]
- Boxes represent absolute deviation from mean
- Magnitude of variance for deployments

Tested Resource Utilization Variables
CPU
- CPU time
- CPU time in user mode (cpu usr)
- CPU time in kernel mode (cpu krn)
- CPU idle time (cpu_idle)
- # of context switches (contextsw)
- CPU time waiting for I/O (cpu_io_wait)
- CPU time serving soft interrupts (cpu_sint_time)
- Load average (loadavg) (# proc / 60 secs)

Disk
- Disk sector reads (dsr)
- Disk sector reads completed (dsreads)
- Merged adjacent disk reads (drm)
- Time spent reading from disk (readtime)
- Disk sector writes (dsw)
- Disk sector writes completed (dswrites)
- Merged adjacent disk writes (dwm)
- Time spent writing to disk (writetime)

Network
- Network bytes sent (nbr)
- Network bytes received (nbs)

Experimental Data Collection
[Figure: for each of the (15) RUSLE2 deployments (SC1, SC5, SC8, SC11, and SC14 shown), 20x ensembles of 100 random runs (JSON object); a script captures resource utilization data]
- 1st run: training dataset
- 2nd run: test dataset

Experimental Results

RQ1 Which are the best predictors? VM Variables
[Figure: VM variables grouped as CPU, disk I/O, and network I/O]

RQ1 Which are the best predictors? PM Variables
[Figure: PM variables grouped as CPU and network I/O]

RQ2 How should VM resource utilization data be used by performance models?
- Combination: RUdata = RU_M + RU_D + RU_F + RU_L
- Used individually: RUdata = {RU_M; RU_D; RU_F; RU_L}
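The two treatments of VM resource utilization data named for RQ2 can be sketched as follows. This is an interpretation rather than the authors' code: it assumes that "combination" means summing each statistic across the M, D, F, and L VMs, and that "used individually" means keeping one predictor per VM and statistic. The function names and the sample values are hypothetical.

```python
def combined_features(ru_by_vm):
    """RUdata = RU_M + RU_D + RU_F + RU_L: sum each statistic over all VMs."""
    totals = {}
    for stats in ru_by_vm.values():
        for name, value in stats.items():
            totals[name] = totals.get(name, 0) + value
    return totals


def separate_features(ru_by_vm):
    """RUdata = {RU_M; RU_D; RU_F; RU_L}: one predictor per VM and statistic."""
    return {
        f"{vm}.{name}": value
        for vm, stats in ru_by_vm.items()
        for name, value in stats.items()
    }


# Made-up utilization numbers for one model run, for illustration only.
EXAMPLE_RU = {
    "M": {"cpu_usr": 500, "dsr": 10},
    "D": {"cpu_usr": 300, "dsr": 900},
    "F": {"cpu_usr": 20, "dsr": 40},
    "L": {"cpu_usr": 5, "dsr": 2},
}

if __name__ == "__main__":
    print(combined_features(EXAMPLE_RU))
    print(separate_features(EXAMPLE_RU))
```

The combined form gives a model fewer predictors, while the separate form preserves which VM consumed each resource.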

RQ2 How should VM resource utilization data be used by performance models?
[Figure: panels for D-bound separate, D-bound combined, M-bound separate, and M-bound combined]
- Treating VM data separately for D-bound was better!
- RU_M or RU_MDFL for M-bound was better!
- Note the larger RMSE for D-bound RU_MDFL!

RQ3 Which modeling techniques were best?
- Multiple Linear Regression (MLR)
- Stepwise Multiple Linear Regression (MLR-step)
- Multivariate Adaptive Regression Splines (MARS)
- Artificial Neural Networks (ANNs)

RQ3 Which modeling techniques were best?
[Figure: results for Multiple Linear Regression, Stepwise MLR, Multivariate Adaptive Regression Splines, and Artificial Neural Network]
- RU_MDFL data used to compare models
- Had high RMSE_test error for D-bound (32% avg)
- Model performance did not vary much

Best vs. Worst
              D-bound   M-bound
RMSE_train    .11%      .08%
RMSE_test     .89%      .08%
rank err      .40       .66
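The slides report RMSE and rank error without defining them. As a rough illustration only, the sketch below computes a plain root mean squared error and, as an assumed definition, a rank error equal to the mean absolute difference between the actual and predicted rank of each service composition. The function names are hypothetical, and the slides' percentage RMSE values may rest on a normalization not shown here.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def ranks(values):
    """Rank positions (1 = smallest value); ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def mean_rank_error(actual, predicted):
    """Assumed definition: average absolute shift in rank position."""
    actual_ranks = ranks(actual)
    predicted_ranks = ranks(predicted)
    shifts = [abs(a - p) for a, p in zip(actual_ranks, predicted_ranks)]
    return sum(shifts) / len(shifts)


if __name__ == "__main__":
    # Made-up ensemble times for three service compositions.
    observed = [10.0, 20.0, 30.0]
    modeled = [25.0, 18.0, 33.0]
    print(rmse(observed, modeled))
    print(mean_rank_error(observed, modeled))
```

A rank error of zero means the model orders the service compositions exactly as measured, even if the predicted values themselves are off.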

Conclusions
RQ1) CPU statistics were the best predictors.
RQ2) The best treatment of resource utilization statistics was model specific.
  - (RU_MDFL) best for M-bound RUSLE2 (more I/O)
  - Individual VM stats (e.g. RU_M) best for D-bound RUSLE2 (more CPU)
RQ3) ANN and MARS provided lower RMS error. All models adequately predicted performance and ranks.

Questions

Extra Slides

Gaps in Related Work (Approaches & Gaps)
Existing approaches do not consider
- VM image composition
- Complementary component placements
- Interference among components
- Minimization of resources (# VMs)
- Load balancing of physical resources
Performance models ignore
- Disk I/O
- Network I/O
- VM and component location

Infrastructure Management (Problems & Challenges)
[Figure: service requests, load balancers, application servers, noSQL data stores, rDBMS, distributed cache]
- Scale Services
- Tune Application Parameters
- Tune Virtualization Parameters

Provisioning Variation (Problems & Challenges)
[Figure: request(s) to launch VMs; ambiguous mapping of VMs onto physical hosts; PERFORMANCE]
- VMs reserve PM memory blocks
- VMs share PM CPU / disk / network

Application Profiling Variables Predictive Power

Application Deployment Challenges
- VM image composition
- Service isolation vs. scalability
- Resource contention among components
- Provisioning variation across physical hardware

Resource Utilization Statistics
- VMs
  - Reserve PM memory
  - Share CPU, disk, and network I/O resources
- VM application performance
  - Reflects quality of load balancing of shared resources
  - Resource contention -> performance degradation
  - Resource surplus -> good performance, higher costs

Resource Utilization Variables
P/V   Statistic       Description
P/V   CPU time        CPU time in ms
P/V   cpu usr         CPU time in user mode in ms
P/V   cpu krn         CPU time in kernel mode in ms
P/V   cpu_idle        CPU idle time in ms
P/V   contextsw       Number of context switches
P/V   cpu_io_wait     CPU time waiting for I/O to complete
P/V   cpu_sint_time   CPU time servicing soft interrupts
V     dsr             Disk sector reads (1 sector = 512 bytes)
V     dsreads         Number of completed disk reads
V     drm             Number of adjacent disk reads merged
V     readtime        Time in ms spent reading from disk
V     dsw             Disk sector writes (1 sector = 512 bytes)
V     dswrites        Number of completed disk writes
V     dwm             Number of adjacent disk writes merged
V     writetime       Time in ms spent writing to disk
P/V   nbr             Network bytes sent
P/V   nbs             Network bytes received
P/V   loadavg         Avg # of running processes in last 60 sec

Experimental Data
- Script captured resource utilization stats
  - Virtual machines
  - Physical machines
- Training data: first complete run
  - 20 different ensembles of 100 model runs
  - 15 component configurations
  - 30,000 model runs
- Test data: second complete run
  - 30,000 model runs