Post on 08-May-2015
© 2014 VMware Inc. All rights reserved.
Hybrid Cloud for Increased Scientific Agility
Josh Simons
Office of the CTO
Matt Herreras
Systems Engineering
Our Goal
Increase agility and decrease time to solution for scientists and engineers using
virtualization and cloud technologies
Problem Statements
• Fundamental tension between end-users and administrators
– End-user need for compute environments customized and optimized for their specific requirements
– Administrator need to deploy a secure, cost-effective, standard compute infrastructure
• Which leads to
– Disconnected islands of compute with security risk and cost inefficiencies
• Current HPC job scheduling approaches are not agile
– Job queuing delays can increase time to solution
– May overcommit physical resources on cluster nodes
• Which leads to
– Reduced cluster throughput and inherent inefficiencies
How Cloud and Virtualization Can Help
• Use virtualized HPC cloud infrastructure to simultaneously support the needs of both scientists / engineers and administrators
• Leverage inherent virtual platform capabilities to increase cluster throughput and increase efficiency
• Enable rapid end-user self-provisioning of new compute resources
• Provide policy-based enforcement of fair-share resource usage in multi-tenant environments
• Automate secure, compliant policy-based workload separation
• Add fault recovery and avoidance capabilities to protect applications from hardware failure, and to increase cluster throughput
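The fair-share bullet above can be illustrated with a small sketch. This is not VMware's scheduler or any vSphere API; it is a generic max-min fairness ("water-filling") calculation, and the tenant names and numbers are hypothetical. It shows one way a policy could divide a fixed pool of capacity among research groups whose combined demand exceeds what is available.

```python
from typing import Dict


def max_min_fair_share(capacity: float, demands: Dict[str, float]) -> Dict[str, float]:
    """Divide `capacity` among tenants using max-min fairness.

    Tenants asking for less than an equal share get their full demand;
    the capacity they leave unused is split among the remaining tenants.
    """
    allocation = {tenant: 0.0 for tenant in demands}
    unsatisfied = dict(demands)
    remaining = float(capacity)

    while unsatisfied and remaining > 1e-9:
        equal_share = remaining / len(unsatisfied)
        # Tenants whose whole demand fits inside an equal share.
        fits = {t: d for t, d in unsatisfied.items() if d <= equal_share}
        if not fits:
            # Everyone left wants more than an equal share: split evenly.
            for tenant in unsatisfied:
                allocation[tenant] = equal_share
            break
        for tenant, demand in fits.items():
            allocation[tenant] = float(demand)
            remaining -= demand
            del unsatisfied[tenant]
    return allocation


# Hypothetical example: 100 cores shared by three research groups.
shares = max_min_fair_share(100, {"group_a": 20, "group_b": 50, "group_c": 80})
for tenant, cores in shares.items():
    print(f"{tenant}: {cores:.1f} cores")
```

In this model a group with a small request is never penalized for the size of its neighbors' requests, which is the property a multi-tenant fair-share policy is usually meant to provide.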
End-User Customization
[Diagram: guests running Standard OS, OS A with App A, and OS B with App B, each on a virtualization layer over hardware]
Support groups with disparate software requirements
Including root access
Workload Separation
[Diagram: App A on OS A and App B on OS B in separate virtual machines, on virtualization layers over hardware]
Secure multi-tenancy
Fault isolation
…and sometimes performance
Virtual Machine Live Migration
[Diagram: virtual machines running OS A and OS B shown on different hosts before and after migration]
Use Resources More Efficiently
[Diagram: Apps A through E running across three hosts, each a virtualization layer over hardware]
Avoid killing or pausing jobs
Increase overall throughput
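A toy model can make the "avoid killing or pausing jobs" point concrete. The sketch below does not call vSphere, vMotion, or any real API; the `Host` class, the `evacuate` helper, and the VM sizes are all hypothetical. It only models the placement decision: when a host has to be drained, each running VM is moved to the host with the most free capacity instead of being terminated.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Host:
    """Hypothetical host model: a vCPU capacity and the VMs placed on it."""
    name: str
    cpu_capacity: int
    vms: Dict[str, int] = field(default_factory=dict)  # VM name -> vCPUs

    def free(self) -> int:
        return self.cpu_capacity - sum(self.vms.values())


def evacuate(source: Host, targets: List[Host]) -> List[Tuple[str, str]]:
    """Move every VM off `source`, largest first, to the emptiest target.

    Returns the list of (vm, destination host) moves. Raises RuntimeError
    if some VM does not fit anywhere; VMs moved before that stay moved.
    """
    moves = []
    # sorted() builds a list, so deleting from source.vms in the loop is safe.
    for vm, cpus in sorted(source.vms.items(), key=lambda kv: -kv[1]):
        target = max(targets, key=Host.free)
        if target.free() < cpus:
            raise RuntimeError(f"no target host can accept {vm} ({cpus} vCPUs)")
        target.vms[vm] = cpus
        del source.vms[vm]
        moves.append((vm, target.name))
    return moves


# Hypothetical cluster: host1 must be drained for maintenance.
host1 = Host("host1", 16, {"vm_a": 8, "vm_b": 4})
host2 = Host("host2", 16, {"vm_c": 4})
host3 = Host("host3", 16, {"vm_d": 10})
for vm, destination in evacuate(host1, [host2, host3]):
    print(f"migrate {vm} -> {destination}")
```

The jobs inside `vm_a` and `vm_b` keep running in this model; only their placement changes, which is what lets throughput stay up while hardware is taken out of service.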
Workload Agility
[Diagram: an app on an operating system directly on hardware, alongside app/OS virtual machines on virtualization layers over hardware]
Secure Private and Public Cloud for HPC
[Architecture diagram. Labeled components: Users (Research Group 1 to Research Group m); IT; User Portals; VMware vCAC API (Programmatic Control and Integrations); Security (VMware vShield); Blueprints; VMware vCloud Automation Center; VMware vCenter Server; VMware vSphere; Research Cluster 1 to Research Cluster n; Public Clouds]
Summary
• New approach to delivering HPC resources using cloud and virtualization technologies that can uniquely address the conflicting needs of end-users and administrators
• Move away from traditional static job placement to a more flexible, dynamic environment to raise throughput and increase scientific agility
• VMware continues to drive virtualization platform improvements to address an expanding range of HPC workloads
• This is only an overview. See the papers and blog listed in Resources for details on workload performance, platform tuning, and related topics.
Resources
• CTO HPC blog:
– http://cto.vmware.com/tag/hpc
• ACM Operating Systems Review paper:
– Virtualizing High Performance Computing
• HPCvirt 2011 workshop papers:
– Performance Evaluation of HPC Benchmarks on VMware's ESXi Server
  • http://labs.vmware.com/publications/performance-evaluation-of-hpc-benchmarks-on-vmwares-esxi-server
– Virtualizing Performance Counters
  • http://labs.vmware.com/publications/virtualizing-performance-counters
• Latency whitepaper:
– Best Practices for Performance Tuning of Latency-Sensitive Workloads in vSphere VMs
  • http://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf
• Big Data / Hadoop technical whitepaper:
– Virtualized Hadoop Performance with VMware vSphere 5.1
  • http://www.vmware.com/resources/techresources/10360
• InfiniBand performance:
– RDMA Performance in Virtual Machines with QDR InfiniBand on vSphere 5
  • http://labs.vmware.com/publications/ib-researchnote-apr2012
• Paravirtualized RDMA:
– Toward a Paravirtual RDMA Device for VMware ESXi Guests
  • http://labs.vmware.com/publications/vrdma-vmtj-winter2012
Thank You
Matt Herreras
mherreras@vmware.com
Josh Simons
simons@vmware.com