Transcript of CS-580K/452 Introduction to Cloud Computing
2. Server Virtualization
Recap: Cloud Computing
• By NIST (The National Institute of Standards and Technology)
• Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
• 5 features: shared resource pooling, on-demand self-service, broad network access, rapid elasticity, and measured service.
Shared Resource Pooling
▪ Ultimately, data center resources can be logically placed into three categories:
  ▪ Compute: a collection of all CPU (GPU or more) capabilities
  ▪ Networks: for data transmission
  ▪ Storage: for storing data
Why shared resource pooling?
▪ Multitenancy
  ▪ Cloud computing adopts a multi-tenant architecture
  ▪ Allows multiple cloud users to share computing resources (hence cost efficient)
▪ But the cloud needs to make sure that each tenant’s cloud resources are isolated and invisible to others
▪ How?
Why share a single server?
▪ Single cores become more powerful
▪ More and more cores per CPU
▪ Overall CPU utilization is low in data centers – 15% ~ 18%
▪ How to efficiently use a single server?
  ▪ Via virtualization techniques: processes, virtual machines, containers, etc.
Virtualization (1): Processes
Operating systems allow multiple applications to run simultaneously – via the abstraction of processes.
While running, a process is associated with physical state (e.g., CPU, memory and devices) maintained by the OS.
Processes access physical resources indirectly via a well-defined interface – system calls.
[Figure: four applications (APP1–APP4) running as processes on one OS, which provides the CPU scheduler and the system-call interface]
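The kernel-maintained state behind each process can be seen directly on a Linux machine through /proc. A small sketch (Linux-specific; assumes procfs is mounted at /proc):

```shell
# A process never touches hardware directly; the kernel tracks its state
# and mediates access via system calls. /proc exposes that kernel-held state.
grep '^Name:'  /proc/self/status     # the process's name, as the kernel sees it
grep '^VmRSS:' /proc/self/status     # resident memory, maintained by the OS
head -n 3 /proc/self/limits          # kernel-enforced resource limits
```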
Pros and Cons
▪ Pros:
  ▪ Multi-tasking: multiple applications (from different users) can run “simultaneously”
  ▪ Maximize resource utilization, e.g., multiple processes on a multi-core system
▪ Cons:
  ▪ 1. (unpredictable) Contention
Pros and Cons
▪ Cons:
  ▪ Contention
    ▪ Given fixed/limited resources, contention is inevitable
    ▪ But we desire more “predictable” behaviors
    ▪ E.g., a proportional fair-share CPU scheduler ensures a process can always get a fixed portion of a physical CPU (e.g., 50%)
Example: Memory Sharing
#include <stdlib.h>
#include <string.h>

int main() {
    int i;
    char *p[2048];
    // Allocate 2048 chunks of ~1 MB each (~2 GB total) and zero them
    for (i = 0; i < 2048; i++) {
        p[i] = malloc(1024000);        // allocate ~1 MB
        memset(p[i], '\0', 1024000);   // set that area to 0s
    }
    // Keep touching the chunks forever so every page stays in use
    while (1) {
        if (i >= 2048)
            i = 0;
        memset(p[i], '\0', 1024000);
        i++;
    }
}
If you run multiple (e.g., 10) such programs on a physical server with 4 GB memory, what would happen?
Memory Thrashing
http://www.firmcodes.com/memory-thrashing-in-operating-system/
▪ Memory thrashing is a problem that arises when the memory being accessed is much larger than the physical memory, so the system spends its time paging instead of doing useful work.
▪ Unpredictable contention
▪ Any solution?
Resource Scheduling
Example: Unfairness of I/O Resource Sharing
[Figure: IOPS (0 to 4×10⁴) of two processes sharing a storage device under CFQ, the default I/O scheduler in Linux. P1 issues 4 KB synchronous sequential reads (IO # = 1); P2 issues 4 KB asynchronous random reads with varying I/O concurrency (Async IO # = 1, 8, 32).]
A breach of performance isolation!
Pros and Cons
▪ Cons:
  ▪ 1. Contention
    ▪ Given fixed/limited resources, contention is inevitable
    ▪ But we desire more predictable behaviors
    ▪ Some “restricted”, fine-grained resource control is necessary to ensure predictable sharing
Pros and Cons
▪ Cons:
  ▪ 1. Contention
  ▪ 2. Dependency/Compatibility
    ▪ If your code does not need other libs (self-contained), probably fine
    ▪ If your code depends on libs and services (that OS distributions provide), very likely to have dependency issues
    ▪ The application’s version and the OS version should (strictly) match each other
[Figure: APP1 requires Library 1 v. 2.0 and APP2 requires v. 1.5, but the OS provides only v. 1.5]
Solution?
▪ We want to solve the following two main issues:
  ▪ 1. Contention
    ▪ Strong regulation, such as limiting (via metering)
    ▪ More advanced resource management: limit, share, and reservation
    ▪ For multiple resources, such as CPU, memory, block I/O, and network
  ▪ 2. Dependency/compatibility
    ▪ Provide processes with their own view of the system (limit what they can see and what they can use)
    ▪ With their own running environments (with their own library versions)
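The “limit, share, and reservation” controls above map onto Linux control groups (cgroups). A minimal sketch, assuming a cgroup v2 hierarchy mounted at /sys/fs/cgroup and root privileges; the group name “demo” is made up:

```shell
# Create a cgroup and cap its resources (cgroup v2, requires root).
mkdir /sys/fs/cgroup/demo
echo $((256 * 1024 * 1024)) > /sys/fs/cgroup/demo/memory.max  # 256 MB memory limit
echo "50000 100000" > /sys/fs/cgroup/demo/cpu.max             # 50% of one CPU (quota/period)
echo 200 > /sys/fs/cgroup/demo/cpu.weight                     # proportional CPU share
echo $$ > /sys/fs/cgroup/demo/cgroup.procs                    # move this shell into the group
```

Running the memory-thrashing program from earlier inside such a group would let the kernel keep its usage under 256 MB (reclaiming or OOM-killing within the group) rather than thrashing the whole machine.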
Virtualization (2): Containerization
▪ Containerization encapsulates an application in a container instance with its own execution environment (e.g., with its own binaries and dependencies)
▪ Resource utilization of a container will be strictly limited.
▪ Processes of the same container share the same unique view, namely namespaces.
  ▪ Multiple namespaces (process ID, network, file systems, IPC, etc.)
[Figure: four containers (Container1–Container4), each with its own libraries (lib1–lib4), running on one OS that provides control groups and namespaces]
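Namespaces can be tried directly with util-linux’s unshare, one of the primitives container runtimes build on. A sketch only: the first form needs root, and the -U form works only where the kernel allows unprivileged user namespaces.

```shell
# Start a shell in fresh PID and mount namespaces; with --mount-proc,
# /proc inside reflects only this namespace, and the shell sees itself as PID 1.
sudo unshare --pid --fork --mount-proc /bin/bash

# Unprivileged variant using a user namespace (where the kernel permits it):
unshare -U -r --pid --fork --mount-proc sh -c 'echo "my PID: $$"; ps'
```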
Containerization
• Containerization is increasingly popular because containers are:
• Flexible: Even the most complex applications can be containerized
• Lightweight: Containers leverage and share the host kernel
• Interchangeable: You can deploy updates and upgrades on-the-fly
• Portable: You can build locally, deploy to the cloud, and run anywhere
• Scalable: You can increase and automatically distribute container replicas
• Stackable: You can stack services vertically and on-the-fly
A Typical Use Case
• Containers can increase the portability of existing apps.
• Every business has a portfolio of older apps already running in its environment that are either serving customers or automating business processes.
• Once containerized, these apps can be augmented with additional services or transitioned into a microservices architecture.
Docker Container
▪ Docker is one container solution; others include:
  ▪ gVisor (Google): https://github.com/google/gvisor
  ▪ Kata Containers: https://katacontainers.io/supporters/
▪ Docker is written in the Go programming language.
▪ Docker containers are built upon Namespaces and Cgroups (kernel features).
https://www.docker.com/
Demo Time (create a Docker container)
• Get a Docker container image:
  docker image pull image_name   (e.g., ubuntu)
• Run a container from an image:
  • Run a “command” in an ubuntu container:
    docker run ubuntu echo “Hello World!”
  • Run a “shell” in an ubuntu container:
    docker run -it ubuntu /bin/bash
  • Run a webserver container:
    docker run --name some-nginx-2 -d -p 8080:80 nginx
Pros and Cons
▪ Pros:
  ▪ Predictable contention via resource limits
    ▪ E.g., enabled by cgroups
  ▪ Isolated execution environments
    ▪ E.g., via various unique namespaces
  ▪ Package an application and its dependent execution environment as a container image that can be launched anywhere
  ▪ Lightweight
    ▪ Can achieve close-to-native performance
    ▪ Can you verify it?
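One hedged way to check the close-to-native claim: time the same CPU-bound pipeline on the host and inside a container. This sketch assumes Docker and the ubuntu image from the demo above; the workload and size are arbitrary.

```shell
# Hash 1 GB of zeros natively, then inside a container, and compare times.
time sh -c 'head -c 1000000000 /dev/zero | sha256sum'
time docker run --rm ubuntu \
    sh -c 'head -c 1000000000 /dev/zero | sha256sum'
# For CPU-bound work like this, the two times are typically within a few percent.
```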
Pros and Cons
▪ Cons:
  ▪ Containers share the same host kernel
    ▪ Can’t boot a different OS
    ▪ No personal kernel modules or drivers
    ▪ Lack kernel-level development support
  ▪ Weak isolation
    ▪ Rely on the system call interface, which is huge (300+ system calls in Linux)
    ▪ If one container compromises the kernel, it can access all containers running on the same machine
    ▪ Not acceptable in public clouds
[Figure: Container A and Container B both sitting on the host kernel across the wide system-call interface]
Virtualization (3): Hardware Virtualization
▪ Hardware virtualization uses software to emulate the existence of hardware in the form of virtual hardware
  ▪ E.g., vCPU, vMem, virtual network card, virtual hard disks, etc.
▪ It then creates a virtual computer system, namely a virtual machine (VM)
  ▪ Consists of a set of (virtual) hardware, upon which a full-fledged kernel runs
[Figure: four VMs (VM1–VM4), each running its own OS (OS1–OS4) and application (APP1–APP4), on top of the operating system/hypervisor]
Virtualization (3): Virtual Machines
• Virtual Machine
  ▪ A fully protected and isolated copy of the underlying physical machine’s hardware (i.e., emulated by software)
• Virtual Machine Monitor (Hypervisor)
  ▪ A thin layer of software between the hardware and the operating system, virtualizing and managing all hardware resources for VMs
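As a concrete example of a hypervisor presenting virtual hardware, KVM/QEMU can boot a VM from the command line. A sketch only: it assumes qemu-system-x86_64 is installed, /dev/kvm is accessible, and disk.img is a placeholder for a guest disk image you provide.

```shell
# Boot a VM with 2 vCPUs, 2 GB of vMem, a virtual disk, and a virtual NIC.
qemu-system-x86_64 \
    -enable-kvm \
    -smp 2 \
    -m 2048 \
    -drive file=disk.img,format=qcow2 \
    -nic user,model=virtio-net-pci
```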
Demo for VMs
Old idea from the 1960s
• IBM VM/370 – a VMM for IBM mainframes
  ▪ Multiple OS environments on expensive hardware
  ▪ Desirable when few machines were around
• Popular research idea in the 1960s and 1970s
  ▪ Entire conferences on virtual machine monitors
  ▪ Hardware/VMM/OS designed together
• Interest died out in the 1980s and 1990s
  ▪ Hardware got much cheaper
  ▪ Operating systems got more powerful (e.g., multi-user)
A Return to Virtual Machines
▪ Commercial virtual machines for the x86 architecture
  ▪ VMware (1999–)
  ▪ Connectix VirtualPC (now Microsoft)
  ▪ KVM (Linux)
▪ Research virtual machines for the x86 architecture
  ▪ Xen (SOSP ’03)
  ▪ plex86
Pros and Cons of Hardware Virtualization
▪ Pros:
  ▪ Stronger isolation (than containers)
    ▪ More levels of protection – VM’s kernel and host kernel
    ▪ Narrow interface between VMs and hypervisor – 10+ hypercalls
  ▪ With full-fledged operating systems
▪ Cons:
  ▪ Performance overhead
    ▪ Due to multiple software layers
  ▪ Large memory footprint
[Figure: VM1 and VM2, each with its own OS (OS1) and application (APP1), atop the operating system/hypervisor across a narrow hypercall interface]
Comparisons: VMs vs. Containers
▪ Isolation
▪ Overhead
Takeaways:
• How to Share a Physical Computer?
[Figure: three approaches side by side –
  Processes: APP1–APP4 on one OS with a scheduler;
  Containers (OS-level virtualization): Container1–Container4, each with its own libraries (lib1–lib4), on one OS with a scheduler, resource limits, and namespaces;
  Virtual Machines (machine-level virtualization): APP1–APP4 each running on its own OS (OS1–OS4) atop a hypervisor]
What fuels Cloud Computing? Lots of large-scale data centers
https://www.youtube.com/watch?v=5d9s_Dxs9ck