Cloud Operating System Unit 12 Cloud System Management I M. C. Chiang Department of Computer Science and Engineering National Sun Yat-sen University Kaohsiung, Taiwan, ROC Cloud Operating System


Cloud Operating System

Unit 12: Cloud System Management I

M. C. Chiang

Department of Computer Science and Engineering National Sun Yat-sen University

Kaohsiung, Taiwan, ROC


Outline
  Out of the Machine
    IDC Management
  Based on the Machine
    Service Availability
    Virtual Machine Management
      The Management Tool: libvirt
      Snapshot and Checkpoint
      Live Migration
    Virtual Machine Security
    Rootkit
  Summary

04/19/23 Cloud Operating System - Unit 12: Cloud Management U12-2


Out of the Machine

The cloud is more than what appears on the network. Many physical elements support it:
  Servers
  Power supplies
  Air conditioning
  Staff members
  etc.

The virtual is built on the physical.


IDC Management

A comfortable environment for the machines
  Controlled temperature and humidity
Protection against natural disasters
  Floods, earthquakes
Protection against power failure
  UPS
Protection against microwaves
Well-planned escape routes
Guarded entrance
Limited use of data storage media
  As in the scenario from the movie “Transformers”


Based on the Machine

Ultimately, what customers care about is the service provided.

Two viewpoints raise the important issues:
  Customers think about what the cloud should be.
  Maintainers think about what helps the cloud become that.


Service Availability (1)

Very important for all services
Amazon EC2 guarantees at least 99.95% availability in its service level agreement (at most about 262.8 minutes of downtime per year)
Google App Engine guarantees the same service level
Both provide a refund if the requirement is not met
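
The 262.8-minute figure follows directly from the 99.95% target. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is illustrative, not part of any provider's API):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def max_downtime_minutes(availability_percent: float) -> float:
    """Largest yearly downtime, in minutes, that still meets the target."""
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    # Compare a few common availability targets.
    for target in (99.0, 99.9, 99.95, 99.99):
        minutes = max_downtime_minutes(target)
        print(f"{target}% -> {minutes:.1f} minutes/year")
```

Each additional "nine" cuts the allowed downtime by roughly a factor of ten, which is why small changes in the guaranteed percentage matter so much in an agreement.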


Service Availability (2)

Possible methods of increasing availability
  Providing virtual machine instance snapshots
    Can back up a VM’s state
  Providing virtual machine live migration
    Can move a virtual machine to another physical machine on the fly
  Redundant data storage
    On different physical storage devices and in different places


Virtual Machine Management

Most IaaS solutions are not tied to a single hypervisor
  Different hypervisors can be used within a cloud
  Managing instances then becomes an issue
  A cloud is composed of many hosts, which adds even more complexity
A common layer that manages all hypervisors from a single control point
  Reduces the complexity greatly


libvirt - Introductions (1)

Initial release: Dec 19, 2005
Most recent stable release: Feb 13, 2012
Open source; includes an API, a daemon, and management tools
Aims at “being a building block for higher level management tools”


libvirt - Introductions (2)

Supported by Red Hat
Written in C
Bindings for C#, Python, Perl, OCaml, Ruby, Java, PHP
Supported hypervisors: KVM, Xen, VMware, Microsoft Hyper-V, etc.


libvirt - Introductions (3)


libvirt - Features (1)

VM management
  Includes provisioning, creating, modifying, monitoring, controlling, migrating, and stopping instances
Instance resource management
  Network interface and firewall setup
  Storage management
Monitoring the state of all instances
Monitoring resource consumption on the local physical host


libvirt - Features (2)

Remote management
  Uses TLS encryption and x509 certificates
  Authenticates with Kerberos and SASL
  Provides secure remote control
Portable client API for multiple operating systems
  Including Linux, Solaris, and Windows


libvirt – Operation Modes

libvirt has two operation modes.
  Local
    Uses the libvirt API directly
  Remote
    Runs an additional daemon, libvirtd
    Lets users access hypervisors on a remote machine through authenticated connections
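
In practice the two modes are selected by the connection URI handed to libvirt, which has the general shape driver[+transport]://[host]/path. The sketch below only builds such strings; the helper name and the host name are hypothetical, and the "/system" path is an assumption that fits drivers such as qemu.

```python
from typing import Optional


def libvirt_uri(driver: str, host: Optional[str] = None,
                transport: Optional[str] = None,
                path: str = "/system") -> str:
    """Build a connection URI of the form driver[+transport]://[host]/path."""
    if host is None:
        if transport is not None:
            raise ValueError("a transport only makes sense with a remote host")
        # Empty host part: the local hypervisor, used without a remote daemon.
        return f"{driver}://{path}"
    # Remote mode: the request is served by libvirtd on the named host.
    scheme = f"{driver}+{transport}" if transport else driver
    return f"{scheme}://{host}{path}"


if __name__ == "__main__":
    print(libvirt_uri("qemu"))                                  # local
    print(libvirt_uri("qemu", "node1.example.org", "tls"))      # remote, TLS
    print(libvirt_uri("qemu", "root@node1.example.org", "ssh")) # remote, SSH
```

The same client code can then manage a local or a remote hypervisor; only the URI changes.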


libvirt – Tools Based on It

virsh
  An interactive CLI included with libvirt
Virtual Machine Manager
  A GUI developed by Red Hat
oVirt
  A web application for virtual machine management
  Developed by Red Hat as well
More than 20 other projects are based on or use libvirt


libvirt – Supported by Xen (1)


libvirt – Supported by Xen (2)

Programs using libvirt execute in Dom0.
libvirt can be initialized in two ways, each with its own methods of connecting to the Xen infrastructure.
  With root access, use virConnectOpen().
    Connects to the Xen daemon through an HTTP RPC layer.
    Opens a read/write connection to the XenStore.
    Uses Xen hypervisor calls.
  Without root access, use virConnectOpenReadOnly().
    Forks a libvirt_proxy program (running as root) to provide read-only access to the API.
    Useful for reporting and monitoring.
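
The choice between the two entry points can be sketched with the Python binding, where libvirt.open() and libvirt.openReadOnly() wrap virConnectOpen() and virConnectOpenReadOnly(). This is a minimal sketch, not code from the slides: the helper name, the injectable module parameter (there so the logic can be exercised without a hypervisor), and the example URI are assumptions, and os.geteuid() limits it to POSIX systems.

```python
import os


def open_hypervisor(uri, libvirt_module=None, is_root=None):
    """Open a read/write connection as root, a read-only one otherwise."""
    if libvirt_module is None:
        # Imported lazily; provided by the libvirt-python package.
        import libvirt as libvirt_module
    if is_root is None:
        is_root = (os.geteuid() == 0)
    if is_root:
        return libvirt_module.open(uri)          # wraps virConnectOpen()
    return libvirt_module.openReadOnly(uri)      # wraps virConnectOpenReadOnly()


if __name__ == "__main__":
    # Assumed URI for a local Xen host; adjust to the local installation.
    conn = open_hypervisor("xen:///system")
    print(conn.getHostname())
    conn.close()
```

A monitoring or reporting tool would normally take the read-only path, since it never needs to change domain state.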


Snapshot and Checkpoint (1)

Not only the disk image
  A file-based representation of the state, data, and hardware configuration of the whole VM
Can “freeze” the virtual machine in a particular state and resume execution later
Useful for system forensics, or for restoring the whole system after a failed upgrade or patch


Snapshot and Checkpoint (2)

Difference between “snapshot” and “checkpoint”
  The definitions differ between hypervisors
    Xen
      Only “checkpoint”
    Microsoft Hyper-V
      “snapshot” for long-term backup
      “checkpoint” for short-term recovery
    VMware
      Only “snapshot”


Snapshot - Creation

With CLI commands, snapshot creation can be scheduled and executed automatically
  The command differs between hypervisors
    Xen
      xl save [OPTIONS]
    VMware Workstation
      vmrun snapshot [OPTIONS]
With the help of libvirt
  All can be done with “virsh snapshot-create [OPTIONS]”
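
To illustrate how a scheduled job might pick the right command, the sketch below builds the argument list for each tool without running anything. The helper name is hypothetical, and the argument layouts (domain plus checkpoint file for xl save, .vmx path plus snapshot name for vmrun snapshot, domain only for virsh snapshot-create) reflect common usage and should be checked against the locally installed versions.

```python
from typing import List


def snapshot_command(tool: str, vm: str, target: str = "") -> List[str]:
    """Return the argv list for a snapshot/checkpoint with the given tool.

    vm     -- domain name, or the path to the .vmx file for vmrun
    target -- checkpoint file (xl) or snapshot name (vmrun); unused by virsh
    """
    if tool == "virsh":
        return ["virsh", "snapshot-create", vm]
    if tool in ("xl", "vmrun"):
        if not target:
            raise ValueError(f"{tool} needs a target argument")
        verb = "save" if tool == "xl" else "snapshot"
        return [tool, verb, vm, target]
    raise ValueError(f"unknown tool: {tool}")


if __name__ == "__main__":
    # Hypothetical domain name and file path, printed rather than executed.
    print(snapshot_command("virsh", "web01"))
    print(snapshot_command("xl", "web01", "/var/backups/web01.chk"))
```

A cron job or similar scheduler could pass the returned list to subprocess.run(cmd, check=True); with libvirt in place, only the virsh branch is needed regardless of the hypervisor.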


Live Migration (1)

Snapshots provide backups for disaster recovery
When a host needs maintenance, its virtual machines have to be moved to another host on the fly to minimize downtime
Live migration can be seamless to end users
Two ways of migrating
  Pre-copy memory
  Post-copy memory


Live Migration (2)

Pre-copy memory migration
  Warm-up
    Copy the current memory pages to the destination
    If pages change, re-copy them until the rate of change falls below a given rate
  Stop-and-copy
    Stop the source VM and copy the remaining dirty pages to the target VM
    Downtime happens here; it ranges from milliseconds to seconds, depending on memory size
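
The two phases can be illustrated with a small simulation. This is a simplified model rather than any hypervisor's implementation: it assumes a constant link speed and a constant dirtying rate (both in pages per second), and it ends the warm-up when the number of dirty pages drops to a threshold, standing in for the "given rate" above. All names are illustrative.

```python
from typing import List, Tuple


def simulate_precopy(total_pages: int, link_pages_per_sec: int,
                     dirty_pages_per_sec: int, stop_threshold: int,
                     max_rounds: int = 30) -> Tuple[List[int], int, float]:
    """Return (pages sent per warm-up round, pages left, downtime in seconds)."""
    to_send = total_pages            # the first round copies all memory
    rounds: List[int] = []
    while to_send > stop_threshold and len(rounds) < max_rounds:
        rounds.append(to_send)
        # A round lasts to_send / link_pages_per_sec seconds; pages dirtied
        # during that time must be sent again in the next round.
        dirtied = dirty_pages_per_sec * to_send // link_pages_per_sec
        to_send = min(total_pages, dirtied)
    # Stop-and-copy: the VM is paused while the remaining pages are sent.
    downtime_sec = to_send / link_pages_per_sec
    return rounds, to_send, downtime_sec


if __name__ == "__main__":
    rounds, left, downtime = simulate_precopy(100_000, 10_000, 1_000, 50)
    print(rounds, left, downtime)
```

When pages are dirtied as fast as they can be sent, the rounds never shrink and only the round limit ends the warm-up, which is why downtime depends on the workload.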


Live Migration (3)

Post-copy memory migration
  Suspend the source VM first, then copy the minimal execution state of the VM to the destination
    Including CPU state, registers, and non-pageable memory
  After the state is copied, the VM starts running at the destination
  What about the memory?
    Each access to a page that has not been transferred yet generates a page fault
    The hypervisor handles the page fault by copying the page from the source over the network
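
The demand-fetch behaviour can be sketched as a toy model in which a dictionary stands in for the source host's memory and a lookup stands in for the network transfer. The class and attribute names are illustrative, not taken from any hypervisor.

```python
class PostCopyDestination:
    """Toy model of a destination VM whose memory arrives on demand."""

    def __init__(self, source_memory):
        self._source = source_memory   # stands in for the link to the source
        self._resident = {}            # pages already present locally
        self.page_faults = 0

    def read(self, page_number):
        if page_number not in self._resident:
            # Page fault: fetch the missing page from the source host.
            self.page_faults += 1
            self._resident[page_number] = self._source[page_number]
        return self._resident[page_number]


if __name__ == "__main__":
    dest = PostCopyDestination({0: "code", 1: "heap", 2: "stack"})
    for page in (0, 1, 0, 2, 1):
        dest.read(page)
    print(dest.page_faults)
```

Each page is fetched at most once, so the cost is paid on first access after the switch-over rather than during a warm-up phase.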


Live Migration (4)

Pre-copy
  Needs a warm-up stage to copy most memory pages
  Longer downtime, depending on the VM’s workload
    From 60 ms to 210 ms*
Post-copy
  Even less downtime than pre-copy
  Performance impact after migration
    The demand-paging mechanism reduces the performance impact


VM Security

Virtual machine monitor (VMM) security has recently become the most important issue for cloud computing.
  All virtual machines are controlled by the VMM.
  The VMM is the bridge between the virtual machines and the hardware.
    Hard disk
    Memory
    CPU, etc.
Theoretically, a virtual machine is a completely isolated guest operating system installation.


VM Security – Virtual Machine Escape

What is virtual machine escape?
  The process of breaking out of a virtual machine and interacting with the host OS.
The first discovery of a virtual machine escape
  2008, in VMware
  By Core Security Technologies
  CVE-2008-0923
    Allows guest OS users to read and write arbitrary files on the host OS.


VM Security – VMWare (1)

Number of recorded security vulnerabilities
  154 as of 2012/04/02
The oldest record
  CVE-1999-0733
    Flaw
      Buffer overflow in VMware 1.0.1 for Linux.
    Method
      Uses a long HOME environment variable.


VM Security – VMWare (2)

The newest record
  CVE-2012-1515
    Flaw
      VMware ESX/ESXi 3.5, 4.0, and 4.1 do not implement port-based I/O operations properly.
    Effect
      Allows guest OS users to gain guest OS privileges.
    Method
      Overwrites memory locations in a read-only memory block associated with the Virtual DOS Machine.


VM Security – Xen (1)

Number of recorded security vulnerabilities
  9 as of 2012/04/02
The newest record
  CVE-2009-3525
    Flaw
      tools/libxc/xc_dom_bzimageloader.c in Xen 3.2, 3.3, 4.0, and 4.1
    Effect
      Allows local users to cause a DoS
    Method
      Unspecified vectors related to “Lack of error checking in the decompression loop”


VM Security – Xen (2)

The oldest record
  CVE-2008-4405
    Flaw
      xend in Xen 3.0.3 does not properly
        limit the contents of the /local/domain/xenstore directory tree
        restrict a guest VM’s write access within the directory tree
    Effect
      Allows guest OS users to cause a DoS


VM Security – Hyper-V

Number of recorded security vulnerabilities
  3 as of 2012/04/02
  All allow users to cause a DoS
CVE-2010-0026
  Host OS hang
  Via a crafted application that executes a malformed series of machine instructions.
CVE-2010-3960
  Host OS hang
  By sending a crafted encapsulated packet over the VMBus.
CVE-2011-1872
  Host OS infinite loop
  Via malformed machine instructions in a VMBus packet.


VM Security - OpenStack

Number of recorded security vulnerabilities
  2 as of 2012/04/02
CVE-2011-4596
  When the EC2 API and the S3/RegisterImage image-registration method are enabled, allows remote authenticated users to overwrite arbitrary files.
CVE-2012-0030
  When the OpenStack API is used, allows remote authenticated users to bypass access restrictions for tenants of other users.


VM Security – About Vulnerability

Top 3 vulnerability types
  Execute Code
  Denial of Service
  SQL Injection
Information resources
  http://www.cvedetails.com
    Supplies the records above
  http://www.cve.mitre.org
  http://nvd.nist.gov


VM Management – Rootkit

What is a rootkit?
  A tool for gaining root access or for erasing the traces of an intrusion.
  A kind of malicious software.
  Intended to hide the existence of certain processes.
  It was not always regarded as malicious.
    The Sony BMG copy protection rootkit scandal.
  A Trojan can be seen as a rootkit.


Rootkit - Examples

As we have seen, a rootkit is software intended to gain control of the computer.
Here are two VMBRs (virtual-machine based rootkits).
  SubVirt
  Blue Pill


Rootkit– SubVirt (1)

Proposed by a team from Microsoft Research and the University of Michigan in 2006.
The infection procedure
  We assume that SubVirt has administrator privileges.
  After a reboot, SubVirt is executed first.
  SubVirt starts a VMM and runs the original operating system as a virtual machine on that VMM.
  SubVirt can then collect the information it wants.


Rootkit– SubVirt (2)


Rootkit– Blue Pill (1)

Designed by Joanna Rutkowska.
First demonstrated at the Black Hat Briefings on August 3, 2006.
Originally it required AMD-V support, but it was ported to Intel VT-x as well.
It starts a thin hypervisor and virtualizes the rest of the machine under it.
The machine does not need to be restarted.


Rootkit– Blue Pill (2)


Summary

The cloud is not only what users see.
Snapshots and checkpoints help retain service availability.
There are two ways of doing live migration:
  Pre-copy memory
  Post-copy memory
Services are based on virtual machines, and virtual machines are managed by hypervisors, so hypervisor security is important.
