Global Overlay Network : PlanetLab

Page 1: Global Overlay Network : PlanetLab

Global Overlay Network : PlanetLab

Claudio E. Righetti

6 October, 2005

(some slides taken from Larry Peterson)

Page 2: Global Overlay Network : PlanetLab

• “PlanetLab: An Overlay Testbed for Broad-Coverage Services”, Bavier, Bowman, Chun, Culler, Peterson, Roscoe, Wawrzoniak. ACM SIGCOMM Computer Communication Review, Volume 33, Number 3, July 2003

• “Overcoming the Internet Impasse through Virtualization”, Anderson, Peterson, Shenker, Turner. IEEE Computer, April 2005

• “Towards a Comprehensive PlanetLab Architecture”, Larry Peterson, Andy Bavier, Marc Fiuczynski, Steve Muir, and Timothy Roscoe, June 2005

http://www.planet-lab.org/PDN/PDN-05-030

Page 3: Global Overlay Network : PlanetLab

Overview

1. What is PlanetLab?

2. Architecture
   1. Local: Nodes
   2. Global: Network

3. Details
   1. Virtual Machines
   2. Maintenance

Page 4: Global Overlay Network : PlanetLab

What Is PlanetLab?

• Geographically distributed overlay network

• Testbed for broad-coverage network services

Page 5: Global Overlay Network : PlanetLab

PlanetLab Goal

“…to support seamless migration of an application from an early prototype,

through multiple design iterations,

to a popular service that continues to evolve.”

Page 6: Global Overlay Network : PlanetLab

Priorities

• Diversity of Network
  – Geographic
  – Links: edge sites, co-location and routing centers, homes (DSL, cable-modem)

• Flexibility
  – Allow experimenters maximal control over PlanetLab nodes
  – Securely and fairly

Page 7: Global Overlay Network : PlanetLab

Architecture Overview

• Slice: a horizontal cut of global PlanetLab resources
• Service: a set of distributed and cooperating programs delivering some higher-level functionality
• Each service runs in a slice of PlanetLab

Page 8–11: Global Overlay Network : PlanetLab

Services Run in Slices

(diagram, built up over four slides: PlanetLab nodes each host multiple Virtual Machines; Service / Slice A, Service / Slice B, and Service / Slice C each run in their own set of VMs spread across the nodes)

Page 12: Global Overlay Network : PlanetLab

“… to view slice as a network of Virtual Machines, with a set of local resources bound to each VM.”

Page 13: Global Overlay Network : PlanetLab

Virtual Machine Monitor (VMM)

• Multiple VMs run on each PlanetLab node
• The VMM arbitrates the node's resources among them

Page 14: Global Overlay Network : PlanetLab

PlanetLab Architecture

• Node-level
  – Several virtual machines on each node, each running a different service
  – Resources distributed fairly
  – Services are isolated from each other

• Network-level
  – Node managers, agents, brokers, and service managers provide the interface and maintain PlanetLab

Page 15: Global Overlay Network : PlanetLab

Per-Node View

(diagram: the Virtual Machine Monitor (VMM) hosts the Node Manager, a Local Admin VM, and service VMs VM1, VM2, …, VMn)

Page 16: Global Overlay Network : PlanetLab

Node Architecture Goals

• Provide a virtual machine for each service running on a node
• Isolate virtual machines
• Allow maximal control over virtual machines
• Fair allocation of resources
  – Network, CPU, memory, disk

Page 17: Global Overlay Network : PlanetLab

PlanetLab’s design philosophy

• Application Programming Interface: used by typical services

• Protection Interface: implemented by the VMM

PlanetLab's node virtualization mechanisms are characterized by where these two interfaces are drawn.

Page 18: Global Overlay Network : PlanetLab

One Extreme: Software Runtimes (e.g., Java Virtual Machine, MS CLR)

• Very high-level API
• Depend on the OS to provide protection and resource allocation
• Not flexible

Page 19: Global Overlay Network : PlanetLab

Other Extreme: Complete Virtual Machine (e.g., VMware)

• Very low-level API (hardware)
  – Maximum flexibility
• Excellent protection
• High CPU/memory overhead
  – Cannot share common resources among virtual machines
    • OS, common filesystem
• A high-end commercial server runs only 10s of VMs

Page 20: Global Overlay Network : PlanetLab

Mainstream Operating System

• API and protection at the same level (system calls)
• Simple implementation (e.g., slice = process group)
• Efficient use of resources (shared memory, common OS)
• Bad protection and isolation
• Maximum control and security?

Page 21: Global Overlay Network : PlanetLab

PlanetLab Virtualization: VServers

• Kernel patch to a mainstream OS (Linux)
• Gives the appearance of a separate kernel for each virtual machine
  – Root privileges restricted to activities that do not affect other vservers
• Some modifications: resource control (e.g., file handles, port numbers) and protection facilities added

Page 22: Global Overlay Network : PlanetLab

PlanetLab Network Architecture

• Node manager (one per node)
  – Creates slices for service managers
    • When service managers provide valid tickets
  – Allocates resources for vservers

• Resource Monitor (one per node)
  – Tracks the node's available resources
  – Tells agents about available resources

Page 23: Global Overlay Network : PlanetLab

PlanetLab Network Architecture

• Agent (centralized)
  – Tracks nodes' free resources
  – Advertises resources to resource brokers
  – Issues tickets to resource brokers
    • Tickets may be redeemed with node managers to obtain the resources

Page 24: Global Overlay Network : PlanetLab

PlanetLab Network Architecture

• Resource Broker (per service)
  – Obtains tickets from agents on behalf of service managers

• Service Manager (per service)
  – Obtains tickets from brokers
  – Redeems tickets with node managers to acquire resources
  – If the resources can be acquired, starts the service

(this ticket flow is sketched below)
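
Taken together, slides 22–24 describe a ticket-redemption protocol: resource monitors advertise, the agent issues tickets, brokers pass them to service managers, and node managers redeem them. Below is a minimal Python sketch of that flow under stated assumptions; the class and method names (Agent.issue_ticket, NodeManager.redeem, the numeric "share") are illustrative inventions, not the actual PlanetLab interfaces.

    # Minimal sketch of the ticket flow on slides 22-24; all names are hypothetical.
    class Ticket:
        def __init__(self, node, share):
            self.node, self.share = node, share

    class Agent:
        """Centralized: tracks free resources and issues tickets."""
        def __init__(self):
            self.free = {}                      # node -> advertised free share

        def advertise(self, node, share):       # called by per-node Resource Monitors
            self.free[node] = share

        def issue_ticket(self, node, share):    # handed to a resource broker
            if self.free.get(node, 0) >= share:
                self.free[node] -= share
                return Ticket(node, share)
            return None

    class NodeManager:
        """One per node: redeems valid tickets by creating a vserver (sliver)."""
        def __init__(self, node):
            self.node = node

        def redeem(self, ticket):
            if ticket is None or ticket.node != self.node:
                raise ValueError("invalid ticket")
            return f"vserver on {self.node}, share {ticket.share}"

    # Service manager: obtain a ticket (via a broker), redeem it, start the service.
    agent = Agent()
    agent.advertise("planetlab1.example.org", 64)
    ticket = agent.issue_ticket("planetlab1.example.org", 32)
    print(NodeManager("planetlab1.example.org").redeem(ticket))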

Page 25–37: Global Overlay Network : PlanetLab

Obtaining a Slice

(animated diagram built up across these slides: Resource Monitors on each node report available resources to the Agent; the Agent issues tickets to the Broker; the Service Manager obtains tickets from the Broker and redeems them with the Node Managers on the selected nodes)

Page 38: Global Overlay Network : PlanetLab

PlanetLab Today

www.planet-lab.org

Page 39: Global Overlay Network : PlanetLab

PlanetLab Today

• Global distributed systems infrastructure
  – platform for long-running services
  – testbed for network experiments

• 583 nodes around the world
  – 30 countries
  – 250+ institutions (universities, research labs, gov't)

• Standard PC servers
  – 150–200 users per server
  – 30–40 active per hour, 5–10 at any given time
  – memory, CPU both heavily over-utilised

Page 40: Global Overlay Network : PlanetLab

Node Software

• Linux Fedora Core 2
  – kernel being upgraded to FC4
  – always up-to-date with security-related patches

• VServer patches provide security
  – each user gets own VM (‘slice’)
  – limited root capabilities

• CKRM/VServer patches provide resource mgmt
  – proportional share CPU scheduling
  – hierarchical token bucket controls network Tx bandwidth (sketched below)
  – physical memory limits
  – disk quotas
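
One of the mechanisms named above, the hierarchical token bucket, is what rate-limits each slice's transmit bandwidth. The sketch below shows a single (non-hierarchical) token bucket in Python, purely to illustrate the mechanism; the rates and the allow() interface are assumptions, not PlanetLab's implementation.

    import time

    class TokenBucket:
        """Allow bursts up to `burst` bytes while enforcing a sustained `rate` (bytes/s)."""
        def __init__(self, rate, burst):
            self.rate, self.burst = rate, burst
            self.tokens = burst
            self.last = time.monotonic()

        def allow(self, nbytes):
            now = time.monotonic()
            # Refill tokens for the elapsed time, capped at the burst size.
            self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= nbytes:
                self.tokens -= nbytes
                return True          # packet may be sent now
            return False             # packet must wait or be dropped

    # Example: 1.5 Mbps sustained with a 100 Mbps burst allowance (bytes/s and bytes).
    bucket = TokenBucket(rate=1.5e6 / 8, burst=100e6 / 8)
    print(bucket.allow(1500))        # a full-size Ethernet frame fits in the initial burst

In the hierarchical arrangement, per-slice buckets would sit under a node-wide parent bucket, which would also enforce the node-wide outgoing bandwidth cap described later (slide 59); that nesting is omitted from this sketch.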

Page 41: Global Overlay Network : PlanetLab

Issues

• Multiple VM Types– Linux vservers, Xen domains

• Federation– EU, Japan, China

• Resource Allocation– Policy, markets

• Infrastructure Services– Delegation

Need to define the PlanetLab Architecture

Page 42: Global Overlay Network : PlanetLab

Key Architectural Ideas

• Distributed virtualization– slice = set of virtual machines

• Unbundled management– infrastructure services run in their own slice

• Chain of responsibility– account for behavior of third-party software– manage trust relationships

Page 43: Global Overlay Network : PlanetLab

Trust Relationships: N x N

Sites: Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …

Slices: princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …

Rather than N x N pairwise trust relationships between sites and slices, PLC acts as the trusted intermediary.

Page 44: Global Overlay Network : PlanetLab

Principals

• Node Owners
  – host one or more nodes (retain ultimate control)
  – select an MA and approve one or more SAs

• Service Providers (Developers)
  – implement and deploy network services
  – responsible for the service's behavior

• Management Authority (MA)
  – installs and maintains software on nodes
  – creates VMs and monitors their behavior

• Slice Authority (SA)
  – registers service providers
  – creates slices and binds them to the responsible provider

Page 45: Global Overlay Network : PlanetLab

Trust Relationships (diagram: Owner, Provider, MA, SA)

(1) Owner trusts MA to map network activity to the responsible slice
(2) Owner trusts SA to map a slice to its responsible providers
(3) Provider trusts SA to create VMs on its behalf
(4) Provider trusts MA to provide working VMs and not falsely accuse it
(5) SA trusts provider to deploy responsible services
(6) MA trusts owner to keep nodes physically secure

Page 46: Global Overlay Network : PlanetLab

Architectural Elements

(diagram: the MA keeps a node database and runs the NM + VMM on each node, with an Owner VM for the Node Owner; the SA keeps a slice database and, through the SCS, creates VMs on behalf of the Service Provider)

Page 47: Global Overlay Network : PlanetLab

Narrow Waist

• Name space for slices: < slice_authority, slice_name >

• Node Manager Interface (a dictionary form of this rspec is sketched below):

rspec = < vm_type = linux_vserver,
          cpu_share = 32,
          mem_limit = 128MB,
          disk_quota = 5GB,
          base_rate = 1Kbps,
          burst_rate = 100Mbps,
          sustained_rate = 1.5Mbps >
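
Since an rspec is just a list of named resource values, it can be pictured as a plain dictionary. A hedged Python sketch follows; the field names and values come from the slide, while the byte/bit conversions and the example slice name are illustrative assumptions.

    # A slice is named by (slice_authority, slice_name); each node grants it
    # resources according to an rspec like the one on this slide.
    slice_id = ("plc", "princeton_codeen")    # illustrative name only

    rspec = {
        "vm_type":        "linux_vserver",
        "cpu_share":      32,                 # proportional-share CPU weight
        "mem_limit":      128 * 2**20,        # 128 MB, in bytes
        "disk_quota":     5 * 2**30,          # 5 GB, in bytes
        "base_rate":      1_000,              # 1 Kbps
        "burst_rate":     100_000_000,        # 100 Mbps
        "sustained_rate": 1_500_000,          # 1.5 Mbps
    }

    # A node manager would check such an rspec against the node's free
    # resources before creating the corresponding vserver for the slice.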

Page 48: Global Overlay Network : PlanetLab

Node Boot/Install Process (Node ↔ PLC Boot Server; the node-side loop is sketched below)

1. Node boots from BootCD (Linux loaded)
2. Hardware initialized
3. Network config read from floppy
4. Node contacts PLC (MA)
5. PLC sends the boot manager
6. Node executes the boot manager
7. Node key read into memory from floppy
8. Boot manager invokes the Boot API
9. PLC verifies the node key and sends the current node state
10. If state = “install”, the installer runs
11. Node updates its state via the Boot API
12. PLC verifies the node key and changes the state to “boot”
13. Node chain-boots (no restart)
14. Node booted
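
Viewed from the node side, steps 7–14 form a small verify-then-install-or-boot loop. The Python sketch below mirrors that loop under stated assumptions; the FakePLC stand-in and every method name are invented for illustration and are not the real Boot API.

    # Hypothetical sketch of the node-side boot manager (steps 7-14 above).
    class FakePLC:
        """Stands in for the PLC boot server side of the Boot API."""
        def __init__(self):
            self.state = "install"

        def get_node_state(self, node_key):              # step 9: verify key, return state
            assert node_key == "node-key", "key verification failed"
            return self.state

        def set_node_state(self, node_key, new_state):   # steps 11-12
            assert node_key == "node-key", "key verification failed"
            self.state = new_state

    def boot_manager(plc, node_key):
        state = plc.get_node_state(node_key)             # step 8: invoke the Boot API
        if state == "install":                           # step 10: run the installer
            print("writing fresh filesystem ...")
            plc.set_node_state(node_key, "boot")         # steps 11-12: state -> "boot"
        print("chain-booting installed system (no restart)")   # steps 13-14

    boot_manager(FakePLC(), "node-key")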

Page 49: Global Overlay Network : PlanetLab

PlanetFlow

• Logs every outbound IP flow on every node
  – Accesses ulogd via Proper
  – Retrieves packet headers, timestamps, context ids (batched)
  – Used to audit traffic (a toy audit join is sketched below)

• Aggregated and archived at PLC
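
The value of these logs is the mapping from each outbound flow to the slice (context id) that generated it. The toy Python sketch below shows that audit join; the record layout and the context-to-slice table are entirely invented for illustration, since the slide only says the data comes from ulogd via Proper in batched form.

    # Toy audit join: attribute logged flows to slices via their context ids.
    flows = [
        # (timestamp, context_id, src, dst, dst_port, bytes) -- invented layout
        (1128000000, 517, "128.112.0.1", "192.0.2.9", 80, 12345),
    ]
    context_to_slice = {517: "princeton_codeen"}          # illustrative mapping

    for ts, ctx, src, dst, port, nbytes in flows:
        slice_name = context_to_slice.get(ctx, "unknown")
        print(f"{ts} {src} -> {dst}:{port} ({nbytes} B) attributed to {slice_name}")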

Page 50: Global Overlay Network : PlanetLab

Chain of Responsibility

1. Join Request: PI submits consortium paperwork and requests to join
2. PI Activated: PLC verifies the PI, activates the account, enables the site (logged)
3. User Activated: users create accounts with keys, PI activates the accounts (logged)
4. Slice Created: PI creates a slice and assigns users to it (logged)
5. Nodes Added to Slices: users add nodes to their slice (logged)
6. Slice Traffic Logged: experiments run on nodes and generate traffic (logged by Netflow)
7. Traffic Logs Centrally Stored: PLC periodically pulls traffic logs from nodes

Network activity can therefore be traced back from a slice to the responsible users and PI.

Page 51: Global Overlay Network : PlanetLab

Slice Creation

(diagram: PLC (SA) on one side; the NM + VMM with its VMs on each node on the other; the call sequence is sketched below)

• PI: SliceCreate( ), SliceUsersAdd( )
• User: SliceNodesAdd( ), SliceAttributeSet( ), SliceInstantiate( )
• Each Node Manager: SliceGetAll( ) fetches slices.xml from PLC and the corresponding VMs are created locally
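
The PI and user operations above would typically be driven against PLC programmatically, e.g. over XML-RPC. The Python sketch below strings them together in the order shown; the endpoint URL, argument lists, and values are assumptions for illustration, and only the operation names come from the slide.

    import xmlrpc.client

    # Hypothetical endpoint and arguments; only the call names appear on the slide.
    plc = xmlrpc.client.ServerProxy("https://www.planet-lab.org/PLCAPI/")

    plc.SliceCreate("princeton_test")                        # PI creates the slice
    plc.SliceUsersAdd("princeton_test", ["alice"])           # PI assigns users to it
    plc.SliceNodesAdd("princeton_test", ["planetlab1.example.org"])  # user picks nodes
    plc.SliceAttributeSet("princeton_test", "cpu_share", "32")
    plc.SliceInstantiate("princeton_test")

    # Each Node Manager later calls SliceGetAll( ), receives slices.xml from PLC,
    # and creates the corresponding VM (sliver) on its node.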

Page 52: Global Overlay Network : PlanetLab

Slice Creation

(same diagram as the previous slide, but driven by a ticket)

• PI: SliceCreate( ), SliceUsersAdd( )
• User: SliceAttributeSet( ), SliceGetTicket( )
• (User distributes the ticket to a slice creation service)
• Slice creation service → NM: SliverCreate(ticket)

Page 53: Global Overlay Network : PlanetLab

Brokerage Service

(same per-node diagram)

• PI: SliceCreate( ), SliceUsersAdd( )
• Broker: SliceAttributeSet( ), SliceGetTicket( )
• (PLC distributes the ticket to the brokerage service)
• Broker → NM: rcap = PoolCreate(ticket)

Page 54: Global Overlay Network : PlanetLab

Brokerage Service (cont)

(same per-node diagram; the broker bookkeeping is sketched below)

• User → Broker: BuyResources( )
• (Broker contacts the relevant nodes)
• Broker → NM: PoolSplit(rcap, slice, rspec), which creates the slice's VM on each node
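
Putting slides 53 and 54 together: PLC gives the broker a ticket, the broker turns it into a per-node resource pool with PoolCreate, and when a user calls BuyResources the broker carves slivers out of those pools with PoolSplit. A small Python sketch of that bookkeeping follows; only the operation names PoolCreate, PoolSplit, and BuyResources come from the slides, everything else (classes, fields, the cpu_share accounting) is an illustrative assumption.

    # Illustrative broker-side bookkeeping for the flow on slides 53-54.
    class NodeManagerStub:
        """Stands in for one node's NM, holding resource pools created by brokers."""
        def __init__(self, name):
            self.name = name
            self.pools = {}

        def PoolCreate(self, ticket):                     # redeem the ticket into a pool
            rcap = f"rcap-{self.name}-{len(self.pools)}"
            self.pools[rcap] = ticket["cpu_share"]
            return rcap

        def PoolSplit(self, rcap, slice_name, rspec):     # carve a sliver from the pool
            assert self.pools[rcap] >= rspec["cpu_share"], "pool exhausted"
            self.pools[rcap] -= rspec["cpu_share"]
            return f"sliver for {slice_name} on {self.name}"

    class Broker:
        def __init__(self, nodes, ticket):
            # rcap = PoolCreate(ticket) on every node covered by the ticket
            self.rcaps = {node: node.PoolCreate(ticket) for node in nodes}

        def BuyResources(self, slice_name, rspec):
            # Broker contacts the relevant nodes and splits its pools.
            return [node.PoolSplit(rcap, slice_name, rspec)
                    for node, rcap in self.rcaps.items()]

    broker = Broker([NodeManagerStub("planetlab1")], ticket={"cpu_share": 32})
    print(broker.BuyResources("princeton_test", {"cpu_share": 16}))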

Page 55: Global Overlay Network : PlanetLab

VIRTUAL MACHINES

Page 56: Global Overlay Network : PlanetLab

PlanetLab Virtual Machines: VServers

• Extends the idea of chroot(2)
  – A new vserver is created by a system call
  – Descendant processes inherit the vserver
  – Unique filesystem, SYSV IPC, UID/GID space
  – Limited root privilege
    • Can't control the host node
  – Irreversible

Page 57: Global Overlay Network : PlanetLab

Scalability

• Reduce disk footprint using copy-on-write
  – Immutable flag provides file-level CoW
  – Vservers share a 508 MB basic filesystem
  – Each additional vserver takes 29 MB

• Increase limits on kernel resources (e.g., file descriptors)
  – Is the kernel designed to handle this? (inefficient data structures?)

Page 58: Global Overlay Network : PlanetLab

Protected Raw Sockets

• Services may need low-level network access
  – Cannot allow them access to other services' packets

• Provide “protected” raw sockets (the per-packet checks are sketched below)
  – TCP/UDP sockets bound to a local port
  – Incoming packets delivered only to the service with the corresponding port registered
  – Outgoing packets scanned to prevent spoofing

• ICMP also supported
  – 16-bit identifier placed in the ICMP header
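
The demultiplexing and anti-spoofing rules above amount to two per-packet checks. The toy Python sketch below spells them out; the packet representation and the port registry are assumptions for illustration, and the real checks live inside the kernel's protected raw socket code.

    # Toy per-packet checks for "protected" raw sockets; data layouts are invented.
    port_registry = {80: "slice_a", 53: "slice_b"}        # local port -> owning slice
    slice_addr = {"slice_a": "128.112.0.1", "slice_b": "128.112.0.1"}

    def deliver_incoming(pkt):
        """Incoming packets are delivered only to the slice that registered the port."""
        return port_registry.get(pkt["dst_port"])         # None means: not delivered

    def allow_outgoing(slice_name, pkt):
        """Outgoing packets are scanned so a slice cannot spoof source address/port."""
        return (pkt["src_addr"] == slice_addr[slice_name]
                and port_registry.get(pkt["src_port"]) == slice_name)

    print(deliver_incoming({"dst_port": 80}))                                   # slice_a
    print(allow_outgoing("slice_b", {"src_addr": "10.0.0.9", "src_port": 53}))  # False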

Page 59: Global Overlay Network : PlanetLab

Resource Limits

• Node-wide cap on outgoing network bandwidth
  – Protects the world from PlanetLab services

• Isolation between vservers: two approaches (a worked allocation example follows below)
  – Fairness: each of N vservers gets 1/N of the resources during contention
  – Guarantees: each slice reserves a certain amount of resources (e.g., 1 Mbps bandwidth, 10 Mcps CPU)
    • Left-over resources are distributed fairly
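
Combining the two approaches means: first satisfy each active slice's reservation, then split the remaining capacity evenly. A small worked sketch in Python (the allocate() function only illustrates that arithmetic, it is not the PlanetLab scheduler):

    def allocate(capacity, reservations, active):
        """Give each active slice its guarantee, then share the left-over equally."""
        alloc = {s: reservations.get(s, 0.0) for s in active}
        leftover = capacity - sum(alloc.values())
        for s in active:
            alloc[s] += leftover / len(active)
        return alloc

    # Example: 10 Mbps node cap, slice_a reserved 1 Mbps, three slices contending.
    print(allocate(10.0, {"slice_a": 1.0}, ["slice_a", "slice_b", "slice_c"]))
    # -> {'slice_a': 4.0, 'slice_b': 3.0, 'slice_c': 3.0}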

Page 60: Global Overlay Network : PlanetLab

Linux and CPU Resource Management

• The scheduler in Linux provides fairness per process, not per vserver
  – A vserver with many processes hogs the CPU (a worked example follows below)

• No current way for the scheduler to provide guaranteed slices of CPU time
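
The problem is easy to quantify: under per-process fairness, a vserver's aggregate share grows with its process count. A short worked example in Python (the numbers are illustrative):

    # Per-process fairness vs per-vserver fairness, illustrated with 10 vs 1 processes.
    procs = {"vserver_a": 10, "vserver_b": 1}
    total = sum(procs.values())

    per_process_fair = {v: n / total for v, n in procs.items()}
    per_vserver_fair = {v: 1 / len(procs) for v in procs}

    print(per_process_fair)   # vserver_a gets ~91% of the CPU, vserver_b ~9%
    print(per_vserver_fair)   # the desired outcome would be 50% / 50%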

Page 61: Global Overlay Network : PlanetLab

MANAGEMENT SERVICES

Page 62: Global Overlay Network : PlanetLab

PlanetLab Network Management

1. PlanetLab nodes boot a small Linux OS from CD and run from a RAM disk
2. The node contacts a bootserver
3. The bootserver sends a (signed) startup script, which can:
   • Boot normally, or
   • Write a new filesystem, or
   • Start sshd for remote PlanetLab admin login

• Nodes can also be remotely power-cycled

Page 63: Global Overlay Network : PlanetLab

Dynamic Slice Creation

1. Node Manager verifies tickets from service manager

2. Creates a new vserver

3. Creates an account on the node and on the vserver

Page 64: Global Overlay Network : PlanetLab

User Logs in to PlanetLab Node

• /bin/vsh immediately:
  1. Switches to the account's associated vserver
  2. Chroot()s to the associated root directory
  3. Relinquishes true root privileges
  4. Switches UID/GID to the account in the vserver

– The transition to the vserver is transparent: it appears the user just logged into the PlanetLab node directly (the confine-and-drop sequence is sketched below)
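
Conceptually, steps 2–4 are the classic confine-then-drop-privileges sequence. The Python sketch below mirrors them under stated assumptions; it is not the real /bin/vsh (a C program), and step 1, switching into the vserver's kernel context, has no portable equivalent, so it is only noted in a comment.

    import os, pwd

    def enter_slice(username, vserver_root):
        """Illustrative confine-and-drop sequence loosely mirroring /bin/vsh.
        Must be started as root. Step 1 (switching to the account's vserver
        context) is a vserver-specific kernel operation and is omitted here."""
        user = pwd.getpwnam(username)

        os.chroot(vserver_root)              # step 2: confine to the vserver's root dir
        os.chdir("/")

        os.setgid(user.pw_gid)               # steps 3-4: give up true root and become
        os.setuid(user.pw_uid)               #            the account inside the vserver

        os.execv(user.pw_shell, [user.pw_shell])   # hand control to the user's shell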