PlanetLab: Catalyzing Network Innovation
October 2, 2007
Larry Peterson, Princeton University
Timothy Roscoe, Intel Research at Berkeley
Challenges
• Security
  – known vulnerabilities lurking in the Internet
      DDoS, worms, malware
  – addressing security comes at a significant cost
      federal government spent $5.4B in 2004
      estimated $50-100B spent worldwide on security in 2004
• Reliability
  – e-Commerce increasingly depends on fragile Internet
      much less reliable than the phone network (three vs five 9’s)
      risks in using the Internet for mission-critical operations
      barrier to ubiquitous VoIP
  – an issue of ease-of-use for everyday users
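The “three vs five 9’s” comparison can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name is made up for this example, and it assumes a 365-day year.

```python
def downtime_minutes_per_year(nines: int) -> float:
    """Expected yearly downtime for an availability of `nines` nines."""
    unavailability = 10.0 ** (-nines)   # 3 nines -> 0.001, 5 nines -> 0.00001
    minutes_per_year = 365 * 24 * 60    # assumption: 365-day year
    return unavailability * minutes_per_year


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} nines: {downtime_minutes_per_year(n):.1f} minutes/year")
```

Under these assumptions three 9’s allows roughly 8.8 hours of downtime a year, while five 9’s allows a little over five minutes.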
Challenges (cont)
• Scale & Diversity
  – the whole world is becoming networked
      sensors, consumer electronic devices, embedded processors
  – assumptions about edge devices (hosts) no longer hold
      connectivity, power, capacity, mobility,…
• Performance
  – scientists have significant bandwidth requirements
      each e-science community covets its own wavelength(s)
  – purpose-built solutions are not cost-effective
      being on the “commodity path” makes an effort sustainable
Two Paths
• Incremental
  – apply point-solutions to the current architecture
• Clean-Slate
  – replace the Internet with a new network architecture
• We can’t be sure the first path will fail, but…
  – point-solutions result in increased complexity
      making the network harder to manage
      making the network more vulnerable to attacks
      making the network more hostile to new applications
  – architectural limits may lead to a dead-end
Architectural Limits
• Minimize trust assumptions
  – the Internet originally viewed network traffic as fundamentally cooperative, but should view it as adversarial
• Enable competition
  – the Internet was originally developed independent of any commercial considerations, but today the network architecture must take competition and economic incentives into account
• Allow for edge diversity
  – the Internet originally assumed host computers were connected to the edges of the network, but host-centric assumptions are not appropriate in a world with an increasing number of sensors and mobile devices
Limits (cont)
• Design for network transparency
  – the Internet originally did not expose information about its internal configuration, but there is value to both users and network administrators in making the network more transparent
• Enable new network services
  – the Internet originally provided only a best-effort packet delivery service, but there is value in making processing capability and storage capacity available in the middle of the network
• Integrate with optical transport
  – the Internet originally drew a sharp line between the network and the underlying transport facility, but allowing bandwidth aggregation and traffic engineering to be first-class abstractions has the potential to improve efficiency and performance
Barriers to Second Path
• Internet has become ossified
  – no competitive advantage to architectural change
  – no obvious deployment path
• Inadequate validation of potential solutions
  – simulation models too simplistic
  – little or no real-world experimental evaluation
• Testbed dilemma
  – production testbeds: real users but incremental change
  – research testbeds: radical change but no real users
Recommendation
It is time for the research community, federal government, and commercial sector to jointly
pursue the second path. This involves experimentally validating new network
architecture(s), and doing so in a sustainable way that fosters wide-spread deployment.
Approaches
• Revisiting definition & placement of function
  – naming, addressing, and location
  – routing, forwarding, and addressing
  – management, control, and data planes
  – end hosts, routers, and operators
• Designing with new constraints in mind
  – selfish and adversarial participants
  – mobile hosts and disconnected operation
  – large number of small, low-power devices
  – ease of network management
Deployment Story
• Old model
  – global up-take of new technology
  – does not work due to ossification
• New model
  – incremental deployment via user opt-in
  – lowering the barrier-to-entry makes deployment plausible
• Process by which we define the new architecture
  – purists: settle on a single common architecture
      virtualization is a means
  – pluralists: multiplicity of continually evolving elements
      virtualization is an end
• What architecture do we deploy?
  – research happens…
Validation Gap
[Figure: stages Analysis, Simulation / Emulation, Experiment at Scale with Real Users, Deployment; labels: (models), (code), (results), (measurements)]
PlanetLab
What is PlanetLab?
• An open, shared testbed for
  – Developing
  – Deploying
  – Accessing
  planetary-scale services.
What would you do if you had Akamai’s infrastructure?
Motivation
• New class of applications emerging that spread over sizable fraction of the web
• Architectural components starting to emerge
• The next Internet will be created as an overlay on the current one
• It will be defined by services, not transport
• There is NO vehicle to try out the next n great ideas in this area
Guidelines (1)
• Thousand viewpoints on “the cloud” is what matters
  – not the thousand servers
  – not the routers, per se
  – not the pipes
Guidelines (2)
• and you must have the vantage points of the crossroads
  – co-location centers, peering points, etc.
Guidelines (3)
• Each service needs an overlay covering many points
  – logically isolated
• Many concurrent services and applications
  – must be able to slice nodes => VM per service
  – service has a slice across large subset
• Must be able to run each service / app over long period to build meaningful workload
  – traffic capture/generator must be part of facility
• Consensus on “a node” more important than “which node”
Guidelines (4)
• Test-lab as a whole must be up a lot
  – global remote administration and management
  – redundancy within
• Each service will require own management capability
• Testlab nodes cannot “bring down” their site
  – not on forwarding path
• Relationship to firewalls and proxies is key
Guidelines (5)
• Storage has to be a part of it
  – edge nodes have significant capacity
• Needs a basic well-managed capability
Initial core team:
Intel Research:
  David Culler
  Timothy Roscoe
  Brent Chun
  Mic Bowman
Princeton:
  Larry Peterson
  Mike Wawrzoniak
University of Washington:
  Tom Anderson
  Steven Gribble
• 1000+ machines spanning 500 sites and 40 countries
• Supports distributed virtualization
  – each of 600+ network services running in their own slice
Requirements
1) It must provide a global platform that supports both short-term experiments and long-running services.
  – services must be isolated from each other
  – multiple services must run concurrently
  – must support real client workloads
Requirements
2) It must be available now, even though no one knows for sure what “it” is.
  – deploy what we have today, and evolve over time
  – make the system as familiar as possible (e.g., Linux)
  – accommodate third-party management services
Requirements
3) We must convince sites to host nodes running code written by unknown researchers from other organizations.
  – protect the Internet from PlanetLab traffic
  – must get the trust relationships right
Requirements
4) Sustaining growth depends on support for site autonomy and decentralized control.
  – sites have final say over the nodes they host
  – must minimize (eliminate) centralized control
Requirements
5) It must scale to support many users with minimal resources available.
  – expect under-provisioned state to be the norm
  – shortage of logical resources too (e.g., IP addresses)
Design Challenges
• Minimize centralized control without violating trust assumptions.
• Balance the need for isolation with the reality of scarce resources.
• Maintain a stable and usable system while continuously evolving it.
Key Architectural Ideas
• Distributed virtualization
  – slice = set of virtual machines
• Unbundled management
  – infrastructure services run in their own slice
• Chain of responsibility
  – account for behavior of third-party software
  – manage trust relationships
Implementation Research Issues
• Sliceability: distributed virtualization
• Isolation and resource control
• Security and integrity: exposed machines
• Management of a very large, widely dispersed system
• Instrumentation and measurement
• Building blocks and primitives
Slice-ability
• Each service runs in a slice of PlanetLab
  – distributed set of resources (network of virtual machines)
  – allows services to run continuously
• VM monitor on each node enforces slices
  – limits fraction of node resources consumed
  – limits portion of name spaces consumed
• Issue: global resource discovery
  – how do applications specify their requirements?
  – how do we map these requirements onto a set of nodes?
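One way to picture the resource discovery question is as a filter over per-node state. The sketch below is a toy illustration, not PlanetLab's discovery service: the `Node` fields, the `discover` function, and the "one node per site" rule are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical per-node attributes, chosen for illustration.
    hostname: str
    site: str
    free_cpu: float      # fraction of CPU currently unclaimed
    free_mem_mb: int


def discover(nodes, min_cpu, min_mem_mb, count):
    """Return up to `count` nodes meeting the requirements, at most one per site."""
    chosen, seen_sites = [], set()
    # Visit the least-loaded nodes first; hostname breaks ties deterministically.
    for node in sorted(nodes, key=lambda n: (-n.free_cpu, n.hostname)):
        if node.free_cpu < min_cpu or node.free_mem_mb < min_mem_mb:
            continue                      # requirement not met
        if node.site in seen_sites:
            continue                      # keep the slice spread across sites
        chosen.append(node)
        seen_sites.add(node.site)
        if len(chosen) == count:
            break
    return chosen


if __name__ == "__main__":
    pool = [
        Node("a.example", "site1", 0.9, 2048),
        Node("b.example", "site1", 0.8, 2048),
        Node("c.example", "site2", 0.5, 1024),
        Node("d.example", "site3", 0.1, 4096),
    ]
    print([n.hostname for n in discover(pool, 0.3, 1000, 2)])
```

A real service would also have to cope with stale node state and competing requests, which this sketch ignores.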
Slices
User Opt-in
Server
http://coblitz.org/www.princeton.edu/podcast.mp4
Client
Per-Node View
Virtual Machine Monitor (VMM)
NodeMgr
LocalAdmin
VM1 VM2 VMn…
Global View
[Figure: global view of nodes and PLC]
Exploit Layer 2 Circuits
Deployed in NLR & Internet2 (aka VINI)
Circuits (cont)
Supports arbitrary virtual topologies
Circuits (cont)
Exposes (can inject) network failures
Circuits (cont)
BGP
Participate in Internet routing
Distributed Control of Resources
• At least two interested parties
  – service producers (researchers) decide how their services are deployed over available nodes
  – service consumers (users) decide what services run on their nodes
• At least two contributing factors
  – fair slice allocation policy
      both local and global components (see above)
  – knowledge about node state freshest at the node itself
Unbundled Management
• Partition management into orthogonal services
  – resource discovery
  – monitoring node health
  – topology management
  – manage user accounts and credentials
  – software distribution
• Issues
  – management services run in their own slice
  – allow competing alternatives
  – engineer for innovation (define minimal interfaces)
Application-Centric Interfaces
• Inherent problems
  – stable platform versus research into platforms
  – writing applications for temporary testbeds
  – integrating testbeds with desktop machines
• Approach
  – adopt popular API (Linux) and evolve implementation
  – eventually separate isolation and application interfaces
  – provide generic “shim” library for desktops
Virtual Machines
• Security
  – prevent unauthorized access to state
• Familiar API
  – forcing users to accept a new API is death
• Isolation
  – contain resource consumption
• Performance
  – don’t want to be apologetic
Virtualization
Virtual Machine Monitor (VMM)
NodeMgr
OwnerVM
VM1 VM2 VMn…
Linux kernel (Fedora Core)
+ Vservers (namespace isolation)
+ Schedulers (performance isolation)
+ VNET (network virtualization)
Auditing service
Monitoring services
Brokerage services
Provisioning services
Resource Allocation
• Decouple slice creation and resource allocation
  – given a “fair share” (1/Nth) by default when created
  – acquire/release additional resources over time
      including resource guarantees
• Protect against thrashing and over-use
  – link bandwidth
      upper bound on sustained rate (protect campus bandwidth)
  – memory
      kill largest user of physical memory when swap at 85%
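The default share and the memory protection rule are simple enough to sketch. This is a hedged illustration of the policy as stated on the slide, not the node software itself: the function names, the dictionary of per-slice memory use, and the units are assumptions.

```python
def fair_share(num_slices: int) -> float:
    """Default share of a node's resources for each of N slices (1/N)."""
    if num_slices <= 0:
        raise ValueError("need at least one slice")
    return 1.0 / num_slices


def pick_slice_to_kill(swap_used_fraction, mem_by_slice, threshold=0.85):
    """Return the slice using the most physical memory once swap use
    reaches the threshold, or None if no action is needed.

    `mem_by_slice` maps a (hypothetical) slice name to memory in use.
    """
    if swap_used_fraction < threshold or not mem_by_slice:
        return None
    return max(mem_by_slice, key=mem_by_slice.get)


def clamp_rate(requested_bps: float, cap_bps: float) -> float:
    """Upper bound on a slice's sustained sending rate."""
    return min(requested_bps, cap_bps)


if __name__ == "__main__":
    print(fair_share(4))
    print(pick_slice_to_kill(0.90, {"slice_a": 100, "slice_b": 300}))
```

Separating `fair_share` from the two protection functions mirrors the slide's point that slice creation and resource allocation are decoupled.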
Confluence of Technologies
• Cluster-based management
• Overlay and P2P networks
• Virtual machines and sandboxing
• Service composition frameworks
• Internet measurement
• Packet processors
• Colo services
• Web services
The time is now.
Usage Stats
• Users: 2500+
• Slices: 600+
• Long-running services: ~20
  – content distribution, scalable large file transfer,
  – multicast, pub-sub, routing overlays, anycast,…
• Bytes-per-day: 4 TB
  – 1Gbps peak rates not uncommon
• Unique IP-addrs-per-day: 1M
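To put the 4 TB per day figure next to the 1 Gbps peaks, it helps to convert the daily volume to an average rate. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a steady 24-hour day; the function name is illustrative.

```python
def average_mbps(bytes_per_day: float) -> float:
    """Average rate in Mbps for a given daily byte volume."""
    seconds_per_day = 24 * 60 * 60
    return bytes_per_day * 8 / seconds_per_day / 1e6


if __name__ == "__main__":
    avg = average_mbps(4e12)   # assumption: 1 TB = 10**12 bytes
    print(f"average: {avg:.0f} Mbps, 1 Gbps peak is {1000 / avg:.1f}x the average")
```

Under these assumptions the average works out to about 370 Mbps, so a 1 Gbps peak is a little under three times the daily average.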
Validation Gap
[Figure: stages Analysis, Simulation / Emulation, Experiment at Scale with Real Users, Deployment; labels: (models), (code), (results), (measurements)]
Deployment Gap
[Figure: Maturity versus Time. Stages: Analysis (MatLab), Controlled Experiment (EmuLab), Deployment Study (PlanetLab), Pilot Demonstration (PL Gold), Commercial Adoption. Labels: Ideas, Implementation Reality, User & Network Reality, Economic Reality]
Emerging applications
• Content distribution
• Peer-to-Peer networks
• Global storage
• Mobility services
• Etc. etc.
Vibrant research community embarking on new direction and none can try out their ideas.
Trust Relationships
Sites: Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …
Slices: princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …
N x N
Trusted Intermediary (PLC)
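The case for a trusted intermediary can be read as a counting argument. The sketch below is one interpretation of the slide, not a statement of how PLC accounts for trust: it assumes N parties that each host nodes and run slices, and it assumes each party needs one relationship with the intermediary in each role.

```python
def pairwise_relationships(n: int) -> int:
    """Direct trust: every hosting site must trust every slice owner (N x N)."""
    return n * n


def brokered_relationships(n: int) -> int:
    """Via an intermediary: each party trusts only the intermediary,
    once as a node owner and once as a slice owner (assumed: 2N)."""
    return 2 * n


if __name__ == "__main__":
    for n in (17, 500):
        print(n, pairwise_relationships(n), brokered_relationships(n))
```

For the 17 sites named above that is 289 direct relationships against 34 brokered ones, and the gap widens quadratically as sites are added.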
Principals
• Node Owners
  – host one or more nodes (retain ultimate control)
  – select an MA and approve one or more SAs
• Service Providers (Developers)
  – implement and deploy network services
  – responsible for the service’s behavior
• Management Authority (MA)
  – installs and maintains software on nodes
  – creates VMs and monitors their behavior
• Slice Authority (SA)
  – registers service providers
  – creates slices and binds them to responsible provider
Trust Relationships
[Figure: Owner, Provider, MA and SA, connected by arrows numbered 1 to 6]
(1) Owner trusts MA to map network activity to responsible slice
(2) Owner trusts SA to map slice to responsible providers
(3) Provider trusts SA to create VMs on its behalf
(4) Provider trusts MA to provide working VMs & not falsely accuse it
(5) SA trusts provider to deploy responsible services
(6) MA trusts owner to keep nodes physically secure
Architectural Elements
[Figure: MA with node database; Node Owner; node running NM + VMM, Owner VM, SCS and VM; SA with slice database; Service Provider]
Slice Creation
[Figure: PI calls SliceCreate( ) and SliceUsersAdd( ) on PLC (SA); User/Agent calls GetTicket( ), then redeems the ticket with plc.scs; plc.scs calls CreateVM(slice); each node runs a VMM with NM and VMs]
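The ticket-based flow named on this slide can be sketched end to end in a few lines. Everything here is a toy model: the class names, the in-memory slice table, and the use of an HMAC over the slice name as the "ticket" are assumptions for illustration. Only the operation names (SliceCreate, SliceUsersAdd, GetTicket, CreateVM) come from the slide.

```python
import hashlib
import hmac


class ToyPLC:
    """In-memory stand-in for the slice authority (hypothetical)."""

    def __init__(self, key: bytes):
        self._key = key
        self.slices = {}                       # slice name -> set of users

    def slice_create(self, name: str) -> None:
        self.slices.setdefault(name, set())

    def slice_users_add(self, name: str, user: str) -> None:
        self.slices[name].add(user)

    def get_ticket(self, name: str, user: str):
        if user not in self.slices.get(name, ()):
            raise PermissionError(f"{user} is not a member of {name}")
        sig = hmac.new(self._key, name.encode(), hashlib.sha256).hexdigest()
        return (name, sig)


class ToyNode:
    """Stand-in for plc.scs plus the node manager on a single node."""

    def __init__(self, key: bytes):
        self._key = key                        # assumed shared with ToyPLC
        self.vms = set()

    def redeem_ticket(self, ticket) -> None:
        name, sig = ticket
        expected = hmac.new(self._key, name.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            raise PermissionError("invalid ticket")
        self.create_vm(name)

    def create_vm(self, slice_name: str) -> None:
        self.vms.add(slice_name)               # one VM per slice on this node


if __name__ == "__main__":
    plc, node = ToyPLC(b"demo-key"), ToyNode(b"demo-key")
    plc.slice_create("demo_slice")
    plc.slice_users_add("demo_slice", "alice")
    node.redeem_ticket(plc.get_ticket("demo_slice", "alice"))
    print(node.vms)
```

The point of the ticket is that the node can check the request locally without contacting the authority at redemption time; a real deployment would use proper signatures and expiry rather than a shared key.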
Brokerage Service
[Figure: PLC (SA); nodes each running a VMM with NM and VMs; User calls BuyResources( ) on the Broker; Broker calls Bind(slice, pool) (broker contacts relevant nodes)]
PlanetLab: Two Perspectives
• Useful research platform
• Prototype of a new network architecture
What are people doing in/on/with/around PlanetLab?
1. Network measurement
2. Application-level multicast
3. Distributed Hash Tables
4. Storage
5. Resource Allocation
6. Distributed Query Processing
7. Content Distribution Networks
8. Management and Monitoring
9. Overlay Networks
10. Virtualisation and Isolation
11. Router Design
12. Testbed Federation
13. …
Lessons Learned
• Trust relationships
  – owners, operators, developers
• Virtualization
  – scalability is critical
  – control plane and node OS are orthogonal
  – least privilege in support of management functionality
• Decentralized control
  – owner autonomy
  – delegation
• Resource allocation
  – decouple slice creation and resource allocation
  – best effort + overload protection
• Evolve based on experience
  – support users quickly
Conclusions
• Innovation can come from anywhere
• Much of the Internet’s success can be traced to its support for innovation “at the edges”
• There is currently a high barrier-to-entry for innovating “throughout the net”
• One answer is a network substrate that supports “on demand, customizable networks”
  – enables research
  – supports continual innovation and evolution
PlanetLab Software Overview
Mark [email protected]
Node Software
• Boot
  – Boot CD
  – Boot Manager
• Virtualization
  – Linux kernel
  – VServer
  – VNET
• Node Management
  – Node Manager
  – NodeUpdate
  – PlanetLabConf
• Slice Management
  – Slice Creation Service
  – Proper
• Monitoring
  – PlanetFlow
  – pl_mom
PLC Software
• Database server
  – pl_db
• PLCAPI server
  – plc_api
• Web server
  – Website PHP
  – Scripts
• Boot server
  – PlanetLabConf scripts
• PlanetFlow archive
• Mail, Support (RT), DNS, Monitor, Build, CVS, QA
Boot Manager
• Boot Manager
  – bootmanager/source/
      Main BootManager class, authentication, utility functions, configuration, etc.
  – bootmanager/source/steps/
      Individual “steps” of the install/boot process
  – bootmanager/support-files/
      Bootstrap tarball generation
      Legacy support for old Boot CDs
Virtualization
• Linux kernel
  – Fedora Core 8 kernel
      VServer patch
• VServer
  – util-vserver/
      Userspace VServer management utilities and libraries
• VNET
  – Linux kernel module
  – Intercepts bind(), other socket calls
  – Intercepts and marks all IP packets
  – Implements TUN/TAP, proxy socket extensions
Node Management
• Node Manager (pl_nm)
  – sidewinder/
      Thin XML-RPC shim around VServer (or other VMM) syscalls, and other knobs
  – util-python/
      Miscellaneous Python utility functions
  – util-vserver/python/
      Python bindings for VServer syscalls
• Node Update
  – NodeUpdate/
      Wrapper around yum for keeping node RPMs up-to-date
• PlanetLabConf
  – PlanetLabConf/
      Pull-based configuration file distribution service
      Most files dynamically generated on a per-node or per-node group basis
Slice Management
• Slice Creation Service (pl_conf)
  – sidewinder/
      Runs in a slice
      Periodically downloads slices.xml from boot server
      Local XML-RPC API for delegated slice creation, query
• Proper
  – proper/
      Simple local interface for executing privileged operations
      Bind mount(), privileged port bind(), root read()
Administration and Monitoring
• PlanetFlow (pl_netflow)
  – netflow/
      MySQL schema and initialization/maintenance scripts
  – netflow/html/
      PHP frontend
  – netflow/pfgrep/
      Console frontend
  – ulogd/
      Packet header collection, aggregation, and insertion
• PlanetLab Monitor (pl_mom)
  – pl_mom/swapmon.py
      Swap space monitor and slice reaper
  – pl_mom/bwmon.py
      Average daily bandwidth monitor
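The idea of an average daily bandwidth monitor can be sketched as a check of per-slice byte counts against a sustained-rate cap. This is not the code in pl_mom/bwmon.py: the function names, the slice names, and the way byte counts are supplied are assumptions for illustration.

```python
SECONDS_PER_DAY = 86400.0


def average_rate_bps(bytes_sent: int, seconds: float = SECONDS_PER_DAY) -> float:
    """Average sending rate in bits per second over the given window."""
    return bytes_sent * 8 / seconds


def slices_over_cap(bytes_by_slice, cap_bps: float):
    """Names of slices whose average daily rate exceeds the cap, sorted."""
    return sorted(
        name
        for name, sent in bytes_by_slice.items()
        if average_rate_bps(sent) > cap_bps
    )


if __name__ == "__main__":
    usage = {"slice_a": 10_800_000_000, "slice_b": 50_000_000_000}
    print(slices_over_cap(usage, cap_bps=1_500_000))
```

What happens to a slice that exceeds the cap (throttling, notification, or something else) is a policy choice that this sketch leaves open.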
Database and API
• Database
  – pl_db/
      PostgreSQL schema generated from XML
• PLCAPI
  – plc_api/specification/
      XML specification of API functions
  – plc_api/PLC/
      mod_python implementation
Web Server
• PHP, Static, Generated
  – plc_www/includes/new_plc_api.php
      Auto-generated PHP binding to PLCAPI
  – plc_www/db/
      Secure portion of website
  – plc_www/generated/
      Generated include files
  – plc/scripts/
      Miscellaneous scripts
Boot Server
• Secure Software Distribution
  – Authenticated, encrypted with SSL
  – /var/www/html/boot/
      Default location for Boot Manager
  – /var/www/html/install-rpms/
      Default /etc/yum.conf location for RPM updates
  – /var/www/html/PlanetLabConf/
      Server-side component
      Mostly PHP