Networking 15-719 Advanced Cloud Computing Garth Gibson Greg Ganger Majd Sakr
Apr 5, 2017 15719 Adv. Cloud Computing 1
Advanced Cloud Computing Networking Readings
• Ref 1: “The cost of a cloud: research problems in data center networks.” A. Greenberg, J. R. Hamilton, D. A. Maltz, P. Patel. In ACM SIGCOMM Computer Comm. Review, Vol. 39, No. 1, Jan. 2009. Section 3. http://research.microsoft.com/en-us/um/people/dmaltz/papers/DC-Costs-CCR-editorial.pdf
• Ref 2: “PortLand: A Scalable, Fault-Tolerant Layer 2 Data Center Network Fabric.” R. N. Mysore, A. Pamporis, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, A. Vahdat. In SIGCOMM 2009. Sections 1-2. http://cseweb.ucsd.edu/~vahdat/papers/portland-sigcomm09.pdf
• Ref 3: “Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network.” Arjun Singh, J. Ong, A. Agarwal, G. Anderson, A. Armistead, R. Bannon, S. Boving, G. Desai, B. Feldman, P. Germano, A. Kanagala, J. Provost, J. Simmons, E. Tanda, J. Wanderer, U. Hölzle, S. Stuart, A. Vahdat. In SIGCOMM 2015.
• Ref 4: “B4: Experience with a Globally-Deployed Software Defined WAN.” Sushant Jain, Alok Kumar, Subhasree Mandal, Joon Ong, Leon Poutievski, Arjun Singh, Subbaiah Venkata, Jim Wanderer, Junlan Zhou, Min Zhu, Jonathan Zolla, Urs Hölzle, Stephen Stuart and Amin Vahdat. In SIGCOMM 2013. Sections 1-4. http://dl.acm.org/citation.cfm?id=2486019
• Ref 5: “Onix: A Distributed Control Platform for Large-scale Production Networks.” Teemu Koponen, M. Casado, N. Gude, J. Stribling, L. Poutievski, M. Zhu, R. Ramanathan, Y. Iwata, H. Inoue, T. Hama, S. Shenker. In OSDI 2010. http://static.usenix.org/events/osdi10/tech/full_papers/Koponen.pdf
• Ref 6: “Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication.” Vijay Vasudevan, Amar Phanishayee, Hiral Shah, Elie Krevat, David G. Andersen, Gregory R. Ganger, Garth A. Gibson, Brian Mueller. In SIGCOMM 2009. http://dl.acm.org/citation.cfm?id=1592604
Outline
• A Quick Review of Computer Networking
  o Huge credit to 15-441/641 for lecture materials (Steenkiste, Zhang, …)
  o Protocol stack: Layer 2 (Link, LAN), Layer 3 (Network, IP)
  o Link: LANs, Bridging LANs, Proxy LANs, Virtual LANs
  o Network: IP, Forwarding, Tunneling, VPNs
  o Virtual Circuits, MPLS, Traffic Shaping/Engineering
• Data Center Networking Issues
  o Scaling out past the limits of VLANs
IP to MAC Address Translation
• How does one find the Ethernet address of a specified IP host?
• ARP (Address Resolution Protocol)
  o Broadcast search for IP address
    • E.g., “who-has 128.2.184.45 tell 128.2.206.138” sent to Ethernet broadcast (all-FF address)
  o Destination responds (only to the requester, using unicast) with the appropriate 48-bit Ethernet address
    • E.g., “reply 128.2.184.45 is-at 0:d0:bc:f2:18:58” sent to 0:c0:4f:d:ed:c6
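The request/reply exchange can be sketched as a toy simulation. This is only an illustration under assumptions: the `Host` class, the dictionary "frames", and the third bystander host are invented here, and no real packets are sent. The two real hosts reuse the addresses from the slide's example.

```python
# Toy model of one ARP exchange on a single LAN (all names are illustrative).
BROADCAST = "ff:ff:ff:ff:ff:ff"  # the all-FF Ethernet broadcast address


class Host:
    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac

    def handle(self, frame):
        """Answer a who-has request only if it asks for this host's IP."""
        if frame["op"] == "who-has" and frame["target_ip"] == self.ip:
            # The reply is unicast: addressed to the requester's MAC only.
            return {"dst": frame["src_mac"], "op": "is-at",
                    "ip": self.ip, "mac": self.mac}
        return None


def arp_resolve(requester, target_ip, lan):
    """Broadcast a who-has request to every host; return the MAC from the reply."""
    request = {"dst": BROADCAST, "src_mac": requester.mac, "op": "who-has",
               "target_ip": target_ip, "tell": requester.ip}
    for host in lan:
        if host is requester:
            continue
        reply = host.handle(request)
        if reply is not None and reply["dst"] == requester.mac:
            return reply["mac"]
    return None  # nobody on this LAN owns that IP


a = Host("128.2.206.138", "0:c0:4f:d:ed:c6")
b = Host("128.2.184.45", "0:d0:bc:f2:18:58")
c = Host("128.2.184.99", "0:11:22:33:44:55")  # made-up bystander host
print(arp_resolve(a, "128.2.184.45", [a, b, c]))
```

Every host sees the broadcast, but only the owner of the requested IP answers, which is why broadcast load grows with the number of hosts on the LAN.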
Caching ARP Entries
• Efficiency Concern
  o It would be very inefficient to use an ARP request/reply every time an IP message is sent to a machine
• Each Host Maintains a Cache of ARP Entries
  o Add an entry to the cache whenever an ARP response arrives
  o Set a timeout of ~20 minutes
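A minimal sketch of such a cache, assuming a simple dictionary keyed by IP address and the ~20 minute timeout from the slide. The class name `ArpCache` and the injectable `clock` parameter are illustrative choices, not part of any real network stack.

```python
import time


class ArpCache:
    """IP -> MAC cache whose entries expire after a fixed timeout."""

    def __init__(self, timeout_s=20 * 60, clock=time.monotonic):
        self.timeout_s = timeout_s   # ~20 minutes, as on the slide
        self.clock = clock           # injectable so expiry can be tested
        self.entries = {}            # ip -> (mac, time the entry was added)

    def add(self, ip, mac):
        """Record the mapping learned from an ARP response."""
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        """Return the cached MAC, or None if missing or expired."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, added = entry
        if self.clock() - added > self.timeout_s:
            del self.entries[ip]     # stale: caller must send a new ARP request
            return None
        return mac
```

A `None` result means the sender falls back to a fresh broadcast request, so the timeout trades broadcast traffic against how quickly a changed mapping is noticed.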
VLANs – if there are too many hosts in LAN
• VLANs logically segment switched LANs based on organization or function, independent of their physical location in the network
• Devices on a VLAN share their own (private) LAN
  o Form their own IP subnet
• Offers many benefits:
  o Performance: limits broadcast messages to the VLAN, which improves scalability
  o Security: isolates the VLAN (done by routers with smarter filtering capabilities)
  o Management: manage network topology without changing physical topology
[Figure: VLAN Example]
[Figure: VLAN Logical Topology]
VLAN Types
• VLANs are implemented by switches (replace a daisy-chained cable with a wire per machine and a “switch” as the shared “wire”)
• VLAN membership can be controlled by a switch in different ways, based on:
  o Port: incoming ports are tagged with a VLAN ID
  o MAC address: switch has a (MAC, VLAN ID) table
  o Protocol: switch has a (protocol, VLAN ID) table
• The frame headers are encapsulated or modified to insert a VLAN ID
  o Inserted by the first switch before forwarding the packet
  o Removed by the last switch before forwarding to the destination device
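Port-based membership and the insert/remove steps can be sketched as follows. This is a simplified model under assumptions: one logical switch fabric, frames as dictionaries, and invented port numbers and VLAN IDs. A real switch would typically carry the ID in an IEEE 802.1Q tag in the Ethernet header.

```python
class VlanFabric:
    """One logical switch fabric with port-based VLAN membership."""

    def __init__(self, port_vlan):
        self.port_vlan = port_vlan  # access port -> VLAN ID

    def ingress(self, port, frame):
        """First switch: tag the frame with the VLAN ID of the incoming port."""
        return {**frame, "vlan": self.port_vlan[port]}

    def broadcast_ports(self, tagged, in_port):
        """A broadcast only reaches ports in the same VLAN, minus the sender."""
        return sorted(p for p, vlan in self.port_vlan.items()
                      if vlan == tagged["vlan"] and p != in_port)

    def egress(self, tagged):
        """Last switch: strip the tag before delivering to the device."""
        return {k: v for k, v in tagged.items() if k != "vlan"}


# Hypothetical assignment: ports 1-2 in VLAN 10, ports 3-4 in VLAN 20.
fabric = VlanFabric({1: 10, 2: 10, 3: 20, 4: 20})
frame = {"src": "A", "dst": "broadcast", "payload": "hello"}
tagged = fabric.ingress(1, frame)
print(fabric.broadcast_ports(tagged, in_port=1))
```

The end devices never see the tag: what leaves `egress` is identical to the frame that entered, which is why hosts need no VLAN awareness in this scheme.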
IP Forwarding – Hop-by-Hop Control
[Figure: a packet addressed to IP 47.1.1.1 is forwarded hop by hop through routers connecting networks 47.1, 47.2 and 47.3. Each router looks up the destination in its own copy of the forwarding table.]

Dest   Out
47.1   1
47.2   2
47.3   3
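The per-hop lookup can be sketched with Python's standard `ipaddress` module. The assumption here is that the slide's "47.1", "47.2" and "47.3" stand for the /16 prefixes 47.1.0.0/16 and so on; the function name `next_hop_port` is illustrative.

```python
import ipaddress

# Assumption: the table's "47.x" entries are /16 prefixes.
FORWARDING_TABLE = {
    ipaddress.ip_network("47.1.0.0/16"): 1,
    ipaddress.ip_network("47.2.0.0/16"): 2,
    ipaddress.ip_network("47.3.0.0/16"): 3,
}


def next_hop_port(dest_ip, table=FORWARDING_TABLE):
    """Longest-prefix match: pick the most specific prefix containing dest_ip."""
    addr = ipaddress.ip_address(dest_ip)
    matches = [net for net in table if addr in net]
    if not matches:
        return None  # no route; a real router would use a default route or drop
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]


print(next_hop_port("47.1.1.1"))
```

Each router repeats this lookup on the full destination address, independently of what the previous hop decided.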
Tunneling – Forcing particular routes
• Force a packet to go through a specific point in the network.
  – Path taken is different from regular routing
• Achieved by adding an extra IP header to the packet with a new destination address.
  – Similar to putting a letter in another envelope
  – Preferable to using the IP source routing option
• Used increasingly to deal with special routing requirements or new features.
  – Mobile IP, ...
  – Multicast, IPv6, research, ...
[Figure: packet layout Data | IP1 | IP2, where the outer header IP2 is wrapped around the original header IP1]
Extending Private Network
• Supporting Road Warriors (and hybrid remote clouds)
  o Employee working remotely with assigned IP address 198.3.3.3
  o Wants to appear to rest of corporation as if working internally at address 10.6.6.6
    • Gives access to internal services (e.g., ability to send mail)
• Virtual Private Network (VPN)
  o Overlays private network on top of regular Internet
[Figure: Corporation X (internal addresses 10.X.X.X, clients C and server S, NAT at the edge) connected to the Internet; a remote client C at 198.3.3.3 appears internally as 10.6.6.6. C: Client, S: Server]
Supporting VPN by Tunneling
• Concept
  o Appears as if two hosts connected directly
• Usage in VPN
  o Create tunnel between client & firewall
  o Remote client appears to have direct connection to internal network
[Figure: client C (198.3.3.3) reaches firewall F (243.4.4.4) through routers R; the tunnel endpoints are labeled 10.5.5.5 and 10.6.6.6. F: Firewall, R: Router, C: Client]
Supporting VPN by Tunneling
• Client creates packet for internal node 10.1.1.1
• Entering Tunnel
  o Add extra IP header directed to firewall (243.4.4.4)
  o Original header becomes part of payload
  o Possible to encrypt it
• Exiting Tunnel
  o Firewall receives packet
  o Strips off header
  o Sends through internal network to destination
[Figure: encapsulated packet. Outer header: source 198.3.3.3, dest 243.4.4.4. Inner header: source 10.6.6.6, dest 10.1.1.1. Then the payload.]
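Entering and exiting the tunnel can be sketched as wrapping one packet inside another. This is a minimal model under assumptions: the `Packet` dataclass and the two helper functions are invented for illustration, encryption is left out, and the addresses are the ones from the slide.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: Union[str, "Packet"]  # data, or a whole encapsulated packet


def enter_tunnel(inner, tunnel_src, tunnel_dst):
    """Add an extra header; the original packet becomes the payload."""
    return Packet(src=tunnel_src, dst=tunnel_dst, payload=inner)


def exit_tunnel(outer):
    """Strip the outer header and recover the original packet."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not an encapsulated packet")
    return outer.payload


# Client builds a packet for internal node 10.1.1.1, using its internal identity.
inner = Packet(src="10.6.6.6", dst="10.1.1.1", payload="data")
# Entering the tunnel: outer header is directed to the firewall.
outer = enter_tunnel(inner, tunnel_src="198.3.3.3", tunnel_dst="243.4.4.4")
# Exiting the tunnel at the firewall.
delivered = exit_tunnel(outer)
print(outer.dst, delivered.dst)
```

Internet routers only ever look at the outer header, so the private 10.x addresses never need to be routable outside the corporation.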
Circuit Versus Packet Switching
Circuit Switching
• Fast switches can be built relatively inexpensively
• Inefficient for bursty data
• Predictable performance (e.g. hard QoS)
• Requires circuit establishment before communication

Packet Switching
• Switch design is more complex and expensive
• Allows statistical multiplexing
• Difficult to provide QoS guarantees
• Data can be sent without signaling delay and overhead

Can we get the benefits of both?
Virtual Circuits Versus Packet Switching
• Virtual circuit switching:
  o Uses short connection identifiers to forward packets
  o Switches know about the connections so they can more easily implement features such as quality of service
  o Virtual circuits form basis for traffic engineering: VC identifies long-lived stream of data that can be scheduled
• Packet switching:
  o Use full destination addresses for forwarding packets
  o Can send data right away: no need to establish a connection first
  o Switches are stateless: easier to recover from failures
  o Adding QoS is hard
  o Traffic engineering is hard: too many packets!
Virtual Circuit Forwarding
• Address used for look up is a virtual circuit identifier (VC id)
• Forwarding table entries are filled in during signaling
• VC id is often shorter than destination address
[Figure: a switch with ports 1-4 connecting hosts A-F; virtual circuits VC1, VC2 and VC3 pass through it.]

Address   Next Hop
VC1       3
VC2       3
VC3       4
VC4       ?
VC5       ?
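A small sketch of the two phases, signaling and forwarding, using the table entries from the figure. The class and method names are hypothetical; the point is only that data for a VC id can be forwarded once setup has installed an entry, which is why VC4 and VC5 show "?".

```python
class VcSwitch:
    """Switch that forwards on a short virtual circuit id instead of an address."""

    def __init__(self):
        self.table = {}  # VC id -> outgoing port

    def signal_setup(self, vc_id, out_port):
        """Connection setup: install per-circuit state before any data flows."""
        self.table[vc_id] = out_port

    def forward(self, vc_id):
        """Data path: one exact-match lookup on the VC id."""
        if vc_id not in self.table:
            raise KeyError(f"no circuit established for {vc_id}")
        return self.table[vc_id]


switch = VcSwitch()
for vc_id, port in [("VC1", 3), ("VC2", 3), ("VC3", 4)]:
    switch.signal_setup(vc_id, port)

print(switch.forward("VC3"))
```

The exact-match lookup is simpler than a longest-prefix match, but the per-circuit state is what makes the switch harder to recover after a failure.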
Virtual Circuits In Practice
• ATM: Telco approach
  o Kitchen sink. Based on voice, support file transfer, video, etc., etc.
  o Intended as IP replacement. That didn’t happen. :)
  o Today: rarely used.
• MPLS: The “IP Heads” answer to ATM
  o Stole good ideas from ATM
  o Integrates well with IP
  o Today: Used inside some networks to provide VPN support, traffic engineering, simplify core.
• Other networks just run IP.
MPLS
• Multi-Protocol Label Switching
• Bringing virtual circuit concept into IP
• Driven by multiple forces
  o QoS
  o Traffic engineering
  o High performance forwarding
  o VPN
[Figure: plain IP frame: Layer 2 header | Layer 3 (IP) header. With MPLS: Layer 2 header | MPLS label | Layer 3 (IP) header]
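The label inserted between the Layer 2 and Layer 3 headers is a 32-bit entry. The sketch below packs and unpacks one entry using the standard MPLS label stack layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL). The function names and default values are illustrative and not taken from the slides.

```python
import struct


def pack_mpls_entry(label, tc=0, bottom_of_stack=True, ttl=64):
    """Pack one label stack entry into 4 bytes, network byte order."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("tc must fit in 3 bits and ttl in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)


def unpack_mpls_entry(data):
    """Decode the first 4 bytes of data as a label stack entry."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "bottom_of_stack": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


entry = pack_mpls_entry(50)
print(entry.hex(), unpack_mpls_entry(entry))
```

Because the label is a short fixed-width field at a fixed offset, a switch can forward on it without parsing the IP header at all.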
Label Switched Path (LSP)
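[Figure: three routers connecting networks 47.1, 47.2 and 47.3; a packet for IP 47.1.1.1 follows a label switched path toward 47.1.]

Ingress router:
Intf In   Dest   Intf Out   Label Out
3         47.1   1          50

Transit router:
Intf In   Label In   Dest   Intf Out   Label Out
3         50         47.1   1          40

Egress router:
Intf In   Label In   Dest   Intf Out
3         40         47.1   1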
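The three tables can be walked in code to show the push, swap and pop steps. This is a sketch under assumptions: the router roles (ingress, transit, egress) and the dictionary encoding are an interpretation of the figure, and each router is assumed to receive the packet on interface 3 as its table shows.

```python
# Tables transcribed from the figure (roles are an interpretation).
INGRESS = {"47.1": (1, 50)}      # dest prefix -> (intf out, label out)
TRANSIT = {(3, 50): (1, 40)}     # (intf in, label in) -> (intf out, label out)
EGRESS = {(3, 40): 1}            # (intf in, label in) -> intf out, label removed


def forward_along_lsp(dest_prefix, in_intf=3):
    """Return (router, intf out, label carried) for each hop on the path."""
    trace = []
    out_intf, label = INGRESS[dest_prefix]        # push: IP lookup happens once
    trace.append(("ingress", out_intf, label))
    out_intf, label = TRANSIT[(in_intf, label)]   # swap: label lookup only
    trace.append(("transit", out_intf, label))
    out_intf = EGRESS[(in_intf, label)]           # pop: back to plain IP
    trace.append(("egress", out_intf, None))
    return trace


for hop in forward_along_lsp("47.1"):
    print(hop)
```

Only the ingress router consults the destination address; later hops forward purely on the incoming label, as in virtual circuit switching.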
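Explicitly Routed LSP – Shape Traffic as Needed

[Figure: the same three routers and networks 47.1, 47.2, 47.3; the ingress router has an extra, more specific entry for 47.1.1 with its own outgoing interface and label, so a packet for IP 47.1.1.1 can be steered onto a different path.]

Ingress router:
Intf In   Dest     Intf Out   Label Out
3         47.1.1   2          33
3         47.1     1          50

Transit router:
Intf In   Label In   Dest   Intf Out   Label Out
3         50         47.1   1          40

Egress router:
Intf In   Label In   Dest   Intf Out
3         40         47.1   1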
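How the ingress router picks between its two entries can be sketched as a longest-prefix classification. The assumptions are that "47.1.1" means 47.1.1.0/24 and "47.1" means 47.1.0.0/16; the names `INGRESS_FEC` and `classify` are illustrative.

```python
import ipaddress

# Ingress entries from the figure: (prefix, intf out, label out).
# Assumption: "47.1.1" is a /24 and "47.1" is a /16.
INGRESS_FEC = [
    (ipaddress.ip_network("47.1.1.0/24"), 2, 33),  # explicitly routed LSP
    (ipaddress.ip_network("47.1.0.0/16"), 1, 50),  # default LSP toward 47.1
]


def classify(dest_ip):
    """Return (intf out, label out) for the most specific matching entry."""
    addr = ipaddress.ip_address(dest_ip)
    matches = [entry for entry in INGRESS_FEC if addr in entry[0]]
    if not matches:
        return None
    _, intf, label = max(matches, key=lambda entry: entry[0].prefixlen)
    return intf, label


print(classify("47.1.1.1"), classify("47.1.2.1"))
```

Traffic shaping then amounts to choosing which entries exist at the ingress; downstream routers simply follow whatever label they are handed.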
Cloud/Data Center Networking
• Large data centers have 10s-100s of 1000s of machines
  o × 10s of VMs each
  o 1000s of network boxes
• Heavy use of Layer 2
  o Cheaper, faster than IP
• Dependence on Layer 3
  o For things not in same layer 2 domain
[Figure: switch tiers labeled ToR, EoR and Core]
• Can use fixed routing to increase average rack to rack BW
• Can use shaped traffic to maximize BW for dynamic traffic pattern
(PortLand) Data Center Network Requirements
• R1: VMs migrate w/o changing IP or breaking TCP connections
  o Limited to layer 2 broadcast domains, VLANs would have to span all
• R2: Switches should not need configuration with each new Vcloud
  o Changing VLAN definitions is reconfiguration
• R3: All network paths should be available to all connected nodes
  o Need significant multi-pathing & traffic shaping, usually slow to adapt
• R4: Forwarding loops (on failure recovery) should not happen
  o Current layer 2 & 3 protocols have this problem
• R5: Failure recovery should be fast, no impact on open sessions
  o Current protocols recover slowly compared to data center speeds
Data Center Routing
• Layer 3 equipment is more expensive than layer 2 equipment
• IP addresses are assigned to hosts according to physically direct switch connections
• Routing protocols slowly gather and share connectivity
  o Connectivity failures lead to transient loops (IP limits no. of routing hops)
  o Management/configuration changes (possible routes) are error prone
    • E.g., DHCP service tables must agree with routing paths
    • E.g., security policies in firewalls must agree with routing paths
• Migrating a VM changes its IP address, breaks TCP connections
  o Can use overlay networks, but they are slow and therefore expensive
Data Center VLANs
• LANs and VLANs must support broadcast
  o Reduces configuration because forwarding tables are self discovered
  o Can move a machine/VM among common (V)LAN switches with same IP
  o Big (V)LANs mean big tables and lots of broadcast traffic
    • Millions of VMs in same VLAN is rarely supported (usually 100s – 1000s)
• Layer 2 solves forwarding loops with spanning tree
  o One of many paths is selected for all traffic A -> B
  o Bad for bandwidth, underutilizes paths and links
  o VLANs that span networks using proxies force too much traffic high in tree
• Switches must be configured with per VLAN bandwidth
  o Making changes complex, often manual
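Why a spanning tree underutilizes links can be seen on a tiny example. The sketch below builds a breadth-first spanning tree over a hypothetical four-switch topology; it is a stand-in for the real spanning tree protocol, which elects a root by bridge ID and uses link costs, but the effect on redundant links is the same.

```python
from collections import deque


def spanning_tree(links, root):
    """Breadth-first spanning tree; returns the set of links left active."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    visited, active, queue = {root}, set(), deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbors[node]):
            if peer not in visited:
                visited.add(peer)
                active.add(frozenset((node, peer)))
                queue.append(peer)
    return active


# Hypothetical topology: two core switches, each wired to both edge switches.
LINKS = [("core1", "edgeA"), ("core1", "edgeB"),
         ("core2", "edgeA"), ("core2", "edgeB")]
active = spanning_tree(LINKS, root="core1")
blocked = {frozenset(link) for link in LINKS} - active
print(len(active), "links active;", len(blocked), "link blocked")
```

Here one of the four links carries no traffic at all, and all traffic between edgeA and edgeB takes the single path through core1 even though a second path through core2 exists.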
SDN in Google’s WAN
• Google is either special case or leading indicator (not clear which)
  o Google's scale is probably always the most extreme
  o Google likes to totally control applications, systems & infrastructure
  o Google is extremely cost sensitive
• SDN lets Google program the WAN using cheaper hardware
  o Want to use 100% of WAN bandwidth
  o Total control and not that many WAN nodes (datacenters)
  o Like to invent, rapidly experiment, understand complete implementation
    • Dislike expensive, complex, proprietary network devices
B4 Design
• Control of routing done by OpenFlow controller (OFC), executed by an agent (OFA)
  o Paxos SMR protects decisions
• Traffic engineering gathers traffic measurements & application priorities & optimizes paths
  o Multi-pathing implemented with IP-in-IP tunneling
  o Optimizer roughly done by linear programming
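To give a feel for what a bandwidth optimizer decides, the sketch below splits one link's capacity among competing demands using progressive filling (max-min fairness). This is a deliberately simplified stand-in, not the linear program the slide mentions and not B4's actual optimizer; the application names, demands and capacity are made up.

```python
def max_min_fair(capacity, demands):
    """Progressive filling: grant small demands in full, split the rest evenly."""
    allocation = {}
    remaining = capacity
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        fair_share = remaining / len(pending)
        name, demand = pending[0]
        if demand <= fair_share:
            # Smallest demand fits within its fair share: satisfy it completely.
            allocation[name] = demand
            remaining -= demand
            pending.pop(0)
        else:
            # Nobody left fits: everyone remaining gets an equal share.
            for name, _ in pending:
                allocation[name] = fair_share
            break
    return allocation


# Hypothetical demands (Gb/s) on a 10 Gb/s WAN link.
shares = max_min_fair(10, {"copy": 2, "index": 4, "backup": 10})
print(shares)
```

All of the link's capacity is handed out, which matches the stated goal of using 100% of WAN bandwidth, while small flows are not starved by the largest one.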
Next up
• Geo-Replication