Networking and Communications David W. Hankins 10/28/2008.

Networking and Communications

David W. Hankins 10/28/2008

About your Speaker

Left college early to work for ISPs.

Yeah, don't do that.

Now I'm continuing my education.

Operated IP networks from small dial access to large backbones.

Wrote software for Skycache/Cidera.

Now a Software Engineer at Internet Systems Consortium, Inc. (ISC)

Working on the ISC DHCP software project.

Mostly a maintainer, but also wrote DHCPv6 (DHCP for IPv6) software.

Author, RFC 5071 (don't bother reading it).

Off-road and video game enthusiast.

About ISC

Internet Systems Consortium, Inc.

Headquartered in Redwood City, CA

501(c)(3) Nonprofit Corporation

http://www.isc.org/

Mission:

To develop and maintain production quality Open Source software, such as BIND and DHCP

Enhance the stability of the global DNS through reliable F-root nameserver operations and ongoing operation of a DNS crisis coordination center, ISC's OARC for DNS

Further protocol development efforts, particularly in the areas of DNS evolution and facilitating the transition to IPv6.

Overview

The OSI model is actually dead, but you still need to know it.

We'll talk about the historic to current progression of Ethernet.

'Network' means 'Internet' these days, so I will focus there.

The OSI Model

OSI defined 7 “Layers” in OSI standard networking, so that different technologies could be used at each layer, and the lower and higher layers needed no knowledge of each other's internals.

This is kind of like “Modular Programming” for networks.

The 7 OSI Layers (plus Evi Nemeth's extensions):

1: Physical – ex: “Twisted Pair”

2: Link – ex: “Ethernet”

3: Network – ex: “IPv4”

4: Transport – ex: “TCP”

5: Session

6: Presentation

7: Application

8: Financial

9: Political

OSI is Dead

Long live OSI!

There are still some common uses of these terms.

In Internet Protocols, the phrase was coined:

“IP over Everything. Everything over IP.”

The idea in the Internet's childhood was to provide a simple framing format that could be used to communicate with hosts everywhere, regardless of the lower layer link protocols (or how proprietary they were). IP over everything.

It was recognized that once you did this, the great bastions of proprietary networks (the telco companies) would soon find themselves subverted; IP could bridge them all.

Ex: Voice over UDP over IP over PPP over SSH over TCP over IP over DNS over UDP over IP over Ethernet transmitted on twisted pair. Which one is layer 3?

Ethernet Framing

Ethernet Network Interface Cards (NICs) were assigned (theoretically) unique addresses by their manufacturers.

The first 24 bits are Organizationally Unique Identifiers, so for one manufacturer that field is fixed, and they assign the remaining bits.

That didn't work out so well.

The NIC would filter out any packet it received that was not directed to its own address, so the OS would not have to discard it. Exception: broadcast address.

The 64-octet minimum length was to enforce Ethernet timing parameters; at 10Mbps, 512 bits take 51.2us to transmit, in which time a signal travels roughly 10km of cable.
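
The arithmetic behind those numbers can be checked with a minimal sketch. The propagation speed is an assumption (roughly two thirds the speed of light, a common rule of thumb for copper cable), not a figure from the slides.

```python
BIT_RATE_BPS = 10_000_000   # 10Mbps Ethernet
MIN_FRAME_BITS = 64 * 8     # the 64-octet minimum frame, in bits

# Assumption: signals move through the cable at about 2/3 the speed of light.
PROPAGATION_M_PER_S = 2.0e8

# Time needed to put the shortest legal frame on the wire.
slot_time_s = MIN_FRAME_BITS / BIT_RATE_BPS

# Distance the leading edge of the signal covers in that time.
distance_m = slot_time_s * PROPAGATION_M_PER_S

print(f"slot time: {slot_time_s * 1e6:.1f} us")
print(f"signal travel in one slot: {distance_m / 1000:.2f} km")
```

Note that this is total signal travel; a collision has to make it there and back, so the usable cable span is shorter.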

Ethernet: Thicknet

Ethernet L1's follow a kind of naming convention. In the term “XXXBase-PHY”, X is the speed of the link (in megabits per second), and PHY is a conventional suffix to describe the link media. For example, both 10Base-T and 100Base-T exist.

When I started working at ISP's, something called “Thicknet” (“10Base-5”) was just getting phased out in favor of “Thinnet” (10Base-2).

Thicknet used a coaxial cable, a center conductor wire made of copper sheathed in an insulator (wax), with a braided second conductor completely surrounding it, so that the shield and the center pin shared the same axis. A plastic sheath enclosed the whole thing.

The center wire carried signal, the braided shield made a Faraday Cage.

At points along the plastic sheath, lines marked the points in the wire where Ethernet signals' standing waves would stand. Hosts could be attached to these points, with two points sticking through the sheath and insulator like vampires.

Ethernet: Thinnet

Note that although the -5 suffix in Thicknet was used to describe its maximum distance between cable endpoints (500 meters), and the -2 suffix used in Thinnet similarly marked the intent for it to survive 200 meters, 10Base-2 was later clarified to survive no more than 185 meters.

Thinnet used a coaxial cable (RG-58, smaller than Thicknet's RG-8), but rather than vampiric connections, the coax cables used BNC connectors on the end, and hosts in the network were connected on “T” connections.

Each end of the coax would be terminated with a 50 Ohm resistor. This absorbed signals reaching the cable's ends, reducing the amplitude of reflections back down the line.

This was certainly better than Thicknet, but it still had problems.

Ethernet: Do the Twist!

The 10Base-T standard described the use of Category 3 cable. Category 3 cable is a performance standard, generally met by using twisted pairs of 24 AWG unshielded wires.

To meet the 16MHz standard, you might use more twists, or different gauge wire.

Where 10Base-5 and 10Base-2 used the center conductor and sheath as the transmit/receive and Ground, twisted pair cable usually carries 4 pairs of conductors, with the two wires of each pair twisted around each other.

The twist in the pair averages out noise imparted on the line from other signal sources; receivers look at the voltage difference between the pair, not at the absolute voltage of either. By twisting the pair, the signal and ground are affected near identically, so the difference is unaffected by noise.

Ethernet: RJ45

The new 10Base-T cables used RJ45 connectors on the end, plastic connectors with a retaining lever and 8 conductors (for the 4 pairs).

W-O, O, W-G, Bl, W-Bl, G, W-Br, Br (O = orange, G = green, Bl = blue, Br = brown, W = white stripe)

With 4 pairs to choose from, Ethernet was now free to use two whole pairs for transmit and receive.

The orange and green pair.

Ethernet Hubs would present female RJ45's, and provided a bus between all the transmit and receive pins. This meant that although you had separate physical channels for transmit and receive, you still had to handle collisions, where two nodes transmit at the same time.

This meant that, realistically, 10Base-T could not reach 10Mbps.

Ethernet: Duplexing

Throughout the 10Base-T and well into the 100Base-TX eras, networks slowly migrated away from Ethernet 'Hubs' towards Ethernet 'switches' (also referred to as 'bridges').

A switch's primary difference is that it has its own Ethernet receive and transmit chips, which receive packets from one ingress interface, examine the Ethernet header destination address, and select only one egress port to transmit the Ethernet packet on (presuming it is not already busy).

A switch gradually builds a “Forwarding Database” by observing the packets it receives, and recording the source address and the port it received each packet on.

This enabled Ethernet hosts to transmit in “Full Duplex”, at full line speed, receiving and transmitting in parallel. No collisions.
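
The learning behaviour can be sketched in a few lines. This is a toy model, not any vendor's implementation: the class and method names are made up for illustration, and flooding frames for unknown or broadcast destinations is an assumption based on how switches conventionally behave.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy Ethernet switch that learns which port each MAC address lives on."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.fdb = {}  # forwarding database: source MAC -> port last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Record the sender's port, then return the list of egress ports."""
        self.fdb[src_mac] = in_port
        out_port = self.fdb.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown or broadcast destination: flood everywhere but the source.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the port the frame arrived on: nothing to do.
            return []
        return [out_port]


sw = LearningSwitch([1, 2, 3])
print(sw.receive(1, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))  # flooded
print(sw.receive(2, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa"))  # single port
```

The first frame is flooded because the switch has not yet seen the destination; the reply goes out exactly one port.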

A day in the Life of your Laptop

On your OS's desktop, there is some widget that lets you pick Wireless-Ethernet networks, distinguished by ESSID. This is just Ethernet over 802.11(mumble), some form of spread-spectrum microwave, with some quirks.

One quirk is that your NIC will associate with an access point, which in essence establishes a connection for your NIC with the AP's Ethernet broadcast domain. From here down, everything looks like Ethernet packets.

These Ethernet packets will probably be carrying a number of things, but we're only interested in IP (Internet Protocol) and ARP (Address Resolution Protocol) packets.

ARP provides a way for hosts to find the Ethernet MAC address for a given IP address, if it is on the same broadcast domain.

Getting Configured

You can't talk to other folks on the Internet unless you have an IP address (so they can send replies).

So the first thing your laptop will do once it is associated is to start its Dynamic Host Configuration Protocol (DHCP) client. The DHCP client uses a finite state machine to retain a consistent and correct configuration given changes in administrative policy (IP renumbering, changes in service addresses).

The client's initial DHCP packets are transmitted to the broadcast address. Any servers will reply to the client's unicast MAC address (with some complicated exceptions), offering it an address configuration as well as service locators.

The client selects a configuration and requests it.
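
A heavily simplified sketch of that finite state machine follows. The state names are taken from the DHCPv4 specification (RFC 2131); the event names are invented for illustration, and the reboot states, timers and retransmissions are all left out.

```python
# (current state, event) -> next state. Event names are hypothetical labels.
TRANSITIONS = {
    ("INIT", "send_discover"): "SELECTING",
    ("SELECTING", "offer_selected"): "REQUESTING",
    ("REQUESTING", "ack"): "BOUND",
    ("REQUESTING", "nak"): "INIT",
    ("BOUND", "t1_expired"): "RENEWING",
    ("RENEWING", "ack"): "BOUND",
    ("RENEWING", "nak"): "INIT",
    ("RENEWING", "t2_expired"): "REBINDING",
    ("REBINDING", "ack"): "BOUND",
    ("REBINDING", "nak"): "INIT",
    ("REBINDING", "lease_expired"): "INIT",
}


def step(state, event):
    """Return the next state; events that do not apply are ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="INIT"):
    """Feed a sequence of events through the machine and return the end state."""
    for event in events:
        state = step(state, event)
    return state


print(run(["send_discover", "offer_selected", "ack"]))
```

A client that gets no answer while renewing and rebinding falls all the way back to INIT and starts over, which is how it picks up a renumbered network.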

IP Addresses

IP version 4 addresses are 32-bit fields, usually represented by octets: 10.1.2.3

An IP “subnet” is a specific region of the 32-bit space. 10.1.2.0/24, for example, is all those addresses where the first 24 bits are “10.1.2”, and the remaining 8 bits are of any value.

The IP packet header has lots of fields; let's just say it at least has a source and destination IP address.

To forward a packet from your laptop to another on the same network, your laptop notices “10.1.2.4” is inside “10.1.2.0/24”, and uses ARP to unicast directly.
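
The "is it inside my subnet?" test can be tried with Python's standard ipaddress module. This is only a sketch; the function name is made up for illustration.

```python
import ipaddress

# The example subnet from above: the first 24 bits are fixed at 10.1.2.
subnet = ipaddress.ip_network("10.1.2.0/24")


def is_on_link(destination):
    """True if the destination falls inside our own subnet."""
    return ipaddress.ip_address(destination) in subnet


print(is_on_link("10.1.2.4"))   # same subnet: ARP for it and unicast directly
print(is_on_link("10.1.3.4"))   # different subnet: needs a router
print(subnet.num_addresses)     # 8 free bits
```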

When you want to talk to someone on another network, it gets complicated.

IP Routing Basics

One of the things you got from DHCP was a “routers” option. This lists a number (usually one) of IP addresses inside your subnet to which you should direct packets to reach the rest of the world. These are generally referred to as “default routers.”

Any route is a pair of values: a prefix, and a destination to forward that prefix to.

The default route is simply the 0.0.0.0/0 prefix directed to the listed default routers' IP addresses. Your laptop also carries a 10.1.2.0/24 route in its table, pointing to your NIC.

To route a packet outward, you start with the most specific route(s) in your table, and work down to the least specific. The route matches if the destination IP address is within the subnet. If no route matches, you emit an ICMP error.
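
A minimal sketch of that most-specific-first search, using the laptop's two routes. The default router address 10.1.2.1 is a made-up example, and a real stack would use a trie rather than sorting a list on every lookup.

```python
import ipaddress

# (prefix, where to send matching packets). 10.1.2.1 is a hypothetical router.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "10.1.2.1"),
    (ipaddress.ip_network("10.1.2.0/24"), "on-link"),
]


def lookup(destination, routes=ROUTES):
    """Return the target of the most specific matching route, or None."""
    addr = ipaddress.ip_address(destination)
    # Longest prefix first, so more specific routes win over the default.
    by_specificity = sorted(routes, key=lambda r: r[0].prefixlen, reverse=True)
    for prefix, target in by_specificity:
        if addr in prefix:
            return target
    return None  # no route at all: this is where the ICMP error comes from


print(lookup("10.1.2.4"))    # matches the /24
print(lookup("192.0.2.1"))   # falls through to the default route
```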

DNS

IP addresses are hard to read and type, so we use the Domain Name System to map names to resources.

Now that your laptop is on the network, you start up your web browser, and try for http://www.isc.org/.

“www.isc.org” is not an IP address, it is a domain name. To find www.isc.org's IP address, your laptop performs a recursive DNS query against a nameserver (whose IP address was provided by DHCP). The recursive nameserver is an “assist service” extended by your network administrator.

It recursively follows DNS delegations from the root nameservers (the silent dot after org), the GTLD nameservers (org), and finally their delegation to ISC's own nameservers (isc.org), which reply with the “A record” of www.isc.org. All these nameservers are referred to as 'Authoritative' nameservers.
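
The order of delegations for a name can be listed with a small sketch. It assumes a delegation at every label, which is not always true in the real DNS, and it does no network queries at all.

```python
def delegation_chain(name):
    """List the zones a resolver walks through, starting from the root."""
    labels = name.rstrip(".").split(".")
    zones = ["."]  # the root, i.e. the silent trailing dot
    # Add one label at a time from the right, stopping before the full name.
    for i in range(len(labels) - 1, 0, -1):
        zones.append(".".join(labels[i:]) + ".")
    return zones


print(delegation_chain("www.isc.org"))
```

For www.isc.org this yields the root, then org, then isc.org, which is the zone that finally answers for the name itself.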

UDP and TCP

DHCP and DNS are protocols that run over UDP (over IP over...), although DNS can also be carried by TCP.

They are really just UDP payload data; the UDP port essentially directs the packet to a buffer to be sent to an application (or discarded).

HTTP, to reach http://www.isc.org/, wants to open a TCP connection to the HTTP port 80 on www.isc.org. Now that your laptop has this IP address in hand, it can do that.

TCP and UDP differ mainly in that if a UDP packet is lost, no one cares (the application must retransmit). TCP's whole purpose, however, is to make sure a stream of data written in on one end reliably reaches the other end, as fast as reasonably possible. It tracks RTTs, and schedules retransmissions. It negotiates and then uses a 'window' of data to completely use the network between the nodes. The application is unaware of all this.

IP Routing Again

You understand Routing Basics, but how does your default router know how to direct packets destined to ISC?

It has its own routing table. It might be simple, like yours, using another default route. At some point, it will reach a router that is running “default-less”, that has loaded the full table of routes for the whole Internet.

Networks advertise routes for their own address space to their customers, peers and transit providers using the Border Gateway Protocol. These peers then (selectively) extend the announcement to their own customers, peers and transit providers. This goes around the world.

A BGP route is again just a prefix, with a destination address, and an AS-list to perform loop detection. When an IP packet matches a BGP destination route, the destination address is picked up and re-searched on the routing table.
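
The loop detection part is simple enough to sketch. The AS numbers here come from the range reserved for documentation, and the function names are made up; real BGP carries many more attributes than this.

```python
LOCAL_AS = 64500  # our own AS number (a documentation-range value)


def accept_route(as_path, local_as=LOCAL_AS):
    """Reject any announcement that has already passed through our own AS."""
    return local_as not in as_path


def advertise(as_path, local_as=LOCAL_AS):
    """Prepend our AS number before passing the route on to a neighbor."""
    return [local_as] + list(as_path)


print(accept_route([64501, 64502]))          # never seen us: fine
print(accept_route([64501, 64500, 64502]))   # we are already in the list: loop
print(advertise([64501, 64502]))
```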

The IGP

BGP was first laid down to carry those border routes; the routes external to the network, and to advertise least-specifics for the local network's address space.

Recursive lookups on its table must eventually be found in the internal network, as a directly connected route on the current router.

So internal or directly attached networks are commonly advertised to all the routers in the network using IS-IS or OSPF – IGP Routing protocols. These protocols have different design and limitations – they converge faster and support load balancing, but do not scale to large numbers of routes.

Many networks have grown so large that they offload portions of the IGP into BGP itself; this practice is called “iBGP”, although the protocol is identical.

Route Caching

When a route is finally found, it is worthwhile not to repeat all that recursive effort. So, the router will insert an entry in a cache.

There are many caching approaches, the most common is Flow Caching, where the IP packet's source and destination IP address and TCP/UDP ports are combined together to form a unique key in the cache (usually a hash table).

This has a tendency to provide very stable RTTs, an advantage over other load-balance related caching techniques (like round robin).
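
A minimal sketch of a flow cache, using a Python dict as the hash table. The packet layout and the slow_lookup callback are stand-ins invented for illustration; they are not any router's actual data structures.

```python
flow_cache = {}  # flow key -> egress decision


def flow_key(packet):
    """Combine addresses and ports into one hashable key."""
    return (packet["src_ip"], packet["dst_ip"],
            packet["src_port"], packet["dst_port"])


def forward(packet, slow_lookup):
    """Use the cached decision if there is one, else do the full lookup once."""
    key = flow_key(packet)
    if key not in flow_cache:
        flow_cache[key] = slow_lookup(packet["dst_ip"])
    return flow_cache[key]


lookups = []


def slow_lookup(dst_ip):
    """Stand-in for the recursive routing table search."""
    lookups.append(dst_ip)
    return "if0"


pkt = {"src_ip": "10.1.2.3", "dst_ip": "192.0.2.1",
       "src_port": 40000, "dst_port": 80}
forward(pkt, slow_lookup)
forward(pkt, slow_lookup)
print(len(lookups))  # the second packet of the flow hit the cache
```

Because every packet in a flow maps to the same key, they all take the same path, which is where the stable RTTs come from.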

In earlier router architectures, the cache lived on the line card holding the interfaces. Today, modern routers prepopulate the cache from routing information.

Internet Growth

This graph only measures the number of PTR records registered in the DNS. The actual number of IP addresses in use is probably much higher!

There are only 2^32 addresses, but don't worry: IPv6! Someday...

Keeping up with Growth

The main currency of Internet connectivity is bandwidth. Ports for users is just capex. Bandwidth charges are forever.

One trouble is that network interfaces classically delivered by Telco companies come with fixed monthly costs associated with their maximum line rate capacity.

T1: 1.5Mbps, T3: 45Mbps, OC-3: 155Mbps...

This creates a “stepladder effect” in ISP operations. Your userbase grows, your T1 is getting full. You have to double your expenses on a second T1 to support a fraction of a percent more customers.

The ISP profit game was about matching your growth and expense curves optimally, timing your buildouts in advance of your growth.
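
The stepladder is easy to see in a sketch. The monthly price is a made-up number; only the shape of the curve matters.

```python
import math

T1_MBPS = 1.5            # capacity of one T1, as rounded above
T1_MONTHLY_COST = 1000   # hypothetical price per line per month


def lines_needed(demand_mbps):
    """Whole T1s required to carry the demand; you cannot buy half a line."""
    return max(1, math.ceil(demand_mbps / T1_MBPS))


def monthly_cost(demand_mbps):
    """Cost steps up by a full line each time demand crosses a multiple of a T1."""
    return lines_needed(demand_mbps) * T1_MONTHLY_COST


for demand in (1.0, 1.5, 1.51, 3.0, 3.1):
    print(demand, lines_needed(demand), monthly_cost(demand))
```

Going from 1.5 to 1.51 Mbps of demand doubles the bill.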

All your Profits are Belong...

Maybe you've spotted the flaw in this little plan. No matter how much an ISP grows, it just gives all its profits to the Telco! Customers pay just enough for more bandwidth.

This continued for a long time, until recent years (~2000) with the advent of fibre based long-haul and metropolitan networks.

This allowed ISP's to get their own “dark” fibre (or a lambda on a shared fibre using WDM). No one can tell you how fast your networking goes over light. They just carry your light around.

So the Telco's just bought the ISP's instead.

Still, today we bill more in terms of 95%ile of bandwidth used, and not so much in terms of connect rate.
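
One common way to compute a 95th percentile bill is sketched below: sort the usage samples, throw away the top 5%, and charge for the highest one left. Sampling intervals and exact rounding rules vary between providers, so treat the details as assumptions.

```python
def percentile_95(samples_mbps):
    """Highest sample remaining after discarding the top 5% of samples."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # Integer form of ceil(0.95 * n) - 1, avoiding floating point rounding.
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]


# 95 quiet samples and 5 large bursts: the bursts are not billed.
usage = [10] * 95 + [1000] * 5
print(percentile_95(usage))
```

Short bursts are free; sustained load is what you pay for.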

Some notes on Telco Lines

Your home phone is still (unfortunately) analog. The Telco does a Digital-To-Analog conversion, pushing pulses towards your home carrying the audio signal (driving your speaker). When you speak, an Analog-To-Digital conversion places your mic signal into a digital stream that is call-routed outwards.

This digital stream is 56Kbps of audio, carried in a 64Kbps timeslice (the remaining 8Kbps is used for control).

These timeslices are MUXed together to form telecommunications backbone lines. A T1 is 24 of these “DS0”'s. A DS-3 is 28 T1's MUXed together. Call routing protocols connect and assign timeslices throughout the network.
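
The T1 arithmetic works out as follows. The 8Kbps of framing overhead is the standard figure for a T1 and is an addition here, not something stated in the slides.

```python
DS0_BPS = 64_000         # one timeslice
DS0_PER_T1 = 24          # timeslices MUXed into a T1
T1_FRAMING_BPS = 8_000   # assumption: standard T1 framing overhead

t1_payload_bps = DS0_PER_T1 * DS0_BPS
t1_line_rate_bps = t1_payload_bps + T1_FRAMING_BPS

print(t1_payload_bps)     # channel capacity
print(t1_line_rate_bps)   # what is usually rounded to "1.5Mbps"
```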

ISPs first started by renting fixed allocations in these networks (a whole or fractional T1 or T3 at a time).

PSTN Implementation

The stream of digital audio bits that comprise the timeslices inside each DS0, carried in whatever level of hierarchical switching, is transmitted between nodes as block waves, in its digital form using Alternate Mark Inversion (AMI).

Positive or negative voltage is a 1. Neutral is a zero. Each successive 1 is sent with the opposite polarity of the last, so the transmitter produces an equal number of positive and negative voltage waves.

This makes for little if any “DC Bias” in the DS lines.
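
A minimal AMI encoder sketch, with line voltages represented as +1, 0 and -1. Starting with a positive pulse is an arbitrary choice made for the example.

```python
def ami_encode(bits):
    """Encode bits as AMI levels: 0 stays neutral, 1s alternate polarity."""
    last_mark = -1  # so that the first 1 goes out as +1
    levels = []
    for bit in bits:
        if bit:
            last_mark = -last_mark  # flip polarity for every mark
            levels.append(last_mark)
        else:
            levels.append(0)
    return levels


line = ami_encode([1, 0, 1, 1, 0, 1])
print(line)
print(sum(line))  # marks cancel out, hence little DC bias
```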

One end transmits, using its own internal clock. The far end synchs up to the signal as it is received, and transmits in the other direction using this signal to provide clocking.

In order to maintain clock synchronization, for DS1 a minimum of 1 bit per every eight must be set high. In the attempt to send 8 binary zeroes, the transmitting node would indicate a bipolar violation. Later we used B8ZS and B3ZS.

ISDN – The Upgrade that Wasn't

The Integrated Services Digital Network (ISDN) could essentially be understood as a “Version 2” of the PSTN.

Basic Rate Interfaces were two pair instead of one and delivered two 64Kbps digital channels (in AMI form just as before) and one 16Kbps D-channel.

PRI were carried by DS1, and had 23 64Kbps channels and one 64Kbps D-channel.

You could make digital voice or data “calls” over any of these channels (even opening multiple channels to one destination).

Because it was digital, and also provided a means for unadulterated data transmission, Telco companies saw it as a “value added service.”

So it cost extra to get one. Digital voice calls were free just like on a normal phone, but data connections were costly.

Data over Voice

So, what you do is you modify your modem's DSP firmware.

When you get a voice call, check to see if it passes HDLC framing.

If it doesn't pass, start modem negotiations. If it does pass, push the raw digital bits through to your processor for digital PPP negotiation.

The telco in Washington actually tried to get a law passed to make this illegal.

Just how Fast

Well, OC-768 is running these days at about 40Gbps.

At IETF 71, Philadelphia, Comcast delivered transit for the event via a single 100 Gigabit Ethernet line (the prototype).

Because it was the prototype, the IETF network didn't actually have equipment to receive it. So the interface towards IETF was actually 10 individual 10Gig-E's.

The Ethernet and SONET specifications often do what they can to just surpass each other. The simple reason is that SONET facilitates ATM (Telco profit) whereas Ethernet facilitates IP (ISP profit).

The last DWDM system I saw put 64 lambdas on a single fibre.

So let's say, 6.4 Tbps is approachable.

References

I admit that I lifted the Ethernet framing picture from Wikipedia. It appears to be under an appropriate license.

http://en.wikipedia.org/wiki/Ethernet_II_framing

The Internet Growth image is from ISC's 2008 domain survey, available on our website (but please do ask before including it in publications, we'll probably say yes, but we like to hear from you anyway).

http://www.isc.org/ops/ds/

The rest is all from memory. I make no claims any of it is accurate.
