
Page 1: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.

Washington University in St. Louis

GigE for the MSR

Fred Kuhns

fredk@arl.wustl.edu

Page 2: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Fred Kuhns - 1/9/01

Ethernet Forwarding Scenario 1

[Figure: network topology]

MSR (ports shown: P0, P1, P3)
  P1, Port 1:  IP: 192.163.204.2   MAC: 00:00:5E:04:00:01

Ethernet Switch (attached to MSR P1)
  Host         IP: 192.163.204.2   MAC: 08:00:20:7C:E3:25
  Host         IP: 192.163.204.3   MAC: 08:00:20:7C:F2:45
  Router
    Port 0:    IP: 192.163.204.4   MAC: 00:01:03:7C:23:03
    Port 1:    IP: 192.163.150.1   MAC: 00:01:03:7C:56:34

Ethernet Switch (attached to Router Port 1)
  Host         IP: 192.163.150.2   MAC: 00:40:33:A3:4C:04
  Host         IP: 192.163.150.3   MAC: 08:00:20:54:6C:4A

Packet arrives with destination host on local network. Output port must map destination IP address to MAC address.

Packet: Destination Addr: 192.168.204.2 | IP hdr | data

Use the Address Resolution Protocol to map 192.168.204.2 to 08:00:20:7C:E3:25. Encapsulate the datagram in an Ethernet frame and send.

Page 3: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Ethernet Forwarding Scenario 2

[Figure: network topology]

MSR (ports shown: P0, P1, P3)
  P1, Port 1:  IP: 192.163.204.2   MAC: 00:00:5E:04:00:01

Ethernet Switch (attached to MSR P1)
  Host         IP: 192.163.204.2   MAC: 08:00:20:7C:E3:25
  Host         IP: 192.163.204.3   MAC: 08:00:20:7C:F2:45
  Router (forwards to final destination host)
    Port 0:    IP: 192.163.204.4   MAC: 00:01:03:7C:23:03
    Port 1:    IP: 192.163.150.1   MAC: 00:01:03:7C:56:34

Ethernet Switch (attached to Router Port 1)
  Host         IP: 192.163.150.2   MAC: 00:40:33:A3:4C:04
  Host         IP: 192.163.150.3   MAC: 08:00:20:54:6C:4A

Packet arrives with destination host NOT on locally attached network. Output port must send to the next hop router.

Packet: Destination Addr: 192.168.150.2 | IP hdr | data

Next hop router IP address must be used in the ARP request: map 192.168.204.4 to 00:01:03:7C:23:03. Encapsulate datagram in Ethernet frame and send.
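The difference between the two scenarios is only which IP address gets resolved by ARP. A minimal sketch of that decision follows; the function name, the interface address 192.168.204.1 and the /24 prefix length are assumptions made for illustration and do not come from the slides.

```python
import ipaddress

def arp_target(dst_ip, my_ip, prefix_len, next_hop_ip):
    """Return the IP address whose MAC address must be resolved with ARP."""
    # Network attached to the output port (my_ip and prefix_len are assumed values).
    local_net = ipaddress.ip_interface(f"{my_ip}/{prefix_len}").network
    if ipaddress.ip_address(dst_ip) in local_net:
        return dst_ip        # Scenario 1: destination host is on the local network
    return next_hop_ip       # Scenario 2: send to the next hop router

# Scenario 1: ARP for the destination itself.
print(arp_target("192.168.204.2", "192.168.204.1", 24, "192.168.204.4"))
# Scenario 2: ARP for the next hop router.
print(arp_target("192.168.150.2", "192.168.204.1", 24, "192.168.204.4"))
```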

Page 4: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Ethernet Frame Format

Ethernet Header:
  Destination Address (6 B)
  Source Address (6 B)
  Ether Type (2 B)

IP Header:
  Version | H-length | TOS | Total length
  Identification | Flags | Fragment offset
  TTL | Protocol | IP Header checksum
  IP Source Address
  IP Destination Address

IP Datagram (continues after the IP header):
  Transport Header
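As an illustration of the 14-byte Ethernet header layout, the following sketch packs destination, source and Ether Type in network byte order. The helper names are hypothetical; the MAC addresses are taken from the forwarding scenario slides.

```python
import struct

ETHERTYPE_IPV4 = 0x0800

def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(octet, 16) for octet in mac.split(":"))

def ethernet_header(dst_mac, src_mac, ether_type=ETHERTYPE_IPV4):
    """Pack destination (6 B), source (6 B) and Ether Type (2 B), big-endian."""
    return struct.pack("!6s6sH", mac_to_bytes(dst_mac), mac_to_bytes(src_mac), ether_type)

hdr = ethernet_header("08:00:20:7C:E3:25", "00:00:5E:04:00:01")
print(len(hdr), hdr.hex())
```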

Page 5: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


IP Encapsulation in Ethernet Frames

Ethernet Encapsulation, RFC 894:
  dst address (6) | src address (6) | type 0800 | Data (46-1500) | FCS (4)
  Data includes Pad (0-46)

IEEE 802.3/802.2 encapsulation, RFC 1042:
  dst address (6) | src address (6) | len (2) | 802.2 LLC/SNAP | Data (38 - 1492) | FCS (4)
  802.2 LLC:  DSAP AA | SSAP AA | ctl 03
  802.2 SNAP: Org Code 00 | type 0800
  0 <= len <= 1500
  Data includes Pad (0-46)

• Ethernet frame size: 64 - 1518 Bytes
• if type <= 1500, then IEEE frame, otherwise Ethernet V2.
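The type/length rule above can be sketched as a small classifier. This follows the slide's rule literally (values up to 1500 are lengths); note that IEEE 802.3 assigns Ether Type values only from 1536 (0x0600) upward, so 1501-1535 are not valid in either format. The function name is an assumption.

```python
def classify_frame(type_or_len):
    """Classify a frame by the 2-byte field that follows the source address."""
    if not 0 <= type_or_len <= 0xFFFF:
        raise ValueError("field is 16 bits wide")
    if type_or_len <= 1500:
        return "IEEE 802.3 (RFC 1042)"   # field is a length
    return "Ethernet V2 (RFC 894)"       # field is an Ether Type

print(classify_frame(0x0800))  # IPv4 Ether Type
print(classify_frame(46))      # a length value
```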

Page 6: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


ARP Frame

Ethernet header:
  Destination Address (6 B)
  Source Address (6 B)
  Ether Type (2 B)

ARP message:
  Hardware Address Space (2 B)
  Protocol Address Space (2 B)
  Byte length of Hardware address = 6 (1 B)
  Byte length of Protocol address = 4 (1 B)
  Operation Code, 1 = Request / 2 = Reply (2 B)
  Hardware Address of Sender (6 B)
  Protocol Address of Sender (4 B)
  Hardware Address of Destination (6 B)
  Protocol Address of Destination (4 B)

Page 7: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


ARP Message Formats

Host A: Eth <eth-A>, IP <ip-A>
Host B: Eth <eth-B>, IP <ip-B>
Host A sends Request (01); Host B answers with Reply (02).

ARP Request
  Ethernet header:
    dst address  ff:ff:ff:ff:ff:ff
    src address  <eth-A>
    type         0806
  ARP message:
    has  0001
    pas  0800
    hl   6
    pl   4
    op   01
    sha  <eth-A>
    spa  <ip-A>
    tha  <??>
    tpa  <ip-B>
  pad
  FCS  xx

ARP Reply
  Ethernet header:
    dst address  <eth-A>
    src address  <eth-B>
    type         0806
  ARP message:
    has  0001
    pas  0800
    hl   6
    pl   4
    op   02
    sha  <eth-B>
    spa  <ip-B>
    tha  <eth-A>
    tpa  <ip-A>
  pad
  FCS  xx

Ethernet Frame with ARP Request/Reply - 64 Bytes:
  Ethernet Header (14 B) | ARP Message (28 Bytes for Request or Reply) | 18 Byte Pad | FCS (4 B)
  Ethernet Data - pad with zeros to 46 Bytes
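The request layout can be checked by building the frame byte by byte. This is a sketch only: the function name is hypothetical, the FCS is left out because it is normally appended by the hardware, and the unknown target hardware address is filled with zeros as a common convention.

```python
import socket
import struct

def build_arp_request(sender_mac, sender_ip, target_ip):
    """Build an ARP request frame without FCS: 14 B header + 28 B ARP + 18 B pad."""
    broadcast = b"\xff" * 6
    eth_header = struct.pack("!6s6sH", broadcast, sender_mac, 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        0x0001,                       # has: Ethernet
        0x0800,                       # pas: IP
        6, 4,                         # hl, pl
        1,                            # op: Request
        sender_mac, socket.inet_aton(sender_ip),
        b"\x00" * 6,                  # tha: unknown, zero filled (assumption)
        socket.inet_aton(target_ip),
    )
    pad = b"\x00" * (46 - len(arp))   # pad Ethernet data to 46 bytes
    return eth_header + arp + pad

frame = build_arp_request(bytes.fromhex("00005e040001"), "192.168.204.1", "192.168.204.2")
print(len(frame) + 4)  # frame length including the 4-byte FCS
```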

Page 8: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


IP over ATM (RFC 791 and 2684)

IP Header:
  Version | H-length | TOS | Total length
  Identification | Flags | Fragment offset
  TTL | Protocol | Header checksum
  Source Address
  Destination Address
  Options ??

IP Datagram (continues after the IP header):
  IP data (transport header and transport data)

AAL5 padding (0 - 40 bytes)

AAL5 Trailer:
  CPCS-UU (0) | CPI (0) | Length (IP packet + LLC/SNAP)
  CRC

Page 9: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


IP Header Fields (RFC 791)
• Version - support IPv4 (4)
• Header Length - Length in 32 bit words (>= 5)
• TOS - layout: Prec. | D | T | R | 0 | 0
• Total Length - Length of datagram in octets
• Id - Assists in reassembling fragments
• Flags - layout: 0 | DF | MF
  – DF - 1 = Don't Fragment, MF - 1 = More Fragments
• Fragment Offset - Where fragment belongs, offset is in units of 8 octets

TOS Precedence Field:
  111 - Network Control
  110 - Internetwork Control
  101 - CRITIC/ECP
  100 - Flash Override
  011 - Flash
  010 - Immediate
  001 - Priority
  000 - Routine

Remaining TOS Fields:
  D - 1 = Low delay
  T - 1 = High Throughput
  R - 1 = High Reliability
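A short sketch of how the TOS byte and the flags/fragment-offset word decode, following the RFC 791 bit layout (3 precedence bits, then D, T, R; 3 flag bits followed by a 13-bit offset counted in units of 8 octets). The function and key names are assumptions for illustration.

```python
def decode_tos(tos):
    """Split the 8-bit TOS byte into precedence and the D, T, R bits."""
    return {
        "precedence": tos >> 5,
        "low_delay": bool(tos & 0x10),
        "high_throughput": bool(tos & 0x08),
        "high_reliability": bool(tos & 0x04),
    }

def decode_flags_fragment(word):
    """Split the 16-bit word that holds the flags and the fragment offset."""
    return {
        "DF": bool(word & 0x4000),
        "MF": bool(word & 0x2000),
        "offset_octets": (word & 0x1FFF) * 8,  # offset field counts 8-octet units
    }

print(decode_tos(0b11110000))
print(decode_flags_fragment(0x2001))
```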

Page 10: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


IP Header Fields

• TTL - router must decrement; if 0, then discard packet
• Protocol - UDP/TCP/ICMP/RSVP to name a few
• Header Checksum - 16 bit one's complement of the one's complement sum of all 16 bit words in header
• Source Address - Sending host's IP address
• Destination Address - Destination host's IP address
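The header checksum definition can be written out directly. The sketch below is one straightforward way to compute it; the function name and the sample 20-byte header are illustrative and not taken from the slides. Computing the checksum over a header that already carries a correct checksum yields 0, which is how a receiver verifies it.

```python
import struct

def ip_header_checksum(header):
    """16-bit one's complement of the one's complement sum of all 16-bit words."""
    if len(header) % 2:
        raise ValueError("header length must be a multiple of 2 bytes")
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Sample header with the checksum field (bytes 10-11) set to zero.
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
print(hex(ip_header_checksum(sample)))
```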

Page 11: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Packet Routing Within MSR

[Figure: ingress and egress paths through the MSR. Each side shows a Link Interface, an FPX (FIPL, shim proc., add shim / rem shim) and an SPC (shim demux, shim update, IP proc, plugins); the two sides are connected through the WUGS.]

Ingress (from previous hop router or endstation):
  InVC: Inbound VC = SPI + ExtBase, 0 <= SPI <= 15
  Currently support at most 4 Inbound VCs: One for Ethernet or Four for ATM
  To the WUGS: out port + IntBase (64 ... 127)

Egress:
  From the WUGS: in port + IntBase (64 ... 127)
  OutVC: Outbound VC = SPI + ExtBase, 0 <= SPI <= 15
  Currently support at most 4

ATM uses VCs as link layer address.
Ethernet: Base VC used for directly attached hosts, subports are for next hop routers.

IP processing for FPX
  1. Broadcast and Multicast destination address
  2. IP options
  3. ICMP messages
  4. Packet not recognized

Current VCI Support
  1) 64 Ports (PN)
  2) 16 sub-ports (SP)

Page 12: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


GigE Link Interface

Direction: from FPX/SPC to next hop or endstation.
  Input:  IP Header | data | AAL5 trailer
  Output: Ethernet | IP Header | data

Pkt VC = 50: endsystem, broadcast or multicast address.
  Send to pkt->dst
  if bcast or mcast
    map to eaddr
  else
    resolve w/ARP

if VC != 50: to a next hop router. Lookup VC in VIN table, returns IP used for ARP lookup (support N = 4).
  NH #1 = Base + 1 = 51
  NH #2 = Base + 2 = 52
  NH #3 = Base + 3 = 53

VIN Table - 4 entries
  VC   MyIP    NhIP
  50   MyIP0   0 0
  51   MyIP0   NhIP0
  52   MyIP1   NhIP1
  53   MyIP2   NhIP2

ARP Table (M Entries)
  IP     MAC
  IP1    MAC1
  ...    ...
  IPM    MACM

• Map multicast or broadcast to ethernet address.
• Add Ethernet header using the derived destination address and our source address. Protocol is IP.
• Software creates VIN table at boot time by writing to interface.
• If ARP table lookup fails, send ARP request to broadcast address, drop packet. No retries are made.
• No ARP entry aging!
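The outbound decision described on this slide can be sketched as follows for the unicast case. The dictionary layout, the function names and the sample addresses are assumptions for illustration; multicast and broadcast mapping is left out.

```python
BASE_VC = 50  # Base VC used for directly attached hosts

def arp_lookup_ip(rx_vc, dst_ip, vin_table, base_vc=BASE_VC):
    """Pick the IP address to resolve: the packet destination or the next hop."""
    if rx_vc == base_vc:
        return dst_ip
    return vin_table[rx_vc]["nh_ip"]

def resolve(rx_vc, dst_ip, vin_table, arp_table):
    """Return the action for one packet; no retries and no queuing of the packet."""
    ip = arp_lookup_ip(rx_vc, dst_ip, vin_table)
    mac = arp_table.get(ip)
    if mac is None:
        return ("send_arp_request_and_drop", ip)
    return ("send_frame", mac)

# Hypothetical table contents.
vin_table = {51: {"my_ip": "192.168.204.1", "nh_ip": "192.168.204.4"}}
arp_table = {"192.168.204.4": "00:01:03:7C:23:03"}

print(resolve(51, "192.168.150.2", vin_table, arp_table))
print(resolve(50, "192.168.204.2", vin_table, arp_table))
```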

Page 13: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Ethernet Assigned Numbers
• RFC 1700 obsoleted by online database at IANA:
  – http://www.iana.org/assignments/ethernet-numbers
• Ethernet Address - 6 octets:
  – 3 high-order octets = Organizationally Unique Identifier (OUI)
  – 3 low-order octets = the interface number
• Multicast bit = lsb of the most significant byte (xxxx xxx1)
  – first byte odd => multicast or broadcast
  – first byte even => unicast address
  – multicast address = ((OUI | 0x010000) << 24) | Group_ID
• Ethernet Broadcast: FF:FF:FF:FF:FF:FF
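The multicast-bit test and the address construction can be sketched as below. The formula is written with OR and with the multicast bit placed at 0x010000 within the 24-bit OUI, which is what produces 01:00:5E:... from the OUI 00:00:5E; the function names are assumptions.

```python
def is_group_address(mac):
    """True for multicast or broadcast: lsb of the first octet is 1."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)

def multicast_address(oui, group_id):
    """Combine a 24-bit OUI (multicast bit set) with a 24-bit group ID."""
    return ((oui | 0x010000) << 24) | group_id

print(is_group_address("FF:FF:FF:FF:FF:FF"))
print(is_group_address("08:00:20:7C:E3:25"))
print(hex(multicast_address(0x00005E, 0x000001)))
```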

Page 14: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


IP and Ethernet Multicast
• IANA has allocated address block with OUI = 00:00:5E
  – Used for unicast addresses for "IETF standard track protocols"
  – Half of Multicast addresses reserved for IP, remaining for "special use". Leaves 23 bits for multicast addresses:
    • 01:00:5E:00:00:00 to 01:00:5E:7F:FF:FF
  – Could use this block for our interface, see ethernet numbers
• IP Multicast
  – Class D address, 0xE0000000 + 28 Bit Group ID
  – 224.0.0.0 to 239.255.255.255 (0xE0000000 - 0xEFFFFFFF)
• IP to Ethernet Mapping
  – RFC 1112 - Host Extensions for IP Multicasting
  – Non-unique mapping: 28 bit IP group to 23 bit Ethernet group
    • 32 IP multicast groups per mapped ethernet multicast address.

Page 15: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


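Multicast: IP to Ethernet Mappings
• Network Byte Ordering, Internet Standard Bit order (Big-Endian)

Ethernet multicast address (bit 47 = MSB ... bit 0 = LSB):
  0000 0001 0000 0000 0101 1110 0xxx xxxx xxxx xxxx xxxx xxxx
  – High-order 24 bits: Block of Ethernet Multicast Address (01:00:5E); the Multicast Bit is the lsb of the first octet
  – Diagram labels: Multicast Bit, Internet Bit
  – Low-order 23 bits: taken from the IP group address

IP address (msb ... lsb):
  1110 xxxx xxxx xxxx xxxx xxxx xxxx xxxx
  – 1110: Class D (Multicast)
  – Next 5 bits: Not Used in IP to Ethernet Mapping
  – Low-order 23 bits: copied into the Ethernet multicast address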
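The RFC 1112 mapping shown in the bit diagram can be sketched in a few lines: the low-order 23 bits of the Class D address are placed into the 01:00:5E:00:00:00 block. The function name is an assumption. Because 5 bits of the group are discarded, 32 IP groups share each Ethernet address, as the last example shows.

```python
import ipaddress

def ip_multicast_to_mac(group):
    """Map an IPv4 multicast group to its Ethernet multicast address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a Class D address")
    low23 = int(addr) & 0x7FFFFF            # keep the low-order 23 bits
    mac = 0x01005E000000 | low23            # place them in the IANA block
    return ":".join(f"{b:02x}" for b in mac.to_bytes(6, "big"))

print(ip_multicast_to_mac("224.0.0.1"))
print(ip_multicast_to_mac("239.255.255.255"))
print(ip_multicast_to_mac("224.128.0.1"))   # collides with 224.0.0.1
```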

Page 16: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


IP Broadcast
• No Direct Impact on GigE Interface
• IP Broadcast: by default, we will not forward directed broadcasts.
  – Limited broadcast:
    • {-1, -1}. Must not be forwarded. Destination address only.
  – Directed broadcast:
    • {Network-Number, -1}. Destination address only.
  – Subnet Directed Broadcast:
    • {Network-Number, Subnet-Number, -1}
  – Directed Broadcast to all subnets:
    • {Network-Number, -1, -1}

Page 17: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Unicast - If we use the IANA Block

Ethernet address (bit 47 = MSB ... bit 0 = LSB):
  0000 0000 0000 0000 0101 1110 0000 0100 xxxx xxxx xxxx xxxx
  – Multicast Bit set to 0
  – High-order 32 bits: IANA Block of Ethernet Addresses (00:00:5E:04)
  – Low-order 16 bits: ARL Interface Number

Page 18: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


GigE Link Interface

Direction: from next hop or endstation to FPX/SPC.
  Input:  Ethernet | IP Header | data
  Output: IP Header | data | AAL5 trailer, sent on the Base VC to FPX/SPC

ARP Table (M Entries)
  IP     MAC
  IP1    MAC1
  ...    ...
  IPM    MACM

* Unicast MAC address filtering

receive ethernet frame: eth
if (eth->type == ARP) {
    if (eth->arp->has != Ethernet/0001) Drop Frame
    if (eth->arp->pas != IP/0800) Drop Frame
    update {eth->arp->spa, eth->arp->sha} in ARP table
    if (eth->arp->tpa NOT in {MyIP0, MyIP1, MyIP2})
        Drop Frame // target IP not ours
    if (eth->arp->op == Request/01) {
        swap source and target ARP info
        set operation to Reply
        set ether header src and dst address
        send reply
    } // Already handled eth->arp->op == Reply/02
      // when updated cache above
} else if (eth->type == IPv4) {
    remove ethernet header, padding and CRC
    add AAL5 trailer and required padding
    break into cells and send on default Base VC
} else {
    Error, drop packet
}
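The ARP branch of the receive logic can be sketched as a small function. This is an illustration under assumptions: the ARP message is represented as a dictionary keyed by the slide's field abbreviations, the ARP table is a plain dictionary, and a dropped frame is modeled by returning None.

```python
def handle_arp(arp, my_ips, my_mac, arp_table):
    """Process one received ARP message; return the reply to send, or None."""
    if arp["has"] != 0x0001 or arp["pas"] != 0x0800:
        return None                          # not Ethernet/IP: drop frame
    arp_table[arp["spa"]] = arp["sha"]       # cache sender (covers replies too)
    if arp["tpa"] not in my_ips:
        return None                          # target IP not ours: drop frame
    if arp["op"] != 1:
        return None                          # reply: cache update above was enough
    # Request for one of our addresses: swap sender and target, set op to Reply.
    return {
        "has": 0x0001, "pas": 0x0800, "op": 2,
        "sha": my_mac, "spa": arp["tpa"],
        "tha": arp["sha"], "tpa": arp["spa"],
    }

table = {}
request = {"has": 1, "pas": 0x0800, "op": 1,
           "sha": "08:00:20:7C:E3:25", "spa": "192.168.204.2",
           "tha": "00:00:00:00:00:00", "tpa": "192.168.204.1"}
reply = handle_arp(request, {"192.168.204.1"}, "00:00:5E:04:00:01", table)
print(reply)
```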

Page 19: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Notes
• Packet received on ATM interface:
  – If received on Base_VC (i.e. 50) then
    • map IP destination (ip->dst_addr) to ethernet representation.
    • Unicast uses ARP table, multicast and broadcast use appropriate mapping.
  – Otherwise,
    • lookup VC in VIN table: Table entry index = RX_VC - Base_VC.
    • ARP the resulting Next Hop IP address.
  – This permits a simple mechanism for "tunneling" traffic to a gateway. This allows us to support directed broadcast and provides a convenient mechanism for testing.
• Packet received on Ethernet interface:
  – if IPv4 then send all (unicast, multicast and broadcast) to input port processor on the Base_VC (i.e. 50)

Page 20: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


ARP Cache
• IP Address = Network_Prefix.Host or simply Net.Host
  – Assume a prefix length of at least 24 bits, leaves 8 bits for the host
  – An interface can have at most 3 unique IP addresses
• Interface may communicate with at most 256 hosts per network
• Implement ARP cache as a table with 768 entries (3 * 256)
• See next slide

VIN Table
  Entry Number   Prefix Mask   Local IP Address   Next Hop IP Address
  0              Mask0         MyIP0              NH0
  1              Mask1         MyIP1              NH1
  2              Mask2         MyIP2              NH2

ARP Table
  Network   IP         Ethernet
  Net 0     IP0,0      Ether0,0
            ...        ...
            IP0,255    Ether0,255
  Net 1     IP1,0      Ether1,0
            ...        ...
            IP1,255    Ether1,255
  Net 2     IP2,0      Ether2,0
            ...        ...
            IP2,255    Ether2,255

Net 0 = Mask0 & MyIP0
Net 1 = Mask1 & MyIP1
Net 2 = Mask2 & MyIP2

Page 21: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Implementing the ARP Table

'get next packet':
// received frame from ATM interface
if (RX_VC == Base_VC)
    ipdst = ip->dst_addr;
else
    ipdst = VIN_Table[RX_VC - Base_VC].NextHop
// ipdst == IP Address of host we must send packet to
// determine network
for (i = 0; i < 3; i++) {
    if ((ipdst & Maski) == (MyIPi & Maski)) {
        // index on ipdst, the address being resolved
        index = (i << 8) | (ipdst & ~Maski)
        break;
    }
}
if (i == 3)
    drop packet, goto 'get next packet'
// i corresponds to the Network Number (0 - 2)
if (ArpTable[index].EtherAddress != 00:00:00:00:00:00) {
    construct ethernet frame
    send packet
    goto 'get next packet'
} else {
    send ARP Request for ipdst
    drop packet, goto 'get next packet'
}

VIN Table
  Entry Number   Prefix Mask   Local IP Address   Next Hop IP Address
  0              Mask0         MyIP0              NH0
  1              Mask1         MyIP1              NH1
  2              Mask2         MyIP2              NH2

ARP Table (accessed by index; don't need to store IP address)
  IP         Ethernet
  IP0,0      Ether0,0
  ...        ...
  IP0,255    Ether0,255
  IP1,0      Ether1,0
  ...        ...
  IP1,255    Ether1,255
  IP2,0      Ether2,0
  ...        ...
  IP2,255    Ether2,255
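The index computation can be tried out with a short sketch. It assumes the VIN table is a list of (mask, local IP) string pairs with at most 3 entries and that prefixes are at least 24 bits long, so the host part fits in 8 bits; the function name and the sample addresses are illustrative.

```python
import ipaddress

def arp_index(ipdst, vin_table):
    """Return the ARP table index (0..767) for ipdst, or None if no network matches."""
    dst = int(ipaddress.IPv4Address(ipdst))
    for i, (mask, my_ip) in enumerate(vin_table):
        m = int(ipaddress.IPv4Address(mask))
        if (dst & m) == (int(ipaddress.IPv4Address(my_ip)) & m):
            host = dst & ~m & 0xFF           # host bits, at most 8 with a /24 or longer prefix
            return (i << 8) | host           # network number selects a 256-entry block
    return None                              # no match: drop packet

vin = [("255.255.255.0", "192.168.204.1"), ("255.255.255.0", "192.168.150.1")]
print(arp_index("192.168.204.2", vin))
print(arp_index("192.168.150.3", vin))
print(arp_index("10.0.0.1", vin))
```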

Page 22: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Notes and Issues
• GigE Control Interface for Software configuration.
  1. Reset interface to defaults
  2. Clear ARP cache
  3. Read ARP table
  4. Read VIN table
  5. Read ethernet address
  6. Set VIN table entries and other registers
    • Set BASE VC (currently 50)
    • Set Entries in the VIN table
    • Add static ARP entries??

Page 23: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


Notes and Issues
• Comprehensive testing scenarios need defining
• Verify multicast and broadcast
• VC to control line card

Page 24: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


References
• RFC 1122 - Requirements for Internet Hosts
  – Must send and receive using RFC 894 - compliant
  – Should receive RFC 1042 mixed with RFC 894 - we do not
  – May send using RFC 1042 - we do not
  – Must use ARP
  – Must flush out-of-date ARP cache entries - not compliant
  – Must prevent ARP floods - we only try once
  – Should have configurable ARP cache timeout - no
  – Should save at least one (latest) unresolved (by ARP) packet - no
  – Must report broadcasts to IP layer - compliant
  – IP layer Must pass TOS to link layer - via the header
  – Must Not report no ARP entry as "destination unreachable" - compliant

Page 25: Washington WASHINGTON UNIVERSITY IN ST LOUIS fredk@arl.wustl.edu GigE for the MSR Fred Kuhns fredk@arl.wustl.edu.


References
• RFC 826: Address Resolution Protocol
  – Maps <protocol, address> to 48 bit Ethernet address
  – our processing differs in minor ways
• RFC 1700: Assigned Numbers
  – Ethertype values defined by RFC 1700
  – IP to ethernet multicast address mapping defined
• RFC 1812: Requirements for IPv4 Routers
  – Must not believe ARP reply if contains multicast or broadcast address - not compliant
  – Must be compliant with RFC 1122 - Partial
• Support Ethernet V2 only
  – RFC 894: IP encapsulation in Ethernet V2 - Supported
  – RFC 1042: IP encapsulation in 802.3 frames - Not Supported