Date posted: 21-Dec-2015
Checksum ‘offloading’
A look at how the Pro1000 NICs can be programmed to compute and insert TCP/IP checksums
Network efficiency
• Last time (in our ‘nictcp.c’ demo) we saw the amount of work a CPU would need to do when setting up an Ethernet packet for transmission in the TCP/IP protocol format
• In a busy network this amount of packet-computation becomes a ‘bottleneck’ that degrades overall system performance
• But a lot of that work can be ‘offloaded’!
The ‘loops’ are costly
• To prepare for a packet-transmission, the device-driver has to execute a few dozen assignment-statements, to set up fields in the packet’s ‘headers’ and in the Transmit Descriptor that will be used by the NIC
• Most of these assignments involve simple memory-to-memory copying of parameters
• But the ‘checksum’ fields require ‘loops’
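As a rough illustration of why the checksum fields need loops, here is a minimal sketch of a ones' complement checksum over 16-bit words. The helper names are hypothetical and are not taken from ‘nictcp.c’; the sketch assumes the data has already been arranged as an even number of bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of 'nwords' 16-bit words, folded to 16 bits.
   Hypothetical helper for illustration only. */
uint16_t ones_complement_sum(const uint16_t *wp, size_t nwords)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < nwords; i++)   /* one add plus one test-and-branch per word */
        sum += wp[i];

    while (sum >> 16)                     /* fold any carries back into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)sum;
}

/* The value stored in a checksum field is the complement of that sum. */
uint16_t internet_checksum(const uint16_t *wp, size_t nwords)
{
    return (uint16_t)~ones_complement_sum(wp, nwords);
}
```

The loop count depends on the packet's length, which is only known at run time; this is the work that the later slides hand over to the NIC.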
Can’t ‘unroll’ checksum-loops
• One programming technique for speeding up loop-execution is known as ‘unrolling’, to avoid the ‘test-and-branch’ inefficiency:
    int sum = 0;
    sum += wp[0];
    sum += wp[1];
    sum += wp[2];
    …
    sum += wp[99];
• But it requires knowing in advance what number of loop-iterations will be needed
The ‘offload’ solution
• Modern network controllers can be built to perform TCP/IP checksum calculations on packet-data as it is being fetched from ram
• This relieves a CPU from having to do the most intense portion of packet preparation
• But ‘checksum offloading’ is an optional capability that has to be ‘enabled’ – and ‘programmed’ for a specific packet-layout
‘Context’ descriptors
• Intel’s Pro1000 network controllers employ special ‘Context’ Transmit-Descriptors for enabling and configuring the ‘checksum-offloading’ capability
• Two kinds of Context Descriptor are used:
  – An ‘Offload’ Context Descriptor (Type 0)
  – A ‘Data’ Context Descriptor (Type 1)
Context descriptor (type 0)
Bits:        63-48   47-40    39-32      31-16   15-8     7-0
Quadword 0:  TUCSE   TUCSO    TUCSS      IPCSE   IPCSO    IPCSS
Bits:        63-48   47-40    39-32      31-24   23-20    19-0
Quadword 1:  MSS     HDRLEN   RSV, STA   TUCMD   DTYP=0   PAYLEN
Legend: IPCSS (IP CheckSum Start) TUCSS (TCP/UDP CheckSum Start) IPCSO (IP CheckSum Offset) TUCSO (TCP/UDP CheckSum Offset) IPCSE (IP CheckSum Ending) TUCSE (TCP/UDP CheckSum Ending)
PAYLEN (Payload Length) DTYP (Descriptor Type) TUCMD (TCP/UDP Command) STA (TCP/UDP Status) RSV (Reserved) HDRLEN (Header Length) MSS (Maximum Segment Size)
DEXT=1 (Extended Descriptor)
The TUCMD byte
Bit:    7     6      5          4              3    2     1    0
Field:  IDE   SNAP   DEXT(=1)   reserved(=0)   RS   TSE   IP   TCP
Legend: IDE (Interrupt Delay Enable) SNAP (Sub-Network Access Protocol) DEXT (Descriptor Extension) RS (Report Status) TSE (TCP-Segmentation Enable) IP (Internet Protocol) TCP (Transport Control Protocol)
Shading key: some fields are always valid; the others are valid only when TSE=1
Context descriptor (type 1)
Bits:        63-0
Quadword 0:  ADDRESS
Bits:        63-48   47-40   39-32      31-24   23-20    19-0
Quadword 1:  VLAN    POPTS   RSV, STA   DCMD    DTYP=1   DTALEN
Legend: DTALEN (Data Length) DTYP (Descriptor Type) DCMD (Descriptor Command) STA (Status) RSV (Reserved) POPTS (Packet Options) VLAN (VLAN tag)
DEXT=1 (Extended Descriptor)
The DCMD byte
Bit:    7     6     5          4              3    2     1      0
Field:  IDE   VLE   DEXT(=1)   reserved(=0)   RS   TSE   IFCS   EOP
Legend: IDE (Interrupt Delay Enable) VLE (VLAN Enable) DEXT (Descriptor Extension) RS (Report Status) TSE (TCP-Segmentation Enable) IFCS (Insert Frame CheckSum) EOP (End Of Packet)
Shading key: some fields are always valid; the others are valid only when EOP=1
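The bit positions in the two command bytes can be given names so that driver code does not rely on bare shift counts. This is only a sketch: the CMD_*, TUCMD_* and DCMD_* identifiers below are hypothetical names chosen for illustration, with positions taken from the two layouts above.

```c
#include <stdint.h>

/* Bits that sit at the same position in both TUCMD and DCMD. */
enum {
    CMD_TSE  = 1 << 2,   /* TCP-Segmentation Enable */
    CMD_RS   = 1 << 3,   /* Report Status */
    CMD_DEXT = 1 << 5,   /* Descriptor Extension (must be 1) */
    CMD_IDE  = 1 << 7    /* Interrupt Delay Enable */
};

/* Bits specific to TUCMD (Type 0 descriptor). */
enum { TUCMD_TCP = 1 << 0, TUCMD_IP = 1 << 1, TUCMD_SNAP = 1 << 6 };

/* Bits specific to DCMD (Type 1 descriptor). */
enum { DCMD_EOP = 1 << 0, DCMD_IFCS = 1 << 1, DCMD_VLE = 1 << 6 };

/* TUCMD for a context descriptor that asks for a status report. */
uint8_t tucmd_with_status(void)
{
    return CMD_DEXT | CMD_RS;
}

/* DCMD for a packet held in one buffer, optionally with a VLAN tag. */
uint8_t dcmd_single_buffer(int vlan_enable)
{
    uint8_t dcmd = CMD_DEXT | CMD_RS | DCMD_EOP;
    if (vlan_enable)
        dcmd |= DCMD_VLE;
    return dcmd;
}
```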
Our usage example
• We’ve created a module named ‘offload.c’ which demonstrates the NIC’s checksum-offloading capability for TCP/IP packets
• It’s a modification of our earlier ‘nictcp.c’ character-mode device-driver module
• We have excerpted the main changes in a class-handout – the full version is online
Data-type definitions
// Our type-definition for the ‘Type 0’ Context-Descriptor
typedef struct {
    unsigned char   ipcss;
    unsigned char   ipcso;
    unsigned short  ipcse;
    unsigned char   tucss;
    unsigned char   tucso;
    unsigned short  tucse;
    unsigned int    paylen:20;
    unsigned int    dtyp:4;
    unsigned int    tucmd:8;
    unsigned char   status;
    unsigned char   hdrlen;
    unsigned short  mss;
} TX_CONTEXT_OFFLOAD;
Definitions (continued)
// Our type-definition for the ‘Type 1’ Context-Descriptor
typedef struct {
    unsigned long long  base_addr;
    unsigned int        dtalen:20;
    unsigned int        dtyp:4;
    unsigned int        dcmd:8;
    unsigned char       status;
    unsigned char       pkt_opts;
    unsigned short      vlan_tag;
} TX_CONTEXT_DATA;

typedef union {
    TX_CONTEXT_OFFLOAD  off;
    TX_CONTEXT_DATA     dat;
} TX_DESCRIPTOR;
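Because the NIC reads each descriptor as exactly 16 bytes, and C leaves bit-field layout to the compiler, it may be worth checking the sizes at compile time. The sketch below repeats the two type-definitions so that it stands alone; the checks are an addition for illustration and assume a compiler that packs the three bit-fields into one 32-bit unit (as GCC does on x86).

```c
#include <stddef.h>

typedef struct {
    unsigned char   ipcss, ipcso;
    unsigned short  ipcse;
    unsigned char   tucss, tucso;
    unsigned short  tucse;
    unsigned int    paylen:20;
    unsigned int    dtyp:4;
    unsigned int    tucmd:8;
    unsigned char   status;
    unsigned char   hdrlen;
    unsigned short  mss;
} TX_CONTEXT_OFFLOAD;

typedef struct {
    unsigned long long  base_addr;
    unsigned int        dtalen:20;
    unsigned int        dtyp:4;
    unsigned int        dcmd:8;
    unsigned char       status;
    unsigned char       pkt_opts;
    unsigned short      vlan_tag;
} TX_CONTEXT_DATA;

typedef union {
    TX_CONTEXT_OFFLOAD  off;
    TX_CONTEXT_DATA     dat;
} TX_DESCRIPTOR;

/* Compilation fails if either layout is not the 16 bytes the hardware expects. */
_Static_assert(sizeof(TX_CONTEXT_OFFLOAD) == 16, "Type 0 descriptor must be 16 bytes");
_Static_assert(sizeof(TX_CONTEXT_DATA) == 16, "Type 1 descriptor must be 16 bytes");
_Static_assert(offsetof(TX_CONTEXT_OFFLOAD, status) == 12, "status follows the 32-bit command word");
```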
Our packets’ layout
Ethernet Header (14 bytes)
IP Header (20 bytes) (no options): HDRCKSUM lies 10 bytes into this header
TCP Header (20 bytes) (no options): TCPCKSUM lies 16 bytes into this header
Packet-Data (length varies)
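The byte offsets that the NIC needs follow directly from this layout. The sketch below derives them with hypothetical PKT_* constant names; it assumes the fixed header sizes shown above (no IP or TCP options).

```c
/* Header sizes and checksum-field positions from the packet layout. */
enum {
    PKT_ETH_HDR_LEN     = 14,
    PKT_IP_HDR_LEN      = 20,
    PKT_TCP_HDR_LEN     = 20,
    PKT_IP_CKSUM_FIELD  = 10,   /* HDRCKSUM position within the IP header  */
    PKT_TCP_CKSUM_FIELD = 16    /* TCPCKSUM position within the TCP header */
};

/* Offsets measured from the start of the packet. */
enum {
    PKT_IP_START         = PKT_ETH_HDR_LEN,                      /* 14 */
    PKT_IP_CKSUM_OFFSET  = PKT_IP_START + PKT_IP_CKSUM_FIELD,    /* 24 */
    PKT_TCP_START        = PKT_IP_START + PKT_IP_HDR_LEN,        /* 34 */
    PKT_TCP_CKSUM_OFFSET = PKT_TCP_START + PKT_TCP_CKSUM_FIELD,  /* 50 */
    PKT_DATA_START       = PKT_TCP_START + PKT_TCP_HDR_LEN       /* 54 */
};
```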
How we use contexts
• Our ‘offload.c’ driver will send a ‘Type 0’ Context Descriptor within ‘module_init()’
    txring[ 0 ].off.ipcss = 14;  // IP-header CheckSum Start
    txring[ 0 ].off.ipcso = 24;  // IP-header CheckSum Offset
    txring[ 0 ].off.ipcse = 34;  // IP-header CheckSum Ending
    txring[ 0 ].off.tucss = 34;  // TCP/UDP-segment CheckSum Start
    txring[ 0 ].off.tucso = 50;  // TCP/UDP-segment CheckSum Offset
    txring[ 0 ].off.tucse = 0;   // TCP/UDP-segment CheckSum Ending
    txring[ 0 ].off.dtyp = 0;    // Type 0 Context Descriptor
    txring[ 0 ].off.tucmd = (1<<5)|(1<<3);  // DEXT=1, RS=1
    iowrite32( 1, io + E1000_TDT );  // give ownership to NIC
Using contexts (continued)
• Our ‘offload.c’ driver will then use a Type 1 context descriptor every time its ‘write()’ function is called to transmit user-data
• The network controller ‘remembers’ the checksum-offloading parameters that we sent during module-initialization, and so it continues to apply them to every outgoing packet (we keep our same packet-layout)
Sequence of ‘write()’ steps
• Adjust the ‘len’ argument (if necessary)
• Copy ‘len’ bytes from the user’s ‘buf’ array
• Prepend the packet’s TCP Header
• Insert the pseudo-header’s checksum
• Prepend the packet’s IP Header
• Prepend the packet’s Ethernet Header
• Initialize the Data-Context Tx-Descriptor
• Give descriptor-ownership to the NIC
The TCP pseudo-header
• We do initialize the TCP Checksum field (but this needs only a short computation)
• The one’s complement sum of these six 16-bit words is placed into ‘TCP Checksum’
Source IP-address (two words)
Destination IP-address (two words)
Zero, Protocol-ID (= 6) (one word)
TCP Segment-length (one word)
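The ‘short computation’ over the six pseudo-header words might look like the sketch below. The function name and its host-byte-order arguments are assumptions for illustration, not the code from ‘offload.c’; a driver would still have to store the result in network byte order.

```c
#include <stdint.h>

/* Ones' complement sum of the six 16-bit pseudo-header words.
   Arguments are in host byte order; 'tcp_len' counts the TCP header
   plus the data that follows it. Hypothetical helper. */
uint16_t tcp_pseudo_sum(uint32_t src_ip, uint32_t dst_ip, uint16_t tcp_len)
{
    uint32_t sum = 0;

    sum += src_ip >> 16;        /* Source IP-address, high word      */
    sum += src_ip & 0xFFFF;     /* Source IP-address, low word       */
    sum += dst_ip >> 16;        /* Destination IP-address, high word */
    sum += dst_ip & 0xFFFF;     /* Destination IP-address, low word  */
    sum += 6;                   /* Zero byte, then Protocol-ID = 6   */
    sum += tcp_len;             /* TCP Segment-length                */

    while (sum >> 16)           /* fold carries back into 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)sum;
}
```

There is no loop over the packet data here: the six additions are fixed, which is why this part stays cheap even when the rest of the checksum is left to the NIC.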
Setting up the Type-1 Context
    int txtail = ioread32( io + E1000_TDT );

    txring[ txtail ].dat.base_addr = tx_desc + (txtail * TX_BUFSIZ);
    txring[ txtail ].dat.dtalen = 54 + len;
    txring[ txtail ].dat.dtyp = 1;
    txring[ txtail ].dat.dcmd = 0;
    txring[ txtail ].dat.status = 0;
    txring[ txtail ].dat.pkt_opts = 3;  // IXSM=1, TXSM=1
    txring[ txtail ].dat.vlan_tag = vlan_id;

    txring[ txtail ].dat.dcmd |= (1<<0);  // EOP (End-Of-Packet)
    txring[ txtail ].dat.dcmd |= (1<<3);  // RS (Report Status)
    txring[ txtail ].dat.dcmd |= (1<<5);  // DEXT (Descriptor Extension)
    txring[ txtail ].dat.dcmd |= (1<<6);  // VLE (VLAN Enable)

    txtail = (1 + txtail) % N_TX_DESC;
    iowrite32( txtail, io + E1000_TDT );
In-class demonstration
• We can demonstrate checksum-offloading by using our ‘dram.c’ device-driver to look at the packet that is being transmitted from one of our ‘anchor’ machines, and to look at the packet that gets received by another ‘anchor’ machine
• The checksum-fields (at offsets 24 and 50) do get modified by the network hardware!
In-class exercise
• The NIC can also deal with packets having the UDP protocol-format – but you need to employ different parameters in the Type 0 Context Descriptor and arrange a ‘header’ for the UDP segment that has a different length and arrangement of parameters
• Also the UDP protocol-ID is 17 (=0x11)
UDP Header (traditional ‘Big-Endian’ representation, bits 00-31 per row)
Bits 00-15        Bits 16-31
Source Port       Destination Port
Length            Checksum
Data :::
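As a starting point for the exercise, the UDP header can be described by a small structure and the new offsets derived from it. This is a sketch under the assumption that the Ethernet and IP headers keep the same 14-byte and 20-byte sizes used for our TCP packets; the structure and UDP_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* The four 16-bit fields of a UDP header, in transmission order.
   On the wire each field is stored big-endian (network byte order). */
struct udp_header {
    uint16_t source_port;
    uint16_t dest_port;
    uint16_t length;      /* UDP header plus data, in bytes */
    uint16_t checksum;
};

enum {
    UDP_PROTOCOL_ID  = 17,        /* = 0x11 */
    UDP_START        = 14 + 20,   /* assumed Ethernet and IP header sizes */
    UDP_CKSUM_OFFSET = UDP_START + offsetof(struct udp_header, checksum),
    UDP_DATA_START   = UDP_START + sizeof(struct udp_header)
};
```

Under these assumptions the checksum field sits at packet offset 40 rather than 50, and the data begins at offset 42 rather than 54.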