Checksum ‘offloading’ A look at how the Pro1000 NICs can be programmed to compute and insert TCP/IP checksums
Date posted: 21-Dec-2015
Checksum ‘offloading’

A look at how the Pro1000 NICs can be programmed to compute and insert TCP/IP checksums

Network efficiency

• Last time (in our ‘nictcp.c’ demo) we saw how much work a CPU needs to do when setting up an Ethernet packet for transmission in TCP/IP protocol format

• In a busy network this amount of packet-computation becomes a ‘bottleneck’ that degrades overall system performance

• But a lot of that work can be ‘offloaded’!

The ‘loops’ are costly

• To prepare for a packet-transmission, the device-driver has to execute a few dozen assignment-statements, to set up fields in the packet’s ‘headers’ and in the Transmit Descriptor that will be used by the NIC

• Most of these assignments involve simple memory-to-memory copying of parameters

• But the ‘checksum’ fields require ‘loops’
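
To make the cost of those loops concrete, here is a minimal sketch of the kind of ones'-complement summation that a software checksum needs. The helper names are hypothetical (they are not taken from ‘nictcp.c’), and the words are treated as host-order values for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones'-complement sum of 'nwords' 16-bit words, folded to 16 bits.
   Hypothetical helper; words are taken in host order for clarity. */
static uint16_t ones_complement_sum(const uint16_t *wp, size_t nwords)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < nwords; i++)   /* one test-and-branch per word */
        sum += wp[i];
    while (sum >> 16)                     /* fold carries into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

/* The value stored in a checksum field is the complement of that sum. */
static uint16_t internet_checksum(const uint16_t *wp, size_t nwords)
{
    return (uint16_t)~ones_complement_sum(wp, nwords);
}
```

For a 20-byte IP header the loop runs only ten times, but for the TCP checksum it has to visit every word of the segment, so its cost grows with the packet's length.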

Can’t ‘unroll’ checksum-loops

• One programming technique for speeding up loop-execution is known as ‘unrolling’, to avoid the ‘test-and-branch’ inefficiency:

int sum = 0;

sum += wp[0];
sum += wp[1];
sum += wp[2];
…
sum += wp[99];

• But it requires knowing in advance what number of loop-iterations will be needed
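
Since a packet's length is only known at run time, the most a driver could do in software is a partial unroll. The sketch below (a hypothetical helper, not part of the course code) sums four words per pass and shows that a test-and-branch loop is still needed, both between passes and for the leftover words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum 'n' 16-bit words, four per loop pass. */
static uint32_t sum_words_unrolled4(const uint16_t *wp, size_t n)
{
    uint32_t sum = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {   /* still one test-and-branch per four words */
        sum += wp[i];
        sum += wp[i + 1];
        sum += wp[i + 2];
        sum += wp[i + 3];
    }
    for (; i < n; i++)             /* the leftover 0..3 words still need a loop */
        sum += wp[i];
    return sum;
}
```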

The ‘offload’ solution

• Modern network controllers can be built to perform TCP/IP checksum calculations on packet-data as it is being fetched from RAM

• This relieves a CPU from having to do the most intense portion of packet preparation

• But ‘checksum offloading’ is an optional capability that has to be ‘enabled’ – and ‘programmed’ for a specific packet-layout

‘Context’ descriptors

• Intel’s Pro1000 network controllers employ special ‘Context’ Transmit-Descriptors for enabling and configuring the ‘checksum-offloading’ capability

• Two kinds of Context Descriptor are used:
  – An ‘Offload’ Context Descriptor (Type 0)
  – A ‘Data’ Context Descriptor (Type 1)

Context descriptor (type 0)

Quadword 0 (descriptor bytes 0-7):

  Bits:   63-48   47-40   39-32   31-16   15-8    7-0
  Field:  TUCSE   TUCSO   TUCSS   IPCSE   IPCSO   IPCSS

Quadword 1 (descriptor bytes 8-15):

  Bits:   63-48   47-40    39-36   35-32   31-24   23-20    19-0
  Field:  MSS     HDRLEN   RSV     STA     TUCMD   DTYP=0   PAYLEN

Legend:
  IPCSS (IP CheckSum Start)       TUCSS (TCP/UDP CheckSum Start)
  IPCSO (IP CheckSum Offset)      TUCSO (TCP/UDP CheckSum Offset)
  IPCSE (IP CheckSum Ending)      TUCSE (TCP/UDP CheckSum Ending)
  PAYLEN (Payload Length)         DTYP (Descriptor Type)
  TUCMD (TCP/UDP Command)         STA (TCP/UDP Status)
  HDRLEN (Header Length)          MSS (Maximum Segment Size)
  RSV (Reserved)

DEXT=1 (Extended Descriptor)
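
As a cross-check on the field positions, the following sketch packs the Type 0 fields into two 64-bit words using explicit shifts instead of bit-fields. The function names are hypothetical, and the bit positions are an assumption based on the layout described on this slide (with PAYLEN in bits 19-0, DTYP in bits 23-20, and STA in the low four bits of the status byte).

```c
#include <stdint.h>

/* Quadword 0: the six checksum start/offset/ending fields. */
static uint64_t ctx0_low(uint8_t ipcss, uint8_t ipcso, uint16_t ipcse,
                         uint8_t tucss, uint8_t tucso, uint16_t tucse)
{
    return  (uint64_t)ipcss
         | ((uint64_t)ipcso << 8)
         | ((uint64_t)ipcse << 16)
         | ((uint64_t)tucss << 32)
         | ((uint64_t)tucso << 40)
         | ((uint64_t)tucse << 48);
}

/* Quadword 1: PAYLEN, DTYP, TUCMD, STA, HDRLEN, MSS (RSV bits left zero). */
static uint64_t ctx0_high(uint32_t paylen, uint8_t dtyp, uint8_t tucmd,
                          uint8_t sta, uint8_t hdrlen, uint16_t mss)
{
    return  (uint64_t)(paylen & 0xFFFFF)
         | ((uint64_t)(dtyp & 0xF) << 20)
         | ((uint64_t)tucmd << 24)
         | ((uint64_t)(sta & 0xF) << 32)
         | ((uint64_t)hdrlen << 40)
         | ((uint64_t)mss << 48);
}
```

Explicit shifts avoid relying on the compiler's bit-field ordering, at the price of more verbose code than a struct with bit-fields.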

The TUCMD byte

  Bit:    7     6      5          4              3    2     1    0
  Field:  IDE   SNAP   DEXT(=1)   reserved(=0)   RS   TSE   IP   TCP

Legend: IDE (Interrupt Delay Enable) SNAP (Sub-Network Access Protocol) DEXT (Descriptor Extension) RS (Report Status) TSE (TCP-Segmentation Enable) IP (Internet Protocol) TCP (Transmission Control Protocol)

Key: some bits are always valid; others are valid only when TSE=1
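
The bit positions in the TUCMD byte can be written down as named masks. This is only a sketch: the enum and function names are hypothetical, and the bit numbers are taken from the diagram on this slide.

```c
#include <stdint.h>

/* Bit masks for the TUCMD byte (names are illustrative). */
enum tucmd_bits {
    TUCMD_TCP  = 1 << 0,   /* TCP packet type              */
    TUCMD_IP   = 1 << 1,   /* IP packet type               */
    TUCMD_TSE  = 1 << 2,   /* TCP-Segmentation Enable      */
    TUCMD_RS   = 1 << 3,   /* Report Status                */
    TUCMD_DEXT = 1 << 5,   /* Descriptor Extension (=1)    */
    TUCMD_SNAP = 1 << 6,   /* Sub-Network Access Protocol  */
    TUCMD_IDE  = 1 << 7    /* Interrupt Delay Enable       */
};

/* Command byte for checksum offloading without segmentation:
   an extended descriptor that asks the NIC to report status. */
static uint8_t tucmd_for_plain_offload(void)
{
    return (uint8_t)(TUCMD_DEXT | TUCMD_RS);
}
```

With these masks, `tucmd_for_plain_offload()` yields 0x28, the same value as the expression `(1<<5)|(1<<3)`.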

Context descriptor (type 1)

Quadword 0 (descriptor bytes 0-7):

  Bits:   63-0
  Field:  ADDRESS

Quadword 1 (descriptor bytes 8-15):

  Bits:   63-48   47-40   39-36   35-32   31-24   23-20    19-0
  Field:  VLAN    POPTS   RSV     STA     DCMD    DTYP=1   DTALEN

Legend: DTALEN (Data Length) DTYP (Descriptor Type) DCMD (Descriptor Command) STA (Status) RSV (Reserved) POPTS (Packet Options) VLAN (VLAN tag)

DEXT=1 (Extended Descriptor)

The DCMD byte

  Bit:    7     6     5          4              3    2     1      0
  Field:  IDE   VLE   DEXT(=1)   reserved(=0)   RS   TSE   IFCS   EOP

Legend: IDE (Interrupt Delay Enable) VLE (VLAN Enable) DEXT (Descriptor Extension) RS (Report Status) TSE (TCP-Segmentation Enable) IFCS (Insert Frame CheckSum) EOP (End Of Packet)

Key: some bits are always valid; others are valid only when EOP=1
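
The DCMD byte can be treated the same way. In this sketch the enum and function names are hypothetical; the bit numbers come from the diagram on this slide, and the choice of flags (EOP, RS, DEXT, plus optional VLE) is an assumption about a packet that fits in a single buffer.

```c
#include <stdint.h>

/* Bit masks for the DCMD byte (names are illustrative). */
enum dcmd_bits {
    DCMD_EOP  = 1 << 0,   /* End Of Packet             */
    DCMD_IFCS = 1 << 1,   /* Insert Frame CheckSum     */
    DCMD_TSE  = 1 << 2,   /* TCP-Segmentation Enable   */
    DCMD_RS   = 1 << 3,   /* Report Status             */
    DCMD_DEXT = 1 << 5,   /* Descriptor Extension (=1) */
    DCMD_VLE  = 1 << 6,   /* VLAN Enable               */
    DCMD_IDE  = 1 << 7    /* Interrupt Delay Enable    */
};

/* Command byte for a packet held in one buffer; 'use_vlan' adds VLE. */
static uint8_t dcmd_for_single_buffer_packet(int use_vlan)
{
    uint8_t dcmd = DCMD_EOP | DCMD_RS | DCMD_DEXT;
    if (use_vlan)
        dcmd |= DCMD_VLE;
    return dcmd;
}
```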

Our usage example

• We’ve created a module named ‘offload.c’ which demonstrates the NIC’s checksum-offloading capability for TCP/IP packets

• It’s a modification of our earlier ‘nictcp.c’ character-mode device-driver module

• We have excerpted the main changes in a class-handout – the full version is online

Data-type definitions

// Our type-definition for the ‘Type 0’ Context-Descriptor

typedef struct {
    unsigned char   ipcss;
    unsigned char   ipcso;
    unsigned short  ipcse;

    unsigned char   tucss;
    unsigned char   tucso;
    unsigned short  tucse;

    unsigned int    paylen:20;
    unsigned int    dtyp:4;
    unsigned int    tucmd:8;

    unsigned char   status;

    unsigned char   hdrlen;
    unsigned short  mss;
} TX_CONTEXT_OFFLOAD;

Definitions (continued)

// Our type-definition for the ‘Type 1’ Context-Descriptor

typedef struct {
    unsigned long long  base_addr;

    unsigned int    dtalen:20;
    unsigned int    dtyp:4;
    unsigned int    dcmd:8;

    unsigned char   status;

    unsigned char   pkt_opts;
    unsigned short  vlan_tag;
} TX_CONTEXT_DATA;

typedef union {
    TX_CONTEXT_OFFLOAD  off;
    TX_CONTEXT_DATA     dat;
} TX_DESCRIPTOR;
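
Because the NIC expects every transmit descriptor to occupy exactly 16 bytes, it can be worth checking that the compiler lays these types out as intended. The sketch below repeats the definitions and adds a hypothetical helper, `descriptor_sizes_ok()`. Bit-field layout is implementation-defined, so the check is an assumption that should hold for GCC or Clang on x86 rather than a guarantee for every compiler.

```c
typedef struct {
    unsigned char   ipcss;
    unsigned char   ipcso;
    unsigned short  ipcse;
    unsigned char   tucss;
    unsigned char   tucso;
    unsigned short  tucse;
    unsigned int    paylen:20;
    unsigned int    dtyp:4;
    unsigned int    tucmd:8;
    unsigned char   status;
    unsigned char   hdrlen;
    unsigned short  mss;
} TX_CONTEXT_OFFLOAD;

typedef struct {
    unsigned long long  base_addr;
    unsigned int    dtalen:20;
    unsigned int    dtyp:4;
    unsigned int    dcmd:8;
    unsigned char   status;
    unsigned char   pkt_opts;
    unsigned short  vlan_tag;
} TX_CONTEXT_DATA;

typedef union {
    TX_CONTEXT_OFFLOAD  off;
    TX_CONTEXT_DATA     dat;
} TX_DESCRIPTOR;

/* Hypothetical sanity check: returns 1 if all three types are 16 bytes. */
static int descriptor_sizes_ok(void)
{
    return sizeof(TX_CONTEXT_OFFLOAD) == 16
        && sizeof(TX_CONTEXT_DATA) == 16
        && sizeof(TX_DESCRIPTOR) == 16;
}
```

A driver could call such a check once during initialization and refuse to load if it fails.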

Our packets’ layout

  Packet offset   Section                              Checksum field
  0               Ethernet Header (14 bytes)
  14              IP Header (20 bytes) (no options)    HDRCKSUM, 10 bytes into the IP header
  34              TCP Header (20 bytes) (no options)   TCPCKSUM, 16 bytes into the TCP header
  54              Packet-Data (length varies)
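
The offsets that the Context Descriptor needs follow directly from this layout. A small sketch of the arithmetic, using hypothetical constant names:

```c
/* Header sizes and checksum positions from the packet-layout slide. */
enum packet_layout {
    ETH_HDR_LEN      = 14,
    IP_HDR_LEN       = 20,   /* no IP options  */
    TCP_HDR_LEN      = 20,   /* no TCP options */
    IP_CKSUM_IN_HDR  = 10,   /* HDRCKSUM position within the IP header  */
    TCP_CKSUM_IN_HDR = 16,   /* TCPCKSUM position within the TCP header */

    IP_START         = ETH_HDR_LEN,
    TCP_START        = ETH_HDR_LEN + IP_HDR_LEN,
    IP_CKSUM_OFFSET  = IP_START + IP_CKSUM_IN_HDR,
    TCP_CKSUM_OFFSET = TCP_START + TCP_CKSUM_IN_HDR,
    PAYLOAD_START    = TCP_START + TCP_HDR_LEN
};
```

These work out to 14 and 24 for the IP header's start and checksum position, 34 and 50 for the TCP segment's, and 54 for the first byte of packet-data.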

How we use contexts

• Our ‘offload.c’ driver will send a ‘Type 0’ Context Descriptor within ‘module_init()’

txring[ 0 ].off.ipcss = 14;    // IP-header CheckSum Start
txring[ 0 ].off.ipcso = 24;    // IP-header CheckSum Offset
txring[ 0 ].off.ipcse = 34;    // IP-header CheckSum Ending

txring[ 0 ].off.tucss = 34;    // TCP/UDP-segment CheckSum Start
txring[ 0 ].off.tucso = 50;    // TCP/UDP-segment CheckSum Offset
txring[ 0 ].off.tucse = 0;     // TCP/UDP-segment CheckSum Ending

txring[ 0 ].off.dtyp = 0;      // Type 0 Context Descriptor

txring[ 0 ].off.tucmd = (1<<5)|(1<<3);    // DEXT=1, RS=1

iowrite32( 1, io + E1000_TDT );    // give ownership to NIC

Using contexts (continued)

• Our ‘offload.c’ driver will then use a Type 1 context descriptor every time its ‘write()’ function is called to transmit user-data

• The network controller ‘remembers’ the checksum-offloading parameters that we sent during module-initialization, and so it continues to apply them to every outgoing packet (we keep our same packet-layout)

Sequence of ‘write()’ steps

• Adjust the ‘len’ argument (if necessary)
• Copy ‘len’ bytes from the user’s ‘buf’ array
• Prepend the packet’s TCP Header
• Insert the pseudo-header’s checksum
• Prepend the packet’s IP Header
• Prepend the packet’s Ethernet Header
• Initialize the Data-Context Tx-Descriptor
• Give descriptor-ownership to the NIC
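
The first two steps can be illustrated in user space. This is a rough sketch under stated assumptions: `stage_payload()` is a hypothetical helper, the 2048-byte value for TX_BUFSIZ is assumed, and `memcpy()` stands in for whatever user-to-kernel copy the real driver performs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    HDRS_LEN  = 54,     /* Ethernet (14) + IP (20) + TCP (20) headers */
    TX_BUFSIZ = 2048    /* assumed size of one transmit buffer        */
};

/* Clamp 'len' so headers plus data fit in the buffer, then copy the
   data to the position just past the space reserved for the headers.
   Returns the number of data bytes actually staged. */
static size_t stage_payload(uint8_t *txbuf, const uint8_t *buf, size_t len)
{
    if (len > TX_BUFSIZ - HDRS_LEN)
        len = TX_BUFSIZ - HDRS_LEN;        /* step 1: adjust 'len'   */
    memcpy(txbuf + HDRS_LEN, buf, len);    /* step 2: copy user data */
    return len;
}
```

Placing the data at offset 54 first leaves room for the three headers to be written in front of it afterwards.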

The TCP pseudo-header

• We do initialize the TCP Checksum field (but this only needs a short computation)

• The one’s complement sum of these six words is placed into ‘TCP Checksum’

  Source IP-address                                          (32 bits)
  Destination IP-address                                     (32 bits)
  Zero (8 bits) | Protocol-ID (= 6) (8 bits) | TCP Segment-length (16 bits)
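
The pseudo-header computation is short enough to write without a loop over packet data. The sketch below uses a hypothetical function name and host-order arguments. A real driver would also need to store the result in network byte order.

```c
#include <stdint.h>

/* Ones'-complement sum of the six 16-bit words of the TCP pseudo-header:
   two words of source address, two of destination address, the
   zero/Protocol-ID word, and the TCP segment-length. */
static uint16_t tcp_pseudo_sum(uint32_t src_ip, uint32_t dst_ip,
                               uint16_t tcp_len)
{
    uint32_t sum = 0;

    sum += src_ip >> 16;
    sum += src_ip & 0xFFFF;
    sum += dst_ip >> 16;
    sum += dst_ip & 0xFFFF;
    sum += 6;                      /* zero byte followed by Protocol-ID 6 */
    sum += tcp_len;

    while (sum >> 16)              /* fold carries into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}
```

For example, with source 192.168.1.1 (0xC0A80101), destination 192.168.1.2 (0xC0A80102), and a 20-byte segment, the six words add up to 0x1836D, which folds to 0x836E.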

Setting up the Type-1 Context

int txtail = ioread32( io + E1000_TDT );

txring[ txtail ].dat.base_addr = tx_desc + (txtail * TX_BUFSIZ);
txring[ txtail ].dat.dtalen = 54 + len;
txring[ txtail ].dat.dtyp = 1;
txring[ txtail ].dat.dcmd = 0;
txring[ txtail ].dat.status = 0;
txring[ txtail ].dat.pkt_opts = 3;    // IXSM=1, TXSM=1
txring[ txtail ].dat.vlan_tag = vlan_id;

txring[ txtail ].dat.dcmd |= (1<<0);    // EOP (End-Of-Packet)
txring[ txtail ].dat.dcmd |= (1<<3);    // RS (Report Status)
txring[ txtail ].dat.dcmd |= (1<<5);    // DEXT (Descriptor Extension)
txring[ txtail ].dat.dcmd |= (1<<6);    // VLE (VLAN Enable)

txtail = (1 + txtail) % N_TX_DESC;
iowrite32( txtail, io + E1000_TDT );
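
The last two statements treat the descriptor array as a ring: the tail index advances by one and wraps back to zero after the final slot. A tiny sketch of that step, where the value 8 for N_TX_DESC and the helper name are assumptions made for illustration:

```c
#define N_TX_DESC 8   /* assumed ring size, for illustration only */

/* Index of the descriptor slot that follows 'txtail' in the ring. */
static unsigned int next_tx_index(unsigned int txtail)
{
    return (1 + txtail) % N_TX_DESC;
}
```

Writing the advanced index to the TDT register is what hands the just-filled descriptor over to the NIC.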

In-class demonstration

• We can demonstrate checksum-offloading by using our ‘dram.c’ device-driver to look at the packet that is being transmitted from one of our ‘anchor’ machines, and to look at the packet that gets received by another ‘anchor’ machine

• The checksum-fields (at offsets 24 and 50) do get modified by the network hardware!
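
When comparing the two memory dumps, it helps to pull out just the two checksum fields. The following sketch assumes each frame is available as a byte array starting at the Ethernet header. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian (network-order) field from a frame. */
static uint16_t be16_at(const uint8_t *frame, size_t offset)
{
    return (uint16_t)((frame[offset] << 8) | frame[offset + 1]);
}

/* Returns 1 if either checksum field (packet offset 24 or 50)
   differs between the two frames, 0 otherwise. */
static int checksum_fields_differ(const uint8_t *sent,
                                  const uint8_t *received)
{
    return be16_at(sent, 24) != be16_at(received, 24)
        || be16_at(sent, 50) != be16_at(received, 50);
}
```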

In-class exercise

• The NIC can also deal with packets having the UDP protocol-format – but you need to employ different parameters in the Type 0 Context Descriptor and arrange a ‘header’ for the UDP segment that has a different length and arrangement of parameters

• Also the UDP protocol-ID is 17 (=0x11)

UDP Header

  Bits 00-15              Bits 16-31
  Source Port             Destination Port
  Length                  Checksum
  Data :::

Traditional ‘Big-Endian’ representation
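
As a starting point for the exercise, the UDP header can be described with a struct and the checksum position derived from it. This is a sketch under the assumption that the Ethernet header (14 bytes) and IP header (20 bytes, no options) stay the same as in the TCP case. The type and constant names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* The four 16-bit fields of a UDP header (stored big-endian on the wire). */
typedef struct {
    uint16_t src_port;
    uint16_t dst_port;
    uint16_t length;
    uint16_t checksum;
} UDP_HEADER;

enum udp_layout {
    UDP_PROTOCOL_ID  = 17,          /* 0x11 */
    UDP_START        = 14 + 20,     /* after the Ethernet and IP headers */
    UDP_CKSUM_OFFSET = UDP_START + (int)offsetof(UDP_HEADER, checksum),
    UDP_DATA_START   = UDP_START + (int)sizeof(UDP_HEADER)
};
```

Under those assumptions the UDP segment starts at packet offset 34, its checksum field sits at offset 40, and the data begins at offset 42, which suggests candidate values for the TUCSS and TUCSO fields.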