FCoE Overview IEEE CommSoc/SP Chapter Austin, Texas, May 21 2009 Tony Hurson tony.hurson@ieee.org.

Page 1:

FCoE Overview

IEEE CommSoc/SP Chapter
Austin, Texas, May 21 2009
Tony Hurson, tony.hurson@ieee.org

Page 2:

Networked Storage History

Network Attached Storage (NAS)
[Diagram: clients reach a fileserver and its data over an Ethernet (TCP/IP) network.]
File-based data transfer (e.g., NFS, CIFS)
Fabric characteristics:
Packet drop on buffer full
High-low latency
High-low throughput
No multi-pathing

Storage Area Network (SAN)
[Diagram: servers (CPU) reach an FC Target System and its data over a Fibre Channel Storage Area Network.]
SCSI block data transfer
Fabric characteristics:
Lossless
Low latency
High throughput
Reliable
Redundant paths (failover)

Page 3:

SCSI Read, Write over FC

SCSI Read
Host -> Target: FCP_CMND
Target -> Host: FCP_DATA
Target -> Host: FCP_DATA
Target -> Host: FCP_DATA
Target -> Host: FCP_RSP

SCSI Write
Host -> Target: FCP_CMND
Host -> Target: FCP_DATA, unsolicited data (modest amount)
Target -> Host: FCP_XFER_RDY
Host -> Target: FCP_DATA
Host -> Target: FCP_DATA
Target -> Host: FCP_RSP

Diagram labels: Exchange; Sequence (maybe out of order)
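The two frame orderings above can be written down as a small sketch. The function names, the tuple layout, and the single FCP_XFER_RDY per write are illustrative assumptions, not part of the FCP specification or of the slides.

```python
def fcp_read_exchange(data_frames):
    """Frame order for a SCSI Read: (sender, receiver, frame type) tuples."""
    frames = [("host", "target", "FCP_CMND")]
    frames += [("target", "host", "FCP_DATA")] * data_frames
    frames.append(("target", "host", "FCP_RSP"))
    return frames


def fcp_write_exchange(data_frames, unsolicited_frames=1):
    """Frame order for a SCSI Write with optional unsolicited data.

    Simplification (assumption): the target grants the whole remaining
    transfer with one FCP_XFER_RDY.
    """
    frames = [("host", "target", "FCP_CMND")]
    frames += [("host", "target", "FCP_DATA")] * unsolicited_frames
    frames.append(("target", "host", "FCP_XFER_RDY"))
    frames += [("host", "target", "FCP_DATA")] * data_frames
    frames.append(("target", "host", "FCP_RSP"))
    return frames


for sender, receiver, kind in fcp_write_exchange(data_frames=2):
    print(f"{sender:>6} -> {receiver:<6} {kind}")
```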

Page 4:

FC Fabric Port Terminology

[Diagram: hosts and targets, each with an N_Port, attach to F_Ports on two switches; the switches are linked to each other through their E_Ports.]

N_Port - Host or Target endpoint

F_Port - Endpoint-facing switch port

E_Port - Inter-switch port

Virtualization adds a ‘V’ prefix to all of these

Page 5:

FC Routing

[Diagram: a fabric of interconnected switches.]

Fabric Shortest Path First

Based on OSPF (IP)

“Static” Routing Tables per switch

Chooses shortest paths (hop counts)

Load balances multiple paths

Handles link failover automatically
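As a rough illustration of hop-count routing with multiple equal-cost paths, here is a breadth-first sketch. It is not the FSPF protocol itself: the function name and the link-list input are assumptions, and it ignores link-state flooding and per-link costs.

```python
from collections import deque


def shortest_path_next_hops(links, source):
    """Return {destination: sorted equal-cost next hops} by hop count.

    `links` is a list of (switch_a, switch_b) pairs (hypothetical input
    format). Every link is treated as one hop.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    hops = {source: 0}
    next_hops = {source: set()}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbors.get(node, ())):
            # The first hop toward `peer` is `peer` itself when leaving the
            # source; otherwise it is inherited from `node`.
            first = {peer} if node == source else next_hops[node]
            if peer not in hops:
                hops[peer] = hops[node] + 1
                next_hops[peer] = set(first)
                queue.append(peer)
            elif hops[peer] == hops[node] + 1:
                next_hops[peer] |= first  # another equal-cost path
    return {dst: sorted(nh) for dst, nh in next_hops.items() if dst != source}


square = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print(shortest_path_next_hops(square, "A"))
# Failover: recompute without the failed B-D link.
print(shortest_path_next_hops([l for l in square if l != ("B", "D")], "A"))
```

Traffic to a destination with several next hops can be spread across them (load balancing); dropping a failed link and recomputing models failover.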

Page 6:

Ethernet Routing

Dynamic Scheme: Source Learning
If unicast DstMAC is not in lookup table, flood frame to all ports except its source port.
Note source port of SrcMAC in lookup table, if not already present
Age/invalidate lookup entries
Similar flooding behavior for multicast
Precludes loops in fabric (flooding requires a loop-free topology)
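A minimal sketch of source learning, assuming a single switch with numbered ports. The class name, the `max_age` default, and the choice to refresh an entry on every frame (rather than only when absent) are illustrative assumptions.

```python
class LearningSwitch:
    """Toy source-learning bridge: learn SrcMAC, flood unknown DstMAC."""

    def __init__(self, ports, max_age=300.0):
        self.ports = list(ports)
        self.max_age = max_age  # seconds before an entry is invalidated
        self.table = {}         # MAC -> (port, time last seen)

    def receive(self, src_mac, dst_mac, in_port, now):
        """Return the list of ports the frame is forwarded to."""
        # Age out stale lookup entries.
        self.table = {
            mac: (port, seen)
            for mac, (port, seen) in self.table.items()
            if now - seen <= self.max_age
        }
        # Note the source port of SrcMAC.
        self.table[src_mac] = (in_port, now)

        entry = self.table.get(dst_mac)
        if entry is None:
            # Unknown unicast (or multicast): flood to all but the source port.
            return [p for p in self.ports if p != in_port]
        out_port = entry[0]
        return [] if out_port == in_port else [out_port]


sw = LearningSwitch(ports=[1, 2, 3])
print(sw.receive("aa", "bb", in_port=1, now=0.0))  # bb unknown: flood
print(sw.receive("bb", "aa", in_port=2, now=1.0))  # aa learned on port 1
```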

Page 7:

FC Frame Format

Frame: SOF | Frame Header | Opt. header | Payload (2KB + Markers) | CRC | EOF

Frame Header, one 32-bit word per row (bit 31 on the left, bit 23 boundary, bit 0 on the right):
R_CTL | D_ID
S_ID
Type | F_CTL
SEQ_ID | DF_CTL | SEQ_CNT
OX_ID | RX_ID
Parameter

S_ID, D_ID: Fabric-assigned (Fabric Login) source, destination [V]N_Port identifiers
OX_ID, RX_ID: Local, Remote Exchange Identifiers, used to look up Exchange state at endpoints
SEQ_ID, SEQ_CNT: Sequence trackers
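The header layout can be exercised with a short pack/unpack sketch. It assumes the standard 24-byte big-endian FC header, in which the top byte of the S_ID word holds CS_CTL (a field the slide does not label); the helper names and the example values are illustrative.

```python
import struct
from collections import namedtuple

FcHeader = namedtuple(
    "FcHeader",
    "r_ctl d_id cs_ctl s_id fc_type f_ctl seq_id df_ctl seq_cnt ox_id rx_id parameter",
)

# Three packed 32-bit words, then SEQ_ID, DF_CTL, SEQ_CNT, OX_ID, RX_ID, Parameter.
_LAYOUT = ">IIIBBHHHI"


def pack_fc_header(h):
    """Serialize an FcHeader into 24 bytes (network byte order)."""
    return struct.pack(
        _LAYOUT,
        (h.r_ctl << 24) | h.d_id,    # word 0: R_CTL | D_ID
        (h.cs_ctl << 24) | h.s_id,   # word 1: CS_CTL | S_ID
        (h.fc_type << 24) | h.f_ctl, # word 2: Type | F_CTL
        h.seq_id, h.df_ctl, h.seq_cnt,
        h.ox_id, h.rx_id, h.parameter,
    )


def unpack_fc_header(data):
    """Parse the first 24 bytes of `data` into an FcHeader."""
    w0, w1, w2, seq_id, df_ctl, seq_cnt, ox_id, rx_id, param = struct.unpack(
        _LAYOUT, data[:24]
    )
    return FcHeader(
        w0 >> 24, w0 & 0xFFFFFF,
        w1 >> 24, w1 & 0xFFFFFF,
        w2 >> 24, w2 & 0xFFFFFF,
        seq_id, df_ctl, seq_cnt, ox_id, rx_id, param,
    )


# Example values are made up for illustration.
hdr = FcHeader(0x06, 0x010203, 0, 0x0A0B0C, 0x08, 0x290000, 1, 0, 0, 0x1234, 0xFFFF, 0)
raw = pack_fc_header(hdr)
print(len(raw), unpack_fc_header(raw).ox_id)
```

An endpoint would typically key its Exchange state on OX_ID or RX_ID from the parsed header, then use SEQ_ID and SEQ_CNT to place the frame within its Sequence.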

Page 8:

Protocol Stack History and Comparison

Stacks shown in chronological order of development: FC, iSCSI, FCoE. Layers listed top to bottom, with each layer's role in parentheses.

FC: SCSI; FCP and FC-3 (Mapping, Discovery, Services, Recovery); FC-2V (Transport); FC-1 (Link); FC-0 (PHY)
iSCSI: SCSI; iSCSI (Mapping); TCP (Transport); IP (Network); Ethernet
FCoE: SCSI; FCP and FC-3 (Mapping, Discovery, Services, Recovery); FC-2V (Transport); FCoE (Encap/decap); Lossless Ethernet (Link, PHY)

Page 9:

Lossless Ethernet – via PAUSE

[Diagram: two link peers, each a switch or endpoint, joined by an Ethernet link. Each peer has a port transmit buffer feeding Eth Tx, and Eth Rx feeding a port receive packet buffer with high and low watermarks (HWM, LWM). Each peer has an outbound PAUSE generator; inbound PAUSE frames throttle the peer's transmit side.]

When port receive buffer fills to a high watermark, issue PAUSE XOFF to link peer; when buffer drains to low watermark, issue PAUSE XON to peer
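The watermark rule can be sketched as a small state machine. The class and method names are hypothetical, and buffer occupancy is counted in abstract units rather than bytes on a real port.

```python
class PauseReceiver:
    """Receive buffer that signals XOFF at the HWM and XON at the LWM."""

    def __init__(self, hwm, lwm):
        if lwm >= hwm:
            raise ValueError("low watermark must be below high watermark")
        self.hwm, self.lwm = hwm, lwm
        self.fill = 0
        self.paused = False  # True while the link peer has been told XOFF

    def _signal(self):
        if not self.paused and self.fill >= self.hwm:
            self.paused = True
            return "XOFF"
        if self.paused and self.fill <= self.lwm:
            self.paused = False
            return "XON"
        return None  # hysteresis: no message between the watermarks

    def enqueue(self, units):
        """A frame arrives from the link."""
        self.fill += units
        return self._signal()

    def dequeue(self, units):
        """The local consumer drains the buffer."""
        self.fill = max(0, self.fill - units)
        return self._signal()


rx = PauseReceiver(hwm=8, lwm=2)
events = [rx.enqueue(3), rx.enqueue(3), rx.enqueue(3), rx.dequeue(4), rx.dequeue(4)]
print(events)
```

On a real link the high watermark must leave headroom for frames already in flight when XOFF is sent. In IEEE 802.3x, XOFF is a PAUSE frame with a non-zero pause time and XON is one with a pause time of zero.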

Page 10:

FCoE Early Deployment Example

[Diagram elements:]
To/From internet
Firewall
Presentation Tier: 20, 4-way SMP diskless blades
Application Tier: 8, 16-way SMP diskless blades
Database Tier: Large SMP
Lossless, Converged Ethernet Fabric
FCoE-FC gateway
FC fabric
FC Storage Array

Page 11:

FCoE Frame Format

[Diagram: frame layout in 32-bit words, bit 31 on the left, bit 0 on the right]
EtherType = FCoE_TYPE | Version
SOF
Encapsulated FC Frame (n words)
EOF
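A hedged sketch of encapsulation and decapsulation follows. The FCoE EtherType is 0x8906. The reserved-field sizes (12 bytes after the version nibble, 3 bytes after EOF) follow the FC-BB-5 layout as best recalled and should be checked against the standard. The SOF and EOF code points in the example are assumed values, the function names are hypothetical, and the 802.1Q tag and Ethernet FCS are omitted.

```python
import struct

FCOE_ETHERTYPE = 0x8906


def fcoe_encapsulate(dst_mac, src_mac, sof, fc_frame, eof, version=0):
    """Wrap an FC frame (header + payload + FC CRC) in an FCoE frame."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    if len(fc_frame) % 4:
        raise ValueError("FC frame must be a whole number of 32-bit words")
    eth = dst_mac + src_mac + struct.pack(">H", FCOE_ETHERTYPE)
    # Version nibble, reserved bits (assumed 12 further bytes), then SOF.
    fcoe_header = bytes([version << 4]) + bytes(12) + bytes([sof])
    # EOF followed by reserved padding (assumed 3 bytes).
    trailer = bytes([eof]) + bytes(3)
    return eth + fcoe_header + fc_frame + trailer


def fcoe_decapsulate(frame):
    """Return (version, sof, fc_frame, eof) from an FCoE frame."""
    (ethertype,) = struct.unpack(">H", frame[12:14])
    if ethertype != FCOE_ETHERTYPE:
        raise ValueError("not an FCoE frame")
    return frame[14] >> 4, frame[27], frame[28:-4], frame[-4]


SOF_EXAMPLE = 0x2E  # assumed code point, for illustration only
EOF_EXAMPLE = 0x42  # assumed code point, for illustration only

fc_frame = bytes(24) + bytes(4)  # empty header plus placeholder CRC
wire = fcoe_encapsulate(bytes(6), bytes(6), SOF_EXAMPLE, fc_frame, EOF_EXAMPLE)
print(len(wire), fcoe_decapsulate(wire)[1:] == (SOF_EXAMPLE, fc_frame, EOF_EXAMPLE))
```

This is roughly the job the slides assign to the FCoE_LEP: the FC frame passes through unchanged, and only the SOF and EOF delimiters are re-expressed as bytes.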

Page 12:

FCoE Endpoint Model

[Diagram, stack from top to bottom: FC-3/FC-4; FC-2V; VN_Port; FCoE_LEP alongside the FIP mgmt. protocol; Lossless Ethernet MAC; to lossless Eth. Fabric.]

FIP - FCoE Initialization Protocol - initiates Fabric Logins with FCoE switch (FCF)
Each Fabric Login establishes a VN_Port and a VN_Port - VF_Port logical connection.
FCoE_LEP - link endpoint, performs encapsulation/decapsulation of FC frame.
Each VN_Port has a unique MAC address, server- or fabric-provided.

Page 13:

FCoE Switch Functional Model

[Diagram: an FC Switch (FC-SW-5) core with a native E_Port (to FC fabric) and a native F_Port (to FC endpoint), plus three FCoE-facing ports: two VF_Ports and one VE_Port. Each FCoE-facing port sits on FCoE_LEPs over a Lossless Ethernet MAC (FCF-MAC), with a FIP mgmt. protocol entity, and connects to the lossless Eth. Fabric.]

Page 14:

Converged Ethernet

AKA Data Center Bridging (DCB).
Run up to four major traffic classes on single 10 GbE fabric. In order of market prevalence:
Networking (TCP/IP, lossy).
Block Storage (lossless FCoE, or lossless/lossy iSCSI).
Management (“heartbeat” traffic, low bandwidth, but must get through).
Inter-Process Communication (clustered computing: high bandwidth, low latency, lossless preferred).

Page 15:

Groundwork for DCB

IEEE 802.1Qaz – ETS & DCBX – bandwidth allocation to major traffic classes (Priority Groups); plus DCB management protocol.

IEEE 802.1Qbb – Priority PAUSE. Selectively PAUSE traffic on link by Priority Group.

IEEE 802.1Qau – Dynamic Congestion Notification.

Page 16:

IEEE 802.1Qaz Enhanced Transmission Selection

Support at least 3 Priority Groups/traffic classes
PGs identified by Priority field of existing 802.1Q VLAN Tag
Configured Bandwidth per PG has 1% resolution
PG15 has limitless bandwidth (use sparingly!, for Management)
Work Conservation – if the wire’s free, use it.

Page 17:

ETS Configuration Example

PG0 (Storage): 40% of port b/w
PG1 (Networking): 20% of port b/w
PG2 (IPC): 40% of port b/w
PG15 (mgmt): limitless

If a PG underutilizes, others can fill the space.

Typical implementation: DWRR.
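A deficit weighted round robin (DWRR) sketch using the 40/20/40 split above. The function name, the `quantum_per_percent` parameter, and the packet sizes are assumptions for illustration; PG15's limitless (strict priority) service is left out.

```python
from collections import deque


def dwrr_schedule(queues, weights, quantum_per_percent=100, rounds=10):
    """Serve per-PG packet queues by DWRR; return bytes sent per PG.

    queues:  {pg: list of packet sizes in bytes}
    weights: {pg: configured bandwidth percentage}
    """
    pending = {pg: deque(sizes) for pg, sizes in queues.items()}
    deficit = {pg: 0 for pg in pending}
    sent = {pg: 0 for pg in pending}
    for _ in range(rounds):
        if not any(pending.values()):
            break
        for pg, queue in pending.items():
            if not queue:
                deficit[pg] = 0  # an idle PG banks no credit
                continue
            deficit[pg] += quantum_per_percent * weights[pg]
            # Send while the head-of-line packet fits in the deficit.
            while queue and queue[0] <= deficit[pg]:
                size = queue.popleft()
                deficit[pg] -= size
                sent[pg] += size
    return sent


weights = {"PG0": 40, "PG1": 20, "PG2": 40}
backlog = {pg: [1000] * 100 for pg in weights}
print(dwrr_schedule(backlog, weights))

# Work conservation: with PG2 idle, PG0 and PG1 share the link 2:1.
print(dwrr_schedule({"PG0": [1000] * 100, "PG1": [1000] * 100, "PG2": []}, weights))
```

Because an empty queue is skipped, the scheduler never idles the wire while any PG has traffic, which is the work-conserving behaviour the slide describes.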

Page 18:

IEEE 802.1Qbb Priority PAUSE

[Diagram: two link peers, each a switch or endpoint, on a 10 GbE link. The sender holds output queues by traffic class (PG0 - Storage, PG1 - Networking, PG2 - IPC, PG15 - Management), served by an ETS DWRR scheduler. The receiver holds receive buffers by traffic class for the same four PGs, each marked as a lossy buffer or a lossless buffer. The receiver sends a Priority PAUSE for PG0 only.]

Generally, Networking (TCP/IP) should NEVER be PAUSEd
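A sketch of per-priority receive buffering, assuming lossless classes signal PAUSE at a high watermark while lossy classes simply drop on overflow. The class name, return strings, and buffer sizes are hypothetical.

```python
class PriorityReceivePort:
    """Per-PG receive buffers; only lossless PGs generate Priority PAUSE."""

    def __init__(self, capacity, hwm, lossless):
        self.capacity = capacity      # per-PG buffer size
        self.hwm = hwm                # per-PG high watermark
        self.lossless = set(lossless)
        self.fill = {}
        self.paused = set()           # PGs the link peer has been told to pause
        self.dropped = {}

    def receive(self, pg, nbytes):
        fill = self.fill.get(pg, 0)
        if fill + nbytes > self.capacity:
            # Buffer full. For a lossless PG this means the watermark left
            # too little headroom.
            self.dropped[pg] = self.dropped.get(pg, 0) + nbytes
            return "DROP"
        self.fill[pg] = fill + nbytes
        if pg in self.lossless and pg not in self.paused and self.fill[pg] >= self.hwm:
            self.paused.add(pg)
            return "PAUSE"            # pause this PG only
        return "OK"


port = PriorityReceivePort(capacity=10, hwm=6, lossless={"PG0"})
print(port.receive("PG0", 6))  # storage reaches its watermark
print(port.receive("PG1", 6))  # networking is never paused
print(port.receive("PG1", 6))  # overflow on the lossy buffer
```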

Page 19:

IEEE 802.1Qau Dynamic Congestion Control

Background
Lossless fabrics are prone to congestion spreading (congestion trees).
Ethernet-FC gateways with their different port speeds (10 GbE; 8 Gbps) are natural bottlenecks.
ETS Work Conservation model adds fuel to fire.

Solution: switches/endpoints notify traffic sources of incipient congestion, via feedback messages; sources reduce rates accordingly.

Page 20:

Congestion Notification in Action

1. Source endpoint, supporting ‘n’ Congestion Controlled Flows, tags each outbound packet with CCF#
2. Switch (or dest. Endpoint) detects incipient congestion; issues Congestion Notification Message back to data source
3. Source reacts to CNM, reducing tx rate. Source recovers its rate over time and via byte counting

[Diagram: Data flows from the source endpoint through the switch to the destination endpoint; the CNM travels back to the source.]

Page 21:

Congestion Control at Endpoint Transmit

[Diagram: an IEEE 802.1Qaz/Qau endpoint. 802.1Qau Congestion Control rate limiters (“Reaction Points”) and per-PG queues (PG0 - Storage, PG1 - Networking, PG2 - IPC, PG15 - Management) feed an 802.1Qaz ETS DWRR scheduler, which drives the 10 GbE link.]

Typical implementation: byte-based token buckets
Shallow buckets (2 - 6 packets) for rapid CNM response
Shallow queues (2 - 6 packets) for rapid CNM response
Incoming Congestion Notification Messages (CNMs) - “Slow Down!”
CNMs only slow down RPs. Rate recovery is internal (byte- and time-based)
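A byte-based token bucket Reaction Point can be sketched as below. The multiplicative `decrease` and `recover` factors are placeholders and not the 802.1Qau algorithm, which scales the decrease by the feedback carried in the CNM and recovers in stages; recovery here is driven only by a periodic tick, not by byte counting. All names are hypothetical.

```python
class ReactionPoint:
    """Token-bucket rate limiter that slows on CNM and recovers on ticks."""

    def __init__(self, line_rate, bucket_bytes, decrease=0.5, recover=1.25):
        self.line_rate = line_rate        # bytes per second, upper bound
        self.rate = line_rate             # current permitted rate
        self.bucket_bytes = bucket_bytes  # shallow: a few packets deep
        self.tokens = bucket_bytes
        self.decrease = decrease          # placeholder factor applied per CNM
        self.recover = recover            # placeholder factor applied per tick
        self.last = 0.0

    def _refill(self, now):
        earned = (now - self.last) * self.rate
        self.tokens = min(self.bucket_bytes, self.tokens + earned)
        self.last = now

    def try_send(self, nbytes, now):
        """Return True and spend tokens if the packet may go now."""
        self._refill(now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

    def on_cnm(self, now):
        """Incoming CNM: slow down. CNMs never raise the rate."""
        self._refill(now)
        self.rate *= self.decrease

    def on_recovery_tick(self, now):
        """Internal recovery toward line rate."""
        self._refill(now)
        self.rate = min(self.line_rate, self.rate * self.recover)


rp = ReactionPoint(line_rate=1024.0, bucket_bytes=512)
print(rp.try_send(512, now=0.0), rp.try_send(128, now=0.0625))
rp.on_cnm(now=0.0625)
print(rp.rate, rp.try_send(128, now=0.1875))
```

A shallow bucket limits how much a source can burst at the old rate after a CNM arrives, which is why the slide calls for only a few packets of depth.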

Page 22:

FCoE Summary

Presents new, but very familiar, PHY and Link Layers for FC.
Core switching discipline remains FC-SW-5.
Higher FC layers almost completely unchanged (that’s the legacy value!)
Biggest Ethernet-level requirement: lossless fabric.
Part of Converged Ethernet initiative – lots of ancillary activity at IEEE.

Page 23:

Further Reading

FCoE: www.t11.org
IEEE 802.1Q(az|au|bb): www.ieee.org

Thank you! Questions?