FCoE Overview IEEE CommSoc/SP Chapter Austin, Texas, May 21 2009 Tony Hurson [email protected].
FCoE Overview
IEEE CommSoc/SP Chapter
Austin, Texas, May 21 2009
Tony Hurson [email protected]
Networked Storage History
Network Attached Storage (NAS)
[Diagram: clients and servers (CPU) exchange data with a fileserver over an Ethernet (TCP/IP) network.]
File-based data transfer (e.g., NFS, CIFS).
Fabric characteristics: packet drop on buffer full; high-low latency; high-low throughput; no multi-pathing.
Storage Area Network (SAN)
[Diagram: servers exchange data with an FC target system over a Fibre Channel Storage Area Network.]
SCSI block data transfer.
Fabric characteristics: lossless; low latency; high throughput; reliable; redundant paths (failover).
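SCSI Read, Write over FC
SCSI Read
Host to Target: FCP_CMND
Target to Host: FCP_DATA, FCP_DATA, FCP_DATA
Target to Host: FCP_RSP
SCSI Write
Host to Target: FCP_CMND
Host to Target: FCP_DATA (unsolicited data, modest amount)
Target to Host: FCP_XFER_RDY
Host to Target: FCP_DATA, FCP_DATA
Target to Host: FCP_RSP
Diagram labels: Exchange; Sequence (maybe out of order).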
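The two exchanges above can be written down as ordered frame lists. This is only a minimal sketch: `Frame`, `scsi_read`, and `scsi_write` are illustrative names rather than part of any FC library, and the frame counts are parameters chosen to mirror the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    sender: str  # "host" or "target"
    iu: str      # FCP information unit name, as shown on the slide


def scsi_read(data_frames: int = 3) -> list[Frame]:
    """One read Exchange: command out, data back from the target, response."""
    frames = [Frame("host", "FCP_CMND")]
    frames += [Frame("target", "FCP_DATA")] * data_frames
    frames.append(Frame("target", "FCP_RSP"))
    return frames


def scsi_write(unsolicited: int = 1, solicited: int = 2) -> list[Frame]:
    """One write Exchange: the host may send a modest amount of unsolicited
    data, then waits for FCP_XFER_RDY before sending the rest."""
    frames = [Frame("host", "FCP_CMND")]
    frames += [Frame("host", "FCP_DATA")] * unsolicited
    frames.append(Frame("target", "FCP_XFER_RDY"))
    frames += [Frame("host", "FCP_DATA")] * solicited
    frames.append(Frame("target", "FCP_RSP"))
    return frames
```

In a read all data frames come from the target, while in a write they all come from the host, gated by FCP_XFER_RDY.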
FC Fabric Port Terminology
[Diagram: each host and target N_Port attaches to a switch F_Port; the two switches are joined through their E_Ports.]
N_Port - Host or Target endpoint
F_Port - Endpoint-facing switch port
E_Port - Inter-switch port
Virtualization adds a ‘V’ prefix to all of these
FC Routing
[Diagram: mesh of interconnected switches.]
Fabric Shortest Path First
Based on OSPF (IP)
“Static” Routing Tables per switch
Chooses shortest paths (hop counts)
Load balances multiple paths
Handles link failover automatically
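The path-selection idea above (shortest paths by hop count, with load balancing across equal-cost paths) can be sketched with a breadth-first search. This is a simplified illustration, not the FSPF protocol itself: real FSPF is a link-state protocol, and the function name `next_hops` and the link-list input format are assumptions made for this example.

```python
from collections import deque


def next_hops(links, src):
    """Return {switch: (hop_count, set of equal-cost first hops)} as seen from src.

    links is an iterable of undirected (switch_a, switch_b) pairs.
    """
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    dist = {src: 0}
    first = {src: set()}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nbr in sorted(adj.get(node, ())):
            # Paths through a direct neighbour start with that neighbour.
            hops = {nbr} if node == src else first[node]
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                first[nbr] = set(hops)
                queue.append(nbr)
            elif dist[nbr] == dist[node] + 1:
                # Another shortest path: remember its first hop as well.
                first[nbr] |= hops
    return {sw: (dist[sw], first[sw]) for sw in dist if sw != src}
```

Failover then amounts to recomputing the table from the surviving links: removing a link from `links` and calling `next_hops` again yields the remaining shortest paths.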
Ethernet Routing
Dynamic Scheme: Source Learning
If unicast DstMAC is not in lookup table, flood frame to all ports except its source port.
Note source port of SrcMAC in lookup table, if not already present.
Age/invalidate lookup entries.
Similar flooding behavior for multicast.
Precludes loops in fabric.
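The source-learning rules above fit in a few lines. The following is a minimal sketch under stated assumptions: the class name `LearningSwitch`, the string MAC format, and the 300-second aging time are choices made for this example, and the sketch refreshes an entry on every frame rather than only when it is absent.

```python
def is_multicast(mac: str) -> bool:
    """Group (multicast/broadcast) addresses have the low bit of the first octet set."""
    return bool(int(mac.split(":")[0], 16) & 1)


class LearningSwitch:
    def __init__(self, ports, max_age=300.0):
        self.ports = set(ports)
        self.max_age = max_age  # seconds before an idle entry is invalidated
        self.table = {}         # MAC -> (port, time last seen)

    def receive(self, in_port, src_mac, dst_mac, now):
        """Process one frame; return the set of ports it is forwarded to."""
        # Age out stale entries.
        self.table = {m: (p, t) for m, (p, t) in self.table.items()
                      if now - t <= self.max_age}
        # Learn (or refresh) the port on which the source was seen.
        self.table[src_mac] = (in_port, now)

        entry = self.table.get(dst_mac)
        if entry is None or is_multicast(dst_mac):
            return self.ports - {in_port}   # flood, excluding the source port
        out_port = entry[0]
        return set() if out_port == in_port else {out_port}
```

Because unknown destinations are flooded, a physical loop would circulate frames indefinitely, which is why the scheme precludes loops in the fabric.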
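FC Frame Format
Frame: SOF | Frame Header | Opt. header | Payload (2KB + Markers) | CRC | EOF
Frame Header, one 32-bit word per row (bit 31 on the left, bit 0 on the right):
Word 0: R_CTL (bits 31-24) | D_ID (bits 23-0)
Word 1: (unlabeled byte) | S_ID (bits 23-0)
Word 2: Type (bits 31-24) | F_CTL (bits 23-0)
Word 3: SEQ_ID | DF_CTL | SEQ_CNT
Word 4: OX_ID | RX_ID
Word 5: Parameter
S_ID, D_ID: Fabric-assigned (Fabric Login) source, destination [V]N_Port identifiers.
OX_ID, RX_ID: Local, Remote Exchange Identifiers, used to look up Exchange state at endpoints.
SEQ_ID, SEQ_CNT: Sequence trackers.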
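The 24-byte frame header can be packed and parsed with Python's `struct` module. This is a sketch, not a reference implementation: the names `FCHeader`, `pack_header`, and `unpack_header` are invented for this example, and the top byte of word 1, which the slide leaves unlabeled, is treated here as the CS_CTL field.

```python
import struct
from typing import NamedTuple

_FMT = ">IIIBBHHHI"  # six big-endian 32-bit words, 24 bytes in total


class FCHeader(NamedTuple):
    r_ctl: int
    d_id: int       # 24-bit destination N_Port identifier
    cs_ctl: int     # assumption: top byte of word 1, unlabeled on the slide
    s_id: int       # 24-bit source N_Port identifier
    fc_type: int
    f_ctl: int      # 24 bits
    seq_id: int
    df_ctl: int
    seq_cnt: int
    ox_id: int      # originator's exchange identifier
    rx_id: int      # responder's exchange identifier
    parameter: int


def pack_header(h: FCHeader) -> bytes:
    return struct.pack(
        _FMT,
        (h.r_ctl << 24) | h.d_id,
        (h.cs_ctl << 24) | h.s_id,
        (h.fc_type << 24) | h.f_ctl,
        h.seq_id, h.df_ctl, h.seq_cnt,
        h.ox_id, h.rx_id,
        h.parameter,
    )


def unpack_header(data: bytes) -> FCHeader:
    w0, w1, w2, seq_id, df_ctl, seq_cnt, ox_id, rx_id, param = struct.unpack(
        _FMT, data[:24])
    return FCHeader(w0 >> 24, w0 & 0xFFFFFF, w1 >> 24, w1 & 0xFFFFFF,
                    w2 >> 24, w2 & 0xFFFFFF, seq_id, df_ctl, seq_cnt,
                    ox_id, rx_id, param)
```

An endpoint would typically use the (S_ID, OX_ID) or (D_ID, RX_ID) pair from a parsed header as the key into its Exchange state table.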
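Protocol Stack History and Comparison
[Diagram: three protocol stacks under SCSI, shown in chronological order of development: FC, then iSCSI, then FCoE. Each stack is listed top to bottom, with the layer role in parentheses.]
FC: FCP, FC-3 (Mapping, Discovery, Services, Recovery); FC-2V (Transport); FC-1 (Link); FC-0 (PHY).
iSCSI: iSCSI (Mapping); TCP (Transport); IP (Network); Ethernet.
FCoE: FCP, FC-3 (Mapping, Discovery, Services, Recovery); FC-2V (Transport); FCoE (Encap/decap); Lossless Ethernet (Link, PHY).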
Lossless Ethernet – via PAUSE
[Diagram: two link peers (each a switch or endpoint) joined by an Ethernet link. On each side a port transmit buffer feeds Eth Tx, and Eth Rx feeds a port receive packet buffer with high and low watermarks (HWM, LWM). The receive buffer drives an outbound PAUSE generator; inbound PAUSE controls the port transmit buffer.]
When the port receive buffer fills to a high watermark, issue PAUSE XOFF to the link peer; when the buffer drains to a low watermark, issue PAUSE XON to the peer.
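The watermark rule is a simple hysteresis loop. The sketch below models only the receive-side decision; the class name `PauseReceiver`, its byte-count units, and the string return values are assumptions made for illustration.

```python
class PauseReceiver:
    """Receive buffer that asks its link peer to stop and resume sending."""

    def __init__(self, hwm: int, lwm: int):
        if not 0 <= lwm < hwm:
            raise ValueError("need 0 <= lwm < hwm")
        self.hwm, self.lwm = hwm, lwm
        self.fill = 0             # bytes currently buffered
        self.peer_paused = False  # True between XOFF and the next XON

    def _update(self):
        if not self.peer_paused and self.fill >= self.hwm:
            self.peer_paused = True
            return "XOFF"
        if self.peer_paused and self.fill <= self.lwm:
            self.peer_paused = False
            return "XON"
        return None               # no PAUSE frame needed

    def enqueue(self, nbytes: int):
        """A frame arrived from the link; returns a PAUSE action or None."""
        self.fill += nbytes
        return self._update()

    def dequeue(self, nbytes: int):
        """A frame was drained from the buffer; returns a PAUSE action or None."""
        self.fill = max(0, self.fill - nbytes)
        return self._update()
```

The gap between the two watermarks keeps the port from emitting a PAUSE frame on every packet. In a real design the buffer also needs headroom above the high watermark, since the peer keeps transmitting until the XOFF reaches it.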
FCoE Early Deployment Example
[Diagram, from the internet inward:]
To/From internet
Firewall
Presentation Tier: 20, 4-way SMP diskless blades
Application Tier: 8, 16-way SMP diskless blades
Database Tier: Large SMP
Lossless, Converged Ethernet Fabric
FCoE-FC gateway
FC fabric
FC Storage Array
FCoE Frame Format
[Figure: frame layout drawn 32 bits wide, bit 31 to bit 0, top to bottom.]
EtherType = FCoE_TYPE | Version
SOF
Encapsulated FC Frame (n words)
EOF
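A rough sketch of the encapsulation follows. Several details here are not on the slide and should be checked against FC-BB-5 before use: the EtherType value 0x8906, the reserved-field widths (a 4-bit version plus reserved bits filling 13 bytes before SOF, and 3 reserved bytes after EOF), and the SOF/EOF code points. The function names are invented for this example, and the VLAN tag and Ethernet FCS are omitted.

```python
import struct

FCOE_TYPE = 0x8906   # assumption: registered FCoE EtherType
SOF_I3 = 0x2E        # assumption: SOFi3 code point
EOF_T = 0x42         # assumption: EOFt code point


def encapsulate(dst_mac: bytes, src_mac: bytes, fc_frame: bytes,
                sof: int = SOF_I3, eof: int = EOF_T, version: int = 0) -> bytes:
    """Wrap an FC frame (header, payload, CRC) in an FCoE Ethernet frame."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    if len(fc_frame) % 4:
        raise ValueError("FC frame must be a whole number of 32-bit words")
    eth_header = dst_mac + src_mac + struct.pack(">H", FCOE_TYPE)
    # Version in the top 4 bits, then reserved bits, then the SOF byte.
    fcoe_header = bytes([version << 4]) + bytes(12) + bytes([sof])
    trailer = bytes([eof]) + bytes(3)
    return eth_header + fcoe_header + fc_frame + trailer


def decapsulate(frame: bytes):
    """Return (sof, fc_frame, eof) from a frame built by encapsulate()."""
    if struct.unpack(">H", frame[12:14])[0] != FCOE_TYPE:
        raise ValueError("not an FCoE frame")
    return frame[27], frame[28:-4], frame[-4]
```

The FC frame travels unmodified, which is what lets the higher FC layers stay unchanged.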
FCoE Endpoint Model
[Diagram, stack from top to bottom: FC-3/FC-4; FC-2V; VN_Port; FCoE_LEP, alongside the FIP mgmt. protocol; Lossless Ethernet MAC; to lossless Eth. fabric.]
FIP - FCoE Initialization Protocol - initiates Fabric Logins with the FCoE switch (FCF).
Each Fabric Login establishes a VN_Port and a VN_Port - VF_Port logical connection.
FCoE_LEP - link endpoint, performs encapsulation/decapsulation of FC frames.
Each VN_Port has a unique MAC address, server- or fabric-provided.
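For the fabric-provided case, the VN_Port MAC address is formed from a 24-bit prefix (the FC-MAP) followed by the 24-bit FC identifier assigned at Fabric Login. The sketch below assumes the default FC-MAP value 0E-FC-00; the function name is invented for this example, and the details should be confirmed against FC-BB-5.

```python
DEFAULT_FC_MAP = 0x0EFC00  # assumption: default FC-MAP prefix


def fabric_provided_mac(fc_id: int, fc_map: int = DEFAULT_FC_MAP) -> str:
    """Concatenate the FC-MAP prefix with a 24-bit FC_ID into a MAC address."""
    if not 0 <= fc_id <= 0xFFFFFF:
        raise ValueError("FC_ID must fit in 24 bits")
    if not 0 <= fc_map <= 0xFFFFFF:
        raise ValueError("FC-MAP must fit in 24 bits")
    raw = ((fc_map << 24) | fc_id).to_bytes(6, "big")
    return ":".join(f"{octet:02x}" for octet in raw)
```

Because the FC_ID is unique within the fabric, each VN_Port ends up with a unique MAC address without any separate address administration.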
FCoE Switch Functional Model
[Diagram: an FC Switch (FC-SW-5) element at the core, surrounded by these port types:]
E_Port (to FC fabric).
F_Port (to FC endpoint).
VF_Port (shown twice): FCoE_LEPs and FIP mgmt. protocol over a Lossless Ethernet MAC (FCF-MAC), to lossless Eth. fabric.
VE_Port: FCoE_LEPs and FIP mgmt. protocol over a Lossless Ethernet MAC (FCF-MAC), to lossless Eth. fabric.
Converged Ethernet
AKA Data Center Bridging (DCB).
Run up to four major traffic classes on a single 10 GbE fabric. In order of market prevalence:
Networking (TCP/IP, lossy).
Block Storage (lossless FCoE, or lossless/lossy iSCSI).
Management (“heartbeat” traffic, low bandwidth, but must get through).
Inter-Process Communication (clustered computing: high bandwidth, low latency, lossless preferred).
Groundwork for DCB
IEEE 802.1Qaz: ETS & DCBX. Bandwidth allocation to major traffic classes (Priority Groups), plus DCB management protocol.
IEEE 802.1Qbb: Priority PAUSE. Selectively PAUSE traffic on link by Priority Group.
IEEE 802.1Qau: Dynamic Congestion Notification.
IEEE 802.1Qaz Enhanced Transmission Selection
Support at least 3 Priority Groups/traffic classes.
PGs identified by Priority field of existing 802.1Q VLAN Tag.
Configured bandwidth per PG has 1% resolution.
PG15 has limitless bandwidth (use sparingly, for Management).
Work Conservation: if the wire’s free, use it.
ETS Configuration Example
PG0 (Storage): 40% of port b/w
PG1 (Networking): 20% of port b/w
PG2 (IPC): 40% of port b/w
PG15 (mgmt): limitless
If a PG underutilizes, others can fill the space.
Typical implementation: DWRR.
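A deficit weighted round robin (DWRR) scheduler for the 40/20/40 example can be sketched as follows. The function name, the `quantum_per_percent` parameter, and the representation of queues as deques of packet sizes are assumptions made for this illustration; PG15, which is served without a bandwidth limit, is left out.

```python
from collections import deque


def dwrr_schedule(queues, weights, quantum_per_percent=100):
    """Drain all queues and return the transmit order as (pg, packet_size) pairs.

    queues:  {pg: deque of packet sizes in bytes}
    weights: {pg: bandwidth share in percent, must be > 0}
    """
    if any(weights[pg] <= 0 for pg in queues):
        raise ValueError("every queue needs a positive weight")
    deficit = {pg: 0 for pg in queues}
    order = []
    while any(queues.values()):
        for pg, q in queues.items():
            if not q:
                deficit[pg] = 0      # idle queues do not bank credit
                continue
            deficit[pg] += weights[pg] * quantum_per_percent
            while q and q[0] <= deficit[pg]:
                size = q.popleft()
                deficit[pg] -= size
                order.append((pg, size))
            if not q:
                deficit[pg] = 0
    return order
```

With equal backlogs of 1000-byte packets, each round sends 4 packets from PG0, 2 from PG1, and 4 from PG2. Once PG0 and PG2 empty, PG1 gets the whole link, which is the work-conserving behavior described above.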
IEEE 802.1Qbb Priority PAUSE
[Diagram: two link peers (each a switch or endpoint) on a 10 GbE link. The transmitting side has ETS output queues by traffic class (PG0 - Storage, PG1 - Networking, PG2 - IPC, PG15 - Management) feeding a DWRR scheduler. The receiving side has receive buffers by traffic class for the same four PGs, each marked as a lossy buffer or a lossless buffer.]
Priority PAUSE is issued for PG0 only.
Generally, Networking (TCP/IP) should NEVER be PAUSEd.
IEEE 802.1Qau Dynamic Congestion Control
Background
Lossless fabrics are prone to congestion spreading (congestion trees).
Ethernet-FC gateways with their different port speeds (10 GbE; 8 Gbps) are natural bottlenecks.
ETS Work Conservation model adds fuel to the fire.
Solution: switches/endpoints notify traffic sources of incipient congestion, via feedback messages; sources reduce rates accordingly.
Congestion Notification in Action
1. Source endpoint, supporting ‘n’ Congestion Controlled Flows, tags each outbound packet with CCF#.
2. Switch (or dest. endpoint) detects incipient congestion; issues Congestion Notification Message (CNM) back to data source.
3. Source reacts to CNM, reducing tx rate. Source recovers its rate over time and via byte counting.
[Diagram: Data flows from the source to the destination endpoint; the CNM flows back to the source.]
Congestion Control at Endpoint Transmit
[Diagram: transmit path of an IEEE 802.1Qaz/Qau endpoint onto a 10 GbE link. Per-class queues (PG0 - Storage, PG1 - Networking, PG2 - IPC, PG15 - Management) are served by 802.1Qau Congestion Control rate limiters (“Reaction Points”) and the 802.1Qaz ETS DWRR scheduler.]
Rate limiters: typical implementation is byte-based token buckets.
Shallow buckets (2-6 packets) for rapid CNM response.
Shallow queues (2-6 packets) for rapid CNM response.
Incoming Congestion Notification Messages (CNMs): “Slow Down!”
CNMs only slow down RPs. Rate recovery is internal (byte- and time-based).
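A Reaction Point built on a byte-based token bucket might look like the sketch below. It is a simplified model, not the 802.1Qau algorithm: the class name `ReactionPoint`, the multiplicative `decrease` factor passed to `on_cnm`, and the minimum-rate floor are assumptions, and rate recovery is left out entirely.

```python
class ReactionPoint:
    """Token-bucket rate limiter that slows down when a CNM arrives."""

    def __init__(self, rate_bps: float, bucket_bytes: int,
                 min_rate_bps: float = 1000.0):
        self.rate_bps = rate_bps
        self.capacity = bucket_bytes   # shallow: a few packets' worth
        self.tokens = float(bucket_bytes)
        self.min_rate_bps = min_rate_bps
        self.last = 0.0                # time of the last refill, in seconds

    def _refill(self, now: float):
        earned = (now - self.last) * self.rate_bps / 8
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now

    def try_send(self, nbytes: int, now: float) -> bool:
        """Consume tokens and return True if the packet may go now."""
        self._refill(now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

    def on_cnm(self, decrease: float, now: float):
        """Cut the rate by the given fraction (0 < decrease < 1)."""
        self._refill(now)              # settle tokens earned at the old rate
        self.rate_bps = max(self.min_rate_bps,
                            self.rate_bps * (1 - decrease))
```

A shallow bucket limits how large a burst can leave at the old rate after a CNM arrives, which is why small bucket depths give a rapid response.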
FCoE Summary
Presents new, but very familiar, PHY and Link Layers for FC.
Core switching discipline remains FC-SW-5.
Higher FC layers almost completely unchanged (that’s the legacy value!).
Biggest Ethernet-level requirement: lossless fabric.
Part of Converged Ethernet initiative, with lots of ancillary activity at IEEE.
Further Reading
FCoE: www.t11.org
IEEE 802.1Q(az|au|bb): www.ieee.org
Thank you! Questions?