© 2013 IBM Corporation
HiperSockets for System z
Newest Functions
Alan Altmark, Senior Managing z/VM and Linux Consultant
IBM Systems Lab Services and Training
Alexandra Winter, HiperSockets Architect
IBM System z Firmware Development
Session 13206
Agenda
Review
Features and functions
– QDIO ASSIST
– Layer 2 vs Layer 3
– VM vSwitch HiperSockets Bridge
– System z Network Virtualization Manager
– IEDN
– Completion Queues
– …
Where to find more information
System z Networking Review
[Figure: example configuration with LPARs LP-A1 to LP-A8 running z/OS, z/VM, and Linux (including z/VM guests guest-B1 and guest-B2), connected through hs0 and eth0 interfaces to HiperSockets1, HiperSockets2, HiperSockets3, OSA1, OSA2, a VM VSwitch, and a bridge.]
IBM System z - zEC12
[Figure: zEnterprise overview. A System z host (PR/SM, Support Element, z/OS, z/VM, z/TPF, z/VSE, Linux on System z) and a zBX (zEnterprise BladeCenter Extension) with select IBM blades (Linux on System x, Windows on System x, AIX on POWER7) and optimizers (DataPower XI50z) are managed from the System z Hardware Management Console (HMC) with Unified Resource Manager. They are connected by the private high-speed data network (IEDN), the private management network (INMN, information only), and the customer network.]
HiperSockets Features and functions
What is available to you?
What is new?
Dedicated QDIO devices for VM guests
QDIO ASSIST / QEBSM (also for OSA, FCP)
– Interface definition with the z/VM hypervisor (1:1 mapping of virtual devices to real devices)
– Support in the guest OS is required; available in Linux on System z and z/VSE
– Direct pass-through for data transfer, without interception by the z/VM hypervisor
– Delivery of interrupts to the z/VM guest without interception by the z/VM hypervisor
Source: HiperSockets Implementation Guide, www.redbooks.ibm.com
Layer 3 versus Layer 2
A HiperSockets VNIC can be defined by the device driver either as a Layer 2 device (MAC addressing, Ethernet frames) or as a Layer 3 device (IPv4 or IPv6)
L2 and L3 devices can be defined on the same channel, but they cannot communicate with each other!
Only L2 devices can be activated on IQDX / IEDN and External Bridge channels
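The Layer 2 / Layer 3 rule above can be illustrated with a small Python model. This is only a sketch: the `Vnic` class, its field names, and the CHPID value `F0` are hypothetical and do not correspond to any real driver interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vnic:
    """Hypothetical model of a HiperSockets VNIC (names are illustrative)."""
    name: str
    chpid: str  # HiperSockets channel the device is attached to
    layer: int  # 2 = MAC addressing, 3 = IPv4 or IPv6


def can_communicate(a: Vnic, b: Vnic) -> bool:
    """Devices talk only on the same channel and in the same transport mode."""
    return a.chpid == b.chpid and a.layer == b.layer


linux_l2 = Vnic("linux1", "F0", 2)
zvm_l2 = Vnic("zvm1", "F0", 2)
zos_l3 = Vnic("zos1", "F0", 3)

same_mode = can_communicate(linux_l2, zvm_l2)    # both Layer 2 on one channel
mixed_mode = can_communicate(linux_l2, zos_l3)   # Layer 2 versus Layer 3
print(same_mode, mixed_mode)
```

In this model the mixed pair shares a channel but still cannot exchange traffic, which is the point of the second bullet.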
Miscellaneous features
Multiple Write
– Exploited by z/OS; sends multiple output buffers at one time
Network Traffic Analyzer
– Set one IQD VNIC in 'promiscuous mode' and get a copy of all traffic on this channel
– Authorization and 'filtering' on the SE are required:
  Which LPAR is authorized to run an NTA?
  Traffic between which LPARs will be sniffed?
– Linux exploitation for tcpdump is available (see the ZSQ03039USEN white paper)
VLAN
– VLAN support is available
– The device driver defines which VLAN this device is allowed to use
– Out-of-band VLAN management only for IQDX (zManager)
Network concentrator
– Linux tool to connect L3 IPv4 HiperSockets to an external network
– See “Linux on System z, Device Drivers, Features, and Commands”, www.ibm.com/developerworks
– See also VM Bridge
z/VM VSWITCH HiperSockets Bridge
Connect HiperSocket LAN to Ethernet LAN without a router
– Same subnet as Ethernet LAN
Full redundancy
– Up to 5 bridges per CPC (CEC)
– Automatic failover with optional failback
– Each bridge can have more than one OSA uplink (typical)
z/VM VSWITCH HiperSockets Bridge
One active bridge per HiperSocket CHPID
[Figure: LPARs LP1 to LP5 share one HiperSocket CHPID; OSA ports connect them to external hosts.]
Path MTU discovery support
– Large frames inside
– Small frames outside
z/VM VSWITCH HiperSockets Bridge
Layer 2 only
– No transport mode conversions
Bridges both IEDN and Customer networks
Only traffic to/from QEBSM NICs will flow over the bridge
Guests QA1, QA2, QA3 and QA4 have real (dedicated) QEBSM connections to the HS CHPID
– Requires almost no z/VM involvement
– Bridged by default (if a bridge is defined)
Guests VA1 and VA2 have virtual NIC connections through VSWITCH A
– Optimum performance for guests that are not deployed with QEBSM on z/VM; eliminates “shadow queue” overhead
– Connectivity to HS and external LAN segments
OSA uplink port: business as usual (BAU)
– No changes in current support
[Figure: CEC X with z/VM LPAR A on PR/SM. VSwitch A (Primary BC) has an OSA uplink port to the external LAN and an HS bridge port to HiperSockets (IQD). Linux guests QA1 to QA4 attach to HiperSockets through QEBSM; Linux VA1 and z/OS VA2 attach through VSwitch A.]
z/VM VSWITCH HiperSockets Bridge
[Figure: CEC X on PR/SM. z/VM LPAR B runs VSwitch B (Primary BC) with Linux guests QB1 to QB4 on QEBSM plus Linux VB1 and z/OS VB2. z/VM LPAR A runs VSwitch A (Secondary BC) with Linux guests QA1 to QA4 on QEBSM plus Linux VA1 and z/OS VA2. Each VSwitch has an OSA uplink port and an HS bridge port. A native z/OS LPAR (Server A) and a native Linux LPAR (Server B) attach to HiperSockets (IQD) through IQD NICs and to the external LAN through OSA / OSD NICs.]
One active bridge port per IQD channel; max of 1 primary and 4 secondary
Native LPARs are not bridged
z/OS uses the concept of converged devices (IEDN only)
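The bridge port rules (one active port per IQD channel, at most 1 primary and 4 secondary, automatic failover with optional failback) can be sketched as a toy Python model. The class names, the `failback` flag, and the rule for choosing among several secondaries are assumptions made for illustration; the slides do not specify how z/VM selects a secondary.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_SECONDARY = 4  # from the slide: max of 1 primary and 4 secondary


@dataclass
class BridgePort:
    name: str
    primary: bool = False
    up: bool = True


@dataclass
class IqdChannel:
    ports: List[BridgePort] = field(default_factory=list)
    active: Optional[BridgePort] = None
    failback: bool = True  # assumption: "optional failback" modeled as a flag

    def add(self, port: BridgePort) -> None:
        if port.primary and any(p.primary for p in self.ports):
            raise ValueError("only one primary bridge port per channel")
        secondaries = sum(not p.primary for p in self.ports)
        if not port.primary and secondaries >= MAX_SECONDARY:
            raise ValueError("at most 4 secondary bridge ports per channel")
        self.ports.append(port)
        self.elect()

    def elect(self) -> None:
        """Keep exactly one active port (selection order is an assumption)."""
        healthy = [p for p in self.ports if p.up]
        if not healthy:
            self.active = None
            return
        primary = next((p for p in healthy if p.primary), None)
        if self.active in healthy and not (self.failback and primary):
            return  # the current port stays active
        self.active = primary or healthy[0]


channel = IqdChannel()
vswitch_b = BridgePort("VSWITCH-B", primary=True)
vswitch_a = BridgePort("VSWITCH-A")
channel.add(vswitch_b)
channel.add(vswitch_a)
after_start = channel.active.name

vswitch_b.up = False          # primary fails: a secondary takes over
channel.elect()
after_failure = channel.active.name

vswitch_b.up = True           # primary returns: failback is enabled here
channel.elect()
after_recovery = channel.active.name
print(after_start, after_failure, after_recovery)
```

With `failback=False` the secondary would remain active after the primary returns, which is one way to read "optional failback".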
z/VM VSWITCH HiperSockets Bridge
DEFINE VSWITCH switch (all the traditional keywords)
ETHERNET BRIDGEPORT RDEV hipersocket_rdev [PRIMARY]
The HiperSocket device must be on a CHPID defined in the IOCP with CHPARM=x4
CP DEFINE CHPID …. EXTERNAL_BRIDGED is available for dynamic I/O
System z Network Virtualization Manager (z196+)
[Figure: LPARs running z/OS and z/VM (with Linux guests guest-B1 and guest-B2) attach through hs0 interfaces to an IQDX vSwitch and through bridges to OSX vSwitches. The Network Virtualization Manager on the HMC / SE provides rules (MAC prefix, VLAN rules, ...), statistics, and IaaS.]
IEDN
– Single flat L2 network
– Connects CECs and zBXs in an ensemble
– Separation via VLANs
– z/VM 6.1 and 6.2
IEDN / IQDX
Only one IQDX channel per CEC
Layer 2 only
VLAN mandatory
Bridged via z/VM bridges to OSX (Linux as z/VM guest) or merged interface with OSX vNIC (z/OS)
Managed by the Network Virtualization Manager (NVM) component of zManager / URM
– MAC address management (prefix)
– VLAN management
– Monitoring
– Definition of z/VM bridges to OSX / IEDN
Completion Queues
HiperSockets messages are sent synchronously, in order, and reliably
If the target has no free input buffers, an error is delivered to the sender
– The sender can retry, but does not know when new target buffers are available
– Performance impact!
OSA can buffer 512 packets. In a high-sharing environment OSA may perform better than HiperSockets; packet buffering may be one reason.
Completion queues:
– Deliver synchronously if possible, asynchronously if necessary
– Messages remain at the sender
– When the target provides free input buffers, messages are delivered and completion messages are reported to the sender
– IBM zEnterprise System 196 (z196) and later
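The difference between the classic behavior and completion queues can be sketched as a toy Python model. This is not how the firmware is implemented: the `Target` and `Sender` classes, the buffer counter, and the status strings are all hypothetical names chosen for illustration.

```python
from collections import deque


class Target:
    """Receiver with a limited number of free input buffers."""

    def __init__(self, free_buffers: int) -> None:
        self.free_buffers = free_buffers
        self.received = []


class Sender:
    def __init__(self, target: Target, use_completion_queue: bool) -> None:
        self.target = target
        self.use_cq = use_completion_queue
        self.pending = deque()   # with completion queues, messages stay here
        self.completions = []    # completion messages reported to the sender

    def send(self, msg: str) -> str:
        # Queue behind earlier pending messages so delivery stays in order.
        if not self.pending and self.target.free_buffers > 0:
            self.target.free_buffers -= 1
            self.target.received.append(msg)
            return "delivered"        # synchronous delivery
        if not self.use_cq:
            return "error"            # classic case: sender must retry blindly
        self.pending.append(msg)
        return "pending"              # asynchronous delivery later

    def on_buffers_freed(self, count: int) -> None:
        """Target provides free input buffers: drain pending messages."""
        self.target.free_buffers += count
        while self.pending and self.target.free_buffers > 0:
            msg = self.pending.popleft()
            self.target.free_buffers -= 1
            self.target.received.append(msg)
            self.completions.append(msg)


classic = Sender(Target(free_buffers=1), use_completion_queue=False)
classic_results = [classic.send("m1"), classic.send("m2")]

target = Target(free_buffers=1)
cq_sender = Sender(target, use_completion_queue=True)
cq_results = [cq_sender.send(m) for m in ("m1", "m2", "m3")]
cq_sender.on_buffers_freed(2)
print(classic_results, cq_results, target.received, cq_sender.completions)
```

In the model, the sender without completion queues gets an error for the second message, while the sender with completion queues keeps the message and learns of its delivery through a completion message.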
Completion Queue exploitation
Exploitation possible per server
– Only the sender needs support
Number of buffered messages is counted by Resource Measurement Facility (RMF) and NVM Monitoring as 'unavailable receive buffers'
Exploited today by z/VM bridge ports
[Figure: z/VM LPARs LP-A4 and LP-A7 with Linux guests guest-B1 and guest-B2 attach through hs0 interfaces to HiperSockets; a z/VM VSwitch with an IQD bridge port connects HiperSockets to OSA.]
Completion Queue exploitation
Exploited today by IUCV Sockets over HiperSockets (Linux, z/VSE)
– Inter-User Communication Vehicle (IUCV) is traditionally provided by z/VM for communication between two z/VM guests in the same z/VM LPAR
– Point-to-point connection
– Used to provide z/VSE Fast Path to Linux (LFP)
IUCV over HiperSockets
– Flow control by completion messages
– Available for z/VM guests and native LPARs
– Available for communication between z/VM guests in different z/VM LPARs
[Figure: z/VSE and Linux, both as native LPARs (LP-A7, LP-A8) and as z/VM guests (guest-B1 and guest-B2 in LP-A4), communicate via IUCV over hs0 interfaces on HiperSockets.]
Functional Matrix
HiperSockets Features                        z/OS  z/VM  Linux  z/VSE
IPv4 Support                                 Yes   Yes   Yes    Yes
IPv6 Support                                 Yes   Yes   Yes    Yes
VLAN Support                                 Yes   Yes   Yes    Yes
Network Concentrator                         No    No    Yes    No
Layer 2 Support                              No    Yes   Yes    No
Multiple Write Facility                      Yes   No    No     No
zIIP Assisted Multiple Write Facility        Yes   No    No     No
HiperSockets NTA (Network Traffic Analyzer)  No    No    Yes    No
Integration with IEDN (IQDX)                 No    Yes   Yes    No
Merged IEDN interfaces (OSX / IQDX)          Yes   No    No     No
Virtual Switch Bridge Support                No    Yes   No     No
IUCV over HiperSockets                       No    No    Yes    Yes
Completion Queue                             No    Yes   No     Yes
HiperSockets CHPARM
Maximum Frame Size (MFS) / Maximum Transmission Unit (MTU):
CHPID Parameter        MFS     max. MTU
CHPARM=0x (default)    16 KB   8 KB
CHPARM=4x              24 KB   16 KB
CHPARM=8x              40 KB   32 KB
CHPARM=Cx              64 KB   56 KB
– Allows optimization per HiperSockets LAN for small packets versus large streams
– MFS == size of 1 input buffer
– MTU defined for the device driver <= max. MTU in CHPARM; the device driver may put multiple frames in a HiperSockets message
Channel flavor:
CHPID Parameter        Usage
CHPARM=x0 (default)    Traditional HiperSockets
CHPARM=x2              HiperSockets for IEDN (IQDX)
CHPARM=x4              HiperSockets for External Bridge
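Read together, the two tables suggest that the first hex digit of CHPARM selects the frame size and the second selects the channel flavor. The Python sketch below decodes a two-digit value on that assumption; `decode_chparm` is a hypothetical helper, not part of any IBM tool, and it accepts only the digit values listed in the tables.

```python
# First hex digit: (MFS in KB, max. MTU in KB), taken from the table above.
SIZE_BY_DIGIT = {
    "0": (16, 8),
    "4": (24, 16),
    "8": (40, 32),
    "C": (64, 56),
}

# Second hex digit: channel flavor, taken from the table above.
USAGE_BY_DIGIT = {
    "0": "Traditional HiperSockets",
    "2": "HiperSockets for IEDN (IQDX)",
    "4": "HiperSockets for External Bridge",
}


def decode_chparm(chparm: str) -> dict:
    """Decode a two-digit CHPARM value such as '84' (illustrative helper)."""
    value = chparm.strip().upper()
    if len(value) != 2:
        raise ValueError("expected two hex digits, for example '84'")
    size_digit, usage_digit = value[0], value[1]
    if size_digit not in SIZE_BY_DIGIT or usage_digit not in USAGE_BY_DIGIT:
        raise ValueError(f"CHPARM={value} is not listed in the tables")
    mfs_kb, max_mtu_kb = SIZE_BY_DIGIT[size_digit]
    return {
        "mfs_kb": mfs_kb,
        "max_mtu_kb": max_mtu_kb,
        "usage": USAGE_BY_DIGIT[usage_digit],
    }


default_channel = decode_chparm("00")
bridge_channel = decode_chparm("84")
print(default_channel)
print(bridge_channel)
```

For example, under this reading `CHPARM=84` would describe an External Bridge channel with 40 KB frames and a maximum MTU of 32 KB.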
More information
“Linux on System z, Device Drivers, Features, and Commands”: www.ibm.com/developerworks
IBM Redbooks: http://www.redbooks.ibm.com
– HiperSockets Implementation Guide, SG24-6816
– IBM System z Connectivity Handbook, SG24-5444
– I/O Configuration Using z/OS HCD and HCM, SG24-7804
– Building an Ensemble Using Unified Resource Manager, SG24-7921
System z HiperSockets web page: http://www.ibm.com/systems/z/hardware/networking/products.html
IBM ATS Technical Documents: http://www.ibm.com/support/techdocs
IBM Information Center: http://www.ibm.com/support/documentation/us/en