IMS Parallel Sysplex Best Practices - IMS UG May 2013 Helsinki
© 2010 IBM Corporation
Hanseatic Mainframe Summit 2010
Wednesday, 1 September 2010
High Availability Level 2: Parallel Sysplex
Martin Sö[email protected] – 734 42 32
Availability
Partitioned Shared
Coupling Technology
Shared Data - Continuous application availability
Scalability
Shared Data - Non-disruptive incremental growth
Coupling Technology
Partitioned Shared
Basic shared DASD
■ Limited capability
■ Reserve and release against a whole disk
■ Limits access to that disk for the duration of the update
[Figure: two zSeries systems (or LPARs), each running z/OS, attached through channels to shared control units. A real system would have many more control units and devices.]
GRS Ring
■ Global Resource Serialization (GRS) is used to pass information between systems via the CTC ring
■ Request an ENQueue on a dataset, update it, then DEQueue
■ Loosely coupled system
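The ENQ, update, DEQ sequence can be pictured with a small in-process model. This is only a sketch: `ResourceSerializer`, `enq`, `deq`, and the data set names are hypothetical and are not the z/OS ENQ/DEQ interface. It illustrates that serializing on a data set name leaves other data sets on the same disk available, unlike a whole-disk reserve.

```python
import threading
from contextlib import contextmanager


class ResourceSerializer:
    """Toy stand-in for GRS: one exclusive owner per resource name.

    All names here are hypothetical; the real ENQ/DEQ services are z/OS macros.
    """

    def __init__(self):
        self._guard = threading.Lock()   # protects the table of owners
        self._owners = {}                # resource name -> owning system

    def enq(self, resource, system):
        """Try to obtain exclusive ownership; return True on success."""
        with self._guard:
            if resource in self._owners:
                return False             # already held: caller must wait or retry
            self._owners[resource] = system
            return True

    def deq(self, resource, system):
        """Release ownership; only the current owner may release."""
        with self._guard:
            if self._owners.get(resource) != system:
                raise RuntimeError(f"{system} does not hold {resource}")
            del self._owners[resource]

    @contextmanager
    def held(self, resource, system):
        """ENQ, run the update, then DEQ even if the update fails."""
        if not self.enq(resource, system):
            raise RuntimeError(f"{resource} is busy")
        try:
            yield
        finally:
            self.deq(resource, system)


grs = ResourceSerializer()
with grs.held("PROD.PAYROLL.DATA", "SYSA"):
    # SYSB is refused the same data set while SYSA updates it ...
    same_dataset = grs.enq("PROD.PAYROLL.DATA", "SYSB")
    # ... but can still serialize on another data set on the same disk.
    other_dataset = grs.enq("PROD.ORDERS.DATA", "SYSB")
# After SYSA's DEQ, SYSB can obtain the first data set.
released = grs.enq("PROD.PAYROLL.DATA", "SYSB")
```

In a real ring the request would travel between systems over the CTC links; here a single shared table plays that role.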
[Figure: two zSeries systems (or LPARs), each running z/OS, attached through channels to shared control units and linked by two CTC connections. More systems can be added to the CTC "ring".]
Parallel Sysplex
■ This extension of the CTC ring uses a dedicated Coupling Facility to store ENQ data for GRS
■ This is much faster
■ The CF can also be used to share application data such as DB2 tables
■ Can appear as a single system
[Figure: two zSeries systems (or LPARs), each running z/OS, attached through channels to shared control units and through CF channels to a Coupling Facility on a separate system or LPAR.]
Parallel Sysplex
[Figure: Parallel Sysplex. Physical view: systems (labels ESCON, 9672, 711, 511) attached to shared data, a Coupling Facility, and a Sysplex Timer. Logical view: applications on top of coupling technology. Annotations: dynamic workload balancing, data sharing.]
Redundancy in the Parallel Sysplex
[Figure: CEC 1 to CEC N, each with processors CP1 to CPN, attached to two Coupling Facilities, redundant timers, ESCON Director 1 to ESCON Director N, and DASD with Remote Copy. Annotations: duplicated Coupling Facility, redundant timers, redundant links, no single point of failure.]
System z Continuous Availability
Single System
■ Built-in redundancy
■ Capacity Upgrade on Demand
■ Capacity Backup
■ Hot-pluggable I/O
Parallel Sysplex (1 to 32 Systems)
■ Addresses planned / unplanned hardware / software outages
■ Flexible, non-disruptive growth
– Capacity beyond largest CPC
– Scales better than SMPs
■ Dynamic workload / resource management
GDPS (Site 1, Site 2)
■ Addresses site failure and site maintenance
■ Disk / tape Remote Copy
– Eliminates SPOF
– No / some data loss
■ Application independent
Parallel Sysplex
■ Provides for a single system image from several perspectives:
– Data access, allowing dynamic workload balancing and improved availability
– Dynamic Transaction Routing, providing dynamic workload balancing and improved availability
– End-user interface, allowing logon to a logical network entity
– Operational interfaces, allowing easier Systems Management
1 to 32 Systems
Parallel Sysplex
1 to 32 Systems
■ At the heart of a Parallel Sysplex there is at least one Coupling Facility. It is used for three purposes:
– Locking information that is shared among all attached systems
– Cache information (such as for a database) that is shared among all attached systems
– Data list information that is shared among all attached systems
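The three purposes above can be sketched as three small shared tables. This is a toy model under stated assumptions: `CouplingFacilityModel` and its method names are hypothetical and do not correspond to the actual z/OS Coupling Facility services. It only shows what "shared among all attached systems" means for locks, cached data, and lists.

```python
from collections import deque


class CouplingFacilityModel:
    """Toy model of the three kinds of shared information named above.

    Hypothetical names throughout; this is not a real z/OS interface.
    """

    def __init__(self):
        self.locks = {}    # lock information: resource -> owning system
        self.cache = {}    # cache information: page id -> latest shared copy
        self.lists = {}    # list information: list name -> FIFO of entries

    def obtain_lock(self, resource, system):
        """Grant the lock if it is free or already held by this system."""
        owner = self.locks.setdefault(resource, system)
        return owner == system

    def release_lock(self, resource, system):
        """Release the lock if this system is the owner."""
        if self.locks.get(resource) == system:
            del self.locks[resource]

    def write_page(self, page_id, data):
        """Store a changed page so every attached system sees the same copy."""
        self.cache[page_id] = data

    def read_page(self, page_id):
        return self.cache.get(page_id)

    def enqueue(self, list_name, entry):
        """Append an entry to a shared list (for example a shared work queue)."""
        self.lists.setdefault(list_name, deque()).append(entry)

    def dequeue(self, list_name):
        """Take the oldest entry, or None if the list is empty or unknown."""
        entries = self.lists.get(list_name)
        return entries.popleft() if entries else None


cf = CouplingFacilityModel()
got_a = cf.obtain_lock("ACCOUNT.42", "SYSA")    # SYSA gets the lock
got_b = cf.obtain_lock("ACCOUNT.42", "SYSB")    # SYSB is refused
cf.write_page("ACCOUNT.42", {"balance": 100})   # SYSA publishes its change
cf.release_lock("ACCOUNT.42", "SYSA")
seen_by_b = cf.read_page("ACCOUNT.42")          # SYSB reads the shared copy
cf.enqueue("WORKQ", "txn-1")                    # SYSA queues work
taken_by_b = cf.dequeue("WORKQ")                # SYSB picks it up
```

Because every system consults the same tables, none of them needs a private view of who holds what.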
Parallel Sysplex
1 to 32 Systems
■ Scalability (allows granular growth)
■ Flexibility of workload
■ Continuous availability
– Planned outages (of a single system)
  • HW maintenance (repair, upgrade)
  • SW maintenance (patches, fixes, release upgrades) with IPL
– Unplanned outages
  • HW problems
  • SW problems
■ Workload Management – SLA
■ Members (LPARs) of a Parallel Sysplex can reside on one or multiple machines
■ Members (LPARs) of a Parallel Sysplex can reside in one or multiple sites
– Sites can be up to about 100 km apart (cross-site)
■ Members (LPARs) of a Parallel Sysplex can run on different hardware or software releases!
Parallel Sysplex
■ Data sharing and Coupling Facility
Data Integrity in a Parallel Sysplex Cluster
[Figure: two z/OS systems on zSeries, each with a database manager holding locks and data buffers. Both send requests through Sysplex Services to the Coupling Technology, which holds locks (multi-system serialization), directories, and caches (changed data).]
Parallel Sysplex Software Structure
[Figure: Parallel Sysplex software layers. Applications (Appl, Batch Job, WEB) run unchanged on transaction managers (CICS®, IMS™, TSO, Batch, MQ) and data managers (IMS DB, DB2®, RLS, TVS, Oracle, Adaplex, IDMS, Datacom), with VTAM and TCP/IP, on z/OS® base services and hardware interfaces. Annotations: data sharing, dynamic workload balancing, single system image and high-availability connections, OLTP cross-platform.]
CICS Transaction Server
■ ARM Support for Data Sharing Servers (V2.1)
■ Named Counter servers
■ Coupling Facility data table server
■ Shared Temporary Storage server
■ System Managed Duplex & Rebuild support (V2.2)
■ Sign-On Retention for Persistent Sessions (V2.2)
■ DB2 Group Attach (V2.2)
■ Batch logging of VSAM data (SOD for CICSVR)
■ Support for DB2 V8 Restart Light (V2.3)
■ Systems management enhancements (V2.3)
[Figure: two z/OS images, A and B, each with Comm Server, a TOR, AORs, CPSM, and SMSVSAM. Both are reached from the network through VTAM and TCP/IP and share a CF.]
Improved Application Availability
z196 Coupling Links
■ Fanout, not I/O slot, used for InfiniBand
■ ICB-4 – No longer supported
■ ETR – No longer supported
■ All coupling links support STP
■ Sysplex Coexistence – z10 EC and BC and z9 EC and BC only
[Figure: z196 coupling link options. 12x InfiniBand and 12x InfiniBand SDR (HCA2-O fanout), up to 150 meters; 1x InfiniBand DDR (HCA2-O LR fanout), up to 10/100 km; ISC-3 in an I/O drawer or I/O cage (HCA2-C fanout, IFB-MP), up to 10/100 km. Server labels: "z196, z10 EC, z10 BC", "z196, z10 EC, z10 BC, z9 EC, z9 BC", and "z9 EC and z9 BC".]
STP Design Point Concept
■ STP supports configurations up to Stratum 3
■ STP transmits timekeeping information in layers, or strata
– Stratum 1
  • Highest level in the hierarchy of the timing network that uses STP to synchronize to CST
– Stratum 2
  • Server/Coupling Facility (CF) that uses STP messages to synchronize to Stratum 1
– Stratum 3
  • Server/Coupling Facility (CF) that uses STP messages to synchronize to Stratum 2
[Figure: hierarchy with one Stratum 1 server, two Stratum 2 servers, and two Stratum 3 servers.]
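The layering rule can be checked mechanically: a server's stratum is one more than the stratum of the server it synchronizes to, and nothing may go deeper than Stratum 3. The sketch below assumes a hypothetical representation (a map from each server to its time source) and hypothetical server names; it is not an STP configuration interface.

```python
MAX_STRATUM = 3  # the slide states that STP supports up to Stratum 3


def stratum_levels(time_source):
    """Return {server: stratum} for a map of server -> the server it syncs to.

    A server whose source is None is treated as Stratum 1. Raises ValueError
    if a server would end up deeper than MAX_STRATUM or if the map has a loop.
    """
    levels = {}

    def level_of(server, seen=()):
        if server in levels:
            return levels[server]
        if server in seen:
            raise ValueError(f"timing loop involving {server}")
        source = time_source[server]
        level = 1 if source is None else level_of(source, seen + (server,)) + 1
        if level > MAX_STRATUM:
            raise ValueError(f"{server} would be Stratum {level}")
        levels[server] = level
        return level

    for server in time_source:
        level_of(server)
    return levels


# Hypothetical five-server network shaped like the slide's figure.
ctn = {"S1": None, "S2A": "S1", "S2B": "S1", "S3A": "S2A", "S3B": "S2B"}
levels = stratum_levels(ctn)

# A fourth layer is rejected.
try:
    stratum_levels({**ctn, "S4": "S3A"})
    too_deep_rejected = False
except ValueError:
    too_deep_rejected = True
```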
No Support for ETR with z196 – Use Mixed CTN
■ z196 DOES NOT support ETR.
■ It is possible to have a z196 server as a Stratum 2 or Stratum 3 server in a Mixed CTN as long as there are at least two z10s or z9s attached to the Sysplex Timer operating as Stratum 1 servers
■ Two Stratum 1 servers are recommended but not required to provide redundancy and avoid a single point of failure
■ Suitable for a customer planning to migrate to an STP-only CTN.
[Figure: two Sysplex Timers (ETR ID = XX) joined by a CLO link and connected by ETR links to two System z9/z10 Stratum 1 servers. Two z196 Stratum 2 servers attach through CF links. An HMC is connected through an Ethernet switch.]
STP-Only CTN with z196
■ Configuration has to be defined
■ Must assign PTS and CTS
– Optionally assign BTS
  • Strongly recommended to allow near-continuous availability
– Optionally assign Arbiter
  • Recommended for configurations of 3 or more servers/CFs
  • Can improve recovery
■ STP can use coupling links
■ z196 can be any one (or all) of the servers
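The role rules in the bullets above can be expressed as a small check. This is a sketch only: `check_ctn_roles` and the server names are hypothetical, and the function encodes just what the slide states (PTS and CTS required, BTS strongly recommended, Arbiter recommended from three servers/CFs upward), not the full set of rules the HMC enforces.

```python
def check_ctn_roles(servers, pts, cts, bts=None, arbiter=None):
    """Return (errors, warnings) for an STP-only CTN role assignment.

    Hypothetical helper; it encodes only the rules stated on the slide.
    """
    errors, warnings = [], []
    roles = (("PTS", pts), ("CTS", cts), ("BTS", bts), ("Arbiter", arbiter))

    # PTS and CTS must be assigned.
    for role, server in roles[:2]:
        if server is None:
            errors.append(f"{role} must be assigned")

    # Every assigned role must name a server that belongs to the CTN.
    for role, server in roles:
        if server is not None and server not in servers:
            errors.append(f"{role} {server!r} is not in the CTN")

    # Optional roles produce warnings, not errors.
    if bts is None:
        warnings.append("no BTS: strongly recommended for near-continuous availability")
    if arbiter is None and len(servers) >= 3:
        warnings.append("no Arbiter: recommended for 3 or more servers/CFs")
    return errors, warnings


servers = ["z196-A", "z196-B", "z10-EC"]

# Minimal assignment: valid, but both recommendations are flagged.
errors_min, warnings_min = check_ctn_roles(servers, pts="z196-A", cts="z196-A")

# Full assignment: nothing to report.
errors_full, warnings_full = check_ctn_roles(
    servers, pts="z196-A", cts="z196-A", bts="z196-B", arbiter="z10-EC"
)
```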
[Figure: STP-only CTN with three z196 servers, a z10 EC, and a z9 EC, plus an HMC, an Ethernet switch, and an External Time Source (ETS).]
Create a redundant I/O configuration
[Figure: two servers, each with LPAR1, LPAR2, ... LPARn, connected through CSS / CHPID and Directors (switches) to DASD control units.]
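Whether an I/O configuration is really redundant can be tested by looking for components that every path between an LPAR and a control unit has in common. The sketch below assumes a hypothetical, simplified path description (LPAR, CHPID, director, control unit); the identifiers are made up for illustration and this is not an IOCP or HCD format.

```python
from collections import defaultdict


def single_points_of_failure(paths):
    """Find components shared by every path from an LPAR to a control unit.

    `paths` is a list of (lpar, chpid, director, control_unit) tuples.
    Returns {(lpar, control_unit): set of shared components}. A pair that
    does not appear in the result has no CHPID or director common to all
    of its paths.
    """
    by_pair = defaultdict(list)
    for lpar, chpid, director, control_unit in paths:
        by_pair[(lpar, control_unit)].append(
            {("chpid", chpid), ("director", director)}
        )

    result = {}
    for pair, routes in by_pair.items():
        shared = set.intersection(*routes)   # components present on every route
        if shared:
            result[pair] = shared
    return result


# Hypothetical configuration: LPAR1 uses two directors, LPAR2 only one.
paths = [
    ("LPAR1", "10", "DIR1", "CU1"),
    ("LPAR1", "20", "DIR2", "CU1"),
    ("LPAR2", "30", "DIR1", "CU1"),
    ("LPAR2", "31", "DIR1", "CU1"),
]
spof = single_points_of_failure(paths)
```

Here LPAR2 has two CHPIDs but both go through the same director, so that director is reported as the remaining single point of failure.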
IBM HA / CA technology solutions
■ CA at System z level - Parallel Sysplex and Data Sharing:
– A minimum of 2 logical partitions running on 2 different System z boxes, with the same application running in parallel in these LPARs and sharing the data and the control information in a Coupling Facility, together with a front-end dynamic workload balancing mechanism, can bring a high level of availability.
  • Typical example of such an architecture:
[Figure: two System z servers hosting LPAR1, LPAR2, and LPAR3, with a CF and a Sysplex Timer. No SPoF if the application can run everywhere (in any of the 3 partitions, no affinity). Label: IBM System z unique.]
IBM HA / CA technology solutions
Configure the Parallel Sysplex for system availability:
– Run with cloned systems and applications
  • Maintain all systems at the same release and maintenance levels
– Deploy redundant components
– Install enough capacity to accommodate failures
– Exploit Global Resource Serialization (GRS) Star
– Exploit Virtual IP Addressing (VIPA)
– Implement Workload Manager (WLM) policies
– Exploit Automatic Restart Manager (ARM)
– Exploit Sysplex Failure Management (SFM)
– Implement data sharing, transaction routing, queue sharing
– Implement z/OS Health Checker
– Implement data replication
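"Install enough capacity to accommodate failures" can be turned into a simple planning check. The sketch below is an assumption-laden illustration: `survives_failures` is a hypothetical helper, the capacity figures are invented, and it assumes cloned systems so that any surviving system can take over the work.

```python
def survives_failures(capacities, peak_load, failures=1):
    """True if the peak load still fits after the largest systems fail.

    Worst case assumed: the `failures` systems with the most capacity are
    lost, and the work can run on any surviving system (cloned, no affinity).
    """
    keep = max(len(capacities) - failures, 0)
    remaining = sorted(capacities)[:keep]    # drop the largest systems
    return sum(remaining) >= peak_load


# Hypothetical sysplex of three equal systems (arbitrary capacity units).
capacities = [1000, 1000, 1000]

fits_one_loss = survives_failures(capacities, peak_load=1800)            # 2000 left
too_much_load = survives_failures(capacities, peak_load=2200)            # 2000 left
fits_two_losses = survives_failures(capacities, peak_load=1800, failures=2)  # 1000 left
```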
[Figure: z990, z9 EC, and z890 servers with a Coupling Facility and a Sysplex Timer or STP, connected through a FICON switch to shared data replicated with Metro Mirror.]