Introduction to Windows Cluster
dinesh-moorthy -
1
Introduction to Clustering
2
Prerequisites
Before starting this session, you should understand what fault tolerance and load balancing mean.
3
Industry Definition of Cluster
• Cluster Definition:
– A group of computers and storage devices that work together yet can be accessed as a single system.
• A Cluster provides:
– Distribution of processing load
– Automatic recovery from failure of one or more components in the cluster
4
Availability, Scalability and Manageability
• Availability:
– A measure of the amount of time a system or component performs its specified function.
• Scalability:
– The ability to incrementally add smaller, standard systems as needed to meet overall processing power requirements.
• Manageability:
– The ease of administering a cluster solution, including configuration, updates and/or patches, and new additions.
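The availability definition above can be turned into a simple ratio. The following is a minimal sketch, assuming availability is expressed as uptime divided by total observed time; the function name and the sample figures are illustrative and do not come from the slides.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Return the fraction of observed time the system performed its function."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


# Hypothetical year of operation: 8,760 hours in total, 8.76 of them down.
ratio = availability(8760 - 8.76, 8.76)
print(f"Availability: {ratio:.3%}")
```

Expressing availability as a ratio makes it easy to compare a stand-alone server with a cluster solution over the same observation period.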
5
Availability Overview
[Figure: a four-node cluster solution (Node1, Node2, Node3, Node4) shown before and after Node1 fails.]
• Before Node1 failure: Node1 services web clients; Node4 provides access to the SQL database.
• After Node1 failure: Node2 services web clients; Node4 provides access to the SQL database.
6
Scalability Overview
• Scaling up:
– Scaling up is achieved by adding more resources, such as memory, processors, and disk drives, to a system.
• Scaling out:
– Scaling out adds more systems to share the load. It delivers high performance when the throughput requirements of an application exceed the capabilities of an individual system.
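As a rough illustration of the scale-out case, the number of systems needed can be estimated from the required throughput and the capacity of one system. This is a back-of-the-envelope sketch, assuming load spreads evenly across nodes; the function name and the numbers are hypothetical.

```python
import math


def nodes_needed(required_throughput: float, per_node_capacity: float) -> int:
    """Estimate how many equal nodes are needed, assuming an even load spread."""
    if required_throughput < 0 or per_node_capacity <= 0:
        raise ValueError("throughput must be >= 0 and capacity must be > 0")
    # Round up: a fraction of a node still requires a whole extra system.
    return max(1, math.ceil(required_throughput / per_node_capacity))


# Hypothetical workload: 2,500 requests/s, each node handles 800 requests/s.
print(nodes_needed(2500, 800))
```

The estimate ignores failover headroom; a real deployment would normally add spare capacity so that the remaining nodes can absorb the load of a failed one.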
7
Manageability Overview
The following questions must be answered:
• Setup
– How easy is it to install the cluster solution?
• Configuration
– How easy is it to install applications into the cluster and administer the different aspects of the clustering software?
– How easy is it to dynamically increase or scale up the cluster solution when your business requirements exceed the current capacity?
• Disaster Recovery
– How quickly and easily can you bring the cluster solution back into production after a complete and total disaster?
• Application
– For the applications that you install into the cluster, what additional maintenance and administration is required beyond that of a stand-alone version of the application?
• Application updates
– How easy is it to update the applications when new features or security updates are released?
• Operating system patch management
– How easy is it to update the core OS on which the cluster server runs, or the cluster service itself, when security patches or bug fixes are released?
8
Cluster Solution Benefits
Factors to be considered while planning a Cluster Deployment:
• Cost of hardware
– Cost of the computers or nodes
– Networking devices such as switches or routers
– Shared or external storage (SAN)
• Cost of the Cluster Software Product or Suite
– This covers the OS, the clustering software, and the applications that will run on the cluster.
• Cost of ownership
– The hardware might be cheaper, but it may take more man-hours from your administrative and developer staff to design, implement, and maintain the cluster solution.
9
Cluster Models and Their Configurations
• Active/Active
• Active/Passive
10
Active / Active
[Figure: a two-server cluster. One server runs File/Print Group 1 and has capacity to fail over Group 2; the other server runs File/Print Group 2 and has capacity to fail over Group 1.]
11
Active / Passive
[Figure: a two-server cluster. One server runs File/Print Group 1; the other server is passive and has capacity to fail over Group 1.]
12
Active/Active Configuration
[Figure: Node A and Node B run the Cluster Service and are attached to Disk 1, Disk 2, and the Quorum disk. Group 1 contains \\Engineering and Group 2 contains \\Accounting. Each node has capacity to fail over the other node's group.]
13
Active/Passive Configuration
[Figure: Node A and Node B run the Cluster Service and are attached to Disk 1 and the Quorum disk. Group 1 contains \\Accounting and Disk 1.]
Node A manages virtual server \\Accounting. Node B is configured as a hot spare and will take ownership of \\Accounting if Node A goes offline.
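The active/passive behavior described on this slide can be sketched in a few lines. This is a toy model, not how the Cluster Service is implemented; the `Node` and `Group` classes and the `fail_over` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    online: bool = True


@dataclass
class Group:
    name: str
    owner: Node


def fail_over(group: Group, candidates: list) -> Node:
    """Move the group to the first online candidate if its owner is offline."""
    if group.owner.online:
        return group.owner  # Nothing to do: the current owner is healthy.
    for node in candidates:
        if node.online and node is not group.owner:
            group.owner = node
            return node
    raise RuntimeError(f"no online node available for {group.name}")


node_a, node_b = Node("Node A"), Node("Node B")
accounting = Group(r"\\Accounting", owner=node_a)

node_a.online = False                      # Node A goes offline.
new_owner = fail_over(accounting, [node_a, node_b])
print(f"{accounting.name} is now owned by {new_owner.name}")
```

In an active/active configuration the same check would run for both groups, with each node acting as the failover target for the other node's group.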
14
Microsoft Technologies for Clustering
• Two Microsoft technologies for clustering:
– Network Load Balancing (NLB)
– Server Cluster (MSCS)
• NLB and MSCS must be installed on separate machines
• Example
– Front-end NLB servers hosting IIS and communicating with a back-end MSCS cluster for database information
[Figure: Client, NLB hosting IIS, MSCS hosting the database]
15
Microsoft Windows 2003 Server Cluster (MSCS)
Additional Capabilities provided by MSCS
• Every node has full connectivity and communication with the other nodes in the cluster through the following:
– One or more shared SCSI, iSCSI or Fibre Channel buses for block-level storage.
– A private network, or interconnect, that carries only internal cluster communication.
– One or more public networks.
• Every node in the cluster is aware when another system joins or leaves the cluster.
• Every node in the cluster is aware of the resources that are running locally as well as the resources that are running on all other cluster nodes.
16
Server Cluster and NLB Comparison
Server Cluster:
• Used for databases, e-mail services, line of business (LOB) applications, and custom applications
• Included with Windows Server 2003, Enterprise Edition, and Windows Server 2003, Datacenter Edition
• Provides high availability, scalability and server consolidation
• Can be deployed on a single network or geographically distributed
• Supports clusters up to eight nodes
• Requires the use of shared or replicated storage
NLB:
• Used for network services such as Web servers, FTP servers, firewalls, and other networking services
• Included with all four versions of Windows Server 2003
• Provides high availability and scalability
• Generally deployed on a single network but can span multiple networks if properly configured
• Supports clusters up to 32 nodes
• Does not require any special hardware or software and works “out of the box”
17
Microsoft Server Cluster Terminology and Definitions (1)
CLUSTER: A group of independent network servers that present themselves to a network as a single system.
NODE: A cluster node is a Microsoft Windows 2003 Server system that has a working installation of the Cluster service.
RESOURCES: Resources are physical or logical entities, such as a file share, that are managed by the Cluster service.
RESOURCE STATES: All resources can have the following states: Online, Offline, Online pending, Offline pending and Failed.
RESOURCE DEPENDENCIES: A dependency is a two-way association between resources.
GROUPS: Groups are collections of resources that need to be managed as a single unit for configuration and recovery purposes.
FAILOVER: Failover is the process of moving a group of resources from one node to another in the case of a failure or for administrative tasks.
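The resource states and dependencies defined above can be modeled with a small sketch. It assumes, as an interpretation not spelled out on the slide, that a resource can come online only after the resources it depends on are online. The sample group and the `bring_group_online` function are hypothetical; `graphlib` requires Python 3.9 or later.

```python
from enum import Enum
from graphlib import TopologicalSorter


class ResourceState(Enum):
    ONLINE = "Online"
    OFFLINE = "Offline"
    ONLINE_PENDING = "Online pending"
    OFFLINE_PENDING = "Offline pending"
    FAILED = "Failed"


def bring_group_online(dependencies: dict):
    """Bring every resource online, dependencies first.

    `dependencies` maps a resource name to the set of resources it depends on.
    Returns the start order and the final state of each resource.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    states = {name: ResourceState.OFFLINE for name in order}
    for name in order:
        states[name] = ResourceState.ONLINE_PENDING  # Start-up in progress.
        states[name] = ResourceState.ONLINE          # Start-up succeeded.
    return order, states


# Hypothetical file-share group managed as a single unit.
group = {
    "File Share": {"Network Name", "Physical Disk"},
    "Network Name": {"IP Address"},
}
order, states = bring_group_online(group)
print(order)
```

Because a group is managed as one unit, failing it over to another node amounts to taking these resources offline in reverse order on the old node and repeating the same start order on the new one.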
18
Microsoft Server Cluster Terminology and Definitions (2)
VIRTUAL SERVER: Groups that contain an IP Address resource and a Network Name resource and appear as individual servers to clients.
CLUSTER NETWORK: All nodes must have a network link between them that they can use to communicate with each other.
SHARED DISKS: The shared disks are logical devices that all the cluster nodes are attached to via the shared bus.
QUORUM RESOURCE: The quorum resource maintains the configuration data necessary for recovery of the cluster.
CLUSTER SERVICE: Cluster service is the collection of software on each node that manages all cluster-specific activity.
CLUSTER-AWARE AND CLUSTER-ENABLED APPLICATIONS: A “cluster-aware” application is any application that has been designed to function on a cluster and ships with a resource DLL.
FAILBACK: Failback is the process of returning a group of resources to the node on which it was running before a failover occurred.
19
New Cluster Setup Features (1)
Installed by Default: Clustering is installed by default, which reduces administrative overhead and does not require a reboot.
Node Eviction: Node eviction does not require a reboot. This results in increased availability and easier disaster recovery when there is a node failure.
Rolling Upgrades: Allows other nodes in the cluster to function while a node's OS is upgraded to a newer version.
Queued Changes: The cluster service can queue up changes that need to be completed if a node is offline.
Simpler Un-installation: Uninstalling Cluster Service from a node is now a one-step process of evicting the node.
Remote Administration: Remote Administration allows full remote creation and configuration of the server cluster.
20
New Cluster Setup Features (2)
Pre-configuration Analysis: A pre-configuration analysis ensures that any known incompatibilities are detected prior to configuration.
Multi-Node Addition: Installation of cluster service now allows multiple nodes to be added to a server cluster in a single operation.
Quorum Selection: The Quorum Resource is automatically configured on the smallest disk that is larger than 50 MB and formatted with NTFS.
Local Quorum: If a node is not attached to a shared disk, it is automatically configured with a "Local Quorum" resource.
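The quorum selection rule on this slide (smallest disk that is larger than 50 MB and formatted with NTFS) can be written as a short filter. This sketch only mirrors the stated rule; the `Disk` class, the function name, and the sample disks are hypothetical and are not part of any Microsoft API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Disk:
    name: str
    size_mb: int
    filesystem: str


def pick_quorum_disk(disks: list) -> Optional[Disk]:
    """Return the smallest NTFS disk larger than 50 MB, or None if none qualify."""
    eligible = [
        d for d in disks
        if d.size_mb > 50 and d.filesystem.upper() == "NTFS"
    ]
    return min(eligible, key=lambda d: d.size_mb, default=None)


# Hypothetical shared disks visible to the node during setup.
shared_disks = [
    Disk("Disk 1", 40_000, "NTFS"),
    Disk("Disk 2", 512, "NTFS"),
    Disk("Disk 3", 40, "NTFS"),     # Too small: not larger than 50 MB.
    Disk("Disk 4", 100, "FAT32"),   # Wrong file system.
]
chosen = pick_quorum_disk(shared_disks)
print(chosen.name if chosen else "Local Quorum")
```

When no shared disk qualifies, the sketch returns `None`, which corresponds loosely to the "Local Quorum" case on the slide.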
21
Administrative Enhancements
Password Change: In Windows Server 2003, you can change the Cluster Service account password without having to take the cluster offline.
Enhanced Node Failover: Cluster Service now includes enhanced logic for group failover when you have a cluster with three or more nodes.
Group Affinity Support: Group Affinity Support allows an application to describe itself as an N+I application (N active nodes and I “spare” nodes).
WMI Support: WMI allows server clusters to be managed as part of an overall WMI environment.
Resource Deletion: Resources can be deleted in Cluster Administrator or with Cluster.exe without taking the resources offline first.
22
Supporting and Troubleshooting Enhancements (1)
Software Tracing: Software Tracing is a new method for debugging that allows debugging the Cluster Service without loading checked build versions of the DLLs.
Event Log: The use of Event Log allows event log parsing and management tools to be used to track successful failovers rather than just catastrophic failures.
Clcfgsrv.log: During configuration of Cluster Service, a separate setup log (%SystemRoot%\system32\Logfiles\Cluster\ClCfgSrv.log) is created to assist in troubleshooting.
Chkdsk logging: The use of the Chkdsk utility enables easier monitoring and troubleshooting.
23
Supporting and Troubleshooting Enhancements (2)
Cluster.log (new info): The cluster.log file now adds logging levels (ERR, INFO, WARN) to entries in the log, making it easier to locate problem sections in the log.
Cluster.obj: The cluster.obj file eliminates the need to open the registry to find the friendly name of a resource.
Offline/Failure Reason Codes: The Offline/Failure Reason Codes allow an application to behave differently depending on whether the application itself has failed or some dependency of the application has failed.
Clusdiag: The Cluster diagnostic tool greatly assists in the analysis of cluster logs by capturing the Cluster.log file from each node.
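The logging levels added to cluster.log make simple filtering possible. The sketch below assumes each entry carries exactly one of the level tokens ERR, WARN, or INFO after its timestamp; the sample lines are invented for illustration, and the real log layout should be checked against an actual cluster.log before relying on this.

```python
import re

# The three levels named on the slide, matched as whole words.
LEVEL = re.compile(r"\b(ERR|WARN|INFO)\b")


def entries_at_level(lines, level: str) -> list:
    """Return the log lines whose first level token equals `level`."""
    wanted = []
    for line in lines:
        match = LEVEL.search(line)
        if match and match.group(1) == level:
            wanted.append(line)
    return wanted


# Invented sample entries; the field layout is an assumption.
sample = [
    "000003e8.000004b0::2003/05/12-10:15:22.123 INFO [FM] Group brought online",
    "000003e8.000004b4::2003/05/12-10:15:23.456 WARN [NM] Interconnect heartbeat delayed",
    "000003e8.000004b8::2003/05/12-10:15:24.789 ERR  [RM] Resource failed to come online",
]
for entry in entries_at_level(sample, "ERR"):
    print(entry)
```

Filtering on ERR first and widening to WARN only when needed keeps the amount of log text to read small when several nodes' logs are collected together.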
24
Disaster Recovery Enhancements
• NT-Backup / ASR
• Confdisk and Clusterrecovery
25
Confdisk
Confdisk.exe is a tool that can be used to recover failed disks in a cluster. It must be used in conjunction with the Cluster Recovery and Cluster Administrator tools.
26
Clusterrecovery
27
Microsoft Windows Server Cluster
Benefits of Microsoft Clusters:
• Support for automatic recovery of services in the event of failure of one or more computers within the cluster.
• Provision of data consistency across all nodes in the cluster.
• Standard, cross-platform application programming interface (API) for developing and supporting “cluster-aware” and “cluster-enabled” applications.
• Standard set of clustering services for clusters from many different hardware vendors.
• Increased scalability by allowing new components to be added as system load increases without taking existing cluster services offline.
• Ability to allow administrators to manage a cluster as a single system and to manage applications as if they were running on a single server.
• Improves the availability of client/server applications by increasing the availability of server resources.
• By clustering existing hardware with new computers, you protect your investment in both hardware and software: instead of replacing an existing computer with a new one of twice the capacity, you can simply add another computer of equal capacity.
28
Additional References
The following Microsoft articles provide information on Cluster, SAN and Disk Management.
• http://technet.microsoft.com/en-us/library/aa996161%28v=exchg.65%29.aspx
• http://blogs.technet.com/b/askcore/archive/2007/11/12/so-what-does-cluster-recovery-actually-recover-anyway.aspx
• http://support.microsoft.com/kb/323437
• 280297: How to Configure Volume Mount Points on a Clustered Server
• 304736: How to Extend the Partition of a Cluster Shared Disk
• 301647: Cluster Service improvements for Storage Area Networks (SANs)
Q & A
Thank you