High Availability in
Scott Schnoll
Principal Technical Writer
Microsoft Corporation
UNC313
Agenda
• Exchange 2010 High Availability Vision/Goals
• Exchange 2010 High Availability Features
• Exchange 2010 High Availability Deep Dive
• Deploying Exchange 2010 High Availability Features
• Transitioning to Exchange 2010 High Availability
• End-to-End Availability Improvements
• High Availability Design Examples
Exchange 2010 High Availability Vision/Goals
Exchange 2010 High Availability Vision and Goals
Vision: Deliver a fast, easy to deploy and operate, economical solution that can provide messaging service continuity for all customers
Goals
• Deliver a solution for high availability and site resilience that is native to Exchange
• Enable less expensive and less complex storage
• Simplify administration and reduce support costs
• Increase end-to-end availability
• Support Exchange Server 2010 Online
Exchange 2010 High Availability Solution
• Unified technology for high availability and site resilience
• New framework for creating highly available mailboxes
• Evolution of continuous replication technology
• Can be deployed on a range of storage options
• Native to Exchange; not bolted onto the side
Exchange Server 2003
[Diagram: Outlook, OWA, ActiveSync, or Outlook Anywhere clients; Front End Server; NodeA (active) and NodeB (passive) hosting DB1-DB6; Standby Cluster; sites San Jose and Dallas]
• Third-party data replication needed for site resilience
• Complex site resilience and recovery
• Clustering knowledge required
• Failover at Mailbox server level
• Clustered Mailbox Server had to be created manually
Exchange Server 2007
[Diagram: Outlook, OWA, ActiveSync, or Outlook Anywhere clients; Client Access Server; NodeA (active) and NodeB (passive) in a CCR pair hosting DB1-DB6; SCR to a Standby Cluster; sites San Jose and Dallas]
• No GUI to manage SCR
• Complex activation for remote server/datacenter
• Clustering knowledge required
• Failover at Mailbox server level
• Clustered Mailbox Server can’t co-exist with other roles
Exchange Server 2010
[Diagram: Client; Client Access Server; Mailbox Server 1 through Mailbox Server 6 hosting copies of DB1-DB5; sites San Jose and Dallas]
• Failover managed by/with Exchange
• Database level failover
• Easy to extend across sites
• All clients connect via CAS servers
Exchange 2010 High Availability Features
Exchange 2010 High Availability Feature Names
• Mailbox Resiliency – Name of Unified High Availability and Site Resilience Solution
• Database Availability Group – A group of up to sixteen mailbox servers that host a set of replicated databases
• Mailbox Database Copy – A mailbox database (.edb file and logs) that is either active or passive
• Database Mobility – The ability of a single mailbox database to be replicated to and mounted on other mailbox servers
Exchange 2010 High Availability Feature Names
• RPC Client Access service – A Client Access server feature that provides a MAPI endpoint for Outlook clients
• Shadow Redundancy – A transport feature that provides redundancy for messages for the entire time they are in transit
• Incremental Deployment – The ability to deploy high availability/site resilience after Exchange is installed
• Exchange Third Party Replication API – An Exchange-provided API that enables use of third-party replication for a DAG in lieu of continuous replication
Exchange 2010 High Availability Terminology
• High Availability – Solution must provide data availability, service availability, and automatic recovery from failures
• Disaster Recovery – Process used to manually recover from a failure
• Site Resilience – Disaster recovery solution used for recovery from site failure
• *over – Short for switchover/failover; a switchover is a manual activation of one or more databases; a failover is an automatic activation of one or more databases after a failure
Exchange 2010 *overs
• Within a datacenter
  • Database or server *overs
• Datacenter level: switchover
• Between datacenters
  • Database or server *overs
• Assumptions:
  • Each datacenter is a separate Active Directory site
  • Each datacenter has live, active messaging services
  • Standby datacenter must be active to support single database *over
Exchange 2007 Concepts Brought Forward
• Extensible Storage Engine (ESE)
• Databases and log files
• Continuous Replication
  • Log shipping and replay
  • Database seeding
  • Store service/Replication service
  • Database health and status monitoring
  • Divergence
  • Automatic database mount behavior
• Concepts of quorum and witness
• Concepts of *overs
Exchange 2010 Deprecated Concepts
• Storage groups
• Databases identified by the server on which they live
• Server names as part of database names
• Clustered Mailbox Servers
• Pre-installing a Windows Failover Cluster
• Running setup in Clustered Mode
• Moving a CMS network identity between servers
• Shared storage
• Two HA copy limits
• Private and public networks
Exchange 2010 HA Fundamentals
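• Database Availability Group
• Server
• Database
• Database Copy
• Active Manager
• RPC Client Access
[Diagram: a DAG containing servers (SVR), each running Active Manager (AM) and hosting database copies; RPC Client Access (RPC CAS) servers; databases (DB)]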
Database Availability Group (DAG)
• Base component of high availability and site resilience
• A group of up to 16 servers that host a set of replicated databases
• “Wraps” a Windows Failover Cluster
  • Manages membership (DAG member = node)
  • Provides heartbeat of DAG member servers
  • Active Manager stores data in cluster database
• Defines a boundary for:
  • Mailbox database replication
  • Database and server *overs
  • Active Manager
Active Manager
• Exchange component that manages *overs
• Runs on every server in the DAG
• Selects best available copy on failovers
• Is the definitive source of information on where a database is active
  • Stores this information in cluster database
  • Provides this information to other Exchange components (e.g., RPC Client Access and Hub Transport)
• Two Active Manager roles: PAM and SAM
Active Manager
• Primary Active Manager (PAM)
  • Runs on the node that owns the cluster group
  • Gets topology change notifications
  • Reacts to server failures
  • Selects the best database copy on *overs
• Standby Active Manager (SAM)
  • Runs on every other node in the DAG
  • Responds to queries about which server hosts the active copy of the mailbox database
• Both roles are necessary for automatic recovery
  • If Replication service is stopped, automatic recovery will not happen
Active Manager: Selection of Active Database Copy
Active Manager selects the “best” copy to become active when the existing active copy fails
• Ignores servers that are unreachable or on which activation is temporarily or regularly blocked
• Sorts copies by currency to minimize data loss
• Breaks ties during sort based on Activation Preference
• Selects from the sorted list based on the copy status of each copy
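The filter-and-sort steps above can be sketched in a few lines of Python. This is only an illustrative model, not Exchange code: the `DatabaseCopy` class, its field names, and the use of copy queue length as the measure of "currency" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DatabaseCopy:
    server: str
    reachable: bool             # can the hosting server be contacted?
    activation_blocked: bool    # activation temporarily or regularly blocked
    copy_queue_length: int      # assumed proxy for currency: fewer pending logs = more current
    activation_preference: int  # lower number = more preferred


def order_candidates(copies):
    """Drop ineligible copies, then sort by currency with preference as tie-breaker."""
    eligible = [c for c in copies if c.reachable and not c.activation_blocked]
    return sorted(eligible, key=lambda c: (c.copy_queue_length, c.activation_preference))


copies = [
    DatabaseCopy("EXMBX2", True, False, 4, 2),
    DatabaseCopy("EXMBX3", True, False, 0, 3),
    DatabaseCopy("EXMBX4", False, False, 0, 1),  # unreachable, ignored
    DatabaseCopy("EXMBX5", True, True, 0, 1),    # activation blocked, ignored
    DatabaseCopy("EXMBX6", True, False, 0, 4),
]
ordered = [c.server for c in order_candidates(copies)]
print(ordered)
```

EXMBX3 and EXMBX6 are equally current in this sample, so the lower Activation Preference number puts EXMBX3 first; EXMBX2 follows because it is further behind.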
Active Manager: Selection of Active Database Copy
Active Manager selects the “best” copy to become active when the existing active copy fails
Every criteria set requires copy status Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource.

Set   Catalog    CopyQueueLength   ReplayQueueLength
1     Healthy    < 10              < 50
2     Crawling   < 10              < 50
3     Healthy    (not checked)     < 50
4     Crawling   (not checked)     < 50
5     (any)      (not checked)     < 50
6     Healthy    < 10              (not checked)
7     Crawling   < 10              (not checked)
8     Healthy    (not checked)     (not checked)
9     Crawling   (not checked)     (not checked)
10    (any)      (not checked)     (not checked)
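One way to read the ten criteria sets above is as an ordered list of progressively looser filters. The following Python sketch assumes that Active Manager tries set 1 against every candidate in sorted order, then set 2, and so on; that evaluation order, the dictionary keys, and the function names are assumptions for illustration rather than documented Exchange behavior.

```python
ACTIVATABLE_STATUSES = {
    "Healthy",
    "DisconnectedAndHealthy",
    "DisconnectedAndResynchronizing",
    "SeedingSource",
}

# (required catalog state, CopyQueueLength limit, ReplayQueueLength limit)
# None means the criteria set does not check that value.
CRITERIA_SETS = [
    ("Healthy", 10, 50),
    ("Crawling", 10, 50),
    ("Healthy", None, 50),
    ("Crawling", None, 50),
    (None, None, 50),
    ("Healthy", 10, None),
    ("Crawling", 10, None),
    ("Healthy", None, None),
    ("Crawling", None, None),
    (None, None, None),
]


def meets(copy, criteria):
    """Check one copy (a dict) against one criteria set."""
    catalog, copy_queue_limit, replay_queue_limit = criteria
    if copy["status"] not in ACTIVATABLE_STATUSES:
        return False
    if catalog is not None and copy["catalog"] != catalog:
        return False
    if copy_queue_limit is not None and copy["copy_queue"] >= copy_queue_limit:
        return False
    if replay_queue_limit is not None and copy["replay_queue"] >= replay_queue_limit:
        return False
    return True


def select_best_copy(sorted_copies):
    """Return (copy, criteria set number) for the first match, or (None, None)."""
    for number, criteria in enumerate(CRITERIA_SETS, start=1):
        for copy in sorted_copies:
            if meets(copy, criteria):
                return copy, number
    return None, None


candidates = [
    {"server": "EXMBX2", "status": "Healthy", "catalog": "Crawling",
     "copy_queue": 3, "replay_queue": 80},
    {"server": "EXMBX3", "status": "Healthy", "catalog": "Healthy",
     "copy_queue": 20, "replay_queue": 5},
]
best, set_number = select_best_copy(candidates)
print(best["server"], set_number)
```

In this sample neither copy satisfies sets 1 or 2, and EXMBX3 is the first to satisfy set 3 (healthy catalog, short replay queue), so it is chosen even though EXMBX2 sorts first.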
Example: Database Failover
• Database failure occurs
• Failure item is raised
• Active Manager moves active database
• Database copy is restored
• Similar flow within and across datacenters
[Diagram: a DAG of five Mailbox servers (Mailbox Server 1 through 5) hosting copies of DB1-DB5]
Example: Server Failover
• Server failure occurs
• Cluster notification of node down
• Active Manager moves active databases
• Server is restored
• Cluster notification of node up
• Database copies resynchronize with active databases
• Similar flow within and across datacenters
[Diagram: a DAG of five Mailbox servers (Mailbox Server 1 through 5) hosting copies of DB1-DB5]
DAG Lifecycle
• DAG is created initially as an empty object in Active Directory
  • Continuous replication or 3rd party replication using Third Party Replication mode
• When the first Mailbox server is added to a DAG
  • A Windows failover cluster is formed with a Node Majority quorum using the name of the DAG
  • The server is added to the DAG object in Active Directory
  • A cluster name object (CNO) for the DAG is created in the built-in Computers container
  • One or more IP addresses are assigned to the DAG
  • The name and IP address(es) of the DAG are registered in DNS
  • The cluster database for the DAG is updated with info on configured databases, including whether they are locally active (which they should be)
DAG Lifecycle
• When the second and subsequent Mailbox servers are added to a DAG
  • The server is joined to the cluster for the DAG
  • The quorum model is automatically adjusted
    • Node Majority - DAGs with odd number of members
    • Node and File Share Majority - DAGs with even number of members
    • File share witness cluster resource, directory, and share are automatically created by Exchange when needed
  • The server is added to the DAG object in Active Directory
  • The cluster database for the DAG is updated with info on configured databases, including whether they are locally active (which they should be)
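The automatic quorum model adjustment described above depends only on whether the member count is odd or even. A minimal Python sketch of that rule follows; the function name is hypothetical and the code models the rule on the slide, not the cluster service itself.

```python
def quorum_model(member_count):
    """Return the quorum model a DAG would use for a given number of members."""
    if member_count < 1:
        raise ValueError("a DAG cluster needs at least one member")
    if member_count % 2 == 1:
        return "Node Majority"
    return "Node and File Share Majority"


for members in (1, 2, 3, 4):
    print(members, quorum_model(members))
```

Adding a server to a one-member DAG therefore switches the model to Node and File Share Majority, and adding a third switches it back to Node Majority.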
DAG Lifecycle
After servers have been added to a DAG
• Configure the DAG
  • Network Encryption
  • Network Compression
• Configure DAG networks
  • Network subnets
  • Enable/disable MAPI traffic/replication
• Create mailbox database copies
  • Seeding is performed automatically
• Monitor health and status of database copies
• Perform switchovers as needed
DAG Lifecycle
• Before you can remove a server from a DAG, you must first remove all replicated databases from the server
• When a server is removed from a DAG:
  • The server is evicted from the cluster
  • The cluster quorum is adjusted as needed
  • The server is removed from the DAG object in Active Directory
• Before you can remove a DAG, you must first remove all servers from the DAG
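The removal ordering above (database copies first, then servers, then the DAG) can be modeled as two guard checks. The `Dag` class below is a hypothetical in-memory model written for this illustration; it does not call any Exchange API.

```python
class Dag:
    """Toy model of DAG membership used to illustrate the removal preconditions."""

    def __init__(self, name):
        self.name = name
        self.servers = {}  # server name -> set of replicated database names

    def add_server(self, server):
        self.servers.setdefault(server, set())

    def add_copy(self, server, database):
        self.servers[server].add(database)

    def remove_copy(self, server, database):
        self.servers[server].discard(database)

    def remove_server(self, server):
        # A server can leave the DAG only after its replicated databases are removed.
        if self.servers[server]:
            raise RuntimeError(f"{server} still hosts replicated databases")
        del self.servers[server]

    def can_be_removed(self):
        # A DAG can be removed only when it has no member servers left.
        return not self.servers


dag = Dag("DAG1")
dag.add_server("EXMBX1")
dag.add_copy("EXMBX1", "MBXDB1")
try:
    dag.remove_server("EXMBX1")
except RuntimeError as error:
    print("blocked:", error)
dag.remove_copy("EXMBX1", "MBXDB1")
dag.remove_server("EXMBX1")
print("DAG removable:", dag.can_be_removed())
```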
Deploying Exchange 2010 HA Features
Legacy Deployment Steps (CCR/SCC)
1. Prepare hardware, install proper OS, and update
   Extra for SCC: configure storage
2. Build Windows Failover Cluster
   Extra for SCC: configure storage
3. Configure cluster quorum, file share witness, and public and private networks
4. Run Setup in Custom mode and install clustered mailbox server
5. Configure clustered mailbox server
   Extra for SCC: configure disk resource dependencies
6. Test *overs
Legacy Deployment Steps (CCR/SCC) compared with Exchange 2010 Incremental Deployment
Exchange 2010 Incremental Deployment
1. Prepare hardware, install proper OS, and update
2. Run Setup and install Mailbox role
3. Create a DAG and replicate databases
4. Test *overs
Exchange 2010 Incremental Deployment
• Create a DAG
    New-DatabaseAvailabilityGroup -Name DAG1 -FileShareWitnessShare \\EXHUB1\DAG1FSW -FileShareWitnessDirectory C:\DAG1FSW
• Add first Mailbox Server to DAG
    Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EXMBX1 -DatabaseAvailabilityGroupIpAddresses 10.0.0.8
• Add second and subsequent Mailbox Server
    Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EXMBX2
    Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EXMBX2 -DatabaseAvailabilityGroupIpAddresses 10.0.0.8,10.0.1.8
• Add Mailbox Database Copy
    Add-MailboxDatabaseCopy -Identity MBXDB1 -MailboxServer EXMBX3
• Extend as needed
Demo
• Creating a database availability group
• Adding servers to a database availability group
• Add mailbox database copy
• Database switchover
Transitioning to Exchange 2010 High Availability
Transition Steps
• Verify that you meet requirements for Exchange 2010
• Deploy Exchange 2010
• Use Exchange 2010 mailbox move features to migrate
Unsupported Transitions
• In-place upgrade to Exchange 2010 from any previous version of Exchange
• Using database portability between Exchange 2010 and non-Exchange 2010 databases
• Backup and restore of earlier versions of Exchange databases on Exchange 2010
• Using continuous replication between Exchange 2010 and Exchange 2007
Exchange 2010 End-to-End Availability Improvements
Online Move Mailbox
• Supported between Exchange 2010 databases, and between Exchange 2007 SP2 and Exchange 2010 databases
• User can access their mailbox while move is in progress
• Move is performed asynchronously by a new service called the Microsoft Exchange Mailbox Replication Service (MRS), which runs on Client Access servers
[Diagram: E-Mail Client; Client Access Server; Mailbox Server 1 and Mailbox Server 2]
Exchange 2010 End-to-End Availability Improvements
RPC Client Access service
• A new service that establishes an RPC endpoint for client access on the CAS role to replace the existing RPC endpoint on the Mailbox role
  • New RPC endpoint entirely rewritten in managed code
  • Refactored common business logic from Exchange 2007 that overlaps with what is needed by the RPC endpoint
  • Cmdlets, performance counters, etc. to manage and monitor
• Does not replace the RPC endpoint for public folder databases; Outlook clients log on directly to the public folder store to access public folder databases
Exchange 2010 End-to-End Availability Improvements
Shadow Redundancy
• Servers keep “shadow copies” of items until they are delivered to the next hop
• Also helps simplify Hub and Edge Transport server upgrades and maintenance
[Diagram: message path across a Mailbox Server, a Hub Transport server, and two Edge Transport servers, with one hop marked X]
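The idea of holding a shadow copy until the next hop confirms delivery can be illustrated with a small in-memory model. This Python sketch is a simplification written for this transcript: the `ShadowQueue` class, its method names, and the resubmit-on-failure step are assumptions, not the transport service's actual interfaces.

```python
class ShadowQueue:
    """Toy model of a transport server that shadows messages it has relayed."""

    def __init__(self):
        self._shadow = {}  # message id -> (body, next hop name)

    def relay(self, message_id, body, next_hop):
        # Keep a shadow copy while the message is in flight to the next hop.
        self._shadow[message_id] = (body, next_hop)

    def confirm_delivered(self, message_id):
        # The next hop has delivered the message onward; the shadow copy can go.
        self._shadow.pop(message_id, None)

    def resubmit_for(self, failed_hop):
        # If a next hop fails, its undelivered messages are still available here.
        return [
            (message_id, body)
            for message_id, (body, hop) in self._shadow.items()
            if hop == failed_hop
        ]


hub = ShadowQueue()
hub.relay("msg-1", "hello", "EDGE1")
hub.relay("msg-2", "world", "EDGE1")
hub.confirm_delivered("msg-1")
pending = hub.resubmit_for("EDGE1")
print(pending)
```

Only `msg-2` is returned for resubmission, because `msg-1` was confirmed as delivered before the hypothetical EDGE1 failure.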
Exchange 2010 End-to-End Availability Improvements
Transport Dumpster Improvements
• Gets feedback from replication pipeline to let it know when to delete items
  • Once something has been delivered, and the logs for the message are replicated, transport dumpster can delete the message
  • Replay is not required for deleting items from dumpster; only data in dumpster is data that has not yet been replicated
• Responds to requests for redelivery after lossy failover both within its Active Directory site and across Active Directory sites (old site and new site)
Exchange 2010 End-to-End Availability Improvements
Scenario                    Exchange 2010 feature
Site/Server/Disk failure    Exchange 2010 HA
Archiving/Compliance        E-mail Archive
Recover deleted items       Hold Policy

[Diagram: a Database Availability Group of three Mailbox servers (Mailbox Server 1 through 3), each hosting copies of DB1-DB3; X mark]
• Using 3 or more database copies enables you to use replication for your backups
Exchange Server 2010 High Availability Design Examples
High Availability Design Example: Branch/Small Office Design
[Diagram: two servers, each hosting the Client Access, Hub Transport, and Mailbox roles with copies of DB1-DB3, behind a Hardware Load Balancer]
• Member servers of DAG can host other server roles
• 2-server DAGs should use RAID
• 8 processor cores recommended with a maximum of 64 GB RAM
• UM role not recommended for co-location
• Single Site
• 3 Nodes
• 3 HA Copies
• JBOD -> 3 physical Copies
[Diagram: a Database Availability Group of three Mailbox servers (MailboxServer 1 through 3) hosting copies of DB1-DB6; CAS NLB Farm; AD: Dublin; X marks]
• 2 servers out -> manual activation of server 3
• In 3 server DAG, quorum is lost
• DAGs with more servers sustain more failures – greater resiliency
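The quorum loss in the three-server case follows from simple majority arithmetic. The Python sketch below assumes each member holds one vote and that a file share witness adds one vote only when the member count is even, in line with the quorum models described earlier in the deck; the function name and parameters are hypothetical, and real cluster behavior has more nuances than this model captures.

```python
def has_quorum(total_members, members_up, witness_available=True):
    """Return True if the surviving voters form a strict majority."""
    uses_witness = total_members % 2 == 0  # Node and File Share Majority
    total_votes = total_members + (1 if uses_witness else 0)
    votes_present = members_up + (1 if uses_witness and witness_available else 0)
    return votes_present > total_votes // 2


# 3-server DAG: one server down keeps quorum, two servers down loses it.
print(has_quorum(3, 2), has_quorum(3, 1))
# 4-server DAG with witness: two servers down still keeps quorum.
print(has_quorum(4, 2))
```

With three members, a single surviving server holds one of three votes, which is why the remaining server has to be activated manually; a four-member DAG with its witness can lose two servers and still hold three of five votes.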
High Availability Design Example: Double Resilience – Maintenance + DB Failure
• Single Site
• 4 Nodes
• 3 HA Copies
• JBOD -> 3 physical Copies
[Diagram: a Database Availability Group (DAG) of four Mailbox servers (MailboxServer 1 through 4) hosting copies of DB1-DB8; CAS NLB Farm; AD: Dublin; X marks]
• Upgrade server 1
• Server 2 fails
• Server 1 upgrade is done
• 2 active copies die
High Availability Design Example: Double Node/Disk Failure Resilience
Key Takeaways
• Greater end-to-end availability with Mailbox Resiliency
• Unified framework for high availability and site resilience
• Faster and easier to deploy with Incremental Deployment
• Reduced TCO with core ESE architecture changes and more storage options
• Supports large mailboxes for less money
question & answer
Related Content
Breakout Sessions
• UNC316 - Microsoft Exchange Server 2010 Architecture
• UNC321 - Storage in Microsoft Exchange Server 2010
Interactive Theater Sessions
• UNC02-TLC - Designing Microsoft Exchange Server 2010 High Availability Solutions
Hands-on Labs
• UNC12-HOL - Microsoft Exchange Server 2010 High Availability and Storage Scenarios
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS,
IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.