Post on 31-Mar-2015
Exchange Architecture
October 2013Brian DaySenior Program ManagerExchange Customer Advisory Team
Agenda Client Access server role
Mailbox server role
Service Availability Changes
Client Access Server Role
Client Access Server role• Domain-joined machine in the internal Active Directory
forest Thin, stateless (protocol session) server
• Comprised of three components: Client access protocols (HTTP, IMAP, POP) SMTP UM Call Router
• Exchange-aware proxy server Understands requests from different protocols (OWA, EWS, etc.) Supports proxy and redirection logic for client protocols Capable of supporting legacy servers with redirect or proxy logic Contains logic to route specific protocol requests to their destination end-point
Client Access Array• A group of CAS organized in a load-balanced
configuration Designed to work with only TCP session affinity (aka, layer 4 LB) Does not require session affinity (aka, layer 7 LB)
• Provides a unified namespace and authentication Similar to Exchange 2010 in terms of providing a unified endpoint
for client connectivity and authentication
Load Balancer
MDB
HTTP Proxy
IISClient Acces
s
RPC CA
Mailbox
IIS
RPS OWA, EAS, EWS, ECP, OAB
POP, IMAP SMTP UM
POP IMAP
Transport UM
SMTPPOP, IMAPHTTP
MailQ
Client Protocol Architecture in Exchange 2013
RpcProxy
SMTP
SIP
Redirect
SIP + RTP
POP/IMAPOutlook Web App Outlook EAS EAC PowerShell
Outlook Connectivity in Exchange 2013• Exchange 2013 supports RPC/HTTP only; No
RPC/TCP Simplifies the protocol stack Provides an extremely reliable and stable connectivity model because RPC session is always on Mailbox server hosting active copy
Eliminates need for RPC CAS Array namespace(s) Eliminates end user interruptions like “The Exchange administrator has made a change that requires you to quit and restart Outlook” during mailbox moves or *overs
Namespace Simplification
• Exchange 2013 no longer requires multiple namespaces for site resilient solutions or site specific scenarios
• Easy to setup a single, worldwide client access namespace Can be used in coexistence with Exchange 2010
A Single Common Namespace ExampleGeographical DNS Solution
Sue (somewhere in
NA) DNS Resolution
DAG
VIP #1 VIP #2
Sue (traveling in APAC)DNS Resolution via Geo-
DNSRound-Robin between # of VIPs
DAG
VIP #3 VIP #4
mail.contoso.com
Round-Robin between # of VIPs
Third-party MAPI products will need to use RPC/HTTP to connect to CAS 2013
Exchange 2013 will be the last release to support a MAPI/CDO downloadThird parties must move to Exchange Web Services in the future
The MAPI/CDO download will be updated to include support for RPC/HTTP connectivityWill require third-party application configuration; either by programmatically editing a dynamic MAPI profile, or by setting registry keysLegacy environments can continue to use RPC/TCP
Third-Party MAPI Products
10
CAS 2013 Client Protocol Connectivity Flow
Layer 4 LB
CAS
IIS
HTTP Proxy
MBX
Protocol Head
DB
Layer 4 LB
CAS
IIS
HTTP Proxy
MBX
Protocol Head
DB
Site
Boundary
HTTP
Local Proxy Request OWA Cross-Site Redirect Request
MBX
Protocol Head
DB
Site
Boundary
Cross-Site Proxy Request
HTTP
HTTPHTTP HTTP
11
CAS
IIS
HTTP Proxy
HTTP
CAS 2013 Client Protocol Connectivity FlowExchange 2010 Legacy Coexistence
Layer 4 LB
CAS 2013
IIS
HTTP Proxy
MBX2013
Protocol Head
DB
Exchange 2010 CAS
Protocol Head
MBX
Store
DB
Site
Boundary
E2010 CAS
Protocol Head
MBX
Store
DB
RPC RPC
Cross-Site Proxy Request
Layer 7 LBOWA
Cross-Site
Redirect Request
HTTP
12
CAS 2013 Client Protocol Connectivity FlowLegacy Coexistence
Protocol E2007 user accessing E2010 namespace
E2007 user accessing E2013 namespace
E2010 user accessing E2013 namespace
Requires Legacy Namespace Legacy Namespace No additional namespaces
OWA • Same AD site: silent or SSO FBA redirect• Externally facing AD site: manual or
silent/SSO cross-site redirect• Internally facing AD site: proxy
Silent redirect in CU2+ to CAS 2007 externally facing URL in-site or cross-site
• Proxy to CAS 2010• Cross-site silent redirect in CU2+, which may redirect
to CAS 2010 or CAS 2013
Exchange ActiveSync
• EAS v12.1+ : Autodiscover & redirect • Older EAS devices: proxy
Proxy to MBX 2013 Proxy to CAS 2010
Outlook Anywhere
Direct CAS 2010 support Proxy to CAS 2007 Proxy to CAS 2010
Autodiscover Exchange 2010 answers Autodiscover query for 2007 User
Exchange 2013 answers Autodiscover query for 2007 User
Proxy to CAS 2010
EWS Uses Autodiscover to find CAS 2007 EWS External URL
Uses Autodiscover to find CAS 2007 EWS External URL
Proxy to CAS 2010
POP/IMAP Proxy Proxy to CAS 2007 Proxy to CAS 2010
OAB Direct CAS 2010 support Proxy to CAS 2007 Proxy to CAS 2010
RPS n/a n/a Proxy to CAS 2010
ECP n/a n/a • Proxy to CAS 2010• Cross-site redirect, which may redirect to CAS 2010 or
CAS 2013
13
Outlook only supports a single RPC Proxy endpoint
If Outlook Anywhere is allowed on the Internet, this may have internal Outlook clients connect to the external firewall for connectivity
To ensure that internal Outlook clients follow the internal pathway, use split-brain DNSForces internal clients to use internal IPForces external clients to use external IP
Split DNSWhat you need to control connectivity flow
14
Benefits of new architecture
Simplifies the network layer
Removes need for RPC CAS Array
Provides deployment flexibility
Front-End Transport Service
Handles all inbound and optionally the outbound external SMTP traffic for the organization, as well as client endpoint for SMTP trafficDoes not replace the Edge Transport Server role
Functions as a layer 7 proxy and has full access to protocol conversation
Will not queue mail locally, and will be completely stateless
Optionally all outbound traffic appears to come from CAS 2013
Listens on TCP25 and TCP587 (two receive connectors)
Front-End Transport Service
17
Front-End Transport Service Architecture
Front-End Transport Pipeline
SMTP SendSMTP Receive
Protocol Agents
SMTP to MBX 2013SMTP from MBX 2013
External SMTP External SMTP
Hub Selector
18
Bifurcation does not occur on Front-End transport (FET), so only one DAG or MBX 2013 is selected, regardless of the number of recipients in a message
FET uses delivery groups: DAG, mailbox, AD site
Server selection within the delivery group is based on recipient type• If message only has a single mailbox recipient, select MBX server within delivery
group based on proximity of AD site• If multiple mailbox recipients, select MBX server in closest delivery group, factoring
in site proximity• If there are no mailbox recipients (DG, MEUs, etc.), select a random MBX 2013,
giving preference to local AD site
Entry Point Routing
19
SMTP Inbound/Outbound Mail FlowInbound Mail Flow1. FET accepts initial SMTP
conversation if source passes connection filtering
2. FET determines the recipient type/location via hub selector
3. Proxies the message to the appropriate destination
Outbound Mail Flow1. MBX 2013 determines if mail
recipient is a remote destination and selects a FET within local site if the FrontEndProxyEnabled parameter on Send Connector is set to $true
2. MBX 2013 connects to FET and initiates SMTP conversation
3. FET proxies outbound connection to appropriate destination
20
The SMTP Front-End Service provides:• Network protection – centralized, load-balanced egress/ingress point for the
organization• Mailbox locator – avoids unnecessary hops by determining the best MBX
2013 to deliver the message• Load-balanced solution for client/application SMTP submissions
Scales based on number of connections – just add more servers
Benefits of SMTP Front-End Service
21
Mailbox Server Role
22
Mailbox Server Role• Server that hosts the components that process,
render and store Exchange data Includes components previously found in separate roles
• Only Client Access servers connect directly to the Mailbox server Clients connect to Client Access servers Note – one exception is UM with RTP
Connectivity to a mailbox is always provided by the protocol instance local to the active database copy
Database Availability Group• Collection of servers that form a unit
of high availability• Boundary for replication and *over• DAG members can be in different
sites• Can have a maximum of 16 Mailbox
servers
MBX1
MBX2
MBX16
Mailbox-related changes
Managed Store
IOPS reductions
Larger mailbox support
Modern public folders
New search infrastructure
Managed Store
Managed Store• Store controller service process
(Microsoft.Exchange.Store.Service.exe) Manages worker process lifetime based on mount/dismount Logs failure item when store worker process problems detected Terminates store worker process in response to “dirty” dismount
during failover
• Store worker process (Microsoft.Exchange.Store.Worker.exe) One process per database, RPC endpoint instance is database GUID Responsible for block-mode replication for passive databases Fast transition to active when mounted Transition from passive active increases ESE cache size 5X
Microsoft Exchange Replication service• MSExchangeRepl.exe
Detecting unexpected database failures Issues mount/dismount operations to Store
Provides administrative interface for management tasks
Initiates failovers on failures reported by ESE, Store and Responders
ESE Cache Management• Algorithm allocates memory for ESE cache for store worker
processes based on RAM (max cache target)• ESE cache allocated to each database (store worker process)
based on number of local database copies and value of MaximumActiveDatabases Static amount of cache allocated to passive and active copies
• Store worker process will only use max cache target when operating as active Passive database allocates 20% of max cache target
• Max cache target computed at service process startup Restart service process when adding/removing copies or changing
maximum active database configuration
ESE Cache Example #1
30
2560
2560
2560
2560
25602560
2560
2560
2560
2560
Per DB cache usage in Megabytes10 Active DBs0 Passive DBs
10 Max Allowed Active DBs
10
0 G
B S
yst
em
Mem
ory
ESE Cache~25GB
Search Foundation
s Cache ~25GB
Other50GB
ESE Cache Example #2
10
0 G
B S
yst
em
Mem
ory
ESE Cache~25GB
Search Foundation
s Cache ~25GB
Other50GB DB1, 2560
DB2, 2560
DB3, 2560
DB4, 2560
DB5, 2560
DB6, 512DB7, 512DB8, 512DB9, 512DB10, 512
Remaining Pool, 14336
Per DB cache usage in Megabytes5 Active DBs5 Passive DBs
10 Max Allowed Active DBs
ESE Cache Example #3
10
0 G
B S
yst
em
Mem
ory
ESE Cache~25GB
Search Foundation
s Cache ~25GB
Other50GB
4267
4267
42674267
4267
853
853
853853
853
Per DB cache usage in Megabytes5 Active DBs5 Passive DBs
5 Max Active DBs Allowed
IOPS Reductions
History of Exchange Storage
Exchange 5.5
9gb, 18gb10k-15k RPM25mb MBX
Exchange 200036gb
10k-15k RPM100mb MBX
Exchange 2003
72-146gb10k-15k RPM250mb MBX
Exchange 2007
300-600gb7200 RPM2GB MBX
Exchange 20102TB
7200 RPM10GB MBX
Exchange 20134-8TB
7200 RPM25GB MBX
Capacity: Drive Sizes increase 800%IO: Drive rotational speeds are the same and/or declining
IO Constrained
Capacity Constrained
35
Storage Evolution Large, unstructured data sets have driven the need
to find alternate storage methods Provide large capacity Reduce cost Increase availability Reduce energy consumption
Benefits end users Benefits admins Examples:
Microsoft TerraServer Whitepaper (2003): http://research.microsoft.com/apps/pubs/default.aspx?id=64151
Adoption: Hotmail, Gmail, Google File System, Yahoo Mail, AOL mail, etc Hadoop: Netflix, Amazon S3, Facebook
Small Business Medium Business Large Business Cloud
IOPS Reductions• Improvements to logical contiguity of store
schema Property blobs are used to store actual message properties Several messages per page means fewer large IOs to retrieve
message properties Use of long-value storage is reduced, though when accessed, large
sequential IOs are used
• Reduction in passive copy IO 100MB checkpoint depth reduces write IO Transaction log code has been refactored for fast failover with deep
checkpoint
E2010 vs. E2013 Performance Comparison* Results based on daily Outlook cached mode Load Generator simulations (10 databases, 1000 users) to measure key metrics used to identify performance improvements/regressions (Beta2 build 466, subject to change)
Online | Cached Modes48 | 76% IOPS reduction (disk IOPS capacity not expected to change)
18 | 41% Average RPC Latency reduction
17 | 34% increase in CPU per RPC processed (offset by additional CPU cores)
~4X increase in store memory overhead (~4GB vs. ~1GB not including ESE cache)
DB IOPS/Mailbox0.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0.65
0.16
E14SP1 E15 Build 466
RPC Average La-tency
Mcycles per RPC packet
Store Memory per Mailbox (MB)
0
0.5
1
1.5
2
2.5
3
3.5
43.99
3.09
0.736420927114487
2.35
3.75
3.16318300602913
E14SP1 E15 Build 46637
IOPS Reductions
Exchange 2003 Exchange 2007 Exchange 2010 Exchange 20130
0.2
0.4
0.6
0.8
1
DB IOPS/Mailbox
IOPS/Mailbox
~95.5% Reduction!
Larger Mailboxes
Support for Larger Mailboxes• Large Mailbox Size is 100
GB+ Aggregate Mailbox =
Primary Mailbox + Archive Mailbox + Recoverable Items
1-2 years of mail (minimum)
• Increase IW productivity• Eliminate or reduce PST
files• Eliminate or reduce third-
party archive solutions• OST size control with
Outlook 2013
Time Items Mailbox Size
1 Day 150 11 MB
1 Month 3300 242 MB
1 Year 39000 2.8 GB
2 Years 78000 5.6 GB
4 Years 156000 11.2 GB
Elimination of Scheduled Maintenance• Recurring maintenance implemented within time-based
assistant (TBA) infrastructure as two assistants: StoreMaintenance: lazy index maintenance, isinteg StoreDirectoryServiceMaintenance: disconnected mailbox expiration
• Workload Management monitors CPU, RPC Latency, and replication health Task execution throttled/deferred when resource pressure exists
• Background ESE database scanning further throttled Based on datacenter disk failure analysis, target to complete background database scan within 4 weeks
(based on multiple databases on 8 TB disks)
• Periodic tasks to generate mailbox quota notification removed Quota notifications generated at logon time
Modern Public Folders
Modern Public Folders• Public folders based on the mailbox
architecture • Single-master model
Hierarchy is stored in a PF mailbox (one writeable) Content can be broken up and placed in multiple
mailboxes The hierarchy folder points to the target content
mailbox• Because it’s a mailbox, it’s in a mailbox
database…thus, High availability achieved through continuous
replication No separate replication mechanism
• Similar administrative features to current PFs No end-user changes
MBX2013
CAS2013
MBX2013
MBX2013
Public logon
Private logon
Public logon
Content Mailbox
Hierarchy Mailbox
Modern Public Folders• 1 - User connects to their home
Public Folder mailbox first, which should be located near their primary mailbox.
• 2- Folder contents live in one specific mailbox for that folder. All content operations are redirected to the mailbox for that folder
• 3 – Folder hierarchy changes are intercepted and written to writeable copy of Public Folder hierarchy
• 4 – All Public Folder mailboxes listen for hierarchy changes and update similar to Outlook clients
• 5 - When a Public Folder mailbox gets full, move some folders to a new mailbox
1
2 3 5
4
New Search Infrastructure
New Search Infrastructure
Uses Search Foundation
Significantly improved query performance
Significantly improved indexing performance
Search Foundation Primer
Core
Catalog
CTS
Incoming Documents
FilterWord Break
Content
XForm
MARS Write
r
Incoming Queries
“CTS Flow”
IMSContent XForm
Query
Parse
“IMS Flow”
Res
ults
Content Transformation System Integration Management Service
Mailbox
DB
Idx
Passive
Exchange Search Infrastructure
TransportTransport CTS
MailboxStore
DB
Index Node
Idx
ExSearch
Loca
l Del
iver
y
Reliable
Event
CTS
Read Content
MBX2013
Log
MBX2013
Log
Transport-related Changes
Transport Components• Transport on Mailbox server is three services
Microsoft Exchange Transport - Stateful and handles SMTP mail flow for the organization and performs content inspection
Microsoft Exchange Mailbox Transport Delivery - Receives mail from the Transport service and deliveries to the mailbox database
Microsoft Exchange Mailbox Transport Submission - Takes mail from the mailbox databases and submits to the Transport service
• Transport has the following responsibilities Receives all inbound mail to the organization Submits all outbound mail from the organization Handles all internal message processing such as transport rules, content filtering,
and antivirus Performs mail flow routing Queue messages Supports SMTP extensibility
Transport Service Architecture
Transport Pipeline
SMTP to MBX Transport Submission
SMTP from MBX Transport Delivery
SMTP SMTP
Delivery Agents for other protocols
Submission Queue
Delivery Queue
Delivery Queue
Pickup/Replay
Categorizer
Routing Agents
SMTP Send
SMTP ReceiveProtocol Agents
Mailbox Transport SubmissionMailbox Transport Delivery
Mailbox Transport Component Architecture
Mailbox Transport Pipeline
Store Driver Deliver
MBX Deliver Agents
SMTP SendSMTP Receive
Hub Selector (Router)
Store Driver Submit
MBX Assistants
MBX Submit Agents
MAPI MAPI
Mailbox Store
SMTP to Transport Service
SMTP from Transport Service
Mailbox Transport Component• Two separate services to handle mail submissions
(from the store) and mail delivery (from the Transport service)
• Mailbox Assistant and Store Driver combined• Leverages SMTP (encrypted) for communication
with the Transport component and TCP465 for inbound traffic
• Leverages local RPC for delivery to store• Is stateless and does not have a persistent
storage mechanism
Message Delivery• When receiving a message, the Mailbox
Transport component can either deliver the message or not deliver the message
• If non-delivery is chosen, then the Mailbox Transport component must provide a response back to Transport Retry delivery Generate an NDR Reroute the message
Service Availability Changes
All core Exchange functionality for a given mailbox is served by the MBX 2013 server where that mailbox’s database is currently activatedMailbox access fails over when a database fails over Protocols shift to the server hosting the active database copy
Managed Availability: Internal monitoring and high availability are tied together and can be used to detect and recover from problems as they occur and are discovered
Best copy selection now includes health of services when selecting best copy (best copy and server selection)
Failover time reductions
Service Availability Improvements
56
Managed Availability
DB2
MBX1
DB2
OWAMBX2
DB1 DB2
OWAMBX3
DB1 DB1DB1
CAS2
OWAOWA
MA: Fa
ilove
r dat
abas
e
OW
A res
tart
com
ple
te
OW
A s
end
OW
A fai
lure
det
ecte
d
OW
A fai
lure
OW
A res
tart
serv
ice
OW
A v
erifi
ed a
s
heal
thy
OW
A res
tart
ser
vice
faile
d
OW
A s
end
OW
A fai
lure
det
ecte
d
OW
A fai
lure
OW
A res
tart
serv
ice
OW
A v
erifi
ed a
s
heal
thy
OW
A s
ervi
ce res
tart
s
time
Managed Availability = Monitoring + HA
“Stuff breaks, but the Experience does not”
DB1
DAG
CAS1 CAS2
L4 Load Balancer
Every message is redundantly persisted before its receipt is acknowledged to the sender
Delivered messages are kept redundant in transport, similar to active messages
Every DAG represents a transport HA boundary and owns its HA implementationIf you have a stretched DAG, you also have transport site resilience
Resubmits due to transport DB loss or MDB *over are fully automatic and do not require any manual involvement
Transport High Availability Improvements
58
Same fundamental concept as in Exchange 2010, with new implementation in Exchange 2013
All mail is made redundant on a another server
Shadow messages are queued until Primary server successfully delivers the mail
Shadow server regularly heartbeats Primary server for status on the primary copy
On Primary server failure, Shadow server self-promotes itself as the Primary and delivers mail
Shadow Redundancy
59
New transport configuration – RejectOnShadowFailure ensures that no message is acknowledged and accepted unless a shadow copy was first created
Messages are made redundant on other servers within a DAG, stamp group, or site
Messages are tried for a configurable amount of time before giving up and rejecting the message
Guaranteed Redundancy
60
Introduced in Office 365 to redundantly store all mail for a configured time span to protect against mailbox irrecoverable failures
Now has a “shadow” equivalent and is no longer a Single Point of Failure (SPOF)
Consolidates and improves Exchange 2010 Transport Dumpster functionalitySafetyNet retains data for a set period of time, regardless of whether the message has been successfully replicated to all database copies or delivered to final destination
Processes replay requests from “primary” or “shadow” SafetyNet for lossy mailbox failovers
SafetyNet Enhancements
61
New Building BlocksFacilitates deployments at all scales – from self-hosted small organizations to Office 365Provides more flexibility in namespace management
Simplified upgrade and interoperabilityAll components in a given server upgraded togetherNo need to juggle with CAS <-> MBX <-> HT versions separately
Client Access Server roleSimplifies the network layer – layer 7 solutions are no longer needed!Proxies and authenticates all client protocolsProvides load-balanced SMTP proxy solution for clients, external applications
Key Takeaways
62
Summary Exchange Server 2013 uses Building Blocks to facilitate deployments at all scales – from self-hosted, small organizations to Office 365
Exchange Server 2013 provides you with an architecture that is Flexible, Scalable, and Simpler and helps you reduce costs
Q&A
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.