Monitoring a Dynamics CRM Infrastructure

Monitoring a Microsoft Dynamics CRM Infrastructure Stéphane Dorrekens RealDolmen EXTREMECRM 2014 BARCELONA

Transcript of Monitoring a Dynamics CRM Infrastructure

Slide 1

Monitoring a Microsoft Dynamics CRM Infrastructure
Stéphane Dorrekens, RealDolmen
eXtremeCRM 2014 Barcelona

Monitoring allows
- Knowing what hits you
- Proactivity
- Faster reaction time

And also
- Automatic fixes
- Automatic performance scalability

"Users began seeing these errors on affected services at 11:02 a.m., and at that time our internal monitoring alerted Google's Site Reliability Team. Engineers were still debugging 12 minutes later when the same system, having automatically cleared the original error, generated a new correct configuration at 11:14 and began sending it; errors subsided rapidly starting at this time."
Google Official Blog, 26th January, 2014

(via script or procedures)

Agenda
- Introduction
- Health, Performance and E2E Monitoring
- Microsoft Dynamics CRM Events
- Events and Services Monitoring Demo
- Microsoft Dynamics CRM Performance Counters
- Performance Counters Monitoring and Automatic Scalability Demo
- Microsoft System Center Operations Manager
- Q&A

Health, Performance and End 2 End Monitoring

Health Monitoring
- Detects problems after they have occurred
- Automatic alerting
- Allows faster recovery
- Limits or removes business impact

Performance Monitoring
- Detects degradation
- Automatic alerting
- Preemptive fix or scalability

End-to-End Monitoring
- Not (CRM) server based
- Detects health and performance issues
- Does not give causes

MSDxCRM Events

The question is how to filter them and which ones to monitor.

- 600+ events in the Application branch
- Also generates events in the System branch (e.g. 7000, 7034, 7036 for service start and failure)
- E-mail Router events can be activated separately (KB2862024) in an MSCRMEmailLog branch


Application Events Sources

MSCRM Event Source               Role
MSCRMAsyncService                Asynchronous Service
MSCRMAsyncService$maintenance    Asynchronous Service
MSCRMCallout                     n/a
MSCRMDeletionService             Asynchronous Service
MSCRMDeployment                  Deployment Web Service
MSCRMEmail                       E-mail Router
MSCRMKeyArchiveManager           Asynchronous Service
MSCRMKeyGenerator                Asynchronous Service
MSCRMKeyService                  Asynchronous Service
MSCRMLocatorService              Discovery Web Service
MSCRMMonitoringRuntime           All
MSCRMMonitoringServerRole        All
MSCRMMonitoringService           All
MSCRMMonitoringTest              All
MSCRMPerfCounters                All
MSCRMPlatform                    Web Application Server
MSCRMReporting                   Microsoft Dynamics CRM Reporting Extensions
MSCRMReportingDataConnector      Microsoft Dynamics CRM Reporting Extensions
MSCRMSandboxClient               Organization Web Service
MSCRMSandboxService              Sandbox Processing Service
MSCRMSandboxWorker               Sandbox Processing Service
MSCRMTracing                     All
MSCRMUnzipService                Web Application Server
MSCRMVssWriter                   Deployment Tools
MSCRMWebService                  Organization Web Service

Used to detect expired digital certificates

The first filter is the event source. Sources shown in dark blue on the slide are new in CRM 2013.


MSDxCRM Events: Events List

Some Critical Events

- Platform: 16940, 17205, 17206 (authentication failed)
- KeyGenerator: 18951, 18955, 18956 (crashed)
- Async: 17409 (crashed)
- Sandbox: 20240, 20244 (not started, crashed)
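As a rough illustration of how the event IDs above could be turned into a single query, the sketch below builds an XPath filter for Get-WinEvent. The function name New-EventIdXPath is hypothetical, and the choice of the Application log is an assumption based on where the slides say CRM events are written.

```powershell
# Build an XPath filter that matches any of the given event IDs.
# New-EventIdXPath is a hypothetical helper name, not part of any CRM tooling.
function New-EventIdXPath {
    param(
        [Parameter(Mandatory)] [int[]] $EventId
    )
    $clauses = $EventId | ForEach-Object { "EventID=$_" }
    "*[System[(" + ($clauses -join ' or ') + ")]]"
}

# Event IDs taken from the critical events list above.
$criticalIds = 16940, 17205, 17206, 18951, 18955, 18956, 17409, 20240, 20244
$xpath = New-EventIdXPath -EventId $criticalIds

# Usage on a CRM server (commented out because it requires Windows):
# Get-WinEvent -LogName Application -FilterXPath $xpath -MaxEvents 20
```

Keeping the filter construction in its own function makes it easy to adjust the list of IDs per implementation without touching the query itself.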

Services Monitoring


Alert
- Available through the System / Service Control Manager branch
- Triggers once per status change
- Event 7000 (failed to start)
- Event 7034 (crashed, corrective actions failed)
- http://technet.microsoft.com/en-us/library/dd349427(v=ws.10).aspx
- A failing service will usually trigger an internal MSCRM event and then a service monitoring event, e.g. 17409 -> 7034

State
- Available through the Service Control API
- Triggers every time until the status changes
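The alert-style check can be sketched as a Get-WinEvent query on the System log for events 7000 and 7034. The helper name New-ServiceFailureFilter and the five-minute lookback window are assumptions for illustration; the window would normally match however often the check is scheduled.

```powershell
# Build a Get-WinEvent filter hashtable for Service Control Manager failure events.
# New-ServiceFailureFilter is a hypothetical helper name.
function New-ServiceFailureFilter {
    param(
        [int] $LookbackMinutes = 5   # assumption: the check is scheduled every five minutes
    )
    @{
        LogName      = 'System'
        ProviderName = 'Service Control Manager'
        Id           = 7000, 7034
        StartTime    = (Get-Date).AddMinutes(-$LookbackMinutes)
    }
}

# Usage on a Windows server (commented out). Get-WinEvent raises an error
# when nothing matches, hence -ErrorAction SilentlyContinue.
# Get-WinEvent -FilterHashtable (New-ServiceFailureFilter) -ErrorAction SilentlyContinue
```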


Service Names to Monitor

Service Name                     Display Name
W3SVC                            World Wide Web Publishing Service
MSSQLSERVER                      SQL Server
MSCRMSandboxService              Microsoft Dynamics CRM Sandbox Processing Service
ReportServer                     SQL Server Reporting Services
MSCRMAsyncService                Microsoft Dynamics CRM Asynchronous Processing Service
MSCRMAsyncService$maintenance    Microsoft Dynamics CRM Asynchronous Processing Service (maintenance)
MSCRMMonitoringService           Microsoft Dynamics CRM Monitoring Service (2013 only)
MSCRMUnzipService                Microsoft Dynamics CRM Unzip Service
MSCRMVssWriterService            Microsoft Dynamics CRM VSS Writer (2013 only)
WSearch                          Windows Search

External dependencies (AD, DNS, Exchange, ADFS, etc.)

Ranked by order of importance.
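A state-style check over the services listed above might look like the following sketch. Select-NotRunningService is a hypothetical helper; it only filters objects that have a Status property, so it can be fed the output of Get-Service.

```powershell
# Service names from the list above, in the order given there.
$crmServices = 'W3SVC', 'MSSQLSERVER', 'MSCRMSandboxService', 'ReportServer',
               'MSCRMAsyncService', 'MSCRMAsyncService$maintenance',
               'MSCRMMonitoringService', 'MSCRMUnzipService',
               'MSCRMVssWriterService', 'WSearch'

# Return the services that are not in the Running state.
# Select-NotRunningService is a hypothetical helper name.
function Select-NotRunningService {
    param(
        [Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $Service
    )
    $Service | Where-Object { $_.Status -ne 'Running' }
}

# Usage on a CRM server (commented out because it requires Windows).
# Services that are not installed on this server role are skipped silently.
# $services = Get-Service -Name $crmServices -ErrorAction SilentlyContinue
# Select-NotRunningService -Service $services |
#     ForEach-Object { "ALERT: $($_.Name) is $($_.Status)" }
```

Unlike the event-based alert, this check reports a stopped service every time it runs until the state changes, which matches the "State" behaviour described on the slide.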

Events Tools


Event Viewer (eventvwr)
- Live display
- Filtering
- Export
- Basic alerting

Task Scheduler (schtasks)
- On alert trigger

PowerShell
- Get-WinEvent cmdlet (replaces Get-EventLog)
- Reads live or exported data
- The XPath can be extracted from an Event Viewer or Task Scheduler filter
- Get-Service cmdlet (gets the service state)
- Automation is done via Task Scheduler (schtasks)


Demo: Alerting on Events and Services

Demo script:
- Show Event Viewer, filter, alert.
- Demo the PowerShell script that detects a service failure.
- Explain why it is not a good idea to use MSCRM to send the e-mails.
- Explain how it can be useful for partners to deploy those scripts at SMEs.

MSDxCRM Performance counters


- 950+ counters in the \CRM branches (25% of all counters on a full CRM server)
- Other noteworthy counters in: \Processor, \W3SVC, \.NET CLR, \ASP.NET, \SQLServer, \LogicalDisk, \Memory


Performance Counters & Roles

Counter Path          Role
CRM Async Service     Asynchronous Service
CRM Authentication    Web Application Server
CRM ConfigDB          Deployment Web Service
CRM Discovery         Discovery Web Service
CRM LocatorService    Discovery Web Service
CRM OutlookSync       Web Application Server
CRM Platform          Web Application Server
CRM Router Service    E-mail Router
CRM Sandbox Client    Sandbox Processing Service
CRM Sandbox Host      Sandbox Processing Service
CRM Server            Web Application Server


Performance counters Categories


Async Performance Counters: Counters List


Counters Guidelines


- Always in relation to a timeframe (e.g. every x seconds)
- Thresholds need to be adjusted to every implementation and timeframe

Performance is monitored via
- Outstanding and waiting-state counters (too many operations)
- Average time spent (operations too slow)
- Event 17972 (DB query > 10 sec)

Health
- Usually better monitored via events
- Is monitored via Failed counters
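To make the "threshold over a timeframe" guideline concrete, here is a minimal sketch that averages a window of samples and compares the result with a threshold. The function name Test-CounterThreshold, the sample values and the threshold of 200 are all made up for illustration; real thresholds have to be tuned per implementation, as noted above.

```powershell
# Decide whether a counter breached its threshold over a sampling window.
# Test-CounterThreshold is a hypothetical helper name.
function Test-CounterThreshold {
    param(
        [Parameter(Mandatory)] [double[]] $Sample,
        [Parameter(Mandatory)] [double] $Threshold
    )
    $average = ($Sample | Measure-Object -Average).Average
    [pscustomobject]@{
        Average  = $average
        Breached = $average -gt $Threshold
    }
}

# Hypothetical readings of an "Operations Outstanding" style counter,
# one every x seconds, checked against an illustrative threshold.
$result = Test-CounterThreshold -Sample 120, 180, 240, 300, 360 -Threshold 200
"Average $($result.Average), breached: $($result.Breached)"
```

Averaging over the window rather than alerting on a single reading avoids firing on short spikes, at the cost of reacting slightly later.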


Typical Counters
- Operations Completed
- Operations Completion Throughput
- Operations Executing
- Operations Failed
- Operations Failed With Retry
- Operations Outstanding
- Rate of Operations Failed With Exception
- Rate of Operations Failed With Retry
- Average time spent in operation
- Average time spent in waiting state
- Average time spent in throttled state
- Operations Waiting on I/O
- Operations Throttled
- Operations Resumed Prematurely

[Slide callouts: Health / Performance / Health / Performance]


Performance counters Tools


Typeperf
- http://technet.microsoft.com/en-us/library/bb490960.aspx

Performance Monitor (perfmon*)
- Live display
- Filtering
- Scheduled collection
- Alerting or generic tasks (the trigger is only "above level")
- Can be saved as a template

PowerShell
- Get-Counter cmdlet to get live data
- Export-Counter cmdlet to save data collection points
- Import-Counter cmdlet to import saved data collection points
- Xpath can be extracted from Perfmon filter
- Alerting or more complex actions like automatic scalability

* Use mmc and add the snap-in to save the counters configuration.
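The PowerShell route could be sketched as below: Get-Counter collects a window of samples, the average is computed from each sample's CookedValue, and a hook is called when the average is too high. Get-CounterAverage and Invoke-ScaleOut are hypothetical names, the 80% limit is an arbitrary example, and the scale-out body is a placeholder for whatever adds capacity in a given environment.

```powershell
# Average a counter over a short window using Get-Counter.
# Get-CounterAverage is a hypothetical helper name.
function Get-CounterAverage {
    param(
        [Parameter(Mandatory)] [string] $CounterPath,
        [int] $SampleInterval = 5,   # seconds between samples
        [int] $MaxSamples = 6        # number of samples in the window
    )
    $sets = Get-Counter -Counter $CounterPath -SampleInterval $SampleInterval -MaxSamples $MaxSamples
    $values = $sets | ForEach-Object { $_.CounterSamples } | ForEach-Object { $_.CookedValue }
    ($values | Measure-Object -Average).Average
}

# Hypothetical scale-out hook: replace the body with the real action
# (start another server, enable a role, notify an operator, ...).
function Invoke-ScaleOut {
    param([double] $Average)
    "Scale-out requested (average $Average)"
}

# Usage on a Windows server (commented out); the 80 limit is illustrative only.
# $avg = Get-CounterAverage -CounterPath '\Processor(_Total)\% Processor Time'
# if ($avg -gt 80) { Invoke-ScaleOut -Average $avg }
```

Scheduling such a script through Task Scheduler gives the same "above level" trigger as perfmon, with the freedom to run more complex actions.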


Demo: Alerting and Automatic Scalability with Performance Counters

Demo script:
- Show perfmon and how to add an alert.
- Demo the CRM attack (slide CRM attack?).
- Show PowerShell (how to use the Excel?).

Demo Setup

System Center Operations Manager

Wouldn't it be nice if Microsoft had already packaged the most important monitors and alerts in a monitoring system?


- The Management Pack currently exists only for CRM 2011 (new services and events are not monitored)
- The pack is designed for SCOM 2007 but can be imported into SCOM 2012 R2
- Contains both health and performance monitoring
- Performance monitoring is not enabled by default, as the thresholds need to be defined for each implementation


Priorities and Escalations


Alerting is not enough.

For every alert, you need a specific priority (e.g. P1, P2, P3, ...).

For most alerts (i.e. all P1 and P2, fewer P3 and P4), you need escalation and resolution procedures.


Escalations Procedures

Monitor | Target | Category | Enabled | Monitored Health States | Production | Non Production
Microsoft Dynamics CRM Asynchronous Processing Service (maintenance) | Microsoft Dynamics CRM 2011 Asynchronous Processing Service | Availability Health | Yes | Automatic Service | P3 | P4
Microsoft Dynamics CRM Asynchronous Processing Service | Microsoft Dynamics CRM 2011 Asynchronous Processing Service | Availability Health | Yes | Automatic Service | P2 | P3
E-mail Router Service | Microsoft Dynamics CRM 2011 E-mail Router | Availability Health | Yes | Automatic Service | P2 | P3
Indexing Service | Microsoft Dynamics CRM 2011 Help Server | Availability Health | Yes | Automatic Service | P3 | P4
World Wide Web Publishing Service | Microsoft Dynamics CRM 2011 IIS Dependent Server | Availability Health | Yes | Automatic Service | P1 | P2
The SQL Server Reporting Services: MSSQLSERVER | Microsoft Dynamics CRM 2011 Reporting Extensions | Availability Health | Yes | Automatic Service | P3 | P4
Microsoft Dynamics CRM Sandbox Processing Service | Microsoft Dynamics CRM 2011 Sandbox Processing Service | Availability Health | Yes | Automatic Service | P4 | P5
File Server Resource Manager service is not running | Microsoft Dynamics CRM 2011 Web Application Server | Availability Health | Yes | Automatic Service | P4 | P5
Microsoft Dynamics CRM Unzip Service is not running | Microsoft Dynamics CRM 2011 Web Application Server | Availability Health | Yes | Automatic Service | P3 | P4
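One way to use such a table in an alerting script is a simple lookup from monitor name to priority per environment. The sketch below copies a subset of the rows above; the function name Get-AlertPriority and the 'Unclassified' fallback for unknown monitors are assumptions, not part of the presentation.

```powershell
# Priorities per monitor, copied from a subset of the table above.
$priorities = @{
    'World Wide Web Publishing Service'                      = @{ Production = 'P1'; NonProduction = 'P2' }
    'Microsoft Dynamics CRM Asynchronous Processing Service' = @{ Production = 'P2'; NonProduction = 'P3' }
    'E-mail Router Service'                                  = @{ Production = 'P2'; NonProduction = 'P3' }
    'Indexing Service'                                       = @{ Production = 'P3'; NonProduction = 'P4' }
    'Microsoft Dynamics CRM Sandbox Processing Service'      = @{ Production = 'P4'; NonProduction = 'P5' }
}

# Look up the priority of an alert. Get-AlertPriority is a hypothetical helper name.
function Get-AlertPriority {
    param(
        [Parameter(Mandatory)] [string] $Monitor,
        [ValidateSet('Production', 'NonProduction')] [string] $Environment = 'Production'
    )
    if ($priorities.ContainsKey($Monitor)) {
        $priorities[$Monitor][$Environment]
    }
    else {
        'Unclassified'   # assumption: monitors missing from the table need manual triage
    }
}

Get-AlertPriority -Monitor 'World Wide Web Publishing Service'
Get-AlertPriority -Monitor 'E-mail Router Service' -Environment NonProduction
```

The priority returned here would then select the escalation procedure to follow, which is the part that alerting alone does not provide.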


What to Remember

- Not everything makes sense to monitor.
- Use (mainly) events and service state for health monitoring.
- Use (mainly) performance counters for performance monitoring.
- Use traces for incident analysis.
- Monitoring is more than alerting: it can trigger auto-remediation or scalability.
- Alerting is not enough; you need escalation procedures as well.

Stéphane Dorrekens, @stephanedujour, http://blog.dorrekens.com


Thank You


MSDxCRM Trace

- The CRM Server log is enabled via PowerShell (deployment level) or the registry (server level) (see http://support.microsoft.com/kb/907490)
- The CRM Outlook Client log is enabled via the Diagnostics tool or the registry (see http://support.microsoft.com/kb/2862031)
- Whether activated by PowerShell or the registry, tracing will be on, but registry settings take precedence (server over deployment)
- Trace files can be huge, so change the default directory to somewhere other than the system drive
- 20 main categories (Platform, Sandbox, etc.), 48 categories in all
- 5 levels (Off, Error, Warning, Info, Verbose)
- 7 files (W3WP, Sandbox, CRMAsync, etc.)
- Tool: http://crmtracereader.codeplex.com/

Categories: ADUtility Application Application.Outlook DataMigration Deployment Deployment.Provisioning Deployment.Sdk Exception Etm Live Live.AggregationDataExport Live.PartnerInteraction Live.Platform Live.Portal Live.Provisioning Live.Support Live.SyncDaemon Monitoring NewOrgUtility ObjectModel ParameterFilter Platform Platform.Async Platform.ImportExportPublish Platform.Import Platform.Metadata Platform.Sdk Platform.Soap Platform.Sql Platform.Workflow Reports Sandbox Sandbox.AssemblyCache Sandbox.LoadBalancer Sandbox.CallReturn Sandbox.EnterExit Sandbox.StartStop Sandbox.Performance Sandbox.Monitoring SchedulingEngine ServiceBus Shared SharePointCollaboration Solutions Unmanaged.Outlook Unmanaged.Platform Unmanaged.Sql Visualizations

Files: W3WP-Help W3WP-CRMWeb Sandbox-WorkerProcess Sandbox-HostService CRMUnzipService CRMAsyncService-Server CRMAsyncService-Maintenance


Non MSDxCRM Trace

SSRS Log http://technet.microsoft.com/en-us/library/ms157403.aspx

SSRS HTTP Log http://technet.microsoft.com/en-us/library/bb630443.aspx

ExecutionLogn table in the ReportServer database. Default is 2 months of history; can be changed with ExecutionLogDaysKept (see http://technet.microsoft.com/en-us/library/bb934303%28v=sql.105%29.aspx)

IIS Log http://technet.microsoft.com/en-us/library/cc754631(v=ws.10).aspx
Can be used to log users' data access (user, machine, IP, etc.)

SQL Server Log http://technet.microsoft.com/en-us/library/ms187109.aspx

Tool: Log Parser (http://www.microsoft.com/en-us/download/details.aspx?id=24659)

Reference
http://blogs.msdn.com/b/crminthefield/archive/2013/10/15/crm-2011-platform-tracing-registry-vs-powershell.aspx
http://blogs.msdn.com/b/emeadcrmsupport/archive/2011/05/30/crm-2011-new-tool-crmdiagtool-20311.aspx

Service State Monitoring

try {
    # Log the start time of this check (UTC).
    $theTime = Get-Date
    "Event 1 Started on " + $theTime.ToUniversalTime() + ":" | Out-File C:\Temp\CRMEventLog.txt -Append -Width 1000
    # Turn non-terminating errors into exceptions so the catch block sees them.
    $ErrorActionPreference = "Stop";
    $theService = Get-Service -ComputerName CRM2013 -Name MSCRMAsyncService
    if ($theService.Status -ne [System.ServiceProcess.ServiceControllerStatus]::Running) {
        # The sender and recipient addresses are truncated in the transcript.
        Send-MailMessage -SmtpServer 127.0.0.1 -from "Monitor " -to "Admin " -subject "Log Alert" -body "Async Service is stopped"
        "Alert service not running" | Out-File C:\Temp\CRMEventLog.txt -Append -Width 1000
    }
}
catch [System.Exception] {
    # Record the exception and restore the default error behaviour.
    $_ | Out-File C:\Temp\CRMEventLogExceptions.txt -Append -Width 1000
    $ErrorActionPreference = "Continue";
}

Service Event Monitoring

try {
    $theTime = Get-Date
    "Event 1 Started on " + $theTime.ToUniversalTime() + ":" | Out-File C:\Temp\CRMEvent2Log.txt -Append -Width 1000
    $ErrorActionPreference = "Stop";
    $Thelog = Get-WinEvent -ComputerName "CRM2013" -FilterXML "*[System[(EventID=17409) and TimeCreated[timediff(@SystemTime)