Copyright © 2015, Oracle and/or its affiliates. All rights reserved.
Oracle Exadata Monitoring and Management Best Practices
Session CON9727
October 26, 2015
Ashish Agrawal, Group Product Manager, Oracle Corporation
Swapnil Sinvhal, Sr. Software Dev Manager, Oracle Corporation
Rick Shawver, Infrastructure DBA, Wells Fargo
Om Prakash Seth, Vice President, HDFC Bank Ltd
Oracle Confidential – Internal/Restricted/Highly Restricted
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Program Agenda
1. Top EM 13c Features for Exadata Management
2. Real World Tips & Best Practices
   - Wells Fargo Bank
   - HDFC Bank
Top EM 13c Features for Exadata Management
• Exadata Virtualization Provisioning
• Natively Integrated Exadata Hardware Management
• Patching Automation Support of Exadata Stack
• Exachk Integration
• Auto Service Request (ASR) Integration - Fault Telemetry
• Support for Exadata Flash Cache Features
EM Support for Exadata Virtualization Provisioning
Increase Operational Efficiency by Deploying RAC Cluster Faster on Virtualized Exadata
VM provisioning on Virtualized Exadata involves reliable, automated, & scheduled mass deployment of RAC Cluster
• Includes VMs/DB/GI/ASM
Create / delete RAC Cluster
• Including DB/GI/ASM
Scale up / down RAC Cluster by adding or removing VMs
• Includes DB/GI/ASM
Exadata Provisioning Workflow
Demo Exadata Virtualization Provisioning
Natively Integrated Exadata Hardware Management
Availability Incidents Monitoring
Photo Realistic View
Hardware depicting faulty components (in red) and integrated with Incident Manager
Natively Integrated Exadata Hardware Management
ILOM Details & Health Status
Photo-Realistic ILOM Front View integrated with Incident Manager
Hardware View
Logical View
Energy
Network Connectivity
Service Processor Configuration
Hardware View of ILOM
Incidents Dashlet
Resource Usage
Logical View of ILOM
Logical View
New Target Type in EM 13c for Hardware
CPU Summary & Status
Compute Server Details
Administrators have hardware visibility and can correlate errors across tools (for example, an intermittent error with an Exadata performance issue)
Exadata Component Level View
Open Incidents
Photo-Realistic View
Hover your mouse cursor over component to see details
Patching Automation Support of Exadata Stack
• The Quarterly Full Stack Download Patch (QFSDP) is released every quarter
• Primary components are
  • Database (Database, Clusterware)
  • Infrastructure (Exadata Storage Server, InfiniBand Switch, PDU)
• For more details, refer to MOS Doc ID 888828.1
• Before EM 13c (that is, in EM 12c), only the database software could be patched from EM
• Starting with EM 13c, you can also patch
  • Compute Nodes: Firmware and OS
  • Storage Server Cells: Firmware and cell software
  • InfiniBand Network: Switch firmware
Exadata System Patch Management
Extending patching beyond the Database software
• Database Grid (Database Servers): Oracle GI / RDBMS, Firmware / OS (EM 13c patching support)
• InfiniBand Network (Switches) (EM 13c patching support)
• Storage Grid (Exadata Storage Servers) (EM 13c patching support)
Supports application of the complete system patch: Quarterly Full Stack Download Patch (QFSDP)
Exadata Systems Patching
Navigation: Targets --> Exadata --> Target Name --> Database Machine --> Software Update
• Comprehensive overview of maintenance status & needs
• Proactive patch recommendations for the quarterly full stack patches
• Supports automatic patch download and patching in either rolling or non-rolling mode
• Ability to schedule runs & get notified of status updates
• Granular step-level status tracking with real-time updates
• Log monitoring & aggregation, with quick filing of support issues using pre-packaged log dumps
QFSDP: Quarterly Full Stack Download Patch for Exadata
Select a QFSDP, analyze it, and then deploy it
Exachk Integration
Exachk proactively scans for the most impactful problems across the layers of the Oracle Database and Exadata stack
Install, set up, upgrade & schedule the Exachk utility from EM
EM shows Exachk results as EM Compliance Standard violations
Customers benefit from Exachk best practices
For more details on Exachk, refer to the ORAchk/Exachk Master Reference, MOS Doc ID 1969085.1
Navigation: Enterprise --> Compliance --> Library --> Compliance Standards
Exachk Compliance Results
Navigation: Enterprise --> Compliance --> Results
An event is generated for each compliance violation
Event type is “Compliance Standard Rule Violation”
Auto Service Request (ASR) Integration: Fault Telemetry
• Provides ASR capability for Exadata
• One emcli command to turn on ASR
• View ASR Incidents in EM Incident Manager
• Auto create SRs for faults & update SR# to EM
• Works for all Systems qualified for ASR
Example of Viewing ASR Incidents in Enterprise Manager
Support for Exadata Flash Cache Features
Monitor X5 “Extreme Flash” Configurations
• Enhanced DB Machine Schematic diagram
• Enhanced Incident Details page for disk failure
• Enhanced capacity reporting, IORM and performance charts
Flash I/O Resource Monitoring & Management
• Flash Cache Space Usage Monitoring
• Administration of Flash I/O Resource Management
• New charts on Cell and Grid Home page, Performance page
• I/O Resource consumption and performance monitoring for Flash
Monitor X5 “Extreme Flash” Configuration
Navigation: Target Navigation Icon --> Exadata Storage Server Grid
Screenshot callouts:
• Performance Section for Flash Read & Write
• All Flash and no Hard Disk
• Note only Flash Disk Size
• Monitor Flash IO service time
Flash Cache Space Usage Monitoring
Navigation: Exadata Storage Server Grid --> Administration --> Manage IO Resources --> Flash Cache Space Usage
Chart to illustrate Current & Historical Flash Cache Space Usage
Optimize Flash Space Usage of Critical Database
Screenshot callouts: Current Flash Space Usage; Historical Flash Cache Space Usage
Administration of Flash I/O Resource Management
Flash I/O Resource Management Protects the Latency of Critical OLTP I/O Requests in Flash Cache
Navigation: Exadata Storage Server Grid --> Administration --> Manage IO Resources
• Easily manage your IORM settings for Flash Cache
• Set Minimum & Maximum Size for your Flash Cache per database
• Share based plan
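The per-database minimum and maximum sizes and the share-based plan mentioned above can be sanity-checked before they are entered in the console. The following is a minimal sketch, not an EM or CellCLI interface: the `FlashDirective` class, its field names, and the rule that the sum of minimums must fit inside the flash cache are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlashDirective:
    """Per-database flash cache settings (hypothetical names, for illustration)."""
    name: str
    min_gb: float    # guaranteed minimum flash cache space
    limit_gb: float  # maximum flash cache space the database may use
    shares: int = 1  # relative weight in a share-based plan


def validate_plan(directives, flash_cache_gb):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    for d in directives:
        if d.min_gb > d.limit_gb:
            problems.append(f"{d.name}: minimum exceeds maximum")
    # Assumption: guaranteed minimums cannot add up to more than the cache holds.
    total_min = sum(d.min_gb for d in directives)
    if total_min > flash_cache_gb:
        problems.append(f"sum of minimums ({total_min} GB) exceeds flash cache size")
    return problems


def share_fractions(directives):
    """Fraction of the resource each database gets under a share-based plan."""
    total = sum(d.shares for d in directives)
    return {d.name: d.shares / total for d in directives}


plan = [
    FlashDirective("oltp", min_gb=500, limit_gb=1000, shares=3),
    FlashDirective("dw", min_gb=100, limit_gb=400, shares=1),
]
problems = validate_plan(plan, flash_cache_gb=1000)
fractions = share_fractions(plan)
```

With three shares against one, the hypothetical `oltp` database is weighted at three quarters of the resource, which is the kind of protection for critical OLTP I/O that the slide describes.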
New charts on Cell and Grid Home page, Performance page
Navigation: Target Navigation Icon --> Exadata Storage Server Grid
• Monitor Flash performance and IORM waits
• Numbers will be workload dependent
• Any deviation from normal baselines should be investigated
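Because the numbers are workload dependent, "deviation from normal baselines" is best judged against each system's own history rather than a fixed value. One simple way to express that, sketched here under the assumption that a few historical samples of a metric are available, is a z-score test. The function name, the sample values, and the threshold of 3 standard deviations are illustrative choices, not EM settings.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, z_threshold=3.0):
    """Return True when `current` is far from the historical mean.

    `history` is a list of past samples of one metric (for example flash
    I/O service time); at least two samples are needed for a deviation.
    """
    if len(history) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return current != mu
    return abs(current - mu) / sigma > z_threshold


# Hypothetical service-time samples in milliseconds.
baseline = [0.40, 0.42, 0.38, 0.41, 0.39]
normal = deviates_from_baseline(baseline, 0.41)
spike = deviates_from_baseline(baseline, 0.90)
```

A sample close to the baseline mean is not flagged, while a value more than double the usual service time is.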
Lessons from Oracle Public Cloud
Enterprise Manager used to monitor entire public cloud infrastructure
Largest Monitored Site
• Over 2.5 Million Targets
• 51 Exadata, 11k Databases, 150K WLS servers, 900K SOA Composites, 1350 Business Applications, 5k Beacons, 24k Agents
• 3.4 million events processed per day
Leverage groups for everything
• Group of group hierarchies aligned with how they manage targets. Largest group: over 200k targets
• Use System Dashboard to monitor top level groups
• Provides summary counts of target down, critical and warning incidents for each top level group
Set Lifecycle Status for the most important targets
• Guarantees rapid delivery of notifications & creation of incidents/tickets for the most important events
Tight control over metrics & thresholds
• Disable unused metrics
• Put meaningful thresholds on metrics they care about
• Create incidents on important events.
• Use templates to deploy metric settings.
Blackouts at group level, control who sets the blackouts
Over 2.5M targets & 3.4 Million Events per Day
MOS Note 1929586.1 - Oracle Enterprise Manager 12c Configuration Best Practices
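The idea of a group-of-groups hierarchy with summary counts per top-level group can be illustrated with a small recursive rollup. This is only a toy model of the concept: the group names, target names, and status labels below are invented, and it does not use any EM API.

```python
from collections import Counter


def rollup(group, children, members, status):
    """Count target statuses in `group` and in all of its nested subgroups."""
    counts = Counter()
    for target in members.get(group, []):
        counts[status[target]] += 1
    for child in children.get(group, []):
        counts += rollup(child, children, members, status)
    return counts


# Hypothetical hierarchy: one top-level group containing two subgroups.
children = {"PROD": ["PROD_DB", "PROD_EXA"]}
members = {"PROD_DB": ["db1", "db2"], "PROD_EXA": ["cell1"]}
status = {"db1": "ok", "db2": "critical", "cell1": "down"}

summary = rollup("PROD", children, members, status)
```

A dashboard row for the top-level group would then show one target down and one critical incident, without listing the member targets individually.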
Exadata Monitoring with 12c
Wells Fargo DAN Infrastructure Team
We build, support, and manage standardized, shared environments for Oracle
OID, RMAN, ASM, SAN, NAS, Oracle s/w stack, Grid Control
35000 Targets split between two Grid Control environments (prod and nonprod)
Directly support approximately 640 hosts, 160 clusters, 1756 databases, 5102 instances, etc
Same 11.2.0.4 and 12.1.0.2 versions across the environment
Red Hat 5/6, Solaris 10/11, OEL 6 (Exadata), Oracle ZBA
We provide triage support for all of the above to the LOB DBAs
So what’s new?
We got our first 14 Exadata appliances!
Target usage initially for a few critical applications
Shared environments
Predominantly used for OLTP
Now we’re building on our history of lessons learned to develop compatible standards that can be applied to the Exadata architecture
In our new role as DBMAs, we have added patching, upgrading, and managing the Exadata infrastructure
Exadata Monitoring with 12c
The Exadata plugin exposes the Appliance Components
* Monitoring complexity is increased if the Exadata platform is used to house databases for multiple LOBs and the Exadata infrastructure is maintained by yet another team.
Exadata Monitoring with 12c
The Exadata plugin displays all components of the appliance. Some examples:
* The Exadata cell information is a component that the Application DBA team wanted access to. This required additional group and incident rule setup.
The 12c target type “oracle_dbmachine” is a composite target that includes ALL targets within the Exadata environment.
Items to consider when setting up Exadata monitoring:
• In a typical group/role security model, adding the “Exadata Database Machine” to any group adds all databases as well as all platform-related targets such as cells, ILOMs, PDUs, network, etc.
• LOB support teams require 12c access to their databases and shared targets such as hosts, agents, CRS, cells, ASM, listeners, but NOT ILOMs, PDUs, network, etc.
• There are cases where databases such as fsdb require monitoring with alerts directed to the Exadata support team.
How Wells Fargo handles this unique separation of duty and alert notifications
• Create a separate 12c group for those Exadata targets that the LOB teams will have access to. This includes the Exadata Storage Server Grid and the Exadata Storage Servers
• Create two 12c groups for the Exadata support team
  • Group 1 contains all platform-related Exadata target types, excluding the DB Machine.
  • Group 2 contains only the top-level DB Machine targets. This is used for the Incident Rule alerting on the overall DB Machine status.
• Place the databases that each LOB and Exadata support team is responsible for in groups that each team has full control over.
• Create 12c Roles that contain the appropriate groups for each support team
There is an automatic expansion of the DB Machine targets. If the DB Machines are added to an Incident Rule, ALL targets on the DB Machine will be included.
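The group split described above can be pictured as a filter over the members of the composite DB Machine target. The sketch below is a toy model, not an EM interface: apart from `oracle_dbmachine`, which is named in the text, the type labels (`cell`, `ilom`, `pdu`, and so on) and the target and group names are illustrative placeholders rather than real EM target type identifiers.

```python
DB_MACHINE = "oracle_dbmachine"  # composite target type named in the text

# Illustrative type labels, not real EM target type names.
PLATFORM_TYPES = {"cell", "ilom", "pdu", "network"}
LOB_SHARED_TYPES = {"host", "agent", "crs", "cell", "asm", "listener"}


def build_groups(targets):
    """Split (name, type) pairs into the three kinds of group described above.

    Databases are deliberately left out: they go into per-team groups
    according to which team is responsible for them.
    """
    return {
        "lob_shared": [n for n, t in targets if t in LOB_SHARED_TYPES],
        "exa_platform": [n for n, t in targets if t in PLATFORM_TYPES],
        "exa_machine": [n for n, t in targets if t == DB_MACHINE],
    }


# Hypothetical members of one DB Machine after automatic expansion.
targets = [
    ("dbm01", "oracle_dbmachine"),
    ("cell01", "cell"),
    ("ilom01", "ilom"),
    ("pdu01", "pdu"),
    ("host01", "host"),
    ("lenddb", "database"),
]
groups = build_groups(targets)
```

In this model the cell appears in both the LOB group and the platform group, the ILOM and PDU stay with the Exadata support team only, and the machine group holds nothing but the top-level target.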
Notice both groups appear to have the same targets. However group 2 only contains:
The typical setup for LOB support groups (Those supporting DBs on Exadata) required view access into the Exadata Cells
A second LOB group was needed because “view” access is given to the cell targets. The main LOB group propagates “full” for all targets included.
The LOB Exadata group is only given View access to the cells.
The role containing these groups is assigned to the DBAs in that LOB.
e.g. View Role: LENDING_CORE_ROLE
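The reason for the second group becomes clearer when a role is modeled as a mapping from groups to privilege levels. The following sketch is an illustration only: `LENDING_CORE_ROLE` comes from the slide, while the group names, target names, and the rule that the highest privilege wins are assumptions, not a description of how EM resolves privileges.

```python
PRIV_ORDER = {"none": 0, "view": 1, "full": 2}


def effective_privilege(role, target, group_members):
    """Return the highest privilege the role grants on `target`."""
    best = "none"
    for group, priv in role.items():
        if target in group_members[group] and PRIV_ORDER[priv] > PRIV_ORDER[best]:
            best = priv
    return best


# Hypothetical groups: the main LOB group and the separate LOB Exadata group.
group_members = {
    "LENDING_DB_GRP": {"lenddb", "host01"},
    "LENDING_EXA_GRP": {"cell01", "cell02"},
}

# The role grants "full" on the main group but only "view" on the cells.
LENDING_CORE_ROLE = {"LENDING_DB_GRP": "full", "LENDING_EXA_GRP": "view"}

on_db = effective_privilege(LENDING_CORE_ROLE, "lenddb", group_members)
on_cell = effective_privilege(LENDING_CORE_ROLE, "cell01", group_members)
on_ilom = effective_privilege(LENDING_CORE_ROLE, "ilom01", group_members)
```

If the cells were placed in the main group instead, they would inherit "full" along with everything else in it, which is what the separate view-only group avoids.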
The Exadata Plugin gives a detailed look into the setup and health of an Exadata environment. However, care must be taken to secure and authorize only those targets that are required for each support role.
Monitoring a shared Exadata environment requires:
• The ability to limit LOB DBA access to those targets they are responsible for.
• The ability to produce alerts for LOB DBAs that exclude specific platform hardware targets that are the sole responsibility of the Exadata support team.
• The ability to give the LOB DBA team a view into the Exadata cells and allow them to receive alerts for the cells.
• The ability to alert on the DB Machine target without inundating the Exadata support team with database alerts that should only be sent to the LOB DBA teams.
• The ability for the Exadata support team to receive alerts for databases they are responsible for (such as fsdb).
Enhancing Exadata 12c Monitoring utilizing Metric Extensions
The Exadata Plugin exposes metrics for the Exadata Target Types for monitoring/alerting.
New incident rules were created with all “out of the box” metrics that Oracle defined thresholds for.
The Exadata Plugin also exposes additional metrics that cannot be used directly for alerting
There are metrics collected from the Exadata environment for which thresholds cannot be set.
In cases where metrics are exposed but not available for alerting, repository-side Metric Extensions can be used to produce alerts on those metrics that have not been defined with thresholds.
Application LOB teams desired additional cell alerting that was not available “out of the box” but that could be exposed via Metric Extensions.
Using repository-side Metric Extensions, the desired metrics can be alerted on:
Custom repository-side Metric Extensions were created and then added to the LOB DBA Incident Rules.
Repository-side Metric Extensions use the metric data that is populated when the Exadata Plugin is deployed:
The repository-side Metric Extensions use data that is being captured by the Exadata Plugin.
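The pattern behind a repository-side Metric Extension is a query over metric data that has already been collected, with a threshold applied to the result. The sketch below imitates that pattern with an in-memory SQLite table so that it can be run anywhere. The table `metric_current`, its columns, the metric name `flash_io_latency_ms`, and the threshold are all invented stand-ins and do not reflect the actual EM repository schema or any Wells Fargo extension.

```python
import sqlite3


def cells_over_threshold(conn, metric, critical):
    """Return (target, value) rows whose latest value exceeds `critical`."""
    sql = """
        SELECT target_name, value
        FROM metric_current
        WHERE metric_column = ? AND value > ?
        ORDER BY target_name
    """
    return conn.execute(sql, (metric, critical)).fetchall()


# Stand-in for data the plugin has already collected into the repository.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metric_current (target_name TEXT, metric_column TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO metric_current VALUES (?, ?, ?)",
    [
        ("cell01", "flash_io_latency_ms", 0.4),
        ("cell02", "flash_io_latency_ms", 3.5),
        ("cell02", "disk_io_latency_ms", 9.0),
    ],
)

alerts = cells_over_threshold(conn, "flash_io_latency_ms", critical=2.0)
```

Only the cell whose chosen metric exceeds the threshold is returned. No new collection runs on the cells themselves, because the query reuses values that are already stored.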
Summary: The Exadata Plugin provides the 12c functionality needed to “See” into and monitor your Appliance.
• The plugin provides an extensive picture of what your Exadata environment looks like, giving an easy-to-understand landscape of all involved components.
• Performance data for all Exadata components is available either through the GUI or 12c reporting.
• Monitoring and alerting for all Exadata Target Types can be enabled to ensure stability and reduce the risk of outages.
• 12c allows for very specific Exadata target security for a variety of support roles.
• 12c Exadata monitoring can easily be extended with Metric Extensions to enhance alerting capabilities.
Oracle Exadata Monitoring and Management Best Practices
Om Prakash Seth
Vice President - IT
HDFC Bank: Bank aapki Muththi Mein ("The bank in your fist")
HDFC Bank Limited, incorporated in 1994, is an Indian banking & financial services company headquartered in Mumbai, Maharashtra, India
Largest private sector bank in India by market capitalization as of Feb. 2014
Winner of Best Asian Bank award 2015
Top 100 most valuable global brands in 2015 with a value of $14 billion
Ranked as 'Most Valuable Indian Brand’ for second consecutive year
Go Digital: the bank offers a 10-second personal loan, a 30-minute auto and two-wheeler loan, the Chillr app & Payzapp as part of its digital banking initiative
About me: Om Prakash Seth is VP – IT & Joint Incident Management Head and manages production incidents for mission-critical core banking applications. Om led the implementation of Oracle Super Cluster at HDFC Securities in 2013, which was the world's first Super Cluster implementation in the e-broking segment and was accredited as "Best Technology Implementation of the Year" by the Asian Banker's award in 2014.
Evolution of Engineered Systems in HDFC Bank
Purpose Built Engineered Machines
• Extreme Performance
• High Availability
• Scalability
• Technology Management
Engineered Systems Journey in HDFC Bank
2012 : Migrated Retail Assets Solution on Exadata
2013 : Migrated Basel GL Solution on Exadata
2013 : Implemented Core Trading Platform on Oracle Super Cluster in HDFC Securities Ltd
2015 : In Process of Migrating Core Retail Banking Platform on Oracle Super Cluster
Engineered Systems Footprint in HDFC Bank
• 3 full Rack & 5 half Rack Oracle Super Cluster machines being deployed across 3 data centers in HA for running the Core Retail Banking System
• 10 Exadata database machines running critical workloads: Retail Asset, E-Business Suite
• 2 half Rack Oracle Super Cluster machines deployed across 2 data centers for running the Core Trading Solution
Drivers for OEM12c
Real time Centralized Monitoring
Notification & Unified Dashboard
Performance Optimization & Advisory
Managing DB operations
Database Life Cycle Management
Automated Fault Tracking and Resolution
Implemented OEM12c With ASR & Platinum Support
Exadata & Super Cluster Monitoring : Best Practices
• Implement IORM for prioritizing application workload
• Keep your OEM 12c at the latest patch set level
• Install the OEM 12c Exadata Plug-in for monitoring & enhanced performance impact detection
• Take advantage of the performance & tuning and advance notification features of OEM 12c
• Implement monitoring templates for incident creation and notification for hardware & database
• Implement ASR
OEM12c @ HDFC Bank : Technology benefits Achieved
End to End Monitoring
• Integrated Management Console for H/W + S/W
• Single Dashboard for all Engineered Systems
Performance & Availability Management
• Comprehensive view including Performance, availability, usage by databases, services, clusters through OEM 12c Plug-ins
• Enabled out of the box alerts for databases, cluster, ASM; Topology view of DB systems/clusters
Standardization & Compliance
• Enabled consolidated Configuration view including Version summary, patch recommendation & Ongoing Database Provisioning
• Exadata and Super Cluster specific Compliance evaluation, ongoing Drift tracking across the stack
Cloud Ready Architecture
• Enabled Private database cloud & smart UAT cloud
• Eliminated need to store multiple copies through Snap Clone on Exadata
Thank You