
Copyright © 2015, Oracle and/or its affiliates. All rights reserved. |

Oracle Exadata Monitoring and Management Best Practices

Session CON9727

October 26, 2015

Ashish Agrawal, Group Product Manager, Oracle Corporation

Swapnil Sinvhal, Sr. Software Dev Manager, Oracle Corporation

Rick Shawver, Infrastructure DBA, Wells Fargo

Om Prakash Seth, Vice President, HDFC Bank Ltd

Oracle Confidential – Internal/Restricted/Highly Restricted


Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Program Agenda

1 Top EM 13c Features for Exadata Management

2 Real World Tips & Best Practices

- Wells Fargo Bank

- HDFC Bank


Top EM 13c Features for Exadata Management

• Exadata Virtualization Provisioning

• Natively Integrated Exadata Hardware Management

• Patching Automation Support of Exadata Stack

• Exachk Integration

• Auto Service Request (ASR) Integration - Fault Telemetry

• Support for Exadata Flash Cache Features

EM Support for Exadata Virtualization Provisioning

Increase Operational Efficiency by Deploying RAC Cluster Faster on Virtualized Exadata

VM provisioning on Virtualized Exadata involves reliable, automated, & scheduled mass deployment of RAC Cluster

• Includes VMs/DB/GI/ASM

Create / delete RAC Cluster

• Including DB/GI/ASM

Scale up / down RAC Cluster by adding or removing VMs

• Includes DB/GI/ASM

Exadata Provisioning Workflow


Demo Exadata Virtualization Provisioning


Natively Integrated Exadata Hardware Management


Availability Incidents Monitoring

Photo Realistic View

Hardware depicting faulty components (in red) and integrated with Incident Manager


ILOM Details & Health Status

Photo-Realistic ILOM Front View integrated with Incident Manager

Hardware View

Logical View

Energy

Network Connectivity

Service Processor Configuration

Hardware View of ILOM

Incidents Dashlet

Resource Usage


Logical View of ILOM

Logical View

New Target Type in EM 13c for Hardware

CPU Summary & Status


Compute Server Details

Administrators have hardware visibility and can correlate errors across tools (for example, an intermittent error with an Exadata performance issue)

Exadata Component Level View

Open Incidents

Photo-Realistic View

Hover your mouse cursor over component to see details


Patching Automation Support of Exadata Stack

• Quarterly Full Stack Download Patch (QFSDP) is released every quarter

• Primary components are:

  • Database (Database, Clusterware)

  • Infrastructure (Exadata Storage Server, InfiniBand Switch, PDU)

• For more details, refer to MOS Doc ID: 888828.1

• Before EM 13c (that is, with EM 12c), only the database software could be patched from EM

• Starting with EM 13c, you can also patch:

  • Compute Nodes: Firmware and OS

  • Storage Server Cells: Firmware and cell software

  • InfiniBand Network: Switch firmware

Exadata System Patch Management

Extending patching beyond the Database software

EM 13c patching support covers:

• Database Grid (Database Servers): Oracle GI / RDBMS, Firmware / OS

• InfiniBand Network (Switches)

• Storage Grid (Exadata Storage Servers)

Supports application of the complete system patch: Quarterly Full Stack Download Patch (QFSDP)

Exadata Systems Patching

Navigation: Targets --> Exadata --> Target Name --> Database Machine --> Software Update

Comprehensive overview of the maintenance status & needs

Proactive patch recommendations for the quarterly full stack patches

Supports automatic patch download and patching in either rolling or non-rolling mode

Ability to schedule runs & get notified of the status updates

Granular step-level status tracking with real time updates

Log monitoring & aggregation, support for quick filing of support issues with pre-packaged log dumps

QFSDP: Quarterly Full Stack Download Patch for Exadata

You can select a QFSDP, analyze it, and then deploy it

Exachk Integration

Exachk proactively scans for the most impactful problems across the various layers of stack in Oracle Database and Exadata

Install, setup, upgrade & schedule Exachk utility from EM

EM shows Exachk results as EM Compliance Standard violations

Customer benefits with Exachk best practices

For more details on Exachk refer to ORAchk/Exachk Master Reference, MOS Doc ID: 1969085.1


Navigation: Enterprise --> Compliance --> Library --> Compliance Standards

Exachk Compliance Results

Navigation: Enterprise --> Compliance --> Results

An event is generated for each compliance violation

Event type is “Compliance Standard Rule Violation”

Auto Service Request (ASR) Integration: Fault Telemetry

Provides ASR capability for Exadata

One emcli command to turn on ASR

View ASR Incidents in EM Incident Manager

Auto create SRs for faults & update SR# to EM

Works for all Systems qualified for ASR

Example of Viewing ASR Incidents in Enterprise Manager


Support for Exadata Flash Cache Features

Monitor X5 “Extreme Flash” Configurations

• Enhanced DB Machine Schematic diagram

• Enhanced Incident Details page for disk failure

• Enhanced capacity reporting, IORM and performance charts

Flash I/O Resource Monitoring & Management

• Flash Cache Space Usage Monitoring

• Administration of Flash I/O Resource Management

• New charts on Cell and Grid Home page, Performance page

• I/O Resource consumption and performance monitoring for Flash

Monitor X5 “Extreme Flash” Configuration

Performance Section for Flash Read & Write

All Flash and no Hard Disk

Note only Flash Disk Size

Monitor Flash IO service time

Navigation: Target Navigation Icon --> Exadata Storage Server Grid

Flash Cache Space Usage Monitoring

Navigation: Exadata Storage Server Grid --> Administration --> Manage IO Resources --> Flash Cache Space Usage

Chart to illustrate Current & Historical Flash Cache Space Usage

Optimize Flash Space Usage of Critical Database

Current Flash Space Usage

Historical Flash Cache Space Usage

Administration of Flash I/O Resource Management

Flash I/O Resource Management Protects the Latency of Critical OLTP I/O Requests in Flash Cache

Easily manage your IORM settings for Flash Cache

Set Minimum & Maximum Size for your Flash Cache per database

Share based plan

Navigation: Exadata Storage Server Grid --> Administration --> Manage IO Resources
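The per-database minimum and maximum flash cache sizes mentioned above can be sanity-checked before they are entered in the Manage IO Resources page. The following is a minimal Python sketch, not EM or CellCLI code: the function name, the database names, the sizes, and the rule that the guaranteed minimums should together fit in the flash cache are assumptions made for illustration only.

```python
def validate_flash_cache_plan(total_flash_gb, limits):
    """Return a list of problems found in a per-database flash cache plan.

    limits maps a database name to (min_gb, max_gb). All names and sizes
    are hypothetical; the rules below are illustrative sanity checks only.
    """
    problems = []
    for db, (min_gb, max_gb) in limits.items():
        if min_gb < 0 or max_gb <= 0:
            problems.append(f"{db}: sizes must be positive")
        elif min_gb > max_gb:
            problems.append(f"{db}: minimum {min_gb} GB exceeds maximum {max_gb} GB")
        elif max_gb > total_flash_gb:
            problems.append(f"{db}: maximum {max_gb} GB exceeds flash cache size")
    # Assumed rule: guaranteed minimums must not oversubscribe the cache.
    total_min = sum(min_gb for min_gb, _ in limits.values())
    if total_min > total_flash_gb:
        problems.append(
            f"minimums total {total_min} GB, more than {total_flash_gb} GB available"
        )
    return problems


# Hypothetical plan: a critical OLTP database and a reporting database.
plan = {"oltp_prod": (400, 800), "reporting": (100, 300)}
print(validate_flash_cache_plan(1000, plan))
```

An empty list means the plan passed these checks; the actual limits enforced by the storage cells should be confirmed in the Exadata documentation.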

New charts on Cell and Grid Home page, Performance page


Monitor Flash performance and IORM waits

Numbers will be workload dependent

Any deviation from normal baselines should be investigated
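One simple way to express "deviation from a normal baseline" is to compare a new sample with the mean and standard deviation of earlier samples. The sketch below is only an illustration under stated assumptions: the sample values, the function name, and the three-sigma cutoff are hypothetical, and EM's own baseline features are not used here.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline, current, n_sigma=3.0):
    """Return True if current is more than n_sigma standard deviations
    away from the mean of the baseline samples (needs >= 2 samples)."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change counts as a deviation.
        return current != mu
    return abs(current - mu) > n_sigma * sigma


# Hypothetical flash I/O service times in milliseconds.
baseline_ms = [0.50, 0.52, 0.48, 0.51, 0.49]
for sample in (0.53, 0.90):
    print(sample, deviates_from_baseline(baseline_ms, sample))
```

Because the numbers are workload dependent, the baseline should be taken from the same workload window that the new sample is compared against.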

Target Navigation Icon Exadata Storage Server Grid


Lessons from Oracle Public Cloud

Enterprise Manager used to monitor entire public cloud infrastructure

Largest Monitored Site

• Over 2.5 Million Targets

• 51 Exadata, 11k Databases, 150K WLS servers, 900K SOA Composites, 1350 Business Applications, 5k Beacons, 24k Agents

• 3.4 million events processed per day

Leverage groups for everything

• Group of group hierarchies aligned with how they manage targets. Largest group: over 200k targets

• Use System Dashboard to monitor top level groups

• Provides summary counts of target down, critical and warning incidents for each top level group

Set Lifecycle Status for the most important targets

• Guarantees rapid delivery of notifications & creation of incidents/tickets for the most important events

Tight control over metrics & thresholds

• Disable unused metrics

• Put meaningful thresholds on metrics they care about

• Create incidents on important events.

• Use templates to deploy metric settings.

Blackouts at group level, control who sets the blackouts

Over 2.5M targets & 3.4 Million Events per Day

MOS Note 1929586.1 - Oracle Enterprise Manager 12c Configuration Best Practices


Oracle Exadata Monitoring and Management Best Practices

Presented by : Rick Shawver

October 2015

Exadata Monitoring with 12c

Wells Fargo DAN Infrastructure Team

We build, support, and manage standardized, shared environments for Oracle

OID, RMAN, ASM, SAN, NAS, Oracle s/w stack, Grid Control

35000 Targets split between two Grid Control environments (prod and nonprod)

Directly support approximately 640 hosts, 160 clusters, 1756 databases, 5102 instances, etc

Same 11.2.0.4 and 12.1.0.2 versions across the environment

Red Hat 5/6, Solaris 10/11, OEL 6 (Exadata), Oracle ZBA

We provide triage support for all of the above to the LOB DBAs

So what’s new?

We got our first 14 Exadata appliances!

Target usage initially for a few critical applications

Shared environments

Predominately used for OLTP

Now we’re building on our history of lessons learned to develop compatible standards that can be applied to the Exadata architecture

In our new role as DBMAs, we have added patching, upgrading, and managing the Exadata infrastructure

Exadata Monitoring with 12c

The Exadata plugin exposes the Appliance Components

* Monitoring complexity is increased if the Exadata platform is used to house databases for multiple LOBs and the Exadata infrastructure is maintained by yet another team.

Exadata Monitoring with 12c

The Exadata plugin displays all of the Appliance Components. Some examples:

* The Exadata cell information is a component that the Application DBA team wanted access to. This required additional group and incident rule setup.

Exadata Monitoring with 12c


The 12c target type “oracle_dbmachine” is a composite target that includes ALL targets within the Exadata environment.

Items to consider when setting up Exadata monitoring:

• In a typical group/role security model, adding the “Exadata Database Machine” to any group adds all databases as well as all platform-related targets such as cells, ILOMs, PDUs, network, etc.

• LOB support teams require 12c access to their databases and shared targets such as hosts, agents, CRS, cells, ASM, and listeners, but NOT ILOMs, PDUs, network, etc.

• There are cases where databases such as fsdb require monitoring with alerts directed to the Exadata support team.

How Wells Fargo handles this unique separation of duties and alert notifications

• Create a separate 12c group for those Exadata targets that the LOB teams will have access to. This includes the Exadata Storage Server Grid and the Exadata Storage Servers.

• Create two 12c groups for the Exadata support team:

  • Group 1 contains all platform-related Exadata target types, excluding the DB Machine.

  • Group 2 contains only the top-level DB Machine targets. This is used for the Incident rule alerting on the overall DB Machine status.

• Place the databases that each LOB and Exadata support team is responsible for in groups that each team has full control over.

• Create 12c Roles that contain the appropriate groups for each support team.
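The group layout described above can be sketched as a small data model. This is a Python illustration only, not EM or emcli code: the target names, the `kind` labels, the team names, and the group names are all hypothetical, and in practice the groups are created in Enterprise Manager itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    kind: str        # illustrative label, not an actual EM target type name
    owner: str = ""  # team responsible for a database, if any


# Hypothetical inventory for a single Database Machine.
INVENTORY = [
    Target("dbm01", "db_machine"),
    Target("dbm01_cell_grid", "storage_grid"),
    Target("dbm01cel01", "cell"),
    Target("dbm01cel02", "cell"),
    Target("dbm01-ilom01", "ilom"),
    Target("dbm01-pdu-a", "pdu"),
    Target("dbm01-ibsw1", "ib_switch"),
    Target("lending_db", "database", owner="lending"),
    Target("fsdb", "database", owner="exa_support"),
]

LOB_SHARED_KINDS = {"storage_grid", "cell"}
PLATFORM_KINDS = {"storage_grid", "cell", "ilom", "pdu", "ib_switch"}


def build_groups(inventory):
    """Split targets into the groups described in the bullets above."""
    groups = {
        # Cells and storage grid that LOB teams get view access to.
        "LOB_EXADATA_VIEW": [t.name for t in inventory if t.kind in LOB_SHARED_KINDS],
        # Group 1: platform targets, deliberately excluding the DB Machine.
        "EXA_SUPPORT_PLATFORM": [t.name for t in inventory if t.kind in PLATFORM_KINDS],
        # Group 2: only the top-level DB Machine targets.
        "EXA_SUPPORT_DBMACHINE": [t.name for t in inventory if t.kind == "db_machine"],
    }
    # One database group per responsible team.
    for t in inventory:
        if t.kind == "database":
            groups.setdefault(f"{t.owner.upper()}_DATABASES", []).append(t.name)
    return groups


groups = build_groups(INVENTORY)
for name, members in groups.items():
    print(name, members)
```

Keeping the DB Machine in its own group matters because the composite target expands to every member target when it is added to a group or an incident rule.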

There is an automatic expansion of the DB Machine targets. If the DB Machines are added to an Incident Rule, ALL targets on the DB Machine will be included.

Notice both groups appear to have the same targets. However group 2 only contains:

There are Incident rules for Exadata targets in one rule set, AND a specific DB Machine Ruleset.

The two Exadata rules for the Exadata Support team used two different groups:

The typical setup for LOB support groups (those supporting DBs on Exadata) required view access into the Exadata Cells

A second LOB group was needed because “view” access is given to the cell targets. The main LOB group propagates “full” for all targets included.

The LOB Exadata group is only given View access to the cells.

The role containing these groups is assigned to the DBAs in that LOB.

e.g. View Role: LENDING_CORE_ROLE


The LOB rules also include a specific rule for cells monitoring.


The Exadata Plugin gives a detailed look into the setup and health of an Exadata environment. However, care must be taken to secure and authorize only those targets that are required for each support role.


Monitoring a shared Exadata environment requires:

• The ability to limit LOB DBA access to those targets they are responsible for.

• The ability to produce alerts for LOB DBAs that exclude specific platform hardware targets that are the sole responsibility of the Exadata support team.

• The ability to give the LOB DBA team a view into the Exadata cells and let them receive alerts for the cells.

• The ability to alert on the DB Machine target without inundating the Exadata support team with database alerts that should only be sent to the LOB DBA teams.

• The ability for the Exadata support team to receive alerts for databases they are responsible for (such as fsdb).

Enhancing Exadata 12c Monitoring utilizing Metric Extensions

The Exadata Plugin exposes metrics for the Exadata Target Types for monitoring/alerting.

New incident rules were created with all “out of the box” metrics that Oracle defined thresholds for.

The Exadata Plugin also exposes additional metrics that cannot be used directly for alerting

Some metrics are collected from the Exadata environment, but thresholds cannot be set for them.

In cases where metrics are exposed but not available for alerting, repository-side Metric Extensions can be used to produce alerts on those metrics that have not been defined with thresholds.
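Conceptually, a repository-side Metric Extension runs a query against metric data that is already stored in the EM repository and then applies warning and critical thresholds to the returned values. The Python sketch below simulates only that threshold step on in-memory rows. The metric column name, target names, values, and thresholds are hypothetical, and a real Metric Extension is defined in the EM console rather than in Python.

```python
from typing import List, NamedTuple, Optional, Tuple


class Sample(NamedTuple):
    target_name: str
    metric_column: str  # hypothetical column name, not taken from the plugin
    value: float


def severity(value: float, warning: float, critical: float) -> Optional[str]:
    """Map a value to a severity, assuming higher values are worse."""
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return None


def evaluate(samples: List[Sample], metric_column: str,
             warning: float, critical: float) -> List[Tuple[str, str, float]]:
    """Return (target, severity, value) for every sample over a threshold."""
    alerts = []
    for s in samples:
        if s.metric_column != metric_column:
            continue
        sev = severity(s.value, warning, critical)
        if sev is not None:
            alerts.append((s.target_name, sev, s.value))
    return alerts


# Rows standing in for data already collected by the Exadata Plugin.
rows = [
    Sample("dbm01cel01", "example_cell_metric", 72.0),
    Sample("dbm01cel02", "example_cell_metric", 91.5),
    Sample("dbm01cel03", "example_cell_metric", 97.0),
    Sample("dbm01cel01", "other_metric", 99.0),
]
print(evaluate(rows, "example_cell_metric", warning=90.0, critical=95.0))
```

The resulting alerts correspond to the events that an incident rule for the LOB DBA teams would then pick up.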

Application LOB teams desired additional cell alerting that was not available “out of the box” but that could be exposed via Metric Extensions.

Using Repository side Metric Extensions the desired metrics can be alerted on:

Custom repository-side Metric Extensions were created and then added to the LOB DBA Incident Rules.

Repository side Metric Extensions use the metric data that is populated when the Exadata Plugin is deployed:

The repository-side Metric Extensions use data that is already being captured by the Exadata Plugin.

Summary: The Exadata Plugin provides the 12c functionality needed to “See” into and monitor your Appliance.

• The plugin provides an extensive picture of what your Exadata environment looks like, giving an easy-to-understand landscape of all involved components.

• Performance data for all Exadata components is available either through the GUI or 12c reporting.

• Monitoring and alerting for all Exadata Target Types can be enabled to ensure stability and reduce the risk of outages.

• 12c allows for very specific Exadata target security for a variety of support roles.

• 12c Exadata monitoring can easily be extended with Metric Extensions to enhance alerting capabilities.

Oracle Exadata Monitoring and Management Best Practices

Om Prakash Seth

Vice President - IT


HDFC Bank: Bank aapki Muththi Mein (Hindi, roughly “the bank in your grasp”)

HDFC Bank Limited, incorporated in 1994, is an Indian banking & financial services company headquartered in Mumbai, Maharashtra, India

Largest private sector bank in India by market capitalization as of Feb. 2014

Winner of Best Asian Bank award 2015

Top 100 most valuable global brands in 2015 with a value of $14 billion

Ranked as “Most Valuable Indian Brand” for second consecutive year

Go Digital: the Bank offers a 10-second personal loan, a 30-minute auto and two-wheeler loan, and the Chillr app & Payzapp as part of its digital banking initiative

About me: Om Prakash Seth is VP - IT & Joint Incident Management Head and manages production incidents for mission-critical Core banking applications. Om led the implementation of Oracle Super Cluster at HDFC Securities in 2013, the world's first Super Cluster implementation in the e-broking segment, which was recognized as "Best Technology Implementation of the Year" at the Asian Banker awards in 2014.

Evolution of Engineered Systems in HDFC Bank

Extreme Performance

High Availability

Scalability

Technology Management

Purpose Built Engineered Machines

Engineered Systems Journey in HDFC Bank

2012 : Migrated Retail Assets Solution to Exadata

2013 : Implemented Core Trading Platform on Oracle Super Cluster in HDFC Securities Ltd

2013 : Migrated Basel GL Solution to Exadata

2015 : In Process of Migrating Core Retail Banking Platform to Oracle Super Cluster

Engineered Systems Footprint in HDFC Bank

3 full-rack & 5 half-rack Oracle Super Cluster machines being deployed across 3 data centers in HA for running the Core Retail Banking System

10 Exadata database machines running critical workloads: Retail Asset, eBusiness Suites

2 half-rack Oracle Super Cluster machines deployed across 2 data centers for running the Core Trading Solution

Drivers for OEM12c

Real time Centralized Monitoring

Notification & Unified Dashboard

Performance Optimization & Advisory

Managing DB operations

Database Life Cycle Management

Automated Fault Tracking and Resolution

Exadata & Super Cluster Monitoring : Best Practices

Implemented OEM12c With ASR & Platinum Support

Implement IORM for prioritizing application workload

Keep your OEM 12c at the latest patch set level

Install OEM 12c Exadata Plug-in for monitoring & enhanced performance impact detection

Take advantage of performance & tuning, advance notification features of OEM 12c

Implement Monitoring template for Incident creation, notification for hardware & database

Implement ASR

OEM12c @ HDFC Bank : Technology benefits Achieved

End to End Monitoring

• Integrated Management Console for H/W + S/W

• Single Dashboard for all Engineered Systems

Performance & Availability Management

• Comprehensive view including Performance, availability, usage by databases, services, clusters through OEM 12c Plug-ins

• Enabled out of the box alerts for databases, cluster, ASM, Topology view of DB systems/clusters

Standardization & Compliance

• Enabled consolidated Configuration view including Version summary, patch recommendation & Ongoing Database Provisioning

• Exadata and Super Cluster specific Compliance evaluation, ongoing Drift tracking across the stack

Cloud Ready Architecture

• Enabled Private database cloud & smart UAT cloud

• Eliminated need to store multiple copies through Snap Clone on Exadata

Thank You