Oracle Exadata High-Availability Secrets Explained · MAA Solutions Architect . Oracle Development,...


Page 1: Oracle Exadata High-Availability Secrets Explained · MAA Solutions Architect . Oracle Development, Systems Technology Group . October 27, 2015 . Oracle Exadata High-Availability

Copyright © 2015 Oracle and/or its affiliates. All rights reserved. |

Michael Nowak MAA Solutions Architect Oracle Development, Systems Technology Group October 27, 2015

Oracle Exadata High-Availability Secrets Explained Direct from Development CON8823


Program Agenda

1. Key Criteria to World Class Application Level Availability
2. Exadata HA New/Marquee Features and Secrets Revealed
3. Demo: A Busy Day for Exadata; An Easy Day for You


World Class Application Level Availability
Application Service Level Focus

• Exadata has supported very stringent service levels for years. Back story:
  – 2011 MAA Exadata HA low brownout video series (https://vimeo.com/62754145)
  – 2013 & 2014 OOW presentations (http://education.oracle.com)
  – Exadata documentation (docs.oracle.com)
  – MAA collateral (http://www.oracle.com/au/products/database/exadata-maa-131903.pdf)
• Let's take a quick look at how this was done

Engineered System

Development Prioritization
Application Service Level Focus for all Customers

• Evolution beyond fault tolerance / component level HA
• Full stack testing with service level focus
• Service level regression is a P1 inside development

Service Level Agreements Met or Exceeded Through Unplanned and Planned Outages

[Figure: availability scale running from Poor Availability through Average Availability and Great Availability to Extreme Availability, with markers for "Your Service Level Agreement" and "Oracle's Goal for your Service Level"]


Development Backed Best Practices
Continuous improvement, always a priority

[Figure: best practice lifecycle. Labels: Idea; Weekly Expert Review / Testing; Publication; MOS Note 757552.1; Exadata Health Check (Exachk); Default Exadata deployment; Engineered System with Best Practices. Callouts: "You are here" and "But you are also here!"]


Development Backed Best Practices
Exachk – Just Some of the Exachk Enhancements Coming your Way

• New checks
  – Verify clusterware state is "normal"
  – Verify Exadata Smart Flash Cache Status is "normal"
  – Verify one or more non-default AWR baselines were created
  – Update asm diskgroup attribute checks
    – asm diskgroup attribute disk_repair_time = 3.6h
    – asm diskgroup attribute failgroup_repair_time = 24.0h
  – Updated database node MTU recommendation

• New features
  – Ability to execute with storage servers in lock down mode (12.1.2.2.0 feature)
  – Profile to check that exadata default passwords have been changed
  – Profile for hardware only checks
  – Support for 1/8 virtualized Exadata
  – Support for caching of exachk environment discovery information
  – Support for SAP, User Defined Checks, GoldenGate in MAA scorecard
  – Command line option to execute or exclude one or more check(s)

Over 1000 checks per target in exachk today!
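To make the ASM disk group attribute checks concrete, here is a minimal sketch of what such a comparison could look like. It is not how exachk is implemented; the function names, the input dictionary, and the finding strings are all hypothetical. Only the two attribute names and their recommended values (3.6h and 24.0h) come from the slide.

```python
# Recommended values taken from the slide; everything else here is illustrative.
RECOMMENDED_HOURS = {
    "disk_repair_time": 3.6,
    "failgroup_repair_time": 24.0,
}


def parse_hours(value: str) -> float:
    """Convert a value such as '3.6h' or '216m' into hours."""
    text = value.strip().lower()
    if text.endswith("h"):
        return float(text[:-1])
    if text.endswith("m"):
        return float(text[:-1]) / 60.0
    raise ValueError(f"unrecognized time unit in {value!r}")


def check_attributes(attrs: dict) -> list:
    """Return one finding per attribute that is missing or off recommendation."""
    findings = []
    for name, want in RECOMMENDED_HOURS.items():
        raw = attrs.get(name)
        if raw is None:
            findings.append(f"{name}: not set")
        elif abs(parse_hours(raw) - want) > 1e-9:
            findings.append(f"{name}: {raw} (expected {want}h)")
    return findings
```

An empty list means both attributes match the recommendation. In practice the input dictionary would have to be filled from the disk group's actual attribute values, which this sketch leaves out.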


Development Backed Best Practices
MAA Integration

[Figure: MAA architecture showing a Production Site and an Active Replica, with the components below]

• RAC – Scalability – Server HA
• ASM – Local storage protection
• Flashback – Human error correction
• RMAN, Oracle Secure Backup, Recovery Appliance – Backup to disk, tape or cloud
• Active Data Guard – Data Protection, DR – Query Offload
• GoldenGate – Active-active replication – Heterogeneous
• Edition-based Redefinition, Online Redefinition, Data Guard, GoldenGate – Minimal downtime maintenance, upgrades, migrations
• Enterprise Manager Cloud Control – Site Guard, Coordinated Site Failover
• Application Continuity – Application HA
• Global Data Services – Service Failover / Load Balancing


“a fact or piece of information that is kept hidden from other people”
“a special or unusual way of doing something to achieve a good result”
“something that cannot be explained”

– Webster’s Dictionary


Exadata HA features

• Auto online

• Auto disk management

• Large connection support

• Redundancy protection on cellsrv shutdown

• Health factor on predictively failed disks

• Disk confinement

• Automatic LED support for disk removal

• Fast cell death detection

• Network Resource Management

• Automatic ASM mirror read on IO error corruption

• IO error prevention with disk scrubbing / ASM repair

• Drop hard disk for replacement

• Drop BBU for Replacement

• Blue OK-to-remove LED light notification for redundancy protection

• Efficient resilver rebalance after flash failure

• Appliance mode support

• Active Active IB

• ILOM hang detection and repair

• Drop disk for replacement

• Content type support

• Improved EM failure reporting

• IO Resource Management

• Corruption prevention with HARD support

• Priority rebalance support

• I/O latency capping for reads and writes

• Cell IO timeout threshold

• Smart Write Back Flash Cache persistence

• IO Resource Management

• Exadata Smart Logging

• Cell Alert Summary

• Flash and Disk Life Cycle Management Alerts

• Elimination of false positive drive failures

• Cell to Cell offload

• Exadata AWR support

• Minimum flash cache reservation per database

• MS on database servers

• IO hang detection and repair

• Redundancy protection on cell shutdown

• Updating database nodes with patchmgr

• Custom Diagnostic Package for Cell Alerts

• 8-Socket Database Nodes kdump

• Cell-to-Cell Rebalance Preserves Flash Cache


HA Features Supporting Stringent Application Service Levels for Years


Exadata HA Features
HA Categories Supporting Stringent Application Service Levels for Years

• Data Protection
• Quality of Service
• Management
• Performance
• Reduced HA Brownout


Exadata Marquee/New HA Features
Reduced HA Brownout – Fast Node Death Detection on Database Nodes and Cells

[Chart: example of database node power failure with an OLTP workload and CSS misscount=60]


Fast Node Death Detection

The feature is available for both cell and database node failures. It is implemented via callouts to the InfiniBand (IB) Subnet Manager when we think something is wrong. Diskmon performs the check for cells and gipcd performs it for database nodes. If the InfiniBand Subnet Manager tells us both ports are down in the fabric, we can proceed efficiently with the eviction.
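The decision described above can be sketched roughly as follows. This is a simplified illustration, not the actual Diskmon or gipcd logic: the `PortStatus` type, the function name, and the returned strings are hypothetical. The `misscount_s` parameter stands in for a heartbeat timeout such as the CSS misscount=60 mentioned on the previous slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PortStatus:
    """What a (hypothetical) Subnet Manager query reports for a node's two IB ports."""
    port1_up: bool
    port2_up: bool


def eviction_decision(missed_heartbeat_s: float,
                      misscount_s: float,
                      ports: Optional[PortStatus]) -> str:
    """Decide what to do about a node that may have died.

    Returns "healthy", "wait", or "evict".
    """
    if missed_heartbeat_s <= 0:
        return "healthy"
    # Fast path: the fabric confirms both ports are down, so evict right away.
    if ports is not None and not ports.port1_up and not ports.port2_up:
        return "evict"
    # Slow path: no confirmation from the fabric, fall back to the timeout.
    if missed_heartbeat_s >= misscount_s:
        return "evict"
    return "wait"
```

The point of the sketch is the fast path: with confirmation from the fabric, eviction does not have to wait for the full timeout to expire.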


Exadata Marquee/New HA Features Data Protection - Redundancy Check When Powering Down Storage Server


Exadata Marquee/New HA Features
Data Protection – Corruption Detection, Mirror Read, and Repair

• Database update encounters corruption
• Database reads ASM mirror copy and repairs corruption
• OLTP, Analytics, Consolidation, In-Memory DB continue to run without ever noticing the failure

Just in case the Administrator would like to know, we log the following:

<Database side>
Corrupt block relative dba: 0x16400087 (file 89, block 135)
Bad check value found during multiblock buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x16400087
 last change scn: 0x0000.b6702b33 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x2b330601
 check value in block header: 0xa07a
 computed block checksum: 0x3
Reading datafile '+DATA/qs/datafile/c.257.825768683' for corruption at rdba: 0x16400087 (file 89, block 135)
Read datafile mirror 'DATA_CD_08_CELL13' (file 89, block 135) found same corrupt data (no logical check)
Read datafile mirror 'DATA_CD_07_CELL14' (file 89, block 135) found valid data
Hex dump of (file 89, block 135) in trace file /u01/app/oracle/diag/… /qs1_ora_60475.trc
Repaired corruption at (file 89, block 135)
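The detect, mirror read, and repair flow can be illustrated with a small sketch. It is a toy model under stated assumptions, not the database or ASM implementation: blocks are plain byte strings, CRC32 stands in for the real block checksum, and the function and copy names are hypothetical.

```python
import zlib


def checksum(data: bytes) -> int:
    """Stand-in checksum (CRC32); the real block check value works differently."""
    return zlib.crc32(data)


def read_with_repair(copies: dict, expected: int):
    """Find a valid copy of a block and overwrite any corrupt copies with it.

    copies maps a copy name (primary or mirror) to its bytes.
    Returns (valid_data, names_of_repaired_copies).
    """
    good = None
    for data in copies.values():
        if checksum(data) == expected:
            good = data
            break
    if good is None:
        raise IOError("no valid copy of the block was found")

    repaired = []
    for name, data in copies.items():
        if checksum(data) != expected:
            copies[name] = good  # repair the corrupt copy in place
            repaired.append(name)
    return good, repaired
```

As in the logged example, the first mirror may hold the same corrupt data, so the search continues until a copy with a valid check value turns up.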


Exadata Marquee/New HA Features
Data Protection – Power Cycle Drive to Avoid False Positive Drive Failure

• Drive reported as failed but not physically failed
• Drive automatically resurrected and resynced

This feature works on both X5-2 High Capacity and Extreme Flash cells


Exadata & ASM Coordination

Exadata and ASM exchange information at very low levels to keep each other informed as to what is going on. You saw a few examples here with the redundancy check, automatic corruption detection/repair, and the disk resynchronization after disk power cycle. Another neat example is when we direct slow IOs to ASM mirrors to improve performance.


Exadata Marquee/New HA Features Management – Exadata AWR Support

Exadata Marquee/New HA Features
Management – Exadata Support in Database Active Report

@perfhubrpt


Exadata AWR Report HTML Requirement

If you are like me, you like to whip up a quick AWR report in txt format when debugging a performance problem. Keep in mind that the Exadata section of the AWR report requires you to generate your AWR report in HTML format, not txt format. HTML enables very useful features like special colors and links for outliers so they stand out.

Note: in 12.1.0.2 BP13, the txt report will contain a reminder of this fact.


Exadata Marquee/New HA Features
Performance

• These features have been meeting or exceeding application service levels for years via big performance and availability gains with Analytics/Reporting, OLTP and Database Consolidation workloads
  – Exadata Smart Scan
  – Exadata Smart Logging
  – Exadata Smart Persistent Write Back Flash Cache
  – Exadata Active/Active IB network


Reverse Offload

The concept of offloading work to the cells is generally well understood, as it has been available since Exadata day 1. We also have the concept of reverse offload, which pushes work from the cell back to the database node when it makes sense. Several enhancements were implemented in 12.1.2.2.0 which improve reverse offload performance by up to 15%. With varying configurations and workloads, keeping the database nodes and cells in lock step with respect to resource utilization maximizes performance and is a unique advantage of an engineered system.


Exadata Marquee/New HA Features
Quality of Service – IO Latency Capping

[Figure: three storage cells, Cell1, Cell2 and Cell3]

IO Latency Capping
• Flash disk in Cell1, PCI slot 5 is exceeding performance thresholds during a database IO
• If it is a read, it is cancelled and automatically redirected to partner Cell3. Alert log reports “NOTE: ASM has redirected some slow reads to mirror sides to improve performance.”
• If it is a write, it is cancelled and temporarily written to a healthy flash disk on the same cell.

IO latency capping works for both flash and hard disks
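The routing rule for slow IOs can be written down as a small sketch. The function name, parameters, threshold, and returned strings are hypothetical. Keeping the IO on the original disk when no mirror or healthy local disk exists is an assumption made here for completeness, since the slide does not describe that case.

```python
def handle_slow_io(kind: str,
                   latency_ms: float,
                   threshold_ms: float,
                   mirror_available: bool,
                   healthy_local_disks: list) -> str:
    """Describe what happens to an IO on a disk that may be running slow."""
    if latency_ms <= threshold_ms:
        return "complete on original disk"
    if kind == "read":
        # Slow read: cancel it and let a mirror copy on a partner cell serve it.
        if mirror_available:
            return "redirect read to mirror"
        return "keep waiting on original disk"  # assumption, not from the slide
    if kind == "write":
        # Slow write: cancel it and park the data on a healthy disk in the same cell.
        if healthy_local_disks:
            return f"write temporarily to {healthy_local_disks[0]}"
        return "keep waiting on original disk"  # assumption, not from the slide
    raise ValueError(f"unknown IO kind: {kind!r}")
```

Reads and writes take different detours because a read can be satisfied by any valid mirror, while a write has to end up on the cell that owns the target disk.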


Exadata Marquee/New HA Features
Quality of Service – Disk Confinement

[Figure: three storage cells, Cell1, Cell2 and Cell3]

Disk Confinement
• Disk in Cell2, slot 7 becomes sick and is taken offline
• IOs redirected to one of the partner disks on Cell1, slot 3
• Dr. Exadata runs diagnostics on the disk to determine health
• If deemed healthy, disk is returned to online status and resynced
• If deemed unhealthy, health factor drop is performed, and blue LED is lit when rebalance completes


Does confinement just use reads to assess the sick disk?

Actually no. Most of the automatic tests are read based, but there is a small subset of write based diagnostics (the data is read and written back). Confinement is already very thorough today but keeps getting even better as we try to cover every possible case of service level interruption.

What if the sick disk's partners are not available?

Exadata is always very aware of protecting redundancy. We will never confine or redirect IO for a disk that contains the last copy of an extent.
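Putting the confinement steps and the redundancy rule together, the outcome for a sick disk could be sketched as below. The function name, inputs, and returned strings are hypothetical. The real diagnostics are far more involved than a single pass/fail flag.

```python
def confinement_outcome(holds_last_copy: bool, diagnostics_passed: bool) -> str:
    """Summarize what happens to a disk that starts behaving badly."""
    # Redundancy comes first: a disk holding the last copy of an extent
    # is never confined and its IO is never redirected.
    if holds_last_copy:
        return "not confined"
    # Otherwise the disk is taken offline, partners serve the IO,
    # and diagnostics decide what happens next.
    if diagnostics_passed:
        return "returned online and resynced"
    return "health factor drop; blue LED lit when rebalance completes"
```

The redundancy check is placed first on purpose, since no diagnostic result should ever lead to taking the last copy of data offline.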


Demo: A Busy Day for Exadata; An Easy Day for You

• This demo takes us through a cell patching scenario on an Exadata quarter rack, with two unexpected events occurring beforehand
• Although the sequence of events is shown at a high level, the outages shown here were actually induced on an Exadata rack running our latest production software last week, with a real OLTP workload running. All charts represent real data.
• For a very technical, detailed readout of HA unplanned and planned outage cases including those illustrated here and many more, see the links provided at the beginning of this presentation, especially https://vimeo.com/62754145
• Three outages in one day are shown for illustration purposes. It is not a typical day.


Thursday (The Day Before Cell Patching)

Your Exadata quarter rack is ready to go.

Customer SLA is >= 800 transactions per second with consistent application response times. Current service levels are exceeded.

You are double checking the patch procedure for tomorrow. There is one extra step this time…exachk has advised you about a rare issue that can be avoided by manually resetting the ILOM just before patching so it has been added to the plan.


Friday 3:00 am

Hard disk fails and disk drop rebalance begins. High priority files (ex: control and log files) are processed first, and they complete within a few minutes as seen in the ASM alert log.

Customer SLAs continue to be met despite the hard disk failure.

You are getting some good sleep, hopefully not dreaming about cell patching!


Friday 6:00 am

Disk drop rebalance finished a short time ago. Now at 6:00, data center tech notices blue light and replaces disk. Disk add rebalance begins.

Customer SLAs look good with the Exadata best practice power limit of 4.

I told you it was a good sleep.


Friday 8:30 am

Disk add rebalance continues. Customer SLAs continue to look good.

Slept in a bit today. Post shower, you are checking the news when you notice the Exadata alerts in your email describing disk failure and replacement. You check v$asm_operation and see the disk add rebalance is almost done.


Friday 10:00 am

Disk add rebalance completes. Customer SLAs continue to be met. You had to make a quick stop on the way to the office, but did notice the disk add rebalance completed via alert that popped up as an email notification on your phone.


Friday 12:00 pm

Exadata still humming along. Customer SLAs look good. You make a very good point in a lunch meeting. All are impressed.


Friday 1:30 pm

Exadata detects the hard failure of cell1 in less than one second and evicts the cell, allowing the application to continue.

Customer service level sees a very small 1 second brownout, then back to business as usual.

Time to execute that one extra pre-cell patching step, the ILOM reset. Still thinking about the good point you made during the meeting, you type ‘reset /SYS’ on the first cell instead of ‘reset /SP’.


Friday 1:35 pm

Exadata detects the cell has come back up and begins fast resynchronization process.

Customer SLAs look good during the resynchronization running at best practice power level of 4.

After calming yourself down, you check Enterprise Manager and just see a momentary wait event blip. You breathe a sigh of relief and finish off the pre patch steps properly.


Friday 3:00 pm

Fast resynchronization of cell1 completed a short time ago.

Customer SLAs look good during and after the resynchronization.

It’s a really nice day outside today. You need a break.


Friday 4:30 pm

Exadata gracefully takes cell1 offline and starts patching.

Graceful cell offline has no impact to the customer SLA.

You start cell rolling patching.

# patchmgr -cells <cell_group> -patch -rolling

Realizing the time, you head out because happy hour is starting.
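The rolling approach in this part of the demo, one cell at a time with redundancy protected throughout, can be sketched as a simple loop. This is not how patchmgr works internally. The function and both callbacks are hypothetical placeholders for the redundancy check and for the offline, patch, and resync steps on a single cell.

```python
def rolling_patch(cells, redundancy_ok, patch_cell):
    """Patch cells one at a time, stopping if redundancy would be at risk.

    redundancy_ok(cell) -> bool : is it safe to take this cell offline now?
    patch_cell(cell)            : offline, patch, bring back online, resync.
    """
    patched = []
    for cell in cells:
        if not redundancy_ok(cell):
            raise RuntimeError(f"refusing to take {cell} offline: redundancy at risk")
        patch_cell(cell)
        patched.append(cell)
    return patched
```

Because only one cell is ever offline and the check runs before each step, the remaining cells keep serving IO for the whole run, which matches the demo moving from cell1 to cell2 to cell3.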

Friday 6:00 pm

Exadata finishes cell1 and moves on to cell2.

Customer SLA still looking good through the cell patching transition.

You heard a song on the way over to happy hour you have to share with your friend before heading in. While sharing, you notice the email alert notification on your phone that patching completed on cell1 and has moved on to cell2. Nice.


Friday 9:00 pm

Exadata finished patching cell2 and cell3 with alerts sent out accordingly. Your quarter rack cell patching is complete.

Customers continue to hit the system throughout the cell patching, even late on a Friday, and SLA stays in good shape throughout.

It's been a long day so you head to bed early.
