Oracle Exadata: Architecture and Internals Technical Deep Dive

Copyright © 2018, Oracle and/or its affiliates. All rights reserved.

Oracle Exadata: Architecture and Internals Technical Deep Dive - TRN4113

Tuesday 23rd, 12:30 PM, Moscone West - Room 3008

Kothanda “Kodi” Umamageswaran, Vice President, Exadata Development

Gurmeet Goindi, Master Product Manager, Exadata

Confidential – Oracle Internal/Restricted/Highly Restricted


Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Exadata Vision

Dramatically Better Platform for All Database Workloads

• Ideal Database Hardware - Scale-out, database optimized compute, networking, and storage for fastest performance and lowest costs

• Smart System Software – specialized algorithms vastly improve all aspects of database processing: OLTP, Analytics, Consolidation

• Automated Management – Automation and optimization of configuration, updates, performance, and management culminating in Fully Autonomous Database and Infrastructure


Identical On-Premises and in Cloud


Exadata: #1 Database Platform for 10 years


10 Years of Innovation: Generations Ahead of All Competition

• First and only smart scale-out storage

• First and only RDMA and InfiniBand for converged networking

• First and only OLTP Machine

• First ever enterprise platform to use NVMe Flash

• First and only In-Memory Performance in Storage

• First and only Mission Critical Cloud at Customer Platform

• Only Enterprise Storage to make the leap to Public Cloud

• Only Database Machine to make the leap to Public Cloud

• And now: Only Database Machine to run Autonomous Database


Proven at Thousands of Ultra-Critical Deployments since 2008

• Best for all Workloads
• Petabyte Warehouses
• Online Financial Trading
• Business Applications
  – SAP, Oracle, Siebel, PSFT, …
• Massive DB Consolidation


4 of the 5 Largest Enterprises in the World Run Exadata

4 OF THE TOP 5 BANKS, TELECOMS, RETAILERS RUN EXADATA


Exadata Powering Oracle Cloud


• Oracle is more committed to Exadata than ever before
• Oracle SaaS apps run exclusively on Exadata
• Hundreds of Exadata systems deployed globally in our public cloud
• Autonomous Database runs exclusively on Exadata


Exadata Scale-Out State-of-the-Art Hardware


Exadata Database Machine X7-2


State-of-the-Art Hardware

• Scale-Out Database Servers
  – 2-socket Xeon processors
  – 48 cores per server
  – 384 GB - 1.5 TB DRAM
• Fastest Internal Fabric
  – 40 Gb/s InfiniBand internal network
  – 25/10/1 GigE external network
• Scale-Out Intelligent Storage
  – High-Capacity Storage Server: 120 TB disk capacity (10 TB helium disks), 25.6 TB PCI NVMe Flash, 20 cores for SQL offload
  – Extreme Flash Storage Server: 51.2 TB PCI NVMe Flash, 20 cores for SQL offload


Exadata Database Machine X7-8


• Large SMP Processor Model
  – Big data warehouses
  – Massive database consolidation
  – In-Memory databases
• Scale-Out Database Servers
  – 8-socket x86 processors
  – 192 cores
  – 3-6 TB DRAM
• Fastest Internal Fabric
  – 40 Gb/s InfiniBand
  – 25/10/1 GigE external connectivity
• Scale-Out Intelligent Storage
  – High-Capacity Storage Server: 120 TB disk capacity (10 TB helium disks), 25.6 TB PCI NVMe Flash, 20 cores for SQL offload
  – Extreme Flash Storage Server: 51.2 TB PCI NVMe Flash, 20 cores for SQL offload

Same Networking, Storage and Software as X7-2


Configure Servers to Match Your Workload


Elastic Hardware Configurations

Capacity-on-Demand Software Licensing
• Enable compute cores as needed, subject to minimums
• License Oracle software for enabled cores only
* 14 cores minimum per DB server (max 48 cores)
* 8 cores minimum per Eighth Rack DB server (max 24 cores)
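The core minimums and maximums above can be expressed as a simple range check. The following is a minimal sketch, assuming the footnoted limits apply per database server; the names `CORE_LIMITS` and `validate_enabled_cores` are hypothetical and not part of any Oracle tool.

```python
# Core limits per database server, taken from the footnotes above.
# The dictionary keys and the function name are illustrative only.
CORE_LIMITS = {
    "standard": (14, 48),     # (minimum, maximum) enabled cores
    "eighth_rack": (8, 24),
}


def validate_enabled_cores(server_type: str, enabled: int) -> int:
    """Return the number of cores to license, or raise if out of range."""
    minimum, maximum = CORE_LIMITS[server_type]
    if not minimum <= enabled <= maximum:
        raise ValueError(
            f"{server_type}: {enabled} cores is outside {minimum}-{maximum}"
        )
    # Only enabled cores need to be licensed.
    return enabled
```

For example, `validate_enabled_cores("standard", 20)` returns 20, while asking for 8 cores on a standard server raises an error because it is below the 14-core minimum.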

X7-2: Eighth Rack, Quarter Rack, Full Rack
• Add servers as needed*
• Add racks to continue scaling*

X7-8: Elastic Configuration

* Expand older racks with new servers, and multi-rack old and new racks together


Exadata X7-2 and X7-8 Performance Improvements vs X6

• 350 GB/sec IO Throughput
  – 17% more (vs Exadata X6)
• 5.97 Million OLTP Read IOPS
  – 50% more IOPS (vs Exadata X6)
  – 3.5 Million IOPS at under 250 µsec latency
• 40% CPU improvement for Analytics
• 20% CPU improvement for OLTP
  – 40% on X7-8
• Dramatically faster than leading all-flash arrays

Each rack has up to:
• 1.7 PB Disk
• 720 TB NVMe Flash


Hot Swappable Hardware for Online Maintenance

• Flash
• Disks
• M.2 boot drive
• Power supplies
• Fans
• InfiniBand switch
• Not Hot Swappable:
  – PCI cards (network, IB, HBA), CPU, Memory (bad sectors will be disabled)


Fault Tolerant Availability

“Exadata and SuperCluster both achieve AL4 fault tolerance in a Maximum Availability Architecture* configuration”

*Gold or Platinum reference architecture

FIVE NINES (5X9): 99.999%

A New Gold Standard

Only other AL4 Systems
• IBM - z Systems
• HPE - Integrity NonStop & Superdome
• Fujitsu – GS & BS2000
• NEC – FT Server/320 Series
• Stratus ftServer & V Series
• Unisys – Dorado
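To put the five-nines figure in concrete terms, the arithmetic below converts an availability percentage into a yearly downtime budget. It is a small sketch that assumes a 365-day year; the function name is illustrative. At 99.999% the budget works out to about 5.26 minutes per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def max_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime allowed at the given availability level."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for level in (99.9, 99.99, 99.999):
        print(level, round(max_downtime_minutes(level), 2))
```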


Exadata Smart Software

Continue Tradition of Adding Major Differentiators

Introducing Exadata 19.1.0.0.0


Exadata 18.1.0.0.0 and 19.1.0.0.0 Highlights

• Over 40 unique software features and enhancements in a year
  – Better analytics, better transaction processing, better consolidation, more secure, faster and more robust upgrades, and easier to manage
• Complete investment protection
  – All new software features work on all supported Exadata hardware generations
• Full storage offload functionality for Database 19.1
  – Oracle Database 11.2, 12.1, 12.2 and 18.1 coexist alongside 19.1 on the same system
• Updated Oracle Linux kernel and Oracle VM improve robustness and scalability
  – Oracle Linux 7.5 with UEK4, Oracle Virtual Machine 3.4.4


Oracle Linux 7: Seamless Upgrades from Oracle Linux 6

• Strategy for Linux Upgrades
  – Upgrade the kernel faster to use the new features in Linux
  – Upgrade the distribution slowly to keep compatibility with as many applications as possible
• Oracle Database 19c works only on Linux 7 and won’t work on Linux 6
• Bare Metal and guests (domU)
  – Oracle Linux 7.5 and UEK4 (4.1.12-124 series)
• Hypervisor (dom0)
  – Oracle Linux 6.9 and Xen 3.4.4 errata on dom0 (no change here)
  – Dom0 Linux kernel updated to latest UEK4 (4.1.12-124 series but uses the Oracle Linux 6 kernel)
• No reimage required, just rolling upgrade to Oracle Linux 7


Oracle Database Release Minimum Required Version
• Oracle Database and Grid Infrastructure 19c works only on Linux 7 and won’t work on Linux 6
• Oracle Database and Grid Infrastructure 18c needs 18.3 or newer
• Oracle Database and Grid Infrastructure 12c Release 2 (12.2.0.1.0) needs 12.2.0.1.180717 or newer
• Oracle Database and Grid Infrastructure 12c Release 1 (12.1.0.2.0) needs 12.1.0.2.180831 or newer
• Oracle Database 11g Release 2 (11.2.0.4.0) needs 11.2.0.4.180717 or newer
  – Requires Grid Infrastructure release 12.1.0.2.180831 or newer
• Oracle Database 11g Release 2 (11.2.0.3.0) needs 11.2.0.3.28
  – Requires Grid Infrastructure release 12.1.0.2.180831 or newer


Support all the Oracle Database versions on all hardware platforms


Smart Analytics and OLTP


Exadata: Revolutionizing Analytics for 10 Years


Timeline: 2008 to 2019
• Smart Scan SQL Offload
• Massively Parallel Storage Architecture
• RDMA over InfiniBand
• Database Aware Flash Cache
• Storage Indexes
• Hybrid Columnar Compression
• Data Mining Offload
• Offload Decrypt on Scans
• Offload index fast full scans
• JSON and XML Offload
• Big Data SQL
• Storage Offload of CLOBs & LOBS
• Flash IO Resource Management
• Columnar Flash Cache
• NVMe – IM Performance in Flash
• In Memory Fault Tolerant
• In Memory Formats in Flash
• In Memory Active Data Guard
• Automatic In-Memory
• In-Memory for external tables
• In-Memory Optimized Arithmetic

Roadmap: Persistent Memory


• Smart column level checksum with In-Memory Columnar Cache
  – Selective checksum computation
    • select temperature from weather where city = 'NASHUA' and weather = 'SNOW' and weather_date between '01-JUN-18' and '30-DEC-18';
    • Checksums are computed only on columns temperature, city, weather, weather_date even though the table has lots of other columns
  – Just in time checksum computation
    • select temperature from weather where city = 'REDWOOD CITY' and weather = 'SNOW' and weather_date between '01-JUN-18' and '30-DEC-18';
    • Since city = 'REDWOOD CITY' and weather = 'SNOW' returns no results, the checksum for weather_date and temperature is skipped
  – Storage Server CPU reduction of up to 28%
  – Completely automatic and transparent
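The two ideas above (verify only the referenced columns, and stop verifying once no rows can match) can be illustrated with a toy column store. This is a sketch under stated assumptions, not Exadata's implementation: the `checksum` and `scan` functions, the CRC32 stand-in, the sample rows, and the ISO date strings (used so that plain string comparison works) are all hypothetical.

```python
import zlib


def checksum(values):
    """Stand-in checksum over one column's values (not Exadata's algorithm)."""
    return zlib.crc32(repr(values).encode())


def scan(columns, stored, predicates, projected):
    """Filter rows column by column, verifying checksums lazily.

    columns:    dict of column name -> list of values
    stored:     dict of column name -> checksum recorded at write time
    predicates: ordered list of (column name, test function)
    projected:  column names the query returns
    Returns (rows, verified), where verified lists the columns checked.
    """
    verified = []

    def verify(name):
        if name in verified:
            return
        if checksum(columns[name]) != stored[name]:
            raise ValueError(f"checksum mismatch in column {name}")
        verified.append(name)

    row_count = len(next(iter(columns.values())))
    matching = list(range(row_count))
    for name, test in predicates:
        verify(name)  # selective: only columns the query references
        matching = [i for i in matching if test(columns[name][i])]
        if not matching:  # just in time: no rows left, skip the rest
            return [], verified
    for name in projected:
        verify(name)
    rows = [tuple(columns[name][i] for name in projected) for i in matching]
    return rows, verified


# Made-up sample data; "humidity" is never referenced, so never verified.
weather = {
    "city": ["NASHUA", "NASHUA", "REDWOOD CITY"],
    "weather": ["SNOW", "RAIN", "SUN"],
    "weather_date": ["2018-12-01", "2018-07-04", "2018-12-02"],
    "temperature": [28, 75, 61],
    "humidity": [80, 60, 55],
}
stored = {name: checksum(values) for name, values in weather.items()}


def in_range(day):
    return "2018-06-01" <= day <= "2018-12-30"


rows, verified = scan(
    weather,
    stored,
    [("city", lambda v: v == "NASHUA"),
     ("weather", lambda v: v == "SNOW"),
     ("weather_date", in_range)],
    ["temperature"],
)
```

With the sample data, the NASHUA query verifies four columns and leaves `humidity` untouched. Swapping the city filter to 'REDWOOD CITY' stops after `city` and `weather`, so `weather_date` and `temperature` are never checksummed.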


In-Memory Columnar scans

In-Flash Columnar scans

Smarter Analytics


Exadata: Revolutionizing OLTP for 10 Years


Timeline: 2009 to 2019
• Write-back Flash Cache
• Database Aware PCI Flash
• Massively Parallel Storage Architecture
• Exadata Smart Flash Logging
• Smart File Initialization
• Network Resource Management
• IO Prioritization to ensure QoS
• Sub-second failover of IO
• Active AWR for end to end monitoring
• Exafusion: Direct-to-Wire Protocol
• EXAchk full-stack validation
• PCI NVMe Flash for lowest latency
• Smart Fusion Block Transfer
• Cell-to-cell rebalance preserving flash cache
• Full-stack security scanning
• Hot Plug NVMe Flash
• In-Memory OLTP Acceleration
• Instant detection of server failure
• Memory optimized OLTP
• RDMA Support for UNDO
• Exadata Commit Cache

Roadmap: Persistent Memory Accelerator


Smarter OLTP caching
• Flash disk failure and flash disk replacement used to create performance glitches
• When the buffer cache evicts a buffer globally, the block is added to the flash cache on the secondary mirror to handle failures transparently
• When cells fail or flash devices fail, mirrors will be warm
• Reads fall back to the primary only after the primary cache is warmed up, not as soon as the cell service is started
• Completely automatic and transparent!
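The behavior described above can be pictured with a small model. This is only a sketch: the `Cell` class, its flags, and both functions are hypothetical names, and it assumes the primary caches the block as before while the secondary receives a copy on eviction.

```python
class Cell:
    """Toy storage cell with a flash cache and two health flags."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.cache_warm = True
        self.flash_cache = set()


def on_buffer_cache_eviction(block, primary, secondary):
    """Populate both mirrors when the database evicts a block.

    Assumption: the primary caches the block as before; the copy on the
    secondary is what keeps the mirror warm for a later failure.
    """
    primary.flash_cache.add(block)
    secondary.flash_cache.add(block)


def read_target(primary, secondary):
    """Pick the cell to read from.

    The primary is used only when it is online and its cache is warm,
    not as soon as its cell service has restarted.
    """
    if primary.online and primary.cache_warm:
        return primary
    return secondary
```

In this model a restarted primary with a cold cache keeps reads on the secondary, whose flash cache already holds the evicted blocks, until the primary is marked warm again.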

Before: Performance dip at flash failure and flash replacement

After: No more dips. Incredible!

Needs Exadata 19.1 and Database 19c


Cloud and Security


Automated Cloud Scale Performance Alerting
• Management Server for managing CPU, Memory, IO, Network, File System
• Management Server will run on the guest in addition to the existing one on dom0 and Storage Server
• Provides brilliant insights into the running of the system
• No more memory hogs that eat up your machine, no more unknown runaway processes, no more database instances without hugepages, and many more

• Utilizes new machine learning algorithms and 10 years of feedback from running mission critical customers


Hardened Security: Advanced Intrusion Detection
• Enables Advanced Intrusion Detection Environment (AIDE)
  – Critical files checked every night via sha256, modified time, and more
  – Critical directories monitored for items added/removed
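The kind of check described above can be sketched in a few lines. This is not AIDE itself, which has its own database and configuration; it only illustrates comparing sha256 digests and modification times against a baseline and diffing directory listings. All function names are illustrative.

```python
import hashlib
import os


def snapshot(paths):
    """Record (sha256 digest, modification time) for each file."""
    result = {}
    for path in paths:
        with open(path, "rb") as handle:
            digest = hashlib.sha256(handle.read()).hexdigest()
        result[path] = (digest, os.stat(path).st_mtime_ns)
    return result


def changed_files(baseline, current):
    """Files whose digest or modification time differs from the baseline."""
    return sorted(p for p in baseline if current.get(p) != baseline[p])


def directory_diff(before, after):
    """Compare two directory listings; returns (added, removed)."""
    before, after = set(before), set(after)
    return sorted(after - before), sorted(before - after)
```

A nightly job in this style would take a fresh `snapshot`, report `changed_files` against the stored baseline, and run `directory_diff` on `os.listdir` output for each monitored directory.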


Hardened Security: Secure Erase
• Drop celldisk all and drop cell, even with the 1-pass, 3-pass, and 7-pass options, implicitly perform a secure erase when possible
  – Works for both hard disks and flash disks
• Implicitly perform a secure erase as part of fresh imaging
  – Only for hardware that supports secure erase


Hardened Security
• HTTPS access control for MS
  – Allow selective IP addresses/subnets to connect to a cell
  – Can be “none” to disable all access via HTTPS
    • Will shut down the listening port for port-scanner compliance reports
  – Example: alter cell httpsAccess='10.239.152.0/24'
• Implement principle of least privilege to improve security
  – Run Offload Servers as a lower privileged user and not root
  – Reduce privilege of most exawatcher collectors, like top, that can be collected as a regular user and not root
• Increased security for storage server processes
  – Enables seccomp() system call filtering for cell server and cell offload server processes
  – If someone tries to exploit a bug, the process crashes instead of proceeding with a rogue system call
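The effect of an httpsAccess setting like the example above can be modeled with Python's standard `ipaddress` module. This is a sketch of the matching logic only, not how the Management Server implements it; the function name is hypothetical, and accepting a comma-separated list of entries is an assumption.

```python
import ipaddress


def is_client_allowed(https_access: str, client_ip: str) -> bool:
    """Decide whether a client may connect over HTTPS.

    https_access: one or more IP addresses or subnets separated by
    commas (the list syntax is an assumption), or "none" to refuse all.
    """
    if https_access.strip().lower() == "none":
        return False
    client = ipaddress.ip_address(client_ip)
    for entry in https_access.split(","):
        # strict=False lets a host address with a prefix length pass.
        network = ipaddress.ip_network(entry.strip(), strict=False)
        if client in network:
            return True
    return False
```

With the subnet from the example, `is_client_allowed("10.239.152.0/24", "10.239.152.17")` is true, while an address in 10.239.153.0/24 is refused.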


Support duplicate db_unique_name across clusters

(This support for the same db_unique_name is new.)
• Support duplicate db_unique_name across ASM clusters with ASM scoped security
• Consolidate databases across VM clusters by different departments that can potentially use the same db_unique_name
• IO Resource Management plans, Storage Index, Flash Cache etc. are now aware of multiple databases with the same name scoped by separate ASM clusters
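One way to picture the scoping described above is to key per-database state by the pair (ASM cluster, db_unique_name) rather than by name alone. The class below is a hypothetical illustration of that idea, not the storage server's data structure; the share values are made up.

```python
class ScopedDatabaseRegistry:
    """Per-database settings keyed by (ASM cluster, db_unique_name)."""

    def __init__(self):
        self._shares = {}

    def set_io_share(self, asm_cluster, db_unique_name, share):
        # The cluster name is part of the key, so two clusters can each
        # register a database with the same db_unique_name.
        self._shares[(asm_cluster, db_unique_name)] = share

    def io_share(self, asm_cluster, db_unique_name):
        return self._shares[(asm_cluster, db_unique_name)]


registry = ScopedDatabaseRegistry()
registry.set_io_share("cluster1", "sales", 8)
registry.set_io_share("cluster2", "sales", 2)
```

Both "sales" databases keep their own setting because the lookups `("cluster1", "sales")` and `("cluster2", "sales")` are distinct keys.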

Diagram: Cluster 1 and Cluster 2 each run a database named “sales” on Shared Storage.


Cloud
• Customize syslog format for log analytics
  – Better integration with Security Information and Event Management systems
• Support Admin network (eth0) and ILOM on separate networks
  – Needs OEDACli or webUI
• Allow users to change their own exacli password and expire passwords
• Connect to ASR manager through SSL
  – Enables a more secure Exadata Cloud at Customer deployment


Exadata Automates Software Updates at Cloud Scale

• New automation updates all Exadata infrastructure software on full fleet
  – 600+ components per full rack
• Updates multiple systems in parallel
• Runs automatically on schedule
  – Online (rolling) or Offline (all in parallel)
• Pull Model simplifies orchestration
  – Just point Exadata system at software repository and give update window
  – Dependencies between systems automatically handled
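The difference between the online (rolling) and offline (all in parallel) modes can be sketched as a scheduling choice. The function below is hypothetical and assumes that rolling means one server at a time; the real update tool may batch differently.

```python
def plan_update_waves(servers, mode):
    """Group servers into waves; a wave is updated together.

    mode "rolling": one server per wave, so the rest stay in service
                    (one at a time is an assumption of this sketch).
    mode "offline": a single wave containing every server.
    """
    if mode == "rolling":
        return [[server] for server in servers]
    if mode == "offline":
        return [list(servers)]
    raise ValueError(f"unknown mode: {mode}")
```

For three cells, rolling yields three waves of one server each, while offline yields one wave of three.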

Diagram: FLEET UPDATES driven by the Update Tool, PARALLEL or ROLLING.


The Next Big Thing in Performance:

In-Memory Performance in Storage


OLTP: Exadata Brings In-Memory OLTP to Storage

• Exadata Storage Servers add a memory cache in front of Flash memory
  – Similar to current Flash cache in front of disk
• Cache is additive with cache at Database Server
  – Only possible because of tight integration with DB
• 2.5x lower latency for OLTP IO
  – 100 µsec
• Up to 21 TB of DRAM for OLTP acceleration with memory upgrade kit
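The latency figures above imply a simple blended-latency estimate. The sketch below takes the 100 µsec figure from the slide, derives the flash latency from the 2.5x ratio, and assumes every DRAM miss is served from flash (disk reads are ignored). The hit ratios are hypothetical inputs, not measured values.

```python
DRAM_LATENCY_US = 100                      # from the slide
FLASH_LATENCY_US = DRAM_LATENCY_US * 2.5   # implied by "2.5x lower latency"


def average_read_latency_us(dram_hit_ratio: float) -> float:
    """Blended OLTP read latency for a given DRAM cache hit ratio.

    Assumes every DRAM miss is a flash hit; disk reads are not modeled.
    """
    if not 0.0 <= dram_hit_ratio <= 1.0:
        raise ValueError("hit ratio must be between 0 and 1")
    miss_ratio = 1.0 - dram_hit_ratio
    return dram_hit_ratio * DRAM_LATENCY_US + miss_ratio * FLASH_LATENCY_US
```

Under these assumptions, a 50% DRAM hit ratio gives an average of 175 µsec, halfway between the two tiers.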

Diagram: Storage Server with DRAM Cache (HOT) in front of Flash Cache in front of Disks.

DRAM Cache in storage serves OLTP blocks 2.5x faster than Flash Cache


The Community Effect

Exadata Takes Standardization to the Next Level…

§ Standard Technologies
§ Standard Integration and Tuning
§ Standard Support
§ Same as Oracle Development
§ Same as Oracle Public Cloud
§ Same as 1000s of leading Banks, Telecoms, Cloud Providers, etc.

Thousands of Identical Configurations Spanning Every Workload and Industry

More Performant, Reliable, Secure, and Supportable than Company Standard

Global Standard is Far More Effective than Company Standard


Conclusion: Exadata Advantages Increase Every Year


• Smart Scan
• InfiniBand Scale-Out
• Database Aware Flash Cache
• Storage Indexes
• Columnar Compression
• IO Priorities
• Data Mining Offload
• Offload Decrypt on Scans
• In-Memory Fault Tolerance
• Direct-to-wire Protocol
• JSON and XML offload
• Instant failure detection
• Network Resource Management
• Multitenant Aware Resource Mgmt
• Prioritized File Recovery

Smart Software

Smart Hardware
• Unified InfiniBand
• Scale-Out Servers
• Scale-Out Storage
• DB Processors in Storage
• PCIe NVMe Flash
• Tiered Disk/Flash
• Software-in-Silicon
• 3D V-NAND Flash
• In-Memory Columnar in Flash
• Exadata Cloud Service
• Smart Fusion Block Transfer
• Exadata Cloud at Customer
• In-Memory OLTP Acceleration

Dramatically Better Performance and Cost
• Hot Swappable Flash
• 25 GigE Client Network
• Autonomous Database