Big Data Infrastructure – The Oracle Way

Page 1

BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH

Big Data Infrastructure – The Oracle Way

Daniel Steiger

Page 2

About ...

Big Data Infrastructure - The Oracle Way, 17.11.2016

Daniel Steiger

Principal Consultant @ Trivadis

Oracle DBA and IT Infrastructure Architect

Program Manager IT Infrastructure Optimization

Co-Author "Der Oracle DBA", Hanser Verlag

Speaker and Teacher

Page 3

Our company.

Trivadis is a market leader in IT consulting, system integration, solution engineering and the provision of IT services focusing on ... and ... technologies in Switzerland, Germany, Austria and Denmark. We offer our services in the following strategic business fields:

Trivadis Services takes over the interacting operation of your IT systems.

OPERATION

Page 4

Locations: BASEL, BERN, BRUGG, COPENHAGEN, DÜSSELDORF, FRANKFURT, FREIBURG, GENEVA, HAMBURG, LAUSANNE, MUNICH, STUTTGART, VIENNA, ZURICH

With over 600 specialists and IT experts in your region.

14 Trivadis branches and more than 600 employees

200 Service Level Agreements

Over 4,000 training participants

Research and development budget: CHF 5.0 million

Financially self-supporting and sustainably profitable

Experience from more than 1,900 projects per year at over 800 customers

Page 5

Agenda

1. Introduction

2. Oracle Big Data Infrastructure

3. Oracle Big Data Software

4. BDA Setup

5. Use Case

6. Summary

Page 6

Introduction

Page 7

2006: Hadoop is born from Apache Nutch

2008: Foundation of Cloudera

2011: Oracle Rolls Out 'Big Data Appliance'

2012: Oracle Makes Big Data Appliance Move With Cloudera

Page 8

About the Current State of Big Data Technology

"Cloudera is eight; Apache Hadoop is ten. Big data has gone from zero to how-did-that-happen huge. The bestiary is bigger than ever, too: new projects like Apache Kudu, Apache Impala (incubating), Apache Kafka and Apache Spark define the future of big data and analytics, extending the core Hadoop platform to handle streaming, real-time and advanced analytics."

Mike Olson, Cloudera CSO and Co-Founder, Aug. 25, 2016

Page 9

Data Lakes and Reservoirs

Since the data doesn’t just sit there until it evaporates but eventually flows to various applications, we should think of this as a “data reservoir” rather than a “data lake.”

Source: http://blogs.informatica.com

Page 10

Data Reservoir Functions

Source: Architecting Data Lakes, 2016 O’Reilly Media, Inc.

Ingestion Storage/Retention Processing Access

Page 11

Oracle Big Data Management System Architecture

Schema-on-read: raw data, complex processing, huge volume at low cost

Schema-on-write: cleansed data, complex integration, large volume at moderate cost

Page 12

Oracle's Big Data Solution

A complete and optimized solution for big data

Tight integration with Exadata, Exalogic, Exalytics and SPARC Supercluster using Infiniband network

Single-vendor support for both hardware and software

Page 13

Oracle Big Data Infrastructure

Page 14

The Big Data Appliance X6-2 Hardware

Per Node (X6-2):

2 x 22-Core (2.2GHz) Intel® Xeon® E5-2699 v4

8 x 32GB DDR4-2400 Memory (max. 768GB)

12 x 8TB 7,200 RPM High Capacity SAS Drives

2 x QDR 40Gb/sec InfiniBand Ports

4 x 10 Gb Ethernet Ports, 1 x ILOM Ethernet Port

RAM to CPU Ratio:

ODA X6-2M: 38 GB per Core

MiniCluster S7-2: 32 GB per Core

BDA: 17.5 GB per Core*

Starter Rack: 6 x nodes

Full Rack: 18 x nodes

Up to 18 racks

* Cloudera recommendation for "Compute Intensive Workloads": 16 GB per core
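The per-node figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers from this slide; the reading that the 17.5 GB per core figure refers to the maximum memory configuration (768 GB) rather than the base configuration is an inference from those numbers, not something the slide states. The function names are illustrative.

```python
# Figures copied from the slide for one BDA X6-2 node.
CORES_PER_NODE = 2 * 22   # 2 x 22-core Xeon E5-2699 v4
BASE_RAM_GB = 8 * 32      # 8 x 32 GB DDR4-2400
MAX_RAM_GB = 768          # maximum memory per node
RAW_DISK_TB = 12 * 8      # 12 x 8 TB SAS drives


def ram_per_core(ram_gb: float, cores: int) -> float:
    """Return GB of RAM available per CPU core."""
    return ram_gb / cores


def raw_capacity_tb(nodes: int) -> int:
    """Return raw disk capacity in TB (before formatting and HDFS replication)."""
    return nodes * RAW_DISK_TB


if __name__ == "__main__":
    print(f"base config: {ram_per_core(BASE_RAM_GB, CORES_PER_NODE):.1f} GB per core")
    print(f"max memory:  {ram_per_core(MAX_RAM_GB, CORES_PER_NODE):.1f} GB per core")
    print(f"starter rack (6 nodes): {raw_capacity_tb(6)} TB raw")
    print(f"full rack (18 nodes):   {raw_capacity_tb(18)} TB raw")
```

With the base memory of 256 GB the ratio is well below Cloudera's 16 GB per core recommendation; only the maximum configuration exceeds it.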

Page 15

Big Data Appliance Network Connectivity

Source: Oracle Big Data Appliance: Datacenter Network Integration, Oracle White Paper, 2012

Page 16

Oracle Big Data Appliance Software Stack (Release 4.6.0)

Oracle Linux, Oracle Java JDK

MySQL Database Enterprise Server - Advanced Edition

Oracle SQL Connector for HDFS

Oracle XQuery for Hadoop

Oracle R Advanced Analytics for Hadoop

Oracle NoSQL Database (key-value) Community Edition (CE)

Enterprise Manager Plug-In

Cloudera Enterprise Data Hub Edition

– Apache Hadoop (CDH)

– Cloudera Impala

– Cloudera Search (Apache Solr)

– Apache HBase and Apache Accumulo

– Apache Spark

– Apache Kafka

– Cloudera Manager

– Cloudera Navigator

– Cloudera Backup and Disaster Recovery (BDR)

Page 17

Oracle Big Data Connectors

Facilitate access to data stored in an Apache Hadoop cluster.

Available on either Oracle Big Data Appliance or a Hadoop cluster running on commodity hardware

– Oracle SQL Connector for HDFS

– Oracle Loader for Hadoop

– Oracle XQuery for Hadoop

– Oracle R Advanced Analytics for Hadoop

– Oracle Data Integrator

– Oracle DataSource for Hadoop (OD4H)

Note: The connectors are licensed separately from Oracle Big Data Appliance

Source: Oracle®

Page 18

Security for Data at Rest and Data in Motion

Authentication through Kerberos

Authorization through Apache Sentry

Auditing through Oracle Audit Vault

Encryption for Data-at-Rest

Network Encryption

Big Data SQL adds

– Advanced Security on Hadoop & NoSQL: Masking and Redaction

– Virtual Private Database: Fine-grain Access Control

Page 19

Administration with EM Cloud Control

Plug-In for EM Cloud Control 12.1.0.4 and later

– Discover the components of a Big Data Appliance Network as managed targets

– Manage the HW and SW components

– Collect metrics to analyze the performance of the network and each BDS component

– Trigger alerts based on availability and system health

– Respond to warnings and incidents

Always (!) check My Oracle Support Doc ID 1570523.1, "Enterprise Manager for Oracle Big Data Appliance Frequently Asked Questions"

Page 20

Oracle Big Data Software

Page 21

Oracle Big Data Software

Oracle Big Data SQL

Oracle Big Data Discovery

Oracle Data Integrator for Big Data

Oracle GoldenGate for Big Data

Page 22

Oracle Big Data SQL

Query Data in RDBMS, Hadoop and NoSQL

Same query - but there are intelligent optimizations that push the queries down to the source

Tables in Hadoop or NoSQL databases are defined as external tables in Oracle (leveraging Hive metastore to determine both parallelism and read semantics)

Applying query optimizations to the data (Storage Indexes, Local filtering and Caching)

Oracle DataSource for Hadoop (OD4H)

Page 23

Oracle Big Data SQL (cont.)

Oracle Big Data SQL extends SmartScan capabilities (such as filter-predicate off-loads) to Oracle external tables with the installation of the Big Data SQL processing agent on the DataNodes of the Hadoop cluster. This technology enables the Hadoop cluster to discard a huge portion of irrelevant data – up to 99 percent of the total – and return much smaller result sets to the Oracle Database server.

Oracle Big Data SQL 3.0 can connect Oracle Database to the Hadoop environment on Oracle Big Data Appliance, other systems based on CDH (Cloudera's Distribution including Apache Hadoop), HDP (Hortonworks Data Platform), and potentially other non-CDH Hadoop systems
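The effect of off-loading a filter predicate to the DataNodes can be illustrated with a toy model. This is only a sketch of the general idea of filtering and projecting where the data lives; it is not how Oracle Big Data SQL is implemented, and every name in it (`make_rows`, `scan_on_data_node`, the columns) is made up for the illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def make_rows(count: int) -> List[Row]:
    """Generate sample rows; one row in a hundred is from 'CH'."""
    return [
        {"id": i, "country": "CH" if i % 100 == 0 else "DE", "amount": i * 1.5}
        for i in range(count)
    ]


def scan_on_data_node(
    rows: List[Row], predicate: Callable[[Row], bool], columns: List[str]
) -> List[Row]:
    """Apply the filter and the column projection next to the data,
    so only matching rows and requested columns leave the node."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]


if __name__ == "__main__":
    stored = make_rows(10_000)
    shipped = scan_on_data_node(stored, lambda r: r["country"] == "CH", ["id", "amount"])
    discarded = 1 - len(shipped) / len(stored)
    print(f"rows stored on the node: {len(stored)}")
    print(f"rows sent to the database: {len(shipped)}")
    print(f"discarded at the source: {discarded:.0%}")
```

In this made-up data set the predicate matches one row in a hundred, so 99 percent of the rows never travel to the database server.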

Page 24

Oracle Big Data Discovery

The Visual Face of Big Data

Uses the power of Apache Spark to process massive amounts of information

Uses Oracle Big Data SQL to query the data in HDFS without moving it at all

Page 25

Oracle Data Integrator (ODI) for Big Data

ODI for Big Data is used to transform and enrich data within the big data reservoir

ODI for Big Data generates native code that is then run on the underlying Hadoop platform without requiring any additional agents

Enables users to build business and data mappings without having to learn HiveQL, Pig Latin and MapReduce

ODI separates the design interface used to build the logic from the physical implementation layer that runs the code

Page 26

Oracle GoldenGate for Big Data

Data Delivery to Big Data Targets

– Less invasive compared to ETL processes

– Real-time data for streaming analytics

Release 12.2 (Dec. 2015)

Native Java Replication

Pluggable Formatting Architecture

– JSON, AVRO, XML, Delimited Text

Native Kerberos Support

Kafka Targets

Page 27

Big Data Appliance Setup

Page 28

Well, first you have to move the box ...

Source: kerryosborne

Safety and Compliance Guide

Site Checklist

Page 29

Setup

cd /opt/oracle/BDAMammoth
mammoth -s 1 cdh

Step 1 = PreinstallChecks
Step 2 = SetupPuppet
Step 3 = PatchFactoryImage
Step 4 = CopyLicenseFiles
Step 5 = CopySoftwareSource
Step 6 = CreateUsers
Step 7 = SetupMountPoints
Step 8 = SetupMySQL
Step 9 = InstallHadoop
Step 10 = StartHadoopServices
Step 11 = InstallBDASoftware
Step 12 = SetupKerberos
Step 13 = HDFSTransparentEncryption
Step 14 = SetupEMAgent
Step 15 = SetupASR
Step 16 = CleanupInstall
Step 17 = CleanupSSHroot (Optional)

Mammoth is the utility that deploys software on Oracle's Big Data Appliance
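When the steps are run one at a time, a small checklist helps to keep track of which step comes next. The sketch below only prints text and does not call Mammoth. The step names and the command form `mammoth -s <step> cdh` are copied from the slide; the slide does not explain the trailing `cdh` argument, so it is treated here as an opaque default. The helper names are hypothetical.

```python
# Step names exactly as listed on the slide.
MAMMOTH_STEPS = {
    1: "PreinstallChecks",
    2: "SetupPuppet",
    3: "PatchFactoryImage",
    4: "CopyLicenseFiles",
    5: "CopySoftwareSource",
    6: "CreateUsers",
    7: "SetupMountPoints",
    8: "SetupMySQL",
    9: "InstallHadoop",
    10: "StartHadoopServices",
    11: "InstallBDASoftware",
    12: "SetupKerberos",
    13: "HDFSTransparentEncryption",
    14: "SetupEMAgent",
    15: "SetupASR",
    16: "CleanupInstall",
    17: "CleanupSSHroot",  # marked as optional on the slide
}


def describe(step: int) -> str:
    """Return the step in the notation used on the slide."""
    return f"Step {step} = {MAMMOTH_STEPS[step]}"


def command_for(step: int, target: str = "cdh") -> str:
    """Build the command line in the form shown on the slide.
    The meaning of `target` is an assumption; the slide only shows 'cdh'."""
    if step not in MAMMOTH_STEPS:
        raise ValueError(f"unknown Mammoth step: {step}")
    return f"mammoth -s {step} {target}"


if __name__ == "__main__":
    for number in sorted(MAMMOTH_STEPS):
        print(f"{describe(number):<40} {command_for(number)}")
```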

Page 30

Install Big Data Discovery

At one node only

Takes a couple of minutes as RAID 6 is built locally

Some hints ...

– Cannot connect to MySQL database => edit temporary password file

– Needs an email address during the setup dialog

– Installation reports “finished successfully” ... but it did not

bdacli enable bdd

Page 31

Patching

Patching means: applying software that raises the release number of already installed software

– E.g. CDH 5.5.1 to CDH 5.5.2

Example: Re-Image to 4.2.0 with Patch 22118555 (3.6G)

– JSON specs must exist on the server to be re-imaged

– Re-imaging writes the image to the internal USB and boots from USB

BDA Configurator v4.4.0-1

BDA Patch 4.4.0

– P22537238_440_Linux-x86-64_1of3.zip Mammoth

– P22537238_440_Linux-x86-64_2of3.zip BDABaseImage-ol6-4.4.0_RELEASE.iso

– P22537238_440_Linux-x86-64_3of3.zip BDAExtras-ol6-4.4.0

Page 32

From our experience ...

A Big Data Appliance Admin needs a broad skill set

– Unix admin skills are mandatory (ssh, X-server, scp, networking, ...)

– Oracle Engineered System expertise helps a lot (Exadata, ODA, Infiniband, ...)

– Cloudera administration skills are useful

Setup and patching:

– Always check for known issues on My Oracle Support (see references for Doc IDs)

– Check logfiles after every step

Pay attention to the InfiniBand firmware release on the IB switches when connecting Exadata and BDA (both require exactly the same version)

Page 33

Use Case

Page 34

Use Case "Fraud Detection"

Company:

Business Case: Fraud Detection

Motivation Statement: "With the BDA we want to refine our fraud detection analyses by adding further dimensions. For example, the BDA also captures sports performance metrics such as the distance covered by each individual player, which can then be compared with that player's average distance covered. Blatantly atypical performance values can be an indication of agreements made in advance, which we then follow up on."

Reference: Oracle Open World 2016

Reference: Computer World
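The comparison described in the motivation statement, a player's current distance covered against that player's own average, can be sketched as a simple deviation check. The z-score rule, the threshold of three standard deviations and the sample values are assumptions chosen for illustration; the source does not say which method or data the customer actually uses.

```python
from statistics import mean, stdev
from typing import Sequence


def is_atypical(history_km: Sequence[float], current_km: float, threshold: float = 3.0) -> bool:
    """Flag a value that deviates from the player's own average by more than
    `threshold` standard deviations (an assumed rule, not the customer's)."""
    if len(history_km) < 2:
        return False  # not enough history to judge
    average = mean(history_km)
    spread = stdev(history_km)
    if spread == 0:
        return current_km != average
    return abs(current_km - average) / spread > threshold


if __name__ == "__main__":
    # Hypothetical distances in km from one player's previous matches.
    previous_matches = [10.2, 10.8, 11.0, 10.5, 10.9, 10.6]
    for distance in (10.4, 7.1):
        verdict = "follow up" if is_atypical(previous_matches, distance) else "typical"
        print(f"{distance:.1f} km: {verdict}")
```

A flagged value is only a hint that deserves a closer look, as the quote itself puts it.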

Page 35

Use Case Solution

Big Data Appliance as "Data Reservoir"

Key arguments from customer perspective

– "Exadata and the BDA in tandem give us integration advantages that we cannot achieve as easily with competing systems." *

– Fast start to "Big Data"

– Comprehensive software stack for data analytics

– Start small, grow on demand

– Ready for future (yet unknown) demands

*Reference: Computer World

Page 36

Summary

Page 37

Summary

The main technical advantage when deploying Big Data SQL on the Oracle Big Data Appliance is InfiniBand’s high bandwidth to other Oracle Engineered Systems

Other BDA exclusive features: Perfect Balance for reduce tasks

The Big Data Appliance provides a solid enterprise-class infrastructure (HW & SW)

Installation and patching procedures are not yet as mature on the BDA as on other engineered systems like Exadata

Leveraging the full potential of a BDA requires both Engineered System expertise and data analytics know-how

Page 38

Is the Big Data Appliance the Right Choice for You?

Yes, if ...

– you need a fast start to production-ready data analytics

– you already run other Oracle engineered systems with InfiniBand technology

– your use case involves data in RDBMS, Hadoop and NoSQL and you have high query performance demands

– you have an important business case with unpredictable growth :-)

– you would like to stay with Cloudera

Page 39

Questions and responses…

Daniel Steiger

Principal Consultant

[email protected]

Page 40

Trivadis @ DOAG 2016

Booth: 3rd Floor – next to the escalator

Know-how, T-Shirts, Contest and Trivadis Power to go

We look forward to your visit

Because with Trivadis you always win!

Page 41

Links & References

Page 42

Links and References (1)

An Enterprise Architect’s Guide to Big Data – Reference Architecture Overview

– http://www.oracle.com/technetwork/topics/entarch/oracle-wp-big-data-refarch-2019930.pdf

Oracle Big Data Management System – Statement of Direction

– http://www.oracle.com/ocom/groups/public/@otn/documents/webcontent/2516729.pdf

Oracle Big Data Appliance Documentation

– https://docs.oracle.com/bigdata/bda46/

Oracle Big Data Lite Virtual Machine

– http://www.oracle.com/technetwork/database/bigdata-appliance/oracle-bigdatalite-2104726.html#wp

Page 43

Links and References (2)

Information Center: Oracle Big Data Appliance (My Oracle Support Doc ID 1445762.2)

http://blog.cloudera.com/blog/2013/08/how-to-select-the-right-hardware-for-your-new-hadoop-cluster/

Oracle Big Data SQL: One Fast Query, All Your Data

– https://blogs.oracle.com/datawarehousing/entry/oracle_big_data_sql_one

Page 44

Links and References (3)

Owner's Guide, Release 4 (4.4), E65664-03, January 2016

Oracle Big Data Appliance Patch Set Master Note, Doc ID 1485745.1

Information Center: Install/Upgrade/Configure Oracle BDA, Doc ID 1445745.2

Oracle BDA Base Image Version 4.2.0 for New Installations on OL6, Doc ID 2077858.1 (Base for BMR to finally reach 4.4.0)

Oracle Big Data Appliance Installation Frequently Asked Questions, Doc ID 1518939.1

Upgrading CDH, Doc ID 2109175.1

How to Enable/Disable Oracle Big Data Discovery on Oracle Big Data Appliance V4.3/OL6 with bdacli, Doc ID 2083079.1 (is also (not) valid for 4.4)

"bdacli enable bdd" Fails with "ERROR: Error getting mysql database status" on BDA 4.4.0 / BDD 1.1, Doc ID 2109175.1

Page 45

Backup Slides

Page 46

Oracle Big Data SQL Licensing

All nodes within the Hadoop cluster that runs Oracle Big Data SQL must be licensed.

A separate license must be procured per disk per Hadoop cluster.

All disks within every node that is part of a cluster running Oracle Big Data SQL must be licensed.

Partial licensing within a node is not available. All nodes in the cluster are included.

Only the Hadoop cluster side (Oracle Big Data Appliance, or other) of an Oracle Big Data SQL installation is licensed and no additional license is required for the database server side.
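Because licensing is per disk and covers every disk in every node, the number of disks to license follows directly from the cluster size. The sketch below assumes 12 data disks per node, as on the BDA X6-2 hardware slide; it counts disks only and says nothing about prices. The function name is illustrative.

```python
DISKS_PER_BDA_NODE = 12  # from the BDA X6-2 hardware slide


def disks_to_license(nodes: int, disks_per_node: int = DISKS_PER_BDA_NODE) -> int:
    """Return the number of disks that need a Big Data SQL license.
    All disks in all nodes of the cluster count; partial licensing is not available."""
    if nodes < 1 or disks_per_node < 1:
        raise ValueError("nodes and disks_per_node must be positive")
    return nodes * disks_per_node


if __name__ == "__main__":
    print(f"starter rack (6 nodes): {disks_to_license(6)} disks")
    print(f"full rack (18 nodes):   {disks_to_license(18)} disks")
```

For a non-BDA Hadoop cluster, pass the actual disk count per node as `disks_per_node`.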

Page 47

BDA Price List

Page 48

Big Data in the Cloud – Offering & Pricing

Reference: https://cloud.oracle.com

Page 49

BDA Specific Software Features

Oracle NoSQL Database

– Oracle NoSQL Database is a distributed key-value database built on storage technology of Berkeley DB Java Edition.

– An intelligent driver on top of Berkeley DB keeps track of the underlying storage topology, shards the data and knows where data can be placed with the lowest latency.

Oracle R Support for Big Data

– The standard R distribution is installed on all nodes of Oracle Big Data Appliance

– Oracle R Connector for Hadoop provides R users with high-performance, native access to HDFS and the MapReduce programming framework

– Oracle R Enterprise is a separate package that provides real-time access to Oracle Database.

Page 50

Big Data Preparation (Cloud Service)

Self-service data preparation for domain experts

Ingest, prepare, enrich, and publish data with a unified cloud-based data wrangling solution

Unique combination of Natural Language Processing (NLP) with Machine Learning (ML)

Leverage Linked Open Data graph of domain knowledge

Powered by Apache Spark

See https://cloud.oracle.com/en_US/big-data-preparation