Oracle Big Data

Copyright © 2011, Oracle and/or its affiliates. All rights reserved.

Page 1

Oracle Big Data

Page 2

Tapping into Diverse Data Sets

Transactions

Information

Architectures

Today: Decisions based on database data

Big Data: Decisions based on all your data

Video and Images

Machine-Generated Data

Social Data

Documents

Page 3

Case: On-line Ads and Content

NoSQL DB

Expert System

Real-time: Determine best ad to place on page for this user

Input into

Lookup user profile

Add user if not present

Web logs

HDFS

Profiles

NoSQL DB

High scale data reductions

BI and Analytics

Billing

Predictions on browsing

Actual ads served

Low Latency

Batch
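
The low-latency path on this slide (look up the user profile, add the user if not present, then determine an ad) can be sketched in a few lines. This is a minimal illustration only: the in-memory dict stands in for the NoSQL DB, and `get_or_add_profile`, `choose_ad`, the profile fields and the ad inventory are hypothetical names, not part of any Oracle product.

```python
# Minimal sketch of the low-latency path: profile lookup, add-if-absent,
# then a trivial rule-based "expert system". All names are hypothetical.

profiles = {}  # stands in for the NoSQL DB: user_id -> profile dict

ADS = {"sports": "ad-running-shoes", "travel": "ad-city-breaks"}
DEFAULT_AD = "ad-generic"


def get_or_add_profile(user_id):
    """Look up the user profile; add an empty one if not present."""
    return profiles.setdefault(user_id, {"interests": []})


def choose_ad(user_id):
    """Pick the first ad matching a known interest, else a default."""
    profile = get_or_add_profile(user_id)
    for interest in profile["interests"]:
        if interest in ADS:
            return ADS[interest]
    return DEFAULT_AD


first = choose_ad("u1")                       # unknown user gets added
profiles["u1"]["interests"].append("travel")  # e.g. written back by the batch path
second = choose_ad("u1")
print(first, second)
```

The single-record lookup keyed by user id is the part that has to be fast; everything that enriches the profile can happen later in batch.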

Page 4

Case: On-line Ads and Content

NoSQL DB
• Dynamic and rapidly changing schema
• Scalable single record lookup

HDFS
• Low cost, high scale storage
• Write once, read many times

Hadoop
• High scale batch processing
• Highly customizable infrastructure

RDBMS
• Deep analytics and BI value add
• Reporting for large user community
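
The batch side of this case (web logs written once to HDFS, then reduced by Hadoop for billing and analytics) follows the map/reduce pattern. The sketch below runs that pattern in plain Python on a few made-up log lines. The log format and field names are assumptions for illustration, and a real job would run on Hadoop rather than in a single process.

```python
from collections import defaultdict

# Hypothetical web-log lines: "<user_id> <ad_id>" per ad actually served.
LOG_LINES = [
    "u1 ad-city-breaks",
    "u2 ad-generic",
    "u1 ad-city-breaks",
    "u3 ad-generic",
]


def map_phase(line):
    """Emit (ad_id, 1) for each served ad."""
    _user_id, ad_id = line.split()
    yield ad_id, 1


def reduce_phase(pairs):
    """Sum the counts per ad_id (the data reduction step)."""
    totals = defaultdict(int)
    for ad_id, count in pairs:
        totals[ad_id] += count
    return dict(totals)


pairs = (pair for line in LOG_LINES for pair in map_phase(line))
ads_served = reduce_phase(pairs)
print(ads_served)
```

The reduced counts are small enough to load into an RDBMS for reporting, which is the division of labour the slide describes.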

Page 5

Big Data Is About…

Tapping into diverse data sets

Finding and monetizing hidden relationships

Driving data-based business decisions

Page 6

Big Data: Infrastructure Requirements

Acquire
• Low, predictable Latency
• High Transaction Count
• Flexible Data Structures

Organize
• High Throughput
• In-Place Preparation
• All Data Sources/Structures

Analyze
• Deep Analytics
• Agile Development
• Massive Scalability
• Real Time Results

Page 7

Divided Solution Spectrum

Acquire, Organize, Analyze

NoSQL (Flexible, Specialized, Developer Centric)
• Distributed File Systems
• Transaction (Key-Value) Stores
• MapReduce Solutions

SQL (Trusted, Secure, Administered)
• DBMS (OLTP)
• ETL
• DBMS (DW)
• Advanced Analytics

[Diagram axes: Data Variety, from Unstructured / Schema-less to Schema; Information Density, from “Low Density” to “High Density”]

Page 8

Oracle Integrated Software Solution Stack

Acquire, Organize, Analyze

• Oracle NoSQL DB
• HDFS
• Hadoop
• Oracle Loader for Hadoop
• Oracle Data Integrator
• Oracle Database (OLTP)
• Oracle Database (DW)
• In-DB Analytics: “R”, Mining, Text, Graph, Spatial
• Oracle BI EE

[Diagram axes: Data Variety, from Unstructured to Schema; Information Density]

Page 9

Oracle’s Big Data solution

Acquire, Organize, Analyze & Visualize, Stream

[Diagram: Oracle Big Data Appliance, Oracle Exadata and Oracle Exalytics, linked by InfiniBand]

Page 10

Why build a Hadoop Appliance?

• Time to Build?

• Required Expertise?

• Cost and Difficulty Maintaining?

Page 11

Oracle Engineered Solutions

Acquire, Organize, Analyze

• Oracle NoSQL DB
• HDFS
• Hadoop
• Oracle Loader for Hadoop
• Oracle Data Integrator
• Oracle Database (OLTP)
• Oracle Database (DW)
• In-DB Analytics: “R”, Mining, Text, Graph, Spatial
• Oracle BI EE

[Diagram axes: Data Variety, from Unstructured to Schema; Information Density]

Big Data Appliance
• Hadoop
• NoSQL Database
• Oracle Loader for Hadoop
• Oracle Data Integrator

Oracle Exadata
• OLTP & DW
• Data Mining & Oracle R
• Semantics
• Spatial

Exalytics
• Speed of Thought Analytics

Page 12

Big Data Appliance: Hardware

18 Sun X4270 M2 Servers

• 48 GB memory per node; 864 GB memory total

• 2 CPUs (6-core Intel) per node, 216 cores total

• 12 x 2 TB HDDs per node (24 TB); 432 TB raw disk total

3 InfiniBand switches

• 40 Gb/sec InfiniBand – 100 total ports (for internal backplane and interconnection to Exadata)

• 10 Gb/sec Ethernet – 16 total ports (for connection to datacenter)
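
The rack totals on this slide follow directly from the per-node figures. A short check, using only numbers stated above (the constant names are mine):

```python
# Per-node figures taken from the slide.
NODES = 18
MEMORY_GB_PER_NODE = 48
CPUS_PER_NODE = 2
CORES_PER_CPU = 6
DISKS_PER_NODE = 12
TB_PER_DISK = 2

total_memory_gb = NODES * MEMORY_GB_PER_NODE          # memory across the rack
total_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU   # cores across the rack
raw_disk_tb = NODES * DISKS_PER_NODE * TB_PER_DISK    # raw disk across the rack

print(total_memory_gb, total_cores, raw_disk_tb)
```

This gives 864 GB of memory, 216 cores and 432 TB of raw disk, matching the slide.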

Page 13

Big Data Appliance

Cluster of industry standard servers for Hadoop and NoSQL Database

• Focus on Scalability and Availability at low cost

Compute and Storage

• 18 High-performance low-cost servers acting as Hadoop nodes

• 24 TB Capacity per node

• 2 6-core CPUs per node

• Hadoop triple replication

• NoSQL Database triple replication

10GigE Network

• 8 10GigE ports

• Datacenter connectivity

InfiniBand Network

• Redundant 40Gb/s switches

• IB connectivity to Exadata
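
Triple replication means each piece of data is stored three times, so usable capacity is roughly a third of raw capacity. A rough estimate, assuming a replication factor of 3 applied across all 18 nodes and ignoring space for the operating system, temporary files and the split between Hadoop and NoSQL Database (none of which the slide specifies):

```python
NODES = 18
TB_PER_NODE = 24
REPLICATION_FACTOR = 3  # "triple replication" on the slide

raw_tb = NODES * TB_PER_NODE
usable_tb = raw_tb / REPLICATION_FACTOR  # upper bound on unique data stored
print(raw_tb, usable_tb)
```

Under these assumptions the 432 TB of raw disk holds at most about 144 TB of unique data.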

Page 14

Big Data Appliance Building Block

• High-performance storage server built from industry standard components
• 12 x 2 TB 7200 RPM High Capacity SAS disks
• 2 Six-Core Intel Xeon Processors (L5640)
• Dual-ported 40 Gb/sec InfiniBand
• Optimized software layout:
– Hadoop HDFS
– HBase and Hive
– NoSQL Database and Replicas

• Hardware by Sun

• Software by Oracle

Page 15

Scale Out to Infinity

Scale out by connecting racks to each other using InfiniBand

• 60 Nodes
• 864 Cores
• 1.7 PB Storage
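
These totals can be checked against the single-rack figures given earlier (18 nodes per rack, 12 cores and 24 TB per node). The sketch below scales those numbers by a rack count; `rack_totals` is a hypothetical helper. Under these assumptions, 864 cores and about 1.7 PB correspond to four racks, which would be 72 nodes, so the 60-node figure on the slide does not follow from the same arithmetic.

```python
NODES_PER_RACK = 18
CORES_PER_NODE = 12   # 2 CPUs x 6 cores
TB_PER_NODE = 24      # 12 disks x 2 TB


def rack_totals(racks):
    """Return (nodes, cores, raw storage in decimal PB) for a rack count."""
    nodes = racks * NODES_PER_RACK
    cores = nodes * CORES_PER_NODE
    storage_pb = nodes * TB_PER_NODE / 1000
    return nodes, cores, storage_pb


print(rack_totals(4))
```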

Page 16

Big Data Appliance: Software

Big Data for the Enterprise

• Foundation Software:
– Oracle Linux
– Oracle Java VM
– Open-source Apache Hadoop Distribution
– Open-source R Distribution

• Application Software:
– Oracle NoSQL Database Enterprise Edition (New)
– Oracle Loader for Hadoop (New)
– Oracle Data Integrator Application Adapter for Hadoop (New)

Page 17

Oracle Big Data Appliance Software

• Oracle Linux 5.6
• Java HotSpot VM
• Apache Hadoop Distribution v0.20.x
• R Distribution
• Oracle NoSQL Database Enterprise Edition
• Oracle Data Integrator Application Adapter for Hadoop
• Oracle Loader for Hadoop

Page 18

Big Data Appliance

Big Data for the Enterprise

• Optimized and Complete
– Everything you need to store and integrate your lower information density data

• Integrated with Oracle Exadata
– Analyze all your data

• Easy to Deploy
– Risk Free, Quick Installation and Setup

• Single Vendor Support
– Full Oracle support for the entire system and software set

Page 19

Oracle NoSQL Database

A distributed, scalable key-value database

• Simple Programming and Operational Model
– Simple Major + Sub key and Value data structure
– ACID transactions
– Configurable consistency & durability
– Scalable throughput, bounded latency

• Commercial Grade Software and Support
– General-purpose
– Reliable: based on proven Berkeley DB JE HA
– Easy to install and configure

• Easy Management
– Web-based console, API accessible
– Manages and Monitors: Topology; Load; Performance; Events; Alerts

[Diagram: applications, each with a NoSQLDB Driver, connect to Storage Nodes in Data Center A and Data Center B]
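
The "Major + Sub key and Value" structure can be pictured as a two-level map: a major key groups related records, and a sub key selects one value within the group. The toy class below is a hypothetical in-memory model of that idea, not the Oracle NoSQL Database API; `ToyKVStore` and its method names are invented for illustration.

```python
from collections import defaultdict


class ToyKVStore:
    """In-memory stand-in for a key-value store with major + sub keys.

    Not the Oracle NoSQL Database API; a hypothetical model only.
    """

    def __init__(self):
        # major key -> {sub key -> value}
        self._data = defaultdict(dict)

    def put(self, major, sub, value):
        self._data[major][sub] = value

    def get(self, major, sub):
        return self._data.get(major, {}).get(sub)

    def multi_get(self, major):
        """Return every sub key/value stored under one major key."""
        return dict(self._data.get(major, {}))


store = ToyKVStore()
store.put(("user", "u1"), "email", b"u1@example.com")
store.put(("user", "u1"), "interests", b"travel,sports")

print(store.get(("user", "u1"), "email"))
print(sorted(store.multi_get(("user", "u1"))))
```

Grouping by major key is what lets a store keep related records together and fetch them in one operation, while a single sub key lookup stays cheap.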

Page 20

Oracle Loader for Hadoop

[Diagram labels: Input; Partition and transform into Oracle ready format; Load; Table; Query]
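
"Partition and transform into Oracle ready format" can be pictured as two steps: convert each raw record into typed column values, then group the rows by the partition of the target table they belong to. The sketch below is a simplified illustration of that idea in plain Python, not how Oracle Loader for Hadoop is implemented; the record layout and the month-based partitioning are assumptions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw input records (e.g. parsed from text files in HDFS).
RAW = [
    "2011-10-03,u1,3",
    "2011-11-15,u2,1",
    "2011-10-21,u3,7",
]


def transform(line):
    """Convert a text record into typed column values."""
    day, user_id, clicks = line.split(",")
    return date.fromisoformat(day), user_id, int(clicks)


def partition_key(row):
    """Assume the target table is partitioned by month of the date column."""
    day = row[0]
    return (day.year, day.month)


partitions = defaultdict(list)
for line in RAW:
    row = transform(line)
    partitions[partition_key(row)].append(row)

# Sort rows within each partition so each can be loaded as one ordered batch.
for rows in partitions.values():
    rows.sort()

print({key: len(rows) for key, rows in partitions.items()})
```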

Page 21

Streaming Access to HDFS

[Diagram labels: HDFS (five nodes) holding Datafile_part_1, Datafile_part_2, Datafile_part_m, Datafile_part_n, Datafile_part_x; FUSE; Oracle Database; External Table; View or Table Function; Map; Reduce; Query]
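
The diagram shows several data file parts in HDFS presented to the database as one row source. The general idea of reading many part files lazily as a single stream can be sketched with generators. Here local temporary files stand in for the HDFS parts, and `read_part` and `stream_parts` are hypothetical helpers, not part of FUSE or Oracle Database.

```python
import itertools
import os
import tempfile


def read_part(path):
    """Yield rows from one part file lazily, one line at a time."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n").split(",")


def stream_parts(paths):
    """Present several part files as a single stream of rows."""
    return itertools.chain.from_iterable(read_part(p) for p in paths)


# Local temporary files stand in for Datafile_part_1 and Datafile_part_2.
tmpdir = tempfile.mkdtemp()
paths = []
for index, content in enumerate(["u1,3\nu2,1\n", "u3,7\n"], start=1):
    path = os.path.join(tmpdir, f"Datafile_part_{index}")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    paths.append(path)

rows = list(stream_parts(paths))
print(rows)
```

Because each part is read only when the consumer reaches it, nothing has to be copied or staged before the first row is available.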

Page 22

Oracle Data Integrator

Easily integrate data from any source

Expanded functionality:

=> Construct Hadoop jobs to transform and load data into Oracle

=> Leverage Oracle Loader for Hadoop and/or Hive

Page 23

Exadata: A Platform for Analytics

• Data Mining
• Text Analytics
• Spatial Analytics
• Graph Analytics
• Statistics
• Integrate into Applications

Page 24

In-database Statistics and Advanced Analytics with R

• Deliver enterprise-level advanced analytics based on the R environment

1. Oracle’s Distribution of Open Source R
• Enterprise support for open-source R
• Enhanced performance with Intel MKL libraries for x86 hardware

2. Oracle R Enterprise
• Eliminates R’s memory constraint by enabling R to work directly and transparently on database-resident data
• Transparently leverages Oracle’s in-database analytics via the R language
• Enables integration of R scripts into enterprise production applications and OBIEE dashboards
• Leverages latest R algorithms and CRAN packages

Page 25

Oracle R Architecture

[Diagram components: R workspace console; Function push-down (data transformation & statistics); Oracle statistics engine; OBIEE, Web Services]

• No changes to the user experience
• Scale to large data sets
• Embed in operational systems
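
Function push-down means a statistic is computed where the data lives, and only the result travels back to the client. The sketch below uses Python's built-in `sqlite3` module as a stand-in database to contrast the two approaches. It is an analogy only, not Oracle R Enterprise, and the table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (user_id TEXT, n INTEGER)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?)",
    [("u1", 3), ("u2", 1), ("u3", 7), ("u4", 5)],
)

# Without push-down: pull every row to the client, then compute.
rows = conn.execute("SELECT n FROM clicks").fetchall()
client_mean = sum(n for (n,) in rows) / len(rows)

# With push-down: the database computes the aggregate and returns one row.
(pushed_mean,) = conn.execute("SELECT AVG(n) FROM clicks").fetchone()

print(client_mean, pushed_mean)
conn.close()
```

Both approaches give the same answer, but the first needs client memory proportional to the table size, which is the constraint the previous slide says Oracle R Enterprise removes.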

Page 26

Oracle’s Big Data solution

Acquire, Organize, Analyze & Visualize, Stream

[Diagram: Oracle Big Data Appliance, Oracle Exadata and Oracle Exalytics, linked by InfiniBand]

Page 27

The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
