Scaling JSON Documents and Relational Data in Distributed Sharded Databases

Oracle Code New York

Christoph Bussler, CMTS

March 21, 2017

Copyright © 2017, Oracle and/or its affiliates. All rights reserved.

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Presentation Agenda

1. Background, Context and Goals
2. Oracle 12c as Multi-Modal Database
3. JSON OLTP
4. Analytics support for JSON
5. Sharding support for JSON

Background, Context and Goals

Background: Federated Application System Architecture

• Strategy for concurrent relational data and JSON data management?
• That was easy:
  – Deploy one database management system supporting one data type each!
• Application system architecture
  – Two databases
  – One optional access layer for each
  – Application accessing two access layers

[Figure: two separate systems, labeled RELATIONAL DBMS and JSON DBMS]

Background: Federated Application System Architecture Evaluation

• Access
  – Two systems, two sets of drivers, two interfaces, two query languages, two data type semantics
• Transactions
  – Local to database, not distributed
  – Possibly different transaction models
  – Failure recovery to be done by application logic code
• Analytics
  – Separated, not on common data set
• Scalability
  – Two approaches
• Engineering knowledge
  – Separated engineering knowledge, two communities, two test environments
• Management
  – Separate systems, different backup functionality and strategies, non-coordinated backup
• Support
  – Two support systems

Alternative Application System Architecture

• Two DBMSs supporting one data type each
• One DBMS supporting two (or more) data types concurrently, integrated and homogeneously

[Figure: a single system, labeled DBMS]

Context

• Recent versions of Oracle 12c
  – Oracle 12c Release 1
  – Oracle 12c Release 2 ("Oracle 12c")
• JSON data structure support is one area of major functional enhancement in all areas of the database functionality
  – Storage
  – Querying
  – Analytics
  – Sharding

Goals

• JSON OLTP (Online Transaction Processing)
  – Introduce "traditional" software system architecture for JSON processing
  – Provide overview of Oracle 12c JSON support
• JSON Analytics
  – Discuss "traditional" software system architecture for JSON analytics
  – Provide overview of Oracle 12c JSON in-memory analytics support
• JSON Sharding
  – Discuss Oracle 12c JSON sharding support

Oracle 12c as Multi-Modal Database

Multi-Model Database

• Database management system that supports more than one data type
• Oracle 12c
  – Relational model
  – Object/relational model
  – XML
  – RDF
  – Topology (Graph)
  – JSON
• Independent of data model, the same non-functional properties are supported
  – E.g., backup/restore, RAC database, Data Guard, In-Memory option, sharding, etc.

JSON OLTP

JSON

• JavaScript Object Notation (JSON) Data Interchange Format

{"firstName": "Chris", "lastName": "Bussler", "zip": 94065}

{"productId": 1011, "sizes": [4, 5, 6, "custom"]}

Standards (I)

• The JavaScript Object Notation (JSON) Data Interchange Format
  – Internet Engineering Task Force (IETF)
  – Request for Comments: 7159
  – Obsoletes: 4627, 7158
  – Category: Standards Track
  – ISSN: 2070-1721
  – T. Bray, Ed., Google, Inc., March 2014
• ECMA-404 The JSON Data Interchange Standard
  – json.org

Standards (II)

• JSON Schema: core definitions and terminology
  – draft-zyp-json-schema-04
  – Internet Engineering Task Force
  – Internet-Draft
  – Intended status: Informational
  – Expires: August 4, 2013
  – F. Galiegue, Ed., K. Zyp, Ed., SitePen (USA), G. Court, January 31, 2013

JSON.org

Observations and Caveats

• JSON is an interchange format (only)
  – Syntax only
  – No operational semantics defined
    • E.g., no comparison operations (>, <, =, etc.), no string operations, no Boolean operations, etc.
    • E.g., no restrictions on array: array elements can be of any type
• Unknown value cannot be expressed (unlike e.g. SQL Null)
• Property order is undefined
• Duplicate properties are not restricted
• No type constructors (new types cannot be introduced by specification)
• Identifier sizes, array sizes, object sizes, etc., are undefined
• Case variations (TRUE vs. true vs. TrUe) are not supported
• Uniqueness (aka, primary key(s)) is undefined
• Array base (zero or one?) is undefined
• Top level object restriction (composite only?)
• Etc.

Know Your Semantics!

• Language libraries
  – Back-end and/or user interface libraries
• Database behavior
  – Driver and database functionality
• Establish knowledge and baseline of operational semantics
  – Regression unit tests that cover all possible semantic aspects
  – Difference in semantics of systems implementing JSON
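A semantic baseline of this kind can be pinned down with a handful of regression checks. The sketch below is only an illustration and is not part of the original slides: it uses Python's standard `json` module as one example of a language library, and the function name `baseline` is hypothetical. Other libraries, drivers and the database itself may behave differently, which is exactly what such tests are meant to reveal.

```python
import json


def baseline():
    """Record how Python's standard json module treats a few of the caveats."""
    observed = {}

    # Duplicate properties: this parser silently keeps the last occurrence.
    observed["duplicate"] = json.loads('{"id": 1, "id": 2}')

    # Array elements may be of any type, including null.
    observed["mixed_array"] = json.loads('[4, 5, 6, "custom", null]')

    # Case variations of literals are rejected: only lowercase true is valid.
    try:
        json.loads("TRUE")
        observed["upper_true"] = "accepted"
    except json.JSONDecodeError:
        observed["upper_true"] = "rejected"

    # A top-level scalar (not an object or array) is accepted by this library.
    observed["top_level_scalar"] = json.loads("94065")

    # Property order: this parser preserves the order found in the text.
    observed["key_order"] = list(json.loads('{"b": 1, "a": 2}').keys())
    return observed


if __name__ == "__main__":
    for name, value in baseline().items():
        print(name, "->", value)
```

Running the same inputs against every library and database in the application stack, and comparing the recorded results, gives a concrete list of semantic differences to design around.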

Oracle 12c JSON Support (Native)

• Oracle Database, JSON Developer's Guide, 12c Release 2 (12.2), E58287-10 (206 pages)
• Relational schema support
  – Create table statement
  – JSON column(s)
• CRUD support
  – SQL
  – JSON functions
• Transaction support
  – ACID transactions
  – "Multi-document" transactions ☺
• Additional topics
  – Virtual columns
  – Referential integrity
  – Partitioning
  – JSON data generation
  – GeoJSON
  – OSON
  – Indexing
  – Encoding
  – External table (file access)
  – SODA
  – JSON Data Guide

JSON Relational Schema Support (I)

• Create table statement
  – VARCHAR2 (4000)
  – VARCHAR2 (32767)
  – BLOB (recommended), CLOB
    • Optimization: LOB (<COLUMN_NAME>) STORE AS (CACHE)
• Constraints
  – Well-formed JSON (lax syntax)
    • CONSTRAINT <constraint_name> CHECK (<column_name> IS JSON)
  – Well-formed JSON (strict syntax)
    • CONSTRAINT <constraint_name> CHECK (<column_name> IS JSON (STRICT))
  – No duplicate properties
    • WITH UNIQUE KEYS
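To make the effect of the strict, unique-keys constraint tangible outside the database, the following sketch approximates it in application code. It is an assumption-laden illustration, not Oracle's implementation: the helper names are hypothetical, Python's `json` parser stands in for the strict syntax check, and the more permissive lax syntax is not modeled at all.

```python
import json


class DuplicateKeyError(ValueError):
    """Raised when one JSON object repeats a property name."""


def _reject_duplicates(pairs):
    # object_pairs_hook receives every (name, value) pair of a single object.
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise DuplicateKeyError(name)
        seen[name] = value
    return seen


def _reject_constant(name):
    # Python would otherwise accept NaN and Infinity, which are not JSON.
    raise ValueError(f"non-standard literal: {name}")


def is_json_with_unique_keys(text):
    """Approximate the check: well-formed JSON and no duplicate property names."""
    try:
        json.loads(
            text,
            object_pairs_hook=_reject_duplicates,
            parse_constant=_reject_constant,
        )
    except ValueError:  # JSONDecodeError and DuplicateKeyError both derive from it
        return False
    return True
```

A client-side check like this can reject bad documents early, but the check constraint in the table definition remains the authoritative guard.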

JSON Relational Schema Support (II)

[Figure from: Oracle® Database SQL Language Reference, 12c Release 2 (12.2), E49448-12, January 2017]

Example – Create Table with JSON Column

CREATE TABLE one_coll (
  part VARCHAR2(4000)
    CONSTRAINT ensure_json CHECK (part IS JSON (STRICT WITH UNIQUE KEYS))
);

INSERT INTO one_coll VALUES (
  '{"id": 1, "cost": 5, "inventory": 100, "description": "screw driver"}'
);

Example – Create Table with Two JSON Columns

CREATE TABLE two_coll (
  part VARCHAR2(4000)
    CONSTRAINT ensure_json_p CHECK (part IS JSON (STRICT WITH UNIQUE KEYS)),
  notes VARCHAR2(2000)
    CONSTRAINT ensure_json_n CHECK (notes IS JSON (STRICT WITH UNIQUE KEYS))
);

INSERT INTO two_coll VALUES (
  '{"id": 1, "cost": 5, "inventory": 100, "description": "screw driver"}',
  '{"status": "brand new"}'
);

Example – Create Table with Mixed Columns

CREATE TABLE mixed_coll (
  id NUMBER,
  part VARCHAR2(4000)
    CONSTRAINT ensure_json_p2 CHECK (part IS JSON (STRICT WITH UNIQUE KEYS)),
  notes VARCHAR2(2000)
    CONSTRAINT ensure_json_n2 CHECK (notes IS JSON (STRICT WITH UNIQUE KEYS))
);

INSERT INTO mixed_coll VALUES (
  1,
  '{"id": 1, "cost": 5, "inventory": 100, "description": "screw driver"}',
  '{"status": "brand new"}'
);

JSON CRUD Support

• Create – Read – Update – Delete
  – Insert
    • Standard SQL
  – Update
    • Standard SQL
    • Update complete JSON value
  – Delete
    • Standard SQL
  – Query
    • Standard SQL
• Standard SQL
  – Standardized by standardization body
  – Extension of SQL for JSON data structure
    • Not separate query language (!)

Example – Insert JSON

INSERT INTO two_coll VALUES (
  '{"id": 1, "cost": 5, "inventory": 100, "description": "screw driver"}',
  '{"status": "brand new"}'
);

Example – Update JSON

INSERT INTO two_coll VALUES (
  '{"id": 1, "cost": 5, "inventory": 100, "description": "screw driver"}',
  '{"status": "brand new"}'
);

UPDATE two_coll
SET notes = '{"status": "used"}'
WHERE json_value(part, '$.id') = 1;

Example – Delete JSON

INSERT INTO two_coll VALUES (
  '{"id": 1, "cost": 5, "inventory": 100, "description": "screw driver"}',
  '{"status": "brand new"}'
);

DELETE FROM mixed_coll
WHERE json_value(part, '$.id') = 1;

Query JSON – DOT Notation

• DOT notation
  – <column>.<property_name>[.<property_name>|<array_step>]*
  – In projection to select property of JSON document

Example – DOT notation

SELECT mc.id, mc.part.cost
FROM mixed_coll mc;

SELECT id, json_value(part, '$.cost') AS COST
FROM mixed_coll;

Query JSON – JSON Functions

• Path expression
  – Selects zero or more matching JSON values
  – Each step must match for the expression to match
• Functions
  – JSON_EXISTS()
    • Returns true, if at least one value matches
  – JSON_VALUE()
    • Returns value if scalar, error if non-scalar
    • Returns SQL Null if no match
  – JSON_QUERY()
    • Returns all matching values
  – JSON_TABLE()
    • Create a relational view (JSON decomposition)
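The matching rules above can be modeled in a few lines of application code. The sketch below is a deliberately simplified illustration, not Oracle's path engine: it supports only property steps (`.name`), array index steps (`[n]`) and the wildcard step (`[*]`), and the Python function names merely mirror the SQL functions for readability. Error handling clauses such as the ones the SQL functions offer are not modeled.

```python
import re

# One path step: either .property or [index] / [*]
_STEP = re.compile(r"\.(\w+)|\[(\*|\d+)\]")


def match_path(doc, path):
    """Return all values selected by a simplified path such as '$.shipper[*].name'."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = [doc]
    pos = 1
    while pos < len(path):
        step = _STEP.match(path, pos)
        if step is None:
            raise ValueError(f"unsupported path syntax at offset {pos}")
        name, index = step.group(1), step.group(2)
        selected = []
        for value in current:
            if name is not None:
                # Property step: matches only objects that have the property.
                if isinstance(value, dict) and name in value:
                    selected.append(value[name])
            elif isinstance(value, list):
                # Array step: wildcard selects all elements, index selects one.
                if index == "*":
                    selected.extend(value)
                elif int(index) < len(value):
                    selected.append(value[int(index)])
        current = selected  # every step must match, or the result becomes empty
        pos = step.end()
    return current


def json_exists(doc, path):
    return len(match_path(doc, path)) > 0


def json_value(doc, path):
    matches = match_path(doc, path)
    if not matches:
        return None  # stands in for SQL NULL
    if len(matches) > 1 or isinstance(matches[0], (dict, list)):
        raise ValueError("path does not select a single scalar")
    return matches[0]


def json_query(doc, path):
    return match_path(doc, path)
```

For a parsed document with a `shipper` array, `json_exists(doc, "$.shipper[0]")` is true, `json_value(doc, "$.weight")` yields `None` when no such property exists, and `json_value(doc, "$.shipper")` raises because the match is not a scalar.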

Example – JSON_TABLE()

INSERT INTO complex_coll VALUES (
  1,
  '{"id": 1, "cost": 5, "inventory": 100, "description": "screw driver",
    "shipper": [{"name": "FAST Shipper", "quality": 5},
                {"name": "SLOMO", "quality": 1},
                {"name": "ALWAYS-ON-TIME", "quality": 10}]}'
);

Example – JSON_TABLE()

SELECT cc.id, jt.shipper, jt.quality
FROM complex_coll cc,
     json_table(part, '$.shipper[*]'
       COLUMNS (shipper VARCHAR2(32 CHAR) PATH '$.name',
                quality VARCHAR2(32 CHAR) PATH '$.quality')) jt;

Example – JSON Join

SELECT mc.id,
       tc.notes AS "tc notes",
       mc.notes AS "mc notes"
FROM two_coll tc,
     mixed_coll mc
WHERE json_value(tc.part, '$.id') = json_value(mc.part, '$.id');

Example – JSON and Relational Data Join

SELECT mc.id,
       tc.notes AS "tc notes",
       mc.notes AS "mc notes"
FROM two_coll tc,
     mixed_coll mc
WHERE json_value(tc.part, '$.id') = mc.id;

Transactions

• Oracle's transaction semantics apply unchanged
• One or more DML SQL statements referring to JSON columns can be in one transaction
• JSON as well as relational DML SQL statements can occur in any order in a transaction

Summary

• JSON
  – Standardized interchange format
  – Popular format for UI, backend programming as well as storage: one format across all application system layers
• Oracle 12c database
  – Provides complete operational semantics
  – Provides extensive functionality
  – Includes JSON support in all non-functional features

Analytics support for JSON

Analytics

• Use of aggregation functions to gain insight and knowledge from OLTP data subset
  – Aggregation functions: avg, min, max, count, stddev, …
• Example query
  – What is average quality of all shippers?
• Analytics dashboard
  – User interface collection of different analytic evaluations for given metrics
  – Not discussed in the following

Classical Analytics Architecture

• Independent analytics system separate from OLTP system
  – Optimized for analytics processing
• ETL (Extract-Transform-Load) from OLTP to analytics system
  – Extract subset of OLTP data set required for analytics
  – Transform extracted data set into form suitable for analytics system
    • Possibly semantic transformation and "cleansing"
  – Load into analytics system for analytics processing

[Figure: OLTP DBMS feeding an ANALYTICS DBMS through ETL]

Classical Analytics Architecture Evaluation

• Separate systems
  – Additional failure points, different infrastructure requirements, separate maintenance approaches, operations support required for several systems
• Data ETL
  – Significant execution duration for overall data transfer (practical data volume limited)
  – Data snapshot (outdated, not up-to-date), continuous stream possible (still lagging)
  – Different programming paradigm compared to OLTP
• Change in analytics requirements
  – Might require change in ETL programming for extracting different data set and/or transforming differently
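The snapshot problem can be seen in a toy model of the pipeline. The sketch below is illustrative only: plain Python lists stand in for the OLTP and analytics systems, and the names `run_etl` and `average_quality` are hypothetical. It shows that an update made after the ETL run is invisible to analytics until the next run.

```python
def run_etl(oltp_docs):
    """Extract shipper data, flatten it, and return the loaded analytics rows."""
    # Extract: only documents that carry shipper information are needed.
    extracted = [doc for doc in oltp_docs if "shipper" in doc]
    # Transform: flatten each nested shipper entry into one flat row.
    transformed = [
        {"part_id": doc["id"], "shipper": s["name"], "quality": s["quality"]}
        for doc in extracted
        for s in doc["shipper"]
    ]
    # Load: the analytics store holds its own copy of the data.
    return list(transformed)


def average_quality(analytics_rows):
    return sum(row["quality"] for row in analytics_rows) / len(analytics_rows)


oltp = [
    {"id": 1, "shipper": [{"name": "FAST Shipper", "quality": 5},
                          {"name": "SLOMO", "quality": 1}]},
]
analytics = run_etl(oltp)                # snapshot taken here
oltp[0]["shipper"][1]["quality"] = 3     # later OLTP update
stale = average_quality(analytics)       # still based on the old snapshot
fresh = average_quality(run_etl(oltp))   # visible only after another ETL run
```

Changing what is analyzed (for example, adding the part cost) means changing both the extract and the transform step, which is the maintenance burden the last bullet refers to.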

Ideal Architecture

• One database system used for OLTP as well as analytics processing
  – One system and environment
  – One programming and querying approach
  – No data movement required through ETL
• Fundamental idea
  – Same data can be represented in form optimized for OLTP as well as for analytics processing
    • OLTP: row (tuple) format
    • Analytics processing: columnar format

[Figure: a single system, labeled DBMS]
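The difference between the two representations can be sketched with ordinary data structures. This is a conceptual illustration only, with hypothetical helper names; it says nothing about how the database lays out memory, and it assumes every row has the same set of properties.

```python
# Row (tuple) format: one record per entry, convenient for OLTP lookups.
rows = [
    {"id": 1, "cost": 5, "inventory": 100},
    {"id": 2, "cost": 77, "inventory": 345},
]


def to_columnar(rows):
    """Pivot rows into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def lookup(rows, row_id):
    """OLTP-style access: fetch one complete record by key."""
    return next(row for row in rows if row["id"] == row_id)


def average(columns, name):
    """Analytics-style access: scan a single column, ignoring all others."""
    values = columns[name]
    return sum(values) / len(values)


columns = to_columnar(rows)
```

An aggregate such as `average(columns, "cost")` touches only the `cost` values, while `lookup(rows, 2)` returns a whole record at once; keeping both forms of the same data lets each workload use the layout that suits it.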

Columnar Format: Data Representation

[Figure from: Oracle® Database In-Memory Guide, 12c Release 2 (12.2), E71458-06, January 2017]

Oracle 12c Analytics Support: In-Memory Option

• OLTP data represented in main memory in columnar format
• Data in main memory (columnar) transactionally consistent with OLTP data (row)
• Analytics processing expressed as regular SQL queries
  – Optimizer decides if columnar format is advantageous over row (tuple) representation
  – No special query language or query syntax elements required

Example – Configure Database

• ALTER SYSTEM SET INMEMORY_SIZE = 100M SCOPE=SPFILE;
• ALTER SYSTEM SET MAX_STRING_SIZE=EXTENDED; -- in upgrade mode
• ALTER SYSTEM SET INMEMORY_EXPRESSIONS_USAGE='ENABLE';
• ALTER SYSTEM SET INMEMORY_VIRTUAL_COLUMNS=ENABLE SCOPE=SPFILE;

Analytics Support for JSON

• No difference with support for relational data
• Check execution plan for usage of In-Memory option

Example – Create Table/Alter Table

CREATE TABLE im_coll (
  id NUMBER,
  part VARCHAR2(4000)
    CONSTRAINT ensure_json_p5 CHECK (part IS JSON (STRICT WITH UNIQUE KEYS))
);

ALTER TABLE im_coll INMEMORY;

ALTER TABLE im_coll NO INMEMORY;

Example – Insert

INSERT INTO im_coll VALUES (
  1,
  '{"id": 1, "cost": 5, "inventory": 100, "description": "screw driver",
    "shipper": [{"name": "FAST Shipper", "quality":5},
                {"name": "SLOMO", "quality":1},
                {"name": "ALWAYS-ON-TIME", "quality":10}]}'
);

INSERT INTO im_coll VALUES (
  2,
  '{"id": 2, "cost": 77, "inventory": 345, "description": "standard screw",
    "shipper": [{"name": "QUICK Shipper", "quality":5},
                {"name": "SLO", "quality":1},
                {"name": "ALWAYS-ON-TIME", "quality":10}]}'
);

Example – Analytics Query

SELECT COUNT(st.shipper),
       SUM(st.quality),
       AVG(st.quality)
FROM (SELECT DISTINCT jt.shipper, jt.quality
      FROM im_coll imc,
           json_table(part, '$.shipper[*]'
             COLUMNS (shipper VARCHAR2(32 CHAR) PATH '$.name',
                      quality VARCHAR2(32 CHAR) PATH '$.quality')) jt
     ) st;
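To make the intent of the query concrete, the sketch below recomputes the same aggregates in plain Python over the two documents from the insert example. It is only an illustration of the logic (decompose the shipper arrays, deduplicate name/quality pairs, aggregate); the function name `shipper_stats` is hypothetical and nothing here was run against an Oracle database.

```python
import json

# The two JSON documents from the insert example.
documents = [
    '{"id": 1, "cost": 5, "inventory": 100, "description": "screw driver",'
    ' "shipper": [{"name": "FAST Shipper", "quality": 5},'
    ' {"name": "SLOMO", "quality": 1},'
    ' {"name": "ALWAYS-ON-TIME", "quality": 10}]}',
    '{"id": 2, "cost": 77, "inventory": 345, "description": "standard screw",'
    ' "shipper": [{"name": "QUICK Shipper", "quality": 5},'
    ' {"name": "SLO", "quality": 1},'
    ' {"name": "ALWAYS-ON-TIME", "quality": 10}]}',
]


def shipper_stats(docs):
    """Return (count, sum, average) of quality over distinct (shipper, quality) pairs."""
    pairs = set()
    for text in docs:
        part = json.loads(text)
        # Counterpart of json_table over '$.shipper[*]': one row per array element.
        for shipper in part.get("shipper", []):
            pairs.add((shipper["name"], shipper["quality"]))  # DISTINCT
    qualities = [quality for _, quality in pairs]
    return len(pairs), sum(qualities), sum(qualities) / len(qualities)


count, total, average = shipper_stats(documents)
```

Because "ALWAYS-ON-TIME" appears in both documents with the same quality, the deduplication step counts it once, leaving five distinct shipper rows to aggregate.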

Wait – There is More!

• Compression methods
  – E.g., MEMCOMPRESS FOR QUERY LOW, MEMCOMPRESS FOR CAPACITY HIGH
• Priority (for loading)
  – E.g., PRIORITY LOW, PRIORITY CRITICAL
• Advisors
  – In-Memory advisor, compression advisor
• Main memory capacity protected through selective OLTP data representation
  – Virtual columns, selective enabling of individual columns
• In-Memory Expressions
• …

Summary

• Single system with dual data representation optimized for OLTP as well as analytics processing
  – Row format
  – Columnar format
• JSON data format fully supported enabling JSON analytics processing
  – Query against JSON data
  – No ETL or pre-analytics transformation required

Sharding support for JSON

What is Sharding in Context of Databases?

• Separation of data and its storage into independent database management systems
  – Each independent DBMS is called a "shard"
  – Shards might be local or remote
  – The set of all shards combined is the "sharded database"
• Disjoint separation
  – Random (consistent hash) or based on "sharding" key
  – Does not imply data replication
• Replication of sharded data
  – For HA/DR support
  – For read-only access
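Consistent hashing, mentioned above as the "random" distribution method, can be sketched generically as a hash ring. The code below is a textbook-style illustration under stated assumptions, not Oracle's implementation: the class name, the shard names, the use of MD5 and the number of virtual points per shard are all choices made for this example.

```python
import bisect
import hashlib


def _point(label):
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, shards, replicas=64):
        # Each shard owns several virtual points to even out the distribution.
        self._ring = sorted(
            (_point(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key):
        """Return the shard owning the first ring point after the key's hash."""
        index = bisect.bisect_right(self._points, _point(key))
        return self._ring[index % len(self._ring)][1]  # wrap around at the end


three = ConsistentHashRing(["shard1", "shard2", "shard3"])
four = ConsistentHashRing(["shard1", "shard2", "shard3", "shard4"])
keys = [f"cust-{i}" for i in range(1000)]
moved = [key for key in keys if three.shard_for(key) != four.shard_for(key)]
```

Every key maps to exactly one shard, so the separation is disjoint. When `shard4` is added, only the keys in `moved` change owner, and each of them moves to the new shard; the rest stay where they were, which is the main appeal over a plain modulo scheme.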

Sharding

[Figure from: Oracle® Database Administrator’s Guide, 12c Release 2 (12.2), E49631-09, December 2016]

Sharding – Replication

[Figure from: Oracle® Database Administrator’s Guide, 12c Release 2 (12.2), E49631-09, December 2016]

Page 56: Scaling JSON Documents and Relational Data in Distributed ... · Oracle Code New York ... •Provide overview of Oracle 12c JSON in-memory analytics support ... –Includes JSON support

Sharding – Sharding Criteria

• Criteria to distribute data between shards

• Automatic sharding (system managed)

– System decides how to distribute data over shards

• Composite sharding


– Data designer decides how to distribute data

– Partitionset

– Specifies value or range of values in column ("shard key")

– Specified when table is created
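The two-level decision in composite sharding can be sketched as follows: a list value (the partition set) selects a group of shards, then a hash of the sharding key selects one shard inside that group. This is a hedged illustration; the class, method, and shard names are hypothetical, and a plain modulo stands in for consistent hashing to keep the sketch short.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.CRC32;

/** Two-level routing sketch: list value picks a shard group, key hash picks a shard in it. */
final class CompositeRouter {
    // Partition-set value (for example a region) -> shards that hold that set. Names are hypothetical.
    private final Map<String, List<String>> shardsBySet = new LinkedHashMap<>();

    void addPartitionSet(String listValue, List<String> shards) {
        shardsBySet.put(listValue, new ArrayList<>(shards));
    }

    String route(String listValue, String shardingKey) {
        List<String> shards = shardsBySet.get(listValue);
        if (shards == null) {
            throw new IllegalArgumentException("no partition set for value " + listValue);
        }
        CRC32 crc = new CRC32();
        crc.update(shardingKey.getBytes(StandardCharsets.UTF_8));
        // Assumption: modulo placement instead of a real consistent hash.
        return shards.get((int) (crc.getValue() % shards.size()));
    }
}
```

A caller would register one partition set per list value, for example `addPartitionSet("EUROPE", ...)`, and then call `route("EUROPE", custId)`; a row can therefore never land outside the shard group the data designer assigned to its list value.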


Example – Composite Sharding

[Figure not reproduced. Source: Oracle® Database Administrator’s Guide, 12c Release 2 (12.2), E49631-09, December 2016]

Example

CREATE SHARDED TABLE Customers (
  CustId      VARCHAR2(60) NOT NULL,
  Name        VARCHAR2(60),
  Geo         VARCHAR2(8),
  CustProfile VARCHAR2(4000),
  CONSTRAINT pk_customers PRIMARY KEY (CustId),
  CONSTRAINT json_customers CHECK (CustProfile IS JSON)
)
PARTITIONSET BY LIST (Geo)
PARTITION BY CONSISTENT HASH (CustId)
PARTITIONS AUTO (
  PARTITIONSET america VALUES ('AMERICA') TABLESPACE SET tsp_set_1,
  PARTITIONSET europe  VALUES ('EUROPE')  TABLESPACE SET tsp_set_2
);


Sharding Architecture

[Figure not reproduced. Source: Oracle® Database Administrator’s Guide, 12c Release 2 (12.2), E49631-09, December 2016]

Universal Connection Pool - JDBC

• UCP introduced shared pools

– One pool can have connections to more than one database

– One pool can have connections to different shards

• Connection creation


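// Needs: java.sql.Connection, java.sql.SQLException,
//        oracle.jdbc.OracleType, oracle.ucp.jdbc.PoolDataSource
protected Connection getCustomerConnection(PoolDataSource pool, Customer customer) throws SQLException {
    // Build a sharding key from the customer's email; the pool uses it
    // to hand out a connection to the shard that holds this key.
    return pool.createConnectionBuilder()
        .shardingKey(pool.createShardingKeyBuilder()
            .subkey(customer.email, OracleType.VARCHAR2)
            .build())
        .build();
}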


Sharding Management

• Adding, removing shards

• Resharding

– Required by adding/removing shards

• Backup/recovery

• Monitoring

– Command line options

• Schema modification

– Orchestrated by shard catalog

• Patching

– Rolling patching supported
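Why adding a shard requires resharding can be sketched with a chunk map: keys hash to a fixed number of chunks, chunks are assigned to shards, and adding a shard moves whole chunks until the load is even again. This is an illustrative model under stated assumptions; the class name, the chunk count, the use of `String.hashCode`, and the balancing rule are hypothetical and not taken from Oracle's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Chunk-based placement sketch: keys map to chunks, chunks map to shards. */
final class ChunkMap {
    private final String[] ownerOfChunk;   // chunk index -> shard name

    ChunkMap(int chunkCount, List<String> shards) {
        ownerOfChunk = new String[chunkCount];
        for (int c = 0; c < chunkCount; c++) {
            ownerOfChunk[c] = shards.get(c % shards.size());   // initial round-robin assignment
        }
    }

    /** A key's chunk never changes; only the chunk's owner can change. */
    int chunkFor(String key) {
        return Math.floorMod(key.hashCode(), ownerOfChunk.length);
    }

    String shardFor(String key) {
        return ownerOfChunk[chunkFor(key)];
    }

    int chunksOn(String shard) {
        int n = 0;
        for (String owner : ownerOfChunk) {
            if (owner.equals(shard)) n++;
        }
        return n;
    }

    /** Adds a shard and moves chunks from the most loaded shards to it; returns the moved chunks. */
    List<Integer> addShard(String newShard) {
        Map<String, Integer> load = new HashMap<>();
        for (String owner : ownerOfChunk) {
            load.merge(owner, 1, Integer::sum);
        }
        int fairShare = ownerOfChunk.length / (load.size() + 1);
        List<Integer> moved = new ArrayList<>();
        while (moved.size() < fairShare) {
            String donor = Collections.max(load.entrySet(), Map.Entry.comparingByValue()).getKey();
            for (int c = 0; c < ownerOfChunk.length; c++) {
                if (ownerOfChunk[c].equals(donor)) {
                    ownerOfChunk[c] = newShard;          // this chunk's data would be migrated
                    load.merge(donor, -1, Integer::sum);
                    moved.add(c);
                    break;
                }
            }
        }
        return moved;
    }
}
```

With 12 chunks on two shards, adding a third shard moves four chunks; keys in the other eight chunks keep their shard, which is what keeps the data movement bounded.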


Wait – But Why?

• Linear scalability

• Fault containment

• Geographical data distribution

• Rolling upgrades


• Cloud deployment benefits

– Sizing, elasticity, mix of cloud/on-premise

• Cool


Summary


JSON

• Scaling JSON Documents and Relational Data in Distributed Sharded Databases

– Oracle as multi-model database concurrently supports different data models, including JSON

– Oracle 12c provides a complete set of functional and non-functional capabilities for OLTP and analytics processing of JSON data (documents)

– Data modeler can choose from all data models within one database design

– Engineer can choose from all data models to implement OLTP and/or analytics

– Database ops can choose best deployment options for the scalability required
