Oracle Week 2016 - Modern Data Architecture
![Page 1: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/1.jpg)
Modern Operational Data Architecture
Arthur Gimpel, DataZone
![Page 2: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/2.jpg)
About Me
• Name: Arthur Gimpel
• Position: Technology Evangelist, Solutions Architect, Trainer
• Tech Stack: MongoDB, SQL Server, Couchbase, Elastic Stack, Redis, Kafka, Python, .NET
![Page 3: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/3.jpg)
Relational Databases
• The first RDBMSs were introduced in the late 1970s
• They exist in all possible flavors but share one thing - ACID
• They still dominate the database market
![Page 4: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/4.jpg)
RDBMS In Theory
• Atomicity: All or nothing approach, transactions
• Consistency: Hard state, every transaction moves the database from one valid state to another
• Isolation: Transactions cannot interfere with each other
• Durability: Every transaction is persisted
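The "all or nothing" behavior of atomicity can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module purely as a convenient stand-in for an RDBMS; the `accounts` table, the `transfer` helper, and the sample balances are hypothetical and not part of the original talk.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # Credit first so that a later failure has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back too.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "bob", "alice", 500)  # debit would make bob negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the failed transfer the credit to `alice` is applied first and then undone when the debit violates the constraint, so the balances stay as they were after the first transfer.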
![Page 5: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/5.jpg)
RDBMS Is Not Perfect
• Everything is persisted, synchronously. Limited by IO performance
• All data is bound to a tabular schema, hard to make changes in big databases
• ACID makes horizontal scaling nearly* impossible
• Complex schema slows down aggregations and queries drastically
![Page 6: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/6.jpg)
NoSQL
• Distributed / Horizontal Scalability
• Mostly Open Source
• Mostly schemaless:
  • Key - Value
  • Document
  • Graph
• Serves specific purposes
![Page 7: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/7.jpg)
NoSQL - Key Value Stores
• Key:
  • Usually a string, equivalent to the primary key in a relational database
• Value:
  • Simple values: Int, Float, DateTime
  • Complex values: Array, Binary, XML, JSON
![Page 8: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/8.jpg)
Key Value - Characteristics
• Database is usually a set of unique keys and their values
• KV data stores are usually easy to distribute
• Key Value access is usually VERY fast
• Indexing and querying values is usually challenging
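These characteristics can be sketched with a toy in-memory store. The `KeyValueStore` class, the `shard_for` helper, and the `session:` key format are illustrative assumptions, not the API of any particular product.

```python
import zlib

class KeyValueStore:
    """Toy key-value store: unique keys mapped to arbitrary values."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        # Access by key is a single hash lookup.
        return self._data.get(key, default)

    def find_by_value(self, predicate):
        # Without an index on values, every entry must be scanned.
        return [key for key, value in self._data.items() if predicate(value)]

def shard_for(key, shard_count):
    """Pick a shard from the key alone, which is what makes distribution easy."""
    return zlib.crc32(key.encode("utf-8")) % shard_count

store = KeyValueStore()
store.put("session:42", {"user": "dana", "cart": ["book"]})
store.put("session:43", {"user": "omer", "cart": []})

session = store.get("session:42")
non_empty_carts = store.find_by_value(lambda value: value["cart"])
shard = shard_for("session:42", 4)
```

Lookups by key touch one entry, while `find_by_value` has to visit all of them, which is the querying challenge noted above.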
![Page 9: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/9.jpg)
Key Value - Use Cases
• Distributed caching
• Session / temporary user data
• Ad tech: Impressions
• Ad tech: Serving data - profiles, segments
• Recommendation engines - main data store
![Page 10: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/10.jpg)
NoSQL - Graph Stores
“In computing, a graph database is a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data” (Wikipedia)
![Page 11: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/11.jpg)
Graph - Characteristics
• Nodes are entities - for example a person
• Properties describe nodes - for example age, name
• Edges are relations between nodes and/or properties
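A minimal sketch of the node / property / edge model, using plain Python structures rather than a real graph database. The people, the `KNOWS` relation, and the `within_hops` helper are made up for illustration.

```python
from collections import deque

# Nodes are entities; each node carries a dictionary of properties.
nodes = {
    "alice": {"age": 34},
    "bob": {"age": 41},
    "carol": {"age": 29},
    "dave": {"age": 52},
}

# Edges are (source, relation, target) triples.
edges = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("carol", "KNOWS", "dave"),
]

def neighbors(node, relation):
    """Nodes reachable from `node` over one edge of the given relation."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]

def within_hops(start, relation, max_hops):
    """Breadth-first traversal limited to `max_hops` edges from `start`."""
    seen = {start}
    found = []
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for nxt in neighbors(node, relation):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                frontier.append((nxt, depth + 1))
    return found

friends_of_friends = within_hops("alice", "KNOWS", 2)
```

Traversals like `within_hops` are the kind of query that graph stores are built to answer directly.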
![Page 12: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/12.jpg)
Graph - Use Cases
• Fraud detection
• Recommendation engines - link analysis
• Intelligence systems
• Social Networks
• Medical Research
![Page 13: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/13.jpg)
NoSQL - Document Stores
• Document databases usually store JSON
• Used to store object-oriented data
• Usually used to avoid the object-relational mismatch
• Document stores have the highest adoption rate among NoSQL databases
![Page 14: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/14.jpg)
Document Store - Characteristics
• Information is stored in JSON variations
• Some document stores support secondary indexes for easier querying
• Documents are usually divided into logical groups (collections, buckets, types - instead of RDBMS tables)
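The idea of collections and secondary indexes can be sketched in a few lines. The `Collection` class below is a hypothetical toy, not the API of MongoDB, Couchbase, or any other product, and it only indexes top-level fields with hashable values.

```python
from collections import defaultdict

class Collection:
    """Toy document collection: JSON-like dicts keyed by `_id`."""

    def __init__(self):
        self._docs = {}     # _id -> document
        self._indexes = {}  # field -> {value -> set of _id}

    def create_index(self, field):
        # Build a secondary index over documents that are already stored.
        index = defaultdict(set)
        for doc_id, doc in self._docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self._indexes[field] = index

    def insert(self, doc):
        self._docs[doc["_id"]] = doc
        # Keep every existing index up to date with the new document.
        for field, index in self._indexes.items():
            if field in doc:
                index[doc[field]].add(doc["_id"])

    def find(self, field, value):
        if field in self._indexes:
            # Indexed lookup: jump straight to the matching documents.
            ids = self._indexes[field].get(value, set())
            return [self._docs[doc_id] for doc_id in sorted(ids)]
        # No index: fall back to scanning the whole collection.
        return [doc for doc in self._docs.values() if doc.get(field) == value]

orders = Collection()
orders.create_index("customer")
orders.insert({"_id": 1, "customer": "dana", "items": [{"sku": "A1", "qty": 2}]})
orders.insert({"_id": 2, "customer": "omer", "items": []})
orders.insert({"_id": 3, "customer": "dana", "items": [{"sku": "B7", "qty": 1}]})

dana_orders = orders.find("customer", "dana")
```

Each order keeps its line items nested inside the document, so no join is needed to read a complete order.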
![Page 15: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/15.jpg)
Document Store - Use Cases
• “Relational” use cases where there is a need for high scale (volume, velocity, variety)
• Hierarchical data - aggregations
• Search use cases
![Page 16: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/16.jpg)
NoSQL - Challenges
• Every data store has its purpose. There is no single solution to all database needs
• NoSQL does not implement all of RDBMS’s abilities (CDC, Jobs, Stored Procedures, Triggers)
• Every data store has its own languages and APIs. There is no equivalent of ANSI SQL
![Page 17: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/17.jpg)
Not Only SQL
![Page 18: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/18.jpg)
Polyglot Persistence Sample Use Cases
• Add search capabilities to your database
• Split session / temporary data processing to key value stores
• Add Graph analysis capabilities to your operational database
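As a rough sketch of the second use case, the snippet below keeps durable user records in a relational store and short-lived session data in a key-value store with expiry. `sqlite3` stands in for the RDBMS, and the `SessionStore` class, the `login` helper, and the 30-second TTL are hypothetical choices made for illustration only.

```python
import sqlite3
import time

class SessionStore:
    """Toy key-value session store where entries expire after a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be demonstrated
        self._items = {}     # session_id -> (expires_at, data)

    def set(self, session_id, data):
        self._items[session_id] = (self._clock() + self._ttl, data)

    def get(self, session_id):
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._items[session_id]  # expired entries are dropped lazily
            return None
        return data

# Durable data stays in the relational database.
users_db = sqlite3.connect(":memory:")
users_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
users_db.execute("INSERT INTO users VALUES (1, 'dana')")
users_db.commit()

def login(user_id, sessions):
    """Look the user up in the RDBMS, then keep session state in the KV store."""
    row = users_db.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    session_id = f"session:{user_id}"
    sessions.set(session_id, {"user": row[0], "cart": []})
    return session_id

fake_now = [0.0]  # manual clock so the example does not have to sleep
sessions = SessionStore(ttl_seconds=30, clock=lambda: fake_now[0])

sid = login(1, sessions)
fresh = sessions.get(sid)
fake_now[0] = 31.0
expired = sessions.get(sid)
```

The relational store never sees the session traffic, and the session data disappears on its own once the TTL passes.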
![Page 19: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/19.jpg)
Search Use Case
![Page 20: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/20.jpg)
Search: Architecture #1
![Page 21: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/21.jpg)
Search: Architecture #2
![Page 22: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/22.jpg)
Architecture Comparison
| | Architecture #1 | Architecture #2 |
| --- | --- | --- |
| Data distribution strategy | Data store based | Application based |
| Data distribution component | Data pipeline | Message queue |
| Implementation team | Data Engineers / DevOps | DevOps / Developers |
| Implementation complexity | Low: data pipeline development | High: data access layer refactor |
| Scalability | Limited to RDBMS scale | Fully scalable regardless of RDBMS |
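The architecture diagrams themselves are not available in this transcript, so the sketch below is only one possible reading of the comparison: in Architecture #1 a data pipeline copies changes out of the RDBMS into the search store, and in Architecture #2 the application publishes each change to a message queue that feeds the search store. The `products` table, the `version` watermark column, all function names, and the use of a dictionary as the search index are assumptions made for illustration.

```python
import queue
import sqlite3

def build_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)"
    )
    return conn

# Architecture #1 (data store based): a pipeline reads rows changed since the
# last watermark from the RDBMS and copies them into the search index.
def run_pipeline(conn, search_index, last_version):
    rows = conn.execute(
        "SELECT id, name, version FROM products WHERE version > ? ORDER BY version",
        (last_version,),
    ).fetchall()
    for product_id, name, version in rows:
        search_index[product_id] = name
        last_version = version
    return last_version

# Architecture #2 (application based): the data access layer writes to the
# RDBMS and also publishes the change to a message queue.
def save_product(conn, events, product_id, name, version):
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO products VALUES (?, ?, ?)",
            (product_id, name, version),
        )
    events.put({"id": product_id, "name": name})

# A consumer drains the queue into the search index without touching the RDBMS.
def drain_events(events, search_index):
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            break
        search_index[event["id"]] = event["name"]

# Architecture #1 demo
conn1 = build_db()
with conn1:
    conn1.execute("INSERT INTO products VALUES (1, 'laptop', 1)")
    conn1.execute("INSERT INTO products VALUES (2, 'phone', 2)")
index1 = {}
watermark = run_pipeline(conn1, index1, 0)

# Architecture #2 demo
conn2 = build_db()
events = queue.Queue()
index2 = {}
save_product(conn2, events, 1, "laptop", 1)
save_product(conn2, events, 2, "phone", 2)
drain_events(events, index2)
```

In the first variant every change still flows through a query against the RDBMS, which matches the "limited to RDBMS scale" row. In the second the indexing path depends only on the queue, at the price of changing the application's data access layer.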
![Page 23: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/23.jpg)
Summary
• Choose the relevant database engine for the right mission - replacing databases is not easy
• Do not hesitate to use more than one database engine in your operational application. The single point of truth will be created in the analytical stack
• Sizing is no replacement for benchmarking. Check your deployment carefully
![Page 24: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/24.jpg)
DataZone Advanced Data Solutions
Enterprise Search
Data Flow Management
Centralized Logging
Operational Analytics
Polyglot Persistence
Business Analytics
![Page 25: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/25.jpg)
DataZone Scale With Confidence
Troubleshooting & Tuning
Technological Evaluation
Training Services
Architecture Review
Cost Management
End-to-End Implementations
Infrastructure Support / DevOps
![Page 26: Oracle Week 2016 - Modern Data Architecture](https://reader033.fdocuments.us/reader033/viewer/2022052706/586fd9261a28ab18428b588d/html5/thumbnails/26.jpg)
Our Ecosystem