Supercharge your RDBMS with Elasticsearch

Arthur Gimpel, Director of DataZone

About Me

Name: Arthur Gimpel

Position: Technology Evangelist, Solutions Architect, Trainer

Tech Stack: Elastic Stack, SQL Server, MongoDB, Couchbase, Redis, Kafka, StreamSets, Python, .NET…

Free Time: Motorcycles, Skydiving…

Relational Database Management Systems

• The first RDBMSs were introduced in the late 1970s

• They exist in all possible flavors but share one thing: ACID

• They still dominate the database market

RDBMS in Theory - ACID

• Atomicity: All-or-nothing approach; a transaction either commits in full or is rolled back in full

• Consistency: Hard state; every committed transaction moves the database from one valid state to another

• Isolation: Concurrent transactions cannot interfere with each other

• Durability: Every committed transaction is persisted
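The atomicity bullet can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in RDBMS; the `accounts` table, the account names, and the `transfer` helper are hypothetical and not part of the original deck. The credit is applied first on purpose, so that the failing debit shows a rollback undoing work that had already succeeded.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside one transaction (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # If this debit violates the CHECK constraint, the credit above is undone too.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

# Alice only has 100, so the debit fails and the whole transfer is rolled back.
ok = transfer(conn, "alice", "bob", 500)
print(ok, balances(conn))
```

After the failed transfer both balances are unchanged: the partial credit to the destination account never becomes visible, which is the all-or-nothing behavior the slide describes.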

ACID Is Not Perfect

• Everything is persisted synchronously, so throughput is limited by I/O performance

• All data is bound to a tabular schema, which is hard to change in big databases

• ACID makes horizontal scaling nearly* impossible

• A complex schema slows down aggregations and queries drastically

NoSQL - New Kid in Town

• Distributed / Horizontal Scalability

• Mostly Open Source

• Mostly schema-less:

  • Key-Value

  • Document

  • Graph

• Serves specific purposes
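The three schema-less models listed on this slide differ mainly in what the store understands about the data. The sketch below mimics each one with plain Python structures. All names and sample records are hypothetical, and no real NoSQL product or client API is involved.

```python
import json

# Key-value: the store only supports "get by key"; the value is an opaque blob.
kv_store = {"user:42": json.dumps({"name": "Dana", "city": "Haifa"})}

# Document: the store understands the structure and can filter on nested fields.
documents = [
    {"_id": 42, "name": "Dana", "address": {"city": "Haifa"}},
    {"_id": 7, "name": "Noam", "address": {"city": "Tel Aviv"}},
]

# Graph: relationships are first-class data, stored as typed edges between nodes.
edges = [(42, "FOLLOWS", 7), (7, "FOLLOWS", 42), (7, "FOLLOWS", 99)]

def find_by_city(docs, city):
    """Document-style query: filter on a nested field."""
    return [d["_id"] for d in docs if d.get("address", {}).get("city") == city]

def neighbors(edge_list, node, relation):
    """Graph-style query: follow outgoing edges of one type from a node."""
    return sorted(dst for src, rel, dst in edge_list if src == node and rel == relation)

print(json.loads(kv_store["user:42"])["name"])
print(find_by_city(documents, "Haifa"))
print(neighbors(edges, 7, "FOLLOWS"))
```

Each access pattern is cheap in one model and awkward in the others, which is one way to read "serves specific purposes".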

NoSQL - Challenges

• Every data store has its purpose. There is no single solution to all database needs

• NoSQL does not implement all of an RDBMS’s abilities (CDC, Jobs, Stored Procedures, Triggers)

• Every data store has its own languages and APIs. There is no ANSI SQL

NoSQL = Not Only SQL | Polyglot Persistence

Elasticsearch

• Search platform and data store based on Apache Lucene

• Supports various search types: Filtered, Full-text, Geography, Aggregation (Facet, Nested, Pipeline), Graph

• Distributed: every index is split into shards, each (potentially) residing on a different node

• Document store: JSON

• “Optimistic” schema-less architecture

• Supports replication by nature

• Supports unsupervised machine learning by nature (Prelert, in beta)
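The sharding and replication bullets can be sketched roughly as follows. This is a simplified model, not Elasticsearch's implementation: Elasticsearch hashes the routing value (by default the document ID) with Murmur3, while this sketch substitutes `zlib.crc32` as a deterministic stand-in. The shard count, node names, placement rule, and sample documents are all hypothetical.

```python
import zlib
from collections import defaultdict

NUM_PRIMARY_SHARDS = 3                    # hypothetical index setting
NODES = ["node-a", "node-b", "node-c"]    # hypothetical cluster (needs at least 2 nodes)

def shard_for(doc_id, num_shards=NUM_PRIMARY_SHARDS):
    """Pick a shard from the document ID (crc32 is a stand-in for Murmur3)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

def placement(shard, nodes=NODES):
    """Round-robin placement; the replica always lands on a different node."""
    primary = nodes[shard % len(nodes)]
    replica = nodes[(shard + 1) % len(nodes)]
    return primary, replica

# "Index" a few JSON-like documents into their shards.
shards = defaultdict(dict)
for doc in (
    {"_id": "order-1", "total": 40},
    {"_id": "order-2", "total": 15},
    {"_id": "order-3", "total": 99},
):
    shards[shard_for(doc["_id"])][doc["_id"]] = doc

for shard_id in range(NUM_PRIMARY_SHARDS):
    primary, replica = placement(shard_id)
    print(shard_id, primary, replica, sorted(shards[shard_id]))
```

Because the shard is derived from the ID alone, any node can compute where a document lives, and keeping each replica on a different node than its primary means that losing one node does not lose a shard.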

Search != SQL Querying
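Only the title of this slide survives, so the following is an illustration of the general idea rather than the speaker's own example. It contrasts a SQL `LIKE` substring match (via the built-in `sqlite3` module) with a tiny inverted index that tokenizes text and ranks results. The sample documents and the term-count scoring are hypothetical; Lucene-based engines use analyzers and a more elaborate relevance model.

```python
import re
import sqlite3
from collections import Counter, defaultdict

DOCS = {
    1: "Elasticsearch is a search platform based on Apache Lucene",
    2: "SQL Server is a relational database",
    3: "Searching a relational database with LIKE scans every row",
}

# SQL side: LIKE matches an exact substring, so word order matters.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO docs VALUES (?, ?)", DOCS.items())

def sql_like(phrase):
    rows = conn.execute(
        "SELECT id FROM docs WHERE body LIKE ? ORDER BY id", (f"%{phrase}%",)
    )
    return [row[0] for row in rows]

# Search side: tokenize once at index time, then look terms up directly.
def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

index = defaultdict(Counter)  # term -> {doc_id: term frequency}
for doc_id, text in DOCS.items():
    for term in tokenize(text):
        index[term][doc_id] += 1

def search(query):
    """Rank documents by total occurrences of the query terms (toy scoring)."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return [d for d, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

print(sql_like("database relational"))  # substring match on the reversed phrase
print(search("database relational"))    # term lookup ignores word order
```

The same two words find nothing with `LIKE` once their order is swapped, while the term-based lookup still returns both matching documents, ranked by score.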

Reference Architecture #1

Reference Architecture #2

Architecture Comparison

                               | Architecture #1                | Architecture #2
Data distribution strategy     | Data store based               | Application based
Data distribution component    | Data Pipeline (StreamSets)     | Message Queue (Kafka)
Implementation Team            | Data Engineers / DevOps        | DevOps / Developers
Implementation Complexity      | Low: Data pipeline development | High: data access layer refactor
Potential additional licensing | Elasticsearch, StreamSets      | None
Scalability                    | Limited to RDBMS Scale         | Fully scalable regardless of RDBMS
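The architecture diagrams did not survive in this transcript, so the sketch below is one plausible reading of the comparison, not the speaker's actual design. In Architecture #1 a pipeline copies changes out of the RDBMS into the search index; in Architecture #2 the application's data access layer publishes events to a queue and a consumer writes to both stores. Everything here is a stand-in: `sqlite3` for the RDBMS, a `dict` for Elasticsearch, `queue.Queue` for Kafka, and the `products` table, `version` column, and function names are hypothetical.

```python
import json
import queue
import sqlite3

def make_rdbms():
    """Create an in-memory stand-in RDBMS with one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
    return conn

# Architecture #1 (data store based): a pipeline polls the RDBMS for new rows.
def pipeline_sync(conn, search_index, last_version):
    """Copy rows changed since `last_version` into the index; return the new checkpoint."""
    rows = conn.execute(
        "SELECT id, name, version FROM products WHERE version > ? ORDER BY version",
        (last_version,),
    ).fetchall()
    for doc_id, name, version in rows:
        search_index[doc_id] = {"name": name}
        last_version = version
    return last_version

# Architecture #2 (application based): the data access layer publishes events.
def publish(topic, doc_id, name):
    topic.put(json.dumps({"id": doc_id, "name": name}))

def consume(topic, conn, search_index):
    """Drain the queue and apply each event to both stores.
    A real deployment would more likely run one consumer per sink."""
    while not topic.empty():
        event = json.loads(topic.get())
        with conn:  # one transaction per event on the RDBMS side
            conn.execute(
                "INSERT OR REPLACE INTO products (id, name, version) VALUES (?, ?, 0)",
                (event["id"], event["name"]),
            )
        search_index[event["id"]] = {"name": event["name"]}

# Architecture #1 demo: the application writes to the RDBMS only.
db1, index1 = make_rdbms(), {}
db1.executemany("INSERT INTO products VALUES (?, ?, ?)", [(1, "helmet", 1), (2, "gloves", 2)])
db1.commit()
checkpoint = pipeline_sync(db1, index1, last_version=0)

# Architecture #2 demo: the application writes to the queue only.
db2, index2, topic = make_rdbms(), {}, queue.Queue()
publish(topic, 1, "helmet")
publish(topic, 2, "gloves")
consume(topic, db2, index2)

print(checkpoint, index1, index2)
```

In the first variant the index can only be as fresh and as scalable as the polling of the RDBMS allows, which matches the "Limited to RDBMS Scale" row. In the second, the application code has to change to publish events, which matches the "data access layer refactor" row.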

Thank You!
