©Continuent Ltd 2017
Spread the Database Love with Heterogeneous Replication
MC Brown, VP, Products
Heterogeneous Replication is NOT
• Exporting and Importing Data
• Moving to a different database platform
• One Time Exports
• ETL
Heterogeneous Replication IS
• Live, constant, low-latency movement of data
• For analytics
• For migration
• For upgrades
• For Caching
• Data/format matching
• Effective target reproduction
Know Your Databases
Not all Databases are Created Equal
• Transactional over non transactional
• Object Reference
• Rows
• Columns
• Documents
• Free text
• Unstructured
Is that a record, a field, a row, a column?
• Row of data?
• Collection of related tables?
• What does it look like as a document?
• What does a document look like as a row?
• Databases, tables, collections, objects, buckets…
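The row-versus-document question can be made concrete with a minimal JavaScript sketch. The `customers` and `orders` tables, their fields, and the helper names are invented for illustration; they are not taken from the talk.

```javascript
// Hypothetical rows from two related tables (names invented for illustration).
const customers = [{ id: 1, name: "Ada" }];
const orders = [
  { id: 10, customer_id: 1, total: 25.5 },
  { id: 11, customer_id: 1, total: 7.0 },
];

// Rows -> document: embed each customer's orders inside the customer record.
function toDocument(customer, allOrders) {
  return {
    _id: customer.id,
    name: customer.name,
    orders: allOrders
      .filter((o) => o.customer_id === customer.id)
      .map((o) => ({ id: o.id, total: o.total })),
  };
}

// Document -> rows: flatten the embedded orders and restore the foreign key.
function toRows(doc) {
  return {
    customer: { id: doc._id, name: doc.name },
    orders: doc.orders.map((o) => ({ id: o.id, customer_id: doc._id, total: o.total })),
  };
}

const doc = toDocument(customers[0], orders);
const rows = toRows(doc);
```

The round trip only works here because the document keeps enough information (the customer id) to rebuild the relationship; a mapping that drops it cannot be reversed.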
Related Tables or Document?
Mapping DB Compatibility
| | RDBMS | Columnar Store | Document Database | Freetext/unstructured data store |
| --- | --- | --- | --- | --- |
| RDBMS | Vendor specific only | Vendor specific only | Field mappings only | Application specific |
| Columnar Store | Vendor specific only | Vendor specific only | Field mappings only | Application specific |
| Document Database | Field mappings only | Field mappings only | Vendor specific only | Application specific |
| Freetext/unstructured data store | Application specific | Application specific | Application specific | Application specific |

• Vendor specific - i.e. unique data types
• Field mappings - how we map the data
• App Specific - how the data is used
Know Your Data
Hetero Replication Challenges
• Effective data replication
• nothing lost or removed
• low latency
• Automatic mapping
• Data typing
• Indexing and native use
Challenges: Data Typing
• Data types are not supported everywhere
• For some, the type does not matter
• Even if the type does matter, the format, precision, structure might be different
• Numbers, Dates, Strings, Compound Data Types all cause problems
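As a rough illustration of the typing problem, the sketch below normalises a few source values before they reach a target. The type names and the rules are assumptions chosen for the example; a real mapping depends on the specific source and target.

```javascript
// Hypothetical normalisation rules; a real mapping depends on source and target.
function normalizeValue(type, value) {
  switch (type) {
    case "DATE":
      // MySQL-style zero dates have no equivalent in many targets.
      return value === "0000-00-00" ? null : value;
    case "DECIMAL":
      // Keep decimals as strings so a float-only target does not lose precision.
      return String(value);
    case "TINYINT(1)":
      // Often used as a boolean in the source.
      return Number(value) !== 0;
    default:
      // Types the target handles natively pass through unchanged.
      return value;
  }
}
```

For example, `normalizeValue("DATE", "0000-00-00")` yields `null`, while a `VARCHAR` value is returned as-is.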
Extraction/Apply Rates
• Data extraction rates vary
• Data apply rates
• Different solutions handle data loading at different rates
• Rows-based extraction/bulk apply
• Bulk extraction/row apply
• Non-destructive
What's the solution?
Replicator Needs
• Native, neutral format
• Ability to change, reformat, restructure information
• Standalone nature
• Two-way
• Handle impedance problem
Guess What?
• Tungsten Replicator does this
• High Performance
• Flexible storage interchange format
• Built-in filtering
• Operates standalone
• Stop and restart
• Transactionally consistent
• Open Source
Applying Data
Native is Best, Batch an Alternative
• Native:
• Applying to JDBC
• Adapt JDBC Applier to construct statement
• Or apply a record to target using API
• Batch
• Use CSV for data interchange
• Call scripts to import
How Batch Apply Works
[Diagram: the replicator service (ora2vr) takes transactions from the master and writes them to CSV files. The CSV files are then either COPYed directly to the base tables, or COPYed to staging tables and moved into the base tables by a merge script (SELECT to base tables).]
How Batch Apply Works
• Works on one table at a time
• Five functions in JavaScript
• Prepare - Run when going online
• Begin - Start of transaction
• Apply - During transaction
• Commit - After transaction
• Release - When going offline
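The lifecycle above can be sketched as a plain JavaScript skeleton. The five function names follow the slide; the `csvInfo` argument, the `runBatch` driver and the `calls` log are hypothetical stand-ins for what the replicator would provide, not its actual API.

```javascript
// Records the order of calls so the lifecycle is visible; illustration only.
const calls = [];

// The five hooks named on the slide. Bodies are placeholders.
function prepare() { calls.push("prepare"); }       // run when going online
function begin() { calls.push("begin"); }           // start of transaction
function apply(csvInfo) {                           // during transaction, one table at a time
  calls.push("apply " + csvInfo.table);
}
function commit() { calls.push("commit"); }         // after transaction
function release() { calls.push("release"); }       // when going offline

// Hypothetical driver standing in for the replicator.
function runBatch(batches) {
  prepare();
  for (const csvFiles of batches) {
    begin();
    for (const csvInfo of csvFiles) apply(csvInfo);
    commit();
  }
  release();
}

runBatch([[{ table: "sample", file: "/tmp/sample.csv" }]]);
```

After the run, `calls` holds the hooks in the order they fired: prepare, begin, one apply per table, commit, release.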
During a transaction
• Copy, import, load the CSV
• Have access to column, key and transaction information
• Merge the data
• Delete and Insert, or
• Delete, Update and Insert
• Done
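A minimal in-memory sketch of the "Delete and Insert" merge follows. It assumes staged rows carry `optype`, `seqno`, `uniqno` and a primary key `id`, with an update represented as a delete followed by an insert; the base table is modelled as a `Map`, which is purely an illustration device.

```javascript
// Apply staged change rows to a base table held in a Map keyed by primary key.
function mergeStaged(baseTable, stagedRows) {
  // Find the final change for each key, in (seqno, uniqno) order.
  const ordered = [...stagedRows].sort(
    (a, b) => a.seqno - b.seqno || a.uniqno - b.uniqno
  );
  const last = new Map();
  for (const row of ordered) last.set(row.id, row);

  // Delete phase: remove every key touched by the batch.
  for (const id of last.keys()) baseTable.delete(id);

  // Insert phase: re-insert only keys whose final change is an insert.
  for (const [id, row] of last) {
    if (row.optype === "I") baseTable.set(id, { id, message: row.message });
  }
  return baseTable;
}

const base = new Map([
  [1, { id: 1, message: "old" }],
  [2, { id: 2, message: "keep" }],
]);
mergeStaged(base, [
  { optype: "D", seqno: 5, uniqno: 1, id: 1 },                  // update = delete ...
  { optype: "I", seqno: 5, uniqno: 2, id: 1, message: "new" },  // ... plus insert
  { optype: "I", seqno: 6, uniqno: 1, id: 3, message: "temp" },
  { optype: "D", seqno: 7, uniqno: 1, id: 3 },                  // inserted, then deleted
]);
```

Resolving the final change per key first is what keeps a row that was inserted and then deleted within the same batch from reappearing in the base table.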
Case Study: Cassandra/CQL
• Load table data:
• COPY staging_tablename (optype,seqno,uniqno,id,message) from 'FILENAME'
• Delete:
• delete from sample where id in (#{deleteidlist})
• Insert:
• insert into sample ("+collist+") values ("+substlist+")
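The statements above are built by string substitution. A hedged sketch of that step is shown below: `buildStatements` and its parameters are hypothetical, the `staging_` prefix is an assumption based on the slide's `staging_tablename`, and nothing is sent to a database. Values are left as `?` placeholders on the assumption that they are bound later.

```javascript
// Build the load, delete and insert statement text for one table.
function buildStatements(table, columns, csvFile, deleteIds) {
  const stagingCols = ["optype", "seqno", "uniqno", ...columns];
  const collist = columns.join(",");
  const substlist = columns.map(() => "?").join(",");  // placeholders for bound values
  return {
    load: "COPY staging_" + table + " (" + stagingCols.join(",") + ") FROM '" + csvFile + "'",
    del: "DELETE FROM " + table + " WHERE id IN (" + deleteIds.join(",") + ")",
    insert: "INSERT INTO " + table + " (" + collist + ") VALUES (" + substlist + ")",
  };
}

const stmts = buildStatements("sample", ["id", "message"], "/tmp/sample.csv", [1, 3]);
```

The ids in the delete list are concatenated directly, which is only reasonable for trusted numeric keys; string keys would need proper quoting or binding.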
Filters
Filter Execution
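[Diagram: a replicator pipeline is a series of stages, each running Extract, Filter and Apply. On the MySQL master the binlog is extracted through an in-memory queue into the Transaction History Log, which slave replicators read over tcp/ip.]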
Filter Operation
• Always get one transaction at a time
• Transaction must be processed inline
• Metadata
• Data blocks
• SQL or ROW Info
• Always returns the transaction
JS Filters
• prepare() - called when going online
• filter() - does the work
• release() - called when going offline
• Access to:
• Connection to DB
• Full Java class environment
• Bunch of utility functions
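The three hooks can be sketched as a skeleton. The function names come from the slide; the `seen` counter, the shape of the event objects and the driver code at the bottom are hypothetical and stand in for the replicator, which would normally make these calls.

```javascript
// State kept between calls lives in ordinary variables.
let seen = null;

function prepare() {        // called when going online
  seen = 0;
}

function filter(event) {    // called once per transaction
  seen += 1;
  // Inspect or modify the event in place here, then hand it back.
  return event;
}

function release() {        // called when going offline
  seen = null;
}

// Hypothetical stand-in for the replicator driving the filter.
prepare();
const input = [{ seqno: 1 }, { seqno: 2 }];
const output = input.map((e) => filter(e));
const processed = seen;
release();
```

Each event comes back from `filter()` as the same object it went in as, which matches the rule that a filter works inline on one transaction at a time and always returns it.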
Data Structure
ReplDBMSEvent
• DBMSData: StatementData
• DBMSData: StatementData
• DBMSData: RowChangeData
  • OneRowChange
  • OneRowChange
  • ...
• StatementData
ReplDBMSEvent
• DBMSData: RowChangeData
  • OneRowChange
  • OneRowChange
  • ...
Get/Set Values
    // TypesDATE and TypesTIMESTAMP are assumed to be defined elsewhere in the filter script.
    for (j = 0; j < rowChanges.size(); j++) {
        oneRowChange = rowChanges.get(j);
        columns = oneRowChange.getColumnSpec();
        columnValues = oneRowChange.getColumnValues();
        for (c = 0; c < columns.size(); c++) {
            columnSpec = columns.get(c);
            type = columnSpec.getType();
            // Only date and timestamp columns are checked.
            if (type == TypesDATE || type == TypesTIMESTAMP) {
                for (row = 0; row < columnValues.size(); row++) {
                    values = columnValues.get(row);
                    value = values.get(c);
                    // A zero value in a date column becomes NULL.
                    if (value.getValue() == 0) {
                        value.setValueNull();
                    }
                }
            }
        }
    }
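To try the zero-date logic outside the replicator, the sketch below replaces the Java lists and value holders with small plain-JavaScript stand-ins. The `list`, `cell` and `spec` helpers and the `zeroDatesToNull` wrapper are hypothetical; only the method names (`size`, `get`, `getColumnSpec`, `getColumnValues`, `getType`, `getValue`, `setValueNull`) are taken from the slide. The numeric type codes are those of `java.sql.Types`.

```javascript
// java.sql.Types codes.
const TypesDATE = 91;
const TypesTIMESTAMP = 93;
const TypesINTEGER = 4;

// Plain-JavaScript stand-ins for the Java objects a filter would see.
const list = (items) => ({ size: () => items.length, get: (i) => items[i] });
const spec = (type) => ({ getType: () => type });
const cell = (v) => {
  let value = v;
  return { getValue: () => value, setValueNull: () => { value = null; } };
};

// One row change: an INTEGER column and a DATE column, two rows.
const columns = list([spec(TypesINTEGER), spec(TypesDATE)]);
const rows = list([
  list([cell(0), cell(0)]),               // zero id, zero date
  list([cell(7), cell("2017-05-01")]),
]);
const rowChanges = list([
  { getColumnSpec: () => columns, getColumnValues: () => rows },
]);

// Same traversal as the slide, wrapped in a function.
function zeroDatesToNull(changes) {
  for (let j = 0; j < changes.size(); j++) {
    const one = changes.get(j);
    const specs = one.getColumnSpec();
    const values = one.getColumnValues();
    for (let c = 0; c < specs.size(); c++) {
      const type = specs.get(c).getType();
      if (type !== TypesDATE && type !== TypesTIMESTAMP) continue;
      for (let r = 0; r < values.size(); r++) {
        const value = values.get(r).get(c);
        if (value.getValue() === 0) value.setValueNull();
      }
    }
  }
}

zeroDatesToNull(rowChanges);
```

Only the zero in the DATE column is nulled; the zero in the INTEGER column is left alone because the type check comes first.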
What you can do in a filter
• Anything
Case Study: Building a Kafka Applier
Kafka?
• Message queue/bus
• Full publish/subscribe model
• Hugely flexible
• Very practical
• High performance
• Not a database
Message Format for Data
• Embedded JSON
• CSV Row
• Encoded binary fields
• Message topic?
• Schema/table/primary key?
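One possible answer to these questions, sketched without any Kafka client library: embed the row as JSON, derive the topic from schema and table, and derive the message key from the primary key. The `toMessage` helper, the shape of the `change` object and the one-topic-per-table choice are all assumptions for illustration.

```javascript
// Turn one row change into a { topic, key, value } record.
function toMessage(change) {
  return {
    // One topic per table (an assumption; a single shared topic is equally possible).
    topic: change.schema + "." + change.table,
    // Message key built from the primary key column values.
    key: change.keys.map((k) => String(change.row[k])).join("|"),
    // Embedded JSON carrying the operation, sequence number and row data.
    value: JSON.stringify({
      op: change.op,
      seqno: change.seqno,
      row: change.row,
    }),
  };
}

const msg = toMessage({
  schema: "shop",
  table: "orders",
  op: "I",
  seqno: 42,
  keys: ["id"],
  row: { id: 10, total: 25.5 },
});
```

Keying by primary key means all changes to one row share a key, which is what a consumer needs if it wants to keep per-row ordering.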
Impedance
• What happens with multi-row transactions?
• What happens when a multi-row transaction is not applied?
• Should we split data into chunks?
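If the answer to the chunking question is yes, each chunk needs enough metadata for the consumer to tell when a transaction is complete. The sketch below is one hypothetical scheme (a fragment number plus a last-fragment flag); `splitTransaction` and `isComplete` are illustrative names, and `chunkSize` is assumed to be a positive integer.

```javascript
// Split one multi-row transaction into fixed-size chunks tagged with its seqno.
function splitTransaction(seqno, rows, chunkSize) {
  const total = Math.ceil(rows.length / chunkSize);
  const chunks = [];
  for (let i = 0; i < total; i++) {
    chunks.push({
      seqno,
      fragment: i,
      last: i === total - 1,
      rows: rows.slice(i * chunkSize, (i + 1) * chunkSize),
    });
  }
  return chunks;
}

// A consumer can treat the transaction as applied only once every fragment arrived.
function isComplete(chunks) {
  const sorted = [...chunks].sort((a, b) => a.fragment - b.fragment);
  return (
    sorted.length > 0 &&
    sorted[sorted.length - 1].last &&
    sorted.every((c, i) => c.fragment === i)
  );
}

const chunks = splitTransaction(42, [1, 2, 3, 4, 5], 2);
```

With five rows and a chunk size of two, this produces three chunks, and `isComplete` stays false until the third one, flagged `last`, has been received.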
What we do already
• Sources: MySQL, Oracle
• Targets: MySQL, Oracle, RedShift, Vertica, Hadoop, Text, SQLite, RabbitMQ, S3, MongoDB
What are we adding?
• Sources: REST API Input, MongoDB, Couchbase, CouchDB, PostgreSQL
• Targets: Cassandra, Amazon Athena, Couchbase, CouchDB, ElasticSearch, Flume, Kafka, Native JDBC to Hadoop, PostgreSQL
Where Next
• github.com/continuent/tungsten-replicator
• mcb.guru