Real-Time Loading to Sybase IQ
REPLICATION SERVER —REAL TIME LOADING (RTL) FOR IQ UPDATES
DIAL NUMBERS:
1‐866‐803‐2143
1‐210‐795‐1098
PASSCODE: SYBASE
2 – Company Confidential – June 4, 2012
YOUR HOSTS FOR TODAY
Tom Traubitz, Product Marketing Manager
Bill Zhang, Senior Product Manager
HOUSEKEEPING
Questions?
Submit via the ‘Questions’ tool on your Live Meeting console, or call 1‐866‐803‐2143 (United States) or 1‐210‐795‐1098 (other), password SYBASE. Press *1 during the Q&A segment.
Presentation copies?
Select the printer icon on the Live Meeting console.
REPLICATION SERVER —REAL TIME LOADING (RTL) FOR IQ UPDATES
BILL ZHANG, PRODUCT MANAGEMENT
JUNE 2012
TYPICAL DATA REPLICATION SOLUTIONS
HIGH VOLUME DATA TRANSFER FOR TRANSACTIONAL SYSTEMS
High Availability
• Business Continuity vs. Disaster Recovery
• Zero Risk/Downtime Application Migrations
Real Time Analytics
• Companion Reporting/Analytics Servers
• In‐Memory Real Time Synchronization
Light‐Weight Integration
• Inter‐Application Data Movement
• Security Enclaves (Web, etc.)
Data Distribution
• Global Infrastructures (Peer‐to‐Peer)
• Hierarchical Data Flow (Corporate Roll‐ups/Fan‐out)
AGENDA
• Background & Benefits
– Pre‐RS 15.5 replication to IQ solutions
– Complexity & manual effort
– Performance limitation
• Real‐Time Loading (RTL) Overview
• Real‐Time Loading (RTL) Update
DIRECT REPLICATION
• Data is captured at the source database (transaction log)
• Transactions are applied in order at the target IQ system
DIRECT REPLICATION
Individual inserts of replicated transactions into Sybase IQ
Pros
• Very simple to set up
• Can use whole database replication at the source to minimize complexity
• No custom code or scripts need be used
• All architecture can be designed with PowerDesigner
Cons
• All data is applied to IQ in OLTP, single‐row format
• Very slow: 1‐100 rows per second
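To put the 1‐100 rows per second figure in perspective, here is a small back-of-the-envelope calculation. The change volume of one million rows is a hypothetical number chosen for illustration; only the rate range comes from the slide.

```python
def load_time_seconds(rows: int, rows_per_second: float) -> float:
    """Time needed to apply `rows` single-row operations at a fixed rate."""
    return rows / rows_per_second


ROWS = 1_000_000  # hypothetical change volume, not a figure from the slides

slow = load_time_seconds(ROWS, 1)    # low end of the quoted range
fast = load_time_seconds(ROWS, 100)  # high end of the quoted range

print(f"at   1 row/s:  {slow / 86_400:.1f} days")
print(f"at 100 rows/s: {fast / 3_600:.1f} hours")
```

Even at the top of the quoted range, a million changed rows would take hours to apply one row at a time, which is the motivation for the bulk-loading approaches that follow.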
PRE‐REPLICATION SERVER 15.5/15.6 RTL SOLUTION
• Staging solution
– Replicate to an ASE to “stage” data
– Custom function strings required
– External loading mechanism
– “Secret hand shakes”
STAGED REPLICATION
• Data is captured at the source database (transaction log)
• Data is queued for delayed movement into IQ
• Can be staged in an IMDB or lightweight DBMS implementation
• Data is periodically uploaded into Sybase IQ
– Via an ETL tool, or Insert/Location with Job Scheduler
– May be uploaded on a frequent basis (for example, every 15 minutes)
[Diagram: Replication Agent (continuous capture of changed data) → Replication Server (continuous movement into staging) → Staging Server (ASE/ASA) (scheduled uploads to IQ) → Sybase IQ]
STAGED REPLICATION STEPS
• Data is typically replicated continuously into the staging database so that all source operations are captured
– Inserts, updates, and deletes are compiled to avoid multiple operations on the same rows
– The old primary key and the complete after image of each data row are maintained (optionally with other previous values or system variables such as commit time, user, etc.)
• At scheduled intervals the data is moved into IQ via the bulk loader
– Data is initially inserted into work tables
– Deletes, updates, and inserts are then merged into the schema
• Once the data has been fully applied to IQ, the data in the staging database (and work tables) is removed
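The work-table merge described in these steps can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for both the staging server and Sybase IQ, the table names (`trades`, `trades_del`, `trades_upd`, `trades_ins`) are hypothetical, and plain `INSERT` statements stand in for the IQ bulk loader, which is not shown.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # SQLite is a stand-in, not Sybase IQ
cur = con.cursor()
cur.executescript("""
    CREATE TABLE trades     (id INTEGER PRIMARY KEY, qty INTEGER);  -- target table
    CREATE TABLE trades_del (id INTEGER PRIMARY KEY);               -- work: keys to delete
    CREATE TABLE trades_upd (id INTEGER PRIMARY KEY, qty INTEGER);  -- work: after images
    CREATE TABLE trades_ins (id INTEGER PRIMARY KEY, qty INTEGER);  -- work: new rows
""")
cur.executemany("INSERT INTO trades VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

# Step 1: load the staged changes into the work tables.
cur.executemany("INSERT INTO trades_del VALUES (?)", [(1,)])
cur.executemany("INSERT INTO trades_upd VALUES (?, ?)", [(2, 25)])
cur.executemany("INSERT INTO trades_ins VALUES (?, ?)", [(4, 40)])

# Step 2: merge deletes, updates and inserts into the target schema.
cur.execute("DELETE FROM trades WHERE id IN (SELECT id FROM trades_del)")
cur.execute("""
    UPDATE trades
       SET qty = (SELECT u.qty FROM trades_upd u WHERE u.id = trades.id)
     WHERE id IN (SELECT id FROM trades_upd)
""")
cur.execute("INSERT INTO trades SELECT id, qty FROM trades_ins")

# Step 3: clear the work tables once the batch is fully applied.
for work in ("trades_del", "trades_upd", "trades_ins"):
    cur.execute(f"DELETE FROM {work}")
con.commit()

rows = cur.execute("SELECT id, qty FROM trades ORDER BY id").fetchall()
print(rows)
```

In a real deployment each of these statements would be generated per replicated table and run by a scheduler, which is the custom scripting effort the staged approach requires.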
STAGED REPLICATION
Grouped inserts of replicated transactions to utilize bulk loading
Pros
• Have full control over what data (tables, columns, rows) is moved into IQ
• Can augment source data with other data and perform some cleansing and transformations
• All architecture can be designed with PowerDesigner
• If ASE is your source database, PowerDesigner will create the staging database schema and generate load scripts
Cons
• Requires a staging database/server (ASA, ASE)
• Custom-coded function strings for each table being replicated
• Need custom scripts to move data from the staging area to IQ
RS REAL TIME LOADING (RTL) EDITION
Achieving low latency Real Time Analytics
• Replication Server/Real Time Loading Edition
– Introduced with RS version 15.5
– The ONLY DBMS target supported is Sybase IQ
– Routes into and out of RS/RTL edition are supported
• Source DBMSs Supported
– Sybase ASE (RS/RTL 15.5+)
– Oracle (RS/RTL 15.6+)
BASICS OF RS REAL TIME LOADING EDITION
[Diagram: ASE / Oracle → Primary RS → Replicate RS → IQ. Inside the DSI module of the replicate RS: transaction cache, Group, Compile, CDB, Apply]
1. Outbound queue
2. Transactions are read from the transaction cache and grouped
3. Compile a group
4. Apply the group
RS REAL‐TIME LOADING (RTL) EDITION
RTL performs these steps during replication:
1. Group compile‐able transactions into one
2. Compile commands on a per‐row basis (see next page for compilation rules)
3. Bulk apply compiled commands by table and operation type (insert/delete/update)
– Determine apply order
– Use different bulk interfaces according to the target DB
– Join to get the final result for update/delete
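The compile and bulk-apply steps above can be sketched as a per-row reduction followed by grouping by table and operation type. The compilation rules themselves are not included in this transcript, so the rules encoded here (insert plus update becomes one insert, insert plus delete cancels out, update plus update becomes one update, update plus delete becomes one delete) and the delete, insert, update apply order are assumptions made for illustration, as are all function and table names.

```python
from collections import defaultdict

APPLY_ORDER = ("delete", "insert", "update")  # assumed apply order


def compile_group(commands):
    """Reduce ordered (table, op, pk, values) commands to net changes per row."""
    net = {}  # (table, pk) -> list of net operations for that row
    for table, op, pk, values in commands:
        ops = net.setdefault((table, pk), [])
        last = ops[-1] if ops else None
        if op == "insert":
            ops.append(("insert", values))
        elif op == "update":
            if last and last[0] in ("insert", "update"):
                # Fold the new values into the earlier row image.
                ops[-1] = (last[0], {**last[1], **values})
            else:
                ops.append(("update", values))
        elif op == "delete":
            if last and last[0] == "insert":
                ops.pop()  # insert followed by delete cancels out
            elif last and last[0] == "update":
                ops[-1] = ("delete", None)  # only the delete matters
            else:
                ops.append(("delete", None))
    return net


def bulk_batches(net):
    """Group net changes by (table, operation) and sort them into apply order."""
    grouped = defaultdict(list)
    for (table, pk), ops in net.items():
        for op, values in ops:
            grouped[(table, op)].append((pk, values))
    return sorted(grouped.items(),
                  key=lambda item: (item[0][0], APPLY_ORDER.index(item[0][1])))


commands = [
    ("trades", "insert", 1, {"qty": 10}),
    ("trades", "update", 1, {"qty": 12}),
    ("trades", "insert", 2, {"qty": 5}),
    ("trades", "delete", 2, None),
    ("trades", "update", 3, {"qty": 7}),
    ("trades", "update", 3, {"qty": 9}),
    ("trades", "update", 4, {"qty": 1}),
    ("trades", "delete", 4, None),
]
net = compile_group(commands)
batches = bulk_batches(net)
for (table, op), rows in batches:
    print(table, op, rows)
```

Eight replicated commands collapse into three bulk batches here, one per operation type, which is the effect that lets the target be loaded in bulk rather than row by row.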
RS REAL‐TIME LOADING EDITION
Bulk loading from Sybase ASE without a separate staging database
Pros
• Have full control over what data (tables, columns, rows) is moved into IQ
• Reduced number of external components (no staging database)
• Reduced latency, without the overhead of the staging solution or the poor performance of the direct replication solution
• Simpler maintenance and manageability
Cons
• Data transformation is not supported, and the source and target schemas must be equivalent
SUPPORTED CONFIGURATIONS
Replication Server Editions Needed
• No non‐IQ replicate databases are supported in RTL
– You can, however, replicate directly from ASE to IQ; there are just no non‐IQ targets
• If you need to replicate to both an ASE (e.g. WS) and IQ
– You will need both RS Enterprise Edition and RS/RTL, with a route to RTL
– The same is true to replicate from ASE to heterogeneous targets and Sybase IQ: RS/HE is required, with a route to/from RS/RTL
[Diagram: supported topologies: RS/RTL alone, or RS/EE or RS/HE with a route to RS/RTL]
ORACLE PRIMARY DB FOR RTL
• Oracle Versions supported
– Oracle 10g and 11g
• Oracle database administrator (Oracle DBA), Sybase IQ administrator (IQ DBA), and RS administrator (RSA)
– Object ownership and permissions granted
– See Heterogeneous Replication Quick Start Guide
• Mark Oracle tables for replication
– pdb_setreptable RA command
• See RS 15.6 Oracle to IQ Replication Documentation for step‐by‐step instructions and syntax for each step
KEY UPDATES IN RTL 15.7.1
INCREASED PARALLELISM VIA MULTI‐PATH REPLICATION
MULTI‐PATH REPLICATION (MPR) & RTL
[Diagram: ASO components: DIST Direct Cache Read, NRM Thread, Memory Alloc, HVAR, Block sizes > 16K, Multi‐Path Replication]
• Multi‐Path Replication (MPR) is now available as part of Real Time Loading Edition 15.7.1
UNDERSTAND THE NEED FOR MPR
• RS in the past has been very serialized
– Ensured transaction serialization and integrity (next slide)
• Problem is that this severely hampered performance
– Large transactions by batch users impacted OLTP user transaction latency
– Transactions on different areas of the schema were serialized even though independent of each other
– Independent transactions by different users were serialized
– E.g. the grocery store check‐out lane scenario
– Extremely large transactions could only use a single apply method
• Past workaround attempts
– Parallel DSI: didn’t work well, as transaction grouping often led to contention between threads
– Multiple DSI: worked okay, but only for DSI, and required a non‐standard implementation with confusing TS support clauses
PRE‐RS 15.7 INDUCED SERIALIZATION
Single RepAgent per PDB
Single Route between PRS & RRS
Single DSI connection to RDB
PARALLELISM IN MULTI‐PATH REPLICATION
• Multiple Rep Agents
– Currently a single log scanner (in ASE) but multiple senders, one for each source path defined
• Dedicated Routes
– Key connections have a dedicated route (and resources) vs. the current shared route for all connections
• Multiple DSI
– Multiple independent connections to the same replicate database
MULTI‐PATH REPLICATION IN RS 15.7+
Multiple RepAgent Senders
(still single scanner)
Dedicated RoutePaths
Multiple DSI
Multiple RS from Same Source
TYPES OF MULTI‐PATH REPLICATION (SUPPORTED)
• Schema Subsets (supported in 15.7)
– Different tables/stored procedures are replicated on different paths
– This allows different areas of the schema acted upon by different business functions to have relative independence
  Equity trade table vs. Commodity trade table
  Customer Service vs. Sales
  Audit data vs. transaction data
• User Session (supported in 15.7.1)
– Transactions from different user sessions that can be applied in any order use multiple paths
– This is the grocery check‐out lane situation
– This also will help with large batch jobs
  Several FSI & Healthcare applications leverage 100's of concurrent connections to perform batch processing in order to maximize parallelism on large SMP.
  Advantage over column value hashing is that the hash key doesn't have to appear in every table (as it frequently doesn't).
• Other types will be introduced in future releases
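The two distribution types can be pictured as two different routing functions. This is a conceptual sketch only: in Replication Server the binding of objects or sessions to paths is done through RepAgent and RS configuration, not application code, and the path names, table names, and modulo rule below are hypothetical.

```python
# Schema subsets: each table is bound to one path (hypothetical bindings).
TABLE_PATHS = {
    "equity_trade": "path_equity",
    "commodity_trade": "path_commodity",
}
DEFAULT_PATH = "path_default"


def path_by_table(table: str) -> str:
    """Route by object: unbound tables fall back to the default path."""
    return TABLE_PATHS.get(table, DEFAULT_PATH)


# User session: every transaction from one session uses the same path,
# so ordering is preserved within a session but not across sessions.
SESSION_PATHS = ["path_1", "path_2", "path_3"]


def path_by_session(session_id: int) -> str:
    """Route by session id (the modulo rule is an assumption, not the RS algorithm)."""
    return SESSION_PATHS[session_id % len(SESSION_PATHS)]


print(path_by_table("equity_trade"))   # bound table
print(path_by_table("audit_log"))      # unbound table uses the default
print(path_by_session(7), path_by_session(8))
```

Routing by session needs no key column in the data, which is the advantage over column value hashing noted on the slide.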
USE CASES FOR MULTIPLE DSI
• Usual MPR separation for performance
• Separate DSIs for separate sources
– Corporate rollups, reporting systems, Sybase IQ, etc.
– Improves HVAR/RTL effectiveness, as it prevents transaction grouping from being terminated due to a change in origin
• Separate large volume non‐business data
– Audit data
– Historical tables (e.g. trade_history) during archiving
• Replicate long running stored procedures
– Typically we don't want to do this
  If it ran for 5 hours at the primary, it would run for 5 hours at the replicate
  This creates much more than 5 hours of latency due to serialization
– Now we can
  Create an alternate connection just for long running procs
  Create a proc repdef and subscribe using the alternate connection
  We don't care how long it runs any more
– Note that we don't need MPR RA, etc. for this, just MDSI
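The effect of serialization on the long running procedure case can be illustrated with a toy queue model. The 5-hour procedure comes from the slide; the 100 one-second OLTP transactions queued behind it are a hypothetical workload, and the model ignores everything except apply time.

```python
def finish_times(durations):
    """Completion time of each item when a connection applies them one by one."""
    clock, done = 0, []
    for seconds in durations:
        clock += seconds
        done.append(clock)
    return done


LONG_PROC = 5 * 3600  # seconds; the 5-hour procedure from the slide
OLTP = [1] * 100      # hypothetical: 100 short transactions, 1 second each

# Single DSI connection: the OLTP work queues behind the procedure.
single = finish_times([LONG_PROC] + OLTP)
first_oltp_shared = single[1]

# Alternate connection for the procedure: OLTP has its own connection.
dedicated = finish_times(OLTP)
first_oltp_dedicated = dedicated[0]

print(f"first OLTP transaction applied after {first_oltp_shared} s (shared)")
print(f"first OLTP transaction applied after {first_oltp_dedicated} s (dedicated)")
```

With a dedicated connection the procedure still takes 5 hours at the replicate, but nothing else waits behind it.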
MULTI‐PATH REPLICATION TO SYBASE IQ SUMMARY
• Create multiple connections from Replication Server to the replicate Sybase IQ database to increase replication throughput and performance, and reduce latency and contention.
• MPR to IQ (end to end) works with the following minimum versions
– ASE 15.7
– RS 15.7.1
– IQ 15.1
THANK YOU
FOR MORE INFORMATION
WWW.SYBASE.COM/REPLICATION