Change Data Capture
Yossi Mihailovici
SQL Server Consultant
Israeli SQL Server Users Group
Agenda
• Introduction to CDC
• How does CDC work?
• DDL changes on a tracked table
• CDC objects overview
• What is it useful for?
• Performance issues
• A mechanism that records every DML change (INSERT, UPDATE, DELETE) made to a table
• Changes are captured for each row in the table
• All changes are recorded in a designated table
• This feature is available only in SQL Server 2008 Enterprise
Introduction
• CDC needs to be enabled at the database level and at the table level
• When CDC is enabled for the first time, it automatically generates 2 jobs:
  – Job for capture*
  – Job for cleanup
• The capture job scans the transaction log every 5 seconds (the default polling interval) for changes to the tracked table
• Once a change has been recognized and committed in the log, it’s recorded in a designated table
How does it work?
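The two-level enablement described above can be sketched as a small Python helper that assembles the T-SQL calls. This is a minimal sketch, not the presenter's script: the database, schema and table names are hypothetical, and the procedure names used here (sys.sp_cdc_enable_db and sys.sp_cdc_enable_table) are the ones documented for the released product, which are shorter than the names listed later in this deck.

```python
def quote_literal(value: str) -> str:
    """Return value as a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def enable_cdc_script(database: str, schema: str, table: str) -> str:
    """Build T-SQL that enables CDC on a database, then on one table."""
    # Bracket-quote the database name, escaping any closing bracket.
    db = "[" + database.replace("]", "]]") + "]"
    lines = [
        f"USE {db};",
        "EXEC sys.sp_cdc_enable_db;",          # database level
        "EXEC sys.sp_cdc_enable_table",        # table level
        f"    @source_schema = {quote_literal(schema)},",
        f"    @source_name = {quote_literal(table)},",
        "    @role_name = NULL;  -- NULL: no gating role for the change data",
    ]
    return "\n".join(lines)


# Hypothetical names, used only for illustration.
script = enable_cdc_script("SalesDb", "dbo", "Orders")
print(script)
```

Running the generated script against a test database should create the capture and cleanup jobs mentioned above, assuming SQL Server Agent is running.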
Data Flow
• The tracking table is created when CDC is enabled for a table
• The table is named after the capture instance and is categorized as a system table under the cdc schema
• Like any regular table, it can be queried, and indexes can be created on it
• This table consists of 5 fixed columns + 1 column for each tracked column
Tracking Table
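To make the layout concrete, the sketch below derives the tracking table's name and column list for a tracked table. It assumes the default capture instance name (schema and table joined by an underscore) and the five metadata columns used by SQL Server 2008; the Orders table and its columns are hypothetical.

```python
# The 5 fixed metadata columns of a SQL Server 2008 change table.
FIXED_COLUMNS = [
    "__$start_lsn",    # commit LSN of the transaction that made the change
    "__$end_lsn",      # reserved
    "__$seqval",       # orders changes within one transaction
    "__$operation",    # 1 = delete, 2 = insert, 3/4 = update before/after
    "__$update_mask",  # bit mask of the columns that changed
]


def change_table_name(schema, table, capture_instance=None):
    """Return the tracking table name; default instance is <schema>_<table>."""
    instance = capture_instance or f"{schema}_{table}"
    return f"cdc.{instance}_CT"


def change_table_columns(tracked_columns):
    """Fixed metadata columns first, then one column per tracked column."""
    return FIXED_COLUMNS + list(tracked_columns)


name = change_table_name("dbo", "Orders")
columns = change_table_columns(["OrderId", "Status"])
print(name)
print(columns)
```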
• ALTER COLUMN is cascaded to the tracking table
• When a tracked column is dropped, the tracking table shows NULL values for this column
• Adding a column isn’t supported in the capture instance
• All DDL changes on a tracked table are recorded in cdc.ddl_history
DDL Changes
• System Tables:
  – cdc.captured_columns
  – cdc.change_tables
  – cdc.ddl_history
  – cdc.index_columns
  – cdc.lsn_time_mapping
  – cdc.Schema_Name_CT (change tables)
• DMVs:
  – sys.dm_cdc_log_scan_sessions
  – sys.dm_repl_traninfo
  – sys.dm_cdc_errors
CDC Objects Overview
• System Stored Procedures:
  – sys.sp_cdc_cleanup_change_table
  – sys.sp_cdc_disable_db_change_data_capture
  – sys.sp_cdc_disable_table_change_data_capture
  – sys.sp_cdc_enable_db_change_data_capture
  – sys.sp_cdc_enable_table_change_data_capture
  – sys.sp_cdc_get_ddl_history
  – sys.sp_cdc_get_captured_columns
  – sys.sp_cdc_help_change_data_capture
  – sys.sp_cdc_help_jobs
  – sys.sp_cdc_change_job
CDC Objects Overview
• System Functions:
  – cdc.fn_cdc_get_all_changes_<capture_instance>
  – cdc.fn_cdc_get_net_changes_<capture_instance>
  – sys.fn_cdc_decrement_lsn
  – sys.fn_cdc_get_column_ordinal
  – sys.fn_cdc_get_max_lsn
  – sys.fn_cdc_get_min_lsn
  – sys.fn_cdc_has_column_changed
  – sys.fn_cdc_increment_lsn
  – sys.fn_cdc_is_bit_set
  – sys.fn_cdc_map_lsn_to_time
  – sys.fn_cdc_map_time_to_lsn
CDC Objects Overview
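Rows returned by cdc.fn_cdc_get_all_changes_<capture_instance> carry an operation code in __$operation, and the function's row filter option decides whether the before image of an update is included. The sketch below models that behavior in plain Python on hand-made rows. It is an illustration of the filter semantics, not a call to SQL Server, and the sample rows and the Status column are invented.

```python
# Meaning of the __$operation codes in a change table.
OPERATIONS = {
    1: "delete",
    2: "insert",
    3: "update (before image)",
    4: "update (after image)",
}


def all_changes(rows, row_filter_option="all"):
    """Mimic the row filter of cdc.fn_cdc_get_all_changes_<capture_instance>.

    'all'            -> updates appear only as their after image (code 4)
    'all update old' -> updates appear as both before (3) and after (4)
    """
    if row_filter_option == "all":
        hidden = {3}
    elif row_filter_option == "all update old":
        hidden = set()
    else:
        raise ValueError(f"unknown row filter option: {row_filter_option!r}")
    return [row for row in rows if row["__$operation"] not in hidden]


# Invented sample: one row is inserted, updated once, then deleted.
sample_rows = [
    {"__$operation": 2, "Status": "new"},
    {"__$operation": 3, "Status": "new"},
    {"__$operation": 4, "Status": "paid"},
    {"__$operation": 1, "Status": "paid"},
]

for row in all_changes(sample_rows, "all update old"):
    print(OPERATIONS[row["__$operation"]], row["Status"])
```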
• ETL process to a data warehouse – no need for a “Date Last Change” column
• Trigger replacement
• Auditing
• Snapshot of a table
• Fixing human errors
• Evaluating the workload on a table
• Analyzing data changes
What is it useful for?
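For the ETL case, a common pattern is to remember the last LSN that was loaded and ask only for changes after it, which is what removes the need for a “Date Last Change” column. The sketch below models that windowing in Python. It assumes LSNs are 10-byte big-endian values and that the next LSN is the current value plus one, which is how sys.fn_cdc_increment_lsn is understood to behave; the helper names and sample rows are hypothetical.

```python
LSN_BYTES = 10  # CDC exposes LSNs as binary(10) values


def lsn(n: int) -> bytes:
    """Helper for the example: build a 10-byte LSN from a small integer."""
    return n.to_bytes(LSN_BYTES, "big")


def increment_lsn(value: bytes) -> bytes:
    """Return the next LSN in sequence (model of sys.fn_cdc_increment_lsn)."""
    return (int.from_bytes(value, "big") + 1).to_bytes(LSN_BYTES, "big")


def next_window(last_loaded_lsn: bytes, max_lsn: bytes):
    """Return the inclusive (from, to) LSN range to load, or None if empty."""
    start = increment_lsn(last_loaded_lsn)
    if start > max_lsn:  # equal-length big-endian bytes compare numerically
        return None
    return start, max_lsn


def rows_in_window(rows, window):
    """Select change rows whose __$start_lsn lies inside the window."""
    start, end = window
    return [r for r in rows if start <= r["__$start_lsn"] <= end]


# Invented change rows; the previous ETL run stopped at LSN 5.
change_rows = [
    {"__$start_lsn": lsn(5), "OrderId": 1},
    {"__$start_lsn": lsn(8), "OrderId": 2},
    {"__$start_lsn": lsn(12), "OrderId": 3},
]

window = next_window(last_loaded_lsn=lsn(5), max_lsn=lsn(12))
new_rows = rows_in_window(change_rows, window)
print([r["OrderId"] for r in new_rows])
```

In a real load, the window bounds would come from sys.fn_cdc_get_max_lsn and the stored high-water mark, and the rows from the capture instance's change function.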
• Doesn’t support the new column types introduced in SQL 2008
• Doesn’t support the partition switch operation
• Only 2 capture instances can be created on a table
• The transaction log can’t be truncated as long as there are changes marked for capture
• Capture and cleanup jobs have the same definitions for all capture tables
Limitations
• LOB columns that weren’t updated aren’t shown in the before-update row
• CDC can’t be enabled on another CDC table
• To support net changes tracking, the table must have a primary key or a unique index
• A table tracked by CDC can’t be truncated
Limitations
• CDC overhead falls mostly on the transaction log file and on the tracking table
• To improve performance, it is recommended to place the CDC tables in a separate data file
• If the capture job is disabled, there is still some overhead, but it’s minimal
Effect on Performance
Thank You!