INTRODUCTION TO DATA QUALITY SERVICES
Presentation by Tim Mitchell (Artis Consulting)
www.TimMitchell.net
2
Today’s Agenda
Overview of DQS
Structure
Knowledge Base
DQS Project
Operations
Matching
Cleansing
Administration
SSIS Component
Shortcomings
3
About the Presenter
Tim Mitchell
BI Consultant, Artis Consulting
North Texas SQL Server User Group
SQL Server MVP
Contributing author, MVP Deep Dives Vol 2
Coauthor, SSIS Design Patterns
TimMitchell.net | twitter.com/Tim_Mitchell
4
Housekeeping
Questions
Surveys
5
Overview of Data Quality Services
6
What is DQS?
DQS is a knowledge-driven data cleansing and matching service
Built on top of SQL Server 2012
Simple yet powerful interface
7
What is DQS?
8
What is DQS?
Replaces manual data quality work you’re already doing
Stored procedures
Triggers
Custom applications
9
DQS Structure
10
DQS Structure and Flow
[Diagram: Knowledge Base (Domains, Composite Domains, Matching Policies) with Matching Projects and Cleansing Projects]
11
Knowledge Base
Starting point for data quality provisioning
Uses locally customized data stores or marketplace data sources
Highly reusable and evolutionary
Key elements:
Domains
Matching policies
12
Knowledge Base
Created by:
Knowledge discovery
Domain management
Matching rule
13
Knowledge Base
14
Domains
Domain = data field
Domain rules
Composite domains
Allows greater flexibility in domain rules
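To make the difference between a domain rule and a composite domain rule concrete, here is a minimal Python sketch. It is not DQS code and does not use any DQS API; the field names (Zip, City, State), the rule logic, and the function names are all hypothetical. The point is only that a domain rule looks at one field, while a composite domain rule can relate several fields to each other.

```python
import re

def zip_domain_rule(value: str) -> bool:
    """Single-domain rule (hypothetical): a Zip value must be exactly five digits."""
    return re.fullmatch(r"\d{5}", value) is not None

def city_state_composite_rule(record: dict) -> bool:
    """Composite-domain rule (hypothetical): if City is Dallas, State must be TX.

    A rule like this cannot be expressed against a single field,
    because it relates two fields of the same record.
    """
    if record["City"] == "Dallas":
        return record["State"] == "TX"
    return True

record = {"Zip": "75201", "City": "Dallas", "State": "TX"}
zip_ok = zip_domain_rule(record["Zip"])
pair_ok = city_state_composite_rule(record)
```

In this sketch a record passes only if every single-field rule and every cross-field rule holds, which is the extra flexibility the composite domain bullet refers to.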
15
Data Quality Project
Create interactive projects for data matching and cleansing
Leverage one or more domains in an existing knowledge base
Somewhat reusable
16
Data Quality Project
Nondestructive – no changes to source of data to be cleansed
No changes to the KB either
Separately, DQS project data can be used to improve the knowledge base
17
Data Quality Project
18
DQS Operations
Cleansing
Process data against known entities and domain rules
Similar to Fuzzy Lookup transform in SSIS
Matching
Group data together
Similar to Fuzzy Grouping transform in SSIS
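The two operations can be illustrated with a small Python sketch. This is a conceptual illustration only, not the algorithm DQS uses: it assumes plain string similarity from the standard library's difflib, and the function names and the 0.8 threshold are made up for the example. Cleansing compares each value to a list of known values; matching groups records that resemble each other.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def cleanse(value: str, known_values: list, threshold: float = 0.8) -> str:
    """Return the closest known value if it is similar enough, else the input unchanged."""
    best = max(known_values, key=lambda known: similarity(value, known))
    return best if similarity(value, best) >= threshold else value

def match(records: list, threshold: float = 0.8) -> list:
    """Greedily group records by similarity to the first member of each group."""
    groups = []
    for rec in records:
        for group in groups:
            if similarity(rec, group[0]) >= threshold:
                group.append(rec)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([rec])
    return groups

states = ["Texas", "Oklahoma"]
cleansed = cleanse("Texs", states)
clusters = match(["Acme Corp", "ACME Corp", "Globex"])
```

Note that neither function modifies its input: both return new values, in the same nondestructive spirit as a data quality project.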
19
DQS Administration
Monitor past activity
Set logging options
Set confidence thresholds
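As a rough illustration of what confidence thresholds control, the Python sketch below sorts a proposed correction into one of three outcomes based on its confidence score. The threshold values (0.8 and 0.7), the outcome labels, and the function name are assumptions chosen for the example, not settings taken from the slides.

```python
def classify(confidence: float,
             auto_correct_min: float = 0.8,
             suggest_min: float = 0.7) -> str:
    """Decide what to do with a proposed correction (hypothetical thresholds)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if suggest_min > auto_correct_min:
        raise ValueError("suggest_min must not exceed auto_correct_min")
    if confidence >= auto_correct_min:
        return "auto-correct"      # applied without user review
    if confidence >= suggest_min:
        return "suggest"           # shown to the user for approval
    return "leave for review"      # value is left unchanged

outcomes = [classify(score) for score in (0.9, 0.75, 0.5)]
```

Raising the thresholds makes the process more conservative (fewer automatic changes, more manual review); lowering them does the opposite.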
20
DQS Administration
21
DQS and SSIS
SQL Server Integration Services has an integrated hook into DQS
DQS Cleansing Component
Provides automated, noninteractive data cleansing operations
22
DQS and SSIS
23
Demos
24
Shortcomings
V1 product
No API – must use DQS client interactively
SSIS component only does cleansing
25
Final Thoughts
CU1 performance improvements
http://bit.ly/IKmMow
DQS videos / blogs
http://technet.microsoft.com/en-us/sqlserver/hh780961
My blog (www.TimMitchell.net)
DQS/MDS virtual chapter
masterdata.sqlpass.org
26
Questions?