INTRODUCTION TO DATA QUALITY SERVICES Presentation by Tim Mitchell (Artis Consulting) .

INTRODUCTION TO DATA QUALITY SERVICES

Presentation by Tim Mitchell (Artis Consulting) www.TimMitchell.net

Today’s Agenda

Overview of DQS

Structure

Knowledge Base

DQS Project

Operations

Matching

Cleansing

Administration

SSIS Component

Shortcomings

About the Presenter

Tim Mitchell

BI Consultant, Artis Consulting

North Texas SQL Server User Group

SQL Server MVP

Contributing author, MVP Deep Dives Vol 2

Coauthor, SSIS Design Patterns

TimMitchell.net | twitter.com/Tim_Mitchell

Housekeeping

Questions

Surveys

Overview of Data Quality Services

What is DQS?

DQS is a knowledge-driven data cleansing and matching service

Built on top of SQL Server 2012

Simple yet powerful interface

What is DQS?

Replaces manual data quality work you’re already doing

Stored procedures

Triggers

Custom applications

DQS Structure

DQS Structure and Flow

[Diagram: Knowledge Base (Domains, Composite Domains, Matching Policies); Matching Projects; Cleansing Projects]

Knowledge Base

Starting point for data quality provisioning

Uses locally customized data stores or marketplace data sources

Highly reusable and evolutionary

Key elements:

Domains

Matching policies

Knowledge Base

Created through:

Knowledge discovery

Domain management

Matching rule

Domains

Domain = data field

Domain rules

Composite domains

Allow greater flexibility in domain rules
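
The domain ideas above can be sketched in a few lines of Python. This is only an illustration of the concept, not the DQS object model: the class names (`Domain`, `CompositeDomain`), the field names, and the postal-code rule are all hypothetical. It shows a single-field domain with its own rules, and a composite domain whose rule spans more than one field.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical names throughout; this is not how DQS stores domains.


@dataclass
class Domain:
    """One data field plus the rules a value must satisfy."""
    name: str
    rules: list[Callable[[str], bool]] = field(default_factory=list)

    def is_valid(self, value: str) -> bool:
        return all(rule(value) for rule in self.rules)


@dataclass
class CompositeDomain:
    """Several domains evaluated together, with rules that span fields."""
    domains: list[Domain]
    cross_rules: list[Callable[[dict[str, str]], bool]] = field(default_factory=list)

    def is_valid(self, record: dict[str, str]) -> bool:
        singles_ok = all(d.is_valid(record[d.name]) for d in self.domains)
        return singles_ok and all(rule(record) for rule in self.cross_rules)


country = Domain("Country", [lambda v: len(v) == 2 and v.isupper()])
postal_code = Domain("PostalCode", [lambda v: v.strip() != ""])

# Illustrative cross-field rule: when Country is "US", require a 5-digit code.
address = CompositeDomain(
    [country, postal_code],
    [lambda r: r["Country"] != "US"
     or (len(r["PostalCode"]) == 5 and r["PostalCode"].isdigit())],
)
```

The cross-field rule is the part a single domain cannot express on its own, which is the flexibility composite domains are meant to add.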

Data Quality Project

Create interactive projects for data matching and cleansing

Leverage one or more domains in an existing knowledge base

Somewhat reusable

Data Quality Project

Nondestructive: no changes are made to the source data being cleansed

No changes to the KB either

Separately, DQS project data can be used to improve the knowledge base

DQS Operations

Cleansing

Process data against known entities and domain rules

Similar to Fuzzy Lookup transform in SSIS

Matching

Group data together

Similar to Fuzzy Grouping transform in SSIS
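
The two operations can be illustrated with a rough Python sketch. The slides do not describe the algorithms DQS uses, so this substitutes the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. The function names, the sample values, and the 0.8 cutoff are assumptions made for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stand-in for DQS scoring)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cleanse(value: str, known_values: list[str], min_score: float = 0.8):
    """Cleansing: compare one value against a list of known-good values.

    Returns (best known value, score), or (None, score) when nothing
    is similar enough.
    """
    best = max(known_values, key=lambda k: similarity(value, k))
    score = similarity(value, best)
    return (best if score >= min_score else None, score)


def match(records: list[str], min_score: float = 0.8) -> list[list[str]]:
    """Matching: group records that look like the same entity.

    Greedy approach: a record joins the first cluster whose first member
    is similar enough, otherwise it starts a new cluster.
    """
    clusters: list[list[str]] = []
    for rec in records:
        for cluster in clusters:
            if similarity(rec, cluster[0]) >= min_score:
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


cities = ["Dallas", "Houston", "Austin"]
corrected, score = cleanse("Dalas", cities)
groups = match(["Tim Mitchell", "Tim Mitchel", "Jane Doe"])
```

Cleansing compares each value against reference data, while matching compares the records with each other. The slide's comparison to Fuzzy Lookup and Fuzzy Grouping draws the same distinction.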

DQS Administration

Monitor past activity

Set logging options

Set confidence thresholds
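
As a rough illustration of what a confidence threshold does, the sketch below assumes two cutoffs: one above which a proposed correction is only suggested, and a higher one above which it is applied automatically. The function name, the labels, and the default values are assumptions for the example, not DQS settings taken from the slides.

```python
def classify(score: float,
             suggest_at: float = 0.7,
             auto_correct_at: float = 0.8) -> str:
    """Map a confidence score in [0, 1] to an action (hypothetical labels)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= auto_correct_at:
        return "auto-correct"
    if score >= suggest_at:
        return "suggest"
    return "leave for review"
```

Raising the cutoffs means fewer automatic changes and more manual review. Lowering them does the reverse.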

DQS and SSIS

SQL Server Integration Services has an integrated hook into DQS

DQS Cleansing Component

Provides automated, noninteractive data cleansing operations

Demos

Shortcomings

V1 product

No API – must use DQS client interactively

SSIS component only does cleansing

Final Thoughts

CU1 performance improvements

http://bit.ly/IKmMow

DQS videos / blogs

http://technet.microsoft.com/en-us/sqlserver/hh780961

My blog (www.TimMitchell.net)

DQS/MDS virtual chapter

masterdata.sqlpass.org

Questions?