Data Integration to Data Governance data relationship management.

23
Data Integration to Data Governance data relationship management

Transcript of Data Integration to Data Governance data relationship management.

Page 1: Data Integration to Data Governance data relationship management.

Data Integration to Data Governance

data relationship management

Page 2: Data Integration to Data Governance data relationship management.

2

Data In the News:

Data slipupsRick Whiting , 10-May-2006 Inaccurate business data lead to botched marketing campaigns, failed CRM projects--and angry customers. A home valued at US$121,900 somehow wound up recorded in Porter County's computer system as being worth a whopping US$400 million. Naturally, the figure ended up on documents used to calculate tax rates. By the time the blunder was uncovered in February, the damage was done.

Data slipupsRick Whiting , 10-May-2006 Inaccurate business data lead to botched marketing campaigns, failed CRM projects--and angry customers. A home valued at US$121,900 somehow wound up recorded in Porter County's computer system as being worth a whopping US$400 million. Naturally, the figure ended up on documents used to calculate tax rates. By the time the blunder was uncovered in February, the damage was done.

Page 3: Data Integration to Data Governance data relationship management.

3

Market Forces Affecting the Use of Data

CompetitiveEdge

CustomerPressure

Privacy Regulations

Business Governance

HIPAA, GLBA,PIPEDA, EU DPD

Data Inaccuracies, Over-billing

Straight-through processing, customer service, Consumer confidence

SEC/NAD rule,SARBOX, legalliability, Mergers

Page 4: Data Integration to Data Governance data relationship management.

What are Companies Doing in Response?

Page 5: Data Integration to Data Governance data relationship management.

5

Credit Card Company: Where is the Sensitive Data?

Business Problem:• Risk of a security breach exposes

potential regulatory fines, negative PR and customer backlash

Proposed Solution:• Identify sensitive data flows in

structured databases so critical data can be consolidated and properly secured

Roadblock:• 50 data analysts over 5 years

estimate makes project appear to be unbounded and infeasible

Status:• Project put on hold

Page 6: Data Integration to Data Governance data relationship management.

6

Health Insurance Company: Outsourcing Development

Business Problem:• Data must be sent to India for offshore

application development.• Sensitive data must be masked for

HIPAA compliance

Proposed Solution:• Mask sensitive data before sending it

outside the company

Roadblock:• Sensitive data, where is it?• Can two sets of data that individually

contain no sensitive data be combined to make it sensitive?

Status:• Manual discovery of sensitive data

slows outsourcing to a crawl

Page 7: Data Integration to Data Governance data relationship management.

7

Wall Street Firm: Data Consistency will Increase Profitability

Business Problem:• Transaction errors are expensive and the

risk of regulatory fines due to inconsistent reference data is unacceptable

Proposed Solution:• Deploy a master data management solution

Roadblock:• 5 years to determine the business rules that

relate the master data system to legacy systems

• Unable map two tables to each other after 6 weeks of work (70 tables total to map)

Status:• Project on hold

Page 8: Data Integration to Data Governance data relationship management.

8

Auto Insurance: Migrating Fragile Legacy Integration Code to Modern Tools

Business Problem:• Business changes force expensive and

difficult to implement changes in hand written legacy integration code

Proposed Solution:• Migrate legacy code to a modern ETL

(extract, transform, load) tool. Cost of maintenance of ETL is a fraction of legacy code

Roadblock:• No one knows the code. The cost of

migration is unpredictable.

Status• Company continues to manually change

hand written code ad hoc as the business demands

Page 9: Data Integration to Data Governance data relationship management.

9

Project Timeline

T= 0Internal Security

Consistency/MasterData

Integration

?

The Common “?” in the Project Schedule

Data Relationship Discovery• You have to know where your data is, how

it flows and relates across systems if you hope to secure it, move it, consolidate, integrate it ...

Data Relationship Discovery• You have to know where your data is, how

it flows and relates across systems if you hope to secure it, move it, consolidate, integrate it ...

Page 10: Data Integration to Data Governance data relationship management.

Don’t We Know Our Own Data?

Page 11: Data Integration to Data Governance data relationship management.

11

Myth #1: “We know our data”

• Subject matter experts (SMEs) only know their own systems

• But they can’t tell you how it changes and is transformed as it moves from system to system

• Relationships between systems are complex:

• SMEs sometimes change jobs!

I’m a professional. Of course I know

my data!

I’m a professional. Of course I know

my data!

I’m going to start my own consulting firm

I’m going to start my own consulting firm

But, once it leaves my hands, it is someone

else’s problem!

But, once it leaves my hands, it is someone

else’s problem!

Wow, that transformation is complex. Are you sure that

is in my data?

Wow, that transformation is complex. Are you sure that

is in my data?

Page 12: Data Integration to Data Governance data relationship management.

12

Myth #2: “We know our data”

• Business rules are broken all the time as data crosses business and system boundaries:• 83 year old man in system A is

a “youthful driver” in system B• Bond yield is listed as 5% in

system X and 5.3% in system Y

• Exceptions result in lost revenue, customer dissatisfaction, and regulatory fines

All of my data follows the business rules for

this system!

All of my data follows the business rules for

this system!

Page 13: Data Integration to Data Governance data relationship management.

13

Myth #3: “We know our data”

• Business rules change as organizations change• Mergers and Acquisitions• New products or services• Products/services are retired• Reorganizations• New IT systems are added

I can’t keep up with all the acquisitions and reorganizations. They mess up the way systems work together. It is very inconvenient.

I can’t keep up with all the acquisitions and reorganizations. They mess up the way systems work together. It is very inconvenient.

Page 14: Data Integration to Data Governance data relationship management.

14

The Reality

Companies lack a global view of their corporate

data map

Page 15: Data Integration to Data Governance data relationship management.

15

Current Trend: Data Governance

What is it?• The latest over-hyped term

• Data Integration is to Tactical as Data Governance is to Strategic

Definition• Data Governance encompasses the people, processes and

procedures to create a consistent, enterprise view of your data in order to:• Improve data security• Increase consistency & confidence in decision making• Decrease the risk of regulatory fines

Page 16: Data Integration to Data Governance data relationship management.

16

The Problem with Data Governance

• How do you do it?• Where is the sensitive data?• What are the business rules and data relationships• Where are the exceptions?

• How do you ensure a consistent, repeatable process?

Page 17: Data Integration to Data Governance data relationship management.

17

20th Century Data

Relationship Discovery Tool

Traditional Proposed Approach: Metadata

What is it?• Another over-hyped term• Data about data: datatype (character,

integer, number, date etc), column width, frequency, cardinality etc

The Problem • Single system metadata only:

• Profiling• Traditional data integration tools do not

discover metadata • Cleansing, ETL, EAI and EII

The Reality• Data analysts manually examine data

values to figure out the data map• The most sophisticated tool generally

used today is:

Traditional Data Relationship Discovery Tool

Page 18: Data Integration to Data Governance data relationship management.

There is a Better Way

Page 19: Data Integration to Data Governance data relationship management.

19

• New approach to a 40 year old problem

• Sophisticated heuristics and algorithms analyze actual data values

• Automates the discoveryand validation of:

• Sensitive data flows• Business rules • Complex transformations

between structured data sets in a consistent and repeatable manner

The Solution: Data-Driven Relationship Discovery

Page 20: Data Integration to Data Governance data relationship management.

20

Transformation

Hit Rate = 90%Application A Application B

B_Y Bond_Yield0.053 5300.062 6200.071 7100.034 3400.055 5500.072 7200.055 5500.067 6700.056 580

0.06 600

ApplicationA.BY * 10000 = ApplicationB.Bond_Yield

Solution: Data-Driven Exception & Discrepancy Discovery

• Identify exceptions to avoid:• Regulatory fines • Lost revenue • Customer dissatisfaction

Exception

Exception

Transformation

Hit Rate = 90%Application A Application B

AGE Youthful_Driver17 Y24 Y55 N28 N40 N33 N83 Y29 N36 N42 N

CASE WHEN AGE <=25 THEN Youthful_Driver = ‘Y’ ELSE ‘N’ END

Page 21: Data Integration to Data Governance data relationship management.

21

Data-Driven Discovery Results

Credit Card CompanyStatus: Project moving forward again

• Reduced estimated effort from 250 engineering years to 25 eng. years

• Eliminated project feasibility risk

Wall Street FirmStatus: Back on track

• Over 5x (2 days vs 6 weeks manually) improvement in discovery of business rules made MDM project possible

• Found bond yield discrepancies

Auto Insurance CompanyStatus: Predictable & affordable migration

• 80% reduction in effort required to migrate hand-code to ETL tool

• Mapping process discovered potentially costly business rule errors

Health Insurance CompanyStatus: Outsourcing rollout accelerated

• Now confident in sensitive data discovery accuracy and speed

• Launching new data masking service companywide

Page 22: Data Integration to Data Governance data relationship management.

22

Summary: Data Governance = Strategic Data Integration

• Companies are implementing data governance projects to:• Improve Security• Increase Consistency• Decrease Regulatory Risk

• First step of data governance… Discovery

• Automated data-driven discovery is a consistent, repeatable and proven approach to identify:• Sensitive Data• Business Rules • Data Exceptions

Page 23: Data Integration to Data Governance data relationship management.

23

Key Contacts

Bob Shannon: U.S. East Coast SalesPhone: (203) 878-8472Email: [email protected]

Brian Smogard: U.S. Central SalesPhone: (612) 605-9236Email: [email protected]

Clive Harrison: U.S. West Coast and International SalesPhone: 415-608-4632Email: [email protected]

If you have any other follow up questions, contact me:

Todd Goldman Phone: (408) 919-0191 ext 1115Email: [email protected]