Data Integration to Data Governance data relationship management.
-
Upload
lisa-merritt -
Category
Documents
-
view
216 -
download
2
Transcript of Data Integration to Data Governance data relationship management.
Data Integration to Data Governance
data relationship management
2
Data In the News:
Data slipupsRick Whiting , 10-May-2006 Inaccurate business data lead to botched marketing campaigns, failed CRM projects--and angry customers. A home valued at US$121,900 somehow wound up recorded in Porter County's computer system as being worth a whopping US$400 million. Naturally, the figure ended up on documents used to calculate tax rates. By the time the blunder was uncovered in February, the damage was done.
Data slipupsRick Whiting , 10-May-2006 Inaccurate business data lead to botched marketing campaigns, failed CRM projects--and angry customers. A home valued at US$121,900 somehow wound up recorded in Porter County's computer system as being worth a whopping US$400 million. Naturally, the figure ended up on documents used to calculate tax rates. By the time the blunder was uncovered in February, the damage was done.
3
Market Forces Affecting the Use of Data
CompetitiveEdge
CustomerPressure
Privacy Regulations
Business Governance
HIPAA, GLBA,PIPEDA, EU DPD
Data Inaccuracies, Over-billing
Straight-through processing, customer service, Consumer confidence
SEC/NAD rule,SARBOX, legalliability, Mergers
What are Companies Doing in Response?
5
Credit Card Company: Where is the Sensitive Data?
Business Problem:• Risk of a security breach exposes
potential regulatory fines, negative PR and customer backlash
Proposed Solution:• Identify sensitive data flows in
structured databases so critical data can be consolidated and properly secured
Roadblock:• 50 data analysts over 5 years
estimate makes project appear to be unbounded and infeasible
Status:• Project put on hold
6
Health Insurance Company: Outsourcing Development
Business Problem:• Data must be sent to India for offshore
application development.• Sensitive data must be masked for
HIPAA compliance
Proposed Solution:• Mask sensitive data before sending it
outside the company
Roadblock:• Sensitive data, where is it?• Can two sets of data that individually
contain no sensitive data be combined to make it sensitive?
Status:• Manual discovery of sensitive data
slows outsourcing to a crawl
7
Wall Street Firm: Data Consistency will Increase Profitability
Business Problem:• Transaction errors are expensive and the
risk of regulatory fines due to inconsistent reference data is unacceptable
Proposed Solution:• Deploy a master data management solution
Roadblock:• 5 years to determine the business rules that
relate the master data system to legacy systems
• Unable map two tables to each other after 6 weeks of work (70 tables total to map)
Status:• Project on hold
8
Auto Insurance: Migrating Fragile Legacy Integration Code to Modern Tools
Business Problem:• Business changes force expensive and
difficult to implement changes in hand written legacy integration code
Proposed Solution:• Migrate legacy code to a modern ETL
(extract, transform, load) tool. Cost of maintenance of ETL is a fraction of legacy code
Roadblock:• No one knows the code. The cost of
migration is unpredictable.
Status• Company continues to manually change
hand written code ad hoc as the business demands
9
Project Timeline
T= 0Internal Security
Consistency/MasterData
Integration
?
The Common “?” in the Project Schedule
Data Relationship Discovery• You have to know where your data is, how
it flows and relates across systems if you hope to secure it, move it, consolidate, integrate it ...
Data Relationship Discovery• You have to know where your data is, how
it flows and relates across systems if you hope to secure it, move it, consolidate, integrate it ...
Don’t We Know Our Own Data?
11
Myth #1: “We know our data”
• Subject matter experts (SMEs) only know their own systems
• But they can’t tell you how it changes and is transformed as it moves from system to system
• Relationships between systems are complex:
• SMEs sometimes change jobs!
I’m a professional. Of course I know
my data!
I’m a professional. Of course I know
my data!
I’m going to start my own consulting firm
I’m going to start my own consulting firm
But, once it leaves my hands, it is someone
else’s problem!
But, once it leaves my hands, it is someone
else’s problem!
Wow, that transformation is complex. Are you sure that
is in my data?
Wow, that transformation is complex. Are you sure that
is in my data?
12
Myth #2: “We know our data”
• Business rules are broken all the time as data crosses business and system boundaries:• 83 year old man in system A is
a “youthful driver” in system B• Bond yield is listed as 5% in
system X and 5.3% in system Y
• Exceptions result in lost revenue, customer dissatisfaction, and regulatory fines
All of my data follows the business rules for
this system!
All of my data follows the business rules for
this system!
13
Myth #3: “We know our data”
• Business rules change as organizations change• Mergers and Acquisitions• New products or services• Products/services are retired• Reorganizations• New IT systems are added
I can’t keep up with all the acquisitions and reorganizations. They mess up the way systems work together. It is very inconvenient.
I can’t keep up with all the acquisitions and reorganizations. They mess up the way systems work together. It is very inconvenient.
14
The Reality
Companies lack a global view of their corporate
data map
15
Current Trend: Data Governance
What is it?• The latest over-hyped term
• Data Integration is to Tactical as Data Governance is to Strategic
Definition• Data Governance encompasses the people, processes and
procedures to create a consistent, enterprise view of your data in order to:• Improve data security• Increase consistency & confidence in decision making• Decrease the risk of regulatory fines
16
The Problem with Data Governance
• How do you do it?• Where is the sensitive data?• What are the business rules and data relationships• Where are the exceptions?
• How do you ensure a consistent, repeatable process?
17
20th Century Data
Relationship Discovery Tool
Traditional Proposed Approach: Metadata
What is it?• Another over-hyped term• Data about data: datatype (character,
integer, number, date etc), column width, frequency, cardinality etc
The Problem • Single system metadata only:
• Profiling• Traditional data integration tools do not
discover metadata • Cleansing, ETL, EAI and EII
The Reality• Data analysts manually examine data
values to figure out the data map• The most sophisticated tool generally
used today is:
Traditional Data Relationship Discovery Tool
There is a Better Way
19
• New approach to a 40 year old problem
• Sophisticated heuristics and algorithms analyze actual data values
• Automates the discoveryand validation of:
• Sensitive data flows• Business rules • Complex transformations
between structured data sets in a consistent and repeatable manner
The Solution: Data-Driven Relationship Discovery
20
Transformation
Hit Rate = 90%Application A Application B
B_Y Bond_Yield0.053 5300.062 6200.071 7100.034 3400.055 5500.072 7200.055 5500.067 6700.056 580
0.06 600
ApplicationA.BY * 10000 = ApplicationB.Bond_Yield
Solution: Data-Driven Exception & Discrepancy Discovery
• Identify exceptions to avoid:• Regulatory fines • Lost revenue • Customer dissatisfaction
Exception
Exception
Transformation
Hit Rate = 90%Application A Application B
AGE Youthful_Driver17 Y24 Y55 N28 N40 N33 N83 Y29 N36 N42 N
CASE WHEN AGE <=25 THEN Youthful_Driver = ‘Y’ ELSE ‘N’ END
21
Data-Driven Discovery Results
Credit Card CompanyStatus: Project moving forward again
• Reduced estimated effort from 250 engineering years to 25 eng. years
• Eliminated project feasibility risk
Wall Street FirmStatus: Back on track
• Over 5x (2 days vs 6 weeks manually) improvement in discovery of business rules made MDM project possible
• Found bond yield discrepancies
Auto Insurance CompanyStatus: Predictable & affordable migration
• 80% reduction in effort required to migrate hand-code to ETL tool
• Mapping process discovered potentially costly business rule errors
Health Insurance CompanyStatus: Outsourcing rollout accelerated
• Now confident in sensitive data discovery accuracy and speed
• Launching new data masking service companywide
22
Summary: Data Governance = Strategic Data Integration
• Companies are implementing data governance projects to:• Improve Security• Increase Consistency• Decrease Regulatory Risk
• First step of data governance… Discovery
• Automated data-driven discovery is a consistent, repeatable and proven approach to identify:• Sensitive Data• Business Rules • Data Exceptions
23
Key Contacts
Bob Shannon: U.S. East Coast SalesPhone: (203) 878-8472Email: [email protected]
Brian Smogard: U.S. Central SalesPhone: (612) 605-9236Email: [email protected]
Clive Harrison: U.S. West Coast and International SalesPhone: 415-608-4632Email: [email protected]
If you have any other follow up questions, contact me:
Todd Goldman Phone: (408) 919-0191 ext 1115Email: [email protected]