5 Steps To Master Data Management
Five Steps to Mastering Master Data Management
Ron Lewis
November 19, 2009
Presentation Overview
• Introduction
• What is Master Data Management?
• The 5 Steps for Master Data Management:
  • Discovery – finding all of the data sources, who they are used by, and how they are used
  • Analysis – identifying authoritative sources, discrepancies, and candidates for consolidation
  • Design – designing the metadata repository
  • Implementation – implementing the metadata repository
  • Establishing data governance
• Leveraging technology to facilitate:
  • Business Process and Data Modeling
  • Data Governance and Discovery
  • Metadata Repository Implementation
  • Metadata Management
• Presentation Focus: The Discovery and Analysis Phases
Master Data Management
• Master Data is: principal business data essential for conducting business
• MDM provides an enterprise perspective on the critical business processes and the data necessary to support them
• Bottom line: improve decision making
• Core Tasks:
  • Building the business process models
  • Data governance (standardizing data – nomenclature, domains, data quality, and consumption rules)
  • Synchronizing related operational systems using the data
  • Integrating/reconciling disparate data silos to provide a single enterprise view
  • Building and managing an enterprise metadata repository
• Challenge: Must shift thinking to the enterprise perspective
Discovery Phase
• Step 1 – Discovery:
  • Capturing and modeling the essential business processes
  • Mapping processes to the data necessary to complete each process successfully
  • Identifying data sources and gathering appropriate metadata
• Primary Challenges:
  • Cost – it's expensive and disruptive
  • Gaining executive leadership support ("You mean we don't have this already?")
• Solution:
  • Start with what's most important
  • What's important should be obvious
Discovery Phase
• Involve your infrastructure and/or security personnel
• Iteration I: Capture existing data and schemas
  • Find your database servers, their respective owners, and access
  • Reverse engineer your physical data models
  • Build a master data dictionary and catalog (see the sketch after this list)
• Iteration II: Profile existing applications to capture business logic
  • Database centric: ETL, stored procedures, and triggers
  • Application source code and user behavior
• Tools You'll Need:
  • Infrastructure/security tools (Nessus)
  • Data modeling and profiling tools (ER/Studio Data Architect, DB Optimizer)
  • Application profiling tools (NitroSecurity APM)
  • Repository to manage the metadata byproducts
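The deck performs this step with ER/Studio's reverse engineering. As a rough, tool-agnostic illustration of the same idea, the sketch below pulls column metadata from a database's information_schema into a CSV that could seed the master data dictionary. The PostgreSQL source, psycopg2 driver, DSN, and output filename are all assumptions made for the example, not part of the presentation.

```python
# Minimal data-dictionary sketch (assumptions: a PostgreSQL source and the
# psycopg2 driver; the deck itself uses ER/Studio reverse engineering).
import csv
import psycopg2

DSN = "host=dbserver dbname=sales user=auditor"  # hypothetical connection string

QUERY = """
    SELECT table_schema, table_name, column_name, data_type, is_nullable
    FROM information_schema.columns
    WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY table_schema, table_name, ordinal_position
"""

with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
    cur.execute(QUERY)
    rows = cur.fetchall()

# Write a first-cut master data dictionary that analysts can annotate.
with open("data_dictionary.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["schema", "table", "column", "type", "nullable"])
    writer.writerows(rows)

print(f"Cataloged {len(rows)} columns")
```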
Infrastructure / Security Tooling
Use ER/Studio to Reverse Engineer
Reverse Engineer Physical Schemas
Example Reverse Engineered Model
Start Building Master Data Catalog
Exporting Catalog for Sharing
Discovery – Profiling Data Use
• Biggest Challenges We're Solving:
  • Reconciling and integrating disparate "data silos" into a central location
  • Identifying duplicative data elements (or attributes)
  • Laying the foundation for identifying which of the data sources contain the actual "source data"
• A high percentage of business logic is encapsulated as programming logic:
  • Stored procedure and trigger code stored in the database
  • Application source code
  • Extract, Transform, and Load (ETL) scripts
• We need visibility into this logic, and we need to be able to store it somewhere (a sketch of one approach follows below)
• Tools necessary for this:
  • DSAuditor and DB Optimizer or Performance Center (to capture live data use)
  • Source code analyzers (I like Fortify SCA and Embarcadero JBuilder)
  • Profile ETL using Embarcadero's MetaWizard (usually by converting ETL to XML)
  • Store metadata using ER/Studio Data Architect's data lineage and transform rules support
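The deck captures this logic with DSAuditor, DB Optimizer, and source code analyzers. As a minimal illustration of making stored-procedure logic visible, the sketch below reads routine definitions from a database catalog and notes which tables each routine references. The PostgreSQL source, psycopg2 driver, DSN, and the naive name matching are assumptions for the example.

```python
# Sketch: capture stored-procedure logic from the database catalog and note
# which tables each routine touches (assumptions: PostgreSQL and psycopg2).
import re
import psycopg2

DSN = "host=dbserver dbname=sales user=auditor"  # hypothetical

with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
    cur.execute("""
        SELECT table_name FROM information_schema.tables
        WHERE table_schema = 'public'
    """)
    tables = [r[0] for r in cur.fetchall()]

    cur.execute("""
        SELECT routine_name, routine_definition
        FROM information_schema.routines
        WHERE routine_schema = 'public' AND routine_definition IS NOT NULL
    """)
    routines = cur.fetchall()

# Naive cross-reference: which routines mention which tables by name.
for name, body in routines:
    hits = [t for t in tables if re.search(rf"\b{re.escape(t)}\b", body, re.I)]
    if hits:
        print(f"{name}: {', '.join(sorted(hits))}")
```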
Profiling Data Use with DBOptimizer
Analysis Phase
• Step 2 – Analysis:
  • Identifying authoritative sources, discrepancies, and candidates for consolidation
  • Evaluating data flow and transform rules
  • Capturing/defining synonyms and assigning aliases
  • Setting the foundation for data governance
• Primary Challenges:
  • Cost – it's time consuming and a "team effort"
  • Getting ancillary information that teams don't want to share
• Solution:
  • Start with what's most important
  • What's important should be obvious
Analysis Phase
• Iteration I: Evaluate ETL for data lineage and transform rules
  • Start by reverse engineering the ETL, converting it to XML
  • Incorporate it into the repository
• Iteration II: Identify synonymous elements and build an alias list
  • Evaluate data domains and transform rules for issues such as state and use
  • Enlist database and development staff to identify aliases and tag the data elements in the master catalog (a tagging sketch follows below)
• Tools You'll Need:
  • Data modeling tools (ER/Studio and MetaWizard)
  • Repository to manage the metadata byproducts (ER/Studio)
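In the deck, tagging happens inside ER/Studio. Purely to make the idea concrete, here is a minimal in-memory shape for a tagged catalog entry with aliases; all field names are illustrative assumptions, not anything the presentation defines.

```python
# Sketch: a minimal shape for tagged master-catalog entries with aliases.
# Field names are invented; the deck manages this metadata in ER/Studio.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    source: str                        # e.g. "sales_db.public.customers"
    element: str                       # column or attribute name
    aliases: set = field(default_factory=set)
    tags: set = field(default_factory=set)

entry = CatalogEntry("sales_db.public.customers", "cust_no")
entry.aliases.update({"customer_id", "client_number"})
entry.tags.add("authoritative-source-candidate")
print(entry)
```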
Analysis Phase – Evaluating ETL
• Biggest Challenges We're Solving:
  • Finding which data source is feeding what other data sources
  • Collecting data lineage metadata
  • Making it accessible to the right team members
• Convert the ETL to a form that allows manipulation (such as XML); a sketch follows below
• Import the metadata into the data modeling tool
• Build, publish, and control access to your master data repository
• Start gathering and applying metadata tags
• Tools necessary for this:
  • MetaWizard
  • ER/Studio Data Architect (or the like)
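MetaWizard performs the actual ETL-to-XML conversion in the deck. To show roughly what such an intermediate lineage form can look like, the sketch below serializes two hand-written source-to-target mappings to XML with Python's standard library; the element names, attribute names, and transform rules are invented for illustration.

```python
# Sketch: serialize hand-written lineage records to XML, roughly the kind of
# intermediate form the deck describes MetaWizard producing. Names are assumed.
import xml.etree.ElementTree as ET

# (source table.column, target table.column, transform rule applied)
mappings = [
    ("crm.cust_no", "warehouse.customer_id", "cast to integer"),
    ("crm.state_cd", "warehouse.state", "uppercase; validate against ISO list"),
]

root = ET.Element("lineage")
for src, dst, rule in mappings:
    m = ET.SubElement(root, "mapping", source=src, target=dst)
    ET.SubElement(m, "transform").text = rule

ET.indent(root)  # pretty-print; available in Python 3.9+
ET.ElementTree(root).write("lineage.xml", encoding="utf-8", xml_declaration=True)
```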
Data Lineage and Transform Rules
Setting the Foundation for Governance
Analysis Phase – Identifying Synonyms
• Biggest Challenges We're Solving:
  • Identifying like data elements and candidates for consolidation
  • Building aliases
  • Establishing the foundation for data governance
• Evaluate data nomenclature using tool functions such as Merge and Compare to identify the obvious overlaps (a name-matching sketch follows below)
• Compare descriptors from database staff
• Compare data use and consumption rules derived from tools such as DB Optimizer
• Tools necessary for this:
  • ER/Studio Data Architect (or the like)
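ER/Studio's Compare and Merge functions do this work in the deck. As a standard-library approximation of the same overlap hunt, the sketch below fuzzy-matches column names across two schemas; the sample column names and the 0.75 similarity cutoff are assumptions.

```python
# Sketch: flag likely synonymous column names across two schemas using fuzzy
# string matching (stand-in for ER/Studio Compare/Merge; threshold is assumed).
from difflib import SequenceMatcher
from itertools import product

crm_columns = ["cust_no", "cust_name", "state_cd", "phone_num"]
warehouse_columns = ["customer_id", "customer_name", "state", "telephone"]

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

for a, b in product(crm_columns, warehouse_columns):
    score = similarity(a, b)
    if score >= 0.75:  # arbitrary cutoff; tune per naming conventions
        print(f"possible synonym: {a} <-> {b} ({score:.2f})")
```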
Performing Analysis With Compare Utility
Exporting to Excel for Input into Database
Candidates for Consolidation
Step 3 – Building the Metadata Repository
• Populating the repository with the right metadata
• Establishing and controlling access to the metadata
• Performing metadata management
• Primary Challenges:
  • Defining who needs access to what metadata
  • Establishing the rules of use
• Suggestions:
  • Implement a change control and auditing tool (a minimal audit-log sketch follows below)
  • What's important should be obvious
  • Understand the value of the metadata to profitability
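The deck suggests a change control and auditing tool (Change Manager appears in the tools list later). As a bare-bones illustration of the auditing idea only, the sketch below appends one JSON record per metadata change; the file name and record fields are assumptions.

```python
# Sketch: a minimal append-only audit trail for metadata changes. The deck
# uses a dedicated tool for this; fields and file name here are assumed.
import json
import time

AUDIT_FILE = "metadata_audit.jsonl"  # hypothetical location

def record_change(user: str, element: str, action: str, detail: str) -> None:
    """Append one immutable audit record per metadata change."""
    entry = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "user": user,
        "element": element,
        "action": action,
        "detail": detail,
    }
    with open(AUDIT_FILE, "a") as f:
        f.write(json.dumps(entry) + "\n")

record_change("rlewis", "sales_db.customers.cust_no",
              "alias-added", "tagged as synonym of warehouse.customer_id")
```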
Step 4 – Implementing the Repository
• Mapping the metadata to the requisite business processes
• Leveraging the metadata to determine candidates for business process re-engineering
• Primary Challenges:
  • Getting the processes down in modeled form
  • Obtaining middle management and senior leadership buy-in to changes identified by the metadata
• Suggestions:
  • Leverage a modeling tool that facilitates data-to-process mapping (integrated metadata); a small sketch follows below
  • Focus on what's most important to the business; try not to focus on EVERYTHING
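The deck keeps data-to-process mapping inside an integrated modeling tool. Purely to illustrate why that mapping is useful, the sketch below inverts a small process-to-data map to answer impact questions; the process names and data elements are invented for the example.

```python
# Sketch: map business processes to the data elements they depend on, then
# invert the map for impact analysis. All names here are invented examples.
from collections import defaultdict

process_to_data = {
    "order-entry": ["customers.cust_no", "orders.order_id", "products.sku"],
    "invoicing": ["customers.cust_no", "orders.order_id", "gl.account_code"],
    "shipping": ["orders.order_id", "warehouse.bin_location"],
}

# Invert the mapping: data element -> processes that consume it.
data_to_process = defaultdict(list)
for process, elements in process_to_data.items():
    for element in elements:
        data_to_process[element].append(process)

# Impact analysis: changing cust_no affects these processes.
print(data_to_process["customers.cust_no"])  # ['order-entry', 'invoicing']
```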
Step 5 – Establishing Data Governance
• All of the above steps lay the foundation for good data governance
• Get senior leadership to stipulate policy enforcing the rules you've derived
• Build a plan and standardize iteratively (don't try to fix everything all at once)
• Primary Challenges:
  • Fundamental opposition to change
  • Maintaining momentum
• Suggestions:
  • Find a quick kill: tackle the biggest organizational problem you can handle (a rule-check sketch follows below)
  • Focus on what's most important to the business, and on what drives easily visible ROI
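Governance rules in the deck are enforced through policy and tooling. As a minimal illustration of what an automated check of derived rules can look like, the sketch below validates records against two invented nomenclature/domain rules; the rule set and record fields are assumptions.

```python
# Sketch: enforce a couple of derived governance rules (format and domain
# checks) over incoming records. Rules and fields are invented examples.
import re

VALID_STATES = {"CA", "NY", "TX", "VA"}  # assumed controlled domain

RULES = {
    "cust_no": lambda v: bool(re.fullmatch(r"\d{6}", v)),  # six-digit ID
    "state_cd": lambda v: v in VALID_STATES,               # domain check
}

def violations(record: dict) -> list:
    """Return the fields in a record that break a governance rule."""
    return [f for f, check in RULES.items()
            if f in record and not check(record[f])]

print(violations({"cust_no": "00123", "state_cd": "ZZ"}))
# -> ['cust_no', 'state_cd']
```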
Summary
• What We Covered:
  • Defined Master Data and Master Data Management
  • The 5 Steps for Master Data Management:
    • Discovery – finding all of the data sources, who they are used by, and how they are used
    • Analysis – identifying authoritative sources, discrepancies, and candidates for consolidation
    • Design – designing the metadata repository
    • Implementation – implementing the metadata repository
    • Establishing data governance
  • Demonstrated how to leverage specific technology to facilitate:
    • Business Process and Data Modeling
    • Data Governance and Discovery
    • Metadata Repository Implementation
    • Metadata Management
Questions and Answers
• Tools Discussed:
  • Nessus
  • ER/Studio Data Architect / Business Architect and ER/Studio Repository
  • DB Optimizer
  • Change Manager
• Technologies Discussed:
  • Building the Data Catalog
  • Capturing and Storing Metadata
  • Metadata Analysis
• Contact Info: Ron Lewis, [email protected]