5 Steps To Master Data Management
Five Steps to Mastering Master Data Management
Ron Lewis
November 19, 2009
Presentation Overview
• Introduction
• What is Master Data Management?
• The 5 Steps for Master Data Management:
  • Discovery – finding all of the data sources, who they are used by, and how they are used
  • Analysis – identifying authoritative sources, discrepancies, and candidates for consolidation
  • Design – designing the metadata repository
  • Implementation – implementing the metadata repository
  • Establishing data governance
• Leveraging technology to facilitate:
  • Business Process and Data Modeling
  • Data Governance and Discovery
  • Metadata Repository Implementation
  • Metadata Management
• Presentation Focus: The Discovery and Analysis Phases
Master Data Management
• Master Data is: principal business data essential for conducting business
• MDM provides an enterprise perspective on the critical business processes and the data necessary to support them
• Bottom line: improve decision making
• Core Tasks:
  • Building the business process models
  • Data governance (standardizing data – nomenclature, domains, data quality, and consumption rules)
  • Synchronizing related operational systems using the data
  • Integrating/reconciling disparate data silos to provide a single enterprise view
  • Building and managing an enterprise metadata repository
• Challenge: Must shift thinking to the enterprise perspective
Discovery Phase
• Step 1 – Discovery:
  • Capturing and modeling the essential business processes
  • Mapping processes to the data necessary to complete each process successfully
  • Identifying data sources and gathering appropriate metadata
• Primary Challenges:
  • Cost – it's expensive and disruptive
  • Gaining executive leadership support ("You mean we don't have this already?")
• Solution:
  • Start with what's most important
  • What's important should be obvious
Discovery Phase
• Involve your infrastructure and/or security personnel
• Iteration I: Capture existing data and schemas
  • Find your database servers, their respective owners, and access
  • Reverse engineer your physical data models
  • Build a master data dictionary and catalog (see the sketch after this list)
• Iteration II: Profile existing applications to capture business logic
  • Database centric: ETL, stored procedures, and triggers
  • Application source code and user behavior
• Tools You'll Need:
  • Infrastructure/security tools (Nessus)
  • Data modeling and profiling tools (ER/Studio Data Architect, DB Optimizer)
  • Application profiling tools (NitroSecurity APM)
  • Repository to manage the metadata byproducts
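The deck performs this step with ER/Studio's reverse engineering. As a rough, tool-agnostic illustration of the same idea, the sketch below pulls column metadata from a database's information_schema into a CSV that could seed the master data dictionary. The PostgreSQL source, psycopg2 driver, DSN, and output filename are all assumptions made for the example, not part of the presentation.

```python
# Minimal data-dictionary sketch (assumptions: a PostgreSQL source and the
# psycopg2 driver; the deck itself uses ER/Studio reverse engineering).
import csv
import psycopg2

DSN = "host=dbserver dbname=sales user=auditor"  # hypothetical connection string

QUERY = """
    SELECT table_schema, table_name, column_name, data_type, is_nullable
    FROM information_schema.columns
    WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY table_schema, table_name, ordinal_position
"""

with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
    cur.execute(QUERY)
    rows = cur.fetchall()

# Write a first-cut master data dictionary that analysts can annotate.
with open("data_dictionary.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["schema", "table", "column", "type", "nullable"])
    writer.writerows(rows)

print(f"Cataloged {len(rows)} columns")
```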
Infrastructure / Security Tooling
Use ER/Studio to Reverse Engineer
Reverse Engineer Physical Schemas
Example Reverse Engineered Model
Start Building Master Data Catalog
Exporting Catalog for Sharing
Discovery – Profiling Data Use
• Biggest Challenges We're Solving:
  • Reconciling and integrating disparate "data silos" into a central location
  • Identifying duplicative data elements (or attributes)
  • Laying the foundation for identifying which of the data sources contain the actual "source data"
• A high percentage of business logic is encapsulated as programming logic:
  • Stored procedure and trigger code stored in the database
  • Application source code
  • Extract, Transform, and Load (ETL) scripts
• We need visibility into this logic, and we need to be able to store it somewhere (a sketch of one approach follows below)
• Tools necessary for this:
  • DSAuditor and DB Optimizer or Performance Center (to capture live data use)
  • Source code analyzers (I like Fortify SCA and Embarcadero JBuilder)
  • Profile ETL using Embarcadero's MetaWizard (usually by converting ETL to XML)
  • Store metadata using ER/Studio Data Architect's data lineage and transform rules support
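The deck captures this logic with DSAuditor, DB Optimizer, and source code analyzers. As a minimal illustration of making stored-procedure logic visible, the sketch below reads routine definitions from a database catalog and notes which tables each routine references. The PostgreSQL source, psycopg2 driver, DSN, and the naive name matching are assumptions for the example.

```python
# Sketch: capture stored-procedure logic from the database catalog and note
# which tables each routine touches (assumptions: PostgreSQL and psycopg2).
import re
import psycopg2

DSN = "host=dbserver dbname=sales user=auditor"  # hypothetical

with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
    cur.execute("""
        SELECT table_name FROM information_schema.tables
        WHERE table_schema = 'public'
    """)
    tables = [r[0] for r in cur.fetchall()]

    cur.execute("""
        SELECT routine_name, routine_definition
        FROM information_schema.routines
        WHERE routine_schema = 'public' AND routine_definition IS NOT NULL
    """)
    routines = cur.fetchall()

# Naive cross-reference: which routines mention which tables by name.
for name, body in routines:
    hits = [t for t in tables if re.search(rf"\b{re.escape(t)}\b", body, re.I)]
    if hits:
        print(f"{name}: {', '.join(sorted(hits))}")
```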
Profiling Data Use with DBOptimizer
Analysis Phase
• Step 2 – Analysis:
  • Identifying authoritative sources, discrepancies, and candidates for consolidation
  • Evaluating data flow and transform rules
  • Capturing/defining synonyms and assigning aliases
  • Setting the foundation for data governance
• Primary Challenges:
  • Cost – it's time consuming and a "team effort"
  • Getting ancillary information that teams don't want to share
• Solution:
  • Start with what's most important
  • What's important should be obvious
Analysis Phase
• Iteration I: Evaluate ETL for data lineage and transform rules
  • Start by reverse engineering the ETL, converting it to XML
  • Incorporate it into the repository
• Iteration II: Identify synonymous elements and build an alias list
  • Evaluate data domains and transform rules for issues such as state and use
  • Enlist database and development staff to identify aliases and tag the data elements in the master catalog (a tagging sketch follows below)
• Tools You'll Need:
  • Data modeling tools (ER/Studio and MetaWizard)
  • Repository to manage the metadata byproducts (ER/Studio)
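In the deck, tagging happens inside ER/Studio. Purely to make the idea concrete, here is a minimal in-memory shape for a tagged catalog entry with aliases; all field names are illustrative assumptions, not anything the presentation defines.

```python
# Sketch: a minimal shape for tagged master-catalog entries with aliases.
# Field names are invented; the deck manages this metadata in ER/Studio.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    source: str                        # e.g. "sales_db.public.customers"
    element: str                       # column or attribute name
    aliases: set = field(default_factory=set)
    tags: set = field(default_factory=set)

entry = CatalogEntry("sales_db.public.customers", "cust_no")
entry.aliases.update({"customer_id", "client_number"})
entry.tags.add("authoritative-source-candidate")
print(entry)
```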
Analysis Phase – Evaluating ETL
• Biggest Challenges We're Solving:
  • Finding which data source is feeding what other data sources
  • Collecting data lineage metadata
  • Making it accessible to the right team members
• Convert the ETL to a form that allows manipulation (such as XML); a sketch follows below
• Import the metadata into the data modeling tool
• Build, publish, and control access to your master data repository
• Start gathering and applying metadata tags
• Tools necessary for this:
  • MetaWizard
  • ER/Studio Data Architect (or the like)
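MetaWizard performs the actual ETL-to-XML conversion in the deck. To show roughly what such an intermediate lineage form can look like, the sketch below serializes two hand-written source-to-target mappings to XML with Python's standard library; the element names, attribute names, and transform rules are invented for illustration.

```python
# Sketch: serialize hand-written lineage records to XML, roughly the kind of
# intermediate form the deck describes MetaWizard producing. Names are assumed.
import xml.etree.ElementTree as ET

# (source table.column, target table.column, transform rule applied)
mappings = [
    ("crm.cust_no", "warehouse.customer_id", "cast to integer"),
    ("crm.state_cd", "warehouse.state", "uppercase; validate against ISO list"),
]

root = ET.Element("lineage")
for src, dst, rule in mappings:
    m = ET.SubElement(root, "mapping", source=src, target=dst)
    ET.SubElement(m, "transform").text = rule

ET.indent(root)  # pretty-print; available in Python 3.9+
ET.ElementTree(root).write("lineage.xml", encoding="utf-8", xml_declaration=True)
```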
Data Lineage and Transform Rules
Setting the Foundation for Governance
Analysis Phase – Identifying Synonyms
• Biggest Challenges We're Solving:
  • Identifying like data elements and candidates for consolidation
  • Building aliases
  • Establishing the foundation for data governance
• Evaluate data nomenclature using tool functions such as Merge and Compare to identify the obvious overlaps (a name-matching sketch follows below)
• Compare descriptors from database staff
• Compare data use and consumption rules derived from tools such as DB Optimizer
• Tools necessary for this:
  • ER/Studio Data Architect (or the like)
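ER/Studio's Compare and Merge functions do this work in the deck. As a standard-library approximation of the same overlap hunt, the sketch below fuzzy-matches column names across two schemas; the sample column names and the 0.75 similarity cutoff are assumptions.

```python
# Sketch: flag likely synonymous column names across two schemas using fuzzy
# string matching (stand-in for ER/Studio Compare/Merge; threshold is assumed).
from difflib import SequenceMatcher
from itertools import product

crm_columns = ["cust_no", "cust_name", "state_cd", "phone_num"]
warehouse_columns = ["customer_id", "customer_name", "state", "telephone"]

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

for a, b in product(crm_columns, warehouse_columns):
    score = similarity(a, b)
    if score >= 0.75:  # arbitrary cutoff; tune per naming conventions
        print(f"possible synonym: {a} <-> {b} ({score:.2f})")
```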
Performing Analysis With Compare Utility
Exporting to Excel for Input into Database
Candidates for Consolidation
Step 3 – Building the Metadata Repository
• Populating the repository with the right metadata
• Establishing and controlling access to the metadata
• Performing metadata management
• Primary Challenges:
  • Defining who needs access to what metadata
  • Establishing the rules of use
• Suggestions:
  • Implement a change control and auditing tool (a minimal audit-log sketch follows below)
  • What's important should be obvious
  • Understand the value of the metadata to profitability
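The deck suggests a change control and auditing tool (Change Manager appears in the tools list later). As a bare-bones illustration of the auditing idea only, the sketch below appends one JSON record per metadata change; the file name and record fields are assumptions.

```python
# Sketch: a minimal append-only audit trail for metadata changes. The deck
# uses a dedicated tool for this; fields and file name here are assumed.
import json
import time

AUDIT_FILE = "metadata_audit.jsonl"  # hypothetical location

def record_change(user: str, element: str, action: str, detail: str) -> None:
    """Append one immutable audit record per metadata change."""
    entry = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "user": user,
        "element": element,
        "action": action,
        "detail": detail,
    }
    with open(AUDIT_FILE, "a") as f:
        f.write(json.dumps(entry) + "\n")

record_change("rlewis", "sales_db.customers.cust_no",
              "alias-added", "tagged as synonym of warehouse.customer_id")
```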
Step 4 – Implementing the Repository
• Mapping the metadata to the requisite business processes
• Leveraging the metadata to determine candidates for business process re-engineering
• Primary Challenges:
  • Getting the processes down in modeled form
  • Obtaining middle management and senior leadership buy-in to changes identified by the metadata
• Suggestions:
  • Leverage a modeling tool that facilitates data-to-process mapping (integrated metadata); a small sketch follows below
  • Focus on what's most important to the business; try not to focus on EVERYTHING
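The deck keeps data-to-process mapping inside an integrated modeling tool. Purely to illustrate why that mapping is useful, the sketch below inverts a small process-to-data map to answer impact questions; the process names and data elements are invented for the example.

```python
# Sketch: map business processes to the data elements they depend on, then
# invert the map for impact analysis. All names here are invented examples.
from collections import defaultdict

process_to_data = {
    "order-entry": ["customers.cust_no", "orders.order_id", "products.sku"],
    "invoicing": ["customers.cust_no", "orders.order_id", "gl.account_code"],
    "shipping": ["orders.order_id", "warehouse.bin_location"],
}

# Invert the mapping: data element -> processes that consume it.
data_to_process = defaultdict(list)
for process, elements in process_to_data.items():
    for element in elements:
        data_to_process[element].append(process)

# Impact analysis: changing cust_no affects these processes.
print(data_to_process["customers.cust_no"])  # ['order-entry', 'invoicing']
```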
Step 5 – Establishing Data Governance
• All of the above steps lay the foundation for good data governance
• Get senior leadership to stipulate policy enforcing the rules you've derived
• Build a plan and standardize iteratively (don't try to fix everything all at once)
• Primary Challenges:
  • Fundamental opposition to change
  • Maintaining momentum
• Suggestions:
  • Find a quick kill: tackle the biggest organizational problem you can handle (a rule-check sketch follows below)
  • Focus on what's most important to the business, and on what drives easily visible ROI
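Governance rules in the deck are enforced through policy and tooling. As a minimal illustration of what an automated check of derived rules can look like, the sketch below validates records against two invented nomenclature/domain rules; the rule set and record fields are assumptions.

```python
# Sketch: enforce a couple of derived governance rules (format and domain
# checks) over incoming records. Rules and fields are invented examples.
import re

VALID_STATES = {"CA", "NY", "TX", "VA"}  # assumed controlled domain

RULES = {
    "cust_no": lambda v: bool(re.fullmatch(r"\d{6}", v)),  # six-digit ID
    "state_cd": lambda v: v in VALID_STATES,               # domain check
}

def violations(record: dict) -> list:
    """Return the fields in a record that break a governance rule."""
    return [f for f, check in RULES.items()
            if f in record and not check(record[f])]

print(violations({"cust_no": "00123", "state_cd": "ZZ"}))
# -> ['cust_no', 'state_cd']
```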
Summary
• What We Covered:
  • Defined Master Data and Master Data Management
  • The 5 Steps for Master Data Management:
    • Discovery – finding all of the data sources, who they are used by, and how they are used
    • Analysis – identifying authoritative sources, discrepancies, and candidates for consolidation
    • Design – designing the metadata repository
    • Implementation – implementing the metadata repository
    • Establishing data governance
  • Demonstrated how to leverage specific technology to facilitate:
    • Business Process and Data Modeling
    • Data Governance and Discovery
    • Metadata Repository Implementation
    • Metadata Management
Questions and Answers
• Tools Discussed:
  • Nessus
  • ER/Studio Data Architect / Business Architect and ER/Studio Repository
  • DB Optimizer
  • Change Manager
• Technologies Discussed:
  • Building the Data Catalog
  • Capturing and Storing Metadata
  • Metadata Analysis
• Contact Info: Ron Lewis, [email protected]