593 Managing Enterprise Data Quality Using SAP Information Steward
Managing Enterprise Data Quality
using SAP Information Steward
Vinny Ahuja, Cheryl Johnson
Intel Corporation
SESSION CODE: BI593
Disclaimer
This presentation is for informational purposes only. INTEL MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN
THIS SUMMARY.
Software and workloads used in performance tests may have been optimized for performance only on Intel
microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer
systems, components, software, operations and functions. Any change to any of those factors may cause the results to
vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products.
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
For a list of Intel trademarks, go to http://legal.intel.com/Trademarks/NamesDb.htm
* Other names and brands may be claimed as the property of others.
Copyright © 2015, Intel Corporation. All rights reserved.
Data Quality (DQ) challenges within the information pipeline for Business Intelligence (BI)
Provide visibility into DQ issues within a heterogeneous landscape
Role of Information Steward in addressing DQ
Share implementation experience
Use DQ tool as a regression test tool
Learning Points
About Intel
Data quality starts with systems of record
Data movement can introduce data quality issues
Don’t wait for customers to find data quality issues
Build instrumentation into the pipeline to monitor quality
Business Intelligence Information Pipeline
Source Systems → Operational Data Store (ODS) → Extract, Transform & Load (ETL) → Enterprise Data Warehouse (EDW) → Data marts → BI Platforms
Accuracy
Data was entered or derived correctly as measured by a physical assessment
Completeness
Data is not missing
Consistency
Data that should be the same in various systems is, in fact, the same
Timeliness
Data is available for use when the business requires it
Validity
Data conforms to business rules (constraints)
Key Data Characteristics (Dimensions)
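As a rough illustration of how four of these dimensions could be expressed as measurable checks, the sketch below scores a handful of made-up records. The record layout, the `VALID_COUNTRIES` rule, and all function names are assumptions for illustration only and are not taken from the presentation or from Information Steward. Accuracy is left out because, as defined above, it needs a physical assessment rather than a computed check.

```python
from datetime import date

# Hypothetical records from a source system and a downstream copy.
source = [
    {"id": 1, "country": "US", "email": "a@example.com", "loaded": date(2015, 5, 1)},
    {"id": 2, "country": "XX", "email": None, "loaded": date(2015, 5, 1)},
    {"id": 3, "country": "DE", "email": "c@example.com", "loaded": date(2015, 4, 28)},
]
warehouse = {1: "US", 2: "XX", 3: "FR"}  # id -> country as stored downstream
VALID_COUNTRIES = {"US", "DE", "FR"}     # assumed business rule


def pct(passed, total):
    """Percentage of records that passed, rounded to one decimal."""
    return round(100.0 * passed / total, 1) if total else 100.0


def completeness(rows, field):
    """Share of rows where the field is not missing."""
    return pct(sum(r[field] is not None for r in rows), len(rows))


def validity(rows):
    """Share of rows whose country satisfies the business rule."""
    return pct(sum(r["country"] in VALID_COUNTRIES for r in rows), len(rows))


def consistency(rows, target):
    """Share of rows whose country matches the downstream copy."""
    return pct(sum(target.get(r["id"]) == r["country"] for r in rows), len(rows))


def timeliness(rows, needed_by):
    """Share of rows loaded on or before the date the business needs them."""
    return pct(sum(r["loaded"] <= needed_by for r in rows), len(rows))
```

Each function returns a percentage, so the results can feed directly into a scorecard.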
Lack of ownership/accountability
Incomplete or no checks during data entry
Heterogeneous platforms
Purchased and homegrown applications
Limited or no documentation
Limited resources
Run vs grow the business
Mergers & acquisitions
What Makes Managing Data Quality Hard?
Managing Data Quality
Processes to Assess, Define, Monitor and Improve Data Quality
Discover & Understand Data
Define
Deploy
Monitor & Remediate
•Data Ownership, Roles & Responsibilities
•Data specifications
•Data quality requirements
•Workflows with R&R for accountability to resolve data quality issues
•Analyze Monitor results
•Execute workflows to fix DQ issues
•Assess/Profile Data
•Assess Risks and Impact
•Catalog Data Assets
•Governance processes
•Operational processes
•DQ Audit and Monitors
Analyst, Data Steward, Product Data Manager (PdM)
Analyst, Data Steward, PdM
Enterprise Data
Analyst, Data Steward, PdM
Data Steward, Analyst, Developer
DQ Management Capability Stack
Data Sources (ERP, MDM, CRM, DW, Data Marts)
Data Access Layer
Data Profiler Rules Engine
Audit Results Repository
Reporting
Analysis
Events
Notifications
Metadata Repository
Workflow
Engine
Analyst, Data Steward, Product Data Manager (PdM)
Data Steward,
DQ Management with Information Steward
Source Systems → ODS → ETL → EDW → Data marts → BI Platforms
• Data Validation Rules
• Data Profiles Setup
• DQ Scorecards
• DQ Monitor Tolerances
• Tasks and Notifications
• Accuracy
• Completeness
• Consistency
• Integrity
• Validity
DQ Metrics Repository
Data Profiles Data Fallouts DQ Metrics & Scorecards
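To make the relationship between validation rules, fallouts, scores, and monitor tolerances concrete, here is a minimal sketch. It does not use or imitate the Information Steward API; `evaluate_rule`, `status`, the sample rows, and the 70/90 tolerance values are all hypothetical.

```python
def evaluate_rule(rows, rule):
    """Apply one validation rule; return (score_percent, fallout_rows)."""
    fallouts = [r for r in rows if not rule(r)]
    if not rows:
        return 100.0, fallouts
    score = 100.0 * (len(rows) - len(fallouts)) / len(rows)
    return score, fallouts


def status(score, low, high):
    """Map a score to a scorecard color using assumed low/high tolerances."""
    if score >= high:
        return "GREEN"
    if score >= low:
        return "YELLOW"
    return "RED"


# Hypothetical address rows and a completeness-style rule.
addresses = [
    {"key": "A100", "postal_code": "97124"},
    {"key": "A200", "postal_code": ""},
    {"key": "A300", "postal_code": "85226"},
    {"key": "A400", "postal_code": "95054"},
]


def has_postal_code(row):
    return bool(row["postal_code"])


score, fallouts = evaluate_rule(addresses, has_postal_code)
card_status = status(score, low=70.0, high=90.0)
```

The fallout rows are what a data steward would act on, while the score and status are what a scorecard or notification would report.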
Data security
Standards
Systems landscape
Roles & responsibilities
Development lifecycle (migration)
Dashboards, alerts and notifications
Production support
Training
Upgrades
Rolling Out the DQ Management Tool
Need to protect data from unauthorized use
Master Data, Sales, Procurement
Projects organized by subject areas (Data Taxonomy) or business process/function
Separate projects for business users and operations
Data stewards approve access to data
Data Organization and Security
• Naming standards:
• Connections: SAPCRM_ChnlMgmt
• Views: VW0012_ADRC join Channel_Mgmt.v_addr
• Rules: LR0012 SAP ADRC PK not in EDW addr
• Tasks: CHNL RT0012 VW0012 ADRC compare addr
• Emphasis on names and detailed descriptions that resonate with business or operations users
• Documentation within the tool
Naming Standards
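Naming standards are easier to enforce when they can be checked automatically at design review. The sketch below shows one way to do that. The regular expressions are guesses inferred only from the four example names above (for instance, a `VW` or `LR` prefix followed by four digits); they are not the presenters' actual standard.

```python
import re

# Assumed patterns, inferred from the example names on this slide.
NAME_PATTERNS = {
    "connection": re.compile(r"^[A-Za-z0-9]+_[A-Za-z0-9]+$"),
    "view": re.compile(r"^VW\d{4}_\w+"),
    "rule": re.compile(r"^LR\d{4} \S"),
    "task": re.compile(r"^[A-Z]+ RT\d{4} VW\d{4} \S"),
}


def follows_standard(kind, name):
    """Return True if the object name matches the assumed pattern for its kind."""
    pattern = NAME_PATTERNS.get(kind)
    if pattern is None:
        raise ValueError(f"unknown object kind: {kind}")
    return pattern.match(name) is not None


def nonconforming(objects):
    """Return the (kind, name) pairs that violate the assumed patterns."""
    return [(kind, name) for kind, name in objects if not follows_standard(kind, name)]
```

A design review could run `nonconforming` over every object in a project before it is migrated to production.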
Systems Landscape
PF (Pathfinding) → DEV (Development) → QA (Test) → BM (Benchmark)* → PROD (Production)
* If necessary
Data Sources
Biz or IT Developer
Support Analyst
Separation of duties in support of audit requirements
Those who write monitors cannot deploy in production
Monitors developed and tested using non-production systems
Migrations to production, though manual, are handled by a separate role (support analyst)
Support analyst and tool administrator are separate roles and individuals
Changes migrate from non-production to production
Changes directly in production on exception basis
Roles & Responsibilities
Initial Engagement
•Meet with prospects to understand requirements
•Assess if tool is the right fit for requirements
•Tool limitations can be a show stopper for some scenarios
Development
•Assign a mentor to project team, share best practices, standards
•Document requirements for data, views, rules, bindings, schedules, thresholds
•Build DQ Monitor – data sources, views, rules, bindings, tasks, dashboards
Deployment
•Conduct a design review (quality assurance check)
•Migrate to production (support analyst), configure notifications
•Schedule tasks per the documented schedules (support analyst)
Improvements
•Project team makes changes and tests in development and test environments
•Project team requests changes to be migrated to production
•Support analyst migrates changes to production
Development Methodology
3 - 4 Weeks
1 - 3 Days*
* Green Period
Need for history of records with DQ issues
Required custom solution to report historical records
Getting to Historical DQ Issue Records
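The presentation does not describe the custom solution. One plausible approach is to append each run's failed records to a history table, so that a record's issues can be reported across runs. The following is a minimal sketch under that assumption, using Python's standard `sqlite3` module. The table name `dq_issue_history`, its columns, and the function names are all hypothetical.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) the hypothetical history store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dq_issue_history ("
        "run_date TEXT, rule_name TEXT, record_key TEXT, detail TEXT)"
    )
    return conn


def record_failures(conn, run_date, rule_name, failures):
    """Append one run's failed records; failures is a list of (key, detail)."""
    conn.executemany(
        "INSERT INTO dq_issue_history VALUES (?, ?, ?, ?)",
        [(run_date, rule_name, key, detail) for key, detail in failures],
    )
    conn.commit()


def history_for(conn, record_key):
    """Return every logged issue for one record, oldest run first."""
    cur = conn.execute(
        "SELECT run_date, rule_name, detail FROM dq_issue_history "
        "WHERE record_key = ? ORDER BY run_date",
        (record_key,),
    )
    return cur.fetchall()
```

Because rows are only ever appended, the table keeps the history even when each monitor run reports only its own latest failures.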
The information pipeline comprises multiple platforms
One or more platforms get a software or hardware upgrade at least once per year
Each platform upgrade requires end-to-end testing of information pipeline
Was the flow of data complete and consistent after upgrade?
Reality of Platform Upgrades
Source Systems → Operational Data Store (ODS) → Extract, Transform & Load (ETL) → Enterprise Data Warehouse (EDW) → Data marts → BI Platforms
DQ monitors validate data is complete and consistent within and across data repositories
DQ monitors are early detectors of data issues for critical business processes
An upgrade of a platform requires regression testing; a DQ monitor can do that job
Validate data is complete and consistent between source and target repository after platform upgrade
Eliminate need to maintain test data sets, test scripts for individual platforms
Test parts of pipeline or the entire pipeline using existing DQ monitors
DQ Monitors – Regression Test Suite
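The source-to-target validation described above can be pictured as a keyed comparison run after an upgrade. The sketch below is an illustration only, not the presenters' monitor. It assumes each repository has already been read into a dictionary keyed by primary key, and `compare_repositories` is a hypothetical name.

```python
def compare_repositories(source, target):
    """Compare two repositories given as dicts of primary key -> row tuple.

    Completeness: every source key is present in the target.
    Consistency: rows present in both carry the same values.
    """
    missing = sorted(source.keys() - target.keys())
    unexpected = sorted(target.keys() - source.keys())
    mismatched = sorted(
        key for key in source.keys() & target.keys() if source[key] != target[key]
    )
    return {
        "missing_in_target": missing,
        "unexpected_in_target": unexpected,
        "mismatched": mismatched,
        "passed": not (missing or unexpected or mismatched),
    }


# Hypothetical address rows before and after an upgraded load.
sap_adrc = {"A100": ("Hillsboro", "US"), "A200": ("Chandler", "US"), "A300": ("Munich", "DE")}
edw_addr = {"A100": ("Hillsboro", "US"), "A300": ("Muenchen", "DE")}

result = compare_repositories(sap_adrc, edw_addr)
```

Running the same comparison before and after an upgrade, against live data, is what removes the need for separately maintained test data sets.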
Trust in the data has significantly improved and more focus can be directed to value added activities
In one scenario improved DQ from 73% to 93% in 1 quarter
Enabled streamlining for metrics process
Reduction in recovery time and activities during data excursions
Monitors take out the guesswork on what the issue is and what the resolution needs to be
Recover in 2-4 hours, instead of 2-3 days
Return On Investment
BI information pipeline is a good place to start with DQ Monitors
Showcase value to those in business responsible for data
Design for data security, separation of roles and enforce standards
Use data profiling to troubleshoot data issues within the data pipeline, especially in production
Use DQ monitors as regression test suite for the information pipeline
Best Practices
Monitors with no ownership for action are a waste of resources
Having metrics helps get the necessary focus on data quality
Partner with those championing data governance to derive value from IT investments
A major data crisis will get the right attention; seize the moment
Monitors are valuable during platform upgrades
Set development lifecycle expectations early in the engagement
Key Learnings
THANK YOU FOR PARTICIPATING
Please provide feedback on this session by completing a short survey via the event mobile application.
For ongoing education on this area of focus, visit www.ASUG.com