Avoiding the Data Baggage Trap - Modern...
© 2010 Quest Software, Inc. ALL RIGHTS RESERVED
Avoiding the Data Baggage Trap: A practical approach to mapping and traceability
By Omar Masri
May 2011
Objective
2
About me
• Chief architect of Quest Software’s BI solutions
• 15+ years development and consulting experience
• Specialties
– Data migrations
– BI solutions
– Project Recovery
– Product innovation & solution design
Objective
3
• Get a better understanding of the Analyst role
• Identify projects that are data baggage hotspots
• Learn 6 easy principles to deal with data baggage
Objectives
4
• Background: define data, mapping, traceability, the role of the analyst, and data baggage
• Project data baggage hot spots
• The 6 Principles
• Tooling
• Future proofing your data processes
• Questions
Outline
5
The world has a zettabyte worth of data.
1,000,000,000,000,000,000,000 bytes = 10^21 bytes
What is next?
A yottabyte
Background: Facts about Data
6
Background
Definition of Data
7
Data and analysts both
exist to support
production systems
and
decision making processes.
Background: Definition of Data
8
The existence of data and analysts alone
does not ensure
that processes and systems can be seamlessly,
• implemented
• maintained
• changed
• used
Background: Definition of Data
9
Background
Definition of Data
10
How well data supports systems and processes
depends on:
• how usable, or meaningful, a dataset is
(mapping, scrubbing, and transforming)
• how well the processes that create the meaning and the dataset are understood
(traceability)
Background: Definition of Data
11
Mapping is a translation operation
that transforms raw data
into a form that is
meaningful
to another process or system.
Background: Definition of Mapping
12
Raw Data
38
1.07367
10-Aug-2010
93000
400
A005
Without context, data is meaningless
Background: Definition of Mapping - Example 1: Giving Data Meaning
13
Raw Data      Context                Information
38            Age                    A person’s age
1.07367       AUDUSD for X date      Currency pair spot price
10-Aug-2010   Sales date             Date of a sales transaction
93000         Number of SAP tables   List of tables that potentially need to be mapped for conversion
A005          Customer table         Table we need to convert
Data + context = useful Information
Useful to whom and to what?
Background: Definition of Mapping - Example 1: Giving Data Meaning
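The "data + context = information" idea can be sketched in code. This is only an illustration: the `Context` class, the `give_meaning` function, and the choice of parsers are assumptions, not part of the presentation. Only the context labels and descriptions come from the table in the slide.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Context:
    """Hypothetical context record: what a raw value means and how to read it."""
    label: str                    # the "Context" column of the slide
    information: str              # the "Information" column of the slide
    parse: Callable[[str], Any]   # assumed parser, not from the slides


def parse_sales_date(raw: str):
    # Assumes English month abbreviations, as in "10-Aug-2010".
    return datetime.strptime(raw, "%d-%b-%Y").date()


AGE = Context("Age", "A person's age", int)
SPOT = Context("AUDUSD for X date", "Currency pair spot price", float)
SALES_DATE = Context("Sales date", "Date of a sales transaction", parse_sales_date)


def give_meaning(raw: str, context: Context) -> Dict[str, Any]:
    """Combine a raw value with its context to produce usable information."""
    return {
        "raw": raw,
        "context": context.label,
        "information": context.information,
        "value": context.parse(raw),
    }


if __name__ == "__main__":
    for raw, ctx in [("38", AGE), ("1.07367", SPOT), ("10-Aug-2010", SALES_DATE)]:
        print(give_meaning(raw, ctx))
```

The same raw string "38" would parse just as happily as a quantity or a code, which is the point of the slide: the value only becomes information once someone records what it is.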
14
Mapping is done at different levels to ensure the required information is created
Background: Definition of Mapping
Data
Physical model
Logical model
Processes
Systems
15
Background
Definition of the Analyst Role
16
How well an analyst supports processes & systems depends on:
• Their skills: communication, technical, social
• Their level of understanding of the processes and systems that are involved
• Their ability to give meaning at multiple levels
Background: Definition of the Analyst Role
17
Physical to Physical mapping
Background: Definition of the Analyst Role - Example 2 – Adding more value
Gaps
• Requirements
• Data quality
• Reconciliation
• Process
Impact traceability
18
Background
Definition of the Analyst Role - Example 2 – Adding more value
More context = Happier
• Developers
• Reconciliation team
• Reduced project risk
Improved traceability
19
Background
Definition of the Analyst Role – Summary
Business decisions
Either help mold or live with
Decisions that impact requirements
• Vendor and system selection processes
• What is in/out of scope
• Data quality decisions
• Reconciliation
• Impact on business processes
• Go-live
20
Background
Obstacles analysts face in performing the 3 functions
21
Data baggage:
• ETL processes
• Data layers
• Reports & Repositories
• Interfaces & Systems
heavily relied on, but domain knowledge lost over the years.
Background: Obstacles analysts face in performing the 3 functions
22
Project Types
Common Projects that are data baggage hot spots
Common data baggage hot spots
• Data layer projects
• Enterprise and Shadow BI projects
• Interfaces & Orchestration
• Data Migrations
23
Project Types
Data baggage hot spot 1 – A Data layer project
This is an unsustainable model.
New virtual data layer technology is key to combating this complexity.
When the rate of change exceeds the rate of documentation, the result is data baggage.
24
Enterprise BI Shadow IT - BI
Extracts
Spreadmarts
Developer
Desktop tools
Business Users
Departmental Data
Google Analytics, Targets, Budgets, Expenses, Forecast, Campaigns, NoSQL
Project Types: Data baggage hot spot 2 – BI projects
25
Project Types
Data baggage hot spot 3 – System Interface/Orchestration project
26
Project Types
Data baggage hot spot 4 – Data Migration project
27
Project Types
Some key points for each project type
• Migration projects
  – temporary
  – key risks are reconciliation and process continuity
• Interface projects
  – require transactional monitoring
  – key risks are performance and data loss
• BI and Data layer projects
  – key risks are accuracy versus flexibility
See appendix for more detailed considerations
28
Sorting out the baggage
The 6 principles: Overcoming the obstacles
29
The 6 principles
Why are they important?
• Improve quality of mapping outputs
• Improve traceability over final processes
• Improve requirements gathering
• Provide a value-based decision-making framework
Overcoming the obstacles
30
Principle 1:
Start with the objective
• No Scope No Hope!
• Scope of project drives mapping considerations
  – Systems, processes, interfaces
  – Data quality & reconciliation strategy
  – Archiving/Logging/Reporting strategy
  – Go-Live strategy
The 6 principles: Every Analyst should know when approaching a mapping project
31
Principle 2:
Validate key assumptions
• Data quality gap
  What people think they have versus reality
• Leading cause of mapping scope creep
• Potential causes
  – Prior conversions, system upgrades, journaling
  – Documented business processes not followed
The 6 principles: Every Analyst should know when approaching a mapping project
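One way to surface the data quality gap early is to write each key assumption down as a check and run it against the real data. The sketch below is a minimal illustration under stated assumptions: the function `check_assumptions`, the sample rows, and the two assumptions are all hypothetical and not taken from the presentation.

```python
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]
Assumption = Callable[[Row], bool]


def check_assumptions(
    rows: Iterable[Row], assumptions: Dict[str, Assumption]
) -> Dict[str, List[int]]:
    """Return, per assumption, the indexes of the rows that violate it."""
    violations: Dict[str, List[int]] = {name: [] for name in assumptions}
    for index, row in enumerate(rows):
        for name, holds in assumptions.items():
            if not holds(row):
                violations[name].append(index)
    return violations


# Hypothetical rows and assumptions, for illustration only.
ROWS = [
    {"customer_id": "C1", "country": "AU", "credit_limit": 5000},
    {"customer_id": "C2", "country": "", "credit_limit": 1200},
    {"customer_id": "C3", "country": "NZ", "credit_limit": -50},
]

ASSUMPTIONS = {
    "country is always populated": lambda r: bool(r["country"]),
    "credit limit is never negative": lambda r: r["credit_limit"] >= 0,
}

if __name__ == "__main__":
    for name, bad_rows in check_assumptions(ROWS, ASSUMPTIONS).items():
        print(f"{name}: {len(bad_rows)} violation(s) at rows {bad_rows}")
```

Naming each assumption in plain language keeps the output readable for the business users who made the assumption in the first place.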
32
Principle 3:
Reconcile from the start
• Test the mapping
• Build reconciliation as part of the code
• Define the reconciliation pack
• Reconcile the source system
  (report on header vs act on lines problem)
• Set expectations for things that will not reconcile
The 6 principles: Every Analyst should know when approaching a mapping project
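Reconciling the source system before migrating can be sketched as a header-versus-lines check. This assumes the "report on header vs act on lines" problem means that reports read header totals while processes act on line items, so the two can drift apart; that reading, the function name, and the sample orders are assumptions for illustration.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Any, Dict, Iterable, Tuple


def reconcile_headers_to_lines(
    headers: Iterable[Dict[str, Any]], lines: Iterable[Dict[str, Any]]
) -> Dict[Any, Tuple[Decimal, Decimal]]:
    """Return {order_id: (header_total, sum_of_lines)} for orders that differ."""
    line_totals: Dict[Any, Decimal] = defaultdict(Decimal)
    for line in lines:
        line_totals[line["order_id"]] += line["amount"]

    breaks: Dict[Any, Tuple[Decimal, Decimal]] = {}
    for header in headers:
        order_id = header["order_id"]
        if header["total"] != line_totals[order_id]:
            breaks[order_id] = (header["total"], line_totals[order_id])
    return breaks


# Hypothetical source data: order 2's header disagrees with its lines.
HEADERS = [
    {"order_id": 1, "total": Decimal("100.00")},
    {"order_id": 2, "total": Decimal("50.00")},
]
LINES = [
    {"order_id": 1, "amount": Decimal("60.00")},
    {"order_id": 1, "amount": Decimal("40.00")},
    {"order_id": 2, "amount": Decimal("45.00")},
]

if __name__ == "__main__":
    for order_id, (header_total, lines_total) in reconcile_headers_to_lines(
        HEADERS, LINES
    ).items():
        print(f"order {order_id}: header {header_total} vs lines {lines_total}")
```

Breaks found this way belong to the source system, not the mapping, which is useful when setting expectations about what will not reconcile after go-live.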
33
Principle 4:
Design and implement with traceability in mind
• How visible the mapping is
• Ensure readable code
• Automation
• Collaboration
• Documents & questions & issues
The 6 principles: Every Analyst should know when approaching a mapping project
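One possible way to keep a mapping visible is to express each rule as data with an identifier and a note, and to record which rule produced each target value. The sketch below is an assumption about how that could look; `MappingRule`, `apply_rules`, the field names, and the rule ids are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class MappingRule:
    rule_id: str                       # ties code back to the mapping document
    source_field: str
    target_field: str
    transform: Callable[[Any], Any]
    note: str                          # why the rule exists


def apply_rules(
    source: Dict[str, Any], rules: List[MappingRule]
) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Map one source record and return the target record plus a trace."""
    target: Dict[str, Any] = {}
    trace: List[Dict[str, Any]] = []
    for rule in rules:
        source_value = source[rule.source_field]
        target_value = rule.transform(source_value)
        target[rule.target_field] = target_value
        trace.append(
            {
                "rule_id": rule.rule_id,
                "source_field": rule.source_field,
                "source_value": source_value,
                "target_field": rule.target_field,
                "target_value": target_value,
            }
        )
    return target, trace


# Hypothetical rules and record, for illustration only.
RULES = [
    MappingRule("R001", "CUST_NM", "customer_name", str.strip,
                "Trim padding from a legacy fixed-width field"),
    MappingRule("R002", "CTRY", "country_code", str.upper,
                "Target system expects upper-case codes"),
]

if __name__ == "__main__":
    record, lineage = apply_rules({"CUST_NM": "  Acme Pty Ltd ", "CTRY": "au"}, RULES)
    print(record)
    for entry in lineage:
        print(entry)
```

Because every trace entry carries a rule id, a question raised in the issues register can point at a specific rule instead of a block of code.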
34
Principle 5:
Automate from the start
• The number of iterations is key to any data project
• Automated testing of mapping
• Flushes out data quality unknowns
• Eliminates manual execution errors
• Everything is 100% traceable
• Go-live execution time statistics
The 6 principles: Every Analyst should know when approaching a mapping project
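Automated testing of a mapping, with timing captured on every run, might look like the following minimal sketch. The runner `run_mapping_tests`, the `map_status` mapping, and the legacy status codes are hypothetical examples, not tooling from the presentation.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


def run_mapping_tests(
    mapping: Callable[[Record], Record], cases: List[Tuple[Record, Record]]
) -> Dict[str, Any]:
    """Run every (source, expected) case and report failures and elapsed time."""
    started = time.perf_counter()
    failures: List[Dict[str, Any]] = []
    for source, expected in cases:
        try:
            actual = mapping(source)
        except Exception as exc:  # an unmapped value is a data quality unknown
            failures.append({"source": source, "error": repr(exc)})
            continue
        if actual != expected:
            failures.append({"source": source, "expected": expected, "actual": actual})
    return {
        "cases": len(cases),
        "failures": failures,
        "elapsed_seconds": time.perf_counter() - started,
    }


# Hypothetical mapping of legacy status codes, for illustration only.
LEGACY_STATUS = {"A": "ACTIVE", "I": "INACTIVE"}


def map_status(source: Record) -> Record:
    return {"status": LEGACY_STATUS[source["status"]]}


CASES = [
    ({"status": "A"}, {"status": "ACTIVE"}),
    ({"status": "I"}, {"status": "INACTIVE"}),
    ({"status": "X"}, {"status": "UNKNOWN"}),  # code nobody documented
]

if __name__ == "__main__":
    report = run_mapping_tests(map_status, CASES)
    print(f"{report['cases']} cases, {len(report['failures'])} failure(s), "
          f"{report['elapsed_seconds']:.4f}s")
    for failure in report["failures"]:
        print(failure)
```

Keeping the elapsed time from every run gives a history of execution statistics to plan the go-live window against.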
35
Principle 6:
Right tool for the right job
The 6 principles: Every Analyst should know when approaching a mapping project
36
Tools
The Analyst Tool Box
Need                           Tool
Requirements and mapping       Many available; Excel
Issues & question management   Many available; RedMine
Document collaboration         Many available
Data validation and discovery; virtual data layer, sandbox; basic reporting; basic automation; data modeling and diagramming:
Quest Software’s TOAD Family of Products
• Toad for *DB*
• Toad for Data Analysts
• Toad for Cloud DB
• Toad Data Modeler
37
Use a product suite
Application: Applying Principles to Project Types – Data Migration
38
Application
Applying Principles to Project Types – Data Migration
Or low cost and reliable….
Excel Organized folders
Scripts
Task based Runner
SaaS Issues & question system
SQL IDE
39
• Ensure key outputs exist
  – Change history
• Collaboration server
  – Document
  – Conversation
  – Train-of-thought
• Eliminating complexity
  – Reduce number of components
Future Proofing: Improving the quality of your information systems
40
Enterprise BI Shadow IT - BI
Extracts
Spreadmarts
Desktop tools
Developer
Departmental Data
Google Analytics, Targets, Budgets, Expenses, Forecast, Campaigns, NoSQL
Future Proofing: Eliminating complexity
41
Enterprise BI Shadow IT - BI
Extracts
Spreadmarts
Desktop tools
Developer
Departmental Data
Google Analytics, Targets, Budgets, Expenses, Forecast, Campaigns, NoSQL
Replace with a virtual data layer
Future Proofing: Eliminating complexity
42
Future Proofing
Eliminating complexity
Enterprise BI Personal - BI
Data prep & provision
Business User
Departmental Data
Google Analytics, Targets, Budgets, Expenses, Forecast, Campaigns, NoSQL
Desktop tools
Sandbox
Analytic cache
43
Future Proofing
Improving the quality of your information systems
45
Appendix
46
Principle
Objective: Go live; decommission; reporting; data strategy
Reconcile: What will and what won’t. Operational versus Stats
Assumptions: Essential documented; data quality, scope
Design/Implement with traceability in mind: Templates, procedural code preferred over object-oriented coding
Automate: Critical. Nightly Dev, weekly Test, monthly UAT runs
Characteristic
Life span: Temporary project
Key risks: Execution window, reconciliation, business processes
Key artifacts: Mapping documents; Reconciliation & process strategy
Key tools: Excel, SQL IDE, Automation, Reporting, Issues Register
Applying Principles to Project Types – Data Migration
Appendix
47
Principle
Objective: Go Live; Orchestration; Monitoring
Reconcile/Monitoring: Detect errors, bottlenecks, data loss
Assumptions: Essential documented; data quality, scope
Design/Implement with traceability in mind: Well designed and well performing code. Log verbosity levels. Log management
Automate: Orchestration
Characteristic
Life span: Permanent
Key risks: Performance, data loss, business processes
Key artifacts: Mapping & design documents; Monitoring & Log strategy; admin guide
Key tools: Diagramming, BPM; Data Investigation; CASE; Support
Applying Principles to Project Types – Interfaces / Orchestration
Appendix
48
Principle
Objective: Result based
Reconcile: Ensure localized transforms tie back to source figures
Assumptions: Essential documented; data quality, scope
Design/Implement with traceability in mind: Well designed and well performing code
Automate: Process & Publish
Characteristic
Life span: Permanent or Temporary
Key risks: Access, Security, Accuracy, Flexibility
Key artifacts: Report -> Dimensional Model -> Physical; Admin guide
Key tools: SQL-IDE, Data Exploration, Sandbox, Automation, collaboration, Diagramming
Applying Principles to Project Types – Reporting / Analytics
Appendix
49
Data Migration
Appendix