Post on 02-Dec-2014
© 2014 Genesee Academy, LLC
Data Modeling Data Vault Modeling Big Data Agile DW Ensemble Modeling Certification
CDVDM Recertification Event Data Vault: Then & Now
© 2014 Genesee Academy, LLC USA +1 303 526 0340 Sweden 072 736 8700 Hans@GeneseeAcademy.com www.GeneseeAcademy.com
CDVDM ReConnect 2014 gohansgo
Then & Now Presentation Agenda
• Looking Back & Progress
• Colors and Reverse Engineering
• Business Oriented Modeling
• Effective Dates
• Architecture Revisited
• Link Unique Specific Natural
• Thinking Differently
• Modeling Address
• Sourcing the Data Vault
• The L:L:L constructs
• Automation
Mini-Topics for 5x5 Updates
• Ensemble Modeling
• Core Business Concepts
• The Business Key
• Unit of Work & Possessive
• Raw versus Business
• Link & Why it's not an Event
• Satellite & Why it's not MV
• Big Data & Unstructured
• Successful Agile DV DW
• Industry Reference Models
• Ensemble Forms
AGENDA ITEMS
Then and Now…
2007 * 2008 * 2009 * 2010 * 2011 * 2012 * 2013 * 2014
Genesee Academy Activities
Seminars
Advising
Online
Conferences
[Chart: GA Activities. Categories: Seminars, Advising, Online, Conferences. Values shown: 38%, 29%, 17%, 14%]
Genesee Academy, LLC – World Class Training
• Seminars
  – 1-4 day, on-location & in-company courses.
  – Certifications issued by GA.
  – Blended (hybrid) Pedagogy.
• Advising
  – DWBI Programs, Modeling Patterns, Enterprise Architecture, Agility, etc.
  – Reviews: Programs, Models, Architectures, etc.
• Online
  – Classroom studio, online, on-demand video lessons.
  – Multiple channels DVA and TrainOvation.
• Conferences
  – Speaking, presenting, and sometimes coordinating industry conferences around the globe.
Unified Decomposition™
• With the EDW, we seek to break things out into component parts for flexibility, adaptability, agility, and generally to facilitate the capture of things that are either interpreted in different ways or changing independently of each other.
• At the same time a core premise of data warehousing is integration and moving to a common standard view of unified concepts. So we also want to tie things together – to Unify.
Ensemble Modeling™
All the parts of a thing taken together, so that each part is considered only in relation to the whole.
• The constellation of component parts acts as a whole – an Ensemble.
• With Ensemble Modeling the Core Business Concepts that we define and model are represented as a whole – an ensemble – including all of the component parts. An Ensemble is based on all things defining a Core Business Concept that can be uniquely and specifically said for one instance of that Concept.
The Data Vault Ensemble
• The Data Vault Ensemble conforms to a single key – embodied in the Hub construct.
• The component parts for the Data Vault Ensemble include:
  – Hub: The Natural Business Key
  – Link: The Natural Business Relationships
  – Satellite: All Context, Descriptive Data and History
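As a rough illustration, the three component parts can be sketched as plain Python data classes. The class and field names (Hub, Link, Satellite, load_date, attributes) and the sample keys are assumptions chosen for this sketch and do not come from the presentation. For brevity the Satellite here hangs off a Hub only.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Hub:
    """The natural business key of one Core Business Concept."""
    concept: str
    business_key: str


@dataclass(frozen=True)
class Link:
    """A natural business relationship between two or more Hubs."""
    name: str
    hubs: tuple  # the Hub instances taking part in the relationship


@dataclass
class Satellite:
    """Context, descriptive data and history (attached to a Hub here)."""
    parent: Hub
    load_date: date
    attributes: dict = field(default_factory=dict)


customer = Hub("Customer", "CUST-1001")
sale = Hub("Sale", "SALE-77")
customer_sale = Link("Customer_Sale", (customer, sale))

# History is kept by adding Satellite rows; the Hub itself only holds the key.
history = [
    Satellite(customer, date(2014, 1, 1), {"city": "Denver"}),
    Satellite(customer, date(2014, 6, 1), {"city": "Stockholm"}),
]
current = max(history, key=lambda s: s.load_date)
print(current.attributes["city"])
```

The Hub carries nothing but the key, so descriptive changes only ever add Satellite rows.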
Data Vault means thinking differently
• The minimal construct for an “entity” such as “Customer” is now a Hub with a set of Satellites.
DV versus 3NF
[Figure: diagram with multiple Satellites (Sat); labels: EDW, History, Operational]
The Data Vault modeling approach
• As the scope of the EDW is expanded and new data sources added, the Data Vault can adapt to these changes without impacting the existing model. This is what allows the EDW to be built incrementally and to adapt to change without the need for re-engineering.
[Figure: New Area absorbed. Hubs shown: H_Cust, H_Sale, H_Empl, H_Store, H_Car]
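The idea of absorbing a new area without touching the existing model can be sketched with an in-memory SQLite database. The table and column names (h_cust, l_cust_sale, cust_key, and so on) are assumptions loosely based on the Hub names in the figure, and the table_ddl helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Existing part of the model: two Hubs and the Link between them.
conn.executescript("""
CREATE TABLE h_cust (cust_key TEXT PRIMARY KEY);
CREATE TABLE h_sale (sale_key TEXT PRIMARY KEY);
CREATE TABLE l_cust_sale (
    cust_key TEXT REFERENCES h_cust(cust_key),
    sale_key TEXT REFERENCES h_sale(sale_key),
    PRIMARY KEY (cust_key, sale_key)
);
""")


def table_ddl(connection):
    """Return {table name: CREATE statement} for all tables."""
    rows = connection.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    )
    return dict(rows.fetchall())


before = table_ddl(conn)

# New area absorbed: a new Hub plus a Link to an existing Hub.
# No ALTER TABLE is issued against anything that already exists.
conn.executescript("""
CREATE TABLE h_car (car_key TEXT PRIMARY KEY);
CREATE TABLE l_sale_car (
    sale_key TEXT REFERENCES h_sale(sale_key),
    car_key  TEXT REFERENCES h_car(car_key),
    PRIMARY KEY (sale_key, car_key)
);
""")

after = table_ddl(conn)

# Every pre-existing table definition is unchanged.
unchanged = all(after[name] == ddl for name, ddl in before.items())
print(unchanged)
```

The new area arrives purely as additional tables, which is the property the slide describes as adapting without re-engineering.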
Data Vault Modeling Process
• The modeling process for creating a Data Vault model includes three primary steps:
1) Identify and model your Core Business Concepts
   • Business interviews are at the heart of this step: What do you do? What are the main things you work with?
   • Also find the best/target Natural Business Key
2) Identify and model your Natural Business Relationships
   • Specific, unique relationships
   • Be considerate of the Unit of Work and Grain
3) Analyze and design your Context Satellites
   • Consider Rate of Change, Type of Data, and also the Sources of your data during the design process
Sales DV Model - Backbone
[Figure: Sample Model]
Identifying the Core Business Concepts
Business Key?
• The Business Key that forms the basis of the Hub should be:
  – Enterprise Wide Unique
  – Central Business View Aligned
This means that:
  – It is not a “Technical Key” but rather a “Business Key”
  – It is not the source system primary key (id)
  – It is not driven by any one source system
  – It should be aligned with central business initiatives
In a data warehouse this means:
  – Will have clashes
  – Will have duplicates
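A minimal sketch of what keying a Hub on the business key (rather than a source system id) looks like when two sources deliver overlapping customers. The record layout, the customer_no field, the source names and the trim/uppercase normalisation are all assumptions made for the example.

```python
# Hypothetical records from two source systems. Each has its own
# technical id, but both carry the same enterprise customer number.
crm_rows = [
    {"id": 17, "customer_no": "C-1001"},
    {"id": 18, "customer_no": "C-1002"},
]
billing_rows = [
    {"id": 9001, "customer_no": "c-1001 "},  # same customer, messy key
    {"id": 9002, "customer_no": "C-1003"},
]


def load_hub(hub, rows, record_source):
    """Add each business key once; remember which source saw it first."""
    for row in rows:
        key = row["customer_no"].strip().upper()  # assumed normalisation
        hub.setdefault(key, record_source)
    return hub


h_customer = {}
load_hub(h_customer, crm_rows, "CRM")
load_hub(h_customer, billing_rows, "BILLING")

print(sorted(h_customer))
```

The duplicate delivered by the second source collapses onto the existing Hub entry because the source ids (17 and 9001) play no part in the key.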
Starting with Stars
• Begins to get complicated…
[Figure: Star 1 to Star 11, Star n…; labels: Accounting, Finance, Logistics, Sales. Caption: Reach complexity and lack of agility level…]
Adapting & Expanding the EDW
• With Data Vault, scale easily – without re-engineering!
[Figure: Star 1 to Star 11, Star n…; labels: EDW, DV EDW, Accounting, Finance, Logistics, Sales. Caption: Easily adapts to changes…]
Fundamental Architecture
[Figure: Enterprise DWBI Solution architecture. Recoverable labels: Staging; EDW; Raw; BDW; Data Mart Star Schema; Other Marts & Error Marts; Extract; Load; D/T Stamp; Integrate; Align; Reconcile; Transform; Calculate; Convert; Cleanse; Profile; Validate; Common Business Rules; Mart Specific Rules]
Identifying relationships that are really Ensembles
• Rules and Guidelines
• Does the Link have its own Business Key?
• Does the Link represent its own Core Business Concept?
• Are there several Satellites on the Link?
• Are there many attributes to describe the Link?
• Are there relationships (Link to Link) with this Link?
If YES to any of these questions, then the Link is likely a Hub.
When a Link becomes a Hub
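The guideline questions above can be turned into a small checklist function. This is only a sketch: the LinkProfile fields and the two thresholds are assumptions, since the slide says "several" and "many" without giving numbers.

```python
from dataclasses import dataclass


@dataclass
class LinkProfile:
    """Answers to the guideline questions for one candidate Link."""
    has_own_business_key: bool = False
    is_core_business_concept: bool = False
    satellite_count: int = 0
    attribute_count: int = 0
    linked_to_other_links: bool = False


def likely_a_hub(link, several_satellites=2, many_attributes=10):
    """Return True if any guideline question is answered with yes.

    The two thresholds are assumptions; the slide only says
    "several" and "many".
    """
    return any([
        link.has_own_business_key,
        link.is_core_business_concept,
        link.satellite_count >= several_satellites,
        link.attribute_count >= many_attributes,
        link.linked_to_other_links,
    ])


# Hypothetical candidates for illustration.
order_line = LinkProfile(satellite_count=1, attribute_count=3)
invoice = LinkProfile(has_own_business_key=True, satellite_count=3)

print(likely_a_hub(order_line), likely_a_hub(invoice))
```

A True result is a prompt to review the model, not an automatic verdict; the slide itself says "likely".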
Applying the Data Vault Ensemble
• Mixing “color types of data” is not Data Vaulting but rather unvaulting
* A blended pattern has different dynamics
Thinking Differently
• Stay with the Ensemble Modeling Pattern. Continue practicing Unified Decomposition. Continue Vaulting. Be aware when you change patterns.
[Figure: Option 1, Option 2, Option 3]
Sourcing the Data Vault EDW
• Sourcing Data Vault requires more joins (Hub to Sats, 2 sides of Links)
• Sourcing Data Vault can be more efficient than sourcing other forms
• Primary path to efficient sourcing is thinking differently…
1. ETL team needs to understand the DV model to be efficient
2. Automation and templates for repeatable patterns make this easier
3. Pulling context from subset of Satellites eases this join impact
4. Hubs and Links are thin and short tables with no redundancy (fast)
5. Data Marts should not be based on creating another copy of DW
6. Data Mart design should be agile, purpose-built, and business driven
7. Data Marts should pass the virtualization test
8. Tune with PITS, Bridges, other Mart Stage views (& materialized)
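Points 3 and 7 above can be illustrated with an in-memory SQLite sketch: a mart-facing view joins the Hub to only the one Satellite it needs and picks the latest row per key. Reading "virtualization test" as "the mart can be expressed as a view" is an assumption of this example, as are all table, column and view names and the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE h_cust (cust_key TEXT PRIMARY KEY);
CREATE TABLE s_cust_name (
    cust_key TEXT, load_date TEXT, name TEXT,
    PRIMARY KEY (cust_key, load_date)
);
CREATE TABLE s_cust_address (
    cust_key TEXT, load_date TEXT, city TEXT,
    PRIMARY KEY (cust_key, load_date)
);
INSERT INTO h_cust VALUES ('C-1001'), ('C-1002');
INSERT INTO s_cust_name VALUES
    ('C-1001', '2014-01-01', 'Acme'),
    ('C-1001', '2014-06-01', 'Acme Corp'),
    ('C-1002', '2014-03-01', 'Globex');
INSERT INTO s_cust_address VALUES
    ('C-1001', '2014-01-01', 'Denver');

-- Mart view: the Hub joined to only the Satellite this mart needs,
-- keeping the most recent Satellite row per business key.
CREATE VIEW dim_customer AS
SELECT h.cust_key, s.name
FROM h_cust AS h
JOIN s_cust_name AS s ON s.cust_key = h.cust_key
WHERE s.load_date = (
    SELECT MAX(latest.load_date)
    FROM s_cust_name AS latest
    WHERE latest.cust_key = h.cust_key
);
""")

rows = conn.execute(
    "SELECT cust_key, name FROM dim_customer ORDER BY cust_key"
).fetchall()
print(rows)
```

The address Satellite is never touched by this view, which is how pulling context from a subset of Satellites limits the join cost.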
Link:Link:Link
• What does an L:L:L mean?
• Can a relationship have relationships to other relationships?
Whenever you see a Link:Link you should take a moment to find the Hub you are missing. That Hub is either already in the model or not yet modeled.
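A simple way to spot Link:Link constructs is to scan model metadata for Links that reference other Links. The metadata layout and every name below are hypothetical; this is a sketch of the check, not a feature of any particular tool.

```python
# Hypothetical model metadata: each Link lists the constructs it connects.
links = {
    "L_Cust_Sale": ["H_Cust", "H_Sale"],
    "L_Sale_Store": ["H_Sale", "H_Store"],
    "L_Return": ["L_Cust_Sale", "H_Store"],  # references another Link
}


def link_to_link(links):
    """Return {link: [links it references]} for every Link:Link found."""
    found = {}
    for name, members in links.items():
        targets = [m for m in members if m in links]
        if targets:
            found[name] = targets
    return found


print(link_to_link(links))
```

Each hit is a candidate for the missing Hub mentioned above: here the referenced Link (L_Cust_Sale) may really be a Core Business Concept in its own right.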
• Automation:
Benefits of Data Vault Modeling
Agility, Auditability, History, Scalability, Simplicity, Loadability
Responds Faster & Costs Less
Applying Data Vault
• Financial Institutions
• Telecommunications
• Retail
• Manufacturing
• Technology
• Energy & Utility
• HealthCare
• Consultancy
• Transportation
• Government
• Gaming
• Etc.
Links and Information
CDVDM Training & Certification
www.GeneseeAcademy.com Hans@GeneseeAcademy.com
gohansgo
Book DataVaultBook.blogspot.com
HansHultgren.WordPress.com
HansHultgren
Online video-lesson training
DataVaultAcademy.com
DataVaultAcademy