Dimensional Modelling - Basic Concept
![Page 1: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/1.jpg)
DIMENSIONAL MODELING
Structuring Data for Better Reporting and Analysis
Sajjad Zaheer, 21 Aug 2014, Folio3
@folio_3 www.folio3.com Copyright 2015
![Page 2: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/2.jpg)
1. Getting into the Context
![Page 3: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/3.jpg)
Online Transaction Processing (OLTP)
• Core database
• Usually ER model
• For transactions and routine tasks
![Page 4: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/4.jpg)
Metadata
Data about data, i.e. information about the data tables in the OLTP system.
![Page 5: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/5.jpg)
ETL (Extract, Transform, Load)
Extract from the source (OLTP)
Transform according to the requirements
Load into the data warehouse
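The three stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `orders` table, its columns, the `fact_order` target and the in-memory SQLite databases are all hypothetical stand-ins for a real OLTP source and warehouse.

```python
import sqlite3

def extract(source):
    """Extract: read raw rows from the (hypothetical) OLTP orders table."""
    return source.execute(
        "SELECT order_id, status, amount_cents, order_date FROM orders"
    ).fetchall()

def transform(rows):
    """Transform: keep completed orders, convert cents to units, derive the month."""
    return [
        (order_id, cents / 100, order_date[:7])
        for order_id, status, cents, order_date in rows
        if status == "completed"
    ]

def load(warehouse, rows):
    """Load: write the transformed rows into the warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_order (order_id INTEGER, amount REAL, month TEXT)"
    )
    warehouse.executemany("INSERT INTO fact_order VALUES (?, ?, ?)", rows)
    warehouse.commit()

# Two in-memory databases stand in for the OLTP source and the warehouse.
source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE orders (order_id INTEGER, status TEXT, amount_cents INTEGER, order_date TEXT);
INSERT INTO orders VALUES
    (1, 'completed', 1250, '2014-03-02'),
    (2, 'cancelled',  900, '2014-03-05'),
    (3, 'completed',  400, '2014-04-01');
""")
warehouse = sqlite3.connect(":memory:")

load(warehouse, transform(extract(source)))
loaded = warehouse.execute(
    "SELECT order_id, amount, month FROM fact_order ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping the three stages as separate functions makes it easy to replace any one of them, for example pointing `extract` at a different source.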
![Page 6: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/6.jpg)
Data Warehouse
• For effective querying, analysis and decision-making
• OLAP (Online Analytical Processing) design
• Subject-oriented, integrated, time-variant, non-volatile collection of data
![Page 7: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/7.jpg)
Data Mart
• Access layer of the data warehouse
• Subset of the data warehouse
• Oriented to a specific business unit or department, e.g. marketing
• Is not another physical entity
![Page 8: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/8.jpg)
OLAP (Online Analytical Processing)
To analyze multidimensional data interactively from multiple perspectives
![Page 9: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/9.jpg)
Data Mining
• Computational process of discovering patterns in large data sets
• Extracts information and transforms it into an understandable structure for further use
![Page 10: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/10.jpg)
Data Visualization
Creation and study of the visual representation of data, e.g. scatter plot, bar chart.
![Page 11: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/11.jpg)
Reporting
Retrieve and present a subset of data for a particular purpose
![Page 12: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/12.jpg)
Data
Information
Knowledge
Dimensional Modeling (OLTP to OLAP Structure)
![Page 13: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/13.jpg)
Dimensional Modeling
![Page 14: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/14.jpg)
![Page 15: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/15.jpg)
![Page 16: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/16.jpg)
Terminology
Dimensions
The time-independent, textual and descriptive attributes by which users describe objects.
Who, where, what, how, when
The angles from which data can be viewed.
E.g. product category, date-time of a transaction.
Facts
Business measurements (quantified), e.g. quantity, amount, cost, taxes.
Things that can be summed or aggregated, e.g. sales of a product.
Built from the lowest level of detail (grain)
The data under consideration
Time dependent
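To make the distinction concrete, the sketch below sums a fact over a chosen dimension. The sales rows and the helper `total_by` are hypothetical; `category` and `month` play the role of dimensions, while `quantity` and `amount` are the facts.

```python
from collections import defaultdict

# Hypothetical rows: descriptive dimension attributes plus numeric facts.
sales = [
    {"category": "Books", "month": "2014-03", "quantity": 2, "amount": 30.0},
    {"category": "Books", "month": "2014-04", "quantity": 1, "amount": 15.0},
    {"category": "Toys",  "month": "2014-03", "quantity": 3, "amount": 45.0},
]

def total_by(rows, dimension, fact):
    """Sum one fact, grouped by the values of one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[fact]
    return dict(totals)

# The same facts viewed from two different angles.
amount_by_category = total_by(sales, "category", "amount")
quantity_by_month = total_by(sales, "month", "quantity")
print(amount_by_category)
print(quantity_by_month)
```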
![Page 17: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/17.jpg)
Dimensional Modeling Process
Sub-setting
De-normalization
i.e. collapsing the hierarchies of dimensions by de-normalizing to 2NF
Summarization
i.e. Summation of Facts
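A minimal sketch of de-normalization and summarization, using in-memory SQLite. The `product`, `category` and `sale` tables and all their columns are assumptions made for illustration; they do not come from the case study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (hypothetical) source: product -> category hierarchy.
CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id INTEGER REFERENCES category(category_id)
);
CREATE TABLE sale (product_id INTEGER, amount REAL);

INSERT INTO category VALUES (1, 'Books'), (2, 'Toys');
INSERT INTO product VALUES (10, 'Atlas', 1), (11, 'Novel', 1), (12, 'Kite', 2);
INSERT INTO sale VALUES (10, 20.0), (11, 10.0), (12, 5.0), (10, 20.0);

-- De-normalization: collapse the hierarchy into one flat dimension table.
CREATE TABLE dim_product AS
SELECT p.product_id, p.product_name, c.category_name
FROM product p
JOIN category c ON p.category_id = c.category_id;
""")

dim_rows = conn.execute(
    "SELECT product_id, product_name, category_name FROM dim_product ORDER BY product_id"
).fetchall()

# Summarization: sum the fact (amount) per category with a single join.
summary = conn.execute("""
    SELECT d.category_name, SUM(s.amount)
    FROM sale s
    JOIN dim_product d ON s.product_id = d.product_id
    GROUP BY d.category_name
    ORDER BY d.category_name
""").fetchall()
print(dim_rows)
print(summary)
```

The flat `dim_product` table repeats the category name on every product row; that redundancy is the price paid for simpler, faster reporting queries.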
![Page 18: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/18.jpg)
Modeling Design Steps
1. Identify the Business Process: the source of “measurements”
2. Identify the Grain: what does one row in the fact table represent?
3. Identify the Dimensions: descriptive context, true to the grain
4. Identify the Facts: numeric, additive measurements, true to the grain
![Page 19: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/19.jpg)
Design Steps - Example
![Page 20: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/20.jpg)
Case Study: User Points System
Consider a system that can be summarized as follows:
It has users and groups of users.
Every user can perform certain actions, such as message, comment, meeting, etc.
For every action the user earns points, which are also added to the points of the user groups the user belongs to.
The system also has many other features that are not relevant to points.
Let’s assume the system has over 100 tables to store various things.
![Page 21: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/21.jpg)
Step 1: Identify the Business Process
Question 1: Do we apply dimensional modeling to all 100 tables in the system?
Answer: No
Question 2: So which tables should be selected?
Answer: The tables that are relevant to the business requirements.
![Page 22: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/22.jpg)
Business Requirements
Two types of points are required for reporting:
1. Per month points
2. Average lifetime points at end of each month
For:
1. Individual users
2. User groups
3. Individual users per action
4. User groups per action
![Page 23: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/23.jpg)
Step 2: Identify the Grain
Analyzing the business requirements, the following grains are identified:
1. Points per individual per month
2. Points per user group per month
3. Points per user per action per month
4. Points per user group per action per month
5. Average Lifetime Points per individual per month
6. Average Lifetime Points per user group per month
7. Average Lifetime Points per user per action per month
8. Average Lifetime Points per user group per action per month
“Grain = What does 1 row in the fact table represent”
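The effect of choosing a grain can be seen by rolling the same raw events up to two different grains. This is a sketch with hypothetical event data; the tuple layout and the `roll_up` helper are assumptions made for the example.

```python
from collections import Counter

# Hypothetical point events: (user, action, date "YYYY-MM-DD", points).
events = [
    ("alice", "message", "2014-03-02", 5),
    ("alice", "message", "2014-03-09", 5),
    ("alice", "comment", "2014-03-15", 2),
    ("bob",   "meeting", "2014-04-01", 10),
]

def roll_up(rows, key):
    """Sum points into one row per distinct key; the key defines the grain."""
    totals = Counter()
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)

# Grain: points per user per month (one row per user and month).
per_user_month = roll_up(events, lambda r: (r[0], r[2][:7]))

# Grain: points per user per action per month (a finer grain, more rows).
per_user_action_month = roll_up(events, lambda r: (r[0], r[1], r[2][:7]))

print(per_user_month)
print(per_user_action_month)
```

Four events become two rows at the coarser grain and three rows at the finer one, so each grain leads to a different fact table.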
![Page 24: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/24.jpg)
Step 3: Identify the Dimensions
Simply put, whatever follows ‘per’ in a grain is a dimension. The dimensions are found to be:
1. Date (granularity: month)
2. Users
3. User groups
4. Actions
“Dimension: descriptive context true to grain”
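The ‘per’ rule is mechanical enough to sketch as a toy function. The grain strings are taken from Step 2; the synonym map (treating "individual" as "user") and the helper names are assumptions made for this example.

```python
# Three of the grains from Step 2, written exactly as on the slide.
grains = [
    "Points per individual per month",
    "Points per user group per month",
    "Points per user per action per month",
]

# Assumption: "individual" and "user" name the same dimension.
SYNONYMS = {"individual": "user"}

def dimensions_of(grain):
    """Return everything that follows a 'per' in the grain description."""
    parts = [part.strip() for part in grain.split(" per ")[1:]]
    return [SYNONYMS.get(part, part) for part in parts]

def all_dimensions(grain_list):
    """Collect the distinct dimensions in order of first appearance."""
    seen = []
    for grain in grain_list:
        for dim in dimensions_of(grain):
            if dim not in seen:
                seen.append(dim)
    return seen

dimensions = all_dimensions(grains)
print(dimensions)
```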
![Page 25: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/25.jpg)
Step 4: Identify the Facts
4 Facts are identified
1. User Points
2. User Lifetime Average Points
3. User Group Points
4. User Group Lifetime Average Points
“Facts: Numeric additive measures true to grain”
![Page 26: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/26.jpg)
Tables Schema
Once the grain, facts and dimensions are identified, the table schema is formed from them.
Please note:
It is not necessary to keep every fact in a different table.
Several facts can be part of a single table.
Alternatively, a single fact can be spread over multiple fact tables, depending on its relationship with the dimensions.
Every dimension gets its own table, and each dimension can be connected to many fact tables.
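These notes can be sketched as SQLite DDL. The table names follow the naming used in this deck; the column names `points`, `lifetime_avg_points`, `dim_group_id` and `group_name` are assumptions, since the actual schema diagrams are not reproduced in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table per dimension.
CREATE TABLE dim_date  (dim_date_id  INTEGER PRIMARY KEY, month INTEGER, year INTEGER);
CREATE TABLE dim_user  (dim_user_id  INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE dim_group (dim_group_id INTEGER PRIMARY KEY, group_name TEXT);

-- Two facts (points and lifetime average) kept in a single fact table.
CREATE TABLE fact_points_user (
    dim_user_id INTEGER REFERENCES dim_user(dim_user_id),
    dim_date_id INTEGER REFERENCES dim_date(dim_date_id),
    points INTEGER,
    lifetime_avg_points REAL,
    PRIMARY KEY (dim_user_id, dim_date_id)
);

-- A second fact table that shares the dim_date dimension.
CREATE TABLE fact_points_group (
    dim_group_id INTEGER REFERENCES dim_group(dim_group_id),
    dim_date_id  INTEGER REFERENCES dim_date(dim_date_id),
    points INTEGER,
    lifetime_avg_points REAL,
    PRIMARY KEY (dim_group_id, dim_date_id)
);
""")

def referenced_tables(table):
    """Return the set of tables that `table` points to through foreign keys."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return {row[2] for row in rows}  # column 2 holds the referenced table name

user_refs = referenced_tables("fact_points_user")
group_refs = referenced_tables("fact_points_group")
print(user_refs, group_refs)
```

The composite primary key of each fact table is its grain written down as a constraint: one row per user (or group) per date entry.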
![Page 27: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/27.jpg)
Tables Schema
The table schema should be a translation of the grains defined in Step 2.
![Page 28: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/28.jpg)
Star Schema – fact_points_user
Grains covered:
1. Points per individual per month
2. Average lifetime points per individual per month
![Page 29: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/29.jpg)
Star Schema – fact_points_user_action
Grains covered:
1. Points per individual per action per month
2. Average lifetime points per individual per action per month
![Page 30: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/30.jpg)
Star Schema – fact_points_group
Grains covered:
1. Points per user group per month
2. Average lifetime points per user group per month
![Page 31: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/31.jpg)
Star Schema for User Points
Grains
Grains covered:
1. Points per user group per action per month
2. Average lifetime points per user group per action per month
![Page 32: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/32.jpg)
Example Query
SELECT fp.*, du.username, da.action_name
FROM fact_points_user_action fp
JOIN dim_user du ON fp.dim_user_id = du.dim_user_id
JOIN dim_date dd ON fp.dim_date_id = dd.dim_date_id
JOIN dim_action da ON fp.dim_action_id = da.dim_action_id
WHERE dd.month = 3 AND dd.year = 2014;
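To try the query, the sketch below runs it unchanged against an in-memory SQLite database. The table and column names come from the query itself; the `points` column and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_user   (dim_user_id   INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE dim_date   (dim_date_id   INTEGER PRIMARY KEY, month INTEGER, year INTEGER);
CREATE TABLE dim_action (dim_action_id INTEGER PRIMARY KEY, action_name TEXT);
CREATE TABLE fact_points_user_action (
    dim_user_id INTEGER, dim_date_id INTEGER, dim_action_id INTEGER, points INTEGER
);

-- Hypothetical sample data.
INSERT INTO dim_user   VALUES (1, 'alice'), (2, 'bob');
INSERT INTO dim_date   VALUES (1, 3, 2014), (2, 4, 2014);
INSERT INTO dim_action VALUES (1, 'message'), (2, 'comment');
INSERT INTO fact_points_user_action VALUES
    (1, 1, 1, 10), (1, 1, 2, 2), (2, 1, 1, 5), (2, 2, 1, 7);
""")

# The example query from the slide, unchanged.
QUERY = """
SELECT fp.*, du.username, da.action_name
FROM fact_points_user_action fp
JOIN dim_user du ON fp.dim_user_id = du.dim_user_id
JOIN dim_date dd ON fp.dim_date_id = dd.dim_date_id
JOIN dim_action da ON fp.dim_action_id = da.dim_action_id
WHERE dd.month = 3 AND dd.year = 2014;
"""

# The query has no ORDER BY, so sort in Python for a stable result.
rows = sorted(conn.execute(QUERY).fetchall())
for row in rows:
    print(row)
```

Only the March 2014 rows are returned; the April row for bob is filtered out through the join with `dim_date`.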
![Page 33: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/33.jpg)
Data Transformation: OLTP to OLAP
![Page 34: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/34.jpg)
Data Transformation
Once the OLAP schema has been designed, data has to be moved from the ER-modeled (OLTP) database to the new OLAP database.
This can be achieved using dedicated scripts or cron jobs.
A simple example for the case study is a cron job that runs at the end of every month and moves the relevant data from the OLTP database to the OLAP database, after any required calculations.
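A month-end job of this kind might look like the sketch below. Everything here is an assumption: the OLTP table `user_action_log`, its columns, and a simplified `fact_points_user` that stores year and month directly instead of a `dim_date_id`. The scheduler (for example cron) would simply call `load_month` once a month.

```python
import sqlite3

def load_month(oltp, olap, year, month):
    """Aggregate one month of OLTP point events into the OLAP fact table."""
    prefix = f"{year:04d}-{month:02d}"
    rows = oltp.execute(
        "SELECT user_id, SUM(points) FROM user_action_log "
        "WHERE substr(created_on, 1, 7) = ? "
        "GROUP BY user_id ORDER BY user_id",
        (prefix,),
    ).fetchall()
    # Delete-then-insert keeps the job safe to re-run for the same month.
    olap.execute(
        "DELETE FROM fact_points_user WHERE year = ? AND month = ?", (year, month)
    )
    olap.executemany(
        "INSERT INTO fact_points_user (user_id, year, month, points) VALUES (?, ?, ?, ?)",
        [(user_id, year, month, points) for user_id, points in rows],
    )
    olap.commit()
    return len(rows)

# In-memory databases stand in for the real OLTP and OLAP databases.
oltp = sqlite3.connect(":memory:")
oltp.executescript("""
CREATE TABLE user_action_log (user_id INTEGER, action TEXT, points INTEGER, created_on TEXT);
INSERT INTO user_action_log VALUES
    (1, 'message',  5, '2014-03-02'),
    (1, 'comment',  2, '2014-03-15'),
    (2, 'meeting', 10, '2014-03-20'),
    (2, 'message',  5, '2014-04-01');
""")
olap = sqlite3.connect(":memory:")
olap.execute(
    "CREATE TABLE fact_points_user (user_id INTEGER, year INTEGER, month INTEGER, points INTEGER)"
)

load_month(oltp, olap, 2014, 3)
load_month(oltp, olap, 2014, 3)  # a second run does not duplicate rows
march = olap.execute(
    "SELECT user_id, points FROM fact_points_user ORDER BY user_id"
).fetchall()
print(march)
```

Making the load idempotent is a design choice rather than a requirement of the case study; it means a failed or repeated run cannot inflate the monthly totals.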
![Page 35: Dimensional Modelling - Basic Concept](https://reader030.fdocuments.us/reader030/viewer/2022032616/55a624f61a28ab073c8b4706/html5/thumbnails/35.jpg)
Conclusion
Dimensional modeling helps to keep data in a form that is relevant and quickly accessible for reporting and analysis.