Data warehousing theory and modelling techniques Graduate course on dimensional modelling.
-
Upload
adrian-blankenship -
Category
Documents
-
view
215 -
download
0
Transcript of Data warehousing theory and modelling techniques Graduate course on dimensional modelling.
![Page 1: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/1.jpg)
Data warehousing theory Data warehousing theory and modelling techniquesand modelling techniques
Graduate course on Graduate course on dimensional modellingdimensional modelling
![Page 2: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/2.jpg)
1. EXTENDED DIMENSION 1. EXTENDED DIMENSION TABLE DESIGNSTABLE DESIGNS
1.1.1.1. Many-to-many dimensions Many-to-many dimensions
1.2.1.2. Many-to-many traps Many-to-many traps
1.3.1.3. Role-playing dimensions Role-playing dimensions
1.4.1.4. Organisation and parts hierarchies Organisation and parts hierarchies
1.5.1.5. Time stamping the changes Time stamping the changes
1.6.1.6. Building an audit dimension Building an audit dimension
1.7. conclusion: too few or too many 1.7. conclusion: too few or too many dimensionsdimensions
![Page 3: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/3.jpg)
2. EXTENDED FACT TABLE 2. EXTENDED FACT TABLE DESIGNSDESIGNS
2.1. Facts of differing granularity and 2.1. Facts of differing granularity and allocatingallocating
2.2. Time of day2.2. Time of day
2.3. Multiple units of measurement2.3. Multiple units of measurement
2.4. Value band reporting2.4. Value band reporting
![Page 4: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/4.jpg)
1.1. Many-to-many dimensions:1.1. Many-to-many dimensions:if one of the dimensions has many valuesif one of the dimensions has many values
![Page 5: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/5.jpg)
1.1. Many-to-many 1.1. Many-to-many dimensions:dimensions:
if one of the dimensions has many valuesif one of the dimensions has many values
Solution: Solution: – A A bridge tablebridge table between the fact between the fact
table and the many-to-many table and the many-to-many dimensiondimension
– we must generalise the original we must generalise the original diagnoses key in the fadiagnoses key in the facct table to be a t table to be a special diagnoses key and we use a special diagnoses key and we use a group keygroup key and and a weighting factora weighting factor which sums up to 1.00which sums up to 1.00
![Page 6: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/6.jpg)
1.1.1.1. Many-to-many dimensions: Many-to-many dimensions:if one of the dimensions has many valuesif one of the dimensions has many values
![Page 7: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/7.jpg)
1.2.1.2. Many-to-many traps: Many-to-many traps:be aware of different cardinality when a be aware of different cardinality when a
dimension is attached to several fact tablesdimension is attached to several fact tables
![Page 8: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/8.jpg)
1.3.1.3. Role-playing dimensions Role-playing dimensions
A role in a data warehouse is a situation where a single dimension A role in a data warehouse is a situation where a single dimension appears several times in the same fact table. Then the underlying appears several times in the same fact table. Then the underlying dimension may exist as a single physical table, but each of the dimension may exist as a single physical table, but each of the roles must be presented in a separately labelled view.roles must be presented in a separately labelled view.
E.g. the telecommunications industryE.g. the telecommunications industryof a single call we might register:of a single call we might register:source system providersource system providerlocal switch providerlocal switch providerlong distance providerlong distance provideradded value service provideradded value service provider
![Page 9: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/9.jpg)
1.4. Organisation and parts 1.4. Organisation and parts hierarchieshierarchies
![Page 10: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/10.jpg)
1.4. Organisation and parts 1.4. Organisation and parts hierarchieshierarchies
![Page 11: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/11.jpg)
1.4. Organisation and parts 1.4. Organisation and parts hierarchieshierarchies
![Page 12: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/12.jpg)
1.5. Time stamping1.5. Time stamping
E.g. human resources environment of a E.g. human resources environment of a large enterprise with 100.000 employeeslarge enterprise with 100.000 employees
3 kinds of queries against the HR data:3 kinds of queries against the HR data:1. summary statistics of the entire employee data base (monthly)1. summary statistics of the entire employee data base (monthly)2.2. to profile the employee population at any moment in time to profile the employee population at any moment in time
(month end or not)(month end or not)3.3. we demand that every employee transaction can be we demand that every employee transaction can be
represented distinctly: we thus want to see represented distinctly: we thus want to see eevery transaction on very transaction on a given employee, with the correct transaction sequence and a given employee, with the correct transaction sequence and the correct timing of each transactionthe correct timing of each transaction
3 = 3 = detailed transaction theorydetailed transaction theory or or fundamental truthfundamental truth
![Page 13: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/13.jpg)
1.5. Time stamping1.5. Time stamping e.g. an employee transaction dimension time stamped e.g. an employee transaction dimension time stamped
with current and next transaction dates and timeswith current and next transaction dates and times
![Page 14: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/14.jpg)
1.6. Building an audit dimension1.6. Building an audit dimension during the extract process in the data staging phaseduring the extract process in the data staging phase
![Page 15: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/15.jpg)
2.1. Facts of differing granularity 2.1. Facts of differing granularity and allocatingand allocating E.g. shipment invoiceE.g. shipment invoice
Individual fact records should be on the lowest atomic levelIndividual fact records should be on the lowest atomic level When faced with facts of differing granularity, try to force all When faced with facts of differing granularity, try to force all
the facts to the lowest levelthe facts to the lowest level When allocating facts to the lowest level is impossible, the When allocating facts to the lowest level is impossible, the
higher level facts have to be presented in separate tableshigher level facts have to be presented in separate tables
High-level plans or forecastsHigh-level plans or forecasts: make sure that the actual : make sure that the actual aggregates exist at the same level of the plansaggregates exist at the same level of the plans– 1. Aggregate table and plan table share exactly the same dimensions1. Aggregate table and plan table share exactly the same dimensions
=> combine them in 1 single physical table=> combine them in 1 single physical table– 2. Aggregate table and plan table have 2. Aggregate table and plan table have different different dimensionsdimensions
=> combination in 1 single physical table is impossible=> combination in 1 single physical table is impossible
![Page 16: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/16.jpg)
2.1. Facts of differing granularity 2.1. Facts of differing granularity and allocatingand allocating
![Page 17: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/17.jpg)
2.1. Facts of differing granularity 2.1. Facts of differing granularity and allocating:and allocating: high level plans or high level plans or
forecastsforecasts
![Page 18: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/18.jpg)
2.1. Facts of differing granularity 2.1. Facts of differing granularity and allocating:and allocating: high level plans or high level plans or
forecastsforecasts
The dimension plan version makes it impossible to combine with an aggregate table
![Page 19: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/19.jpg)
2.2. Time of day2.2. Time of day
![Page 20: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/20.jpg)
2.3. Multiple units of measure2.3. Multiple units of measure
The wrong design when fact table quantities need to be expressed in several units of measure
![Page 21: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/21.jpg)
2.3. Multiple units of measure2.3. Multiple units of measure
The recommended design for multiple units of measure
![Page 22: Data warehousing theory and modelling techniques Graduate course on dimensional modelling.](https://reader035.fdocuments.us/reader035/viewer/2022062519/5697c0301a28abf838cdac2b/html5/thumbnails/22.jpg)
2.4. Value band reporting2.4. Value band reporting
Report of form:Report of form:
balancebalance number of number of total of total of
rangerange accounts accounts balancesbalances
0-10000-1000 45678 45678 $10222543$10222543
1001-2000 36154 1001-2000 36154 $45455789$45455789
2001-5000 11485 2001-5000 11485 $30851455$30851455