1. The Big Picture
2. Data Warehouse Philosophy
3. Data Warehouse Concepts
4. Warehousing Applications
5. Warehouse Schema Design
6. Business Intelligence Reporting
7. On-Line Analytical Processing
8. OLAP Applications
9. Data Warehouse Implementation
10. Warehousing Software
Data Warehouses & OLAP
Dimensional modeling refers to a set of data modeling techniques that have gained popularity and acceptance for DW implementations
The acknowledged guru of dimensional modeling is Ralph Kimball
Ralph Kimball, The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses, John Wiley & Sons
Normalization is the standard database design technique for the relational DB of OLTP systems
Normalized Data Structures (NDS) allow operational systems to record hundreds of discrete, individual transactions, with minimal risk of data loss or data error
Although normalized databases are appropriate for OLTP systems, they quickly create problems when used with decisional systems
NDS are not easy to understand
NDS do not map to the natural thinking processes of business users
Business users are expected to perform queries against the DW on an ad hoc basis
They must be provided with data structures that are simple and easy to understand
NDS do not provide the required level of simplicity and friendliness
NDS require technical knowledge
Creating queries and reports against an NDS requires knowledge of SQL
Business users, decision-makers, and senior executives are not expected to write SQL
Their time is better spent on non-programming activities
Unsurprisingly, the use of NDS results in many hours of IT resources devoted to writing reports for operational and decisional managers
NDS are not optimized to support decisional queries
Decisional queries require the summation of hundreds to thousands of figures stored in perhaps many rows in a DB
Such processing on a fully normalized data structure is slow and cumbersome
Dimensional Modeling for Decisional Systems
Principles for denormalizing the database structure to create schemas suitable for supporting decisional processing
Two types of tables are used in dimensional modeling:
Fact tables
Dimension tables
Fact tables
Used to record actual facts or measures in the business
Facts are the numeric data items that are of interest to the business
Facts are the numbers that users analyze and summarize to gain a better understanding of the business
Fact tables (Examples)
Retail. Number of units sold, sales amount
Telecommunications. Length of call in minutes, average number of calls
Banking. Average daily balance, transaction amount
Insurance. Claims amounts
Airline. Ticket cost, baggage weight
Dimension tables
Establish the context and store fields describing the facts
Retail. Store name, store zip, product name, product category, day of week
Telecommunications. Call origin, call destination
Banking. Customer name, account number, date, branch, account officer
Insurance. Policy type, insured policy
Airline. Flight number, flight destination, airfare class
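The split between numeric facts and descriptive dimensions can be sketched in a few lines of Python. This is only an illustration under assumed names: the classes ProductDim and SalesFact, their fields, and the sample rows are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductDim:
    """Dimension row: descriptive context only (hypothetical fields)."""
    product_key: int
    product_name: str
    product_category: str


@dataclass(frozen=True)
class SalesFact:
    """Fact row: numeric measures plus a key pointing at the dimension."""
    product_key: int
    units_sold: int
    sales_amount: float


# Illustrative sample data, not taken from the source.
products = {
    1: ProductDim(1, "Espresso beans", "Coffee"),
    2: ProductDim(2, "Green tea", "Tea"),
}
sales = [
    SalesFact(1, 3, 30.0),
    SalesFact(2, 1, 5.0),
    SalesFact(1, 2, 20.0),
]


def units_by_category(facts, dims):
    """Summarize a fact (units sold) by a dimension attribute (category)."""
    totals = {}
    for fact in facts:
        category = dims[fact.product_key].product_category
        totals[category] = totals.get(category, 0) + fact.units_sold
    return totals


category_totals = units_by_category(sales, products)
```

The facts are what gets added up; the dimension attributes only decide how the sums are grouped.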
Facts and Dimensions in Reports
A manager requires a report showing the revenue for Store X, in Month Y, for Product Z
The manager needs the Store dimension, the Time dimension, and the Product dimension to describe the context of the revenue
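A report like this could be produced by joining the fact table to its three dimensions. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table names (dim_store, dim_time, dim_product, fact_sales), the column names, and the sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, store_name TEXT);
CREATE TABLE dim_time    (time_key INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE fact_sales  (store_key INTEGER, time_key INTEGER,
                          product_key INTEGER, revenue REAL);
""")
conn.executemany("INSERT INTO dim_store VALUES (?, ?)",
                 [(1, "Store X"), (2, "Store W")])
conn.executemany("INSERT INTO dim_time VALUES (?, ?)",
                 [(1, "Month Y"), (2, "Month V")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "Product Z"), (2, "Product Q")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (1, 1, 1, 100.0),  # Store X, Month Y, Product Z
    (1, 1, 1, 50.0),   # Store X, Month Y, Product Z
    (1, 2, 1, 70.0),   # different month
    (2, 1, 1, 30.0),   # different store
    (1, 1, 2, 20.0),   # different product
])


def revenue_for(store, month, product):
    """Sum the revenue fact within the context set by the three dimensions."""
    row = conn.execute("""
        SELECT COALESCE(SUM(f.revenue), 0)
        FROM fact_sales f
        JOIN dim_store s   ON s.store_key = f.store_key
        JOIN dim_time t    ON t.time_key = f.time_key
        JOIN dim_product p ON p.product_key = f.product_key
        WHERE s.store_name = ? AND t.month = ? AND p.product_name = ?
    """, (store, month, product)).fetchone()
    return row[0]


report_value = revenue_for("Store X", "Month Y", "Product Z")
```

Only the two fact rows that match all three dimension values contribute to the result.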
Sales region and country are dimensional attributes
“2Q, 1997” is a dimensional value
They establish the context and lend meaning to the facts: sales targets and sales actuals
Star Schema
A multidimensional view of data expressed using relational database semantics
Information is classified into two groups: facts and dimensions
The fact table resides at the center of the schema, and its dimensions are typically drawn around it
Key principles of dimensional modeling:
The use of fully normalized Fact tables
The use of fully denormalized Dimension tables
Normalizing the dimension tables decreases the friendliness and navigability of the schema
By denormalizing the dimensions, we make available to the user all relevant attributes in a single table
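To make the effect of denormalization concrete, the following sketch flattens three normalized lookup tables into one wide Product dimension. The tables (products, categories, departments), their fields, and the function denormalize_products are hypothetical names used only for this example.

```python
# Normalized source tables, keyed by id (illustrative data only).
departments = {10: {"department_name": "Beverages"}}
categories = {
    100: {"category_name": "Coffee", "department_id": 10},
    101: {"category_name": "Tea", "department_id": 10},
}
products = {
    1: {"product_name": "Espresso beans", "category_id": 100},
    2: {"product_name": "Green tea", "category_id": 101},
}


def denormalize_products(products, categories, departments):
    """Build one wide dimension row per product by following the lookups."""
    dimension = []
    for product_key, product in sorted(products.items()):
        category = categories[product["category_id"]]
        department = departments[category["department_id"]]
        dimension.append({
            "product_key": product_key,
            "product_name": product["product_name"],
            "category_name": category["category_name"],
            "department_name": department["department_name"],
        })
    return dimension


dim_product = denormalize_products(products, categories, departments)
```

Each row of dim_product now carries every attribute a user might filter or group by, at the cost of repeating the category and department names.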
Dimensional Hierarchies
A dimension has hierarchies that imply a grouping structure
Hierarchical Drilling
Users drill up and down dimensional hierarchies to obtain more or less detail about the business
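Drilling can be thought of as summarizing the same facts at different depths of a hierarchy. The sketch below assumes a Time hierarchy of year, quarter, and month; the hierarchy levels, the sample rows, and the function summarize are illustrative assumptions.

```python
from collections import defaultdict

# Fact rows with their Time attributes attached (illustrative data only).
sales = [
    {"year": 1997, "quarter": "Q1", "month": "1997-01", "amount": 100.0},
    {"year": 1997, "quarter": "Q1", "month": "1997-02", "amount": 150.0},
    {"year": 1997, "quarter": "Q2", "month": "1997-04", "amount": 200.0},
    {"year": 1998, "quarter": "Q1", "month": "1998-01", "amount": 50.0},
]

# Assumed hierarchy, ordered from the highest level to the lowest.
HIERARCHY = ["year", "quarter", "month"]


def summarize(rows, depth):
    """Total the amount at the given depth of the hierarchy.

    depth=1 groups by year; depth=2 drills down to year and quarter;
    depth=3 drills down to the month.
    """
    levels = HIERARCHY[:depth]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[level] for level in levels)
        totals[key] += row["amount"]
    return dict(totals)


by_year = summarize(sales, 1)      # drilled up
by_quarter = summarize(sales, 2)   # drilled down one level
```

Drilling down adds a level to the grouping key; drilling up removes one.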
Granularity of the fact table
Granularity indicates the level of detail stored in a table
The granularity of the Fact table follows from the level of detail of its related dimensions
For example, if each:
◦ Time record represents a day,
◦ Product record represents a product,
◦ Organization record represents one branch,
then the grain of a sales Fact table with these dimensions is sales per product per day per branch
Proper identification of the granularity of each schema is crucial to the usefulness and cost of the DW
Granularity at too high a level (too coarse) severely limits the ability of users to obtain additional detail
Granularity at too low a level (too fine) multiplies the number of fact rows and sharply increases the size requirements of the DW
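A rough way to see the cost of a finer grain is to multiply the number of members in each dimension, which gives an upper bound on the number of fact rows. The figures below (365 days, 12 months, 1,000 products, 50 branches) are made-up assumptions, and the bound is only reached if every combination actually occurs.

```python
def max_fact_rows(*dimension_sizes):
    """Upper bound on fact rows: the product of the dimension sizes."""
    rows = 1
    for size in dimension_sizes:
        rows *= size
    return rows


# Assumed sizes: one year of data, 1,000 products, 50 branches.
daily_grain = max_fact_rows(365, 1000, 50)    # sales per product per day per branch
monthly_grain = max_fact_rows(12, 1000, 50)   # sales per product per month per branch

# How many times larger the daily-grain table can become.
growth_factor = daily_grain / monthly_grain
```

Under these assumptions the daily grain allows up to 18,250,000 rows against 600,000 for the monthly grain, roughly a thirtyfold difference from changing one dimension's level of detail.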
The Fact Table Key Concatenates Dimension Keys
The key of the fact table is actually a concatenation of the keys of each of the dimensions that surround it
The sales fact table key is the concatenation of the client key, the product key and the time key (Day)
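One way to express a concatenated key in a relational database is a composite primary key. The sketch below uses Python's built-in sqlite3 module; the table name fact_sales, the column names, and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fact_sales (
        client_key   INTEGER NOT NULL,
        product_key  INTEGER NOT NULL,
        time_key     INTEGER NOT NULL,   -- one value per day
        sales_amount REAL,
        PRIMARY KEY (client_key, product_key, time_key)
    )
""")
conn.execute("INSERT INTO fact_sales VALUES (1, 7, 19970401, 99.5)")


def try_insert(row):
    """Return True if the row is accepted, False if the key already exists."""
    try:
        conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Same client, product, and day: rejected by the composite key.
duplicate_accepted = try_insert((1, 7, 19970401, 10.0))
# Same client and product on the next day: a new key, so it is accepted.
new_day_accepted = try_insert((1, 7, 19970402, 10.0))
```

The composite key also documents the grain: there can be at most one row per client, product, and day.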
Aggregates or Summaries
One of the most powerful concepts in data warehousing
The proper use of aggregates dramatically improves query response times, and with them the overall performance and usability of the DW
An aggregate is a precalculated summary stored within the warehouse, usually in a separate schema
Aggregates are used to improve the performance of the warehouse for queries that require only high-level or summarized data
Aggregates are summaries of the base-level data at higher points along the dimensional hierarchies
Rather than running a high-level query against base-level or detailed data, users can run the query against aggregated data
Aggregates improve performance because they contain significantly fewer records
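The idea can be sketched with a daily base table and a monthly aggregate built from it. The example uses Python's built-in sqlite3 module; the table names (fact_sales_daily, agg_sales_monthly), the columns, and the generated sample data are assumptions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_sales_daily (day TEXT, product_key INTEGER, amount REAL)"
)

# Illustrative base-level data: 3 months x 28 days x 2 products.
rows = [
    (f"1997-{month:02d}-{day:02d}", product, 10.0)
    for month in (1, 2, 3)
    for day in range(1, 29)
    for product in (1, 2)
]
conn.executemany("INSERT INTO fact_sales_daily VALUES (?, ?, ?)", rows)

# Precalculate the summary one level up the Time hierarchy (day -> month).
conn.execute("""
    CREATE TABLE agg_sales_monthly AS
    SELECT substr(day, 1, 7) AS month, product_key, SUM(amount) AS amount
    FROM fact_sales_daily
    GROUP BY substr(day, 1, 7), product_key
""")

base_rows = conn.execute("SELECT COUNT(*) FROM fact_sales_daily").fetchone()[0]
agg_rows = conn.execute("SELECT COUNT(*) FROM agg_sales_monthly").fetchone()[0]
total_from_base = conn.execute(
    "SELECT SUM(amount) FROM fact_sales_daily").fetchone()[0]
total_from_agg = conn.execute(
    "SELECT SUM(amount) FROM agg_sales_monthly").fetchone()[0]
```

A monthly or higher-level query gets the same total from the aggregate while reading 6 rows instead of 168.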
Dimensional Attributes
Play a critical role in dimensional star schemas
The attribute values are used to establish the context of the facts
Multiple Star Schemas
A DW can have multiple star schemas (many Fact tables)
Each schema is designed to meet a specific set of needs
Each focusing on a different aspect of the business
We can use the same Dimension table in more than one schema
For example, an enterprise can reuse the Time dimension in all warehouse schemas, provided that the level of detail is appropriate
Advantages of Dimensional Modeling
It is simple
◦ Business users can easily grasp and comprehend the schema
It promotes data quality
◦ Foreign key constraints are enforced as a form of referential integrity check
Performance optimization
◦ Aggregates are a way to optimize query performance
Use of relational database technology
◦ We can rely on the highly scalable relational DB technology
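The data quality point can be illustrated with a foreign key from the fact table to a dimension. This is a minimal sketch using Python's built-in sqlite3 module, where foreign key enforcement has to be switched on per connection; the table names, columns, and the helper load_fact are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT)"
)
conn.execute("""
    CREATE TABLE fact_sales (
        product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
        sales_amount REAL
    )
""")
conn.execute("INSERT INTO dim_product VALUES (1, 'Espresso beans')")


def load_fact(product_key, sales_amount):
    """Return True if the fact row is accepted, False if the check rejects it."""
    try:
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?)", (product_key, sales_amount)
        )
        return True
    except sqlite3.IntegrityError:
        return False


known_product_loaded = load_fact(1, 30.0)     # product 1 exists in the dimension
unknown_product_loaded = load_fact(99, 12.0)  # no such product: rejected
```

A fact that refers to a product missing from the dimension never reaches the warehouse, so every stored fact has a valid context.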