Online Analytical Processing - Vaibhav Bajpai · 2020-03-01 · Architecture Information Sources...
Online Analytical Processing
Vaibhav Bajpai
Decision Support Systems
Architecture
Information Sources
Data Warehouse
OLAP Servers
OLAP Clients
DSS
Architecture
DSS are used to make business decisions based on data collected by OLTP.
Architecture
Information Sources
Operational Databases
ERP system
Semi-Structured Sources
do not conform to the formal structure of relational data models but contain tags that separate semantic elements (e.g. XML).
Architecture
ETL
Extract-Transform-Load (ETL) is a process that involves:
Extracting data from the information sources.
Transforming it to fit operational needs (cleansing).
Loading it into the Data Warehouse.
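The three ETL steps can be sketched in a few lines. This is only an illustration: the `raw_orders` rows, the cleansing rules, and the `orders` table are assumptions made up for the example, and an in-memory SQLite database stands in for the Data Warehouse.

```python
import sqlite3

# Hypothetical raw rows, as they might arrive from an operational source.
raw_orders = [
    {"id": "1", "city": " berlin ", "amount": "10.50"},
    {"id": "2", "city": "Bremen", "amount": "n/a"},  # dirty amount
    {"id": "3", "city": "BERLIN", "amount": "4.50"},
]


def extract():
    """Extract: read rows from the information source (here, a list)."""
    return list(raw_orders)


def transform(rows):
    """Transform: cleanse values and drop rows that cannot be repaired."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows with an unparseable amount
        clean.append((int(row["id"]), row["city"].strip().title(), amount))
    return clean


def load(rows, conn):
    """Load: write the cleansed rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER, city TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(transform(extract()), warehouse)
loaded = warehouse.execute(
    "SELECT city, SUM(amount) FROM orders GROUP BY city"
).fetchall()
```

After the run, both spellings of Berlin have been merged into one city and the row with the unparseable amount has been dropped.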
Architecture
Data Warehouse
A DW is a repository of data that supports the management decision-making process.
Characteristics
Basic Elements
Conceptual Models
Data Marts
Data Warehouse
Characteristics
Subject-Oriented
Integrated (security, single-version)
Time Variant (particular time-period)
Non-Volatile (never removed!)
Data Warehouse
Basic Elements
Facts (measures + fKeys to dimTables)
Measures (additive + non-additive + semi-additive)
Dimensions (measures from different perspective)
Hierarchies (classification of dimensions)
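The kinds of measure differ in which dimensions they may be summed over. A minimal sketch, assuming a made-up table of bank accounts (all names and numbers are illustrative): deposits are additive, while balances are semi-additive because they can be summed across accounts but not across days.

```python
# (day, account, balance, deposits): hypothetical fact rows.
rows = [
    ("mon", "A", 100, 100),
    ("mon", "B", 50, 50),
    ("tue", "A", 120, 20),
    ("tue", "B", 50, 0),
]

# Additive: deposits can be summed over every dimension.
total_deposits = sum(deposits for _, _, _, deposits in rows)

# Semi-additive: balances can be summed over accounts for one day ...
balance_tue = sum(balance for day, _, balance, _ in rows if day == "tue")

# ... but summing them over days as well double-counts the same money.
misleading_total = sum(balance for _, _, balance, _ in rows)
```

A non-additive measure, such as a ratio or percentage, cannot be summed over any dimension and has to be recomputed from its additive parts.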
Data Warehouse
Conceptual Models
Star Schema
Snowflake Schema (normalized)
Galaxy Schema (many fact tables)
Conceptual Models
Star Schema
good for large data warehouses!
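A star schema can be sketched as one fact table surrounded by denormalized dimension tables. The table and column names below (`fact_sales`, `dim_product`, `dim_store`) are assumptions chosen for the example, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
-- Fact table: one measure plus a foreign key to each dimension table.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    units      INTEGER
);
INSERT INTO dim_product VALUES (1, 'pen', 'office'), (2, 'desk', 'furniture');
INSERT INTO dim_store   VALUES (1, 'Bremen'), (2, 'Berlin');
INSERT INTO fact_sales  VALUES (1, 1, 5), (1, 2, 3), (2, 1, 1);
""")

# One join per dimension: the fact table sits at the centre of the star.
units_by_category = conn.execute("""
    SELECT p.category, SUM(f.units)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

In a snowflake schema the `category` column would move out of `dim_product` into its own table, which adds a second join to the same query.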
Conceptual Models
Snowflake Schema
good for small data warehouses!
Data Warehouse
Data Marts
A DM is a subset of the DW oriented to a specific business line or team.
It is the access layer of the DW, used to get data out to the users.
Data Mining
process of extracting patterns from data.
transforms data into business intelligence.
OLAP
Data Cube
Aggregation
Navigational Operations
OLAP
Data Cube
core of an OLAP system.
provides an m-Dim way to look at a summary of the data.
m-Dim generalization of the group-by operator.
cubes are sparse in nature.
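Reading the data cube as an m-Dim generalization of group-by, a cube over n dimensions contains one group-by for every subset of those dimensions, 2^n in total. The sketch below computes all of them naively; the `facts` rows and the dimension names are made up for the example.

```python
from collections import defaultdict
from itertools import combinations

DIMS = ("product", "city", "year")

# Hypothetical fact rows: three dimension values followed by one measure.
facts = [
    ("pen", "Bremen", 2019, 5),
    ("pen", "Berlin", 2019, 3),
    ("desk", "Bremen", 2020, 1),
]


def data_cube(rows, n_dims):
    """Return {dimension indexes: {key: total}} for every subset of dimensions."""
    cube = {}
    for k in range(n_dims + 1):
        for dims in combinations(range(n_dims), k):
            cuboid = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                cuboid[key] += row[n_dims]  # the measure follows the dimensions
            cube[dims] = dict(cuboid)
    return cube


cube = data_cube(facts, len(DIMS))
```

Here `cube[()]` holds the grand total and `cube[(0,)]` the totals per product. The finest group-by has only 3 filled cells out of 2 x 2 x 2 = 8 possible ones, which is the sparsity noted above.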
OLAP
Aggregation
pre-calculated summaries of data.
answers are ready before the questions!
improve query response time.
aggregations are stored in the data cube.
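The idea of having the answers ready before the questions can be sketched as a build step that computes the summaries once and a query step that only looks them up. The `sales` rows and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, city, units).
sales = [("pen", "Bremen", 5), ("pen", "Berlin", 3), ("desk", "Bremen", 1)]

# Build time: compute the summaries once, before any query arrives.
by_product = defaultdict(int)
by_city = defaultdict(int)
for product, city, units in sales:
    by_product[product] += units
    by_city[city] += units


def units_for_product(product):
    """Query time: a dictionary lookup, with no scan over the fact rows."""
    return by_product.get(product, 0)
```

The trade-off is that the summaries take extra storage and must be refreshed whenever new facts are loaded.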
OLAP
Navigational Operations
Roll-up (lower to higher aggregation)
Drill-down (higher to lower aggregation)
Slicing
Dicing
Pivot
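Roll-up and drill-down move along a dimension hierarchy. A minimal sketch, assuming a made-up city-to-country hierarchy and made-up totals per city:

```python
from collections import defaultdict

# Hypothetical hierarchy (city -> country) and totals at the city level.
CITY_TO_COUNTRY = {"Bremen": "DE", "Berlin": "DE", "Paris": "FR"}
units_by_city = {"Bremen": 6, "Berlin": 3, "Paris": 4}


def roll_up(cells, parent_of):
    """Aggregate each member into its parent: lower to higher aggregation."""
    totals = defaultdict(int)
    for member, value in cells.items():
        totals[parent_of[member]] += value
    return dict(totals)


def drill_down(cells, parent_of, parent):
    """Show the children behind one parent: higher to lower aggregation."""
    return {m: v for m, v in cells.items() if parent_of[m] == parent}


units_by_country = roll_up(units_by_city, CITY_TO_COUNTRY)
german_cities = drill_down(units_by_city, CITY_TO_COUNTRY, "DE")
```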
Navigational Operations
Slicing
a subset of an m-Dim array corresponding to a single value for one or more members of the dimensions not in the subset.
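In practice a slice fixes one dimension to a single member and keeps the remaining dimensions. A small sketch over a dictionary-based cube; the cell layout (product, city, year) and the values are assumptions for the example.

```python
# Hypothetical cube cells keyed by (product, city, year).
cube = {
    ("pen", "Bremen", 2019): 5,
    ("pen", "Berlin", 2019): 3,
    ("desk", "Bremen", 2020): 1,
}


def slice_cube(cells, axis, value):
    """Fix one dimension to a single value and drop it from the keys."""
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cells.items()
        if key[axis] == value
    }


sales_2019 = slice_cube(cube, axis=2, value=2019)
```

The result is a 2-Dim table of product by city for the year 2019.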
Navigational Operations
Dicing
is a selection on two or more dimensions of a cube.
i.e. two or more consecutive slices, yielding a sub-cube.
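A dice can be sketched as a filter that restricts several dimensions at once and returns a smaller cube with the same number of dimensions. The cells and the `allowed` mapping below are made up for the example.

```python
# Hypothetical cube cells keyed by (product, city, year).
cube = {
    ("pen", "Bremen", 2019): 5,
    ("pen", "Berlin", 2019): 3,
    ("pen", "Paris", 2019): 2,
    ("desk", "Bremen", 2020): 1,
}


def dice(cells, allowed):
    """Keep cells whose member on each listed axis is in the allowed set."""
    return {
        key: measure
        for key, measure in cells.items()
        if all(key[axis] in members for axis, members in allowed.items())
    }


# Restrict two dimensions: product (axis 0) and city (axis 1).
sub_cube = dice(cube, {0: {"pen"}, 1: {"Bremen", "Berlin"}})
```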
Navigational Operations
Pivoting
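Pivoting rotates the axes of the cube to give a different view of the same cells. A minimal sketch, assuming a made-up 2-Dim table keyed by (product, city):

```python
def pivot(cells, order):
    """Reorder the axes of every cell key, e.g. swap rows and columns."""
    return {tuple(key[i] for i in order): value for key, value in cells.items()}


# Hypothetical table keyed by (product, city).
table = {("pen", "Bremen"): 5, ("pen", "Berlin"): 3, ("desk", "Bremen"): 1}

# Swap the two axes so that the keys read (city, product).
rotated = pivot(table, (1, 0))
```

No values change and nothing is aggregated; only the orientation of the table differs.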
OLAP Approaches
ROLAP
MOLAP
HOLAP
OLAP Approaches
ROLAP
Overview
Architecture
Methods of Cubing
Performance Evaluation
ROLAP
Overview
uses an RDBMS as its source. However, a DB designed for OLTP will not function well with ROLAP.
does not require pre-aggregation.
generates SQL queries at the appropriate level of aggregation at request time.
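Generating SQL at request time can be sketched as a function that turns the requested dimensions into a GROUP BY query. The `sales` table, its columns, and `build_query` are assumptions for the example; SQLite stands in for the RDBMS.

```python
import sqlite3

ALLOWED_DIMS = {"product", "city", "year"}


def build_query(group_by):
    """Build a GROUP BY query for the requested aggregation level."""
    unknown = set(group_by) - ALLOWED_DIMS
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    cols = ", ".join(group_by)
    if not cols:
        return "SELECT SUM(units) FROM sales"  # grand total
    return f"SELECT {cols}, SUM(units) FROM sales GROUP BY {cols} ORDER BY {cols}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, city TEXT, year INTEGER, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [("pen", "Bremen", 2019, 5), ("pen", "Berlin", 2019, 3), ("desk", "Bremen", 2020, 1)],
)

by_city = conn.execute(build_query(["city"])).fetchall()
grand_total = conn.execute(build_query([])).fetchall()
```

Column names are checked against a fixed set before they are placed in the SQL text, since identifiers cannot be passed as bound parameters.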
ROLAP
Architecture
scalable!
needs additional attributes to define position in m-Dim space.
ROLAP
Methods of Cubing
Sort-based Methods (pipeSort)
Hash-based Methods (pipeHash)
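Sort-based methods rely on the fact that one sort order serves every group-by on a prefix of that order. The sketch below shows only this pipelining idea, not the pipeSort algorithm itself: after a single sort on (A, B, C), one scan produces the group-bys ABC, AB, A and the grand total. The rows and the function name are made up for the example.

```python
# Hypothetical fact rows: three dimension values followed by one measure.
rows = [
    ("pen", "Bremen", 2019, 5),
    ("pen", "Berlin", 2019, 3),
    ("desk", "Bremen", 2020, 1),
    ("pen", "Bremen", 2020, 2),
]


def prefix_aggregates(rows, n_dims):
    """One sort, one scan: aggregate every prefix of the sort order."""
    ordered = sorted(rows, key=lambda r: r[:n_dims])
    out = [[] for _ in range(n_dims + 1)]  # out[k]: group-by on the first k dims
    current = [None] * (n_dims + 1)        # key of the open group per level
    totals = [0] * (n_dims + 1)            # running total per level
    for row in ordered:
        for k in range(n_dims + 1):
            key = row[:k]
            if current[k] is not None and key != current[k]:
                out[k].append((current[k], totals[k]))  # group closed: emit it
                totals[k] = 0
            current[k] = key
            totals[k] += row[n_dims]
    for k in range(n_dims + 1):
        if current[k] is not None:
            out[k].append((current[k], totals[k]))  # emit the last open group
    return out


agg = prefix_aggregates(rows, 3)
```

Each level keeps a single running total, so memory use does not grow with the number of groups. Group-bys that are not a prefix of the sort order, such as (city, year), need a different sort.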
ROLAP
Performance Evaluation
ROLAP is CPU-bound.
slow
requires more disk
requires more IOs
requires more IO time
OLAP Approaches
MOLAP
Overview
Architecture
Storage Issues
MOLAP
Overview
core is an m-Dim data cube.
allows position-based computation.
the cube is *very* sparse (not scalable).
extremely fast!
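Position-based computation means that a cell is found by arithmetic on its coordinates rather than by searching for matching keys. A minimal sketch with a row-major array; the member lists and values are made up for the example.

```python
# Hypothetical dimension members; a cell's coordinates are the member indexes.
MEMBERS = [["pen", "desk"], ["Bremen", "Berlin", "Paris"], [2019, 2020]]
SHAPE = [len(axis) for axis in MEMBERS]  # 2 x 3 x 2 = 12 cells


def offset(coords):
    """Row-major position of a cell, computed from its coordinates alone."""
    pos = 0
    for coord, size in zip(coords, SHAPE):
        pos = pos * size + coord
    return pos


def locate(product, city, year):
    """Translate member names to coordinates, then to an array position."""
    coords = [axis.index(m) for axis, m in zip(MEMBERS, (product, city, year))]
    return offset(coords)


cells = [None] * (SHAPE[0] * SHAPE[1] * SHAPE[2])
cells[locate("pen", "Bremen", 2019)] = 5
cells[locate("pen", "Berlin", 2019)] = 3
cells[locate("desk", "Bremen", 2020)] = 1

empty = cells.count(None)  # most cells stay empty: the cube is sparse
```

The array reserves space for all 12 combinations although only 3 hold data, which is why sparsity limits how far this layout scales.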
MOLAP
Architecture
pre-aggregated cubes
formatting data to user’s needs
data requests
MOLAP
Storage Issues
Chunking
Chunk-offset Compression
MOLAP | Storage Issues
Chunking
dividing the m-Dim array into small chunks.
allows chunks to fit into available memory for in-memory computations.
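Chunking can be sketched on a small 2-Dim array: the array is cut into fixed-size blocks and each block is processed on its own. The 4 x 4 array and the 2 x 2 chunk size are assumptions for the example; a real system would read each chunk from disk.

```python
def chunks(array, chunk_rows, chunk_cols):
    """Yield ((row_block, col_block), sub_array) for each chunk of a 2-D array."""
    n_rows, n_cols = len(array), len(array[0])
    for r in range(0, n_rows, chunk_rows):
        for c in range(0, n_cols, chunk_cols):
            block = [row[c:c + chunk_cols] for row in array[r:r + chunk_rows]]
            yield (r // chunk_rows, c // chunk_cols), block


# Hypothetical 4 x 4 array holding the values 0..15.
array = [[r * 4 + c for c in range(4)] for r in range(4)]

# Aggregate chunk by chunk: only one 2 x 2 block is needed at a time.
total = 0
for _, block in chunks(array, 2, 2):
    total += sum(sum(row) for row in block)
```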
MOLAP | Storage Issues
Chunk-offset Compression
store an (offset-in-chunk, value) pair for each valid entry.
solves the sparse-array problem.
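A minimal sketch of the idea, assuming a chunk is held as a flat list in which `None` marks an empty cell (that encoding and the function names are assumptions for the example):

```python
def compress(chunk):
    """Keep an (offset, value) pair only for the non-empty cells of a chunk."""
    return [(i, v) for i, v in enumerate(chunk) if v is not None]


def decompress(pairs, size):
    """Rebuild the full chunk from its (offset, value) pairs."""
    chunk = [None] * size
    for i, v in pairs:
        chunk[i] = v
    return chunk


# Hypothetical sparse chunk: 8 cells, 2 of them valid.
chunk = [None, None, 5, None, None, None, 3, None]
pairs = compress(chunk)
```

Storage now grows with the number of valid entries rather than with the size of the chunk, so the scheme pays off when chunks are mostly empty.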
OLAP Approaches
HOLAP
Overview
Architecture
HOLAP
Overview
store detailed data in RDBMS
store aggregated data in MDBMS
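The hybrid split can be sketched with two stores: a relational table for the detailed rows and a pre-computed summary for the aggregates. In this illustration SQLite stands in for the RDBMS and a plain dictionary stands in for the MDBMS; the table, data, and function names are assumptions.

```python
import sqlite3

# Detailed data: a relational table (stand-in for the RDBMS).
rdbms = sqlite3.connect(":memory:")
rdbms.execute("CREATE TABLE sales (product TEXT, city TEXT, units INTEGER)")
rdbms.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("pen", "Bremen", 5), ("pen", "Berlin", 3), ("desk", "Bremen", 1)],
)

# Aggregated data: a pre-computed summary (stand-in for the MDBMS).
mdbms = {
    product: total
    for product, total in rdbms.execute(
        "SELECT product, SUM(units) FROM sales GROUP BY product"
    )
}


def total_units(product):
    """Summary question: answered from the aggregate store."""
    return mdbms.get(product, 0)


def detail_rows(product):
    """Detail question: answered from the relational store."""
    cur = rdbms.execute(
        "SELECT city, units FROM sales WHERE product = ? ORDER BY city",
        (product,),
    )
    return cur.fetchall()
```

Summary queries never touch the detail table, while a drill-through to individual rows falls back to SQL.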
HOLAP
Architecture
When to choose What?
Performance is a concern?
MOLAP!
Data Volume and Scalability is a concern?
ROLAP!
Thank You!