Online Analytical Processing by Hweichao Lu
-
8/7/2019 Online Analytical Processing by Hweichao Lu
Online Analytical Processing (OLAP)
Hweichao Lu
CS157B-02 Spring 2007
-
What Is OLAP?
Basic idea: converting data into information that decision makers need
A way to analyze data along multiple dimensions in a structure called a data cube
-
History
In 1993, E. F. Codd came up with the term online analytical processing (OLAP) and proposed 12 criteria to define an OLAP database
The term OLAP seems perfect to describe databases designed to facilitate decision making (analysis) in an organization
-
Purpose of OLAP
To derive summarized information from large-volume databases
To generate automated reports for human viewing
-
Why We Need OLAP over a Relational Database I
Consistently fast response
OLAP obtains a consistently fast response by prestoring calculated values
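The idea of prestoring calculated values can be sketched in a few lines of Python. This is only an illustration under assumed data: the `SALES` rows and the names `build_summary` and `total_for` are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical detail rows: (item, color, quantity).
SALES = [
    ("skirt", "dark", 8),
    ("skirt", "pastel", 35),
    ("dress", "dark", 20),
    ("dress", "pastel", 10),
]


def build_summary(rows):
    """Scan the detail rows once and store the total quantity per item."""
    totals = defaultdict(int)
    for item, _color, quantity in rows:
        totals[item] += quantity
    return dict(totals)


# Computed once, ahead of query time.
SUMMARY = build_summary(SALES)


def total_for(item):
    """Answer a query with a dictionary lookup instead of a scan of SALES."""
    return SUMMARY[item]
```

Each call to `total_for` costs the same regardless of how many detail rows exist, which is the sense in which response time stays consistent.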
-
Why We Need OLAP over a Relational Database II
Metadata-based queries
OLAP provides analysis functions that are difficult or impossible to express in SQL
SQL was developed primarily for transaction systems, not for reporting applications
-
Why We Need OLAP over a Relational Database III
Spreadsheet-style formulas
Design the data structure with users in mind.
Spreadsheets are key components of business management because they are intuitive to create
-
Step I
1. Identify multidimensional data
Measure attribute (measures some value and can be aggregated upon)
Dimension attribute (defines a dimension along which the measure attribute is viewed and summarized)
-
(Cont.)
Each dimension is typically expressed as a hierarchy
Hierarchy: the analyst is interested in different levels of detail of a dimension
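As a small illustration of a dimension hierarchy, the sketch below maps one date to its value at each level of an assumed time hierarchy (year, quarter, month, day). The choice of levels and the function name `hierarchy_levels` are assumptions made for this example.

```python
from datetime import date


def hierarchy_levels(day):
    """Return the value of a date at each level of a hypothetical time hierarchy."""
    quarter = (day.month - 1) // 3 + 1
    return {
        "year": day.year,
        "quarter": f"{day.year}-Q{quarter}",
        "month": f"{day.year}-{day.month:02d}",
        "day": day.isoformat(),
    }


levels = hierarchy_levels(date(2007, 5, 14))
# levels["quarter"] is "2007-Q2" and levels["month"] is "2007-05"
```

Grouping a measure by any one of these keys gives the summary at that level of detail.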
-
Step II
2. Analyze multidimensional data in a cross-tabulation
Row header: value of one attribute
Column header: value of another attribute
Individual cell: aggregated value
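A cross-tabulation of this shape can be sketched as a nested dictionary. The data and the name `cross_tab` are hypothetical; the aggregate is assumed to be a sum of the measure.

```python
from collections import defaultdict

# Hypothetical rows: (item, color, quantity). item and color are the
# dimension attributes, quantity is the measure attribute.
SALES = [
    ("skirt", "dark", 8),
    ("skirt", "pastel", 35),
    ("dress", "dark", 20),
    ("dress", "pastel", 10),
    ("dress", "dark", 5),
]


def cross_tab(rows):
    """Build {row_header: {column_header: summed measure}}."""
    table = defaultdict(lambda: defaultdict(int))
    for item, color, quantity in rows:
        table[item][color] += quantity
    return {item: dict(columns) for item, columns in table.items()}


tab = cross_tab(SALES)
# tab["dress"]["dark"] is 25: the two dark-dress rows are summed into one cell
```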
-
Step III
3. Visualize the n-dimensional cube (data cube)
The word "cube" describes what in the relational world would be the integration of the fact table with the dimension tables
-
Step IV
After you design the cube, you will use the cube's structure to build a relational database (known as a star schema) to house the data for the cube
-
Step V
Once you load data into the relational database, and then into the cube, you'll be able to see how attributes, dimensions, measures, and measure groups fit together within a cube to create a powerful analytical tool.
-
Star Schema
Cubes are easily stored in relational databases, using a denormalized data structure called the star schema, developed by Ralph Kimball
Starts with a central fact table
Each row in the central fact table contains some combination of keys that makes it unique.
These keys are called dimensions.
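A minimal star schema can be sketched with Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_item`, `dim_store`) and the sample rows are assumptions for illustration, not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_item  (item_id  INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);
    -- Central fact table: the combination of dimension keys makes a row unique.
    CREATE TABLE fact_sales (
        item_id  INTEGER REFERENCES dim_item(item_id),
        store_id INTEGER REFERENCES dim_store(store_id),
        quantity INTEGER,
        PRIMARY KEY (item_id, store_id)
    );
    INSERT INTO dim_item  VALUES (1, 'skirt'), (2, 'dress');
    INSERT INTO dim_store VALUES (1, 'San Jose'), (2, 'Oakland');
    INSERT INTO fact_sales VALUES (1, 1, 8), (1, 2, 35), (2, 1, 20), (2, 2, 10);
""")

# Join the fact table to one dimension table and summarize the measure.
rows = conn.execute("""
    SELECT i.item_name, SUM(f.quantity)
    FROM fact_sales AS f
    JOIN dim_item AS i ON i.item_id = f.item_id
    GROUP BY i.item_name
    ORDER BY i.item_name
""").fetchall()
# rows is [('dress', 30), ('skirt', 43)]
```

The fact table holds only keys and measures; descriptive text lives in the dimension tables that surround it, which gives the schema its star shape.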
-
Slicing & Dicing
Additional functionality that can be thought of as viewing a slice of the data cube, particularly when values for multiple dimensions are fixed.
Slicing/dicing simply consists of selecting specific values for these attributes, which are then displayed on top of the cross-tab
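Fixing values for one or more dimensions can be sketched as a simple filter. The rows and the helper `slice_cube` are hypothetical names chosen for this example.

```python
# Hypothetical cube cells with three dimensions and one measure.
SALES = [
    {"item": "skirt", "color": "dark", "size": "small", "quantity": 2},
    {"item": "skirt", "color": "dark", "size": "large", "quantity": 6},
    {"item": "dress", "color": "dark", "size": "small", "quantity": 4},
    {"item": "dress", "color": "pastel", "size": "small", "quantity": 3},
]


def slice_cube(rows, **fixed):
    """Keep only the rows whose dimension values match every fixed value."""
    return [
        row for row in rows
        if all(row[dim] == value for dim, value in fixed.items())
    ]


small_only = slice_cube(SALES, size="small")                # one dimension fixed
small_dark = slice_cube(SALES, size="small", color="dark")  # two dimensions fixed
```

Here `small_only` keeps three rows, and the quantities in `small_dark` sum to 6. The remaining free dimensions are what a cross-tab would then display.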
-
Rollup & Drill-down
OLAP permits users to view data at any desired level of granularity.
Rollup: moving from finer-granularity data to coarser granularity
Drill-down: the opposite of rollup
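The two operations can be sketched on a small time dimension with two levels, month and year. The data and the function names `rollup_to_year` and `drill_down` are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical month-level rows: (year, month, quantity).
MONTHLY = [
    (2006, 11, 12),
    (2006, 12, 18),
    (2007, 1, 7),
    (2007, 2, 9),
]


def rollup_to_year(rows):
    """Aggregate month-level rows to the coarser year level."""
    totals = defaultdict(int)
    for year, _month, quantity in rows:
        totals[year] += quantity
    return dict(totals)


def drill_down(rows, year):
    """Return the finer month-level values behind one year's total."""
    return {month: quantity for y, month, quantity in rows if y == year}


by_year = rollup_to_year(MONTHLY)        # {2006: 30, 2007: 16}
detail_2007 = drill_down(MONTHLY, 2007)  # {1: 7, 2: 9}
```

Note that drill-down needs the finer-grained data to be available; it cannot be derived from the rolled-up totals alone.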
-
OLAP Implementation
Multidimensional OLAP (MOLAP)
Relational OLAP (ROLAP)
Hybrid OLAP (HOLAP)
-
MOLAP
The database is stored in a special, usually proprietary, structure that is optimized for multidimensional analysis.
+: very fast query response time because data is mostly pre-calculated
-: practical limit on the size because of the time taken to calculate the database and the space required to hold these pre-calculated values
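A rough way to see the size limit: if every combination of dimension values is pre-calculated, and each dimension also gets one extra "all" member for its totals, the number of cells is the product of (cardinality + 1) over the dimensions. The sketch below assumes such a fully materialized cube; real MOLAP products may store far fewer cells, and the dimension sizes are hypothetical.

```python
from math import prod


def full_cube_cells(cardinalities):
    """Count cells when every dimension also has one extra 'all' member."""
    return prod(c + 1 for c in cardinalities)


# Hypothetical dimension sizes: 1,000 products, 100 stores, 365 days.
cells = full_cube_cells([1000, 100, 365])
# cells is 37,002,966 (1001 * 101 * 366)
```

Adding a fourth dimension multiplies this count again, which is why both calculation time and storage grow quickly.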
-
ROLAP
The database is a standard relational database and the database model is a multidimensional model, often referred to as a star or snowflake model or schema.
+: more scalable solution
-: performance of the queries will be largely governed by the complexity of the SQL and the number and size of the tables being joined in the query
-
HOLAP
A hybrid of ROLAP and MOLAP
Can be thought of as a virtual database whereby the higher levels of the database are implemented as MOLAP and the lower levels of the database as ROLAP
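The split between levels can be sketched as a query router: year-level questions are answered from a pre-calculated store, while month-level questions fall through to the detail rows. This is a toy model under assumed names (`DETAIL`, `YEARLY`, `query`); in an actual HOLAP system the lower level would be a relational database queried with SQL.

```python
from collections import defaultdict

# Detail rows stand in for the relational (ROLAP) layer: (year, month, quantity).
DETAIL = [
    (2006, 12, 18),
    (2007, 1, 7),
    (2007, 2, 9),
]


def precompute_yearly(rows):
    """Build the pre-calculated higher level (the MOLAP-like part)."""
    totals = defaultdict(int)
    for year, _month, quantity in rows:
        totals[year] += quantity
    return dict(totals)


YEARLY = precompute_yearly(DETAIL)


def query(year, month=None):
    """Route to the pre-calculated store for years, to detail rows for months."""
    if month is None:
        return YEARLY.get(year, 0)
    return sum(q for y, m, q in DETAIL if y == year and m == month)
```

For example, `query(2007)` returns 16 from the pre-calculated totals, and `query(2007, 2)` returns 9 by scanning the detail rows.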
-
DOLAP
The previous terms are used to refer to server-based OLAP technologies
DOLAP (Desktop OLAP)
DOLAP enables users to quickly pull together small cubes that run on their desktops or laptops
-
Conclusion
OLAP is a significant improvement over query systems
OLAP is an interactive system to show different summaries of multidimensional data by interactively selecting the attributes in a multidimensional data cube
-