Multi dimensional model vs (1)


Page 1: Multi dimensional model vs (1)

MULTIDIMENSIONAL DATA MODEL

Niall Cosgrave, Timothy Halpin, Kevin McCarthy, Cian O’Brien, Amanda O’Donovan

112223751, 112222171, 107476477, 108580235, 108385581

Page 2: Multi dimensional model vs (1)

WHAT IS MULTIDIMENSIONAL DATA

The multidimensional data model is a model for data management in which databases are designed around users' preferences so that they can serve specific types of retrieval.

This model views data in the form of a data cube. A data cube allows data to be modelled and viewed in multiple dimensions. It is defined by dimensions and facts.
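As a rough illustration of a cube defined by dimensions and facts, the sketch below stores one fact (a measure value) per combination of dimension values. The dimension names (product, region, quarter) and every figure are invented for the example and do not come from the slides.

```python
from itertools import product

# Hypothetical dimensions; names and values are invented for illustration.
dimensions = {
    "product": ["tea", "coffee"],
    "region": ["north", "south"],
    "quarter": ["Q1", "Q2"],
}

# Facts: one measure (units sold) per combination of dimension values.
facts = {
    ("tea", "north", "Q1"): 10,
    ("tea", "south", "Q1"): 4,
    ("coffee", "north", "Q1"): 7,
    ("coffee", "north", "Q2"): 9,
}


def cell(product_name, region, quarter):
    """Return the measure stored in one cell of the cube (0 if the cell is empty)."""
    return facts.get((product_name, region, quarter), 0)


# Every cell of the cube is addressed by one value from each dimension.
all_cells = list(product(*dimensions.values()))
total_units = sum(cell(*coords) for coords in all_cells)
```

Storing only the non-empty cells in a dictionary is one possible design; real cubes are often sparse, so this avoids allocating space for empty combinations.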

Page 3: Multi dimensional model vs (1)

WHAT IS MULTIDIMENSIONAL DATA

A multidimensional database (MDB) is a type of database that is optimized for data warehouse and online analytical processing (OLAP) applications.

Multidimensional database technology is a key factor in the interactive analysis of large amounts of data for decision-making purposes.

Page 4: Multi dimensional model vs (1)

WHAT IS MULTIDIMENSIONAL DATA

Multidimensional databases are especially useful in sales and marketing applications that involve time series. Large volumes of sales and inventory data can be stored and ultimately used for logistics and executive planning.

Page 5: Multi dimensional model vs (1)

WHY MULTIDIMENSIONAL DATABASE

Enables interactive analysis of large amounts of data for decision-making purposes.

Differs from previous technologies by viewing data as multidimensional cubes, which have proven to be particularly well suited to data analysis.

Rapidly processes the data in the database so that answers can be generated quickly.

A successful OLAP application provides “just-in-time” information for effective decision-making.

Page 6: Multi dimensional model vs (1)

WHY MULTIDIMENSIONAL DATABASE

The multidimensional data model is important because it enforces simplicity.

As Ralph Kimball states in his landmark book, The Data Warehouse Toolkit:

"The central attraction of the dimensional model of a business is its simplicity.... that simplicity is the fundamental key that allows users to understand databases, and allows software to navigate databases efficiently."

Page 7: Multi dimensional model vs (1)

WHY MULTIDIMENSIONAL DATABASE

The multidimensional data model is composed of logical cubes, measures, dimensions, hierarchies, levels, and attributes.

Page 8: Multi dimensional model vs (1)

DIAGRAM OF THE MULTIDIMENSIONAL MODEL:


Page 9: Multi dimensional model vs (1)

DIAGRAM OF THE MULTIDIMENSIONAL MODEL:

Logical Dimensions: Dimensions contain a set of unique values that identify and categorise data.

Hierarchies and Levels: A hierarchy is a way to organize data at different levels of aggregation.

Attributes: An attribute provides additional information about the data. Some attributes are used for display.
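The three terms above can be sketched as a small data structure. This is only an illustration: the `Member` class, the Time dimension and its Year > Quarter > Month hierarchy are assumptions made for the example, not something taken from the diagram.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Member:
    key: str                      # unique value that identifies and categorises data
    level: str                    # hierarchy level this member sits on
    parent: Optional[str] = None  # key of the member one level up, if any
    attributes: dict = field(default_factory=dict)  # extra information, e.g. a display label


# Hypothetical Time dimension with a Year > Quarter > Month hierarchy.
time_dimension = {
    m.key: m
    for m in [
        Member("2012", "Year"),
        Member("2012-Q1", "Quarter", parent="2012"),
        Member("2012-01", "Month", parent="2012-Q1", attributes={"label": "Jan 2012"}),
        Member("2012-02", "Month", parent="2012-Q1", attributes={"label": "Feb 2012"}),
    ]
}


def ancestors(key):
    """Walk up the hierarchy from a member to the top level."""
    path = []
    parent = time_dimension[key].parent
    while parent is not None:
        path.append(parent)
        parent = time_dimension[parent].parent
    return path
```

Here the `label` attribute plays the display role mentioned above, while the `parent` links are what make aggregation from months to quarters to years possible.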

Page 10: Multi dimensional model vs (1)

EXAMPLES OF DATA CUBES IN USE AND HOW THEY WORK


Page 11: Multi dimensional model vs (1)

1991 CANADIAN CENSUS


Page 12: Multi dimensional model vs (1)

SLICING, DICING AND ROTATING

In the above cube we have the results of the 1991 Canadian Census, with ethnic origin, age group and geography representing the dimensions of the cube, while 174 represents the measure. A dimension is a category of data. Each dimension includes different levels of categories. The measures are the actual data values that occupy the cells defined by the selected dimensions.

Three important concepts are associated with data cubes:

- Slicing
- Dicing
- Rotating

Page 13: Multi dimensional model vs (1)

SLICING THE DATA CUBE

• Figure 2 illustrates slicing the ethnic origin Chinese. When the cube is sliced as in this example, we are able to generate data for the Chinese origin across the geography and age group dimensions.

• The data contained within the cube has effectively been filtered to display only the measures associated with the Chinese ethnic origin.

• From an end-user perspective, the term slice most often refers to a two-dimensional page selected from the cube.
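A slice of this kind can be sketched in a few lines. The cube below reuses the three census dimensions, but every count, the age bands and the second province are invented for the example; `slice_cube` is a hypothetical helper, not part of any census tool.

```python
# Cells keyed by (ethnic_origin, age_group, geography); all counts are invented.
census = {
    ("Chinese", "0-14", "Ontario"): 12,
    ("Chinese", "15-64", "Ontario"): 30,
    ("Chinese", "15-64", "Quebec"): 8,
    ("Irish", "0-14", "Ontario"): 5,
    ("Irish", "15-64", "Quebec"): 9,
}

DIMENSIONS = ("ethnic_origin", "age_group", "geography")


def slice_cube(cube, dimension, value):
    """Fix one dimension to a single value and return the remaining 2-D page."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }


# The page for the Chinese ethnic origin, keyed by (age_group, geography).
chinese_page = slice_cube(census, "ethnic_origin", "Chinese")
```

The result has one dimension fewer than the cube, which matches the idea of a slice as a two-dimensional page.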

Page 14: Multi dimensional model vs (1)

DICING AND ROTATING

• Dicing is an operation related to slicing in which a sub-cube of the original space is defined.

• Dicing provides the user with the smallest available slice of data, enabling you to examine each sub-cube in greater detail.

• Rotating, which is sometimes called pivoting, changes the dimensional orientation of the report or page display from the cube data. Rotating may consist of swapping the rows and columns, or moving one of the row dimensions into the column dimension.

• http://demodc.chass.utoronto.ca/iassist/
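Dicing and rotating can be sketched in the same spirit. As before, the counts, age bands and the second province are invented, and `dice` and `rotate` are hypothetical helper names chosen for this example.

```python
# Cells keyed by (ethnic_origin, age_group, geography); all counts are invented.
census = {
    ("Chinese", "0-14", "Ontario"): 12,
    ("Chinese", "15-64", "Ontario"): 30,
    ("Chinese", "15-64", "Quebec"): 8,
    ("Irish", "0-14", "Ontario"): 5,
    ("Irish", "15-64", "Quebec"): 9,
}


def dice(cube, origins, ages, places):
    """Keep only the cells whose coordinates fall inside the chosen sub-cube."""
    return {
        key: measure
        for key, measure in cube.items()
        if key[0] in origins and key[1] in ages and key[2] in places
    }


def rotate(page):
    """Pivot a 2-D page by swapping its row and column coordinates."""
    return {(col, row): measure for (row, col), measure in page.items()}


sub_cube = dice(census, {"Chinese"}, {"15-64"}, {"Ontario", "Quebec"})

# A 2-D page (age_group x geography) for one ethnic origin, then pivoted
# so that geography becomes the row dimension.
page = {(age, place): n for (origin, age, place), n in census.items() if origin == "Chinese"}
pivoted = rotate(page)
```

Note that a dice keeps all three dimensions but restricts each to a subset of values, whereas rotating changes only the orientation and leaves the measures untouched.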

Page 15: Multi dimensional model vs (1)

EXAMPLE OF A DATA CUBE IN USE

‘Design and development of data mart for animal resources’ is a 2008 paper by Rai et al. that critically examines the development of a Central Data Warehouse for a multitude of agricultural areas.

www.sciencedirect.com/science/article/pii/S0168169908001245

The paper provides a visual representation of a livestock population census multidimensional cube, which is accessed through an Internet browser for OLAP.

In this cube, the hierarchies are All States, All Species and All Years. All States has state names as its top level and districts as the bottom level of the data flow hierarchy. All Species has species name as its top level, sex as the second level, age group as the third level and working categories of animals as the bottom level. All Years has only one level, i.e. years.
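To make the All States hierarchy concrete, the sketch below rolls district-level counts up to state level and drills back down again. The state and district names and all counts are placeholders invented for the example; none of them come from the paper.

```python
from collections import defaultdict

# Livestock counts at the bottom level (state, district); all figures are invented.
district_counts = {
    ("State A", "District A1"): 120,
    ("State A", "District A2"): 80,
    ("State B", "District B1"): 50,
}


def roll_up(counts):
    """Drill up: aggregate district-level counts to state level."""
    totals = defaultdict(int)
    for (state, _district), n in counts.items():
        totals[state] += n
    return dict(totals)


def drill_down(counts, state):
    """Drill down: show the district-level cells under one state."""
    return {district: n for (s, district), n in counts.items() if s == state}


state_totals = roll_up(district_counts)
all_states_total = sum(state_totals.values())  # the 'All States' level
```

The same pattern would apply to the four-level All Species hierarchy, with one roll-up per level.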

Page 16: Multi dimensional model vs (1)

VISUAL REPRESENTATION OF MULTIDIMENSIONAL CUBE


Page 17: Multi dimensional model vs (1)

EXAMPLE OF A DATA CUBE IN USE

This online system has a drag-and-drop option for creating nested tables, and drill-up and drill-down functionality based on the hierarchies of the various dimensions.

The system also has simple calculation options on tabular data, and hide and show options to keep certain unwanted rows or columns from being displayed on the screen.

Find and search options are available for finding a particular piece of information in the tabular data of a cube.

Page 18: Multi dimensional model vs (1)

CREATING YOUR OWN DATA CUBE

There are a variety of tools available that allow you to build your own data cube, such as Microsoft Excel and Microsoft SQL Server.

The steps required are:

1. Choose a data source.
2. Create the query that extracts data from the database.
3. Create the cube from the extracted data.

The Contoso database that we used for the Dashboard project is a good example of a data source from which we can generate data cubes.

Use the Query Wizard to generate the query that you wish to build your cube on.

Page 19: Multi dimensional model vs (1)

CREATING YOUR OWN DATA CUBE

In the Query Wizard Finish screen, select Create an OLAP Cube from this query and click Finish.

The third step is then to use the OLAP Cube Wizard. This application allows you to turn your table columns into dimensions. For example, drag product_category, product_subcategory, and brand_name into the available dimension box so that they appear in that order. Rename the dimension ‘Product’.

The next step is to select the option that best fits the type of cube you want to create. For example, select Save a cube file containing all data for the cube. Enter a path and filename for the cube, and then click Finish.

Save the query definition that you have created. The cube wizard then creates the cube file. Once the cube is created, the PivotChart Wizard allows you to create a PivotTable report from the data in the cube.

http://msdn.microsoft.com/en-us/library/office/aa140038(v=office.10).aspx#odc_da_whatrcubes_topic5
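Outside Excel, the same three steps (choose a data source, query it, build the cube) can be sketched in Python with the standard sqlite3 module. This is not what the wizards do internally; the in-memory `sales` table stands in for a real source such as Contoso, and its rows, brand names and the `amount` column are invented. Only the three column names product_category, product_subcategory and brand_name are taken from the text above.

```python
import sqlite3
from collections import defaultdict

# Step 1: choose a data source (an in-memory stand-in, not the Contoso database).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product_category TEXT, product_subcategory TEXT,"
    " brand_name TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Audio", "Headphones", "BrandX", 20.0),
        ("Audio", "Headphones", "BrandX", 5.0),
        ("Audio", "Speakers", "BrandY", 40.0),
        ("TV", "LCD", "BrandX", 300.0),
    ],
)

# Step 2: create the query that extracts data from the database.
rows = conn.execute(
    "SELECT product_category, product_subcategory, brand_name, amount FROM sales"
).fetchall()
conn.close()

# Step 3: create the cube; the three columns become levels of a 'Product' dimension.
cube = defaultdict(float)
for category, subcategory, brand, amount in rows:
    cube[(category, subcategory, brand)] += amount  # most detailed level
    cube[(category, subcategory)] += amount         # pre-aggregated subtotal
    cube[(category,)] += amount                     # pre-aggregated category total
```

Keeping the levels in the order category, subcategory, brand mirrors the order in which the columns are dragged into the dimension box.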

Page 20: Multi dimensional model vs (1)

DATA WAREHOUSING & DATA MARTS

How do Data Cubes relate to Data Warehousing & Data Marts? Are they the same?

• Data Warehousing (DW) Definition
• Pros/Cons of DW
• Relation if any to Data Cubes

• Data Marts (DM) Definition
• Pros/Cons of DM
• Relation if any to Data Cubes

Page 21: Multi dimensional model vs (1)

DATA WAREHOUSING

What is a Data Warehouse?

A DW contains historical data derived from transaction data, but it can include data from other sources.

It separates the analysis workload from the transaction workload and enables an organisation to consolidate data from several sources for business users.

“Data Mining: Concepts & Techniques”, J. Han & M. Kamber

Page 22: Multi dimensional model vs (1)

DATA WAREHOUSING

“...The data warehouse is nothing more than the union of all the data marts...” - Ralph Kimball

“You can catch all the minnows in the ocean and stack them together and they still do not make a whale” - Bill Inmon

Page 23: Multi dimensional model vs (1)

DATA WAREHOUSING


Page 24: Multi dimensional model vs (1)

DATA WAREHOUSING

Benefits:

1. Gives the data …
2. Removes …
3. Potential for …
4. Increased productivity …
5. Example: US Insurance Company, B. Shin 2001

Problems:

1. Increased …
2. Maintenance …
3. Complexity …
4. Required …
5. Ownership …
6. Duration …

Page 25: Multi dimensional model vs (1)

DATA WAREHOUSING

Comparisons to Data Cubing:

1. Data cubes provide a …

2. Data cubes are used to …

3. From a design standpoint, it’s important to …

4. To put data in and get data out …

5. Some or all of these …

Page 26: Multi dimensional model vs (1)

DATA MART

The single most important issue …
A subset of a data warehouse that …

Characteristics include:

1. Focuses on …
2. Do not normally …
3. More easily …

How is a Data Mart different from a Data Warehouse?

A data warehouse, unlike a data mart …
They are essentially different architectural structures, even though, when viewed superficially and from afar, they look very similar.

Tumbleweed, oak tree example

Page 27: Multi dimensional model vs (1)

DATA MART

Differences between Data Warehouse & Mart:

Page 28: Multi dimensional model vs (1)

DATA MART


Page 29: Multi dimensional model vs (1)

DATA MART

Benefits of creating a data mart:

1. To give users …
2. To improve …
3. Building a data mart …
4. The cost of implementing …

Problems:

1. Functionality
2. Size
3. Load performance
4. Administration
5. Setup and configuration

Page 30: Multi dimensional model vs (1)

DATA MART

Comparisons to Data Cubing:

1. The data mart is typically housed in multidimensional technology which is great for …

2. Data Cubing provides a solid base for …

3. Data Cubing gives end users …

4. “To me, a Data Mart is just place where data gets dumped in a relatively flat, unusable format. Data Cubes is taking that data and making it dance.” (B. Quinn, 2008)

Page 31: Multi dimensional model vs (1)

MULTI-DIMENSIONAL MODEL VS. RELATIONAL DATABASES


Page 32: Multi dimensional model vs (1)

RELATIONAL DATABASES

Data is stored in relations (tables with rows and columns).

Records and fields in each table

Relationships between tables

“A shared repository of data” Sarma et al. (2011)

Page 33: Multi dimensional model vs (1)

OLTP

Online Transaction Processing

Data is processed immediately and is always kept current

Banking, inventory, scheduling, reservation systems.

Simple queries: insert, update, select

OLTP-oriented relational databases are poorly suited to complex analytical queries

Page 34: Multi dimensional model vs (1)

DATA WAREHOUSE

A large store of data accumulated from various databases

ETL Process

- Extract Data
- Transform Data
- Data Cleaning
- Load Data

Data Cube used for representing this data
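The ETL steps listed above can be sketched as a small pipeline. The two source lists, the field names `sku` and `qty`, and the cleaning rule (drop rows with a missing quantity) are all assumptions made for this example.

```python
# Extract: rows as they might arrive from two source systems (invented data).
source_a = [{"sku": "A1", "qty": "3"}, {"sku": "a2", "qty": "5"}]
source_b = [{"sku": "A1 ", "qty": "2"}, {"sku": "B7", "qty": None}]


def extract():
    """Gather rows from every source database."""
    return source_a + source_b


def transform(rows):
    """Bring every row to one format: trimmed upper-case SKU, integer quantity."""
    return [
        {
            "sku": r["sku"].strip().upper(),
            "qty": int(r["qty"]) if r["qty"] is not None else None,
        }
        for r in rows
    ]


def clean(rows):
    """Data cleaning: drop rows whose quantity is missing."""
    return [r for r in rows if r["qty"] is not None]


def load(rows, warehouse):
    """Accumulate quantities per SKU in the warehouse store."""
    for r in rows:
        warehouse[r["sku"]] = warehouse.get(r["sku"], 0) + r["qty"]
    return warehouse


warehouse = load(clean(transform(extract())), {})
```

After the load, the two differently formatted `A1` rows have been merged into a single warehouse entry, which is the point of transforming before loading.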

Page 35: Multi dimensional model vs (1)

DIMENSIONS AND MEASURES

Multi-dimensional model defined by fact table and dimension tables

Measure attribute:

- Saved from relational into the fact table
- Defines data in MDM model

Meta Data:

- Describes all the pertinent aspects of the data in the database fully and precisely
- Required for sources from relational database
- Determines data inserted into warehouse
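A fact table with its dimension tables might look like the following sketch, written with Python's standard sqlite3 module. The table and column names (`fact_sales`, `dim_product`, `dim_time`, `units`) and all rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER);
    -- The fact table holds the measure plus one key per dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        time_id INTEGER REFERENCES dim_time(time_id),
        units INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'tea'), (2, 'coffee');
    INSERT INTO dim_time VALUES (1, 2011), (2, 2012);
    INSERT INTO fact_sales VALUES (1, 1, 10), (1, 2, 15), (2, 2, 7);
    """
)

# Each dimension reaches the fact table through a single join.
units_by_product = dict(
    conn.execute(
        "SELECT p.name, SUM(f.units) FROM fact_sales f"
        " JOIN dim_product p ON p.product_id = f.product_id"
        " GROUP BY p.name"
    ).fetchall()
)
units_by_year = dict(
    conn.execute(
        "SELECT t.year, SUM(f.units) FROM fact_sales f"
        " JOIN dim_time t ON t.time_id = f.time_id"
        " GROUP BY t.year"
    ).fetchall()
)
conn.close()
```

The measure (`units`) lives only in the fact table, while the descriptive values live in the dimension tables, so any question about one dimension needs just one join.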

Page 36: Multi dimensional model vs (1)

RELATIONAL VS. MULTI-DIMENSIONAL

  | Relational Database | Multi-Dimensional Cube
1 | Complex: different tables and relationships | Simple: dimension table has a direct relationship with the fact table
2 | Flexible | Rigid
3 | Normalization common | Repetition allowed
4 | OLTP: data updated frequently | OLAP: minimum number of joins, which is provided in multi-dimensional by a single join to a fact table
5 | Data is stored in tables | Data is stored in cubes
6 | Table fields store actual data | Dimensions and measures store actual data
7 | Table size is measured in records | Cube size is measured in cell-sets
8 | Keywords | Questions or “Verbiage”
9 | Fundamental business tasks | Planning, problem solving, decision making

Page 37: Multi dimensional model vs (1)

ONLINE ANALYTICAL PROCESSING (OLAP)

“Multi-dimensional models lie at the core of OLAP” Jensen et al. (2007)

Provides quick answers to queries that aggregate large amounts of data to find trends and patterns.

Well suited to multidimensional data organization

Specific questions; answers needed quickly
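One way such quick answers can be obtained is by pre-computing totals, so that an aggregate question becomes a lookup. The sketch below is a simplified illustration of that idea, not a description of any particular OLAP product; the rows, dimension names and the `"ALL"` marker are invented.

```python
from collections import defaultdict
from itertools import combinations

# Detail rows: (product, region, year, units); all values are invented.
rows = [
    ("tea", "north", 2011, 10),
    ("tea", "south", 2011, 4),
    ("coffee", "north", 2012, 7),
    ("tea", "north", 2012, 6),
]
DIMS = ("product", "region", "year")


def build_aggregates(rows):
    """Pre-compute the total for every combination of kept dimensions."""
    aggregates = defaultdict(int)
    for *coords, units in rows:
        for size in range(len(DIMS) + 1):
            for kept in combinations(range(len(DIMS)), size):
                # Dimensions that are not kept are summed over and marked "ALL".
                key = tuple(coords[i] if i in kept else "ALL" for i in range(len(DIMS)))
                aggregates[key] += units
    return aggregates


aggregates = build_aggregates(rows)

# An aggregate question is now a lookup rather than a scan of the detail rows.
tea_2011 = aggregates[("tea", "ALL", 2011)]
```

The trade-off is storage: every row contributes to 2^n aggregate cells for n dimensions, so real systems usually pre-compute only a chosen subset.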

Page 38: Multi dimensional model vs (1)

SIMPLICITY AND CONSTRUCTION

"The central attraction of the dimensional model of a business is its simplicity.... that simplicity is the fundamental key that allows users to understand databases, and allows software to navigate databases efficiently."

Measures have same relationships

Easily analysed and displayed together

“Those with little experience find multidimensional model queries only take a short time to master.”

Page 39: Multi dimensional model vs (1)

ADVANTAGES AND DRAWBACKS OF MULTI-DIMENSIONAL MODELLING


Page 40: Multi dimensional model vs (1)

MULTIDIMENSIONAL CUBE OVERVIEW - ADVANTAGES

o Tables - nature and structure no longer forced on user.
o Captures health of organisation – allows drill down options.
o Incorporates business rules automatically – and not exposed to users.
o Automatic pre-populated data – saving time and resources

Page 41: Multi dimensional model vs (1)

MULTIDIMENSIONAL CUBE OVERVIEW - DRAWBACKS

o User misuse and misunderstanding.
o Rigid and inflexible nature.
o Too specific – manipulation of data
o Not suitable for ad-hoc queries, unless within the dimensions of the "cube space"

[Figure: MOLAP and ROLAP plotted by Query Performance (OK to Good) against Analysis (Simple to Complex)]

Page 42: Multi dimensional model vs (1)

MOLAP SERVER

Advantages

o Suited to performance-constrained environments.
o Used in mission-critical operations.

Disadvantages

o Inflexible and limited data allowance.
o Unavailable data.
o Specifics of summarised data.

[Diagram labels: Warehouse, Periodic load, MDDB, Server, Query, Data, User]

Page 43: Multi dimensional model vs (1)

ROLAP SERVER

Advantages

o Not limited by cube data – ‘live fetch’.
o Maintains functionality of relational database

Disadvantages

o Inhibited performance on large databases.
o Limited by SQL functionality

[Diagram labels: Warehouse, Live fetch, Data cache, Cache, Server, Query, Data, User]
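The contrast between the two server types (periodic load into an MDDB versus live fetch with a cache) can be sketched as follows. Both classes are hypothetical and heavily simplified; the warehouse rows are invented.

```python
# Warehouse detail rows: (product, units); all values are invented.
warehouse = [("tea", 10), ("coffee", 7), ("tea", 5)]


class MolapServer:
    """Answers from a multidimensional store (MDDB) filled by a periodic load."""

    def __init__(self):
        self.mddb = {}

    def periodic_load(self, rows):
        # Rebuild the pre-aggregated store from the warehouse.
        self.mddb = {}
        for product, units in rows:
            self.mddb[product] = self.mddb.get(product, 0) + units

    def query(self, product):
        # Fast lookup, but only as fresh as the last periodic load.
        return self.mddb.get(product, 0)


class RolapServer:
    """Fetches from the warehouse at query time and caches the result."""

    def __init__(self, rows):
        self.rows = rows
        self.cache = {}

    def query(self, product):
        if product not in self.cache:  # live fetch on a cache miss
            self.cache[product] = sum(u for p, u in self.rows if p == product)
        return self.cache[product]


molap = MolapServer()
molap.periodic_load(warehouse)
rolap = RolapServer(warehouse)

warehouse.append(("mint", 3))  # new data arrives after the MOLAP load
```

In this sketch the MOLAP server does not see the new `mint` row until its next load (the "unavailable data" drawback), while the ROLAP server finds it at the cost of scanning the warehouse on the first query.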

Page 44: Multi dimensional model vs (1)

THANK YOU FOR LISTENING

Any questions?

We Guarantee this Presentation was made with 100% natural sources, 0% Wikipedia