Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU

Date posted: 22-Dec-2015

Page 1: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Data Warehouse

DSS

Business Intelligence

Minder.Chen@CSUCI.EDU

Page 2: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Data Warehouse - 2 © Minder Chen, 2004-2010

BI Evolution

• MIS Reports (History): hand coded; single system data; summary metrics; extreme latency

• Decision Support Systems (Legacy): report writers; joined operating data; statistical metrics; extreme cost

• Business Intelligence (Current): OLAP; DW; predictive metrics; extreme ‘infoglut’

• Business Performance Management (2005+): dashboard/mining; enterprise portals; recommendations; extreme integration

Moving beyond one-way info delivery to true BPM

Source: META Group Inc.

Page 3: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

BI Questions

• What happened?
    – What were our total sales this month?

• What’s happening?
    – Are our sales going up or down? (trend analysis)

• Why?
    – Why have sales gone down?

• What will happen?
    – Forecasting & “What If” Analysis

• What do I want to happen?
    – Planning & Targets

Source: Bill Baker, Microsoft

Page 4: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

BI

Business Intelligence (BI) is the process of gathering meaningful information to answer questions and identify significant trends or patterns, giving key stakeholders the ability to make better business decisions.

“The key in business is to know something that nobody else knows.”

-- Aristotle Onassis

“To understand is to perceive patterns.”

— Sir Isaiah Berlin

"The manager asks how and when, the leader asks what and why."

— “On Becoming a Leader” by Warren Bennis

Page 5: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

BI Definition

• Business intelligence provides the ability to transform data into usable, actionable information for business purposes. BI requires:

– Collections of quality data and metadata important to the business

– The application of analytic tools, techniques, and processes

– The knowledge and skills to use business analysis to identify/create business information

– The organizational skills and motivation to develop a BI program and apply the results back into the business

Page 6: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Business Intelligence

Increasing potential to support business decisions. Layers, from highest to lowest potential:

• Making Decisions

• Data Presentation: Visualization Techniques

• Data Mining: Information Discovery

• Data Exploration: OLAP, MDA, Statistical Analysis, Querying and Reporting

• Data Warehouses / Data Marts

• Data Sources (Paper, Files, Information Providers, Database Systems, OLTP)

Roles, in the same order: (MIS) End User, Business Analyst, Data Analyst, DBA

Page 7: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Inmon's Definition of Data Warehouse – Data View

• A warehouse is a

– subject-oriented,

– integrated,

– time-variant and

– non-volatile

collection of data in support of management's decision making process.

– Bill Inmon in 1990

Source: http://www.intranetjournal.com/features/datawarehousing.html

Page 8: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Inmon's Definition Explained

• Subject-oriented: Data warehouses are organized around major subjects such as customer, supplier, product, and sales. They focus on modeling and analysis to support planning and management decisions rather than operations and transaction processing.

• Integrated: Data warehouses involve an integration of sources such as relational databases, flat files, and on-line transaction records. Processes such as data cleansing and data scrubbing achieve data consistency in naming conventions, encoding structures, and attribute measures.

• Time-variant: Data contained in the warehouse provide information from an historical perspective.

• Nonvolatile: Data contained in the warehouse are physically separate from data present in the operational environment.

Page 9: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

The Data Warehouse Process

Components: Source Systems, Data Warehouse, Data Marts and cubes, Clients

1. Design the Data Warehouse

2. Populate Data Warehouse

3. Create OLAP Cubes

4. Query Data

Client tools: Query Tools, Reporting, Analysis, Data Mining

Page 10: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

BI Architecture

Source: http://www.rpi.edu/datawarehouse/docs/DW-Architecture.pdf

Page 11: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

BI Infrastructure Components

Page 12: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Key Concepts in BI Development Lifecycle

Application

Data

Technology

Page 13: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Performance Dashboards for Information Delivery

Page 14: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

OLTP Versus Business Intelligence: Who asks what?

OLTP Questions

• When did that order ship?

• How many units are in inventory?

• Does this customer have unpaid bills?

• Are any of customer X’s line items on backorder?

Analysis Questions

• What factors affect order processing time?

• How did each product line (or product) contribute to profit last quarter?

• Which products have the lowest Gross Margin?

• What is the value of items on backorder, and is it trending up or down over time?

Page 15: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Classification of Entity Types

Page 16: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Transaction Level Order Item Fact Table

Page 17: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

OLTP Versus OLAP

OLTP Questions

• When did that order ship?

• How many units are in inventory?

• Does this customer have unpaid bills?

• Are any of customer X’s line items on backorder?

OLAP Questions

• What factors affect order processing time?

• How did each product line (or product) contribute to profit last quarter?

• Which products have the lowest Gross Margin?

• What is the value of items on backorder, and is it trending up or down over time?
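As a rough illustration of the contrast above, the sketch below runs one lookup-style question and one analysis-style question against a tiny in-memory SQLite table. The `order_items` table, its columns, and the sample rows are hypothetical and invented for this example; they do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE order_items (
           order_id INTEGER, product_line TEXT, ship_date TEXT,
           revenue REAL, cost REAL)"""
)
# Hypothetical sample data
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Dairy", "2004-03-02", 120.0, 90.0),
        (1002, "Bakery", "2004-03-05", 80.0, 50.0),
        (1003, "Dairy", "2004-03-09", 200.0, 150.0),
    ],
)

# OLTP-style question: "When did that order ship?" (one row, looked up by key)
ship_date = conn.execute(
    "SELECT ship_date FROM order_items WHERE order_id = ?", (1002,)
).fetchone()[0]

# OLAP-style question: "How did each product line contribute to profit?"
# (reads many rows and aggregates them)
profit_by_line = dict(
    conn.execute(
        """SELECT product_line, SUM(revenue - cost)
           FROM order_items GROUP BY product_line"""
    ).fetchall()
)

print(ship_date)
print(profit_by_line)
```

The first query touches a single row; the second summarizes the whole table, which is the kind of workload a warehouse is designed for.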

Page 18: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Requirements

Page 19: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Dimensional Design Process

• Select the business process to model

• Declare the grain of the business process/data in the fact table

• Choose the dimensions that apply to each fact table row

• Identify the numeric facts that will populate each fact table row

Business Requirements

Data Realities

Page 20: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Star Schema

Source: Moody and Kortink, "From ER Models to Dimensional Models: Bridging the Gap between OLTP and OLAP Design, Part I," Business Intelligence Journal, Summer 2003, pp. 7-24.

Page 21: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Identifying Measures and Dimensions

Measures (Performance Measures for KPI)

The attribute varies continuously:

• Balance • Units Sold • Cost • Sales

Dimensions (Performance Drivers)

The attribute is perceived as a constant or discrete value:

• Description • Location • Color • Size

Page 22: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

A Dimensional Model for Grocery Store Sales

Why?

Page 23: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Fact Table

Dimensions (keys): DateID, ProductID, CustomerID

Measures: Units, Dollars

The Fact Table contains keys and units of measure.

Measurements of business events.

Page 24: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Fact Tables

Fact tables have the following characteristics:

• Contain numeric measures (metrics) of the business

• May contain summarized (aggregated) data

• May contain date-stamped data

• Are typically additive

• Have a key that is typically a concatenated key composed of the primary keys of the dimensions

• Are joined to dimension tables through foreign keys that reference primary keys in the dimension tables
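The characteristics above can be sketched as a small star schema in SQLite. The column names DateID, ProductID, CustomerID, Units, and Dollars follow the Fact Table slide; the table names (`date_dim`, `product_dim`, `customer_dim`, `sales_fact`), the extra dimension columns, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE date_dim     (DateID     INTEGER PRIMARY KEY, CalendarDate TEXT);
    CREATE TABLE product_dim  (ProductID  INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE customer_dim (CustomerID INTEGER PRIMARY KEY, Name TEXT);

    CREATE TABLE sales_fact (
        DateID     INTEGER NOT NULL REFERENCES date_dim(DateID),
        ProductID  INTEGER NOT NULL REFERENCES product_dim(ProductID),
        CustomerID INTEGER NOT NULL REFERENCES customer_dim(CustomerID),
        Units      INTEGER,   -- additive measure
        Dollars    REAL,      -- additive measure
        PRIMARY KEY (DateID, ProductID, CustomerID)  -- concatenated key
    );

    INSERT INTO date_dim VALUES (1, '2004-01-15');
    INSERT INTO product_dim VALUES (10, 'Milk 1L');
    INSERT INTO customer_dim VALUES (100, 'A. Shopper');
    INSERT INTO sales_fact VALUES (1, 10, 100, 3, 4.50);
    """
)

# A fact row that points at a product missing from the dimension is rejected.
try:
    conn.execute("INSERT INTO sales_fact VALUES (1, 999, 100, 1, 1.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

total_units, total_dollars = conn.execute(
    "SELECT SUM(Units), SUM(Dollars) FROM sales_fact"
).fetchone()
print(fk_rejected, total_units, total_dollars)
```

Declaring the three foreign keys as the primary key is one way to express the concatenated key; some designs add a separate surrogate key on the fact table instead.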

Page 25: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Store Dimension

• It is not uncommon to represent multiple hierarchies in a dimension table. Ideally, the attribute names and values should be unique across the multiple hierarchies.

Page 26: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Inside a Dimension Table

• Dimension table key: Uniquely identifies each row. Use a surrogate key (integer).

• Table is wide: A table may have many attributes (columns).

• Textual attributes. Descriptive attributes in string format. No numerical values for calculation.

• Attributes not directly related: E.g., product color and product package size. No transitive dependency.

• Not normalized (star schema).

• Drilling down and rolling up along a dimension.

• One or more hierarchies within a dimension.

• Fewer records (compared with the fact table).

Page 27: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Product Dimension

• SKU: Stock Keeping Unit

• Hierarchy: Department → Category → Subcategory → Brand → Product

Page 28: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Hierarchy

Page 29: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Levels and Members

Year Quarter Month

1999 Quarter 1 Jan

1999 Quarter 1 Feb

1999 Quarter 1 Mar

1999 Quarter 2 Apr

1999 Quarter 2 May

1999 Quarter 2 Jun

1999 Quarter 3 Jul

1999 Quarter 3 Aug

1999 Quarter 3 Sep

1999 Quarter 4 Oct

1999 Quarter 4 Nov

1999 Quarter 4 Dec
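The members listed above can be generated rather than typed in. The following is a minimal sketch, assuming a three-level Year, Quarter, Month hierarchy as on the slide; the function name `month_members` is hypothetical.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def month_members(year):
    """Return one (year, quarter, month) tuple per month-level member."""
    rows = []
    for index, month in enumerate(MONTHS):
        quarter = index // 3 + 1  # months 0-2 -> Quarter 1, 3-5 -> Quarter 2, ...
        rows.append((year, f"Quarter {quarter}", month))
    return rows

members = month_members(1999)

# Members of the Quarter level, derived from the month-level rows
quarters = sorted({quarter for _, quarter, _ in members})

for row in members:
    print(*row)
```

Each tuple is a path through the hierarchy, so rolling up from Month to Quarter amounts to grouping on the first two fields.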

Page 30: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Operations in Multidimensional Data Model

• Aggregation (roll-up)

– dimension reduction: e.g., total sales by city

– summarization over aggregate hierarchy: e.g., total sales by city and year -> total sales by region and by year

• Selection (slice) defines a subcube

– e.g., sales where city = Palo Alto and date = 1/15/96

• Navigation to detailed data (drill-down)

– e.g., (sales - expense) by city, top 3% of cities by average income

• Visualization Operations (e.g., Pivot)
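A small sketch of roll-up, slice, and pivot over an in-memory list of cube cells follows. The cities, regions, years, and sales figures are hypothetical sample data, and `roll_up` is an illustrative helper rather than part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical cells of a small sales cube: (city, region, year, sales)
cells = [
    ("Palo Alto", "West", 1996, 100),
    ("Palo Alto", "West", 1997, 120),
    ("San Jose", "West", 1996, 80),
    ("Boston", "East", 1996, 70),
    ("Boston", "East", 1997, 90),
]

def roll_up(rows, key):
    """Sum sales over every dimension that `key` does not keep."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)

# Dimension reduction: total sales by city (the year dimension is dropped)
by_city = roll_up(cells, lambda r: r[0])

# Summarization over a hierarchy: city and year -> region and year
by_region_year = roll_up(cells, lambda r: (r[1], r[2]))

# Slice: fix one dimension value to select a subcube
palo_alto = [r for r in cells if r[0] == "Palo Alto"]

# Pivot: regions become rows, years become columns
pivot = defaultdict(dict)
for (region, year), total in by_region_year.items():
    pivot[region][year] = total

print(by_city)
print(dict(pivot))
```

Drill-down is the reverse of the second roll-up: returning from the region totals to the city-level rows that produced them.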

Page 31: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

We can drill down or up on attributes from more than one explicit hierarchy and with attributes that are part of no hierarchy.

Drilling down in a data mart is nothing more than adding row headers from the dimension tables. Drilling up is removing row headers.
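In SQL terms, "adding row headers" corresponds to adding dimension attributes to the SELECT and GROUP BY lists. The sketch below assumes a hypothetical `product_dim` with Department and Brand attributes and a `sales_fact` table; the names and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product_dim (ProductID INTEGER PRIMARY KEY, Department TEXT, Brand TEXT);
    CREATE TABLE sales_fact (ProductID INTEGER, Dollars REAL);
    INSERT INTO product_dim VALUES (1, 'Bakery', 'Baked Well'), (2, 'Bakery', 'Fluffy'),
                                   (3, 'Frozen', 'Coldpack');
    INSERT INTO sales_fact VALUES (1, 10.0), (2, 5.0), (3, 7.0), (1, 2.0);
    """
)

ALLOWED_HEADERS = ("Department", "Brand")

def sales_by(*headers):
    """Total Dollars grouped by the given dimension attributes (row headers)."""
    if not headers or any(h not in ALLOWED_HEADERS for h in headers):
        raise ValueError("unknown row header")
    cols = ", ".join(headers)  # safe to interpolate: values are checked against ALLOWED_HEADERS
    sql = (
        f"SELECT {cols}, SUM(f.Dollars) FROM sales_fact f "
        f"JOIN product_dim p ON p.ProductID = f.ProductID "
        f"GROUP BY {cols} ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()

summary = sales_by("Department")           # drilled up: one row header
detail = sales_by("Department", "Brand")   # drilled down: one more row header

print(summary)
print(detail)
```

Brand is not required to sit in a formal hierarchy under Department for this to work, which matches the point that drilling can use any dimension attribute.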

Page 32: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Avoid Null Key in the Fact Table

• Include a row in the corresponding dimension table to identify that the dimension is not applicable to the measurement.

Sales Fact Table

11: No Promotion
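One way to picture the "No Promotion" row: sales made without a promotion point at that dimension row instead of carrying a NULL key, so an inner join still returns them. Key 11 is taken from the slide; the table names, the second promotion row, and the dollar amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE promotion_dim (PromotionKey INTEGER PRIMARY KEY, PromotionName TEXT);
    CREATE TABLE sales_fact (
        PromotionKey INTEGER NOT NULL REFERENCES promotion_dim(PromotionKey),
        Dollars REAL
    );
    INSERT INTO promotion_dim VALUES (11, 'No Promotion'), (12, 'Spring Coupon');
    -- Sales made without any promotion use key 11 instead of NULL
    INSERT INTO sales_fact VALUES (11, 20.0), (12, 15.0), (11, 5.0);
    """
)

# The inner join keeps the unpromoted sales because key 11 has a matching row.
rows = conn.execute(
    """SELECT p.PromotionName, SUM(f.Dollars)
       FROM sales_fact f JOIN promotion_dim p USING (PromotionKey)
       GROUP BY p.PromotionName ORDER BY p.PromotionName"""
).fetchall()

print(rows)
```

With a NULL key, those two sales would drop out of the join and the report total would silently understate revenue.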

Page 33: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Querying the Retail Sales Schema

Page 34: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Dragging and dropping dimensional attributes and facts into a simple report

Page 35: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

ETL

ETL = Extract, Transform, Load

• Moving data from production systems to DW

• Checking data integrity

• Assigning surrogate key values

• Collecting data from disparate systems

• Reorganizing data
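The tasks above can be sketched as three small functions. Everything here is illustrative: the two source systems, the field names, and the rule that rejects rows with a missing name are assumptions, and a real ETL job would read from actual production systems rather than hard-coded lists.

```python
def extract():
    """Stand-in for collecting rows from two disparate source systems."""
    pos_system = [{"sku": "A-1", "name": " milk 1l "}, {"sku": "B-2", "name": "Bread"}]
    web_system = [{"sku": "A-1", "name": "Milk 1L"}, {"sku": "C-3", "name": None}]
    return pos_system + web_system

def transform(rows):
    """Check data integrity and standardize values."""
    clean = []
    for row in rows:
        if not row["name"]:  # integrity check: skip rows with a missing name
            continue
        clean.append({"sku": row["sku"], "name": row["name"].strip().title()})
    return clean

def load(rows, dimension):
    """Assign surrogate keys and add products not yet in the dimension."""
    for row in rows:
        if row["sku"] not in dimension:
            dimension[row["sku"]] = {
                "product_key": len(dimension) + 1,  # simple integer surrogate key
                "name": row["name"],
            }
    return dimension

product_dim = load(transform(extract()), {})
print(product_dim)
```

The surrogate key is independent of the source SKU, so the same product arriving from both systems ends up as a single dimension row.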

Page 36: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Building the Warehouse: Transforming Data

Page 37: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Pivot Table in Excel

Page 38: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Use of Data Mining

• Customer profiling

• Market segmentation

• Buying pattern affinities

• Database marketing

• Credit scoring and risk analysis

Page 39: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

OLAP and Data Mining Address Different Types of Questions

While reporting and OLAP are informative about past facts, only data mining can help you predict the future of your business.

OLAP: What was the response rate to our mailing?
Data Mining: What is the profile of people who are likely to respond to future mailings?

OLAP: How many units of our new product did we sell to our existing customers?
Data Mining: Which existing customers are likely to buy our next new product?

OLAP: Who were my 10 best customers last year?
Data Mining: Which 10 customers offer me the greatest profit potential?

OLAP: Which customers didn't renew their policies last month?
Data Mining: Which customers are likely to switch to the competition in the next six months?

OLAP: Which customers defaulted on their loans?
Data Mining: Is this customer likely to be a good credit risk?

OLAP: What were sales by region last quarter?
Data Mining: What are expected sales by region next year?

OLAP: What percentage of the parts we produced yesterday are defective?
Data Mining: What can I do to improve throughput and reduce scrap?

Source: http://www.dmreview.com/editorial/dmreview/print_action.cfm?articleId=2367

Page 40: Data Warehouse DSS Business Intelligence Minder.Chen@CSUCI.EDU.

Balanced Scorecard

[Figure: strategy map for "Improve Stakeholder Value" (Stakeholder Value, ROCE). Labels recovered from the diagram are listed below by perspective.]

• Financial Perspective: Revenue Growth Strategy (Build the Franchise, Increase Customer Value); Productivity Strategy (Improve Cost Structure, Improve Asset Utilization); New Revenue Services, Customer Profitability, Cost per Unit, Asset Utilization

• Customer Perspective: Customer Value Proposition (Operational Excellence, Customer Intimacy, Product Leadership); Customer Acquisition, Customer Retention, Customer Satisfaction; Product/Service Attributes (Price, Quality, Time, Function); Relationship (Service, Relations); Image (Brand)

• Internal Perspective: “Build the Franchise” (Innovation Processes); “Increase Customer Value” (Customer Management Processes); “Achieve Operational Excellence” (Operations & Logistics Processes); “Become a Good Neighbor” (Regulatory & Environmental Processes)

• Learning & Growth Perspective: A Motivated and Prepared Workforce (Strategic Competencies, Strategic Technologies, Climate for Action)

UA’s Existing Data Warehouse