Ppt

DATA WAREHOUSE By: Ravi Ranjan

Transcript of Ppt

Page 1: Ppt

DATA WAREHOUSE

By: Ravi Ranjan

Page 2: Ppt

DEFINITION

Data Warehouse: A collection of corporate information, derived directly from operational systems and some external data sources. Its specific purpose is to support business decisions, not business operations.

Page 3: Ppt

THE PURPOSE OF DATA WAREHOUSING

Realize the value of data
• Data / information is an asset
• Methods to realize the value (Reporting, Analysis, etc.)

Make better decisions
• Turn data into information
• Create competitive advantage
• Methods to support the decision-making process (EIS, DSS, etc.)

Page 4: Ppt

Data Warehouse Components

• Staging Area
  • A preparatory repository where transaction data can be transformed for use in the data warehouse
• Data Mart
  • Traditional dimensionally modeled set of dimension and fact tables
  • Per Kimball, a data warehouse is the union of a set of data marts
• Operational Data Store (ODS)
  • Modeled to support near real-time reporting needs
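
A dimensionally modeled data mart can be sketched as one fact table joined to dimension tables. The example below is only an illustration using Python's built-in sqlite3 module; the table names (dim_product, dim_date, fact_sales), the columns, and the sample rows are hypothetical and do not come from the slides.

```python
import sqlite3


def build_sales_mart(conn):
    """Create a tiny star schema: two dimension tables and one fact table."""
    conn.executescript(
        """
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            amount REAL
        );
        """
    )
    # Hypothetical sample data, for illustration only.
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Widget"), (2, "Gadget")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(20240101, 2024, 1), (20240201, 2024, 2)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 20240101, 100.0), (1, 20240201, 50.0), (2, 20240101, 75.0)])


def sales_by_product(conn):
    """Aggregate the fact table by a dimension attribute."""
    return conn.execute(
        """
        SELECT p.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.name
        ORDER BY p.name
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_sales_mart(conn)
print(sales_by_product(conn))
```

The fact table holds only keys and measures; descriptive attributes stay in the dimensions, which is what lets the same facts be sliced by product, date, or any other dimension.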

Page 5: Ppt

DATA WAREHOUSE FUNCTIONALITY

[Diagram: data sources feeding the Data Warehouse Engine]

Data Warehouse Engine
• Optimized Loader, Extraction, Cleansing
• Analyze, Query
• Metadata Repository

Sources
• Relational Databases
• Legacy Data
• Purchased Data
• ERP Systems
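
The extraction, cleansing, and loading functions named on this slide can be sketched as three small steps. This is a minimal illustration, assuming source rows arrive as dictionaries; the function names, the orders table, and the sample legacy rows are hypothetical and not part of the presentation.

```python
import sqlite3


def extract(source_rows):
    """Pull rows from a source system (here, just an in-memory list)."""
    return list(source_rows)


def cleanse(rows):
    """Normalize names and drop rows that lack a customer or an amount."""
    cleaned = []
    for row in rows:
        name = (row.get("customer") or "").strip().title()
        amount = row.get("amount")
        if not name or amount is None:
            continue  # incomplete record: skip it rather than load bad data
        cleaned.append((name, float(amount)))
    return cleaned


def load(conn, rows):
    """Bulk-insert cleansed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


# Hypothetical legacy data with typical quality problems.
legacy = [
    {"customer": "  alice ", "amount": "10.5"},
    {"customer": "", "amount": "3"},
    {"customer": "bob", "amount": None},
    {"customer": "carol", "amount": 7},
]

conn = sqlite3.connect(":memory:")
loaded = load(conn, cleanse(extract(legacy)))
print(loaded, conn.execute("SELECT * FROM orders ORDER BY customer").fetchall())
```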

Page 6: Ppt

EVOLUTION OF DATA WAREHOUSE ARCHITECTURE

Top-Down Architecture

Bottom-Up Architecture

Enterprise Data Mart Architecture

Data Stage/Data Mart Architecture

Page 7: Ppt

VERY LARGE DATABASES

• Terabytes -- 10^12 bytes: Wal-Mart -- 24 Terabytes
• Petabytes -- 10^15 bytes: Geographic Information Systems
• Exabytes -- 10^18 bytes: National Medical Records
• Zettabytes -- 10^21 bytes: Weather images
• Yottabytes -- 10^24 bytes: Intelligence Agency Videos

WAREHOUSES ARE VERY LARGE DATABASES
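
To make the scale concrete, the powers of ten on this slide can be turned into byte counts. This is a small sketch using the standard SI decimal prefixes (the prefix for 10^24 is yotta); the UNITS table and to_bytes helper are illustrative names, not part of the presentation.

```python
# Decimal (SI) exponents for the units listed on the slide.
UNITS = {
    "terabyte": 12,
    "petabyte": 15,
    "exabyte": 18,
    "zettabyte": 21,
    "yottabyte": 24,
}


def to_bytes(value, unit):
    """Convert a quantity in the given unit to a number of bytes."""
    return value * 10 ** UNITS[unit]


# The 24-terabyte figure quoted on the slide, expressed in bytes.
walmart_bytes = to_bytes(24, "terabyte")
print(f"{walmart_bytes:,} bytes")
```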

Page 8: Ppt

COMPLEXITIES OF CREATING A DATA WAREHOUSE

Incomplete errors
• Missing Fields
• Records or Fields That, by Design, are not Being Recorded

Incorrect errors
• Wrong Calculations, Aggregations
• Duplicate Records
• Wrong Information Entered into Source System
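
Two of the error types above, missing fields and duplicate records, can be detected mechanically before data is loaded. The following is a minimal sketch, assuming records are dictionaries; the helper names, field names, and sample records are hypothetical.

```python
def find_incomplete(records, required_fields):
    """Return indexes of records where a required field is absent, None, or empty."""
    return [
        i
        for i, rec in enumerate(records)
        if any(rec.get(field) in (None, "") for field in required_fields)
    ]


def find_duplicates(records, key_fields):
    """Return indexes of records whose key was already seen earlier in the list."""
    seen = set()
    duplicates = []
    for i, rec in enumerate(records):
        key = tuple(rec.get(field) for field in key_fields)
        if key in seen:
            duplicates.append(i)
        else:
            seen.add(key)
    return duplicates


# Hypothetical source records with typical problems.
records = [
    {"id": 1, "name": "Alice", "state": "NY"},
    {"id": 2, "name": "", "state": "CA"},       # empty field
    {"id": 1, "name": "Alice", "state": "NY"},  # duplicate key
    {"id": 3, "name": "Carol"},                 # field not recorded
]

print(find_incomplete(records, ["id", "name", "state"]))
print(find_duplicates(records, ["id"]))
```

Wrong calculations and wrong information entered at the source are harder to catch this way, since the values are present and well-formed; they usually need business rules or comparison against a trusted reference.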

Page 9: Ppt

SUCCESS & FUTURE OF DATA WAREHOUSE

The Data Warehouse has successfully supported the increased needs of the State over the past eight years.

The need for growth continues, however, as the desire for more integrated data increases.

The Data Warehouse has software and tools in place to provide the functionality needed to support new enterprise Data Warehouse projects.

The future capabilities of the Data Warehouse can be expanded to include other programs and agencies.

Page 10: Ppt

DATA WAREHOUSE PITFALLS

You are going to spend much time extracting, cleaning, and loading data

You are going to find problems with systems feeding the data warehouse

You will find the need to store/validate data not being captured/validated by any existing system

Large scale data warehousing can become an exercise in data homogenizing

Page 11: Ppt

DATA WAREHOUSE PITFALLS…

The time it takes to load the warehouse will expand to fill the available load window... and then some

You are building a HIGH maintenance system

You will fail if you concentrate on resource optimization to the neglect of project, data, and customer management issues and an understanding of what adds value to the customer

Page 12: Ppt

BEST PRACTICES

Complete requirements and design

Prototyping is key to business understanding

Utilize proper aggregations and detailed data

Training is an ongoing process

Build data integrity checks into your system.
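
One common form of integrity check is reconciling what was loaded against the source: same number of rows, same control total. The sketch below is only one possible approach, assuming each side can supply a list of numeric amounts; the reconcile function and its tolerance parameter are hypothetical.

```python
import math


def reconcile(source_amounts, warehouse_amounts, tolerance=1e-9):
    """Compare row counts and control totals; return a list of problems found."""
    problems = []
    if len(source_amounts) != len(warehouse_amounts):
        problems.append(
            f"row count mismatch: {len(source_amounts)} vs {len(warehouse_amounts)}"
        )
    source_total = math.fsum(source_amounts)
    warehouse_total = math.fsum(warehouse_amounts)
    if not math.isclose(source_total, warehouse_total, abs_tol=tolerance):
        problems.append(f"total mismatch: {source_total} vs {warehouse_total}")
    return problems


# An empty list means the load reconciles with the source.
print(reconcile([1.0, 2.0], [1.0, 2.0]))
print(reconcile([1.0, 2.0], [1.0]))
```

Running a check like this after every load turns silent data loss into a visible failure.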

Page 13: Ppt

Thank You

Page 14: Ppt

Top-Down Architecture

Page 15: Ppt

Bottom-Up Architecture

Page 16: Ppt

Enterprise Data Mart Architecture

Page 17: Ppt

Data Stage/Data Mart Architecture
