Module 5 Metadata, Tools, and Data Warehousing
description
Transcript of Module 5 Metadata, Tools, and Data Warehousing
ITEC 450
1
MODULE 5 Metadata, Tools, and Data WarehousingSection 4 Data Warehouse Administration
2
ITEC 450DATA WAREHOUSE AND CHARACTERISTICSA data warehouse is a subject-oriented, integrated, time-
variant, non-volatile collection of data that is designed for query and analysis rather than for transaction processes.
Subject-oriented – data pertains to a particular subject instead of the many subjects pertinent to the company’s ongoing operations.
Integrated – consistent naming conventions, formats, encoding structures; from multiple data sources
Time-variant – data is identified with a particular time period, can study trends and changes
Non-updatable – data is stable in a data warehouse. Data loaded, and should not be removed.
3
ITEC 450COMPARISON OF DATABASE CHARACTERISTICS
4
ITEC 450DATA WAREHOUSE AND BUSINESS INTELLIGENCE A data warehouse usually contains historical
data derived from transaction data and other sources.
It enables an organization to consolidate data.
It includes An extraction, transportation, transformation,
and loading (ETL) solution An online analytical processing (OLAP) engine Client analysis tools Reporting
5
ITEC 450ANALYTICAL VS. TRANSACTION PROCESSING Analytical processing – informational
systems DSS – decision support system OLAP – online analytical processing Data mining – the process of mining or discovery of new
information in terms of patterns or rules from vast amounts of data
Transaction processing – operational system OLTP – online transaction processing
6
ITEC 450
DATA WAREHOUSE DESIGN Star schema - data modeling technique used
to map multidimensional decision support data into a relational database. It is excellent for ad-hoc queries, but bad for online transaction processing. It contains four components: Fact table Dimension tables Attributes Attribute hierarchies
Snowflake schema – a star schema in which the dimension tables have additional relationships
7
ITEC 450
STAR SCHEMA COMPONENTS
8
ITEC 450
STAR SCHEMA EXAMPLE
9
ITEC 450
DATA MOVEMENT – ETL PROCESS ETL – Extract, Transform, and Load Capture – extract or obtaining a snapshot of a chosen
subset of the source data for loading into the data warehouse
Scrub or data cleansing – uses pattern recognition and AI techniques to upgrade data quality
Transform – convert data from format of operational system to format of data warehouse
Load – place transformed data into the warehouse and create indexes
10
ITEC 450
DATA WAREHOUSE PERFORMANCE Perspectives of data warehouse performance
Extract performance – how ETL process performs Data management – database design and data quality Query performance – OLAP tuning Server performance – hardware support
Automated summary tables Provide a proper set of aggregate information Commonly implement with materialized views or batch
operation tables DBMS features to support data warehousing
Materialized views – automatically creation of summaries Bitmap indexes – widely used in data warehousing, in
addition to B-tree Parallel execution – multiple processes work together
simultaneously to run a single SQL statement
ITEC 450
11
MODULE 5 Metadata, Tools, and Data WarehousingSection 5 DBA Rules of Thumb
12
ITEC 450
THE RULES OF THUMB Personal DBA handbook
Write down your own experience Categorize them in a searchable note or repository
Backup everything and plan for worst all the time Before making any changes, ensure that you can
recover from them Automation and share your knowledge
Create a systematic way to troubleshoot problems Create, reuse and share scripts Knowledge sharing will open many revenues for you
Next levels Understand the business, not just the technology Keep up-to-date on technology
13
ITEC 450COURSE SUMMARY (YOUR LEARNING) DBA Roles and Responsibilities DBMS Architecture, Physical and Logical
Structures DBMS Installation and Database Creation Database Connectivity and Network
Components Database Security and Audit Capability Database Backup and Recovery Database Monitoring, DBMS System Tuning,
Physical Configuration Optimization SQL Query Coding and Tuning, Data
Loading Database Metadata, Data Dictionary Data Warehouse Characteristics and
Overview