Data Warehouse Design Xintao Wu University of North Carolina at Charlotte Nov 10, 2008.
-
Upload
marilynn-watson -
Category
Documents
-
view
217 -
download
0
Transcript of Data Warehouse Design Xintao Wu University of North Carolina at Charlotte Nov 10, 2008.
Data Warehouse Design
Xintao Wu
University of North Carolina at CharlotteNov 10, 2008
2
Organization
• Concepts Data Warehousing Concepts (Ch1)
• Logical Design Logical design in data warehouse (Ch2)
• Physical Design Physical design in data warehouses (Ch3) Hardware and I/O considerations Parallelism and partitioning in data warehouses Indexes (Ch6) Integrity constraints (Ch7) Basic Materialized views (Ch8) Advanced materialized views Dimensions (Ch10)
3
Organization• Managing DW environment
Overview of extraction, transformation, and loading Extraction Transportation Loading and transformation Maintaining the DW Change data capture SQLAccess advisor
• DW performance Query rewrite Schema modeling techniques SQL for aggregation in DW SQL for analysis and reporting SQL for modeling OLAP and data mining Using parallel execution
4
What is DW
5
6
7
8
Logical vs. physical design
• In the logical design, you look at the logical relationships among the objects.
• In the physical design, you look at the most effective way of storing and retrieving the objects as well as handling them from a transportation and backup/recovery perspective.
• Your logical design should result in a set of entities and attributes corresponding to fact tables and
dimension tables A model of operational data from your source into subject-oriented
informaiton in your target data warehouse schema.
9
10
11
12
13
Physical Design• Logical design can use pen/paper/oracle warehouse builder/oracle designer
while physical design is the creation of database with SQL• Physical design decisions are mainly driven by query performance and
database maaintenance aspects.• You need to create
Tablespaces Tables and partitioned tables Views
A view takes the output of a query and treats it as a table. Views do not require any space in the database
Integrity constraints In OLTP, they prevent the insertion of invalid data while in DW, they are only used for
query rewrite. Dimensions
A schema object that defines hierarchical relationships between columns or column sets. Indexes and partitioned indexes
Bitmap indexes vs. B-tree indexes. Bitmap indexes are efficient for set-oriented operations.
Materialized views Query results that have been stored in advance .
14
Partition and parallel execution
•Range partitioning•Hash partitioning•List partitioning•Composite partitioning
15
Bitmap index
16
17
One dimension table columns joins one fact table
18
19
extension
20
21
Integrity constraints
• Unique constraints• NOT NULL constraints• FOREIGN KEY constraints
22
23
Basic materialized views
24
Materialized views with aggregates
25
26
Dimension
27
28
29