Copyright© 2014, Sira Yongchareon, Department of Computing, Faculty of Creative Industries and Business

Lecturer : Dr. Sira Yongchareon

ISCG 6425 Data Warehousing (DW)

Week 1: Introduction to Data Warehousing

ISCG6425 Data Warehousing (by Sira Yongchareon), Department of Computing, Faculty of Creative Industries and Business

Outline

Course introduction
Introduction to Data Warehousing (DW)
SQL & database concepts revision

Course introduction: About our classes

3 hours/week: lecture, then lab session, with a 15-20 minute mid-break
Workshop mode (practical); take a break at any time
Attendance and homework will be checked and marked
Elena (teaching assistant) takes care of the lab sessions and all the marking. Be nice to her.

Course introduction: About assessments

No final exam!!! Yeah
2 x Assignments: 60%
  Individual, 30% (including attendance + worksheet + interview)
  Individual, 30% (including attendance + worksheet + interview)
  Interviews are required to pass the assignment!!!
Final Test (Theory): 40% (at Week 13)

How to pass this subject
  Obtain a score of at least 50% overall

Course introduction: Learning resources

Lecture slides, tutorials, worksheets, lab exercises, textbooks
All books can be accessed through Unitec’s library (e-book)

Course introduction: Outside-of-class contact

Email me: [email protected]

Email code of conduct
  In the subject line of your email, please write “[DW]” followed by your topic/question/issue.
  For example:
    To: [email protected]
    Subject: [DW] Question about next week’s test

WARNING!! If you do not follow this, your email may be directed to my “very low priority mailbox”.

Any questions about the course?

Introduction to Data Warehousing (DW)

What is a Data Warehouse? (Wikipedia)

A DW is a database used for reporting and data analysis
A DW is a central repository of data, created by integrating data from one or more disparate sources
A DW stores current as well as historical data, which are used to create reports that support decision makers in an organization (Business Intelligence)

DW to support “decision makers”

The flow of data in DW system

From Data to Knowledge

OLTP vs. OLAP Databases

OLTP (On-Line Transaction Processing) databases
  Designed for day-to-day operations / transactions
  Queries contain SELECT, INSERT, UPDATE, DELETE
  Normalized schema (no duplicates, no inconsistent data)

OLAP (On-Line Analytical Processing) databases
  Designed for business analytics, summary reports, decision making
  Queries have SELECT only (no INSERT, DELETE, UPDATE)
  De-normalized schema (lots of duplicates/redundancies)
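
The difference in query style can be sketched in a few lines. This is only an illustration, under several assumptions: it uses Python's built-in sqlite3 module as a stand-in for the MS SQL Server used in the lab, the Sales table and its columns are hypothetical, and both workloads run against one table for brevity (in practice the analytical query would run against a separate warehouse database).

```python
import sqlite3

# SQLite in-memory database as a stand-in (assumption); the lab uses MS SQL Server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table: one row per sale.
cur.execute("""
    CREATE TABLE Sales (
        sale_id   INTEGER PRIMARY KEY,
        product   TEXT NOT NULL,
        sale_year INTEGER NOT NULL,
        amount    REAL NOT NULL
    )
""")

# OLTP-style workload: short INSERT / UPDATE / DELETE statements touching few rows.
cur.executemany(
    "INSERT INTO Sales (sale_id, product, sale_year, amount) VALUES (?, ?, ?, ?)",
    [
        (1, "Laptop", 2013, 1500.0),
        (2, "Laptop", 2014, 1600.0),
        (3, "Phone", 2014, 800.0),
        (4, "Phone", 2014, 700.0),
    ],
)
cur.execute("UPDATE Sales SET amount = 750.0 WHERE sale_id = 4")
cur.execute("DELETE FROM Sales WHERE sale_id = 1")
conn.commit()

# OLAP-style workload: a read-only SELECT that aggregates rows into a summary.
summary = cur.execute("""
    SELECT product, sale_year, COUNT(*) AS num_sales, SUM(amount) AS total_amount
    FROM Sales
    GROUP BY product, sale_year
    ORDER BY product, sale_year
""").fetchall()

for row in summary:
    print(row)
```

The first half changes individual rows, which is what an operational system does all day; the second half only reads, and collapses many rows into a few summary figures.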

OLTP vs. OLAP Databases

Topic | OLTP Databases | OLAP Databases
Application | Operational systems: ERP, CRM, legacy applications | Management Information System, Decision Support System
Typical users | Staff carrying out day-to-day operations | Managers, executives
What the data reveals | A snapshot of ongoing business processes | Multi-dimensional views of various kinds of business activities
Inserts and updates | Short and fast inserts and updates initiated by end users | Periodic long-running batch jobs refresh the data
Queries | Relatively standardized and simple queries returning relatively few records | Often complex queries involving aggregations

OLTP vs. OLAP Databases (cont.)

Topic | OLTP Databases | OLAP Databases
Processing Speed | Typically very fast | Depends on the amount of data involved; batch and complex queries may take many hours; query speed can be improved by creating indexes
Space Requirement | Can be relatively small if historical data is archived | Larger due to the existence of aggregation structures and history data; requires more indexes than OLTP
Database Design | Highly normalized with many tables | Typically de-normalized with fewer tables; use of star and/or snowflake schemas
Backup and Recovery | Backup religiously; operational data is critical to run the business, data loss is likely to entail significant monetary loss and legal liability | Instead of regular backups, some environments may consider simply reloading the OLTP data as a recovery method
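
To make the "Database Design" row more concrete, here is a minimal sketch of a star schema: one fact table surrounded by dimension tables, with indexes on the fact table's keys. Everything in it is an assumption made for illustration: the table and column names (FactSales, DimProduct, DimDate) are hypothetical, and SQLite (through Python's sqlite3 module) stands in for the MS SQL Server used in the lab.

```python
import sqlite3

# SQLite in-memory database as a stand-in (assumption); the lab uses MS SQL Server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
    -- Dimension tables: descriptive attributes, deliberately de-normalized
    -- (category is repeated as text instead of living in its own table).
    CREATE TABLE DimProduct (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category     TEXT NOT NULL
    );
    CREATE TABLE DimDate (
        date_key INTEGER PRIMARY KEY,
        year     INTEGER NOT NULL,
        quarter  INTEGER NOT NULL
    );
    -- Fact table: numeric measures plus a foreign key to each dimension.
    CREATE TABLE FactSales (
        product_key INTEGER NOT NULL REFERENCES DimProduct(product_key),
        date_key    INTEGER NOT NULL REFERENCES DimDate(date_key),
        quantity    INTEGER NOT NULL,
        amount      REAL NOT NULL
    );
    -- Indexes on the join keys, in the spirit of the Processing Speed row.
    CREATE INDEX idx_fact_product ON FactSales(product_key);
    CREATE INDEX idx_fact_date ON FactSales(date_key);

    INSERT INTO DimProduct VALUES
        (1, 'Laptop', 'Computers'), (2, 'Tablet', 'Computers'), (3, 'Phone', 'Mobile');
    INSERT INTO DimDate VALUES (20140101, 2014, 1), (20140401, 2014, 2);
    INSERT INTO FactSales VALUES
        (1, 20140101, 2, 3000.0), (2, 20140101, 1, 500.0), (3, 20140101, 4, 3200.0),
        (1, 20140401, 1, 1500.0), (3, 20140401, 2, 1600.0);
""")

# Typical analytical query: join the fact table to its dimensions, then aggregate.
rows = cur.execute("""
    SELECT p.category, d.quarter, SUM(f.amount) AS total_amount
    FROM FactSales AS f
    INNER JOIN DimProduct AS p ON p.product_key = f.product_key
    INNER JOIN DimDate AS d ON d.date_key = f.date_key
    GROUP BY p.category, d.quarter
    ORDER BY p.category, d.quarter
""").fetchall()

for row in rows:
    print(row)
```

A snowflake schema would differ by normalizing the dimensions further, for example by moving category out of DimProduct into its own table.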

Software used in this course

Our lab has MS-SQL Server 2008 R2 installed.

If you want to download and install it on your own computer:
  Google “unitec dreamspark” (Microsoft for academia)
  Register first, then you can download any MS software
  Choose SQL Enterprise 2014 (latest version)

Prerequisite knowledge

Database concepts: can you explain the following?
  ERD, relational schema, and normalization
  SQL
    DDL: CREATE, DROP, ALTER
    DML: SELECT, INSERT, UPDATE, DELETE
    PK, FK, composite key, NULL value, constraints
    UNION, JOIN (inner and outer), HAVING, GROUP BY, ORDER BY
    COUNT, SUM, AVG, MIN, MAX
    Sub-query

Please DO self-study if you don’t know any of them
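
As a quick self-check on a few of the items above (inner vs. outer join, GROUP BY with COUNT, and a sub-query), the following sketch may help. It is illustrative only: the sample rows and the order_id and amount columns are made up, and SQLite (via Python's sqlite3 module) is used as a stand-in for the lab's MS SQL Server.

```python
import sqlite3

# SQLite in-memory database as a stand-in (assumption); the lab uses MS SQL Server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL (CREATE with PK/FK constraints) and DML (INSERT) on hypothetical sample data.
cur.executescript("""
    CREATE TABLE Salesperson (
        ID   INTEGER PRIMARY KEY,
        Name TEXT NOT NULL
    );
    CREATE TABLE Orders (
        order_id       INTEGER PRIMARY KEY,
        salesperson_id INTEGER REFERENCES Salesperson(ID),
        amount         REAL NOT NULL
    );
    INSERT INTO Salesperson (ID, Name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cat');
    INSERT INTO Orders (order_id, salesperson_id, amount) VALUES
        (10, 1, 100.0), (11, 1, 250.0), (12, 2, 80.0);
""")

# Inner join: only salespeople that have at least one matching order appear.
inner_rows = cur.execute("""
    SELECT s.Name, o.order_id
    FROM Salesperson AS s
    INNER JOIN Orders AS o ON o.salesperson_id = s.ID
    ORDER BY o.order_id
""").fetchall()

# Left outer join + GROUP BY: every salesperson appears, even with zero orders.
outer_rows = cur.execute("""
    SELECT s.Name, COUNT(o.order_id) AS num_orders
    FROM Salesperson AS s
    LEFT OUTER JOIN Orders AS o ON o.salesperson_id = s.ID
    GROUP BY s.ID, s.Name
    ORDER BY s.Name
""").fetchall()

# Sub-query: salespeople whose ID never occurs in Orders.
subquery_rows = cur.execute("""
    SELECT Name
    FROM Salesperson
    WHERE ID NOT IN (
        SELECT salesperson_id FROM Orders WHERE salesperson_id IS NOT NULL
    )
    ORDER BY Name
""").fetchall()

print(inner_rows)
print(outer_rows)
print(subquery_rows)
```

Note that COUNT(o.order_id) counts only non-NULL values, which is why a salesperson with no orders gets 0 rather than 1 in the outer-join result.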

How much do you understand ERD?

1. How to read this ERD?

2. Convert this ERD to a Relational Schema (Database tables)

How much do you understand ERD 2?

1. How to read this ERD?

2. Convert this ERD to a Relational Schema (Database tables)

How much do you understand SQL?

SELECT Name
FROM Orders, Salesperson
WHERE Orders.salesperson_id = Salesperson.ID
GROUP BY salesperson_id
HAVING COUNT ( salesperson_id ) >1

Anything wrong??
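
The slide leaves the answer open. One likely issue is that Name appears in the SELECT list but is neither in the GROUP BY clause nor inside an aggregate function, which MS SQL Server rejects. A possible rewrite is sketched below, under assumptions: the sample rows, the order_id column, and the added ORDER BY are not from the slide, and SQLite (via Python's sqlite3 module) stands in for SQL Server. SQLite happens to accept the original query, so this sketch shows the corrected form rather than reproducing the error.

```python
import sqlite3

# SQLite in-memory database as a stand-in (assumption); the lab uses MS SQL Server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Table and column names follow the slide's query; the rows are hypothetical.
cur.executescript("""
    CREATE TABLE Salesperson (
        ID   INTEGER PRIMARY KEY,
        Name TEXT NOT NULL
    );
    CREATE TABLE Orders (
        order_id       INTEGER PRIMARY KEY,
        salesperson_id INTEGER REFERENCES Salesperson(ID)
    );
    INSERT INTO Salesperson (ID, Name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cat');
    INSERT INTO Orders (order_id, salesperson_id) VALUES
        (10, 1), (11, 1), (12, 2), (13, 3), (14, 3), (15, 3);
""")

# Every non-aggregated column in the SELECT list also appears in GROUP BY,
# and the comma join is written as an explicit INNER JOIN.
fixed_query = """
    SELECT s.Name
    FROM Orders AS o
    INNER JOIN Salesperson AS s ON o.salesperson_id = s.ID
    GROUP BY s.ID, s.Name
    HAVING COUNT(*) > 1
    ORDER BY s.Name
"""

# Names of salespeople with more than one order.
names = [row[0] for row in cur.execute(fixed_query)]
print(names)
```

Grouping by both s.ID and s.Name keeps two different salespeople who share a name in separate groups.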

Are you prepared for the course?

Next week there will be a “Hurdle Test”!!!
  Multiple-choice questions: no code writing, but understanding code is required
  To test if you are READY for this course

** You MUST PASS in order to continue the course **

Are you prepared for the course?

Test topics include:
  Relational databases and SQL concepts
  Relations, keys and normalization
  Basic + intermediate SQL commands

Study resources - PLEASE DO YOUR OWN STUDY!
  http://www.w3schools.com/sql/
  http://www.tutorialspoint.com/sql/sql_tutorial.pdf
  And from your Level 5 Introduction to Databases paper

Any questions?