The Importance of the ETL Process

Chapter: SQL Server 2012 Integration Services
Course: SQL Server 2012 - A Comprehensive Introduction
Course ID: 170
Instructor: Scott Whigham
Chapter 16: Video # 2

Description

This presentation is part of LearnItFirst's SQL Server 2012: A Comprehensive Introduction course. The video that contains this presentation can be watched here: https://www.youtube.com/watch?v=FzayiGi97bc

The concept of ETL is one of the more important concepts covered in the last couple of chapters of this course. Even if you are not working with ETL today, chances are high that at some point you will be asked to. This video explains a scenario in which a business has several different databases, and how to make them all fit together.

Highlights from this slideshow:

- What is Microsoft's ETL tool?
- What enables a business to use so many different databases?
- What different options do you have for building a dashboard?
- Needs of the organization versus the requirements of the vendor

Transcript of The Importance of the ETL Process

Page 1: The Importance of the ETL Process

Chapter: SQL Server 2012 Integration Services

Course: SQL Server 2012 - A Comprehensive Introduction

Course ID: 170

Instructor: Scott Whigham

Chapter 16: Video # 2

The Importance of the ETL Process

Page 2: The Importance of the ETL Process

SQL Server 2012 Integration Services (SSIS) is Microsoft’s ETL tool

– Extract, Transform, and Load

Page 3: The Importance of the ETL Process

Most businesses have data in more than one format

–How does one business happen to use so many different databases?

Page 4: The Importance of the ETL Process

Let’s walk through a likely scenario and see how this happens:

–2001: The “AdventureWorks” company launches a web store to complement its brick-and-mortar stores

• ASP-based website

• SQL Server 2000 backend

• Customers are encouraged to phone questions in or to send an email

Page 5: The Importance of the ETL Process

Things change...

– 2001: Launch with SQL 2000

–2003: AdventureWorks buys a competitor

• Competitor used a PHP/MySQL ticketing system

• AW mgmt chooses to adopt this system for customer ticketing rather than build/buy an alternative

Page 6: The Importance of the ETL Process

AdventureWorks timeline:

Year | Usage                  | Data Source
2001 | Website                | MS SQL Server 2000
2003 | Customer Ticket System | MySQL 3.23

Page 7: The Importance of the ETL Process

Needs change...

– 2001: Launch with SQL 2000

– 2003: PHP/MySQL 3.23 ticketing system

–2004: The company is growing – time for more “stuff”:

• A PHP/MySQL project management system is installed

• A marketing mailer application with contact mgmt is purchased

Page 8: The Importance of the ETL Process

AdventureWorks timeline:

Year | Usage                  | Data Source
2001 | Website                | MS SQL Server 2000
2003 | Customer Ticket System | MySQL 3.23
2004 | Project Management     | MySQL 4.0
2004 | Marketing mailer       | MS Access

Page 9: The Importance of the ETL Process

Markets change...

– 2001: Launch with SQL 2000

– 2003: PHP/MySQL 3.23 ticketing system

– 2004: PHP/MySQL 4.0 project management

–2005: A new ASP.NET website is rolled out with a SQL Server 2005 backend

• Major upgrade from SQL Server 2000 -> 2005

Page 10: The Importance of the ETL Process

AdventureWorks timeline:

Year | Usage                  | Data Source
2001 | Website                | MS SQL Server 2000
2003 | Customer Ticket System | MySQL 3.23
2004 | Project Management     | MySQL 4.0
2004 | Marketing mailer       | MS Access
2005 | Website upgrade        | MS SQL Server 2005

Page 11: The Importance of the ETL Process

Trends change...

– 2001: Launch with SQL 2000

– 2003: PHP/MySQL 3.23 ticketing system

– 2004: PHP/MySQL 4.0 project management

– 2005: Upgraded website to SQL 2005

–2008: Website sales popularity causes “growing pains”

• A new supply chain management app purchased

• A new employee management/HR/payroll package is purchased

Page 12: The Importance of the ETL Process

AdventureWorks timeline:

Year | Usage                  | Data Source
2001 | Website                | MS SQL Server 2000
2003 | Customer Ticket System | MySQL 3.23
2004 | Project Management     | MySQL 4.0
2004 | Marketing mailer       | MS Access
2005 | Website upgrade        | MS SQL Server 2005
2008 | Supply chain mgmt      | MS SQL Server 2008
2008 | Employee/HR/Payroll    | DB2

Page 13: The Importance of the ETL Process

The world grows smaller...

– 2001: Launch with SQL 2000

– 2003: PHP/MySQL 3.23 ticketing system

– 2004: PHP/MySQL 4.0 project management

– 2005: Upgraded website to SQL 2005

– 2008: Added supply chain mgmt and HR/payroll packages

–2010: Website sales continue to gain popularity, particularly overseas

• A new shipping database is purchased

• Employee expenses are now tracked in custom MS Excel spreadsheets

Page 14: The Importance of the ETL Process

AdventureWorks timeline:

Year | Usage                     | Data Source
2001 | Website                   | MS SQL Server 2000
2003 | Customer Ticket System    | MySQL 3.23
2004 | Project Management        | MySQL 4.0
2004 | Marketing mailer          | MS Access
2005 | Website upgrade           | MS SQL Server 2005
2008 | Supply chain mgmt         | MS SQL Server 2008
2008 | Employee/HR/Payroll       | DB2
2010 | Shipping                  | *.csv file downloaded monthly
2010 | Employee expense tracking | MS Excel

Page 15: The Importance of the ETL Process

It’s 2012 and company executives + management have been playing a game lately...

– You know this one, don’t you?

Page 16: The Importance of the ETL Process

Page 17: The Importance of the ETL Process

The world grows smaller...

– 2001: Launch with SQL 2000

– 2003: PHP/MySQL 3.23 ticketing system

– 2004: PHP/MySQL 4.0 project management

– 2005: Upgraded website to SQL 2005

– 2008: Added supply chain mgmt and HR/payroll packages

– 2010: New shipping database, employee expense tracking

–2012: Executives want a B.I. solution

• You name it, they want it

• But... there’s no budget for software purchases...

Page 18: The Importance of the ETL Process

No budget for new software = more opportunities for you!

– You decide:

• ... to create a relational OLAP data warehouse to store all the company’s historic data in a unified way

• ... to create a multidimensional database with multiple cubes (to facilitate fast browsing of analytics)

• ... to install Excel 2013 on all CxO and management machines, and to teach them how to build pivot tables and pivot charts

• ... to investigate Reporting Services as a way to build internal web dashboards and subscription-based reporting

– On-the-job experience, here we come!

Page 19: The Importance of the ETL Process

The company data is all “loosely connected”

– A customer makes a small order via the website

– The same customer submits a “Help!” ticket

– Customer rep. has to make an order for a replacement part

– Sales person takes customer to an entertainment event

– Customer now makes a large order

– Key question: how did we acquire this customer?
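The key question above only becomes answerable once events from the separate systems sit side by side. The following is a minimal Python sketch of that idea, not part of the original slides: the three lists stand in for extracts from the website, ticketing, and expense systems, and every field name and sample value is hypothetical.

```python
from datetime import date

# Hypothetical extracts from three separate systems, each with its own customer key.
web_orders = [
    {"email": "ann@example.com", "ordered": date(2011, 3, 2), "total": 40.0},
    {"email": "ann@example.com", "ordered": date(2011, 9, 15), "total": 5200.0},
]
tickets = [{"contact": "Ann@Example.com", "opened": date(2011, 3, 20)}]
expenses = [{"client_email": "ann@example.com", "spent_on": date(2011, 8, 1)}]


def customer_timeline(email):
    """Merge one customer's events from every system into a date-ordered list."""
    key = email.lower()  # the systems disagree on letter case, so compare lowercased
    events = []
    events += [(o["ordered"], "website order")
               for o in web_orders if o["email"].lower() == key]
    events += [(t["opened"], "help ticket")
               for t in tickets if t["contact"].lower() == key]
    events += [(e["spent_on"], "sales entertainment")
               for e in expenses if e["client_email"].lower() == key]
    return sorted(events)


timeline = customer_timeline("ann@example.com")
first_touch = timeline[0][1]  # earliest event hints at how the customer was acquired
print(first_touch)
```

In this sample data the earliest event is the small website order, which is the kind of answer a unified data warehouse is meant to make easy to find.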

Page 20: The Importance of the ETL Process

Integration Services is your ETL tool

1. You Extract the data from the source to a staging area

• Optional, but typically an MS SQL Server relational database

2. You make any changes to the data (a.k.a. a Transformation)

• Either in motion or in the staging area

3. You Load the data into the relational data warehouse

4. You process the cube(s)

– SSIS is your “one stop shop” for all of this!
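The first three steps above can be sketched outside SSIS to make the flow concrete. This is a minimal illustration using Python's standard `csv` and `sqlite3` modules, not how SSIS itself works: the in-memory SQLite database stands in for both the staging area and the data warehouse, and the table names, column names, and sample rows are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical monthly shipping export (stand-in for the *.csv source).
SHIPPING_CSV = """order_id,customer_email,ship_date,cost_usd
1001, Ann@Example.com ,2012-01-05,12.50
1002,bob@example.com,2012-01-07,
1003,ANN@example.com,2012-01-09,30.00
"""


def extract(conn, csv_text):
    """Step 1: copy raw source rows, unchanged, into a staging table."""
    conn.execute(
        "CREATE TABLE stg_shipping "
        "(order_id TEXT, customer_email TEXT, ship_date TEXT, cost_usd TEXT)"
    )
    rows = [
        (r["order_id"], r["customer_email"], r["ship_date"], r["cost_usd"])
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO stg_shipping VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def transform(conn):
    """Step 2: clean the data while it sits in the staging area."""
    # Normalize e-mail addresses so the same customer matches across rows.
    conn.execute(
        "UPDATE stg_shipping SET customer_email = lower(trim(customer_email))"
    )
    # Drop rows with no cost; a real package might redirect them elsewhere.
    cur = conn.execute("DELETE FROM stg_shipping WHERE trim(cost_usd) = ''")
    return cur.rowcount


def load(conn):
    """Step 3: load typed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE fact_shipping (order_id INTEGER PRIMARY KEY, "
        "customer_email TEXT, ship_date TEXT, cost_usd REAL)"
    )
    conn.execute(
        "INSERT INTO fact_shipping "
        "SELECT CAST(order_id AS INTEGER), customer_email, ship_date, "
        "CAST(cost_usd AS REAL) FROM stg_shipping"
    )
    return conn.execute("SELECT COUNT(*) FROM fact_shipping").fetchone()[0]


conn = sqlite3.connect(":memory:")
extracted = extract(conn, SHIPPING_CSV)
rejected = transform(conn)
loaded = load(conn)
# Step 4 (processing the cubes) has no equivalent in this sketch.
print(extracted, rejected, loaded)  # prints: 3 1 2
```

Keeping the raw rows as text in staging and converting types only at load time is one common design choice; transforming the data "in motion", as the slide mentions, would instead clean each row before it is written anywhere.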

Page 21: The Importance of the ETL Process

Your final step is to build a dashboard

– Reporting Services or PowerPivot?

– Power View or Excel?

– SharePoint or email?

– On-demand or subscription-based?

Page 22: The Importance of the ETL Process

Your dashboard is a hit!

Page 23: The Importance of the ETL Process

In the next video…

–How to Install and Configure SSIS 2012

“A painter paints pictures on canvas. But musicians paint their pictures on silence.”

- Leopold Stokowski