2006 Milo Davies

Copyright © 2006, SAS Institute Inc. All rights reserved. SAS Users New Zealand. Proudly sponsored by… knoware

Transcript of 2006 Milo Davies

Page 1: 2006 Milo Davies

Copyright © 2006, SAS Institute Inc. All rights reserved.

SAS Users New Zealand

Proudly sponsored by…

knoware

Page 2: 2006 Milo Davies


You are here!

Page 3: 2006 Milo Davies


Communications

Credit Transactions

Dividend Payments

Loyalty Transactions

Subscription Bills

Prepaid Balances

Customer Scores

Telephony Transactions

Survey Responses

Page 4: 2006 Milo Davies


Relational vs Dimensional

Relational                               Dimensional
On Line Transaction Processing (OLTP)    On Line Analytical Processing (OLAP)
Normalised – no data redundancy          Denormalised – data redundancy
Optimised for updates                    Optimised for querying
Typically a snapshot in time             Temporal – history can be maintained
Entities and Relationships               Facts and Dimensions

Page 5: 2006 Milo Davies


Why Dimensional?

Easier to understand and query

Clarity on how data can be used – what are business measurements, what is descriptive.

Required structure for loading a multi-dimensional OLAP cube

Optimal structure for query performance (fewer joins)
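The "fewer joins" point can be illustrated with a minimal sketch of a star schema query. Everything here is an assumption made for illustration: the table names (`fact_orders`, `dim_customer`, `dim_time`), the columns and the sample rows are hypothetical, and SQLite is used only because it ships with Python, not because the talk used it.

```python
import sqlite3

# Hypothetical star schema: one fact table and two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE fact_orders (customer_key INTEGER, time_key INTEGER, amount REAL);

    INSERT INTO dim_customer VALUES (1, 'Ana', 'North'), (2, 'Ben', 'South');
    INSERT INTO dim_time VALUES (1, 2006, 1), (2, 2006, 2);
    INSERT INTO fact_orders VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 70.0);
""")

# Measures come from the fact table, descriptions from the dimensions.
# Each dimension is exactly one join away from the fact table.
rows = conn.execute("""
    SELECT c.region, t.year, SUM(f.amount) AS total
    FROM fact_orders f
    JOIN dim_customer c ON c.customer_key = f.customer_key
    JOIN dim_time t ON t.time_key = f.time_key
    GROUP BY c.region, t.year
    ORDER BY c.region
""").fetchall()

print(rows)
```

In a normalised model the same question would typically pass through several intermediate tables; here every descriptive attribute is reachable with a single join.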

Page 6: 2006 Milo Davies


Misconception: Dimensional Modelling leads to siloed data marts that hinder or prevent cross-departmental analysis and result in multiple versions of the truth.

Common Misconception

[Diagram: two star schemas. Claims FACT with dimensions Customer, Covered Item, Time, Employee, Claim Type, Product. Premiums FACT with dimensions Discount Type, Customer, Product, Payment Method, Time.]

Not if Dimensions are CONFORMED

Conforming dimensions is challenging and requires cross-departmental co-operation.

It is important to get it right. If not, information silos result.
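As a rough sketch of why conformed dimensions prevent silos: when two fact tables share the same customer dimension and keys, each can be aggregated separately and the results lined up on the shared key. The data, the key values and the helper `total_by_customer` below are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Conformed dimension: one customer table shared by both fact tables.
dim_customer = {1: "Ana", 2: "Ben"}

# Two fact tables from different departments, keyed by the same customer_key.
fact_claims = [(1, 400.0), (2, 150.0), (1, 100.0)]   # (customer_key, claim_amount)
fact_premiums = [(1, 300.0), (2, 300.0)]             # (customer_key, premium_amount)

def total_by_customer(fact_rows):
    """Sum a fact table's measure per customer_key."""
    totals = defaultdict(float)
    for customer_key, amount in fact_rows:
        totals[customer_key] += amount
    return totals

claims = total_by_customer(fact_claims)
premiums = total_by_customer(fact_premiums)

# Because both facts use the same keys, the two results combine directly.
report = {
    name: {"premiums": premiums[key], "claims": claims[key]}
    for key, name in dim_customer.items()
}
print(report)
```

If each department kept its own customer table with its own keys, this final step would need a matching exercise first, which is where multiple versions of the truth tend to appear.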

Page 7: 2006 Milo Davies


• Connectivity

• Data cleansing and enrichment

• Extraction, transformation and loading

• Data synchronisation

• Data migration

• Data federation

• Metadata Management

What is SAS Data Integration Studio?

SAS Data Integration Studio is a powerful visual design tool for the construction, execution and maintenance of data integration projects

Page 8: 2006 Milo Davies


Dimension Table (Customers)

• Fuzzy matching to cluster customers

• Data enrichment – guess gender based on name

• Keep history of changes – Slowly changing dimension
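One common way to keep history in a dimension is to close the current row and add a new one whenever an attribute changes (often called a Type 2 slowly changing dimension). The slide does not say which variant the demo used, so the sketch below is an assumption, and the function `apply_scd2` and the column names (`customer_key`, `customer_id`, `valid_from`, `valid_to`) are hypothetical.

```python
from datetime import date

def apply_scd2(dimension, business_key, attributes, as_of):
    """Close the current row if attributes changed, then append a new current row."""
    current = next(
        (row for row in dimension
         if row["customer_id"] == business_key and row["valid_to"] is None),
        None,
    )
    if current is not None:
        if all(current[k] == v for k, v in attributes.items()):
            return dimension  # nothing changed, keep the current row open
        current["valid_to"] = as_of  # end-date the old version
    # A new surrogate key for every version of the customer.
    next_key = max((row["customer_key"] for row in dimension), default=0) + 1
    dimension.append({
        "customer_key": next_key,
        "customer_id": business_key,
        **attributes,
        "valid_from": as_of,
        "valid_to": None,
    })
    return dimension

dim = []
apply_scd2(dim, "C001", {"city": "Auckland"}, date(2006, 1, 1))
apply_scd2(dim, "C001", {"city": "Wellington"}, date(2006, 6, 1))
apply_scd2(dim, "C001", {"city": "Wellington"}, date(2006, 7, 1))  # no change
print(dim)
```

After these calls the dimension holds two rows for the same customer: the Auckland row closed on the date of the change, and an open Wellington row.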

Fact Table (Orders)

• Join Orders and Order Items

• Calculate the gross margin

• Check that order has pricing info available

• Lookup customers to get correct ID and gender information

• Generate exceptions when customer details not found

• Replace business key with a surrogate key

Data Integration Studio Demo
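The fact table steps listed for the demo can be sketched in plain Python. This is not the Data Integration Studio job itself: the record layouts, the function `load_fact`, and the definition of gross margin as (price - cost) × quantity are all assumptions made for illustration.

```python
# Hypothetical source data.
orders = [
    {"order_id": 1, "customer_id": "C001"},
    {"order_id": 2, "customer_id": "C999"},  # customer missing from the dimension
]
order_items = [
    {"order_id": 1, "price": 20.0, "cost": 12.0, "quantity": 2},
    {"order_id": 1, "price": None, "cost": 5.0, "quantity": 1},  # no pricing info
    {"order_id": 2, "price": 10.0, "cost": 4.0, "quantity": 1},
]
# Customer dimension lookup: business key -> surrogate key and gender.
customer_lookup = {"C001": {"customer_key": 1, "gender": "F"}}

def load_fact(orders, order_items, customer_lookup):
    """Join orders to items, validate, look up customers, and build fact rows."""
    orders_by_id = {o["order_id"]: o for o in orders}
    facts, exceptions = [], []
    for item in order_items:
        order = orders_by_id[item["order_id"]]  # join Orders and Order Items
        if item["price"] is None:               # check pricing info is available
            exceptions.append((item["order_id"], "no pricing info"))
            continue
        customer = customer_lookup.get(order["customer_id"])
        if customer is None:                    # exception when customer not found
            exceptions.append((item["order_id"], "customer not found"))
            continue
        facts.append({
            "order_id": item["order_id"],
            "customer_key": customer["customer_key"],  # surrogate key replaces business key
            "gender": customer["gender"],
            "gross_margin": (item["price"] - item["cost"]) * item["quantity"],
        })
    return facts, exceptions

facts, exceptions = load_fact(orders, order_items, customer_lookup)
print(facts)
print(exceptions)
```

Rejected rows are collected rather than dropped silently, so they can be reviewed and reloaded once the missing pricing or customer details are fixed.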

Page 9: 2006 Milo Davies


Hand Coded Approach

• Very flexible

• Highly productive for experienced programmers

Tools Approach

• Self documenting

• Enforces rigour and consistency

• Enables data lineage / impact analysis

• Productivity gains

• Facilitates re-use

• …and you can still code if you really really want to

Data Integration: Tools vs Hand Coded

Page 10: 2006 Milo Davies


SAS Users New Zealand

Proudly sponsored by…

knoware