INFORMATICA EASY LEARNING ONLINE TRAINING


Transcript of INFORMATICA EASY LEARNING ONLINE TRAINING

Page 1: INFORMATICA EASY LEARNING ONLINE TRAINING
Page 2: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Warehousing Concepts

• What is Data Warehousing?
• Dimensional Data Model
• Star Schema
• Snowflake Schema
• Slowly Changing Dimension
• Conceptual Data Model
• Logical Data Model
• Physical Data Model
• Conceptual, Logical, and Physical Data Model
• Data Integrity
• What is OLAP
• MOLAP, ROLAP, and HOLAP

Page 3: INFORMATICA EASY LEARNING ONLINE TRAINING

What is Data Warehousing?

Different people have different definitions for a data warehouse. The most popular definition came from Bill Inmon, who provided the following:

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process.

Data warehousing can also be described as a process of transforming data into information and making it available to users in a timely enough manner to make a difference.

Page 4: INFORMATICA EASY LEARNING ONLINE TRAINING

To summarize ...

• OLTP Systems are used to “run” a business

• The Data Warehouse helps to “optimize” the business

Page 5: INFORMATICA EASY LEARNING ONLINE TRAINING

Corporate Data

It includes:

• human resource data
• financial data
• facilities data
• sales data
• expenses on marketing data
• production planning cost
• manufacturing cost
• service delivery cost
• inventory management
• shipping and payment data

What is enterprise-wide corporate data?

How is Business Intelligence used in retail banking or in the retail industry?

Page 6: INFORMATICA EASY LEARNING ONLINE TRAINING

KPIs (Key Performance Indicators)

A KPI can be used as a performance measurement tool.

KPIs in retail banking:

• The total cash deposits held in a month
• The average annual deposit held
• Average number of deposits per retail bank growth
• Average withdrawals made by each depositor
• Ratio of active depositors to dormant depositors
• Average number of default borrowers in a year
• Average number of credit cards issued by the retail bank
• Rate of borrowing risk
• Rate of default risk
• Average number of customers served in a day
• Average number of closed bank accounts

Page 7: INFORMATICA EASY LEARNING ONLINE TRAINING

KPIs (Key Performance Indicators)

KPIs in the retail industry:

• Sales compared to Budget & Target
• Sales compared to last year (or any other period)
• Wage cost recovery
• Average sale per customer/transaction
• Units per customer/transaction
• Sales per hour
• Sales & Gross Margin

Page 8: INFORMATICA EASY LEARNING ONLINE TRAINING

KPIs (Key Performance Indicators)

Examples of common departmental KPIs

Sales Growth: Analyze the pace at which your organization's sales revenue is growing and use that information in strategic decision-making.

Marketing: Analyze the pace at which your organization's sales revenue is growing and use that information in strategic decision-making.

Financial: Measures your organization's financial health by analyzing readily available resources that could be used to meet any short-term obligations.

Page 9: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Warehousing

Page 10: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Warehousing Architecture

Page 11: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Warehousing Environment

Page 12: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Quality

• Duplicate data
• Inconsistent values
• Missing data
• Unexpected use of fields
• Impossible or wrong values

Validations for Data Cleansing

• Data-type constraints
• Range constraints
• Mandatory constraints
• Unique constraints
• Set-membership constraints
• Foreign-key constraints
• Regular expression patterns
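The validation types listed for data cleansing can be illustrated with a minimal Python sketch. The field names (customer_id, age, state, store_id, zip), the reference sets, and the limits below are assumptions made up for this illustration; they are not part of the training material.

```python
import re

# Hypothetical reference data, used only for this illustration.
VALID_STATES = {"CA", "NY", "TX"}      # set-membership constraint
KNOWN_STORE_IDS = {1, 2}               # stands in for a parent table (foreign key)
ZIP_PATTERN = re.compile(r"\d{5}")     # regular expression pattern


def validate(row, seen_ids):
    """Return a list of constraint violations for one record."""
    errors = []
    # Mandatory constraint: the field must be present and non-empty.
    if row.get("customer_id") in (None, ""):
        errors.append("mandatory: customer_id")
    # Unique constraint: the key must not repeat across records.
    elif row["customer_id"] in seen_ids:
        errors.append("unique: customer_id")
    else:
        seen_ids.add(row["customer_id"])
    # Data-type constraint: age must be an integer.
    if not isinstance(row.get("age"), int):
        errors.append("data-type: age")
    # Range constraint: age must fall in a plausible interval.
    elif not 0 <= row["age"] <= 120:
        errors.append("range: age")
    if row.get("state") not in VALID_STATES:
        errors.append("set-membership: state")
    if row.get("store_id") not in KNOWN_STORE_IDS:
        errors.append("foreign-key: store_id")
    if not ZIP_PATTERN.fullmatch(str(row.get("zip", ""))):
        errors.append("pattern: zip")
    return errors


seen = set()
good = {"customer_id": 10, "age": 34, "state": "NY", "store_id": 1, "zip": "10001"}
bad = {"customer_id": 10, "age": 150, "state": "ZZ", "store_id": 9, "zip": "1000"}
good_errors = validate(good, seen)
bad_errors = validate(bad, seen)
print(good_errors)
print(bad_errors)
```

The second record reuses the first record's customer_id, so it also shows how a duplicate is caught by the unique check.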

Page 13: INFORMATICA EASY LEARNING ONLINE TRAINING

Views for building a data warehouse

• The top-down view
• The data source view
• The data warehouse view
• The business query view

Which approach is better for designing a data warehouse?

Page 14: INFORMATICA EASY LEARNING ONLINE TRAINING

Top Down Approach

Page 15: INFORMATICA EASY LEARNING ONLINE TRAINING

Bottom Up Approach

Page 16: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Warehousing Design

• Requirement Gathering
• Physical Environment Setup
• Data Modeling
• ETL
• OLAP Cube Design
• Front End Development
• Report Development
• Performance Tuning
• Query Optimization
• Quality Assurance
• Rolling out to Production
• Production Maintenance
• Incremental Enhancements

Page 17: INFORMATICA EASY LEARNING ONLINE TRAINING

Why Data Warehousing?

Need to see daily, weekly, monthly, quarterly profit of each store.

Comparison of sales and profit on various time periods.

Comparison of sales in various time bands of the day.

Need to know which product is in higher demand at which location.

Need to study the trend of sales by time period of the day over the week, month, and year.

On which day are sales highest?
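Questions like these come down to grouping the same sales records by store, by time period, and by time band. The sketch below uses an in-memory SQLite database as a stand-in for a warehouse; the table name, columns, time-band boundaries, and sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (store TEXT, sale_date TEXT, hour INTEGER, profit REAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Store 1", "2012-01-02", 9, 100.0),
        ("Store 1", "2012-01-02", 18, 150.0),
        ("Store 1", "2012-01-03", 12, 80.0),
        ("Store 2", "2012-01-02", 10, 60.0),
        ("Store 2", "2012-02-01", 19, 90.0),
    ],
)

# Monthly profit of each store.
monthly_profit = conn.execute(
    "SELECT store, strftime('%Y-%m', sale_date) AS sale_month, SUM(profit) "
    "FROM sales GROUP BY store, sale_month ORDER BY store, sale_month"
).fetchall()

# Profit by time band of the day (band boundaries are an assumption).
by_band = conn.execute(
    "SELECT CASE WHEN hour < 12 THEN 'morning' "
    "            WHEN hour < 17 THEN 'afternoon' "
    "            ELSE 'evening' END AS band, SUM(profit) "
    "FROM sales GROUP BY band ORDER BY band"
).fetchall()

# The single day with the highest total profit.
best_day = conn.execute(
    "SELECT sale_date, SUM(profit) AS total FROM sales "
    "GROUP BY sale_date ORDER BY total DESC LIMIT 1"
).fetchone()

print(monthly_profit)
print(by_band)
print(best_day)
```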

Page 18: INFORMATICA EASY LEARNING ONLINE TRAINING

Phases of Data Warehousing Project

1. Identify and collect requirements

Need to see daily, weekly, monthly, quarterly profit of each store.

Comparison of sales and profit on various time periods.

Comparison of sales in various time bands of the day.

Need to know which product is in higher demand at which location.

Need to study the trend of sales by time period of the day over the week, month, and year.

On which day are sales highest?

Who collects the requirements?

Requirements are collected by business analysts and leads.

Page 19: INFORMATICA EASY LEARNING ONLINE TRAINING

Phases of Data Warehousing Project

2. Design the dimensional model

Pharmacy_Claims_Fact (fact table)
Drug_Id (FK), Org_Id (FK), Practitioner_Id (FK), Product_Id (FK), Time_ID (FK), Claim_status_Id (FK), Provider_Id (FK), Subscriber_id (FK), Demographic_key (FK), InsuranceType_Id (FK), Incurred_Date, Claim_Date, Claim_Settled_Date, Days_Supply, Dispensing_Fee, Incentive_Savings_Amount, Incentive_Fee_Paid_Amount, Amount_Claimed, Amount_Paid, Amount_Pending, Amount_Adjusted, CoPayment_Amount, CoInsurance_Amount, Deductible, Refill_Indicator, Claim_Production_Key, Claim_Production_Txn_No, Status_Change_Date, Last_Record_Flag

Practitioner
Practitioner_Id, Practitioner_Name, Practitioner_Type, practioner_type_desc, Qualification, Specialisation, ssn, Medical_Assoc_Enroll_No

Organisation
Org_Id, Org_prod_id, Org_Name, Address, City, County, State, Zip, Industry_Classification

Subscriber
Subscriber_id, Subscriber_prod_key, Member_prod_key, Member_Name, Date_of_Birth, Subscriber_type, Address, City, County, State, Zip, Hobby1, Hobby2, Smoker_YN, Alcoholic_YN, Pre_Existing_Ailments

Demographics
Demographic_key, Age_group, Income_group, Race, Country_of_birth, Marital_status, Gender, Citizenship_status

Provider
Provider_Id, Provider_Name, Provider_Type, Address, City, County, State, Zip, Service_Area, Netwrok_Provider

Insurance_Type
InsuranceType_Id, InsuranceType_Name, InsuranceType_Desc

Product
Product_Id, Product_Name, Product_Category, LoB

Claim_Status
Claim_status_Id, Claim_Status_Reason, Claim_stat_catg

Time
Time_ID, Day, Week, Month, Quarter, Year, Season

Drugs
Drug_Id, Drug_Name_Generic, Drug_Name_Trade, National_Drug_Code, Drug_Description, Drug_Category, Formulary, Manufacturer

The data model is designed by data modelers.
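To show how the (FK) columns in the fact table tie it to the dimension tables, here is a small sketch that builds a reduced version of the model in SQLite. Only two dimensions and two measures are kept; the column types and the sample rows are assumptions, and SQLite is used only as a convenient stand-in for the warehouse database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Drugs (Drug_Id INTEGER PRIMARY KEY, Drug_Name_Generic TEXT);
CREATE TABLE "Time" (Time_ID INTEGER PRIMARY KEY, Day TEXT, Month TEXT, Year INTEGER);
CREATE TABLE Pharmacy_Claims_Fact (
    Drug_Id INTEGER REFERENCES Drugs(Drug_Id),
    Time_ID INTEGER REFERENCES "Time"(Time_ID),
    Amount_Claimed REAL,
    Amount_Paid REAL
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO Drugs VALUES (1, 'generic-a')")
conn.execute("INSERT INTO \"Time\" VALUES (1, '01', 'Jan', 2012)")
conn.executemany(
    "INSERT INTO Pharmacy_Claims_Fact VALUES (?, ?, ?, ?)",
    [(1, 1, 30.0, 25.0), (1, 1, 20.0, 20.0)],
)

# A fact row pointing at a drug that is not in the dimension is rejected.
try:
    conn.execute("INSERT INTO Pharmacy_Claims_Fact VALUES (99, 1, 10.0, 5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Measures are summed after joining the fact table to a dimension.
totals = conn.execute(
    "SELECT d.Drug_Name_Generic, SUM(f.Amount_Paid) "
    "FROM Pharmacy_Claims_Fact f JOIN Drugs d ON d.Drug_Id = f.Drug_Id "
    "GROUP BY d.Drug_Name_Generic"
).fetchall()

print(rejected)
print(totals)
```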

Page 20: INFORMATICA EASY LEARNING ONLINE TRAINING

Phases of Data Warehousing Project

3. Create and Maintain the tables

The database is maintained by DBAs.

Page 21: INFORMATICA EASY LEARNING ONLINE TRAINING

Phases of Data Warehousing Project

4. Load the data into the Data Warehouse and Data Marts

This is handled by the ETL team.

Page 22: INFORMATICA EASY LEARNING ONLINE TRAINING

What is ETL?

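ETL stands for Extract, Transform, Load: data is extracted from source systems, transformed, and loaded into the data warehouse. Informatica is an ETL application.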
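The three ETL steps can be sketched in plain Python. This is only a conceptual illustration and not how Informatica itself is used; the function names, the CSV layout, and the cleaning rules are assumptions made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with messy values.
RAW = """store,sale_date,amount
Store 1,2012-01-01, 10.50
store 2,2012-01-01,7
Store 1,2012-01-02,
"""


def extract(text):
    """Extract: read raw records from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and standardize values."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # missing data: skip the record
        cleaned.append(
            (row["store"].strip().title(), row["sale_date"], float(amount))
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (store TEXT, sale_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW)), conn)
loaded = conn.execute(
    "SELECT store, sale_date, amount FROM sales ORDER BY store"
).fetchall()
print(loaded)
```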

Page 23: INFORMATICA EASY LEARNING ONLINE TRAINING

Phases of Data Warehousing Project

5. Develop Reports / Dashboards

This is handled by the reporting team.

Page 24: INFORMATICA EASY LEARNING ONLINE TRAINING

Phases of Data Warehousing Project

6. Testing ETL Mappings and Reports / Dashboards

This is handled by the QA department.

7. Deploying to production and maintenance by the production team

This is handled by the production department.

Where do we fit after completing this training?

Page 25: INFORMATICA EASY LEARNING ONLINE TRAINING

Phases of Data Warehousing Project

Where do we fit after completing this training?

We can work as:
1. ETL Developer
2. ETL Administrator
3. ETL Tester

Page 26: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Modeling

Page 27: INFORMATICA EASY LEARNING ONLINE TRAINING

What is Data Modeling?

• A data model defines the relationships between data elements.

• The dimensional data model is most often used in data warehousing systems.

• Data modeling is the process of learning about the data.

Data models are designed by data modelers.

Page 28: INFORMATICA EASY LEARNING ONLINE TRAINING

What is Dimensional Modeling?

• It helps us store the data

Goals and benefits of Dimensional Modeling
• Faster data retrieval
• Better understandability
• Extensibility

It has 2 distinct categories
• Dimensions
• Measures

Page 29: INFORMATICA EASY LEARNING ONLINE TRAINING

Scenarios of Dimensional Data Modeling

McDonald’s client: I want to store information on how many burgers and fries are sold per day from a single McDonald’s outlet.

What is a dimension and what is a measure in this example?

Step 1: Identify the dimensions

1. Food (ex: Burgers and fries)
2. Store (McDonald’s)
3. Some specific day

Step 2: Identify the measures

The number of burgers/fries sold is a measure.

The fact table captures the data that measures the organization's business operations.

Page 30: INFORMATICA EASY LEARNING ONLINE TRAINING

Scenarios of Dimensional Data Modeling

Step3: Identify the attributes or properties of dimensions

Food dimension

KEY   NAME
1     Burger
2     Fries

Store dimension

KEY   NAME
1     Store 1
2     Store 2
...   ...

Day dimension

KEY   DAY
1     01 Jan 2012
2     02 Jan 2012
3     03 Jan 2012
...   ...
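Steps 1 to 3 can be put together as a small star schema: three dimension tables holding the keys and attributes above, and one fact table holding the measure. The sketch below uses SQLite; the table and column names (dim_food, dim_store, dim_day, fact_sales, units_sold) and the sales figures are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_food  (food_key  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_day   (day_key   INTEGER PRIMARY KEY, day  TEXT);

-- The fact table stores one measure per food, store, and day.
CREATE TABLE fact_sales (
    food_key   INTEGER REFERENCES dim_food(food_key),
    store_key  INTEGER REFERENCES dim_store(store_key),
    day_key    INTEGER REFERENCES dim_day(day_key),
    units_sold INTEGER
);

INSERT INTO dim_food  VALUES (1, 'Burger'), (2, 'Fries');
INSERT INTO dim_store VALUES (1, 'Store 1'), (2, 'Store 2');
INSERT INTO dim_day   VALUES (1, '01 Jan 2012'), (2, '02 Jan 2012'), (3, '03 Jan 2012');

-- Hypothetical sales figures.
INSERT INTO fact_sales VALUES (1, 1, 1, 120), (2, 1, 1, 200), (1, 1, 2, 90), (2, 2, 1, 75);
""")

# Units sold per food item per day for a single outlet (Store 1).
rows = conn.execute("""
    SELECT f.name, d.day, SUM(s.units_sold)
    FROM fact_sales s
    JOIN dim_food f ON f.food_key = s.food_key
    JOIN dim_day  d ON d.day_key  = s.day_key
    WHERE s.store_key = 1
    GROUP BY d.day_key, d.day, f.name
    ORDER BY d.day_key, f.name
""").fetchall()
print(rows)
```

The fact table holds only keys and the measure; descriptive text such as the food name lives once in its dimension table.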

Page 31: INFORMATICA EASY LEARNING ONLINE TRAINING

Scenarios of Dimensional Data Modeling

Step 4: Identify the granularity of the measures

What is meant by "Granularity"?

Granularity refers to the lowest (most detailed) level of information stored in a table. In the McDonald’s example, the grain is the number of items sold per food item, per store, per day.
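A short sketch of why the grain matters: data stored at a daily grain can always be rolled up to a monthly grain, but monthly totals cannot be split back into days. The sample figures and the ISO date format are assumptions for illustration.

```python
from collections import defaultdict

# Facts stored at daily grain: (food, day, units sold). Hypothetical values.
daily = [
    ("Burger", "2012-01-01", 120),
    ("Burger", "2012-01-02", 90),
    ("Burger", "2012-02-01", 50),
    ("Fries", "2012-01-01", 200),
]

# Roll up to monthly grain by truncating the date to YYYY-MM.
monthly = defaultdict(int)
for food, day, units in daily:
    monthly[(food, day[:7])] += units
monthly = dict(monthly)

print(monthly)
```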

Page 32: INFORMATICA EASY LEARNING ONLINE TRAINING

Scenarios of Dimensional Data Modeling

Step 5: History Preservation (Optional)

When dimension attributes change over time and their history must be preserved, the dimension tables can be designed as slowly changing dimensions.
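One common way to preserve history is a Type 2 slowly changing dimension, where a change closes the current row and adds a new one instead of overwriting it. The slides do not say which type is meant, so the sketch below is only one possible reading; the function apply_scd2 and the column names are hypothetical.

```python
from datetime import date


def apply_scd2(dimension, natural_key, new_attrs, change_date):
    """Type 2 update: expire the current row and append a new version."""
    current = next(
        (r for r in dimension
         if r["natural_key"] == natural_key and r["is_current"]),
        None,
    )
    if current is not None:
        if all(current[k] == v for k, v in new_attrs.items()):
            return  # nothing changed, keep the current row
        current["is_current"] = False
        current["end_date"] = change_date
    dimension.append({
        "surrogate_key": len(dimension) + 1,
        "natural_key": natural_key,
        **new_attrs,
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })


store_dim = []
apply_scd2(store_dim, "S1", {"city": "Austin"}, date(2012, 1, 1))
apply_scd2(store_dim, "S1", {"city": "Dallas"}, date(2012, 6, 1))  # store moved
apply_scd2(store_dim, "S1", {"city": "Dallas"}, date(2012, 7, 1))  # no change

for row in store_dim:
    print(row)
```

Facts loaded before the move keep pointing at the first surrogate key, so older reports still show the old city.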

Entities: Entities are the things about which you want to store information.

For example: EMPLOYEE

Page 33: INFORMATICA EASY LEARNING ONLINE TRAINING

Scenarios of Dimensional Data Modeling

Cardinalities:

Cardinality shows how many instances on one side of a relationship correspond to how many instances on the other side.

For example:
• How many customers belong to 1 sale?
• How many sales belong to 1 customer?
• How many sales take place in 1 shop?

Customers --> Sales: 1 customer can buy something several times
Sales --> Customers: 1 sale is always made by 1 customer at a time
Customers --> Products: 1 customer can buy multiple products
Products --> Customers: 1 product can be purchased by multiple customers
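These cardinalities can be checked against a handful of records. In the sketch below each sale has exactly one customer (one-to-many from customers to sales), while customers and products relate many-to-many. The customer names and sale records are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sales: each sale belongs to exactly one customer.
sales = [
    {"sale_id": 1, "customer": "Ann", "products": ["Burger", "Fries"]},
    {"sale_id": 2, "customer": "Ann", "products": ["Fries"]},
    {"sale_id": 3, "customer": "Bob", "products": ["Fries"]},
]

sales_per_customer = defaultdict(list)     # customer -> many sales
products_per_customer = defaultdict(set)   # customer -> many products
customers_per_product = defaultdict(set)   # product  -> many customers

for sale in sales:
    sales_per_customer[sale["customer"]].append(sale["sale_id"])
    for product in sale["products"]:
        products_per_customer[sale["customer"]].add(product)
        customers_per_product[product].add(sale["customer"])

print(dict(sales_per_customer))
print(dict(products_per_customer))
print(dict(customers_per_product))
```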

Page 34: INFORMATICA EASY LEARNING ONLINE TRAINING

Scenarios of Dimensional Data Modeling for Banking

Page 35: INFORMATICA EASY LEARNING ONLINE TRAINING

Scenarios of Dimensional Data Modeling for Retail Banking

Page 36: INFORMATICA EASY LEARNING ONLINE TRAINING

Scenarios of Dimensional Data Modeling for Retail Banking

Event 1 - Set-up Banks and Branches
Event 2 - Create new Customer
Event 3 - Setup New Account
Event 4 - Issue Credit Card
Event 5 - Customer makes Deposit
Event 6 - Customer uses Card
Event 7 - Bank Issues Statement
Event 8 - Customer closes Account

Page 37: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Modeling

Page 38: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Modeling

Page 39: INFORMATICA EASY LEARNING ONLINE TRAINING

Data Modeling

Page 40: INFORMATICA EASY LEARNING ONLINE TRAINING

Types of OLAP Servers

There are four types of OLAP servers:

• Relational OLAP (ROLAP)
• Multidimensional OLAP (MOLAP)
• Hybrid OLAP (HOLAP)
• Specialized SQL Servers

Page 41: INFORMATICA EASY LEARNING ONLINE TRAINING

OLTP vs. OLAP

Page 42: INFORMATICA EASY LEARNING ONLINE TRAINING

OLTP Data Model

Page 43: INFORMATICA EASY LEARNING ONLINE TRAINING

OLTP OLAP

Page 44: INFORMATICA EASY LEARNING ONLINE TRAINING

Snowflake Schema

Page 45: INFORMATICA EASY LEARNING ONLINE TRAINING

Snowflake Schema

Page 46: INFORMATICA EASY LEARNING ONLINE TRAINING

Star Schema

Page 47: INFORMATICA EASY LEARNING ONLINE TRAINING

Informatica