Harsh_Ace_ETL Testing
Transcript of Harsh_Ace_ETL Testing
© 2013 Wells Fargo Bank, N.A. All rights reserved. Internal use only.
ETL Testing
Harsh Kumar
Senior Quality Analyst
Bangalore
26/02/2015
2
Topics Covered
. Purpose of ETL Testing
. ETL Testing categories
. ETL Testing Process
. ETL Testing Techniques
. ETL Table Partition
. ETL Extraction Method
. ETL Transformation Example
. ETL Load Method
. ETL Tool Demo
3
Purpose of ETL Testing
To analyze data for critical business decisions.
Provides a platform to move data from various sources into a data warehouse.
Company data may be scattered across different locations and formats.
ETL helps bring it into one consistent system.
Removes mistakes and corrects data.
4
ETL Testing Categories
ETL or Data warehouse testing is categorized into four different engagements.
New Data Warehouse Testing – A new DW is built and verified from scratch. Data input is taken from customer requirements and different data sources, and the new data warehouse is built and verified with the help of ETL tools.
Migration Testing – In this type of project the customer has an existing DW and ETL performing the job but is looking to adopt a new tool to improve efficiency.
Change Request – In this type of project new data is added from different sources to an existing DW. The customer may also need to change an existing business rule or integrate a new one.
Report Testing – Reports are the end result of any data warehouse and the basic purpose for which a DW is built. Reports must be tested by validating the layout, the data in the report, and the calculations.
5
ETL Testing Process
Verify that data transformation from source to destination works as expected.
Verify that the expected data is added to the target system.
Verify that all DB fields and field data are loaded without any truncation.
Verify that data checksums and record counts match between source and target.
Verify that proper error logs with all details are generated for rejected data.
Verify NULL value fields.
Verify that duplicate data is not loaded.
Verify data integrity.
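Several of the checks above can be sketched as SQL queries run from Python's built-in sqlite3 module. This is only an illustration: the tables `src_customer` and `tgt_customer`, their columns, and the sample rows are hypothetical and not taken from any real system.

```python
import sqlite3

# Hypothetical source and target tables, created in memory for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (id INTEGER, name TEXT, state TEXT);
    CREATE TABLE tgt_customer (id INTEGER, name TEXT, state TEXT);
    INSERT INTO src_customer VALUES (1, 'Williams', 'New York'), (2, 'Kumar', 'Karnataka');
    INSERT INTO tgt_customer VALUES (1, 'Williams', 'New York'), (2, 'Kumar', 'Karnataka');
""")

def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]

def run_checks():
    """Return a dict of check name -> True (pass) / False (fail)."""
    return {
        # Record counts must match between source and target.
        "count_match": scalar("SELECT COUNT(*) FROM src_customer")
                       == scalar("SELECT COUNT(*) FROM tgt_customer"),
        # No key may appear more than once in the target.
        "no_duplicates": scalar(
            "SELECT COUNT(*) FROM (SELECT id FROM tgt_customer "
            "GROUP BY id HAVING COUNT(*) > 1)") == 0,
        # Mandatory fields must not be NULL.
        "no_nulls": scalar(
            "SELECT COUNT(*) FROM tgt_customer "
            "WHERE id IS NULL OR name IS NULL") == 0,
        # Every source row must appear unchanged in the target.
        "no_missing_rows": scalar(
            "SELECT COUNT(*) FROM (SELECT * FROM src_customer "
            "EXCEPT SELECT * FROM tgt_customer)") == 0,
    }

results = run_checks()
print(results)
```

In a real project the same queries would run against the actual source and target databases, and a failed check would be logged as a defect.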
6
ETL Process
The most time-consuming process in DW development
80% of development time is spent on ETL!
Extract relevant data
Transform data to DW format
Build keys, etc.
Cleansing of data
Load data into DW
Build aggregates, etc.
7
ETL Testing Techniques:
Verify that data is transformed correctly according to various business requirements and rules.
Make sure that all projected data is loaded into the data warehouse without any data loss or truncation.
Make sure that the ETL application appropriately rejects invalid data, replaces it with default values, and reports it.
Make sure that data is loaded into the data warehouse within the prescribed and expected time frames to confirm improved performance and scalability.
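The reject / replace-with-default / report behavior can be sketched as below. The rule that a missing `customer_key` causes a reject and an empty `state` gets a default is an assumption made for the example, as are the field names.

```python
def clean_rows(rows, default_state="UNKNOWN"):
    """Split rows into loadable rows and rejected rows with a reason."""
    loaded, rejected = [], []
    for row in rows:
        # Reject: a row without a customer key cannot be loaded.
        if row.get("customer_key") is None:
            rejected.append({"row": row, "reason": "missing customer_key"})
            continue
        # Replace: fall back to a default when an optional field is empty.
        cleaned = dict(row)
        if not cleaned.get("state"):
            cleaned["state"] = default_state
        loaded.append(cleaned)
    return loaded, rejected

rows = [
    {"customer_key": 1001, "name": "Williams", "state": "New York"},
    {"customer_key": 1005, "name": "Williams", "state": ""},
    {"customer_key": None, "name": "Williams", "state": "California"},
]
loaded, rejected = clean_rows(rows)
print(len(loaded), "loaded;", len(rejected), "rejected")
```

A test case would feed known bad rows through the ETL job and compare the reject log against a list like `rejected`.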
8
ETL Table Partition
A process that lets you decompose large tables into smaller, more manageable pieces called partitions.
Range Partition (date column)
List Partition (sales region column)
Hash Partition (id column, tablespace)
Composite Partition (range + hash, range + list)
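How a row is assigned to a partition under each scheme can be sketched in a few lines. The database does this internally; the partition names, the yearly range boundaries, the region mapping, and the use of CRC32 as the hash are all assumptions chosen for illustration.

```python
from datetime import date
import zlib

def range_partition(order_date):
    """Range partition on a date column: one partition per year."""
    return f"p_{order_date.year}"

def list_partition(region):
    """List partition on a sales region column."""
    mapping = {"North": "p_north", "South": "p_south"}
    return mapping.get(region, "p_other")

def hash_partition(row_id, num_partitions=4):
    """Hash partition on an id column; crc32 gives a stable hash."""
    return zlib.crc32(str(row_id).encode()) % num_partitions

def composite_partition(order_date, row_id, num_partitions=4):
    """Composite (range + hash): range partition, then hash subpartition."""
    return (range_partition(order_date), hash_partition(row_id, num_partitions))

print(range_partition(date(2015, 2, 26)))
print(list_partition("North"))
print(composite_partition(date(2015, 2, 26), 1001))
```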
9
ETL Extraction Method
The process of reading data from a database.
There are 4 methods:
Scripts in Linux shell, Perl, Python
sqlldr + SQL
Hardcoded Java, C#, JDBC
In-house built ETL tool
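As an example of the script-based method, a minimal Python extract might read a table and write it out as CSV. The `orders` table, its columns, and the in-memory SQLite database are hypothetical stand-ins for a real source system.

```python
import csv
import io
import sqlite3

def extract_to_csv(conn, query, out):
    """Run a query and write the result, with a header row, as CSV."""
    cursor = conn.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(col[0] for col in cursor.description)
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count

# Hypothetical source table for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "1998-08-05"), (2, "1998-08-08")])

buffer = io.StringIO()
n = extract_to_csv(
    conn, "SELECT order_id, order_date FROM orders ORDER BY order_id", buffer)
print(buffer.getvalue())
```

Returning the row count makes it easy to compare the extract against the source record count later.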
10
ETL Transformation Example
Columns: Source System, Type of transformation, DW

Field Splitting
Source System: Address Field: #123 ABC Street|XYZ City|1000 Republic of MN
DW: No: 123, Street: ABC, City: XYZ, Country: Republic of MN, Postal Code: 1000

Field Consolidation
Source System: System A Customer title: President; System B Customer title: CEO
DW: Customer title: President & CEO

Standardization
Source System: Order Date: 05 August 1998; Order Date: 08/08/98
DW: Order Date: 05 August 1998; Order Date: 08 August 1998

Deduplication
Source System: System Debit Card: Customer Name: Ramesh Kumar, Org: Wells Fargo, Address: Bengaluru; System Credit Card: Customer Name: Ramesh Kumar, Org: Wells Fargo, Address: Bengaluru
DW: Bank Account Details: Customer Name: Ramesh Kumar, Org: Wells Fargo, Address: Bengaluru
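The four transformations in this example can be sketched as small Python functions. The parsing rules are assumptions fitted to the sample values only: the address is taken to be pipe-separated, and 08/08/98 is read as day/month/year.

```python
import re
from datetime import datetime

def split_address(address):
    """Field splitting: break one address string into separate fields."""
    street_part, city_part, tail = [p.strip() for p in address.split("|")]
    number, street = re.match(r"#(\d+)\s+(.+?)\s+Street$", street_part).groups()
    postal_code, country = re.match(r"(\d+)\s*(.+)$", tail).groups()
    return {
        "No": number,
        "Street": street,
        "City": re.sub(r"\s+City$", "", city_part),
        "Country": country,
        "Postal Code": postal_code,
    }

def consolidate_titles(*titles):
    """Field consolidation: merge titles from several systems into one."""
    unique = list(dict.fromkeys(titles))  # drop repeats, keep order
    return " & ".join(unique)

def standardize_date(text):
    """Standardization: accept either input format, emit 'DD Month YYYY'."""
    for fmt in ("%d %B %Y", "%d/%m/%y"):
        try:
            return datetime.strptime(text, fmt).strftime("%d %B %Y")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

def deduplicate(records):
    """Deduplication: keep one copy of records that match on every field."""
    seen, result = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            result.append(rec)
    return result

print(split_address("#123 ABC Street|XYZ City|1000 Republic of MN"))
print(consolidate_titles("President", "CEO"))
print(standardize_date("08/08/98"))
```

Month names come from the default locale, so the output is in English unless the locale has been changed.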
11
ETL Load Method
The process of writing data into the target database.
Slowly Changing Dimension Load (Checksum/Incremental)
Fact Table Load (Temp/Truncate)
Snapshot Table Load (Reporting, Temp, Trunc Fact)
Current State Table Load (Insert, Update)
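Two of these load styles, truncate-and-reload for a fact table and insert/update for a current state table, can be sketched with sqlite3. The table names `fact_sales` and `customer_current` and their columns are hypothetical; SQLite has no TRUNCATE statement, so DELETE stands in for it here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_sales (order_id INTEGER, amount REAL);
    CREATE TABLE customer_current (customer_key INTEGER PRIMARY KEY, state TEXT);
""")

def truncate_and_load(rows):
    """Fact table load: empty the table, then reload the full extract."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM fact_sales")  # SQLite has no TRUNCATE
        conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", rows)

def insert_or_update(customer_key, state):
    """Current state load: update the row if it exists, else insert it."""
    with conn:
        cur = conn.execute(
            "UPDATE customer_current SET state = ? WHERE customer_key = ?",
            (state, customer_key))
        if cur.rowcount == 0:
            conn.execute("INSERT INTO customer_current VALUES (?, ?)",
                         (customer_key, state))

truncate_and_load([(1, 10.0), (2, 20.0)])
truncate_and_load([(3, 30.0)])          # replaces the previous load
insert_or_update(1001, "New York")      # inserts
insert_or_update(1001, "Los Angeles")   # updates the same row
```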
12
Star Schema (Denormalized)
13
Snowflake Schema (Normalized)
14
SCD 1 & 2
SCD 1 –
Customer Key  Name      State
1001          Williams  New York

Customer Key  Name      State
1001          Williams  Los Angeles

SCD 2 –
Customer Key  Name      State        Start Date   End Date
1001          Williams  New York     16-Mar-2009  19-Feb-2010
1005          Williams  Los Angeles  20-Feb-2010  29-Jul-2012
1009          Williams  California   30-Jul-2012  31-Dec-2999
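The difference between the two types can be sketched with plain Python dictionaries: SCD 1 overwrites the value, while SCD 2 closes the current row and adds a new one. The function names and row layout are assumptions for the example; the keys and dates are the ones from the Williams rows above.

```python
from datetime import date, timedelta

HIGH_DATE = date(2999, 12, 31)  # open-ended end date for the current row

def scd1_update(row, new_state):
    """SCD 1: overwrite the attribute in place; no history is kept."""
    row["state"] = new_state
    return row

def scd2_update(history, new_key, new_state, change_date):
    """SCD 2: close the current row and append a new row with a new key."""
    current = history[-1]
    current["end_date"] = change_date - timedelta(days=1)
    history.append({
        "customer_key": new_key,
        "name": current["name"],
        "state": new_state,
        "start_date": change_date,
        "end_date": HIGH_DATE,
    })
    return history

history = [{"customer_key": 1001, "name": "Williams", "state": "New York",
            "start_date": date(2009, 3, 16), "end_date": HIGH_DATE}]
scd2_update(history, 1005, "Los Angeles", date(2010, 2, 20))
scd2_update(history, 1009, "California", date(2012, 7, 30))
for row in history:
    print(row)
```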
15
SCD-3
SCD 3 –
Customer Key  Name      State
1001          Williams  New York

Customer Key  Name      Original State  Current State  Effective Date
1001          Williams  New York        Los Angeles    20-FEB-2010

Customer Key  Name      Original State  Current State  Effective Date
1001          Williams  New York        California     30-JUL-2012
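A minimal sketch of the same SCD 3 behavior: the original value is kept in its own column and only the current value and effective date change. The function and field names are assumptions for the example.

```python
from datetime import date

def to_scd3(row):
    """Convert a plain row into SCD 3 shape: original and current columns."""
    return {
        "customer_key": row["customer_key"],
        "name": row["name"],
        "original_state": row["state"],
        "current_state": row["state"],
        "effective_date": None,
    }

def scd3_update(row, new_state, effective_date):
    """SCD 3: overwrite only the current column; the original is preserved."""
    row["current_state"] = new_state
    row["effective_date"] = effective_date
    return row

row = to_scd3({"customer_key": 1001, "name": "Williams", "state": "New York"})
scd3_update(row, "Los Angeles", date(2010, 2, 20))
scd3_update(row, "California", date(2012, 7, 30))
print(row)
```

After the second update the intermediate value (Los Angeles) is gone, which is the trade-off of SCD 3 compared with SCD 2.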
16
ETL Tool OFSAA
OFSAA works as an integrator: it extracts data from different sources, transforms it into the preferred format based on the business transformation rules, and loads it into the data warehouse.
http://wsvra00a0152.wellsfargo.com:9704/analytics
https://wspra00a0555.wellsfargo.com:16351/profitview/prelogin.jsp
[Diagram: ProfitView OFSAA Solution Architecture (Release 4). Labeled areas: Data Sourcing (SOR transactional data, manual data feeds, CDS), CDS Landing Area, Landing Area (History), Common Staging Area (History), Results Area with FACT tables (5 years history), Dimension and Reference Tables (used across Staging and Results Area), Aggregation / Provisioning (Essbase cubes), and the Oracle Financial Services Analytics Applications (OFSAA) user front end with a dashboard for Profitability & Cross-sell. Source systems named include iHUB, TRIP, REALM, SIMCORP, T24, Int. Calypso, EASTDIL, CPL, ORBT, ICIS, EPM, CURE and DRM. Feeds are labeled "Weekly Feed – Informatica ETL" with Data Quality Checks (Technical and Business); OFSAA batches are scheduled and monitored through AUTOSYS. Legend: Release 5 Scope Item, Release 6+ Scope Item.]
17
Thank You