ETL Optimization PDO TPT

Post on 14-Dec-2015

ETL Process Optimization using Push Down Optimization (PDO) and Teradata Parallel Transporter (TPT)

Author: S Anand
Project: GE Capital Fleet Management

Introduction
Overview of Pushdown Optimization (PDO)
How PDO works
How the Integration Service handles PDO
Explanation with two examples
Types of PDO
Active and idle databases
Working with dates
Rules and guidelines for functions in PDO
Error handling, logging, and recovery
Configuring sessions for PDO
Limitations on the types of transformations that can be pushed to the database
Benefits of PDO
Alternative to PDO: Teradata Parallel Transporter (TPT)
What TPT is and its types of operators

Overview of Pushdown Optimization (PDO)

Load balancing among servers
Push the ETL logic to the database

How PDO Works

Pushes transformation logic to the database
Translates the logic into SQL queries and sends them to the database
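
The translation step can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Integration Service is implemented: the helper build_pushdown_sql, the tables orders_src and orders_tgt, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the real target database.

```python
import sqlite3


def build_pushdown_sql(source, target, columns, filter_condition=None):
    """Turn a simple source -> filter -> target mapping into one INSERT ... SELECT.

    Hypothetical helper for illustration; identifiers are trusted, not sanitized.
    """
    column_list = ", ".join(columns)
    select_list = ", ".join(f"{source}.{column}" for column in columns)
    sql = f"INSERT INTO {target} ({column_list}) SELECT {select_list} FROM {source}"
    if filter_condition:
        sql += f" WHERE ({filter_condition})"
    return sql


# In-memory SQLite stands in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_src (order_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE orders_tgt (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders_src VALUES (?, ?)",
    [(1, 50.0), (2, 150.0), (3, 300.0)],
)

# The whole mapping runs inside the database as a single statement,
# so no rows travel through the ETL engine.
pushdown_sql = build_pushdown_sql(
    "orders_src", "orders_tgt", ["order_id", "amount"], "orders_src.amount > 100"
)
conn.execute(pushdown_sql)

loaded = conn.execute(
    "SELECT order_id, amount FROM orders_tgt ORDER BY order_id"
).fetchall()
print(pushdown_sql)
print(loaded)
```

The point of the sketch is the shape of the work: the engine only generates and submits SQL, while the database does the reading, filtering, and writing.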

How the Integration Service Handles PDO

Creation of temporary objects
Temporary sequence objects and temporary views
Session property: pushdown optimization with sequence/view
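
The temporary-view lifecycle (create, use, drop) might look roughly like the following Python sketch. It is an assumption-laden illustration: the view name tmp_pdo_view and the tables items_src and items_tgt are hypothetical, and in-memory SQLite stands in for the real database. SQLite has no sequence objects, so only the view case is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items_src (item_id INTEGER, item_name TEXT)")
conn.execute("CREATE TABLE items_tgt (item_id INTEGER, item_name TEXT)")
conn.executemany(
    "INSERT INTO items_src VALUES (?, ?)",
    [(1, " bolt "), (2, "nut")],
)

view_name = "tmp_pdo_view"  # hypothetical name, chosen for this sketch only
try:
    # 1. Create a temporary view that holds part of the transformation logic.
    conn.execute(
        f"CREATE TEMP VIEW {view_name} AS "
        "SELECT item_id, TRIM(item_name) AS item_name FROM items_src"
    )
    # 2. Load the target by selecting from the view.
    conn.execute(
        f"INSERT INTO items_tgt (item_id, item_name) "
        f"SELECT item_id, item_name FROM {view_name}"
    )
finally:
    # 3. Drop the view so nothing is left behind, even if the load fails.
    conn.execute(f"DROP VIEW IF EXISTS {view_name}")

target_rows = conn.execute(
    "SELECT item_id, item_name FROM items_tgt ORDER BY item_id"
).fetchall()
leftover_views = conn.execute(
    "SELECT name FROM sqlite_temp_master WHERE type = 'view'"
).fetchall()
print(target_rows, leftover_views)
```

Cleaning up in a finally block mirrors the reason such objects are temporary: they exist only for the duration of the session run.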

Example 1

Push Down Query

INSERT INTO T_ITEMS (ITEM_ID, ITEM_NAME, ITEM_DESC)
SELECT
  CAST((CASE WHEN 5419 IS NULL THEN '' ELSE 5419 END) + '_' +
       (CASE WHEN ITEMS.ITEM_ID IS NULL THEN '' ELSE ITEMS.ITEM_ID END) AS INTEGER),
  ITEMS.ITEM_NAME,
  ITEMS.ITEM_DESC
FROM ITEMS2 ITEMS

Example 2 (Filter transformation with condition DEPTNO > 40)

Push Down Query

insert into emp_tgt (empno, ename, sal, comm, deptno)
select emp_src.empno, emp_src.ename, emp_src.sal, emp_src.comm, emp_src.deptno
from emp_src
where (emp_src.deptno > 40)
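
To see what the Example 2 query does, it can be run against a small stand-in database. The sketch below uses in-memory SQLite rather than Teradata, and the table definitions and sample rows are assumptions made up for the illustration; only the insert ... select statement comes from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table layouts; the slide does not give the column types.
for table in ("emp_src", "emp_tgt"):
    conn.execute(
        f"CREATE TABLE {table} "
        "(empno INTEGER, ename TEXT, sal REAL, comm REAL, deptno INTEGER)"
    )

# Made-up sample rows: one below, one equal to, and one above the threshold.
conn.executemany(
    "INSERT INTO emp_src VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", 1000.0, None, 10),
        (2, "B", 2000.0, 100.0, 40),
        (3, "C", 3000.0, 200.0, 50),
    ],
)

# The pushed-down query from Example 2: the Filter transformation
# becomes the WHERE clause of a single INSERT ... SELECT.
conn.execute(
    """
    insert into emp_tgt (empno, ename, sal, comm, deptno)
    select emp_src.empno, emp_src.ename, emp_src.sal, emp_src.comm, emp_src.deptno
    from emp_src
    where (emp_src.deptno > 40)
    """
)

target_rows = conn.execute("SELECT empno, deptno FROM emp_tgt").fetchall()
print(target_rows)
```

Only the row with deptno 50 reaches the target; the row with deptno 40 is excluded because the condition is strictly greater than 40.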

Push Down Optimization Viewer

Types of Pushdown Optimization

Source Push Down Optimization
Target Push Down Optimization
Full Push Down Optimization
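
One way to picture the difference between the three types is where each piece of a mapping runs. The Python sketch below is a simplified illustration under stated assumptions: the tables src and tgt, the filter id > 1, and the UPPER expression are hypothetical, plain Python plays the role of the ETL engine, and a single in-memory SQLite database stands in for both source and target.

```python
import sqlite3


def fresh_db():
    """Create a stand-in database with a small hypothetical source table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE tgt (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO src VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
    return conn


def source_side(conn):
    # The filter is pushed into the SELECT on the source;
    # the expression and the load are done by the engine (Python here).
    rows = conn.execute("SELECT id, name FROM src WHERE id > 1").fetchall()
    conn.executemany(
        "INSERT INTO tgt VALUES (?, ?)", [(i, n.upper()) for i, n in rows]
    )


def target_side(conn):
    # The engine reads and filters the rows;
    # the expression is pushed into the INSERT on the target.
    rows = [r for r in conn.execute("SELECT id, name FROM src") if r[0] > 1]
    conn.executemany("INSERT INTO tgt VALUES (?, UPPER(?))", rows)


def full(conn):
    # Everything runs in the database as one statement.
    conn.execute("INSERT INTO tgt SELECT id, UPPER(name) FROM src WHERE id > 1")


results = {}
for strategy in (source_side, target_side, full):
    conn = fresh_db()
    strategy(conn)
    results[strategy.__name__] = conn.execute(
        "SELECT id, name FROM tgt ORDER BY id"
    ).fetchall()
print(results)
```

All three strategies load the same rows; what changes is how much data passes through the engine, with the full variant moving none.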

Integration Service behaviour with Full Push Down Optimization

Working with Dates

Date values converted to character values
Date formats for TO_CHAR and TO_DATE functions
HH24 date format
Blank spaces in date format strings
SYSDATE built-in variable

Rules and guidelines for functions in PDO

If you use ADD_TO_DATE in the transformation logic to change days, hours, minutes, or seconds, you cannot push the function to a Teradata database.

When you push LTRIM, RTRIM, or SOUNDEX to a database, the database treats the argument (' ') as NULL, but the Integration Service treats the argument (' ') as spaces.

Configuring Session for PDO

Limitations on Types of Transformations

Benefits of PDO

The PowerCenter Pushdown Optimization Option offers many benefits, including:
Increased performance by using optimal resources
Increased ease of use with a metadata-driven architecture that provides metadata lineage
Increased IT team productivity with simplified debugging and performance tuning
Reduced risk and enhanced flexibility through database neutrality

Conclusion

Although the implementation succeeded, it was difficult because of the rigid rules and guidelines that must be followed for PDO to work. The technique was also not feasible for a few of our mappings, which involved transformations or circumstances that PDO does not support; in those cases we looked for options other than PDO. Overall, however, pushdown optimization proved to be a successful method of improving job performance.

Teradata Parallel Transporter

Teradata Parallel Transporter (TPT) is a parallel, multi-function load environment. It was designed to bring the different Teradata load and unload utilities under a single infrastructure. Based on the protocols of FastLoad, MultiLoad, TPump, and FastExport, new "operators" were developed, named Load, Update, Stream, and Export respectively.

Teradata Parallel Transporter Operators

Stream
Update
Load