Best Practices for Using Informatica With Teradata Database

By Chakra Sankaraiah

(Email id - [email protected] )

Document History

Version | Date | Author | Comment | Authorization
1.0 | 12/06/2009 | Chakra Sankaraiah | Initial document |

Table of Contents

1. Introduction

2. Three Key Rules

i) Use Pushdown

ii) Using Teradata loaders instead of ODBC connections

iii) Use SQL transformation to its fullest

1. Introduction

Teradata is one of the leading databases for data warehousing. It outperforms other databases largely because of two features: parallelism and a shared-nothing architecture. That capability does not come cheap. Each node costs close to a million dollars, which is a big investment for many medium-sized companies. If you have already invested that heavily in a powerful database, you should use its capabilities to the fullest. This article focuses on how to leverage your Teradata investment from your ETL tool, and provides some best practices for using Teradata with Informatica.

2. Three Key Rules

There are three key rules to follow when you develop an Informatica job that interacts with a Teradata database:

i) Use Pushdown

ii) Using Teradata loaders instead of ODBC connections

iii) Use SQL transformation to its fullest

Each rule is explained below.

i) Use Pushdown

When to use

When you use pushdown in an Informatica session, Informatica wraps the mapping logic into SQL statements and fires them at the database. The logic in your mapping is therefore processed on the Teradata database server rather than on the Informatica server.

Steps to achieve pushdown in your Informatica job

Step1) Below is an example of an Informatica mapping that uses transformations such as Source Qualifier, Joiner, Expression and Filter.

This is a typical Informatica mapping: a Joiner compares the data, an Expression applies transformations to the data values, a Filter keeps only the modified records, and finally the data is inserted into the target. Execution of the complete logic can be moved from the Informatica server to the Teradata database by changing the session properties as below.

Step2) Session properties for pushdown

Several options are available under Pushdown, but I recommend Full with View: if you really want the work done on the database server rather than the Informatica server, you should push down as much of the session as possible.

Step3) To see what the pushdown SQL looks like, select the Mappings tab under session properties and then click Pushdown Optimization.

Step4) When you select the Pushdown option, you can see the SQL that Informatica generates.

This SQL is then sent to the database, and the complete execution happens at the database level without putting any load on the Informatica server.
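As a rough illustration of what pushdown amounts to, the sketch below collapses a Joiner, an Expression and a Filter into a single INSERT ... SELECT statement that the database executes on its own. It is only a sketch under stated assumptions: Python's built-in sqlite3 stands in for Teradata, the tables src_customer, tgt_customer and stg_changes and their columns are hypothetical, and the SQL is hand-written rather than the statement Informatica would actually generate.

```python
import sqlite3

# sqlite3 stands in for Teradata here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (cust_id INTEGER, cust_name TEXT);
    CREATE TABLE tgt_customer (cust_id INTEGER, cust_name TEXT);
    CREATE TABLE stg_changes  (cust_id INTEGER, cust_name TEXT);
    INSERT INTO src_customer VALUES (1, ' alice '), (2, 'bob'), (3, 'carol');
    INSERT INTO tgt_customer VALUES (1, 'ALICE'), (2, 'ROBERT');
""")

# One set-based statement replaces the whole mapping:
#   Joiner     -> LEFT JOIN between source and target
#   Expression -> UPPER(TRIM(...)) applied to the source value
#   Filter     -> WHERE clause keeping only new or modified rows
PUSHDOWN_SQL = """
    INSERT INTO stg_changes (cust_id, cust_name)
    SELECT s.cust_id, UPPER(TRIM(s.cust_name))
    FROM src_customer AS s
    LEFT JOIN tgt_customer AS t ON t.cust_id = s.cust_id
    WHERE t.cust_id IS NULL
       OR t.cust_name <> UPPER(TRIM(s.cust_name))
"""
conn.execute(PUSHDOWN_SQL)

changed_rows = conn.execute(
    "SELECT cust_id, cust_name FROM stg_changes ORDER BY cust_id"
).fetchall()
print(changed_rows)
```

No rows travel through the client process: the client only submits the statement, which is the property pushdown is meant to give you.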

You can learn about the various pushdown options from the Informatica help file. Pushdown cannot be used with certain Informatica transformations; these are listed in the help files as well.

ii) Using Teradata loaders instead of ODBC connections

If the target is on a different server than the source tables, or if the incoming records include updates and deletes and not just inserts, you can use Informatica loader connections rather than a relational connection for the target. A few things to keep in mind when you do this:

a) TTU (Teradata Tools and Utilities) must be installed on the same server where Informatica is installed.

b) Use a loader instead of ODBC only when you are processing a larger set of records (approximately more than 20,000).

c) Declare the keys in the target transformation according to the columns that are unique in the target table. In other words, make sure that your target definition has those columns defined as keys. (Only applicable for MLOAD.)

d) Every column that is declared as a PI in the target table should be marked as a primary key in the target definition. (Only applicable for MLOAD, because MLOAD updates require all PI columns to be present in the WHERE clause of the UPDATE statement.)
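The requirement that every PI column appears in the WHERE clause of an MLOAD update can be checked before a session runs. The helper below is a minimal sketch of such a check: the function name, the table tgt_orders and its columns are hypothetical, and the check is illustrative rather than anything Informatica or MLOAD performs itself.

```python
def build_mload_update(table, set_columns, key_columns, pi_columns):
    """Build a parameterised UPDATE whose WHERE clause uses the key columns.

    Raises ValueError if any primary index (PI) column is not marked as a key.
    This is an illustrative check, not Informatica's or MLOAD's own validation.
    """
    missing = [col for col in pi_columns if col not in key_columns]
    if missing:
        raise ValueError(f"PI columns not marked as keys: {missing}")
    set_clause = ", ".join(f"{col} = :{col}" for col in set_columns)
    where_clause = " AND ".join(f"{col} = :{col}" for col in key_columns)
    return f"UPDATE {table} SET {set_clause} WHERE {where_clause}"


# Hypothetical target table with a two-column PI.
update_sql = build_mload_update(
    table="tgt_orders",
    set_columns=["status"],
    key_columns=["order_id", "order_date"],
    pi_columns=["order_id", "order_date"],
)
print(update_sql)
```

Calling the helper with a key list that omits a PI column raises an error instead of producing an UPDATE with an incomplete WHERE clause.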

Below are the steps to load target tables using the Teradata loader utilities.

Step1) Create a loader connection in Informatica Workflow Manager.

Step2) Use the Teradata Warehouse Builder extension when you create a loader connection for Teradata.

Step3) Set the database properties and other details as required. A key decision when you create a loader connection is whether you want it for inserts only or for updates as well. For updates, use the Update or Stream operator. The Load operator uses Teradata's Fast Load utility, the Update operator uses the MLOAD utility, and the Stream operator uses the TPUMP utility. You can learn more about each operator's properties from the Informatica guide.
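The choice described so far (ODBC for small volumes, otherwise a loader whose operator depends on the kind of DML) can be summarised as a small decision helper. This is a sketch only: the function and its parameters are hypothetical, the 20,000-row threshold is the approximate figure given earlier, and picking Load for insert-only sessions is my reading of this step rather than a rule stated by Informatica.

```python
LOADER_THRESHOLD = 20_000  # approximate row count suggested in the text

# Operator -> Teradata utility, as described in Step3.
OPERATOR_UTILITY = {"Load": "FastLoad", "Update": "MLOAD", "Stream": "TPUMP"}


def choose_target_connection(row_count, inserts_only, prefer_stream=False):
    """Return (connection_type, operator, utility) for a session.

    Hypothetical helper: small volumes stay on ODBC, larger ones use a
    loader connection with an operator matching the DML involved.
    """
    if row_count <= LOADER_THRESHOLD:
        return ("ODBC", None, None)
    if inserts_only:
        operator = "Load"
    else:
        operator = "Stream" if prefer_stream else "Update"
    return ("Loader", operator, OPERATOR_UTILITY[operator])


print(choose_target_connection(5_000, inserts_only=True))
print(choose_target_connection(100_000, inserts_only=False))
```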

Step4) Once the loader connection is available, set the target connection to the loader by first changing the Writer to File Writer and then changing the connection type to Loader.

Optional: Once you set up the target connection as Loader, Informatica creates scripts based on the operator you specify. You can modify the script generated by Informatica if you want to add or remove any portion of it.

You can do this only at the session level, under the Loader value, as shown in the snapshot below.

You can still use pushdown optimization along with your target loader; the only difference is that the pushdown will extend just up to the target transformation.

iii) Use SQL transformation to its fullest

The SQL transformation comes in really handy when you are using a powerful database like Teradata. You can perform operations such as inserts, updates and deletes completely at the database level by using the SQL transformation. Before you use it, however, you need to make sure that the records are correctly flagged for insert, update or delete. The usual practice is to put the CDC (change data capture) data into a WIP (work in progress) table with appropriate flags (Delete, Update or Insert), and then use that WIP table to apply those DML operations to the target table through the SQL transformation. Below are the high-level steps to use the SQL transformation in Informatica.

Step1) First, create a mapping that populates a WIP (work in progress) table. This table can be loaded using a combination of pushdown and loaders as described above. Since most WIP tables are truncate-and-reload, they are pretty fast.

Step2) Once the data is available in the WIP table, you can create two separate mappings, one for Update/Delete and the other for Insert. When you are using type3 dimension tables, this approach comes in really handy because the SQL transformation can run an insert SQL for the records that are flagged for update in the WIP table.

In the snapshot below, the mapplet and target transformation are there just to capture errors. Likewise, the Source and Source Qualifier transformations are there just to trigger the mapping.

Step3) In the SQL transformation, you can specify the DML statement under the SQL port tab.

This approach is really effective because the complete execution happens at the database level and there is practically no load on the Informatica server.
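The WIP-table pattern can be sketched end to end as three set-based statements, one per flag. As before, this is an assumption-laden illustration rather than the author's actual mapping: Python's sqlite3 stands in for Teradata, and the tables wip_customer and dim_customer, the cdc_flag column and its 'I'/'U'/'D' values are all hypothetical.

```python
import sqlite3

# sqlite3 stands in for Teradata; wip_customer, dim_customer and the
# single-letter flags are hypothetical names chosen for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE wip_customer (cust_id INTEGER, cust_name TEXT, cdc_flag TEXT);
    INSERT INTO dim_customer VALUES (1, 'ALICE'), (2, 'BOB'), (3, 'CAROL');
    INSERT INTO wip_customer VALUES
        (2, 'ROBERT', 'U'),
        (3, NULL,     'D'),
        (4, 'DAVE',   'I');
""")

# Each statement is the kind of DML that could be placed in an SQL transformation.
DML_STATEMENTS = [
    # Apply rows flagged for update.
    """UPDATE dim_customer
       SET cust_name = (SELECT w.cust_name FROM wip_customer AS w
                        WHERE w.cust_id = dim_customer.cust_id
                          AND w.cdc_flag = 'U')
       WHERE cust_id IN (SELECT cust_id FROM wip_customer WHERE cdc_flag = 'U')""",
    # Remove rows flagged for delete.
    """DELETE FROM dim_customer
       WHERE cust_id IN (SELECT cust_id FROM wip_customer WHERE cdc_flag = 'D')""",
    # Add rows flagged for insert.
    """INSERT INTO dim_customer (cust_id, cust_name)
       SELECT cust_id, cust_name FROM wip_customer WHERE cdc_flag = 'I'""",
]

with conn:  # commit all three statements together
    for statement in DML_STATEMENTS:
        conn.execute(statement)

final_rows = conn.execute(
    "SELECT cust_id, cust_name FROM dim_customer ORDER BY cust_id"
).fetchall()
print(final_rows)
```

Because each statement works on the whole flagged set at once, the client never iterates over individual records, which is why the load on the ETL server stays negligible.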

These three simple approaches, if implemented effectively, will give you great performance for any mapping that you create in Informatica for a Teradata database. This material is very high-level and is just meant to point you in the right direction. If you have any questions about any of the above approaches, please email me at [email protected] and I will try to get back to you as soon as possible.
