75779849-SCD-TYPE2
Uploaded by dharmendard
Dynamically generate parameter files
1. PowerCenter objects – Introduction:
• A repository is the highest physical entity of a project in PowerCenter.
• A folder is a logical entity in a PowerCenter project. For example, Customer_Data is a folder.
• A workflow is analogous to a set of programs in any other programming language.
• A mapping is a single program unit that holds the logical mapping between source and target
with the required transformations. A mapping only says that a source table named EMP exists with
some structure and that a target flat file named EMP_FF exists with some structure. The mapping
does not say which schema the EMP table is in, or in which physical location the EMP_FF
file will be stored.
• A session is the physical representation of the mapping. The session defines what the mapping
leaves out. It stores where the EMP table comes from: which schema it is in, and the username
and password used to access it. It also records the physical location in which the target flat
file will be created.
• A transformation is a sub-program that performs a specific task on the input it gets and
returns some output. It can be thought of as a stored procedure in a database. Typical examples
of transformations are Filter, Lookup, Aggregator, Sorter etc.
• A set of reusable transformations can be built into something called a mapplet. A
mapplet is a set of transformations arranged in a specific order of execution.
As with any other tool or programming language, PowerCenter allows parameters to be
passed to build flexibility into the flow. Parameters are always passed to PowerCenter as data in a
flat file, and that file is called the parameter file.
2. Parameter file format for PowerCenter:
For a workflow parameter, which can be used by any session in the workflow, the parameter file
has to be created in the format below.
[Folder_name.WF:Workflow_Name]
$$parameter_name1=value
$$parameter_name2=value
For a session parameter, which can be used only by that particular session, the parameter file
has to be created in the format below.
[Folder_name.WF:Workflow_Name.ST:Session_Name]
$$parameter_name1=value
$$parameter_name2=value
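As a minimal sketch of this layout, the Python helper below renders one section of a parameter file: a heading line followed by name=value lines. It uses the Folder.WF:Workflow.ST:Session heading form that appears in the examples later in this document. The function name render_section and the sample folder, workflow and session names are illustrative assumptions, not part of PowerCenter.

```python
def render_section(folder, workflow, params, session=None):
    """Render one parameter-file section: a heading line, then name=value lines."""
    heading = f"{folder}.WF:{workflow}"
    if session is not None:
        # Session-level sections append the session name after the workflow.
        heading += f".ST:{session}"
    lines = [f"[{heading}]"]
    # No spaces are written around '=' because they would become part of
    # the parameter name or value.
    lines.extend(f"{name}={value}" for name, value in params.items())
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical names, used only to show the two heading forms.
    print(render_section("Customer_Data", "wf_load", {"$$LOAD_SRC": "SAP"}))
    print(render_section("Customer_Data", "wf_load", {"$$LOAD_CTRY": "IND"},
                         session="s_m_load"))
```

Passing session=None gives a workflow-level section, while passing a session name narrows the section to that session.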
3. Parameter handling in a data model:
All the parameters are maintained in Oracle (or any other database) tables, and a PowerCenter
session is created to generate the parameter file in the required format automatically. The aims are:
• To have flexibility in maintaining the parameter files.
• To reduce the overhead on support staff of changing the parameter file every time the value of a
parameter changes.
• To ease deployment.
For this, 4 tables are to be created in the database:
1. FOLDER table will have an entry for each folder.
2. WORKFLOWS table will list each workflow, with a reference to the
FOLDER table to say which folder the workflow is created in.
3. PARAMETERS table will hold all the parameter names irrespective of folder/workflow.
4. PARAMETER_VALUES table will hold the parameter values of each session, with references to the
PARAMETERS table for the parameter name and the WORKFLOWS table for the workflow name.
When the session name is NULL, the parameter is a workflow variable which can be
used across all the sessions in the workflow.
Because the PARAMETER_VALUES table holds only the ID columns of the
workflow and parameter, we create a view that gets all the names for us in the required format of
the parameter file. Below is the DDL for the view.
a. Parameter file view:
CREATE OR REPLACE VIEW PARAMETER_FILE
(
HEADER,
DETAIL
)
AS
select '[' || fol.folder_name || '.WF:' || wfw.workflow_name || ']' header
,pmr.parameter_name || nvl2(dtl.logical_name, '_' || dtl.logical_name, NULL) || '='
|| dtl.value detail
from folder fol
,parameters pmr
,WORKFLOWS wfw
,PARAMETER_VALUES dtl
where fol.id = wfw.folder_id
and dtl.pmr_id = pmr.id
and dtl.wfw_id = wfw.id
and dtl.session_name is null
UNION
select '[' || fol.folder_name || '.WF:' || wfw.workflow_name || '.ST:' || dtl.session_name || ']' header
,decode(dtl.mapplet_name, NULL, NULL, dtl.mapplet_name || '.')
|| pmr.parameter_name || nvl2(dtl.logical_name, '_' || dtl.logical_name, NULL) || '=' || dtl.value detail
from folder fol
,parameters pmr
,WORKFLOWS wfw
,PARAMETER_VALUES dtl
where fol.id = wfw.folder_id
and dtl.pmr_id = pmr.id
and dtl.wfw_id = wfw.id
and dtl.session_name is not null
b. FOLDER table
ID (NUMBER)
FOLDER_NAME (varchar50)
DESCRIPTION (varchar50)
c. WORKFLOWS table
ID (NUMBER)
WORKFLOW_NAME (varchar50)
FOLDER_ID (NUMBER) Foreign Key to FOLDER.ID
DESCRIPTION (varchar50)
d. PARAMETERS table
ID (NUMBER)
PARAMETER_NAME (varchar50)
DESCRIPTION (varchar50)
e. PARAMETER_VALUES table
ID (NUMBER)
WFW_ID (NUMBER) Foreign Key to WORKFLOWS.ID
PMR_ID (NUMBER) Foreign Key to PARAMETERS.ID
LOGICAL_NAME (varchar50)
VALUE (varchar50)
SESSION_NAME (varchar50)
MAPPLET_NAME (varchar50)
• LOGICAL_NAME is a normalization measure in the above parameter logic. For example, if a
mapping needs $$SOURCE_FX as one parameter and $$SOURCE_TRANS as
another, then instead of creating 2 different parameters in the PARAMETERS
table, we create one parameter $$SOURCE. FX and TRANS will then be two
LOGICAL_NAME values in the PARAMETER_VALUES table, and the view joins each of them to the
parameter name with an underscore.
• m_PARAMETER_FILE is the mapping that creates the parameter file in the desired format
and the corresponding session name is s_m_PARAMETER_FILE.
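The last step, turning the view's (HEADER, DETAIL) rows into the text of a parameter file, can be sketched outside PowerCenter as follows. This is only an illustration of the grouping logic, not the m_PARAMETER_FILE mapping itself; the function name build_parameter_file and the sample values FX_FEED and TRANS_FEED are assumptions made for the example.

```python
from itertools import groupby


def build_parameter_file(rows):
    """Turn (header, detail) rows, as returned by the view, into file text."""
    # Sort so that all details of one header are adjacent, then group them.
    ordered = sorted(rows)
    blocks = []
    for header, group in groupby(ordered, key=lambda row: row[0]):
        details = [detail for _, detail in group]
        # Each header is written once, followed by its name=value lines.
        blocks.append("\n".join([header] + details))
    return "\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical rows in the shape the PARAMETER_FILE view produces.
    sample_rows = [
        ("[Customer_Data.WF:wf_load]", "$$SOURCE_FX=FX_FEED"),
        ("[Customer_Data.WF:wf_load]", "$$SOURCE_TRANS=TRANS_FEED"),
        ("[Customer_Data.WF:wf_load.ST:s_m_load]", "$$LOAD_CTRY=IND"),
    ]
    print(build_parameter_file(sample_rows), end="")
```

Sorting first is a design choice that keeps the sketch short. The order of the sections in the output does not matter, because each heading fully identifies its scope.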
Parameter File in Informatica
1. A parameter file contains a list of parameters and variables with their assigned values.
$$LOAD_SRC=SAP
$$DOJ=01/01/2011 00:00:01
$PMSuccessEmailUser=[email protected]
2. Each heading section identifies the Integration Service, Folder, Workflow, Worklet, or Session to which the
parameters or variables apply.
[Global]
[Folder_Name.WF:Workflow_Name.WT:Worklet_Name.ST:Session_Name]
[Session_Name]
3. Define each parameter and variable as a name=value pair on a new line directly below
the heading section. The order of the parameters and variables within the section is not important.
[Folder_Name.WF:Workflow_Name.ST:Session_Name]
$DBConnection_SRC=Info_Src_Conn
$DBConnection_TGT=Info_Tgt_Conn
$$LOAD_CTRY=IND
$Param_Src_Ownername=ODS
$Param_Src_Tablename=EMPLOYEE_IND
4. The Integration Service interprets all characters between the beginning of the line and the first equals sign as
the parameter name, and all characters between the first equals sign and the end of the line as
the parameter value. If we leave a space between the parameter name and the equals sign, the Integration
Service interprets the space as part of the parameter name.
5. If a line contains multiple equals signs, the Integration Service interprets all equals signs after the first one as part
of the parameter value.
6. Do not enclose parameter or variable values in quotes, as the Integration Service interprets everything after the
first equals sign as part of the value.
7. Do not leave unnecessary line breaks or spaces, as the Integration Service interprets additional spaces as part of
a parameter name or value.
8. Mapping parameter and variable names are not case sensitive.
9. To assign a null value, set the parameter or variable value to <null> or simply leave the value blank.
$PMBadFileDir=<null>
$PMCacheDir=
10. The Integration Service treats lines that are not valid headings and do not contain an equals sign character
(=) as comments and ignores them.
---------------------------------------
Created on 01/01/2011 by Admin.
Folder: Work_Folder
CTRY:SG
; Above are all valid comments
; because this line contains no equals sign.
11. Precede parameters and variables used within mapplets with their corresponding mapplet name.
[Session_Name]
mapplet_name.LOAD_CTRY=SG
mapplet_name.REC_TYPE=D
12. If a parameter or variable is defined in multiple sections in the parameter file, the parameter or variable with
the smallest scope takes precedence over parameters or variables with larger scope.
[Folder_Name.WF:Workflow_Name]
$DBConnection_TGT=Orcl_Global
[Folder_Name.WF:Workflow_Name.ST:Session_Name]
$DBConnection_TGT=Orcl_SG
In the named session, the value of the session parameter $DBConnection_TGT is Orcl_SG; for
all other sessions in the workflow, the connection object used is Orcl_Global.
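The parsing rules above (split at the first equals sign, keep spaces, treat other lines as comments) can be illustrated with a small Python reader. This is a simplified sketch, not the Integration Service's actual parser: it accepts any line of the form [..] as a heading, skips name=value lines that appear before the first heading, and keeps <null> as literal text. The function name parse_parameter_file is an assumption.

```python
def parse_parameter_file(text):
    """Parse parameter-file text into {heading: {name: value}}."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("[") and line.endswith("]"):
            # Simplification: every [..] line is accepted as a heading.
            current = sections.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            # Split at the FIRST '=' only; later '=' belong to the value.
            name, value = line.split("=", 1)
            # Nothing is stripped: spaces stay part of the name or value.
            current[name] = value
        # Any other line is treated as a comment and skipped.
    return sections


if __name__ == "__main__":
    sample = "\n".join([
        "; a comment line",
        "[Folder_Name.WF:Workflow_Name.ST:Session_Name]",
        "$DBConnection_SRC=Info_Src_Conn",
        "$$WHERE=AND DEPTNO=10",
        "$PMCacheDir=",
    ])
    print(parse_parameter_file(sample))
```

Note how a name written with a trailing space before the equals sign would be stored with that space, which is why the rules warn against extra whitespace.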
Scope of Informatica Parameter File
Next we take a quick look at how we can restrict the scope of parameters by changing the parameter file heading
section.
1. [Global] -> All Integration Services, Workflows, Worklets, Sessions.
2. [Service:IntegrationService_Name] -> The named Integration Service and the Workflows, Worklets, Sessions
that run under it.
3. [Service:IntegrationService_Name.ND:Node_Name] -> The named Integration Service process on that node and
the Workflows, Worklets, Sessions that run under it.
4. [Folder_Name.WF:Workflow_Name] -> The named workflow and all sessions within the workflow.
5. [Folder_Name.WF:Workflow_Name.WT:Worklet_Name] -> The named worklet and all sessions within the
worklet.
6. [Folder_Name.WF:Workflow_Name.WT:Worklet_Name.WT:Nested_Worklet_Name] -> The named nested
worklet and all sessions within the nested worklet.
7. [Folder_Name.WF:Workflow_Name.WT:Worklet_Name.ST:Session_Name] -> The named session.
8. [Folder_Name.WF:Workflow_Name.ST:Session_Name] -> The named session.
9. [Folder_Name.ST:Session_Name] -> The named session.
10. [Session_Name] -> The named session.
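The rule that the smallest scope wins can be sketched as a lookup that applies headings from widest to narrowest. The Python function below is a simplified illustration covering only three of the heading forms listed above (global, workflow, session); the function name resolve and the dictionary layout are assumptions for the example.

```python
def resolve(sections, folder, workflow, session):
    """Return the effective parameters for one session; narrower scopes win."""
    # Headings ordered from widest to narrowest scope. Service, node and
    # worklet headings are left out to keep the sketch short.
    scopes = [
        "Global",
        f"{folder}.WF:{workflow}",
        f"{folder}.WF:{workflow}.ST:{session}",
    ]
    effective = {}
    for heading in scopes:
        # Later (narrower) headings overwrite earlier (wider) ones.
        effective.update(sections.get(heading, {}))
    return effective


if __name__ == "__main__":
    sections = {
        "Folder_Name.WF:Workflow_Name": {"$DBConnection_TGT": "Orcl_Global"},
        "Folder_Name.WF:Workflow_Name.ST:Session_Name": {"$DBConnection_TGT": "Orcl_SG"},
    }
    print(resolve(sections, "Folder_Name", "Workflow_Name", "Session_Name"))
    print(resolve(sections, "Folder_Name", "Workflow_Name", "Other_Session"))
```

With the sample data, the named session resolves $DBConnection_TGT to Orcl_SG, and any other session in the workflow falls back to Orcl_Global.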
Types of Parameters and Variables
There are many types of parameters and variables we can define. The list is below:
Service Variables: To override Integration Service variables such as email addresses, log file counts,
and error thresholds. Examples of service variables are $PMSuccessEmailUser, $PMFailureEmailUser,
$PMWorkflowLogCount, $PMSessionLogCount, and $PMSessionErrorThreshold.
Service Process Variables: To override the directories for Integration Service files for each Integration
Service process. Examples of service process variables are $PMRootDir, $PMSessionLogDir and
$PMBadFileDir.
Workflow Variables: To use variable values at workflow level. An example is a user-defined workflow variable like
$$Rec_Cnt.
Worklet Variables: To use variable values at worklet level. An example is a user-defined worklet variable like
$$Rec_Cnt. We can use predefined worklet variables like $TaskName.PrevTaskStatus in a parent workflow,
but we cannot use workflow variables from the parent workflow in a worklet.
Session Parameters: Define values that may change from session to session, such as database
connections, db owner, or file names. $PMSessionLogFile, $DynamicPartitionCount and
$Param_Tgt_Tablename are user-defined session parameters. Built-in session parameters include:
$PMFolderName, $PMIntegrationServiceName, $PMMappingName, $PMRepositoryServiceName,
$PMRepositoryUserName, $PMSessionName, $PMSessionRunMode [Normal/Recovery],
$PM_SQ_EMP@numAffectedRows, $PM_SQ_EMP@numAppliedRows, $PM_SQ_EMP@numRejectedRows,
$PM_SQ_EMP@TableName, $PM_TGT_EMP@numAffectedRows, $PM_TGT_EMP@numAppliedRows,
$PM_TGT_EMP@numRejectedRows, $PM_TGT_EMP@TableName, $PMWorkflowName, $PMWorkflowRunId,
$PMWorkflowRunInstanceName.
Note: Here SQ_EMP is the Source Qualifier name and TGT_EMP is the Target Definition name.
Mapping Parameters: Define values that remain constant throughout a session run. Examples are
$$LOAD_SRC and $$LOAD_DT. An example of a predefined parameter is $$PushdownConfig.
Mapping Variables: Define values that change during a session run. The Integration Service saves the
value of a mapping variable to the repository at the end of each successful session run and uses that value
the next time we run the session. An example is $$MAX_LOAD_DT.
Difference between Mapping Parameters and Variables
A mapping parameter represents a constant value that we define before running a session. A mapping parameter
retains the same value throughout the entire session. If we want to change the value of a mapping parameter
between session runs, we need to update the parameter file.
A mapping variable represents a value that can change during the session. The Integration Service saves the value
of a mapping variable to the repository at the end of each successful session run and uses that value the next time
we run the session. Variable functions like SetMaxVariable, SetMinVariable, SetVariable and SetCountVariable are
used in the mapping to change the value of the variable. At the beginning of a session, the Integration Service
evaluates references to a variable to determine its start value. The next time we run the session, the Integration
Service resolves references to the variable to the saved value. To override the saved value, define the start value of
the variable in the parameter file.
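The life cycle of a mapping variable described above (start value, change during the run, save on success, override from the parameter file) can be modelled with a toy Python class. This is only a mental model under stated assumptions, not how the Integration Service is implemented: the class name MappingVariable, its method names, and the use of an attribute to stand in for the repository are all hypothetical.

```python
class MappingVariable:
    """Toy model of a mapping variable's start, current and saved values."""

    def __init__(self, initial_value):
        self.saved = initial_value    # stands in for the value in the repository
        self.current = initial_value

    def start_session(self, param_file_value=None):
        # A value given in the parameter file overrides the saved value.
        if param_file_value is None:
            self.current = self.saved
        else:
            self.current = param_file_value

    def set_max(self, candidate):
        # Mirrors the idea behind SetMaxVariable: keep the larger value.
        self.current = max(self.current, candidate)

    def end_session(self, succeeded):
        # The final value is persisted only after a successful run.
        if succeeded:
            self.saved = self.current


if __name__ == "__main__":
    max_load_dt = MappingVariable("2010-01-01")
    max_load_dt.start_session()
    max_load_dt.set_max("2010-11-02")
    max_load_dt.end_session(succeeded=True)
    print(max_load_dt.saved)
```

Dates are kept as ISO-formatted strings here so that plain string comparison orders them correctly. A mapping parameter, by contrast, would have no set_max step at all and would change only when the parameter file changes.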
Parameterize Connection Object
The most common thing we parameterize is the Relational Connection Object, since the connection
information changes as code moves from the Development to the Production environment. We therefore prefer
parameterization to setting the connection objects for each and every source, target and lookup every
time we migrate our code to a new environment. E.g.
$DBConnection_SRC
$DBConnection_TGT
If we have one source and one target connection object in our mapping, it is better to relate all the Sources, Targets,
Lookups and Stored Procedures to the $Source and $Target connections. Then we only parameterize the $Source and
$Target connection information as:
$Source connection value with the Parameterized Connection $DBConnection_SRC
$Target connection value with the Parameterized Connection $DBConnection_TGT
Let's look at what the parameter file looks like. Parameterization can be done at folder level, workflow level,
worklet level and down to session level.
[WorkFolder.WF:wf_Parameterize_Src.ST:s_m_Parameterize_Src]
$DBConnection_SRC=Info_Src_Conn
$DBConnection_TGT=Info_Tgt_Conn
Here Info_Src_Conn and Info_Tgt_Conn are Informatica Relational Connection Objects.
Note: The $DBConnection prefix lets Informatica know that we are parameterizing Relational Connection
Objects.
For Application Connections use a name such as $AppConnection_Siebel; use $LoaderConnection_Orcl when
parameterizing Loader Connection Objects and $QueueConnection_portal for Queue Connection Objects.
We can likewise use mapping-level parameters and variables as and when required. For example
$$LOAD_SRC, $$LOAD_CTRY, $$COMISSION, $$DEFAULT_DATE, $$CDC_DT.
Parameterize Source Target Table and Owner Name
A situation may arise where we need to use a single mapping to read from various DB schemas and tables and load
the data into different DB schemas and tables, provided the table structure is the same.
A practical scenario: we need to load employee information for IND, SGP and AUS into a global
data warehouse. The source tables may be orcl_ind.emp, orcl_sgp.employee, orcl_aus.emp_aus.
So we can fully parameterize the source and target table names and owner names.
$Param_Src_Tablename
$Param_Src_Ownername
$Param_Tgt_Tablename
$Param_Tgt_Ownername
The parameter file:
[WorkFolder.WF:wf_Parameterize_Src.ST:s_m_Parameterize_Src]
$DBConnection_SRC=Info_Src_Conn
$DBConnection_TGT=Info_Tgt_Conn
$Param_Src_Ownername=ODS
$Param_Src_Tablename=EMPLOYEE_IND
$Param_Tgt_Ownername=DWH
$Param_Tgt_Tablename=EMPLOYEE_GLOBAL
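To show how one mapping can serve all three countries, the sketch below generates the table and owner lines of the section for each source. The source owners and tables come from the scenario above and the target values from the sample file; the function name section_for, the SOURCES dictionary, and the idea of producing one section per run are assumptions made for illustration.

```python
SOURCES = {
    # country: (source owner, source table), taken from the scenario above
    "IND": ("orcl_ind", "emp"),
    "SGP": ("orcl_sgp", "employee"),
    "AUS": ("orcl_aus", "emp_aus"),
}

HEADING = "WorkFolder.WF:wf_Parameterize_Src.ST:s_m_Parameterize_Src"


def section_for(country):
    """Build the parameter-file section for one country's run."""
    owner, table = SOURCES[country]
    return "\n".join([
        f"[{HEADING}]",
        f"$Param_Src_Ownername={owner}",
        f"$Param_Src_Tablename={table}",
        # The target stays the same for every country.
        "$Param_Tgt_Ownername=DWH",
        "$Param_Tgt_Tablename=EMPLOYEE_GLOBAL",
    ])


if __name__ == "__main__":
    for country in SOURCES:
        print(section_for(country))
        print()
```

Only the two source lines differ between runs, which is the point of parameterizing the table and owner names rather than cloning the mapping.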
Parameterize Source Qualifier Attributes
Next, the other attributes we can parameterize in the Source Qualifier:
Sql Query: $Param_SQL
Source Filter: $Param_Filter
Pre SQL: $Param_Src_Presql
Post SQL: $Param_Src_Postsql
If we have a user-defined SQL statement with a join as well as a filter condition, it is better to add $$WHERE at
the end of the SQL query. Here $$WHERE is just a mapping-level parameter defined in the parameter file.
In general $$WHERE will be blank. If we want to run the mapping for today's date or some other filter
criterion, we just change the value of $$WHERE in the parameter file.
$$WHERE=AND LAST_UPDATED_DATE > SYSDATE -1
[WHERE clause already in override query]
OR
$$WHERE=WHERE LAST_UPDATED_DATE > SYSDATE -1
[NO WHERE clause in override query]
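The effect of the two $$WHERE forms can be seen by substituting the value into an override query as plain text. The Python sketch below is only an illustration of that substitution; the function name expand and the sample queries against an EMP table are assumptions, and PowerCenter performs the expansion itself at run time.

```python
def expand(query, where_value):
    """Substitute the $$WHERE mapping parameter textually into the query."""
    # A blank value simply removes the placeholder; trailing space is trimmed.
    return query.replace("$$WHERE", where_value).rstrip()


if __name__ == "__main__":
    # Override query that already has a WHERE clause: the value starts with AND.
    with_where = "SELECT * FROM EMP WHERE DEPTNO = 10 $$WHERE"
    print(expand(with_where, "AND LAST_UPDATED_DATE > SYSDATE -1"))

    # Override query without a WHERE clause: the value starts with WHERE.
    without_where = "SELECT * FROM EMP $$WHERE"
    print(expand(without_where, "WHERE LAST_UPDATED_DATE > SYSDATE -1"))

    # Blank value: the query runs unfiltered.
    print(expand(without_where, ""))
```

Choosing the wrong form for a given query would produce either two WHERE keywords or a dangling AND, so the value has to match the override query it is appended to.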
Parameterize Target Definition Attributes
Next, the other attributes we can parameterize in the Target Definition:
Update Override: $Param_UpdOverride
Pre SQL: $Param_Tgt_Presql
Post SQL: $Param_Tgt_Postsql
$Param_UpdOverride=UPDATE $$Target_Tablename.EMPLOYEE_G SET
ENAME = :TU.ENAME, JOB = :TU.JOB, MGR = :TU.MGR, HIREDATE = :TU.HIREDATE,
SAL = :TU.SAL, COMM = :TU.COMM, DEPTNO = :TU.DEPTNO
WHERE EMPNO = :TU.EMPNO
Parameterize Flatfile Attributes
Now let's see what we can do when it comes to Source, Target or Lookup flat files.
Source file directory: $PMSourceFileDir\ [Default location SrcFiles]
Source filename: $InputFile_EMP
Source Code Page: $Param_Src_CodePage
Target file directory: $PMTargetFileDir\ [Default location TgtFiles]
Target filename: $OutputFile_EMP
Reject file directory: $PMBadFileDir\ [Default location BadFiles]
Reject file: $BadFile_EMP
Target Code Page: $Param_Tgt_CodePage
Header Command: $Param_headerCmd
Footer Command: $Param_footerCmd
Lookup Flatfile: $LookupFile_DEPT
Lookup Cache file Prefix: $Param_CacheName
Parameterize FTP Connection Object Attributes
For FTP connection objects, the following are the attributes we can parameterize:
FTP Connection Name: $FTPConnection_SGUX
Remote Filename: $Param_FTPConnection_SGUX_Remote_Filename [Use the directory path
and filename if the directory is different from the default directory]
Is Staged: $Param_FTPConnection_SGUX_Is_Staged
Is Transfer Mode ASCII: $Param_FTPConnection_SGUX_Is_Transfer_Mode_ASCII
The username and password of connection objects can also be parameterized,
for example with $Param_OrclUname.
For the password, it is recommended to encrypt it in the parameter file using
the pmpasswd command line program with the CRYPT_DATA encryption type.
Using Parameter File
We can specify the parameter file name and directory in the workflow or session
properties, or on the pmcmd command line.
We can use parameter files with the pmcmd startworkflow or starttask commands. These commands
allow us to specify the parameter file to use when we start a workflow or session.
The pmcmd -paramfile option defines which parameter file to use when a session or workflow runs.
The -localparamfile option defines a parameter file on the local machine that we can reference when we
do not have access to parameter files on the Integration Service machine.
The following command starts a workflow using the parameter file param.txt:
pmcmd startworkflow -u USERNAME -p PASSWORD
-sv INTEGRATIONSERVICENAME -d DOMAINNAME -f FOLDER
-paramfile 'infa_shared/BWParam/param.txt'
WORKFLOWNAME
The following command starts taskA using the parameter file, param.txt:
pmcmd starttask -u USERNAME -p PASSWORD
-sv INTEGRATIONSERVICENAME -d DOMAINNAME -f FOLDER
-w WORKFLOWNAME -paramfile 'infa_shared/BWParam/param.txt'
SESSION_NAME
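When pmcmd is called from a script, the same command can be assembled as an argument list. The sketch below only builds and prints the startworkflow command with the options shown above and does not run it; the function name startworkflow_args and the placeholder values are assumptions.

```python
import shlex


def startworkflow_args(user, password, service, domain, folder, workflow, paramfile):
    """Build the argument list for pmcmd startworkflow (not executed here)."""
    return [
        "pmcmd", "startworkflow",
        "-u", user, "-p", password,
        "-sv", service, "-d", domain, "-f", folder,
        "-paramfile", paramfile,
        # The workflow name is the last argument, as in the command above.
        workflow,
    ]


if __name__ == "__main__":
    args = startworkflow_args(
        "USERNAME", "PASSWORD", "INTEGRATIONSERVICENAME", "DOMAINNAME",
        "FOLDER", "WORKFLOWNAME", "infa_shared/BWParam/param.txt",
    )
    # shlex.join quotes each argument for display as a shell command line.
    print(shlex.join(args))
```

Keeping the arguments as a list avoids shell quoting problems, and the list could be handed to subprocess.run on a machine where pmcmd is installed.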
Workflow and Session Level Parameter File
When we define both a workflow parameter file and a session parameter file for a session within the
workflow, the Integration Service uses the workflow parameter file and ignores the session parameter
file. What if we want to read some parameters from the workflow-level parameter file and some from
a session-level parameter file?
The solution is simple:
• Define the workflow parameter file, say infa_shared/BWParam/param_global.txt
• Define a workflow variable and assign it, in param_global.txt, the name of the session-level parameter
file, say $$var_param_file=infa_shared/BWParam/param_runtime.txt
• In the session properties for the session, set the parameter file name to this workflow variable.
• Add $PMMergeSessParamFile=TRUE to the workflow-level parameter file.
Content of infa_shared/BWParam/param_global.txt
[WorkFolder.WF:wf_runtime_param]
$DBConnection_SRC=Info_Src_Conn
$DBConnection_TGT=Info_Tgt_Conn
$PMMergeSessParamFile=TRUE
$$var_param_file=infa_shared/BWParam/param_runtime.txt
Content of infa_shared/BWParam/param_runtime.txt
[WorkFolder.WF:wf_runtime_param.ST:s_m_emp_cdc]
$$start_date=2010-11-02
$$end_date=2010-12-08
The $PMMergeSessParamFile property causes the Integration Service to read both the session and
workflow parameter files.