15610813 Easy Approach to a

  • 8/8/2019 15610813 Easy Approach to a

    Easy Approach to Informatica

    Author: subrata_sahana

    Date Written: 20th June, 2003.

    Declaration: I hereby declare that this document is based on my project experience. To the best of my knowledge, this document does not contain any material that infringes the copyrights of any other individual or organization, including the customers of Infosys.

    Target Readers: All

    Table of Contents

    1. Introduction
    2. Component and Architecture
    3. Informatica Design Process
    4. Informatica Repository
    5. Informatica Client
    5.1 Repository Manager
    5.1.1 Repository Security
    5.2 Designer
    5.2.1 Transformations
    5.2.1.1 Aggregator Transformation
    5.2.1.2 Expression Transformation
    5.2.1.3 Filter Transformation
    5.2.1.4 Router Transformation
    5.2.1.5 Joiner Transformation
    5.2.1.6 Lookup Transformation
    5.2.1.7 Normalizer Transformation
    5.2.1.8 Rank Transformation
    5.2.1.9 Sequence Generator Transformation
    5.2.1.10 Source Qualifier Transformation
    5.2.1.11 Update Strategy Transformation
    5.3 Server Manager
    5.3.1 Transformation Process
    5.3.2 Sessions and Batches
    5.3.3 Session Log
    6. Connectivity Overview
    7. Some Typical Troubleshooting

    1. Introduction

    Informatica is an ETL tool that allows you to load data into a centralized location, such as a data mart, data warehouse or operational data store.

    ETL Tool:

    - Extract data from multiple sources

    - Transform the data according to business logic and need

    - Load the transformed data into file and relational targets
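
    The three ETL steps above can be pictured with a minimal Python sketch. It is not Informatica code: the function names, the column names and the in-memory list standing in for a target table are all assumptions made for illustration.

```python
import csv
import io

def extract(csv_text):
    """Extract: read rows from one delimited source (hypothetical layout)."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: apply a business rule, here total = qty * price."""
    return [
        {"order_id": r["order_id"], "total": int(r["qty"]) * float(r["price"])}
        for r in rows
    ]

def load(rows, target):
    """Load: append transformed rows to the target (a list standing in for a table)."""
    target.extend(rows)
    return len(rows)

# Two made-up sources feeding one centralized target.
source_a = "order_id,qty,price\n1,2,10.0\n2,1,5.5\n"
source_b = "order_id,qty,price\n3,4,2.5\n"
warehouse = []
for src in (source_a, source_b):
    load(transform(extract(src)), warehouse)
```

    Each source passes through the same extract, transform and load functions, so the target ends up holding rows from both sources in one place.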

    2. Component and Architecture

    Informatica consists of the following integrated components:

    Informatica Repository: The Informatica Repository is the center of Informatica. You create a set of metadata tables within the repository database that the Informatica application and tools access. The Informatica Client and Server access the repository to save and retrieve metadata.

    Informatica Client: The Informatica Client is used to manage users, define sources and targets, build mappings and mapplets with the transformation logic, and create sessions to run the mapping logic. The Informatica Client consists of Repository Manager, Designer and Server Manager.

    Informatica Server: The Informatica Server extracts data from the source, transforms the data and loads the transformed data into targets.

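    The figure below illustrates the architecture of Informatica.

    [Figure: Informatica architecture. Source data flows from the Source to the Server, transformed data flows from the Server to the Target, and the Server takes its instructions from metadata in the Repository.]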

    Sources

    Informatica can access the following sources:

    Relational - Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server and Teradata.

    File - Fixed and delimited flat files, COBOL files and XML.

    Extended - PeopleSoft, SAP R/3, Siebel and IBM MQSeries (additional products must be purchased for these sources).

    Mainframe - Additional products must be purchased.

    Other - Microsoft Excel and Access.

    Targets

    Informatica can load data into the following targets:

    Relational - Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL Server and Teradata.

    File - Fixed and delimited flat files and XML.

    Extended - SAP BW and IBM MQSeries (additional products must be purchased for these targets).

    Other - Microsoft Access.

    3. Informatica Design Process

    The Informatica design process consists of five main steps:

    1. Create Repository: The repository holds all metadata and thus drives the extraction and transformation process of Informatica.

    2. Import Source Definitions: Source Analyzer in Designer is used to import or create source definitions.

    3. Create Target Schema: Warehouse Designer in Designer is used to import or create target definitions.

    4. Create Mappings: Mapping Designer in Designer is used to link sources to targets with the required transformations.

    5. Load Data: Server Manager is used to create and schedule sessions and batches to run the mappings. Based on the information in the transformation and repository metadata, the Informatica Server loads data into the targets.

    4. Informatica Repository

    The Informatica Repository is a set of tables that stores the metadata created while using the Informatica Client tools. A database is required to create a repository. The following database platforms can be used to create an Informatica Repository:

    IBM DB2

    Informix

    Microsoft SQL Server

    Oracle

    Sybase

    There are three different types of repositories: standalone, global and local.

    Standalone repository: A repository that functions individually, unrelated and unconnected to other repositories.

    Global repository: A centralized repository in a domain. The global repository is used to store common objects that many people can use through shortcuts. These objects may be source definitions, reusable transformations, mappings and mapplets.

    Local repository: A repository in a domain that is not the global repository. A local repository is used for development. From a local repository, shortcuts can be created to objects in shared folders in the global repository.

    5. Informatica Client

    The Informatica Client comprises three applications:

    Repository Manager: Used to create and administer metadata in the repository.

    Designer: Used to create mappings that contain transformation instructions for the Informatica Server.

    Server Manager: Used to create, schedule and monitor sessions.

    5.1 Repository Manager

    Repository Manager allows creating and administering one or more repositories. Repository Manager consists of four windows:

    Navigator Window displays all objects that are created in Repository Manager, Designer and Server Manager.

    Main Window displays the properties of the object selected in the Navigator Window.

    Dependency Window displays the dependencies on sources, targets and mappings of the object selected in the Navigator Window or Main Window.

    Output Window provides the output of the processes executed in Repository Manager.

    5.1.1 Repository Security

    The Informatica Client, Server and Repository offer several layers of security. The following are some important points related to repository security:

    When a repository is created, two default user groups are created automatically: Administrators and Public. These two groups cannot be deleted, and their privileges cannot be changed.

    Repository Manager automatically creates two users in the Administrators group: Administrator and the database user name used to create the repository. These two users cannot be deleted or removed from the Administrators group.

    Repository Manager does not create any default user for the Public group.

    Each repository user must be assigned to at least one group. A user receives all group privileges, inherits any changes to group privileges, and loses and gains privileges if you change the user's group membership.

    A user can create or delete a group (except the default groups) if the user has the Administer Repository or Super User privilege.

    If a group that has users is deleted, those users are assigned to the Public group.

    A user with the Administer Repository or Super User privilege can edit any user's properties, except those of Administrator and the default database user, but cannot change the user name.

    A user can edit his/her own password if the user has the Browse Repository privilege. A user with the Administer Repository or Super User privilege can edit any user's password.

    A user with the Administer Repository or Super User privilege can change the privileges of other users (except Administrator and the default database user) or groups. Privileges granted to a user individually have to be revoked individually.

    A user can have three different types of permissions in a folder: read, write and execute.

    A user can change folder permissions if the user has the Super User privilege, the Administer Repository privilege with read permission on the folder, or the Browse Repository privilege as folder owner with read permission.

    If a user is working on an object, the repository locks that object so that another user cannot work on the same object simultaneously.

    A user with the Browse Repository or Administer Repository privilege and read permission can unlock objects locked by his/her own user name.

    A user with the Super User privilege can unlock any lock in the repository.
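
    The group rules above (inherited group privileges, separately granted individual privileges, and reassignment to Public when a group is deleted) can be modelled with a small Python sketch. This is only an illustrative model, not how the repository stores security data: the class, its methods, the "Developers" group, the user "alice" and the privileges given to each group are assumptions.

```python
class Repository:
    DEFAULT_GROUPS = ("Administrators", "Public")

    def __init__(self):
        self.group_privileges = {g: set() for g in self.DEFAULT_GROUPS}
        self.user_groups = {}      # user name -> set of group names
        self.user_privileges = {}  # user name -> individually granted privileges

    def add_group(self, name, privileges=()):
        self.group_privileges[name] = set(privileges)

    def add_user(self, user, groups, privileges=()):
        if not groups:
            raise ValueError("each user must belong to at least one group")
        self.user_groups[user] = set(groups)
        self.user_privileges[user] = set(privileges)

    def delete_group(self, name):
        if name in self.DEFAULT_GROUPS:
            raise ValueError("default groups cannot be deleted")
        del self.group_privileges[name]
        # Users of the deleted group are assigned to Public.
        for groups in self.user_groups.values():
            if name in groups:
                groups.discard(name)
                groups.add("Public")

    def effective_privileges(self, user):
        # Individual grants plus everything inherited from the user's groups.
        privileges = set(self.user_privileges[user])
        for group in self.user_groups[user]:
            privileges |= self.group_privileges[group]
        return privileges

repo = Repository()
repo.group_privileges["Public"].add("Browse Repository")   # illustrative choice
repo.add_group("Developers", {"Use Designer"})             # hypothetical group
repo.add_user("alice", {"Developers"}, {"Administer Repository"})
before = repo.effective_privileges("alice")
repo.delete_group("Developers")
after = repo.effective_privileges("alice")
```

    After the group is deleted, the user keeps the individually granted privilege but loses the one inherited from the deleted group, which matches the point that individual grants must be revoked individually.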

    5.2 Designer

    Designer helps to create source definitions, target definitions and transformations to build mappings. Designer consists of four windows:

    Navigator Window is used to connect to and work in different repositories and folders.

    Workspace is used to view and edit sources, targets, transformations, mapplets and mappings.

    Output Window provides details when tasks are performed, such as saving or validating a mapping.

    Overview Window is used for viewing workbooks containing large mappings or a large number of objects.

    Status bar displays the status of the operation performed.

    [Figure: Designer screen with the Navigator, Workspace, Workbook Tabs, Output Window, Overview Window and Status Bar labeled.]

    Designer consists of five tools:

    o Source Analyzer: Used to import or create source definitions.

    o Warehouse Designer: Used to import or create target definitions.

    o Transformation Developer: Used to create reusable transformations.

    o Mapplet Designer: Used to create mapplets (a set of transformations that can be used in multiple mappings).

    o Mapping Designer: Used to create mappings.


    5.2.1 Transformations

    A transformation is a repository object that generates, modifies or passes data. There are many types of transformations that can be incorporated in a mapping to process data. Brief descriptions of some frequently used transformations are given below.

    5.2.1.1 Aggregator Transformation

    The Aggregator transformation allows performing aggregate calculations, such as average and sum. The Aggregator transformation differs from the Expression transformation in that the former performs calculations on groups of rows, whereas the latter performs calculations on a row-by-row basis.
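
    The difference between a group calculation and a row-by-row calculation can be sketched in Python. This only illustrates the idea and is not Informatica syntax: the function names, the sample rows and the bonus rule are made up.

```python
from collections import defaultdict

rows = [
    {"dept": "A", "salary": 100.0},
    {"dept": "A", "salary": 300.0},
    {"dept": "B", "salary": 250.0},
]

def expression(rows):
    """Row by row: one output row per input row (bonus is a made-up rule)."""
    return [dict(r, bonus=r["salary"] / 10) for r in rows]

def aggregator(rows, group_by, port):
    """Per group: one output per distinct value of the group-by port."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_by]].append(r[port])
    return {k: {"sum": sum(v), "avg": sum(v) / len(v)} for k, v in groups.items()}

per_row = expression(rows)
per_group = aggregator(rows, "dept", "salary")
```

    The row-by-row function returns as many rows as it receives, while the grouped function returns one result per department.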

    5.2.1.2 Expression Transformation

    The Expression transformation is used to calculate values in a single row before writing to the target. This transformation can be used to perform non-aggregate calculations.

    5.2.1.3 Filter Transformation

    The Filter transformation provides the means for filtering records in a mapping. All the rows from a source transformation are passed through the Filter transformation, in which a filter condition is entered. All the ports are input/output ports, and only records that meet the condition pass through the Filter transformation. This transformation is used to prevent unwanted records from being processed.

    5.2.1.4 Router Transformation

    The Router transformation is similar to the Filter transformation. A Filter transformation tests data for one condition and drops all rows that do not meet the condition. A Router transformation can test data for more than one condition, and the rows that do not meet any of the conditions can be routed through the default output group. If the same input data needs to be tested against many conditions, use a Router instead of multiple Filter transformations.
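
    The contrast between the two transformations can be sketched in Python. The function names, the group names and the sample amounts are hypothetical, and the sketch assumes that a row is sent to every group whose condition it meets.

```python
def filter_rows(rows, condition):
    """Filter: one condition; rows that fail it are dropped."""
    return [r for r in rows if condition(r)]

def route_rows(rows, groups):
    """Router: several named conditions; unmatched rows go to DEFAULT."""
    out = {name: [] for name in groups}
    out["DEFAULT"] = []
    for r in rows:
        matched = False
        for name, condition in groups.items():
            if condition(r):
                out[name].append(r)
                matched = True
        if not matched:
            out["DEFAULT"].append(r)
    return out

rows = [{"amount": 50}, {"amount": 500}, {"amount": 5000}]
kept = filter_rows(rows, lambda r: r["amount"] >= 100)
routed = route_rows(rows, {
    "SMALL": lambda r: r["amount"] < 100,
    "LARGE": lambda r: r["amount"] >= 1000,
})
```

    The input is read once and tested against both conditions in a single pass, which is why a router is preferable to a chain of separate filters over the same data.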

    5.2.1.5 Joiner Transformation

    A Source Qualifier can join data originating from a common source database, whereas the Joiner transformation joins two related heterogeneous sources residing in different locations or file systems. The Joiner transformation is used to join two sources that have at least one matching port of data. It uses a condition that matches one or more pairs of ports between the two sources.
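
    A join between two unlike sources on matching ports can be sketched as follows. This is a plain Python inner join, not the Joiner itself: the flat file content, the order rows, the port names and the labels "master" and "detail" are assumptions for illustration.

```python
import csv
import io

def joiner(master, detail, condition):
    """Inner join; condition is a list of (master_port, detail_port) pairs."""
    index = {}
    for m in master:
        key = tuple(m[mp] for mp, _ in condition)
        index.setdefault(key, []).append(m)
    joined = []
    for d in detail:
        key = tuple(d[dp] for _, dp in condition)
        for m in index.get(key, []):
            joined.append({**m, **d})
    return joined

# Source 1: a delimited flat file (made-up content).
flat_file = "cust_id,name\nC1,Asha\nC2,Ravi\n"
customers = list(csv.DictReader(io.StringIO(flat_file)))

# Source 2: rows as they might arrive from a relational table (made-up content).
orders = [
    {"customer": "C1", "amount": 120},
    {"customer": "C3", "amount": 75},
    {"customer": "C1", "amount": 30},
]

result = joiner(customers, orders, [("cust_id", "customer")])
```

    Only orders whose customer port matches a cust_id in the flat file appear in the result, each combined with the matching customer row.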

    5.2.1.6 Lookup Transformation

    The Lookup transformation is used to access data from any relational database to which both the Informatica Client and Server can connect. A mapping can contain multiple lookups.

    The Lookup transformation can be used to perform tasks such as:

    o Perform a calculation

    o Update slowly changing dimension tables

    o Take into account integrity constraints in tables
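
    As a rough illustration of the first task, the sketch below looks up a value in a relational table and uses it in a calculation. It uses Python's built-in sqlite3 module as a stand-in database: the table, its columns and the rate values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rates (currency TEXT PRIMARY KEY, rate REAL)")
conn.executemany("INSERT INTO rates VALUES (?, ?)", [("USD", 1.0), ("EUR", 2.0)])

def lookup_rate(currency):
    """Return the rate for a currency, or None when the lookup finds no row."""
    row = conn.execute(
        "SELECT rate FROM rates WHERE currency = ?", (currency,)
    ).fetchone()
    return row[0] if row else None

rows = [{"amount": 10.0, "currency": "EUR"}, {"amount": 5.0, "currency": "XXX"}]
for r in rows:
    rate = lookup_rate(r["currency"])
    # The looked-up value feeds the calculation; unmatched rows get None.
    r["amount_usd"] = r["amount"] * rate if rate is not None else None
```

    The same pattern, checking whether a key already exists in a table, is what a mapping relies on when it decides how to treat rows for a slowly changing dimension.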

    5.2.1.7 Normalizer Transformation

    The Normalizer transformation is used to organize data. For sources such as COBOL, the Normalizer is used instead of the Source Qualifier. With the Normalizer, repeated data in a record can be broken into separate records. For each new record it creates, the Normalizer generates a unique identifier. This key value can be used to join the normalized records. A Normalizer transformation can also be used to create multiple rows from a single row of data.
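
    Breaking repeated data into separate records can be sketched in Python as below. The function, the output port names "gen_key" and "occurrence", and the sample quarterly sales record are assumptions chosen for illustration.

```python
from itertools import count

def normalizer(records, repeated_ports, out_port):
    """Emit one output record per repeated port, each with a generated key."""
    key = count(1)
    out = []
    for rec in records:
        fixed = {k: v for k, v in rec.items() if k not in repeated_ports}
        for occurrence, port in enumerate(repeated_ports, start=1):
            out.append({
                **fixed,
                "gen_key": next(key),      # unique identifier per new record
                "occurrence": occurrence,  # which repeated column the value came from
                out_port: rec[port],
            })
    return out

records = [{"store": "S1", "q1": 10, "q2": 20, "q3": 30, "q4": 40}]
normalized = normalizer(records, ["q1", "q2", "q3", "q4"], "sales")
```

    One input row with four quarterly columns becomes four rows, each carrying the store name, a unique key and a single sales value.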

    5.2.1.8 Rank Transformation

    The Rank transformation allows selecting only the top or bottom ranks of data. A Rank transformation can be used to return the largest or smallest values in a port or group. The Rank transformation differs from the transformation functions MAX and MIN in that it allows selecting a group of top or bottom values, not just one value.
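
    The difference from a plain maximum can be sketched in Python with the standard heapq module. The function name, its parameters and the sample sales rows are hypothetical.

```python
import heapq
from collections import defaultdict

def rank(rows, port, n, group_by=None, top=True):
    """Return the top (or bottom) n rows per group, ordered by the given port."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_by] if group_by else None].append(r)
    pick = heapq.nlargest if top else heapq.nsmallest
    return {g: pick(n, members, key=lambda r: r[port]) for g, members in groups.items()}

rows = [
    {"region": "East", "sales": 90},
    {"region": "East", "sales": 40},
    {"region": "East", "sales": 70},
    {"region": "West", "sales": 55},
]
top_two = rank(rows, "sales", 2, group_by="region")
single_max = max(r["sales"] for r in rows)  # MAX yields just one value
```

    Ranking returns up to two rows for each region, whereas the maximum collapses the whole input to a single value.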

    5.2.1.9 Sequence Generator Transformation

    The Sequence Generator transformation generates numeric values that can be used to create primary key values, to replace missing primary keys, or to cycle through a sequential range of numbers. The Sequence Generator is a connected transformation that contains two output ports, which can connect to one or more transformations. The Informatica Server generates a value each time a row enters a connected transformation. A Sequence Generator can be made reusable and can be used in multiple mappings for multiple loads on a single target.
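
    Generating key values and cycling through a range can be sketched in Python. The class and its parameters (start, increment, end, cycle) are assumptions for illustration, and the sketch models only a single next-value output rather than both output ports.

```python
class SequenceGenerator:
    def __init__(self, start=1, increment=1, end=None, cycle=False):
        self.start, self.increment, self.end, self.cycle = start, increment, end, cycle
        self._next = start

    def next_value(self):
        """Return the next number, wrapping to start when cycling is enabled."""
        if self.end is not None and self._next > self.end:
            if not self.cycle:
                raise OverflowError("sequence exhausted")
            self._next = self.start
        value = self._next
        self._next += self.increment
        return value

# Primary keys: one new value each time a row passes through.
keys = SequenceGenerator(start=100)
rows = [{"name": "Asha"}, {"name": "Ravi"}]
keyed = [dict(r, id=keys.next_value()) for r in rows]

# Cycling through a sequential range of numbers.
cyc = SequenceGenerator(start=1, end=3, cycle=True)
cycled = [cyc.next_value() for _ in range(5)]
```

    Sharing one generator object across several loads into the same target keeps the keys unique, which is the reason for making such a transformation reusable.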

    5.2.1.10 Source Qualifier Transformation

    The Source Qualifier transformation is used to connect a relational or flat file source. The Source Qualifier represents the records that the Informatica Server reads when it runs a session.

    The Source Qualifier can be used to perform the following tasks:

    o Join data originating from the same source database

    o Filter records when the Informatica Server reads source data

    o Specify an outer join rather than the default inner join

    o Select distinct values from the source

    o Specify sorted ports

    o Create a custom query to issue a special SELECT statement for the Informatica Server to read source data.
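
    Most of the tasks listed above amount to shaping the SELECT statement that reads the source. The sketch below shows one such query run against an in-memory SQLite database through Python's sqlite3 module. The tables, columns and data are invented, and the SQL is ordinary SQLite syntax rather than anything generated by Informatica.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Pune'), (2, 'Delhi'), (3, 'Pune');
INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 75.0), (12, 2, 20.0);
""")

# Join within one database, outer join, filter, distinct values and sorted output.
query = """
SELECT DISTINCT c.cust_id, c.city
FROM customers c
LEFT OUTER JOIN orders o ON o.cust_id = c.cust_id
WHERE c.city = 'Pune'
ORDER BY c.cust_id
"""
rows = conn.execute(query).fetchall()
```

    Customer 3 has no orders but is still returned because of the outer join, and customer 1 appears once despite having two orders because of DISTINCT.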

    5.2.1.11 Update Strategy Transformation

    The Update Strategy transformation is used to implement the logic to insert, update, delete and reject data in target tables. The update strategy can be set at two different levels:

    o Within a session: The Informatica Server is instructed either to treat all rows in the same way or to use the instructions coded in the session mapping to flag records for different database operations.

    o Within a mapping: An Update Strategy transformation is used to flag records for insert, delete, update or reject.
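
    Flagging records within a mapping can be pictured with the Python sketch below. The numeric constants follow the DD_INSERT, DD_UPDATE, DD_DELETE and DD_REJECT naming known from Informatica, but the flagging rules, the row layout and the dictionary standing in for a target table are assumptions for illustration.

```python
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag(row, target):
    """Decide the operation for one row (the rules here are made up)."""
    if row["amount"] is None:
        return DD_REJECT
    if row.get("cancelled"):
        return DD_DELETE
    return DD_UPDATE if row["id"] in target else DD_INSERT

def apply_flags(rows, target):
    """Apply each row to the target according to its flag; return rejected rows."""
    rejected = []
    for row in rows:
        action = flag(row, target)
        if action == DD_REJECT:
            rejected.append(row)
        elif action == DD_DELETE:
            target.pop(row["id"], None)
        else:  # insert and update both write the row's value in this toy target
            target[row["id"]] = row["amount"]
    return rejected

target = {1: 10.0, 2: 20.0}
rows = [
    {"id": 1, "amount": 15.0},
    {"id": 3, "amount": 30.0},
    {"id": 2, "amount": 20.0, "cancelled": True},
    {"id": 4, "amount": None},
]
rejected = apply_flags(rows, target)
```

    Treating all rows in the same way at the session level would correspond to skipping the flag function and applying one fixed operation to every row.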

    5.3 Server Manager

    Server Manager allows creating, monitoring, tuning and running sessions, and configuring the server.

    Server Manager consists of the Navigator window, Configure window, Monitor window and Output window.

    Navigator window is used to view and select configured sessions.

    Configure window is used to create and edit sessions.

    Monitor window is used to view information about running and completed sessions and batches.

    Output window is used to view messages from the Informatica Server.

    [Figure: Server Manager screen with the Navigator, Configure Window, Monitor Window and Output Window labeled.]


    5.3.1 Transformation Process

    For a transformation to take place, the Informatica Server carries out the following steps:

    o Reads information from the Repository.

    o Extracts data from the sources and stores the data in memory while it applies the transformation rules you created.

    o Loads the transformed data into the mapping targets.

    5.3.2 Sessions and Batches

    Session: A session is a set of instructions that tells the Informatica Server how and when to move data from sources to targets.

    Batches: A batch is a group of sessions that are to be run together. Batches provide a way to group sessions for either serial or parallel execution by the Informatica Server. There are two types of batches:

    o Sequential: Runs sessions one after the other.

    o Concurrent: Runs sessions at the same time.

    Once a session or batch is created, the Server Manager or the command line program pmcmd can be used to start or stop the session or batch.
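
    The two batch types can be pictured with the Python sketch below, in which each session is represented by a plain callable. It does not use pmcmd or any Informatica interface: the function names and session names are hypothetical, and threads merely stand in for sessions running at the same time.

```python
from concurrent.futures import ThreadPoolExecutor

def run_sequential(sessions):
    """Sequential batch: each session starts after the previous one finishes."""
    return [session() for session in sessions]

def run_concurrent(sessions):
    """Concurrent batch: all sessions are started together."""
    with ThreadPoolExecutor(max_workers=len(sessions)) as pool:
        futures = [pool.submit(session) for session in sessions]
        return [f.result() for f in futures]

# Hypothetical sessions; each just reports that it ran.
sessions = [
    (lambda name=n: f"{name} done")
    for n in ("s_load_customers", "s_load_orders")
]
sequential_results = run_sequential(sessions)
concurrent_results = run_concurrent(sessions)
```

    A sequential batch suits sessions that depend on each other's output, while a concurrent batch suits independent loads.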

    5.3.3 Session Log

    The Informatica Server creates a session log file for each session it runs. The session log file contains information about all the tasks the Informatica Server performs. The amount of detail in the session log file depends on the tracing level set by the user. The error tracing level can be defined per transformation or for the entire session. By default, the Informatica Server saves the session log in the directory given by the Informatica session variable $PMSessionLogDir, which can be defined in the Server Manager properties. The default name for the session log is session_name.log.
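
    The default log location described above can be expressed as a small Python helper. The function name, the directory value and the session name are hypothetical. Only the variable name $PMSessionLogDir and the session_name.log pattern come from the text.

```python
from pathlib import PurePosixPath

def session_log_path(session_name, variables):
    """Build the default log path: <$PMSessionLogDir>/<session_name>.log"""
    return PurePosixPath(variables["$PMSessionLogDir"]) / (session_name + ".log")

# Made-up directory and session name, for illustration only.
variables = {"$PMSessionLogDir": "/opt/informatica/SessLogs"}
log_path = session_log_path("s_load_orders", variables)
```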

    6. Connectivity Overview

    7. Some Typical Troubleshooting

    If a problem is encountered while saving a mapping and the status bar shows the message "Run out of locks", contact the repository database administrator. This is a problem on the database side.

    If the Informatica Client hangs during login, even though the correct user id and password are entered, and the status bar shows "Connecting to repository", contact the repository database administrator, as the database may have run out of space.

    If some other user id has obtained a lock on your session and you are not an administrator, create another session with a different name to run your mapping, and ask the administrator to release the lock.