Page 1: PowerCenter 7 Architecture and Performance Tuningdbmanagement.info/Books/MIX/informatica_7_Informatica.pdfInformix DB2 UDB, Autoloader Teradata fload, ... Performance tuning step-by-step

Erwin Dral, Sales Consultant

PowerCenter® 7 Architecture and Performance Tuning

Agenda

° PowerCenter Architecture

° Performance tuning step-by-step

° Eliminating Common bottlenecks

PowerCenter Architecture: Engine-based & Metadata-driven

[Figure: PowerCenter architecture diagram. Labels recoverable from the slide:
Heterogeneous Sources: Oracle, MS SQL Server, Sybase, Informix, DB2 UDB, ODBC, Flat File, XML, VSAM/COBOL Copybook
PowerConnect (sources and targets): MainFrame, ERP, SAS, RealTime, Remote Files
PowerCenter Server Engine (UNIX, Windows): Reader, DTM, Writer, Buffers
Heterogeneous Targets: Oracle API, SQL*Loader; MS SQL Server, BCP; Sybase, IQ Load; Informix; DB2 UDB, Autoloader; Teradata fload, tpump, mpump; ODBC; Flat File; XML
Client Tools (Windows): Designer, Repository Manager, Workflow Manager, Workflow Monitor
Repository: Repository Server, Repository Agent, Metadata Repository (GDR)
Metadata Reporter; Metadata Exchange (Erwin, Designer 2000, Power Designer, CWM)
Connections: Native, ODBC, JDBC, TCP/IP
Key: Data, Metadata]

Introducing PowerExchange

On-Demand Data Access through Changed Data Capture

[Figure: PowerExchange data access. Source types: Relational; File Formats, EAI; Mainframe; AS/400, HP3000. Delivery labels: Real-time, Change, Bulk, Batch.]

PowerCenter Environment

° This is a multi-vendor, multi-system environment

° There are many components involved
− Operating systems, databases, networks, I/O, PowerCenter

° Performance is determined by THE SLOWEST COMPONENT (the bottleneck)
− Usually need to monitor performance in several places
− Usually need to monitor outside PowerCenter

[Figure: PowerCenter alongside the OS, DBMS, and LAN/WAN, each backed by disks.]

Server Architecture - Memory

° The PowerCenter Server utilizes two main processes

− Load Manager process (pmserver)

− Session process (DTM)

° The Load Manager process is a continuous listener process designed to handle tasks such as session start, scheduling, error reporting, email, etc.

− Configured using the Load Manager Shared Memory parameter

− Set value to approximately 200K bytes per session multiplied by the max number of concurrent sessions
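
The sizing rule above can be sketched as a small calculation. This is only an illustration: it assumes "200K bytes" means 200,000 bytes, and the function name `load_manager_shared_memory` is hypothetical, not a PowerCenter setting or API.

```python
# Assumption: "200K bytes" on the slide is read as 200,000 bytes per session.
BYTES_PER_SESSION = 200_000


def load_manager_shared_memory(max_concurrent_sessions: int) -> int:
    """Approximate Load Manager Shared Memory value in bytes.

    Implements the rule of thumb from the slide:
    about 200K bytes per session times the maximum number of concurrent sessions.
    """
    if max_concurrent_sessions < 1:
        raise ValueError("max_concurrent_sessions must be at least 1")
    return BYTES_PER_SESSION * max_concurrent_sessions


if __name__ == "__main__":
    for sessions in (5, 10, 20):
        print(sessions, "sessions:", load_manager_shared_memory(sessions), "bytes")
```

Treat the result as a starting point to measure against, not as a guaranteed optimum.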

Server Architecture - Memory

° The DTM process uses shared memory to handle tasks such as reading, data transformation and writing

° Two session parameters control the DTM memory allocation

− DTM Buffer Pool Size

− Buffer Block Size

° DTM pipeline threads overlap when possible

[Figure: Reader, Transformation Engine, and Writer stages of the DTM pipeline.]

Server memory runtime

° Example

Server Architecture - Memory

° DTM Buffer Pool Size controls the total amount of memory used to buffer rows internally by the reader and writer
− This sets the total number of blocks available
− The optimal value is about 25MB
− If the block size is 64K, then you get 25M/64K = 390 blocks

° Buffer Block Size controls the size of the blocks that move in the pipeline
− Optimum size depends on the row size being processed
− 64KB ≈ 64 rows of 1KB
− 128KB ≈ 128 rows of 1KB
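
The arithmetic on this slide can be reproduced with a short sketch. It assumes decimal units (25M = 25,000,000 bytes, 64K = 64,000 bytes), which is what yields the slide's figure of 390 blocks; the function names are illustrative only.

```python
def buffer_blocks(pool_bytes: int, block_bytes: int) -> int:
    """Number of whole blocks that fit in the DTM buffer pool."""
    return pool_bytes // block_bytes


def rows_per_block(block_bytes: int, row_bytes: int) -> int:
    """Number of whole rows of a given size that fit in one buffer block."""
    return block_bytes // row_bytes


if __name__ == "__main__":
    # Assumption: decimal units, as the slide's 25M/64K = 390 implies.
    print(buffer_blocks(25_000_000, 64_000))   # 390
    print(rows_per_block(64_000, 1_000))       # 64
    print(rows_per_block(128_000, 1_000))      # 128
```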

Server Architecture – DTM Parameters

The Session Task parameters control the processing pipeline and are found on the Properties and Config Object tabs

Server Architecture - Threads

Assume a mapping with an Aggregator, a Rank, and other transformations in a session with two partitions. Pre- and post-session commands would add one thread each.

[Figure: thread diagram. Labels: Load Manager; Process Memory; DTM Master Thread; Mapping Thread; Reader Threads; Transformation Threads; Writer Threads; Rank Threads; Aggregator Threads.]

Performance tuning step-by-step

1. Determine Batch window

2. Measure

3. Determine bottleneck

4. Make ONE change

5. Run sessions

Repeat the cycle until elapsed time < batch window

HINTS:

• Write down a log of every step
• If all resources are used 100%, buy more
• If the change doesn’t help, UNDO
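
The cycle and hints above can be sketched as a loop. This is a hypothetical illustration, not a PowerCenter facility: `run_and_measure` stands in for running the sessions and returning elapsed seconds, and each candidate change is an assumed `(name, apply, undo)` triple ordered by whatever bottleneck analysis suggests.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical representation of one tuning change: (name, apply, undo).
Change = Tuple[str, Callable[[], None], Callable[[], None]]


def tune(batch_window_s: float,
         run_and_measure: Callable[[], float],
         candidate_changes: Iterable[Change]) -> List[str]:
    """Apply ONE change at a time until elapsed time < batch window.

    Returns a log of every step. A change that does not help is undone.
    """
    log: List[str] = []
    best = run_and_measure()                      # step 2: measure the baseline
    log.append(f"baseline: {best:.0f}s")
    for name, apply, undo in candidate_changes:   # step 3 decides the ordering
        if best < batch_window_s:                 # stop once inside the window
            break
        apply()                                   # step 4: make ONE change
        elapsed = run_and_measure()               # step 5: run sessions, measure again
        if elapsed < best:
            best = elapsed
            log.append(f"kept {name}: {elapsed:.0f}s")
        else:
            undo()                                # hint: if it doesn't help, UNDO
            log.append(f"undid {name}: {elapsed:.0f}s")
    return log
```

Keeping the log as a return value mirrors the first hint: every step is written down, including the changes that were rolled back.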

2. Measuring Performance Internal to Informatica

Measuring Performance - Internal

° Several types of bottlenecks can affect session performance
− Network
− System
− Database
− Informatica Mapping and Session

° There are several ways of measuring performance, such as total amount of data (volume) per unit of time

− Volume can be measured as:
  ° Number of bytes
  ° Number of rows

− Time can be measured as:
  ° CPU or process time
  ° Elapsed time

Measuring Performance - Internal

° For the purpose of identifying bottlenecks use:

− Elapsed time as a relative measurement of time

− Number of rows loaded over the period of time (rows per second)

° Rows per second allows performance measurement of a session over a period of time and with changes in the environment

° Rows per second can have a very wide range depending on the size of the row (number of bytes), the type of source/target (flat file or relational) and underlying hardware

Measuring Performance - Internal

° Establishing the baseline using the Workflow Manager

− Run the session task to be measured

− View the session task Transformation Statistics detail window at the end of the session and record the number of rows loaded

− View the Session Task Properties window and record the start and end times of the session

− Subtract the start time from the end time of the session, convert to seconds to get the total session time

− Divide the number of rows loaded by the number of seconds of run time for the session

Measuring Performance - Internal

[Figure: example screenshot highlighting Session Name, Start/End Times, and Applied Rows.]

Measuring Performance - Internal

Tips:

° Calculated rows per second are not the same as “Write Throughput”

° For multiple targets use sum of rows loaded for targets which are similar in row size

° For multiple partitions use the sum of rows loaded for all partitions

° Monitor background processes external to Informatica that will have an effect between test runs

Establishing Baselines Internal to Informatica

Establishing Baselines - Internal

° Each component in a production environment contributes to the overall session performance

° Performance is limited by the slowest component

° Knowing the physical data limits establishes the maximum data throughput

° Baseline measurement can be used for future comparisons

[Figure: PowerCenter alongside the OS, DBMS, and LAN/WAN.]

Establishing Baselines – Read Throughput Mapping

° Read Throughput Mapping – Use a database table to flat file mapping to establish a typical read rate

Session Name: s_m_RDB_TO_FF_TEST
Rows Loaded: 249995
Rows Failed: 0
Start Time: 10/18/2002 11:00:58 AM
End Time: 10/18/2002 11:01:17 AM
Elapsed Time (seconds): 19
Rows Per Sec: 13158
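
The baseline figures for s_m_RDB_TO_FF_TEST can be reproduced with a few lines. This is a minimal sketch; the timestamp format string is an assumption about how the times are written, and the function names are illustrative.

```python
from datetime import datetime

# Assumption: timestamps are written as month/day/year with a 12-hour clock.
TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def elapsed_seconds(start: str, end: str) -> float:
    """Subtract the start time from the end time and convert to seconds."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return delta.total_seconds()


def rows_per_second(rows_loaded: int, start: str, end: str) -> float:
    """Divide the number of rows loaded by the session run time in seconds."""
    seconds = elapsed_seconds(start, end)
    if seconds <= 0:
        raise ValueError("end time must be after start time")
    return rows_loaded / seconds


if __name__ == "__main__":
    start, end = "10/18/2002 11:00:58 AM", "10/18/2002 11:01:17 AM"
    print(elapsed_seconds(start, end))               # 19.0
    print(round(rows_per_second(249995, start, end)))  # 13158
```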

Establishing Baselines - Historical

° Each Informatica Repository contains a history of each session run

° Use MX view “REP_SESS_LOG” to extract session information

SUBJECT_AREA (Folder)
SESSION_NAME (Session)
SUCCESSFUL_ROWS (Rows Loaded)
FAILED_ROWS (Rows Not Loaded)
ACTUAL_START (Start Time)
SESSION_TIMESTAMP (End Time)

Note: simple query – select * from rep_sess_log

2. Measure Performance

° Use repository views to establish performance
− Session elapsed time (in seconds), where the date difference is in days (as in Oracle):
  (REP_SESS_LOG.SESSION_TIMESTAMP - REP_SESS_LOG.ACTUAL_START) * 86400
− DB2-style equivalent:
  TIMESTAMPDIFF(2, CHAR(SESSION_LOG.SESSION_TIMESTAMP - SESSION_LOG.ACTUAL_START))
− Target rows per second = SUCCESSFUL_ROWS / Session elapsed time

° OR: Use the Metadata Reporter!
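
To try the "difference in days times 86400" idea without a repository at hand, the sketch below builds a stand-in table in an in-memory SQLite database. Everything about the setup is an assumption made for illustration: the table only mimics the REP_SESS_LOG column names listed earlier, the folder name TEST_FOLDER is made up, and `julianday` is SQLite's date function, not the syntax of the repository database.

```python
import sqlite3

# Stand-in for the MX view; a real repository database will differ.
SETUP = """
CREATE TABLE REP_SESS_LOG (
    SUBJECT_AREA TEXT, SESSION_NAME TEXT,
    SUCCESSFUL_ROWS INTEGER, FAILED_ROWS INTEGER,
    ACTUAL_START TEXT, SESSION_TIMESTAMP TEXT)
"""

# Date difference in days * 86400 gives seconds; ROUND removes float noise.
QUERY = """
SELECT SESSION_NAME,
       ROUND((julianday(SESSION_TIMESTAMP) - julianday(ACTUAL_START)) * 86400)
           AS elapsed_s,
       SUCCESSFUL_ROWS /
       ROUND((julianday(SESSION_TIMESTAMP) - julianday(ACTUAL_START)) * 86400)
           AS rows_per_sec
FROM REP_SESS_LOG
"""


def session_throughput(conn: sqlite3.Connection):
    """Return (session name, elapsed seconds, rows per second) per session run."""
    return conn.execute(QUERY).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build the in-memory stand-in with the run from the read-throughput baseline."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SETUP)
    conn.execute(
        "INSERT INTO REP_SESS_LOG VALUES (?, ?, ?, ?, ?, ?)",
        ("TEST_FOLDER", "s_m_RDB_TO_FF_TEST", 249995, 0,
         "2002-10-18 11:00:58", "2002-10-18 11:01:17"),
    )
    return conn


if __name__ == "__main__":
    for name, elapsed_s, rate in session_throughput(demo_connection()):
        print(name, elapsed_s, round(rate))
```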

3. Determine bottleneck

° Identifying Target Bottlenecks

° Identifying Source Bottlenecks

° Identifying Mapping Bottlenecks

− session parameters

− system resource allocation

− mapping/transformation design

3. Determine Target Bottlenecks

° Writing to a flat file usually does not cause a bottleneck

° Configure a session task to write to a flat file target (/dev/null)
− If write throughput increases significantly, then you have a target database bottleneck.

3. Determine Source or Mapping Bottlenecks

Add a Filter transformation after each Source Qualifier and set the filter condition to FALSE

[Figure: the original mapping and the modified mapping with the added Filter.]

° No faster: source bottleneck
° Faster: mapping bottleneck
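
The target test and the filter test can be combined into one decision rule. The sketch below is only an illustration: it compares elapsed seconds of three runs, and the 20% improvement threshold for "significantly faster" is an assumption, since the slides give no number.

```python
def classify_bottleneck(baseline_s: float,
                        flat_file_target_s: float,
                        filter_false_s: float,
                        significant: float = 0.8) -> str:
    """Classify the bottleneck from the elapsed seconds of three test runs.

    baseline_s          original session
    flat_file_target_s  same session writing to a flat file target
    filter_false_s      same session with a FALSE filter after each source qualifier
    significant         assumed cut-off: a run is "faster" if it takes less than
                        this fraction of the baseline time
    """
    if flat_file_target_s < baseline_s * significant:
        return "target"    # writing to a flat file helped a lot
    if filter_false_s < baseline_s * significant:
        return "mapping"   # skipping the transformations helped
    return "source"        # no faster: reading is the limit


if __name__ == "__main__":
    print(classify_bottleneck(100, 40, 90))   # target
    print(classify_bottleneck(100, 95, 30))   # mapping
    print(classify_bottleneck(100, 98, 97))   # source
```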

4. Make ONE change

° Very case-specific; here are some common bottlenecks
− Target
− Source
− Mapping
− Session
− System

° Only keep the changes that improve performance (maintaining changes is confusing and costly)

4. Eliminate Target Bottlenecks

° Database indexes and constraints
− Disable indexes and constraints before the load, and enable afterward (connection/target pre- & post SQL)
− Check the database space allocation for indexes
  ° Indexes should be on a different disk if possible

° Use a loader connection

° Check the commit interval
− Very small commit intervals cause excessive overhead
− Make sure you have allocated plenty of rollback space (PC6: connection Rollback segment)
− A good commit interval is 50,000

4. Eliminate Target Bottlenecks

° PowerCenter updates and deletes

− Updates and deletes can be extremely slow without an index or key

− Bitmap indexes on columns you are updating cause very slow performance (usually less than 100 rows/sec)

− Do NOT use an Update Strategy transformation if all rows are treated the same (DD_INSERT, DD_UPDATE). With an Update Strategy in the mapping, the writer cannot do block inserts or block updates

4. Eliminate Source Bottlenecks

° Discuss with your DBA how to optimize your Source Qualifier SQL (in the session log file)
− Standard DBMS tuning: explain plan, add indexes, estimate statistics (regularly), alter database parameters, etc.

° Optimize the query to begin returning rows early
− The total query time may be longer, but PowerCenter processing can overlap with the query execution

4. Eliminate Mapping Bottlenecks

° Reduce I/O times
− Cache in memory
− Use fast disks for Cache, BadFiles, SessionLogs etc.
− Check your Sequence Generator

° Reduce amount of data to transform
− Filter early

° Aggregator or Joiner: precede with a Sorter

4. Optimize expression performance

° Use numeric ports instead of string ports

° Reduce (hidden) data type conversions

° Simplify expressions

− Factor out common logic to transformation variables or even mapping variables or parameters

° Simplify nested IIFs when possible or use DECODE statements

4. Optimize Lookup Performance

° Reduce the number of lookup rows.
− ‘where’ clause in lookup SQL

° Use persistent lookup caches
− When a nightly batch has several sessions that use the same lookup
− Build the persistent cache file in a separate session

° Lookup with date-range: lookup/filter combo

° Lookup against large dimension with few changes:
− PowerExchange Changed Data Capture
− checksum AEP plus lookup (devnet.informatica.com)

° Remove the lookup, use ‘update else insert’

4. Session Optimizing

° Set the DTM Buffer Pool Size and Buffer Block Size
− Large row sizes may require a larger buffer block size
− Default buffer pool is 12,000,000 bytes = 12 MB; recommended is 24 MB

° Buffer Block Size controls the size of the blocks that move in the pipeline
− Buffer Block Size should hold about 100 rows
− 64K (64,000) ≈ 64 rows of 1 KB
− 128K (128,000) ≈ 128 rows of 1 KB

° An extremely large DTM buffer may SLOW DOWN the session!
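
As a rough worked example of the "about 100 rows per block" guideline, the sketch below derives a block size from a row size and shows how many such blocks fit in the default and recommended pools. It assumes decimal units, as the slide does (64K = 64,000), and the function names are illustrative only.

```python
DEFAULT_POOL_BYTES = 12_000_000      # default buffer pool from the slide
RECOMMENDED_POOL_BYTES = 24_000_000  # recommended buffer pool from the slide


def suggested_block_size(row_bytes: int, target_rows: int = 100) -> int:
    """Block size in bytes that holds about `target_rows` rows of `row_bytes`."""
    if row_bytes < 1 or target_rows < 1:
        raise ValueError("row_bytes and target_rows must be positive")
    return row_bytes * target_rows


def blocks_in_pool(pool_bytes: int, block_bytes: int) -> int:
    """Whole blocks of the given size that fit in the buffer pool."""
    return pool_bytes // block_bytes


if __name__ == "__main__":
    block = suggested_block_size(1_000)                  # 1 KB rows
    print(block)                                         # 100000
    print(blocks_in_pool(DEFAULT_POOL_BYTES, block))     # 120
    print(blocks_in_pool(RECOMMENDED_POOL_BYTES, block)) # 240
```

Larger rows push the block size up, which reduces the number of blocks for a fixed pool; that trade-off is why both settings are tuned together.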

4. Session Memory Settings

° Set cache memory larger than the size of the cache file on disk

° Set the server variable directories (Badfiles, Cache, SessLogs, etc.) to point to high performance disk arrays

° Reduce transformation errors (& error logging)

For those that are still on PowerCenter 5 …

PowerCenter 6 Performance highlights

° More efficient server

° New Sorter transformation

° ‘Sorted Input’ switch for aggregator & joiner

° More bulk loaders

° Pipeline Partitioning (PowerCenter only)

Upgrade!

Upgrade!

For those that are still on PowerCenter 6 …

PowerCenter 7 Performance highlights

° Block DTM
− Enables moving/transforming a block of rows at a time at each transformation
− Accelerates ALL sessions with:
  ° Mapping bottleneck AND
  ° (Lots of transformations OR Lots of string ports)

° Superior XML reading and writing

° Easy GUI for partitioning

° Max 64 partitions per partition point

° 64-bit version

° Server Grid (workflow load balancing across several servers)

° Change Data Capture (MVS, Oracle 9i and MS SQL server)

Performance tuning step-by-step

1. Determine Batch window

2. Measure

3. Determine bottleneck

4. Make ONE change

5. Run sessions

Repeat the cycle until elapsed time < batch window

HINTS:

• Write down a log of every step
• If all resources are used 100%, buy more
• If the change doesn’t help, UNDO