
Oracle Business Intelligence Applications
Version 7.9.6.x Performance Recommendations

An Oracle Technical Note, 7th Edition
April 2011

Copyright 2011, Oracle. All rights reserved.

Contents

Introduction
Hardware recommendations for implementing Oracle BI Applications
Storage Considerations for Oracle Business Analytics Warehouse
    Introduction
    Shared Storage Impact Benchmarks
    Conclusion
Source Tier
Oracle BI Enterprise Edition (OBIEE) / ETL Tier
    Review of OBIEE/ETL Tier components
    Deployment considerations for the ETL components
Target Tier
    Oracle RDBMS
Oracle Business Analytics Warehouse configuration
    Database configuration parameters
    ETL impact on amount of generated REDO Logs
    Oracle RDBMS System Statistics
    Parallel Query configuration
    Oracle Business Analytics Warehouse Tablespaces
Oracle BI Applications Best Practices for Oracle Exadata
    Handling BI Applications Indexes in Exadata Warehouse Environment
    Gather Table Statistics for BI Applications Tables
    Oracle Business Analytics Warehouse Storage Settings in Exadata
    Parallel Query Use in BI Applications on Exadata
    Compression Implementation for Oracle Business Analytics Warehouse in Exadata
    Exadata Smart Flash Cache
    Database Parameter File for Analytics Warehouse on Exadata
Informatica configuration for better performance
    Informatica PowerCenter 8.6 32-bit vs. 64-bit
    Informatica Session Logs
    Informatica Lookups
    Disabling Lookup Cache for very large Lookups
    Joining Staging Tables to Lookup Tables in Informatica Lookups
    Informatica Custom Relational Connections for long running mappings
    Informatica Session Parameters
        Commit Interval
        DTM Buffer Size
        Additional Concurrent Pipelines for Lookup Cache Creation
        Default Buffer Block Size
    Informatica Load: Bulk vs. Normal
    Informatica Bulk Load: Table Fragmentation
    Use of NULL Ports in Informatica Mappings
    Informatica Parallel Sessions Load on ETL tier
    Informatica Load Balancing Implementation
Bitmap Indexes usage for better queries performance
    Introduction
    DAC properties for handling bitmap indexes during ETL
    Bitmap Indexes handling strategies
    Disabling Indexes with DISTINCT_KEYS = 0 or 1
    Monitoring and Disabling Unused Indexes
    Handling Query Indexes during Initial ETL
Partitioning guidelines for Large Fact tables
    Introduction
    Convert to partitioned tables
        Identify a partitioning key and decide on a partitioning interval
        Create a partitioned table in Data Warehouse
        Configure Informatica to support partitioned tables
        Configure DAC to support partitioned tables
        Unit test the changes for converted partitioned tables in DAC
    Interval Partitioning
Informatica Workflows Session partitioning
    Workflow Session Partitioning for Parallel Writer Updates
Table Compression implementation guidelines
Guidelines for Oracle optimizer hints usage in ETL mappings
    Hash Joins versus Nested Loops in Oracle RDBMS
    Suggested hints for Oracle Business Intelligence Applications 7.9.6
    Using Oracle Optimizer Dynamic Sampling for big staging tables
Custom Indexes in Oracle EBS for incremental loads performance
    Introduction
    Custom OBIEE indexes in EBS 11i and R12 systems
    Custom EBS indexes in EBS 11i source systems
    Oracle EBS tables with high transactional load
    Custom EBS indexes on CREATION_DATE in EBS 11i source systems
Custom Aggregates for Better Query Performance
    Introduction
    Database Configuration Requirements for using MVs
    Custom Materialized View Guidelines
    Integrate MV Refresh in DAC Execution Plan
Wide tables with over 255 columns performance
    Introduction
    Wide tables structure optimization
Oracle BI Applications High Availability
    Introduction
    High Availability with Oracle Data Guard and Physical Standby Database
Oracle BI Applications ETL Performance Benchmarks
    Oracle BI Applications 7.9.6.1, Siebel CRM 8.0 Adapter
    Oracle BI Applications 7.9.6.1, Oracle EBS R12 Projects Adapter
    Oracle BI Applications 7.9.6.1, Oracle EBS 11i10 Enterprise Sales Adapter
    Oracle BI Applications 7.9.6.1, Oracle EBS 11i10 Supply Chain Adapter
Conclusion


    INTRODUCTION

Oracle Business Intelligence (BI) Applications Version 7.9.6 delivers a number of adapters to various business applications on the Oracle database platform. Version 7.9.6.1 is also certified with other major data warehousing platforms. Each Oracle BI Applications implementation requires careful planning to ensure the best performance both during ETL and during web query or dashboard execution.

This article discusses performance topics for Oracle BI Applications 7.9.6 and higher using the Informatica PowerCenter 8.6 ETL platform.

Note: The document is intended for experienced Oracle BI Administrators, DBAs and Applications implementers. It covers advanced performance tuning techniques in Informatica and Oracle RDBMS, so all recommendations must be carefully verified in a test environment before being applied to a production instance. Customers are encouraged to engage Oracle Expert Services to review their configurations prior to implementing the recommendations in their BI Applications environments.

    HARDWARE RECOMMENDATIONS FOR IMPLEMENTING ORACLE BI APPLICATIONS

Depending on source data volumes, Oracle BI Applications Version 7.9.6 implementations can be categorized as small, medium and large. The table below summarizes hardware recommendations for the Oracle BI Applications tiers by source data volume range.

Source Data Volume    SMALL:                   MEDIUM:                  LARGE:
                      Up to 200Gb              200Gb to 1Tb             1Tb and higher

Target Tier
# CPU cores           8                        16                       32*
Physical RAM          16Gb                     32Gb                     64Gb*
Storage Space         Up to 400Gb              400Gb - 2Tb              2Tb and higher
Storage System        Local (PATA, SATA,       Local (PATA, SATA,       High performance SCSI or
                      iSCSI), preferred        iSCSI). Recommended      network attached storage.
                      RAID configuration       two or more I/O          Hardware RAID controller
                                               controllers              with multiple I/O channels

Oracle BI Enterprise Edition / ETL Tier
# CPU cores           4 - 8                    8 - 16                   16**
Physical RAM          8Gb                      8 - 16Gb                 16Gb**
Storage Space         100Gb local              200Gb local              400Gb local

* Consider implementing Oracle RAC with multiple nodes to accommodate large numbers of concurrent users accessing web reports and dashboards.

** Consider installing two or more servers on the ETL tier and implementing Informatica Load Balancing across all ETL tier servers.

Important: It is recommended to set up all Oracle BI Applications tiers in the same local area network. Installing any of these three tiers over a Wide Area Network (WAN) may cause timeouts during ETL mapping execution on the ETL tier.

    Storage Considerations for Oracle Business Analytics Warehouse

    Introduction

Oracle BI Applications ETL execution plans are optimized to maximize hardware utilization on the ETL and target tiers and reduce ETL runtime. A well-optimized infrastructure typically consumes more CPU and memory on the ETL tier and generates rather heavy storage I/O load on the target tier during an ETL execution. The storage could easily become a major bottleneck as the result of such actions as:

- Setting excessive parallel query processes (refer to the Parallel Query Configuration section for more details)
- Running multiple I/O intensive applications, such as databases, on a shared storage
- Choosing sub-optimal storage for running BI Applications tiers

Oracle positions its Exadata solution as fast and efficient hardware for addressing I/O bottlenecks in large volume environments. The internal benchmarks for running Oracle BI Applications on Exadata will be published soon.

    Shared Storage Impact Benchmarks

Sharing storage among heavy I/O processes could easily degrade ETL performance and result in extended ETL runtime. The following benchmarks measured the impact of sharing the same NetApp filer storage between two target databases concurrently loading data in two parallel ETL executions.

Configuration description:

- Linux servers #1 and #2 have the following configurations:
  o 2 quad-core 1.8 GHz Intel Xeon CPUs
  o 32 GB RAM
- Shared NetApp filer volumes, volume1 and volume2, are mounted as EXT3 file systems:
  o Server #1 uses volume1
  o Server #2 uses volume2

Execution test description:

- Set the record block size for I/O operations to 32k, the recommended db block size in a target database.
- Execute a parallel load using eight child processes to imitate the average workload during an ETL run.
- Run the following test scenarios:
  o Test #1: execute the parallel load above on NFS volume1 using Linux server #1; keep Linux server #2 idle.
  o Test #2: execute the parallel load above on both NFS volume1 and volume2 using Linux servers #1 and #2.

The following benchmarks describe performance measurements in KB/sec:

- Initial Write: write a new file.
- Rewrite: re-write in an existing file.
- Read: read an existing file.
- Re-Read: re-read an existing file.
- Random Read: read a file with accesses made to random locations in the file.
- Random Write: write a file with accesses made to random locations in the file.
- Mixed workload: read and write a file with accesses made to random locations in the file.
- Reverse Read: read a file backwards.
- Record Rewrite: write and re-write the same record in a file.
- Strided Read: read a file with a strided access behavior, for example: read at offset zero for a length of 4 Kbytes, seek 200 Kbytes, read for a length of 4 Kbytes, seek 200 Kbytes, and so on.

The test summary:

Test Type         Test #1               Test #2
Initial write     46087.10 KB/sec       30039.90 KB/sec
Rewrite           70104.05 KB/sec       30106.25 KB/sec
Read              3134220.53 KB/sec     2078320.83 KB/sec
Re-read           3223637.78 KB/sec     3038416.45 KB/sec
Reverse Read      1754192.17 KB/sec     1765427.92 KB/sec
Stride read       1783300.46 KB/sec     1795288.49 KB/sec
Random read       1724525.63 KB/sec     1755344.27 KB/sec
Mixed workload    2704878.70 KB/sec     2456869.82 KB/sec
Random write      68053.60 KB/sec       25367.06 KB/sec
Pwrite            45778.21 KB/sec       23794.34 KB/sec
Pread             2837808.30 KB/sec     2578445.19 KB/sec
Total Time        110 min               216 min

Initial Write, Rewrite, Read, Random Write, and Pwrite (buffered write operation) were impacted the most, while Reverse Read, Stride Read, Random Read, Mixed Workload and Pread (buffered read operation) were impacted the least by the concurrent load.

Read operations do not require specific RAID sync-up operations, so read requests are less dependent on the number of concurrent threads.

Conclusion

Make sure you carefully plan for storage deployment, configuration and usage in your Oracle BI Applications environment. Avoid sharing the same RAID controller(s) across multiple databases. Set up periodic monitoring of your I/O system during both ETL and end user query loads to catch any potential bottlenecks.

    Source Tier

Oracle BI Applications data loads may cause additional overhead of up to fifteen percent of CPU and memory on a source tier. There might be a bigger impact on the I/O subsystem, especially during full ETL loads. Using several I/O controllers or a hardware RAID controller with multiple I/O channels on the source side would help to minimize the impact on Business Applications during ETL runs and speed up data extraction into a target data warehouse.

    Oracle BI Enterprise Edition (OBIEE) / ETL Tier

    Review of OBIEE/ETL Tier components

    The Oracle BIEE/ETL Tier is composed of the following parts:

    - Oracle Business Intelligence Server 10.1.3.4

    - Informatica PowerCenter 8.6 Client

    - Informatica PowerCenter 8.6 Server

    - Data Warehouse Administration Console (DAC) client 10.1.3.4.1

    - Data Warehouse Administration Console server 10.1.3.4.1

    - Informatica BI Applications Repository (usually stored in a target database)

    - DAC BI Applications Repository (usually stored in a target database)

    Deployment considerations for the ETL components

- The Informatica server and DAC server should be installed on a dedicated machine for best performance.
- The Informatica server and DAC server cannot be installed separately on different servers.
- The Informatica client and DAC client can be located on an ETL Administration client machine, or on a Windows server running the Informatica and DAC servers.
- The Informatica and DAC repositories can be deployed as separate schemas in the same database as the Oracle Business Analytics Warehouse, if the target database platform is Oracle, IBM DB2 or Microsoft SQL Server.
- The Informatica server and DAC server host machine should be physically located near the source data machine to improve network performance.

    Target Tier

    Oracle RDBMS

Oracle recommends deploying Oracle Business Analytics Warehouse on 64-bit Oracle RDBMS, running under a 64-bit Operating System (OS). If a 64-bit OS is not available, then consider implementing Very Large Memory (VLM) on Unix / Linux and Address Windowing Extensions (AWE) on Windows 32-bit platforms. VLM/AWE implementations increase the database address space to allow for more database buffers or a larger indirect data buffer window. Refer to Oracle Metalink for VLM / AWE implementation details for your platform.

Note: You cannot use the sga_target or db_cache_size parameters if you enable VLM / AWE by setting 'use_indirect_data_buffers = true'. You have to manually size all SGA memory components and use db_block_buffers instead of db_cache_size to specify your data cache.

    ORACLE BUSINESS ANALYTICS WAREHOUSE CONFIGURATION

    Database configuration parameters

Oracle Business Intelligence Applications version 7.9.6 is certified with Oracle RDBMS 10g and 11g. Since Oracle BI Applications extensively use bitmap indexes, partitioned tables, and other database features in both ETL and front-end query logic, it is important that Oracle BI Applications customers install the latest database releases on their Data Warehouse tiers:

- Oracle 10g customers should use Oracle 10.2.0.4 or higher.
- Oracle 11g customers should use Oracle 11.1.0.7 or higher.

Important: Oracle 10.2.0.1 customers must upgrade their Oracle Business Analytics Warehouses to the latest Patchset.

Oracle BI Applications include template init.ora files with recommended and required parameters, located in the \dwrep\Documentation\ directory:

- init10gR2.ora - init.ora template for Oracle RDBMS 10g
- init11g.ora - init.ora template for Oracle RDBMS 11g
- init11gR2.ora - init.ora template for Oracle RDBMS 11gR2

Review the appropriate init.ora template file and follow its guidelines to configure target database parameters specific to your data warehouse tier hardware.

Note: The init.ora template for Exadata / 11gR2 is provided in the Exadata section of this document.

    ETL impact on amount of generated REDO Logs

Initial ETL may cause higher than usual generation of REDO logs when loading large data volumes into a data warehouse database. If your target database is configured to run in ARCHIVELOG mode, you can consider two options:

1. Switch the database to NOARCHIVELOG mode, execute the initial ETL, take a cold backup and switch the database back to ARCHIVELOG mode.
2. Allocate 10-15% of additional space to accommodate archived REDO logs during the initial ETL.

Below is a calculation of the amount of REDO generated in an internal initial ETL run:

redo log file sequence:
  start : 641  (11 Jan 21:10)
  end   : 1624 (12 Jan 10:03)
total # of redo logs : 983
log file size        : 52428800
redo generated       : 983 * 52428800 = 51537510400 (48 GB)

Data loaded in the warehouse:

SQL> select sum(bytes)/1024/1024/1024 Gb from dba_segments
     where owner='DWH' and segment_type='TABLE';

        GB
----------
    280.49
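To verify the amount of REDO generated during your own ETL window, you can sum the archived log sizes for the run; a minimal sketch, assuming ARCHIVELOG mode and an illustrative time window:

SQL> select count(*) logs, sum(blocks * block_size)/1024/1024/1024 redo_gb
     from v$archived_log
     where first_time between to_date('11-JAN-2011 21:10', 'DD-MON-YYYY HH24:MI')
                          and to_date('12-JAN-2011 10:03', 'DD-MON-YYYY HH24:MI');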

    Oracle RDBMS System Statistics

Oracle introduced workload system statistics in Oracle 9i to gather important information about the system, such as single and multiple block read times, CPU speed, and various system throughputs. The Optimizer takes system statistics into account when it computes the cost of query execution plans. Failure to gather workload statistics may result in sub-optimal execution plans for queries and excessive temporary space consumption, and ultimately impact BI Applications performance.

Oracle BI Applications customers are required to gather workload statistics on both source and target Oracle databases prior to running the initial ETL.

Oracle recommends two options to gather system statistics:

- Run the dbms_stats.gather_system_stats('start') procedure at the beginning of the workload window, then the dbms_stats.gather_system_stats('stop') procedure at the end of the workload window.
- Run dbms_stats.gather_system_stats('interval', interval=>N), where N is the number of minutes after which statistics gathering will be stopped automatically.

Important: Execute dbms_stats.gather_system_stats when the database is not idle. Oracle computes the desired system statistics only when the database is under significant workload. Usually half an hour is sufficient to generate valid statistics values.
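Both options can be scripted as follows; a minimal sketch (the 30-minute interval is illustrative), to be executed while a representative ETL or query workload is running:

SQL> exec dbms_stats.gather_system_stats('start');
     -- ... let the workload run for about half an hour ...
SQL> exec dbms_stats.gather_system_stats('stop');

     -- alternatively, stop gathering automatically after N minutes:
SQL> exec dbms_stats.gather_system_stats('interval', interval => 30);

     -- verify the gathered values:
SQL> select sname, pname, pval1 from sys.aux_stats$;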

    Parallel Query configuration

The Data Warehouse Administration Console (DAC) leverages the Oracle Parallel Query option for computing statistics and building indexes on target tables. By default DAC creates indexes with the 'PARALLEL' clause and computes statistics with a pre-calculated degree of parallelism. Refer to the init.ora template files, located in \dwrep\Documentation, for details on setting the following parameters:

parallel_max_servers
parallel_min_servers
parallel_threads_per_cpu

Important: Parallel execution does not scale indefinitely. It could easily lead to increased resource contention, create I/O bottlenecks, and increase response time when the resources are shared by many concurrent transactions.

Since DAC creates indexes and computes statistics on target tables in parallel, both within a single table and across multiple tables, parallel execution may cause performance problems if the values of parallel_max_servers and parallel_threads_per_cpu are too high. The system load from parallel operations can be observed by executing the following query:

SQL> select name, value from v$sysstat where name like 'Parallel%';

Reduce the parallel_threads_per_cpu and parallel_max_servers values if the system is overloaded.


    Oracle Business Analytics Warehouse Tablespaces

By default, DAC deploys all data warehouse entities into two tablespaces: all tables into a DATA tablespace, and all indexes into an INDEX tablespace. Depending on your hardware configuration on the target tier, you can improve performance by rearranging your data warehouse tablespaces.

The following table summarizes space allocation estimates in a data warehouse by its data volume range:

Target Data Volume       SMALL:           MEDIUM:           LARGE:
                         Up to 400Gb      400Gb to 2Tb      2Tb and higher
Temporary Tablespace     40 - 60Gb        60 - 150Gb        150 - 250Gb
DATA Tablespace          350Gb            350Gb - 1.8Tb     > 1.8Tb
INDEX Tablespace         50Gb             50 - 200Gb        > 200Gb

Important: Make sure you use Locally Managed tablespaces with the AUTOALLOCATE clause. DO NOT use UNIFORM extent size, as it may cause excessive space consumption and result in slower query performance. Use the standard (primary) block size for your warehouse tablespaces. DO NOT build your warehouse on non-standard block size tablespaces.

Note that the INDEX tablespace may grow if you enable more query indexes in your data warehouse. During incremental loads, by default DAC drops and rebuilds indexes, so you should separate all indexes into a dedicated tablespace and, if you have multiple RAID / I/O controllers, move the INDEX tablespace to a separate controller.

You may also consider isolating staging tables (_FS) and target fact tables (_F) on different controllers. Such a configuration would help to speed up Target Load (SIL) mappings for fact tables by balancing I/O load across multiple RAID controllers.
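A minimal sketch of the tablespace guidance above (datafile paths and sizes are illustrative placeholders):

SQL> CREATE TABLESPACE BIA_DATA
     DATAFILE '/u01/oradata/dwh/bia_data01.dbf' SIZE 10G AUTOEXTEND ON
     EXTENT MANAGEMENT LOCAL AUTOALLOCATE
     SEGMENT SPACE MANAGEMENT AUTO;

SQL> CREATE TABLESPACE BIA_INDEX
     DATAFILE '/u02/oradata/dwh/bia_index01.dbf' SIZE 4G AUTOEXTEND ON
     EXTENT MANAGEMENT LOCAL AUTOALLOCATE
     SEGMENT SPACE MANAGEMENT AUTO;

Placing the INDEX tablespace datafile on a different controller (here /u02) follows the I/O separation advice above.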

    ORACLE BI APPLICATIONS BEST PRACTICES FOR ORACLE EXADATA

    Handling BI Applications Indexes in Exadata Warehouse Environment

Oracle Business Analytic Applications Suite uses two types of indexes:

- ETL indexes, for optimizing ETL performance and ensuring data integrity
- Query indexes, mostly bitmaps, for end user star queries

Exadata Storage Indexes functionality cannot be considered an unconditional replacement for BI Applications indexes. You can rely on storage indexes only in those cases where BI Applications query indexes deliver inferior performance, and only after you have run comprehensive tests to ensure no regressions for all other queries without the query indexes.

Do not drop any ETL indexes, as you may not only impact your ETL performance but also compromise data integrity in your warehouse.

The best practices for handling BI Applications indexes in an Exadata warehouse:

- Turn on index usage monitoring to identify any unused indexes and drop / disable them in your environment. Refer to the corresponding section in this document for more details.
- Consider pinning the critical target tables in Exadata Smart Flash Cache.
- Consider building custom aggregates to pre-aggregate more data and simplify queries.
- Drop selected query indexes and disable them in DAC, to rely on Exadata Storage Indexes / full table scans, only after running comprehensive benchmarks and ensuring no impact on any other queries' performance.

    Gather Table Statistics for BI Applications Tables

Out of the box, the Data Warehouse Admin Console (DAC) uses the FOR INDEXED COLUMNS syntax for computing BI Applications table statistics. It does not gather statistics for non-indexed columns participating in end user query joins. If you choose to drop some indexes in an Exadata environment, there will be more critical columns with NULL statistics. As a result, the Optimizer may choose sub-optimal execution plans and deliver slower performance.

You should consider switching to the FOR ALL COLUMNS SIZE AUTO syntax in the DBMS_STATS.GATHER_TABLE_STATS call in DAC:

1. Navigate to your /CustomSQLs directory and open the customsql.xml file for editing.
2. Replace FOR INDEXED COLUMNS with FOR ALL COLUMNS SIZE AUTO in the DBMS_STATS.GATHER_TABLE_STATS call in the corresponding section.
3. Save the changes.

The next time you run an ETL, DAC will compute statistics on all columns of BI Applications tables.
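For reference, a minimal sketch of an equivalent manual statistics call after the change (the schema, table name and parameters other than method_opt are illustrative; DAC issues its own calls from customsql.xml):

SQL> begin
       dbms_stats.gather_table_stats(
         ownname    => 'DWH',
         tabname    => 'W_GL_BALANCE_F',
         method_opt => 'FOR ALL COLUMNS SIZE AUTO',
         cascade    => true);
     end;
     /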

    Oracle Business Analytics Warehouse Storage Settings in Exadata

- The recommended database block size (db_block_size parameter) is 8K. You may consider using a 16K block size as well, primarily to achieve a better compression rate, as Oracle applies compression at the block level. Refer to the init.ora template in the section below.
- Make sure you use locally managed tablespaces with the AUTOALLOCATE option. DO NOT use UNIFORM extent size for your warehouse tablespaces.
- Use your primary database block size of 8K (or 16K) for your warehouse tablespaces. It is NOT recommended to use non-standard block size tablespaces for deploying a production warehouse.
- Use a large 8Mb extent size for partitioned fact tables and large non-partitioned segments, such as dimensions, hierarchies, etc. Setting cell_partition_large_extents = TRUE ensures all partitioned tables are created with an INITIAL extent size of 8Mb. You have to manually specify INITIAL and NEXT extent sizes of 8Mb for non-partitioned segments, as shown in the sketch after this list.
- Set deferred_segment_creation = TRUE to defer segment creation until the first record is inserted. Refer to the init.ora section below.
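A minimal sketch of the manual 8Mb extent sizing mentioned above for a non-partitioned segment (the table, column and tablespace names are illustrative placeholders):

SQL> CREATE TABLE W_EXAMPLE_D (
       ROW_WID NUMBER(10)
     )
     TABLESPACE BIA_DATA
     STORAGE (INITIAL 8M NEXT 8M);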

    Parallel Query Use in BI Applications on Exadata

All BI Applications tables are created without any degree of parallelism in the BI Applications schema. Since DAC manages parallel jobs, such as Informatica mappings or index creation, during an ETL, the use of Parallel Query in ETL mappings could generate more I/O overhead and cause performance regressions for ETL jobs.

Exadata hardware provides much better scalability for I/O resources, so you can consider turning on Parallel Query for slow queries by setting the PARALLEL attribute on large tables participating in the queries. For example:

SQL> ALTER TABLE W_GL_BALANCE_F PARALLEL;


    Make sure you benchmark the query performance prior to implementing the changes in your Production environment.
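If the benchmarks do show regressions, the attribute can be reverted:

SQL> ALTER TABLE W_GL_BALANCE_F NOPARALLEL;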

Compression Implementation for Oracle Business Analytics Warehouse in Exadata

Table compression can significantly reduce a segment's size and improve query performance in an Exadata environment. However, depending on the nature of DML operations in ETL mappings, it may result in slower mapping performance and larger consumed space. The following guidelines will help to ensure a successful compression implementation in your Exadata environment:

- Consider implementing compression after running an initial ETL. The initial ETL plan contains several mappings with heavy updates, which could impact your ETL performance.
- Implement large fact table partitioning and compress inactive historic partitions only. Make sure that the active ones remain uncompressed.
- Choose either Basic or Advanced compression for your compression candidates.
- Periodically review the allocated space for a compressed segment, and check such stats as num_rows, blocks and avg_row_len in the user_tables view (see the sketch below). For example, the following compressed segment needs to be re-compressed, as it consumes too many blocks:

  Num_rows      Avg_row_len   Blocks      Compression
  541823382     181           13837818    ENABLED

A simple calculation, (num_rows * avg_row_len / 8K block size) plus ~25% block overhead, gives ~15M blocks for the equivalent uncompressed segment. Since the segment already consumes almost that many blocks, it should be re-compressed to reduce its footprint and improve query performance.
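A minimal sketch of the space check and the re-compression step (the table and partition names are illustrative; for partitioned tables, check user_tab_partitions rather than user_tables):

SQL> select table_name, num_rows, avg_row_len, blocks, compression
     from user_tables
     where table_name = 'W_RESPONSE_F';

SQL> ALTER TABLE W_RESPONSE_F MOVE PARTITION PART_2009 COMPRESS;

     -- a MOVE marks local index partitions UNUSABLE; rebuild them afterwards:
SQL> ALTER TABLE W_RESPONSE_F MODIFY PARTITION PART_2009
     REBUILD UNUSABLE LOCAL INDEXES;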

    Refer to Table Compression Implementation Guidelines section in this document for additional information on

    compression for BI Applications Warehouse.

    Exadata Smart Flash Cache

The use of Exadata Smart Flash Cache in Oracle Business Analytics Warehouse can significantly improve end user query performance. You can consider pinning the most frequently used dimensions which impact your query performance. To manually pin a table in Exadata Smart Flash Cache, use the following syntax:

SQL> ALTER TABLE W_PARTY_D STORAGE (CELL_FLASH_CACHE KEEP);

The Exadata Storage Server will cache data from the W_PARTY_D table more aggressively and will try to keep its data longer than cached data from other tables.

Important: Use manual Flash Cache pinning only for the most critical, commonly used tables.
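To release a pinned table back to the default caching policy:

SQL> ALTER TABLE W_PARTY_D STORAGE (CELL_FLASH_CACHE DEFAULT);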

    Database Parameter File for Analytics Warehouse on Exadata

    Use the template file below for your init.ora parameter file for Business Analytics Warehouse on Oracle Exadata.

    ###########################################################################

    # Oracle BI Applications - init.ora template

    # This file contains a listing of init.ora parameters for 11.2 / Exadata

    ###########################################################################

db_name = <database name>
control_files = <path>/ctrl01.dbf, <path>/ctrl02.dbf


    db_block_size = 8192 # or 16384 (for better compression)

    db_block_checking = FALSE

    db_block_checksum = TYPICAL

    cell_partition_large_extents = TRUE

    deferred_segment_creation = TRUE

user_dump_dest = <path>/admin/<db_name>/udump
background_dump_dest = <path>/admin/<db_name>/bdump
core_dump_dest = <path>/admin/<db_name>/cdump

    max_dump_file_size = 20480

    processes = 1000

    sessions = 2000

    db_files = 1024

    session_max_open_files = 100

    dml_locks = 1000

    cursor_sharing = EXACT

    cursor_space_for_time = FALSE

    session_cached_cursors = 500

    open_cursors = 1000

    db_writer_processes = 2

    aq_tm_processes = 1

    job_queue_processes = 2

    timed_statistics = true

    statistics_level = typical

    sga_max_size = 45G

    sga_target = 40G

    shared_pool_size = 2G

    shared_pool_reserved_size = 100M

    workarea_size_policy = AUTO

    pre_page_sga = FALSE

    pga_aggregate_target = 16G

    log_checkpoint_timeout = 3600

    log_checkpoints_to_alert = TRUE

    log_buffer = 10485760

    undo_management = AUTO

    undo_tablespace = UNDOTS1

    undo_retention = 90000

    parallel_adaptive_multi_user = FALSE

    parallel_max_servers = 128

    parallel_min_servers = 32

    # ------------------- MANDATORY OPTIMIZER PARAMETERS ----------------------

    star_transformation_enabled = TRUE

    query_rewrite_enabled = TRUE


    query_rewrite_integrity = TRUSTED

    _b_tree_bitmap_plans = FALSE

    _optimizer_autostats_job = FALSE

    INFORMATICA CONFIGURATION FOR BETTER PERFORMANCE

    Informatica PowerCenter 8.6 32-bit vs. 64-bit

A 32-bit OS can address only 2^32 bytes, or four gigabytes of RAM, and allows a maximum of two gigabytes for any single application. Oracle BI Applications ETL mappings use complex Informatica transformations, such as lookups cached in memory, and their performance is heavily impacted by data from incremental extracts and high watermark warehousing volumes. Additionally, BI Applications ETL execution plans employ parallel mapping execution. So a 32-bit ETL tier can quickly exhaust the available memory and end up with very expensive I/O paging and swapping operations, causing rather dramatic regression in ETL performance.

In contrast, Informatica 64-bit takes advantage of more physical RAM to perform complex transformations in memory, eliminating costly disk I/O operations. Informatica PowerCenter 8.6 provides true 64-bit performance and the ability to scale, because no intermediate staging or hashing files on disk are required for processing.

Internal BI Applications ETL benchmarks of Informatica 8.6 32-bit vs. 64-bit showed at least two times better throughput for the 64-bit configuration. Oracle Business Intelligence Applications customers are therefore strongly encouraged to use the Informatica 8.6 64-bit version for Medium and Large environments.

    Informatica Session Logs

Oracle BI Applications 7.9.6 uses Informatica PowerCenter 8.6, which has improved log reports. Each session log provides detailed information about transformations as well as a summary of a mapping execution, including the detailed percentage run time, idle time, etc.

Below is an example of the execution summary from an Informatica session log:

***** RUN INFO FOR TGT LOAD ORDER GROUP [1], CONCURRENT SET [1] *****
Thread [READER_1_1_1] created for [the read stage] of partition point [Sq_W_CUSTOMER_LOC_USE_DS] has completed.
    Total Run Time = [559.812502] secs
    Total Idle Time = [348.453112] secs
    Busy Percentage = [37.755389]
Thread [TRANSF_1_1_1] created for [the transformation stage] of partition point [Sq_W_CUSTOMER_LOC_USE_DS] has completed.
    Total Run Time = [559.843748] secs
    Total Idle Time = [322.109055] secs
    Busy Percentage = [42.464472]
Thread work time breakdown:
    Fil_W_CUSTOMER_LOC_USE_D: 2.105263 percent
    Exp_W_CUSTOMER_LOC_USE_D_Update_Flg: 10.526316 percent
    Lkp_W_CUSTOMER_LOC_USE_D: 13.684211 percent
    mplt_Get_Etl_Proc_Wid.EXP_Constant_for_Lookup: 1.052632 percent
    mplt_Get_Etl_Proc_Wid.Exp_Get_Integration_Id: 2.105263 percent
    mplt_Get_Etl_Proc_Wid.Exp_Decide_Etl_Proc_Wid: 3.157895 percent
    mplt_Get_Etl_Proc_Wid.LKP_ETL_PROC_WID: 20.000000 percent
    mplt_SIL_CustomerLocationUseDimension.Exp_Scd2_Dates: 44.210526 percent
    mplt_SIL_CustomerLocationUseDimension.Exp_W_CUSTOMER_LOC_USE_D_Transform: 3.157895 percent
Thread [WRITER_1_*_1] created for [the write stage] of partition point [W_CUSTOMER_LOC_USE_D] has completed.
    Total Run Time = [561.171875] secs
    Total Idle Time = [0.000000] secs
    Busy Percentage = [100.000000]

Busy Percentage for a single thread cannot be considered an absolute measure of performance for a whole mapping; all thread statistics must be reviewed together. Informatica computes it for a single thread in a mapping as follows:

Busy Percentage = (Total Run Time - Total Idle Time) / Total Run Time
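Applied to the READER thread in the log above: (559.812502 - 348.453112) / 559.812502 = 0.3776, which matches the reported Busy Percentage of 37.755389.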

If the log shows a high Busy Percentage (> 70 - 80%) for the READER thread, you may need to review the mapping's Reader Source Qualifier query for performance bottlenecks.

If it shows a high Busy Percentage (> 60 - 70%) for the TRANSF thread, review the detailed transformation execution summary and identify the most expensive transformation. In the example above, the transformation mplt_SIL_CustomerLocationUseDimension.Exp_Scd2_Dates consumes 44.2% of all TRANSF runtime, so it may be considered a candidate for investigation.

A high Busy Percentage for the WRITER thread is not necessarily a performance bottleneck. Depending on the processed data volumes, you may want to turn off Bulk mode. Refer to the section Informatica Load: Bulk vs. Normal for more details.

The log above shows that the mapping is most probably well balanced between the Reader and Transformation threads and keeps the Writer busy with inserts.

    Informatica Lookups

Too many Informatica Lookups in an Informatica mapping may cause a significant performance slowdown. Review the guidelines below for handling Informatica Lookups in Oracle Business Intelligence Applications mappings:

- Inspect Informatica session logs for the number of lookups, including each lookup's percentage runtime.
- Check the Lookup table row count and Lookup cache row count numbers for each Lookup Transformation. If the Lookup table row count is too high, Informatica will cache a smaller subset in its Lookup cache. Such a lookup could cause significant performance overhead on the ETL tier.
- If functional logic permits, consider reducing a large lookup row count by adding more constraining predicates to the lookup query's WHERE clause.
- If a Reader Source Qualifier query is not a bottleneck in a slow mapping, and the mapping is overloaded with lookups, consider pushing lookups with row counts of less than two million into the Reader SQL as OUTER JOINs.
- If you identify a very large lookup with a row count of more than 15-20 million, consider pushing it down as an OUTER JOIN into the mapping's Reader query. Such an update would slow down the Reader SQL execution, but it might improve the overall mapping performance.

Important: Some lookups could be reusable within a mapping or across multiple mappings, so they cannot be constrained or pushed down into Reader queries. Consult Oracle Development prior to re-writing Oracle Business Intelligence Applications mappings.

Make sure you test the changes to avoid functional regressions before implementing optimizations in your production environment.

    Disabling Lookup Cache for very large Lookups

Informatica uses a Lookup cache to store lookup data on the ETL tier in flat files (.dat and .idx). The Integration Service builds the cache in memory when it processes the first row of data in a cached Lookup Transformation. If the lookup data is small, it can be stored in memory and the transformation processes rows very fast. But if the lookup data is very large (typically over 20M rows), the lookup cannot fit into the allocated memory, and the data has to be paged in and out many times during a single session. As a result, such lookup transformations adversely affect the overall mapping performance. Additionally, Informatica takes more time to build such large lookups.

If constraining a large lookup is not possible, then consider disabling the lookup cache. Connect to Informatica Workflow Manager, open the session properties, and find the desired transformation in the Transformations folder on the Mapping tab. Then uncheck the Lookup Cache Enabled property and save the session.

Disabling the lookup cache for heavy lookups will help to avoid excessive paging on the ETL tier. When the lookup cache is disabled, the Integration Service issues a select statement against the lookup source database to retrieve lookup values for each row from the Reader thread, and does not store any data in flat files on the ETL tier. The issued lookup query uses bind variables, so it is parsed only once in the lookup source database.

Disabling the lookup cache may work faster for very large lookups under the following conditions:

- The lookup query must use an index access path; otherwise data retrieval would be very expensive on the source lookup database tier. Remember that Informatica fires the lookup query for every record from its Reader thread.
- Consider creating an index on all columns used in the lookup query. The Oracle Optimizer could then choose an INDEX FAST FULL SCAN to retrieve the lookup values from index blocks rather than scanning the whole table (see the sketch below).
- Check the explain plan for the lookup query to ensure an index access path.

Make sure you test the modified mapping with the selected disabled lookups in a test environment and benchmark its performance prior to implementing the change in the production system.
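A hedged sketch of such a covering index for an uncached lookup on W_PARTY_D (the index name and column list are illustrative; match them to your lookup's SELECT and WHERE columns):

SQL> CREATE INDEX W_PARTY_D_LKP_X ON W_PARTY_D
     (INTEGRATION_ID, DATASOURCE_NUM_ID, ROW_WID, GEO_WID);

     -- confirm the index access path for the lookup predicate:
SQL> EXPLAIN PLAN FOR
     SELECT ROW_WID, GEO_WID FROM W_PARTY_D
     WHERE INTEGRATION_ID = :1 AND DATASOURCE_NUM_ID = :2;
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);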

    Joining Staging Tables to Lookup Tables in Informatica Lookups

If you identify bottlenecks with lookups having very large row counts, you can consider constraining them by updating the lookup queries to join to a staging table used in the mapping. As a result, Informatica will execute the lookup query and cache far fewer rows, and speed up row processing on its Transformation thread.

For example, the original query for Lkp_W_PARTY_D_With_Geo_Wid:

SELECT DISTINCT
    W_PARTY_D.ROW_WID as ROW_WID,
    W_PARTY_D.GEO_WID as GEO_WID,
    W_PARTY_D.INTEGRATION_ID as INTEGRATION_ID,
    W_PARTY_D.DATASOURCE_NUM_ID as DATASOURCE_NUM_ID,
    W_PARTY_D.EFFECTIVE_FROM_DT as EFFECTIVE_FROM_DT,
    W_PARTY_D.EFFECTIVE_TO_DT as EFFECTIVE_TO_DT
FROM W_PARTY_D

can be modified to:

SELECT DISTINCT
    W_PARTY_D.ROW_WID as ROW_WID,
    W_PARTY_D.GEO_WID as GEO_WID,
    W_PARTY_D.INTEGRATION_ID as INTEGRATION_ID,
    W_PARTY_D.DATASOURCE_NUM_ID as DATASOURCE_NUM_ID,
    W_PARTY_D.EFFECTIVE_FROM_DT as EFFECTIVE_FROM_DT,
    W_PARTY_D.EFFECTIVE_TO_DT as EFFECTIVE_TO_DT
FROM W_PARTY_D, W_RESPONSE_FS
WHERE W_PARTY_D.INTEGRATION_ID = W_RESPONSE_FS.PARTY_ID
  AND W_PARTY_D.DATASOURCE_NUM_ID = W_RESPONSE_FS.DATASOURCE_NUM_ID

This change reduced the lookup row count from over 22M to 180K and helped to improve the mapping performance. The approach can be applied selectively to both initial and incremental mappings after thorough benchmarks.

    Informatica Custom Relational Connections for long running mappings

If you plan to summarize very large volumes of data (usually over 100 million records), you can speed up the large data ETL mappings by turning off automated PGA structure allocation and setting the SORT and HASH areas manually for the selected sessions.

To speed up such ETL mappings, set sort_area_size and hash_area_size to higher values. If you have limited system memory, you can increase only sort_area_size, as sorting operations for aggregate mappings are more memory intensive. Hash joins involving bigger tables can still perform better with a smaller hash_area_size.

Follow the steps below to create a new Relational Connection with custom session parameters in Informatica:

1. Open Informatica Workflow Manager and navigate to Connections -> Relational -> New.
2. Define a new Target connection 'DataWarehouse_Manual_PGA'.
3. Use the same values as in the DataWarehouse connection.
4. Click on Connection Environment SQL and insert the following commands:

alter session set workarea_size_policy = manual;
alter session set sort_area_size = 1000000000;
alter session set hash_area_size = 2000000000;

Repeat the same steps to define another custom Relational connection to your Oracle source database.

Each mapping that is a candidate to use the custom Relational connections should meet the requirements below:

- The mapping doesn't use heavy transformations on the ETL tier
- The Reader query joins very large tables
- The Reader query execution plan uses HASH JOINs

Connect to Informatica Workflow Manager and complete the following steps for each identified mapping:

1. Open the session in Task Developer.
2. Click on the Mapping tab.
3. Select Connections in the left pane.
4. Select the defined custom value for the Source or Target connection.
5. Save the changes.

    Informatica Session Parameters

There are four major properties, defined in Informatica Workflow Manager for each session, that impact Informatica mapping performance.

    Commit Interval

The target-based commit interval determines the commit points at which the Integration Service commits data writes to the target database. The larger the commit interval, the better the overall mapping performance. However, too large a commit interval may cause database logs to fill and result in session failure.

Oracle BI Applications Informatica mappings have a default setting of 10,000. The recommended range for commit intervals is from 10,000 up to 200,000.

    DTM Buffer Size

The DTM Buffer Size specifies the amount of memory the Integration Service uses for DTM buffer memory. Informatica uses DTM buffer memory to create the internal data structures and buffer blocks used to bring data into and out of the Integration Service.

Additional Concurrent Pipelines for Lookup Cache Creation

The Additional Concurrent Pipelines for Lookup Cache Creation parameter defines the concurrency for lookup cache creation. Oracle BI Applications Informatica mappings have a default setting of 0. You can reduce lookup cache build time by enabling parallel lookup cache creation, setting the value to greater than one.

Important: Make sure you carefully analyze long running mapping bottlenecks before turning on lookup cache build concurrency in your production environment. Oracle BI Applications execution plans already take advantage of parallel workflow execution, so enabling concurrent lookup cache creation may result in additional overhead on a target database and longer execution time. Consider turning on lookup cache creation concurrency when you have one or two long running mappings which are overloaded with lookups.

    Default Buffer Block Size

The buffer block size specifies the amount of buffer memory used to move a block of data from the source to the target. Oracle BI Applications Informatica mappings have a default setting of 128,000. Avoid using the Auto value for Default Buffer Block Size, as it may cause performance regressions for your sessions.

Internal tests showed better performance for both initial and incremental ETL with Default Buffer Block Size set to 512,000 (512K). You can run the following SQL against your Informatica repository schema to update the Buffer Block Size to 512K for all mappings:

SQL> update opb_cfg_attr set attr_value = '512000'
     where attr_value = '128000' and attr_id = 5;
SQL> commit;

Important: Make sure you test the changes in your development repository and benchmark ETL performance before making changes to your production environment.

    Informatica Load: Bulk vs. Normal

The Informatica writer thread may become a bottleneck in some mappings that use Bulk mode to load very large volumes (> 200M rows) into a data warehouse.

Analysis of a trace file from a Writer database session shows that Informatica uses direct path inserts to load data in Bulk mode. The database session performs two direct path writes to insert each new portion of data, and each time Oracle scans for 12 contiguous blocks in the target table to perform a new write transaction. As the table grows larger, it takes longer and longer to scan the segment for chunks of 12 contiguous blocks. So even though it bypasses the database block cache, the Informatica Writer thread may slow down the mapping's overall performance.

To determine whether a mapping which loads very large data volumes in Bulk mode slows down because of the writer thread, open its Informatica session log and compute the time to write the same set of blocks (usually 10,000) at the beginning and at the end of the log. If you observe a significant increase in the writer execution time at the end of the log, consider either increasing the commit size for the mapping or changing the session load mode from Bulk to Normal in Informatica Workflow Manager, and test the mapping with the updated setting.

    Informatica Bulk Load: Table Fragmentation

Informatica Bulk Load for very large volumes may not only slow down the mapping performance but also cause significant table fragmentation.

Internal tests showed that the commit size for Normal load did not affect the number of allocated extents for one million rows in the W_RESPONSE_F fact table used in the benchmarks. For Bulk load, however, the number of extents increased rather significantly as the commit size went down. The commit size also affected the mapping performance for both Normal and Bulk load; the drop in throughput was more significant for the latter. The table below shows the number of extents (ext) and throughput in rows per second (rps) for each tested scenario.

Informatica    1M commit      100K commit    10K commit     1K commit      10 rows commit
Load type
Normal mode    80 ext /       80 ext /       80 ext /       80 ext /       80 ext /
               34K rps        33K rps        30K rps        27K rps        14K rps
Bulk mode      80 ext /       190 ext /      200 ext /      960 ext /      > 5K ext (out of
               55.5K rps      55.5K rps      37K rps        8K rps         space) / 600 rps

Important: To ensure bulk load performance and avoid or minimize target table fragmentation, make sure you set a larger commit size in Informatica mappings.
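To track table fragmentation after a test load, a minimal sketch to run as the warehouse schema owner (the segment name is illustrative):

SQL> select segment_name, count(*) extents, sum(bytes)/1024/1024 mb
     from user_extents
     where segment_name = 'W_RESPONSE_F'
     group by segment_name;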

    Use of NULL Ports in Informatica Mappings

The use of connected or disconnected ports with hard-coded NULL values in Informatica mappings can be yet another reason for slow ETL mapping performance. The internal study showed that, depending on the number of NULL ports, such a mapping's performance can drop by a factor of two or more. The performance gap widens as more ports are used in a mapping. The session CPU time grows nearly proportionally to the number of connected ports, and so does the row width processed by Informatica. As soon as a certain threshold of ports is reached, the internal Informatica session processing for wide mappings becomes even more complex, and its execution slows down dramatically. The internal tests demonstrated that Informatica treats NULL and non-NULL values equally and allocates critical resources to processing NULL ports. It also includes NULL values in the INSERT statements executed by the WRITER thread on the data warehouse tier.

    To ensure effective performance of Informatica mappings:

    - Avoid using NULL ports in Informatica transformations.

    - Try to keep the total number of ports no greater than 50 per mapping.

- Review slow mappings for NULL ports or any other potentially redundant ports that could be eliminated.

    Informatica Parallel Sessions Load on ETL tier

    Informatica mappings with complex transformations and heavy lookups typically consume larger amounts of memory

    during ETL execution. While processing large data volumes and executing in parallel, such mappings may easily

overload the ETL server and cause very heavy memory swapping and paging. As a result, the overall ETL execution would take much longer to complete. To avoid such potential bottlenecks:

- Consider implementing the Informatica 64-bit version on your ETL tier.
- Ensure you have enough physical memory on your ETL tier server. Refer to the Hardware Recommendations section for more details.
- Keep in mind that too many Informatica sessions running in parallel may overload either the source or the target database.
- Set a smaller number of connections to the Informatica Integration Service in DAC. Navigate to DAC's Setup screen -> Informatica Servers tab -> Maximum Sessions in the lower pane for both Informatica and Repository connections. The recommended range is from 5 to 10 sessions.
- Benchmark your ETL performance in your test environment prior to implementing the change in the production system.


    Informatica Load Balancing Implementation

To improve performance on the ETL tier, consider implementing Informatica Load Balancing to distribute the Informatica load across multiple ETL tiers and speed up mapping execution. You can register one or more Informatica servers and the Informatica Repository Server in DAC and specify the number of workflows that can be executed in parallel. The DAC server automatically load balances across the servers and does not run more sessions than the value specified for each of them.

To implement Informatica Load Balancing in DAC, perform the following steps.

    1. Register additional Informatica Server(s) in DAC. Refer to the section Registering Informatica Servers in the DAC

    Client in the publication Oracle Business Intelligence Applications Installation Guide for Informatica PowerCenter

Users, Version 7.9.6.

    2. Configure the database connection information in Informatica Workflow Manager. Refer to the section Process of

    Configuring the Informatica Repository in Workflow Manager in the publication Oracle Business Intelligence

    Applications Installation Guide for Informatica PowerCenter Users, Version 7.9.6.

Important: Deploying multiple Informatica domains and repository services on different server nodes would cause additional maintenance overhead. Any repository updates or configuration changes performed on one node must be replicated across all participating nodes in the multiple-domain configuration.

To minimize the overhead of Informatica repository maintenance, consider the load balancing implementation below:

- Configure a single Informatica domain and deploy a single PowerCenter Repository service in it.
- Create Informatica services on each Informatica node and subscribe them to the single domain.

BITMAP INDEX USAGE FOR BETTER QUERY PERFORMANCE

    Introduction

Oracle Business Intelligence Applications Version 7.9.0 introduced the use of the Bitmap Index feature of the Oracle RDBMS. In comparison with B-Tree indexes, bitmap indexes provide significant performance improvements for data warehouse star queries. The internal benchmarks showed performance gains when B-Tree indexes on foreign keys and attributes were replaced with bitmap indexes.

Although bitmap indexes improve star query response time, their use may cause ETL performance degradation in both Oracle 10g and 11g. Dropping all bitmap indexes on a large table prior to an ETL run, and then recreating them after the ETL completes, may be quite expensive and time consuming. This is especially the case when there are a large number of such indexes, or when little change is expected in the number of records updated or inserted into a table during each ETL run. Conversely, the quality of the existing bitmap indexes may degrade as more updates, deletes, and inserts are performed with the indexes in place, making such indexes less effective unless they are rebuilt.

This section reviews DAC's index processing behavior and provides recommendations for handling bitmap indexes during ETL runs.

    DAC properties for handling bitmap indexes during ETL

DAC handles the same indexes differently for initial and incremental ETL runs. Prior to an initial load into a data warehouse, there are no indexes created on the tables except for the unique B-Tree indexes that preserve data integrity. During the initial ETL run, DAC will create ETL indexes on a loaded table, which are required for faster execution of subsequent mappings. For an incremental ETL run, DAC's index handling varies based on a combination of several DAC properties and individual index usage settings.


The following parameters, available in DAC 10.1.3.4.1, control how indexes are handled during ETL runs:

Parameter: Drop/Create Indices
Type: Execution Plan
Values: Y | N
Default: Y
Effect: DAC will drop all indexes on a target table truncated before a load, and then re-create them after loading the table. It is used mostly in small execution plans.
   Initial ETL:
   - Y - all indexes, irrespective of any other settings, will be dropped and created
   - N - no indexes will be dropped during an initial ETL
   Incremental ETL:
   - Y - indexes with Always Drop & Create (Bitmap) will be dropped during an incremental ETL
   - N - no indexes will be dropped during an incremental ETL
   DB2/390 customers may want to set it to N. The recommended default value for other platforms is Y, unless you are executing a micro ETL, in which case it would be too expensive to drop and create all indexes, so the value should be changed to N.
   Important: When set to N, this parameter overrides all other index-level properties.

Parameter: Always Drop & Create Bitmap
Type: Index
Values: Y | N
Default: N/A
Effect: An index-specific property, applicable to bitmap indexes only.
   - Y - a bitmap index will be dropped prior to an ETL run.
   - N - a bitmap index will not be dropped in an incremental ETL run only.
   The index property Always Drop & Create Bitmap does not override the Drop/Create Indices execution plan property if the latter is set to 'N'. If an index is inactivated in DAC, the index will not be dropped and recreated during subsequent ETL runs. The property applies to the Oracle data warehouse platform only.

Parameter: Always Drop & Create
Type: Index
Values: Y | N
Default: N/A
Effect: An index-specific property, applicable to all indexes.
   - Y - an index will be dropped prior to an ETL run.
   - N - an index will not be dropped in an incremental ETL run only.
   The index property Always Drop & Create does not override the Drop/Create Indices execution plan property if the latter is set to 'N'. If an index is inactivated in DAC, the index will not be dropped and recreated during subsequent ETL runs.

Parameter: Index Usage
Type: Index
Values: ETL | QUERY
Default: N/A
Effect:
   - ETL - an index is required to improve the performance of subsequent ETL mappings. DAC drops ETL indexes on a table if it truncates the table before the load, or if you set Drop/Create Indices, Always Drop & Create Bitmap, or Always Drop & Create to True. DAC will re-create the dropped ETL indexes after loading the table, since the indexes will be used to speed up subsequent mappings.
   - QUERY - an index is required to improve web query performance.

Parameter: Verify And Create Non-Existing Indices
Type: System
Values: True | False
Default: False
Effect:
   - True - the DAC server will verify that all indexes defined in the DAC repository are created in the target database.
   - False - DAC will not run any reconciliation checks between its repository and the target database.
   This parameter is useful when the current execution plan has Drop/Create Indices set to True, and new indexes have been created in the DAC repository since the last ETL run.

Parameter: Num Parallel Indexes per Table
Type: Physical Data Source
Values: Number
Default: 1
Effect: Specifies the maximum number of indexes that the DAC server will create in parallel for a single table.

Bitmap Index Handling Strategies

Review the following recommendations for effective bitmap index management in your environment.

    1. Disable redundant bitmap indexes in DAC.

Pre-packaged Oracle BI Applications releases include bitmap indexes that are enabled in the DAC metadata repository and, therefore, created and maintained as part of ETL runs, even though the indexed columns might not be used in filtering conditions in the Oracle BI Server repository.

Reducing the number of redundant bitmap indexes is an essential step for improving initial and incremental loads, especially for dimension and lookup tables. To identify all enabled BITMAP indexes on a table in the DAC metadata repository:

- Log in to your repository through the DAC user interface, click the Design button under the top menu, select your custom container in the pull-down menu, and select the Indices tab in the right pane.
- Click the Query sub-tab.
- Enter the table name, check the Is Bitmap box in the query row, and click Go.

To identify the list of exposed columns included in filtering conditions in the RPD repository, connect to the BI Server Administration Tool and generate the list of dependencies for each column using the Query Repository and Related To features.

To disable the identified redundant indexes in DAC and drop them in the data warehouse (see the sketch after this list):

- Check the Inactive checkbox against the indexes that should be permanently dropped in the target schema.
- Rebuild the DAC execution plan.
- Connect to your target database schema and drop the disabled indexes.
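A minimal sketch of the last step, generating the DROP statements from the dictionary; the index names in the IN list are hypothetical placeholders for the indexes you inactivated in DAC:

SQL> select 'DROP INDEX ' || index_name || ';'
       from user_indexes
      where index_name in ('W_PARTY_D_M10', 'W_PARTY_D_M12');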

    2. Decide whether to drop or keep bitmap indexes during incremental loads.

Analyze the total time spent building indexes and computing statistics during an incremental run. You can connect to your DAC repository and execute the following queries:

SQL> alter session set nls_date_format='DD-MON-YYYY:HH24:MI:SS';

-- Identify your ETL run and use its ROW_WID ('<ETL_RUN_WID>') in the subsequent queries:
select ROW_WID, NAME ETL_RUN,
       EXTRACT(DAY FROM (END_TS - START_TS) DAY TO SECOND) || ' days ' ||
       EXTRACT(HOUR FROM (END_TS - START_TS) DAY TO SECOND) || ' hrs ' ||
       EXTRACT(MINUTE FROM (END_TS - START_TS) DAY TO SECOND) || ' min ' ||
       EXTRACT(SECOND FROM (END_TS - START_TS) DAY TO SECOND) || ' sec' PLAN_RUN_TIME
from W_ETL_DEFN_RUN
order by START_TS DESC;

-- Identify your custom execution plan container WID ('<APP_WID>'):
SELECT DISTINCT app.row_wid
  FROM w_etl_defn_run run, w_etl_app app, w_etl_defn_prm prm
 WHERE prm.etl_defn_wid = run.etl_defn_wid
   AND prm.app_wid = app.row_wid
   AND run.row_wid = '<ETL_RUN_WID>';

-- Index build time:
SELECT ref_idx.tbl_name table_name,
       ref_idx.idx_name,
       sdtl.start_ts start_time,
       sdtl.end_ts end_time,
       EXTRACT(DAY FROM (sdtl.end_ts - sdtl.start_ts) DAY TO SECOND) || ' days ' ||
       EXTRACT(HOUR FROM (sdtl.end_ts - sdtl.start_ts) DAY TO SECOND) || ' hrs ' ||
       EXTRACT(MINUTE FROM (sdtl.end_ts - sdtl.start_ts) DAY TO SECOND) || ' min ' ||
       EXTRACT(SECOND FROM (sdtl.end_ts - sdtl.start_ts) DAY TO SECOND) || ' sec' idx_bld_time
  FROM w_etl_defn_run def,
       w_etl_run_step stp,
       w_etl_run_sdtl sdtl,
       (SELECT ind_ref.obj_wid, ind.name idx_name, tbl.name tbl_name
          FROM w_etl_index ind, w_etl_obj_ref ind_ref, w_etl_obj_ref tbl_ref,
               w_etl_table tbl, w_etl_app app
         WHERE ind_ref.obj_type = 'W_ETL_INDEX'
           AND ind_ref.soft_del_flg = 'N'
           AND ind_ref.app_wid = '<APP_WID>'
           AND ind_ref.obj_wid = ind.row_wid
           AND tbl_ref.obj_type = 'W_ETL_TABLE'
           AND tbl_ref.soft_del_flg = 'N'
           AND tbl_ref.app_wid = '<APP_WID>'
           AND tbl_ref.obj_wid = tbl.row_wid
           AND tbl_ref.obj_ref_wid = ind.table_wid
           AND ind.app_wid = app.row_wid
           AND ind.inactive_flg = 'N') ref_idx
 WHERE def.row_wid = stp.run_wid
   AND def.row_wid = '<ETL_RUN_WID>'
   AND sdtl.run_step_wid = stp.row_wid
   AND sdtl.type_cd = 'Create Index'
   AND sdtl.index_wid = ref_idx.obj_wid
   -- AND ref_idx.tbl_name = 'W_OPTY_D'
 ORDER BY sdtl.end_ts - sdtl.start_ts DESC;

-- Table stats computing time:
select TBL.NAME TABLE_NAME,
       STP.STEP_NAME,
       EXTRACT(DAY FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) || ' days ' ||
       EXTRACT(HOUR FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) || ' hrs ' ||
       EXTRACT(MINUTE FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) || ' min ' ||
       EXTRACT(SECOND FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) || ' sec' TBL_STATS_TIME
  from W_ETL_DEFN_RUN DEF,
       W_ETL_RUN_STEP STP,
       W_ETL_RUN_SDTL SDTL,
       W_ETL_TABLE TBL
 where DEF.ROW_WID = STP.RUN_WID
   and DEF.ROW_WID = '<ETL_RUN_WID>'
   and SDTL.RUN_STEP_WID = STP.ROW_WID
   and SDTL.TYPE_CD = 'Analyze Table'
   and SDTL.TABLE_WID = TBL.ROW_WID
 order by SDTL.END_TS - SDTL.START_TS desc;

-- Informatica jobs for the selected ETL run:
select SDTL.NAME SESSION_NAME,
       SDTL.SUCESS_ROWS,
       STP.FAILED_ROWS,
       SDTL.READ_THRUPUT,
       SDTL.WRITE_THRUPUT,
       EXTRACT(DAY FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) || ' days ' ||
       EXTRACT(HOUR FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) || ' hrs ' ||
       EXTRACT(MINUTE FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) || ' min ' ||
       EXTRACT(SECOND FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) || ' sec' INFA_RUN_TIME
  from W_ETL_DEFN_RUN DEF,
       W_ETL_RUN_STEP STP,
       W_ETL_RUN_SDTL SDTL
 where DEF.ROW_WID = STP.RUN_WID
   and DEF.ROW_WID = '<ETL_RUN_WID>'
   and SDTL.RUN_STEP_WID = STP.ROW_WID
   and SDTL.TYPE_CD = 'Informatica'
 order by SDTL.END_TS - SDTL.START_TS desc;

If the report shows significant amounts of time spent rebuilding indexes and computing statistics, and the cumulative incremental load time does not fit into your load window, you can consider two options:

Option 1: Range partition the large fact tables if they show up in the report. Refer to the partitioning sections for more details.

Option 2: If the incremental volumes are low, leave bitmap indexes in place on the reported tables for the next incremental run and then compare the load times. Refer to the next chapter for the implementation.

Option 2 is not recommended for fact tables (%_F). It may be used for large dimension tables that cannot be partitioned effectively by range.

Important: Bitmap indexes present on target tables during inserts, updates, or deletes can significantly increase SQL DML execution time. The same SQL would complete much faster if the indexes were dropped prior to the query execution. On the other hand, it takes additional time to rebuild the dropped bitmap indexes and compute the required statistics. You should measure the cumulative time to run a specific task plus the time to rebuild indexes and compute the required database statistics before deciding whether to drop or keep bitmap indexes in place during incremental loads; a sketch of such a measurement follows below.
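A sketch of that measurement, summing the 'Create Index' and 'Analyze Table' step times for one ETL run against the same DAC repository tables used above ('<ETL_RUN_WID>' is a placeholder for your run's ROW_WID):

select SDTL.TYPE_CD,
       sum( EXTRACT(DAY    FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) * 86400
          + EXTRACT(HOUR   FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) * 3600
          + EXTRACT(MINUTE FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) * 60
          + EXTRACT(SECOND FROM (SDTL.END_TS - SDTL.START_TS) DAY TO SECOND) ) TOTAL_SEC
  from W_ETL_DEFN_RUN DEF, W_ETL_RUN_STEP STP, W_ETL_RUN_SDTL SDTL
 where DEF.ROW_WID = STP.RUN_WID
   and DEF.ROW_WID = '<ETL_RUN_WID>'
   and SDTL.RUN_STEP_WID = STP.ROW_WID
   and SDTL.TYPE_CD in ('Create Index', 'Analyze Table')
 group by SDTL.TYPE_CD;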

    3. Configure DAC not to drop selected bitmap indexes during incremental loads.

If your benchmarks show that it is less time consuming to leave bitmap indexes in place on large dimension tables during incremental loads, and the incremental volumes are relatively small, then you can consider keeping the selected indexes in place during incremental loads.

Since the DAC system property Drop and Create Bitmap Indexes Always overrides the index property Always Drop & Create, the system property defines how DAC will handle all bitmap indexes for all containers in the data warehouse schema. To work around this limitation:


- Log in to your repository through the DAC user interface, click the Design button under the top menu, and select the Indices tab in the right pane.
- Click the Query sub-tab and get the list of all indexes defined on the target table.
- Check both the Always Drop & Create and Inactive checkboxes against the indexes that should not be dropped during incremental runs.

Important: You must uncheck the Inactive checkbox for these indexes before the next initial load; otherwise, DAC will not create them after the initial load completes. Since the Inactive property is used both for truly inactive indexes and for indexes "hidden from incremental load", the property Always Drop & Create can be used for convenience to distinguish between the two categories.

If you choose to keep some bitmap indexes in place during incremental runs, consider creating those indexes with the storage parameter PCTFREE set to 50 or higher, as in the sketch below. Oracle RDBMS packs bitmap indexes into data blocks much more tightly than B*Tree indexes, so when updates, inserts, or deletes occur on table columns with the indexes in place, the bitmap indexes' quality degrades. The higher PCTFREE value will mitigate the impact to some degree.
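A minimal sketch of such an index; the index, table, and column names here are hypothetical, so substitute your own objects and storage clauses:

SQL> create bitmap index W_PARTY_D_M1 on W_PARTY_D (ACTIVE_FLG)
     pctfree 60 nologging;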

    4. Additional considerations for handling bitmap indexes during incremental loads.

- All bitmap indexes should be dropped for transaction fact tables with over 20 million records that receive a large volume of data updates and inserts, such as over 0.5 to 1 percent of total records, during an incremental run.
- For large tables with a small number of bitmap indexes, consider dropping and recreating the bitmap indexes, since the time to rebuild them will be short.
- For large tables with few data updates, the indexes can be left in place during incremental runs without significant performance degradation.

    Disabling Indexes with DISTINCT_KEYS = 0 or 1

Oracle BI Applications delivers a number of indexes to optimize both ETL and end-user query performance. Depending on your data and its distribution, there may be some indexes on columns with just one distinct value. Such indexes will not be used in any queries, so they can be safely dropped from your data warehouse schema and disabled in the DAC repository.

The following script helps to identify all such indexes, disable them in the DAC repository, and drop them in the database. You have to either connect as a DBA user or implement additional grants, since the script requires access to two database schemas:

ACCEPT DAC_OWNER PROMPT 'Enter DAC Repository schema name: '
ACCEPT DWH_OWNER PROMPT 'Enter Data Warehouse schema name: '

SELECT row_wid FROM "&&DAC_OWNER".w_etl_app;
ACCEPT APP_ID PROMPT 'Enter your DAC container from the list above: '

UPDATE "&&DAC_OWNER".w_etl_index
   SET inactive_flg = 'Y'
 WHERE row_wid IN
   (SELECT ind_ref.obj_wid
      FROM "&&DAC_OWNER".w_etl_index ind,
           "&&DAC_OWNER".w_etl_obj_ref ind_ref,
           "&&DAC_OWNER".w_etl_obj_ref tbl_ref,
           "&&DAC_OWNER".w_etl_table tbl,
           "&&DAC_OWNER".w_etl_app app,
           all_indexes all_ind
     WHERE ind_ref.obj_type = 'W_ETL_INDEX'
       AND ind_ref.soft_del_flg = 'N'
       AND ind_ref.app_wid = '&&APP_ID'
       AND ind_ref.obj_wid = ind.row_wid
       AND tbl_ref.obj_type = 'W_ETL_TABLE'
       AND tbl_ref.soft_del_flg = 'N'
       AND tbl_ref.app_wid = '&&APP_ID'
       AND tbl_ref.obj_wid = tbl.row_wid
       AND tbl_ref.obj_ref_wid = ind.table_wid
       AND ind.app_wid = app.row_wid
       AND ind.inactive_flg = 'N'
       AND all_ind.index_name = ind.name
       AND all_ind.table_name = tbl.name
       AND all_ind.distinct_keys = 1
       -- AND ind.type_cd = 'Query'
       AND all_ind.owner = '&&DWH_OWNER');
COMMIT;

-- Drop the indexes in the schema:
spool drop_dist_indexes.sql
SELECT 'DROP INDEX ' || owner || '.' || index_name || ' ;'
  FROM all_indexes
 WHERE distinct_keys <= 1  -- indexes with 0 or 1 distinct keys
   AND owner = '&&DWH_OWNER';
spool off


BEGIN
   execute immediate 'CREATE %1 INDEX %2 ON %3 ( %4 ) NOLOGGING';
   execute immediate 'ALTER INDEX %2 MONITORING USAGE';
END;

BEGIN
   execute immediate 'CREATE %1 INDEX %2 ON %3 ( %4 ) NOLOGGING';
   execute immediate 'ALTER INDEX %2 MONITORING USAGE';
END;

BEGIN
   execute immediate 'CREATE %1 INDEX %2 ON %3 ( %4 ) NOLOGGING PARALLEL';
   execute immediate 'ALTER INDEX %2 MONITORING USAGE';
END;

    6. If you implement index monitoring for the first time after completing ETLs, execute the following PL/SQL block

    to enable monitoring for all indexes:

DECLARE
   CURSOR c1 IS
      SELECT index_name
        FROM user_indexes
       WHERE index_name NOT IN
             (SELECT index_name FROM v$object_usage WHERE monitoring = 'YES');
BEGIN
   FOR rec IN c1 LOOP
      EXECUTE IMMEDIATE 'alter index ' || rec.index_name || ' monitoring usage';
   END LOOP;
END;
/

To query the unused indexes in your data warehouse, execute the following SQL:

    SELECT DISTINCT index_name FROM myobj_usage WHERE used = 'NO';

Important!!! There are two known cases when the optimizer uses indexes but DOES NOT mark them as used with Index Usage Monitoring turned on:

- DML operations against a parent table (such as DELETE or UPDATE), associated with a child table via the child table's foreign key (FK) and the FK normal index on the child table, do use the child table's FK index, but Oracle does not report it as used in v$object_usage. Note that BITMAP indexes are correctly flagged as used in the same scenario and reported in v$object_usage.

- The optimizer may use extended statistics for computing correct table selectivity, using composite indexes, and yet not report them in v$object_usage. This case may not be critical for the BI Analytics warehouse, since it does not use composite BITMAP indexes, while composite NORMAL indexes are used on surrogate keys (unique indexes) and critical columns used in ETL or OBIEE queries.

Make sure you carefully review the reported unused indexes prior to dropping them in the database and disabling them in the DAC repository.

After identifying redundant indexes, disabling them in DAC, and dropping them in your data warehouse, follow the steps below to turn off index monitoring:

    1. Restore /bifoundation/dac/CustomSQLs/CustomSQL.xml from its backup copy.

    2. Reset "Script before every ETL" System parameter in DAC

    3. Execute the following PL/SQL block to disable index monitoring:

DECLARE
   CURSOR c1 IS
      SELECT index_name
        FROM user_indexes
       WHERE index_name IN
             (SELECT index_name FROM v$object_usage WHERE monitoring = 'YES');
BEGIN
   FOR rec IN c1 LOOP
      EXECUTE IMMEDIATE 'alter index ' || rec.index_name || ' nomonitoring usage';
   END LOOP;
END;
/

Important!!! Make sure you monitor index usage for an extended period of at least one to two months before deciding which additional indexes can be disabled in DAC and dropped in your target schema.

    Handling Query Indexes during Initial ETL

Oracle BI Applications delivers a number of query indexes, which are not used during ETL but are required for better OBIEE query performance. Most query indexes are created as BITMAP indexes in the Oracle database. The creation of such a large number of query indexes can extend both initial and incremental ETL windows. This section discusses several options for reducing index maintenance, such as disabling unused query indexes, or partitioning large fact tables and maintaining local query indexes on the latest range partitions.

You can consider disabling ALL query indexes to reduce your ETL runtime in the following scenarios:

1. Disable query indexes -> run an initial ETL -> enable query indexes -> run an incremental ETL -> run OBIEE reports
2. Disable query indexes -> run an incremental ETL -> enable query indexes -> run another incremental ETL -> run OBIEE reports

To summarize, you can disable query indexes only for the pattern 1st ETL -> 2nd ETL -> OBIEE reports. You cannot use this option for the 1st ETL -> OBIEE -> 2nd ETL sequence.

Important: If you plan to implement partitioning for your warehouse tables and want to take advantage of the conversion scripts in the next section, then you need to have the query indexes created on the target tables prior to implementing partitioning.

    Identify and preserve all activated query indexes PRIOR to executing the first ETL run:

CREATE TABLE psr_initial_query_idx AS
SELECT ind_ref.obj_wid, ind.name idx_name, tbl.name tbl_name
  FROM w_etl_index ind,
       w_etl_obj_ref ind_ref,
       w_etl_obj_ref tbl_ref,
       w_etl_table tbl,
       w_etl_app app
 WHERE ind_ref.obj_type = 'W_ETL_INDEX'
   AND ind_ref.soft_del_flg = 'N'
   AND ind_ref.app_wid = :APP_ID
   AND ind_ref.obj_wid = ind.row_wid
   AND tbl_ref.obj_type = 'W_ETL_TABLE'
   AND tbl_ref.soft_del_flg = 'N'
   AND tbl_ref.app_wid = :APP_ID
   AND tbl_ref.obj_wid = tbl.row_wid
   AND tbl_ref.obj_ref_wid = ind.table_wid
   AND ind.app_wid = app.row_wid
   AND ind.inactive_flg = 'N'
   AND ind.isunique = 'N'
   AND ind.type_cd = 'Query'
   AND (ind.drp_crt_always_flg = 'Y' OR ind.drp_crt_bitmap_flg = 'Y');

    Where APP_ID can be identified from:

    SELECT row_wid FROM w_etl_app;

    Disable the identified query indexes PRIOR to starting the first ETL run:

SQL> UPDATE w_etl_index SET inactive_flg = 'Y' WHERE row_wid IN (SELECT obj_wid FROM psr_initial_query_idx);
SQL> COMMIT;

Execute your first ETL run.

    Enable all preserved indexes PRIOR to starting the second ETL run:

SQL> UPDATE w_etl_index SET inactive_flg = 'N' WHERE row_wid IN (SELECT obj_wid FROM psr_initial_query_idx);
SQL> COMMIT;

Execute your second ETL run. DAC will recreate all disabled query indexes.

    PARTITIONING GUIDELINES FOR LARGE FACT TABLES

    Introduction

Taking advantage of range and composite range-range partitioning for fact tables will not only reduce index and statistics maintenance time during ETL, but also improve web query performance. Since the majority of inserts and updates impact the last partition(s), you would need to disable local indexes only on the few impacted partitions, and then rebuild the disabled indexes after the load and compute statistics on the updated partitions only. Online reports and dashboards should also render results faster, since the optimizer builds more efficient execution plans using partition elimination logic.

Large fact tables with more than 20 million rows are good candidates for partitioning. To build an optimal partitioned table with reasonable data distribution, you can consider partitioning by month, quarter, year, and so on. You can either identify and partition target fact tables before the initial run, or convert the populated tables into partitioned objects after the full load.

To implement support for partitioned tables in the Oracle Business Analytics Data Warehouse, you need to update DAC metadata and manually convert the candidates into partitioned tables in the target database.

Follow the steps below to implement fact table partitioning in your data warehouse schema and DAC repository. Please note that some steps apply to composite range-range partitioning only.


    Convert to partitioned tables

Perform the following steps to convert a regular table into a range-partitioned table.

    Identify a partitioning key and decide on a partitioning interval

Choosing the correct partitioning key is the most important factor for effective partitioning, since it defines how many partitions will be involved in web queries or ETL updates. Review the following guidelines for selecting a column as the partitioning key:

- Identify eligible columns of type DATE for implementing range partitioning.
- Connect to the Oracle BI Server repository and check the usage of, and dependencies on, each column in the logical and presentation layers.
- Analyze the summarized data distribution in the target table by each potential partitioning key candidate and the data volumes per time range: month, quarter, or year (see the sketch after this list).
- Based on the compiled data, decide on the appropriate partitioning key and partitioning range for your future partitioned table.
- The recommended partitioning range for most implementations is a month, though you can consider a quarter or a year for your partitioning ranges.
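A minimal sketch of such a distribution analysis; W_ORDER_F and ORDER_DT are hypothetical table and column names standing in for your fact table and candidate DATE column:

SQL> select to_char(ORDER_DT, 'YYYY-MM') period, count(*) row_cnt
       from W_ORDER_F
      group by to_char(ORDER_DT, 'YYYY-MM')
      order by 1;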

The proposed partitioning guidelines assume that the majority of the incremental ETL data volume (~90%) consists of new records, which end up in the latest one or two partitions. Depending on the chosen range granularity, you may consider rebuilding local indexes for the most impacted latest partitions:

    - Monthly range: you are advised to maintain two