Datastage Dynamic Relational Stage Guide

Ascential DataStage™

Dynamic Relational Stage GuideVersion 1.0

Part No. 00D-TB040

December 2004

This document, and the software described or referenced in it, are confidential and proprietary to Ascential

Software Corporation ("Ascential"). They are provided under, and are subject to, the terms and conditions of a

license agreement between Ascential and the licensee, and may not be transferred, disclosed, or otherwise

provided to third parties, unless otherwise permitted by that agreement. No portion of this publication may be

reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical,

photocopying, recording, or otherwise, without the prior written permission of Ascential. The specifications and

other information contained in this document for some purposes may not be complete, current, or correct, and are

subject to change without notice. NO REPRESENTATION OR OTHER AFFIRMATION OF FACT CONTAINED IN THIS

DOCUMENT, INCLUDING WITHOUT LIMITATION STATEMENTS REGARDING CAPACITY, PERFORMANCE, OR

SUITABILITY FOR USE OF PRODUCTS OR SOFTWARE DESCRIBED HEREIN, SHALL BE DEEMED TO BE A

WARRANTY BY ASCENTIAL FOR ANY PURPOSE OR GIVE RISE TO ANY LIABILITY OF ASCENTIAL WHATSOEVER.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING

BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND

NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL ASCENTIAL BE LIABLE FOR ANY CLAIM, OR

ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM

LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS

ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. If you

are acquiring this software on behalf of the U.S. government, the Government shall have only "Restricted Rights" in

the software and related documentation as defined in the Federal Acquisition Regulations (FARs) in Clause

52.227.19 (c) (2). If you are acquiring the software on behalf of the Department of Defense, the software shall be

classified as "Commercial Computer Software" and the Government shall have only "Restricted Rights" as defined

in Clause 252.227-7013 (c) (1) of DFARs.

© 2004 Ascential Software Corporation. All rights reserved. DataStage®, EasyLogic®, EasyPath®, Enterprise Data

Quality Management®, Iterations®, Matchware®, Mercator®, MetaBroker®, Application Integration, Simplified®,

Ascential™, Ascential AuditStage™, Ascential DataStage™, Ascential ProfileStage™, Ascential QualityStage™,

Ascential Enterprise Integration Suite™, Ascential Real-time Integration Services™, Ascential MetaStage™, and

Ascential RTI™ are trademarks of Ascential Software Corporation or its affiliates and may be registered in the

United States or other jurisdictions.

Adobe Acrobat is a trademark of Adobe Systems, Inc. DB2, DB2 Universal Database, and IBM, and Informix are

either registered trademarks or trademarks of IBM Corporation. Microsoft, Microsoft SQL Server, Windows,

Windows NT, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the

United States and/or other countries. Oracle, Oracle8i, and Oracle9i are either registered trademarks or trademarks

of Oracle Corporation. Open Client and Sybase are either registered trademarks or trademarks of Sybase, Inc. UNIX

is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company,

Ltd. Other marks mentioned are the property of the owners of those marks.

The software delivered to Licensee may contain third-party software code. See Legal Notices (LegalNotices.pdf)

for more information.

How to Use This Guide

Dynamic Relational Stage (DRS) reads data from any DataStage stage

and writes it to one of the supported relational databases. It also reads

data from any of the supported relational databases and writes it to

any DataStage stage. It supports the following relational databases:

DB2/UDB, Informix, Microsoft SQL Server, Oracle, and Sybase. It also

supports a generic ODBC interface. Version 1.0 of DRS is compatible

with Ascential DataStage Release 7.5.1.

AudienceThis guide is intended for DataStage designers who create or modify

jobs that use DRS.

How This Book is OrganizedThe following table lists topics that may be of interest to you and it

provides links to these topics.

To learn about Read…

Functionality "Functionality" on page 2

Configuration requirements "Configuration Requirements" on page 4

Installation "Installing the Stage" on page 10

The database connection "Defining the DRS Connection" on page 10

Character set mapping "Defining Character Set Mapping" on page 12

Defining input data "Defining Input Data" on page 13

Writing data to a database "Writing Data to a Database" on page 22

Defining output data "Defining Output Data" on page 24

SQL meta tags "SQL Meta Tags" on page 31

DataStage Dynamic Relational Stage Guide iii

Related Documentation How to Use This Guide

Related DocumentationTo learn more about documentation from other Ascential products

and third-party documentation as they relate to DRS, refer to the

following sections/tables.

Ascential Software Documentation

Third-Party Documentation

Considerations for Oracle DATE data type "Oracle DATE Data Type Considerations" on page 41

Data type support "Database Data Type Support" on page 43

Handling $ and # characters "Handling $ and # Characters" on page 46

To learn about Read…

Guide Description

Ascential DataStage Designer Guide

General principles for designing jobs

Ascential DataStage Server Job Developer’s Guide

Techniques for designing server jobs

Ascential DataStage Manager Guide

Techniques for using and and maintaining the DataStage Repository

Ascential MetaStage User’s Guide Information about Ascential MetaStage™

Ascential DataStage NLS Guide Information about NLS and techniques for character-set mapping

Ascential DataStage Plug-In Installation and Configuration Guide

Information required to configure your system and install this stage

Product Guide Description

DB2/UDB DB2 Connect Quick Beginnings Guide

Infromation about DB2/UDB

Informix CLI Informix CLI Programmer’s Manual Information about Informix

Oracle Installation Guide Installation information

Oracle8 Installation Guide for AIX-Based Systems

Information about installing Oracle 8i on AIX systems

iv DataStage Dynamic Relational Stage Guide

How to Use This Guide Conventions

Conventions

Contacting SupportTo reach Customer Care, please refer to the information below:

Call toll-free: 1-866-INFONOW (1-866-463-6669)

Email: [email protected]

Ascential Developer Net: http://developernet.ascential.com

Please consult your support agreement for the location and

availability of customer support personnel.

To find the location and telephone number of the nearest Ascential

Software office outside of North America, please visit the Ascential

Software Corporation website at http://www.ascentialsoftware.com.

Convention Used for…

bold Field names, button names, menu items, and keystrokes. Also used to indicate filenames, and window and dialog box names.

user input Information that you need to enter as is.

code Code examples

variable

or

<variable>

Placeholders for information that you need to enter. Do not type the greater-/less-than brackets as part of the variable.

> Indicators used to separate menu options, such as:

Start >Programs >Ascential DataStage

[A] Options in command syntax. Do not type the brackets as part of the option.

B… Elements that can repeat.

A|B Indicator used to separate mutually-exclusive elements.

{ } Indicator used to identify sets of choices.

DataStage Dynamic Relational Stage Guide v

mailto:[email protected]

http://developernet.ascential.com

http://www.ascentialsoftware.com

Contacting Support How to Use This Guide

vi DataStage Dynamic Relational Stage Guide

Contents

How to Use This GuideAudience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

How This Book is Organized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

Related Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

Ascential Software Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

Third-Party Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

Contacting Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

Chapter 1The DRS Stage

Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2

Configuration Requirements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4

General Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4

Configuration Requirements for DB2/UDB. . . . . . . . . . . . . . . . . . . . . . . . . . 1-4

Configuration Requirements for Informix . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5

Configuration Requirements for Oracle 8i . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6

Configuration Requirements for Oracle 9i . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9

Configuration Requirements for Sybase . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10

Installing the Stage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10

Defining the DRS Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10

Connecting to a Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11

Defining Character Set Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-12

Defining Input Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-13

About the Input Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-13

Reject Row Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-21

DataStage Dynamic Relational Stage Guide vii

Contents

Writing Data to a Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-22

Using Generated SQL Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-22

Using User-Defined SQL Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-23

Defining Output Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-24

About the Output Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-24

Using User-Defined Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-31

SQL Meta Tags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-31

Additional Information about Joins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-38

Additional Information about %Like. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-40

Oracle DATE Data Type Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-41

Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-41

Truncation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-42

Database Data Type Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-43

Handling $ and # Characters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-46

Appendix ABest Practice for Handling Long Data Types in DataStage Jobs

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1

Recommended Job Design. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1

The Batch Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2

Minimum Job Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2

Template Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-3

Other Job Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-5

The Preliminary Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-6

Specifying Column Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7

Extracting the Length. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7

Identifying the Name of the File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-8

Determining Optimal Array Size and

Maximum Length of Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-8

The ETL Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-9

Using Description as an Override to Specify Precision . . . . . . . . . . . . . . . A-9

Managing Failures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-10

viii DataStage Dynamic Relational Stage Guide

1The DRS Stage

DRS can be configured at runtime to read from or write to a number of

different supported relational database management systems using

native interfaces.

With Dynamic Relational Stage, you can:

Generate your SQL statement. (Fully generated SQL query/Column-generated SQL query)

Use a file name to contain your SQL statement. (User-defined SQL file)

Clear a table before loading using a TRUNCATE statement. (Clear table)

Choose how often to commit rows to the database. (Transaction size)

Input multiple rows of data in one call to the database. (Array size)

Read multiple rows of data in one call from the database. (Array size)

Specify transaction isolation levels for concurrency control and transaction performance tuning. (Transaction Isolation)

Specify criteria that data must meet before being selected. (WHERE clause)

Specify criteria to sort, summarize, and aggregate data. (Other clauses)

Specify the behavior of parameter marks in SQL statements. For information on the Pre 4.2 user-defined SQL behavior, see "Defining Input Data" on page 1-13 and "Defining Output Data" on page 1-24

DataStage Dynamic Relational Stage Guide 1-1

Functionality The DRS Stage

DRS lets you rapidly and efficiently prepare and load streams of

tabular data from any DataStage stage (for example, the ODBC stage,

the Sequential File stage, and so forth) to and from tables of the target

database. The Database client on Windows or UNIX uses SQL*Net to

access an Oracle server on Windows or UNIX.

Each DRS stage is a passive stage that can have any number of input,

output, and reference output links:

Input links specify the data you are writing, which is a stream of rows to be loaded into a database. You can specify the data on an input link using an SQL statement constructed by Ascential DataStage or a user-defined SQL statement.

Output links specify the data you are extracting, which is a stream of rows to be read from a database. You can also specify the data on an output link using an SQL statement constructed by Ascential DataStage or a user-defined SQL statement.

Each reference output link represents a row that is key read from a database (that is, it reads the record using the key field in the WHERE clause of the SQL SELECT statement).

FunctionalityDRS has the following functionality and benefits:

Support for the following relational databases:

– DB2/UDB

– Informix

– Microsoft SQL Server

– Oracle

– Sybase

Support for generic ODBC.

Support for transaction grouping to control a group of input links from a Transformer stage. This lets you write a set of data to a database in one transaction. DRS opens one database session per transaction group.

Support for reject row handling. Link reject variables tell the Transformer stage the DBMS error code when an error occurs in the DRS stage for insert, update, and so forth, for control of job execution. For more information on how the Transformer stage works with link reject variables, see Ascential DataStage Server Job Developer’s Guide.

1-2 DataStage Dynamic Relational Stage Guide

The DRS Stage Functionality

Support for create and drop table functionality before writing to a table.

Support for before and after SQL statements to execute user-defined SQL statements before or after the stage writes or reads into a database.

Support of stream input, stream output, and reference output links.

The ability to use the Derivation cell to specify fully-qualified column names used to construct an SQL SELECT statement for output and reference links.

Note When you select Enable case sensitive table/column name, it is your responsibility to use quotation

marks for the owner/ table.column name in the

Derivation cell to preserve any lower-case letters.

Prefetching of SELECT statement result set rows when executing a query. This minimizes server round trips and enhances performance.

Reduction of the number of network round trips (more processing is done on the client).

Support of new transparent data structures and interfaces.

Elimination of open and close cursor round trips.

Improved error handling.

Use of DRS as a supplement to existing jobs that already use the ODBC stage, rather than as a replacement for the ODBC stage.

Importing of table definitions. For more information about meta data import, see Ascential DataStage Server Job Developer’s Guide.

Support of a file name to contain your SQL statement.

Support for NLS (National Language Support). For more information, see Ascential DataStage Administrator Guide or NLS Guide.

Support for MetaStage™. For information, see Ascential MetaStage User’s Guide.

Support for foreign key meta data import. For information, see Ascential DataStage Manager Guide and Ascential DataStage Designer Guide.

Support for the behavior of parameter marks for SQL statements, which is the same as that for releases of Ascential DataStage before 4.2.


Configuration Requirements The DRS Stage

Support for some specific LOB data types. See the following table:

The following functionality is not supported:

Bulk loading for stream input links. Use the appropriate Load stages to bulk load data into one databases supported by DRS. For more information respectively, see Ascential DataStage Server Job Developer’s Guide or the appropriate technical bulletin.

Stored procedures.

Support of data types not documented in the above table such as DBCLOB, FILE, LOB, LONG, LONG RAW, MSLABEL, OBJECT, RAW, REF, ROWID, or a named data type.

Configuration Requirements

General RequirementsFor general configuration requirements, see Ascential DataStage

Plug-In Installation and Configuration Guide.

Configuration Requirements for DB2/UDBThe DB2 server plug-in requires Client Application Enablers (CAEs) if

the DB2 data resides on an AS/400 system.

On the server machine, set the following system environment

variables:

PATH. Set to include the path to the DB2 bin directory.

DB2 PATH. Set to the installation directory of DB2.

Any changes to system environment variables require a system

reboot before the values of the variables take effect.

DBMS SQL_LONGVARCHAR SQL_LONGVARBINARY

Oracle CLOB BLOB (no BFILE)

DB2/UDB LONG VARCHAR LONG VARCHAR for bit data

MS SQL Server

TEXT IMAGE


The DRS Stage Configuration Requirements

Configuration Requirements for InformixIf Informix is running on a UNIX platform, there may be special

configuration requirements.

If the data source uses a translation DLL, you must add INFORMIXDIR/

lib/esql to the shared library search path in dsenv. If you do not, and

your data source requires a translation DLL, the message “Unable to

load translation DLL” appears in the job log.

Informix CLI looks for an .odbc.ini file in the dshome directory, the

DataStage server engine home directory that is stored in /.dshome.

(To see the location of the DataStage installation directory, enter $cat ‘cat /.dshome‘.)

The driver field in the data source entry for the .odbc.ini file must

reference the DataDirect 4.0 non-wired ODBC driver that is included

on the DataStage CD. Other drivers are not supported.

Tru64 requires the Informix CLI ODBC driver.

Informix Connection Examples. The following example shows sample

DSN entries for AIX and HP platforms for Ascential DataStage 7.5.1:

[Informix]Driver=/u4/dsadm/Ascential/DataStage/branded_odbc/lib/VMinf19.soDescription=DataDirect InformixDatabase=<stores_demo>LogonID=<userid>Password=<password>ServerName=<informixserver>HostName=<informixhost>Service=<online>Protocol=ontlitcpEnableInsertCursors=0GetDBListFromInformix=0CursorBehavior=0CancelDetectInterval=0TrimBlankFromIndexName=1ApplicationUsingThreads=1

The following example shows sample DSN entries for Solaris

platforms for Ascential DataStage 7.5.1:

[Informix]Driver=/u4/dsadm/Ascential/DataStage/branded_odbc/lib/VMinf19.soDescription=DataDirect InformixDatabase=<stores_demo>LogonID=<userid>Password=<password>ServerName=<informixserver>HostName=<informixhost>Service=<online>;Protocol=ontlitcpEnableInsertCursors=0GetDBListFromInformix=0CursorBehavior=0



CancelDetectInterval=0TrimBlankFromIndexName=1ApplicationUsingThreads=1

The following example shows sample DSN entries for Tru64 and

Linux platforms for Ascential DataStage 7.5.1:

[hpds_stores]Driver=/u1/informix/lib/cli/iclis09b.soDescription=INFORMIX 3.3 32-BITDatabase=<stores7>LogonID=<userid>pwd=<password>Servername=<hpds.1>CursorBehavior=0CLIENT_LOCALE=en_us.8859-1DB_LOCALE=en_us.8859-1TRANSLATIONDLL=/u1/informix/lib/esql/igo4a304.so

Every Informix data source to which your jobs connect must have an

entry in the .odbc.ini file. The only required fields in the data source

specification are the Database and Server name. If you choose to

include the login ID (UID) and/or password (PWD), you can leave the

User Name and Password properties blank. If you enter values for

these properties, the values in .odbc.ini are ignored. For more

information on the format of the .odbc.ini file, see Informix CLI

Programmer’s Manual.

You can also use this .odbc.ini file for other ODBC applications

including jobs using the ODBC stage.

Configuration Requirements for Oracle 8iDRS requires the following configuration:

Version 8.n of the Oracle client software on the DataStage server machine, which requires the following:

– Version 8.1.6: Oracle8i Client 8.1.6 (Installation Type: Programmer)

– Version 8.1.7: Oracle8i Client 8.1.7 (Installation Type: Programmer)

Configuration of SQL*Net using a configuration program, for example, SQL*Net Easy Configuration, to set up and add database aliases.

Building the Oracle Client Shared Library

Oracle OCI 8 requires the libclntsh.so shared library (for the Solaris

and Tru64 platforms) and libclntsh.sl (for the HP-UX 11 platform),

which is normally built during the installation of Oracle client

software. The Oracle installation accomplishes this by running the



script genclntsh located in the ORACLE_HOME/bin directory, where

ORACLE_HOME is the environment variable set to the location in

which your Oracle software is installed.

The genclntsh script provided by Oracle creates a shared library that

causes errors resulting from undefined symbols. Users running

Oracle 8 or Oracle8i on specific platforms with certain versions of

Oracle client software (as documented in "UNIX Platforms" on

page 1-7 in the next section) must therefore use the genclntsh8,

genclntsh816, or genclntsh817 script provided with Ascential

DataStage to build a replacement libclntsh.so or libclntsh.sl. These

scripts are placed in a tar archive at the root level of the DataStage

installation media and replace the new libclntsh.so or libclntsh.sl in

the lib directory underneath the DataStage server installation

directory. This is the same directory where your Oracle OCI 8i stage

shared library is installed. It does not overwrite the original

libclntsh.so or libclntsh.sl in the $ORACLE_HOME/lib directory.

UNIX Platforms

The following sections specify information about library

requirements, depending on your platform.

Solaris.

A one-time site linking to build a replacement Oracle client shared

library is required for Oracle Clients 8.0.3, 8.0.4, 8.0.5, 8.1.6, and 8.1.7

on Solaris. This site linking binds your unique Oracle client

configuration into the file that is used by the Oracle OCI 8i stage to

access local and remote Oracle databases.

Before you build the Oracle client shared library, install Oracle, and set

the environment variable ORACLE_HOME to the directory where you

installed Oracle.

In order to build the library, copy the genclnt.tar file, which is located

at the root level of the DataStage installation media, into a directory

on the local hard disk and extract it before running the script.

Use the following commands to build the shared library:

# cp /cdrom/genclnt.tar # tar -xvf genclnt.tar# cd solaris# ./genclntsh8 (or ./genclntsh816 or ./genclntsh817)

Tru64


library is required for Oracle Client 8.1.6, and 8.1.7 on Tru64. This is

required to expose the necessary direct path loading API symbols.



Before you build the Oracle shared library, install Oracle and set the

environment variable ORACLE_HOME to the directory where you

installed Oracle.

In order to build the library, you must copy the GENCLNT.TAR;1 file,

which is located at the root level of the DataStage installation media,

into a directory on the local hard disk and extract it before running the

script.

Use the following commands to build the shared library for version

8.1.6:

# cp /cdrom/GENCLNT.TAR;1 ./genclnt.tar# tar -xvf genclnt.tar# cd tru64# ./genclntsh816

Use the following commands to build the shared library for version

8.1.7:

# cp /cdrom/GENCLNT.TAR;1 ./genclnt.tar# tar -xvf genclnt.tar# cd tru64# ./genclntsh817

HP-UX 11


library is required for Oracle Client 8.1.6 and 8.1.7 on HP-UX 11. This

site linking binds your unique Oracle client configuration into the file

that is used by the Oracle OCI 8i stage to access local and remote

Oracle databases.

Note Oracle 8.0.n is not supported for HP-UX.

Before you build the Oracle client shared library:

1 Install Oracle

2 Set the ORACLE_HOME environment variable to the directory where you installed Oracle

3 Set the SHLIB_PATH shared library path environment variable to reference $ORACLE_HOME/lib.

To build the library, copy the GENCLNT.TAR;1 file, which is located at

the root level of the DataStage installation media, into a directory on

the local hard disk and extract it before running the script.

Use the following sequence of commands to build the shared library

for version 8.1.6:

# cp /cdrom/GENCLNT.TAR;1 ./genclnt.tar# tar -xvf genclnt.tar# cd hpux# ./genclntsh816 (or ./genclntsh817)



Use the same sequence of commands to build the shared library for

version 8.1.7, replacing 816 with 817 in the last command.

AIX 4.3

You must set the LINK_CNTRL environment variable to the value

L_PTHREADS_D7 before installing the Oracle 8.0 client software to

prevent this incompatibility.

Otherwise, you must rerun the genclntsh script provided by Oracle to

generate a new shared library after setting this environment variable

to the proper value. In particular, you must set this environment

variable before:

Installing 8.0.3 on AIX 4.3

Upgrading an existing version of Oracle to Release 8.0.3RR

Relinking executables

Installing new Oracle patches

Oracle 8.0.5 contains the same text in Installation Guide in the

“Special Considerations for AIX 4.3” section. This guide can also be

found on the Oracle web site referenced earlier, as can Oracle8

Installation Guide for AIX-Based Systems. You also must set the

environment variable also before relinking any applications that link

against Oracle 8.0.3 libraries.

Install Oracle, and set the environment variable ORACLE_HOME to the

directory where you installed Oracle. On AIX 4.3, set the environment

variable LINK_CNTRL to the value L_PTHREADS_D7.

Configuration Requirements for Oracle 9iOracle OCI 9i requires the following configuration:

Version 9.n of the Oracle client software on the DataStage server machine, which requires the following:

– Version 9i: Oracle9i Client (runtime)

Note AIX 5.1 requires version 9.2 or later of the Oracle

client software.

Configuration of SQL*Net using a configuration program, for example, SQL*Net Easy Configuration, to set up and add database aliases.


Installing the Stage The DRS Stage

Configuration Requirements for SybaseIf you experience any timeout issues, increase the default values for

the Sybase configuration parameters CS_RETRY_COUNT and

CS_TIMEOUT_VALUE to 10 or higher.

Installing the Stage For instructions and information supporting the installation, see

Ascential DataStage Plug-In Installation and Configuration Guide.

Defining the DRS Connection When you use the client GUI to edit an DRS stage, the DRS stage

dialog box appears:

This dialog box can have up to three pages (depending on whether

there are inputs to and outputs from the stage):

Stage. This page displays the name of the stage you are editing. The General tab defines the database source and logon information to connect to the database. For details see the following section, "Connecting to a Database" on page 1-11

The NLS tab defines a character set map to be used with the

stage. (The NLS tab appears only if you have installed NLS.) For

details, see "Defining Character Set Mapping" on page 1-12


The DRS Stage Defining the DRS Connection

Input. This page is displayed only if you have an input link to this stage. It specifies the SQL table to use and the associated column definitions for each data input link. This page also specifies the type of update action and transaction isolation level information for concurrency control and performance tuning. It also contains the SQL statement used to write the data and lets you enable case sensitivity for SQL statements.

Output. This page is displayed only if you have an output link to this stage. It specifies the SQL tables to use and the associated column definitions for each data output link. This page also specifies the type of query and transaction isolation level information for concurrency control and performance tuning. It also contains the SQL SELECT statement used to extract the data, and lets you enable case sensitivity for SQL statements.

To edit an DRS stage from the DRS stage dialog box:

1 Define the connection (see the following section).

2 Optional. Define a character set map (see "Defining Character Set Mapping" on page 1-12).

3 Define the data on the input links (see "Defining Input Data" on page 1-13).

4 Define the data on the output links (see "Defining Output Data" on page 1-24).

Connecting to a DatabaseSet the connection parameters on the General tab on the Stage page

of the GUI. To connect to one of the supported databases:

1 Identify the Database type by selecting a database management system from the DBMS Type drop-down list. Alternatively select Use Job Parameter…. Provide the name of the job parameter in the box provided in the Use Job Parameter dialog box. Database type is required.

2 Enter the name of the database alias to access in the Connection name field.

– DB2/UDB. The database name is configured with the DB2 Client Configuration Assistant. The database name is all that is needed to determine the location of the target database on the network.

– Informix. This is the Informix data source.

Windows. Define using the ODBC Administrator.

HP-UX 11.0 UNIX platform. Define data sources (DSN) for

Informix databases in the .odbc.ini file.


Defining Character Set Mapping The DRS Stage

– Microsoft SQL Server. This is an ODBC DSN.

– Oracle. This is the name you created using the Oracle Configuration Assistant.

– Sybase. This is either the Sybase server name or a combination of the Sybase server name and database name in the format server.database.

If you provide only the server name, you are given access to

your default database. If you include a database name, you

must be a valid user in that database.

3 Enter the user name to use to connect to the database in the User ID field. This user must have sufficient privileges to access the specified database and source and target tables. This field is required. There is no default.

4 Enter the password that is associated with the specified user name to use in the Password field. This field is required. There is no default.

5 Enter an optional description of the DRS stage in the Description field.

Defining Character Set MappingYou can define a character set map for a stage. Do this from the NLS

tab on the Stage page. The NLS tab appears only if you have

installed NLS.

Specify information using the following fields:

Map name to use with stage. Defines the default character set map for the project or the job. You can change the map by selecting a map name from the list.

Show all maps. Lists all the maps that are shipped with Ascential DataStage.

Loaded maps only. Lists only the maps that are currently loaded.

Use Job Parameter…. Specifies parameter values for the job. Use the format #Param#, where Param is the name of the job parameter. The string #Param# is replaced by the job parameter when the job is run.

For more information about NLS or job parameters, see Ascential

DataStage NLS Guide or Ascential DataStage Designer Guide.


The DRS Stage Defining Input Data

Defining Input DataWhen you write data to a table in a database, the DRS stage has an

input link.

The properties of this link and the column definitions of the data are

defined on the Input page in the DRS stage dialog box of the GUI.

About the Input Page The Input page has an Input name field; the General, Columns,

and SQL tabs; and the Columns… and View Data… buttons:

Input name. The name of the input link. Choose the link you want to edit from the Input name list. This list displays all the input links to the DRS stage.

Click Columns… to display a brief list of the columns designated on the input link. As you enter detailed meta data in the Columns tab, you can leave this list displayed.

Click View Data… to invoke the Data Browser. This lets you look at the data associated with the input link in the database. For a description of the Data Browser, see Ascential DataStage Designer Guide.

General Tab

This tab is displayed by default. It contains the following fields:

Table name. This required field is editable when the update action is not User-defined SQL (otherwise, it is read-only). It is the name of the target database table the data is written to, and the table must exist or be created by choosing generate DDL


Defining Input Data The DRS Stage

from the Create table action list. You must have insert, update, or delete privileges, depending on input mode. You must specify Table name if you do not specify User-defined SQL. There is no default.

Click … (Browse button) to browse the Repository to select the

table.

Update action. Specifies which SQL statements are used to update the target table. Some update actions require key columns to update or delete rows. There is no default. Choose the option you want from the list:

– Clear table then insert rows. Deletes the contents of the table and adds the new rows, with slower performance because of transaction logging. This is the default value.

– Truncate table then insert rows. Truncates the table with no transaction logging and faster performance. For DB2/UDB and Informix, this option is the same as Clear table then insert rows.

– Insert rows without clearing. Inserts the new rows in the table.

– Delete existing rows only. Deletes existing rows in the target table that have identical keys in the source files.

– Replace existing rows completely. Deletes the existing rows, then adds the new rows to the table.

– Update existing rows only. Updates the existing data rows. Any rows in the data that do not exist in the table are ignored.

– Update existing rows or insert new rows. Updates the existing data rows before adding new rows. It is faster to update first when you have a large number of records.

– Insert new rows or update existing rows. Inserts the new rows before updating existing rows. It is faster to insert first if you have only a few records.

– Insert new rows only. Inserts the new rows in the table but does not report duplicate rows to the log.

– Truncate only. Ignores any incoming data and truncates the target table.



Important A stage cannot stand alone on the canvas.

Configure a dummy source stage as illustrated

below.

To keep the process simple and to preserve

the resource, make sure the dummy source

stage does not have any source data.

– User-defined SQL. Writes the data using a user-defined SQL statement, which overrides the default SQL statement generated by the stage. If you choose this option, you enter the SQL statement on the SQL tab. See "Using User-Defined SQL Statements" on page 1-23 for details.

– User-defined SQL file. Reads the contents of the specified file to write the data.

Transaction Isolation. Provides the necessary consistency and concurrency control between transactions in the job and other transactions for optimal performance. For more information on using these levels, see your database documentation. Use one of the following transaction isolation levels:

– Read Uncommitted. Takes exclusive locks on modified data. These locks are held until a commit or rollback is executed. However, other transactions can still read but not modify the uncommitted changes. No other locks are taken.

– Read Committed. Takes exclusive locks on modified data and sharable locks on all other data. This is the default. Exclusive locks are held until a commit or rollback is executed. Uncommitted changes are not readable by other transactions. Shared locks are released immediately after the data has been processed, allowing other transactions to modify it.



– Repeatable Read. Identical to serializable except that phantom rows may be seen.

– Serializable. Takes exclusive locks on modified data and sharable locks on all data. All locks are held until a commit or rollback is executed, preventing other transactions from modifying any data that has been referenced during the transaction.

Note Not all platforms support all four levels of transaction

isolation, and those that do have different names for

them. DRS maps the value for the property to the

closest supported value defined for the database

platform selected. See the applicable database

documentation to determine what levels are supported.

Array size. Specifies the number of rows to be transferred in one call between Ascential DataStage and the database before they are written. Enter a positive integer to indicate how often the database management system performs writes to the database. The default value is 1; that is, each row is written in a separate statement.

Larger numbers use more memory on the client to cache the

rows. This minimizes server round trips and maximizes

performance by executing fewer statements. If this number is too

large, the client may run out of memory.

Array size has implications for handling of reject rows. See "Reject

Row Handling" on page 1-21.

Transaction size. Specifies the number of rows written before a commit is executed in the database. A value of 0 causes all rows in the job to be written as a single transaction.

Create table action. Creates the target table in the specified database if Generate DDL is selected. It uses the column definitions in the Columns tab and the table name. You must have CREATE TABLE privileges on your schema.

Choose one of the following options to create the table:

– Do not create target table. Specifies that the target table is not created, and the Drop table action field and the Create Table Properties button on the right of the dialog are disabled.

– Generate DDL. Specifies that the stage generates the CREATE TABLE statement using information from Table name, the column definitions grid, and the values in the Create Table Properties dialog box.



Click the button to open the Create Table Properties dialog

box to display the table space and storage expression values

for generating the DDL.

Drop table action. Drops the target table before it is created by the stage if Generate DDL is selected. This field is disabled if you choose not to create the target table. The list displays the same items as the Create table action list except that they apply to the DROP TABLE statement. You must have DROP TABLE privileges on your schema.

Treat warning message as fatal error. Determines the behavior of the stage when an error is encountered while writing data to a table. If the check box is selected, a warning message is logged as fatal, and the job aborts.

If the check box is cleared (the default), three warning messages

are logged in the Ascential DataStage Director log file, and the job

continues.

Description. Contains an optional description of the input link.

Columns Tab

This tab contains the column definitions for the data written to the

table.

The column definitions appear in the same order as in the Columns

grid:

The Columns tab behaves the same way as the Columns tab in the

ODBC stage. For a description of how to enter and edit column

definitions, see Ascential DataStage Designer Guide.



SQL Tab

The SQL tab contains the Generated, User-defined, Before, After, Generated DDL, and User-defined DDL tabs.

Use these tabs to display the stage-generated SQL statement and the

SQL statement that you can enter.

Generated. Contains the SQL statements constructed by Ascential DataStage that are used to write data to a database. The SQL statements are based on the current values of the stage and link properties. You cannot edit these statements, but you can use Copy to copy them to the Clipboard for use elsewhere, for example, user-defined SQL statements. This tab is displayed by default.

User-defined. Contains the SQL INSERT, DELETE, or UPDATE statements executed to write data to a database.



Note For information about SQL Meta Tags, see "SQL Meta Tags"

on page 1-31.

Before. Contains the SQL statements executed before the stage processes any job data rows.

The SQL statement on the Before tab is the first SQL statement to

be executed. Depending on your choice, the job can continue or

abort after failing to execute a Before statement. It does not affect

the transaction grouping scheme. The commit or rollback is

performed on a per-link basis.

Each SQL statement is executed as a separate transaction if the

statement separator is a double semi-colon ( ;; ). All SQL

statements are executed in a single transaction if a semi-colon ( ; )

is the separator.

If the property value begins with FILE=, the remaining text is

interpreted as a pathname, and the contents of the file supplies

the property value.

Note For information about SQL Meta Tags, see "SQL Meta

Tags" on page 1-31.

Treat errors as non-fatal. If selected, errors caused by Before SQL

are logged as warnings, and processing continues with the next

command batch. Each separate execution is treated as a separate

transaction. If cleared, errors are treated as fatal to the job, and

result in a transaction rollback. The transaction is committed only

if all statements successfully execute.



After. Contains the SQL statements executed after the stage processes the job data rows.

The SQL statement on the After tab is the last SQL statement to

be executed. Depending on your choice, the job can continue or

abort after failing to execute an After SQL statement. It does not

affect the transaction grouping scheme. The commit or rollback is

performed on a per-link basis.

Each SQL statement is executed as a separate transaction if the

statement separator is a double semi-colon ( ;; ). All SQL

statements are executed in a single transaction if a semi-colon ( ; )

is the separator.


interpreted as a pathname, and the contents of the file supplies

the property value.

The behavior of Treat errors as non-fatal is the same as for Before.


Tags" on page 1-31.



Generated DDL. Select Generate DDL or User-defined DDL from the Create table action field on the General tab to enable this tab.

The CREATE TABLE statement field displays the CREATE

TABLE statement that is generated from the column meta data

definitions and the information provided on the Create Table Properties dialog box. If you select an option other than Do not drop target table from the Drop table action list, the DROP statement field displays the generated DROP TABLE statement

for dropping the target table.

User-defined DDL. Reserved for a future release.

Reject Row HandlingDuring input link processing, rows of data may be rejected by the

database for various reasons, such as unique constraint violations or

data type mismatches.

The DRS stage writes the offending row to the log for the DataStage

job. For the database message detail, you must use the error

messages returned by the database.

Ascential DataStage provides additional reject row handling. To use

this capability:

1 Set Array Size to 1.

2 Use a Transformer stage to redirect the rejected rows.

You can then design your job by choosing an appropriate target for

the rejected rows, such as a Sequential stage. Reuse this target as an


Writing Data to a Database The DRS Stage

input source once you resolve the issues with the offending row

values.

Writing Data to a DatabaseThe following sections describe the differences when you use

generated or user-defined SQL INSERT, DELETE, or UPDATE

statements to write data from Ascential DataStage to a database.

Using Generated SQL Statements By default, Ascential DataStage writes data to a database table using

an SQL INSERT, DELETE, or UPDATE statement that it constructs. The

generated SQL statement is automatically constructed using the table

and column definitions that you specify in the input properties for this

stage. The SQL tab displays the SQL statement used to write the data.

To use a generated SQL statement:

1 Enter a table name in the Table name field on the Input page.

2 Specify how you want the data to be written by choosing a suitable option from the Update action list. Choose one of these options for a generated statement:

– Clear table then insert rows

– Truncate table then insert rows

– Insert rows without clearing

– Delete existing rows only

– Replace existing rows completely

– Update existing rows only

– Update existing rows or insert new rows

– Insert new rows or update existing rows

– User-defined SQL

– User-defined SQL file

See "Defining Input Data" on page 1-13 for a description of each

update action.

3 Enter an optional description of the input link in the Description field.

4 Click the Columns tab on the Input page. The Columns tab appears.


The DRS Stage Writing Data to a Database

5 Edit the Columns grid to specify column definitions for the columns you want to write.

The SQL statement is automatically constructed using your

chosen update action and the columns you have specified.

6 Click the SQL tab on the Input page, then the Generated tab to view this SQL statement. You cannot edit the statement here, but you can click this tab at any time to select and copy parts of the generated statement to paste into the user-defined SQL statement.

7 Click OK to close the DRS Stage dialog box. Changes are saved when you save your job design.

Using User-Defined SQL Statements Instead of writing data using an SQL statement constructed by

Ascential DataStage, you can enter your own SQL INSERT, DELETE, or

UPDATE statement for each DRS input link. (You can include other

SQL statements such as CREATE TABLE only in a Before SQL

statement.) Ensure that the SQL statement contains the table name,

the type of update action you want to perform, and the columns you

want to write.

To enter a user-defined SQL statement:

1 Click the User-defined tab on the SQL tab.

2 Enter the SQL statement you want to use to write data to the target database tables. This statement must contain the table name, the type of update action you want to perform, and the columns you want to write.

When writing data, the INSERT statements must contain a

VALUES clause with the question mark (?) parameter marker for

each stage input column. All user-defined SQL is passed

unchanged to the underlying DBMS except for the parameter

marker. Therefore, the syntax must be correct for that DBMS.

UPDATE statements must contain SET clauses with parameter

markers for each stage input column. UPDATE and DELETE

statements must contain a WHERE clause with parameter markers

for the primary key columns. The parameter markers must be in

the same order as the associated columns listed in the stage

properties. For example:

insert emp (emp_no, emp_name) values (?, ?)

If you specify two SQL statements, they are executed as one

transaction. Do not use a trailing semicolon.

You cannot call stored procedures as there is no facility for

parsing the row values as parameters.


Defining Output Data The DRS Stage

Unless you specify a user-defined SQL statement, the stage

automatically generates an SQL statement.

3 Click OK to close the DRS Stage dialog box. Changes are saved when you save your job design.

Defining Output DataOutput links specify the data you are extracting from a database. You

can also specify the data on an output link using an SQL statement

constructed by Ascential DataStage or a user-defined SQL statement.

The following sections describe the differences when you use SQL

SELECT statements for generated or user-defined queries that you

define on the Output page in the DRS stage dialog box of the GUI.

About the Output PageThe Output page has one field and the General, Columns, Selection, and SQL tabs.

Output name. The name of the output link. Choose the link you want to edit from the Output name list. This list displays all the output links from the DRS stage.

The Columns… and the View Data… buttons function like those on the Input page.


The DRS Stage Defining Output Data

General Tab

This tab is displayed by default. It contains the following parameters:

Table names (SQL FROM clause). Specifies the name of the source table. Use comma-separated table names to indicate an inner join. Use outer-join SQL Meta Tags to define an outer join.

Browse. Allows you to work with the DataStage Repository to obtain column information.

Transaction Isolation. Specifies the transaction isolation levels that provide the necessary consistency and concurrency control between transactions in the job and other transactions for optimal performance. For more information on using these levels, see your database documentation. Use one of the following transaction isolation levels:

– Read Uncommitted. Takes exclusive locks on modified data. These locks are held until a commit or rollback is executed. However, other transactions can still read but not modify the uncommitted changes. No other locks are taken.

– Read Committed. Takes exclusive locks on modified data and sharable locks on all other data. This is the default. Exclusive locks are held until a commit or rollback is executed. Uncommitted changes are not readable by other transactions. Shared locks are released immediately after the data has been processed, allowing other transactions to modify it.

– Repeatable Read. Identical to serializable except that phantom rows may be seen.

– Serializable. Takes exclusive locks on modified data and sharable locks on all data. All locks are held until a commit or rollback is executed, preventing other transactions from modifying any data that has been referenced during the transaction.

Note Not all platforms support all four levels of transaction

isolation, and those that do have different names for

them. DRS maps the value for the property to the

closest supported value defined for the database

platform selected. See the applicable database

documentation to determine what levels are supported.

Array size. Specifies the number of rows read from the database at a time. Enter a positive integer to indicate the number of rows to prefetch in one call. The default value 1 means that prefetching is turned off.

Larger numbers use more memory on the client to cache the

rows. This minimizes server round trips and maximizes



performance by executing fewer statements. If this number is too

large, the client may run out of memory.

Query type. Specifies which SQL statements are used to read the target table. Choose the option you want from the list:

– Generated SQL query. Specifies that the data is extracted using an SQL statement constructed by Ascential DataStage. When you click the SQL tab, you can view this SQL statement. You cannot edit the statement here, but you can click this tab at any time to select and copy parts of the generated statement to paste into the user-defined SQL statement.

– User-defined SQL query. Specifies that the data is extracted using a user-defined SQL query. With this choice, you can edit the SQL statements.

– User-defined SQL query file. Specifies that the data is extracted using the SQL query in the pathname of the designated file that exists on the server. Enter the pathname for this file instead of the text for the query. With this choice, you can edit the SQL statements.

Description. Contains an optional description of the output link.

Columns Tab

The column tab page behaves the same way as the Columns tab in

the ODBC stage, and it specifies which columns are aggregated. For a

description of how to enter and edit column definitions, see Ascential

DataStage Designer Guide.

The column definitions for output links contain a key field. Key fields

are used to join primary and reference inputs to a Transformer stage.

For a reference output link, the DRS key reads the data by using a



WHERE clause in the SQL SELECT statement. For details on how key

fields are specified and used, see Ascential DataStage Designer

Guide.

The Derivation cell on the Columns tab contains fully-qualified

column names when table definitions are loaded from the DataStage

Repository. If the Derivation cell has no value, DRS uses only the

column names to generate the SELECT statement displayed in the

Generated tab of the SQL tab. Otherwise, it uses the content of the

Derivation cell. Depending on the format used in the Repository, the

format is owner.table.name.columnname or tablename.columnname.

The column definitions for reference links require a key field. Key

fields join reference inputs to a Transformer stage. DRS key reads the

data by using a WHERE clause in the SQL SELECT statement.

Selection Tab

Use this tab to build column-generated SQL queries. It contains

optional SQL clauses for the conditional extraction of data.

The Clauses tab is divided into two panes.

WHERE clause. Allows you to insert an SQL WHERE clause to specify criteria that the data must meet before being selected.


Tags" on page 1-31.

Other clauses. Allows you to insert a GROUP BY, HAVING, or ORDER BY clause to sort, summarize, and aggregate data.



SQL Tab

The SQL tab contains the Generated, User-defined, Before, and

After tabs.

Use these tabs to display the stage-generated SQL statement and the

SQL statement that you can enter.

Generated. Displays the SQL statements that read data from a database. You cannot edit these statements, but you can use Copy to copy them to the Clipboard for use elsewhere.

User-defined. Contains the user-defined SQL SELECT statement executed to read data from a database. With this choice, you can edit the SQL statements.


Tags" on page 1-31.



Before. Contains the SQL statements executed before the stage processes any job data rows.

The Before is the first SQL statement to be executed, and you can

specify whether the job continues or aborts after failing to execute

a Before SQL statement. The commit/rollback is performed on a

per-link basis.


interpreted as a pathname, and the contents of the file supply the

property value.


Tags" on page 1-31.

After. Contains the After SQL statement executed after the stage processes any job data rows.



It is the last SQL statement to be executed, and you can specify

whether the job continues or aborts after failing to execute an

After SQL statement. The commit/rollback is performed on a per-

link basis.

Ascential DataStage extracts data from a data source using an SQL

SELECT statement that it constructs. The SQL statement is

automatically constructed using the table and column definitions that

you entered in the stage output properties.

SQL SELECT statements have the following syntax:

SELECT clause FROM clause

[WHERE clause][GROUP BY clause][HAVING clause][ORDER BY clause];

When you specify the tables to use and the columns to be output from

the DRS stage as well as the selection criteria, the SQL SELECT

statement is automatically constructed and can be viewed by clicking

the SQL tab on the Output page.

The SELECT and FROM clauses are the minimum required and are

automatically generated by Ascential DataStage. However, you can

use any of these SQL SELECT clauses:

For more information about these clauses, see Ascential DataStage

Server Job Developer’s Guide.

SELECT clause Specifies the columns to select from the

database.

FROM clause Specifies the tables containing the selected

columns.

WHERE clause Specifies the criteria that rows must meet

to be selected. Use it also to join two or

more tables and limit the rows selected.

GROUP BY clause Groups rows to summarize results.

HAVING clause Specifies the criteria that grouped rows

must meet to be selected.

ORDER BY clause Sorts selected rows.


The DRS Stage SQL Meta Tags

Using User-Defined QueriesInstead of using the SQL statement constructed by Ascential

DataStage, you can enter your own SQL statement for each DRS

output link.

To enter a user-defined SQL query:

1 Choose User-defined SQL query from the Query type list on the General tab on the Output page. The User-defined tab on the SQL tab is enabled.

You can then edit or drag and drop the selected columns into your

user-defined SQL statement. Only one SQL statement is supported

for an output link.You must ensure that the table definitions for the

output link are correct and represent the columns that are

expected.

2 Click OK to close the DRS stage dialog box. Changes are saved when you save your job design.

SQL Meta TagsSQL Meta Tags are unique functions that enhance or modify platform-

specific SQL statements. When used in SQL statements, they are

translated into native database SQL before the SQL is executed by the

stage. They can be applied anywhere you define the SQL code:

User-defined tab on the Input page (see page 1-18)

Before tab on the Input page (see page 1-19)

After tab on the Input page (see page 1-20)

Table names (SQL FROM clause) of the General tab on the Output page (see page 1-25)

WHERE clause of the Selection tab on the Output page (see page 1-27)

User-defined tab on the Output page (see page 1-28)

Before tab on the Output page (see page 1-29)

After tab on the Output link (see page 1-29)


SQL Meta Tags The DRS Stage

The following table describes the available SQL Meta Tags.

The Tag Description

%Abs(x) Returns a decimal value equal to the absolute value of anumber x.

Input arguments: x is numeric1.

Output data type: Numeric.

%Coalesce(expr1, expr2, …) Returns the first nonnull argument provided to the function.

Input arguments: expr1, expr2, … are expressions ofany type.

Output data type: An expression of any type.

%Concat At runtime, the %Concat SQL Meta Tag is replaced by the string concatenation operator appropriate for the specific relational database. For example, in DB2, the %Concat SQL Meta Tag is replaced with CONCAT, whilein Sybase it is replaced with a "+".

This SQL Meta Tag is supported with the same limitations as the native concatenation operator on thespecific relational database. For example, some platforms allow you to concatenate a string with a numeric value; others flag this as an error. Ascential DataStage makes no attempt to check or convert the data types of either of the operands.

Input arguments: None.

Output data type: String2.

%CurrentDateIn Expands to a platform-specific SQL substring representing the current date in the WHERE clause of aSQL SELECT or UPDATE statement or when the currentdate is passed in an INSERT statement.


Output data type: DATE.

%CurrentDateOut Expands to platform-specific SQL for the current date inthe SELECT clause of an SQL query.


Output Data Type: (Date) String literal.

%CurrentDateTimeIn Expands to a platform-specific SQL substring representing the current datetime in the WHERE clauseof a SQL SELECT or UPDATE statement or when the current date time is passed in an INSERT statement.


Output data type: TIMESTAMP.



%CurrentDateTimeOut Expands to a platform-specific SQL for the current datetime in the SELECT clause of an SQL query.


Output data type: (Timestamp) String literal.

%CurrentTimeIn Expands to a platform-specific SQL substring representing the current time in the WHERE clause of aSQL SELECT or UPDATE statement or when the currenttime is passed in an INSERT statement.


Output data type: TIME.

%CurrentTimeOut Expands to a platform-specific SQL for the current timein the SELECT clause of an SQL query.


Output data type: (Time) String literal.

%DateAdd(date_from, add_days) Returns a date by adding add_days to date_from. add_days can be negative.

Input arguments: date_from is a DATE; add_days is apositive or negative INTEGER literal expression.


%DateDiff(date_from, date_to) Returns an integer representing the difference betweentwo dates in number of days.

Input arguments: date_from is a DATE; date_to is a DATE.

Output data type: INTEGER.

%DateIn(dt) Expands into platform-specific SQL syntax for the date.%DateIn should be used whenever a date literal or Datebind variable is used in a comparison in the WHERE clause of a SELECT or UPDATE statement or when a Date value is passed in an INSERT statement.

Input arguments: dt is a DATE or date string literal inYYYY-MM-DD format.


%DateOut(dt) Expands to a platform-specific SQL substring representing dt in the SELECT clause of an SQL query.

Input arguments: dt is a DATE.

Output data type: (Date) String literal.

%DatePart(DTTM_Column) Returns the date portion of the specified datetime column.

Input arguments: DTTM_Column is a TIMESTAMP.


The Tag Description



%DateTimeDiff(datetime_from,datetime_to)

Returns a time value representing the difference between two datetimes in minutes.

Input arguments: datetime_from and datetime_to areTIMESTAMP.


%DateTimeIn(dtt) Expands to platform-specific SQL for a DateTime valuein the WHERE clause of a SQL SELECT or UPDATE statement or when a datetime value is passed in an INSERT statement.

Input arguments: dtt is a TIMESTAMP or Timestamp string literal in the form YYYY-MM-DD-hh.mm.ss.ssssss.


%DateTimeOut(datetime_col) Expands to a platform-specific SQL substring representing datetime_col in the SELECT clause of an SQL query.

Input arguments: datetime_col is a TIMESTAMP.

Output data type: (Timestamp) String literal.

%DecDiv(a,b) Returns a floating number representing the value of a divided by b, where a and b are numeric expressions.

Input arguments: a and b are numeric expressions.

Output data type: Recommendation is to use a character type.

%DecMult(a,b) Returns a floating number representing the value of a multiplied by b, where a and b are numeric expressions.

Input arguments: a and b are numeric expressions.

Output data type: Recommendation is to use a character type.

%DTTM(date, time) Combines the database date represented by the date value with the database time in the time value and returns a database timestamp value.

Input arguments: date is a DATE; time is a TIME.


The Tag Description



.

.

.

%FullJoin(TableName1OrJoin, [tableAlias1], tableName2OrJoin, [tableAlias2], joinCondition)

Produces a full outer join. Use in Table name (SQL FROM clause) on the General tab of the Output pageor in the FROM clause on the User-defined tab of the SQL tab of the Output page.

See "Additional Information about Joins" on page 1-38

Input arguments: TableName1OrJoin is the name of the first table or nested join in the join; tableAlias1 is analias for the first table in the join and is optional; tableName2OrJoin is the name of the second table or nested join in the join; tableAlias2 is an alias for the second table in the join and is optional; joinCondition isthe expression describing the join condition.

Output data types: Any, as defined by the columns inthe joined table.

%LeftJoin(TableName1OrJoin, [tableAlias1], tableName2OrJoin, [tableAlias2], joinCondition)

Produces a left outer join. Use in Table name (SQL FROM clause) on the General tab of the Output pageor in the FROM clause on the User-defined tab of the SQL tab of the Output page.


Input arguments: TableName1OrJoin is the name of the first able or nested join in the join; tableAlias1 is analias for the first table in the join and is optional; tableName2OrJoin is the name of the second table or nested join in the join; tableAlias2 is an alias for the second table in the join and is optional; joinCondition isthe expression describing the join condition.


%Like("Literal") Expands to look for literal values. Use this meta-SQL when looking for like values. A % is appended to the literal.

%Like generates the following:

like 'literal%'If the literal value contains '\_' or '\%', %Like generates the following:

like 'literal%' escape '\'See "Additional Information about %Like" on page 1-40

Input arguments: Literal is a string.

Output data type: String.

The Tag Description



.

%LikeExact(fieldname, "Literal") Expands to look for literal values. Use this when exact matches are necessary, taking into account wildcards inthe literal values.

%LikeExact generates one of the following:

If the literal contains no wildcards:

fieldname = 'literal'If the literal ends with the '%' wildcard:

fieldname like 'literal' [escape '\']Some platforms require that you use RTRIM to get the correct value. The following characters are wildcards even when preceded with the backslash '\' escape character.

%

_

Therefore, on some platforms, the literal must end witha '%' wildcard that is not preceded by '\' as in the following example.

literal = 'ABC%' In the above example you do not need RTRIM on any platform.

literal = 'ABC\%' In the above example, you need RTRIM on SQL Server and DB2/UDB.

Input arguments: fieldname and Literal are strings.

Output data type: String

%NumToChar(Number) Transforms a numeric value into a character value. Spaces are trimmed from Number.

Input arguments: Number is a numeric.

Output data type: CHAR.

%RightJoin(TableName1OrJoin, [tableAlias1], tableName2OrJoin, [tableAlias2], joinCondition)

Produces a right outer join. Use in Table name (SQL FROM clause) on the General tab of the Output pageor in the FROM clause on the User-defined tab of the SQL tab of the Output page.


Input arguments: TableName1OrJoin is the name of the first table or nested join in the join; tableAlias1 is analias for the first table in the join and is optional; tableName2OrJoin is the name of the second table or nested join in the join; tableAlias2 is an alias for the second table in the join and is optional; joinCondition isthe expression describing the join condition.


The Tag Description



%Round(expression, factor) Rounds an expression to a specified scale before or after the decimal point. If factor is a literal, it can be rounded to a negative number.

Input arguments: expression is a numeric expression;factor is an INTEGER.

Output data type: DECIMAL.

%Substring(source_str, start, length)

Expands to a substring of source_str.

Input arguments: source_str is a string; start and length are INTEGERS.


%TimeAdd(datetime, add-minutes)

Generates the SQL that adds add-minutes (which can be a positive or negative integer literal or expression, provided that the expression resolves to a data type that can be used in datetime arithmetic for the given relational database) to the provided datetime (which can be a datetime literal or expression).

Input arguments: datetime is a Timestamp string literal or expression; add-minutes is a positive or negative INTEGER literal or expression.


%TimeIn(tm) Expands to platform-specific SQL for a Time value in the WHERE clause of a SQL SELECT or UPDATE statement or when a time value is passed in an INSERTstatement.

Input arguments: tm is TIME or a time string literal inthe form hh.mm.ss.ssssss.


%TimeOut(time_col) Expands to a platform-specific SQL substring representing time_col in the SELECT clause of an SQL query.

Input arguments: time_col is TIME.

Output data type: (Time) String literal.

%TimePart(DTTM_Column) Returns the time portion of the specified datetime column.

Input arguments: DTTM_Column is TIMESTAMP.


%TrimSubstr(source_str, start, length)

Expands to a substring of source_str, like %Substring, except that trailing blanks are removed from the substring.

Input arguments: source_str is a string, and start andlength are INTEGERS.


The Tag Description



Additional Information about JoinsThe following is additional information about Joins.

Implications for Oracle 8i Databases

For all database types except Oracle 8i, DRS expands the tags to the

standard SQL-92 syntax for outer joins. For Oracle 8i data sources,

DRS uses the Oracle proprietary syntax, which inserts “(+)” after

appropriate field references in the WHERE clause. In a simple

example, the expression

SELECT Table1.x,Table1.yFROM %LeftJoin (Table1,, Table2,, Table1.x = Table2.x)WHERE Table1.y > 2

produces the following SQL-92 expression:

SELECT Table1.x,Table1.yFROM Table1 LEFT OUTER JOIN Table2 ON Table1.x = Table2.xWHERE Table1.y > 2

With an Oracle 8i database, the same expression produces the

following Oracle 8i expression:

SELECT Table1.x,Table1.yFROM Table1, Table2

%Upper(charstring) Converts the string charstring to uppercase. You can use wildcards with charstring, like %.

Input arguments: charstring is a string.


1 Examples of numeric data types are:

BIGINTDECIMALDOUBLEFLOATINTEGERNUMERICREALSMALLINTTINYINT

2 Examples of string data types are:

CHARLONGNVARCHARLONGVARCHARNCHARNVARCHARVARCHAR

The Tag Description



WHERE (Table1.x = Table2.x(+)) and (Table1.y > 2)

In this case, the meta tag is not like a simple macro. The meta tag

expansion inserts the join condition in front of the existing WHERE

condition, if any, with an AND, putting the join condition and the

existing WHERE condition each in parentheses. The expansion

process then scans the resulting composite WHERE condition for

references to fields of the table on the right side of the left outer join.

After each of these references, it inserts a “(+).”

Warning There are special restrictions for using outer join SQL

Meta Tags with Oracle 8i databases:

In order for the meta tag expansion process to identify the field references, you must fully qualify all field references in join conditions and in the WHERE condition.

You can nest joins only on the left, as in

%LeftJoin (%LeftJoin (Table1,,Table2,,Table1.x = Table2.x),, Table3,, Table2.q =Table3.q

The following is not permitted:

%LeftJoin (Table3,,%LeftJoin (Table1,, Table2,, Table1.x = Table2.x),, Table2.q = Table3.q)

Aliases

You can provide aliases for the tables in the second and fourth

parameters. For example, in SQL-92

SELECT Table1Alias.x,Table1Alias.yFROM %LeftJoin (Table1, Table1Alias, Table2, Table2Alias, Table1Alias.x = Table2Alias.x)WHERE Table1Alias.y > 2

produces

SELECT Table1Alias.x,Table1Alias.yFROM Table1 Table1Alias LEFT OUTER JOIN Table2 Table2Alias ON Table1Alias.x = Table2Alias.xWHERE Table1Alias.y > 2

Nested Outer Joins

You can use the join SQL Meta Tags recursively to produce nested

outer joins. For example,

SELECT Table1.xFROM %LeftJoin (%LeftJoin (Table1,, Table2,, Table1.x=Table2.x),, Table3,, Table1.z = Table3.z AND Table2.q = Table3.q)WHERE Table1.a = ‘xyz’

In SQL-92, this becomes

SELECT Table1.x



FROM Table1 LEFT OUTER JOIN Table2 ON TABLE1.x = Table2.x LEFT OUTER JOIN Table3 on Table1.z = Table3.z AND Table2.q = Table3.qWHERE Table1.a = ‘xyz’

In Oracle 8i, this becomes

SELECT Table1.xFROM Table1, Table2, Table3WHERE (Table1.x = Table2.x (+)) AND ((Table1.z = Table3.z (+) AND Table2.q (+) = Table3.q (+)) AND (Table1.a = ‘xyz’))

Additional Information about %Like The following is additional information about %Like.

Using %Like and Eliminating Blanks

Some platforms require that you use RTRIM to get the correct value.

The following characters are wildcards even when preceded with the

backslash '\' escape character.

%

_

Therefore, on some platforms, the literal must end with a '%' wildcard

that is not preceded by '\'.

literal = 'ABC%'

In the example above, you do not need RTRIM on any platform.

literal = 'ABC\%'

In the example above, you need RTRIM on SQL Server and DB2/UDB.

Using %Like and Trailing Blanks

Note that not all executions of LIKE perform in the same way. When

dealing with trailing blanks, some platforms behave as if there is an

implicit % at the end of the comparison string, while most do not, as

shown in the following example:

In this example, if the selected column contains the string "ABCD "

(three trailing blanks), the following statement may or may not return

any rows:

select * from t1 Where c like 'ABCD'

Therefore, it is always important to explicitly code the % the end of

matching strings for those columns where you want to include trailing

blanks.


The DRS Stage Oracle DATE Data Type Considerations

LIKE Wild Cards

SQL provides two wild cards that can be used when specifying pattern

matching strings for use with the LIKE predicate. The underscore (_) is

used as a substitution for a single character within a string; the

percent sign (%) is used to represent any number of character spaces

within a string.

Oracle DATE Data Type ConsiderationsAn Oracle DATE data type contains date and time information (there is

no TIME data type in Oracle). Ascential DataStage maps the Oracle

DATE data type to a Timestamp data type. This is the default

DataStage data type when you import the Oracle meta data type of

DATE.

ConversionAscential DataStage uses a conversion of YYYY-MM-DD HH24:MI:SS

when reading or writing an Oracle date. If the DataStage data type is

Timestamp, DataStage uses the to_date function for this column

when it generates the INSERT statement to write an Oracle date. If the

DataStage data type is Timestamp or Date, Ascential DataStage uses

the to_char function for this column when it generates the SELECT

statement to read an Oracle date.

The following example creates a table with a DATE data type on an

Oracle server. The imported DataStage data type is Timestamp.

create table dsdate (one date);

The results vary, depending on whether the DRS stage is used as an

input or an output link:

Input link. The stage generates the following SQL statement:

insert into dsdate(one) values(TO_DATE(:1, 'yyyy-mm-dd hh24:mi:ss'))

Output link. The stage generates the following SQL statement:

select TO_CHAR(one, 'YYYY-MM-DD HH24:MI:SS') FROM dsdate


Oracle DATE Data Type Considerations The DRS Stage

TruncationIf you choose DataStage data type DATE or TIME, a portion of the

Oracle column is lost.

DRS provides two environment variables for overriding the default

Oracle behavior so that you can specify some other value, for

example, a value indicating that the date portion is not significant.

DS_DEFAULT_DATETIME_DATE. Set this value to a string with the format YYYY-MM-DD. DRS writes to Oracle a constant date component for each DataStage value defined as TIME.

DS_DEFAULT_DATETIME_TIME. Set this value to a string with the format HH:MM:SS. DRS writes to Oracle a constant date component for each DataStage value defined as DATE.

The following demonstrates the impact of using one of these

environment variables.

A DataStage job reads the following from a sequential file and writes it to an Oracle table with two DATE data types.

The full values that are stored in Oracle on November 23, 2004, are:

But if DS_DEFAULT_DATETIME_DATE is set to “1900-01-01” and DS_DEFAULT_DATETIME_TIME is set to “23:59:59,” then the following is written to Oracle.

When the data is… Its value is…

Located in its source 2004-10-29 11:18:24

Read by DataStage using a TIME data type 11:18:24

Located in Oracle after DataStage writes it using a TIME data type, on December 13, 2004

2004-12-01 11:18:24

Date Time

2004-11-25 09:14:37

DATE DATE

2004-11-25 12:00:00 2004-11-01 09:14:37

DATE DATE

2004-11-25 23:59:59 1900-01-01 09:14:37


Th

e D

RS

Sta

ge

Da

tab

ase D

ata

Ty

pe S

up

po

rt

Data

Sta

ge D

yn

am

ic R

ela

tion

al S

tag

e G

uid

e1-4

3

ting table definitions for a

e.

erverype

SybaseData Type

BINARY

BIT

CHAR, NCHAR

ME DATETIME,

SMALLDATETIME3

L DECIMAL, MONEY

DOUBLE PRECISION

FLOAT4

R INT

TEXT

L NUMERIC

Database Data Type SupportThe following table document s the support for database data types. When crea

database table, specify the SQL type, length, and scale attributes as appropriat

DataStage SQL Data Type

DB2/UDBData Type

Informix Data Type

OracleData Type

SQL SData T

SQL_BIGINT BIGINT INT8

SQL_BINARY CHAR FOR BIT DATA

BYTE

SQL_BIT Unsupported BOOLEAN1 BIT

SQL_CHAR CHAR CHAR CHAR CHAR

SQL_DATE TIMESTAMP DATE DATE2 DATETI

SQL_DECIMAL DECIMAL DECIMAL NUMBER DECIMA

SQL_DOUBLE DOUBLE PRECISION

DOUBLE PRECISION

NUMBER

SQL_FLOAT FLOAT FLOAT NUMBER FLOAT

SQL_INTEGER INTEGER INTEGER NUMBER INTEGE

SQL_LONGVARBINARY LONG VARCHAR FOR BIT DATA

BYTE BLOB5, 6 IMAGE

SQL_LONGVARCHAR LONG VARCHAR,

CLOB5, 7

TEXT CLOB5, 8 TEXT

SQL_NUMERIC DECIMAL DECIMAL NUMBER DECIMA

Th

e D

RS

Sta

ge

Da

tab

ase D

ata

Ty

pe S

up

po

rt

1-4

4D

ata

Sta

ge D

yn

am

ic R

ela

tion

al S

tag

e G

uid

e

REAL

INT SMALLINT

ME DATETIME,

SMALLDATETIME9

ME DATETIME, SMALLDATETIME

INT TINYINT

ARY

AR VARCHAR, NVARCHAR, SYSNAME

NCHAR

TEXT

APHIC NVARCHAR

n converted to an E or SMALLDATETIME

ision occurs when

DataStage SQL Data Type

DB2/UDBData Type

Informix Data Type

OracleData Type

SQL ServerData Type

SybaseData Type

SQL_REAL REAL REAL NUMBER REAL

SQL_SMALLINT SMALLINT SMALLINT SMALL

SQL_TIME TIME DATETIME DATE2 DATETI

SQL_TIMESTAMP TIMESTAMP DATETIME DATE2 DATETI

SQL_TINYINT SMALLINT SMALL

SQL_VARBINARY VARCHAR FOR BIT DATA

BYTE VARBIN

SQL_VARCHAR VARCHAR VARCHAR VARCHAR2 VARCH

SQL_WCHAR NCHAR NCHAR NCHAR NCHAR

SQL_WLONGVARCHAR DBCLOB10 NCLOB NTEXT

SQL_WVARCHAR NVARCHAR NVARCHAR NVARCHAR2 VARGR

1 Supported only with Informix Universal Server 9.n

2 See "Oracle DATE Data Type Considerations" on page 1-41.

3 The time component of a Sybase DATETIME or SMALLDATETIME value is lost wheDataStage DATE value. When writing a DataStage DATE value to a Sybase DATETIMvalue, the time component is set to midnight.

4 DataStage FLOAT values have a maximum precision of 15 digits. Some loss of precreading data from Sybase float(p) columns where p is greater than 15.

Th

e D

RS

Sta

ge

Da

tab

ase D

ata

Ty

pe S

up

po

rt

Data

Sta

ge D

yn

am

ic R

ela

tion

al S

tag

e G

uid

e1-4

5

n length to specify the

ta type with a precision n, choose DataStage’s B in the Columns tab. LOB cannot be used as

a type with a precision ition, choose re than 32 K in the R maps to

a type with a precision n, choose DataStage’s

n the Columns tab. The cannot be used as a

n converted to an E or SMALLDATETIME ne.

ARCHAR is

5 Values passed in LOB-type columns need to get into available memory. Use columamount of memory to reserve for a particular column.

6 The DRS Plug-in supports the BLOB data type by mapping the LONGVARBINARY dagreater than 4 KB to Oracle’s BLOB data type. To work with a BLOB column definitioLONGVARBINARY as the column’s data type and provide a Length of more than 4 KThe maximum size supported by DataStage is 2 GB. A column with a data type of Ba key.

7 The DRS Plug-in supports the CLOB data type by mapping the LONGVARCHAR datgreater than 32 K to DB2/UDB’s CLOB data type. To work with a CLOB column definDataStage’s LONGVARCHAR as the column’s data type and provide a Length of moColumns tab. If the Length is less than or equal to 32 K, DataStage’s LONGVARCHALONGVARCHAR.

8 The DRS Plug-in supports the CLOB data type by mapping the LONGVARCHAR datgreater than 4 KB to Oracle’s CLOB data type. To work with a CLOB column definitioLONGVARCHAR as the column’s data type and provide a Length of more than 4 KB imaximum size supported by DataStage is 2 GB. A column with a data type of CLOBkey.

9 The date component of a Sybase DATETIME or SMALLDATETIME value is lost wheDataStage TIME value. When writing a DataStage TIME value to a Sybase DATETIMvalue, the date component is set to the current date on the DataStage server machi

10 If the size of the column is less than 32 K, the DB2/UDB data type for SQL_WLONGVLONGVARCHAR.

Handling $ and # Characters The DRS Stage

Handling $ and # CharactersAscential DataStage has been modified to enable it to handle

databases that use the DataStage reserved characters # and $ in

column names. DataStage converts these characters into an internal

format and then converts them back as necessary.

To take advantage of this facility, do the following:

In Ascential DataStage Administrator, open the Environment Variables dialog box for the project in question, and set the environment variable DS_ENABLE_RESERVED_CHAR_CONVERT to true (this can be found in the General\Customize branch).

Avoid using the strings __035__ and __036__ in your column names (these are used as the internal representations of # and $ respectively).

Once the table definition is loaded, the internal column names are

displayed rather than the original database names both in table

definitions and in the Data Browser. They are also used in derivations

and expressions. The original names (that is, those containing the $ or

#) are used in generated SQL statements, however, and you should

use them if entering SQL in the job yourself.

When using a DRS Plug-In in a server job, you should use the external

names when entering user-defined SQL statements that contain

columns. The columns within the stage are represented by ? , ?, etc.

(parameter markers) and bound to the columns by order, so you do

not need to worry about entering names for them. This applies to:

Query

Update

Insert

Key

Select

Where clause

For example, for an update you might enter:

UPDATE tablename SET ##B$ = ? WHERE $A# = ?

Particularly note the key in this statement ($A#) is specified using the

external name.


ABest Practice for Handling Long Data

Types in DataStage Jobs

IntroductionThe Dynamic Relational Stage now supports the various long data

types provided by the major database vendors. DataStage represents

data types such as Oracle's CLOB and BLOB types with the Long

Varchar SQL type. In most cases, these types can support object sizes

of 2GB or greater. In DataStage, object sizes must be specified in a job

design and cannot be changed at runtime. Therefore, you must design

a job with the largest potential size of a large object rather than with

the largest actual size that may be encountered during job execution.

This results in unnecessary memory allocations which, together with

a large array size in the various database stages, can rapidly exhaust

available system memory.

The following describes a mechanism that assures DataStage users a

high degree of success in processing large objects on a system that is

adequately configured for the task.

For a sample job, use DataStage Manager to import the file

LongJobControl.dsx from the Samples directory on the DataStage

installation media.

Recommended Job DesignThe recommended job design requires a pairing of two jobs driven by

job control. The first is responsible for determining the actual

maximum length of each column containing a long data type and

passing this length to a second job that has a parameter describing

DataStage Dynamic Relational Stage Guide A-1

the length of its long data type columns. For the first job, use the

%MaxLength parameter to calculate the maximum length.

The Batch JobMore specifically, a batch job controls the sequence. The batch job can

be created from scratch or modified from a template. The batch job is

a server job with a short description on the designer panel as shown

in the following illustration.

Minimum Job ParametersThe batch job requires at least three job parameters: QUERYJOB,

ETLJOB, and $mam. $mam is a project level environment variable

that specifies maximum available memory for the ETLJOB expressed

in megabytes. QUERYJOB and ETLJOB provide the names for the

query job and transaction job respectively. Other job parameters can


Best Practice for Handling Long Data Types in DataStage Jobs The Batch Job

be added, if necessary. The job control must be modified if more job

parameters are added.

Template CodeThe BASIC code is in the Job Control tab of the batch job. The

template code is shown below.

REM PARAMETERFILE is the output file of QUERYJOB, and contains the maximum data length of the LONG column* Setup QUERYJOB, run it, wait for it to finish, and test for success hJob1 = DSAttachJob(QUERYJOB, DSJ.ERRFATAL) If NOT(hJob1) Then Call DSLogFatal("Job Attach Failed: ":QUERYJOB:"", "JobControl") Abort End ErrCode = DSSetParam(hJob1, "DBMS_MAX", DBMS_MAX) ErrCode = DSSetParam(hJob1, "DSN_MAX", DSN_MAX) ErrCode = DSSetParam(hJob1, "UID_MAX", UID_MAX) ErrCode = DSSetParam(hJob1, "PWD_MAX", PWD_MAX) ErrCode = DSRunJob(hJob1, DSJ.RUNNORMAL) ErrCode = DSWaitForJob(hJob1) Status = DSGetJobInfo(hJob1, DSJ.JOBSTATUS) If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then * Fatal Error - No Return Call DSLogFatal("Job Failed: ":QUERYJOB:"", "JobControl") End

* Determine optimal Array size and column length for ETL JOB PARAMETERFILE = QUERYJOB:".pf" CALL DSLogInfo("PARAMETERFILE = ":PARAMETERFILE:"", "JobControl")*Get the maximum length of LONG column data *If $mam is nonzero, use it to determine array size IF ($mam NE 0) THEN


The Batch Job Best Practice for Handling Long Data Types in DataStage Jobs

ARG1 = $mam*1048576 OPENSEQ PARAMETERFILE TO PFILE ELSE CALL DSLogInfo("File not found ":PARAMETERFILE:"", "JobControl") READSEQ ColLength FROM PFILE ELSE CALL DSLogInfo("Error reading ":PARAMETERFILE:"", "JobControl") DATALENGTH = TRIMB(ColLength)

CLOSESEQ PFILE

* 0.3 is a number of factor if there are only one DRS as source stage and * another DRS as target stage. With more than two DRS stages in the ETL job, * smaller number should be considered. MAXMEM = ARG1*0.3 * 6000 is for rows size of other columns ROWSIZE = DATALENGTH + 6000 * If the maximum length of the long column data exceeds MAXMEM, * MAXMEM will be the column precision for ETL job IF (DATALENGTH > MAXMEM) THEN

DATALENGTH = MAXMEM END

ARRAYSIZE = DIV(MAXMEM, ROWSIZE)

IF (ARRAYSIZE < 1) THEN ARG1 = 1

END

*32767 is the biggest arraysize allowed IF (ARRAYSIZE > 32767) THEN

ARG1 = 32767 END IF(ARRAYSIZE >= 1 AND ARRAYSIZE <= 32767) THEN

ARG1 = ARRAYSIZE END

ARG2 = DATALENGTH

CALL DSLogInfo("ARRAYSIZE = ":ARG1:"", "JobControl") CALL DSLogInfo("MAXLONGLENGTH = ":ARG2:"", "JobControl") END

*If $mam is set as zero, it will be ignored IF ($mam EQ 0) THEN OPENSEQ PARAMETERFILE TO PFILE ELSE CALL DSLogInfo("File not found ":PARAMETERFILE:"", "JobControl") READSEQ ColLength FROM PFILE ELSE CALL DSLogInfo("Error reading ":PARAMETERFILE:"", "JobControl") ARG2 = TRIMB(ColLength) CLOSESEQ PFILE CALL DSLogInfo("MAXLONGLENGTH = ":ARG2:"", "JobControl") END

* Setup ETLJOB, run it, wait for it to finish, and test for success hJob2 = DSAttachJob(ETLJOB, DSJ.ERRFATAL)

A-4 DataStage Dynamic Relational Stage Guide

Best Practice for Handling Long Data Types in DataStage Jobs The Batch Job

If NOT(hJob2) Then Call DSLogFatal("Job Attach Failed: ":ETLJOB:"", "JobControl") Abort End ErrCode = DSSetParam(hJob2, "DBMS_SRC", DBMS_SRC) ErrCode = DSSetParam(hJob2, "DSN_SRC", DSN_SRC) ErrCode = DSSetParam(hJob2, "UID_SRC", UID_SRC) ErrCode = DSSetParam(hJob2, "PWD_SRC", PWD_SRC) IF ($mam NE 0) THEN ErrCode = DSSetParam(hJob2, "ARSZ_SRC", ARG1) END IF ($mam EQ 0) THEN ErrCode = DSSetParam(hJob2, "ARSZ_SRC", ARSZ_SRC) END ErrCode = DSSetParam(hJob2, "LongLength", ARG2) ErrCode = DSSetParam(hJob2, "DBMS_TRG", DBMS_TRG) ErrCode = DSSetParam(hJob2, "DSN_TRG", DSN_TRG) ErrCode = DSSetParam(hJob2, "UID_TRG", UID_TRG) ErrCode = DSSetParam(hJob2, "PWD_TRG", PWD_TRG) IF ($mam NE 0) THEN ErrCode = DSSetParam(hJob2, "ARSZ_TRG", ARG1) END IF ($mam EQ 0) THEN ErrCode = DSSetParam(hJob2, "ARSZ_TRG", ARSZ_TRG) END ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL) ErrCode = DSWaitForJob(hJob2) Status = DSGetJobInfo(hJob2, DSJ.JOBSTATUS) If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then * Fatal Error - No Return Call DSLogFatal("Job Failed: ":ETLJOB:"", "JobControl") End

Other Job ParametersThere are fifteen other job parameters in the above code.

DBMS_MAX:DBMS type for QUERY job.

DSN_MAX:Data source name for QUERY job.

UID_MAX:User name for QUERY job.

PWD_MAX:Password for QUERY job.

DBMS_SRC:DBMS type for the source stage of the ETL job.

DSN_SRC:Data source name for the source stage of the ETL job.

UID_SRC:User name for the source stage of the ETL job.

PWD_SRC:Password for the source stage of the ETL job.

ARSZ_SRC:Array size for the source stage of the ETL job.

DBMS_TRG:DBMS type for the target stage of the ETL job.

DSN_TRG:Data source name for the target stage of the ETL job.


The Preliminary Job Best Practice for Handling Long Data Types in DataStage Jobs

UID_TRG:User name for the target stage of the ETL job.

PWD_TRG:Password for the target stage of the ETL job.

ARSZ_TRG:Array size for the target stage of the ETL job

LongLength:The maximum data length of LONG column.

Change the names of other job parameters if they are different from

those in your actual jobs.

The Preliminary JobThe first job consists minimally of a DRS stage writing to a sequential

file, named QUERYJOB.pf. The DRS stage uses user-defined SQL

containing the appropriate meta SQL (%MaxLength) to calculate the

maximum data size of the long column

.


Best Practice for Handling Long Data Types in DataStage Jobs The Preliminary Job

Specifying Column LengthUsing the Columns tab, create a column to contain the length.

Extracting the LengthUsing the User-defined tab, create the SQL to extract the length.


Determining Optimal Array Size and Maximum Length of Data Best Practice for Handling Long Data Types in

Identifying the Name of the FileIn the Sequential File stage, identify the File Name.

Note Performance can become an issue because this is likely to

be a relatively expensive operation.

Determining Optimal Array Size and Maximum Length of Data

Once the first job completes, the job control uses $mam and the

output file of query job to determine the optimal array size (ARG1) and

the maximum data length of the LONG column (ARG2). The maximum

data length from the query job is adjusted if it can not be handled

within the memory of $mam. Both ARG1 and ARG2 are passed to the

subsequent DRS stages of the second job (ETLJOB) as job parameters

by the job control. ETLJOB is a regular job with a job parameter to

specify data length for LONG column. All the rows that exceed ARG2

are identified by inserting information into the LONG column of the

target table. The format of the inserted information is in the format of

"Table|Column|Key Values".

Currently, DataStage does not support job parameters in column meta

data other than in Description on the Columns tab. DataStage

allows Description to specify an 'override'. In the description field of

the LONG column, enter a string of the form:

param{length=ParameterName}. The ParameterName may be

surrounded by # chars or not and is case-sensitive. The precision of

the LONG column is replaced by the value of ParameterName at

runtime.


Best Practice for Handling Long Data Types in DataStage Jobs The ETL Job

The ETL JobThe second job, illustrated below, is a job that extracts, transforms,

and loads.

Using Description as an Override to Specify PrecisionThe following is a sample of the Columns tab with Description

supplied.


Managing Failures Best Practice for Handling Long Data Types in DataStage Jobs

Managing FailuresDRS currently allocates its memory based on the column meta data

when the stage is initialized. Therefore, it is unlikely that a job will

encounter memory issues mid-run. However, it is possible that

truncations of long data occur if the maximum long column size

exceeds the memory available to process these rows. In such rare

cases, you need to log these truncated rows for later processing. The

information that uniquely identifies the target row is in the format of

"Table|Column|Key Values".


Index

Symbols%Like 1–40

AAfter tab

Input page 1–20

Output page 1–29

Ascential Developer Net v

BBefore tab

Input page 1–19

Output page 1–29

blanks, eliminating 1–40

Ccharacter set mapping 1–12

Columns tab

Input page 1–17

Output page 1–26

configuration requirements

DB2 1–4

general 1–4

Informix 1–5

Oracle 8i 1–6

Oracle 9i 1–9

Sybase 1–10

connecting to a database 1–11

Customer Care v

Customer Care, telephone v

Ddata type considerations

Oracle 8i 1–41

Oracle 9i 1–41

DataStage Dynamic Relational Stage Guide

data types

DataStage 1–43

DB2 1–43

Informix 1–43

Oracle 1–43

SQL Server 1–43

Sybase 1–43

DataStage data types 1–43

DataStage Designer iii

DB2

configuration requirements 1–4

data types 1–43

dollar sign ($) 1–46

DRS

defining the connection 1–10

description iii

Dynamic Relational Stage. See DRS.

Eeliminating blanks 1–40

environment variables 1–46

Ffunctionality 1–2

GGeneral tab

Input page 1–13

Output page 1–25

Stage page 1–11

Generated DDL tab 1–21

generated SQL statements 1–22

Generated tab

Input page 1–18

Output page 1–28

Index-1

Index

IInformix


data types 1–43

Input page

Columns tab 1–17

description 1–11

General tab 1–13

properties 1–13

SQL tab 1–18

After tab 1–20

Before tab 1–19

Generated DDL tab 1–21

Generated tab 1–18

User-defined DDL tab 1–21

User-defined tab 1–18

installation 1–10

Jjoins

aliases 1–39

nested outer 1–39

Oracle 8i 1–38

Llarge objects. See processing large objects

LOB data types 1–4

long data types A–1

NNLS tab 1–12

OOracle 8i

client shared library 1–6


data type considerations 1–41

data types 1–43

joins 1–38

Oracle 9i


data type considerations 1–41

data types 1–43

Output page

Columns tab 1–26

description 1–11, 1–24

General tab 1–25

Selection tab 1–27

Index-2

SQL tab 1–28

After tab 1–29

Before tab 1–29

Generated tab 1–28

User-defined tab 1–28

overview 1–1

Ppound sign (#) 1–46

processing large objects

batch job

description A–2

minimum job parameters A–2

other job parameters A–5

determining maximum data size A–6

determining maximum length of data A–8

determining optimal array size A–8

recommended job design A–1

specifying precision A–9

template code A–3

the ETL job A–9

truncation of data A–9

Rreject row handling 1–21

reserved characters

dollar sign ($) 1–46

pound sign (#) 1–46

SSelection tab 1–27

specifying object sizes A–1

SQL meta tags 1–31

SQL Server

data types 1–43

SQL tab

Input page 1–18

Output page 1–28

Stage page 1–10

description 1–10

General tab 1–11

NLS tab 1–12

Sybase


data types 1–43

Tthird-party documentation iv


Index

UUser-defined DDL tab 1–21

user-defined queries 1–31

user-defined SQL statements 1–23

User-defined tab

Input page 1–18

Output page 1–28

Wwriting to a database 1–22

Index-3

Index

Index-4

Datastage Dynamic Relational Stage Guide

Documents

Transcript of Datastage Dynamic Relational Stage Guide