Datastage Dynamic Relational Stage Guide
-
Upload
mohenishjaiswal -
Category
Documents
-
view
3.233 -
download
6
Transcript of Datastage Dynamic Relational Stage Guide
Ascential DataStage™
Dynamic Relational Stage GuideVersion 1.0
Part No. 00D-TB040
December 2004
This document, and the software described or referenced in it, are confidential and proprietary to Ascential
Software Corporation ("Ascential"). They are provided under, and are subject to, the terms and conditions of a
license agreement between Ascential and the licensee, and may not be transferred, disclosed, or otherwise
provided to third parties, unless otherwise permitted by that agreement. No portion of this publication may be
reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical,
photocopying, recording, or otherwise, without the prior written permission of Ascential. The specifications and
other information contained in this document for some purposes may not be complete, current, or correct, and are
subject to change without notice. NO REPRESENTATION OR OTHER AFFIRMATION OF FACT CONTAINED IN THIS
DOCUMENT, INCLUDING WITHOUT LIMITATION STATEMENTS REGARDING CAPACITY, PERFORMANCE, OR
SUITABILITY FOR USE OF PRODUCTS OR SOFTWARE DESCRIBED HEREIN, SHALL BE DEEMED TO BE A
WARRANTY BY ASCENTIAL FOR ANY PURPOSE OR GIVE RISE TO ANY LIABILITY OF ASCENTIAL WHATSOEVER.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL ASCENTIAL BE LIABLE FOR ANY CLAIM, OR
ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM
LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. If you
are acquiring this software on behalf of the U.S. government, the Government shall have only "Restricted Rights" in
the software and related documentation as defined in the Federal Acquisition Regulations (FARs) in Clause
52.227.19 (c) (2). If you are acquiring the software on behalf of the Department of Defense, the software shall be
classified as "Commercial Computer Software" and the Government shall have only "Restricted Rights" as defined
in Clause 252.227-7013 (c) (1) of DFARs.
© 2004 Ascential Software Corporation. All rights reserved. DataStage®, EasyLogic®, EasyPath®, Enterprise Data
Quality Management®, Iterations®, Matchware®, Mercator®, MetaBroker®, Application Integration, Simplified®,
Ascential™, Ascential AuditStage™, Ascential DataStage™, Ascential ProfileStage™, Ascential QualityStage™,
Ascential Enterprise Integration Suite™, Ascential Real-time Integration Services™, Ascential MetaStage™, and
Ascential RTI™ are trademarks of Ascential Software Corporation or its affiliates and may be registered in the
United States or other jurisdictions.
Adobe Acrobat is a trademark of Adobe Systems, Inc. DB2, DB2 Universal Database, and IBM, and Informix are
either registered trademarks or trademarks of IBM Corporation. Microsoft, Microsoft SQL Server, Windows,
Windows NT, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the
United States and/or other countries. Oracle, Oracle8i, and Oracle9i are either registered trademarks or trademarks
of Oracle Corporation. Open Client and Sybase are either registered trademarks or trademarks of Sybase, Inc. UNIX
is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company,
Ltd. Other marks mentioned are the property of the owners of those marks.
The software delivered to Licensee may contain third-party software code. See Legal Notices (LegalNotices.pdf)
for more information.
How to Use This Guide
Dynamic Relational Stage (DRS) reads data from any DataStage stage
and writes it to one of the supported relational databases. It also reads
data from any of the supported relational databases and writes it to
any DataStage stage. It supports the following relational databases:
DB2/UDB, Informix, Microsoft SQL Server, Oracle, and Sybase. It also
supports a generic ODBC interface. Version 1.0 of DRS is compatible
with Ascential DataStage Release 7.5.1.
AudienceThis guide is intended for DataStage designers who create or modify
jobs that use DRS.
How This Book is OrganizedThe following table lists topics that may be of interest to you and it
provides links to these topics.
To learn about Read…
Functionality "Functionality" on page 2
Configuration requirements "Configuration Requirements" on page 4
Installation "Installing the Stage" on page 10
The database connection "Defining the DRS Connection" on page 10
Character set mapping "Defining Character Set Mapping" on page 12
Defining input data "Defining Input Data" on page 13
Writing data to a database "Writing Data to a Database" on page 22
Defining output data "Defining Output Data" on page 24
SQL meta tags "SQL Meta Tags" on page 31
DataStage Dynamic Relational Stage Guide iii
Related Documentation How to Use This Guide
Related DocumentationTo learn more about documentation from other Ascential products
and third-party documentation as they relate to DRS, refer to the
following sections/tables.
Ascential Software Documentation
Third-Party Documentation
Considerations for Oracle DATE data type "Oracle DATE Data Type Considerations" on page 41
Data type support "Database Data Type Support" on page 43
Handling $ and # characters "Handling $ and # Characters" on page 46
To learn about Read…
Guide Description
Ascential DataStage Designer Guide
General principles for designing jobs
Ascential DataStage Server Job Developer’s Guide
Techniques for designing server jobs
Ascential DataStage Manager Guide
Techniques for using and and maintaining the DataStage Repository
Ascential MetaStage User’s Guide Information about Ascential MetaStage™
Ascential DataStage NLS Guide Information about NLS and techniques for character-set mapping
Ascential DataStage Plug-In Installation and Configuration Guide
Information required to configure your system and install this stage
Product Guide Description
DB2/UDB DB2 Connect Quick Beginnings Guide
Infromation about DB2/UDB
Informix CLI Informix CLI Programmer’s Manual Information about Informix
Oracle Installation Guide Installation information
Oracle8 Installation Guide for AIX-Based Systems
Information about installing Oracle 8i on AIX systems
iv DataStage Dynamic Relational Stage Guide
How to Use This Guide Conventions
Conventions
Contacting SupportTo reach Customer Care, please refer to the information below:
Call toll-free: 1-866-INFONOW (1-866-463-6669)
Email: [email protected]
Ascential Developer Net: http://developernet.ascential.com
Please consult your support agreement for the location and
availability of customer support personnel.
To find the location and telephone number of the nearest Ascential
Software office outside of North America, please visit the Ascential
Software Corporation website at http://www.ascentialsoftware.com.
Convention Used for…
bold Field names, button names, menu items, and keystrokes. Also used to indicate filenames, and window and dialog box names.
user input Information that you need to enter as is.
code Code examples
variable
or
<variable>
Placeholders for information that you need to enter. Do not type the greater-/less-than brackets as part of the variable.
> Indicators used to separate menu options, such as:
Start >Programs >Ascential DataStage
[A] Options in command syntax. Do not type the brackets as part of the option.
B… Elements that can repeat.
A|B Indicator used to separate mutually-exclusive elements.
{ } Indicator used to identify sets of choices.
DataStage Dynamic Relational Stage Guide v
Contacting Support How to Use This Guide
vi DataStage Dynamic Relational Stage Guide
Contents
How to Use This GuideAudience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
How This Book is Organized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Related Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Ascential Software Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Third-Party Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Contacting Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Chapter 1The DRS Stage
Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Configuration Requirements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
General Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
Configuration Requirements for DB2/UDB. . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
Configuration Requirements for Informix . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
Configuration Requirements for Oracle 8i . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
Configuration Requirements for Oracle 9i . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
Configuration Requirements for Sybase . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
Installing the Stage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
Defining the DRS Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
Connecting to a Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11
Defining Character Set Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-12
Defining Input Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-13
About the Input Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-13
Reject Row Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-21
DataStage Dynamic Relational Stage Guide vii
Contents
Writing Data to a Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-22
Using Generated SQL Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-22
Using User-Defined SQL Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-23
Defining Output Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-24
About the Output Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-24
Using User-Defined Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-31
SQL Meta Tags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-31
Additional Information about Joins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-38
Additional Information about %Like. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-40
Oracle DATE Data Type Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-41
Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-41
Truncation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-42
Database Data Type Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-43
Handling $ and # Characters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-46
Appendix ABest Practice for Handling Long Data Types in DataStage Jobs
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
Recommended Job Design. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
The Batch Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
Minimum Job Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
Template Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-3
Other Job Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-5
The Preliminary Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-6
Specifying Column Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7
Extracting the Length. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7
Identifying the Name of the File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-8
Determining Optimal Array Size and
Maximum Length of Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-8
The ETL Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-9
Using Description as an Override to Specify Precision . . . . . . . . . . . . . . . A-9
Managing Failures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-10
viii DataStage Dynamic Relational Stage Guide
1The DRS Stage
DRS can be configured at runtime to read from or write to a number of
different supported relational database management systems using
native interfaces.
With Dynamic Relational Stage, you can:
Generate your SQL statement. (Fully generated SQL query/Column-generated SQL query)
Use a file name to contain your SQL statement. (User-defined SQL file)
Clear a table before loading using a TRUNCATE statement. (Clear table)
Choose how often to commit rows to the database. (Transaction size)
Input multiple rows of data in one call to the database. (Array size)
Read multiple rows of data in one call from the database. (Array size)
Specify transaction isolation levels for concurrency control and transaction performance tuning. (Transaction Isolation)
Specify criteria that data must meet before being selected. (WHERE clause)
Specify criteria to sort, summarize, and aggregate data. (Other clauses)
Specify the behavior of parameter marks in SQL statements. For information on the Pre 4.2 user-defined SQL behavior, see "Defining Input Data" on page 1-13 and "Defining Output Data" on page 1-24
DataStage Dynamic Relational Stage Guide 1-1
Functionality The DRS Stage
DRS lets you rapidly and efficiently prepare and load streams of
tabular data from any DataStage stage (for example, the ODBC stage,
the Sequential File stage, and so forth) to and from tables of the target
database. The Database client on Windows or UNIX uses SQL*Net to
access an Oracle server on Windows or UNIX.
Each DRS stage is a passive stage that can have any number of input,
output, and reference output links:
Input links specify the data you are writing, which is a stream of rows to be loaded into a database. You can specify the data on an input link using an SQL statement constructed by Ascential DataStage or a user-defined SQL statement.
Output links specify the data you are extracting, which is a stream of rows to be read from a database. You can also specify the data on an output link using an SQL statement constructed by Ascential DataStage or a user-defined SQL statement.
Each reference output link represents a row that is key read from a database (that is, it reads the record using the key field in the WHERE clause of the SQL SELECT statement).
FunctionalityDRS has the following functionality and benefits:
Support for the following relational databases:
– DB2/UDB
– Informix
– Microsoft SQL Server
– Oracle
– Sybase
Support for generic ODBC.
Support for transaction grouping to control a group of input links from a Transformer stage. This lets you write a set of data to a database in one transaction. DRS opens one database session per transaction group.
Support for reject row handling. Link reject variables tell the Transformer stage the DBMS error code when an error occurs in the DRS stage for insert, update, and so forth, for control of job execution. For more information on how the Transformer stage works with link reject variables, see Ascential DataStage Server Job Developer’s Guide.
1-2 DataStage Dynamic Relational Stage Guide
The DRS Stage Functionality
Support for create and drop table functionality before writing to a table.
Support for before and after SQL statements to execute user-defined SQL statements before or after the stage writes or reads into a database.
Support of stream input, stream output, and reference output links.
The ability to use the Derivation cell to specify fully-qualified column names used to construct an SQL SELECT statement for output and reference links.
Note When you select Enable case sensitive table/column name, it is your responsibility to use quotation
marks for the owner/ table.column name in the
Derivation cell to preserve any lower-case letters.
Prefetching of SELECT statement result set rows when executing a query. This minimizes server round trips and enhances performance.
Reduction of the number of network round trips (more processing is done on the client).
Support of new transparent data structures and interfaces.
Elimination of open and close cursor round trips.
Improved error handling.
Use of DRS as a supplement to existing jobs that already use the ODBC stage, rather than as a replacement for the ODBC stage.
Importing of table definitions. For more information about meta data import, see Ascential DataStage Server Job Developer’s Guide.
Support of a file name to contain your SQL statement.
Support for NLS (National Language Support). For more information, see Ascential DataStage Administrator Guide or NLS Guide.
Support for MetaStage™. For information, see Ascential MetaStage User’s Guide.
Support for foreign key meta data import. For information, see Ascential DataStage Manager Guide and Ascential DataStage Designer Guide.
Support for the behavior of parameter marks for SQL statements, which is the same as that for releases of Ascential DataStage before 4.2.
DataStage Dynamic Relational Stage Guide 1-3
Configuration Requirements The DRS Stage
Support for some specific LOB data types. See the following table:
The following functionality is not supported:
Bulk loading for stream input links. Use the appropriate Load stages to bulk load data into one databases supported by DRS. For more information respectively, see Ascential DataStage Server Job Developer’s Guide or the appropriate technical bulletin.
Stored procedures.
Support of data types not documented in the above table such as DBCLOB, FILE, LOB, LONG, LONG RAW, MSLABEL, OBJECT, RAW, REF, ROWID, or a named data type.
Configuration Requirements
General RequirementsFor general configuration requirements, see Ascential DataStage
Plug-In Installation and Configuration Guide.
Configuration Requirements for DB2/UDBThe DB2 server plug-in requires Client Application Enablers (CAEs) if
the DB2 data resides on an AS/400 system.
On the server machine, set the following system environment
variables:
PATH. Set to include the path to the DB2 bin directory.
DB2 PATH. Set to the installation directory of DB2.
Any changes to system environment variables require a system
reboot before the values of the variables take effect.
DBMS SQL_LONGVARCHAR SQL_LONGVARBINARY
Oracle CLOB BLOB (no BFILE)
DB2/UDB LONG VARCHAR LONG VARCHAR for bit data
MS SQL Server
TEXT IMAGE
1-4 DataStage Dynamic Relational Stage Guide
The DRS Stage Configuration Requirements
Configuration Requirements for InformixIf Informix is running on a UNIX platform, there may be special
configuration requirements.
If the data source uses a translation DLL, you must add INFORMIXDIR/
lib/esql to the shared library search path in dsenv. If you do not, and
your data source requires a translation DLL, the message “Unable to
load translation DLL” appears in the job log.
Informix CLI looks for an .odbc.ini file in the dshome directory, the
DataStage server engine home directory that is stored in /.dshome.
(To see the location of the DataStage installation directory, enter $cat ‘cat /.dshome‘.)
The driver field in the data source entry for the .odbc.ini file must
reference the DataDirect 4.0 non-wired ODBC driver that is included
on the DataStage CD. Other drivers are not supported.
Tru64 requires the Informix CLI ODBC driver.
Informix Connection Examples. The following example shows sample
DSN entries for AIX and HP platforms for Ascential DataStage 7.5.1:
[Informix]Driver=/u4/dsadm/Ascential/DataStage/branded_odbc/lib/VMinf19.soDescription=DataDirect InformixDatabase=<stores_demo>LogonID=<userid>Password=<password>ServerName=<informixserver>HostName=<informixhost>Service=<online>Protocol=ontlitcpEnableInsertCursors=0GetDBListFromInformix=0CursorBehavior=0CancelDetectInterval=0TrimBlankFromIndexName=1ApplicationUsingThreads=1
The following example shows sample DSN entries for Solaris
platforms for Ascential DataStage 7.5.1:
[Informix]Driver=/u4/dsadm/Ascential/DataStage/branded_odbc/lib/VMinf19.soDescription=DataDirect InformixDatabase=<stores_demo>LogonID=<userid>Password=<password>ServerName=<informixserver>HostName=<informixhost>Service=<online>;Protocol=ontlitcpEnableInsertCursors=0GetDBListFromInformix=0CursorBehavior=0
DataStage Dynamic Relational Stage Guide 1-5
Configuration Requirements The DRS Stage
CancelDetectInterval=0TrimBlankFromIndexName=1ApplicationUsingThreads=1
The following example shows sample DSN entries for Tru64 and
Linux platforms for Ascential DataStage 7.5.1:
[hpds_stores]Driver=/u1/informix/lib/cli/iclis09b.soDescription=INFORMIX 3.3 32-BITDatabase=<stores7>LogonID=<userid>pwd=<password>Servername=<hpds.1>CursorBehavior=0CLIENT_LOCALE=en_us.8859-1DB_LOCALE=en_us.8859-1TRANSLATIONDLL=/u1/informix/lib/esql/igo4a304.so
Every Informix data source to which your jobs connect must have an
entry in the .odbc.ini file. The only required fields in the data source
specification are the Database and Server name. If you choose to
include the login ID (UID) and/or password (PWD), you can leave the
User Name and Password properties blank. If you enter values for
these properties, the values in .odbc.ini are ignored. For more
information on the format of the .odbc.ini file, see Informix CLI
Programmer’s Manual.
You can also use this .odbc.ini file for other ODBC applications
including jobs using the ODBC stage.
Configuration Requirements for Oracle 8iDRS requires the following configuration:
Version 8.n of the Oracle client software on the DataStage server machine, which requires the following:
– Version 8.1.6: Oracle8i Client 8.1.6 (Installation Type: Programmer)
– Version 8.1.7: Oracle8i Client 8.1.7 (Installation Type: Programmer)
Configuration of SQL*Net using a configuration program, for example, SQL*Net Easy Configuration, to set up and add database aliases.
Building the Oracle Client Shared Library
Oracle OCI 8 requires the libclntsh.so shared library (for the Solaris
and Tru64 platforms) and libclntsh.sl (for the HP-UX 11 platform),
which is normally built during the installation of Oracle client
software. The Oracle installation accomplishes this by running the
1-6 DataStage Dynamic Relational Stage Guide
The DRS Stage Configuration Requirements
script genclntsh located in the ORACLE_HOME/bin directory, where
ORACLE_HOME is the environment variable set to the location in
which your Oracle software is installed.
The genclntsh script provided by Oracle creates a shared library that
causes errors resulting from undefined symbols. Users running
Oracle 8 or Oracle8i on specific platforms with certain versions of
Oracle client software (as documented in "UNIX Platforms" on
page 1-7 in the next section) must therefore use the genclntsh8,
genclntsh816, or genclntsh817 script provided with Ascential
DataStage to build a replacement libclntsh.so or libclntsh.sl. These
scripts are placed in a tar archive at the root level of the DataStage
installation media and replace the new libclntsh.so or libclntsh.sl in
the lib directory underneath the DataStage server installation
directory. This is the same directory where your Oracle OCI 8i stage
shared library is installed. It does not overwrite the original
libclntsh.so or libclntsh.sl in the $ORACLE_HOME/lib directory.
UNIX Platforms
The following sections specify information about library
requirements, depending on your platform.
Solaris.
A one-time site linking to build a replacement Oracle client shared
library is required for Oracle Clients 8.0.3, 8.0.4, 8.0.5, 8.1.6, and 8.1.7
on Solaris. This site linking binds your unique Oracle client
configuration into the file that is used by the Oracle OCI 8i stage to
access local and remote Oracle databases.
Before you build the Oracle client shared library, install Oracle, and set
the environment variable ORACLE_HOME to the directory where you
installed Oracle.
In order to build the library, copy the genclnt.tar file, which is located
at the root level of the DataStage installation media, into a directory
on the local hard disk and extract it before running the script.
Use the following commands to build the shared library:
# cp /cdrom/genclnt.tar # tar -xvf genclnt.tar# cd solaris# ./genclntsh8 (or ./genclntsh816 or ./genclntsh817)
Tru64
A one-time site linking to build a replacement Oracle client shared
library is required for Oracle Client 8.1.6, and 8.1.7 on Tru64. This is
required to expose the necessary direct path loading API symbols.
DataStage Dynamic Relational Stage Guide 1-7
Configuration Requirements The DRS Stage
Before you build the Oracle shared library, install Oracle and set the
environment variable ORACLE_HOME to the directory where you
installed Oracle.
In order to build the library, you must copy the GENCLNT.TAR;1 file,
which is located at the root level of the DataStage installation media,
into a directory on the local hard disk and extract it before running the
script.
Use the following commands to build the shared library for version
8.1.6:
# cp /cdrom/GENCLNT.TAR;1 ./genclnt.tar# tar -xvf genclnt.tar# cd tru64# ./genclntsh816
Use the following commands to build the shared library for version
8.1.7:
# cp /cdrom/GENCLNT.TAR;1 ./genclnt.tar# tar -xvf genclnt.tar# cd tru64# ./genclntsh817
HP-UX 11
A one-time site linking to build a replacement Oracle client shared
library is required for Oracle Client 8.1.6 and 8.1.7 on HP-UX 11. This
site linking binds your unique Oracle client configuration into the file
that is used by the Oracle OCI 8i stage to access local and remote
Oracle databases.
Note Oracle 8.0.n is not supported for HP-UX.
Before you build the Oracle client shared library:
1 Install Oracle
2 Set the ORACLE_HOME environment variable to the directory where you installed Oracle
3 Set the SHLIB_PATH shared library path environment variable to reference $ORACLE_HOME/lib.
To build the library, copy the GENCLNT.TAR;1 file, which is located at
the root level of the DataStage installation media, into a directory on
the local hard disk and extract it before running the script.
Use the following sequence of commands to build the shared library
for version 8.1.6:
# cp /cdrom/GENCLNT.TAR;1 ./genclnt.tar# tar -xvf genclnt.tar# cd hpux# ./genclntsh816 (or ./genclntsh817)
1-8 DataStage Dynamic Relational Stage Guide
The DRS Stage Configuration Requirements
Use the same sequence of commands to build the shared library for
version 8.1.7, replacing 816 with 817 in the last command.
AIX 4.3
You must set the LINK_CNTRL environment variable to the value
L_PTHREADS_D7 before installing the Oracle 8.0 client software to
prevent this incompatibility.
Otherwise, you must rerun the genclntsh script provided by Oracle to
generate a new shared library after setting this environment variable
to the proper value. In particular, you must set this environment
variable before:
Installing 8.0.3 on AIX 4.3
Upgrading an existing version of Oracle to Release 8.0.3RR
Relinking executables
Installing new Oracle patches
Oracle 8.0.5 contains the same text in Installation Guide in the
“Special Considerations for AIX 4.3” section. This guide can also be
found on the Oracle web site referenced earlier, as can Oracle8
Installation Guide for AIX-Based Systems. You also must set the
environment variable also before relinking any applications that link
against Oracle 8.0.3 libraries.
Install Oracle, and set the environment variable ORACLE_HOME to the
directory where you installed Oracle. On AIX 4.3, set the environment
variable LINK_CNTRL to the value L_PTHREADS_D7.
Configuration Requirements for Oracle 9iOracle OCI 9i requires the following configuration:
Version 9.n of the Oracle client software on the DataStage server machine, which requires the following:
– Version 9i: Oracle9i Client (runtime)
Note AIX 5.1 requires version 9.2 or later of the Oracle
client software.
Configuration of SQL*Net using a configuration program, for example, SQL*Net Easy Configuration, to set up and add database aliases.
DataStage Dynamic Relational Stage Guide 1-9
Installing the Stage The DRS Stage
Configuration Requirements for SybaseIf you experience any timeout issues, increase the default values for
the Sybase configuration parameters CS_RETRY_COUNT and
CS_TIMEOUT_VALUE to 10 or higher.
Installing the Stage For instructions and information supporting the installation, see
Ascential DataStage Plug-In Installation and Configuration Guide.
Defining the DRS Connection When you use the client GUI to edit an DRS stage, the DRS stage
dialog box appears:
This dialog box can have up to three pages (depending on whether
there are inputs to and outputs from the stage):
Stage. This page displays the name of the stage you are editing. The General tab defines the database source and logon information to connect to the database. For details see the following section, "Connecting to a Database" on page 1-11
The NLS tab defines a character set map to be used with the
stage. (The NLS tab appears only if you have installed NLS.) For
details, see "Defining Character Set Mapping" on page 1-12
1-10 DataStage Dynamic Relational Stage Guide
The DRS Stage Defining the DRS Connection
Input. This page is displayed only if you have an input link to this stage. It specifies the SQL table to use and the associated column definitions for each data input link. This page also specifies the type of update action and transaction isolation level information for concurrency control and performance tuning. It also contains the SQL statement used to write the data and lets you enable case sensitivity for SQL statements.
Output. This page is displayed only if you have an output link to this stage. It specifies the SQL tables to use and the associated column definitions for each data output link. This page also specifies the type of query and transaction isolation level information for concurrency control and performance tuning. It also contains the SQL SELECT statement used to extract the data, and lets you enable case sensitivity for SQL statements.
To edit an DRS stage from the DRS stage dialog box:
1 Define the connection (see the following section).
2 Optional. Define a character set map (see "Defining Character Set Mapping" on page 1-12).
3 Define the data on the input links (see "Defining Input Data" on page 1-13).
4 Define the data on the output links (see "Defining Output Data" on page 1-24).
Connecting to a DatabaseSet the connection parameters on the General tab on the Stage page
of the GUI. To connect to one of the supported databases:
1 Identify the Database type by selecting a database management system from the DBMS Type drop-down list. Alternatively select Use Job Parameter…. Provide the name of the job parameter in the box provided in the Use Job Parameter dialog box. Database type is required.
2 Enter the name of the database alias to access in the Connection name field.
– DB2/UDB. The database name is configured with the DB2 Client Configuration Assistant. The database name is all that is needed to determine the location of the target database on the network.
– Informix. This is the Informix data source.
Windows. Define using the ODBC Administrator.
HP-UX 11.0 UNIX platform. Define data sources (DSN) for
Informix databases in the .odbc.ini file.
DataStage Dynamic Relational Stage Guide 1-11
Defining Character Set Mapping The DRS Stage
– Microsoft SQL Server. This is an ODBC DSN.
– Oracle. This is the name you created using the Oracle Configuration Assistant.
– Sybase. This is either the Sybase server name or a combination of the Sybase server name and database name in the format server.database.
If you provide only the server name, you are given access to
your default database. If you include a database name, you
must be a valid user in that database.
3 Enter the user name to use to connect to the database in the User ID field. This user must have sufficient privileges to access the specified database and source and target tables. This field is required. There is no default.
4 Enter the password that is associated with the specified user name to use in the Password field. This field is required. There is no default.
5 Enter an optional description of the DRS stage in the Description field.
Defining Character Set MappingYou can define a character set map for a stage. Do this from the NLS
tab on the Stage page. The NLS tab appears only if you have
installed NLS.
Specify information using the following fields:
Map name to use with stage. Defines the default character set map for the project or the job. You can change the map by selecting a map name from the list.
Show all maps. Lists all the maps that are shipped with Ascential DataStage.
Loaded maps only. Lists only the maps that are currently loaded.
Use Job Parameter…. Specifies parameter values for the job. Use the format #Param#, where Param is the name of the job parameter. The string #Param# is replaced by the job parameter when the job is run.
For more information about NLS or job parameters, see Ascential
DataStage NLS Guide or Ascential DataStage Designer Guide.
1-12 DataStage Dynamic Relational Stage Guide
The DRS Stage Defining Input Data
Defining Input DataWhen you write data to a table in a database, the DRS stage has an
input link.
The properties of this link and the column definitions of the data are
defined on the Input page in the DRS stage dialog box of the GUI.
About the Input Page The Input page has an Input name field; the General, Columns,
and SQL tabs; and the Columns… and View Data… buttons:
Input name. The name of the input link. Choose the link you want to edit from the Input name list. This list displays all the input links to the DRS stage.
Click Columns… to display a brief list of the columns designated on the input link. As you enter detailed meta data in the Columns tab, you can leave this list displayed.
Click View Data… to invoke the Data Browser. This lets you look at the data associated with the input link in the database. For a description of the Data Browser, see Ascential DataStage Designer Guide.
General Tab
This tab is displayed by default. It contains the following fields:
Table name. This required field is editable when the update action is not User-defined SQL (otherwise, it is read-only). It is the name of the target database table the data is written to, and the table must exist or be created by choosing generate DDL
DataStage Dynamic Relational Stage Guide 1-13
Defining Input Data The DRS Stage
from the Create table action list. You must have insert, update, or delete privileges, depending on input mode. You must specify Table name if you do not specify User-defined SQL. There is no default.
Click … (Browse button) to browse the Repository to select the
table.
Update action. Specifies which SQL statements are used to update the target table. Some update actions require key columns to update or delete rows. There is no default. Choose the option you want from the list:
– Clear table then insert rows. Deletes the contents of the table and adds the new rows, with slower performance because of transaction logging. This is the default value.
– Truncate table then insert rows. Truncates the table with no transaction logging and faster performance. For DB2/UDB and Informix, this option is the same as Clear table then insert rows.
– Insert rows without clearing. Inserts the new rows in the table.
– Delete existing rows only. Deletes existing rows in the target table that have identical keys in the source files.
– Replace existing rows completely. Deletes the existing rows, then adds the new rows to the table.
– Update existing rows only. Updates the existing data rows. Any rows in the data that do not exist in the table are ignored.
– Update existing rows or insert new rows. Updates the existing data rows before adding new rows. It is faster to update first when you have a large number of records.
– Insert new rows or update existing rows. Inserts the new rows before updating existing rows. It is faster to insert first if you have only a few records.
– Insert new rows only. Inserts the new rows in the table but does not report duplicate rows to the log.
– Truncate only. Ignores any incoming data and truncates the target table.
1-14 DataStage Dynamic Relational Stage Guide
The DRS Stage Defining Input Data
Important A stage cannot stand alone on the canvas.
Configure a dummy source stage as illustrated
below.
To keep the process simple and to preserve
the resource, make sure the dummy source
stage does not have any source data.
– User-defined SQL. Writes the data using a user-defined SQL statement, which overrides the default SQL statement generated by the stage. If you choose this option, you enter the SQL statement on the SQL tab. See "Using User-Defined SQL Statements" on page 1-23 for details.
– User-defined SQL file. Reads the contents of the specified file to write the data.
Transaction Isolation. Provides the necessary consistency and concurrency control between transactions in the job and other transactions for optimal performance. For more information on using these levels, see your database documentation. Use one of the following transaction isolation levels:
– Read Uncommitted. Takes exclusive locks on modified data. These locks are held until a commit or rollback is executed. However, other transactions can still read but not modify the uncommitted changes. No other locks are taken.
– Read Committed. Takes exclusive locks on modified data and sharable locks on all other data. This is the default. Exclusive locks are held until a commit or rollback is executed. Uncommitted changes are not readable by other transactions. Shared locks are released immediately after the data has been processed, allowing other transactions to modify it.
DataStage Dynamic Relational Stage Guide 1-15
Defining Input Data The DRS Stage
– Repeatable Read. Identical to serializable except that phantom rows may be seen.
– Serializable. Takes exclusive locks on modified data and sharable locks on all data. All locks are held until a commit or rollback is executed, preventing other transactions from modifying any data that has been referenced during the transaction.
Note Not all platforms support all four levels of transaction
isolation, and those that do have different names for
them. DRS maps the value for the property to the
closest supported value defined for the database
platform selected. See the applicable database
documentation to determine what levels are supported.
Array size. Specifies the number of rows to be transferred in one call between Ascential DataStage and the database before they are written. Enter a positive integer to indicate how often the database management system performs writes to the database. The default value is 1; that is, each row is written in a separate statement.
Larger numbers use more memory on the client to cache the
rows. This minimizes server round trips and maximizes
performance by executing fewer statements. If this number is too
large, the client may run out of memory.
Array size has implications for handling of reject rows. See "Reject
Row Handling" on page 1-21.
Transaction size. Specifies the number of rows written before a commit is executed in the database. A value of 0 causes all rows in the job to be written as a single transaction.
Create table action. Creates the target table in the specified database if Generate DDL is selected. It uses the column definitions in the Columns tab and the table name. You must have CREATE TABLE privileges on your schema.
Choose one of the following options to create the table:
– Do not create target table. Specifies that the target table is not created, and the Drop table action field and the Create Table Properties button on the right of the dialog are disabled.
– Generate DDL. Specifies that the stage generates the CREATE TABLE statement using information from Table name, the column definitions grid, and the values in the Create Table Properties dialog box.
1-16 DataStage Dynamic Relational Stage Guide
The DRS Stage Defining Input Data
Click the button to open the Create Table Properties dialog
box to display the table space and storage expression values
for generating the DDL.
Drop table action. Drops the target table before it is created by the stage if Generate DDL is selected. This field is disabled if you choose not to create the target table. The list displays the same items as the Create table action list except that they apply to the DROP TABLE statement. You must have DROP TABLE privileges on your schema.
Treat warning message as fatal error. Determines the behavior of the stage when an error is encountered while writing data to a table. If the check box is selected, a warning message is logged as fatal, and the job aborts.
If the check box is cleared (the default), three warning messages
are logged in the Ascential DataStage Director log file, and the job
continues.
Description. Contains an optional description of the input link.
Columns Tab
This tab contains the column definitions for the data written to the
table.
The column definitions appear in the same order as in the Columns
grid:
The Columns tab behaves the same way as the Columns tab in the
ODBC stage. For a description of how to enter and edit column
definitions, see Ascential DataStage Designer Guide.
DataStage Dynamic Relational Stage Guide 1-17
Defining Input Data The DRS Stage
SQL Tab
The SQL tab contains the Generated, User-defined, Before, After, Generated DDL, and User-defined DDL tabs.
Use these tabs to display the stage-generated SQL statement and the
SQL statement that you can enter.
Generated. Contains the SQL statements constructed by Ascential DataStage that are used to write data to a database. The SQL statements are based on the current values of the stage and link properties. You cannot edit these statements, but you can use Copy to copy them to the Clipboard for use elsewhere, for example, user-defined SQL statements. This tab is displayed by default.
User-defined. Contains the SQL INSERT, DELETE, or UPDATE statements executed to write data to a database.
1-18 DataStage Dynamic Relational Stage Guide
The DRS Stage Defining Input Data
Note For information about SQL Meta Tags, see "SQL Meta Tags"
on page 1-31.
Before. Contains the SQL statements executed before the stage processes any job data rows.
The SQL statement on the Before tab is the first SQL statement to
be executed. Depending on your choice, the job can continue or
abort after failing to execute a Before statement. It does not affect
the transaction grouping scheme. The commit or rollback is
performed on a per-link basis.
Each SQL statement is executed as a separate transaction if the
statement separator is a double semi-colon ( ;; ). All SQL
statements are executed in a single transaction if a semi-colon ( ; )
is the separator.
If the property value begins with FILE=, the remaining text is
interpreted as a pathname, and the contents of the file supplies
the property value.
Note For information about SQL Meta Tags, see "SQL Meta
Tags" on page 1-31.
Treat errors as non-fatal. If selected, errors caused by Before SQL
are logged as warnings, and processing continues with the next
command batch. Each separate execution is treated as a separate
transaction. If cleared, errors are treated as fatal to the job, and
result in a transaction rollback. The transaction is committed only
if all statements successfully execute.
DataStage Dynamic Relational Stage Guide 1-19
Defining Input Data The DRS Stage
After. Contains the SQL statements executed after the stage processes the job data rows.
The SQL statement on the After tab is the last SQL statement to
be executed. Depending on your choice, the job can continue or
abort after failing to execute an After SQL statement. It does not
affect the transaction grouping scheme. The commit or rollback is
performed on a per-link basis.
Each SQL statement is executed as a separate transaction if the
statement separator is a double semi-colon ( ;; ). All SQL
statements are executed in a single transaction if a semi-colon ( ; )
is the separator.
If the property value begins with FILE=, the remaining text is
interpreted as a pathname, and the contents of the file supplies
the property value.
The behavior of Treat errors as non-fatal is the same as for Before.
Note For information about SQL Meta Tags, see "SQL Meta
Tags" on page 1-31.
1-20 DataStage Dynamic Relational Stage Guide
The DRS Stage Defining Input Data
Generated DDL. Select Generate DDL or User-defined DDL from the Create table action field on the General tab to enable this tab.
The CREATE TABLE statement field displays the CREATE
TABLE statement that is generated from the column meta data
definitions and the information provided on the Create Table Properties dialog box. If you select an option other than Do not drop target table from the Drop table action list, the DROP statement field displays the generated DROP TABLE statement
for dropping the target table.
User-defined DDL. Reserved for a future release.
Reject Row HandlingDuring input link processing, rows of data may be rejected by the
database for various reasons, such as unique constraint violations or
data type mismatches.
The DRS stage writes the offending row to the log for the DataStage
job. For the database message detail, you must use the error
messages returned by the database.
Ascential DataStage provides additional reject row handling. To use
this capability:
1 Set Array Size to 1.
2 Use a Transformer stage to redirect the rejected rows.
You can then design your job by choosing an appropriate target for
the rejected rows, such as a Sequential stage. Reuse this target as an
DataStage Dynamic Relational Stage Guide 1-21
Writing Data to a Database The DRS Stage
input source once you resolve the issues with the offending row
values.
Writing Data to a DatabaseThe following sections describe the differences when you use
generated or user-defined SQL INSERT, DELETE, or UPDATE
statements to write data from Ascential DataStage to a database.
Using Generated SQL Statements By default, Ascential DataStage writes data to a database table using
an SQL INSERT, DELETE, or UPDATE statement that it constructs. The
generated SQL statement is automatically constructed using the table
and column definitions that you specify in the input properties for this
stage. The SQL tab displays the SQL statement used to write the data.
To use a generated SQL statement:
1 Enter a table name in the Table name field on the Input page.
2 Specify how you want the data to be written by choosing a suitable option from the Update action list. Choose one of these options for a generated statement:
– Clear table then insert rows
– Truncate table then insert rows
– Insert rows without clearing
– Delete existing rows only
– Replace existing rows completely
– Update existing rows only
– Update existing rows or insert new rows
– Insert new rows or update existing rows
– User-defined SQL
– User-defined SQL file
See "Defining Input Data" on page 1-13 for a description of each
update action.
3 Enter an optional description of the input link in the Description field.
4 Click the Columns tab on the Input page. The Columns tab appears.
1-22 DataStage Dynamic Relational Stage Guide
The DRS Stage Writing Data to a Database
5 Edit the Columns grid to specify column definitions for the columns you want to write.
The SQL statement is automatically constructed using your
chosen update action and the columns you have specified.
6 Click the SQL tab on the Input page, then the Generated tab to view this SQL statement. You cannot edit the statement here, but you can click this tab at any time to select and copy parts of the generated statement to paste into the user-defined SQL statement.
7 Click OK to close the DRS Stage dialog box. Changes are saved when you save your job design.
Using User-Defined SQL Statements Instead of writing data using an SQL statement constructed by
Ascential DataStage, you can enter your own SQL INSERT, DELETE, or
UPDATE statement for each DRS input link. (You can include other
SQL statements such as CREATE TABLE only in a Before SQL
statement.) Ensure that the SQL statement contains the table name,
the type of update action you want to perform, and the columns you
want to write.
To enter a user-defined SQL statement:
1 Click the User-defined tab on the SQL tab.
2 Enter the SQL statement you want to use to write data to the target database tables. This statement must contain the table name, the type of update action you want to perform, and the columns you want to write.
When writing data, the INSERT statements must contain a
VALUES clause with the question mark (?) parameter marker for
each stage input column. All user-defined SQL is passed
unchanged to the underlying DBMS except for the parameter
marker. Therefore, the syntax must be correct for that DBMS.
UPDATE statements must contain SET clauses with parameter
markers for each stage input column. UPDATE and DELETE
statements must contain a WHERE clause with parameter markers
for the primary key columns. The parameter markers must be in
the same order as the associated columns listed in the stage
properties. For example:
insert emp (emp_no, emp_name) values (?, ?)
If you specify two SQL statements, they are executed as one
transaction. Do not use a trailing semicolon.
You cannot call stored procedures as there is no facility for
parsing the row values as parameters.
DataStage Dynamic Relational Stage Guide 1-23
Defining Output Data The DRS Stage
Unless you specify a user-defined SQL statement, the stage
automatically generates an SQL statement.
3 Click OK to close the DRS Stage dialog box. Changes are saved when you save your job design.
Defining Output DataOutput links specify the data you are extracting from a database. You
can also specify the data on an output link using an SQL statement
constructed by Ascential DataStage or a user-defined SQL statement.
The following sections describe the differences when you use SQL
SELECT statements for generated or user-defined queries that you
define on the Output page in the DRS stage dialog box of the GUI.
About the Output PageThe Output page has one field and the General, Columns, Selection, and SQL tabs.
Output name. The name of the output link. Choose the link you want to edit from the Output name list. This list displays all the output links from the DRS stage.
The Columns… and the View Data… buttons function like those on the Input page.
1-24 DataStage Dynamic Relational Stage Guide
The DRS Stage Defining Output Data
General Tab
This tab is displayed by default. It contains the following parameters:
Table names (SQL FROM clause). Specifies the name of the source table. Use comma-separated table names to indicate an inner join. Use outer-join SQL Meta Tags to define an outer join.
Browse. Allows you to work with the DataStage Repository to obtain column information.
Transaction Isolation. Specifies the transaction isolation levels that provide the necessary consistency and concurrency control between transactions in the job and other transactions for optimal performance. For more information on using these levels, see your database documentation. Use one of the following transaction isolation levels:
– Read Uncommitted. Takes exclusive locks on modified data. These locks are held until a commit or rollback is executed. However, other transactions can still read but not modify the uncommitted changes. No other locks are taken.
– Read Committed. Takes exclusive locks on modified data and sharable locks on all other data. This is the default. Exclusive locks are held until a commit or rollback is executed. Uncommitted changes are not readable by other transactions. Shared locks are released immediately after the data has been processed, allowing other transactions to modify it.
– Repeatable Read. Identical to serializable except that phantom rows may be seen.
– Serializable. Takes exclusive locks on modified data and sharable locks on all data. All locks are held until a commit or rollback is executed, preventing other transactions from modifying any data that has been referenced during the transaction.
Note Not all platforms support all four levels of transaction
isolation, and those that do have different names for
them. DRS maps the value for the property to the
closest supported value defined for the database
platform selected. See the applicable database
documentation to determine what levels are supported.
Array size. Specifies the number of rows read from the database at a time. Enter a positive integer to indicate the number of rows to prefetch in one call. The default value 1 means that prefetching is turned off.
Larger numbers use more memory on the client to cache the
rows. This minimizes server round trips and maximizes
DataStage Dynamic Relational Stage Guide 1-25
Defining Output Data The DRS Stage
performance by executing fewer statements. If this number is too
large, the client may run out of memory.
Query type. Specifies which SQL statements are used to read the target table. Choose the option you want from the list:
– Generated SQL query. Specifies that the data is extracted using an SQL statement constructed by Ascential DataStage. When you click the SQL tab, you can view this SQL statement. You cannot edit the statement here, but you can click this tab at any time to select and copy parts of the generated statement to paste into the user-defined SQL statement.
– User-defined SQL query. Specifies that the data is extracted using a user-defined SQL query. With this choice, you can edit the SQL statements.
– User-defined SQL query file. Specifies that the data is extracted using the SQL query in the pathname of the designated file that exists on the server. Enter the pathname for this file instead of the text for the query. With this choice, you can edit the SQL statements.
Description. Contains an optional description of the output link.
Columns Tab
The column tab page behaves the same way as the Columns tab in
the ODBC stage, and it specifies which columns are aggregated. For a
description of how to enter and edit column definitions, see Ascential
DataStage Designer Guide.
The column definitions for output links contain a key field. Key fields
are used to join primary and reference inputs to a Transformer stage.
For a reference output link, the DRS key reads the data by using a
1-26 DataStage Dynamic Relational Stage Guide
The DRS Stage Defining Output Data
WHERE clause in the SQL SELECT statement. For details on how key
fields are specified and used, see Ascential DataStage Designer
Guide.
The Derivation cell on the Columns tab contains fully-qualified
column names when table definitions are loaded from the DataStage
Repository. If the Derivation cell has no value, DRS uses only the
column names to generate the SELECT statement displayed in the
Generated tab of the SQL tab. Otherwise, it uses the content of the
Derivation cell. Depending on the format used in the Repository, the
format is owner.table.name.columnname or tablename.columnname.
The column definitions for reference links require a key field. Key
fields join reference inputs to a Transformer stage. DRS key reads the
data by using a WHERE clause in the SQL SELECT statement.
Selection Tab
Use this tab to build column-generated SQL queries. It contains
optional SQL clauses for the conditional extraction of data.
The Clauses tab is divided into two panes.
WHERE clause. Allows you to insert an SQL WHERE clause to specify criteria that the data must meet before being selected.
Note For information about SQL Meta Tags, see "SQL Meta
Tags" on page 1-31.
Other clauses. Allows you to insert a GROUP BY, HAVING, or ORDER BY clause to sort, summarize, and aggregate data.
DataStage Dynamic Relational Stage Guide 1-27
Defining Output Data The DRS Stage
SQL Tab
The SQL tab contains the Generated, User-defined, Before, and
After tabs.
Use these tabs to display the stage-generated SQL statement and the
SQL statement that you can enter.
Generated. Displays the SQL statements that read data from a database. You cannot edit these statements, but you can use Copy to copy them to the Clipboard for use elsewhere.
User-defined. Contains the user-defined SQL SELECT statement executed to read data from a database. With this choice, you can edit the SQL statements.
Note For information about SQL Meta Tags, see "SQL Meta
Tags" on page 1-31.
1-28 DataStage Dynamic Relational Stage Guide
The DRS Stage Defining Output Data
Before. Contains the SQL statements executed before the stage processes any job data rows.
The Before is the first SQL statement to be executed, and you can
specify whether the job continues or aborts after failing to execute
a Before SQL statement. The commit/rollback is performed on a
per-link basis.
If the property value begins with FILE=, the remaining text is
interpreted as a pathname, and the contents of the file supply the
property value.
Note For information about SQL Meta Tags, see "SQL Meta
Tags" on page 1-31.
After. Contains the After SQL statement executed after the stage processes any job data rows.
DataStage Dynamic Relational Stage Guide 1-29
Defining Output Data The DRS Stage
It is the last SQL statement to be executed, and you can specify
whether the job continues or aborts after failing to execute an
After SQL statement. The commit/rollback is performed on a per-
link basis.
Ascential DataStage extracts data from a data source using an SQL
SELECT statement that it constructs. The SQL statement is
automatically constructed using the table and column definitions that
you entered in the stage output properties.
SQL SELECT statements have the following syntax:
SELECT clause FROM clause
[WHERE clause][GROUP BY clause][HAVING clause][ORDER BY clause];
When you specify the tables to use and the columns to be output from
the DRS stage as well as the selection criteria, the SQL SELECT
statement is automatically constructed and can be viewed by clicking
the SQL tab on the Output page.
The SELECT and FROM clauses are the minimum required and are
automatically generated by Ascential DataStage. However, you can
use any of these SQL SELECT clauses:
For more information about these clauses, see Ascential DataStage
Server Job Developer’s Guide.
SELECT clause Specifies the columns to select from the
database.
FROM clause Specifies the tables containing the selected
columns.
WHERE clause Specifies the criteria that rows must meet
to be selected. Use it also to join two or
more tables and limit the rows selected.
GROUP BY clause Groups rows to summarize results.
HAVING clause Specifies the criteria that grouped rows
must meet to be selected.
ORDER BY clause Sorts selected rows.
1-30 DataStage Dynamic Relational Stage Guide
The DRS Stage SQL Meta Tags
Using User-Defined QueriesInstead of using the SQL statement constructed by Ascential
DataStage, you can enter your own SQL statement for each DRS
output link.
To enter a user-defined SQL query:
1 Choose User-defined SQL query from the Query type list on the General tab on the Output page. The User-defined tab on the SQL tab is enabled.
You can then edit or drag and drop the selected columns into your
user-defined SQL statement. Only one SQL statement is supported
for an output link.You must ensure that the table definitions for the
output link are correct and represent the columns that are
expected.
2 Click OK to close the DRS stage dialog box. Changes are saved when you save your job design.
SQL Meta TagsSQL Meta Tags are unique functions that enhance or modify platform-
specific SQL statements. When used in SQL statements, they are
translated into native database SQL before the SQL is executed by the
stage. They can be applied anywhere you define the SQL code:
User-defined tab on the Input page (see page 1-18)
Before tab on the Input page (see page 1-19)
After tab on the Input page (see page 1-20)
Table names (SQL FROM clause) of the General tab on the Output page (see page 1-25)
WHERE clause of the Selection tab on the Output page (see page 1-27)
User-defined tab on the Output page (see page 1-28)
Before tab on the Output page (see page 1-29)
After tab on the Output link (see page 1-29)
DataStage Dynamic Relational Stage Guide 1-31
SQL Meta Tags The DRS Stage
The following table describes the available SQL Meta Tags.
The Tag Description
%Abs(x) Returns a decimal value equal to the absolute value of anumber x.
Input arguments: x is numeric1.
Output data type: Numeric.
%Coalesce(expr1, expr2, …) Returns the first nonnull argument provided to the function.
Input arguments: expr1, expr2, … are expressions ofany type.
Output data type: An expression of any type.
%Concat At runtime, the %Concat SQL Meta Tag is replaced by the string concatenation operator appropriate for the specific relational database. For example, in DB2, the %Concat SQL Meta Tag is replaced with CONCAT, whilein Sybase it is replaced with a "+".
This SQL Meta Tag is supported with the same limitations as the native concatenation operator on thespecific relational database. For example, some platforms allow you to concatenate a string with a numeric value; others flag this as an error. Ascential DataStage makes no attempt to check or convert the data types of either of the operands.
Input arguments: None.
Output data type: String2.
%CurrentDateIn Expands to a platform-specific SQL substring representing the current date in the WHERE clause of aSQL SELECT or UPDATE statement or when the currentdate is passed in an INSERT statement.
Input arguments: None.
Output data type: DATE.
%CurrentDateOut Expands to platform-specific SQL for the current date inthe SELECT clause of an SQL query.
Input arguments: None.
Output Data Type: (Date) String literal.
%CurrentDateTimeIn Expands to a platform-specific SQL substring representing the current datetime in the WHERE clauseof a SQL SELECT or UPDATE statement or when the current date time is passed in an INSERT statement.
Input arguments: None.
Output data type: TIMESTAMP.
1-32 DataStage Dynamic Relational Stage Guide
The DRS Stage SQL Meta Tags
%CurrentDateTimeOut Expands to a platform-specific SQL for the current datetime in the SELECT clause of an SQL query.
Input arguments: None.
Output data type: (Timestamp) String literal.
%CurrentTimeIn Expands to a platform-specific SQL substring representing the current time in the WHERE clause of aSQL SELECT or UPDATE statement or when the currenttime is passed in an INSERT statement.
Input arguments: None.
Output data type: TIME.
%CurrentTimeOut Expands to a platform-specific SQL for the current timein the SELECT clause of an SQL query.
Input arguments: None.
Output data type: (Time) String literal.
%DateAdd(date_from, add_days) Returns a date by adding add_days to date_from. add_days can be negative.
Input arguments: date_from is a DATE; add_days is apositive or negative INTEGER literal expression.
Output data type: DATE.
%DateDiff(date_from, date_to) Returns an integer representing the difference betweentwo dates in number of days.
Input arguments: date_from is a DATE; date_to is a DATE.
Output data type: INTEGER.
%DateIn(dt) Expands into platform-specific SQL syntax for the date.%DateIn should be used whenever a date literal or Datebind variable is used in a comparison in the WHERE clause of a SELECT or UPDATE statement or when a Date value is passed in an INSERT statement.
Input arguments: dt is a DATE or date string literal inYYYY-MM-DD format.
Output data type: DATE.
%DateOut(dt) Expands to a platform-specific SQL substring representing dt in the SELECT clause of an SQL query.
Input arguments: dt is a DATE.
Output data type: (Date) String literal.
%DatePart(DTTM_Column) Returns the date portion of the specified datetime column.
Input arguments: DTTM_Column is a TIMESTAMP.
Output data type: DATE.
The Tag Description
DataStage Dynamic Relational Stage Guide 1-33
SQL Meta Tags The DRS Stage
%DateTimeDiff(datetime_from,datetime_to)
Returns a time value representing the difference between two datetimes in minutes.
Input arguments: datetime_from and datetime_to areTIMESTAMP.
Output data type: TIME.
%DateTimeIn(dtt) Expands to platform-specific SQL for a DateTime valuein the WHERE clause of a SQL SELECT or UPDATE statement or when a datetime value is passed in an INSERT statement.
Input arguments: dtt is a TIMESTAMP or Timestamp string literal in the form YYYY-MM-DD-hh.mm.ss.ssssss.
Output data type: TIMESTAMP.
%DateTimeOut(datetime_col) Expands to a platform-specific SQL substring representing datetime_col in the SELECT clause of an SQL query.
Input arguments: datetime_col is a TIMESTAMP.
Output data type: (Timestamp) String literal.
%DecDiv(a,b) Returns a floating number representing the value of a divided by b, where a and b are numeric expressions.
Input arguments: a and b are numeric expressions.
Output data type: Recommendation is to use a character type.
%DecMult(a,b) Returns a floating number representing the value of a multiplied by b, where a and b are numeric expressions.
Input arguments: a and b are numeric expressions.
Output data type: Recommendation is to use a character type.
%DTTM(date, time) Combines the database date represented by the date value with the database time in the time value and returns a database timestamp value.
Input arguments: date is a DATE; time is a TIME.
Output data type: TIMESTAMP.
The Tag Description
1-34 DataStage Dynamic Relational Stage Guide
The DRS Stage SQL Meta Tags
.
.
.
%FullJoin(TableName1OrJoin, [tableAlias1], tableName2OrJoin, [tableAlias2], joinCondition)
Produces a full outer join. Use in Table name (SQL FROM clause) on the General tab of the Output pageor in the FROM clause on the User-defined tab of the SQL tab of the Output page.
See "Additional Information about Joins" on page 1-38
Input arguments: TableName1OrJoin is the name of the first table or nested join in the join; tableAlias1 is analias for the first table in the join and is optional; tableName2OrJoin is the name of the second table or nested join in the join; tableAlias2 is an alias for the second table in the join and is optional; joinCondition isthe expression describing the join condition.
Output data types: Any, as defined by the columns inthe joined table.
%LeftJoin(TableName1OrJoin, [tableAlias1], tableName2OrJoin, [tableAlias2], joinCondition)
Produces a left outer join. Use in Table name (SQL FROM clause) on the General tab of the Output pageor in the FROM clause on the User-defined tab of the SQL tab of the Output page.
See "Additional Information about Joins" on page 1-38
Input arguments: TableName1OrJoin is the name of the first able or nested join in the join; tableAlias1 is analias for the first table in the join and is optional; tableName2OrJoin is the name of the second table or nested join in the join; tableAlias2 is an alias for the second table in the join and is optional; joinCondition isthe expression describing the join condition.
Output data types: Any, as defined by the columns inthe joined table.
%Like("Literal") Expands to look for literal values. Use this meta-SQL when looking for like values. A % is appended to the literal.
%Like generates the following:
like 'literal%'If the literal value contains '\_' or '\%', %Like generates the following:
like 'literal%' escape '\'See "Additional Information about %Like" on page 1-40
Input arguments: Literal is a string.
Output data type: String.
The Tag Description
DataStage Dynamic Relational Stage Guide 1-35
SQL Meta Tags The DRS Stage
.
%LikeExact(fieldname, "Literal") Expands to look for literal values. Use this when exact matches are necessary, taking into account wildcards inthe literal values.
%LikeExact generates one of the following:
If the literal contains no wildcards:
fieldname = 'literal'If the literal ends with the '%' wildcard:
fieldname like 'literal' [escape '\']Some platforms require that you use RTRIM to get the correct value. The following characters are wildcards even when preceded with the backslash '\' escape character.
%
_
Therefore, on some platforms, the literal must end witha '%' wildcard that is not preceded by '\' as in the following example.
literal = 'ABC%' In the above example you do not need RTRIM on any platform.
literal = 'ABC\%' In the above example, you need RTRIM on SQL Server and DB2/UDB.
Input arguments: fieldname and Literal are strings.
Output data type: String
%NumToChar(Number) Transforms a numeric value into a character value. Spaces are trimmed from Number.
Input arguments: Number is a numeric.
Output data type: CHAR.
%RightJoin(TableName1OrJoin, [tableAlias1], tableName2OrJoin, [tableAlias2], joinCondition)
Produces a right outer join. Use in Table name (SQL FROM clause) on the General tab of the Output pageor in the FROM clause on the User-defined tab of the SQL tab of the Output page.
See "Additional Information about Joins" on page 1-38
Input arguments: TableName1OrJoin is the name of the first table or nested join in the join; tableAlias1 is analias for the first table in the join and is optional; tableName2OrJoin is the name of the second table or nested join in the join; tableAlias2 is an alias for the second table in the join and is optional; joinCondition isthe expression describing the join condition.
Output data types: Any, as defined by the columns inthe joined table.
The Tag Description
1-36 DataStage Dynamic Relational Stage Guide
The DRS Stage SQL Meta Tags
%Round(expression, factor) Rounds an expression to a specified scale before or after the decimal point. If factor is a literal, it can be rounded to a negative number.
Input arguments: expression is a numeric expression;factor is an INTEGER.
Output data type: DECIMAL.
%Substring(source_str, start, length)
Expands to a substring of source_str.
Input arguments: source_str is a string; start and length are INTEGERS.
Output data type: String.
%TimeAdd(datetime, add-minutes)
Generates the SQL that adds add-minutes (which can be a positive or negative integer literal or expression, provided that the expression resolves to a data type that can be used in datetime arithmetic for the given relational database) to the provided datetime (which can be a datetime literal or expression).
Input arguments: datetime is a Timestamp string literal or expression; add-minutes is a positive or negative INTEGER literal or expression.
Output data type: TIME.
%TimeIn(tm) Expands to platform-specific SQL for a Time value in the WHERE clause of a SQL SELECT or UPDATE statement or when a time value is passed in an INSERTstatement.
Input arguments: tm is TIME or a time string literal inthe form hh.mm.ss.ssssss.
Output data type: TIME.
%TimeOut(time_col) Expands to a platform-specific SQL substring representing time_col in the SELECT clause of an SQL query.
Input arguments: time_col is TIME.
Output data type: (Time) String literal.
%TimePart(DTTM_Column) Returns the time portion of the specified datetime column.
Input arguments: DTTM_Column is TIMESTAMP.
Output data type: TIME.
%TrimSubstr(source_str, start, length)
Expands to a substring of source_str, like %Substring, except that trailing blanks are removed from the substring.
Input arguments: source_str is a string, and start andlength are INTEGERS.
Output data type: String.
The Tag Description
DataStage Dynamic Relational Stage Guide 1-37
SQL Meta Tags The DRS Stage
Additional Information about JoinsThe following is additional information about Joins.
Implications for Oracle 8i Databases
For all database types except Oracle 8i, DRS expands the tags to the
standard SQL-92 syntax for outer joins. For Oracle 8i data sources,
DRS uses the Oracle proprietary syntax, which inserts “(+)” after
appropriate field references in the WHERE clause. In a simple
example, the expression
SELECT Table1.x,Table1.yFROM %LeftJoin (Table1,, Table2,, Table1.x = Table2.x)WHERE Table1.y > 2
produces the following SQL-92 expression:
SELECT Table1.x,Table1.yFROM Table1 LEFT OUTER JOIN Table2 ON Table1.x = Table2.xWHERE Table1.y > 2
With an Oracle 8i database, the same expression produces the
following Oracle 8i expression:
SELECT Table1.x,Table1.yFROM Table1, Table2
%Upper(charstring) Converts the string charstring to uppercase. You can use wildcards with charstring, like %.
Input arguments: charstring is a string.
Output data type: String.
1 Examples of numeric data types are:
BIGINTDECIMALDOUBLEFLOATINTEGERNUMERICREALSMALLINTTINYINT
2 Examples of string data types are:
CHARLONGNVARCHARLONGVARCHARNCHARNVARCHARVARCHAR
The Tag Description
1-38 DataStage Dynamic Relational Stage Guide
The DRS Stage SQL Meta Tags
WHERE (Table1.x = Table2.x(+)) and (Table1.y > 2)
In this case, the meta tag is not like a simple macro. The meta tag
expansion inserts the join condition in front of the existing WHERE
condition, if any, with an AND, putting the join condition and the
existing WHERE condition each in parentheses. The expansion
process then scans the resulting composite WHERE condition for
references to fields of the table on the right side of the left outer join.
After each of these references, it inserts a “(+).”
Warning There are special restrictions for using outer join SQL
Meta Tags with Oracle 8i databases:
In order for the meta tag expansion process to identify the field references, you must fully qualify all field references in join conditions and in the WHERE condition.
You can nest joins only on the left, as in
%LeftJoin (%LeftJoin (Table1,,Table2,,Table1.x = Table2.x),, Table3,, Table2.q =Table3.q
The following is not permitted:
%LeftJoin (Table3,,%LeftJoin (Table1,, Table2,, Table1.x = Table2.x),, Table2.q = Table3.q)
Aliases
You can provide aliases for the tables in the second and fourth
parameters. For example, in SQL-92
SELECT Table1Alias.x,Table1Alias.yFROM %LeftJoin (Table1, Table1Alias, Table2, Table2Alias, Table1Alias.x = Table2Alias.x)WHERE Table1Alias.y > 2
produces
SELECT Table1Alias.x,Table1Alias.yFROM Table1 Table1Alias LEFT OUTER JOIN Table2 Table2Alias ON Table1Alias.x = Table2Alias.xWHERE Table1Alias.y > 2
Nested Outer Joins
You can use the join SQL Meta Tags recursively to produce nested
outer joins. For example,
SELECT Table1.xFROM %LeftJoin (%LeftJoin (Table1,, Table2,, Table1.x=Table2.x),, Table3,, Table1.z = Table3.z AND Table2.q = Table3.q)WHERE Table1.a = ‘xyz’
In SQL-92, this becomes
SELECT Table1.x
DataStage Dynamic Relational Stage Guide 1-39
SQL Meta Tags The DRS Stage
FROM Table1 LEFT OUTER JOIN Table2 ON TABLE1.x = Table2.x LEFT OUTER JOIN Table3 on Table1.z = Table3.z AND Table2.q = Table3.qWHERE Table1.a = ‘xyz’
In Oracle 8i, this becomes
SELECT Table1.xFROM Table1, Table2, Table3WHERE (Table1.x = Table2.x (+)) AND ((Table1.z = Table3.z (+) AND Table2.q (+) = Table3.q (+)) AND (Table1.a = ‘xyz’))
Additional Information about %Like The following is additional information about %Like.
Using %Like and Eliminating Blanks
Some platforms require that you use RTRIM to get the correct value.
The following characters are wildcards even when preceded with the
backslash '\' escape character.
%
_
Therefore, on some platforms, the literal must end with a '%' wildcard
that is not preceded by '\'.
literal = 'ABC%'
In the example above, you do not need RTRIM on any platform.
literal = 'ABC\%'
In the example above, you need RTRIM on SQL Server and DB2/UDB.
Using %Like and Trailing Blanks
Note that not all executions of LIKE perform in the same way. When
dealing with trailing blanks, some platforms behave as if there is an
implicit % at the end of the comparison string, while most do not, as
shown in the following example:
In this example, if the selected column contains the string "ABCD "
(three trailing blanks), the following statement may or may not return
any rows:
select * from t1 Where c like 'ABCD'
Therefore, it is always important to explicitly code the % the end of
matching strings for those columns where you want to include trailing
blanks.
1-40 DataStage Dynamic Relational Stage Guide
The DRS Stage Oracle DATE Data Type Considerations
LIKE Wild Cards
SQL provides two wild cards that can be used when specifying pattern
matching strings for use with the LIKE predicate. The underscore (_) is
used as a substitution for a single character within a string; the
percent sign (%) is used to represent any number of character spaces
within a string.
Oracle DATE Data Type ConsiderationsAn Oracle DATE data type contains date and time information (there is
no TIME data type in Oracle). Ascential DataStage maps the Oracle
DATE data type to a Timestamp data type. This is the default
DataStage data type when you import the Oracle meta data type of
DATE.
ConversionAscential DataStage uses a conversion of YYYY-MM-DD HH24:MI:SS
when reading or writing an Oracle date. If the DataStage data type is
Timestamp, DataStage uses the to_date function for this column
when it generates the INSERT statement to write an Oracle date. If the
DataStage data type is Timestamp or Date, Ascential DataStage uses
the to_char function for this column when it generates the SELECT
statement to read an Oracle date.
The following example creates a table with a DATE data type on an
Oracle server. The imported DataStage data type is Timestamp.
create table dsdate (one date);
The results vary, depending on whether the DRS stage is used as an
input or an output link:
Input link. The stage generates the following SQL statement:
insert into dsdate(one) values(TO_DATE(:1, 'yyyy-mm-dd hh24:mi:ss'))
Output link. The stage generates the following SQL statement:
select TO_CHAR(one, 'YYYY-MM-DD HH24:MI:SS') FROM dsdate
DataStage Dynamic Relational Stage Guide 1-41
Oracle DATE Data Type Considerations The DRS Stage
TruncationIf you choose DataStage data type DATE or TIME, a portion of the
Oracle column is lost.
DRS provides two environment variables for overriding the default
Oracle behavior so that you can specify some other value, for
example, a value indicating that the date portion is not significant.
DS_DEFAULT_DATETIME_DATE. Set this value to a string with the format YYYY-MM-DD. DRS writes to Oracle a constant date component for each DataStage value defined as TIME.
DS_DEFAULT_DATETIME_TIME. Set this value to a string with the format HH:MM:SS. DRS writes to Oracle a constant date component for each DataStage value defined as DATE.
The following demonstrates the impact of using one of these
environment variables.
A DataStage job reads the following from a sequential file and writes it to an Oracle table with two DATE data types.
The full values that are stored in Oracle on November 23, 2004, are:
But if DS_DEFAULT_DATETIME_DATE is set to “1900-01-01” and DS_DEFAULT_DATETIME_TIME is set to “23:59:59,” then the following is written to Oracle.
When the data is… Its value is…
Located in its source 2004-10-29 11:18:24
Read by DataStage using a TIME data type 11:18:24
Located in Oracle after DataStage writes it using a TIME data type, on December 13, 2004
2004-12-01 11:18:24
Date Time
2004-11-25 09:14:37
DATE DATE
2004-11-25 12:00:00 2004-11-01 09:14:37
DATE DATE
2004-11-25 23:59:59 1900-01-01 09:14:37
1-42 DataStage Dynamic Relational Stage Guide
Th
e D
RS
Sta
ge
Da
tab
ase D
ata
Ty
pe S
up
po
rt
Data
Sta
ge D
yn
am
ic R
ela
tion
al S
tag
e G
uid
e1-4
3
ting table definitions for a
e.
erverype
SybaseData Type
BINARY
BIT
CHAR, NCHAR
ME DATETIME,
SMALLDATETIME3
L DECIMAL, MONEY
DOUBLE PRECISION
FLOAT4
R INT
TEXT
L NUMERIC
Database Data Type SupportThe following table document s the support for database data types. When crea
database table, specify the SQL type, length, and scale attributes as appropriat
DataStage SQL Data Type
DB2/UDBData Type
Informix Data Type
OracleData Type
SQL SData T
SQL_BIGINT BIGINT INT8
SQL_BINARY CHAR FOR BIT DATA
BYTE
SQL_BIT Unsupported BOOLEAN1 BIT
SQL_CHAR CHAR CHAR CHAR CHAR
SQL_DATE TIMESTAMP DATE DATE2 DATETI
SQL_DECIMAL DECIMAL DECIMAL NUMBER DECIMA
SQL_DOUBLE DOUBLE PRECISION
DOUBLE PRECISION
NUMBER
SQL_FLOAT FLOAT FLOAT NUMBER FLOAT
SQL_INTEGER INTEGER INTEGER NUMBER INTEGE
SQL_LONGVARBINARY LONG VARCHAR FOR BIT DATA
BYTE BLOB5, 6 IMAGE
SQL_LONGVARCHAR LONG VARCHAR,
CLOB5, 7
TEXT CLOB5, 8 TEXT
SQL_NUMERIC DECIMAL DECIMAL NUMBER DECIMA
Th
e D
RS
Sta
ge
Da
tab
ase D
ata
Ty
pe S
up
po
rt
1-4
4D
ata
Sta
ge D
yn
am
ic R
ela
tion
al S
tag
e G
uid
e
REAL
INT SMALLINT
ME DATETIME,
SMALLDATETIME9
ME DATETIME, SMALLDATETIME
INT TINYINT
ARY
AR VARCHAR, NVARCHAR, SYSNAME
NCHAR
TEXT
APHIC NVARCHAR
n converted to an E or SMALLDATETIME
ision occurs when
DataStage SQL Data Type
DB2/UDBData Type
Informix Data Type
OracleData Type
SQL ServerData Type
SybaseData Type
SQL_REAL REAL REAL NUMBER REAL
SQL_SMALLINT SMALLINT SMALLINT SMALL
SQL_TIME TIME DATETIME DATE2 DATETI
SQL_TIMESTAMP TIMESTAMP DATETIME DATE2 DATETI
SQL_TINYINT SMALLINT SMALL
SQL_VARBINARY VARCHAR FOR BIT DATA
BYTE VARBIN
SQL_VARCHAR VARCHAR VARCHAR VARCHAR2 VARCH
SQL_WCHAR NCHAR NCHAR NCHAR NCHAR
SQL_WLONGVARCHAR DBCLOB10 NCLOB NTEXT
SQL_WVARCHAR NVARCHAR NVARCHAR NVARCHAR2 VARGR
1 Supported only with Informix Universal Server 9.n
2 See "Oracle DATE Data Type Considerations" on page 1-41.
3 The time component of a Sybase DATETIME or SMALLDATETIME value is lost wheDataStage DATE value. When writing a DataStage DATE value to a Sybase DATETIMvalue, the time component is set to midnight.
4 DataStage FLOAT values have a maximum precision of 15 digits. Some loss of precreading data from Sybase float(p) columns where p is greater than 15.
Th
e D
RS
Sta
ge
Da
tab
ase D
ata
Ty
pe S
up
po
rt
Data
Sta
ge D
yn
am
ic R
ela
tion
al S
tag
e G
uid
e1-4
5
n length to specify the
ta type with a precision n, choose DataStage’s B in the Columns tab. LOB cannot be used as
a type with a precision ition, choose re than 32 K in the R maps to
a type with a precision n, choose DataStage’s
n the Columns tab. The cannot be used as a
n converted to an E or SMALLDATETIME ne.
ARCHAR is
5 Values passed in LOB-type columns need to get into available memory. Use columamount of memory to reserve for a particular column.
6 The DRS Plug-in supports the BLOB data type by mapping the LONGVARBINARY dagreater than 4 KB to Oracle’s BLOB data type. To work with a BLOB column definitioLONGVARBINARY as the column’s data type and provide a Length of more than 4 KThe maximum size supported by DataStage is 2 GB. A column with a data type of Ba key.
7 The DRS Plug-in supports the CLOB data type by mapping the LONGVARCHAR datgreater than 32 K to DB2/UDB’s CLOB data type. To work with a CLOB column definDataStage’s LONGVARCHAR as the column’s data type and provide a Length of moColumns tab. If the Length is less than or equal to 32 K, DataStage’s LONGVARCHALONGVARCHAR.
8 The DRS Plug-in supports the CLOB data type by mapping the LONGVARCHAR datgreater than 4 KB to Oracle’s CLOB data type. To work with a CLOB column definitioLONGVARCHAR as the column’s data type and provide a Length of more than 4 KB imaximum size supported by DataStage is 2 GB. A column with a data type of CLOBkey.
9 The date component of a Sybase DATETIME or SMALLDATETIME value is lost wheDataStage TIME value. When writing a DataStage TIME value to a Sybase DATETIMvalue, the date component is set to the current date on the DataStage server machi
10 If the size of the column is less than 32 K, the DB2/UDB data type for SQL_WLONGVLONGVARCHAR.
Handling $ and # Characters The DRS Stage
Handling $ and # CharactersAscential DataStage has been modified to enable it to handle
databases that use the DataStage reserved characters # and $ in
column names. DataStage converts these characters into an internal
format and then converts them back as necessary.
To take advantage of this facility, do the following:
In Ascential DataStage Administrator, open the Environment Variables dialog box for the project in question, and set the environment variable DS_ENABLE_RESERVED_CHAR_CONVERT to true (this can be found in the General\Customize branch).
Avoid using the strings __035__ and __036__ in your column names (these are used as the internal representations of # and $ respectively).
Once the table definition is loaded, the internal column names are
displayed rather than the original database names both in table
definitions and in the Data Browser. They are also used in derivations
and expressions. The original names (that is, those containing the $ or
#) are used in generated SQL statements, however, and you should
use them if entering SQL in the job yourself.
When using a DRS Plug-In in a server job, you should use the external
names when entering user-defined SQL statements that contain
columns. The columns within the stage are represented by ? , ?, etc.
(parameter markers) and bound to the columns by order, so you do
not need to worry about entering names for them. This applies to:
Query
Update
Insert
Key
Select
Where clause
For example, for an update you might enter:
UPDATE tablename SET ##B$ = ? WHERE $A# = ?
Particularly note the key in this statement ($A#) is specified using the
external name.
1-46 DataStage Dynamic Relational Stage Guide
ABest Practice for Handling Long Data
Types in DataStage Jobs
IntroductionThe Dynamic Relational Stage now supports the various long data
types provided by the major database vendors. DataStage represents
data types such as Oracle's CLOB and BLOB types with the Long
Varchar SQL type. In most cases, these types can support object sizes
of 2GB or greater. In DataStage, object sizes must be specified in a job
design and cannot be changed at runtime. Therefore, you must design
a job with the largest potential size of a large object rather than with
the largest actual size that may be encountered during job execution.
This results in unnecessary memory allocations which, together with
a large array size in the various database stages, can rapidly exhaust
available system memory.
The following describes a mechanism that assures DataStage users a
high degree of success in processing large objects on a system that is
adequately configured for the task.
For a sample job, use DataStage Manager to import the file
LongJobControl.dsx from the Samples directory on the DataStage
installation media.
Recommended Job DesignThe recommended job design requires a pairing of two jobs driven by
job control. The first is responsible for determining the actual
maximum length of each column containing a long data type and
passing this length to a second job that has a parameter describing
DataStage Dynamic Relational Stage Guide A-1
the length of its long data type columns. For the first job, use the
%MaxLength parameter to calculate the maximum length.
The Batch JobMore specifically, a batch job controls the sequence. The batch job can
be created from scratch or modified from a template. The batch job is
a server job with a short description on the designer panel as shown
in the following illustration.
Minimum Job ParametersThe batch job requires at least three job parameters: QUERYJOB,
ETLJOB, and $mam. $mam is a project level environment variable
that specifies maximum available memory for the ETLJOB expressed
in megabytes. QUERYJOB and ETLJOB provide the names for the
query job and transaction job respectively. Other job parameters can
DataStage Dynamic Relational Stage Guide A-2
Best Practice for Handling Long Data Types in DataStage Jobs The Batch Job
be added, if necessary. The job control must be modified if more job
parameters are added.
Template CodeThe BASIC code is in the Job Control tab of the batch job. The
template code is shown below.
REM PARAMETERFILE is the output file of QUERYJOB, and contains the maximum data length of the LONG column* Setup QUERYJOB, run it, wait for it to finish, and test for success hJob1 = DSAttachJob(QUERYJOB, DSJ.ERRFATAL) If NOT(hJob1) Then Call DSLogFatal("Job Attach Failed: ":QUERYJOB:"", "JobControl") Abort End ErrCode = DSSetParam(hJob1, "DBMS_MAX", DBMS_MAX) ErrCode = DSSetParam(hJob1, "DSN_MAX", DSN_MAX) ErrCode = DSSetParam(hJob1, "UID_MAX", UID_MAX) ErrCode = DSSetParam(hJob1, "PWD_MAX", PWD_MAX) ErrCode = DSRunJob(hJob1, DSJ.RUNNORMAL) ErrCode = DSWaitForJob(hJob1) Status = DSGetJobInfo(hJob1, DSJ.JOBSTATUS) If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then * Fatal Error - No Return Call DSLogFatal("Job Failed: ":QUERYJOB:"", "JobControl") End
* Determine optimal Array size and column length for ETL JOB PARAMETERFILE = QUERYJOB:".pf" CALL DSLogInfo("PARAMETERFILE = ":PARAMETERFILE:"", "JobControl")*Get the maximum length of LONG column data *If $mam is nonzero, use it to determine array size IF ($mam NE 0) THEN
DataStage Dynamic Relational Stage Guide A-3
The Batch Job Best Practice for Handling Long Data Types in DataStage Jobs
ARG1 = $mam*1048576 OPENSEQ PARAMETERFILE TO PFILE ELSE CALL DSLogInfo("File not found ":PARAMETERFILE:"", "JobControl") READSEQ ColLength FROM PFILE ELSE CALL DSLogInfo("Error reading ":PARAMETERFILE:"", "JobControl") DATALENGTH = TRIMB(ColLength)
CLOSESEQ PFILE
* 0.3 is a number of factor if there are only one DRS as source stage and * another DRS as target stage. With more than two DRS stages in the ETL job, * smaller number should be considered. MAXMEM = ARG1*0.3 * 6000 is for rows size of other columns ROWSIZE = DATALENGTH + 6000 * If the maximum length of the long column data exceeds MAXMEM, * MAXMEM will be the column precision for ETL job IF (DATALENGTH > MAXMEM) THEN
DATALENGTH = MAXMEM END
ARRAYSIZE = DIV(MAXMEM, ROWSIZE)
IF (ARRAYSIZE < 1) THEN ARG1 = 1
END
*32767 is the biggest arraysize allowed IF (ARRAYSIZE > 32767) THEN
ARG1 = 32767 END IF(ARRAYSIZE >= 1 AND ARRAYSIZE <= 32767) THEN
ARG1 = ARRAYSIZE END
ARG2 = DATALENGTH
CALL DSLogInfo("ARRAYSIZE = ":ARG1:"", "JobControl") CALL DSLogInfo("MAXLONGLENGTH = ":ARG2:"", "JobControl") END
*If $mam is set as zero, it will be ignored IF ($mam EQ 0) THEN OPENSEQ PARAMETERFILE TO PFILE ELSE CALL DSLogInfo("File not found ":PARAMETERFILE:"", "JobControl") READSEQ ColLength FROM PFILE ELSE CALL DSLogInfo("Error reading ":PARAMETERFILE:"", "JobControl") ARG2 = TRIMB(ColLength) CLOSESEQ PFILE CALL DSLogInfo("MAXLONGLENGTH = ":ARG2:"", "JobControl") END
* Setup ETLJOB, run it, wait for it to finish, and test for success hJob2 = DSAttachJob(ETLJOB, DSJ.ERRFATAL)
A-4 DataStage Dynamic Relational Stage Guide
Best Practice for Handling Long Data Types in DataStage Jobs The Batch Job
If NOT(hJob2) Then Call DSLogFatal("Job Attach Failed: ":ETLJOB:"", "JobControl") Abort End ErrCode = DSSetParam(hJob2, "DBMS_SRC", DBMS_SRC) ErrCode = DSSetParam(hJob2, "DSN_SRC", DSN_SRC) ErrCode = DSSetParam(hJob2, "UID_SRC", UID_SRC) ErrCode = DSSetParam(hJob2, "PWD_SRC", PWD_SRC) IF ($mam NE 0) THEN ErrCode = DSSetParam(hJob2, "ARSZ_SRC", ARG1) END IF ($mam EQ 0) THEN ErrCode = DSSetParam(hJob2, "ARSZ_SRC", ARSZ_SRC) END ErrCode = DSSetParam(hJob2, "LongLength", ARG2) ErrCode = DSSetParam(hJob2, "DBMS_TRG", DBMS_TRG) ErrCode = DSSetParam(hJob2, "DSN_TRG", DSN_TRG) ErrCode = DSSetParam(hJob2, "UID_TRG", UID_TRG) ErrCode = DSSetParam(hJob2, "PWD_TRG", PWD_TRG) IF ($mam NE 0) THEN ErrCode = DSSetParam(hJob2, "ARSZ_TRG", ARG1) END IF ($mam EQ 0) THEN ErrCode = DSSetParam(hJob2, "ARSZ_TRG", ARSZ_TRG) END ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL) ErrCode = DSWaitForJob(hJob2) Status = DSGetJobInfo(hJob2, DSJ.JOBSTATUS) If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then * Fatal Error - No Return Call DSLogFatal("Job Failed: ":ETLJOB:"", "JobControl") End
Other Job ParametersThere are fifteen other job parameters in the above code.
DBMS_MAX:DBMS type for QUERY job.
DSN_MAX:Data source name for QUERY job.
UID_MAX:User name for QUERY job.
PWD_MAX:Password for QUERY job.
DBMS_SRC:DBMS type for the source stage of the ETL job.
DSN_SRC:Data source name for the source stage of the ETL job.
UID_SRC:User name for the source stage of the ETL job.
PWD_SRC:Password for the source stage of the ETL job.
ARSZ_SRC:Array size for the source stage of the ETL job.
DBMS_TRG:DBMS type for the target stage of the ETL job.
DSN_TRG:Data source name for the target stage of the ETL job.
DataStage Dynamic Relational Stage Guide A-5
The Preliminary Job Best Practice for Handling Long Data Types in DataStage Jobs
UID_TRG:User name for the target stage of the ETL job.
PWD_TRG:Password for the target stage of the ETL job.
ARSZ_TRG:Array size for the target stage of the ETL job
LongLength:The maximum data length of LONG column.
Change the names of other job parameters if they are different from
those in your actual jobs.
The Preliminary JobThe first job consists minimally of a DRS stage writing to a sequential
file, named QUERYJOB.pf. The DRS stage uses user-defined SQL
containing the appropriate meta SQL (%MaxLength) to calculate the
maximum data size of the long column
.
A-6 DataStage Dynamic Relational Stage Guide
Best Practice for Handling Long Data Types in DataStage Jobs The Preliminary Job
Specifying Column LengthUsing the Columns tab, create a column to contain the length.
Extracting the LengthUsing the User-defined tab, create the SQL to extract the length.
DataStage Dynamic Relational Stage Guide A-7
Determining Optimal Array Size and Maximum Length of Data Best Practice for Handling Long Data Types in
Identifying the Name of the FileIn the Sequential File stage, identify the File Name.
Note Performance can become an issue because this is likely to
be a relatively expensive operation.
Determining Optimal Array Size and Maximum Length of Data
Once the first job completes, the job control uses $mam and the
output file of query job to determine the optimal array size (ARG1) and
the maximum data length of the LONG column (ARG2). The maximum
data length from the query job is adjusted if it can not be handled
within the memory of $mam. Both ARG1 and ARG2 are passed to the
subsequent DRS stages of the second job (ETLJOB) as job parameters
by the job control. ETLJOB is a regular job with a job parameter to
specify data length for LONG column. All the rows that exceed ARG2
are identified by inserting information into the LONG column of the
target table. The format of the inserted information is in the format of
"Table|Column|Key Values".
Currently, DataStage does not support job parameters in column meta
data other than in Description on the Columns tab. DataStage
allows Description to specify an 'override'. In the description field of
the LONG column, enter a string of the form:
param{length=ParameterName}. The ParameterName may be
surrounded by # chars or not and is case-sensitive. The precision of
the LONG column is replaced by the value of ParameterName at
runtime.
A-8 DataStage Dynamic Relational Stage Guide
Best Practice for Handling Long Data Types in DataStage Jobs The ETL Job
The ETL JobThe second job, illustrated below, is a job that extracts, transforms,
and loads.
Using Description as an Override to Specify PrecisionThe following is a sample of the Columns tab with Description
supplied.
DataStage Dynamic Relational Stage Guide A-9
Managing Failures Best Practice for Handling Long Data Types in DataStage Jobs
Managing FailuresDRS currently allocates its memory based on the column meta data
when the stage is initialized. Therefore, it is unlikely that a job will
encounter memory issues mid-run. However, it is possible that
truncations of long data occur if the maximum long column size
exceeds the memory available to process these rows. In such rare
cases, you need to log these truncated rows for later processing. The
information that uniquely identifies the target row is in the format of
"Table|Column|Key Values".
A-10 DataStage Dynamic Relational Stage Guide
Index
Symbols%Like 1–40
AAfter tab
Input page 1–20
Output page 1–29
Ascential Developer Net v
BBefore tab
Input page 1–19
Output page 1–29
blanks, eliminating 1–40
Ccharacter set mapping 1–12
Columns tab
Input page 1–17
Output page 1–26
configuration requirements
DB2 1–4
general 1–4
Informix 1–5
Oracle 8i 1–6
Oracle 9i 1–9
Sybase 1–10
connecting to a database 1–11
Customer Care v
Customer Care, telephone v
Ddata type considerations
Oracle 8i 1–41
Oracle 9i 1–41
DataStage Dynamic Relational Stage Guide
data types
DataStage 1–43
DB2 1–43
Informix 1–43
Oracle 1–43
SQL Server 1–43
Sybase 1–43
DataStage data types 1–43
DataStage Designer iii
DB2
configuration requirements 1–4
data types 1–43
dollar sign ($) 1–46
DRS
defining the connection 1–10
description iii
Dynamic Relational Stage. See DRS.
Eeliminating blanks 1–40
environment variables 1–46
Ffunctionality 1–2
GGeneral tab
Input page 1–13
Output page 1–25
Stage page 1–11
Generated DDL tab 1–21
generated SQL statements 1–22
Generated tab
Input page 1–18
Output page 1–28
Index-1
Index
IInformix
configuration requirements 1–5
data types 1–43
Input page
Columns tab 1–17
description 1–11
General tab 1–13
properties 1–13
SQL tab 1–18
After tab 1–20
Before tab 1–19
Generated DDL tab 1–21
Generated tab 1–18
User-defined DDL tab 1–21
User-defined tab 1–18
installation 1–10
Jjoins
aliases 1–39
nested outer 1–39
Oracle 8i 1–38
Llarge objects. See processing large objects
LOB data types 1–4
long data types A–1
NNLS tab 1–12
OOracle 8i
client shared library 1–6
configuration requirements 1–6
data type considerations 1–41
data types 1–43
joins 1–38
Oracle 9i
configuration requirements 1–9
data type considerations 1–41
data types 1–43
Output page
Columns tab 1–26
description 1–11, 1–24
General tab 1–25
Selection tab 1–27
Index-2
SQL tab 1–28
After tab 1–29
Before tab 1–29
Generated tab 1–28
User-defined tab 1–28
overview 1–1
Ppound sign (#) 1–46
processing large objects
batch job
description A–2
minimum job parameters A–2
other job parameters A–5
determining maximum data size A–6
determining maximum length of data A–8
determining optimal array size A–8
recommended job design A–1
specifying precision A–9
template code A–3
the ETL job A–9
truncation of data A–9
Rreject row handling 1–21
reserved characters
dollar sign ($) 1–46
pound sign (#) 1–46
SSelection tab 1–27
specifying object sizes A–1
SQL meta tags 1–31
SQL Server
data types 1–43
SQL tab
Input page 1–18
Output page 1–28
Stage page 1–10
description 1–10
General tab 1–11
NLS tab 1–12
Sybase
configuration requirements 1–10
data types 1–43
Tthird-party documentation iv
DataStage Dynamic Relational Stage Guide
Index
UUser-defined DDL tab 1–21
user-defined queries 1–31
user-defined SQL statements 1–23
User-defined tab
Input page 1–18
Output page 1–28
Wwriting to a database 1–22
DataStage Dynamic Relational Stage Guide
Index-3Index
Index-4
DataStage Dynamic Relational Stage Guide