Informatica PowerExchange for Netezza (Version 9.1.0)

User Guide

Informatica PowerExchange for Netezza User Guide

Version 9.1.0

March 2011

Copyright (c) 2005-2011 Informatica. All rights reserved.

This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S. and/or international Patents and other Patents Pending.

Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.

The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us in writing.

Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange, PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange, Informatica OnDemand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging and Informatica Master Data Management are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All rights reserved. Copyright © Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright 2007 Isomorphic Software. All rights reserved. Copyright © Meta Integration Technology, Inc. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Adobe Systems Incorporated. All rights reserved. Copyright © DataArt, Inc. All rights reserved. Copyright © ComponentSource. All rights reserved. Copyright © Microsoft Corporation. All rights reserved. Copyright © Rogue Wave Software, Inc. All rights reserved. Copyright © Teradata Corporation. All rights reserved. Copyright © Yahoo! Inc. All rights reserved. Copyright © Glyph & Cog, LLC. All rights reserved. Copyright © Thinkmap, Inc. All rights reserved. Copyright © Clearpace Software Limited. All rights reserved. Copyright © Information Builders, Inc. All rights reserved. Copyright © OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved.

This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and other software which is licensed under the Apache License, Version 2.0 (the "License"). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software copyright © 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under the GNU Lesser General Public License Agreement, which may be found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.

The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California, Irvine, and Vanderbilt University, Copyright (©) 1993-2006, all rights reserved.

This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and redistribution of this software is subject to terms available at http://www.openssl.org.

This product includes Curl software which is Copyright 1996-2007, Daniel Stenberg, <[email protected]>. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

The product includes software copyright 2001-2005 (©) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.dom4j.org/license.html.

The product includes software copyright © 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://svn.dojotoolkit.org/dojo/trunk/LICENSE.

This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.

This product includes software copyright © 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at http://www.gnu.org/software/kawa/Software-License.html.

This product includes OSSP UUID software which is Copyright © 2002 Ralf S. Engelschall, Copyright © 2002 The OSSP Project Copyright © 2002 Cable & Wireless Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.

This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software are subject to terms available at http://www.boost.org/LICENSE_1_0.txt.

This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at http://www.pcre.org/license.txt.

This product includes software copyright © 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.eclipse.org/org/documents/epl-v10.php.

This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://www.stlport.org/doc/license.html, http://www.asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html, http://fusesource.com/downloads/license-agreements/fuse-message-broker-v-5-3-license-agreement, http://antlr.org/license.html, http://aopalliance.sourceforge.net/, http://www.bouncycastle.org/licence.html, http://www.jgraph.com/jgraphdownload.html, http://www.jgraph.com/jgraphdownload.html, http://www.jcraft.com/jsch/LICENSE.txt and http://jotm.objectweb.org/bsd_license.html.

This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution License (http://www.opensource.org/licenses/cddl1.php) the Common Public License (http://www.opensource.org/licenses/cpl1.0.php) and the BSD License (http://www.opensource.org/licenses/bsd-license.php).

This product includes software copyright © 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab. For further information please visit http://www.extreme.indiana.edu/.

This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178; 6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775; 6,640,226; 6,789,096; 6,820,077; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,254,590; 7,281,001; 7,421,458; 7,496,588; 7,523,121; 7,584,422; 7,720,842; 7,721,270; and 7,774,791, international Patents and other Patents Pending.

DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is subject to change at any time without notice.

NOTICES

This Informatica product (the “Software”) includes certain drivers (the “DataDirect Drivers”) from DataDirect Technologies, an operating company of Progress Software Corporation (“DataDirect”) which are subject to the following terms and conditions:

1. THE DATADIRECT DRIVERS ARE PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.

2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.

Part Number: PWX-NZU-91000-0001

Table of Contents

Preface
  Informatica Resources
    Informatica Customer Portal
    Informatica Documentation
    Informatica Web Site
    Informatica How-To Library
    Informatica Knowledge Base
    Informatica Multimedia Knowledge Base
    Informatica Global Customer Support

Chapter 1: Understanding PowerExchange for Netezza
  Understanding PowerExchange for Netezza Overview
  Using Code Pages

Chapter 2: Configuring PowerExchange for Netezza
  Configuring PowerExchange for Netezza Overview
    Prerequisites
    Configuring PowerExchange for Netezza
    Upgrading PowerExchange for Netezza
  Registering the Plug-in

Chapter 3: Working with Netezza Sources and Targets
  Working With Netezza Sources and Targets Overview
  Configuring Source Qualifier Properties
  Importing Netezza Source Definitions
  Importing Netezza Target Definitions

Chapter 4: Netezza Sessions and Workflows
  Configuring a Session with a Netezza Source
  Configuring a Session with a Netezza Target
    Target Properties
    Unprojected Columns
    Pipeline Partitioning
    Target Connection Groups
    Configuring Multiple Targets for the Same Target Table
  Updating Netezza Target Data
    Update As Insert
    Update Else Insert
  Null Values and Empty Strings
  Configuring a Netezza Session for Optimal Performance
    Using a Netezza Distribution Key
  Troubleshooting Netezza Sessions

Appendix A: Datatype Reference
  Netezza and Transformation Datatypes

Index

Preface

The Informatica PowerExchange for Netezza User Guide provides information about extracting data from a Netezza source and loading data into a Netezza target. It is written for database administrators and developers who are responsible for extracting data from Netezza and loading data to Netezza. This book assumes you have knowledge of Netezza and PowerCenter.

Informatica Resources

Informatica Customer Portal

As an Informatica customer, you can access the Informatica Customer Portal site at http://mysupport.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica How-To Library, the Informatica Knowledge Base, the Informatica Multimedia Knowledge Base, Informatica Product Documentation, and access to the Informatica user community.

Informatica Documentation

The Informatica Documentation team makes every effort to create accurate, usable documentation. If you have questions, comments, or ideas about this documentation, contact the Informatica Documentation team through email at [email protected]. We will use your feedback to improve our documentation. Let us know if we can contact you regarding your comments.

The Documentation team updates documentation as needed. To get the latest documentation for your product, navigate to Product Documentation from http://mysupport.informatica.com.

Informatica Web Site

You can access the Informatica corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and sales offices. You will also find product and partner information. The services area of the site includes important information about technical support, training and education, and implementation services.

Informatica How-To Library

As an Informatica customer, you can access the Informatica How-To Library at http://mysupport.informatica.com. The How-To Library is a collection of resources to help you learn more about Informatica products and features. It includes articles and interactive demonstrations that provide solutions to common problems, compare features and behaviors, and guide you through performing specific real-world tasks.

Informatica Knowledge Base

As an Informatica customer, you can access the Informatica Knowledge Base at http://mysupport.informatica.com. Use the Knowledge Base to search for documented solutions to known technical issues about Informatica products. You can also find answers to frequently asked questions, technical white papers, and technical tips. If you have questions, comments, or ideas about the Knowledge Base, contact the Informatica Knowledge Base team through email at [email protected].

Informatica Multimedia Knowledge Base

As an Informatica customer, you can access the Informatica Multimedia Knowledge Base at http://mysupport.informatica.com. The Multimedia Knowledge Base is a collection of instructional multimedia files that help you learn about common concepts and guide you through performing specific tasks. If you have questions, comments, or ideas about the Multimedia Knowledge Base, contact the Informatica Knowledge Base team through email at [email protected].

Informatica Global Customer Support

You can contact a Customer Support Center by telephone or through Online Support. Online Support requires a user name and password. You can request a user name and password at http://mysupport.informatica.com.

Use the following telephone numbers to contact Informatica Global Customer Support:

North America / South America

  Toll Free
    Brazil: 0800 891 0202
    Mexico: 001 888 209 8853
    North America: +1 877 463 2435
  Standard Rate
    North America: +1 650 653 6332

Europe / Middle East / Africa

  Toll Free
    France: 00800 4632 4357
    Germany: 00800 4632 4357
    Israel: 00800 4632 4357
    Italy: 800 915 985
    Netherlands: 00800 4632 4357
    Portugal: 800 208 360
    Spain: 900 813 166
    Switzerland: 00800 4632 4357 or 0800 463 200
    United Kingdom: 00800 4632 4357 or 0800 023 4632
  Standard Rate
    France: 0805 804632
    Germany: 01805 702702
    Netherlands: 030 6022 797

Asia / Australia

  Toll Free
    Australia: 1 800 151 830
    New Zealand: 1 800 151 830
    Singapore: 001 800 4632 4357
  Standard Rate
    India: +91 80 4112 5738

CHAPTER 1

Understanding PowerExchange for Netezza

This chapter includes the following topics:

• Understanding PowerExchange for Netezza Overview

• Using Code Pages

Understanding PowerExchange for Netezza Overview

PowerExchange for Netezza provides bidirectional connectivity between PowerCenter and Netezza to extract and load data. The Designer uses a relational connector to connect to the Netezza database. You can import Netezza tables as source and target definitions. You can connect to Netezza Performance Server to read data from Netezza tables and load data to Netezza tables. Netezza Performance Server integrates database, server, and storage in a single system.

The PowerCenter Integration Service reads and writes Netezza data through an external table. It uses the bulkload utility on the external table to extract and load data. The bulk loading utility can increase session performance.

Configure a Netezza database connection to read data from and write to Netezza.

Using Code Pages

When the PowerCenter Integration Service runs in Unicode mode, it encodes Netezza data of the Nchar(m) and NVarchar(m) datatypes in UTF-8. It encodes Netezza data of the Varchar and Char datatypes in Latin-9.

If the data contains extended ASCII characters or UTF-8 characters, run the PowerCenter Integration Service in Unicode mode.
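
The difference between the two encodings can be illustrated with a short Python sketch. It does not call any PowerCenter or Netezza API; the helper name encoded_sizes and the sample strings are assumptions chosen for illustration. Python's latin9 codec is ISO 8859-15.

```python
def encoded_sizes(text):
    """Return the byte length of text in Latin-9 and in UTF-8.

    The Latin-9 entry is None when the text contains a character
    that Latin-9 (ISO 8859-15) cannot represent.
    """
    try:
        latin9_len = len(text.encode("latin9"))
    except UnicodeEncodeError:
        latin9_len = None
    return {"latin9": latin9_len, "utf-8": len(text.encode("utf-8"))}


# The euro sign takes one byte in Latin-9 and three bytes in UTF-8.
print(encoded_sizes("€"))

# A character outside Latin-9 has no Latin-9 encoding at all, so under the
# rules above it would need an Nchar or NVarchar column.
print(encoded_sizes("日"))
```

A check like this can help decide whether sample data fits the Char and Varchar datatypes before a session runs.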

CHAPTER 2

Configuring PowerExchange for Netezza

This chapter includes the following topics:

• Configuring PowerExchange for Netezza Overview

• Registering the Plug-in

Configuring PowerExchange for Netezza Overview

This chapter provides information about configuring PowerExchange for Netezza.

Prerequisites

Before you configure PowerExchange for Netezza, complete the following tasks:

• Install client and server components of the Netezza Performance Server.

• Verify that the Netezza database user has the following privileges on the database:

- CREATE TABLE

- CREATE EXTERNAL TABLE

- DELETE

- DROP

- INSERT

- LIST

- SELECT

- TRUNCATE

- UPDATE
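
The privilege check above can be sketched as a simple set comparison. This is a minimal sketch, not part of the product: the function name missing_privileges is hypothetical, and how you obtain the list of granted privileges from Netezza is outside its scope.

```python
# Privileges the guide lists as prerequisites for the Netezza database user.
REQUIRED_PRIVILEGES = frozenset({
    "CREATE TABLE",
    "CREATE EXTERNAL TABLE",
    "DELETE",
    "DROP",
    "INSERT",
    "LIST",
    "SELECT",
    "TRUNCATE",
    "UPDATE",
})


def missing_privileges(granted):
    """Return the required privileges absent from granted, sorted by name.

    granted is any iterable of privilege names. Case and repeated
    whitespace are ignored when comparing.
    """
    normalized = {" ".join(name.split()).upper() for name in granted}
    return sorted(REQUIRED_PRIVILEGES - normalized)


# Hypothetical input: a user who so far only has read and insert access.
print(missing_privileges(["select", "insert", "list"]))
```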

Configuring PowerExchange for Netezza

To read or write Netezza data in bulk mode, register the PowerExchange for Netezza plug-in with the repository. To read or write Netezza data with a relational connection, you do not need to perform configuration steps.

Upgrading PowerExchange for Netezza

You can upgrade PowerExchange for Netezza version 8.1.1.0.3 to version 9.0 or 9.0.1.

1. Upgrade PowerCenter.

2. Configure the Repository Service to run in exclusive mode.

To change the Repository Service operating mode, you can use the Administrator tool or the infacmd UpdateRepositoryService command.

3. Use the pmrep UpgradeNetezzaToRelational command to upgrade PowerExchange for Netezza.

Enter the following command:

pmrep upgradeNetezzaToRelational

4. Configure the Repository Service to run in normal mode.

Registering the Plug-in

To read or write Netezza data in bulk mode, you need to register the plug-in with the repository. A plug-in is an XML file that defines the bulk mode functionality of PowerExchange for Netezza.

To register the plug-in, the repository must be running in exclusive mode. Use the Informatica Administrator or the pmrep RegisterPlugin command to register the plug-in.

The plug-in file for PowerExchange for Netezza is pmnetezza.xml. When you install the Service component, the installer copies pmnetezza.xml to the following directory:

<PowerCenter Installation Directory>\server\bin\Plugin

Note: If you do not have the correct privileges to register the plug-in, contact the user who manages the PowerCenter Repository Service.
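
As a small illustration of where the plug-in file ends up, the following Python sketch builds the full path to pmnetezza.xml from an installation directory. The function name plugin_path and the sample directory C:\Informatica\9.1.0 are assumptions; substitute the actual PowerCenter installation directory.

```python
from pathlib import PureWindowsPath


def plugin_path(install_dir):
    """Return the expected location of pmnetezza.xml under install_dir.

    Follows the layout given in this guide:
    <PowerCenter Installation Directory>\\server\\bin\\Plugin
    """
    return PureWindowsPath(install_dir) / "server" / "bin" / "Plugin" / "pmnetezza.xml"


# Hypothetical installation directory, used only for illustration.
print(plugin_path(r"C:\Informatica\9.1.0"))
```

PureWindowsPath only manipulates the path string, so the sketch runs on any operating system and does not check that the file exists.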

CHAPTER 3

Working with Netezza Sources and Targets

This chapter includes the following topics:

• Working With Netezza Sources and Targets Overview

• Configuring Source Qualifier Properties

• Importing Netezza Source Definitions

• Importing Netezza Target Definitions

Working With Netezza Sources and Targets Overview

Netezza source and target definitions represent metadata for Netezza tables. When you import Netezza definitions, you can choose to preview data in the tables.

You can edit definitions to configure the properties that you did not import from Netezza. If you want to enforce key constraints, define them in the Designer. When you run a session, the PowerCenter Integration Service establishes relationships within the pipeline based on source and target definitions. Netezza does not enforce key constraints.

When the PowerCenter Integration Service extracts from Netezza, it uses the bulk extract functionality of the external tables. When the PowerCenter Integration Service loads to Netezza, it uses the bulk loading functionality of the external tables.

Configuring Source Qualifier Properties

You can configure Application Source Qualifier properties to set the number of sorted ports and to retrieve distinct data from a Netezza source. You can override the values in the session properties.

The following table describes the Application Source Qualifier properties:

Source Option
    Description

Select Distinct
    Selects unique values. Netezza ignores trailing spaces. Therefore, the PowerCenter Integration Service might extract fewer rows than expected.

Source Filter
    Reduces the number of rows the PowerCenter Integration Service queries. Use the following syntax:
    <table name>."<field name>" <operator> <value>
    The filter condition is case sensitive.

User Defined Join
    Joins data from multiple sources.

Number of Sorted Ports
    Number of columns used when sorting rows queried from the source. The PowerCenter Integration Service adds an ORDER BY clause to the default query when it reads source rows. The ORDER BY clause includes the number of ports specified, starting from the top of the transformation. When you specify the number of sorted ports, the database sort order must match the session sort order. Default is 0.

SQL Query
    Overrides the default query. Enclose column names in double quotes. The SQL query is case sensitive.
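
The Source Filter syntax and the effect of Number of Sorted Ports can be sketched in Python as plain string building. This is an illustration only: the function names, the ORDERS table, and its columns are hypothetical, and the exact text of the default query that the PowerCenter Integration Service generates is an assumption.

```python
def source_filter(table, field, operator, value):
    """Build a filter of the form <table name>."<field name>" <operator> <value>."""
    return f'{table}."{field}" {operator} {value}'


def default_query(table, ports, sorted_ports=0):
    """Build a SELECT over ports, ordered by the first sorted_ports of them.

    ports lists the column names from the top of the transformation down.
    Column names are enclosed in double quotes because the query is
    case sensitive.
    """
    if not 0 <= sorted_ports <= len(ports):
        raise ValueError("sorted_ports must be between 0 and the number of ports")
    quoted = [f'{table}."{port}"' for port in ports]
    query = f"SELECT {', '.join(quoted)} FROM {table}"
    if sorted_ports:
        # Only the first sorted_ports columns take part in the ORDER BY clause.
        query += " ORDER BY " + ", ".join(quoted[:sorted_ports])
    return query


print(source_filter("ORDERS", "STATUS", "=", "'OPEN'"))
print(default_query("ORDERS", ["ID", "STATUS", "AMOUNT"], sorted_ports=2))
```

With sorted_ports left at the default of 0, the sketch emits no ORDER BY clause, which mirrors the default described in the table.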

Importing Netezza Source Definitions

To create a Netezza source definition, use the Source Analyzer to import source metadata with the Netezza relational data source.

1. In the Source Analyzer, click Sources > Import from Database.

2. Select the Netezza data source used to connect to the source database.

If you need to create or modify a Netezza data source, click the Browse button to open the ODBC Administrator. Create the Netezza data source, and click OK. Select the new Netezza data source.

3. Enter a database user name and password to connect to the database.

Note: The user name must have the appropriate database permissions to view the object.

You may need to specify the owner name for database objects you want to use as sources.

4. Optionally, use the search field to limit the number of tables that appear.

5. Click Connect.

If no table names appear, or if the table you want to import does not appear, click All.

6. Scroll down through the list of sources to find the source you want to import. Select the relational object or objects you want to import.

You can hold down the Shift key to select a block of sources within one folder or hold down the Ctrl key to make non-consecutive selections within a folder. You can also select all tables within a folder by selecting the folder and clicking Select All. Use the Select None button to clear all highlighted selections.

7. Click OK.

The source definition appears in the Source Analyzer. In the Navigator, the new source definition appears in the Sources node of the active repository folder, under the source database name.

Importing Netezza Target Definitions

To create a Netezza target definition, use the Target Designer to import target metadata with the Netezza relational data source.

1. In the Target Designer, click Targets > Import from Database.

2. Select the Netezza data source used to connect to the target database.

If you need to create or modify a Netezza data source, click the Browse button to open the ODBC Administrator. Create the Netezza data source, and click OK. Select the new Netezza data source.

3. Enter the user name and password needed to open a connection to the database, and click Connect.

If you are not the owner of the table you want to use as a target, specify the owner name.

4. Drill down through the list of database objects to view the available tables as targets.

5. Select the relational table or tables to import the definitions into the repository.

You can hold down the Shift key to select a block of tables, or hold down the Ctrl key to make non-contiguous selections. You can also use the Select All and Select None buttons to select or clear all available targets.

6. Click OK.

The selected target definitions now appear in the Navigator under the Targets icon.

CHAPTER 4

Netezza Sessions and Workflows

This chapter includes the following topics:

• Configuring a Session with a Netezza Source

• Configuring a Session with a Netezza Target

• Updating Netezza Target Data

• Null Values and Empty Strings

• Configuring a Netezza Session for Optimal Performance

• Troubleshooting Netezza Sessions

Configuring a Session with a Netezza Source

You can configure the session properties for a Netezza source on the Mapping tab. Define the properties for each source instance in the session.

The following table describes the session properties you can configure for a Netezza source:

Attribute Name
    Description

Socket Buffer Size
    Set the socket buffer size to 25 to 50% of the DTM buffer size to increase session performance. You might need to test different settings for optimal performance. Enter a value between 4096 and 2147483648 bytes. Default is 8388608 bytes.

Pipe Directory Path
    Path for the PowerCenter Integration Service to create the pipe. If you do not specify the path, the PowerCenter Integration Service uses <PowerCenter Installation Directory>/server/bin to create the pipe. Required if the machine hosting the PowerCenter Integration Service is on HP-UX and the <PowerCenter Installation Directory>/server/bin directory is on an NFS-mounted directory. Enter a path that does not use an NFS mount.

Delimiter
    Delimiter that separates successive input fields. You can enter any value supported by the Netezza Performance Server. The value can be a part of the data for the Netezza source. Default is |.

NullValue
    NullValue parameter of an external table. The PowerCenter Integration Service uses the NullValue internally. Maximum length is one character. Default is blank.

EscapeCharacter
    Escape character of an external table. If the data contains NULL, CR, and LF characters in the Char or Varchar field, you need to escape these characters in the source data before extracting. Enter an escape character before the data. The supported escape character is backslash (\).
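
Two of the properties above lend themselves to a short Python sketch: sizing the socket buffer from the DTM buffer size, and escaping NULL, CR, and LF characters with a backslash. Both helpers are hypothetical and are not part of PowerCenter. The 32 MB DTM buffer size is a made-up sample value, and the sketch reads "escape these characters" as placing a backslash directly before each one, which is an assumption about the intended behavior.

```python
MIN_SOCKET_BUFFER = 4096          # lower bound from the table, in bytes
MAX_SOCKET_BUFFER = 2147483648    # upper bound from the table, in bytes


def socket_buffer_range(dtm_buffer_size):
    """Return (low, high): 25% and 50% of the DTM buffer size in bytes.

    Both values are clamped to the range the session property accepts.
    """
    def clamp(size):
        return max(MIN_SOCKET_BUFFER, min(MAX_SOCKET_BUFFER, size))

    return clamp(dtm_buffer_size // 4), clamp(dtm_buffer_size // 2)


ESCAPE_CHARACTER = "\\"


def escape_field(value):
    """Place a backslash before each NULL, CR, and LF character in value.

    Assumption: handling of backslashes already present in the data is
    not described in the guide, so they are left untouched here.
    """
    return "".join(
        ESCAPE_CHARACTER + ch if ch in "\x00\r\n" else ch for ch in value
    )


# Sample DTM buffer size of 32 MB (hypothetical value).
print(socket_buffer_range(32 * 1024 * 1024))
print(repr(escape_field("line one\nline two")))
```

The range is a starting point only; as the table notes, different settings might need to be tested for optimal performance.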

Note: You can view load statistics in the session log. The load summary in the Workflow Monitor does not display load statistics.

Configuring a Session with a Netezza TargetYou can configure target properties for a session that writes data to Netezza targets:

¨ Target database connection

¨ Target properties

¨ Update strategy

¨ Multiple targets referring to the same table

¨ Pipeline partitioning

Target Properties

You can configure the session properties for Netezza targets in the Transformations view on the Mapping tab. Define the properties for each target instance in the session.

The following table describes the target properties available on the Mapping tab:

Target Property | Description

Socket Buffer Size | Set the socket buffer size to 25 to 50% of the DTM buffer size to increase session performance. You might need to test different settings for optimal performance. Enter a value between 4096 and 2147483648 bytes. Default is 8388608 bytes.

Pipe Directory Path | Path for the PowerCenter Integration Service to create the pipe. If you do not specify the path, the PowerCenter Integration Service uses <PowerCenter Installation Directory>/server/bin to create the pipe. Required if the machine hosting the PowerCenter Integration Service is on HP-UX and the <PowerCenter Installation Directory>/server/bin directory is on an NFS-mounted directory. Enter a path that does not use an NFS mount.

Error Log Directory Name | Error log directory can reside on the machine where the PowerCenter Integration Service runs. For example, you can use $PMBadFileDir. By default, the PowerCenter Integration Service creates the error log in the /tmp directory on the machine hosting Netezza Performance Server. The PowerCenter Integration Service creates a bad file in the error log directory if the data is not valid.

Insert | PowerCenter Integration Service inserts all rows flagged for insert. Default is enabled.

Delete | PowerCenter Integration Service deletes all rows flagged for delete. Default is enabled.

Update (as Update) | PowerCenter Integration Service updates all rows flagged for update. Default is enabled.

Update (as Insert) | PowerCenter Integration Service inserts all rows flagged for update. Default is disabled.

Update (else Insert) | PowerCenter Integration Service updates the existing rows and inserts the remaining rows flagged as update. Default is disabled.


Truncate Target Table Option | PowerCenter Integration Service truncates the target before loading. Default is disabled.

Delimiter | Set the delimiter to any value supported by Netezza Performance Server. The delimiter separates successive input fields. The value must not be a part of the input data. Default is the pipe character (|).

Control Character | CTRLCHARS parameter of the external table to transfer data containing control characters. You can enter control characters for Char and Varchar fields. If you enter a control character, you must escape NULL, CR, and LF characters. Default is TRUE.

CRINSTRING | CRINSTRING parameter to transfer data containing carriage returns (CR). You can enter an unescaped CR in Char or Varchar fields. To load the control characters present in the Char and Varchar fields, set the CTRLCHARS and CRINSTRING parameters to TRUE in the session properties for the Netezza source. Default is TRUE.

NullValue | NullValue parameter of the external table. The PowerCenter Integration Service uses the NullValue internally. Maximum length is one character. Default is blank.

EscapeCharacter | Escape character of the external table. If the data contains NULL, CR, and LF characters in the Char or Varchar field, you need to escape these characters before loading. Enter a backslash (\) as the escape character.

Quoted Value | QUOTEDVALUE parameter of the external table. Select SINGLE or DOUBLE to enclose the field in single or double quotes. Select NO to omit quotes. Default is NO. The quoted value is not a part of the data.

Ignore Key Constraints | Ignores constraints on primary key fields. When you select this option, the PowerCenter Integration Service can write duplicate rows with the same primary key to the target. Default is disabled. The PowerCenter Integration Service ignores this value when the target operation is “update as update” or “update else insert.”

Duplicate Row Handling Mechanism | Determines how the PowerCenter Integration Service handles duplicate rows. Select one of the following values:
- First Row. The PowerCenter Integration Service passes the first row to the target and rejects the rows that follow with the same primary key.
- Last Row. The PowerCenter Integration Service passes the last duplicate row to the target and discards the rest of the rows.
Default is First Row.

Add Escape Character | Adds an escape character to all the special characters in the data. The special characters include \n, \r, \0, delimiter, escape character, and null value character. Ensure that the value of the escape character is entered in the EscapeCharacter attribute.
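The effect of the Add Escape Character property can be pictured with a small sketch. The function below is illustrative only and is not the PowerCenter implementation: the name add_escape_characters and its parameters are hypothetical, and it assumes a single-character delimiter and a single-character null value.

```python
def add_escape_characters(field, delimiter="|", escape="\\", null_value=""):
    """Prefix every special character in a field with the escape character.

    Hypothetical helper for illustration. The special characters follow the
    property description: \n, \r, \0, the delimiter, the escape character,
    and the null value character (skipped when NullValue is blank).
    """
    special = {"\n", "\r", "\0", delimiter, escape}
    if null_value:
        special.add(null_value)
    return "".join(escape + ch if ch in special else ch for ch in field)


# A field that contains the delimiter, a backslash, and a line feed.
escaped = add_escape_characters("a|b\\c\nd")
print(repr(escaped))
```

Fields without special characters pass through unchanged, which is one reason the performance section recommends avoiding escape characters when the data allows it.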

Unprojected Columns

When the PowerCenter Integration Service generates SQL to load to a Netezza target, it ignores target columns that are not connected in the mapping. If a default value is defined in Netezza for an unconnected column, Netezza updates or populates the column with the default value.

Pipeline Partitioning

You can increase the number of partitions in a pipeline to improve session performance. When you increase the number of partitions, the PowerCenter Integration Service can create multiple connections to sources and targets and process partitions of source and target data concurrently.


Netezza Performance Server divides data into data slices. In a partitioned session that reads data from Netezza, each partition reads a different data slice to prevent data duplication except in the following cases:

¨ You enter an SQL override query for a partition.

¨ You enter different values for the source filter across partitions.

¨ You enter different values for the user-defined join across partitions.

Rules and Guidelines for Pipeline Partitioning

Use the following rules and guidelines when you configure multiple partitions in a Netezza session:

¨ Set the partitioning type to pass-through for Netezza targets.

¨ Verify that the Delete and Update session properties on the Mapping tab are not enabled for more than one partition. You cannot perform multiple updates, multiple deletes, or update and delete simultaneously on a Netezza target.

¨ To avoid unpredictable session results, configure the following session properties to have the same value for each partition:

- Insert

- Delete

- Update

- Duplicate Row Handling Mechanism

¨ If you run a partitioned session that joins multiple sources, link the first column in the Application Source Qualifier to a source column from the Netezza table that has the best distribution in Netezza, that is, the table that is more uniformly distributed across Snippet Processing Units (SPUs) than the other tables.

¨ If you run a partitioned session with key constraints, only one partition shows load statistics.

Target Connection Groups

A target connection group is a group of targets that the PowerCenter Integration Service uses to determine commits and loading. When the PowerCenter Integration Service writes to Netezza, it commits data in the same transaction for all targets in a target connection group. When the PowerCenter Integration Service needs to perform a rollback, it rolls back all targets in the target connection group.

Netezza targets in the same target connection group must meet the following criteria:

¨ Belong to the same pipeline.

¨ Belong to the same partition.

¨ Have the same database connection name, user name, and password.
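The three criteria above amount to grouping targets by a composite key. The following sketch illustrates that idea under stated assumptions: the Target class, its fields, and connection_groups are hypothetical names, not PowerCenter objects or APIs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of one Netezza target instance."""
    name: str
    pipeline: int
    partition: int
    connection: str
    user: str
    password: str


def connection_groups(targets):
    """Group target names that share pipeline, partition, and connection details."""
    groups = defaultdict(list)
    for t in targets:
        key = (t.pipeline, t.partition, t.connection, t.user, t.password)
        groups[key].append(t.name)
    return list(groups.values())


targets = [
    Target("T1", 1, 1, "nz_conn", "etl", "secret"),
    Target("T2", 1, 1, "nz_conn", "etl", "secret"),
    Target("T3", 1, 2, "nz_conn", "etl", "secret"),  # different partition
]
print(connection_groups(targets))  # [['T1', 'T2'], ['T3']]
```

Targets that land in the same list would be committed and rolled back together. T3 sits alone because it belongs to a different partition.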

Use the following rules and guidelines when you configure multiple targets in a target connection group to write to the same Netezza target table:

Target Load Type | Target Options | Rules and Guidelines

Insert | Insert, Update as Insert | Select the Ignore Key Constraints target property for insert targets.

Update | Update as Update, Update else Insert | Use a maximum of one update table for any target. Do not use with delete tables.

Delete | Delete | Use a maximum of one delete table for any target. Do not use with update tables.

Configuring Multiple Targets for the Same Target Table

You can configure multiple targets to write to the same Netezza table, even if they are not in the same target connection group. When you configure targets in different partitions or pipelines to write to the same target table, use the same rules and guidelines as for target connection groups.

Updating Netezza Target Data

The PowerCenter Integration Service updates target rows based on the update options and the duplicate row handling.

Update As Insert

When you configure the session to update as insert rows and the source key value matches a target key value, the PowerCenter Integration Service inserts each target row. It inserts with the first or last source row matched, based on how you configure duplicate row handling.

This example uses the following data:

Source data: 1,a,1a1; 1,b,1b1; 1,a,1a2; 1,c,1c1; 1,d,1d1; 1,a,1a3
Target data: 1,c,1c1; 1,a,1a3
Updated target data: 1,a,1a1; 1,b,1b1; 1,c,1c1; 1,d,1d1; 1,a,1a3

In each row, the first and second values form the primary key. The session is configured to consider key constraints, and duplicate row handling is configured to update with the first source row.

The following table describes how the PowerCenter Integration Service updates the target:

Source Data | Target Data | Updated Target Data | Comment

1,a,1a1 | - | - | The source primary key is found in the target. The row is not inserted.

1,b,1b1 | - | 1,b,1b1 | Inserts 1,b,1b1.

1,a,1a2 | - | - | The source primary key is found in the target. The row is not inserted.

1,c,1c1 | 1,c,1c1 | 1,c,1c1 | Retains 1,c,1c1. No update required.

1,d,1d1 | - | 1,d,1d1 | Inserts 1,d,1d1.

1,a,1a3 | 1,a,1a3 | 1,a,1a3 | Retains 1,a,1a3. No update required.

Note: In each row, the first two values form the primary key. For example, in 1,a,1a1 the primary key is 1,a.
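The row-by-row walkthrough in the table can be reproduced with a short simulation. This is a sketch of the behavior the table describes, not the PowerCenter implementation: update_as_insert is a hypothetical name, rows are modeled as tuples, and only first-row duplicate handling with key constraints considered is covered.

```python
def update_as_insert(source, target, key_len=2):
    """Insert source rows whose primary key is not yet present in the target.

    Hypothetical simulation: the first key_len values of a row form the
    primary key, and the first source row seen for a new key wins.
    """
    result = list(target)
    seen = {row[:key_len] for row in target}
    for row in source:
        key = row[:key_len]
        if key in seen:
            continue  # key already in the target: the row is not inserted
        result.append(row)
        seen.add(key)
    return result


source = [("1", "a", "1a1"), ("1", "b", "1b1"), ("1", "a", "1a2"),
          ("1", "c", "1c1"), ("1", "d", "1d1"), ("1", "a", "1a3")]
target = [("1", "c", "1c1"), ("1", "a", "1a3")]
for row in update_as_insert(source, target):
    print(",".join(row))
```

The existing target rows are retained, and only the rows with the new keys 1,b and 1,d are added, which matches the Comment column of the table.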

Update Else Insert

When you configure the session to update else insert rows, the PowerCenter Integration Service uses the following process to update target rows:

¨ If the source key value matches a target key value, the PowerCenter Integration Service updates each target row. It updates with the first or last source row matched, based on how you configure duplicate row handling.

¨ If the source primary key value does not exist in the target, the PowerCenter Integration Service inserts the source row.

This example uses the following data:

Source data: 1,2; 1,3; 2,4; 2,5
Target data: 1,6; 1,8; 3,7
Updated target data: 1,2; 1,2; 2,4; 3,7

In the pair of values, the first value is the primary key.

The following table describes how the PowerCenter Integration Service updates the target:

Source Data | Target Data | Updated Target Data | Comment

1,2 | 1,6 | 1,2 | Updates 1,6 with 1,2. The source primary key is found in the target. The target row is updated based on duplicate row handling to use first row.

1,3 | 1,8 | 1,2 | Updates 1,8 with 1,2. Duplicate row handling is configured to update with first source row. Subsequent target rows with primary key “1” are updated with first source row.

2,4 | - | 2,4 | Inserts 2,4. The source primary key is not found in the target. The row is inserted.

2,5 | - | - | Drops 2,5. The source primary key is found in the target, and first duplicate row has been updated in the target.

- | 3,7 | 3,7 | Retains 3,7. No update required.

Note: In each row, the first value is the primary key. For example, in 1,2 the primary key is 1.
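The update else insert process can be sketched as a two-step simulation: pick one source row per key according to duplicate row handling, then update matching target rows and insert the rest. The function name update_else_insert and the keep parameter are hypothetical, and the code is an illustration of the described behavior rather than the actual PowerCenter logic.

```python
def update_else_insert(source, target, keep="first"):
    """Simulate update else insert on (key, value) rows.

    Hypothetical sketch: keep="first" mirrors First Row duplicate handling,
    keep="last" mirrors Last Row.
    """
    # Step 1: choose one source row per primary key.
    chosen = {}
    for key, value in source:
        if keep == "last" or key not in chosen:
            chosen[key] = value

    # Step 2: update every target row whose key matches, keep the others.
    target_keys = {key for key, _ in target}
    updated = [(key, chosen.get(key, value)) for key, value in target]

    # Step 3: insert chosen source rows whose key is not in the target.
    updated += [(key, value) for key, value in chosen.items()
                if key not in target_keys]
    return updated


source = [(1, 2), (1, 3), (2, 4), (2, 5)]
target = [(1, 6), (1, 8), (3, 7)]
print(sorted(update_else_insert(source, target)))
# [(1, 2), (1, 2), (2, 4), (3, 7)]
```

With keep="last" the same data would yield 1,3 for both target rows with key 1 and would insert 2,5 instead of 2,4.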


Null Values and Empty Strings

When you want to extract non-null values from a Netezza source, the PowerCenter Integration Service also extracts empty strings. These values may appear as null values in the target.

Configuring a Netezza Session for Optimal Performance

You can increase the performance of PowerExchange for Netezza by setting the properties in the session. Set the following parameters to increase the session performance:

¨ Default buffer block size. You can increase or decrease the number of available memory blocks that are used to hold the source and target data in the session.

¨ Line sequential buffer length. You can improve the session performance by setting the number of bytes the PowerCenter Integration Service reads per line.

¨ Commit interval. You can increase or decrease the value of commit interval to determine the point at which the PowerCenter Integration Service commits data to the target.

¨ DTM buffer size. You can increase or decrease the value of DTM Buffer Size to specify the amount of memory the PowerCenter Integration Service uses as DTM buffer memory.

¨ Socket buffer size. You can configure the socket buffer size to specify the size of the buffers used to extract data from and load data to Netezza.

¨ Escape characters. You can improve session performance by avoiding the use of escape characters in a session.

¨ Ignore key constraints. You can improve session performance by ignoring key constraints when writing to Netezza targets. Since Netezza does not enforce key constraints, the PowerCenter Integration Service performs additional processing when a session that writes to Netezza requires key constraints.

For example, to obtain optimal performance for 3 million rows and 32 KB row size, set the following parameters:

¨ Default buffer block size: 1,280,000

¨ Line sequential buffer length: 202,400

¨ Commit interval: 200,000

¨ DTM buffer size: 28,000,000

¨ Socket buffer size: 8388608 bytes

¨ Escape characters: None

¨ Ignore key constraints: Selected
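The socket buffer size in this example can be checked against the 25 to 50% guideline from the session property tables. The sketch below is a convenience calculation only: the function names are hypothetical, and it assumes the DTM buffer size is expressed in bytes.

```python
MIN_SOCKET_BUFFER = 4096          # lower bound from the property table, in bytes
MAX_SOCKET_BUFFER = 2147483648    # upper bound from the property table, in bytes


def socket_buffer_range(dtm_buffer_size):
    """Return the (low, high) socket buffer sizes for 25 to 50% of the DTM buffer."""
    low = max(MIN_SOCKET_BUFFER, dtm_buffer_size // 4)
    high = min(MAX_SOCKET_BUFFER, dtm_buffer_size // 2)
    return low, high


def is_recommended(socket_buffer_size, dtm_buffer_size):
    """Check whether a socket buffer size falls inside the guideline range."""
    low, high = socket_buffer_range(dtm_buffer_size)
    return low <= socket_buffer_size <= high


print(socket_buffer_range(28_000_000))       # (7000000, 14000000)
print(is_recommended(8388608, 28_000_000))   # True
```

The default socket buffer size of 8388608 bytes is about 30% of a 28,000,000-byte DTM buffer, so the example values are consistent with the guideline.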

Using a Netezza Distribution Key

Use a Netezza distribution key to increase session performance with parallel processing. Netezza uses a distribution key to distribute data for processing. By default, the distribution key is the first column of a table.

You can configure the distribution key to include up to four columns in a database table. When you configure a distribution key to evenly distribute data across available data slices, you can greatly increase session performance. For more information, see the Netezza documentation.
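To see why the choice of distribution key matters, the following sketch counts how many rows would land on each data slice for a candidate key column. It is a rough illustration under stated assumptions: rows_per_slice is a hypothetical helper, and CRC32 stands in for the hash function that Netezza actually uses, which is not described in this guide.

```python
import zlib
from collections import Counter


def rows_per_slice(key_values, num_slices):
    """Count rows per data slice when rows are hashed on the given key values.

    CRC32 is only a stand-in hash used for illustration.
    """
    counts = Counter(zlib.crc32(str(v).encode()) % num_slices for v in key_values)
    return [counts.get(s, 0) for s in range(num_slices)]


# A unique id spreads rows over all slices.
print(rows_per_slice(range(1000), 4))

# A two-valued column can occupy at most two slices, leaving the rest idle.
print(rows_per_slice(["east", "west"] * 500, 4))
```

A key with many distinct values tends to give each slice a similar share of the rows, while a low-cardinality key concentrates the work on a few slices.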


Troubleshooting Netezza Sessions

A Netezza session stops responding with no definite error messages in the logs

A Netezza session can stop responding because of the following reasons:

¨ Source data contains special characters like delimiter.
A Netezza Reader session can stop responding if the source contains special characters such as the delimiter. Add escape characters in the session to escape the delimiters. This is a Netezza issue and the reference number is SWS-40577.

¨ Using Netezza drivers 3.1.2/3.1.4.
Multi-pipe and multi-partition sessions can stop responding randomly with Netezza drivers 3.1.2/3.1.4 on HP-UX. Use the Netezza driver 4.04 P2 to avoid this issue.

¨ Pipe Directory Path on HP-UX is on an NFS mounted drive.
If the Pipe Directory Path on HP-UX is on an NFS mounted drive, Netezza writer sessions can stop responding or terminate unexpectedly. Specify a non-NFS mounted drive (for example, /tmp) in the Pipe Directory Path of the Netezza Writer session property.

¨ Incorrectly set environment variables.
Check whether the environment variables PATH, LIBPATH, ODBCINI, and NZ_ODBC_INI_PATH are set correctly.

¨ Incorrect permissions set for file paths set in session properties.
Ensure that all the file paths in the session properties set for Netezza reader and writer sessions are correct and have the proper permissions. Pay particular attention to the properties that require a directory path.

If the issue persists, you can try killing the blocking Netezza sessions or disabling Netezza ODBC tracing and ODBC tracing.

How can I kill blocking Netezza sessions?

Use the nzsession utility, which comes with client tools, to kill blocking Netezza sessions.

Run the following command to view the active Netezza sessions:

nzsession show -host <hostname> -u <user> -pw <password> -maxColW <column width> | grep -i "active"

Run the following command to kill active sessions:

nzsession abort -host <hostname> -u <user> -pw <password> -id <session id> [-force]

How can I enable or disable Netezza ODBC tracing?

In the odbcinst.ini file, set the debugLogging parameter to true to enable Netezza ODBC tracing and to false to disable it.

How can I enable or disable ODBC tracing?

In the odbc.ini file, set the Trace parameter to 1 to enable ODBC tracing and to 0 to disable it.


APPENDIX A

Datatype Reference

This appendix includes the following topic:

¨ Netezza and Transformation Datatypes

Netezza and Transformation Datatypes

PowerCenter uses the following datatypes in Netezza mappings:

¨ Netezza native datatypes. Netezza datatypes appear in Netezza definitions in a mapping.

¨ Transformation datatypes. Set of datatypes that appear in the transformations. They are internal datatypes based on ANSI SQL-92 generic datatypes, which the PowerCenter Integration Service uses to move data across platforms. They appear in all transformations in a mapping.

When the PowerCenter Integration Service reads source data, it converts the native datatypes to the comparable transformation datatypes before transforming the data. When the PowerCenter Integration Service writes to a target, it converts the transformation datatypes to the comparable native datatypes.

The following table lists the Netezza datatypes that PowerCenter supports and the corresponding transformation datatypes:

Netezza Datatype | Range | Transformation Datatype | Range

BigInt | Precision 19, scale 0 | Bigint | From -9,223,372,036,854,775,808 through 9,223,372,036,854,775,807. Precision of 19, scale of 0. Integer value.

Bool | True or false, on or off, 0 or 1, yes or no. | String | Precision 1

ByteInt | Precision 3, scale 0 | Small Integer | Precision 5, scale 0

Char | Single character | String | From 1 through 104,857,600 characters

Date | ANSI SQL date | Date/Time | Jan. 1, 0001 A.D. to Dec. 31, 9999 A.D. (precision to the nanosecond)

Float8 | Precision 15 | Double | Precision 15

Float4 | Precision 6, scale 0 | Double | Precision 15

Integer | Precision 10, scale 0 | Integer | Precision 10, scale 0

NChar(m) | Single character. Used for storing UTF-8 data. | String | From 1 through 104,857,600 characters

NVarchar(m) | BVarchar (length). Non-blank-padded string, variable storage length. Used for storing UTF-8 data. | String | From 1 through 104,857,600 characters

Numeric | Numeric (precision, decimal), arbitrary precision number. Precision must be between 1 and 38. | Decimal | Precision from 1 through 28 digits, scale from 0 through 28

Real | Precision 6, scale 0 | Real | Precision of 7, scale of 0. Double-precision floating-point numeric value.

SmallInt | Precision 5, scale 0 | Small Integer | Precision 5, scale 0

Time | hh:mm:ss. ANSI SQL time. | Date/Time | Jan. 1, 0001 A.D. to Dec. 31, 9999 A.D. (precision to the nanosecond)

Timestamp | Precision 26, scale 6 | Date/Time | Jan. 1, 0001 A.D. to Dec. 31, 9999 A.D. (precision to the nanosecond)

Varchar | Varchar (length). Non-blank-padded string, variable storage length. | String | From 1 through 104,857,600 characters

