PC_811_XMLGuide

  • 8/7/2019 PC_811_XMLGuide

    XML Guide

    Informatica PowerCenter(Version 8.1.1)

    Informatica PowerCenter XML Guide

    Version 8.1.1

    September 2006

    Copyright (c) 1998-2006 Informatica Corporation. All rights reserved. Printed in the USA.

    This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation.

    Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.

    The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. Informatica Corporation does not warrant that this documentation is error free.

    Informatica, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerMart, SuperGlue, Metadata Manager, Informatica Data Quality and Informatica Data Explorer are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

    Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies, 1999-2002. All rights reserved. Copyright Sun Microsystems. All Rights Reserved. Copyright RSA Security Inc. All Rights Reserved. Copyright Ordinal Technology Corp. All Rights Reserved.

    Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University and University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.

    Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU Lesser General Public License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials are provided free of charge by Informatica, as-is, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.

    Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration is a registered trademark of Meta Integration Technology, Inc.

    This product includes software developed by the Apache Software Foundation (http://www.apache.org/). The Apache Software is Copyright (c) 1999-2005 The Apache Software Foundation. All rights reserved.

    This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit and redistribution of this software is subject to terms available at http://www.openssl.org. Copyright 1998-2003 The OpenSSL Project. All Rights Reserved.

    The zlib library included with this software is Copyright (c) 1995-2003 Jean-loup Gailly and Mark Adler.

    The Curl license provided with this Software is Copyright 1996-2004, Daniel Stenberg. All Rights Reserved.

    The PCRE library included with this software is Copyright (c) 1997-2001 University of Cambridge. Regular expression support is provided by the PCRE library package, which is open source software, written by Philip Hazel. The source for this library may be found at ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre.

    InstallAnywhere is Copyright 2005 Zero G Software, Inc. All Rights Reserved.

    Portions of the Software are Copyright (c) 1998-2005 The OpenLDAP Foundation. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted only as authorized by the OpenLDAP Public License, available at http://www.openldap.org/software/release/license.html.

    This Software is protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775 and other U.S. Patents Pending.

    DISCLAIMER: Informatica Corporation provides this documentation as is without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information provided in this documentation may include technical inaccuracies or typographical errors. Informatica could make improvements and/or changes in the products described in this documentation at any time without notice.

    Table of Contents

    List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

    List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

    Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii

    About This Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv

    Document Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .xiv

    Other Informatica Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

    Visiting Informatica Customer Portal . . . . . . . . . . . . . . . . . . . . . . . . . . xv

    Visiting the Informatica Web Site . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

    Visiting the Informatica Developer Network . . . . . . . . . . . . . . . . . . . . . xv

    Visiting the Informatica Knowledge Base . . . . . . . . . . . . . . . . . . . . . . . . xv

    Obtaining Technical Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

    Chapter 1: XML Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1

    Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

    XML Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    Validating XML Files with a DTD or Schema . . . . . . . . . . . . . . . . . . . . . 5

    DTD Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    DTD Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    DTD Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

    XML Schema Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

    Types of XML Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

    Namespace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

    Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

    Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

    Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

    Absolute Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

    Relative Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

    Simple and Complex XML Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

    Simple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

    Complex Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

    Any Type Elements and Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

    anyType Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

    anySimpleType Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

    ANY Content Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

    AnyAttribute Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .22

    Component Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

    Element and Attribute Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

    Substitution Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

    XML Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

    Code Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

    Chapter 2: Using XML with PowerCenter . . . . . . . . . . . . . . . . . . . . . . 29

    Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

    Importing XML Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

    Importing Metadata from an XML File . . . . . . . . . . . . . . . . . . . . . . . . . 31

    Importing Metadata from a DTD File . . . . . . . . . . . . . . . . . . . . . . . . . . 33

    Importing Metadata from an XML Schema . . . . . . . . . . . . . . . . . . . . . . 34

    Creating Metadata from Relational Definitions . . . . . . . . . . . . . . . . . . . 36

    Creating Metadata from Flat Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

    Working with XML Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

    Creating Custom XML Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

    Rules and Guidelines for XML Views . . . . . . . . . . . . . . . . . . . . . . . . . . 39

    Generating Hierarchical Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

    Generating a Normalized View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

    Generating a Denormalized View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    Generating Entity Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

    Rules and Guidelines for Entity Relationships . . . . . . . . . . . . . . . . . . . . 44

    Using Entity Relationships in an XML Definition . . . . . . . . . . . . . . . . . 45

    Creating a Type Relationship Between a Column and View . . . . . . . . . . . 48

    Using Substitution Groups in an XML Definition . . . . . . . . . . . . . . . . . 49

    Working with Circular References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

    Understanding View Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

    Using XPath Query Predicates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

    Rules and Guidelines for Using View Rows . . . . . . . . . . . . . . . . . . . . . . 54

    Pivoting Columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

    Using Multiple-Level Pivots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

    Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

    Chapter 3: Working with XML Sources . . . . . . . . . . . . . . . . . . . . . . . . 59

    Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

    Importing an XML Source Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

    Multi-line Attributes Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

    Working with XML Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

    Importing Part of an XML Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

    Generating Entity Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

    Generating Hierarchy Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

    Creating Custom XML Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

    Selecting Root Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

    Reducing Metadata Explosion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

    Synchronizing XML Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

    Editing XML Source Definition Properties . . . . . . . . . . . . . . . . . . . . . . . . . 76

    Creating XML Definitions from Repository Definitions . . . . . . . . . . . . . . . . 79

    Troubleshooting XML Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

    Chapter 4: Using the XML Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

    Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

    XML Navigator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

    XML Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

    Columns Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

    Creating and Editing Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

    Creating an XML View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

    Adding Columns to a View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

    Deleting Columns from a View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

    Expanding a Complex Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

    Applying Content to anyAttribute or anyType Elements . . . . . . . . . . . . . 92

    Using anySimpleType in the XML Editor . . . . . . . . . . . . . . . . . . . . . . . 94

    Adding a Pass-Through Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

    Adding a File Name Column . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

    Creating an XPath Query Predicate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

    Querying the Value of an Element or Attribute . . . . . . . . . . . . . . . . . . . 96

    Testing for Elements or Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

    XPath Query Predicate Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . 97

    Steps for Creating an XPath Query Predicate . . . . . . . . . . . . . . . . . . . . . 98

    Maintaining View Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

    Creating a Relationship Between Views . . . . . . . . . . . . . . . . . . . . . . . . 102

    Creating a Type Relationship . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

    Recreating Entity Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

    Viewing Schema Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

    Updating the Namespace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

    Navigating to Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

    Searching for Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

    Viewing a Simple or Complex Type Hierarchy . . . . . . . . . . . . . . . . . . . 110

    Viewing XML Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

    Previewing XML Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

    Validating XML Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

    Setting XML View Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

    Generating All Hierarchy Foreign Keys . . . . . . . . . . . . . . . . . . . . . . . . 113

    Generating Rows in Circular Relationships . . . . . . . . . . . . . . . . . . . . . 115

    Generating Hierarchy Relationship Rows . . . . . . . . . . . . . . . . . . . . . . . 116

    Setting the Force Row Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

    Generating Rows for Views in Type Relationships . . . . . . . . . . . . . . . . 119

    Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

    Chapter 5: Working with XML Targets. . . . . . . . . . . . . . . . . . . . . . . . 123

    Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

    Importing an XML Target Definition from an XML File . . . . . . . . . . . . . . . 125

    Creating a Target from an XML Source Definition . . . . . . . . . . . . . . . . . . . 126

    Editing XML Target Definition Properties . . . . . . . . . . . . . . . . . . . . . . . . . 127

    Validating XML Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

    Hierarchy Relationship Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

    Type Relationship Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

    Inheritance Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

    Using an XML Target in a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

    Active Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

    Selecting a Root Element . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

    Connecting Target Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

    Connecting Abstract Elements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

    Flushing XML Data to Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136

    Naming XML Files Dynamically . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136

    Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

    Chapter 6: XML Source Qualifier Transformation. . . . . . . . . . . . . . . 141

    Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

    Adding an XML Source Qualifier to a Mapping . . . . . . . . . . . . . . . . . . . . . 143

    Creating an XML Source Qualifier Transformation by Default . . . . . . . 143

    Creating an XML Source Qualifier Transformation Manually . . . . . . . . 143

    Editing an XML Source Qualifier Transformation . . . . . . . . . . . . . . . . . . . 145

    Setting Sequence Numbers for Generated Keys . . . . . . . . . . . . . . . . . . 148

    Using the XML Source Qualifier in a Mapping . . . . . . . . . . . . . . . . . . . . . 150

    XML Source Qualifier Transformation Example . . . . . . . . . . . . . . . . . . 152

    Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

    Chapter 7: Midstream XML Transformations . . . . . . . . . . . . . . . . . . 157

    Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

    XML Parser Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

    XML Generator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

    Creating a Midstream XML Transformation . . . . . . . . . . . . . . . . . . . . . . . . 163

    Synchronizing a Midstream XML Definition . . . . . . . . . . . . . . . . . . . . . . . 164

    Editing Midstream XML Transformation Properties . . . . . . . . . . . . . . . . . . 165

    Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

    Midstream XML Parser Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

    Midstream XML Generator Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

    Generating Pass-Through Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170

    Chapter 8: Working with XML Sessions . . . . . . . . . . . . . . . . . . . . . . 173

    Working with XML Sources in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . 174

    Server Handling for XML Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

    Working with XML Targets in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . 177

    Server Handling for XML Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

    Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

    Special Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180

    Null and Empty Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180

    Handling Duplicate Group Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181

    DTD and Schema Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

    Flushing XML on Commits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

    XML Caching Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

    Session Logs for XML Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

    Multiple XML Document Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

    Partitioning the XML Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

    Working with Midstream XML Transformations . . . . . . . . . . . . . . . . . . . . . 187

    Appendix A: XML Datatype Reference . . . . . . . . . . . . . . . . . . . . . . . 191

    XML and Transformation Datatypes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192

    Unsupported Datatypes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

    XML Date Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

    Appendix B: XPath Query Functions Reference . . . . . . . . . . . . . . . 195

    Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196

    Function Quick Reference. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

    boolean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

    ceiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

    concat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

    contains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202

    false . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

    floor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

    lang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

    normalize-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206

    not . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207

    number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208

    round . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

    starts-with . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

    string . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

    string-length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213

    substring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214

    substring-after . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216

    substring-before . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

    translate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

    true . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

    Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221

    List of Figures

    Figure 1-1. Sample XML File: StoreInfo.xml . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    Figure 1-2. Elements in the XML Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

    Figure 1-3. Sample DTD: StoreInfo.dtd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    Figure 1-4. XML Schema Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

    Figure 1-5. XML Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

    Figure 1-6. Relative Cardinality of Elements in StoreInfo.xml . . . . . . . . . . . . . . . . . . . . . . . . 14

    Figure 1-7. Union Type Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

    Figure 1-8. Derived Complex Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

    Figure 1-9. XPath of Elements and Attributes in the StoreInfo.xml . . . . . . . . . . . . . . . . . . . . 27

    Figure 2-1. Sample Employees XML File with Multiple-Occurring Elements . . . . . . . . . . . . . 32

    Figure 2-2. Root Element and XML Views in an XML Definition . . . . . . . . . . . . . . . . . . . . . 33

    Figure 2-3. XML Definition from an XML File Referencing a DTD File . . . . . . . . . . . . . . . . 34

    Figure 2-4. XML Schema with Derived Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

    Figure 2-5. XML Definition Containing Derived Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

    Figure 2-6. XML Target from Two Relational Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

    Figure 2-7. XML Source Definition from Two Flat File Sources . . . . . . . . . . . . . . . . . . . . . . . 37

    Figure 2-8. DTD Elements to Create Normalized Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

    Figure 2-9. XML Source Definition with Normalized Views . . . . . . . . . . . . . . . . . . . . . . . . . 41

    Figure 2-10. Normalized Views Data Preview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

    Figure 2-11. Sample XML File for a Denormalized View . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    Figure 2-12. Source Definition Containing a Denormalized View . . . . . . . . . . . . . . . . . . . . . 42

    Figure 2-13. Data Preview for the ProdAndSales.xml Denormalized View . . . . . . . . . . . . . . . . 43

    Figure 2-14. Complex Type View Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

    Figure 2-15. XML Complex Types Sample Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

    Figure 2-16. Column and Type View Relationship . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

    Figure 2-17. XML Schema Using Substitution Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

    Figure 2-18. XML Definition with Substitution Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

    Figure 2-19. Circular Reference View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

    Figure 2-20. Circular Reference Data Preview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

    Figure 2-21. Element Occurrences Pivoted into Columns . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

    Figure 3-1. XML Wizard Options to Generate Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

    Figure 3-2. Root Selection Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

    Figure 3-3. Reduce Metadata Explosions Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

    Figure 4-1. XML Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

    Figure 4-2. Any Content Element in the Schema Navigator . . . . . . . . . . . . . . . . . . . . . . . . . . 93

    Figure 4-3. XML Definition with Multiple Foreign Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

    Figure 4-4. Sample Data for All Hierarchy Foreign Keys Option . . . . . . . . . . . . . . . . . . . . . 114

    Figure 4-5. Sample Data for One Foreign Key . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

    Figure 4-6. Views in a Hierarchical Relationship . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

    Figure 4-7. Hierarchy Relationship Row Example - Default Views . . . . . . . . . . . . . . . . . . . . 117

    Figure 4-8. Hierarchy Relationship Row Option Example . . . . . . . . . . . . . . . . . . . . . . . . . . .117

    Figure 4-9. Address View with Zip as a View Row . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

    Figure 5-1. FileName Column in a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .137

    Figure 6-1. XML Source Qualifier Transformation Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

    Figure 6-2. XML Source Qualifier Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .147

    Figure 6-3. Linking XML Source Qualifier to Single Input Group Transformation . . . . . . . . . 150

    Figure 6-4. Linking XML Source Qualifier to Multiple Input Group Transformations . . . . . . 151

    Figure 6-5. Sample XML File StoreInfo.xml . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

    Figure 6-6. Invalid Use of XML Source Qualifier Transformation in Aggregator Mapping . . . . 153

    Figure 6-7. Using a Denormalized Group in a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

    Figure 6-8. Using an XML Source Definition Twice in a Mapping . . . . . . . . . . . . . . . . . . . . .155

    Figure 7-1. XML Parser Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .160

    Figure 7-2. XML Generator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .161

    Figure 7-3. Sample XML Generator Tran sformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .162

    Figure 7-4. Midstream XML Parser Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .166

    Figure 7-5. Midstream XML Parser Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .167

    Figure 7-6. Midstream XML Generator Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .169

    Figure 7-7. Pa ss-Through Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .172

    Figure 8-1. Properties Settings for an XML Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .174

    Figure 8-2. Properties Settings for an XML Writer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .177Figure 8-3. Mapping Data to an XML Root . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .185

    Figure 8-4. Properties Settings for an XML Gene rator Transformation . . . . . . . . . . . . . . . . . .187

    Figure 8-5. Properties Settings for an XML Pars er Transformation . . . . . . . . . . . . . . . . . . . . .189

    List of Tables

    Table 1-1. Namespace Declaration Components . . . . . 11

    Table 1-2. Cardinality of Elements in XML . . . . . 12

    Table 3-1. Create XML Views Options . . . . . 63

    Table 4-1. XPath Query Predicate Operators . . . . . 100

    Table 6-1. XML Source Qualifier Properties . . . . . 147

    Table 7-1. Midstream XML Properties . . . . . 166

    Table 7-2. Midstream XML Parser Settings . . . . . 168

    Table 7-3. Midstream XML Generator Settings . . . . . 169

    Table 8-1. XML Reader Session Options . . . . . 174

    Table 8-2. XML Source Qualifier Options . . . . . 175

    Table 8-3. XML Writer Options . . . . . 178

    Table 8-4. Null and Empty String Output for XML Targets . . . . . 181

    Table 8-5. XML Generator Transformation Session Properties . . . . . 187

    Table 8-6. XML Parser Transformation Session Options . . . . . 189

    Table A-1. XML and Transformation Datatypes . . . . . 192

    Table B-1. XPath Query Predicate String Functions . . . . . 197

    Table B-2. XPath Query Predicate Number Functions . . . . . 198

    Table B-3. XPath Query Predicate Boolean Functions . . . . . 198

    Preface

    Welcome to PowerCenter, the Informatica software product that delivers an open, scalable data integration solution addressing the complete life cycle for all data integration projects including data warehouses, data migration, data synchronization, and information hubs. PowerCenter combines the latest technology enhancements for reliably managing data repositories and delivering information resources in a timely, usable, and efficient manner.

    The PowerCenter repository coordinates and drives a variety of core functions, including extracting, transforming, loading, and managing data. The Integration Service can extract large volumes of data from multiple platforms, handle complex transformations on the data, and support high-speed loads. PowerCenter can simplify and accelerate the process of building a comprehensive data warehouse from disparate data sources.

    About This Book

    The XML Guide is written for developers and software engineers responsible for working with XML in a data warehouse environment. Before you use the XML Guide, ensure that you have a solid understanding of XML concepts, your operating systems, flat files, or mainframe system in your environment. Also, ensure that you are familiar with the interface requirements for your supporting applications.

    The material in this book is also available online.

    Document Conventions

    This guide uses the following formatting conventions:

    If you see                    It means
    italicized text               The word or set of words are especially emphasized.
    boldfaced text                Emphasized subjects.
    italicized monospaced text    This is the variable name for a value you enter as part of an operating system command. This is generic text that should be replaced with user-supplied values.
    Note:                         The following paragraph provides additional facts.
    Tip:                          The following paragraph provides suggested uses.
    Warning:                      The following paragraph notes situations where you can overwrite or corrupt data, unless you follow the specified procedure.
    monospaced text               This is a code example.
    bold monospaced text          This is an operating system command you enter from a prompt to run a task.

    Other Informatica Resources

    In addition to the product manuals, Informatica provides these other resources:

    Informatica Customer Portal

    Informatica web site

    Informatica Developer Network

    Informatica Knowledge Base

    Informatica Technical Support

    Visiting Informatica Customer Portal

    As an Informatica customer, you can access the Informatica Customer Portal site at http://my.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica Knowledge Base, Informatica Documentation Center, and access to the Informatica user community.

    Visiting the Informatica Web Site

    You can access the Informatica corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and sales offices. You will also find product and partner information. The services area of the site includes important information about technical support, training and education, and implementation services.

    Visiting the Informatica Developer Network

    You can access the Informatica Developer Network at http://devnet.informatica.com. The Informatica Developer Network is a web-based forum for third-party software developers. The site contains information about how to create, market, and support customer-oriented add-on solutions based on interoperability interfaces for Informatica products.

    Visiting the Informatica Knowledge Base

    As an Informatica customer, you can access the Informatica Knowledge Base at http://my.informatica.com. Use the Knowledge Base to search for documented solutions to known technical issues about Informatica products. You can also find answers to frequently asked questions, technical white papers, and technical tips.

    Obtaining Technical Support

    There are many ways to access Informatica Technical Support. You can contact a Technical Support Center by using the telephone numbers listed in the following table, you can send email, or you can use the WebSupport Service.

    Use the following email addresses to contact Informatica Technical Support:

    [email protected] for technical inquiries

    [email protected] for general customer service requests

    WebSupport requires a user name and password. You can request a user name and password at http://my.informatica.com.

    North America / South America

    Informatica Corporation Headquarters
    100 Cardinal Way
    Redwood City, California 94063
    United States

    Toll Free: 877 463 2435
    Standard Rate: United States: 650 385 5800

    Europe / Middle East / Africa

    Informatica Software Ltd.
    6 Waltham Park
    Waltham Road, White Waltham
    Maidenhead, Berkshire SL6 3TN
    United Kingdom

    Toll Free: 00 800 4632 4357
    Standard Rate:
    Belgium: +32 15 281 702
    France: +33 1 41 38 92 26
    Germany: +49 1805 702 702
    Netherlands: +31 306 022 797
    United Kingdom: +44 1628 511 445

    Asia / Australia

    Informatica Business Solutions Pvt. Ltd.
    Diamond District
    Tower B, 3rd Floor
    150 Airport Road
    Bangalore 560 008
    India

    Toll Free:
    Australia: 00 11 800 4632 4357
    Singapore: 001 800 4632 4357
    Standard Rate: India: +91 80 4112 5738

    Chapter 1

    XML Concepts

    This chapter includes the following topics:

    Overview, 2

    XML Files, 3

    DTD Files, 7

    XML Schema Files, 9

    Types of XML Metadata, 10

    Cardinality, 12

    Simple and Complex XML Types, 15

    Any Type Elements and Attributes, 20

    Component Groups, 24

    XML Path, 27

    Code Pages, 28

    Overview

    Extensible Markup Language (XML) is a flexible way to create common information formats and to share the formats and data between applications and on the internet.

    You can import XML definitions into PowerCenter from the following types of files:

    XML file. An XML file contains data and metadata. An XML file can reference a Document Type Definition file (DTD) or an XML schema definition (XSD) for validation.

    DTD file. A DTD file defines the element types, attributes, and entities in an XML file. A DTD file provides some constraints on the XML file structure but a DTD file does not contain any data.

    XML schema. An XML schema defines elements, attributes, and type definitions. Schemas contain simple and complex types. A simple type is an XML element or attribute that contains text. A complex type is an XML element that contains other elements and attributes.

    Schemas support element, attribute, and substitution groups that you can reference throughout a schema. Use substitution groups to substitute one element with another in an XML instance document. Schemas also support inheritance for elements, complex types, and element and attribute groups.

    XML Files

    XML files contain tags that identify data in the XML file, but not the format of the data. The basic component of an XML file is an element. An XML element includes an element start tag, element content, and element end tag. All XML files must have a root element defined by a single tag at the top and bottom of the file. The root element encloses all the other elements in the file.

    An XML file models a hierarchical database. The position of an element in an XML hierarchy represents its relationships to other elements. An element can contain child elements, and elements can inherit characteristics from other elements.

    For example, the following XML file describes a book:

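    <book>
        <title>Fun with XML</title>
        <chapter>
            <heading>Understanding XML</heading>
            <heading>Using XML</heading>
        </chapter>
        <chapter>
            <heading>Using DTD Files</heading>
            <heading>Fun with Schemas</heading>
        </chapter>
    </book>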

    Book is the root element and it contains the title and chapter elements. Book is the parent element of title and chapter, and chapter is the parent of heading. Title and chapter are sibling elements because they have the same parent.

    An element can have attributes that provide additional information about the element. In the following example, the attribute graphic_type describes the content of picture:

    computer.gif
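
    As a minimal sketch, an attribute like this can be read with Python's standard xml.etree.ElementTree module. The attribute value "gif" is an assumption made for illustration; the text above names only the attribute graphic_type and the element content computer.gif.

```python
import xml.etree.ElementTree as ET

# Hypothetical picture element; the attribute value "gif" is assumed.
PICTURE_XML = '<picture graphic_type="gif">computer.gif</picture>'

picture = ET.fromstring(PICTURE_XML)

# The attribute describes the element; the text is the element content.
graphic_type = picture.get("graphic_type")
content = picture.text

print(picture.tag, graphic_type, content)
```

    The attribute and the element content are reached through different calls, which mirrors the distinction the text draws between element data and the information that describes it.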

    Figure 1-1 shows the structure, elements, and attributes in an XML file:

    An XML file has a hierarchical structure. An XML hierarchy includes the following elements:

    Child element. An element contained within another element.

    Enclosure element. An element that contains other elements but does not contain data.An enclosure element can include other enclosure elements.

    Global element. An element that is a direct child of the root element. You can referenceglobal elements throughout an XML schema.

    Leaf element. An element that does not contain other elements. A leaf element is thelowest level element in the XML hierarchy.

    Local element. An element that is nested in another element. You can reference localelements only within the context of the parent element.

    Multiple-occurring element. An element that occurs more than once within its parent element. Enclosure elements can be multiple-occurring elements.

    Figure 1-1. Sample XML File: StoreInfo.xml (callouts: Root Element, Enclosure Element, Element Tags, Element Data, Attribute Tag, Attribute Value)

    Parent chain. The succession of child-parent elements that traces the path from an element to the root.

    Parent element. An element that contains other elements.

    Single-occurring element. An element that occurs once within its parent.
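
    The element kinds listed above can be sketched in Python with the standard xml.etree.ElementTree module. The store document below is hypothetical; it only borrows element names used in this chapter, and the helper names describe and parent_chain are illustrative, not part of PowerCenter.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical store document, loosely modeled on element names from this chapter.
STORE_XML = """
<Store>
  <Address>
    <StreetAddress>1 Main St</StreetAddress>
    <City>Springfield</City>
  </Address>
  <Product>
    <PName>Widget</PName>
    <Sales region="east">100</Sales>
    <Sales region="west">250</Sales>
  </Product>
</Store>
"""

root = ET.fromstring(STORE_XML)


def describe(parent):
    """Return {child tag: set of labels} for the direct children of parent."""
    counts = Counter(child.tag for child in parent)
    labels = {}
    for child in parent:
        kinds = set()
        if len(child) == 0:
            # Leaf: contains no other elements.
            kinds.add("leaf")
        elif not (child.text or "").strip():
            # Enclosure: contains other elements but no data of its own.
            kinds.add("enclosure")
        kinds.add("multiple-occurring" if counts[child.tag] > 1 else "single-occurring")
        labels[child.tag] = kinds
    return labels


def parent_chain(root, target_tag):
    """Return the tags from the first element named target_tag up to the root."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = next(root.iter(target_tag))
    chain = [node.tag]
    while node in parent_of:
        node = parent_of[node]
        chain.append(node.tag)
    return chain


print(describe(root.find("Product")))
print(parent_chain(root, "PName"))
```

    Occurrence is counted per parent, so an element that is single-occurring within its parent can still appear many times in the whole file when the parent repeats.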

    Figure 1-2 shows some elements in an XML hierarchy:

    Validating XML Files with a DTD or Schema

    A valid XML file conforms to the structure of an associated DTD or schema file.

    To reference the location and name of a DTD file, use the DOCTYPE declaration in an XML file. The DOCTYPE declaration also names the root element for the XML file.

    Figure 1-2. Elements in the XML Hierarchy

    Leaf Element: Element Zip, along with all its sibling elements, is the lowest level element within element Address.

    Multiple-occurring Element: Element Sales Region occurs more than once within element Product.

    Single-occurring Element: Element PName occurs once within element Product.

    Enclosure Element: Element Address encloses elements StreetAddress, City, State, and Zip. Element Address is also a Parent element.

    Parent Chain: Element YTDSales is a child of element Sales, which is a child of element Product, which is a child of root element Store. All these elements belong in the same parent chain.

    Child Element: Element PName is a child of Product, which is a child of Store.

    The DOCTYPE identifies an associated DTD file.

    The encoding attribute identifies the code page.

    For example, the following XML file references the location of the note.dtd file:

    XML Data

    To reference a schema, use the schemaLocation declaration. The schemaLocation contains the location and name of a schema.

    The following XML file references the note.xsd schema in an external location:

    XML Data

    Unicode Encoding

    An XML file contains an encoding attribute that indicates the code page in the file. The most common encodings are UTF-8 and UTF-16. UTF-8 represents a character with one to four bytes, depending on the Unicode symbol. UTF-16 represents a character as one or two 16-bit words.

    The following example shows a UTF-8 attribute in an XML file:

    XML Data
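
    The byte counts behind these encodings can be checked with a short Python sketch. The sample characters and the note element are assumptions chosen for illustration; characters outside the Basic Multilingual Plane, such as the last sample, need two 16-bit words in UTF-16.

```python
import xml.etree.ElementTree as ET

# Characters chosen to cover each UTF-8 length from one to four bytes.
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def encoded_lengths(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    # "utf-16-le" is used so that no byte order mark is added to the count.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


for ch in SAMPLES:
    utf8_len, utf16_len = encoded_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")

# The encoding attribute in the XML declaration tells the parser how to decode the bytes.
document = '<?xml version="1.0" encoding="UTF-8"?><note>caf\u00e9</note>'.encode("utf-8")
note = ET.fromstring(document)
print(note.text)
```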

    DTD Files

    A Document Type Definition (DTD) file defines the element types and attributes in an XML file. A DTD file also provides some constraints on the XML file structure. A DTD file does not contain any data or element datatypes.

    Figure 1-3 shows elements and attributes in a DTD file:

    DTD Elements

    In the DTD file, an element declaration defines an XML element. An element declaration has the following syntax:

    <!ELEMENT element_name (#PCDATA)>

    The DTD description defines the XML tag <element_name>. The description (#PCDATA) specifies parsed character data. Parsed data is the text between the start tag and the end tag of an XML element. Parsed character data is text without child elements.

    The following example shows a DTD description of an element with two child elements:

    <!ELEMENT boat (brand, type)>

    Brand and type are child elements of boat. Each child element can contain characters. In this example, brand and type can occur once inside the element boat. The following DTD description specifies that brand must occur one or more times for a boat:

    <!ELEMENT boat (brand+, type)>

    Figure 1-3. Sample DTD: StoreInfo.dtd (callouts: Element, Element List, Element Occurrence, Attribute, Attribute Name, Attribute Value Option)

    DTD Attributes

    Attributes provide additional information about elements. In a DTD file, an attribute occurs inside the starting tag of an element.

    The following syntax describes an attribute in a DTD file:

    <!ATTLIST element_name attribute_name attribute_type "default_value">

    The following parameters identify an attribute in a DTD file:

    Element_name. The name of the element that has the attribute.

    Attribute_name. The name of the attribute.

    Attribute_type. The kind of attribute. The most common attribute type is CDATA. A CDATA attribute is character data.

    Default_value. The value of the attribute if no attribute value occurs in the XML file.

    Use the following options with a default value:

    #REQUIRED. The XML file must contain the attribute value.

    #IMPLIED. The attribute value is optional.

    #FIXED. The attribute always has the default value from the DTD file. A valid XML file can contain the same attribute value as the DTD, or the XML file can have no attribute value. You must specify a default value with this option.

    The following example shows an attribute with a fixed value:

    <!ATTLIST product product_name CDATA #FIXED "vacuum">

    The element name is product. The attribute is product_name. The attribute has a default value, vacuum.
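
    The rules for these options can be written out as a short Python sketch. This is plain rule checking under the definitions given above, not a DTD validator, and the function name check_attribute is hypothetical.

```python
def check_attribute(option, instance_value, default_value=None):
    """Return the effective attribute value, or raise ValueError if a rule is broken.

    option is "#REQUIRED", "#IMPLIED", "#FIXED", or None for a plain default.
    instance_value is the value found in the XML file, or None if it is absent.
    """
    if option == "#REQUIRED":
        if instance_value is None:
            raise ValueError("required attribute is missing")
        return instance_value
    if option == "#IMPLIED":
        # The attribute is optional, so None is an acceptable result.
        return instance_value
    if option == "#FIXED":
        if instance_value is not None and instance_value != default_value:
            raise ValueError("fixed attribute must equal the DTD value")
        return default_value
    # Plain default: use the DTD value only when the XML file gives none.
    return default_value if instance_value is None else instance_value


# product_name is fixed to "vacuum", so omitting it still yields "vacuum".
print(check_attribute("#FIXED", None, "vacuum"))
```

    Under these rules a fixed attribute behaves like a constant: the XML file may repeat the value or leave it out, but it cannot change it.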

    XML Schema Files

    An XML schema is a document that defines the valid content of XML files. An XML schema file, like a DTD file, contains only metadata. An XML schema defines the structure and type of elements and attributes for an associated XML file. When you use a schema to define an XML file, you can restrict data, define data formats, and convert data between datatypes. XML schemas support complex types and inheritance between types. They also provide a way to specify element and attribute groups, ANY content, and circular references.

    Figure 1-4 shows XML schema components:

    For more information about XML schemas, see "Simple and Complex XML Types" on page 15. For information about element groups, see "Component Groups" on page 24.

    Figure 1-4. XML Schema Components (callouts: Element Name, Element List and Occurrence, Attribute Type and Null Constraint, Attribute, Element List and Datatype, Element Data, Element Datatype)

    Types of XML Metadata

    You can create PowerCenter XML definitions from XML, DTD, or XML schema files. XML files provide data and metadata. DTD files and XML schema files provide metadata.

    PowerCenter extracts the following types of metadata from XML, DTD, and XML schema files:

    Namespace. A collection of elements and attribute names identified by a Uniform Resource Identifier (URI) reference in an XML file. Namespace differentiates between elements that come from different sources. For more information about namespace, see "Namespace" on page 10.

    Name. A tag that contains the name of an element or attribute. For more information about the name tag, see "Name" on page 11.

    Hierarchy. The position of an element in relationship to other elements in an XML file. For more information, see "Hierarchy" on page 11.

    Cardinality. The number of times an element occurs in an XML file. For more information about cardinality, see "Cardinality" on page 12.

    Datatype. A classification of a data element, such as numeric, string, Boolean, or time. XML supports custom datatypes and inheritance. For more information about datatypes, see "Simple and Complex XML Types" on page 15.

    Namespace

    A namespace contains a URI to identify schema location. A URI is a string of characters that identifies an internet resource. A URI is an abstraction of a URL. A URL locates a resource, but a URI identifies a resource. A DTD or schema file does not have to exist at the URI location.

    An XML namespace identifies groups of elements. A namespace can identify elements and attributes from different XML files or distinguish meanings between elements. For example, you can distinguish meanings for the element table by declaring different namespaces, such as math:table and furniture:table. XML is case sensitive. The namespace Math:table is different from the namespace math:table.

    You can declare a namespace at the root level of an XML file, or you can declare a namespace inside any element in an XML structure. When you declare multiple namespaces in the same XML file, you use a namespace prefix to associate an element with a namespace. A namespace declaration appears in the XML file as an attribute that starts with xmlns. Declare the namespace prefix with the xmlns attribute. You can create a prefix name of any length.

    The following example shows two namespaces in an XML instance document:

    xmlns:math = "http://www.mathtables.com"

    xmlns:furniture = "http://www.home.com">

    4X6

    Brueners

    Brueners

    One namespace has math elements, and the other namespace has furniture elements. Each namespace has an element called table, but the elements contain different types of data. The namespace prefix distinguishes between the math table and the furniture table.

    The following text shows a common schema declaration:

    ...

    ...

    Table 1-1 describes each part of the namespace declaration:

    Name

    In an XML file, each tag is the name of an element or attribute. In a DTD file, the tag <!ELEMENT> specifies the name of an element, and the tag <!ATTLIST> indicates the set of attributes for an element. In a schema file, <element name> specifies the name of an element and <attribute name> specifies the name of an attribute.

    When you import an XML definition, the element tags become column names in the PowerCenter definition, by default.

    Hierarchy

    An XML file models a hierarchical database. The position of an element in an XML hierarchy represents its relationship to other elements. For example, an element can contain child elements, and elements can inherit characteristics from other elements.

    Table 1-1. Namespace Declaration Components

    Schema Declaration                             Description
    xmlns:xs="http://www.w3.org/2001/XMLSchema"    Namespace that contains the native XML schema and datatypes. In this example, each schema component has the prefix of xs.
    targetNamespace="http://www.w3XML.com"         Namespace that contains the schema.
    xmlns="http://www.w3XML.com"                   Default namespace declaration. All elements in the schema that have no prefix belong to the default namespace. Declare a default namespace by using an xmlns attribute with no prefix.
    elementFormDefault="qualified"                 Specifies that any element in the schema must have a namespace in the XML file.
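
    The way a prefix ties an element to a namespace can be sketched with Python's standard xml.etree.ElementTree module. The instance document below is hypothetical; only the two namespace URIs and the values 4X6 and Brueners come from the namespace example in this section, and the root element name example is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; the root element name "example" is assumed.
TABLES_XML = """
<example xmlns:math="http://www.mathtables.com"
         xmlns:furniture="http://www.home.com">
  <math:table>4X6</math:table>
  <furniture:table>Brueners</furniture:table>
</example>
"""

root = ET.fromstring(TABLES_XML)

# ElementTree expands each prefix to {namespace URI}local-name.
tags = [child.tag for child in root]

# Elements are looked up by namespace URI; the prefixes here are only local aliases.
ns = {"m": "http://www.mathtables.com", "f": "http://www.home.com"}
math_table = root.find("m:table", ns)
furniture_table = root.find("f:table", ns)

print(tags)
print(math_table.text, furniture_table.text)
```

    The lookup uses prefixes m and f rather than math and furniture to show that the URI, not the prefix text, identifies the namespace.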

    Cardinality

    Element cardinality in a DTD or schema file is the number of times an element occurs in an XML file. Element cardinality affects how you structure groups in an XML definition. Absolute cardinality and relative cardinality of elements affect the structure of an XML definition.

    Absolute Cardinality

    The absolute cardinality of an element is the number of times an element occurs within its parent element in an XML hierarchy. DTD and XML schema files describe the absolute cardinality of elements within the hierarchy. A DTD file uses symbols, and an XML schema file uses the minOccurs and maxOccurs attributes to describe the absolute cardinality of an element.

    For example, an element has an absolute cardinality of once (1) if the element occurs once within its parent element. However, the element might occur many times within an XML hierarchy if the parent element has a cardinality of one or more (+).

    The absolute cardinality of an element determines its null constraint. An element that has an absolute cardinality of one or more (+) cannot have null values, but an element with a cardinality of zero or more (*) can have null values. An attribute marked as fixed or required in an XML schema or DTD file cannot have null values, but an implied attribute can have null values. For more information about required, fixed, and implied attributes, see "DTD Attributes" on page 8.

    Table 1-2 describes how DTD and XML schema files represent cardinality:

    Table 1-2. Cardinality of Elements in XML

    Absolute Cardinality         DTD            Schema
    Zero or once                 ?              minOccurs=0 maxOccurs=1
    Zero or one or more times    *              minOccurs=0 maxOccurs=unbounded, or minOccurs=0 maxOccurs=n
    Once                         (no symbol)    minOccurs=1 maxOccurs=1
    One or more times            +              minOccurs=1 maxOccurs=unbounded, or minOccurs=1 maxOccurs=n

    Note: You can declare a maximum number of occurrences or an unlimited number of occurrences in a schema.
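
    The mapping in Table 1-2 can be expressed as a small Python sketch. The function names dtd_symbol and allows are hypothetical, and the sketch covers only the minOccurs values of 0 and 1 that the table lists.

```python
def dtd_symbol(min_occurs, max_occurs):
    """Map schema occurrence bounds to the DTD symbol from Table 1-2.

    max_occurs is an int or the string "unbounded". Returns "" for exactly once.
    """
    if min_occurs not in (0, 1):
        raise ValueError("Table 1-2 covers only minOccurs of 0 or 1")
    if max_occurs != "unbounded" and max_occurs < 1:
        raise ValueError("maxOccurs must be at least 1 or unbounded")
    many = max_occurs == "unbounded" or max_occurs > 1
    if min_occurs == 0:
        return "*" if many else "?"
    return "+" if many else ""


def allows(count, min_occurs, max_occurs):
    """Return True if an element occurring count times satisfies the bounds."""
    if count < min_occurs:
        return False
    return max_occurs == "unbounded" or count <= max_occurs


for bounds in [(0, 1), (0, "unbounded"), (1, 1), (1, 5)]:
    print(bounds, repr(dtd_symbol(*bounds)))
```

    A DTD symbol cannot carry a specific upper bound, so maxOccurs=n and maxOccurs=unbounded map to the same symbol; only the schema form can enforce the limit n.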

    Figure 1-5 shows the absolute cardinality of elements in a sample XML file:

    Relative Cardinality

    Relative cardinality is the relationship of an element to another element in the XML hierarchy. An element can have a one-to-one, one-to-many, or many-to-many relationship to another element in the hierarchy.

    An element has a one-to-one relationship with another element if every occurrence of one element can have one occurrence of the other element. For example, an employee element can have one social security number element. Employee and social security number have a one-to-one relationship.

    An element has a one-to-many relationship with another element if every occurrence of one element can have multiple occurrences of another element. For example, an employee element can have multiple email addresses. Employee and email address have a one-to-many relationship.

    Figure 1-5. XML Cardinality

    Element City occurs once within its parent element Address. Its absolute cardinality is once (1).

    Element Address occurs more than once within Store. Its absolute cardinality is one or more (+).

    Element Sales occurs zero or more times within its parent element Product. Its absolute cardinality is zero or more (*).

    An element has a many-to-many relationship with another element if an XML file can have multiple occurrences of both elements. For example, an employee might have multiple email addresses and multiple street addresses. Email address and street address have a many-to-many relationship.

    Figure 1-6 shows the relative cardinality between elements in a sample XML file:

    Figure 1-6. Relative Cardinality of Elements in StoreInfo.xml

    Many-to-many relationship. For every occurrence of STATE, there can be multiple occurrences of YTDSALES. For every occurrence of YTDSALES, there can be many occurrences of STATE.

    One-to-many relationship. For every occurrence of SNAME, there can be many occurrences of ADDRESS and, therefore, many occurrences of CITY.

    One-to-one relationship. For every occurrence of PNAME, there is one occurrence of PPRICE.

    Simple and Complex XML Types

    The XML schema language has over 40 built-in datatypes, including numeric, string, time, XML, and binary. These datatypes are called simple types. They contain text but no other elements and attributes. You can derive new simple types from the basic XML simple types. For more information about simple types, see "Simple Types" on page 15.

    You can create complex XML datatypes. A complex datatype is a datatype that contains more than one simple type. A complex datatype can also contain other complex types and attributes. For more information about complex datatypes, see "Complex Types" on page 17.

    For more information about XML datatypes, see the W3C specifications for XML datatypes at http://www.w3.org/TR/xmlschema-2.

    Simple Types

    A simple datatype is an XML element or attribute that contains text. A simple type is indivisible. Simple types cannot have attributes, but attributes are simple types.

    PowerCenter supports the following simple types:

    Atomic types. A basic datatype such as Boolean, string, or integer.

    Lists. An array collection of atomic types.

    Unions. A combination of one or more atomic or list types that map to a simple type in an XML file.

    Atomic Types

    An atomic datatype is a basic datatype such as a Boolean, string, integer, decimal, or date. To define custom atomic datatypes, add restrictions to an atomic datatype to limit the content. Use a facet to define which values to restrict or allow.

    A facet is an expression that defines minimum or maximum values, specific values, or a data pattern of valid values. For example, a pattern facet restricts an element to an expression of data values. An enumeration facet lists the legal values for an element.

    The following example contains a pattern facet that restricts an element to a lowercase letter between a and z:

    The following example contains an enumeration facet that restricts a string to a, b, or c:
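
    A minimal sketch of both facets, assuming a hypothetical schema fragment with simple types named letter and choice. Python's standard xml.etree.ElementTree module reads the facet values out of the schema text, and a regular expression and a membership test stand in for the facet checks. This approximates the behavior for these simple patterns; it is not schema validation, and XSD pattern syntax is not identical to Python's re syntax in general.

```python
import re
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment; the type names "letter" and "choice" are assumed.
SCHEMA_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="letter">
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-z]"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="choice">
    <xs:restriction base="xs:string">
      <xs:enumeration value="a"/>
      <xs:enumeration value="b"/>
      <xs:enumeration value="c"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""

schema = ET.fromstring(SCHEMA_XSD)


def facets(type_name):
    """Return (pattern values, enumeration values) declared for a simple type."""
    simple_type = schema.find(f"{XS}simpleType[@name='{type_name}']")
    restriction = simple_type.find(f"{XS}restriction")
    patterns = [p.get("value") for p in restriction.findall(f"{XS}pattern")]
    enums = [e.get("value") for e in restriction.findall(f"{XS}enumeration")]
    return patterns, enums


def is_valid(type_name, value):
    """Check a value against the declared facets of a simple type."""
    patterns, enums = facets(type_name)
    # A pattern facet applies to the whole value, so fullmatch is used.
    if patterns and not any(re.fullmatch(p, value) for p in patterns):
        return False
    if enums and value not in enums:
        return False
    return True


print(is_valid("letter", "q"), is_valid("letter", "Q"))
print(is_valid("choice", "b"), is_valid("choice", "d"))
```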

    Lists

    A list is an array collection of atomic types, such as a list of strings that represent names. The list itemType defines the datatype of the list components.

    The following example shows a list called names:

    An XML file might contain the following data in the names list:

    Joe Bob Harry Atlee Will
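
    Items in an XML schema list value are separated by white space, so reading a list comes down to splitting the element text and converting each item to the list's item type. A minimal Python sketch, assuming a hypothetical names element that holds the data shown above:

```python
import xml.etree.ElementTree as ET


def parse_list(text, item_type=str):
    """Split a whitespace-separated XML list value and convert each item."""
    return [item_type(item) for item in text.split()]


# Hypothetical element carrying the names list from the text.
names_element = ET.fromstring("<names>Joe Bob Harry Atlee Will</names>")
names = parse_list(names_element.text)

print(names)
print(parse_list("30 32 34", float))
```

    Because white space separates the items, a list of strings cannot hold an item that itself contains a space.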

    Unions

    A union is a combination of one or more atomic or list types that map to one simple type inan XML file. When you define a union type, you specify what types to combine. For example,you might create a type called size. Size can include string data, such as S, M, and L, or size

    might contain decimal sizes, such as 30, 32, and 34. If you define a union type element, theXML file can include a sizename type for string sizes, and a sizenum type for numeric sizes.

    Figure 1-7 shows a schema file containing a shoesize union that contains sizenames and sizenums lists:
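
    A schema along these lines might look like the following sketch. The type names follow the text. The enumeration of S, M, and L as the restriction on sizename is an assumption, since the text says only that sizename is a restricted string type.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Restricted string type (the enumeration values are assumed) -->
  <xsd:simpleType name="sizename">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="S"/>
      <xsd:enumeration value="M"/>
      <xsd:enumeration value="L"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- List of string sizes -->
  <xsd:simpleType name="sizenames">
    <xsd:list itemType="sizename"/>
  </xsd:simpleType>

  <!-- List of decimal sizes -->
  <xsd:simpleType name="sizenums">
    <xsd:list itemType="xsd:decimal"/>
  </xsd:simpleType>

  <!-- The union accepts either list -->
  <xsd:simpleType name="shoesize">
    <xsd:union memberTypes="sizenames sizenums"/>
  </xsd:simpleType>
</xsd:schema>
```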

    Figure 1-7. Union Type Example

    Sizename is a restricted string type.

    The sizenums type accepts a list of decimals.

    The shoesize union accepts both the decimal and string lists.

    The sizenames type accepts a list of strings.

    The union defines sizenames and sizenums as union member types. Sizenames defines a list of string values. Sizenums defines a list of decimal values.

    Complex Types

    A complex type aggregates a collection of simple types into a logical unit. For example, a customer type might include the customer number, name, street address, town, city, and zip code. A complex type can also reference other complex types or element and attribute groups.

    XML supports complex type inheritance. When you define a complex type, you can create other complex types that inherit the components of the base type. In a type relationship, the base type is the complex type from which you derive another type. A derived complex type inherits elements from the base type.

    An extended complex type is a derived type that inherits elements from a base type and includes additional elements. For example, a customer_purchases type might inherit its definition from the customer complex type, but the customer_purchases type adds item, cost, and date_sold elements.

    A restricted complex type is a derived type that restricts some elements from the base type. For example, mail_list might inherit elements from customer, but restrict the phone_number element by setting the minOccurs and maxOccurs boundaries to zero.

    Figure 1-8 shows derived complex types that restrict and extend the base complex type:

    In Figure 1-8, the base type is PublicationType. BookType extends PublicationType and includes the ISBN and Publisher elements.

    Publication_Minimum restricts PublicationType. Publication_Minimum requires between 1 and 25 Authors and restricts the date to the year.
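
    The relationships among these types can be sketched as follows. The Title, Author, and Date elements and their datatypes are assumptions, since the text names only ISBN, Publisher, and the Author limits. The sketch shows only the Author restriction and leaves the Date declaration unchanged in the restricted type.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Base complex type (child elements are assumed) -->
  <xsd:complexType name="PublicationType">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:date"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Extended complex type: inherits the base elements and adds two -->
  <xsd:complexType name="BookType">
    <xsd:complexContent>
      <xsd:extension base="PublicationType">
        <xsd:sequence>
          <xsd:element name="ISBN" type="xsd:string"/>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Restricted complex type: repeats the content model with
       narrower occurrence limits on Author -->
  <xsd:complexType name="Publication_Minimum">
    <xsd:complexContent>
      <xsd:restriction base="PublicationType">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string"/>
          <xsd:element name="Author" type="xsd:string"
                       minOccurs="1" maxOccurs="25"/>
          <xsd:element name="Date" type="xsd:date"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```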

    Figure 1-8. Derived Complex Types

    Figure callouts: Base Complex Type, Extended Complex Type, Restricted Complex Type, Element Reference

    Abstract Elements

    Sometimes a schema contains a base type that defines the basic structure of a complex element but does not contain all the components. Derived complex types extend the base type with more components. Since the base type is not a complete definition, you might not want to use the base type in an XML file. You can declare the base type element to be abstract. An abstract element is not valid in an XML file. Only the derived elements are valid.

    To define an abstract element, add an abstract attribute with the value true. The default is false.

    For example, PublicationType is an abstract element. BookType inherits the elements in PublicationType, but also includes ISBN and Publisher elements. Since PublicationType is abstract, a PublicationType element is not valid in the XML file. An XML file can contain the derived type, BookType.

    The following schema contains the PublicationType and BookType:
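
    A schema of this shape might look like the sketch below. The Title, Author, and Date elements are assumptions. The sketch places the abstract attribute on the complex type. XML Schema also accepts the attribute on an element declaration, and the text does not say which form the guide's own example uses.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- abstract="true": the type cannot be used directly in an XML file -->
  <xsd:complexType name="PublicationType" abstract="true">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:date"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- The derived type is not abstract, so it is valid in an XML file -->
  <xsd:complexType name="BookType">
    <xsd:complexContent>
      <xsd:extension base="PublicationType">
        <xsd:sequence>
          <xsd:element name="ISBN" type="xsd:string"/>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```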

    Any Type Elements and Attributes

    Some schema elements and attributes allow any type of data in an XML file. Use these elements and attributes when you need to validate an XML file that has unidentified element and attribute types.

    Use the following elements and attributes that allow any type of data:

    anyType element. Allows an element to be any datatype in the associated XML file. For more information, see "anyType Elements" on page 20.

    anySimpleType element. Allows an element to be any simpleType in the associated XML file. For more information, see "anySimpleType Elements" on page 21.

    ANY content element. Allows an element to be any element already defined in the schema. For more information, see "ANY Content Elements" on page 21.

    anyAttribute attribute. Allows an element to have any attribute already defined in the schema. For more information, see "AnyAttribute Attributes" on page 22.

    anyType Elements

    An anyType element can be any datatype in an XML instance document. Declare an element to be anyType when the element contains different types of data.

    The following schema describes a person with a first name, last name, and an age element that is anyType:
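
    A schema of this kind might look like the following sketch. The element names person, firstname, and lastname are assumptions. Only the age element and its anyType declaration come from the text.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="person">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="firstname" type="xsd:string"/>
        <xsd:element name="lastname" type="xsd:string"/>
        <!-- age can hold any datatype in the instance document -->
        <xsd:element name="age" type="xsd:anyType"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```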

    anySimpleType Elements

    An anySimpleType element can contain any atomic type. An atomic type is a basic datatype such as a Boolean, string, integer, decimal, or date.

    The following schema describes a person with a first name, last name, and another element that is anySimpleType:
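
    Such a schema might be sketched as follows. All element names here, including other, are assumptions made for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="person">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="firstname" type="xsd:string"/>
        <xsd:element name="lastname" type="xsd:string"/>
        <!-- "other" accepts any simple value, such as a string or a number -->
        <xsd:element name="other" type="xsd:anySimpleType"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```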

    The following XML instance document substitutes the anySimpleType element with a string datatype:

    Kathy Russell Cissy

    The following XML instance document substitutes the anySimpleType element with a numeric datatype:

    Kathy Russell 34
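
    With markup, the second instance document might look like the sketch below. The element names person, firstname, lastname, and other are assumptions. The string case differs only in the content of the last element, Cissy instead of 34.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Element names are hypothetical; the values come from the text -->
<person>
  <firstname>Kathy</firstname>
  <lastname>Russell</lastname>
  <!-- A numeric value in the anySimpleType element -->
  <other>34</other>
</person>
```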

    ANY Content Elements

    The ANY content element accepts any content in an XML file. When you declare an ANY content element in a schema, you can substitute it for an element of any name and type in an XML instance document. The substitute element must exist in the schema.

    When you specify ANY content, you use the keyword ANY instead of an element name and element type.

    The following schema describes a person with a first name, last name, and an element that is ANY content:
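
    In XML Schema syntax, an ANY content wildcard is written as an xsd:any element. The sketch below assumes that form and assumes the element names person, firstname, and lastname. The son and daughter declarations are global so that the wildcard can resolve to them.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="person">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="firstname" type="xsd:string"/>
        <xsd:element name="lastname" type="xsd:string"/>
        <!-- Wildcard: any globally declared element may appear here -->
        <xsd:any minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Elements that can take the place of the wildcard -->
  <xsd:element name="son" type="xsd:string"/>
  <xsd:element name="daughter" type="xsd:string"/>
</xsd:schema>
```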

    The schema includes a son element and a daughter element. You can substitute the ANY element for the son or daughter element in the XML instance document:

    Danny Russell Atlee

    Christine Slade Susie

    AnyAttribute Attributes

    The anyAttribute attribute accepts any attribute in an XML file. When you declare an attribute as anyAttribute, you can substitute the anyAttribute element for any attribute in the schema.

    The following schema describes a person with a first name, last name, and an attribute that is anyAttribute:

    The following schema includes a gender attribute:

    The following XML instance document substitutes anyAttribute with the gender attribute:

    Anita Ficks

    Jim Geimer
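
    The two schema pieces described in this section can be sketched together as follows. The element names and the allowed gender values are assumptions. Only the anyAttribute wildcard and the gender attribute name come from the text.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="person">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="firstname" type="xsd:string"/>
        <xsd:element name="lastname" type="xsd:string"/>
      </xsd:sequence>
      <!-- Wildcard: person accepts any globally declared attribute -->
      <xsd:anyAttribute/>
    </xsd:complexType>
  </xsd:element>

  <!-- Global attribute that can take the place of the wildcard -->
  <xsd:attribute name="gender">
    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <!-- Allowed values are assumed for illustration -->
        <xsd:pattern value="male|female"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:attribute>
</xsd:schema>
```

    Under these assumptions, a person element in the instance document would carry the attribute on its start tag, for example gender="female".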

    Component Groups

    You can create the following groups of components in an XML schema:

    Element and attribute group. Group of elements or attributes that you can reference throughout a schema.

    Substitution group. Group of elements that you can substitute with other elements from the same group.

    Element and Attribute Groups

    You can put elements and attributes in groups that you can reference in a schema. You must declare the group of elements or attributes before you reference the group.

    The following example shows the schema syntax for an element group:
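
    A minimal sketch of an element group and a reference to it follows. The names custGroup, OrderType, and the member elements are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Declare the group once -->
  <xsd:group name="custGroup">
    <xsd:sequence>
      <xsd:element name="customerName" type="xsd:string"/>
      <xsd:element name="customerNumber" type="xsd:integer"/>
    </xsd:sequence>
  </xsd:group>

  <!-- Reference the group wherever its elements are needed -->
  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:group ref="custGroup"/>
      <xsd:element name="status" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```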

    The following example shows the schema syntax for an attribute group:
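
    An attribute group follows the same declare-then-reference pattern. In this sketch, the names personAttrs, PersonType, and the attributes are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Declare the attribute group once -->
  <xsd:attributeGroup name="personAttrs">
    <xsd:attribute name="id" type="xsd:integer"/>
    <xsd:attribute name="status" type="xsd:string"/>
  </xsd:attributeGroup>

  <!-- Reference it after the content model of a complex type -->
  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
    </xsd:sequence>
    <xsd:attributeGroup ref="personAttrs"/>
  </xsd:complexType>
</xsd:schema>
```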

    The following element groups provide constraints on XML data:

    Sequence group. All elements in an XML file must occur in the order that the schema lists them. For example, OrderHeader requires the customerName first, then orderNumber, and then orderDate:

    Choice group. One element in the group can occur in an XML file. For example, the CustomerInfo group lists a choice of elements for the XML file:

    All group. All elements must occur in the XML file or none at all. The elements can occur in any order. For example, CustomerInfo requires all or none of the three elements:
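
    The three group kinds can be sketched in one schema as shown below. The compositors appear inside complex types for brevity. The names CustomerInfoChoice and CustomerInfoAll are hypothetical and keep the two CustomerInfo examples distinct. The CustomerInfo member elements are assumptions. The minOccurs="0" setting on the all group is one way to express the all-or-none behavior described above.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Sequence: elements must occur in the listed order -->
  <xsd:complexType name="OrderHeader">
    <xsd:sequence>
      <xsd:element name="customerName" type="xsd:string"/>
      <xsd:element name="orderNumber" type="xsd:integer"/>
      <xsd:element name="orderDate" type="xsd:date"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Choice: exactly one of the listed elements occurs -->
  <xsd:complexType name="CustomerInfoChoice">
    <xsd:choice>
      <xsd:element name="customerName" type="xsd:string"/>
      <xsd:element name="customerID" type="xsd:integer"/>
      <xsd:element name="customerNumber" type="xsd:integer"/>
    </xsd:choice>
  </xsd:complexType>

  <!-- All: every element occurs once, in any order, or the group is absent -->
  <xsd:complexType name="CustomerInfoAll">
    <xsd:all minOccurs="0">
      <xsd:element name="customerName" type="xsd:string"/>
      <xsd:element name="customerAddress" type="xsd:string"/>
      <xsd:element name="customerPhone" type="xsd:string"/>
    </xsd:all>
  </xsd:complexType>
</xsd:schema>
```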

    Substitution Groups

    Use substitution groups to replace one element with another in an XML file. For example, if you have addresses from Canada and the United States, you can create an address type for Canada and another type for the United States. You can create a substitution group that accepts either type of address.

    The following schema fragment shows an Address base type and the derived types CAN_Address and USA_Address:

    CAN_Address includes Province and PostalCode, and USA_Address includes State and Zip. The MailAddress substitution group includes both address types.
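
    A fragment of this kind might look like the following sketch. The type names and the MailAddress head element come from the text. The member element names CAN_MailAddress and USA_MailAddress, the Name, Street, and City children of Address, and all datatypes are assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Base type -->
  <xsd:complexType name="Address">
    <xsd:sequence>
      <xsd:element name="Name" type="xsd:string"/>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Derived types extend the base type -->
  <xsd:complexType name="CAN_Address">
    <xsd:complexContent>
      <xsd:extension base="Address">
        <xsd:sequence>
          <xsd:element name="Province" type="xsd:string"/>
          <xsd:element name="PostalCode" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:complexType name="USA_Address">
    <xsd:complexContent>
      <xsd:extension base="Address">
        <xsd:sequence>
          <xsd:element name="State" type="xsd:string"/>
          <xsd:element name="Zip" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Head of the substitution group -->
  <xsd:element name="MailAddress" type="Address"/>

  <!-- Members: either can appear wherever MailAddress is allowed -->
  <xsd:element name="CAN_MailAddress" type="CAN_Address"
               substitutionGroup="MailAddress"/>
  <xsd:element name="USA_MailAddress" type="USA_Address"
               substitutionGroup="MailAddress"/>
</xsd:schema>
```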

    For more information about using substitution groups with PowerCenter, see "Using Substitution Groups in an XML Definition" on page 49.

    XML Path

    XML Path Language (XPath) is a language that describes a way to locate items in an XML file. XPath uses an addressing syntax based on the route through the hierarchy from the root to an element or attribute. An XML path can contain long schema component names.

    XPath uses a slash (/) to distinguish between elements in the hierarchy. XML attributes are preceded by @ in the XPath.

    You can create a query on an element or attribute XPath to filter XML data. For more information about creating an XPath query, see "Using XPath Query Predicates" on page 54.

    Figure 1-9 shows the XPath addresses for elements and attributes in a hierarchy:

    Figure 1-9. XPath of Elements and Attributes in the StoreInfo.xml

    STORE

    STORE/PRODUCT/PNAME

    STORE/PRODUCT/SALES/YTDSALES

    STORE/ADDRESS/STREETADDRESS

    STORE/SNAME

    STORE/PRODUCT/@PID

    STORE/PRODUCT/SALES/@REGION
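
    The XPath addresses listed for Figure 1-9 correspond to a hierarchy like the one sketched below. The element and attribute names come from the paths. The nesting is inferred from them, and every data value is invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical StoreInfo.xml; all data values are invented -->
<STORE>                                   <!-- STORE -->
  <SNAME>Downtown</SNAME>                 <!-- STORE/SNAME -->
  <ADDRESS>
    <STREETADDRESS>1 Main St</STREETADDRESS>
                                          <!-- STORE/ADDRESS/STREETADDRESS -->
  </ADDRESS>
  <PRODUCT PID="a1">                      <!-- STORE/PRODUCT/@PID -->
    <PNAME>Widget</PNAME>                 <!-- STORE/PRODUCT/PNAME -->
    <SALES REGION="West">                 <!-- STORE/PRODUCT/SALES/@REGION -->
      <YTDSALES>1200.50</YTDSALES>        <!-- STORE/PRODUCT/SALES/YTDSALES -->
    </SALES>
  </PRODUCT>
</STORE>
```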

    Code Pages

    XML files contain an encoding declaration that indicates the code page used in the file. The most common code pages in XML are UTF-8 and UTF-16. All XML parsers support these code pages. For information on the XML character encoding specification, see the W3C website at http://www.w3c.org.
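
    The encoding declaration is part of the XML declaration on the first line of the file. In the sketch below, the STORE content is a hypothetical placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The encoding attribute above names the code page of this file -->
<STORE>
  <SNAME>Downtown</SNAME>
</STORE>
```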

    PowerCenter supports the same set of code pages for XML files that it supports for relational databases and other flat files. Use any code page supported by both PowerCenter and the XML specification. For a list of supported code pages, see "Code Pages" in the Administrator Guide. PowerCenter does not support a user-defined code page.

    For XML source definitions, PowerCenter uses the code page declared in the XML file.

    For XML target definitions, PowerCenter uses the code page declared in the XML file. If PowerCenter does not support the declared code page, the Designer returns an error and you cannot import the target definition.

    Chapter 2

    Using XML with PowerCenter

    This chapter includes the following topics:

    Overview, 30

    Importing XML Metadata, 31

    Working with XML Views, 38

    Generating Hierarchical Relationships, 40

    Generating Entity Relationships, 44

    Working with Circular References, 51

    Understanding View Rows, 53

    Pivoting Columns, 55

    Limitations, 58

    Overview

    You can create an XML definition in PowerCenter from an XML file, DTD file, XML schema, flat file definition, or relational table definition. When you create an XML definition, the Designer extracts XML metadata and creates a schema in the repository. The schema provides the structure from which you edit and validate the XML definition.

    An XML definition can contain multiple groups. In an XML definition, groups are called views. The relationship between elements in the XML hierarchy defines the relationship between the views. When you create an XML definition, the Designer creates views for multiple-occurring elements and complex types in a schema by default. The relative cardinality of elements in an XML hierarchy affects how PowerCenter creates views in an XML definition. Relative cardinality determines if elements can be part of the same view.

    The Designer defines relationships between the views in an XML definition by keys. Source definitions do not require keys, but target views must have them. Each view has a primary key that is an XML element or a generated key.

    When you create an XML definition, you can create a hierarchical model or an entity relationship model of the XML data. When you create a hierarchical model, you create a normalized or denormalized hierarchy. A normalized hierarchy contains separate views for multiple-occurring elements. A denormalized hierarchy has one view with duplicate data for multiple-occurring elements.

    If you create an entity model, the Designer creates views for complex types and multiple-occurring elements. The Designer creates an XML definition that models the inheritance and circular relationships the schema provides.

    Importing XML Metadata

    When you import an XML definition, the Designer creates a schema in the repository for the definition. The repository schema provides the structure from which you edit and validate the XML definition.

    You can create metadata from the following types of files:

    XML files

    DTD files

    XML schema files

    Relational tables

    Flat files

    Importing Metadata from an XML File

    In an XML file, a pair of tags marks the beginning and end of each data element. These tags are the basis for the metadata that PowerCenter extracts from the XML file. If you import an XML file without an associated DTD or XML schema, the Designer reads the XML tags to determine the elements, their possible occurrences, and their position in the hierarchy. The Designer checks the data within the element tags and assigns a datatype depending on the data representation. You can change the datatypes for these elements in the XML definition.

    Figure 2-1 shows a sample XML file. The root element is Employees. Employee is a multiple-occurring element. The Employee element contains the LastName, FirstName, and Address. The Employee element also contains the multiple-occurring elements: Phone and Email.

    The Designer determines a schema structure from the XML data.

    Figure 2-1. Sample Employees XML File with Multiple-Occurring Elements

    Employee, Phone, and Email are multiple-occurring elements.
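
    A file with the structure described for Figure 2-1 might look like the sketch below. The element names come from the text. The data values, and the choice to keep Address as simple text, are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample data; only the element names come from the text -->
<Employees>
  <!-- Employee occurs more than once under the root -->
  <Employee>
    <LastName>Smith</LastName>
    <FirstName>Jan</FirstName>
    <Address>12 Elm Street</Address>
    <!-- Phone and Email occur more than once within one Employee -->
    <Phone>555-0100</Phone>
    <Phone>555-0101</Phone>
    <Email>jan@example.com</Email>
    <Email>jsmith@example.com</Email>
  </Employee>
  <Employee>
    <LastName>Lee</LastName>
    <FirstName>Sam</FirstName>
    <Address>9 Oak Avenue</Address>
    <Phone>555-0102</Phone>
    <Email>sam@example.com</Email>
  </Employee>
</Employees>
```

    The repeated Employee, Phone, and Email tags give the Designer enough information to treat those elements as multiple-occurring, even though the file holds only a few rows.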

    Figure 2-2 shows the default XML source definition with separate views for the root element and the multiple-occurring elements:

    Figure 2-2. Root Element and XML Views in an XML Definition

    Figure callouts: Root Element, Employee View, Email View, Phone View

    When you import an XML file, you do not need all of the XML data to create an XML definition. You need enough data to accurately show the hierarchy of the XML file.

    The Designer can create an XML definition from an XML file that references a DTD file or XML schema. If an XML file has a reference to a DTD or an XML schema on another machine, the machine that hosts the PowerCenter Client must have access to the machine where the schema resides so the Designer can read the schema. The XML file contains a uniform resource identifier (URI), which is the address of the DTD or XML schema.

    Importing Metadata from a DTD File

    A DTD file provides constraints on an XML document structure. A DTD file lists elements, attributes, entities, and notations for an XML document. A DTD file specifies relationships between components. A DTD specifies cardinality and null constraints. However, a DTD file does not contain any data or datatypes.

    When you import a DTD file, you can change the datatypes for the elements in the XML definition. You can change the null constraint, but you cannot change element cardinality.

    If you import an XML file with an associated DTD, the Designer creates a definition based on the DTD structure.

    Figure 2-3 shows an example of an XML file with a portion of the hierarchy from the associated DTD and the source definition that the Designer creates:

    Figure 2-3. XML Definition from an XML File Referencing a DTD File

    StoreInfo.dtd contains the Store element. Product is one of the child elements of Store.

    ProductInfo.xml uses the Product element from StoreInfo.dtd. Product includes the multiple-occurring Sales element.

    The ProductInfo definition contains the Product and Sales groups. The XML file determines what elements to include in the definition. The DTD file determines the structure of the XML definition.

    Importing Metadata from an XML Schema

    A schema file defines the structure of elements and attributes in an XML file. A schema file contains descriptions of the type of elements and attributes in the file. When you import an XML schema, the Designer determines the datatype, precision, and cardinality of the elements. You cannot change an element definition in PowerCenter if the element definition comes from a schema.

    Each simple type definition in an XML schema is a restriction of another simple type definition in the schema. Atomic datatypes, such as Boolean, string, or integer, restrict the anySimpleType datatype. When you define a simple datatype in an XML schema, you derive a new datatype from an existing datatype. For example, you can derive a restricted integer type that holds only numbers from 1 to 20. The base type is integer.
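
    The restricted integer type mentioned above can be sketched with range facets. The type name oneToTwenty is hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Derived simple type: integers from 1 through 20 -->
  <xsd:simpleType name="oneToTwenty">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="1"/>
      <xsd:maxInclusive value="20"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```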

    When you derive a complex datatype from another datatype, you create a new datatype that contains the elements of the base type. You can add new elements to the derived type or create restrictions on the inherited elements. The Designer creates views for derived types without duplicating the columns that represent inherited components. This reduces metadata and decreases the size of the XML definition in the repository.

    Figure 2-4 shows a schema with simple and complex derived types:

    Figure 2-4. XML Schema with Derived Types

    Address is a base type.

    CAN_Address is a derived complex type that extends Address.

    CAN_Postal_Code is a derived simple type that restricts a string to a pattern.

    The MailAddress element is an Address type. A derived type, CAN_Address, inherits the Name, City, and Street from the Address type, and extends Address by adding a Province and PostalCode. PostalCode is a simple type called CAN_Postal_Code.

    When you import an XML schema, every simple type or attribute in a complex type can become a column in an XML definition. Complex types become views.
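
    The derived types described for Figure 2-4 can be sketched as follows. The type and element names come from the text. The string datatypes and the postal code pattern are assumptions, since the text says only that CAN_Postal_Code restricts a string to a pattern.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="MailAddress" type="Address"/>

  <!-- Base type -->
  <xsd:complexType name="Address">
    <xsd:sequence>
      <xsd:element name="Name" type="xsd:string"/>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Derived complex type: adds Province and PostalCode -->
  <xsd:complexType name="CAN_Address">
    <xsd:complexContent>
      <xsd:extension base="Address">
        <xsd:sequence>
          <xsd:element name="Province" type="xsd:string"/>
          <xsd:element name="PostalCode" type="CAN_Postal_Code"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Derived simple type: the pattern shown is an assumed example -->
  <xsd:simpleType name="CAN_Postal_Code">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[A-Z][0-9][A-Z] [0-9][A-Z][0-9]"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```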

    Figure 2-5 shows an XML definition from the schema if you import the schema with the default options:

    Figure 2-5. XML Definition Containing Derived Types

    The root element is MailAddress.

    The Address type contains Name, Street, and City.

    The CAN_Address has a foreign key to Address. CAN_Address includes Province and PostalCode.

    The CAN_Address view contains the elements that are unique for its type. The view does not contain the Name, Street, and City that it inherits from MailAddress.

    Creating Metadata from Relational Definitions

    You can create an XML definition by selecting multiple relational definitions and creating relationships between them. The Designer creates an XML view for each relational definition you import. The Designer converts every column in the relational definition and generates primary key-foreign key relationships. You can choose to create a root view.

    Figure 2-6 shows a sample XML target definition from the relational definitions, Orders and Order_Items. The root is XRoot. XRoot encloses Orders and Order_Items. Order_Items has a foreign key that points to Orders.

    Figure 2-6. XML Target from Two Relational Sources

    Creating Metadata from Flat Files

    You can create an XML definition by importing a flat file definition from the repository. If you import more than one flat file definition, the Designer creates an XML definition with a view for each flat file. The views have no relationship to each other in the XML definition. If you choose to create a root view, the Designer creates the views with foreign keys to the root.

    Figure 2-7 shows a sample XML source definition from flat files orders and products:

    Figure 2-7. XML Source Definition from Two Flat File Sources

    Products and Orders have a foreign key to the root view.

    Rules and Guidelines for XML Views

    Consider the following rules and guidelines when you work with view keys and relationships:

    A view can have one primary key.

    A view can be related to several other views, and a view can have multiple foreign keys.

    A column cannot be both a primary key and a foreign key.

    A view in a source definition does not require a key.

    A view in a target definition requires at least one key.

    The target root view requires a primary key, but the target root does not require a foreign key.

    A target leaf view requires a foreign key, but the target leaf view does not require a primary key.

    An enclosure element cannot be a key.

    A foreign key always refers to a primary key in another group. You cannot use self-referencing keys.

    A generated foreign key column always refers to a generated primary key column.

    The relative cardinality of elements in an XML hierarchy affects how PowerCenter creates views in an XML definition. The following rules determine when elements can be part of the same view:

    Elements that have a one-to-one relationship can be part of the same view.

    Elements that have a one-to-many relationship can be part of the same normalized or denormalized view.

    Elements that have a many-to-many relationship cannot be part of the same view.

    Generating Hierarchical Relationships

    An XML definition with hierarchical view relationships has each element in the hierarchy appear under its parent element in a view. Multiple-occurring elements can become views. Complex types do not become views, and elements unique to derived complex types do not occur in any view.

    You can generate the following types of hierarchical views:

    Normalized views. An XML definition with normalized views reduces redundancy by separating multiple-occurring data into separate views. The views are related by primary and foreign keys.

    Denormalized views. An XML definition with a denormalized view has all the elements of the hierarchy that are not unique to derived complex types in the view. A source or target definition can contain one denormalized view.

    Generating a Normalized View

    When the Designer generates a normalized view, it establishes th