Information Server: Upgrade Workshop, May 2010. © 2010 IBM Corporation


What Is New In InfoSphere DataStage

● Three releases of DataStage since the 7.5.x series

● Each release has introduced new features targeted specifically for the data integration user

● Enhancements extend across five product areas: Developer Usability, Connectivity, Functional Stages, Administrator Features, and Operations & Runtime

● The following slides will:

– introduce each category

– list some specific features in each set

– dive more deeply into a few


Developer Usability

● Designer Performance

● Function Expansion

● Designer Graphical & Function Upgrades

● Multiple User Environment – Locking/Read Only

● Find/Search for Objects in the Repository

● Graphical Impact Analysis

● Job, Table or Routine Difference

● Job Deployment with Information Server Manager

● Balanced Optimization – leverage the power of the DBMS!

● DSX Export Improvements

● Globalization – UI and messages translated to 9 different languages

Features that simplify tasks within the user interface for both basic and advanced use cases

Designer – New Repository Tree

New tree view for display of Repository contents

– New folder model

– In-place “Find”

– Expandable view

New folder model

– Replaces previous Category model

– No restrictions on where objects live in the folder structure

• Jobs can live in the same folder as Table Definitions, Routines, Transforms etc.

• Allows users to organize Repository content in the way that suits their application, e.g. a task-based vs. class-based structure


Quick Find - Basic

● Find item in Repository tree

– In-place find

– Find by Name (Full or Partial)

– Wild card support

– Find next…

– Filter on type
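The find options above can be pictured with a small sketch. This is not Designer code: the repository contents, the `quick_find` helper, and the case-insensitive matching are all assumptions made for illustration, with Python's standard `fnmatch` module standing in for the wildcard matching.

```python
from fnmatch import fnmatchcase

# Hypothetical repository contents: (object name, object type).
REPOSITORY = [
    ("LoadCustomerDim", "job"),
    ("LoadOrderFact", "job"),
    ("CustomerTable", "table definition"),
    ("TrimAll", "routine"),
]


def quick_find(pattern, object_type=None):
    """Return names matching a wildcard pattern, optionally filtered by type.

    Matching is case-insensitive here; that is an assumption of this sketch.
    """
    lowered = pattern.lower()
    return [
        name
        for name, kind in REPOSITORY
        if fnmatchcase(name.lower(), lowered)
        and (object_type is None or kind == object_type)
    ]


print(quick_find("*customer*"))                # partial name with wildcards
print(quick_find("Load*", object_type="job"))  # wildcard plus type filter
```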


Impact Analysis – Graphical View

Impact Analysis:

– Find dependencies: what does this item depend on?

– Find where used: where is this item used?

Results shown using the Advanced Find window


Job Difference – Integrated report

Difference report displayed in Designer; jobs are opened automatically from report hot links

Options available to:

– Print report

– Simple “Find” in report

– Launch external diff tool for more in-depth diff of textual properties, e.g. Routine source


General Enhancements

Design Time Performance

– Significant performance improvement in Job Open, Save, Compile etc.

Function Expansion

– New Utility, String and Date/Time functions, e.g. IsValidTime, NthWeekdayFromDate, DecimalToTimestamp

Parallel Only


Functional Stages

● New XML Stage

● Vertical Pivot

● Transformer Enhancements

● Slowly Changing Dimension

● Enhanced Surrogate Key Stage

● Multi Format Flat File Support

● Range Lookup

● Horizontal Parallel Pivot Stage

● Checksum Stage

New stage types that introduce new off-the-shelf data integration functions or expand existing ones

XML Pack

Improved environment for mapping complex XML documents from one schema to another in single or multiple steps

Support for schemas (XSD XML Schema 1.0, WSDL 1.1)

Support for complex XML transformations with need for shredding the XML

– Hierarchical Join, Relational Join, Filter, Switch, Sort, Union, Regroup, RowToColumns, ColumnsToRows, Aggregate, Distinct

Support for multiple input and output links, including reference and reject links

Support for partitioning, multi-threaded and stream processing of large XML documents

Performance and volume improvements

– Reduced memory requirements, increased throughput

– Removed restrictions on document size


Customer XML Job – pre 8.5 vs 8.5


Ran in 11% of the time without introducing any of the new parallelization features


Vertical Pivot

• Enhanced Pivot stage to support vertical pivoting

• Maps multiple input rows with a common key to a single output row containing multiple columns

• Covers three basic requirements: key-based groups, columnar pivot and aggregate functions

Parallel Only
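As an illustration of the idea only, and not of the stage's implementation, a vertical pivot can be sketched in plain Python. The function name, the `_1`, `_2` column naming, and the choice of a sum as the aggregate are assumptions of this sketch.

```python
from collections import defaultdict


def vertical_pivot(rows, key, value, width):
    """Collapse rows sharing `key` into one row with value_1..value_N columns.

    A sum column is added as an example of an aggregate function.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])  # key-based grouping

    result = []
    for group_key, values in groups.items():
        out = {key: group_key}
        for i in range(width):  # columnar pivot, padded with None
            out[f"{value}_{i + 1}"] = values[i] if i < len(values) else None
        out[f"{value}_sum"] = sum(values)  # aggregate over the group
        result.append(out)
    return result


rows = [
    {"store": "A", "sales": 10},
    {"store": "A", "sales": 20},
    {"store": "B", "sales": 5},
]
for pivoted in vertical_pivot(rows, "store", "sales", width=2):
    print(pivoted)
```

Three input rows become two output rows, one per store, each carrying its values as columns plus the group total.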


Transformer Enhancements

• Deliver Looping in the Transformer

– Allow multiple output rows to be produced from a single input row

• Support for End of Data Flag to support Key Break Logic

• New Input Cache

– SaveInputRecord()

– GetSavedInputRecord()

• New system variables and functions

– @ITERATION: loop count

– LastRow(): end-of-data flag for the last row

– LastRowInGroup(InputColumn): automates change detection

• Stage and Loop Variables support nullability

• More options for Null Handling

Parallel Only
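Key break logic can be pictured outside the product with a short Python sketch. It only mimics what LastRowInGroup() and LastRow() report, using a one-row lookahead; the helper names are hypothetical, and the input is assumed to be sorted on the key column.

```python
def with_group_flags(rows, key):
    """Yield (row, last_in_group, last_row) using a one-row lookahead."""
    iterator = iter(rows)
    try:
        current = next(iterator)
    except StopIteration:
        return  # no input rows, nothing to flag
    for upcoming in iterator:
        # The key changes on the next row, so `current` closes its group.
        yield current, current[key] != upcoming[key], False
        current = upcoming
    yield current, True, True  # final row ends both its group and the data


def totals_per_group(rows, key, amount):
    """Emit one output row per key group, on the last row of each group."""
    running = 0
    output = []
    for row, last_in_group, _last_row in with_group_flags(rows, key):
        running += row[amount]
        if last_in_group:
            output.append({key: row[key], "total": running})
            running = 0
    return output


rows = [
    {"cust": 1, "amt": 5},
    {"cust": 1, "amt": 7},
    {"cust": 2, "amt": 3},
]
print(totals_per_group(rows, "cust", "amt"))
```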


SCD Stage Functionality

A single stage that encapsulates all of the functionality required to target a star schema model

“Wizard”-style features that enable the user to enter only the minimum amount of information needed

Support for columns of SCD Types 1 and 2 in the same dimension table

Initial and incremental population of fact and dimension tables

Support for surrogate key management across job runs

Parallel Only
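For readers less familiar with the terminology, the difference between Type 1 and Type 2 columns in one dimension table can be sketched as follows. This is a simplified model and not the stage's logic: the column names, the `apply_scd` helper, and the effective/expiry date scheme are assumptions, and the Type 1 overwrite touches only the current row.

```python
from datetime import date

BUSINESS_KEY = "cust_id"
TYPE1_COLS = ["email"]  # overwritten in place, no history kept
TYPE2_COLS = ["city"]   # a change expires the old row and adds a new one


def apply_scd(dimension, incoming, next_key, today):
    """Apply one source record to the dimension; return the next surrogate key."""
    current = next(
        (r for r in dimension
         if r[BUSINESS_KEY] == incoming[BUSINESS_KEY] and r["expiry_date"] is None),
        None,
    )
    new_row = {**incoming, "sk": next_key, "effective_date": today, "expiry_date": None}
    if current is None:  # first time this business key is seen
        dimension.append(new_row)
        return next_key + 1
    if any(current[c] != incoming[c] for c in TYPE2_COLS):
        current["expiry_date"] = today  # close the old version
        dimension.append(new_row)       # open the new version
        return next_key + 1
    for column in TYPE1_COLS:           # Type 1: overwrite, keep the same row
        current[column] = incoming[column]
    return next_key


dimension = []
key = 1
key = apply_scd(dimension, {"cust_id": 7, "email": "a@x.example", "city": "Oslo"}, key, date(2010, 5, 1))
key = apply_scd(dimension, {"cust_id": 7, "email": "b@x.example", "city": "Oslo"}, key, date(2010, 5, 2))
key = apply_scd(dimension, {"cust_id": 7, "email": "b@x.example", "city": "Lund"}, key, date(2010, 5, 3))
for row in dimension:
    print(row)
```

The surrogate key counter is passed in and returned so that it can be carried from one run to the next, which is the point of surrogate key management across job runs.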


Administrator Features

● New Suite Installer

● Increased Availability

● Information Server Manager

● Source Code Control System Integration

● Audit Logging

● Web Logging Console

● New Administration and Super Operator Roles

● Support Assistant

Simplify and expand the functionality available for administrators of the tool

New Suite Installer

Web enabled (no X Windows configuration requirement)

Prerequisite checker confirms system meets basic requirements as well as selected installation options

Installer supports resuming an installation that failed for unexpected reasons

Streamlined feature selection, including adding additional Information Server products and additional tiers to the system

Trust-based licensing is now the default

Install utility also handles patches/updates, including unique Patch Merging Installation


Increased Availability

Delivering Horizontal and Vertical scaling / load balancing of the Domain and DB Tiers

Delivering higher levels of Availability for Production and Development environments

Cluster support for Application Server Tier

– WAS 6.1 and 7.0 ND

Cluster support for Repository Tier

– DB2 HADR / Cluster, Oracle RAC

Improved Failover support for Engine Tier


What is the Information Server Manager?

New Information Server application

Manages DataStage & QualityStage objects’ lifecycles

Move objects from development environments to test and production environments

Handle updates to previously deployed objects

New import/export capabilities

GUI available on Windows platforms (client or server)

Command line interface available on all client and server platforms

Is a replacement for, NOT a reimplementation of, the pre-8.0 DataStage Version Control application


Source Code Control System Integration

Leveraging the Eclipse Platform for Team Development

Integration with code-management (SCCS) providers supported through Eclipse Plugins

Support for ClearCase, CVS

– Other providers configurable via Eclipse Team Plugins

Functions to interact with the SCCS invoked from the Information Server Manager


ISA Lite - Support Assistant

1. System Requirements: installation prerequisites checker (general IS and SAP)

2. Data Collection: general IS data files (passive) and specific component collections, incl. SAP. Option to collect IS repository records.

3. Diagnostics (Health): general IS checks as well as component-specific checks.

4. Utilities: DS project cleanup tool


Audit Logging

Deliver an audit log of security-related events.

Delivers SOX and Security Compliance

The following groups of audit events are logged:

– User and group management: creation and removal of users and groups, user group membership changes, and user credential changes

– User, group, and project security role assignments: creation or deletion of a security role, assignment and removal of security roles to users or groups, and assignment or removal of users or groups and roles to a project

– Engine credential mapping: assignment and removal of credentials to IBM InfoSphere DataStage® suite users, and assignment of default credentials for an IBM InfoSphere Information Server engine when mapping credentials using the Engine Credentials panel of the IBM InfoSphere Information Server Web console

– User session management: user login and logout, direct session termination, and session expiration

– Audit configuration: auditing properties file location, audit file configuration settings, and audit event settings


Operations & Run Time

● New Platform Support

● Run-time Performance Improvements

● Job Parameter Sets

● Runtime Optimizations

● Machine Resource Estimation

● Job Performance Data Analysis

● Serviceability Tools

● Documentation

● Server-side import/export (via new istool command line utility)


Increased support for the management and performance of the run time environment


Platform Support

– Red Hat Enterprise Linux 5, 6 (64 bit)

– SUSE Linux Enterprise Server 9, 10 (64 bit)

– Windows Server 2008 64 bit (32-bit app)

– AIX 5.3, 6.1 (64 bit)

– Solaris 9, 10 (64 bit)

– HP-UX Itanium (64 bit)

– Red Hat Enterprise Linux for System Z (64 bit)

– SUSE Linux Enterprise Server for System Z (64 bit)

– Windows Server 2003 (32-bit)

– Red Hat Enterprise Linux 5, 6 (as 32-bit app)

– SUSE Linux Enterprise Server 9, 10 (as 32-bit app)

Clients – Windows XP, Vista and 7 (32 & 64 bit)

Repository – DB2 9.5, 9.7, Oracle 10g, 11g, SQL Server 2005, 2008


Parameter Sets

• Job Parameter Sets

• New object in repository that contains the names and values of job parameters.

• A Job Parameter Set can be referenced by one or more jobs. This makes it easier to deploy jobs across machines and to propagate a changed job parameter value.
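The benefit of a shared parameter object can be shown with a toy model. The `ParameterSet` and `Job` classes below are hypothetical and exist only to illustrate the point; they do not correspond to any DataStage API.

```python
from dataclasses import dataclass


@dataclass
class ParameterSet:
    """Hypothetical stand-in for a parameter set object in the repository."""
    name: str
    values: dict


@dataclass
class Job:
    """Hypothetical job that references a parameter set instead of copying it."""
    name: str
    parameter_set: ParameterSet

    def resolve(self, parameter):
        return self.parameter_set.values[parameter]


db_params = ParameterSet("DbConnection", {"host": "dev-db", "schema": "STAGE"})
load_job = Job("LoadCustomers", db_params)
report_job = Job("BuildReport", db_params)

# Moving to another machine: change the value once, in the shared set.
db_params.values["host"] = "prod-db"
print(load_job.resolve("host"), report_job.resolve("host"))
```

Because both jobs hold a reference to the same set, the single change is visible to each of them.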


Run Time Optimizations

● Buffer Optimization

– Improved buffer placement algorithm

– E.g., removed unnecessary buffer before parallel sort in some instances

● Improved Job Startup Time

– Startup time improvements allow efficient use of EE against smaller data sets

● Adaptive Job Monitoring

– The primary function of the Adaptive Job Monitoring feature is to detect when CPU utilization by the conductor reaches 80% and throttle the volume of jobmon data by sending control messages to the players to reduce the output rate

– When 80% CPU utilization by the conductor is reached, a warning message will be issued to the user

– Note: only monitor messages will be throttled; metadata and summary messages are not affected

Parallel Only
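The throttling rule can be sketched as a small control function. Only the 80% threshold and the warning come from the slide; the doubling and halving of the reporting interval, the interval bounds, and the function name are assumptions made for this sketch.

```python
CPU_THRESHOLD = 0.80  # conductor CPU utilization that triggers throttling


def next_interval(cpu_utilization, interval_s, base_interval_s=1.0, max_interval_s=60.0):
    """Return (new monitor-message interval for the players, warning or None).

    Backing off by doubling and recovering by halving are assumptions of
    this sketch, not documented product behavior.
    """
    if cpu_utilization >= CPU_THRESHOLD:
        warning = "conductor CPU at or above 80%: throttling monitor messages"
        return min(interval_s * 2, max_interval_s), warning
    return max(interval_s / 2, base_interval_s), None


interval = 1.0
for cpu in [0.50, 0.85, 0.90, 0.60]:
    interval, warning = next_interval(cpu, interval)
    print(cpu, interval, warning)
```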


Machine Resource Estimation

● Three modes:

– Estimate

– Re-Estimate

– Run

● Estimate Mode

– For a given job, provides estimates for the disk space required and CPU utilization.

– Two models:

• Static – provides disk space estimates based on schema and job design

• Dynamic – provides calculated estimates by node based on a run of the job against a sample of the data

Parallel Only
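A static estimate of this kind amounts to simple arithmetic over the schema. The sketch below is an assumption-laden illustration: the field sizes, the schema representation, and the `static_disk_estimate` helper are invented for the example and say nothing about how the product sizes records.

```python
# Assumed storage sizes in bytes for fixed-width types (illustrative only).
FIELD_BYTES = {"int32": 4, "int64": 8, "float64": 8, "date": 4}


def static_disk_estimate(schema, row_count):
    """Estimate bytes needed for `row_count` rows of the given schema.

    `schema` is a list of (name, type, length); length is used for strings.
    """
    row_bytes = 0
    for _name, type_name, length in schema:
        row_bytes += length if type_name == "string" else FIELD_BYTES[type_name]
    return row_bytes * row_count


schema = [("id", "int64", None), ("name", "string", 30), ("joined", "date", None)]
print(static_disk_estimate(schema, 1_000_000))
```

With these assumed sizes each row takes 8 + 30 + 4 = 42 bytes, so one million rows come to 42,000,000 bytes.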


Job Performance Data Analysis

• Provides a graphical display of job performance and utilization based on a job run

• The data includes CPU time, system time, elapsed time, memory (heap) size, and number of records processed

• The data is presented as phases and sub-phases for each stage/operator execution in the job

• Can view all nodes in the job or specific nodes in the job

Parallel Only


Connectivity

● New Connectors (available on Server canvas in 8.5)

● Local transaction support

● z/OS File Stage

● Distributed Transaction Stage

● CDC integration through DTS

● Connection Objects - Meta Data Reuse

● SQL Builder Enhancements

● Netezza Enterprise Stage

● iWay Enterprise Stage

● WebSphere II Federation

● Stored Procedure Plug-in Support (SQL Server & Teradata added)


Maximize the reach across the organization to easily access various types of data


Common Connectors

● One component to access data from same source/target

– Supports DataStage, QualityStage, & Information Analyzer

● Combines & extends features of existing connectivity stages

● Ease of Use - improved and consistent interfaces

● Extended Functionality

● Better Performance

● DBMS Version Management

● Common Connectors supersede DS SE plug-ins and DS EE Operators, yet they can co-exist with them. Connectors are where IBM is investing and adding new capabilities.

Connectors

DB2

ODBC

WebSphere MQ

Oracle

Teradata

DTS – Distributed Transaction Stage (XA)

Migration Tool to convert existing jobs to use new Connectors


Connector Migration Tool

Modifies jobs that use legacy plug-in/operator stages to use newer Connectors

– Migrates all compatible stages

– GUI and command line (batch) modes

– Server and Parallel jobs

– Backup, clone or replace jobs

– Jobs are annotated with information about the migration


Distributed Transaction Stage

The Distributed Transaction Stage (DTS) utilizes the WebSphere MQ Transaction Manager to enable distributed two-phase transactions across multiple resources.

It works in collaboration with the WebSphere MQ Connector to move source messages to database targets.

Currently supports MQ (Source), DB2. Adding support for MQ (Target), Teradata, Oracle and ODBC

[Diagram: Source Queue → Target Database]

Source data arrives on MQ queue from some external application

Business logic transforms the data to construct target actions

DTS updates target and deletes message from MQ source queue.

Parallel Only


Multiple Input Link Support for Connectors

● Support Local Transaction Grouping

– Insert/Update/Delete

– Commit all rows, on all links, or fail / roll back

● Both batch and real-time

● Support for SQL error codes and reject links on a per-link basis
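The all-or-nothing behavior of local transaction grouping can be illustrated with Python's standard `sqlite3` module standing in for the target database. The `apply_links` helper and the table names are hypothetical; each "link" is modeled as a statement plus its rows.

```python
import sqlite3


def apply_links(conn, links):
    """Apply every row on every link in one transaction; roll back on any error."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            for sql, rows in links:
                conn.executemany(sql, rows)
        return True
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
conn.execute("CREATE TABLE audit (order_id INTEGER NOT NULL)")
conn.commit()

ok = apply_links(conn, [
    ("INSERT INTO orders (id, item) VALUES (?, ?)", [(1, "pen"), (2, "ink")]),
    ("INSERT INTO audit (order_id) VALUES (?)", [(1,), (2,)]),
])
failed = apply_links(conn, [
    ("INSERT INTO orders (id, item) VALUES (?, ?)", [(3, "cap")]),
    ("INSERT INTO audit (order_id) VALUES (?)", [(None,)]),  # violates NOT NULL
])
print(ok, failed, conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
```

In the second call the insert on the first link succeeds, but the failure on the second link rolls back both, so order 3 never appears in the table.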


z/OS File stage (already enabled for V8.1)

• New native support for accessing mainframe files from distributed platforms and Linux for System Z

– Through a new stage called z/OS File stage

– VSAM files: KSDS, ESDS, RRDS

– Sequential files: QSAM, sequential read of BDAM/BSAM, PDS members, GDG files

• Initial release

– Read/Write for Sequential files and read only for VSAM

– Fixed and variable-length records; single or multi record type format files will be supported

• Leveraging InfoSphere Classic Federation

Parallel Only


Thank you for your attention

Peter Bjelvert

IBM Software Group, InfoSphere

[email protected]/software/data/infosphere