
IBM DB2 Intelligent Miner Scoring

Administration and Programming for DB2

Version 8.1

SH12-6745-00



Note: Before using this information and the product it supports, be sure to read the information in Appendix H, “Notices”.

First Edition, October 2002

This edition applies to Version 8.1 of IBM DB2 Intelligent Miner Scoring, program number 5765–F36, and to all subsequent releases and modifications until otherwise indicated in new editions.

© Copyright International Business Machines Corporation 2001, 2002. All rights reserved. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Figures
Tables
About this book
   Who should use this book
   Conventions and terminology used in this book
   How this book is structured
   How to read the syntax diagrams
   How to send your comments

Part 1. Guide

Chapter 1. Introducing the Intelligent Miner products
   IBM DB2 Intelligent Miner Scoring
   IBM DB2 Intelligent Miner Modeling
   IBM DB2 Intelligent Miner Visualization
   IBM DB2 Intelligent Miner for Data

Chapter 2. Introducing IM Scoring
   IM Scoring
      Mining functions supported by IM Scoring
      Using IM for Data to produce models
      Using IM Modeling to produce models
      Scoring data types, methods, and functions
   PMML: A markup language for data mining
   Converting models
   Online scoring with IM Scoring Java Beans
   What is new in version 8.1
      Ease of use
      E-business enhancements
      Functional enhancements
      Standards conformance
      Platform support
      Shared infrastructure with IM Modeling
      Limitations

Chapter 3. Data mining functions
   Classification
   Clustering
   Regression/Prediction

Chapter 4. Getting started
   Quick start
      Installation
      Configuring IM for Data to export PMML models
      Configuring the database environment
      Creating database objects
      Verifying the installation and configuration
      Executing sample applications
      Generating SQL scripts from your own mining models
   Sample components
   Completing the practice exercises
      Creating a table and importing data
      Importing a mining model
      Applying a model and getting results values
      Extracting information from a model
      Applying models created with IM Modeling
      Using IM Scoring Java Beans to score records
      Using idmmkSQL to work with your own mining models

Chapter 5. Using IM Scoring
   Creating database objects
      Enabling databases
      Disabling databases
      Checking databases
   Working with mining models
      Exporting models from IM for Data
      Converting exported models
      Generating SQL statements from models
      Importing mining models
      Providing models by means of IM Modeling
   Applying mining models
      Querying model field names
      Using the application functions
      Specifying data by means of REC2XML
      Specifying data by means of DM_applData
      Specifying data by means of CONCAT
      Results data
      Code sample for applying models
   Getting application results
   Handling missing values
   Using IM Scoring Java Beans
      Setting environment variables
      Specifying the mining model to be used
      Accessing model metadata
      Specifying a data record
      Applying scoring
      Accessing computed results
      Scoring example
      ScoringException classes

Chapter 6. Administrative tasks
   Using IM Scoring in a multilanguage environment
   Getting error information
   Getting support
      Product README
      'Frequently asked questions' and 'Hints and tips'
      Problem identification worksheet
      Getting product information
      Getting trace information
      Getting DB2 diagnostic information

Part 2. Reference

Chapter 7. Overview of IM Scoring database objects
   Data types provided by IM Scoring
   Methods provided by IM Scoring
   Functions provided by IM Scoring
   Parameter sizes

Chapter 8. IM Scoring methods reference
   DM_expDataSpec
   DM_getFldName
   DM_getFldType
   DM_getNumFields
   DM_impDataSpec
   DM_isCompatible

Chapter 9. IM Scoring functions reference
   DM_applData
   DM_applyClasModel
   DM_applyClusModel
   DM_applyRegModel
   DM_expClasModel
   DM_expClusModel
   DM_expRegModel
   DM_getClasCostRate
   DM_getClasMdlName
   DM_getClasMdlSpec
   DM_getClasTarget
   DM_getClusConf
   DM_getClusMdlName
   DM_getClusMdlSpec
   DM_getClusScore
   DM_getClusterID
   DM_getClusterName
   DM_getConfidence
   DM_getNumClusters
   DM_getPredClass
   DM_getPredValue
   DM_getQuality
   DM_getQuality(clusterid)
   DM_getRBFRegionID
   DM_getRegMdlName
   DM_getRegMdlSpec
   DM_getRegTarget
   DM_impApplData
   DM_impClasFile
   DM_impClasFileE
   DM_impClasModel
   DM_impClusFile
   DM_impClusFileE
   DM_impClusModel
   DM_impRegFile
   DM_impRegFileE
   DM_impRegModel

Chapter 10. IM Scoring command reference
   The idmcheckdb command
   The idmdisabledb command
   The idmenabledb command
   The idminstfunc command
   The idmlevel command
   The idmlicm command
   The idmmkSQL command
   The idmuninstfunc command
   The idmxmod command

Chapter 11. IM Scoring Java Beans reference

Part 3. Appendixes

Appendix A. Installing IM Scoring
   Installing IM Scoring on AIX systems
      Prerequisites for AIX systems
      Installing IM Scoring
      Uninstalling IM Scoring
   Installing IM Scoring on Linux systems
      Prerequisites for Linux systems
      Installing IM Scoring
      Uninstalling IM Scoring
   Installing IM Scoring on Sun Solaris systems
      Prerequisites for Sun Solaris systems
      Installing IM Scoring
      Uninstalling IM Scoring
   Installing IM Scoring on Windows systems
      Prerequisites for Windows systems
      Installing IM Scoring
      Uninstalling IM Scoring
   Configuring the database management system on UNIX systems
      Enabling the DB2 instance on UNIX systems
      Disabling the DB2 instance on UNIX systems
   Configuring the database management system on Windows systems
      Enabling the DB2 instance on Windows systems
   Enabling IM for Data to export PMML or XML models
      On AIX systems
      On Sun Solaris systems
      On Windows systems

Appendix B. Installing IM Scoring Java Beans
   Installing IM Scoring Java Beans on AIX systems
      Prerequisites for AIX systems
      Installing IM Scoring Java Beans
      Uninstalling IM Scoring Java Beans
   Installing IM Scoring Java Beans on Linux systems
      Prerequisites for Linux systems
      Installing IM Scoring Java Beans
      Uninstalling IM Scoring Java Beans
   Installing IM Scoring Java Beans on Sun Solaris systems
      Prerequisites for Sun Solaris systems
      Installing IM Scoring Java Beans
      Uninstalling IM Scoring Java Beans
   Installing IM Scoring Java Beans on Windows systems
      Prerequisites for Windows systems
      Installing IM Scoring Java Beans

Appendix C. Migration from IM Scoring V7.1
   Working with IM Scoring V7.1 and V8.1 in parallel
   Exporting and importing models with the use of compression
   Exporting and importing models by means of DB2 Utilities
   Importing models in unfenced mode
   Applying Neural models
   Using the function DM_getClusterID

Appendix D. Coexistence with IM Modeling
   Shared schema
   Shared data types
   Shared functions
   Shared methods
   Shared commands

Appendix E. Error messages
   DB2 SQL states
   IM Scoring SQL states
   IM Scoring error events

Appendix F. The DB2 REC2XML function

Appendix G. IM Scoring conformance to PMML
   IM Scoring application
   IM Scoring conversion tools
   Radial-Basis Function prediction

Appendix H. Notices
   Trademarks

Bibliography and related information
   IBM DB2 Intelligent Miner publications
   IBM DB2 Universal Database (DB2 UDB) publications
   Related information

Index

Figures

1. The IM Scoring process
2. Architecture sample to realize a call-center scenario
3. Model import processes
4. Applying a model to data

Tables

1. Formatting conventions
2. Abbreviations
3. PMML model types
4. Sample components for the Clustering mining function of IM Scoring
5. Sample components for IM Scoring Java Beans
6. Import functions and related data types and tables
7. Import functions using a specific XML encoding
8. Import functions using CLOB values
9. Functions for applying models
10. Application functions and their data types and results data
11. Results functions and their purpose
12. IM Scoring Java Beans methods for accessing model metadata
13. IM Scoring Java Beans methods for accessing computed results
14. Data types specific to IM Scoring
15. Methods for type DM_LogicalDataSpec
16. Functions for working with scoring data type DM_ApplicationData
17. Functions for working with data mining model type DM_ClasModel
18. Functions for working with scoring result type DM_ClasResult
19. Functions for working with scoring result type DM_ClusResult
20. Functions for working with data mining model type DM_ClusteringModel
21. Functions for working with data mining model type DM_RegressionModel
22. Functions for working with scoring result type DM_RegResult
23. Mining field types
24. The idmcheckdb messages

About this book

IBM DB2® Intelligent Miner™ Scoring is an application that integrates the model application functionality of Intelligent Miner for Data Version 6.1 or higher with the DB2 Universal Database™.

Intelligent Miner Scoring enables you to import and apply mining models, and to access the results.

Throughout this book, the following abbreviations are used:
- IBM DB2 Intelligent Miner Scoring V8.1 is referred to as IM Scoring.
- IBM DB2 Intelligent Miner Scoring V7.1 is referred to as IM Scoring V7.1.
- IBM DB2 Intelligent Miner Modeling V8.1 is referred to as IM Modeling.
- IBM DB2 Intelligent Miner Visualization V8.1 is referred to as IM Visualization.
- IBM DB2 Intelligent Miner for Data is referred to as IM for Data.

This book describes how to install and use IM Scoring and IM Scoring Java Beans. This book also provides a full reference resource to the database objects provided by IM Scoring.

References in this book to DB2 refer to DB2 UDB Version 7.2 or higher.

Who should use this book

This book is intended for the following users:
- DB2 database administrators who are familiar with DB2 administration concepts, tools, and techniques
- Users of IM for Data who are familiar with the concepts underlying the different data mining functions that IM for Data provides
- DB2 application programmers who are familiar with SQL and with one or more programming languages that can be used for DB2 applications

Conventions and terminology used in this book

In DB2, the names of the scoring methods, functions, data types, tables, and table columns are created in capital letters, even if you entered them in, for example, lowercase letters. In this book, these names are represented in mixed case for better readability.

The following table shows the formatting conventions used in this book.

Table 1. Formatting conventions

Convention used | How it is used
--- | ---
Interface elements, for example, menu bars, buttons, and labels, are shown in boldface. | Click OK.
Menu instructions are shown in boldface and sequential instructions are separated by arrows. | Click File -> Export.
Command syntax is shown in a monospaced font. | db2 -stf idmtab.db2
The names of the following are shown in a monospaced font: files and directories; database tables and columns; SQL methods, functions, and data types. | The SQL INSERT command inserts the model into a column of the table ClusterModels, which is configured for the data type DM_ClusteringModel.
Variables within command syntax, which you should replace by a real value, are shown in italics between angle brackets. | idmdisabledb <db name>
Italics are used to highlight the introduction of a new term. | These functions are also referred to as user-defined functions.

The following table shows the abbreviations used in this book.

Table 2. Abbreviations

Abbreviation | Full form
--- | ---
CRM | Customer Relationship Management
GUI | Graphical user interface
ICU | International Classes for Unicode
PMML | Predictive Model Markup Language
RBF | Radial Basis Function
RPM | Red Hat Package Manager
SQL | Structured Query Language
UDF | User-defined function
UDM | User-defined method
UDT | User-defined data type
XML | Extensible Markup Language

How this book is structured

This book is divided into the following parts:

Part 1. Guide
   Contains the following:
   - An overview of the functionality available with IM Scoring
   - Instructions on how to get started with IM Scoring
   - Guidance on how to use IM Scoring and how to perform administrative tasks

Part 2. Reference
   Provides a reference resource to all the IM Scoring database objects and utilities.

Part 3. Appendixes
   Contains the following:
   - Instructions on how to install, configure, and uninstall IM Scoring and IM Scoring Java Beans
   - Information on migration issues from IM Scoring V7.1 and on conformance with PMML
   - Instructions on using the DB2 function REC2XML
   - Information about the error messages produced by IM Scoring

How to read the syntax diagrams

In the reference part of this book, the syntax for IM Scoring’s functionality is described using the following structure:

- Read the syntax diagrams from left to right and top to bottom, following the path of the line.
  The ►►─── symbol indicates the beginning of a statement.
  The ───► symbol indicates that the statement syntax is continued on the next line.
  The ►─── symbol indicates that a statement is continued from the previous line.
  The ───►◄ symbol indicates the end of a statement.

- Required items appear on the horizontal line (the main path).

  ►►──required item──►◄

- Optional items appear below the main path.

  ►►──┬───────────────┬──►◄
      └─optional item─┘

- If you can choose from two or more items, they appear in a stack.
  If you must choose one of the items, one item of the stack appears on the main path.

  ►►──┬─required choice1─┬──►◄
      └─required choice2─┘

  If choosing none of the items is an option, the entire stack appears below the main path.

  ►►──┬──────────────────┬──►◄
      ├─optional choice1─┤
      └─optional choice2─┘

  A repeat arrow above a stack indicates that you can make more than one choice from the stacked items.

      ┌──────────────────────┐
      ▼                      │
  ►►────┬──────────────────┬─┴──►◄
        ├─optional choice1─┤
        └─optional choice2─┘

- Keywords must be spelled exactly as shown. Variables appear in lowercase letters (for example, encoding name). They represent names or values that you must supply.

- If punctuation marks, parentheses, arithmetic operators, or other such symbols are shown, you must enter them as part of the syntax.

How to send your comments

Your feedback is important in helping us to provide you with the most accurate and high-quality information possible. If you have any comments about this book:
- Send your comments by e-mail to [email protected]. Be sure to include the name and part number of the book, and to say which version of IM Scoring you are using. If applicable, include the specific location of the text you are commenting on. For example, give a page number or table number.
- Fill out the Readers’ Comments form at the back of this book. Return it by mail, by fax, or by giving it to an IBM representative. The mailing address is on the back of the form. The fax number is +49-(0)7031-16-4892.

Part 1. Guide

This part introduces you to IM Scoring and gives you instructions for its use.
- For an overview of the Intelligent Miner family of products, see Chapter 1, “Introducing the Intelligent Miner products”.
- For an overview of IM Scoring, see Chapter 2, “Introducing IM Scoring”.
- For a quick overview of what you need to do to get up and running with IM Scoring, see Chapter 4, “Getting started”. This chapter also contains a tutorial in the form of sample exercises.
- For full instructions in the use of IM Scoring, see Chapter 5, “Using IM Scoring”.
- For instructions on doing a number of administrative tasks connected with IM Scoring, see Chapter 6, “Administrative tasks”.

Chapter 1. Introducing the Intelligent Miner products

The IBM DB2 Intelligent Miner Version 8.1 is a set of the following products:
- Intelligent Miner Scoring
- Intelligent Miner Modeling
- Intelligent Miner Visualization

These products support rapid enablement of Intelligent Miner analytics embedded in Business Intelligence (BI), eCommerce, or traditional OLTP application programs.
- You can use IM Scoring to deploy PMML models that were created by one of the Intelligent Miner products or by other applications and tools that support interoperability through the use of PMML models.
- You can use IM Modeling to build data mining models.
- You can use IM Visualization to browse PMML models that are created by one of the Intelligent Miner products or by other applications and tools that support interoperability through the use of PMML models.

PMML is a standard format for data mining models. Based on XML, PMML provides a standard that enables data mining models to be shared between the applications of different vendors. The intention is to provide a vendor-independent method of defining models. In this way, proprietary issues and incompatibilities are no longer a barrier to the exchange of models between applications.

You can find more information about PMML on the Web site of the Data Mining Group (DMG) at http://www.dmg.org.

IBM DB2 Intelligent Miner Scoring

IM Scoring provides scoring technology as database extenders, DB2 extenders, and Oracle cartridges. It enables application programs to apply PMML models to large databases, subsets of databases, or single rows or cases. Application programs use the SQL API, which consists of user-defined functions (UDFs) and user-defined methods (UDMs), to perform the scoring operation. The PMML models might have been created by one of the Intelligent Miner products or by other applications and tools that support interoperability through the use of PMML models.

The following table shows the PMML model types that IM Scoring can apply and the mining algorithms that correspond to them.

Table 3. PMML model types

PMML model type | Mining algorithm
--- | ---
Center-based clustering | Neural Clustering algorithm
Distribution-based clustering | Demographic Clustering algorithm
Neural networks | Neural Classification algorithm, Neural Prediction algorithm
Decision tree | Tree Classification algorithm
Regression | Logistic Regression algorithm, Polynomial Regression algorithm, Linear Regression algorithm

Additionally, IM Scoring supports models that are built by the RBF Prediction algorithm of the Intelligent Miner for Data. These models are not yet part of PMML. You can export these models in XML format from the Intelligent Miner for Data and use them with IM Scoring.

Mining models that are applied by the SQL API of IM Scoring must be contained in database tables.

If the mining models are created by means of IM Modeling, they can be directly applied because IM Modeling writes the models into database tables.

If the mining models are created by means of the Intelligent Miner for Data, they must be exported from the Intelligent Miner for Data and imported into database tables. IM Scoring provides UDFs to import the models.

You can also apply PMML V1.1 or PMML V2.0 models that are created with tools from different vendors.

IM Scoring provides a feature called Single Record Scorer. The Single Record Scorer consists of a Java API. You can use this feature to score single or multiple data records against a mining model that is contained in a flat file. The Single Record Scorer is designed for applications where the online scoring of data records is the main task.

IBM DB2 Intelligent Miner Modeling

IM Modeling provides IM modeling technology as DB2 extenders. It enables SQL application programs to call associations discovery, clustering, and classification operations to develop analytic models based on data accessed by DB2 Universal Database Version 7 or Version 8 SQL. The resulting models are in PMML V2.0 format. They can be processed by IM Scoring or IM Visualization.

IM Modeling consists of an SQL API. By using this SQL API, you can build Associations, Demographic Clustering, and Tree Classification PMML models that are stored in DB2 tables.

The data mining functions are based on the mining functions included in the Intelligent Miner for Data.

IBM DB2 Intelligent Miner Visualization

IM Visualization provides the following Java visualizers to present data modeling results for analysis:
- Associations Visualizer
- Classification Visualizer
- Clustering Visualizer

You can use the Intelligent Miner Visualizers to visualize PMML-conforming mining models. Applications can call these visualizers to present model results, or you can deploy the visualizers as applets in a Web browser for ready dissemination. The models might have been developed by using IM Modeling or other applications and tools that support interoperability through the use of PMML models, or models of the Intelligent Miner for Data might have been exported as PMML models.

The Intelligent Miner Visualizers are included in Intelligent Miner for Data Version 8.1.

IBM DB2 Intelligent Miner for Data

Intelligent Miner for Data Version 8.1 is an independent product that provides the following mining functions to build and apply mining models based on database or flat file data:
- Associations mining function
- Classification mining function including the following algorithms:
   - Neural Classification
   - Tree Classification
- Clustering mining function including the following algorithms:
   - Demographic Clustering
   - Neural Clustering
- Prediction mining functions including the following algorithms:
   - Neural Prediction
   - Polynomial Regression
   - RBF Prediction
- Processing functions
  The Processing functions can be used only on database tables.
- Sequential Patterns mining function
- Similar Sequences mining function
- Statistics functions

Version 8 of the Intelligent Miner for Data includes the Intelligent Miner Visualizers. It also includes the PMML conversion component of IM Scoring, which allows you to export mining models in PMML format.

Chapter 2. Introducing IM Scoring

This chapter introduces IM Scoring. It describes the functionality provided by IM Scoring, and provides information about PMML and model conversion.

This chapter also describes what is new in IM Scoring V8.1.

IM Scoring

IM Scoring is an add-on service to DB2 that extends the capabilities of DB2 to include data mining functions. Mining models continue to be built through the use of the following tools:

IM for Data
   This produces models that can be exported as PMML models.

IM Modeling
   This provides mining models in PMML 2.0 format.

Other tools
   Any other tool that provides mining models in PMML 1.1 or PMML 2.0 format.

You can use the IM Scoring functionality to import certain types of mining models into a DB2 table, to apply the models to data within DB2, and to access the results. This functionality comprises the scoring functions of IM Scoring.

The results of applying the model are referred to as scoring results. These results differ in content according to the type of model applied. IM Scoring includes functions to retrieve the values of scoring results.

IM Scoring is available on the following operating systems:
- AIX®
- Linux
- Sun Solaris
- Windows NT®, Windows® 2000, Windows XP

Mining functions supported by IM Scoring

IM Scoring supports the application mode for the following IM for Data mining and statistical functions:
- Demographic and Neural Clustering
- Tree and Neural Classification
- RBF and Neural Prediction
- Polynomial Regression

For a short introduction to these mining functions, see Chapter 3, “Data mining functions”.

IM Scoring supports the application of the following models created by IM Modeling:
- Demographic Clustering
- Tree Classification

For descriptions of these mining models, see the IM Modeling documentation, IM Modeling Administration and Programming. In this guide, Chapter 3, “Data mining functions” also contains brief introductory information about mining models.

In addition, IM Scoring supports the application of Logistic Regression models.

Within IM Scoring, the mining functions are grouped into the mining types Clustering, Classification, and Regression as follows:
- Clustering includes Demographic and Neural Clustering
- Classification includes Tree and Neural Classification
- Regression includes RBF Prediction, Neural Prediction, Polynomial Regression, and Logistic Regression

Scoring functions are provided to work with each of these types. Each scoring function includes different algorithms to deal with the different mining functions included within a type. For example, the Clustering type includes Demographic and Neural Clustering; thus, scoring functions for Clustering include algorithms for demographic and neural clustering.

Using IM for Data to produce models

For all the mining functions that are supported, except Logistic Regression, you can build and store the models by using IM for Data, which supports PMML models. A model must then be exported to an external file.

To use the IM Scoring mining functions:
- Import the mining model into a DB2 table, where it is stored as a large object
- Apply the model to data stored in DB2 tables
- Store scoring results in a DB2 table
- Extract information about the results, for example, the cluster ID and score
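The four steps listed above might look as follows in SQL. This is a minimal sketch rather than a definitive script: the function names are taken from the reference part of this book, but the argument lists, the table IDMMX.ClusterModels with its columns MODELNAME and MODEL, the file path, and the tables BANK.CUSTOMERS and BANK.CLUSTER_RESULTS are assumptions made for illustration. Check Part 2, “Reference” for the exact signatures before you use any of these statements.

```sql
-- 1. Import an exported clustering model from a file into a DB2 table.
--    (Path, table name, and column names are hypothetical.)
INSERT INTO IDMMX.ClusterModels (MODELNAME, MODEL)
  VALUES ('DemoBanking',
          IDMMX.DM_impClusFile('/tmp/demo_banking_clus.pmml'));

-- 2. Apply the model to each row, and 3. store the scoring result.
--    REC2XML packs the listed columns into an XML record; passing that
--    record to DM_impApplData is an assumption of this sketch.
INSERT INTO BANK.CLUSTER_RESULTS (CUSTOMER_ID, RESULT)
  SELECT d.CUSTOMER_ID,
         IDMMX.DM_applyClusModel(
           m.MODEL,
           IDMMX.DM_impApplData(
             REC2XML(1.0, 'COLATTVAL', '', d.AGE, d.INCOME)))
  FROM BANK.CUSTOMERS d, IDMMX.ClusterModels m
  WHERE m.MODELNAME = 'DemoBanking';

-- 4. Extract the cluster ID and the score from the stored results.
SELECT r.CUSTOMER_ID,
       IDMMX.DM_getClusterID(r.RESULT) AS CLUSTER_ID,
       IDMMX.DM_getClusScore(r.RESULT) AS SCORE
FROM BANK.CLUSTER_RESULTS r;
```

Storing the result in its own table, as in step 3, keeps the comparatively expensive model application separate from the cheap result extraction in step 4.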

Figure 1 shows the process by which a mining model that was built with IM for Data is exported from IM for Data, imported into a DB2 database, and applied to selected data.

Figure 1. The IM Scoring process

Using IM Modeling to produce models

You can use IM Modeling to create models for the mining functions that it supports; these are Demographic Clustering and Tree Classification. The models that IM Modeling creates reside in a DB2 table. These models are in a format that enables IM Scoring to apply them directly.

Scoring data types, methods, and functions

The database objects supplied with IM Scoring consist of the following:
- User-defined data types (UDTs)
- User-defined functions (UDFs)
- User-defined methods (UDMs)

These database objects are grouped together in the schema IDMMX. To access aUDT, UDF, or UDM, you must specify its fully-qualified name, for example,data type IDMMX.DM_ClusteringModel.

Part 2, “Reference” on page 73 supplies overview lists and full descriptions of all the database objects supplied with IM Scoring.

User-defined data types

The user-defined data types are used for identifying and storing mining models and results in DB2 tables. User-defined data types are also referred to as user-defined types or UDTs.

The user-defined data types provided by IM Scoring consist of distinct types and structured types.

Distinct types

The following user-defined types are distinct types in IM Scoring:
- DM_ApplicationData
- DM_ClasModel, DM_ClusteringModel, DM_RegressionModel
- DM_ClasResult, DM_RegResult, DM_ClusResult

Structured type

The following user-defined type is a structured type in IM Scoring:
- DM_LogicalDataSpec

User-defined methods

Use user-defined methods to create or modify user-defined structured types. You can call the methods that are defined for a type by using either a method syntax or a function syntax.

Method syntax

To call, or invoke, a method using the method syntax:
- In an appropriate context, specify the method name preceded by both a reference to a structured type instance and the double dot operator.
- Follow this with the list of arguments enclosed in parentheses.

Example:

    select IDMMX.DM_getClusMdlSpec(modelcolumn)..DM_getNumFields()...

Function syntax

To call, or invoke, a method using the function syntax:

In an appropriate context, specify the method name followed by, in parentheses, the structured type instance and the list of arguments.

Example:

    select IDMMX.DM_getNumFields( IDMMX.DM_getClusMdlSpec(modelcolumn) )...


If the structured type instance is NULL, the method is not called, and NULL is returned.
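The two calling styles described above differ only in syntax. As a rough sketch, the following shell fragment writes both forms into a small script file that could then be run with db2 -stf. The FROM and WHERE clauses are assumptions: they borrow the sample table IDMMX.ClusterModels (columns MODELNAME and MODEL) and the model name DemoBanking from the practice exercises in Chapter 4, and the file name numFields.db2 is hypothetical.

```shell
#!/bin/sh
# Write both call styles into one DB2 script (hypothetical file name).
# Assumption: the sample table IDMMX.ClusterModels holds a model
# named 'DemoBanking' in its MODEL column.
out=${TMPDIR:-/tmp}/numFields.db2

cat > "$out" <<'EOF'
-- Method syntax: structured type instance, double dot, method name
SELECT IDMMX.DM_getClusMdlSpec(MODEL)..DM_getNumFields()
  FROM IDMMX.ClusterModels
 WHERE MODELNAME = 'DemoBanking';

-- Function syntax: the instance is passed as the first argument
SELECT IDMMX.DM_getNumFields(IDMMX.DM_getClusMdlSpec(MODEL))
  FROM IDMMX.ClusterModels
 WHERE MODELNAME = 'DemoBanking';
EOF

echo "Run with: db2 -stf $out"
```

Both statements are expected to return the same value; which form to use is a matter of readability.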

User-defined functions

IM Scoring provides scoring functions, also referred to as user-defined functions (UDFs), which enable you to:
- Import and export mining models, and access the properties of the models.
- Apply these models to data held in DB2 tables.
- Retrieve the results.

Function syntax

The function syntax is described in ’Function syntax’ in “User-defined methods” on page 10.

PMML: A markup language for data mining

PMML is a standard format for data mining models. Based on XML, the PMML format provides a standard that enables data mining models to be shared between the applications of different vendors. The intention is to provide a vendor-independent method for defining models. In this way, proprietary issues and incompatibilities are no longer a barrier to the exchange of models between applications.

You can find more information on PMML on the Web site of the Data Mining Group (DMG) at http://www.dmg.org.

Converting models

IM Scoring provides a model conversion facility, which converts mining models from IM for Data format to PMML 2.0 format. The model conversion facility respects the current server locale and writes the appropriate XML encoding into the PMML model.

Additionally, IM Scoring provides the features that are required to register the model conversion facility with IM for Data by using the client tool registration facility of IM for Data.

You can use the model conversion facility by selecting the PMML format in the Export dialog of the IM for Data GUI when you export the model.

If you import models created by IM for Data into DB2, you do not need to convert the models to PMML 2.0. The model import functions read models in PMML 1.1, PMML 2.0, or Intelligent Miner format. Importing V6 models works only in fenced mode; for further details, see “Importing models in unfenced mode” on page 169.


Online scoring with IM Scoring Java Beans

IM Scoring Java Beans can be used to score single or multiple data records using a specified mining model. IM Scoring Java Beans is designed to be used for applications where the online scoring of data records is the main task.

A possible application area of IM Scoring Java Beans might be the realization of an Internet-based call center scenario. In this scenario, the required business logic, in this case the scoring functions, runs on a Web or application server. Clients can connect to the server and send to it a data record that was specified by a call-center operator by means of a user interface on the client. The data record is scored on the server, and the result is passed back to the client in real time.

Figure 2 shows a simplified design, illustrating how such a scenario could be realized using IM Scoring Java Beans. Here, IM Scoring Java Beans is integrated into a J2EE implementation using, for example, servlets or Enterprise Java™ Beans.

Note: To get optimum performance throughput, you might decide to run each mining model in a separate process. In this case, you would pass only the new records to the appropriate scoring process. This results in a considerable performance improvement, because the model-loading step, which is very time-consuming, is done only once.

What is new in version 8.1

This section introduces you to the new features in IM Scoring.

Figure 2. Architecture sample to realize a call-center scenario


Ease of use

idmmkSQL
    This new command enables you to generate a sample SQL script from a PMML model. You can then use this sample script as a template to invoke IM Scoring on the model.

Improved samples
    The IM Scoring samples have been enhanced and reworked, and they demonstrate how to use the new DB2 built-in function REC2XML. This simplifies SQL statements and improves performance.

E-business enhancements

Java support for Realtime Scoring
    The new Java interface, IM Scoring Java Beans, enables you to integrate real-time scoring into e-business applications, for example, those used in CRM.

Functional enhancements

Model compression
    Models are now compressed when they are imported into the database. This results in reduced resource consumption (database size) and improved performance. Models that were imported by means of IM Scoring V7 can be compressed through the use of export and import functions. For details, see “Exporting and importing models with the use of compression” on page 168.

New methods to work with mining fields
    The new structured type DM_LogicalDataSpec contains information about the mining fields that are part of the input data used to apply models. This information includes the field name and field type definitions of the mining fields. A number of new methods are supported for DM_LogicalDataSpec: for details, see “Methods provided by IM Scoring” on page 77.

Additional functions
    Additional functions have been added to extract properties from a data mining model and from a scoring result.

Standards conformance

PMML 2.0 support
    The IM Scoring conversion utilities now generate PMML 2.0 models. For more information about PMML, see http://www.dmg.org.

    IM Scoring now accepts PMML 2.0 models in addition to the PMML 1.1 models generated by IM Scoring V7.1. For detailed information about how IM Scoring conforms to PMML, see Appendix G, “IM Scoring conformance to PMML” on page 203.


Platform support

There is now support for Windows XP.

Shared infrastructure with IM Modeling

IM Scoring shares common infrastructure like XML parsing, error handling, tracing, licensing, and diagnostics with IM Modeling. This causes some changes in administrative interfaces.

Installation directory
    IM Scoring uses a new default installation directory prefix, IMinerX, instead of IMinerSc as used in IM Scoring V7.1.

The utilities idmenabledb, idmdisabledb, and idmcheckdb
    The commands to enable and disable a database for IM Scoring have been improved, and are shared between IM Scoring and IM Modeling. The idmcheckdb utility is a new tool that checks the enablement status of a database.

Collecting diagnostic information
    The tracing infrastructure has been improved. New environment variables enable you to customize the degree of tracing information.

    A new tool, idmlevel, enables you to check which version of IM Scoring you are using.

License use management
    IM Scoring uses nodelock keys to check whether a valid license is available. The ’Try and Buy’ version installs a temporary key. This key allows you to use the product for a limited period of time in accordance with the EULA valid for the ’Try and Buy’ version. The new command idmlicm lets you check the license status.

Limitations

The following limitations exist in version 8.1.

- Importing models from IM for Data

  Models in Intelligent Miner format
      It is no longer possible to import these models in unfenced mode. To import models of this kind, do one of the following:
      - Enable the database in fenced mode.
      - Use the model conversion utility idmxmod to convert the model to PMML before importing it.

- Neural PMML models generated by IM Scoring V7.1

  You might have existing models that were generated using the neural kernels of IM for Data V6 or higher in an IM Scoring database. Models of this kind must be migrated by importing them again.

The cluster position in the PMML file
    The function DM_getClusterID returns the position of the cluster in the PMML file. This is different from the behavior in IM Scoring V7.1. For details, see “Using the function DM_getClusterID” on page 170.


Chapter 3. Data mining functions

This chapter provides a general introduction to the data mining functions that can be used with IM Scoring. The generation and application of mining models is described. Note that IM Scoring supports only the application of these models.

The mining functions are described in the following sections:
- “Classification”
- “Clustering”
- “Regression/Prediction” on page 18

Classification

Classification is the process of automatically creating a model of classes from a set of records that contain class labels. The classification technique analyzes records that are already known to belong to a certain class, and creates a profile for a member of that class from the common characteristics of the records. You can then use a data mining application tool to apply this model to new records, that is, records that have not yet been classified. This enables you to predict if the new records belong to that particular class.

When a model is applied, IM Scoring assigns a class label and a confidence value to each individual record being scored.

Clustering

The clustering technique consists of a range of algorithms that group data records on the basis of how similar they are. For example, a data record might consist of a description of customers. In this case, clustering would group similar customers together, and at the same time it would maximize the differences between the different customer groups formed in this way. The groups that are found are known as clusters. Each cluster tells a specific story about customer identity or behavior, for example, about their demographic background, or about their preferred products or product combinations. In this way, customers are placed in homogeneous groups whose members are very similar to each other.

When a model is applied, IM Scoring assigns a cluster ID, a cluster score, a quality value, and a confidence value to each individual record being scored. The cluster score, quality value, and confidence value are different measures that indicate how well the record fits into the assigned cluster.


Regression/Prediction

The purpose of predicting values is to discover the dependency and the variation of one field’s value on the values of the other fields within the same record. A model is generated that can predict a value for that particular field in a new record of the same form, based on other field values. For example, a retailer wants to use historical data to estimate the sales revenue for a new customer. A mining run on this historical data creates a model. This model can be used to predict the expected sales revenue for a new customer, based on the new customer’s data. The model might also show that, for some customers, incentive campaigns improve sales. In addition, it might reveal that frequent visits by sales representatives lead to a lower revenue if the customer is young.

When a model is applied, IM Scoring assigns a predicted value and, for an RBF model, a region ID to each individual record being scored.


Chapter 4. Getting started

The aim of this chapter is to get you up and running quickly in using IM Scoring.

- First, there is a quick-start guide. Here, you review the tasks that you need to complete to get started.

  See “Quick start”.

- This is followed by sections that help you to gain confidence in using the IM Scoring mining functions. These sections guide you through a tutorial of practice exercises on sample data. By using the data and scripts provided with the IM Scoring package and the instructions given in these sections, you can do the following:
  - Import and store a sample mining model
  - Apply it to sample data
  - Obtain results
  - Extract information from a model
  - Apply models created with IM Modeling
  - Use IM Scoring Java Beans to score records

  All the tasks in the practice exercises are completed by means of sample scripts. The scripts include standard SQL commands, such as INSERT, and scoring functions such as DM_impClusFile. The contents of the scripts are given in this chapter so that you can see how the SQL statements are structured. You can use these sample scripts as a basis for your own scripts.

  See “Sample components” on page 23 and “Completing the practice exercises” on page 25.

Quick start

This section guides you through the steps necessary to install and configure IM Scoring successfully. It gives you brief hints on what to do, and points you to the appropriate sections in this guide that describe each step in detail.

Some steps are mandatory, and some steps are optional.

Mandatory steps:

1. Installation
2. Configuration
3. Creating database objects


Optional steps:

1. Verifying the installation and configuration
2. Executing sample applications

If you have IM Scoring V7.1 installed and configured, first check any migration issues. For further information, see Appendix C, “Migration from IM Scoring V7.1” on page 167.

Installation

Install IM Scoring by using the usual installation tools. The IM Scoring CD-ROM contains subdirectories for each platform that is supported.

To install IM Scoring, insert the CD-ROM into your CD-ROM drive, and change to the appropriate subdirectory. For each platform, different setup programs (Windows), installp images (AIX), or installation scripts (SUN and Linux) are provided. These enable you to install the various components of IM Scoring (Scoring, Conversion, IM Scoring Java Beans).

For full instructions on installing all the components of IM Scoring, configuring the database management system, and uninstalling IM Scoring, see:
- Appendix A, “Installing IM Scoring” on page 145
- “Installing IM Scoring Java Beans” on page 164

The conversion component and the Scoring component need additional configuration steps before they are ready to use. For information about the mandatory steps needed to configure the conversion component, see “Configuring IM for Data to export PMML models”.

For information about the other mandatory steps that you must follow before you can use the Scoring component, see:
- “Configuring the database environment” on page 21
- “Creating database objects” on page 22

For information about the optional steps that you can perform for the Scoring component, see:
- “Verifying the installation and configuration” on page 22
- “Executing sample applications” on page 22

Configuring IM for Data to export PMML models

After you have installed the conversion component on the AIX or SUN Solaris platform, you need to register the conversion utilities. To do this:
1. Add the contents of the file idmcsctr.add to the idmcsctr.dat file of the IM for Data client
2. Add the contents of the file idmcsstr.add to the idmcsstr.dat file of the IM for Data server

On the Windows platform, these steps are done automatically during installation. They must be done manually only if you install IM for Data after you have installed the conversion component.

For more information, see “Enabling IM for Data to export PMML or XML models” on page 158 in Appendix A, “Installing IM Scoring”. The information there will also help you if you are running an IM for Data client in a language other than English.

Configuring the database environment

After you have installed the Scoring component, you need to configure your DB2 instance and the databases that you want to use with IM Scoring.

To configure the DB2 instance as a user with SYSADM authority:

- On UNIX® platforms, call the idminstfunc script. This is available in the bin directory of your IM Scoring installation.

- On all platforms, increase the database manager configuration parameter UDF_MEM_SZ. A recommended value is 60000, which is the highest possible.

  Syntax
      db2 update dbm cfg using UDF_MEM_SZ 60000

- On Windows platforms, increase the DB2 registry parameter DB2NTMEMSIZE to a value that matches your UDF_MEM_SZ value.

  Syntax
      db2set DB2NTMEMSIZE=APLD:240000000

- Restart the DB2 instance.

- For further information, see:
  - For UNIX systems: “Enabling the DB2 instance on UNIX systems” on page 157
  - For Windows systems: “Enabling the DB2 instance on Windows systems” on page 158
  - “The idminstfunc command” on page 135
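The instance-level steps above can be collected into one place. The following is a minimal sketch, not a supplied tool: the function name instance_setup_cmds is hypothetical, it only prints the commands for review, and it assumes that idminstfunc is on the PATH and needs no arguments (check “The idminstfunc command” before relying on that). The restart is shown as db2stop followed by db2start.

```shell
#!/bin/sh
# Print the instance-level commands for a platform; nothing is executed here.
# Assumption: idminstfunc is on the PATH (it lives in the IM Scoring bin directory).
instance_setup_cmds() {
    platform=$1                      # "unix" or "windows"
    if [ "$platform" = "unix" ]; then
        echo "idminstfunc"
    fi
    echo "db2 update dbm cfg using UDF_MEM_SZ 60000"
    if [ "$platform" = "windows" ]; then
        echo "db2set DB2NTMEMSIZE=APLD:240000000"
    fi
    # Restart the instance so that the new settings take effect
    echo "db2stop"
    echo "db2start"
}

instance_setup_cmds unix
```

On a Windows system, the printed commands would be entered in a DB2 command window rather than in a UNIX shell.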

To configure the databases as a user with SYSADM or DBADM authority:

1. If you do not have an existing database, create a database by using the command DB2 CREATE DATABASE <DBNAME>.

2. Increase the database transaction log size LOGFILSIZ. A recommended value is 2000.

   Syntax
       db2 update db cfg for <database name> using logfilsiz 2000

3. Increase the database parameter APP_CTL_HEAP_SZ. A recommended value is 10000.

   Syntax
       db2 update db cfg for <database name> using APP_CTL_HEAP_SZ 10000

4. Increase the database parameter APPLHEAPSZ. A recommended value is 1000.

   Syntax
       db2 update db cfg for <database name> using APPLHEAPSZ 1000
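When several databases need the same settings, the three update commands can be generated from one list of parameters. This is only a sketch: db_setup_cmds and the database name BANK are hypothetical, and the values are the recommended ones from the steps above.

```shell
#!/bin/sh
# Print one "db2 update db cfg" command per parameter for the given database.
# Nothing is executed; pipe the output to sh to apply the settings.
db_setup_cmds() {
    db=$1
    for setting in LOGFILSIZ=2000 APP_CTL_HEAP_SZ=10000 APPLHEAPSZ=1000; do
        name=${setting%%=*}          # text before the "="
        value=${setting#*=}          # text after the "="
        echo "db2 update db cfg for $db using $name $value"
    done
}

db_setup_cmds BANK    # BANK is a hypothetical database name
```

Printing first and executing second makes it easy to review the commands before they change the database configuration.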

Creating database objects

The UDTs, UDFs, and UDMs provided with IM Scoring must be created in the databases that you want to use with IM Scoring. To do this, call the idmenabledb command, which is available in the bin directory of your IM Scoring installation. A mandatory parameter to the command is the database name. Some optional parameters are available. If you want to execute the sample applications provided with IM Scoring, call the command with the fenced and the tables options.

Syntax
    idmenabledb <database name> fenced tables

For more information and a detailed description of idmenabledb, see:
- “Enabling databases” on page 41
- “The idmenabledb command” on page 133

Verifying the installation and configuration

You can quickly verify your installation and configuration, and make sure that the appropriate database objects have been created. To do so, follow these steps:
1. Call the command idmcheckdb <database name>, which is available in the bin directory of your installation. The command returns the enablement status of the database.
2. Connect to a database that you have enabled.
3. Use the following command:

       db2 "values( IDMMX.DM_applData('Test',4))"

4. The command must return without error. If you get any error messages, check your installation and configuration for completeness.
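The verification steps above can be strung together so that the first failing command stops the sequence. The sketch below is an illustration under assumptions: the function name verify_scoring_db, the RUN switch, and the database name BANK are hypothetical. By default each command is only echoed (a dry run), so the shell quoting around the values clause is not shown in the output.

```shell
#!/bin/sh
# Dry run by default: each command is echoed instead of executed.
# Start the script with RUN set to the empty string to execute for real.
RUN=${RUN-echo}

verify_scoring_db() {
    db=$1
    $RUN idmcheckdb "$db"     || return 1   # step 1: enablement status
    $RUN db2 connect to "$db" || return 1   # step 2: connect
    # step 3: succeeds only if the IM Scoring objects exist in the database
    $RUN db2 "values( IDMMX.DM_applData('Test',4))" || return 1
    echo "verification commands completed for $db"
}

verify_scoring_db BANK    # BANK is a hypothetical database name
```

To execute the commands, start the script with RUN empty, for example `RUN= sh verify.sh`, where verify.sh is whatever name the script was saved under.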

Executing sample applications

IM Scoring provides a set of samples to help you to get familiar with the UDFs and UDTs. For descriptions of the samples and instructions on how to use them as practice exercises, see:
- “Sample components”
- “Completing the practice exercises” on page 25

Generating SQL scripts from your own mining models

If you already have PMML models available as flat files, you can generate SQL scripts from them by using the idmmkSQL tool. These scripts contain template SQL statements that import and apply the model. Each script contains placeholders that you replace with the names of concrete database objects to obtain an executable SQL script.

For more information, see:
- “Generating SQL statements from models” on page 44
- “The idmmkSQL command” on page 136

You can find a practice exercise in using the idmmkSQL tool at “Using idmmkSQL to work with your own mining models” on page 38.

Sample components

The IM Scoring package includes sample components consisting of a series of practice exercises in using IM Scoring. This tutorial material enables you to:
- Use the Clustering mining function of IM Scoring

  For an introduction to the Clustering mining function, see Chapter 3, “Data mining functions” on page 17.

- Score records using IM Scoring Java Beans

  For an introduction to IM Scoring Java Beans, see “Online scoring with IM Scoring Java Beans” on page 12.

The IM Scoring sample components reside in a samples directory. This directory contains the mining model, data, and scripts that you require to complete the exercises in this chapter.
- On the AIX platform, the samples directory is:

      /usr/lpp/IMinerX/samples/ScoringDB2

- On the Linux and Sun Solaris platforms, the samples directory is:

      /opt/IMinerX/samples/ScoringDB2

- On the Windows platform, the samples directory is:

      <install path>\samples\ScoringDB2

  where <install path> is the directory where IM Scoring is installed. You can also use the shortcut IBM DB2 Intelligent Miner Scoring 8.1 -> Scoring - Samples in the program folder.
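On UNIX systems, a script can pick the samples directory from the platform name. The following sketch assumes the default locations listed above and that uname -s reports AIX, Linux, or SunOS; the function name samples_dir is hypothetical, and an installation in a non-default location would need a different path.

```shell
#!/bin/sh
# Map a platform name (as printed by "uname -s") to the default samples directory.
samples_dir() {
    case "$1" in
        AIX)         echo /usr/lpp/IMinerX/samples/ScoringDB2 ;;
        Linux|SunOS) echo /opt/IMinerX/samples/ScoringDB2 ;;
        *)           echo "unknown platform: $1" >&2; return 1 ;;
    esac
}

# Show the directory for the current machine, if it is a known platform
if dir=$(samples_dir "$(uname -s)" 2>/dev/null); then
    echo "samples directory: $dir"
fi
```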

The IM Scoring Java Beans examples are available in samples/ScoringBean.


Table 4 and Table 5 on page 25 list the files that are included in the samples directory and explain the purpose of each.

Table 4. Sample components for the Clustering mining function of IM Scoring

Sample component
    Description

clusDemoBanking.dat
    An exported Demographic Clustering model. The model was built from data for a bank’s customers who have a particular type of account. Customers are grouped according to similarities of age, income, number of siblings, gender, and account type.

bankingScoring.data
    A flat file containing records relating to the customers of a bank. This is the data to which you will apply the model.

bankingImport.db2
    A script that creates the DB2 table BANKING_SCORING, imports the file bankingScoring.data, and inserts the data into the new table.

bankingInsert.db2
    A script that imports the model, which is stored in the file clusDemoBanking.dat, and then inserts the model into the table IDMMX.ClusterModels.

bankingApplyTable1.db2
    A script that:
    1. Creates a results table
    2. Applies the imported Clustering model to the specified data from the table banking
    3. Stores the calculated results in the table
    4. Obtains results values from the table

bankingApplyTable2.db2
    A script that uses nested calls to DM_applData instead of calling the REC2XML function for the purpose of applying the imported Clustering model to the specified data from the table banking. The script:
    1. Creates a results table
    2. Applies the imported Clustering model to the specified data from the table banking
    3. Stores the calculated results in the table
    4. Obtains results values from the table

bankingApplyView.db2
    A script that applies the imported Clustering model to the specified data from the table BANKING_SCORING. The script then obtains values from the calculated results using a common table expression.


Table 5. Sample components for IM Scoring Java Beans

Sample component
    Description

93er_cars.pmml
    A Polynomial Regression model

Sample93erCars.java
    A sample Java program

readme.txt
    A README file

Note: The script bankingInsert.db2 uses the table IDMMX.ClusterModels. This is one of the sample tables delivered with IM Scoring. Before you perform the tasks described in this chapter, ensure that you have enabled the database with the tables option.

For instructions on installing the sample tables, see “The idmenabledb command” on page 133.

Completing the practice exercises

Before you can complete the practice exercises, you must install IM Scoring and configure the system environment. For guidance on the procedure of installing and configuring IM Scoring, see “Quick start” on page 19.

The tutorial consists of the following tasks:
- Creating a DB2 table and importing data into it
- Importing a mining model into a DB2 table
- Applying the model to data and obtaining results values, without storing the results in a DB2 table
- Applying the model to data, storing the results in a DB2 table, and obtaining results values from the table
- Using IM Scoring Java Beans to score records

Note: Before starting the tasks, you must be connected to a database that is enabled for the use of IM Scoring.

To run the scripts, you must have SELECT and INSERT privileges on the IDMMX.ClusterModels table.

Go to the directory where the samples are installed. See “Sample components” on page 23 for information on the directories where the sample files are stored.

Creating a table and importing data

In this exercise, you create a table and import the banking data. You will later apply the mining model to this data.

First, you must connect to the database. To do this, use the following command:

    db2 connect to <dbname>

To create a table and import the sample data contained in the file bankingScoring.data, run the sample script bankingImport.db2 by using the following command:

    db2 -stf bankingImport.db2

Contents of the script bankingImport.db2

    CREATE TABLE BANKING_SCORING (
        TYPE     CHAR(7),
        GENDER   CHAR(6),
        AGE      DOUBLE,
        PRODUCT  CHAR(1),
        SIBLINGS DOUBLE,
        INCOME   DOUBLE
    );

    IMPORT FROM bankingScoring.data OF DEL
        INSERT INTO BANKING_SCORING (
            TYPE, GENDER, AGE, PRODUCT, SIBLINGS, INCOME
        );

In the first part of the script, the DB2 table BANKING_SCORING is created, its columns are specified, and data types are defined for each column. In the second part, the flat file bankingScoring.data is imported and inserted into the new table. Data from the flat file populates the columns, which are specified by their names.
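Because the import fills six named columns, it can be useful to confirm beforehand that every record in a delimited file has six fields. The sketch below is a rough pre-check, not part of the samples: the function name check_del_fields is hypothetical, and it assumes that the comma is the column delimiter and that no value contains an embedded comma.

```shell
#!/bin/sh
# Report records whose field count differs from the expected number of columns.
# Assumptions: comma-delimited file, no commas inside quoted values.
check_del_fields() {
    file=$1
    expected=$2
    awk -F',' -v n="$expected" '
        NF != n { printf "line %d: %d fields\n", NR, NF; bad = 1 }
        END     { exit (bad + 0) }
    ' "$file"
}

# Check the sample data only if it is present in the current directory
if [ -f bankingScoring.data ]; then
    check_del_fields bankingScoring.data 6
fi
```

The function returns a non-zero status when any record is reported, so it can guard the db2 -stf call in a larger script.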

Importing a mining model

In this exercise, you import the sample mining model into the DB2 database and store it in a DB2 table, which has a column configured for mining models.

To import the sample mining model, which is stored in the file clusDemoBanking.dat, run the script bankingInsert.db2 by using the following command:

    db2 -stf bankingInsert.db2

Contents of the script bankingInsert.db2 for AIX


    insert into IDMMX.ClusterModels values(
        'DemoBanking',
        IDMMX.DM_impClusFile('/usr/lpp/IMinerX/samples/ScoringDB2/clusDemoBanking.dat')
    );

This script uses the function DM_impClusFile, which is specific to IM Scoring, to import the mining model contained in the file clusDemoBanking.dat. The SQL INSERT command inserts the mining model into a column in the table ClusterModels, and sets the name of the model to DemoBanking. The table IDMMX.ClusterModels is configured for the data type DM_ClusteringModel.

Note: On Windows, the absolute path is automatically modified at installation time to be consistent with the chosen install path.
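Because the INSERT statement embeds an absolute path, a model stored elsewhere needs its own statement. The following sketch generates one from a model name and a file path; make_insert_sql is a hypothetical helper, and the path shown is the default Linux and Sun Solaris samples directory.

```shell
#!/bin/sh
# Print an INSERT statement that imports a clustering model file under a given name.
# Assumption: neither argument contains a single quote.
make_insert_sql() {
    model_name=$1
    model_file=$2
    printf "insert into IDMMX.ClusterModels values( '%s', IDMMX.DM_impClusFile('%s'));\n" \
        "$model_name" "$model_file"
}

make_insert_sql DemoBanking /opt/IMinerX/samples/ScoringDB2/clusDemoBanking.dat
```

Redirecting the output to a file gives a script that can be run with db2 -stf in the same way as bankingInsert.db2.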

Applying a model and getting results values

You can use different scripts to apply mining models and obtain the results of applying the mining model. In the following exercises, a Demographic Clustering model is used.

Using the script 'bankingApplyView.db2'

In this exercise, you apply the Demographic Clustering model to the banking data and get the values of the calculated results.

To apply the sample mining model and obtain the results of applying the model, run the script bankingApplyView.db2 by using the following command:

    db2 -stf bankingApplyView.db2

Contents of the script bankingApplyView.db2

    WITH clusterView( clusterResult ) AS
    (
        SELECT IDMMX.DM_applyClusModel( C.MODEL,
                   IDMMX.DM_impApplData(
                       rec2xml( 1.0, 'COLATTVAL', '',
                                B.TYPE, B.AGE, B.SIBLINGS, B.INCOME ) ) )
        FROM BANKING_SCORING B, IDMMX.ClusterModels C
        WHERE C.MODELNAME='DemoBanking'
    )
    SELECT IDMMX.DM_getClusterID( clusterResult ),
           IDMMX.DM_getClusScore( clusterResult )
    FROM clusterView ;

This script defines a common table expression, clusterView(clusterResult), to hold the results of applying a model. The script then applies the DemoBanking model to selected data from the banking table by using the DM_applyClusModel function. The data values are obtained by means of a call to the DB2 function REC2XML.

Note: The column names specified in the call to REC2XML must exactly match the names of fields that are used in the mining model. For information on how to retrieve the names of the fields in a mining model, see “Querying model field names” on page 49.

Finally, the script obtains the cluster ID and the Clustering score from the clusterResult column of clusterView by means of the functions DM_getClusterID and DM_getClusScore.

Using the script 'bankingApplyTable1.db2'

In this exercise, you:
1. Apply the Demographic Clustering model to the banking data.
2. Store the calculated results in a DB2 table.
3. Obtain results values for any customer who is older than 50.

To apply the sample mining model, store results, and obtain results values, run the script bankingApplyTable1.db2 by using the following command:

    db2 -stf bankingApplyTable1.db2

Contents of the script bankingApplyTable1.db2

    CREATE TABLE BANKING_APPLY (
        TYPE           CHAR(7),
        GENDER         CHAR(6),
        AGE            DOUBLE,
        PRODUCT        CHAR(1),
        SIBLINGS       DOUBLE,
        INCOME         DOUBLE,
        CLUSTER_RESULT IDMMX.DM_ClusResult
    );

    INSERT INTO BANKING_APPLY
    SELECT B.TYPE, B.GENDER, B.AGE, B.PRODUCT, B.SIBLINGS, B.INCOME,
           IDMMX.DM_applyClusModel( C.MODEL,
               IDMMX.DM_impApplData(
                   rec2xml( 1.0, 'COLATTVAL', '',
                            B.TYPE, B.AGE, B.SIBLINGS, B.INCOME ) ) )
    FROM BANKING_SCORING B, IDMMX.ClusterModels C
    WHERE C.MODELNAME='DemoBanking';

    SELECT AGE, IDMMX.DM_getClusterID( CLUSTER_RESULT ),
           IDMMX.DM_getClusScore( CLUSTER_RESULT )
    FROM BANKING_APPLY WHERE AGE > 50;

    DROP TABLE BANKING_APPLY;

This script creates a DB2 table for the mining results by defining the names and the data types of the columns. The last column, CLUSTER_RESULT, is designated for the results that are calculated. The column is configured for the data type DM_ClusResult. The script then applies the DemoBanking model to selected data from the banking table by using the DM_applyClusModel function. Finally, it obtains the cluster ID and the Clustering score from the CLUSTER_RESULT column of the new table by using the functions DM_getClusterID and DM_getClusScore.

You can also apply models and compute cluster IDs in a single SQL query. The following example shows an SQL query of this kind:

    select b.type, b.gender, b.age, b.product, b.siblings, b.income,
           IDMMX.DM_getClusterID(
               IDMMX.DM_applyClusModel( c.model,
                   IDMMX.DM_impApplData(
                       REC2XML( 1, 'COLATTVAL', '', b.type,
                                b.age, b.siblings, b.income ) ) ) )
    from banking b, IDMMX.ClusterModels c
    where c.modelname = 'DemoBanking';

Tip: You can use the application functions to define SQL views that are similar to the output tables created by IM for Data Version 6. The SQL statement would look similar to the template in the following example:

CREATE VIEW ApplyOut ( ID, NAME, AGE, ClusterID ) AS
SELECT I.ID, I.NAME, I.AGE,
       IDMMX.DM_getClusterID(
         IDMMX.DM_applyClusModel( c.model,
           IDMMX.DM_impApplData( REC2XML( 1, 'COLATTVAL', '', ... ) ) ) )
FROM InputTable I, IDMMX.ClusterModels C
WHERE C.modelName = .....

Afterwards, you can access the SQL view by using any SELECT statement, such as the following:

SELECT ID, NAME, AGE, ClusterID
FROM ApplyOut
WHERE ClusterID = 3

Using the script 'bankingApplyTable2.db2'
The script bankingApplyTable2.db2 has the same functionality as the script bankingApplyTable1.db2, but it uses nested calls to DM_applData instead of a call to REC2XML. For information on the advantages and possible inconveniences of using DM_applData, see “Specifying data by means of DM_applData” on page 52. Alternatively, you can use a CONCAT expression. For further information about this possibility, see “Specifying data by means of CONCAT” on page 53.

To apply the sample mining model, store results, and obtain results values, run the script bankingApplyTable2.db2 by using the following command:
db2 -stf bankingApplyTable2.db2

Contents of the script bankingApplyTable2.db2

CREATE TABLE BANKING_APPLY (
  TYPE CHAR(7),
  GENDER CHAR(6),
  AGE DOUBLE,
  PRODUCT CHAR(1),
  SIBLINGS DOUBLE,
  INCOME DOUBLE,
  CLUSTER_RESULT IDMMX.DM_ClusResult
);

INSERT INTO BANKING_APPLY
SELECT B.TYPE, B.GENDER, B.AGE, B.PRODUCT, B.SIBLINGS, B.INCOME,
       IDMMX.DM_applyClusModel( c.model,
         IDMMX.DM_applData(
           IDMMX.DM_applData(
             IDMMX.DM_applData(
               IDMMX.DM_applData( 'TYPE', b.type ),
               'AGE', b.age ),
             'SIBLINGS', b.siblings ),
           'INCOME', b.income ) )
FROM BANKING_SCORING B, IDMMX.ClusterModels C
WHERE C.MODELNAME='DemoBanking';

SELECT AGE,
       IDMMX.DM_getClusterID( CLUSTER_RESULT ),
       IDMMX.DM_getClusScore( CLUSTER_RESULT )
FROM BANKING_APPLY
WHERE AGE > 50;

DROP TABLE BANKING_APPLY;

This script creates a DB2 table for the mining results by defining the names and the data types of the columns. The last column, CLUSTER_RESULT, is designated for the calculated results. It is configured for the data type DM_ClusResult. The script then applies the DemoBanking model to selected columns of the BANKING_SCORING table by using the function DM_applyClusModel. Finally, the script obtains the cluster ID and the clustering score from the CLUSTER_RESULT column of the new table. To do this, it uses the functions DM_getClusterID and DM_getClusScore.
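Writing the nested DM_applData calls by hand becomes error-prone as the number of fields grows, because each additional field wraps the whole expression built so far. The following Java sketch generates the nested expression from an ordered list of fields. It is only an illustration and not part of IM Scoring: the class name NestedApplData and its build method are hypothetical, and the sketch assumes only the two call shapes used in the script above (name and value for the innermost call; previous expression, name, and value for every outer call).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds a nested IDMMX.DM_applData(...) expression. */
class NestedApplData {

    /**
     * @param fields mining field name mapped to SQL column expression,
     *               in the order in which the calls should be nested
     * @return the nested SQL expression as text
     */
    static String build(Map<String, String> fields) {
        if (fields.isEmpty()) {
            throw new IllegalArgumentException("at least one field is required");
        }
        String expr = null;
        for (Map.Entry<String, String> f : fields.entrySet()) {
            // Quote the field name as an SQL string literal.
            String name = "'" + f.getKey().replace("'", "''") + "'";
            String pair = name + ", " + f.getValue();
            // The innermost call takes only (name, value); every later call
            // wraps the expression built so far and appends one more pair.
            expr = (expr == null)
                    ? "IDMMX.DM_applData(" + pair + ")"
                    : "IDMMX.DM_applData(" + expr + ", " + pair + ")";
        }
        return expr;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("TYPE", "B.TYPE");
        fields.put("AGE", "B.AGE");
        fields.put("SIBLINGS", "B.SIBLINGS");
        fields.put("INCOME", "B.INCOME");
        System.out.println(build(fields));
    }
}
```

A LinkedHashMap is used so that the nesting order matches the order in which the fields were added.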

You can also apply models and compute cluster IDs in a single SQL query. The following example shows an SQL query of this kind:

select b.type, b.age, b.product, b.siblings, b.income,
       IDMMX.DM_getClusterID(
         IDMMX.DM_applyClusModel( c.model,
           IDMMX.DM_applData(
             IDMMX.DM_applData(
               IDMMX.DM_applData(
                 IDMMX.DM_applData(
                   IDMMX.DM_applData( 'TYPE', b.type ),
                   'AGE', b.age ),
                 'PRODUCT', b.product ),
               'SIBLINGS', b.siblings ),
             'INCOME', b.income ) ) )
from banking b, IDMMX.ClusterModels c
where c.modelname='DemoBanking';

Extracting information from a model
In this exercise, you extract information from a model.

The model from which you extract information is the one that you inserted into the database as part of the exercise in the section “Importing a mining model” on page 26.

The information that you extract is:
- The name of the model
- The number of clusters
- The names of the mining fields

To extract the information, run the script bankingExtract.db2 by using the following command:
db2 -tf bankingExtract.db2

Contents of the script bankingExtract.db2

WITH MODELCONTENT( CLUSMODELNAME, NOCLUSTERS, MODELFIELDS ) AS
(
  SELECT IDMMX.DM_getClusMdlName( MODEL ),
         IDMMX.DM_getNumClusters( MODEL ),
         IDMMX.DM_getClusMdlSpec( MODEL )
  FROM IDMMX.ClusterModels
  WHERE MODELNAME='DemoBanking'
)
SELECT CLUSMODELNAME, NOCLUSTERS,
       MODELFIELDS..DM_getFldName(1) AS FIELDNAME1,
       MODELFIELDS..DM_getFldName(2) AS FIELDNAME2,
       MODELFIELDS..DM_getFldName(3) AS FIELDNAME3,
       MODELFIELDS..DM_getFldName(4) AS FIELDNAME4
FROM MODELCONTENT;

Applying models created with IM Modeling
In this exercise, you apply models created with IM Modeling.

A prerequisite for executing these samples is that you have installed and configured IM Modeling and executed the banking samples provided with IM Modeling. Executing the IM Modeling samples before executing the sample files provided with IM Scoring has a further advantage. It helps you to understand which UDFs and UDMs belong to IM Modeling, which belong to IM Scoring, and which belong to both.

To apply the models, run the scripts bankingApplyModeling1.db2 and bankingApplyModeling2.db2 by using the following commands:

db2 -tf bankingApplyModeling1.db2

db2 -tf bankingApplyModeling2.db2

In the first set of statements in each script, information is extracted from the model. The second set of statements in each script applies the model to new data. The difference between the two scripts is that the first one uses rec2xml to build the record and the second uses DM_applData.

Contents of the script bankingApplyModeling1.db2

WITH MODELCONTENT( CLUSMODELNAME, NOCLUSTERS, MODELFIELDS ) AS
(
  SELECT IDMMX.DM_getClusMdlName( MODEL ),
         IDMMX.DM_getNumClusters( MODEL ),
         IDMMX.DM_getClusMdlSpec( MODEL )
  FROM IDMMX.ClusterModels
  WHERE MODELNAME='BankingClusColumnModel'
)
SELECT CLUSMODELNAME, NOCLUSTERS,
       MODELFIELDS..DM_getFldName(1) AS FIELDNAME1,
       MODELFIELDS..DM_getFldName(2) AS FIELDNAME2,
       MODELFIELDS..DM_getFldName(3) AS FIELDNAME3,
       MODELFIELDS..DM_getFldName(4) AS FIELDNAME4
FROM MODELCONTENT;

WITH clusterView( clusterResult ) AS
(
  SELECT IDMMX.DM_applyClusModel( C.MODEL,
           IDMMX.DM_impApplData(
             rec2xml( 1, 'COLATTVAL', '', B.TYPE, B.AGE, B.SIBLINGS, B.INCOME ) ) )
  FROM BANKING_SCORING B, IDMMX.ClusterModels C
  WHERE C.MODELNAME='BankingClusColumnModel'
)
SELECT IDMMX.DM_getClusterID( clusterResult ),
       IDMMX.DM_getClusScore( clusterResult )
FROM clusterView;

Contents of the script bankingApplyModeling2.db2

WITH MODELCONTENT( CLUSMODELNAME, NOCLUSTERS, MODELFIELDS ) AS
(
  SELECT IDMMX.DM_getClusMdlName( MODEL ),
         IDMMX.DM_getNumClusters( MODEL ),
         IDMMX.DM_getClusMdlSpec( MODEL )
  FROM IDMMX.ClusterModels
  WHERE MODELNAME='BankingClusAliasModel'
)
SELECT CLUSMODELNAME, NOCLUSTERS,
       MODELFIELDS..DM_getFldName(1) AS FIELDNAME1,
       MODELFIELDS..DM_getFldName(2) AS FIELDNAME2,
       MODELFIELDS..DM_getFldName(3) AS FIELDNAME3,
       MODELFIELDS..DM_getFldName(4) AS FIELDNAME4
FROM MODELCONTENT;

WITH clusterView( clusterResult ) AS
(
  SELECT IDMMX.DM_applyClusModel( C.MODEL,
           IDMMX.DM_applData(
             IDMMX.DM_applData(
               IDMMX.DM_applData(
                 IDMMX.DM_applData( 'N_TYPE', B.TYPE ),
                 'N_AGE', B.AGE ),
               'N_SIB', B.SIBLINGS ),
             'N_INC', B.INCOME ) )
  FROM BANKING_SCORING B, IDMMX.ClusterModels C
  WHERE C.MODELNAME='BankingClusAliasModel'
)
SELECT IDMMX.DM_getClusterID( clusterResult ),
       IDMMX.DM_getClusScore( clusterResult )
FROM clusterView;

Using IM Scoring Java Beans to score records
In this exercise, you use IM Scoring Java Beans to score records. To do this, you use the sample program Sample93erCars.java, which you can find in the DB2 IM Scoring installation directory under samples/ScoringBean.

In the example, the minimum price of a car is predicted, given the basic characteristics for a car. The data used to train and generate the mining model contained a large number of fields, including:

Horsepower

Engine Size

City MPG

Highway MPG

Passenger capacity

Weight (pounds)

The training data also contained the actual Minimum Price (in $1000), which was used as the predicted field when the mining model was generated. When IM Scoring Java Beans is used with this model, the scorer predicts the minimum car price for new, previously unseen data.

First, IM Scoring Java Beans is set up to be used for scoring runs. The section of code that follows shows the source code that is used to do this. Here, a constructor is provided that takes as its input parameter the name of the file in which the mining model is stored.

In the example, the model file 93er_cars.pmml is located in the same directory as the sample program. The constructor calls the method initModel(String modelFile), where the instance variable scorer is initialized with the new mining model. This initialization can be done by means of the constructor of the RecordScorer class, or by using the setModel(String modelFile) method. This operation can take some time, because the mining model is loaded into memory, parsed, and interpreted for scoring. When this is complete, the initialized scorer is prepared for scoring.

Source code: Setting up the RecordScorer

public Sample93erCars( String modelFileName ) {
    initModel( modelFileName );
}

/**
 * Initializes <code>scorer</code> with the specified
 * mining model, i.e. sets and loads the mining model.
 */
public void initModel( String modelFileName ) {
    try {
        // Sets the new model file and loads the model in
        // preparation for the scoring runs that will follow.
        if ( scorer == null ) {
            scorer = new RecordScorer( modelFileName );
        } else {
            scorer.setModelFile( modelFileName );
        }
    } catch ( ModelException e ) {
        e.printStackTrace();
    }
}

When the setup of RecordScorer is complete, it can be used immediately for scoring on a record-by-record basis and for the reading of results. The source code given below in 'Source code: Applying scoring on a record-by-record basis' shows how this works.

The doScoring(Map record) method gets as its input the record that you want to score. The call to scorer.score(record) computes the scoring result. The result is stored as an instance variable in the scorer object, and can be accessed now by means of the getPredictedValue() method. In order to keep the scoring API as simple and small as possible, there are no extra result objects provided for each of the model types. Instead, the RecordScorer provides a set of methods that are used to access the computed result fields. If a method is called that does not suit the actual model type, a ResultException is thrown. A call to getClusterID(), for example, instead of to getPredictedValue() in the code given in 'Source code: Applying scoring on a record-by-record basis' would result in such an exception.

Source code: Applying scoring on a record-by-record basis

/**
 * Applies scoring: Gets a record as input, applies scoring
 * and displays the output.
 * @param record the record used for scoring
 */
public void doScoring( Map record ) {
    if ( scorer != null ) {
        try {
            // do scoring now
            scorer.score( record );

            // Before reading the results, validate that the
            // model type is set to regression.
            if ( scorer.getModelType() == Scorer.REGRESSION_TYPE ) {
                double predictedValue = scorer.getPredictedValue();
                displayPredictionResult( record, predictedValue,
                                         "minimum car price" );
            }
        } catch ( Exception e ) {
            // an error occurred while scoring the record
            e.printStackTrace();
        }
    }
}
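The design in which a single scorer object exposes accessors for every kind of result, and rejects the ones that do not suit the loaded model, can be illustrated without the IM Scoring classes. The following sketch is a stand-in written for this illustration only: SketchScorer, ModelKind, and WrongResultTypeException are hypothetical names, and nothing here reproduces the real RecordScorer or ResultException signatures.

```java
/** Hypothetical stand-in; this is not the IM Scoring Java Beans API. */
class SketchScorer {

    enum ModelKind { CLUSTERING, REGRESSION }

    /** Hypothetical counterpart of the exception described in the text. */
    static class WrongResultTypeException extends RuntimeException {
        WrongResultTypeException(String message) {
            super(message);
        }
    }

    private final ModelKind kind;
    private final double predictedValue;
    private final int clusterId;

    private SketchScorer(ModelKind kind, double predictedValue, int clusterId) {
        this.kind = kind;
        this.predictedValue = predictedValue;
        this.clusterId = clusterId;
    }

    static SketchScorer regressionResult(double predictedValue) {
        return new SketchScorer(ModelKind.REGRESSION, predictedValue, -1);
    }

    static SketchScorer clusteringResult(int clusterId) {
        return new SketchScorer(ModelKind.CLUSTERING, Double.NaN, clusterId);
    }

    ModelKind getModelKind() {
        return kind;
    }

    /** Valid only for regression results. */
    double getPredictedValue() {
        if (kind != ModelKind.REGRESSION) {
            throw new WrongResultTypeException("no predicted value for " + kind);
        }
        return predictedValue;
    }

    /** Valid only for clustering results. */
    int getClusterID() {
        if (kind != ModelKind.CLUSTERING) {
            throw new WrongResultTypeException("no cluster ID for " + kind);
        }
        return clusterId;
    }

    public static void main(String[] args) {
        SketchScorer scorer = regressionResult(30.45);
        // Check the model kind first, as doScoring does, to avoid the exception.
        if (scorer.getModelKind() == ModelKind.REGRESSION) {
            System.out.println(scorer.getPredictedValue());
        }
    }
}
```

The trade-off is the one the text names: the API stays small because there is no result class per model type, at the cost of a runtime check that callers are expected to anticipate.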

The code given in 'Source code: Accessing model metadata' shows how access is gained to the metadata of the mining model that is used. For a specified model, it is possible to retrieve the active mining fields that are used, as well as the mining types of these fields. A field that is used by the mining model for the computation of the scoring result is known as an active mining field. In the method displayActiveFields(), the active fields and their related mining types are displayed.

Source code: Accessing model metadata

/**
 * Displays the active fields of the mining model that was used.
 * The active fields are the ones that are used by the mining
 * model for scoring.
 */
public void displayActiveFields() {
    if ( scorer != null ) {
        String[] activeFields = scorer.getFieldNames();
        if ( activeFields != null ) {
            System.out.println( "\nActive Fields: " );
            for ( int i=0; i<activeFields.length; i++ ) {
                System.out.print( "Field Name: " + activeFields[i] );
                if ( scorer.isCategoricalField( activeFields[i] ) ) {
                    System.out.println( ", Mining Type: categorical" );
                } else {
                    System.out.println( ", Mining Type: numerical" );
                }
            }
        }
    }
}

The code given in 'Source code: The main program' shows the main program. Here, the following scenario is demonstrated:
- A customer is interested in buying a car, and asks a car vendor for an estimated minimum price for his or her dream car.
- The customer incrementally adds the characteristics that the car should have, and wants to know the estimated price for these additional characteristics. The customer starts by asking for the basic features that the car should have. Other specifications follow: the outside dimensions of the car, the engine characteristics, and the driving behavior. The customer finally gives the specifications for the inside of the car.

The output of the program, given in 'Results output for Sample93erCars', demonstrates how the price varies with the new features that the car should have. As can be seen in this example, a record is simply realized as a java.util.Map, where field names are mapped to their actual values.

Source code: The main program

public static void main( String[] args ) {

    File file = new File( "93er_cars.pmml" );
    Sample93erCars carPricePredictor =
        new Sample93erCars( file.getAbsolutePath() );

    HashMap record = new HashMap();

    // A customer starts with the specification of some basic features
    // that the car should have. The engine should be powerful, with
    // 200 horsepower. The car should have good safety features,
    // so that 2 airbags should be available.
    // As well, it should use only a moderate amount of gas when
    // being driven in the city.
    // The estimated minimum price is predicted.
    record.put( "Horsepower", new Integer(200) );
    record.put( "Air Bags standard", new Integer(2) );
    record.put( "City MPG", new Integer(17) );
    carPricePredictor.doScoring( record );

    // After hearing the estimated price, the customer gets more
    // specific, and asks for some more features.
    // The car should not be longer than 185 inches, and should be
    // at least 68 inches wide.
    record.put( "Width (inches)", new Integer(68) );
    record.put( "Length (inches)", new Integer(185) );
    carPricePredictor.doScoring( record );

    // The customer now details the requirements for the engine and the
    // driving behavior, and wants to know the new estimated price.
    record.put( "Number of cylinders", new Integer(8) );
    record.put( "RPM", new Integer(5500) );
    record.put( "U-turn space (feet)", new Integer(40) );
    carPricePredictor.doScoring( record );

    // Finally, the customer adds some details about how the inside of
    // the car should look.
    // Note: If you look at the model 93er_cars.pmml, you will see that
    // "Passenger capacity" and "Luggage capacity (cu. feet)" do not
    // appear at all. This means that these features do not affect
    // the predicted price. Even if the model does not contain these
    // fields, they can be specified in the record.
    record.put( "Rear seat room (inches)", new Integer(28) );
    record.put( "Passenger capacity", new Integer(28) );
    record.put( "Luggage capacity (cu. feet)", new Integer(30) );
    carPricePredictor.doScoring( record );
}

To compile the Java program Sample93erCars.java:
1. Set the PATH and CLASSPATH variables as described in “Setting environment variables” on page 59.
2. Change to the sample directory.
3. Type javac Sample93erCars.java.

This will generate the class file Sample93erCars.class.

To start the program, type java Sample93erCars. This will result in the output shown as follows:

Results output for Sample93erCars

Active Fields:
Field Name: Width (inches), Mining Type: numerical
Field Name: U-turn space (feet), Mining Type: numerical
Field Name: City MPG, Mining Type: numerical
Field Name: Air Bags standard, Mining Type: numerical
Field Name: Rear seat room (inches), Mining Type: numerical
Field Name: Horsepower, Mining Type: numerical
Field Name: RPM, Mining Type: numerical
Field Name: Number of cylinders, Mining Type: numerical
Field Name: Length (inches), Mining Type: numerical

Prediction result:
Horsepower: 200
City MPG: 17
Air Bags standard: 2
=========================================
Minimum Car Price (in $1000): 30.45425764285364

Prediction result:
Length (inches): 185
Horsepower: 200
Width (inches): 68
City MPG: 17
Air Bags standard: 2
=========================================
Minimum Car Price (in $1000): 32.57913872719416

Prediction result:
Number of cylinders: 8
Length (inches): 185
U-turn space (feet): 40
Horsepower: 200
RPM: 5500
Width (inches): 68
City MPG: 17
Air Bags standard: 2
=========================================
Minimum Car Price (in $1000): 36.47348105441428

Prediction result:
Width (inches): 68
U-turn space (feet): 40
City MPG: 17
Passenger capacity: 28
Air Bags standard: 2
Rear seat room (inches): 28
Horsepower: 200
RPM: 5500
Number of cylinders: 8
Luggage capacity (cu. feet): 30
Length (inches): 185
=========================================
Minimum Car Price (in $1000): 36.41880910964151

Using idmmkSQL to work with your own mining models
In this exercise, you generate a template SQL script using the command line tool idmmkSQL. You then edit the template SQL script in order to create the final script and execute it.

To do this exercise, you must first have successfully completed the steps described in “Creating a table and importing data” on page 25 and “Importing a mining model” on page 26.

To generate a template SQL script that contains the necessary SQL statements to perform a scoring run:
1. Execute the following command:
   idmmkSQL /D DB2 clusDemoBanking.pmml clusDemoBanking.DB2
   This generates the file clusDemoBanking.DB2.
2. Open the file clusDemoBanking.DB2 with an editor of your choice and perform the following modifications:
   - Replace ###IDMMX.CLUSTERMODELS### with IDMMX.CLUSTERMODELS
     This appears twice in the file.
   - Replace ###ABSOLUTE_PATH### with the absolute path to the file clusDemoBanking.pmml
     This depends on your operating system and your installation directory.
   - Replace ###RECORDID### with AGE
     There is no ID column in this table, and therefore the column AGE is used to identify the records.
   - Replace ###MODEL### with MODEL
   - Replace ###TABLENAME### with BANKING_SCORING
   - Replace ###MODELNAME### with MODELNAME
3. Save the modified file.
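If you prefer not to edit the template by hand, the replacements listed above can also be applied programmatically. The following Java sketch shows one way to do this. It is an assumption-laden illustration: the class TemplateFiller is hypothetical, and the one-line template in main is made up for the demonstration and is not the actual output of idmmkSQL. Only the ###NAME### placeholder form and the replacement values are taken from the steps above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that fills ###NAME### placeholders in a template. */
class TemplateFiller {

    /** Replaces every ###key### token with the value mapped to key. */
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            // String.replace performs literal (non-regex) replacement of
            // all occurrences, so a token that appears twice is handled.
            result = result.replace("###" + e.getKey() + "###", e.getValue());
        }
        return result;
    }

    /** Returns true if any ###...### placeholder is still present. */
    static boolean hasUnfilledPlaceholder(String text) {
        return text.matches("(?s).*###[^#\\s]+###.*");
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("IDMMX.CLUSTERMODELS", "IDMMX.CLUSTERMODELS");
        values.put("RECORDID", "AGE");
        values.put("MODEL", "MODEL");
        values.put("TABLENAME", "BANKING_SCORING");
        values.put("MODELNAME", "MODELNAME");

        // Made-up template line, for illustration only.
        String template = "SELECT ###RECORDID### FROM ###TABLENAME###";
        String script = fill(template, values);
        System.out.println(script);
        System.out.println("unfilled placeholders left: "
                + hasUnfilledPlaceholder(script));
    }
}
```

Checking for leftover placeholders before running the script guards against a forgotten replacement, such as ###ABSOLUTE_PATH###, which depends on your installation.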

To run the script clusDemoBanking.DB2, use the following command:
db2 -stf clusDemoBanking.DB2

The script:
1. Imports the PMML 2.0 model clusDemoBanking.pmml into the table IDMMX.CLUSTERMODELS
2. Performs a scoring run with the input data from the table BANKING_SCORING
3. Writes the results into the table RESULTTABLE

You can display the results using the following SQL statement:
SELECT * FROM RESULTTABLE

The results are the same as with the script bankingApplyView.db2, which you executed in a previous exercise.


Chapter 5. Using IM Scoring

This chapter guides you through the use of IM Scoring to perform data mining tasks within a DB2 database. Use IM Scoring to:
- Create the database objects that you need for working with IM Scoring
- Work with mining models
- Apply mining models
- Get results values

Creating database objects

IM Scoring provides a set of UDTs, UDFs, and UDMs. Before you can use them from SQL statements, they must be created in the database as database objects. This section contains instructions on how to create the database objects necessary for IM Scoring. A subsection is devoted to each of the tasks involved, as follows:
- “Enabling databases”
- “Disabling databases” on page 42
- “Checking databases” on page 43

Enabling databases
To create the database objects that are necessary, you must first enable the database for the use of IM Scoring. To enable the database, use the IM Scoring command idmenabledb, which is in the bin directory of your IM Scoring installation.

Example:
idmenabledb mydb tables

The idmenabledb command gets the database name as an input parameter. It connects to the database, and creates the UDTs, UDFs, and UDMs in the database in the schema IDMMX. To execute idmenabledb, you must have SYSADM or DBADM authority.

You must call the idmenabledb command for each database that you want to use with IM Scoring.

The idmenabledb command is shared between IM Scoring and IM Modeling. This means that you have to call it only once for each database if you have both products installed. The command detects automatically which products are installed. It also detects which UDTs, UDFs, and UDMs already exist in the database and which ones must be created. This means that, to migrate from IM Scoring V7.1 to IM Scoring V8.1, you must call idmenabledb on the databases that were enabled for IM Scoring V7.1.

If you call idmenabledb and the only parameter that you specify is the database name, all the UDFs and UDMs are created as fenced. This means that they run in a process separate from the DB2 server process. All model UDTs are created for a maximum size of 10 MB. Additional options allow you to change these default values.

If you want to execute the samples that are provided with IM Scoring, you must enable the database by means of the tables option. If you specify the tables option, IM Scoring creates tables that are suitable for storing mining models. You can also use these tables for production.

For complete reference information about the idmenabledb command, see “The idmenabledb command” on page 133.

Disabling databases
The IM Scoring command idmdisabledb drops from the database all the IM Scoring UDTs, UDFs, and UDMs that were created. Call this command if you want to discontinue using a database with IM Scoring, or before you uninstall IM Scoring.

The idmdisabledb command gets the database name as its input parameter. It connects to the database and drops all the UDTs, UDFs, and UDMs that were created when the database and the schema IDMMX were enabled. To execute idmdisabledb, you must have SYSADM or DBADM authority.

Database objects cannot be dropped until they no longer have a dependency on other database objects. Take the following situation, for example. You might have created tables using IM Scoring UDTs as column types, or you might have created triggers using some IM Scoring UDFs or UDMs. In this case, you must drop these database objects first before you call idmdisabledb. It might be the case that the only dependent database objects that you have are the tables created by means of the tables option in idmenabledb. In this case, these tables are dropped if you call idmdisabledb with the optional tables option.

The idmdisabledb command is shared between IM Modeling and IM Scoring. This means that, if you have both products installed, you have to call the command only once for each database. If both products are installed, it is not possible to disable a database only for the use of one product. If a database was enabled for both products, the database should be disabled before any product is uninstalled. If one product is uninstalled, the database can no longer be disabled for either product.

Checking databases
The IM Scoring command idmcheckdb enables you to check whether a database is enabled or not.

The idmcheckdb command gets the database name as input. It connects to the database and returns a message saying whether the database is already enabled for IM Scoring, IM Modeling, or both.

Example:
idmcheckdb mydb

The database "mydb" is enabled for IM Modeling and IM Scoring in "fenced" mode.

Working with mining models

To use IM Scoring, you need mining models in one of the following formats:
- PMML 1.1 or PMML 2.0 format as a file
- Intelligent Miner format as a file
- PMML 1.1 or PMML 2.0 format in a database table as a column value of type CLOB

This section contains instructions on working with mining models. A subsection is devoted to each of the tasks involved, as follows:
- “Exporting models from IM for Data”
- “Converting exported models” on page 44
- “Generating SQL statements from models” on page 44
- “Importing mining models” on page 45
- “Providing models by means of IM Modeling” on page 48

Exporting models from IM for Data
IM for Data stores models as results objects within an internal structure, and allows you to export these objects to an external file. The external file can be accessed by the IM Scoring import functions, but the internally stored results object cannot. A prerequisite is that the conversion component is installed and configured.

To export a results object, follow these steps:
1. Click the results object on the IM for Data GUI.
2. From the Selected menu, click Export.
3. Select one of the following formats:
   - Intelligent Miner format
   - PMML
   - XML

Note: The PMML and the XML format are included in the Export menu only if you have registered the model conversion facility by means of the client tool registration of IM for Data. Refer to the installation chapter for your platform, and see “Enabling IM for Data to export PMML or XML models” on page 158.

If you export a model in PMML or XML format, an XML encoding is written into the file. What this encoding is depends on the language environment. The XML encoding is determined by the locale of the IM for Data server. The IM for Data client might reside on a processor that is different from the IM for Data server. In this case, the locale of the IM for Data client might differ from the locale of the IM for Data server. Because IM for Data exports the models to the IM for Data client, the XML encoding might become incorrect.

For these client/server and multilanguage environments, the recommended approach is as follows:
1. Export the IM for Data model in Intelligent Miner format.
2. Transfer the model to the processor on which IM Scoring is installed.
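The practical effect of an incorrect XML encoding is that the bytes of the file are interpreted with a character set other than the one in which they were written, which corrupts any non-ASCII characters, for example in field names or category values. The following Java sketch demonstrates the effect in isolation. It does not use IM for Data or IM Scoring: the class EncodingMismatchDemo is hypothetical, and the choice of ISO-8859-1 and UTF-8 as the two character sets is an assumption made for the demonstration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demonstration of reading bytes with the wrong charset. */
class EncodingMismatchDemo {

    /** Encodes text with one charset and decodes the bytes with another. */
    static String roundTrip(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // "Groesse" spelled with o-umlaut and sharp s, written as escapes so
        // that the source file itself is independent of any encoding.
        String fieldName = "Gr\u00f6\u00dfe";

        String same = roundTrip(fieldName,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        String mismatch = roundTrip(fieldName,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println("matching charsets preserve the text: "
                + same.equals(fieldName));
        System.out.println("mismatched charsets preserve the text: "
                + mismatch.equals(fieldName));
    }
}
```

Models whose names and values are pure ASCII are unaffected by such a mismatch, which is why the problem tends to appear only in multilanguage environments.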

Converting exported models
You can convert mining models in Intelligent Miner format to PMML 2.0 format by using the command line tool idmxmod. The input model in Intelligent Miner format must exist as a flat file. The output model in PMML format is also contained in a flat file.

For instructions on using this command line tool, see “The idmxmod command” on page 139.

Generating SQL statements from models
It can be time-consuming to manually write the SQL scripts that contain the necessary SQL statements for applying a model to a set of input records. This is especially the case if there are a lot of mining fields. For this reason, IM Scoring provides a separate tool, the idmmkSQL command, that generates from mining models the SQL statements that you need. The tool:
1. Analyzes an input file containing a PMML model
2. Writes a template SQL script that can be used to do these tasks:
   - Import a model
   - Build the input data records
   - Perform the application functions
   - Store the most important results of the scoring run in a table

The template script contains placeholders for actual values like table names or file names. You have to manually insert these values into the template in order to get the final SQL script.

The idmmkSQL command supports the different variants of building input data records for the application functions that are provided by IM Scoring. The tool has a command line interface. The type of SQL and the method of building input data records can be controlled by means of a number of command line parameters.

For instructions on using this tool, see “The idmmkSQL command” on page 136.

Importing mining models
The IM Scoring functions for importing mining models read a file that contains the mining model or a CLOB value from a database table that contains a PMML model. Each import function returns the mining model as data of one of the data types specific to IM Scoring.

You can use an SQL command to insert the returned data into a DB2 table where a column has been configured for the appropriate data type.

The IM Scoring package contains sample tables that are configured for the imported models. For instructions on installing these sample tables, see “The idmenabledb command” on page 133.

Errors can occur in the following circumstances:
- A model is imported by means of the wrong import function. For example, the function DM_impClusFile was used to import a Classification model.
- The wrong type of data is inserted into a table. For example, the data type DM_ClusteringModel was inserted into the RegressionModels table.
- A table that does not exist was specified.
  If an error occurs when you are using one of the sample table names, ensure that you have enabled the database by means of the tables option. For instructions on installing the sample tables, see “The idmenabledb command” on page 133.

Importing mining models from a file
The import functions import a mining model from a file and return it as a value of the appropriate data type. Table 6 on page 46 shows the relationship between the import function, the returned data type, and the sample table. The encoding of characters in the imported file is determined by default. For XML files, the XML encoding specification in the XML header is respected. If the encoding specification does not exist, UTF-8 is assumed. For more information on encodings, see “Using IM Scoring in a multilanguage environment” on page 65.
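Before importing a file, it can be useful to check which encoding its XML header declares. The following Java sketch mirrors the rule stated above: it returns the encoding declared in the XML header, or UTF-8 if there is no encoding specification. It is not IM Scoring's own implementation. The class XmlEncodingSniffer is hypothetical, and the sketch assumes that the beginning of the file has already been read into a string, which works for the ASCII-compatible encodings in which the header itself is readable.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that reads the encoding from an XML declaration. */
class XmlEncodingSniffer {

    // Matches an XML declaration at the very start of the text and captures
    // the quoted value of its encoding pseudo-attribute in group 2.
    private static final Pattern DECLARED = Pattern.compile(
            "^<\\?xml\\b[^>]*?\\bencoding\\s*=\\s*([\"'])(.*?)\\1");

    /** Returns the declared encoding, or "UTF-8" if none is declared. */
    static String effectiveEncoding(String xmlStart) {
        Matcher m = DECLARED.matcher(xmlStart);
        return m.find() ? m.group(2) : "UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(effectiveEncoding(
                "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?><PMML/>"));
        System.out.println(effectiveEncoding(
                "<?xml version=\"1.0\"?><PMML/>"));
    }
}
```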

Table 6. Import functions and related data types and tables

Import function    Data type             Table
DM_impClusFile     DM_ClusteringModel    ClusterModels
DM_impClasFile     DM_ClasModel          ClassifModels
DM_impRegFile      DM_RegressionModel    RegressionModels

Figure 3 illustrates the relationship between model type, scoring functions, and the data type of the column in the DB2 table.

You can use the following command from the command line to import the file myclusters.x and insert it into the ClusterModels table in your schema:

db2 insert into IDMMX.ClusterModels values ('CustomerSegments', IDMMX.DM_impClusFile('/tmp/myclusters.x'))
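Because each model type has its own import function and sample table, a mismatch between the two is an easy mistake to make. The following Java sketch keeps the pairs from Table 6 together in one place and builds the insert statement from them. It is an illustration only: the class ImportStatements and the enum ModelKind are hypothetical names, and the sketch only produces the statement text in the form shown above. It does not connect to DB2.

```java
/** Hypothetical helper that pairs each model type with its import function. */
class ImportStatements {

    /** Import function and sample table for each model type (see Table 6). */
    enum ModelKind {
        CLUSTERING("DM_impClusFile", "ClusterModels"),
        CLASSIFICATION("DM_impClasFile", "ClassifModels"),
        REGRESSION("DM_impRegFile", "RegressionModels");

        final String importFunction;
        final String sampleTable;

        ModelKind(String importFunction, String sampleTable) {
            this.importFunction = importFunction;
            this.sampleTable = sampleTable;
        }
    }

    /** Quotes a value as an SQL string literal, doubling embedded quotes. */
    private static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Builds the insert statement that imports a model file. */
    static String fileImport(ModelKind kind, String modelName, String path) {
        return "insert into IDMMX." + kind.sampleTable
                + " values (" + quote(modelName)
                + ", IDMMX." + kind.importFunction + "(" + quote(path) + "))";
    }

    public static void main(String[] args) {
        System.out.println(fileImport(
                ModelKind.CLUSTERING, "CustomerSegments", "/tmp/myclusters.x"));
    }
}
```

Selecting the model type once and deriving both the function and the table from it avoids the first two error situations described under “Importing mining models”.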

Figure 3. Model import processes


Importing mining models from a file using a specific XML encoding
With IM Scoring’s import functions, you can override the encoding specification given in the XML file and explicitly specify the encoding to be assumed for the file to be imported.

IM Scoring’s import functions use a specific XML encoding to import a mining model from a file and return it as a value of the appropriate data type. The model is imported in the specified XML encoding.

Specifying an XML encoding might be necessary, for example, in a situation where both of the following are true:
- The model was exported from IM for Data in PMML or XML format.
- The locales of the Intelligent Miner server and the Intelligent Miner client were different.

Table 7 shows the relationship between the import function, the returned data type, and the sample table. The encoding of characters in the imported file is determined by default. For more information on encodings, see “Using IM Scoring in a multilanguage environment” on page 65.

Table 7. Import functions using a specific XML encoding

Import function     Data type             Table
DM_impClusFileE     DM_ClusteringModel    ClusterModels
DM_impClasFileE     DM_ClasModel          ClassifModels
DM_impRegFileE      DM_RegressionModel    RegressionModels

You can use the following command from the command line to import the file myclusters.x, which is assumed to be encoded in iso-8859-1, and insert it into the ClusterModels table in your schema:

db2 "insert into IDMMX.ClusterModels values ('CustomerSegments', IDMMX.DM_impClusFileE('/tmp/myclusters.x', 'iso-8859-1'))"

Importing mining models from a CLOB value in a database table

You can use import functions that import a PMML 1.1 or PMML 2.0 model that already resides in a database table column as a CLOB value.

Note: Use these functions to get a value of one of the model data types from a CLOB value. Do not use the casting functions that are provided by DB2.

Table 8 on page 48 shows the relationship between the import function, the returned data type, and the sample table.

Table 8. Import functions using CLOB values

Import function    Data type             Table
DM_impClusModel    DM_ClusteringModel    ClusterModels
DM_impClasModel    DM_ClasModel          ClassifModels
DM_impRegModel     DM_RegressionModel    RegressionModels

The following is an example of the use of these import functions.
1. A model name and a model in PMML 1.1 or PMML 2.0 format as a CLOB value are selected from a table, PMMLClusterModels.
2. They are then inserted into the ClusterModels table.
3. The PMML model is converted from the CLOB data type to the DM_ClusteringModel data type.

db2 "insert into IDMMX.ClusterModels select modelname, IDMMX.DM_impClusModel(model) from PMMLClusterModels"

Providing models by means of IM Modeling

IM Modeling provides an SQL interface consisting of a set of database objects. These objects enable you to build data mining models in PMML 2.0 format from information held in IBM DB2 databases. IM Modeling writes the data mining models that it creates into tables, and these models can be directly used by IM Scoring. IM Modeling supports the following subset of the IM Scoring model types:

DM_ClusteringModel

DM_ClasModel

The samples provided in the samples/ScoringDB2 directory show you how to extract from a model the information needed to apply that model. The most important part of this information consists of details about the active fields that are contained in the model. Values must be provided for these fields when the model is being applied. The samples also apply the models.

The names of the samples are bankingModelingApply1.db2 and bankingModelingApply2.db2. For instructions on how to execute these samples, see “Applying a model and getting results values” on page 27.

Applying mining models

This section contains instructions on applying mining models. It contains the following subsections:
• “Querying model field names”
• “Using the application functions” on page 50
• “Specifying data by means of REC2XML” on page 51
• “Specifying data by means of DM_applData” on page 52
• “Specifying data by means of CONCAT” on page 53
• “Results data” on page 53
• “Code sample for applying models” on page 55

Querying model field names

To get the best possible result when you apply a model to new data records, provide a value in the data record for each active field in the model. The data record consists of a set of field names and their values. The field names in the set must match the names of the active fields in the model.

If necessary, you can access information about the active fields in your model. For a sample script that does this, see “Extracting information from a model” on page 31.

To get the set of active fields and their data mining field type, use one of the following UDFs:
• DM_getClusMdlSpec for Clustering models of type DM_ClusteringModel
• DM_getClasMdlSpec for Classification models of type DM_ClasModel
• DM_getRegMdlSpec for Regression models of type DM_RegressionModel

The functions take the model as input parameter and return a value of type DM_LogicalDataSpec.

DM_LogicalDataSpec is a structured type. Methods (UDMs) are available that enable you to query the content of the DM_LogicalDataSpec value. They are:
• DM_getNumFields to get the total number of fields in the DM_LogicalDataSpec value
• DM_getFldName to get the name of a field
• DM_getFldType to get the mining field type of a field

DM_getFldName and DM_getFldType take a position as input parameter. The position value ranges from 1 to the result of a call to DM_getNumFields().

To determine the field names contained in the PMML model, the following logic is used:
1. If the attribute displayName of the element DataField contains a field name, that field name is returned, provided that the field is marked as an active field in the element MiningField.
2. Otherwise, the field name that is contained in the name attribute of the element DataField is returned, again only if the field is marked as an active field in the element MiningField.

In both cases, a field that is not marked as an active field is not returned.
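The selection logic above can be sketched in a few lines of Java. This is only an illustration, not the product's implementation: the class DataFieldInfo is a hypothetical, simplified stand-in for a PMML DataField combined with the usage information from its MiningField, and an empty displayName is assumed to mean that no display name is given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a PMML DataField plus its MiningField usage. */
final class DataFieldInfo {
    final String name;        // value of the "name" attribute
    final String displayName; // value of the "displayName" attribute; may be null or empty
    final boolean active;     // true if the MiningField marks the field as active

    DataFieldInfo(String name, String displayName, boolean active) {
        this.name = name;
        this.displayName = displayName;
        this.active = active;
    }
}

final class ActiveFieldNames {
    /** Returns displayName if present, otherwise name, for active fields only. */
    static List<String> resolve(List<DataFieldInfo> fields) {
        List<String> result = new ArrayList<String>();
        for (DataFieldInfo f : fields) {
            if (!f.active) {
                continue; // inactive fields are never returned
            }
            boolean hasDisplayName = f.displayName != null && !f.displayName.isEmpty();
            result.add(hasDisplayName ? f.displayName : f.name);
        }
        return result;
    }

    public static void main(String[] args) {
        List<DataFieldInfo> fields = Arrays.asList(
            new DataFieldInfo("AGE", "Customer age", true),
            new DataFieldInfo("SALARY", "", true),
            new DataFieldInfo("ZIP", "Postal code", false));
        System.out.println(resolve(fields));
    }
}
```

In this sample, the field ZIP is skipped because it is not active, and SALARY falls back to its name attribute because its display name is empty.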

Using the application functions

IM Scoring includes functions that apply imported mining models to selected data. Table 9 lists each function together with a brief summary of how it is used.

Table 9. Functions for applying models

Application function    Purpose

DM_applyClusModel    Applies a model to selected data to produce results data, grouped into clusters. Clusters are identified when the model is built. They identify similar characteristics in data. A possible use of this function is in producing targeted mailshots.

DM_applyClasModel    Applies a model to selected data to produce results data that is classified according to rules established when the model is built. A possible use of this function is in classifying insurance risks.

DM_applyRegModel    Applies a model to selected data to calculate a predicted value. The predicted value is based on the values of input fields, according to a pattern established when the model is built. A possible use of this function is in ranking customers.

If the models were created by means of IM for Data, refer to the appropriate chapters in Using the Intelligent Miner for Data for more information about the way each type of modeling works.

If the models were created by means of IM Modeling, refer to the appropriate chapters in IM Modeling Administration and Programming. This provides more information about the way the Classification and Clustering modeling types work.

Each function has the following input arguments:

An imported mining model of the appropriate type
The mining models are assumed to reside in a table. One column of the table contains the identifier or the name of the model. The other column contains the model itself. In this case, the WHERE clause in the SELECT statement determines which model is the input to the application function. If the WHERE clause fails to return unique results, an error occurs. You can apply only one model for each statement.

The data to which the model is applied
You can use the function DM_applData or DM_impApplData to define the data to which you want to apply a model. These functions construct an instance of the data type DM_ApplicationData. To select several data items, you can choose one of the following options:
• Call the function DM_impApplData with a concatenation of data items as the input.
• Call DM_impApplData with the output of the REC2XML function.
• Make several nested calls to the function DM_applData. Each nested call appends data to an instance of DM_ApplicationData.

Note: Use the function DM_applData or DM_impApplData to get a value of data type DM_ApplicationData. Do not use the casting functions that are provided by DB2.

The function returns results data that contains the values that the mining logic calculates.

The field names contained in the DM_ApplicationData value are mapped to the fields in the model using the following logic:
1. A field name is compared with the field names in the PMML model that are contained in the attribute displayName in the element DataField.
2. If there is no match, the field name is compared with the field names in the PMML model that are contained in the name attributes in the DataField elements.

If you created the model by using IM for Data, the attribute displayName is empty, and the attribute name contains the name of the field used in IM for Data.
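The two-step lookup can be pictured with the following Java sketch. It is a simplified illustration under stated assumptions, not the code that IM Scoring runs: ModelField and ApplicationFieldMatcher are hypothetical names, and the comparison is assumed to be an exact, case-sensitive string match, which the text above does not specify.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the name attributes of a PMML DataField. */
final class ModelField {
    final String name;
    final String displayName; // empty for models created by IM for Data

    ModelField(String name, String displayName) {
        this.name = name;
        this.displayName = displayName;
    }
}

final class ApplicationFieldMatcher {
    /**
     * Returns the PMML name of the model field that an input field name maps to,
     * or null if there is no match. Exact string comparison is an assumption.
     */
    static String match(String inputName, List<ModelField> modelFields) {
        // Step 1: compare with the displayName attributes.
        for (ModelField f : modelFields) {
            if (inputName.equals(f.displayName)) {
                return f.name;
            }
        }
        // Step 2: no displayName matched, so compare with the name attributes.
        for (ModelField f : modelFields) {
            if (inputName.equals(f.name)) {
                return f.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<ModelField> fields = Arrays.asList(
            new ModelField("AGE", "Customer age"),
            new ModelField("SALARY", ""));
        System.out.println(match("Customer age", fields));
        System.out.println(match("SALARY", fields));
        System.out.println(match("ZIP", fields));
    }
}
```

For a model created by IM for Data, every displayName is empty, so only the second step can produce a match.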

Specifying data by means of REC2XML

Use the DB2 built-in function REC2XML for performance-critical applications to build a data record. REC2XML takes a number of control parameters and a list of columns as input. The output is an XML string containing pairs consisting of column names and values. This XML string can be used as the input to the DM_impApplData function. This function returns the XML string as data type DM_ApplicationData, which is the correct data type to be used as input parameter to the application functions.

You might want to use REC2XML to build a data record for the application functions. In this case, the field names of the active fields in the model must match the columns of the table that is used for REC2XML.

If this is not the case, you have the following possibilities:
1. Create a view on the table in order to have the same column names as field names in the model.
2. Use the CONCAT function instead of REC2XML to build the XML string. This technique allows you to map the column names to the field names in the model. By using CONCAT, you achieve performance similar to that achieved when you use REC2XML.
3. Use the function DM_applData instead of REC2XML. DM_applData is slower, but it is easier to use than CONCAT. DM_applData also allows you to map the field names in the model to the column names in the table.

REC2XML is described in the DB2 V7.2 SQL Reference. For your convenience, that description is also given here in Appendix F, “The DB2 REC2XML function” on page 199.

For an example of its use, see “Applying a model and getting results values” on page 27.

Specifying data by means of DM_applData

Use the function DM_applData to build a data record where both of the following are true:
• Performance is not important.
• The column names for the data record are different from the field names in the model.

You must call DM_applData for each column that goes into the data record. The function returns an XML string consisting of a field name/value pair of type DM_ApplicationData. DM_applData is called in a nested way, because the return value of the function is the input to the next call of the function. The function then appends its field name/value pair to the input XML string, and returns the whole XML string. The input parameters to the function DM_applData are as follows:
• The value of type DM_ApplicationData (omitted in the innermost call, which starts the data record)
• A field name that matches a field name in the model
• A value

If you use DM_applData on a table, the values are specified using the column name.

Specifying data by means of CONCAT

Use the built-in function REC2XML for performance-critical applications. You can also use the built-in operator CONCAT, or the corresponding characters ||, in a sequence to construct the type DM_ApplicationData. This is the input for the application functions. For more information on the required format of the string, see “DM_impApplData” on page 120. You can also generate the CONCAT syntax by using the idmmkSQL command, which is described in “The idmmkSQL command” on page 136.

For example, the following calls correspond to each other:

IDMMX.DM_applData( IDMMX.DM_applData( 'CHARCOL', CHARCOL ), 'DOUBLECOL', DOUBLECOL )

IDMMX.DM_impApplData( '<row><column name="CHARCOL">' || CHARCOL || '</column>' ||
                      '<column name="DOUBLECOL">' || CHAR(DOUBLECOL) || '</column></row>' )

If the columns CHARCOL and DOUBLECOL can contain NULL values, you must call CONCAT as follows:

IDMMX.DM_impApplData( '<row><column name="CHARCOL"' ||
                      coalesce( '>' || CHARCOL, ' null="true">' ) || '</column>' ||
                      '<column name="DOUBLECOL"' ||
                      coalesce( '>' || CHAR(DOUBLECOL), ' null="true">' ) || '</column></row>' )

Note: IM for Data handles date values and time values in a character ISO format. If you pass a date value or a time value through the CONCAT operator and apply a model that was created by IM for Data, the date value and the time value must be converted to character ISO format. To do this, use CHAR(<value>, ISO) for a date value and CHAR(<value>, JIS) for a time value.
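If an application assembles the row string on the client side instead of in SQL, the same layout can be produced with a small helper. The following Java sketch mirrors the CONCAT expressions shown in this section, including the null="true" form for missing values. The class and method names are hypothetical, and the sketch assumes that names and values contain no characters that need XML escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ApplDataXml {
    /**
     * Builds a string of the form <row><column name="...">value</column>...</row>.
     * A null value is written as <column name="..." null="true"></column>.
     * XML escaping is deliberately left out of this sketch.
     */
    static String toRowXml(Map<String, Object> record) {
        StringBuilder sb = new StringBuilder("<row>");
        for (Map.Entry<String, Object> e : record.entrySet()) {
            sb.append("<column name=\"").append(e.getKey()).append('"');
            if (e.getValue() == null) {
                sb.append(" null=\"true\">");
            } else {
                sb.append('>').append(e.getValue());
            }
            sb.append("</column>");
        }
        return sb.append("</row>").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the columns in insertion order.
        Map<String, Object> record = new LinkedHashMap<String, Object>();
        record.put("CHARCOL", "abc");
        record.put("DOUBLECOL", null);
        System.out.println(toRowXml(record));
    }
}
```

The resulting string would then be passed as the argument of DM_impApplData, for example through a parameter marker in the SQL statement.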

Results data

The results data from each of the scoring functions is identified by a data type that is specific to IM Scoring. Table 10 lists the application functions together with their data types and results data.

Table 10. Application functions and their data types and results data

Application function    Data type        Results data
DM_applyClusModel       DM_ClusResult    Cluster ID, score, quality, confidence
DM_applyClasModel       DM_ClasResult    Predicted class, confidence
DM_applyRegModel        DM_RegResult     Predicted value, region ID (if an RBF model was used)

Notes:
1. If you create your own DB2 tables to store results, ensure that you include a column that is configured for the appropriate data type.
2. If you use an RBF model for the DM_applyRegModel function, the results data Region ID is additionally created.

Errors can occur if any of the following is the case:
• A model is applied using the wrong function. For example, the function DM_applyClusModel is used to apply a Classification model.
• The wrong type of results data is inserted into a table. For example, the data type DM_ClusResult is inserted into a column that is configured for data type DM_RegResult.
• The fields specified for inclusion in the data do not match the fields included in the model.
• A results table that does not exist is specified.

Figure 4 on page 55 illustrates the process by which data is selected and a model is applied. Values for the fields age and salary are read from a database table to form an instance of the data type DM_ApplicationData. This data and the model, ClusterModel, form the input to the function DM_applyClusModel. The results of applying the model consist of a cluster ID and a cluster score. The results are returned as data type DM_ClusResult.

Code sample for applying models

The following code sample shows a statement that applies a Classification model to all records in the table myData where the customer’s age is less than 40.

INSERT INTO myClassifResults (name, salary, address, clfresult)
  SELECT s.name, s.salary, s.address,
         IDMMX.DM_applyClasModel(c.model,
           IDMMX.DM_applData(
             IDMMX.DM_applData('AGE', s.age),
             'SALARY', s.salary))
  FROM ClassifModels c, myData s
  WHERE c.modelname = 'Customers' AND s.age < 40

The fields salary and age from the myData table are used.

The Classification model and the name of the model are stored in the ClassifModels table in the columns model and modelname. The model name is Customers.

The results of applying the model together with customer information are then inserted into the myClassifResults table. The calculated results data is contained in the clfresult column. This column is configured for the data type DM_ClasResult.

Figure 4. Applying a model to data


Getting application results

IM Scoring includes results functions that obtain the different results values calculated within the application functions. This section lists and describes these functions. It also includes information about situations where input records contain NULL values; this is in “Handling missing values” on page 57.

Table 11 lists the results functions and describes the value that each function returns.

Table 11. Results functions and their purpose

Results function    Purpose

DM_getClusterID    Obtains the cluster ID from results data calculated when a Clustering model is applied. This identifies the position of the cluster in the Clustering model that was the best match for this data record. The position of the cluster is a value between 1 and the number of clusters.

DM_getClusScore    Obtains the Clustering score from results data calculated when a Clustering model is applied. The score is a measure of how closely the data matches the model cluster.

DM_getQuality    Returns the quality of the best cluster from results data calculated when a Clustering model is applied.

DM_getQuality(clusterid)    Returns the quality for a specified cluster from results data calculated when a Clustering model is applied.

DM_getClusConf    Returns the confidence of attributing a record to the best cluster in comparison with attributing it to another cluster of the applied model.

DM_getPredClass    Obtains the predicted class from results data calculated when a Classification model is applied. This identifies the class within the model to which the data matched.

DM_getConfidence    Obtains the classification confidence value from results data calculated when a Classification model is applied. This is a value between 0.0 and 1.0; it measures the probability that the class is predicted correctly.

DM_getPredValue    Obtains the predicted value from results data calculated when a Regression model is applied. This value is calculated according to relations established by the model.

DM_getRBFRegionID    Obtains the number of the region from results data calculated when an RBF Regression model is applied. This identifies the region within the model that the record was assigned to.


To obtain the results of applying a model, you run the appropriate results function, giving the return value of an application function as a parameter. This can be one of the following:
• A common table expression
• A column in a database table where you have stored the return value of an application function
• The return value itself, if you call the functions in a nested way

You use common table expressions or pass the return value of the application function directly if you use only a single statement. That is, you apply the model and obtain the results value within a single statement. You use the database column name if the results of applying the model were stored in a DB2 table.

If you specify a column name, and the column is not configured for the correct results data type, an error occurs. See Table 10 on page 53 for the relationships between application functions, results data types, and results data.

A results function might return a NULL value. This means that the application function that calculated the result got a faulty data record as input. The data record contained too many invalid values for a reasonable result to be returned.

Handling missing values

Input records might contain one or more values that are NULL; these are known as missing values. The handling of missing values in IM Scoring depends on the algorithm.

Neural algorithms
In general, the Neural algorithms of the mining functions handle missing values as follows:
• Numeric variables: If a missing value replacement (PMML 2.0) is present, that will be taken. Otherwise, the activation of the corresponding input node will be set to 0.5, which equals the mean value in most cases.
• Categorical variables: If a missing value replacement (PMML 2.0) is present, that will be taken. Otherwise, the activations of all corresponding input nodes will be set to 0.
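The two rules for Neural algorithms can be illustrated with the following Java sketch. It is a simplified model of the behavior described above, not the scoring engine itself. The sketch assumes that numeric values and replacements are already scaled to the activation range, and that a categorical field is represented by one input node per category (one-hot encoding); neither assumption is stated in this book, and all names are hypothetical.

```java
final class NeuralMissingValues {
    /**
     * Activation of a numeric input node. Assumes that value and replacement
     * are already scaled to the activation range.
     */
    static double numericActivation(Double value, Double replacement) {
        if (value != null) {
            return value;
        }
        if (replacement != null) {
            return replacement; // missing value replacement (PMML 2.0)
        }
        return 0.5;             // default for a missing numeric value
    }

    /**
     * Activations of the input nodes of a categorical field, assuming one node
     * per category. With no value and no replacement, all nodes stay at 0.
     */
    static double[] categoricalActivations(String value, String replacement, String[] categories) {
        String effective = (value != null) ? value : replacement;
        double[] activations = new double[categories.length]; // initialized to 0.0
        if (effective != null) {
            for (int i = 0; i < categories.length; i++) {
                if (categories[i].equals(effective)) {
                    activations[i] = 1.0;
                }
            }
        }
        return activations;
    }

    public static void main(String[] args) {
        System.out.println(numericActivation(null, null));
        double[] a = categoricalActivations(null, null, new String[] {"male", "female"});
        System.out.println(a[0] + " " + a[1]);
    }
}
```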

Classification
• Neural Classification: In IBM models, if none of the activations of the output neurons is above a certain threshold limit, DM_getPredClass returns NULL. Other models always give a prediction. DM_getConfidence always returns a value.
• Tree Classification: The handling of missing values depends on whether the model was generated by an IBM product or by a non-IBM product.

Models generated by an IBM product
These IBM products consist of IM Modeling and IM for Data. With IBM models, a sophisticated value treatment is used. If a missing value occurs, the record being scored is fed into both child nodes (binary tree) of the tree node requiring the missing value. This process continues until the record reaches a leaf node. Thus, a record is assigned to more than one leaf node. Tree Classification aggregates all these leaf nodes, and DM_getPredClass returns the value assigned to this aggregated node.

Models generated by a non-IBM product
The scoring process stops at the first tree node requiring a missing value, and DM_getPredClass returns the value assigned to this (non-leaf) node.
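The treatment for IBM models, where a record with a missing value follows both branches, can be pictured with the Java sketch below. It is a rough illustration only. This book does not describe how the leaf nodes are aggregated, so the sketch assumes that each leaf holds class counts, that aggregation sums these counts, and that the class with the highest total is predicted. The tree layout (a numeric test of the form field < threshold) and all names are hypothetical as well.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical binary tree node: inner nodes test "field < threshold", leaves hold class counts. */
final class TreeNode {
    final String field;
    final double threshold;
    final TreeNode left;
    final TreeNode right;
    final Map<String, Integer> classCounts; // non-null only for leaf nodes

    TreeNode(String field, double threshold, TreeNode left, TreeNode right) {
        this.field = field; this.threshold = threshold;
        this.left = left; this.right = right; this.classCounts = null;
    }

    TreeNode(Map<String, Integer> classCounts) {
        this.field = null; this.threshold = 0.0;
        this.left = null; this.right = null; this.classCounts = classCounts;
    }
}

final class MissingValueTreeScoring {
    /** Adds the class counts of every leaf that the record can reach to totals. */
    static void collect(TreeNode node, Map<String, Double> record, Map<String, Integer> totals) {
        if (node.classCounts != null) {
            for (Map.Entry<String, Integer> e : node.classCounts.entrySet()) {
                Integer old = totals.get(e.getKey());
                totals.put(e.getKey(), (old == null ? 0 : old) + e.getValue());
            }
            return;
        }
        Double value = record.get(node.field);
        if (value == null) {
            // Missing value: feed the record into both child nodes.
            collect(node.left, record, totals);
            collect(node.right, record, totals);
        } else if (value < node.threshold) {
            collect(node.left, record, totals);
        } else {
            collect(node.right, record, totals);
        }
    }

    /** Predicts the class with the highest aggregated count (assumed aggregation rule). */
    static String predict(TreeNode root, Map<String, Double> record) {
        Map<String, Integer> totals = new HashMap<String, Integer>();
        collect(root, record, totals);
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : totals.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    /** Convenience helper that builds a two-class leaf. */
    static TreeNode leaf(String class1, int count1, String class2, int count2) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        counts.put(class1, count1);
        counts.put(class2, count2);
        return new TreeNode(counts);
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode("AGE", 40.0, leaf("good", 8, "bad", 2), leaf("good", 1, "bad", 9));
        Map<String, Double> record = new HashMap<String, Double>(); // AGE is missing
        System.out.println(predict(root, record));
    }
}
```

For a non-IBM model, the behavior described above would correspond to returning the value stored at the first node whose field is missing, instead of descending into both children.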

Clustering
• Demographic Clustering: Missing values are ignored and the corresponding field does not participate in the scoring process. If all the values of the record are missing, NULL is returned by DM_getClusterID and DM_getClusScore.
• Neural Clustering: DM_getClusterID and DM_getClusScore never return NULL.

Regression
• Linear Regression:
  – Numeric variables: If a missing value replacement (PMML 2.0) is present, this will be taken. If a mean value is given in the PMML, that will be taken. Otherwise the variable is ignored.
  – Categorical variables: If a missing value replacement (PMML 2.0) is present, that will be taken. Otherwise, the variable is ignored.
  – If all input variables are missing values, no prediction will be given. The function DM_getPredValue returns NULL.
• Neural Prediction: DM_getPredValue will always give a prediction value.
• RBF Prediction: Missing values are ignored, and the corresponding field does not participate in the scoring process. If all the values of the record are missing, DM_getPredValue and DM_getRBFRegionID return NULL.
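The rules for numeric variables in Linear Regression can be summarized in the following Java sketch. It is a minimal illustration under stated assumptions: the model is reduced to an intercept and one coefficient per numeric input, categorical variables are left out, and an ignored variable is assumed to contribute nothing to the sum. The class and parameter names are hypothetical.

```java
final class LinearRegressionScoring {
    /**
     * Computes intercept + sum(coefficient * value) with the missing value rules
     * for numeric variables. Returns null if every input value is missing.
     * All arrays are expected to have the same length.
     */
    static Double predict(double intercept, double[] coefficients,
                          Double[] values, Double[] replacements, Double[] means) {
        double sum = intercept;
        boolean anyPresent = false;
        for (int i = 0; i < coefficients.length; i++) {
            Double v = values[i];
            if (v != null) {
                anyPresent = true;
            } else if (replacements[i] != null) {
                v = replacements[i]; // missing value replacement (PMML 2.0)
            } else if (means[i] != null) {
                v = means[i];        // mean value given in the PMML
            }
            if (v != null) {
                sum += coefficients[i] * v;
            }
            // Otherwise the variable is ignored (assumed: no contribution).
        }
        return anyPresent ? Double.valueOf(sum) : null;
    }

    public static void main(String[] args) {
        Double result = predict(1.0, new double[] {2.0, 3.0},
            new Double[] {null, 4.0},   // first input is missing
            new Double[] {null, null},  // no replacements defined
            new Double[] {5.0, null});  // mean available for the first input
        System.out.println(result);
    }
}
```

The null return value plays the role of the NULL that DM_getPredValue returns when all input variables are missing.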

Using IM Scoring Java Beans

This section describes the interfaces of IM Scoring Java Beans. It contains the following subsections:
• “Setting environment variables” on page 59
• “Specifying the mining model to be used” on page 60
• “Accessing model metadata” on page 61
• “Specifying a data record” on page 62
• “Applying scoring” on page 62
• “Accessing computed results” on page 62
• “Scoring example” on page 63
• “ScoringException classes” on page 64

To perform the scoring of a data record, you must specify the following input:
1. The mining model to be used
2. One or more data records for which you want to compute a score value

When you have specified the necessary input, you can apply scoring and then access the result fields that have been computed.

The functions of IM Scoring Java Beans are implemented as methods of class com.ibm.iminer.scoring.RecordScorer.

The sections that follow describe how to use IM Scoring Java Beans. These sections consist of:
• “Specifying the mining model to be used” on page 60
• “Accessing model metadata” on page 61
• “Specifying a data record” on page 62
• “Applying scoring” on page 62
• “Accessing computed results” on page 62
• “Scoring example” on page 63
• “ScoringException classes” on page 64

Note that the Java API is documented in online documentation (Javadoc) in the directory \doc\ScoringBean\index.html

Setting environment variables

After you have installed IM Scoring Java Beans, you must set environment variables before you can use it from Java applications.

For a Java application to invoke RecordScorer, you must do the following:

On Windows systems:
• set PATH=%PATH%;<install path>\bin\
• set CLASSPATH=%CLASSPATH%;<install path>\java\xerces.jar
• set CLASSPATH=%CLASSPATH%;<install path>\java\idmscore.jar

On AIX systems:
The exact commands depend on the shell being used.
• Set your LIBPATH to include /usr/lpp/IMinerX/lib
• Set your CLASSPATH to include /usr/lpp/IMinerX/lib/xerces.jar and /usr/lpp/IMinerX/libidmscore.jar

On Linux and Sun Solaris systems:
The exact commands depend on the shell being used.
• Set your LD_LIBRARY_PATH to include /opt/IMinerX/lib
• Set your CLASSPATH to include /opt/IMinerX/lib/xerces.jar and /opt/IMinerX/libidmscore.jar

The xerces.jar package is the XML4J package needed for XML parsing. It is copied into the IMinerX/java directory during the installation of IM Scoring Java Beans. It is possible, however, to use any other implementation of the XML4J specification.

Specifying the mining model to be used

The RecordScorer class expects the mining model to be stored in a file on the local file system in PMML 2.0 format. You can specify the mining model to be used for scoring in one of two ways, as follows:
• By using the constructor
  One of the two constructors that are provided allows the specification of a file in which the mining model is stored in PMML format.
  public RecordScorer( String modelFile ) throws ModelException
• By using the method interface
  public void setModelFile( String modelFile ) throws ModelException

The ModelException is thrown if either of the following is true:
• The specified file cannot be found or accessed successfully
• The specified mining model is in an incorrect format or is of an unknown type

When a new mining model is set, this model is loaded explicitly into memory. Loading the mining model means reading the PMML file, interpreting the mining model, and preparing everything for scoring. For big mining models, for example, those with a size of more than 50 MB, this operation might take some time. After the mining model has been loaded, the interpreted model is kept in memory. For consecutive scoring calls, the loaded and prepared mining model is used, so that the scoring of a single record needs a response time of less than a second.

When a mining model is set, it remains loaded until one of the following happens:
1. A new model is loaded.
2. The garbage collector frees up the actual RecordScorer instance.

Accessing model metadata

The RecordScorer class provides a set of methods that can be used to access the metadata of the mining model that is specified. These are as follows:

Table 12. IM Scoring Java Beans methods for accessing model metadata

Method    Method description

public String[] getFieldNames()    Returns the names of the mining fields used by the mining model to perform scoring. If no correct model is specified, a String[0] array is returned.

public String[] getCategoricalFields()    Returns the names of the categorical mining fields used by the mining model to perform scoring. If no correct model is specified, a String[0] array is returned.

public String[] getNumericalFields()    Returns the names of the numerical mining fields used by the mining model to perform scoring. If no correct model is specified, a String[0] array is returned.

public boolean isCategoricalField(String fieldName)    Returns true if the field specified by fieldName is a categorical mining field. Otherwise, false is returned.

public boolean isNumericalField(String fieldName)    Returns true if the field specified by fieldName is a numerical mining field. Otherwise, false is returned.

public boolean isField(String fieldName)    Returns true if the field specified by fieldName is one of the active mining fields. Otherwise, false is returned.

public int getModelType()    Returns an integer value that identifies the type of the model that was used. For this, the constants CLUSTERING_TYPE, CLASSIFICATION_TYPE, REGRESSION_TYPE, and UNDEFINED_TYPE are defined in the Scorer class.

public int[] getResultIdentifiers()    Returns the identifiers of the result fields that are relevant for the model type that is set. For this, there are additional constants defined in the class Scorer:
• For CLUSTERING_TYPE, the result fields CLUSTER_ID, CLUSTER_SCORE, and CLUSTER_QUALITY are relevant.
• For CLASSIFICATION_TYPE, the result fields PREDICTED_CLASS and CONFIDENCE are relevant.
• For REGRESSION_TYPE, the result fields PREDICTED_VALUE and RBF_REGION_ID are relevant.

Specifying a data record

You can represent a data record by a mapping. Here, the names of the fields used in the records are mapped to their actual values. This is exactly the way in which a record is represented in the RecordScorer class. A java.util.Map object is instantiated, and a mapping between field names and values is defined. For example:

HashMap record = new HashMap();
record.put("Horsepower", new Integer(200));
record.put("Air Bags standard", new Integer(2));
record.put("City MPG", new Integer(17));

Note that the values of categorical fields are interpreted as java.lang.String, and the values of numerical fields are expected to inherit from java.lang.Number.
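Because numerical fields are expected to hold java.lang.Number values, it can be useful to check a record before it is scored. The following Java sketch shows one possible check. It does not use the IM Scoring classes: the set of numerical field names is passed in directly, whereas in an application it would typically be filled from the result of getNumericalFields(). The class and method names are hypothetical, and a null value is treated here as a missing value rather than as an error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RecordCheck {
    /**
     * Returns the names of numerical fields whose value is present but is not a
     * java.lang.Number. Null values are skipped (treated as missing values).
     */
    static List<String> nonNumericValues(Map<String, Object> record, Set<String> numericalFields) {
        List<String> problems = new ArrayList<String>();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            Object value = e.getValue();
            if (value != null && numericalFields.contains(e.getKey()) && !(value instanceof Number)) {
                problems.add(e.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<String, Object>();
        record.put("Horsepower", Integer.valueOf(200));
        record.put("City MPG", "17"); // a String where a Number is expected

        Set<String> numericalFields = new HashSet<String>(Arrays.asList("Horsepower", "City MPG"));
        System.out.println(nonNumericValues(record, numericalFields));
    }
}
```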

Applying scoring

After the mining model and a data record are specified, the score of the data record can be computed. For this purpose, the RecordScorer class provides the following method:

public void score( Map record ) throws RecordException, ModelException

Accessing computed results

After you have successfully called the score(Map) method, you can access the computed result fields by using one of the following methods:

Table 13. IM Scoring Java Beans methods for accessing computed results

Method                        Required model type    Default value
double getClusterID()         Clustering             java.lang.Double.NaN
double getClusterScore()      Clustering             java.lang.Double.NaN
double getClusterQuality()    Clustering             java.lang.Double.NaN
String getPredictedClass()    Classification         NULL
double getConfidence()        Classification         java.lang.Double.NaN
double getPredictedValue()    Regression             java.lang.Double.NaN

Depending on the type of mining model that is used, different sets of fields are computed. After you specify a mining model with the setModelFile(String) method, you can query the type of mining model by using the method public int getModelType(). This method returns an integer value that represents one of the following constants:

Scorer.CLUSTERING_TYPE

Scorer.CLASSIFICATION_TYPE

Scorer.REGRESSION_TYPE

Scorer.UNDEFINED_TYPE

With getModelType(), it is possible to determine which of the methods listed in Table 13 can be used to access the computed result fields. If the model type is, for example, Scorer.CLASSIFICATION_TYPE, the scoring result can be accessed with the methods getPredictedClass() and getConfidence(). The use of the method getPredictedValue() in this context, for example, would throw a ResultException. Each of these result methods throws a ResultException when the result field and the actual model type do not fit together. It might be the case that a mining model is specified, but the model is not loaded yet or the score method is not called yet. In this case, the result methods that fit the specified model type return their default values as listed in Table 13.
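The relationship between the model type and the result methods of Table 13 can be written down as a simple lookup. The Java sketch below does not call the IM Scoring API. The integer constants are hypothetical stand-ins for the constants of the Scorer class, whose actual values are not given in this book, and the method names are returned as plain strings for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ResultAccess {
    // Hypothetical stand-ins for the constants defined in the Scorer class.
    static final int UNDEFINED_TYPE = 0;
    static final int CLUSTERING_TYPE = 1;
    static final int CLASSIFICATION_TYPE = 2;
    static final int REGRESSION_TYPE = 3;

    /** Returns the names of the Table 13 result methods that fit the given model type. */
    static List<String> applicableMethods(int modelType) {
        switch (modelType) {
            case CLUSTERING_TYPE:
                return Arrays.asList("getClusterID", "getClusterScore", "getClusterQuality");
            case CLASSIFICATION_TYPE:
                return Arrays.asList("getPredictedClass", "getConfidence");
            case REGRESSION_TYPE:
                return Arrays.asList("getPredictedValue");
            default:
                // Undefined model type: no result method fits.
                return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        System.out.println(applicableMethods(CLASSIFICATION_TYPE));
    }
}
```

In an application, the value returned by getModelType() would be compared with the real Scorer constants in the same way, so that only result methods that fit the model type are called and a ResultException is avoided.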

Scoring example

 0  try {
 1    RecordScorer scorer = new RecordScorer( "93er_cars.pmml" );
 2    int modelType = scorer.getModelType();
 3    double predictedValue = scorer.getPredictedValue();
 4    try {
 5      String predictedClass = scorer.getPredictedClass();
 6    } catch ( ResultException e ) {
 7      // this is an expected exception for illustration purposes
 8    }
 9    HashMap record = new HashMap();
10    record.put("Horsepower", new Integer(200));
11    record.put("Air Bags standard", new Integer(2));
12    record.put("City MPG", new Integer(17));
13    scorer.score( record );
14    predictedValue = scorer.getPredictedValue();
15
16  } catch ( Exception e ) {
17    // should not occur if everything is correct
18  }

First, a new RecordScorer instance is initialized with a Stepwise Polynomial Regression mining model (line 1). The value of modelType in line 2 thus is equal to the value specified by the constant Scorer.REGRESSION_TYPE. For this type of model, a predicted value is expected for each scored record. The method getPredictedValue(), called in line 3, normally returns the actual predicted value. Because the score(Map) method has not been called yet, the method returns the default value NaN in line 3. When getPredictedClass() is called in line 5, a ResultException is thrown, because the predicted class result field does not fit the model type Scorer.REGRESSION_TYPE. In lines 9 to 12, the actual data record is defined. There, the field names are mapped to their actual values. In line 13, score(Map) is called and the scoring result is computed. The result of the call to getPredictedValue() in line 14 is a value not equal to NaN, the default value.

Scoring exception classes

The following exception classes are used by the RecordScorer and its base class Scorer:

ModelException
    Used if an error occurred that was relevant to the mining model that was specified.

RecordException
    Used if an error occurred that was relevant to the data record that was used.

ResultException
    Used if an error occurred that was relevant to the result fields that were computed.


Chapter 6. Administrative tasks

This chapter is a guide to doing a number of administrative tasks connected with IM Scoring. These include:
• “Using IM Scoring in a multilanguage environment”
• “Getting error information”
• “Getting support” on page 66

Using IM Scoring in a multilanguage environment

You can use IM Scoring with databases that are defined in any code page.

You might want to create a new database in a multilanguage environment and enable this database for the scoring functions. In this case, it is recommended that you create this database using the Unicode character encoding. In a Unicode-enabled database, DB2 uses the following encoding:
• For columns with character types: UTF-8 (UCS Transformation Format)
• For columns with graphic types: UCS-2 (Universal Character Set coded in 2 octets)

You can create a Unicode-enabled database with the following DB2 command:

DB2 CREATE DATABASE <dbname> USING CODESET UTF-8 TERRITORY US
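Because character columns in a Unicode database hold UTF-8 and graphic columns hold UCS-2, the same text can occupy a different number of bytes in each. The following Java sketch illustrates the difference using only the standard library. It uses UTF-16BE as a stand-in for UCS-2 (the two agree for characters in the Basic Multilingual Plane); the class name and the sample string are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class UnicodeSizeSketch {
    /** Number of bytes the text needs when encoded as UTF-8 (character columns). */
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes the text needs as 2-octet code units (graphic columns). */
    static int ucs2Bytes(String text) {
        // UTF-16BE matches UCS-2 for characters in the Basic Multilingual Plane.
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String name = "M\u00fcller"; // six characters, one of them outside ASCII
        System.out.println(utf8Bytes(name) + " bytes in UTF-8, "
                + ucs2Bytes(name) + " bytes in UCS-2");
    }
}
```

Text that contains characters outside ASCII may therefore need more bytes than it has characters, which is worth keeping in mind when you size character columns.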

Getting error information

If an error occurs during a scoring run, an error message is displayed. See Appendix E, “Error messages” on page 173 for a list of error messages with explanations of their meanings, and indications about what action you should take. Note that some messages are truncated by DB2. To see the full, untruncated message, check the idmMxError.log file.

Errors are logged by default in the file idmMxError.log together with identifiers and time stamps.
• On UNIX systems, the file resides in the directory /tmp.
• On Windows systems, the file resides in the temp directory, which is either:
  – The path specified by the TMP environment variable
  – The path specified by the TEMP environment variable, if TMP is not defined
  – The Windows directory, if neither TMP nor TEMP is defined
  If you intend to use the environment variable TMP or TEMP, it must be set in the environment of the DB2 engine. Scoring services run as part of the DB2 engine. Set the environment variable as a system variable.

When the file gets too large, you must delete it.

You can change the error file name and directory by setting the environment variable IDM_MX_ERRORFILE to the name of an alternative error file. The file name must be given as an absolute path name. If you want to prevent IM Scoring from writing to an error file, set the environment variable IDM_MX_ERRORFILE to NONE. The environment variable must be set in the environment of the DB2 engine. On UNIX platforms, add IDM_MX_ERRORFILE to the DB2 registry variable DB2ENVLIST and restart DB2 to activate the changes. See the DB2 Administration Guide to get information on the DB2ENVLIST registry variable. On Windows platforms, IDM_MX_ERRORFILE must be set as a system variable.
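The rules for locating the error file can be summarized in a small Java sketch. It is an illustration of the documented lookup order, not code taken from IM Scoring. The class name, the method signature, and the windowsDir parameter are hypothetical, and two details are assumptions: that an empty IDM_MX_ERRORFILE is treated like an unset one, and that NONE is matched exactly as written.

```java
import java.util.Map;

final class ErrorLogLocationSketch {
    static final String DEFAULT_NAME = "idmMxError.log";

    /**
     * Returns the path of the error file, or null if no error file is written.
     * env is the environment of the DB2 engine; windowsDir is the Windows
     * directory, which is used only as the last fallback on Windows.
     */
    static String resolve(Map<String, String> env, boolean windows, String windowsDir) {
        String override = env.get("IDM_MX_ERRORFILE");
        if (override != null && !override.isEmpty()) {
            // NONE switches the error file off; any other value is an absolute path.
            return override.equals("NONE") ? null : override;
        }
        if (!windows) {
            return "/tmp/" + DEFAULT_NAME;
        }
        String dir = env.get("TMP");      // first choice on Windows
        if (dir == null) {
            dir = env.get("TEMP");        // used if TMP is not defined
        }
        if (dir == null) {
            dir = windowsDir;             // used if neither TMP nor TEMP is defined
        }
        return dir + "\\" + DEFAULT_NAME;
    }
}
```

For example, with an empty environment on a UNIX system the sketch returns /tmp/idmMxError.log, and with IDM_MX_ERRORFILE set to NONE it returns null on either platform.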

Getting support

Before you contact IBM for support, check the README information that comes with the product.

Also check the product’s support pages on the following Web sites:

http://www.ibm.com/software/data/support/

http://www.ibm.com/software/data/iminer/modeling/support.html

These support pages provide “Frequently Asked Questions” and “Hints and Tips” that may help to solve your problem.

When you contact IBM for support, prepare to answer the questions in the problem identification worksheet. Make yourself familiar with the instructions on how to collect trace information given at “Getting trace information” on page 69.

Product README

The product README files (readme_sc.txt) can be found in the installation directory, as follows:

Windows:
    <Installdir>\IMinerX\readme_sc.txt

AIX:
    /usr/lpp/IMinerX/readme_sc.txt

UNIX:
    /opt/IMinerX/readme_sc.txt


“Frequently asked questions” and “Hints and tips”

Check the “Frequently asked questions” and “Hints and tips” section at the Web site at

http://www.ibm.com/software/data/iminer/scoring/support.html

Problem identification worksheet

The IBM Software Support Guide contains the following worksheet, which is available online at

http://techsupport.services.ibm.com/guides/handbook.html

Answer the questions in the worksheet before contacting IBM support.

PROBLEM IDENTIFICATION WORKSHEET

Complete this form before calling Technical Support

This form helps you identify problems and assists IBM Technical Support in finding solutions.

System Information

What is the failing product? _______________________________________________

What is the version and release number?_____________________________________

What machine model, operating system, and version are running?______________

Problem Description

What are the expected results?______________________________________________

What statement or command is being used? ___________________________________

What are the exact symptoms and syntax? ________________________________________________________________________________________________________________

What is or isn’t happening, including exact error number and message text?

____________________________________________________________________________

____________________________________________________________________________

Is anyone else experiencing the problem? ___________________________________

Is this the first time this operation has been attempted? __________________

Is this the first time this problem has occurred?___________________________

Environment

When did this activity work last? __________________________________________


What has changed since the activity last worked? ___________________________

_____ Hardware type/model _____ Application

_____ Operating system/version _____ Level of usage

_____ New product version/release _____ Maintenance applied

If the problem does not occur every time, under what conditions does the problem not occur?

____________________________________________________________________________

____________________________________________________________________________

Is there any other software running on the system which may be conflicting with this product?

____________________________________________________________________________

____________________________________________________________________________

Problem Isolation

Identify the specific feature of the software causing the problem.__________

____________________________________________________________________________

Can the problem be reproduced? If so, please provide a reproducible test case or instructions on how to reproduce the error condition.____________________

____________________________________________________________________________

____________________________________________________________________________

____________________________________________________________________________

Getting product information

Use the command idmlevel to collect information about the software environment (operating system, DB2 version) and IM product version you are using.

Using ’idmlevel’ on Windows operating systems

1. Open a command window.
2. Invoke idmlevel <logfile>

   Example:

   idmlevel c:\imscoring.log

3. Attach the resulting log file to your problem description when you contact IBM support.

Using ’idmlevel’ on UNIX operating systems


1. In a command shell invoke:

   AIX:
       /usr/lpp/IMinerX/bin/idmlevel /tmp/imscoring.log

   Other UNIX platforms:
       /opt/IMinerX/bin/idmlevel /tmp/imscoring.log

2. Attach the resulting log file to your problem description when you contact IBM support.

Getting trace information

The trace facility in IM Scoring helps you to locate an error by writing information about an error situation to a file. This information consists of trace messages.

The trace facility is configured by means of two environment variables. These are as follows:
• IDM_MX_TRACEFILE. This specifies the name of the file where trace information is written. If the variable is not defined or is empty, trace is written to the default file idmMxTrace.log in the temporary directory of the system. If the file does not exist, it is created. If the trace file cannot be written, tracing is silently switched off, without an error message or warning.
• IDM_MX_TRACELEVEL. This defines the level of tracing. It must contain one of the following values (not case-sensitive):
  – MINIMUM: Tracing is switched off. However, in some severe error situations this setting is ignored, and some trace messages explaining the error are written. This is the default value.
  – BASIC: Coarse trace information (for example, received data, DB2 calls, and so on) is written.
  – MOST: More detailed trace information (for example, call stacks, parameters, and so on) is written.
  – ALL: All available information is printed. Note that this tends to be rather lengthy.

IDM_MX_TRACELEVEL determines the filtering of trace messages.IDM_MX_TRACELEVEL is effectively an on/off switch for tracing.
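The interpretation of IDM_MX_TRACELEVEL described above can be sketched in Java as follows. The class and method names are hypothetical. How IM Scoring reacts to an empty or unrecognized value is not stated here, so treating an empty value as MINIMUM and rejecting unknown values are assumptions of this sketch.

```java
import java.util.Locale;

final class TraceLevelSketch {
    enum Level { MINIMUM, BASIC, MOST, ALL }

    /** Parses a trace level without regard to case; an unset value means MINIMUM. */
    static Level parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Level.MINIMUM; // documented default when nothing is set
        }
        // Assumption: unknown values are rejected (valueOf throws IllegalArgumentException).
        return Level.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    /** Tracing is switched on for every level other than MINIMUM. */
    static boolean tracingOn(Level level) {
        return level != Level.MINIMUM;
    }
}
```

Under these assumptions, the values basic, Basic, and BASIC all select the same level, and tracing stays off until one of BASIC, MOST, or ALL is set.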

Since trace messages are controlled by means of environment variables,concurrent mining runs write to the same trace file.

The trace message that is written to the trace file consists of the following information:
• The name of the component that issued the trace message
• A time stamp
• The trace message itself

Additionally, basic system information is written once per trace.

Using the tracing facility on UNIX systems

To work with the tracing facility, log on to the DB2 server as instance owner and set the environment variable IDM_MX_TRACELEVEL, IDM_MX_TRACEFILE, or both. To switch the tracing facility on, set IDM_MX_TRACELEVEL to BASIC, MOST, or ALL. You can also optionally set IDM_MX_TRACEFILE to a file name including the whole path. If IDM_MX_TRACEFILE is not set explicitly, the trace information is written to the file /tmp/idmMxTrace.log by default.

For example, if you use the ksh shell, you might want to use one of the following commands:

export IDM_MX_TRACELEVEL=BASIC

export IDM_MX_TRACEFILE=/home/db2admin/scoreTrace.log

You must add the environment variables to the list of environment variables in the DB2ENVLIST registry variable to enable DB2 to use them. To activate the changes, stop DB2 and restart.

To stop tracing, remove the variables from the DB2ENVLIST registry variable and restart DB2.

For information on the DB2ENVLIST registry variable, see the DB2 Administration Guide.

Note: If you use IM Scoring for AIX on an SP™ with UDB EEE, it is recommended that you store the trace file in a local file system, for example, in the default directory /tmp. After calling an application function with the trace option enabled, the idmMxTrace.log file is stored on each node in the /tmp directory. It contains trace information from the processes that run on that node.

Using the tracing facility on Windows systems

To work with the tracing facility, you must set the environment variable IDM_MX_TRACELEVEL, IDM_MX_TRACEFILE, or both. You must set the environment variables as system variables, because the DB2 engine is implemented as a Windows service.

To switch the tracing facility on, set IDM_MX_TRACELEVEL to BASIC, MOST, or ALL. You can also optionally set IDM_MX_TRACEFILE to a file name including the whole path. If IDM_MX_TRACEFILE is not set explicitly, the trace information is written to the file idmMxTrace.log by default. This file resides in the directory that is specified by the TEMP environment variable.


Follow these steps to set the environment variables:
1. Log on as an administrator.
2. Open the Start menu, and navigate to Settings -> Control Panel -> System -> Environment Variables.
   This is the procedure on Windows 2000. On other Windows versions, the navigation path may differ slightly.
3. Set the environment variables as required.

To activate the changes, restart your system. To stop tracing, remove the environment variables, and restart your system.

Getting DB2 diagnostic information

If you need to contact IBM support, you might also need to provide DB2 diagnostic information like DB2CLI trace or db2diag.log. For more information, see the DB2 Troubleshooting Guide.


Part 2. Reference

This part provides a complete reference resource to the database objects that make up IM Scoring.
• Reference lists of all the database objects supplied with IM Scoring, together with a short description of each, appear in Chapter 7, “Overview of IM Scoring database objects” on page 75.
• Full descriptions of all the IM Scoring methods and functions appear in subsequent chapters, as follows:
  – Chapter 8, “IM Scoring methods reference” on page 83
  – Chapter 9, “IM Scoring functions reference” on page 91
• Descriptions of IM Scoring’s executables to enable and disable a database and to check whether a database has been enabled appear in Chapter 10, “IM Scoring command reference” on page 131.


Chapter 7. Overview of IM Scoring database objects

This chapter provides reference lists of all the database objects supplied with IM Scoring, together with a short description of each. These database objects are:
• User-defined data types. See “Data types provided by IM Scoring”.
• User-defined methods. See “Methods provided by IM Scoring” on page 77.
• User-defined functions. See “Functions provided by IM Scoring” on page 77.

Information about the sizes of the input and output parameters to the IM Scoring methods and functions appears in “Parameter sizes” on page 81.

For instructions on using these database objects, see Chapter 5, “Using IM Scoring” on page 41.

Data types provided by IM Scoring

IM Scoring provides a number of user-defined data types. These data types are used to store models and results, and to define the data to which a model is applied.

The data types are installed in the schema IDMMX.

Table 14. Data types specific to IM Scoring

Data type (UDT)       Source data type   Purpose
DM_ApplicationData    CLOB               Contains the definition of data to which a model is applied.
DM_ClasModel          BLOB               Identifies data within DB2 as a Tree or Neural Classification model. The data type is associated with the model when the model is imported using the DM_impClasFile function. If the model is stored in a database table, the column must be configured for this data type.
DM_ClasResult         VARCHAR            Contains the predicted class and classification confidence values for a row of data obtained when a Classification model is applied by means of the DM_applyClasModel function.
DM_ClusResult         VARCHAR            Contains the computed cluster ID and the score value for a row of data obtained when a Clustering model is applied by means of the DM_applyClusModel function.
DM_ClusteringModel    BLOB               Identifies data within DB2 as a Demographic or Neural Clustering model. The data type is associated with the model when the model is imported by means of the DM_impClusFile function. If the model is stored in a database table, the column must be configured for this data type.
DM_LogicalDataSpec    CLOB               Identifies the field names and types that are contained in a model and that are needed to apply a model.
DM_RegressionModel    BLOB               Identifies data within DB2 as an RBF or Neural Prediction model or as a Polynomial or Logistic Regression model. The data type is associated with the model when the model is imported by means of the DM_impRegFile function. If the model is stored in a database table, the column must be configured for this data type.
DM_RegResult          VARCHAR            Contains the predicted value for a row of data obtained when a Regression model is applied by means of the DM_applyRegModel function.

Note: The maximum size of models stored as BLOB differs depending on whether you selected the fenced or the unfenced option when you enabled the database. By default you can store up to 10 MB in fenced mode and 50 MB in unfenced mode.

Methods provided by IM Scoring

The methods provided by IM Scoring enable you to work with the structured type DM_LogicalDataSpec. This data type contains the field name and field type definitions of the mining fields that are part of the input data used when models are applied.

Table 15. Methods for type DM_LogicalDataSpec

Method             Purpose                                                                                          See page
DM_expDataSpec     Exports a DM_LogicalDataSpec value as a CLOB value                                               84
DM_getFldName      Returns the name of a field at a specified position in a value of type DM_LogicalDataSpec       85
DM_getFldType      Returns the mining field type of a specified field contained in a value of type DM_LogicalDataSpec   86
DM_getNumFields    Returns the number of fields contained in a value of type DM_LogicalDataSpec                     87
DM_impDataSpec     Imports a previously exported DM_LogicalDataSpec value                                           88
DM_isCompatible    Compares two logical data specifications, and returns TRUE if they are compatible                89

Functions provided by IM Scoring

IM Scoring provides a number of user-defined scoring functions that enable you to:
1. Import and export mining models, and access the properties of the models.
2. Apply these models to data held in DB2 tables.
3. Retrieve the results.

The scoring functions are installed in the schema IDMMX.

Table 16. Functions for working with scoring data type DM_ApplicationData

Scoring function (UDF)   Purpose                                                                                                                          See page
DM_applData              Obtains the data to which a model is applied by the functions DM_applyClasModel, DM_applyClusModel, and DM_applyRegModel       92
DM_impApplData           Converts application input data from the CLOB, VARCHAR, or CHAR format                                                          120

Table 17. Functions for working with data mining model type DM_ClasModel

Scoring function (UDF)   Purpose                                                                                                         See page
DM_applyClasModel        Applies a specified Classification model to specified data and returns results values                          94
DM_expClasModel          Returns a character large object representation of a value of type DM_ClasModel                                97
DM_getClasCostRate       Returns the cost rate contained in a value of type DM_ClasModel                                                100
DM_getClasMdlName        Obtains the name of a Classification model                                                                     101
DM_getClasMdlSpec        Returns a value of DM_LogicalDataSpec containing the set of fields needed for an application or test of the model   102
DM_getClasTarget         Returns the target field contained in a value of type DM_ClasModel                                             103
DM_impClasFile           Imports a Tree or Neural Classification model into a DB2 database                                              121
DM_impClasFileE          Imports a Tree or Neural Classification model into a DB2 database by specifying an encoding                    122
DM_impClasModel          Converts a Classification model from the CLOB format                                                           123

Table 18. Functions for working with scoring result type DM_ClasResult

Scoring function (UDF)   Purpose                                                                                                    See page
DM_getConfidence         Obtains the classification confidence value from a results value returned by the function DM_applyClasModel   110
DM_getPredClass          Retrieves the predicted class from a results value returned by the function DM_applyClasModel             112

Table 19. Functions for working with scoring result type DM_ClusResult

Scoring function (UDF)     Purpose                                                                                                                                   See page
DM_getClusConf             Returns the confidence of attributing a record to the best cluster in comparison with attributing it to another cluster of the applied model   104
DM_getClusScore            Obtains the Clustering score from a results value returned by the function DM_applyClusModel                                             107
DM_getClusterID            Obtains the Clustering ID from a results value returned by the function DM_applyClusModel                                                108
DM_getQuality              Returns the quality of the best cluster                                                                                                   114
DM_getQuality(clusterid)   Returns the quality of a specified cluster                                                                                                115

Table 20. Functions for working with data mining model type DM_ClusteringModel

Scoring function (UDF)   Purpose                                                                                              See page
DM_applyClusModel        Applies a specified Clustering model to specified data and returns results values                   95
DM_expClusModel          Returns a character large object representation of a value of type DM_ClusteringModel               98
DM_getClusMdlName        Obtains the name of a Clustering model                                                              105
DM_getClusMdlSpec        Returns a value of DM_LogicalDataSpec containing the set of fields needed for an application        106
DM_getClusterName        Gets the name of the cluster at a specified position                                                109
DM_getNumClusters        Returns the number of clusters contained in a value of type DM_ClusteringModel                      111
DM_impClusFile           Imports a Demographic or Neural Clustering model into a DB2 database                                124
DM_impClusFileE          Imports a Demographic or Neural Clustering model into a DB2 database by specifying an encoding      125
DM_impClusModel          Converts a Clustering model from the CLOB format                                                    127

Table 21. Functions for working with data mining model type DM_RegressionModel

Scoring function (UDF)   Purpose                                                                                                                              See page
DM_applyRegModel         Applies a specified Regression model to specified data and returns results values                                                   96
DM_expRegModel           Returns a character large object representation of a value of type DM_RegressionModel                                               99
DM_getRegMdlName         Obtains the name of a Regression model                                                                                              117
DM_getRegMdlSpec         Returns a value of DM_LogicalDataSpec containing the set of fields needed for an application                                        118
DM_getRegTarget          Returns the target field for Regression                                                                                             119
DM_impRegFile            Imports a Neural Prediction model, an RBF Prediction model, or a Polynomial Regression model into a DB2 database                    128
DM_impRegFileE           Imports a Neural Prediction model, an RBF Prediction model, or a Polynomial Regression model into a DB2 database by specifying an encoding   129
DM_impRegModel           Converts a Regression model from the CLOB format                                                                                    130

Table 22. Functions for working with scoring result type DM_RegResult

Scoring function (UDF)   Purpose                                                                                        See page
DM_getPredValue          Obtains the predicted value from a results value returned by the function DM_applyRegModel   113
DM_getRBFRegionID        Returns the number of the region to which the record was assigned                             116


Parameter sizes

The sizes of the input and output parameters of the methods and functions are as follows:
• Names and field names:
  VARCHAR(128)
• Model names in the functions DM_getClasMdlName, DM_getRuleMdlName, and DM_getClusMdlName:
  VARCHAR(256)
• CLOBs in the methods DM_expDataSpec and DM_impDataSpec:
  The default size is 200 KB. For instructions on how to specify a different size, see the information about the StructSize parameter in “The idmenabledb command” on page 133.
• CLOBs in the function DM_expClusModel:
  This depends on the optional ClusModelSize parameter specified when idmenabledb was called.
  - Default fenced mode: 10 MB
  - Default unfenced mode: 50 MB
• CLOBs in the function DM_expClasModel:
  This depends on the optional ClasModelSize parameter specified when idmenabledb was called.
  - Default fenced mode: 10 MB
  - Default unfenced mode: 50 MB
• CLOBs in the function DM_expRuleModel:
  This depends on the optional RuleModelSize parameter specified when idmenabledb was called.
  - Default fenced mode: 10 MB
  - Default unfenced mode: 50 MB
• Data record in DM_applyClasModel, DM_applyClusModel, and DM_applyRegModel:
  The default is 500 KB. For instructions on how to specify a different size, see the information about the ApplDataSize parameter in “The idmenabledb command” on page 133.
• Result (DM_ClasResult, DM_ClusResult, DM_RegResult) in DM_applyClasModel, DM_applyClusModel, and DM_applyRegModel:
  VARCHAR(512)


Chapter 8. IM Scoring methods reference

This chapter contains full descriptions of all the IM Scoring methods. They are presented in alphabetical order of the method names.

For brief overview descriptions of these methods, see “Methods provided by IM Scoring” on page 77.

For instructions on how to use the database objects presented here, see Chapter 5, “Using IM Scoring” on page 41.

For instructions on how to read the syntax diagrams, see “How to read the syntax diagrams” on page xiii.


DM_expDataSpec

This method converts a value of type DM_LogicalDataSpec to a CLOB value, and returns it.

Syntax

Method syntax

fields..DM_expDataSpec ( )

Function syntax

DM_expDataSpec ( fields )

Parameters

fields
    A value of type DM_LogicalDataSpec

Return value

The return value is a CLOB value converted from fields.


DM_getFldName

This method returns the name of a field at a specified position in a value of type DM_LogicalDataSpec.

Syntax

Method syntax

fields..DM_getFldName ( position )

Function syntax

DM_getFldName ( fields , position )

Parameters

fields
    A value of type DM_LogicalDataSpec consisting of a set of fields

position
    An INTEGER value, ranging from 1 to the number of fields, that specifies the position

Return value

• If position is NULL, the return value is NULL.
• If the value of position is greater than zero, and less than or equal to the result of a call of DM_getNumFields(), the return value is the name of the mining field that is identified by the number given in position. The return value is a VARCHAR value.
• Any other value for position raises an exception. The exception states that the parameter is out of range.
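The three return-value cases can be mirrored in a short Java sketch. It models the documented rules only and does not call DB2: the class and method names are hypothetical, a plain list of names stands in for a DM_LogicalDataSpec value, and the exception type is chosen for illustration.

```java
import java.util.List;

final class FieldNameSketch {
    /** Returns the field name at a 1-based position, following the documented rules. */
    static String fieldNameAt(List<String> fields, Integer position) {
        if (position == null) {
            return null; // a NULL position yields NULL
        }
        if (position < 1 || position > fields.size()) {
            // Any other value is reported as out of range.
            throw new IndexOutOfBoundsException("position out of range: " + position);
        }
        return fields.get(position - 1); // positions start at 1, list indexes at 0
    }
}
```

Note that the valid positions run from 1 to the number of fields, so position 0 is rejected in the same way as a position beyond the last field.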


DM_getFldType

This method returns the mining field type of a specified field that is contained in a value of type DM_LogicalDataSpec.

Syntax

Method syntax

fields..DM_getFldType ( fieldName )

Function syntax

DM_getFldType ( fields , fieldName )

Parameters

fields
    A value of type DM_LogicalDataSpec consisting of a set of fields

fieldName
    A field name, of type VARCHAR, that is already contained in fields

The DM_LogicalDataSpec type defines the list of fields needed for a model application run and their mining field types. The mining field type describes which field type the field has in the model. Possible types are DM_Categorical or DM_Numerical, and these types map by default to the DB2 source data types listed here.

Table 23. Mining field types

Mining data field type name   Mining field type value   Source data type
DM_Categorical                0                         CHAR, VARCHAR, LONG VARCHAR, TIME, DATE, TIMESTAMP
DM_Numerical                  1                         SMALLINT, INTEGER, DOUBLE, FLOAT, DECIMAL, BIGINT, REAL

Return value

• If fieldName is NULL, the return value is NULL.
• If fieldName is not the name of any field contained in fields, this raises an exception. The exception states that the field is not defined in the logical data specification.
• Otherwise, the return value is the mining field type, as a SMALLINT value, of the field fieldName contained in the set of fields fields.
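The default mapping in Table 23 can be written as a lookup, sketched here in Java. The class and method names are hypothetical, and the sketch covers only the default mapping from a DB2 source data type name to a mining field type value. It does not read a DM_LogicalDataSpec value, and rejecting unlisted type names is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class MiningFieldTypeSketch {
    static final short DM_CATEGORICAL = 0;
    static final short DM_NUMERICAL = 1;

    private static final Set<String> CATEGORICAL = new HashSet<>(Arrays.asList(
            "CHAR", "VARCHAR", "LONG VARCHAR", "TIME", "DATE", "TIMESTAMP"));
    private static final Set<String> NUMERICAL = new HashSet<>(Arrays.asList(
            "SMALLINT", "INTEGER", "DOUBLE", "FLOAT", "DECIMAL", "BIGINT", "REAL"));

    /** Default mining field type value for a DB2 source data type name, per Table 23. */
    static short defaultMiningType(String sourceType) {
        String key = sourceType.trim().toUpperCase(Locale.ROOT);
        if (CATEGORICAL.contains(key)) {
            return DM_CATEGORICAL;
        }
        if (NUMERICAL.contains(key)) {
            return DM_NUMERICAL;
        }
        // Assumption: types that Table 23 does not list are rejected.
        throw new IllegalArgumentException("not listed in Table 23: " + sourceType);
    }
}
```

For example, DATE and TIMESTAMP columns map to DM_Categorical (0) by default, even though they are not character types.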


DM_getNumFields

This method returns the number of fields that are contained in a value of type DM_LogicalDataSpec.

Syntax

Method syntax

fields..DM_getNumFields ( )

Function syntax

DM_getNumFields ( fields )

Parameters

fields
    A value of type DM_LogicalDataSpec consisting of a set of fields

Return value

The return value is the number of fields that are contained in fields. The return value is of type INTEGER.


DM_impDataSpec

This method converts a CLOB value to a value of type DM_LogicalDataSpec. The CLOB value must be a value that was previously exported using the DM_expDataSpec method.

Syntax

Method syntax

fields..DM_impDataSpec ( logDataSpec )

Function syntax

DM_impDataSpec ( fields , logDataSpec )

Parameters

fields
    A value of type DM_LogicalDataSpec

logDataSpec
    A CLOB value containing logical data specifications

Return value

The return value is fields containing the content of logDataSpec. Any existing definitions in fields are overwritten by the content of logDataSpec.


DM_isCompatible

This method determines whether a DM_LogicalDataSpec value is compatible with another DM_LogicalDataSpec value.

Syntax

Method syntax

existLogDataSpec..DM_isCompatible ( logDataSpec )

Function syntax

DM_isCompatible ( existLogDataSpec , logDataSpec )

Parameters

existLogDataSpec
  A value of type DM_LogicalDataSpec consisting of a set of fields

logDataSpec
  A value of type DM_LogicalDataSpec to be checked for compatibility

Return value

Note that calls to DM_isCompatible are not symmetric. A call DM_isCompatible(A,B) can return a result different from that returned by DM_isCompatible(B,A).

- The return value is 1 as an INTEGER value if logDataSpec is compatible with existLogDataSpec. For every field entry in existLogDataSpec, there must be a field in logDataSpec with an identical name. The type of a field in logDataSpec must be DM_Numerical whenever this is the type of the corresponding field in existLogDataSpec.
- If logDataSpec or existLogDataSpec is NULL, the return value is NULL.
- Otherwise, the return value is 0 as an INTEGER value.
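
The compatibility rule above, including its asymmetry, can be illustrated with a small Python model. This is only a sketch of the rule as stated in the text, not the DB2 implementation: a logical data specification is represented here by a plain dictionary from field name to field type, and the type name DM_Categorical and the sample field names are placeholders chosen for illustration.

```python
from typing import Dict, Optional

# Hypothetical stand-in for a DM_LogicalDataSpec value: field name -> field type.
LogicalDataSpec = Dict[str, str]


def is_compatible(exist_spec: Optional[LogicalDataSpec],
                  spec: Optional[LogicalDataSpec]) -> Optional[int]:
    """Model of the documented rule; None plays the role of SQL NULL."""
    if exist_spec is None or spec is None:
        return None
    for name, field_type in exist_spec.items():
        # Every field of exist_spec needs a field with an identical name in spec.
        if name not in spec:
            return 0
        # A numerical field in exist_spec must also be numerical in spec.
        if field_type == "DM_Numerical" and spec[name] != "DM_Numerical":
            return 0
    return 1


model_spec = {"AGE": "DM_Numerical", "GENDER": "DM_Categorical"}
data_spec = {"AGE": "DM_Numerical", "GENDER": "DM_Categorical",
             "CITY": "DM_Categorical"}

print(is_compatible(model_spec, data_spec))  # data_spec covers all model fields
print(is_compatible(data_spec, model_spec))  # CITY is missing from model_spec
```

The two calls differ only in argument order, which is why the result of a compatibility check cannot be assumed to hold in the reverse direction.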


Chapter 9. IM Scoring functions reference

This chapter contains full descriptions of all the IM Scoring functions. They are presented in alphabetical order of the function names.

For brief overview descriptions of these functions, see “Functions provided by IM Scoring” on page 77.

For instructions on how to use the database objects presented here, see Chapter 5, “Using IM Scoring” on page 41.

For instructions on how to read the syntax diagrams, see “How to read the syntax diagrams” on page xiii.


DM_applData

This function obtains the data to which a model is applied by DM_applyClusModel, DM_applyClasModel, or DM_applyRegModel. It builds a value of type DM_ApplicationData, which includes the label and the value obtained from the column specified in the SQL statement.

This function is available in two different notations. One notation has two parameters; the other has three parameters. Use the two-parameter notation for the inner call in nested calls, or if the application data consists of only one field-value pair.

Syntax

Two-parameter notation

DM_applData ( label , character expression )
DM_applData ( label , numeric expression )

Three-parameter notation

DM_applData ( application data , label , character expression )
DM_applData ( application data , label , numeric expression )

Parameters

application data
  Identifies application data of type DM_ApplicationData to which the label and value of the expression are appended before being returned. In nested calls, this parameter is the return value of a DM_applData call.

label
  The label is a literal and must be enclosed in single quotation marks. The string enclosed within the quotation marks must be a precise match, including case, for the name of a field defined in the model you apply. Together with the character or numeric expression that follows, this forms the data to which a model is applied. It is appended to the DM_ApplicationData value.

character expression
  An expression that returns a value of type CHAR, VARCHAR, or LONG VARCHAR.

  Note: IM for Data handles date values and time values in character ISO format. If you want to specify a date value or a time value as input to DM_applData and you want to apply a model created by IM for Data, the date value and time value must be cast to character ISO format by using CHAR(<value>, ISO) for a date value and CHAR(<value>, JIS) for a time value.

numeric expression
  An expression that returns a value of a numeric data type: INTEGER, DOUBLE, DECIMAL, FLOAT, BIGINT, or SMALLINT. For example, you can include an expression such as age + 10.

If multiple data items are to be included, multiple nested calls to DM_applData must be made.

Return value

The return value is a value of data type DM_ApplicationData.
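
The nesting of the two notations can be hard to read when many fields are involved. The following Python sketch generates the text of a nested DM_applData expression from a list of label and expression pairs: the innermost call uses the two-parameter notation and every further pair wraps it in the three-parameter notation. The helper name nested_appl_data and the column names are hypothetical; the generated text carries no schema qualifier, which you might have to add for your installation.

```python
from typing import Iterable, Tuple


def nested_appl_data(pairs: Iterable[Tuple[str, str]]) -> str:
    """Build the SQL text of nested DM_applData calls.

    Each pair is (label, SQL expression). The label is emitted as a
    single-quoted literal with its case preserved, because it must match
    the model field name exactly.
    """
    expr = ""
    for label, value_expr in pairs:
        quoted = "'" + label.replace("'", "''") + "'"
        if not expr:
            # Innermost call: two-parameter notation.
            expr = f"DM_applData({quoted}, {value_expr})"
        else:
            # Outer calls: three-parameter notation wrapping the inner value.
            expr = f"DM_applData({expr}, {quoted}, {value_expr})"
    if not expr:
        raise ValueError("at least one (label, expression) pair is required")
    return expr


print(nested_appl_data([("AGE", "b.age"), ("SALARY", "b.salary + 10")]))
# prints: DM_applData(DM_applData('AGE', b.age), 'SALARY', b.salary + 10)
```

The resulting string is meant to be passed as the application data argument of DM_applyClasModel, DM_applyClusModel, or DM_applyRegModel inside a larger SQL statement.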


DM_applyClasModel

This function applies a Classification model to selected data and produces results.

Syntax

DM_applyClasModel ( model , application data )

Parameters

model
  The Classification model of type DM_ClasModel that is to be applied to data.

application data
  Specifies the data of type DM_ApplicationData to which the model is applied. A DM_ApplicationData value is returned by DM_applData or DM_impApplData. See “DM_applData” on page 92 and “DM_impApplData” on page 120.

Return value

The results produced by applying a model using DM_applyClasModel are returned as data type DM_ClasResult.

If model or application data is NULL, the return value is NULL.


DM_applyClusModel

This function applies a Clustering model to selected data and produces results.

Syntax

DM_applyClusModel ( model , application data )

Parameters

model
  The Clustering model of type DM_ClusteringModel that is to be applied to data.

application data
  Specifies the data of type DM_ApplicationData to which the model is applied. A DM_ApplicationData value is returned by DM_applData or DM_impApplData. See “DM_applData” on page 92 and “DM_impApplData” on page 120.

Return value

The results produced by applying a model using DM_applyClusModel are returned as data type DM_ClusResult.

If the input model is a Demographic Clustering model that does not contain distance units, they are calculated by default for numeric fields. The default value is half of the square root of the variance of the field.

If model or application data is NULL, the return value is NULL.
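
As a worked illustration of the default distance unit, the following Python sketch computes half of the square root of the variance for a list of sample values. The text does not say whether the population variance or the sample variance is used; the sketch assumes the population variance, and the function name default_distance_unit is hypothetical.

```python
import math
import statistics
from typing import Sequence


def default_distance_unit(values: Sequence[float]) -> float:
    """Half of the square root of the variance of a numeric field.

    Assumption: population variance (statistics.pvariance). The source
    only says "the variance of the field".
    """
    return 0.5 * math.sqrt(statistics.pvariance(values))


# Values with mean 5 and population variance 4, so the result is 0.5 * 2.
print(default_distance_unit([2, 4, 4, 4, 5, 5, 7, 9]))
```

In other words, under this assumption the default distance unit is half of the standard deviation of the field.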


DM_applyRegModel

This function applies a Regression model to selected data and produces results.

Syntax

DM_applyRegModel ( model , application data )

Parameters

model
  The Regression model of type DM_RegressionModel that is to be applied to data.

application data
  Specifies the data of type DM_ApplicationData to which the model is applied. A DM_ApplicationData value is returned by DM_applData or DM_impApplData. See “DM_applData” on page 92 and “DM_impApplData” on page 120.

Return value

The results produced by applying a model using DM_applyRegModel are returned as data type DM_RegResult.

If model or application data is NULL, the return value is NULL.


DM_expClasModel

This function returns a character large object representing a PMML Classification model of type DM_ClasModel.

Syntax

DM_expClasModel ( classificationModel )

Parameters

classificationModel
  A value of type DM_ClasModel

Return value

The return value is a character large object representing the PMML Classification model in question. The CLOB XML string is encoded in database encoding (the codeset specified when the database was created) regardless of the encoding specification found in the XML content.


DM_expClusModel

This function returns a character large object representing the PMML Clustering model extracted from a value of type DM_ClusteringModel.

Syntax

DM_expClusModel ( clusteringModel )

Parameters

clusteringModel
  A value of type DM_ClusteringModel

Return value

The return value is a character large object representing the Clustering model. The CLOB XML string is encoded in database encoding (the codeset specified when the database was created) regardless of the encoding specification found in the XML content.


DM_expRegModel

This function returns a character large object representing a PMML Regression model or a PMML or XML prediction model of type DM_RegressionModel.

Syntax

DM_expRegModel ( regressionModel )

Parameters

regressionModel
  A value of type DM_RegressionModel

Return value

The return value is a character large object representing the Regression model. The CLOB XML string is encoded in database encoding (the codeset specified when the database was created) regardless of the encoding specification found in the XML content.


DM_getClasCostRate

This function returns the classification cost rate of the Classification model computed in a validation phase during the training phase.

If you need more information, see the description of DM_setClasCostRate in Intelligent Miner Modeling Administration and Programming.

Syntax

DM_getClasCostRate ( clasModel )

Parameters

clasModel
  A value of type DM_ClasModel

Return value

- If the model was calculated without validation, the return value is NULL.
- Otherwise, the return value is the cost rate of the Classification model contained in clasModel as a DOUBLE value, computed using the validation data.


DM_getClasMdlName

This function obtains the name of a Classification model of type DM_ClasModel.

Syntax

DM_getClasMdlName ( model )

Parameters

model
  A Classification model of the type DM_ClasModel

Return value

The return value is the name of the Classification model as a VARCHAR value.


DM_getClasMdlSpec

This function returns the DM_LogicalDataSpec value representing the set of fields needed for an application of the Classification model.

Syntax

DM_getClasMdlSpec ( classificationModel )

Parameters

classificationModel
  A value of type DM_ClasModel

Return value

The return value is the DM_LogicalDataSpec value representing the set of fields needed for an application of the Classification model.


DM_getClasTarget

This function returns the name of the target field of a Classification model. If the model was produced through the use of IM Modeling, the target field name is the name that was set by means of the method DM_clasSetTarget of type DM_ClasSettings.

Syntax

DM_getClasTarget ( clasModel )

Parameters

clasModel
  A value of type DM_ClasModel

Return value

The return value is the target field name as a VARCHAR value.


DM_getClusConf

This function returns the confidence of attributing the record r to the best cluster in comparison with attributing it to another cluster of the applied model. The ID of the best cluster is returned by the function DM_getClusterID.

Syntax

DM_getClusConf ( clusResult )

Parameters

clusResult
  A value of type DM_ClusResult

Return value

- If clusResult is NULL, the return value is NULL.
- Otherwise, the return value is the confidence value as data type DOUBLE.

The confidence is a value between 0 and 1 of type DOUBLE. A value near 0 should never appear. A value near 0.5 (50%) means that the record might fit equally well into another cluster of the model. A value near 1 means that it is quite certain that the record belongs to the best cluster. It also means that the record definitely does not suit any other clusters of the model.


DM_getClusMdlName

This function obtains the name of a Clustering model of type DM_ClusteringModel.

Syntax

DM_getClusMdlName ( model )

Parameters

model
  A Clustering model of the type DM_ClusteringModel

Return value

The return value is the name of the Clustering model as a VARCHAR value.


DM_getClusMdlSpec

This function returns the DM_LogicalDataSpec value representing the set of fields needed for an application of the Clustering model.

Syntax

DM_getClusMdlSpec ( clusteringModel )

Parameters

clusteringModel
  A value of type DM_ClusteringModel

Return value

The return value is the DM_LogicalDataSpec value representing the set of fields needed for an application of the Clustering model.


DM_getClusScore

This function obtains the Clustering score from results data that is produced when you apply a Clustering model. The score is an expression of how closely the data matches the model cluster. For Demographic Clustering, a score value close to 1.0 indicates a good match. For Neural Clustering, a score value close to 0.0 indicates a good match.

This function is related to the following:

- DM_getClusConf, which is compatible with the IM for Data score value
- DM_getQuality, which is a new function

Syntax

DM_getClusScore ( results value )

Parameters

results value
  The result of applying a Clustering model, returned by function DM_applyClusModel as data type DM_ClusResult

Return value

- If results value is NULL, the return value is NULL.
- Otherwise, the return value is the Clustering score as data type DOUBLE.


DM_getClusterID

This function obtains the cluster ID from results data that is produced when you apply a Clustering model. This identifies the position of the cluster in the Clustering model that is the best match for this data. For information about the difference between the cluster IDs that are returned by IM Scoring V7.1 and IM Scoring V8.1, see “Using the function DM_getClusterID” on page 170.

To get the cluster name shown by the IM for Data V6 Clustering Visualizer, use DM_getClusterName().

Syntax

DM_getClusterID ( results value )

Parameters

results value
  The result of applying a Clustering model, returned by function DM_applyClusModel as data type DM_ClusResult. Usually, cluster IDs are between 1 and the number of clusters as returned by the function DM_getNumClusters.

Return value

- If results value is NULL, the return value is NULL.
- Otherwise, the return value is the cluster ID as data type INTEGER.


DM_getClusterName

This function returns the name of a cluster at a specified position in a value of type DM_ClusteringModel.

Syntax

DM_getClusterName ( clusteringModel , position )

Parameters

clusteringModel
  A value of type DM_ClusteringModel.

position
  A value, of type INTEGER, between 1 and the result of DM_getNumClusters. Here you can specify the value obtained by a call to DM_getClusterID().

Return value

- If clusteringModel is NULL, the return value is NULL.
- If position is less than 1 or greater than the result of a call to DM_getNumClusters, this raises an exception. The exception states that the parameter is out of range.
- If position is NULL, the return value is NULL.
- If the cluster at the specified position has no name, the return value is NULL.
- Otherwise, the return value is the name of the specified cluster contained in clusteringModel, of type VARCHAR.
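
The return rules above can be summarized in a short Python model, which may help when deciding how calling code should treat NULL results and out-of-range positions. This is a sketch of the documented rules only: a clustering model is reduced to a list of optional cluster names, None stands in for SQL NULL, a Python ValueError stands in for the DB2 exception, and the order in which the NULL checks are applied is an assumption.

```python
from typing import List, Optional


def get_cluster_name(cluster_names: Optional[List[Optional[str]]],
                     position: Optional[int]) -> Optional[str]:
    """Model of the documented return rules for a 1-based cluster position."""
    if cluster_names is None or position is None:
        # A NULL model or a NULL position yields NULL.
        return None
    if position < 1 or position > len(cluster_names):
        # Stand-in for the "parameter is out of range" exception.
        raise ValueError("parameter position is out of range")
    # An unnamed cluster is stored as None, so NULL is returned for it.
    return cluster_names[position - 1]


names = ["Young savers", None, "Retirees"]  # hypothetical cluster names
print(get_cluster_name(names, 1))
print(get_cluster_name(names, 2))
```

Because positions are 1-based, a value returned by DM_getClusterID can be passed on directly without adjustment.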


DM_getConfidence

This function obtains the classification confidence value from results data that was produced when you applied a Classification model. This is a value between 0.0 and 1.0 that expresses the probability that the class is predicted correctly.

Syntax

DM_getConfidence ( results value )

Parameters

results value
  The result of applying a Classification model, returned by the function DM_applyClasModel as data type DM_ClasResult

Return value

- If results value is NULL, the return value is NULL.
- Otherwise, the return value is the confidence value as data type DOUBLE.


DM_getNumClusters

This function returns the number of clusters contained in a value of type DM_ClusteringModel.

Syntax

DM_getNumClusters ( clusteringModel )

Parameters

clusteringModel
  A value of type DM_ClusteringModel

Return value

- If clusteringModel is NULL, the return value is NULL.
- Otherwise, the return value is the number of clusters contained in clusteringModel, as an INTEGER value.


DM_getPredClass

This function obtains the predicted class from results data that is produced when you apply a Classification model. This identifies the class within the model that the data matches.

Syntax

DM_getPredClass ( results value )

Parameters

results value
  The result of applying a Classification model, returned by the function DM_applyClasModel as data type DM_ClasResult

Return value

- If results value is NULL, the return value is NULL.
- Otherwise, the return value is the predicted class as data type VARCHAR.


DM_getPredValue

This function obtains the predicted value from results data that is produced when you apply a Regression model. This value is calculated according to relations that are established by the model.

Syntax

DM_getPredValue ( results value )

Parameters

results value
  The result of applying a Regression model, returned by the function DM_applyRegModel as data type DM_RegResult

Return value

- If results value is NULL, the return value is NULL.
- Otherwise, the return value is the predicted value as data type DOUBLE.


DM_getQuality

This function returns the quality of the result for the cluster whose ID is given by the function DM_getClusterID. It measures how well the applied record fits into the specified cluster.

The returned value is between 0.0 and 1.0. A value close to 0.0 means that the record does not fit at all in the cluster. A value close to 1.0 means that the record fits very well in the specified cluster. The quality measurement depends on the algorithm used to score the record, so that a direct comparison between the quality of algorithm results is not possible. However, both algorithms use a linear, possibly similar, quality measurement function.

Syntax

DM_getQuality ( results value )

Parameters

results value
  The result of applying a Clustering model, returned by the function DM_applyClusModel as data type DM_ClusResult

Return value

- If results value is NULL, the return value is NULL.
- Otherwise, the return value is the quality of the result for the specified cluster, as a value of type DOUBLE between 0.0 and 1.0.


DM_getQuality(clusterid)

This function returns the quality of the result for a specified cluster. It measures how well the applied record fits into the specified cluster.

The result may not contain a quality measure for all the available clusters. IM Scoring V8 usually computes the quality measure for only the two best matching clusters; it does this in order to optimize the memory needed to store the clustering result.

The returned value is between 0.0 and 1.0. A value close to 0.0 means that the record does not fit at all in the cluster. A value close to 1.0 means that the record fits very well in the specified cluster. The quality measurement depends on the algorithm used to score the record, so that a direct comparison between the quality of algorithm results is not possible. However, both algorithms use a linear, possibly similar, quality measurement function.

Syntax

DM_getQuality ( results value , clusterid )

Parameters

results value
  The result of applying a Clustering model, returned by the function DM_applyClusModel as data type DM_ClusResult

clusterid
  The ID of the cluster for which the quality should be returned

Return value

- If results value is NULL, the return value is NULL.
- If the result does not contain a quality measure for the requested cluster, the return value is NULL.
- Otherwise, the return value is the quality of the result for the specified cluster, as a value of type DOUBLE between 0.0 and 1.0.


DM_getRBFRegionID

This function returns the number of the region to which the record was assigned. The value is returned from results data that is produced when you apply an RBF Regression model.

Syntax

DM_getRBFRegionID ( results value )

Parameters

results value
  The results of applying an RBF Regression model returned by the function DM_applyRegModel as data type DM_RegResult

Return value

- If results value is NULL, the return value is NULL.
- Otherwise, the return value is the region ID as data type INTEGER. If NULL is returned, one of the following error events might have occurred:
  - The results value is computed on a Regression model that is not an RBF Regression model.
  - The results value is computed based on an RBF Regression model; however, the record contains too many missing values to compute a result value. This applies if the same results value is used as input for the function DM_getPredValue, and the function returns NULL.


DM_getRegMdlName

This function obtains the name of a Regression model of type DM_RegressionModel.

Syntax

DM_getRegMdlName ( model )

Parameters

model
  A Regression model of the type DM_RegressionModel

Return value

The return value is the name of the Regression model as a VARCHAR value.


DM_getRegMdlSpec

This function returns the DM_LogicalDataSpec value representing the set of fields needed for an application of the Regression model.

Syntax

DM_getRegMdlSpec ( regressionModel )

Parameters

regressionModel
  A value of type DM_RegressionModel

Return value

The return value is the DM_LogicalDataSpec value representing the set of fields needed for an application of the Regression model.


DM_getRegTarget

This function returns the target field (that is, the field to be predicted) for the Regression function.

Syntax

DM_getRegTarget ( regressionModel )

Parameters

regressionModel
  A value of type DM_RegressionModel

Return value

- If regressionModel is NULL, the return value is NULL.
- Otherwise, the return value is the name of the target field in the Regression model, as a VARCHAR value.


DM_impApplData

This function converts application input data of type CLOB, CHAR, or VARCHAR into type DM_ApplicationData.

Syntax

DM_impApplData ( applData_as_string )

Parameters

applData_as_string
  Application input data as a CLOB, CHAR, or VARCHAR value.

  The application input data must be a well-formed XML element. The element must match the DTD declaration of type row as shown in the following example.

    <!ELEMENT row (column*) >
    <!ELEMENT column (#PCDATA) >
    <!ATTLIST column
        name CDATA #REQUIRED
        null (true | false ) "false"
    >

  A value of the attribute name is interpreted as the name of the scoring field. The content of an element is interpreted as the value of the named scoring field. The order of the column elements is not relevant. The data for some fields might be NULL. In this case, it is represented by an element where the attribute null has the value true and the content is empty.

  DM_impApplData also accepts the output of the DB2 function REC2XML.

Return value

- If applData_as_string is NULL, the return value is NULL.
- Otherwise, the return value is application input data, converted into type DM_ApplicationData.
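
If the application data is assembled outside the database, a row element of the shape described above can be produced with any XML library. The following Python sketch uses the standard xml.etree.ElementTree module; the function name build_row and the field names are hypothetical, and None is used to mark a field whose data is NULL.

```python
import xml.etree.ElementTree as ET
from typing import Mapping, Optional


def build_row(fields: Mapping[str, Optional[object]]) -> str:
    """Serialize scoring fields as a <row> element with one <column> per field."""
    row = ET.Element("row")
    for name, value in fields.items():
        column = ET.SubElement(row, "column", {"name": name})
        if value is None:
            # NULL data: attribute null is true and the content stays empty.
            column.set("null", "true")
        else:
            column.text = str(value)
    return ET.tostring(row, encoding="unicode")


xml_text = build_row({"AGE": 42, "SALARY": None})
print(xml_text)
```

The resulting string can then be supplied as the applData_as_string argument, for example through a parameter marker in the SQL statement that calls DM_impApplData.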


DM_impClasFile

This function reads a Tree or Neural Classification model from a file, and returns it as data of type DM_ClasModel. This data can then be inserted into a table where one of the columns is designated for this data type.

The encoding of characters in the imported file is determined by default. For information about the default encoding rules, see “DM_impClasFileE” on page 122. The call

DM_impClasFile(<file name>)

corresponds to

DM_impClasFileE(<file name>,cast (NULL as CHAR))

When the content of the file is read and imported into DB2, the file is no longer needed. However, you might want to keep the file in case you need to import the model again.

Syntax

DM_impClasFile ( input file name )

Parameters

input file name
  The name and full path of the file to be imported. The specified file must exist on the database server and the DB2 instance owner (unfenced) or DB2 fenced user (fenced) must have read access to the file.

Return value

The return value is the imported model as data type DM_ClasModel.


DM_impClasFileE

This function reads a Tree or Neural Classification model from a file, and returns it as data of type DM_ClasModel. This data can then be inserted into a table where one of the columns is designated for this data type.

The encoding of characters in the imported file is specified by a parameter.

When the content of the file is read and imported into DB2, the file is no longer needed. However, you might want to keep the file in case you need to import the model again.

Syntax

DM_impClasFileE ( input file name , encoding name )

Parameters

input file name
  The name and full path of the file to be imported. The specified file must exist on the database server.

encoding name
  IM Scoring supports different encoding names. The MIME encoding strings are recommended, for example, iso-8859-1.

  - If the encoding name is a non-empty string, the value is interpreted as an XML encoding. The content of the imported file is parsed according to this encoding. The imported file must be a PMML 1.1 or PMML 2.0 document.
  - If the encoding name is the string 'SYSTEM', the locale settings of the operating system are used to determine the encoding of the file.
  - If the encoding name is NULL, the encoding is determined by the imported file.
    - If the file is a PMML 1.1 or PMML 2.0 document, the standard rules of the XML specification apply. That is, the encoding is either given explicitly or assumed to be Unicode by default.
    - If the imported file is written in Intelligent Miner format, the locale settings of the operating system are used.

Return value

The return value is the imported model as data type DM_ClasModel.
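
The rules for the encoding name parameter can be restated as a small decision function. The Python sketch below models only the decision described in the parameter description; it does not read any file. The function name, its parameters, and the returned placeholder strings are hypothetical, the comparison with 'SYSTEM' is assumed to be exact, and an empty string is rejected because the text does not describe that case.

```python
from typing import Optional


def resolve_encoding(encoding_name: Optional[str],
                     is_pmml: bool,
                     declared_encoding: Optional[str] = None,
                     system_encoding: str = "locale-default") -> str:
    """Return the encoding used to read the file, following the stated rules.

    None models SQL NULL. declared_encoding is the encoding given in the
    XML declaration of a PMML file, if any.
    """
    if encoding_name is None:
        if is_pmml:
            # XML rules: explicit declaration, else Unicode by default.
            return declared_encoding or "Unicode"
        # Intelligent Miner format: operating system locale settings.
        return system_encoding
    if encoding_name == "SYSTEM":
        return system_encoding
    if encoding_name == "":
        raise ValueError("behavior for an empty encoding name is not described")
    if not is_pmml:
        raise ValueError("an explicit encoding requires a PMML 1.1 or 2.0 document")
    return encoding_name


print(resolve_encoding("iso-8859-1", is_pmml=True))
print(resolve_encoding(None, is_pmml=True, declared_encoding="UTF-16"))
print(resolve_encoding(None, is_pmml=False, system_encoding="cp1252"))
```

The first branch corresponds to the call without an encoding argument, because that call is equivalent to passing NULL as the encoding name.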


DM_impClasModel

This function converts a Classification model given as a CLOB value into a Classification model of type DM_ClasModel.

Syntax

DM_impClasModel ( model_as_CLOB )

Parameters

model_as_CLOB
  A Classification model in PMML 1.1 or PMML 2.0 format as a CLOB value. The model is assumed to be provided in database encoding (the codeset specified when the database was created), regardless of the XML encoding specification contained in the model.

Return value

The return value is a Classification model of type DM_ClasModel.


DM_impClusFile

This function reads a Demographic or Neural Clustering model from a file, and returns it as data of type DM_ClusteringModel. This data can then be inserted into a table where one of the columns is designated for this data type.

The encoding of characters in the imported file is determined by default. For information about the default encoding rules, see “DM_impClusFileE” on page 125. The call

DM_impClusFile(<file name>)

corresponds to

DM_impClusFileE(<file name>,cast (NULL as CHAR))

When the content of the file is read and imported into DB2, the file is no longer needed. However, you might want to keep the file in case you need to import the model again.

Syntax

DM_impClusFile ( input file name )

Parameters

input file name
  The name and full path of the file to be imported. The specified file must exist on the database server and the DB2 instance owner (unfenced) or DB2 fenced user (fenced) must have read access to the file.

Return value

The return value is the imported model as data type DM_ClusteringModel.

If the input model is a Demographic Clustering model that does not contain distance units, they are calculated by default for numeric fields. The default value is half the square root of the variance of the field.


DM_impClusFileE

This function reads a Demographic or Neural Clustering model from a file, and returns it as data of type DM_ClusteringModel. This data can then be inserted into a table where one of the columns is designated for this data type.

The encoding of characters in the imported file is specified by a parameter.

When the content of the file is read and imported into DB2, the file is no longer needed. However, you might want to keep the file in case you need to import the model again.

Syntax

DM_impClusFileE ( input file name , encoding name )

Parameters

input file name
  The name and full path of the file to be imported. The specified file must exist on the database server.

encoding name
  IM Scoring supports different encoding names. The MIME encoding strings are recommended, for example, iso-8859-1.

  - If the encoding name is a non-empty string, the value is interpreted as an XML encoding. The content of the imported file is parsed according to this encoding. The imported file must be a PMML 1.1 or PMML 2.0 document.
  - If the encoding name is the string 'SYSTEM', the locale settings of the operating system are used to determine the encoding of the file.
  - If the encoding name is NULL, the encoding is determined by the imported file.
    - If the file is a PMML 1.1 or PMML 2.0 document, the standard rules of the XML specification apply. That is, the encoding is either specified explicitly or assumed to be Unicode by default.
    - If the imported file is written in Intelligent Miner format, the locale settings of the operating system are used.

Return value

The return value is the imported model as data type DM_ClusteringModel.

If the input model is a Demographic Clustering model that does not contain distance units, they are calculated by default for numeric fields. The default value is half the square root of the variance of the field.


DM_impClusModel

This function converts a Clustering model given as a CLOB value into a Clustering model of type DM_ClusteringModel.

Syntax

DM_impClusModel ( model_as_CLOB )

Parameters

model_as_CLOB
  A Clustering model in PMML 1.1 or PMML 2.0 format as a CLOB value. The model is assumed to be provided in database encoding (the codeset specified when the database was created), regardless of the XML encoding specification contained in the model.

Return value

The return value is a Clustering model of type DM_ClusteringModel.


DM_impRegFile

This function reads an RBF or Neural Prediction model or a Polynomial Regression model from a file, and returns it as data of type DM_RegressionModel. This data can then be inserted into a table where one of the columns is designated for this data type.

The encoding of characters in the imported file is determined by default. For information about the default encoding rules, see “DM_impRegFileE” on page 129. The call

DM_impRegFile(<file name>)

corresponds to

DM_impRegFileE(<file name>,cast (NULL as CHAR))

When the content of the file is read and imported into DB2, the file is no longer needed. However, you might want to keep the file in case you need to import the model again.

Syntax

DM_impRegFile ( input file name )

Parameters

input file name
  The name and full path of the file to be imported. The specified file must exist on the database server and the DB2 instance owner (unfenced) or DB2 fenced user (fenced) must have read access to the file.

Return value

The return value is the imported model as data type DM_RegressionModel.

DM_impRegFileE

This function reads an RBF or Neural Prediction model or a Polynomial Regression model from a file, and returns it as data of type DM_RegressionModel. This data can then be inserted into a table where one of the columns is designated for this data type.

The encoding of characters in the imported file is specified by a parameter.

When the content of the file is read and imported into DB2, the file is no longer needed. However, you might want to keep the file in case you need to import the model again.

Syntax

>>-DM_impRegFileE--(--input file name--,--encoding name--)--><

Parameters

input file name
    The name and full path of the file to be imported. The specified file must exist on the database server.

encoding name
    IM Scoring supports different encoding names. The MIME encoding strings are recommended, for example, iso-8859-1.
    • If the encoding name is a non-empty string, the value is interpreted as an XML encoding. The content of the imported file is parsed according to this encoding. The imported file must be a PMML 1.1 or PMML 2.0 document.
    • If the encoding name is the string 'SYSTEM', the locale settings of the operating system are used to determine the encoding of the file.
    • If the encoding name is NULL, the encoding is determined by the imported file.
        - If the file is a PMML 1.1 or PMML 2.0 document, the standard rules of the XML specification apply. That is, the encoding is either specified explicitly or assumed to be Unicode by default.
        - If the imported file is written in Intelligent Miner format, the locale settings of the operating system are used.

Return value

The return value is the imported model as data type DM_RegressionModel.
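
As an illustration of how the two file import functions relate, the following sketch prints an SQL INSERT statement that stores an imported model in the sample table IDMMX.RegressionModels. The helper name import_reg_sql, the IDMMX schema qualifier on the function names, the model name, and the file path are assumptions made for this example; they are not taken from the function descriptions above.

```shell
#!/bin/sh
# Sketch: print an SQL INSERT that imports a regression model file into the
# sample table IDMMX.RegressionModels (columns MODELNAME and MODEL).
# Assumptions: the helper name import_reg_sql, the IDMMX qualifier on the
# import functions, and the example model name and path are hypothetical.

import_reg_sql() {
    # $1 = model name, $2 = full path of the file on the database server,
    # $3 = optional encoding name (for example iso-8859-1 or SYSTEM)
    if [ -n "${3:-}" ]; then
        call="IDMMX.DM_impRegFileE('$2', '$3')"
    else
        # Without an encoding, DM_impRegFile applies the default rules; it
        # corresponds to DM_impRegFileE(<file name>, cast (NULL as CHAR)).
        call="IDMMX.DM_impRegFile('$2')"
    fi
    printf "INSERT INTO IDMMX.RegressionModels (MODELNAME, MODEL) VALUES ('%s', %s);\n" "$1" "$call"
}

import_reg_sql "churn_rbf" "/tmp/churn_rbf.pmml" "iso-8859-1"
```

The printed statement could be saved to a file and run with the DB2 command line processor, for example with db2 -tvf <file>, after connecting to an enabled database. The sketch does not escape quotation marks in its arguments.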

DM_impRegModel

This function converts a Regression model given as a CLOB value into a Regression model of type DM_RegressionModel.

Syntax

>>-DM_impRegModel--(--model_as_CLOB--)--><

Parameters

model_as_CLOB
    A Regression model in PMML 1.1 or PMML 2.0 format as a CLOB value. The model is assumed to be provided in database encoding (the codeset specified when the database was created), regardless of the XML encoding specification contained in the model.

Return value

The return value is a Regression model of type DM_RegressionModel.

Chapter 10. IM Scoring command reference

This chapter contains full descriptions of the executables provided with IM Scoring. These executables make it possible to:
• Enable a database.
  See “The idmenabledb command” on page 133.
• Check if a database has been enabled.
  See “The idmcheckdb command” on page 132.
• Disable a database.
  See “The idmdisabledb command” on page 132.
• Enable the DB2 instance on UNIX platforms.
  See “The idminstfunc command” on page 135.
• Disable the DB2 instance on UNIX platforms.
  See “The idmuninstfunc command” on page 138.
• Convert models from IM for Data format to PMML 2.0 format.
  See “The idmxmod command” on page 139.
• Generate from a PMML model an SQL script for performing a scoring run.
  See “The idmmkSQL command” on page 136.
• Check your license status.
  See “The idmlicm command” on page 135.
• Obtain information about your product version and other installed software.
  See “The idmlevel command” on page 135.

You can find these commands in the appropriate directory for your platform, as follows:

AIX:
    /usr/lpp/IMinerX/bin

Linux and Sun Solaris:
    /opt/IMinerX/bin

Windows:
    <install path>/IMinerX/bin
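
On UNIX systems, the directory list above could be turned into a small lookup. The following sketch maps the operating-system name printed by uname -s to the command directory; the function name iminer_bin_dir is an assumption made for this example.

```shell
#!/bin/sh
# Sketch: map an operating-system name, as printed by "uname -s", to the
# IM Scoring command directory listed above. The function name
# iminer_bin_dir is hypothetical.

iminer_bin_dir() {
    # $1 = optional OS name; defaults to the name of the current system
    case "${1:-$(uname -s)}" in
        AIX)
            echo "/usr/lpp/IMinerX/bin" ;;
        Linux|SunOS)
            echo "/opt/IMinerX/bin" ;;
        *)
            echo "platform not covered by this sketch" >&2
            return 1 ;;
    esac
}

iminer_bin_dir Linux
```

A login script might then extend the search path with PATH="$(iminer_bin_dir):$PATH".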

The idmcheckdb command

Use the idmcheckdb utility to check whether a specified database is enabled for IM Scoring, IM Modeling, or both. The utility also checks whether the database is enabled in fenced or unfenced mode.

Command syntax

>>-idmcheckdb--dbname--><

The idmcheckdb utility returns the following messages:

Table 24. The idmcheckdb messages

Database successfully enabled for IM Modeling and IM Scoring in fenced mode

Database successfully enabled for IM Scoring in fenced mode

Database successfully enabled for IM Modeling in fenced mode

Database is not enabled for either IM Scoring or IM Modeling
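
A script that calls idmcheckdb might want to reduce the message to a short status word. The following sketch matches only the message texts listed in Table 24; the function name checkdb_status and the result words it prints are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: classify one idmcheckdb message. Only the wording shown in
# Table 24 is matched; the function name and result words are hypothetical.

checkdb_status() {
    # $1 = message text returned by idmcheckdb
    case "$1" in
        *"not enabled for either"*)
            echo "none" ;;
        *"enabled for IM Modeling and IM Scoring"*)
            echo "both" ;;
        *"enabled for IM Scoring"*)
            echo "scoring" ;;
        *"enabled for IM Modeling"*)
            echo "modeling" ;;
        *)
            echo "unrecognized" ;;
    esac
}

checkdb_status "Database successfully enabled for IM Scoring in fenced mode"
```

If the utility writes its message to standard output, which is assumed here rather than documented, a call could look like checkdb_status "$(idmcheckdb testdb)".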

The idmdisabledb command

To disable a database, use the executable idmdisabledb.

Command syntax

>>-idmdisabledb--dbname--+--------+--><
                         '-tables-'

Parameters

dbname
    The name of the database that you want to disable.

tables
    If the optional parameter tables is specified, the sample tables containing models are dropped.

Warning: Use the tables parameter with caution. Before you drop the sample tables, consider that you will also delete the models contained in these tables.

The idmenabledb command

To enable a database, use the executable idmenabledb. The executable is shared between IM Modeling and IM Scoring. The executable has logic to detect the following:
• Which products are installed
• Which types and functions already exist in the database
• Which types and functions need to be created

Command syntax

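>>-idmenabledb--<dbname>--+----------+--+--------+--+-------------------+-->
                          +-fenced---+  '-tables-'  '-ClasModelSize <n>-'
                          '-unfenced-'

>--+-------------------+--+-------------------+--+------------------+-->
   '-RuleModelSize <n>-'  '-ClusModelSize <n>-'  '-RegModelSize <n>-'

>--+----------------+--+------------------+--><
   '-StructSize <n>-'  '-ApplDataSize <n>-'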

Example:

idmenabledb testdb fenced ClasModelSize 5 RuleModelSize 7 RegModelSize 9

where:
• 5 MB is the CLOB size used for the function DM_impClasModel and the BLOB size of type DM_ClasModel
• 7 MB is the CLOB size used for the function DM_impRuleModel and the BLOB size of DM_RuleModel
• 9 MB is the CLOB size used for the function DM_impRegModel and the BLOB size of DM_RegressionModel

Note: DM_RuleModel is a type introduced by IM Modeling. DM_RuleModel is not used by IM Scoring, but is mentioned here because idmenabledb is a shared command between IM Scoring and IM Modeling.

Use StructSize to specify the size of all IM Scoring structured types in megabytes. The default size of the DM_LogicalDataSpec data type is 200 KB, which is suitable for most of the models.

Use ApplDataSize to specify the CLOB size of the distinct type DM_ApplicationData in megabytes. The default is 500 KB.

Use ClasModelSize, RuleModelSize, ClusModelSize, and RegModelSize to specify the BLOB size of the distinct types DM_ClasModel, DM_RuleModel, DM_ClusteringModel, and DM_RegressionModel, and also the CLOB parameter size of the associated import and export model functions.

Note: When you have created database objects (for example, tables) using these types, you can no longer change the size of the types without dropping these database objects. IBM recommends that you work with default sizes in a test environment and that you carefully consider the parameter sizes before moving to a production environment. If you want to change your parameter sizes after you have created models in your database, follow the instructions in “Exporting and importing models by means of DB2 Utilities” on page 168.

If an existing model size is different from the new one that is specified, the idmenabledb executable:
1. Uses the existing definitions
2. Enables the database
3. Issues a warning message (NLS) that says:

   "Database successfully enabled. However, some types already exist with a different size than requested. First disable the database and run the command again."

If a model size is not specified for a specific model type, the default is 50 MB for unfenced and 10 MB for fenced.

Note: In IM Scoring 7.1, the default for the BLOB model was 100 MB; the default for the CLOB model to be imported was 50 MB. The BLOB model in IM Scoring 8.1 is compressed. For this reason, the default for the CLOB and BLOB models is the same (50 MB for unfenced and 10 MB for fenced).
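
The default sizes described above can be summarized in a few lines of script. The following sketch reports the model size, in megabytes, that would apply for one model type when the types do not yet exist in the database; the function name effective_model_size_mb is an assumption made for this example, and the sketch does not cover the case where existing type definitions take precedence.

```shell
#!/bin/sh
# Sketch: model size in MB for one model type, following the defaults
# described above (50 MB for unfenced, 10 MB for fenced). It ignores the
# case where the types already exist with a different size.
# The function name effective_model_size_mb is hypothetical.

effective_model_size_mb() {
    # $1 = fenced | unfenced, $2 = optional size given on the command line
    if [ -n "${2:-}" ]; then
        echo "$2"
        return 0
    fi
    case "$1" in
        fenced)
            echo 10 ;;
        unfenced)
            echo 50 ;;
        *)
            echo "mode must be fenced or unfenced" >&2
            return 1 ;;
    esac
}

effective_model_size_mb fenced        # no size parameter given
effective_model_size_mb unfenced 9    # for example, RegModelSize 9
```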

If the optional parameter tables is specified, idmenabledb creates sample tables that are suitable for storing imported and generated models. The following tables are created if you specify the tables option:

CREATE TABLE IDMMX.ClassifModels ( MODELNAME VARCHAR(240) NOT NULL PRIMARY KEY, MODEL IDMMX.DM_ClasModel );

CREATE TABLE IDMMX.ClusterModels ( MODELNAME VARCHAR(240) NOT NULL PRIMARY KEY, MODEL IDMMX.DM_ClusteringModel );

CREATE TABLE IDMMX.RegressionModels ( MODELNAME VARCHAR(240) NOT NULL PRIMARY KEY, MODEL IDMMX.DM_RegressionModel );

The idminstfunc command

The idminstfunc command enables the DB2 instance for the use of IM Modeling and IM Scoring. Enabling the DB2 instance means that shared libraries containing the implementation of the UDFs, UDMs, and stored procedures are linked into the sqllib/function directory of the DB2 instance.

This command must be called only on UNIX platforms, and can be called only by a user with SYSADM authority.

Syntax:

>>-idminstfunc--><

The idminstfunc command is a shared command between IM Modeling and IM Scoring, and it needs to be called only once if both products are installed.

The idmlevel command

Use the command idmlevel to collect information about the software environment (operating system, DB2 version) and IM product version you are using.

Using 'idmlevel' on Windows operating systems

1. Open a command window.
2. Invoke idmlevel <logfile>

Example:

idmlevel c:\imscoring.log

Using 'idmlevel' on UNIX operating systems

1. In a command shell invoke:

AIX:
    /usr/lpp/IMinerX/bin/idmlevel /tmp/imscoring.log

Other UNIX platforms:
    /opt/IMinerX/bin/idmlevel /tmp/imscoring.log

The idmlicm command

Use the idmlicm command to check your license status. IM Scoring and IM Modeling 8.1 use nodelocked license keys installed in a license file. The full product contains production license keys, which are installed during installation. The 'Try and Buy' version does not include a production license key. You can use the idmlicm command to generate a 'Try and Buy' license key.

The idmlicm command is located in the bin directory. It has no parameters. When invoked, idmlicm checks the license status.

Sample output with production license

The license for the product "DB2 Intelligent Miner Scoring"
    was found in the file "nodelock_sc".
The license for the product "DB2 Intelligent Miner Modeling"
    was found in the file "nodelock_mo".

If no production license is installed, idmlicm can generate a temporary license key the first time it is invoked.

Sample output with temporary license

W 2771: You are now using a temporary license. You need
to enroll "DB2 Intelligent Miner Scoring" within "89" days
in the license file "d:\IMinerX\bin\nodelock_sc".

If you invoke idmlicm when a temporary license is already installed, you get the following sample output:

Sample output with temporary license already installed

W 2772: The number of days left is "89". You are still using
a temporary license. You need to enroll "DB2 Intelligent Miner Scoring"
in the license file "d:\IMinerX\bin\nodelock_sc".

The idmmkSQL command

The command idmmkSQL analyzes a PMML model, and generates from it an SQL script that contains the SQL statements necessary to perform a scoring run using the model. The script is basically a template. It contains placeholders that you replace with the names of concrete database objects in order to finally get the executable SQL script.

Command syntax

>>-idmmkSQL--+---------+--inputfile--+------------+--><
             '-options-'             '-outputfile-'

Parameters

The inputfile parameter is the file containing the PMML model. The SQL template is written to outputfile. If no outputfile is given, the template is written to standard output. The following options exist. Note that, on Windows, options start with a slash (/), not with a hyphen (-).

-M  Identifies the method used to build input records. The option must be followed by one of the following values (not case-sensitive):
    • REC2XML. This is the default.
    • DM_applData.
    • DM_ApplColumn. This option applies only to Oracle SQL (IM Scoring for Oracle).
    • CONCAT.

    For an introduction to these options, see:
    “Specifying data by means of DM_applData” on page 52
    “Specifying data by means of REC2XML” on page 51
    “Specifying data by means of CONCAT” on page 53

-D  Determines whether DB2 SQL or Oracle SQL is written. The option must be followed by one of the following values, which are not case-sensitive:
    • DB2 (the default value)
    • ORACLE

-E  Determines the encoding of the PMML file. The option must be followed by a valid encoding string. If this option is not given, the encoding string contained in the PMML file is used. If the file does not contain encoding information, utf-8 is the default encoding.

-H  Displays usage information.

Example:

idmmkSQL -D DB2 -M Concat -E iso-8859-15 thePMMLFile.pmml theScriptFile.sql

Placeholders

The placeholders are marked with leading and trailing ### signs, for example, ###TABLENAME###. A template can contain the following placeholders:

###IDMMX.CLUSTERMODELS###

###IDMMX.CLASSIFMODELS###

###IDMMX.REGRESSIONMODELS###
    This denotes the name of the table where the mining model is to be stored. If you simply remove the # signs, you get the name of the corresponding sample table. However, you can replace the placeholders by the name of any other table that you use to store your models.

    Note: To generate the SQL script by using idmmkSQL, you need the model as a PMML file. When the script is executed, it imports the model into this database table.

###ABSOLUTE_PATH###
    The absolute path to the PMML file. The function that imports the PMML model needs the absolute pathname of the file.

###RECORDID###
    The table containing the input data records for the scoring run is expected to have a column of data type INTEGER. This column contains an identifier for different records. Replace ###RECORDID### with the name of the column.

###MODEL###
    The name of the column in the model table containing the model itself. In the sample tables, this name is MODEL; if you use the sample tables to store your models, you can simply remove the # signs.

###MODELNAME###
    The name of the column in the model table containing the model name. In the sample tables, this name is MODELNAME; if you use the sample tables to store your models, you can simply remove the # signs.

###TABLENAME###
    The name of the table containing your input data records.
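
The placeholder replacement described above could be automated with sed. The following is a minimal sketch under stated assumptions: the two-line template is a made-up stand-in and not real idmmkSQL output, and the names MYSCHEMA.CUSTOMERS, CUST_ID, and /tmp/thePMMLFile.pmml are hypothetical. Only the Clustering sample table placeholder is handled; the other two model table placeholders would follow the same pattern.

```shell
#!/bin/sh
# Sketch: fill in the placeholders of a template produced by idmmkSQL.
# The template lines below are a made-up stand-in, NOT real idmmkSQL
# output; MYSCHEMA.CUSTOMERS, CUST_ID and the file path are hypothetical.

fill_template() {
    # Reads a template on standard input and writes the filled-in text to
    # standard output. The path rule uses "|" as the sed delimiter because
    # file paths contain "/".
    sed -e 's/###IDMMX\.CLUSTERMODELS###/IDMMX.CLUSTERMODELS/g' \
        -e 's/###MODELNAME###/MODELNAME/g' \
        -e 's/###MODEL###/MODEL/g' \
        -e 's/###TABLENAME###/MYSCHEMA.CUSTOMERS/g' \
        -e 's/###RECORDID###/CUST_ID/g' \
        -e 's|###ABSOLUTE_PATH###|/tmp/thePMMLFile.pmml|g'
}

printf '%s\n' \
    "-- model table ###IDMMX.CLUSTERMODELS###, file ###ABSOLUTE_PATH###" \
    "-- input ###TABLENAME###, id ###RECORDID###, columns ###MODELNAME### ###MODEL###" |
    fill_template
```

With a real template, the call would be of the form fill_template < theScriptFile.sql > run.sql, where run.sql is a file name chosen for this example.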

The idmuninstfunc command

The idmuninstfunc command disables the DB2 instance for the use of IM Modeling and IM Scoring. Disabling the DB2 instance means that links to the shared libraries containing the implementation of the UDFs, UDMs, and stored procedures are removed from the sqllib/function directory of the DB2 instance.

This command must be called only on UNIX platforms, and it can be called only by a user with SYSADM authority.

Syntax:

>>-idmuninstfunc--><

The idmuninstfunc command is a shared command between IM Modeling and IM Scoring, and needs to be called only once if both products are installed. It is not possible to disable the instance for the use of only one product if both products are installed.

After the idmuninstfunc command is called, IM Modeling and IM Scoring can no longer be used for that DB2 instance. To enable the DB2 instance again, call the idminstfunc command.

The idmxmod command

You can convert a model outside IM for Data from Intelligent Miner format to PMML 2.0 format by using the idmxmod command.

Command syntax

>>-idmxmod--in file name--out file name--><

Parameters

in file name
    The name of a file containing a model in Intelligent Miner format

out file name
    The name of the file containing the converted PMML 2.0 model

Chapter 11. IM Scoring Java Beans reference

The Java API is documented in online documentation (Javadoc) in the following directory:

\doc\ScoringBean\index.html

Part 3. Appendixes

Appendix A. Installing IM Scoring

This chapter guides you through the tasks involved in installing and uninstalling IM Scoring on the available platforms:
• AIX
  See “Installing IM Scoring on AIX systems”
• Linux
  See “Installing IM Scoring on Linux systems” on page 149
• Sun Solaris
  See “Installing IM Scoring on Sun Solaris systems” on page 150
• Windows NT, Windows 2000, and Windows XP
  See “Installing IM Scoring on Windows systems” on page 153

This chapter also describes steps that you need to complete before and after the installation, as follows:
• “Configuring the database management system on UNIX systems” on page 157
• “Configuring the database management system on Windows systems” on page 158
• “Enabling IM for Data to export PMML or XML models” on page 158

Installing IM Scoring on AIX systems

Before you install IM Scoring on an AIX system, ensure that your system meets the prerequisites.

Prerequisites for AIX systems

The prerequisites that your AIX system must meet for installing IM Scoring are as follows:
• 60 MB of additional disk space in the /usr file system
  On SP2® with DB2 EEE installation, this disk space is required on each node.
• At least 256 MB RAM
• DB2 UDB Version 7.2 Fixpack 7, or DB2 UDB Version 8
  To download DB2 fixpacks, go to:
  http://www.ibm.com/software/data/db2/udb/support.html
• AIX 4.3.3 or higher

If IM Scoring V7.1 is installed on your system, it is recommended that you uninstall it before you install IM Scoring V8.1. Although both versions can be installed on the same system, there is a high risk of user error. In any case, use different DB2 instances if you intend to use both versions.

To check whether IM Scoring V7.1 is already installed on your system, type the following on the AIX command line:

lslpp -l "IMinerSc.services.*"

The file sets that make up IM Scoring V7.1 should not be present. They are as follows:
• IMinerSc.services.base
• IMinerSc.services.cnv
• IMinerSc.services.symblnk
• IMinerSc.services.cstr

For instructions on uninstalling IM Scoring V7.1 file sets, see IBM Intelligent Miner Scoring Administration and Programming for DB2, Version 7.1.

You might also want to uninstall documentation related to IM Scoring V7.1. To do so, uninstall the file sets IMinerSc.services.doc.<language>

Installing IM Scoring

Use smit or smitty to install IM Scoring on AIX systems. The AIX system must have a DB2 server installed and configured.

To install IM Scoring on an AIX system:

1. Log on as user root.
2. Insert the IM Scoring CD-ROM in the CD-ROM drive.
3. Type smit on the AIX command line.
4. Select the following options. These steps might differ slightly depending on the AIX version installed on your system.
   a. Software Installation and Maintenance
   b. Install and Update Software
   c. Install and Update from LATEST Available Software
   d. Enter /dev/cd0 as INPUT device
   e. Display the list of licensed software available to install. The relevant items in the list are:

      IMinerX.scoring.db2 (Intelligent Miner Scoring - DB2)
          Contains the scoring functionality of IM Scoring for DB2.

      IMinerX.scoring.db2.doc.en
          Contains the documentation for IM Scoring for DB2 in English in PDF and HTML format

      IMinerX.scoring.db2.doc.es
          Contains the documentation for IM Scoring for DB2 in Spanish in PDF and HTML format

      IMinerX.scoring.db2.doc.ja
          Contains the documentation for IM Scoring for DB2 in Japanese in PDF and HTML format

      IMinerX.scoring.db2.doc.ko
          Contains the documentation for IM Scoring for DB2 in Korean in PDF and HTML format

      IMinerX.scoring.db2.doc.cn
          Contains the documentation for IM Scoring for DB2 in Chinese (China) in PDF and HTML format

      IMinerX.scoring.db2.doc.tw
          Contains the documentation for IM Scoring for DB2 in Chinese (Taiwan) in PDF and HTML format

      IMinerX.conversion
          Contains the model conversion facility to convert models exported from IM for Data to PMML. This is an optional feature in IM Scoring for DB2.

      IMinerX.symblnk
          Contains symbolic links from the IM for Data /usr/lpp/IMiner/bin directory to the executables of the model conversion facility. This is an optional feature in IM Scoring for DB2, which requires IMiner.server.serial. This enables you to use the model conversion facility from IM for Data.

To install IM Scoring, select the file set IMinerX.scoring.db2

To install the documentation, select IMinerX.scoring.db2.doc.<language>

To install the optional feature "Intelligent Miner - Symbolic Links", select IMinerX.symblnk

To install "Intelligent Miner - Conversion", select IMinerX.conversion

On the smit installation menu, for AUTOMATICALLY install requisite software, type 'yes'.

After you have successfully installed IMinerX.scoring.db2, you must repeat the installation procedure in order to install the license that enables you to use IM Scoring. There are two kinds of IM Scoring license: the 'Try and Buy' license and the regular full license.

To install the IM Scoring 'Try and Buy' license:
    In step 4e, select the file set IMinerX.scoring.tab.license (instead of the file set IMinerX.scoring.db2). The 'Try and Buy' license expires after 59 days.

To install the regular IM Scoring license:
    In step 4e, select the file set IMinerX.scoring.license (instead of the file set IMinerX.scoring.db2).

Note: If you want to use IM Scoring on an SP2 in a DB2 EEE environment, you must install IM Scoring on each node where DB2 is installed.

Before you can use IM Scoring, you must enable the DB2 instance and databases, and create sample tables. For information on how to do this, see:

“Enabling the DB2 instance on UNIX systems” on page 157
“Configuring the database environment” on page 21
“Enabling databases” on page 41
“The idmenabledb command” on page 133

Before you can use the conversion utility, IM for Data must be enabled to export mining models properly. See “Enabling IM for Data to export PMML or XML models” on page 158.

Uninstalling IM Scoring

Before you uninstall IM Scoring on AIX systems, you must disable the databases and the DB2 instance. For information on how to do this, see:

“Disabling databases” on page 42
“The idmdisabledb command” on page 132
“Disabling the DB2 instance on UNIX systems” on page 157

To uninstall IM Scoring on an AIX system:

1. Log on as user root.
2. Run the standard software and maintenance procedure from smit or smitty and select the following options:
   a. Software Installation and Maintenance
   b. Software Maintenance and Utilities
   c. Remove Installed Software
3. List the installed software.
4. Follow the instructions on the smit menus to uninstall IM Scoring.

On SP2, you must uninstall IM Scoring on each individual node where IM Scoring is installed.

Installing IM Scoring on Linux systems

Before you install IM Scoring on a Linux system, ensure that your system meets the prerequisites.

Prerequisites for Linux systems

The prerequisites that your Linux system must meet for installing IM Scoring are as follows:
• 60 MB of additional disk space
• At least 256 MB RAM
• Linux with kernel 2.2.18 or higher, and glibc Version 2.1.1 or higher
• DB2 UDB Version 7.2 Fixpack 7, or DB2 UDB Version 8
  To download DB2 fixpacks, go to:
  http://www.ibm.com/software/data/db2/udb/support.html
• To install IM Scoring, you need RPM

Installing IM Scoring

To install IM Scoring on a Linux system:

1. Insert the Linux server CD-ROM in the CD-ROM drive.
2. Mount the CD-ROM by using the following command:

   mount /cdrom

   If the directory /cdrom is not listed in the file /etc/fstab, use the following command:

   mount -t iso9660 /dev/hdx /cdrom

   where /dev/hdx is your CD-ROM drive.
3. Depending on your system, go to the appropriate directory by using one of the following commands:
   • cd /cdrom/LINUX/i386
   • cd LINUX/s390

4. Depending on your system, use the appropriate commands to complete the installation.

   On Linux/i386:
       ./linuxInstallSc

   On Linux/s390:
       ./linux390InstallSc

Before you can use IM Scoring, you must enable the DB2 instance and databases, and create sample tables. For information on how to do this, see:

“Enabling the DB2 instance on UNIX systems” on page 157
“Configuring the database environment” on page 21
“Enabling databases” on page 41
“The idmenabledb command” on page 133

Uninstalling IM Scoring

Before you uninstall IM Scoring on Linux systems, you must disable the databases and the DB2 instance. For instructions on how to do this, see:

“Disabling databases” on page 42
“The idmdisabledb command” on page 132
“Disabling the DB2 instance on UNIX systems” on page 157

To uninstall IM Scoring on a Linux system:

Use the appropriate command:

On Linux/i386:
    ./linuxUninstallSc

On Linux/s390:
    ./linux390UninstallSc

Installing IM Scoring on Sun Solaris systems

Before you install IM Scoring on a Sun Solaris system, ensure that your system meets the prerequisites.

Prerequisites for Sun Solaris systems

The prerequisites that your Sun Solaris system must meet for installing IM Scoring are as follows:
• 60 MB of additional disk space in the /opt file system
• At least 256 MB RAM
• DB2 UDB Version 7.2 Fixpack 7, or DB2 UDB Version 8
  To download DB2 fixpacks, go to:
  http://www.ibm.com/software/data/db2/udb/support.htm
• Sun Solaris Version 2.6 or higher
  Make sure that the following patches are installed:
  109147-09
  108434-02
  108435-02

  To download patches, go to http://sunsolve.sun.com.

Installing IM Scoring

To install IM Scoring on a Sun Solaris system:

1. Log on as user root.
2. Mount the product CD to a directory of your choice, for example, /cdrom.
3. Go to the directory where you mounted the CD.
4. Go to the Sun Solaris directory by using the following command:

   cd SUN

5. Install the components you want by executing the appropriate script, as follows:

   IM Scoring
       ./sunInstallScD2

   IMiner Conversion Utilities
       ./sunInstallCn

   Documentation
       By default, the US English documentation is installed together with IM Scoring. To install additional documentation packages, execute the following command:

       pkgadd -a ./admin -d ./IMXSDD2XX.pkg

       where XX is one of the following:
       • EE for Spanish
       • JJ for Japanese
       • KK for Korean
       • ZC for Chinese (China)
       • ZT for Chinese (Taiwan)

   If IM for Data and IMiner Conversion Utilities are installed on the same system, symbolic links to the model conversion facility executables are created in the directory /opt/IMiner/bin. This enables you to use the model conversion facility from IM for Data.

   If other components are already installed on the system, these will not be installed again, though the installation script will try to do so. The messages that appear during installation will reflect this.

   You might have IM for Data and IMiner Conversion Utilities installed on different systems and you might want to use the model conversion facility from IM for Data. In this case, use the following command on the system where IM for Data is installed to install the model conversion facility:

   ./sunInstallCn

   You can add further components on the system later by following the steps outlined above.

Before you can use IM Scoring, you must enable the DB2 instance and databases, and create sample tables. For information on how to do this, see:

“Enabling the DB2 instance on UNIX systems” on page 157
“Configuring the database environment” on page 21
“Enabling databases” on page 41
“The idmenabledb command” on page 133

Before you can use the conversion utility, IM for Data must be enabled to export mining models properly. See “Enabling IM for Data to export PMML or XML models” on page 158.

Uninstalling IM Scoring

Before you uninstall any Intelligent Miner components on Sun Solaris systems, you must disable the databases and the DB2 instance. For information on how to do this, see:

“Disabling databases” on page 42
“The idmdisabledb command” on page 132
“Disabling the DB2 instance on UNIX systems” on page 157

You can uninstall IM Scoring, IMiner Conversion Utilities, or both.

To uninstall IM Scoring on a Sun Solaris system:

1. Log on as user root.
2. Mount the product CD to a directory of your choice, for example, /cdrom.
3. Go to the directory where you mounted the CD.
4. Go to the Sun Solaris directory by using the following command:

   cd SUN

5. Uninstall the components you want by executing the appropriate script, as follows:

   IM Scoring
       ./sunUninstallScD2

   IMiner Conversion Utilities
       ./sunUninstallCn

   Documentation
       By default, the US English documentation is removed together with IM Scoring. To see if any other documentation is installed, execute the following command:

       pkginfo | grep IMXSDD2

       To remove any other documentation that was installed, execute the following command:

       pkgrm -n IMXSDD2XX

       where XX is one of the following:
       • EE for Spanish
       • JJ for Japanese
       • KK for Korean
       • ZC for Chinese (China)
       • ZT for Chinese (Taiwan)

The uninstall scripts will try to remove all the components that were installed with the relevant component. However, other products may depend on some of these components. In that case, uninstallation of the relevant components will be skipped and the screen output will reflect this.

Installing IM Scoring on Windows systems

Before you install IM Scoring on a Windows system, ensure that your system meets the prerequisites.

Prerequisites for Windows systems

The prerequisites that your Windows system must meet for installing IM Scoring are as follows:

Disk space
    • To install IM Scoring or IM Scoring Java Beans: 40 MB of additional disk space
    • To install IM Scoring, IM Scoring Java Beans, and IM Modeling: 50 MB

RAM
    At least 256 MB RAM

Operating system
    Windows NT 4.0 SP6a
    Windows 2000 SP2 or higher
    Windows XP

JDK
    For IM Scoring Java Beans: JDK 1.3 or higher

Database
    DB2 UDB Version 7.2 Fixpack 7, or DB2 UDB Version 8

    To download DB2 fixpacks, go to:
    http://www.ibm.com/software/data/db2/udb/support.html

    If you do not have a DB2 database, you can use the IM Scoring Java Beans installation.

Installing IM Scoring

Note: IM Scoring for Windows includes Microsoft® Windows Installer (MSWI). If your operating system does not yet have MSWI installed or uses an older version, the automatic installation process installs MSWI as the first step. This requires you to restart Windows.

IM Scoring V8 cannot coexist with IM Scoring V7 on the same machine. When you install IM Scoring V8, IM Scoring V7 is automatically uninstalled.

To install IM Scoring on a Windows system:

1. Change to the WIN32 directory on the IM Scoring CD-ROM.
2. Run the setup.exe file.
3. Follow the installation instructions.

   IM Scoring includes a multilingual globalized installation. You can choose the language you want to use during the installation. Regardless of the language you select, IM Scoring is installed with support and translated messages for all supported languages.

   You can install the following features:

   Scoring: User-defined functions for DB2
       This feature contains the shared libraries that implement the DB2 user-defined functions for the DB2 database server. It also contains a command line interface to convert IM for Data Version 6 result files to the standardized PMML format used by IM Scoring.

       Install this feature on the database server where the database is located on which you want to run the scoring functions.

   Scoring: Samples
       This feature contains data to populate a sample database as well as a sample Clustering model to apply scoring functions to a database. It also contains other samples that show how to migrate IM Scoring V7 databases.

   PMML conversion utilities (server)
       Install this feature on the IM for Data Version 6 server. With this feature installed on the server and the PMML conversion utilities (client) feature installed on the client, PMML conversion is possible. You can convert IM for Data Version 6 results to the standardized PMML format from the IM for Data Version 6 GUI.

       This feature is also available for the IM Scoring Java Beans installation.


PMML conversion utilities (client)
   Install this feature on the IM for Data Version 6 client. It adds entries to the client tool registration file idmcsctr.dat to invoke PMML conversion on version 6 result files. The actual conversion runs on the IM for Data server.

   This feature is also available for the IM Scoring Java Beans installation.

IM Scoring Java Beans
   This feature contains IM Scoring Java Beans, which enables you to score a single data record in a Java application given a PMML model. This can be used to integrate scoring in e-business applications, for example, for real-time scoring in customer relationship management (CRM) systems.

   This feature is also available for the IM Scoring Java Beans installation.

Java Beans samples
   This feature contains Java sample programs for IM Scoring Java Beans. To use the sample, you need a Java Development Kit (JDK).

   This feature is also available for the IM Scoring Java Beans installation.

Online documentation: Scoring
   This feature contains the online manuals as PDF and HTML files. In the subfeatures, you can select one of the languages that are available.

   This feature is also available for the IM Scoring Java Beans installation.

The default installation path is <Program Files>\IBM\IMinerX. You can change that path if this is the first product you are installing from the Intelligent Miner extender product family. Otherwise, the path of the product that is already installed is used.

Depending on the features you selected on the feature selection dialog, the following installations and updates are performed:
- Scoring functions are written to the directory <install path>\bin, where <install path> is the directory where IM Scoring is installed.
  By default, the scoring functions are enabled if DB2 UDB is found during installation.
- Sample scripts for the tutorial are written to the following directories:
  IM Scoring
     <install path>\samples\ScoringDB2
  IM Scoring Java Beans
     <install path>\samples\ScoringBean
- The PMML model conversion facility is installed.
  If IM for Data is found on the computer, the model conversion facility is copied into the bin directory of IM for Data. The contents of the idmcsctr.add file and the idmcsstr.add file are added to the appropriate client tool registration files idmcsctr.dat of IM for Data. This adds the option of exporting models from the IM for Data GUI in PMML or XML format.
- If the Java Bean feature was selected:
  - The Java archives (jar) are written to the directory <install path>\java
  - The Java Native Interface (JNI) DLLs are written to the directory <install path>\bin
  where <install path> is the directory where IM Scoring is installed.
- The DB2 Intelligent Miner Scoring folder is created, including shortcuts to the sample scripts and online documentation.

After IM Scoring is installed, you must reboot your system to activate the PATH variable in the system environment.

Before you can use IM Scoring, you must enable the DB2 instance and databases, and create sample tables. For information on how to do this, see:

“Enabling the DB2 instance on Windows systems” on page 158
“Configuring the database environment” on page 21
“Enabling databases” on page 41
“The idmenabledb command” on page 133

Before you can use the conversion utility, IM for Data must be enabled to export mining models properly. See “Enabling IM for Data to export PMML or XML models” on page 158.

Uninstalling IM Scoring

Before you uninstall IM Scoring on Windows systems, you must disable the databases. For instructions on how to do this, see:
- “Disabling databases” on page 42
- “The idmdisabledb command” on page 132

To uninstall IM Scoring on a Windows system:

1. Double-click Add/Remove Programs on the Control Panel.


2. Select IBM DB2 Intelligent Miner Scoring V8.1 from the list and click Add/Remove....

3. Follow the instructions on the screen. The following options are available:
   - Modify the list of installed features
   - Repair the installation
   - Remove all features

Configuring the database management system on UNIX systems

Before you can use IM Scoring, you must prepare your system environment and verify the installation.

Enabling the DB2 instance on UNIX systems

A DB2 instance must have access to the libraries containing the scoring functions in order to use IM Scoring from that instance.

1. Log on to the DB2 server as the DB2 instance owner and go to the appropriate directory:
   AIX
      /usr/lpp/IMinerX/bin
   Linux, Sun Solaris
      /opt/IMinerX/bin

2. Call the script idminstfunc.
   Executing the script creates symbolic links to the following libraries in the instance owner’s sqllib/function directory:
   - idmclu
   - idmreg
   - idmclf
   - idmrec
   - idmrul
   - idmx

3. Change the database manager configuration parameter UDF_MEM_SZ to the maximum value by using the following command:
   db2 update dbm cfg using UDF_MEM_SZ 60000

4. Restart the DB2 instance by using the db2stop and db2start commands.
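The result of step 2 can be checked with a short script. The following is a minimal sketch and not part of the product: it assumes that the links in the sqllib/function directory carry exactly the library names listed above, which might not hold on every platform, and the function name check_idm_links is hypothetical.

```shell
#!/bin/sh
# Report which IM Scoring library links are missing from a DB2 function directory.
# Assumption: the links are named exactly like the libraries listed in step 2;
# platform-specific prefixes or suffixes are not handled by this sketch.
check_idm_links() {
    func_dir=$1
    missing=0
    for lib in idmclu idmreg idmclf idmrec idmrul idmx; do
        # -L is true only if the path exists and is a symbolic link
        if [ ! -L "$func_dir/$lib" ]; then
            echo "missing link: $func_dir/$lib"
            missing=1
        fi
    done
    # 0 if all links were found, 1 otherwise
    return "$missing"
}
```

For example, the instance owner could call check_idm_links "$HOME/sqllib/function" after running idminstfunc. No output and a return code of 0 indicate that all six links were found.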

Disabling the DB2 instance on UNIX systems

To disable the DB2 instance:
1. Log on as DB2 instance user.
2. Go to the appropriate directory:
   AIX
      /usr/lpp/IMinerX/bin
   Linux, Sun Solaris
      /opt/IMinerX/bin
3. Call the script idmuninstfunc. This script deletes a set of symbolic links from the instance owner’s sqllib/function directory.

Configuring the database management system on Windows systems

Before you can use IM Scoring, you must prepare your system environment and verify the installation.

Enabling the DB2 instance on Windows systems

DB2 allocates a storage area for the input and output parameters of the UDFs. You can modify the size of the storage area by using the database manager configuration parameter UDF_MEM_SZ. This parameter indicates the size of the memory as a number of database pages.

The DB2 registry variable DB2NTMEMSIZE indicates the upper limit for fenced UDFs in bytes. The default value is 16 MB.

For IM Scoring, the values of the UDF_MEM_SZ and the DB2NTMEMSIZE variables must be increased. Note that the storage allocated by the value specified in the UDF_MEM_SZ variable must not be greater than the upper limit specified in the DB2NTMEMSIZE variable.

Use the following commands to increase the value of the UDF_MEM_SZ variable to 60000 pages and the value of the DB2NTMEMSIZE variable to 240 MB:

db2 update dbm cfg using UDF_MEM_SZ 60000
db2set DB2NTMEMSIZE=APLD:240000000

Restart the DB2 instance.

Enabling IM for Data to export PMML or XML models

Depending on the system you use, you must complete a number of steps before IM for Data can export PMML or XML models.

On AIX systems

If IM for Data is found on your system during the IM Scoring installation, the file set IMinerX.symblnk generates links to the model conversion facility in the bin directory of IM for Data.

If you install IM for Data after you have installed IM Scoring, you can establish the links to the model conversion facility manually by installing the file set IMinerX.symblnk.

After you have registered the model conversion facility by using the client tool registration, you can use it on the IM for Data GUI.

To register the model conversion facility:
- Add the contents of the file idmcsctr.add to the idmcsctr.dat file of the IM for Data client
- Add the contents of the file idmcsstr.add to the idmcsstr.dat file of the IM for Data server

The files idmcsctr.add and idmcsstr.add are platform-independent. You can add the contents of the file idmcsctr.add to the idmcsctr.dat file of an IM for Data client on different platforms. If the platform is AIX and you are running the IM for Data client in a language other than English, the idmcsstr.dat file resides in the nls/<language> directory of the IM for Data client. Otherwise, it resides in the bin directory of the IM for Data client.
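On a UNIX client or server, adding the contents of an .add file to a .dat file can be sketched as follows. This is an illustration only: the function name register_conversion_tool and the .bak backup suffix are assumptions, and the sketch assumes that appending the entries at the end of the registration file is sufficient.

```shell
#!/bin/sh
# Append a tool registration fragment (*.add) to a registration file (*.dat).
# A backup copy of the registration file is kept first; the .bak suffix is
# this sketch's own choice and not a product convention.
register_conversion_tool() {
    add_file=$1
    dat_file=$2
    [ -r "$add_file" ] || { echo "cannot read $add_file" >&2; return 1; }
    [ -w "$dat_file" ] || { echo "cannot write $dat_file" >&2; return 1; }
    cp -p "$dat_file" "$dat_file.bak" || return 1
    cat "$add_file" >> "$dat_file"
}
```

The function would be called once for the client file (idmcsctr.add and idmcsctr.dat) and once for the server file (idmcsstr.add and idmcsstr.dat), with the paths of the actual installation. Calling it twice for the same pair appends the entries twice, so the backup copy is worth keeping until the result has been checked in the IM for Data GUI.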

On Sun Solaris systems

If IM for Data is found on your system during the installation of IM Scoring, links to the model conversion facility are established in the bin directory of IM for Data.

If you install IM for Data after you have installed IM Scoring, you can establish the links to the model conversion facility by calling the script idmlnconv as user root. This script is available in the /opt/IMinerX/bin directory of IM Scoring.

After you have registered the model conversion facility by using the client tool registration, you can use it on the IM for Data GUI.

To register the model conversion facility:
- Add the contents of the file idmcsctr.add to the idmcsctr.dat file of the IM for Data client
- Add the contents of the file idmcsstr.add to the idmcsstr.dat file of the IM for Data server

The files idmcsctr.add and idmcsstr.add are platform-independent. You can add the contents of the file idmcsctr.add to the idmcsctr.dat file of an IM for Data client on different platforms. If the platform is Sun Solaris and you are running the IM for Data client in a language other than English, the idmcsstr.dat file resides in the nls/<language> directory of the IM for Data client. Otherwise, it resides in the bin directory of the IM for Data client.

If you want to remove IM for Data from your system, you can remove the links to the model conversion facility by calling the script idmrlnconv as user root.

On Windows systems

If IM for Data is found on the system during the IM Scoring installation, Intelligent Miner is enabled to use the model conversion facility from the Intelligent Miner GUI.

If you install IM for Data after you have installed IM Scoring and you want to use the model conversion facility on the IM for Data GUI, complete the following modifications:
1. Invoke the IM Scoring setup.exe
2. Select Modify
3. Add the following features to your installation:
   - PMML conversion (server)
   - PMML conversion (client)

Alternatively, you can do the following:
1. Add the contents of the file idmcsctr.add to the idmcsctr.dat file of the IM for Data client
2. Add the contents of the file idmcsstr.add to the idmcsstr.dat file of the IM for Data server
   The files idmcsctr.add and idmcsstr.add are platform-independent. You can add the contents of idmcsctr.add to the idmcsctr.dat file of an IM for Data client on different systems.
3. Copy the following executables to the bin directory of IM for Data:
   - idmxdclu.exe
   - idmxncla.exe
   - idmxnclu.exe
   - idmxnpre.exe
   - idmxrbf.exe
   - idmxrul.exe
   - idmxsreg.exe
   - idmxtree.exe


Appendix B. Installing IM Scoring Java Beans

The WIN32 directory on the installation CD contains a separate setup.exe for IM Scoring Java Beans. This installation enables you to install a subset of IM Scoring without having a DB2 database as a prerequisite. The IM Scoring Java Beans feature is also part of the IM Scoring installation. However, the IM Scoring full installation requires a DB2 database as a prerequisite.

For a description of the IM Scoring installation process and the features that are available, see Appendix A, “Installing IM Scoring” on page 145.

Installing IM Scoring Java Beans on AIX systems

Before you install IM Scoring Java Beans on an AIX system, ensure that your system meets the prerequisites.

Prerequisites for AIX systems

The prerequisites that your AIX system must meet for installing IM Scoring Java Beans are as follows:
- 30 MB of additional disk space in the /usr file system
- At least 256 MB RAM
- AIX 4.3.3 or higher
- JDK 1.3.1 or higher

Installing IM Scoring Java Beans

Use smit or smitty to install IM Scoring Java Beans on AIX systems.

To install IM Scoring Java Beans on an AIX system:
1. Log on as user root.
2. Insert the IM Scoring CD-ROM in the CD-ROM drive.
3. Type smit on the AIX command line.
4. Select the following options. These steps might differ slightly depending on the AIX version installed on your system.
   a. Software Installation and Maintenance
   b. Install and Update Software
   c. Install and Update from LATEST Available Software
   d. Enter /dev/cd0 as INPUT device
   e. Display the list of licensed software available to install. The relevant items in the list are:


   IMinerX.scoring.scoringbean (Intelligent Miner ScoringBean)
      Contains the IM Scoring Java Beans functionality of IM Scoring

   IMinerX.scoringbean.doc.en
      Contains the documentation for IM Scoring Java Beans functionality of IM Scoring in English in PDF and HTML format

   To install IM Scoring Java Beans, select the file set IMinerX.scoring.scoringbean

   To install the documentation in English, select IMinerX.scoringbean.doc.en

   On the smit installation menu, for AUTOMATICALLY install requisite software, type ’yes’.

After you have successfully installed IMinerX.scoring.scoringbean, you might need to install the appropriate license. To do this, you must repeat the installation procedure. IM Scoring and IM Scoring Java Beans share the same license. You can omit this procedure if you have already installed a license for IM Scoring.

To install the IM Scoring ’Try and Buy’ license:
   In step 4e, select the file set IMinerX.scoring.tab.license. The ’Try and Buy’ license expires after 59 days.

To install the regular IM Scoring license:
   In step 4e, select the file set IMinerX.scoring.license.

Uninstalling IM Scoring Java Beans

To uninstall IM Scoring Java Beans:
1. Log on as user root.
2. Run the standard software and maintenance procedure from smit or smitty and select the following options:
   a. Software Installation and Maintenance
   b. Software Maintenance and Utilities
   c. Remove Installed Software
3. List the installed software.
4. Follow the instructions on the smit menus to uninstall IM Scoring Java Beans.

Installing IM Scoring Java Beans on Linux systems

Before you install IM Scoring Java Beans on a Linux system, ensure that your system meets the prerequisites.

Prerequisites for Linux systems

The prerequisites that your Linux system must meet for installing IM Scoring Java Beans are as follows:
- At least 256 MB RAM
- Linux kernel 2.2.12 or higher
- JDK 1.3.1 or higher

Installing IM Scoring Java Beans

Before you can install IM Scoring Java Beans on Linux systems, IM Scoring must already be installed.

To install IM Scoring Java Beans:
1. Insert the Linux server CD-ROM in the CD-ROM drive.
2. Mount the CD-ROM by using the following command:

   mount /cdrom

   If the directory /cdrom is not listed in the file /etc/fstab, use the following command:

   mount -t iso9660 /dev/hdx /cdrom

   where /dev/hdx is your CD-ROM drive.
3. Depending on your system, go to the appropriate directory by using one of the following commands:

   On Linux/i386:
      cd /cdrom/LINUX/i386

   On Linux/s390:
      cd LINUX/s390

4. Depending on your system, use the appropriate commands to complete the installation.

   On Linux/i386:
      ./linuxInstallScoringBean

   On Linux/s390:
      ./linux390InstallScoringBean
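Steps 3 and 4 depend on the hardware platform. The following sketch shows one way to choose the directory and install script from the machine type that uname -m prints. The function name scoring_bean_installer is hypothetical, and the mapping of machine types to the two platforms is an assumption of this sketch rather than something the product defines.

```shell
#!/bin/sh
# Map a machine type, as printed by `uname -m`, to the CD-ROM subdirectory
# and install script named in steps 3 and 4.
# Assumption: Linux/i386 machines report i386, i486, i586, or i686, and
# Linux/s390 machines report s390 or s390x.
scoring_bean_installer() {
    case $1 in
        i[3-6]86)   echo "LINUX/i386/linuxInstallScoringBean" ;;
        s390|s390x) echo "LINUX/s390/linux390InstallScoringBean" ;;
        *)          echo "unsupported machine type: $1" >&2; return 1 ;;
    esac
}
```

The path that is printed is relative to the root directory of the CD-ROM. A calling script could pass "$(uname -m)" to the function, change to the directory part of the result, and run the script from there, as described in the steps above.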

Uninstalling IM Scoring Java Beans

To uninstall IM Scoring Java Beans on Linux systems, use the following commands:

On Linux/i386:
   ./linuxUninstallScoringBean

On Linux/s390:
   ./linux390UninstallScoringBean

Installing IM Scoring Java Beans on Sun Solaris systems

Before you install IM Scoring Java Beans on a Sun Solaris system, ensure that your system meets the prerequisites.

Prerequisites for Sun Solaris systems

The prerequisites that your Sun Solaris system must meet for installing IM Scoring Java Beans are as follows:
- At least 256 MB RAM
- Sun Solaris 2.6 or higher
- JDK 1.3.1 or higher

Make sure that the following patches are installed:
- 109147-09
- 108434-02
- 108435-02

To download patches, go to http://sunsolve.sun.com.

Installing IM Scoring Java Beans

To install IM Scoring Java Beans on a Sun Solaris system:
1. Log on as user root.
2. Mount the product CD to a directory of your choice, for example, /cdrom.
3. Go to the directory where you mounted the CD.
4. Go to the Sun Solaris directory by using the following command:

   cd SUN

5. Install IM Scoring Java Beans by executing the following command:

   ./sunInstallScJB

Uninstalling IM Scoring Java Beans

To uninstall IM Scoring Java Beans on a Sun Solaris system:
1. Log on as user root.
2. Mount the product CD to a directory of your choice, for example, /cdrom.
3. Go to the directory where you mounted the CD.
4. Go to the Sun Solaris directory by using the following command:

   cd SUN

5. Uninstall IM Scoring Java Beans by executing the following command:

   ./sunUninstallScJB


Installing IM Scoring Java Beans on Windows systems

Before you install IM Scoring Java Beans on a Windows system, ensure that your system meets the prerequisites.

Prerequisites for Windows systems

The prerequisites that your Windows system must meet for installing IM Scoring Java Beans are as follows:
- 30 MB of additional disk space
- At least 256 MB RAM
- Windows NT SP 6a, Windows 2000 SP2, or Windows XP
- JDK 1.3.1 or higher

Installing IM Scoring Java Beans

The installation of IM Scoring Java Beans on a Windows system is an installation feature of the DB2 IM Scoring package.


Appendix C. Migration from IM Scoring V7.1

If you want to migrate from IM Scoring V7.1 to IM Scoring V8.1, in general you can start working with IM Scoring V8.1 after performing the following steps:
1. Installing IM Scoring V8.1
2. Performing the configuration steps
3. Re-enabling your databases or enabling new databases

Note that you do not need to disable your databases before you re-enable them.

However, due to new features and also to limitations introduced with IM Scoring V8.1, there are a number of issues that you must keep in mind. These issues are described in the sections that follow. They include:
- “Working with IM Scoring V7.1 and V8.1 in parallel”
- “Exporting and importing models with the use of compression” on page 168
- “Exporting and importing models by means of DB2 Utilities” on page 168
- “Importing models in unfenced mode” on page 169
- “Applying Neural models” on page 169
- “Using the function DM_getClusterID” on page 170

Working with IM Scoring V7.1 and V8.1 in parallel

Windows platform:
   If you install IM Scoring V8.1 on a Windows machine and the installation finds IM Scoring V7.1, IM Scoring V7.1 is automatically uninstalled. This means that you cannot work with both versions on one machine on the Windows platform.

UNIX platforms:
   IM Scoring V8.1 installs into a directory that is different from the one used for IM Scoring V7.1. This means that you can have both versions installed on one machine. However, it is recommended that you migrate to IM Scoring V8.1, because there is a high risk of user error if you use both versions.

   If you want to perform scoring operations with both versions, the recommended way to do this is to use different DB2 instances. The reason for this is that an instance is enabled for a specific version of IM Scoring by means of the command idminstfunc. If you want to use the same DB2 instance for both products, disable and enable the DB2 instance before you use the second product. For this scenario, it is recommended that you use different databases.

Exporting and importing models with the use of compression

IM Scoring V8.1 stores models in a compressed format in a database. You might have a database that was enabled for IM Scoring V7.1 and you might have re-enabled it for IM Scoring V8.1. In this case, any models of types DM_ClasModel, DM_ClusteringModel, and DM_RegressionModel are still in an uncompressed format. IM Scoring V8.1 can also work with the uncompressed format. However, if you want to save disk space, you can simply compress the models by using the export and import UDFs that are provided. These UDFs are:

DM_expClasModel DM_impClasModel

DM_expClusModel DM_impClusModel

DM_expRegModel DM_impRegModel

You can also compress the models by using a sample script, which is available in the directory samples/ScoringDB2. You can directly use the sample if your models are stored in the tables that are provided. These tables are as follows:
- IDMMX.ClusterModel
- IDMMX.ClassifModels
- IDMMX.Regressionmodels

If you use tables that are different from these, you must adapt the sample to your needs before you execute it.

To execute the sample:
1. Connect to the database.
2. Call the following:

db2 -tf compressV7Models.db2

Exporting and importing models by means of DB2 Utilities

IM Scoring V8.1 has a model compression feature. For this reason, the default size for the following model types is 50 MB when you enable a database in unfenced mode:
- DM_ClasModel
- DM_ClusteringModel
- DM_RegressionModel

In IM Scoring V7.1 the default size was 100 MB for the unfenced mode.


The situation might arise where you re-enable, for the use of IM Scoring V8.1, a database that was enabled for IM Scoring V7.1. In this case, if you have models stored in the database, the model types are not recreated. If you get run-time problems because you are still working with model types whose size is 100 MB, do the following:
1. Export the models by means of the DB2 export command.
2. Disable your database.
3. Enable your database again.
4. Import all the models by means of the DB2 import command.

In the samples/ScoringDB2 directory of your installation, sample scripts are available that export and import models by means of the appropriate DB2 commands. You can directly use the samples if your models are stored in the tables that are provided; these are IDMMX.ClusterModel, IDMMX.ClassifModels, and IDMMX.Regressionmodels. If you use tables that are different from these, you must adapt the samples to your needs before you execute them. To execute the samples:
1. Connect to the database.
2. Call the following:

db2 -tf db2ExportModels.db2
idmdisabledb <databasename> [tables]
idmenabledb <databasename> [tables] [fenced|unfenced]
db2 -tf db2ImportModels.db2
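Because the order of these commands matters, it can help to generate the sequence for a given database and review it before running anything. The following is a minimal sketch under stated assumptions: the function name print_reenable_steps and the database name used in the usage note are hypothetical, the optional [tables] argument is left out, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Print the command sequence for exporting the models, re-enabling the
# database, and importing the models again.
# Arguments: the database name and the mode (fenced or unfenced).
# Nothing is executed; the output is meant to be reviewed first.
print_reenable_steps() {
    db=$1
    case $2 in
        fenced|unfenced) mode=$2 ;;
        *) echo "mode must be fenced or unfenced" >&2; return 1 ;;
    esac
    echo "db2 connect to $db"
    echo "db2 -tf db2ExportModels.db2"
    echo "idmdisabledb $db"
    echo "idmenabledb $db $mode"
    echo "db2 -tf db2ImportModels.db2"
}
```

For example, print_reenable_steps SALESDB unfenced prints the five commands for a database named SALESDB. The sample scripts are expected in the current directory, so the commands would be run from samples/ScoringDB2 or with adapted paths.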

Importing models in unfenced mode

If you have enabled your database for the unfenced mode and you want to import models that were created by IM for Data, the models cannot be imported while they are still in Intelligent Miner format. Convert the models to PMML before you import them. Use one of the following ways to do this:
- Export them directly in PMML format from IM for Data
or
- Use the command line conversion tool idmxmod to convert a model from IM format to PMML format.

Applying Neural models

Neural Clustering models, Neural Classification models, and Neural Regression models that were imported by IM Scoring V7.1 can no longer be applied by IM Scoring V8.1.

If the models are still available to you as flat files or as results in IM for Data, drop them from your tables, and import them again.


Using the function DM_getClusterID

If you use the function DM_getClusterID on a DM_ClusResult value that was returned when a Clustering model was applied, the values differ from those returned by IM Scoring V7.1. In IM Scoring V7.1, the cluster names were returned as cluster IDs. In IM Scoring V8.1, the cluster ID that is returned is the position of the cluster in the PMML model that is applied.

To get the cluster name, call the function DM_getClusterName with the cluster ID (returned by DM_getClusterID) as input parameter. Note that the cluster names and cluster IDs are shown in the IM Visualizer.


Appendix D. Coexistence with IM Modeling

On the Windows operating system, IM Modeling V8.1 can coexist only with IM Scoring V8.1. If you have IM Scoring V7.1 installed, the IM Modeling V8.1 installation removes the IM Scoring V7.1 product.

Shared schema

The schema IDMMX is shared between IM Modeling and IM Scoring.

Shared data types

The following data types are shared between IM Modeling and IM Scoring:
- DM_ClusteringModel
- DM_ClasModel
- DM_RuleModel (published only for IM Modeling, but shared internally)
- DM_LogicalDataSpec

Shared functions

The following functions are shared between IM Modeling and IM Scoring:

DM_ClusteringModel
- DM_expClusModel
- DM_getNumClusters
- DM_getClusterName
- DM_getClusMdlName
- DM_getClusMdlSpec

DM_ClasModel
- DM_expClasModel
- DM_getClasMdlSpec
- DM_getClasTarget
- DM_getClasCostRate
- DM_getClasMdlName


Shared methods

The following methods are shared between IM Modeling and IM Scoring:

DM_LogicalDataSpec
- DM_expDataSpec
- DM_getFldName
- DM_getFldType
- DM_getNumFields
- DM_impDataSpec
- DM_isCompatible

Shared commands

The following commands are shared between IM Modeling and IM Scoring:
- idmenabledb
- idmcheckdb
- idmdisabledb
- idmlicm
- idmlevel
- idminstfunc
- idmuninstfunc


Appendix E. Error messages

This appendix describes error events that can occur when you use IM Scoring. Examples of error situations are:
- A wrong SQL function is used to import or apply a mining model. For example, DM_impClusFile is used instead of DM_impClasFile to import a Classification model.
- A mining model or mining results data value is inserted into a database column that is configured for the wrong data type. For example, Clustering results are inserted into a column that is configured for data type DM_ClasResult instead of data type DM_ClusResult.
- A wrong SQL function is used to get results data. For example, the DM_getClusterID function is used instead of the DM_getConfidence function on Classification mining results data.

The following types of errors are generated by IM Scoring:

SQL states
   Identified by a five-character error event code

IM Scoring error events
   Identified by a four-digit error event code

Tip:
   If a reason code is not documented:
   1. Check that there is enough disk space.
   2. Collect all the error information that is available.
   3. Call your IBM service representative.

DB2 SQL states

38503 (SQL0430N)  The user-defined function (UDF) <function name> (<specific name> <function name>) has terminated abnormally.

Explanation: DB2 or IM Scoring might not be installed correctly.

User Response: Look for any hints in the DB2 dump file db2diag.log, or contact your IBM representative.

42724 (SQL0444N)  The routine "%1" (specific name "%2") is implemented in the library or path "%3", function "%4". The routine "%1" cannot be accessed. Reason code: "%5"

Explanation: A shared library or DLL required by IM Scoring was not found by the DB2 engine. DB2 or IM Scoring might not be installed correctly.

User Response: Check your installation, or contact your IBM representative.

57011 (SQL0973N)  Not enough storage is available in the UDF_MEM heap to process the statement.

User Response: DB2 allocates a storage area for the input and output parameters of UDFs. You can modify the size of this area by using the database manager configuration parameter UDF_MEM_SZ. This parameter indicates the size of the memory as a number of database pages.

Use the following command to update the DBM CFG for the database instance, and then restart the DB2 instance:

db2 update dbm cfg using UDF_MEM_SZ 30000

On Windows platforms, you must set an additional parameter in the DB2 registry.

DB2NTMEMSIZE indicates the upper limit in bytes for fenced UDFs. If you get the SQLSTATE 57011 for fenced UDFs, increase the value for UDF_MEM_SZ to 30000 pages and the value for DB2NTMEMSIZE to 120 MB. Use the following commands to do this:

db2 update dbm cfg using UDF_MEM_SZ 30000
db2set DB2NTMEMSIZE=APLD:120000000

Restart the DB2 instance.

IM Scoring SQL states

If you get the DB2 error message SQL0443N, it means that one of the following SQL states has occurred:

38M00

This message occurs when an IM Scoring function ended with a kernel error.

An accompanying four-digit number identifies an IM Scoring error event.

38M01

This message occurs when an IM Scoring function ended with a non-kernel error.

An accompanying four-digit number identifies an IM Scoring error event.

01HM0

This message occurs when an IM Scoring function ended with a warning.

An accompanying four-digit number identifies an IM Scoring error event.

For a listing of IM Scoring error events, see “IM Scoring error events”.

IM Scoring error events

When an IM Scoring error event occurs, the SQL message SQL0443N is displayed with one of the following SQL states:
- 38M00 for kernel error states
- 38M01 for non-kernel error states
- 01HM0 for warning states

These SQL state messages include:
- The four-digit reason code that identifies the IM Scoring event
- The text of the error message

For more information see “IM Scoring SQL states” on page 174.
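A script that wraps DB2 calls could use the three SQL states to decide how to react. The following is a minimal sketch: the function name classify_im_sqlstate is hypothetical, and obtaining the SQLSTATE from the DB2 output is left to the calling script.

```shell
#!/bin/sh
# Classify an SQLSTATE that was reported together with SQL0443N.
# Returns 0 for the three IM Scoring SQL states, 1 for anything else.
classify_im_sqlstate() {
    case $1 in
        38M00) echo "kernel error" ;;
        38M01) echo "non-kernel error" ;;
        01HM0) echo "warning" ;;
        *)     echo "not an IM Scoring SQL state"; return 1 ;;
    esac
}
```

A caller might, for instance, continue after a warning but stop after a kernel or non-kernel error and look up the accompanying four-digit reason code in the list that follows.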

The following is a list of the more important IM Scoring error messages. They appear in the order of their four-digit error code numbers. The messages are accompanied, where relevant, by explanations of their meanings and indications as to what action you should take if they appear. If you need a full list of error messages, you can find one in the idmxall.msg file, which is available in the bin directory of the installation.

1601  The codepage %1 is not supported.

1602  The initialization of the trace facility failed. No trace messages will be written.

2038  The Intelligent Miner is unable to read the result object "%1".

User Response: Check if the file exists, and verify that you have read permission on the server.

2064  The prediction result contains fields with incomplete statistics.

2109  Invalid continuous statistics object for field "%1".

2110  Invalid discrete statistics object for field "%1".

2111  The discrete statistics object is missing for field "%2", which has been used for initializing the descriptive statistics result "%1".

2112  The continuous statistics object is missing for field "%2", which has been used for initializing the descriptive statistics result "%1".

2117  The field type in the result file is not applicable for Field "%1".

2118  No field with name "%1" exists in result "%2".

2119  The active field "%1" is not active in the ’Result Statistics’ that has been used.

2120  Field "%1" was used to construct the models. For this reason, it must be defined as active field.

2216  Active field "%1" occurs more than once.

2496  Error during XML parser initialization: "%1".

2497  XML parser error in "%1". Line: "%2", Column: "%3", Message: "%4".

2498  The XML element "%1" is not unique. Specify a unique name for this element.

2500 An XML syntax error occurred at the position ″%1″ in the input record ″%2″.

Explanation: The input format for the construction DM_ApplicationData is not valid. One of the column values might contain invalid XML characters such as < or &. For example, for <, you must use &lt;, or for &, you must use &amp;.

User Response: Check the input record against the XML DTD of IM Scoring. Replace the invalid characters with the appropriate coding.
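If the column values are assembled into XML by your own program before they are passed to IM Scoring, the replacement described for message 2500 can be done with a standard library routine. The following sketch uses Python's xml.sax.saxutils.escape; the helper name escape_column_value is hypothetical and is not part of IM Scoring.

```python
from xml.sax.saxutils import escape

def escape_column_value(value):
    """Replace &, < and > in a column value with their XML entities."""
    # escape() converts "&" first, so entities it produces are not escaped again.
    return escape(str(value))
```

For example, the value A&B < C becomes A&amp;amp;B &amp;lt; C in the raw XML text, that is, the characters & and < are replaced by &amp; and &lt; as the explanation requires.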

2501 The value ″%1″ of the field ″%2″ in the XML input record cannot be converted to a numeric value.

User Response: Check if the value is a number and if the decimal separator is compatible with the language settings of the database.

2502 The DM_ApplicationData record contains the field ″%1″. This field does not exist in the mining model.

Explanation: The field ″%1″ is not an active field in the mining model. The spelling of the field name might be wrong. Note that uppercase and lowercase characters are treated as different characters.

User Response: Remove all fields that are not active in the mining model to improve the performance.

2503 The fields in the DM_ApplicationData record ″%1″ are insufficient.

Explanation: The DM_ApplicationData record does not contain as many fields as defined in the mining model.

User Response: Provide values for all active fields in the mining model.

2504 The DM_ApplicationData record ″%1″ contains too many fields.

Explanation: The DM_ApplicationData record might contain active fields that were written twice. Only one value is used.

User Response: Remove the redundant fields to improve the performance.
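The conditions behind messages 2502, 2503, and 2504 can be checked before a record is submitted. The following is a sketch under stated assumptions: the field names of the record and the active fields of the model are assumed to be available as plain lists of strings, and the function name check_record_fields is hypothetical. The comparison is case-sensitive, as noted for message 2502.

```python
from collections import Counter

def check_record_fields(record_fields, active_fields):
    """Compare the field names of one record with the active fields of a model."""
    counts = Counter(record_fields)
    active = set(active_fields)
    return {
        # Fields the model does not know (compare message 2502).
        "unknown": sorted(name for name in counts if name not in active),
        # Active fields without a value in the record (compare message 2503).
        "missing": sorted(active - set(counts)),
        # Fields written more than once (compare message 2504).
        "duplicated": sorted(name for name, n in counts.items() if n > 1),
    }
```

A record passes the check when all three lists in the result are empty.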

2505 The attribute ″%1″ is not valid for the XML input model ″%2″.

Explanation: The attribute kind=″%1″ of the element ComparisonMeasure is ignored, because it is not valid for the XML input model. The attribute kind=″%3″ is used.

User Response: To prevent further warning messages, specify attribute kind=″%3″ in your input model.

2506 The comparison measure ″%1″ is not valid for the type of the XML input model.

Explanation: The comparison measure ″%1″ is ignored, because the XML input model contains a comparison measure that is not valid for the model type. The comparison measure ″%2″ is used.

User Response: To prevent further warning messages, specify the comparison measure ″%2″ in your input model.

2507 The compare function ″%1″ is not valid for the field ″%2″.

Explanation: The compare function ″%1″ specified for the field ″%2″ is ignored, because it is not valid for the field type. The compare function ″%3″ is used.

User Response: To prevent further warning messages, specify the compare function ″%3″ in your input model.

2508 The field type ″%1″ is not supported.

2509 The closure ″%1″ is not supported. Closure ″%2″ is used instead.

2530 The PMML model contains ″%2″. It must contain element ″%1″.

2531 The PMML model does not contain the mandatory attribute ″%1″ in element ″%2 %3″.

2532 The field ″%1″ does not have valid values.

2533 A restrained validity domain is defined for the continuous field ″%1″ in the PMML model.

Explanation: This validity domain is ignored. All values are valid for the continuous field ″%1″.

2534 The field ″%1″ is used in the model but is not declared in the data dictionary.

2535 All continuous fields must have the same outlier treatment.

Explanation: The field ″%1″ is indicated with the outlier treatment ″%2″ and the current field with the outlier treatment ″%3″. The outlier treatment ″%3″ will be used for all of the fields.

2536 The outlier treatment ″%1″ is not defined in PMML.

Explanation: The outlier treatment ″%1″ cannot be written in the PMML model. asIs is used instead. This causes differences between the Intelligent Miner model and the PMML model.

2537 The value ″%1″ is not valid for the field ″%2″.

2538 The statistics of the field ″%1″ are inconsistent.

Explanation: The two arrays of values and frequencies have different lengths.

2539 The interval between ″%1″ and ″%2″ for the field ″%3″ is not valid.

2540 There is a gap between ″%1″ and ″%2″ in the intervals of the field ″%3″.
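To locate the kind of gap reported by message 2540 before a model is loaded, the intervals of a field can be compared pairwise. The following sketch assumes that the intervals have already been extracted from the model as (lower, upper) pairs and that two intervals touch when the upper boundary of one equals the lower boundary of the next. The function name find_gaps is hypothetical.

```python
def find_gaps(intervals):
    """Return the (end, start) pairs between which no interval is defined."""
    ordered = sorted(intervals)
    return [
        (previous_upper, next_lower)
        for (_, previous_upper), (next_lower, _) in zip(ordered, ordered[1:])
        if next_lower > previous_upper
    ]
```

An empty result means that the intervals cover the range without gaps.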

2541 The number of statistics ″%1″ in ModelStats exceeds the number of fields in the data dictionary.

2542 The compare function is not supported for the element ComparisonMeasure.

Explanation: The default compare function ″%1″ is ignored.

User Response: Write the compare function ″%1″ in every Clustering field that uses it.

2543 The similarity matrix for the field ″%1″ is not valid and will not be used.

2549 The field ″%1″ has a taxonomy ″%2″.

Explanation: Taxonomies are not supported. The field ″%1″ will be used without a taxonomy.

2550 The PMML model contains a non-empty TransformationDirectory element.

Explanation: Computed fields are not supported and cannot be ignored. Therefore, it is not possible to use this model in Intelligent Miner.

2600 The name mapping ″%1″ is not complete.

User Response: Verify that you have specified a name mapping name, a table name, and two column names.

2601 The name mapping ″%1″ does not point to a valid data source.

Explanation: The table ″%2″ or the columns ″%3″ and ″%4″ that are defined in the name mapping ″%1″ are not accessible.

User Response: Make sure that the data source exists and that it can be read.

2602 Name mappings with the name ″%1″ exist already.

Explanation: You must specify unique names for name mappings.

User Response: Remove or rename any name mappings that have duplicate names.

2605 The matrix ″%1″ is not complete.

User Response: Verify that you have specified a matrix name, a table name, and three column names.

2606 The matrix ″%1″ is not complete.

Explanation: Parts of the matrix ″%1″ do not have the correct format.

User Response: Use the standard SQL functions to build the matrix.

2607 The matrix ″%1″ does not contain a valid data source.

Explanation: The table ″%2″ or the columns ″%3″, ″%4″, and ″%5″ that are defined in the matrix ″%1″ are not accessible.

User Response: Make sure that the data source exists and that it can be read.

2608 There are two matrices with the name ″%1″.

Explanation: You must specify unique names for matrices.

User Response: Remove or rename any matrices that have duplicate names.

2609 The list of ″%1″ values does not match the number of rows ″%2″ or the number of columns ″%3″ in the matrix ″%4″.

Explanation: Each row or column in the matrix must match a value in the list of values.

User Response: Make sure that the matrix is square and that there are as many values as the size of the matrix.
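The condition behind message 2609 can be verified on the matrix table before the mining run. The following sketch assumes that the table rows have been read as (row value, column value, entry) triples, which is only a guess based on the three column names mentioned for message 2605. The function name matrix_matches_values is hypothetical.

```python
def matrix_matches_values(values, triples):
    """Check that the row and column labels of a matrix equal the list of values."""
    expected = set(values)
    row_labels = {row for row, _, _ in triples}
    column_labels = {column for _, column, _ in triples}
    # Equal label sets imply a square matrix whose size is the number of values.
    return row_labels == expected and column_labels == expected
```

The check looks at labels only. It does not verify that every cell of the matrix has an entry.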

2610 The XML parameters do not contain a valid task element.

2611 The XML parameters do not contain the mining data element ″%1″.

2612 The XML parameters do not contain a logical data specification.

2613 The XML parameters do not contain settings.

2614 The XML parameters do not contain clustering settings.

2615 The XML parameters do not contain classification settings.

2616 The XML parameters do not contain association rules settings.

2620 The mining data is not completely defined.

Explanation: Some of the attributes or subelements that define a mining data value are not present.

User Response: Verify that you have specified a table name and a list of column names and aliases.

2621 The mining data does not correspond to a valid data source.

Explanation: It is not possible to access the table ″%1″ or the columns whose aliases match the field names in the logical data specification.

User Response: Make sure that the data source exists and that it can be read.

2622 The mining data contains two columns with the same name, ″%1″.

Explanation: Columns must have unique names.

User Response: Remove or rename one of these columns.

2623 The mining data contains two columns with the same alias, ″%1″.

Explanation: Columns must have unique aliases.

User Response: Remove one of these columns, or change its alias.

2624 The mining data contains a column ″%1″ with an empty alias.

Explanation: Columns using an empty or a blank alias are not allowed.

User Response: Change the alias of this column.
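The rules behind messages 2622, 2623, and 2624 can be checked on the column list before the mining data is defined. The following sketch assumes that the columns are available as (name, alias) pairs of strings. The function name check_columns is hypothetical.

```python
from collections import Counter

def check_columns(columns):
    """Report duplicate names, duplicate aliases, and empty or blank aliases."""
    name_counts = Counter(name for name, _ in columns)
    alias_counts = Counter(alias for _, alias in columns if alias.strip())
    return {
        "duplicate_names": sorted(n for n, c in name_counts.items() if c > 1),
        "duplicate_aliases": sorted(a for a, c in alias_counts.items() if c > 1),
        "empty_alias_columns": [name for name, alias in columns if not alias.strip()],
    }
```

Blank aliases are reported separately and are not counted as duplicates of each other.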

2625 The logical data specification is not completely defined.

Explanation: Some of the attributes or subelements that define the logical data specification are not present.

User Response: Verify that you have specified a non-empty list of field names and field types.

2626 The logical data specification contains a field with an empty name.

Explanation: Fields containing an empty or a blank name are not allowed.

User Response: Change the name of this field.

2627 The type ″%1″ of the field ″%2″ is not defined.

Explanation: Only the types ’categorical’ and ’numerical’ are supported.

User Response: Choose either the categorical or numerical type for the field ″%2″.

2628 There is no match between the field name ″%1″ and the alias of a column in the mining data.

Explanation: The field name must match the alias of a column in the mining data.

User Response: Make sure that all the field names match column aliases in the mining data.

2629 There is no name mapping ″%1″ for the field ″%2″.

User Response: Remove the reference to name mapping ″%1″ in the field ″%2″, or add a name mapping ″%1″.

2630 The numeric field ″%1″ has only one limit.

Explanation: For numeric fields, a non-outlier range can be specified by giving a lower and an upper boundary. Because only one of these limits has been specified, the limit will be ignored.

User Response: Either specify no limits, or specify the lower and upper boundaries.
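The behavior described for message 2630 can be mirrored in a small helper that either accepts both boundaries or none. This is a sketch only: the function name resolve_limits is hypothetical, and None stands for a limit that was not specified.

```python
def resolve_limits(lower=None, upper=None):
    """Return (limits, warn): limits is None unless both boundaries are given."""
    given = [value is not None for value in (lower, upper)]
    if all(given):
        return (lower, upper), False
    # A single boundary is ignored; warn is True only in that case.
    return None, any(given)
```

The second element of the result tells the caller whether a single limit was dropped, which is the situation that message 2630 reports.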

2635 The value ″%2″ of power option ″%1″ is not valid.

User Response: Do not specify this power option, or specify a valid value for it (not documented).

2636 The field ″%1″ referenced in the settings is not known.

Explanation: This field either has no name or is not present in the logical data specification.

User Response: Remove the reference to this field, or use a valid non-empty name for it.

2637 The outlier treatment ″%1″ for the field ″%2″ is invalid.

Explanation: The only valid outlier treatments are asIs, asMissing, and asExtreme.

User Response: Remove or change the outlier treatment for field ″%2″.

2638 An outlier treatment is defined for the categorical field ″%1″.

Explanation: Outlier treatments apply only to numerical fields. The outlier treatment for field ″%1″ will be ignored.

User Response: To avoid getting this warning message, remove the outlier treatment for field ″%1″.

2639 The value for the desired execution time, ″%1″, is invalid.

Explanation: The desired execution time must be greater than or equal to 0, zero meaning no time limitation.

User Response: Remove or change the value of the desired execution time.

2640 The value for the minimum percentage of data, ″%1″, is not 100.

Explanation: When no limit is set for the execution time, all the data will be read. The value that was specified for the minimum percentage of data will be ignored.

User Response: To avoid getting this warning message, remove the value for the minimum percentage of data, which defaults to 100, or set it explicitly to 100.

2641 The value for the minimum percentage of data, ″%1″, is not valid.

Explanation: The value must be between 0 and 100.

User Response: Remove the value for the minimum percentage of data, or set it to a value between 0 and 100.
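Messages 2639, 2640, and 2641 depend on each other: the minimum percentage of data matters only when an execution time limit is set. The following sketch checks the two settings together. The function name check_time_settings is hypothetical, and the range 0 to 100 is assumed to include both end points.

```python
def check_time_settings(desired_execution_time, min_percentage=100):
    """Return the message numbers that the two settings would trigger."""
    codes = []
    if desired_execution_time < 0:
        codes.append(2639)  # execution time must be 0 or greater
    if not 0 <= min_percentage <= 100:
        codes.append(2641)  # percentage outside the valid range
    elif desired_execution_time == 0 and min_percentage != 100:
        codes.append(2640)  # no time limit, so the percentage is ignored
    return codes
```

An empty list means that neither setting would produce one of these messages.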

2642 A field weight is defined for the supplementary field ″%1″.

Explanation: Field weights apply only to active fields. The field weight for the field ″%1″ will be ignored.

User Response: To avoid getting this warning message, remove the field weight for field ″%1″.

2643 An outlier treatment is defined for the supplementary field ″%1″.

Explanation: Outlier treatments apply only to active fields. The outlier treatment for field ″%1″ will be ignored.

User Response: To avoid getting this warning message, remove the outlier treatment for field ″%1″.

2645 The field usage type ″%1″ is not supported in clustering settings.

Explanation: Only active and supplementary fields are supported for Clustering.

User Response: Change the usage type of the field ″%2″.

2646 The field usage type ″%1″ is not supported in classification settings.

Explanation: Only active and target fields are supported for Classification.

User Response: Change the usage type of the field ″%2″.

2647 The field usage type ″%1″ is not supported in association rules settings.

Explanation: Only group and item fields are supported for association rules.

User Response: Change the usage type of the field ″%2″.

2650 The value for the maximum number of clusters, ″%1″, is invalid.

Explanation: The value for the maximum number of clusters must be greater than or equal to 0, zero meaning no upper limit.

User Response: Remove or change the value for the maximum number of clusters.

2651 There is no similarity matrix ″%1″ for the field ″%2″.

User Response: Remove the reference to matrix ″%1″ in field ″%2″, or add a matrix ″%1″.

2652 A value for similarity scale is defined for the categorical field ″%1″.

Explanation: Similarity scales apply only to numerical fields. The similarity scale value for field ″%1″ will be ignored.

User Response: To avoid getting this warning message, remove the similarity scale value for field ″%1″.

2653 A value for similarity scale is defined for the supplementary field ″%1″.

Explanation: Similarity scales apply only to active fields. The similarity scale value for field ″%1″ will be ignored.

User Response: To avoid getting this warning message, remove the similarity scale value for field ″%1″.

2654 A similarity matrix is defined for the numerical field ″%1″.

Explanation: Similarity matrices apply only to categorical fields. The similarity matrix for field ″%1″ will be ignored.

User Response: To avoid getting this warning message, remove the similarity matrix for field ″%1″.

2655 A similarity matrix is defined for the supplementary field ″%1″.

Explanation: Similarity matrices apply only to active fields. The similarity matrix for field ″%1″ will be ignored.

User Response: To avoid getting this warning message, remove the similarity matrix for field ″%1″.

2656 The value weighting ″%1″ for the field ″%2″ is invalid.

Explanation: The only valid value weightings are info, prob, compInfo, and compProb.

User Response: Remove or change the value weighting for the field ″%2″.

2657 A value weighting is defined for the supplementary field ″%1″.

Explanation: Value weightings apply only to active fields. The value weighting for field ″%1″ will be ignored.

User Response: To avoid getting this warning message, remove the value weighting for the field ″%1″.

2658 The similarity threshold ″%1″ is invalid.

Explanation: The similarity threshold must be between 0 and 1.

User Response: Remove or change the value for the similarity threshold.

2660 There is no cost matrix ″%1″.

Explanation: The Classification settings value references a cost matrix ″%1″ that does not exist.

User Response: Remove the reference to cost matrix ″%1″, or add a matrix ″%1″.

2661 An input model is specified for the training phase.

Explanation: The use of an input model is not supported during the training phase. Input models are expected only during the test phase.

User Response: Do not define an input model for this task.

2662 No input model is specified for the test phase.

Explanation: The test phase can be processed only if an input model is given.

User Response: Specify an input model for this test task.

2663 More than one target field is specified.

Explanation: Only one target field may be specified.

User Response: Specify one of the two fields ″%1″ or ″%2″ as the target field.

2664 The target field ″%1″ is a numerical field.

Explanation: Only categorical fields may be the target field of a Classification algorithm.

User Response: Choose a categorical field as the target.

2665 No target field is specified in the classification task.

Explanation: A target field (one only) must be specified.

User Response: Specify one categorical field as the target.
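Messages 2663, 2664, and 2665 together require exactly one categorical target field. The following sketch applies that rule to a field list. It assumes that the fields are available as (name, type, usage) tuples and that the usage of the target field is spelled "target". Both the representation and the function name check_target_field are hypothetical.

```python
def check_target_field(fields):
    """Return the message number a field list would trigger, or None if it is valid."""
    targets = [(name, ftype) for name, ftype, usage in fields if usage == "target"]
    if not targets:
        return 2665  # no target field
    if len(targets) > 1:
        return 2663  # more than one target field
    if targets[0][1] != "categorical":
        return 2664  # the single target field is not categorical
    return None
```

The same pattern applies to the group and item fields of an association rules task, which follow the equivalent rules.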

2666 The field ″%1″ is defined in the input model but not in the test task.

Explanation: The fields in the input model and in the test task must match.

User Response: Verify that you have specified an input model and a test task that are compatible.

2667 Some of the fields have field weights.

Explanation: Field weights are not considered for classification. The field weights will be ignored.

User Response: To avoid getting this warning message, remove the field weights from any fields that have them.

2668 Some of the fields have outlier treatments.

Explanation: Outlier treatments are not considered for Classification. The outlier treatments will be ignored.

User Response: To avoid getting this warning message, remove the outlier treatments from any fields that have them.

2669 The value for maximum tree depth ″%1″ is invalid.

Explanation: The maximum tree depth value must be greater than or equal to 0, zero meaning no upper limit.

User Response: Remove or change the value for the maximum tree depth.

2670 The value for minimum purity, ″%1″, is invalid.

Explanation: The minimum purity value must be between 0 and 100.

User Response: Remove or change the value for the minimum purity.

2671 The value for the minimum number of records per node, ″%1″, is invalid.

Explanation: This value must be greater than or equal to 0.

User Response: Remove or change the value for the minimum number of records per node.

2675 The category map ″%1″ is not completely defined.

Explanation: Some of the attributes that define a unique category map are not present.

User Response: Verify that you have specified a name, a table name, and two column names.

2676 The taxonomy ″%1″ is not correctly defined.

Explanation: The taxonomy either has no name or does not contain a category map.

User Response: Use the standard SQL functions to build the category map.

2677 The category map ″%1″ does not contain a valid data source.

Explanation: It is not possible to access the table ″%2″ or the columns ″%3″ and ″%4″ defined in the category map ″%1″.

User Response: Make sure that the data source exists and that it can be read.

2678 There are two category maps with the same name, ″%1″.

Explanation: Duplicate names are not allowed for the category maps in a taxonomy.

User Response: Remove or rename one of these category maps.

2679 There is no name mapping ″%1″ for the category map ″%2″.

User Response: Remove the reference to name mapping ″%1″ in the category map ″%2″, or add a name mapping ″%1″.

2680 There is more than one group field.

Explanation: Only one group field is allowed.

User Response: Specify one of the two fields ″%1″ or ″%2″ as the group field.

2681 The group field ″%1″ is a numerical field.

Explanation: Only a categorical field is allowed to be the group field for an association rules algorithm.

User Response: Choose a categorical field as the group field.

2682 No group field is specified in the association rules task.

Explanation: A group field (one only) must be specified.

User Response: Specify one categorical field as the group field.

2683 More than one item field is specified.

Explanation: Only one item field is allowed.

User Response: Specify one of the two fields ″%1″ or ″%2″ as the item field.

2684 The item field ″%1″ is a numerical field.

Explanation: The item field for an association rules algorithm must be categorical.

User Response: Choose a categorical field as the item field.

2685 No item field is specified in the association rules task.

Explanation: An item field (one only) must be specified.

User Response: Specify one categorical field as the item field.

2686 There is no taxonomy ″%1″ for the item field ″%2″.

User Response: Remove the reference to taxonomy ″%1″ in the item field ″%2″, or add a taxonomy ″%1″.

2687 The item constraints are not correctly defined for the association rules settings.

Explanation: The item constraints values either specify an unknown type or do not contain any constraints on items.

User Response: Use the standard SQL functions to build the item constraints values.

2688 The value for minimum support, ″%1″, is invalid.

Explanation: The minimum support value must be between 0 and 100.

User Response: Remove or change the value for minimum support.

2689 The value for minimum confidence, ″%1″, is invalid.

Explanation: The minimum confidence value must be between 0 and 100.

User Response: Remove or change the value for minimum confidence.

2690 The value for maximum rule length, ″%1″, is invalid.

Explanation: The maximum rule length value must be greater than or equal to 0, zero meaning no upper limit.

User Response: Remove or change the value for the maximum rule length.

2691 Some of the fields have field weights.

Explanation: Field weights are not considered for association rules. The field weights will be ignored.

User Response: To avoid getting this warning message, remove the field weights from any fields that have them.

2692 Some of the fields have outlier treatments.

Explanation: Outlier treatments are not considered for association rules. The outlier treatments will be ignored.

User Response: To avoid getting this warning message, remove the outlier treatments from any fields that have them.

2693 The cost rate ″%1″ in the classification settings is invalid.

Explanation: The cost rate must be between 0 and 100.

User Response: Remove or change the value for the cost rate in the Classification settings.

2700 The categorical field ″%1″ has more than ″%2″ values.

Explanation: Categorical fields with too many values degrade performance, largely without improving the mining result. For this reason, the maximum number of values considered is limited to ″%2″. The statistics containing information about the first ″%2″ values of the field ″%1″ are stored; the other values are considered invalid.

User Response: Verify that the field ″%1″ is needed in this mining run and that all of its values are useful. If necessary, preprocess the data in order to reduce the number of values or to have the important values first.
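One possible way to reduce the number of values in a preprocessing step is to keep the most frequent values and to merge all others into a single substitute value. This technique is an assumption of this example and is not prescribed by IM Scoring. The function name limit_categories and the substitute value OTHER are hypothetical.

```python
from collections import Counter

def limit_categories(values, max_values, other="OTHER"):
    """Keep the most frequent values so that at most max_values distinct values remain."""
    counts = Counter(values)
    if len(counts) <= max_values:
        return list(values)
    # Reserve one of the max_values slots for the substitute value.
    keep = {value for value, _ in counts.most_common(max_values - 1)}
    return [value if value in keep else other for value in values]
```

Choose a substitute value that does not already occur in the field, so that merged values cannot be confused with original ones.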

2701 The field ″%1″ is of the ordinal type.

Explanation: The ordinal type is not supported. The field ″%1″ will be considered to be of the categorical type.

2771 You are now using a temporary license. You need to enroll ″%1″ within ″%2″ days in the license file ″%3″.

Explanation: You are now using the ’Try and Buy’ version. The number of days after which the temporary key expires is ″%2″.

User Response: Buy a production license and replace the nodelock files, or uninstall the product after the temporary license has expired.

2772 The number of days left is ″%2″. You are still using a temporary license. You need to enroll ″%1″ in the license file ″%3″.

Explanation: You are using the ’Try and Buy’ version. The number of days after which the temporary key expires is ″%2″.

User Response: Buy a production license and replace the nodelock files, or uninstall the product after the temporary license has expired.

2773 An error occurred during the initializing of License Use Management for the product ″%1″ with the license file ″%2″. Check your installation.

Explanation: An attempt was made to install a temporary license in nodelock files. The permissions to write to the directory /IMinerX/bin might be missing.

User Response: Check your installation. Run the idmlicm executable as a user who has write permissions for <INSTALLDIR>/IMinerX/bin (for example, root for UNIX or Administrator for Windows).

2774 The license has expired. If it was a ’Try and Buy’ license, you can now enroll ″%1″ in the nodelock file ″%2″.

Explanation: An attempt was probably made to use a temporary ’Try and Buy’ license that has now expired.

User Response: Buy a production license and replace the nodelock files, or uninstall the product after the temporary license has expired.

2775 The product ″%1″ does not have a license enrolled. A ’Try and Buy’ license could not be added. Use the idmlicm command to check your license status. Verify that you have installed all the necessary components.

User Response: On UNIX systems you need to invoke the idmlicm command as root user to enroll a temporary ’Try and Buy’ license. Production licenses are a separate installation option available on the installation media.

3112 An error occurred when the clustering model was read.

3113 The field ″%1″ does not appear in the clustering model.

User Response: For this reason, the Intelligent Miner cannot apply this model to your current data. Check the input fields you specified and the model. Correct any mismatch. If there is no mismatch, contact your IBM representative.

3135 A faulty record has been found and skipped.

3136 The field ″%1″ was not used when the model was built.

User Response: Remove this field from the active fields list.

3143 You use a Version 1 result for application mode.

Explanation: This result contains no distance units.

3146 The discrete numeric field ″%1″ has ″%2″ different values.

User Response: For this reason, the similarity matrix for this field needs a lot of space. This might cause your system to run out of memory. It is recommended that you define this field as a continuous field.

3147 You are converting a Version 1 result to XML. This model does not contain distance units.

Explanation: If a model does not contain distance units, distance units are calculated by default. These default values might not correspond to the distance units used to create the result.

User Response: If necessary, you can change these default values in the attribute similarityScale of all elements ClusteringField in the XML model.

3148 You are using an XML model that does not contain the attribute similarityScale for numeric fields. Therefore distance units are calculated by default.

3149 You are converting a Version 6.1 model to XML. This model might contain an erroneous outlier treatment. It also does not contain any similarity definitions.

Explanation: Version 6.1 results do not contain the outlier treatment and the similarity definitions you might have specified. Outliers are treated as missing values. Similarity definitions are not used.

User Response: Upgrade your Intelligent Miner software to the latest version and export this model again.

3203 An error occurred as the Intelligent Miner tried to read the next record (rc=″%1″).

3205 The field ″%1″ is defined as active more than once, or as active and supplementary.

Explanation: The Intelligent Miner considers only the first ″active″ declaration. The other specifications are ignored.

3206 The field ″%1″ is defined as both the prediction field and as active.

User Response: The Intelligent Miner considers the field to be the prediction field. The other specification is ignored.

3207 The field ″%1″ is defined as both the prediction field and as supplementary.

Explanation: The Intelligent Miner considers the field to be the prediction field. The other specification is ignored.

3226 An error occurred when the Intelligent Miner read the result object.

3227 The field ″%1″ is defined as active in the settings object, but not in the specified result object.

User Response: Check if you specified the correct result object or if the result object was created with an older version of Intelligent Miner. If the latter is the case, create a new result object in training mode.

3228 The number of active fields selected for this mining run must equal the number of active fields in the result object.

3232 The field ″%1″ was not specified when the function was run in training mode. For this reason, the field is ignored in test or application mode.

3233 The Intelligent Miner cannot find a value for the expected region among the input regions. The result object might be damaged.

3260 Quantiles cannot be computed because there are no quantiles in the result object.

3270 There is no data to mine.

Explanation: The input data object might refer to an empty file or database table. Alternatively, a filter record condition was specified that excludes all records in the input data.

User Response: Check the input data and any filter records conditions.

4405 Cannot load the input results specified (or none are specified).

4407 The current data source does not match data used for training.

4440 The class field is continuous.

Explanation: Neural Classification requires a discrete data type.

4441 The predicted field is not numeric.

Explanation: Neural prediction requires a numeric data type.

4442 The categorical field ″%1″ has ″%2″ different values.

Explanation: Without automatic normalization at most two different values are allowed.

User Response: Use automatic normalization for your input data, or clean the input data source to remove extraneous values.

4470 The model cannot be loaded. A required tag ″%1″ is missing in the model. The model is incomplete or damaged. Try to recreate the model.

Explanation: An XML tag that is specified as mandatory in the relevant PMML is missing. Without this tag, a complete model cannot be constructed.

User Response: Recreate the model. If the problem persists, ask the provider of the PMML model for a valid version.

4471 An internal error has occurred. Contact your IBM representative.

Explanation: An internal program error has occurred.

User Response: Try again. If the problem persists and you can reproduce the error, contact your IBM representative.

4472 There is not enough free memory available to complete the requested operation. Close some applications, and try again.

User Response: Free some memory by closing any other running applications. If the problem persists, try to extend your virtual memory or swap partition size, or install more RAM.

4473 The model cannot be loaded. The measure specified in the PMML is not supported in this release. The measures supported are Euclidean and squared Euclidean. Change your model accordingly.

Explanation: The PMML model specifies a measure for the Kohonen Network that is not supported in this release.

User Response: All measures specified in the relevant PMML core are supported. Therefore, you have a PMML model that uses a measure not included in the core. Try to recreate the model, and specify a measure included in the PMML core.

4474 The model cannot be loaded. The compare function ″%1″ is not supported in this release.

Explanation: The PMML model specifies a compare function for the Kohonen Network that is not supported in this release.

User Response: All compare functions specified in the relevant PMML core are supported. Therefore, you have a PMML model that uses a compare function not included in the core. Try to recreate the model, and specify a compare function included in the PMML core.

4475 The model is inconsistent. The number of centers in the clusters does not match the previous information. The model is incomplete or damaged. Try to recreate the model.

Explanation: The number of clusters in the PMML model differs in the relevant sections. The model is inconsistent. Therefore, you cannot apply this model.

User Response: Try to recreate the model. If the problem persists, ask the provider of the PMML model to correct it.

4476 The model cannot be loaded. PMML version ″%1″ is not supported in this release.

Explanation: The PMML version specified in the model is not supported in this release.

User Response: Either check for a new version of this product that supports the relevant PMML version, or try to export the PMML model to a version supported by this product.

4477 The model cannot be loaded. The activation function ″%1″ is not supported in this release.

Explanation: The PMML model specifies an activation function for the Neural Network that is not supported in this release.

User Response: All activation functions specified in the relevant PMML core are supported. Therefore, you have a PMML model that uses an activation function not included in the core. Try to recreate the model, and specify an activation function included in the PMML core.

4479 The model cannot be loaded. The neuron ″%1″ could not be connected with the neuron ″%2″. No neuron with the ID ″%2″ was found in the network. The model is incomplete or damaged. Try to recreate the model.

Explanation: The neural network includes a neuron that is connected to another neuron whose ID does not exist. The PMML model is invalid.

User Response: Recreate the model. If the problem persists, ask the provider of the PMML model for a valid version.
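Before asking the model provider for a new version, it can help to locate the broken connection described by message 4479. The following Python sketch is not part of IM Scoring. It assumes the usual PMML neural network layout, in which inputs are `NeuralInput` elements with an `id` attribute and each `Neuron` element lists its incoming connections as `Con` elements with a `from` attribute. The function name `dangling_connections` and the sample model are made up for illustration.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix such as '{uri}Neuron' -> 'Neuron'."""
    return tag.rsplit("}", 1)[-1]


def dangling_connections(pmml_text):
    """Return (neuron id, missing source id) pairs for connections
    whose source ID is not defined anywhere in the network."""
    root = ET.fromstring(pmml_text)

    # Every ID that a connection may legally refer to (assumed element names).
    known = set()
    for el in root.iter():
        if local(el.tag) in ("NeuralInput", "Neuron"):
            known.add(el.get("id"))

    missing = []
    for neuron in root.iter():
        if local(neuron.tag) != "Neuron":
            continue
        for con in neuron:
            if local(con.tag) == "Con" and con.get("from") not in known:
                missing.append((neuron.get("id"), con.get("from")))
    return missing


# Hypothetical, heavily abbreviated model: neuron 1 refers to an ID 7 that does not exist.
SAMPLE = """<PMML version="2.0"><NeuralNetwork>
  <NeuralInputs><NeuralInput id="0"/></NeuralInputs>
  <NeuralLayer>
    <Neuron id="1"><Con from="0" weight="0.5"/><Con from="7" weight="0.1"/></Neuron>
  </NeuralLayer>
</NeuralNetwork></PMML>"""

print(dangling_connections(SAMPLE))  # [('1', '7')]
```

An empty list means that every connection points at a known ID. It does not prove that the model is otherwise valid.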

4480 The model cannot be loaded. The connections of the output layer are inconsistent. The model is incomplete or damaged. Try to recreate the model.

Explanation: The number of outputs does not match the number of neurons in the output layer. The model is invalid.

User Response: Recreate the model. If the problem persists, ask the provider of the PMML model for a valid version.

4481 The model cannot be loaded. The field names in the ’CenterFields’ tag do not match the ones in the ’ClusteringField’ tags. The model is incomplete or damaged. Try to recreate the model.

Explanation: The names of the clusters are not consistent throughout the PMML model. The model is invalid.

User Response: Recreate the model. If the problem persists, ask the provider of the PMML model for a valid version.

4482 The model cannot be loaded. This is not a PMML model. The model is incomplete or damaged. Try to recreate the model.

Explanation: An attempt was made to score data with something that is not a PMML model.

User Response: Ensure that you specify a valid PMML model.

4483 The record cannot be scored. An invalid record was received.

Explanation: The record passed to the model refers to variable names that are completely different from those that the model expects.

User Response: Make sure that the column names in the data match the variable names in the model.

4484 The requested result is not available. An attempt was made to retrieve a result for a classification model; however, this is a value-prediction model.

Explanation: An attempt was made to retrieve a Classification result from a model that does value prediction. Most probably, the wrong model was chosen.

User Response: Make sure that you choose a Classification model.

4485 The requested result is not available. An attempt was made to retrieve a result for a value-prediction model; however, this is a classification model.

Explanation: An attempt was made to retrieve a value prediction result from a model that does classification. Most probably, the wrong model was chosen.

User Response: Make sure that you choose a value prediction model.

4486 The result is invalid. The result value could not be denormalized because it is an outlier. Adjust the normalization parameters in the output layer of the model.

Explanation: The output value of the neural network cannot be denormalized because it is out of the denormalization range. You cannot score this record.

4487 The result is invalid. It was not possible to map the result to a string for a classification result.

Explanation: The result of the classification is invalid. You cannot score this record.

4488 The model is invalid. You cannot use the model with this release. Convert the original model again, using the conversion utilities of this release.

Explanation: The PMML model cannot be applied with this release.

User Response: Convert the original model again, using the conversion utilities of this release.

4489 The data cannot be scored. The model is not a value prediction model.

Explanation: The model you specified is not a value prediction model. Most probably you have a Classification model.

User Response: Specify a value prediction model.

4490 The data cannot be scored. The model is not a classification model.

Explanation: The model you specified is not a Classification model. Most probably you have a value prediction model.

User Response: Specify a Classification model.

4492 The model is inconsistent. The value ″%1″ could not be set as a missing value replacement.

Explanation: A replacement for missing values was specified in the PMML, but is invalid in the context of the mining field.

User Response: Verify that you specify a valid PMML model.

6001 The command line option that selects the method for building the input data records was found, but no method was specified.

Explanation: The command line option that selects the desired method for building input data records in the SQL script was not given correctly. The M switch must be followed by a value that identifies one of the methods that allow input data records to be built.

User Response: Correct the command line option that was specified incorrectly. If necessary, review your documentation of command line options.

6002 The command line option that selects the method for building the input data records was found twice.

Explanation: The same command line option cannot be used twice.

6003 The command line option that selects the method for building the input data records was found, but the specified method name is invalid.

Explanation: The command line option that selects the desired method for building input data records in the SQL script was not given correctly. The value identifying one of the permitted methods is illegal.

User Response: Use one of the method names that are allowed. If necessary, review your documentation of command line options.

6004 The command line option for the SQL dialect was found, but no SQL dialect identifier was specified.

Explanation: The command line option for selecting the SQL dialect was not given correctly. The D switch must be followed by a value identifying one of the SQL dialects (DB2 or Oracle) that are allowed.

User Response: Correct the command line option that has been incorrectly specified. If necessary, review your documentation of command line options.

6005 The command line option for the SQL dialect was found twice.

Explanation: The same command line option cannot be used twice.

6006 The command line option for the SQL dialect was found, but the specified SQL dialect is invalid.

Explanation: The command line option for selecting the SQL dialect was not specified correctly. The value identifying one of the permitted SQL dialects is illegal.

User Response: Use one of the SQL dialects that are allowed. If necessary, review your documentation of command line options.

6007 The command line option for encoding the PMML file was found, but no encoding identifier was specified.

Explanation: The command line option for encoding the codepage of the PMML file was not given correctly. The E switch must be followed by a valid encoding string.

User Response: Correct the command line option that was specified incorrectly. If necessary, review your documentation of command line options.

6008 The command line option for encoding the PMML file was found twice.

Explanation: The same command line option cannot be used twice.

6011 The method DM_ApplColumn is not supported for DB2 SQL.

Explanation: The combination of the method DM_ApplColumn and the SQL dialect DB2 is not allowed. DM_ApplColumn is specific to Oracle.

User Response: Use a different method to build input data records for DB2 SQL scripts.

6012 The output file has the same name as the input file.

Explanation: The output file cannot have the same name as the input file.

User Response: Use a different name for the output file.

6013 No input file is specified.

Explanation: An input PMML file is a mandatory command line parameter.

User Response: Specify the PMML input file on the command line.

6014 The output file cannot be opened for write.

Explanation: The output file cannot be written. The file name or path name might not be valid, or the permission rights may not allow the file to be created, to be written, or both.

User Response: Ensure that the output file can be created, and that it can be opened for write.

6015 More than one output file was found in the command line.

Explanation: Only one output file is allowed. If no output file is given, the result is written to standard output.

6016 The PMML file ″%1″ cannot be processed.

Explanation: See the preceding errors in the error file for a more detailed error report.

6017 No model name was found in the PMML file.

Explanation: Either the PMML model in the input file does not contain a model name or the file cannot be parsed. The SQL script cannot be generated without a model name.

User Response: Either use a PMML file that contains a model name or insert the XML element specifying the model name into the PMML file. Ensure that the file contains correct PMML. See if the error file contains more information about possible parsing errors.

6018 The model type cannot be determined.

Explanation: The PMML file cannot be parsed correctly.

User Response: Ensure that the file contains correct PMML. See if the error file contains more information about possible parsing errors.

6019 The data fields for the model cannot be determined.

Explanation: The PMML file cannot be parsed correctly.

User Response: Ensure that the file contains correct PMML. See if the error file contains more information about possible parsing errors.

6020 The output SQL script cannot be generated.

Explanation: Errors occur when the SQL script is being generated.

User Response: See if the error file contains a previous error with more detailed information about possible errors.

6021 The model type is illegal.

Explanation: The model type in the PMML file is not allowed. This may happen if a model type (for example, an association rules model) is used that cannot be processed by IM Scoring because there is no application mode for this model type.

User Response: Use a different model type.

7219 Insufficient memory.

7268 The result file specified is not a valid Intelligent Miner Version 6 result file.

7500 The following item is missing in the model: ″%1″.

7501 The output file cannot be opened.

7502 Error while writing to output file.

7503 Internal error: array overflow.

7504 Internal error: Cannot set records.

7505 Internal error.

7506 The file specified does not contain a valid regression model.

7507 The file specified is not a valid XML file.

7508 The PMML version specified is unsupported. The supported versions are 1.1 and 2.0.

7509 The value of the PMML tag modelType is invalid. Your model is probably damaged.

7510 The value for tag ″%1: %2″ is unsupported. Make sure that the model conforms to the PMML version specified.

7511 The model is not a value prediction model but a classification model. Make sure that you apply this function only to linear or polynomial regression models.

7512 The model is not a classification model but a value prediction model. Make sure that you apply this function only to logistic regression models.

8116 The file ″%1″ holding the Associations model could not be read.

Explanation: The pathname might be incorrect, or the file permissions might be insufficient for the file to be read.

User Response: Verify the pathname and the permissions for the file.

8117 The Associations model cannot be converted from non-PMML format to PMML format because the internal structure of the non-PMML model is invalid.

Explanation: The model might be corrupt.

User Response: Generate the model again.

8118 The file ″%1″ used to store the converted Associations model could not be opened for writing.

Explanation: The pathname might be invalid, or the file permissions might be insufficient for the file to be written.

User Response: Verify the pathname and the permissions for the file.

8119 Your Association model contains ambiguous item names. One item name describes more than one item. Therefore, the assignment of support values to the rule head/body might be ambiguous during conversion. To minimize the impact of this, the higher support value is chosen. This effect occurs only when an ambiguous name mapping was applied during the generation of the Association model. To avoid this warning, use a one-to-one name mapping during the generation of the Association model.

8120 The converted Association model holds fewer rules than the original model, because some required information is missing in the original model. This effect occurs only if item constraints of the type ’including’ were applied during the generation of the original model using Intelligent Miner for Data Version 6 or earlier. To avoid this warning, do not use item constraints when generating or regenerating your model with a newer version of Intelligent Miner for Data.

8402 Error while opening file ″%1″ (″%2″).

8405 The Intelligent Miner detected an unknown attribute ″%1″ because the result object does not match the specifications in settings object.

User Response: Specify a suitable result object for the settings object (″%2″).

8641 Loading of pruned tree failed (line ″%1″).

8642 Dummy node without a parent detected.

8643 Dummy node whose parent does not have a left child detected.

8644 Boolean operator ″%1″ detected not supported with ″%2″ tag.

8645 Node without ″%1″ detected.

8646 Node with continuous test feature (″%1″) with more than one ″%2″ detected.

8647 Node with continuous test feature (″%1″) with unsupported operator ″%2″ detected.

8648 Node with continuous test feature (″%1″): number expected, but ″%2″ found.

8649 Node with different categorical test features (″%1″, ″%2″) detected.

8650 Node with categorical test feature (″%1″) with unsupported operator ″%2″ detected.

8651 Model used and record to be classified do not match.

8652 More than one class label specified.

8653 Node is not an element node.

8654 Tree model node without score is not feasible.

8655 Node does not have distribution specified.

8656 Invalid node detected: node must not have one single child.

8657 Attribute ″%1″ must not be ″%2″.

8658 ″%1″ is not an attribute of element ″%2″.

8659 No class label specified.

8660 Error reading tree classification model.

8661 The Predicate field ″%1″ is not specified as a MiningField.

8662 The Predicate field ″%1″ of unsupported type is detected.

8663 The Predicate field ″%1″ is continuous, but the value (″%2″) is nonnumeric.

8664 The MiningField name=″%1″ does not occur in the DataDictionary.

8665 The Predicate name=″%1″ is not a MiningField.

8666 The Intelligent Miner Scoring does not support regression tree scoring.
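Message 8664 in the list above reports a mining field that has no counterpart in the data dictionary. This kind of mismatch can be found before the model is imported. The following Python sketch is not part of IM Scoring. It assumes the usual PMML layout, in which the data dictionary holds `DataField` elements and the mining schema holds `MiningField` elements, each identified by a `name` attribute. The function name and the sample model are hypothetical.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix such as '{uri}DataField' -> 'DataField'."""
    return tag.rsplit("}", 1)[-1]


def undeclared_mining_fields(pmml_text):
    """Return the names of MiningField elements that have no DataField
    of the same name anywhere in the document."""
    root = ET.fromstring(pmml_text)
    declared = {
        el.get("name") for el in root.iter() if local(el.tag) == "DataField"
    }
    return [
        el.get("name")
        for el in root.iter()
        if local(el.tag) == "MiningField" and el.get("name") not in declared
    ]


# Hypothetical, heavily abbreviated model: INCOME is used but never declared.
SAMPLE = """<PMML version="2.0">
  <DataDictionary>
    <DataField name="AGE" optype="continuous"/>
  </DataDictionary>
  <TreeModel modelName="demo">
    <MiningSchema>
      <MiningField name="AGE"/>
      <MiningField name="INCOME"/>
    </MiningSchema>
  </TreeModel>
</PMML>"""

print(undeclared_mining_fields(SAMPLE))  # ['INCOME']
```

The comparison is case-sensitive, on the assumption that field names in the model must match exactly.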

8800 UDF is declared as fenced. It must be declared not fenced.

Explanation: An SQL function needs to work with locators, but cannot do so because it is declared as fenced.

User Response: You must drop the function definition in your DB2 instance and recreate it using the CREATE FUNCTION command, this time declaring the function as not fenced.

8801 Internal error: sqludf_length received a bad input value.

Explanation: DB2 or IM Scoring might not be installed correctly.

User Response: Look for any hints in the DB2 dump file db2dump.log, or contact your IBM representative.

8802 Internal error: sqludf_substr received a bad input value.

Explanation: DB2 or IM Scoring might not be installed correctly.

User Response: Look for any hints in the DB2 dump file db2dump.log, or contact your IBM representative.

8803 Internal error: sqludf_append received a bad input value.

Explanation: DB2 or IM Scoring might not be installed correctly.

User Response: Look for any hints in the DB2 dump file db2dump.log, or contact your IBM representative.

8804 Internal error: sqludf_create received a bad input value.

Explanation: DB2 or IM Scoring might not be installed correctly.

User Response: Look for any hints in the DB2 dump file db2dump.log, or contact your IBM representative.

8805 Internal error: sqludf_free received a bad input value.

Explanation: DB2 or IM Scoring might not be installed correctly.

User Response: Look for any hints in the DB2 dump file db2dump.log, or contact your IBM representative.

8806 Internal error. Locator is already freed, or free is not allowed.

Explanation: DB2 or IM Scoring might not be installed correctly.

User Response: Look for any hints in the DB2 dump file db2dump.log, or contact your IBM representative.

8807 The importing of Intelligent Miner V6 results is not supported in unfenced mode. Use idmxmod to convert the model into the PMML format, and then run the import routine again.

Explanation: Models in Intelligent Miner format must be converted to PMML format before they can be used by IM Scoring. This conversion cannot be done in unfenced mode.

User Response: Use idmxmod to convert the model into the PMML format, and then run the import routine again.

8901 The model is not a clustering model.

Explanation: The model that was specified as input for a DM_applyClusModel or DM_impClusFile function is not a Clustering model.

8902 The model is not a classification model.

Explanation: The model that was specified as input for a DM_applyClasModel or DM_impClasFile function is not a Classification model.

8903 The model is not a regression model.

Explanation: The model that was specified as input for a DM_applyRegModel or DM_impRegFile function is not a Regression model.

8904 The model does not correspond to the XML format.

8905 Internal error. The type of model is unknown.

Explanation: The model is not recognized as a Clustering, Classification, or Regression model. This is an internal error.

8906 File ″%1″ cannot be opened for read.

Explanation: The file you specified as input for an import function cannot be read. Check file name, path, and permissions.

8907 File ″%1″ cannot be opened for write.

Explanation: A temporary file cannot be opened for write.

User Response: Check disk space and permissions in /tmp (AIX), TEMP directory (Windows NT).

8908 The result format in file ″%1″ is wrong.

Explanation: The model that was specified as input for an import function is not in either IM for Data format or PMML format.

8909 The tree classification test model cannot be applied. Use a tree classification training model instead.

Explanation: Tree Classification test models are used to verify the quality of a training run. You cannot use them for scoring.

User Response: Use the Tree Classification training model instead.

8914 The header of the model is not valid.

Explanation: The model was not imported into DB2 with the appropriate function. Therefore it does not contain a valid header.

User Response: Import the model again by using DM_impClasFile, DM_impClusFile, DM_impRegFile, DM_impClasFileE, DM_impClusFileE, DM_impRegFileE, DM_impClasModel, DM_impClusModel, or DM_impRegModel.

8916 The model is not unique.

Explanation: A model that is passed to DM_applyClasModel, DM_applyClusModel, or DM_applyRegModel must have a constant value. It is not possible to apply data to more than one model at the same time.

User Response: Verify your SQL command and make sure that only one model is passed.

8917 Encoding is only allowed for XML files.

Explanation: You specified an encoding when importing a file in V6 format. Encoding is only allowed for XML files. The encoding is ignored.

User Response: To import V6 models, use the function DM_impClasFile, DM_impClusFile, or DM_impRegFile.

8918 The encoding of the XML model is missing.

Explanation: The XML model you want to import does not contain any XML declaration. Therefore the encoding of the model cannot be determined.

User Response: You can add an XML declaration at the beginning of the model, or you can specify an encoding when you import the model.
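Whether a model file will trigger message 8918 can be checked before the import by looking at the first bytes of the file. The following Python sketch is not part of IM Scoring. It only inspects the XML declaration, assumes an ASCII-compatible encoding such as UTF-8 or ISO-8859-1 (it does not handle UTF-16 files), and the function name `declared_encoding` is made up for illustration.

```python
import re

# Matches an XML declaration at the very start of the data and captures
# the value of its encoding pseudo-attribute, if one is present.
_DECLARATION = re.compile(
    rb'^<\?xml\b[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(data):
    """Return the encoding named in the XML declaration of `data` (bytes),
    or None if there is no declaration or it names no encoding."""
    if data.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        data = data[3:]
    match = _DECLARATION.match(data[:200])
    return match.group(1).decode("ascii") if match else None


print(declared_encoding(b'<?xml version="1.0" encoding="UTF-8"?><PMML version="2.0"/>'))
# UTF-8
print(declared_encoding(b'<PMML version="2.0"/>'))
# None
```

A result of `None` corresponds to the situation that the explanation describes: either add an XML declaration to the file or specify the encoding at import time.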

8919 The model (″%1″ bytes) cannot be stored in a LOB (″%2″ bytes).

Explanation: After internal conversions, the imported model is too big to be stored in a Large Object (LOB).

User Response: You must increase the maximum value for a LOB.

8920 Insufficient memory available to convert and store the model.

User Response: Reconnect to the database and try again. If the problem persists, restart the DB2 instance.

8930 Evaluation period over.

8956 Check the preceding warning messages.

8957 Warning: The type ″%1″ already exists with a different size (″%2″ bytes) from that requested (″%3″ bytes). To specify a new size, first disable the database.

Explanation: The size of UDTs cannot be changed if dependent objects that use these types (for example, tables) already exist. To change the size of UDTs, drop the types by disabling the database using idmdisabledb with the tables parameter, then re-enable the database using idmenabledb.

User Response: To specify a new size, first disable the database.

8958 Warning: The function ″%1″ (special name ″%2″) is used in other database objects, and cannot be updated.

Explanation: If a user-defined function (UDF) or method (UDM) is used in other dependent database objects like triggers or views, the UDF or UDM cannot be changed.

User Response: Drop the dependent object, and then rerun the command.

8961 The file ″%1″ cannot be found. Verify your installation, and check your PATH settings.

8962 The database ″%1″ was not disabled. Check the preceding error messages. Correct the problems, and rerun the command.

Explanation: IM creates database objects like TYPES, PROCEDURES, METHODS, and TABLES when enabling a database. These objects are dropped from the database when the database is disabled. If you created your own database objects (for example, tables or triggers) that use the database objects created by IM, the IM database objects cannot be dropped.

The most common error is to enable the database with sample tables (idmenabledb <dbname> tables), and then to forget the ″tables″ parameter when the database is being disabled. For example:

E:\im810b\test>idmdisabledb testudf
........................................
DROP TYPE IDMMX.DM_MiningData

E 2303: An SQL error occurred: SQLstate: "42893", SQL Error Message: [IBM][CLI Driver][DB2/NT] SQL0478N The object type "TYPE" cannot be dropped because there is an object "IDMMX.MININGDATA", of type "TABLE", that depends on it. SQLSTATE=42893
See the DB2 User's Guide or the Message Reference.

E 8962: The database "testudf" was not disabled. Check the preceding error messages. Correct the problems, and rerun the command.

E:\im810b\test>idmdisabledb testudf tables
......................................

The database "testudf" was successfully disabled.

User Response: If you enabled the database using the ″tables″ parameter, disable it with idmdisabledb <dbname> tables. Otherwise, manually drop the objects that depend on IM database objects.

8963 The database ″%1″ was not enabled. Check the preceding error messages. Correct the problem, and rerun the command.

Explanation: IM creates database objects like TYPES, PROCEDURES, METHODS, and TABLES when enabling a database. Some of these objects could not be created. The most common reason for this is that database objects with the same name might already exist in the schema IDMMX. Another possible reason is that the database might be already enabled for a different release of IM.

User Response: If the database is already enabled, disable the database first. Otherwise, manually drop the database objects that have a name conflict with the database objects to be created by IM.

8998 Error and trace cannot be initialized.

Explanation: The error file or the trace file could not be opened for writing, or the message catalog library idmmsg could not be read.

User Response: Check the path of your trace file, and verify that you have permissions to write to the trace and error files.

Windows: Check that the message catalog file <install path>\IMinerX\bin\idmmsg.dll exists, and that the directory <install path>\IMinerX\bin is included in the system environment variable PATH.

UNIX: Check that the message catalog file <install path>/IMinerX/lib/libidmmsg.* exists.

Appendix F. The DB2 REC2XML function

Fixpack 3 of DB2 V7 introduces a new scalar function, REC2XML. This function can be used for easy and fast construction of application data values in XML. This section shows the syntax and gives a few examples of how this function can be used.

Syntax

>>-REC2XML--(--integer constant--,--format string--,--row tag string-->

   .-------------------.
   V                   |
>----,--column name----+--)------------------------------------------><

The schema is SYSIBM.

The REC2XML function returns a string formatted with XML tags and containing column names and column data.

integer constant
The expansion factor for replacing column data characters. It must be an integer value from 1 to 6.

The integer constant value is used to calculate the result length of the function. For every column with a character or graphic data type, the length attribute of the column is multiplied by this expansion factor before it is added to the result length.

To specify no expansion, use a value of 1. If the actual length of the result string is greater than the calculated result length of the function, this raises an error (SQLSTATE 22001).

format string
A string constant that specifies which format the function is to use during execution. The only value that is supported is 'COLATTVAL'. The format string is case-sensitive, and only uppercase values will be recognized.

This format returns a string with column as an attribute:

>>-<row tag string>--------------------------------------------------->

   .----------------------------------------------------------.
   V                                                          |
>----<column name="column name"--+->column value</column>--+--+------->
                                 |                         |
                                 '-null="true"/>-----------'

>--</row tag string>-------------------------------------------------><

row tag string
A string constant that specifies the tag used for each row. If an empty string is specified, the value row is assumed. When using REC2XML in IM Scoring you should always use the empty string.

column name
A qualified or unqualified name of a table column. The column must have one of the following data types:
• Numeric (SMALLINT, INTEGER, BIGINT, DECIMAL, NUMERIC, REAL, DOUBLE)
• Character string (CHAR, VARCHAR, LONG VARCHAR, CLOB)
• Graphic string (GRAPHIC, VARGRAPHIC, LONG VARGRAPHIC, DBCLOB)
• Datetime (DATE, TIME, TIMESTAMP)
• A user-defined type based on one of the above types.

Character strings with a subtype of BIT DATA are not allowed.

The same column name cannot be specified more than once, or an error will result (SQLSTATE 42734).

Depending on the value specified for the format string, certain characters in column names and column values are replaced to ensure that the column names are valid XML values. These characters and their replacement values are as follows:

< is replaced by &lt;

> is replaced by &gt;

" is replaced by &quot;

& is replaced by &amp;

' is replaced by &apos;

A single character can be replaced by up to six characters in XML. That is the reason why REC2XML uses a parameter for the expansion factor.
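To get a feeling for which expansion factor a given value needs, the replacement table can be applied outside the database. The following Python sketch is not DB2 code. It applies the five replacements listed above and reports the smallest integer factor by which one value grows. The names `escape` and `needed_expansion` are made up for illustration. Note that DB2 multiplies the length attribute of the column, not the length of an individual value, so this is only a per-value estimate.

```python
import math

# Replacement table taken from the list of replaced characters.
REPLACEMENTS = {
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "&": "&amp;",
    "'": "&apos;",
}


def escape(value):
    """Apply the character replacements to a string."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in value)


def needed_expansion(value):
    """Smallest integer factor f (at least 1) such that
    len(escape(value)) <= f * len(value)."""
    if not value:
        return 1
    return max(1, math.ceil(len(escape(value)) / len(value)))


print(escape("&43<FIE"))            # &amp;43&lt;FIE
print(needed_expansion("D01"))      # 1
print(needed_expansion("&43<FIE"))  # 2
print(needed_expansion('"'))        # 6
```

The worst case of 6 is reached by a value that consists only of quotation marks or apostrophes, which matches the upper limit of the integer constant.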

The following examples show how the REC2XML function is used and the strings that this function returns:

• Using the DEPARTMENT table in the sample database, format the department table row, except the DEPTNAME and LOCATION columns, for department 'D01' into an XML string. Because the data does not contain any of the characters that require replacement, the expansion factor will be 1 (no expansion). Also note that the MGRNO value is NULL for this row.

SELECT REC2XML (1, 'COLATTVAL', '', DEPTNO, MGRNO, ADMRDEPT)
FROM DEPARTMENT
WHERE DEPTNO = 'D01'

This example returns the following string:

<row><column name="DEPTNO">D01</column><column name="MGRNO" null="true"/><column name="ADMRDEPT">A00</column></row>

• The example:

SELECT REC2XML (2, 'COLATTVAL', '', CLASS_CODE, DAY, STARTING)
FROM CL_SCHED
WHERE CLASS_CODE = '&43<FIE'

returns the following string:

<row><column name="CLASS_CODE">&amp;43&lt;FIE</column><column name="DAY">5</column><column name="STARTING">06:45:00</column></row>

• This example shows characters that are replaced in a column name:

SELECT REC2XML (2, 'COLATTVAL', '', Class, "time<noon")
FROM (SELECT Class_code, Starting
      FROM Cl_sched
      WHERE Starting < '12:00:00')
AS Early (Class, "time<noon")

It returns:

<row><column name="CLASS">&amp;43&lt;FIE</column><column name="time&lt;noon">06:45:00</column></row>
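The strings returned in the examples above can be reproduced outside the database, which is useful when test data for scoring has to be prepared without a DB2 connection. The following Python sketch is an emulation under stated assumptions, not the DB2 implementation. It covers only the COLATTVAL format, it takes the column names exactly as given (DB2 itself folds unquoted names to uppercase), and the function name `colattval` is made up for illustration.

```python
# Replacement table taken from the list of replaced characters.
REPLACEMENTS = {
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "&": "&amp;",
    "'": "&apos;",
}


def escape(text):
    """Apply the character replacements to a string."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in text)


def colattval(row, row_tag=""):
    """Format a row in the style of the COLATTVAL format.

    row is a list of (column name, value) pairs; a value of None stands
    for SQL NULL. An empty row_tag falls back to 'row'.
    """
    tag = row_tag or "row"
    parts = ["<%s>" % tag]
    for name, value in row:
        if value is None:
            parts.append('<column name="%s" null="true"/>' % escape(name))
        else:
            parts.append(
                '<column name="%s">%s</column>' % (escape(name), escape(str(value)))
            )
    parts.append("</%s>" % tag)
    return "".join(parts)


print(colattval([("DEPTNO", "D01"), ("MGRNO", None), ("ADMRDEPT", "A00")]))
print(colattval([("CLASS", "&43<FIE"), ("time<noon", "06:45:00")]))
```

The two calls print the strings shown for the first and the third example. Whitespace and the formatting of numeric or datetime values in real DB2 output might differ from this emulation.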

Appendix G. IM Scoring conformance to PMML

This appendix outlines how IM Scoring conforms to the PMML standard.

In this appendix, the following naming conventions of the PMML 2.0 standard are used:
• Demographic Clustering in IM Scoring is called distribution-based clustering in PMML 2.0
• Neural Clustering in IM Scoring is called center-based clustering in PMML 2.0
• Neural Classification and Neural Prediction in IM Scoring are covered by the term neural networks in PMML 2.0
• Linear Regression, Logistic Regression, and Polynomial Regression are covered by the term regression in PMML 2.0
• Tree Classification in IM Scoring is called decision trees in PMML 2.0

IM Scoring application

IM Scoring provides an SQL interface enabling the application of PMMLmodels to data.

For this feature, IM Scoring complies with the PMML consumer conformanceclause of the PMML 2.0 standard for several algorithms. These algorithms arelisted below with possible restrictions for their consumer conformance toPMML 2.0.

Center-based clustering

v IM Scoring supports all the core features of PMML 2.0 for center-basedclustering.

v The handling of missing values for unary or binary categorical fields ofmodels created with Intelligent Miner products is different from thehandling of missing values as defined in PMML 2.0 and used in IMScoring for models from other producers. Therefore, IM Scoring doesnot deliver the same results that other vendors might deliver with theseIntelligent Miner models when the data contains missing values.

Decision trees

v IM Scoring supports all the core features of PMML 2.0 for decision trees except the <SimpleSetPredicate> elements.

v The handling of missing values for models created with Intelligent Miner products is different from, and more powerful than, the handling of missing values as defined in PMML 2.0 and used in IM Scoring for models from other producers. Therefore, IM Scoring does not deliver the same results that other vendors might deliver with these Intelligent Miner models when the data contains missing values.
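Because <SimpleSetPredicate> elements are the one decision-tree feature named above as unsupported, it can be useful to scan a model from another producer before importing it. The following is a minimal sketch using the Python standard library, not an IM Scoring tool; the function names and the abbreviated sample model are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return an element tag without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def count_simple_set_predicates(pmml_text):
    """Count <SimpleSetPredicate> elements anywhere in a PMML document."""
    root = ET.fromstring(pmml_text)
    return sum(
        1 for elem in root.iter() if local_name(elem.tag) == "SimpleSetPredicate"
    )


# Hypothetical, heavily abbreviated tree model; not a complete PMML 2.0 document.
SAMPLE = """<PMML version="2.0">
  <TreeModel modelName="demo" functionName="classification">
    <Node score="yes">
      <True/>
      <Node score="no">
        <SimpleSetPredicate field="region" booleanOperator="isIn">
          <Array n="2" type="string">north south</Array>
        </SimpleSetPredicate>
      </Node>
    </Node>
  </TreeModel>
</PMML>"""

found = count_simple_set_predicates(SAMPLE)
if found:
    print(f"{found} SimpleSetPredicate element(s) found; check the model before scoring")
```

A count of zero only shows that this particular restriction does not apply; it says nothing about the differences in missing-value handling described above.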

Distribution-based clustering

v IM Scoring supports all the core features of PMML 2.0 for distribution-based clustering.

v IM Scoring additionally supports value weighting, which might be used in some models produced with Intelligent Miner products. (For more information about value weighting, see the documentation for the product in question.) For this reason, IM Scoring does not deliver the same results that other vendors might deliver with these non-conforming PMML 2.0 models.

Neural networks

v IM Scoring supports all the core features of PMML 2.0 for neural networks.

v The handling of missing values for unary or binary categorical fields of models created with Intelligent Miner products is different from the handling of missing values defined in PMML 2.0 and used in IM Scoring for models from other producers. Therefore, IM Scoring does not deliver the same results that other vendors might deliver with these Intelligent Miner models when the data contains missing values.

Regression

IM Scoring supports all the core features of PMML 2.0 for regression.

IM Scoring conversion tools

IM Scoring delivers conversion tools that enable models in IM for Data format to be converted to PMML 2.0.

IM Scoring conversion tools comply with the PMML producer conformance clause of the PMML 2.0 standard for center-based clustering, decision trees, neural networks, and regression.

IM Scoring conversion tools also comply with the PMML producer conformance clause of the PMML 2.0 standard for distribution-based clustering when value weighting was not used to create the model. (For information, see the documentation for IM for Data.) However, if value weighting was used, IM Scoring conversion tools produce non-conforming PMML 2.0 models with a specific extension for value weighting. This extension can be read only by IM Scoring, which will use value weighting when scoring data on these models. Other PMML consumer tools, however, will ignore the extension, though they will read the model successfully. They will not consider value weighting when scoring data on these models.
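To see whether a converted model carries any producer-specific extensions before handing it to another PMML consumer, the model can be scanned for <Extension> elements. The following is a minimal sketch using the Python standard library. This appendix does not give the name of the value weighting extension, so `exampleValueWeighting` in the sample is a hypothetical placeholder, as are the function name and the abbreviated model.

```python
import xml.etree.ElementTree as ET


def list_extensions(pmml_text):
    """Return (name, value) attribute pairs for every <Extension> element."""
    root = ET.fromstring(pmml_text)
    pairs = []
    for elem in root.iter():
        # Compare on the local tag name so a namespaced document also works.
        if elem.tag.rsplit("}", 1)[-1] == "Extension":
            pairs.append((elem.get("name"), elem.get("value")))
    return pairs


# Hypothetical, heavily abbreviated clustering model; the extension name is invented.
SAMPLE = """<PMML version="2.0">
  <ClusteringModel modelName="demo" functionName="clustering">
    <Extension name="exampleValueWeighting" value="on"/>
  </ClusteringModel>
</PMML>"""

for name, value in list_extensions(SAMPLE):
    print(f"extension {name!r} = {value!r}")
```

An empty result suggests that the model carries no extensions at all; a non-empty result still has to be interpreted against the producer's documentation.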

IM Scoring conversion tools additionally comply with the PMML producer conformance clause of the PMML 2.0 standard for association rules. However, taxonomies and name mappings that might have been used to create the models are not written into PMML 2.0.

Radial-Basis Function prediction

IM Scoring supports RBF prediction in addition to all of the other algorithms that have been listed. However, RBF prediction is not yet part of PMML, and therefore cannot comply with the producer or consumer conformance clause. The RBF prediction models written by IM Scoring conversion tools and used in the IM Scoring application have a proprietary XML format that is very similar to the other PMML formats.


Appendix H. Notices

This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106, Japan

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Deutschland
Informationssysteme GmbH
Department 3982
Pascalstrasse 100
70569 Stuttgart
Germany

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this information and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement or any equivalent agreement between us.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

All statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

All IBM prices shown are IBM’s suggested retail prices, are current and are subject to change without notice. Dealer prices may vary.

This information is for planning purposes only. The information herein is subject to change before the products described become available.


This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

Each copy or any portion of these sample programs or any derivative work, must include a copyright notice as follows:

© (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs. © Copyright IBM Corp. _enter the year or years_. All rights reserved.

If you are viewing this information softcopy, the photographs and color illustrations may not appear.

Trademarks

The following terms are trademarks of the IBM Corporation in the United States, other countries, or both:

AIX
DB2
DB2 Universal Database
IBM
Intelligent Miner
SP
SP2

UNIX is a registered trademark of The Open Group in the United States and other countries.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.


Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Other company, product, and service names may be trademarks or service marks of others.


Bibliography and related information

This bibliography lists all publications in the Intelligent Miner library other than this book, related IBM publications that are relevant to the contents of this book, and non-IBM publications that might be useful for reference purposes.

Where appropriate, IBM publication numbers are given after the document title. This will assist you in finding the document online at the IBM Publications Center, which is available on the Web at:
http://www.elink.ibmlink.ibm.com/public/applications/publications/cgibin/pbi.cgi

IBM DB2 Intelligent Miner publications

v Discovering Data Mining, SG24-4839
v Intelligent Miner for Data: Application Programming Interface and Utility Reference, SH12-6751
v Intelligent Miner for Data Applications Guide, SG24-5252
v Intelligent Miner for Data: Enhance Your Business Intelligence, SG24-5422
v Intelligent Miner for Data: Using the Intelligent Miner for Data, SH12-6750
v Intelligent Miner Modeling: Administration and Programming, SH12-6736
v Intelligent Miner Visualization: Using the Intelligent Miner Visualizers, SH12-6737
v Mining Relational and Nonrelational Data with IBM Intelligent Miner for Data Using Oracle, SPSS, and SAS As Sample Data Sources, SG24-5278
v Mining Your Own Business in Banking: Using DB2 Intelligent Miner for Data, SG24-6272
v Mining Your Own Business in Health Care: Using DB2 Intelligent Miner for Data, SG24-6274
v Mining Your Own Business in Retail: Using DB2 Intelligent Miner for Data, SG24-6271
v Mining Your Own Business in Telecoms: Using DB2 Intelligent Miner for Data, SG24-6273


IBM DB2 Universal Database (DB2 UDB) publications

v DB2 UDB Administration Guide: Implementation, SC09-2944
v DB2 UDB Administration Guide: Performance, SC09-2945
v DB2 UDB Administration Guide: Planning, SC09-2946
v DB2 UDB Application Development Guide, SC09-2949
v DB2 UDB Call Level Interface Guide and Reference, SC09-2843
v DB2 UDB Command Reference, SC09-2951
v DB2 UDB Enterprise-Extended Edition for UNIX: Quick Beginnings, GC09-2964
v DB2 UDB Enterprise-Extended Edition for Windows: Quick Beginnings, GC09-2963
v DB2 UDB for OS/2®: Quick Beginnings, GC09-2968
v DB2 UDB for UNIX: Quick Beginnings, GC09-2970
v DB2 UDB for Windows: Quick Beginnings, GC09-2971
v DB2 UDB Message Reference (Volumes 1 and 2), GC09-2978 and GC09-2979
v DB2 UDB SQL Getting Started, SC09-2973
v DB2 UDB SQL Reference, Volumes 1 and 2, SBOF-8933

Related information

v The IBM Software Support Handbook is available on the Web at http://techsupport.services.ibm.com/guides/handbook.html

v Information relating to PMML is available on the Web at http://www.dmg.org



Readers’ Comments — We’d Like to Hear from You

IBM DB2 Intelligent Miner Scoring
Administration and Programming for DB2
Version 8.1

Publication No. SH12-6745-00

Overall, how satisfied are you with the information in this book?

                          Very Satisfied   Satisfied   Neutral   Dissatisfied   Very Dissatisfied
Overall satisfaction           [ ]            [ ]        [ ]         [ ]              [ ]

How satisfied are you that the information in this book is:

                          Very Satisfied   Satisfied   Neutral   Dissatisfied   Very Dissatisfied
Accurate                       [ ]            [ ]        [ ]         [ ]              [ ]
Complete                       [ ]            [ ]        [ ]         [ ]              [ ]
Easy to find                   [ ]            [ ]        [ ]         [ ]              [ ]
Easy to understand             [ ]            [ ]        [ ]         [ ]              [ ]
Well organized                 [ ]            [ ]        [ ]         [ ]              [ ]
Applicable to your tasks       [ ]            [ ]        [ ]         [ ]              [ ]

Please tell us how we can improve this book:

Thank you for your responses. May we contact you? [ ] Yes [ ] No

When you send comments to IBM, you grant IBM a nonexclusive right to use or distribute your comments in any way it believes appropriate without incurring any obligation to you.

Name Address

Company or Organization

Phone No.


IBM Deutschland Entwicklung GmbH
Information Development, Dept. 0446
Schoenaicher Strasse 220
71032 Boeblingen
Germany


Part Number: CT16INA
Program Number: 5765-F36

Printed in Denmark by IBM Danmark A/S

SH12-6745-00
