    Oracle Data Mining

    Administrator's Guide

    11g Release 1 (11.1)

    B28130-01

    March 2007

    Beta Draft

    Oracle Data Mining Administrator's Guide, 11g Release 1 (11.1)

    B28130-01

    Copyright 2005, 2007, Oracle. All rights reserved.

    The Programs (which include both the software and documentation) contain proprietary information; they are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright, patent, and other intellectual and industrial property laws. Reverse engineering, disassembly, or decompilation of the Programs, except to the extent required to obtain interoperability with other independently created software or as specified by law, is prohibited.

    The information contained in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. This document is not warranted to be error-free. Except as may be expressly permitted in your license agreement for these Programs, no part of these Programs may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose.

    If the Programs are delivered to the United States Government or anyone licensing or using the Programs on behalf of the United States Government, the following notice is applicable:

    U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the Programs, including documentation and technical data, shall be subject to the licensing restrictions set forth in the applicable Oracle license agreement, and, to the extent applicable, the additional rights set forth in FAR 52.227-19, Commercial Computer Software--Restricted Rights (June 1987). Oracle USA, Inc., 500 Oracle Parkway, Redwood City, CA 94065.

    The Programs are not intended for use in any nuclear, aviation, mass transit, medical, or other inherently dangerous applications. It shall be the licensee's responsibility to take all appropriate fail-safe, backup, redundancy and other measures to ensure the safe use of such applications if the Programs are used for such purposes, and we disclaim liability for any damages caused by such use of the Programs.

    Oracle, JD Edwards, PeopleSoft, and Siebel are registered trademarks of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.

    The Programs may provide links to Web sites and access to content, products, and services from third parties. Oracle is not responsible for the availability of, or any content provided on, third-party Web sites. You bear all risks associated with the use of such content. If you choose to purchase any products or services from a third party, the relationship is directly between you and the third party. Oracle is not responsible for: (a) the quality of third-party products or services; or (b) fulfilling any of the terms of the agreement with the third party, including delivery of products or services and warranty obligations related to purchased products or services. Oracle is not responsible for any loss or damage of any sort that you may incur from dealing with any third party.

    Alpha and Beta Draft documentation are considered to be in prerelease status. This documentation is intended for demonstration and preliminary use only. We expect that you may encounter some errors, ranging from typographical errors to data inaccuracies. This documentation is subject to change without notice, and it may not be specific to the hardware on which you are using the software. Please be advised that prerelease documentation is not warranted in any manner, for any purpose, and we will not be responsible for any loss, costs, or damages incurred due to the use of this documentation.

    Preface

    This manual explains how to install the various components of Oracle Data Mining and perform basic administration tasks. It also explains how to install the Data Mining demo programs.

    The preface contains these topics:

    Audience

    Documentation Accessibility

    Related Documentation

    Conventions

    Audience

    This guide can be used by a spectrum of users; however, it is primarily directed at an individual user who wants to install, configure, and use Oracle Data Mining on a laptop or personal computer.

    Documentation Accessibility

    Our goal is to make Oracle products, services, and supporting documentation accessible, with good usability, to the disabled community. To that end, our documentation includes features that make information available to users of assistive technology. This documentation is available in HTML format, and contains markup to facilitate access by the disabled community. Accessibility standards will continue to evolve over time, and Oracle is actively engaged with other market-leading technology vendors to address technical obstacles so that our documentation can be accessible to all of our customers. For more information, visit the Oracle Accessibility Program Web site at

    http://www.oracle.com/accessibility/

    Accessibility of Code Examples in Documentation

    Screen readers may not always correctly read the code examples in this document. The conventions for writing code require that closing braces should appear on an otherwise empty line; however, some screen readers may not always read a line of text that consists solely of a bracket or brace.

    Accessibility of Links to External Web Sites in Documentation

    This documentation may contain links to Web sites of other companies or organizations that Oracle does not own or control. Oracle neither evaluates nor makes any representations regarding the accessibility of these Web sites.

    TTY Access to Oracle Support Services

    Oracle provides dedicated Text Telephone (TTY) access to Oracle Support Services within the United States of America 24 hours a day, seven days a week. For TTY support, call 800.446.2398.

    Related Documentation

    The documentation set for Oracle Data Mining is part of the Oracle Database 11g Release 1 (11.1) Online Documentation Library. The Oracle Data Mining documentation set consists of the following documents:

    Oracle Data Mining Concepts

    Oracle Data Mining Application Developer's Guide

    Oracle Data Mining Java API Reference (javadoc)

    Oracle Data Mining Administrator's Guide

    For information about Oracle Data Miner, the graphical user interface for Data Mining, see the online help. Oracle Data Miner is distributed on Oracle Technology Network at

    http://www.oracle.com/technology/index.html

    For detailed information about the Oracle Data Mining PL/SQL interface, see Oracle Database PL/SQL Packages and Types Reference. Search for DBMS_DATA_MINING.

    For detailed information about the SQL data mining functions, see Oracle Database SQL Language Reference.

    For an introduction to application development in SQL and PL/SQL, see Oracle Database Advanced Application Developer's Guide.

    For an introduction to application development in Java, see Oracle Database Java Developer's Guide.

    For information about the data mining process in general, independent of both industry and tool, a good source is the CRISP-DM project (Cross-Industry Standard Process for Data Mining) at

    http://www.crisp-dm.org

    Note: Information to assist you in installing and using the Data Mining demo programs is provided in Oracle Data Mining Administrator's Guide.

    Conventions

    The following text conventions are used in this document:

    Convention  Meaning
    ----------  ---------------------------------------------------------------
    boldface    Boldface type indicates graphical user interface elements associated with an action, or terms defined in text or the glossary.
    italic      Italic type indicates book titles, emphasis, or placeholder variables for which you supply particular values.
    monospace   Monospace type indicates commands within a paragraph, URLs, code in examples, text that appears on the screen, or text that you enter.

    1 Installing the Data Mining Server

    This chapter explains how to install Oracle Database with the Data Mining option. It is directed primarily at users who want to install and administer Oracle Database on a personal computer or laptop for their own use.

    Because Oracle Data Mining is completely integrated with Oracle Database, you will use Oracle Database tools to administer Data Mining. You can administer Oracle Database locally or from a remote computer with network access.

    This chapter contains the following topics:

    Connecting to Oracle Database

    Installing Oracle Database

    Local Administration on Microsoft Windows

    Local Administration on Linux

    Remote Administration

    Connecting to Oracle Database

    Oracle Data Mining is a technology embedded within Oracle Database. If you want to use Oracle Data Mining, you must have access to an Oracle database.

    You may be able to access an existing installation instead of installing Oracle Database yourself. Contact your Oracle Database Administrator and inquire whether you can access an instance of Oracle Database 11g Release 1 (11.1).

    If you can access an existing installation, then ask your Oracle DBA to do the following:

    1. Install the Sample Schemas, if they are not already installed.

    2. Install the Data Mining demo programs, as described in "Installing Oracle Database Companion" on page 4-3.

    3. Create a user name. See "Creating Data Mining Users" on page 3-3.

    4. Provide you with the connection information.

    See: "Upgrading Oracle Data Mining" on page 2-9 if you will be upgrading from 10.1 or 10.2.

    Note: You should use this chapter along with your Oracle Database installation documentation. This chapter does not attempt to provide comprehensive installation instructions for Oracle Database.

    If you do not have access to an existing installation, then follow the instructions provided in this chapter for installing Oracle Database.

    Installing Oracle Database

    To install Oracle Database, you use Oracle Universal Installer. This installation program is nearly the same on all platforms, with differences that arise because of the differences in the operating systems. See "Oracle Universal Installer" on page 1-7.

    This chapter provides step-by-step instructions for using Oracle Universal Installer to install Oracle Database on a Microsoft Windows platform. If you use a different platform, you can still derive the preferred settings for Data Mining from these instructions. However, you should be careful to perform any pre- and post-installation steps specific to your platform, because they may be critical to a successful installation. Refer to the Installation Guide for your platform.

    Installation Directories

    There are four sets of installation media for Oracle Database. If you download the software instead of installing from a physical media pack, you will create the directories on your computer.

    Database: Installation directory for Oracle Database 11g Release 1 (11.1).

    Companion: Installation directory for demo program files. See Chapter 4.

    Client: Installation directory for client tools.

    Doc: Installation directory for documentation.

    If you are installing Oracle Database on your personal computer, you do not need to install Oracle Client.

    Installation Methods

    Oracle Universal Installer provides two methods for installing the Oracle Database software:

    Basic: This installation method can be accomplished quickly and requires minimal user input. It installs the software and optionally creates a general-purpose database. It is the default installation method.

    Advanced: This installation method offers a number of choices and may take longer to complete. It installs the specified software components and optionally creates a database. You can choose a general-purpose database, or you can specify that the database be optimized for data warehousing or transaction processing.

    Installation Types

    Whether you use the Basic or Advanced installation method, you can specify one of the following installation types for Oracle Database:

    Enterprise Edition: Installs licensable Oracle Database options, including Data Mining. Includes configuration and management tools in addition to all of the products that are installed during a Standard Edition installation.

    See Also:

    Oracle Database Installation Guide for Microsoft Windows

    Oracle Database 2 Day DBA

    Standard Edition: Installs an integrated set of management tools, web features, and facilities for creating business applications.

    Personal Edition: Installs Enterprise Edition software, but supports only a single-user development and deployment environment.

    Custom: Installs the individual components that you specify.

    Creating a Database During Installation

    Both the Basic and Advanced installation methods offer the option of creating a starter database during the installation process. If you choose to create a starter database, Oracle Universal Installer uses Database Configuration Assistant (DBCA) to create it. DBCA is an administrative tool that is provided with Oracle Database.

    Preparing for Installation

    Several preliminary steps are required before you begin the installation process.

    Verify that your computer meets the minimum hardware and software requirements.

    Ensure that you have administrative privileges on your computer.

    If you have other Oracle products installed on your computer, stop the Oracle services.

    If your computer has a dynamic IP address, install a loopback adapter.

    If you are installing from a network drive, use Windows File Manager to map that drive to your computer.

    Hardware Requirements

    Verify that your computer meets the following requirements:

    Processor: 550 MHz minimum

    Available hard disk space: 2 GB, including 125 MB temp disk space

    RAM: 256 MB minimum, 512 MB recommended

    Virtual memory: double the amount of RAM

    Video adapter: 256 color

    Software Requirements

    The following operating systems are among those supported for Oracle Database:

    Windows XP Professional

    Windows 2000 with service pack 1 or higher

    Windows Server 2003

    Note: Oracle Data Mining is only available in the Enterprise Edition of Oracle Database.

    Preexisting Oracle Products

    If any Oracle products are already installed on your computer, you must disable them before installing Oracle Database. You need to stop any Oracle services that are currently running, and you need to delete the ORACLE_HOME environment variable if it is defined.

    To stop Oracle services:

    1. From the Control Panel, choose Administrative Tools, then Services.

    2. On the Services page, scroll down the list of names and locate those that begin with Oracle. Select those services and choose Stop.

    See "Oracle Services" on page 1-7 for a list of Oracle services.

    To delete ORACLE_HOME:

    1. From the Control Panel, choose System.

    2. On the Advanced tab of the System Properties page, choose Environment Variables.

    3. If ORACLE_HOME is included in the list of System Variables, select it and click Delete.

    Dynamic IP

    Dynamic Host Configuration Protocol (DHCP) assigns dynamic IP addresses on a network. Dynamic addressing allows a computer to have a different IP address each time it connects to the network.

    If you plan to install Oracle Database on a computer that uses DHCP, you need to install a loopback adapter to assign a local IP address to your computer.

    You can determine if a loopback adapter is already installed on your Windows computer by opening a command window, navigating to the system drive (typically C:), and executing the following command.

    ipconfig /all

    You can find complete instructions for installing a loopback adapter in the Oracle Database Installation Guide for Microsoft Windows.

    Performing a Basic Installation

    The simplest way to install Oracle Database is to perform a Basic installation. This installation method provides the Data Mining option and the sample schemas by default.

    Follow these steps to perform a Basic installation with a starter database on a Windows platform:

    1. Log on as Administrator.

    2. From the Database installation directory, run SETUP.EXE.

    Oracle Universal Installer opens and displays the Select Installation Method page. Choose Basic Installation. Specify the Oracle home directory. Choose Enterprise Edition as the Installation Type. Check the Create Starter Database box. Provide a global database name and a password for the system accounts.

    Note: You will have the opportunity to change the passwords for the system accounts at a later time.

    3. On the Product-Specific Prerequisite Checks page, verify that all checks succeeded. If any checks failed, then you must correct the problem before proceeding.

    4. The Summary page displays the settings and components for the installation. Click the Install button to proceed.

    5. The Installer proceeds with the installation.

    6. After installing the software, the Installer invokes the Configuration Assistants to create the starter database.

    7. The Database Configuration Assistant copies the files for the starter database.

    8. The Database Configuration Assistant displays information about the starter database. Click the Password Management button. Unlock any accounts that you plan to use. If you plan to use the demo programs, you must unlock the SH account. Click OK.

    Tip: In the Password Management dialog, you can specify new passwords for the system accounts.

    9. The final screen displays the URLs for Enterprise Manager Database Control and for iSQL*Plus. See "Remote Administration" on page 1-8.

    Tip: Print a screen shot of this page or write down the information it contains for future reference.

    Click Exit to exit the Installer.

    Upon completion, the Installer displays the login page for the Database Control.

    Check the Data Mining Option

    You can verify your Data Mining installation by querying the V$OPTION view in SQL.

    SQL> SELECT VALUE FROM V$OPTION WHERE PARAMETER = 'Data Mining';

    VALUE
    ----------------------------------------------------------------
    TRUE

    Tips for Linux Installations

    The step-by-step instructions in this guide are for installations on a Windows platform. The following are some tips to help you with the differences between the Linux and Windows platforms.

    Follow the instructions in the installation guide for your platform. Be sure to set system variables and perform any other pre-installation tasks.

    During the installation, you must run some scripts as root, so make sure that you have root access.

    To run Oracle Universal Installer, type runInstaller from the installation disk or directory.

    Create a script for setting the appropriate environment variables, or include the settings in the startup script for your operating system login (such as .profile or .cshrc). Perform any other post-installation tasks.

    The following is a sample script for setting environment variables:

    setenv ORACLE_ROOT /dat1/11gR1
    setenv ORACLE_BASE ${ORACLE_ROOT}
    setenv ORACLE_HOME ${ORACLE_BASE}/oracle/product/11.1.0/db_1
    setenv ORACLE_PORT 1521
    setenv ORACLE_SID rel11g
    setenv TNS_ADMIN ${ORACLE_HOME}/network/admin
    setenv PATH ${ORACLE_HOME}/bin:${ORACLE_HOME}:$PATH

    Local Administration on Microsoft Windows

    Several tools for administrators and application developers are installed along with Oracle Database. For Microsoft Windows platforms, the Start menu contains an Oracle home program group with links to the tools.

    Following are descriptions of a few of the basic administrative tools.

    Oracle Enterprise Manager Database Control

    Database Control provides a Web-based graphical interface for managing all aspects of Oracle Database.

    To open Database Control from the Windows Start menu, click Start > Programs > Oracle oracle_home > Database Control database_name.

    You can also open Database Control from the URL provided during installation.

    SQL*Plus

    SQL*Plus is a command-line interface for the SQL language. You can perform all Oracle administrative tasks using SQL.

    To open SQL*Plus from the Windows Start menu, click Start > Programs > Oracle oracle_home > Application Development > SQL Plus.

    You will be prompted for your user name and password. You must supply a host string only when connecting to a remote computer. The host string takes the form host_name:port:SID, such as MYHOST:1521:ORCL.

    Database Configuration Assistant

    Database Configuration Assistant provides a graphical user interface for creating, configuring, and deleting database instances. A single installation of Oracle Database can support numerous individual database instances. You can use Database Configuration Assistant to install the sample schemas if you did not install them with the database.

    To open Database Configuration Assistant from the Windows Start menu, click Start > Programs > Oracle oracle_home > Configuration and Migration Tools > Database Configuration Assistant.

    See Also:

    Oracle Database Quick Installation Guide for Linux x86 for a basic installation

    Oracle Database Installation Guide for Linux for a custom installation, upgrade, or other variation

    Oracle Universal Installer

    You can use Oracle Universal Installer to list the Oracle products on your computer or to deinstall them.

    To open Oracle Universal Installer from the Windows Start menu, click Start > Programs > Oracle oracle_home > Oracle Installation Products > Universal Installer.

    You must shut down all databases and supporting services before deinstalling Oracle Database. Refer to the installation guide for your platform for more information.

    Oracle Services

    The Oracle Database installation creates several services on Windows. The following table describes some of them.

    Service Name                 Description                                          Usage
    ---------------------------  ---------------------------------------------------  ----------------------------------------------------------------------
    OracleServiceSID             Oracle Database                                      Enables you to start and stop Oracle Database from the Service window.
    OracleHome_NameTNSListener   Oracle Database listener                             Enables you to open a connection with Oracle Database from a remote computer.
    OracleHome_NameiSQL*Plus     iSQL*Plus application server                         Enables you to open iSQL*Plus from a browser.
    OracleDBConsoleSID           Oracle Enterprise Manager Database Control console   Enables you to open Database Control from a browser.

    To manage them, open Administrative Tools in the Windows Control Panel and choose Services.

    Local Administration on Linux

    The same tools that are installed locally on a Windows platform are also installed on Linux. You can run the local administrative tools from the shell command line. They are located in $ORACLE_HOME/bin. These are a few of the tools:

    To open SQL*Plus, type sqlplus.

    To open Database Configuration Assistant, type dbca.

    To open Enterprise Manager Database Control, open a browser and type the URL provided during installation.

    To open Oracle Universal Installer, type $ORACLE_HOME/oui/bin/runInstaller.

    To start and stop the various Oracle processes, use these commands:

    lsnrctl: Oracle Database listener

    isqlplusctl: iSQL*Plus application server

    emctl: Oracle Enterprise Manager Database Control console

    For descriptions of these tools, refer to "Local Administration on Microsoft Windows" on page 1-6.

    Remote Administration

    You can open these tools in any browser by typing the URLs listed during installation on the End of Installation page:

    iSQL*Plus is a version of SQL*Plus that runs in a browser.

    Enterprise Manager Database Control is the same thin-client application that you access locally.

    The administration tools installed with Oracle Database are also installed with Oracle Client. If you are administering a remote database, you must install Oracle Client to obtain the full suite of administration tools.

    2 Working With Mining Model Objects

    In this chapter, you will learn how to find information about mining models in the data dictionary and how to perform various operations on mining models.

    This chapter contains the following topics:

    Obtaining Information from the Data Dictionary

    Adding a Comment to a Mining Model

    Auditing Mining Models

    Exporting and Importing Mining Models

    Upgrading Oracle Data Mining

    Obtaining Information from the Data Dictionary

    Mining models are database schema objects. They can be queried in the ALL, DBA, and USER data dictionary views.

    See Also: Chapter 3, "Managing Security for Data Mining" for information about system and object privileges associated with mining model objects.

    The data dictionary views in Table 2-1 reveal information about mining models created by Oracle Data Mining.

    Table 2-1 Oracle Data Mining Data Dictionary Views

    ALL_ Views                    DBA_ Views                    USER_ Views
    ----------------------------  ----------------------------  -----------------------------
    ALL_MINING_MODELS             DBA_MINING_MODELS             USER_MINING_MODELS
    ALL_MINING_MODEL_ATTRIBUTES   DBA_MINING_MODEL_ATTRIBUTES   USER_MINING_MODEL_ATTRIBUTES
    ALL_MINING_MODEL_SETTINGS     DBA_MINING_MODEL_SETTINGS     USER_MINING_MODEL_SETTINGS

    See Also: Oracle Database Reference for complete descriptions of the Data Mining views in the data dictionary.

    Obtaining Information about Mining Models

    You can query the ALL_MINING_MODELS data dictionary view to obtain information about all accessible mining model objects. USER and DBA versions of this view are also available.

    SQL> describe all_mining_models
    Name                                      Null?    Type
    ----------------------------------------- -------- ----------------------------
    OWNER                                     NOT NULL VARCHAR2(30)
    MODEL_NAME                                NOT NULL VARCHAR2(30)
    MINING_FUNCTION                                    VARCHAR2(30)
    ALGORITHM                                          VARCHAR2(30)
    CREATION_DATE                             NOT NULL DATE
    BUILD_DURATION                                     NUMBER
    MODEL_SIZE                                         NUMBER
    COMMENTS                                           VARCHAR2(4000)
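    As an illustrative sketch, the following query lists the classification models visible to the current user, using only columns shown in the describe output above. The filter value 'CLASSIFICATION' is taken from the sample query output later in this chapter; the choice of columns and the ordering are assumptions made for the example.

```sql
-- List accessible classification models with their algorithm and build details.
-- Every column referenced here appears in the describe output of ALL_MINING_MODELS.
SELECT owner,
       model_name,
       algorithm,
       creation_date,
       build_duration,
       model_size
  FROM all_mining_models
 WHERE mining_function = 'CLASSIFICATION'
 ORDER BY owner, model_name;
```

    Remove the WHERE clause to list models for every mining function.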

    Obtaining Information about Mining Model Attributes

    You can query the ALL_MINING_MODEL_ATTRIBUTES data dictionary view to obtain information about all accessible mining model attributes. USER and DBA versions of this view are also available.

    SQL> describe all_mining_model_attributes
    Name                                      Null?    Type
    ----------------------------------------- -------- ----------------------------
    OWNER                                     NOT NULL VARCHAR2(30)
    MODEL_NAME                                NOT NULL VARCHAR2(30)
    ATTRIBUTE_NAME                            NOT NULL VARCHAR2(30)
    ATTRIBUTE_TYPE                                     VARCHAR2(11)
    DATA_TYPE                                          VARCHAR2(12)
    DATA_LENGTH                                        NUMBER
    DATA_PRECISION                                     NUMBER
    DATA_SCALE                                         NUMBER
    USAGE_TYPE                                         VARCHAR2(8)
    TARGET                                             VARCHAR2(3)
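    A minimal sketch of how this view might be queried for a single model. It assumes a model named DT_SH_CLAS_SAMPLE (the sample model used later in this chapter) exists in your own schema; substitute the name of one of your models.

```sql
-- Show the attributes (signature) of one model in the current user's schema.
-- DT_SH_CLAS_SAMPLE is an assumed model name; replace it with your own.
SELECT attribute_name,
       attribute_type,
       data_type,
       usage_type,
       target
  FROM user_mining_model_attributes
 WHERE model_name = 'DT_SH_CLAS_SAMPLE'
 ORDER BY attribute_name;
```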

    Obtaining Information about Mining Model Settings

    You can query the ALL_MINING_MODEL_SETTINGS data dictionary view to obtain information about all accessible mining model settings. USER and DBA versions of this view are also available.

    SQL> describe all_mining_model_settings
    Name                                      Null?    Type
    ----------------------------------------- -------- ----------------------------
    OWNER                                     NOT NULL VARCHAR2(30)
    MODEL_NAME                                NOT NULL VARCHAR2(30)
    SETTING_NAME                              NOT NULL VARCHAR2(30)
    SETTING_VALUE                                      VARCHAR2(4000)
    SETTING_TYPE                                       VARCHAR2(7)
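    As a hedged example, the settings of a single model can be listed as follows. The model name DT_SH_CLAS_SAMPLE is an assumption borrowed from the sample model used later in this chapter.

```sql
-- List the settings recorded for one model in the current user's schema.
-- DT_SH_CLAS_SAMPLE is an assumed model name; replace it with your own.
SELECT setting_name,
       setting_value,
       setting_type
  FROM user_mining_model_settings
 WHERE model_name = 'DT_SH_CLAS_SAMPLE'
 ORDER BY setting_name;
```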

    Adding a Comment to a Mining Model

    You can associate a comment with a mining model using a SQL COMMENT statement.

    COMMENT ON MINING MODEL schema_name.model_name IS string;

    To drop a comment, set it to the empty string ''.

    Note: To add a comment to a model in another schema, you must have the COMMENT ANY MODEL system privilege.

    The following statement adds a comment to the model DT_SH_CLAS_SAMPLE in your own schema.

    SQL> COMMENT ON mining model dt_sh_clas_sample IS 'Decision Tree model predicts promotion response';

    You can view the comment by querying the catalog view USER_MINING_MODELS.

    SQL> COLUMN comments FORMAT a22
    SQL> SELECT model_name, mining_function, algorithm, comments FROM user_mining_models;

    MODEL_NAME        MINING_FUNCTION  ALGORITHM      COMMENTS
    ----------------- ---------------- -------------- -----------------------------------------------
    DT_SH_CLAS_SAMPLE CLASSIFICATION   DECISION_TREE  Decision Tree model predicts promotion response

    To drop this comment from the database, issue the following statement:

    SQL> COMMENT ON mining model dt_sh_clas_sample IS '';

    Auditing Mining Models

    You can use the SQL auditing system to track operations on data mining models.

    Enabling Auditing in the Database

    The database initialization parameter AUDIT_TRAIL controls whether or not auditing is enabled. To enable auditing, set AUDIT_TRAIL in the database initialization file as follows.

    audit_trail = db

    This setting directs all audit records to the database audit trail. You can specify auditing options regardless of whether auditing is enabled. However, Oracle Database does not generate audit records until you enable auditing.
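    A minimal sketch for checking and changing the setting from SQL*Plus instead of editing the initialization file by hand. It assumes that the instance uses a server parameter file (spfile) and that you are connected with administrative privileges; neither assumption comes from this guide. Because AUDIT_TRAIL is a static parameter, a new value takes effect only after the database is restarted.

```sql
-- Show the current value of the AUDIT_TRAIL initialization parameter.
SELECT name, value
  FROM v$parameter
 WHERE name = 'audit_trail';

-- Assumption: the instance was started with a server parameter file.
-- AUDIT_TRAIL is static, so the change is written to the spfile only
-- and the database must be restarted for it to take effect.
ALTER SYSTEM SET audit_trail = DB SCOPE = SPFILE;
```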

    Opening an Audit Trail on Mining Models

    Use the SQL AUDIT statement to open an auditing trail on a data mining model.

    AUDIT operation ON MINING MODEL schema_name.model_name;

    Note: To audit a mining model in another schema, you must have the AUDIT ANY system privilege.

    You can track the following operations on mining models.

    Audit Operation  Description
    ---------------  ---------------------------------------------
    AUDIT            Generate an audit trail for a mining model
    COMMENT          Add a comment to a mining model
    GRANT            Give permission to a user to access the model
    RENAME           Change the name of the model
    SELECT           Apply the model or view its signature

    For example, this statement generates an audit trail for all GRANT operations on the model ABN_SH_CLAS_SAMPLE in the DMUSER schema.

    SQL> AUDIT GRANT ON mining model dmuser.abn_sh_clas_sample;

    This statement generates an audit trail for all operations on the same model.

    SQL> AUDIT GRANT, AUDIT, COMMENT, RENAME, SELECT ON MINING MODEL dmuser.abn_sh_clas_sample;

    You can refine the criteria for auditing with the following additional semantics.

    AUDIT operation ON MINING MODEL schema_name.model_name [BY {SESSION | ACCESS}] [WHENEVER [NOT] SUCCESSFUL];

    Specify BY SESSION if you want Oracle Database to write a single record for all operations of the same type on each mining model in the same session. Specify BY ACCESS if you want Oracle Database to write one record for each audited operation.
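
    As an illustration of these optional clauses, the following sketch audits only failed attempts to apply or inspect the sample model and writes one record per attempt. The model name is the one used in the examples above; the choice of clauses is an assumption made for illustration.

    ```sql
    -- Write one audit record for each unsuccessful SELECT operation on the model.
    AUDIT SELECT ON MINING MODEL dmuser.abn_sh_clas_sample
      BY ACCESS
      WHENEVER NOT SUCCESSFUL;
    ```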

    Closing the Audit Trail

    Use the NOAUDIT statement to stop one or more auditing operations previously enabled by the AUDIT statement.

    NOAUDIT {operation | ALL} ON MINING MODEL model_name [WHENEVER [NOT] SUCCESSFUL];
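
    For instance, assuming the audit options from the earlier examples are in effect on the sample model, statements like the following would turn them off again.

    ```sql
    -- Stop auditing GRANT operations on the model; other audit options stay in effect.
    NOAUDIT GRANT ON MINING MODEL dmuser.abn_sh_clas_sample;

    -- Stop auditing every operation on the model.
    NOAUDIT ALL ON MINING MODEL dmuser.abn_sh_clas_sample;
    ```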

    Viewing the Audit Trail

    For each audited operation, Oracle Database produces an audit record containing:

    The name of the user performing the operation

    The type of operation

    The object involved in the operation

    The date and time of the operation

    Several data dictionary views present auditing information. Some examples are:

    DBA_AUDIT_OBJECT displays audit trail records for all objects in the database.

    USER_AUDIT_OBJECT displays audit trail records for all objects accessible to the current user.

    DBA_OBJ_AUDIT_OPTS describes auditing options for all objects in the database.

    USER_OBJ_AUDIT_OPTS describes auditing options for all objects owned by the current user.
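
    The audit record fields listed above map to columns of these views. The following query is a sketch of how a DBA might list the audit records for the sample model; the owner and model name are taken from the earlier examples, and the query assumes that auditing is enabled and that you can read DBA_AUDIT_OBJECT.

    ```sql
    -- Who did what to the model, and when.
    SELECT username, action_name, owner, obj_name, timestamp
      FROM dba_audit_object
     WHERE owner = 'DMUSER'
       AND obj_name = 'ABN_SH_CLAS_SAMPLE'
     ORDER BY timestamp;
    ```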

    Note: The Oracle Database auditing system is a powerful, highly configurable tool available to administrators. Refer to the following manuals for more information:

    SQL Reference for documentation of the AUDIT and NOAUDIT statements.

    Database Reference for documentation of the AUDIT_TRAIL initialization parameter and the data dictionary views for querying the database audit trail.

    Security Guide for a comprehensive discussion of database auditing.

    Exporting and Importing Mining Models

    You can export data mining models to flat files to back up work in progress or to move models to a different instance of Oracle Database Enterprise Edition (such as from a development database to a production database). All methods for exporting and importing models are based on Oracle Data Pump technology.

    Oracle Data Pump consists of two command-line clients and two PL/SQL APIs. The command-line clients, expdp and impdp, provide an easy-to-use interface to the Data Pump export and import utilities. The Data Mining APIs also use the Data Pump export and import utilities.

    You can export and import models at different levels, depending on your access rights in the database:

    Database. When a DBA exports a full database using expdp, all data mining models in the database are exported. The impdp utility imports all the models with the other objects in the database.

    Schema. When a DBA or an individual user exports a schema using expdp, all the data mining models in the schema are exported. Likewise, impdp imports all the models with the other objects in the schema.

    Models Only. The Data Mining APIs contain utilities for exporting and importing either all Data Mining models in a schema or models that match specific criteria.

    The Data Pump export utility writes the tables and metadata that constitute a model to a dump file set, which consists of one or more files. The Data Pump import utility retrieves the tables and metadata from the dump file and restores them to the target database. Because the expdp and impdp clients and the Data Mining APIs use the Data Pump export and import utilities, you can use the APIs to extract individual models from a dump file of a schema or database.

    Note that the older exp and imp database utilities do not export or import data mining models.

    Prerequisites

    To export and import Data Mining models, you must have read and write access to a directory object, and you may need additional database permissions.

    Directory Objects

    A directory object is a logical name in the database for a physical directory on the host computer. Without read and write access to a directory object, you cannot access the host computer file system from within Oracle Database.

    You must have the CREATE ANY DIRECTORY privilege to create directory objects.

    See Also: Oracle Database Utilities for a complete discussion of Oracle Data Pump and the expdp and impdp utilities.

    Oracle Database PL/SQL Packages and Types Reference for detailed information about the export and import procedures in the DBMS_DATA_MINING package.

    Oracle Data Mining Java API Reference for information about the export and import classes in the Oracle Data Mining Java API.

    The following SQL command creates, or re-creates if it already exists, a directory object named DMTEST. The file system directory (in this example, C:\ORACLE\PRODUCT\11.1.0\DMINING) must already exist and have shared read/write access rights granted by the operating system.

    CREATE OR REPLACE DIRECTORY dmtest AS 'c:\oracle\product\11.1.0\dmining';

    This SQL command gives user DMUSER1 both read and write access to DMTEST.

    GRANT ALL ON DIRECTORY dmtest TO dmuser1;

    For more information about creating database directories, refer to the CREATE DIRECTORY and GRANT commands in the Oracle Database SQL Language Reference.
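
    If you prefer to name the privileges explicitly, a sketch along the following lines grants only read and write access and then confirms where the directory object points. The directory and user names are the ones from the example above; the verification query is an addition for illustration.

    ```sql
    -- Grant just the two privileges that export and import need.
    GRANT READ, WRITE ON DIRECTORY dmtest TO dmuser1;

    -- Confirm the directory object and the file system path it maps to.
    SELECT directory_name, directory_path
      FROM all_directories
     WHERE directory_name = 'DMTEST';
    ```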

    Additional Database Privileges

    You may need special privileges in the database to take full advantage of all Data Pump features, such as importing models and other objects into a different schema. These privileges are granted by the EXP_FULL_DATABASE and IMP_FULL_DATABASE roles.

    You do not need these roles to export models from your own schema. To import models, you must have the same database roles or be as privileged as the user who created the dump file set. Otherwise, you need the IMP_FULL_DATABASE role.

    Privileged users (such as SYS or a user with the DBA role) have sufficient access rights and do not need these additional roles.

    The following SQL commands grant these roles to DMUSER1:

    GRANT EXP_FULL_DATABASE TO dmuser1;
    GRANT IMP_FULL_DATABASE TO dmuser1;
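
    To check afterward that the roles are in place, a DBA could run a query like this one. It is a sketch that assumes access to the DBA_ROLE_PRIVS view.

    ```sql
    -- List the Data Pump roles currently granted to DMUSER1.
    SELECT grantee, granted_role
      FROM dba_role_privs
     WHERE grantee = 'DMUSER1'
       AND granted_role IN ('EXP_FULL_DATABASE', 'IMP_FULL_DATABASE');
    ```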

    PL/SQL APIs for Exporting and Importing Models

    The DBMS_DATA_MINING PL/SQL package contains these two procedures:

    EXPORT_MODEL

    IMPORT_MODEL

    For more information about these procedures, refer to the Oracle Database PL/SQL Packages and Types Reference.
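
    As a rough illustration of exporting models that match specific criteria, the following block passes a filter to EXPORT_MODEL so that only one model is written to the dump file. The dump file name nb_only is hypothetical, the model name is borrowed from the example later in this section, and the DMTEST directory object is assumed to exist.

    ```sql
    -- Export a single model, selected by name, from the current schema.
    BEGIN
      DBMS_DATA_MINING.EXPORT_MODEL(
        filename     => 'nb_only',
        directory    => 'DMTEST',
        model_filter => 'name = ''NAIVE_BAYES_SAMPLE''');
    END;
    /
    ```

    Omitting the filter, as in the example later in this section, exports every model in the schema.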

    Java APIs for Exporting and Importing Models

    Oracle Database implements the industry-standard Java Data Mining (JDM) API Specification, which includes these two interfaces:

    javax.datamining.task.ExportTask

    javax.datamining.task.ImportTask

    For more information about the standard JDM API, refer to the Java Help for the JSR-73 Specification, which is available on the Oracle Technology Network at

    http://www.oracle.com/technology/products/bi/odm/JSR-73/index.html

    Tables Created By Exporting and Importing Models

    The Data Mining export and import utilities create tables in the user's schema that are for internal use only:

    DM$P_MODEL_EXPIMP_TEMP. Used for internal purposes during export and import, and provides a job history.

    DM$P_MODEL_IMPORT_TEMP. Used only for internal purposes during import.

    DM$P_MODEL_TABKEY_TEMP. Used only for internal purposes during export and import.

    Do not alter these tables. However, you may drop them when no export or import job is running. The utilities will re-create them for the next job.
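
    To see which of these internal tables currently exist in your schema, you could run a query such as the following. This is only a sketch; the pattern is an assumption based on the three table names listed above.

    ```sql
    -- List the internal export/import tables in the current schema.
    -- The underscores are escaped so that they match literally.
    SELECT table_name
      FROM user_tables
     WHERE table_name LIKE 'DM$P\_MODEL\_%' ESCAPE '\';
    ```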

    Example: Exporting and Importing Models

    This example creates a dump file with three models and imports the models from the dump file.

    Exporting Models from the DMUSER Schema

    The following command exports all models from DMUSER, who is currently connected to the database in SQL*Plus.

    SQL> EXECUTE dbms_data_mining.export_model('allmodels.dmp','DMTEST');

    PL/SQL procedure successfully completed.

    An export or import creates a log file in the same directory as the dump file. Error messages are returned to the current output device (such as the screen), and the log file may provide additional information.

    This sample command was successful and created two files in the DMTEST directory:

    A dump file named ALLMODELS01.DMP (note the 2-digit suffix added to the name)

    A log file with the name DMUSER_EXP_4589.LOG

    For detailed information about the default names of files, see the DBMS_DATA_MINING package in the Oracle Database PL/SQL Packages and Types Reference.

    You can view the log file using a system command or editor. You must know the path of the physical directory in order to locate the file.

    DMUSER_EXP_4589.LOG lists the three Data Mining models that were in the schema, plus additional objects as shown here:

    Starting "DMUSER"."DMUSER_exp_45": DM_EXPIMP_JOB_ID=45
    Estimate in progress using BLOCKS method...
    Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
    Total estimation using BLOCKS method: 1.062 MB
    >>> . . exported Data Mining Model "DMUSER"."ABN_CLAS_SAMPLE"
    >>> . . exported Data Mining Model "DMUSER"."ASSOCIATION_RULES_SAMPLE"
    >>> . . exported Data Mining Model "DMUSER"."NAIVE_BAYES_SAMPLE"
    Processing object type TABLE_EXPORT/TABLE/PROCACT_INSTANCE
    Processing object type TABLE_EXPORT/TABLE/TABLE
    Processing object type TABLE_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
    Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
    Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
    Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
    Processing object type TABLE_EXPORT/TABLE/INDEX/FUNCTIONAL_AND_BITMAP/INDEX
    Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_AND_BITMAP/INDEX_STATISTICS
    Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
    . . exported "DMUSER"."DM$P0ASSOCIATION_RULES_SAMPLE" 7.640 KB 15 rows
    . . exported "DMUSER"."DM$P0NAIVE_BAYES_SAMPLE" 18.35 KB 219 rows
    . . exported "DMUSER"."DM$P1ABN_CLAS_SAMPLE" 6.945 KB 2 rows
    . . exported "DMUSER"."DM$P1NAIVE_BAYES_SAMPLE" 5.929 KB 2 rows
    . . exported "DMUSER"."DM$P2ASSOCIATION_RULES_SAMPLE" 6.210 KB 11 rows
    . . exported "DMUSER"."DM$P3ASSOCIATION_RULES_SAMPLE" 6.179 KB 18 rows
    . . exported "DMUSER"."DM$P4ASSOCIATION_RULES_SAMPLE" 5.492 KB 26 rows
    . . exported "DMUSER"."DM$P5ABN_CLAS_SAMPLE" 5.304 KB 2 rows
    . . exported "DMUSER"."DM$P5NAIVE_BAYES_SAMPLE" 5.984 KB 27 rows
    . . exported "DMUSER"."DM$P6ABN_CLAS_SAMPLE" 16.47 KB 34 rows
    . . exported "DMUSER"."DM$P7ABN_CLAS_SAMPLE" 7.007 KB 5 rows
    . . exported "DMUSER"."DM$P8ABN_CLAS_SAMPLE" 5.414 KB 5 rows
    . . exported "DMUSER"."DM$P8ASSOCIATION_RULES_SAMPLE" 5.335 KB 3 rows
    . . exported "DMUSER"."DM$P8NAIVE_BAYES_SAMPLE" 5.359 KB 3 rows
    . . exported "DMUSER"."DM$PEABN_CLAS_SAMPLE" 9.093 KB 116 rows
    . . exported "DMUSER"."DM$PENAIVE_BAYES_SAMPLE" 8.742 KB 116 rows
    . . exported "DMUSER"."DM$P_MODEL_EXPIMP_TEMP" 6.273 KB 10 rows
    . . exported "DMUSER"."DM$PEASSOCIATION_RULES_SAMPLE" 0 KB 0 rows
    Master table "DMUSER"."DMUSER_exp_45" successfully loaded/unloaded
    ******************************************************************************
    Dump file set for DMUSER.DMUSER_exp_45 is:
      C:\ORACLE\PRODUCT\11.1.0.2\DMINING\ALLMODELS01.DMP
    Job "DMUSER"."DMUSER_exp_45" successfully completed at 08:40:08

    Importing Models Into the Same Schema

    DMUSER can restore these models from the dump file at a later date if, for whatever reason, he or she wants to revert to this version of the models. Note that an import will not overwrite an existing model with the same name unless the model is incomplete or corrupted.

    The following command restores all models from the dump file to the DMUSER schema:

    SQL> EXECUTE dbms_data_mining.import_model('allmodels01.dmp','DMTEST');

    Importing Models Into a Different Schema

    A user with the necessary privileges can load the models from a dump file into a different schema. In the next example, the SYSTEM user issues the following command, which loads the three models into the SCOTT schema:

    SQL> EXECUTE dbms_data_mining.import_model('allmodels01.dmp', 'dmtest', null, null, null, 'toscott', 'dmuser:scott');

    The job name parameter (toscott) determines the name of the log file, TOSCOTT.LOG; the .LOG extension is added automatically to the name. The log file shows the names of the imported models and supporting metadata.

    Master table "SYSTEM"."toscott" successfully loaded/unloaded
    Starting "SYSTEM"."toscott": DM_EXPIMP_JOB_ID=51|DM_SELECT_IMPORT
    Processing object type TABLE_EXPORT/TABLE/PROCACT_INSTANCE
    >>> . . imported Data Mining Model "SCOTT"."ABN_CLAS_SAMPLE"
    >>> . . imported Data Mining Model "SCOTT"."ASSOCIATION_RULES_SAMPLE"
    >>> . . imported Data Mining Model "SCOTT"."NAIVE_BAYES_SAMPLE"
    Processing object type TABLE_EXPORT/TABLE/TABLE
    Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
    Processing object type TABLE_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
    Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
    Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
    Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
    Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
    Job "SYSTEM"."toscott" completed with 1 error(s) at 09:08:12
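
    Because the job in this example reports an error, it is worth confirming that the models actually arrived in the target schema. A query like the following is one way to check; it is a sketch that assumes the connected user can see the models in SCOTT through the ALL_MINING_MODELS view.

    ```sql
    -- List the mining models that now exist in the SCOTT schema.
    SELECT owner, model_name, mining_function, algorithm
      FROM all_mining_models
     WHERE owner = 'SCOTT'
     ORDER BY model_name;
    ```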

    Upgrading Oracle Data Mining

    In 11g Release 1 (11.1), Oracle Data Mining upgrade is integrated with the upgrade of Oracle Database. Data Mining metadata is upgraded and incorporated in the SYS schema. Data Mining models are automatically upgraded and available for use in the new upgraded environment. Upgraded models will continue to work as they did in the prior release. All new mining functionality in 11g can be used in the upgraded environment.

    Supported Version Upgrades

    Oracle Data Mining 10.2, including both metadata and models, can be upgraded to 11g. The Oracle Data Mining PL/SQL API can also be upgraded from 10.1 to 11g.

    The Oracle Data Mining Java API cannot be upgraded from 10.1 to 11g. The 10.1 version of the Java API was no longer supported in Oracle Data Mining 10.2.

    Model upgrade from 9.2 to 11g is not supported.

    Using Oracle Database Upgrade Assistant

    You can use Oracle Database Upgrade Assistant to upgrade a database to 11g. During the upgrade, all Data Mining metadata that previously existed in the DMSYS schema is created in SYS. After the upgrade, all Data Mining objects will reside in SYS and will no longer reside in DMSYS. Data Mining models, which reside in the owner's schema, will also be upgraded to 11g.

    On Windows platforms, you can use the Database Upgrade Assistant graphical tool to upgrade to 11g. To start the Upgrade Assistant:

    1. Go to the Windows Start menu and choose the Oracle home directory.

    2. Choose the Configuration and Migration Tools menu.

    3. Launch the Upgrade Assistant.

    On Linux platforms, run the DBUA utility to upgrade Oracle Database.

    After upgrading, check the upgrade log file and DBA_REGISTRY to ensure that the upgrade process completed successfully. Also check the DBA_MINING_MODELS view. The newly upgraded mining models should be listed there with the version value set to 1.
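
    A quick way to review the registry after the upgrade is a query such as the following. It is a sketch that assumes DBA access; the exact component rows you see depend on what is installed in your database.

    ```sql
    -- Review the version and status of each registered database component.
    SELECT comp_id, comp_name, version, status
      FROM dba_registry
     ORDER BY comp_id;
    ```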

    After you have verified the upgrade and confirmed that there will be no need to downgrade, you should set the initialization parameter COMPATIBLE to 11g. At that point, you can drop the DMSYS schema from the database. Once DMSYS is removed, the DBA_REGISTRY will no longer list Oracle Data Mining as a component.

    See: Oracle Database Upgrade Guide for information about upgrading Oracle Database.

    Note: After upgrading to 11g, you can no longer switch to the Data Mining Scoring Engine (DMSE). The Scoring Engine does not exist in 11g.

    Exporting and Importing from a Dump File

    If you wish, you can use a less automated approach to upgrading Oracle Data Mining. You can export the models created in a previous version of Oracle Database (check "Supported Version Upgrades" on page 2-9) and import them into a new 11g database.

    To export your Data Mining models to a dump file, follow the instructions in "Exporting and Importing Mining Models" on page 2-5.

    Before importing from the dump file, run the DMEIDMSYS script to create the DMSYS schema in the 11g database.

    SQL>connect / as sysdba;
    SQL>@ORACLE_HOME/rdbms/admin/dmeidmsys.sql
    SQL>exit;

    To import the dump file into the database:

    %ORACLE_HOME/bin/impdp system/password file_name= .....
    SQL>connect / as sysdba;
    SQL>execute dmp_sys.upgrade_models(11.0.0);
    SQL>alter system flush shared_pool;
    SQL>exit;

    Shutting down the database before you operate on the upgraded mining models also flushes the shared pool.

    Downgrading

    Before downgrading the database back to the previous version, ensure that there are no 11g mining models in the upgraded database. Issue the following SQL statement in SYS to verify this:

    SQL>SELECT name from sys.model$ where version=2;

    If the query returns any mining models, you must manually delete them using the DBMS_DATA_MINING.DROP_MODEL routine before downgrading the database. If you do not do this, the database downgrade process will be aborted.
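
    Each model reported by the query can be dropped with a call like the one below. This is a minimal sketch; the schema and model name are hypothetical and must be replaced with the names that exist in your database.

    ```sql
    -- Drop one 11g model before the downgrade (hypothetical schema and model name).
    BEGIN
      DBMS_DATA_MINING.DROP_MODEL('dmuser.my_11g_model');
    END;
    /
    ```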

    Note: The TEMP tablespace must already exist in the 11g database. The DMEIDMSYS script uses TEMP and SYSAUX to create the DMSYS schema.

    3 Managing Security for Data Mining

    Oracle Data Mining uses the standard security features of Oracle Database to control access to mining models.

    This chapter contains the following topics:

    Security Features of Oracle Data Mining

    System Privileges

    Object Privileges

    Creating Data Mining Users

    Security Features of Oracle Data Mining

    With the security features of Oracle Data Mining:

    You can control which users are allowed to create mining models.

    You can control which users are allowed to view model details, modify, apply, or drop models.

    You can confine users' data mining activities to their own schemas.

    You can use comments to identify mining models and audit trails to monitor the activities of mining models.

    You can share models without duplicating the underlying objects.

    System Privileges

    A system privilege confers the right to perform a particular action in the database or to perform an action on a type of schema object. For example, the privileges to create tablespaces and to delete the rows of any table in a database are system privileges.

    To grant a system privilege, you must either have been granted the system privilege with the ADMIN OPTION or have been granted the GRANT ANY PRIVILEGE system privilege.

    See: Oracle Database Security Guide for comprehensive information on Oracle Database security.

    Data Mining in Your Own Schema

    The following system privileges are required for data mining in your own schema. With these privileges, you can create and drop mining models, view model details, and apply models to data in your own schema.

    CREATE MINING MODEL

    CREATE PROCEDURE

    CREATE SESSION

    CREATE TABLE

    CREATE SEQUENCE

    CREATE VIEW

    CREATE JOB

    CREATE TYPE

    CREATE SYNONYM
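
    An administrator could grant the whole list in a single statement, as sketched below. The grantee DMUSER1 is the example user created later in this chapter and is used here only for illustration.

    ```sql
    -- Grant the system privileges needed for data mining in the user's own schema.
    GRANT CREATE MINING MODEL,
          CREATE PROCEDURE,
          CREATE SESSION,
          CREATE TABLE,
          CREATE SEQUENCE,
          CREATE VIEW,
          CREATE JOB,
          CREATE TYPE,
          CREATE SYNONYM
       TO dmuser1;
    ```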

    Additional access rights are required for exporting and importing mining models, as described in "Exporting and Importing Mining Models" on page 2-5.

    Object privileges may also be required for data mining, as described in "Object Privileges" on page 3-2.

    Data Mining in Any Schema

    The system privileges listed in Table 3-1 are required for data mining activities outside of your own schema.

    Table 3-1 System Privileges for Mining Model Objects

    System Privilege           Allows you to...
    CREATE ANY MINING MODEL    Create mining models in any schema.
    ALTER ANY MINING MODEL     Change the name or cost matrix of mining models in any schema.
    DROP ANY MINING MODEL      Drop mining models in any schema.
    SELECT ANY MINING MODEL    Apply mining models and view model details in any schema.
    COMMENT ANY MINING MODEL   Add comments to mining models in any schema.
    AUDIT ANY                  Audit mining models (or any object) in any schema.

    Object Privileges

    An object privilege confers the right to perform a specific action on a specific schema object. For example, you could allow another user to insert rows in a specific table in your schema by granting an object privilege.

    Users automatically have all object privileges for objects in their own schemas. The owner of a mining model or a system administrator can grant the object privileges described in Table 3-2.

    Table 3-2 Object Privileges for Data Mining

    Object Privilege       Allows you to...
    ALTER MINING MODEL     Change the name or cost matrix of the specified mining model.
    SELECT MINING MODEL    Apply the specified model or view its model details.

    EXECUTE access to the Oracle Text package CTXSYS.CTX_DDL is required for text mining.
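
    To illustrate how a model owner might share a model through object privileges, the following statements grant and later revoke access to a single model. This is a sketch; the model name NB_SH_CLAS_SAMPLE and the grantee SCOTT are assumptions chosen for the example.

    ```sql
    -- Let SCOTT apply the model and view its details.
    GRANT SELECT ON MINING MODEL dmuser.nb_sh_clas_sample TO scott;

    -- Let SCOTT rename the model or change its cost matrix.
    GRANT ALTER ON MINING MODEL dmuser.nb_sh_clas_sample TO scott;

    -- Withdraw the read access again.
    REVOKE SELECT ON MINING MODEL dmuser.nb_sh_clas_sample FROM scott;
    ```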

    Creating Data Mining Users

    Anyone who wants to use Oracle Database must have a user name and password. Data Mining users must have several database permissions depending on the kind of access that they need. Data Mining users who will mine large amounts of data should also have personal permanent and temporary tablespaces in which to do their work.

    You must log in with system privileges to create users and tablespaces.

    The following command creates a new permanent tablespace named ODMPERM with a file name of ODM1.DBF.

    SQL> CREATE TABLESPACE odmperm DATAFILE 'c:\oracle\oradata\orcl\odm1.dbf'
         SIZE 20M REUSE AUTOEXTEND ON NEXT 20M;

    The next command creates a new temporary tablespace named ODMTEMP with a file name of ODMTEMP.TMP.

    SQL> CREATE TEMPORARY TABLESPACE odmtemp TEMPFILE 'c:\oracle\oradata\orcl\odmtemp.tmp'
         SIZE 20M REUSE AUTOEXTEND ON NEXT 20M;

    The following command creates a user named DMUSER1 with the password CHANGE_NOW, and provides default access to two personal tablespaces.

    SQL> CREATE USER dmuser1 IDENTIFIED BY change_now DEFAULT TABLESPACE odmperm TEMPORARY TABLESPACE odmtemp QUOTA UNLIMITED ON odmperm;

    The following command grants the CREATE MINING MODEL privilege to DMUSER1. With this privilege, the user can log in with the specified password and create models in the DMUSER1 schema. With the CREATE MINING MODEL privilege, the user has full control over all mining activities within his own schema.

    SQL> GRANT create mining model TO dmuser1;

    The following command grants SELECT access to the EMPLOYEES table in the HR schema. This privilege allows DMUSER1 to mine data in HR.EMPLOYEES.

    SQL> GRANT SELECT ON hr.employees TO dmuser1;

    Note: SELECT access to the data being mined is always required.

    Creating a Demo User

    Demo Data Mining programs in SQL and Java are delivered with Oracle Database. Data Mining demo users require several database permissions, as well as SELECT access to tables in the SH sample schema. Users that will primarily use the demo programs, and not be mining large data sets, do not need personal tablespaces.

    To create a demo user for Oracle Data Mining, take these steps:

    1. Create a user name and password.

    2. Run the DMSHGRANTS script to grant the necessary privileges to the user.

    3. Run the DMSH script to populate the user's schema with objects that support the demo programs.

    The DMSHGRANTS and DMSH scripts are provided by the Database Companion installation. Once installed, you can locate these scripts in the \RDBMS\DEMO subdirectory of Oracle home. To install Database Companion, see the instructions in "Installing Oracle Database Companion" on page 4-3.

    To create the user, log in as the SYSTEM user and type a command like the following:

    SQL> CREATE USER dmuser IDENTIFIED BY dmpsw DEFAULT TABLESPACE users TEMPORARY TABLESPACE temp QUOTA UNLIMITED ON users;

    This command creates the user DMUSER with the password DMPSW. It provides default access to two tablespaces shared by several other sample schemas.

    DMSHGRANTS Script

    The DMSHGRANTS script grants the privileges required for data mining to a demo user. These privileges are listed in "Data Mining in Your Own Schema" on page 3-2.

    The DMSHGRANTS script also grants SELECT rights to these tables in the SH schema:

    COUNTRIES

    CUSTOMERS

    PRODUCTS

    SUPPLEMENTARY_DEMOGRAPHICS

    SALES

    For text mining, DMSHGRANTS grants access rights to an Oracle Text package:

    EXECUTE ON ctxsys.ctx_ddl

    To run DMSHGRANTS, log in to SQL*Plus as SYSDBA. Provide the password for the SH schema and the name of the Data Mining user as parameters, as shown here:

    SQL> CONNECT / as sysdba
    SQL> @%ORACLE_HOME%\rdbms\demo\dmshgrants sh_password dmuser
    SQL> COMMIT;

    DMSH Script

    The DMSH script populates the schema of the demo user with tables and views that reference the source data in the SH schema. It also creates tables and indexes that support text mining.

    To run DMSH, log in to SQL*Plus as the data mining user (in this case, DMUSER with the password DMPSW), as shown here:

    SQL> CONNECT dmuser/dmpsw
    SQL> @%ORACLE_HOME%\rdbms\demo\dmsh
    SQL> COMMIT;

    Sample Schemas

    The Oracle Data Mining demos use data in the SH sample schema. If the sample schemas were not originally installed in the database, you must install SH.

    To install the sample schemas after installation of Oracle Database, use Database Configuration Assistant.

    To install SH without the other sample schemas, run the SH_MAIN script. The sh_main.sql file is located in \demo\schema\sales_history in Oracle home. You must run the script as SYS.

    See Also: Oracle Database Sample Schemas for more details about installing the sample schemas.

    4 Using the Data Mining Sample Programs

    A number of sample programs are available with Oracle Data Mining. These programs illustrate the many features of the PL/SQL API, the Data Mining SQL functions, the Java API, and the BLAST table functions.

    The sample programs create a set of models in the database. You can experiment with these models using either the APIs or Oracle Data Miner. You can examine the sample source code, which includes numerous comments, to familiarize yourself with the Oracle Data Mining APIs, and you can create your own models by modifying the samples.

    This chapter includes the following sections:

    Getting Ready to Run the Sample Programs

    Installing Oracle Database Companion

    PL/SQL Sample Data Mining Programs

    Sample Java Programs

    Text Mining Sample Programs

    BLAST Sample Program

    Examining the Data

    Getting Ready to Run the Sample Programs

    Complete these steps before attempting to run the Data Mining sample programs:

    1. Install Oracle Database 11g Release 1 (11.1) Enterprise Edition with the sample schemas, or obtain the connection information to an existing installation. See Chapter 1.

    2. Obtain the sample programs by installing the Database Companion. See "Installing Oracle Database Companion" on page 4-3.

    3. Create a user ID for running the sample programs. Grant the necessary privileges to this user and populate the user's schema with objects used by the sample programs. See "Creating a Demo User" on page 3-3.

    4. To run the Java programs, check the requirements described in "Preparing to Run the Java Programs" on page 4-8.

    5. To run the BLAST programs, check the requirements described in "Preparing to Run the BLAST Demo" on page 4-12.

    The sample PL/SQL and Java programs are fully annotated with information that may be very helpful to you. For a description of the Sales History tables used by these programs, refer to "Examining the Data" on page 4-13.

    Example: Perform the Installation and Configuration Steps

    You can use the steps in this example as a quick start to running the sample programs. For complete instructions, refer to "Creating Data Mining Users" on page 3-3.

    Let's assume that you have access to Oracle Database 11g Release 1 (11.1), which has been installed on a Windows host with the Data Mining option, the sample schemas, and the sample programs. You can log in to this database as SYS using SQL*Plus.

    Several Data Mining users (DMUSER1 and DMUSER2) have already been created in this database. You want to create your own user ID (DMUSER3) and run the sample programs to create some starter models in your schema. To accomplish this, you would perform the following steps.

    1. Log in to the database as SYS and create the DMUSER3 user.

    > sqlplus
    Enter user-name: sys / as sysdba
    Enter password: sys_password
    SQL> create user dmuser3 identified by dmuser3_password
         default tablespace users
         temporary tablespace temp
         quota unlimited on users;

    2. Run dmshgrants.sql to grant privileges to DMUSER3.

    SQL> @ %ORACLE_HOME%\rdbms\demo\dmshgrants SH_password dmuser3

    3. Connect as DMUSER3 and run dmsh.sql to populate the DMUSER3 schema with views and tables needed by the sample programs. Save your changes in the database.

    SQL> connect dmuser3/dmuser3_password
    SQL> @ %ORACLE_HOME%\rdbms\demo\dmsh
    SQL> commit;

    You can now run any of the PL/SQL data mining demos. For example, while logged in to SQL*Plus as DMUSER3, you could run the Naive Bayes PL/SQL demo with the following statement.

    SQL>@ %ORACLE_HOME%\rdbms\demo\dmnbdemo

    You can also run any of the Java data mining demos, if you have Java 1.4.2 or higher and your CLASSPATH is set as described in "Preparing to Run the Java Programs" on page 4-8. For example, you could run the Naive Bayes Java demo with a command like the following at the operating system prompt.

    >java dmnbdemo myserver:1521:orcl dmuser3 dmuser3_password

    Note that the BLAST demo uses different data sets and requires a separate setup procedure. See "Preparing to Run the BLAST Demo" on page 4-12.

    Note: All the sample programs are re-executable. They start by deleting the results of the previous run before executing the current run.

    Installing Oracle Database Companion

    The Oracle Data Mining sample programs are provided by the Database Companion that is available with Oracle Database Enterprise Edition.

    The Database Companion installation process copies the Oracle Data Mining sample programs, along with examples and demonstrations of other database features, to the \RDBMS\demo subdirectory under Oracle home.

    To install the Database Companion on a Windows platform, take these steps:

    1. From the Companion installation directory, run SETUP.EXE.

    Oracle Universal Installer opens and displays the Welcome page. Click Next to advance to each page.

    2. On the Select a Product to Install page, select Oracle Database 11g Products 11.1.0.0.0.

    3. On the Specify Home Details page, select the Oracle home in which you installed Oracle Database. Do not rely on the default setting to be correct.

    4. On the Product-Specific Prerequisite Checks page, verify that all checks succeeded.

    If any checks failed, then you must correct the problem before proceeding.

    5. On the Summary page, review your previous choices, then click Install.

    6. On the End of Installation page, confirm that the installation was successful.

    PL/SQL Sample Data Mining Programs

    The PL/SQL sample programs illustrate the use of the DBMS_DATA_MINING package for creating models and the DBMS_DATA_MINING_TRANSFORM package for performing transformations on the mining data.

    You can find the SQL demos by searching for dm*.sql in \RDBMS\demo of Oracle home. Table 4-1 describes the PL/SQL sample data mining programs.

    Table 4-1 Mining Functions in PL/SQL Programs

    Mining Function       Algorithm                                Program File
    --------------------  ---------------------------------------  ----------------
    Association           Apriori                                  dmardemo.sql
    Attribute importance  Minimum Description Length               dmaidemo.sql
    Classification        Adaptive Bayes Network                   dmabdemo.sql
    Classification        Decision Tree                            dmdtdemo.sql
    Classification        Decision Tree (cross validation)         dmdtxvlddemo.sql
    Classification        Naive Bayes                              dmnbdemo.sql
    Classification        Support Vector Machine                   dmsvcdem.sql
    Classification        Support Vector Machine (one class)       dmsvodem.sql
    Classification        Binary Logistic Regression (GLM)         dmglcdem.sql
    Clustering            k-Means                                  dmkmdemo.sql
    Clustering            O-Cluster                                dmocdemo.sql
    Feature extraction    Non-Negative Matrix Factorization        dmnmdemo.sql
    Regression            Support Vector Machine                   dmsvrdem.sql
    Regression            Multivariate Linear Regression (GLM)     dmglrdem.sql
    Text mining           Term extraction using CTX procedures     dmtxtfe.sql
    Text mining           Non-Negative Matrix Factorization        dmtxtnmf.sql
    Text mining           Support Vector Machine (classification)  dmtxtsvm.sql

    Note: The PL/SQL text mining demos are described in "Text Mining in PL/SQL" on page 4-10.

    See Also: Oracle Database PL/SQL Packages and Types Reference and Oracle Data Mining Application Developer's Guide for information on the Oracle Data Mining PL/SQL API. See Oracle Database SQL Language Reference for information on the SQL functions for data mining.

    PL/SQL Program Summaries

    Summary descriptions of the PL/SQL sample programs are provided in Table 4-2. For detailed descriptions of the sample programs, see the comments in the source code.

    Table 4-2 Overview of the PL/SQL Sample Programs

    Mining Function    Description

    Classification
    The classification programs demonstrate various preprocessing techniques and perform the following steps:
    - Build a classification model using training data
    - Display model details and settings
    - Test the model by applying the model on the test data
    - Present test metrics, such as confusion matrix, lift, and ROC
    - Apply the model on the scoring data
    - Present apply results
    - Present ranked apply results, influenced by a cost matrix
    The dmdtxvlddemo.sql program demonstrates cross-validation techniques for decision tree-based classification. With minor modifications, this program can be used to perform cross-validation using other models/algorithms.

    Regression
    dmsvrdem.sql uses different test metrics, but otherwise performs most of the same steps used in the classification programs. Selected attributes of the input data are preprocessed (normalized).

    Association
    dmardemo.sql builds an association model and presents frequent itemsets and association rules as output. Selected attributes of the input data are preprocessed (binned).

    Clustering
    dmkmdemo.sql (k-Means) and dmocdemo.sql (O-Cluster) build clustering models and present cluster details, such as rules, centroid, and histogram for each cluster as output. The models are scored, and the probabilities associated with each cluster are returned as output. Selected attributes of the input data are preprocessed (normalized).

    Feature extraction
    dmnmdemo.sql builds a feature extraction model and presents model details as the output. The model is scored, and each feature ID is associated with a probability. Selected attributes of the input data are preprocessed (normalized).

    Attribute importance
    dmaidemo.sql builds an attribute importance model and presents a list of important attributes as the output of model details. Selected attributes of the input data are preprocessed (binned).

    Data Mining SQL Functions

    Some of the PL/SQL sample programs use Data Mining SQL functions to apply models created with the DBMS_DATA_MINING package. The Data Mining functions can also be used to apply models created with the Java API.

    The programs that demonstrate the Data Mining functions are listed in Table 4-3.

    Note: The SQL functions for Data Mining are documented in Oracle Database SQL Language Reference. Information about these functions is also provided in Oracle Data Mining Application Developer's Guide, where they are referred to as SQL scoring functions.

    Table 4-3 Data Mining SQL Functions in the Sample Programs

    Program Name  Algorithm           SQL Functions Used
    ------------  ------------------  ----------------------------------------------------------------
    dmkmdemo.sql  k-Means             CLUSTER_ID, CLUSTER_PROBABILITY, CLUSTER_SET
    dmocdemo.sql  O-Cluster           CLUSTER_ID
    dmnmdemo.sql  NMF                 FEATURE_ID, FEATURE_SET, FEATURE_VALUE
    dmdtdemo.sql  Decision Tree       PREDICTION, PREDICTION_COST, PREDICTION_DETAILS, PREDICTION_SET
    dmsvcdem.sql  SVM classification  PREDICTION, PREDICTION_PROBABILITY, PREDICTION_SET
    dmsvodem.sql  One-Class SVM       PREDICTION, PREDICTION_PROBABILITY
    dmsvrdem.sql  SVM regression      PREDICTION
    dmtxtsvm.sql  Text mining         PREDICTION, PREDICTION_PROBABILITY

    Running the PL/SQL Programs

    In SQL*Plus, use commands like the following to execute the sample programs and list the models created by them.

    >sqlplus dmuser3/dmuser3_password
    SQL> SET serveroutput ON
    SQL> SET echo ON
    SQL> @ %ORACLE_HOME%\rdbms\demo\program_name
    .
    .
    .
    SQL> SET linesize 200
    SQL> SET pagesize 100
    SQL> SELECT name, function_name, algorithm_name, target_attribute FROM dm_user_models;

    NAME                 FUNCTION_NAME         ALGORITHM_NAME              TARGET_ATTRIBUTE
    -------------------  --------------------  --------------------------  ----------------
    T_NMF_SAMPLE         FEATURE_EXTRACTION    NONNEGATIVE_MATRIX_FACTOR
    T_SVM_CLAS_SAMPLE    CLASSIFICATION        SUPPORT_VECTOR_MACHINES     AFFINITY_CARD
    AR_SH_SAMPLE         ASSOCIATION_RULES     APRIORI_ASSOCIATION_RULES
    AI_SH_SAMPLE         ATTRIBUTE_IMPORTANCE  MINIMUM_DESCRIPTION_LENGTH  AFFINITY_CARD
    ABN_SH_CLAS_SAMPLE   CLASSIFICATION        ADAPTIVE_BAYES_NETWORK      AFFINITY_CARD
    DT_SH_CLAS_SAMPLE    CLASSIFICATION        DECISION_TREE               AFFINITY_CARD
    NB_SH_CLAS_SAMPLE    CLASSIFICATION        NAIVE_BAYES                 AFFINITY_CARD
    SVMC_SH_CLAS_SAMPLE  CLASSIFICATION        SUPPORT_VECTOR_MACHINES     AFFINITY_CARD
    OC_SH_CLUS_SAMPLE    CLUSTERING            O_CLUSTER
    KM_SH_CLUS_SAMPLE    CLUSTERING            KMEANS
    NMF_SH_SAMPLE        FEATURE_EXTRACTION    NONNEGATIVE_MATRIX_FACTOR
    SVMR_SH_REGR_SAMPLE  REGRESSION            SUPPORT_VECTOR_MACHINES     AGE

    Sample Java Programs

    The Java demos illustrate the features of the Oracle Data Mining Java API, which implements Oracle-specific extensions to the Java Data Mining (JDM) 1.0 standard.

    The Java programs demonstrate data preprocessing and the basic mining functions. Additional Java samples demonstrate predictive analytics, import/export, and text mining.

    You can find the Java demos by searching for dm*.java in \RDBMS\demo of Oracle home. Table 4-4 describes the Java programs that illustrate the basic mining functions.

    Table 4-4 Mining Functions in Sample Java Programs

    Mining Function       Algorithm                                Program File
    --------------------  ---------------------------------------  -----------------
    Association           Apriori                                  dmardemo.java
    Attribute importance  Minimum Description Length               dmaidemo.java
    Classification        Adaptive Bayes Network                   dmabdemo.java
    Classification        Decision Tree                            dmtreedemo.java
    Classification        Naive Bayes                              dmnbdemo.java
    Classification        Support Vector Machine                   dmsvcdemo.java
    Classification        Support Vector Machine (one class)       dmsvodemo.java
    Classification        Binary Logistic Regression (GLM)         dmglcdemo.java
    Clustering            k-Means                                  dmkmdemo.java
    Clustering            O-Cluster                                dmocdemo.java
    Feature extraction    Non-Negative Matrix Factorization        dmnmdemo.java
    Regression            Support Vector Machine                   dmsvrdemo.java
    Regression            Multivariate Linear Regression           dmglrdemo.java
    Text mining           Non-Negative Matrix Factorization        dmtxtnmfdemo.java
    Text mining           Support Vector Machine (classification)  dmtxtsvmdemo.java


    Table 4-5 lists the Java programs that illustrate special mining tasks. These features are all supported in the PL/SQL API as well, since the Java API is layered on the PL/SQL API.

    Table 4-5 Mining Tasks in Java Demos

    Mining Task                   Description                           Program File
    ----------------------------  ------------------------------------  ----------------
    Data Transformations          Binning, clipping, and normalization  dmxfdemo.java
    Predictive Analytics          Automated predict and explain         dmpademo.java
    Model Export/Import           To/from Data Pump dump file           dmexpimpdemo.java
    Classification Model Scoring  Ways of applying an NB model          dmapplydemo.java

    Note: The Java text mining demos are described in "Text Mining in Java" on page 4-11.

    See Also: Oracle Data Mining Java API Reference (javadoc) and the Oracle Data Mining Application Developer's Guide for information on the Java API.

    Java Program Summaries

    Summary descriptions of the Java sample programs are provided in Table 4-6. For detailed descriptions, see the comments in the source code.

    Table 4-6 Overview of the Java Sample Programs

    Mining Function or Task    Description

    Classification
    The classification programs demonstrate various preprocessing techniques and perform the following steps:
    - Build a classification model using training data
    - Display model details and settings
    - Test the model by applying the model on the test data
    - Present test metrics, such as confusion matrix, lift, and ROC
    - Apply the model on the scoring data
    - Present apply results
    - Present ranked apply results, influenced by a cost matrix
    The dmapplydemo.java program demonstrates several ways of applying a Naive Bayes model.

    Regression
    dmsvrdemo.java uses different test metrics, but otherwise performs most of the same steps used in the classification programs. Selected attributes of the input data are preprocessed (normalized).

    Association
    dmardemo.java builds an association model and presents frequent itemsets and association rules as output. Selected attributes of the input data are preprocessed (binned).

    Clustering
    dmkmdemo.java (k-Means) and dmocdemo.java (O-Cluster) build clustering models and present cluster details, such as rules, centroid, and histogram for each cluster as output. The models are scored, and the probabilities associated with each cluster are returned as output. Selected attributes of the input data are preprocessed (normalized).

    Feature extraction
    dmnmdemo.java builds a feature extraction model and presents model details as the output. The model is scored, and each feature ID is associated with a probability. Selected attributes of the input data are preprocessed (normalized).

    Attribute importance
    dmaidemo.java builds an attribute importance model and presents a list of important attributes as the output of model details. Selected attributes of the input data are preprocessed (binned).

    Data transformations
    dmxfdemo.java demonstrates binning, clipping, and normalization transformations.

    Predictive Analytics
    dmpademo.java demonstrates PREDICT and EXPLAIN functions.

    Model import/export
    dmexpimpdemo.java builds a Naive Bayes model, exports it to a dump file, then imports it from the dump file.

    Preparing to Run the Java Programs

    Before running the Java programs, take the following steps. If you are using an integrated development environment (IDE), then make the equivalent changes for your software.

    In Windows, be sure to enter the physical directory name for %ORACLE_HOME%, unless this variable is defined on your system.

    1. If Oracle 11g is installed locally, then add %ORACLE_HOME%\jdk\bin to the PATH variable before the paths of any other Java versions.

    Otherwise, check that the version of Java you are using is 1.5.0. You can execute the following in a command window to check the version:

    >java -version

    If you are not using version 1.5.0, then you can download it from http://java.sun.com

    2. Add the following data mining JAR files to your Windows CLASSPATH:

    %ORACLE_HOME%\rdbms\jlib\jdm.jar
    %ORACLE_HOME%\rdbms\jlib\ojdm_api.jar
    %ORACLE_HOME%\rdbms\jlib\xdb.jar
    %ORACLE_HOME%\jdbc\lib\ojdbc14.jar
    %ORACLE_HOME%\oc4j\j2ee\home\lib\connector.jar
    %ORACLE_HOME%\jlib\orai18n.jar
    %ORACLE_HOME%\jlib\orai18n-mapping.jar
    %ORACLE_HOME%\lib\xmlparserv2.jar

    3. Compile the programs listed in Table 4-6. To use the JAVAC executable, open a command window and go to \RDBMS\demo in Oracle home.

    >javac program_name.java

    For example:

    >javac dmnbdemo.java

    If JAVAC is not found, then check the value of the PATH variable. Issue a SET command in the command window.

    If the Java program is not found, then make sure that a file with that name is in the current directory.
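    The CLASSPATH value built from the JAR files in step 2 can also be assembled programmatically. The following Java sketch is illustrative only and is not one of the Oracle sample programs: the class name DemoClasspath and the default Oracle home path are hypothetical, and the leading "." entry is an addition to the list in step 2, included on the assumption that the compiled demo classes sit in the current directory and must stay visible once CLASSPATH is set explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Oracle Data Mining): builds a Windows
// CLASSPATH value from the JAR locations listed in step 2.
class DemoClasspath {

    // JAR paths relative to Oracle home, copied from step 2.
    static final String[] RELATIVE_JARS = {
        "rdbms\\jlib\\jdm.jar",
        "rdbms\\jlib\\ojdm_api.jar",
        "rdbms\\jlib\\xdb.jar",
        "jdbc\\lib\\ojdbc14.jar",
        "oc4j\\j2ee\\home\\lib\\connector.jar",
        "jlib\\orai18n.jar",
        "jlib\\orai18n-mapping.jar",
        "lib\\xmlparserv2.jar"
    };

    // Joins the absolute JAR paths with ';', the Windows path separator.
    // The leading "." (an assumption, see the text) keeps classes in the
    // current directory on the class path.
    static String build(String oracleHome) {
        String home = oracleHome.endsWith("\\")
                ? oracleHome.substring(0, oracleHome.length() - 1)
                : oracleHome;
        List<String> entries = new ArrayList<String>();
        entries.add(".");
        for (String jar : RELATIVE_JARS) {
            entries.add(home + "\\" + jar);
        }
        return String.join(";", entries);
    }

    public static void main(String[] args) {
        // The default path below is only a placeholder for your Oracle home.
        String home = args.length > 0 ? args[0] : "C:\\oracle\\product\\11.1.0\\db_1";
        System.out.println("Java version in use: " + System.getProperty("java.version"));
        System.out.println("set CLASSPATH=" + build(home));
    }
}
```

    Run the class with the physical Oracle home directory as its only argument, then paste the printed SET command into the command window before compiling the demos.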

    Running the Java Programs

    Use the following syntax to execute the Java sample programs. Use the JAVA executable in the \jdk\bin\ directory in Oracle home.

    >java class_name connect_string user_name user_password

    The connection string specifies your Oracle database connection. It identifies the machine hosting the database, the port through which the connection is made, and the name of the database instance (the Oracle system identifier).

    host_name:port:SID

    For example, the following command executes the Naive Bayes demo dmnbdemo.class as DMUSER3 in the database instance ORCL on host MACH05 at port 1521.

    >java dmnbdemo mach05:1521:orcl dmuser3 dmuser3_password
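    The host_name:port:SID form can be checked before a demo is launched. The following Java sketch is a hypothetical helper, not code from the sample programs: the class name ConnectString and its validation rules are assumptions. It splits the argument into its three fields and derives the SID-style URL used by the Oracle JDBC thin driver; whether the demos build their connection in exactly this way is not stated in this guide.

```java
// Hypothetical helper (not part of the Oracle sample programs): splits the
// host_name:port:SID argument and derives a JDBC thin-driver URL from it.
class ConnectString {
    final String host;
    final int port;
    final String sid;

    ConnectString(String host, int port, String sid) {
        this.host = host;
        this.port = port;
        this.sid = sid;
    }

    // Parses "host_name:port:SID". Rejects input that does not have exactly
    // three fields, has an empty host or SID, or has a port outside 1..65535.
    static ConnectString parse(String text) {
        String[] parts = text.trim().split(":", -1);
        if (parts.length != 3 || parts[0].isEmpty() || parts[2].isEmpty()) {
            throw new IllegalArgumentException("expected host_name:port:SID, got: " + text);
        }
        int port;
        try {
            port = Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("port is not a number: " + parts[1]);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return new ConnectString(parts[0], port, parts[2]);
    }

    // SID-style URL in the form accepted by the Oracle JDBC thin driver.
    String toThinUrl() {
        return "jdbc:oracle:thin:@" + host + ":" + port + ":" + sid;
    }

    public static void main(String[] args) {
        ConnectString cs = parse(args.length > 0 ? args[0] : "mach05:1521:orcl");
        System.out.println(cs.toThinUrl());
    }
}
```

    Validating the string first gives a clearer message for a mistyped port or a missing SID than a failed connection attempt would.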

    You can list the models created by the Java programs with commands like the following in SQL*Plus.

    >sqlplus dmuser3/dmuser3_password
    SQL> SET linesize 200
    SQL> SET pagesize 100
    SQL> SELECT name, function_name, algorithm_name, target_attribute
         from dm_user_models WHERE name LIKE '%JDM';

    NAME             FUNCTION_NAME         ALGORITHM_NAME              TARGET_ATTRIBUTE
    ---------------  --------------------  --------------------------  ----------------
    TXTNMFMODEL_JDM  FEATURE_EXTRACTION    NONNEGATIVE_MATRIX_FACTOR
    ARMODEL_JDM      ASSOCIATION_RULES     APRIORI_ASSOCIATION_RULES
    TREEMODEL_JDM    CLASSIFICATION        DECISION_TREE               AFFINITY_CARD
    AIMODEL_JDM      ATTRIBUTE_IMPORTANCE  MINIMUM_DESCRIPTION_LENGTH  AFFINITY_CARD
    ABNMODEL_JDM     CLASSIFICATION        ADAPTIVE_BAYES_NETWORK      AFFINITY_CARD
    NBMODEL_JDM      CLASSIFICATION        NAIVE_BAYES                 AFFINITY_CARD
    SVMCMODEL_JDM    CLASSIFICATION        SUPPORT_VECTOR_MACHINES     AFFINITY_CARD
    SVMOMODEL_JDM    CLASSIFICATION        SUPPORT_VECTOR_MACHINES
    KMMODEL_JDM      CLUSTERING            KMEANS
    OCMODEL_JDM      CLUSTERING            O_CLUSTER
    NMFMODEL_JDM     FEATURE_EXTRACTION    NONNEGATIVE_MATRIX_FACTOR
    SVMRMODEL_JDM    REGRESSION            SUPPORT_VECTOR_MACHINES     AGE
    TXTSVMMODEL_JDM  CLASSIFICATION        SUPPORT_VECTOR_MACHINES     AFFINITY_CARD

    Text Mining Sample Programs

    Oracle Data Mining can mine text columns that have undergone pre-processing by Oracle Text routines.

    Oracle Text is a technology for building text query and document classification applications. It provides indexing, word and theme searching, and viewing capabilities for text. Oracle Text is included in a general installation of Oracle Database Enterprise Edition, and therefore is already present in a database installed according to the instructions in Chapter 1.


    The pre-processing steps for text mining create nested table columns of type DM_NESTED_NUMERICALS from columns of type VARCHAR2 or CLOB. Each row of the nested table specifies an attribute name and a value. The type definition is as follows.

    CREATE OR REPLACE TYPE dm_nested_numerical AS OBJECT
      (attribute_name VARCHAR2(30), value NUMBER)
    /
    CREATE OR REPLACE TYPE dm_nested_numericals AS TABLE OF dm_nested_numerical

    Terms extracted from text documents into nested tables can become generic attributes in training or scoring data. Classification, clustering, and feature extraction models can be built using these attributes.

    Sample text mining programs in both PL/SQL and Java illustrate classification and feature extraction of a pre-processed text column.
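    The shape of the nested rows can be pictured with a short Java sketch. It is illustrative only and does not reproduce Oracle Text: the class name NestedNumerical is hypothetical, and splitting on non-letter characters, uppercasing terms, and using the occurrence count as the value are all simplifying assumptions. Only the row layout, an attribute name of at most 30 characters paired with a number, comes from the type definition above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: mirrors the shape of a DM_NESTED_NUMERICAL row
// (attribute_name VARCHAR2(30), value NUMBER). It is not Oracle Text.
class NestedNumerical {
    final String attributeName;
    final double value;

    NestedNumerical(String attributeName, double value) {
        this.attributeName = attributeName;
        this.value = value;
    }

    // Turns one document into (term, count) rows, sorted by term.
    // The tokenizing and counting rules are assumptions made for this sketch.
    static List<NestedNumerical> fromText(String document) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String token : document.toUpperCase(Locale.ROOT).split("[^A-Z]+")) {
            if (token.isEmpty()) {
                continue;
            }
            // attribute_name is VARCHAR2(30), so longer terms are truncated here.
            String term = token.length() > 30 ? token.substring(0, 30) : token;
            Integer seen = counts.get(term);
            counts.put(term, seen == null ? 1 : seen + 1);
        }
        List<NestedNumerical> rows = new ArrayList<NestedNumerical>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            rows.add(new NestedNumerical(e.getKey(), e.getValue()));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Prints one "attribute_name = value" line per distinct term.
        for (NestedNumerical row : fromText("Great card, great service")) {
            System.out.println(row.attributeName + " = " + row.value);
        }
    }
}
```

    Each list returned by fromText corresponds to the nested table stored for one row of the source table, which is why terms can act as generic attributes during model build and scoring.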

    Text Mining in PL/SQL

    Three PL/SQL sample programs illustrate the process of text mining. One program illustrates the pre-processing that is required to prepare the data for mining. The other two programs build models that use the transformed text.

    Text Transformation Demo

    To prepare a column for text mining using the PL/SQL API, you must use Oracle Text routines to perform the following general steps:

    1. Create a domain index on the column.

    2. Use the index to extract terms from the column to a temporary table.

    3. Populate a column of type DM_NESTED_NUMERICALS with the terms in the temporary table.

    The process of term extraction using Oracle Text is illustrated in the sample program dmtxtfe.sql. The source code contains extensive comments that explain the steps involved in transforming text into a set of features that can be mined using Oracle Data Mining.

    More details about text transformation are provided in the Oracle Data Mining Application Developer's Guide.

    Text Transformation for the PL/SQL Text Mining Sample Programs

    The dmsh.sql script performs the text transformation required by the PL/SQL text mining demos. There are two such sample programs: dmtxtnmf.sql, which builds a feature extraction model using Non-Negative Matrix Factorization, and dmtxtsvm.sql, which builds a classification model using Support Vector Machine. Both of these programs use the following tables, which have a nested table column of comment data:

    MINING_BUILD_NESTED_TEXT
    MINING_TEST_NESTED_TEXT
    MINING_APPLY_NESTED_TEXT

    The Sample Text Mining Models (PL/SQL)

    You can run the PL/SQL text mining demo programs, dmtxtnmf.sql and dmtxtsvm.sql, like the other PL/SQL programs. The models created by these programs are listed in the following example.


    SQL> @ %ORACLE_HOME%\rdbms\demo\dmtxtnmf.sql
    SQL> @ %ORACLE_HOME%\rdbms\demo\dmtxtsvm.sql
    SQL> select NAME, FUNCTION_NAME, ALGORITHM_NAME, TARGET_ATTRIBUTE
         from dm_user_models;

    NAME               FUNCTION_NAME       ALGORITHM_NAME             TARGET_ATTRIBUTE
    -----------------  ------------------  -------------------------  ----------------
    T_NMF_SAMPLE       FEATURE_EXTRACTION  NONNEGATIVE_MATRIX_FACTOR
    T_SVM_CLAS_SAMPLE  CLASSIFICATION      SUPPORT_VECTOR_MACHINES    AFFINITY_CARD

    Text Mining in Java

    Two Java sample programs illustrate the process of text mining. One builds a feature extraction model, the other builds a classification model.

    Text Transformation for the Sample Java Text Mining Programs

    The Oracle Data Mining Java API provides an interface that handles the term extraction process. If you are developing data mining applications in Java, you do not need to use Oracle Text directly. However, you must ensure that Oracle Text is present in the database.

    The OraTextTransform interface is used to perform text transformation within the Java text mining demos. There are two such sample programs: dmtxtnmfdemo.java, which builds a feature extraction model using Non-Negative Matrix Factorization, and dmtxtsvmdemo.java, which builds a classification model using Support Vector Machine. Both of these programs create build, test, and apply data sets from the following tables, which have a text column of comment data:

    MINING_BUILD_TEXT
    MINING_TEST_TEXT
    MINING_APPLY_TEXT

    The Sample Text Mining Models (Java)

    You can run the Java text mining sample programs, dmtxtnmfdemo.java and dmtxtsvmdemo.java, like the other Java programs. The models created by these programs are shown in the following example.

    >java dmtxtnmfdemo host:port:SID dmuser3 dmuser3_password
    >java dmtxtsvmdemo host:port:SID dmuser3 dmuser3_password
    >sqlplus dmuser3/dmuser3_password
    SQL> select NAME, FUNCTION_NAME, ALGORITHM_NAME, TARGET_ATTRIBUTE
         from dm_user_models;

    NAME             FUNCTION_NAME       ALGORITHM_NAME             TARGET_ATTRIBUTE
    ---------------  ------------------  -------------------------  ----------------
    txtnmfModel_jdm  FEATURE_EXTRACTION  NONNEGATIVE_MATRIX_FACTOR
    txtsvmModel_jdm  CLASSIFICATION      SUPPORT_VECTOR_MACHINES    AFFINITY_CARD

    BLAST Sample Program

    The Oracle implementation of the Basic Local Alignment Search Tool (BLAST) is demonstrated in the sample program, dmbldemo.sql. This program provides examples of sequence matching queries using the BLAST table functions.


    The BLAST sample program and configuration scripts are listed in Table 4-7.

    Preparing to Run the BLAST Demo

    The BLAST demo table functions in dmbldemo.sql use two data sets: SwissProt and ecoli10. To prepare these data sets, log in to SQL*Plus as the data mining user and run the dmblprot.sql and dmblcoli.sql scripts as shown in the following example.

    SQL> connect dmuser3/dmuser3_password
    SQL> @ %ORACLE_HOME%\rdbms\demo\dmblprot.sql
    SQL> @ %ORACLE_HOME%\rdbms\demo\dmblcoli.sql

    Exit SQL*Plus and use the SQL*Loader utility to load data into the SwissProt database in the schema of the data mining user. From the command prompt, change to the \rdbms\demo directory in Oracle home and execute the following command.

    >sqlldr dmuser3/dmuser3_password control=dmblprot.ctl data=dmblprot.txt log=dmblprot.log

    Running the BLAST Table Functions

    The sample program dmbldemo.sql contains multiple invocations of BLAST table functions. You can run them all at once by running the dmbldemo.sql script, or you can copy individual table functions to the SQL*Plus command line and execute them individually.

    To run the sample program, log in to SQL*Plus as the data mining user and run the dmbldemo.sql script as shown in the following example.

    SQL> connect dmuser3/dmuser3_password
    SQL> @ %ORACLE_HOME%\rdbms\demo\dmbldemo.sql

    Note: The BLAST algorithm detects local alignments in nucleotide and protein databases. BLAST performs a kind of data mining, since it finds regions of similarity embedded in otherwise unrelated sequences. However, BLAST does not use Oracle Data Mining technolo