Data Mining Tools Knowledge Seeker 4.5
Data Mining Tools
Alexandra Bahl
Ray Bichard
Wes Griffin
Kitty Roberts
Data Mining Tools: KnowledgeSeeker 4.5
Presentation Outline
Overview of Data Mining
Introduction to KnowledgeSeeker
Major Competitors
Current Applications
Introduction to Key Terms
Interactive Demonstration
Summary
Questions & Answers
Wes Griffin
Ray Bichard
Kitty Roberts
Alex Bahl
What is Data Mining?
“Data mining is a continuous, iterative process. It involves the use of software, sound methodology, and human creativity to achieve new insight through the exploration of data to uncover patterns, relationships, anomalies and dependencies.”
Data Mining History
Data Collection and Database Creation: 1960s
Database Management Systems: 1970s - early 1980s
Advanced DBMS: mid 1980s - present
Web-based DBMS: 1990 - present
DW and Data Mining: late 1980s - present
New Generation Integrated Information Systems
Data Mining Architecture
Graphical User Interface
Pattern Evaluation
Data Mining Engine
Database or DW Server
Database
Data Warehouse
Knowledge Base
Data Cleaning
Data Integration
Filtering
What is KnowledgeSeeker?
A data analysis, data mining package
Enables users to quickly analyze and understand the relationships between variables in a data set.
First generation data mining tool
Most widely used “decision tree” data mining analytical tool
Price per copy: $4750.00 USD
What is KnowledgeSeeker?
Produced by ANGOSS Software Corporation, which focuses “solely” on data mining software.
Offers training and consulting services
Produces data mining add-ins that accept data from all major databases
Works with popular query and reporting, spreadsheet, statistical and OLAP & ROLAP tools.
The KnowledgeSeeker Process
Define business goal
Prepare the data
Analyze the data
The KnowledgeSeeker Process
Define business goal
What question needs to be answered?
What type of analysis will be performed?
What functionalities does the business require?
The KnowledgeSeeker Process
Prepare the data
Consider the various factors that could influence the outcome.
Examine the database to identify the data fields that provide measurements of potential dependencies.
Create a subset of the database containing only those data fields.
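The data preparation step can be sketched in a few lines of Python. This is only an illustration, not how KnowledgeSeeker itself does it: the records, the field names (age, gender, smoking, salt, phone, blood_pressure) and the helper make_subset are all hypothetical.

```python
# Hypothetical source records; every field name and value is illustrative.
records = [
    {"id": 1, "age": 66, "gender": "M", "smoking": "regular",
     "salt": "low", "phone": "555-0101", "blood_pressure": "high"},
    {"id": 2, "age": 55, "gender": "F", "smoking": "never",
     "salt": "high", "phone": "555-0102", "blood_pressure": "normal"},
    {"id": 3, "age": 41, "gender": "M", "smoking": "quit",
     "salt": "moderate", "phone": "555-0103", "blood_pressure": "normal"},
]

# Fields judged likely to influence the outcome (an assumption for this sketch).
CANDIDATE_FIELDS = ["age", "gender", "smoking", "salt"]
# The outcome being studied.
DEPENDENT_FIELD = "blood_pressure"


def make_subset(rows, fields):
    """Return new rows that keep only the named fields."""
    return [{name: row[name] for name in fields} for row in rows]


subset = make_subset(records, CANDIDATE_FIELDS + [DEPENDENT_FIELD])
print(subset[0])
```

Fields such as id and phone are dropped because they cannot plausibly explain the outcome, which keeps the later analysis focused on potential dependencies.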
The KnowledgeSeeker Process
Analyze the data
Automatically scans all the fields in the data set, summarizes the statistically significant patterns and relationships among the fields, and displays the result as a graphical decision tree, or as a knowledge base of rules.
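One way to picture "scanning all the fields for statistically significant relationships" is to score each field against the dependent variable and rank the results. The slides do not say which statistical test KnowledgeSeeker applies, so the Pearson chi-square statistic used here is an assumption, and the data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical data: smoking tracks blood pressure exactly, gender does not.
rows = [
    {"smoking": "regular", "gender": "M", "bp": "high"},
    {"smoking": "regular", "gender": "F", "bp": "high"},
    {"smoking": "regular", "gender": "M", "bp": "high"},
    {"smoking": "regular", "gender": "F", "bp": "high"},
    {"smoking": "never", "gender": "M", "bp": "normal"},
    {"smoking": "never", "gender": "F", "bp": "normal"},
    {"smoking": "never", "gender": "M", "bp": "normal"},
    {"smoking": "never", "gender": "F", "bp": "normal"},
]


def chi_square(data, field, target):
    """Pearson chi-square statistic for the field-by-target contingency table."""
    n = len(data)
    field_counts = Counter(r[field] for r in data)
    target_counts = Counter(r[target] for r in data)
    cell_counts = Counter((r[field], r[target]) for r in data)
    stat = 0.0
    for f_value, f_n in field_counts.items():
        for t_value, t_n in target_counts.items():
            expected = f_n * t_n / n  # count expected if the two were independent
            observed = cell_counts[(f_value, t_value)]
            stat += (observed - expected) ** 2 / expected
    return stat


def rank_fields(data, fields, target):
    """Score every candidate field and list the strongest relationship first."""
    return sorted(((chi_square(data, f, target), f) for f in fields), reverse=True)


for score, name in rank_fields(rows, ["gender", "smoking"], "bp"):
    print(f"{name}: chi-square = {score:.2f}")
```

A larger statistic means the field's values are distributed less evenly across the outcome classes. A real tool would also convert the statistic to a significance level before deciding whether a field is worth showing as a split.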
KnowledgeSeeker Pulsepoints
ADVANTAGES
Easy to use
Powerful
Scalability
Flexibility
DISADVANTAGES
Less than impressive GUI
Major Competitors
Company Software
Clementine 6.0
Enterprise Miner 3.0
Intelligent Miner
Major Competitors
Company Software
Mineset 3.1
Darwin
Scenario
Current Applications
Manufacturing
Used by the R.R. Donnelley & Sons commercial printing company to improve process control, cut costs and increase productivity.
Used extensively by Hewlett-Packard in its United States manufacturing plants as a process control tool, both to analyze factors affecting product quality and to generate rules for production control systems.
Current Applications
Auditing
Used by the IRS to combat fraud, reduce risk, and increase collection rates.
Finance
Used by the Canadian Imperial Bank of Commerce (CIBC) to create models for fraud detection and risk management.
Current Applications
CRM
Telephony
Used by US West to reduce churn and increase customer loyalty for a new voice messaging technology.
Current Applications
Marketing
Used by the Washington Post to improve their direct mail targeting and to conduct survey analysis.
Health Care
Used by the Oxford Transplant Center to discover factors affecting transplant survival rates.
Used by the University of Rochester Cancer Center to study the effect of anxiety on chemotherapy-related nausea.
More Customers
Introduction to Key Terms
Dependent / Independent variables
Root node / nodes
Decision tree
Splits
Clustering
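Several of these terms can be seen together in a small sketch. The root node holds every record, a split partitions a node's records by the values of one independent variable, and each node reports how the dependent variable is distributed among its records. The Node class, the follow helper and the data below are hypothetical and are not taken from KnowledgeSeeker.

```python
from collections import Counter, defaultdict


class Node:
    """One node of a decision tree: a group of records plus optional children."""

    def __init__(self, rows, target):
        self.rows = rows
        self.target = target      # name of the dependent variable
        self.split_field = None   # independent variable this node is split on
        self.children = {}        # field value -> child Node

    def distribution(self):
        """Percentage of this node's records in each class of the dependent variable."""
        counts = Counter(r[self.target] for r in self.rows)
        total = len(self.rows)
        return {value: 100.0 * c / total for value, c in counts.items()}

    def split(self, field):
        """Partition this node's records by the values of one independent variable."""
        groups = defaultdict(list)
        for r in self.rows:
            groups[r[field]].append(r)
        self.split_field = field
        self.children = {v: Node(g, self.target) for v, g in groups.items()}
        return self.children


def follow(node, profile):
    """Walk from a node down to the deepest node that matches the profile."""
    while node.split_field is not None and profile.get(node.split_field) in node.children:
        node = node.children[profile[node.split_field]]
    return node


# Hypothetical records: (smoking, gender, blood pressure).
raw = [
    ("regular", "M", "high"), ("regular", "M", "high"),
    ("regular", "M", "high"), ("regular", "M", "normal"),
    ("regular", "F", "high"), ("regular", "F", "normal"),
    ("never", "M", "normal"), ("never", "F", "normal"),
]
rows = [{"smoking": s, "gender": g, "bp": b} for s, g, b in raw]

root = Node(rows, "bp")                    # root node: all records
root.split("smoking")                      # first split
root.children["regular"].split("gender")   # second split, under regular smokers

leaf = follow(root, {"smoking": "regular", "gender": "M"})
print(root.distribution())
print(leaf.distribution())
```

Reading the percentages at the node reached by a profile is the same kind of lookup the demonstration questions ask for, done here by hand rather than through a graphical tree.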
Questions
1. What percentage of people in the test group have high blood pressure with these characteristics: a 66-year-old male regular smoker with low to moderate salt consumption?
2. Do the risk levels change for a male with the same characteristics who quit smoking? What are the percentages?
3. If you are a 2% milk drinker, how many factors are still interesting?
4. Knowing that salt consumption and smoking habits are interesting factors, which one has a stronger correlation to blood pressure levels?
5. Grow an automatic tree. Is gender an interesting factor for a 55-year-old regular smoker who does not eat cheese?
Summary
Data mining has evolved into knowledge discovery
KnowledgeSeeker provides rapid data analysis
KnowledgeSeeker is flexible and inexpensive
KnowledgeSeeker is easy to use