Data mining knowing the unknown
-
Upload
dr-othman-alsalloum -
Category
Technology
-
view
1.003 -
download
2
description
Transcript of Data mining knowing the unknown
![Page 1: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/1.jpg)
DATA MINING:
KNOWING THE UNKNOWN
![Page 2: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/2.jpg)
Chapter 12: Data mining: Knowing the Unknown
Contents
What Is Data Mining? Data Mining and Business Intelligence Business Drivers Technical Drivers DM Virtuous Cycle Data Management DM in Practice Role of DM in Customer Relationship
Management
![Page 3: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/3.jpg)
Chapter 12: Data mining: Knowing the Unknown
What Is Data Mining?
DEFINITIONS A body of scientific knowledge accumulated
through decades of forming well-established disciplines, such as statistics, machine learning, and artificial intelligence
A technology evolving from high volume transaction systems, data warehouses, and the Internet
A business community forced by an intensive competitive environment to innovate and integrate new ideas, concepts, and tools to improve operations and DM quality
![Page 4: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/4.jpg)
Chapter 12: Data mining: Knowing the Unknown
Alternative definitions The search for relationships and global patterns
that exist in large databases but are hidden among the vast amount of data (Holsheimer and Kersyen 1994)
A set of techniques used in an automated approach to exhaustively explore and bring to the surface complex relationships in very large data sets (Moxon 1996)
The process of finding previously unknown and potentially interesting patterns and relations in large databases (Fayyad et al. 1996)
![Page 5: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/5.jpg)
Chapter 12: Data mining: Knowing the Unknown
Business intelligence (BI) A global term for all processes, techniques,
and tools that support business decision making based on information technology
Approaches can range from a simple spreadsheet to an advanced decision support system
![Page 6: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/6.jpg)
Chapter 12: Data mining: Knowing the Unknown
Data mining as a component of BI
FIGURE 12.2:
![Page 7: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/7.jpg)
Chapter 12: Data mining: Knowing the Unknown
Business Drivers Competition. Competing in today’s economy requires
an understanding of customer needs and behavior and flexibility to respond to market demands and competitors' challenges
Information glut. Today’s manager is confronted with increasingly large volumes of data, collected and stored in various databases and constitutes a challenge for decision making
Serving knowledge workers efficiently. Databases provide the firm with transactional memory. Memory is of little use without intelligence--capacity to acquire and apply knowledge
![Page 8: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/8.jpg)
Chapter 12: Data mining: Knowing the Unknown
Knowledge worker types and needs
Type Knowledge analysts
Knowledge users
Knowledge consumers
Need Sophisticated tools for
their investigations and research analysis
Review and analysis of data
Easy access to information
![Page 9: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/9.jpg)
Chapter 12: Data mining: Knowing the Unknown
Technical Drivers Objective of DM is to optimize use of
available data and reduce risk of making wrong decisions
Statistics and machine learning considered
the analytical foundations upon which DM was developed
![Page 10: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/10.jpg)
Chapter 12: Data mining: Knowing the Unknown
ROLE OF STATISTICS
With databases organizing data records of hundreds of attributes, the statistics methodology of the hypothesize-and-test paradigm becomes a time-consuming process. DM helps automate formulation and discovery of new hypotheses
MACHINE LEARNING
A scientific discipline considered a sub-field of artificial intelligence. In contrast, data mining is a business process concerned with finding understandable knowledge from very large real-world databases
![Page 11: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/11.jpg)
Chapter 12: Data mining: Knowing the Unknown
DATA WAREHOUSES
Extracting and transforming operational data into informational or analytical data and loading it into a central data warehouse
Major features of DW: Subject oriented Integration Time variant Nonvolatile
![Page 12: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/12.jpg)
Chapter 12: Data mining: Knowing the Unknown
FIGURE 12.3: Data Warehouse and DM Technological Framework
![Page 13: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/13.jpg)
Chapter 12: Data mining: Knowing the Unknown
OLAP (Online Analytical Processing)
Uses computing power and graphical interfaces to manipulate data easily and quickly at user’s convenience
Focus is showing data along several dimensions. Manager should be able to drill down into the ultimate detail of a transaction and zoom up for a general view
![Page 14: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/14.jpg)
Chapter 12: Data mining: Knowing the Unknown
OLAP strengths: A powerful visualization tool An easy-to-use interactive tool Can be used as a first step in understanding the
data
OLAP limitations: Does not find patterns automatically Does not have powerful analytical techniques
![Page 15: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/15.jpg)
Chapter 12: Data mining: Knowing the Unknown
EVOLUTION OF THE DECISION MAKING ARCHITECTURE
Integration of decision processing into the overall business process achieved by building a closed-loop system, where decision processing applications output delivered to users in the form of recommended actions
In the e-business environment, many companies look to extend closed-loop processing to automatically adjust business operations based on messages generated by a decision engine
![Page 16: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/16.jpg)
Chapter 12: Data mining: Knowing the Unknown
FIGURE 12.4 Conventional Approach to Corporate Decision Making
![Page 17: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/17.jpg)
Chapter 12: Data mining: Knowing the Unknown
FIGURE 12.5: DM in the Context of the New Corporate Business Model
![Page 18: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/18.jpg)
Chapter 12: Data mining: Knowing the Unknown
DM Virtuous CycleHarnessing power of data and
transforming it into added value for the entire organization
Capabilities: Response to extracted patterns
Selection of the right actionLearning from past actionsTurning action into business value
![Page 19: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/19.jpg)
Chapter 12: Data mining: Knowing the Unknown
The Virtuous DM Cycle
![Page 20: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/20.jpg)
Chapter 12: Data mining: Knowing the Unknown
BUSINESS UNDERSTANDING
The first step in the virtuous DM cycle is identifying the business opportunity--defining the problems faced by the firm
The goal is to identify areas where data can provide values
Defining the problem should involve technical and business experts
![Page 21: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/21.jpg)
Chapter 12: Data mining: Knowing the Unknown
DEVELOP THE DM APPLICATION
Define the adequate data-mining tasksOrganize data for analysisUse the right DM technique to build the
data modelValidate the model
![Page 22: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/22.jpg)
Chapter 12: Data mining: Knowing the Unknown
DM tasksClustering:
Clustering finds groups that are very different from each other, but whose members are similar to each other
One does not know what the clusters will be at the start or by what attributes the data will be clustered
![Page 23: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/23.jpg)
Chapter 12: Data mining: Knowing the Unknown
ClassificationThe classification function identifies characteristics of the group to which each case belongs
Affinity groupingA descriptive approach to exploring data that can help identify relationships among values in a database. The two most common approaches are: Association discovery Sequence discovery
![Page 24: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/24.jpg)
Chapter 12: Data mining: Knowing the Unknown
Different Industries and the DM GoalsCustomer services
Business Challenge--understanding that individual
preferences of customers is the key to satisfying them
DM Goals
Customer acquisition profile Customer retention profiling Customer-centric selling Inquiry routing Online shopping Scenario notification Staffing level prediction Web mining for prospects Targeting market
![Page 25: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/25.jpg)
Chapter 12: Data mining: Knowing the Unknown
Financial services industry Business Challenge--Retaining customer loyalty
is of the utmost importance to this industry
DM Goals
Focused statistical and DM applications are
prevalent Risk management for all types of credit and
fraud detection
![Page 26: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/26.jpg)
Chapter 12: Data mining: Knowing the Unknown
Health-care businessBusiness Challenge Keeping pace with rate of technological and
medical advancement Cost is a constant issue in ever-changing market
DM Goals Early DM activities have focused on financially
oriented applications Predictive models have been applied to predict
length of stay, total charges, and even mortality
![Page 27: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/27.jpg)
Chapter 12: Data mining: Knowing the Unknown
Telecommunications industryBusiness Challenge Keeping pace with rate of technological change Deregulation is changing business landscape, resulting
in competition from a wide range of service providers Finding and retaining customers
DM Goals Customer profiling Subscription fraud and credit applications are utilized
throughout the industry Concerns about privacy and security are likely to result
in DM applications targeted to these areas
![Page 28: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/28.jpg)
Chapter 12: Data mining: Knowing the Unknown
Data Management
The most challenging part of a DM project is the data management stage
At least 40 percent of a DM project is spent on this stage
![Page 29: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/29.jpg)
Chapter 12: Data mining: Knowing the Unknown
DATA SOURCESExamples:
Flat files Relational databases Data warehouses Geographical databases Time series databases World Wide Web
![Page 30: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/30.jpg)
Chapter 12: Data mining: Knowing the Unknown
TAXONOMY OF DATA
Data can be found in several forms:
Business transactions Scientific data Medical data Personal data Text and documents Web repositories
![Page 31: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/31.jpg)
Chapter 12: Data mining: Knowing the Unknown
DATA PREPARATION
Evaluating data quality Handling missing data Processing outliers Normalizing data Quantifying data
![Page 32: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/32.jpg)
Chapter 12: Data mining: Knowing the Unknown
MODEL BUILDING
The first step is to select the modeling
technique.
The most popular techniques:Association rulesClassification treesNeural networks
![Page 33: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/33.jpg)
Chapter 12: Data mining: Knowing the Unknown
PARAMETER SETTINGS AND TUNING
The set of initial and intermediate parameters must be recorded for eventual use and comparison
The selection of parameters must also be explained
The process of reaching the final values must be documented
Testing and validating samples are used for this task
![Page 34: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/34.jpg)
Chapter 12: Data mining: Knowing the Unknown
MODEL TESTING AND ANALYSIS OF RESULTS Reviewing business objectives and success
criteria Assessing success of the DM project to ensure
all business objectives have been incorporated identifying factors that have been overlooked Understanding data-mining results Interpreting the results Comparing results with common sense and
knowledge base to detect any worthwhile discoveries
![Page 35: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/35.jpg)
Chapter 12: Data mining: Knowing the Unknown
TAKING ACTION AND DEPLOYMENT Summarizing deployable results Identifying users of the discovered
knowledge and finding out how to deliver and propagate it
Defining a performance measure to monitor obtained benefits by implementation of DM results
![Page 36: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/36.jpg)
Chapter 12: Data mining: Knowing the Unknown
DM Applications
Market management applications Sales applications Risk management applications Web applications Text mining
![Page 37: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/37.jpg)
Chapter 12: Data mining: Knowing the Unknown
DM as a Required Component for Various Applications in Different Departments
![Page 38: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/38.jpg)
Chapter 12: Data mining: Knowing the Unknown
Role of DM in Customer Relationship Management Customer Acquisition Campaign Optimization Customer Scoring Direct Marketing
![Page 39: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/39.jpg)
Chapter 12: Data mining: Knowing the Unknown
Integrated Knowledge Management System for Marketing
![Page 40: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/40.jpg)
Chapter 12: Data mining: Knowing the Unknown
INTEGRATING DM, CRM, AND E-BUSINESSDM applications are the first line in
understanding the customer and an integral key to segmenting the market
An intelligent e-business system enhances CRM by enabling a level of responsiveness and proactive customer care not achievable through other channels
Through personalization, corporations can build successful 1-to-1 relationships with customers
![Page 41: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/41.jpg)
Chapter 12: Data mining: Knowing the Unknown
Customer-Centric Data WarehousingEnables Next Generation E-Commerce
![Page 42: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/42.jpg)
Chapter 12: Data mining: Knowing the Unknown
Implications for Knowledge ManagementA DM project is definitely not a
straightforward projectWhile conducting such a project,
companies may face many problems, obstacles, and pitfalls that prevent them from gaining returns on investing in a DM project
![Page 43: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/43.jpg)
Chapter 12: Data mining: Knowing the Unknown
Data-Mining Challenges
![Page 44: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/44.jpg)
Chapter 12: Data mining: Knowing the Unknown
Companies should have a business justification for DM
The organization will only reap the benefits of DM if there is a real business case to answer
It should not be an experiment for experiment’s sake
Insufficient understanding of business needs
![Page 45: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/45.jpg)
Chapter 12: Data mining: Knowing the Unknown
Careless handling of dataOverquantifying dataMiscoding dataAnalyzing without taking precautions
against sampling errorsLoss of precision due to improper
rounding of data values Incorrectly handling missing values
![Page 46: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/46.jpg)
Chapter 12: Data mining: Knowing the Unknown
Invalidly validating the data-mining model
Most data miners will face either abundant amounts of testing data or extremely scare amounts
It is important to never ignore suspicious findings
Haste can lead to believing everything data owners tell us about the data and, even worse, believing everything our own analysis tells us
![Page 47: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/47.jpg)
Chapter 12: Data mining: Knowing the Unknown
Believing in alchemy Many data owners believe that data mining is a form
of an alchemic process that will magically transform their straw databases into golden knowledge
Failure may come from wrong suspicion, technology misapplication, or unsuitable information
Managers should consider: Select the right application In-house development or outsourcing Assessing vendor selection Team building
![Page 48: Data mining knowing the unknown](https://reader033.fdocuments.us/reader033/viewer/2022061114/545bf32aaf7959c8098b45b0/html5/thumbnails/48.jpg)
Chapter 12
DATA MINING:
KNOWING THE UNKNOWN