Data Mining Overview - bus317.ballenger.wlu.edu


Page 1: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining Overview

What is Data Mining and its applications?

Page 2: Data Mining Overview - bus317.ballenger.wlu.edu

Discussion Topics

•What is Data Mining?

•Who uses Data Mining?

•Why Data Mining?

•Where Data Mining?

•When Data Mining?

•How Data Mining?

•Why study Data Mining?

Page 3: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining: Definition & Goal

• Definition

– Data Mining is the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules.

– Knowledge creation

–Business decisions should be based on learning

–Informed decisions are better than uninformed

• Goal

– To allow an “enterprise”* to IMPROVE its ______ through better understanding of its ______ .

– Potential for Competitive Advantage.

* Synonyms include: corporation, firm, non-profit organization, government agency

Page 4: Data Mining Overview - bus317.ballenger.wlu.edu

Foundations of Data Mining

• Data mining is the process of using “raw” data to infer important “business” relationships.

• Data from the past contains information that will be useful in the future (provided customer/business behavior is not completely random).

• Data Mining is a collection of powerful techniques intended for analyzing large amounts of data.

• There is no single data mining approach, but rather a set of techniques that can be used stand-alone or in combination with each other.

Page 5: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining – Why now?

1. Data are being produced

2. Data are being warehoused

3. Computing power is more affordable

4. Interest in CRM is strong with a focus on service and information as a product

5. Data Mining software is available

Page 6: Data Mining Overview - bus317.ballenger.wlu.edu

Customer Relationship Management (CRM)

In order to form a learning relationship with its customers, an enterprise (firm) must be able to:

1. Notice – what its customers are doing

2. Remember – what it and its customers have done over time

3. Learn – from what it has remembered

4. Act On – what it has learned to make customers more profitable

Page 7: Data Mining Overview - bus317.ballenger.wlu.edu

Analytical Customer Relationship Management

Transaction Processing Systems: notice customer behavior

Data Warehousing: remember behavior over time

Data Mining: learn from behavior

Customer Relationship Management (CRM): act on learning

Page 8: Data Mining Overview - bus317.ballenger.wlu.edu

Based on Transaction Data

Page 9: Data Mining Overview - bus317.ballenger.wlu.edu

Based on Transaction Data

Page 10: Data Mining Overview - bus317.ballenger.wlu.edu

Transaction Processing Systems

Operational systems, but sometimes used for Data Mining:

Phone companies’ call records to find residential numbers being used like businesses

Catalog companies’ order histories to identify customers for future mailings

FedEx’s change in shipping patterns during the UPS strike

Supermarket POS data to decide what coupons to print

Web retailers’ records of past purchases to determine what to display on return visits

Page 11: Data Mining Overview - bus317.ballenger.wlu.edu

Data Warehousing

Gather operational data together and organize it in a consistent and useful way over time.

Page 12: Data Mining Overview - bus317.ballenger.wlu.edu

Customer Relationship Management

Understand each customer individually

Use that understanding to make it easier for the customer to do business with you rather than competitors.

Transform from a product-focused organization into a customer-centric one.

Organization must be able to change its behavior as a result of what it learns through DM.

Need to know both how DM tools work and also how they will be used.

Page 13: Data Mining Overview - bus317.ballenger.wlu.edu

Group Exercise

• Time = 15 minutes

• Teams of 4 or fewer

• Discuss Data Mining situations among yourselves and pick one to report to the class

• What to report (verbally – 5 minute max):

– Describe the Data Mining situation

– How does it help the enterprise?

Page 14: Data Mining Overview - bus317.ballenger.wlu.edu

Why Study Data Mining?

• Open discussion to identify these

Page 15: Data Mining Overview - bus317.ballenger.wlu.edu

Discussion Topics

•Data Mining History

•Data Warehouse

•Data Mart

Page 16: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining History

• The approach has roots in practice dating back over 40 years.

• In the early 1960s, data mining was called statistical analysis, and the pioneers were statistical software companies such as SAS and SPSS.

• By the late 1980s, the traditional techniques had been augmented by new methods such as fuzzy logic, heuristics and neural networks.

Page 17: Data Mining Overview - bus317.ballenger.wlu.edu

Definitions of a Data Warehouse

1. W.H. Inmon: “A subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process”

2. Ralph Kimball: “A copy of transaction data, specifically structured for query and analysis”

Page 18: Data Mining Overview - bus317.ballenger.wlu.edu

Data Warehouse

• For organizational learning to take place, data from many sources must be gathered together and organized in a consistent and useful way – hence, Data Warehousing

• A Data Warehouse allows an organization (enterprise) to remember what it has noticed about its data

• Data Mining techniques make use of the data in a Data Warehouse

Page 19: Data Mining Overview - bus317.ballenger.wlu.edu

Data Warehouse

[Diagram: transactions from the enterprise “database” (Customers, Orders, Vendors, Employees, etc.) are copied, organized and summarized into the Data Warehouse, which supports Data Mining.]

Data Miners:

• “Farmers” – they know

• “Explorers” - unpredictable

Page 20: Data Mining Overview - bus317.ballenger.wlu.edu

Data Warehouse

• A data warehouse is a copy of transaction data specifically structured for querying, analysis and reporting – hence, data mining.

• Note that the data warehouse contains a copy of the transactions, which are not updated or changed later by the transaction system.

• Also note that this data is specially structured, and may have been transformed when it was copied into the data warehouse.
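The "copy that is not changed later" idea can be sketched in a few lines of Python. This is only an illustration under assumed names: the `orders` table, its fields, the `load_into_warehouse` helper and the "day 1"/"day 2" labels are all hypothetical, not part of any real warehouse product.

```python
def load_into_warehouse(warehouse, source_rows, load_date):
    """Append a transformed snapshot of each source row; never update old rows."""
    for row in source_rows:
        warehouse.append({
            "order_id": row["id"],
            "customer": row["cust"].strip().title(),  # cleaned up while copying
            "amount": float(row["amount"]),           # text converted to a number
            "status": row["status"],
            "load_date": load_date,                   # when the copy was taken
        })

# Hypothetical operational (transaction system) table
orders = [{"id": 1, "cust": " alice smith ", "amount": "19.99", "status": "open"}]

warehouse = []
load_into_warehouse(warehouse, orders, "day 1")

orders[0]["status"] = "shipped"   # the transaction system overwrites its own row
load_into_warehouse(warehouse, orders, "day 2")

for row in warehouse:
    print(row)
```

The operational row is overwritten, but the warehouse keeps both snapshots, so the history of the order remains available for analysis.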

Page 21: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mart

•A Data Mart is a smaller, more focused

Data Warehouse – a mini-warehouse.

•A Data Mart typically reflects the

business rules of a specific business

unit within an enterprise.

Page 22: Data Mining Overview - bus317.ballenger.wlu.edu

Data Warehouse to Data Mart

[Diagram: the Data Warehouse feeds three Data Marts, each of which provides Decision Support Information.]

Page 23: Data Mining Overview - bus317.ballenger.wlu.edu

Data Warehouse & Mart

•Set of “Tables” – 2 or more dimensions

•Designed for Aggregation
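A minimal sketch of what "designed for aggregation" can mean in practice: the same table of facts is summed along whichever dimensions a question calls for. The rows, the dimension names (`region`, `product`, `month`) and the `aggregate` helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, month, sales_amount)
transactions = [
    ("East", "Widget", "Jan", 120.0),
    ("East", "Gadget", "Jan", 80.0),
    ("West", "Widget", "Jan", 200.0),
    ("East", "Widget", "Feb", 150.0),
    ("West", "Gadget", "Feb", 50.0),
]

DIMENSIONS = ("region", "product", "month")

def aggregate(rows, by):
    """Sum sales over the rows, grouped by the chosen dimension names."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]  # the last column is the sales amount
    return dict(totals)

by_region = aggregate(transactions, by=("region",))
by_region_month = aggregate(transactions, by=("region", "month"))

print(by_region)
print(by_region_month)
```

A data mart for one business unit could hold just the summaries that unit needs, for example sales by region and month.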

Page 24: Data Mining Overview - bus317.ballenger.wlu.edu

Group Exercise

• Time = 15 minutes

• Teams of 4 or fewer

• Discuss Data Warehouse to Data Mart situations among yourselves and pick one to report to the class

• What to report (verbally – 5 minute max):

– Describe the Data Warehouse to Data Mart situation

– How does it help the enterprise’s “business” unit?

Page 25: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining Discussion Topics

•Data Mining Flavors

•Data Mining Examples

•Data Mining Tasks

•Data Mining’s Biggest Challenge

•What does all of this mean?

Page 26: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining Flavors

• Directed (Supervised): attempts to explain or categorize some particular target field such as income or response.

– Build models (algorithms/rules/formulas) to connect inputs to target or outcome

– For example: regression, neural networks, decision trees, nearest neighbors

– Models produce scores (fitted or predicted values) used to rank customers.

• Undirected (Unsupervised): attempts to find patterns or similarities among groups of records without the use of a particular target field or collection of predefined classes.

– For example: affinity grouping (association rules, market basket analysis), clustering, self-organizing maps.
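The two flavors can be contrasted on one small data set. The sketch below is an assumption-laden toy, not a production method: the customer records, the `responded` target, and the helper names `knn_score` and `two_means` are invented for illustration, and the inputs are left unscaled for brevity (so age dominates the distance).

```python
# Hypothetical history: ((age, annual_spend_in_thousands), responded)
history = [
    ((25, 1.0), 0), ((30, 1.5), 0), ((45, 6.0), 1),
    ((50, 7.0), 1), ((35, 2.0), 0), ((55, 6.5), 1),
]

def sq_dist(a, b):
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_score(record, labeled, k=3):
    """Directed: fraction of the k nearest labeled records whose target is 1."""
    nearest = sorted(labeled, key=lambda item: sq_dist(record, item[0]))[:k]
    return sum(target for _, target in nearest) / k

def two_means(points, iterations=10):
    """Undirected: split points into two groups with a basic k-means loop."""
    centers = [min(points), max(points)]  # simple deterministic starting centers
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            closer = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[closer].append(p)
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return groups

# Directed: score new prospects against the target field, then rank them.
prospects = {"A": (48, 6.2), "B": (28, 1.2), "C": (40, 4.0)}
scores = {name: knn_score(rec, history) for name, rec in prospects.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print("ranked prospects:", ranked)

# Undirected: ignore the target field and look for natural groups.
groups = two_means([rec for rec, _ in history])
print("clusters:", groups)
```

The directed part needs the `responded` column to learn from; the undirected part never looks at it and simply reports which records sit close together.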

Page 27: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining Examples in Enterprises

• US Government

– FBI – track down criminals

– Treasury Dept – suspicious int’l funds transfer

– SEC - insider trading

• Phone companies

• Supermarkets & Superstores (Vons, Albertsons, Wal-Mart, Costco)

• Mail-Order, On-Line Order (L.L. Bean, Victoria’s Secret, Lands’ End)

• Financial Institutions (BofA, Wells Fargo, Charles Schwab)

• Insurance Companies (USAA, Allstate, State Farm)

• Tons of others…

Page 28: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining Techniques

• Classification. Example: Fr, So, Jr, Sr

• Estimation. Example: household income

• Prediction. Example: predict the average amount of a credit card balance transfer

• Affinity Grouping. Example: people who buy X often also buy Y, with probability Z%

• Clustering: similar to classification but with no predefined classes

• Description and Profiling: behavior begets an explanation
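The "probability Z%" in the affinity grouping example is what association-rule analysis usually calls the confidence of the rule. A minimal sketch, assuming a handful of made-up market baskets and a hypothetical `rule_stats` helper:

```python
# Hypothetical market baskets, one set of items per checkout
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def rule_stats(baskets, x, y):
    """Return (support, confidence) for the rule 'buyers of x also buy y'."""
    with_x = [b for b in baskets if x in b]
    with_both = [b for b in with_x if y in b]
    support = len(with_both) / len(baskets)  # share of all baskets with both items
    confidence = len(with_both) / len(with_x) if with_x else 0.0  # the Z%
    return support, confidence

support, confidence = rule_stats(baskets, "bread", "milk")
print(f"People who buy bread also buy milk {confidence:.0%} of the time")
print(f"Both items appear together in {support:.0%} of all baskets")
```

Support is worth reporting alongside confidence, since a rule can look strong while applying to very few baskets.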

Page 29: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining’s Biggest Challenge

• The largest challenge a data miner may face is the sheer volume of data in the data warehouse.

• Summary data must be available to get the analysis started.

• This sheer volume may mask the important relationships the data miner is interested in.

• The data miner must be able to overcome the volume and interpret the data.

Page 30: Data Mining Overview - bus317.ballenger.wlu.edu

What Does All of This Mean?

• On a regular basis, “farmers” and “explorers” utilize their data warehouses to give guidance to and/or answer a limitless variety of questions.

• Nothing is free, however, and the benefits do come with a cost.

• The value of a data warehouse and subsequent data mining comes from the new and changed business processes it enables, and from the competitive advantage they bring.

• There are limitations, though: a Data Warehouse cannot correct problems with its data, although it may help to identify them more clearly.

Page 31: Data Mining Overview - bus317.ballenger.wlu.edu

The Virtuous Cycle of Data Mining

Data are at the heart of most companies’ core business processes

Data are generated by transactions regardless of industry

In addition to this internal data, there are tons of external data sources (credit ratings, demographics, etc.)

Data Mining’s promise is to find patterns in the “gazillions” of bytes

Page 32: Data Mining Overview - bus317.ballenger.wlu.edu

But…

Finding patterns is not enough

Business (individuals) must:

Respond to the pattern(s) by taking action

Turning:

Data into Information

Information into Action

Action into Value

Hence, the Virtuous Cycle of Data Mining

Page 33: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining… Easy?

Marketing literature makes it look easy!!!

Just apply automated algorithms created by great minds, such as:

Neural networks

Decision trees

Genetic algorithms

“Poof”…Magic happens!!!

Not So…

Data Mining is an iterative, learning process

Data Mining takes conscientious, long-term hard work and commitment

Data Mining’s Reward:

Success can transform a company from being reactive to being proactive

Page 34: Data Mining Overview - bus317.ballenger.wlu.edu

Data Mining’s Virtuous Cycle

1. Identify the business opportunity/problem

2. Mining data to transform it into actionable information

3. Acting on the information

4. Measuring the results

Page 35: Data Mining Overview - bus317.ballenger.wlu.edu

Bank of America Case Study

In-Class Exercise

Review the Bank of America Case Study found in the textbook on pages 11 - 14

Page 36: Data Mining Overview - bus317.ballenger.wlu.edu

Identify the Business Opportunity

Many business processes are good candidates:

New product introduction

Direct marketing campaign

Understanding customer attrition/churn

Evaluating the results of a test market

Measurements from past Data Mining efforts:

What types of customers responded to our last campaign?

Where do the best customers live?

Are long waits in check-out lines a cause of customer attrition?

What products should be promoted with our XYZ product?

TIP: When talking with business users about data mining opportunities, make sure you focus on the business problems/opportunities and not on technology and algorithms.

Page 37: Data Mining Overview - bus317.ballenger.wlu.edu

Mining data to transform it into actionable information

Success is making business sense of the data

Numerous data “issues”:

Bad data formats (alpha vs numeric, missing, null, bogus data)

Confusing data fields (synonyms and differences)

Lack of functionality (“I wish I could…”)

Legal ramifications (privacy, etc.)

Organizational factors (unwilling to change “our ways”)

Lack of timeliness
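The "bad data formats" issue is often handled with a small cleaning step before any mining starts. The sketch below is one possible approach under assumed rules: the `clean_amount` helper, the list of null markers and the valid range (0 to 1,000,000) are illustrative choices, not a standard.

```python
import math

def clean_amount(raw):
    """Return the value as a float, or None if it is missing or unusable."""
    if raw is None:
        return None                      # missing value
    text = str(raw).strip().replace(",", "").replace("$", "")
    if text == "" or text.upper() in {"NULL", "N/A", "NA"}:
        return None                      # blank or null marker
    try:
        value = float(text)
    except ValueError:
        return None                      # alphabetic text where a number belongs
    if not math.isfinite(value) or value < 0 or value > 1_000_000:
        return None                      # treated as bogus (assumed limits)
    return value

raw_values = ["1,250.00", "$40", "", None, "NULL", "abc", "-5", "99999999"]
cleaned = [clean_amount(v) for v in raw_values]
print(cleaned)

usable = [v for v in cleaned if v is not None]
print(f"{len(usable)} of {len(raw_values)} values usable")
```

Counting how many values were rejected is useful in itself, since a high rejection rate points back to a problem in the source system.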

Page 38: Data Mining Overview - bus317.ballenger.wlu.edu

Acting on the Information

This is the purpose of Data Mining

the hope of adding value

What type of action?

Interactions with customers, prospects, suppliers

Modifying service procedures

Adjusting inventory levels

Consolidating

Expanding

Etc…

Page 39: Data Mining Overview - bus317.ballenger.wlu.edu

Measuring the Results

Assesses the impact of the action taken

Often overlooked, ignored, skipped

Planning for the measurement should begin when analyzing the business opportunity, not after it is “all over”

Assessment questions (examples):

Did this ____ campaign do what we hoped?

Did some offers work better than others?

Did these customers purchase additional products?

Tons of others…
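Questions such as "did some offers work better than others?" can be answered with simple response rates once the measurement has been planned for. A minimal sketch with invented numbers; the offer names, the counts and the use of an uncontacted-style `control` group as a baseline are all assumptions for illustration.

```python
# Hypothetical campaign results; "control" is a comparison group (an assumption)
campaign = {
    "offer_A": {"contacted": 1000, "responded": 52},
    "offer_B": {"contacted": 1000, "responded": 31},
    "control": {"contacted": 500, "responded": 10},
}

def response_rate(group):
    """Share of contacted customers who responded."""
    return group["responded"] / group["contacted"]

baseline = response_rate(campaign["control"])

report = {}
for name, group in campaign.items():
    rate = response_rate(group)
    report[name] = {"rate": rate, "lift": rate / baseline}
    print(f"{name}: response rate {rate:.1%}, lift vs. control {rate / baseline:.2f}")

# Which offer worked best?
best_offer = max(
    (name for name in campaign if name != "control"),
    key=lambda name: report[name]["rate"],
)
print("best offer:", best_offer)
```

Lift compares each offer with the baseline, so a value near 1 would suggest the campaign changed little.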