0022-Chanawuth


DATA MINING IN DATA ANALYSIS FOR BUSINESS DECISION SUPPORT IN WAREHOUSE MANAGEMENT WITH WEKA PROGRAM

by

C. Chanawuth

School of Business, University of the Thai Chamber of Commerce

126/1 Vibhavadee-Rangsit Road, Dindaeng, Bangkok 10400, Thailand

Tel: (662) 697-6101-5, E-mail: [email protected]

ABSTRACT

Data mining with Weka, an open source program, can analyze data and support many kinds of business decisions in warehouse management. For example, data mining supports the prediction of customers’ service cancellations and the discovery of product associations in transportation for warehouse storekeeping. It is a tool that helps a business make decisions that give it an advantage over competitors. Classification techniques, which can analyze both continuous and discrete data, are the most widely applied in data mining. This research focuses on classification analysis and numerical prediction with Weka, which analyzes data by developing a model to support decision making and business problem solving.

KEY WORDS

Data Mining, Classification, Weka program, Decision Support System

INTRODUCTION

In today's competitive business world, applying knowledge, experience and technology has become a significant factor in gaining an advantageous position over competitors. Another important factor that should not be overlooked is data analysis. Analysts must apply knowledge and experience when analyzing data so that the results can support business decision making. Because the data come in many kinds and in large amounts, the analysis requires a great deal of time. If analysts lack experience in choosing data and information, the analytic outcomes will be useless. Data mining is an aid for analyzing large amounts of data (Power, 2008).

Data mining is a technique for finding significant data hidden in multiple data sources, and it can select only the necessary data (Gargano and Raggad, 1999; Rafalski, 2002). It discovers both associations hidden in data sources and new bodies of knowledge, through tools developed in the data mining process (Gargano and Raggad, 1999).

DATA ANALYSIS BY DATA MINING

Figure 1 presents the use of data mining techniques in data analysis. Data preparation is a time-consuming part of the process. Sometimes, if the data are still unclear, the business understanding step has to be repeated. When the data are clear, the next step is modeling with a data mining technique.

Business Understanding: analyzing business processes, the areas where problems occur or where data are required to support decision making, and the targets of the analysis.

Data Understanding: analyzing the data used for decision support and understanding the data formats, types, sources and amounts to apply in the analysis.

Data Selection: selecting the data needed for the decision support modeling process, which must be the data relevant to solving the problem.


Data Cleaning: checking data accuracy and correcting problems such as incomplete data and inaccurate numerical data. Weka can display such erroneous data so that they can be corrected.

Data Transformation: converting all data into a format that is ready for analysis and modeling. Because the data may come from various sources and in different data types, they must all be changed into the same format before a decision support model can be developed.

Modeling: dividing the selected data into two categories, Training Data and Evaluation Data, which are used in developing a decision support model with data mining techniques. In this step, Weka applies the chosen algorithm to the data and presents the best results from the analysis.

Evaluation: after the decision support model has been developed, testing it with the Evaluation Data to check the accuracy of the model.

Deployment Model: using the model for decision support on Unseen Data in Weka to aid in evaluating the output.

FIGURE 1

DATA MINING WORKFLOW
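The Modeling and Evaluation steps above can be sketched in Python as follows. This is a minimal illustration, not Weka's own procedure: the function names, the 30% evaluation fraction, the toy records and the fixed stand-in rule used as a "model" are all assumptions made for this sketch.

```python
import random


def split_training_evaluation(records, evaluation_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into two disjoint lists."""
    if not 0.0 < evaluation_fraction < 1.0:
        raise ValueError("evaluation_fraction must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_eval = max(1, round(len(shuffled) * evaluation_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def accuracy(model, evaluation_data):
    """Fraction of evaluation records whose known class the model reproduces."""
    correct = sum(1 for features, label in evaluation_data if model(features) == label)
    return correct / len(evaluation_data)


# Toy records of the form (features, class label); illustrative only, not from the paper.
records = [({"bought": b}, "A" if b else "D") for b in [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]]
training_data, evaluation_data = split_training_evaluation(records)


def model(features):
    """Stand-in model: a fixed rule used only to exercise the evaluation step."""
    return "A" if features["bought"] else "D"


print(len(training_data), len(evaluation_data), accuracy(model, evaluation_data))
```

Keeping the evaluation records out of the training list is what lets the accuracy figure say something about data the model has not seen.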


In data mining, Weka is a program that analyzes data to develop models that support decision making in warehouse management. There are many data mining techniques for model development. Among the most popular are Classification, Clustering and Association Rule Discovery (Weka Machine Learning Project, 2010; Wass, 2007). For the decision support models in this paper, Classification is used to analyze nominal data and to predict numeric data.

As shown in Figure 2, developing a model with the classification technique means building the model from the training data and testing it with the evaluation data. The model developed by Weka is then used in practice by applying it to unseen data, for which the answer class is not yet known.

FIGURE 2

CLASSIFICATION WORKFLOW IN DATA MINING

CLASSIFICATION CASE STUDY: CUSTOMERS TYPE ANALYSIS

One significant problem in warehouse management is handling a large volume of product demand within limited space. If the types of customers are known, ordering can be adjusted to suit the demand. For example, customers can be classified into two groups, with ‘A’ representing premium customers and ‘D’ representing general customers. Classification analysis processes the customers’ buying lists from their previous visits to divide the customers into groups. The customer classification model is developed as a decision tree. In Weka the tree is built with J48, which is Weka's implementation of the C4.5 decision tree algorithm; C4.4 is a related variant of C4.5 designed to give better class probability estimates (Jiang et al., 2009).

The customers’ past buying lists are presented in Table 1.


TABLE 1

THE DATA OF THE CUSTOMERS’ BUYING LISTS IN THE PAST

FOR CUSTOMER CLASSIFICATION MODELING

No.   Sex     Age  Income  Product_A  Product_B  Product_C  Type
1001  Male    20   12,000  0          0          0          D
1002  Female  18    7,000  1          1          1          A
1003  Female  35   35,000  0          0          0          D
1004  Male    28    6,000  1          1          0          A
1005  Female  32   20,000  0          0          0          D
1006  Male    20   12,000  0          0          0          D
1007  Female  23    7,000  1          1          1          A
1008  Female  35   35,000  0          0          0          D
1009  Male    18    6,000  1          1          0          A
1010  Female  32   20,000  0          0          0          D
1011  Male    34   12,000  0          0          0          D
1012  Female  45   27,000  1          1          1          A
1013  Female  22   35,000  0          0          0          D
1014  Male    18    8,000  1          1          0          A
1015  Female  32   20,000  0          0          0          D
1016  Male    22   12,000  0          0          0          D
1017  Female  34   17,000  1          1          1          A
1018  Female  35   35,000  0          0          0          D
1019  Male    53   26,000  1          1          0          A
1020  Female  32   20,000  0          0          0          D
1021  Male    34   12,000  0          0          0          D
1022  Female  34   24,000  1          1          1          A
1023  Female  35   35,000  0          0          0          D
1024  Male    33   60,000  1          1          0          A
1025  Female  34   20,000  0          0          0          D
1026  Male    20   12,000  0          0          0          D
1027  Female  35   37,000  1          1          1          A
1028  Female  35   35,000  0          0          0          D
1029  Male    33   26,000  1          1          0          A
1030  Female  32   20,000  0          0          0          D

Remark: 1 indicates a purchase and 0 indicates no purchase of Product A, Product B and Product C.
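Before Weka can analyze a table like Table 1, the data must be in a format it reads, such as ARFF. The following Python sketch shows one way the Data Transformation step could produce an ARFF file from the first three rows of Table 1. The relation name `customers`, the helper `to_arff`, the choice to drop the No. column as a mere identifier, and the removal of thousands separators from Income are assumptions of this sketch, not details given in the paper.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text: a header of attribute declarations, then comma-separated data."""
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        # A list of values declares a nominal attribute; a string such as "numeric" is used as-is.
        if isinstance(kind, (list, tuple)):
            kind = "{" + ",".join(kind) + "}"
        lines.append("@attribute {} {}".format(name, kind))
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


attributes = [
    ("Sex", ["Male", "Female"]),
    ("Age", "numeric"),
    ("Income", "numeric"),
    ("Product_A", ["0", "1"]),
    ("Product_B", ["0", "1"]),
    ("Product_C", ["0", "1"]),
    ("Type", ["A", "D"]),  # class attribute, placed last
]

# Rows 1001-1003 of Table 1, with Income written without thousands separators.
rows = [
    ("Male", 20, 12000, 0, 0, 0, "D"),
    ("Female", 18, 7000, 1, 1, 1, "A"),
    ("Female", 35, 35000, 0, 0, 0, "D"),
]

arff_text = to_arff("customers", attributes, rows)
print(arff_text)
```

Declaring the product columns and Type as nominal rather than numeric matters here, because a decision tree classifier needs a nominal class attribute.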

Table 2 presents unseen data to which the model is applied in order to determine the customer type.

TABLE 2

THE DATA OF CUSTOMERS’ PURCHASING DECISION

FOR CUSTOMER CLASSIFICATION MODELING

No.   Sex   Age  Income  Product_A  Product_B  Product_C  Type
1031  Male  17   5,000   1          0          1          ?

Remark: 1 indicates a purchase and 0 indicates no purchase of Product A, Product B and Product C.

The model developed with Weka's classification technique, following the workflow in Figure 2, is a decision tree built with the J48 algorithm, shown in Figure 3. Applying the model to the data in Table 2 gives customer type A as the result.
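To illustrate how a decision tree chooses a split on data like Table 1, the following Python sketch computes the information gain of each nominal attribute and builds a one-level tree from the best one. It is a simplified stand-in and not J48 itself: J48 uses gain ratio, handles numeric attributes and prunes the tree, while this sketch uses plain information gain, omits Age and Income, and breaks ties by attribute order. The nominal columns of Table 1 repeat every five rows, so they are written as one block repeated six times.

```python
from collections import Counter
from math import log2

# Nominal columns of Table 1 as (Sex, Product_A, Product_B, Product_C, Type).
# The same five-row pattern repeats for rows 1001-1030; Age and Income are omitted.
BLOCK = [
    ("Male", 0, 0, 0, "D"),
    ("Female", 1, 1, 1, "A"),
    ("Female", 0, 0, 0, "D"),
    ("Male", 1, 1, 0, "A"),
    ("Female", 0, 0, 0, "D"),
]
ROWS = BLOCK * 6
ATTRIBUTES = ["Sex", "Product_A", "Product_B", "Product_C"]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, index):
    """Entropy of the class minus the weighted entropy after splitting on one attribute."""
    remainder = 0.0
    for value in set(row[index] for row in rows):
        subset = [row[-1] for row in rows if row[index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[-1] for row in rows]) - remainder


def majority_class_by_value(rows, index):
    """One-level tree: map each attribute value to its most common class."""
    return {
        value: Counter(r[-1] for r in rows if r[index] == value).most_common(1)[0][0]
        for value in set(row[index] for row in rows)
    }


gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(ATTRIBUTES, key=lambda name: gains[name])  # first attribute wins ties
stump = majority_class_by_value(ROWS, ATTRIBUTES.index(best))

# Row 1031 of Table 2 (nominal columns only).
unseen = {"Sex": "Male", "Product_A": 1, "Product_B": 0, "Product_C": 1}
predicted_type = stump[unseen[best]]
print(gains, best, predicted_type)
```

In this data Product_A and Product_B separate the two customer types equally well, so the choice between them here is only an artifact of attribute order. A split on Product_A is consistent with the type A result reported for the customer in Table 2.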


FIGURE 3

ANALYTICAL OUTCOME OF CLASSIFICATION: DECISION TREE

As a result, the number of customers of each type is known from their purchasing behavior, so product ordering and storage can be managed well in response to customers’ demands.

Classification: Customers Type Model


CLASSIFICATION: ANALYZING WAREHOUSE RENTAL

If warehouse rental cannot be managed or planned ahead, it becomes a burden for the business. Warehouses constantly need improvement, such as upgrading air conditioners in the freezing rooms for frozen foods, increasing the number of RFID units for product checking, improving warehouse walls or resizing storage rooms. All of these factors can be used in data analysis for warehouse rental calculation. With numeric prediction, which Weka groups with its classification functions, rental fees can be predicted in advance by using linear regression in Weka to develop a decision support model from data such as those in Table 3.

TABLE 3

WAREHOUSE RENTAL CALCULATING DATA

Area Size  Security Fee  RFID  Upgrades Partition  Upgrades Air  Rental
3,529       9,191        6     0                   0             205,000
3,247      10,061        5     1                   1             224,900
4,032      10,150        5     0                   1             197,900
2,397      14,156        4     1                   0             189,900
2,200       9,600        4     0                   1             195,000
3,536      19,994        6     1                   1             325,000
…          …             …     …                   …             …
2,983       9,365        5     0                   1             230,000

Remark: 1 indicates that the equipment was upgraded and 0 indicates no upgrade, for Upgrades Partition and Upgrades Air.

The outcome of the Weka analysis is presented in Figure 4.

FIGURE 4

ANALYTICAL OUTCOME OF CLASSIFICATION: LINEAR REGRESSION

Classification: Warehouse Rental Model


The equation produced by Weka is

\[
\text{Rental} = (-26.6882 \times \text{Area Size}) + (7.0551 \times \text{Security Fee}) + (43166.0767 \times \text{RFID}) + (42292.0901 \times \text{Upgrades Air}) + (-21661.1208) \tag{1}
\]

TABLE 4

THE DATA FOR MODEL TESTING

Area Size  Security Fee  RFID  Upgrades Partition  Upgrades Air  Rental
3,529      9,191         6     0                   0             205,000

\[
\begin{aligned}
\text{Rental} &= (-26.6882 \times 3{,}529) + (7.0551 \times 9{,}191) + (43166.0767 \times 6) + (42292.0901 \times 0) + (-21661.1208) \\
&= (-94182.6578) + (64843.4241) + (258996.4602) + (0) + (-21661.1208) \\
&= 207{,}996.11 \text{ Baht}
\end{aligned}
\]

Applying the data in Table 4 to the equation from the model gives a predicted value of 207,996.11 Baht, while the actual value is 205,000.00 Baht. The difference between the two values is 2,996.11 Baht.
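The calculation above can be reproduced with a short Python sketch. The coefficients and the Table 4 row are taken from the text; the names `COEFFICIENTS`, `INTERCEPT` and `predict_rental` are assumptions introduced for illustration and are not part of Weka.

```python
# Coefficients and intercept as reported in Equation (1).
# Upgrades Partition does not appear in the equation, so it has no coefficient here.
COEFFICIENTS = {
    "Area Size": -26.6882,
    "Security Fee": 7.0551,
    "RFID": 43166.0767,
    "Upgrades Air": 42292.0901,
}
INTERCEPT = -21661.1208


def predict_rental(row):
    """Evaluate the linear model: intercept plus the sum of coefficient * value."""
    return INTERCEPT + sum(coef * row[name] for name, coef in COEFFICIENTS.items())


# The single test row from Table 4, including its actual rental.
table_4_row = {
    "Area Size": 3529,
    "Security Fee": 9191,
    "RFID": 6,
    "Upgrades Partition": 0,
    "Upgrades Air": 0,
    "Rental": 205000,
}

predicted = predict_rental(table_4_row)
difference = predicted - table_4_row["Rental"]
print(f"predicted = {predicted:,.2f} Baht, difference = {difference:,.2f} Baht")
```

The prediction comes out at roughly 207,996 Baht, about 3,000 Baht above the actual rental. The same function can be applied to any other row that provides the four attributes used in the equation.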

CONCLUSION

The outcome of data analysis with data mining is a model that supports decisions for solving warehouse management problems. Apart from data, knowledge and experience, an efficient tool is necessary to support ideas for improving business processes ahead of competitors.

Applying Weka, an open source program, is a starting point for data mining in any business that needs to discover hidden information in large amounts of data, especially the warehouse business, which relies on managing limited resources to gain the largest benefit.

REFERENCES

Gargano, M. L. and Raggad, B. G. (1999), Data Mining – a powerful information creating tool, OCLC Systems & Services, 15(2), pp. 81-90.

Jiang, L., Li, C. and Cai, Z. (2009), Decision tree with better class probability estimation, International Journal of Pattern Recognition & Artificial Intelligence, 23(4), pp. 745-763.

Lee, S. J. and Siau, K. (2001), A review of data mining techniques, Industrial Management & Data Systems, 101(1), pp. 41-46.

Liu, J. I. C., Yun, D. Y. Y. and Klein, G. (1990), An Agent for Intelligent Model Management, Journal of Management Information Systems, 7(1), pp. 101-122.

Petruşel, R. (2009), Collaborative Virtual Enterprise Environment and Decision Mining, Informatica Economica, 13(2), pp. 59-67.

Power, D. J. (2008), Understanding Data-Driven Decision Support Systems, Information Systems Management, 25(2), pp. 149-154.

Rafalski, E. (2002), Using data mining/data repository methods to identify marketing opportunities in health care, Journal of Consumer Marketing, 19(7), pp. 607-613.

Wass, J. (2007), Weka Machine Learning Workbench, Scientific Computing, 24(3), pp. 21-47.

Weka Machine Learning Project, Retrieved August 1, 2010, from http://www.cs.waikato.ac.nz/~ml/weka/index.html