A little about data mining
Uploaded by nguyen-ngoc-binh-phuong
Data mining focuses on better understanding the characteristics and patterns among variables in large databases, using a variety of statistical and analytical tools.
◦ It is used to identify relationships among variables in large data sets and to understand the hidden patterns they may contain.
◦ XLMiner software implements many basic data mining procedures in a spreadsheet environment.
In supervised data mining techniques, there is a dependent variable that the method is trying to predict.
◦ The classification and prediction/forecasting methods are supervised data mining techniques.
In unsupervised data mining techniques, there is no dependent variable. Instead, these techniques search for patterns and structure among all of the variables.
◦ One popular unsupervised method is association analysis (known in marketing as market basket analysis).
◦ The most common unsupervised method is clustering (known in marketing as segmentation).
[Diagram labels: Descriptive analytics, Predictive analytics]
You have already learned the physical characteristics of the fruits from your previous work, so arranging fruits of the same type in one place is now easy.
In data mining terminology, this earlier work is called training data. You have already learned from your training data, which is possible because it includes a response variable.
Suppose you take a new fruit from the basket. You will look at the size, color, and shape of that particular fruit.
◦ If the size is big, the color is red, and the shape is round with a depression at the top, you will identify the fruit as an apple and put it in the apple group.
If you first learn from training data and then apply that knowledge to test data (here, the new fruit), this type of learning is called supervised learning.
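The fruit example above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from the slides: the training rows, the feature values for fruits other than the apple, and the helper names `count_matches` and `classify` are all assumptions, and the matching rule (pick the training example that shares the most features) is one simple choice among many.

```python
# Hypothetical training data: each row pairs observed features with a known
# fruit name (the response variable). Only the apple row follows the text;
# the other rows are invented for illustration.
TRAINING_DATA = [
    ({"size": "big", "color": "red", "shape": "round with top depression"}, "apple"),
    ({"size": "small", "color": "red", "shape": "round"}, "cherry"),
    ({"size": "big", "color": "green", "shape": "long curved"}, "banana"),
    ({"size": "small", "color": "green", "shape": "round"}, "grape"),
]


def count_matches(a, b):
    """Count the features on which two fruit descriptions agree."""
    return sum(1 for key in a if a[key] == b.get(key))


def classify(new_fruit, training_data=TRAINING_DATA):
    """Return the label of the training example sharing the most features."""
    _, best_label = max(
        training_data, key=lambda row: count_matches(new_fruit, row[0])
    )
    return best_label


# Test data: a new fruit whose name we do not know yet.
new_fruit = {"size": "big", "color": "red", "shape": "round with top depression"}
print(classify(new_fruit))  # prints "apple"
```

The key point is that `classify` can only name the new fruit because every training row carries a label. Without the response variable there would be nothing to predict.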
This time, you don’t know anything about the fruits. In fact, this is the first time you have seen them, and you have no clue what they are.
So, how will you arrange them? What will you do first?
You will pick up each fruit and arrange the fruits according to one of their physical characteristics.
Suppose you start with color. You will then arrange the fruits using color as the base condition, and the groups will look something like this:
◦ RED COLOR GROUP: apples & cherries.
◦ GREEN COLOR GROUP: bananas & grapes.
Now you take another physical characteristic, such as size:
◦ RED COLOR AND BIG SIZE: apples.
◦ RED COLOR AND SMALL SIZE: cherries.
◦ GREEN COLOR AND BIG SIZE: bananas.
◦ GREEN COLOR AND SMALL SIZE: grapes.
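The two grouping passes above can be sketched as follows. This is a minimal illustration under stated assumptions: the observations and the helper name `group_by` are hypothetical, and the code simply groups items that share the same feature values. It is not a clustering algorithm such as k-means, only a direct translation of the "color first, then size" procedure.

```python
from collections import defaultdict

# Hypothetical unlabeled observations: there are no fruit names here,
# only the features we can see. The ids exist just to tell items apart.
fruits = [
    {"id": 1, "color": "red", "size": "big"},
    {"id": 2, "color": "green", "size": "small"},
    {"id": 3, "color": "red", "size": "small"},
    {"id": 4, "color": "green", "size": "big"},
    {"id": 5, "color": "red", "size": "big"},
    {"id": 6, "color": "green", "size": "small"},
]


def group_by(items, *features):
    """Group item ids by the values of the chosen features."""
    groups = defaultdict(list)
    for item in items:
        key = tuple(item[f] for f in features)
        groups[key].append(item["id"])
    return dict(groups)


# First pass: color only. Second pass: color and size together.
by_color = group_by(fruits, "color")
by_color_and_size = group_by(fruits, "color", "size")

print(by_color)
print(by_color_and_size)
```

Note that the groups come out keyed by feature values such as `("red", "big")`, not by fruit names. Nothing in the data says that a big red fruit is an apple, which is exactly what the absence of a response variable means.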
Here you did not learn anything beforehand, which means there is no training data and no response variable.
In data mining, this kind of learning is known as unsupervised learning.