Data Warehousing Lecture-31 Supervised vs. Unsupervised Learning Virtual University of Pakistan...
Data Warehousing
Lecture-31: Supervised vs. Unsupervised Learning
Virtual University of Pakistan
Ahsan Abdullah, Assoc. Prof. & Head
Center for Agro-Informatics Research, www.nu.edu.pk/cairindex.asp
National University of Computers & Emerging Sciences, Islamabad
Email: [email protected]
Data Structures in Data Mining
• Data matrix
  – Table or database
  – n records and m attributes
  – n >> m

    C1,1  C1,2  C1,3  …  C1,m
    C2,1  C2,2  C2,3  …  C2,m
    C3,1  C3,2  C3,3  …  C3,m
     .     .     .         .
    Cn,1  Cn,2  Cn,3  …  Cn,m

• Similarity matrix
  – Symmetric square matrix
  – n x n or m x m

    1     S1,2  S1,3  …  S1,n
    S2,1  1     S2,3  …  S2,n
    S3,1  S3,2  1     …  S3,n
     .     .     .         .
    Sn,1  Sn,2  Sn,3  …  1
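
As a rough illustration of how a similarity matrix can be derived from a data matrix, here is a minimal sketch. The slides do not say which similarity measure is used, so the choice of 1 / (1 + Euclidean distance) is an assumption, and the function name `similarity_matrix` and the sample records are hypothetical.

```python
import math

def similarity_matrix(data):
    """Return the n x n matrix S with S[i][j] = 1 / (1 + Euclidean distance).

    The measure is an assumed example; any symmetric measure that gives 1
    for identical records would produce a matrix of the same shape.
    """
    n = len(data)
    sim = [[1.0] * n for _ in range(n)]  # diagonal entries stay at 1
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(data[i], data[j])
            sim[i][j] = sim[j][i] = 1.0 / (1.0 + dist)  # fill both halves
    return sim

# Hypothetical data matrix: n = 4 records, m = 2 attributes
data = [
    [1.0, 2.0],
    [1.0, 2.0],
    [4.0, 6.0],
    [9.0, 8.0],
]
S = similarity_matrix(data)
```

Passing the transposed table, for example `similarity_matrix(list(zip(*data)))`, would give the m x m attribute-to-attribute variant instead of the n x n record-to-record one.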
Main types of DATA MINING
Supervised: type and number of classes are known in advance
• Bayesian Modeling
• Decision Trees
• Neural Networks
• Etc.
Unsupervised: type and number of classes are NOT known in advance
• One-way Clustering
• Two-way Clustering
Clustering: Min-Max Distance
[Figure: scatter plot with axes Age and Salary (ticks at 20, 40, 60), showing clusters and one outlier]
• Inter-cluster distances are maximized
• Intra-cluster distances are minimized
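
To make the min-max idea concrete, the following sketch compares the average distance within each cluster to the distance between cluster centroids. The (age, salary) records, the cluster names, and the helper functions are all hypothetical, and measuring inter-cluster distance between centroids is just one possible convention.

```python
import math
from itertools import combinations

# Hypothetical (age, salary in thousands) records, already grouped into clusters
clusters = {
    "young": [(22, 20), (25, 24), (28, 22)],
    "senior": [(52, 60), (55, 64), (58, 62)],
}

def centroid(points):
    """Mean of the points, attribute by attribute."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def mean_intra_distance(points):
    """Average Euclidean distance over all pairs inside one cluster."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

# Intra-cluster distance: should be small for a good clustering
intra = {name: mean_intra_distance(pts) for name, pts in clusters.items()}

# Inter-cluster distance: should be large for a good clustering
inter = math.dist(centroid(clusters["young"]), centroid(clusters["senior"]))
```

For this toy data every value in `intra` is far smaller than `inter`, which is the situation the slide describes as desirable.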
How does clustering work?
One-way clustering example
[Figure: INPUT and OUTPUT panels]
• Black spots are noise
• White spots are missing data
Data Mining Agriculture data
[Figure: INPUT panel and clustered OUTPUT panel, with the clusters marked]
How does classification work?
[Figure: Classification. Unseen data are fed as inputs to a classifier (model); the output answers "Which class?" together with a confidence level]
Classification Process (1): Model Construction
Training Data (observations, measurements, etc.)

NAME    Time  Items  Gender
Moin    10    2      M
Munir   16    3      M
Meher   15    1      F
Javed   5     1      M
Mahin   20    1      F
Akram   20    4      M

Classification Algorithms

Classifier (Model):
IF time/items >= 6 THEN gender = 'F'

Relationship between shopping time and items bought
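
The rule can be checked against the training table with a few lines of code. This is only a sketch: the slide gives no ELSE branch, so labeling every other record 'M' is an assumption, and the function name `classify` is hypothetical.

```python
# Training data from the slide: (name, time, items, gender)
training = [
    ("Moin", 10, 2, "M"),
    ("Munir", 16, 3, "M"),
    ("Meher", 15, 1, "F"),
    ("Javed", 5, 1, "M"),
    ("Mahin", 20, 1, "F"),
    ("Akram", 20, 4, "M"),
]

def classify(time, items):
    """Slide rule: IF time/items >= 6 THEN gender = 'F'.

    Assumption: records that do not satisfy the rule are labeled 'M'.
    """
    return "F" if time / items >= 6 else "M"

# Pair each predicted label with the recorded one
predictions = [
    (name, classify(time, items), gender)
    for name, time, items, gender in training
]
training_accuracy = sum(pred == actual for _, pred, actual in predictions) / len(predictions)
```

Under that assumption the rule reproduces all six training labels, which is expected for a model constructed from exactly this data.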
Classification Process (2): Use the Model in Prediction
Testing Data

NAME    Time  Items  Gender
Tahir   20    1      M
Younas  11    2      M
Yasin   3     1      M

Unseen Data
(Firdous, Time = 15, Items = 1)

Classifier: Gender?
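
The prediction step can be sketched in the same way: first score the rule on the testing data, then apply it to the unseen record. As before, treating every record that fails the rule as 'M' is an assumption not stated on the slide, and `classify` is a hypothetical name.

```python
# Testing data from the slide: (name, time, items, gender)
testing = [
    ("Tahir", 20, 1, "M"),
    ("Younas", 11, 2, "M"),
    ("Yasin", 3, 1, "M"),
]

def classify(time, items):
    """Slide rule: IF time/items >= 6 THEN gender = 'F'; otherwise 'M' (assumed)."""
    return "F" if time / items >= 6 else "M"

# Names of testing records whose predicted label differs from the recorded one
misclassified = [
    name for name, time, items, gender in testing
    if classify(time, items) != gender
]
test_accuracy = 1 - len(misclassified) / len(testing)

# Unseen record from the slide: (Firdous, Time = 15, Items = 1)
firdous_prediction = classify(15, 1)
```

On this testing data the rule gets two of three records right (Tahir has a ratio of 20 but is recorded as 'M'), and it predicts 'F' for Firdous. This shows why a model is evaluated on data it was not built from before it is trusted on unseen data.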
Clustering vs. Cluster Detection
Clustering vs. Cluster Detection Example
[Figure: two panels labeled A and B]
The K-Means Clustering
The K-Means Clustering: Example
[Figure: four scatter-plot panels labeled A, B, C and D, each with both axes running from 0 to 10]
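
The transcript keeps only the slide titles and figure residue for K-means, so the following is a minimal sketch of the standard algorithm (alternating assignment and centroid-update steps), not necessarily the formulation used in the lecture. The points, the initial centroids, and the function name `k_means` are hypothetical.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

# Hypothetical points on a 0..10 grid, with deliberately poor initial centroids
points = [(1, 1), (2, 1), (1, 2), (2, 3), (8, 8), (9, 8), (8, 9), (9, 10)]
centroids, clusters = k_means(points, centroids=[(1, 1), (2, 1)])
```

Even though both initial centroids start inside the lower-left group, the iterations separate the two groups on this data. In general the result depends on the initial centroids and on the chosen k, so different starting points can lead to different clusterings.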
The K-Means Clustering: Comment