Analytics Assignment - Cluster analysis
Uploaded by: aditya-dashora
Category: Data & Analytics
IIITB – BUSINESS ANALYTICS
CLUSTER ANALYSIS ASSIGNMENT
DATA DESCRIPTION
1. FRESH (Continuous)
2. MILK (Continuous)
3. GROCERY (Continuous)
4. FROZEN (Continuous)
5. DETERGENTS_PAPER (Continuous)
6. DELICATESSEN (Continuous)
7. CHANNEL (Nominal)
8. REGION (Nominal)
STEPS
1. Uni-variate Analysis
2. Scaling
3. Cluster Identification
4. Cluster Plotting
UNI-VARIATE ANALYSIS
Statistic  Channel  Region   Fresh   Milk  Grocery   Frozen  Detergents_Paper  Delicassen
Min.         1.000   1.000       3     55        3     25.0               3.0         3.0
1st Qu.      1.000   2.000    3128   1533     2153    742.2             256.8       408.2
Median       1.000   3.000    8504   3627     4756   1526.0             816.5       965.5
Mean         1.323   2.543   12000   5796     7951   3071.9            2881.5      1524.9
3rd Qu.      2.000   3.000   16934   7190    10656   3554.2            3922.0      1820.2
Max.         2.000   3.000  112151  73498    92780  60869.0           40827.0     47943.0
UNI-VARIATE ANALYSIS (SCALE)
Statistic  Channel   Region    Fresh     Milk  Grocery    Frozen  Detergents_Paper  Delicassen
Min.       -0.6895  -1.9931  -0.9486  -0.7779  -0.8364  -0.62763           -0.6037     -0.5396
1st Qu.    -0.6895  -0.7015  -0.7015  -0.5776  -0.6101  -0.47988           -0.5505     -0.3960
Median     -0.6895   0.5900  -0.2764  -0.2939  -0.3363  -0.31844           -0.4331     -0.1984
Mean        0.0000   0.0000   0.0000   0.0000   0.0000   0.00000            0.0000      0.0000
3rd Qu.     1.4470   0.5900   0.3901   0.1889   0.2846   0.09935            0.2182      0.1047
Max.        1.4470   0.5900   7.9187   9.1732   8.9264  11.90545            7.9586     16.4597
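The scaled summary has a mean of 0 in every column, which is consistent with z-score standardization (subtract the column mean, divide by the column standard deviation). The sketch below is in Python for illustration only; the original output looks like it came from R, and the use of the sample standard deviation is an assumption. The function name `zscore` and the helper `rescale` are hypothetical. The second half backs out the standard deviation of Fresh from the reported minimum and checks that the reported quartiles follow from it.

```python
from statistics import mean, stdev

def zscore(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator); an assumption about the source
    return [(v - m) / s for v in values]

# Consistency check using only numbers reported in the two summaries for Fresh.
fresh_mean = 12000.0                   # reported mean (rounded in the source)
fresh_min, scaled_min = 3.0, -0.9486   # raw and scaled minimum

# Standard deviation implied by the scaled minimum, roughly 12647.
implied_sd = (fresh_min - fresh_mean) / scaled_min

def rescale(x):
    """Scale a raw Fresh value with the reported mean and the implied SD."""
    return (x - fresh_mean) / implied_sd

for label, raw in [("1st Qu.", 3128), ("Median", 8504), ("3rd Qu.", 16934)]:
    print(label, round(rescale(raw), 4))
```

The printed values should agree with the scaled Fresh column (-0.7015, -0.2764, 0.3901) up to the rounding of the reported mean.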
CORRELATION MATRIX
                  Channel  Region  Fresh   Milk  Grocery  Frozen  Detergents_Paper
Channel              1.00    0.06  -0.17   0.46     0.61   -0.20              0.64
Region               0.06    1.00   0.06   0.03     0.01   -0.02              0.00
Fresh               -0.17    0.06   1.00   0.10    -0.01    0.35             -0.10
Milk                 0.46    0.03   0.10   1.00     0.73    0.12              0.66
Grocery              0.61    0.01  -0.01   0.73     1.00   -0.04              0.92
Frozen              -0.20   -0.02   0.35   0.12    -0.04    1.00             -0.13
Detergents_Paper     0.64    0.00  -0.10   0.66     0.92   -0.13              1.00
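The matrix entries are presumably Pearson correlation coefficients (an assumption; the slides do not name the method). As a minimal sketch in Python, each entry can be computed as the covariance of two columns divided by the product of their standard deviations. The functions `pearson` and `correlation_matrix` and the `toy` data are hypothetical and are not the wholesale data set.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return a nested dict with the pairwise correlations of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up toy columns, for illustration only.
toy = {
    "Grocery": [1, 2, 3, 4],
    "Detergents_Paper": [2, 4, 6, 9],
    "Frozen": [4, 3, 2, 1],
}
matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In the matrix above, the strongest pair is Grocery and Detergents_Paper (0.92), followed by Milk and Grocery (0.73).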
DETERMINING NUMBER OF CLUSTERS
Using various methods, we determined that the recommended number of clusters for this data set is 2.
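The slides do not say which methods were used. One common heuristic is the elbow method: run k-means for several values of k and look for the point where the total within-cluster sum of squares stops dropping sharply. The sketch below is a small pure-Python illustration on made-up 2-D points, not the author's procedure; `kmeans`, `farthest_first`, and the toy `points` are hypothetical, and the deterministic farthest-first initialization is chosen only to keep the example reproducible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def farthest_first(points, k):
    """Deterministic initial centers: start at the first point, then
    repeatedly add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns centers, clusters, total within-cluster SS."""
    centers = farthest_first(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    wss = sum(sq_dist(p, centers[i]) for i, c in enumerate(clusters) for p in c)
    return centers, clusters, wss

# Made-up toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
wss_by_k = {k: kmeans(points, k)[2] for k in (1, 2, 3)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 2))
```

On this toy data the within-cluster sum of squares falls steeply from k = 1 to k = 2 and barely changes at k = 3, which is the pattern that would point to two clusters.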
PLOTTING CLUSTERS
Cluster profiles
- Cluster 1: 135 data points
- Cluster 2: 305 data points
CLUSTER MEANS (CENTROIDS)
Cluster  Channel  Region  Fresh   Milk  Grocery  Frozen  Detergents_Paper  Delicassen
1           1.43    0.11  -0.28   0.76     0.95   -0.28              0.99        0.21
2          -0.63   -0.05   0.12  -0.34    -0.42    0.12             -0.44       -0.09
CLUSTER SUMMARY
> fit2$size
[1] 135 305
> fit2$betweenss
[1] 918.5233
> fit2$withinss
[1] 1277.692 1315.784
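These three outputs are enough to work out how much of the total variation the two-cluster split accounts for. The arithmetic is sketched below in Python using only the reported numbers. The sanity check assumes that all 8 columns were standardized with the sample standard deviation, in which case the total sum of squares should equal (n - 1) x 8.

```python
# Values copied from the fit2 output above.
size = [135, 305]
betweenss = 918.5233
withinss = [1277.692, 1315.784]

tot_withinss = sum(withinss)        # total within-cluster sum of squares
totss = betweenss + tot_withinss    # total sum of squares
ratio = betweenss / totss           # share of variation between clusters

# Sanity check (assumption: 8 z-scored columns, sample SD):
# each standardized column contributes n - 1 to the total sum of squares.
n_rows, n_cols = sum(size), 8
expected_totss = (n_rows - 1) * n_cols

print(round(tot_withinss, 3), round(totss, 3), expected_totss)
print(round(ratio, 3))
```

The total comes to about 3512, matching 439 x 8, and the between-cluster share is about 0.26. In other words, the two clusters account for roughly a quarter of the total variation in the scaled data.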