HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)
![Page 1: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/1.jpg)
CLUSTER ANALYSIS HTM 602
September 10, 2008
Suh-hee Choi
![Page 2: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/2.jpg)
“Cluster analysis classifies objects, so that each object is similar to others in the cluster with respect to a predetermined selection criterion.”
Hair, Joseph F. & William Black, Cluster analysis, in Grimm, Laurence & Paul Yarnold (eds.), 2000, Reading and Understanding More Multivariate Statistics, Ch. 5, p. 147, American Psychological Association
[Figure: objects A, B, C, D, E, and F plotted as points]
![Page 3: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/3.jpg)
CLUSTER ANALYSIS IN HOSPITALITY AND TOURISM RESEARCH
Thyne et al. (2004)* used cluster analysis to identify groups of backpackers visiting Scotland. They derived five groups: Typical Backpackers, Discoverers, Outdoors, Family Ties, and Routine Travelers.
Bigné and Andreu (2004)** used this methodology to classify consumers into two groups, showing that emotional criteria can be used to identify the characteristics of consumers.
* Thyne, M., Davies, S. & Rob Nash (2004) A Lifestyle Segmentation Analysis of the Backpacker Market in Scotland: A Case Study of the Scottish Youth Hostel Association. Journal of Quality Assurance in Hospitality & Tourism 5(2/3/4): 95-119
** Bigné, J. E. & Luisa Andreu (2004) Emotions in Segmentation: An Empirical Study. Annals of Tourism Research 31(3): 682-696
![Page 4: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/4.jpg)
CLUSTER ANALYSIS PROCESS
1. Research Problem (Select Objectives / Select Clustering Variables)
2. Research Design (Outliers / Standardization)
3. Assumptions (Is the sample representative of the population? Is multicollinearity substantial enough to affect results?)
4. Selecting a Clustering Algorithm (Hierarchical / Nonhierarchical), Number of Clusters Formed, Cluster Analysis Respecification
5. Interpreting the Clusters (Name clusters based on clustering variables)
6. Validating and Profiling the Clusters
Hair, Joseph F. & William Black, Cluster analysis, in Grimm, Laurence & Paul Yarnold (eds.), 2000, Reading and Understanding More Multivariate Statistics, Ch. 5, American Psychological Association
![Page 5: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/5.jpg)
1. Research Problem
Objectives: classifying customers based on emotional criteria
Variable selection (p. 689): X1 (angry-satisfied), X2 (unhappy-happy), … X10
![Page 6: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/6.jpg)
2. Research Design (1)
Can outliers be detected?
How should object similarity be measured?
[Figure: objects A, B, and C plotted as points]
![Page 7: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/7.jpg)
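2. Research Design
How should object similarity be measured?

Euclidean distance:
Distance (O1, O2) = $\sqrt{\sum_{i=1}^{n} (X_{1i} - X_{2i})^2}$

City-block approach:
Distance (O1, O2) = $\sum_{i=1}^{n} |X_{1i} - X_{2i}|$

where $X_{1i}$ and $X_{2i}$ are the values of objects O1 and O2 on variable $i$, and $n$ is the number of variables.

Should the data be standardized?

| Object | Probability of purchasing a brand | Commercial viewing time (minutes) | Commercial viewing time (seconds) |
| --- | --- | --- | --- |
| A | 60 | 3 | 180 |
| B | 30 | 4 | 240 |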
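The two distance measures can be tried directly on objects A and B from the table. The sketch below is only an illustration; the function names `euclidean` and `city_block` are ours, not from the slides. It computes each distance twice, once with viewing time in minutes and once in seconds, to show how the choice of unit changes the result.

```python
import math

def euclidean(p, q):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def city_block(p, q):
    """City-block distance: sum of the absolute differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

# (purchase probability, viewing time) for objects A and B from the table
a_minutes, b_minutes = (60, 3), (30, 4)      # viewing time in minutes
a_seconds, b_seconds = (60, 180), (30, 240)  # viewing time in seconds

d_min = euclidean(a_minutes, b_minutes)    # sqrt(30^2 + 1^2)
d_sec = euclidean(a_seconds, b_seconds)    # sqrt(30^2 + 60^2)
cb_min = city_block(a_minutes, b_minutes)  # 30 + 1
cb_sec = city_block(a_seconds, b_seconds)  # 30 + 60

print(round(d_min, 2), round(d_sec, 2))  # 30.02 67.08
print(cb_min, cb_sec)                    # 31 90
```

The same two objects are roughly twice as far apart when viewing time is recorded in seconds, which is why the question of standardization arises before any clustering is run.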
![Page 8: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/8.jpg)
3. Assumptions
Representativeness of the sample
Multicollinearity
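One simple way to look for multicollinearity is to inspect the pairwise correlations among the clustering variables. The sketch below is an illustration only: it borrows the var1 to var3 values of the five-object example that appears later in these slides, and the helper `pearson` is our own name. The slides do not state a cutoff for "substantial" correlation, so none is applied here.

```python
import math

# var1, var2, var3 for objects A-E, taken from the five-object example in these slides
data = {
    "var1": [4, 3, 3, 4, 7],
    "var2": [6, 3, 4, 6, 6],
    "var3": [6, 3, 3, 7, 6],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

names = list(data)
# Correlation for every unordered pair of variables
correlations = {
    (names[i], names[j]): pearson(data[names[i]], data[names[j]])
    for i in range(len(names))
    for j in range(i + 1, len(names))
}

for pair, r in correlations.items():
    print(pair, round(r, 3))
```

In this small example var2 and var3 are the most strongly correlated pair (about 0.94), so together they would pull the distance measure toward whatever they jointly capture.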
![Page 9: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/9.jpg)
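4. Derivation of Clusters

| Object | var1 | var2 | var3 |
| --- | --- | --- | --- |
| A | 4 | 6 | 6 |
| B | 3 | 3 | 3 |
| C | 3 | 4 | 3 |
| D | 4 | 6 | 7 |
| E | 7 | 6 | 6 |

Squared Euclidean Distance

|  | A | B | C | D | E |
| --- | --- | --- | --- | --- | --- |
| A |  |  |  |  |  |
| B | 1+3²+3²=19 |  |  |  |  |
| C | 1+2²+3²=14 | 0+1+0=1 |  |  |  |
| D | 0+0+1=1 | 1+3²+4²=26 | 1+2²+4²=21 |  |  |
| E | 3²+0+0=9 | 4²+3²+3²=34 | 4²+2²+3²=29 | 3²+0+1=10 |  |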
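The distance matrix can be reproduced with a few lines of code. This is a minimal sketch using the five objects from the table; the name `squared_euclidean` is ours.

```python
from itertools import combinations

# The five objects and their values on var1, var2, var3
objects = {
    "A": (4, 6, 6),
    "B": (3, 3, 3),
    "C": (3, 4, 3),
    "D": (4, 6, 7),
    "E": (7, 6, 6),
}

def squared_euclidean(p, q):
    """Sum of squared differences across all variables (no square root)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# One entry per unordered pair, i.e. the lower triangle of the matrix
distances = {
    (i, j): squared_euclidean(objects[i], objects[j])
    for i, j in combinations(objects, 2)
}

for (i, j), d in distances.items():
    print(f"{i}-{j}: {d}")
```

The smallest entries are A-D and B-C, both at distance 1, and the largest is B-E at 34.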
![Page 10: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/10.jpg)
4. Derivation of Clusters
Squared Euclidean Distance
e.g. Agglomerative hierarchical clustering process

| Step | Minimum distance | Pair | Cluster | No. of clusters |
| --- | --- | --- | --- | --- |
| 1 | 1 | B-C, A-D | (A,D), (B,C), E | 3 |
| 2 | 9 | A-E | (A,D,E), (B,C) | 2 |
| 3 | 14 | A-C | (A,B,C,D,E) | 1 |
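The agglomerative process can be sketched in code, assuming single linkage (nearest neighbor), where the distance between two clusters is the smallest squared Euclidean distance between any of their members. That assumption matches the "minimum distance" column, but the slides do not name the linkage rule, and `single_linkage` is our own helper. The code merges one pair per iteration, so the two tied merges at distance 1 appear as separate entries.

```python
from itertools import combinations

objects = {
    "A": (4, 6, 6),
    "B": (3, 3, 3),
    "C": (3, 4, 3),
    "D": (4, 6, 7),
    "E": (7, 6, 6),
}

def squared_euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def single_linkage(points):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [frozenset([name]) for name in points]
    history = []
    while len(clusters) > 1:
        candidates = []
        for c1, c2 in combinations(clusters, 2):
            # Single linkage: closest pair of members across the two clusters
            d = min(squared_euclidean(points[i], points[j]) for i in c1 for j in c2)
            candidates.append((d, c1, c2))
        # On ties, the first pair encountered is merged
        d, c1, c2 = min(candidates, key=lambda t: t[0])
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        history.append((d, sorted(c1 | c2), len(clusters)))
    return history

history = single_linkage(objects)
for distance, merged, remaining in history:
    print(distance, merged, remaining)
```

Under this rule the merges happen at distances 1, 1, 9, and 14. The D-E distance of 10 joins two objects that already share a cluster, so the last merge is driven by A-C at 14, the closest pair across (A,D,E) and (B,C).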
![Page 11: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)](https://reader038.fdocuments.us/reader038/viewer/2022100507/55931dfe1a28ab8b5c8b478b/html5/thumbnails/11.jpg)
4. Derivation of Clusters
* Dendrogram
[Figure: dendrogram with leaves ordered B, C, A, D, E; the axis shows the distance at combination, marked at 1, 9, and 14]
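Reading a dendrogram usually means choosing a distance at which to cut it. The sketch below is an illustration under the assumption of single linkage, where two objects share a cluster at a given cut whenever a chain of pairwise distances no larger than the cut connects them. The helper `cut_single_linkage` is hypothetical and not part of the slides.

```python
from itertools import combinations

objects = {
    "A": (4, 6, 6),
    "B": (3, 3, 3),
    "C": (3, 4, 3),
    "D": (4, 6, 7),
    "E": (7, 6, 6),
}

def squared_euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def cut_single_linkage(points, threshold):
    """Clusters obtained by cutting a single-linkage dendrogram at `threshold`."""
    parent = {name: name for name in points}

    def find(name):
        # Follow parent links up to the representative of the group
        while parent[name] != name:
            name = parent[name]
        return name

    # Link every pair whose distance does not exceed the cut
    for i, j in combinations(points, 2):
        if squared_euclidean(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for name in points:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())

three = cut_single_linkage(objects, 5)
two = cut_single_linkage(objects, 9)
one = cut_single_linkage(objects, 14)
print(three)  # [['A', 'D'], ['B', 'C'], ['E']]
print(two)    # [['A', 'D', 'E'], ['B', 'C']]
print(one)    # [['A', 'B', 'C', 'D', 'E']]
```

A cut between 1 and 9 gives the three-cluster solution, a cut between 9 and 14 gives two clusters, and any cut at 14 or above leaves a single cluster.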
5. Interpretation
6. Validation