HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

Page 1: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

CLUSTER ANALYSIS HTM 602

September 10, 2008

Suh-hee Choi

Page 2: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

“Cluster analysis classifies objects, so that each object is similar to others in the cluster with respect to a predetermined selection criterion.”

Hair, Joseph F. & William Black, Cluster analysis, in Grimm, Laurence & Paul Yarnold (eds.), 2000, Reading and Understanding More Multivariate Statistics, Ch. 5, p. 147, American Psychological Association

[Figure: objects labeled A, B, C, D, E, and F]

Page 3: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

CLUSTER ANALYSIS IN HOSPITALITY AND TOURISM RESEARCH

Thyne et al. (2004)* used cluster analysis to identify groups of backpackers visiting Scotland. They derived five groups: Typical Backpackers, Discoverers, Outdoors, Family Ties, and Routine Travelers.

Bigné and Andreu (2004)** used this methodology to classify consumers into two groups, showing that emotional criteria can be used to identify the characteristics of consumers.

* Thyne, M., Davies, S. & Nash, R. (2004) A Lifestyle Segmentation Analysis of the Backpacker Market in Scotland: A Case Study of the Scottish Youth Hostel Association. Journal of Quality Assurance in Hospitality & Tourism 5(2/3/4): 95-119

** Bigné, J. E. & Andreu, L. (2004) Emotions in Segmentation: An Empirical Study. Annals of Tourism Research 31(3): 682-696

Page 4: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

CLUSTER ANALYSIS PROCESS

1. Research Problem (Select Objectives / Select Clustering Variables)

2. Research Design (Outliers / Standardization)

3. Assumptions (Is the sample representative of the population? Is multicollinearity substantial enough to affect results?)

4. Selecting a Clustering Algorithm (Hierarchical / Nonhierarchical; Number of Clusters Formed; Cluster Analysis Respecification)

5. Interpreting the Clusters (Name clusters based on clustering variables)

6. Validating and Profiling the Clusters

Hair, Joseph F. & William Black, Cluster analysis, in Grimm, Laurence & Paul Yarnold (eds.), 2000, Reading and Understanding More Multivariate Statistics, Ch. 5, American Psychological Association

Page 5: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

1. Research Problem

Objectives: Classifying customers based on emotional criteria

Variable selection (p. 689): X1 (angry-satisfied), X2 (unhappy-happy), … X10

Page 6: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

2. Research Design (1)

Can outliers be detected?

How should object similarity be measured?

[Figure: objects labeled A, B, and C]

Page 7: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

2. Research Design

How should object similarity be measured?

Euclidean distance

$$\text{Distance}(O_1, O_2) = \sqrt{\sum_{i=1}^{n} (X_{1i} - X_{2i})^2}$$

City-block approach

$$\text{Distance}(O_1, O_2) = \sum_{i=1}^{n} |X_{1i} - X_{2i}|$$

Should the data be standardized?

Object   Probability of purchasing a brand   Commercial viewing time
                                             Minutes   Seconds
A        60                                  3         180
B        30                                  4         240
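The effect of measurement units on the two distance measures can be checked with the A/B table from this slide. The following is a minimal sketch in Python, not part of the original slides; the function names are illustrative, and z-scores based on the population standard deviation are an assumed choice of standardization. With only two objects every z-score is necessarily +1 or -1, so the example only illustrates that standardized distances no longer depend on the unit.

```python
import math
from statistics import mean, pstdev


def euclidean(p, q):
    """Euclidean distance between two objects given as tuples of values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def city_block(p, q):
    """City-block distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def standardize(rows):
    """Z-score every column (assumption: population standard deviation)."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds)) for row in rows]


# Objects A and B: (probability of purchasing a brand, commercial viewing time)
in_minutes = [(60, 3), (30, 4)]
in_seconds = [(60, 180), (30, 240)]

d_min = euclidean(*in_minutes)     # sqrt(30^2 + 1^2)
d_sec = euclidean(*in_seconds)     # sqrt(30^2 + 60^2)
cb_min = city_block(*in_minutes)   # 30 + 1
cb_sec = city_block(*in_seconds)   # 30 + 60
z_min = euclidean(*standardize(in_minutes))
z_sec = euclidean(*standardize(in_seconds))

print(f"Euclidean:    minutes {d_min:.2f}, seconds {d_sec:.2f}")
print(f"City-block:   minutes {cb_min}, seconds {cb_sec}")
print(f"Standardized: minutes {z_min:.2f}, seconds {z_sec:.2f}")
```

On the raw data, switching viewing time from minutes to seconds changes both distances; after standardization the two versions give the same distance.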

Page 8: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

3. Assumptions

Representativeness of the sample

Multicollinearity

Page 9: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

4. Derivation of Clusters

Object   var1   var2   var3
A        4      6      6
B        3      3      3
C        3      4      3
D        4      6      7
E        7      6      6

Squared Euclidean Distance

     A              B               C               D
B    1+3²+3²=19
C    1+2²+3²=14     0+1+0=1
D    0+0+1=1        1+3²+4²=26      1+2²+4²=21
E    3²+0+0=9       4²+3²+3²=34     4²+2²+3²=29     3²+0+1=10
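The lower-triangular matrix above can be reproduced programmatically. This is a small Python sketch, not part of the original slides; the dictionary layout and function name are illustrative choices.

```python
# Objects and their values on var1, var2, var3 (from the table on this slide)
data = {
    "A": (4, 6, 6),
    "B": (3, 3, 3),
    "C": (3, 4, 3),
    "D": (4, 6, 7),
    "E": (7, 6, 6),
}


def squared_euclidean(p, q):
    """Sum of squared differences across all clustering variables."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


names = list(data)

# Lower triangle only: keys are (row object, column object)
dist = {
    (r, c): squared_euclidean(data[r], data[c])
    for i, r in enumerate(names)
    for c in names[:i]
}

# Print one row per object, mirroring the layout of the slide's matrix
for i, r in enumerate(names[1:], start=1):
    row = "  ".join(f"{c}:{dist[(r, c)]:>2}" for c in names[:i])
    print(f"{r}  {row}")
```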

Page 10: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

4. Derivation of Clusters

Squared Euclidean Distance

e.g. Agglomerative hierarchical clustering process

Step   Minimum distance   Pair        Cluster              No. of clusters
1      1                  B-C, A-D    (A,D), (B,C), E      3
2      9                  A-E         (A,D,E), (B,C)       2
3      10                 D-E         (A,B,C,D,E)          1
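An agglomerative procedure of this kind can be sketched in a few lines of Python. The slides do not name a linkage rule, so single linkage (distance between two clusters = smallest squared Euclidean distance between their members) is an assumption here, and the function names are illustrative. The sketch merges one pair per iteration, so the two tied merges of step 1 appear as two separate iterations. Under the assumed rule the last merge is driven by the closest between-cluster pair, so its height can differ from the last row of the table, which lists a pair (D-E) that already belongs to one cluster after step 2.

```python
from itertools import combinations

# Objects and their values on var1, var2, var3 (from the earlier data table)
data = {
    "A": (4, 6, 6),
    "B": (3, 3, 3),
    "C": (3, 4, 3),
    "D": (4, 6, 7),
    "E": (7, 6, 6),
}


def squared_euclidean(p, q):
    """Sum of squared differences across all clustering variables."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def single_link(c1, c2):
    """Assumed linkage: smallest distance between any two members."""
    return min(squared_euclidean(data[a], data[b]) for a in c1 for b in c2)


def agglomerate(names):
    """Merge the two closest clusters until a single cluster remains."""
    clusters = [(n,) for n in names]
    history = []
    while len(clusters) > 1:
        # Closest pair of current clusters (first pair wins on ties)
        d, c1, c2 = min(
            ((single_link(x, y), x, y) for x, y in combinations(clusters, 2)),
            key=lambda t: t[0],
        )
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
        history.append((d, c1, c2, len(clusters)))
    return history


history = agglomerate(list(data))
for d, c1, c2, k in history:
    print(f"distance {d:>2}: merge {c1} + {c2} -> {k} cluster(s)")
```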

Page 11: HTM 602 Cluster Analysis (Sept. 10, Suh-hee Choi)

4. Derivation of Clusters

* Dendrogram

[Figure: dendrogram with leaves ordered B, C, A, D, E; axis: distance at combination, marked at 1, 9, and 10]

5. Interpretation

6. Validation