Guide Xavier

Author
sachinkumar 
BY M J XAVIER
PREFACE
This practical guide on data analysis has been prepared specifically for business students majoring in marketing who have an aversion to numbers and statistical methods. The simple step-by-step approach used in the guide should enable students to gain insight into statistical tools and help them develop their skills in interpreting and making meaning out of numbers. The entire range of statistical tools is explained using a single data set from a questionnaire on the toothpaste market. The tools covered range from simple frequencies, mean, median etc. to multivariate techniques like factor, cluster and discriminant analysis. The questionnaire, the code sheet and the final report are all given in the appendix.
The first chapter, on simple analytical methods, starts with SPSS data preparation and goes on to explain the use of descriptive statistics to prepare summary results for each question in the survey data. It also highlights the use of charts for displaying data. The second chapter goes into the use of brand rating data for making snake charts, and brand positioning using factor analysis. The third chapter introduces the concept of the correlation coefficient and its use for getting derived importance weights used for construction of the Kano diagram. Chapter 4 uses the importance scores for benefits to do benefit segmentation using cluster analysis. Chapter 5 introduces correspondence analysis and its use for mapping brand-personality association data. The use of regression analysis in marketing research is explained in Chapter 6, together with the problem of multicollinearity and how to tackle it using factor analysis. Chapter 7 describes the use of discriminant analysis to find out brand drivers for different brands. Chapter 8 explains the use of multidimensional scaling for brand positioning. The data files referred to in the text are all available on the attached disk.
I am grateful to the graduate and undergraduate students who enrolled for the marketing research course during the Fall 2003 term for their cooperation in developing the questionnaire and collecting the data. I am grateful to Dr. R Krishnan, Director Graduate Program, and Dr. Norm Borin, Marketing Area Chair, for their support and encouragement for this project.
M J Xavier
December 2003
CONTENTS
Chapter 1. Introduction and Simple Analytical Methods
Chapter 2. Snake Chart, Factor Analysis and Brand Positioning
Chapter 3. Kano Model
Chapter 4. Cluster Analysis and Benefit Segmentation
Chapter 5. Correspondence Analysis
Chapter 6. Regression Analysis
Chapter 7. Discriminant Analysis
Chapter 8. Multidimensional Scaling
APPENDIX
Toothpaste Questionnaire
Codesheet
Power Point Slides
Chapter 1 INTRODUCTION AND SIMPLE ANALYTICAL METHODS
SESSION OBJECTIVES:
1. To understand Data View and Variable View in an SPSS data file.
2. To understand the difference between String and Numeric variables.
3. To get familiar with Labels and Value Labels.
4. To learn how to get frequencies of variables and get Pie or Bar charts of those frequencies.
5. To learn how to create a transformed variable and understand the difference between a raw variable and a transformed variable.
6. To learn how to calculate the mean, standard deviation and variance of a variable using SPSS.
7. To understand how to make a cross tabulation of two nominal variables and use the chi-square test to see whether the relationship between the two variables is significant or not.
8. To learn to use the Compare Means command and learn about the independent t-test.
DATA VIEW VS VARIABLE VIEW: Open the file 'descriptives.sav'. At the bottom left-hand corner you will see two buttons: DATA VIEW and VARIABLE VIEW. Click on VARIABLE VIEW. You will see the complete definition of each variable.
[Screenshots: Data View and Variable View]
STRING VS NUMERIC VARIABLES: Study the variable definitions. Note that 'Nickname' is a STRING VARIABLE; all others are NUMERIC VARIABLES. Go to Data View and see that string variables are made up of letters, or letters and numbers (alphanumeric), while numeric variables are made up of numbers only.
LABELS AND VALUE LABELS: Go to Variable View again and study the columns LABEL and VALUES. Click at the right-hand corner of the VALUES cell corresponding to the variable 'class'. A small window will open.
[Screenshot: Value Labels window for the variable 'class']
These are the codes used for the variable 'class'. Now shift to Data View and then click:
VIEW > VALUE LABELS
You will notice that the labels corresponding to the numerical codes appear on the data sheet.
FREQUENCIES AND CHARTS: Let us first try to understand the profile of the respondents, starting with the age profile. Run the following SPSS commands to get the distribution of respondents by age.
ANALYZE > DESCRIPTIVE STATISTICS > FREQUENCIES
Drag variable 'Age [q12a]' to the VARIABLE(S) box
CHARTS > PIE CHARTS > PERCENTAGES > CONTINUE
OK
Check that you get the following table and pie chart in a new window.

Age
                         Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Under 18 years      58         82.9        82.9              82.9
        18-24 years         12         17.1        17.1             100.0
        Total               70        100.0       100.0

[Pie chart: Age. Under 18 years 82.9%, 18-24 years 17.1%]
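For readers who want to see what the FREQUENCIES command is doing, the following is a minimal Python sketch, not SPSS code. It rebuilds the age variable from the two counts shown in the table (58 and 12) and derives the Percent and Cumulative Percent columns. The function name `frequency_table` is a hypothetical helper introduced only for illustration.

```python
from collections import Counter

def frequency_table(values, order):
    """Return (category, frequency, percent, cumulative percent) rows."""
    counts = Counter(values)
    total = sum(counts[category] for category in order)
    rows, cumulative = [], 0.0
    for category in order:
        percent = 100.0 * counts[category] / total
        cumulative += percent
        rows.append((category, counts[category],
                     round(percent, 1), round(cumulative, 1)))
    return rows

# Age responses rebuilt from the counts in the SPSS table (58 and 12 respondents).
ages = ["Under 18 years"] * 58 + ["18-24 years"] * 12
table = frequency_table(ages, ["Under 18 years", "18-24 years"])
for row in table:
    print(row)
```

The percentages are simply each count divided by the 70 valid responses, and the cumulative column is a running total of those percentages.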
Now repeat the analysis with the other demographic variables, namely Household Income, Gender and Race. You can drag all three variables to the VARIABLE(S) box and have the charts made simultaneously. Next, run the frequencies for awareness of brands and for trial of brands. Then go to the variable Current Brand, change the chart from PIE to BAR, and see if you get the following chart.
[Bar chart: Current Brand, in percent of respondents (0 to 40), for AquaFresh, Colgate, Crest, Arm & Hammer, Mentadent and Others]
RAW VARIABLE VS TRANSFORMED VARIABLES: Suppose we want to know how many brands a person is aware of on average. We cannot get this directly from the data; we need to create a new variable from the existing ones. Try the following commands to create a new variable called 'aware', which is derived from other variables.
TRANSFORM > COMPUTE
Type 'aware' in the TARGET VARIABLE box
Move variable 'q01a' into the NUMERIC EXPRESSION box
Click on +
Repeat the last two steps for 'q01b', 'q01c', 'q01d', 'q01e' and 'q01f'
Move variable 'q01g' into the NUMERIC EXPRESSION box
OK
Note that we are forming the numeric expression
aware = q01a + q01b + q01c + q01d + q01e + q01f + q01g
Notice that SPSS has created a new column called 'aware'. While the original variables are called raw variables, the new one formed out of raw variables is called a transformed variable. Go to Variable View and type 'No. Of Brands Aware' in the LABEL column corresponding to the variable 'aware'. Now perform a frequency analysis on the new variable 'aware' and get a bar chart as shown below.

No. Of Brands Aware
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   2.00        1          1.4         1.4               1.4
        3.00        4          5.7         5.7               7.1
        4.00       30         42.9        42.9              50.0
        5.00       28         40.0        40.0              90.0
        6.00        6          8.6         8.6              98.6
        7.00        1          1.4         1.4             100.0
        Total      70        100.0       100.0
[Bar chart: No. Of Brands Aware, in percent of respondents (0 to 50), for the values 2.00 to 7.00]
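The COMPUTE step described above is a row-wise sum. The sketch below shows the same idea in Python under two assumptions that are not stated in the guide: that each awareness item is coded 1 (aware) or 0 (not aware), and that the three respondents shown are made-up illustrative rows rather than records from 'descriptives.sav'.

```python
# Names of the seven raw awareness variables used in the numeric expression.
AWARENESS_ITEMS = ["q01a", "q01b", "q01c", "q01d", "q01e", "q01f", "q01g"]

# Hypothetical respondents (assumed coding: 1 = aware, 0 = not aware).
respondents = [
    {"q01a": 1, "q01b": 1, "q01c": 1, "q01d": 0, "q01e": 1, "q01f": 0, "q01g": 0},
    {"q01a": 1, "q01b": 1, "q01c": 1, "q01d": 1, "q01e": 1, "q01f": 0, "q01g": 0},
    {"q01a": 1, "q01b": 1, "q01c": 0, "q01d": 0, "q01e": 0, "q01f": 0, "q01g": 0},
]

def add_aware(rows):
    """Add the transformed variable 'aware' = q01a + ... + q01g to every row."""
    for row in rows:
        row["aware"] = sum(row[item] for item in AWARENESS_ITEMS)
    return rows

add_aware(respondents)
print([row["aware"] for row in respondents])
```

The raw variables are left untouched; the transformed variable is just one extra column computed from them, which is what SPSS does when it appends 'aware' to the data sheet.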
In the same way create a new variable called 'trial' (number of brands tried by each person) using the following expression.
trial = q02a + q02b + q02c + q02d + q02e + q02f + q02g
Then find the frequency distribution of the number of brands tried.
MEAN, STANDARD DEVIATION AND VARIANCE: Note that the two new transformed variables, namely 'aware' and 'trial', are different from the other variables we have seen earlier. These are ratio scaled variables, whereas the other variables are only nominally scaled. We shall see how to calculate the mean, standard deviation and variance for ratio scaled data. Try the following SPSS commands.
ANALYZE > DESCRIPTIVE STATISTICS > DESCRIPTIVES
Drag the variable 'aware' to the VARIABLES box
OPTIONS > VARIANCE > CONTINUE
OK
You will get the following output.

Descriptive Statistics
                       N    Minimum   Maximum   Mean     Std. Deviation   Variance
No. Of Brands Aware    70   2.00      7.00      4.5286   .84650           .717
Valid N (listwise)     70
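As a cross-check, the figures in this output can be recomputed from the frequency table of 'No. Of Brands Aware' given earlier. The sketch below uses Python's standard `statistics` module; it is an illustration of the arithmetic, not the SPSS procedure itself. Note that the variance uses the n - 1 divisor.

```python
import statistics

# Frequency distribution of 'aware' taken from the SPSS frequency table.
frequencies = {2: 1, 3: 4, 4: 30, 5: 28, 6: 6, 7: 1}

# Expand the table back into 70 individual observations.
aware = [value for value, count in frequencies.items() for _ in range(count)]

mean = statistics.mean(aware)
variance = statistics.variance(aware)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(aware)       # square root of the sample variance

print(len(aware), round(mean, 4), round(std_dev, 4), round(variance, 3))
```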
This shows that on average a respondent is aware of 4.53 brands, and the standard deviation of the same variable is 0.85.
CROSSTABS AND CHI-SQUARE TEST: Suppose we want to know whether the use of a particular brand depends on whether the person is male or female. For this we need the type of analysis called cross-tabulation. Crosstabs is used to explore the relationship between two nominal or categorical variables. Try the following SPSS commands.
ANALYZE > DESCRIPTIVE STATISTICS > CROSSTABS
Drag Gender to ROW(S)
Drag Current Brand to COLUMN(S)
CELLS > ROW > CONTINUE
STATISTICS > CHI-SQUARE
OK
Gender * Current Brand Crosstabulation
                                                    Current Brand
                             AquaFresh   Colgate   Crest    Mentadent   Arm & Hammer   Others   Total
Male     Count                   7          9         8         3            3            5       35
         % within Gender       20.0%      25.7%     22.9%      8.6%         8.6%        14.3%   100.0%
Female   Count                   3         11        17         3            0            1       35
         % within Gender        8.6%      31.4%     48.6%      8.6%          .0%         2.9%   100.0%
Total    Count                  10         20        25         6            3            6       70
         % within Gender       14.3%      28.6%     35.7%      8.6%         4.3%         8.6%   100.0%
It is a convention to keep the independent variable in the rows and the dependent variable in the columns, and to get row percentages in the cells. In this case, we are trying to explore whether gender has an impact on brand choice. To interpret the table, always look column-wise and see if the percentages vary drastically. In the AquaFresh column there is a larger percentage of males. Colgate has a marginally larger percentage of females. Crest has a substantially larger percentage of females compared to males. Mentadent has an equal following among males and females. Arm & Hammer appears to be an exclusively male brand. There appears to be some relationship between gender and brand used. To check whether the relationship is significant or not, we need to look at the chi-square value.

Chi-Square Tests
                                 Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square               10.707(a)   5    .058
Likelihood Ratio                 12.230      5    .032
Linear-by-Linear Association     1.452       1    .228
N of Valid Cases                 70
a. 6 cells (50.0%) have expected count less than 5. The minimum expected count is 1.50.
The chi-square value of 10.707 at 5 degrees of freedom is significant at the 0.058 level, i.e. at a confidence level of 94.2%. Normally we look for a confidence level of 95% or more. As it is close to 95%, and also given the fact that some of the cells have expected counts less than 5, we can take this as a significant relationship. Note that the expected cell frequencies should be 5 or more for the chi-square test. However, SPSS applies a correction factor to take care of this deficiency, which makes it all the more difficult to attain significance.
The same rule of a significance level of 0.05 or less applies to all the tests that we are going to learn, be it the t-test, the F-test or any other test.
Degrees of freedom = (no. of rows - 1) x (no. of columns - 1)
In the same way construct crosstabs for age, income and race against Current Brand, and check whether the relationships are significant using the chi-square values.
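The Pearson chi-square value and the degrees of freedom can be verified by hand from the counts in the crosstabulation. The following Python sketch computes the expected count of each cell as (row total x column total) / N and sums (observed - expected)^2 / expected. The function name `pearson_chi_square` is a hypothetical helper; the sketch does not compute the significance level, which requires the chi-square distribution.

```python
# Observed counts from the Gender * Current Brand crosstabulation.
# Column order: AquaFresh, Colgate, Crest, Mentadent, Arm & Hammer, Others.
observed = {
    "Male":   [7, 9, 8, 3, 3, 5],
    "Female": [3, 11, 17, 3, 0, 1],
}

def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom, minimum expected count)."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    statistic = 0.0
    min_expected = float("inf")
    for r, row in enumerate(rows):
        for c, count in enumerate(row):
            expected = row_totals[r] * col_totals[c] / n
            statistic += (count - expected) ** 2 / expected
            min_expected = min(min_expected, expected)
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, df, min_expected

chi2, df, min_expected = pearson_chi_square(observed)
print(round(chi2, 3), df, min_expected)
```

With 2 rows and 6 columns the degrees of freedom are (2 - 1) x (6 - 1) = 5, and the smallest expected count (Arm & Hammer, 3 users split over two equal groups) is 1.5, in line with footnote (a) of the SPSS output.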
COMPARE MEANS: Suppose we want to know whether the mean number of brands aware differs between males and females. We could use the following commands.
ANALYZE > COMPARE MEANS > MEANS
Drag variable 'aware' to DEPENDENT LIST
Drag 'gender' to INDEPENDENT LIST
OK
The following output will be obtained.

Report: No. Of Brands Aware
Gender    Mean     N    Std. Deviation
Male      4.6286   35   1.00252
Female    4.4286   35   .65465
Total     4.5286   70   .84650

4.6 and 4.4 are very close values. The difference between males and females appears to be very marginal. Now do the analysis with the other classification variables: race, income, age and class.
INDEPENDENT t-TEST: Suppose we want to know whether the difference of 0.2 in the number of brands aware between the male and female populations is statistically significant. For this we need to conduct a t-test.
Run the following SPSS commands.
ANALYZE > COMPARE MEANS > INDEPENDENT SAMPLES t-TEST
Drag variable 'aware' to TEST VARIABLE(S)
Drag variable 'q12c' to GROUPING VARIABLE
DEFINE GROUPS > GROUP 1 (type 1) > GROUP 2 (type 2) > CONTINUE
OK

Independent Samples Test: No. Of Brands Aware
                                               Equal variances   Equal variances
                                               assumed           not assumed
Levene's Test for Equality of Variances
  F                                            3.608
  Sig.                                         .062
t-test for Equality of Means
  t                                            .988              .988
  df                                           68                58.535
  Sig. (2-tailed)                              .327              .327
  Mean Difference                              .2000             .2000
  Std. Error Difference                        .20239            .20239
  95% Confidence Interval of the Difference
    Lower                                      -.20386           -.20504
    Upper                                      .60386            .60504
Our null hypothesis is that the means for males and females are the same. The alternate hypothesis is that the means are significantly different. As we do not know which one should be greater, we use a 2-tailed significance test. Notice that the t-value of 0.988 at 68 degrees of freedom has a significance of only 0.327. The corresponding confidence level is 67.3%, which is too low. Unless the significance level is less than 0.05, the mean values are not significantly different. Note that the degrees of freedom in this case are the number of observations minus two. Conduct the t-test for the other variables (age, race and income) to see whether the mean number of brands aware varies by any of these categories.
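The t-value, standard error and degrees of freedom in this output follow directly from the group means, standard deviations and sample sizes in the Compare Means report. The Python sketch below reproduces them from those summary figures. The function names are hypothetical, and the Welch-Satterthwaite formula is used on the assumption that it is what lies behind the "equal variances not assumed" degrees of freedom, which the result supports.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t-test assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df, std_error

def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite degrees of freedom (equal variances not assumed)."""
    a, b = sd1 ** 2 / n1, sd2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Summary statistics from the Compare Means report (males, then females).
t_value, df, std_error = pooled_t(4.6286, 1.00252, 35, 4.4286, 0.65465, 35)
df_unequal = welch_df(1.00252, 35, 0.65465, 35)
print(round(t_value, 3), df, round(std_error, 5), round(df_unequal, 3))
```

The pooled degrees of freedom are 35 + 35 - 2 = 68, which is the "number of observations minus two" rule stated in the text.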
Chapter 2 Snake Chart, Factor Analysis and Brand Positioning
Objectives:
To understand how to compute mean ratings of brands and to construct snake charts
To learn how to run factor analysis and understand the following concepts
- variance explained
- factor loading
- eigenvalue
- communality
- rotation
- factor score
To use factor analysis for brand positioning
Data Structure: Open the file 'factor.sav' and study the file structure. This new file has been created out of the master data file by rearranging the variables q06a01 to q06d11 as indicated below.

Original Data (one row per respondent, four blocks of ratings side by side)
Respondent   q06a01 ... q06a11   q06b01 ... q06b11   q06c01 ... q06c11   q06d01 ... q06d11
1.
2.
. . .
70

Rearranged Data for further Analysis (the four blocks stacked one below the other)
Brand Code   Ratings
             q06a01 ... q06a11 data
             q06b01 ... q06b11 data
             q06c01 ... q06c11 data
             q06d01 ... q06d11 data

Note that blank rows have been deleted in the new file and a new variable, brand code, has been created.
Snake Chart: We need to calculate the mean ratings of brands before we can construct the snake chart. Use the following commands to obtain the mean ratings.
ANALYZE > COMPARE MEANS > MEANS
Highlight and drag 'q06_01' to 'q06_10' to DEPENDENT LIST
Highlight 'brand' and drag to INDEPENDENT LIST
OPTIONS
Uncheck NUMBER OF CASES
Uncheck STANDARD DEVIATION
CONTINUE
OK
Attribute              AquaFresh   Colgate   Crest   Mentadent   Arm & Hammer   Others
Fighting Cavities         7.2        8.1      8.4       9.3          9.0          8.6
Whitening Teeth           6.1        6.9      7.6       8.1          8.3          6.2
Cleaning Stains           6.7        7.5      7.9       8.9          8.3          6.8
Good Taste                6.8        6.9      8.2       9.3          6.0          9.0
Likeable Flavor           6.9        7.0      8.0       9.6          5.0          8.8
Freshening Breath         7.4        7.7      8.3       9.6          8.7          9.4
Brand Image               6.6        7.9      8.7       7.9          7.3          5.8
Color                     7.0        7.0      7.8       8.7          7.7          7.2
Attractive Packaging      7.1        7.0      7.8       8.3          7.3          6.0
Innovative features       6.4        6.8      7.4       7.4          8.0          8.4
Mean Ratings for Brands
Right-click on the table, copy it and paste it into an Excel worksheet. Highlight the relevant portions and click Chart Wizard. Choose Line, then press Next and Finish to get the following snake chart.
[Snake chart: mean rating (0.0 to 12.0) on each of the ten attributes, with one line per brand: AquaFresh, Colgate, Crest, Mentadent, Arm & Hammer, Others]
This chart can be used to study the relative positioning of brands on different attributes. We can see that Mentadent and Arm & Hammer are rated highly on selected attributes, while Crest scores consistently high ratings on all the attributes. AquaFresh has lower ratings and Colgate is stuck in the middle. Here the points are very cluttered and it is difficult to see finer distinctions. Factor analysis will help us do sharper positioning.
Factor Analysis: Factor analysis is used to understand the underlying dimensions of a set of variables having high correlation among them. Execute the following commands to get the factor analysis output.
ANALYZE > DATA REDUCTION > FACTOR
Highlight and drag 'q06_01' to 'q06_10' to VARIABLES
DESCRIPTIVES > Check COEFFICIENTS > CONTINUE
EXTRACTION > SCREE PLOT > CONTINUE
ROTATION > VARIMAX > CONTINUE
SCORES > SAVE AS VARIABLES > CONTINUE
OPTIONS > SORTED BY SIZE > CONTINUE
OK
Take a look at the Correlation Matrix and notice that the variables are correlated among themselves. For example, the correlation between Good Taste and Likeable Flavor is as high as 0.928. That the correlations are sufficient for conducting a factor analysis is confirmed by Bartlett's Test of Sphericity, which is significant.
Correlation Matrix
                                (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)     (10)
(1)  Fighting Cavities         1.000    .385    .610    .221    .203    .450    .367    .142    .210    .228
(2)  Whitening Teeth            .385   1.000    .613    .278    .223    .403    .354    .198    .249    .501
(3)  Cleaning Stains/Tartar     .610    .613   1.000    .218    .161    .470    .461    .188    .314    .488
(4)  Good Taste                 .221    .278    .218   1.000    .928    .475    .313    .436    .303    .366
(5)  Likeable Flavor            .203    .223    .161    .928   1.000    .481    .276    .436    .300    .318
(6)  Freshening Breath          .450    .403    .470    .475    .481   1.000    .384    .349    .358    .390
(7)  Brand Image                .367    .354    .461    .313    .276    .384   1.000    .576    .624    .373
(8)  Color                      .142    .198    .188    .436    .436    .349    .576   1.000    .650    .419
(9)  Attractive Packaging       .210    .249    .314    .303    .300    .358    .624    .650   1.000    .413
(10) Innovative features        .228    .501    .488    .366    .318    .390    .373    .419    .413   1.000
Now take a look at the variance explained matrix.

Total Variance Explained
             Initial Eigenvalues                        Rotation Sums of Squared Loadings
Component    Total    % of Variance   Cumulative %      Total    % of Variance   Cumulative %
1            4.438    44.378           44.378           2.651    26.512          26.512
2            1.577    15.773           60.152           2.392    23.915          50.428
3            1.239    12.391           72.543           2.211    22.115          72.543
4             .804     8.043           80.585
5             .500     5.004           85.590
6             .440     4.399           89.989
7             .343     3.432           93.421
8             .329     3.291           96.711
9             .261     2.613           99.324
10            .068      .676          100.000

Read Component as Factor in the table. Technically the 10 original variables can be converted into 10 new factors which are orthogonal to each other (i.e. have zero correlation among them). The first such factor accounts for 44.378 percent of the variance in the original data, the second one accounts for 15.773 percent, and so on. In statistics, variance is information. As 72.543 percent of the information (variance) is summarized by the first three factors, it is enough to work with three factors. We shall see later what eigenvalue and rotation mean.
Now go to the Data View in the SPSS data file. You will notice that three new variables, namely fact1_1, fact2_1 and fact3_1, have been added by the system. The values that these variables take are called factor scores. Basically, the original 10 intercorrelated variables have been converted to 3 new factors which are orthogonal to each other. To check the orthogonality, do the following analysis.
ANALYZE > CORRELATE > BIVARIATE
Highlight and drag 'fact1_1', 'fact2_1' and 'fact3_1' to VARIABLES
OK
You will get the following output, which shows that the factors have zero correlation between them.

Correlations
                                                            REGR factor   REGR factor   REGR factor
                                                            score 1       score 2       score 3
REGR factor score 1 for analysis 1   Pearson Correlation    1             .000          .000
                                     Sig. (2-tailed)        .             1.000         1.000
                                     N                      199           199           199
REGR factor score 2 for analysis 1   Pearson Correlation    .000          1             .000
                                     Sig. (2-tailed)        1.000         .             1.000
                                     N                      199           199           199
REGR factor score 3 for analysis 1   Pearson Correlation    .000          .000          1
                                     Sig. (2-tailed)        1.000         1.000         .
                                     N                      199           199           199

Now the problem is to find out what these factors mean. The three new factors summarize the information present in the original ten variables. We need to establish which variables go into which factor. Look at the rotated component matrix.
Rotated Component Matrix(a)
                                     Component
                                     1       2       3
Cleaning Stains/Tartar              .878    .203    .014
Fighting Cavities                   .763    .051    .103
Whitening Teeth                     .758    .150    .126
Freshening Breath                   .550    .220    .484
Innovative features/Ingredients     .482    .437    .234
Attractive Packaging                .154    .867    .115
Color                               .013    .830    .323
Brand Image                         .364    .757    .077
Likeable Flavor                     .098    .183    .950
Good Taste                          .152    .195    .933

The cells contain factor loadings, i.e. the correlation coefficients of the original variables with the new factors. To confirm this, conduct a correlation analysis of the first variable, Cleaning Stains/Tartar, with the three new factors (fact1_1, fact2_1 and fact3_1); you should get the first row of the rotated component matrix. The variable 'Cleaning Stains/Tartar' has a correlation coefficient of 0.878 with the first factor, 0.203 with the second factor and 0.014 with the third factor. This means that the variable 'Cleaning Stains/Tartar' belongs to the first factor. In the same way, the variables highlighted in the column corresponding to factor 1 belong to that factor. For the moment ignore the variable 'Innovative features/Ingredients', as it is highlighted in two columns. Looking at the variables that go into each factor, we can name the factors Dental Hygiene, Visibility and Sensory Benefits.

Factor 1 (Dental Hygiene): Cleaning Stains/Tartar, Fighting Cavities, Whitening Teeth, Freshening Breath
Factor 2 (Visibility): Attractive Packaging, Color, Brand Image
Factor 3 (Sensory Benefits): Likeable Flavor, Good Taste

The variable 'Innovative features/Ingredients' has a high correlation with Dental Hygiene as well as Visibility. Suppose a brand claims in its advertisements that it has a new ingredient that whitens the teeth: this contributes to Dental Hygiene as well as to the visibility of the brand. That is how the variable features in two factors.
Now take a look at the communalities matrix.

Communalities
                                     Initial   Extraction
Fighting Cavities                    1.000     .596
Whitening Teeth                      1.000     .613
Cleaning Stains/Tartar               1.000     .812
Good Taste                           1.000     .931
Likeable Flavor                      1.000     .945
Freshening Breath                    1.000     .585
Brand Image                          1.000     .712
Color                                1.000     .794
Attractive Packaging                 1.000     .789
Innovative features/Ingredients      1.000     .478
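The Extraction column can be cross-checked against the rotated component matrix. With an orthogonal rotation such as Varimax, the communality of a variable equals the sum of its squared loadings on the retained factors, and each "Rotation Sums of Squared Loadings" total equals the sum of squared loadings down a column. The Python sketch below illustrates this with the printed loadings; small differences in the third decimal are expected because the loadings are rounded, and the variable names `communalities` and `rotation_sums` are illustrative only.

```python
# Rotated loadings (factor 1, factor 2, factor 3) copied from the component matrix.
loadings = {
    "Cleaning Stains/Tartar":          (0.878, 0.203, 0.014),
    "Fighting Cavities":               (0.763, 0.051, 0.103),
    "Whitening Teeth":                 (0.758, 0.150, 0.126),
    "Freshening Breath":               (0.550, 0.220, 0.484),
    "Innovative features/Ingredients": (0.482, 0.437, 0.234),
    "Attractive Packaging":            (0.154, 0.867, 0.115),
    "Color":                           (0.013, 0.830, 0.323),
    "Brand Image":                     (0.364, 0.757, 0.077),
    "Likeable Flavor":                 (0.098, 0.183, 0.950),
    "Good Taste":                      (0.152, 0.195, 0.933),
}

# Communality = sum of squared loadings across the three factors (row-wise).
communalities = {name: sum(value ** 2 for value in row)
                 for name, row in loadings.items()}

# Rotation sums of squared loadings = sum of squared loadings per factor (column-wise).
rotation_sums = [sum(row[factor] ** 2 for row in loadings.values())
                 for factor in range(3)]

for name, value in communalities.items():
    print(f"{name:34s}{value:.3f}")
print([round(total, 3) for total in rotation_sums])
```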
Communalities refer to the amount of information that has been extracted from each variable. Notice that more than 90 percent of the information (variance) has been extracted from the variables Good Taste and Likeable Flavor, whereas less than 50 percent is extracted from Innovative features/Ingredients. If we work with a large number of variables, say more than 20, it may be a good idea to leave out variables with low communality while naming factors. In the same way, the eigenvalues are directly proportional to the amount of variance explained by each factor. The sum of all eigenvalues always equals the total number of variables. Hence the proportion of variance explained by each factor can be calculated by dividing the corresponding eigenvalue by the total number of variables. Take a look at the variance explained table to verify this. As the first eigenvalue is 4.438, the variance explained by the first factor can be calculated by dividing 4.438 by 10 (the total number of variables) and multiplying by 100. Now take a look at the Scree Plot.
[Scree Plot: Eigenvalue (0 to 5) against Component Number (1 to 10)]
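The relationship between eigenvalues and variance explained can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the eigenvalues as printed in the Total Variance Explained table; because those are rounded to three decimals, the results agree with the SPSS percentages to about two decimals.

```python
# Initial eigenvalues from the Total Variance Explained table.
eigenvalues = [4.438, 1.577, 1.239, 0.804, 0.500, 0.440, 0.343, 0.329, 0.261, 0.068]
n_variables = len(eigenvalues)          # 10 original variables

# Percent of variance explained by each factor = eigenvalue / number of variables * 100.
percent_variance = [100 * value / n_variables for value in eigenvalues]

# Running total gives the Cumulative % column.
cumulative = []
running_total = 0.0
for percent in percent_variance:
    running_total += percent
    cumulative.append(running_total)

print(round(sum(eigenvalues), 2))       # close to the number of variables
print(round(percent_variance[0], 2))    # first factor
print(round(cumulative[2], 2))          # first three factors together
```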
The scree plot gives an idea as to how many factors to extract. The rule normally applied is to stop where the arm bends. In this case that is at three factors. After three factors the curve gets flat, indicating that the gain will be marginal if we go beyond three factors. The default in SPSS is to stop when the eigenvalue falls below one. To understand the concept of rotation, take a look at the unrotated component matrix. If we plot factor 2 and factor 3 we get the graph shown below.
[Plot: unrotated loadings of the ten attributes on FACTOR2 (x-axis) and FACTOR3 (y-axis); Likeable Flavor and Good Taste form one cluster lying between the two axes]
It is difficult to interpret this type of data, as we find that the cluster of variables Likeable Flavor and Good Taste is midway between factor 2 and factor 3. If we rotate the y-axis so that it passes through this cluster, we can interpret the y-axis as 'Sensory Benefits'. In the same way the x-axis can be rotated to pass through the cluster that corresponds to Dental Hygiene. Rotation is done to make it easy to interpret the output. Note that the angle between the x-axis and the y-axis in our rotation is more than 90 degrees. If the angle is maintained at 90 degrees it is called an orthogonal rotation; otherwise it is known as an oblique rotation. Note that we used Varimax rotation, which is an orthogonal rotation method. What we have achieved by conducting a factor analysis is that we have converted the original ten variables into 3 new factors. Now we can use these three new variables to do brand positioning. We can bring both the variables and the brands on to the same map.
Brand Positioning Using Factor Scores: We shall now find the mean ratings of brands on the three new factors. First of all, go to Variable View and label the new factors as:
1. Dental Hygiene
2. Visibility
3. Sensory Benefits
Then compute the mean ratings by executing the following commands.
ANALYZE > COMPARE MEANS > MEANS
Highlight and drag 'Dental Hygiene', 'Visibility' and 'Sensory Benefits' to DEPENDENT LIST
Drag 'brand' to INDEPENDENT LIST
OPTIONS
Highlight 'Number of Cases' and 'Standard Deviation' and send them back to STATISTICS
CONTINUE
OK
You will get the following output.
Brand           Dental Hygiene   Visibility   Sensory Benefits
AquaFresh       .5622039         .1069474     .1044542
Colgate         .0723164         .0603488     .2591244
Crest           .3170127         .2179352     .2061569
Mentadent       .7780827         .0560046     .9165143
Arm & Hammer    .9514446         .0165516     .9228734
Others          .0827401         .8240165     1.1187287
Total           .0000000         .0000000     .0000000
We already have the coordinates of the variables in the rotated component matrix. Create a combined table, which has the coordinates of both the brands and the attributes, as below.

Brand/Attribute                    Dental Hygiene   Visibility   Sensory Benefits
AquaFresh                          0.56             0.11         0.10
Colgate                            0.07             0.06         0.26
Crest                              0.32             0.22         0.21
Mentadent                          0.78             0.06         0.92
Arm & Hammer                       0.95             0.02         0.92
Others                             0.08             0.82         1.12
Cleaning Stains/Tartar             0.88             0.20         0.01
Fighting Cavities                  0.76             0.05         0.10
Whitening Teeth                    0.76             0.15         0.13
Freshening Breath                  0.55             0.22         0.48
Innovative features/Ingredient     0.48             0.44         0.23
Attractive Packaging               0.15             0.87         0.12
Color                              0.01             0.83         0.32
Brand Image                        0.36             0.76         0.08
Likeable Flavor                    0.10             0.18         0.95
Good Taste                         0.15             0.20         0.93
Using this data, create a new SPSS file 'factor1.sav'. To get the positioning map of the first two factors, use the following commands.
GRAPHS > SCATTER > SIMPLE > DEFINE
Drag 'Dental Hygiene' to X-AXIS
Drag 'Visibility' to Y-AXIS
Drag 'Brand/Attribute' to LABEL CASES BY
OPTIONS > Check DISPLAY CHART WITH CASE LABELS
OK
The resulting plot can be taken to PowerPoint to have it annotated as shown below. Note that the attributes are represented as vectors and the brands as points.
[Positioning map: Visibility (y-axis) against Dental Hygiene (x-axis), with the ten attributes drawn as vectors and the brands AquaFresh, Colgate, Crest, Mentadent, Arm & Hammer and Others plotted as points]
While Arm & Hammer and Mentadent are seen as better in Dental Hygiene, Crest scores better on Visibility. In the same way get the other two plots, namely Dental Hygiene vs Sensory Benefits and Visibility vs Sensory Benefits.
Chapter 3 KANO Model
Objectives:
To understand the basics of the Kano Model
To learn how to calculate derived importance weights for attributes
To learn how to plot the Kano Model and interpret it
Data Files: We will be using two different data files for this analysis.
1. cluster.sav
2. factor.sav
Stated Importance: This is the mean importance rating given to the attributes by the respondents. To calculate the means, open the file 'cluster.sav' and run the following commands.
ANALYZE > DESCRIPTIVE STATISTICS > DESCRIPTIVES
Highlight and drag variables 'q05a' to 'q05j' to VARIABLES
OPTIONS > Check DESCENDING MEANS > CONTINUE
OK
Descriptive Statistics
                                                N    Minimum   Maximum   Mean     Std. Deviation
Freshening Breath                               70   4.00      7.00      6.3000    .76802
Fighting Cavities                               70   1.00      7.00      6.1143   1.44004
Cleaning Stains/Tartar                          70   1.00      7.00      5.6714   1.34834
Whitening Teeth                                 70   1.00      7.00      5.6429   1.41458
Good Taste                                      70   1.00      7.00      5.3429   1.50252
Likeable Flavor                                 70   1.00      7.00      5.1714   1.52250
American Dental Association Recommendation      70   1.00      7.00      4.4857   1.88620
Innovative Feature/new ingredient               70   1.00      7.00      4.0714   1.36543
High Prestige Brand                             70   1.00      7.00      3.4714   1.50093
Attractive Packaging                            70   1.00      7.00      3.4286   1.68171
Valid N (listwise)                              70

You find that Freshening Breath is the most important attribute, with a mean rating of 6.3 on a 1-7 scale. Attractive Packaging is the least important attribute. To convert the means into importance weights, we need to normalize the means. Take the above table to Excel and find the sum of the means. Then calculate:
Importance Weight = (Mean / Sum of Means) * 100
Develop the following Stated Importance weights table.

Attribute                                       Mean    Importance Weight
Freshening Breath                               6.30    12.68
Fighting Cavities                               6.11    12.30
Cleaning Stains/Tartar                          5.67    11.41
Whitening Teeth                                 5.64    11.35
Good Taste                                      5.34    10.75
Likeable Flavor                                 5.17    10.41
American Dental Association Recommendation      4.49     9.03
Innovative Feature/new ingredient               4.07     8.19
High Prestige Brand                             3.47     6.98
Attractive Packaging                            3.43     6.90
Sum                                            49.70   100.00

Stated Importance
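The normalization done in Excel is a one-line calculation. The following Python sketch applies the formula Importance Weight = (Mean / Sum of Means) * 100 to the ten means from the Descriptive Statistics output; the dictionary name `means` is illustrative only.

```python
# Mean importance ratings from the Descriptive Statistics output.
means = {
    "Freshening Breath": 6.3000,
    "Fighting Cavities": 6.1143,
    "Cleaning Stains/Tartar": 5.6714,
    "Whitening Teeth": 5.6429,
    "Good Taste": 5.3429,
    "Likeable Flavor": 5.1714,
    "American Dental Association Recommendation": 4.4857,
    "Innovative Feature/new ingredient": 4.0714,
    "High Prestige Brand": 3.4714,
    "Attractive Packaging": 3.4286,
}

sum_of_means = sum(means.values())

# Stated importance weight = (mean / sum of means) * 100, so the weights add up to 100.
stated_weights = {name: 100 * mean / sum_of_means for name, mean in means.items()}

for name, weight in stated_weights.items():
    print(f"{name:45s}{weight:6.2f}")
```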
Derived Importance: In order to obtain the derived importance we are going to use the file 'factor.sav'. By correlating the ratings of the attributes with the overall rating we get the derived importance of the attributes. Use the following commands.
ANALYZE > CORRELATE > BIVARIATE
Highlight variables 'q06_01' to 'q06_11' and drag to VARIABLES
OK
You will get an 11 by 11 matrix of correlations. We are interested in the last column only, which has the correlation of the individual attributes with the overall rating. As correlations can range from -1 to +1, take the r² value for derived importance. Once again these values can be normalized by dividing each by the sum of all the r² values. The resultant table is given below.

Attribute                            r      r²     Importance Weight
Freshening Breath                    0.65   0.43   13.26
Good Taste                           0.63   0.39   12.22
Likeable Flavor                      0.59   0.35   10.84
Brand Image                          0.59   0.35   10.81
Innovative features/Ingredients      0.57   0.33   10.22
Color                                0.57   0.32    9.96
Cleaning Stains/Tartar               0.54   0.29    9.04
Whitening Teeth                      0.53   0.28    8.77
Fighting Cavities                    0.52   0.27    8.46
Attractive Packaging                 0.45   0.21    6.41
Sum                                         3.21  100.00

Derived Importance Weights
Bring the stated and derived importance into a common table as shown below. Now plot Stated Importance against Derived Importance to develop the Kano Model.
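The derived importance calculation described above (square each correlation, then normalize so the weights sum to 100) can be sketched in Python as follows. The correlations are the two-decimal values printed in the table, so the resulting weights match the published ones only approximately; the published figures were evidently computed from unrounded correlations.

```python
# Correlation of each attribute rating with the overall rating (rounded values from the table).
correlations = {
    "Freshening Breath": 0.65,
    "Good Taste": 0.63,
    "Likeable Flavor": 0.59,
    "Brand Image": 0.59,
    "Innovative features/Ingredients": 0.57,
    "Color": 0.57,
    "Cleaning Stains/Tartar": 0.54,
    "Whitening Teeth": 0.53,
    "Fighting Cavities": 0.52,
    "Attractive Packaging": 0.45,
}

# Squaring removes the sign, so a strong negative correlation would also count as important.
r_squared = {name: r ** 2 for name, r in correlations.items()}
total_r_squared = sum(r_squared.values())

# Derived importance weight = r squared / sum of r squared * 100.
derived_weights = {name: 100 * value / total_r_squared
                   for name, value in r_squared.items()}

for name, weight in derived_weights.items():
    print(f"{name:34s}{weight:6.2f}")
```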
Attribute                            Stated Importance   Derived Importance
Attractive Packaging                  6.90                6.41
Cleaning Stains/Tartar               11.41                9.04
Fighting Cavities                    12.30                8.46
Freshening Breath                    12.68               13.26
Good Taste                           10.75               12.22
High Prestige Brand                   6.98               10.81
Innovative Feature/new ingredient     8.19               10.22
Likeable Flavor                      10.41               10.84
Whitening Teeth                      11.35                8.77

Stated Vs Derived Importance
Use the commands:
GRAPHS > SCATTER
Drag stated importance to X-AXIS
Drag derived importance to Y-AXIS
Drag attribute to LABEL CASES BY
OPTIONS > Check DISPLAY CHART WITH CASE LABELS > CONTINUE
OK
On the graph use 10 as the cut-off between High and Low values of importance, and annotate the chart by taking it to PowerPoint.
[Kano diagram: Derived Importance (y-axis, Low to High) against Stated Importance (x-axis, Low to High), with a cut-off at 10 on both axes. Delight Attributes (low stated, high derived): High Prestige Brand, Innovative Ingredients. Minimum Expected Attributes (high stated, low derived): Cleaning Stains/Tartar, Whitening Teeth, Fighting Cavities. Also plotted: Freshening Breath, Good Taste, Likeable Flavor, Attractive Packaging.]
KANO's Model
According to Kano's model, attributes that have a high stated and a low derived importance are Minimum Expected attributes. Attributes like whitening teeth, fighting cavities and cleaning stains are the minimum expected in a toothpaste. Attributes with a low stated and a high derived importance are called Delight attributes. Marketers should concentrate on these attributes. In this study, innovative ingredients and brand image emerge as the delight attributes. The others are linear attributes. If a linear attribute is important, pay attention to it: the most important attribute here is freshening breath, as both its stated and derived importance are high. If a linear attribute has low importance, one should not over-engineer it: in this case, spending too much on packaging may not produce commensurate returns.
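The classification rule described above can be written out as a small Python sketch. It uses the cut-off of 10 from the text and the stated and derived weights from the common table; the function name `kano_category` and the category labels for the two linear groups are illustrative choices, not terms from the guide.

```python
CUTOFF = 10.0   # cut-off between Low and High importance, as used on the chart

# (stated importance, derived importance) from the Stated Vs Derived Importance table.
importance = {
    "Attractive Packaging":              (6.90, 6.41),
    "Cleaning Stains/Tartar":            (11.41, 9.04),
    "Fighting Cavities":                 (12.30, 8.46),
    "Freshening Breath":                 (12.68, 13.26),
    "Good Taste":                        (10.75, 12.22),
    "High Prestige Brand":               (6.98, 10.81),
    "Innovative Feature/new ingredient": (8.19, 10.22),
    "Likeable Flavor":                   (10.41, 10.84),
    "Whitening Teeth":                   (11.35, 8.77),
}

def kano_category(stated, derived, cutoff=CUTOFF):
    """Classify one attribute by comparing both weights with the cut-off."""
    if stated >= cutoff and derived < cutoff:
        return "minimum expected"
    if stated < cutoff and derived >= cutoff:
        return "delight"
    # Both high or both low: a linear attribute.
    return "linear (high importance)" if stated >= cutoff else "linear (low importance)"

categories = {name: kano_category(stated, derived)
              for name, (stated, derived) in importance.items()}

for name, category in categories.items():
    print(f"{name:36s}{category}")
```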
Chapter 4 Cluster Analysis and Benefit Segmentation
Objectives:
To learn how cluster analysis can be used for grouping of subjects
To understand the difference between hierarchical clustering and k-means clustering
To use SPSS to perform cluster analysis and interpret the results
To learn how to use cluster analysis for benefit segmentation
CLUSTER ANALYSIS: We shall use the cluster.sav file for this session. Let us first calculate the descriptives. Execute the following commands.
ANALYZE > DESCRIPTIVE STATISTICS > DESCRIPTIVES
Drag variables q05a to q05j to VARIABLES
OK
If you sort the output in descending order of standard deviation you will get this output.

Attribute                                      N     Mean    Std. Deviation
American Dental Association Recommendation     70    4.49    1.89
Attractive Packaging                           70    3.43    1.68
Likeable Flavor                                70    5.17    1.52
Good Taste                                     70    5.34    1.50
High Prestige Brand                            70    3.47    1.50
Fighting Cavities                              70    6.11    1.44
Whitening Teeth                                70    5.64    1.41
Innovative Feature/new ingredient              70    4.07    1.37
Cleaning Stains/Tartar                         70    5.67    1.35
Freshening Breath                              70    6.30    0.77
Mean rating of Benefits Sought
From the output, it is clear that the seven variables with the highest standard deviation are:
American Dental Association Recommendation
Attractive Packaging
Likeable Flavor
Good Taste
High Prestige Brand
Fighting Cavities
Whitening Teeth
These are the attributes on which the opinions of the respondents vary the most. Hence for clustering and segmentation we shall use only these seven variables.
Hierarchical Clustering: Let us start with hierarchical clustering. Execute the following commands.
ANALYZE > CLASSIFY > HIERARCHICAL CLUSTER
Highlight and drag the seven variables to VARIABLE(S)
Drag nickname to LABEL CASES BY
PLOTS
Check DENDROGRAM
Check NONE
CONTINUE
If you look at the output you will find a tree structure. If you leave out three cases, 38 (nicknamed 'Warden'), 2 (Bud) and 9 (Hank), there are three major branches. That gives us some idea of how many clusters to ask for when we go to k-means clustering.
K-Means Clustering: In this method the respondents are allocated to clusters based on the number of clusters the researcher asks for. Based on the results of the hierarchical clustering we have decided to ask for three clusters.
ANALYZE > CLASSIFY > K-MEANS CLUSTER
Highlight and drag the seven variables to VARIABLES box
Drag 'nickname' to LABEL CASES BY
NUMBER OF CLUSTERS change from 2 to 3
SAVE
Check CLUSTER MEMBERSHIP
CONTINUE
OPTIONS
Check CLUSTER INFORMATION FOR EACH CASE
Look at FINAL CLUSTER CENTERS.

Attribute                                      Cluster 1    Cluster 2    Cluster 3
Attractive Packaging                           2.63         3.46         3.82
American Dental Association Recommendation     4.63         2.08         5.24
Likeable Flavor                                3.32         5.77         5.89
Good Taste                                     3.53         6.00         6.03
High Prestige Brand                            2.63         3.08         4.03
Whitening Teeth                                5.58         5.85         5.61
Fighting Cavities                              6.68         3.77         6.63
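K-means places each respondent in the cluster whose centre is closest to that respondent's ratings. The sketch below illustrates this assignment step with the final cluster centres from the table; the respondent's ratings are hypothetical, and plain Euclidean distance is assumed as the distance measure.

```python
import math

# Final cluster centres, in the attribute order of the table:
# packaging, ADA recommendation, flavor, taste, prestige brand, whitening, cavities
CENTERS = {
    1: [2.63, 4.63, 3.32, 3.53, 2.63, 5.58, 6.68],
    2: [3.46, 2.08, 5.77, 6.00, 3.08, 5.85, 3.77],
    3: [3.82, 5.24, 5.89, 6.03, 4.03, 5.61, 6.63],
}


def nearest_cluster(ratings):
    """Return the cluster whose centre has the smallest Euclidean distance to the ratings."""
    return min(CENTERS, key=lambda cluster: math.dist(ratings, CENTERS[cluster]))


# Hypothetical respondent who cares mostly about cavities and the ADA recommendation.
respondent = [3, 5, 3, 4, 2, 6, 7]
print(nearest_cluster(respondent))
```

In the full procedure this assignment alternates with recomputing the centres until the memberships stop changing; only the assignment step is shown here.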
Yellow filling indicates rank one across the row and green indicates rank two.
Benefit Segmentation: Cluster 1 members are primarily concerned with fighting cavities and are also interested in the American Dental Association recommendation. So the benefit sought is a 'medically proven cavity fighter'. Cluster 2 is primarily interested in whitening teeth. Its members have also given relatively high ratings for likeable flavor and good taste. Though they rank second on attractive packaging and high prestige brand, the ratings themselves are low in absolute terms. The benefit sought by this group is white teeth plus good taste and flavor. Cluster 3 wants everything; its members look for a balanced paste that provides dental care and sensory benefits (taste and flavor). So the benefit segments that we have derived are as follows:
1. Proven cavity fighter
2. Tasty, flavorful paste for white teeth
3. Balanced paste which provides dental care as well as good taste and flavor
From the table on the number of cases in each cluster we find that 19 are in cluster 1, 13 are in cluster 2 and 38 are in cluster 3. The majority (54%) of the respondents want a balanced paste, 27% want a cavity fighter and 19% want white teeth.
Number of Cases in each Cluster
Cluster 1    19.000
Cluster 2    13.000
Cluster 3    38.000
Valid        70.000
Missing        .000
Now take a look at the cluster membership table. This gives the cluster membership of each individual. The same information is also stored in the SPSS data file as a newly created variable, qcl_1. Insert value labels for the new variable as given below:
1. Cavity Fighter
2. White Teeth
3. Balanced Paste
Cross Classification with Demographic Variables: Cross tabulation of the new variable qcl_1 Vs race will produce the following table.

Crosstab: Cluster Number of Case by Race/Ethnicity
Cluster                                  White     Others    Total
Cavity fighter    Count                  16        3         19
                  % within Cluster       84.2%     15.8%     100.0%
White teeth       Count                  9         4         13
                  % within Cluster       69.2%     30.8%     100.0%
Balanced Paste    Count                  30        8         38
                  % within Cluster       78.9%     21.1%     100.0%
Total             Count                  55        15        70
                  % within Cluster       78.6%     21.4%     100.0%
Whites make up a larger share of the cavity fighter segment (84.2%) than of the white teeth segment (69.2%), while non-whites are relatively more common in the white teeth segment (30.8%). However, the chi-square test does not show a significant relationship between benefit segments and race.
Chi-Square Tests
                                 Value       df    Asymp. Sig. (2-sided)
Pearson Chi-Square               1.036(a)    2     .596
Likelihood Ratio                 1.005       2     .605
Linear-by-Linear Association     .097        1     .755
N of Valid Cases                 70
a 2 cells (33.3%) have expected count less than 5. The minimum expected count is 2.79.
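The Pearson chi-square value in this output can be checked by hand from the crosstab counts. The sketch below computes the expected counts and the statistic in plain Python; the function name is illustrative, and the shortcut used for the p-value holds only for 2 degrees of freedom, where the chi-square tail probability equals exp(-x/2).

```python
import math

# Observed counts: rows are the three benefit segments, columns are White / Others.
observed = [
    [16, 3],   # Cavity fighter
    [9, 4],    # White teeth
    [30, 8],   # Balanced paste
]


def pearson_chi_square(table):
    """Return the Pearson chi-square statistic, its degrees of freedom and the expected counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


chi2, df, expected = pearson_chi_square(observed)
p_value = math.exp(-chi2 / 2)  # exact tail probability only because df == 2
min_expected = min(min(row) for row in expected)
print(round(chi2, 3), df, round(p_value, 3), round(min_expected, 2))
```

The small minimum expected count is the reason for the footnote under the table: with cells this sparse the chi-square approximation should be read with caution.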
The same kind of analysis can be done with age, income and gender. The benefit segments Vs current brand produced the following table.

Crosstab: Cluster Number of Case by Current Brand
Cluster                                  AquaFresh    Colgate    Crest     Others    Total
Cavity fighter    Count                  4            5          6         4         19
                  % within Cluster       21.1%        26.3%      31.6%     21.1%     100.0%
White teeth       Count                  1            7          2         3         13
                  % within Cluster       7.7%         53.8%      15.4%     23.1%     100.0%
Balanced Paste    Count                  5            8          17        8         38
                  % within Cluster       13.2%        21.1%      44.7%     21.1%     100.0%
Total             Count                  10           20         25        15        70
                  % within Cluster       14.3%        28.6%      35.7%     21.4%     100.0%
Aqua Fresh is over-represented among cavity fighters (21.1% against 14.3% overall), Colgate among white teeth seekers (53.8% against 28.6%) and Crest in the balanced paste segment (44.7% against 35.7%). Once again the chi-square is not significant, and half the cells have expected counts below 5, so we need to take these results with a pinch of salt.
Chi-Square Tests
                                 Value       df    Asymp. Sig. (2-sided)
Pearson Chi-Square               7.213(a)    6     .302
Likelihood Ratio                 7.038       6     .317
Linear-by-Linear Association     .674        1     .412
N of Valid Cases                 70
a 6 cells (50.0%) have expected count less than 5. The minimum expected count is 1.86.
Chapter 5 Correspondence Analysis
Objectives:
To understand the basic nature of correspondence analysis
To conduct correspondence analysis using SPSS and interpret the results
What is Correspondence Analysis? Correspondence analysis is typically used to get a graphical representation of contingency tables. Suppose we did a sample study in which we obtained the income level and the brand used by 36 respondents. The following table summarizes the responses.

Income               Brand A    Brand B    Brand C    Brand D    Total
Less than $1000      5          2          1          1          9
$1000 to $3000       2          4          1          2          9
$3001 to $5000       2          2          4          1          9
Above $5000          2          1          1          5          9
Total                11         9          7          9          36
Data Preparation: To get a better insight into the relationship between income and brand used, we could use correspondence analysis. To run correspondence analysis using SPSS we need to have the data organized in the following format.

Income              Income Code    Brand      Brand Code    Frequency
Less Than $1000     1              Brand A    1             5
Less Than $1000     1              Brand B    2             2
Less Than $1000     1              Brand C    3             1
Less Than $1000     1              Brand D    4             1
$1000 to $3000      2              Brand A    1             2
$1000 to $3000      2              Brand B    2             4
$1000 to $3000      2              Brand C    3             1
$1000 to $3000      2              Brand D    4             2
$3001 to $5000      3              Brand A    1             2
$3001 to $5000      3              Brand B    2             2
$3001 to $5000      3              Brand C    3             4
$3001 to $5000      3              Brand D    4             1
Above $5000         4              Brand A    1             2
Above $5000         4              Brand B    2             1
Above $5000         4              Brand C    3             1
Above $5000         4              Brand D    4             5
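Reshaping a contingency table into this one-row-per-cell layout is mechanical. The sketch below does it in Python for the income by brand table; the variable names are illustrative, and the output would still need to be typed or imported into SPSS as three numeric variables.

```python
incomes = ["Less Than $1000", "$1000 to $3000", "$3001 to $5000", "Above $5000"]
brands = ["Brand A", "Brand B", "Brand C", "Brand D"]

# Rows are income categories, columns are brands, as in the summary table.
counts = [
    [5, 2, 1, 1],
    [2, 4, 1, 2],
    [2, 2, 4, 1],
    [2, 1, 1, 5],
]

# One record per cell: (income code, brand code, frequency), with codes starting at 1.
long_rows = [
    (i + 1, j + 1, counts[i][j])
    for i in range(len(incomes))
    for j in range(len(brands))
]

for income_code, brand_code, freq in long_rows:
    print(incomes[income_code - 1], income_code, brands[brand_code - 1], brand_code, freq)
```

The frequency column is what the WEIGHT CASES step uses, so that each record counts as that many respondents.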
SPSS Commands: The same data has been used to create the data file corres1.sav. Use the following SPSS commands to run correspondence analysis.
DATA > WEIGHT CASES
WEIGHT CASES BY
Drag 'freq' to FREQUENCY VARIABLE
OK
ANALYZE > DATA REDUCTION > CORRESPONDENCE ANALYSIS
Drag 'income' to ROW VARIABLE
DEFINE RANGE
MINIMUM VALUE '1'
MAXIMUM VALUE '4'
UPDATE
CONTINUE
Drag 'brand' to COLUMN VARIABLE
DEFINE RANGE
MINIMUM VALUE '1'
MAXIMUM VALUE '4'
UPDATE
CONTINUE
OK
Interpretation of Results: The data we used as input for the analysis is printed in the correspondence table.

Correspondence Table
INCOME               Brand A    Brand B    Brand C    Brand D    Active Margin
Less Than $1000      5          2          1          1          9
$1000 to $3000       2          4          1          2          9
$3001 to $5000       2          2          4          1          9
Above $5000          2          1          1          5          9
Active Margin        11         9          7          9          36
From the Summary table we can infer that the first two dimensions account for 81.4% of the inertia. This is similar to the eigenvalues in factor analysis. It is enough to work with two dimensions.

Summary
                                                              Proportion of Inertia
Dimension    Singular Value    Inertia    Chi Square    Sig.       Accounted for    Cumulative
1            .427              .182                                .497             .497
2            .341              .116                                .317             .814
3            .261              .068                                .186             1.000
Total                          .367       13.201        .154(a)    1.000            1.000
a 9 degrees of freedom
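The columns of the Summary table are linked by simple arithmetic: the inertia of a dimension is its squared singular value, the proportions are shares of the total inertia, and the total inertia multiplied by the sample size gives the chi-square. A minimal sketch using the rounded singular values from the output (so results agree only to about three decimals):

```python
singular_values = [0.427, 0.341, 0.261]
n = 36  # number of respondents in the contingency table

inertia = [s ** 2 for s in singular_values]
total_inertia = sum(inertia)
proportion = [value / total_inertia for value in inertia]

# Running total of the proportions, dimension by dimension.
cumulative = []
running = 0.0
for share in proportion:
    running += share
    cumulative.append(running)

chi_square = total_inertia * n

print([round(v, 3) for v in inertia])
print([round(v, 3) for v in cumulative])
print(round(chi_square, 2))
```

The cumulative figure for two dimensions is the 81.4% quoted above.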
Correspondence analysis decomposes the original matrix into row and column points.

Overview Row Points(a)
                               Score in Dimension               Contribution of Point to     Contribution of Dimension to
                                                                Inertia of Dimension         Inertia of Point
INCOME               Mass      1        2         Inertia       1         2                  1        2        Total
Less Than $1000      .250      .310     .801      .080          .056      .470               .128     .682     .810
$1000 to $3000       .250      .020     .245      .053          .000      .044               .001     .096     .097
$3001 to $5000       .250      .717     .764      .106          .301      .428               .518     .469     .987
Above $5000          .250      1.047    .282      .127          .642      .058               .920     .053     .974
Active Total         1.000                        .367          1.000     1.000
The Score in Dimension columns give the coordinates of the row categories in the joint plot. In the same way you will find the coordinates of the column categories in the next table.

Overview Column Points(a)
                               Score in Dimension               Contribution of Point to     Contribution of Dimension to
                                                                Inertia of Dimension         Inertia of Point
BRAND                Mass      1        2         Inertia       1         2                  1        2        Total
Brand A              .306      .198     .640      .068          .028      .367               .075     .627     .702
Brand B              .250      .283     .251      .059          .047      .046               .146     .092     .238
Brand C              .194      .720     .960      .107          .236      .526               .402     .571     .972
Brand D              .250      1.085    .287      .133          .689      .060               .947     .053     1.000
Active Total         1.000                        .367          1.000     1.000
Using the row and column coordinates, the program produces a joint map, which is given below.
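[Figure: Joint plot of row and column points. Horizontal axis: Dimension 1. Vertical axis: Dimension 2. Points plotted: the four income categories and Brands A to D. Less Than $1000 lies next to Brand A, $1000 to $3000 next to Brand B, $3001 to $5000 next to Brand C and Above $5000 next to Brand D.]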
From the chart it appears that each brand is associated with one income category: Brand A with Less Than $1000, Brand B with $1000 to $3000, Brand C with $3001 to $5000 and Brand D with Above $5000. It also shows that Brands A and B are closer to each other than to the other brands.
Toothpaste Data: Now let us turn our attention to the brand-personality association data collected in the toothpaste study. The data has been arranged in the file correspond.sav. Open the file, study the way the data is arranged and run the following SPSS commands.
DATA > WEIGHT CASES
WEIGHT CASES BY
Drag 'freq' to FREQUENCY VARIABLE
OK
ANALYZE > DATA REDUCTION > CORRESPONDENCE ANALYSIS
Drag 'attri' to ROW VARIABLE
DEFINE RANGE
MINIMUM VALUE '1'
MAXIMUM VALUE '11'
UPDATE
CONTINUE
Drag 'brand' to COLUMN VARIABLE
DEFINE RANGE
MINIMUM VALUE '1'
MAXIMUM VALUE '3'
UPDATE
CONTINUE
OK
Notice that 100 percent of the inertia is accounted for by the first two dimensions. It is enough to work with two dimensions.

Summary
                                                              Proportion of Inertia            Confidence Singular Value
Dimension    Singular Value    Inertia    Chi Square    Sig.       Accounted for    Cumulative    Standard Deviation    Correlation 2
1            .286              .082                                .793             .793          .035                  .043
2            .146              .021                                .207             1.000         .036
Total                          .103       72.795        .000(a)    1.000            1.000
a 20 degrees of freedom

The resulting correspondence map can be taken to PowerPoint and annotated as given below.
[Figure: Correspondence map of brand personality. Horizontal axis: Dimension 1. Vertical axis: Dimension 2. Points plotted: the brands Aqua Fresh, Colgate and Crest, and the personality traits Fun Loving, Outgoing, Feminine, Hedonist, Masculine, Overcautious, Sensuous, Romantic, Traditional, Ambitious and Achiever.]
Crest is seen to be used by the Ambitious Achiever.
Colgate is seen to be used by a Traditional, Overcautious, Masculine person.
Aqua Fresh is seen as Fun Loving, Feminine and Outgoing.
There is no brand available for Romantic, Sensuous types: an opportunity for a new product.
Correspondence analysis is a powerful tool for visualization of data from contingency tables. It has no restrictions on the sample size or on the scale used. Association between any two categorical variables can be easily analyzed using this technique.
Chapter 6 Regression Analysis
Objectives:
To understand the meaning of regression
To conduct simple regression and interpret the results
To conduct multiple regression and interpret the results
To understand the problem of multicollinearity and a method to overcome it
Simple Regression: Open the file regression.sav. Study the file structure to understand that it is the same file that was used for factor analysis, along with three new variables corresponding to the factor scores that we created using factor analysis. Variables q06_01 to q06_10 are the rating scores for different brands on different attributes. Variable q06_11 is the overall rating given to the different brands. We are going to fit regression equations with overall rating as the dependent variable and attribute ratings as independent variables. Linear regression refers to fitting a linear mathematical model between one dependent variable and one or more independent variables. We shall first conduct a regression analysis using just two variables. Use the following SPSS commands to fit a regression model with Fighting Cavities (q06_01) as the independent variable and Overall Rating (q06_11) as the dependent variable.
ANALYZE > REGRESSION > LINEAR
Drag Fighting Cavities (q06_01) to INDEPENDENT
Drag Overall Rating (q06_11) to DEPENDENT
OK
Take a look at the Coefficients table.
Coefficients(a)
                           Unstandardized Coefficients    Standardized Coefficients
Model 1                    B        Std. Error            Beta                         t        Sig.
(Constant)                 3.771    .477                                               7.905    .000
Fighting Cavities          .505     .059                  .521                         8.576    .000
a Dependent Variable: Overall rating
The unstandardized coefficients give the linear mathematical model:
Y = 3.771 + 0.505 X
where Y = Overall rating and X = Fighting Cavities.
The strength of the relationship between the independent and dependent variables is given by the correlation coefficient R in the Model Summary table.

Model Summary
Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .521(a)    .272        .268                 1.21679
a Predictors: (Constant), Fighting Cavities
The R² value of 0.272 is somewhat low. Normally an R² value of 0.7 and above is taken to signify a strong relationship. In the present case, however, we cannot conclude that there is no relationship between the two variables, as the F value is significant in the ANOVA table and the t-value for the variable Fighting Cavities is significant in the Coefficients table.

ANOVA(b)
Model 1       Sum of Squares    df     Mean Square    F         Sig.
Regression    108.880           1      108.880        73.540    .000(a)
Residual      291.672           197    1.481
Total         400.553           198
a Predictors: (Constant), Fighting Cavities
b Dependent Variable: Overall rating
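The figures in the Model Summary and ANOVA tables are connected: R² is the regression sum of squares as a share of the total, and F is the ratio of the two mean squares. The sketch below reproduces both from the printed sums of squares and also shows how the fitted equation would be used for prediction; the function name and the rating of 8 are illustrative.

```python
ss_regression = 108.880
ss_residual = 291.672
df_regression = 1
df_residual = 197

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total

# F is the mean square for regression divided by the mean square for the residual.
f_value = (ss_regression / df_regression) / (ss_residual / df_residual)


def predict_overall(fighting_cavities):
    """Predicted overall rating from the fitted model Y = 3.771 + 0.505 X."""
    return 3.771 + 0.505 * fighting_cavities


print(round(r_squared, 3), round(f_value, 2))
print(round(predict_overall(8), 3))  # hypothetical brand rated 8 on fighting cavities
```

With a single predictor, F is also the square of the t-value of that predictor (8.576 squared is about 73.5).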
In the same way, conduct a simple regression of each rating variable with the dependent variable and interpret the results.
Multiple Regression: Now we shall conduct a regression analysis with all 10 attribute ratings as independent variables.
ANALYZE > REGRESSION > LINEAR
Drag variables q06_01 to q06_10 to INDEPENDENT
Drag Overall Rating (q06_11) to DEPENDENT
OK
Note that the R² value improves dramatically to 0.743.

Model Summary
Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .862(a)    .743        .730                 .74117
a Predictors: (Constant), Innovative features/Ingredients, Fighting Cavities, Likeable Flavor, Attractive Packaging, Whitening Teeth, Brand Image, Freshening Breath, Color, Cleaning Stains/Tartar, Good Taste
From the Coefficients table, we can construct the following mathematical model to depict the relationship between the 10 independent variables and the overall rating.
Y = 0.598 + 0.208 X1 + 0.097 X2 + 0.016 X3 + 0.109 X4 + 0.060 X5 + 0.150 X6 + 0.135 X7 + 0.154 X8 - 0.096 X9 + 0.120 X10
The negative sign for variable nine (Attractive Packaging) indicates that it has a negative relationship with overall rating. That is, with the other attribute ratings held constant, a higher rating on attractive packaging goes with a lower overall rating.
Coefficients(a)
                                   Unstandardized Coefficients    Standardized Coefficients
Model 1                            B        Std. Error            Beta                         t         Sig.
(Constant)                         .598     .349                                               1.716     .088
Fighting Cavities                  .208     .048                  .215                         4.368     .000
Whitening Teeth                    .097     .037                  .130                         2.606     .010
Cleaning Stains/Tartar             .016     .052                  .018                         .307      .759
Good Taste                         .109     .072                  .154                         1.513     .132
Likeable Flavor                    .060     .071                  .086                         .844      .400
Freshening Breath                  .150     .045                  .173                         3.311     .001
Brand Image                        .135     .038                  .195                         3.582     .000
Color                              .154     .037                  .238                         4.187     .000
Attractive Packaging               -.096    .041                  -.132                        -2.361    .019
Innovative features/Ingredients    .120     .033                  .181                         3.660     .000
a Dependent Variable: Overall rating
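To see how the fitted equation is used, the sketch below applies the unstandardized coefficients to a set of attribute ratings. The coefficient for Attractive Packaging is entered as negative, following the text above; the ratings passed in are hypothetical and the function name is illustrative.

```python
INTERCEPT = 0.598

# Unstandardized coefficients (B) from the Coefficients table.
COEFFICIENTS = {
    "Fighting Cavities": 0.208,
    "Whitening Teeth": 0.097,
    "Cleaning Stains/Tartar": 0.016,
    "Good Taste": 0.109,
    "Likeable Flavor": 0.060,
    "Freshening Breath": 0.150,
    "Brand Image": 0.135,
    "Color": 0.154,
    "Attractive Packaging": -0.096,
    "Innovative features/Ingredients": 0.120,
}


def predict_overall(ratings):
    """Predicted overall rating for a dict of attribute ratings."""
    return INTERCEPT + sum(COEFFICIENTS[name] * ratings[name] for name in COEFFICIENTS)


# Hypothetical brand rated 7 on every attribute.
flat_sevens = {name: 7 for name in COEFFICIENTS}
print(round(predict_overall(flat_sevens), 3))

# Raising only the packaging rating lowers the prediction, because its coefficient is negative.
better_packaging = dict(flat_sevens, **{"Attractive Packaging": 9})
print(round(predict_overall(better_packaging), 3))
```

Because the predictors are correlated with one another, individual coefficients such as the packaging one should be read with the caution discussed under multicollinearity.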
Now the question arises as to which variable contributes most to the dependent variable. This can be inferred by looking at the standardized coefficients (beta values). From the beta values we can see that the most important variable is Color (0.238), the second most important is Fighting Cavities (0.215), and so on. From a statistical standpoint, also check which variables are significant. The significance of each beta is given by its t-value. Note that the following variables are significant at the 95 percent confidence level: Fighting Cavities (0.000), Whitening Teeth (0.010), Freshening Breath (0.001), Brand Image (0.000), Color (0.000), Attractive Packaging (0.019) and Innovative features (0.000). The other variables (Cleaning Stains/Tartar, Good Taste and Likeable Flavor) have significance values above 0.05. Can we conclude that these other variables are not important? We cannot say that unless we are sure that the independent variables are not correlated among themselves.
Multicollinearity: When the independent variables are correlated among themselves, we call it a problem of multicollinearity. Consider the variables Good Taste and Likeable Flavor. They are highly correlated with each other.
ANALYZE > CORRELATE > BIVARIATE
Drag Good Taste and Likeable Flavor to VARIABLES
OK
You will get the following output.

Correlations
                                           Good Taste    Likeable Flavor
Good Taste         Pearson Correlation     1             .932(**)
                   Sig. (2-tailed)         .             .000
                   N                       202           201
Likeable Flavor    Pearson Correlation     .932(**)      1
                   Sig. (2-tailed)         .000          .
                   N                       201           201
** Correlation is significant at the 0.01 level (2-tailed).
The correlation coefficient between the two variables is as high as 0.932. Now we shall use only these two variables as independent variables and conduct a multiple regression.
ANALYZE > REGRESSION > LINEAR
Drag variables Good Taste and Likeable Flavor to INDEPENDENT
Drag Overall Rating (q06_11) to DEPENDENT
OK
The output indicates that only the variable Good Taste is significant (t = 3.762, sig. = 0.000) and Likeable Flavor is not significant (t = 0.431, sig. = 0.667). Can we conclude that likeable flavor is not an important attribute? No. Since the two variables are highly correlated, the beta values get distorted, and only one of them attains significance.

Coefficients(a)
                       Unstandardized Coefficients    Standardized Coefficients
Model 1                B        Std. Error            Beta                         t         Sig.
(Constant)             4.547    .301                                               15.086    .000
Good Taste             .391     .104                  .566                         3.762     .000
Likeable Flavor        .044     .101                  .065                         .431      .667
a Dependent Variable: Overall rating
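One common way to quantify this distortion, not used in the guide itself and offered here only as a supplementary illustration, is the variance inflation factor (VIF). With just two predictors it depends only on their correlation: VIF = 1 / (1 - r²). A minimal sketch using the correlation of 0.932 reported above:

```python
import math

r = 0.932  # correlation between Good Taste and Likeable Flavor

# Variance inflation factor for either predictor in a two-predictor model.
vif = 1 / (1 - r ** 2)

# The standard error of each coefficient is inflated by the square root of the VIF.
se_inflation = math.sqrt(vif)

print(round(vif, 2), round(se_inflation, 2))
```

A VIF of this size means each coefficient's standard error is well over twice what it would be if the two ratings were uncorrelated, which helps explain why one of them fails to reach significance.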
To get over the problem of multicollinearity, we could factor analyze the independent variables and work with the resulting new factors.
Using Factor Scores: Refer to the module on factor analysis, where we reduced the original ten variables to three new factors, which were named Dental Hygiene, Visibility and Sensory Benefits. The resulting factor scores for these new variables have been saved in the regression.sav file. We shall now conduct a multiple regression with these three new variables as independent variables and overall rating as the dependent variable.
ANALYZE > REGRESSION > LINEAR
Drag variables Dental Hygiene, Visibility and Sensory Benefits to INDEPENDENT
Drag Overall Rating (q06_11) to DEPENDENT
OK
We get an R² value of 0.710, which shows that the relationship between the three factors and the overall rating is quite strong.

Model Summary
Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .842(a)    .710        .705                 .77397
a Predictors: (Constant), sensory benefits, visibility, dental hygiene
From the Coefficients table we see that all three variables are significant. From the beta values we can conclude that dental hygiene is the most important attribute, followed by sensory benefits and then visibility.

Coefficients(a)
                       Unstandardized Coefficients    Standardized Coefficients
Model 1                B        Std. Error            Beta                         t          Sig.
(Constant)             7.803    .055                                               141.025    .000
dental hygiene         .774     .056                  .538                         13.824     .000
visibility             .612     .055                  .429                         11.028     .000
sensory benefits       .684     .057                  .465                         11.937     .000
a Dependent Variable: Overall rating
The resulting regression equation is:
Y = 7.803 + 0.774 (Dental Hygiene) + 0.612 (Visibility) + 0.684 (Sensory Benefits)
In this case we used overall rating as the dependent variable. If we collect the respondents' intention-to-buy score on a rating scale, that could also be used as the dependent variable. We could then build a regression model to predict the intention-to-buy rating from the attribute rating scores.
Chapter 7 Discriminant Analysis
Objectives:
To understand what discriminant analysis is
To run discriminant analysis and interpret the results
 - discriminant function
 - discriminant score
 - Wilks' lambda
 - canonical correlation
 - structure matrix
 - standardized and unstandardized discriminant coefficients
 - cut-off point
 - confusion matrix
To understand the use of discriminant analysis to identify brand drivers
Basic Ideas: We shall use the disc_aquafresh.sav file to understand how to use discriminant analysis. Discriminant analysis is a dependence technique in which the dependent variable is categorical in nature. Variables 1-10 in the file are the attribute ratings for the brand Aquafresh. Variables 11-13 are the factor scores obtained from the earlier analysis (refer to factor analysis). We need this information because we may face the same problem of multicollinearity that we experienced in multiple regression (refer to the section on multiple regression). We are going to use the first 10 variables as independent variables and later use variables 11-13, the factor scores for the three new factors. The dependent variable is whether or not the respondent is a user of Aquafresh. This is a transformed variable developed from the current and previous brand variables in the master data file. Note that the dependent variable is dichotomous, and we are going to fit a model that will help us predict whether someone is a user or non-user of Aquafresh. Use the following SPSS commands.
SPSS Commands:
ANALYZE > CLASSIFY > DISCRIMINANT
Drag usage to GROUPING VARIABLE
DEFINE RANGE
Type 1 in MINIMUM VALUE
Type 2 in MAXIMUM VALUE
CONTINUE
Drag variables 1-10 to INDEPENDENTS
STATISTICS
Check MEANS
Check UNSTANDARDIZED COEFFICIENTS
CONTINUE
CLASSIFY
Check COMPUTE FROM GROUP SIZES
Check SUMMARY TABLE
CONTINUE
SAVE
Check PREDICTED GROUP MEMBERSHIP
Check DISCRIMINANT SCORES
CONTINUE
OK

Group Means
Attribute                  Users    Non Users    Overall
Fighting Cavities          7.95     6.74         7.16
Whitening Teeth            6.65     5.76         6.07
Cleaning Stains/Tartar     7.20     6.39         6.67
Good Taste                 7.10     6.82         6.91
Likeable Flavor            7.30     6.87         7.02
Freshening Breath          7.30     7.45         7.40
Brand Image                6.25     6.95         6.71
Color                      6.85     7.16         7.05
Attractive Packaging       6.95     7.29         7.17
Innovative features        6.20     6.53         6.41
Notice the difference in the ratings given by users and non-users. On the first five attributes the users have given a higher rating for Aquafresh, and on the last five attributes the non-users have given a marginally higher rating.
Canonical Correlation and Wilks' Lambda: In the next table find the canonical correlation. Its interpretation is similar to that of the R value we had in regression analysis.

Eigenvalues
Function    Eigenvalue    % of Variance    Cumulative %    Canonical Correlation
1           .662(a)       100.0            100.0           .631
a First 1 canonical discriminant functions were used in the analysis.
The next table shows Wilks' lambda, which can be interpreted in the same way as the F-test in the multiple regression output. The discriminant function that has been constructed is significant.

Wilks' Lambda
Test of Function(s)    Wilks' Lambda    Chi-square    df    Sig.
1                      .602             25.919        10    .004
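With a single discriminant function, the eigenvalue, the canonical correlation and Wilks' lambda are three views of the same quantity. The sketch below derives the other two from the printed eigenvalue, and also approximates the chi-square using Bartlett's formula, which is assumed here to be the one behind the output; small differences from the printed 25.919 come from rounding of the eigenvalue.

```python
import math

eigenvalue = 0.662
n_cases = 58        # 20 users + 38 non-users
n_predictors = 10
n_groups = 2

# With one function: lambda = 1 / (1 + eigenvalue), and R_c^2 = eigenvalue / (1 + eigenvalue).
wilks_lambda = 1 / (1 + eigenvalue)
canonical_correlation = math.sqrt(eigenvalue / (1 + eigenvalue))

# Bartlett's chi-square approximation for testing Wilks' lambda.
chi_square = -(n_cases - 1 - (n_predictors + n_groups) / 2) * math.log(wilks_lambda)
df = n_predictors * (n_groups - 1)

print(round(wilks_lambda, 3), round(canonical_correlation, 3), round(chi_square, 2), df)
```

A lambda close to 1 would mean the groups are hardly separated; the value of about 0.60 here says roughly 40% of the variance in the discriminant scores lies between users and non-users.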
Discriminant Function and Classification: Now take a look at the Canonical Discriminant Function Coefficients (unstandardized coefficients). Using these we can form the discriminant function shown below.
D = -3.041 + 0.702 X1 + 0.324 X2 + 0.081 X3 + 0.056 X4 + 0.167 X5 - 0.427 X6 - 0.491 X7 + 0.002 X8 + 0.201 X9 - 0.165 X10

Canonical Discriminant Function Coefficients
                                   Function 1
Fighting Cavities                  .702
Whitening Teeth                    .324
Cleaning Stains/Tartar             .081
Good Taste                         .056
Likeable Flavor                    .167
Freshening Breath                  -.427
Brand Image                        -.491
Color                              .002
Attractive Packaging               .201
Innovative features/Ingredients    -.165
(Constant)                         -3.041
Unstandardized coefficients
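As a check on the discriminant function, it can be evaluated at the mean ratings of each group; the results should be close to the group centroids that SPSS reports (about 1.10 for users and about -0.58 for non-users). The sketch below does this with the coefficients and group means from the tables, taking the constant and the coefficients for Freshening Breath, Brand Image and Innovative features as negative; the names are illustrative and small differences arise from rounding.

```python
CONSTANT = -3.041

# Unstandardized coefficients, in the attribute order of the group means table.
COEFFICIENTS = [0.702, 0.324, 0.081, 0.056, 0.167, -0.427, -0.491, 0.002, 0.201, -0.165]

# Mean ratings of Aquafresh by users and non-users, same attribute order.
USER_MEANS = [7.95, 6.65, 7.20, 7.10, 7.30, 7.30, 6.25, 6.85, 6.95, 6.20]
NONUSER_MEANS = [6.74, 5.76, 6.39, 6.82, 6.87, 7.45, 6.95, 7.16, 7.29, 6.53]


def discriminant_score(ratings):
    """Discriminant score D for one set of ten attribute ratings."""
    return CONSTANT + sum(b * x for b, x in zip(COEFFICIENTS, ratings))


user_centroid = discriminant_score(USER_MEANS)
nonuser_centroid = discriminant_score(NONUSER_MEANS)
print(round(user_centroid, 2), round(nonuser_centroid, 2))
```

The same function applied to an individual respondent's ratings gives that respondent's discriminant score, which is what SPSS saves in the data file.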
Using this kind of equation, a discriminant score is calculated for each respondent, and the program uses a cut-off point to classify respondents as users or non-users. Now go to the data file and find two new variables added there: predicted group (dis_1) and discriminant score (dis1_1). Change to data view and find the discriminant scores. To compute the cut-off point, calculate the mean discriminant score for each group and use the following formula:
Cut-off point = (N2 D1 + N1 D2) / (N1 + N2)
where:
N1 = No. of Users
N2 = No. of Non Users
D1 = Average Discriminant Score for Users
D2 = Average Discriminant Score for Non Users
These values are available in the output in the table Functions at Group Centroids.

Functions at Group Centroids
User of Aqua Fresh    Function 1
user                  1.102
Nonuser               -.580
Unstandardized canonical discriminant functions evaluated at group means
We know from the next table that we have 20 users and 38 non-users.

                                   Cases Used in Analysis
User of Aqua Fresh    Prior        Unweighted    Weighted
user                  0.345        20            20
Nonuser               0.655        38            38
Total                 1            58            58
We can calculate the cut-off point as 0.522. Respondents with discriminant scores above 0.522 are classified as users and the others are classified as non-users.
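The cut-off arithmetic can be verified directly. The sketch below applies the formula given earlier, with the non-user centroid taken as negative (-0.580), since the two centroids lie on opposite sides of zero; the function names are illustrative.

```python
n_users = 20          # N1
n_nonusers = 38       # N2
d_users = 1.102       # D1, centroid for users
d_nonusers = -0.580   # D2, centroid for non-users

# Cut-off point = (N2 * D1 + N1 * D2) / (N1 + N2)
cutoff = (n_nonusers * d_users + n_users * d_nonusers) / (n_users + n_nonusers)


def classify(score):
    """Classify a respondent from the discriminant score, using the cut-off."""
    return "user" if score > cutoff else "nonuser"


print(round(cutoff, 3))
print(classify(0.9), classify(0.1))
```

Because each centroid is weighted by the size of the other group, the cut-off is pulled towards the centroid of the smaller group (users), so a respondent needs a fairly high score to be classified as a user.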
Confusion Matrix: If we cross-tabulate the actual Vs predicted group membership, we can see the extent of misclassification.

Classification Results(a)
                                           Predicted Group Membership
                    User of Aqua Fresh     user      Nonuser     Total
Original    Count   user                   14        6           20
                    Nonuser                5         33          38
            %       user                   70.0      30.0        100.0
                    Nonuser                13.2      86.8        100.0
a 81.0% of original grouped cases correctly classified.
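The accuracy figure in the footnote can be reproduced from the counts in the table: it is the number of cases on the diagonal divided by the total. A minimal sketch, with illustrative names:

```python
# Rows are actual groups, inner keys are predicted groups.
confusion = {
    "user": {"user": 14, "nonuser": 6},
    "nonuser": {"user": 5, "nonuser": 33},
}

total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[group][group] for group in confusion)
accuracy = correct / total

# Share of each actual group that was classified correctly (the row percentages).
hit_rate = {
    group: confusion[group][group] / sum(confusion[group].values())
    for group in confusion
}

print(round(100 * accuracy, 1))
print({group: round(100 * rate, 1) for group, rate in hit_rate.items()})
```

The overall figure hides a difference between the groups: non-users are classified correctly more often than users.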
From the output we can see that the classification accuracy is 81%. The above table is also known as the confusion matrix.
Structure Matrix: To determine which variables were effective in discriminating the users from the non-users, we could look at the table of Standardized Canonical Discriminant Function Coefficients. These coefficients can be interpreted in the same way as the beta values in multiple regression. Fighting Cavities appears to be the most important variable in predicting whether a respondent is an Aquafresh user or not. The least important one seems to be Color. This has to be taken with a pinch of salt, as the coefficients can get distorted if the independent variables are correlated.

Standardized Canonical Discriminant Function Coefficients
                                   Function 1
Fighting Cavities                  .986
Whitening Teeth                    .589
Cleaning Stains/Tartar             .132
Good Taste                         .129
Likeable Flavor                    .402
Freshening Breath                  -.831
Brand Image                        -.903
Color                              .006
Attractive Packaging               .405
Innovative features/Ingredients    -.359
When the independent variables are correlated among themselves, it is safer to look at the Structure Matrix.

Structure Matrix
                                   Function 1
Fighting Cavities                  .514
Cleaning Stains/Tartar             .296
Whitening Teeth                    .290
Brand Image                        .225
Likeable Flavor                    .107
Attractive Packaging               .100
Innovative features/Ingredients    .089
Good Taste                         .073
Color                              .068
Freshening Breath                  .045
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function.
The values in the Structure Matrix can be interpreted in much the same way as factor loadings in factor analysis. For example, if you correlate 'fighting cavities' with the discriminant score you will get 0.514. The attributes are arranged in their order of importance. The attributes that differentiate Aqua Fresh users from others are Fighting Cavities and Cleaning Stains/Tartar. These are strong dental hygiene factors.
Using Factor Scores in Discriminant Analysis: However, it is better to run the analysis with factor scores, as we have already done the factor analysis. Use the following commands.
ANALYZE > CLASSIFY > DISCRIMINANT
Drag usage to GROUPING VARIABLE
DEFINE RANGE
Type 1 in MINIMUM VALUE
Type 2 in MAXIMUM VALUE
CONTINUE
Drag variables 11-13 to INDEPENDENTS
STATISTICS
Check MEANS
Check UNSTANDARDIZED COEFFICIENTS
CONTINUE
CLASSIFY
Check COMPUTE FROM GROUP SIZES
Check SUMMARY TABLE
CONTINUE
SAVE
Check PREDICTED GROUP MEMBERSHIP
Check DISCRIMINANT SCORES
CONTINUE
OK
The prediction accuracy is only 65.5%.

Classification Results(a)
                                           Predicted Group Membership
                    User of Aqua Fresh     user      Nonuser     Total
Original    Count   user                   11        9           20
                    Nonuser                11        27          38
            %       user                   55.0      45.0        100.0
                    Nonuser                28.9      71.1        100.0
a 65.5% of original grouped cases correctly classified.

Structure Matrix
                      Function 1
dental hygiene        .864
visibility            .620
sensory benefits      .148
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function.

Standardized Canonical Discriminant Function Coefficients
                      Function 1
dental hygiene        .773
visibility            .483
sensory benefits      .223
In terms of importance of attributes, we find that 'dental hygiene' is the most discriminating attribute for Aquafresh. Also note that no contradictory interpretations arise from the standardized discriminant coefficients and the structure matrix.
Brand Drivers: The implication for the brand is that dental hygiene is the driver attribute for Aquafresh. The brand should ensure that its rating on this attribute does not go down. Conduct discriminant analysis for Colgate and Crest using the files disc_colgate.sav and disc_crest.sav and identify the brand drivers.
Chapter 8 Multidimensional Scaling
Objectives:
To understand the basic concepts behind multidimensional scaling
To learn how to conduct multidimensional scaling analysis using SPSS and interpret the results
 - meaning of stress
 - stress decomposition
 - interpreting the dimensions
To understand how to use multiple regression to fit vectors to the multidimensional space
Basic Concepts: This is a technique used to identify the underlying dimensions on which a set of stimuli are differentiated, based on similarity-dissimilarity ratings of the stimuli. The stimuli could be brands, other objects, people or even countries. Suppose we get people to rate a set of six countries on a 0-10 scale of similarity-dissimilarity, with 0 when the countries are identical and 10 when they are diametrically opposite to each other. The following is the matrix generated from a single respondent.

            England    USA    France    Germany    India    China
England     0          2      9         8          8        9
USA         2          0      8         9          9        8
France      9          8      0         1          8        8
Germany     8          9      1         0          8        9
India       8          9      8         8          0        1
China       9          8      8         9          1        0
The above data has been arranged in the form required for analysis in the mds_sample.sav file. Open the file and study its structure. The rows and columns are identified by row_id and col_id. The corresponding ratings are given under the variable rating. Run the following SPSS commands and study the output.
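The single-column layout described here can be produced mechanically from the dissimilarity matrix. The sketch below lists one record per pair of countries from the lower triangle of the matrix; whether mds_sample.sav stores only the lower triangle or every cell is not stated, so treating it as lower triangle only is an assumption, and the Python names are illustrative.

```python
countries = ["England", "USA", "France", "Germany", "India", "China"]

# Lower triangle of the dissimilarity matrix: row i holds ratings against countries 0..i-1.
lower_triangle = [
    [],
    [2],
    [9, 8],
    [8, 9, 1],
    [8, 9, 8, 8],
    [9, 8, 8, 9, 1],
]

# One record per pair: (row_id, col_id, rating), with ids starting at 1.
records = [
    (i + 1, j + 1, rating)
    for i, row in enumerate(lower_triangle)
    for j, rating in enumerate(row)
]

for row_id, col_id, rating in records:
    print(countries[row_id - 1], countries[col_id - 1], rating)
```

Six stimuli give 15 distinct pairs, which is the number of ratings a single respondent has to supply.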
ANALYZE > SCALE > MULTIDIMENSIONAL SCALING (PROXSCAL)
Check THE PROXIMITIES ARE IN A SINGLE COLUMN
DEFINE
Drag variable rating to PROXIMITIES
Drag row_id to ROWS
Drag col_id to COLUMNS
MODEL
Check INTERVAL
CONTINUE
OUTPUT
Check INPUT DATA
Check ITERATION HISTORY
CONTINUE
OK
You will notice that the input data we used has been reproduced in the input data matrix.

Proximities
            England    USA      France    Germany    India    China
England     .
USA         2.000      .
France      9.000      8.000    .
Germany     8.000      9.000    1.000     .
India       8.000      9.000    8.000     8.000      .
China       9.000      8.000    8.000     9.000      1.000    .
The next table shows the improvement in stress value. Stress in MDS refers to the extent of model misfit. It is through an iterative process that the program arrives at the configuration that best fits the dissimilarity ratings that we have input into the program.
Iteration History
Iteration   Normalized Raw Stress   Improvement
0           .09053(a)
1           .00423                  .08630
2           .00397                  .00025
3           .00388                  .00009(b)
a Stress of initial configuration: simplex start.
b The iteration process has stopped because Improvement has become less than the convergence criterion.
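To give a feel for what a misfit measure of this kind captures, the following Python sketch compares the distances in a configuration with the input dissimilarities. It is a simplified illustration, not the PROXSCAL computation: it rescales the ratings by a single least-squares factor, whereas the INTERVAL model in SPSS also estimates an intercept, so the figures need not match the SPSS output. The function names and the small three-point example are assumptions made for illustration.

```python
import math


def distances(points):
    """Euclidean distance for every pair (i, j) with i > j."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i)
    }


def normalized_stress(points, dissimilarities):
    """Misfit between map distances and rescaled dissimilarities.

    dissimilarities maps (i, j) with i > j to a rating. The ratings are
    multiplied by the least-squares factor b before comparison, and the
    squared differences are divided by the sum of squared rescaled ratings.
    """
    d = distances(points)
    pairs = sorted(dissimilarities)
    b = (sum(dissimilarities[p] * d[p] for p in pairs)
         / sum(dissimilarities[p] ** 2 for p in pairs))
    raw = sum((b * dissimilarities[p] - d[p]) ** 2 for p in pairs)
    return raw / sum((b * dissimilarities[p]) ** 2 for p in pairs)


# Illustrative ratings for three stimuli (not survey data).
ratings = {(1, 0): 2, (2, 0): 6, (2, 1): 4}

# Points on a line whose distances (1, 3, 2) are proportional to the ratings.
perfect = normalized_stress([(0, 0), (1, 0), (3, 0)], ratings)

# A triangle that cannot reproduce the same ratings.
misfit = normalized_stress([(0, 0), (1, 0), (0, 1)], ratings)

print(perfect, misfit)
```

A configuration that reproduces the ratings up to scale gives a value of zero, and the value grows as the map distorts the ratings.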
A stress value of less than 0.01 is considered very good, and stress values up to 0.03 are acceptable. The stress value of 0.00388 for the current data shows that the model gives a good fit. The program has found a solution in two dimensions, and the coordinates of the stimuli (countries) are given in the Final Coordinates matrix. The coordinates are also plotted on a graph.
Final Coordinates
           Dimension 1   Dimension 2
England    .656          .057
USA        .656          .057
France     .337          .518
Germany    .319          .572
India      .337          .518
China      .319          .572
A cursory look at the chart shows that countries that are similar to each other are placed together and those that are different are placed apart. Notice that there are three clusters: USA & England, France & Germany, and India & China.
[Chart: Object Points, Common Space. The six countries plotted with Dimension 1 on the horizontal axis and Dimension 2 on the vertical axis.]
In order to interpret the chart we shall take it to PowerPoint and draw lines passing through the origin. The next task is to uncover the meaning of the two underlying dimensions.
[Chart: the same plot of the countries on Dimension 1 and Dimension 2, with lines drawn through the origin.]
To make better sense of this graph, rotate the x-axis and y-axis as shown in the next chart.
[Chart: the same plot of the countries on Dimension 1 and Dimension 2, with the x-axis and y-axis rotated.]
Now you can clearly see that the x-axis refers to support for the Iraq war and the y-axis refers to economic development. Here we used a judgmental approach to interpreting the dimensions. It is also possible to use multiple regression to fit attributes to the above chart, which we shall see later.
This raises another question: how many dimensions are needed? In general, MDS is used mostly for a two-dimensional solution. If the researcher has a priori knowledge that more than two dimensions may be required, this has to be planned at the data collection stage itself, because more dimensions require ratings on more stimuli. The rule is that for every dimension you need a minimum of three stimuli. For a three-dimensional solution you need at least 9 stimuli (in this case countries) rated on a dissimilarity scale. However, if we have data collected from multiple respondents, we use the criterion of stress: if the stress (measure of misfit) value is more than 0.03, go for more dimensions.
Here we had only one respondent rating the countries. In a survey we will typically have several respondents rating the stimuli. In the toothpaste survey, we had a total of 43 valid responses to the MDS question (i.e. question 8).
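The planning arithmetic behind the three-stimuli rule can be sketched in a few lines of Python. The function names are hypothetical; the rule itself is the one stated above, and the pair count is simply the number of distinct pairs a respondent has to rate.

```python
def min_stimuli(dimensions):
    """Rule of thumb from the text: at least three stimuli per dimension."""
    return 3 * dimensions


def pair_count(stimuli):
    """Number of distinct pairs of stimuli a respondent has to rate."""
    return stimuli * (stimuli - 1) // 2


for dims in (2, 3):
    n = min_stimuli(dims)
    print(dims, "dimensions:", n, "stimuli,", pair_count(n), "pairs to rate")
```

With six stimuli a respondent rates 15 pairs, which is the number of brand pairs listed in question 8 of the questionnaire. Nine stimuli would raise this to 36 pairs, which is why the number of dimensions has to be decided before data collection.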
I have arranged those 43 responses in 43 columns in the data file md2.sav. Now open this file and study the file structure.
ANALYZE
SCALE
MULTIDIMENSIONAL SCALING (PROXSCAL)
Check MULTIPLE MATRIX SOURCES
Check THE PROXIMITIES ARE IN A SINGLE COLUMN
DEFINE
Drag variables resp01 to resp69 to PROXIMITIES
Drag row_id to ROWS
Drag col_id to COLUMNS
MODEL
Check INTERVAL
CONTINUE
OUTPUT
Check INPUT DATA
Check ITERATION HISTORY
Check STRESS DECOMPOSITION
CONTINUE
OK
This time we try to fit a model by taking into account the 43 responses. Note that a stress value of 0.05555 has been achieved. The program has gone through 21 iterations to arrive at the final configuration.
Iteration History
Iteration   Normalized Raw Stress   Improvement
0           .12660(a)
1           .08510                  .04150
2           .07399                  .01111
3           .06833                  .00566
4           .06493                  .00340
5           .06273                  .00220
6           .06126                  .00147
7           .06024                  .00102
8           .05949                  .00075
9           .05891                  .00059
10          .05841                  .00050
11          .05797                  .00044
12          .05756                  .00041
13          .05719                  .00037
14          .05686                  .00034
15          .05656                  .00030
16          .05630                  .00026
17          .05608                  .00022
18          .05591                  .00018
19          .05576                  .00014
20          .05565                  .00012
21          .05555                  .00009(b)
a Stress of initial configuration: simplex start.
b The iteration process has stopped because Improvement has become less than the convergence criterion.
A stress value of 0.05555 is a little too high to make any meaningful interpretation. You will notice another large table that gives the stress decomposed for every individual. Notice that respondents 6, 20, 23, 38, 39, 41, 47, 58 and 69 have stress values above 0.07. If we leave out these respondents and rerun the MDS analysis, we can possibly get a solution with a lower stress value. Alternatively, we may try for a solution with more than two dimensions. Let us get a solution in three dimensions. Repeat the above analysis using the following commands.
ANALYZE
SCALE
MULTIDIMENSIONAL SCALING (PROXSCAL)
Check MULTIPLE MATRIX SOURCES
Check THE PROXIMITIES ARE IN A SINGLE COLUMN
DEFINE
Drag variables resp01 to resp69 to PROXIMITIES
Drag row_id to ROWS
Drag col_id to COLUMNS
MODEL
Check INTERVAL
Check WEIGHTED EUCLIDEAN
Change 2 to 3 MINIMUM DIMENSIONS
Change 2 to 3 MAXIMUM DIMENSIONS
CONTINUE
OUTPUT
Check INPUT DATA
Check ITERATION HISTORY
Check STRESS DECOMPOSITION
CONTINUE
OK
Note that we are now asking for 3 dimensions. We get a much lower stress value of 0.02319, which is an acceptable level of error in this type of analysis.
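The first option mentioned above, leaving out the poorly fitting respondents, amounts to a simple screening step. The Python sketch below shows the idea with the 0.07 cutoff used in the text. The per-respondent stress values are made up for illustration, since the Stress Decomposition table is not reproduced here, and the function name is hypothetical.

```python
# Hypothetical per-respondent stress values; the real figures come from the
# Stress Decomposition table in the SPSS output and are not reproduced here.
stress_by_respondent = {
    "resp01": 0.041,
    "resp06": 0.083,
    "resp12": 0.052,
    "resp20": 0.075,
}

CUTOFF = 0.07  # threshold used in the text


def split_by_stress(stress, cutoff):
    """Return (kept, dropped) lists of respondent ids, sorted by id."""
    kept = sorted(r for r, s in stress.items() if s <= cutoff)
    dropped = sorted(r for r, s in stress.items() if s > cutoff)
    return kept, dropped


kept, dropped = split_by_stress(stress_by_respondent, CUTOFF)
print("keep:", kept)
print("drop:", dropped)
```

Only the columns of the respondents that are kept would then be dragged to PROXIMITIES when the analysis is repeated.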
Stress and Fit Measures
Normalized Raw Stress                   .02319
Stress-I                                .15229(a)
Stress-II                               .98328(a)
S-Stress                                .09732(b)
Dispersion Accounted For (D.A.F.)       .97681
Tucker's Coefficient of Congruence      .98834
PROXSCAL minimizes Normalized Raw Stress.
a Optimal scaling factor = 1.024.
b Optimal scaling factor = .961.
We also get three sets of coordinates for the brands.
Final Coordinates
               Dimension 1   Dimension 2   Dimension 3
Aquafresh      .399          .164          .483
Crest          .486          .189          .360
Colgate        .422          .018          .471
Mentadent      .482          .063          .408
Arm & Hammer   .147          .595          .187
Pepsodent      .175          .614          .002
We could plot two dimensions at a time to study the relative positioning of the brands. However, we will need to interpret the dimensions. We used a subjective method of interpretation in the case of the six countries. That will be difficult in the case of toothpaste brands, as we have a three-dimensional solution. We shall use the regression method to fit attributes on to the same map where the brands are positioned. Use the file mds4.sav to get the coordinates for the attributes. Note that the mean ratings of brands on different attributes and three factors are going to be used for this analysis. Perform a multiple regression analysis with the mean rating on an attribute as the dependent variable and the coordinates of the brands as independent variables. The resulting coefficients give the three coordinates for the attribute concerned.
ANALYZE
REGRESSION
LINEAR
Drag Fighting Cavities to DEPENDENT
Highlight and drag dim1, dim2 and dim3 to INDEPENDENT
OK
The values given in the coefficients table give the three coordinates for the variable 'fighting cavities'.
Coefficients(a)
                 Unstandardized Coefficients   Standardized Coefficients
Model 1          B         Std. Error          Beta                        t        Sig.
(Constant)       8.297     .421                                            19.718   .032
Dimension 1      .144      .953                .078                        .151     .905
Dimension 2      .701      1.470               .247                        .477     .717
Dimension 3      1.562     .945                .831                        1.652    .347
a Dependent Variable: Fighting Cavities
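The vector-fitting step can also be sketched outside SPSS. The Python code below fits rating = b0 + b1*dim1 + b2*dim2 + b3*dim3 by ordinary least squares, using the normal equations and a small Gaussian elimination routine so that no external library is needed. The brand coordinates and mean ratings are made up for illustration (they are not the values in mds4.sav), and the function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_vector(coords, ratings):
    """Least-squares fit of ratings = b0 + b1*dim1 + b2*dim2 + b3*dim3."""
    design = [[1.0] + list(c) for c in coords]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, ratings))
           for i in range(k)]
    return solve(xtx, xty)


# Made-up coordinates for six brands on three dimensions (NOT the SPSS output).
coords = [
    (0.4, 0.2, -0.5),
    (0.5, -0.2, 0.4),
    (0.4, 0.0, -0.4),
    (-0.5, 0.1, 0.4),
    (-0.2, -0.6, 0.2),
    (-0.6, 0.5, -0.1),
]
# Made-up mean ratings, constructed as 8 + 0.5*dim1 - 1.0*dim2 + 2.0*dim3.
ratings = [8 + 0.5 * x - 1.0 * y + 2.0 * z for x, y, z in coords]

constant, b1, b2, b3 = fit_vector(coords, ratings)
print(round(constant, 3), round(b1, 3), round(b2, 3), round(b3, 3))
```

Because the illustrative ratings were built as an exact linear function of the coordinates, the fit recovers the constant and the three slopes used to build them. The three slopes play the role of the attribute's coordinates on the map.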
In the same way, the coordinates for all the attributes are arrived at, and a new data file mds5.sav has been created. This new file contains the coordinates for all the brands and attributes. Now we can plot and interpret the dimensions.
GRAPH
SCATTER
SIMPLE
DEFINE
Drag dim1 to X AXIS
Drag dim2 to Y AXIS
Drag attri to LABEL CASES BY
OPTIONS
Check DISPLAY CHART WITH CASE LABELS
CONTINUE
OK
You will get a scatter plot with the brands and attributes in the same space. This can be taken to PowerPoint, where vectors are drawn for the attributes as shown in the chart below.
[Chart: brands and attribute vectors plotted with Dimension 1 on the horizontal axis and Dimension 2 on the vertical axis. The chart also carries the labels Dental Hygiene, Visibility and Sensory Benefits.]
Legend: A1. Fighting cavities; A2. Whitening Teeth; A3. Cleaning Stains; A4. Good Taste; A5. Likeable Flavor; A6. Freshening Breath; A7. Brand Image; A8. Color; A9. Attractive Pack; A10. Innovations
Only the long vectors belong to this space. Attributes A3, A1, A6 etc. are too short and need not be considered while interpreting the meaning of Dimensions 1 and 2. We can also rotate the axes as shown in the picture to get a better meaning of the space. You can see that Dimension 1 is basically attribute A7, namely Brand Image, and Dimension 2 is made up of attributes A4 and A5 (Good Taste and Likeable Flavor). Crest and Colgate are seen to have a good brand image. Pepsodent is high on Dimension 2, namely flavor and taste. This result is consistent with the earlier findings.
To name Dimension 3, plot dim1 vs dim3. From the plot, notice that Dimension 3 comprises A8 + A9 and A2 + A1 (Color + Attractive Pack and Whitening Teeth + Fighting Cavities). Dimension 1 is clearly made up of A7 (Brand Image) only. In the same way, plot dim2 vs dim3. Dimension 2 comes out as A4 and A5 (taste and flavor), and Dimension 3 remains a combination of the four attributes A1, A2, A8 and A9.
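How long a vector appears in a given plot can be checked with a little arithmetic: it is the length of the vector's projection on the two dimensions being plotted. The Python sketch below does this for Fighting Cavities (A1), using the magnitudes of the three unstandardized coefficients from the coefficients table. The function name is hypothetical, and no cutoff for "too short" is assumed, since the text does not give one.

```python
import math

# Unstandardized coefficients for Fighting Cavities (A1) from the
# coefficients table, taken as magnitudes on Dimensions 1, 2 and 3.
a1 = (0.144, 0.701, 1.562)


def length_in_plane(vector, first, second):
    """Length of the vector's projection on the plane of two dimensions.

    Dimensions are numbered from 1, as in the text.
    """
    return math.hypot(vector[first - 1], vector[second - 1])


lengths = {
    plane: length_in_plane(a1, *plane)
    for plane in [(1, 2), (1, 3), (2, 3)]
}

for (first, second), length in lengths.items():
    print("dim%d vs dim%d: %.3f" % (first, second, length))
```

On these figures the A1 vector is less than half as long in the plot of Dimension 1 against Dimension 2 (about 0.72) as in the two plots that include Dimension 3 (about 1.57 and 1.71), which is in line with reading A1 as part of Dimension 3.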
[Chart: brands and attribute vectors plotted with Dimension 1 on the horizontal axis and Dimension 3 on the vertical axis; attribute legend A1 to A10 as in the previous chart.]
[Chart: brands and attribute vectors plotted with Dimension 2 on the horizontal axis and Dimension 3 on the vertical axis; attribute legend A1 to A10 as in the previous chart.]
QUESTIONNAIRE ON TOOTHPASTE BRANDS
This research study is conducted for academic purposes only. As the interest is in the collective opinion of the group as a whole, individual identity will not be revealed. Please have the questionnaire filled in and returned to the instructor.
1. Which of the toothpaste brands listed in the table below question 4 are you aware of? (Please put marks in the appropriate boxes in the table)
2. Which of these have you ever tried?
3. What brand of toothpaste are you currently using? (If you use more than one brand, please mark 'M' by the side of the most used brand)
4. What was the brand that you used immediately prior to this brand?

Brand                       Question 1   Question 2   Question 3   Question 4
Aquafresh
Colgate
Crest
Mentadent
Arm & Hammer
Pepsodent
Others1 ________________
Others2 ________________

5. Please indicate the relative importance of the following factors in terms of choosing a brand, using a seven-point rating scale. (Please circle the appropriate number to indicate the importance to you)

Factor                                            Not Important          Very Important
a. Fighting cavities                              1   2   3   4   5   6   7
b. Whitening teeth                                1   2   3   4   5   6   7
c. Cleaning Stains/Tartar                         1   2   3   4   5   6   7
d. Good taste                                     1   2   3   4   5   6   7
e. Likeable Flavor                                1   2   3   4   5   6   7
f. Freshening Breath                              1   2   3   4   5   6   7
g. High Prestige Brand                            1   2   3   4   5   6   7
h. American Dental Association Recommendation     1   2   3   4   5   6   7
i. Attractive Packaging                           1   2   3   4   5   6   7
j. Innovative feature/new ingredient              1   2   3   4   5   6   7
6. Please rate AquaFresh, Colgate and Crest on the following factors. If your 'current brand' is other than these three, please rate your current brand too. Use a 0-10 scale, with 0 being poor and 10 being good on that attribute. Your response will be a number between 0 and 10 in the boxes below, indicating the extent to which that attribute is present in the brand being rated. (If you have never used any of these brands you may leave the corresponding column(s) blank.)

Attribute                                    a. Aquafresh   b. Colgate   c. Crest   d. Current Brand
1. Fighting Cavities
2. Whitening teeth
3. Cleaning stains/Tartar
4. Good Taste
5. Likeable Flavor
6. Freshening Breath
7. Brand Image
8. Color
9. Attractive Packaging
10. Innovative features/Ingredients
11. Overall rating for brands (out of 10)

7. Kindly put checkmarks in the boxes to indicate the personality traits of users that match with the three brands of toothpaste mentioned in the following table. You may also check more than one brand for a personality trait.

Personality Trait     a. Aquafresh   b. Colgate   c. Crest
1. Outgoing
2. Sensuous
3. Hedonist
4. Fun loving
5. Achiever
6. Romantic
7. Traditional
8. Ambitious
9. Overcautious
10. Feminine
11. Masculine
8. Please rate the following pairs of toothpaste brands from most similar pair (1) to most dissimilar pair (10). Circle the appropriate number. Please leave out the pairs containing brands that you are not aware of.

Brand Pair                       Most Similar                  Most Dissimilar
a. Aqua Fresh - Crest            1  2  3  4  5  6  7  8  9  10
b. Aqua Fresh - Colgate          1  2  3  4  5  6  7  8  9  10
c. Aqua Fresh - Mentadent        1  2  3  4  5  6  7  8  9  10
d. Aqua Fresh - Arm & Hammer     1  2  3  4  5  6  7  8  9  10
e. Aqua Fresh - Pepsodent        1  2  3  4  5  6  7  8  9  10
f. Crest - Colgate               1  2  3  4  5  6  7  8  9  10
g. Crest - Mentadent             1  2  3  4  5  6  7  8  9  10
h. Crest - Arm & Hammer          1  2  3  4  5  6  7  8  9  10
i. Crest - Pepsodent             1  2  3  4  5  6  7  8  9  10
j. Colgate - Mentadent           1  2  3  4  5  6  7  8  9  10
k. Colgate - Arm & Hammer        1  2  3  4  5  6  7  8  9  10
l. Colgate - Pepsodent           1  2  3  4  5  6  7  8  9  10
m. Mentadent - Arm & Hammer      1  2  3  4  5  6  7  8  9  10
n. Mentadent - Pepsodent         1  2  3  4  5  6  7  8  9  10
o. Arm & Hammer - Pepsodent      1  2  3  4  5  6  7  8  9  10

9. Please indicate how likely you are to buy the following new toothpaste concepts, using a seven-point rating scale. Please circle the appropriate number that indicates your preference. (1 = Definitely Not Buy, 7 = Will Definitely Buy)

New Product Concept
a. Toothpaste in a big jar (like styling gel)   1  2  3  4  5  6  7
b. Make-your-own toothpaste kit (with whitener, mouthwash, foaming agent, baking soda etc. in different tubes)   1  2  3  4  5  6  7
c. Caffeinated toothpaste for a refreshing feeling   1  2  3  4  5  6  7
d. Toothpaste with weight-control formulae (makes you feel full after brushing)   1  2  3  4  5  6  7
e. Toothpaste containing multivitamin   1  2  3  4  5  6  7
f. Spicy toothpaste (clove and/or cinnamon)   1  2  3  4  5  6  7
g. Transparent tube to see how much is left inside   1  2  3  4  5  6  7
h. Toothpaste containing colored beads and flavor crystals   1  2  3  4  5  6  7
i. Use-and-throw-away toothbrushes with paste pre-applied (Better Hygiene)   1  2  3  4  5  6  7
j. Tube fitted with dispenser for right amount   1  2  3  4  5  6  7
k. Male/Female toothpaste   1  2  3  4  5  6  7
l. Toothpaste that doubles as shaving cream   1  2  3  4  5  6  7
m. Night paste with sleep inducers   1  2  3  4  5  6  7
10. Please rank the following 15 toothpaste concepts from most preferred (1) to least preferred (15). (Enter your preference ranks in the little eggs in every box) One easy strategy to make the ranking task easier is to do it in two stages. First mark concepts as H (High preference), L (Low Preference) and M (Medium preference) and then rank the cards within each category. It is also suggested that you assign equal number of cards to each category.
Concept   Tube               Taste         Color   Brand        Price
1         Transparent        Mint          White   Colgate      $2
2         Transparent        Spicy*        Red     Aqua Fresh   $2
3         Transparent        Caffeinated   Green   Colgate      $4
4         Transparent        Mint          White   Crest        $6
5         Semi-transparent   Mint          Red     Crest        $4
6         Semi-transparent   Spicy*        White   Colgate      $6
7         Semi-transparent   Caffeinated   White   Aqua Fresh   $2
8         Semi-transparent   Mint          Green   Colgate      $2
9         Non-transparent    Mint          Green   Aqua Fresh   $6
10        Non-transparent    Spicy*        White   Colgate      $4
11        Non-transparent    Caffeinated   White   Crest        $2
12        Non-transparent    Mint          Red     Colgate      $2
13        Transparent        Spicy*        Green   Crest        $2
14        Transparent        Caffeinated   Red     Colgate      $6
15        Transparent        Mint          White   Aqua Fresh   $4
* The 'spicy' taste could be due to the addition of clove or cinnamon. They also have medicinal properties, apart from being natural ingredients (as opposed to chemicals).
11. Please indicate the extent of your agreement with the following statements using the Agree-Disagree scale given below: Strongly Disagree - 1, Disagree - 2, Neither Agree Nor Disagree - 3, Agree - 4, Strongly Agree - 5.

Statement (1 = Strongly Disagree, 5 = Strongly Agree)
a. I am very concerned about my looks.   1  2  3  4  5
b. I am very inquisitive like a child.   1  2  3  4  5
c. I like to be different in whatever I do.   1  2  3  4  5
d. I love whatever work I do.   1  2  3  4  5
e. I am always playful.   1  2  3  4  5
f. I keep thinking about my future.   1  2  3  4  5
g. I take a lot of care about the dress I wear.   1  2  3  4  5
h. I like to try new and different things.   1  2  3  4  5
i. I am sensitive to others' feelings.   1  2  3  4  5
j. I believe in keeping myself physically fit.   1  2  3  4  5
k. I dislike being left alone.   1  2  3  4  5
l. I'd say I'm rebelling against the way I was brought up.   1  2  3  4  5
m. I try to understand deeply about anything that I study.   1  2  3  4  5
n. I always worry about my past failures.   1  2  3  4  5
o. I like to follow what others do.   1  2  3  4  5
p. I am very sensitive to what others think about me.   1  2  3  4  5
q. My objective in life is to acquire wealth to the maximum extent possible.   1  2  3  4  5
r. I am very organized.   1  2  3  4  5
s. I'm a "spender" rather than a "saver."   1  2  3  4  5
t. Love and sex are great distractions to achieving one's objectives.   1  2  3  4  5
u. I act on my hunches.   1  2  3  4  5
v. I am more conventional than experimental.   1  2  3  4  5
w. I am a little fickle minded.   1  2  3  4  5
x. I love to have good food every day.   1  2  3  4  5
12. Please provide the following information about yourself.
a. Which of these age groups do you fall into?
   Below 25 Years / 25 or More / Refused
b. What is your household yearly income? (Remember, the survey is anonymous)
   $30,000 or Less / $30,001 to $80,000 / $80,001 to $125,000 / Over $125,000
c. Gender: Male / Female
d. Race/Ethnicity: (adopted from the census) White / Others / Refused
e. Your nickname: _________________________ (This name will appear in the database, and you can also see where you stand vis-à-vis others in your class during analysis. Please do not give your real name.)
Thank You for Your Time
CODE SHEET FOR TOOTH PASTE SURVEY
Column No.   Variable Name   Variable Label                                Label Values
1            no              Respondent No.
2            class           Closeup                                       1. GSB57302  2. MKT34701  3. MKT34702
3            q01a            Aquafresh                                     0. Unaware  1. Aware
4            q01b            Colgate                                       0. Unaware  1. Aware
5            q01c            Crest                                         0. Unaware  1. Aware
6            q01d            Mentadent                                     0. Unaware  1. Aware
7            q01e            Arm & Hammer                                  0. Unaware  1. Aware
8            q01f            Pepsodent                                     0. Unaware  1. Aware
9            q01g            Others                                        0. Unaware  1. Aware
10           q02a            Aquafresh                                     0. Never Used  1. Used
11           q02b            Colgate                                       0. Never Used  1. Used
12           q02c            Crest                                         0. Never Used  1. Used
13           q02d            Mentadent                                     0. Never Used  1. Used
14           q02e            Arm & Hammer                                  0. Never Used  1. Used
15           q02f            Pepsodent                                     0. Never Used  1. Used
16           q02g            Others                                        0. Never Used  1. Used
17           q03a            Current Brand 1                               1. Aquafresh  2. Colgate  3. Crest  4. Mentadent  5. Arm & Hammer  6. Pepsodent  7. Others
18           q03b            Current Brand 2                               1. Aquafresh  2. Colgate  3. Crest  4. Mentadent  5. Arm & Hammer  6. Pepsodent  7. Others  8. No Second Brand
19           q04             Previous Brand                                1. Aquafresh  2. Colgate  3. Crest  4. Mentadent  5. Arm & Hammer  6. Pepsodent  7. Others
20           q05a            Fighting Cavities                             1. Not Important To 7. Very Important
21           q05b            Whitening Teeth                               1. Not Important To 7. Very Important
22           q05c            Cleaning Stains/Tartar                        1. Not Important To 7. Very Important
23           q05d            Good Taste                                    1. Not Important To 7. Very Important
24           q05e            Likeable Flavor                               1. Not Important To 7. Very Important
25           q05f            Freshening Breath                             1. Not Important To 7. Very Important
26           q05g            High Prestige Brand                           1. Not Important To 7. Very Important
27           q05h            American Dental Association Recommendation    1. Not Important To 7. Very Important
28           q05i            Attractive Packaging                          1. Not Important To 7. Very Important
29           q05j            Innovative Features/new Ingredients           1. Not Important To 7. Very Important
30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
q06a01 q06a02 q06a03 q06a04 q06a05 q06a06 q06a07 q06a08 q06a09 q06a10 q06a11 q06b01 q06b02 q06b03 q06b04 q06b05
Aquafresh  Fighting Cavities Aquafresh  Whitening Teeth Aquafresh  Cleaning Stains/Tartar Aquafresh  Good Taste Aquafresh  Likeable Flavor Aquafresh  Freshening Breath Aquafresh  High Prestige Brand Aquafresh  Color Aquafresh  Attractive Packaging Aquafres