Marketing Experiment - Part II: Analysis
Uploaded by: minha-hwang
Category: Data & Analytics
Marketing Research
MRKT 451
Experimentation II
February 7, 2010
• Within Subjects versus Between Subjects
• Field versus Lab Experiments
• ANOVA
• Regression
• Case: NoPane Advertising Strategy
• Guidelines for Critiquing Experimental Research
Class Outline
• Between-subjects design
– Each subject receives only one treatment.
– Comparisons are made between groups of different
subjects.
• Within-subjects design
– Subject receives more than one treatment.
– Comparisons are made across multiple measures on the
same subject.
Two types of experiments
• Within subjects designs are advantageous because you
get greater statistical power due to “internal matching”
(you are your own control).
• However, in some cases, due to contamination or time
constraints, between-subjects designs must be used.
• This is not an obvious issue.
Within or Between Subjects?
[Figure: Field (in-market) experiments offer a real-world environment and high external validity; lab (paper/pencil) experiments offer scientific control and high internal validity.]
Trade-offs: Field vs. Lab Experiments
                             Laboratory                          Field
Validity                     High internal, low external         Low internal, high external
Time & Cost                  Low                                 High
Exposure to Competition      Low                                 High
Nature of the Manipulation   Realistic in a contrived setting    Difficult to duplicate in a lab
Trade-offs: Field vs. Lab Experiments
Analysis of Variance
• “In our test mailing, did urban customers spend more
than rural customers?”
ANOVA: Comparison of Means
• Most of the time, we want to draw conclusions from our
sample about the population at large
– We know that in the test mailing urban customers spent more
than rural customers, $35.10 vs. $31.34
– Can we conclude from this that we can reliably expect that
this difference exists in our customer base at large? (Or is
this a “fluke” of our test mailing?)
– Is the average spending of rural customers statistically
significantly different from the average spending of urban
customers?
• Statistical inference allows us to make conclusions about
the population at large
Statistical Inference: Comparison of Means
• Statistical inference means deciding which of two
hypotheses to accept as likely to be true
In our example, AvgExp_urban − AvgExp_rural = $35.10 − $31.34 = $3.76
Question: Is this close to or far from 0?
Hypothesis 0: Average expenditures of urban and rural customers are the same. If true, we expect…
Hypothesis 1: Average expenditures of urban and rural customers are different. If true, we expect…
Statistical Inference
• Statistics enables us to conclude whether a value is “close to” or “far
from” zero
• Technically, a “statistic” is a number derived from a formula based
on data
– A mean is a statistic: the sum of the variable values divided by the
number of observations
– A variance is a statistic
• Statistical theory tells us how a statistic behaves across repeated samples
– For example, we do not need to know the distribution of the
expenditures of the customers in the urban area
– But if we take 10 sets of samples and calculate the mean of each
sample, we do know how those means would be distributed
(approximately normally, for reasonably large samples)
Statistical Inference
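The claim about sample means can be illustrated with a small simulation. This is only a sketch under stated assumptions: the skewed "expenditure" population (exponential with mean 33.22) and the names `sample_means` and `draw_expenditure` are hypothetical, and it uses 2,000 samples of size 100 rather than the 10 samples mentioned above so that the pattern is easier to see.

```python
import random
import statistics

def sample_means(num_samples, sample_size, draw, seed=0):
    """Draw num_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

# Assumption for illustration: a skewed (exponential) spending population.
POP_MEAN = 33.22

def draw_expenditure(rng):
    return rng.expovariate(1 / POP_MEAN)

means = sample_means(num_samples=2000, sample_size=100, draw=draw_expenditure)

center = statistics.fmean(means)   # expected to sit near POP_MEAN
spread = statistics.stdev(means)   # expected near POP_MEAN / sqrt(100)

print(f"mean of sample means: {center:.2f}")
print(f"std of sample means:  {spread:.2f}")
```

Even though individual expenditures here are far from normal, the sample means cluster tightly around the population mean, which is what lets us judge whether an observed difference is "close to" or "far from" zero.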
Statistically Meaningful Difference ~ f(variance)
• Generate data (sample size of 50) by randomly drawing from a normal distribution N(33.22, 5²); assign 25 observations to each group
• Group 1) Mean: 33.61
27.90 25.00 41.13 34.68 27.28
38.60 26.73 32.61 36.92 29.02
31.47 33.37 42.98 35.52 31.22
32.09 31.82 39.75 31.80 29.26
34.42 32.75 37.19 38.02 38.66
• Group 2) Mean: 31.92
36.46 36.38 42.74 27.42 31.48
24.74 37.82 33.33 31.72 24.61
34.14 31.92 26.45 27.43 29.41
31.97 24.98 32.31 36.30 31.34
29.73 34.74 38.73 30.99 30.89
• With a standard deviation of 5,
a difference of 1.69 can
arise by chance
Statistically Meaningful Difference ~ f(variance)
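As a rough check on the "by chance" claim, the two groups listed above can be compared with a pooled two-sample t statistic. This is a sketch, not the slides' own computation: the function name `pooled_t` is hypothetical, and the critical value 2.01 is the approximate two-sided 5% cutoff for 48 degrees of freedom.

```python
import math
import statistics

# Data copied from the two groups drawn with a standard deviation of 5.
group1 = [27.90, 25.00, 41.13, 34.68, 27.28,
          38.60, 26.73, 32.61, 36.92, 29.02,
          31.47, 33.37, 42.98, 35.52, 31.22,
          32.09, 31.82, 39.75, 31.80, 29.26,
          34.42, 32.75, 37.19, 38.02, 38.66]
group2 = [36.46, 36.38, 42.74, 27.42, 31.48,
          24.74, 37.82, 33.33, 31.72, 24.61,
          34.14, 31.92, 26.45, 27.43, 29.41,
          31.97, 24.98, 32.31, 36.30, 31.34,
          29.73, 34.74, 38.73, 30.99, 30.89]

def pooled_t(x, y):
    """Return (t statistic, degrees of freedom) for an equal-variance t-test."""
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.fmean(x), statistics.fmean(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # sample variances
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_err, n1 + n2 - 2

t_stat, df = pooled_t(group1, group2)
T_CRIT = 2.01  # approximate two-sided 5% critical value for df = 48

print(f"t = {t_stat:.2f} on {df} degrees of freedom")
print("significant at 5%" if abs(t_stat) > T_CRIT else "not significant at 5%")
```

Both groups come from the same distribution, so a non-significant result is the expected outcome here.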
• Generate data (sample size of 50) by randomly drawing from a normal distribution N(33.22, 1²); assign 25 observations to each group
• Group 1) Mean: 33.33
31.78 32.24 33.83 33.11 33.98
33.89 33.60 33.99 34.74 33.80
33.65 33.18 31.98 32.72 33.55
32.11 33.69 33.54 33.32 33.52
33.24 33.20 33.21 34.15 33.13
• Group 2) Mean: 33.30
30.58 32.73 33.42 34.18 34.60
32.28 33.98 33.46 33.01 33.24
33.19 33.36 33.46 32.55 32.17
34.19 33.00 34.63 32.30 32.63
34.64 34.63 32.19 33.43 34.56
• With a standard deviation of 1,
a difference of 0.03 can
arise by chance
• 2 Population Means
– Two independent, separate target populations
• e.g., compare men vs. women on a rating scale
• e.g., compare users vs. nonusers on ounces purchased
– Draw random samples of sizes “n1” and “n2”
– Collect data:
• $X_{11}, X_{12}, X_{13}, \ldots, X_{1 n_1}$
• $X_{21}, X_{22}, X_{23}, \ldots, X_{2 n_2}$
– H0: $\mu_1 = \mu_2$
– Ha: $\mu_1 \neq \mu_2$
Hypothesis Testing: Comparison of means (2 groups)
• Compare 3 or more population means?
• F-test (ANOVA): an extension of the t-test to 3+ groups
• e.g., Three pricing strategies:
– Low
– Medium
– High
Analysis of Variance (ANOVA)
• Variance Between vs. Variance Within
[Figure: sales or p(buy) plotted for each price level (high, medium, low)]
F = (Variance between groups) / (Variance within groups)
F-test Intuitively
[Figure: observations Yij for Groups I, II, and III plotted around the grand mean]
Model: $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
ANOVA: Model
SStotal = SSbetween groups + SSwithin groups
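$SS_{\text{total}} = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$
$SS_{\text{between}} = \sum_{i=1}^{a} \sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})^2$
$SS_{\text{within}} = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$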
ANOVA: Computation
• MSbetween = SSbetween / (a - 1)
• MSwithin = SSwithin / (a * (n - 1))
• F = MSbetween / MSwithin
• F has (a - 1) and (a * (n - 1)) degrees of freedom
ANOVA: F-test
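The sums of squares and the F ratio above can be computed directly. The following is a minimal sketch assuming a balanced design (a groups of n observations each); the function name `one_way_anova` and the sales figures for the three pricing strategies are made up for illustration.

```python
import statistics

def one_way_anova(groups):
    """One-way ANOVA for a balanced design: a groups with n observations each."""
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")

    grand_mean = statistics.fmean([y for g in groups for y in g])
    group_means = [statistics.fmean(g) for g in groups]

    # Each group mean is repeated n times in the double sum, hence the factor n.
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)

    ms_between = ss_between / (a - 1)
    ms_within = ss_within / (a * (n - 1))
    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "F": ms_between / ms_within,
        "df": (a - 1, a * (n - 1)),
    }

# Hypothetical unit sales under three pricing strategies.
low = [12, 14, 11, 13]
medium = [10, 9, 11, 10]
high = [7, 8, 6, 7]

result = one_way_anova([low, medium, high])
print(result)
```

The returned F value would then be compared against an F distribution with the reported degrees of freedom.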
• Manipulate more than 1 factor:
– Random sample, randomly assign to 1 condition
– Combinations of 2 or more experimental factors
• Beard x UPS Logo
• Ad x Price
ANOVA: Logical Extension
• Brand Equity & Price Sensitivity
– 2x2: UPS/not branded x bearded/not bearded
– 2x3: 2 ad themes x 3 price strategies:
                 Price: Low    Medium    High
Ad: Standard
Ad: Luxury
Multi-factor ANOVA: Example
• Is one ad more effective than another?
• Is one pricing level more effective than the others?
• Do the ads and prices “interact”? That is, is there any particular
combination of ads and prices that is especially
effective (or detrimental)?
ANOVA Hypotheses
Regression
• Effect of Ad?
• Effect of Price?
• Effect of Ad x Price?
Model: $Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$
Experiment: Model
• 2 types of ad (rational vs. emotional) and 3 price levels (low, medium, or
high)
• 15 observations per treatment
• Total number of observations = 2 x 3 x 15 = 90
Collected Data After Experiments
ID Group Ad type Price level Sales
1 1 Emotional High 129.92
2 1 Emotional High 131.45
.. .. .. .. ..
16 2 Emotional Medium 130.89
17 2 Emotional Medium 130.97
.. .. .. .. ..
89 6 Rational Low 151.12
90 6 Rational Low 151.32
• To obtain the size/existence of effects, we need to
understand how the various factor levels affect the
dependent variable (e.g. sales)
• A regression could be useful, but a regression is usually
for numerical variables and not for “categorical” variables
• We need to convert the categorical variables into
numerical variables. This is achieved by “dummy
variable coding”
• A dummy variable for an interaction term can be made by
multiplying the dummy variables of the relevant factor levels
Data processing
• We need one dummy variable for every factor level in
the experiment
• For a certain treatment, the dummy variable for a certain
factor level takes value 1 if that treatment contains the
factor level and takes value 0 otherwise.
Dummy Variable Coding
In our example, the total number of dummy variables needed for
main effects is 2 (ad) + 3 (price) = 5
For each factor, replace the column containing the factor level
with columns containing the values of the corresponding dummy
variables.
• 2 dummy variables for ad types
• 3 dummy variables for price levels
• Interaction terms are not considered yet
Dataset After Initial Dummy Variable Coding
ID Group d_em_ad d_ra_ad d_hp d_mp d_lp Sales
1 1 1 0 1 0 0 129.92
2 1 1 0 1 0 0 131.45
.. .. .. .. .. .. .. ..
16 2 1 0 0 1 0 130.89
17 2 1 0 0 1 0 130.97
.. .. .. .. .. .. .. ..
89 6 0 1 0 0 1 151.12
90 6 0 1 0 0 1 151.32
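The initial coding step can be sketched as a small function that maps each treatment to its five dummy variables. The column names follow the table above; the function name `dummy_code` is hypothetical.

```python
def dummy_code(ad_type, price_level):
    """Return one 0/1 dummy per factor level, before any base case is dropped."""
    return {
        "d_em_ad": int(ad_type == "Emotional"),
        "d_ra_ad": int(ad_type == "Rational"),
        "d_hp": int(price_level == "High"),
        "d_mp": int(price_level == "Medium"),
        "d_lp": int(price_level == "Low"),
    }

# Rows 1, 16, and 89 of the collected data.
row_1 = dummy_code("Emotional", "High")
row_16 = dummy_code("Emotional", "Medium")
row_89 = dummy_code("Rational", "Low")

for row in (row_1, row_16, row_89):
    print(row)
    # Within each factor the dummies always sum to 1, so any one of them
    # can be inferred from the others.
    assert row["d_em_ad"] + row["d_ra_ad"] == 1
    assert row["d_hp"] + row["d_mp"] + row["d_lp"] == 1
```

The two sum checks make visible why a regression on all five dummies plus an intercept cannot be estimated as is.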
Dummy Variable Regression
• We need to run a regression with sales as the y variable
and the dummies as the x variables.
• Problem: Within any one factor, the dummies are
perfectly correlated
– The value of one dummy can be inferred from the rest
– Perfect multicollinearity
– Creates problems for most regression software
Solution
• For each factor
– Pick any one factor level. We will call this the “base case”
level for the factor
– Eliminate the dummy corresponding to the base case
• Run a regression using the remaining variables. Tabulate the
coefficients.
Excluding the intercept term, the total number of variables
in this regression will be:
(Total number of factor levels in the experiment) − (Total number of factors in the experiment)
In our example: 5 − 2 = 3.
Taking out Dummy Variables for the Base Case
• Suppose, we choose the base case factor level of each
attribute as follows
– Ad type: Rational (base case)
– Price level: High price (base case)
• The x variables (for main effects) in the regression will
be: d_em_ad, d_lp, and d_mp (3 variables)
• Now, we can create dummy variables for interaction
terms by multiplying the remaining levels of each factor
• 1 dummy variable for ad types (main effects)
• 2 dummy variables for price levels (main effects)
• 2 dummy variables for interaction terms
Final Dataset with Dummy Variables for Interaction
Terms
ID   Group   d_em_ad   d_mp   d_lp   d_int_em_mp   d_int_em_lp   Sales
1    1       1         0      0      0             0             129.92
2    1       1         0      0      0             0             131.45
..   ..      ..        ..     ..     ..            ..            ..
16   2       1         1      0      1             0             130.89
17   2       1         1      0      1             0             130.97
..   ..      ..        ..     ..     ..            ..            ..
89   6       0         0      1      0             0             151.12
90   6       0         0      1      0             0             151.32
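The final coding, with the base cases (rational ad, high price) dropped and the interaction dummies formed by multiplication, might look like the sketch below. The function name `design_row` is hypothetical; the column order matches the table above.

```python
def design_row(ad_type, price_level):
    """Return [d_em_ad, d_mp, d_lp, d_int_em_mp, d_int_em_lp] for one treatment.

    Base cases (Rational ad, High price) get no dummy of their own.
    """
    d_em_ad = int(ad_type == "Emotional")
    d_mp = int(price_level == "Medium")
    d_lp = int(price_level == "Low")
    # Interaction dummies are products of the remaining main-effect dummies.
    d_int_em_mp = d_em_ad * d_mp
    d_int_em_lp = d_em_ad * d_lp
    return [d_em_ad, d_mp, d_lp, d_int_em_mp, d_int_em_lp]

print(design_row("Emotional", "High"))    # row 1 in the table
print(design_row("Emotional", "Medium"))  # row 16 in the table
print(design_row("Rational", "Low"))      # row 89 in the table
```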
Interpretation of Regression Output
• After running a regression analysis, we obtain the
following results.
• How should we interpret the results?
                           Coefficients   Standard Error   t Stat    P-value
Intercept                  100.99         0.06             1691.66   0.00
Emotional                  29.91          0.08             354.27    0.00
Medium price               -0.05          0.08             -0.61     0.55
Low price                  49.99          0.08             592.17    0.00
Medium price x Emotional   0.13           0.12             1.06      0.29
Low price x Emotional      20.17          0.12             168.96    0.00
• Interpretations
– F-statistic: Omnibus test of the null hypothesis that all slope coefficients are 0.
• Numerator degrees of freedom is the number of Xs (k) in the
model.
• Denominator degrees of freedom is n − k − 1.
– t-statistic: Test of whether an individual coefficient = 0.
– p-value: The probability of observing a result at least as extreme
as the one we did, given that the null hypothesis is true.
Interpretations: Linear Regression
Interpretation of Regression Output
• Think about the base case! (rational ad/high price)
• Two of the effects are statistically not significant (not different from the base
case)
• For all other factor levels, the sales effect can be computed by summing the
intercept with the regression coefficients for the corresponding dummy
variables
Ad: Rational
– Low price: Intercept + Low price
– Medium price: Intercept + (Medium price)
– High price: Intercept
Ad: Emotional
– Low price: Intercept + Low price + Emotional + LP x Emotional
– Medium price: Intercept + (Medium price) + Emotional + (MP x Emotional)
– High price: Intercept + Emotional
Computing Sales Effect for Each Cell
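The cell-by-cell sums can be worked out from the reported coefficients. This sketch includes every coefficient, significant or not; the dictionary keys reuse the dummy names from the coding step, and the function name `predicted_sales` is hypothetical.

```python
# Coefficients copied from the regression output.
COEF = {
    "intercept": 100.99,
    "d_em_ad": 29.91,
    "d_mp": -0.05,
    "d_lp": 49.99,
    "d_int_em_mp": 0.13,
    "d_int_em_lp": 20.17,
}

def predicted_sales(ad_type, price_level):
    """Sum the intercept and the coefficients of the dummies that equal 1."""
    em = int(ad_type == "Emotional")
    mp = int(price_level == "Medium")
    lp = int(price_level == "Low")
    return (
        COEF["intercept"]
        + COEF["d_em_ad"] * em
        + COEF["d_mp"] * mp
        + COEF["d_lp"] * lp
        + COEF["d_int_em_mp"] * em * mp
        + COEF["d_int_em_lp"] * em * lp
    )

for ad in ("Rational", "Emotional"):
    for price in ("Low", "Medium", "High"):
        print(f"{ad:9s} {price:6s} {predicted_sales(ad, price):7.2f}")
```

The base case (rational ad, high price) is just the intercept, and the emotional ad at the low price picks up all of the large coefficients, including the interaction.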
Nopane Advertising Strategy
• Experimental design: 2 segments (East/West coast vs. the rest)
x 2 copy executions (emotional vs. rational)
x 3 levels of media exposure ($2.50, $4.75, and $8.00)
x 2 test territories (per segment) = 24 observations
• Marketing decision variables: advertising copy and media spending
• Questions:
– Experimental results: useful or worthless?
– Source of problem: what is the key problem with this experiment?
Nopane Advertising: Overview
• Impact of advertising expenditure on sales
– In regression 1, advertising is 50% more effective
than in regression 3.
– Is the emotional appeal ad not significant or negative?
• Is the test market study correct or is Skamarycz right?
               Regression 1               Regression 3
               Coefficient   p-value      Coefficient   p-value
Nopane Ad $    1.477         0.0003       0.996         0.0175
Segment A      0.3514        0.8046       -0.1667       0.9245
Emotional      2.1337        0.3058       -3.0000       0.0997
Nopane Advertising
• OLS assumes that Xs are independent.
• In our case, the competition adjusted their spending to our actions:
Regression Statistics
Multiple R          0.7557
R Square            0.5711
Adjusted R Square   0.5068
Standard Error      4.7020
Observations        24

ANOVA
             df   SS          MS         F        Significance F
Regression   3    588.7811    196.2604   8.8770   0.0006
Residual     20   442.1772    22.1089
Total        23   1030.9583

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept     9.7132         2.7265           3.5626   0.0020    4.0259      15.4005
Nopane Ad $   0.8515         0.4251           2.0030   0.0589    -0.0353     1.7383
Dum Segm      0.9167         1.9196           0.4775   0.6382    -3.0875     4.9209
Dum Copy      9.0833         1.9196           4.7319   0.0001    5.0791      13.0875
NoPane Advertising
• Two stage regression
– To reconcile regression 1 and regression 3, we can
use regression 2 to model the competitive reaction:
              Regression 1   Regression 2   Regression 3
Nopane Ad $   1.4772         0.8515         0.9959
Dum Segm      0.3514         0.9167         -0.1667
Dum Copy      2.1337         9.0833         -3.0000
Comp Ad $     -0.5652

1.4772 − 0.5652 * 0.8515 = 0.9959
0.3514 − 0.5652 * 0.9167 = −0.1667
2.1337 − 0.5652 * 9.0833 = −3.0000
NoPane Advertising
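The reconciliation arithmetic can be checked in a few lines. In this sketch the dictionary names are hypothetical, and it assumes (as the layout of the table suggests) that the Comp Ad $ coefficient belongs to regression 1.

```python
# Direct effects on sales, holding competitor advertising fixed (regression 1).
reg1 = {"Nopane Ad $": 1.4772, "Dum Segm": 0.3514, "Dum Copy": 2.1337}
COMP_AD_COEF = -0.5652  # effect of competitor ad spending on sales

# Competitive reaction: how competitor ad spending responds (regression 2).
reg2 = {"Nopane Ad $": 0.8515, "Dum Segm": 0.9167, "Dum Copy": 9.0833}

# Coefficients reported when competitor spending is left out (regression 3).
reg3_reported = {"Nopane Ad $": 0.9959, "Dum Segm": -0.1667, "Dum Copy": -3.0000}

# Total effect = direct effect + (competitor coefficient x competitive reaction).
total_effect = {k: reg1[k] + COMP_AD_COEF * reg2[k] for k in reg1}

for name, value in total_effect.items():
    print(f"{name:12s} computed {value:8.4f}   reported {reg3_reported[name]:8.4f}")
```

The computed totals agree with regression 3 up to rounding in the published coefficients.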
• Two stage regression
– Without two-stage regression, masking occurs.
– You would make the wrong inference about the
effectiveness of your actions.
– Regression 1 alone is wrong.
– Regression 3 is OK if the competition behaves after
launch in the same way it behaved during the test.
– If the competition does not behave as in the test, the test is
worthless!
NoPane Advertising
Guidelines for Critiquing
Experimental Research
• Identify the real instrument/treatment variable X, the real
response variable Y and the real population P of interest to
the manager.
• Identify the proxies x, y and p in the experiment setting.
• When and how is y being measured? Identify the
experimental design and the corresponding best estimate of
the observed effect of x on y:
a. Before-After without Control Group: E = O2 − O1
b. Before-After with Control Group: E = (O2 − O1) − (O4 − O3)
c. After-Only with Control Group: E = O2 − O4
Guidelines for Critiquing Experimental Research in
Marketing
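The three effect estimates can be written as small functions. This is a sketch: the function names and the observation values are hypothetical, with O1 and O2 taken as the before and after measures for the treated group and O3 and O4 as the before and after measures for the control group.

```python
def before_after_no_control(o1, o2):
    """E = O2 - O1: change in the treated group only."""
    return o2 - o1

def before_after_with_control(o1, o2, o3, o4):
    """E = (O2 - O1) - (O4 - O3): treated change net of the control change."""
    return (o2 - o1) - (o4 - o3)

def after_only_with_control(o2, o4):
    """E = O2 - O4: treated vs. control after the treatment."""
    return o2 - o4

# Hypothetical sales: both groups start at 100; treated ends at 130, control at 110.
o1, o2, o3, o4 = 100, 130, 100, 110

print(before_after_no_control(o1, o2))            # counts the market-wide change as part of the effect
print(before_after_with_control(o1, o2, o3, o4))  # removes the change seen in the control group
print(after_only_with_control(o2, o4))            # matches design (b) here because O1 equals O3
```

With these made-up numbers, the design without a control group attributes the whole 30-unit change to the treatment, while the designs with a control group attribute only the 20 units not shared with the control.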
• Look for problems in internal validity.
– Are there alternative explanations to the change E other
than the treatment variable? If there are, the statement
that x causes y is falsifiable and the experiment is flawed.
• Look for problems in external validity. That is, is there a
problem with the proxies?
Guidelines for Critiquing Experimental Research in
Marketing
• Read Chap. 11 of Text
• Read “The Science of Asking Questions”
• Individual assignment #1 (experimentation) will be
handed out on Wednesday
For next class…