Post on 31-Dec-2015
Copyright by Michael S. Watson, 2012
Statistics Quick Overview
Class #3
Copyright by Michael S. Watson, 2012; Slides from Managerial Statistics book
A/B Testing in Obama’s 2008 Campaign
Objective: Maximize Sign-Up Rate
Source: http://www.youtube.com/watch?v=7xV7dlwMChc
So, What is Your Guess?
A/B Testing for On-Line Businesses
What is it?
− Develop two versions of a page
− Randomly show users different versions
− Track how they do
− Uses statistics to decide which is better
− Answers yes/no questions
Why?
− You have the data to do it
− Web sites convert a small number of users
− Some see a 40% increase in conversion
Source: Ben Tilly btilly@gmail.com
Some Lessons from A/B Testing
Explore before you refine
Example: ABC Family
− Existing Website: Promotions for upcoming shows
− Radical Idea: People come to the website looking for old episodes
+600% engagement
Some Lessons from A/B Testing
Words Matter, Call to action
Which button led to the biggest increase in donations?
Trick question: it depended on what the campaign knew!
Thought Exercise with Our Packaging Example
Original Case (mean = 290, sd = 53)
Less Variability (mean = 290, sd = 5)
More Variability (mean = 290, sd = 186)
If a store manager came to you and said, “what will my sales be?” how would you answer?
If the CEO came to you and said, “what will average sales be?” how would you answer?
Thought Exercise II: We Doubled the Samples
[Figure: two panels, each labeled (mean = 290, sd = 53)]
If a store manager came to you and said, “what will my sales be?” how would you answer?
If the CEO came to you and said, “what will average sales be?” how would you answer?
What do you think of these questions now?
Sampling Distribution: Many times we are sampling a population and need to find the true mean
The mean of the sample is denoted by $\bar{X}$
$\bar{X}$ estimates the true mean, µ
Is it a ‘good’ estimator?
It depends on a few things:
− The standard deviation of the population
− The sample size
− The distribution of the population (sometimes)
− A good random sample and maybe a little luck
Sampling Distribution
$\bar{X}$ is approximately normally distributed with a mean of µ and st dev of $\sigma/\sqrt{n}$
Since we never know the actual σ, we approximate it with the sample standard deviation, s.
$s_{\bar{X}} = s/\sqrt{n}$ is commonly used in statistics
We call this term the standard error of the mean
Let’s see how this applies to our examples
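As a rough sketch of how the standard error applies to the three packaging cases, the Python below computes s/√n for each. The sample size n = 36 is an assumption (inferred from the 35 degrees of freedom used later in the deck), and `standard_error` is an illustrative helper name, not something from the slides.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return sd / math.sqrt(n)

N = 36  # assumed sample size (35 degrees of freedom + 1)

cases = [("Original case", 53), ("Less variability", 5), ("More variability", 186)]
for label, sd in cases:
    print(f"{label}: sd = {sd}, standard error = {standard_error(sd, N):.2f}")

# Doubling the number of samples shrinks the standard error by sqrt(2),
# even though the standard deviation itself stays the same.
print(f"Original case with 2n samples: {standard_error(53, 2 * N):.2f}")
```

The standard deviation answers the store manager's question (one store), while the standard error answers the CEO's question (the average).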
Central Limit Theorem: General Idea
$\bar{X}$ is approximately normally distributed with a mean of µ and st dev of $\sigma/\sqrt{n}$
In other words, as you take various samples, the collection of these sample means will be approximately normally distributed
The larger the value of n, the closer to normally distributed
The population data does not have to be normally distributed
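A small simulation can make the idea concrete. The sketch below is only an illustration: the exponential population (mean 1, standard deviation 1), the sample size of 30, and the 5,000 repetitions are all assumptions chosen for the demo, not values from the slides.

```python
import math
import random
import statistics

def sample_means(n, reps, rng):
    """Draw `reps` samples of size n from a skewed (exponential) population
    and return the mean of each sample."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(3)  # fixed seed so the run is repeatable
n = 30
means = sample_means(n=n, reps=5000, rng=rng)

center = statistics.mean(means)   # should be close to the population mean, 1
spread = statistics.stdev(means)  # should be close to sigma / sqrt(n)

print(f"mean of sample means:   {center:.3f}")
print(f"st dev of sample means: {spread:.3f}")
print(f"sigma / sqrt(n):        {1 / math.sqrt(n):.3f}")
```

Even though the population is far from normal, the sample means cluster around µ with a spread near σ/√n.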
We Have 3 Measures for a Sample of Data
Mean (average)
Standard Deviation (sample standard deviation)
Standard Error of the Mean
Let’s build a confidence interval….
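For reference, the three measures can be computed from raw data in a few lines. This is a minimal sketch: the `weekly_sales` figures are hypothetical numbers made up for illustration, and `summarize` is an illustrative helper name.

```python
import math
import statistics

def summarize(sample):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)
    se = sd / math.sqrt(len(sample))   # standard error of the mean
    return mean, sd, se

# Hypothetical weekly sales for eight stores (not data from the slides).
weekly_sales = [310, 265, 298, 342, 251, 289, 305, 270]

mean, sd, se = summarize(weekly_sales)
print(f"mean = {mean:.2f}, st dev = {sd:.2f}, std error = {se:.2f}")
```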
The t-distribution
The t-distribution resembles a standard normal but with thicker ‘tails’
t-distributions are characterized by a feature called degrees of freedom
t-distributions with higher degrees of freedom more closely resemble the standard normal
t-distributions with various Degrees of Freedom
Excel: The t-distribution
The TDIST function requires three inputs:
− X (the function finds the area to the right of X)
− Deg_freedom
− Tails (inputting 1 tail finds the area to the right of X, 2 tails reports twice the area)
X must be a positive number
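For readers working outside Excel, the sketch below mimics the three inputs described above using only the Python standard library. It is an approximation, not Excel's implementation: `t_pdf`, `t_right_tail`, and `tdist` are illustrative names, and the tail area is found by numerically integrating the t density with Simpson's rule.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_right_tail(x, df, steps=2000):
    """Area to the right of x: 0.5 minus the Simpson's-rule integral on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def tdist(x, deg_freedom, tails):
    """Sketch of the TDIST inputs: 1 tail gives the area to the right of x,
    2 tails gives twice that area. x must not be negative."""
    if x < 0:
        raise ValueError("x must be a positive number")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return tails * t_right_tail(x, deg_freedom)

print(f"one tail:  {tdist(1.13, 35, 1):.3f}")
print(f"two tails: {tdist(1.13, 35, 2):.3f}")
```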
Excel: The inverse t-distribution
The TINV function requires two inputs:
− Probability
− Deg_freedom
The function reports the value, t, for which the two-tailed area (below −t plus above +t) equals the required probability for a t-dist with the specified d.f.
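A hedged sketch of the inverse lookup, again with only the standard library: it assumes the two-tailed convention (the input probability is the combined area in both tails) and searches for t by bisection. The names `t_pdf`, `t_right_tail`, and `tinv` are illustrative, and the search range of 0 to 50 is an assumption that covers typical cases.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_right_tail(x, df, steps=2000):
    """Area to the right of x: 0.5 minus the Simpson's-rule integral on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def tinv(probability, deg_freedom):
    """Find t such that the combined area in both tails equals `probability`."""
    lo, hi = 0.0, 50.0  # assumed search range
    for _ in range(50):
        mid = (lo + hi) / 2
        if 2 * t_right_tail(mid, deg_freedom) > probability:
            lo = mid  # tails still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2

print(f"t for 5% in two tails, 35 d.f.: {tinv(0.05, 35):.3f}")
```

With many degrees of freedom the result approaches the familiar 1.96 of the standard normal.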
Sampling Distribution
$\bar{X}$ is approximately normally distributed with a mean of µ and st dev of $\sigma/\sqrt{n}$
Since we never know the actual σ, we approximate it with the sample standard deviation, s.
$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a t-distribution with n-1 d.f.
Notation
$s_{\bar{X}} = s/\sqrt{n}$ is commonly used in statistics
We call this term the standard error of the mean
Interval Estimates
Our estimate of the true mean sales per store is 290.5
The standard error of the mean is 8.8
What proportion of samples like ours would be within 10 units of the true mean?
We can use the t-distribution to find out
The Computations
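$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$, where $s_{\bar{X}} = s/\sqrt{n}$
$\text{Prob}(-10 \le \bar{X} - \mu \le 10)$
$= \text{Prob}(-10/s_{\bar{X}} \le (\bar{X} - \mu)/s_{\bar{X}} \le 10/s_{\bar{X}})$
$= \text{Prob}(-10/s_{\bar{X}} \le t \le 10/s_{\bar{X}})$
$= \text{Prob}(-10/8.8 \le t \le 10/8.8)$
$= \text{Prob}(-1.13 \le t \le 1.13)$
= Area between -1.13 and 1.13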
Where does this fall on the t-distribution?
[Figure: t-distribution with 35 degrees of freedom; axis marks at -1.13, 0, and 1.13. Not to scale]
Let’s Do This in Excel
Find the probability of +/- 10 units
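The same area can be approximated outside Excel. The sketch below uses only the Python standard library and integrates the t density numerically; it assumes 35 degrees of freedom and the unrounded standard error of 8.8475 that appears later in the deck. The helper names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_right_tail(x, df, steps=2000):
    """Area to the right of x: 0.5 minus the Simpson's-rule integral on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

STD_ERROR = 8.8475  # standard error of the mean (unrounded; assumed from a later slide)
DF = 35             # degrees of freedom (n - 1)
HALF_WIDTH = 10     # "within 10 units"

t_value = HALF_WIDTH / STD_ERROR
prob_within = 1 - 2 * t_right_tail(t_value, DF)  # area between -t and +t

print(f"t = {t_value:.2f}")
print(f"probability of being within +/- 10 units: {prob_within:.1%}")
```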
Confidence
In this example, we say that we are 73% confident that the true mean lies within 10 units of our estimate.
We must use the word confidence instead of probability, as the randomness is associated with our estimator and not the true mean, which is not random at all.
Usually, we work backwards from a desired level of confidence and then find the range of the interval necessary to achieve that level.
95% Confidence Intervals
A 95% confidence interval takes on the form:
$\bar{X} \pm t_{\alpha/2,\,n-1}\, s_{\bar{X}}$
where $t_{\alpha/2,\,n-1}$ is the value needed to generate an area of α/2 in each tail of a t-distribution with n-1 degrees of freedom
Use the Excel formula CONFIDENCE.T for the margin $t_{\alpha/2,\,n-1}\, s_{\bar{X}}$ (the ± term)
CONFIDENCE.T uses the following:
− Alpha = 1 – Confidence you want
− Std Dev = Std Deviation (not the std error of the mean)
− Sample = sample size
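A minimal sketch of the interval calculation, taking the same three inputs as CONFIDENCE.T (alpha, standard deviation, sample size) and returning the margin on either side of the mean. Several values are assumptions: n = 36 is inferred from the 35 degrees of freedom, the standard deviation 53.085 is back-computed from the standard error 8.8475, and the function names are illustrative. The critical t value is found by bisection on a numerically integrated tail area.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_right_tail(x, df, steps=2000):
    """Area to the right of x: 0.5 minus the Simpson's-rule integral on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Value t with an area of alpha/2 in each tail, found by bisection."""
    lo, hi = 0.0, 50.0  # assumed search range
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_right_tail(mid, df) > alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def confidence_margin(alpha, std_dev, size):
    """Margin of error: critical t times the standard error of the mean."""
    return t_critical(alpha, size - 1) * std_dev / math.sqrt(size)

SAMPLE_MEAN = 290.54
SAMPLE_SD = 53.085  # assumed: 8.8475 * sqrt(36)
N = 36              # assumed sample size

margin = confidence_margin(0.05, SAMPLE_SD, N)
low, high = SAMPLE_MEAN - margin, SAMPLE_MEAN + margin
print(f"95% confidence interval: {low:.1f} to {high:.1f}")
```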
Test With Sample Data
Divide into groups
Work on one of the data sets
Find the Mean, Std Dev, Std Error of the Mean, and the 95% Confidence Intervals
Hypothesis Testing
Source for Hypothesis Testing: Dr Nicola Ward Petty and Creative Heuristics
We can say things about a population from a sample taken from the population
Steps of Hypothesis Testing
1. Hypotheses
2. Significance
3. Sample
4. P-value
5. Decide
Hypothesis Testing: Step 1: The Hypothesis
We are testing something about the underlying population parameters
Null includes the equality sign (=, ≥, or ≤)
H0: Null Hypothesis (everything else or the status quo)
Ha: Alternative Hypothesis (what you want to prove)
Test Marketing (Formally)
µ: average sales per week.
H0: µ is equal to or smaller than 275.
Ha: µ is greater than 275.
Hypothesis Testing, Step 2: Significance
Significance, or alpha (α), is generally set to 5%
It is the probability that the Null is rejected when it is really correct, that is, the probability of a Type I Error
Hypothesis Testing: Step 3: Sample
Take a sample and gather the statistics about the sample (like the mean, std dev, std error of the mean, etc.)
Hypothesis Testing, Step 4: P-Value
There are different ways to calculate the p-value depending on whether we are testing one mean or two
− One mean: Will the new packaging have sales greater than 275?
− Two means: Is the Blue Package better than the Green Package?
We will start with one mean.
To start, we calculate the test statistic:
$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$
The value for μ is the value in our Null hypothesis (we are testing to see if this is the true population value)
Hypothesis Testing: P-Value: Example with Packaging
Let’s Not Lose Track of the Intuition…
Is 290 larger than 275?
− What if sales had to be more than 400, more than 500, or more than 320? Would you be comfortable with our hypothesis?
How much larger is 290 than 275 relative to the statistics we have calculated?
− Hint: think about the standard deviation and the standard error of the mean
How do you feel about our test?
Hypothesis Testing: P-Value
[Figure: distribution of the sample mean; labels: 275, 290.54, St. Dev = 8.8475]
If 275 is the true mean (our Null Hypothesis), what is the chance we drew a sample with an average of 290.54 or more?
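One way to build intuition for this question is to simulate it. The sketch below is an illustration under stated assumptions, not the method used in the slides: it assumes the population is normal with mean 275, a sample size of 36 (from the 35 degrees of freedom), and a standard deviation of about 53 (back-computed from 8.8475 × √36). It then counts how often a random sample produces a t statistic at least as large as the one observed.

```python
import math
import random

NULL_MEAN = 275.0
OBSERVED_MEAN = 290.54
STD_ERROR = 8.8475
N = 36                             # assumed sample size
POP_SD = STD_ERROR * math.sqrt(N)  # assumed population st dev (about 53)

observed_t = (OBSERVED_MEAN - NULL_MEAN) / STD_ERROR

def simulate_p_value(reps=20000, seed=2012):
    """Fraction of simulated samples (drawn with the null true) whose
    t statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(reps):
        sample = [rng.gauss(NULL_MEAN, POP_SD) for _ in range(N)]
        mean = sum(sample) / N
        var = sum((x - mean) ** 2 for x in sample) / (N - 1)
        t_stat = (mean - NULL_MEAN) / math.sqrt(var / N)
        if t_stat >= observed_t:
            at_least_as_extreme += 1
    return at_least_as_extreme / reps

p_simulated = simulate_p_value()
print(f"observed t = {observed_t:.2f}")
print(f"simulated chance of a result this large: {p_simulated:.1%}")
```

The simulated fraction is an estimate and will vary slightly with the seed and the number of repetitions.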
Hypothesis Testing: P-Value: Formal Statement of Problem
µ: average sales per store
H0: µ is less than or equal to 275.
Ha: µ is greater than 275.
Hypothesis Testing: P-Value: Computations
Test Statistic = $t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{290.54 - 275}{8.8475} = 1.76$
Case: When the Null is ≤ and the sample mean is higher than the null value:
The p-value equals 1 − T.DIST(t, d.f., TRUE), or equivalently T.DIST.RT(t, d.f.)
Let’s test in Excel
Hypothesis Testing, Step 5: Decide: How to Use the P-Value
Significance
If p < Significance Level, Reject the Null
If p > Significance Level, Do Not Reject the Null
Hypothesis Testing: Decide: How to Use the P-Value
A low p-value (e.g. 4.4%) means reject the null.
1 minus the p-value is the maximum confidence in the alternative hypothesis.
Average Weekly Sales will exceed 275
Sales Distribution: How far away is 290 if the real mean is 275?
[Figure: t-distribution; axis marks at 0 and 1.7575; Area = 4.4%. Not drawn to scale]
H0: µ is less than or equal to 275.
Ha: µ is greater than 275.
Sales Distribution: How far away is 290 if the real mean is 285?
[Figure: t-distribution; axis marks at 0 and 0.6278; Area = 26.8%. Not drawn to scale]
H0: µ is less than or equal to 285.
Ha: µ is greater than 285.
Sales Distribution: How far away is 290 if the real mean is 265?
[Figure: t-distribution; axis marks at 0 and 2.89; Area = 0.3%. Not drawn to scale]
H0: µ is less than or equal to 265.
Ha: µ is greater than 265.
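The three cases (null values of 275, 285, and 265) can be compared side by side. The sketch below is an approximation using only the Python standard library: it assumes a sample mean of 290.54, a standard error of 8.8475, 35 degrees of freedom, and a 5% significance level, and it computes the right-tail area by numerical integration. Because it starts from the rounded sample mean, its t values may differ from the slide figures in the third decimal place. Function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_right_tail(x, df, steps=2000):
    """Area to the right of x: 0.5 minus the Simpson's-rule integral on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def one_sided_test(sample_mean, null_mean, std_error, df, alpha=0.05):
    """Test H0: mu <= null_mean against Ha: mu > null_mean.
    Returns (t statistic, p-value, whether the null is rejected)."""
    t_stat = (sample_mean - null_mean) / std_error
    p_value = t_right_tail(t_stat, df)
    return t_stat, p_value, p_value < alpha

results = {
    null_mean: one_sided_test(290.54, null_mean, 8.8475, 35)
    for null_mean in (275, 285, 265)
}

for null_mean, (t_stat, p_value, reject) in results.items():
    decision = "reject the null" if reject else "do not reject the null"
    print(f"null = {null_mean}: t = {t_stat:.2f}, p = {p_value:.1%}, {decision}")
```

The further the null value sits below the sample mean, measured in standard errors, the smaller the p-value becomes.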