Comparing Means (Part 1)

Dr Kirsten Challinor Acknowledgment to Andy Field chapter 9 Comparing Means

Transcript of Comparing Means (Part 1)

Page 1: Comparing Means (Part 1)

Dr Kirsten Challinor

Acknowledgment to Andy Field chapter 9

Comparing Means

Page 3: Comparing Means (Part 1)

Lecture Outline

• Hypothesis testing

• Comparing means from a linear model perspective

T-tests

• Dependent (aka paired, matched)

• Independent

Rationale for the tests

• Assumptions

Interpretation

Reporting results

Calculating an Effect Size

Categorical predictors in the linear model.

Page 4: Comparing Means (Part 1)

Hypothesis testing

Page 5: Comparing Means (Part 1)

http://statslc.com/

Statistics Learning Centre

Videos on hypothesis testing

https://www.youtube.com/playlist?list=PLm9FYjKtq7Pzjh7e727hSr8VvSR9OvqsZ

Useful website

Page 6: Comparing Means (Part 1)

The research process

Initial observation (research question/hypothesis)

Generate theory

Generate stats hypotheses

Collect data to test theory

Analyse data

Data

Identify variables

Measure variables

- Graph data

- Fit a model

Adapted from Field, A. Page 3

Page 7: Comparing Means (Part 1)

Initial Observation

•Find something that needs explaining

• Observe the real world

• Read other research

•Test the concept: collect data

• Collect data to see whether your hunch is correct

• To do this you need to define variables: anything that can be measured and can differ across entities or time.

- Analyse data

- Fit statistical model to the data

Page 8: Comparing Means (Part 1)

Types of research

Correlational research:

– Observing what naturally goes on in the world without directly interfering with it.

Cross-sectional research:

– This term implies that data come from people at different age points with different people representing each age point.

Experimental research:

– One or more variables are systematically manipulated to see their effect (alone or in combination) on an outcome variable.

– Statements can be made about cause and effect.

Page 9: Comparing Means (Part 1)

Research hypothesis

Hypothesis = A proposition for reasoning

= A suggestion as to why something might be as it is

= A prediction from a theory.

A testable statement of the state of the world.

Examples of testable and non-testable statements:

– The Beatles were the most influential band ever = Non-scientific statement.

– The Beatles were the best selling band ever = Scientific/testable statement.

Good theories produce hypotheses that are scientific statements.

Scientific statements are ones that can be verified with reference to empirical evidence.

Page 10: Comparing Means (Part 1)

Research hypothesis & Statistical hypotheses

Research hypothesis: Do ‘great’ supervisors produce better students?

Null hypothesis: No difference between students.

Experimental hypothesis: Students with highly rated supervisors will be rated better than students with lower rated supervisors.

Page 11: Comparing Means (Part 1)

The experimental hypothesis

•The hypothesis or prediction that comes from your theory is usually saying that an effect will be present. It says that there will be a difference between groups.

•This is called the

• Experimental hypothesis OR

• The alternative hypothesis (often preferred, because ‘experimental’ relates to a type of methodology)

•It is labelled like this: H1

•Examples of an experimental hypothesis:

• H1 = The Beatles have sold more records than Michael Jackson.

• H1 = The prevalence of dry eye is different in men and women.

• H1 = Students with highly rated supervisors will score better marks than students with lower rated supervisors.

Page 12: Comparing Means (Part 1)

The null hypothesis

•It is the opposite of the experimental hypothesis. It states that nothing interesting will happen.

•This states that there is no effect.

•It is labelled like this: H0

•Examples of a null hypothesis:

• H0 = The number of records sold by The Beatles and Michael Jackson will not be different.

• H0 = There is no difference in the prevalence of dry eye in men and women.

• H0 = There is no difference between the marks of students who had highly rated supervisors and those who had lower rated supervisors.

Page 13: Comparing Means (Part 1)

The null hypothesis

Remember that the null hypothesis does not necessarily state that the size of the effect is zero.

Page 14: Comparing Means (Part 1)

Why do we need the null hypothesis?

•The issue of truth.

•We can’t prove the truth, but we can talk in terms of probability.

•We can never really prove our hypothesis.

•We cannot prove the experimental (alternate) hypothesis using statistics, but we can reject the null hypothesis.

•If we get data that give us confidence to reject our null hypothesis, that gives us support for our experimental hypothesis.

•However, even if we reject our null hypothesis that still doesn’t prove our experimental hypothesis.

•A basic understanding of hypothesis testing is focused on accepting or rejecting the null hypothesis. However, what we really should talk about is ‘the chances of obtaining the data, assuming the null hypothesis is true’.

Page 15: Comparing Means (Part 1)
Page 16: Comparing Means (Part 1)

Summary of H1 and H0 Examples

Example: Music

Experimental hypothesis (H1): The Beatles have sold more records than Michael Jackson.

Null hypothesis (H0): The number of records sold by The Beatles and Michael Jackson will not be different.

Example: Optometry (Correlational research)

Experimental hypothesis (H1): The prevalence of dry eye is different in men and women.

Null hypothesis (H0): There is no difference in the prevalence of dry eye in men and women.

Example: Optometry (Experimental research)

Experimental hypothesis (H1): Students with highly rated supervisors will have higher marks than students with lower rated supervisors.

Null hypothesis (H0): There is no difference between the marks of students who had highly rated supervisors and those who had lower rated supervisors.

Page 17: Comparing Means (Part 1)

Example

Group A. Students with an ‘OK’ Supervisor

Name Student mark

Peter

Sarah

Alex

John

….

Mean Mean of Group A

Group B. Students with a ‘Great’ Supervisor

Name Student mark

Tom

Jill

Sally

Louise

….

Mean Mean of Group B

Page 18: Comparing Means (Part 1)

Logic behind testing the hypotheses

We evaluate our statistical hypotheses by thinking about the chance of getting the data we found if the null hypothesis were true.

Chance/ probability / likelihood

Experimental hypothesis (H1): Students with highly rated supervisors will have higher marks than students with lower rated supervisors.

Null hypothesis (H0): There is no difference between the marks of students who had highly rated supervisors and those who had lower rated supervisors.

Page 19: Comparing Means (Part 1)

Logic behind testing the hypothesis: example 1.

Let’s say that 75% of students with highly rated supervisors got high marks (say A or A+).

What are the chances that we got this result by accident?

Let’s consider the null hypothesis: No difference between students.

E.g. If the null hypothesis is true (that there is no difference between students), what are the chances that 75% of the students in the great supervisor group had high marks just by chance?

Not very likely. It is pretty unlikely that we accidentally got 75% of high mark students in the ‘great supervisor’ group.

Experimental hypothesis (H1): Students with highly rated supervisors will have higher marks than students with lower rated supervisors.

Null hypothesis (H0): There is no difference between the marks of students who had highly rated supervisors and those who had lower rated supervisors.

Page 20: Comparing Means (Part 1)

Our conclusion…

•Therefore we were unlikely to have gotten the data we did if the null hypothesis were true.

OR said differently:

•If there is no true difference between the groups, it is pretty unlikely that when we collected data we randomly got 75% of students with high marks in our ‘great supervisor’ group.

THEREFORE

A basic understanding = we reject the null hypothesis and feel that we have support for our experimental hypothesis*

So we think that it is not likely to be the case that there is no difference between the groups. It seems possible that the 2 groups of students are in fact different. We will need to look at the means of the groups to see the direction of the difference.

* remember that, even if we reject our null hypothesis that still doesn’t prove our experimental hypothesis.
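The informal reasoning in example 1 can be made concrete with a small calculation. The sketch below is illustrative only: the group size (20 students) and the proportion of students who would get high marks if supervisors made no difference (40%) are assumptions, not figures from the lecture. It asks how likely it would be to see at least 75% high marks in the ‘great supervisor’ group if the null hypothesis were true.

```python
from math import comb

def prob_at_least(k, n, p):
    """Binomial probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumed values (hypothetical, not from the lecture)
n_students = 20      # size of the 'great supervisor' group
null_rate = 0.40     # chance of a high mark if supervisors make no difference
observed_high = 15   # 75% of 20 students

p_chance = prob_at_least(observed_high, n_students, null_rate)
print(f"P(at least 75% high marks | H0 true) = {p_chance:.4f}")
```

Under these assumed values the probability is well below .05, which matches the informal judgement that the result is "not very likely" by chance. A different assumed base rate or group size would give a different answer.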

Page 21: Comparing Means (Part 1)

Logic behind testing the hypothesis: example 2.

But what if our result was not 75%, but something like 4%?

(Let’s say that 4% of students with highly rated supervisors got high marks of A or A+.)

If we assume the null hypothesis is true:

What is the chance that 4% of the students in the ‘great supervisor’ group got high marks just by random chance?

In my opinion 4% is a low percentage…

Well maybe this result could be a random accident.

…now it seems that getting 4% with high marks by chance is quite possible, so we might feel uncomfortable rejecting the null hypothesis, and we say that it is possible that the 2 groups are not different.

Experimental hypothesis (H1): Students with highly rated supervisors will have higher marks than students with lower rated supervisors.

Null hypothesis (H0): There is no difference between the marks of students who had highly rated supervisors and those who had lower rated supervisors.

Page 22: Comparing Means (Part 1)

Fitting statistical models to data

•We have looked at testing the hypotheses in a very informal manner by casually asking “what is the chance that we got a result by accident?”.

•Statistical testing formalises this process.

•It’s all about this logic of seeing if the data you got is a ‘real’ representation of the world or if it is just due to chance.

…oh and the Beatles are the best selling band of all time.

http://ifpi.org

Page 23: Comparing Means (Part 1)

NHST = Null hypothesis significance testing. p62

• Assume the null hypothesis is true (there is no effect).

• We fit a statistical model to the data that represents the alternative hypothesis. We then see how well it fits (in terms of the variance it explains).

• To determine how well the model fits the data, we calculate the probability (called the p-value) of getting that ‘model’ if the null hypothesis were true.

• If that probability is very small (usually we consider the criterion to be .05 or less) then we conclude that our model fits the data well (i.e. explains a lot of the variation in scores)…

… and we assume that our initial prediction is true: we gain confidence in the alternative hypothesis.

The basic principles of NHST

Page 24: Comparing Means (Part 1)


Testing the Model: ANOVA

Mean Squared Error

• Sums of Squares are total values.

• They can be expressed as averages.

• These are called Mean Squares, MS

• F is a measure of how much the model has improved the prediction of the outcome compared to the level of inaccuracy of the model.

• A good model has a large F (greater than 1).

F = MS_M / MS_R

• F

• t

• χ²

They all represent signal to noise ratios

A test statistic is a statistic for which we know how frequently different values occur.

They are all defined by an equation that enables us to calculate precisely the probability of a given score.

Test statistics
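As a rough illustration of F as a signal-to-noise ratio, the sketch below computes the model mean squares (MS_M), the residual mean squares (MS_R) and their ratio F for two small groups. The marks are hypothetical and were invented for this example. They are not data from the lecture.

```python
from statistics import mean

# Hypothetical marks for two groups of students (not data from the lecture)
group_a = [55, 60, 65, 70]   # 'OK' supervisor
group_b = [65, 70, 75, 80]   # 'great' supervisor
groups = [group_a, group_b]
all_marks = group_a + group_b
grand_mean = mean(all_marks)

# Model sum of squares: how far each group mean is from the grand mean
ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Residual sum of squares: how far each score is from its own group mean
ss_residual = sum((y - mean(g)) ** 2 for g in groups for y in g)

# Turn the totals into averages (mean squares) by dividing by degrees of freedom
df_model = len(groups) - 1
df_residual = len(all_marks) - len(groups)
ms_model = ss_model / df_model
ms_residual = ss_residual / df_residual

f_ratio = ms_model / ms_residual
print(f"MS_M = {ms_model:.2f}, MS_R = {ms_residual:.2f}, F = {f_ratio:.2f}")
```

With these made-up marks, the variation explained by group membership is several times larger than the leftover variation within groups, so F comes out greater than 1.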

Page 25: Comparing Means (Part 1)

• Typically kittens weigh 100g at birth

• Sometimes one is 150g. This is rare, so the probability of finding a 150g newborn kitten is very small.

• Conversely the probability of finding a 100g one is high.

• From our research of kitten births, we can now calculate the probability of finding a particular value.

• Like kittens, as test statistics get bigger, the probability of them occurring gets smaller.

Test statistics
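The idea that bigger test statistics are less probable can be checked numerically. The sketch below is one possible way to do it, using only the Python standard library: it integrates the density of Student's t distribution to get a two-tailed probability for several t values. The degrees of freedom (22) and the list of t values are assumptions chosen for illustration. The value 1.713 is the t reported later in this lecture.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_value, df, steps=2000):
    """Two-tailed p-value, using Simpson's rule on the interval [0, |t|]."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total     # area between -|t| and +|t|
    return 1 - central_area

df = 22  # assumed degrees of freedom, for illustration
for t_value in [0.5, 1.0, 1.713, 2.5, 4.0]:
    print(f"t = {t_value:5.3f}  ->  two-tailed p = {two_tailed_p(t_value, df):.3f}")
```

Each larger t value gives a smaller probability, which is the same pattern as the kitten weights: values far from what is typical are rare.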

Page 26: Comparing Means (Part 1)

Categorical Predictors in the Linear Model

Page 27: Comparing Means (Part 1)

Simplest experiment: one independent variable that is manipulated in only two ways & one outcome is measured.

- Experimental condition and a control condition

E.g., Is the movie Scream 2 scarier than the original Scream? We could measure heart rates (which indicate anxiety) during both films and compare them.

Question to the class:

What kind of variables are Scream 1 and 2 (interval, ordinal, etc.)?

This situation can be analysed with a t-test

Experiments

Page 28: Comparing Means (Part 1)


The Only Equation You Will Ever Need

The data we observe can be predicted from the model we choose to fit to the data, plus some amount of error.

Remember this…

Outcome_i = (Model) + error_i

Page 29: Comparing Means (Part 1)

Invisibility cloak experiment

Page 30: Comparing Means (Part 1)

• In this case the predictor is membership of one of two groups.

• We are predicting the number of mischievous acts from whether or not someone was wearing a cloak.

• This is regression with one dichotomous predictor.

• The b for the model will reflect the difference between the mean levels of mischief in the two groups.

• The resulting t-test will tell us whether the difference between the means is significantly different from zero.

Compare the differences between the means of two groups… a kind of regression

Page 31: Comparing Means (Part 1)

Outcome = Model + error

We can use a linear model to compare means (Cohen, 1968).

Y_i = (b0 + b1 × X_1i) + error_i

Mischief_i = (b0 + b1 × Cloak_i) + error_i

Use a dummy variable, 0 and 1, to represent cloak condition.

No cloak is coded as 0. Wearing a cloak is coded as 1.

Ignoring the error (also called the residual):

Mischief_i = b0 + b1 × Cloak_i

For the no cloak group:

MeanNoCloak = b0 + (b1 × 0)

b0 = MeanNoCloak = 3.75

The intercept is equal to the mean of the no cloak group.

For the cloak group:

MeanCloak = b0 + (b1 × 1)

MeanCloak = MeanNoCloak + b1

b1 = MeanCloak - MeanNoCloak

Therefore b1 represents the difference between group means. We have seen that when you run a regression, a t-test is used to ascertain whether the b1 value is equal to zero. In this context it will be testing whether the difference between group means is zero.
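The derivation above can be checked with a short calculation. The sketch below fits the dummy-coded regression by ordinary least squares using only the standard library. The individual mischief scores are an assumption: they are not listed in these slides and were chosen to be consistent with the summary values given (a no-cloak mean of 3.75 and a difference of 1.25 between the means).

```python
from math import sqrt
from statistics import mean

# Illustrative mischief scores (an assumption: not listed in the slides),
# chosen so the group means are 3.75 (no cloak) and 5.00 (cloak).
no_cloak = [3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5]
cloak = [4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5]

# Dummy coding: 0 = no cloak, 1 = cloak
x = [0] * len(no_cloak) + [1] * len(cloak)
y = no_cloak + cloak
n = len(y)

# Ordinary least squares for a single predictor
x_bar, y_bar = mean(x), mean(y)
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# t-statistic for b1: the estimate divided by its standard error
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
ms_residual = sum(e ** 2 for e in residuals) / (n - 2)
se_b1 = sqrt(ms_residual / s_xx)
t_b1 = b1 / se_b1

print(f"b0 = {b0:.2f} (mean of no-cloak group = {mean(no_cloak):.2f})")
print(f"b1 = {b1:.2f} (difference in means = {mean(cloak) - mean(no_cloak):.2f})")
print(f"t  = {t_b1:.3f}")
```

With these illustrative scores, the intercept equals the no-cloak mean and the slope equals the difference between the two group means, as the algebra predicts.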

Page 32: Comparing Means (Part 1)

Constant = b0 = 3.75 = same as the mean of the no cloak group

Regression coefficient = b1 = 1.25 = difference between the two group means

t-statistic = test of whether b1 is significantly different from zero, which is a test of the difference between means.

t = 1.713, p = .101.

As p > .05, it is not significant: we do not have evidence of a reliable difference between the two populations.

See p. 363 of the text.

Page 34: Comparing Means (Part 1)

Next: see Part 2 of Comparing Means

http://www.vias.org/tmdatanaleng/cc_test_compare_means.html