Inferring Sample Findings to the Population and Testing for Differences


Statistics versus Parameters

Values computed from samples are statistics.

Values computed from the population are parameters.

Use Greek letters when referring to parameters. Use Roman letters for statistics.

Inference and Statistical Inference

Inference: generalize about an entire class based on what you have observed about a small set of members of that class. Draw a conclusion from a small amount of evidence.

Statistical inference: sample size and sample statistics are used to make estimates of population parameters.

Hypothesis Testing

A statistical procedure used to reject or fail to reject a hypothesis based on sample information.

Steps in hypothesis testing:

1. Begin with a statement about what you believe exists in the population.

2. Draw a random sample and determine the sample statistic.

3. Compare the statistic with the hypothesized parameter.

4. Decide whether or not the sample supports the original hypothesis.

5. If the sample does not support the hypothesis, revise the hypothesis to be consistent with the sample's statistic.

Test of the Hypothesized Population Parameter Value

For a mean:

z = \frac{\bar{x} - \mu_H}{s_{\bar{x}}}

The sample mean is compared to the hypothesized mean. If the absolute value of z exceeds the critical value of z (e.g., 1.96), then we reject the hypothesis that the population mean is \mu_H.

For a proportion:

z = \frac{p - \pi_H}{s_p}

For example, we hypothesize that the average GPA for business majors is not the same as for Recreation majors.
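The z test for a hypothesized mean can be sketched in a few lines of Python. This is only an illustration: the function name `one_sample_z` and all of the numbers are hypothetical, and the standard error of the mean is estimated as the sample standard deviation divided by the square root of n.

```python
import math

def one_sample_z(sample_mean, hypothesized_mean, sample_sd, n):
    """Return z = (x_bar - mu_H) / s_xbar, with s_xbar = s / sqrt(n)."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Hypothetical numbers, for illustration only (not from the slides).
z = one_sample_z(sample_mean=3.10, hypothesized_mean=3.00, sample_sd=0.50, n=100)

# Two-tailed test at the .05 level: reject when |z| exceeds 1.96.
reject = abs(z) > 1.96
print(round(z, 2), reject)
```

With these made-up values the standard error is 0.05, so a sample mean 0.10 away from the hypothesized mean is two standard errors out and the hypothesis is rejected.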

Directional Hypotheses

Indicates the direction in which you believe the population parameter falls.

For example, the average GPA of business majors is higher than the average GPA of Recreation majors.

Note that we are now interested in the area under the curve on only one side of the mean.
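Because a directional hypothesis puts the whole rejection region in one tail, the critical z value changes. A minimal sketch using Python's standard-library `statistics.NormalDist` (the .05 level is an assumption carried over from the 1.96 example):

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Non-directional (two-tailed): alpha is split between both tails.
two_tailed_critical = std_normal.inv_cdf(1 - alpha / 2)

# Directional (one-tailed): all of alpha sits in one tail.
one_tailed_critical = std_normal.inv_cdf(1 - alpha)

print(round(two_tailed_critical, 2), round(one_tailed_critical, 3))
# prints: 1.96 1.645
```

A directional test therefore rejects at a smaller z (about 1.645 instead of 1.96), but only when the difference falls in the hypothesized direction.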

Interpretation

If the hypothesis about the population parameter is correct, then a high percentage of sample means must fall close to this value (i.e., about 95% fall within +/-1.96 standard errors).

Failure to support the hypothesis tells the hypothesizer that the assumptions about the population are in error.
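The claim that most sample means land close to a correct hypothesized value can be checked with a small simulation. This is a sketch under assumed conditions: a normal population with made-up mean and standard deviation, a known population standard deviation, and a fixed random seed so the run is repeatable.

```python
import math
import random

random.seed(0)

# Hypothetical population and sampling design, for illustration only.
MU, SIGMA, N, REPS = 50.0, 10.0, 40, 2000
standard_error = SIGMA / math.sqrt(N)

within = 0
for _ in range(REPS):
    # Draw one random sample and compute its mean.
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_mean = sum(sample) / N
    # Count the sample if its mean is within 1.96 standard errors of MU.
    if abs(sample_mean - MU) <= 1.96 * standard_error:
        within += 1

fraction_within = within / REPS
print(fraction_within)
```

The printed fraction should be close to .95, which is why a sample mean outside that band is treated as evidence against the hypothesized value.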

Testing for Differences Between Two Means

Ho: There is no difference between two means. (\mu_1 = \mu_2)

Ha: There is a difference between two means. (\mu_1 \neq \mu_2)

z = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}

s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
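As a rough illustration of the two formulas for the difference between two means, the following Python sketch computes the standard error of the difference and the resulting z. The function name `two_sample_z` and the group summaries are hypothetical.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Return (z, standard error of the difference between two means)."""
    # Standard error of the difference: sqrt(s1^2/n1 + s2^2/n2).
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se_diff
    return z, se_diff

# Hypothetical group summaries, for illustration only.
z, se_diff = two_sample_z(mean1=3.2, sd1=0.4, n1=64, mean2=3.0, sd2=0.3, n2=36)

# Two-tailed test at the .05 level.
reject_h0 = abs(z) > 1.96
print(round(se_diff, 4), round(z, 2), reject_h0)
```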

Testing for Differences Between Two Means: Example

Is there a statistically significant difference between men and women on how many movies they have seen in the last month?

Ho: There is no difference between two means. (\mu_W = \mu_M)

Ha: There is a difference between two means. (\mu_W \neq \mu_M)

Gender    N     Mean      St. Dev.
male      19    2.3684    1.98
female    13    2.5385    2.18

          t        df    Significance (2-tailed)
TOTAL     -.229    30    .820

Levene's test for equality of variance: F = .004, Sig. = .952

Example: Testing for Differences Between Two Means

Fail to reject the null hypothesis that the means are equal. Why?

Significance = .82

Reject the null only when the significance is lower than .05.

.82 > .05; therefore, fail to reject the null.

There is no statistically significant difference between men and women on how many movies they have seen in the last month.

Makes sense: look at the means (2.37 and 2.54).
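The reported t and df can be reproduced from the summary table alone. The sketch below assumes a pooled-variance (equal variances) t test, which is consistent with the non-significant Levene's test and with df = 30; the slides do not say which software option produced the output, so treat that as an assumption. The critical value 2.042 is the standard two-tailed .05 table value for 30 degrees of freedom.

```python
import math

# Summary statistics from the example table.
n_m, mean_m, sd_m = 19, 2.3684, 1.98   # male
n_f, mean_f, sd_f = 13, 2.5385, 2.18   # female

# Pooled variance (assumes equal population variances).
df = n_m + n_f - 2
pooled_var = ((n_m - 1) * sd_m ** 2 + (n_f - 1) * sd_f ** 2) / df

# Standard error of the difference and the t statistic.
se_diff = math.sqrt(pooled_var * (1 / n_m + 1 / n_f))
t_stat = (mean_m - mean_f) / se_diff

# Two-tailed critical t at the .05 level for df = 30 (table value).
T_CRITICAL_DF30 = 2.042
reject_h0 = abs(t_stat) > T_CRITICAL_DF30
print(round(t_stat, 3), df, reject_h0)
```

The computed t matches the reported -.229, and it is far inside the critical value, which agrees with the decision to fail to reject the null.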

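Small Sample Size - t-Test

Normal bell curve assumptions are invalid when sample sizes are 30 or less.

The alternative choice is the t-test.

The shape of the t distribution is determined by sample size (i.e., degrees of freedom).

df = n - 1 for a single sample. For two independent samples, df = n1 + n2 - 2 (in the movie example, 19 + 13 - 2 = 30).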

ANOVA

ANOVA = Analysis of Variance

Compares means across multiple groups.

ANOVA will tell you that at least one pair of means has a statistically significant difference, but not which one.

Assumptions: independence, normality, equality of variance (Levene test)

Analysis of Variance

Used when researchers want to compare the means of three or more groups.

ANOVA is used to facilitate the comparison.

Basic analysis: Does a statistically significant difference exist between at least two group means?

ANOVA does not communicate how many pairs of means are statistically significant in their differences.

Hypothesis Testing

Ho: There is no difference among the population means for the various groups.

Ha: At least two groups have different population means.

When MSBetween is significantly greater than MSWithin then we reject Ho.

F value

F = MSBetween/MSWithin

If F exceeds Critical F(df1, df2) then we reject Ho.
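To make the F ratio concrete, here is a minimal pure-Python sketch of a one-way ANOVA computed from raw scores. The function name `one_way_anova_f` and the three small groups are hypothetical; the critical value 5.14 is the standard .05 table value for F(2, 6).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical data for three groups, for illustration only.
f_value, df1, df2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

F_CRITICAL_2_6 = 5.14  # table value for F(2, 6) at the .05 level
reject_h0 = f_value > F_CRITICAL_2_6
print(round(f_value, 2), df1, df2, reject_h0)
```

Here MSBetween is 13 and MSWithin is 1, so F is about 13, well above the critical value, and Ho is rejected.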

Visual Representation

[Figure: distribution curves for Population 1, Population 2, and Population 3]

Appears that at least 2 populations have different means.

Visual Representation

[Figure: distribution curves for Population 3, Population 4, and Population 5]

Appears that populations do not have significantly different means.

Tests of Differences

Chi-square goodness-of-fit: Does some observed pattern of frequencies correspond to an expected pattern?

Z-test/t-test: Is there a significant difference between the means of two groups?

ANOVA: Is there a significant difference between the means of more than two groups?
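The slides do not show how the chi-square goodness-of-fit statistic is computed, so here is a short sketch of the usual formula, the sum of (observed - expected)^2 / expected over the categories. The counts are hypothetical and the expected pattern assumes equal frequencies; 5.991 is the standard .05 table value for 2 degrees of freedom.

```python
# Hypothetical observed counts for three categories, for illustration only.
observed = [30, 50, 20]

# Expected pattern: equal frequencies across the categories (an assumption).
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Chi-square statistic: sum of (O - E)^2 / E over the categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CHI2_CRITICAL_DF2 = 5.991  # table value at the .05 level for df = 2
reject_h0 = chi_square > CHI2_CRITICAL_DF2
print(round(chi_square, 2), df, reject_h0)
```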

When to Use Each Test

Chi-square goodness-of-fit: Both variables are categorical/nominal.

T-test: One variable is continuous; the other is categorical with two groups/categories.

ANOVA: One variable is continuous (i.e., interval or ratio); the other is categorical with more than two groups.
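The selection rules above amount to a small decision procedure. The helper below is a hypothetical sketch (the name `choose_test` and its parameters are not from the slides) that encodes only the three cases listed.

```python
def choose_test(outcome_is_continuous, n_groups):
    """Pick a test from the slide's rules.

    outcome_is_continuous: True if the measured variable is interval or ratio.
    n_groups: number of categories in the grouping variable.
    """
    if n_groups < 2:
        raise ValueError("need at least two groups/categories to compare")
    if not outcome_is_continuous:
        # Both variables are categorical/nominal.
        return "chi-square"
    if n_groups == 2:
        return "t-test"
    return "ANOVA"

print(choose_test(outcome_is_continuous=True, n_groups=2))
print(choose_test(outcome_is_continuous=True, n_groups=4))
print(choose_test(outcome_is_continuous=False, n_groups=3))
```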

How to Interpret a Significant p-value (p < .05)

Chi-square goodness-of-fit: "There is a significant difference in frequency of responses among the different groups (or categories)."

T-test: "The means (averages) of the 2 population groups are different on the characteristic being tested."

ANOVA: "The means of the (multiple) population groups are different - need post hoc test (e.g., Bonferroni) to determine exactly which group means are different from one another."
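As a rough sketch of what a Bonferroni post hoc adjustment involves: every pair of groups is compared, and the .05 level is divided by the number of comparisons. The group labels below are hypothetical.

```python
from itertools import combinations

# Hypothetical group labels, for illustration only.
groups = ["A", "B", "C", "D"]
overall_alpha = 0.05

# Every pair of groups gets its own comparison.
pairs = list(combinations(groups, 2))

# Bonferroni: divide the overall alpha by the number of comparisons.
alpha_per_comparison = overall_alpha / len(pairs)

print(len(pairs), round(alpha_per_comparison, 4))
```

With four groups there are six pairwise comparisons, so each pair would need a p-value below roughly .0083 to be called significant.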

Measuring Association

Is there any association (correlation) between two or more variables?

If so, what is the strength and direction of the correlation?

Can we predict one variable (dependent variable) based on its association with other variables (independent variables)?

Correlation Analysis

A statistical technique used to measure the closeness of the linear relationship between two or more variables.

Can offer evidence of causality, but is not enough to establish causality by itself (must also have evidence of knowledge/theory and correct sequence of variables).

Scatterplots can give a visual estimate of the correlation between two variables.

Regression Analysis

Simple regression: relates a single criterion (dependent) variable to a single predictor (independent) variable.

Multiple regression: relates a single criterion variable to multiple predictor variables.

All variables should be at least interval!

Correlation/Regression

Coefficient of Correlation (r): a measure of the strength of linear association between two variables. Also called "Pearson's r" or "product-moment" correlation. Ranges from -1 to +1.

Coefficient of Determination (r^2): the proportion of variance in the criterion explained by the fitted regression equation (of predictors).
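A small worked example may help connect r, r^2, and simple regression. The sketch below uses hypothetical paired data and the usual sums-of-squares formulas; the variable names are illustrative only.

```python
import math

# Hypothetical paired observations, for illustration only.
x = [1, 2, 3, 4, 5]   # predictor (independent variable)
y = [2, 4, 5, 4, 5]   # criterion (dependent variable)

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products around the means.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

# Pearson's r and the coefficient of determination.
r = s_xy / math.sqrt(s_xx * s_yy)
r_squared = r ** 2

# Simple regression line: y_hat = intercept + slope * x.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(round(r, 3), round(r_squared, 2), round(slope, 2), round(intercept, 2))
```

For these made-up data r is about .77 and r^2 is .60, meaning the fitted line accounts for 60% of the variance in y.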
