Students will understand the definition of mean, median...

Students will understand the definitions of mean, median, mode and standard deviation and be able to calculate these measures for a given set of numbers. Students will also understand why some measures of central tendency are more accurate than others.

1. Which non-descriptive research method would be the best way to show how brain damage affects people’s ability to form new memories?

a. Case study
b. Correlation
c. Survey
d. Naturalistic observation

2. Jocelyn is interested in a rare form of phobia. She is particularly interested in the factors associated with the development of this phobia. The research method that would be most useful for her is

a. Field study
b. Experiment
c. Case study
d. Naturalistic observation

3. Harvard Business School is famous for teaching MBA students about business by using the case study method. If a Harvard MBA tried to apply knowledge from a case study to a new situation, the MBA should keep in mind that case studies may not be

a. Detailed
b. Unique
c. Specific
d. Representative

4. The scatterplot to the right is most closely representative of which correlation coefficient?

a. +.10
b. -.40
c. +.89
d. +.45

Statistics

• Recording the results from our studies.

• Must use a common language so we all know what we are talking about.

Descriptive Statistics

• Just describes sets of data.

• You might create a frequency distribution.

• Bar graphs or histograms.

Tools for Describing Data

The bar graph is one simple display method, but even this tool can be manipulated.

Our brand of truck is better!

Our brand of truck is not so different…

Why is there a difference in the apparent result?

• The frequency (f) of a particular observation is the number of times the observation occurs in the data.

• The distribution of a variable is the pattern of frequencies of its observations.

• Frequency distributions are generally reported in tables or histograms.

• Histogram: A graph that consists of a series of columns, each having a class interval as its base and frequency of occurrence as its height.
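
A frequency distribution can be tallied in a few lines. This is a minimal sketch, assuming a made-up list of quiz scores (the data and the variable names are illustrative, not from the slides):

```python
from collections import Counter

# Hypothetical quiz scores, invented for illustration only.
scores = [7, 8, 8, 9, 10, 8, 7, 9, 8, 10]

# Counter maps each distinct observation to its frequency (f).
frequency = Counter(scores)

# Print the distribution as a small table, one row per observation.
for value in sorted(frequency):
    print(f"{value:>5}  f = {frequency[value]}")
```

Each row pairs an observation with the number of times it occurs, which is the table form of a frequency distribution.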

Frequency distribution and histograms

[Figure: histogram. x-axis: Time of day (7-8, 8-9, 9-10, 10-11, 11-12, 12-1, 1-2, 2-3, 3+). y-axis: Number of times calling out (0 to 12).]

Histograms vs. bar graphs

• “Histograms look a lot like bar graphs.”

• Think of histograms as "sorting bins." You have one variable, and you sort data by this variable by placing them into "bins."

• Then you count how many pieces of data are in each bin. The height of the rectangle you draw on top of each bin is proportional to the number of pieces in that bin.
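
The "sorting bins" idea can be sketched in code. This example assumes hypothetical tree heights and a bin width of 5 meters; a value that falls exactly on a boundary is placed in the higher bin, which is one common convention rather than the only one:

```python
# Hypothetical tree heights in meters, invented for illustration only.
heights = [2.5, 4.0, 6.1, 7.3, 8.8, 9.9, 11.2, 12.0, 14.5, 3.3]

BIN_WIDTH = 5  # each class interval spans 5 meters

# Count how many heights land in each bin; a bin is identified by its lower edge.
bin_counts = {}
for h in heights:
    lower_edge = int(h // BIN_WIDTH) * BIN_WIDTH
    bin_counts[lower_edge] = bin_counts.get(lower_edge, 0) + 1

# Draw a rough text histogram: bar length is proportional to the count.
for lower_edge in sorted(bin_counts):
    label = f"{lower_edge}-{lower_edge + BIN_WIDTH} m"
    print(f"{label:<8} {'#' * bin_counts[lower_edge]}")
```

The bars answer "how many trees are in each class of heights?", not "how tall is each tree?".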

We want to compare total revenues of five different companies.
Key question: What is the revenue for each company?

Bar graph

We want to compare heights of ten oak trees in a city park.
Key question: What is the height of each tree?

Bar graph

We have measured revenues of several companies. We want to compare numbers of companies that make from 0 to 10,000; from 10,000 to 20,000; from 20,000 to 30,000 and so on.
Key question: How many companies are there in each class of revenues?

Histogram

We have measured several trees in a city park. We want to compare numbers of trees that are from 0 to 5 meters high; from 5 to 10; from 10 to 15 and so on.
Key question: How many trees are there in each class of heights?

Histogram

Bar graph or Histogram? (Both allow you to compare groups.)

SCENARIO:

- You are trying to decide if you want to take a class in school based on how difficult the class is. You decide to use the grades of students who have taken the class previously as a measure of difficulty.

- What are some ways of looking at the data to make your decision?

Measures of Central Tendency

Median: The middle score in a rank-ordered distribution.

If the median score is 85%, would you consider this an easy class?

What if you found out that the grades were 42, 44, 50, 85, 85, 85, 85?

Is median a great measure of central tendency?
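
A quick check of the grades above, sketched with Python's standard `statistics` module (the variable names are illustrative):

```python
import statistics

# Grades from the slide.
grades = [42, 44, 50, 85, 85, 85, 85]

# The median is the middle score once the grades are rank-ordered.
median_grade = statistics.median(grades)

# Scores that sit below the median, which the median alone does not reveal.
below_median = [g for g in grades if g < median_grade]

print(f"median = {median_grade}")
print(f"{len(below_median)} of {len(grades)} grades fall below it: {below_median}")
```

The median says nothing about how far below the middle the lower scores fall.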

Mode: The most frequently occurring score in a distribution.

If you find a class with a mode of 86, would this be an easy class?

Here are the grades: 14, 25, 32, 45, 50, 60, 86, 86.

Is mode a great measure of central tendency?
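
The same kind of check for the mode, again a small sketch using the standard `statistics` module:

```python
import statistics

# Grades from the slide.
grades = [14, 25, 32, 45, 50, 60, 86, 86]

# The mode is the most frequently occurring score.
mode_grade = statistics.mode(grades)

# How many students actually earned the modal score?
times_seen = grades.count(mode_grade)

print(f"mode = {mode_grade}, earned by {times_seen} of {len(grades)} students")
```

A score only has to appear more often than any other to be the mode, even if most of the class scored far from it.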

Measures of Central Tendency

Mean: The arithmetic average of scores in a distribution, obtained by adding the scores and then dividing by the number of scores that were added together.

You have found a class with a mean of 85 and have decided that this must be an easy class.

The grades were: 70, 70, 100, 100. Would you feel confident that this is an easy class?
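
The definition of the mean, written out step by step for these four grades (a minimal sketch; the variable names are illustrative):

```python
# Grades from the slide.
grades = [70, 70, 100, 100]

# Add the scores, then divide by the number of scores.
total = sum(grades)
mean_grade = total / len(grades)

print(f"sum = {total}, n = {len(grades)}, mean = {mean_grade}")
print(f"Did anyone actually score {mean_grade:.0f}? {mean_grade in grades}")
```

The mean lands between two clusters of scores, so no student's grade matches it.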


• It is important to always note which measure of central tendency is being reported. If it is a mean, one must consider whether a few atypical scores could be distorting it, or causing a skewed distribution.

• Skewed distribution: When scores don’t distribute themselves evenly around the center. There are a few extremely high or low scores.

A Skewed Distribution

Central Tendency

• Mean, Median and Mode.

• Watch out for extreme scores or outliers.

Let’s look at the salaries of the employees at Dunder Mifflin Paper in Scranton:

$25,000 - Pam
$25,000 - Kevin
$25,000 - Angela
$75,000 - Andy
$75,000 - Dwight
$75,000 - Jim
$350,000 - Michael

Measures of central tendency are quick and easy, but outliers may distort the numbers.
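
To see how much one outlier moves each measure, the salaries can be run through Python's `statistics` module. This is a sketch, not part of the original slides; `multimode` is used because two salaries tie for most frequent:

```python
import statistics

# Salaries from the slide, in dollars.
salaries = {
    "Pam": 25_000, "Kevin": 25_000, "Angela": 25_000,
    "Andy": 75_000, "Dwight": 75_000, "Jim": 75_000,
    "Michael": 350_000,
}
values = list(salaries.values())

mean_salary = statistics.mean(values)
median_salary = statistics.median(values)
modes = statistics.multimode(values)  # every value tied for most frequent

# Recompute the mean with the one extreme salary left out.
values_without_outlier = [v for name, v in salaries.items() if name != "Michael"]
mean_without_outlier = statistics.mean(values_without_outlier)

print(f"mean   = ${mean_salary:,.0f}")
print(f"median = ${median_salary:,.0f}")
print(f"modes  = {modes}")
print(f"mean without Michael = ${mean_without_outlier:,.0f}")
```

The mean sits above six of the seven salaries, while the median stays at a salary that people actually earn.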

Normal Distribution

• In a normal distribution, the mean, median and mode are all the same.

The “Bell Curve”

Distributions

• Outliers skew distributions.

• If a group has one high outlier, the curve has a positive skew (contains more low scores).

• If a group has one low outlier, the curve has a negative skew (contains more high scores).

Measures of variation

• Averages from scores with low variability are more reliable than those with high variability.

• Range: Difference between the highest and lowest scores in a distribution. As with the mean, a few extreme high or low scores can produce a deceptively large range.

Measures of Variation

Standard Deviation: A computed measure of how much scores vary around the mean. Standard deviation uses information from each score, so it better represents data.

Standard Deviation

Score    Score - Mean    (Score - Mean)²
18       -6              36
20       -4              16
24        0               0
25        1               1
33        9              81

Mean: 24
Sum of (Score - Mean)²: 134

$\text{variance} = \frac{134}{5} = 26.8$

➢ 26.8 is the “variance.” Variance gauges the spread of scores within a sample.

➢ Standard deviation is the “square root of the variance.” (SD ≈ 5.18)
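
The worked example can be retraced in code. This sketch follows the slide's steps and, like the slide, divides by the number of scores (the population formula); note that many textbooks divide by n - 1 when estimating from a sample, which would give a larger value:

```python
import math

# Scores from the slide.
scores = [18, 20, 24, 25, 33]

# Step 1: the mean.
mean = sum(scores) / len(scores)

# Step 2: each score's deviation from the mean.
deviations = [s - mean for s in scores]

# Step 3: square each deviation and add them up.
squared_deviations = [d ** 2 for d in deviations]
sum_of_squares = sum(squared_deviations)

# Step 4: variance is the average squared deviation (dividing by n, as on the slide).
variance = sum_of_squares / len(scores)

# Step 5: standard deviation is the square root of the variance.
standard_deviation = math.sqrt(variance)

print(f"mean = {mean}, sum of squares = {sum_of_squares}")
print(f"variance = {variance:.1f}, SD = {standard_deviation:.2f}")
```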

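Normal Curve

[Figure: normal curve. x-axis: standard deviations from the mean (-3, -2, -1, 0, +1, +2, +3). Percentage of scores in each band, left to right: 0.15, 2.35, 13.5, 34, 34, 13.5, 2.35, 0.15. Brackets: 68% within ±1, 95% within ±2, 99.7% within ±3.]

- Each mark represents one standard deviation away from the mean.

- Numbers in red are the percentage of people whose score falls within each standard deviation.

- 68% of people will fall within 1 standard deviation of the mean.

- 95% of people will fall within 2 standard deviations of the mean.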
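
The 68%, 95% and 99.7% figures are rounded areas under the normal curve. A short sketch using `statistics.NormalDist` from the Python standard library (the helper name `share_within` is illustrative) shows the unrounded values:

```python
from statistics import NormalDist

# A normal curve with mean 0 and standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of scores within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {share_within(k):.2%}")
```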

Normal Curve

[Figure: normal curve. x-axis: score (9, 14, 19, 24, 29, 34, 39). Percentage of scores in each band, left to right: .15, 2.35, 13.5, 34, 34, 13.5, 2.35, .15. Brackets: 68%, 95%, 99.7%.]

- Using the numbers from our standard deviation exercise (mean of 24, standard deviation rounded to 5), the normal curve would look like this.

- 68% would have scored within one standard deviation of the mean, or between 19 and 29.

- 95% would have scored within two standard deviations, or between 14 and 34.
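
The marks on this curve are simply the mean plus or minus whole standard deviations. A minimal sketch, assuming the standard deviation is rounded to 5 so that the marks match the axis (9 through 39); the helper name `band` is illustrative:

```python
# Mean from the exercise; SD rounded to 5 (an assumption that matches the axis marks).
mean, sd = 24, 5

def band(k):
    """Lowest and highest score within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)

for k, share in ((1, "68%"), (2, "95%"), (3, "99.7%")):
    low, high = band(k)
    print(f"about {share} of scores fall between {low} and {high}")
```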

9/28/16

1. Create the Normal Curve template with data.

2. A shop foreman found it took 40 minutes to complete a task with a standard deviation of 5, and the times for completing the task are normally distributed. What percentage of workers will take 50 minutes or more to complete the task?

3. The scores from the AP Physics exam had an average of 82, with a standard deviation of 3. People who scored within 2 standard deviations of the mean had a score between ____ and _____

The three curves below represent standard deviations of 1, 2 and 3.

[Figure: three normal curves labeled A, B and C.]

Which curve below would represent a standard deviation of 1? How do you know?

Which curve would represent a standard deviation of 3? How do you know?

ESTIMATING VARIANCE

THE GREATER THE VARIANCE IN RESULTS, THE GREATER THE STANDARD DEVIATION.

Standard deviation, the normal curve and baseball.

http://www.learner.org/resources/series65.html

Weighing the odds…

• 2 high school punters

– Kicker A:

• Mean distance: 40.0 yds.

• Standard deviation: ±16 yds.

– Kicker B:

• Mean distance: 34.5 yds.

• Standard deviation: ±4 yds.

• Which player do you play?

• Kicker B – the team will know what to expect.

Applying the concepts

Try, with the help of the rough drawing below, to describe intelligence test scores at a high school and at a college using the concepts of range and standard deviation.

[Figure: two distribution curves, labeled "Intelligence test scores at a high school" and "Intelligence test scores at a college"; the axis is marked at 100.]

Want to take that class?

• So, if you were told that the mean in a class was 85%, with a standard deviation of 5, would you feel confident that you would get a “B”?

• The smaller the standard deviation, the more closely the scores are packed near the mean, and the steeper the curve would appear.

• What percentage of students got a B or A? 84% (assuming a B starts at 80%, one standard deviation below the mean: 34% + 50% = 84%)

• What would the standard deviation be if every score was the mean score? 0
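
The 84% figure can be checked against the normal curve. This sketch assumes the grades are normally distributed and that a B starts at 80%; both are assumptions, and `B_CUTOFF` is an illustrative name:

```python
from statistics import NormalDist

# Class grades modeled as a normal curve: mean 85, standard deviation 5.
class_scores = NormalDist(mu=85, sigma=5)

B_CUTOFF = 80  # assumption: a B starts at 80%, one SD below the mean

# Share of students at or above the cutoff = 1 minus the share below it.
share_b_or_better = 1 - class_scores.cdf(B_CUTOFF)

print(f"share with a B or an A: {share_b_or_better:.0%}")
```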

Z-scores

• Sometimes being able to compare scores from different distributions is important.

• Z-scores measure distance of a score from the mean in units of standard deviation.

• Scores below the mean have negative z-scores, and those above have positive z-scores.

• FOR EXAMPLE: Test: Mean = 80 & SD = 8

Phineas got a 72%: z-score of -1

Ferb got an 84%: z-score of +0.5
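
The calculation behind these two z-scores, as a minimal sketch (the function name `z_score` is illustrative):

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, in units of standard deviation."""
    return (score - mean) / sd

# Test from the slide: mean = 80, SD = 8.
phineas = z_score(72, mean=80, sd=8)
ferb = z_score(84, mean=80, sd=8)

print(f"Phineas: {phineas:+.1f}")
print(f"Ferb:    {ferb:+.1f}")
```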

Direction of a Z-score

• The sign of any Z-score indicates the direction of a score: whether that observation fell above the mean (the positive direction) or below the mean (the negative direction).

– If a raw score is below the mean, the z-score will be negative, and vice versa.

Comparing variables with very different observed units of measure

• Example of comparing an SAT score to an ACT score

– Mary’s ACT score is 26. Jason’s SAT score is 900. Who did better?

– The mean SAT score is 1000 with a standard deviation of 100 SAT points. The mean ACT score is 22 with a standard deviation of 2 ACT points.

Let’s find the z-scores

$z = \frac{\text{score} - \text{mean}}{SD}$

Jason: $z = \frac{900 - 1000}{100} = -1$

Mary: $z = \frac{26 - 22}{2} = +2$

• From these findings, we gather that Jason’s score is 1 standard deviation below the mean SAT score and Mary’s score is 2 standard deviations above the mean ACT score.

• Therefore, Mary’s score is relatively better.
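
The same comparison can be sketched with `statistics.NormalDist`, whose `zscore` method (available in Python 3.9 and later) applies the formula above. Treating both tests as normally distributed is an assumption made here for illustration:

```python
from statistics import NormalDist

# Each test described by its mean and standard deviation, as given on the slide.
sat = NormalDist(mu=1000, sigma=100)
act = NormalDist(mu=22, sigma=2)

# zscore(x) returns (x - mean) / SD for that distribution.
jason_z = sat.zscore(900)
mary_z = act.zscore(26)

# The higher z-score is the relatively better result.
better = "Mary" if mary_z > jason_z else "Jason"

print(f"Jason (SAT 900): z = {jason_z:+.1f}")
print(f"Mary  (ACT 26):  z = {mary_z:+.1f}")
print(f"Relatively better score: {better}")
```

Putting both scores on the same z scale is what makes the two tests comparable despite their different units.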

Interpreting the graph

• For any normally distributed variable:

– 50% of the scores fall above the mean and 50% fall below.

– Approximately 68% of the scores fall within plus and minus 1 Z-score from the mean.

– Approximately 95% of the scores fall within plus and minus 2 Z-scores from the mean.

– Approximately 99.7% of the scores fall within plus and minus 3 Z-scores from the mean.

Z-Score Conclusions

• Z-score is defined as the number of standard deviations from the mean.

• Z-score is useful in comparing variables with very different observed units of measure. (Like measures of central tendency and variation, z-scores can describe.)

HOWEVER

• Z-scores allow for precise predictions to be made of how many of a population’s scores fall within a score range in a normal distribution. (So they are also inferential, because they can infer what might happen in the future.)

Types of statistics

• Descriptive statistics are used to reveal patterns through the analysis of numeric data.

• Measures of central tendency
• Measures of variation
• Z-scores

• Inferential statistics are used to draw conclusions and make predictions based on the analysis of numeric data.

• Z-scores
• t-tests

– These types of stats help us determine if chance played a role in our findings.

When is a Difference Significant?

Statistically significant: A result is called statistically significant if it is unlikely to have occurred by chance.

The “magic number” is p ≤ .05.

This means that, if chance alone were at work, a result at least this extreme would be expected no more than 5% of the time.

Inferential Statistics

• The purpose is to discover whether the finding can be applied to the larger population from which the sample was collected.

p-values
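
One way to see what a p-value measures is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference as large as the one observed. This is a sketch of that technique, not the t-test named on the slides, and the two groups of scores are invented for illustration:

```python
import random

# Hypothetical test scores for two groups, invented for illustration only.
group_a = [85, 88, 90, 92, 95, 91, 89, 94]
group_b = [70, 72, 68, 75, 71, 69, 74, 73]

def average(values):
    return sum(values) / len(values)

def permutation_p_value(a, b, n_shuffles=10_000, seed=0):
    """Estimate how often random regrouping gives a difference in means
    at least as large as the observed one (in either direction)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(average(a) - average(b))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(average(pooled[:len(a)]) - average(pooled[len(a):]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles

p_value = permutation_p_value(group_a, group_b)
print(f"estimated p = {p_value:.4f}")
print(f"statistically significant at p <= .05? {p_value <= 0.05}")
```

When the two groups barely overlap, very few shuffles reproduce the observed gap, so the estimated p-value is small.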

Making Inferences

When is an Observed Difference Reliable?

1. Large, representative samples are better than biased samples.

2. Observations with low variability are more reliable than those with high variability.

3. Many cases that support your data are better than fewer cases.

POINT TO REMEMBER: Don’t be overly impressed by a few anecdotes. Generalizations based on a few unrepresentative cases are unreliable.
