Central Tendency and Variability The two most essential features of a distribution.
Uploaded by beverly-lloyd
![Page 1: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/1.jpg)
Central Tendency and Variability
The two most essential features of a distribution
![Page 2: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/2.jpg)
Questions
• Define:
  – Mean
  – Median
  – Mode
• What is the effect of distribution shape on measures of central tendency?
• When might we prefer one measure of central tendency to another?
![Page 3: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/3.jpg)
Questions (2)
• Define:
  – Range
  – Average Deviation
  – Variance
  – Standard Deviation
• When might we prefer one measure of variability to another?
• What is a z score?
• What is the point of Tchebycheff’s inequality?
![Page 4: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/4.jpg)
Variables have distributions
• A variable is something that changes or has different values (e.g., anger).
• A distribution is a collection of measures, usually across people.
• Distributions of numbers can be summarized with numbers (called statistics or parameters).
![Page 5: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/5.jpg)
Central Tendency refers to the Middle of the Distribution
![Page 6: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/6.jpg)
Variability is about the Spread
![Page 7: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/7.jpg)
1. Central Tendency: Mode, Median, & Mean
• The mode – the most frequently occurring score. For grouped data, it is the midpoint of the most populous class interval. Distributions can be bimodal or multimodal.
![Page 8: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/8.jpg)
Median
• Score that separates top 50% from bottom 50%
• With an even number of scores, the median is halfway between the two middle scores.
  – 1 2 3 4 | 5 6 7 8
  – Median is 4.5
• With an odd number of scores, the median is the middle score.
  – 1 2 3 4 5 6 7
  – Median is 4
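The even and odd cases above can be checked with a minimal Python sketch using the standard `statistics` module. The two median lists come from the slide; the list passed to `multimode` is made up here to illustrate a bimodal set of scores.

```python
from statistics import median, multimode

even_scores = [1, 2, 3, 4, 5, 6, 7, 8]
odd_scores = [1, 2, 3, 4, 5, 6, 7]

# Even N: the median is halfway between the two middle scores (4 and 5).
median_even = median(even_scores)
# Odd N: the median is the single middle score.
median_odd = median(odd_scores)

# multimode returns every most-frequent value, so bimodal data yields two modes.
# These scores are hypothetical, chosen only to show a tie.
modes = multimode([2, 3, 3, 4, 5, 5, 6])

print(median_even, median_odd, modes)
```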
![Page 9: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/9.jpg)
Mean
• Sum of scores divided by the number of people. Population mean is $\mu$ (mu) and sample mean is $\bar{X}$ (X-bar).
• We calculate the sample mean by:
$$\bar{X} = \frac{\sum X}{N}$$
• We calculate the population mean by:
$$\mu = \frac{\sum X}{N}$$
![Page 10: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/10.jpg)
Deviation from the mean
• Deviation score – deviation from the mean: $x = X - \mu$.
• Deviations sum to zero.
• Each row of raw scores below has a mean of 9.

| Raw scores ($X$) | Deviation scores ($x$) |
|---|---|
| 9 | 0 |
| 8 9 10 | -1 0 1 |
| 7 8 9 10 11 | -2 -1 0 1 2 |
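As a small illustration of the claim that deviations sum to zero, the sketch below recomputes the deviation scores for the three rows of raw scores on this slide. The function name `deviation_scores` is a hypothetical helper, not something from the slides.

```python
rows = [[9], [8, 9, 10], [7, 8, 9, 10, 11]]

def deviation_scores(scores):
    """Subtract the mean of the scores from each score."""
    mean = sum(scores) / len(scores)
    return [x - mean for x in scores]

all_deviations = [deviation_scores(row) for row in rows]

for raw, dev in zip(rows, all_deviations):
    # The last value printed on each line is the sum of the deviations.
    print(raw, dev, sum(dev))
```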
![Page 11: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/11.jpg)
Comparison of mean, median and mode
• Mode
  – Good for nominal variables
  – Good if you need to know most frequent observation
  – Quick and easy
• Median
  – Good for “bad” distributions
  – Good for distributions with arbitrary ceiling or floor
![Page 12: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/12.jpg)
Comparison of mean, median & mode
• Mean
  – Used for inference as well as description; best estimator of the parameter
  – Based on all data in the distribution
  – Generally preferred except for “bad” distributions. Most commonly used statistic for central tendency.
![Page 13: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/13.jpg)
Best Guess interpretations
• Mean – the average signed error will be zero.
• Mode – will be exactly right with the greatest frequency.
• Median – gives the smallest average absolute error.
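The three "best guess" readings can be checked numerically. The sketch below uses a small, made-up skewed sample; the helper names (`mean_signed_error`, `mean_absolute_error`, `exact_hits`) are hypothetical and exist only for this illustration.

```python
from statistics import mean, median, mode

scores = [1, 2, 2, 3, 12]  # made-up, positively skewed sample

def mean_signed_error(guess):
    """Average of (score - guess), keeping the sign."""
    return sum(x - guess for x in scores) / len(scores)

def mean_absolute_error(guess):
    """Average of |score - guess|."""
    return sum(abs(x - guess) for x in scores) / len(scores)

def exact_hits(guess):
    """How many scores the guess matches exactly."""
    return sum(1 for x in scores if x == guess)

m, md, mo = mean(scores), median(scores), mode(scores)

print("mean as guess:  ", m, mean_signed_error(m))
print("median as guess:", md, mean_absolute_error(md), "vs mean:", mean_absolute_error(m))
print("mode as guess:  ", mo, exact_hits(mo), "exact hits")
```

In this sample the median's average absolute error is smaller than the mean's, which is the pattern the slide describes.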
![Page 14: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/14.jpg)
Expectation
• Discrete and continuous variables
• Mean is expected value either way
• Discrete:
$$E(X) = \sum x\,p(x) = \text{mean of } X$$
• Continuous:
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \text{mean of } X$$
• (The integral looks bad but just means take the average)
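For the discrete case, the expected value is a probability-weighted sum. The sketch below assumes a fair six-sided die, an example that is not in the slides, and uses exact fractions so the result is not blurred by floating-point rounding.

```python
from fractions import Fraction

# Assumed example: a fair die, so each outcome has probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
p = {x: Fraction(1, 6) for x in outcomes}

# E(X) = sum over x of x * p(x)
expected_value = sum(x * p[x] for x in outcomes)

print(expected_value)         # exact fraction
print(float(expected_value))  # same value as a decimal
```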
![Page 15: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/15.jpg)
Influence of Distribution Shape
![Page 16: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/16.jpg)
Review
• What is central tendency?
• Mode
• Median
• Mean
![Page 17: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/17.jpg)
2. Variability aka Dispersion
• 4 Statistics: Range, Average Deviation, Variance, & Standard Deviation
• Range = high score minus low score.
  – 12 14 14 16 16 18 20
  – range = 20 - 12 = 8
• Average Deviation – mean of absolute deviations from the median:
$$AD = \frac{\sum |X - Md|}{N}$$
Note the difference between this definition and the undergraduate text's: deviation from the median vs. the mean.
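A minimal sketch of both statistics for the seven scores used in the range example. It follows this slide's definition of average deviation (absolute deviations from the median, not the mean); the variable names are illustrative.

```python
from statistics import median

scores = [12, 14, 14, 16, 16, 18, 20]

# Range: high score minus low score.
score_range = max(scores) - min(scores)

# Average deviation: mean absolute deviation from the median.
md = median(scores)
average_deviation = sum(abs(x - md) for x in scores) / len(scores)

print(score_range, md, average_deviation)
```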
![Page 18: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/18.jpg)
Variance
• Population Variance:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
• Where $\sigma^2$ means population variance, $\mu$ means population mean, and the other terms have their usual meaning.
• The variance is equal to the average squared deviation from the mean.
• To compute, take each score and subtract the mean. Square the result. Find the average over scores. Ta da! The variance.
![Page 19: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/19.jpg)
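Computing the Variance (N=5)

| $X$ | $\bar{X}$ | $X - \bar{X}$ | $(X - \bar{X})^2$ |
|---|---|---|---|
| 5 | 15 | -10 | 100 |
| 10 | 15 | -5 | 25 |
| 15 | 15 | 0 | 0 |
| 20 | 15 | 5 | 25 |
| 25 | 15 | 10 | 100 |
| Total: 75 | | 0 | 250 |
| Mean: | | | Variance is 50 |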
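The worked table can be reproduced in a few lines. This sketch computes the variance by hand, column by column, and then compares it with `statistics.pvariance`, which also divides by N rather than N - 1.

```python
from statistics import pvariance

scores = [5, 10, 15, 20, 25]
n = len(scores)

mean = sum(scores) / n                    # 75 / 5
deviations = [x - mean for x in scores]   # X minus the mean
squared = [d ** 2 for d in deviations]    # squared deviations
variance = sum(squared) / n               # average squared deviation

print(sum(scores), sum(deviations), sum(squared), variance)
print(pvariance(scores))  # library check, population formula
```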
![Page 20: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/20.jpg)
Standard Deviation
• Variance is average squared deviation from the mean.
• To return to original, unsquared units, we just take the square root of the variance. This is the standard deviation.
• Population formula:
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
![Page 21: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/21.jpg)
Standard Deviation
• Sometimes called the root-mean-square deviation from the mean. This name says how to compute it from the inside out.
• Find the deviation (difference between the score and the mean).
• Find the deviations squared.
• Find their mean.
• Take the square root.
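Reading "root-mean-square deviation" from the inside out gives the four steps above directly. The sketch below applies them to the five scores 5, 10, 15, 20, 25 from the worked example in this deck and checks the result against `statistics.pstdev`.

```python
import math
from statistics import pstdev

scores = [5, 10, 15, 20, 25]
mean = sum(scores) / len(scores)

deviations = [x - mean for x in scores]       # 1. deviation from the mean
squares = [d ** 2 for d in deviations]        # 2. square each deviation
mean_square = sum(squares) / len(squares)     # 3. mean of the squares (the variance)
sd = math.sqrt(mean_square)                   # 4. square root

print(round(sd, 2), round(pstdev(scores), 2))
```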
![Page 22: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/22.jpg)
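Computing the Standard Deviation (N=5)

| $X$ | $\bar{X}$ | $X - \bar{X}$ | $(X - \bar{X})^2$ |
|---|---|---|---|
| 5 | 15 | -10 | 100 |
| 10 | 15 | -5 | 25 |
| 15 | 15 | 0 | 0 |
| 20 | 15 | 5 | 25 |
| 25 | 15 | 10 | 100 |
| Total: 75 | | 0 | 250 |
| Mean: | | | Variance is 50 |
| Sqrt: | | | SD is $\sqrt{50} = 7.07$ |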
![Page 23: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/23.jpg)
Example: Age Distribution
[Figure: histograms of age (x-axis: age, 10 to 50; y-axis: frequency, 0 to 16). Panel titles: "Distribution of Age" (Mean = 25.73); "Average Distance from Mean" (SD = 6.47); "Central Tendency, Variability, and Shape" (Median = 23, Mode = 21).]
![Page 24: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/24.jpg)
Review
• Range
• Average deviation
• Variance
• Standard Deviation
![Page 25: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/25.jpg)
Standard or z score
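• A z score indicates distance from the mean in standard deviation units. Formula:
$$z = \frac{X - \bar{X}}{S} \quad \text{(sample)} \qquad z = \frac{X - \mu}{\sigma} \quad \text{(population)}$$
• Converting to standard or z scores does not change the shape of the distribution. Z scores are standardized, not normalized: a skewed distribution stays skewed.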
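A short sketch of the conversion, using the scores 5, 10, 15, 20, 25 from the variance example and the population formulas. After conversion the z scores have mean 0 and standard deviation 1, while their spacing relative to one another is unchanged.

```python
from statistics import mean, pstdev

scores = [5, 10, 15, 20, 25]
mu = mean(scores)
sigma = pstdev(scores)  # population SD, divides by N

# z = (X - mean) / SD for every score
z = [(x - mu) / sigma for x in scores]

print([round(v, 2) for v in z])
print(round(mean(z), 2), round(pstdev(z), 2))
```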
![Page 26: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/26.jpg)
Tchebycheff’s Inequality (1)
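• General form:
$$p(|X - \mu| \ge b) \le \frac{\sigma^2}{b^2}$$
Suppose we know mean height in inches is 66 and SD is 4 inches. We assume nothing about the shape of the distribution of height. What is the largest possible probability of finding people taller than 74 inches? (Note that b is a deviation from the mean; in this case 74 - 66 = 8.) Also, 74 inches is 2 SDs above the mean; therefore, z = 2.
$$p \le \frac{\sigma^2}{b^2} = \frac{4^2}{8^2} = \frac{16}{64} = .25$$
The bound covers deviations of 8 inches or more in either direction, so the probability of being taller than 74 inches is at most .25.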
[If we assume height is normally distributed, p is much smaller. But we will get to that later.]
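The height example can be worked as a quick calculation. The function name `chebyshev_bound` is a hypothetical helper; the normal-curve figure is included only to illustrate the bracketed remark above and relies on the added assumption that height is normally distributed.

```python
from statistics import NormalDist

def chebyshev_bound(sd, b):
    """Upper bound on P(|X - mean| >= b), valid for any distribution shape."""
    return sd ** 2 / b ** 2

mean_height, sd_height = 66, 4
b = 74 - mean_height  # deviation from the mean, in inches

bound = chebyshev_bound(sd_height, b)

# Under the extra assumption of normality, the upper tail beyond 74 inches:
normal_upper_tail = 1 - NormalDist(mu=mean_height, sigma=sd_height).cdf(74)

print(bound, round(normal_upper_tail, 4))
```

The Tchebycheff figure is a ceiling that holds for every distribution, which is why it is so much larger than the value obtained once a specific shape is assumed.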
![Page 27: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/27.jpg)
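Tchebycheff (2)
• Z-score form
• Probability of a z score from any distribution being more than k SDs from the mean is at most $1/k^2$:
$$p\left(\frac{|X - \mu|}{\sigma} \ge k\right) \le \frac{1}{k^2}, \qquad \text{i.e.,} \quad p(|z| \ge k) \le \frac{1}{k^2}$$
• For the problem in the previous slide:
$$p(|z| \ge 2) \le \frac{1}{k^2} = \frac{1}{2^2} = .25$$
• Z-scores from the worst distributions are rarely more than 5 or less than -5.
• For symmetric, unimodal distributions, |z| is rarely more than 3:
$$p(|z| \ge k) \le \frac{4}{9} \cdot \frac{1}{k^2}$$
$$p(|z| \ge 3) \le \frac{4}{9} \cdot \frac{1}{3^2} \approx .05$$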
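To see how quickly the two bounds shrink, the sketch below tabulates them for a few values of k. The function names are hypothetical; the first is the z-score form for any distribution and the second applies the 4/9 factor this slide gives for symmetric, unimodal distributions.

```python
def any_distribution_bound(k):
    """Upper bound on P(|z| >= k) with no assumption about shape."""
    return 1 / k ** 2

def symmetric_unimodal_bound(k):
    """Tighter upper bound on P(|z| >= k) for symmetric, unimodal distributions."""
    return (4 / 9) * (1 / k ** 2)

for k in (2, 3, 5):
    print(k, round(any_distribution_bound(k), 3), round(symmetric_unimodal_bound(k), 3))
```

At k = 5 the general bound is .04, which matches the remark that even the worst distributions rarely produce |z| above 5.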
![Page 28: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/28.jpg)
Review
• Z-score in words
• Z-score in symbols
• Meaning of Tchebycheff’s theorem
![Page 29: Central Tendency and Variability The two most essential features of a distribution.](https://reader030.fdocuments.us/reader030/viewer/2022032523/56649d8d5503460f94a761ea/html5/thumbnails/29.jpg)
Median House Price Data
• Find data
• Show Univariate
• Show plots