Probability and Statistics
![Page 1: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/1.jpg)
Probability and Statistics
1.3 The Normal Distributions
![Page 2: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/2.jpg)
Density Curve
A density curve is a smooth function meant to approximate a histogram.
The area under a density curve is one.
Since the density curve represents the entire distribution, the area under the curve on any interval represents the proportion of observations in that interval.
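As a rough illustration of "area equals proportion", the sketch below integrates a hypothetical triangular density on [0, 2] with the trapezoid rule. The density, the function names `density` and `area`, and the step count are all assumptions made for this example; they do not come from the slides.

```python
def density(x: float) -> float:
    """Hypothetical triangular density on [0, 2], peaking at x = 1."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0


def area(a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the area under density() on [a, b] with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (density(a) + density(b))
    for i in range(1, steps):
        total += density(a + i * width)
    return total * width


total_area = area(0.0, 2.0)   # area under the whole curve
left_share = area(0.0, 0.5)   # proportion of observations between 0 and 0.5
print(round(total_area, 4), round(left_share, 4))
```

The total area comes out at about 1, and the area over [0, 0.5] at about 0.125, meaning roughly 12.5% of observations from this made-up distribution would fall in that interval.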
![Page 3: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/3.jpg)
Density Curve
![Page 4: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/4.jpg)
Density Curves: Properties
![Page 5: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/5.jpg)
Density Curves
The mean of a density curve is the point at which the curve would balance.
The median of a density curve is the equal-areas point. In other words, the areas under the curve on either side of the median are equal.
For symmetric density curves, the balance point (mean) and the equal-areas point (median) are the same.
![Page 6: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/6.jpg)
Definitions
Symmetric: Data is symmetric if the left half of its histogram (or density curve) is roughly a mirror image of its right half.
Skewed: Data is skewed if its histogram (or density curve) is not symmetric and extends more to one side than the other.
![Page 7: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/7.jpg)
Skewness
SYMMETRIC: Mode = Mean = Median
SKEWED LEFT (negatively): typically Mean, Median, Mode, in that order from left to right
SKEWED RIGHT (positively): typically Mode, Median, Mean, in that order from left to right
![Page 8: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/8.jpg)
Characterization
A normal distribution is bell-shaped and symmetric.
The distribution is determined by the mean, mu (μ), and the standard deviation, sigma (σ).
The mean controls the center and the standard deviation controls the spread.
Note: These two density curves have the same mean but different standard deviations.
![Page 9: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/9.jpg)
68-95-99.7 Rule
For any normal curve with mean μ and standard deviation σ:
About 68 percent of the observations fall within one standard deviation of the mean. (μ – 1σ < x < μ + 1σ)
About 95 percent of the observations fall within two standard deviations of the mean. (μ – 2σ < x < μ + 2σ)
About 99.7 percent of the observations fall within three standard deviations of the mean. (μ – 3σ < x < μ + 3σ)
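The three percentages in the rule are rounded values. A minimal sketch of where they come from, using the standard identity that the share of a normal distribution within k standard deviations of the mean is erf(k/√2); the function name `share_within` is an assumption for this example:

```python
from math import erf, sqrt


def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2.0))


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 2))
```

The unrounded figures are close to 68.27%, 95.45%, and 99.73%, which is why the rule is stated as 68-95-99.7.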
![Page 10: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/10.jpg)
Waiting Times of Bank Customers at Different Banks (in minutes)

| Jefferson Valley Bank | Bank of Providence |
|---|---|
| 6.5 | 4.2 |
| 6.6 | 5.4 |
| 6.7 | 5.8 |
| 6.8 | 6.2 |
| 7.1 | 6.7 |
| 7.3 | 7.7 |
| 7.4 | 7.7 |
| 7.7 | 8.5 |
| 7.7 | 9.3 |
| 7.7 | 10.0 |

| | Jefferson Valley Bank | Bank of Providence |
|---|---|---|
| Mean | 7.15 | 7.15 |
| Median | 7.20 | 7.20 |
| Mode | 7.7 | 7.7 |
| Midrange | 7.10 | 7.10 |

What is the standard deviation of the data from Jefferson Valley Bank? From Bank of Providence?
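One way to check the summary table and answer the question is with Python's standard `statistics` module. This is a sketch, not part of the original slides; the variable names and the `summarize` helper are assumptions, and the ten values per bank are treated as samples (n - 1 in the denominator).

```python
import statistics

jefferson_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def summarize(times):
    """Return the measures of center from the table plus the sample standard deviation."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "mode": statistics.mode(times),
        "midrange": (min(times) + max(times)) / 2,
        "stdev": statistics.stdev(times),  # sample standard deviation (divides by n - 1)
    }


jv = summarize(jefferson_valley)
bp = summarize(providence)
print(round(jv["stdev"], 2), round(bp["stdev"], 2))
```

Both banks share the same mean, median, mode, and midrange, yet the sample standard deviations come out at about 0.48 minutes for Jefferson Valley and about 1.82 minutes for Providence. The measures of center alone hide that difference in spread.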
![Page 11: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/11.jpg)
Dotplots of Waiting Times
Visually, which one has the greater spread?
![Page 12: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/12.jpg)
Measures of Variation
Range = highest value – lowest value
![Page 13: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/13.jpg)
Measures of Variation
Standard Deviation
a measure of variation of the scores about the mean
(a kind of average deviation from the mean)
![Page 14: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/14.jpg)
Sample Standard Deviation Formula

$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$

Calculators can compute the sample standard deviation of data.
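The formula can be followed step by step in code. The sketch below uses a tiny hypothetical data set (1, 3, 14), chosen only because the arithmetic is easy to check by hand; the function name `sample_std` is likewise an assumption.

```python
from math import sqrt


def sample_std(values):
    """Sample standard deviation: sqrt of the sum of squared deviations over (n - 1)."""
    n = len(values)
    mean = sum(values) / n                                   # x-bar
    squared_deviations = [(x - mean) ** 2 for x in values]   # (x - x-bar)^2
    return sqrt(sum(squared_deviations) / (n - 1))


# Hypothetical data: mean is 6, deviations are -5, -3, 8, squares sum to 98.
s = sample_std([1, 3, 14])
print(s)
```

By hand: 98 / (3 - 1) = 49, and the square root of 49 is 7, which matches the value the function returns.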
![Page 15: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/15.jpg)
Symbols for Standard Deviation

| | Textbook | Some graphics calculators | Some non-graphics calculators |
|---|---|---|---|
| Sample | s | Sx | xσn-1 |
| Population | σ | σx | xσn |

Articles in professional journals and reports often use SD for standard deviation and VAR for variance.
![Page 16: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/16.jpg)
Understanding Standard Deviation
Spot the Jack Russell weighs 19 pounds. The mean weight for a Jack Russell Terrier is 16 pounds with a std dev of 1.5 pounds. Desdi the Maine Coon cat also weighs 19 pounds and frequently kicks Spot’s butt around the house. The mean weight for a Maine Coon is 17 pounds with a std dev of 0.75 pounds. Which animal is most in need of a diet?
![Page 17: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/17.jpg)
Understanding Standard Deviation
The only way to compare values in different units is to standardize the deviations from the means. In other words, we first have to convert all of the values into similar units: standard deviations from the respective means. THEN we can compare them directly. This is done through the application of a z-score:

$$ z = \frac{y - \bar{y}}{s} $$

where $y$ is the value of interest, $\bar{y}$ is the mean of the data, and $s$ is the standard deviation of the data.
![Page 18: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/18.jpg)
Understanding Standard Deviation
z-score
is unitless: the deviation from the mean and the standard deviation carry the same units, so the units cancel in the ratio
represents the number of standard deviations a given value in the data is from the mean
![Page 19: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/19.jpg)
Understanding Standard Deviation
Spot the Jack Russell weighs 19 pounds. The mean weight for a Jack Russell Terrier is 16 pounds with a std dev of 1.5 pounds. Desdi the Maine Coon cat also weighs 19 pounds and frequently kicks Spot’s butt around the house. The mean weight for a Maine Coon is 17 pounds with a std dev of 0.75 pounds. Which animal is most in need of a diet?
z-score for Spot: $z = \frac{19 - 16}{1.5} = 2$
z-score for Desdi: $z = \frac{19 - 17}{0.75} = 2.67$
Desdi is farther from the mean for the typical weight of her breed than Spot is from his breed.
What can you say about the spread of weights for the two breeds?
Can you think of any extraneous factor that could explain Desdi’s weight other than being overweight?
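The two z-scores can be reproduced with a one-line helper. This is a minimal sketch; the function name `z_score` and its parameter names are assumptions for illustration.

```python
def z_score(value: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev


spot_z = z_score(19, 16, 1.5)     # Jack Russell: mean 16 lb, std dev 1.5 lb
desdi_z = z_score(19, 17, 0.75)   # Maine Coon: mean 17 lb, std dev 0.75 lb
print(spot_z, round(desdi_z, 2))
```

Both animals weigh 19 pounds, but Desdi's weight sits about 2.67 standard deviations above her breed's mean, compared with 2 for Spot.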
![Page 20: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/20.jpg)
Understanding Standard Deviation
Spot: z = 2
Desdi: z = 2.67
What percent of Jack Russell terriers weigh less than Spot? more?
What percent of Maine Coon cats weigh less than Desdi? more?
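If breed weights are assumed to be approximately normal (an assumption, since the slides do not say so explicitly), these percentages can be read from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

spot_below = standard_normal.cdf(2.0)     # share of Jack Russells lighter than Spot
desdi_below = standard_normal.cdf(2.67)   # share of Maine Coons lighter than Desdi

for name, below in (("Spot", spot_below), ("Desdi", desdi_below)):
    print(f"{name}: {100 * below:.1f}% weigh less, {100 * (1 - below):.1f}% weigh more")
```

Under that assumption, about 97.7% of Jack Russells weigh less than Spot and about 99.6% of Maine Coons weigh less than Desdi. The 68-95-99.7 rule gives a quick estimate of roughly 97.5% for Spot, since about 95% of values lie within two standard deviations and the remaining 5% splits evenly between the tails.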
![Page 21: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/21.jpg)
Using z-score and the normal distribution
Suppose it takes you an average of 20 minutes to drive to school, with a standard deviation of 2 minutes, and that your drive times are approximately normally distributed.
• How often will you arrive at school in less than 22 minutes?
• How often will it take you more than 24 minutes?
• 75% of the time you will arrive in x minutes or less. Solve for x.
• 43% of the time you will arrive in y minutes or more. Solve for y.
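A sketch of how these four questions could be answered numerically, assuming drive times follow a normal distribution with mean 20 and standard deviation 2. It uses `statistics.NormalDist` from the Python standard library; the variable names are assumptions for this example.

```python
from statistics import NormalDist

drive = NormalDist(mu=20, sigma=2)  # assumed normal model of drive time in minutes

p_under_22 = drive.cdf(22)          # P(time < 22), z = 1
p_over_24 = 1 - drive.cdf(24)       # P(time > 24), z = 2
x_75 = drive.inv_cdf(0.75)          # 75% of drives take x minutes or less
y_43 = drive.inv_cdf(1 - 0.43)      # 43% of drives take y minutes or more

print(round(p_under_22, 4), round(p_over_24, 4), round(x_75, 2), round(y_43, 2))
```

The last question asks for an upper-tail cutoff, so it is converted to a lower-tail probability first: if 43% of drives take y minutes or more, then 57% take less than y. The results are roughly 84% and 2.3% for the first two questions, with x near 21.35 minutes and y near 20.35 minutes.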
![Page 22: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/22.jpg)
Measures of Variation
Variance: standard deviation squared
Notation: $s^2$ (sample) or $\sigma^2$ (population)
![Page 23: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/23.jpg)
Variance

Sample Variance

$$ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} $$

Population Variance

$$ \sigma^2 = \frac{\sum (x - \mu)^2}{N} $$
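The only difference between the two formulas is the denominator. As a sketch, the Jefferson Valley waiting times can be run through both, using `statistics.variance` (divides by n - 1) and `statistics.pvariance` (divides by N). Treating the same ten values first as a sample and then as a whole population is an assumption made purely for comparison.

```python
import statistics

waits = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]

sample_var = statistics.variance(waits)        # sum of squared deviations / (n - 1)
population_var = statistics.pvariance(waits)   # sum of squared deviations / N

print(round(sample_var, 4), round(population_var, 4))
```

The sum of squared deviations is 2.045, so the sample variance is 2.045 / 9 (about 0.2272) and the population variance is 2.045 / 10 (0.2045). The sample version is always the larger of the two for the same data.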
![Page 24: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/24.jpg)
Round-off Rule for measures of variation
Carry at least one more decimal place than is present in the original set of values.
Round only the final answer, never in the middle of a calculation.
![Page 25: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/25.jpg)
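Estimation of Standard Deviation
Range Rule of Thumb
Usual values run from $\bar{x} - 2s$ (minimum usual value) to $\bar{x} + 2s$ (maximum usual value), so

$$ \text{Range} \approx 4s \quad \text{or} \quad s \approx \frac{\text{Range}}{4} = \frac{\text{highest value} - \text{lowest value}}{4} $$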
![Page 26: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/26.jpg)
Usual Sample Values
minimum ‘usual’ value ≈ (mean) - 2 (standard deviation), that is, minimum ≈ $\bar{x} - 2s$
maximum ‘usual’ value ≈ (mean) + 2 (standard deviation), that is, maximum ≈ $\bar{x} + 2s$
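To see how rough the range rule of thumb is, the sketch below applies it to the Bank of Providence waiting times and compares the estimate with the computed sample standard deviation. It then uses the computed value to find the "usual" limits. Variable names are assumptions for illustration.

```python
import statistics

waits = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

value_range = max(waits) - min(waits)   # highest value - lowest value
estimated_s = value_range / 4           # range rule of thumb: s is roughly range / 4
actual_s = statistics.stdev(waits)      # sample standard deviation for comparison

mean = statistics.mean(waits)
usual_min = mean - 2 * actual_s         # minimum "usual" value
usual_max = mean + 2 * actual_s         # maximum "usual" value

print(round(estimated_s, 2), round(actual_s, 2), round(usual_min, 2), round(usual_max, 2))
```

The rule estimates s at about 1.45 minutes against a computed value of about 1.82, which is the kind of gap to expect from a quick estimate. The usual range works out to roughly 3.51 to 10.79 minutes, so none of the ten waiting times would count as unusual.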
![Page 27: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/27.jpg)
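The Empirical Rule (applies to bell-shaped distributions)
About 68% of data are within 1 standard deviation of the mean ($\bar{x} - s$ to $\bar{x} + s$): about 34% on each side of the mean.
About 95% of data are within 2 standard deviations of the mean ($\bar{x} - 2s$ to $\bar{x} + 2s$): a further 13.5% on each side.
About 99.7% of data are within 3 standard deviations of the mean ($\bar{x} - 3s$ to $\bar{x} + 3s$): a further 2.4% on each side, leaving about 0.1% in each tail.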
![Page 28: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/28.jpg)
Chebyshev’s Theorem
applies to distributions of any shape.
the proportion (or fraction) of any set of data lying within K standard deviations of the mean is always at least $1 - 1/K^2$, where K is any positive number greater than 1.
at least 3/4 (75%) of all values lie within 2 standard deviations of the mean.
at least 8/9 (89%) of all values lie within 3 standard deviations of the mean.
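The bound is simple enough to compute directly. A minimal sketch using exact fractions, so that K = 2 and K = 3 reproduce the 3/4 and 8/9 stated above; the function name `chebyshev_lower_bound` is an assumption for this example.

```python
from fractions import Fraction


def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("K must be greater than 1")
    return 1 - 1 / Fraction(k) ** 2


for k in (2, 3, 4):
    bound = chebyshev_lower_bound(k)
    print(k, bound, f"{float(bound):.1%}")
```

These are guaranteed minimums for data of any shape, which is why they are much lower than the 95% and 99.7% that the empirical rule gives for bell-shaped distributions.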
![Page 29: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/29.jpg)
Measures of Variation Summary
For typical data sets, it is unusual for a score to differ from the mean by more than 2 or 3 standard deviations.
![Page 30: Probability and Statistics](https://reader035.fdocuments.us/reader035/viewer/2022062222/56815161550346895dbf85e6/html5/thumbnails/30.jpg)
Assignment
Read Section 1.3
pp. 64-66: 1.62, 1.64-1.69, 1.71