Measurement Variables
Describing Distributions
© 2014 Project Lead The Way, Inc.
Computer Science and Software Engineering
Continuous vs. Discrete
• A nearly perfect analogy:
  continuous : discrete
  analog : digital
  float : int
• Measurements of continuous variables are made discrete by "binning" them.
• How old are you? Time is continuous, but you answer in discrete, binned values.
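The age example can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `age_in_years` and the average year length of 365.25 days are assumptions chosen for the example. A continuous measurement (a float) goes in, and a discrete, binned value (an int) comes out.

```python
import math

DAYS_PER_YEAR = 365.25  # assumed average year length, for illustration only


def age_in_years(days_alive: float) -> int:
    """Bin a continuous age, measured in days, into whole years (rounding down)."""
    return math.floor(days_alive / DAYS_PER_YEAR)


continuous_age = 6000.7                 # a float: time is continuous
binned_age = age_in_years(continuous_age)  # an int: the answer you give when asked
print(continuous_age, "days is reported as", binned_age, "years")
```

Every age from one birthday up to (but not including) the next falls in the same bin, which is why two people with different continuous ages can give the same discrete answer.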
Levels of a Measurement Variable
• Categorical (e.g., zip codes): categories with no meaningful order
• Ordinal (e.g., rank in a race): ordered, but increasing by 1 has no consistent meaning
• Interval (e.g., grade level): ordered, with consistent steps up, but no meaning for "doubling" or "tripling"
• Ratio (e.g., height): ordered, with "2 times" being "double"
Sample vs. Population
• Population = infinite pool of measurements, or all measurements possible
• Sample = subset of population
Sample vs. Population
• Population parameters
  μ = population mean
  σ = population standard deviation
• These are inferred from data
• Sample statistics
  x̄ = sample mean
  s = sample standard deviation
• These describe data
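The distinction between parameters and statistics can be sketched with Python's standard `statistics` module. The population model here (heights with mean 170 and standard deviation 10) is a hypothetical chosen for illustration, as are the variable names. In real work the population values are unknown and can only be inferred from the sample statistics.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population parameters (normally unknown to the investigator).
MU, SIGMA = 170.0, 10.0

# A sample is a subset of the population: here, 500 simulated measurements.
sample = [rng.gauss(MU, SIGMA) for _ in range(500)]

x_bar = statistics.mean(sample)  # sample mean: describes the data
s = statistics.stdev(sample)     # sample standard deviation (n - 1 divisor)

print(f"sample mean = {x_bar:.2f} (population mean = {MU})")
print(f"sample standard deviation = {s:.2f} (population standard deviation = {SIGMA})")
```

The sample statistics land near the population parameters without matching them exactly, and a different seed gives slightly different values.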
Sample vs. Population
• Infer the population distribution from the sample histogram
• The sample histogram matches the parent distribution better when the sample is large and is visualized with small intervals (narrow bins)
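A rough sketch of how a sample histogram is built, assuming equal-width intervals. The helper `histogram_counts` and the standard normal parent distribution are assumptions made for this example. Comparing a small sample against a large one shows how the large sample traces the parent distribution more smoothly.

```python
import random


def histogram_counts(data, low, high, bins):
    """Count how many values fall in each of `bins` equal-width intervals on [low, high)."""
    width = (high - low) / bins
    counts = [0] * bins
    for x in data:
        if low <= x < high:
            # min() guards against floating-point rounding at the top edge.
            index = min(int((x - low) / width), bins - 1)
            counts[index] += 1
    return counts


rng = random.Random(0)  # fixed seed so the run is repeatable
small_sample = [rng.gauss(0, 1) for _ in range(20)]
large_sample = [rng.gauss(0, 1) for _ in range(2000)]

for name, sample in (("small", small_sample), ("large", large_sample)):
    counts = histogram_counts(sample, -4, 4, 16)
    print(name, "sample")
    for count in counts:
        # Scale each bar so both histograms have comparable width on screen.
        print("#" * round(40 * count / len(sample)))
```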
Median
• Half of the area under the distribution is to the left of the median
Mean, Median, Mode
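The three measures of center can be computed with Python's standard `statistics` module. The data set below is made up for illustration. It includes one large value to show that the mean is pulled toward the tail while the median and mode are not.

```python
import statistics

data = [1, 2, 2, 3, 10]  # made-up data with one large value

mean = statistics.mean(data)      # sum of values / number of values = 18 / 5
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)  # prints: 3.6 2 2
```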
Box Plot
• y-axis shows values of the data
• Splits data into quartiles
[Figure: box plot; y-axis label: height]
Each box contains 25% of the data
The IQR (Interquartile Range) Contains 50% of the Data
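As a sketch, the quartiles and the IQR can be computed with `statistics.quantiles` (Python 3.8 or later). The data set is made up, and the choice of `method="inclusive"` is an assumption: textbooks and software use several slightly different quartile conventions, so other tools may report slightly different cut points for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data

# n=4 asks for the three cut points that split the data into four quarters.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: the height of the box

print(q1, q2, q3)  # prints: 3.0 5.0 7.0
print(iqr)         # prints: 4.0
```

The middle cut point `q2` is the median, so the box runs from `q1` to `q3` with the median line inside it.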
Box Plot
Whiskers extend to max and min… usually
Whiskers and Outliers Show max/min
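The slides do not state which rule decides when a point is drawn as an outlier instead of being covered by a whisker. One common convention, assumed here, is the 1.5 × IQR rule: points more than 1.5 IQRs beyond the box are plotted individually, and each whisker stops at the most extreme point that is not an outlier. The function name and the data are hypothetical.

```python
import statistics


def whiskers_and_outliers(data, k=1.5):
    """Return (low whisker, high whisker, outliers) using the assumed k * IQR rule."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    # Whiskers reach the most extreme values that are still inside the fences.
    return min(inside), max(inside), outliers


data = [1, 2, 3, 4, 5, 6, 7, 8, 30]  # made-up data with one extreme value
low, high, outliers = whiskers_and_outliers(data)
print(low, high, outliers)  # prints: 1 8 [30]
```

Without the value 30, the whiskers would reach the true minimum and maximum, which is the usual case.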
The Range Contains 100% of the Data
Normal Distributions
• A family of distributions with very similar shape
• One normal distribution for each μ and σ
• μ ("mu") = population mean
• σ ("sigma") = population standard deviation
A Normal Distribution
• One normal distribution for any pair μ, σ
• Example: μ = 6 and σ = 2.2
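To make "one normal distribution for any pair μ, σ" concrete, here is a sketch that evaluates the normal density for the example values μ = 6 and σ = 2.2. The hand-written `normal_pdf` uses the standard formula for the normal density, and `statistics.NormalDist` (Python 3.8 or later) is used as a cross-check. The function name is an assumption for this example.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for population mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


MU, SIGMA = 6, 2.2
dist = NormalDist(mu=MU, sigma=SIGMA)

for x in (MU - SIGMA, MU, MU + SIGMA):
    # The two columns agree; the curve peaks at x = mu and is symmetric about it.
    print(f"x = {x:.1f}: formula = {normal_pdf(x, MU, SIGMA):.4f}, NormalDist = {dist.pdf(x):.4f}")
```

Changing μ slides the curve left or right, and changing σ makes it wider and flatter or narrower and taller.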
The Standard Normal Distribution
• μ = 0 and σ = 1
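Any normal distribution can be related to the standard normal one by measuring values in standard deviations from the mean, commonly called a z-score. The sketch below assumes the example values μ = 6 and σ = 2.2. The helper name `z_score` is hypothetical.

```python
import math
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


MU, SIGMA = 6, 2.2
x = 8.2
z = z_score(x, MU, SIGMA)  # about 1: x is one standard deviation above the mean

# The area to the left of x under N(6, 2.2) equals the area to the left of z
# under the standard normal distribution (mu = 0, sigma = 1).
area_original = NormalDist(MU, SIGMA).cdf(x)
area_standard = NormalDist().cdf(z)
print(f"z = {z:.3f}, areas: {area_original:.4f} vs {area_standard:.4f}")
```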
The Empirical Rule: 68% - 95% - 99.7%
• 68% area: values within μ ± σ
• 95% area: values within μ ± 2σ
• 99.7% area: values within μ ± 3σ
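The areas in the empirical rule can be checked numerically. This sketch uses `statistics.NormalDist` (Python 3.8 or later) and the standard normal distribution. The helper name `area_within` is an assumption. Because every normal distribution has the same shape once values are measured in units of σ, the result applies to any μ and σ.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mu = 0, sigma = 1


def area_within(k):
    """Area under the normal curve between mu - k*sigma and mu + k*sigma."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * area_within(k):.1f}%")
```

The computed areas are approximately 68.3%, 95.4%, and 99.7%, which the rule rounds for easy recall.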
Shape, Center, Spread
• These distributions are both positively skewed because they are right-tailed
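Skew can also be checked numerically. The sketch below computes one common skewness coefficient (the third central moment divided by the 1.5 power of the second). The slides do not name a formula, so this choice, the function name, and the data are all assumptions. A positive result indicates a right tail, and in right-tailed data the mean is typically pulled above the median.

```python
import statistics


def skewness(data):
    """Moment-based skewness: positive for right-tailed data, negative for left-tailed."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]  # made-up data with a long right tail
symmetric = [1, 2, 3, 4, 5]               # made-up symmetric data

print(f"right-tailed skewness: {skewness(right_tailed):.2f}")
print(f"symmetric skewness: {skewness(symmetric):.2f}")
print("mean:", statistics.fmean(right_tailed), "median:", statistics.median(right_tailed))
```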