Week 01: Introduction to Statistics, Probability & Statistics
![Page 1: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/1.jpg)
Week 01: Introduction to Statistics
Probability & Statistics
![Page 2: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/2.jpg)
What is Statistics?
![Page 3: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/3.jpg)
Statistics
Statistics is the science of data, which involves
– collecting,
– classifying,
– summarizing,
– organizing,
– analyzing,
– and interpreting numerical information.
![Page 4: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/4.jpg)
![Page 5: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/5.jpg)
Data collection methods
• Questionnaires
• Interviews
• Observation
![Page 6: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/6.jpg)
Why study Statistics?
You’ll be able to
• make objective decisions,
• make accurate predictions that seem inspired,
• convey the message you want in the most effective way possible.
• Statistics can be a convenient way of summarizing key truths about data.
• You also need a way of visualizing data for everyone else.
![Page 7: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/7.jpg)
Why not just go on the data? Why chart it?
• Sometimes it’s difficult to see what’s really going on just by looking at the raw data.
• There can be patterns and trends in the data, but these can be very hard to spot if you’re just looking at a heap of numbers.
• Charts give you a way of literally seeing patterns in your data.
• They allow you to visualize your data and see what’s really going on in a quick glance.
![Page 8: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/8.jpg)
![Page 9: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/9.jpg)
![Page 10: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/10.jpg)
What’s the difference between information and data?
• Data refers to raw facts and figures that have been collected.
• Information is data that has some sort of added meaning.
![Page 11: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/11.jpg)
Definitions
Populations and Parameters
• A population is the entire collection of all observations of interest.
• E.g. all 2.5 million registered voters in Sri Lanka.
• A parameter is a descriptive measure of the entire population of all observations of interest.
![Page 12: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/12.jpg)
Definitions
Samples and Statistics
• A sample is a representative portion of the population which is selected for study.
• Potentially very large, but less than the population.
• E.g. a sample of 765 voters exit polled on election day.
• A statistic describes a sample and serves as an estimate of the corresponding population parameter.
![Page 13: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/13.jpg)
![Page 14: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/14.jpg)
Relationship between samples and populations.
![Page 15: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/15.jpg)
Parameters are numbers that summarize data for an entire population. Statistics are numbers that summarize data from a sample, i.e. some subset of the entire population.
E.g.: A nutritionist wants to estimate the mean amount of sodium consumed by children under the age of 10. From a random sample of 75 children under the age of 10, the nutritionist obtains a sample mean of 2993 milligrams of sodium consumed. Here 2993 milligrams is a statistic; the unknown mean for all children under the age of 10 is the parameter it estimates.
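The distinction can be sketched in a few lines of Python. This is only an illustration: the population below is simulated (10,000 hypothetical children with made-up sodium intakes), not real survey data, and the names `population` and `sample` are assumptions introduced for the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: daily sodium intake (mg) for 10,000 children.
# These values are simulated for illustration; they are not real data.
population = [random.gauss(3000, 400) for _ in range(10_000)]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a random sample of 75 children.
sample = random.sample(population, 75)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f} mg")
print(f"statistic (sample mean):     {sample_mean:.1f} mg")
```

In practice the parameter is usually unknown, which is why the statistic is used to estimate it; the simulation simply makes both visible side by side.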
![Page 16: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/16.jpg)
Individuals and Variables
• Individuals are the people or objects included in the study.
• A variable is the characteristic of the individual to be measured or observed.
For example, if we want to do a study about the people who have climbed Mt. Everest, then the individuals in the study are the actual people who made it to the top. The variables to measure or observe might be the height, weight, race, gender, income, etc. of the individuals that made it to the top of Mt. Everest.
![Page 17: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/17.jpg)
Definitions
Variables
• A variable is the characteristic of the population that is being examined in the statistical study.
• There are two basic types of data: Qualitative & Quantitative
![Page 18: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/18.jpg)
Types of Variables
• Qualitative or attribute (categorical) variable: the characteristic or variable being studied is nonnumeric.
• EXAMPLES: Gender, religious affiliation, type of automobile owned, state of birth, eye color, type of dessert
![Page 19: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/19.jpg)
Types of Variables
• Quantitative variable: the variable can be reported numerically.
• EXAMPLE: balance in your savings account, minutes remaining in class, number of children in a family.
![Page 20: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/20.jpg)
Types of Variables
• Quantitative variables can be classified as either discrete or continuous.
– Discrete variables: can only assume certain values and there are usually “gaps” between values.
EXAMPLE: the number of bedrooms in a house (1, 2, 3, ..., etc.).
– Continuous variables: can assume any value within a specific range.
EXAMPLE: The time it takes to fly from Sri Lanka to New York.
![Page 21: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/21.jpg)
Types of Statistics
Descriptive Statistics:
• Methods of organizing, summarizing, and presenting data in an informative way.
• Descriptive statistics do not allow us to draw conclusions beyond the data we have analyzed or reach conclusions regarding any hypotheses we might have made.
• Frequency distributions, measures of central tendency (mean, median, and mode), and graphs like pie charts and bar charts that describe the data are all examples of descriptive statistics.
![Page 22: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/22.jpg)
EXAMPLE for Descriptive Statistics:
• If we look at a basketball team's game scores over a year, we can calculate the average score, variance, etc. and get a description (a statistical profile) for that team.
• According to the Consumer Report of Ceylon Pencil Company, there are 9 defective pens per 100. The statistic 9 describes the number of problems out of every 100 pens.
![Page 23: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/23.jpg)
Types of Statistics
Inferential Statistics:
• Inferential statistics is concerned with making predictions or inferences about a population from observations and analyses of a sample.
• The methods of inferential statistics are (1) the estimation of parameter(s) and (2) testing of statistical hypotheses, for example with a chi-square test or a t-test.
![Page 24: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/24.jpg)
Inferential Statistics:
EXAMPLE:
• TV networks constantly monitor the popularity of their programs by hiring people to sample the preferences of TV viewers.
• To infer the success rate of a drug in treating high temperature, researchers take a sample of patients, give them the drug, and estimate the rate of effectiveness in the population using the rate of effectiveness in the sample.
![Page 25: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/25.jpg)
Levels of Measurement
• There are four levels of measurement: nominal, ordinal, interval and ratio.
![Page 26: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/26.jpg)
Nominal level
Nominal level (scaled): Data that can only be classified into categories and cannot be arranged in an ordering scheme.
EXAMPLES: eye color, gender, religious affiliation
• Religion (Catholic, Buddhist, etc.)
• Race (African-American, Asian, etc.)
• Marital Status (Married, Single, Divorced)
These categories are mutually exclusive and/or exhaustive.
![Page 27: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/27.jpg)
Nominal level
• Mutually exclusive: An individual or item that, by virtue of being included in one category, must be excluded from any other category.
• Two events are mutually exclusive if they cannot occur at the same time.
• An example is tossing a coin once, which can result in either heads or tails, but not both.
EXAMPLE: eye color.
![Page 28: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/28.jpg)
Ordinal level
Ordinal level: involves data that may be arranged in some order, but differences between data values cannot be determined or are meaningless.
EXAMPLE: During a taste test of 4 colas, cola C was ranked number 1, cola B was ranked number 2, cola A was ranked number 3, and cola D was ranked number 4.
• Rankings (1st, 2nd, 3rd, etc.)
• Grades (A, B, C, D, F)
• Evaluations (High, Medium, Low)
![Page 29: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/29.jpg)
Interval level
Interval data have meaningful intervals between measurements, but there is no true starting point (zero).
Variables or measurements where the difference between values is measured by a fixed scale.
![Page 30: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/30.jpg)
Interval level
For example,
• When we measure temperature (in Fahrenheit), the distance from 30-40 is the same as the distance from 70-80. The interval between values is interpretable. Because of this, it makes sense to compute an average of an interval variable, whereas it doesn't make sense to do so for ordinal scales. But note that in interval measurement ratios don't make any sense: 80 degrees is not twice as hot as 40 degrees.
• However, 0 degrees (in both scales), cold as it may be, does not represent the total absence of temperature.
![Page 31: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/31.jpg)
![Page 32: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/32.jpg)
Ratio level
Ratio level: the interval level with an inherent zero starting point. Differences and ratios are meaningful for this level of measurement.
EXAMPLES: money, heights of students.
A measurement such as 0 feet does make sense, as it represents no length. Furthermore, 2 feet is twice as long as 1 foot, so ratios can be formed between the data.
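One way to see why ratios are meaningful at the ratio level but not at the interval level is to change units and check whether the ratio survives. The sketch below is an illustration only; the helper names are assumptions, and Celsius is used as the second temperature scale.

```python
import math

def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

def feet_to_metres(ft):
    """Convert a length in feet to metres."""
    return ft * 0.3048

# Interval scale: equal differences stay equal after a change of units.
diff_low = fahrenheit_to_celsius(40) - fahrenheit_to_celsius(30)
diff_high = fahrenheit_to_celsius(80) - fahrenheit_to_celsius(70)
print(math.isclose(diff_low, diff_high))  # True

# Interval scale: the ratio depends on the unit, so it carries no meaning.
ratio_f = 80 / 40                                                # 2.0
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)  # about 6.0
print(ratio_f, round(ratio_c, 2))

# Ratio scale: zero means "none", so the ratio is the same in any unit.
ratio_ft = 2 / 1
ratio_m = feet_to_metres(2) / feet_to_metres(1)
print(ratio_ft, round(ratio_m, 2))
```

The same two temperatures give a ratio of 2 in Fahrenheit and about 6 in Celsius, while 2 feet is twice 1 foot whether measured in feet or metres.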
![Page 33: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/33.jpg)
Level of data

| Level | Property | Example |
|---|---|---|
| Nominal | Data may only be classified | Classification of students by district |
| Ordinal | Data are ranked | Your rank for this course module |
| Interval | Meaningful difference between values | Temperature |
| Ratio | Meaningful 0 point & ratio between values | Number of study hours |
![Page 34: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/34.jpg)
SAMPLING BREAKDOWN
![Page 35: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/35.jpg)
SAMPLING
TARGET POPULATION
STUDY POPULATION
SAMPLE
![Page 36: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/36.jpg)
Types of Samples
• Probability (Random) Samples
– Simple random sample
– Systematic random sample
– Stratified random sample
– Cluster sample
![Page 37: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/37.jpg)
Basic Methods of Sampling
Random Sampling
• Selected by using chance or random numbers
• Each individual subject (human or otherwise) has an equal chance of being selected
• Examples: drawing names from a hat, random numbers
![Page 38: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/38.jpg)
Random Sampling
• The “pick a name out of the hat” technique
• Random number table
• Random number generator
Hawkes and Marsh (2004)
![Page 39: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/39.jpg)
Simple Random Sample
• Every subset of a specified size n from the population has an equal chance of being selected.
MathAllianceProject
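A simple random sample can be sketched with Python's standard `random` module, whose `sample` function draws without replacement. The roster of student IDs and the helper name `simple_random_sample` are hypothetical, introduced only for this illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed is optional, used here for repeatability
    return rng.sample(population, n)

# Hypothetical roster: student IDs 1..18.
students = list(range(1, 19))
chosen = simple_random_sample(students, 6, seed=42)
print(sorted(chosen))
```

This is the software equivalent of pulling names out of a hat: no student can be drawn twice, and no student is favoured over another.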
![Page 40: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/40.jpg)
Simple random sampling
![Page 41: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/41.jpg)
Systematic Sampling
• This is a form of random sampling, involving a system. Every nth item is selected throughout the list.
– Not fully random and therefore there is a possibility of bias.
![Page 42: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/42.jpg)
Basic Methods of Sampling
Systematic Sampling
• Select a random starting point and then select every kth subject in the population
• Simple to use so it is used often
![Page 43: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/43.jpg)
Systematic Sampling
• All data is sequentially numbered
• Every nth piece of data is chosen
Hawkes and Marsh (2004)
![Page 44: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/44.jpg)
Systematic Sample
• Every kth member (for example, every 10th person) is selected from a list of all population members.
MathAllianceProject
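Systematic selection with a random starting point can be sketched as follows. The list of 100 numbered members and the function name `systematic_sample` are assumptions made for the example.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k members, then every kth member."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # random starting index in 0..k-1
    return population[start::k]   # the start, then every kth member after it

# Hypothetical list of 100 sequentially numbered population members.
members = list(range(1, 101))
sample = systematic_sample(members, 10, seed=7)
print(sample)
```

Only the starting point is random; once it is fixed the rest of the sample is determined, which is why a periodic pattern in the list can introduce bias.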
![Page 45: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/45.jpg)
Stratified random sample
• In this method all the people or items in the sampling frame are divided into ‘categories’ which are mutually exclusive. Within each category a simple random sample is selected.
– Within the categories the samples are random.
– But the categories are not clear.
![Page 46: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/46.jpg)
Basic Methods of Sampling
Stratified Sampling
• Divide the population into at least two different groups with common characteristic(s), then draw SOME subjects from each group (each group is called a stratum; the plural is strata)
• Results in a more representative sample
![Page 47: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/47.jpg)
Stratified Random Sample
• The population is divided into two or more groups called strata, according to some criterion, such as geographic location, grade level, age, or income, and subsamples are randomly selected from each stratum.
MathAllianceProject
![Page 48: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/48.jpg)
Stratified Sampling
• Data is divided into subgroups (strata)
• Strata are based on a specific characteristic
– Age
– Education level
– Etc.
• Use random sampling within each stratum
Hawkes and Marsh (2004)
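The two steps (group into strata, then sample randomly inside each stratum) can be sketched as below. The student records, the age cut-off of 21, and the names `stratified_sample` and `age_group` are all hypothetical choices for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, n_per_stratum, seed=None):
    """Group records into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    sample = []
    for name in sorted(strata):  # fixed order keeps the result repeatable
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

def age_group(record):
    """Hypothetical stratum rule: split students at age 21."""
    return "under 21" if record[1] < 21 else "21 and over"

# Hypothetical records: (student_id, age) for 18 students, ages made up.
students = [(i, 17 + i % 8) for i in range(1, 19)]
sample = stratified_sample(students, age_group, 3, seed=5)
print(sample)
```

Drawing the same number from each stratum is one possible design; drawing in proportion to stratum size is another, and the choice depends on the study.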
![Page 49: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/49.jpg)
Cluster sampling
• Clusters are formed by breaking down the area to be surveyed into smaller areas, a number of which are selected by random methods for survey. Within the selected clusters, the units to be surveyed are chosen by random methods.
![Page 50: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/50.jpg)
Basic Methods of Sampling
Cluster Sampling
Divide the population into groups (called clusters), randomly select some of the groups, and then collect data from ALL members of the selected groups
Used extensively by government and private research organizations
Examples:
Exit Polls
![Page 51: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/51.jpg)
Cluster Sampling
• Data is divided into clusters
– Usually geographic
• Random sampling used to choose clusters
• All data used from selected clusters
Hawkes and Marsh (2004)
![Page 52: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/52.jpg)
Cluster sampling
[Figure: the survey area divided into Section 1 through Section 5]
![Page 53: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/53.jpg)
Sampling Relationships
Random Sampling
Cluster Sampling
Stratified Sampling
![Page 54: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/54.jpg)
Cluster Sample
• The population is divided into subgroups (clusters) like families. A simple random sample is taken of the subgroups and then all members of the cluster selected are surveyed.
MathAllianceProject
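Cluster sampling in this form (randomly choose whole clusters, then survey everyone in them) can be sketched like this. The six groups of three student IDs and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and return every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return [member for name in chosen for member in clusters[name]]

# Hypothetical clusters: 18 student IDs split into 6 groups of 3.
clusters = {
    f"group {g + 1}": [3 * g + 1, 3 * g + 2, 3 * g + 3] for g in range(6)
}
sample = cluster_sample(clusters, 2, seed=3)
print(sample)
```

The randomness applies to the clusters, not to the individuals: unlike stratified sampling, most groups contribute nobody and the chosen groups contribute everybody.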
![Page 55: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/55.jpg)
Example 1: Sampling Methods
In a class of 18 students, 6 are chosen for an assignment.

| Sampling Type | Example |
|---|---|
| Random | Pull 6 names out of a hat |
| Systematic | Selecting every 3rd student |
| Stratified | Divide the class into 2 equal age groups. Randomly choose 3 from each group |
| Cluster | Divide the class into 6 groups of 3 students each. Randomly choose 2 groups |
| Convenience | Take the 6 students closest to the teacher |
![Page 56: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/56.jpg)
Example 2: Utilizing Sampling Methods
• Determine average student age
• Sample of 10 students
• Ages of 50 statistics students
18 21 42 32 17 18 18 18 19 22
25 24 23 25 18 18 19 19 20 21
19 29 22 17 21 20 20 24 36 18
17 19 19 23 25 21 19 21 24 27
21 22 19 18 25 23 24 17 19 20
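For reference, the average age of all 50 students can be computed directly from the values listed above, which gives the figure that any sample mean is trying to estimate. The sketch uses Python's standard `statistics` module; the variable name `ages` is an assumption, and the values are copied row by row from the listing.

```python
import statistics

# Ages of the 50 statistics students, copied row by row from the listing.
ages = [
    18, 21, 42, 32, 17, 18, 18, 18, 19, 22,
    25, 24, 23, 25, 18, 18, 19, 19, 20, 21,
    19, 29, 22, 17, 21, 20, 20, 24, 36, 18,
    17, 19, 19, 23, 25, 21, 19, 21, 24, 27,
    21, 22, 19, 18, 25, 23, 24, 17, 19, 20,
]

print(len(ages))                # 50
print(statistics.mean(ages))    # 21.72
print(statistics.median(ages))  # 20.5
print(statistics.mode(ages))    # 19
```

The mean (21.72) sits above the median (20.5) because a few older students (32, 36, 42) pull the average upward.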
![Page 57: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/57.jpg)
Example 2 – Random Sampling
Random number generator

| Data Point Location | Corresponding Data Value |
|---|---|
| 35 | 25 |
| 48 | 17 |
| 37 | 19 |
| 14 | 25 |
| 47 | 24 |
| 4 | 32 |
| 33 | 19 |
| 35 | 25 |
| 34 | 23 |
| 3 | 42 |
| Mean | 25.1 |
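The lookup can be reproduced in a few lines, assuming the 50 ages are numbered 1 to 50 row by row, left to right. The locations are the ten random numbers from the example; the variable names are introduced only for this sketch.

```python
import statistics

# Ages of the 50 statistics students, numbered 1..50 row by row.
ages = [
    18, 21, 42, 32, 17, 18, 18, 18, 19, 22,
    25, 24, 23, 25, 18, 18, 19, 19, 20, 21,
    19, 29, 22, 17, 21, 20, 20, 24, 36, 18,
    17, 19, 19, 23, 25, 21, 19, 21, 24, 27,
    21, 22, 19, 18, 25, 23, 24, 17, 19, 20,
]

# Locations produced by the random number generator in the example.
locations = [35, 48, 37, 14, 47, 4, 33, 35, 34, 3]

# Location 1 is ages[0], so subtract 1 when indexing.
values = [ages[loc - 1] for loc in locations]
print(values)                   # [25, 17, 19, 25, 24, 32, 19, 25, 23, 42]
print(statistics.mean(values))  # 25.1
```

Location 35 was drawn twice, so this particular sample was effectively taken with replacement. It also happened to include the two oldest students listed in the sample (32 and 42), which is why its mean is well above most of the ages in the class.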
![Page 58: Week 01 Introduction to Statistics Probability & Statistics 1.](https://reader036.fdocuments.us/reader036/viewer/2022062409/5697bff41a28abf838cbd442/html5/thumbnails/58.jpg)
Example 2 – Systematic Sampling
Take every 5th data point

| Data Point Location | Corresponding Data Value |
|---|---|
| 5 | 17 |
| 10 | 22 |
| 15 | 18 |
| 20 | 21 |
| 25 | 21 |
| 30 | 18 |
| 35 | 25 |
| 40 | 27 |
| 45 | 25 |
| 50 | 20 |
| Mean | 21.4 |
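A minimal sketch of the every-5th-point selection, assuming the 50 ages are numbered row by row, left to right (the numbering under which location 35 holds 25, as in the random-sampling table). Under that assumption locations 35 and 45 both hold 25 and the sample mean is 21.4.

```python
import statistics

# Ages of the 50 statistics students, numbered 1..50 row by row.
ages = [
    18, 21, 42, 32, 17, 18, 18, 18, 19, 22,
    25, 24, 23, 25, 18, 18, 19, 19, 20, 21,
    19, 29, 22, 17, 21, 20, 20, 24, 36, 18,
    17, 19, 19, 23, 25, 21, 19, 21, 24, 27,
    21, 22, 19, 18, 25, 23, 24, 17, 19, 20,
]

step = 5
# Locations 5, 10, ..., 50 (1-based), i.e. every 5th data point.
locations = list(range(step, len(ages) + 1, step))
values = [ages[loc - 1] for loc in locations]

print(locations)                # [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
print(values)                   # [17, 22, 18, 21, 21, 18, 25, 27, 25, 20]
print(statistics.mean(values))  # 21.4
```

Here the start is fixed at the 5th point to match the example; a random start between 1 and 5 would follow the usual systematic-sampling recipe.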