1.1 An Overview of Statistics 1.2 Data Classification 1.3 Experimental Design Chapter 1: Introduction to Statistics

Page 1: Chapter 1: Introduction to Statistics

1.1 An Overview of Statistics

1.2 Data Classification

1.3 Experimental Design

Page 2: 1.1 An Overview of Statistics

What is statistics?

Statistics is the science of data. Data are numbers with context.

It can be broken down into three branches:

• Data analysis

• Probability

• Statistical inference

Page 3: A Definition of Statistics

Data: It is a collection of facts. It consists of information coming from observations, counts, measurements, or responses.

Statistics: It uses data to gain insight and draw conclusions. It is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

Page 4: Data sets

Population: It is the collection of all outcomes, responses, measurements, or counts that are of interest.

Sample: It is a subset of the population.

Example:

Population: All students taking Statistics classes at NSCC

Sample: All students in Math109 section 05

Page 5: Data sets

Parameter: It is a numerical description of a population characteristic.

Statistic: It is a numerical description of a sample characteristic.
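As a minimal sketch of the difference between a parameter and a statistic, the Python snippet below builds a small made-up population of exam scores, then compares the population mean with the mean of a random sample. The scores, the seed, and all variable names are illustrative assumptions, not data from the course.

```python
import random
from statistics import mean

# Hypothetical population: exam scores for every student of interest.
population = [62, 71, 75, 78, 80, 83, 85, 88, 91, 97]

# Parameter: a numerical description of a population characteristic.
population_mean = mean(population)

# Sample: a subset of the population, drawn here at random.
rng = random.Random(109)  # fixed seed so the run is repeatable
sample = rng.sample(population, k=4)

# Statistic: a numerical description of a sample characteristic.
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The parameter is fixed for a given population, while the statistic changes from sample to sample. Changing the seed gives a different sample and usually a different sample mean.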

Page 6: Branches of Statistics

Descriptive statistics: It involves the organization, analysis, summarization, and display of data.

Probability theory: It is the branch of statistics that deals with chance or random phenomena, i.e. it tries to quantify how likely events are to occur.

Inferential statistics: It is the branch of statistics that involves using a sample to draw conclusions about a population. A basic tool in the study of inferential statistics is probability.
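To make the descriptive branch concrete, here is a small Python sketch that summarizes a made-up set of quiz scores using the standard `statistics` module. The scores and the `summary` dictionary are assumptions chosen for illustration. Drawing a conclusion about a wider population from these numbers would be inferential statistics and is not shown here.

```python
from statistics import mean, median, stdev

# Hypothetical sample: quiz scores for five students.
scores = [70, 85, 90, 75, 80]

# Descriptive statistics organize and summarize the data at hand.
summary = {
    "n": len(scores),
    "mean": mean(scores),
    "median": median(scores),
    "sample_stdev": stdev(scores),
    "min": min(scores),
    "max": max(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```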

Page 7: 1.2 Data Classification

How do we classify data?

Page 8: Types of data

Qualitative data: Data that cannot be measured on a numerical scale. It consists of attributes (like gender or nationality). It can be binary (yes or no) or categorical.

Quantitative data: Data that can be measured or identified on a numerical scale, i.e. it consists of numerical measurements or counts.

Page 9: Types of data

Nominal: Data at this level are qualitative only.

Ordinal: Data at this level are qualitative or quantitative. They can be ranked or ordered, but differences between measurements are not meaningful.

Interval: Data at this level can be ordered, and meaningful differences can be calculated. A zero entry measures a position on a scale. It is not an inherent zero **.

Ratio: Data at this level are similar to those at the interval level, with the added property that a zero entry is an inherent zero. A ratio of two data values can be formed, so that one data value can be meaningfully expressed as a multiple of another.

** An inherent zero is a zero that implies ‘none’.
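The interval versus ratio distinction can be checked with a little arithmetic. The Python sketch below uses Celsius temperatures as an assumed example of interval-level data and masses in kilograms as an assumed example of ratio-level data. The values are made up for illustration.

```python
# Interval level: Celsius temperatures. Zero is a position on the scale,
# not "no temperature", so differences are meaningful but ratios are not.
temp_a_c, temp_b_c = 10.0, 20.0
difference_c = temp_b_c - temp_a_c   # meaningful: 10 degrees warmer
naive_ratio = temp_b_c / temp_a_c    # 2.0, but 20 C is not "twice as hot"

# The same temperatures in kelvins, a scale whose zero is inherent.
temp_a_k = temp_a_c + 273.15
temp_b_k = temp_b_c + 273.15
kelvin_ratio = temp_b_k / temp_a_k   # about 1.035, far from 2

# Ratio level: masses in kilograms. Zero means "none", so ratios work.
mass_a_kg, mass_b_kg = 10.0, 20.0
mass_ratio = mass_b_kg / mass_a_kg   # 2.0: b really is twice as heavy

print("Celsius difference:", difference_c)
print("Celsius ratio (not meaningful):", naive_ratio)
print("Kelvin ratio:", round(kelvin_ratio, 3))
print("Mass ratio:", mass_ratio)
```

The Celsius ratio changes when the same temperatures are expressed on another scale, which is why ratios are not meaningful at the interval level.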

Page 10: 1.3 Experimental Design

What is an experimental study?

An experiment deliberately imposes a treatment on a group of objects or subjects in the interest of observing the response.

It is wise to take the time and effort to organize the experiment properly, to ensure that the right type of data, and enough of it, is available to answer the questions of interest as clearly and efficiently as possible.

Page 11: Design of a statistical study

Guidelines for designing a statistical study:

• Identify the variable(s) of interest and the population of the study.

• Design the data collection process. If you use a sample, make sure the sample is representative of the population.

• Collect the data.

• Summarize the data, using descriptive statistics techniques.

• Interpret the data and make decisions about the population using inferential statistics.

• Identify any possible errors.

Page 12: Data Collection

Methods:

• Observational study: Basically you observe ‘what is’. An observational study is a study in which a researcher simply observes behavior in a systematic manner without influencing or interfering with the behavior.

• Experiment: Here, a treatment is applied to part of the population and responses are observed. Another part of the population may be used as a control group, in which no treatment is applied. The results of the treatment group and the control group are studied and compared.

• Simulation: It is the use of a mathematical or physical model to reproduce the conditions of a situation or process. Simulations allow you to study situations that are impractical or even dangerous to create in real life.

• Survey: It is an investigation of one or more characteristics of a population.
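As an illustration of the simulation method listed above, the following Python sketch uses a simple mathematical model (two fair six-sided dice) to estimate how often the dice sum to 7. The choice of dice, the function name `estimate_probability_of_seven`, the number of trials, and the seed are all assumptions made for this example.

```python
import random

def estimate_probability_of_seven(trials, seed=0):
    """Estimate P(sum of two fair dice == 7) by simulating rolls."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(trials):
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        if roll == 7:
            hits += 1
    return hits / trials

estimate = estimate_probability_of_seven(100_000)
exact = 6 / 36  # six of the 36 equally likely outcomes sum to 7
print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```

Here the exact answer is known, so the simulation can be checked against it. In practice, simulation is most useful when no such closed-form answer is available.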

Page 13: Experimental Design

An experiment deliberately imposes a treatment on a group of objects or subjects in the interest of observing the response. This differs from an observational study, which involves collecting and analyzing data without changing existing conditions. Because the validity of an experiment is directly affected by its construction and execution, attention to experimental design is extremely important.

Three key principles of experimental design are:

• Control of the effects of lurking variables on the response, most simply by comparing several treatments.

• Randomization, the use of chance to assign experimental units to treatments.

• Replication of the experiment on many units to reduce chance variation in the results.

Page 14: Experimental Design

Control: An experiment involves a dependent variable and one or more independent variables. One usually conducts the experiment to see the impact of the latter on the former. It is very likely that factors other than the independent variable of interest also affect the results of the experiment. Hence, to maintain the integrity of the experiment, it is important to control these influential factors. Some of these factors are:

• Confounding variable: It is an extraneous variable in an experiment that correlates with both the dependent and the independent variable.

• Placebo effect: It occurs when a subject shows a favorable reaction to a placebo, i.e. when he or she is administered not the actual treatment but a placebo in its place. To control or minimize this effect, the blinding technique is used.

Single blind: The subject does not know whether he or she is receiving the treatment or a placebo.

Double blind: Neither the researcher nor the subject knows whether the subject is receiving the treatment or a placebo.

Page 15: Experimental Design

Randomization:

• It is the process of randomly assigning experimental units to different treatment groups.

• In a completely randomized design, experimental units or subjects are assigned to different treatment groups through random selection.

• In some cases the experimenter is aware of differences among groups of the experimental units or subjects. In such cases it is necessary to divide them into blocks, which are groups of subjects or units with similar characteristics, and then randomly assign the members of each block to treatment groups. This setup is known as a randomized block design.

Replication:

To improve the results of an experiment, replication, the repetition of an experiment on a large group of subjects, is required. Replication reduces variability in experimental results, increasing their significance and the confidence level with which a researcher can draw conclusions about an experimental factor.
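The two randomization schemes described above can be sketched in Python as follows. This is only one possible implementation: the function names `completely_randomized` and `randomized_block`, the made-up subjects, and the use of an age band as the blocking variable are assumptions for illustration.

```python
import random

def completely_randomized(units, treatments, seed=0):
    """Shuffle all units, then deal them out to treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

def randomized_block(units, block_of, treatments, seed=0):
    """Group units into blocks, then randomize separately within each block."""
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    groups = {t: [] for t in treatments}
    for members in blocks.values():
        rng.shuffle(members)
        for i, unit in enumerate(members):
            groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical subjects: (id, age band). Age band is the blocking variable.
subjects = [(i, "under 40" if i < 6 else "40 and over") for i in range(12)]
treatments = ["treatment", "control"]

crd = completely_randomized(subjects, treatments)
rbd = randomized_block(subjects, lambda s: s[1], treatments)

for name, design in (("completely randomized", crd), ("randomized block", rbd)):
    print(name)
    for treatment, members in design.items():
        print("  ", treatment, sorted(members))
```

In the block design each age band contributes equally to the treatment and control groups, whereas the completely randomized design can leave the age bands unbalanced by chance.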

Page 16: Sampling techniques

What is a census?

A census is a count or measure of an entire population. Although it provides complete information, it is costly, cumbersome, and time consuming.

What is sampling?

Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen. To collect unbiased data, a researcher must ensure that the sample is representative of the population.

What is a sampling error?

A sampling error is the difference between the results of a sample and those of the population.
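A rough Python sketch of sampling error, using the mean as the result being compared: it draws samples of several sizes from a made-up population of commute times and reports how far each sample mean falls from the population mean. The population, the seed, and the helper name `sampling_error` are assumptions for this example.

```python
import random

# Hypothetical population: commute times in minutes for 500 people.
rng = random.Random(1)
population = [rng.randint(5, 90) for _ in range(500)]
population_mean = sum(population) / len(population)

def sampling_error(sample_size):
    """Sample mean minus population mean for one random sample."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size - population_mean

for n in (10, 50, 250):
    print(f"n = {n:3d}: sampling error = {sampling_error(n):+.2f} minutes")

# A census measures every member, so there is no sampling error.
census_error = sampling_error(len(population))
print("census:", census_error)
```

Larger samples tend to give smaller sampling errors, although any single sample can be unlucky. Only the census is guaranteed an error of zero.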

Page 17: Sampling techniques

A random sample is one in which every member of the population has an equal chance of being selected.

A simple random sample is a sample in which every possible sample of the same size has the same chance of being selected. When you choose members of a sample, you should decide whether it is acceptable to have the same population member selected more than once:

• If it is acceptable, then the sampling process is said to be with replacement.

• If it is not acceptable, then the sampling process is said to be without replacement.
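The difference between the two processes can be seen with Python's standard `random` module, where `sample` draws without replacement and `choices` draws with replacement. The member names and the seed below are made up for illustration.

```python
import random

rng = random.Random(5)
members = ["Ana", "Ben", "Cai", "Dee", "Eli", "Fay"]  # hypothetical population

# Without replacement: no member can be selected more than once.
without_replacement = rng.sample(members, k=4)

# With replacement: each draw is made from the full population,
# so the same member may appear more than once.
with_replacement = rng.choices(members, k=4)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```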

Page 18: Sampling techniques

A stratified random sample is formed when the researcher first divides the population into groups that share similar characteristics, called strata, and then selects a simple random sample from each stratum.

A cluster sample is formed by dividing the population into naturally occurring subgroups, called clusters, and selecting all the members in one or more clusters.

A systematic sample is one in which members of the population are ordered in some way, a starting number is randomly selected, and then sample members are selected at regular intervals from the starting number.

A convenience sample consists only of available members of the population. This type of sample often leads to biased studies.
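The first three techniques above can be sketched in Python as shown below. The population of 40 students, the use of class sections as both strata and clusters, and the helper names (`group_by`, `stratified_sample`, `cluster_sample`, `systematic_sample`) are assumptions made for illustration. Choosing the clusters at random is also an assumption of this sketch.

```python
import random

rng = random.Random(18)

# Hypothetical population: 40 students, each tagged with a class section.
population = [{"id": i, "section": f"S{i % 4 + 1}"} for i in range(40)]

def group_by(items, key):
    """Collect items into a dict of lists, keyed by item[key]."""
    groups = {}
    for item in items:
        groups.setdefault(item[key], []).append(item)
    return groups

def stratified_sample(items, key, per_stratum):
    """Simple random sample of `per_stratum` members from every stratum."""
    return [m for stratum in group_by(items, key).values()
            for m in rng.sample(stratum, per_stratum)]

def cluster_sample(items, key, n_clusters):
    """Pick whole clusters at random and keep all of their members."""
    clusters = group_by(items, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [m for name in chosen for m in clusters[name]]

def systematic_sample(items, step):
    """Random starting point, then every `step`-th member after it."""
    start = rng.randrange(step)
    return items[start::step]

stratified = stratified_sample(population, "section", per_stratum=3)
cluster = cluster_sample(population, "section", n_clusters=1)
systematic = systematic_sample(population, step=8)

print("stratified ids:", [m["id"] for m in stratified])
print("cluster ids:   ", [m["id"] for m in cluster])
print("systematic ids:", [m["id"] for m in systematic])
```

A stratified sample takes some members from every group, while a cluster sample takes every member from some groups.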

Page 19: Homework

Section 1.1: 1-4, 11, 13, 14, 17, 19, 21, 27 (assume U.S.), 29, 30, 32, 33, 36, 40, 41

Section 1.2: 7-10, 15, 16

Section 1.3: 1, 2, 4-10, 15-21, 29, 30, 31, 33, 43 (random, stratified and clustering only)

Read Chapter 2

What are the odds??? :P