Statistics is the science of collection



Statistics is the science of collecting, analysing and presenting numerical data. It is used for decision-making and for drawing inferences in different situations.

It deals with:

• Large groups of values, not a single entity or value
• The determination of uncertainty (probability)
• Identifying patterns in values
• Aspects of information that can be described numerically

Branches of statistics:

• Descriptive Statistics deals with concepts and methods concerned with the summarization and description of important aspects of numerical data. It consists of the condensation of data, their graphical display, and the computation of a few numerical quantities that provide information about the centre of the data and indicate the spread of the observations.

• Inferential Statistics deals with procedures for making inferences about the characteristics that describe the larger group of data, or the whole, called the population, from the knowledge derived from only a part of the data, called the sample. It includes the estimation of population parameters and the testing of statistical hypotheses. This branch is based on probability theory.
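As a rough illustration of the two branches, the sketch below first summarises a small sample descriptively (centre and spread) and then uses the sample mean as a point estimate of the population mean. The data values are invented for illustration, and the use of Python's standard `statistics` module is an assumption of this example rather than something the notes prescribe.

```python
import statistics

# Hypothetical sample of exam scores (made-up values for illustration).
sample = [52, 61, 58, 70, 65, 59, 63]

# Descriptive statistics: summarise the data we actually have.
centre = statistics.mean(sample)    # a measure of the centre
spread = statistics.stdev(sample)   # sample standard deviation (spread)

# Inferential statistics: use the sample to say something about the
# population. The simplest case is a point estimate of its mean.
estimated_population_mean = centre

print(f"sample mean = {centre:.2f}, sample std dev = {spread:.2f}")
print(f"point estimate of population mean = {estimated_population_mean:.2f}")
```

A fuller inferential treatment would also quantify the uncertainty of the estimate, which is where probability theory comes in.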

A population is the complete collection of all the observations regarding any phenomenon or entity, for example the set of all possible outcomes of an experiment. It can be finite or infinite.

Parameters are numerical values that describe a population, e.g. the population mean.

A sample is a subset of the population.
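The relationship between population, parameter and sample can be sketched in a few lines. The population below (the integers 1 to 1000), the sample size of 50 and the fixed seed are all assumptions chosen only to keep the example small and repeatable.

```python
import random
import statistics

# Hypothetical finite population: the integers 1..1000 (illustrative only).
population = list(range(1, 1001))

# Parameter: a numerical value that describes the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population, drawn at random without replacement.
rng = random.Random(0)   # fixed seed so the run is repeatable
sample = rng.sample(population, 50)

# The sample mean is computed from the sample only.
sample_mean = statistics.mean(sample)

print("population mean (parameter):", population_mean)
print("sample mean:", sample_mean)
```

The sample mean will generally differ somewhat from the population mean; that gap is what inferential methods try to account for.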

Quantitative variable: numerical data

1. Discrete: takes only separate, countable values, typically integers or whole numbers
2. Continuous: can take any value within a given range, whether a whole number, a decimal or a fraction

Qualitative variable: non-numerical data, e.g. eye color, gender

Scales:

1. Nominal: numbers define classes, but the ranking or ordering of the numbers has no significance
2. Ordinal: numbers define classes, and the ranking or ordering of the numbers is significant
3. Interval: any scale possessing a constant interval size

Collection of data:

1. Personal direct investigation
2. Indirect investigation
3. Questionnaires and surveys
4. Local sources (no formal investigation)
5. Enumerators

The main aims of classification are:

• To reduce a large set of data to an easily understood summary
• To display the points of similarity and dissimilarity
• To reflect the important aspects of the data
• To make comparison of the data, and inference from it, easier
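One common way to classify raw data is to group it into class intervals and count the observations in each. The sketch below does this for a made-up list of marks; the class width of 10 and the helper name `class_label` are assumptions introduced only for this example.

```python
from collections import Counter

# Hypothetical raw observations (made-up marks out of 50).
marks = [12, 27, 33, 8, 41, 25, 19, 36, 22, 30, 15, 44, 28, 31, 9]

CLASS_WIDTH = 10  # assumed class width, chosen only for this example

def class_label(value, width=CLASS_WIDTH):
    """Return the class interval, e.g. '20-29', that contains value."""
    lower = (value // width) * width
    return f"{lower}-{lower + width - 1}"

# Classification: count how many observations fall in each class.
frequency = Counter(class_label(m) for m in marks)

# Print the frequency table in order of the lower class limit.
for label in sorted(frequency, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>6}  {frequency[label]}")
```

Fifteen raw values are reduced to five class frequencies, which makes the concentration of marks in the middle classes easy to see.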

Frequency curves come in a variety of shapes. A unimodal curve is one that rises to a single peak and then declines. A bimodal curve has two distinct peaks.
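A minimal sketch of telling the two shapes apart is to count the peaks in a list of class frequencies. The function `count_peaks` and both frequency lists are hypothetical; the function counts only strict local maxima, so it assumes smooth data with no flat tops or noisy bumps.

```python
def count_peaks(frequencies):
    """Count strict local maxima in a list of class frequencies."""
    peaks = 0
    last = len(frequencies) - 1
    for i, f in enumerate(frequencies):
        # Treat positions beyond either end as lower than any frequency.
        left = frequencies[i - 1] if i > 0 else float("-inf")
        right = frequencies[i + 1] if i < last else float("-inf")
        if f > left and f > right:
            peaks += 1
    return peaks

unimodal = [1, 4, 9, 12, 8, 3, 1]      # rises to one peak, then declines
bimodal = [2, 7, 11, 5, 3, 9, 13, 6]   # two separate peaks

print(count_peaks(unimodal))  # 1
print(count_peaks(bimodal))   # 2
```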

Advantages and disadvantages of the measures of location:

MEAN
Advantages:
• Easy to compute and comprehend
• All observations are taken into account
• Can be determined for any set of numerical data
Disadvantages:
• Accuracy is affected by outliers
• Can give misleading results
• In a highly skewed distribution, the mean is not a good measure of location

GEOMETRIC MEAN
Advantages:
• Rigorously defined by a mathematical formula
• All observations are taken into account
• Not affected by sampling variability
Disadvantages:
• Cannot be computed for all sets
• Difficult to comprehend

HARMONIC MEAN
Advantages:
• Rigorously defined by a mathematical formula
• Not affected by sampling variability
• All observations have a bearing on its value
Disadvantages:
• Difficult to comprehend
• Cannot be computed for all types of sets

MEDIAN
Advantages:
• Easy to compute and comprehend
• Not affected by outliers
• In a highly skewed distribution, it is a good measure of location
Disadvantages:
• Has no strict definition
• Cannot be treated further mathematically
• Requires the data to be arranged in order, which is time consuming

MODE
Advantages:
• Simple to calculate
• Not affected by outliers
• Can be evaluated for both qualitative and quantitative data
Disadvantages:
• Cannot be treated further mathematically
• Has no strict definition
• Does not take all observations into account
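The contrast between these measures is easiest to see on a small data set with and without an outlier. The sketch below uses Python's standard `statistics` module (Python 3.8 or later is assumed, for `geometric_mean`); the data values are invented for illustration.

```python
import statistics

# Hypothetical positive data set, and the same data plus one outlier.
data = [2, 4, 4, 8, 16]
with_outlier = data + [1000]

for label, values in (("original", data), ("with outlier", with_outlier)):
    print(label)
    print("  mean           :", statistics.mean(values))
    print("  geometric mean :", statistics.geometric_mean(values))
    print("  harmonic mean  :", statistics.harmonic_mean(values))
    print("  median         :", statistics.median(values))
    print("  mode           :", statistics.mode(values))

# The mode can also be found for qualitative data.
eye_colors = ["brown", "blue", "brown", "green"]
print("most common eye color:", statistics.mode(eye_colors))
```

In this example the outlier pulls the mean far above the bulk of the data, while the median moves only from 4 to 6 and the mode stays at 4.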

An experiment that can result in different outcomes, even though it is repeated in the same manner every time, is called a random experiment.

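By analogy, the sample space (the set of all possible outcomes of a random experiment) corresponds to the population, and an event (a subset of the sample space) corresponds to a sample.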

When two events A and B have no outcomes in common, they are said to be mutually exclusive or disjoint events.
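These ideas map naturally onto sets. The sketch below assumes the roll of a six-sided die as the random experiment; the events `A`, `B` and `C`, the helper `mutually_exclusive` and the fixed seed are all hypothetical choices made for illustration.

```python
import random

# Sample space: all possible outcomes of one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

# A random experiment: repeated in the same manner, the outcome can differ.
rng = random.Random(1)   # fixed seed so the run is repeatable
rolls = [rng.choice(sorted(sample_space)) for _ in range(5)]
print("five rolls:", rolls)

# Events are subsets of the sample space.
A = {2, 4, 6}   # "the roll is even"
B = {1, 3, 5}   # "the roll is odd"
C = {4, 5, 6}   # "the roll is greater than 3"

def mutually_exclusive(e1, e2):
    """Two events are mutually exclusive when they share no outcome."""
    return e1.isdisjoint(e2)

print(mutually_exclusive(A, B))  # True: no roll is both even and odd
print(mutually_exclusive(A, C))  # False: 4 and 6 belong to both events
```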