Measures of Central Tendency (Mean, Mode, Median,) · Mean contd. •It is closely related to...

Mean

• Mean is the most commonly used measure of central tendency.

• There are different types of mean, viz. arithmetic mean, weighted mean, geometric mean (GM) and harmonic mean (HM).

• If mentioned without an adjective (as mean), it generally refers to the arithmetic mean.

Arithmetic mean

• Arithmetic mean (or, simply, “mean”) is nothing but the average. It is computed by adding all the values in the data set and dividing the sum by the number of observations. If we have the raw data, the mean is given by the formula

$$\bar{X} = \frac{\sum X}{n}$$

• Where ∑ (the uppercase Greek letter sigma) refers to summation, X refers to the individual value and n is the number of observations in the sample (sample size).

• Research articles published in journals often do not provide raw data. In such a situation, readers can compute the mean from the frequency distribution:

$$\bar{X} = \frac{\sum fX}{n}$$

Mean contd.

• Where f is the frequency, X is the midpoint of the class interval and n is the number of observations.

• The standard statistical notations (in relation to measures of central tendency).

• The mean calculated from the frequency distribution is not exactly the same as that calculated from the raw data.

• It approaches the mean calculated from the raw data as the number of intervals increases.
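
As a rough illustration of the raw-data and frequency-distribution calculations, the following Python sketch computes the mean both ways. It borrows the ten intracranial pressure values from the worked example later in this transcript. The class intervals of width 10 and the function names are assumptions made only for this illustration.

```python
# Mean from raw data versus mean from a frequency distribution.
# The class intervals below are an assumption chosen for illustration.

def raw_mean(values):
    """Arithmetic mean: sum of the values divided by the number of observations."""
    return sum(values) / len(values)

def grouped_mean(classes):
    """Mean from a frequency distribution: sum(f * X) / n, where X is the
    midpoint of each class interval and f is its frequency."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total / n

pressures = [13, 32, 35, 42, 30, 19, 32, 27, 36, 31]  # mmHg

# (lower limit, upper limit, frequency); lower limit inclusive, upper exclusive
classes = [(10, 20, 2), (20, 30, 1), (30, 40, 6), (40, 50, 1)]

print(raw_mean(pressures))    # 29.7
print(grouped_mean(classes))  # 31.0
```

The grouped result differs from the raw mean because every observation in a class is treated as if it sat at the class midpoint.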

Mean contd.

• It is closely related to standard deviation, the most common measure of dispersion.

• The important disadvantage of mean is that it is sensitive to extreme values/outliers, especially when the sample size is small.

• Therefore, it is not an appropriate measure of central tendency for a skewed distribution.
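
The sensitivity to outliers can be seen in a small sketch. The data here are hypothetical and chosen only to make the effect obvious. The code uses the `mean` and `median` functions of Python's standard `statistics` module.

```python
import statistics

values = [2, 3, 4, 5, 6]        # hypothetical small sample
with_outlier = values + [100]   # the same sample plus one extreme value

# Mean and median before the extreme value is added
print(statistics.mean(values), statistics.median(values))

# Mean and median after the extreme value is added
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

Adding the single extreme value moves the mean from 4 to 20, while the median only moves from 4 to 4.5.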

The Median:

• If the items are arranged in ascending or descending order of magnitude, then the middle value is called the median.

• To find the median, we arrange the observations from the smallest to the largest value.

• If there is an odd number of observations, the median is the middle value. If there is an even number of observations, the median is the average of the two middle values.
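
The odd/even rule can be sketched in a few lines of Python. The function name `median` and the sample inputs are illustrative assumptions, not part of the original slides.

```python
def median(values):
    """Median of a list of numbers, following the odd/even rule."""
    ordered = sorted(values)   # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # odd number of observations: the single middle value
        return ordered[middle]
    # even number of observations: average of the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([7, 1, 5]))      # 5
print(median([7, 1, 5, 3]))   # 4.0
```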

The Mode

• The mode of a data set is the number that occurs most frequently in the set.

• To easily find the mode, put the numbers in order from least to greatest and count how many times each number occurs.

• The number that occurs the most is the mode.
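
The counting procedure can be sketched as follows. Returning a list so that ties are reported is an assumption that goes slightly beyond the slides, which describe a single mode. The function name `modes` and the inputs are hypothetical.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    if not values:
        raise ValueError("mode of an empty data set is undefined")
    counts = Counter(values)          # value -> number of occurrences
    highest = max(counts.values())    # the largest count observed
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([4, 2, 4, 3, 2, 4]))   # [4]
print(modes([1, 1, 2, 2, 3]))      # [1, 2]
```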

Measures of dispersion

• In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed.

• Common examples of measures of statistical dispersion are the variance, standard deviation, and interquartile range.

• Standard deviation is often considered the best measure of dispersion and is therefore the most widely used.

• Dispersion is defined as the breaking up or scattering of something.

• An example of a dispersion is throwing little pieces of stick all over a floor.

• Another example of a dispersion is the colored rays of light coming from a prism which has been hung in a sunny window.

• The formula for variance is given as:

$$\sigma^2 = \frac{\sum (X_i - \bar{X})^2}{N}$$

• Where $\sigma^2$ is the population variance, $\bar{X}$ is the population mean, $X_i$ is the ith element from the population and N is the number of elements in the population.

• The variance of a sample is defined by a slightly different formula:

It is stated as:

$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

• Where $S^2$ is the sample variance, $\bar{x}$ is the sample mean, $x_i$ is the ith element from the sample and n is the number of elements in the sample.

• The formula for the variance of a population has ‘N’ as the denominator, whereas the sample formula uses ‘n−1’.

• The expression ‘n−1’ is known as the degrees of freedom and is one less than the number of observations in the sample.
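
The difference between the two denominators can be seen in a minimal Python sketch. The data set and the function names are hypothetical and serve only to illustrate the population and sample variance described here.

```python
def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, mean = 5

print(population_variance(data))         # 4.0
print(round(sample_variance(data), 3))   # 4.571
```

Dividing by n − 1 always gives a slightly larger value than dividing by N, and the gap shrinks as the sample grows.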

Variance contd.

• The variance is measured in squared units. To make the interpretation of the data simple and to retain the basic unit of observation, the square root of variance is used.

• The square root of the variance is the standard deviation (SD). The SD of a population is defined by the following formula:

$$\sigma = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N}}$$

• Where $\sigma$ is the population SD, $\bar{X}$ is the population mean, $X_i$ is the ith element from the population and N is the number of elements in the population. The SD of a sample is defined by a slightly different formula:

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

• Where $s$ is the sample SD, $\bar{x}$ is the sample mean, $x_i$ is the ith element from the sample and n is the number of elements in the sample.
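
A short Python sketch of the population and sample SD follows. It takes the square root of the variance, dividing by N for a population and by n − 1 for a sample. The data and function names are hypothetical.

```python
import math

def population_sd(values):
    """Square root of the population variance (denominator N)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def sample_sd(values):
    """Square root of the sample variance (denominator n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample SD needs at least two observations")
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, mean = 5

print(population_sd(data))         # 2.0
print(round(sample_sd(data), 3))   # 2.138
```

Unlike the variance, both results are in the same unit as the original observations.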

Correlation and regression.

• Correlation is a statistical measure which determines co-relationship or association of two variables.

• Regression describes how an independent variable is numerically related to the dependent variable.

• Regression indicates the impact of a unit change in the known variable (x) on the estimated variable (y).

• Correlation and regression are two analyses based on a multivariate distribution. A multivariate distribution is a distribution of multiple variables.
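
The slides do not say which correlation coefficient or regression method is meant. The sketch below assumes the Pearson correlation coefficient and a simple least-squares line, which are common choices. The paired data and function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx   # change in y per unit change in x
    return slope, mean_y - slope * mean_x

x = [1, 2, 3, 4, 5]   # known variable (hypothetical values)
y = [2, 4, 5, 4, 5]   # estimated variable (hypothetical values)

slope, intercept = least_squares_line(x, y)
print(round(pearson_r(x, y), 3))            # 0.775
print(round(slope, 3), round(intercept, 3)) # 0.6 2.2
```

Here the correlation coefficient summarises the strength of the association, while the slope gives the estimated change in y for a unit change in x.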

Worked Examples:

• The intracranial pressures (mmHg) of 10 patients admitted to the Intensive Care Unit with severe head injury are 13, 32, 35, 42, 30, 19, 32, 27, 36 and 31.

• These data can be summarised to best represent the observations.

• We can rank the observations from lowest to highest: 13, 19, 27, 30, 31, 32, 32, 35, 36 and 42.

• We now get a clearer idea of the intracranial pressures in severe head injury and of the commonly observed values (the smallest and largest values are less representative of our sample).

Worked Examples contd.

• The sample mode (most commonly observed value) is 32 mmHg.

• The mean value is 29.7 mmHg.

• The median is the middle value. If there is an even number of observations, then the median is calculated as the average of the two middle values. The median is (31 + 32)/2 = 31.5 mmHg.
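
These three results can be checked with Python's standard `statistics` module. The variable name `pressures` is an assumption for this sketch, and the values are the ten intracranial pressures from the example.

```python
import statistics

# Intracranial pressures (mmHg) of the 10 patients in the worked example
pressures = [13, 32, 35, 42, 30, 19, 32, 27, 36, 31]

print(sorted(pressures))              # ranked from lowest to highest
print(statistics.mode(pressures))     # 32
print(statistics.mean(pressures))     # 29.7
print(statistics.median(pressures))   # 31.5
```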