Introduction to Statistical Terms

Dr Bryan Mills

Contents

• Some key statistical terms

• What makes useful output

• Sampling

Some Key Statistical Terms

• Statistics – turn data into information

• Inferential statistics – using a sample to talk about the whole population

• Variables – things that can vary e.g. student grades, height, etc.

• Empirical data – data collected from observation or measurement

The Problem

• Measurements – the basis of both models and statistics is being able to measure a variable numerically (quantitatively).

• Statistics – usually describe either a set of data or the strength of a relationship.

• Mathematical models – something along the lines of "this = that + something else * something other". These are often expressed as x = f(a, b, c), or income = f(age, social class, qualifications); in other words, x is a function of other variables.

Types of Data (Discrete)

• Nominal - differences only, e.g. voting preference, towns, types of beach (sandy, rocky, etc.), occupations and other named groups or discrete categories. Displayed and analysed with cross-tabulation (contingency tables) and the chi-square (χ²) test (Non-parametric).

• Ordinal - differences and magnitude, e.g. ratings in order, A, B, C grades, small - medium - large (Non-parametric). Use Mann-Whitney, Kruskal-Wallis or Spearman's rank correlation.
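
The cross-tabulation and chi-square approach mentioned for nominal data can be sketched in a few lines of Python. This is only an illustration: the towns, parties and counts are hypothetical, and the statistic is computed by hand rather than with a statistics package.

```python
from collections import Counter

# Hypothetical survey responses: (town, voting preference) pairs.
responses = (
    [("Town A", "Party X")] * 30 + [("Town A", "Party Y")] * 20
    + [("Town B", "Party X")] * 10 + [("Town B", "Party Y")] * 40
)

counts = Counter(responses)                       # observed cell counts
rows = sorted({town for town, _ in responses})
cols = sorted({party for _, party in responses})
total = len(responses)

# Margins of the contingency table.
row_totals = {r: sum(counts[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(counts[(r, c)] for r in rows) for c in cols}

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total.
chi2 = 0.0
for r in rows:
    for c in cols:
        expected = row_totals[r] * col_totals[c] / total
        chi2 += (counts[(r, c)] - expected) ** 2 / expected

degrees_of_freedom = (len(rows) - 1) * (len(cols) - 1)
print(f"chi-square = {chi2:.2f}, df = {degrees_of_freedom}")
```

With these made-up counts the statistic is about 16.67 on 1 degree of freedom, well above the 3.84 critical value at the 5% level, so town and voting preference would not look independent.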

Types of Data (Continuous)

• Interval - differences, magnitude and equal intervals, e.g. centimetres above and below an average height, IQ (125 is to 110 as 115 is to 100, but 120 is not twice 60), Centigrade. There is no true zero, however; height measured from 0 would be a ratio scale (Parametric).

• Ratio - differences, magnitude and equal intervals, plus a true zero, so it is possible to say this is twice that, e.g. MPH, size, Kelvin (Parametric).
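
The difference between interval and ratio scales can be seen by taking the same two temperatures in Centigrade and in Kelvin. The sketch below uses two arbitrary example temperatures; the function name is only illustrative.

```python
def celsius_to_kelvin(celsius):
    """Convert Centigrade (interval scale) to Kelvin (ratio scale)."""
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0
cold_k, warm_k = celsius_to_kelvin(cold_c), celsius_to_kelvin(warm_c)

# Differences mean the same thing on both scales (10 degrees apart)...
print(warm_c - cold_c, round(warm_k - cold_k, 2))

# ...but ratios only make sense on the scale with a true zero.
print(warm_c / cold_c)             # 2.0, yet 20 C is not "twice as hot" as 10 C
print(round(warm_k / cold_k, 3))   # about 1.035 on the Kelvin scale
```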

Type of analysis

• Between groups - between different groups (e.g. independent group t-test)

• Within groups - repeated measures, before and after an experiment (e.g. related samples t-test)
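
As a rough illustration of the two designs, the sketch below computes a t statistic both ways from the same numbers. The scores are hypothetical, the function names are invented for this example, and the independent version assumes equal variances (Student's pooled-variance t); a real analysis would normally use a statistics package and report a p-value as well.

```python
from math import sqrt
from statistics import mean, stdev

def independent_t(group_a, group_b):
    """Between groups: Student's t for two independent samples (pooled variance)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))

def related_t(before, after):
    """Within groups: t for repeated measures on the same people."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical test scores before and after an experiment.
before = [50, 55, 60, 65, 70]
after = [54, 58, 65, 68, 75]

# Treated as repeated measures on the same five people.
print(round(related_t(before, after), 2))

# Treated (wrongly, for this design) as two unrelated groups.
print(round(independent_t(after, before), 2))
```

On this made-up data the related-samples statistic is far larger than the independent one, because each person acts as their own control and the spread between people drops out of the comparison.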

Number of Variables

• Univariate - 1 variable

• Bivariate - 2 variables

• Multivariate - more than 2 variables

Meaningless Mean

• Mean grade = 56% but 7 students out of the 10 are below this.
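
A small sketch shows how this can happen. The grades below are hypothetical, chosen only so that the mean is 56% with 7 of the 10 students below it; a few high marks pull the mean above what most students scored.

```python
from statistics import mean, median

# Hypothetical grades (%) for 10 students.
grades = [40, 42, 45, 45, 48, 50, 50, 80, 80, 80]

average = mean(grades)
below_average = sum(1 for g in grades if g < average)

print("mean:", average)
print("students below the mean:", below_average)
print("median:", median(grades))
```

Here the median (49) describes the typical student better than the mean does.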

A Reminder

• Positivist
– Qualitative / Quantitative: both, but mostly quantitative
– Sample size: represents a large population
– Validity: often low
– Reliability: high

• Phenomenology
– Qualitative / Quantitative: qualitative
– Sample size: small and rich in data
– Validity: high
– Reliability: often low

What Makes Good Output

There are 2 main points to consider:

• Your audience

• The data

Sampling

• Statistics rely on having gathered enough data from a sample to be able to represent the population.

• A sample is a subset of the main population.

Stratification

• Population stratification
– Age
– Gender
– Ethnicity
– Other known characteristics

Ideal Response Size

• Sample size = (Ideal Response Size × 100) / Estimated Response Rate (%)

• Where:

• n = Number of usable questionnaires returned

• p = Proportion being estimated

• Z = Confidence coefficient (1.96 by convention)

• E = Error in proportion (<5% by convention)
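
The calculation can be sketched in Python. The formula for n did not survive in these notes, so the sketch assumes the standard one for estimating a proportion, n = Z² × p × (1 − p) / E², which uses the same symbols; the function names and the 30% response rate are hypothetical.

```python
from math import ceil

def ideal_response_size(p=0.5, z=1.96, e=0.05):
    """Usable responses needed, assuming n = Z^2 * p * (1 - p) / E^2.

    p = 0.5 is the most cautious choice because it maximises p * (1 - p).
    """
    return ceil(z ** 2 * p * (1 - p) / e ** 2)

def sample_size(ideal_responses, response_rate_percent):
    """Questionnaires to send out so that enough usable ones come back."""
    return ceil(ideal_responses * 100 / response_rate_percent)

n = ideal_response_size()          # Z = 1.96, E = 5%, p = 0.5
to_send = sample_size(n, 30)       # assuming a 30% estimated response rate

print(n, to_send)
```

Under these assumptions 385 usable questionnaires are needed, which means sending out 1284 if only 30% are expected to come back.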

Types of Sample (probability)

• Simple Random Sampling

• Stratified Random Sampling (proportional or quota)
– Divide into sub-groups and take a random sample from each

• Cluster (Area) Random Sampling
– Narrow down to area (e.g. districts)
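
The three probability designs above can be sketched with Python's standard random module. The population, the gender strata and the district numbering are all hypothetical, and the cluster step here takes everyone in the chosen districts, which is only one way of doing it.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 people, each with a gender and a district.
population = [
    {"id": i, "gender": "F" if i % 2 else "M", "district": i % 5}
    for i in range(100)
]

# Simple random sampling: every member has the same chance of selection.
simple = random.sample(population, 10)

# Stratified random sampling (proportional): sample within each sub-group,
# in proportion to that sub-group's share of the population.
stratified = []
for gender in ("F", "M"):
    stratum = [person for person in population if person["gender"] == gender]
    share = round(10 * len(stratum) / len(population))
    stratified += random.sample(stratum, share)

# Cluster (area) sampling: pick whole districts at random,
# then take everyone who lives in them.
chosen_districts = random.sample(range(5), 2)
cluster = [p for p in population if p["district"] in chosen_districts]

print(len(simple), len(stratified), len(cluster))
```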

Types of Sample (non-probability)

• Convenience Sampling

• Purposive Sampling
– Modal Instance Sampling (target 'typical')
– Expert Sampling (Delphi)
– Quota Sampling (work to a quota)
– Heterogeneity Sampling (diversity of views)
– Snowball Sampling