Introduction to Statistical Terms
Dr Bryan Mills
Contents
• Some key statistical terms
• What makes useful output
• Sampling
Some Key Statistical Terms
• Statistics – turn data into information
• Inferential statistics – using a sample to talk about the whole population
• Variables – things that can vary e.g. student grades, height, etc.
• Empirical data – data collected from observation or measurement
The Problem
• Measurements
– The basis of both models and statistics is being able to measure a variable numerically (quantitatively).
• Statistics
– Usually describe either a set of data or the strength of a relationship.
• Mathematical models
– Something along the lines of "this = that + something else * something other"
– These are often expressed as x = f(a, b, c) or income = f(age, social class, qualifications); in other words, x is a function of other variables.
Types of Data (Discrete)
• Nominal - differences only, e.g. voting preference, towns, types of beach (sandy, rocky, etc.), occupations, or other named groups and discrete categories. Use cross-tabulation (contingency tables) and chi-squared (χ²) for display and analysis (Non-parametric).
• Ordinal - differences and magnitude, e.g. ratings in order, A, B, C grades, small - medium - large (Non-parametric). Use Mann-Whitney, Kruskal-Wallis, Spearman's.
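A minimal sketch of cross-tabulating nominal data and computing Pearson's chi-squared statistic by hand (no continuity correction). The towns, parties and counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from collections import Counter

# Hypothetical survey: one (town, voting preference) pair per respondent
responses = (
    [("Northtown", "Party A")] * 30 + [("Northtown", "Party B")] * 20
    + [("Southtown", "Party A")] * 20 + [("Southtown", "Party B")] * 30
)

counts = Counter(responses)
rows = sorted({town for town, _ in responses})
cols = sorted({vote for _, vote in responses})

# Observed contingency table: one row per town, one column per party
observed = [[counts[(r, c)] for c in cols] for r in rows]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence = row total * column total / grand total
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

# Degrees of freedom = (rows - 1) * (columns - 1)
dof = (len(rows) - 1) * (len(cols) - 1)

print(observed)
print(f"chi-squared = {chi2:.2f}, degrees of freedom = {dof}")
```

With 1 degree of freedom the 5% critical value is about 3.84, so a statistic above that would suggest town and voting preference are related in this made-up sample.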
Types of Data (Continuous)
• Interval - differences, magnitude and equal intervals, e.g. centimetres above or below an average height; IQ (the gap from 110 to 125 is the same as the gap from 100 to 115, but 120 is not twice 60); Centigrade. There is no true zero, however, so ratios are not meaningful; height measured from 0 would be a ratio scale (Parametric).
• Ratio - differences, magnitude and equal intervals, plus a true zero and so the ability to say this is twice that, e.g. MPH, size, Kelvin (Parametric).
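A short sketch of why ratios only make sense on a ratio scale, using the Centigrade and Kelvin examples. The two temperatures are arbitrary illustrative values.

```python
def celsius_to_kelvin(c):
    """Convert Centigrade (interval scale) to Kelvin (ratio scale)."""
    return c + 273.15

a_c, b_c = 10.0, 20.0                       # arbitrary example temperatures
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences agree on both scales: equal intervals
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios do not agree: only Kelvin has a true zero
ratio_c = b_c / a_c   # 2.0, but 20 C is not "twice as hot" as 10 C
ratio_k = b_k / a_k   # roughly 1.035

print(f"difference: {diff_c:.1f} C, {diff_k:.1f} K")
print(f"ratio: {ratio_c:.3f} (Centigrade), {ratio_k:.3f} (Kelvin)")
```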
Type of analysis
• Between groups - between different groups (e.g. independent group t-test)
• Within groups - repeated measures, before and after an experiment (e.g. related samples t-test)
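A minimal sketch of the two designs, computing the t statistic by hand with the standard library. The scores are hypothetical, and the independent version assumes equal variances (pooled-variance Student's t); other variants exist.

```python
from math import sqrt
from statistics import mean, stdev, variance

def independent_t(a, b):
    """Between groups: Student's t with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2          # t statistic, degrees of freedom

def paired_t(before, after):
    """Within groups: t computed on each subject's own difference."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1                # t statistic, degrees of freedom

# Hypothetical scores for five subjects
before = [50, 55, 60, 65, 70]
after = [54, 57, 66, 68, 75]

# Same numbers, treated first as two separate groups, then as repeated measures
t_between, df_between = independent_t(after, before)
t_within, df_within = paired_t(before, after)

print(f"between groups: t = {t_between:.3f}, df = {df_between}")
print(f"within groups:  t = {t_within:.3f}, df = {df_within}")
```

The within-groups statistic is much larger here because pairing removes the variation between subjects, which is why the choice of design matters before the choice of test.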
Number of Variables
• Univariate - 1 variable
• Bivariate - 2 variables
• Multivariate - more than 2 variables
Meaningless Mean
• Mean grade = 56% but 7 students out of the 10 are below this.
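A small sketch of how this can happen. The ten grades are hypothetical, chosen so that the mean is 56% with seven students below it; a few high marks pull the mean above the typical student, and the median shows that.

```python
from statistics import mean, median

# Hypothetical grades (%) for 10 students: three high marks skew the mean
grades = [40, 42, 45, 45, 48, 50, 50, 80, 80, 80]

avg = mean(grades)
mid = median(grades)
below = sum(1 for g in grades if g < avg)

print(f"mean = {avg}, median = {mid}, students below the mean = {below}")
```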
A Reminder
Approach | Qualitative / Quantitative | Sample Size | Validity | Reliability
Positivist | Both, but mostly quantitative | Represents a large population | Often low | High
Phenomenology | Qualitative | Small and rich in data | High | Often low
What Makes Good Output
There are 2 main points to consider:
• Your audience
• The data
Sampling
• Statistics rely on having gathered enough data from a sample to be able to represent the population.
• A sample is a subset of the main population.
Stratification
• Population stratification
– Age
– Gender
– Ethnicity
– Other known characteristics
Ideal Response Size
• Sample size = Ideal Response Size / Estimated Response Rate (with the rate expressed as a fraction, e.g. 40% = 0.40)
• Where:
– n = Number of usable questionnaires returned
– p = Proportion being estimated
– Z = Confidence coefficient (1.96 by convention, i.e. 95% confidence)
– E = Error in proportion (<5% by convention)
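A minimal sketch of the calculation, assuming the ideal response size follows the standard formula for estimating a proportion, n = Z² p (1 − p) / E². That formula, the choice of p = 0.5 (the most cautious value) and the 30% response rate are assumptions for illustration, not values taken from the slides.

```python
from math import ceil

def ideal_response_size(p=0.5, z=1.96, e=0.05):
    """Usable responses needed: n = Z^2 * p * (1 - p) / E^2, rounded up.

    Assumed standard formula for estimating a proportion; p and e are fractions.
    """
    return ceil(z ** 2 * p * (1 - p) / e ** 2)

def sample_size(ideal, response_rate):
    """Questionnaires to send out, given the expected response rate (a fraction)."""
    return ceil(ideal / response_rate)

n = ideal_response_size(p=0.5, z=1.96, e=0.05)
to_send = sample_size(n, response_rate=0.30)   # hypothetical 30% response rate

print(f"ideal response size n = {n}")
print(f"questionnaires to send = {to_send}")
```

Rounding up at each step keeps the estimate on the safe side: a lower expected response rate quickly inflates the number of questionnaires that must go out.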
Types of Sample (probability)
• Simple Random Sampling
• Stratified Random Sampling
– Proportional or quota
– Divide into sub-groups and take a random sample from each
• Cluster (Area) Random Sampling
– Narrow down to an area (e.g. districts)
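The three probability designs can be sketched with the standard library as follows. The population, its gender split and its districts are hypothetical, and the function names are illustrative rather than taken from any package.

```python
import random
from collections import defaultdict

def simple_random_sample(population, k, seed=None):
    """Every unit has the same chance of selection."""
    return random.Random(seed).sample(population, k)

def stratified_sample(population, key, fraction, seed=None):
    """Proportional stratified sample: the same fraction drawn from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[key(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))   # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(population, key, n_clusters, seed=None):
    """Pick whole clusters (e.g. districts) at random and keep everyone in them."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for unit in population:
        clusters[key(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# Hypothetical population: 100 people, 60 F / 40 M, spread over 4 districts
population = [
    {"id": i, "gender": "F" if i % 5 < 3 else "M", "district": f"D{i % 4}"}
    for i in range(100)
]

srs = simple_random_sample(population, 10, seed=1)
strat = stratified_sample(population, lambda u: u["gender"], 0.10, seed=1)
clust = cluster_sample(population, lambda u: u["district"], 2, seed=1)

print(len(srs), len(strat), len(clust))
```

The stratified draw keeps the 60/40 gender split in the sample by construction, whereas the simple random draw only matches it on average.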
Types of Sample (non-probability)
• Convenience Sampling
• Purposive Sampling
– Modal Instance Sampling (target the 'typical' case)
– Expert Sampling (Delphi)
– Quota Sampling (work to a quota)
– Heterogeneity Sampling (diversity of views)
– Snowball Sampling