Statistics for GP and the AKT
Uploaded by sara-glenn
Sept ‘11
Aims
• Be able to understand statistical terminology, interpret stats in papers and explain them to patients.
• Pass the AKT
Why should you care?
• 10% of questions
• Much less than 10% of the work
• Easy marks
Plan – don’t despair!
• Representing data:
– Parametric v non-parametric data
– Normal distribution and standard deviation
– Types of data
– Mean, median, mode
– Prevalence and incidence
• Types of research:
– Types of studies
– Grades of evidence
– Types of bias
– Tests of statistical significance
• Significance of results:
– P value
– Confidence intervals
– Type 1 and type 2 error
• Magnitude of results:
– NNT, NNH
– Absolute risk reduction, relative risk reduction
– Hazard ratio
– Odds ratio
• Clinical tests:
– Sensitivity, specificity
– Positive predictive value, negative predictive value
– Likelihood ratios for positive and negative test
• Pretty pictures:
– Forest plot
– Funnel plot
– Kaplan-Meier survival curve
The Normal Distribution
• Frequency on y axis and continuous variable on x axis
• Symmetrical: just as many have more than average as less than average
• Generally true for medical tests and measurements
Standard deviation
• A measure of spread
SD and the normal distribution
• 68.3% of data within 1 SD
• 95% of data within 1.96 SD
• 95.4% of data within 2 SD
• 99.7% of data within 3 SD
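The coverage figures for a normal distribution can be checked numerically. This is a minimal sketch using only the Python standard library; the function name coverage_within is made up for illustration.

```python
import math

def coverage_within(k_sd: float) -> float:
    """Fraction of a normal distribution lying within k_sd standard deviations of the mean."""
    # For a normal distribution, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k_sd / math.sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"within {k} SD: {coverage_within(k):.1%}")
```

The 1.96 SD row is the one that matters most later on, because it is where the 95% confidence interval comes from.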
Defining ‘normal’
• Can be used to define normal for medical tests e.g. Na
• But by definition, if ‘normal’ is the middle 95% (mean ± 1.96 SD), 2.5% of ‘normal’ people will be ‘too high’ and 2.5% ‘too low’.
Normality
Positive and negative skew
Parametric and non-parametric
• If it’s normally distributed, it’s parametric
• If it’s skewed, it’s non-parametric
Mean, median and mode
• Use mean for parametric data
• Median for non-parametric data
• In a normal distribution: Mean = median = mode
• For a negatively skewed distribution: Mean < median < mode
• For a positively skewed distribution: Mean > median > mode
• Remember alphabetical order, < for negative, > for positive
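A quick way to see the ordering for a positively skewed distribution is to compute all three averages on a small sample. The data below are invented purely for illustration (think of something like days off sick, with a long tail to the right).

```python
import statistics

# Hypothetical, positively skewed sample: most values are small, a few are large.
sample = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mean = statistics.mean(sample)      # pulled upwards by the long right tail
median = statistics.median(sample)  # middle value, less affected by the tail
mode = statistics.mode(sample)      # most common value

print(f"mean={mean}, median={median}, mode={mode}")
```

Here the mean is larger than the median, which is larger than the mode, which is the pattern expected for positive skew.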
What sort of distribution is this?
Which is a normal distribution?
Types of data
• Continuous – can take any value e.g. height
• Discrete – can only take integers e.g. number of asthma attacks
• Nominal – into categories in no particular order e.g. colour of Smarties
• Ordinal – into categories with an inherent rank e.g. Bristol stool chart
Prevalence and incidence
• Prevalence – proportion of people that have a disease at a given time
• Incidence – number of new cases per population per time
• Prevalence = incidence x length of disease
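The relationship between prevalence and incidence can be sketched in a couple of lines. The numbers are hypothetical, and the formula is only a rough approximation that assumes a stable situation (steady incidence and duration) and a fairly uncommon disease.

```python
def approximate_prevalence(incidence_per_person_year: float, mean_duration_years: float) -> float:
    """Prevalence ~= incidence x average length of disease (steady-state approximation)."""
    return incidence_per_person_year * mean_duration_years

# Hypothetical disease: 2 new cases per 1000 people per year, lasting 5 years on average.
prevalence = approximate_prevalence(incidence_per_person_year=2 / 1000, mean_duration_years=5)
print(f"approximate prevalence: {prevalence:.1%}")
```

This is why a chronic disease with a low incidence can still have a high prevalence, while a short illness with a high incidence may not.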
Types of research
• RCT
• Cohort
• Case-control
• Cross-sectional
Group work
• Definition
• Strengths
• Weaknesses
• Example where it would be the most appropriate study to use
RCT
• Interventional study
• Used to compare treatment(s) with a control group.
• Control group have placebo or current best treatment.
• Best evidence but…
• Expensive and ethical problems
• Two types
– Group comparative
– Cross-over
Cohort
• Longitudinal/follow-up studies.
• Usually prospective
• Assessed using relative risk
[Diagram: a population is selected and split into Exposed and Not exposed groups; each group is followed over time to see who develops Disease and who stays Well.]
Case control
• Usually retrospective
• Reverse cohort study
• Assessed using odds ratio
[Diagram: from a population, people with Disease (cases) and Well people (controls) are selected; looking back in time, each group is split into Exposed and Not exposed.]
Cross-sectional
• Prevalence study
• Evaluate a defined population at a specific time.
• Used to assess disease status and compare populations
Levels of Evidence
• Ia – Meta-analysis of RCTs
• Ib – RCT(s)
• IIa – well-designed non-randomised trial(s)
• IIb – well-designed quasi-experimental trial(s)
• III – case, correlation and comparative studies
• IV – panel of experts
Grades of Evidence
• A – levels Ia and Ib
• B – levels IIa, IIb and III
• C – level IV
Bias
• Confounding
• Observer
• Publication
• Sampling
• Selection
CARD SORT: for bonus points, spot the odd one out!
Bias
• Confounding
– Exposed and non-exposed groups differ with respect to characteristics independent of the risk factor.
• Observer
– The patient/clinician knows which treatment is being received.
– Outcome measure has a subjective element.
• Publication
– Statistically significant (positive) results are more likely to be published
– Negative results are less likely to be published
• Sampling
– Non-random selection from target population.
• Selection
– Intervention allocation to the next person is known before recruitment.
Avoiding Bias
• Confounding
– Study design
• Observer
– Blinding
• Publication
– Journals accept more outcomes with non-significant results
• Sampling
– Compare groups statistically
• Selection
– Randomisation
Chance…
Types of significance tests: Qualitative
• Single sample (my sample vs manufacturer’s claim)
– Binomial test
• >1 independent sample (drug A vs drug B)
– Small sample – Fisher exact test
– Larger sample – Chi-squared
• Dependent sample
– Percentage agreement (+/- Kappa statistic)
Types of significance tests: Quantitative – Parametric
• Single sample
– Student one-sample t-test
• Two independent samples
– Student independent samples t-test
• Two dependent samples
– Student dependent samples (paired) t-test
• >2 independent samples
– One-way ANOVA
• >2 dependent samples
– Repeated-measures ANOVA
• Correlation
– Pearson correlation coefficient
Types of significance tests: Quantitative – Non-parametric
• Single sample
– Kolmogorov-Smirnov test
• Two independent samples
– Mann-Whitney U test
• Two dependent samples
– Wilcoxon matched-pairs signed-rank test
• >2 independent samples
– Kruskal-Wallis test
• >2 dependent samples
– Friedman test
• Correlation
– Spearman rank correlation
Types of significance tests: summary table
Data type | 1 sample | 2 samples | >2 samples | Correlation
Qualitative | Binomial | Ind: Fisher’s / Chi-squared*; Dep: % agreement | (same as 2 samples) | -
Quantitative, parametric | Student | Student | Ind: one-way ANOVA; Dep: ANOVA | Pearson
Quantitative, non-parametric | Kolmogorov-Smirnov | Ind: Mann-Whitney; Dep: Wilcoxon | Ind: Kruskal-Wallis; Dep: Friedman | Spearman
Ind = independent samples, Dep = dependent samples
*Chi-squared can be used to compare quantitative data if you look at proportions/percentages
P value
“The p value is equal to the probability of achieving a result at least as extreme as the experimental outcome by chance”
• Usually the significance level is 0.05, i.e. if there were no real difference, a result at least this extreme would occur by chance less than 5% of the time
Hypothesis
• Null hypothesis – states that there is no difference between the 2 treatments
Errors
• Type I error:
– False positive
– The null hypothesis is rejected when it is true
– Probability is equal to the significance level (e.g. 0.05)
– Depends on significance level set, not on sample size
– Risk increased if multiple end points
• Type II error:
– False negative
– The null hypothesis is accepted when it is false, i.e. fail to find a statistically significant difference that really exists
– More likely if small sample size
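Why multiple end points increase the risk of a type I error can be shown with a short calculation. This sketch assumes the end points are tested independently, each at the same significance level, which is a simplification; the function name is made up for illustration.

```python
def chance_of_any_false_positive(alpha: float, n_end_points: int) -> float:
    """Probability of at least one type I error across n independent tests at level alpha."""
    # Chance that every test avoids a false positive is (1 - alpha) ** n.
    return 1 - (1 - alpha) ** n_end_points

for n in (1, 5, 10):
    print(f"{n} end point(s): {chance_of_any_false_positive(0.05, n):.1%}")
```

With one end point the risk is 5%, but under these assumptions it rises to roughly 23% with five end points and about 40% with ten.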
Error
Sample populations
Confidence intervals
• 95% confidence interval means you are 95% sure that the result for the true population lies within this range
• The bigger the sample, i.e. the more representative of the true population, the smaller the confidence interval.
Confidence intervals (the maths)
• For 95% confidence interval: Mean ± 1.96 x SEM
• Standard error of the mean (SEM) = SD / √n
i.e. standard deviation divided by the square root of the sample size (number of observations)
As the sample size increases, SEM decreases.
Confidence intervals
• We measure the concentration span of a sample of 36 VTS trainees. The mean concentration span is 2.4 seconds and the standard deviation is 1.2 seconds.
• What is the approximate 95% confidence interval?
1. 1.2 – 3.6 seconds
2. Too short to measure and getting shorter
3. 2.2 – 2.6 seconds
4. 2.3 – 2.5 seconds
5. 2.0 – 2.8 seconds
6. I don’t care
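The question can be worked through with the formulas Mean ± 1.96 x SEM and SEM = SD / √n. This is a minimal sketch; the function name confidence_interval_95 is made up for illustration, and it assumes the sample mean is approximately normally distributed.

```python
import math

def confidence_interval_95(mean: float, sd: float, n: int) -> tuple[float, float]:
    """Approximate 95% confidence interval for a mean: mean +/- 1.96 x SEM."""
    sem = sd / math.sqrt(n)   # standard error of the mean
    margin = 1.96 * sem
    return mean - margin, mean + margin

# Values from the question: 36 trainees, mean 2.4 seconds, SD 1.2 seconds.
low, high = confidence_interval_95(mean=2.4, sd=1.2, n=36)
print(f"{low:.1f} - {high:.1f} seconds")
```

Here SEM = 1.2 / 6 = 0.2, so the interval is roughly 2.4 ± 0.4, i.e. 2.0 – 2.8 seconds.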
Confidence intervals and trials
• If the confidence interval of a difference doesn’t include 0, then the result is statistically significant.
After 30 minutes of stats, the mean reduction in attention span was 2.3 minutes (0.8 – 3.8).
• If the confidence interval of a relative risk doesn’t include 1, then the result is statistically significant.
Relative risk of death after learning about stats was 0.7 (0.3 – 1.1)
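The rule for reading a confidence interval can be sketched as a simple check: the result is statistically significant if the interval excludes the "no effect" value (0 for a difference, 1 for a ratio). The function name is_significant is made up for illustration.

```python
def is_significant(ci_low: float, ci_high: float, no_effect_value: float) -> bool:
    """True if the confidence interval does not contain the 'no effect' value."""
    return not (ci_low <= no_effect_value <= ci_high)

# Mean reduction in attention span 2.3 minutes (0.8 - 3.8): a difference, so compare with 0.
print(is_significant(0.8, 3.8, no_effect_value=0))   # True

# Relative risk of death 0.7 (0.3 - 1.1): a ratio, so compare with 1.
print(is_significant(0.3, 1.1, no_effect_value=1))   # False
```

So the first example is statistically significant, while the second is not, because its interval includes 1.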
Magnitude of results
– NNT, NNH
– Absolute risk reduction, relative risk reduction
– Hazard ratio
– Odds ratio
Relative risk
• How many times more likely if….?
• EER = Exposed (or experimental) event rate
• CER = Control event rate
• RR = EER / CER
Group | Disease | Total | Event rate
Exposed | A | B | EER = A/B
Control | C | D | CER = C/D
Relative risk reduction (or increase)
RRR (RRI) = (EER − CER) / CER
RRR = relative risk reduction (EER lower than CER)
RRI = relative risk increase (EER higher than CER)
EER = exposed event rate
CER = control event rate
Watch your R’s!
Hazard
• Hazard ratio (HR) – estimate of RR over time
– Death rate in A / Death rate in B (2 = twice as many, 0.5 = half as many)
– Note: hazard ratio does not reflect median survival time; it is the relative probability of dying
Number needed to treat (NNT)
Number needed to harm (NNH)
• How many patients need to be treated to...
• Absolute risk reduction (ARR) = EER − CER
NNT = 1/ARR = 1/(EER − CER)
Scenario
• Claire Stewart thought women with no hair were more likely to pass CSA because having hair would distract trainees by getting in their eyes.
• She tested this by randomising her female trainees.
• What is the relative risk of passing?
• What is the RRR/RRI?
• What is the NNT?
Group | Pass CSA | Fail CSA
Control group | 15 | 15
Shaved trainees | 20 | 5
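The scenario can be worked through with RR = EER / CER, RRI = (EER − CER) / CER and NNT = 1 / (EER − CER). This is a minimal sketch: the "event" here is passing the CSA, and rounding the NNT up to a whole person is the usual convention rather than something stated in the scenario.

```python
import math

# Counts from the scenario table.
shaved_pass, shaved_fail = 20, 5
control_pass, control_fail = 15, 15

eer = shaved_pass / (shaved_pass + shaved_fail)      # exposed event rate = 20/25
cer = control_pass / (control_pass + control_fail)   # control event rate = 15/30

relative_risk = eer / cer                 # how many times more likely to pass
relative_risk_increase = (eer - cer) / cer
absolute_difference = eer - cer
nnt = math.ceil(1 / absolute_difference)  # round up to a whole trainee

print(f"EER={eer:.2f}, CER={cer:.2f}")
print(f"RR={relative_risk:.2f}, RRI={relative_risk_increase:.0%}, NNT={nnt}")
```

Shaved trainees are 1.6 times as likely to pass (a 60% relative risk increase), and about 4 trainees would need to be shaved for one extra pass.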
Odds ratio
• Used in case control studies
• Odds ratio: case odds / control odds
It doesn’t need the total.
Group | RF | No RF | Odds
Case | A | B | A/B
Control | C | D | C/D
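The odds ratio from the table layout can be sketched as below. The counts are invented purely for illustration and do not come from any study; the function name odds_ratio is also made up.

```python
def odds_ratio(case_rf: int, case_no_rf: int, control_rf: int, control_no_rf: int) -> float:
    """Odds ratio = case odds / control odds = (A/B) / (C/D)."""
    case_odds = case_rf / case_no_rf
    control_odds = control_rf / control_no_rf
    return case_odds / control_odds

# Hypothetical counts: 30 of 100 cases had the risk factor, 10 of 100 controls did.
result = odds_ratio(case_rf=30, case_no_rf=70, control_rf=10, control_no_rf=90)
print(f"odds ratio = {result:.2f}")
```

Note that only the four cell counts are used, which is why the odds ratio works for case control studies where the totals are fixed by the study design.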
How good is a test at predicting disease?
• If the test is negative, how sure can you be that you don’t have the disease?
• If the test is positive, how sure can you be that you do have the disease?
Tests
Learn this!
Sensitivity and specificity
• Sensitivity – proportion of people that have the disease that test positive
• Specificity – proportion of people that don’t have the disease that test negative
Sensitivity and specificity
Predictive values
• Positive predictive value – proportion of positive tests that actually represent disease
• Negative predictive value – proportion of negative tests that correctly represent absence of disease
Learn this!
Likelihood ratios
• Do not depend on the prevalence of disease (unlike predictive values), so are more useful
• Likelihood ratio for a positive test = sensitivity / (1 – specificity)
• Likelihood ratio for a negative test = (1 – sensitivity) / specificity
• A likelihood ratio of greater than 1 indicates the test result is associated with the disease.
• A likelihood ratio less than 1 indicates that the result is associated with absence of the disease.
• A likelihood ratio close to 1 means the test is not very useful
An example….
• In a VTS group of 110 people, 30 people have the dreaded lurgy. A test is developed for this. Of the 30 people with the dreaded lurgy, 18 have a positive test. 16 of the others also have a positive test.
• What is the likelihood ratio for a positive test?
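The example can be worked through by first filling in the 2 x 2 table and then applying the definitions of sensitivity, specificity and the likelihood ratios. This is a minimal sketch using the numbers from the example; the variable names are chosen for illustration.

```python
# 110 people, 30 with the disease; 18 of those test positive, 16 of the other 80 also test positive.
true_pos = 18
false_neg = 30 - 18          # diseased but test negative
false_pos = 16
true_neg = (110 - 30) - 16   # well and test negative

sensitivity = true_pos / (true_pos + false_neg)   # 18/30
specificity = true_neg / (true_neg + false_pos)   # 64/80

lr_positive = sensitivity / (1 - specificity)
lr_negative = (1 - sensitivity) / specificity

ppv = true_pos / (true_pos + false_pos)   # positive predictive value
npv = true_neg / (true_neg + false_neg)   # negative predictive value

print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
print(f"LR+={lr_positive:.1f}, LR-={lr_negative:.1f}")
print(f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

The likelihood ratio for a positive test comes out at 3 (0.6 / 0.2), so a positive result is associated with the disease.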
Pretty pictures
– Forest plot
– Funnel plot
– Kaplan-Meier survival curve
Forest plots
aka Blobbograms
• Used in meta-analysis
• Graphical representation of results of different RCTs
[Figure: forest plot. One row per study, with a box at the study’s odds ratio (size of box = study size) and a horizontal line for its confidence interval; the summary measure is shown with its own odds ratio and confidence interval. Axis: OR (CI).]
Funnel plot
• Used in meta-analysis
• Demonstrates the presence/absence of publication bias
[Figure: funnel plot. Y axis: measure of precision. X axis: treatment effect. Each point is an individual study.]
• Increased precision of study = reduced variance
• Asymmetrical funnel = publication bias (missing data/studies)
Kaplan-Meier Survival Curve
• What % of people are still alive
Scenario
• We’ve driven Sarah Egan to insanity by not doing enough learning logs.
• She’s gone on a rampage with a gun because basically life will be better without any of us around (nothing to do with pregnancy hormones…obviously)
• Draw the Kaplan-Meier survival curve for MK GP trainees
[Figure: blank axes. X axis: Time (units). Y axis: Number of trainees.]
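How the steps of a Kaplan-Meier curve are calculated can be sketched as follows. The follow-up data are invented purely for illustration, and the function is a bare-bones version of the estimator (no confidence intervals), not a replacement for a proper statistics package.

```python
def kaplan_meier(times, died):
    """Return (time, proportion surviving) steps of a Kaplan-Meier curve.

    times: follow-up time for each person.
    died:  True if the person died at that time, False if censored (lost to follow-up).
    """
    at_risk = len(times)
    surviving = 1.0
    curve = [(0, 1.0)]
    for t in sorted(set(times)):
        deaths = sum(1 for ti, d in zip(times, died) if ti == t and d)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at time t
        if deaths:
            # The curve only steps down when someone dies.
            surviving *= 1 - deaths / at_risk
            curve.append((t, surviving))
        at_risk -= leaving
    return curve

# Hypothetical follow-up of six people (time units are arbitrary).
times = [1, 2, 2, 3, 4, 5]
died = [True, True, False, True, False, True]
for t, s in kaplan_meier(times, died):
    print(f"time {t}: {s:.0%} still alive")
```

Censored people do not cause a step down, but they do reduce the number at risk, so later deaths produce bigger steps.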
Any questions?