Dr Kelvin Ng Kuan Huei MBBS MRCP Specialist Registrar in CPT/GIM Crash Course in Statistics.
Dr Kelvin Ng Kuan Huei MBBS MRCP
Specialist Registrar in CPT/GIM
Crash Course in Statistics
‘There are three kinds of lies: lies, damned lies, and statistics.’
-- Benjamin Disraeli
Why understand statistics?
• Statistics help us to see patterns
• Bad statistics = Bad decisions
• If you don’t understand statistics, you can’t spot bad statistics
Quantitative vs Qualitative
Qualitative | Quantitative
Complete, detailed description | Classify, count and analyse statistically
Researcher may only roughly know the endpoint | Researcher knows what the endpoint is
Researcher is the data-gathering instrument | Researcher uses tools
Data in the form of pictures, words or objects | Data in the form of numbers
Subjective | Objective
‘Rich’, but more time-consuming and not generalizable | Efficient, allows hypothesis testing, but loses detail
Qualitative: ‘The red apple was the favourite as it was sweeter, crunchier and tastier, but on the other hand the green apple was more refreshing!’
Quantitative: ‘The red apple was the favourite compared with the green apple, with P<0.05.’
Observational studies vs RCT
• Experimental and quasi-experimental
• Observational studies
– Easy, fast and relatively cheap
– Dependent on stratification to deal with e.g. selection bias and covariates
• RCT (randomised controlled trial)
– Balances confounding factors
– Lacks generalizability, not always applicable, slow
Statistics
• Descriptive Statistics
– Describe or summarise data
• Inferential Statistics
– Make statistical inferences and draw conclusions
• Estimation
– Confidence interval
– Parameter estimation
• Hypothesis testing
– Null hypothesis
Descriptive statistics
• Measures of central tendency
– Mean, mode, median
• Measures of dispersion and variability
– Standard deviation, variance
• Diagrams e.g. stem and leaf, box plots
Descriptive statistics
Sample
– Raw data: 9, 4, 5, 4, 7, 4, 2, 5
– Sorted: 2, 4, 4, 4, 5, 5, 7, 9
– Mean = 5
– Median = 4.5
– Mode = 4
– Standard deviation = 2 (dividing by n; dividing by n − 1 gives about 2.14)
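The summary figures for this sample can be checked with Python's standard `statistics` module. This is a minimal sketch with illustrative variable names. It computes both forms of the standard deviation, since a value of 2 corresponds to the population formula (dividing by n) rather than the sample formula (dividing by n − 1).

```python
import statistics

# The sample from the slide.
sample = [9, 4, 5, 4, 7, 4, 2, 5]

mean = statistics.mean(sample)         # arithmetic mean
median = statistics.median(sample)     # middle value of the sorted data
mode = statistics.mode(sample)         # most frequent value
pop_sd = statistics.pstdev(sample)     # population SD (divides by n)
sample_sd = statistics.stdev(sample)   # sample SD (divides by n - 1)

print(mean, median, mode, pop_sd, round(sample_sd, 2))
```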
Inferential Statistics
• Reach conclusions beyond the immediate data alone, i.e. make inferences about the population based on a sample
• True state of affairs + chance = sample
– Sampling error
– Central limit theorem, i.e. the means of sufficiently large samples are approximately normally distributed
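Sampling error and the central limit theorem can be illustrated with a small simulation. The sketch below is not from the slides: the exponential population, the sample size of 50 and the 2000 repetitions are all assumptions chosen for illustration. It draws repeated samples from a skewed population and compares the spread of the sample means with the theoretical standard error, sigma / sqrt(n).

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 50   # observations per sample (illustrative choice)
N_SAMPLES = 2000   # number of repeated samples (illustrative choice)

def one_sample_mean(n):
    """Mean of n draws from Exponential(1), a skewed population with mean 1 and SD 1."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

sample_means = [one_sample_mean(SAMPLE_SIZE) for _ in range(N_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)   # should sit close to the true mean, 1
sd_of_means = statistics.stdev(sample_means)     # observed spread of the sample means
expected_se = 1.0 / SAMPLE_SIZE ** 0.5           # theoretical sigma / sqrt(n)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(expected_se, 3))
```

Each individual sample mean differs from the true mean by chance alone, but across many samples the means cluster around the truth with a predictable spread.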
Inferential Statistics
• Comparisons analysis
– Either compares means or medians between groups
• Correlation analysis
– Correlation does not imply causation
• Regression analysis
– Incorporates multiple covariates into equation
Comparisons Analysis
• T-test
– Comparisons of means
• Mann Whitney U and Wilcoxon matched pair test
– Comparisons of medians
• ANOVA and Kruskal Wallis test
– Comparison of means between unrelated groups (ANOVA)
– Comparisons of medians between unrelated groups (Kruskal Wallis test)
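As a concrete case of comparing means, here is a minimal sketch of the two-sample t statistic, assuming equal variances in both groups. The function name and the measurements are hypothetical, invented for illustration. The standard library has no t distribution, so the statistic would be compared against a t table: for 8 degrees of freedom the two-sided 5% critical value is 2.306.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Two-sample t statistic assuming equal variances; returns (t, degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (divides by n - 1)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pool the two variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t, df

# Hypothetical measurements, invented for illustration only.
group_a = [5.1, 4.9, 6.2, 5.8, 6.0]
group_b = [4.2, 4.8, 5.0, 4.4, 4.6]

t_stat, df = pooled_t_statistic(group_a, group_b)
print(round(t_stat, 2), df)
```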
Correlations analysis
• Linear datasets?
• Spearman rank correlation
– Ordinal data, but no need for normal distribution
• Pearson's product moment
– Interval data
Correlation does not imply cause and effect!
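The difference between the two coefficients shows up on data that rise steadily but not in a straight line. The sketch below uses hand-written helpers (`pearson`, `ranks` and `spearman` are illustrative names, and `ranks` assumes there are no tied values) on made-up data where y is the cube of x.

```python
import math

def pearson(x, y):
    """Pearson product moment correlation coefficient."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Rank each value from 1 (smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up data: perfectly monotonic, but not linear.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(round(r_pearson, 3), round(r_spearman, 3))
```

Spearman's coefficient depends only on the ordering, so it reaches 1 here, while Pearson's coefficient falls short of 1 because the relationship is not linear.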
Regression analysis
• Does not assume normal sampling. Allows modelling the dependence of one variable on one or more other variables
• Binomial dataset – Chi-squared test
• Linear regression
• Multiple regression
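For the simplest case, linear regression with one covariate, the fitted line can be computed directly by ordinary least squares. This is a minimal sketch: `fit_line` is an illustrative helper and the data points are invented.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented data, roughly following y = 2x.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
predicted = intercept + slope * 6   # predicted y for a new x of 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

Multiple regression extends the same idea to several covariates, each with its own coefficient.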
Linear regression
Multiple regression
Correlation vs regression
• Correlation
– Makes no assumption about which variable depends on the other
– Tests for interdependence
• Regression
– Assumes one variable is dependent on the covariates
– One-way causal relationship (in linear regression)
Correlation or regression analysis?
The P value
• It is not a measure of the hypothesis
• It is the probability of obtaining the result by chance, assuming the null hypothesis is true
• But the null hypothesis is not a random event!
• P value of <0.05 means a less than 5% chance of obtaining a result at least this extreme by chance
• Pre-test probability
– Bayesian probability
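One way to see what "obtaining the result by chance" means is an exact permutation test. If the null hypothesis is true, the group labels are arbitrary, so every way of relabelling the pooled data is equally likely. The P value is then the fraction of relabellings giving a difference in means at least as large as the one observed. This is a sketch on invented data, with an illustrative function name.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation P value for a difference in means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    n_b = len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    count = 0
    # Enumerate every way of choosing which values are labelled "group A".
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:   # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count

# Hypothetical measurements, invented for illustration only.
group_a = [5.1, 4.9, 6.2, 5.8, 6.0]
group_b = [4.2, 4.8, 5.0, 4.4, 4.6]

p_value = permutation_p_value(group_a, group_b)
print(round(p_value, 4))
```

The result says how surprising the data would be if the null hypothesis were true. It does not give the probability that the null hypothesis is true.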
The P value
• High P value
– Underpowered
– Limited clinical difference
• Low P value
– A large enough sample size will find that even trivial differences are associated with statistical significance
– Statistical significance does not equate to clinical significance
P value is no replacement for common sense!
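The effect of sample size can be shown with a short calculation. The sketch below assumes a two-sample z-test with a known common standard deviation and equal group sizes. These simplifications, the function name and the effect size of 0.02 standard deviations are all chosen for illustration.

```python
from statistics import NormalDist

def two_sided_p(difference, sd, n_per_group):
    """Two-sample z-test P value, assuming a known common SD and equal group sizes."""
    std_error = sd * (2 / n_per_group) ** 0.5
    z = abs(difference) / std_error
    return 2 * (1 - NormalDist().cdf(z))

# The same trivial difference (0.02 standard deviations) at two sample sizes.
p_small_study = two_sided_p(0.02, 1.0, 100)
p_huge_study = two_sided_p(0.02, 1.0, 100_000)

print(p_small_study, p_huge_study)
```

The difference between the groups is identical in both calls. Only the sample size changes, yet the larger study yields a P value far below 0.05.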
Type I and II Errors
• Type 1 error (α error)
– False positive, i.e. reject the null hypothesis when it is true
• Type 2 error (β error)
– False negative, i.e. fail to reject the null hypothesis when it is false
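Both error rates can be estimated by simulating many trials. The sketch below is illustrative only: it assumes normally distributed outcomes with a known standard deviation of 1, 30 subjects per group, a two-sided z-test at α = 0.05, and a true effect of 0.5 standard deviations for the type 2 case.

```python
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the sketch is repeatable

ALPHA = 0.05
N_PER_GROUP = 30
N_TRIALS = 2000
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided critical value, about 1.96

def rejects_null(true_difference):
    """Simulate one two-group trial (known SD of 1) and apply a two-sided z-test."""
    control = [random.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
    treated = [random.gauss(true_difference, 1.0) for _ in range(N_PER_GROUP)]
    std_error = (2 / N_PER_GROUP) ** 0.5
    z = (fmean(treated) - fmean(control)) / std_error
    return abs(z) > Z_CRIT

# Null hypothesis true (no difference): every rejection is a type 1 error.
type_1_rate = sum(rejects_null(0.0) for _ in range(N_TRIALS)) / N_TRIALS

# Null hypothesis false (true difference of 0.5 SD): every non-rejection is a type 2 error.
type_2_rate = sum(not rejects_null(0.5) for _ in range(N_TRIALS)) / N_TRIALS

print(type_1_rate, type_2_rate)
```

The type 1 rate should land near α. With these assumed settings the type 2 rate is roughly one half, i.e. a study of this size would miss a real effect of 0.5 SD about as often as it finds it.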
Type 1 error
Type 2 error
Subgroup analysis
• Not statistically powered
• Multiple testing
• Usually not adjusted for covariates
• Predetermined endpoints
• ISIS-2 and star signs
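The multiple testing problem is easy to quantify. If each subgroup is tested at α = 0.05 and the tests are treated as independent (an assumption made here for simplicity), the chance of at least one false positive grows quickly with the number of subgroups. The function names are illustrative. The Bonferroni threshold is one common correction, shown as an example rather than as the method used in any particular trial.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the overall rate at or below alpha."""
    return alpha / n_tests

one_test = familywise_error_rate(1)
twelve_subgroups = familywise_error_rate(12)   # e.g. one subgroup per star sign
per_test_alpha = bonferroni_threshold(12)

print(round(one_test, 3), round(twelve_subgroups, 3), round(per_test_alpha, 4))
```

With twelve subgroups the chance of at least one spurious "significant" finding is close to a coin flip.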
Hazard ratio
• Hazard ratio
– The risk of an event (e.g. death, composite endpoint) in one group relative to another group
– A value of 1 suggests no difference between comparator groups
– Often expressed with 95% confidence intervals
Relative vs absolute risk reduction
• Beware of headline-grabbing statements!
– If I buy two lottery tickets, I double my chances of winning (an increase of 100%)
– If I buy two lottery tickets, I increase my chance of winning to 0.0001%
• Significance of effect is dependent on incidence
• Important in health economics assessments.
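The dependence on incidence can be made concrete with two invented scenarios that share the same relative risk reduction. The function name and the event rates below are hypothetical.

```python
def risk_reduction(control_risk, treated_risk):
    """Return (relative risk reduction, absolute risk reduction)."""
    absolute = control_risk - treated_risk
    relative = absolute / control_risk
    return relative, absolute

# Rare event: risk falls from 2% to 1%.
rare_rrr, rare_arr = risk_reduction(0.02, 0.01)

# Common event: risk falls from 20% to 10%.
common_rrr, common_arr = risk_reduction(0.20, 0.10)

print(rare_rrr, rare_arr, common_rrr, common_arr)
```

Both scenarios could be headlined as "halves the risk", yet the absolute benefit is ten times larger when the event is common.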
Summary
Questions?
‘There are three kinds of lies: lies, damned lies, and statistics.’
-- Benjamin Disraeli