bio statistics for clinical research
Statistical Methods in Clinical Research
Dr Ranjith P
DNB Resident, ACME Pariyaram, Kerala
Overview
  Data types
  Summarizing data using descriptive statistics
  Standard error
  Confidence intervals
  P values
  Alpha and beta errors
  Statistics for comparing 2 or more groups with continuous data
  Non-parametric tests
  Regression and correlation
  Risk ratios and odds ratios
  Survival analysis
  Cox regression
  Forest plot
  PICOT
Types of Data: Discrete Data
  Discrete data: a limited number of choices
  Binary: two choices (yes/no)
    Dead or alive
    Disease-free or not
  Categorical: more than two choices, not ordered
    Race
    Age group
  Ordinal: more than two choices, ordered
    Stages of a cancer
    Likert scale for response, e.g. strongly agree, agree, neither agree nor disagree, etc.
Types of Data: Continuous Data
  Theoretically infinite possible values (within physiologic limits), including fractional values
    Height, age, weight
  Can be interval
    The interval between measures has meaning; the ratio of two interval data points has no meaning
    Temperature in Celsius, day of the year
  Can be ratio
    The ratio of the measures has meaning
    Weight, height
Types of Data: Why Important?
  The type of data defines:
  The summary measures used
    Mean and standard deviation for continuous data
    Proportions for discrete data
  The statistics used for analysis, for example:
    t-test for normally distributed continuous data
    Wilcoxon rank-sum test for non-normally distributed continuous data
Descriptive Statistics
  Characterize the data set
  Graphical presentation
    Histograms
    Frequency distribution
    Box and whiskers plot
  Numeric description
    Mean, median, SD, interquartile range
Histogram
  Continuous data
  No segmentation of data into groups
Frequency Distribution
  Segmentation of data into groups
  Discrete or continuous data
Box and Whiskers Plots
  Popular in epidemiologic studies
  Useful for presenting comparative data graphically
Numeric Descriptive Statistics
  Measures of central tendency of data
    Mean
    Median
    Mode
  Measures of variability of data (dispersion)
    Standard deviation, mean deviation
    Interquartile range, variance
Mean
  Most commonly used measure of central tendency
  Best applied to normally distributed continuous data
  Not applicable to categorical data
  Definition: the sum of all the values in a sample, divided by the number of values
  E.g. mean height of 6 adolescent children: 146, 142, 150, 148, 156, 140
  Answer: 882/6 = 147
Median
  Used to indicate the "average" in a skewed population
  Often reported with the mean
  If the mean and the median are the same, the distribution is roughly symmetric (this alone does not show that the sample is normally distributed)
  It is the middle value from an ordered listing of the values
    If there is an odd number of values, it is the middle value: 1, 2, 3, 4, 5, i.e. 3
    If there is an even number of values, it is the average of the two middle values: 1, 2, 3, 4, 5, 6, i.e. (3 + 4)/2 = 3.5
  It is the mid-value (50th percentile) within the interquartile range
Mode
  Infrequently reported as a value in studies
  Is the most common value, e.g. 1, 3, 8, 9, 5, 8, 5, 6
  Here 5 and 8 each occur twice, so the data set has two modes: 5 and 8
Interquartile Range
  Is the range of data from the 25th percentile to the 75th percentile
  Common component of a box and whiskers plot
    It is the box, and the line across the box is the median or middle value
    Rarely, the mean will also be displayed
Mean Deviation and Standard Deviation
  n is the number of observations, $\bar{X}$ is the mean, and $X$ is each observation
  Mean deviation $= \sum |X - \bar{X}| / n$
  Mean square deviation = variance $= \sum (X - \bar{X})^2 / n$
  Root mean square deviation = standard deviation (SD) $= \sqrt{\sum (X - \bar{X})^2 / n}$
Variance
  Square of the SD (standard deviation)
  Coefficient of variation = (SD / mean) × 100
  E.g. if the SD is 3 and the mean is 150:
    Variance is 9; coefficient of variation is 300/150 = 2%
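The summary measures defined so far can be checked with a short Python sketch using the standard `statistics` module. It reuses the six heights from the mean example and the eight values from the mode example. The population (divide by n) forms are assumed here because that is how the slide formulas are written; many packages divide by n − 1 for a sample instead.

```python
import statistics

heights = [146, 142, 150, 148, 156, 140]   # heights from the mean example

mean = statistics.mean(heights)            # 882 / 6
median = statistics.median(heights)        # average of the two middle values
variance = statistics.pvariance(heights)   # sum of squared deviations / n
sd = statistics.pstdev(heights)            # square root of the variance
mean_dev = sum(abs(x - mean) for x in heights) / len(heights)
cv_percent = sd / mean * 100               # coefficient of variation

# multimode returns every value tied for most frequent
modes = statistics.multimode([1, 3, 8, 9, 5, 8, 5, 6])

print(f"mean={mean}, median={median}, SD={sd:.2f}, variance={variance:.2f}")
print(f"mean deviation={mean_dev:.2f}, CV={cv_percent:.1f}%, modes={sorted(modes)}")
```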
Standard Error
  A fundamental goal of statistical analysis is to estimate a parameter of a population based on a sample
  The values of a specific variable from a sample are an estimate for the entire population of individuals who might have been eligible for the study
  The standard error is a measure of the precision of a sample estimate
Standard Error of the Mean
  Standard deviation / square root of the sample size (if the sample is greater than 60): SD/√n
  Important: dependent on sample size
    The larger the sample, the smaller the standard error
Clarification
  Standard deviation measures the variability or spread of the data in an individual sample
  Standard error measures the precision of the estimate of a population parameter provided by the sample mean or proportion
Standard Error: Significance
  Is the basis of confidence intervals
  A 95% confidence interval is defined by: sample mean (or proportion) ± 1.96 × standard error
  Since standard error is inversely related to the square root of the sample size:
    The larger the study (sample size), the narrower the confidence interval and the greater the precision of the estimate
  For normally distributed data:
    Mean ± 1 SD covers 68.27% of values
    Mean ± 2 SD covers 95.45% of values
    Mean ± 3 SD covers 99.73% of values
    Mean ± 4 SD covers 99.99% of values
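The coverage figures for a normal distribution, and the 1.96 multiplier used for a 95% interval, can be reproduced with Python's standard `statistics.NormalDist`. This is only an illustrative check; the helper name `coverage` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

def coverage(k: float) -> float:
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 1.96, 2, 3, 4):
    print(f"mean +/- {k} SD covers {coverage(k):.2%} of values")
```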
Confidence Intervals
  May be used to assess a single point estimate such as a mean or proportion
  Most commonly used in assessing the estimate of the difference between two groups
  Commonly reported in studies to provide an estimate of the precision of the mean
P Values
  The probability of obtaining a result at least as extreme as the one observed through chance alone, assuming that the null hypothesis is true
  Typically, an estimate that has a P value of 0.05 or less is considered to be "statistically significant", or unlikely to occur due to chance alone; the null hypothesis is rejected
  The P value threshold used is an arbitrary value
    P value of 0.05 equals a 1 in 20 chance
    P value of 0.01 equals a 1 in 100 chance
    P value of 0.001 equals a 1 in 1000 chance
Errors: Type I Error
  Claiming a difference between two samples when in fact there is none
  Remember there is variability among samples: they might seem to come from different populations but they may not
  Also called the α error
  Typically 0.05 is used
Errors: Type II Error
  Claiming there is no difference between two samples when in fact there is
  Also called a β error
  The probability of not making a Type II error is 1 − β, which is called the power of the test
  Hidden error, because it can't be detected without a proper power analysis
Errors
                                 Test result
  Truth                          Null hypothesis (H0)    Alternative hypothesis (H1)
  Null hypothesis (H0)           No error                Type I error
  Alternative hypothesis (H1)    Type II error           No error
Sample Size Calculation
  Also called "power analysis"
  When designing a study, one needs to determine how large a study is needed
  Power is the ability of a study to avoid a Type II error
  Sample size calculation yields the number of study subjects needed, given a certain desired power to detect a difference and a certain level of P value that will be considered significant
  Depends on:
    Level of Type I error: 0.05 typical
    Level of Type II error: 0.20 typical
    One-sided vs two-sided: nearly always two
    Inherent variability of the population, usually estimated from preliminary data
    The difference that would be meaningful between the two assessment arms
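A minimal sketch of how these inputs combine, assuming a two-sided comparison of two means with equal group sizes and the usual normal approximation, n per group = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². The function name `n_per_group` and the example numbers (SD of 10, meaningful difference of 5) are hypothetical and not taken from the slides.

```python
import math
from statistics import NormalDist

def n_per_group(sd: float, delta: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects needed per arm to detect a mean difference `delta`."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided Type I error
    z_beta = z.inv_cdf(power)            # power = 1 - Type II error
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)                  # round up to whole subjects

# Hypothetical inputs: population SD 10, meaningful difference 5
print(n_per_group(sd=10, delta=5))               # alpha 0.05, power 0.80
print(n_per_group(sd=10, delta=5, power=0.90))   # higher power needs more subjects
print(n_per_group(sd=10, delta=10))              # larger difference needs fewer
```

Greater variability, a smaller meaningful difference, or a higher desired power all push the required sample size up.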
One-sided vs. Two-sided
  Most tests should be framed as a two-sided test
  When comparing two samples, we usually cannot be sure which is going to be better
    You never know which direction study results will go
  For routine medical research, use only two-sided tests
Statistical Tests
  Parametric tests
    Continuous data, normally distributed
  Non-parametric tests
    Continuous data, not normally distributed
    Categorical or ordinal data
Comparison of 2 Sample Means
  Student's t test
    Assumes normally distributed continuous data
    t value = (difference between means) / (standard error of the difference)
    The t value is then looked up in a table to determine significance
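The t value described here can be sketched in a few lines of Python. This assumes the pooled-variance (equal variances) form of Student's t test for two independent samples; the function name `student_t` and the blood pressure readings are hypothetical illustrations.

```python
import math
import statistics

def student_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)  # n - 1 form
    df = na + nb - 2
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))   # standard error of the difference
    return (mean_a - mean_b) / se_diff, df

# Hypothetical systolic blood pressures (mmHg)
treated = [118, 122, 125, 119, 121, 124]
control = [128, 131, 126, 133, 129, 127]

t_value, df = student_t(treated, control)
print(f"t = {t_value:.2f} with {df} degrees of freedom")
```

The resulting t value and degrees of freedom are what one would look up in a t table (or pass to statistical software) to obtain the P value.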
Paired T Tests
  Uses the change before and after an intervention in a single individual
  Reduces the degree of variability between the groups
  Given the same number of patients, has greater power to detect a difference between groups
Analysis of Variance (ANOVA)
  Used to determine if two or more samples are from the same population
  If there are two samples, it is the same as the t test
  Usually used for 3 or more samples
Non-parametric Tests
  Testing proportions
    (Pearson's) chi-squared (χ²) test
    Fisher's exact test
  Testing ordinal variables
    Mann-Whitney U test
    Kruskal-Wallis one-way ANOVA
  Testing ordinal paired variables
    Sign test
    Wilcoxon signed-rank test
Use of Non-parametric Tests
  Use for categorical, ordinal or non-normally distributed continuous data
  May run both parametric and non-parametric tests to check for congruity
  Most non-parametric tests are based on ranks or other non-value-related methods
  Interpretation: is the P value significant?
(Pearson's) Chi-Squared (χ²) Test
  Used to compare the observed proportions of an event with those expected
  Used with nominal data (better/worse; dead/alive)
  If there is a substantial difference between observed and expected, then it is likely that the null hypothesis will be rejected
  Often presented as a 2 × 2 table
Non-parametric Tests
  For comparing 2 related samples
    -Wilcoxon Signed Rank Test
  For comparing 2 unrelated samples
    -Mann-Whitney U Test
  For comparing >2 groups
    -Kruskal-Wallis Test
Mann–Whitney U Test
  The Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum, or Wilcoxon–Mann–Whitney test) is a non-parametric test of the null hypothesis that two populations are the same, against the alternative that a particular population tends to have larger values than the other
  It has greater efficiency than the t-test on non-normal distributions, such as a mixture of normal distributions, and it is nearly as efficient as the t-test on normal distributions
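As a rough illustration of what the U statistic measures, the sketch below counts, over every pair of observations, how often a value from one sample exceeds a value from the other (ties count as one half). This pair-counting form is equivalent to the rank-sum calculation; the function name and the pain scores are hypothetical, and converting U to a P value is left to tables or statistical software.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y): U_x counts pairs where an x value beats a y value."""
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5          # ties contribute half to each side
    u_y = len(x) * len(y) - u_x     # the two U values always sum to n_x * n_y
    return u_x, u_y

# Hypothetical ordinal pain scores in two unrelated groups
group_a = [3, 4, 2, 6, 2, 5]
group_b = [9, 7, 5, 10, 6, 8]

u_a, u_b = mann_whitney_u(group_a, group_b)
print(f"U_a = {u_a}, U_b = {u_b}")
```

A U value close to 0 for one group (and close to n_x × n_y for the other) indicates that one group tends to have larger values.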
Student's t Test
  A t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution if the null hypothesis is supported
  It can be used to determine if two sets of data are significantly different from each other, and is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known
The Kaplan–Meier estimator, also known as the product limit estimator, is an estimator of the survival function from lifetime data.
In medical research, it is often used to measure the fraction of patients living for a certain amount of time after treatment.
The estimator is named after Edward L. Kaplan and Paul Meier.
A plot of the Kaplan–Meier estimate of the survival function is a series of horizontal steps of declining magnitude which, when a large enough sample is taken, approaches the true survival function for that population.
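The product-limit idea can be sketched as follows: at each time where an event occurs, the running survival estimate is multiplied by (1 − events / number at risk), and censored patients simply leave the risk set. This is a minimal illustration, not a full implementation (no confidence bands); the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk   # one downward step of the curve
            curve.append((t, survival))
        at_risk -= leaving                     # events and censored both leave the risk set
    return curve

# Hypothetical follow-up in months; 0 marks a censored patient
months = [2, 3, 3, 5, 6, 8, 9, 12]
status = [1, 1, 0, 1, 0, 1, 1, 0]

for t, s in kaplan_meier(months, status):
    print(f"month {t}: estimated survival {s:.3f}")
```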
Odds Ratio
  In a case-control study, a measure of the strength of the association between risk factor and outcome
                 Lung cancer (cases)    No lung cancer (controls)
  Smokers        33 (a)                 55 (b)
  Non-smokers    2 (c)                  27 (d)
  Total          35 (a + c)             82 (b + d)
  Odds ratio = ad/bc = (33 × 27)/(55 × 2) = 8.1
  i.e. smokers have 8.1 times the odds of developing lung cancer compared with non-smokers (an estimate of the relative risk when the disease is rare)
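The odds ratio calculation for this table takes only a few lines of Python. The approximate 95% confidence interval (Woolf's log method) is an addition for illustration and is not part of the slide.

```python
import math

# 2 x 2 table from the case-control example
a, b = 33, 55   # smokers: cases, controls
c, d = 2, 27    # non-smokers: cases, controls

odds_ratio = (a * d) / (b * c)   # ad / bc = 891 / 110

# Approximate 95% CI on the log scale (Woolf's method); an illustrative extra
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"OR = {odds_ratio:.1f}, approximate 95% CI {ci_low:.1f} to {ci_high:.1f}")
```

The wide interval reflects the small number of non-smoking cases (only 2), which is worth keeping in mind when reading the point estimate.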
Relative Risk
  Measure of risk in a cohort study
  RR = incidence of disease among exposed / incidence among non-exposed
  Cigarette smoking    Developed lung cancer    Did not develop lung cancer    Total
  Yes                  70 (a)                   6930 (b)                       7000 (a + b)
  No                   3 (c)                    2997 (d)                       3000 (c + d)
  Incidence among smokers = 70/7000 = 10/1000
  Incidence among non-smokers = 3/3000 = 1/1000
  Total incidence = 73/10000 = 7.3/1000
  Relative risk of lung cancer = 10/1 = 10
  The incidence of lung cancer is 10 times higher in the exposed group (smokers), i.e. there is a positive association with smoking
  The larger the RR, the greater the strength of the association
Attributable Risk
  It is the difference in incidence rates of disease between the exposed group (EG) and the non-exposed group (NEG)
  Often expressed as a percentage:
    AR = ((incidence rate in EG − incidence rate in NEG) / incidence rate in EG) × 100
    AR = ((10 − 1)/10) × 100 = 90%
  i.e. 90% of the lung cancers in smokers were due to their smoking
Population Attributable Risk
  It is (the incidence of the disease in the total population − the incidence of disease among those who were not exposed to the suspected causal factor) / incidence of disease in the total population
  PAR = ((7.3 − 1)/7.3) × 100 = 86.3%, i.e. 86.3% of the disease could be avoided if risk factors like cigarettes were avoided
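The relative risk, attributable risk and population attributable risk for the cohort example can be reproduced directly from the 2 × 2 counts. The variable names below are illustrative only.

```python
# Cohort table from the smoking and lung cancer example
a, b = 70, 6930    # smokers: developed / did not develop lung cancer
c, d = 3, 2997     # non-smokers: developed / did not develop lung cancer

inc_exposed = a / (a + b) * 1000                 # per 1000 smokers
inc_unexposed = c / (c + d) * 1000               # per 1000 non-smokers
inc_total = (a + c) / (a + b + c + d) * 1000     # per 1000 in the whole cohort

relative_risk = inc_exposed / inc_unexposed
attributable_risk_pct = (inc_exposed - inc_unexposed) / inc_exposed * 100
population_ar_pct = (inc_total - inc_unexposed) / inc_total * 100

print(f"RR = {relative_risk:.1f}")
print(f"AR = {attributable_risk_pct:.1f}%")
print(f"PAR = {population_ar_pct:.1f}%")
```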
Mortality Rates and Ratios
Crude Death Rate
  Number of deaths (from all causes) per 1000 estimated mid-year population (MYP) in one year in a given place
  CDR = (number of deaths during the year / MYP) × 1000
  E.g. CDR in Panchayath A is 15.2/1000 and in Panchayath B is 8.2/1000 population
  Health status of Panchayath B is better than A
Specific Death Rate
  Specific death rate = (number of deaths due to a specific disease during a calendar year / MYP) × 1000
  Death rates can be calculated separately by:
    Disease, e.g. TB and HIV: 2/1000 and 1/1000 respectively
    Age group, e.g. 5-20 yrs and <5 yrs: 1/1000 and 3/3000 respectively
    Sex, e.g. more in males
    Specific months, etc.
Case Fatality Rate (Ratio)
  (Total number of deaths due to a particular disease / total number of cases of the same disease) × 100
  Usually described in acute infectious diseases
    Dengue, cholera, food poisoning, etc.
  Represents the killing power of the disease
Proportional Mortality Rate (Ratio)
  Due to a specific disease = (number of deaths from the specific disease in a year / total deaths in the year) × 100
  Under-5 proportional mortality rate = (number of deaths under 5 years of age in a given year / total number of deaths during the same period) × 100
Survival Rate
  (Total number of patients alive after 5 yrs / total number of patients diagnosed or treated) × 100
  Method of describing prognosis in certain disease conditions, mainly cancers
  Can be used as a yardstick for assessment of standards of therapy
Incidence
  Number of new cases occurring in a defined population during a specified period of time
  (Number of new cases of a specific disease during a given time period / population at risk) × 1000
  E.g. 500 new cases of TB in a population of 30000: incidence is (500/30000) × 1000, i.e. 16.7/1000/yr, expressed as an incidence rate
Incidence: Uses
  Can be expressed as a special incidence rate, attack rate, hospital admission rate, case rate, etc.
  Measures the rate at which new cases are occurring in a population
  Not influenced by duration
  Generally, use is restricted to acute conditions
Prevalence
  Refers specifically to all current cases (old and new) existing at a given point of time, or over a period of time, in a given population
  Although referred to as a rate, it is really a ratio
  Two types: point prevalence and period prevalence
  Point prevalence = (number of all current cases (old and new) of a specified disease existing at a given point of time / estimated population at the same point of time) × 100
  Period prevalence = (number of existing cases (old and new) of a specified disease during a given period of time / estimated mid-interval population at risk) × 100
  Example (case numbers as labelled on the slide's figure):
    Incidence: cases 3, 4, 5, 8
    Point prevalence at Jan 1: cases 1, 2 and 7
    Point prevalence at Dec 31: cases 1, 3, 5 and 8
    Period prevalence (Jan-Dec): cases 1, 2, 3, 4, 5, 7 and 8
Relationship between Incidence and Prevalence
  Prevalence = incidence × mean duration
    P = I × D, I = P/D, D = P/I
  E.g. incidence = 10 cases/1000 population/yr and mean duration = 5 yrs
    Prevalence = 10 × 5 = 50/1000 population
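The rate arithmetic in the incidence and prevalence slides can be written as two small helper functions. The names are hypothetical, and the P = I × D relationship is assumed to apply to a stable population in which incidence and duration are not changing over time.

```python
def rate_per_1000(cases: float, population: float) -> float:
    """Cases per 1000 population."""
    return cases / population * 1000

def prevalence_from_incidence(incidence: float, mean_duration_years: float) -> float:
    """P = I x D, with incidence per 1000 per year and duration in years."""
    return incidence * mean_duration_years

# TB example: 500 new cases in a population of 30000 in one year
tb_incidence = rate_per_1000(500, 30000)
print(f"TB incidence = {tb_incidence:.1f} per 1000 per year")

# Incidence 10 per 1000 per year, mean duration 5 years
prevalence = prevalence_from_incidence(10, 5)
print(f"Prevalence = {prevalence} per 1000 population")
```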
Prevalence: Uses
  Helps to estimate the magnitude of health/disease problems in the community, and to identify potential high-risk populations
  Prevalence rates are especially useful for administrative and planning purposes, e.g. hospital beds, manpower needs, rehabilitation facilities, etc.
Statistical Significance
  P value (hypothesis)
  95% CI (interval)
P Value and Its Interpretation
  Often loosely described as "the probability of a Type I error"
  More precisely, it is the chance of seeing a difference or association at least as large as the one observed when actually there is none
  Example 1: study of the prevalence of obesity in male and female children in a classroom
    50 students
    Of 25 boys, 10 obese
    Of 25 girls, 16 obese
    P value: 0.02
    Null hypothesis: "no difference in obesity among boys and girls in the classroom"
  Example 2: study of bubble vs conventional CPAP for prevention of extubation failure (EF) in preterm very low birth weight infants
    EF with bCPAP = 4 (16)
    EF with cCPAP = 9 (16)
    P value: 0.14
    Null hypothesis: "no difference in EF among preterm babies treated with bCPAP and cCPAP"
95% CI
  95% CI = mean ± 1.96 SE, approximately mean ± 2 SE
  Example: 100 children attending pediatric OP
    Mean weight = 15 kg, SD = 2 kg
    95% CI = ?
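Working this example through, assuming the interval wanted is for the mean weight, so that the standard error SD/√n is the relevant spread:

```python
import math

n = 100          # children attending the pediatric OP
mean_wt = 15.0   # kg
sd = 2.0         # kg

se = sd / math.sqrt(n)           # 2 / 10 = 0.2 kg
lower = mean_wt - 1.96 * se
upper = mean_wt + 1.96 * se

print(f"SE = {se:.2f} kg")
print(f"95% CI = {lower:.2f} to {upper:.2f} kg")
```

So the 95% confidence interval for the mean weight is roughly 14.6 to 15.4 kg, far narrower than the spread of individual weights (mean ± 2 SD would be 11 to 19 kg).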
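Interpretation of 95% CI
  If a study is repeated 100 times, about 95 of the intervals calculated in this way would contain the true population mean
  If the CIs of 2 groups do not overlap, the difference is likely to be significant; if they overlap, a significant difference is less likely (though not ruled out)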
Measures of Risk
  Case-control study: odds ratio
  Cohort study: RR, AR
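Chi-Squared (χ²) Test
  Chi-squared (χ²) formula: $\chi^2 = \sum (O - E)^2 / E$, where O is the observed and E the expected count in each cell
  Not applicable in small samples
  If fewer than 5 expected observations per cell, use Fisher's exact test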
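A minimal sketch of Pearson's chi-squared test for a 2 × 2 table, assuming no continuity correction and using the fact that with 1 degree of freedom the P value equals erfc(√(χ²/2)). The function name and the treatment counts are hypothetical and not taken from the slides.

```python
import math

def chi_squared_2x2(table):
    """Pearson chi-squared statistic and P value (1 df) for a 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper tail of the chi-squared distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical trial: rows are treatments, columns are improved / not improved
table = [[30, 20],
         [18, 32]]

chi2, p = chi_squared_2x2(table)
print(f"chi-squared = {chi2:.2f}, P = {p:.3f}")
```

When any expected count falls below 5, this approximation becomes unreliable and Fisher's exact test is the usual alternative.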
Correlation
  Assesses the linear relationship between two variables
    Example: height and weight
  Strength of the association is described by a correlation coefficient, r
    r = 0 to 0.2: low, probably meaningless
    r = 0.2 to 0.4: low, possible importance
    r = 0.4 to 0.6: moderate correlation
    r = 0.6 to 0.8: high correlation
    r = 0.8 to 1: very high correlation
  Can be positive or negative
  Pearson's, Spearman correlation coefficient
  Tells nothing about causation
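Pearson's correlation coefficient can be computed from the sums of squares and cross-products, as sketched below. The function name and the height and weight values are hypothetical illustrations.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights (cm) and weights (kg)
heights_cm = [150, 155, 160, 165, 170, 175]
weights_kg = [48, 53, 55, 62, 64, 70]

r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.3f}")
```

On the scale above, an r this close to 1 would be read as a very high positive correlation, though it still says nothing about causation.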
[Figure: Correlation. Source: Harris and Taylor, Medical Statistics Made Easy]
[Figure: Correlation, Perfect Correlation. Source: Altman, Practical Statistics for Medical Research]
Regression
  Based on fitting a line to data
    Provides a regression coefficient, which is the slope of the line: Y = aX + b
  Used to predict a dependent variable's value based on the value of an independent variable
    Very helpful: in an analysis of height and weight, for a known height, one can predict weight
  Much more useful than correlation
    Allows prediction of values of Y rather than just whether there is a relationship between two variables
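A minimal least-squares sketch of fitting Y = aX + b and then predicting weight from a known height. The data and the function name `fit_line` are hypothetical; ordinary least squares is assumed as the fitting method.

```python
def fit_line(x, y):
    """Ordinary least-squares slope a and intercept b for Y = aX + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    a = sxy / sxx                 # regression coefficient (slope)
    b = mean_y - a * mean_x       # intercept
    return a, b

# Hypothetical heights (cm) and weights (kg)
heights_cm = [150, 155, 160, 165, 170, 175]
weights_kg = [48, 53, 55, 62, 64, 70]

a, b = fit_line(heights_cm, weights_kg)
predicted = a * 168 + b           # predicted weight for a height of 168 cm
print(f"weight = {a:.3f} x height + ({b:.2f})")
print(f"predicted weight at 168 cm: {predicted:.1f} kg")
```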
Regression: Types of Regression
  Linear: uses continuous data to predict a continuous outcome
  Logistic: uses continuous data to predict the probability of a dichotomous outcome
  Poisson regression: counts (rates) of rare events
  Cox proportional hazards regression: survival analysis
Multiple Regression Models
  Determining the association between two variables while controlling for the values of others
  Example: uterine fibroids
    Both age and race impact the incidence of fibroids
    Multiple regression allows one to test the impact of age on the incidence while controlling for race (and all other factors)
  In published papers, the multivariable models are more powerful than univariable models and take precedence
    Therefore we discount the univariable model, as it does not control for confounding variables
  E.g. coronary disease is potentially affected by age, HTN, smoking status, gender and many other factors
    If assessing whether height is a factor: if it is significant on univariable analysis, but not on multivariable analysis, these other factors confounded the analysis
Survival Analysis
  Evaluation of time to an event (death, recurrence, recovery)
  Provides a means of handling censored data
    Patients who do not reach the event by the end of the study or who are lost to follow-up
  Most common type is Kaplan-Meier analysis
    Curves are presented as stepwise changes from baseline
    There are no fixed intervals of follow-up: the survival proportion is recalculated after each event
[Figure: Survival Analysis. Source: Altman, Practical Statistics for Medical Research]
[Figure: Kaplan-Meier Curve. Source: Wikipedia]
Kaplan-Meier Analysis
  Provides a graphical means of comparing the outcomes of two groups that vary by intervention or other factor
  Survival rates can be measured directly from the curve
  The difference between curves can be tested for statistical significance
Cox Regression Model
  Proportional hazards survival model
  Used to investigate the relationship between an event (death, recurrence) occurring over time and possible explanatory factors
  Reported result: hazard ratio (HR)
    Ratio of the hazard in one group divided by the hazard in another
    Interpreted in the same way as risk ratios and odds ratios
    HR = 1: no effect
    HR > 1: increased risk
    HR < 1: decreased risk
  Common use is in long-term studies where various factors might predispose to an event
    Example: after uterine embolization, which factors (age, race, uterine size, etc.) might make recurrence more likely
True disease state vs. Test result
                        Null hypothesis not rejected (test −)    Null hypothesis rejected (test +)
  No disease (D = 0)    Correct; specificity                     Type I error (false +)
  Disease (D = 1)       Type II error (false −)                  Correct; power (1 − β); sensitivity
Specific Example
  [Figure: distributions of test results for patients with the disease and patients without the disease]
Threshold
  [Figure: a threshold on the test result separates patients called "negative" from patients called "positive"]
Some definitions ...
  True positives: patients with the disease who are called "positive"
  False positives: patients without the disease who are called "positive"
  True negatives: patients without the disease who are called "negative"
  False negatives: patients with the disease who are called "negative"
Moving the Threshold
  [Figures: the same distributions with the threshold moved to the right and to the left]
ROC Curve
  [Figure: true positive rate (sensitivity) on the vertical axis against false positive rate (1 − specificity) on the horizontal axis, each from 0% to 100%]
ROC Curve Comparison
  [Figure: ROC curves for a good test and a poor test]
ROC Curve Extremes
  Best test: the distributions don't overlap at all
  Worst test: the distributions overlap completely
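The threshold idea behind the ROC curve can be sketched in Python: for each candidate threshold, count true and false positives, convert them to sensitivity and 1 − specificity, and join the points. The function names and test values are hypothetical, and a higher test result is assumed to point towards disease.

```python
def confusion_counts(diseased, healthy, threshold):
    """Call a patient 'positive' when the test result is >= threshold."""
    tp = sum(v >= threshold for v in diseased)   # true positives
    fn = len(diseased) - tp                      # false negatives
    fp = sum(v >= threshold for v in healthy)    # false positives
    tn = len(healthy) - fp                       # true negatives
    return tp, fp, tn, fn

def roc_points(diseased, healthy):
    """(false positive rate, true positive rate) as the threshold moves left."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(diseased) | set(healthy), reverse=True):
        tp, fp, tn, fn = confusion_counts(diseased, healthy, threshold)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test results with partial overlap between the groups
with_disease = [4, 5, 6, 7, 8]
without_disease = [2, 3, 4, 5, 6]

tp, fp, tn, fn = confusion_counts(with_disease, without_disease, threshold=5)
print(f"sensitivity = {tp / (tp + fn):.2f}, specificity = {tn / (tn + fp):.2f}")
print(f"AUC = {auc(roc_points(with_disease, without_disease)):.2f}")
```

With no overlap between the two groups the area under the curve is 1, and with complete overlap it is 0.5, matching the two extremes described above.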
Forest Plot
  [Figure: an example forest plot of five odds ratios (squares) with the summary measure (centre line of diamond) and associated confidence intervals (lateral tips of diamond), and solid vertical line of no effect. Names of (fictional) studies are shown on the left, odds ratios and confidence intervals on the right.]
  A forest plot (or blobbogram) is a graphical display designed to illustrate the relative strength of treatment effects in multiple quantitative scientific studies addressing the same question. It was developed for use in medical research as a means of graphically representing a meta-analysis of the results of randomized controlled trials.
i. Probably a small study, with a wide CI, crossing the line of no effect (OR = 1). Unable to say if the intervention works
ii. Probably a small study, wide CI , but does not cross OR = 1; suggests intervention works but weak evidence
iii. Larger study, narrow CI: but crosses OR = 1; no evidence that intervention works
iv. Large study, narrow confidence intervals: entirely to left of OR = 1; suggests intervention works
v. Small study, wide confidence intervals, suggests intervention is detrimental
vi. Meta-analysis of all identified studies: suggests intervention works.
PICOT
  Used to frame questions in evidence-based research
    Population
    Intervention or issue
    Comparison with another intervention
    Outcome
    Time frame