Statistics five

Statistics “Five” Mohamed Ahmed Hefny, MD.

Transcript of Statistics five

Page 1: Statistics five


Page 3: Statistics five

Designing a study

Page 4: Statistics five

TARGET POPULATION: All low-birth weight babies born in the --- in 2007.

STUDY POPULATION: All low-birth weight babies born in three maternity units in ---- in 2007.

SAMPLE: The last 300 babies born in these three maternity units.

Figure: The target population, the study population and the sample

Page 5: Statistics five

Samples and populations

• In clinical research, we usually study a sample of individuals who are assumed to be representative of a wider group, to whom the research might apply. This wider group is known as the target population.

• It would be impossible to study every single baby in such a large target population (or every member of any population). So instead, we might choose to take a sample from a more accessible group. This more restricted group is the study population.

• Suppose we take as our sample the last 300 babies born in these three maternity units. We hope that what we find in this sample will also be true of the study population, and ultimately of the target population.

Page 6: Statistics five

Figure labels: Population; Target Population; Study Sample

Page 7: Statistics five

Sampling error

• Samples are never perfect replicas of their populations, so when we draw a conclusion about a population based on a sample, there will always be what is known as sampling error.

• For example, if the percentage of women in the population with recurrent abortion is 3.50% and a sample produces a sample percentage of 2.90%, then the difference between these two values, 0.60 percentage points, is the sampling error.

• We can never completely eliminate sampling error, since this is an inherent feature of any sample.
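The arithmetic in the example above can be written as a short Python sketch. The function name `sampling_error` is illustrative only; the two percentages are the ones quoted on this slide.

```python
def sampling_error(population_pct: float, sample_pct: float) -> float:
    """Return the population percentage minus the sample percentage,
    expressed in percentage points."""
    return population_pct - sample_pct


# Values taken from the slide: 3.50% in the population, 2.90% in the sample.
error = sampling_error(3.50, 2.90)
print(f"Sampling error: {error:.2f} percentage points")  # prints 0.60
```

In practice the population percentage is usually unknown, so the sampling error can only be estimated, not computed directly like this.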

Page 8: Statistics five

Collecting the data – types of sample

Simple random sample and its offspring

• Any sample should be representative of the population from which it is taken. For example, if your population has equal numbers of male and female babies, but your sample contains twice as many male babies as female, then any conclusions you draw are likely to be misleading at best.

• Generally, the most representative sample is a simple random sample. The only way that a simple random sample will differ from the population will be due to chance alone.

• For a sample to be truly random, every member of the population must have an equal chance of being included in the sample. Unfortunately, this is rarely possible in practice, since it would require a complete and up-to-date list (names and contact details) of every member of the population. Such a list is called a sampling frame.
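As a minimal sketch of drawing a simple random sample from a sampling frame, the Python below uses the standard library's `random.sample`, which gives every member the same chance of selection. The frame of 2,000 identifiers, the sample size and the seed are all made up for illustration.

```python
import random


def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n distinct members from the frame, each equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(sampling_frame, n)


# Hypothetical sampling frame: one identifier per baby.
frame = [f"baby_{i:04d}" for i in range(1, 2001)]
sample = simple_random_sample(frame, 300, seed=42)
print(len(sample), "babies selected")
```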

Page 9: Statistics five

The two close relatives of simple random sampling are:

Systematic random sampling: A fixed fraction of the sampling frame is selected, say every 10th or every 50th member, until a sample of the required size is obtained. Provided there are no hidden patterns in the sampling frame, this method will produce samples as representative as a simple random sample.
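A possible sketch of systematic sampling in Python follows. Choosing a random starting position within the first `k` members is an assumption added here (the slide does not say where to start); the frame, interval and seed are illustrative.

```python
import random


def systematic_sample(sampling_frame, k, n=None, seed=None):
    """Select every k-th member of the frame, optionally stopping at n members."""
    rng = random.Random(seed)
    start = rng.randrange(k)            # assumed: random start in the first k positions
    selected = sampling_frame[start::k]  # every k-th member from that start
    return selected if n is None else selected[:n]


# Hypothetical frame of 1,000 numbered members, sampling every 10th.
frame = list(range(1, 1001))
sample = systematic_sample(frame, k=10, seed=1)
print(len(sample), "members selected")
```

If the frame is ordered in a way that repeats every `k` members, the selected members would all share that pattern, which is the "hidden pattern" risk mentioned above.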

Page 10: Statistics five

Stratified random sampling: The sampling frame is first broken down into strata relevant to the study, for example men and women; or nonsmokers, ex-smokers and smokers. Then each separate stratum is sampled using a systematic sampling approach, and finally these strata samples are combined. Both methods, however, require a sampling frame.
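The three steps (split into strata, sample each stratum systematically, combine) might be sketched in Python as below. The smoking-status frame is invented, and starting each stratum at its first member is a simplification made here to keep the example deterministic.

```python
from collections import defaultdict


def stratified_sample(sampling_frame, stratum_of, k):
    """Split the frame into strata, take every k-th member of each, and combine."""
    strata = defaultdict(list)
    for member in sampling_frame:
        strata[stratum_of(member)].append(member)

    combined = []
    for members in strata.values():
        combined.extend(members[::k])  # systematic selection within the stratum
    return combined


# Hypothetical frame: 600 nonsmokers, 300 ex-smokers, 100 smokers.
statuses = ["nonsmoker"] * 600 + ["ex-smoker"] * 300 + ["smoker"] * 100
frame = [{"id": i, "smoking": s} for i, s in enumerate(statuses)]

sample = stratified_sample(frame, lambda m: m["smoking"], k=10)
print(len(sample), "members selected")
```

Because the same fraction is taken from every stratum, the sample keeps the 6:3:1 proportions of the frame.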

Page 11: Statistics five

Contact or consecutive samples

The need for an accurate sampling frame makes random sampling impractical in most realistic clinical settings. One common alternative is to take as a sample individuals in current or recent contact with the clinical services, such as consecutive attendees at a clinic.

Alternatively, researchers may study a group of subjects in situ, for example on a ward, or in some other setting.

Page 12: Statistics five

Types of study

Page 13: Statistics five

All studies

• Descriptive (PO): Qualitative; Survey (Cross Sectional)

• Analytic (PICO – PECO)

  • Experimental: Randomized Parallel group; Randomized Crossover

  • Observational: Cohort; Cross Sectional; Case Control

Key: Population (P), Outcomes (O), Interventions (I), Exposures (E)

Page 14: Statistics five

• A non-analytic or descriptive study does not try to quantify a relationship; instead it gives a picture of what is happening in a population, for example the prevalence, incidence, or experience of a group.

• Descriptive studies include case reports, case series, qualitative studies and surveys (cross-sectional studies), which measure the frequency of several factors, and hence the size of the problem.

Page 15: Statistics five

• An analytic study attempts to quantify the relationship between two factors, that is, the effect of an intervention (I) or exposure (E) on an outcome (O).

• To quantify the effect we will need to know the rate of outcomes in a comparison (C) group as well as the intervention or exposed group.

• Whether the researcher actively changes a factor or imposes an intervention determines whether the study is considered observational (passive involvement of the researcher) or experimental (active involvement of the researcher).
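To illustrate why the comparison group's outcome rate is needed, here is a small Python sketch. Risk difference and relative risk are two common effect measures chosen for this example; the slides do not name a specific measure, and the counts are hypothetical.

```python
def risk(events: int, total: int) -> float:
    """Proportion of a group that experienced the outcome."""
    return events / total


def compare_groups(exposed_events, exposed_total, comparison_events, comparison_total):
    """Quantify the effect of an exposure relative to a comparison group."""
    risk_exposed = risk(exposed_events, exposed_total)
    risk_comparison = risk(comparison_events, comparison_total)
    return {
        "risk_exposed": risk_exposed,
        "risk_comparison": risk_comparison,
        "risk_difference": risk_exposed - risk_comparison,
        # Undefined (ZeroDivisionError) if nobody in the comparison group has the outcome.
        "relative_risk": risk_exposed / risk_comparison,
    }


# Hypothetical counts: 30 of 200 exposed vs 15 of 200 comparison subjects had the outcome.
effect = compare_groups(30, 200, 15, 200)
print(effect)
```

Without the comparison group's rate, the exposed group's rate alone says nothing about the size of the effect.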

Page 16: Statistics five

• In experimental studies, the researcher manipulates the exposure; that is, he or she allocates subjects to the intervention or exposure group.

• In experimental studies, or randomized controlled trials (RCTs), subjects are allocated to two or more groups to receive an intervention or exposure and are then followed up under carefully controlled conditions.

• Controlled trials, particularly if randomized and blinded, have the potential to control for most of the biases that can occur in scientific studies, but whether this actually occurs depends on the quality of the study design and implementation.

Page 17: Statistics five

• In analytic observational studies, the researcher simply measures the exposure or treatments of the groups.

• Analytical observational studies include case-control studies, cohort studies and some population (cross-sectional) studies.

• These studies all include matched groups of subjects and assess associations between exposures and outcomes.

Page 18: Statistics five

• Observational studies investigate and record exposures (such as interventions or risk factors) and observe outcomes (such as disease) as they occur.

• Such studies may be purely descriptive or more analytical.

Page 19: Statistics five

Spotting the Study Design

Page 20: Statistics five

Q1. What was the aim of the study?

• Describe a population (PO questions) ……. Descriptive.

• Quantify the relationship between factors (PICO questions) ……. Analytic.

Page 21: Statistics five

Q2. If analytic, was the intervention randomly allocated?

1. Yes? ……. RCT

2. No? ……. Observational study

Page 22: Statistics five

Q3. When were the outcomes determined?

• Some time after the exposure or intervention? ……. Cohort study (prospective study)

• At the same time as the exposure or intervention? ……. Cross-sectional study or survey

• Before the exposure was determined? ……. Case-control study (retrospective study)
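The three questions can be read as a small decision procedure. The Python sketch below encodes them directly; the function name, argument names and the string values used for each answer are illustrative choices, not part of the slides.

```python
def spot_study_design(aim, randomly_allocated=None, outcome_timing=None):
    """Classify a study using Q1 (aim), Q2 (random allocation) and Q3 (outcome timing)."""
    # Q1: describe a population, or quantify a relationship?
    if aim == "describe":
        return "descriptive study"
    if aim != "quantify":
        raise ValueError("aim must be 'describe' or 'quantify'")

    # Q2: was the intervention randomly allocated?
    if randomly_allocated:
        return "randomized controlled trial"

    # Q3: when were the outcomes determined, relative to the exposure?
    designs = {
        "after": "cohort study",
        "same time": "cross-sectional study",
        "before": "case-control study",
    }
    if outcome_timing not in designs:
        raise ValueError("outcome_timing must be 'after', 'same time' or 'before'")
    return designs[outcome_timing]


print(spot_study_design("quantify", randomly_allocated=False, outcome_timing="before"))
```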

Page 23: Statistics five

Advantages and Disadvantages of the Designs

Page 25: Statistics five

Randomized Controlled Trial

Experimental comparison study in which participants are allocated to treatment/intervention or control/placebo groups using randomization. Best for studying the effect of an intervention.

Advantages:
• Unbiased distribution of confounders;
• Blinding more likely;
• Randomization facilitates statistical analysis.

Disadvantages:
• Expensive: time and money;
• Volunteer bias;
• Ethically problematic at times.
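One simple way to allocate participants by randomization is to shuffle the list and split it in half, as in the Python sketch below. Equal group sizes, the participant labels and the seed are assumptions for illustration; real trials often use more elaborate schemes such as block or stratified randomization.

```python
import random


def allocate_randomly(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical list of 20 enrolled participants.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = allocate_randomly(participants, seed=7)
print(len(groups["treatment"]), "treated,", len(groups["control"]), "controls")
```

Because chance alone decides each participant's group, known and unknown confounders tend to be balanced between the groups.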

Page 28: Statistics five

Crossover Design

A controlled trial in which each study participant receives both therapies; e.g., a participant is randomized to treatment A first and then, at the crossover point, starts treatment B. Only relevant if the outcome is reversible with time, e.g., symptoms.

Advantages:
• All subjects serve as their own controls and error variance is reduced, thus reducing the sample size needed;
• All subjects receive treatment (at least some of the time);
• Statistical tests assuming randomization can be used;
• Blinding can be maintained.

Disadvantages:
• All subjects receive placebo or alternative treatment at some point;
• Washout period lengthy or unknown;
• Cannot be used for treatments with permanent effects.
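A minimal sketch of the crossover idea in Python: each participant is randomly given one of the two possible treatment orders, so everyone receives both A and B. The function name, participant labels and seed are hypothetical, and the washout period between treatments is not modelled.

```python
import random


def assign_sequences(participants, seed=None):
    """Randomly assign each participant the order A-then-B or B-then-A."""
    rng = random.Random(seed)
    orders = [("A", "B"), ("B", "A")]
    return {p: rng.choice(orders) for p in participants}


# Hypothetical list of 12 participants.
participants = [f"P{i:02d}" for i in range(1, 13)]
sequences = assign_sequences(participants, seed=3)
for participant, (first, second) in sequences.items():
    print(participant, "receives", first, "then", second)
```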

Page 30: Statistics five

Cohort Study

Data are obtained from groups who have been exposed, or not exposed, to the new technology or factor of interest (e.g. from databases). No allocation of exposure is made by the researcher. Best for studying the effect of predictive risk factors on an outcome.

Advantages:
• Ethically safe;
• Subjects can be matched;
• Can establish timing and directionality of events;
• Eligibility criteria and outcome assessments can be standardized;
• Administratively easier and cheaper than RCT.

Disadvantages:
• Controls may be difficult to identify;
• Exposure may be linked to a hidden confounder;
• Blinding is difficult;
• Randomization not present;
• For rare disease, large sample sizes or long follow-up necessary.

Page 32: Statistics five

Case-Control Studies

Patients with a certain outcome or disease and an appropriate group of controls without the outcome or disease are selected (usually with careful consideration of appropriate choice of controls, matching, etc.) and then information is obtained on whether the subjects have been exposed to the factor under investigation.

Advantages:
• Quick and cheap;
• Only feasible method for very rare disorders or those with long lag between exposure and outcome;
• Fewer subjects needed than cross-sectional studies.

Disadvantages:
• Reliance on recall or records to determine exposure status;
• Confounders;
• Selection of control groups is difficult;
• Potential bias: recall, selection.

Page 33: Statistics five

Cross-Sectional Survey

A study that examines the relationship between diseases (or other health-related characteristics) and other variables of interest as they exist in a defined population at one particular time (i.e. exposure and outcomes are both measured at the same time). Best for quantifying the prevalence of a disease or risk factor, and for quantifying the accuracy of a diagnostic test.

Advantages:
• Cheap and simple;
• Ethically safe.

Disadvantages:
• Establishes association at most, not causality;
• Recall bias susceptibility;
• Confounders may be unequally distributed;
• Neyman bias;
• Group sizes may be unequal.
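The two quantities a cross-sectional survey is said to be best for can be sketched in Python as follows. Sensitivity and specificity are used here as common measures of diagnostic accuracy; the slides do not name particular measures, and all counts are invented.

```python
def prevalence(cases: int, population: int) -> float:
    """Proportion of the surveyed population that has the disease at the survey date."""
    return cases / population


def diagnostic_accuracy(true_pos: int, false_pos: int, false_neg: int, true_neg: int):
    """Sensitivity and specificity of a test against a reference diagnosis."""
    return {
        "sensitivity": true_pos / (true_pos + false_neg),  # diseased correctly detected
        "specificity": true_neg / (true_neg + false_pos),  # healthy correctly cleared
    }


# Hypothetical survey of 1,000 people, 50 of whom have the disease.
print(f"Prevalence: {prevalence(50, 1000):.1%}")
accuracy = diagnostic_accuracy(true_pos=45, false_pos=30, false_neg=5, true_neg=920)
print(accuracy)
```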


Page 35: Statistics five

Thank You