Basic statistical tests Hypothesis testing

Aya GotoCenter for Integrated Science and Humanities

Fukushima Medical University

Course VII‐2, 2017

Basic statistical testsHypothesis testing

What you can learn in this session

Choosing an appropriate testWays of tabulationAnalyzing using OpenEpiUnderstanding hypothesis testingInterpreting results from descriptive analysis

Data type Parametric Non‐parametricContingency table

Large sampleChi‐square testSmall sampleFisher’s exact test

Comparison of means(2 groups, independent) T‐test Mann‐Whitney U test

(2 groups, paired) Paired t‐test Wilcoxon signed rank test

(>3 groups, independent) ANOVA Kruskal‐Wallis test

Association of twocontinuous variables

(Correlation) Pearson’s correlation Spearman’s correlation

(Regression) Linear regression Median regression

E + E ‐D +D ‐

Basic statistical tests

Analysis of contingency tableRelationship of residential region and hypertension

City A City BHP positive 20 80

HP negative 40 60

Comparison of meansRelationship of residential region and blood pressure

City A City B

max BP (mean) 160 140

PairedBefore‐after studyMatched case‐control

Un‐paired (independent)

Before treatmentN=100

After treatmentN=100


PlaceboN=100

Drug AN=100


Correlation

Regression

Association between a frequency of using learned skills and confidence.

Non‐parametric

Parametric

Data type: Categorical

Sample size: Small (<30)

Distribution (graph): With outliers

Data type: Continuous

Sample size: Large

Distribution (graph): Bell shape

Group A Group B120 10580 13090 145

110 12595 115

Mean Mean

Group A Group B Rank A Rank B120 105 7 480 130 1 990 145 2 10

110 125 5 895 115 3 6

Group A Group B A‐B120 105 1580 130 ‐5090 145 ‐55

110 125 ‐1595 115 ‐20

0 Mean

Group A Group B A‐B Rank A‐B120 105 15 180 130 ‐50 ‐490 145 ‐55 ‐5

110 125 ‐15 ‐295 115 ‐20 ‐3

Paired

Independent

PARA NON‐PARAt‐test

Paired t‐test

U‐test

Signed rank‐sum test

Tabulation

tabulate sm outcome, row chi

| outcomesm | alive dead | Total

---------------+----------------------+----------current smoker | 61 85 | 146

| 41.78 58.22 | 100.00 ---------------+----------------------+----------

ex-smoker | 74 67 | 141 | 52.48 47.52 | 100.00

---------------+----------------------+----------non smoker | 268 136 | 404

| 66.34 33.66 | 100.00 ---------------+----------------------+----------

Total | 403 288 | 691 | 58.32 41.68 | 100.00

Pearson chi2(2) = 29.0882 Pr = 0.000

You can not say:

Mortality is significantly higher for current smoker.

10 year follow‐up study of health behavior and mortality

Mortality is significantly different among three groups.

Simplifying a big tableCategorical data

1. Descriptive analysis only

2. Re‐categorize into major categories

3. Re‐categorize into one item of interest and others

Continuous data

1. Descriptive analysis only

2. Re‐categorize into two by using

1) a standard cut‐off value

2) mean or median or

quantile value

Contraceptive STD + STD ‐Methods

CondomOCIUD...

Maternal confidence andTwo‐item depression score (0‐2; >1 = depression tendency)

Useful when…1. You want to calculate 95%CI of a proportion.2. You have a filled contingency table and want to

perform a statistical test.3. You know mean (SD) of your data and want to

perform a statistical test.4. You want to calculate a sample size.

http://www.openepi.com

Quick analysis using OpenEpi

Proportion

Two by Two

T test

If this p value is 0.05 or higher, select the equal variance p value. If under 0.05, select the unequal variance p value.

Sample size:Cohort/RCT

(Comparing %)

Sample size:Mean Difference

Exercise dataMean (SD) or N(%)

p‐value*City AN=200

City BN=200

Systolic blood pressure 123 (20) 120 (25)HypertensionYesNo

20 (10%)180 (90%)

12 (6%)188 (94%)

* T‐test or Chi‐square test was used.

Assignments1. Calculate 95% confidence interval of a prevalence of

hypertension in each city. “Proportion”2. Select and perform an appropriate statistical test for

each item (BP and HT). “Two by Two” or “t test”3. The sample data is from a pilot test. Calculate a

sample size for the main survey. “Cohort/RCT” or “Mean Difference”

4. Try ADVANCE exercise

Is new Drug A more effective than Drug B? Is there a difference in the effect of Drug A and Drug B?

Probability that the null hypothesis (effect of A = effect of B) is true

Hypothesis testing

You check the difference by rejecting (p<0.05)the hypothesis that two are the same.

NOTE. Even if p value is higher than 0.05, it does not mean that the null hypothesis is true. P is just a probability.

Truth ‐ Different Truth ‐ SameSurvey resultDifferent P value

Survey resultNot different

Simple understandingYou don’t want to make a mistake by saying that the new drug is effective when it is not.You want to make the probability of the mistake to be small.Low p value means less chance of making the mistake and you are more confident that there is a real difference.

Truth Null hypothesis is NOT true

TruthNull hypothesis

is trueReject the null hypothesis

Power

Type I errorFalse positive

AlphaP value

NOT reject the null hypothesis

Type II errorFalse negative

Beta

ADVANCED statistical understanding

Health literacy promotion in Fukushima after the nuclear accident:

A case of responding to health care professionals’ needs through the development of a health literacy toolkit

IAEA ‐ Hiroshima University Consultancy MeetingScience, Technology and Society Perspectives on Nuclear Science, Radiation and Human Health – The International Perspective

Aya Goto Center for Integrated Science and Humanities

Fukushima Medical University

Fukushima nuclear accident

50 miles: US Recommended Evacuation Zone

Fukushima City15% decline in under 5‐yo pop.in 2 years

Depression and decline in maternal confidence among Fukushima mothersBMC Psychiatry. 2015; 15: 59.J Commun Healthc. 2014; 7: 106‐116.

IAEA

Fear of unknown health effects of radiation contamination due to confusing and often contradicting health risk messages with difficult scientific data

Picture: Leaflets about radiation placed in the lobby of a health center in Fukushima City.

Community health workers

Fukushima Nuclear Accident Independent Investigation Commission

“Information for residents to make informed decisions”

Public health nurses (gate keepers of community health)

Nursery school teachers(key players of maternal and child health)

How do we respond to parents’ concerns？

Government

“The cognitive and social skills which determine the motivation and ability of individuals to gain access to understand and use information in ways which promote and maintain good health” WHO, 1998

SLIDE 32

Health literacy

http://www.hsph.harvard.edu/healthliteracy/overview/

Goto A, et al. Japan Medical Association Journal. 2014; 57: 146‐53. Rudd RE. Assessing health materials: Eliminating barriers – increasing

access. 2010. http://www.hsph.harvard.edu/healthliteracy/

Health literacy training

Sentences: Grade level, topic sentenceNumbers: Numeracy level

RISK is one of the most difficult statistical concepts.(Apter AJ, et al. J Gen Intern Med. 2008;23(12):2117‐24.)

Graphs: Pictogram

Communication: Marker method(Method to ask readers to mark difficult words and phrases.)

SLIDE 34

Training content

Workshop evaluation surveys among participants65 nurses and 45 teachers who attended workshops in 2013‐2014At the end of each session, 1 month (nurses only) and 1 year after the second session. Evaluation itemsApplication, confidence gain and interest in further training.12 specific training goals: 4 items each on knowledge, material assessment and development Opinions on applications and barriers of learned skills in daily practices

Japan Medical Association Journal. 2015; 58: 1‐9.Journal of Seizon and Life Sciences. 2017; 27: 192‐207.

SLIDE 35

Training evaluation

TOTAL(N=57)

Nurses(N=31)

Teachers(N=26)

I applied learned skills in practice. 61% 68% 47%

I gained confidence in assessing in revising written materials.

27% 32% 20%

I want to attend further training. 68% 81% 54%

Selected knowledge itemsI can explain health literacy needs.I can explain numeracy levels.

42%9%

65%12%

15%4%

Selected assessment itemsI can use the Marker Method 47% 61% 29%Selected development itemsI can write easy‐to‐read text.I can explain risk.

44%14%

52%16%

35%12%

Achievements toward training objectives

Nurses and teachersNon‐users(N=22)

Users(N=35) P value

I gained confidence in assessing and revising written materials

32% 45% ( )

I want to attend further training. 41% 86% ( )

Application and confidence

A five‐ping Likert‐scale ranging from highly disagree (1) to highly agree (5) was used. Those who answered 4 and 5 to the item “I applied learned skills in practice” was classified as users. Chi‐square test was used.

Assignments1. First table: Interpret the results2. Second table: Calculate p values and

interpret the results

Applications during the follow‐up

Proportion of sentences that include words in each category is shown. Number of sentences (denominator) was 28 for both.

“Even among staff, we started circulating documents and getting signatures in addition to oral communication.” (Nursery school teacher)

Health literacy

Goals

Team workCommunity

Annals of the ICRPGoto A. Thinking, talking, and workingwith professional community workersafter the Fukushima nuclear accident.

Health literacyand

health system

Basic statistical tests Hypothesis testing

Documents

Transcript of Basic statistical tests Hypothesis testing