RESEARCH
IN
BIOSTAT
Submitted by:
Alonzo, Jessa Marie
Balbin, Carmina
Magalona, Norie Rose
Magsalin, Alexander Hubert
Natividad, Leslie Ann
Valdez, Darrel Jan
Submitted to:
Mr. Joselito Roque
(Professor)
1. Test of Hypothesis
A. Differentiate:
i. Null and Alternative Hypothesis
ii. One-tailed Test and Two-tailed Test
B. What is the Level of Significance / Critical Value?
C. Test of Significance
2. Differentiate Parametric and Non-parametric Test
3. Define and determine when it is appropriate to use:
- Z-test
- T-test
- Correlation and Regression
- Analysis of Variance
- Chi-square Test
Illustrate examples of each.
4. Types of Non-parametric Test - Valdez, Darrel Jan
Magalona, Norie Rose
Magsalin, Alexander Hubert
Balbin, Carmina
Natividad, Leslie Ann
Alonzo, Jessa Marie
1) Test of Hypothesis
A. Differentiate:
i. Null and Alternative Hypothesis
Null hypothesis
The null hypothesis is a hypothesis about a population parameter. The purpose of
hypothesis testing is to test the viability of the null hypothesis in the light of experimental
data. Depending on the data, the null hypothesis either will or will not be rejected as a
viable possibility.
Consider a researcher interested in whether the time to respond to a tone is affected by
the consumption of alcohol. The null hypothesis is that μ1 - μ2 = 0, where μ1 is the mean
time to respond after consuming alcohol and μ2 is the mean time to respond otherwise.
Thus, the null hypothesis concerns the parameter μ1 - μ2, and the null hypothesis is that the
parameter equals zero.
The null hypothesis is often the reverse of what the experimenter actually believes; it is
put forward to allow the data to contradict it. In the experiment on the effect of alcohol,
the experimenter probably expects alcohol to have a harmful effect. If the experimental
data show a sufficiently large effect of alcohol, then the null hypothesis that alcohol has
no effect can be rejected.
It should be stressed that researchers very frequently put forward a null hypothesis in the
hope that they can discredit it. For a second example, consider an educational researcher
who designed a new way to teach a particular concept in science, and wanted to test
experimentally whether this new method worked better than the existing method. The
researcher would design an experiment comparing the two methods. Since the null
hypothesis would be that there is no difference between the two methods, the researcher
would be hoping to reject the null hypothesis and conclude that the method he or she
developed is the better of the two.
The symbol H0 is used to indicate the null hypothesis. For the example just given, the null
hypothesis would be designated by the following symbols:
H0: μ1 - μ2 = 0
or by
H0: μ1 = μ2.
The null hypothesis is typically a hypothesis of no difference as in this example where it
is the hypothesis of no difference between population means. That is why the word "null"
in "null hypothesis" is used -- it is the hypothesis of no difference.
Despite the "null" in "null hypothesis," there are occasions when the parameter is not
hypothesized to be 0. For instance, it is possible for the null hypothesis to be that the
difference between population means is a particular value. Or, the null hypothesis could
be that the mean SAT score in some population is 600. The null hypothesis would then be
stated as H0: μ = 600. Although the null hypotheses discussed so far have all involved the
testing of hypotheses about one or more population means, null hypotheses can involve
any parameter. An experiment investigating the correlation between job satisfaction and
performance on the job would test the null hypothesis that the population correlation (ρ)
is 0. Symbolically, H0: ρ = 0.
Some possible null hypotheses are given below:
H0: μ = 0
H0: μ = 10
H0: μ1 - μ2 = 0
H0: ρ = .5
H0: ρ1 - ρ2 = 0
H0: μ1 = μ2 = μ3
H0: π1 - π2 = 0
When a one-tailed test is conducted, the null hypothesis includes the direction of the
effect. A one-tailed test of the differences between means might test the null hypothesis
that μ1 - μ2 ≥ 0. If M1 - M2 were much less than 0, then the null hypothesis would be
rejected in favor of the alternative hypothesis: μ1 - μ2 < 0.
Alternative hypothesis
In statistical hypothesis testing, the alternative hypothesis (or maintained
hypothesis or research hypothesis) and the null hypothesis are the two rival hypotheses
which are compared by a statistical hypothesis test. An example might be where water
quality in a stream has been observed over many years and a test is made of the null
hypothesis that there is no change in quality between the first and second halves of the
data against the alternative hypothesis that the quality is poorer in the second half of the record.
The concept of an alternative hypothesis in testing was devised by Jerzy
Neyman and Egon Pearson, and it is used in the Neyman-Pearson lemma. It forms a
major component in modern statistical hypothesis testing. However it was not part
of Ronald Fisher's formulation of statistical hypothesis testing, and he violently opposed
its use.[1] In Fisher's approach to testing, the central idea is to assess whether the observed
dataset could have resulted from chance if the null hypothesis were assumed to hold,
notionally without preconceptions about what other model might hold. Modern statistical
hypothesis testing accommodates this type of test since the alternative hypothesis can be
just the negation of the null hypothesis.
ii. One-tailed Test and Two-tailed Test
One-tailed test
A statistical test in which the critical region consists of all values of a test statistic that are
less than a given value or greater than a given value, but not both.
We choose a critical region. In a one-tailed test, the critical region will have just one part.
If our sample value lies in this region, we reject the null hypothesis in favour
of the alternative.
Suppose we are looking for a definite decrease. Then the critical region will be to the left.
Note, however, that in the one-tailed test the value of the parameter can be as high as you
like.
Example
Suppose we are given that X has a Poisson distribution and we want to carry out a hypothesis
test on the mean, λ, based upon a sample observation of 3.
Suppose the hypotheses are:
H0: λ = 9
H1: λ < 9
We want to test if it is "reasonable" for the observed value of 3 to have come from a Poisson
distribution with parameter 9. So what is the probability that a value as low as 3 has come
from a Po(9)?
P(X ≤ 3) = 0.0212 (this has come from a Poisson table)
The probability is less than 0.05, so there is less than a 5% chance that the value has come
from a Po(9) distribution. We therefore reject the null hypothesis in favour of the
alternative at the 5% level.
However, the probability is greater than 0.01, so we would not reject the null hypothesis in
favour of the alternative at the 1% level.
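As a quick check, the tail probability above can be computed directly; a minimal sketch in Python, assuming SciPy is available:

from scipy.stats import poisson

observed = 3
mu0 = 9  # mean under the null hypothesis H0: lambda = 9

# P(X <= 3) for X ~ Po(9); compare against the 5% and 1% levels.
p_value = poisson.cdf(observed, mu0)
print(f"P(X <= {observed}) = {p_value:.4f}")     # ~0.0212
print("Reject H0 at 5% level:", p_value < 0.05)  # True
print("Reject H0 at 1% level:", p_value < 0.01)  # False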
Two-tailed test
The two-tailed test is a statistical test used in inference, in which a given statistical
hypothesis, H0 (the null hypothesis), will be rejected when the value of the statistic is either
sufficiently small or sufficiently large. The test is named after the "tail" of data under the far
left and far right of a bell-shaped normal data distribution, or bell curve. However, the
terminology is extended to tests relating to distributions other than normal.
"In general a test is called two-sided or two-tailed if the null hypothesis is rejected for values
of the test statistic falling into either tail of its sampling distribution, and it is called one-
sided or one-tailed if the null hypothesis is rejected only for values of the test statistic falling
into one specified tail of its sampling distribution".[1] For example, if our alternative
hypothesis is μ ≠ 42.5, rejecting the null hypothesis of μ = 42.5 for small or for large
values of the sample mean, the test is called two-tailed or two-sided. If our alternative
hypothesis is μ > 1.4, rejecting the null hypothesis of μ = 1.4 only for large values of the
sample mean, it is then called one-tailed or one-sided.
If the distribution from which the samples are derived is considered to be normal, Gaussian,
or bell-shaped, then the test is referred to as a one- or two-tailed T test. If the test is
performed using the actual population mean and variance, rather than an estimate from a
sample, it would be called a one- or two-tailed Z test.
The statistical tables for Z and for t provide critical values for both one- and two-tailed tests.
That is, they provide the critical values that cut off an entire alpha region at one or the other
end of the sampling distribution as well as the critical values that cut off the 1/2 alpha regions
at both ends of the sampling distribution.
In a two-tailed test, we are looking for either an increase or a decrease. So, for example,
H0 might be that the mean is equal to 9 (as before). This time, however, H1 would be that the
mean is not equal to 9. In this case, therefore, the critical region has two parts.
Example
Let's test the parameter p of a Binomial distribution at the 10% level.
Suppose a coin is tossed 10 times and we get 7 heads. We want to test whether or not the
coin is fair. If the coin is fair, p = 0.5. Put this as the null hypothesis:
H0: p = 0.5
H1: p ≠ 0.5
Now, because the test is 2-tailed, the critical region has two parts. Half of the critical region
is to the right and half is to the left. So the critical region contains both the top 5% of the
distribution and the bottom 5% of the distribution (since we are testing at the 10% level).
If H0 is true, X ~ Bin(10, 0.5).
If the null hypothesis is true, what is the probability that X is 7 or above?
P(X ≥ 7) = 1 - P(X < 7) = 1 - P(X ≤ 6) = 1 - 0.8281 = 0.1719
Is this in the critical region? No, because the probability that X is at least 7 is not less than
0.05 (5%), which is what we need it to be.
So there is not significant evidence at the 10% level to reject the null hypothesis.
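The same calculation can be scripted; a minimal sketch in Python, assuming SciPy is available:

from scipy.stats import binom

n, p0, heads = 10, 0.5, 7

# Upper-tail probability P(X >= 7) = 1 - P(X <= 6) for X ~ Bin(10, 0.5).
p_upper = 1 - binom.cdf(heads - 1, n, p0)
print(f"P(X >= {heads}) = {p_upper:.4f}")     # ~0.1719

# Two-tailed test at the 10% level: each tail holds 5%.
print("In critical region:", p_upper < 0.05)  # False, so do not reject H0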
B. What is the Level of Significance / Critical Value?
Level of significance
In statistics, a result is called statistically significant if it is unlikely to have occurred
by chance. The phrase test of significance was coined by Ronald Fisher.
As used in statistics, significant does not mean important or meaningful, as it does in
everyday speech. For example, a study that included tens of thousands of participants might
be able to say with great confidence that residents of one city were more intelligent than
people of another city by 1/20 of an IQ point. This result would be statistically significant,
but the difference is small enough to be utterly unimportant. Many researchers urge that tests
of significance should always be accompanied by effect-size statistics, which approximate the
size and thus the practical importance of the difference.
The amount of evidence required to accept that an event is unlikely to have arisen by chance
is known as the significance level or critical p-value: in traditional Fisherian statistical
hypothesis testing, the p-value is the probability of observing data at least as extreme as that
observed, given that the null hypothesis is true. If the obtained p-value is small then it can be
said either the null hypothesis is false or an unusual event has occurred. It is worth stressing
that p-values do not have any repeat sampling interpretation.
An alternative statistical hypothesis testing framework is the Neyman-Pearson frequentist
school, which requires that both a null and an alternative hypothesis be defined and
investigates the repeat sampling properties of the procedure, i.e. the probability that a
decision to reject the null hypothesis will be made when it is in fact true and should not have
been rejected (this is called a "false positive" or Type I error) and the probability that a
decision will be made to accept the null hypothesis when it is in fact false (Type II error).
More typically, the significance level of a test is such that the probability of mistakenly
rejecting the null hypothesis is no more than the stated probability. This allows the test to be
performed using non-sufficient statistics, which has the advantage of reducing the
computational burden while wasting some information.
It is worth stressing that Fisherian p-values are philosophically different from Neyman-
Pearson Type I errors. This confusion is unfortunately propagated by many statistics
textbooks.
Use in practice
The significance level is usually denoted by the Greek symbol α (lowercase alpha). Popular
levels of significance are 5% (0.05), 1% (0.01) and 0.1% (0.001). If a test of
significance gives a p-value lower than the α-level, the null hypothesis is rejected. Such
results are informally referred to as 'statistically significant'. For example, if someone argues
that "there's only one chance in a thousand this could have happened by coincidence," a
0.001 level of statistical significance is being implied. The lower the significance level, the
stronger the evidence required. Choosing a level of significance is an arbitrary task, but for
many applications, a level of 5% is chosen, for no better reason than that it is conventional.
In some situations it is convenient to express the statistical significance as 1 - α. In general,
when interpreting a stated significance, one must be careful to note what, precisely, is being
tested statistically.
Different α-levels trade off countervailing effects. Smaller levels of α increase confidence in
the determination of significance, but run an increased risk of failing to reject a false null
hypothesis (a Type II error, or "false negative determination"), and so have less statistical
power. The selection of an α-level thus inevitably involves a compromise between
significance and power, and consequently between the Type I error and the Type II error.
More powerful experiments - usually experiments with more subjects or replications - can
obviate this choice to an arbitrary degree.
In some fields, for example nuclear and particle physics, it is common to express statistical
significance in units of "σ" (sigma), the standard deviation of a Gaussian distribution. A
statistical significance of "nσ" can be converted into a value of α via use of the error
function: α = 1 - erf(n/√2).
The use of σ implicitly assumes a Gaussian distribution of measurement values. For example,
if a theory predicts a parameter to have a value of, say, 100, and one measures the parameter
to be 109 ± 3, then one might report the measurement as a "3σ deviation" from the theoretical
prediction. In terms of α, this statement is equivalent to saying that "assuming the theory is
true, the likelihood of obtaining the experimental result by coincidence is 0.27%" (since
1 - erf(3/√2) = 0.0027).
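The conversion can be checked numerically; a minimal sketch in Python using only the standard library:

from math import erf, sqrt

def sigma_to_alpha(n_sigma):
    # Two-sided Gaussian tail probability for a deviation of n_sigma.
    return 1 - erf(n_sigma / sqrt(2))

print(f"{sigma_to_alpha(3):.4f}")  # 0.0027, matching the 3-sigma example above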
Fixed significance levels such as those mentioned above may be regarded as useful in
exploratory data analyses. However, modern statistical advice is that, where the outcome of a
test is essentially the final outcome of an experiment or other study, the p-value should be
quoted explicitly. And, importantly, it should be quoted whether or not the p-value is judged to be
significant. This is to allow maximum information to be transferred from a summary of the
study into meta-analyses.
Critical value
In differential topology, a critical value of a differentiable function f : M → N
between differentiable manifolds is the image f(x) of a critical point x.
The basic result on critical values is Sard's lemma. The set of critical values can be quite
irregular; but in Morse theory it becomes important to consider real-valued functions on a
manifold M, such that the set of critical values is in fact finite. The theory of Morse
functions shows that there are many such functions; and that they are even typical, or generic
in the sense of Baire category.
A critical value is used in significance testing. It is the value that a test statistic must exceed
in order for the null hypothesis to be rejected. For example, the critical value of t (with
12 degrees of freedom using the 0.05 significance level) is 2.18. This means that for the
probability value to be less than or equal to 0.05, the absolute value of the t statistic must be
2.18 or greater. It should be noted that the all-or-none rejection of a null hypothesis is not
recommended.
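The quoted critical value can be verified; a minimal sketch in Python, assuming SciPy is available:

from scipy.stats import t

# Two-tailed test at alpha = 0.05 with 12 degrees of freedom:
# each tail holds 0.025, so we need the 97.5th percentile.
critical = t.ppf(1 - 0.05 / 2, df=12)
print(f"{critical:.2f}")  # ~2.18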
Statistics
In statistics, a critical value is the value corresponding to a given significance level. This
cutoff value determines the boundary between those samples resulting in a test statistic that
leads to rejecting the null hypothesis and those that lead to a decision not to reject the null
hypothesis. If the absolute value of the calculated value from the statistical test is greater than
the critical value, then the null hypothesis is rejected in favour of the alternative hypothesis,
and vice versa. You can never 'accept' an alternative hypothesis, you can only reject the null
hypothesis in favour of the alternative.
C. Test of Significance
Once sample data has been gathered through an observational study or experiment, statistical
inference allows analysts to assess evidence in favor of some claim about the population
from which the sample has been drawn. The methods of inference used to support or reject
claims based on sample data are known as tests of significance.
Every test of significance begins with a null hypothesis H0. H0 represents a theory that has
been put forward, either because it is believed to be true or because it is to be used as a basis
for argument, but has not been proved. For example, in a clinical trial of a new drug, the null
hypothesis might be that the new drug is no better, on average, than the current drug. We
would write H0: there is no difference between the two drugs on average.
The alternative hypothesis, Ha, is a statement of what a statistical hypothesis test is set up to
establish. For example, in a clinical trial of a new drug, the alternative hypothesis might be
that the new drug has a different effect, on average, compared to that of the current drug. We
would write Ha: the two drugs have different effects, on average. The alternative hypothesis
might also be that the new drug is better, on average, than the current drug. In this case we
would write Ha: the new drug is better than the current drug, on average.
The final conclusion once the test has been carried out is always given in terms of the null
hypothesis. We either "reject H0 in favor of Ha" or "do not reject H0"; we never conclude
"reject Ha", or even "accept Ha".
If we conclude "do not reject H0", this does not necessarily mean that the null hypothesis is
true; it only suggests that there is not sufficient evidence against H0 in favor of Ha. Rejecting
the null hypothesis, then, suggests that the alternative hypothesis may be true.
Hypotheses are always stated in terms of a population parameter, such as the mean μ. An
alternative hypothesis may be one-sided or two-sided. A one-sided hypothesis claims that a
parameter is either larger or smaller than the value given by the null hypothesis. A two-sided
hypothesis claims that a parameter is simply not equal to the value given by the null
hypothesis -- the direction does not matter.
Hypotheses for a one-sided test for a population mean take the following form:
H0: μ = k
Ha: μ > k
or
H0: μ = k
Ha: μ < k.
Hypotheses for a two-sided test for a population mean take the following form:
H0: μ = k
Ha: μ ≠ k.
A confidence interval gives an estimated range of values which is likely to include an
unknown population parameter, the estimated range being calculated from a given set of
sample data. (Definition taken from Valerie J. Easton and John H. McColl's Statistics
Glossary v1.1)
Example
Suppose a test has been given to all high school students in a certain state. The mean test
score for the entire state is 70, with standard deviation equal to 10. Members of the school
board suspect that female students have a higher mean score on the test than male students,
because the mean score from a random sample of 64 female students is equal to 73.
Does this provide strong evidence that the overall mean for female students is higher?
The null hypothesis H0 claims that there is no difference between the mean score for female
students and the mean for the entire population, so that μ = 70. The alternative hypothesis
claims that the mean for female students is higher than the entire student population mean, so
that μ > 70.
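The example stops at stating the hypotheses; as a sketch of how the test statistic would then be computed (assuming the population standard deviation of 10 also applies to the sample of 64 female students), in Python with SciPy:

from math import sqrt
from scipy.stats import norm

mu0, sigma, n, xbar = 70, 10, 64, 73

z = (xbar - mu0) / (sigma / sqrt(n))  # (73 - 70) / 1.25 = 2.4
p_value = norm.sf(z)                  # upper-tail (one-sided) probability
print(f"z = {z:.2f}, p = {p_value:.4f}")  # z = 2.40, p ~ 0.0082: reject H0 at the 5% level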
2) Differentiate Parametric and Non-Parametric Test
Parametric statistics is a branch of statistics that assumes data have come from a type of probability
distribution and makes inferences about the parameters of the distribution. Most well-known
elementary statistical methods are parametric.
Generally speaking, parametric methods make more assumptions than non-parametric methods. If
those extra assumptions are correct, parametric methods can produce more accurate and precise
estimates. They are said to have more statistical power. However, if those assumptions are
incorrect, parametric methods can be very misleading. For that reason they are often not
considered robust. On the other hand, parametric formulae are often simpler to write down and
faster to compute. In some, but definitely not all cases, their simplicity makes up for their non-
robustness, especially if care is taken to examine diagnostic statistics.
Because parametric statistics require a probability distribution, they are not distribution-free.
History
Statistician Jacob Wolfowitz coined the statistical term "parametric" in order to define its
opposite in 1942:
"Most of these developments have this feature in common, that the distribution functions of the
various stochastic variables which enter into their problems are assumed to be of known
functional form, and the theories of estimation and of testing hypotheses are theories of
estimation of and of testing hypotheses about, one or more parameters. . ., the knowledge of
which would completely determine the various distribution functions involved. We shall refer to
this situation. . .as the parametric case, and denote the opposite case, where the functional forms
of the distributions are unknown, as the non-parametric case."
Example
Suppose we have a sample of 99 test scores with a mean of 100 and a standard deviation of 10. If
we assume all 99 test scores are random samples from a normal distribution we predict there is a
1% chance that the 100th test score will be higher than 123.65 (that is the mean plus 2.365
standard deviations) assuming that the 100th test score comes from the same distribution as the
others. The normal family of distributions all have the same shape and are parameterized by
mean and standard deviation. That means if you know the mean and standard deviation, and that
the distribution is normal, you know the probability of any future observation. Parametric
statistical methods are used to compute the 2.365 value above, given
99 independent observations from the same normal distribution.
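The 2.365 figure appears to be the upper 1% point of a t distribution with 98 degrees of freedom; a sketch checking that assumption in Python with SciPy:

from scipy.stats import t

# 99th percentile of t with 99 - 1 = 98 degrees of freedom.
print(f"{t.ppf(0.99, df=98):.3f}")  # ~2.365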
A non-parametric estimate of the same thing is the maximum of the first 99 scores. We don't
need to assume anything about the distribution of test scores to reason that before we gave the
test it was equally likely that the highest score would be any of the first 100. Thus there is a 1%
chance that the 100th is higher than any of the 99 that preceded it.
Non-parametric test
In statistics, the term non-parametric statistics has at least two different meanings:
1. The first meaning of non-parametric covers techniques that do not rely on data belonging
to any particular distribution. These include, among others:
- distribution free methods, which do not rely on assumptions that the data are drawn
from a given probability distribution. As such it is the opposite of parametric statistics.
It includes non-parametric statistical models, inference and statistical tests.
- non-parametric statistics (in the sense of a statistic over data, which is defined to be a
function on a sample that has no dependency on a parameter), whose interpretation
does not depend on the population fitting any parametrized distributions. Statistics
based on the ranks of observations are one example of such statistics and these play a
central role in many non-parametric approaches.
2. The second meaning of non-parametric covers techniques that do not assume that the
structure of a model is fixed. Typically, the model grows in size to accommodate the
complexity of the data. In these techniques, individual variables are typically assumed to
belong to parametric distributions, and assumptions about the types of connections
among variables are also made. These techniques include, among others:
- non-parametric regression, which refers to modeling where the structure of the
relationship between variables is treated non-parametrically, but where nevertheless
there may be parametric assumptions about the distribution of model residuals.
- non-parametric hierarchical Bayesian models, such as models based on the Dirichlet
process, which allow the number of latent variables to grow as necessary to fit the
data, but where individual variables still follow parametric distributions and even the
process controlling the rate of growth of latent variables follows a parametric
distribution.
Applications and purpose
Non-parametric methods are widely used for studying populations that take on a ranked order
(such as movie reviews receiving one to four stars). The use of non-parametric methods may be
necessary when data have a ranking but no clear numerical interpretation, such as when
assessing preferences; in terms of levels of measurement, for data on an ordinal scale.
As non-parametric methods make fewer assumptions, their applicability is much wider than the
corresponding parametric methods. In particular, they may be applied in situations where less is
known about the application in question. Also, due to the reliance on fewer assumptions, non-
parametric methods are more robust.
Another justification for the use of non-parametric methods is simplicity. In certain cases, even
when the use of parametric methods is justified, non-parametric methods may be easier to use.
Due both to this simplicity and to their greater robustness, non-parametric methods are seen by
some statisticians as leaving less room for improper use and misunderstanding.
The wider applicability and increased robustness of non-parametric tests comes at a cost: in cases
where a parametric test would be appropriate, non-parametric tests have less power. In other
words, a larger sample size can be required to draw conclusions with the same degree of
confidence.
Non-parametric models
Non-parametric models differ from parametric models in that the model structure is not
specified a priori but is instead determined from data. The term non-parametric is not meant to
imply that such models completely lack parameters but that the number and nature of the
parameters are flexible and not fixed in advance.
- A histogram is a simple nonparametric estimate of a probability distribution.
- Kernel density estimation provides better estimates of the density than histograms.
- Nonparametric regression and semiparametric regression methods have been developed
based on kernels, splines, and wavelets.
- Data Envelopment Analysis provides efficiency coefficients similar to those obtained
by Multivariate Analysis without any distributional assumption.
Methods
Non-parametric (or distribution-free) inferential statistical methods are mathematical
procedures for statistical hypothesis testing which, unlike parametric statistics, make no
assumptions about the probability distributions of the variables being assessed. The most
frequently used tests include
- Anderson-Darling test
- Cochran's Q
- Cohen's kappa
- Friedman two-way analysis of variance by ranks
- Kaplan-Meier
- Kendall's tau
- Kendall's W
- Kolmogorov-Smirnov test
- Kruskal-Wallis one-way analysis of variance by ranks
- Kuiper's test
- Logrank test
- Mann-Whitney U or Wilcoxon rank sum test
- median test
- Pitman's permutation test
- Rank products
- Siegel-Tukey test
- Spearman's rank correlation coefficient
- Wald-Wolfowitz runs test
- Wilcoxon signed-rank test

3) Define and determine when it is appropriate to use:
a. Z-test
It is a statistical test where normal distribution is applied and is basically used for dealing
with problems relating to large samples when n ≥ 30.
There are different types of Z-test, each for a different purpose. Some of the popular types are
outlined below:
1. z-test for single proportion is used to test a hypothesis on a specific value of the population
proportion.
Statistically speaking, we test the null hypothesis H0: p = p0 against the alternative hypothesis
H1: p ≠ p0, where p is the population proportion and p0 is a specific value of the population
proportion we would like to test for acceptance.
The example on tea drinkers given below requires this test. In that example, p0 = 0.5. Notice
that in this particular example, proportion refers to the proportion of tea drinkers.
2. z-test for difference of proportions is used to test the hypothesis that two populations have the
same proportion.
For example, suppose one is interested to test if there is any significant difference in the habit of
tea drinking between male and female citizens of a town. In such a situation, Z-test for difference
of proportions can be applied.
One would have to obtain two independent samples from the town, one from males and the other
from females and determine the proportion of tea drinkers in each sample in order to perform this
test.
3. z-test for single mean is used to test a hypothesis on a specific value of the population mean.
Statistically speaking, we test the null hypothesis H0: μ = μ0 against the alternative hypothesis
H1: μ ≠ μ0, where μ is the population mean and μ0 is a specific value of the population mean that we
would like to test for acceptance.
Unlike the t-test for single mean, this test is used if n ≥ 30 and the population standard deviation is
known.
4. z-test for single variance is used to test a hypothesis on a specific value of the population
variance.
Statistically speaking, we test the null hypothesis H0: σ² = σ0² against H1: σ² ≠ σ0², where σ² is the
population variance and σ0² is a specific value of the population variance that we would like to test
for acceptance.
In other words, this test enables us to test if the given sample has been drawn from a population
with specific variance σ0². Unlike the chi square test for single variance, this test is used if n ≥ 30.
5. Z-test for testing equality of variance is used to test the hypothesis of equality of two population
variances when the sample size of each sample is 30 or larger.
Example:
n = sample size
For example, suppose a person wants to test if both tea and coffee are equally popular in a
particular town. Then he can take a sample of size, say, 500 from the town, out of which suppose
280 are tea drinkers. To test the hypothesis, he can use a Z-test.
Assumption:
Irrespective of the type of Z-test used, it is assumed that the populations from which the
samples are drawn are normal.
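A minimal sketch of the tea-drinkers example above as a z-test for a single proportion, in Python with SciPy, testing H0: p = 0.5 against H1: p ≠ 0.5:

from math import sqrt
from scipy.stats import norm

n, successes, p0 = 500, 280, 0.5
p_hat = successes / n  # observed proportion, 0.56

# Standard normal test statistic under H0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 2 * norm.sf(abs(z))  # two-tailed
print(f"z = {z:.2f}, p = {p_value:.4f}")  # z ~ 2.68, p ~ 0.007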
b. T-test
The Student's t-test is a statistical method that is used to see if two sets of data differ
significantly. The method assumes that the test statistic follows Student's t-distribution if the
null hypothesis is true. This null hypothesis will usually stipulate
that there is no significant difference between the means of the two data sets.
It is best used to try and determine whether there is a difference between two independent
sample groups. For the test to be applicable, the sample groups must be completely independent,
and it is best used when the sample size is too small to use more advanced methods.
Before using this type of test it is essential to plot the sample data from the two samples and
make sure that it has a reasonably normal distribution, or the Student's t-test will not be suitable.
It is also desirable to randomly assign samples to the groups, wherever possible.
Restrictions:
The two sample groups being tested must have a reasonably normal distribution. If the
distribution is skewed, then the Student's t-test is likely to throw up misleading results. The
distribution should have only one main peak (= mode) near the mean of the group.
If the data does not adhere to the above parameters, then either a large data sample is needed or,
preferably, a more complex form of data analysis should be used.
Results:
The Student's t-test can let you know if there is a significant difference in the means of the two
sample groups and disprove the null hypothesis. Like all statistical tests, it cannot prove anything, as
there is always a chance of experimental error occurring. But the test can support a hypothesis.
However, it is still useful for measuring small sample populations and determining if there is a
significant difference between the groups.
Example:
You might be trying to determine if there is a significant difference in test scores between
two groups of children taught by different methods.
The null hypothesis might state that there is no significant difference in the mean test scores of
the two sample groups and that any difference is down to chance.
The Student's t-test can then be used to try and disprove the null hypothesis.
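A minimal sketch of such a comparison in Python with SciPy; the two groups of scores are hypothetical, made up purely for illustration:

from scipy.stats import ttest_ind

method_a = [78, 85, 92, 74, 88, 81, 79, 90]  # hypothetical scores, teaching method A
method_b = [71, 80, 75, 69, 84, 73, 77, 70]  # hypothetical scores, teaching method B

# Independent two-sample t-test of H0: equal means.
t_stat, p_value = ttest_ind(method_a, method_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# Reject H0 if p falls below the chosen significance level.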
c. Correlation and Regression
Correlation Types
Correlation is a measure of association between two variables. The variables are not designated
as dependent or independent. The two most popular correlation coefficients are: Spearman's
correlation coefficient rho and Pearson's product-moment correlation coefficient.
When calculating a correlation coefficient for ordinal data, select Spearman's technique. For
interval or ratio-type data, use Pearson's technique.
The value of a correlation coefficient can vary from minus one to plus one. A minus one
indicates a perfect negative correlation, while a plus one indicates a perfect positive correlation.
A correlation of zero means there is no relationship between the two variables. When there is a
negative correlation between two variables, as the value of one variable increases, the value of
the other variable decreases, and vice versa. In other words, for a negative correlation, the
variables work opposite each other. When there is a positive correlation between two variables,
as the value of one variable increases, the value of the other variable also increases. The
variables move together.
The standard error of a correlation coefficient is used to determine the confidence intervals
around a true correlation of zero. If your correlation coefficient falls outside of this range, then it
is significantly different than zero. The standard error can be calculated for interval or ratio-type
data (i.e., only for Pearson's product-moment correlation).
The significance (probability) of the correlation coefficient is determined from the t-statistic. The
probability of the t-statistic indicates whether the observed correlation coefficient occurred by
chance if the true correlation is zero. In other words, it asks if the correlation is significantly
different than zero. When the t-statistic is calculated for Spearman's rank-difference correlation
coefficient, there must be at least 30 cases before the t-distribution can be used to determine the
probability. If there are fewer than 30 cases, you must refer to a special table to find the
probability of the correlation coefficient.
Example:
A company wanted to know if there is a significant relationship between the total number of
salespeople and the total number of sales. They collect data for five months.
Variable 1   Variable 2
207          6907
180          5991
220          6810
205          6553
190          6190
--------------------------------
Correlation coefficient = .921
Standard error of the coefficient = .068
t-test for the significance of the coefficient = 4.100
Degrees of freedom = 3
Two-tailed probability = .0263
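A sketch reproducing these figures in Python with SciPy:

from scipy.stats import pearsonr

salespeople = [207, 180, 220, 205, 190]
sales = [6907, 5991, 6810, 6553, 6190]

# Pearson product-moment correlation with a two-tailed p-value.
r, p_value = pearsonr(salespeople, sales)
print(f"r = {r:.3f}, p = {p_value:.4f}")  # r ~ .921, p ~ .0263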
Another Example:
Respondents to a survey were asked to judge the quality of a product on a four-point Likert scale
(excellent, good, fair, poor). They were also asked to judge the reputation of the company that
made the product on a three-point scale (good, fair, poor). Is there a significant relationship
between respondents' perceptions of the company and their perceptions of quality of the product?
Since both variables are ordinal, Spearman's method is chosen. The first variable is the rating for
the quality of the product. Responses are coded as 4=excellent, 3=good, 2=fair, and 1=poor. The
second variable is the perceived reputation of the company and is coded 3=good, 2=fair, and
1=poor.
Variable 1   Variable 2
4            3
2            2
1            2
3            3
4            3
1            1
2            1
-------------------------------------------
Correlation coefficient rho = .830
t-test for the significance of the coefficient = 3.332
Number of data pairs = 7
Probability must be determined from a table because of the small sample size.
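A sketch of the same calculation in Python with SciPy. Note that SciPy's implementation corrects for tied ranks, so its value comes out slightly below the .830 obtained from the simple formula rho = 1 - 6*sum(d^2)/(n*(n^2 - 1)):

from scipy.stats import spearmanr

quality = [4, 2, 1, 3, 4, 1, 2]     # 4=excellent ... 1=poor
reputation = [3, 2, 2, 3, 3, 1, 1]  # 3=good ... 1=poor

rho, p_value = spearmanr(quality, reputation)
print(f"rho = {rho:.3f}")  # ~.816 with the tie correction
# With only 7 pairs, the large-sample p-value is unreliable; use a table.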
Regression
Simple regression is used to examine the relationship between one dependent and one
independent variable. After performing an analysis, the regression statistics can be used to
predict the dependent variable when the independent variable is known. Regression goes beyond
correlation by adding prediction capabilities.
People use regression on an intuitive level every day. In business, a well-dressed man is
thought to be financially successful. A mother knows that more sugar in her children's diet
results in higher energy levels. The ease of waking up in the morning often depends on how late
you went to bed the night before. Quantitative regression adds precision by developing a
mathematical formula that can be used for predictive purposes.
For example, a medical researcher might want to use body weight (independent variable)
to predict the most appropriate dose for a new drug (dependent variable). The purpose of running
the regression is to find a formula that fits the relationship between the two variables. Then you
can use that formula to predict values for the dependent variable when only the independent
variable is known. A doctor could prescribe the proper dose based on a person's body weight.
The regression line (known as the least squares line) is a plot of the expected value of the
dependent variable for all values of the independent variable. Technically, it is the line that
"minimizes the squared residuals". The regression line is the one that best fits the data on a
scatterplot.
Using the regression equation, the dependent variable may be predicted from the
independent variable. The slope of the regression line (b) is defined as the rise divided by the
run. The y intercept (a) is the point on the y axis where the regression line would intercept the y
axis. The slope and y intercept are incorporated into the regression equation. The intercept is
usually called the constant, and the slope is referred to as the coefficient. Since the regression
model is usually not a perfect predictor, there is also an error term in the equation.
In the regression equation, y is always the dependent variable and x is always the
independent variable. Here are three equivalent ways to mathematically describe a linear
regression model.
y = intercept + (slope x) + error
y = constant + (coefficient x) + error
y = a + bx + e
The significance of the slope of the regression line is determined from the t-statistic. It is
the probability that the observed correlation coefficient occurred by chance if the true correlation
is zero. Some researchers prefer to report the F-ratio instead of the t-statistic. The F-ratio is equal
to the t-statistic squared.
The t-statistic for the significance of the slope is essentially a test to determine if the
regression model (equation) is usable. If the slope is significantly different than zero, then we
can use the regression model to predict the dependent variable for any value of the independent
variable.
On the other hand, take an example where the slope is zero. It has no prediction ability
because for every value of the independent variable, the prediction for the dependent variable
would be the same. Knowing the value of the independent variable would not improve our ability
to predict the dependent variable. Thus, if the slope is not significantly different than zero, don't
use the model to make predictions.
The coefficient of determination (r-squared) is the square of the correlation coefficient.
Its value may vary from zero to one. It has the advantage over the correlation coefficient in that it
may be interpreted directly as the proportion of variance in the dependent variable that can be
accounted for by the regression equation. For example, an r-squared value of .49 means that 49%
of the variance in the dependent variable can be explained by the regression equation. The other
51% is unexplained.
The standard error of the estimate for regression measures the amount of variability in the points
around the regression line. It is the standard deviation of the data points as they are distributed
around the regression line. The standard error of the estimate can be used to develop confidence
intervals around a prediction.
Example:
A company wants to know if there is a significant relationship between its advertising
expenditures and its sales volume. The independent variable is advertising budget and the
dependent variable is sales volume. A lag time of one month will be used because sales are
expected to lag behind actual advertising expenditures. Data was collected for a six month
period. All figures are in thousands of dollars. Is there a significant relationship between
advertising budget and sales volume?
Indep. Var.   Depen. Var.
4.2           27.1
6.1           30.4
3.9           25.0
5.7           29.7
7.3           40.1
5.9           28.8
--------------------------------------------------
Model: y = 10.079 + (3.700 x) + error
Standard error of the estimate = 2.568
t-test for the significance of the slope = 4.095
Degrees of freedom = 4
Two-tailed probability = .0149
r-squared = .807
You might make a statement in a report like this: A simple linear regression was performed
on six months of data to determine if there was a significant relationship between advertising
expenditures and sales volume. The t-statistic for the slope was significant at the .05 critical
alpha level, t(4)=4.10, p=.015. Thus, we reject the null hypothesis and conclude that there was a
positive significant relationship between advertising expenditures and sales volume.
Furthermore, 80.7% of the variability in sales volume could be explained by advertising
expenditures.
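A sketch of the same regression in Python with SciPy; the coefficients come out close to, though not exactly matching, the figures quoted above, which may reflect rounding in the published example:

from scipy.stats import linregress

advertising = [4.2, 6.1, 3.9, 5.7, 7.3, 5.9]  # independent variable
sales = [27.1, 30.4, 25.0, 29.7, 40.1, 28.8]  # dependent variable

result = linregress(advertising, sales)
print(f"y = {result.intercept:.3f} + ({result.slope:.3f} x) + error")
print(f"r-squared = {result.rvalue ** 2:.3f}")
print(f"two-tailed p for the slope = {result.pvalue:.4f}")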
d. Analysis of Variance
An important technique for analyzing the effect of categorical factors on a response is to
perform an Analysis of Variance. An ANOVA decomposes the variability in the response
variable amongst the different factors. Depending upon the type of analysis, it may be important
to determine: (a) which factors have a significant effect on the response, and/or (b) how much of
the variability in the response variable is attributable to each factor.
STATGRAPHICS Centurion provides several procedures for performing an analysis of variance:
1. One-Way ANOVA - used when there is only a single categorical factor. This is equivalent to
comparing multiple groups of data.
2. Multifactor ANOVA - used when there is more than one categorical factor, arranged in a
crossed pattern. When factors are crossed, the levels of one factor appear at more than one level
of the other factors.
3. Variance Components Analysis - used when there are multiple factors, arranged in a
hierarchical manner. In such a design, each factor is nested in the factor above it.
4. General Linear Models - used whenever there are both crossed and nested factors, when some
factors are fixed and some are random, and when both categorical and quantitative factors are
present.
One-Way ANOVA
A one-way analysis of variance is used when the data are divided into groups according
to only one factor. The questions of interest are usually: (a) Is there a significant difference
between the groups?, and (b) If so, which groups are significantly different from which others?
Statistical tests are provided to compare group means, group medians, and group standard
deviations. When comparing means, multiple range tests are used, the most popular of which is
Tukey's HSD procedure. For equal size samples, significant group differences can be determined
by examining the means plot and identifying those intervals that do not overlap.
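STATGRAPHICS is the tool named above; the same one-way analysis can be sketched in Python with SciPy. The three groups of measurements are hypothetical:

from scipy.stats import f_oneway

group_a = [24.1, 25.3, 26.0, 24.8]  # hypothetical measurements, group A
group_b = [27.5, 28.1, 26.9, 27.7]  # hypothetical measurements, group B
group_c = [23.0, 22.4, 24.2, 23.5]  # hypothetical measurements, group C

# One-way ANOVA of H0: all group means are equal.
f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value says at least one mean differs; multiple range tests
# such as Tukey's HSD then identify which groups differ.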
Multifactor ANOVA
When more than one factor is present and the factors are crossed, a multifactor ANOVA
is appropriate. Both main effects and interactions between the factors may be estimated. The
output includes an ANOVA table and a new graphical ANOVA from the latest edition of
Statistics for Experimenters by Box, Hunter and Hunter (Wiley, 2005). In a graphical ANOVA,
the points are scaled so that any levels that differ by more than is exhibited in the distribution of the
residuals are significantly different.
Variance Components Analysis
A Variance Components Analysis is most commonly used to determine the level at which
variability is being introduced into a product. A typical experiment might select several batches,
several samples from each batch, and then run replicate tests on each sample. The goal is to
determine the relative percentages of the overall process variability that is being introduced at
each level.
General Linear Model
The General Linear Models procedure is used whenever the above procedures are not
appropriate. It can be used for models with both crossed and nested factors, models in which one
or more of the variables is random rather than fixed, and when quantitative factors are to be
combined with categorical ones. Designs that can be analyzed with the GLM procedure include
partially nested designs, repeated measures experiments, split plots, and many others. For
example, pages 536-540 of the book Design and Analysis of Experiments (sixth edition) by
Douglas Montgomery (Wiley, 2005) contain an example of an experimental design with both
crossed and nested factors. For that data, the GLM procedure produces several important tables,
including estimates of the variance components for the random factors.
e. Chi-Square Test
Any statistical test that uses the chi square distribution can be called a chi square test. It is
applicable both for large and small samples, depending on the context.
There are different types of chi square test, each for a different purpose. Some of the popular
types are outlined below.
Chi square test for testing goodness of fit is used to decide whether there is any difference
between the observed (experimental) value and the expected (theoretical) value.
For example, given a sample, we may like to test if it has been drawn from a normal population.
This can be tested using the chi square goodness of fit procedure.
Chi square test for independence of two attributes. Suppose N observations are considered
and classified according to two characteristics, say A and B. We may be interested to test
whether the two characteristics are independent. In such a case, we can use the Chi square test
for independence of two attributes.
The example considered below, testing for independence of success in the English test vis-a-vis
immigrant status, is a case fit for analysis using this test.
Chi square test for single variance is used to test a hypothesis on a specific value of the
population variance. Statistically speaking, we test the null hypothesis H0: σ² = σ0² against the
research hypothesis H1: σ² ≠ σ0², where σ² is the population variance and σ0² is a specific value of
the population variance that we would like to test for acceptance.
In other words, this test enables us to test if the given sample has been drawn from a
population with specific variance σ0². This is a small sample test to be used only if sample size is
less than 30 in general.
Example:
For example, suppose a person wants to test the hypothesis that the success rate in a particular
English test is similar for indigenous and immigrant students.
If we take a random sample of, say, size 80 students and measure both
indigenous/immigrant as well as success/failure status of each of the students, the chi square test
can be applied to test the hypothesis.
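A sketch of this example as a chi-square test of independence in Python with SciPy; the 2x2 counts are hypothetical, summing to the sample of 80 students:

from scipy.stats import chi2_contingency

#                 pass  fail
observed = [[28, 12],   # indigenous students (hypothetical counts)
            [22, 18]]   # immigrant students (hypothetical counts)

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.4f}")
# Check that every expected cell frequency is at least 5 (see Assumptions below).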
Assumptions:
The Chi square test for single variance has an assumption that the population from which
the sample has been drawn is normal. This normality assumption need not hold for the chi square goodness
of fit test and test for independence of attributes.
However, while implementing these two tests, one has to ensure that the expected frequency
in any cell is not less than 5. If it is so, then it has to be pooled with the preceding or succeeding
cell so that the expected frequency of the pooled cell is at least 5.
Non-Parametric and Distribution Free:
It has to be noted that the Chi square goodness of fit test and test for independence of
attributes depend only on the set of observed and expected frequencies and degrees of freedom.
These two tests do not need any assumption regarding distribution of the parent population from
which the samples are taken.
Since these tests do not involve any population parameters or characteristics, they are
also termed as non-parametric or distribution free tests. An additional important fact about these two
tests is that they are sample size independent and can be used for any sample size, as long as the
assumption on minimum expected cell frequency is met.
4) Types of Non-Parametric Test
Basically, there is at least one nonparametric equivalent for each parametric general type of test.
In general, these tests fall into the following categories:
- Tests of differences between groups (independent samples);
- Tests of differences between variables (dependent samples);
- Tests of relationships between variables.
Differences between independent groups. Usually, when we have two samples that we want to
compare concerning their mean value for some variable of interest, we would use the t-test for
independent samples; nonparametric alternatives for this test are the Wald-Wolfowitz runs test, the
Mann-Whitney U test, and the Kolmogorov-Smirnov two-sample test. If we have multiple groups,
we would use analysis of variance (see ANOVA/MANOVA); the nonparametric equivalents to this
method are the Kruskal-Wallis analysis of ranks and the Median test.
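A minimal sketch of one such alternative, the Mann-Whitney U test, in Python with SciPy; the two samples are hypothetical:

from scipy.stats import mannwhitneyu

group_1 = [12, 15, 9, 20, 17, 11]  # hypothetical measurements
group_2 = [8, 7, 13, 6, 10, 9]     # hypothetical measurements

u_stat, p_value = mannwhitneyu(group_1, group_2, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")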
Differences between dependent groups. If we want to compare two variables measured in the same
sample we would customarily use the t-test for dependent samples (in Basic Statistics, for example, if
we wanted to compare students' math skills at the beginning of the semester with their skills at the
end of the semester). Nonparametric alternatives to this test are the Sign test and Wilcoxon's matched
pairs test. If the variables of interest are dichotomous in nature (i.e., "pass" vs. "no pass") then
McNemar's Chi-square test is appropriate. If there are more than two variables that were measured in
the same sample, then we would customarily use repeated measures ANOVA. Nonparametric
alternatives to this method are Friedman's two-way analysis of variance and the Cochran Q test (if the
variable was measured in terms of categories, e.g., "passed" vs. "failed"). Cochran Q is particularly
useful for measuring changes in frequencies (proportions) across time.
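A minimal sketch of the matched-pairs case using Wilcoxon's test, in Python with SciPy; the before/after scores are hypothetical:

from scipy.stats import wilcoxon

before = [62, 70, 58, 66, 73, 61, 69, 75]  # hypothetical scores, start of semester
after = [68, 74, 60, 72, 71, 66, 74, 80]   # hypothetical scores, end of semester

# Wilcoxon matched pairs (signed-rank) test of H0: no systematic change.
w_stat, p_value = wilcoxon(before, after)
print(f"W = {w_stat}, p = {p_value:.4f}")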
Relationships between variables. To express a relationship between two variables one usually
computes the correlation coefficient. Nonparametric equivalents to the standard correlation
coefficient are Spearman R, Kendall Tau, and coefficient Gamma (see Nonparametric correlations).
If the two variables of interest are categorical in nature (e.g., "passed" vs. "failed" by "male" vs.
"female") appropriate nonparametric statistics for testing the relationship between the two variables
are the Chi-square test, the Phi coefficient, and the Fisher exact test. In addition, a simultaneous test
for relationships between multiple cases is available: Kendall coefficient of concordance. This test is
often used for expressing inter-rater agreement among independent judges who are rating (ranking)
the same stimuli.
Descriptive statistics. When one's data are not normally distributed, and the measurements at best
contain rank order information, then computing the standard descriptive statistics (e.g., mean,
standard deviation) is sometimes not the most informative way to summarize the data. For example,
in the area of psychometrics it is well known that the rated intensity of a stimulus (e.g., perceived
brightness of a light) is often a logarithmic function of the actual intensity of the stimulus (brightness
as measured in objective units of Lux). In this example, the simple mean rating (sum of ratings
divided by the number of stimuli) is not an adequate summary of the average actual intensity of the
stimuli. (In this example, one would probably rather compute the geometric mean.) Nonparametrics
and Distributions will compute a wide variety of measures of location (mean, median, mode, etc.)
and dispersion (variance, average deviation, quartile range, etc.) to provide the "complete picture" of
one's data.
When to Use Which Method
It is not easy to give simple advice concerning the use of nonparametric procedures. Each
nonparametric procedure has its peculiar sensitivities and blind spots. For example, the Kolmogorov-
Smirnov two-sample test is not only sensitive to differences in the location of distributions (for
example, differences in means) but is also greatly affected by differences in their shapes. The
Wilcoxon matched pairs test assumes that one can rank order the magnitude of differences in
matched observations in a meaningful manner. If this is not the case, one should rather use the Sign
test. In general, if the result of a study is important (e.g., does a very expensive and painful drug
therapy help people get better?), then it is always advisable to run different nonparametric tests;
should discrepancies in the results occur contingent upon which test is used, one should try to
understand why some tests give different results. On the other hand, nonparametric statistics are less
statistically powerful (sensitive) than their parametric counterparts, and if it is important to detect
even small effects (e.g., is this food additive harmful to people?) one should be very careful in the
choice of a test statistic.
Large data sets and nonparametric methods. Nonparametric methods are most appropriate when the
sample sizes are small. When the data set is large (e.g., n > 100) it often makes little sense to use
nonparametric statistics at all. Elementary Concepts briefly discusses the idea of the central limit
theorem. In a nutshell, when the samples become very large, then the sample means will follow the
normal distribution even if the respective variable is not normally distributed in the population, or is
not measured very well. Thus, parametric methods, which are usually much more sensitive (i.e.,
have more statistical power) are in most cases appropriate for large samples. However, the tests of
significance of many of the nonparametric statistics described here are based on asymptotic (large
sample) theory; therefore, meaningful tests can often not be performed if the sample sizes become
too small. Please refer to the descriptions of the specific tests to learn more about their power and
efficiency.
Nonparametric Correlations
The following are three types of commonly used nonparametric correlation coefficients
(Spearman R, Kendall Tau, and Gamma coefficients). Note that the chi-square statistic computed for
two-way frequency tables also provides a careful measure of a relation between the two (tabulated)
variables, and unlike the correlation measures listed below, it can be used for variables that are
measured on a simple nominal scale.
Spearman R. Spearman R (Siegel & Castellan, 1988) assumes that the variables under
consideration were measured on at least an ordinal (rank order) scale, that is, that the individual
observations can be ranked into two ordered series. Spearman R can be thought of as the regular
Pearson product moment correlation coefficient, that is, in terms of proportion of variability
accounted for, except that Spearman R is computed from ranks.
Kendall tau. Kendall tau is equivalent to Spearman R with regard to the underlying assumptions. It
is also comparable in terms of its statistical power. However, Spearman R and Kendall tau are
usually not identical in magnitude because their underlying logic as well as their computational
formulas are very different. Siegel and Castellan (1988) express the relationship of the two measures
in terms of the inequality -1 ≤ 3 * Kendall tau - 2 * Spearman R ≤ 1. More importantly, Kendall tau and Spearman R imply different
interpretations: Spearman R can be thought of as the regular Pearson product moment correlation
coefficient, that is, in terms of proportion of variability accounted for, except that Spearman R is
computed from ranks. Kendall tau, on the other hand, represents a probability, that is, it is the
difference between the probability that in the observed data the two variables are in the same order
versus the probability that the two variables are in different orders.
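A sketch comparing the two coefficients and checking the inequality above on hypothetical data, in Python with SciPy:

from scipy.stats import kendalltau, spearmanr

x = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical ranks
y = [2, 1, 4, 3, 6, 5, 8, 7]  # hypothetical ranks

tau, _ = kendalltau(x, y)
rho, _ = spearmanr(x, y)
print(f"tau = {tau:.3f}, rho = {rho:.3f}")  # tau ~ .714, rho ~ .905
print("-1 <= 3*tau - 2*rho <= 1:", -1 <= 3 * tau - 2 * rho <= 1)  # True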
Gamma. The Gamma statistic (Siegel & Castellan, 1988) is preferable to Spearman R or Kendall
tau when the data contain many tied observations. In terms of the underlying assumptions, Gamma is
equivalent to Spearman R or Kendall tau; in terms of its interpretation and computation it is more
similar to Kendall tau than Spearman R. In short, Gamma is also a probability; specifically, it is
computed as the difference between the probability that the rank ordering of the two variables agree
minus the probability that they disagree, divided by 1 minus the probability of ties. Thus, Gamma is
basically equivalent to Kendall tau, except that ties are explicitly taken into account.