Bootstrap Method of Interior-Branch Test for …php.scripts.psu.edu/nxm2/2002...

7
Bootstrap Method of Interior-Branch Test for Phylogenetic Trees Tatyana Sitnikova Institute of Molecular Evolutionary Genetics and Department of Biology, The Pennsylvania State University Statistical properties of the bootstrap test of interior branch lengths of phylogenetic trees have been studied and compared with those of the standard interior-branch test in computer simulations. Examination of the properties of the tests under the null hypothesis showed that both tests for an interior branch of a predetermined topology are quite reliable when the distribution of the branch length estimate approaches a normal distribution. Unlike the standard interior-branch test, the bootstrap test appears to retain this property even when the substitution rate varies among sites. In this case, the distribution of the branch length estimate deviates from a normal distribution, and the standard interior-branch test gives conservative confidence probability values. A simple correction method was developed for both interior-branch tests to be applied for testing the reliability of tree topologies estimated from sequence data. This correction for the standard interior-branch test appears to be as effective as that obtained in our previous study, though it is much simpler. The bootstrap and standard interior-branch tests for estimated topologies become conservative as the number of sequence groups in a star-like tree increases. Introduction There are two different types of statistical tests of the reliability of a branching pattern of phylogenetic trees. One is Felsenstein’s (1985) test, in which the sta- bility of each sequence cluster (each partition of se- quences) of a tree is evaluated by the bootstrap method (Efron 1979). The test is known to be sometimes ex- cessively conservative (Zharkikh and Li 1992a, 1992b, 1995; Hillis and Bull 1993; Sitnikova, Rzhetsky, and Nei 1995). The other test, called the interior-branch test (Nei, Stephens, and Saitou 1985; Li 1989; Rzhetsky and Nei 1992), is to compute the standard errors of estimates of interior branch lengths and examine whether or not the branch lengths are significantly positive by means of the normal deviate test. This test has been shown to perform well if it is conducted properly (Sitnikova, Rzhetsky, and Nei 1995). Dopazo (1994) recently proposed that a bootstrap technique be used for testing the positiveness of the length of each interior branch in a given tree topology. This bootstrap test is different from Felsenstein’s. It is a nonparametric version of the interior-branch test, and we call it “the bootstrap interior-branch test.” However, the statistical properties of this test remain unclear. We have therefore studied some of the properties in com- parison to those of the standard interior-branch test by using computer simulation. In this study, we examined both cases where all the assumptions of the test are sat- isfied and where some assumptions are violated. Before explaining details of the computer simulation, we briefly describe the bootstrap test studied here. Bootstrap Interior-Branch Test The null hypothesis of the bootstrap interior-branch test as applied by Rodrigo and Dopazo (1995) is the same as that of the standard interior-branch test, i.e., that Key words: bootstrap, interior-branch test, phylogenetic trees, neighbor-joining. Address for correspondence and reprints: Tatyana Sitnikova, In- stitute of Molecular Evolutionary Genetics, 208 Mueller Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802. E-mail: [email protected]. Mol. Biol. Evol. 13(4):605411. 1996 0 1996 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038 the length of an interior branch is equal to 0. To test this null hypothesis, a bootstrap sampling of nucleotide sites from a set of sequences is conducted, and the interior branch length is estimated from each bootstrap sample. This procedure is repeated many times. Since the same topology is considered for all bootstrap samples, the es- timates of the branch length can be either positive or negative. The proportion of the bootstrap samples pro- ducing positive estimates of the branch length is com- puted for each interior branch. This will be called the bootstrap confidence probability value. Note that the test described above is appropriate for evaluating the reliability of a predetermined tree to- pology. In practice, however, an estimated topology is of special interest, and the test of this topology is slight- ly different from that of a predetermined topology. In the first case, a tree topology is chosen a priori and does not depend on the sequence data being analyzed, where- as in the second case a tree topology is estimated from the data. Since the statistical properties of both the boot- strap and the standard interior-branch tests are affected by the topology under study, we will consider the cases of predetermined and estimated topologies separately. Computer Simulation Both the bootstrap and the standard interior-branch tests provide the confidence probability (CP) value, which is interpreted as a complement of P value in stan- dard statistical tests and therefore is expected to follow a uniform distribution under the null hypothesis (for the definition of P value see, e.g., Walpole and Myers 1989, p. 302). If the CP value of the test does not follow a uniform distribution, the test will give biased results. In practice, the good behavior of the tests is particularly important in the range of large (0.9-1.0) CP values, since a threshold value of 0.95 or 0.99 is commonly used to reject a null hypothesis. To study the distribu- tions of CP values of the bootstrap and standard interior- branch tests under the null hypothesis, we conducted computer simulations. Methods of Computer Simulation We simulated the evolution of four or six nucleo- tide sequences of 100 nucleotides in length according to 605

Transcript of Bootstrap Method of Interior-Branch Test for …php.scripts.psu.edu/nxm2/2002...

Page 1: Bootstrap Method of Interior-Branch Test for …php.scripts.psu.edu/nxm2/2002 Publications/2002-sitnikova...Bootstrap Method of Interior-Branch Test for Phylogenetic Trees Tatyana

Bootstrap Method of Interior-Branch Test for Phylogenetic Trees

Tatyana Sitnikova Institute of Molecular Evolutionary Genetics and Department of Biology, The Pennsylvania State University

Statistical properties of the bootstrap test of interior branch lengths of phylogenetic trees have been studied and compared with those of the standard interior-branch test in computer simulations. Examination of the properties of the tests under the null hypothesis showed that both tests for an interior branch of a predetermined topology are quite reliable when the distribution of the branch length estimate approaches a normal distribution. Unlike the standard interior-branch test, the bootstrap test appears to retain this property even when the substitution rate varies among sites. In this case, the distribution of the branch length estimate deviates from a normal distribution, and the standard interior-branch test gives conservative confidence probability values. A simple correction method was developed for both interior-branch tests to be applied for testing the reliability of tree topologies estimated from sequence data. This correction for the standard interior-branch test appears to be as effective as that obtained in our previous study, though it is much simpler. The bootstrap and standard interior-branch tests for estimated topologies become conservative as the number of sequence groups in a star-like tree increases.

Introduction

There are two different types of statistical tests of the reliability of a branching pattern of phylogenetic trees. One is Felsenstein’s (1985) test, in which the sta- bility of each sequence cluster (each partition of se- quences) of a tree is evaluated by the bootstrap method (Efron 1979). The test is known to be sometimes ex- cessively conservative (Zharkikh and Li 1992a, 1992b, 1995; Hillis and Bull 1993; Sitnikova, Rzhetsky, and Nei 1995). The other test, called the interior-branch test (Nei, Stephens, and Saitou 1985; Li 1989; Rzhetsky and Nei 1992), is to compute the standard errors of estimates of interior branch lengths and examine whether or not the branch lengths are significantly positive by means of the normal deviate test. This test has been shown to perform well if it is conducted properly (Sitnikova, Rzhetsky, and Nei 1995).

Dopazo (1994) recently proposed that a bootstrap technique be used for testing the positiveness of the length of each interior branch in a given tree topology. This bootstrap test is different from Felsenstein’s. It is a nonparametric version of the interior-branch test, and we call it “the bootstrap interior-branch test.” However, the statistical properties of this test remain unclear. We have therefore studied some of the properties in com- parison to those of the standard interior-branch test by using computer simulation. In this study, we examined both cases where all the assumptions of the test are sat- isfied and where some assumptions are violated. Before explaining details of the computer simulation, we briefly describe the bootstrap test studied here.

Bootstrap Interior-Branch Test

The null hypothesis of the bootstrap interior-branch test as applied by Rodrigo and Dopazo (1995) is the same as that of the standard interior-branch test, i.e., that

Key words: bootstrap, interior-branch test, phylogenetic trees, neighbor-joining.

Address for correspondence and reprints: Tatyana Sitnikova, In- stitute of Molecular Evolutionary Genetics, 208 Mueller Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802. E-mail: [email protected].

Mol. Biol. Evol. 13(4):605411. 1996 0 1996 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

the length of an interior branch is equal to 0. To test this null hypothesis, a bootstrap sampling of nucleotide sites from a set of sequences is conducted, and the interior branch length is estimated from each bootstrap sample. This procedure is repeated many times. Since the same topology is considered for all bootstrap samples, the es- timates of the branch length can be either positive or negative. The proportion of the bootstrap samples pro- ducing positive estimates of the branch length is com- puted for each interior branch. This will be called the bootstrap confidence probability value.

Note that the test described above is appropriate for evaluating the reliability of a predetermined tree to- pology. In practice, however, an estimated topology is of special interest, and the test of this topology is slight- ly different from that of a predetermined topology. In the first case, a tree topology is chosen a priori and does not depend on the sequence data being analyzed, where- as in the second case a tree topology is estimated from the data. Since the statistical properties of both the boot- strap and the standard interior-branch tests are affected by the topology under study, we will consider the cases of predetermined and estimated topologies separately.

Computer Simulation

Both the bootstrap and the standard interior-branch tests provide the confidence probability (CP) value, which is interpreted as a complement of P value in stan- dard statistical tests and therefore is expected to follow a uniform distribution under the null hypothesis (for the definition of P value see, e.g., Walpole and Myers 1989, p. 302). If the CP value of the test does not follow a uniform distribution, the test will give biased results. In practice, the good behavior of the tests is particularly important in the range of large (0.9-1.0) CP values, since a threshold value of 0.95 or 0.99 is commonly used to reject a null hypothesis. To study the distribu- tions of CP values of the bootstrap and standard interior- branch tests under the null hypothesis, we conducted computer simulations.

Methods of Computer Simulation

We simulated the evolution of four or six nucleo- tide sequences of 100 nucleotides in length according to

605

Page 2: Bootstrap Method of Interior-Branch Test for …php.scripts.psu.edu/nxm2/2002 Publications/2002-sitnikova...Bootstrap Method of Interior-Branch Test for Phylogenetic Trees Tatyana

606 Sitnikova

(4 1 3

b, b3 X b2 b5

b4 2 4

2 6

FIG. I.-Model trees for simulation of evolution of four (A) and six (B) nucleotide sequences.

the trees given in figure 1. We applied two methods to test the significance of interior branch b5 in the case of the four-sequence model tree (fig. 1A) and of interior branch bg for the six-sequence model tree (fig. 1B). The procedure of computer simulations was essentially the same as in our previous study (Sitnikova, Rzhetsky, and Nei 1995), although in this paper several substitution models were incorporated in generating nucleotide se- quences. The models were Jukes and Cantor’s (1969) model (JC), Kimura’s ( 1980) two-parameter model (K2P), and Jukes and Cantor’s (1969) model where the rate of substitution was assumed to vary among nucle- otide sites following a gamma distribution with param- eter a (JCT, see Golding 1983; Jin and Nei 1990). The corresponding correction formulas were applied for es- timating distances, and the true value of gamma param- eter a was used in the case of the JCF correction for- mula. (For the formulas of variances and covariances, see Kimura and Ohta 1972; Kimura 1980; Jin and Nei 1990; Bulmer 1991.)

Below we describe the procedure for calculating the CP values in the case of the four-sequence model tree (fig. 1A) for predetermined and estimated topolo- gies. The simulation procedure for the six-sequence model tree (fig. 1B) was essentially the same.

Statistical Tests for a Predetermined Topology

To examine the behavior of the tests for a prede- termined topology, we considered all sets of sequences generated according to the model tree (the number of replicate simulations is indicated in figure 3 legend). The topology of the model tree is regarded as a predeter- mined topology.

For each set of sequences the bootstrap CP value, P,, was computed as follows. (1) By drawing sites of the sequence set with replacement we produced B boot- strap samples of the same sequence length as that of the original one, where B = 500 was used unless otherwise stated. (2) For each bootstrap sample we computed a matrix of distances between sequences by using the ap-

propriate correction formula and calculated the ordinary least 2quares (OLS) estimate of b5 by the formula: l?& = k,dj, where l& is the estimate of b5 for the jth boot- strap sample, di is a vector of distances between se- quences of the jth bootstrap sample, and k5 is a vector of the coefficients. Both k, for 6, in the four-sequence model tree and k, for 6, in the six-sequence model tree were calculated as described in Rzhetsky and Nei (1992) and given in Sitnikova, Rzhetsky, and Nei (1995). (3) For all bootstrap samples we computed the proportion (Pn) of the bootstrap replications for which the estimate of the branch length is positive.

The CP value of the standard interior-branch test (Pc) was computed as foll?ws. (1) We estimated dis- tances between sequences (d) and obtained the OLS es- timate (6,) of b5 and its standard deviation (~(6,)) as described in Rzhetsky and Nei (1992). (2) We then cal- culated 2 = 6,/s(&) and, assuming that the test statistic 2 follows the standard normal distribution, computed Pc in the following way:

(1)

Statistical Tests for an Estimated Topology

In this simulation we tested the reliability of the branching pattern of the tree when it was estimated from the data by the neighbor-joining (NJ) method (Saitou and Nei 1987). Thus, we considered only sets of se- quences that produced NJ trees whose topologies were identical with that of the model tree (about one third of all data sets in the case of the four-sequence model tree), and discarded all other sets of sequences. In the previous study (Sitnikova, Rzhetsky, and Nei 1995), we have de- veloped a corrected version of the standard interior- branch test to be applied for assessing a tree estimated from sequence data. In this corrected test, we first com- puted 2 = 6,/s(&) as described above and then calcu- lated the corrected CP value (Pc) by the formula:

P; = I

Z

H(a)- kbxx a- l dx , (2) 0

where a = 3.17, and b = 3.06.

Results Bootstrap and Interior-Branch Tests for a Predetermined Topology

In this section the performance of the bootstrap and standard interior-branch tests is studied by computer simulation using the four-sequence model tree (fig. 1A). We start with a computer simulation where the JC sub- stitution model is applied for both generating sequences and estimating distances. Then, the K2P and JCT mod- els are considered in the same way.

First, let us examine whether or not the assump- tions of the tests under study are satisfied. The bootstrap method does not make any explicit assumptions about the shape of distribution of a test statistic (Efron and Tibshirani 1993, p. 55). However, there is an implicit

Page 3: Bootstrap Method of Interior-Branch Test for …php.scripts.psu.edu/nxm2/2002 Publications/2002-sitnikova...Bootstrap Method of Interior-Branch Test for Phylogenetic Trees Tatyana

Bootstrap Interior-Branch Test 607

(A) @I 7.40 4.10

‘s

gj

E

3.70 1 2.05

0.00 0.00

-0.7 0.00 0.75 -0.65 0.10 0.85

FIG. 2.-Distributions of the estimate of interior branch length b, of the model tree in simulations according to the four-sequence model tree (fig. 1A) with b, = bz = b, = b4 = 0.3 and b, = 0, where the JC (A), K2P with R = 15 (B), and JCT with a = 0.1 (C) substitution models were used in generating sequences, and appropriate estimators of distances were used. The distributions of 6s in bootstrap samples (6;) were obtained for one data set with Pa = 0.76 in simulation under the JCI model with a = 0.1 (D). In all cases the solid line represents the normal distribution with_the same mean and standard deviation as those of the distribution of b,. The number of replications of simula- tions (r) was 10,000, the number of bootstrap replications (B) was 5,000, and the length of nucleotide sequences (n) was equal to 100.

assumption that the distribution of the branch length es- timate is, on average, reproduced by its distribution in the bootstrap replications created from one set of se- quences. In contrast, the assumption of the interior- branch test is that the estimate of an interior branch length follows a normal distribution. In the case of the simulation under the JC substitution model, the assump- tions of both tests are almost perfectly satisfied. The d@ibution of the estimate of the interior branch length (b,) is indeed close to the normal distribution (fig. 2A). The distribution of 6, in bootstrap replications created from a set of data (distribution of 6;) does approach the distribution of 6, in most cases (data not shown). Note that the variance of a bootstrap distribution of 6, is usually close to the one obtained for the interior- branch test (the coefficient of correlation is 0.98). (In fact, it has been shown that the estimates of variances obtained by the delta method and by the bootstrap method with an infinite number of bootstrap samples are generally close [Parr 19831.)

m 0.0 0.2 0.4 0.6 0.8 1.0

PE3’ PC

FIG. 3.-Frequency distributions of P, (hatched columns) and Pc (solid columns) in simulations for the four-sequence model tree (fig. 1A) with b, = bz = b3 = b4 = 0.3 and b5 = 0 where the same substitution models were used for generating sequences and estimating distances between them: JC (A), K2P with R = 15 (B), JCT with a = 0.1 (C), and JCT with u = 2 (D). The dashed line indicates the uniform distribution. r = 5,000, B = 500, and IZ = 100.

Let us now examine the distributions of Pn and Pc likely to be a consequence of some departure of the obtained in computer simulations. As mentioned earlier, distribution of 6, from the corresponding normal distri-

El P, n Pc

0.10

1

(C) 0.0 0.2 0.4 0.6 0.8 1.0

b’ PC

these distributions are expected to be close to a uniform distribution under the null hypothesis. This is indeed the case for both PB and P, (fig. 3A), although some devi- ation is observed in the range of very small and very large CP values. In the case of the bootstrap test, this deviation probably occurs because the 6; distribution obtained for a set of data is not necessarily a good es- timate of the distribution of 6, when the interior branch length estimate computed from this data set falls in the extreme tail of the distribution of 6, (see a similar ex- ample in Efron and Tibshirani 1993, p. 81). A minor deviation from uniformity of the distribution of Pc is

above 0.95, the bootstrap interior-branch test is less con- bution. Note, however, that for the threshold values

servative than the standard interior-branch test. The JC substitution model is not always appropri-

ate for describing the mechanisms of molecular evolu- tion, since in real genes transitional nucleotide substi- tutions tend to occur more frequently than transversional substitutions (e.g., see Nei 1987, p. 84). In addition, sub- stitution rates may vary among the sites of sequences. In order to examine the behavior of the tests under more realistic substitution models, we conducted simulations under the K2P substitution model with a transition/trans-

Page 4: Bootstrap Method of Interior-Branch Test for …php.scripts.psu.edu/nxm2/2002 Publications/2002-sitnikova...Bootstrap Method of Interior-Branch Test for Phylogenetic Trees Tatyana

608 Sitnikova

version ratio (R) of 15 and the JCT model with a gamma parameter (a) equal to 0.1. Note that these are rather extreme values of the parameters, since such a high R value is rarely observed and a = 0.1 is unusually small (e.g., Kocher and Wilson 1991; Yang, Goldman, and Friday 1994; S. Kumar, personal communication). Thus, we studied the performance of the tests in extreme con- ditions to verify their behavior in common phylogenetic analysis.

The simulation results show that for both the K2P and JCI? models the assumption of a normal distribution of 6, no longer applies (fig. 2B and c). Namely, in both cases the distributions of 6, are narrower and have lon- ger tails than the corresponding normal distribution. Note that the deviation from normality is much more conspicuous in the case of the JCT model than in the case of the K2P model. The distribution of bootstrap estimates (6;) of b5 for a set of sequences obtained under the JCT model often becomes more skewed than the distribution of b5, though it is also narrower and has longer tails than the normal distribution (compare the shapes of the frequency distributions in fig. 2C and D).

Despite the deviation of the distribution of 6, from normality, the distributions of both P, and P, appear to be close to uniform in the case of the K2P model with R = 15 (fig. 3B). Moreover, the bootstrap interior-branch test remains nearly unbiased in the case of the JCT mod- el (a = 0.1) as well, but the standard interior-branch test becomes conservative for large values (PC > 0.9) of Pc (fig. 3C). The distribution of Pc in this case is no longer uniform but rather bimodal, whereas the P, distribution is still close to uniform.

When less extreme values of the model parameters R and a are used (smaller values of R and larger values of a), the distribution of the branch length estimate con- verges to normal. Consequently, both PB and Pc ap- proach a uniform distribution (see fig. 30 for the case of the JCT model with a = 2).

Therefore, we can conclude that the bootstrap in- terior-branch test is an approximate method that per- forms quite well under various substitution models, de- spite possible violation of the implicit assumption of the test.

Our study of the effects of the sequence length and the number of sequences on the performance of the in- terior-branch tests showed that the distribution of inte- rior branch length estimates approaches the normal dis- tribution as the number of nucleotides or the number of the sequences increases. This improves the performance of both tests, making their distributions closer to uni- form, even in the case of the JCT model with a = 0.1 (data not shown). By contrast, use of very short se- quences (say, 10 nt long) is followed by the considerable deviation of the distribution of 6, from normality. This makes both tests conservative in the range of high CP values, even in the simulation under the JC model, the bootstrap interior-branch test being less sensitive to this condition. Therefore, for applying both of the interior- branch tests, sequences at least 50 nt long should be used.

0.20

6

1 O.1° 0.00

0.23

FIG. 4.-Unconditional (solid columns) and conditional (open col- umns) distributions of the estimated length of the interior branch b, for the four-sequence model tree (fig. 1A). The conditional distribution is a part of the unconditional distribution for the cases where the to- pology of the model tree is selected by the NJ method. The simulations were conducted according to the model tree with br = b, = b, = b, = 0.3 and b, = 0, where the JC model was used for generating se- quences and estimating distances between them. r = 10,000 and n = 100.

Interior-Branch Tests for an Estimated Topology

In phylogenetic analysis of real data an investigator is often interested in testing a tree topology estimated from data rather than a predetermined topology. In this case, both bootstrap and standard interior-branch tests as described above become too liberal (data not shown). However, it is possible to develop a simple correction of the tests to overcome this problem. In analogy to deriving the formula for Po (see eq. 2), we consider the four-sequence model tree (fig. 1A) with exterior branch- es of equal length and with interior branch equal to, 0 as the null-hypothesis tree. The distribution of 2 = bg/ ~(6,) for this tree has been found to represent the “worst case” in the sense that the tests constructed for this case will never be too liberal for any other case, even for the trees with a larger number of sequences (Sitnikova, Rzhetsky, and Nei 1995).

Let us first consider the distribution of the interior branch length estimate (6,) for a predetermined topol- ogy, or the unconditional distribution (fig. 4, solid col- umns). The NJ-method would select a given topology among the three possible topologies for four sequences in approximately one third of the cases. Obviously, the conditional distribution of bs (fig. 4, open columns) for a given topology chosen by the NJ-method is a fraction of the unconditional distribution. Since the shape of the conditional distribution is different from the shape of the normal distribution, we aim to find a simple ap- proximation of this distribution.

In developing a test for an interior branch of an estimated topology, Tajima (1992) adapted the truncated standard normal distribution to approximate the condi- tional distribution of t = 6i/s(6i), given that the consid- ered topology is selected by the NJ-method. (Here t is a statistic analogous to 2, though Tajima’s [ 19921 meth- od is used for calculating variances of branch length estimate.) Thus, in his test the probability of accepting a particular topology with critical value c (PT(c)) has been computed as follows:

I m

WC) = f (0 dt, t=c

Page 5: Bootstrap Method of Interior-Branch Test for …php.scripts.psu.edu/nxm2/2002 Publications/2002-sitnikova...Bootstrap Method of Interior-Branch Test for Phylogenetic Trees Tatyana

Bootstrap Interior-Branch Test 609

{

exp( - t2/2)

wheref(t) = 6 if t 2 0.4307, (4)

0 if t < 0.4307.

Hereflt) is a truncated standard normal distribution and P=(c) = l/3 when c < 0.4307. P&c) = l/3 approxi- mately corresponds to the probability of obtaining a tree in the case of a star-like model tree of four sequences.

To find the correction for the interior-branch tests, we make use of Tajima’s approach to describe the con- ditional distribution of a test statistic. The validity of such approximation can be justified in the following way. Let b, be some value of 6, greater than 0. We note that the tail of the conditional distribution of bs (fig. 4, open columns), where 6, > b,, nearly coincides with the tail of the unconditional distribution of 6, (fig. 4, solid columns). Let us fix b, and let o denote the prob- ability of obtaining 6, higher than b, in the uncondi- tional distribution. Since the NJ-method recovers a giv- en topology in one third of the cases, the conditional distribution (given that the topology is selected by the NJ-method) constitutes one third of the unconditional distribution. After normalizing the conditional distribu- tion to make the area under the curve of the distribution equal to 1, the probability of obtaining 6, greater than b, is about 3a in the conditional distribution. Then, from (Y = 1 - P, where P is the probability of obtaining 6, smaller than b, in the unconditional distribution, and 3o = 1 - e where P is the same probability for the con- ditional distribution, we obtain P = 3P - 2. Therefore, we can introduce the following formulas to compute P for an observed & value in the test for an estimated topology:

(3PB - 2) if PB > 2/3 B

= 0 if Pn 5 2/

and

C(3Pc - 2) if P, > 2/3

if Pc I 2/3

Note that both P, and Tajima’s (1992) tests make an assumption that the interior branch length estimate follows a normal distribution and, in fact, Pc = 1 - 3P,(c), where PT(c) is calculated for Tajima’s test by equation (3).

Pc can also be obtained for the two-sided interior- branch test (Pc2). In this case, the alternative hypothesis of the test is that the interior branch length deviates from 0. Thus, the corrected CP value of the test becomes:

We examined the distributions of Pg, Pc (eq. 5), and Pc (eq. 2) in the simulations according to the four- sequence tree (fig. 1A). All tests tended to perform well for CP values greater than 0.5 (fig. 5A), although a mi- nor underestimation in the range of 0.9-1.0 was ob- served. However, examining the distributions of PB and Pc, we found that the frequencies of very small values

(A)o.m 0.15 Ill

0.0 6.1 4.2 6.3 6.4 6.5 6.6 0.7 0.8 0.9 1.0

@)0.30 FB, Fc, p;

0.25

6.1 6.2 0.4 6.6 0.8

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

p,, p, 9 pc FIG. 5.-Frequency distributions of P, (hatched columns), Pc

(solid columns), and Pc (open columns) for the estimated topology when it is identical to the topology of the model tree. (A) Simulation was conducted under the JC substitution model following the four- sequence model tree (fig. 1A) with b, = b2 = b3 = b4 = 0.3 and b, = 0. (B) Simulation was conducted under the JC substitution model following the six-sequence model tree (fig. 1B) with b, = b2 = b3 = b, = b, = b, = 0.3 and b7 = b, = b, = 0. (C) In simulation under the JCT substitution model with a = 0.1 the model tree was the same as for A. In all cases the appropriate correction formulas were used for estimating distances. In simulations for the four-sequence tree r = 10,000 and for the six-sequence tree r = 100,000. In all cases B = 500, n = 100.

are excessively high because all values of PB 5 2/3 and Pc 5 2/3 are lumped in 0 for the corrected CP values. Consequently, in the region O-O.5 both distributions di- gress downward from the uniform distribution. Never- theless, these deviations can be neglected, since in prac- tice small CP values are not used for rejecting the null hypothesis of the tests and therefore rarely draw the in- terest of investigators. Simulations for the six-sequence model tree (fig. 1B) with expected lengths of all interior branches equal to 0 indicate that the above three tests all become conservative as the number of sequence groups in a star tree increases (fig. 5B).

Page 6: Bootstrap Method of Interior-Branch Test for …php.scripts.psu.edu/nxm2/2002 Publications/2002-sitnikova...Bootstrap Method of Interior-Branch Test for Phylogenetic Trees Tatyana

610 Sitnikova

In the case of the JCT model with a = 0.1, the distributions of Pc and Pc give more serious underesti- mates of the true CP values in the range of large values than does the P, distribution (fig. 5C). This is consistent with the results of the computer simulation with the JCT model for a predetermined topology, where the interior- branch test is found to be conservative for large values of Pc (see above).

Interior-Branch Tests when Model Assumptions are Violated

It is unlikely that the substitution pattern of real nucleotide sequences exactly follows any substitution model that is usually used in phylogenetic analysis. Hence, the properties of the statistical tests for the cases of biased estimators of distances between sequences are worth studying. Here we analyze the behavior of both bootstrap and standard interior-branch tests when the K2P and JCT substitution models are incorporated in generating sequences, while the JC formula is used for estimation of distances.

In our simulations both interior-branch tests were affected by an incorrect choice of the distance measures to approximately equal extents. No substantial deviation of the distributions of P, and Pc from the uniform dis- tribution occurs when the expected value of the branch length estimate does not deviate from the true value of the branch length. This was found in the case of the four-sequence model tree with equal lengths of exterior branches (fig. 6A) where the expected value of 6, is 0. In the case of overestimation of the interior branch length, both P, and Pc tests become too liberal (fig. 6B), whereas they are conservative when the interior branch length is underestimated (fig. 6C).

In practice, estimates of branch lengths may be bi- ased. Therefore, it is preferable to use the most realistic distance measures available for conducting statistical tests of the reliability of phylogenetic trees.

Discussion

Simplicity of implementation and freedom from distributional assumptions are among the most attractive properties of the bootstrap technique. One of the popular applications of the bootstrap method in statistical tests is to obtain P values. However, the bootstrap test has been shown to perform well only for normally distrib- uted statistics or statistics that follow a symmetrical and continuous distribution. That is, it has been proven that the distribution of the bootstrap P value is asymptoti- cally uniform, provided that the bootstrap method gives a good approximation of the real distribution (Singh and Berk 1994).

In this paper, we have shown that the bootstrap method performs well when it is employed for testing the positiveness of the length of an interior branch of a predetermined tree topology. It appears to be nearly un- biased even in the case of the simulation under the JCT substitution model with a = 0.1, where the distribution of the branch length estimate differs considerably from a normal distribution. The standard interior-branch test

. _ 0.10 El P, n Pc

I 1 r 0.0 0.1 0.2 0.3 d.4d.5 d.6-d.7 &i -d.9 i.0

Pl3’ PC

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

PEI’ PC

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

PI3’ PC FIG. 6.-Frequency distributions of P, (hatched columns) and Pc

(solid columns) in simulations for the four-sequence model tree (fig. 1A) with b, = b2 = b3 = b4 = 0.3 and b5 = 0 (A), with b, = b2 = 0.5, b3 = b, = 0.05, and b, = 0 (B), and with b, = b3 = 0.05, b, = b, = 0.5, and bs = 0 (C). The JCT model with a = 0.1 was used for generating sequences, but the JC correction formula was applied to estimate distances between sequences. r = 10,000, B = 500, and 12 = 100.

(the parametric analog of the bootstrap interior-branch test that incorporates an assumption of a normal distri- bution of the test statistic) turns out to be conservative for high values of P, in this case.

When a topology estimated from a data set is test- ed, the corrected CP values for the bootstrap (P,) and standard interior-branch tests (Pc or Pc) rather than P, and Pc are recommended for use. Since P, and Pc as presented in this paper are very simple to compute and as effective as Pc, they might be preferable to Pc.

The bootstrap interior-branch test can serve as a good alternative to the standard interior-branch test in evaluating the reliability of both predetermined and es- timated tree topologies. Note also that, unlike the stan- dard interior-branch test, the bootstrap test does not re- quire computation of a covariance matrix of distance

Page 7: Bootstrap Method of Interior-Branch Test for …php.scripts.psu.edu/nxm2/2002 Publications/2002-sitnikova...Bootstrap Method of Interior-Branch Test for Phylogenetic Trees Tatyana

Bootstrap Interior-Branch Test 611

estimates, which is rather computationally intensive when a large data set is considered.

Nevertheless, the interpretation of the results of both interior-branch tests in assessing the accuracy of a tree topology is not straightforward. This is because the test of the positiveness of an interior branch length is not equivalent to the test of the hypothesis that the par- tition of sequences produced by this interior branch is present in the true tree. In particular, the interior branch generating an incorrect partition in a tree may have pos- itive expectation of the length estimate (see examples in Sitnikova, Rzhetsky, and Nei 1995). (Note that this nev- er happens if the clustering of four sequences or four monophyletic groups is considered.) After examining a number of examples of such situations, we conjectured that a wrong tree must contain at least one interior branch of which the expectation of the length becomes negative. If this conjecture is correct, we can reject a tree under consideration when it contains an interior branch of which the length is negative and statistically significant. Conversely, the branching pattern of a to- pology can be recognized as reliable only if all interior branches of the tree are significantly positive.

KIMURA, M. 1980. Simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16: 11 l-120.

KIMURA, M., and T. OHTA. 1972. On the stochastic model for estimation of mutational distance between homologous pro- teins. J. Mol. Evol. 2:87-90.

KOCHER, T. D., and A. C. WILSON. 1991. Sequence evolution of mitochondrial DNA in humans and chimpanzees: control region and a protein-coding region. Pp. 41-64 in T. OSAWA and T. HONJO, eds. Evolution of life: fossils, molecules and culture. Springer, Tokyo.

LI, W.-H. 1989. A statistical test of phylogenies estimated from sequence data. Mol. Biol. Evol. 6:424-435.

NEI, M. 1987. Molecular evolutionary genetics. Columbia Uni- versity Press, New York.

NEI, M. 199 1. Relative efficiencies of different tree-making methods for molecular data. Pp. 90-128 in M. MIYAMOTO and J. CRACRAFT, eds. Phylogenetic analysis of DNA se- quences. Oxford University Press, New York.

NEI, M., J. C. STEPHENS, and N. SAITOU. 1985. Methods for computing the standard errors of branching points in an evolutionary tree and their application to molecular data from humans and apes. Mol. Biol. Evol. 2:66-85.

PARR, W. C. 1983. A note on the jackknife, the bootstrap and the delta method estimators of bias and variance. Biometri- ka 70:719-722.

RODRIGO, M. J., and J. DOPAZO. 1995. Evolutionary analysis Acknowledgments of picornavirus family. J. Mol. Evol. 40:362-371.

RZHETSKY, A., and M. NEI. 1992. A simple method for esti- This study was motivated by discussions with Mas- mating and testing minimum-evolution trees. Mol. Biol.

atoshi Nei. I would like to thank Andrey Rzhetsky and Evol. 9:945-967.

Masatoshi Nei for numerous comments and suggestions RZHETSKY, A., and M. NEI. 1993. Theoretical foundation of

and Joaquin Dopazo for discussions on the early drafts the minimum-evolution method of phylogenetic inference.

of the manuscript. I also thank Claudia Russo, Jianzhi Mol. Biol. Evol. 10: 1073-1095.

Zhang, and Jennifer M. Hay for comments. This work SAITOU, N., and M. NEI. 1987. The neighbor-joining method:

was supported by grants from NIH and NSF to M. Nei. a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.

SINGH, K., and R. H. BERK. 1994. A concept of type-2 p-value.

LITERATURE CITED

BULMER, M. 1991. Use of the method of generalized least squares in reconstructing phylogenies from sequence data. Mol. Biol. Evol. 8:868-883.

DOPAZO, J. 1994. Estimating errors and confidence intervals for branch lengths in phylogenetic trees by a bootstrap ap- proach. J. Mol. Evol. 38:300-304.

EFRON, B. 1979. Bootstrap methods: another look at the jack- knife. Ann. Statist. 7:1-26.

EFRON, B., and R. J. TIBSHIRANI. 1993. An introduction to the bootstrap. Chapman & Hall, New York.

FELSENSTEIN, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783-79 1.

FELSENSTEIN, J. 1988. Phylogenies from molecular sequences: inference and reliability. Annu. Rev. Genet. 22:521-565.

GOLDING, G. B. 1983. Estimates of DNA and protein sequence divergence: an examination of some assumptions. Mol. Biol. Evol. 1: 125-142.

HILLIS, D. M., and J. J. BULL. 1993. An empirical test of boot- strapping as a method for assessing confidence in phylo- genetic analysis. Syst. Biol. 42: 182-192.

JIN, L., and M. NEI. 1990. Limitations of the evolutionary par- simony method of phylogenetic analysis. Mol. Biol. Evol. 7:82-102.

JUKES, T. H., and C. R. CANTOR. 1969. Evolution of protein molecules. Pp. 21-132 in H. M. MUNRO, ed. Mammalian protein metabolism. Academic Press, New York.

Statistica Sinica 4:493-504. SITNIKOVA, T, A. RZHETSKY, and M. NEI. 1995. Interior-

branch and bootstrap tests of phylogenetic trees. Mol. Biol. Evol. 12:319-333.

TAJIMA, E 1992. Statistical method for estimating the standard errors of branch lengths in a phylogenetic tree reconstructed without assuming equal rates of nucleotide substitution among different lineages. Mol. Biol. Evol. 9: 168-181.

WALPOLE, R. E., and R. H. MYERS. 1989. Probability and sta- tistics for engineers and scientists. Macmillan, New York.

YANG, Z., N. GOLDMAN, and A. E. FRIDAY. 1994. Comparison of models for nucleotide substitution used in maximum likelihood phylogenetic estimation. Mol. Biol. Evol. 11: 3 16324.

ZHARKIKH, A., and W.-H. LI. 1992~. Statistical properties of bootstrap estimation of phylogenetic variability from nucle- otide sequences: I. Four taxa with a molecular clock. Mol. Biol. Evol. 9: 1119-l 147.

ZHARKIKH, A., and W.-H. LI. 1992b. Statistical properties of bootstrap estimation of phylogenetic variability from nucle- otide sequences: II. Four taxa without a molecular clock. J. Mol. Evol. 35:356-366.

ZHARKIKH, A., and W.-H. LI. 1995. Estimation of confidence in phylogeny: the complete-and-partial bootstrap technique. Mol. Phylogenet. Evol. 444-63.

SHOZO YOKOYAMA, reviewing editor

Accepted January 8, 1996