Lecture 006

7/29/2019 Lecture 006


Roots of Equations

1.1 Introduction

The following problem may be used as an introduction to the problem of root finding. An electrical cable

is suspended from two towers that are 50 meters apart. The cable is allowed to dip 10 meters in the

middle. How long is the cable? We know that the curve assumed by a suspended cable is a catenary

(see Figure 0.0).


Figure 0.0. Cable suspended between two towers (left and right in the figure).

When the y-axis passes through the lowest point, we can assume an equation of the form $y = k \cosh(x/k)$. Here $k$ is a parameter to be determined. The conditions of the problem are that $y(25) = y(0) + 10$. Hence

$$k \cosh\frac{25}{k} = k + 10.$$

From this equation, $k$ can be determined by the methods discussed in this chapter. The result is $k = 32.79$. The question now is how we can find this value and which procedures can be used to calculate it.
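As a rough illustration of how such a value can be found, the following Python sketch solves $k \cosh(25/k) = k + 10$ by repeatedly halving an interval on which the residual changes sign (the bisection idea developed later in this chapter). The bracket [10, 100], the tolerance, and the names `g` and `solve_k` are assumptions chosen for illustration. The cable length uses the arc length $2k \sinh(25/k)$ of the catenary between the two towers.

```python
import math

def g(k):
    # Residual of the cable condition k*cosh(25/k) = k + 10
    return k * math.cosh(25.0 / k) - k - 10.0

def solve_k(lo=10.0, hi=100.0, tol=1e-8):
    # Assumed bracket: g(lo) > 0 and g(hi) < 0
    assert g(lo) * g(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid   # sign change lies in [lo, mid]
        else:
            lo = mid   # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)

k = solve_k()
# Arc length of y = k*cosh(x/k) between x = -25 and x = 25
length = 2.0 * k * math.sinh(25.0 / k)
print(round(k, 2), round(length, 2))
```

The computed parameter agrees with the value $k = 32.79$ quoted above to the printed accuracy, and the second number answers the original question of how long the cable is.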

Another example of this kind of problem is the following missile-intercept problem. The movement of an object in the x-y plane is described by the parametrized equations

$$x_1(t) = t \quad\text{and}\quad y_1(t) = 1 - e^{-t}.$$

A second object moves according to the equations

$$x_2(t) = 1 - \cos(\alpha)\, t \quad\text{and}\quad y_2(t) = \sin(\alpha)\, t - 0.1\, t^2.$$

Is it possible to choose a value for $\alpha$ so that both objects will be in the same place at some time?

2012 G. Baumann


When we set the x and y coordinates equal to each other, we get the system

$$t = 1 - \cos(\alpha)\, t \quad\text{and}\quad 1 - e^{-t} = \sin(\alpha)\, t - 0.1\, t^2$$

that needs to be solved for the unknowns $\alpha$ and $t$. If real values exist for these unknowns that satisfy the two equations, both objects will be in the same place at some value $t$. But even though the problem is a rather simple one that yields a small system, there is no obvious way to get the answer, or even to see if there is a solution. However, if we graphically represent the two curves, we observe that there is an intersection, which means that a solution exists (see Figure 0.0).


Figure 0.0. Two objects are crossing on a common point.
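For this particular system, one way to confirm the intersection numerically is sketched below in Python. It assumes the equations read $t = 1 - t\cos\alpha$ and $1 - e^{-t} = t\sin\alpha - 0.1\,t^2$. The elimination of $\alpha$ through $\cos^2\alpha + \sin^2\alpha = 1$, the bracket [0.5, 1], and the names `h` and `intercept` are illustrative choices, not part of the lecture.

```python
import math

def h(t):
    # cos(alpha) and sin(alpha) follow from the two equations;
    # cos^2 + sin^2 = 1 leaves a single equation in t alone.
    c = (1.0 - t) / t
    s = (1.0 - math.exp(-t) + 0.1 * t * t) / t
    return c * c + s * s - 1.0

def intercept(lo=0.5, hi=1.0, tol=1e-12):
    # Assumed bracket: h(0.5) > 0 and h(1.0) < 0
    assert h(lo) * h(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    t = 0.5 * (lo + hi)
    # Recover alpha from t*sin(alpha) and t*cos(alpha)
    alpha = math.atan2(1.0 - math.exp(-t) + 0.1 * t * t, 1.0 - t)
    return alpha, t

alpha, t = intercept()
print(alpha, t)
```

The trick works only because this small system happens to allow $\alpha$ to be eliminated. For general nonlinear systems no such reduction is available, which is why the single-equation case is studied first.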

The numerical solution of a system of nonlinear equations is one of the more challenging tasks in

numerical analysis and, as we will see, no completely satisfactory method exists for it. To understand the

difficulties, we start with what at first seems to be a rather easy problem, the solution of a single equation

in one variable

f(x) = 0.

The values of x that satisfy this equation are called the zeros or roots of the function f. In what is to follow

we will assume that f is continuous and sufficiently differentiable where needed.

1.2 Simple Root Finding Methods

To find the roots of a function of one variable is straightforward enough: just plot the function and see

where it crosses the x-axis. The simplest methods are in fact little more than that and only carry out this


suggestion in a systematic and efficient way. Relying on the intuitive insight of the graph of the function f,

we can discover many different and apparently viable methods for finding the roots of a function of one

variable.

Suppose we have two values $a$ and $b$, such that $f(a)$ and $f(b)$ have opposite signs. Then, because it is assumed that $f$ is continuous, we know that there is a root somewhere in the interval $[a, b]$. To localize it, we take the midpoint $c = (a + b)/2$ of the interval and compute $f(c)$. Depending on the sign of $f(c)$, we can then place the root in one of the two intervals $[a, c]$ or $[c, b]$. We can repeat this procedure until the region in which the root is known to be located is sufficiently small. The algorithm is known as the bisection method.

This method is based on the intermediate-value theorem, which is illustrated in Figure 0.0. In this figure the graph of a function that is continuous on the closed interval $[a, b]$ is shown. The figure suggests that if we draw any horizontal line $y = k$, where $k$ is between $f(a)$ and $f(b)$, then that line will cross the curve $y = f(x)$ at least once over the interval $[a, b]$.


Figure 0.0. Graph of a function with continuous behavior in the interval $[a, b]$.

Stated in numerical terms, if $f$ is continuous on $[a, b]$, then the function $f$ must take on every value $k$ between $f(a)$ and $f(b)$ at least once as $x$ varies from $a$ to $b$. For example, the polynomial

    p[x_] := x^7 - x + 3

has a value of 3 at $x = 1$ and a value of 129 at $x = 2$. Thus it follows from the continuity of $p$ that the equation 3 - x + x^7 == k has at least one solution in the interval $[1, 2]$ for every value of $k$ between 3 and 129. This idea is stated more precisely in the following theorem.
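The claim can be checked numerically. The Python sketch below, with the illustrative helper name `solve_level`, finds for a few sample levels $k$ a point $x$ in $[1, 2]$ with $p(x) = k$ by interval halving. The halving keeps $p(\text{lo}) \le k \le p(\text{hi})$ at every step, which is exactly the situation the intermediate-value property covers.

```python
def p(x):
    return x**7 - x + 3

def solve_level(k, lo=1.0, hi=2.0, tol=1e-10):
    # Solve p(x) = k on [lo, hi]; requires p(lo) <= k <= p(hi)
    assert p(lo) <= k <= p(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p(mid) < k:
            lo = mid   # the level k is still reached to the right of mid
        else:
            hi = mid   # the level k is reached at or to the left of mid
    return 0.5 * (lo + hi)

for k in (3, 10, 50, 129):
    x = solve_level(k)
    print(k, x, p(x))
```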

Theorem 0.0. Intermediate-Value Theorem

If $f$ is continuous on a closed interval $[a, b]$ and $k$ is any number between $f(a)$ and $f(b)$, inclusive, then there is at least one number $x$ in the interval $[a, b]$ such that $f(x) = k$.

Although this theorem is intuitively obvious, its proof depends on a mathematically precise development

of the real number system, which is beyond the scope of this text.

A variety of problems can be reduced to solving an equation $f(x) = 0$ for its roots. Sometimes it is possible to solve for the roots exactly using algebra, but often this is not possible and one must settle for decimal approximations of the roots. One procedure for approximating roots is based on the following consequence of the Intermediate-Value Theorem.

Theorem 0.0. Root Approximation

If $f$ is continuous on $[a, b]$, and if $f(a)$ and $f(b)$ are nonzero and have opposite signs, then there is at least one solution of the equation $f(x) = 0$ in the interval $(a, b)$.

This result, which is illustrated in Figure 0.0, can be proved as follows. Since $f(a)$ and $f(b)$ have opposite signs, 0 lies between $f(a)$ and $f(b)$. By the Intermediate-Value Theorem there is at least one number $x$ in $[a, b]$ with $f(x) = 0$. Because $f(a)$ and $f(b)$ are nonzero, this $x$ must lie in the open interval $(a, b)$.

Figure 0.0. Graph of a function $f(x)$ allowing a root in the interval $[a, b]$.

The polynomial is defined by

    p[x_] := x^5 + 8 x^2 - x + 1

The following sequence of intervals shrinks the interval length in such a way that the conditions of Theorem 0.0 are satisfied. The intervals $\{a, b\}$ are given in curly brackets in the second argument of Map[].

    Map[{p[#[[1]]], p[#[[2]]]} &, {{-2.1, -2}, {-2.1, -1.8}, {-2.06, -2.05}, {-2.059, -2.056}, {-2.0585, -2.058}}] //
      TableForm[#, TableHeadings -> {{}, {"p(a)", "p(b)"}}] &

    p(a)           p(b)
    -2.46101       3
    -2.46101       9.82432
    -0.0879704     0.464937
    -0.031969      0.135084
    -0.00402787    0.0238737

The table shows that each interval is chosen in such a way that the sign of the polynomial $p(x)$ changes between its endpoints. A numerical value of the root can be determined by

    FindRoot[p[x] == 0, {x, -3.1}]

    {x -> -2.05843}


This states that $x = -2.0584$ is the real value at which the graph of the polynomial $p(x)$ intersects the horizontal x-axis.

1.2.1 The Bisection Method

The bisection method is very simple and intuitive, but has all the major characteristics of other root-finding methods. The simplest numerical procedure for finding a root is to repeatedly halve the interval $[a, b]$, keeping the half on which $f(x)$ changes sign. This procedure is called the bisection method. It is guaranteed to converge to a root.

The bisection method uses the idea introduced above: the product of the function values at two different locations distinguishes three cases. If the product is positive, there is no change in sign and thus no guaranteed root between the two locations. If the product is negative, the two values differ in sign and there is a root between them. If the product is zero, we have found the root itself. In general the following steps are used:

Step 1: Choose the lower and upper boundary $x_l$ and $x_u$ of an interval including the root. This means $f(x_u)\, f(x_l) < 0$.

Step 2: Estimate the root $x_r$ by the arithmetic mean $x_r = (x_l + x_u)/2$.

Step 3: Make the following calculations to determine in which subinterval the root lies:

If $f(x_l)\, f(x_r) < 0$ the root lies in the lower interval. Therefore, set $x_u = x_r$ and return to step 2.

If $f(x_l)\, f(x_r) > 0$ the root lies in the upper interval. Therefore, set $x_l = x_r$ and return to step 2.

If $f(x_l)\, f(x_r) = 0$ the root equals $x_r$; terminate the computation.

To be more precise in our definition, suppose that we are given an interval $[a, b]$ satisfying $f(a)\, f(b) < 0$ and an error tolerance $\varepsilon > 0$. Then the bisection method consists of the following steps:

Step 1: Define $c = (a + b)/2$.

Step 2: If $b - c \le \varepsilon$, then accept $c$ as the root and stop.

Step 3: If $\mathrm{sign}(f(b)) \cdot \mathrm{sign}(f(c)) \le 0$, then set $a = c$. Otherwise, set $b = c$. Return to step 1.

These algorithmic steps are implemented in the following lines

    bisection[f_, a_, b_] := Block[{c, ε = 10^-5, ain = a, bin = b, m = 1, results = {}},
      While[0 == 0,
       (* first step: find the midpoint *)
       c = (ain + bin)/2;
       (* second step: select the root *)
       If[bin - c <= ε, Return[results]];
       AppendTo[results, {m, c}];
       (* third step: select the interval *)
       If[(f /. x -> bin) (f /. x -> c) <= 0, ain = N[c], bin = N[c]];
       m = m + 1]]

The application of the function to a polynomial shows us the following results

    bisection[x^6 - x - 1, 1, 1.4] // TableForm[#, TableHeadings -> {{}, {"m", "c"}}] &

    m     c
    1     1.2
    2     1.1
    3     1.15
    4     1.125
    5     1.1375
    6     1.13125
    7     1.13438
    8     1.13594
    9     1.13516
    10    1.13477
    11    1.13457
    12    1.13467
    13    1.13472
    14    1.13474
    15    1.13473

where m is the iteration step and c represents the approximation of the root at iteration step m. The

graphical representation of the function shows that there is in fact an intersection with the x-axis.


    Plot[x^6 - x - 1, {x, 1, 1.4}]

Figure 0.0. Graph of the function $f(x) = x^6 - x - 1$ allowing a root in the interval $[1, 1.3]$.
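For readers who want to retrace the bisection steps outside of Mathematica, the following Python sketch follows the three steps stated above (midpoint, stopping test $b - c \le \varepsilon$, interval selection through the sign of $f(b)\,f(c)$). The function name `bisection` and the choice to return the list of pairs $(m, c)$ are illustrative assumptions.

```python
def bisection(f, a, b, eps=1e-5):
    # Requires f(a) * f(b) < 0; returns the list of (m, c) pairs
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    results = []
    m = 1
    while True:
        c = 0.5 * (a + b)          # step 1: midpoint
        if b - c <= eps:           # step 2: stopping test
            return results
        results.append((m, c))
        if f(b) * f(c) <= 0:       # step 3: sign change lies in [c, b]
            a = c
        else:                      # sign change lies in [a, c]
            b = c
        m += 1

steps = bisection(lambda x: x**6 - x - 1, 1.0, 1.4)
for m, c in steps:
    print(m, round(c, 5))
```

With the interval $[1, 1.4]$ and $\varepsilon = 10^{-5}$ the loop stops after 15 recorded midpoints, the same number of rows as in the table above.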

In general, an iteration produces a sequence of approximate solutions; we will denote these iterates by $x_0, x_1, x_2, \ldots$. This sequence of approximations is shown dynamically in the following Figure 0.0.

Figure 0.0. Sequence of approximations of the root for the function $f(x) = x^6 - x - 1$.

The difference between the various root-finding methods lies in what is computed at each step and how

the next iterate is chosen.

Figure 0.0. The bisection method. After three steps the root is known to lie in the interval $[x_3, x_4]$.

To estimate the error bound of the bisection method we can proceed as follows. Let $a_n$, $b_n$ and $c_n$ denote the nth computed values of $a$, $b$, and $c$, respectively. Then we easily get

$$b_{n+1} - a_{n+1} = \frac{1}{2}\,(b_n - a_n) \quad\text{for } n \ge 1$$

and it is straightforward to deduce that

$$b_n - a_n = \frac{1}{2^{n-1}}\,(b - a) \quad\text{for } n \ge 1$$

where $b - a$ denotes the length of the original interval with which we started. Since the root $\alpha$ is in either the interval $[a_n, c_n]$ or $[c_n, b_n]$, we know that

$$|\alpha - c_n| \le c_n - a_n = b_n - c_n = \frac{1}{2}\,(b_n - a_n).$$

This is the error bound for $c_n$ that is used in the second step of the bisection algorithm. Combining it with our estimation, we obtain the further bound

$$|\alpha - c_n| \le \frac{1}{2^n}\,(b - a).$$

This shows that the iterates $c_n$ converge to $\alpha$ as $n \to \infty$.

To see how many iterations will be necessary, suppose we want to have

$$|\alpha - c_n| \le \varepsilon.$$

This will be satisfied if

$$\frac{1}{2^n}\,(b - a) \le \varepsilon.$$

Taking logarithms of both sides, we can solve this to give

$$n \ge \frac{\log\left(\frac{b - a}{\varepsilon}\right)}{\log 2}.$$

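For the example we discussed above, the number of iterations for an accuracy of $10^{-5}$, evaluated with an interval length $b - a = 1$, follows from

$$n \ge \frac{\log\left(\frac{1}{0.00001}\right)}{\log 2} = 16.6096.$$

Thus we need about $n = 16$ iterations. For the interval $[1, 1.4]$ actually used in the calculation, $b - a = 0.4$ and the bound becomes $\log(0.4/0.00001)/\log 2 \approx 15.29$, which is in agreement with the 15 iterates listed in the table.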
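The bound is easy to evaluate for other intervals and tolerances. The short Python sketch below uses the illustrative helper name `bisection_steps` and rounds the bound up to the smallest integer $n$ with $(b - a)/2^n \le \varepsilon$.

```python
import math

def bisection_steps(a, b, eps):
    # Smallest integer n with (b - a) / 2**n <= eps
    return max(0, math.ceil(math.log2((b - a) / eps)))

print(math.log2(1.0 / 1e-5))             # the bound for an interval of length 1
print(bisection_steps(1.0, 2.0, 1e-5))   # interval of length 1
print(bisection_steps(1.0, 1.4, 1e-5))   # interval of length 0.4
```

Because the bound depends only on $b - a$ and $\varepsilon$, the number of bisection steps can be fixed before the function is evaluated even once.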

There are several advantages to the bisection method. The principal one is that the method is guaranteed to converge. In addition, the error bound given above is guaranteed to decrease by one-half with each iteration. Many other numerical methods have variable rates of decrease for the error, and these may be worse than the bisection method for some equations. The principal disadvantage of the bisection method is that it generally converges more slowly than most other methods. For functions $f(x)$ that have a continuous derivative, other methods are usually faster. These methods may not always converge; when they do converge, however, they are almost always much faster than the bisection method.

1.2.2 Method of False Position

Suppose we have two iterates $x_0$ and $x_1$ that enclose the root. We can then approximate $f(x)$ by a straight line in the interval and find the place where this line cuts the x-axis. We take this as the new iterate

$$x_2 = x_1 - \frac{(x_1 - x_0)\, f(x_1)}{f(x_1) - f(x_0)}.$$

When this process is repeated, we have to decide which of the three points $x_0$, $x_1$, or $x_2$ to select for starting the next iteration. There are two plausible choices. In the first, we retain the last iterate and one point from the previous ones so that the two new points enclose the solution (Figure 0.0). This is the method of false position.

Figure 0.0. The method of false position. After the second iteration, the root is known to lie in the interval $[x_3, x_0]$.

The formula for the false position algorithm is based on the similarity of the two triangles involved in the iteration. Using the triangles generated by the straight line connecting the upper and lower value of the function in the interval $[x_n, x_{n-1}]$ we can write down the relation

$$\frac{f(x_n)}{x_{n+1} - x_n} = \frac{f(x_{n-1})}{x_{n+1} - x_{n-1}}.$$

This equation is equivalent to

$$(x_{n+1} - x_{n-1})\, f(x_n) = f(x_{n-1})\, (x_{n+1} - x_n)$$

which is written by collecting terms as

$$x_{n+1}\, \bigl(f(x_n) - f(x_{n-1})\bigr) = x_{n-1}\, f(x_n) - x_n\, f(x_{n-1})$$

which is equivalent to

$$x_{n+1} = \frac{x_{n-1}\, f(x_n)}{f(x_n) - f(x_{n-1})} - \frac{x_n\, f(x_{n-1})}{f(x_n) - f(x_{n-1})}.$$

If we add and subtract $x_n$ on the right hand side we find

$$\begin{aligned}
x_{n+1} &= x_n + \frac{x_{n-1}\, f(x_n)}{f(x_n) - f(x_{n-1})} - x_n - \frac{x_n\, f(x_{n-1})}{f(x_n) - f(x_{n-1})} \\
&= x_n + \frac{x_{n-1}\, f(x_n)}{f(x_n) - f(x_{n-1})} + \frac{-x_n f(x_n) + x_n f(x_{n-1}) - x_n f(x_{n-1})}{f(x_n) - f(x_{n-1})} \\
&= x_n + \frac{x_{n-1}\, f(x_n)}{f(x_n) - f(x_{n-1})} + \frac{-x_n\, f(x_n)}{f(x_n) - f(x_{n-1})} \\
&= x_n + \frac{(x_{n-1} - x_n)\, f(x_n)}{f(x_n) - f(x_{n-1})} \\
&= x_n - \frac{(x_n - x_{n-1})\, f(x_n)}{f(x_n) - f(x_{n-1})}.
\end{aligned}$$

The successive iterates of the false position method are then simply computed by

$$x_{n+1} = x_n - \frac{(x_n - x_{n-1})\, f(x_n)}{f(x_n) - f(x_{n-1})}.$$

We use this form because it involves one less function evaluation and one less multiplication than the original relation (0.0) we started from.

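The algorithm for the false position method consists of three steps:

Step 1: Generate the approximated root $c$ by the derived iteration formula.

Step 2: Check if the error requirements are satisfied; if yes, stop and return the value.

Step 3: If $\mathrm{sign}(f(a)) \cdot \mathrm{sign}(f(c)) \le 0$, then set $a = c$. Otherwise, set $b = c$. Return to step 1.

Note that this rule, as stated, replaces $a$ by $c$ when the sign change lies between $a$ and $c$, so the new pair $a$, $b$ does not always enclose the root. The classical false position rule keeps the enclosing pair by setting $b = c$ in that case.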

The following lines are an implementation of the false position method

    falsePositionMethod[f_, a_, b_] := Block[{c, ε = 10^-5, ain = a, bin = b, cold = b, k = 0, results = {}},
      While[0 == 0,
       k = k + 1;
       (* first step: find the approximation *)
       c = bin - (f /. x -> bin) (bin - ain)/((f /. x -> bin) - (f /. x -> ain));
       (* second step: select the root and terminate *)
       If[Abs[cold - c] <= ε, Return[results], cold = c];
       AppendTo[results, {k, c}];
       (* third step: select the interval; this test reproduces the tables printed below *)
       If[(f /. x -> ain) (f /. x -> c) <= 0, ain = N[c], bin = N[c]]]]

The application of the false position method shows the iteration steps

    falsePositionMethod[x^6 - x - 1, 1, 2] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &

    k     c
    1     63/62
    2     1.19058
    3     1.11766
    4     1.14056
    5     1.1328
    6     1.13537
    7     1.13451
    8     1.1348
    9     1.1347
    10    1.13473
    11    1.13472

The same example was used previously for the bisection method. The results are given above. The last iterate equals the root $\alpha$ rounded to 5 significant digits. The false position method converges only a little bit faster than the bisection method. But as the iterates become closer to $\alpha$, the speed of convergence increases.
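For comparison, the following Python sketch implements the classical bracketing form of false position, in which the endpoint whose function value has the same sign as $f(c)$ is replaced, so that the root stays enclosed in every step. This interval rule is an assumption of the sketch and may differ from the selection test used in the notebook code, so the iterates need not match the table above. The function name, the tolerance, and the iteration cap are illustrative.

```python
def false_position(f, a, b, eps=1e-10, max_iter=1000):
    # Classical bracketing variant: [a, b] always encloses the root
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c_old = b
    c = b
    for k in range(1, max_iter + 1):
        # Zero of the straight line through (a, fa) and (b, fb)
        c = b - fb * (b - a) / (fb - fa)
        fc = f(c)
        if fc == 0 or abs(c - c_old) <= eps:
            return c, k
        if fa * fc < 0:
            b, fb = c, fc   # root lies in [a, c]
        else:
            a, fa = c, fc   # root lies in [c, b]
        c_old = c
    return c, max_iter

root, iterations = false_position(lambda x: x**6 - x - 1, 1.0, 2.0)
print(root, iterations)
```

For this function the right endpoint $b = 2$ is never replaced, so the bracketing variant approaches the root from one side only and needs many more steps than the short table above suggests. The guarantee of enclosure is paid for with slow, linear convergence.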

1.2.3 Secant Method

The secant method and the false position method are known as straight-line approximations to the given function $y = f(x)$. Assume that two initial guesses to the root $\alpha$ are known and denoted by $x_0$ and $x_1$. They may occur on opposite sides of $\alpha$ or on the same side of $\alpha$. The two points $(x_0, f(x_0))$ and $(x_1, f(x_1))$, on the graph of $y = f(x)$, determine a straight line, called a secant line. This line is an approximation to the graph of $y = f(x)$ and its root $x_2$ is an approximation of $\alpha$ (see Figure 0.0).

To derive a formula for $x_2$, we proceed in a manner similar to that used to derive the false position formulas: Find the equation of the line and then find its root $x_2$. The equation of the line is given by

$$y = p(x) = f(x_1) + (x - x_1)\, \frac{f(x_1) - f(x_0)}{x_1 - x_0}.$$

Solving $p(x_2) = 0$, we obtain

$$x_2 = x_1 - f(x_1)\, \frac{x_1 - x_0}{f(x_1) - f(x_0)}.$$

Having found $x_2$, we can drop $x_0$ and use $x_1$, $x_2$ as a new set of approximate values for $\alpha$. This leads to an improved value $x_3$; and this process can be continued indefinitely.

Doing so, we obtain the general iteration formula

$$x_{n+1} = x_n - f(x_n)\, \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} \quad\text{for } n \ge 1.$$

This is the secant method. It is called a two-point method, since two approximate values are needed to

obtain an improved value. The bisection method is also a two-point method, but the secant method will

almost always converge faster than bisection.

Figure 0.0 illustrates how the secant method works and shows the difference between it and the method of false position. From this example we can see that now the successive iterates are no longer guaranteed to enclose the root.


Figure 0.0. The secant method.

The algorithm for the secant method consists of three steps:

Step 1: Generate the approximated root $c$ by the derived iteration formula.

Step 2: Change the boundary values $a = b$ and $b = c$.

Step 3: Check if the error requirements are satisfied; if yes, stop and return the value, if not return to step 1.

The following lines are an implementation of the secant method

    secantMethod[f_, a_, b_] := Block[{c, ε = 10^-5, ain = a, bin = b, cold = b, k = 0, results = {}},
      While[0 == 0,
       k = k + 1;
       (* first step: find the approximation *)
       c = bin - (f /. x -> bin) (bin - ain)/((f /. x -> bin) - (f /. x -> ain));
       (* second step: shift the boundary values *)
       ain = N[bin]; bin = N[c];
       (* third step: select the root and terminate *)
       If[Abs[cold - c] <= ε, Return[results], cold = c];
       AppendTo[results, {k, c}]]]

The application of the secant method shows the iteration steps

    secantMethod[x^6 - x - 1, 1, 2] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &

    k     c
    1     63/62
    2     1.03067
    3     1.17569
    4     1.12368
    5     1.13367
    6     1.13475
    7     1.13472

The same example was used previously for both the bisection and the false position method. The results are given in the table above. The last iterate equals the root $\alpha$ rounded to 5 significant digits. In contrast to the bisection method, the secant method converges very rapidly. As the iterates become closer to $\alpha$, the speed of convergence increases, so that fewer steps are needed.
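The three steps of the secant method can also be retraced with a few lines of Python. The sketch below mirrors the stopping rule described above (stop when two successive approximations differ by at most $\varepsilon = 10^{-5}$, without recording the final value). The function name `secant`, the iteration cap, and the guard against a horizontal secant line are illustrative additions.

```python
def secant(f, x0, x1, eps=1e-5, max_iter=50):
    # Returns the list of (k, c) pairs recorded before the stopping test fires
    results = []
    for k in range(1, max_iter + 1):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:
            break                      # horizontal secant line: no new iterate
        c = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(c - x1) <= eps:         # successive approximations agree
            return results
        results.append((k, c))
        x0, x1 = x1, c                 # shift the two points: a = b, b = c
    return results

steps = secant(lambda x: x**6 - x - 1, 1.0, 2.0)
for k, c in steps:
    print(k, round(c, 5))
```

Started from the points 1 and 2, the first recorded value is 63/62 and the sequence settles near 1.13472 after seven recorded steps, as in the table above. The iterates are not required to enclose the root, which is the price for the faster convergence.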

Example 0.0. Secant and False Position Method

The function

    f[x_] := x^2 E^x - 1

has a root in the interval $[0, 1]$ since $f(0)\, f(1) < 0$.

Solution 0.2. The results for all three methods discussed so far, the bisection, the false position, and the secant method, are demonstrated in the following. The function has a root near $x \approx 0.7$ as shown in the following Figure 0.0.

Figure 0.0. Graph of the function $f(x) = x^2 e^x - 1$ for $x \in [0, 1]$.

All methods start with two points x0 = 0 and x1 = 1. The following tables show the steps needed to derive

the root.

First the bisection method is applied to the problem

    bisection[f[x], 0, 1] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &

    k     c
    1     1/2
    2     0.75
    3     0.625
    4     0.6875
    5     0.71875
    6     0.703125
    7     0.710938
    8     0.707031
    9     0.705078
    10    0.704102
    11    0.703613
    12    0.703369
    13    0.703491
    14    0.70343
    15    0.703461
    16    0.703476

Next we use the false position method

    falsePositionMethod[f[x], 0, 1] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &

    k     c
    1     1 - (-1 + e)/e
    2     1.8816
    3     0.420725
    4     0.941745
    5     0.589956
    6     0.78112
    7     0.660269
    8     0.730769
    9     0.68747
    10    0.713283
    11    0.697609
    12    0.707023
    13    0.701331
    14    0.704759
    15    0.70269
    16    0.703937
    17    0.703184
    18    0.703638
    19    0.703364
    20    0.70353
    21    0.70343
    22    0.70349
    23    0.703454
    24    0.703476
    25    0.703462


Finally the secant method is used.

    secantMethod[f[x], 0, 1] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &

    k     c
    1     1 - (-1 + e)/e
    2     0.569456
    3     0.797357
    4     0.685539
    5     0.701245
    6     0.703524
    7     0.703467

The results for the different methods show that the bisection method needs the expected number of iterations. However, the false position method needs more steps than expected. If we look at the results generated during the iteration, we observe that the root is approached, but during the first few iteration steps there is some oscillation around the root, which makes the convergence not direct. In contrast to the false position method, the secant method converges quite fast to the true root and does not show such oscillations.

By using techniques from calculus and some algebraic manipulation, it is possible to show that the iterates $x_n$ satisfy

$$\alpha - x_{n+1} = (\alpha - x_n)\,(\alpha - x_{n-1}) \left[ \frac{-f''(\xi_n)}{2 f'(\zeta_n)} \right].$$

The unknown number $\zeta_n$ is between $x_n$ and $x_{n-1}$, and the unknown number $\xi_n$ is between the largest and the smallest of the numbers $\alpha$, $x_n$, and $x_{n-1}$. The error formula closely resembles the Newton error formula which is discussed in the next section. This kind of formula should be expected, since the secant method can be considered as an approximation of Newton's method, based on the difference quotient

$$f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}.$$

Check as an exercise that the use of this expression in Newton's formula (0.XXX) will yield (see next subsection)

$$x_{n+1} = x_n - f(x_n)\, \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} \quad\text{for } n \ge 1.$$
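A sketch of the substitution is given here, under the assumption that Newton's formula has its standard form $x_{n+1} = x_n - f(x_n)/f'(x_n)$, which is not stated in this section. Replacing the derivative by the difference quotient gives the secant iteration directly:

```latex
\begin{aligned}
x_{n+1} &= x_n - \frac{f(x_n)}{f'(x_n)} \\
        &\approx x_n - \frac{f(x_n)}{\dfrac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}} \\
        &= x_n - f(x_n)\, \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} .
\end{aligned}
```

The only approximation is in the second line. The last step is pure algebra, which is why the secant iteration needs no derivative evaluations at all.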
