
Notes for Numerical Analysis

Math 5465

by

S. Adjerid

Virginia Polytechnic Institute and State University

(A Rough Draft)


Contents

1 Error Analysis

2 Nonlinear Algebraic Equations
    2.1 Convergence and order of convergence
    2.2 Methods for finding roots of f(x) = 0
        2.2.1 The Bisection method
        2.2.2 Newton's Method
        2.2.3 The Secant Method
        2.2.4 Fixed-point and functional iterations
        2.2.5 Acceleration techniques
    2.3 Systems of Nonlinear Equations
        2.3.1 Newton's method
        2.3.2 Fixed-point iterations
        2.3.3 Modified Newton and steepest descent methods
        2.3.4 Continuation Methods
        2.3.5 Secant Methods for multidimensional problems
    2.4 Finding zeros of polynomials


Chapter 1

    Error Analysis


Chapter 2

    Nonlinear Algebraic Equations

In this chapter we discuss numerical methods for finding zeros of nonlinear algebraic problems of the form f(x) = 0.

    2.1 Convergence and order of convergence

We start with the following example sequences and their limits:

1. x_n = (−1)^n, diverges,

2. x_n = 1 + 1/n, converges to 1,

3. x_n = (1 + 1/n)^n, converges to e.

    Next, we state a rigorous definition of convergence for sequences.

Definition 1. A sequence {x_n} converges to x* as n → ∞ if and only if for all ε > 0 there exists an integer N(ε) such that

|x_n − x*| < ε, for all n > N(ε).

The limit is denoted by lim_{n→∞} x_n = x*.

This describes the fact that for n > N(ε), x_n gets arbitrarily close to x*.

A sequence in R^m is a sequence of vectors X_n ∈ R^m, n ≥ 0. The limit of X_n = (x_{1,n}, x_{2,n}, …, x_{m,n})^t ∈ R^m, n ≥ 0, is defined componentwise by

lim_{n→∞} X_n = (lim_{n→∞} x_{1,n}, lim_{n→∞} x_{2,n}, …, lim_{n→∞} x_{m,n})^t.


For the purpose of illustration we consider the following sequence in R^3:

X_n = (x_{1,n}, x_{2,n}, x_{3,n})^t, n ≥ 0,

where

x_{1,n} = n sin(2/n),

x_{2,n} = (n^2 + (−1)^n) / (n^2 + 3),

x_{3,n} = (n + e^n) / (5 e^n + 1).

Thus, by definition, the limit of this vector sequence X_n as n → ∞ is

lim_{n→∞} X_n = (lim_{n→∞} x_{1,n}, lim_{n→∞} x_{2,n}, lim_{n→∞} x_{3,n})^t = (2, 1, 1/5)^t.

    Order of convergence:

Definition 2. (Linear convergence) A sequence {x_n} converges to x* linearly if and only if there exist 0 < c < 1 and N > 0 such that

|x_{n+1} − x*| ≤ c |x_n − x*|, n > N,

or

lim_{n→∞} |x_{n+1} − x*| / |x_n − x*| = c, 0 < c < 1.

    Remarks:

1. In the case of linear convergence the error at iteration n + 1 is approximately a fraction of the error at iteration n.

    2. If c = 0 the convergence is faster than linear and is called superlinear convergence.

Now, let us consider the sequence x_n = (1/3)^n, n ≥ 0, which admits x* = 0 as a limit.

Applying the definition of linear convergence we examine the ratio

|x_{n+1} − x*| / |x_n − x*| = (1/3)^{n+1} / (1/3)^n = 1/3.

We immediately conclude that the sequence converges linearly to x* = 0 with c = 1/3.


Definition 3. (Convergence of order p) A sequence {x_n} converges to x* with order p > 1 if there exists c > 0 such that

lim_{n→∞} |x_{n+1} − x*| / |x_n − x*|^p = c,

or, there exist N > 0 and c > 0 such that

|x_{n+1} − x*| ≤ c |x_n − x*|^p, for n > N.

Let us apply this definition to find the order of convergence of the following sequences.

1. x_n = n^10 3^{−n^2}, with x* = 0. We have

|x_{n+1} − x*| / |x_n − x*| = ((n+1)^10 3^{−(n+1)^2}) / (n^10 3^{−n^2}) = ((n+1)^10 / n^10) 3^{−(2n+1)} → 0 as n → ∞,

which shows superlinear convergence. Next we check for quadratic convergence by looking at the ratio

lim_{n→∞} |x_{n+1} − x*| / |x_n − x*|^2 = lim_{n→∞} ((n+1)^10 / n^20) 3^{n^2 − 2n − 1} = ∞.

Since the limit is not a finite positive number, the convergence is not quadratic.

2. Let us consider the sequence x_n = 10^{−2·5^n}, with x* = 0, for which

|x_{n+1}| / |x_n|^5 = 1.

The order of convergence for this sequence is p = 5.

3. If x_n = b c^{a^n}, with b > 0, 0 < c < 1 and a > 1, then x_n converges to x* = 0 with order of convergence p = a.

Stopping the iteration process: when computing the limit of a sequence on a computer, the most commonly used stopping criterion is

|x_n − x_{n−1}| ≤ tol, where tol = |x_n| rtol + atol,

and rtol and atol are the relative and absolute tolerances, respectively.
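In MATLAB this test might be coded as follows (a minimal sketch; the variable names and tolerance values are illustrative choices, not part of the notes):

rtol = 1e-10;  atol = 1e-14;       % illustrative tolerance values
tol  = abs(xn)*rtol + atol;        % mixed relative/absolute tolerance
done = abs(xn - xnm1) <= tol;      % xn, xnm1: two successive iterates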


2.2 Methods for finding roots of f(x) = 0

    2.2.1 The Bisection method

Bisection algorithm:

step 1: x_0 = (a+b)/2

step 2: i = 0

step 3: if f(a) f(x_i) < 0, then b = x_i

        otherwise, a = x_i

step 4: i = i+1

step 5: x_i = (a+b)/2

step 6: if (b-a) < |x_i| rtol + atol, then stop

        otherwise go to step 3
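A runnable MATLAB version of this algorithm might look as follows (a sketch; the function name, the sign check on the initial interval, and the parameter names are our own additions):

function [x,iter] = bisect(f,a,b,rtol,atol,Nmax)
%BISECT  Bisection method for f(x) = 0 on [a,b], assuming f(a)*f(b) < 0.
if f(a)*f(b) > 0, error('f(a) and f(b) must have opposite signs'); end
x = (a+b)/2;  iter = 0;
while (b-a) >= abs(x)*rtol + atol && iter < Nmax
    if f(a)*f(x) < 0
        b = x;                    % the root lies in [a, x]
    else
        a = x;                    % the root lies in [x, b]
    end
    x = (a+b)/2;  iter = iter + 1;
end

For example, [x,iter] = bisect(@(x) x.^2-4*x+1, 0, 1, 1e-12, 1e-14, 100) brackets the root 2 − √3 ≈ 0.2679 used in the Newton example below.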

Theorem 2.2.1. Let f(x) be a continuous function on [a, b] such that f(a)f(b) < 0. Then the bisection method converges linearly to a root x* ∈ (a, b) with

|x_n − x*| ≤ (b − a) / 2^{n+1}.

Proof. By selecting x_0 to be the midpoint x_0 = (a + b)/2 and defining x_n to be the midpoint of the current interval at each iteration, we immediately see that |x_0 − x*| ≤ (b − a)/2. Since each iteration halves the interval, the same argument applied to the subsequent iterates proves the theorem.

    Advantages of the bisection method:

    1. It requires one function evaluation per iteration

    2. It converges to a root for all [a, b] such that f(a)f(b) < 0.

    Major disadvantages:

1. It exhibits linear convergence with c = 1/2.

2. It does not extend naturally to systems of equations.


2.2.2 Newton's Method

Let us assume f ∈ C^2, select an initial guess x_0, and approximate f(x) = 0 by the equation of the tangent line to f at x_0,

f(x) ≈ f'(x_0)(x − x_0) + f(x_0),

and solve for the zero of the tangent line to define x_1:

x_1 = x_0 − f(x_0)/f'(x_0).

We continue this process from x_1 to get x_2, and so on, to obtain the iteration

x_{n+1} = x_n − f(x_n)/f'(x_n), n ≥ 0.    (2.1)

The main advantage of Newton's method is that for simple roots it exhibits quadratic convergence:

lim_{n→∞} |x_n − x*| / |x_{n−1} − x*|^2 = c > 0, for x_0 close enough to x*.

However, Newton's method, when compared to other methods, has the following disadvantages. It requires

1. an initial guess x_0 close enough to x* for convergence,

2. two function evaluations per iteration,

3. f'(x_n) ≠ 0 at each iteration,

4. values of f and f' at each iteration.

Newton's Algorithm

function [x1,iter,fail] = newton(x0,tol,Nmax,f,fd)
%NEWTON  This routine finds a root of f starting from
%        the initial guess x0.
%
%input parameters
%  f:    function f
%  fd:   derivative of f
%  x0:   initial guess
%  tol:  tolerance
%  Nmax: maximum number of iterations allowed

err  = 10*tol;                 % force entry into the loop
iter = 0;
while (abs(err) > tol) && (iter < Nmax)
    err  = feval(f,x0)/feval(fd,x0);   % Newton correction f(x0)/f'(x0)
    x1   = x0 - err;
    x0   = x1;
    iter = iter + 1;
end
fail = (abs(err) > tol);       % 1 if Nmax was reached before convergence

Results for Newton's method applied to x^2 - 4x + 1 = 0, with x0 = 0 and x0 = 5:

n    x_n                      f(x_n)                   |x_n - (2 - sqrt(3))|
0    0.0                      1.000000000000000e+00    2.679491924311228e-01
1    2.500000000000000e-01    6.250000000000000e-02    1.794919243112281e-02
2    2.678571428571428e-01    3.188775510204467e-04    9.204957397995761e-05
3    2.679491899852725e-01    8.472673784787332e-09    2.445850355581314e-09
4    2.679491924311227e-01    0                        1.110223024625157e-16

n    x_n                      f(x_n)                   |x_n - (2 + sqrt(3))|
0    5.000000000000000e+00    6.000000000000000e+00    1.267949192431123e+00
1    4.000000000000000e+00    1.000000000000000e+00    2.679491924311228e-01
2    3.750000000000000e+00    6.250000000000000e-02    1.794919243112281e-02
3    3.732142857142857e+00    3.188775510203357e-04    9.204957398001312e-05
4    3.732050810014728e+00    8.472675006032659e-09    2.445850633137070e-09
5    3.732050807568877e+00    0                        0

Now, let us carry out a rigorous convergence analysis of Newton's iteration by stating and proving the first theorem.

Theorem 2.2.2. Let f(x) ∈ C^2 have a root x*, i.e., f(x*) = 0. If x_0 is close enough to x*, then Newton's iteration converges to x*. If in addition f'(x*) ≠ 0, then Newton's iteration converges quadratically.

Proof. Subtracting x* from Newton's iteration formula (2.1) we obtain

x_{n+1} − x* = x_n − x* − f(x_n)/f'(x_n).

If e_n = x_n − x*, then

e_{n+1} = e_n − f(x_n)/f'(x_n) = (f'(x_n) e_n − f(x_n)) / f'(x_n).    (2.2)

By Taylor expansion of f about x_n we have

0 = f(x*) = f(x_n) + f'(x_n)(x* − x_n) + (f''(ξ)/2) e_n^2, ξ between x_n and x*,

which can be written as

f'(x_n) e_n − f(x_n) = (f''(ξ)/2) e_n^2.

Substituting this into (2.2) leads to

e_{n+1} = (f''(ξ) / (2 f'(x_n))) e_n^2.    (2.3)

Thus, if x_0 is close enough to x* and f'(x*) ≠ 0, we have

|e_{n+1}| ≈ (|f''(x*)| / (2 |f'(x*)|)) |e_n|^2 ≤ k |e_n|, 0 < k < 1.    (2.4)

If f'(x*) = 0, then one can write

f'(x_n) = f'(x*) + f''(η) e_n = f''(η) e_n, η between x_n and x*,

which, substituted into (2.3), yields

|e_{n+1}| ≈ (1/2) |e_n|, for x_0 close to x*.

Thus, we have established convergence in both cases.

Next, if f'(x*) ≠ 0, from (2.3) we obtain quadratic convergence by taking the limit

lim_{n→∞} |e_{n+1}| / |e_n|^2 = |f''(x*)| / (2 |f'(x*)|).

Before stating our next theorem we prove the following result on monotonic bounded sequences.

Theorem 2.2.3. If {x_n}_{n≥0} is a monotonically increasing (decreasing) sequence and bounded from above (below), then it converges.

Proof. Here we prove the case of an increasing sequence bounded from above; the other case is left as an exercise. Let S = sup_n x_n. By definition of the supremum, for every ε > 0 there is N > 0 such that S − ε < x_N ≤ S (otherwise S − ε would be a smaller upper bound). Since the sequence is increasing, this leads to S − ε < x_n ≤ S for n ≥ N. By the definition of a limit, we conclude that x_n converges to S.


Theorem 2.2.4. Let f(x) ∈ C^2 be such that f'(x) > 0 and f''(x) > 0 for all x, and let f(x) have a real root x*. Then, for any initial guess x_0, Newton's method converges quadratically.

Proof. Using

e_{n+1} = (f''(ξ) / (2 f'(x_n))) e_n^2,

we see that e_{n+1} > 0, thus x_{n+1} > x*, for n ≥ 0.

Since f is increasing, f(x_n) > f(x*) = 0 for x_n > x*, so the iteration formula

x_{n+1} = x_n − f(x_n)/f'(x_n)

yields x_{n+1} < x_n. Thus {x_n} is decreasing and bounded below by x*, and by the previous theorem Newton's iteration converges to x*.

    2.2.3 The Secant Method

In order to avoid computing f' we replace f'(x_n) by (f(x_n) − f(x_{n−1}))/(x_n − x_{n−1}) in Newton's iteration (2.1) to obtain the secant method

x_{n+1} = x_n − f(x_n)(x_n − x_{n−1}) / (f(x_n) − f(x_{n−1})).    (2.5)

The secant method has several advantages:

1. it requires one function evaluation per iteration,

2. it does not use the function derivative,

3. it exhibits superlinear convergence.

However, the secant method breaks down if f(x_n) = f(x_{n−1}).
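A MATLAB sketch of iteration (2.5) follows; the function values are cached so that each iteration costs a single evaluation of f, and the breakdown guard and parameter names are our own additions:

function [x1,iter] = secant(f,x0,x1,rtol,atol,Nmax)
f0 = f(x0);  f1 = f(x1);  iter = 0;
while abs(x1-x0) >= abs(x1)*rtol + atol && iter < Nmax
    if f1 == f0, error('secant breakdown: f(x_n) = f(x_{n-1})'); end
    x2 = x1 - f1*(x1-x0)/(f1-f0);   % secant step (2.5)
    x0 = x1;  f0 = f1;              % shift the iterates
    x1 = x2;  f1 = f(x1);           % one new function evaluation
    iter = iter + 1;
end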

Next, we state and prove a convergence result for the secant method.

Theorem 2.2.5. Let f ∈ C^2 admit a simple root x*. If we select x_0 and x_1 close enough to x*, then {x_n} converges to x* with order equal to (1 + √5)/2.

Proof. If we subtract x* from the secant iteration (2.5), we obtain

x_{n+1} − x* = (f(x_n) x_{n−1} − f(x_{n−1}) x_n) / (f(x_n) − f(x_{n−1})) − x*,

which leads to

e_{n+1} = (f(x_n) e_{n−1} − f(x_{n−1}) e_n) / (f(x_n) − f(x_{n−1})).

Factoring out e_n e_{n−1} we obtain

e_{n+1} = [(x_n − x_{n−1}) / (f(x_n) − f(x_{n−1}))] · [(f(x_n)/e_n − f(x_{n−1})/e_{n−1}) / (x_n − x_{n−1})] · e_n e_{n−1}.

Using Taylor series we write

f(x_n) = f(x* + e_n) = f(x*) + e_n f'(x*) + (e_n^2/2) f''(x*) + O(e_n^3).

Since f(x*) = 0 we have

f(x_n)/e_n = f'(x*) + e_n f''(x*)/2 + O(e_n^2).

For the index n − 1 we obtain

f(x_{n−1})/e_{n−1} = f'(x*) + e_{n−1} f''(x*)/2 + O(e_{n−1}^2).

Combining the previous two expansions we write

f(x_n)/e_n − f(x_{n−1})/e_{n−1} = (e_n − e_{n−1}) f''(x*)/2 + O(e_{n−1}^2),

which, since x_n − x_{n−1} = e_n − e_{n−1}, leads to

(f(x_n)/e_n − f(x_{n−1})/e_{n−1}) / (x_n − x_{n−1}) = f''(x*)/2 + O(e_{n−1}).

Hence,

e_{n+1} ≈ (f''(x*) / (2 f'(x*))) e_n e_{n−1} = K e_n e_{n−1}, K = f''(x*) / (2 f'(x*)).

Let us assume that there exists C > 0 such that |e_{n+1}| = C |e_n|^α; thus |e_{n−1}| = [C^{−1} |e_n|]^{1/α}. This leads to

C |e_n|^α ≈ |K| |e_n| C^{−1/α} |e_n|^{1/α},

which, in turn, yields

C^{1+1/α} |K|^{−1} ≈ |e_n|^{1−α+1/α}.

Since the left side is a nonzero constant, the exponent must vanish: 1 − α + 1/α = 0, i.e., α^2 − α − 1 = 0, with positive root α = (1 + √5)/2 ≈ 1.618.


Since C^{1+1/α} |K|^{−1} ≈ 1, we write

C = |K|^{1/(1+1/α)} = [|f''(x*)| / (2 |f'(x*)|)]^{≈0.62}.

Finally we have

|e_{n+1}| ≈ [|f''(x*)| / (2 |f'(x*)|)]^{0.62} |e_n|^{(1+√5)/2}.

This completes the proof.

A comparison of Newton versus the secant method: if we measure the computational cost by the number of function evaluations, two iterations of the secant method are equivalent to one Newton iteration. Two steps of the secant method yield

|e_n| ≈ C |e_{n−1}|^α, |e_{n+1}| ≈ C |e_n|^α.

Thus,

|e_{n+1}| ≈ C (C |e_{n−1}|^α)^α = C^{1+α} |e_{n−1}|^{α^2}.

The order of convergence for two steps of the secant method is therefore

α^2 = (3 + √5)/2 ≈ 2.618,

which, per function evaluation, makes the secant method faster than Newton's iteration.

    2.2.4 Fixed-point and functional iterations

We begin by defining the notion of a fixed point, and then state and prove a few theorems on the existence and uniqueness of fixed points. We also establish a few results on the convergence of fixed-point iterations for the solution of algebraic problems.

Definition 4. Let g(x) be a continuous function. Then g(x) has a fixed point x* if and only if g(x*) = x*.

Theorem 2.2.6. If g(x) ∈ [a, b] for all x ∈ [a, b], then g(x) has a fixed point in [a, b]. Furthermore, if g is a contractive mapping, i.e., |g'(x)| ≤ k < 1 for all x ∈ [a, b], then g(x) has a unique fixed point in [a, b].

Proof. Consider f(x) = x − g(x). Since g(a), g(b) ∈ [a, b], we have f(a) = a − g(a) ≤ 0 and f(b) = b − g(b) ≥ 0. If either is zero we are done; otherwise f(a)f(b) < 0 and, by the intermediate value theorem, there exists at least one point c ∈ [a, b] such that f(c) = c − g(c) = 0, i.e., a fixed point of g. If in addition |g'(x)| ≤ k < 1, then f'(x) = 1 − g'(x) > 0, so f(x) is monotonically increasing on [a, b]. This shows that c is the unique root of f in [a, b].


Theorem 2.2.7. Let g(x) ∈ C^1[a, b] be such that g([a, b]) ⊂ [a, b] and |g'(x)| ≤ k < 1 for all x ∈ [a, b]. Then

(i) for any x_0 ∈ [a, b] the sequence x_n = g(x_{n−1}), n > 0, converges to x*, the unique fixed point of g(x),

(ii) |x_n − x*| ≤ k^n max(b − x_0, x_0 − a), n ≥ 0,

(iii) |x_n − x*| ≤ (k^n / (1 − k)) |x_1 − x_0|.

Proof. First we write

x_n − x* = g(x_{n−1}) − g(x*) = g'(ξ)(x_{n−1} − x*), ξ between x* and x_{n−1},

so that, by induction,

|x_n − x*| ≤ k^n |x_0 − x*| ≤ k^n max(b − x_0, x_0 − a).

Since 0 < k < 1, x_n → x*, which proves (i) and (ii).

Now, we show (iii) by writing, for m > n,

x_m − x_n = (x_m − x_{m−1}) + (x_{m−1} − x_{m−2}) + ⋯ + (x_{n+1} − x_n),

|x_m − x_n| ≤ |x_m − x_{m−1}| + |x_{m−1} − x_{m−2}| + ⋯ + |x_{n+1} − x_n|.

Using

|x_{l+1} − x_l| ≤ k^l |x_1 − x_0|, l > 0,

we obtain

|x_m − x_n| ≤ (k^{m−1} + k^{m−2} + ⋯ + k^n) |x_1 − x_0| = k^n ((1 − k^{m−n}) / (1 − k)) |x_1 − x_0|.

Letting m → ∞ while keeping n fixed proves (iii).

Theorem 2.2.8. If g ∈ C^p[a, b] has a fixed point g(x*) = x* ∈ [a, b], with g^(l)(x*) = 0, l = 1, 2, …, p − 1, and g^(p)(x*) ≠ 0, then

(i) there exists an interval I ⊂ [a, b] containing x* with g(I) ⊂ I,

(ii) the sequence x_n = g(x_{n−1}) starting from x_0 ∈ I converges to x* ∈ I with order of convergence equal to p.

Proof. Since g'(x*) = 0, by continuity there exists an interval I = [x* − δ, x* + δ] on which |g'(x)| ≤ k < 1. Applying Taylor series we obtain

x_{n+1} − x* = g(x_n) − g(x*) = (g^(p)(ξ)/p!) (x_n − x*)^p.


Thus,

lim_{n→∞} |x_{n+1} − x*| / |x_n − x*|^p = |g^(p)(x*)| / p!.

Examples:

(i) f(x) = (x − 3)(x + 1) has two roots, r = −1 and r = 3. Take

g(x) = x − (x − 3)(x + 1)/10, g'(x) = 1 − (2x − 2)/10,

so that |g'(−1)| = 1.4 > 1 and |g'(3)| = 0.6 < 1: the fixed-point iteration can converge to 3 but not to −1.

(ii) f(x) = x − e^{−x}, with g(x) = e^{−x} on [0, 1].
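A short MATLAB sketch contrasting the two examples (the starting points and iteration counts are arbitrary choices):

g1 = @(x) x - (x-3).*(x+1)/10;   % example (i): |g1'(3)| = 0.6 < 1
g2 = @(x) exp(-x);               % example (ii): contraction on [0,1]
x = 2.5;
for n = 1:60, x = g1(x); end     % converges to the root 3, not to -1
y = 0.5;
for n = 1:60, y = g2(y); end     % converges to x* = 0.5671...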

Theorem 2.2.9. If g(x) ∈ C^1 has a fixed point x* such that |g'(x*)| > 1, then no fixed-point iteration x_n = g(x_{n−1}) converges to x*.

Proof. Assume x_n converges to x*, i.e., for n > N, x_n gets arbitrarily close to x*. By continuity one can then assume |g'(x)| > 1 between x_n and x*, and

x_{n+1} − x* = g(x_n) − g(x*) = g'(ξ)(x_n − x*).

Thus |x_{n+1} − x*| > |x_n − x*|, which contradicts convergence. We conclude that the sequence {x_n} does not converge to x*.

Example:

Newton's method can be viewed as the fixed-point iteration

x_{n+1} = g(x_n), where g(x) = x − f(x)/f'(x).

Then

g'(x) = f(x) f''(x) / f'(x)^2.

Thus, if f'(x*) ≠ 0 we can show that g'(x*) = 0 and

g''(x*) = f''(x*) / f'(x*),

so, by Theorem 2.2.8, Newton's method converges with order p = 2 and asymptotic constant |g''(x*)|/2 = |f''(x*)|/(2|f'(x*)|).


In the next theorem we study the convergence of Newton's method for multiple roots.

Theorem 2.2.10. Let f(x*) = f^(1)(x*) = ⋯ = f^(q−1)(x*) = 0 and f^(q)(x*) ≠ 0, q > 1. Then Newton's iteration converges linearly for x_0 close enough to x*, and

lim_{n→∞} |x_{n+1} − x*| / |x_n − x*| = 1 − 1/q.

Proof. Let g(x) = x − f(x)/f'(x) and use Taylor's theorem to write

f(x) = ((x − x*)^q / q!) f^(q)(ξ), ξ between x and x*,

f'(x) = ((x − x*)^{q−1} / (q − 1)!) f^(q)(η), η between x and x*.

Substituting into the iteration gives

x_{n+1} − x* = (x_n − x*) − ((x_n − x*)/q) f^(q)(ξ)/f^(q)(η), ξ and η between x_n and x*.

By taking the limit we find that

lim_{n→∞} |x_{n+1} − x*| / |x_n − x*| = 1 − 1/q, q > 1.

One may modify Newton's method to recover quadratic convergence for a root of multiplicity q as

x_{n+1} = x_n − q f(x_n)/f'(x_n).

    2.2.5 Acceleration techniques

We consider a few iteration methods with a higher order of convergence.

1. Aitken's method: a linearly convergent sequence can be accelerated by Aitken's method. If {x_n}_{n≥0} converges linearly to x*, Aitken's method gives the sequence

x̂_n = x_n − (x_{n+1} − x_n)^2 / (x_{n+2} − 2x_{n+1} + x_n) = x_n − (Δx_n)^2 / (Δ^2 x_n), n = 0, 1, …,

which converges to x* faster than the original sequence.

If {x_n}_{n≥0} is defined by a fixed-point iteration x_{n+1} = g(x_n), n = 0, 1, …, that converges linearly, then Aitken's method yields

x̂_n = x_n − (g(x_n) − x_n)^2 / (g(g(x_n)) − 2g(x_n) + x_n), n = 0, 1, …,

which converges quadratically.

2. Steffensen's method:

x_{n+1} = x_n − f(x_n)^2 / (f(x_n + f(x_n)) − f(x_n)), n = 0, 1, ….
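As an illustration of the acceleration idea, a minimal MATLAB sketch of Aitken's update applied to a linearly convergent fixed-point map (the map g and the step count are illustrative choices):

g = @(x) exp(-x);                        % linearly convergent fixed-point map
x = 0.5;
for n = 1:8
    x1 = g(x);  x2 = g(x1);              % two ordinary fixed-point steps
    x  = x - (x1 - x)^2/(x2 - 2*x1 + x); % Aitken update
end
% (in practice, guard the denominator against cancellation near convergence)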

3. Halley's method: apply Taylor series to write

f(x_n) + f'(x_n)(x − x_n) + f''(x_n)(x − x_n)^2/2 = 0,

and solve for x to find the next iterate

x_{n+1} = x_n − 2 f(x_n) / (f'(x_n) ± √([f'(x_n)]^2 − 2 f(x_n) f''(x_n))),

where we choose the sign so that x_{n+1} is the closest to x_n, i.e., ± should be the sign of f'(x_n).

The method has the following properties:

(a) it exhibits cubic convergence for x_0 close enough to x*,

(b) it requires f(x_n), f'(x_n) and f''(x_n) at each iteration.

4. Muller's method: the secant method can be viewed as interpolation at x_n and x_{n−1}. This can be generalized to obtain Muller's method, which consists of interpolating f(x) at x_n, x_{n−1}, x_{n−2} to obtain

Q_n(x) = f(x_n) + f[x_n, x_{n−1}](x − x_n) + f[x_n, x_{n−1}, x_{n−2}](x − x_n)(x − x_{n−1}),

where

f[x_i, x_j] = (f(x_j) − f(x_i)) / (x_j − x_i), f[x_i, x_j, x_k] = (f[x_j, x_k] − f[x_i, x_j]) / (x_k − x_i).

Using x − x_{n−1} = (x − x_n) + (x_n − x_{n−1}), Q_n(x) can be written as

Q_n(x) = a_n (x − x_n)^2 + 2 b_n (x − x_n) + c_n,

where

a_n = f[x_n, x_{n−1}, x_{n−2}],

b_n = (f[x_n, x_{n−1}] + f[x_n, x_{n−1}, x_{n−2}](x_n − x_{n−1})) / 2,

c_n = f(x_n).

Muller's method consists of

(a) solving Q_n(h_n) = a_n h_n^2 + 2 b_n h_n + c_n = 0 for h_n,

(b) selecting the root h_n having the smallest absolute value,

(c) defining the next iterate as

x_{n+1} = x_n + h_n,

where

h_n = (−b_n ± √(b_n^2 − a_n c_n)) / a_n = −c_n / (b_n ± √(b_n^2 − a_n c_n)).

Here are a few remarks on Muller's method:

(a) the sign is selected such that x_{n+1} is closest to x_n, i.e., the denominator b_n ± √(b_n^2 − a_n c_n) is taken with the largest magnitude,

(b) if a_n = 0, we recover the secant method.

    2.3 Systems of Nonlinear Equations

    Let us consider the system:

    F (X) = 0,

where F(X) = (f_1(X), …, f_n(X))^t ∈ R^n and X = (x_1, x_2, …, x_n)^t ∈ R^n.

2.3.1 Newton's method

Newton's method extends naturally to systems by using Taylor series for functions of several variables.

We first introduce the Taylor series of a twice continuously differentiable function f : R^n → R by considering, for fixed X ∈ R^n and H = (h_1, …, h_n)^t ∈ R^n, the auxiliary function

g(t) = f(X + tH), t ∈ R.

Its Maclaurin expansion can be written as

g(1) = g(0) + g'(0) + g''(θ)/2, for some 0 < θ < 1,

where g(0) = f(X) and g(1) = f(X + H). Applying the chain rule we write

g'(0) = Σ_{i=1}^n (∂f(X)/∂x_i) h_i

and

g''(θ) = Σ_{i=1}^n Σ_{j=1}^n (∂^2 f(X + θH)/∂x_i ∂x_j) h_j h_i.

Applying this to each component f_k(X), k = 1, 2, …, n, with H = Y − X, gives

f_k(Y) = f_k(X) + (Y − X)^t ∇f_k(X) + (1/2) Σ_{j=1}^n Σ_{i=1}^n (y_i − x_i)(y_j − x_j) ∂^2 f_k(ξ)/∂x_i ∂x_j.

Neglecting the second-order terms and setting the linearization to zero,

F(Y) ≈ F(X) + J_F(X)(Y − X) = 0,

we obtain

Y − X = −J_F(X)^{−1} F(X).

From this we define Newton's iteration as

X_{k+1} = X_k − J_F(X_k)^{−1} F(X_k), k = 0, 1, ….

The main steps in Newton's iteration method are:

1. Select X_0 close enough to X*.

2. Compute J_F(X_0) and F(X_0).

3. Solve J_F(X_0) H = −F(X_0).

4. Set X_1 = X_0 + H.

5. Test for convergence: ||H||/||X_0|| < tol.

(a) If converged, stop.

(b) Otherwise, set X_0 = X_1 and go back to step 2.
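A minimal MATLAB sketch of these steps (the function name newton_sys and the handles F and JF, returning the residual vector and the Jacobian matrix, are our own conventions):

function [X,iter] = newton_sys(F,JF,X,tol,Nmax)
for iter = 1:Nmax
    H = -JF(X)\F(X);               % step 3: solve J_F(X) H = -F(X)
    X = X + H;                     % step 4
    if norm(H) < tol*norm(X)       % step 5: relative convergence test
        return
    end
end

For the example of Section 2.3.3 one would take F = @(X) [X(1)^2+X(2)^2-9; X(1)+X(2)-1] and JF = @(X) [2*X(1) 2*X(2); 1 1].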

Properties of Newton's method:

1. It requires one Jacobian evaluation at each iteration.

2. It requires the solution of a system of linear equations at each iteration.

3. It may break down if J_F(X_k) is a singular matrix.

4. It converges quadratically.

We start the convergence analysis with the following preliminary lemma.

Lemma 2.3.1. Let F : R^n → R^n be differentiable with Jacobian J_F(X) for all X ∈ Ω, a convex subset of R^n. If there exists γ > 0 such that

||J_F(X) − J_F(Y)|| ≤ γ ||X − Y||, X, Y ∈ Ω,

then

||F(X) − F(Y) − J_F(Y)(X − Y)|| ≤ (γ/2) ||X − Y||^2, X, Y ∈ Ω.

Proof. Let X, Y ∈ Ω (convex), i.e., Y + t(X − Y) ∈ Ω for 0 ≤ t ≤ 1. Let us define the function

Φ(t) = F(Y + t(X − Y)), 0 ≤ t ≤ 1.

By differentiating with respect to t we obtain

Φ'(t) = J_F(Y + t(X − Y))(X − Y).

At t = 0 we have

Φ'(0) = J_F(Y)(X − Y).

Next we write

Φ'(t) − Φ'(0) = J_F(Y + t(X − Y))(X − Y) − J_F(Y)(X − Y).

Taking the norm and using the assumption we obtain

||Φ'(t) − Φ'(0)|| = ||[J_F(Y + t(X − Y)) − J_F(Y)](X − Y)|| ≤ γ t ||X − Y||^2.    (2.6)

On the other hand, we can write

d = F(X) − F(Y) − J_F(Y)(X − Y) = Φ(1) − Φ(0) − Φ'(0) = ∫_0^1 [Φ'(t) − Φ'(0)] dt.

Using the bound (2.6) we obtain

||d|| ≤ ∫_0^1 ||Φ'(t) − Φ'(0)|| dt ≤ γ ||X − Y||^2 ∫_0^1 t dt = (γ/2) ||X − Y||^2.


In the next theorem we state and prove a convergence result for Newton's method for systems.

Theorem 2.3.1. Let Ω = {X : a_i < x_i < b_i, i = 1, 2, …, n} ⊂ R^n and assume F(X) is differentiable on Ω. For X_0 ∈ Ω let r, γ, β, α and h be positive constants such that

(a) ||J_F(X) − J_F(Y)|| ≤ γ ||X − Y|| for all X, Y ∈ Ω,

(b) J_F(X)^{−1} exists and satisfies ||J_F(X)^{−1}|| ≤ β for all X ∈ Ω,

(c) ||J_F(X_0)^{−1} F(X_0)|| ≤ α, where α is small enough (by selecting X_0 close enough to a root X*) such that

1. h = αβγ/2 < 1,

2. Ω_r = {X : ||X − X_0|| < r} ⊂ Ω, with r = α/(1 − h).

Then

(i) starting at X_0, each iterate

X_{k+1} = X_k − J_F(X_k)^{−1} F(X_k), k = 0, 1, …,    (2.7)

remains in Ω_r,

(ii) Newton's iteration converges to a root X* of F,

(iii) for all k ≥ 0,

||X_k − X*|| ≤ α h^{2^k − 1} / (1 − h^{2^k}),

with 0 < h < 1, and Newton's iteration converges quadratically.

Proof. From the Newton iteration formula we obtain

||X_{k+1} − X_k|| = ||J_F(X_k)^{−1} F(X_k)|| ≤ β ||F(X_k)||.

Using J_F(X_{k−1})(X_k − X_{k−1}) + F(X_{k−1}) = 0 we obtain

||X_{k+1} − X_k|| ≤ β ||F(X_k) − F(X_{k−1}) − J_F(X_{k−1})(X_k − X_{k−1})||.

Applying Lemma 2.3.1 we have

||X_{k+1} − X_k|| ≤ (βγ/2) ||X_k − X_{k−1}||^2.    (2.8)

Combining assumption (c) and

X_1 − X_0 = −J_F(X_0)^{−1} F(X_0)

yields ||X_1 − X_0|| ≤ α < r; thus X_1 ∈ Ω_r.

In order to prove that X_n ∈ Ω_r for n > 1, we apply the recursion (2.8), together with ||X_1 − X_0|| ≤ α and h/α = βγ/2, to obtain

||X_{k+1} − X_k|| ≤ (h/α) ||X_k − X_{k−1}||^2 ≤ ⋯ ≤ (h/α)^{1+2+2^2+⋯+2^{k−1}} ||X_1 − X_0||^{2^k}.    (2.9)

Applying the geometric sum 1 + 2 + 4 + ⋯ + 2^{k−1} = 2^k − 1, we have

||X_{k+1} − X_k|| ≤ (h/α)^{2^k − 1} α^{2^k} = α h^{2^k − 1}.    (2.11)

Furthermore, for m > k, the estimate (2.11) and the triangle inequality yield

||X_m − X_k|| ≤ ||X_m − X_{m−1}|| + ⋯ + ||X_{k+1} − X_k|| ≤ α h^{2^k − 1} (1 + h^{2^k} + [h^{2^k}]^2 + ⋯).

If h < 1 this leads to

||X_m − X_k|| ≤ α h^{2^k − 1} / (1 − h^{2^k}).    (2.12)

For k = 0, the previous estimate becomes

||X_m − X_0|| ≤ α/(1 − h) = r, for all m > 0,

which proves that X_m ∈ Ω_r for all m > 0.

Furthermore, from the bound (2.12) we see that, for m > k,

lim_{k→∞} ||X_m − X_k|| = 0,

which establishes that {X_k} is a Cauchy sequence. We know from basic real analysis that every Cauchy sequence in the bounded domain Ω_r converges to a limit X* in the closure of Ω_r.

Since F is continuous, lim_{k→∞} F(X_k) = F(X*).

In order to show that X* is a root, i.e., F(X*) = 0, we first note that ||J_F(X_k)|| is bounded:


||J_F(X_k)|| ≤ ||J_F(X_k) − J_F(X_0)|| + ||J_F(X_0)|| ≤ γ r + ||J_F(X_0)||.

Thus, by the continuity of J_F(X) and F(X), taking the limit as k → ∞ in

J_F(X_k)(X_{k+1} − X_k) = −F(X_k)

and using X_{k+1} − X_k → 0, we establish that F(X*) = 0.

Finally, we prove quadratic convergence by subtracting X* from the Newton iteration formula (2.7) to write

X_{k+1} − X* = J_F(X_k)^{−1} [J_F(X_k)(X_k − X*) − F(X_k)].

Adding F(X*) = 0 inside the bracket we obtain

||X_{k+1} − X*|| = ||J_F(X_k)^{−1} [F(X*) − F(X_k) − J_F(X_k)(X* − X_k)]||.

Applying Lemma 2.3.1 we obtain

||X_{k+1} − X*|| ≤ (βγ/2) ||X_k − X*||^2,

which completes the proof of the theorem.

    2.3.2 Fixed-point iterations

We consider the problem X = G(X), where G : R^n → R^n.

The fixed-point iteration is defined by selecting a vector X_0 ∈ R^n and setting X_{k+1} = G(X_k), k = 0, 1, ….

Definition 5. A point X* ∈ R^n is a fixed point of G(X) if and only if G(X*) = X*.

Definition 6. G(X) is a contractive mapping if there is 0 < K < 1 such that

||G(X) − G(Y)|| ≤ K ||X − Y||.

This is our main theorem on fixed-point iterations and their convergence.

Theorem 2.3.2. Let G(X) be a continuous function on Ω = {X : a_i ≤ x_i ≤ b_i} such that

(i) G(Ω) ⊂ Ω,

(ii) G(X) is a contraction with constant K < 1.

Then

(a) G has a unique fixed point X* ∈ Ω and the sequence X_{k+1} = G(X_k), X_0 ∈ Ω, converges to X*,

(b) ||X_k − X*|| ≤ K^k ||X_0 − X*||, k > 0,

(c) ||X_k − X*|| ≤ (K^k / (1 − K)) ||X_1 − X_0||, k > 1.

Proof. The proof follows the same line of reasoning as the scalar case.

Gauss-Seidel variant: use each updated component as soon as it is available, i.e., for k = 0, 1, 2, … and i = 1, 2, …, n,

x_{k+1,i} = g_i([x_{k+1,1}, …, x_{k+1,i−1}, x_{k,i}, …, x_{k,n}]^t).

    2.3.3 Modified Newton and steepest descent methods

Newton's method is sensitive to the initial guess which, in general, has to be selected close to the solution. To avoid this problem for scalar equations we combine the bisection and Newton methods: first, we apply the bisection method to obtain a small interval that contains the root, and then finish the work with Newton's iteration.

For systems we develop global methods known as descent methods, of which Newton's iteration is a special case; Newton's method is applied once we get close to a root.

Let us consider the system of nonlinear algebraic equations F(X) = 0 and define the scalar multivariable function

Φ(X) = (1/2) Σ_{i=1}^n f_i(X)^2 = (1/2) F^t F.

The function Φ has the following properties:

1. Φ(X) ≥ 0 for all X ∈ R^n;

2. if X* is a solution of F(X) = 0, then Φ has a local minimum at X*;

3. at an arbitrary point X_0, the vector −∇Φ(X_0) is the direction of most rapid decrease of Φ;

4. Φ has infinitely many descent directions;

5. a direction u is a descent direction for Φ at X if and only if u^t ∇Φ(X) < 0;

6. special descent directions:

(a) steepest descent method: u_k = −∇Φ(X_k) = −J_F(X_k)^t F(X_k),

(b) Newton's method: u_k = −J_F(X_k)^{−1} F(X_k).

Next we prove that Newton's method is a descent method.

Theorem 2.3.3. Newton's iteration method for F(X) = 0 is a descent method for Φ(X).

Proof. For scalar problems f(x) = 0, the steepest descent direction is −Φ'(x) = −f(x)f'(x), while Newton's method takes the step −f(x)/f'(x), which has the same sign as the steepest descent direction.

For multidimensional problems we show that the Newton direction u_N = −J_F(X)^{−1} F(X) satisfies ∇Φ^t u_N < 0:

∇Φ(X)^t u_N = −(J_F(X)^t F(X))^t J_F(X)^{−1} F(X) = −F(X)^t J_F(X) J_F(X)^{−1} F(X) = −F(X)^t F(X) = −2Φ(X) < 0, X ≠ X*.

The question that arises for all descent methods is how far one should go in a given descent direction. There are several techniques and conditions that guarantee convergence to a minimum of Φ; for a detailed discussion consult the book by Dennis and Schnabel [1].

For instance, we obtain a modified Newton iteration as

X_{k+1} = X_k + λ u_k, k = 0, 1, 2, …,

where we try values of λ in the order λ = 1, 1/2, 1/4, 1/8, …, i.e., starting with the full Newton step (λ = 1), and accept the iterate only if it satisfies the following criterion of Dennis and Schnabel [1]:

Φ(X_{k+1}) < Φ(X_k) + 10^{−4} λ ∇Φ(X_k)^t u_k = TR.    (2.13)

As a consequence we obtain the following quasi-Newton algorithm:

1. Select X_0, Tol, MaxIter.

2. Set iter = 0.

3. Solve J_F(X_0) H = −F(X_0).

4. Set iter = iter + 1.

5. Set λ = 1.

6. While Φ(X_0 + λH) ≥ Φ(X_0) + 10^{−4} λ ∇Φ(X_0)^t H, set λ = λ/2.

7. X_0 = X_0 + λH.

8. If ||λH|| < Tol ||X_0|| or iter > MaxIter, print results and stop.

9. Go to step 3.
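A MATLAB sketch of this damped iteration (Phi is built from F as above; the lower bound on λ is our own safeguard, and all names are illustrative):

function X = damped_newton(F,JF,X,Tol,MaxIter)
Phi = @(Z) 0.5*(F(Z)'*F(Z));
for iter = 1:MaxIter
    H = -JF(X)\F(X);                        % Newton direction
    g = JF(X)'*F(X);                        % grad Phi(X) = J_F(X)' F(X)
    lambda = 1;
    while Phi(X+lambda*H) >= Phi(X) + 1e-4*lambda*(g'*H) && lambda > 1e-12
        lambda = lambda/2;                  % halve until (2.13) holds
    end
    X = X + lambda*H;
    if norm(lambda*H) < Tol*norm(X), break; end
end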

    Remarks:

1. This modified Newton method is not a globally convergent method for all problems. For instance, consider finding the roots of z^2 + 1 = 0 written as a real system: if we start at (x_0, 0), x_0 ≠ 0, all iterates stay on the x axis and thus cannot converge to the roots ±i located on the y axis.

2. If U is a descent direction at X_k, the natural criterion Φ(X_k + U) < Φ(X_k) does not guarantee convergence to a minimum of Φ. See [1] for counterexamples.

Example:

We illustrate the modified Newton method on the system

f_1(x, y) = x^2 + y^2 − 9 = 0, f_2(x, y) = x + y − 1 = 0.

Let F = [f_1(x, y), f_2(x, y)]^t, whose Jacobian matrix is

J_F(x, y) =
[ 2x  2y ]
[  1   1 ]

Let us define the scalar function Φ(x, y) = [(x^2 + y^2 − 9)^2 + (x + y − 1)^2]/2, whose gradient is

∇Φ(x, y) = [ (x^2 + y^2 − 9) 2x + (x + y − 1),  (x^2 + y^2 − 9) 2y + (x + y − 1) ]^t.

To start the iteration process we select the initial guess X_0 = [1, 2]^t.

First iteration (computing X_1): we obtain u_0 by solving J_F(X_0) u_0 = −F(X_0), which gives u_0 = [−6, 4]^t.

λ = 1: X_1 = X_0 + u_0 = [−5, 6]^t, Φ(X_1) = 1352, TR = 9.998. The condition (2.13) is not satisfied; we reject X_1 and try

λ = 1/2: X_1 = X_0 + 0.5 u_0 = [−2, 4]^t, Φ(X_1) = 61, TR = 9.999. The condition (2.13) is not satisfied; we reject X_1 and try

λ = 1/4: X_1 = X_0 + 0.25 u_0 = [−0.5, 3]^t, Φ(X_1) = 1.15625, TR = 9.9995. The condition (2.13) is satisfied; we accept X_1 and repeat this process.

Second iteration (computing X_2): we solve J_F(X_1) u_1 = −F(X_1) to obtain u_1 = [−1.25, −0.25]^t.

λ = 1: X_2 = X_1 + u_1. The condition (2.13) is not satisfied.

λ = 1/2: X_2 = X_1 + 0.5 u_1 = [−1.125, 2.875]^t. The condition (2.13) is satisfied.

All remaining iterations satisfy (2.13) with λ = 1, i.e., the full Newton-Raphson step is used.

Steepest descent method: the steepest descent method is obtained by selecting

1. the direction u_k = −J_F(X_k)^t F(X_k),

2. λ = λ* such that

Φ(X_k + λ* u_k) = min_{λ ∈ [0,1]} Φ(X_k + λ u_k).

For further details on how to approximate λ*, consult [1].

Next, we describe the main steps of a steepest descent algorithm:

1. Select X_0, ε, max1, k = 0, h > 0.

2. Compute u_k = −J_F(X_k)^t F(X_k).

3. Set P_1 = X_k + h u_k, P_2 = X_k + 2h u_k.

4. If Φ(P_1) > Φ(X_k), go to step 5; else go to step 6.

5. Set h = h/2 and go to step 3.

6. If Φ(P_2) < Φ(P_1), then

(a) X_{k+1} = P_2;

(b) else X_{k+1} = P_1.

7. If ||X_{k+1} − X_k|| < ε or k > max1, print results and stop.

8. Increment k ← k + 1 and go to step 2.

Other quasi-Newton methods:

1. J_F(X_k) may be approximated using finite difference approximations of the partial derivatives.

2. J_F(X_k) may be factored once and reused for several iterations as

X_{k+i} = X_{k+i−1} − J_F(X_k)^{−1} F(X_{k+i−1}), i = 1, 2, …, n_k,

where n_k may be a fixed number or may be selected adaptively by monitoring the convergence rate.

    2.3.4 Continuation Methods

A good initial guess X_0 that guarantees convergence of Newton's iteration may be hard to find. We address this issue by discussing a few continuation techniques based on the homotopy principle, where we follow a path from an initial guess to the actual solution. The ideal situation is when there exists a path, i.e., a piecewise continuously differentiable curve in space, which connects the initial guess X_0 to the solution X* and which we can follow; however, this is not always the case. The conditions for the existence of such paths are discussed in [2]. In this section we briefly discuss the following homotopy functions.

1. Linear homotopy: we consider the homotopy function

G(X, t) = t F(X) + (1 − t)(F(X) − F(X_0)), 0 ≤ t ≤ 1,

where

G(X, 0) = F(X) − F(X_0), with G(X_0, 0) = 0,

and

G(X, 1) = F(X) = 0.

Next, we develop an algorithm that marches from an initial trivial problem with known solution X_0 to the terminal problem with the unknown solution X*.

    A Continuation Algorithm:


(a) Select N ≥ 1, Δt = 1/N, and set k = 1.

(b) Find X_k, the solution of G(X, kΔt) = 0, by applying a descent method with X_{k−1} as the initial guess.

(c) Increment k ← k + 1.

(d) If k ≤ N, go to step (b).

(e) Otherwise set X* = X_N.
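A MATLAB sketch of this loop, reusing the newton_sys sketch from Section 2.3.1 as the inner solver (N and the tolerances are illustrative; note that the Jacobian of G with respect to X is simply J_F):

N = 10;  dt = 1/N;  Xk = X0;
for k = 1:N
    t  = k*dt;
    Gk = @(X) t*F(X) + (1-t)*(F(X) - F(X0));  % G(X, k*dt)
    Xk = newton_sys(Gk, JF, Xk, 1e-10, 50);   % warm start from X_{k-1}
end
% Xk now approximates X*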

2. Newton homotopy: obtained by differentiating the homotopy function

G(X(t), t) = F(X(t)) + (t − 1) F(X_0) = 0

with respect to t, which gives

J_F(X) dX(t)/dt = −F(X_0), X(0) = X_0.

If J_F(X) exists and is nonsingular, we can rewrite the problem as

dX(t)/dt = −J_F(X)^{−1} F(X_0), X(0) = X_0.

Applying the forward Euler method yields the iteration

X_{k+1} = X_k − Δt J_F(X_k)^{−1} F(X_0).

3. A third homotopy: consider the homotopy function

G(X(t), t) = F(X(t)) − F(X_0) e^{−t} = 0, 0 ≤ t < ∞.

We also note the terminal value lim_{t→∞} X(t) = X*. Differentiating with respect to t gives J_F(X) dX/dt = −F(X_0) e^{−t} = −F(X(t)), so that

dX(t)/dt = −J_F(X)^{−1} F(X(t)), X(0) = X_0.

Applying the forward Euler method with a step size Δt > 0, we obtain the modified Newton iteration

X_{k+1} = X_k − Δt J_F(X_k)^{−1} F(X_k), k = 0, 1, ….

For Δt = 1 we obtain Newton's iteration method.

    2.3.5 Secant Methods for multidimensional problems

Secant-type methods, which do not use partial derivatives, are sought after in practice; see [1] for more details. Here we include a popular extension of the secant method to several dimensions.

Broyden's method: here we only present the algorithm; for a convergence analysis consult [1].

Algorithm for Broyden's iteration method:

1. Select X_0 and A_0 = J_F(X_0).

2. For k = 0, 1, … do

(a) solve A_k S_k = −F(X_k) for S_k;

(b) X_{k+1} = X_k + S_k;

(c) Y_k = F(X_{k+1}) − F(X_k);

(d) A_{k+1} = A_k + ((Y_k − A_k S_k) S_k^t) / (S_k^t S_k);

(e) if ||S_k|| < Tol ||X_{k+1}||, stop and print the result; otherwise continue.

We may use the Sherman-Morrison formula to update the inverse directly:

A_{k+1}^{−1} = A_k^{−1} + ((S_k − A_k^{−1} Y_k) S_k^t A_k^{−1}) / (S_k^t A_k^{−1} Y_k).

Remarks:

1. Broyden's method is the multidimensional analogue of the secant method.

2. The matrix should be refreshed after a few iterations by resetting A_k = J_F(X_k).


2.4 Finding zeros of polynomials

In this section we discuss methods for finding roots of polynomials with real coefficients.

1. Newton's method with real arithmetic: assuming the coefficients a_j ∈ R, we may use Newton's method with real arithmetic and a real initial guess x_0 to find real roots of

p(x) = a_n x^n + a_{n−1} x^{n−1} + ⋯ + a_1 x + a_0

via

x_{k+1} = x_k − p(x_k)/p'(x_k), k = 0, 1, 2, …, x_0 ∈ R.

To evaluate p(x_k) we use the following fast algorithm, known as Horner's scheme.

If we write p(x) = (x − x_0) q(x) + c, where q(x) = b_{n−1} x^{n−1} + ⋯ + b_0, then p(x_0) = c, and matching coefficients gives b_{n−1} = a_n and b_{k−1} − x_0 b_k = a_k, i.e., b_{k−1} = a_k + x_0 b_k, for k = n − 1, n − 2, …, 1.

    Horner Algorithm:

% Horner's scheme (pseudocode with 0-based indices)

b(n-1) = a(n)

for k = n-2:-1:0

    b(k) = a(k+1) + x0*b(k+1)

end

c = a(0) + x0*b(0)    % c = p(x0)

We apply the same scheme to q to evaluate p'(x_k) = q(x_k), since p'(x) = q(x) + (x − x_k) q'(x).
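A runnable MATLAB version of both evaluations (indices are shifted by one since MATLAB arrays start at 1, so a(j+1) stores a_j; the function name is our own):

function [p0,dp0] = horner2(a,x0)
%HORNER2  Evaluate p(x0) and p'(x0) for p with coefficients a_0,...,a_n.
n = length(a) - 1;              % degree of p
b = zeros(1,n);                 % b(j+1) stores b_j, coefficients of q
b(n) = a(n+1);
for k = n-1:-1:1
    b(k) = a(k+1) + x0*b(k+1);
end
p0 = a(1) + x0*b(1);            % p(x0)
dp0 = b(n);                     % now evaluate q(x0) = p'(x0) by Horner
for k = n-1:-1:1
    dp0 = b(k) + x0*dp0;
end

A Newton step then reads x1 = x0 - p0/dp0.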

2. Newton's method with complex arithmetic: Newton's method with complex arithmetic may be used to find complex roots by starting from a complex initial guess:

Z_{k+1} = Z_k − p(Z_k)/p'(Z_k), k = 0, 1, 2, …, Z_0 a complex number.

3. Bairstow's method: another alternative is to use Bairstow's algorithm to find complex roots. Since the coefficients a_k are real, complex roots occur in conjugate pairs.

We divide p(z) by the quadratic z^2 − rz − q, determining r, q, A and B such that

p(z) = (z^2 − rz − q) p_1(z) + Az + B.

We then seek r and q such that A(r, q) = 0 and B(r, q) = 0, which is a system of two equations. Once r and q are found, the quadratic factor yields the pair of roots

z = (r ± √(r^2 + 4q)) / 2.

Bairstow's algorithm:

It consists of using Newton's method to solve A(r, q) = 0 and B(r, q) = 0, where A, B and the Jacobian matrix

J =
[ ∂A/∂r  ∂A/∂q ]
[ ∂B/∂r  ∂B/∂q ]

are computed as follows. First, we factor p_1(z) as

p_1(z) = (z^2 − rz − q) p_2(z) + A_1 z + B_1.

The entries of the Jacobian matrix are then

(a) ∂A/∂q = A_1, ∂B/∂q = B_1,

(b) ∂A/∂r = r A_1 + B_1, ∂B/∂r = q A_1.

Horner's scheme is used to compute A, B, A_1, B_1. If

p(z) = a_0 z^n + a_1 z^{n−1} + ⋯ + a_n

and

p_1(z) = b_0 z^{n−2} + b_1 z^{n−3} + ⋯ + b_{n−2},

Horner's scheme reads

(a) b_0 = a_0,

(b) b_1 = r b_0 + a_1,

(c) b_i = q b_{i−2} + r b_{i−1} + a_i, i = 2, 3, …, n − 2,

(d) A = q b_{n−3} + r b_{n−2} + a_{n−1},

B = q b_{n−2} + a_n.

Using b_i, i = 0, 1, …, n − 2, we compute A_1 and B_1 by applying the same scheme to p_1.


Bibliography

[1] J.E. Dennis and R.B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. SIAM, Philadelphia, 1996.

[2] C.B. Garcia and W.I. Zangwill. Pathways to Solutions, Fixed Points and Equilibria. Prentice Hall, Englewood Cliffs, 1981.
