Modern iterative methods
Basic iterative methods converge linearly; modern iterative methods converge faster.
– Krylov subspace methods
• Steepest descent method
• Conjugate gradient (CG) method --- the most popular
• Preconditioned CG (PCG) method
• GMRES for nonsymmetric matrices
– Other methods (read yourself)
• Chebyshev iterative method
• Lanczos methods
• Conjugate gradient normal residual (CGNR)
Basic iterative method: $Ax = b \;\Leftrightarrow\; Dx = Rx + c \;\Rightarrow\; x^{(m+1)} = D^{-1} R x^{(m)} + D^{-1} c$

Minimization formulation: $\min_{x \in \mathbb{R}^n} \phi(x) := \frac{1}{2} x^T A x - x^T b$
Modern iterative methods
Ideas:
– Minimizing the residual
– Projecting onto the Krylov subspace

Thm: If A is an n-by-n real symmetric positive definite matrix, then
$Ax = b$ and $\min_{x \in \mathbb{R}^n} \phi(x) = \min_{x \in \mathbb{R}^n} \left( \tfrac{1}{2} x^T A x - x^T b \right)$
have the same solution. Proof: see details in class.

At the solution: $x^* = A^{-1} b$, $\phi(x^*) = -\tfrac{1}{2} b^T A^{-1} b$

The generic iteration performs a line minimization:
$x^{(m+1)} = x^{(m)} + \alpha_m d^{(m)}$ with $\alpha_m := \arg\min_{\alpha} \phi(x^{(m)} + \alpha d^{(m)})$
Steepest descent method
Suppose we have an approximation $x_c \approx x^*$. Choose the direction $d_c$ as the negative gradient of $\phi(x)$:
$d_c = -\nabla \phi(x_c) = b - A x_c =: r_c$
– If $r_c = b - A x_c = 0$, then $x_c$ is the exact solution!
– Else, choose $\alpha_c$ to minimize $\phi(x_c + \alpha d_c)$
Steepest descent method
Computation: choose $\alpha_c$ as
$\alpha_c = \frac{d_c^T d_c}{d_c^T A\, d_c} = \frac{r_c^T r_c}{r_c^T A\, r_c}$
since
$\phi(x_c + \alpha d_c) = \tfrac{1}{2}(x_c + \alpha d_c)^T A (x_c + \alpha d_c) - (x_c + \alpha d_c)^T b$
$\quad = \phi(x_c) - \alpha\, d_c^T (b - A x_c) + \tfrac{\alpha^2}{2}\, d_c^T A\, d_c$
$\quad = \phi(x_c) - \alpha\, d_c^T d_c + \tfrac{\alpha^2}{2}\, d_c^T A\, d_c \quad$ (using $d_c = r_c = b - A x_c$),
which is minimized at $\alpha_c = d_c^T d_c / (d_c^T A\, d_c)$.
Algorithm – Steepest descent method

Initial guess $x^{(0)}$
Compute $r_0 = b - A x^{(0)}$ & set $m = 0$
while $r_m \ne 0$ (in practice, until $\|r_m\|_2 \le \varepsilon$, e.g. $\varepsilon = 10^{-10}$)
    $\alpha_m = (r_m^T r_m) / (r_m^T A\, r_m)$
    $x^{(m+1)} = x^{(m)} + \alpha_m r_m$ & $r_{m+1} = b - A x^{(m+1)}$
    $m = m + 1$
end
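Below is a minimal NumPy sketch of this loop. The function name, tolerance handling, and iteration cap are illustrative choices, not part of the slides; the residual update uses the algebraically equivalent one-matvec form $r \leftarrow r - \alpha A r$.

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Solve A x = b for symmetric positive definite A by steepest descent."""
    x = x0.astype(float).copy()
    r = b - A @ x                      # initial residual
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:   # stop once the residual is small
            break
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)     # exact line search along d = r
        x = x + alpha * r
        r = r - alpha * Ar             # residual update without a second matvec
    return x

# Example: the 3x3 SPD system used later in these notes
A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])
b = np.array([7., 7., 7.])
print(steepest_descent(A, b, np.zeros(3)))   # approx [1. 1. 1.]
```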
Theory

Suppose A is symmetric positive definite.
Define the A-inner product: $(x, y)_A := (A x, y) = x^T A\, y$
Define the A-norm: $\|x\|_A := \sqrt{(x, x)_A} = \sqrt{x^T A\, x}$

Steepest descent method, restated in this notation:
Initial guess $x^{(0)}$ & $r_0 = b - A x^{(0)}$
$\alpha_m = (r_m, r_m) / (r_m, r_m)_A$
$x^{(m+1)} = x^{(m)} + \alpha_m r_m$ & $r_{m+1} = b - A x^{(m+1)}$
Theory

Thm: For the steepest descent method, we have
$\phi(x^{(m+1)}) + \tfrac{1}{2} b^T A^{-1} b \le \left( 1 - \frac{1}{\kappa_2(A)} \right) \left[ \phi(x^{(m)}) + \tfrac{1}{2} b^T A^{-1} b \right]$
Since $\phi(x^*) = -\tfrac{1}{2} b^T A^{-1} b$, this gives $\phi(x^{(m)}) \to \phi(x^*)$, and hence $x^{(m)} \to x^* = A^{-1} b$.
Proof: Exercise
Theory

Rewrite the steepest descent method: with $r_m = b - A x^{(m)}$ and $\alpha_m = (r_m, r_m)/(r_m, r_m)_A$, compare the method's step $x^{(m+1)} = x^{(m)} + \alpha_m r_m$ with an arbitrary step $\tilde{x}^{(m+1)} = x^{(m)} + \alpha\, r_m$.
Let the errors be
$e^{(m)} = x^{(m)} - x^*$, $\quad e^{(m+1)} = x^{(m+1)} - x^*$, $\quad \tilde{e}^{(m+1)} = \tilde{x}^{(m+1)} - x^*$
so that $e^{(m+1)} = e^{(m)} + \alpha_m r_m$ and $r_m = -A e^{(m)}$.
Lemma: For the method, we have
$(e^{(m+1)}, \tilde{e}^{(m+1)} - e^{(m+1)})_A = 0$, and hence $\|e^{(m+1)}\|_A \le \|\tilde{e}^{(m+1)}\|_A$.
Theory

Thm: For the steepest descent method, we have
$\|e^{(m+1)}\|_A \le \frac{\kappa_2(A) - 1}{\kappa_2(A) + 1}\, \|e^{(m)}\|_A$
i.e., the algorithm converges monotonically in the sense of the A-norm.
Proof: See details in class (or as an exercise)
Steepest descent method

Performance:
– Converges globally, for any initial data
– If $\kappa_2(A) = O(1)$, then it converges very fast
– If $\kappa_2(A) \gg 1$, then it converges very slowly (see the sketch below)!!!
Geometric interpretation:
– The contour plots are flat (elongated) ellipses!!
– The local best direction (steepest descent direction) is not necessarily a good global direction
– Computational experience shows that the method suffers a decreasing convergence rate after a few iteration steps, because the search directions become linearly dependent!!!
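A quick numerical illustration of this condition-number dependence. The diagonal test matrices, tolerance, and iteration counter are illustrative assumptions:

```python
import numpy as np

def sd_iterations(A, b, tol=1e-8, max_iter=100_000):
    """Count steepest descent iterations until the residual norm falls below tol."""
    x = np.zeros_like(b)
    r = b - A @ x
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol:
            return k
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)
        x += alpha * r
        r -= alpha * Ar
    return max_iter

b = np.ones(2)
well = np.diag([1.0, 2.0])      # kappa_2 = 2:    a handful of steps
ill  = np.diag([1.0, 2000.0])   # kappa_2 = 2000: thousands of steps
print(sd_iterations(well, b), sd_iterations(ill, b))
```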
Conjugate gradient (CG) method

Since A is symmetric positive definite, the A-norm is $\|x\|_A = \sqrt{(x, x)_A} = \sqrt{x^T A\, x}$.
In the CG method, the direction vectors are chosen to be A-orthogonal (and called conjugate vectors), i.e.
$(d_i)^T A\, d_m = 0, \quad i < m$
CG method

In addition, we take the new direction vector as a linear combination of the old direction vector and the descent direction:
$d_{m+1} = r_{m+1} + \beta_m d_m, \qquad r_{m+1} = b - A x^{(m+1)}$
By the assumption $(d_m)^T A\, d_{m+1} = 0$, we get
$0 = (d_m)^T A \left( r_{m+1} + \beta_m d_m \right) \;\Rightarrow\; \beta_m = -\frac{(d_m)^T A\, r_{m+1}}{(d_m)^T A\, d_m}$
Algorithm – CG method

Choose initial guess $x^{(0)}$
Compute $r_0 = b - A x^{(0)}$ & set $d_0 = r_0$, $m = 0$
For $m = 0, 1, \ldots$ do
    Compute $\alpha_m = (r_m^T r_m) / (d_m^T A\, d_m)$
    $x^{(m+1)} = x^{(m)} + \alpha_m d_m$ & $r_{m+1} = r_m - \alpha_m A\, d_m$
    If $\|r_{m+1}\|_2 \le \varepsilon$ (e.g. $\varepsilon = 10^{-10}$), then stop
    else
        $\beta_m = (r_{m+1}^T r_{m+1}) / (r_m^T r_m)$ & $d_{m+1} = r_{m+1} + \beta_m d_m$
    endif
endfor
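A direct NumPy transcription of this algorithm. The function name and iteration cap are illustrative; note the single matrix-vector product per iteration:

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A by the CG method."""
    n = len(b)
    max_iter = max_iter or n          # exact in at most n steps (exact arithmetic)
    x = x0.astype(float).copy()
    r = b - A @ x
    d = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ad = A @ d                    # the only matvec per iteration
        alpha = rs / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol:
            break
        beta = rs_new / rs            # beta_m = (r_{m+1}^T r_{m+1}) / (r_m^T r_m)
        d = r + beta * d
        rs = rs_new
    return x
```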
An example

$A = \begin{pmatrix} 5 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 5 \end{pmatrix}, \quad b = \begin{pmatrix} 7 \\ 7 \\ 7 \end{pmatrix}, \quad$ exact solution $x^* = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$

Initial guess $x^{(0)} = (0.0000, 0.0000, 0.0000)^T$
The approximate solutions:
$r_0 = d_0 = (7.0000, 7.0000, 7.0000)^T, \quad \alpha_0 = 0.1429, \quad x^{(1)} = (1.0003, 1.0003, 1.0003)^T$
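These numbers can be reproduced with a few lines of NumPy; the exact step size is $\alpha_0 = 1/7 \approx 0.1429$, and the slides' $x^{(1)} = 1.0003$ reflects rounding $\alpha_0$ to four digits before the update:

```python
import numpy as np

A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])
b = np.array([7., 7., 7.])
r0 = d0 = b - A @ np.zeros(3)            # r_0 = d_0 = (7, 7, 7)^T
alpha0 = (r0 @ r0) / (d0 @ (A @ d0))     # = 1/7 = 0.142857...
x1 = np.zeros(3) + round(alpha0, 4) * d0 # rounded alpha gives (1.0003, ...)
print(alpha0, x1)
```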
CG method

In the CG method, $d_m$ & $d_{m+1}$ are A-orthogonal, i.e. $(d_m, d_{m+1})_A = (d_m)^T A\, d_{m+1} = 0$.
Define the linear space as
$\mathrm{span}\{d_0, d_1, \ldots, d_m\} := \left\{ y \;\middle|\; y = \sum_{i=0}^{m} c_i d_i,\; c_i \in \mathbb{R} \right\}$
Lemma: In the CG method, for $m = 0, 1, \ldots$, we have
$\mathrm{span}\{d_0, d_1, \ldots, d_m\} = \mathrm{span}\{r_0, r_1, \ldots, r_m\} = \mathrm{span}\{r_0, A r_0, \ldots, A^m r_0\}$
– Proof: See details in class or as an exercise
CG method

In the CG method, $d_{m+1}$ is A-orthogonal to $d_0, d_1, \ldots, d_m$, i.e. to $\mathrm{span}\{d_0, d_1, \ldots, d_m\}$ with respect to $(\cdot, \cdot)_A$.
Lemma: In the CG method, we have
$(d_i)^T A\, d_j = 0, \quad (r_i)^T r_j = 0, \quad i \ne j$
– Proof: See details in class or as an exercise
Thm: Error estimate for the CG method. For $m = 1, 2, \ldots$:
$\frac{\|e^{(m)}\|_A}{\|e^{(0)}\|_A} \le 2 \left( \frac{\sqrt{\kappa_2(A)} - 1}{\sqrt{\kappa_2(A)} + 1} \right)^m$
where $e^{(m)} := x^{(m)} - x^*$, $\|x\|_A = \sqrt{x^T A\, x}$, and $\kappa_2(A) = \lambda_{\max}(A) / \lambda_{\min}(A)$.
CG method

Computational cost:
– At each iteration, 2 matrix-vector multiplications; this can be further reduced to 1 matrix-vector multiplication (update $r_{m+1} = r_m - \alpha_m A\, d_m$ and reuse $A d_m$)
– In at most n steps, we get the exact solution (in exact arithmetic)!!!
Convergence rate depends on the condition number:
– $\kappa_2(A) = O(1)$: converges very fast!!
– $\kappa_2(A) \gg 1$: converges slowly, but can be accelerated by preconditioning!!
Preconditioning

Ideas: Replace $A x = b$ by $\tilde{A} \tilde{x} = \tilde{b}$ with
$\tilde{A} = C^{-1} A\, C^{-1}, \quad \tilde{x} = C x, \quad \tilde{b} = C^{-1} b$
satisfying:
– C is symmetric positive definite
– $\tilde{A}$ is well-conditioned, i.e. $\kappa_2(\tilde{A}) \ll \kappa_2(A)$
– $\tilde{A} \tilde{x} = \tilde{b}$ can be easily solved
Conditions for choosing the preconditioning matrix:
– $\kappa_2(\tilde{A})$ as small as possible
– C is easy to compute
– Trade-off between the two
Algorithm – PCG method

Choose initial guess $x^{(0)}$ & compute $r_0 = b - A x^{(0)}$
Solve $C \tilde{r}_0 = r_0$ & set $d_0 = \tilde{r}_0$, $m = 0$
For $m = 0, 1, \ldots$ do
    Compute $\alpha_m = (\tilde{r}_m^T r_m) / (d_m^T A\, d_m)$
    $x^{(m+1)} = x^{(m)} + \alpha_m d_m$ & $r_{m+1} = r_m - \alpha_m A\, d_m$
    If $\|r_{m+1}\|_2 \le \varepsilon$ (e.g. $\varepsilon = 10^{-10}$), then stop
    else
        Solve $C \tilde{r}_{m+1} = r_{m+1}$
        $\beta_m = (\tilde{r}_{m+1}^T r_{m+1}) / (\tilde{r}_m^T r_m)$ & $d_{m+1} = \tilde{r}_{m+1} + \beta_m d_m$
    endif
endfor
Preconditioning

Ways to choose the matrix C (read yourself):
– Diagonal part of A
– Tri-diagonal part of A
– m-step Jacobi preconditioner
– Symmetric Gauss-Seidel preconditioner
– SSOR preconditioner
– Incomplete Cholesky decomposition
– Incomplete block preconditioning
– Preconditioning based on domain decomposition
– …
A sketch of PCG with the simplest (diagonal) choice follows.
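A minimal NumPy sketch of the PCG loop above, using the diagonal (Jacobi) preconditioner $C = \mathrm{diag}(A)$ from the list so that each solve $C \tilde{r} = r$ is trivial. The function name and iteration cap are illustrative:

```python
import numpy as np

def pcg(A, b, x0, tol=1e-10, max_iter=1000):
    """Preconditioned CG with the Jacobi preconditioner C = diag(A)."""
    C = np.diag(A).copy()             # SPD A has a positive diagonal
    x = x0.astype(float).copy()
    r = b - A @ x
    rt = r / C                        # solve C r~ = r (diagonal solve)
    d = rt.copy()
    rho = rt @ r
    for _ in range(max_iter):
        Ad = A @ d
        alpha = rho / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        if np.linalg.norm(r) <= tol:
            break
        rt = r / C                    # solve C r~ = r again
        rho_new = rt @ r
        beta = rho_new / rho          # beta_m = (r~_{m+1}^T r_{m+1}) / (r~_m^T r_m)
        d = rt + beta * d
        rho = rho_new
    return x
```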
Extension of the CG method to nonsymmetric matrices

Biconjugate gradient (BiCG) method:
– Solve $A x = b$ & $A^T y = b$ simultaneously
– Works well when A is positive definite but not symmetric
– If A is symmetric, BiCG reduces to CG
Conjugate gradient squared (CGS) method:
– Useful when A has a special formula for computing $Ax$ but its transpose has not
– Multiplication by A is efficient, but multiplication by its transpose is not
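In practice one rarely hand-codes these; assuming SciPy is available, its `scipy.sparse.linalg.bicg` and `cgs` routines implement the two methods just described (the small nonsymmetric test matrix is an illustrative assumption):

```python
import numpy as np
from scipy.sparse.linalg import bicg, cgs

# A nonsymmetric, positive definite test matrix (illustrative)
A = np.array([[4., 1., 0.],
              [2., 5., 1.],
              [0., 1., 3.]])
b = np.array([1., 2., 3.])

x_bicg, info1 = bicg(A, b)   # info == 0 signals successful convergence
x_cgs,  info2 = cgs(A, b)
print(x_bicg, info1)
print(x_cgs, info2)
```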
Krylov subspace methods

Problem I. Linear system: $A x = b$
Problem II. Variational formulation: find $x \in \mathbb{R}^n$ such that $(A x, v) = (b, v)$ for all $v \in \mathbb{R}^n$
Problem III. Minimization problem:
$\min_{x \in \mathbb{R}^n} \phi(x) := \tfrac{1}{2} x^T A\, x - x^T b = \tfrac{1}{2} (A x, x) - (b, x)$
– Thm 1: Problem I is equivalent to Problem II
– Thm 2: If A is symmetric positive definite, they are all equivalent
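A quick numeric sanity check of Thm 2 for an SPD matrix: $\phi$ is minimized exactly at the solution of $Ax = b$, with minimum value $-\tfrac{1}{2} b^T A^{-1} b$. The test matrix and random perturbations are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])  # SPD
b = np.array([7., 7., 7.])
phi = lambda x: 0.5 * x @ A @ x - x @ b

x_star = np.linalg.solve(A, b)                 # Problem I
# phi at the solution is smaller than at any perturbed point (Problem III)
perturbed = [x_star + 0.1 * rng.standard_normal(3) for _ in range(5)]
assert all(phi(x_star) < phi(x) for x in perturbed)
print(phi(x_star), -0.5 * b @ x_star)          # both equal -(1/2) b^T A^{-1} b
```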
Krylov subspace methods

To reduce the problem size, we replace $\mathbb{R}^n$ by a subspace
$S_m = \mathrm{span}\{d_0, d_1, \ldots, d_{m-1}\}$ with an initial guess $x^{(0)}$, i.e.
$x^{(m)} = x^{(0)} + \sum_{k=0}^{m-1} \gamma_k d_k$
Subspace minimization:
– Find $x^{(m)} \in x^{(0)} + S_m$
– Such that $\phi(x^{(m)}) = \min_{x \in x^{(0)} + S_m} \phi(x) = \min_{x \in x^{(0)} + S_m} \left( \tfrac{1}{2} x^T A\, x - x^T b \right)$
Subspace projection:
– Find $x^{(m)} \in x^{(0)} + S_m$
– Such that $(A x^{(m)} - b, v) = 0$ for all $v \in S_m$, i.e. $(A x^{(m)} - b, d_k) = 0$, $0 \le k \le m - 1$
Krylov subspace methods

To determine the coefficients, we have the Normal Equations:
$\sum_{l=0}^{m-1} \left( d_k^T A\, d_l \right) \gamma_l = d_k^T \left( b - A x^{(0)} \right) = d_k^T r_0, \quad k = 0, 1, \ldots, m-1$
– It is a linear system of size m!!
m = 1: line minimization (or linear search, or 1D projection):
$\gamma_0 = \frac{d_0^T r_0}{d_0^T A\, d_0}, \qquad x^{(1)} = x^{(0)} + \gamma_0 d_0$
By converting this formula into an iteration, we reduce the original problem to a sequence of line minimizations (successive line minimization).
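A small NumPy sketch of this projection step: assemble the m-by-m Gram matrix $G_{kl} = d_k^T A\, d_l$, solve the normal equations for $\gamma$, and form $x^{(m)}$. Taking the directions as the columns of a random matrix is purely illustrative:

```python
import numpy as np

def subspace_projection(A, b, x0, D):
    """Solve the normal equations over S_m = span of the columns of D."""
    r0 = b - A @ x0
    G = D.T @ A @ D                 # G[k, l] = d_k^T A d_l  (m x m)
    gamma = np.linalg.solve(G, D.T @ r0)
    return x0 + D @ gamma

rng = np.random.default_rng(1)
n, m = 6, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)         # SPD test matrix (illustrative)
b = rng.standard_normal(n)
D = rng.standard_normal((n, m))     # m linearly independent directions
x_m = subspace_projection(A, b, np.zeros(n), D)
# Galerkin condition: the residual is orthogonal to every direction d_k
print(D.T @ (b - A @ x_m))          # approx zeros
```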
For symmetric matrices

Positive definite:
– Steepest descent method: $d_k = r_k$
– CG method: $d_k = r_k + \beta_{k-1} d_{k-1}$ with $(d_k, d_{k-1})_A = 0$
– Preconditioned CG method
Non-positive definite:
– MINRES (minimum residual method):
$\min \| b - A x^{(m)} \|_2$ over $x^{(m)} \in x^{(0)} + S_m$, $\quad S_m = \mathrm{span}\{r_0, A r_0, \ldots, A^{m-1} r_0\} = K_m(A, r_0)$
For nonsymmetric matrices

Normal equations method (or CGNR method):
– Solve $\tilde{A} x = \tilde{b}$ with $\tilde{A} = A^T A$ & $\tilde{b} = A^T b$
GMRES (generalized minimum residual method):
– Saad & Schultz, 1986
– Ideas:
• In the m-th step, minimize the residual $\| b - A x^{(m)} \|_2$ over the set $x^{(m)} \in x^{(0)} + S_m$, $S_m = \mathrm{span}\{r_0, A r_0, \ldots, A^{m-1} r_0\} = K_m(A, r_0)$
• Use Arnoldi (full orthogonalization) vectors instead of Lanczos vectors
• If A is symmetric, it reduces to the conjugate residual method
Algorithm – GMRES

Choose initial guess $x^{(0)}$
Compute $r_0 = b - A x^{(0)}$, $h_{1,0} = \|r_0\|_2$ & set $k = 0$
while $h_{k+1,k} \ne 0$
    $q_{k+1} = r_k / h_{k+1,k}$ & $k = k + 1$
    $r_k = A q_k$
    for $i = 1, \ldots, k$
        $h_{i,k} = q_i^T r_k$ & $r_k = r_k - h_{i,k} q_i$
    end
    $h_{k+1,k} = \|r_k\|_2$
    $x^{(k)} = x^{(0)} + Q_k y_k$ with $y_k = \arg\min_y \| h_{1,0} e_1 - \bar{H}_k y \|_2$
until $\|x^{(k)} - x^{(k-1)}\| \le \varepsilon$ (e.g. $\varepsilon = 10^{-10}$)
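A compact NumPy transcription of this algorithm, using full Arnoldi orthogonalization and a dense least-squares solve for $y_k$. The unrestarted structure, function name, and stopping tests are illustrative choices:

```python
import numpy as np

def gmres(A, b, x0, tol=1e-10, max_iter=None):
    """Minimal (unrestarted) GMRES for a general square matrix A."""
    n = len(b)
    max_iter = max_iter or n
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)          # h_{1,0} = ||r_0||_2
    if beta <= tol:
        return x0
    Q = np.zeros((n, max_iter + 1))    # Arnoldi basis vectors q_1, q_2, ...
    H = np.zeros((max_iter + 1, max_iter))
    Q[:, 0] = r0 / beta
    x = x0
    for k in range(max_iter):
        w = A @ Q[:, k]                # new Krylov direction
        for i in range(k + 1):         # orthogonalize against q_1, ..., q_{k+1}
            H[i, k] = Q[:, i] @ w
            w -= H[i, k] * Q[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        # y_k = argmin || beta*e_1 - H_bar y ||_2 over the leading block
        e1 = np.zeros(k + 2)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 2, :k + 1], e1, rcond=None)
        x = x0 + Q[:, :k + 1] @ y
        if H[k + 1, k] <= tol or np.linalg.norm(b - A @ x) <= tol:
            break                      # lucky breakdown or converged
        Q[:, k + 1] = w / H[k + 1, k]
    return x

A = np.array([[4., 1., 0.], [2., 5., 1.], [0., 1., 3.]])   # nonsymmetric test
b = np.array([1., 2., 3.])
print(gmres(A, b, np.zeros(3)), np.linalg.solve(A, b))
```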
More topics on matrix computations

Eigenvalue & eigenvector computations: $A x = \lambda x$, $x \ne 0 \in \mathbb{R}^n$
If A is symmetric: Power method
If A is a general matrix:
– Householder matrix (transform): $P = I - \frac{2 v v^T}{v^T v}$, $v \ne 0$, with $P = P^T$ & $P^T P = P^2 = I$
– QR method: $A = Q R$ with Q an orthogonal matrix & R an upper triangular matrix;
$U^T A\, U = C$ an upper Hessenberg matrix, with $U = P_1 P_2 \cdots P_{n-2}$
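A minimal sketch of the power method mentioned above, for the dominant eigenpair. The Rayleigh-quotient estimate, tolerance, and test matrix are illustrative assumptions:

```python
import numpy as np

def power_method(A, tol=1e-12, max_iter=10_000):
    """Approximate the dominant eigenvalue/eigenvector of a symmetric matrix A."""
    rng = np.random.default_rng(0)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    lam = 0.0
    for _ in range(max_iter):
        y = A @ x
        lam_new = x @ y                 # Rayleigh quotient estimate x^T A x
        x = y / np.linalg.norm(y)
        if abs(lam_new - lam) <= tol:
            break
        lam = lam_new
    return lam, x

A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])
lam, x = power_method(A)
print(lam)                              # approx 7 (the dominant eigenvalue)
```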
More topics on matrix computations

Singular value decomposition (SVD)
Thm: Let A be an m-by-n real matrix. There exist orthogonal matrices
$U = [u_1, u_2, \ldots, u_m] \in \mathbb{R}^{m \times m}$ & $V = [v_1, v_2, \ldots, v_n] \in \mathbb{R}^{n \times n}$
such that
$U^T A\, V = \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_p) \in \mathbb{R}^{m \times n}, \quad p = \min\{m, n\}$
with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0 = \sigma_{r+1} = \cdots = \sigma_p$ & $r = \mathrm{rank}(A)$.
Proof: Exercise
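This factorization is available directly in NumPy; a quick numeric check of the theorem (the random test matrix is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))            # m = 4, n = 3, p = 3

U, s, Vt = np.linalg.svd(A)                # U: 4x4, s: singular values, Vt = V^T
print(np.all(s[:-1] >= s[1:]))             # sigma_1 >= sigma_2 >= ...  (True)

Sigma = np.zeros((4, 3))
Sigma[:3, :3] = np.diag(s)
print(np.allclose(U.T @ A @ Vt.T, Sigma))  # U^T A V = Sigma  (True)
```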