Modern iterative methods
Basic iterative methods converge linearly; modern iterative methods converge faster.
– Krylov subspace methods
• Steepest descent method
• Conjugate gradient (CG) method --- the most popular
• Preconditioned CG (PCG) method
• GMRES for nonsymmetric matrices
– Other methods (read yourself)
• Chebyshev iterative method
• Lanczos methods
• Conjugate gradient normal residual (CGNR)
Basic iterative method: $Ax = b \;\Leftrightarrow\; Dx = Rx + c \;\Rightarrow\; x^{(m+1)} = D^{-1} R x^{(m)} + D^{-1} c$

Minimization formulation: $\min_{x \in \mathbb{R}^n} \phi(x) := \frac{1}{2} x^T A x - x^T b$
Modern iterative methods
Ideas:
– Minimizing the residual
– Projecting onto the Krylov subspace

Thm: If A is an n-by-n real symmetric positive definite matrix, then
$Ax = b$ and $\min_{x \in \mathbb{R}^n} \phi(x) = \min_{x \in \mathbb{R}^n} \left( \tfrac{1}{2} x^T A x - x^T b \right)$
have the same solution. Proof: see details in class.

At the solution: $x^* = A^{-1} b$, $\phi(x^*) = -\tfrac{1}{2} b^T A^{-1} b$

The generic iteration performs a line minimization:
$x^{(m+1)} = x^{(m)} + \alpha_m d^{(m)}$ with $\alpha_m := \arg\min_{\alpha} \phi(x^{(m)} + \alpha d^{(m)})$
Steepest descent method
Suppose we have an approximation $x_c \approx x^*$. Choose the direction $d_c$ as the negative gradient of $\phi(x)$:
$d_c = -\nabla \phi(x_c) = b - A x_c =: r_c$
– If $r_c = b - A x_c = 0$, then $x_c$ is the exact solution!
– Else, choose $\alpha_c$ to minimize $\phi(x_c + \alpha d_c)$
Steepest descent method
Computation: choose $\alpha_c$ as
$\alpha_c = \frac{d_c^T d_c}{d_c^T A\, d_c} = \frac{r_c^T r_c}{r_c^T A\, r_c}$
since
$\phi(x_c + \alpha d_c) = \tfrac{1}{2}(x_c + \alpha d_c)^T A (x_c + \alpha d_c) - (x_c + \alpha d_c)^T b$
$\quad = \phi(x_c) - \alpha\, d_c^T (b - A x_c) + \tfrac{\alpha^2}{2}\, d_c^T A\, d_c$
$\quad = \phi(x_c) - \alpha\, d_c^T d_c + \tfrac{\alpha^2}{2}\, d_c^T A\, d_c \quad$ (using $d_c = r_c = b - A x_c$),
which is minimized at $\alpha_c = d_c^T d_c / (d_c^T A\, d_c)$.
Algorithm – Steepest descent method

Initial guess $x^{(0)}$
Compute $r_0 = b - A x^{(0)}$ & set $m = 0$
while $r_m \ne 0$ (in practice, until $\|r_m\|_2 \le \varepsilon$, e.g. $\varepsilon = 10^{-10}$)
    $\alpha_m = (r_m^T r_m) / (r_m^T A\, r_m)$
    $x^{(m+1)} = x^{(m)} + \alpha_m r_m$ & $r_{m+1} = b - A x^{(m+1)}$
    $m = m + 1$
end
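Below is a minimal NumPy sketch of this loop. The function name, tolerance handling, and iteration cap are illustrative choices, not part of the slides; the residual update uses the algebraically equivalent one-matvec form $r \leftarrow r - \alpha A r$.

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Solve A x = b for symmetric positive definite A by steepest descent."""
    x = x0.astype(float).copy()
    r = b - A @ x                      # initial residual
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:   # stop once the residual is small
            break
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)     # exact line search along d = r
        x = x + alpha * r
        r = r - alpha * Ar             # residual update without a second matvec
    return x

# Example: the 3x3 SPD system used later in these notes
A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])
b = np.array([7., 7., 7.])
print(steepest_descent(A, b, np.zeros(3)))   # approx [1. 1. 1.]
```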
Theory

Suppose A is symmetric positive definite.
Define the A-inner product: $(x, y)_A := (A x, y) = x^T A\, y$
Define the A-norm: $\|x\|_A := \sqrt{(x, x)_A} = \sqrt{x^T A\, x}$

Steepest descent method, restated in this notation:
Initial guess $x^{(0)}$ & $r_0 = b - A x^{(0)}$
$\alpha_m = (r_m, r_m) / (r_m, r_m)_A$
$x^{(m+1)} = x^{(m)} + \alpha_m r_m$ & $r_{m+1} = b - A x^{(m+1)}$
Theory

Thm: For the steepest descent method, we have
$\phi(x^{(m+1)}) + \tfrac{1}{2} b^T A^{-1} b \le \left( 1 - \frac{1}{\kappa_2(A)} \right) \left[ \phi(x^{(m)}) + \tfrac{1}{2} b^T A^{-1} b \right]$
Since $\phi(x^*) = -\tfrac{1}{2} b^T A^{-1} b$, this gives $\phi(x^{(m)}) \to \phi(x^*)$, and hence $x^{(m)} \to x^* = A^{-1} b$.
Proof: Exercise
Theory

Rewrite the steepest descent method: with $r_m = b - A x^{(m)}$ and $\alpha_m = (r_m, r_m)/(r_m, r_m)_A$, compare the method's step $x^{(m+1)} = x^{(m)} + \alpha_m r_m$ with an arbitrary step $\tilde{x}^{(m+1)} = x^{(m)} + \alpha\, r_m$.
Let the errors be
$e^{(m)} = x^{(m)} - x^*$, $\quad e^{(m+1)} = x^{(m+1)} - x^*$, $\quad \tilde{e}^{(m+1)} = \tilde{x}^{(m+1)} - x^*$
so that $e^{(m+1)} = e^{(m)} + \alpha_m r_m$ and $r_m = -A e^{(m)}$.
Lemma: For the method, we have
$(e^{(m+1)}, \tilde{e}^{(m+1)} - e^{(m+1)})_A = 0$, and hence $\|e^{(m+1)}\|_A \le \|\tilde{e}^{(m+1)}\|_A$.
Theory

Thm: For the steepest descent method, we have
$\|e^{(m+1)}\|_A \le \frac{\kappa_2(A) - 1}{\kappa_2(A) + 1}\, \|e^{(m)}\|_A$
i.e., the algorithm converges monotonically in the sense of the A-norm.
Proof: See details in class (or as an exercise)
Steepest descent method

Performance:
– Converges globally, for any initial data
– If $\kappa_2(A) = O(1)$, then it converges very fast
– If $\kappa_2(A) \gg 1$, then it converges very slowly (see the sketch below)!!!
Geometric interpretation:
– The contour plots are flat (elongated) ellipses!!
– The local best direction (steepest descent direction) is not necessarily a good global direction
– Computational experience shows that the method suffers a decreasing convergence rate after a few iteration steps, because the search directions become linearly dependent!!!
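A quick numerical illustration of this condition-number dependence. The diagonal test matrices, tolerance, and iteration counter are illustrative assumptions:

```python
import numpy as np

def sd_iterations(A, b, tol=1e-8, max_iter=100_000):
    """Count steepest descent iterations until the residual norm falls below tol."""
    x = np.zeros_like(b)
    r = b - A @ x
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol:
            return k
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)
        x += alpha * r
        r -= alpha * Ar
    return max_iter

b = np.ones(2)
well = np.diag([1.0, 2.0])      # kappa_2 = 2:    a handful of steps
ill  = np.diag([1.0, 2000.0])   # kappa_2 = 2000: thousands of steps
print(sd_iterations(well, b), sd_iterations(ill, b))
```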
Conjugate gradient (CG) method

Since A is symmetric positive definite, the A-norm is $\|x\|_A = \sqrt{(x, x)_A} = \sqrt{x^T A\, x}$.
In the CG method, the direction vectors are chosen to be A-orthogonal (and called conjugate vectors), i.e.
$(d_i)^T A\, d_m = 0, \quad i < m$
CG method

In addition, we take the new direction vector as a linear combination of the old direction vector and the descent direction:
$d_{m+1} = r_{m+1} + \beta_m d_m, \qquad r_{m+1} = b - A x^{(m+1)}$
By the assumption $(d_m)^T A\, d_{m+1} = 0$, we get
$0 = (d_m)^T A \left( r_{m+1} + \beta_m d_m \right) \;\Rightarrow\; \beta_m = -\frac{(d_m)^T A\, r_{m+1}}{(d_m)^T A\, d_m}$
Algorithm – CG method

Choose initial guess $x^{(0)}$
Compute $r_0 = b - A x^{(0)}$ & set $d_0 = r_0$, $m = 0$
For $m = 0, 1, \ldots$ do
    Compute $\alpha_m = (r_m^T r_m) / (d_m^T A\, d_m)$
    $x^{(m+1)} = x^{(m)} + \alpha_m d_m$ & $r_{m+1} = r_m - \alpha_m A\, d_m$
    If $\|r_{m+1}\|_2 \le \varepsilon$ (e.g. $\varepsilon = 10^{-10}$), then stop
    else
        $\beta_m = (r_{m+1}^T r_{m+1}) / (r_m^T r_m)$ & $d_{m+1} = r_{m+1} + \beta_m d_m$
    endif
endfor
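A direct NumPy transcription of this algorithm. The function name and iteration cap are illustrative; note the single matrix-vector product per iteration:

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A by the CG method."""
    n = len(b)
    max_iter = max_iter or n          # exact in at most n steps (exact arithmetic)
    x = x0.astype(float).copy()
    r = b - A @ x
    d = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ad = A @ d                    # the only matvec per iteration
        alpha = rs / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol:
            break
        beta = rs_new / rs            # beta_m = (r_{m+1}^T r_{m+1}) / (r_m^T r_m)
        d = r + beta * d
        rs = rs_new
    return x
```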
An example

$A = \begin{pmatrix} 5 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 5 \end{pmatrix}, \quad b = \begin{pmatrix} 7 \\ 7 \\ 7 \end{pmatrix}, \quad$ exact solution $x^* = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$

Initial guess $x^{(0)} = (0.0000, 0.0000, 0.0000)^T$
The approximate solutions:
$r_0 = d_0 = (7.0000, 7.0000, 7.0000)^T, \quad \alpha_0 = 0.1429, \quad x^{(1)} = (1.0003, 1.0003, 1.0003)^T$
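These numbers can be reproduced with a few lines of NumPy; the exact step size is $\alpha_0 = 1/7 \approx 0.1429$, and the slides' $x^{(1)} = 1.0003$ reflects rounding $\alpha_0$ to four digits before the update:

```python
import numpy as np

A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])
b = np.array([7., 7., 7.])
r0 = d0 = b - A @ np.zeros(3)            # r_0 = d_0 = (7, 7, 7)^T
alpha0 = (r0 @ r0) / (d0 @ (A @ d0))     # = 1/7 = 0.142857...
x1 = np.zeros(3) + round(alpha0, 4) * d0 # rounded alpha gives (1.0003, ...)
print(alpha0, x1)
```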
CG method

In the CG method, $d_m$ & $d_{m+1}$ are A-orthogonal, i.e. $(d_m, d_{m+1})_A = (d_m)^T A\, d_{m+1} = 0$.
Define the linear space as
$\mathrm{span}\{d_0, d_1, \ldots, d_m\} := \left\{ y \;\middle|\; y = \sum_{i=0}^{m} c_i d_i,\; c_i \in \mathbb{R} \right\}$
Lemma: In the CG method, for $m = 0, 1, \ldots$, we have
$\mathrm{span}\{d_0, d_1, \ldots, d_m\} = \mathrm{span}\{r_0, r_1, \ldots, r_m\} = \mathrm{span}\{r_0, A r_0, \ldots, A^m r_0\}$
– Proof: See details in class or as an exercise
CG method

In the CG method, $d_{m+1}$ is A-orthogonal to $d_0, d_1, \ldots, d_m$, i.e. to $\mathrm{span}\{d_0, d_1, \ldots, d_m\}$ with respect to $(\cdot, \cdot)_A$.
Lemma: In the CG method, we have
$(d_i)^T A\, d_j = 0, \quad (r_i)^T r_j = 0, \quad i \ne j$
– Proof: See details in class or as an exercise
Thm: Error estimate for the CG method. For $m = 1, 2, \ldots$:
$\frac{\|e^{(m)}\|_A}{\|e^{(0)}\|_A} \le 2 \left( \frac{\sqrt{\kappa_2(A)} - 1}{\sqrt{\kappa_2(A)} + 1} \right)^m$
where $e^{(m)} := x^{(m)} - x^*$, $\|x\|_A = \sqrt{x^T A\, x}$, and $\kappa_2(A) = \lambda_{\max}(A) / \lambda_{\min}(A)$.
CG method

Computational cost:
– At each iteration, 2 matrix-vector multiplications; this can be further reduced to 1 matrix-vector multiplication (update $r_{m+1} = r_m - \alpha_m A\, d_m$ and reuse $A d_m$)
– In at most n steps, we get the exact solution (in exact arithmetic)!!!
Convergence rate depends on the condition number:
– $\kappa_2(A) = O(1)$: converges very fast!!
– $\kappa_2(A) \gg 1$: converges slowly, but can be accelerated by preconditioning!!
Preconditioning

Ideas: Replace $A x = b$ by $\tilde{A} \tilde{x} = \tilde{b}$ with
$\tilde{A} = C^{-1} A\, C^{-1}, \quad \tilde{x} = C x, \quad \tilde{b} = C^{-1} b$
satisfying:
– C is symmetric positive definite
– $\tilde{A}$ is well-conditioned, i.e. $\kappa_2(\tilde{A}) \ll \kappa_2(A)$
– $\tilde{A} \tilde{x} = \tilde{b}$ can be easily solved
Conditions for choosing the preconditioning matrix:
– $\kappa_2(\tilde{A})$ as small as possible
– C is easy to compute
– Trade-off between the two
Algorithm – PCG method

Choose initial guess $x^{(0)}$ & compute $r_0 = b - A x^{(0)}$
Solve $C \tilde{r}_0 = r_0$ & set $d_0 = \tilde{r}_0$, $m = 0$
For $m = 0, 1, \ldots$ do
    Compute $\alpha_m = (\tilde{r}_m^T r_m) / (d_m^T A\, d_m)$
    $x^{(m+1)} = x^{(m)} + \alpha_m d_m$ & $r_{m+1} = r_m - \alpha_m A\, d_m$
    If $\|r_{m+1}\|_2 \le \varepsilon$ (e.g. $\varepsilon = 10^{-10}$), then stop
    else
        Solve $C \tilde{r}_{m+1} = r_{m+1}$
        $\beta_m = (\tilde{r}_{m+1}^T r_{m+1}) / (\tilde{r}_m^T r_m)$ & $d_{m+1} = \tilde{r}_{m+1} + \beta_m d_m$
    endif
endfor
Preconditioning

Ways to choose the matrix C (read yourself):
– Diagonal part of A
– Tri-diagonal part of A
– m-step Jacobi preconditioner
– Symmetric Gauss-Seidel preconditioner
– SSOR preconditioner
– Incomplete Cholesky decomposition
– Incomplete block preconditioning
– Preconditioning based on domain decomposition
– …
A sketch of PCG with the simplest (diagonal) choice follows.
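A minimal NumPy sketch of the PCG loop above, using the diagonal (Jacobi) preconditioner $C = \mathrm{diag}(A)$ from the list so that each solve $C \tilde{r} = r$ is trivial. The function name and iteration cap are illustrative:

```python
import numpy as np

def pcg(A, b, x0, tol=1e-10, max_iter=1000):
    """Preconditioned CG with the Jacobi preconditioner C = diag(A)."""
    C = np.diag(A).copy()             # SPD A has a positive diagonal
    x = x0.astype(float).copy()
    r = b - A @ x
    rt = r / C                        # solve C r~ = r (diagonal solve)
    d = rt.copy()
    rho = rt @ r
    for _ in range(max_iter):
        Ad = A @ d
        alpha = rho / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        if np.linalg.norm(r) <= tol:
            break
        rt = r / C                    # solve C r~ = r again
        rho_new = rt @ r
        beta = rho_new / rho          # beta_m = (r~_{m+1}^T r_{m+1}) / (r~_m^T r_m)
        d = rt + beta * d
        rho = rho_new
    return x
```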
Extension of the CG method to nonsymmetric matrices

Biconjugate gradient (BiCG) method:
– Solve $A x = b$ & $A^T y = b$ simultaneously
– Works well when A is positive definite but not symmetric
– If A is symmetric, BiCG reduces to CG
Conjugate gradient squared (CGS) method:
– Useful when A has a special formula for computing $Ax$ but its transpose has not
– Multiplication by A is efficient, but multiplication by its transpose is not
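In practice one rarely hand-codes these; assuming SciPy is available, its `scipy.sparse.linalg.bicg` and `cgs` routines implement the two methods just described (the small nonsymmetric test matrix is an illustrative assumption):

```python
import numpy as np
from scipy.sparse.linalg import bicg, cgs

# A nonsymmetric, positive definite test matrix (illustrative)
A = np.array([[4., 1., 0.],
              [2., 5., 1.],
              [0., 1., 3.]])
b = np.array([1., 2., 3.])

x_bicg, info1 = bicg(A, b)   # info == 0 signals successful convergence
x_cgs,  info2 = cgs(A, b)
print(x_bicg, info1)
print(x_cgs, info2)
```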
Krylov subspace methods

Problem I. Linear system: $A x = b$
Problem II. Variational formulation: find $x \in \mathbb{R}^n$ such that $(A x, v) = (b, v)$ for all $v \in \mathbb{R}^n$
Problem III. Minimization problem:
$\min_{x \in \mathbb{R}^n} \phi(x) := \tfrac{1}{2} x^T A\, x - x^T b = \tfrac{1}{2} (A x, x) - (b, x)$
– Thm 1: Problem I is equivalent to Problem II
– Thm 2: If A is symmetric positive definite, they are all equivalent
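A quick numeric sanity check of Thm 2 for an SPD matrix: $\phi$ is minimized exactly at the solution of $Ax = b$, with minimum value $-\tfrac{1}{2} b^T A^{-1} b$. The test matrix and random perturbations are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])  # SPD
b = np.array([7., 7., 7.])
phi = lambda x: 0.5 * x @ A @ x - x @ b

x_star = np.linalg.solve(A, b)                 # Problem I
# phi at the solution is smaller than at any perturbed point (Problem III)
perturbed = [x_star + 0.1 * rng.standard_normal(3) for _ in range(5)]
assert all(phi(x_star) < phi(x) for x in perturbed)
print(phi(x_star), -0.5 * b @ x_star)          # both equal -(1/2) b^T A^{-1} b
```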
Krylov subspace methods

To reduce the problem size, we replace $\mathbb{R}^n$ by a subspace
$S_m = \mathrm{span}\{d_0, d_1, \ldots, d_{m-1}\}$ with an initial guess $x^{(0)}$, i.e.
$x^{(m)} = x^{(0)} + \sum_{k=0}^{m-1} \gamma_k d_k$
Subspace minimization:
– Find $x^{(m)} \in x^{(0)} + S_m$
– Such that $\phi(x^{(m)}) = \min_{x \in x^{(0)} + S_m} \phi(x) = \min_{x \in x^{(0)} + S_m} \left( \tfrac{1}{2} x^T A\, x - x^T b \right)$
Subspace projection:
– Find $x^{(m)} \in x^{(0)} + S_m$
– Such that $(A x^{(m)} - b, v) = 0$ for all $v \in S_m$, i.e. $(A x^{(m)} - b, d_k) = 0$, $0 \le k \le m - 1$
Krylov subspace methods

To determine the coefficients, we have the Normal Equations:
$\sum_{l=0}^{m-1} \left( d_k^T A\, d_l \right) \gamma_l = d_k^T \left( b - A x^{(0)} \right) = d_k^T r_0, \quad k = 0, 1, \ldots, m-1$
– It is a linear system of size m!!
m = 1: line minimization (or linear search, or 1D projection):
$\gamma_0 = \frac{d_0^T r_0}{d_0^T A\, d_0}, \qquad x^{(1)} = x^{(0)} + \gamma_0 d_0$
By converting this formula into an iteration, we reduce the original problem to a sequence of line minimizations (successive line minimization).
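A small NumPy sketch of this projection step: assemble the m-by-m Gram matrix $G_{kl} = d_k^T A\, d_l$, solve the normal equations for $\gamma$, and form $x^{(m)}$. Taking the directions as the columns of a random matrix is purely illustrative:

```python
import numpy as np

def subspace_projection(A, b, x0, D):
    """Solve the normal equations over S_m = span of the columns of D."""
    r0 = b - A @ x0
    G = D.T @ A @ D                 # G[k, l] = d_k^T A d_l  (m x m)
    gamma = np.linalg.solve(G, D.T @ r0)
    return x0 + D @ gamma

rng = np.random.default_rng(1)
n, m = 6, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)         # SPD test matrix (illustrative)
b = rng.standard_normal(n)
D = rng.standard_normal((n, m))     # m linearly independent directions
x_m = subspace_projection(A, b, np.zeros(n), D)
# Galerkin condition: the residual is orthogonal to every direction d_k
print(D.T @ (b - A @ x_m))          # approx zeros
```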
For symmetric matrices

Positive definite:
– Steepest descent method: $d_k = r_k$
– CG method: $d_k = r_k + \beta_{k-1} d_{k-1}$ with $(d_k, d_{k-1})_A = 0$
– Preconditioned CG method
Non-positive definite:
– MINRES (minimum residual method):
$\min \| b - A x^{(m)} \|_2$ over $x^{(m)} \in x^{(0)} + S_m$, $\quad S_m = \mathrm{span}\{r_0, A r_0, \ldots, A^{m-1} r_0\} = K_m(A, r_0)$
For nonsymmetric matrices

Normal equations method (or CGNR method):
– Solve $\tilde{A} x = \tilde{b}$ with $\tilde{A} = A^T A$ & $\tilde{b} = A^T b$
GMRES (generalized minimum residual method):
– Saad & Schultz, 1986
– Ideas:
• In the m-th step, minimize the residual $\| b - A x^{(m)} \|_2$ over the set $x^{(m)} \in x^{(0)} + S_m$, $S_m = \mathrm{span}\{r_0, A r_0, \ldots, A^{m-1} r_0\} = K_m(A, r_0)$
• Use Arnoldi (full orthogonalization) vectors instead of Lanczos vectors
• If A is symmetric, it reduces to the conjugate residual method
Algorithm – GMRES

Choose initial guess $x^{(0)}$
Compute $r_0 = b - A x^{(0)}$, $h_{1,0} = \|r_0\|_2$ & set $k = 0$
while $h_{k+1,k} \ne 0$
    $q_{k+1} = r_k / h_{k+1,k}$ & $k = k + 1$
    $r_k = A q_k$
    for $i = 1, \ldots, k$
        $h_{i,k} = q_i^T r_k$ & $r_k = r_k - h_{i,k} q_i$
    end
    $h_{k+1,k} = \|r_k\|_2$
    $x^{(k)} = x^{(0)} + Q_k y_k$ with $y_k = \arg\min_y \| h_{1,0} e_1 - \bar{H}_k y \|_2$
until $\|x^{(k)} - x^{(k-1)}\| \le \varepsilon$ (e.g. $\varepsilon = 10^{-10}$)
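A compact NumPy transcription of this algorithm, using full Arnoldi orthogonalization and a dense least-squares solve for $y_k$. The unrestarted structure, function name, and stopping tests are illustrative choices:

```python
import numpy as np

def gmres(A, b, x0, tol=1e-10, max_iter=None):
    """Minimal (unrestarted) GMRES for a general square matrix A."""
    n = len(b)
    max_iter = max_iter or n
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)          # h_{1,0} = ||r_0||_2
    if beta <= tol:
        return x0
    Q = np.zeros((n, max_iter + 1))    # Arnoldi basis vectors q_1, q_2, ...
    H = np.zeros((max_iter + 1, max_iter))
    Q[:, 0] = r0 / beta
    x = x0
    for k in range(max_iter):
        w = A @ Q[:, k]                # new Krylov direction
        for i in range(k + 1):         # orthogonalize against q_1, ..., q_{k+1}
            H[i, k] = Q[:, i] @ w
            w -= H[i, k] * Q[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        # y_k = argmin || beta*e_1 - H_bar y ||_2 over the leading block
        e1 = np.zeros(k + 2)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 2, :k + 1], e1, rcond=None)
        x = x0 + Q[:, :k + 1] @ y
        if H[k + 1, k] <= tol or np.linalg.norm(b - A @ x) <= tol:
            break                      # lucky breakdown or converged
        Q[:, k + 1] = w / H[k + 1, k]
    return x

A = np.array([[4., 1., 0.], [2., 5., 1.], [0., 1., 3.]])   # nonsymmetric test
b = np.array([1., 2., 3.])
print(gmres(A, b, np.zeros(3)), np.linalg.solve(A, b))
```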
More topics on matrix computations

Eigenvalue & eigenvector computations: $A x = \lambda x$, $x \ne 0 \in \mathbb{R}^n$
If A is symmetric: Power method
If A is a general matrix:
– Householder matrix (transform): $P = I - \frac{2 v v^T}{v^T v}$, $v \ne 0$, with $P = P^T$ & $P^T P = P^2 = I$
– QR method: $A = Q R$ with Q an orthogonal matrix & R an upper triangular matrix;
$U^T A\, U = C$ an upper Hessenberg matrix, with $U = P_1 P_2 \cdots P_{n-2}$
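A minimal sketch of the power method mentioned above, for the dominant eigenpair. The Rayleigh-quotient estimate, tolerance, and test matrix are illustrative assumptions:

```python
import numpy as np

def power_method(A, tol=1e-12, max_iter=10_000):
    """Approximate the dominant eigenvalue/eigenvector of a symmetric matrix A."""
    rng = np.random.default_rng(0)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    lam = 0.0
    for _ in range(max_iter):
        y = A @ x
        lam_new = x @ y                 # Rayleigh quotient estimate x^T A x
        x = y / np.linalg.norm(y)
        if abs(lam_new - lam) <= tol:
            break
        lam = lam_new
    return lam, x

A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])
lam, x = power_method(A)
print(lam)                              # approx 7 (the dominant eigenvalue)
```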
More topics on matrix computations

Singular value decomposition (SVD)
Thm: Let A be an m-by-n real matrix. There exist orthogonal matrices
$U = [u_1, u_2, \ldots, u_m] \in \mathbb{R}^{m \times m}$ & $V = [v_1, v_2, \ldots, v_n] \in \mathbb{R}^{n \times n}$
such that
$U^T A\, V = \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_p) \in \mathbb{R}^{m \times n}, \quad p = \min\{m, n\}$
with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0 = \sigma_{r+1} = \cdots = \sigma_p$ & $r = \mathrm{rank}(A)$.
Proof: Exercise
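This factorization is available directly in NumPy; a quick numeric check of the theorem (the random test matrix is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))            # m = 4, n = 3, p = 3

U, s, Vt = np.linalg.svd(A)                # U: 4x4, s: singular values, Vt = V^T
print(np.all(s[:-1] >= s[1:]))             # sigma_1 >= sigma_2 >= ...  (True)

Sigma = np.zeros((4, 3))
Sigma[:3, :3] = np.diag(s)
print(np.allclose(U.T @ A @ Vt.T, Sigma))  # U^T A V = Sigma  (True)
```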