
4.4. Gaussian Elimination for Sparse Matrices

4.4.1. Algebraic Pivoting in GE

• Numerical pivoting: for eliminating the elements in column k, choose the large(st) entry in column/row/block k and permute this element onto the diagonal position.

• Disadvantage: may lead to large fill-in in the sparsity pattern of A.

• Idea: Choose the pivot element according to minimum fill-in! Note that for well-conditioned A = A^T > 0 no numerical pivoting is necessary.

• Heuristic: Choose the pivot element according to its degree in the graph → minimum degree reordering


Special Case A = A^T

• For elimination in the k-th column of A:
  – Define r_m := number of nonzero entries in row m
  – Choose the pivot index i by r_i = min_m r_m
  – Do the pivot permutation and the elimination
  – Go to the next column k
• r_m = number of nonzeros in the m-th row = number of vertices directly connected with vertex m. Hence, the pivot vertex is the vertex with minimum degree in G(A_k).
• Heuristics: few entries in the m-th row/column → little fill-in, because
  – there are only few elements to eliminate
  – the pivot row used in the elimination is very sparse

→ Multiple minimum degree reordering
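The effect of a degree-based reordering on fill-in can be checked directly in MATLAB. The following is a minimal sketch, assuming the built-in 5-point Laplacian test problem (delsq/numgrid) and the symamd ordering as stand-ins for the heuristic described above:

% Illustrative sketch: Cholesky fill-in with and without a
% minimum-degree-type reordering (symamd) for an SPD model problem.
A = delsq(numgrid('L', 60));   % sparse SPD 5-point Laplacian on an L-shaped grid
p = symamd(A);                 % symmetric approximate minimum degree ordering
nnz(chol(A))                   % fill-in with the natural ordering
nnz(chol(A(p,p)))              % fill-in after reordering: considerably smaller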


Multiple Minimum Degree Reordering (MMD)

• Often, many vertices may have the minimum degree. Which one should be chosen?
• Choose an independent set: indices with the same minimum degree that are not neighbors of each other.
  – Eliminating any of them has no effect on the degree of the others, hence we can eliminate them in the Gaussian elimination process in parallel (compare multifrontal methods).
  – It also reduces the runtime because the outer loop runs fewer than n times.
  – It often leads to fewer nonzeros.
• Approximate Minimum Degree Reordering (AMD) uses approximations of the degrees.


Generalization to Nonsymmetric Problems: Markowitz

• Define r_m := nnz in row m; c_p := nnz in column p
• Choose the pivot element with index pair (i, j) such that

(r_i − 1)(c_j − 1) = min_{m,p} (r_m − 1)(c_p − 1)

• Heuristics:
  – small c_j leads to few elimination steps
  – small r_i leads to a sparse pivot row used in the elimination
• Special case r_i = 1 or c_j = 1: no fill-in.
• Include numerical pivoting by applying algebraic pivoting only to indices whose absolute value is not too small, e.g.,

|a_{i,j}| ≥ 0.1 · max_{r,s} |a_{r,s}|
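A minimal MATLAB sketch of one such pivot selection step is given below; markowitz_pivot is a hypothetical helper (not a library routine) that evaluates the Markowitz cost over the nonzero entries and applies the 0.1-threshold:

function [i, j] = markowitz_pivot(A)
% Hypothetical sketch: pick (i,j) minimizing (r_i-1)*(c_j-1) among the
% nonzeros of the sparse matrix A that pass the numerical threshold.
  r = full(sum(A ~= 0, 2));                % nonzeros per row
  c = full(sum(A ~= 0, 1))';               % nonzeros per column
  [I, J, V] = find(A);                     % candidate pivots = nonzero entries
  cost = (r(I) - 1) .* (c(J) - 1);         % Markowitz cost of each candidate
  cost(abs(V) < 0.1 * max(abs(V))) = Inf;  % exclude entries that are too small
  [~, k] = min(cost);
  i = I(k);  j = J(k);
end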


Comparison

Example: "arrow" matrix with dense first row and first column, otherwise diagonal:

    [ ∗ ∗ ∗ . . . ∗ ]
    [ ∗ ∗           ]
A = [ ∗   ∗         ]
    [ .       .     ]
    [ ∗           ∗ ]

The first elimination step leads to a dense matrix:

[ ∗ ∗ ∗ . . . ∗ ]
[ 0 ∗ ∗ . . . ∗ ]
[ 0 ∗ ∗ . . . ∗ ]
[ .   .     .   ]
[ 0 ∗ ∗ . . . ∗ ]


Comparison (cont.)

Cuthill-McKee:

With starting vertex 1: keeps the numbering unchanged → no improvement.

Even the optimal bandwidth is not satisfactory: bandwidth n/2.


Comparison (cont. 2)

Minimum degree is efficient for this example (PAP^T):

Optimal reordering in one step: 1 ↔ n. Optimal costs: O(n).

[ ∗ 0 0 . . . ∗ ]
[ 0 ∗       0 ∗ ]
[ 0   ∗       . ]
[ .       .   ∗ ]
[ ∗ ∗ . . . ∗ ∗ ]
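This comparison can be reproduced numerically; the following minimal sketch builds an assumed symmetric, diagonally dominant arrow matrix (so that chol applies) and compares the fill-in of the factor for the natural and the minimum-degree ordering:

% Illustrative check of the arrow-matrix example: the natural ordering
% fills the factor completely, the reordering 1 <-> n keeps it as sparse as A.
n = 200;
A = speye(n);
A(1, 2:n) = -1/(2*n);  A(2:n, 1) = -1/(2*n);   % symmetric arrow matrix, SPD
p = symamd(A);                                 % minimum degree: hub vertex last
[nnz(chol(A)), nnz(chol(A(p,p)))]              % ~n^2/2 versus O(n) nonzeros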


Example: Nonsymmetric Permutation

Costs: O(n^2)

Test: MATLAB (symamd, colamd, amd, colperm, symrcm)

load('west0479.mat');      % load the test matrix
a = west0479;
s = a'*a;                  % symmetrize the sparsity pattern
p = symamd(s);             % symmetric approximate minimum degree ordering
spy(s(p,p));               % plot the reordered sparsity pattern

See matrix ’west0479.mat’ on Matrix Market:

http://math.nist.gov/MatrixMarket/data/Harwell-Boeing/chemwest/west0479.html
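Along the same lines, a hedged sketch comparing the LU fill-in with and without a column reordering (assuming west0479 has been loaded as above):

q = colamd(a);                           % column approximate minimum degree
[L1, U1, P1] = lu(a);                    % LU with the natural column order
[L2, U2, P2] = lu(a(:, q));              % LU with reordered columns
[nnz(L1) + nnz(U1), nnz(L2) + nnz(U2)]   % the reordering reduces the fill-in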


4.4.2. Gaussian Elimination in the Graph

• Gaussian elimination can be modelled purely algebraically, without numerical computations, by computing the sequence of related graphs (in terms of dense subgraphs (submatrices) = cliques).
• Modification of GE:
  1. Apply an algebraic prestep for GE, determining the graphs related to the elimination matrices A_k in GE.
  2. Based on these graphs we can decide
     • whether GE will lead to nearly dense matrices (do not use GE in this case!)
     • what additional entries will appear during GE (prepare the storage)
• The algebraic prestep is cheap and can be implemented using cliques.
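For the symmetric case, MATLAB exposes such an algebraic prestep as symbfact, which predicts the nonzero structure of the factor without any numerical elimination; a minimal sketch with an assumed SPD model problem:

A = delsq(numgrid('S', 40));   % SPD model problem (5-point Laplacian)
count = symbfact(A);           % symbolic Cholesky factorization: predicted nonzero counts
sum(count)                     % predicted nnz of chol(A) -- size up storage before any numerical work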


4.4.3. A Parallel Direct Solver

Frontal methods for band matrices (from PDEs)

• Frontal dense matrix of size (β + 1) × (2β + 1)
• Move the frontal matrix to the CPU and treat it as a dense matrix.
• Then move the frontal matrix one entry down-right and do the next elimination.
• Repeat until done.
• No parallelism so far.
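A minimal MATLAB sketch of this procedure for a band matrix with half bandwidth β is shown below; it assumes no pivoting is needed, so the active window is only (β+1) × (β+1) (the (β+1) × (2β+1) front above leaves room for pivoting within the band):

function A = band_ge(A, beta)
% Sketch of band Gaussian elimination: in step k only a small dense
% window ("frontal matrix") next to the diagonal is updated.
  n = size(A, 1);
  for k = 1:n-1
    idx = k+1 : min(k+beta, n);                         % rows/columns of the front
    A(idx, k) = A(idx, k) / A(k, k);                    % multipliers, stored in place
    A(idx, idx) = A(idx, idx) - A(idx, k) * A(k, idx);  % dense rank-1 update of the front
  end
end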


Multifrontal in Parallel

To make efficient use of parallelism, search for "independent" elimination steps!

A_{1,1} as the first pivot element is related to a first frontal matrix that contains all the information needed to eliminate the first column. Dense submatrix for k = 1:


Multifrontal in Parallel (cont.)

Because A_{1,2} = A_{2,1} = 0 we can already consider, in parallel, the frontal matrix related to k = 2, the second step:


Multifrontal in Parallel (cont. 2)

The computations for k = 1 and k = 2 are independent and can be done in parallel (in the 2 × 2 block):

k = 1:  A_{i,j} → A_{i,j} − A_{i,1} · A_{1,j} / A_{1,1}

k = 2:  A_{i,j} → A_{i,j} − A_{i,2} · A_{2,j} / A_{2,2}

The number of frontal matrices that can be used in parallel depends on the sparsity pattern.
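A small illustrative check of this independence (elim is a hypothetical helper; the test matrix is an arbitrary diagonally dominant example):

% With A(1,2) = A(2,1) = 0 the two elimination steps commute,
% so they can run in parallel.
n = 6;  rng(1);
A = rand(n) + n*eye(n);  A(1,2) = 0;  A(2,1) = 0;
B = elim(elim(A, 1), 2);     % eliminate column 1 first, then column 2
C = elim(elim(A, 2), 1);     % eliminate column 2 first, then column 1
norm(B - C)                  % ~ 0 (up to rounding)

function A = elim(A, k)      % eliminate column k below the diagonal
  m = A(k+1:end, k) / A(k, k);
  A(k+1:end, :) = A(k+1:end, :) - m * A(k, :);
end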


5. Iterative Methods for Sparse Linear Systems of Equations


Contents

1 Introduction
  1.1 Computer Science Aspects
  1.2 Numerical Problems
  1.3 Graphs
  1.4 Loop Manipulations
2 Elementary Linear Algebra Problems
  2.1 BLAS: Basic Linear Algebra Subroutines
  2.2 Matrix-Vector Operations
  2.3 Matrix-Matrix-Product
3 Linear Systems of Equations with Dense Matrices
  3.1 Gaussian Elimination
  3.2 Parallelization
  3.3 QR-Decomposition with Householder matrices
4 Sparse Matrices
  4.1 General Properties, Storage
  4.2 Sparse Matrices and Graphs
  4.3 Reordering
  4.4 Gaussian Elimination for Sparse Matrices
5 Iterative Methods for Sparse Linear Systems of Equations
  5.1 Stationary Methods
  5.2 Nonstationary Methods
  5.3 Preconditioning
6 Domain Decomposition
  6.1 Overlapping Domain Decomposition
  6.2 Non-overlapping Domain Decomposition
  6.3 Schur Complements


• Disadvantages of direct methods (in parallel):
  – strongly sequential
  – may lead to dense matrices
  – the sparsity pattern changes, additional entries become necessary
  – indirect addressing
  – storage
  – computational effort
• Iterative solver:
  – choose an initial guess = starting vector x^(0), e.g., x^(0) = 0
  – iteration function x^(k+1) := Φ(x^(k))
• Applied to solving a linear system:
  – The main part of Φ should be a matrix-vector multiplication Ax (matrix-free!?)
  – Easy to parallelize, no change in the pattern of A.

x^(k) → x̄ = A^{-1}b for k → ∞

  – Main problem: fast convergence!


5.1. Stationary Methods

5.1.1. Richardson Iteration

• Construct from Ax = b an iteration process:

b = Ax = (A − I + I)x    [(artificial) splitting of A]
       = x − (I − A)x  ⇒  x = b + (I − A)x = b + Nx   (with N := I − A)

• This leads to a fixed-point equation x = Φ(x) with Φ(x) := b + Nx:

start: x^(0);
x^(k+1) := Φ(x^(k)) = b + Nx^(k) = b + (I − A)x^(k)


Richardson Iteration (cont.)

start: x^(0);
x^(k+1) := Φ(x^(k)) = b + Nx^(k) = b + (I − A)x^(k)

If x^(k) converges, x^(k) → x̃, then

x̃ = Φ(x̃) = b + Nx̃ = b + (I − A)x̃  ⇒  Ax̃ = b

and therefore it holds

x^(k) → x̃ = x̄ := A^{-1}b

Residual-based formulation:

x^(k+1) = Φ(x^(k)) = b + (I − A)x^(k) = b + x^(k) − Ax^(k)
        = x^(k) + (b − Ax^(k)) = x^(k) + r(x^(k)),

where r(x) := b − Ax is the residual.
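A minimal MATLAB sketch of the residual form (the test matrix is an assumption, chosen close to I so that the iteration converges):

n = 100;  rng(0);
A = speye(n) + 0.1*sprandn(n, n, 0.05);   % assumption: A ~ I, so rho(I - A) < 1
b = randn(n, 1);
x = zeros(n, 1);                          % start x^(0) = 0
for k = 1:200
  r = b - A*x;                            % residual r(x^(k))
  x = x + r;                              % Richardson step x^(k+1) = x^(k) + r^(k)
  if norm(r) < 1e-10, break, end
end
norm(A*x - b)                             % small: the iteration has converged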


Convergence Analysis via Neumann Series

x^(k) = b + Nx^(k−1) = b + N(b + Nx^(k−2)) = b + Nb + N^2 x^(k−2) = . . .
      = b + Nb + N^2 b + · · · + N^(k−1) b + N^k x^(0)
      = (Σ_{j=0}^{k−1} N^j) b + N^k x^(0)

Special case x^(0) = 0:

x^(k) = (Σ_{j=0}^{k−1} N^j) b

⇒ x^(k) ∈ span{b, Nb, N^2 b, . . . , N^(k−1) b} = span{b, Ab, A^2 b, . . . , A^(k−1) b} = K_k(A, b),

which is called the Krylov space of A and b.

For ‖N‖ < 1 it holds:

Σ_{j=0}^{k−1} N^j → Σ_{j=0}^{∞} N^j = (I − N)^{-1} = (I − (I − A))^{-1} = A^{-1}


Convergence Analysis via Neumann Series (cont.)

x^(k) → (Σ_{j=0}^{∞} N^j) b = (I − N)^{-1} b = A^{-1} b = x̄

The Richardson iteration converges for ‖N‖ < 1, i.e. A ≈ I.

Error analysis for e^(k) := x^(k) − x̄:

e^(k+1) = x^(k+1) − x̄ = Φ(x^(k)) − Φ(x̄) = (b + Nx^(k)) − (b + Nx̄) = N(x^(k) − x̄) = Ne^(k)

‖e^(k)‖ ≤ ‖N‖ ‖e^(k−1)‖ ≤ ‖N‖^2 ‖e^(k−2)‖ ≤ · · · ≤ ‖N‖^k ‖e^(0)‖

‖N‖ < 1 ⇒ ‖N‖^k → 0 ⇒ ‖e^(k)‖ → 0 for k → ∞

• Convergence if ρ(N) = ρ(I − A) < 1, where ρ is the spectral radius:
  ρ(N) = |λmax| = max_i |λ_i|   (λ_i eigenvalue of N)
• All eigenvalues of A have to lie in the circle around 1 with radius 1.


Splittings of A

• Richardson converges only in very special cases! Try to improve the iteration for better convergence.
• Write A in the form A := M − N:

b = Ax = (M − N)x = Mx − Nx  ⇔  x = M^{-1}b + M^{-1}Nx = Φ(x)

Φ(x) = M^{-1}b + M^{-1}Nx = M^{-1}b + M^{-1}(M − A)x = M^{-1}(b − Ax) + x = x + M^{-1}r(x)

• N should be such that Ny can be evaluated efficiently.
• M should be such that M^{-1}y can be evaluated efficiently.

x^(k+1) = x^(k) + M^{-1}r^(k)

• The iteration with splitting M − N is equivalent to Richardson applied to

M^{-1}Ax = M^{-1}b


Convergence

• The iteration with splitting A = M − N converges if

ρ(M^{-1}N) = ρ(I − M^{-1}A) < 1

• For fast convergence it should hold:
  – M^{-1}A ≈ I
  – M^{-1}A should be better conditioned than A itself
• Such a matrix M is called a preconditioner for A. It is used in other iterative methods to accelerate convergence.
• Condition number:

κ(A) = ‖A^{-1}‖ ‖A‖,  |λmax/λmin|,  or  σmax/σmin


5.1.2. Jacobi (Diagonal) Splitting

Choose A = M − N = D − (L + U) with
  D = diag(A),
  L the (strictly) lower triangular part, and
  U the (strictly) upper triangular part.

x^(k+1) = D^{-1}b + D^{-1}(L + U)x^(k) = D^{-1}b + D^{-1}(D − A)x^(k) = x^(k) + D^{-1}r^(k)

Convergent for A ≈ diag(A), i.e. for diagonally dominant matrices:

ρ(D^{-1}N) = ρ(I − D^{-1}A) < 1
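A minimal MATLAB sketch of the Jacobi iteration for an assumed diagonally dominant tridiagonal test matrix:

n = 100;
e = ones(n, 1);
A = spdiags([-0.5*e, 2*e, -0.5*e], -1:1, n, n);   % diagonally dominant tridiagonal
b = ones(n, 1);
x = zeros(n, 1);
d = full(diag(A));                                % the splitting matrix M = D = diag(A)
for k = 1:500
  r = b - A*x;
  x = x + r ./ d;                                 % x^(k+1) = x^(k) + D^{-1} r^(k)
  if norm(r) < 1e-10, break, end
end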


Jacobi (Diagonal) Splitting (cont.)

The iteration process written elementwise:

x^(k+1) = D^{-1}(b − (A − D)x^(k))  ⇒  x_j^(k+1) = (1/a_jj) (b_j − Σ_{m=1, m≠j}^{n} a_{j,m} x_m^(k))

a_jj x_j^(k+1) = b_j − Σ_{m=1}^{j−1} a_{j,m} x_m^(k) − Σ_{m=j+1}^{n} a_{j,m} x_m^(k)

• Damping or relaxation for improving the convergence
• Idea: view the iterative method as a correction of the last iterate in a search direction.
• Introduce a step length for this correction step:

x^(k+1) = x^(k) + D^{-1}r^(k)  →  x^(k+1) = x^(k) + ω D^{-1}r^(k)

with an additional damping parameter ω.
• Damped Jacobi iteration:

x_damped^(k+1) = (ω + 1 − ω)x^(k) + ω D^{-1}r^(k) = ω x^(k+1) + (1 − ω)x^(k)


Damped Jacobi Iteration

x^(k+1) = x^(k) + ω D^{-1}r^(k) = x^(k) + ω D^{-1}(b − Ax^(k)) = . . .
        = ω D^{-1}b + [(1 − ω)I + ω D^{-1}(L + U)] x^(k)

is convergent for

ρ((1 − ω)I + ω D^{-1}(L + U)) < 1

(note that the iteration matrix tends to I for ω → 0).

Look for the optimal ω with the best convergence (additional degree of freedom).


Parallelism in the Jacobi Iteration

• The Jacobi method is easy to parallelize: only Ax and D^{-1}x are needed.
• But convergence is often too slow!
• Improvement: block Jacobi iteration


5.1.3. Gauss-Seidel Iteration

Always use the newest information available!

Jacobi iteration:

a_jj x_j^(k+1) = b_j − Σ_{m=1}^{j−1} a_{j,m} x_m^(k) − Σ_{m=j+1}^{n} a_{j,m} x_m^(k)
                        (the first sum involves values that are already computed)

Gauss-Seidel iteration:

a_jj x_j^(k+1) = b_j − Σ_{m=1}^{j−1} a_{j,m} x_m^(k+1) − Σ_{m=j+1}^{n} a_{j,m} x_m^(k)
                        (use the already computed new values in the first sum)


Gauss-Seidel Iteration (cont.)

• Compare the dependency graphs for general iterative algorithms. Here:

x = f(x) = D^{-1}(b + (D − A)x) = D^{-1}(b + (L + U)x)

leads to the splitting A = (D − L) − U = M − N:

x^(k+1) = (D − L)^{-1}b + (D − L)^{-1}Ux^(k)
        = (D − L)^{-1}b + (D − L)^{-1}(D − L − A)x^(k)
        = x^(k) + (D − L)^{-1}r^(k)

• Convergence depends on the spectral radius: ρ(I − (D − L)^{-1}A) < 1
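A minimal MATLAB sketch of Gauss-Seidel in this form, with the same assumed tridiagonal test matrix as before; the splitting matrix M = D − L is the lower triangle of A and is applied by a sparse triangular solve:

n = 100;
e = ones(n, 1);
A = spdiags([-0.5*e, 2*e, -0.5*e], -1:1, n, n);
b = ones(n, 1);
x = zeros(n, 1);
M = tril(A);                        % M = D - L (lower triangular part incl. diagonal)
for k = 1:500
  r = b - A*x;
  x = x + M \ r;                    % triangular solve: strongly sequential step
  if norm(r) < 1e-10, break, end
end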


Parallelism in the Gauss-Seidel Iteration

• The linear system with D − L is easy to solve because D − L is lower triangular, but
• it is strongly sequential!
• Use red-black ordering or graph colouring as a compromise:

parallelism ↔ convergence


Successive Over-Relaxation (SOR)

• Damping or relaxation:

x^(k+1) = x^(k) + ω(D − L)^{-1}r^(k) = ω(D − L)^{-1}b + [(1 − ω)I + ω(D − L)^{-1}U]x^(k)

• Convergence depends on the spectral radius of the iteration matrix

(1 − ω)I + ω(D − L)^{-1}U

• Parallelization of SOR == parallelization of Gauss-Seidel


Stationary Methods (in General)

• Stationary methods can always be written in the two normal forms

x^(k+1) = c + Bx^(k)

with convergence depending on ρ(B), and

x^(k+1) = x^(k) + Fr^(k)

with preconditioner F and B = I − FA.

• For x^(0) = 0:

x^(k) ∈ K_k(B, c),

which is the Krylov space with respect to the matrix B and the vector c.

• Slow convergence (but good smoothing properties! → multigrid)


MATLAB Example

• clear; n=100; k=10; omega=1; stationary
• tridiag(−0.5, 1, −0.5):
  – Jacobi: norm |cos(x)|
  – Gauss-Seidel: norm sin(x) / √(2(1 − cos(x)))
  – both < 1 → convergence, but slow
• To improve convergence → nonstationary methods (or multigrid)


Chair of Informatics V (SCCS)
Efficient Numerical Algorithms, Parallel & HPC

• High-dimensional numerics (sparse grids)
• Fast iterative solvers (multi-level methods, preconditioners)
• Adaptive, octree-based grid generation
• Space-filling curves
• Numerical linear algebra
• Numerical algorithms for image processing
• HW-aware numerical programming

Fields of application in simulation:

• CFD (incl. fluid-structure interaction)
• Plasma physics
• Molecular dynamics
• Quantum chemistry

Further info → www5.in.tum.de
Feel free to come around and ask for thesis topics!


5.2. Nonstationary Methods

5.2.1. Gradient Method

• Consider A = A^T > 0 (A SPD)
• Function Φ(x) = (1/2) x^T A x − b^T x
• n-dimensional paraboloid R^n → R
• Gradient ∇Φ(x) = Ax − b
• The position with ∇Φ(x) = 0 is exactly the minimum of the paraboloid
• Instead of solving Ax = b, consider min_x Φ(x)
• Local descent direction in y: ∇Φ(x) · y is minimal for y = −∇Φ(x)


Gradient Method (cont.)

• Optimization: start with x^(0),

x^(k+1) := x^(k) + α_k d^(k)

with search direction d^(k) and step size α_k.

• In view of the previous results, the optimal (local) search direction is

d^(k) := −∇Φ(x^(k)) = b − Ax^(k)

• To define α_k, minimize over the step length:

min_α g(α) := min_α Φ(x^(k) + α(b − Ax^(k)))
            = min_α ( (1/2)(x^(k) + αd^(k))^T A (x^(k) + αd^(k)) − b^T (x^(k) + αd^(k)) )
            = min_α ( (1/2)α^2 d^(k)T A d^(k) − α d^(k)T d^(k) + (1/2) x^(k)T A x^(k) − x^(k)T b )

α_k = (d^(k)T d^(k)) / (d^(k)T A d^(k)),   d^(k) = −∇Φ(x^(k)) = b − Ax^(k)
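A minimal MATLAB sketch of the resulting steepest descent method, again with an assumed SPD tridiagonal test matrix:

n = 100;
e = ones(n, 1);
A = spdiags([-e, 2*e, -e], -1:1, n, n);   % SPD test matrix
b = ones(n, 1);
x = zeros(n, 1);
for k = 1:10000
  d = b - A*x;                            % steepest descent direction = residual
  if norm(d) < 1e-8, break, end
  alpha = (d'*d) / (d'*(A*d));            % exact one-dimensional line search
  x = x + alpha*d;
end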


Gradient Method (cont. 2)

x^(k+1) = x^(k) + (‖b − Ax^(k)‖_2^2 / ((b − Ax^(k))^T A (b − Ax^(k)))) (b − Ax^(k))

• Method of steepest descent.
• Disadvantage: distorted contour lines.
• Slow convergence (zig-zag path)
• The local descent direction is not globally optimal.


Analysis of the Gradient Method

• Definition of the A-norm:

‖x‖_A := √(x^T A x)

• Consider the error:

‖x − x̄‖_A^2 = ‖x − A^{-1}b‖_A^2 = (x^T − b^T A^{-1}) A (x − A^{-1}b)
            = x^T A x − 2b^T x + b^T A^{-1}b
            = 2Φ(x) + b^T A^{-1}b

• Therefore, minimizing Φ is equivalent to minimizing the error in the A-norm! A more detailed analysis reveals:

‖x^(k+1) − x̄‖_A^2 ≤ (1 − 1/κ(A)) · ‖x^(k) − x̄‖_A^2

• Therefore, for κ(A) ≫ 1 very slow convergence!


5.2.2. The Conjugate Gradient Method

• Improve the descent direction so that it is globally optimal.
• x^(k+1) := x^(k) + α_k p^(k) with a search direction that is not the negative gradient, but the projection of the gradient that is A-conjugate to all previous search directions:

p^(k) ⊥ Ap^(j) for all j < k,  i.e.  p^(k) ⊥_A p^(j),  i.e.  p^(k)T A p^(j) = 0 for j < k

• We choose the new search direction as the component of the last residual that is A-conjugate to all previous search directions.
• α_k is again obtained by 1-dimensional minimization as before (for the chosen p^(k)).


The Conjugate Gradient Algorithm

p^(0) = r^(0) = b − Ax^(0)
for k = 0, 1, 2, . . . do
    α^(k) = − ⟨r^(k), r^(k)⟩ / ⟨p^(k), Ap^(k)⟩
    x^(k+1) = x^(k) − α^(k) p^(k)
    r^(k+1) = r^(k) + α^(k) Ap^(k)
    if ‖r^(k+1)‖_2^2 ≤ ε then break
    β^(k) = ⟨r^(k+1), r^(k+1)⟩ / ⟨r^(k), r^(k)⟩
    p^(k+1) = r^(k+1) + β^(k) p^(k)
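An executable MATLAB counterpart of this pseudocode (a sketch only; the sign convention is folded in, so alpha below corresponds to −α^(k) above, and in practice MATLAB's pcg provides the same method with optional preconditioning):

function x = cg_sketch(A, b, tol, maxit)
% Plain conjugate gradient iteration for symmetric positive definite A.
  x = zeros(size(b));
  r = b - A*x;  p = r;  rho = r'*r;
  for k = 1:maxit
    Ap = A*p;
    alpha = rho / (p'*Ap);          % corresponds to -alpha^(k) on the slide
    x = x + alpha*p;
    r = r - alpha*Ap;
    rho_new = r'*r;
    if sqrt(rho_new) <= tol, break, end
    p = r + (rho_new/rho)*p;        % beta^(k) = rho_new / rho
    rho = rho_new;
  end
end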


Properties of Conjugate Gradients

• It holds

p^(j)T A p^(k) = 0 = r^(j)T r^(k)  for j ≠ k

• and

span(p^(1), . . . , p^(j)) = span(r^(0), . . . , r^(j−1)) = span(r^(0), Ar^(0), . . . , A^(j−1)r^(0)) = K_j(A, r^(0))

• Especially for x^(0) = 0 it holds

K_j(A, r^(0)) = span(b, Ab, . . . , A^(j−1)b)

• x^(k) is the best approximate solution to Ax = b in the subspace K_k(A, r^(0)). For x^(0) = 0: x^(k) ∈ K_k(A, b)
• Error:

‖x^(k) − x̄‖_A = min_{x ∈ K_k(A,b)} ‖x − x̄‖_A

• Cheap 1-dimensional minimization gives the optimal k-dimensional solution for free!


Properties of Conjugate Gradients (cont.)

• Consequence: after n steps K_n(A, b) = R^n and therefore x^(n) = A^{-1}b is the exact solution in exact arithmetic.
• Unfortunately, this is not true in floating-point arithmetic.
• Furthermore, O(n) iteration steps would be too costly:
  costs = #iterations ∗ cost of a matrix-vector product
• The matrix-vector product is easy to parallelize.
• But how do we get fast convergence and reduce the number of iterations?


Error Estimation (x^(0) = 0)

‖e^(k)‖_A = ‖x^(k) − x̄‖_A = min_{x ∈ K_k(A,b)} ‖x − x̄‖_A
          = min_{α_0,...,α_{k−1}} ‖ Σ_{j=0}^{k−1} α_j A^j b − x̄ ‖_A
          = min_{P^(k−1)} ‖ P^(k−1)(A) b − x̄ ‖_A
          = min_{P^(k−1)} ‖ P^(k−1)(A) A x̄ − x̄ ‖_A
          = min_{P^(k−1)} ‖ (P^(k−1)(A) A − I)(x̄ − x^(0)) ‖_A
          = min_{Q^(k), Q^(k)(0)=1} ‖ Q^(k)(A) e^(0) ‖_A

for a polynomial Q^(k)(x) of degree k with Q^(k)(0) = 1.


Error Estimation (cont.)

• The matrix A has an orthonormal basis of eigenvectors u_j, j = 1, . . . , n, with eigenvalues λ_j.
• It holds

Au_j = λ_j u_j, j = 1, . . . , n,   and   u_j^T u_k = 0 for j ≠ k, u_j^T u_j = 1

• Expand the start error in this ONB: e^(0) = Σ_{j=1}^{n} ρ_j u_j

‖e^(k)‖_A = min_{Q^(k)(0)=1} ‖ Q^(k)(A) Σ_{j=1}^{n} ρ_j u_j ‖_A = min_{Q^(k)(0)=1} ‖ Σ_{j=1}^{n} ρ_j Q^(k)(A) u_j ‖_A
          = min_{Q^(k)(0)=1} ‖ Σ_{j=1}^{n} ρ_j Q^(k)(λ_j) u_j ‖_A
          ≤ min_{Q^(k)(0)=1} { max_{j=1,...,n} |Q^(k)(λ_j)| } ‖ Σ_{j=1}^{n} ρ_j u_j ‖_A
          = min_{Q^(k)(0)=1} { max_{j=1,...,n} |Q^(k)(λ_j)| } ‖e^(0)‖_A


Error Estimates

By choosing polynomials with Q^(k)(0) = 1, we can derive error estimates for the error after the k-th step.

Choose, e.g.,

Q^(k)(x) := |1 − 2x/(λmax + λmin)|^k

‖e^(k)‖_A ≤ max_{j=1,...,n} |Q^(k)(λ_j)| ‖e^(0)‖_A
          = |1 − 2λmax/(λmax + λmin)|^k ‖e^(0)‖_A
          = ((λmax − λmin)/(λmax + λmin))^k ‖e^(0)‖_A
          = ((κ(A) − 1)/(κ(A) + 1))^k ‖e^(0)‖_A


Better Estimates

• Choose normalized Chebyshev polynomials T_n(x) = cos(n arccos(x)):

‖e^(k)‖_A ≤ (1 / T_k((κ(A) + 1)/(κ(A) − 1))) ‖e^(0)‖_A ≤ 2 ((√κ(A) − 1)/(√κ(A) + 1))^k ‖e^(0)‖_A

• For clustered eigenvalues choose a special polynomial, e.g. assume that A has only two eigenvalues λ_1 and λ_2:

Q^(2)(x) := (λ_1 − x)(λ_2 − x)/(λ_1 λ_2)

‖e^(2)‖_A ≤ max_{j=1,2} |Q^(2)(λ_j)| ‖e^(0)‖_A = 0

Convergence of CG after two steps!
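This can be observed numerically; an illustrative check with an assumed synthetic matrix that has exactly two distinct eigenvalues, solved with MATLAB's pcg:

n = 200;  rng(0);
[Q, ~] = qr(randn(n));                      % random orthonormal basis
lam = [2*ones(n/2, 1); 5*ones(n/2, 1)];     % exactly two distinct eigenvalues
A = Q*diag(lam)*Q';  A = (A + A')/2;        % SPD with spectrum {2, 5}
b = randn(n, 1);
[x, flag, relres, iter] = pcg(A, b, 1e-10, 10);
iter                                        % expected: 2 iterations (up to rounding)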


Outliers/Cluster

Assume the matrix has an eigenvalue λ_1 > 1 and all other eigenvalues are contained in an ε-neighborhood of 1:

∀λ ≠ λ_1 : |λ − 1| < ε

Q^(2)(x) := (λ_1 − x)(1 − x)/λ_1

‖e^(2)‖_A ≤ max_{|λ−1|<ε} |(λ_1 − λ)(1 − λ)/λ_1| ‖e^(0)‖_A ≤ ((λ_1 − 1 + ε)ε/λ_1) ‖e^(0)‖_A = O(ε)

A very good approximation by CG after only two steps!

Important: a small number of outliers combined with a cluster.


Conjugate Gradients Summary

• To get fast convergence and reduce the number of iterations: find a preconditioner M such that M^{-1}Ax = M^{-1}b has clustered eigenvalues.
• Conjugate gradients (CG) is in general the method of choice for symmetric positive definite A.
• To improve convergence, include preconditioning (PCG).
• CG has two important properties: it is optimal and cheap.


Parallel Conjugate Gradients Algorithm
