Sparse Matrix Computations / Algorithmique des matrices creuses

J.-Y. L’Excellent (INRIA) and Bora Ucar (CNRS), LIP-ENS Lyon

Jean-Yves.L.Excellent@ens-lyon.fr, Bora.Ucar@ens-lyon.fr

prepared in collaboration with P. Amestoy (ENSEEIHT-IRIT)

2009-2010

Outline

Linear Algebra Basics
- Introduction
- Gaussian Elimination and LU factorization
- Vector and Matrix norms
- Error, Sensitivity, Conditioning
- LU factorization with pivoting
- Banded systems
- Symmetric matrices
- Cholesky factorization
- QR factorization
- Gram-Schmidt Process
- Least-squares problems
- Eigenvalue problems
- Singular value decomposition (SVD)

Linear algebra

- Linear algebra: the branch of mathematics that deals with solutions of systems of linear equations and the related geometric notions of vector spaces and linear transformations.

- "Linear" comes from the fact that the equation

  $$ax + by = c$$

  defines a line (in two-dimensional geometry).

- This is similar to the form of a system of linear equations:

  $$a_{i1}x_1 + a_{i2}x_2 + \dots + a_{in}x_n = b_i, \quad i = 1, \dots, m$$

- A linear transformation $T$ from a vector space $V$ to $W$ satisfies

  $$T(v_1 + v_2) = T(v_1) + T(v_2), \qquad T(\alpha v_1) = \alpha T(v_1)$$

- Linear transformations (rotations, projections, ...) are often represented by matrices. If

  $$A = \begin{bmatrix} 0 & 1 \\ -2 & 2 \\ 1 & 0 \end{bmatrix},$$

  then $T : v \mapsto Av$ is a linear transformation from $\mathbb{R}^2$ to $\mathbb{R}^3$, defined by $T(x, y) = (y, -2x + 2y, x)$.

Use of Linear algebra

Continuous problem → Discretization → Mathematical representation involving vectors and matrices

This leads to problems involving vectors and matrices, in particular:

- systems of linear equations (sparse, dense, symmetric, unsymmetric, well conditioned, ...)
- least-squares problems
- eigenvalue problems

The main tools for each class of problem:

- Solving $Ax = b$
  - $A$ general square: LU factorization with pivoting
  - $A$ symmetric positive definite: Cholesky factorization $LL^t$ or $LDL^t$
  - $A$ symmetric indefinite: $LDL^t$ factorization
  - $A$ rectangular $m \times n$ with $m \ge n$: QR factorization
- Least-squares problems $\min_x \|Ax - b\|_2$
  - If $\operatorname{rank}(A)$ is maximal: Cholesky or QR factorization
  - Otherwise: QR with column pivoting, or singular value decomposition (SVD)
- Eigenvalue problems $Ax = \lambda x$
  - Example: determine the resonance frequencies of a bridge or an aircraft
  - Techniques based on orthogonal transformations: Schur decomposition, reduction to a tridiagonal matrix
- Generalized problems
  - $Ax = \lambda Bx$ and $A^tAx = \mu^2 B^tBx$: generalized Schur and SVD
- An efficient implementation is critical

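System of linear equations

Example:

$$\begin{aligned} 2x_1 - x_2 + 3x_3 &= 13 \\ -4x_1 + 6x_2 - 5x_3 &= -28 \\ 6x_1 + 13x_2 + 16x_3 &= 37 \end{aligned}$$

can be written in the form

$$Ax = b, \quad \text{with } A = \begin{bmatrix} 2 & -1 & 3 \\ -4 & 6 & -5 \\ 6 & 13 & 16 \end{bmatrix} \text{ and } b = \begin{bmatrix} 13 \\ -28 \\ 37 \end{bmatrix}.$$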

Gaussian Elimination

Example:

$$\begin{aligned} 2x_1 - x_2 + 3x_3 &= 13 && (1) \\ -4x_1 + 6x_2 - 5x_3 &= -28 && (2) \\ 6x_1 + 13x_2 + 16x_3 &= 37 && (3) \end{aligned}$$

With 2 × (1) + (2) → (2) and −3 × (1) + (3) → (3) we obtain:

$$\begin{aligned} 2x_1 - x_2 + 3x_3 &= 13 && (4) \\ 0x_1 + 4x_2 + x_3 &= -2 && (5) \\ 0x_1 + 16x_2 + 7x_3 &= -2 && (6) \end{aligned}$$

Thus $x_1$ is eliminated. With −4 × (5) + (6) → (6):

$$\begin{aligned} 2x_1 - x_2 + 3x_3 &= 13 \\ 0x_1 + 4x_2 + x_3 &= -2 \\ 0x_1 + 0x_2 + 3x_3 &= 6 \end{aligned}$$

The linear system is then solved by backward substitution ($x_3 \to x_2 \to x_1$): $x_3 = \tfrac{6}{3} = 2$, $x_2 = \tfrac{1}{4}(-2 - x_3) = -1$, and finally $x_1 = \tfrac{1}{2}(13 - 3x_3 + x_2) = 3$.
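The elimination and back-substitution steps of this example can be sketched in a few lines of Python. This is a minimal illustration, not a robust solver: the function name `gaussian_solve` is hypothetical, no pivoting is done, and every pivot is assumed to be nonzero. The second equation is taken with the coefficient −5 for x3, the value consistent with row (5) of the elimination.

```python
def gaussian_solve(A, b):
    """Solve Ax = b by Gaussian elimination without pivoting.

    Assumption: every pivot A[k][k] met during elimination is nonzero.
    """
    n = len(A)
    A = [row[:] for row in A]   # work on copies, keep the inputs intact
    b = b[:]
    # Forward elimination: zero out column k below the diagonal.
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]          # multiplier for row i
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    # Backward substitution: x_n first, then x_{n-1}, ..., x_1.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x


A = [[2.0, -1.0, 3.0], [-4.0, 6.0, -5.0], [6.0, 13.0, 16.0]]
b = [13.0, -28.0, 37.0]
x = gaussian_solve(A, b)
print(x)
```

On this system the multipliers are small integers, so the floating-point result coincides with the hand computation $x = (3, -1, 2)$.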

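LU Factorization

- Find $L$ unit lower triangular and $U$ upper triangular such that $A = L \times U$:

$$\begin{bmatrix} 2 & -1 & 3 \\ -4 & 6 & -5 \\ 6 & 13 & 16 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 3 & 4 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & -1 & 3 \\ 0 & 4 & 1 \\ 0 & 0 & 3 \end{bmatrix}$$

- Procedure to solve $Ax = b$:
  - Factor $A = LU$
  - Solve $Ly = b$ (forward elimination, "descente")
  - Solve $Ux = y$ (backward substitution, "remontée")

$$Ax = (LU)x = L(Ux) = Ly = b$$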

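From Gaussian Elimination to LU Factorization

$A = A^{(1)}$, $b = b^{(1)}$, $A^{(1)}x = b^{(1)}$:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \qquad \begin{array}{l} \text{row } 2 \leftarrow \text{row } 2 - \text{row } 1 \times a_{21}/a_{11} \\ \text{row } 3 \leftarrow \text{row } 3 - \text{row } 1 \times a_{31}/a_{11} \end{array}$$

$A^{(2)}x = b^{(2)}$:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a^{(2)}_{22} & a^{(2)}_{23} \\ 0 & a^{(2)}_{32} & a^{(2)}_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b^{(2)}_2 \\ b^{(2)}_3 \end{pmatrix} \qquad \begin{array}{l} b^{(2)}_2 = b_2 - a_{21}b_1/a_{11}, \ \dots \\ a^{(2)}_{32} = a_{32} - a_{31}a_{12}/a_{11}, \ \dots \end{array}$$

Finally, row 3 ← row 3 − row 2 × $a^{(2)}_{32}/a^{(2)}_{22}$ gives $A^{(3)}x = b^{(3)}$:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a^{(2)}_{22} & a^{(2)}_{23} \\ 0 & 0 & a^{(3)}_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b^{(2)}_2 \\ b^{(3)}_3 \end{pmatrix} \qquad a^{(3)}_{33} = a^{(2)}_{33} - a^{(2)}_{32}\,a^{(2)}_{23}/a^{(2)}_{22}, \ \dots$$

Typical Gaussian elimination at step $k$:

$$a^{(k+1)}_{ij} = a^{(k)}_{ij} - \frac{a^{(k)}_{ik}}{a^{(k)}_{kk}}\, a^{(k)}_{kj}, \quad \text{for } i > k$$

(and $a^{(k+1)}_{ij} = a^{(k)}_{ij}$ for $i \le k$)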

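From Gaussian Elimination to LU Factorization (continued)

$$\begin{cases} a^{(k+1)}_{ij} = a^{(k)}_{ij} - \dfrac{a^{(k)}_{ik}}{a^{(k)}_{kk}}\, a^{(k)}_{kj}, & \text{for } i > k \\[2mm] a^{(k+1)}_{ij} = a^{(k)}_{ij}, & \text{for } i \le k \end{cases}$$

- One step of Gaussian elimination can be written $A^{(k+1)} = L^{(k)}A^{(k)}$ (and $b^{(k+1)} = L^{(k)}b^{(k)}$), with

$$L^{(k)} = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & -l_{k+1,k} & 1 & & \\ & & \vdots & & \ddots & \\ & & -l_{n,k} & & & 1 \end{pmatrix} \quad \text{and} \quad l_{ik} = \frac{a^{(k)}_{ik}}{a^{(k)}_{kk}}$$

- After $n - 1$ steps, $A^{(n)} = U = L^{(n-1)} \cdots L^{(1)}A$ gives $A = LU$, with

$$L = [L^{(1)}]^{-1} \cdots [L^{(n-1)}]^{-1} = \begin{pmatrix} 1 & & & \\ l_{21} & 1 & & \\ \vdots & & \ddots & \\ l_{n1} & & & 1 \end{pmatrix} \cdots \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & l_{n,n-1} & 1 \end{pmatrix} = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ l_{i,j} & & 1 \end{pmatrix}$$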

LU Factorization Algorithm

- Overwrite matrix A: we store $a^{(k)}_{ij}$, $k = 2, \dots, n$ in A(i,j)
- In the end, $A = A^{(n)} = U$

    do k=1, n-1
      L(k,k) = 1
      do i=k+1, n
        L(i,k) = A(i,k)/A(k,k)
        do j=k, n            ! better than: do j=1,n
          A(i,j) = A(i,j) - L(i,k) * A(k,j)
        end do
      end do
    end do
    L(n,n)=1

- Matrix A at each step:

[Figure: matrix A at each step, with the zero block below the diagonal]

- Avoid building the zeros under the diagonal
- Before:

    L(n,n)=1
    do k=1, n-1
      L(k,k) = 1
      do i=k+1, n
        L(i,k) = A(i,k)/A(k,k)
        do j=k, n
          A(i,j) = A(i,j) - L(i,k) * A(k,j)
        end do
      end do
    end do

- After:

    L(n,n)=1
    do k=1, n-1
      L(k,k) = 1
      do i=k+1, n
        L(i,k) = A(i,k)/A(k,k)
        do j=k+1, n          ! column k is no longer updated
          A(i,j) = A(i,j) - L(i,k) * A(k,j)
        end do
      end do
    end do

- Use the lower triangle of array A to store the L(i,k) multipliers
- Before:

    L(n,n)=1
    do k=1, n-1
      L(k,k) = 1
      do i=k+1, n
        L(i,k) = A(i,k)/A(k,k)
        do j=k+1, n
          A(i,j) = A(i,j) - L(i,k) * A(k,j)
        end do
      end do
    end do

- After (the unit diagonal of L is not stored):

    do k=1, n-1
      do i=k+1, n
        A(i,k) = A(i,k)/A(k,k)       ! multiplier l_ik overwrites A(i,k)
        do j=k+1, n
          A(i,j) = A(i,j) - A(i,k) * A(k,j)
        end do
      end do
    end do

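- More compact array syntax (Matlab, Scilab, Fortran 90):

    do k=1, n-1
      A(k+1:n,k) = A(k+1:n,k) / A(k,k)
      A(k+1:n,k+1:n) = A(k+1:n,k+1:n) - A(k+1:n,k) * A(k,k+1:n)
    end do

- This corresponds to a rank-1 update of the trailing submatrix: the column A(k+1:n,k) times the row A(k,k+1:n) is an outer product.

[Figure: rank-1 update at step k. Row k holds the pivot A(k,k) and A(k,k+1:n) (computed elements of U); column k holds A(k+1:n,k) (L multipliers); entry A(i,j) is updated using A(i,k) and A(k,j).]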

What we have computed

- We have stored the L and U factors in A:
  - A(i,j), $i > j$ corresponds to $l_{ij}$
  - A(i,j), $i \le j$ corresponds to $u_{ij}$
- and we had $l_{ii} = 1$, $i = 1, \dots, n$
- Finally, the array A holds

$$L + U - I$$
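The in-place variant, followed by the two triangular solves $Ly = b$ and $Ux = y$, could look as follows in Python. This is a sketch under the same assumptions as the slides (no pivoting, nonzero pivots); the names `lu_in_place` and `lu_solve` are hypothetical.

```python
def lu_in_place(A):
    """Overwrite A with L + U - I (no pivoting, pivots assumed nonzero)."""
    n = len(A)
    for k in range(n - 1):
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]                  # multiplier l_ik stored in place
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]    # rank-1 update of the trailing block
    return A


def lu_solve(LU, b):
    """Solve Ax = b from the packed factors returned by lu_in_place."""
    n = len(LU)
    y = b[:]
    for i in range(n):                          # forward elimination: Ly = b
        for j in range(i):
            y[i] -= LU[i][j] * y[j]             # unit diagonal of L is implicit
    x = y[:]
    for i in range(n - 1, -1, -1):              # backward substitution: Ux = y
        for j in range(i + 1, n):
            x[i] -= LU[i][j] * x[j]
        x[i] /= LU[i][i]
    return x


LU = lu_in_place([[2.0, -1.0, 3.0], [-4.0, 6.0, -5.0], [6.0, 13.0, 16.0]])
x = lu_solve(LU, [13.0, -28.0, 37.0])
print(LU)
print(x)
```

The packed array can be read off against the factorization of the running example: the strictly lower part holds the multipliers −2, 3, 4 and the upper part holds U.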

Number of floating-point operations (flops)

- In the forward elimination $Ly = b$, the $k$-th unknown is computed as

$$y_k = b_k - \sum_{j=1}^{k-1} l_{kj}\, y_j$$

  that is, $(k-1)$ multiplications and $(k-1)$ additions, for $k$ from 1 to $n$.

  Hence $n^2 - n$ flops in total.

- The same holds for the backward substitution $Ux = y$ (plus $n$ divisions by the diagonal entries of $U$).
- Number of flops in the Gaussian factorization, at step $k$:
  - $n - k$ divisions
  - $(n - k)^2$ multiplications, $(n - k)^2$ additions
  - $k = 1, 2, \dots, n - 1$
  - total: $\approx \dfrac{2n^3}{3}$ (Strassen's algorithm can reduce this complexity to $\Theta(n^{\log_2 7})$)
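For reference, the total for the factorization can be obtained by summing the per-step counts above with the standard formulas for $\sum m$ and $\sum m^2$ (substituting $m = n - k$); this is only a worked version of the count stated on the slide:

```latex
\sum_{k=1}^{n-1} \Bigl[(n-k) + 2(n-k)^2\Bigr]
  = \sum_{m=1}^{n-1} \bigl(m + 2m^2\bigr)
  = \frac{n(n-1)}{2} + \frac{(n-1)\,n\,(2n-1)}{3}
  = \frac{2n^3}{3} - \frac{n^2}{2} - \frac{n}{6}
  \;\approx\; \frac{2n^3}{3}.
```

For $n = 3$ this gives 13 flops: 2 divisions and 8 multiply-add operations at step 1, then 1 division and 2 multiply-add operations at step 2.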

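Exercise

Compute the LU factorization of

$$A = \begin{bmatrix} 2 & -1 & 3 \\ -4 & 6 & -5 \\ 6 & 13 & 16 \end{bmatrix}$$

Answer:

$$A = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 3 & 4 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & -1 & 3 \\ 0 & 4 & 1 \\ 0 & 0 & 3 \end{bmatrix}$$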


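Remark

- Assume that a decomposition $A = LU$ exists with
  - $L = (l_{ij})_{i,j=1 \dots n}$ lower triangular with unit diagonal
  - $U = (u_{ij})_{i,j=1 \dots n}$ upper triangular
- Computing the product $LU$, we have:

$$\begin{cases} a_{ij} = \sum_{k=1}^{i-1} l_{ik}u_{kj} + u_{ij} & \text{for } i \le j \\[1mm] a_{ij} = \sum_{k=1}^{j-1} l_{ik}u_{kj} + l_{ij}u_{jj} & \text{for } i > j \end{cases}$$

- Renaming $i \to K$ in the first equation and $j \to K$ in the second:

$$\begin{cases} u_{Kj} = a_{Kj} - \sum_{k=1}^{K-1} l_{Kk}u_{kj} & \text{for } j \in \{K, \dots, n\} \\[1mm] l_{iK} = \dfrac{1}{u_{KK}} \Bigl(a_{iK} - \sum_{k=1}^{K-1} l_{ik}u_{kK}\Bigr) & \text{for } i \in \{K+1, \dots, n\} \end{cases}$$

- This gives an explicit computation of $u_{Kj}$ and $l_{iK}$ for $K = 1$ to $n$.
- In the end, the same computations are performed as before, but in a different order (called left-looking).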

Existence and uniqueness of the LU decomposition

Theorem.
$A \in \mathbb{R}^{n \times n}$ has an LU factorization (where $L$ is unit lower triangular and $U$ is upper triangular with $u_{ii} \ne 0$) if $\det(A(1:k, 1:k)) \ne 0$ for all $k \in \{1, \dots, n-1\}$. If the LU factorization exists, then it is unique.

Theorem.
For each nonsingular matrix $A$, there exists a permutation matrix $P$ such that $PA$ possesses an LU factorization $PA = LU$.

Definition.
$A$ is strictly diagonally dominant if $|a_{ii}| > \sum_{j=1, j \ne i}^{n} |a_{ij}|$ for all $i = 1, \dots, n$.

Theorem.
If $A^T$ is strictly diagonally dominant, then $A$ has an LU factorization and $|l_{ij}| \le 1$.

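Vector Norms

Definition.
A vector norm is a function $f : \mathbb{R}^n \to \mathbb{R}$ such that

$$\begin{aligned} & f(x) \ge 0 && x \in \mathbb{R}^n, \quad f(x) = 0 \Leftrightarrow x = 0 \\ & f(x + y) \le f(x) + f(y) && x, y \in \mathbb{R}^n \\ & f(\alpha x) = |\alpha| f(x) && \alpha \in \mathbb{R},\ x \in \mathbb{R}^n \end{aligned}$$

p-norm: $\|x\|_p = (|x_1|^p + |x_2|^p + \dots + |x_n|^p)^{1/p}$

The most important p-norms are the 1, 2, and $\infty$ norms:

$$\begin{aligned} \|x\|_1 &= |x_1| + |x_2| + \dots + |x_n| \\ \|x\|_2 &= (|x_1|^2 + |x_2|^2 + \dots + |x_n|^2)^{1/2} = (x^T x)^{1/2} \\ \|x\|_\infty &= \max_{1 \le i \le n} |x_i| \end{aligned}$$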

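Vector Norms: some properties

- Cauchy-Schwarz inequality: $|x^T y| \le \|x\|_2 \|y\|_2$
  (proof based on $0 \le \|x - \lambda y\|_2^2$ with $\lambda = \dfrac{x^T y}{\|y\|_2^2}$)
- All norms on $\mathbb{R}^n$ are equivalent:
  $\forall\, \|\cdot\|_\alpha$ and $\|\cdot\|_\beta$, $\exists\, c_1, c_2 > 0$ such that $c_1 \|x\|_\alpha \le \|x\|_\beta \le c_2 \|x\|_\alpha$ for all $x$
- In particular:

$$\begin{aligned} \|x\|_2 &\le \|x\|_1 \le \sqrt{n}\, \|x\|_2 \\ \|x\|_\infty &\le \|x\|_2 \le \sqrt{n}\, \|x\|_\infty \\ \|x\|_\infty &\le \|x\|_1 \le n\, \|x\|_\infty \end{aligned}$$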
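As a quick numerical check of these definitions and of the three equivalence inequalities, here is a small Python sketch; the helper names `norm1`, `norm2`, `norm_inf` and the sample vector are chosen only for illustration.

```python
import math


def norm1(x):
    """1-norm: sum of absolute values."""
    return sum(abs(v) for v in x)


def norm2(x):
    """2-norm: Euclidean length."""
    return math.sqrt(sum(v * v for v in x))


def norm_inf(x):
    """Infinity norm: largest absolute value."""
    return max(abs(v) for v in x)


x = [3.0, -4.0, 12.0]
n = len(x)
n1, n2, ninf = norm1(x), norm2(x), norm_inf(x)
print(n1, n2, ninf)

# The three pairs of inequalities from the slide, checked on this vector.
ok = (n2 <= n1 <= math.sqrt(n) * n2
      and ninf <= n2 <= math.sqrt(n) * ninf
      and ninf <= n1 <= n * ninf)
print(ok)
```

A single vector does not prove the inequalities, of course; the check only makes the constants $1$, $\sqrt{n}$ and $n$ concrete.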

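Matrix Norms

- As for vector norms:

$$\begin{aligned} & f(A) \ge 0 && A \in \mathbb{R}^{m \times n}, \quad f(A) = 0 \Leftrightarrow A = 0 \\ & f(A + B) \le f(A) + f(B) && A, B \in \mathbb{R}^{m \times n} \\ & f(\alpha A) = |\alpha| f(A) && \alpha \in \mathbb{R},\ A \in \mathbb{R}^{m \times n} \end{aligned}$$

- Most matrix norms satisfy $\|AB\| \le \|A\| \times \|B\|$
- Norms induced by p-norms on vectors:

$$\|A\|_p = \max_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p} = \max_{\|x\|_p = 1} \|Ax\|_p \qquad \text{Remark: } \|I\|_p = 1$$

$$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{ij}| \qquad \|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}|$$

$$\|A\|_p \ge \rho(A) = \max_{1 \le i \le n} |\lambda_i| \quad (A \text{ square})$$

- Frobenius norm:

$$\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2} = \sqrt{\sum_i \sigma_i^2} = \sqrt{\operatorname{trace}(A^T A)}$$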

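- Consider the linear system:

$$\begin{bmatrix} .780 & .563 \\ .913 & .659 \end{bmatrix} \times x = \begin{bmatrix} .217 \\ .254 \end{bmatrix}$$

- Suppose that two different algorithms lead to the two following solutions:

$$x_1 = \begin{bmatrix} 0.341 \\ -0.087 \end{bmatrix} \quad \text{and} \quad x_2 = \begin{bmatrix} 0.999 \\ -1.001 \end{bmatrix}$$

- Which of $x_1$ and $x_2$ is the better solution?
- Residuals:

$$b - Ax_1 = \begin{bmatrix} .000001 \\ 0 \end{bmatrix} \quad \text{and} \quad b - Ax_2 = \begin{bmatrix} .001343 \\ .001572 \end{bmatrix}$$

- $x_1$ looks like the better solution because it gives the smaller residual.
- Exact solution:

$$x^* = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

- In fact, $x_2$ is the more accurate one.

The notion of a "good" solution is ambiguous.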
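The comparison can be reproduced with a few lines of Python. This is only a sketch: the candidate solutions are taken as $x_1 = (0.341, -0.087)$ and $x_2 = (0.999, -1.001)$, the values consistent with the residuals listed on the slide, and the helper names are hypothetical.

```python
A = [[0.780, 0.563], [0.913, 0.659]]
b = [0.217, 0.254]
x_exact = [1.0, -1.0]


def residual(x):
    """Return b - A x for the 2x2 system above."""
    return [b[i] - sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]


def max_abs(v):
    """Infinity norm of a vector."""
    return max(abs(t) for t in v)


x1 = [0.341, -0.087]
x2 = [0.999, -1.001]

r1, r2 = residual(x1), residual(x2)
e1 = max_abs([x1[i] - x_exact[i] for i in range(2)])   # forward error of x1
e2 = max_abs([x2[i] - x_exact[i] for i in range(2)])   # forward error of x2

print("residual norms:", max_abs(r1), max_abs(r2))
print("error norms:   ", e1, e2)
```

The first candidate has a residual about a thousand times smaller, yet its error is close to 1, while the second candidate is within about $10^{-3}$ of the exact solution.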

Sensitivity of the problems

- Let $A = \begin{bmatrix} .780 & .563 \\ .913 & .659 \end{bmatrix}$ be an almost singular matrix.
- Let $A' = \begin{bmatrix} .780 & .563001095 \\ .913 & .659 \end{bmatrix}$ be a singular matrix.

  → a perturbation of the matrix entries of the order of $O(10^{-6})$ makes the problem unsolvable.

- Another problem when $A$ is close to singular: a small change in $A$ and/or $b$ → large perturbations of the solution.

This is intrinsic to the problem, independent of the algorithm used.

Machine representation of real numbers

- Reals are coded in the machine with a finite number of digits.
- Normalized representation of a floating-point number:

$$x = (-1)^s\, m \times 2^e$$

- Most computers use base 2 (IEEE standard), but bases 8 (octal), 16 (old IBM machines) and 10 (pocket calculators) also exist.
- macheps: machine precision, i.e., the smallest positive real such that $1 + \text{macheps} > 1$ in floating-point arithmetic.
- The IEEE standard defines:
  - the number formats
  - the possible rounding modes
  - the handling of exceptions (overflow, division by zero, ...)
  - the conversion procedures (to decimal, ...)
  - the arithmetic

- IEEE single precision:

     31 | 30          23 | 22                    0
    ---------------------------------------------
     s  |    exponent    |       mantissa

  Exponent coded on 8 bits, mantissa on 23 bits plus 1 implicit bit.

- IEEE double precision:

     63 | 62          52 | 51                    0
    ---------------------------------------------
     s  |    exponent    |       mantissa

  Exponent coded on 11 bits, mantissa on 52 bits plus 1 implicit bit.

- Single precision:
  - macheps $\approx 1.2 \times 10^{-7}$
  - $x_{min} \approx 1.2 \times 10^{-38}$
  - $x_{max} \approx 3.4 \times 10^{38}$
- Double precision:
  - macheps $\approx 2.2 \times 10^{-16}$
  - $x_{min} \approx 2.2 \times 10^{-308}$
  - $x_{max} \approx 1.8 \times 10^{308}$
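The double-precision value of macheps can be recovered experimentally. The sketch below assumes Python floats are IEEE doubles with round-to-nearest (true on common platforms) and finds the smallest power of two `eps` such that `1.0 + eps > 1.0`; the function name is hypothetical.

```python
import sys


def machine_epsilon():
    """Halve eps until adding the next half to 1.0 no longer changes it."""
    eps = 1.0
    while 1.0 + eps / 2.0 > 1.0:
        eps /= 2.0
    return eps


eps = machine_epsilon()
print(eps)                      # about 2.2e-16 in double precision
print(sys.float_info.epsilon)   # value reported by the interpreter
print(sys.float_info.min, sys.float_info.max)   # x_min and x_max
```

The loop stops at $2^{-52}$, the gap between 1 and the next representable double, which matches the double-precision figures listed above.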

Special numbers

- $\pm\infty$: sign, mantissa = 0, maximum exponent
- NaN: sign, mantissa ≠ 0, maximum exponent
- $\pm 0$: sign, mantissa = 0, minimum exponent
- Denormalized numbers: sign, mantissa ≠ 0, minimum exponent

Remarks

- $0/0$, $-\infty + \infty$ → NaN
- $1/(-0)$ → $-\infty$
- NaN op x → NaN
- (NaN = NaN) → false, (NaN ≠ NaN) → true
- Exceptions: overflow, underflow, divide by zero, invalid (NaN)
- Execution can either stop with an error message or continue the computation

Error analysis in floating-point arithmetic

- With the IEEE standard (model for finite-precision computation): $fl(x \text{ op } y) = (x \text{ op } y)(1 + \varepsilon)$ with $|\varepsilon| \le u$
  - $fl(x)$: $x$ represented in floating-point arithmetic
  - op = $+$, $-$, $\times$, $/$
  - $u$ = macheps: machine precision
- Example:

$$\begin{aligned} fl(x_1 + x_2 + x_3) &= fl((x_1 + x_2) + x_3) \\ &= ((x_1 + x_2)(1 + \varepsilon_1) + x_3)(1 + \varepsilon_2) \\ &= x_1(1 + \varepsilon_1)(1 + \varepsilon_2) + x_2(1 + \varepsilon_1)(1 + \varepsilon_2) + x_3(1 + \varepsilon_2) \\ &= x_1(1 + e_1) + x_2(1 + e_2) + x_3(1 + e_3) \end{aligned}$$

  with each $|e_i| \lesssim 2\,\text{macheps}$ (to first order).

- The result is the exact sum of the modified values $x_i(1 + e_i)$, with $|e_i| \lesssim 2u$.
- Backward error analysis: an algorithm is said to be backward stable if it gives the exact solution for slightly modified data (here $x_i(1 + e_i)$).

Backward error analysis

- approximate solution = exact solution of a modified problem
- how large an error on the data is needed to explain the error on the solution?
- the approximate solution is acceptable if it is the exact solution of a problem with nearby data

[Figure: forward error versus backward error; the computed result $y' = F(x')$ is the exact image of perturbed data $x'$]

Conditioning

- Well-conditioned problem: $\|x - x'\|$ small ⇒ $\|f(x) - f(x')\|$ small
- Otherwise: sensitive or ill-conditioned problem
- Sensitivity or condition number: relative change in the solution / relative change in the data

$$= \left| \frac{f(x') - f(x)}{f(x)} \right| \Big/ \left| \frac{x' - x}{x} \right|$$

Error in the solution of Ax = b

- The machine representation of $A$ (and $b$) is inexact: a perturbed problem is solved,

$$(A + E)\hat{x} = b + f$$

  with $E = (e_{ij})$, $|e_{ij}| \le u \times |a_{ij}|$ and $|f_i| \le u \times |b_i|$.
  $\hat{x}$: best attainable solution

- How close is $\hat{x}$ to $x$?
- If an algorithm computes $x_{alg}$ and $\|x - x_{alg}\| / \|x\|$ is large, there are two possible reasons:
  - the mathematical problem is very sensitive to perturbations (and then $\|x - \hat{x}\|$ may be large too)
  - the algorithm behaves badly in finite precision
- Backward error analysis makes it possible to distinguish these two cases (Wilkinson, 1963)

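Conditioning of a linear system

$$A \overset{F}{\longmapsto} x \quad \text{such that } Ax = b$$

$$A + \Delta A \overset{F}{\longmapsto} x + \Delta x \quad \text{such that } (A + \Delta A)(x + \Delta x) = b$$

Then

$$\frac{\|\Delta x\|}{\|x\|} \le \kappa(A) \frac{\|\Delta A\|}{\|A\|} \quad \text{with } \kappa(A) = \|A\| \|A^{-1}\|.$$

- $\kappa(A)$ is the condition number of the map $F$.
- If $\|\Delta A\| \approx \text{macheps} \times \|A\|$ (machine precision), then the relative error is $\approx \kappa(A) \times \text{macheps}$.

($A$ singular: $\kappa(A) = +\infty$)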
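To make $\kappa(A)$ concrete, the sketch below evaluates it in the infinity norm for the nearly singular 2×2 matrix used earlier in these slides. The explicit 2×2 inverse is used only because the example is tiny; forming $A^{-1}$ is not how condition numbers are estimated in practice, and the helper names are hypothetical.

```python
def norm_inf(M):
    """Matrix infinity norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)


def inverse_2x2(M):
    """Explicit inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[0.780, 0.563], [0.913, 0.659]]
kappa = norm_inf(A) * norm_inf(inverse_2x2(A))
print(kappa)   # roughly 2.7e6
```

With $\kappa_\infty(A)$ of the order of $10^6$ and macheps of the order of $10^{-16}$, roughly ten correct digits can be expected from a backward stable solver in double precision.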

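Backward error of an algorithm

- Let $\hat{x}$ be the computed solution and $r = b - A\hat{x}$ its residual. We have:

$$\begin{aligned} \text{err} &= \min \{\varepsilon > 0 \text{ such that } \|\Delta A\| \le \varepsilon \|A\|,\ \|\Delta b\| \le \varepsilon \|b\|,\ (A + \Delta A)\hat{x} = b + \Delta b\} \\ &= \frac{\|A\hat{x} - b\|}{\|A\| \|\hat{x}\| + \|b\|}. \end{aligned}$$

- Proof (lower bound):

$$\begin{aligned} (A + \Delta A)\hat{x} = b + \Delta b \ &\Rightarrow\ b - A\hat{x} = \Delta A\, \hat{x} - \Delta b \\ &\Rightarrow\ \|b - A\hat{x}\| \le \|\Delta A\| \|\hat{x}\| + \|\Delta b\| \\ &\Rightarrow\ \|r\| \le \varepsilon (\|A\| \|\hat{x}\| + \|b\|) \\ &\Rightarrow\ \frac{\|r\|}{\|A\| \|\hat{x}\| + \|b\|} \le \min\{\varepsilon\} = \text{err} \end{aligned}$$

- Proof (the bound is attained), in the 2-norm, for

$$\Delta A_{min} = \frac{\|A\|}{\|\hat{x}\| (\|A\| \|\hat{x}\| + \|b\|)}\, r\, \hat{x}^T \quad \text{and} \quad \Delta b_{min} = -\frac{\|b\|}{\|A\| \|\hat{x}\| + \|b\|}\, r.$$

  We have $\Delta A_{min}\, \hat{x} - \Delta b_{min} = r$ with

$$\|\Delta A_{min}\| = \frac{\|A\| \|r\|}{\|A\| \|\hat{x}\| + \|b\|} \quad \text{and} \quad \|\Delta b_{min}\| = \frac{\|b\| \|r\|}{\|A\| \|\hat{x}\| + \|b\|}.$$

- Furthermore, it can be shown that

  Relative forward error ≤ Condition Number × Backward Error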

Key points to remember

- Condition number (general case):

$$\kappa(A, b) = \|A^{-1}\| \left( \|A\| + \frac{\|b\|}{\|x\|} \right)$$

  measures the sensitivity of the mathematical problem.

- Backward error of an algorithm: $\dfrac{\|A\hat{x} - b\|}{\|A\| \|\hat{x}\| + \|b\|}$
  → measures the reliability of the algorithm
  → to be compared with the machine precision or with the uncertainty on the data
- Error prediction:
  Forward error ≤ condition number × backward error

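Let

$$A = \begin{bmatrix} \varepsilon & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \frac{1}{\varepsilon} & 1 \end{bmatrix} \times \begin{bmatrix} \varepsilon & 1 \\ 0 & 1 - \frac{1}{\varepsilon} \end{bmatrix}$$

$$\kappa_2(A) = \frac{|\lambda_{max}|}{|\lambda_{min}|} = \frac{1 + \varepsilon + \sqrt{5 + \varepsilon^2 - 2\varepsilon}}{-1 - \varepsilon + \sqrt{5 + \varepsilon^2 - 2\varepsilon}} \simeq 2.6.$$

If we solve

$$\begin{bmatrix} \varepsilon & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 + \varepsilon \\ 2 \end{bmatrix}$$

the exact solution is $x^* = (1, 1)$.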

Varying $\varepsilon$ (with Gaussian elimination without pivoting), we get:

    epsilon    ||x* - x|| / ||x*||    kappa_2(A)
    -------    -------------------    ----------
    10^-3      6 x 10^-6              2.621
    10^-6      2 x 10^-11             2.618
    10^-9      9 x 10^-8              2.618
    10^-12     9 x 10^-5              2.618
    10^-15     7 x 10^-2              2.618

Table: Relative accuracy of the solution as a function of $\varepsilon$.

- So even if $A$ is well conditioned, Gaussian elimination introduces errors.
- Explanation: the pivot $\varepsilon$ is too small.

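- Solution: exchange rows 1 and 2 of $A$:

$$\begin{bmatrix} 1 & 1 \\ \varepsilon & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 + \varepsilon \end{bmatrix} \quad \to \text{ full accuracy!}$$

- Partial pivoting: the pivot chosen at each step is the entry of largest magnitude in the column.
- With partial pivoting:
  1. $PA = LU$ where $P$ is a permutation matrix
  2. $Ly = Pb$
  3. $Ux = y$
- LU with pivoting is backward stable:

$$\frac{\|A\hat{x} - b\|}{\|A\| \times \|\hat{x}\|} \approx u \qquad (1)$$

$$\frac{\|\hat{x} - x^*\|}{\|x^*\|} \approx u \times \kappa(A) \qquad (2)$$

  1. LU yields small residuals regardless of the conditioning of $A$.
  2. The accuracy depends on the conditioning: if $u \approx 10^{-q}$ and $\kappa_\infty(A) \approx 10^p$, then $\hat{x}$ has approximately $(q - p)$ correct digits.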

LU factorization with pivoting

    do k = 1 to n-1
      find j such that |A(j,k)| = max { |A(i,k)|, i = k to n }
      if |A(j,k)| = 0
        exit                      // A is (almost) singular
      endif
      if needed, swap rows k and j in A (and b)
      A(k+1:n,k) = A(k+1:n,k) / A(k,k)
      A(k+1:n,k+1:n) = A(k+1:n,k+1:n) - A(k+1:n,k) * A(k,k+1:n)
    end do
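The effect of the row exchange on the small-pivot example can be observed directly. The sketch below is a plain Python version of the algorithm above applied to a system with a right-hand side; the function name `gauss_solve`, the `pivot` flag, and the value $\varepsilon = 10^{-20}$ are choices made here for illustration.

```python
def gauss_solve(A, b, pivot=True):
    """Gaussian elimination with optional partial pivoting, then back-substitution."""
    n = len(A)
    A = [row[:] for row in A]
    b = b[:]
    for k in range(n - 1):
        if pivot:
            # Row index of the largest entry (in magnitude) in column k, rows k..n-1.
            j = max(range(k, n), key=lambda i: abs(A[i][k]))
            if A[j][k] == 0.0:
                raise ValueError("matrix is singular")
            if j != k:
                A[k], A[j] = A[j], A[k]
                b[k], b[j] = b[j], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for c in range(k, n):
                A[i][c] -= m * A[k][c]
            b[i] -= m * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x


eps = 1e-20
A = [[eps, 1.0], [1.0, 1.0]]
b = [1.0 + eps, 2.0]
x_piv = gauss_solve(A, b, pivot=True)
x_nopiv = gauss_solve(A, b, pivot=False)
print("with pivoting:   ", x_piv)
print("without pivoting:", x_nopiv)
```

With such a tiny pivot the unpivoted elimination loses the first component entirely, while the pivoted version recovers $x^* = (1, 1)$, even though the matrix itself is well conditioned.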

Banded systems

        | x x 0 0 0 |
        | x x x 0 0 |
    A = | 0 x x x 0 |     bandwidth = 3
        | 0 0 x x x |     A tridiagonal
        | 0 0 0 x x |

Exploiting the band structure during the factorization: L and U are bidiagonal.

        | x 0 0 0 0 |         | x x 0 0 0 |
        | x x 0 0 0 |         | 0 x x 0 0 |
    L = | 0 x x 0 0 |     U = | 0 0 x x 0 |
        | 0 0 x x 0 |         | 0 0 0 x x |
        | 0 0 0 x x |         | 0 0 0 0 x |

→ the number of operations can therefore be reduced

Banded systems (continued)

- KL: number of subdiagonals of A
- KU: number of superdiagonals of A
- KL+KU+1: bandwidth

Question: if p = KL = KU (total width 2p+1), how many operations does the LU factorization algorithm (without pivoting) require?

Answer:

$$(n - p) \times (p \text{ divisions} + p^2 \text{ multiplications} + p^2 \text{ additions}) + \tfrac{2}{3}(p - 1)^3 \ \approx\ 2np^2 \text{ flops (when } n \gg p\text{), instead of } \tfrac{2n^3}{3}$$

Partial pivoting ⇒ the bandwidth increases!

- rows k and i are exchanged, with |A(i,k)| = max(|A(j,k)|, j ≥ k)
- KL' = KL
- KU' = KL + KU

Symmetric matrices

- $A$ symmetric: only the lower or the upper triangle of $A$ is stored.
- $A = LU$ and $A^t = A$ imply $LU = U^tL^t$. Hence $U(L^t)^{-1} = L^{-1}U^t = D$ is diagonal and $U = DL^t$, so that $A = L(DL^t) = LDL^t$.
- Example:

    |  4 -8 -4 |   |  1 0 0 |   | 4 0 0 |   | 1 -2 -1 |
    | -8 18 14 | = | -2 1 0 | x | 0 2 0 | x | 0  1  3 |
    | -4 14 25 |   | -1 3 1 |   | 0 0 3 |   | 0  0  1 |

- Solution of $Ax = b$:
  1. $A = LDL^t$
  2. $Ly = b$
  3. $Dz = y$
  4. $L^t x = z$
- $LDL^t$: $\dfrac{n^3}{3}$ flops (instead of $\dfrac{2n^3}{3}$ for LU)
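The 3×3 example can be checked with a short $LDL^t$ routine. This is a sketch without pivoting that assumes every $d_j$ is nonzero; the function name `ldlt` is hypothetical, and for clarity it reads the full matrix rather than only one triangle.

```python
def ldlt(A):
    """Return (L, d) with A = L * diag(d) * L^T and L unit lower triangular.

    No pivoting: assumes every d[j] is nonzero.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry: subtract the contribution of the previous columns.
        d[j] = A[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (A[i][j] - s) / d[j]
    return L, d


A = [[4.0, -8.0, -4.0], [-8.0, 18.0, 14.0], [-4.0, 14.0, 25.0]]
L, d = ldlt(A)
print(L)
print(d)
```

On this matrix the routine returns the unit lower triangular factor with multipliers −2, −1, 3 and the diagonal $D = \operatorname{diag}(4, 2, 3)$.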

Symmetric matrices and pivoting

- no a priori guarantee of numerical stability for $A$ → pivoting
- preserving symmetry → diagonal pivoting, but this is not sufficient
- possible approaches: Aasen, Bunch & Kaufman, ...
- In general one looks for $PAP^t = LDL^t$ where
  - $P$: permutation matrix
  - $L$: lower triangular
  - $D$: block diagonal, with 1×1 and 2×2 diagonal blocks

            | 1 0 0 0 |   | x 0 0 0 |   | 1 0 0 0 |t
            | x 1 0 0 |   | 0 x x 0 |   | x 1 0 0 |
    PAPt =  | x 0 1 0 | * | 0 x x 0 | * | x 0 1 0 |
            | x x x 1 |   | 0 0 0 x |   | x x x 1 |
                 L             D             Lt

- Examples of 2x2 pivots:

    | 0 1 |      | eps1  1   |
    | 1 0 |      |  1   eps2 |

- Pivot selection is complex: 2 columns must be examined at each step.

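- Let $PAP^t = \begin{bmatrix} E & C^t \\ C & B \end{bmatrix}$. If $E$ is a 2x2 pivot, form $E^{-1}$ to get:

$$PAP^t = \begin{bmatrix} I & 0 \\ CE^{-1} & I \end{bmatrix} \begin{bmatrix} E & 0 \\ 0 & B - CE^{-1}C^t \end{bmatrix} \begin{bmatrix} I & E^{-1}C^t \\ 0 & I \end{bmatrix}$$

- Possible pivot selection algorithm (Bunch-Parlett):

    mu1 = max_i |a_ii| ;  mu2 = max_ij |a_ij|
    if mu1 >= alpha * mu2   (for a given alpha > 0)
      choose the largest 1x1 diagonal pivot; permute s.t. |e11| = mu1
    else
      choose a 2x2 pivot s.t. |e21| = mu2

- $\alpha$ is chosen to minimize the growth factor, i.e., the magnitude of the entries in $B - CE^{-1}C^t$, with $E$ 1x1 or 2x2.
- 1x1 pivot ($\mu_1 \ge \alpha\mu_2$): $C$ has 1 column and

$$\Bigl| B - C \tfrac{1}{\mu_1} C^t \Bigr|_{ij} \le \max_{ij} |B_{ij}| + \max_{ij} \bigl( |c_i c_j| / \mu_1 \bigr) \le \mu_2 + \mu_2^2 / \mu_1 = \mu_2 (1 + \mu_2/\mu_1) \le \mu_2 (1 + 1/\alpha)$$

- 2x2 pivot: one can show that the bound is $\dfrac{3 - \alpha}{1 - \alpha}\, \mu_2$.
- Choosing $\alpha$ such that $\dfrac{3 - \alpha}{1 - \alpha} = \left(1 + \dfrac{1}{\alpha}\right)^2$ (growth over two 1x1 pivots) gives $\alpha = \dfrac{1 + \sqrt{17}}{8}$.
- Unfortunately, the previous algorithm requires between $\dfrac{n^3}{12}$ and $\dfrac{n^3}{6}$ comparisons, and is too costly.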

- More efficient variants exist, also with a good backward error.
- Example: Bunch-Kaufman algorithm (1977)

Determination of the first pivot:

    alpha <- (1 + sqrt(17))/8  (approximately 0.64)
    r <- index of largest element colmax = |a_r1| below the diagonal
    if |a_11| >= alpha * colmax
      1x1 pivot a_11 is ok
    else
      rowmax = |a_rp| = largest element in row r
      if rowmax * |a_11| >= alpha * colmax^2
        1x1 pivot a_11 is ok
      elseif |a_rr| >= alpha * rowmax
        1x1 pivot a_rr is ok, permute
      else
        2x2 pivot | a_11 a_r1 | is chosen
                  | a_r1 a_rr |
        interchange rows r and 2
      endif
    endif

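Cholesky factorization

- $A$ is positive definite if $x^tAx > 0$ for all $x \ne 0$.
- $A$ symmetric positive definite → Cholesky factorization $A = LL^t$ with $L$ lower triangular (with positive diagonal); $L$ is unique.
- By identification:

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} L_{11} & L_{21} \\ 0 & L_{22} \end{bmatrix}$$

- Hence:

$$\begin{aligned} A_{11} &= L_{11}^2 & &\to\ L_{11} = (A_{11})^{1/2} && (7) \\ A_{21} &= L_{21} \times L_{11} & &\to\ L_{21} = \frac{A_{21}}{L_{11}} && (8) \\ A_{22} &= L_{21}^2 + L_{22}^2 & &\to\ L_{22} = (A_{22} - L_{21}^2)^{1/2} && (9) \\ &\dots && && (10) \end{aligned}$$

- No pivoting is needed; Cholesky is backward stable.
- Factorization: $\approx \dfrac{n^3}{3}$ flops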

Cholesky-type factorization algorithm

    do k=1, n
      A(k,k) = sqrt(A(k,k))
      A(k+1:n,k) = A(k+1:n,k)/A(k,k)
      do j=k+1, n
        A(j:n,j) = A(j:n,j) - A(j:n,k) * A(j,k)
      end do
    end do

- The scheme is similar to LU, but only the lower triangle is updated.
- For comparison, step k of the LU factorization:

    A(k+1:n,k) = A(k+1:n,k) / A(k,k)
    A(k+1:n,k+1:n) = A(k+1:n,k+1:n) - A(k+1:n,k) * A(k,k+1:n)
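A direct Python transcription of this column-by-column scheme is sketched below. The function name `cholesky`, the positivity check, and the small 2×2 test matrix are additions made for illustration; the routine works on a copy and zeroes the unused upper triangle before returning.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [row[:] for row in A]
    for k in range(n):
        if L[k][k] <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[k][k] = math.sqrt(L[k][k])
        for i in range(k + 1, n):
            L[i][k] /= L[k][k]                   # scale column k
        for j in range(k + 1, n):
            for i in range(j, n):                # update the lower triangle only
                L[i][j] -= L[i][k] * L[j][k]
    for i in range(n):
        for j in range(i + 1, n):
            L[i][j] = 0.0                        # discard the untouched upper part
    return L


L = cholesky([[4.0, 2.0], [2.0, 5.0]])
print(L)
```

For this matrix the factor is $L = \begin{bmatrix} 2 & 0 \\ 1 & 2 \end{bmatrix}$, which can be verified by hand from equations (7) to (9).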

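QR factorization

- Definition of a set of orthonormal vectors $\{x_1, \dots, x_k\}$:
  - $x_i^t x_j = 0 \quad \forall i \ne j$
  - $x_i^t x_i = 1$
- Orthogonal matrix $Q$: the column vectors of $Q$ are orthonormal, $QQ^t = I$, $Q^{-1} = Q^t$
- QR factorization: $A = QR$ with
  - $Q$ orthogonal
  - $R$ upper triangular

Example

$$\begin{bmatrix} 1 & -8 \\ 2 & -1 \\ 2 & 14 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & -\frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & \frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & -\frac{1}{3} \end{bmatrix} \begin{bmatrix} 3 & 6 \\ 0 & 15 \\ 0 & 0 \end{bmatrix} = Q \times R$$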

QR factorization (continued)

- The QR factorization is generally obtained by applying successive orthogonal transformations to the data:

$$Q = Q_1 \cdots Q_n$$

  where the $Q_i$ are simple orthogonal matrices such that $Q^tA = R$.

- Transformations used:
  - Householder reflections
  - Givens rotations
  - Gram-Schmidt process (in which case $Q$ is of size $m \times n$ and $R$ is of size $n \times n$)

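Householder reflections

$H = I - 2vv^t$ where $v$ is a vector of $\mathbb{R}^n$ such that $\|v\|_2 = 1$.
$H$ is orthogonal and symmetric.
In particular, it can be used to zero all the entries of a vector except one component.

- Example:

$$x = \begin{bmatrix} 2 \\ -1 \\ 2 \end{bmatrix}, \qquad u = x + \begin{bmatrix} \|x\|_2 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 5 \\ -1 \\ 2 \end{bmatrix} \quad \text{and} \quad v = \frac{u}{\|u\|_2}$$

Then:

$$H = I - 2v \times v^t = \frac{1}{15} \begin{bmatrix} -10 & 5 & -10 \\ 5 & 14 & 2 \\ -10 & 2 & 11 \end{bmatrix}$$

Hence:

$$H \times x = \begin{bmatrix} -3 \\ 0 \\ 0 \end{bmatrix}$$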
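The reflection of this example can be built and applied numerically. The sketch below forms $H$ explicitly for readability (a practical code would apply $H$ through $v$ without forming the matrix) and uses the sign choice $u = x + \|x\|_2 e_1$ of the example, taking $x = (2, -1, 2)^t$, the vector implied by the $H$ and $Hx$ shown on the slide. The function name is hypothetical.

```python
import math


def householder_matrix(x):
    """Return H = I - 2 v v^T with v = u / ||u||_2 and u = x + ||x||_2 e_1."""
    n = len(x)
    norm_x = math.sqrt(sum(t * t for t in x))
    u = x[:]
    u[0] += norm_x
    uu = sum(t * t for t in u)               # ||u||_2 squared
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] / uu
             for j in range(n)] for i in range(n)]


x = [2.0, -1.0, 2.0]
H = householder_matrix(x)
Hx = [sum(H[i][j] * x[j] for j in range(3)) for i in range(3)]
print(H)
print(Hx)   # close to (-3, 0, 0), up to rounding
```

When $x_1 < 0$ the sum $x_1 + \|x\|_2$ can suffer from cancellation, which is why the sign in $u = x \pm \|x\|_2 e_1$ is usually chosen to match the sign of $x_1$.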

Householder reflections (continued)

[Figure: reflection of $x$ with respect to the hyperplane orthogonal to Vect{u}]

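Householder reflections (continued)

Householder vector: $u = x \pm \|x\|_2 e_1$, then $v = u / \|u\|_2$.
These reflections produce matrices of the form:

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & a_{32} & a_{33} \\ 0 & a_{42} & a_{43} \\ 0 & a_{52} & a_{53} \end{bmatrix}$$

Let $H$ be such that:

$$H \begin{bmatrix} a_{22} \\ a_{32} \\ a_{42} \\ a_{52} \end{bmatrix} = \begin{bmatrix} a'_{22} \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

If $H' = \begin{bmatrix} 1 & 0 \\ 0 & H \end{bmatrix}$, then

$$H' \times A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a'_{22} & a'_{23} \\ 0 & 0 & a'_{33} \\ 0 & 0 & a'_{43} \\ 0 & 0 & a'_{53} \end{bmatrix}$$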

- Triangularization of a 4×3 matrix: $Q^t = H_3 \times H_2 \times H_1$

    | x x x |       | x x x |       | x x x |       | x x x |
    | x x x |  H1   | 0 x x |  H2   | 0 x x |  H3   | 0 x x |
    | x x x |  ->   | 0 x x |  ->   | 0 0 x |  ->   | 0 0 x | = R
    | x x x |       | 0 x x |       | 0 0 x |       | 0 0 0 |

- QR is backward stable, with a better backward error than LU.
- Number of operations $\approx \dfrac{4}{3} n^3$

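Givens rotations

- 2×2 rotation:

$$G(\theta) = \begin{bmatrix} c & s \\ -s & c \end{bmatrix} \quad \text{orthogonal, with } c = \cos(\theta) \text{ and } s = \sin(\theta).$$

- Use: for $x = (x_1, x_2)$, take

$$c = \frac{x_1}{\sqrt{x_1^2 + x_2^2}} \quad \text{and} \quad s = \frac{-x_2}{\sqrt{x_1^2 + x_2^2}}$$

  If $y = (y_1, y_2) = G^t x$, then $y_2 = 0$.

- This makes it possible to zero selected entries of a matrix.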

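Givens rotations (continued)

- Example: QR factorization of

$$A = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ 0 & r_{22} & r_{23} \\ 0 & 0 & r_{33} \\ v_1 & v_2 & v_3 \end{bmatrix}$$

- We determine $(c, s)$ such that:

$$\begin{bmatrix} c & s \\ -s & c \end{bmatrix}^t \begin{bmatrix} r_{11} \\ v_1 \end{bmatrix} = \begin{bmatrix} r'_{11} \\ 0 \end{bmatrix}$$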

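Givens rotations (continued)

- Rotation in the (1,4) plane:

$$G(1, 4) = \begin{bmatrix} c & 0 & 0 & s \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -s & 0 & 0 & c \end{bmatrix} \qquad G(1, 4)^t \times A = \begin{bmatrix} r'_{11} & r'_{12} & r'_{13} \\ 0 & r_{22} & r_{23} \\ 0 & 0 & r_{33} \\ 0 & v'_2 & v'_3 \end{bmatrix}$$

- Successive rotations are applied to zero the other entries: $\dfrac{4n^3}{3}$ flops to triangularize the matrix.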

Gram-Schmidt Process

- Hypothesis: a basis of a subspace is available
- Goal: build an orthonormal basis of that subspace
- Very useful in iterative methods, where:
  - each iterate is searched for in a subspace of increasing dimension
  - one needs to maintain a basis of good quality

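Consider two linearly independent vectors $x_1$ and $x_2$, and write $(a, b) = a^t b$ for the inner product.

- $q_1 = \dfrac{x_1}{\|x_1\|_2}$ has norm 1.
- $x_2 - (x_2, q_1)q_1$ is orthogonal to $q_1$:

$$(x_2 - (x_2, q_1)q_1,\ q_1) = x_2^t q_1 - (x_2^t q_1)\, q_1^t q_1 = 0$$

- $q_2 = \dfrac{x_2 - (x_2, q_1)q_1}{\|x_2 - (x_2, q_1)q_1\|_2}$ has norm 1.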

    1. Compute r_11 = ||x_1||_2; if r_11 = 0 stop
    2. q_1 = x_1 / r_11
    3. For j = 2, ..., r Do    (q_1, ..., q_{j-1} form an orthonormal basis)
    4.   r_ij <- x_j^t q_i, for i = 1, 2, ..., j-1
    5.   q <- x_j - sum_{i=1}^{j-1} r_ij q_i
    6.   r_jj = ||q||_2; if r_jj = 0 stop
    7.   q_j <- q / r_jj
    8. EndDo

Remarks

- From steps 5-7 of the algorithm, it is clear that $x_j = \sum_{i=1}^{j} r_{ij} q_i$.
- We note $X = [x_1, x_2, \dots, x_r]$ and $Q = [q_1, q_2, \dots, q_r]$.
- Let $R$ be the $r$-by-$r$ upper triangular matrix whose nonzeros are the $r_{ij}$ defined by the algorithm.
- Then the above relation can be written as

$$X = QR,$$

  where $Q$ is $n$-by-$r$ and $R$ is $r$-by-$r$.

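Example: $x_1 = (1, 2, 2)^T$, $x_2 = (-8, -1, 14)^T$

- $r_{11} = \|x_1\|_2 = 3$, $\quad q_1 = \dfrac{x_1}{r_{11}} = \dfrac{1}{3} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$, $\quad r_{12} = x_2^T q_1 = \dfrac{18}{3} = 6$, and

$$q = x_2 - r_{12} q_1 = \begin{bmatrix} -8 \\ -1 \\ 14 \end{bmatrix} - 6 \times \frac{1}{3} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} -10 \\ -5 \\ 10 \end{bmatrix}$$

  and $r_{22} = \|q\|_2 = 15$, $\quad q_2 = \dfrac{q}{\|q\|_2} = \dfrac{1}{3} \begin{bmatrix} -2 \\ -1 \\ 2 \end{bmatrix}$

- This corresponds to the factorization:

$$\begin{bmatrix} 1 & -8 \\ 2 & -1 \\ 2 & 14 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} \\ \frac{2}{3} & \frac{2}{3} \end{bmatrix} \times \begin{bmatrix} 3 & 6 \\ 0 & 15 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & -\frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & \frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & -\frac{1}{3} \end{bmatrix} \begin{bmatrix} 3 & 6 \\ 0 & 15 \\ 0 & 0 \end{bmatrix}$$

  The last form is the QR factorization with $Q$ orthogonal and $R$ upper triangular.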
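The classical Gram-Schmidt steps can be sketched in Python and run on the vectors of this example. The function name `gram_schmidt` and the representation of $X$ as a list of column vectors are choices made for this illustration; the coefficients $r_{ij}$ are computed from the original $x_j$, as in step 4 of the algorithm.

```python
import math


def gram_schmidt(X):
    """Classical Gram-Schmidt on a list of column vectors.

    Returns (Q, R): Q is the list of orthonormal vectors q_j and R is the
    r-by-r upper triangular matrix such that x_j = sum_i R[i][j] * q_i.
    """
    r = len(X)
    Q = []
    R = [[0.0] * r for _ in range(r)]
    for j in range(r):
        q = X[j][:]
        for i in range(j):
            R[i][j] = sum(a * c for a, c in zip(X[j], Q[i]))     # r_ij = x_j^t q_i
            q = [a - R[i][j] * c for a, c in zip(q, Q[i])]       # remove that component
        R[j][j] = math.sqrt(sum(a * a for a in q))
        if R[j][j] == 0.0:
            raise ValueError("vectors are linearly dependent")
        Q.append([a / R[j][j] for a in q])
    return Q, R


Q, R = gram_schmidt([[1.0, 2.0, 2.0], [-8.0, -1.0, 14.0]])
print(Q)
print(R)
```

Up to rounding, the output matches the hand computation: $r_{11} = 3$, $r_{12} = 6$, $r_{22} = 15$ and $q_2 = (-2, -1, 2)/3$.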


Least-squares problems

Let $A$ be $m \times n$, $b \in \mathbb{R}^m$, $m \ge n$ (and most often $m \gg n$).

- Problem: find $x$ such that $Ax = b$
- Overdetermined system: the existence of a solution is not guaranteed. So we look for the best solution in the sense of a norm:

$$\min_x \|Ax - b\|_2$$

[Figure: projection of $b$ onto Span{A}]

- Main approaches: normal equations or QR factorization

Normal equations

$$\min_x \|Ax - b\|_2 \ \leftrightarrow\ \min_x \|Ax - b\|_2^2$$

$$\|Ax - b\|_2^2 = (Ax - b)^t (Ax - b) = x^tA^tAx - 2x^tA^tb + b^tb$$

- Zero derivative with respect to $x$: $2A^tAx - 2A^tb = 0$ ⇒ system of size $n \times n$

$$A^tAx = A^tb$$

- $A^tA$ is symmetric positive semi-definite, and positive definite if $A$ has full rank ($\operatorname{rank}(A) = n$).
- Solution: with a Cholesky-type factorization $A^tA = LDL^t$
  - problem: $\kappa(A^tA) = \kappa(A)^2$
  - not always backward stable
- $(A^tA)^{-1}A^t$: pseudo-inverse of $A$

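Solution by QR factorization

If $Q$ is an orthogonal matrix:

$$\|Ax - b\| = \|Q^t(Ax - b)\| = \|(Q^tA)x - (Q^tb)\|$$

- $A$: $m \times n$, $Q$: $m \times m$ such that $A = QR$:

$$Q^tA = R = \begin{bmatrix} R_1 \\ 0 \end{bmatrix} \begin{array}{l} n \\ m - n \end{array}$$

  $R_1$ is upper triangular. Setting:

$$Q^tb = \begin{bmatrix} c \\ d \end{bmatrix} \begin{array}{l} n \\ m - n \end{array}$$

- we therefore have:

$$\|Ax - b\|_2^2 = \|Q^tAx - Q^tb\|_2^2 = \|R_1x - c\|_2^2 + \|d\|_2^2$$

- if $\operatorname{rank}(A) = \operatorname{rank}(R_1) = n$, then the solution is given by $R_1x = c$
- number of flops $\approx 2n^2 \times m$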

Eigenvalue problems

- Solve $Ax = \lambda x$, where the $\lambda$ are the eigenvalues and the $x$ the eigenvectors.
- Characteristic polynomial: $p(\lambda) = \det(A - \lambda I)$ (this amounts to finding $\lambda$ such that $A - \lambda I$ is singular).
- Let $T$ be nonsingular and $Ax = \lambda x$. Then

$$(T^{-1}AT)(T^{-1}x) = \lambda (T^{-1}x)$$

  $A$ and $T^{-1}AT$ are said to be similar matrices; they have the same eigenvalues.
  $T$: similarity transformation

Eigenvalue problems (continued)

We take $T = Q$, orthogonal.

- $A \leftarrow Q^tAQ$ is very attractive:
  - backward stable with Householder or Givens transformations
  - the computed $Q^tAQ$ is similar to $(A + E)$ with $\|E\| \approx u \times \|A\|$
- We therefore look for $Q$ such that the eigenvalues of $Q^tAQ$ are obvious.

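Example

Let $A$ be a 2×2 matrix with real eigenvalues. One can always find $(c, s)$ such that:

$$\begin{bmatrix} c & s \\ -s & c \end{bmatrix}^t \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} c & s \\ -s & c \end{bmatrix} = \begin{bmatrix} \lambda_1 & t \\ 0 & \lambda_2 \end{bmatrix}$$

$\lambda_1$ and $\lambda_2$ are the eigenvalues of $A$: this is the Schur decomposition $S = Q^tAQ$.

- If $y$ is an eigenvector of $S$, then $x = Qy$ is an eigenvector of $A$.
- The sensitivity of an eigenvalue to perturbations depends on how independent its eigenvector is from the eigenvectors of the other eigenvalues.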

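Eigenvalues: iterative methods

Power method

$$v_{k+1} = Av_k / \|Av_k\|$$

with $v_0$ chosen at random.

- converges towards $v$ such that $Av = \lambda_1 v$ (with $|\lambda_1| > |\lambda_2| \ge \dots \ge |\lambda_n|$)
- Proof:
  - if $v_0 = \sum_i \alpha_i x_i$, where $(x_i)$ is a basis of eigenvectors and $\alpha_1 \ne 0$, then

$$A^k v_0 = A^k \Bigl( \sum_{i=1}^{n} \alpha_i x_i \Bigr) = \sum_{i=1}^{n} \alpha_i \lambda_i^k x_i = \alpha_1 \lambda_1^k \Bigl( x_1 + \sum_{i=2}^{n} \frac{\alpha_i}{\alpha_1} \Bigl( \frac{\lambda_i}{\lambda_1} \Bigr)^k x_i \Bigr)$$

  - with $\left( \dfrac{\lambda_i}{\lambda_1} \right)^k \to 0$ for $i \ge 2$

This method is used in Google's PageRank computations.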
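A minimal Python sketch of the power iteration is given below. The eigenvalue estimate through the Rayleigh quotient $v^tAv$, the fixed iteration count, and the 2×2 test matrix (eigenvalues 3 and 1) are choices made for this illustration and are not taken from the slides.

```python
import math


def power_method(A, v0, iters=100):
    """Power iteration v <- A v / ||A v||_2; returns (eigenvalue estimate, vector)."""
    n = len(A)
    v = v0[:]
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * Av[i] for i in range(n))   # Rayleigh quotient, since ||v||_2 = 1
    return lam, v


lam, v = power_method([[2.0, 1.0], [1.0, 2.0]], [1.0, 0.0])
print(lam)   # close to 3, the dominant eigenvalue
print(v)     # close to (1, 1) / sqrt(2)
```

The convergence rate is governed by $|\lambda_2 / \lambda_1|$, here $1/3$, so far fewer than 100 iterations would already be enough for this matrix.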

Eigenvalues: iterative methods (continued)

Shift-and-invert

$$(A - \mu I)\, v_{k+1} = v_k$$

- power method applied to $(A - \mu I)^{-1}$
- yields the eigenvalue closest to $\mu$
- requires a factorization (for example, LU) of $(A - \mu I)$
- at each iteration: $Ly = v_k$, then $Uv_{k+1} = y$

Methods exist to accelerate the convergence.

Singular value decomposition (SVD)

- Let $A \in \mathbb{R}^{m \times n}$. Then there exist orthogonal matrices $U$ and $V$ such that:

$$A = U \Sigma V^t$$

  This is the singular value decomposition.

- Remark:

$$A^tA = V \Sigma^t \Sigma V^t \quad \text{and} \quad AA^t = U \Sigma \Sigma^t U^t$$

- $U \in \mathbb{R}^{m \times m}$ is formed by the $m$ orthonormal eigenvectors associated with the $m$ eigenvalues of $AA^t$.
- $V \in \mathbb{R}^{n \times n}$ is formed by the $n$ orthonormal eigenvectors associated with the eigenvalues of $A^tA$.
- $\Sigma$ is a diagonal matrix made of the singular values of $A$, which are the square roots of the eigenvalues of $A^tA$ (ordered so that $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0$).
- If $A$ has rank $r < n$, then $\sigma_{r+1} = \sigma_{r+2} = \dots = \sigma_n = 0$.
- Very useful in some applications when $\operatorname{rank}(A)$ is not maximal:
  - least squares
  - eigenvalues
  - accurate determination of the rank of a matrix
