Overview of numerical problems and implementation for circuit 3. …ece570/session10_2.pdf ·...

1 Session 10 – Part 2 ECE 570 - Computer Aided Engineering for Integrated Circuits - IC 752-E Solution Techniques for Sparse Linear Systems Overview of numerical problems and implementation for circuit simulation 1. Definition of sparse system 2. Problem of fill-ins 3. Sparsity and its preservation 4. Sparse matrix techniques literature 5. Discussion of numerical errors 6. Factorization and pivoting in sparse systems 7. Pivoting for sparsity and accuracy 8. Iterative methods 9. Implementation issues – storing sparse systems.


Page 1: Overview of numerical problems and implementation for circuit 3. …ece570/session10_2.pdf · 2003-10-19 · S.Pissanetsky, Sparse Matrix Technology, Academic Press, N.York 1984 Also

Session 10 - Part 2
ECE 570 - Computer Aided Engineering for Integrated Circuits - IC 752-E

Solution Techniques for Sparse Linear Systems

Overview of numerical problems and implementation for circuit simulation

1. Definition of sparse system
2. Problem of fill-ins
3. Sparsity and its preservation
4. Sparse matrix techniques literature
5. Discussion of numerical errors
6. Factorization and pivoting in sparse systems
7. Pivoting for sparsity and accuracy
8. Iterative methods
9. Implementation issues - storing sparse systems

1. Definition of Sparse System

Observation: in typical circuit equations generated using MNA (modified nodal analysis) there are about 3 nonzero elements per row.

Example 1: Assume $n = 10^3$ (small circuit).

Solving $Ax = b$ using traditional (dense) methods requires:

1° Operation count: $\sim O(n^3) = 10^9$ flops. On a 10 Mflop computer this takes $10^2$ seconds, i.e. about 1.7 min per iteration.

2° Memory: $n^2$ entries, i.e. $10^6$ locations.

Exploiting the sparsity one can achieve:

1° Operation count: $\sim O(n^{1.2 \sim 1.5})$. Improvement in comparison to the dense matrix (ratio of operation counts):

$$(10^3)^3 / (10^3)^{1.2 \sim 1.5} = 10^{5.4} \sim 10^{4.5}$$

2° Memory: $\sim O(n) \equiv 10^3$ locations. Improvement in comparison to the dense matrix (ratio of memory requirements):

$$(10^3)^2 / 10^3 = 10^3$$

Example 2: Assume $n = 10^4$ (LSI).

Requirements to solve $Ax = b$ with dense methods:

1° Operation count: $\sim O(n^3) = 10^{12}$
2° Memory: $\sim n^2 = 10^8$ locations

Exploiting the sparsity:

1° Operation count: $\sim O(n^{1.2 \sim 1.5})$. Improvement in comparison to the dense matrix:

$$(10^4)^3 / (10^4)^{1.2 \sim 1.5} = 10^{7.2} \sim 10^{6}$$

2° Memory: $\sim O(n) = 10^4$ locations. Improvement in comparison to the dense matrix:

$$(10^4)^2 / 10^4 = 10^4$$
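The arithmetic of the two examples can be reproduced with a few lines of Python. This is only a sketch of the order-of-magnitude estimates above; the function name `dense_vs_sparse` and the 10 Mflop rate passed in are illustrative assumptions, not part of the lecture.

```python
import math

def dense_vs_sparse(n, flops_per_second=10e6, exponents=(1.2, 1.5)):
    """Order-of-magnitude estimates for solving Ax = b with n unknowns."""
    dense_ops = n ** 3                      # ~O(n^3) operations, dense case
    dense_mem = n ** 2                      # n^2 storage locations, dense case
    seconds = dense_ops / flops_per_second  # time for one dense solve
    # log10 of the operation-count improvement for each sparse exponent
    ops_gain_log10 = [math.log10(dense_ops / n ** e) for e in exponents]
    mem_gain = dense_mem / n                # sparse storage ~O(n)
    return dense_ops, dense_mem, seconds, ops_gain_log10, mem_gain

for n in (10**3, 10**4):
    ops, mem, secs, gain, mem_gain = dense_vs_sparse(n)
    print(f"n={n}: ops={ops:.0e}, mem={mem:.0e}, time={secs:.0f}s, "
          f"ops gain=10^{gain[0]:.1f}..10^{gain[1]:.1f}, mem gain={mem_gain:.0e}")
```

The exponents 1.2 and 1.5 are the empirical range quoted above; actual gains depend on the circuit and on the ordering used.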

SPARSE MATRIX

There is no precise definition. Definitions encountered in the literature:

1° Using a limiting concept
A matrix of order $n$ is sparse if the number of nonzero elements is proportional to $n$ for $n$ sufficiently large. This definition is theoretical; it is useful in developing mathematical theory.

2° Special structure matrices (circuits)
A matrix is sparse if the number of nonzero entries per row is fixed (independent of $n$), typically 2 to 10.

3° Alternative definition
A matrix is sparse if the number of nonzero entries is $n^{1+\gamma}$, $\gamma < 1$; typically $\gamma = 0.2 \sim 0.5$.

4° Practical definition
$A$ is sparse if:
a) $A$ is large,
b) most of its entries are zero (at least 90%),
and then it pays to exploit the sparsity. The exploitation is not cheap!

In other words: a matrix is sparse when it is worthwhile to take advantage of the existence of (many) zero entries.

Leading concepts

A) Store only the nonzeros
B) Operate only on the nonzeros
C) Preserve the sparsity during the computation (decomposition)

Note: C) is a crucial requirement (perhaps the most crucial) in the elimination process $PAQ = LU$. We want the factors $L$ and $U$ to be sparse as well.

IMPLEMENTATION ISSUES

2. Problem of fill-ins

The process of LU factorization causes so-called fill-ins (generation of new nonzero entries):

$$a_{ij}^{(k+1)} = a_{ij}^{(k)} - \left( a_{ik}^{(k)} / a_{kk}^{(k)} \right) a_{kj}^{(k)}$$

where $a_{kk}^{(k)}$ is the PIVOT.

Example (X = original nonzero, x = fill-in; the pivot is the top-left entry):

    STAGE k                      STAGE k+1
    X X X X X                    X X X X X
    X X                          X X x x x
    X   X      Decomposition     X x X x x
    X     X         -->          X x x X x
    X       X                    X x x x X

Possible better solution:

1) Swap the first and last columns:

    X X X X X
      X     X
        X   X
          X X
    X       X

2) Swap the first and last rows:

    X       X                    X       X
      X     X                      X     X
        X   X    Decomposition       X   X
          X X         -->              X X
    X X X X X                    X X X X X

No new nonzero entries are generated: the factors keep the nonzero pattern of the reordered matrix.

3. Preserving the sparsity

The above example illustrates that the number of fill-ins can be controlled by reordering rows and columns, which shows that it is possible to preserve the sparsity.
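The effect of the reordering can be checked with a symbolic elimination that tracks only nonzero positions. The sketch below is an illustration under stated assumptions: pivots are taken down the diagonal in the given order, numerical cancellation is ignored, and the helper name `count_fill_ins` is hypothetical.

```python
def count_fill_ins(pattern):
    """Symbolic Gaussian elimination on a set of (row, col) nonzero positions.

    Pivots are taken down the main diagonal; returns the number of
    positions that become nonzero during elimination (fill-ins).
    """
    n = 1 + max(max(i, j) for i, j in pattern)
    nz = set(pattern)
    fill = 0
    for k in range(n):
        rows = [i for i in range(k + 1, n) if (i, k) in nz]
        cols = [j for j in range(k + 1, n) if (k, j) in nz]
        for i in rows:
            for j in cols:
                if (i, j) not in nz:   # a_ij was zero and is now updated
                    nz.add((i, j))
                    fill += 1
    return fill

n = 5
# Arrow matrix: full first row, full first column, full diagonal.
arrow = ({(i, i) for i in range(n)}
         | {(0, j) for j in range(n)}
         | {(i, 0) for i in range(n)})

def swap_first_last(t):
    """Index map that exchanges the first and last row (or column)."""
    return n - 1 - t if t in (0, n - 1) else t

reordered = {(swap_first_last(i), swap_first_last(j)) for i, j in arrow}

print(count_fill_ins(arrow))      # every zero of the active matrix fills in
print(count_fill_ins(reordered))  # the pattern is preserved
```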

Sad fact

It has been shown that finding a permutation of matrix $A$ that minimizes the number of fill-ins is NP-complete: no polynomial-time algorithm is known, and the worst-case complexity of an exhaustive algorithm is of the order of $2^q$, where $q$ is the number of nonzero elements of $A$.

Example: for $q = 40$, $2^q > 10^{12}$ !!

Conclusion: No efficient general algorithms to solve this problem are known. Heuristic algorithms are used instead to reduce the number of fill-ins.

The most commonly used, and quite successful, is the MARKOWITZ algorithm.

To introduce the algorithm it is necessary to define a quantity called the Markowitz measure of fill-ins in one stage of the elimination process.

The MARKOWITZ measure $f(a_{ij}^{(k)})$ of fill-in at stage $k$ for the element $a_{ij}^{(k)}$ (which is a candidate for a pivot) is defined as follows:

$$f(a_{ij}^{(k)}) \stackrel{\text{def}}{=} (r_i - 1)(c_j - 1); \quad a_{ij}^{(k)} \ne 0$$

$r_i$ - number of nonzero elements in row $i$
$c_j$ - number of nonzero elements in column $j$

$f(a_{ij}^{(k)})$ = maximum possible number of fill-ins created by choosing $a_{ij}^{(k)}$ as a pivot.

Example

$$
A^{(k)} =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
a_{21} & a_{22} & 0 & 0 & 0 \\
0 & a_{32} & a_{33} & 0 & 0 \\
0 & a_{42} & a_{43} & a_{44} & 0 \\
0 & a_{52} & 0 & a_{54} & a_{55}
\end{bmatrix},
\qquad
r - 1 = (4, 1, 1, 2, 2)^T, \quad c - 1 = (1, 4, 2, 2, 1)
$$

MARKOWITZ measures (a 0 marks a structural zero, which is not a pivot candidate):

$$
f =
\begin{bmatrix}
4 & 16 & 8 & 8 & 4 \\
1 & 4 & 0 & 0 & 0 \\
0 & 4 & 2 & 0 & 0 \\
0 & 8 & 4 & 4 & 0 \\
0 & 8 & 0 & 4 & 2
\end{bmatrix}
$$

Using the measures we see that $a_{21}$ (with $f = 1$) is clearly indicated as the element for pivoting.
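The table of measures can be recomputed from the nonzero pattern alone. The following is a minimal sketch with 0-based indices, so the slide's $a_{21}$ appears as position `(1, 0)`; the function name `markowitz_measures` is an assumption made for illustration.

```python
def markowitz_measures(pattern, n):
    """Return {(i, j): (r_i - 1) * (c_j - 1)} for every nonzero position."""
    r = [sum((i, j) in pattern for j in range(n)) for i in range(n)]  # row counts
    c = [sum((i, j) in pattern for i in range(n)) for j in range(n)]  # column counts
    return {(i, j): (r[i] - 1) * (c[j] - 1) for (i, j) in pattern}

# Nonzero pattern of the 5x5 example (0-based indices).
pattern = {
    (0, 0), (0, 1), (0, 2), (0, 3), (0, 4),
    (1, 0), (1, 1),
    (2, 1), (2, 2),
    (3, 1), (3, 2), (3, 3),
    (4, 1), (4, 3), (4, 4),
}

f = markowitz_measures(pattern, 5)
best = min(f, key=f.get)
print(best, f[best])   # position of the minimum measure and its value
```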

Thus, since we want to minimize the fill-ins, we select $a_{21}$ as the pivot (rows 1 and 2 are swapped) and obtain:

$$
A^{(k+1)} =
\begin{bmatrix}
a_{21} & a_{22} & 0 & 0 & 0 \\
0 & a_{12}^{(2)} & a_{13} & a_{14} & a_{15} \\
0 & a_{32} & a_{33} & 0 & 0 \\
0 & a_{42} & a_{43} & a_{44} & 0 \\
0 & a_{52} & 0 & a_{54} & a_{55}
\end{bmatrix}
$$

For the active matrix (rows 2 to 5, columns 2 to 5): $r - 1 = (3, 1, 2, 2)^T$ and $c - 1 = (3, 2, 2, 1)$.

The minimum Markowitz measure, $f = 2$, is attained by two elements, $a_{33}$ and $a_{55}$: a TIE (occurs often).

BREAKING THE TIES

SPICE breaks ties by selecting the element with the minimum column count ($a_{55}$ in the example). If ties still occur, the choice is arbitrary (any element remaining in the set after this selection is acceptable).

Example (continued)

Selecting $a_{55}$ and swapping rows and columns to bring it into the pivot position (original rows 1 and 5, columns 2 and 5) yields, for the active matrix before and after the elimination step:

$$
\begin{bmatrix}
a_{55} & 0 & a_{54} & a_{52} \\
0 & a_{33} & 0 & a_{32} \\
0 & a_{43} & a_{44} & a_{42} \\
a_{15} & a_{13} & a_{14} & a_{12}^{(2)}
\end{bmatrix}
\;\longrightarrow\;
\begin{bmatrix}
a_{55} & 0 & a_{54} & a_{52} \\
0 & a_{33} & 0 & a_{32} \\
0 & a_{43} & a_{44} & a_{42} \\
0 & a_{13} & a_{14}^{(3)} & a_{12}^{(3)}
\end{bmatrix}
$$

Active matrix in $A^{(3)}$:

$$
\begin{bmatrix}
a_{33} & 0 & a_{32} \\
a_{43} & a_{44} & a_{42} \\
a_{13} & a_{14}^{(3)} & a_{12}^{(3)}
\end{bmatrix},
\qquad
r - 1 = (1, 2, 2)^T, \quad c - 1 = (2, 1, 2)
$$

The minimum measure is again $f = 2$, attained by several elements (a tie).

NOTE: The column count selects $a_{44}$ and $a_{14}^{(3)}$; the selection between them is then arbitrary.

MARKOWITZ ALGORITHM - SUMMARY

1° At each stage $k$ of the decomposition, select the set $d$ of elements $a_{ps}^{(k)}$ with minimum MARKOWITZ measure:

$$f(a_{ps}^{(k)}) = \min_{k \le i \le n,\; k \le j \le n} f(a_{ij}^{(k)})$$

2° If $d$ contains one element, then $a_{ps}^{(k)} \rightarrow$ pivot; otherwise, select the subset $e$ of elements of $d$ with minimum column count $c_s$.

If $e$ contains only one element $a_{p_l s_l}^{(k)}$, then $a_{p_l s_l}^{(k)} \rightarrow$ pivot.

Otherwise, any element in the subset $e$ can be the pivot.

Logical choice: select the element with the largest magnitude.
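The selection rule summarized above can be sketched as a single function. This is an illustrative reading of the summary, not SPICE source code: the name `select_pivot`, the dictionary representation of the active matrix, and the use of the largest magnitude as the final tie-breaker are assumptions. The test pattern is the active matrix of the example after the first pivot, with the slide's 1-based labels and placeholder values of 1.0.

```python
def select_pivot(entries):
    """entries maps (row, col) -> value for the nonzeros of the active matrix."""
    row_count, col_count = {}, {}
    for (i, j) in entries:
        row_count[i] = row_count.get(i, 0) + 1
        col_count[j] = col_count.get(j, 0) + 1
    # Markowitz measure of every candidate
    f = {(i, j): (row_count[i] - 1) * (col_count[j] - 1) for (i, j) in entries}
    f_min = min(f.values())
    d = [ij for ij in entries if f[ij] == f_min]          # set d (step 1)
    c_min = min(col_count[j] for (_, j) in d)
    e = [ij for ij in d if col_count[ij[1]] == c_min]     # subset e (step 2)
    # remaining ties: take the element with the largest magnitude
    return max(e, key=lambda ij: abs(entries[ij]))

# Active matrix after pivoting on a21 (rows 1, 3, 4, 5; columns 2..5).
stage2 = {ij: 1.0 for ij in [
    (1, 2), (1, 3), (1, 4), (1, 5),
    (3, 2), (3, 3),
    (4, 2), (4, 3), (4, 4),
    (5, 2), (5, 4), (5, 5),
]}
print(select_pivot(stage2))
```

In this pattern $a_{33}$ and $a_{55}$ tie on the Markowitz measure, and the column count decides in favour of $a_{55}$, as in the example.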

Comments Concerning the Markowitz Algorithm

1. Simple and easy to implement - advantage.

2. Local - minimizes fill-ins in one stage only. Minimization over several stages might produce fewer fill-ins! Markowitz minimization is not global. The local character of the algorithm is a disadvantage.

3. Stability is not considered - minimization is performed without consideration of errors - disadvantage. Stability here refers to accuracy and round-off error propagation, which are not controlled by the Markowitz algorithm.

4. Many modifications of the algorithm exist. Versions especially suitable for circuit analysis will be discussed in conjunction with the forthcoming simplified analysis of round-off errors.

5. The Markowitz algorithm preserves symmetry, which means that the same ordering is obtained for either $A$ or $A^T$ - advantage. (Symmetry is affected by the breaking of ties and by threshold pivoting, discussed later.)

4. SPARSE MATRIX TECHNIQUES LITERATURE

Many publications are available. A sample of references in book form:

1. D. J. Evans (Ed.), Sparsity and Its Applications, Cambridge Univ. Press, London, 1985.

2. I. S. Duff (Ed.), Sparse Matrices and Their Uses, Academic Press, N. York, 1981.

3. S. Pissanetsky, Sparse Matrix Technology, Academic Press, N. York, 1984.

Also see the chapter by K. Kundert on sparse matrix techniques in the book series "Advances in CAD for VLSI" by North-Holland, Vol. 3: A. Ruehli (Ed.), Circuit Analysis, Simulation and Design; Part 1: General Aspects of Circuit Analysis and Design; Part 2: VLSI Circuit Analysis and Simulation.

5. Discussion of numerical errors - round-off errors (elementary)

Definitions:

$\circ$ - stands for one of the basic operations $+, -, \times, \div$
$a \circ b$ - exact arithmetic, no round-off error
$fl(a \circ b)$ - machine arithmetic, result with error

Simple model of the error in a floating-point operation:

$$fl(a \circ b) \stackrel{\text{def}}{=} (a \circ b)(1 + \varepsilon)$$

where $\varepsilon$ is a round-off error.

NOTE: $\varepsilon$ is bounded by the machine error $\varepsilon_M$, such that $|\varepsilon| \le \varepsilon_M$. For a CPU with a $t$-bit word length (mantissa) and rounding after each flop we have

$$\varepsilon_M = 2^{-t}$$

Single precision arithmetic is assumed in this discussion.
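The error model can be checked numerically by comparing a machine result with exact rational arithmetic. The sketch below makes one assumption that differs from the slide: Python floats are IEEE double precision ($t = 53$ significand bits, round to nearest) rather than single precision. The helper name `relative_rounding_error` is hypothetical.

```python
import operator
import sys
from fractions import Fraction

# Machine error for IEEE double precision with rounding to nearest: 2**-53.
EPS_M = Fraction(sys.float_info.epsilon) / 2

def relative_rounding_error(a, b, op):
    """Return |fl(a op b) - (a op b)| / |a op b| as an exact fraction."""
    exact = op(Fraction(a), Fraction(b))   # exact arithmetic on the stored operands
    computed = Fraction(op(a, b))          # fl(a op b), converted exactly
    return abs(computed - exact) / abs(exact)

a, b = 0.1, 0.3
for op in (operator.add, operator.sub, operator.mul, operator.truediv):
    eps = relative_rounding_error(a, b, op)
    print(op.__name__, float(eps), eps <= EPS_M)
```

The operands are taken as the stored floating-point values of 0.1 and 0.3, so only the error of the single operation is measured, which is what the model describes.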

Typical operation in the decomposition process:

$$a_{ij}^{(k+1)} = a_{ij}^{(k)} - \left( a_{ik}^{(k)} / a_{kk}^{(k)} \right) a_{kj}^{(k)}$$

We shall use a simplified notation:

$$a = b - (c/p)\, d$$

where $p$ stands for the pivot.

NOTE: In view of the introduced notation, the above is exact (theoretical).

Numerical result ($\hat{a}$):

$$\hat{a} = fl\{\, b - fl[\, fl(c/p) \cdot d \,] \,\}$$

Applying (sequentially) the error model:

1st flop: $\hat{a} = fl\{\, b - fl[\, (c/p)(1+\varepsilon_1)\, d \,] \,\}$

2nd flop: $\hat{a} = fl\{\, b - (c/p)\, d\, (1+\varepsilon_1)(1+\varepsilon_2) \,\}$

3rd flop: $\hat{a} = [\, b - (c/p)\, d\, (1+\varepsilon_1)(1+\varepsilon_2) \,](1+\varepsilon_3)$

thus:

$$\hat{a} = \left[ b - \frac{c}{p}\, d\, (1+\varepsilon_1)(1+\varepsilon_2) \right](1+\varepsilon_3); \quad |\varepsilon_i| \le \varepsilon_M$$

(single precision arithmetic).

First-order (linear) analysis

$$\hat{a} = \left[ b - \frac{c}{p}\, d\, (1 + \varepsilon_1 + \varepsilon_2 + \varepsilon_1 \varepsilon_2) \right](1+\varepsilon_3)$$

$$\hat{a} \cong \left[ b - \frac{c}{p}\, d\, (1 + \varepsilon_1 + \varepsilon_2) \right](1+\varepsilon_3)$$

$$\hat{a} = (1+\varepsilon_3)\, b - \frac{c}{p}\, d\, (1 + \varepsilon_1 + \varepsilon_2)(1+\varepsilon_3)
= (1+\varepsilon_3)\, b - \frac{c}{p}\, d\, (1 + \varepsilon_3 + \varepsilon_1 + \varepsilon_1 \varepsilon_3 + \varepsilon_2 + \varepsilon_2 \varepsilon_3)$$

$$\hat{a} \cong (1+\varepsilon_3)\, b - \frac{c}{p}\, d\, (1 + \varepsilon_1 + \varepsilon_2 + \varepsilon_3)$$

Global error of one operation

$$e \stackrel{\text{def}}{=} \hat{a} - a$$

Using the definition of the exact result, $a = b - (c/p)\, d$:

$$e = (1+\varepsilon_3)\, b - \frac{c}{p}\, d\, (1 + \varepsilon_1 + \varepsilon_2 + \varepsilon_3) - b + \frac{c}{p}\, d$$

$$e = \varepsilon_3\, b - \frac{c}{p}\, d\, (\varepsilon_1 + \varepsilon_2 + \varepsilon_3)$$

Error bound ($|\varepsilon_i| \le \varepsilon_M$):

$$|e| \le \varepsilon_M\, |b| + 3\, \varepsilon_M \left| \frac{c}{p}\, d \right|$$

DISCUSSION OF ERROR BOUND

$$|e| \le \varepsilon_M\, |b| + 3\, \varepsilon_M \left| \frac{c\, d}{p} \right|, \qquad \varepsilon_M = 2^{-t}$$

In the above, $p$ is the element that can be controlled by the pivoting strategy. To minimize the error (prevent error growth) one should choose the largest possible pivot elements. This usually conflicts with sparsity!

6. Pivoting and factorization techniques

6.1. Partial pivoting

Partial pivoting consists of finding $a_{ik}^{(k)}$ such that

$$\left| a_{ik}^{(k)} \right| = \max_{j = k, k+1, \ldots, n} \left| a_{jk}^{(k)} \right|$$

and swapping suitable rows to bring the element $a_{ik}^{(k)}$ into the pivot position.
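A small dense example shows why the largest element of the pivot column is preferred. The sketch below is a plain Gaussian elimination on Python lists (dense storage, no sparsity exploited); the function name `solve` and the 2x2 test system are illustrative assumptions.

```python
def solve(A, b, pivot=True):
    """Gaussian elimination with optional partial (row) pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix [A | b]
    for k in range(n):
        if pivot:
            # row holding the largest |a_jk| for j = k, ..., n-1
            i = max(range(k, n), key=lambda r: abs(M[r][k]))
            M[k], M[i] = M[i], M[k]
        for r in range(k + 1, n):
            m = M[r][k] / M[k][k]                    # multiplier m_rk
            for c in range(k, n + 1):
                M[r][c] -= m * M[k][c]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

A = [[1e-20, 1.0],
     [1.0,   1.0]]
b = [1.0, 2.0]
print(solve(A, b, pivot=True))    # close to the true solution (about [1, 1])
print(solve(A, b, pivot=False))   # the tiny pivot destroys the first component
```

Without pivoting the multiplier is $10^{20}$, so the term $(c/p)\,d$ in the error bound dominates and the information in the second equation is lost.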

Matrix schematic (the active matrix is shown):

$$
\begin{bmatrix}
a_{k,k}^{(k)} & a_{k,k+1}^{(k)} & \cdots & a_{k,n}^{(k)} \\
a_{k+1,k}^{(k)} & \cdot & \cdots & \cdot \\
\vdots & & & \vdots \\
a_{i,k}^{(k)} & \cdot & \cdots & a_{i,n}^{(k)} \\
\vdots & & & \vdots \\
a_{n,k}^{(k)} & \cdot & \cdots & a_{n,n}^{(k)}
\end{bmatrix}
$$

Swap rows $i$ and $k$.

Another illustration

[Figure: schematic of $A^{(k)}$; the entries below the diagonal in the already eliminated columns are zero, and rows $i$ and $k$ of the active matrix are marked.]

The effect of swapping is that the multipliers satisfy

$$|m_{ik}| \le 1; \quad i = k+1, \ldots, n$$

so the elimination step is numerically well behaved. A matrix is called well conditioned when its condition number is close to unity. Condition number of a matrix $A$:

$$K = \|A\| \cdot \|A^{-1}\|$$

If $\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)}$, then

$$K = \sqrt{\frac{\lambda_{\max}}{\lambda_{\min}}}$$

where $\lambda_{\max}$ and $\lambda_{\min}$ are the largest and smallest eigenvalues of $A^T A$.

6.2. Complete pivoting

Complete pivoting consists of finding the element $a_{i,j}^{(k)}$ such that

$$\left| a_{ij}^{(k)} \right| = \max_{k \le l \le n,\; k \le p \le n} \left| a_{lp}^{(k)} \right|$$

and swapping suitable rows and columns to bring the element $a_{i,j}^{(k)}$ into the pivot position.

Matrix illustration (the active matrix is shown):

$$
\begin{bmatrix}
a_{k,k}^{(k)} & \cdots & a_{k,j}^{(k)} & \cdots & a_{k,n}^{(k)} \\
a_{k+1,k}^{(k)} & \cdot & \cdot & \cdot & \cdot \\
\vdots & & \vdots & & \vdots \\
a_{i,k}^{(k)} & \cdots & a_{i,j}^{(k)} & \cdots & a_{i,n}^{(k)} \\
\vdots & & \vdots & & \vdots \\
a_{n,k}^{(k)} & \cdots & a_{n,j}^{(k)} & \cdots & a_{n,n}^{(k)}
\end{bmatrix}
$$

Swap columns $j$ and $k$; swap rows $i$ and $k$.

Another illustration

[Figure: schematic of $A^{(k)}$; the entries below the diagonal in the already eliminated columns are zero, and row $i$ and column $j$ of the active matrix are marked for swapping with row $k$ and column $k$.]

Pivoting changes the original matrix $A$ to $PAQ$, where $P$ and $Q$ are permutation matrices. Thus we have to consider the resulting modification to the original system.

Original system:

$$Ax = b \qquad (*)$$

With pivoting we do not get $A = LU$, but $PAQ = LU$.

Therefore we transform $(*)$ into $PAx = Pb$. Then, since $QQ^T = I$, we write

$$\underbrace{PAQ}_{LU}\, Q^T x = Pb$$

or explicitly

$$L U Q^T x = Pb$$

Solution procedure

$$L \underbrace{U \underbrace{Q^T x}_{z}}_{y} = Pb$$

1* $Ly = Pb \;\rightarrow\; y$
2* $Uz = y \;\rightarrow\; z$
3* $Q^T x = z$ or, since $QQ^T = I$: $\; Q Q^T x = Q z$
4* $x = Qz \;\rightarrow\; x$
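The four steps can be sketched on a small dense system. The code below is an illustration under stated assumptions: it uses complete pivoting, stores the permutations $P$ and $Q$ as index lists `p` and `q` rather than matrices, and uses an LU variant with unit diagonal in $L$ (the lecture later prefers $u_{ii} = 1$). The function names are hypothetical.

```python
def lu_complete_pivoting(A):
    """Return (M, p, q): L (unit diagonal) and U packed in M, with PAQ = LU."""
    n = len(A)
    M = [row[:] for row in A]
    p = list(range(n))   # row i of PAQ is row p[i] of A
    q = list(range(n))   # column j of PAQ is column q[j] of A
    for k in range(n):
        # largest element of the active matrix
        i, j = max(((r, c) for r in range(k, n) for c in range(k, n)),
                   key=lambda rc: abs(M[rc[0]][rc[1]]))
        M[k], M[i] = M[i], M[k]              # swap rows i and k
        p[k], p[i] = p[i], p[k]
        for row in M:                        # swap columns j and k
            row[k], row[j] = row[j], row[k]
        q[k], q[j] = q[j], q[k]
        for r in range(k + 1, n):
            M[r][k] /= M[k][k]               # multiplier, stored in place of L
            for c in range(k + 1, n):
                M[r][c] -= M[r][k] * M[k][c]
    return M, p, q

def solve_paq(M, p, q, b):
    n = len(b)
    y = [b[p[i]] for i in range(n)]          # right-hand side P b
    for i in range(n):                       # 1*: L y = P b (forward substitution)
        y[i] -= sum(M[i][j] * y[j] for j in range(i))
    z = [0.0] * n
    for i in reversed(range(n)):             # 2*: U z = y (back substitution)
        z[i] = (y[i] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    x = [0.0] * n
    for j in range(n):                       # 4*: x = Q z
        x[q[j]] = z[j]
    return x

A = [[2.0, 1.0, 1.0], [4.0, -6.0, 0.0], [-2.0, 7.0, 2.0]]
b = [5.0, -2.0, 9.0]
M, p, q = lu_complete_pivoting(A)
print(solve_paq(M, p, q, b))   # close to [1, 1, 2]
```

Storing the permutations as index lists means that steps 3* and 4* reduce to a reindexing of $z$, with no matrix product.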

Scaling and Equilibration of Matrices

Scaling a matrix

Motivation:
Range of voltages: $10 \rightarrow 10^{-3}$ [V]
Range of currents: $10^{-3} \rightarrow 10^{-15}$ [A]

Multiplication of column $j$ by $\alpha_j$ is equivalent to replacing $x_j$ by

$$\hat{x}_j = \frac{x_j}{\alpha_j}$$

Multiplication of row $i$ by $\beta_i$ is equivalent to scaling the right-hand side entry $b_i$ by the factor $\beta_i$, i.e. replacing $b_i$ by $\hat{b}_i = \beta_i b_i$.

Compact form of scaling

Define:

$$D_1 = \mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_m), \qquad D_2 = \mathrm{diag}(\beta_1, \beta_2, \ldots, \beta_m)$$

Then the scaling of $x$ and $b$ can be written as $x = D_1 \hat{x}$ and $\hat{b} = D_2 b$, which turns $Ax = b$ into

$$A D_1 \hat{x} = b \quad \text{(column scaling)}$$

$$D_2 A D_1 \hat{x} = D_2 b = \hat{b} \quad \text{(row scaling)}$$

or the new (scaled) system $\hat{A} \hat{x} = \hat{b}$, where $\hat{A} = D_2 A D_1$.

Scaling may be used to equilibrate a matrix: we want the entries of the matrix to be of the same order of magnitude.

Matrix equilibration

A matrix is row equilibrated if

$$\max_{1 \le j \le m} |a_{ij}| \approx 1 \quad \text{for } \forall\, i$$

A matrix is column equilibrated if

$$\max_{1 \le i \le m} |a_{ij}| \approx 1 \quad \text{for } \forall\, j$$

A matrix is equilibrated if it is both row and column equilibrated.

Sad observation: there is no unique way to equilibrate a matrix!

Example:

$$
A =
\begin{bmatrix}
1 & 1 & 2 \cdot 10^{9} \\
2 & -1 & 10^{9} \\
1 & 2 & 0
\end{bmatrix}
$$

a) Performing column equilibration as the first operation, we get:

$$
\hat{A} =
\begin{bmatrix}
0.5 & 0.5 & 1 \\
1 & -0.5 & 0.5 \\
0.5 & 1 & 0
\end{bmatrix}
$$

Note that $\hat{A}$ is also row equilibrated.

b) Performing row equilibration as the first operation yields:

$$
\hat{A} =
\begin{bmatrix}
5 \cdot 10^{-10} & 5 \cdot 10^{-10} & 1 \\
2 \cdot 10^{-9} & -10^{-9} & 1 \\
0.5 & 1 & 0
\end{bmatrix}
$$

Then $\hat{A}$ is also column equilibrated. But the two matrices are very different.
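Both equilibrations of the example can be reproduced directly. The sketch below assumes the simplest choice of scale factors, namely dividing each column (or row) by its largest magnitude; the function names are illustrative.

```python
def column_equilibrate(A):
    """Divide every column by its largest magnitude (A * D1)."""
    m = len(A)
    col_max = [max(abs(A[i][j]) for i in range(m)) for j in range(m)]
    return [[A[i][j] / col_max[j] for j in range(m)] for i in range(m)]

def row_equilibrate(A):
    """Divide every row by its largest magnitude (D2 * A)."""
    return [[a / max(abs(v) for v in row) for a in row] for row in A]

A = [[1.0, 1.0, 2e9],
     [2.0, -1.0, 1e9],
     [1.0, 2.0, 0.0]]

by_columns = column_equilibrate(A)
by_rows = row_equilibrate(A)
for row in by_columns:
    print(row)
for row in by_rows:
    print(row)
```

The two results agree only in the last row, which illustrates that equilibration is not unique.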

Suggested exercises:

1. Write the scaling matrices $D_1$, $D_2$ for the above matrices (trivial) and perform the multiplication / equilibration.

2. Investigate the effect of scaling on the accuracy (global round-off error) in the $k$-th stage of factorization:

$$a_{ij}^{(k+1)} = a_{ij}^{(k)} - \frac{a_{ik}^{(k)}}{a_{kk}^{(k)}}\, a_{kj}^{(k)}$$

Assume no round-off error in scaling.

Exploiting the sparsity of the right-hand side

Use Crout's decomposition or modified Gaussian elimination to get $u_{ii} = 1$. This minimizes the number of operations in back substitution, $Ux = y$ (no division by diagonal entries).

The elements $l_{ii} \ne 1$ in general, but with sparsity in $b$ this is advantageous. Note that it is beneficial to have the nonzero entries of the vector $b$ clustered at the bottom (explain why?).

7. Pivoting for sparsity and accuracy

Modification of the Markowitz algorithm

1° Find the element with maximum magnitude:

$$\left| a_{mx}^{(k)} \right| = \max_{k \le l \le n,\; k \le p \le n} \left| a_{lp}^{(k)} \right|$$

2° Select the set $D$ of elements with minimum Markowitz measure. Eliminate from it the elements for which

$$\left| a_{ij}^{(k)} \right| < u \cdot \left| a_{mx}^{(k)} \right|$$

Here $u$ is a threshold parameter that needs to be chosen. Often $u = 0.1$.

Elimination of the small elements leaves a subset of $D$. If this subset is empty: warning message. Otherwise proceed with acceptance of the pivot (if the subset contains one element only) or perform further elimination as in the classical (sparsity-oriented) Markowitz algorithm. The approach described above is also known as threshold pivoting (or relative threshold).
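The two steps can be sketched as a filter on the candidate set. This is a literal reading of the description above, not code from any simulator; the name `threshold_markowitz`, the dictionary representation, and the sample values are assumptions.

```python
def threshold_markowitz(entries, u=0.1):
    """Return the minimum-Markowitz candidates that pass the magnitude test.

    entries maps (row, col) -> value for the nonzeros of the active matrix.
    An empty result corresponds to the warning case in the text.
    """
    a_max = max(abs(v) for v in entries.values())            # step 1
    row_count, col_count = {}, {}
    for (i, j) in entries:
        row_count[i] = row_count.get(i, 0) + 1
        col_count[j] = col_count.get(j, 0) + 1
    f = {(i, j): (row_count[i] - 1) * (col_count[j] - 1) for (i, j) in entries}
    f_min = min(f.values())
    D = [ij for ij in entries if f[ij] == f_min]             # step 2
    return [ij for ij in D if abs(entries[ij]) >= u * a_max]

# All six entries tie on the Markowitz measure; one of them is too small.
active = {(0, 0): 1e-6, (0, 1): 2.0, (1, 1): 1.0,
          (1, 2): 1.0, (2, 0): 3.0, (2, 2): 0.5}
print(threshold_markowitz(active))

# The only minimum-measure element is tiny, so nothing survives the test.
risky = {(0, 0): 1e-6, (1, 1): 1.0, (1, 2): 1.0, (2, 1): 1.0, (2, 2): 1.0}
print(threshold_markowitz(risky))
```

The surviving candidates would then go through the usual tie-breaking (column count, then magnitude).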

Circuit simulators based on MNA restrict the pivot choice to the main diagonal whenever possible (if the elements are not too small; this involves some kind of threshold consideration).

Note:
1° MNA usually yields matrices that are diagonally dominant, so the strategy is suitable.

2° This strategy preserves symmetry if the threshold is not involved.

Pivoting in SPICE 2g.6 (and newer versions)

1° Choose the element $a_{ii}^{(k)}$ on the main diagonal with minimum Markowitz measure.

2° Check it against the largest element $a_{mx}$ in the same column: if

$$\left| a_{ii}^{(k)} \right| < \varepsilon_A + \varepsilon_R \left| a_{mx} \right|$$

reject it; otherwise $a_{ii}^{(k)} \rightarrow$ pivot. Here

$\varepsilon_A$ = pivtol = $10^{-13}$ (absolute tolerance)
$\varepsilon_R$ = pivrel = $10^{-3}$ (relative tolerance)

3° If rejected, select the next element on the diagonal (go to 1° and repeat the operations, excluding $a_{ii}^{(k)}$).

4° If all elements on the diagonal fail the threshold test, perform the selection over all elements of the active matrix using the Markowitz algorithm with threshold (2°).

Many other methods attempting to improve the solution of sparse systems are available. There is a class of methods aiming at minimizing the profile of matrix $A$.

Definitions

Bandwidth: $A$ has bandwidth $m$ if $a_{ij} = 0$ when $|i - j| > m$.

Particular cases:
$m = 0$: diagonal
$m = 1$: tridiagonal

Profile: $m_i$ is the bandwidth of the $i$-th row, $i = 1, 2, \ldots, n$; $P$ is the matrix profile:

$$P \stackrel{\text{def}}{=} \sum_{i=1}^{n} m_i$$

These methods aim at developing ordering schemes that minimize the profile. Most popular: Cuthill-McKee (C-M) and Reverse C-M.
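Both quantities are easy to compute from a nonzero pattern. The sketch below assumes that the row bandwidth $m_i$ is the largest $|i - j|$ over the nonzeros of row $i$; other texts measure the profile from the first nonzero of each row to the diagonal, so this definition is an assumption, as are the function names.

```python
def bandwidth(pattern):
    """Smallest m such that a_ij = 0 whenever |i - j| > m."""
    return max(abs(i - j) for i, j in pattern)

def profile(pattern, n):
    """Sum of row bandwidths m_i, with m_i = max |i - j| over row i (assumed)."""
    m = [0] * n
    for i, j in pattern:
        m[i] = max(m[i], abs(i - j))
    return sum(m)

n = 4
diagonal = {(i, i) for i in range(n)}
tridiagonal = diagonal | {(i, i + 1) for i in range(n - 1)} \
                       | {(i + 1, i) for i in range(n - 1)}

print(bandwidth(diagonal), profile(diagonal, n))
print(bandwidth(tridiagonal), profile(tridiagonal, n))
```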

8. Iterative methods

Basic definitions:

Original system: $Ax = b$
True (theoretical) solution: $x^*$
Starting vector: $x_0$
Sequence of approximate (numerical) solutions: $x_1, x_2, \ldots, x_k$

Iteration error: $e_k = x_k - x^*$
Convergence requirement: $e_k \rightarrow 0$ as $k \rightarrow \infty$.

Constructing an iterative scheme: plugging $A = B + (A - B)$ into the original system yields

$$Bx + (A - B)x = b$$

General scheme:

$$B x^{(k+1)} = -(A - B)\, x^{(k)} + b$$

$B$ is an arbitrarily selected matrix. The sophistication comes in the selection process.

One requirement: $B$ must be easy to invert. If $B$ is inverted we can write

$$x^{(k+1)} = \underbrace{(I - B^{-1}A)}_{Q}\, x^{(k)} + B^{-1} b$$

This formula defines the iteration matrix: $Q = B^{-1}(B - A)$.

The sufficient convergence condition:

$$\rho(Q) = \rho(I - B^{-1}A) < 1$$

where $\rho(Q)$ represents the spectral radius of matrix $Q$. If the eigenvalues of $Q$ are known ($\lambda_1, \lambda_2, \ldots, \lambda_n$), then

$$\rho(Q) = \max_{1 \le i \le n} |\lambda_i|$$

Examples of particular choices of matrix $B$

Gauss-Jacobi scheme

A simple decomposition of $A$ produces $A = L + D + U$, where $D$ is the main diagonal and $L$, $U$ are the lower and upper triangular matrices (with zero main diagonal elements).

1. Choosing $B = D$ yields the Gauss-Jacobi scheme, where $A - B = L + U$. The iteration matrix is then

$$B^{-1}(B - A) = -D^{-1}(L + U)$$

If $A$ is diagonally dominant, i.e.

$$|a_{ii}| > \sum_{j=1,\; j \ne i}^{n} |a_{ij}|$$

then the spectral radius of the iteration matrix satisfies the convergence condition $\rho\left[ D^{-1}(L + U) \right] < 1$.
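The Gauss-Jacobi scheme with $B = D$ can be sketched componentwise. The code below uses dense Python lists for brevity (a real implementation would loop over stored nonzeros only); the function name `jacobi`, the fixed iteration count, and the test system are illustrative assumptions.

```python
def jacobi(A, b, iterations=50):
    """Gauss-Jacobi: D x_new = -(L + U) x_old + b, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # every component uses only values from the previous iterate
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

# Diagonally dominant test system with known solution [1, 2, 3].
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [6.0, 12.0, 14.0]
print(jacobi(A, b))
```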

Other methods:

2. $B = L + D$: Gauss-Seidel

The Gauss-Jacobi (1) and Gauss-Seidel (2) schemes have been around for a long time.

3. $B_w = \frac{1}{w} D + L$: Successive Over-Relaxation (SOR), with relaxation parameter $0 < w < 2$.

The relaxation parameter $w$ should be selected to minimize

$$\rho\left[ I - B_w^{-1} A \right]$$

The above iteration matrices are usually unsymmetric.
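Written componentwise, the scheme with $B_w = \frac{1}{w} D + L$ becomes an in-place sweep over the unknowns, and $w = 1$ gives Gauss-Seidel. The sketch below uses dense Python lists for brevity; the function name `sor`, the fixed iteration count, and the test system are assumptions.

```python
def sor(A, b, w=1.0, iterations=50):
    """Successive over-relaxation; w = 1.0 reduces to Gauss-Seidel."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # x already holds the new values for j < i and the old ones for j >= i
            residual_i = b[i] - sum(A[i][j] * x[j] for j in range(n))
            x[i] += w * residual_i / A[i][i]
    return x

A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [6.0, 12.0, 14.0]
print(sor(A, b, w=1.0))   # Gauss-Seidel
print(sor(A, b, w=1.1))   # mild over-relaxation
```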

Symmetric over-relaxation (SSOR)

$$B_w = \frac{1}{w} D + L \quad \text{in one step}$$

$$B_w = \frac{1}{w} D + U \quad \text{in another step}$$

Note: $w$ must be the same in the 2 consecutive steps. It has been shown that the matrix taken over the 2 steps is symmetric if $A$ is symmetric.

Alternating direction implicit (ADI)

The matrix $A$ of the original system is split as $A = H + V$. This produces $(H + V)x = b$, or $Hx + Vx = b$; from here a scheme is constructed.

a) $Hx = -Vx + b$
$Hx + \alpha I x = \alpha I x - Vx + b$ ($\alpha$ is a parameter)
$(H + \alpha I)x = (\alpha I - V)x + b$

1st iterative scheme:

$$(H + \alpha_{k+1} I)\, x^{(k+\frac{1}{2})} = (\alpha_{k+1} I - V)\, x^{(k)} + b$$

b) $Vx = -Hx + b$
$Vx + \alpha I x = \alpha I x - Hx + b$
$(V + \alpha I)x = (\alpha I - H)x + b$

2nd iterative scheme:

$$(V + \alpha_{k+1} I)\, x^{(k+1)} = (\alpha_{k+1} I - H)\, x^{(k+\frac{1}{2})} + b$$

Splitting of $A$ is a problem. In some cases of unsymmetric matrices:

$$H = \frac{1}{2}(A + A^T), \qquad V = \frac{1}{2}(A - A^T)$$

Page 43: Overview of numerical problems and implementation for circuit 3. …ece570/session10_2.pdf · 2003-10-19 · S.Pissanetsky, Sparse Matrix Technology, Academic Press, N.York 1984 Also

43

Conjugate gradient methods (many var iations are available)

Or iginally developed for symmetr ic matr ices, but have been expanded to general cases. Basic scheme:

$x = 0$;
for $k = 1, 2, \ldots$
    $r \leftarrow b - Ax$
    if $r = 0$
        then EXIT
    else if $k = 1$
        then $p \leftarrow r$
    else
        $\beta \leftarrow -\dfrac{p^T A r}{p^T A p}$
        $p \leftarrow r + \beta p$
    $\alpha \leftarrow \dfrac{r^T r}{p^T A p}$
    $x \leftarrow x + \alpha p$
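The basic scheme translates almost line by line into Python. The sketch below is a minimal illustration for a symmetric positive definite A stored as dense lists. It assumes that the exact test r = 0 is replaced by a small tolerance on the residual norm, and it recomputes products with A for readability rather than reusing them; the function name, `tol`, `max_iter` and the test system are hypothetical.

```python
def conjugate_gradient(A, b, tol=1e-12, max_iter=100):
    """Basic conjugate gradient scheme for symmetric positive definite A."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    p = None
    for k in range(1, max_iter + 1):
        r = [bi - ai for bi, ai in zip(b, matvec(x))]      # r <- b - A x
        if dot(r, r) ** 0.5 <= tol:                        # "r = 0" within tolerance
            break
        if k == 1:
            p = r[:]                                       # p <- r
        else:
            beta = -dot(p, matvec(r)) / dot(p, matvec(p))  # beta <- -p^T A r / p^T A p
            p = [ri + beta * pi for ri, pi in zip(r, p)]   # p <- r + beta p
        alpha = dot(r, r) / dot(p, matvec(p))              # alpha <- r^T r / p^T A p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # x <- x + alpha p
    return x


# Hypothetical test system; exact solution is [1, 2, 3].
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = conjugate_gradient(A, b)
```

A production version would update r by recurrence and keep the product A p, so that each iteration costs a single multiplication by A.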

General properties of the conjugate gradient method:

1° One multiplication by A per iteration.
2° The true solution is reached in m iterations, where m is the number of distinct eigenvalues of A.
3° Convergence is faster in the case of matrices whose eigenvalues are closely clustered (well-conditioned matrices).

Advanced monograph: G. H. Golub, C. F. Van Loan, Matrix Computations, Johns Hopkins Univ. Press, 1983.
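Property 2° can be checked numerically on a small case. The sketch below is an illustration under stated assumptions: the 4 × 4 matrix is a made-up example equal to the identity plus the all-ones matrix, so its eigenvalues are 1 (three times) and 5, i.e. m = 2; the helper `cg_count` is hypothetical and counts how many updates of x are made before the residual norm falls below a tolerance.

```python
def cg_count(A, b, tol=1e-10, max_iter=50):
    """Run the basic CG scheme and return (x, number of updates of x)."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    p = None
    updates = 0
    for k in range(1, max_iter + 1):
        r = [bi - ai for bi, ai in zip(b, matvec(x))]
        if dot(r, r) ** 0.5 <= tol:
            break
        if k == 1:
            p = r[:]
        else:
            beta = -dot(p, matvec(r)) / dot(p, matvec(p))
            p = [ri + beta * pi for ri, pi in zip(r, p)]
        alpha = dot(r, r) / dot(p, matvec(p))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        updates += 1
    return x, updates


# Identity plus all-ones matrix: eigenvalues 1, 1, 1, 5 (two distinct values).
A = [[2.0, 1.0, 1.0, 1.0],
     [1.0, 2.0, 1.0, 1.0],
     [1.0, 1.0, 2.0, 1.0],
     [1.0, 1.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0, 4.0]          # exact solution is [-1, 0, 1, 2]
x, updates = cg_count(A, b)       # two updates are enough here
```

In floating-point arithmetic the residual after the second update is not exactly zero, which is why the sketch tests against a tolerance.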

9. Implementation issues - storing a sparse matrix

(Data structures)

Many techniques (data structures) are available. A trade-off between storage and speed of calculations is involved. SPICE2f (and newer versions) employ a

bi-directional threaded list.

Each entry is represented by
- value
- row index
- column index
- pointer to the next nonzero entry in the same column. If there is no next element, then the pointer is zero.
- pointer to the next nonzero entry in the same row. If there is no next element, the pointer is zero.

Pointers indicate the position of the next entry in the storage system.
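One way to picture such an entry is as a small record with five fields. The sketch below is illustrative only: the class and field names are hypothetical, positions are 1-based so that 0 can mean "no next entry", and the 2 × 2 matrix is a made-up example rather than the one used in the lecture.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    value: float
    row: int
    col: int
    next_in_col: int = 0  # position of the next nonzero in the same column, 0 if none
    next_in_row: int = 0  # position of the next nonzero in the same row, 0 if none


# Storage for the matrix [[4, 1], [0, 3]]; index 0 is a placeholder so that
# real entries start at position 1.
store = [
    None,
    Entry(4.0, 1, 1, next_in_col=0, next_in_row=2),  # position 1: a(1,1)
    Entry(1.0, 1, 2, next_in_col=3, next_in_row=0),  # position 2: a(1,2)
    Entry(3.0, 2, 2),                                # position 3: a(2,2)
]

right_of_first = store[store[1].next_in_row]   # follows the row thread to a(1,2)
below_second = store[store[2].next_in_col]     # follows the column thread to a(2,2)
```

A FORTRAN-oriented code would hold the same five fields in parallel arrays rather than in records.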

Graphical Illustration of Information Related to an Entry

VALUE | ROW | COL. | PTR1 | PTR2

The two pointer fields hold the position of the next entry in the same column (zero if none) and the position of the next entry in the same row (zero if none).

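Example

$$A = \begin{bmatrix} 9 & 3 & 7 & 2 \\ 6 & 5 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 8 & 0 & 7 \end{bmatrix}$$

with rows i = 1, 2, 3, 4 (top to bottom) and columns j = 1, 2, 3, 4 (left to right).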

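Bidirectional Threaded List – concept

[Figure: the nine nonzero entries of A placed on a grid of Row 1 to Row 4 and Col. 1 to Col. 4, each node showing its value, indices and pointers. Entries are chained along each row from the starting row pointers and along each column from the starting column pointers.]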

Bidirectional threaded list – implementation (FORTRAN oriented)

Entry arrays:

Position   VAL   IROW   JCOL   IPT   JPT
(1)         9     1      1      2     5
(2)         3     1      2      3     6
(3)         7     1      3      4     7
(4)         2     1      4      0     9
(5)         6     2      1      6     0
(6)         5     2      2      0     8
(7)         1     3      3      0     0
(8)         8     4      2      9     0
(9)         7     4      4      0     0

Starting pointers:

Index           1   2   3   4
ISPT (Row)      1   5   7   8
JSPT (Column)   1   2   3   4

IPT(k) and JPT(k) give the position of the next entry in the same row and in the same column, respectively (zero if none). ISPT(i) and JSPT(j) give the position of the first entry of row i and of column j.
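The arrays for the example matrix can be generated mechanically by scanning A row by row. The following Python sketch is one possible construction, not the SPICE code: it keeps the FORTRAN-style array names from the slides, leaves index 0 unused so that positions are 1-based and 0 means "no next entry", and the function name `build_threaded_list` is hypothetical.

```python
def build_threaded_list(A):
    """Build VAL, IROW, JCOL, IPT, JPT, ISPT, JSPT from a dense matrix A."""
    n_rows, n_cols = len(A), len(A[0])
    VAL, IROW, JCOL, IPT, JPT = [None], [0], [0], [0], [0]  # index 0 unused
    ISPT = [0] * (n_rows + 1)        # starting row pointers
    JSPT = [0] * (n_cols + 1)        # starting column pointers
    last_in_col = [0] * (n_cols + 1)
    for i in range(1, n_rows + 1):
        last_in_row = 0
        for j in range(1, n_cols + 1):
            a = A[i - 1][j - 1]
            if a == 0:
                continue
            VAL.append(a)
            IROW.append(i)
            JCOL.append(j)
            IPT.append(0)
            JPT.append(0)
            pos = len(VAL) - 1
            if last_in_row:
                IPT[last_in_row] = pos      # thread from previous entry in this row
            else:
                ISPT[i] = pos               # first entry of row i
            if last_in_col[j]:
                JPT[last_in_col[j]] = pos   # thread from previous entry in this column
            else:
                JSPT[j] = pos               # first entry of column j
            last_in_row = pos
            last_in_col[j] = pos
    return VAL, IROW, JCOL, IPT, JPT, ISPT, JSPT


A = [[9, 3, 7, 2],
     [6, 5, 0, 0],
     [0, 0, 1, 0],
     [0, 8, 0, 7]]
VAL, IROW, JCOL, IPT, JPT, ISPT, JSPT = build_threaded_list(A)
```

Because the scan is row-major, entries of one row occupy consecutive positions; fill-ins added later break that pattern, which is why the explicit pointers are needed.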

Examples of Using the Structure

1. Scanning a column

Assume column 2:

I. a) Read the starting PTR: JSPT(2)=2

b) Read the first entry: VAL[JSPT(2)]=VAL(2)=3

II. c) Read the pointer to the next entry in the column: JPT(2)=6

d) Read the entry: VAL[JPT(2)]=VAL(6)=5

III. e) Read the PTR: JPT(6)=8

f) Read the entry: VAL[JPT(6)]=VAL(8)=8

IV. g) Read the PTR to the next entry: JPT(8)=0. Last element – done!
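The column scan above amounts to a short pointer-chasing loop. The Python sketch below reproduces it for the example matrix, assuming the arrays are held as lists with index 0 unused so that positions stay 1-based; the function name `scan_column` is hypothetical.

```python
# Arrays of the example matrix (index 0 is a placeholder; positions are 1-based).
VAL = [None, 9, 3, 7, 2, 6, 5, 1, 8, 7]
IROW = [0, 1, 1, 1, 1, 2, 2, 3, 4, 4]
JPT = [0, 5, 6, 7, 9, 0, 8, 0, 0, 0]   # next entry in the same column
JSPT = [0, 1, 2, 3, 4]                 # starting column pointers


def scan_column(j):
    """Return (row index, value) for every nonzero entry of column j, top to bottom."""
    entries = []
    pos = JSPT[j]                 # starting pointer of the column
    while pos != 0:               # zero pointer means there is no further entry
        entries.append((IROW[pos], VAL[pos]))
        pos = JPT[pos]            # follow the column thread
    return entries


column2 = scan_column(2)          # visits positions 2, 6, 8 in turn
```

The loop touches only the nonzero entries of the column, so its cost does not depend on the matrix dimension.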

2. Scanning a row

Assume row 4:

a) Read the starting PTR: ISPT(4)=8

b) Read the entry: VAL[ISPT(4)]=VAL(8)=8

c) Read the PTR to the next entry: IPT(8)=9

d) Read the entry: VAL[IPT(8)]=VAL(9)=7

e) Read the PTR to the next entry: IPT(9)=0

Last entry – done!
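Row scanning is the building block for a sparse matrix-vector product, since each component of y = Ax is a sum over one row. The sketch below shows both, assuming list-based arrays with index 0 unused; the names `scan_row` and `matvec` and the test vector are illustrative only.

```python
# Arrays of the example matrix (index 0 is a placeholder; positions are 1-based).
VAL = [None, 9, 3, 7, 2, 6, 5, 1, 8, 7]
JCOL = [0, 1, 2, 3, 4, 1, 2, 3, 2, 4]
IPT = [0, 2, 3, 4, 0, 6, 0, 0, 9, 0]   # next entry in the same row
ISPT = [0, 1, 5, 7, 8]                 # starting row pointers


def scan_row(i):
    """Return (column index, value) for every nonzero entry of row i, left to right."""
    entries = []
    pos = ISPT[i]
    while pos != 0:
        entries.append((JCOL[pos], VAL[pos]))
        pos = IPT[pos]            # follow the row thread
    return entries


def matvec(x):
    """Compute y = A x using only the stored nonzero entries (x is a 0-based list)."""
    return [sum(v * x[j - 1] for j, v in scan_row(i)) for i in range(1, len(ISPT))]


row4 = scan_row(4)                # visits positions 8 and 9
y = matvec([1, 2, 3, 4])
```

The product costs one multiplication per stored nonzero, which is the saving that motivates sparse storage in the first place.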

3. Add an entry (fill-in)

Assume a fill-in at position (2, 3) of the matrix, with value 14. The fill-in will be added at storage position 10.

A. Thus VAL(10)=14

indexes: IROW(10)=2, JCOL(10)=3

B. Changing the pointers

a) Scan column 2 down to the entry in row 2 (position 6), which is the last entry of row 2 before column 3, and change IPT(6)=0 to IPT(6)=10

b) Scan column 3 down to the entry in row 1 (position 3), which is the last entry of column 3 above row 2, and change JPT(3)=7 to JPT(3)=10

C. Add PTRs for the new entry

IPT(10)=0

JPT(10)=7
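The pointer updates for a fill-in can be written as a general insertion routine. The Python sketch below is one possible version, not the SPICE routine: it finds the predecessors by walking row i and column j directly (a slight variation on step B a), which scans column 2), it assumes position (i, j) is not already stored, and the function name `add_entry` is hypothetical. Index 0 of each list is unused so that positions are 1-based.

```python
# Arrays of the example matrix (index 0 is a placeholder; positions are 1-based).
VAL = [None, 9, 3, 7, 2, 6, 5, 1, 8, 7]
IROW = [0, 1, 1, 1, 1, 2, 2, 3, 4, 4]
JCOL = [0, 1, 2, 3, 4, 1, 2, 3, 2, 4]
IPT = [0, 2, 3, 4, 0, 6, 0, 0, 9, 0]
JPT = [0, 5, 6, 7, 9, 0, 8, 0, 0, 0]
ISPT = [0, 1, 5, 7, 8]
JSPT = [0, 1, 2, 3, 4]


def add_entry(i, j, value):
    """Append a new nonzero a(i, j) and thread it into row i and column j."""
    VAL.append(value)
    IROW.append(i)
    JCOL.append(j)
    IPT.append(0)
    JPT.append(0)
    new = len(VAL) - 1
    # Row thread: find the last entry of row i that lies left of column j.
    prev, pos = 0, ISPT[i]
    while pos != 0 and JCOL[pos] < j:
        prev, pos = pos, IPT[pos]
    IPT[new] = pos
    if prev:
        IPT[prev] = new
    else:
        ISPT[i] = new
    # Column thread: find the last entry of column j that lies above row i.
    prev, pos = 0, JSPT[j]
    while pos != 0 and IROW[pos] < i:
        prev, pos = pos, JPT[pos]
    JPT[new] = pos
    if prev:
        JPT[prev] = new
    else:
        JSPT[j] = new
    return new


position = add_entry(2, 3, 14)    # the fill-in from the example
```

The new entry is appended at the end of the arrays, so no existing entry moves; only two existing pointers and the two pointers of the new entry change.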