Download - Mixed Precision Dense Linear System Solvers for High ...saahpc.ncsa.illinois.edu/09/sessions/day2/session3/Lee_presentation.pdf · Mixed Precision Dense Linear System Solvers for

Mixed Precision Dense Linear System Solvers

for High Performance Reconfigurable Computing

Introduction Methodology Results Future work Conclusions

Tennessee Advanced Computing Laboratory

University of Tennessee

July 29th 2009

JunKyu Lee, Gregory D. Peterson, Robert J. Harrison, Robert J. Hinde

This work was partially supported by

the National Science Foundation,

grant NSF CHE-0625598.

I. Introduction

II. Methodology

III.Results

IV. Future work

V. Conclusions

- What is a mixed precision solver?

- How has a mixed precision computations been

developed? What do we need to develop further?

- What have we explored?

- What have we found?

- What could be a possible RC architecture for

this research?


What is mixed precision solver? Why ?

Computational

Science Applications

Ax = b

Computer’s Nature

Finite Precision Computation

Error Prone

Solution x

Iterative Refinement

(For Better Numeric Results)

James Wilkinson (1948)

Mixed Precision Solver

( For Better Numeric Results and High Performance )

J.Langou,J.Dongarra(2006)

Mixed Precision Computation Method:

1. Employ more than one precision computation.

2. Lower precision (fast computation) for computationally inten

sive tasks and Higher precision (slow computation) for refin

ing solution.

High performance (Lower precision computation)

Satisfactory numeric solution (Higher precision refinement)


Mixed precision algorithm

To solve Ax = b;

Step 1: GEPP (A); O(n3) <= precision PI;

Solve LUx(1) = P×b; O(n2) <= precision PI;

for ( i = 1 to x(i) accurate enough)

Step 2: r(i) = b – Ah x(i); O(n2) <= precision PH;

Step 3: LUz(i) = P×r(i); O(n2) <= precision PI;

Step 4: x(i) = x(i) + z(i); O(n) <= precision PH;

end

P is a permutation matrix and r is a residual vector.


Computationally

Expensive

Employ lower

precision for fast

computation

Computationally

Not Expensive

Employ higher

precision for

refinement

Graphical View for Mixed Precision Algorithm

Ax = b

Exact

Solution x

Exact x at iteration 1

Ax at iteration 1

r = b – Ax at iteration 1

z = A-1 × r at iteration 1

Ax at iteration 2

x at iteration 2


A × z = r

To solve Ax = b;

Step 1: GEPP (A);

Solve LUx(1) = P×b;

for ( i = 1 to x(i) accurate enough)

Step 2: r(i) = b – Ah x(i);

Step 3: LUz(i) = P×r(i);

Step 4: x(i) = x(i) + z(i);

end

How have mixed precision methods been developed?


Previous Research

Single precision for GEPP

Double precision refinement

Work ?

Double precision for GEPP

DONE

CPU/ CELL / GPU

Could we apply an arbitrary precision

instead of just single or double

precision?

Use FPGA !

(Arbitrary precision computation)

From J. Sun, G.D. Peterson, O.O. Storaasli,

High-Performance Mixed Precision Linear

Solver for FPGAs, IEEE Transactions on

Computers, vol. 57, no. 12, 2008

From J. Sun, G.D. Peterson, O.O. Storaasli (2008)

Edelman’s analysis with random matrices: E(k(A)) = e log(n)+1.537

Matrices

Size

Condition Numbers

(J.Sun / Edelman)

128 913 595

256 1818 1191

512 4017 2381

1024 6196 4762

2048 9407 9525

4096 22425 19050


1. FPGA can employ arbitrary precision computation

(Applying higher precision when a condition number of system is high)

2. Lower precision Smaller, Faster ALUs More ALUs

(Significant performance difference for multiplication between lower

precision and higher precision in FPGAs)

Mantissa: 16bits, Exponent: 7bits

From J. Sun, G.D. Peterson, O.O. Storaasli (2008)

In this research, we address

1. Initial precision predictor: How to apply appropriate initial precision for first

O(n3) computation for GEPP ?

2. Failure manipulator: If the algorithm fails, how to predict the failure and how

to fix it ?


Latency

XC4LX160-10

Question (This is what the research is about)

If FPGA can employ arbitrary precision computation, what precision will be for GEPP?Could we predict success?

Exploit FPGA’s arbitrary

precision computation

Fewer H/W resources in

FPGAs

Lower Precision

Smaller, Faster ALUs

More ALUs

Initial precision

predictor

according to matrix

characteristics

Failure manipulator

+

Mixed precision dense linear

system solver for high performance

reconfigurable computing


MethodologyIntroduction Results Future work Conclusions

Guessing

Condition

Number of

Matrix

Decide

Initial

Precision

System

Matrix

Characteristics

GuessGuess

GEPP

Generate

L and U

Compute

Condition

Number of

Matrix

Done

Model

No

Work?

Is Guess

Right ?

Failure

Manipulator

Yes

Initial Precision Predictor and Failure Manipulator

I. Initial Precision Predictor Note 1 and 2

We note that

1. The condition number of a matrix is an important factor to predict success

2. The condition number of a random matrix increases with size

3. Computing a condition number from triangular matrices takes O(n2)


II. Failure Manipulator Note 1 and 3


Experiments with 1000 uniformly distributed random matrices

(lagged Fibonacci generator) according to different matrix sizes

What have we tried ?

(Initial Precision Predictor)

1. Investigate success rate according to different size matrices

with different precision computations for GEPP

2. Find out appropriate mantissa bit widths which guarantee

95% success rate according to different matrix sizes

(Failure Manipulator)

1. Investigate CDFs for condition numbers according to

different size matrices

2. Find out the 95% condition numbers in CDFs

Relate the working precision (initial precision) to the condi

tion numbers (failure manipulator)

Different precision computations => Fix exponential bits, Change

mantissa bits

I. Initial Working Precision

Initial

Precision

(GEPP)

95%

Success

Rate

Matrix

Sizes

Mantissa

bit width

Matrix size

32×32 64×64 128×128 256×256

8 bits 19.1 % 0 % 0 % 0 %

12 bits 85.5 % 57.8 % 8.7 % 0 %

16 bits 98.6 % 94.9 % 81.1 % 49 %

20 bits 99.8 % 99.5 % 97.9 % 93.6 %

24 bits 100 % 100 % 99.8 % 99.5 %

More than 28 bits 100 % 100 % 100 % 100 %

95% Wr 14.90 bits 16.09 bits 19.31 bits 20.95 bits


W = N/33.6 + 14.3 (N>32) (1)

II. Failure Manipulator

95%

Condition

Number 95%

Working

Precision

Matrix

Sizes

Matrices Sizes 32×32 64×64 128×128 256×256 512×512

95% CNs 5273 16590 60520 179000 475300


Y = 197X ( Y: ECN(Est. Condition Numbers)/(2X), X:log2(matrix order) )

CN = 197N log2(N), (N: Matrix order) (2)

CN = 197(33.6(W-14.3))log2(33.6(W-14.3)) (3)

ResultsIntroduction Methodology Future work Conclusions

Initial Precision Predictor

Failure Manipulator

Future workIntroduction ResultsMethodology Conclusions

System

Application

Matrix

Next Step

Uniformly

Distributed

Random

Matrices

Gaussian

Distributed

Random

Matrices

Exponential

Distributed

Random

Matrices

Mean Variance

Mapping

Provide appropriate initial

precision and failure

manipulator

Optimize performance

according to system attributes

ConclusionsIntroduction ResultsMethodology Future work

Conclusions

1. Customized reconfigurable computing architecture for mixed precision

linear system solver according to system matrix condition numbers.

2. First effort to find out the initial precision predictor for GEPP and

failure manipulator to provide newly assigned appropriate precision

according to the system matrix condition number

3. High performance for linear solver along with reconfigurable computing

multiple precision computation paradigm.

Thank you, Any Questions ?