Mixed Precision Dense Linear System Solvers
for High Performance Reconfigurable Computing
Introduction Methodology Results Future work Conclusions
Tennessee Advanced Computing Laboratory
University of Tennessee
July 29th 2009
JunKyu Lee, Gregory D. Peterson, Robert J. Harrison, Robert J. Hinde
This work was partially supported by
the National Science Foundation,
grant NSF CHE-0625598.
I. Introduction
II. Methodology
III.Results
IV. Future work
V. Conclusions
- What is a mixed precision solver?
- How has a mixed precision computations been
developed? What do we need to develop further?
- What have we explored?
- What have we found?
- What could be a possible RC architecture for
this research?
Introduction Methodology Results Future work Conclusions
What is mixed precision solver? Why ?
Computational
Science Applications
Ax = b
Computer’s Nature
Finite Precision Computation
Error Prone
Solution x
Iterative Refinement
(For Better Numeric Results)
James Wilkinson (1948)
Mixed Precision Solver
( For Better Numeric Results and High Performance )
J.Langou,J.Dongarra(2006)
Mixed Precision Computation Method:
1. Employ more than one precision computation.
2. Lower precision (fast computation) for computationally inten
sive tasks and Higher precision (slow computation) for refin
ing solution.
High performance (Lower precision computation)
Satisfactory numeric solution (Higher precision refinement)
Introduction Methodology Results Future work Conclusions
Mixed precision algorithm
To solve Ax = b;
Step 1: GEPP (A); O(n3) <= precision PI;
Solve LUx(1) = P×b; O(n2) <= precision PI;
for ( i = 1 to x(i) accurate enough)
Step 2: r(i) = b – Ah x(i); O(n2) <= precision PH;
Step 3: LUz(i) = P×r(i); O(n2) <= precision PI;
Step 4: x(i) = x(i) + z(i); O(n) <= precision PH;
end
P is a permutation matrix and r is a residual vector.
Introduction Methodology Results Future work Conclusions
Computationally
Expensive
Employ lower
precision for fast
computation
Computationally
Not Expensive
Employ higher
precision for
refinement
Graphical View for Mixed Precision Algorithm
Ax = b
Exact
Solution x
Exact x at iteration 1
Ax at iteration 1
r = b – Ax at iteration 1
z = A-1 × r at iteration 1
Ax at iteration 2
x at iteration 2
Introduction Methodology Results Future work Conclusions
A × z = r
To solve Ax = b;
Step 1: GEPP (A);
Solve LUx(1) = P×b;
for ( i = 1 to x(i) accurate enough)
Step 2: r(i) = b – Ah x(i);
Step 3: LUz(i) = P×r(i);
Step 4: x(i) = x(i) + z(i);
end
How have mixed precision methods been developed?
Introduction Methodology Results Future work Conclusions
Previous Research
Single precision for GEPP
Double precision refinement
Work ?
Double precision for GEPP
DONE
CPU/ CELL / GPU
Could we apply an arbitrary precision
instead of just single or double
precision?
Use FPGA !
(Arbitrary precision computation)
From J. Sun, G.D. Peterson, O.O. Storaasli,
High-Performance Mixed Precision Linear
Solver for FPGAs, IEEE Transactions on
Computers, vol. 57, no. 12, 2008
From J. Sun, G.D. Peterson, O.O. Storaasli (2008)
Edelman’s analysis with random matrices: E(k(A)) = e log(n)+1.537
Matrices
Size
Condition Numbers
(J.Sun / Edelman)
128 913 595
256 1818 1191
512 4017 2381
1024 6196 4762
2048 9407 9525
4096 22425 19050
Introduction Methodology Results Future work Conclusions
1. FPGA can employ arbitrary precision computation
(Applying higher precision when a condition number of system is high)
2. Lower precision Smaller, Faster ALUs More ALUs
(Significant performance difference for multiplication between lower
precision and higher precision in FPGAs)
Mantissa: 16bits, Exponent: 7bits
From J. Sun, G.D. Peterson, O.O. Storaasli (2008)
In this research, we address
1. Initial precision predictor: How to apply appropriate initial precision for first
O(n3) computation for GEPP ?
2. Failure manipulator: If the algorithm fails, how to predict the failure and how
to fix it ?
Introduction Methodology Results Future work Conclusions
Latency
XC4LX160-10
Question (This is what the research is about)
If FPGA can employ arbitrary precision computation, what precision will be for GEPP?Could we predict success?
Exploit FPGA’s arbitrary
precision computation
Fewer H/W resources in
FPGAs
Lower Precision
Smaller, Faster ALUs
More ALUs
Initial precision
predictor
according to matrix
characteristics
Failure manipulator
+
Mixed precision dense linear
system solver for high performance
reconfigurable computing
Introduction Methodology Results Future work Conclusions
MethodologyIntroduction Results Future work Conclusions
Guessing
Condition
Number of
Matrix
Decide
Initial
Precision
System
Matrix
Characteristics
GuessGuess
GEPP
Generate
L and U
Compute
Condition
Number of
Matrix
Done
Model
No
Work?
Is Guess
Right ?
Failure
Manipulator
Yes
Initial Precision Predictor and Failure Manipulator
I. Initial Precision Predictor Note 1 and 2
We note that
1. The condition number of a matrix is an important factor to predict success
2. The condition number of a random matrix increases with size
3. Computing a condition number from triangular matrices takes O(n2)
MethodologyIntroduction Results Future work Conclusions
II. Failure Manipulator Note 1 and 3
MethodologyIntroduction Results Future work Conclusions
Experiments with 1000 uniformly distributed random matrices
(lagged Fibonacci generator) according to different matrix sizes
What have we tried ?
(Initial Precision Predictor)
1. Investigate success rate according to different size matrices
with different precision computations for GEPP
2. Find out appropriate mantissa bit widths which guarantee
95% success rate according to different matrix sizes
(Failure Manipulator)
1. Investigate CDFs for condition numbers according to
different size matrices
2. Find out the 95% condition numbers in CDFs
Relate the working precision (initial precision) to the condi
tion numbers (failure manipulator)
Different precision computations => Fix exponential bits, Change
mantissa bits
I. Initial Working Precision
Initial
Precision
(GEPP)
95%
Success
Rate
Matrix
Sizes
Mantissa
bit width
Matrix size
32×32 64×64 128×128 256×256
8 bits 19.1 % 0 % 0 % 0 %
12 bits 85.5 % 57.8 % 8.7 % 0 %
16 bits 98.6 % 94.9 % 81.1 % 49 %
20 bits 99.8 % 99.5 % 97.9 % 93.6 %
24 bits 100 % 100 % 99.8 % 99.5 %
More than 28 bits 100 % 100 % 100 % 100 %
95% Wr 14.90 bits 16.09 bits 19.31 bits 20.95 bits
MethodologyIntroduction Results Future work Conclusions
W = N/33.6 + 14.3 (N>32) (1)
II. Failure Manipulator
95%
Condition
Number 95%
Working
Precision
Matrix
Sizes
Matrices Sizes 32×32 64×64 128×128 256×256 512×512
95% CNs 5273 16590 60520 179000 475300
MethodologyIntroduction Results Future work Conclusions
Y = 197X ( Y: ECN(Est. Condition Numbers)/(2X), X:log2(matrix order) )
CN = 197N log2(N), (N: Matrix order) (2)
CN = 197(33.6(W-14.3))log2(33.6(W-14.3)) (3)
ResultsIntroduction Methodology Future work Conclusions
Initial Precision Predictor
Failure Manipulator
Future workIntroduction ResultsMethodology Conclusions
System
Application
Matrix
Next Step
Uniformly
Distributed
Random
Matrices
Gaussian
Distributed
Random
Matrices
Exponential
Distributed
Random
Matrices
Mean Variance
Mapping
Provide appropriate initial
precision and failure
manipulator
Optimize performance
according to system attributes
ConclusionsIntroduction ResultsMethodology Future work
Conclusions
1. Customized reconfigurable computing architecture for mixed precision
linear system solver according to system matrix condition numbers.
2. First effort to find out the initial precision predictor for GEPP and
failure manipulator to provide newly assigned appropriate precision
according to the system matrix condition number
3. High performance for linear solver along with reconfigurable computing
multiple precision computation paradigm.
Thank you, Any Questions ?
Top Related