Research Matters Nick Higham School of Mathematics The ...higham/talks/conte18.pdf · MATLAB double...
Research Matters
February 25, 2009
Nick Higham
Director of Research
School of Mathematics
Multiprecision Algorithms
Nick Higham
School of Mathematics
The University of Manchester
http://www.maths.manchester.ac.uk/~higham
@nhigham, nickhigham.wordpress.com
Samuel D. Conte Distinguished Lecture, Department of Computer Science, Purdue University, March 2018
Conte and De Boor
1980 2018
Early textbook componentwise backward error analysis of LU factorization: computed solution of $Ax = b$ satisfies
$$(A + \Delta A)x = b, \qquad |\Delta A| \le u_n|A| + u_n(3 + u_n)|L||U|,$$
where $u_n = 1.01nu$ and $|A| = (|a_{ij}|)$.
Nick Higham Multiprecision Algorithms 2 / 48
Outline
Multiprecision arithmetic: floating point arithmetic supporting multiple precisions.
Low precision: half, or less.
High precision: quadruple, possibly arbitrary.
How to exploit different precisions to achieve faster algorithms with higher accuracy.
Download this talk from http://bit.ly/conte18
IEEE Standard 754-1985 and 2008 Revision
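| Type | Size | Range | $u = 2^{-t}$ |
|---|---|---|---|
| half | 16 bits | $10^{\pm 5}$ | $2^{-11} \approx 4.9 \times 10^{-4}$ |
| single | 32 bits | $10^{\pm 38}$ | $2^{-24} \approx 6.0 \times 10^{-8}$ |
| double | 64 bits | $10^{\pm 308}$ | $2^{-53} \approx 1.1 \times 10^{-16}$ |
| quadruple | 128 bits | $10^{\pm 4932}$ | $2^{-113} \approx 9.6 \times 10^{-35}$ |

Arithmetic ops ($+, -, *, /, \sqrt{\ }$) performed as if first calculated to infinite precision, then rounded.
Default: round to nearest, round to even in case of tie.
Half precision is a storage format only.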
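As a rough check of the half and single rows of the table, the sketch below rounds doubles to each format with Python's standard `struct` module, whose `'e'` and `'f'` format codes are IEEE binary16 and binary32. The helper names `fl16` and `fl32` are illustrative and do not come from the talk.

```python
import struct

def fl16(x):
    """Round a double to the nearest IEEE half precision number."""
    return struct.unpack("e", struct.pack("e", x))[0]

def fl32(x):
    """Round a double to the nearest IEEE single precision number."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Unit roundoff u = 2^-t, with t = 11 (half) and t = 24 (single).
u_half, u_single = 2.0 ** -11, 2.0 ** -24

# 1 + u lies exactly halfway between 1 and the next number 1 + 2u;
# round to nearest with ties to even sends it back to 1.
print(fl16(1 + u_half), fl32(1 + u_single))

# Anything just above the halfway point rounds up to 1 + 2u.
print(fl16(1 + 1.5 * u_half) == 1 + 2 * u_half)
print(fl32(1 + 1.5 * u_single) == 1 + 2 * u_single)
```

Rounding a double through `struct` only simulates storage in the lower precision, which matches the slide's remark that half precision is a storage format.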
Model for Rounding Error Analysis
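For $x, y \in F$,
$$\mathrm{fl}(x \mathbin{\mathrm{op}} y) = (x \mathbin{\mathrm{op}} y)(1 + \delta), \qquad |\delta| \le u, \qquad \mathrm{op} = +, -, *, /.$$
Also for $\mathrm{op} = \sqrt{\ }$.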
Model is weaker than fl(x op y) being correctly rounded.
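The model can be checked numerically. The sketch below simulates half precision by computing each operation in double and rounding the result to binary16 with `struct`, then measures $\delta$ exactly with rational arithmetic. The operand range, trial count and all names are illustrative choices, picked so that no overflow or underflow occurs.

```python
import random
import struct
from fractions import Fraction

U_HALF = Fraction(1, 2 ** 11)  # unit roundoff of IEEE half precision

def fl16(x):
    """Round a double to the nearest IEEE half precision number."""
    return struct.unpack("e", struct.pack("e", x))[0]

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def max_delta(trials=2000, seed=0):
    """Largest observed |delta| in fl(x op y) = (x op y)(1 + delta)."""
    rng = random.Random(seed)
    worst = Fraction(0)
    for _ in range(trials):
        x = fl16(rng.uniform(0.5, 100.0))   # half precision operands
        y = fl16(rng.uniform(0.5, 100.0))
        for op in OPS.values():
            exact = op(Fraction(x), Fraction(y))
            if exact == 0:
                continue                    # delta is undefined for a zero result
            computed = fl16(op(x, y))       # double result rounded to half
            delta = (Fraction(computed) - exact) / exact
            worst = max(worst, abs(delta))
    return worst

worst = max_delta()
print(float(worst), float(U_HALF))
```

Double precision carries more than twice as many bits as half, so rounding the double result to half gives the correctly rounded half precision result for these four operations.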
Precision versus Accuracy
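$$\begin{aligned}
\mathrm{fl}(abc) &= ab(1 + \delta_1) \cdot c(1 + \delta_2), \qquad |\delta_i| \le u, \\
&= abc(1 + \delta_1)(1 + \delta_2) \\
&\approx abc(1 + \delta_1 + \delta_2).
\end{aligned}$$
Precision = $u$.
Accuracy $\approx 2u$.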
Accuracy is not limited by precision
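A small experiment in simulated half precision illustrates the two roundings in $\mathrm{fl}(abc)$. The operand values and the helper `product3_error` are arbitrary illustrations. The error is measured exactly with `fractions.Fraction` and compared with the bound $(1+u)^2 - 1 = 2u + u^2$.

```python
import struct
from fractions import Fraction

def fl16(x):
    """Round a double to the nearest IEEE half precision number."""
    return struct.unpack("e", struct.pack("e", x))[0]

def product3_error(a, b, c):
    """Exact relative error of fl(fl(a*b)*c) in simulated half precision."""
    computed = fl16(fl16(a * b) * c)        # two roundings: delta_1, delta_2
    exact = Fraction(a) * Fraction(b) * Fraction(c)
    return abs((Fraction(computed) - exact) / exact)

u = Fraction(1, 2 ** 11)
a, b, c = fl16(1.1), fl16(1.3), fl16(1.7)   # operands stored in half precision
err = product3_error(a, b, c)
print(float(err / u))                        # error in units of u
```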
ARM NEON
NVIDIA Tesla P100 (2016), V100 (2017)
“The Tesla P100 is the world’s first accelerator built for deep learning, and has native hardware ISA support for FP16 arithmetic”
V100 tensor cores do 4 × 4 mat mult in one clock cycle.

TFLOPS:

| | double | single | half / tensor |
|---|---|---|---|
| P100 | 4.7 | 9.3 | 18.7 |
| V100 | 7 | 14 | 112 |
AMD Radeon Instinct MI25 GPU (2017)
“24.6 TFLOPS FP16 or 12.3 TFLOPS FP32 peak GPU compute performance on a single board . . . Up to 82 GFLOPS/watt FP16 or 41 GFLOPS/watt FP32 peak GPU compute performance”
Machine Learning
“For machine learning as well as for certain image processing and signal processing applications, more data at lower precision actually yields better results with certain algorithms than a smaller amount of more precise data.”
“We find that very low precision is sufficient not just for running trained networks but also for training them.” Courbariaux, Bengio & David (2015)
We’re solving the wrong problem (Scheinberg, 2016), so don’t need an accurate solution.
Low precision provides regularization.
Low precision encourages flat minima to be found.
Deep Learning for Java
Climate Modelling
T. Palmer, More reliable forecasts with less precise computations: a fast-track route to cloud-resolved weather and climate simulators?, Phil. Trans. R. Soc. A, 2014:
Is there merit in representing variables at sufficiently high wavenumbers using half or even quarter precision floating-point numbers?
T. Palmer, Build imprecise supercomputers, Nature, 2015.
Fp16 for Communication Reduction
ResNet-50 training on ImageNet.
Solved in 60 mins on 256 TESLA P100s at Facebook (2017).
Solved in 15 mins on 1024 TESLA P100s at Preferred Networks, Inc. (2017) using ChainerMN (Takuya Akiba, SIAM PP18):
“While computation was generally done in single precision, in order to reduce the communication overhead during all-reduce operations, we used half-precision floats . . . In our preliminary experiments, we observed that the effect from using half-precision in communication on the final model accuracy was relatively small.”
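The saving in communication volume is easy to see on a toy buffer. The sketch below is not ChainerMN code. It packs a made-up list of gradient values as binary16 and as binary32 with Python's `struct` module and compares the byte counts and the rounding error introduced by the half precision copy.

```python
import struct

def pack_half(values):
    """Serialize values as little-endian IEEE half precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)

def pack_single(values):
    """Serialize values as little-endian IEEE single precision (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)

def unpack_half(buf):
    """Recover the half precision values as Python floats."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))

grads = [0.0125, -0.5, 3.75e-3, 1.0]      # made-up gradient values
buf16, buf32 = pack_half(grads), pack_single(grads)
print(len(buf16), len(buf32))              # bytes to communicate in each format

# Relative error caused by storing each value in half precision.
rel_err = max(abs(g - h) / abs(g) for g, h in zip(grads, unpack_half(buf16)))
print(rel_err)
```

Values outside the half precision range would overflow or lose accuracy, so a real implementation has to consider scaling. That is outside this sketch.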
Preconditioning with Adaptive Precision
Anzt, Dongarra, Flegar, H & Quintana-Ortí (2018):
For sparse $A$ and iterative $Ax = b$ solver, execution time and energy dominated by data movement.
Block Jacobi preconditioning: $D = \mathrm{diag}(D_i)$, where $D_i = A_{ii}$. Solve $D^{-1}Ax = D^{-1}b$.
All computations are at fp64.
Compute $D^{-1}$ and store $D_i^{-1}$ in fp16, fp32 or fp64, depending on $\kappa(D_i)$.
Simulations and energy modelling show this can outperform a fixed precision preconditioner.
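The selection step can be pictured with a toy rule. The slide does not give the criterion used by Anzt et al., so the rule below (take the cheapest format with $\kappa(D_i)\,u \le \tau$), the threshold `tau` and the sample condition numbers are all hypothetical.

```python
# Unit roundoffs of the three candidate storage formats.
UNIT_ROUNDOFF = {"fp16": 2.0 ** -11, "fp32": 2.0 ** -24, "fp64": 2.0 ** -53}

def storage_precision(kappa, tau=0.1):
    """Pick the cheapest format with kappa * u <= tau (hypothetical rule).

    Range issues such as overflow in fp16 are ignored in this sketch.
    """
    for name in ("fp16", "fp32", "fp64"):
        if kappa * UNIT_ROUNDOFF[name] <= tau:
            return name
    return "fp64"  # fall back to the working precision

block_condition_numbers = [3.0, 5.0e3, 2.0e9]   # made-up kappa(D_i) values
choices = [storage_precision(k) for k in block_condition_numbers]
print(choices)
```

Well conditioned blocks end up in fp16 and badly conditioned ones stay in fp64, which is the behaviour the slide describes.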
Error Analysis in Low Precision
For inner product $x^Ty$ of $n$-vectors standard error bound is
$$|\mathrm{fl}(x^Ty) - x^Ty| \le nu|x|^T|y| + O(u^2).$$
In half precision, $u \approx 4.9 \times 10^{-4}$, so $nu = 1$ for $n = 2048$.
What happens when nu > 1?
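One concrete answer: the bound stops guaranteeing any correct digits, and recursive summation can stagnate. The sketch below simulates a half precision inner product of two vectors of ones with $n = 4096$, so $nu = 2$. The helper names are illustrative.

```python
import struct

def fl16(x):
    """Round a double to the nearest IEEE half precision number."""
    return struct.unpack("e", struct.pack("e", x))[0]

def dot_half(x, y):
    """Inner product with every multiply and add rounded to half precision."""
    s = 0.0
    for xi, yi in zip(x, y):
        s = fl16(s + fl16(xi * yi))
    return s

n = 4096                       # here nu = 2
ones = [1.0] * n
computed = dot_half(ones, ones)
print(computed, n)             # the running sum stalls at 2048
```

Once the partial sum reaches 2048 the spacing of half precision numbers is 2, so adding 1 produces a tie that rounds back to 2048. The computed result has relative error 0.5, which the bound $nu|x|^T|y| = 2 \times 4096$ permits.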
Outline
1 Higher Precision
2 Iterative Refinement
Need for Higher Precision
Long-time simulations.
Resolving small-scale phenomena.
Khanna, High-Precision Numerical Simulations on a CUDA GPU: Kerr Black Hole Tails, 2013.
Beliakov and Matiyasevich, A Parallel Algorithm for Calculation of Determinants and Minors Using Arbitrary Precision Arithmetic, 2016.
Ma and Saunders, Solving Multiscale Linear Programs Using the Simplex Method in Quadruple Precision, 2015.
Going to Higher Precision
If we have quadruple or higher precision, how can we modify existing algorithms to exploit it?
To what extent are existing algs precision-independent?
Newton-type algs: just decrease tol?
How little higher precision can we get away with?
Gradually increase precision through the iterations?
For Krylov methods # iterations can depend on precision, so lower precision might not give fastest computation!
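The idea of increasing precision through the iterations can be tried on Newton's method for a square root, using Python's `decimal` module as the arbitrary precision arithmetic. This is a toy illustration, not an algorithm from the talk. The function name, the guard digits and the starting precision are assumptions.

```python
from decimal import Decimal, localcontext

def newton_sqrt(a, target_digits):
    """Approximate sqrt(a), doubling the working precision at each Newton step."""
    x = Decimal(a ** 0.5)           # double precision start: about 16 correct digits
    digits = 16
    while digits < target_digits:
        digits = min(2 * digits, target_digits)
        with localcontext() as ctx:
            ctx.prec = digits + 5   # a few guard digits
            x = (x + Decimal(a) / x) / 2
    return x

root = newton_sqrt(2, 64)
print(root)
```

Newton's method is self-correcting and roughly doubles the number of correct digits per step, so the early steps need not be carried out at the final precision.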
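Randomization + Extra Precision
Exploit $A = X\,\mathrm{diag}(\lambda_i)X^{-1} \Rightarrow f(A) = X\,\mathrm{diag}(f(\lambda_i))X^{-1}$.
Davies (2007): “approximate diagonalization”:
    function F = funm_randomized(A,fun)
    d = digits; digits(2*d);       % save the current precision, then double it
    tol = 10^(-d);                 % perturbation at the level of the original precision
    E = randn(size(A),class(A));   % random perturbation direction
    [V,D] = eig(A + (tol*norm(A,'fro')/norm(E,'fro'))*E);
    F = V*diag(fun(diag(D)))/V;
    digits(d)                      % restore the original precision
Perturbation ensures diagonalizable.
Extra precision overcomes the effect of the perturbation.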
Availability of Multiprecision in Software
Maple, Mathematica, PARI/GP, Sage.
MATLAB: Symbolic Math Toolbox, MultiprecisionComputing Toolbox (Advanpix).
Julia: BigFloat.
Mpmath and SymPy for Python.
GNU MP Library.
GNU MPFR Library.
(Quad only): some C, Fortran compilers.
Gone, but not forgotten:
Numerical Turing: Hull et al., 1985.
Cost of Quadruple Precision
How Fast is Quadruple Precision Arithmetic?
Compare
MATLAB double precision,
Symbolic Math Toolbox, VPA arithmetic, digits(34),
Multiprecision Computing Toolbox (Advanpix), mp.Digits(34). Optimized for quad.

Ratios of times
Intel Broadwell-E Core i7-6800K @3.40GHz, 6 cores

| | mp/double | vpa/double | vpa/mp |
|---|---|---|---|
| LU, n = 250 | 98 | 25,000 | 255 |
| eig, nonsymm, n = 125 | 75 | 6,020 | 81 |
| eig, symm, n = 200 | 32 | 11,100 | 342 |
Outline
1 Higher Precision
2 Iterative Refinement
Accelerating the Solution of Ax = b
$A \in \mathbb{R}^{n \times n}$ nonsingular.
Standard method for solving $Ax = b$: factorize $A = LU$, solve $LUx = b$, all at working precision.
Can we solve $Ax = b$ faster and/or more accurately by exploiting multiprecision arithmetic?
Iterative Refinement for Ax = b (classic)
Solve $Ax_0 = b$ by LU factorization in double precision.
$r = b - Ax_0$ (quad precision)
Solve $Ad = r$ (double precision)
$x_1 = x_0 + d$ (double precision)
($x_0 \leftarrow x_1$ and iterate as necessary.)
Programmed in J. H. Wilkinson, Progress Report on the Automatic Computing Engine (1948).
Popular up to 1970s, exploiting cheap accumulation of inner products.
Iterative Refinement (1970s, 1980s)
Solve $Ax_0 = b$ by LU factorization.
$r = b - Ax_0$
Solve $Ad = r$
$x_1 = x_0 + d$
Everything in double precision.
Skeel (1980).
Jankowski & Woźniakowski (1977) for a general solver.
Iterative Refinement (2000s)
Solve $Ax_0 = b$ by LU factorization in single precision.
$r = b - Ax_0$ (double precision)
Solve $Ad = r$ (single precision)
$x_1 = x_0 + d$ (double precision)
Dongarra et al. (2006).
Motivated by single precision being at least twice as fast as double.
Iterative Refinement in Three Precisions
$A, b$ given in precision $u$.
Solve $Ax_0 = b$ by LU factorization in precision $u_f$.
$r = b - Ax_0$ (precision $u_r$)
Solve $Ad = r$ (precision $u_f$)
$x_1 = \mathrm{fl}(x_0 + d)$ (precision $u$)
Three previous usages are special cases.
Choose precisions from half, single, double, quadruple subject to $u_r \le u \le u_f$.
Can we compute more accurate solutions faster?
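The scheme can be sketched in plain Python under some stated assumptions. Single precision is simulated by rounding every operation with `struct` and plays the role of $u_f$. Python floats (double) play the role of $u$. The residual is formed exactly with `fractions.Fraction` and rounded once, a generous stand-in for a higher precision $u_r$. All function names are illustrative, and this is not the code used for the experiments in the talk.

```python
import struct
from fractions import Fraction

def fl32(x):
    """Round a double to IEEE single precision (plays the role of u_f)."""
    return struct.unpack("f", struct.pack("f", x))[0]

def lu_factor(A, fl):
    """LU factorization with partial pivoting, every operation rounded by fl."""
    n = len(A)
    LU = [[fl(v) for v in row] for row in A]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))   # pivot row
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = fl(LU[i][k] / LU[k][k])               # multiplier
            for j in range(k + 1, n):
                LU[i][j] = fl(LU[i][j] - fl(LU[i][k] * LU[k][j]))
    return LU, perm

def lu_solve(LU, perm, b, fl):
    """Forward and back substitution with the stored factors, rounded by fl."""
    n = len(LU)
    y = [fl(b[p]) for p in perm]
    for i in range(n):                       # unit lower triangular solve
        for j in range(i):
            y[i] = fl(y[i] - fl(LU[i][j] * y[j]))
    for i in reversed(range(n)):             # upper triangular solve
        for j in range(i + 1, n):
            y[i] = fl(y[i] - fl(LU[i][j] * y[j]))
        y[i] = fl(y[i] / LU[i][i])
    return y

def residual(A, x, b):
    """r = b - A x formed exactly, then rounded once to double (stand-in for u_r)."""
    return [float(Fraction(bi) - sum(Fraction(a) * Fraction(xj) for a, xj in zip(row, x)))
            for row, bi in zip(A, b)]

def refine(A, b, steps=5):
    """Iterative refinement; returns the list of iterates x_0, x_1, ..."""
    LU, perm = lu_factor(A, fl32)                # precision u_f
    x = lu_solve(LU, perm, b, fl32)
    iterates = [x]
    for _ in range(steps):
        r = residual(A, x, b)                    # precision u_r
        d = lu_solve(LU, perm, r, fl32)          # precision u_f
        x = [xi + di for xi, di in zip(x, d)]    # precision u (double)
        iterates.append(x)
    return iterates

# 3-by-3 Hilbert matrix; the exact solution of H x = [1, 1, 1] is [3, -24, 30].
H = [[1.0 / (i + j + 1) for j in range(3)] for i in range(3)]
iterates = refine(H, [1.0, 1.0, 1.0])
errors = [max(abs(xi - ti) for xi, ti in zip(x, [3.0, -24.0, 30.0])) for x in iterates]
print(errors)
```

The factorization is computed once in the low precision and reused for every correction solve, which is where the speed advantage comes from. The printed errors are measured against the solution for the exact Hilbert matrix, so they level off near the rounding error in storing `H` as doubles.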
Existing Rounding Error Analysis
Wilkinson (1963): fixed-point arithmetic.
Moler (1967): floating-point arithmetic.
Higham (1997, 2002): more general analysis for arbitrary solver.
Dongarra et al. (2006): lower precision LU.
At most two precisions and require $\kappa(A)u < 1$.
New Analysis
Applies to any solver.
Covers backward and forward error. Focus on forward error here.
Allows $\kappa(A)u \gtrsim 1$.
New Analysis
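Assume computed solution $\hat{d}_i$ to $Ad_i = r_i$ has normwise rel b’err $O(u_f)$ and satisfies
$$\frac{\|d_i - \hat{d}_i\|_\infty}{\|d_i\|_\infty} \le u_f\theta_i < 1.$$
Define $\mu_i$ by
$$\|A(x - x_i)\|_\infty = \mu_i\|A\|_\infty\|x - x_i\|_\infty,$$
and note that
$$\kappa_\infty(A)^{-1} \le \mu_i \le 1.$$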
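Both ends of the range for $\mu_i$ are attained on a diagonal example, evaluated exactly with rational arithmetic. The matrix, the error vectors and the function names below are illustrative choices.

```python
from fractions import Fraction

def norm_inf_vec(v):
    """Infinity norm of a vector."""
    return max(abs(t) for t in v)

def norm_inf_mat(A):
    """Infinity norm of a matrix (maximum absolute row sum)."""
    return max(sum(abs(t) for t in row) for row in A)

def mu(A, e):
    """mu defined by ||A e|| = mu * ||A|| * ||e||, with e standing for x - x_i."""
    Ae = [sum(a * t for a, t in zip(row, e)) for row in A]
    return norm_inf_vec(Ae) / (norm_inf_mat(A) * norm_inf_vec(e))

eps = Fraction(1, 10 ** 6)
A = [[Fraction(1), Fraction(0)], [Fraction(0), eps]]   # kappa_inf(A) = 10^6
mu_large = mu(A, [1, 0])    # error along the dominant direction
mu_small = mu(A, [0, 1])    # error along the direction A damps most
print(mu_large, mu_small)
```

An error aligned with the small singular value gives $\mu_i = \kappa_\infty(A)^{-1}$, while an error aligned with the large one gives $\mu_i = 1$.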
Condition Numbers
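$|A| = (|a_{ij}|)$.
$$\mathrm{cond}(A, x) = \frac{\|\,|A^{-1}||A||x|\,\|_\infty}{\|x\|_\infty},$$
$$\mathrm{cond}(A) = \mathrm{cond}(A, e) = \|\,|A^{-1}||A|\,\|_\infty,$$
$$\kappa_\infty(A) = \|A\|_\infty\|A^{-1}\|_\infty.$$
$$1 \le \mathrm{cond}(A, x) \le \mathrm{cond}(A) \le \kappa_\infty(A).$$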
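The gap between these quantities shows up already for a badly row-scaled 2-by-2 matrix. The sketch below evaluates all three exactly with `fractions.Fraction`. The matrix, the vector and the helper names are illustrative.

```python
from fractions import Fraction

def inv2(A):
    """Exact inverse of a 2-by-2 matrix of Fractions."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def norm_inf(M):
    """Infinity norm of a matrix (maximum absolute row sum)."""
    return max(sum(abs(t) for t in row) for row in M)

def abs_product(Ainv, A, x):
    """The vector |A^{-1}| |A| |x|."""
    Ax = [sum(abs(a) * abs(t) for a, t in zip(row, x)) for row in A]
    return [sum(abs(a) * t for a, t in zip(row, Ax)) for row in Ainv]

def cond(A, x):
    """cond(A, x) in the infinity norm."""
    return max(abs_product(inv2(A), A, x)) / max(abs(t) for t in x)

s = Fraction(1, 10 ** 6)                       # second row scaled by 10^-6
A = [[Fraction(1), Fraction(2)], [3 * s, 4 * s]]
cond_Ax = cond(A, [1, 0])
cond_A = cond(A, [1, 1])                       # cond(A) = cond(A, e)
kappa = norm_inf(A) * norm_inf(inv2(A))
print(cond_Ax, cond_A, kappa)
```

Row scaling leaves $\mathrm{cond}(A)$ unchanged but inflates $\kappa_\infty(A)$, so here $\mathrm{cond}(A, x)$ and $\mathrm{cond}(A)$ are modest while $\kappa_\infty(A)$ is about $3 \times 10^6$.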
Convergence Result
Theorem (Carson & H, 2018)
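For IR in precisions $u_r \le u \le u_f$, if
$$\phi_i = 2u_f \min\bigl(\mathrm{cond}(A), \kappa_\infty(A)\mu_i\bigr) + u_f\theta_i$$
is sufficiently less than 1, the forward error is reduced on the $i$th iteration by a factor $\approx \phi_i$ until an iterate $\hat{x}$ satisfies
$$\frac{\|x - \hat{x}\|_\infty}{\|x\|_\infty} \lesssim 4nu_r\,\mathrm{cond}(A, x) + u.$$
Analogous standard bound would have $\mu_i = 1$, $u_f\theta_i = \kappa(A)u_f$.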
Precision Combinations
H = half, S = single, D = double, Q = quad. “$u_f$ $u$ $u_r$”:
Traditional: SSD, DDQ, HHS, HHD, HHQ, SSQ
1970s/1980s: SSS, DDD, HHH, QQQ
2000s: SDD, HSS, DQQ, HDD, HQQ, SQQ
3 precisions: HSD, HSQ, HDQ, SDQ
Results for LU Factorization (1)
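| $u_f$ | $u$ | $u_r$ | $\kappa_\infty(A)$ | Backward error (norm) | Backward error (comp) | Forward error |
|---|---|---|---|---|---|---|
| H | S | S | $10^4$ | S | S | $\mathrm{cond}(A, x) \cdot$ S |
| H | S | D | $10^4$ | S | S | S |
| H | D | D | $10^4$ | D | D | $\mathrm{cond}(A, x) \cdot$ D |
| H | D | Q | $10^4$ | D | D | D |
| S | S | S | $10^8$ | S | S | $\mathrm{cond}(A, x) \cdot$ S |
| S | S | D | $10^8$ | S | S | S |
| S | D | D | $10^8$ | D | D | $\mathrm{cond}(A, x) \cdot$ D |
| S | D | Q | $10^8$ | D | D | D |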
Results (2): HSD vs. SSD
Can we get the benefit of “HSD” while allowing a larger range of $\kappa_\infty(A)$?
Extending the Range of Applicability
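Recall that the convergence condition is
$$\phi_i = 2u_f \min\bigl(\mathrm{cond}(A), \kappa_\infty(A)\mu_i\bigr) + u_f\theta_i \ll 1.$$
We need both terms to be smaller than $\kappa_\infty(A)u_f$.
Recall that
$$\frac{\|d_i - \hat{d}_i\|_\infty}{\|d_i\|_\infty} \le u_f\theta_i,$$
$$\mu_i\|A\|_\infty\|x - x_i\|_\infty = \|A(x - x_i)\|_\infty = \|b - Ax_i\|_\infty = \|r_i\|_\infty.$$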
Bounding µi
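For a stable solver, in the early stages we expect
$$\frac{\|r_i\|}{\|A\|\|x_i\|} \approx u \ll \frac{\|x - x_i\|}{\|x\|},$$
or equivalently $\mu_i \ll 1$. But close to convergence
$$\frac{\|r_i\|}{\|A\|\|x_i\|} \approx u \approx \frac{\|x - x_i\|}{\|x\|}$$
or $\mu_i \approx 1$.
Conclude
$\mu_i \ll 1$ initially and $\mu_i \to 1$ as the iteration converges.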
Bounding θi
$u_f\theta_i$ bounds rel error in solution of $Ad_i = r_i$.
We need $u_f\theta_i \ll 1$.
Standard solvers cannot achieve this for very ill conditioned $A$!
Empirically observed by Rump (1990) that if $L$ and $U$ are computed LU factors of $A$ from GEPP then
$$\kappa(L^{-1}AU^{-1}) \approx 1 + \kappa(A)u,$$
even for $\kappa(A) \gg u^{-1}$.
Preconditioned GMRES
To compute the updates $d_i$ we apply GMRES to
$$\tilde{A}d_i \equiv U^{-1}L^{-1}Ad_i = U^{-1}L^{-1}r_i.$$
$\tilde{A}$ is applied in twice the working precision.
$\kappa(\tilde{A}) \ll \kappa(A)$ typically.
Rounding error analysis shows we get an accurate $d_i$ even for numerically singular $A$.
Call the overall algorithm GMRES-IR.
GMRES convergence rate not directly related to $\kappa(\tilde{A})$.
Cf. Kobayashi & Ogita (2015), who explicitly form $\tilde{A}$.
Benefits of GMRES-IR
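Recall H = $10^{-4}$, S = $10^{-8}$, D = $10^{-16}$, Q = $10^{-34}$.

| Method | $u_f$ | $u$ | $u_r$ | $\kappa_\infty(A)$ | Backward error (nrm) | Backward error (cmp) | F’error |
|---|---|---|---|---|---|---|---|
| LU | H | D | Q | $10^4$ | D | D | D |
| LU | S | D | Q | $10^8$ | D | D | D |
| GMRES-IR | H | D | Q | $10^{12}$ | D | D | D |
| GMRES-IR | S | D | Q | $10^{16}$ | D | D | D |

How many GMRES iterations are required?
Some tests with 100 × 100 matrices . . .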
Test 1: LU-IR, $(u_f, u, u_r)$ = (S, D, D)
$\kappa_\infty(A) \approx 10^{10}$, $\sigma_i = \alpha^i$. Divergence!
[Figure: ferr, nbe and cbe plotted against refinement step.]
Test 1: LU-IR, $(u_f, u, u_r)$ = (S, D, Q)
$\kappa_\infty(A) \approx 10^{10}$, $\sigma_i = \alpha^i$. Divergence!
[Figure: ferr, nbe and cbe plotted against refinement step.]
Test 1: LU-IR, $(u_f, u, u_r)$ = (S, D, Q)
$\kappa_\infty(A) \approx 10^4$, $\sigma_i = \alpha^i$. Convergence.
[Figure: error plot.]
Test 1: GMRES-IR, $(u_f, u, u_r)$ = (S, D, Q)
$\kappa_\infty(A) \approx 10^{10}$, $\sigma_i = \alpha^i$, GMRES its (2,3). Convergence.
[Figure: ferr, nbe and cbe plotted against refinement step.]
Test 2: GMRES-IR, $(u_f, u, u_r)$ = (H, D, Q)
$\kappa_\infty(A) \approx 10^2$, 1 small $\sigma_i$, GMRES its (8,8,8). Convergence.
[Figure: ferr, nbe and cbe plotted against refinement step.]
Test 3: GMRES-IR, $(u_f, u, u_r)$ = (H, D, Q)
$\kappa_\infty(A) \approx 10^{12}$, $\sigma_i = \alpha^i$, GMRES (100,100).
Take $x_0 = 0$ because of overflow! Convergence.
[Figure: error plot.]
Conclusions
Both low and high precision floating-point arithmetic becoming more prevalent, in hardware and software.
Need better understanding of behaviour of algorithms in low precision arithmetic.
IR with LU in lower precision ⇒ twice as fast as traditional IR, albeit for restricted κ(A).
Judicious use of a little high precision can bring major benefits.
GMRES-IR converges when traditional IR doesn't, thanks to preconditioned GMRES solves of A d_i = r_i.
GMRES-IR handles κ∞(A) ≈ u^-1.
Further work: tune convergence tolerance, alternative preconditioners, etc.
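The LU-based refinement loop summarised above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the tests in the talk: the factorization precision u_f is simulated by rounding each operation to IEEE single via `struct`, the working precision u is Python's double, and exact rational arithmetic (`fractions.Fraction`) stands in for the higher residual precision u_r. The function names (`fl32`, `lu_single`, `lu_solve`, `residual_exact`, `lu_ir`) are hypothetical, and the correction solve reuses the LU factors directly rather than running preconditioned GMRES, so this is LU-IR, not GMRES-IR.

```python
import struct
from fractions import Fraction


def fl32(x):
    """Round a double to the nearest IEEE single (simulates precision u_f)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_single(A):
    """LU with partial pivoting, every operation rounded to single precision.

    Assumes A is square and nonsingular. Returns the packed factors
    (unit lower triangle of L below the diagonal, U on and above it)
    and the row permutation.
    """
    n = len(A)
    LU = [[fl32(v) for v in row] for row in A]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = fl32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = fl32(LU[i][j] - fl32(LU[i][k] * LU[k][j]))
    return LU, perm


def lu_solve(LU, perm, b):
    """Forward and back substitution in double, using the low-precision factors."""
    n = len(b)
    y = [b[p] for p in perm]
    for i in range(n):
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] -= LU[i][j] * y[j]
        y[i] /= LU[i][i]
    return y


def residual_exact(A, x, b):
    """r = b - A x formed exactly in rationals, then rounded once to double."""
    return [
        float(Fraction(bi) - sum(Fraction(a) * Fraction(xj) for a, xj in zip(row, x)))
        for row, bi in zip(A, b)
    ]


def lu_ir(A, b, steps=3):
    """Iterative refinement: factorize once in low precision, then refine."""
    LU, perm = lu_single(A)
    x = lu_solve(LU, perm, b)
    for _ in range(steps):
        r = residual_exact(A, x, b)          # residual in "high" precision
        d = lu_solve(LU, perm, r)            # correction: solve A d = r
        x = [xi + di for xi, di in zip(x, d)]  # update in working precision
    return x


if __name__ == "__main__":
    A = [[4.0, -2.0, 1.0], [3.0, 6.0, -4.0], [2.0, 1.0, 8.0]]
    b = [3.0, 3.0, 28.0]  # exact solution is (1, 2, 3)
    print(lu_ir(A, b))
```

For a well-conditioned matrix such as the one in the example, the single-precision factors give an initial solution with only single-precision accuracy, and a few refinement steps recover accuracy at the level of the working precision. For ill-conditioned problems the loop may fail to converge, which is the situation GMRES-IR is meant to address.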
References
H. Anzt, J. Dongarra, G. Flegar, N. J. Higham, and E. S. Quintana-Ortí. Adaptive precision in block-Jacobi preconditioning for iterative sparse linear system solvers. Concurrency Computat.: Pract. Exper., 2018.
G. Beliakov and Y. Matiyasevich. A parallel algorithm for calculation of determinants and minors using arbitrary precision arithmetic. BIT, 56(1):33–50, 2016.
E. Carson and N. J. Higham. A new analysis of iterative refinement and its application to accurate solution of ill-conditioned sparse linear systems. SIAM J. Sci. Comput., 39(6):A2834–A2856, 2017.
E. Carson and N. J. Higham. Accelerating the solution of linear systems by iterative refinement in three precisions. SIAM J. Sci. Comput., 40(2):A817–A847, 2018.
M. Courbariaux, Y. Bengio, and J.-P. David. Training deep neural networks with low precision multiplications, 2015. ArXiv preprint 1412.7024v5.
E. B. Davies. Approximate diagonalization. SIAM J. Matrix Anal. Appl., 29(4):1051–1064, 2007.
N. J. Dingle and N. J. Higham. Reducing the influence of tiny normwise relative errors on performance profiles. ACM Trans. Math. Software, 39(4):24:1–24:11, 2013.
N. J. Higham. Iterative refinement for linear systems and LAPACK. IMA J. Numer. Anal., 17(4):495–509, 1997.
N. J. Higham. Accuracy and Stability of Numerical Algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, second edition, 2002. ISBN 0-89871-521-0. xxx+680 pp.
G. Khanna. High-precision numerical simulations on a CUDA GPU: Kerr black hole tails. J. Sci. Comput., 56(2):366–380, 2013.
Y. Kobayashi and T. Ogita. A fast and efficient algorithm for solving ill-conditioned linear systems. JSIAM Lett., 7:1–4, 2015.
J. Langou, J. Langou, P. Luszczek, J. Kurzak, A. Buttari, and J. Dongarra. Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems). In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, Nov. 2006.
D. Ma and M. Saunders. Solving multiscale linear programs using the simplex method in quadruple precision. In M. Al-Baali, L. Grandinetti, and A. Purnama, editors, Numerical Analysis and Optimization, number 134 in Springer Proceedings in Mathematics and Statistics, pages 223–235. Springer-Verlag, Berlin, 2015.
A. Roldao-Lopes, A. Shahzad, G. A. Constantinides, and E. C. Kerrigan. More flops or more precision? Accuracy parameterizable linear equation solvers for model predictive control. In 17th IEEE Symposium on Field Programmable Custom Computing Machines, pages 209–216, Apr. 2009.
K. Scheinberg. Evolution of randomness in optimization methods for supervised machine learning. SIAG/OPT Views and News, 24(1):1–8, 2016.