Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

41
Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University
  • date post

    20-Dec-2015
  • Category

    Documents

  • view

    218
  • download

    1

Transcript of Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Page 1: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Symbolic Program Transformation for Numerical Codes

Vijay MenonCornell University

Page 2: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Background Compiler/Runtime Support for Numerical

Programs (under Keshav Pingali)– Sparse matrix code generation

– Memory hierarchy optimizations

– Compiler/runtime techniques for MATLAB• MAJIC (with University of Illinois)• MultiMATLAB

Page 3: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Motivation

Using matrix properties to generate efficient code – e.g., Associativity, Commutativity of matrix

operations

Problem: loop notation– Matrix operations hidden within loop nests

Page 4: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Example: LU Factorization with Partial Pivoting

Also called: Gaussian Elimination

Key algorithm for solving systems of linear equations:– To solve A x = b for x:

• => Factor A into L U• => Solve L y = b for y• => Solve U x = y for x

Page 5: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

LU Factorization

Problem: Poor cache behavior

do j = 1,N

p(j) = j; do i = j+1,N if (A(i,j)>A(p(j),j)) p(j) = i;

do k = 1,N tmp = A(j,k); A(j,k) = A(p(j),k); A(p(j),k) = tmp;

do i = j+1,N A(i,j) = A(i,j)/A(j,j);

do k = j+1,N do i = j+1,N A(i,k) = A(i,k) - A(i,j)*A(j,k);

Select pivot row:

Swap pivot row with current:

Scale column (to store L):

Update(to compute partial U):

x x x x x x0 x x x x x0 0 5 x x x0 0 3 x x x0 0 7 x x x 0 0 2 x x x

Page 6: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Automatic Blocking Compiler

transformations:– Stripmine– Index-Set Split– Loop Distribute– Tile/Map to BLAS

Problems:– Establishing legality– Mapping to BLAS

do jB = 1,N,B

do j = jB,jB+B-1 p(j) = j; do i = j+1,N if (A(i,j)>A(p(j),j)) p(j) = i; do k = 1,N tmp = A(j,k); A(j,k) = A(p(j),k); A(p(j),k) = tmp; do i = j+1,N A(i,j) = A(i,j)/A(j,j); do k = j+1,jB+B-1 do i = j+1,N A(i,k) = A(i,k) - A(i,j)*A(j,k);

do j = jB,jB+B-1 do k = jB+B,N do i = j+1,N A(i,k) = A(i,k) - A(i,j)*A(j,k);

Page 7: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

LU Performance

0

50

100

150

200

250

300

350

400

450

500

size

MF

lop

s

LAPACK

Distributed w ith BLAS

Distributed

Original LU

Page 8: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Key Points

Conventional techniques (dependence analysis) are insufficient to establish legality

Detection of matrix operations buys extra performance

Page 9: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Overview of this Talk

Fractal Symbolic Analysis– Framework for Symbolically determining

Legality of Program Transformations

Matrixization– Generalization of Vectorization to Matrix

Operations

Page 10: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Fractal Symbolic Analysis

Symbolic test to establish legality of program transformations

for i = 1 : n S1(i); S2(i);

for i = 1 : n S1(i);for i = 1 : n S2(i);

Page 11: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Dependence Analysis Independent operations may be reordered

Legality: No data dependence from S2(m) to S1(l) where l > m

for i = 1 : n S1(i); S2(i);

for i = 1 : n S1(i);for i = 1 : n S2(i);

Page 12: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Symbolic Comparison Dependence Analysis is conservative:

Symbolic execution shows equality:– aout = 2*(ain + bin)

– bout = 2*bin

But, intractable for recurrent loops

s1: a = 2*as2: b = 2*bs3: a = a+b

s3: a = a+bs1: a = 2*as2: b = 2*b

?

Page 13: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Distillation of LU Given p(j) j, prove:

Dependence analysis: too conservative Symbolic comparison: intractable

for j = 1:n tmp = a(j) a(j) = a(p(j)) a(p(j)) = tmp

for i = j+1:n a(i) = a(i)/a(j)

B1(j):

B2(j):

for j = 1:n tmp = a(j) a(j) = a(p(j)) a(p(j)) = tmp

for j = 1:n for i = j+1:n a(i) = a(i)/a(j)

B1(j):

B2(j):

Page 14: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Idea

Simplify– Prove equality of complex programs via

equality of simpler programs– Repeat if necessary

Sufficient, but not necessary– equality of simpler programs -> equality of

original programs

Page 15: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

CommutativityS1, S2, S3, …,Sn = Sf(1), Sf(2), Sf(3), …,Sf(n)

i,j 1..n | i < j f(i) > f(j).

Si; Sj = Sj; Si

Permutation Sequence of Adjacent Transpositions

Page 16: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Commutative Legality

Loop Distribution

Legality:1 m < l N.

S2(m); S1(l) = S1(l); S2(m)

for i = 1 : n S1(i); S2(i);

for i = 1 : n S1(i);for i = 1 : n S2(i);

Page 17: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Commutative Legality

Similar conditions:– Statement reordering– Loop interchange – Loop reversal – Loop tiling

Compiler establishes legality of transformation by testing commute condition

Page 18: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Back to Example Given p(j) j, prove:

Simplified comparison =>

for j = 1:n tmp = a(j) a(j) = a(p(j)) a(p(j)) = tmp

for i = j+1:n a(i) = a(i)/a(j)

B1(j):

B2(j):

for j = 1:n tmp = a(j) a(j) = a(p(j)) a(p(j)) = tmp

for j = 1:n for i = j+1:n a(i) = a(i)/a(j)

B1(j):

B2(j):

Page 19: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Simplified Comparison Given p(j) j l > m prove:

Further simplification?

for i = m+1:n a(i) = a(i)/a(m)

tmp = a(l) a(l) = a(p(l)) a(p(l)) = tmp

B2(m):

B1(l):

tmp = a(l) a(l) = a(p(l)) a(p(l)) = tmp

for i = m+1:n a(i) = a(i)/a(m)

B1(l):

B2(m):

Page 20: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Too Simple! Given p(j) j l > m i > m prove:

Too conservative - no longer legal! Hypothesis:

– Repeated application -> dependence analysis

a(i) = a(i)/a(m)

tmp = a(l) a(l) = a(p(l)) a(p(l)) = tmp

tmp = a(l) a(l) = a(p(l)) a(p(l)) = tmp

a(i) = a(i)/a(m)

Page 21: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Symbolic Comparison Can we symbolically prove?:

Effect can be summarized:– non-recurrent loops– affine indices/loop bounds

for i = m+1:n a(i) = a(i)/a(m)

tmp = a(l) a(l) = a(p(l)) a(p(l)) = tmp

B2(m):

B1(l):

tmp = a(l) a(l) = a(p(l)) a(p(l)) = tmp

for i = m+1:n a(i) = a(i)/a(m)

B1(l):

B2(m):

Page 22: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Guarded Expressions

Compare symbolic values of live output variable in terms of input variables:

aout (k)=

Each– guard : affine expression defining part of aout

– expr : symbolic expression on input data

guard1 -> expr1

guard2 -> expr2

……guardn -> exprn

Page 23: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Guarded Expressions

For both program blocks:

aout(k) =

Omega Library (integer programming tool) to manipulate/compare affine guards

k m => ain(k)

k = l => ain(p(l))/ain(m)

k = p(l) => ain(l)/ain(m)

else => ain(k)/ain(m)

Page 24: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Summary of Fractal Symbolic Analysis Powerful legality test Explores tradeoffs between

– tractability (dependence analysis)– accuracy (symbolic comparison)

Similar application for LU w/ pivoting– 2 recursive simplification steps– 6 guarded regions/expressions

Prototype implemented in OCAML

Page 25: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Overview of this Talk

Fractal Symbolic Analysis– Framework for Symbolically determining

Legality of Program Transformations

Matrixization– Generalization of Vectorization to Matrix

Operations

Page 26: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Matrixization

Detect Matrix Operations– Map to

• hand-optimized libraries (BLAS)• special-purpose hardware

– Eliminate loop overhead• MATLAB: type/bounds checks

Exploit Matrix Semantics– Ring Properties of Matrix (+,*)

Page 27: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

BLAS Performance

050

100150200250300350400450500

100 200 300 400 500 600 700 800 900 1000size

MF

lop

s

Compiler Matrix-Matrix ProductBLAS Matrix-Matrix Product (DGEMM)Compiler Matrix-Vector ProductBLAS Matrix-Vector Product (DGEMV)

Page 28: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

MATLAB Performance

0

10

20

30

40

50

60

70

80

Crank-Nicholson

Dirichlet FiniteDifference

Galerkin Inc.Cholesky

Ex

ecu

tio

n T

ime

(s)

Original

Vectorized

Compiled

Vect. & Comp.

Page 29: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Galerkin Example

In Galerkin, 98% of time spent in:

for i = 1:N for j = 1:N phi(k) += A(i,j)*x(i)*y(i);

end end

A - Matrix x, y - Vector

Page 30: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Vectorized Code

In Optimized Galerkin:

phi(k) += x*A*y’;

Fragment Speedup: 260 Program Speedup: 110

Page 31: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Conventional Approaches

Syntactic Pattern Replacement (KAPF,VAST,FALCON) Can encode high-level propertiesLimited use on loops

Vector Code Generation Can detect array operations in loopsCannot detect/exploit matrix products

Page 32: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Our Approach Map code to: Abstract Matrix Form (AMF)

Convert to Symmetric AMF formulation

Optimize AMF expressions– factorization– invariant removal

Detect Matrix Products

Map AMF to MATLAB/BLAS

Page 33: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Expansion

Array expressions:

Forall loops:

a(1:m,1:n) -> i j ai,j

for i = 1:n x(i) = x(i) + y(i); -> ixi = i(xi+yi)

end

Page 34: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Reduction

Sum Reduction Loops

for i = 1:n k = k + x(i); -> k = k + î i xi

end

Page 35: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Product

Matrix ProductC = A*B -> i jCi,j= P(i jAi,h,ijBh,j)

Product-Reduction Equivalence:

• e1 and e2 are scalar

• e1 is constant w.r.t. ik+1…in

• e2 is constant w.r.t. i1…ik-1

îki1…ine1 * i1…ine2) = Pîk(i1…ike1, ik…ine2)

Page 36: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Other AMF Properties

Distributive Properties– e1· î e2 = î((ie1) ·e2)

• when e1 is constant w.r.t. i

Interchange Properties if(e1, e2,…, en) = f(ie1, ie2,…, ien)

i e = i e • where î=

i j e = j i e

î e = î e

Page 37: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Back to Galerkin Original:

• k = k + î i j (ai,j * xi * yj)

Convert to Symmetric Form:• k = k + î (i j ai,j * i j xi * i j yj)

Optimize:• k = k + î i xi * (i j ai,j * i j yj)

Map to Matrix Operations:• k = k + Pî (i xi , P(i j ai,j, j yj))

=> k = k + x’*A*y;

Page 38: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Other Vectorization Examples

Taken from USENET:

for i = 1:n for j = 1:n C(i,j) = A(i,j)*x(i);

C(1:n,1:n) = A(1:n,1:n) .* repmat(x(1,n),1,n)

for i = 1:n x(i) = y(:,i)*A*y(:,i)’;

x(1:n) = sum(y.*(y*A’),2);

Page 39: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Summary/Status of Matrixization General technique to detect matrix/vector

operations

Implemented in MAJIC– MATLAB interpreter/compiler

Future work:– Extend to more ops: recurrences– JIT Matrixization

Page 40: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Related Work Commutativity Analysis

– Rinard & Diniz– Parallelization of OO programs

High-Level Pattern Replacement– DeRose, Marsolf, …– Exploit high-level matrix properties

• Structure• Symmetry• Orthogonality

Page 41: Symbolic Program Transformation for Numerical Codes Vijay Menon Cornell University.

Conclusion

High-level symbolic techniques:– Fractal Symbolic Analysis– Matrixization

New techniques to analyze & exploit underlying computation and achieve substantial performance gains