
Reducing number of operations: The joy of algebraic transformations

CS498DHP Program Optimization

Number of operations and execution time

• A smaller number of operations does not necessarily mean a shorter execution time:
– because of scheduling in a parallel environment,
– because of locality,
– because of communication in a parallel program.

• Nevertheless, although it must be applied carefully, reducing the number of operations is one of the most important optimizations.

• In this presentation, we discuss transformations that reduce the number of operations, or that reduce the length of the schedule in an idealized parallel environment where communication costs are zero.

Scheduling

• Consider the expression tree below.

• It can be shortened by applying

– Associativity and commutativity: [a+h+b*(c+g+d*e*f)], or

– Associativity, commutativity and distributivity: [a+h+b*c+b*g+b*d*e*f].

• The third expression has the shortest tree of the three. This means that, with enough resources, the third expression is the fastest even though it has the most operations.

[Figure: the original expression tree, built from + and * nodes over the leaves a, b, c, d, e, f, g, h]
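To make the height difference concrete, here is a rough accounting (a sketch assuming unit-time operations and unlimited processors; exact numbers depend on how shared subexpressions are scheduled). In a+h+b*(c+g+d*e*f), the critical path runs d*e, then (d*e)*f, then +(c+g), then *b, then the final additions: about 5 steps. In a+h+b*c+b*g+b*d*e*f, every product can start immediately (b*d*e*f completes in 2 steps), and the five terms are then summed in ⌈log₂ 5⌉ = 3 more steps: about 4 steps in all. Distribution added multiplications but shortened the critical path.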

Locality

• Consider the two loop sequences:

do i=1,n
   c(i) = a(i)+b(i)+a(i)/b(i)
end do
…
do i=1,n
   x(i) = (a(i)+b(i))*t(i)+a(i)/b(i)
end do

do i=1,n
   d(i) = a(i)/b(i)
   c(i) = a(i)+b(i)+d(i)
end do
…
do i=1,n
   x(i) = (a(i)+b(i))*t(i)+d(i)
end do

• The second sequence executes fewer operations, since the division a(i)/b(i) is computed only once per iteration, but, if n is large enough, it also incurs more cache misses because of the extra traffic on array d. (We assume that t is computed between the two loops so that they cannot be fused.)
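The same tradeoff rendered in C (a sketch under the slide's assumptions; the arrays are hypothetical, and t is produced between the two loops by code not shown):

/* Version 1: recompute a[i]/b[i]; two divisions per i, but no
   traffic on an extra array. */
void two_passes_recompute(int n, const double *a, const double *b,
                          const double *t, double *c, double *x)
{
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i] + a[i] / b[i];
    /* ... t is produced here, so the loops cannot be fused ... */
    for (int i = 0; i < n; i++)
        x[i] = (a[i] + b[i]) * t[i] + a[i] / b[i];
}

/* Version 2: one division per i, but array d adds n stores and n
   loads; for large n, d's elements are evicted between the loops. */
void two_passes_save_division(int n, const double *a, const double *b,
                              const double *t, double *c, double *x,
                              double *d)
{
    for (int i = 0; i < n; i++) {
        d[i] = a[i] / b[i];
        c[i] = a[i] + b[i] + d[i];
    }
    /* ... t is produced here ... */
    for (int i = 0; i < n; i++)
        x[i] = (a[i] + b[i]) * t[i] + d[i];
}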

Communication in parallel programs

• Consider the two parallel program fragments:

cobegin
   …
   do i=1,n
      a(i) = ..
   end do
   send a(1:n)
   …
//
   …
   receive a(1:n)
   …
coend

cobegin
   …
   do i=1,n
      a(i) = ..
   end do
   …
//
   …
   do i=1,n
      a(i) = ..
   end do
   …
coend

• The second fragment executes more operations, since the loop body is computed on both parallel branches, but it executes faster if the send operation is expensive.
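In a message-passing setting, the same choice can be sketched with MPI (an illustration only; compute_a and the two-rank setup are hypothetical, not from the slides):

#include <mpi.h>

/* Hypothetical per-element computation. */
static double compute_a(int i) { return 0.5 * i; }

/* Option 1: rank 0 computes a and sends it to rank 1. */
void compute_and_send(double *a, int n, int rank)
{
    if (rank == 0) {
        for (int i = 0; i < n; i++) a[i] = compute_a(i);
        MPI_Send(a, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(a, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
}

/* Option 2: both ranks compute a redundantly. More operations in
   total, but no message; faster when sends are expensive. */
void compute_redundantly(double *a, int n)
{
    for (int i = 0; i < n; i++) a[i] = compute_a(i);
}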

Approaches to reducing cost of computation

• Eliminate (syntactically) redundant computations.

• Apply algebraic transformations to reduce the number of operations.

• Decompose sequential computations for parallel execution.

• Apply algebraic transformations to reduce the height of expression trees and thus reduce execution time in a parallel environment.

Elimination of redundant computations

• Many of these transformations were discussed in the context of compiler transformations:
– Common subexpression elimination
– Loop invariant removal
– Elimination of redundant counters
– Loop unrolling (not discussed, but it should have been; it eliminates bookkeeping operations)

• However, compilers will not eliminate all redundant computations. Here is an example where user intervention is needed. The following sequence

do i=1,n
   s = a(i)+s
end do
…
do i=1,n-1
   t = a(i)+t
end do
…t…

may be replaced by

do i=1,n-1
   t = a(i)+t
end do
s = t+a(n)
…
…t…

This transformation is not usually done by compilers.
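The same rewrite in C (a sketch; it assumes s and t both start at zero, which the slide leaves implicit, and that a and n are in scope):

/* Before: the loop for s repeats the n-1 additions already done for t. */
double s = 0.0, t = 0.0;
for (int i = 0; i < n; i++)     s += a[i];   /* sum of a[0..n-1] */
for (int i = 0; i < n - 1; i++) t += a[i];   /* sum of a[0..n-2] */

/* After: compute t once, then derive s with a single extra addition. */
t = 0.0;
for (int i = 0; i < n - 1; i++) t += a[i];
s = t + a[n - 1];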

2. Another example, from C, is the loop

for (i = 0; i < n; i++) {
   for (j = 0; j < n; j++) {
      a[i][j] = 0;
   }
}

which, if a is n × n, can be transformed into the loop below, which has fewer bookkeeping operations.

b = &a[0][0];
for (i = 0; i < n*n; i++) {
   *b = 0;
   b++;
}

Applying algebraic transformations to reduce the number of operations

• For example, the expression a*(b*c)+(b*a)*d+a*e can be transformed into (a*b)*(c+d)+a*e by associativity, commutativity, and distributivity, and then by associativity and distributivity into a*(b*(c+d)+e), going from seven operations down to four.

• Notice that associativity has to be applied with care. For example, suppose we are operating on floating point values and that x is very much larger than y and z=-x. Then (y+x)+z may give 0 as a result, while y+(x+z) gives y as an answer.
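This is easy to demonstrate in C (values chosen for illustration):

#include <stdio.h>

int main(void)
{
    double x = 1e30, y = 1.0, z = -1e30;
    /* Left association: x swamps y, then z cancels x entirely. */
    printf("%g\n", (y + x) + z);   /* prints 0 */
    /* Right association: x and z cancel first, so y survives. */
    printf("%g\n", y + (x + z));   /* prints 1 */
    return 0;
}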

• The application of algebraic rules can be very sophisticated. Consider the computation of xⁿ. A naïve implementation would require n−1 multiplications.

• However, if we represent n in binary as n = b₀ + 2(b₁ + 2(b₂ + …)) and notice that xⁿ = x^b₀ · (x^(b₁+2(b₂+…)))², the number of multiplications can be reduced to O(log n).

function power(x, n)   // assume n > 0
   if n == 1 then return x
   if n % 2 == 1 then return x * power(x, n-1)
   else t = power(x, n/2); return t*t
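A runnable C version of the same idea (standard binary exponentiation, written iteratively; not code from the slides):

/* Binary (repeated-squaring) exponentiation: O(log n) multiplications
   instead of n - 1. */
double power(double x, unsigned n)
{
    double result = 1.0;
    while (n > 0) {
        if (n % 2 == 1)   /* low bit of n set: fold in the current power */
            result *= x;
        x *= x;           /* square: move on to the next binary digit */
        n /= 2;
    }
    return result;
}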

Horner’s rule

• A polynomial

A(x) = a₀ + a₁x + a₂x² + a₃x³ + … may be written as A(x) = a₀ + x(a₁ + x(a₂ + x(a₃ + …))).

As a result, a polynomial may be evaluated at a point x′ (that is, A(x′) computed) in Θ(n) time using Horner's rule: repeated multiplications and additions, rather than the naive method of raising x to successive powers, multiplying by the coefficient, and accumulating.
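A direct C rendering (a standard sketch; the coefficients are assumed to be stored lowest degree first in a[0..n]):

/* Evaluate A(x) = a[0] + a[1]*x + ... + a[n]*x^n with exactly n
   multiplications and n additions, from the innermost parenthesis out. */
double horner(const double a[], int n, double x)
{
    double result = a[n];
    for (int i = n - 1; i >= 0; i--)
        result = result * x + a[i];
    return result;
}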

Conventional matrix multiplication

Asymptotic complexity: 2n³ operations. Each recursion step (blocked version): 8 multiplications and 4 additions of n/2 × n/2 blocks:

C₁₁ = A₁₁B₁₁ + A₁₂B₂₁    C₁₂ = A₁₁B₁₂ + A₁₂B₂₂
C₂₁ = A₂₁B₁₁ + A₂₂B₂₁    C₂₂ = A₂₁B₁₂ + A₂₂B₂₂

Strassen’s Algorithm

Asymptotic complexity: O(n^(log₂ 7)) = O(n^2.81…) operations

Each recursion step: 7 multiplications, 18 additions/subtractions:

M₁ = (A₁₁ + A₂₂)(B₁₁ + B₂₂)
M₂ = (A₂₁ + A₂₂)B₁₁
M₃ = A₁₁(B₁₂ − B₂₂)
M₄ = A₂₂(B₂₁ − B₁₁)
M₅ = (A₁₁ + A₁₂)B₂₂
M₆ = (A₂₁ − A₁₁)(B₁₁ + B₁₂)
M₇ = (A₁₂ − A₂₂)(B₂₁ + B₂₂)

C₁₁ = M₁ + M₄ − M₅ + M₇    C₁₂ = M₃ + M₅
C₂₁ = M₂ + M₄              C₂₂ = M₁ − M₂ + M₃ + M₆

Asymptotic complexity is the solution of T(n) = 7T(n/2) + 18(n/2)²; by the master theorem, since log₂ 7 ≈ 2.81 > 2, this gives T(n) = Θ(n^(log₂ 7)).

Winograd

Asymptotic complexity: O(n^2.81…) operations. Each recursion step: 7 multiplications, 15 additions/subtractions.

[Winograd's variant of the 2×2 block formulas; figure lost in transcription]

Parallel matrix multiplication

• Parallel matrix multiplication can be accomplished without redundant operations.

• First observe that the time to compute a sum of n elements, given enough resources, is ⌈log₂ n⌉ steps, using a balanced reduction tree.

• With sufficient replication and computational resources, matrix multiplication can take just one multiplication step and ⌈log₂ n⌉ addition steps: all n³ elementwise products are formed in parallel, and each of the n² result entries is then a log-depth sum of n products.

• Copying an element to n destinations can also be done in ⌈log₂ n⌉ steps.
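The logarithmic-depth summation can be mimicked in sequential C (a sketch; each pass of the outer loop corresponds to one parallel time step under the idealized zero-communication-cost model):

#include <stddef.h>

/* Pairwise (tree) reduction: after step s, x[i] holds the sum of a
   block of up to 2^s original elements; ceil(log2 n) steps in all. */
double tree_sum(double *x, size_t n)
{
    for (size_t stride = 1; stride < n; stride *= 2)       /* one time step */
        for (size_t i = 0; i + stride < n; i += 2 * stride)
            x[i] += x[i + stride];                         /* independent pairs */
    return x[0];
}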

Parallelism and redundancy

• Algebra rules can be applied to reduce tree height.

• In some cases, the height of the tree is reduced at the expense of an increase in the number of operations; parallel prefix, below, is the classic example.

Parallel Prefix
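The slide's figure did not survive transcription; below is a standard illustration of the tradeoff (a Kogge-Stone-style scan sketch in C, not the slide's own code). Sequentially, the n prefix sums take n − 1 additions; this version finishes in ⌈log₂ n⌉ steps but performs roughly n·log₂ n additions.

#include <stdlib.h>
#include <string.h>

/* In-place inclusive prefix sums in ceil(log2 n) parallel steps.
   Each outer iteration is one time step: all its additions are
   independent and could run simultaneously. Total work is
   O(n log n) additions, versus n - 1 for a sequential scan. */
void parallel_prefix(double *x, size_t n)
{
    double *prev = malloc(n * sizeof *prev);
    for (size_t stride = 1; stride < n; stride *= 2) {
        memcpy(prev, x, n * sizeof *x);   /* values from the previous step */
        for (size_t i = stride; i < n; i++)
            x[i] = prev[i] + prev[i - stride];
    }
    free(prev);
}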

Redundancy in parallel sorting: sorting networks

Comparator (2-sorter)

[Figure: a comparator takes inputs x and y and produces outputs min(x, y) on one wire and max(x, y) on the other]

Comparison Network

[Figure: 0/1 values traced through a comparison network; the network has n/2 comparisons per stage and d stages]

Sorting Networks

[Figure: a sorting network maps the input sequence 10010011 to the sorted output 00001111]

Insertion Sort Network

[Figure: the insertion sort network on n inputs]

depth: 2n − 3 comparator stages
size: n(n − 1)/2 comparators

Network                        Depth          Comparators
Odd-even transposition sort    O(n)           O(n²)
Bubblesort                     O(n)           O(n²)
Bitonic sort                   O(log² n)      O(n log² n)
Odd-even mergesort             O(log² n)      O(n log² n)
Shellsort                      O(log² n)      O(n log² n)
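As a concrete instance of the first row, here is a sketch of odd-even transposition sort in C (the standard network, not code from the slides). It has n stages; each stage's comparators touch disjoint wire pairs, so a stage is one parallel step, giving O(n) depth and O(n²) comparators:

#include <stddef.h>

/* One comparator: route the smaller value to the lower wire. */
static void compare_exchange(int *a, size_t i, size_t j)
{
    if (a[i] > a[j]) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}

/* Odd-even transposition network: n stages, each a set of
   independent comparators on adjacent wires. */
void odd_even_transposition_sort(int *a, size_t n)
{
    for (size_t stage = 0; stage < n; stage++)
        for (size_t i = stage % 2; i + 1 < n; i += 2)
            compare_exchange(a, i, i + 1);
}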