CS 584


Assignment

Systems of Linear Equations

A linear equation in n variables has the form

$$a_0 x_0 + a_1 x_1 + \dots + a_{n-1} x_{n-1} = b$$

A set of linear equations is called a system. A solution of a system is an assignment of values to the variables that satisfies all equations in the system.

Many scientific and engineering problems take this form.

Solving Systems of Equations

Many such systems are large.

– Thousands of equations and unknowns

$$
\begin{aligned}
a_{0,0} x_0 + a_{0,1} x_1 + \dots + a_{0,n-1} x_{n-1} &= b_0 \\
a_{1,0} x_0 + a_{1,1} x_1 + \dots + a_{1,n-1} x_{n-1} &= b_1 \\
&\;\;\vdots \\
a_{n-1,0} x_0 + a_{n-1,1} x_1 + \dots + a_{n-1,n-1} x_{n-1} &= b_{n-1}
\end{aligned}
$$

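Solving Systems of Equations

A linear system of equations can be represented in matrix form

$$
\begin{bmatrix}
a_{0,0} & a_{0,1} & \dots & a_{0,n-1} \\
a_{1,0} & a_{1,1} & \dots & a_{1,n-1} \\
\vdots & \vdots & & \vdots \\
a_{n-1,0} & a_{n-1,1} & \dots & a_{n-1,n-1}
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix}
=
\begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{n-1} \end{bmatrix}
$$

$$Ax = b$$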
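As a small illustration of the matrix form, the sketch below stores A as a list of rows and checks a candidate solution by computing Ax and comparing it with b. The 3 × 3 system and the helper name `matvec` are made up for this example and are not part of the slides.

```python
def matvec(A, x):
    """Return the product A x for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

# Illustrative system (assumed values, not from the slides).
A = [[2.0, 1.0, -1.0],
     [-3.0, -1.0, 2.0],
     [-2.0, 1.0, 2.0]]
b = [8.0, -11.0, -3.0]

# Candidate solution: it solves the system only if every equation holds.
x = [2.0, 3.0, -1.0]
residual = [lhs - rhs for lhs, rhs in zip(matvec(A, x), b)]
print(residual)
```

A residual of all zeros means x satisfies every equation in the system.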

Solving Systems of Equations

Solving a system of linear equations is done in two steps:

– Reduce the system to upper-triangular form

– Use back-substitution to find the solution

These steps are performed on the system in matrix form.

– Gaussian Elimination, etc.

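Solving Systems of Equations

Reduce the system to upper-triangular form

Use back-substitution

$$
\begin{bmatrix}
a_{0,0} & a_{0,1} & \dots & a_{0,n-1} \\
0 & a_{1,1} & \dots & a_{1,n-1} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & a_{n-1,n-1}
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix}
=
\begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{n-1} \end{bmatrix}
$$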
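Back-substitution is not spelled out on the slides, so the following is only a minimal sketch of the usual approach: solve the last equation first, then work upward, substituting the values already found. The function name `back_substitute` and the sample upper-triangular system are assumptions for illustration, and the diagonal entries are assumed to be nonzero.

```python
def back_substitute(U, c):
    """Solve U x = c for an upper-triangular U with nonzero diagonal."""
    n = len(c)
    x = [0.0] * n
    # Start from the last equation, which has a single unknown.
    for i in range(n - 1, -1, -1):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i] - known) / U[i][i]
    return x

# Illustrative upper-triangular system (assumed values).
U = [[2.0, 1.0, -1.0],
     [0.0, 3.0, 2.0],
     [0.0, 0.0, 4.0]]
c = [1.0, 12.0, 12.0]
x = back_substitute(U, c)
print(x)
```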

Reducing the System

Gaussian elimination systematically eliminates variable x[k] from equations k+1 to n-1.

– Reduces the coefficients to zero

This is done by subtracting an appropriate multiple of the kth equation from each of the equations k+1 to n-1.

Procedure GaussianElimination(A, b, y)
    for k = 0 to n - 1
        /* Division Step */
        for j = k + 1 to n - 1
            A[k,j] = A[k,j] / A[k,k]
        endfor
        y[k] = b[k] / A[k,k]
        A[k,k] = 1

        /* Elimination Step */
        for i = k + 1 to n - 1
            for j = k + 1 to n - 1
                A[i,j] = A[i,j] - A[i,k] * A[k,j]
            endfor
            b[i] = b[i] - A[i,k] * y[k]
            A[i,k] = 0
        endfor
    endfor
end
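The procedure above can be transcribed almost line for line into Python. The sketch below is one possible rendering, not a reference implementation: it works on copies of A and b, returns the reduced matrix together with y, and, like the pseudocode, performs no row exchanges, so it assumes every pivot A[k,k] is nonzero. The helper `solve_unit_upper` and the sample system are additions for illustration.

```python
def gaussian_elimination(A, b):
    """Reduce A x = b to unit upper-triangular form U x = y (no pivoting)."""
    n = len(b)
    A = [row[:] for row in A]  # work on copies, leave the inputs untouched
    b = b[:]
    y = [0.0] * n
    for k in range(n):
        pivot = A[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no row exchanges")
        # Division step: scale row k so that its diagonal entry becomes 1.
        for j in range(k + 1, n):
            A[k][j] /= pivot
        y[k] = b[k] / pivot
        A[k][k] = 1.0
        # Elimination step: remove x[k] from equations k+1 .. n-1.
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
            b[i] -= A[i][k] * y[k]
            A[i][k] = 0.0
    return A, y

def solve_unit_upper(U, y):
    """Back-substitution for a unit upper-triangular system U x = y."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))
    return x

# Illustrative system (assumed values, not from the slides).
A = [[2.0, 1.0, -1.0],
     [-3.0, -1.0, 2.0],
     [-2.0, 1.0, 2.0]]
b = [8.0, -11.0, -3.0]
U, y = gaussian_elimination(A, b)
x = solve_unit_upper(U, y)
print(x)
```

Because the division step sets each diagonal entry to 1, the back-substitution here needs no division.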

Parallelizing Gaussian Elim.

Use domain decomposition

– Rowwise striping

Division step requires no communication

Elimination step requires a one-to-all broadcast for each equation.

No agglomeration

Initially map one row to each processor

Communication Analysis

Consider the algorithm step by step

Division step requires no communication

Elimination step requires one-to-all bcast

– only bcast to other active processors

– only bcast active elements

Final computation requires no communication.

Communication Analysis

One-to-all broadcast

– $\log_2 q$ communications

– $q = n - k - 1$ active processors

Message size

– q active processors

– q elements required

$$T = (t_s + t_w q) \log_2 q$$
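To get a feel for the broadcast cost T = (ts + tw q) log2 q, the sketch below evaluates it for one iteration and then sums it over all iterations k with q = n - k - 1. The summation, the function names, the treatment of q < 1 as zero cost, and the sample values of ts and tw are assumptions added for illustration; the slides give only the per-broadcast formula.

```python
import math

def broadcast_time(q, ts, tw):
    """Cost of one one-to-all broadcast of q elements among q processors."""
    if q < 1:
        return 0.0  # assumption: nothing to broadcast when no processor is active
    return (ts + tw * q) * math.log2(q)

def total_comm_time(n, ts, tw):
    """Sum the broadcast cost over iterations k = 0 .. n-1, with q = n - k - 1."""
    return sum(broadcast_time(n - k - 1, ts, tw) for k in range(n))

# Illustrative parameters: startup time ts and per-word time tw (assumed units).
print(broadcast_time(8, ts=1.0, tw=0.5))
print(total_comm_time(3, ts=1.0, tw=1.0))
```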

Computation Analysis

Division step

– q divisions

Elimination step

– q multiplications and subtractions

Assuming equal time --> 3q operations

Computation Analysis

In each step, the active processor set is reduced by one resulting in:

$$\text{CompTime} = \sum_{k=0}^{n-1} 3(n - k - 1) = \frac{3n(n-1)}{2}$$
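The closed form follows from summing the per-step cost 3q, with q = n - k - 1, over k = 0 to n-1, which gives 3n(n-1)/2. A quick numerical check is sketched below; the function names are hypothetical.

```python
def comp_time(n):
    """Sum the per-step cost 3q, with q = n - k - 1, over k = 0 .. n-1."""
    return sum(3 * (n - k - 1) for k in range(n))

def comp_time_closed_form(n):
    """Closed form 3n(n-1)/2; n(n-1) is always even, so // is exact."""
    return 3 * n * (n - 1) // 2

# The two expressions agree for every n tried here.
for n in (1, 2, 10, 1000):
    assert comp_time(n) == comp_time_closed_form(n)
```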

Can we do better?

Previous version is synchronous and parallelism is reduced at each step.

Pipeline the algorithm

Run the resulting algorithm on a linear array of processors.

Communication is nearest-neighbor

Results in O(n) steps of O(n) operations

Pipelined Gaussian Elim.

Basic assumption: A processor does not need to wait until all processors have received a value to proceed.

Algorithm

– If processor p has data for other processors, send the data to processor p+1

– If processor p can do some computation using the data it has, do it.

– Otherwise, wait to receive data from processor p-1
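The three rules can be tried out with a sequential simulation of a linear array. The sketch below is only a toy model under several assumptions that are not in the slides: processor p holds row p of the augmented matrix [A | b], each processor performs at most one action per time step, a message sent in one step becomes visible to the neighbor in the next step, and no pivoting is done. All names (`pipelined_elimination`, `inbox`, `to_send`, `to_apply`) are hypothetical.

```python
from collections import deque

def pipelined_elimination(A, b):
    """Simulate pipelined Gaussian elimination, one row per processor."""
    n = len(b)
    rows = [A[p][:] + [b[p]] for p in range(n)]  # augmented row on processor p
    inbox = [deque() for _ in range(n)]          # messages from processor p-1
    to_send = [None] * n                         # message waiting to go to p+1
    to_apply = [None] * n                        # pivot row not yet applied
    stage = [0] * n                              # pivot rows applied so far
    done = [False] * n                           # own row already normalized
    steps = 0
    while not all(done):
        deliveries = []
        for p in range(n):
            # Pick up the next pivot row once the previous one is handled.
            if to_send[p] is None and to_apply[p] is None and inbox[p]:
                to_apply[p] = inbox[p].popleft()
                if p + 1 < n:
                    to_send[p] = to_apply[p]
            if to_send[p] is not None:
                # Rule 1: forward data to processor p+1 before computing.
                deliveries.append((p + 1, to_send[p]))
                to_send[p] = None
            elif to_apply[p] is not None:
                # Rule 2: eliminate x[k] from the local row.
                k, pivot_row = to_apply[p]
                factor = rows[p][k]
                for j in range(k, n + 1):
                    rows[p][j] -= factor * pivot_row[j]
                rows[p][k] = 0.0
                to_apply[p] = None
                stage[p] += 1
            elif stage[p] == p and not done[p]:
                # Rule 2: all earlier pivots applied, so do the division step.
                pivot = rows[p][p]
                for j in range(p, n + 1):
                    rows[p][j] /= pivot
                done[p] = True
                if p + 1 < n:
                    to_send[p] = (p, rows[p][:])
            # Rule 3: otherwise wait for data from processor p-1.
        for dest, msg in deliveries:
            inbox[dest].append(msg)  # visible to the neighbor next step
        steps += 1
    U = [r[:n] for r in rows]
    y = [r[n] for r in rows]
    return U, y, steps

# Illustrative system (assumed values, not from the slides).
A = [[2.0, 1.0, -1.0],
     [-3.0, -1.0, 2.0],
     [-2.0, 1.0, 2.0]]
b = [8.0, -11.0, -3.0]
U, y, steps = pipelined_elimination(A, b)
print(U, y, steps)
```

Forwarding takes priority over computing, so a pivot row keeps moving down the array while earlier processors are still eliminating. A real implementation would use message passing between processes rather than shared lists.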

Conclusion

Using a striped partitioning method, it is natural to pipeline the Gaussian elimination algorithm to achieve best performance.

Pipelined algorithms work best on a linear array of processors.

– Or something that can be linearly mapped

Would it be better to block partition?

– How would it affect the algorithm?