Transcript
Page 1: Final Presentation

Final Presentation

Parallel Covariance Matrix Creation

Supervisor: Oded Green

Ami Galperin, Lior David

Page 2: Final Presentation

Parallel Covariance Matrix Creation - Final Presentation

Table of Contents - Overview

- Introduction
- Building the covariance matrix
  - The naïve algorithm
  - Our algorithm
    - Terminology
    - The Algorithm
    - Optimizations
    - Results
- MVM on Plurality
  - The MVM algorithm
  - Plurality Platform
  - Results
- Future Projects
- Conclusions

April 18, 2010

Page 3: Final Presentation

Table of Contents

Page 4: Final Presentation

Project’s Goals

- Developing a parallel algorithm for the creation of a covariance matrix
- Compatibility with Plurality's HAL platform
- Maximized parallelization and core utilization
- Integrating the algorithm into Elta's MVM (Minimum Variance Method) algorithm implementation

Page 5: Final Presentation


MVM Algorithm

MVM is a modern 2-D spectral estimation algorithm used by Elta's Synthetic Aperture Radar (SAR). Compared with conventional Fourier-transform SAR imaging techniques, the MVM algorithm:

- Improves resolution
- Removes side-lobe artifacts (noise)
- Reduces speckle

One of MVM's main building blocks is the creation of a covariance matrix.

Page 6: Final Presentation

Plurality Platform

Plurality's HyperCore Architecture Line (HAL) family of massively parallel manycore processors features:

- A unique task-oriented programming model
- Near-serial programmability
- High performance at low cost per watt per square millimeter
- A unique shared-memory architecture - 2 MB cache size

Page 7: Final Presentation

Table of Contents

Page 8: Final Presentation


Implementing the Naïve Algorithm

Motivation:

Implementing the naïve algorithm will give us a greater understanding of the parallelization problem.

Page 9: Final Presentation

The Naïve Algorithm

[Figure: the Chip - an NxM matrix of cells C1,1 … CN,M]

Page 10: Final Presentation

The Naïve Algorithm

[Figure: a Sub-aperture - an N1xM1 window within the Chip]


Page 12: Final Presentation

The Naïve Algorithm

[Figure: the cells of one Sub-aperture (C2,2 … C4,4) multiplied by the conjugates of its cells, e.g. C2,2*, C2,3*, …, C4,4*]

Page 13: Final Presentation

The Naïve Algorithm

[Figure: the Chip matrix beside the M∙N x M∙N covariance matrix of Rcells R1,1 … RM∙N,M∙N]

Every Sub-aperture holds its covariance matrix Cov.

Page 14: Final Presentation

The Naïve Algorithm

[Figure: the Chip matrix]

The covariance matrix Cov is the sum of the Cov matrices of all Sub-apertures:

Cov(p,q) ~ Σ (x = 0 … N-1) Vp(x) ∙ Vq*(x)

Page 15: Final Presentation


The Naïve Algorithm


Shortcomings

- Each multiplication is executed many times
- For a 32x32 chip, the total number of multiplications is 11.4M, while the optimal number is 208K (x28!)

The naïve algorithm is difficult to parallelize. Two main difficulties:

- Simultaneous writing to the same Rcells - requires mutexes
- The memory cost of holding a Cov matrix for every permutation (each is 250 KB) is too expensive
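The 11.4M and 208K figures can be cross-checked by counting products, assuming a 13x13 Sub-aperture (the size quoted later in the presentation). The optimal count is one multiplication per cell pair, with an offset and its negation counted once, since their products are conjugates:

```python
def optimal_mults(n, m, sub):
    # One multiplication per unordered pair of chip cells whose offset
    # (dy, dx) fits inside a sub x sub Sub-aperture; an offset and its
    # negation share the same (conjugated) product, so count them once.
    total = 0
    for dy in range(sub):
        for dx in range(-(sub - 1), sub):
            if dy == 0 and dx < 0:
                continue  # (0, -dx) duplicates (0, dx)
            total += (n - dy) * (m - abs(dx))
    return total
```

For a 32x32 chip this gives 207,880 ≈ 208K distinct products, while the naïve count is (32-13+1)² windows x (13∙13)² products per window ≈ 11.4M.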

Page 16: Final Presentation


The Naïve Algorithm

Disadvantages on the Plurality Platform:

- Mutexes - add complexity
- Memory space - the cache size is only 2 MB

The problem requires a different solution!

Page 17: Final Presentation


A Whole Different Ball Game!

Our Algorithm

Page 18: Final Presentation


But first …

Before presenting the algorithm, we need to establish a common language for the terms we have coined.

Page 19: Final Presentation

Table of Contents

Page 20: Final Presentation

Terminology

[Figure: a Permutation - a pair of cells M1 and M2 in the Chip matrix. Examples: Permutation [1,0], Permutation [1,1]]


Page 23: Final Presentation

Terminology

[Figure: a Block - the window spanned by the permutation cells M1 and M2]


Page 25: Final Presentation

Terminology

[Figure: BNWs - Block-sized windows placed on the Chip matrix]

Page 26: Final Presentation

Terminology

[Figure: Shifting - the window containing the Block is shifted across the Chip]

- Shift only upwards and leftwards
- The block is always inside the shifted window


Page 29: Final Presentation

Terminology

[Figure: Shifting - a further step of the shifted window]

A shift of (0,0) is named the Zero iteration.

Page 30: Final Presentation

Terminology

[Figure: Cov - the covariance matrix, of size [M∙N, M∙N], with cells R1,1 … RM∙N,M∙N]

Page 31: Final Presentation

Terminology

[Figure: an Rcell - a single cell R(i,j) of the covariance matrix Cov]

Page 32: Final Presentation

Table of Contents

Page 33: Final Presentation


Our Algorithm – Key Features

Concept: each Rcell in Cov is calculated by one specific permutation. This enables different permutations to work simultaneously.

- Parallel
- Each multiplication is executed once (208K for a 32x32 chip)
- Memory efficient
- Generic

Page 34: Final Presentation


Our Algorithm (simplified)

1. For each permutation (1:313)
   1.1. For each legal BNW
        1.1.1. Multiply the two multipliers
        1.1.2. For each legal shift (including the zero iteration)
               1.1.2.1. Add the multiplication product to the matching Rcell in Cov
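The numbered steps above can be sketched as a loop skeleton. The four callables below are hypothetical stand-ins for the project's look-up tables, not its actual interfaces:

```python
def build_cov(chip_value, perms, legal_bnws, legal_shifts, rcell_of):
    # Loop structure of the simplified algorithm: each product is
    # computed once per (permutation, BNW) and reused for every shift.
    cov = {}
    mults = 0
    for perm in perms:                                # 1.
        for m1, m2 in legal_bnws(perm):               # 1.1
            prod = chip_value(m1) * chip_value(m2)    # 1.1.1 multiply once
            mults += 1
            for shift in legal_shifts(perm, m1, m2):  # 1.1.2 incl. zero iteration
                r = rcell_of(perm, m1, m2, shift)
                cov[r] = cov.get(r, 0) + prod         # 1.1.2.1 accumulate
    return cov, mults
```

Because each (permutation, shift) pair targets its own Rcell, the accumulation needs no mutexes.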

Page 35: Final Presentation


Our algorithm (simplified) - Finding all unique permutations

Iterative algorithm (executed off-line):
1. Initialize the Delta set D and the Permutation set P
2. For each pair of cells (M1, M2) in an N1xM1 matrix
   2.1. If |M1 - M2| is not in D
        2.1.1. Add |M1 - M2| to D
        2.1.2. Add (M1, M2) to P

The unique permutation count is 313 (for a [13x13] Sub-aperture).
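A sketch of this off-line enumeration, assuming (consistently with the count of 313) that |M1 - M2| identifies an offset with its negation:

```python
def unique_permutations(n1, m1):
    # For every ordered pair of cells in an n1 x m1 matrix, keep one
    # representative pair per offset class, treating an offset and its
    # negation as the same delta (their products are conjugates).
    deltas, perms = set(), []
    cells = [(y, x) for y in range(n1) for x in range(m1)]
    for a in cells:
        for b in cells:
            d = (a[0] - b[0], a[1] - b[1])
            if d not in deltas and (-d[0], -d[1]) not in deltas:
                deltas.add(d)
                perms.append((a, b))
    return perms
```

For a 13x13 Sub-aperture this yields (25∙25 + 1) / 2 = 313 permutations, and likewise 61 for 6x6 and 113 for 8x8 - the counts that appear in the simulator results later on.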

Page 36: Final Presentation

Our algorithm (simplified)

[Figure: the Chip [NxM] beside Cov, the covariance matrix [M∙N, M∙N]]

Page 37: Final Presentation

Our algorithm (simplified)

[Figure: for a given Permutation [1,1], the cells M1 and M2 marked on the Chip, beside Cov]

Page 38: Final Presentation

Our algorithm (simplified)

[Figure: the Block spanned by M1 and M2]

Page 39: Final Presentation

Our algorithm (simplified)

[Figure: the legal BNWs for this Block]

Page 40: Final Presentation

Our algorithm (simplified)

[Figure: the Block placed in a given BNW]

Page 41: Final Presentation

Our algorithm (simplified)

[Figure: the multiplication RES = M1 ∙ M2*]

Page 42: Final Presentation

Our algorithm (simplified)

[Figure: the multipliers' numbering within the Block, column by column:
1 4 7
2 5 8
3 6 9]

Page 43: Final Presentation

Our algorithm (simplified)

[Figure: the Zero Iteration - multipliers 1 and 5 produce RES, which is written to Rcell (1,5) on diagonal Diag(5-1) of Cov]

Page 44: Final Presentation

Our algorithm (simplified)

[Figure: Shifting the Block across the Chip]

Page 45: Final Presentation

Our algorithm (simplified)

[Figure: Shifting - after one shift, multipliers 2 and 6 produce RES, written to Rcell (2,6), still on diagonal Diag(5-1)]


Page 47: Final Presentation

Our algorithm (simplified)

[Figure: Shifting - a further shift writes the product of multipliers 5 and 9 to the next Rcell on diagonal Diag(5-1)]

Page 48: Final Presentation

Our algorithm (simplified)

We came across a regularity in the offset of the Rcell coordinates when shifting:

- Leftwards: (+Sub-ap size, +Sub-ap size)
- Upwards: (+1, +1)

[Figure: the Cov matrix, marking the Rcells targeted by successive shifts]
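This regularity lets a shift be mapped to its Rcell with simple arithmetic. An illustrative sketch (the function name is hypothetical; the example matches the worked slides, where Rcell (1,5) moves to (2,6) after one upward shift):

```python
def rcell_after_shifts(zero_rcell, up, left, sub_ap):
    # Starting from the zero-iteration Rcell, every upward shift adds
    # (+1, +1) and every leftward shift adds (+sub_ap, +sub_ap) to the
    # Rcell coordinates, so the target stays on the same diagonal.
    row, col = zero_rcell
    step = up * 1 + left * sub_ap
    return (row + step, col + step)
```

Since both coordinates move by the same step, the diagonal offset (e.g. Diag(5-1)) is preserved across all shifts of a permutation.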

Page 49: Final Presentation

Our algorithm (simplified)

[Figure: the Cov matrix with its diagonals colored; each color represents a different permutation]

Page 50: Final Presentation

Our Algorithm (simplified)

Summary - for a given permutation:

- RES is always written into the same group of Rcells:
  - all on the same diagonal
  - not necessarily all of the diagonal's cells
- There is no overlap between the Rcells of different permutations - the basis for parallelism!
- Each shift writes to one unique Rcell, which theoretically enables parallelism at Rcell granularity (an instance per Rcell)

[Figure: the Cov matrix, diagonals colored by permutation]

Page 51: Final Presentation


Permutations Execution Times

Page 52: Final Presentation


Permutations Execution Times

Different permutations have different workloads; therefore, changing the order of the permutations' execution may improve core utilization.
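One standard way to act on this observation is a longest-processing-time-first (LPT) greedy schedule: run the heavy permutations first so cores stay busy toward the end. An illustrative heuristic, not the project's actual scheduler:

```python
import heapq

def lpt_makespan(workloads, cores):
    # Sort the per-permutation workloads in descending order and always
    # hand the next one to the least-loaded core; return the finish time.
    loads = [0.0] * cores
    heapq.heapify(loads)
    for w in sorted(workloads, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + w)
    return max(loads)
```

Scheduling the long permutations last would instead leave most cores idle while the final heavy task finishes alone.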

Page 53: Final Presentation


Parallelization Opportunities

- Different permutations work simultaneously
- Different chips can work simultaneously
- Finer-grain parallelism at Rcell granularity (an instance per Rcell)

Page 54: Final Presentation


Platform Comparison

Plurality vs. Distributed Systems:

- Our algorithm is optimal for shared-memory platforms, since Cov is shared by all cores
- Running on distributed-memory platforms would hurt its efficiency as a result of communication overhead
- Plurality provides much higher performance/power utilization than Elta's grid computing

Page 55: Final Presentation

Table of Contents

Page 56: Final Presentation


Look-up Tables

Concept: execute many data-independent calculations off-line and store the results as memory-efficient static look-up tables.

Advantages:
- Reduces calculation at run time by 50%
- The same tables are used for all chips

Page 57: Final Presentation


Look-up tables: the Permutations Table

Holds the relevant permutation info:
- Multipliers' indexes
- Block borders
- Zero iteration coordinates

Optimal table size: (4∙6 + 8∙2) bits ∙ 313 ≈ 1.5 KB

Page 58: Final Presentation


Look-up tables: the Offsets Table

Concept: maps each shift to an Rcell, using the regularity in the offset of the Rcell coordinates when shifting upwards (+1, +1) or leftwards (+Sub-ap size, +Sub-ap size).

Optimal table size: 2 ∙ (13 ∙ 13 ∙ 8) bits ∙ 313 ≈ 106 KB
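Both table-size figures check out arithmetically, reading the slides' formulas as bits per permutation entry and using 1 KB ≈ 1000 bytes (which matches the slides' rounding):

```python
# Permutations Table: a (4*6 + 8*2)-bit entry per permutation (313),
# per the previous slide's formula.
perm_bits_per_entry = 4 * 6 + 8 * 2            # 40 bits = 5 bytes
perm_table_bytes = perm_bits_per_entry // 8 * 313

# Offsets Table: two 8-bit Rcell coordinates for each of the 13 x 13
# shift positions, for each of the 313 permutations.
offsets_bits_per_perm = 2 * (13 * 13 * 8)      # 2704 bits = 338 bytes
offsets_table_bytes = offsets_bits_per_perm // 8 * 313
```

This gives 1,565 bytes ≈ 1.5 KB and 105,794 bytes ≈ 106 KB, as stated.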

Page 59: Final Presentation


Using Matrix Characteristics

Concept: use matrix characteristics to reduce calculations.

Important observation: Cov is a Hermitian matrix:

R† = R, i.e. R(i,j) = R*(j,i)

Page 60: Final Presentation


Using Matrix Characteristics

Highlight: build only Cov's upper triangle and, if necessary, generate the lower triangle inexpensively.

Advantages:
- Reduces calculations by half
- Requires less space for storing the Cov matrix
- Most eigendecomposition algorithms require the upper triangle only
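Generating the lower triangle from the upper one is a cheap mirror-and-conjugate step using R(i,j) = R*(j,i); a minimal numpy sketch (function name is illustrative):

```python
import numpy as np

def complete_hermitian(upper):
    # 'upper' holds Cov's upper triangle (diagonal included); anything
    # below the diagonal is ignored.  The Hermitian identity
    # R(i,j) = R*(j,i) fills in the strict lower triangle.
    u = np.triu(upper)
    return u + np.triu(upper, k=1).conj().T
```
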

Page 61: Final Presentation

Table of Contents

Page 62: Final Presentation


Results (x86)

Different Chip Sizes

[Chart: run time (seconds, 0-0.12) vs. chip size (6-36), naïve algorithm vs. ours; not optimized for x86]

Page 63: Final Presentation

Results (x86) - Different Sub-aperture Sizes

[Chart: run time (seconds, 0-0.08) vs. Sub-aperture size (3-15), naïve algorithm vs. ours; not optimized for x86]

Page 64: Final Presentation

Table of Contents

Page 65: Final Presentation


Elta's MVM Algorithm - Preliminary Algorithm

Elta's algorithm (pipeline):
1. 2D FFT of the input image
2. Fragmentation into overlapping 32x32 chips
3. 2D IFFT of each 32x32 chip
4. MVM applied to each 32x32 chip
5. Attachment of the chips into the full-size MVM(X,Y) image

The original SAR image is segmented into 32x32 chips; the chips overlap. MVM is applied to each chip, and the various chips are then attached to each other to form a full-size MVM image.

Page 66: Final Presentation

Task Map - MVM

[Diagram: the MVM task map (INIT, Segmentation, Covariance, Eigenvalues, 2D-IFFT, FFT, Attachment, FINISH); the main effort is in the Covariance and Eigenvalues stages]

Page 67: Final Presentation

Table of Contents

Page 68: Final Presentation


Plurality Platform

Plurality's HyperCore Architecture Line (HAL) family of massively parallel manycore processors includes:

- 16 to 256 32-bit RISC cores
- 4-64 co-processors that include a floating-point unit and a multiplier/divider; each co-processor is shared by four RISC processors
- A shared-memory architecture - 2 MB size, with no level-one cache
- A hardware-based scheduler that supports a task-oriented programming model
- A cycle-accurate simulator that runs on an x86 platform
- Integration into the Eclipse IDE
- An emulator supporting Linux and Windows native environments

Page 69: Final Presentation


Plurality's Platform - Emulator

The emulator mimics the behavior of HAL's hardware scheduler while running on an x86 processor under Linux/Windows-based environments.

Advantages:
- No need to switch to new hardware and a new programming model
- The emulator is written in ANSI C (almost any compiler can compile it)
- It comes with a prebuilt Makefile and a Visual Studio solution
- The emulator calls each task with all its required information: the right task instance, the right timing, and the right core ID
- However, it is not cycle-accurate!

Page 70: Final Presentation


Plurality's Platform - Simulator

A cycle-accurate hardware simulator that simulates the exact behavior of real HAL hardware. The simulator is integrated into the Eclipse IDE, but is very hard to debug with.

Advantages:
- Cycle-accurate simulation
- Uses GNU's well-known binutils and the GDB debugger
- Integrated into the Eclipse IDE
- Eases the transition to hardware

Page 71: Final Presentation


Plurality’s Platform


Implementations

- Compilation of the whole MVM algorithm using Plurality's emulator
- Compilation of our covariance matrix creation program using Plurality's simulator
- Using the Eclipse development environment to measure cycle-accurate performance

Page 72: Final Presentation


Added Features - N of M pre-compiler

Motivation:
- Overcoming an unimplemented Plurality feature
- Allowing manual scheduling in order to save processing time

For a given task, it limits the number of concurrent instances out of the task's defined quota. Implemented in Perl.

Page 73: Final Presentation

Table of Contents

Page 74: Final Presentation


Task Map - MVM on Plurality

[Diagram: the MVM task map (INIT, Segmentation, Covariance, Eigenvalues, 2D-FFT, 2D-IFFT, Attachment, FINISH), with stages assigned to x86, the emulator, and the simulator]

Page 75: Final Presentation


Results (complete MVM on emulator)

- 2D (I)FFT and eigendecomposition using Intel's MKL as a black box on the x86
- Compiled to native x86 code, but not fully optimized

[Chart placeholder]

Page 76: Final Presentation


Results (building covariance on the simulator)

[Chart: Speedup for 61 Permutations; cycles speedup (y-axis, 0 to 11) vs. number of cores (x-axis: 2, 4, 8, 16, 32, 64, 128, 256)]

Chip size: 15x15
Sub-Aperture size: 6x6

Page 77: Final Presentation


Results (building covariance on the simulator)

[Chart: Speedup for 113 Permutations; cycles speedup (y-axis, 0 to 19) vs. number of cores (x-axis: 2, 4, 8, 16, 32, 64, 128, 256)]

Chip size: 20x20
Sub-Aperture size: 8x8

Page 78: Final Presentation


Table of Contents Introduction Building the covariance matrix

The naïve algorithm Our algorithm

Terminology The Algorithm Optimizations Results

MVM on Plurality The MVM algorithm Plurality Platform Results

Future Projects Conclusions


Page 79: Final Presentation

Future Projects

Completing MVM on Plurality

Implement a parallel algorithm for finding the eigenvalues and eigenvectors of a dense Hermitian matrix
2D(I)FFT on Plurality using Plurality's 1-D library
Task map optimizations

Page 80: Final Presentation

Solving the Eigenvalues Problem

High complexity
Many algorithms: QR, SVD, D&C, Jacobi, etc.
Many OTS solutions: Intel, AMD, IBM, GNU, LAPACK, NAG, FEAST, etc.
Shared memory ∩ Parallel ∩ Open-source C = ∅

Page 81: Final Presentation

MRRR (Multiple Relatively Robust Representations)

Main features:
Fast: O(n²) for an n×n matrix
Parallel
Memory efficient: O(n²) for an n×n matrix
Complex data structures
Implementation unavailable

Optimal for Plurality's platform
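For orientation (not part of the project's toolchain): the dense Hermitian eigenproblem that MRRR targets can be sketched with NumPy's generic Hermitian solver. The 2x2 matrix below is a hypothetical stand-in for REF, chosen so the eigenvalues are easy to verify by hand.

```python
import numpy as np

# Hypothetical 2x2 Hermitian stand-in for the REF covariance matrix.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])

# eigh exploits Hermitian structure; eigenvalues come back real, ascending.
w, v = np.linalg.eigh(A)
print(w)  # [1. 4.]
```

Characteristic polynomial check: trace = 5, det = 6 - |1-1j|² = 4, so the eigenvalues are 1 and 4.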

Page 82: Final Presentation


Table of Contents Introduction Building the covariance matrix

The naïve algorithm Our algorithm

Terminology The Algorithm Optimizations Results

MVM on Plurality The MVM algorithm Plurality Platform Results

Future Projects Conclusions


Page 83: Final Presentation

Opportunities

Our algorithm is unique: no parallel solution has been available to date. This solution may be applied to other signal-processing problems.
An implementation of MRRR is possible, thereby enabling the complete MVM algorithm to run on Plurality's platform.
Using our solution on Plurality's platform may be very appealing, since Plurality provides higher performance-per-power utilization than grid computing and faster run times.

Page 84: Final Presentation

Practical Implications

Plurality's low-power platform may enable integrating SAR:
On satellites
On Unmanned Aerial Vehicles (UAVs)
More implications …

Page 85: Final Presentation


Thank you

Page 86: Final Presentation


Back-up Slides

Page 87: Final Presentation


Elta's MVM Algorithm: assembling a SAR radar picture consists of two phases:


1. Conventional SAR: incoming radar echoes → data manipulation (RMC, Adaptive Pre-Sum, MOCOMP, Autofocus, Polar-to-Rectangular Interpolation) → filtering and 2D IFFT → SAR image

2. MVM method in SAR: identify a target of interest on a SAR image → obtain virtual SAR raw data corresponding to the selected target (2D FFT) → MVM process → MVM SAR image of the selected target

The MVM algorithm

Page 88: Final Presentation

Our algorithm (Optimizations): Look-up tables

Concept: execute many calculations in advance, saving them in memory-efficient static look-up tables.

Advantages:
Greatly reduces calculation at run time
Same table used for all chips

Our algorithm

Page 89: Final Presentation


Our algorithm (Optimizations)Look-up tables Permutation Table

      M1x  M1y  M2x  M2y  Bx  By  REFx  REFy
1     …    …    …    …    …   …   …     …
…     …    …    …    …    …   …   …     …
…     …    …    …    …    …   …   …     …
313   …    …    …    …    …   …   …     …

Each row holds the relevant info for one permutation

Our algorithm

Page 90: Final Presentation


Our algorithm (Optimizations)Look-up tables Permutation Table

[M1x, M1y] are the coordinates of the first multiplier
[M2x, M2y] are the coordinates of the second multiplier
Bx is the number of rows of the permutation block
By is the number of columns of the permutation block
[REFx, REFy] are the coordinates of the pixel in the REF matrix (at the zero iteration)

Optimal table size: (4×6 + 8×2) bits × 313 = 1.565 KB

Our algorithm
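The stated table size can be checked directly; a quick sketch, assuming six 4-bit fields and two 8-bit fields per entry, as the formula implies:

```python
# Permutation table size: six 4-bit fields (M1x, M1y, M2x, M2y, Bx, By)
# plus two 8-bit fields (REFx, REFy), for each of the 313 permutations.
bits_per_entry = 4 * 6 + 8 * 2      # 40 bits per permutation
total_bits = bits_per_entry * 313
print(total_bits / 8)               # 1565.0 bytes = 1.565 KB
```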

Page 91: Final Presentation


Our algorithm (Optimizations)Look-up tables Offsets Table

Maps each shift to a pixel

We came across a regularity in the offset of the pixel coordinates when shifting upwards (+1, +1) or leftwards (+sub-aperture size, +sub-aperture size)

Concept:

Our algorithm

Page 92: Final Presentation


Our algorithm (Optimizations)Look-up tables Offsets Table

First, we create a general matrix containing all possible pixel offsets.

Matrix[i,j] is the offset when shifting i steps upwards and j steps leftwards.

Table's construction

[13×13 general offsets matrix: the rightmost column holds 0 through 12 from top to bottom, and each column to the left adds 13, so the values run from 0 (top right) to 168 (bottom left)]

Our algorithm
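The construction of the general offsets matrix can be sketched as follows. This is an illustrative reconstruction, assuming (as the 13×13 example suggests) that one row-step changes the offset by 1 and one column-step changes it by the grid width:

```python
def offsets_matrix(width):
    """Build the general offsets matrix for a width x width grid.
    Entry [i][j] is the pixel offset after i row-steps and j column-steps
    (columns counted from the right, matching the 13x13 example)."""
    return [[width * j + i for j in range(width)] for i in range(width)]

m = offsets_matrix(13)
# One row-step adds 1; one column-step adds 13; the far corner holds 168.
print(m[0][0], m[1][0], m[0][1], m[12][12])  # 0 1 13 168
```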

Page 93: Final Presentation


Our algorithm (Optimizations)Look-up tables Offsets Table

Then, we add each permutation's zero-iteration coordinates (x, y) to the matrix to form that permutation's offsets table

Table’s construction

[zero-iteration coords (x, y) + the 13×13 offsets matrix shown above]

Our algorithm

Page 94: Final Presentation


Our algorithm (Optimizations): Look-up tables, Offsets Table, Table's construction

[zero-iteration coords (x, y) + the 13×13 offsets matrix, replicated ×313, once per permutation]

Our algorithm

Optimal table size: (13×13×8) bits × 313 = 52.9 KB
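As with the permutation table, the stated size follows directly; a quick check, assuming one 8-bit offset per cell of each permutation's 13×13 table:

```python
# 313 permutations, each with a 13x13 table of 8-bit offsets.
bytes_per_table = 13 * 13 * 8 // 8   # 169 bytes per permutation
total_bytes = bytes_per_table * 313
print(total_bytes)                   # 52897 bytes, i.e. about 52.9 KB
```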

Page 95: Final Presentation


REF is a Hermitian matrix.


Our algorithm (Optimizations)

Concept: using matrix characteristics to reduce calculations

Using Matrix Characteristics

Important observation:

R = R†, i.e. R(i, j) = conj(R(j, i))

Our algorithm
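The Hermitian property means an implementation only needs to compute one triangle of REF and mirror the rest. A minimal sketch (a hypothetical snapshot-based estimator for illustration, not the project's Plurality code):

```python
import numpy as np

def hermitian_covariance(snapshots):
    """Estimate R = E[x x^H] from an (n, k) array of k snapshot vectors,
    computing only the upper triangle and mirroring via R[j,i] = conj(R[i,j])."""
    n, k = snapshots.shape
    R = np.zeros((n, n), dtype=complex)
    for i in range(n):
        for j in range(i, n):                       # upper triangle only
            R[i, j] = snapshots[i] @ np.conj(snapshots[j]) / k
            R[j, i] = np.conj(R[i, j])              # Hermitian mirror
    return R

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16)) + 1j * rng.standard_normal((4, 16))
R = hermitian_covariance(x)
print(np.allclose(R, R.conj().T))  # True
```

Mirroring the lower triangle saves roughly half the multiply-accumulate work, which is the same observation the slide exploits.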