Parallel algorithm for computing edt with new architecture

International Journal of Electronics and Communication Engineering & Technology (IJECET),

ISSN 0976 – 6464(Print), ISSN 0976 – 6472(Online) Volume 1, Number 1, Sep - Oct (2010), © IAEME


PARALLEL ALGORITHM FOR COMPUTING EDT WITH

NEW ARCHITECTURE

Er. Kirti Rawal

Lecturer, RIEIT,Railmajra

Punjab, E-Mail: [email protected]

Er. Sonia

Lecturer, BBSBEC

Fatehgarh Sahib

Er. Rajeev Kumar Patial

Sr.Lecturer, LPU

Phagwara

Mahesh Mudavath

Lecturer, RIEIT, Railmajra

Punjab, E-Mail: [email protected]

ABSTRACT

A distance transformation converts a binary image consisting of foreground and background pixels into one in which each pixel has a value equal to its distance to the nearest background pixel (alternatively, distances could be measured to the nearest foreground pixel). This paper provides an area-efficient hardware solution to the computation of the Euclidean distance transform (EDT) on a binary image. A parallel algorithm for computing the EDT of an n×n image is presented. A pipelined 2D array architecture is designed for hardware implementation. The architecture has a regular structure with locally connected identical processing elements. Further, pipelining reduces hardware resources. Such an array architecture is easily scalable to handle images of different sizes.

Keywords – Distance Transforms, Euclidean Distance Transform, Parallel Algorithm,

Pipelined Architecture.

International Journal of Electronics and Communication

Engineering & Technology (IJECET)

ISSN 0976 – 6464(Print), ISSN 0976 – 6472(Online)

Volume 1, Number 1, Sep - Oct (2010), pp. 01-17

© IAEME, http://www.iaeme.com/ijecet.html





1. INTRODUCTION

A distance transform, also known as a distance map or distance field, is a derived representation of a digital image in which each pixel stores its distance to the nearest pixel of a reference set. Applications of the distance transform are numerous. These include shape analysis of objects [2], machine vision [3] and image matching [4].

Figure 1 distance transformation

Three distance metrics are commonly used [1]:

• Manhattan distance

• Chessboard distance

• Euclidean distance

A. Manhattan Distance Transform

The Manhattan distance between two points is measured along axes at right angles. In a plane with p1 at (x1, y1) and p2 at (x2, y2), it is |x1 - x2| + |y1 - y2|. Manhattan distance is often used in integrated circuits, where wires run only parallel to the x or y axis. It is also known as rectilinear distance, Minkowski's L1 distance, the taxicab metric, or city block distance. For a point x = (x1, x2, ..., xn) and a point y = (y1, y2, ..., yn), the distance is

$$d = \sum_{i=1}^{n} |x_i - y_i|$$

Figure 2 Manhattan distance transformation




B. Chessboard Distance Transform

The chessboard distance between two squares on a chessboard is the minimum number of moves a king requires to move between them. Figure 3 shows the chessboard distance of each square from the square f6. Chessboard distance is also called maximum value distance or Chebyshev distance. Mathematically it can be written as

$$D_{\text{Chebyshev}}(p, q) = \max_i |p_i - q_i|$$

The city block distance, the chessboard distance, or a combination of the two is used for the benefit of their simplified architectures and integer computations [5-7].

Figure 3 Chessboard Distance Transformations
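The metrics can be compared side by side on a pair of points. The following is a minimal Python sketch for illustration only; the function names are hypothetical and not part of the paper, and the Euclidean distance of subsection C is included for comparison.

```python
import math


def manhattan(p, q):
    """City block (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def chessboard(p, q):
    """Chebyshev distance: largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


p, q = (0, 0), (3, 4)
print(manhattan(p, q), chessboard(p, q), euclidean(p, q))  # 7 4 5.0
```

Only the Euclidean metric needs a square root, which is why the two integer-only metrics lead to simpler hardware.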

C. Euclidean Distance Transform

The Euclidean Distance Transformation (EDT) converts a digital binary image consisting of object (foreground) and non-object (background) pixels into another image in which each pixel has a value equal to its minimum Euclidean distance from the non-object pixels. Considerable research has been done on the development of algorithms for computing the EDT. Several sequential [8-11] and parallel [12-16] algorithms are available. Some work on parallel algorithms targeted at general-purpose processors is also known [17, 18]. Of these three distance transforms, the EDT finds widespread use.
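As a point of reference, the definition of the EDT can be written directly as an exhaustive search. The sketch below is not one of the cited algorithms; it is a hypothetical brute-force baseline (the name `edt_brute_force` is an assumption) that is useful mainly for checking faster methods on small images.

```python
import math


def edt_brute_force(image):
    """Exhaustive EDT. image is a list of rows; 0 = background, 1 = foreground."""
    background = [(r, c)
                  for r, row in enumerate(image)
                  for c, value in enumerate(row) if value == 0]
    if not background:
        raise ValueError("image has no background pixel")
    # For every pixel, take the minimum distance over all background pixels.
    return [[min(math.hypot(r - br, c - bc) for br, bc in background)
             for c in range(len(row))]
            for r, row in enumerate(image)]


image = [[0, 1, 1],
         [1, 1, 1],
         [1, 1, 1]]
for row in edt_brute_force(image):
    print([round(d, 2) for d in row])
```

The cost grows with the product of the pixel count and the background pixel count, which is the motivation for the sequential and parallel algorithms cited above.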

2. PARALLEL ALGORITHM

In this paper, a parallel algorithm is presented. The salient feature of this algorithm is that the computation of the EDT involves only integer arithmetic operations within a small neighbourhood of each pixel, and hence it is suitable for mapping onto a high-speed array architecture. The Euclidean distance is given by [19]

$$d_p = \sqrt{\Delta r^2 + \Delta c^2}$$

The (∆r, ∆c) of background pixels are initialized to (0, 0), and those of foreground pixels are computed iteratively, starting from the pixels near the background and moving towards the pixels far away. That is, at iteration k the pixels computed are those whose $d_p$ lies within (k - 0.5, k + 0.5]; for k = 1, for example, the value lies between 0.5 and 1.5. Hence $d_p$ is not an integer in general, because the interval from 0.5 to 1.5 contains non-integer values such as 0.7 and 0.9. So we take $d_p^2$, which is an integer and whose value lies within $(k^2 - k,\ k^2 + k]$: squaring the bounds gives $k^2 - k + 0.25 < d_p^2 \le k^2 + k + 0.25$, and the fractions drop out because $d_p^2$ is an integer. However, $d_p^2$ is quite large in magnitude and requires a large storage space in hardware.

For k = 1, the value lies within $k^2 - k$ to $k^2 + k$, i.e. $(1^2 - 1)$ to $(1^2 + 1)$, or 0 to 2. Thus the range of $d_p^2$ is from 0 to 2. For k = 2, the value lies within $(2^2 - 2)$ to $(2^2 + 2)$, i.e. 2 to 6. A new integer quantity ∂(p), which is much smaller than $d_p^2$, is therefore defined as

$$\partial(p) = (k^2 + k) - d_p^2$$

Here $d_p^2$ is derived first and then substituted to find ∂(p). $d_p^2$ can be derived using the already computed (∆r, ∆c) of the eight neighbours $p_i$, i = 1–8, surrounding p. It is given by

$$d_p^2 = \min_i \left[ \Delta r_i^2 + \Delta c_i^2 \right]$$

where $\Delta r_i = \Delta r(p_i)$ if $p_i$ is in the same row as p; otherwise, $\Delta r_i = \Delta r(p_i) + 1$. The increment by 1 is due to p being displaced from $p_i$ by one row. Similarly, $\Delta c_i$ is given in terms of $\Delta c(p_i)$.
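The neighbour rule can be illustrated with a short Python sketch. The data layout (a list of tuples with `same_row` and `same_col` flags) and the function names are assumptions made for this example; only the arithmetic follows the text.

```python
def squared_distance_from_neighbours(neighbours):
    """Return (d_sq, dr, dc) with the smallest dr_i^2 + dc_i^2.

    neighbours holds one tuple (dr_pi, dc_pi, same_row, same_col) for each
    neighbour p_i whose (dr, dc) is already known (hypothetical layout).
    """
    best = None
    for dr_pi, dc_pi, same_row, same_col in neighbours:
        dr_i = dr_pi if same_row else dr_pi + 1  # displaced by one row
        dc_i = dc_pi if same_col else dc_pi + 1  # displaced by one column
        d_sq = dr_i * dr_i + dc_i * dc_i
        if best is None or d_sq < best[0]:
            best = (d_sq, dr_i, dc_i)
    return best


def delta(d_sq, k):
    """The small integer stored instead of d_sq at iteration k."""
    return k * k + k - d_sq


# A pixel two rows and one column away from a background pixel, seen
# through three of its neighbours.
neighbours = [
    (1, 1, False, True),   # neighbour above, same column
    (2, 0, True, False),   # neighbour to the left, same row
    (3, 0, False, False),  # diagonal neighbour that lies further away
]
d_sq, dr, dc = squared_distance_from_neighbours(neighbours)
print(d_sq, (dr, dc), delta(d_sq, 2))  # 5 (2, 1) 1
```

The first two neighbours both propose (2, 1), and the third proposes the larger (4, 1), so the minimum selects $d_p^2 = 5$, which lies in the k = 2 interval (2, 6].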

3. STEPS OF PARALLEL ALGORITHM

There are three steps of parallel algorithm which are used to compute Euclidean

distance transform.

Step 1: Compute ∆ri, ∆ci and ∂i , i = 1 to 8.

Figure 4 Euclidean distance from reference value to the same column (above)




Euclidean distance: $d_p = \sqrt{\Delta r^2 + \Delta c^2}$

$d_p^2 = \Delta r^2 + \Delta c^2 = 1^2 + 0^2 = 1$

$\partial(p) = k^2 + k - d_p^2 = 1^2 + 1 - 1 = 1$

In this way all values of Euclidean distance for k = 1 are computed.

Figure 5 Euclidean distance values for k=1

Figure 6 Euclidean distance from Reference value to the different row (Above) and

different column (L.H.S)


Euclidean distance: $d_p = \sqrt{\Delta r^2 + \Delta c^2}$

$d_p^2 = \Delta r^2 + \Delta c^2 = 2^2 + 1^2 = 5$

$\partial(p) = k^2 + k - d_p^2 = 2^2 + 2 - 5 = 1$





Figure 7 Euclidean distance values for k=2

Figure 8 Euclidean distance values from reference value to corner side (above)


Euclidean distance: $d_p = \sqrt{\Delta r^2 + \Delta c^2}$

$d_p^2 = \Delta r^2 + \Delta c^2 = 2^2 + 2^2 = 8$

$\partial(p) = k^2 + k - d_p^2 = 3^2 + 3 - 8 = 4$
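The three worked cases can be re-derived mechanically. The sketch below is a hypothetical helper that finds the iteration k at which a displacement (∆r, ∆c) is resolved, taken here as the smallest k with $d_p^2 \le k^2 + k$, and the resulting ∂(p). It also prints ∂ for a 5×5 window around a single reference pixel; that window is an illustration and is not claimed to reproduce the figures.

```python
def ring_and_delta(dr, dc):
    """Return (k, delta) for a pixel displaced (dr, dc) from the reference pixel."""
    d_sq = dr * dr + dc * dc
    k = 0
    while k * k + k < d_sq:  # smallest k with d_sq <= k^2 + k
        k += 1
    return k, k * k + k - d_sq


# Same column, different row and column, and corner case.
for dr, dc in [(1, 0), (2, 1), (2, 2)]:
    k, delta = ring_and_delta(dr, dc)
    print(f"(dr, dc) = ({dr}, {dc}): k = {k}, delta = {delta}")

# delta of every pixel in a 5x5 window at the iteration where it is resolved.
for r in range(-2, 3):
    print(" ".join(str(ring_and_delta(abs(r), abs(c))[1]) for c in range(-2, 3)))
```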





Figure 9 Euclidean distance values for k = 3

Step 2: Find the maximum value.

Consider the ∂i values corresponding to the neighbours pi whose done(pi) = 1. Find the maximum, say ∂m.

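In Step 2, $d^2(p)$ can be rewritten as $\min_i\left[d^2(p_i) + \Delta R_i + \Delta C_i\right]$, where $\Delta R_i = 0$ if $p_i$ is in the same row as p; otherwise, $\Delta R_i = 2\Delta r(p_i) + 1$. ∂(p) is now derived as follows.

$$\begin{aligned}
\partial(p) &= k^2 + k - d^2(p) \\
&= \max_i\left[k^2 + k - d^2(p_i) - \Delta R_i - \Delta C_i\right] \\
&= \max_i\left[\partial(p_i) - \Delta R_i - \Delta C_i\right] = \max_i\left[\partial_i\right]
\end{aligned}$$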

Consider iteration k = 2. The ∂ values of those twelve pixels whose (∆r, ∆c) were computed at k = 1 are incremented by 2k (i.e. 4), because moving from iteration k - 1 to iteration k raises $k^2 + k$ by exactly 2k.
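The incremental form can be checked against the direct definition. The following Python sketch uses a hypothetical helper name and exhaustively compares ∂(pi) - ∆Ri - ∆Ci with $k^2 + k - (\Delta r_i^2 + \Delta c_i^2)$ for small values, assuming ∂(pi) has already received its 2k increment for the current iteration.

```python
def candidate_delta(delta_pi, dr_pi, dc_pi, row_differs, col_differs):
    """delta_i = delta(p_i) - dR_i - dC_i, using only adds and subtracts."""
    big_r = 2 * dr_pi + 1 if row_differs else 0
    big_c = 2 * dc_pi + 1 if col_differs else 0
    return delta_pi - big_r - big_c


for k in range(1, 6):
    for dr in range(6):
        for dc in range(6):
            # delta(p_i) as stored at iteration k - 1, then incremented by 2k.
            delta_prev = (k - 1) ** 2 + (k - 1) - (dr * dr + dc * dc)
            delta_now = delta_prev + 2 * k
            for row_step in (0, 1):
                for col_step in (0, 1):
                    direct = k * k + k - ((dr + row_step) ** 2 + (dc + col_step) ** 2)
                    incremental = candidate_delta(delta_now, dr, dc,
                                                  bool(row_step), bool(col_step))
                    assert incremental == direct
print("incremental and direct forms agree")
```

The agreement follows from $(\Delta r + 1)^2 = \Delta r^2 + 2\Delta r + 1$, so no squaring is needed once ∂(pi) is known.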

Figure 10 Maximum value of Euclidean distance from reference value to different row

and different column (L.H.S)




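$$\partial_i = \partial(p_i) - \Delta R_i - \Delta C_i = 5 - 3 - 3 = -1$$

where $\Delta R_i = 2\Delta r(p_i) + 1 = 2 \times 1 + 1 = 3$ and $\Delta C_i = 2\Delta c(p_i) + 1 = 2 \times 1 + 1 = 3$.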

Step 3: Find the overlapped delta values.

If done(p) = 0 and ∂m ≥ 0, then ∆r(p) = ∆rm, ∆c(p) = ∆cm, ∂(p) = ∂m and done(p) = 1. If done(p) = 1, then ∂(p) = ∂(p) + 2k.

To keep track of pixels whose (∆r, ∆c) have been computed, a flag done is

assigned to each pixel, whose value is set to 1 when the transform values of pixels are

computed at any iteration.
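The three steps can be put together in a sequential Python simulation of the parallel iteration. This is a sketch under stated assumptions and not the hardware design: the function name is hypothetical, the 2k increment of already-done pixels is applied at the start of each iteration (so that ∂ is expressed relative to the current k before Step 1), neighbours outside the image are ignored, and all updates of one iteration are applied together to mimic synchronous operation. It has been checked here only on a small image with a single background pixel.

```python
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                     (0, -1),           (0, 1),
                     (1, -1),  (1, 0),  (1, 1)]


def edt_parallel(image):
    """Simulate Steps 1-3 on a binary image (0 = background, 1 = foreground).

    Returns two grids with the unsigned (dr, dc) of every pixel.
    """
    rows, cols = len(image), len(image[0])
    if all(value != 0 for row in image for value in row):
        raise ValueError("image needs at least one background pixel")

    dr = [[0] * cols for _ in range(rows)]
    dc = [[0] * cols for _ in range(rows)]
    delta = [[0] * cols for _ in range(rows)]
    done = [[value == 0 for value in row] for row in image]

    k = 0
    while not all(all(row) for row in done):
        k += 1
        # Done pixels: delta(p) = delta(p) + 2k (assumed to happen first).
        for r in range(rows):
            for c in range(cols):
                if done[r][c]:
                    delta[r][c] += 2 * k

        updates = []  # collected first so every pixel sees the same state
        for r in range(rows):
            for c in range(cols):
                if done[r][c]:
                    continue
                best = None
                for orow, ocol in NEIGHBOUR_OFFSETS:
                    nr, nc = r + orow, c + ocol
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if not done[nr][nc]:
                        continue
                    # Step 1: candidate delta_i and (dr_i, dc_i) from p_i.
                    big_r = 2 * dr[nr][nc] + 1 if orow else 0
                    big_c = 2 * dc[nr][nc] + 1 if ocol else 0
                    cand = (delta[nr][nc] - big_r - big_c,
                            dr[nr][nc] + (1 if orow else 0),
                            dc[nr][nc] + (1 if ocol else 0))
                    # Step 2: keep the maximum delta_i.
                    if best is None or cand[0] > best[0]:
                        best = cand
                # Step 3: accept only when delta_m >= 0.
                if best is not None and best[0] >= 0:
                    updates.append((r, c, best))

        for r, c, (delta_m, dr_m, dc_m) in updates:
            dr[r][c], dc[r][c] = dr_m, dc_m
            delta[r][c], done[r][c] = delta_m, True
    return dr, dc


image = [[1] * 7 for _ in range(7)]
image[3][3] = 0  # single background pixel in the centre
dr, dc = edt_parallel(image)
print(dr[0][0], dc[0][0])  # 3 3
```

The Euclidean distance of a pixel is then the square root of `dr*dr + dc*dc`, taken once at the end rather than inside the iteration.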

Overlapped values for k = 1 are computed as follows. If the previous value of ∂(p) is 0 (for k = 0), then the value for k = 1 is computed using the formula

∂(p) = ∂(p) + 2k = 0 + 2×1 = 2

Figure 11 Overlapped value in k = 1

Similarly, overlapped values for k = 2 are computed as follows. With $\max_i(\partial_i) = \max_i\left[\partial(p_i) - \Delta R_i - \Delta C_i\right]$ as in Step 2, if the previous value of ∂(p) is 2 (for k = 1), then the value for k = 2 is computed using the formula

∂(p) = ∂(p) + 2k = 2 + 2×2 = 6




Figure 12 Overlapped value in k = 2

Similarly, overlapped values for k = 3 are computed using the formula

∂(p) = ∂(p) + 2k = 6 + 2×3 = 12

In this way all the overlapped values of delta are computed.
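The sequence 0, 2, 6, 12 can be generated with a few lines of Python. The function name is hypothetical; the sketch only repeats the update ∂(p) = ∂(p) + 2k and confirms that, for a pixel that starts at ∂(p) = 0, the result equals $k^2 + k$.

```python
def overlapped_deltas(initial, last_k):
    """Apply delta = delta + 2k for k = 1..last_k and return every value."""
    values = [initial]
    for k in range(1, last_k + 1):
        values.append(values[-1] + 2 * k)
    return values


print(overlapped_deltas(0, 3))  # [0, 2, 6, 12]
```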

4. PIPELINED ARCHITECTURE FOR COMPUTING EDT

∂i, ∆Ri and ∆Ci are computed first, based on the position of the neighbours pi. ∆Ri (or ∆Ci) takes either 0 or 2∆r(pi) + 1 (or 2∆c(pi) + 1). Therefore, the computation of ∆ri and ∆ci requires incrementers, while the computation of ∂i requires adders and subtractors [32]. The pipelined architecture for computing the EDT is shown in Figure 13.

Squaring Circuit for r and c:

A squaring circuit is used to square the inputs r and c to obtain the outputs rsq and csq. The maximum value of r and c is 2, so only two bits are required to represent 0, 1 and 2. The maximum value of rsq and csq is 4, and four bits are used to represent 0, 1 and 4.

Squaring Circuit for k:

A squaring circuit is used to square the input k to obtain the output ksq. The maximum value of k is 3, so only two bits are required to represent 0, 1, 2 and 3. The maximum value of ksq is 9, so four bits are used to represent 0, 1, 4 and 9.




Full Adder for k and ksq:

Here a 4-bit full adder is used to add the values of k and ksq to obtain the output tot. The maximum value of k is 3 and that of ksq is 9, so four bits are required to represent ksq. The maximum value of tot is 12, so four bits are used to represent tot. In this case there is no carry input (cin1) and no carry output (cout1), so both remain zero.

Subtractor for dp and total:

Here a subtractor is used to subtract dp from tot to obtain the output del_p. The maximum value of dp is 8, which corresponds to the squared distance $2^2 + 2^2$, and that of tot is 12, so four bits are required to represent dp and tot. del_p (delta) appears at the output on wire del_p_1.
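The arithmetic of these blocks can be modelled behaviourally in Python. This is a sketch and not the Verilog source: the signal names rsq, csq, ksq, tot, dp and del_p follow the text, while the function name, the assumption that dp is formed as rsq + csq, and the use of a signed Python integer for del_p are choices made for this example (the text does not describe how a negative difference is represented).

```python
def stage1(r, c, k):
    """Behavioural model of the squaring circuits, adder and subtractor."""
    if not (0 <= r <= 2 and 0 <= c <= 2 and 0 <= k <= 3):
        raise ValueError("r and c must be 0..2 and k must be 0..3")
    rsq = r * r              # squaring circuit for r: 0, 1 or 4
    csq = c * c              # squaring circuit for c: 0, 1 or 4
    ksq = k * k              # squaring circuit for k: 0, 1, 4 or 9
    tot = (k + ksq) & 0xF    # 4-bit adder; at most 12, so no carry out
    dp = rsq + csq           # assumed: squared distance, at most 8
    del_p = tot - dp         # subtractor output (delta)
    return {"rsq": rsq, "csq": csq, "ksq": ksq,
            "tot": tot, "dp": dp, "del_p": del_p}


for r, c, k in [(1, 0, 1), (2, 1, 2), (2, 2, 3)]:
    print((r, c, k), stage1(r, c, k)["del_p"])
```

For the three inputs above the model returns 1, 1 and 4, matching the worked examples of Step 1.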

Pipelined Register1:

Pipelined register1 is used to store all the r, c and delta values calculated in stage 1. The inputs of pipelined register1 are r1_0 to r1_29, c1_0 to c1_29 and del1_p0 to del1_p29, and its outputs are r0 to r29, c0 to c29 and del_p0 to del_p29. The main advantage of pipelined register1 is that as soon as a value is stored at the input, it appears at the output of the register; it does not wait for all the values in the register to be filled. The r and c values are those applied at the input of stage 1, and the delta values come from the output of the subtractor. The r and c values are wired to the input of pipelined register1, so that the EDT is computed only for those r and c values and not for any other value.

Counter:

A five-bit counter is used to pass the output of the subtractor, i.e. the delta values, to the input of pipelined register1. The inputs of the counter are clk_1 and clear, of 1 bit each. The output of the counter is q_1 (5 bits). q_1 is assigned to the address of pipelined register1, so that when the counter output is incremented the address is also incremented, filling in the delta values (del_p_1) one by one. When the counter output reaches the 29th value, the pipelined register is completely filled. On the next clk_1 the counter output and the address return to the zero position.
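The addressing scheme can be sketched as a small Python model. The class and method names are hypothetical, and the write-then-increment ordering within one clock is an assumption; the text only states that the counter output addresses the register, that 30 entries (0 to 29) are filled one by one, and that the address then returns to zero.

```python
class PipelinedRegister1:
    """Behavioural model of the 30-entry register addressed by the counter."""

    DEPTH = 30  # entries 0..29, addressed by the 5-bit counter output q_1

    def __init__(self):
        self.slots = [None] * self.DEPTH
        self.q_1 = 0

    def clear(self):
        """Model of the clear input: reset the counter to zero."""
        self.q_1 = 0

    def clock(self, del_p_1):
        """One clk_1 edge: store del_p_1 at address q_1, then advance q_1."""
        address = self.q_1
        self.slots[address] = del_p_1
        # Wrap to zero after the last entry instead of counting to 31.
        self.q_1 = 0 if address == self.DEPTH - 1 else address + 1
        return address


reg = PipelinedRegister1()
addresses = [reg.clock(value) for value in range(30)]
print(addresses[0], addresses[-1], reg.q_1)  # 0 29 0
```

Each stored value is readable from `slots` immediately after its own clock edge, which mirrors the statement that outputs do not wait for the whole register to fill.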




Figure 13 Pipelined Architecture for computing EDT




In this way all values of delta, i.e. the EDT, are computed with this new architecture. The pipelined architecture consists of various blocks such as squaring circuits, a full adder, a full subtractor, pipelined registers, comparators and multiplexers. The output of each of these blocks has to be computed individually.

5. SIMULATION RESULTS

The results are obtained using Verilog. Verilog Hardware Description Language (HDL) is used for computing the EDT; it can describe the hardware requirements of the architecture not only at the gate level and register level but also at the algorithmic level [20]. Verilog HDL is one of the two most common Hardware Description Languages (HDL) used by integrated circuit (IC) designers [21]. Xilinx tools are used for writing the code, and ModelSim is used for simulation and for viewing the output waveforms.

Figures 14, 15 and 16 explain how the r_t, c_t, k_t, q_5_t, k_new_t, c_in1_t, cin1_t, clk_1_t, clear_1_t and flag_t inputs are applied to produce the output output_data. Here we consider 30 cases, in which different values of r_t and c_t are applied in order to produce different values of delta. The carry inputs are always zero, and clk_t and flag_t are always 1. The value of k_t varies from 1 to 3. Here we take k_t = 01, q_5_t and k_new_t = 11. Given all these inputs, we get the complete delta values for the 30 inputs, i.e. output_data.

Figure 14 Waveform of Calculation of overlapped delta operation









6. CONCLUSIONS

A distance transformation converts a binary image consisting of foreground and background pixels into one in which each pixel has a value equal to its distance to the nearest background pixel (alternatively, distances could be measured to the nearest foreground pixel). The transforms generally used are the city block distance transform, the chessboard distance transform and the Euclidean distance transform. Of these, the EDT finds widespread use in view of the natural metric employed.

The pipelined architecture presented in this paper proved to be applicable to the computation of the Euclidean distance transform. It comprises a two-dimensional array of locally interconnected processing elements, where each element is a sequential logic block and all elements operate synchronously. The architecture is designed so that it follows the steps of the parallel algorithm. The algorithm involves only integer arithmetic operations. The architecture is fully digital and is easily scalable to an image of any n×n size.

7. FUTURE SCOPE

The proposed methodology can be used as an important tool in image analysis. Given the importance of image processing, the architecture should be robust, accurate and fast in order to handle images of different sizes. So there is always a need for improvements.

1. The ideas presented for the case of 4 pixels per processing element readily extend to

the case of more than 4 pixels per processing element (such as 9,16 and so on).

2. The given architecture can be further modified to handle 3D images.

3. The work can be carried out to make it more robust.

4. The given architecture can be implemented on an FPGA device.

5. The given source code can be optimized to get synthesizable results.




8. REFERENCES

[1] C. Tony Huang and O. Robert Mitchell (1991), “Rapid Euclidean distance

transform using grayscale morphology decomposition”, IEEE Computer Society

Conference on Pattern Analysis and Machine Intelligence ,1991,vol.14, pp. 695-

697.

[2] P. Danielsson (1978), “A new shape factor”, Computer Graphics and Image

Processing 1978, vol.2, pp. 292–299.

[3] D.Paglieroni (1992), “Distance transforms: properties and machine vision

applications”, CVGIP: Graphical Models and Image Processing, vol.54, 1992, pp.

56–74.

[4] D.P. Huttenlocher, G.A. Klanderman, W.J. Rucklidge (1993), “Comparing images

using the Hausdorff distance”, IEEE Transactions on Pattern Analysis and Machine

Intelligence, vol. 15 ,1993, pp. 850–863.

[5] P. A. Maragos and R. W. Schafer, “Morphological skeleton representation and

coding of binary images,” IEEE Trans. Acoustic Speech, Signal Processing, vol.

ASSP-34, no.5, 1986, pp. 1228-1244.

[6] S. R. Sternberg, “Grayscale morphology,” Computer Vision Graphics and Image

Processing, vol. 35, 1986, pp. 333-355.

[7] J. Toriwaki and S. Yokoi. “Distance transformations and skeletons of digitized

pictures with applications,” in Progress in Pattern Recognition, 1981, pp. 187-264.

[8] H. Breu, J. Gil, D. Kirkpatrick, M. Werman (1995), “Linear time Euclidean

distance transform algorithms”, IEEE Transactions on Pattern Analysis and

Machine Intelligence, vol.17, 1995, pp. 529–533.

[9] S. Pavel, S.G. Akl (1995), “Efficient algorithms for the Euclidean distance

transform”, Parallel Processing Letters, vol.5, 1995, pp. 205–212.

[10] Hinnik Eggers (1998), “Two fast Euclidean distance transformations in Z² based on

sufficient propagation”, Computer Vision and Image Understanding vol.69, 1998,

pp. 106–116.

[11] C.R. Maurer Jr., R. Qi, V. Raghavan (2003), “A linear time algorithm for

computing exact Euclidean distance transforms of binary images in arbitrary

dimensions”, IEEE Transactions on Pattern Analysis and Machine Intelligence,

vol.25, 2003, pp. 265–270.

[12] Hugo Embrechts, Dirk Roose (1996), “Parallel Euclidean distance transformation

algorithm”, Computer Vision and Image Understanding vol.63, 1996, pp. 15–26.

[13] Hinnik Eggers (1996), “Parallel Euclidean distance transformations in $Z_g^n$”,

Pattern Recognition Letters, vol. 17, 1996, pp. 751–757.

[14] T. Hirata (1996), “A unified linear-time algorithm for computing distance maps”,

Information Processing Letters, vol.58, 1996, pp. 129–133.

[15] N. Sudha, S. Nandi, K. Sridharan (1998), “Efficient computation of Euclidean

distance transform for applications in image processing”, Proceedings of IEEE

TENCON vol.2,1998, pp. 49–52.

[16] Yu-Hua Lee, Shi-Jinn Horng, Jennifer Seitzer (2003), “Parallel computation of the

Euclidean distance transform on a three-dimensional image array”, IEEE

Transactions on Parallel and Distributed Systems, vol. 14,2003, pp. 203–212.

[17] L. Chen, H.Y.H. Chuang (1995), “An efficient algorithm for complete Euclidean

distance transform on mesh-connected SIMD”, Parallel Computing, vol. 21, 1995,

pp. 841–852.

[18] Y. Pan, M. Hamdi, K. Li (2000), “Euclidean distance transform for binary images

on reconfigurable mesh-connected computers”, IEEE Transactions on Systems,

Man Cybernetics, vol. 30 ,2000, pp. 240–244.

[19] N. Sudha (2005), “A pipelined array architecture for Euclidean distance

transformation and its FPGA implementation”, Microprocessors and Microsystems,

vol. 29, 2005, pp. 405–410.

[20] J. Bhaskar (1998), “Verilog HDL Synthesis a Practical Primer”, second edition,

1998, pp.1-230.

[21] Peter M. Nyasulu (2001), “Introduction to Verilog”, third edition, 2001, pp.1-30.




BIO DATA OF AUTHORS

Ms. Kirti Rawal is working as a Lecturer in the Electronics and Communications Engineering Department, R.I.E.I.T, Railmajra (Punjab). She earned her M.Tech (ECE) degree from BBSBEC Fatehgarh Sahib (Punjab) in 2010 and her B.Tech (ECE) degree from IITT Pojewal (Nawanshahr). She has published 2 research papers in international and national conferences. She is a Life Member of the “Indian Society for Technical Education (ISTE)”.

Mr. Mahesh Mudavath is working as a Lecturer in the Electronics and Communications Engineering Department, R.I.E.I.T, Railmajra (Punjab). He earned his M.Tech (VLSI Design) degree from C-DAC, Mohali (Punjab) in 2009 and his B.Tech (ECE) degree from JNTU, Hyderabad (Andhra Pradesh). He has published 6 research papers in international journals and international conferences. He is a Life Member of the “Indian Society for Technical Education (ISTE)”.
