UNIVERSITY OF OSLO
Department of Informatics
Simulation and Visualization of Ultrasound Fields
Kapila Epasinghe
Cand Scient Thesis
1st August 1997
© Kapila Epasinghe
1st August 1997
Acknowledgment
This thesis is written as the practical requirement for the Candidatus Scientiarum (Master of Science) degree at The Division of Signal Processing at the Department of Information Technology, University of Oslo, Norway. This work was started in June 1995 and finished in July 1997. Part of this work was presented as a research article at the Nordic Symposium on Physical Acoustics in February 1996.
First, I wish to express my sincere gratitude to my main supervisor, Professor
Sverre Holm, for his patience and understanding throughout my labour, and
that little extra support when the going got tough. A big thank you goes to
my bi-supervisor, System Engineer Dag F. Langmyhr, for his collaboration.
A special note of thanks goes to Roger O. Nordby of the Scientific Visualization Laboratory in USIT (University Center for Information Services). Thank you, Roger, for coming to IFI the Saturday after your birthday party.
I also wish to thank Sofia, Ranjith and Shiva for helping me with small
details of my thesis. Together with Nihal, Sam, Dilu, Bahee, Hisham and
Chamath, they were always there for me with a little bit of encouragement
when I needed it.
Finally, I wish to thank my parents and my two brothers, who were always understanding and encouraging although they were thousands of miles away.
Oslo, July 1997
Kapila Epasinghe
University of Oslo,
Norway.
Contents
1 Introduction 1
1.1 Ultrasound in medical diagnosis . . . . . . . . . . . . . . . . . 1
1.2 Parallel programming in mathematics . . . . . . . . . . . . . . 2
1.3 Objective of this thesis . . . . . . . . . . . . . . . . . . . . . . 2
2 Acoustic Theory and Ultrasound 5
2.1 Basic acoustic principles . . . . . . . . . . . . . . . . . . . . . 5
2.2 Some properties of acoustic waves . . . . . . . . . . . . . . . . 6
2.3 The wave equation . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3.1 Solution in Cartesian coordinates . . . . . . . . . . 7
2.3.2 Solution in spherical coordinates . . . . . . . . . . . . . 7
2.4 Acoustic impedance . . . . . . . . . . . . . . . . . . . . . . . . 8
2.5 Reflection and refraction . . . . . . . . . . . . . . . . . . . . . 9
2.6 Velocity potential . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.7 Rayleigh integral . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.8 Calculating the field . . . . . . . . . . . . . . . . . . . . . . 11
2.8.1 Huygens' principle . . . . . . . . . . . . . . . . . . . . 11
2.8.2 Diffraction impulse response method . . . . . . . . . . 11
2.8.3 Direct calculation of the Rayleigh integral . . . . . . . 12
3 Transducer Design and Theory 13
3.1 Transducers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1.1 Piezo-electric effect . . . . . . . . . . . . . . . . . . . . 13
3.1.2 Matching . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2 Transducer apertures . . . . . . . . . . . . . . . . . . . . . . . 15
3.2.1 Continuous apertures . . . . . . . . . . . . . . . . . . . 15
3.2.2 Linear apertures . . . . . . . . . . . . . . . . . . . . . 16
3.2.3 Rectangular apertures . . . . . . . . . . . . . . . . . . 17
3.2.4 Elliptical and circular apertures . . . . . . . . . . . . . 18
3.3 Beam characteristics . . . . . . . . . . . . . . . . . . . . . . . 20
3.4 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.5 Transducer arrays in medical diagnosis . . . . . . . . . . . . . 22
3.6 Electronic focusing . . . . . . . . . . . . . . . . . . . . . . . . 23
3.7 The ultrasound imaging system . . . . . . . . . . . . . . . . . 24
4 The Octagonal Transducer 27
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.2 Rectangular area A . . . . . . . . . . . . . . . . . . . . . . . . 28
4.3 Rectangular area B . . . . . . . . . . . . . . . . . . . . . . . . 28
4.4 Triangular area C . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.4.1 Section C1 . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.4.2 Section C2 . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.4.3 Section C3 . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.4.4 Section C4 . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.4.5 Sum of the triangular sections . . . . . . . . . . . . . . 34
4.5 Aperture smoothing function of the transducer . . . . . . . . . 36
5 Simulation of Acoustic Fields 37
5.1 UltraSim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.2 Simulation of the acoustic field . . . . . . . . . . . . . . . 37
5.3 Visualization of the acoustic field . . . . . . . . . . . . . . 39
6 Scientific Parallel Processing 41
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
6.2 Parallel machine models . . . . . . . . . . . . . . . . . . . . . 42
6.2.1 The multicomputer . . . . . . . . . . . . . . . . . . . . 43
6.2.2 The multiprocessor . . . . . . . . . . . . . . . . . . . . 44
6.2.3 The SIMD model . . . . . . . . . . . . . . . . . . . . . 44
6.3 Parallel programming models . . . . . . . . . . . . . . . . . . 45
6.3.1 Tasks and channels . . . . . . . . . . . . . . . . . . . . 45
6.3.2 Message passing . . . . . . . . . . . . . . . . . . . . . . 46
6.4 The SPMD model . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.5 Data distribution . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.5.1 Scatter distribution . . . . . . . . . . . . . . . . . . . . 47
6.5.2 Linear distribution . . . . . . . . . . . . . . . . . . . . 48
6.6 Performance analysis . . . . . . . . . . . . . . . . . . . . . . . 49
6.6.1 Speed-up . . . . . . . . . . . . . . . . . . . . . . . . . . 50
6.6.2 Performance modelling . . . . . . . . . . . . . . . . . . 50
6.7 Parallel routine libraries . . . . . . . . . . . . . . . . . . . . . 51
7 Parallel Response Calculation 53
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
7.2 MATLAB and MEX-programming . . . . . . . . . . . . . . . 53
7.3 Data distribution of observational segments . . . . . . . . . . 54
7.4 Parallel routines . . . . . . . . . . . . . . . . . . . . . . . . . . 55
7.5 Performance analysis . . . . . . . . . . . . . . . . . . . . . . . 56
8 Simulation Results 61
8.1 The Octagonal Transducer . . . . . . . . . . . . . . . . . . . . 61
8.2 Parallel programming results . . . . . . . . . . . . . . . . . . . 63
8.3 3D plot results . . . . . . . . . . . . . . . . . . . . . . . . . . 67
8.3.1 Rectangular aperture . . . . . . . . . . . . . . . . . . . 67
8.3.2 Elliptical aperture . . . . . . . . . . . . . . . . . . . . 67
8.3.3 Super-elliptical aperture . . . . . . . . . . . . . . . . . 67
8.3.4 Octagonal aperture . . . . . . . . . . . . . . . . . . . . 74
8.3.5 Range movie and time movie . . . . . . . . . . . . . . 76
9 Conclusion and Summary 77
9.1 The octagonal transducer . . . . . . . . . . . . . . . . . . . . 77
9.2 The parallel algorithm . . . . . . . . . . . . . . . . . . . . . . 77
9.3 Software and hardware . . . . . . . . . . . . . . . . . . . . . . 78
A User Guide to Parallel Execution 83
A.1 C-MEX routine . . . . . . . . . . . . . . . . . . . . . . . . . . 83
A.1.1 Compilation of C-MEX Routine . . . . . . . . . . . . . 83
A.1.2 Execution . . . . . . . . . . . . . . . . . . . . . . . . . 83
A.2 C-MEX routine for parallel processing . . . . . . . . . . . . . 84
A.2.1 Compilation of Parallel C-MEX Routine . . . . . . . . . 84
A.2.2 Interactive Parallel Execution . . . . . . . . . . . . . . 84
A.2.3 Parallel Execution as a Batch job . . . . . . . . . . . . 86
B User Guide to Volumetric Visualization 89
B.1 Observation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
B.2 Slice Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
C Programming Code 95
C.1 C routine for sequential computer -pec.c . . . . . . . . . . . . 95
C.2 C routine for parallel computer - pec.c . . . . . . . . . . . . . 107
C.3 C-Routine ppec_slave.c . . . . . . . . . . . . . . . . . . . . . . 119
C.4 MATLAB Routine for parallel network - retrieve.m . . . . . . 125
C.5 MATLAB routine - peplot3d.m . . . . . . . . . . . . . . . . . . 126
C.6 MATLAB routine - plt3d.m . . . . . . . . . . . . . . . . . . . 134
Chapter 1
Introduction
1.1 Ultrasound in medical diagnosis
The average human ear can detect sound waves with frequencies between 20 Hz and 20,000 Hz (20 kHz). Sound waves with frequencies higher than 20 kHz are called ultrasound. In medical ultrasound, frequencies from about 1.5 MHz up to 20 MHz are used.
Ultrasound has been recognized as a potential imaging tool since the 1940s [10], building on RADAR¹ and SONAR² technology used during World War II. Since then, ultrasound has been applied for various medical purposes, but it only gained acceptance as a powerful diagnostic tool in the 1970s.
Today, ultrasound is the second most utilized diagnostic imaging technology in medicine, after X-rays. Ultrasound scanning provides a less harmful yet effective method for in vivo diagnosis, and ultrasound scanners are used by many medical institutes. Two common uses of ultrasound imaging are the scanning of the fetus during pregnancy and the scanning of the heart.
In the late sixties, ultrasound scanners with real-time 2D images, formed by both mechanical and electronic array scanning, were introduced. During the next two decades, scanners with better image quality, higher frame rates and Doppler imaging methods were introduced to the market. These scanners used annular arrays and 1D phased arrays [1].
¹RADAR - RAdio Detection And Ranging
²SONAR - SOund Navigation And Ranging
To keep up with new developments in the medical industry, and due to the necessity for better diagnostic methods, new ultrasound scanners provide 3D imaging capabilities. The first 3D scanners used 1D phased arrays. Since these can only be focused in the azimuth direction, the transducer has to be moved mechanically in the elevation direction, successively scanning parallel planes to build a non-real-time 3D image. To create a real-time 3D image, a 2D phased array can be used. 2D phased arrays can be electronically focused to any point in a 3-dimensional segment. Using a 2D array, however, imposes a heavy computational burden, due to the large number of array elements involved - in the range of 256 × 256. With the availability of more powerful and cheaper computer products, this can be alleviated to some extent.
With the development of new signal processing methods and better equipment, 2D arrays with different footprints, such as annular, rectangular and elliptical, are presently being tested and applied.
1.2 Parallel programming in mathematics
Powerful computers have helped scientists and mathematicians to conduct simulations and receive speedy results. However, with new technology come new applications that demand further development in technology. For example, astronomers use larger data sets now than a few decades ago, which take many days to process even with the most powerful computers. Further, supercomputers that are only a few times faster than a powerful Personal Computer (PC) cost many times more. Applying parallel (concurrent) programming methods can therefore be a useful alternative. Computer parallelism is the art of dividing a large program into parts or segments. The two main models of parallel programming are partition of the program and partition of the data. Mathematical and scientific parallel programming usually involves dividing large data sets and computing them on separate computers or processors at the same time.
1.3 Objective of this thesis
The main objective of this thesis is to develop a parallel algorithm to simulate the acoustic field created by arbitrary transducer shapes and arrays. The performance and efficiency of the algorithm is presented. A 3-dimensional visualization tool is then introduced to study the simulated results in perspective. In addition, a novel-shaped transducer is introduced and its acoustic properties are studied.
The thesis can be divided into three main parts. First, some theory behind
medical acoustics is discussed and the novel transducer aperture is intro-
duced. Then some theory behind parallel processing is discussed and the
parallel algorithm is introduced. Finally, the simulations using the algorithm and plots from the visualizing tool are presented and discussed.
Chapter 2
Acoustic Theory and Ultrasound
2.1 Basic acoustic principles
Sound, like all longitudinal waves, requires a medium to propagate. A longitudinal wave is one in which the particles of the medium move in the same direction as the direction of propagation. The wave propagates through the compression and rarefaction of the medium. The displacement plotted with respect to time gives a sinusoidal wave. The wavelength, λ, is the distance between two successive compressions or rarefactions, called pressure maxima, and the period, T, is the time between two pressure maxima at a particular point in space. From this we get the propagation velocity of sound in the medium as:
c = λ/T

Because T = 1/f, where f is the temporal frequency, we can also write:

c = λ/T = λf

The formula c = λf is true for any wave of any type and frequency. The angular frequency is given by ω = 2πf. The wave number vector, denoted k, is a spatial frequency variable whose direction, k/|k|, is the same as the direction of propagation of the wave. The magnitude of k is also known as the spatial frequency of the propagating wave. The slowness vector, denoted α, is defined by α = k/ω.
2.2 Some properties of acoustic waves
reflection - An acoustic wave bounces off a non-absorbing interface. The acoustic echo has the properties of any other wave. When the acoustic ray is reflected, the angle of incidence is equal to the angle of reflection. This suggests that a properly shaped surface can be used to focus or diffract acoustic waves.
refraction - When an acoustic wave is incident upon a boundary between two different media, some of the wave is reflected and some passes into the second medium.
interference - Interference occurs when two similar waves travel in one
medium. When the two waves are out of phase, destructive interference
occurs and when they are in phase, constructive interference occurs.
diffraction - Diffraction is the phenomenon that bends sound around corners. It can be explained with Huygens' principle. When a wave front reaches an opening, the part of the wave that contacts the opening acts as a source and propagates circularly.
superposition - The law of superposition states that the existence of one wave in a certain space and time does not affect the existence or properties of another wave in the same space and time. For sound, however, some loss occurs when high-amplitude sound waves propagate through air.
2.3 The wave equation
The wave equation is given by:

∇²s = ∂²s/∂x² + ∂²s/∂y² + ∂²s/∂z² = (1/c²) ∂²s/∂t²   (2.1)

where s(x, y, z, t) is a general scalar field. In the case of acoustics, s is the acoustic pressure defined at a specific point in space and time. The scalar c, in the acoustical sense, is the speed of sound. In the case of fluids, c is related to the fluid parameters by:

c² = dP/dρ

where P is the acoustic pressure and ρ is the density of the medium.
2.3.1 Solution in Cartesian coordinates

There are a number of solutions to 2.1. Since the wave equation is a partial differential equation, we can assume a separable solution. For simplicity, we can assume that s(x, y, z, t) has the complex exponential form

s(x, y, z, t) = A exp{j(ωt − kx x − ky y − kz z)}   (2.2)

By substituting 2.2 in 2.1, it can be seen that this is indeed a solution to the wave equation if the constraint

kx² + ky² + kz² = ω²/c²

is satisfied. This solution is interpreted as a monochromatic plane wave. It is monochromatic since the temporal frequency at a given point is constant. It is a plane wave since, at any given instant, the value of s is the same at all points lying in a plane given by kx x + ky y + kz z = C, where C is a constant.

The monochromatic solution, given in 2.2, can be written as

s(x, t) = A exp{j(ωt − k · x)}   (2.3)

where k is the wave number vector and x is the spatial location vector. Sometimes this solution is expressed with the slowness vector, α = k/ω. Then

s(x, t) = A exp{jω(t − α · x)}

Then s(x, t) can be written as s(x, t) = s(t − α · x) = A exp(jωu), where u = t − α · x.

The magnitude of α is the reciprocal of the propagation speed, since α = k/ω and |k|² = ω²/c², which is the reason for its name, the slowness vector.
2.3.2 Solution in spherical coordinates
The wave equation 2.1 can be written in spherical coordinates (r, θ, φ), where

x = r cos θ sin φ
y = r sin θ sin φ
z = r cos φ
A graphical presentation of this transformation can be found in figure 3.3 on page 17. With these variables, the wave equation can be written in spherical coordinates:
(1/r²) ∂/∂r (r² ∂s/∂r) + (1/(r² sin φ)) ∂/∂φ (sin φ ∂s/∂φ) + (1/(r² sin² φ)) ∂²s/∂θ² = (1/c²) ∂²s/∂t²   (2.4)
This equation can be solved by using the method of separation of variables. However, the spherical coordinate wave equation is generally used in situations where spherical symmetry is evident [18]. Then the solution s(r, θ, φ, t) does not depend on θ and φ, and equation 2.4 reduces to

(1/r²) ∂/∂r (r² ∂s/∂r) = (1/c²) ∂²s/∂t²
With some manipulation, this equation can be converted to an easy-to-handle one-dimensional wave equation

∂²(rs)/∂r² = (1/c²) ∂²(rs)/∂t²   (2.5)

One solution to this is the monochromatic solution

s(r, t) = (A/r) exp{j(ωt − kr)}

This is interpreted as a spherical wave propagating outward from the origin. A propagating spherical wave of the form

s(r, t) = (B/r) exp{j(ωt + kr)}

is also a solution to equation 2.5. This can be interpreted as a spherical wave propagating towards the origin.
2.4 Acoustic impedance

Acoustic impedance, more formally known as specific acoustic impedance, is defined as

Z = P/u

where P is the acoustic pressure and u is the particle velocity. For a plane wave travelling in a specific direction, the characteristic acoustic impedance is defined as

Z0 = Pz/uz = ρc

where the subscript z denotes the direction of particle movement, ρ denotes the density of the medium and c denotes the velocity of sound in the medium.
2.5 Reflection and refraction

When an acoustic wave meets a boundary, both reflection and refraction occur. The incident angle, θi, is equal to the reflected angle, θr, and the transmitted angle, θt, depends on the acoustic velocities of the two media, c1 and c2, as well as the incident angle. In the case of a monochromatic wave, θi and θt are related by

sin θi / vi = sin θt / vt

where vi and vt are the velocities in the incident medium and the transmitting medium respectively.
The pressure reflection and transmission coefficients, R and T, can be deduced from these conditions together with the acoustic impedances of the two media, Z1 and Z2:

R = pr/pi = (Z2 cos θi − Z1 cos θt) / (Z2 cos θi + Z1 cos θt)

and

T = pt/pi = 2Z2 cos θi / (Z2 cos θi + Z1 cos θt)

When the acoustic wave is normal to the boundary, the reflection and transmission coefficients reduce to

R = (Z2 − Z1) / (Z2 + Z1)   and   T = 2Z2 / (Z2 + Z1)
When Z2 ≫ Z1, the reflection coefficient R → 1. The amplitude of the reflected wave is then only slightly reduced relative to the incident wave, and the particle speed at the boundary is almost zero. Such a medium is called a rigid baffle.

When Z2 ≪ Z1, the reflection coefficient R → −1 and the transmission coefficient T → 0. The acoustic pressure at the boundary is then almost zero. Such a medium is called a soft baffle or pressure release boundary.
2.6 Velocity potential

For acoustic pressure waves, the scalar field s in the wave equation 2.1 can be expressed as the acoustic pressure p. Therefore the acoustic wave equation can be written as

∇²p − (1/c²) ∂²p/∂t² = 0   (2.6)

From Euler's equation for particle velocity, ρ0 ∂v/∂t = −∇p, the acoustic pressure can be related to the particle velocity. Since ∇ × ∇p = 0, it can be seen that ∇ × v = 0. Therefore, the particle velocity v has a scalar potential function Φ, called the velocity potential function, where

v(x, t) = ∇Φ(x, t)   (2.7)

Substituting this in Euler's equation, the acoustic pressure can be expressed as

p(x, t) = −ρ0 ∂Φ(x, t)/∂t   (2.8)

By substituting equation 2.8 in equation 2.6 and integrating with respect to time, it can be seen that the velocity potential function satisfies the acoustic wave equation.
2.7 Rayleigh integral

In an acoustically rigid baffle, the acoustic pressure p(r, t) can be derived from the Green's function:

p(r, t) = ∫∫_S ρ [∂vn(r0, t − r/c)/∂t] / (2πr) dS

or, by using the velocity potential:

Φ(r, t) = ∫∫_S vn(r0, t − r/c) / (2πr) dS

This equation is known as the Rayleigh integral. Here, ρ is the density of the medium, vn is the particle velocity normal to the transducer surface, and r is the distance from the surface element dS to the observation point.
2.8 Calculating the field
2.8.1 Huygens' principle
Huygens' principle is used to explain the phenomena of diffraction and the beam profiles of waves. The principle postulates that each point on a travelling wavefront can be considered a secondary source of spherical radiation. For an ultrasonic transducer, this can be applied to find the field at a spatial point. The transducer surface can be considered as an infinite number of point sources, each emitting a spherical wave. The superposition of the spherical wavelets generated by all point sources at a certain point yields the field at that point [24].
2.8.2 Diffraction impulse response method

If the transducer is assumed to be planar, that is, its lateral dimensions and radius of curvature are large compared to the wavelength, the surface velocity at the boundary can be separated into spatial and temporal parts. It is further assumed that the transducer vibrates in a single mode, usually the thickness mode [7]. The surface velocity can then be written as

vn(r0, t) = O(r0)v(t)

where O(r0) is the apodization. The temporal dependence of v(t − r/c) can be written as a convolution:

v(t − r/c) = ∫ v(t0) δ(t − r/c − t0) dt0

This can be inserted in the Rayleigh integral, and by changing the order of integration,

Φ(r, t) = ∫_t v(t0) ∫∫_S [O(r0) δ(t − r/c − t0)] / (2πr) dS dt0

Let the impulse response h(r, t) be defined as

h(r, t) = ∫∫_S [O(r0) δ(t − r/c)] / (2πr) dS0

Now the velocity potential Φ(r, t) can be written as a convolution of h(r, t) and v(t):

Φ(r, t) = v(t) ⊗ h(r, t)
In order to find the impulse response h(r, t), two different approaches can be applied [7]. One method is to calculate the impulse response for a specific field point; its temporal evolution is obtained by observing that at t = r/c, only points of the source located at a distance r from the field point contribute to the local field, i.e. only points lying on a circle whose center is the projection of the field point onto the plane source. This method is known as the Stepanishen technique. The other method is to find the impulse response at a fixed time on a specific plane.
2.8.3 Direct calculation of the Rayleigh integral
Even though the impulse response method is very efficient, it has some shortcomings. Its inflexibility with regard to the shape of the source is a main disadvantage. Another important limitation is the assumption of a homogeneous medium [22].

A robust way to compute the field is to perform the integration of the Rayleigh integral numerically [6]. Each field point receives a contribution from each element of the transducer. This method is simple, robust and can be used to calculate the field created by arbitrarily shaped transducers. However, when the number of elements in the transducer is large and the field has to be calculated for planes or volumes, the effectiveness of the algorithm depends heavily on the performance of the computer. Using a parallel algorithm to alleviate the strain on time consumption is a part of this thesis and is presented in a later chapter. This method is also advantageous when calculating the field in an aberrating medium [21].
Chapter 3
Transducer Design and Theory
In this chapter, the theory behind transducer design and related fields is discussed.
3.1 Transducers
3.1.1 Piezo-electric effect

Certain materials, like crystals, change their physical dimensions when an electric field is applied, and vice versa. This phenomenon is known as the piezoelectric effect, and it is used to create ultrasound waves in transducers. Some natural materials, such as quartz and tourmaline, and artificial materials, known as polarized ferroelectrics, possess piezoelectric characteristics.
Usually, electricity is applied to the piezoelectric disc via two silver electrodes, as in figure 3.1. By considering the two surfaces of the piezoelectric crystal as two independent vibrators, it can be shown that the resonance frequency is

f0 = n cp / (2Lc),   or   Lc = nλ/2

where the lowest resonance frequency occurs for n = 1, cp is the acoustic wave velocity in the material and Lc is the thickness of the piezoelectric material.
The piezoelectric material for a transducer is chosen according to a number of factors, such as stability, piezoelectric properties and material strength. Quartz, for example, is stable and therefore useful for accurate measurements, but requires high electric fields to obtain high power outputs. Ceramics can produce the same output at much lower fields, but show less stability and low electric input impedance at high frequencies.

Figure 3.1: A simple transducer. The piezoelectric disc is half a wavelength (λ/2) thick so that resonance occurs.
A typical construction of a single-element transducer is shown in figure 3.2.

Figure 3.2: Major parts of a medical transducer: connector, backing material, piezoelectric element, protective layer, electrodes, housing and matching layer.
3.1.2 Matching
Transducer matching is an important issue in transducer design. When a transducer is excited by an electrical source, it rings according to its resonance frequency. In the case of continuous waves, it is desirable that most of the energy is transmitted in the forward direction. A backing material with relatively low impedance, e.g. air, would have a reflection coefficient R → 1, allowing the waves to be almost completely reflected at the back interface.

In pulse-echo applications, such backing would lengthen the duration of the pulse, which is undesirable. It is therefore important that the energy is transmitted to and absorbed by the backing material. A lossy material with impedance equal to that of the transducer would make the ideal backing material for pulse-echo applications. Assume that ZB, ZT and ZM are the impedances of the backing, the transducer and the medium respectively. In traditional medical transducers, where ZM ≪ ZT and ZB < ZT, the amplitude of the second impulse would be considerably greater than the first, and subsequent damping would occur due to the lossy backing material. When ZB = ZT and ZM = ZT, the first two pulses have equal amplitude but opposite phases. All other pulses are absorbed by the backing material [17].
3.2 Transducer apertures
A sensor placed at a specific location in space gathers the propagating energy and produces an electrical signal. A transducer acts as a sensor in addition to being able to create an energy wave from electrical signals. Because propagating waves vary in space and time, sensors are often designed to have significant spatial extent, gathering energy propagating from specific directions. Such sensors are said to be directional: able to focus on a specific direction. Radar dishes used in aviation are an example of directional sensors. An aperture gathers signals over a finite spatial area. Several such apertures, placed in a specific order in space, make up an array. The output of each sensor, enhanced, reduced or inhibited, is combined to produce one single output.
3.2.1 Continuous apertures
When a field f(x, t) is observed through a finite continuous aperture, the output is

z(x, t) = w(x)f(x, t)

where w(x) is called the aperture function. In many cases w(x) is equal to 1 within the aperture boundaries and 0 outside. However, w(x) can be used as a weighting function, where it can take on any value between 0 and 1 within the spatial boundaries of the aperture. This kind of aperture weighting is sometimes referred to as shading, tapering or apodization.
By Fourier transformation of z(x, t), the above equation becomes a convolution of two functions:

Z(k, ω) = ∫_{−∞}^{∞} W(k − l) F(l, ω) dl

where

W(k) = ∫_{−∞}^{∞} w(x) e^{jk·x} dx   (3.1)
is known as the aperture smoothing function. The spatial extent of an aperture determines the resolution with which two plane waves are separated. A large aperture can be focused in any direction.

The aperture smoothing function depends mostly on the shape of the aperture. Common aperture shapes, such as rectangular and annular, have aperture smoothing functions that are easy to express and apply. For novel-shaped apertures, the aperture area has to be handled in special ways to find the aperture smoothing function.
The aperture smoothing function can also be expressed in terms of the direction of the wave. This is known as the directivity function or the radiation pattern. Since |k| = k = 2π/λ, the wavenumber vector of a plane wave can be expressed as

k = {kx, ky} = {k sin φ cos θ, k sin φ sin θ} = {(2π/λ) sin φ cos θ, (2π/λ) sin φ sin θ}   (3.2)

where θ is the angle in the XY-plane and φ is the normal incidence angle. Figure 3.3 shows the angles in 3D space.
3.2.2 Linear apertures
Figure 3.4 shows a linear aperture placed symmetrically on the x-axis. For simplicity, the aperture function w(x) is assumed to be regular and equal to 1 inside the aperture.
From equation 3.1, the aperture smoothing function of a linear aperture, WL(k), is

WL(k) = ∫∫ e^{jk·x} dx dy = ∫_{−a}^{a} e^{jkx x} dx = [e^{jkx x}/(jkx)]_{−a}^{a} = (e^{jkx a} − e^{−jkx a})/(jkx)

WL(k) = 2 sin(kx a)/kx   (3.3)

Figure 3.3: Geometry of an incident wave, showing the angles θ and φ.

Figure 3.4: A linear aperture with length 2a.
3.2.3 Rectangular apertures
Figure 3.5 shows a rectangular aperture placed symmetrically in the XY-plane. The aperture smoothing function is found by treating it as two linear apertures, in the X and Y directions respectively, whose aperture smoothing functions follow from equation 3.3, and multiplying them. The aperture smoothing function WR(k) of a rectangular aperture is then

WR(k) = [2 sin(kx a)/kx] · [2 sin(ky b)/ky] = (4/(kx ky)) sin(kx a) sin(ky b)   (3.4)
Figure 3.5: A rectangular aperture with length 2a in the X-direction and 2b in the Y-direction.
3.2.4 Elliptical and circular apertures
Figure 3.6: An elliptical aperture with a and b as main axes.

Figure 3.7: A circular aperture with radius r.
Figure 3.6 shows an elliptical aperture and figure 3.7 a circular aperture, both placed symmetrically about the XY axes. The long and short axes of the elliptical aperture are a and b respectively, and the radius of the circular aperture is r. By setting a = b = r, the directivity pattern of the circular aperture can be found from the directivity pattern of the elliptical aperture. The directivity pattern of the elliptical aperture is found by using equations 3.1 and 3.2:
WE(k) = WE(r, θ, φ) = ∫∫_S e^{j(2π/λ)(x sin φ cos θ + y sin φ sin θ)} dx dy

New variables ρ and ψ are introduced such that

x = aρ cos ψ,  0 ≤ ρ ≤ 1
y = bρ sin ψ,  0 ≤ ψ ≤ 2π

for which

dx dy = |J| dρ dψ

where |J| is the Jacobian of the transformation, given by

|J| = |∂x/∂ρ · ∂y/∂ψ − ∂x/∂ψ · ∂y/∂ρ| = abρ

Then the radiation pattern becomes

WE(r, θ, φ) = ∫_0^{2π} ∫_0^1 e^{j(2π/λ) sin φ (a cos ψ cos θ + b sin ψ sin θ)ρ} abρ dρ dψ   (3.5)
By using

\cos\theta' = \frac{a\cos\theta}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}}

and

\sin\theta' = \frac{b\sin\theta}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}}
equation 3.5 can be written as
W_E(r, \theta, \phi) = \int_0^{2\pi}\!\!\int_0^1 e^{j(2\pi/\lambda)\sin\phi\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}\,\cos(\psi - \theta')\rho} \, ab\rho \, d\rho \, d\psi
By using the integral expression of the zero-order Bessel function of the first kind, 2\pi J_0(z) = \int_0^{2\pi} e^{jz\cos\psi} \, d\psi, the above equation can be reduced to a single integral:

W_E(r, \theta, \phi) = 2\pi ab \int_0^1 J_0\!\left(\frac{2\pi}{\lambda}\rho\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}\,\sin\phi\right) \rho \, d\rho
Since \int_0^x z J_0(z) \, dz = x J_1(x), the equation reduces to

W_E(r, \theta, \phi) = ab \, \frac{J_1\!\left(2\pi\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}\,\sin\phi/\lambda\right)}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}\,\sin\phi/\lambda}    (3.6)

where J_1(\cdot) is the first-order Bessel function of the first kind.
By setting a = b = r in equation 3.6, the directivity pattern of the circular aperture, W_C, is

W_C(r, \theta, \phi) = r^2 \, \frac{J_1(2\pi r\sin\phi/\lambda)}{r\sin\phi/\lambda}

Since |\vec{k}| = k = 2\pi/\lambda, the above equation can be written as

W_C(r, \theta, \phi) = 2\pi r \, \frac{J_1(kr\sin\phi)}{k\sin\phi}    (3.7)
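The Bessel-function steps in this reduction can be verified numerically. The sketch below (plain Python, for illustration only) computes J0 and J1 from the integral representation J_n(x) = (1/π)∫_0^π cos(nτ − x sin τ) dτ and checks the identity ∫_0^1 J_0(κρ)ρ dρ = J_1(κ)/κ that leads to equation 3.6:

```python
import math

def bessel_j(n, x, steps=1000):
    # J_n(x) = (1/pi) * integral_0^pi cos(n*tau - x*sin(tau)) dtau (midpoint rule)
    h = math.pi / steps
    s = sum(math.cos(n * ((m + 0.5) * h) - x * math.sin((m + 0.5) * h))
            for m in range(steps))
    return s * h / math.pi

def radial_integral(kappa, steps=1000):
    # integral_0^1 J0(kappa*rho) * rho  drho  (midpoint rule)
    h = 1.0 / steps
    s = sum(bessel_j(0, kappa * (m + 0.5) * h) * (m + 0.5) * h
            for m in range(steps))
    return s * h

kappa = 3.1
print(radial_integral(kappa))        # left-hand side of the identity
print(bessel_j(1, kappa) / kappa)    # right-hand side: J1(kappa)/kappa
```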
3.3 Beam characteristics
A transducer surface that vibrates in thickness mode creates an ultrasound beam that radiates normal to the surface. As explained in section 2.8.1, the beam is the result of the superposition of spherical waves originating from minute points on the transducer surface.
The beam can be separated into two main regions, as shown in figure 3.8 for a circular aperture. In the near field, also known as the Fresnel zone, the beam is shaped by the surface characteristics of the transducer [1]. In the far field, the beam consists of a large high-amplitude beam called the mainlobe and low-amplitude beams skirting the mainlobe, called sidelobes. For a circular transducer, the separation of these two regions occurs at a distance

z_0 = D^2/2\lambda

from the transducer surface, where D is the diameter of the aperture and λ is the wavelength of the acoustic wave. The extreme near field is a cylindrical extension of the transducer surface, extending approximately 0.8 D^2/4\lambda from the surface. In the transition region, the beam first becomes narrower and then widens with sidelobes into the far field. The narrowing of the beam is known as the diffraction focus.
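As a numerical illustration (the probe parameters below are chosen for this example and are not from the thesis): for a circular aperture of diameter D = 13 mm at 3.5 MHz in tissue (c = 1540 m/s), these expressions give a transition distance of about 19 cm and an extreme near field of about 8 cm:

```python
# Illustrative near-field/far-field distances for a hypothetical circular probe
c = 1540.0      # speed of sound in tissue [m/s]
f = 3.5e6       # centre frequency [Hz]
D = 13e-3       # aperture diameter [m]

lam = c / f                           # wavelength [m]
z0 = D ** 2 / (2 * lam)               # near-field/far-field transition
z_extreme = 0.8 * D ** 2 / (4 * lam)  # extent of the extreme near field

print(f"lambda = {lam * 1e3:.2f} mm")                     # 0.44 mm
print(f"z0 = {z0 * 1e3:.0f} mm")                          # 192 mm
print(f"extreme near field = {z_extreme * 1e3:.0f} mm")   # 77 mm
```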
By introducing a curvature to the surface of the transducer, the beam can be
focused to a location in the beam region. This kind of geometrical focusing
Figure 3.8: An idealized pressure profile of a circular transducer, showing the extreme near field, the transition region, and the mainlobe and sidelobes in the far field.
is possible for all footprints, but is mostly used with circular apertures. Figure 3.9 shows the region of focus for a circular aperture. Due to diffraction, the focus will not be sharp.
Figure 3.9: Beam profile of a focused circular array.
3.4 Arrays
An array consists of individual apertures or omni-directional transducers, which sample the wave-field at discrete spatial locations. Regular arrays, made by placing the sensors on a "regular" grid, are easy to analyze and will be discussed in this section.
A wavefield with one spatial dimension, f(x, t), has the wavenumber-frequency representation

F(k, \omega) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f(x, t) \exp\{-j(\omega t - kx)\} \, dx \, dt
If the field is measured by M equally spaced one-dimensional sensors separated by a distance d, the sampled wavefield's wavenumber-frequency representation is

Y(k, \omega) = \int_{-\infty}^{\infty} \sum_m y_m(t) \exp\{-j(\omega t - kmd)\} \, dt
where y_m(t) = f(md, t). By spatial sampling theory, it can be shown that

Y(k, \omega) = \frac{1}{d} \sum_{p=-\infty}^{\infty} F\!\left(k - \frac{2\pi p}{d}, \omega\right)

Here, it is assumed that F is bandlimited, i.e. F(k, \omega) is zero for |k| \ge \pi/d, so that aliasing will not occur. Generally, a set of weights is applied to the elements. Among other advantages, weighting can express the absence of an element at the mth location by setting w_m = 0. An observed signal can be represented as

z_m(t) = w_m y_m(t)
In the wavenumber domain, this becomes a convolution

Z(\vec{k}, \omega) = W(\vec{k}) \otimes Y(\vec{k}, \omega)

where W(\vec{k}) is called the discrete aperture function, given by

W(\vec{k}) = \sum_m w_m e^{jkmd}
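The discrete aperture function is straightforward to evaluate directly. The sketch below (plain Python; the array size and pitch are chosen for illustration) shows the periodicity W(k + 2π/d) = W(k) that spatial sampling introduces, which is the origin of the grating lobes discussed in section 3.6, and the effect of a missing element, w_m = 0:

```python
import cmath
import math

def aperture_function(weights, d, k):
    # Discrete aperture function W(k) = sum_m w_m * exp(j*k*m*d)
    return sum(w * cmath.exp(1j * k * m * d) for m, w in enumerate(weights))

M, d = 8, 0.5e-3                 # 8 elements, 0.5 mm spacing (illustrative)
w_full = [1.0] * M
w_dead = [1.0] * M
w_dead[3] = 0.0                  # element 3 is absent

k = 1234.0                       # arbitrary wavenumber [rad/m]
period = 2 * math.pi / d         # W repeats with this period in k

print(abs(aperture_function(w_full, d, k)
          - aperture_function(w_full, d, k + period)))  # ~ 0: periodic in k
print(abs(aperture_function(w_full, d, 0.0)))           # mainlobe peak: M = 8
print(abs(aperture_function(w_dead, d, 0.0)))           # reduced to M - 1 = 7
```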
3.5 Transducer arrays in medical diagnosis
Transducer arrays are novel transducers that house more than one transducer element, ordered in a specific fashion. Such transducers, especially transducers with 2D arrays of elements, provide the capability of electronically focusing the ultrasound beam to any point in the diagnosis area. Transducer arrays are mostly used in 3D imaging equipment. Some commonly used transducer arrays are shown in figure 3.10.
Figure 3.10: (a) Linear, (b) 2D and (c) annular arrays, built from piezoelectric elements separated by isolating material.
3.6 Electronic focusing
Focusing using a curvature on the transducer surface was discussed in section 3.3 on page 20. However, to achieve dynamic focusing, and to focus on an arbitrary point in the observation segment, that method is inadequate. A certain amount of freedom can be achieved by mechanically turning the transducer to different angles.
In a transducer array, an electronic delay can be applied to each element, thereby delaying the signal according to the required focus. As an example, consider a 1D array as in figure 3.11. The array elements are placed less than λ/2 apart between centers to avoid high-energy sidelobes called grating lobes [24]. If voltage is first applied to the first element, then to the next, and so on, the waves from the elements are emitted with successive delays, and the resultant wavefront is steered according to the applied time delay. The beam can be focused by varying the time delays between elements. With 2D arrays, the beam can be steered or focused in any direction.
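A simple way to compute such delays for geometric focusing (a sketch under idealized assumptions — point-like elements and a homogeneous medium, not the delay law of any particular scanner) is to equalize the travel times from every element to the focal point:

```python
import math

def focus_delays(n_elem, pitch, focus_x, focus_z, c=1540.0):
    # Element positions centred around x = 0 on the array axis
    xs = [(m - (n_elem - 1) / 2.0) * pitch for m in range(n_elem)]
    # Distance from each element to the focal point
    dists = [math.hypot(focus_x - x, focus_z) for x in xs]
    # Fire the farthest element first so all wavefronts arrive together
    d_max = max(dists)
    return [(d_max - dd) / c for dd in dists]

# Focus 40 mm straight ahead of a 64-element array with 0.22 mm pitch
delays = focus_delays(64, 0.22e-3, 0.0, 40e-3)
print(min(delays), max(delays))   # centre elements get the largest delay
```

For an on-axis focus the delay profile is symmetric about the array centre; adding a linear delay ramp on top of this would steer the beam off-axis.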
The concentric rings in an annular array can also be electronically focused by applying a spherical delay profile to the rings. In most imaging systems, the beam is steered mechanically. By rotating the array, a sector image is created, and by moving it linearly, a linear image is created. The focus achieved by an annular array is symmetric both in the scan plane and in the plane transverse to the scan plane.
Figure 3.11: (a) Steering and (b) focusing of the beam using a 1D array.
The 1D array of the above paragraph is called a linear phased array. A linear phased array consists of thin bars mounted on a backing with a λ/4 matching layer in front. A linear phased array is usually small, about 1 cm wide and about 3 cm long, usually with 32 to 128 elements. A slight variant, called a linear switched array, is also commonly used. The voltage is applied to groups of elements in succession, so that the beam "moves" across the face of the array, which has the same effect as mechanically moving the transducer along its axis. Linear switched arrays, also known as linear sequenced arrays, are much larger than linear phased arrays, 10 to 15 cm long and with 64 to 256 elements.
3.7 The ultrasound imaging system
Figure 3.12 shows a simplified block diagram of an ultrasound imaging system. The ultrasound beam is generated, and the backscattered wave is received, by the RF unit. The focusing and steering of the beam are also done by the RF unit, which therefore acts as the beamformer of the ultrasound imaging system. The real-time scanline processing unit receives data from the RF unit, and this data is processed using the system processor and the memory bank. The processed data is converted to an RGB signal by the scan-converter display unit and sent to the monitor. The real-time image is used to carry out the diagnosis.
Figure 3.12: Block diagram of an ultrasound imaging system: RF unit, real-time scanline processing, system processor, data memory and scan-converter display unit connected by a computer bus, with the monitor attached to the display unit.
Chapter 4
The Octagonal Transducer
4.1 Introduction
The octagonal aperture is a novel aperture shape introduced in this project. The transducer extends from −2a to 2a in the X direction and from −2b to 2b in the Y direction, as shown in figure 4.1.
Figure 4.1: The aperture smoothing function is found by dividing the octagonal transducer into sections A, B1, B2 and C1–C4 and summing over all areas.
Since the aperture has an unconventional footprint, the aperture functions are calculated for the different sections shown in figure 4.1. Assume the aperture smoothing functions of the sections A, B and C are W_A, W_B and W_C respectively. The aperture function for the transducer is calculated by adding the aperture functions of these sections together:

W(\vec{k}) = W_A(\vec{k}) + W_B(\vec{k}) + W_C(\vec{k})    (4.1)
4.2 Rectangular area A
From equation 3.4, the aperture smoothing function of the rectangular section A can be calculated, using lengths 2a in the X-direction and 4b in the Y-direction:

W_A(\vec{k}) = \frac{4}{k_x k_y} \sin(k_x a)\sin(2k_y b)    (4.2)
4.3 Rectangular area B
The aperture smoothing function of area B, W_B, is equal to the sum of the aperture smoothing functions of sections B1 and B2, given by W_{B1} and W_{B2} respectively. The sections B1 and B2 can be calculated by assuming each is placed with its phase center at the origin and then shifting them in the X-direction, as shown in figure 4.2.
Figure 4.2: Sections B1 and B2 of the octagonal transducer, centered at x = ∓3a/2.
By using the aperture smoothing function of a rectangle, equation 3.4, with lengths a and 2b, the aperture smoothing functions of the sections B1 and B2, shifted to their respective places, are

W_{B1} = e^{-j3k_x a/2} \frac{4}{k_x k_y} \sin(k_x a/2)\sin(k_y b)

W_{B2} = e^{j3k_x a/2} \frac{4}{k_x k_y} \sin(k_x a/2)\sin(k_y b)
Then the aperture smoothing function W_B is the sum of the two sections B1 and B2:

W_B = W_{B1} + W_{B2} = \frac{4}{k_x k_y} \sin(k_x a/2)\sin(k_y b)\left(e^{j3k_x a/2} + e^{-j3k_x a/2}\right)

Simplified,

W_B = \frac{8}{k_x k_y} \cos(3k_x a/2)\sin(k_x a/2)\sin(k_y b)    (4.3)
4.4 Triangular area C
Figure 4.1 shows the positions of the four triangular parts of section C, labelled C1–C4. Due to the symmetry of the placement, the aperture smoothing function expressions show certain similarities. However, the aperture smoothing functions for each triangular part are calculated separately below.

For simplicity, each triangle is shifted so that its right-angled corner is at the origin. The final expression is then shifted to the correct position with an e^{\pm jk_x a} e^{\pm jk_y b} term according to its Cartesian quadrant.
4.4.1 Section C1
Figure 4.3: Section C1, the triangle with corners (0, 0), (a, 0) and (0, b); a vertical line element runs from (x, 0) to (x, b − xb/a).
Let W_x be the aperture smoothing function of the vertical linear aperture in figure 4.3, which spans y \in [0, b - xb/a] for x \in [0, a]. The aperture function of this line can be calculated with respect to the y coordinates only.
W_x = e^{jk_x x} \int_0^{b - xb/a} e^{jk_y y} \, dy
= e^{jk_x x} \left. \frac{e^{jk_y y}}{jk_y} \right|_0^{b - xb/a}
= \frac{e^{jk_x x}}{jk_y} \left( e^{jk_y b} e^{-jk_y xb/a} - 1 \right)
= \frac{e^{jk_y b}}{jk_y} e^{j(k_x - k_y b/a)x} - \frac{e^{jk_x x}}{jk_y}
For all such linear apertures in the complete triangle, the aperture smoothing function, W_1, is

W_1 = \frac{e^{jk_y b}}{jk_y} \int_0^a e^{j(k_x - k_y b/a)x} \, dx - \frac{1}{jk_y} \int_0^a e^{jk_x x} \, dx

= \frac{e^{jk_y b}}{jk_y} \left. \frac{e^{j(k_x - k_y b/a)x}}{j(k_x - k_y b/a)} \right|_0^a - \frac{1}{jk_y} \left. \frac{e^{jk_x x}}{jk_x} \right|_0^a

= -\frac{e^{jk_y b}}{k_y(k_x - k_y b/a)} \left( e^{j(k_x - k_y b/a)a} - 1 \right) + \frac{1}{k_x k_y} \left( e^{jk_x a} - 1 \right)

= -\frac{a e^{jk_y b}}{k_y(k_x a - k_y b)} \left( e^{jk_x a} e^{-jk_y b} - 1 \right) + \frac{1}{k_x k_y} \left( e^{jk_x a} - 1 \right)

= -\frac{a}{k_y(k_x a - k_y b)} \left( e^{jk_x a} - e^{jk_y b} \right) + \frac{1}{k_x k_y} \left( e^{jk_x a} - 1 \right)
To shift the aperture to its correct position, the aperture smoothing function is multiplied by e^{jk_x a} e^{jk_y b}. Then the aperture smoothing function of the section C1 is

W_{C1} = -\frac{a e^{jk_x a} e^{jk_y b}}{k_y(k_x a - k_y b)} \left( e^{jk_x a} - e^{jk_y b} \right) + \frac{e^{jk_x a} e^{jk_y b}}{k_x k_y} \left( e^{jk_x a} - 1 \right)    (4.4)
4.4.2 Section C2
Let W_x be the aperture smoothing function of the vertical linear aperture in figure 4.4, which spans y \in [0, b + xb/a] for x \in [-a, 0]. The aperture function of this line can be calculated with respect to the y coordinates only.
Figure 4.4: Section C2, the triangle with corners (0, 0), (−a, 0) and (0, b); a vertical line element runs from (x, 0) to (x, b + xb/a).
W_x = e^{jk_x x} \int_0^{b + xb/a} e^{jk_y y} \, dy
= e^{jk_x x} \left. \frac{e^{jk_y y}}{jk_y} \right|_0^{b + xb/a}
= \frac{e^{jk_x x}}{jk_y} \left( e^{jk_y b} e^{jk_y xb/a} - 1 \right)
= \frac{e^{jk_y b}}{jk_y} e^{j(k_x + k_y b/a)x} - \frac{e^{jk_x x}}{jk_y}
For all such linear apertures in the complete triangle, the aperture smoothing function, W_2, is

W_2 = \frac{e^{jk_y b}}{jk_y} \int_{-a}^0 e^{j(k_x + k_y b/a)x} \, dx - \frac{1}{jk_y} \int_{-a}^0 e^{jk_x x} \, dx

= \frac{e^{jk_y b}}{jk_y} \left. \frac{e^{j(k_x + k_y b/a)x}}{j(k_x + k_y b/a)} \right|_{-a}^0 - \frac{1}{jk_y} \left. \frac{e^{jk_x x}}{jk_x} \right|_{-a}^0

= -\frac{e^{jk_y b}}{k_y(k_x + k_y b/a)} \left( 1 - e^{-j(k_x + k_y b/a)a} \right) + \frac{1}{k_x k_y} \left( 1 - e^{-jk_x a} \right)

= -\frac{a e^{jk_y b}}{k_y(k_x a + k_y b)} \left( 1 - e^{-jk_x a} e^{-jk_y b} \right) - \frac{1}{k_x k_y} \left( e^{-jk_x a} - 1 \right)

= \frac{a}{k_y(k_x a + k_y b)} \left( e^{-jk_x a} - e^{jk_y b} \right) - \frac{1}{k_x k_y} \left( e^{-jk_x a} - 1 \right)
To shift the aperture to its correct position, the aperture smoothing function is multiplied by e^{-jk_x a} e^{jk_y b}. Then the aperture smoothing function of the section C2 is

W_{C2} = \frac{a e^{-jk_x a} e^{jk_y b}}{k_y(k_x a + k_y b)} \left( e^{-jk_x a} - e^{jk_y b} \right) - \frac{e^{-jk_x a} e^{jk_y b}}{k_x k_y} \left( e^{-jk_x a} - 1 \right)    (4.5)
4.4.3 Section C3
Figure 4.5: Section C3, the triangle with corners (0, 0), (−a, 0) and (0, −b); a vertical line element runs from (x, −b − xb/a) to (x, 0).
Let W_x be the aperture smoothing function of the vertical linear aperture in figure 4.5, which spans y \in [-b - xb/a, 0] for x \in [-a, 0]. The aperture function of this line can be calculated with respect to the y coordinates only.
W_x = e^{jk_x x} \int_{-b - xb/a}^0 e^{jk_y y} \, dy
= e^{jk_x x} \left. \frac{e^{jk_y y}}{jk_y} \right|_{-b - xb/a}^0
= \frac{e^{jk_x x}}{jk_y} \left( 1 - e^{-jk_y b} e^{-jk_y xb/a} \right)
= -\frac{e^{-jk_y b}}{jk_y} e^{j(k_x - k_y b/a)x} + \frac{e^{jk_x x}}{jk_y}
For all such linear apertures in the complete triangle, the aperture smoothing function, W_3, is

W_3 = -\frac{e^{-jk_y b}}{jk_y} \int_{-a}^0 e^{j(k_x - k_y b/a)x} \, dx + \frac{1}{jk_y} \int_{-a}^0 e^{jk_x x} \, dx

= -\frac{e^{-jk_y b}}{jk_y} \left. \frac{e^{j(k_x - k_y b/a)x}}{j(k_x - k_y b/a)} \right|_{-a}^0 + \frac{1}{jk_y} \left. \frac{e^{jk_x x}}{jk_x} \right|_{-a}^0

= \frac{e^{-jk_y b}}{k_y(k_x - k_y b/a)} \left( 1 - e^{-j(k_x - k_y b/a)a} \right) - \frac{1}{k_x k_y} \left( 1 - e^{-jk_x a} \right)

= \frac{a e^{-jk_y b}}{k_y(k_x a - k_y b)} \left( 1 - e^{-jk_x a} e^{jk_y b} \right) + \frac{1}{k_x k_y} \left( e^{-jk_x a} - 1 \right)

= -\frac{a}{k_y(k_x a - k_y b)} \left( e^{-jk_x a} - e^{-jk_y b} \right) + \frac{1}{k_x k_y} \left( e^{-jk_x a} - 1 \right)
To shift the aperture to its correct position, the aperture smoothing function is multiplied by e^{-jk_x a} e^{-jk_y b}. Then the aperture smoothing function of the section C3 is

W_{C3} = -\frac{a e^{-jk_x a} e^{-jk_y b}}{k_y(k_x a - k_y b)} \left( e^{-jk_x a} - e^{-jk_y b} \right) + \frac{e^{-jk_x a} e^{-jk_y b}}{k_x k_y} \left( e^{-jk_x a} - 1 \right)    (4.6)
4.4.4 Section C4
Figure 4.6: Section C4, the triangle with corners (0, 0), (a, 0) and (0, −b); a vertical line element runs from (x, −b + xb/a) to (x, 0).
Let W_x be the aperture smoothing function of the vertical linear aperture in figure 4.6, which spans y \in [-b + xb/a, 0] for x \in [0, a]. The aperture function of this line can be calculated with respect to the y coordinates only.
W_x = e^{jk_x x} \int_{-b + xb/a}^0 e^{jk_y y} \, dy
= e^{jk_x x} \left. \frac{e^{jk_y y}}{jk_y} \right|_{-b + xb/a}^0
= \frac{e^{jk_x x}}{jk_y} \left( 1 - e^{-jk_y b} e^{jk_y xb/a} \right)
= -\frac{e^{-jk_y b}}{jk_y} e^{j(k_x + k_y b/a)x} + \frac{e^{jk_x x}}{jk_y}
For all such linear apertures in the complete triangle, the aperture smoothing function, W_4, is

W_4 = -\frac{e^{-jk_y b}}{jk_y} \int_0^a e^{j(k_x + k_y b/a)x} \, dx + \frac{1}{jk_y} \int_0^a e^{jk_x x} \, dx

= -\frac{e^{-jk_y b}}{jk_y} \left. \frac{e^{j(k_x + k_y b/a)x}}{j(k_x + k_y b/a)} \right|_0^a + \frac{1}{jk_y} \left. \frac{e^{jk_x x}}{jk_x} \right|_0^a

= \frac{e^{-jk_y b}}{k_y(k_x + k_y b/a)} \left( e^{j(k_x + k_y b/a)a} - 1 \right) - \frac{1}{k_x k_y} \left( e^{jk_x a} - 1 \right)

= \frac{a e^{-jk_y b}}{k_y(k_x a + k_y b)} \left( e^{jk_x a} e^{jk_y b} - 1 \right) - \frac{1}{k_x k_y} \left( e^{jk_x a} - 1 \right)

= \frac{a}{k_y(k_x a + k_y b)} \left( e^{jk_x a} - e^{-jk_y b} \right) - \frac{1}{k_x k_y} \left( e^{jk_x a} - 1 \right)
To shift the aperture to its correct position, the aperture smoothing function is multiplied by e^{jk_x a} e^{-jk_y b}. Then the aperture smoothing function of the section C4 is

W_{C4} = \frac{a e^{jk_x a} e^{-jk_y b}}{k_y(k_x a + k_y b)} \left( e^{jk_x a} - e^{-jk_y b} \right) - \frac{e^{jk_x a} e^{-jk_y b}}{k_x k_y} \left( e^{jk_x a} - 1 \right)    (4.7)
4.4.5 Sum of the triangular sections
By adding the first terms of equations 4.4, 4.5, 4.6 and 4.7:

W_C^1 = -\frac{a e^{jk_x a} e^{jk_y b}}{k_y(k_x a - k_y b)}\left(e^{jk_x a} - e^{jk_y b}\right) + \frac{a e^{-jk_x a} e^{jk_y b}}{k_y(k_x a + k_y b)}\left(e^{-jk_x a} - e^{jk_y b}\right)
- \frac{a e^{-jk_x a} e^{-jk_y b}}{k_y(k_x a - k_y b)}\left(e^{-jk_x a} - e^{-jk_y b}\right) + \frac{a e^{jk_x a} e^{-jk_y b}}{k_y(k_x a + k_y b)}\left(e^{jk_x a} - e^{-jk_y b}\right)

= -\frac{a\left(e^{2jk_x a}e^{jk_y b} - e^{jk_x a}e^{2jk_y b} + e^{-2jk_x a}e^{-jk_y b} - e^{-jk_x a}e^{-2jk_y b}\right)}{k_y(k_x a - k_y b)}
+ \frac{a\left(e^{-2jk_x a}e^{jk_y b} - e^{-jk_x a}e^{2jk_y b} + e^{2jk_x a}e^{-jk_y b} - e^{jk_x a}e^{-2jk_y b}\right)}{k_y(k_x a + k_y b)}

= -\frac{a\left(e^{j(2k_x a + k_y b)} + e^{-j(2k_x a + k_y b)} - e^{j(k_x a + 2k_y b)} - e^{-j(k_x a + 2k_y b)}\right)}{k_y(k_x a - k_y b)}
+ \frac{a\left(e^{j(2k_x a - k_y b)} + e^{-j(2k_x a - k_y b)} - e^{j(k_x a - 2k_y b)} - e^{-j(k_x a - 2k_y b)}\right)}{k_y(k_x a + k_y b)}

= -\frac{2a}{k_y(k_x a - k_y b)}\left(\cos(2k_x a + k_y b) - \cos(k_x a + 2k_y b)\right)
+ \frac{2a}{k_y(k_x a + k_y b)}\left(\cos(2k_x a - k_y b) - \cos(k_x a - 2k_y b)\right)

= -\frac{2a}{k_y(k_x a - k_y b)}\left(-2\sin\!\left(\frac{3k_x a + 3k_y b}{2}\right)\sin\!\left(\frac{k_x a - k_y b}{2}\right)\right)
+ \frac{2a}{k_y(k_x a + k_y b)}\left(-2\sin\!\left(\frac{3k_x a - 3k_y b}{2}\right)\sin\!\left(\frac{k_x a + k_y b}{2}\right)\right)

Therefore, the first terms from each equation add up to:

W_C^1 = \frac{4a}{k_y(k_x a - k_y b)}\sin\!\left(\frac{3k_x a + 3k_y b}{2}\right)\sin\!\left(\frac{k_x a - k_y b}{2}\right)
- \frac{4a}{k_y(k_x a + k_y b)}\sin\!\left(\frac{3k_x a - 3k_y b}{2}\right)\sin\!\left(\frac{k_x a + k_y b}{2}\right)    (4.8)
By adding the second terms of equations 4.4, 4.5, 4.6 and 4.7:

W_C^2 = \frac{1}{k_x k_y}\left[e^{jk_y b}\left(e^{2jk_x a} - e^{jk_x a}\right) - e^{jk_y b}\left(e^{-2jk_x a} - e^{-jk_x a}\right)
+ e^{-jk_y b}\left(e^{-2jk_x a} - e^{-jk_x a}\right) - e^{-jk_y b}\left(e^{2jk_x a} - e^{jk_x a}\right)\right]

= \frac{1}{k_x k_y}\left[e^{jk_y b}\left(e^{2jk_x a} - e^{jk_x a} - e^{-2jk_x a} + e^{-jk_x a}\right)
+ e^{-jk_y b}\left(e^{-2jk_x a} - e^{-jk_x a} - e^{2jk_x a} + e^{jk_x a}\right)\right]

= \frac{1}{k_x k_y}\left[e^{jk_y b}\left(2j\sin(2k_x a) - 2j\sin(k_x a)\right) - e^{-jk_y b}\left(2j\sin(2k_x a) - 2j\sin(k_x a)\right)\right]

= \frac{2j}{k_x k_y}\left(e^{jk_y b} - e^{-jk_y b}\right)\left(\sin(2k_x a) - \sin(k_x a)\right)

= \frac{-4}{k_x k_y}\sin(k_y b)\left(\sin(2k_x a) - \sin(k_x a)\right)

= \frac{-4}{k_x k_y}\sin(k_y b)\cdot 2\sin\!\left(\frac{k_x a}{2}\right)\cos\!\left(\frac{3k_x a}{2}\right)

Therefore, the second terms from each equation add up to:

W_C^2 = \frac{-8}{k_x k_y}\sin(k_x a/2)\sin(k_y b)\cos(3k_x a/2)    (4.9)
4.5 Aperture smoothing function of the transducer

From equations 4.2, 4.3, 4.8 and 4.9, the aperture smoothing function of the transducer is:

W(\vec{k}) = \frac{4}{k_x k_y}\sin(k_x a)\sin(2k_y b)
+ \frac{8}{k_x k_y}\cos(3k_x a/2)\sin(k_x a/2)\sin(k_y b)
+ \frac{4a}{k_y(k_x a - k_y b)}\sin\!\left(\frac{3k_x a + 3k_y b}{2}\right)\sin\!\left(\frac{k_x a - k_y b}{2}\right)
- \frac{4a}{k_y(k_x a + k_y b)}\sin\!\left(\frac{3k_x a - 3k_y b}{2}\right)\sin\!\left(\frac{k_x a + k_y b}{2}\right)
- \frac{8}{k_x k_y}\sin(k_x a/2)\sin(k_y b)\cos(3k_x a/2)

= \frac{4}{k_x k_y}\sin(k_x a)\sin(2k_y b)
+ \frac{4a}{k_y(k_x a - k_y b)}\sin\!\left(\frac{3k_x a + 3k_y b}{2}\right)\sin\!\left(\frac{k_x a - k_y b}{2}\right)
- \frac{4a}{k_y(k_x a + k_y b)}\sin\!\left(\frac{3k_x a - 3k_y b}{2}\right)\sin\!\left(\frac{k_x a + k_y b}{2}\right)    (4.10)
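Equation 4.10 can be verified numerically against a direct integration over the octagonal footprint. In the sketch below (plain Python, written for this check only), the inner y-integral over each vertical strip is done analytically, giving 2 sin(k_y y_top(x))/k_y, and the remaining x-integral is evaluated with the midpoint rule:

```python
import cmath
import math

def w_closed(kx, ky, a, b):
    # Closed-form aperture smoothing function of the octagon (eq. 4.10)
    t1 = 4.0 / (kx * ky) * math.sin(kx * a) * math.sin(2 * ky * b)
    t2 = (4 * a / (ky * (kx * a - ky * b))
          * math.sin((3 * kx * a + 3 * ky * b) / 2)
          * math.sin((kx * a - ky * b) / 2))
    t3 = (4 * a / (ky * (kx * a + ky * b))
          * math.sin((3 * kx * a - 3 * ky * b) / 2)
          * math.sin((kx * a + ky * b) / 2))
    return t1 + t2 - t3

def w_numeric(kx, ky, a, b, n=40000):
    # For |x| <= a the strip spans y in [-2b, 2b]; on the slanted part it
    # spans [-b(3 - |x|/a), b(3 - |x|/a)].  Inner integral: 2 sin(ky*ytop)/ky.
    h = 4.0 * a / n
    total = 0.0 + 0.0j
    for m in range(n):
        x = -2.0 * a + (m + 0.5) * h
        y_top = 2.0 * b if abs(x) <= a else b * (3.0 - abs(x) / a)
        total += cmath.exp(1j * kx * x) * 2.0 * math.sin(ky * y_top) / ky
    return total * h

kx, ky, a, b = 1.3, 0.7, 1.0, 0.8
print(w_closed(kx, ky, a, b))
print(w_numeric(kx, ky, a, b).real)   # imaginary part vanishes by symmetry
```

A further sanity check is the small-k limit, where W must approach the octagon area 14ab.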
Chapter 5
Simulation of Acoustic Fields
In this chapter, the simulation and imaging of ultrasound waves using UltraSim are discussed.
5.1 UltraSim
UltraSim is a MATLAB-based ultrasound simulation package. It serves as a standard platform for simulation programs concerning ultrasonic imaging systems. It provides a tool for transducer and dome design, and it increases the user's understanding of acoustic wave propagation [15][16].
5.2 Simulation of the acoustic field
The simulator UltraSim uses Huygens' principle and a discrete version of the Rayleigh integral, given in section 2.7, to calculate the impulse response. The arbitrarily shaped transducer is divided into a finite number of points, and the contribution from each source point is summed to find the field at every observation point. The impulse response from one transducer point to one observation point is given by

h(r_n, t) = \frac{O(r_n) e^{j\omega T}}{r_A}

where r_n is the distance from the phase center to the nth transducer point, O is the weighting of the transducer surface, T is the observation time and r_A is the distance from the transducer point to the observation point. The distance r_A is normalized with the reference time t_r, defined by the user. Since the ultrasound wave that contributes to the field at an observation point at the observation time originated at an earlier point in time at the transducer, it is necessary to locate the exact point in time when the pressure wave leaves the transducer surface. This depends on the distance from the transducer point to the observation point, as well as on electronic focusing adjustments.
The observation time is then given by

T = t_{delay} - Time - t_S

where Time is the reference time and t_S is the steering and focusing delay, if electronic focusing is used. The time delay t_{delay} is the propagation delay for the ultrasound wave from the transducer point to the observation point, and is found by:

t_{delay} = \left( (X(i) - X_t(n))^2 + (Y(i) - Y_t(n))^2 + (Z(i) - Z_t(n))^2 \right)^{1/2} / c

where X, Y, Z are the coordinates of the observation point, X_t, Y_t, Z_t are the coordinates of the transducer point and c is the velocity of propagation. The variables imported from UltraSim are listed in table 5.1, and the complete algorithm is presented in figure 5.1.
X, Y, Z    : Coordinate vectors of observation points
Xt, Yt, Zt : Coordinate vectors of transducer points
T          : Time reference vector; length = number of observation points, value = start time reference
SFD        : Vector of steering and focusing delays for each transducer element
TSV        : Vector of time references, of length TLen
O          : Weighting of the transducer surface
ω          : Angular frequency

Table 5.1: Variables used in the response calculation.
This algorithm can also be used to simulate the time-varying field. The field is then observed as a movie, and this mode is therefore called the movie option. The field is calculated separately for a list of time references. In movie mode, the field can be observed under dynamic focusing, i.e. electronic focus changing with time. In that case, the steering and focusing delays for each element at each reference time have to be calculated before the field calculations.
The algorithm is further modified for pulse-echo applications. For ultrasound pulses, the beam exists only during a certain period and, at a single moment, only some of the transducer points, or none, contribute towards the field at a single observation point, depending on the observation time. It is therefore necessary to locate the observation points that are affected by the current transducer point. If the length of the pulse is t_P, then only the points whose time delay t_{delay} fulfils the criterion below are calculated:

Time + t_S - t_P/2 < t_{delay} < Time + t_S + t_P/2

When the above criterion is satisfied, the observation point is affected at that specific time by at least one transducer point. In addition to avoiding errors caused by adding contributions when the pulse does not exist, this method reduces the time spent calculating non-existing waves. Pulse-echo applications sometimes use a weighting function, usually a cosine weight, as the pulse envelope. The weight W is found by

W(r_n) = \cos^2(\pi T / t_P)

In the simulator, the response is normalized by dividing by the number of transducer points.
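The continuous-wave branch of this calculation can be sketched as follows (a simplified Python rendering of the summation above, not UltraSim's actual MATLAB code; the variable names are chosen here for illustration):

```python
import cmath
import math

def cw_field(trans_pts, obs_pts, omega, weights, c=1540.0, t_ref=1e-6,
             time_ref=0.0, delays=None):
    # Sum the contribution O(n)*exp(j*omega*T)/rA over all transducer points
    delays = delays if delays is not None else [0.0] * len(trans_pts)
    field = [0.0 + 0.0j] * len(obs_pts)
    for n, (xt, yt, zt) in enumerate(trans_pts):
        for i, (x, y, z) in enumerate(obs_pts):
            t_delay = math.sqrt((x - xt) ** 2 + (y - yt) ** 2
                                + (z - zt) ** 2) / c
            t = t_delay - time_ref - delays[n]   # observation time T
            r_a = t_delay / t_ref                # distance normalized by t_ref
            field[i] += weights[n] * cmath.exp(1j * omega * t) / r_a
    # Normalize by the number of transducer points
    return [f / len(trans_pts) for f in field]

# One point source at the origin, two on-axis observation points:
h = cw_field([(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.01), (0.0, 0.0, 0.02)],
             omega=2 * math.pi * 3.5e6, weights=[1.0])
print(abs(h[0]) / abs(h[1]))   # amplitude falls off as 1/r, so the ratio is ~2
```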
5.3 Visualization of the acoustic field
The simulator is extended with a program to visualize the volumetric field. The intensity (power) of the field is depicted as variations in color on the chosen planes or slices in the X, Y, Z or X, Y, T volume. The visualization tool can carry out the following actions:

Slice view: The intensity is presented as a color map on the X, Y and Z or X, Y and T planes. The user can view up to 5 slices on each axis.

Range movie: The user can study how the field changes inside a chosen volume along the z-axis. The slices are visualized one after the other along the z-axis. Slices on the X and Y axes can be present in the plot.

Time movie: The user can study how the field evolves over a chosen time, at the chosen slice positions. The images are visualized one after the other in the time interval chosen by the user.
! Primary loop - over time references
For PNo = 1 : TLen                    ! If movie, TLen > 1
    tr = T(1)
    If movie then
        Time = TSV(PNo) * ones(length(X))
    else
        Time = T
    end
    min_elem = 1
    max_elem = #transducer points
    If dynamic focus and movie then
        <Find SFD, min_elem and max_elem for this TSV(PNo)>
    end
    ! Secondary loop - over transducer points
    For n = min_elem : max_elem
        tS = SFD(n)
        For i = 1 : #field points
            tdelay = sqrt((X(i)-Xt(n))^2 + (Y(i)-Yt(n))^2 + (Z(i)-Zt(n))^2)/c
            If pulsed wave then
                tmax = tS + tP/2
                tmin = tS - tP/2
                If tmin <= tdelay - Time(i) <= tmax then
                    t = tdelay - Time(i) - tS
                    W = cos^2(pi*t/tP)
                    rA = tdelay/tr
                    Re(H(i)) += O(n)*W*cos(omega*t)/rA
                    Im(H(i)) += O(n)*W*sin(omega*t)/rA
                end
            else
                t = tdelay - Time(i) - tS
                rA = tdelay/tr
                Re(H(i)) += O(n)*cos(omega*t)/rA
                Im(H(i)) += O(n)*sin(omega*t)/rA
            end
        end
    end
end

Figure 5.1: Main loops of the simulator to calculate the impulse response.
Chapter 6
Scientific Parallel Processing
6.1 Introduction
A parallel computer is a set of processors that are able to work cooperatively to solve a computational problem. The application of parallelism has helped the scientific community in many ways. Large problems can be solved with better approximations and in less time. Some problems that were previously infeasible have been solved effectively using parallel computing methods [29]. Parallel computing is also a cost-effective way to achieve faster computing.
A simple example, shown in figure 6.1, explains the construction of a parallel algorithm. Suppose a user needs to evaluate f(a, b, c, d) = ac + bd. Sequentially, the user first calculates ac, then calculates bd, and finally adds the two values. In a parallel algorithm with two processors, the user calculates ac and bd simultaneously in processes 1 and 2, and then sums the result.
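In Python, the same two-way split can be sketched with a thread pool (an illustrative rendering, not code from the thesis):

```python
from concurrent.futures import ThreadPoolExecutor
import operator

def f(a, b, c, d):
    # Evaluate f(a, b, c, d) = a*c + b*d with the two products in parallel
    with ThreadPoolExecutor(max_workers=2) as pool:
        val1 = pool.submit(operator.mul, a, c)   # "process 1"
        val2 = pool.submit(operator.mul, b, d)   # "process 2"
        return val1.result() + val2.result()

print(f(2, 3, 4, 5))   # 2*4 + 3*5 = 23
```

The final addition still happens in one place, which is typical: a parallel algorithm usually ends with a sequential combining step.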
The application of computer parallelism is not surprising when new trends in applications, computer design and networking are considered. Computers are applied to mathematical problems far more now than at the beginning of the computer era. The need for faster computers is driven by commercial and scientific applications that grow in data and complexity, demanding faster and better computers to match that growth. Building chips that operate faster is neither very feasible nor always desirable. Fundamental results in VLSI (Very Large Scale Integration) design say that it is difficult to build individual
Sequential program:

BEGIN
    val1 := a*c;
    val2 := b*d;
    f := val1 + val2;
END.

Parallel program:

Proc 1: val1 := a*c;    Proc 2: val2 := b*d;
f := val1 + val2;

Figure 6.1: How a simple parallel program executes.
components that operate faster. It is therefore desirable to use more, albeit slower, components.
Finally, developments in networking are changing the face of computing. New networking technology provides higher bandwidth, higher reliability and cheaper services [26]. These trends point to the fact that the potential exists for parallel computing, not only with supercomputers but also with personal computers and networks. Personal computers with multiple processors are already on the market, and future software will be required to exploit the processors inside the computer as well as processors available over networks. This implies concurrency: the capability of algorithms and program structures to perform many operations at once. It is also evident that the number of available processors will increase. Therefore, a software application should be able to adapt itself to an increasing number of processors, as a requirement for its portability. This resilience to increasing processor counts, known as scalability, will be an important factor.
6.2 Parallel machine models
The simplest computer model, known as the von Neumann computer, consists of a central processing unit (CPU) that executes a stored program specifying a sequence of read and write operations on a storage unit (memory). The von Neumann model, shown in figure 6.2, is a simple but robust way to explain the architecture and functions of a computer. This model is used to understand the more abstract notions of a parallel machine.
Figure 6.2: The von Neumann computer: a CPU connected to a memory.
6.2.1 The multicomputer
A multicomputer consists of a number of von Neumann machines, or nodes, connected by a network or a bus. Each node executes its own program, accesses its own memory, and sends and receives data over the network, as shown in figure 6.3. This allows each CPU to access the local memory of other CPU units. In an idealized network, the cost of sending a message between two nodes is independent of both node location and other network traffic, and depends only on message length. It is safe to assume that accessing local memory is less expensive than accessing remote memory. It is therefore desirable that a parallel algorithm accesses local data more frequently than remote data. This property, known as locality, is a fundamental requirement for parallel computing.
Figure 6.3: The multicomputer. Each node is a von Neumann computer, connected to the others by a network.
A similar model, known as the distributed-memory MIMD (Multiple Instruction, Multiple Data) computer, is also commonly used. As in a multicomputer, the MIMD model's memory is distributed among the CPUs. The main difference is that the cost of sending a message between two nodes need not be independent of node location. Two models are shown in figure 6.4. The notation MIMD means that each node or CPU runs its own instruction stream or program, accessing its own local memory. This is in contrast to SIMD (Single Instruction, Multiple Data), which is explained later. MIMD machines include the IBM SP, Intel Paragon and Cray T3D [8].
Figure 6.4: (a) Grid and (b) ring distributed-memory MIMD computers.
6.2.2 The multiprocessor
This type of computer is also known as a shared-memory MIMD computer. In a multiprocessor, all processors or nodes share a common memory area, accessed via a data bus or a hierarchy of buses, as in figure 6.5. To reduce the access times to the memory area, each processor may use a small memory area, known as a cache, to store the most frequently used data from the global memory. Locality is therefore maintained to some degree. Silicon Graphics Challenge computers are of this type [8].
6.2.3 The SIMD model
SIMD, or Single Instruction Multiple Data, computers are perhaps the most important type in scientific parallel computing [20]. Here, all processors execute the same set of instructions or program on different data items. Both hardware and software complexity are reduced, so the model is specifically suited to certain problems. SIMD programs can be run effectively on MIMD systems with little modification [8]. Since the problem presented in this thesis is of a scientific nature, emphasis will be placed on SIMD methods hereafter.
Figure 6.5: A shared-memory multiprocessor: several processors, each with a cache, connected by a bus to a global memory.
6.3 Parallel programming models
6.3.1 Tasks and channels
A parallel algorithm introduces more complexity to a program. To maintain the modularity of a program, it is necessary to introduce new abstractions to the algorithm. A good model is the task-channel method. A task encapsulates a sequential program, local memory and a set of inports and outports to interface with its environment. A parallel computation consists of at least one task. Tasks are executed concurrently, and the number of tasks may vary during execution. In addition to accessing its local memory, a task performs four basic actions: sending messages, receiving messages, creating new tasks and terminating. The inports and outports of tasks are connected by message paths called channels. Channels are dynamic, since they can be created and deleted during task execution.

Tasks interact using channels regardless of task location, and the computational results therefore do not depend on the node. This is known as mapping independence.

The task is a natural building block for a modular program. The tasks are constructed separately and combined to complete the program. Since interactions take place via channels, which are well-defined interfaces, implementations can be changed without modifying other components. Tasks therefore provide a modular and robust way to reduce complexity in a parallel program. It is also desirable to write deterministic code for parallel algorithms, since such code is easy to understand. A program is said to be deterministic if it yields the same output every time it is given the same input.
6.3.2 Message passing
Message passing is the most widely used parallel programming model. It is very similar to the task/channel model, except that messages are sent to a task instead of a channel. Even though the message-passing model can be used to create new tasks or to execute more than one task per processor, in practice it is used with a fixed number of identical tasks created at program startup. This is known as the SPMD (Single Program Multiple Data) model, the software equivalent of SIMD.

Further emphasis is placed only on the message-passing model, since this thesis requires only message-passing techniques.
6.4 The SPMD model
In scientific fields like astronomy, meteorology and signal processing, large data sets in vector or matrix form have to be handled. Using SPMD (Single Program Multiple Data) methods for such data sets is very useful, since partitioned vector or matrix operations can be carried out using the same program on different nodes while retaining regularity. The rest of this chapter looks at the criteria involved in an SPMD model.
6.5 Data distribution
Assume a data set with M items, and P available nodes or processes. This
set has to be partitioned over the processes. Assume that process p, 0 ≤ p < P,
holds Mp items from the data set. Each process is given an index
variable i for its local data array. The range of indices over which the arrays
are declared varies from process to process. In process p, the array indices are
elements of the index set Ip, which usually is an index range 0 ... Ip − 1.
Technically, data distribution partitions the data set over the P nodes, and
it is accomplished by a bijective mapping of the global index m to a pair of
indices (p, i), where p is the process identifier and i is the local index.
Definition 1 A P-fold distribution of an index set M is a bijective map
µ(m) = (p, i) such that:

µ :: M → {(p, i) : 0 ≤ p < P and i ∈ Ip} : m → (p, i)

and

µ⁻¹ :: {(p, i) : 0 ≤ p < P and i ∈ Ip} → M : (p, i) → m
Here, M = 0 ... M−1 is the set of global indices. For a given p ∈ 0 ... P−1,
the subset Mp is the set of global indices mapped to process p by the P-fold
data distribution µ. Therefore

Mp = ⋃_{i ∈ Ip} {µ⁻¹(p, i)}

Because the P subsets Mp with 0 ≤ p < P satisfy

M = ⋃_{p=0}^{P−1} Mp   and   Mp ⋂ Mq = ∅ if p ≠ q,

they define a partition of M. Because µ is a bijective map, the number of
elements of Mp equals the number of elements of Ip.
Practically, the distribution of the data set should be done so that the load per
process is more or less the same. There are two main methods to distribute
data: scatter distribution and linear distribution.
6.5.1 Scatter distribution
The scatter distribution allocates consecutive vector components to consecutive
processes. To distribute the index set M over P processes, µ is set as

µ(m) = (p, i), where p = m mod P and i = ⌊m/P⌋        (6.1)

It then follows that:

Mp = {m : 0 ≤ m < M and m mod P = p}
Ip = ⌊(M + P − p − 1)/P⌋
Ip = {i : 0 ≤ i < Ip}
µ⁻¹(p, i) = iP + p
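Equation 6.1 and its inverse translate directly into code. A minimal sketch (Python; function names are hypothetical):

```python
def scatter_mu(m, P):
    """Scatter distribution (eq. 6.1): global index m -> (process p, local index i)."""
    return (m % P, m // P)

def scatter_mu_inv(p, i, P):
    """Inverse map: (p, i) -> global index m."""
    return i * P + p

# The map is bijective: every global index round-trips through it.
M, P = 11, 4
assert all(scatter_mu_inv(*scatter_mu(m, P), P) == m for m in range(M))
```

With M = 11 and P = 4 this reproduces the scatter column of Table 6.1: process 0 holds the global indices 0, 4 and 8.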
48 Scienti�c Parallel Processing
6.5.2 Linear distribution
The linear distribution allocates consecutive vector components to consecutive
local-array entries. When M is an exact multiple of P, i.e. M = PL:

µ(m) = (p, i), where p = ⌊m/L⌋ and i = m − pL        (6.2)

This is called a perfectly load-balanced linear data distribution, and it satisfies:

Mp = {m : pL ≤ m < (p + 1)L}
Ip = L
Ip = {i : 0 ≤ i < L}
µ⁻¹(p, i) = pL + i
But since M cannot always be expected to be an exact multiple of P, it is
necessary to apply a general load-balanced linear distribution. Given M =
PL + R, where R is the residual, 0 ≤ R < P:

L = ⌊M/P⌋
R = M mod P

µ(m) = (p, i), where p = max(⌊m/(L+1)⌋, ⌊(m − R)/L⌋) and i = m − pL − min(p, R)

This maps L + 1 components to processes 0 through R − 1, and L components
to the rest of the processes. The other distribution quantities become:

Mp = {m : pL + min(p, R) ≤ m < (p + 1)L + min(p + 1, R)}
Ip = ⌊(M + P − p − 1)/P⌋
Ip = {i : 0 ≤ i < Ip}
µ⁻¹(p, i) = pL + min(p, R) + i
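The general load-balanced linear distribution can be sketched in the same way (Python; assumes M ≥ P so that L ≥ 1; names hypothetical):

```python
def linear_mu(m, M, P):
    """General load-balanced linear distribution:
    global index m -> (process p, local index i). Assumes M >= P."""
    L, R = divmod(M, P)                      # L = floor(M/P), R = M mod P
    p = max(m // (L + 1), (m - R) // L)      # max() selects the correct branch
    i = m - p * L - min(p, R)
    return (p, i)

def linear_mu_inv(p, i, M, P):
    """Inverse map: (p, i) -> global index m."""
    L, R = divmod(M, P)
    return p * L + min(p, R) + i

M, P = 11, 4
assert all(linear_mu_inv(*linear_mu(m, M, P), M, P) == m for m in range(M))
```

For M = 11 and P = 4, processes 0 through 2 receive L + 1 = 3 components each and process 3 receives L = 2, exactly as in Table 6.1.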
Table 6.1 shows how a vector of length M = 11 is distributed over P = 4 processes.
There is also another, rarely used, method of distribution, where the components
are mapped by means of a random number generator. This random
distribution method is helpful in cases where load balancing has to be done
statistically, and it is considered only in very special cases.
Process Local i Linear m Scatter m
0 0 0 0
1 1 4
2 2 8
1 0 3 1
1 4 5
2 5 9
2 0 6 2
1 7 6
2 8 10
3 0 9 3
1 10 7
Table 6.1: Linear and Scatter Distribution for M = 11 and P = 4.
6.6 Performance analysis
Performance analysis of a parallel program is the study of the execution
times of computations². Achieving good performance is one of the most
important goals of a parallel computation. However, the search for good
performance has its limitations. A program is the end result of many criteria
besides performance, such as stability, robustness, range and portability. To
achieve good performance, some criteria have to be sacrificed, and that should be
decided case by case. Therefore a generalized performance analysis is difficult.
However, the study of some parameters can give the user some understanding
of the performance in a particular situation. The space of the computation
is simplified so that only one parameter defines the computation. This parameter
is called the study parameter. The other parameters are kept fixed
for simplicity. Failing that, they are separated into fixed parameters and
dependent parameters.
²A computation is a program with particular input executed on a given computer at a certain moment in time.
6.6.1 Speed-up
Since a parallel computer differs from a sequential computer by the number
of nodes (a sequential computer has only one node), the number of nodes
p is taken as the study parameter. It can be assumed that the other
parameters are fixed. Let Tp be the execution time of a p-node computation
and T*p be the best p-node time. When p = 1, the computer is sequential and
T*1 is the best sequential execution time. For these parameters, the following
theorem holds:

Theorem 1 On a given multicomputer and for a fixed problem, the best p-node
computation is at most p times faster than the best sequential execution
time, or

T*p ≥ T*1/p
The proof is by contradiction and can be found in [29]. The following
definitions can now be considered:

Definition 2 The speed-up of a p-node computation with execution time Tp
is given by:

Sp = T*1/Tp        (6.3)

where T*1 is the best sequential time obtained for the same problem on the same
multicomputer.

Definition 3 The efficiency of a p-node computation with speed-up Sp is
given by:

ηp = Sp/p
A good, though not exact, understanding of the performance can be obtained
by evaluating these parameters.
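For example (illustrative numbers only): if the best sequential time is 100 s and a 5-node computation takes 25 s, the two definitions give

```python
def speedup(T1_best, Tp):
    """Definition 2: S_p = T*_1 / T_p."""
    return T1_best / Tp

def efficiency(Sp, p):
    """Definition 3: eta_p = S_p / p."""
    return Sp / p

Sp = speedup(100.0, 25.0)   # 4.0: the 5 nodes give a 4-fold speed-up
eta = efficiency(Sp, 5)     # 0.8: each node is used at 80% efficiency
```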
6.6.2 Performance modelling
For a theoretical analysis of performance, a model computer is assumed. In a
sequential computer, the execution time is considered proportional to the
number of flops in the computation. If the time used to complete a flop is
τA, the execution time is nτA, where n is the number of flops. In a concurrent
computer, the communication must also be considered. The message
exchange time, denoted by τC, is the time it takes for one process to send a
message and another to receive it, while another message is exchanged in the
reverse direction. The message exchange time follows a simple linear equation:

τC(L) = τS + βL        (6.4)

where τS is the start-up time or latency, L is the length of the message and
β is the inverse of the network bandwidth. For long messages, τC(L) can be
approximated by LτC(1), where τC(1) is the time taken to send a message of
length 1.
These parameters and definitions are adequate for a very simplified
performance analysis. The execution time for a parallel program on P nodes,
T(P), is therefore given by

T(P) = τC(L) + nτA        (6.5)

There are also commercial packages that measure performance quite accurately.
Many study the performance via profiles, counters and traces embedded
by the user in the source code. Commonly used tools include ParaGraph,
Upshot, Pablo, Gauge and AIMS.
6.7 Parallel routine libraries
In the message-passing library approach to parallel programming, a collection
of processes executes programs written in a standard sequential language,
augmented with calls to a library of functions for sending and receiving messages
and data. There are many such libraries available, for example MPI,
PVM, p4, Express and PARMACS. A parallel program uses at least one of
these libraries to manage the communication among tasks.
• MPI
MPI, or Message Passing Interface, is a commonly used library which
provides a standard for message passing. The MPI programming model
consists of one or more processes that communicate by using MPI library
routines to send and receive messages and data to and from other
processes. Commonly these processes are created at the initialization of the
program, and one process is created per processor. It is not necessary
that all created processes execute the same program. Therefore the MPI
model is referred to as multiple program multiple data (MPMD), but
MPI can also be used to write single program multiple data (SPMD)
programs. MPI can be used on multiprocessor computers.
• PVM
PVM, or Parallel Virtual Machine, is a software system that permits a
network of heterogeneous Unix computers to be used as a single large
parallel computer. Under PVM, a user-defined collection of serial, parallel
and vector computers appears as one large distributed-memory
computer, which is the reason PVM is known as the poor man's
parallel computer. The computer acting as the host, from which
all tasks are spawned, can be any computer in the network. PVM
supports heterogeneity at the application, machine and network level.
In other words, PVM allows application tasks to exploit the architecture
best suited to their solution. PVM handles all data conversion
that may be required if two computers use different integer or floating-point
representations, and it allows the virtual machine to be
interconnected by a variety of different networks.
The PVM system is composed of two parts. The first is a daemon that
resides on all the computers making up the virtual machine. Before the
user executes the parallel programs, the daemon has to be activated
from the Unix prompt. The second part is a library of functions which,
among other things, carries out the spawning of tasks and the sending and
receiving of messages. This library is linked with the user's programs. PVM
can also be used on multiprocessor computers.
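The spawn/send/receive pattern that these libraries provide can be mimicked in a few lines with threads and thread-safe queues (a toy model only — real PVM or MPI tasks are separate processes, possibly on different machines; all names here are hypothetical):

```python
import threading
import queue

def slave(inbox, outbox):
    """A slave task: block on a receive, compute, send the result back."""
    chunk = inbox.get()            # receive a data chunk from the master
    outbox.put(sum(chunk))         # send the partial result to the master

# Master: "spawn" P slaves, scatter the data, then gather the results.
data, P = list(range(100)), 4
inboxes = [queue.Queue() for _ in range(P)]
results = queue.Queue()
slaves = [threading.Thread(target=slave, args=(inboxes[p], results))
          for p in range(P)]
for s in slaves:
    s.start()
for p in range(P):
    inboxes[p].put(data[p::P])     # scatter distribution of the vector
total = sum(results.get() for _ in range(P))
for s in slaves:
    s.join()
```

The blocking `get` plays the role of a receive call, and `put` the role of a send; a real library adds task spawning across machines and data conversion on top of this pattern.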
Chapter 7
Parallel Response Calculation
7.1 Introduction
In the previous chapter, the basics of scientific parallel programming were
discussed. Now a parallel algorithm for the response calculation problem is
presented.
7.2 MATLAB and MEX-programming
MATLAB uses a fourth-generation language for programming. As can be
expected from a fourth-generation language, MATLAB code requires heavy
doses of CPU time, greatly increasing the computational time. However,
MATLAB provides its users with an external programming tool which can be
used to activate routines written in C, C++ and Fortran. These dynamically
linked subroutines are written using the so-called MEX programming techniques
and are compiled with a tool provided with MATLAB, which gives the result
an extension in accordance with the architecture of the host computer¹.
Even though there are many defined data types in its interactive environment,
MATLAB works with only one object type: the Matrix. Some of the variables
in this object are given in Table 7.1. MATLAB stores matrices in columnwise
order in a vector. That is, the vector array pr contains the values of the
first column, then the second column, and so on. A MEX program uses and
accesses this object type via a Gateway routine. By using this storage method,
data distribution for parallel processing can be carried out with ease.
¹E.g. <test.c> compiled gives <test.mex4> for Sun4/SPARC, <test.mexsg> for SGI
Name : Name of the variable
M : The number of rows in the Matrix
N : The number of columns in the Matrix
DisplayMode : Indicates whether the values are numeric or ASCII
Storage : Indicates whether the Matrix is full or sparse
pr : Pointer to an array of length NM containing the real
part of the Matrix
pi : As above, containing the imaginary part of the Matrix
Table 7.1: A selected list of variables in the object type Matrix.
A MEX program has two main parts: the Gateway routine and the Computation
routine. The Gateway routine starts with the following parameters:

void mexFunction(
    int nlhs, Matrix *plhs[],
    int nrhs, Matrix *prhs[])

The variables nlhs and nrhs are integer variables containing the number of
left-hand-side and right-hand-side variables, respectively, in the MATLAB
routine. The pointers plhs and prhs point to two arrays of MATLAB matrices
on the left-hand side and right-hand side, respectively, of the MATLAB
routine.
The computation routine contains the mathematical calculations that use
the variables in the Matrix objects.
7.3 Data distribution of observational segments
As mentioned earlier, MATLAB uses columnwise vectorization to break down
a matrix. For a planar observation, the X, Y, Z and T coordinates are placed
in four matrices, where each point of observation is placed at the same index
in all four. The pr variable of the Matrix object X contains the columnwise
vectorized values. This vector is extracted in the MEX Gateway routine,
distributed using the general load-balanced linear distribution in the
Computation routine, and sent to the invoked tasks, as shown in figure 7.1.
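The column-major storage and the subsequent distribution can be sketched as follows (a Python stand-in for the C MEX code; helper names are hypothetical):

```python
def columnwise(matrix):
    """Flatten a matrix column by column, the way MATLAB stores pr."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]

def linear_chunks(vec, P):
    """General load-balanced linear distribution: the first R processes
    get one extra component (L = len(vec)//P, R = len(vec) % P)."""
    L, R = divmod(len(vec), P)
    chunks, start = [], 0
    for p in range(P):
        size = L + (1 if p < R else 0)
        chunks.append(vec[start:start + size])
        start += size
    return chunks

# A 6x5 matrix holding its own column-major index, distributed to 3 tasks.
A = [[r + 6 * c for c in range(5)] for r in range(6)]
pr = columnwise(A)                 # [0, 1, 2, ..., 29]
chunks = linear_chunks(pr, 3)      # three slices of 10 components each
```

Because the matrix is already stored columnwise, each task simply receives one contiguous slice of pr, which is what makes the distribution cheap.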
Figure 7.1: How a 6 × 5 matrix is distributed to 3 processes.
When simulating volumetric observations, it is necessary to keep track of
many planes. Instead of 3D matrices, MATLAB maps layers of 2D matrices
above each other into one long 2D matrix. How a 3D observational data
matrix is converted is shown in figure 7.2.
7.4 Parallel routines
Since the parallel routines have to be invoked from the MATLAB environment,
PVM was chosen as the parallel library. The master parallel program, with
MEX extensions, is invoked from the MATLAB environment. This program
then invokes the slave programs on a user-specified number of nodes and
distributes data to them. The final computed values are then sent back to
the master program, which places the results in MATLAB. The complete
programs can be found in Appendix C.
For time-varying simulations, only the data that changes from one reference
time to another is sent to the nodes. The other data, for example the observation
point coordinates and transducer point coordinates, are reused from the first
time reference.
Figures 7.3 and 7.4 show the master and slave routines respectively.
Figure 7.2: Conversion of volumetric data to matrix form.
To keep the indices simple, no local index is specified. The start index and
the end index for each node are sent to each node, and these two indices
are returned to the master routine together with the results. This way, the
master program can update the final response vector as soon as data arrives
from any node.
7.5 Performance analysis
The performance is analysed here for a simple simulation model. First, the
pressure wave is taken as continuous, so that the execution does not include
the pulse wave calculations and includes all transducer points. Further, a
perfect load balance is assumed, so that the number of observation points
in each node is equal. Finally, the simulation is assumed to be for one time
reference.
Assume there are N transducer points and M_P observation points per node.
From figure 7.4 it can be seen that there are 11 flops in the loop If <Pulse
wave> then ... else ... end and 8 flops in the tdelay calculation. Over all
M_P points, the execution time for the loop For i ∈ Ip ... end is therefore
19·M_P·τA. Since there are N transducer points, the execution time for the
loop For n = min_elem : max_elem is 19N·M_P·τA. From equation 6.5 on
page 51, the execution time can be approximated as

T(P) ≈ τC(L) + 19N·M_P·τA

where τC(L) is the time taken to send a message of length L. The main
program, given in appendix C and simplified in figure 7.3, sends altogether P
Program RESPONSE
declare
  ! Variables imported from ULTRASIM. See figure 5.1 on page 37
  ! This pseudo code shows the method of data distribution and communication
  ! lstart, lend - the start and end indices of the distributed data vector
  ! ElPt - number of horizontal and vertical element points
initially
  ! Enrolling in PVM
  my_tid = PVM_tid                      ! Get Task ID for MASTER
  broadcast ppec_slave to 0...P−1 ‖ p
assign
  For PNo = 1 : TLen                    ! If movie, TLen > 1
    tr = T(1)
    If <movie> then
      Time = TSV(PNo) · 1_length(X)
    else
      Time = Tend
    end
    min_elem = 1
    max_elem = #transducer elements
    If <Dynamic focus and movie> then
      <Find SFD, min_elem, max_elem for this PNo>
    end
    For i ∈ Ip
      (Xp, Yp, Zp, Timep)[i] = (X, Y, Z, Time)_µ⁻¹(p,i)
    end
    For p = 0 : P − 1
      If PNo == 1 then
        send lstart, lend, ElPt, PNo, min_elem, max_elem, SFD, Timep, Xp, Yp, Zp, Xt, Yt, Zt, c, tref, O, ... to p
      else
        send lstart, lend, ElPt, PNo, min_elem, max_elem, SFD, Timep to p
      end
    end
    For p ≤ P
      receive Re(Hp), Im(Hp), lstart, lend from p
      [Re(H), Im(H)] = [Re(Hp), Im(Hp)]_µ(m)
    end
  end
Figure 7.3: Master routine
0...P−1 ‖ p program ppec_slave
declare
  ! Variables received from MASTER
initially
  ! Enroll in PVM
  my_tid = PVM_tid                      ! Get Task ID for this slave
  ptid = PVM_parent                     ! Get Task ID for master
  receive ..., Xp, Yp, Zp, Timep, Xt, Yt, Zt, c, tref, O, SFD from MASTER
assign
  For n = min_elem : max_elem
    tS = SFD(n)
    For i ∈ Ip
      tdelay = √((Xp(i)−Xt(n))² + (Yp(i)−Yt(n))² + (Zp(i)−Zt(n))²)/c
      If <Pulse wave> then
        tmax = tS + tP/2
        tmin = tS − tP/2
        If tmin ≤ tdelay − Timep(i) ≤ tmax then
          t = tdelay − Timep(i) − tS
          W = cos²(πt/tP)
          rA = tdelay/tr
          Re(Hp(i)) += O(n)·W·cos(ωt)/rA
          Im(Hp(i)) += O(n)·W·sin(ωt)/rA
        end
      else
        t = tdelay − Timep(i) − tS
        rA = tdelay/tr
        Re(Hp(i)) += O(n)·cos(ωt)/rA
        Im(Hp(i)) += O(n)·sin(ωt)/rA
      end
    end
  end
  send Re(Hp), Im(Hp), lstart, lend to MAIN
Figure 7.4: Slave routine
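The continuous-wave branch of the slave routine can be transcribed almost line for line (a pure-Python sketch of the inner loops of figure 7.4, with hypothetical names; the pulse-wave branch and the PVM communication are omitted):

```python
import math

def slave_response(src, obs, O, c, tr, tS, omega, Time=0.0):
    """Accumulate the real/imaginary response at every observation point
    over all transducer points, continuous-wave case (cf. figure 7.4)."""
    ReH, ImH = [0.0] * len(obs), [0.0] * len(obs)
    for n, (xt, yt, zt) in enumerate(src):
        for i, (x, y, z) in enumerate(obs):
            tdelay = math.sqrt((x - xt)**2 + (y - yt)**2 + (z - zt)**2) / c
            t = tdelay - Time - tS
            rA = tdelay / tr                   # spherical spreading attenuation
            ReH[i] += O[n] * math.cos(omega * t) / rA
            ImH[i] += O[n] * math.sin(omega * t) / rA
    return ReH, ImH

# One unit-strength transducer point at the origin, one observation point
# two length units away: tdelay = 1, and with omega = 0 the response is 1/rA.
ReH, ImH = slave_response([(0, 0, 0)], [(0, 0, 2)], [1.0], 2.0, 1.0, 0.0, 0.0)
```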
messages of length 13 + 3N + 4M_P. After the calculations, the main program
receives P messages of length 2M_P + 2. If a negligible latency is assumed,
T(P) and the special case T(1) can be found:

T(P) ≈ τC((15 + 3N + 6M_P)P) + 19N·M_P·τA
     = P(15 + 3N + 6M_P)·τC(1) + 19N·M_P·τA
T(1) ≈ 19NM·τA

The speed-up, given by SP = T(1)/T(P), becomes:

SP = 19NM·τA / [P(15 + 3N + 6M_P)·τC(1) + 19N·M_P·τA]

Since M_P = M/P,

SP = 1 / [(15P + 3NP + 6M)/(19NM) · τC(1)/τA + 1/P]
   = P / [P(15P + 3NP + 6M)/(19NM) · τC(1)/τA + 1]        (7.1)

When NM ≫ P, SP → P. This is usually the case, since large sets of
observation points are usually used.
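Equation 7.1 is easy to evaluate numerically; a small sketch (Python; the machine ratio τC(1)/τA is a free parameter):

```python
def predicted_speedup(P, N, M, tau_ratio):
    """Equation (7.1): predicted speed-up on P nodes for N transducer
    points, M observation points and tau_ratio = tauC(1)/tauA."""
    overhead = P * (15 * P + 3 * N * P + 6 * M) / (19.0 * N * M)
    return P / (overhead * tau_ratio + 1.0)

# The larger the product N*M, the closer S_P comes to the ideal value P.
small = predicted_speedup(8, 10, 100, 10.0)
large = predicted_speedup(8, 1000, 300000, 10.0)
```

With the illustrative ratio of 10 used here, the small problem achieves only a fraction of the ideal 8-fold speed-up, while the large one comes close to it.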
Chapter 8
Simulation Results
8.1 The Octagonal Transducer
The aperture smoothing function found in chapter 4 is visualized here. Figures
8.1(a) and 8.1(b) show the normalized¹ aperture smoothing functions for
a = b = 1λ and a = b = 3λ respectively, over {kx, ky} in the range [−5 : 5].
Figures 8.2(a) and 8.2(b) show the normalized aperture smoothing functions
for a = 2λ, b = 1λ and a = 5λ, b = 1λ over the same range.
These figures illustrate the manner in which the lengths a and b affect the
aperture smoothing function. As the lengths increase, the visible range increases.
The number of zeros and sidelobes also increases, which in turn results in a
narrower main lobe.
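The aperture smoothing function is the two-dimensional Fourier transform of the aperture weighting, so surfaces like those in figures 8.1 and 8.2 can be approximated numerically. The sketch below (Python, midpoint quadrature) uses a corner-cut rectangle as a stand-in octagon; the exact octagon geometry derived in chapter 4 is not reproduced here:

```python
import math

def aperture_smoothing(kx, ky, inside, half_x, half_y, n=60):
    """|W(kx, ky)|: magnitude of the Fourier transform of the aperture
    indicator function, normalized by the aperture area."""
    dx, dy = 2 * half_x / n, 2 * half_y / n
    re = im = area = 0.0
    for ix in range(n):
        x = -half_x + (ix + 0.5) * dx
        for iy in range(n):
            y = -half_y + (iy + 0.5) * dy
            if inside(x, y):
                area += dx * dy
                re += math.cos(kx * x + ky * y) * dx * dy
                im -= math.sin(kx * x + ky * y) * dx * dy
    return math.hypot(re, im) / area

a = b = 1.0   # half-widths; an assumed stand-in octagon with cut corners
octagon = lambda x, y: abs(x) <= a and abs(y) <= b and abs(x)/a + abs(y)/b <= 1.5
peak = aperture_smoothing(0.0, 0.0, octagon, a, b)   # main-lobe peak, = 1
```

Normalization by the aperture area makes the peak at (kx, ky) = (0, 0) exactly 1, matching the normalization used in the figures.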
To further illustrate the qualities of the octagonal aperture, the directivity
pattern can be considered. Figures 8.3 and 8.4 show the pattern for θ =
0, 30, 60, 90° for the lengths a = b = 1λ and a = 5λ, b = 1λ respectively.
These figures suggest that this aperture produces a rectangular-aperture-type
beam, broad in the θ = 90° plane and narrow in the θ = 0° plane.
Compared to an elliptical aperture of the same dimensions, however, the
main lobe is narrower [30]. Another interesting feature can be noticed
for the symmetric aperture (i.e. a = b = 1λ) in figure 8.3: the mainlobe
widths for all angles are approximately equal, not unlike a circular aperture.
Figure 8.5 shows the patterns of octagonal and rectangular apertures with
symmetric dimensions, for the angles θ = 0, 30, 45, 60°, plotted together. The
sidelobe levels of the octagonal aperture are lower and the mainlobe is wider
¹Normalization is done with respect to the aperture area
Figure 8.1: Aperture smoothing function of the octagonal transducer with (a)
a = b = 1λ and (b) a = b = 3λ
than for a rectangular aperture at θ = 0. Figure 8.6 compares an octagonal
aperture with a = 5λ, b = 1λ with a rectangular aperture of similar
dimensions. In this example, the levels of the first sidelobes for the angles
θ = 0, 30, 60, 90° are lower for the octagonal aperture.
Figure 8.2: Aperture smoothing function of the octagonal transducer with (a)
a = 2λ, b = 1λ and (b) a = 5λ, b = 1λ
8.2 Parallel programming results
In this section, the results from executing an example configuration on the
parallel system at USIT are discussed. Since the parallel algorithm presented
Figure 8.3: Directivity pattern of an octagonal transducer with a = b = 1λ for angles θ = 0, 30, 60, 90°
Figure 8.4: Directivity pattern of an octagonal transducer with a = 5λ, b = 1λ for angles θ = 0, 30, 60, 90°
Figure 8.5: Directivity patterns of octagonal and rectangular apertures with equal widths and heights compared.
Figure 8.6: Directivity pattern of an octagonal aperture with a = 5λ, b = 1λ compared with a rectangular aperture with the same dimensions.
gives better performance for large sets of observational data, as explained in
section 7.5, such a large observation volume is used.
The aperture is rectangular in shape, 10 mm in the azimuth direction (x-axis)
and 6 mm in the elevation direction (y-axis). There are 48 elements in the
azimuth direction and 24 in the elevation direction, which gives 1152 elements
in total. There is no curvature in the aperture, and the electronic focus is set
to 20 mm on the normal axis. A continuous beam is assumed, and the speed
of sound is taken as 1540 m/s.
The observation area is the volume [−10 : 10, −10 : 10, 10 : 22] with 4
field points per mm. This volume therefore contains 321,489 points. This
configuration is then executed on the parallel computer network with a varying
number of nodes. Table 8.1 presents the run-time results.
Nodes: 1 2 5 10 20 30
1st 1605.53 1289.40 545.43 195.46 114.97 80.63
2nd 2154.83 900.23 370.95 232.59 103.01 78.28
3rd 1802.25 843.06 382.63 205.82 102.10 83.77
4th 1981.00 938.32 556.96 253.58 124.61 82.69
5th 2038.43 832.96 386.46 232.02 103.02 84.62
6th 2059.42 840.96 344.31 199.27 111.43 85.76
7th 1861.31 911.35 380.66 245.91 113.24 80.62
8th 1593.46 1003.22 590.12 190.10 102.04 77.79
9th 2045.46 906.21 369.57 290.26 123.05 77.65
10th 1712.89 906.59 653.64 179.15 103.83 82.21
Average 1885.46 937.23 458.07 222.42 110.13 81.40
Best 1593.46 832.96 344.31 179.15 102.04 77.65
Sp 1.00 1.91 4.63 8.89 15.61 20.52
Table 8.1: Time Consumption for parallel processing. Bold face shows the
best time achieved.
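The Sp row of Table 8.1 follows directly from the best times, and the size of the observation volume can be checked the same way (Python; values copied from the table and the section text, times presumably in seconds):

```python
# Best times from Table 8.1, keyed by the number of nodes.
best = {1: 1593.46, 2: 832.96, 5: 344.31, 10: 179.15, 20: 102.04, 30: 77.65}

# Definition 2: S_p = best sequential time / best p-node time.
speedup = {p: best[1] / t for p, t in best.items()}

# Observation volume [-10:10, -10:10, 10:22] at 4 field points per mm:
# 81 points per 20 mm axis, 49 points over the 12 mm depth (endpoints included).
n_points = 81 * 81 * 49
```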
The single-node calculations were done using the sequential program, so that
no communication was necessary. However, the single-node simulations were
done on the interactive node, which is heavily loaded. Simulations carried
out as batch jobs will give better results for both the parallel and the
sequential programs.
The best value for each number of nodes is used to study the performance
parameters. From the values of Sp, it is evident that the algorithm provides
adequate speed-up, with some deviation from the ideal SP, which is equal to
the number of nodes, as explained with equation 6.3. The averages can be
used to see the relative time consumption of the calculations. At the time of
execution, PVM was configured with the 32 available nodes. There were 15
users logged in, and 12-16 jobs were running on 18 servers during the execution.
8.3 3D plot results
The simulations were carried out using rectangular, elliptical, super-elliptical
and octagonal arrays. All arrays consist of 48 elements in the x direction and
24 elements in the y direction. Such large numbers were used so that the
transducers approximate continuous apertures. A continuous ultrasound wave
with frequency 2 MHz was used, while the aperture was electronically focused
to 20 mm from the transducer surface. The observation field was set to the
[−10 : 10, −10 : 10, 1 : 22] volume segment, and each millimeter was discretized
into two portions.
8.3.1 Rectangular aperture
Figures 8.7 and 8.8 show the field created by a rectangular aperture with
dimensions 10 mm length and 6 mm width. Figure 8.7 looks at the fields in
the z-planes 1 mm and 20 mm from the surface, the latter containing the
focal point. Figure 8.8 shows x-plane and y-plane views of the field. The
mainlobe and the sidelobes, as sketched in figure 3.8 on page 21, and the way
the beam narrows in the transition field can be noticed.
8.3.2 Elliptical aperture
Figures 8.9 and 8.10 show the field created by an elliptical aperture with a
10 mm long axis and a 6 mm short axis. The plots presented correspond to
the slice positions used for the rectangular aperture.
8.3.3 Super-elliptical aperture
The super-elliptical² aperture with a 10 mm long axis and a 6 mm short axis
is tested here. Figures 8.11 and 8.12 show the same slice positions as used
for the previous apertures.
²A super-ellipse has the equation (x/a)^2.5 + (y/b)^2.5 = 1.
[ULTRASIM plot header, common to figures 8.7-8.14: ARRAY-RESPONSE, Reference=12.99 [us], Theta=0 [deg], Phi=0 [deg], N=48, M=1, f=2 [MHz], pitch=0.2706, osc=Inf, no apodization, azimuth focus=20 [mm], aperture (AZ) d=10 [mm]]
Figure 8.7: Slices on the z-axis, (a) near the surface and (b) at the focus of the
rectangular aperture.
Figure 8.8: Slices at 0 on (a) the x-axis and (b) the y-axis for the rectangular
aperture.
Figure 8.9: Slices on the z-axis, (a) near the surface and (b) at the focus of
the elliptical aperture.
Figure 8.10: Slices at 0 on (a) the x-axis and (b) the y-axis for the elliptical
aperture.
Figure 8.11: Slices on the z-axis, (a) near the surface and (b) at the focus of
the super-elliptical aperture.
Figure 8.12: Slices at 0 on (a) the x-axis and (b) the y-axis for the super-
elliptical aperture.
8.3.4 Octagonal aperture
The octagonal aperture used has the same width and length as the other
tested apertures. The slices in figures 8.13 and 8.14 are the same as in the
previous plots.
Figure 8.13: Slices at 0 on (a) the x-axis and (b) the y-axis for the octagonal
aperture.
[Two 3D slice plots of the array response: Theta=0°, Phi=0°, N=48, M=1, f=2 MHz, pitch=0.2706, no apodization; axes are range in mm (azimuth focus 20 mm, elevation focus 0 mm), azimuth in mm and elevation in mm.]
Figure 8.14: Slices at 0 on (a) the x-axis and (b) the y-axis for the elliptical
aperture.
8.3.5 Range movie and time movie
The range movie is used when the user needs to study the field along the
z-axis. Some examples can be found on the web site [3].
The time movie is used to study how the field develops with time in any
slice along any axis. This is useful, for instance, for pulsed-wave simulations,
since the user can study the field propagation over time. Figure 8.15 shows
the development of such a pulsed wave. The aperture is the rectangular
aperture used earlier. The number of pulses is set to six, with cosine pulse
weighting. The observation volume is [-10:10, -10:10, 5:22]. Plots are provided
for four equally spaced times in the interval during which the pulse travels
from 5 mm to 20 mm. This example can be found on the web site [3].
[Four 3D snapshot plots of the array response at reference times 3.25, 6.5, 9.74 and 12.99 µs: Theta=0°, Phi=0°, N=48, M=1, f=2 MHz, pitch=0.2706, osc=6, no apodization; axes are range 5-25 mm (azimuth focus 20 mm, elevation focus 0 mm), azimuth in mm and elevation in mm.]
Figure 8.15: Snapshots of a pulsed wave propagating through the slices x = 0, y = 0 and z = 20.
Chapter 9
Conclusion and Summary
9.1 The octagonal transducer
In the previous chapter, the octagonal transducer was extensively tested
using the somewhat complicated equation found for its aperture function.
The results indicate that this footprint does indeed show a slight reduction
in the first sidelobe level compared with an equal-sized rectangular aperture.
It could also serve as an alternative model for circular and elliptical apertures,
as seen in the plots of the aperture smoothing function in figures 8.1 and 8.2.
9.2 The parallel algorithm
Extensive computational results for the parallel algorithm were given in the
previous chapter. The run-time results in table 8.1 indicate that the algorithm
provides good speed-up. However, as the number of nodes increases, the
speed-up falls away from the perfect value, which equals the number of nodes.
The efficiency, expressed as a percentage in table 9.1, also shows the deviation
from perfect values.
nodes:  1      2      5      10     20      30
ηp:     100%   95.5%  92.2%  88.9%  78.05%  68.04%

Table 9.1: Efficiency ηp for the parallel algorithm.
This is a natural development, since the speed-up parameters in section 6.6.1
are for a model computer. In reality, traffic congestion, process sharing and
many other factors disturb the perfect communication and processing model.
In addition, the node on which the master program was executed was the
interactive node of the super computing network. This tends to increase the
computing time, since its processing power is shared by many logged-in users.
A computation done on another node showed a dramatic improvement
in execution times.
The deterioration of speed-up and efficiency with an increasing number of
nodes can be explained by the speed-up equation 7.1 on page 59: the speed-up
Sp → p when NM ≫ P. With more nodes, the number of data points
per node decreases. For larger sets of observation points, better speed-up
and efficiency can be expected.
Two methods can be suggested to decrease the time consumption further.
First, as explained above, a node other than the interactive node can be
used to run the ULTRASIM command window. Second, the computations
can be carried out as a batch job; details are given in the user guide in
appendix A. Both methods apply to the super computing network at USIT.
9.3 Software and hardware
The software package UltraSim was run on MATLAB version 4.2c. PVM
version 3 was used as the parallel communication library. CMEX version 3.5
was used to compile the sequential C program and the parallel master program
for MATLAB. This document was created using LaTeX 2ε. The research
article in [4] was created using FrameMaker 4, and the simulation videos were
created with visualization tools in the USIT Scientific Visualization Lab [27].
Testing of the sequential programs was carried out on SUN computers running
SunOS 5.5.1 and SGI computers running IRIX 5.3. Testing of the parallel
routines was done on the Super Computing network at USIT, whose
computers are IBM RS6000 super computers with DEC and HP tape and
disk stations. More information about the super computing network can be
found on the WWW site [28].
Bibliography
[1] Angelsen B.A.J., "Waves, Signals and Signal Processing in Medical Ultrasonics", Norwegian University of Science and Technology, Norway, April 1996.
[2] Berg R.E., Stork D.G., "The Physics of Sound", Prentice Hall, 2nd edition, 1995, ISBN 0-13-183047-3.
[3] Epasinghe K., "Project Thesis of Kapila Epasinghe", WWW: http://www.ifi.uio.no/~kapilae/thesis, August 1997.
[4] Epasinghe K., "Simulation of 3D acoustic fields on a concurrent computer", Proc. Nordic Symposium on Physical Acoustics, February 1996.
[5] Erstad J.O., Holm S., "An Approach to the Design of Sparse Array Systems", Department of Informatics, University of Oslo, November 1994.
[6] Erstad J.O., "Design of sparse and non-equally spaced arrays for medical ultrasound", Master's Thesis, November 1994.
[7] Fink M.A., Cardoso J., "Diffraction Effects in Pulse-Echo Measurement", IEEE Trans. Sonics Ultrason., vol. SU-31, pp. 313-329, July 1984.
[8] Foster I., "Designing and Building Parallel Programs", Addison-Wesley Publishing Company, 1994, ISBN 0-201-57594-9.
[9] Geist A., Beguelin A., Dongarra J., Jiang W., Manchek R., Sunderam V., "PVM 3 User's Guide and Reference Manual", Oak Ridge National Laboratory, 1994.
[10] Goldberg B.B., Kimmelman B.A., "Medical Diagnostic Ultrasound: A Retrospective on its 40th Anniversary", Kodak, Rochester, New York.
[11] Holm S., "Digital Beamforming in Ultrasound Imaging", Department of Informatics, University of Oslo.
[12] Holm S., "Elevation focusing properties of superelliptical probe", Department of Informatics, University of Oslo, Norway, September 1995.
[13] Holm S., "Medical Ultrasound Transducers and Beamforming", Proc. 15th Congress on Acoustics, June 1995.
[14] Holm S., "Simulation of Acoustic Fields from Medical Ultrasound Transducers of Arbitrary Shape", Proc. Nordic Symposium on Physical Acoustics, January 1995.
[15] Holm S., Teigen F., Odegaard L., Berre V., Erstad J.O., "ULTRASIM User's Manual", Department of Informatics, University of Oslo, Version 2.0, 1996, ISBN 82-7368-133-5.
[16] Holm S., Odegaard L., Halvorsen E., Elgetun B., Teigen F., "ULTRASIM User's Manual for Advanced Functions", Department of Informatics, University of Oslo, Version 2.0, 1996, ISBN 82-7368-134-3.
[17] Hunt J.W., Arditi M., Foster F.S., "Ultrasound Transducers for Pulse-Echo Medical Imaging", IEEE Trans. Biomedical Engineering, vol. BME-30, no. 8, August 1983.
[18] Johnson D.H., Dudgeon D.E., "Array Signal Processing", Prentice Hall, 1993, ISBN 0-13-048513-6.
[19] Kino G.S., "Acoustic Waves: Devices, Imaging and Analog Signal Processing", Prentice Hall, Englewood Cliffs, 1987.
[20] Lewis T.G., El-Rewini H., "Introduction to Parallel Computing", Prentice Hall, 1992, ISBN 0-13-498924-4.
[21] Odegaard L., Holm S., Teigen F., Keveland T., "Acoustic Field Simulation for Arbitrarily Shaped Transducers in a Stratified Medium", Proc. IEEE Ultrasonics Symposium, November 1994.
[22] Odegaard L., Holm S., Torp H., "Phase Aberration Correction Applied to Annular Array Transducers when Focusing Through a Stratified Medium", Proc. IEEE Ultrasonics Symposium, 1993.
[23] Rossing T.D., Fletcher N.H., "Principles of Vibration and Sound", Springer-Verlag, 1994, ISBN 0-387-94304-8.
[24] Shung K.K., Smith M.B., Tsui B., "Principles of Medical Imaging", Academic Press, Inc., 1992, ISBN 0-12-640970-6.
[25] Simonsen H.H., "Parallellprogramming på klyngen ved UiO" (Parallel programming on the cluster at UiO), USIT, 1994.
[26] Tanenbaum A.S., "Computer Networks", Prentice Hall, 1996, ISBN 0-13-394248-1.
[27] USIT, "Super computing at University of Oslo - Visualization", WWW: http://hpc/OsloCluster/SciVis/, July 1997.
[28] USIT, "Super computing at University of Oslo - Hardware", WWW: http://hpc/OsloCluster/Hardware, July 1997.
[29] Van De Velde E.F., "Concurrent Scientific Computing", Springer-Verlag, 1994, ISBN 0-387-94195-9.
[30] Wu L., Zielinski A., "A Novel Array of Elliptic Ring Radiators", IEEE Journal of Oceanic Engineering, vol. 18, no. 3, July 1993.
Appendix A
User Guide to Parallel Execution
This appendix describes the implementation of the external routines for
ULTRASIM. The external routines are implemented for the simulation of
acoustic fields in a 2D plane and a 3D volume, and were introduced primarily
to reduce the computation time.
A.1 C-MEX routine
A.1.1 Compilation of C-MEX Routine
The C-MEX function pec.c is placed in the ULTRASIMHOME/pe directory,
and this routine has to be compiled using the C compiler cmex provided by
MATLAB. It is necessary to compile this C-MEX routine for every
architecture available.
A.1.2 Execution
For sequential processing, the user should have a compiled version of pec.c.
This is the C-MEX function which simulates the field, called by CAL-
CULATIONS → 2D RESPONSE → COMPUTER RESPONSE from the
CONFIGURATION window of ULTRASIM. If the compiled C-MEX function
is available, ULTRASIM executes it instead of pec.m.
A.2 C-MEX routine for parallel processing
A.2.1 Compilation of Parallel C-Mex Routine
For parallel computers or parallel networks, the C-MEX file is extended to
calculate the field using parallel data processing techniques. Two other
external files are introduced for parallel execution, one of which is a slightly
different version of pec.c.
pec.c This is the C-MEX file for parallel computation. The file is compiled
as before and should be available in the ULTRASIMHOME/pe directory.
ppec_slave.c This is the parallel routine executed in each parallel process.
It is compiled using an available C compiler, and the compiled program
must be available in the ULTRASIMHOME/pe directory.
These two files should be compiled and present in the MATLAB path. On
the Super Computing Network at the University of Oslo, the compilation is
done as follows:
cc -I/local/hpc/lib/pvm3/include -o ppec_slave ppec_slave.c
-L/local/lib -lpvm3 -lm
cmex -I/local/hpc/lib/pvm3/include pec.c -L/local/lib -lpvm3 -lm
A.2.2 Interactive Parallel Execution
The parallel routines use PVM (Parallel Virtual Machine) as the communi-
cation library. Before the parallel routines are called from ULTRASIM, it is
necessary to activate the PVM console. The console is activated as below:
bluemaster 17>pvm hostfile
Here, hostfile is the name of the description file PVM uses to set its environment
variables and host names. A line starting with a '*' is considered
a command, while a line starting with a 'D' is discarded. A few important
variables are:
ep - path to user executables. This specifies the path where PVM looks for
the spawned tasks.
dx - location of pvmd. This specifies the location of the PVM libraries.
list of hosts - The user specifies the host names of the parallel network.
Each line should contain only one host name.
Below is an example of a host file used on the IBM Super Computing Network
at the University Center for Information Technology, University of Oslo. A
user may have to declare more information in the host file to suit the parallel
network available.
* ep=$HOME:$HOME/paral:$HOME/matlab:$HOME/matlab/toolbox/ultrasim/pe
sp01
sp02
sp03
sp04
sp05
.
.
.
.
.
sp30
sp31
sp32
Once the PVM console is activated, the user can move PVM to the
background by typing 'quit' in the PVM console. Calling the PVM console
again, without specifying the hostfile, reactivates the console. The command
'halt' shuts down PVM and leaves the console.
As in the sequential program, the user calls the parallel calculation by CAL-
CULATIONS → 2D RESPONSE → COMPUTER RESPONSE from the
CONFIGURATION window of ULTRASIM. Parallel processing does not
provide progress information about the calculation loop.
A.2.3 Parallel Execution as a Batch job
For large simulations, it is recommended that the calculation is carried out
as a batch job.
First the con�guration �le should be saved in the user's own con�guration
directory, and then the batch script is written using the QUEUE, which
prompts the user for necessary details to make the batch script.
Once the batch job is executed the the response matrices are written in a
MAT-�le, which is loaded using the M-�le RETRIEVE. The con�guration
should be loaded before the response matrices are loaded. RETRIEVE loads
the �eld point response matrix, maximum response matrix, minimum re-
sponse matrix and sets the visualization enable �ags.
An example of the batch script, created by QUEUE is below:
#!/bin/ksh
# @ input = /dev/null
# @ output = program.$(Cluster).$(Process).out
# @ error = program.$(Cluster).$(Process).err
# @ class = TEST
cd $HOME
pvm hostfile <<EOF
quit
EOF
cd $HOME/matlab
matlab <<EOF
u
[excitation, observasjon, media, transducer, flagg, ...
option, x, y, z, t, elem_pts, centers, elecdelr, elecdelt, ...
comment, pvec, plength, phase_err_ud, amp_ud, ...
array_resp, max_resp, min_resp]= ...
lusim('/usit/rs1/ifi/kapilae/matlab/toolbox/ultrasim/cnf/test.cnf');
[resp,max_resp,min_resp]=pecomp(excitation,transducer, ...
media, observasjon, option,flagg,x,y,z,t,elem_pts, ...
centers,config, plotfig,usr_resdir);
10
save par_res array_resp max_resp min_resp
quit
EOF
pvm <<EOF
halt
EOF
# @ queue
The results computed as a batch job can be transferred to a sequential
computer for visualization. This method is recommended for the Super
Computing Network at the University of Oslo.
Appendix B
User Guide to Volumetric
Visualization
This appendix explains how to set up the visualization parameters for
volumetric simulations. Visualization is done using the MATLAB SLICE routines.
B.1 Observation
The observation flag is set either to Volume, for observation in an (x,y,z)
or (x,y,t) volume, or to Volume → Movie, for an (x,y,z,t) volume. The
observation parameters are set by choosing CONFIGURATION →
OBSERVATION from the CONFIGURATION window of ULTRASIM.
When the observation flag is set to VOLUME, the observation parameters
are set as below:
------- OBSERVATION SUBMENU ---------
Option : VOLUME
Coordinates : RECTANGULAR
2) 1. axis : x
3) Start value of x [mm] : -15
4) End value of x [mm] : 15
5) 2. axis : y
6) Start value of y [mm] : -10
7) End value of y [mm] : 10
12) 3. axis : z
13) Start value of z [mm] : 10
14) End value of z [mm] : 25
9) Fixed value of t [us] : 12.99
15) # obs.pts /[mm] : 2
CHANGE = "number"
-> Decision (<CR> = exit):
The first and the second axes are fixed, while the third axis can be either z
or t.
----- Choose axis -----
1) z
2) t
Select a menu number:
When the observation flag is set to VOLUME → MOVIE, the observation
parameters are set as below:
------- OBSERVATION SUBMENU ---------
Option : VOLUME (movie)
Coordinates : RECTANGULAR
2) 1. axis : x
3) Start value of x [mm] : -15
4) End value of x [mm] : 15
5) 2. axis : y
6) Start value of y [mm] : -10
7) End value of y [mm] : 10
12) 3. axis : z
13) Start value of z [mm] : 10
14) End value of z [mm] : 25
10) Start value of t [us] : 12.99
11) Stop value of t [us] : 15.25
... 8 plots will be produced for movie
15) # obs.pts /[mm] : 2
CHANGE = "number"
-> Decision (<CR> = exit):
B.2 Slice Plot
The volumetric simulation area is visualized using the SLICE function in
MATLAB. The results are plotted in the ULTRASIM - PLOT window, and
the SLICE plots are called by CALCULATIONS → 2D RESPONSE → SLICE
PLOT in the ULTRASIM - CONFIGURATION - CALCULATION window.
Set the following plot options:
-> envelope / rf [e/r] ?
-> linear /logarithmic [lin/log] ?
Observation slices for X Axis
Modify slices [y/n] : y
-> Background display ?? [y/n] :y
Range for the axis X: -15 to 15
Enter the number of slices(default 0, max 5) : 1
Axis X slice 1: Enter value ->0
Observation slices for Y Axis
Modify slices [y/n] : y
-> Background display ?? [y/n] :y
Range for the axis Y: -10 to 10
Enter the number of slices(default 0, max 5) : 1
Axis Y slice 1: Enter value ->0
-> Range Movie on Third axis ?? [y/n] :y
Range for Axis Z : 10 to 25
Enter start value :15
Enter end value :20
plot number 1 of totally 11 plots
plot number 2 of totally 11 plots
plot number 3 of totally 11 plots
plot number 4 of totally 11 plots
plot number 5 of totally 11 plots
plot number 6 of totally 11 plots
plot number 7 of totally 11 plots
plot number 8 of totally 11 plots
plot number 9 of totally 11 plots
plot number 10 of totally 11 plots
plot number 11 of totally 11 plots
Movie Options
-> Enter number of movie loops [1..100] :
-> Enter speed in frames/sec [1..20] :
Movie is playing ...
-> More movie ??? [y/n] : n
Appendix C
Programming Code
The main program code used in the thesis is presented here. For clarity,
and for ease of page layout, line breaks have been added to some lines.
C.1 C routine for sequential computer -pec.c
#include <stdio.h>
#include <math.h>
#include <string.h>
#include "mat.h"
#include "mex.h"
/*
--------------------------------------------------------------
Routine : pec.c
Description : The Matlab-External Library Routine,
called by PECOMP.M. This C-file should be
compiled using the MATLAB C compiler
CMEX, for each architecture. The compiled
MEX file can be chosen when PECOMP.M is
called via ULTRASIM
Language : C code, embedding MatLab External Libraries
Written by : University of Oslo, Department of Informatikk,
Kapila Epasinghe
Version : 1.0 KE 10.05.96 First Version
Called by : pecomp.m
Calling : focus.m
--------------------------------------------------------------
*/
#define PI 3.14159265358979
#define EPS 2.220446049250313e-16
#define INF 1e308
struct ans {Matrix *rsp;
Matrix *maxrp;
Matrix *minrp;};
/* Function to calculate the non conjugate transpose of a
matrix */
Matrix *non_conj_transpose(mtradr)
Matrix *mtradr;
{
int m,n,i,j;
double *pr,*pi,*tpr,*tpi;
Matrix *temp;
m=mxGetM(mtradr);
n=mxGetN(mtradr);
pr=mxGetPr(mtradr);
pi=mxGetPi(mtradr);
temp=mxCreateFull(n,m,mxIsComplex(mtradr));
tpr=mxGetPr(temp);
tpi=mxGetPi(temp);
for (i=0;i<m;i++)
for (j=0;j<n;j++)
{
tpr[n*i+j]=pr[m*j+i];
if (mxIsComplex(temp))
tpi[n*i+j]=pi[m*j+i];
}
return temp;
}
/* Help function: convert an integer (0-999) to a
three-digit string. The buffer is declared static
so the returned pointer remains valid after the
function returns. */
char *num2str(val)
int val;
{
int ch;
static char aval[4];
char *sval;
ch=(val/100);
val=val % 100;
aval[0]=ch+48;
ch=(val/10);
val=val % 10;
aval[1]=ch+48;
ch=val;
aval[2]=ch+48;
aval[3]='\0';
sval=aval;
return sval;
}
/* Find absolute value.. */
double abs_val(val)
double val;
{
if (val<0)
val=-1*val;
return val;
}
/* Main function: Calculates the response..*/
struct ans pec(x,y,z,t,FieldSize,elem,ElPt,Apodize,excitation,
SteerFocusDelay,TimeStepVector,trans,fla,
option,usr_resdir,values)
double x[],y[],z[],t[],Apodize[],excitation[];
double SteerFocusDelay[],TimeStepVector[],values[];
Matrix *elem,*trans,*fla;
char option[],usr_resdir[];
int FieldSize[],ElPt[];
{
double r,c,PulseLength,jomega,TLen,*transducer,*flagg;
double *elem_pts,*Time,Weight,*resp_pr,*resp_pi,*maxpr,*minpr;
Matrix *resp,*temp,*input,*c_foc,*rhs[5],*lhs[2];
double d,d_max,foc_depth,f_number,*input_pr,*c_foc_pr;
int min_elem,max_elem;
int PNo,FieldLength,n,a,i,finite,testint,k,ulen,set;
double t_ref,x0,y0,z0,maxt,mint,TStFo,TimeDel,w_pulse;
double t_vector,r_attenu,trans_arr_value,F_des;
char *mtrname,*filename;
char mtn[10];
char fn[80];
char *conv,*Pict,*mat;
MATFile *fp;
/* Structure definition used when extracting the three
solution matrices. */
struct ans reply;
/* List of variables imported from pecomp.m, that are used
in the mex file */
if ((option[0]=='m') || (option[0]=='f'))
{
ulen=0;
ulen=strlen(usr_resdir);
Pict="Pict";
mat=".mat";
mtn[0]='P';mtn[1]='i';mtn[2]='c';mtn[3]='t';
mtn[4]='0';mtn[5]='0';mtn[6]='0';mtn[7]='\0';
for (k=0;k<ulen+12;k++){
if (k<ulen)
fn[k]=usr_resdir[k];
else if (k<ulen+7)
fn[k]=mtn[k-ulen];
else
fn[k]=mat[k-ulen-7];
}
filename=fn;
mtrname=mtn;
}
flagg=mxGetPr(fla);
transducer=mxGetPr(trans);
elem_pts=mxGetPr(elem);
r=values[0];
c=values[1];
PulseLength=values[2];
TLen=values[3];
jomega=values[4];
Weight=values[5];
trans_arr_value=values[6];
F_des=values[7];
d=transducer[0];
FieldLength=FieldSize[0]*FieldSize[1];
/* Creation of max_resp and min_resp. Earlier allocations
are neglected. */
reply.maxrp=mxCreateFull(1,2,REAL);
maxpr=mxGetPr(reply.maxrp);
reply.minrp=mxCreateFull(1,2,REAL);
minpr=mxGetPr(reply.minrp);
maxpr[0]=0;maxpr[1]=0;
minpr[0]=0;minpr[1]=0;
/* LOOP OVER TIME REFERENCES */
for (PNo=1;PNo<=TLen;PNo++)
{
/* Time reference or distance reference */
if ((option[2]=='y') || (option[2]=='z'))
t_ref=t[0];
else
t_ref=r/c;
/* Put the time array into a vector. Due to the
array nature of t, no reshaping is done
when making the Time array. */
if ((option[0]=='m') || (option[0]=='f'))
{
temp=mxCreateFull(1,FieldLength,REAL);
Time=mxGetPr(temp);
for (i=0;i<FieldLength;i++)
Time[i]=TimeStepVector[PNo-1];
}
else
Time=t;
d_max=d;
min_elem=1;
max_elem=ElPt[1];
if ((flagg[0]==2)
&& ((option[0]=='m') || (option[0]=='f')))
{
foc_depth=c*TimeStepVector[PNo-1];
if (F_des > 0)
{
if (d<foc_depth/F_des)
d_max=d;
else
d_max=foc_depth/F_des;
max_elem=0;
min_elem=ElPt[1];
for (set=0;set<ElPt[1];set++)
if (abs_val(elem_pts[set*5])<=d_max/2)
{
if (max_elem<set+1)
max_elem=set+1;
if (min_elem>set+1)
min_elem=set+1;
}
if (F_des>foc_depth/d_max)
f_number=F_des;
else
f_number=foc_depth/d_max;
printf("f-number: %3.0f, aperture: %5.2f mm, "
"element range: %d-%d\n",
f_number,d_max*1000,min_elem,max_elem);
}
else
d_max=d;
input=mxCreateFull(1,7,REAL);
input_pr=mxGetPr(input);
for (i=0;i<7;i++)
input_pr[i]=excitation[i];
input_pr[1]=foc_depth;
c_foc=mxCreateFull(1,1,REAL);
c_foc_pr=mxGetPr(c_foc);
c_foc_pr[0]=excitation[13];
rhs[0]=input;rhs[1]=c_foc;rhs[2]=trans;
rhs[3]=elem;rhs[4]=fla;
mexCallMATLAB(2,lhs,5,rhs,"focus");
SteerFocusDelay=mxGetPr(lhs[0]);
}
/* Initialize */
if (PNo>1)
mxFreeMatrix(resp);
resp=mxCreateFull(1,FieldLength,COMPLEX);
resp_pr=mxGetPr(resp);
resp_pi=mxGetPi(resp);
/* OBSERVATION POINTS */
for (n=min_elem;n<=max_elem;n++)
{
if ( (n % 10)==0)
printf("Loop %d of %3.0f now at %d "
"of totally %d computations\n",
PNo,TLen,n,ElPt[1]);
/* Actual Analysis points are */
x0=elem_pts[ElPt[0]*(n-1)];
y0=elem_pts[ElPt[0]*(n-1)+1];
z0=elem_pts[ElPt[0]*(n-1)+2];
/* Steering and Focusing TimeDelay for the
current element. */
TStFo=-SteerFocusDelay[n-1];
/* Check if z is finite: to be used when computing
field points. */
finite=0;
for (i=0;i<FieldLength;i++)
if ((z[i]<INF) && (z[i]>-INF))
finite=1;
for (i=0;i<FieldLength;i++)
{
/* Propagation delay from origo to field points
and current time. */
if (finite==1)
TimeDel=(sqrt((x[i]-x0)*(x[i]-x0)+
(y[i]-y0)*(y[i]-y0)+
(z[i]-z0)*(z[i]-z0)))/c;
else
TimeDel=(sqrt((x[i]-x0)*180/PI*(x[i]-x0)*180/PI+
(y[i]-y0)*180/PI*(y[i]-y0)*180/PI+
(z[i]-z0)*(z[i]-z0)))/c;
/* check if PulseLength is Finite */
if (abs_val(PulseLength)<HUGE_VAL)
{
maxt=TStFo + PulseLength/2;
mint=TStFo - PulseLength/2;
/* see if the field point is in the transmitted burst */
if (((TimeDel-Time[i]- EPS)<=maxt) &&
((TimeDel-Time[i]+EPS)>=mint))
{
/* Find the actual time for the response computation */
t_vector=TimeDel - Time[i]-TStFo;
/* compute the weight and the attenuation
correction factor */
if (Weight==1)
w_pulse=(cos(PI*t_vector/PulseLength)
*cos(PI*t_vector/PulseLength));
else if (Weight==0)
w_pulse=1;
/* Attenuation */
r_attenu=TimeDel/t_ref;
/* Compute the response of the current burst for
the field point */
resp_pr[i]=resp_pr[i]+Apodize[n-1]*
w_pulse*cos(jomega*t_vector)/r_attenu;
resp_pi[i]=resp_pi[i]+Apodize[n-1]*
w_pulse*sin(jomega*t_vector)/r_attenu;
}
}
else
{
/* Find the actual time for the response computation */
t_vector=TimeDel-Time[i]-TStFo;
/* Attenuation */
r_attenu=TimeDel/t_ref;
/* Compute the response of the current burst for the
field point */
resp_pr[i]=resp_pr[i]+Apodize[n-1]*
cos(jomega*t_vector)/r_attenu;
resp_pi[i]=resp_pi[i]+Apodize[n-1]*
sin(jomega*t_vector)/r_attenu;
}
}
}
printf("Loop %d of %3.0f now at %d of totally %d "
"computations\n",PNo,TLen,n-1,ElPt[1]);
/* Put the response into a matrix */
mxSetM(resp,FieldSize[0]);
mxSetN(resp,FieldSize[1]);
for (i=0;i<FieldLength;i++)
{
/* Normalize response */
resp_pr[i]=resp_pr[i]*trans_arr_value;
resp_pi[i]=resp_pi[i]*trans_arr_value;
/* Compute max and min of response, real and
imaginary parts for both. */
if (maxpr[0]<resp_pr[i])
maxpr[0]=resp_pr[i];
if (maxpr[1]<resp_pi[i])
maxpr[1]=resp_pi[i];
if (minpr[0]>resp_pr[i])
minpr[0]=resp_pr[i];
if (minpr[1]>resp_pi[i])
minpr[1]=resp_pi[i];
}
resp=non_conj_transpose(resp);
if ((option[0]=='m') || (option[0]=='f'))
{
conv=num2str(PNo);
for (k=4;k<7;k++)
mtn[k]=conv[k-4];
for (k=0;k<7;k++)
fn[k+ulen]=mtn[k];
mxSetName(resp,mtrname);
fp= matOpen(filename,"w");
matPutMatrix(fp,resp);
matClose(fp);
}
}
reply.rsp=resp;
return reply;
}
/* NOTE: This section might need to be modified for
the version of C-compiler available
*/
#if __STDC__
void mexFunction(
int nlhs,
Matrix *plhs[],
int nrhs,
Matrix *prhs[]
)
#else
mexFunction (nlhs,plhs,nrhs,prhs)
int nlhs,nrhs;
Matrix *plhs[],*prhs[];
#endif
{
Matrix *trans,*fla,*elem;
double *x,*y,*z,*t,*Apod,*excit,*temp,*SFD,*TSV;
int FS[2],EP[2],n;
char *opt,*dirpath,*delim,*usr_resdir;
struct ans mtr;
x=mxGetPr(prhs[0]);
FS[0]=mxGetM(prhs[0]);FS[1]=mxGetN(prhs[1]);
y=mxGetPr(prhs[1]);
z=mxGetPr(prhs[2]);
t=mxGetPr(prhs[3]);
elem=prhs[4];
EP[0]=mxGetM(prhs[4]);EP[1]=mxGetN(prhs[4]);
Apod=mxGetPr(prhs[5]);
excit=mxGetPr(prhs[6]);
SFD=mxGetPr(prhs[7]);
TSV=mxGetPr(prhs[8]);
trans=prhs[9];
fla=prhs[10];
n=mxGetN(prhs[11])+1;
opt=mxCalloc(n,sizeof(char));
mxGetString(prhs[11],opt,n);
n=mxGetN(prhs[12])+1;
dirpath=mxCalloc(n,sizeof(char));
mxGetString(prhs[12],dirpath,n);
n=mxGetN(prhs[13])+1;
delim=mxCalloc(n,sizeof(char));
mxGetString(prhs[13],delim,n);
usr_resdir=strcat(dirpath,delim);
temp=mxGetPr(prhs[14]);
mtr=pec(x,y,z,t,FS,elem,EP,Apod,excit,SFD,TSV,
trans,fla,opt,usr_resdir,temp);
plhs[0]=mtr.rsp;
plhs[1]=mtr.maxrp;
plhs[2]=mtr.minrp;
}
C.2 C routine for parallel computer - pec.c
#include <stdio.h>
#include <math.h>
#include <string.h>
#include "mat.h"
#include "mex.h"
/*
-------------------------------------------------------------
Routine : pec.c
Description : The Matlab-External Library Routine for
parallel computer network in USIT, called
by PECOMP.M. This C-file should be compiled
using the MATLAB C compiler CMEX, according
to the parameters in User Guide. The compiled
MEX file can be chosen when PECOMP.M is
called via ULTRASIM
Language : C code, embedding MatLab External Libraries
Written by : University of Oslo, Department of Informatikk,
Kapila Epasinghe
Version : 1.0 KE 10.05.96 First Version
Called by : pecomp.m
Calling : focus.m
--------------------------------------------------------------
*/
#define PI 3.14159265358979
#define EPS 2.220446049250313e-16
#define INF 1e308
#define MAXTASKS 32
struct ans
{ Matrix *rsp;
Matrix *maxrp;
Matrix *minrp;};
/* Help function to find the non conjugate transpose
of a matrix */
Matrix *non_conj_transpose(mtradr)
Matrix *mtradr;
{
int m,n,i,j;
double *pr,*pi,*tpr,*tpi;
Matrix *temp;
m=mxGetM(mtradr);
n=mxGetN(mtradr);
pr=mxGetPr(mtradr);
pi=mxGetPi(mtradr);
temp=mxCreateFull(n,m,mxIsComplex(mtradr));
tpr=mxGetPr(temp);
tpi=mxGetPi(temp);
for (i=0;i<m;i++)
for (j=0;j<n;j++)
{
tpr[n*i+j]=pr[m*j+i];
if (mxIsComplex(temp))
tpi[n*i+j]=pi[m*j+i];
}
return temp;
}
/* Help function to convert a number to a string */
char *num2str(val)
int val;
{
int ch;
static char aval[4]; /* static: pointer stays valid after return */
char *sval;
ch=(val/100);
val=val % 100;
aval[0]=ch+48;
ch=(val/10);
val=val % 10;
aval[1]=ch+48;
ch=val;
aval[2]=ch+48;
aval[3]='\0';
sval=aval;
return sval;
}
/* Help function to get the absolute value of a real number */
double abs_val(val)
double val;
{
if (val<0)
val=-1*val;
return val;
}
/* Main Calculation Routine in Mex file */
struct ans pec(x,y,z,t,FieldSize,elem,ElPt,Apodize,excitation,
SteerFocusDelay,TimeStepVector,trans,fla,
option,usr_resdir,values,cent,tasks)
double x[],y[],z[],t[],Apodize[],excitation[];
double SteerFocusDelay[],TimeStepVector[],values[];
Matrix *elem,*trans,*fla,*cent;
char option[],usr_resdir[];
int FieldSize[],ElPt[],tasks;
{
double r,c,PulseLength,jomega,TLen,*transducer,*flagg;
double *elem_pts,*Time,Weight,*resp_pr,*resp_pi;
double *maxpr,*minpr;
Matrix *resp,*temp,*input,*c_foc,*rhs[6],*lhs[2];
double d,d_max,foc_depth,f_number,*input_pr,*c_foc_pr;
int min_elem,max_elem;
int PNo,FieldLength,n,a,i,finite,testint,k,ulen,set;
double t_ref,x0,y0,z0,maxt,mint,TStFo,TimeDel,w_pulse;
double t_vector,r_attenu,trans_arr_value,F_des;
char *mtrname,*filename;
char mtn[10];
char fn[80];
char *conv,*Pict,*mat;
MATFile *fp;
int numt,tids[MAXTASKS],data_len,residual,cc,id,len,bl;
int mytid,lstart,lend,elems,NTASK;
double *l_pr,*l_pi;
char buf[10];int tata;
/* Structure definition used when extracting the three
solution matrices. */
struct ans reply;
/* List of variables imported from pecomp.m, that are
used in the mex file */
printf("\n** COMPUTING RESPONSE IN PARALLEL SYSTEM **\n\n");
if ((option[0] == 'm') || (option[0] =='f'))
{
Pict="Pict";
mat=".mat";
mtn[0]='P';mtn[1]='i';mtn[2]='c';mtn[3]='t';
mtn[4]='0';mtn[5]='0';mtn[6]='0';mtn[7]='\0';
ulen=0;
ulen=strlen(usr_resdir);
for (k=0;k<ulen+12;k++){
if (k<ulen)
fn[k]=usr_resdir[k];
else if (k<ulen+7)
fn[k]=mtn[k-ulen];
else
fn[k]=mat[k-ulen-7];
}
filename=fn;
mtrname=mtn;
}
flagg=mxGetPr(fla);
transducer=mxGetPr(trans);
elem_pts=mxGetPr(elem);
r=values[0];
c=values[1];
PulseLength=values[2];
TLen=values[3];
jomega=values[4];
Weight=values[5];
trans_arr_value=values[6];
F_des=values[7];
d=transducer[0];
FieldLength=FieldSize[0]*FieldSize[1];
/* Creating max_resp and min_resp. */
reply.maxrp=mxCreateFull(1,2,REAL);
maxpr=mxGetPr(reply.maxrp);
reply.minrp=mxCreateFull(1,2,REAL);
minpr=mxGetPr(reply.minrp);
maxpr[0]=0;maxpr[1]=0;
minpr[0]=0;minpr[1]=0;
/* Time reference or distance reference */
if ((option[2]=='y') || (option[2]=='z'))
t_ref=t[0];
else
t_ref=r/c;
NTASK=tasks;
/* Enroll in PVM */
mytid=pvm_mytid();
/* Call PVM : Spawn ppec_slave */
numt=pvm_spawn("ppec_slave",(char**)0,0,"",NTASK,tids);
printf("%d host(s) activated\n",numt);
/* Check if z is finite */
finite=0;
for (i=0;i<FieldLength;i++)
if ((z[i]<INF) && (z[i]>-INF))
{finite=1;break;}
printf("Starting Calculations....\n");
/* Data distribution parameters */
data_len=(int)FieldLength/numt;
residual=FieldLength-data_len*numt;
/* LOOP OVER TIME REFERENCES */
for (PNo=1;PNo<=TLen;PNo++)
{
/* put time array into a vector. Due to the array
nature of t, no reshaping is done when making
the Time array. */
if ((option[0]=='m') || (option[0]=='f'))
{
temp=mxCreateFull(1,FieldLength,REAL);
Time=mxGetPr(temp);
for (i=0;i<FieldLength;i++)
Time[i]=TimeStepVector[PNo-1];
}
else
Time=t;
d_max=d;
min_elem=1;
max_elem=ElPt[1];
/* If Dynamic Focus and Movie,
find SteerFocusDelays for time step */
if ((flagg[0]==2) &&
((option[0]=='m') || (option[0]=='f')))
{
foc_depth=c*TimeStepVector[PNo-1];
if (F_des > 0)
{
if (d<foc_depth/F_des)
d_max=d;
else
d_max=foc_depth/F_des;
max_elem=0;
min_elem=ElPt[1];
for (set=0;set<ElPt[1];set++)
if (abs_val(elem_pts[set*5])<=d_max/2)
{
if (max_elem<set+1)
max_elem=set+1;
if (min_elem>set+1)
min_elem=set+1;
}
if (F_des>foc_depth/d_max)
f_number=F_des;
else
f_number=foc_depth/d_max;
printf("f-number: %3.0f, aperture: %5.2f mm,
element rage: %d-%d\n",
f_number,d_max*1000,min_elem,max_elem);
}
else
d_max=d;
/* Call ULTRASIM Routine -focus-. */
input=mxCreateFull(1,7,REAL);
input_pr=mxGetPr(input);
for (i=0;i<7;i++)
input_pr[i]=excitation[i];
input_pr[1]=foc_depth;
c_foc=mxCreateFull(1,1,REAL);
c_foc_pr=mxGetPr(c_foc);
c_foc_pr[0]=excitation[13];
rhs[0]=input;rhs[1]=c_foc;rhs[2]=trans;
rhs[3]=elem;rhs[4]=cent;rhs[5]=fla;
mexCallMATLAB(2,lhs,6,rhs,"focus");
SteerFocusDelay=mxGetPr(lhs[0]);
}
/* Initialize */
if (PNo>1)
mxFreeMatrix(resp);
resp=mxCreateFull(1,FieldLength,COMPLEX);
resp_pr=mxGetPr(resp);
resp_pi=mxGetPi(resp);
/* COMPUTING RESPONSE IN PARALLEL PROGRAMMING. */
/* OBSERVATION POINTS */
if (numt !=1)
{
lend=-1;
for (i=0;i<numt;i++)
{
lstart = lend+1;
if (i<residual)
lend = lstart + data_len;
else
lend = lstart + data_len - 1;
pvm_initsend(0);
pvm_pkint(&lstart,1,1);
pvm_pkint(&lend,1,1);
pvm_pkint(&PNo,1,1);
pvm_pkint(&min_elem,1,1);
pvm_pkint(&max_elem,1,1);
pvm_pkint(&ElPt[0],2,1);
pvm_pkdouble(&SteerFocusDelay[0],ElPt[1],1);
pvm_pkdouble(&Time[lstart],lend-lstart+1,1);
if (PNo == 1)
{
pvm_pkint(&FieldLength,1,1);
pvm_pkint(&finite,1,1);
pvm_pkdouble(&t_ref,1,1);
pvm_pkdouble(&c,1,1);
pvm_pkdouble(&jomega,1,1);
pvm_pkdouble(&Weight,1,1);
pvm_pkdouble(&PulseLength,1,1);
pvm_pkdouble(&elem_pts[0],ElPt[0]*ElPt[1],1);
pvm_pkdouble(&Apodize[0],ElPt[1],1);
pvm_pkdouble(&x[lstart],lend-lstart+1,1);
pvm_pkdouble(&y[lstart],lend-lstart+1,1);
pvm_pkdouble(&z[lstart],lend-lstart+1,1);
}
pvm_send(tids[i],3);
}
}
for (i=0;i<numt;i++)
{
cc=pvm_recv(-1,-1);
pvm_bufinfo(cc,(int*)0,(int*)0,&id);
pvm_upkint(&lstart,1,1);
pvm_upkint(&lend,1,1);
len=lend-lstart+1;
l_pr=(double *)calloc(len,sizeof(double));
l_pi=(double *)calloc(len,sizeof(double));
pvm_upkdouble(l_pr,len,1);
pvm_upkdouble(l_pi,len,1);
for (bl=lstart;bl<=lend;bl++)
{
resp_pr[bl]=l_pr[bl-lstart];
resp_pi[bl]=l_pi[bl-lstart];
}
printf("Calculation completed %d of %d hosts. \r",
i+1,numt);
cfree(l_pr);
cfree(l_pi);
}
printf("\n");
printf("Loop %d of %3.0f now at %d of totally %d
computations\n",
PNo,TLen,max_elem,ElPt[1]);
/* Put the response into a matrix */
mxSetM(resp,FieldSize[0]);
mxSetN(resp,FieldSize[1]);
for (i=0;i<FieldLength;i++)
{
/* Normalize response */
resp_pr[i]=resp_pr[i]*trans_arr_value;
resp_pi[i]=resp_pi[i]*trans_arr_value;
/* Compute max and min of response.*/
if (maxpr[0]<resp_pr[i])
maxpr[0]=resp_pr[i];
if (maxpr[1]<resp_pi[i])
maxpr[1]=resp_pi[i];
if (minpr[0]>resp_pr[i])
minpr[0]=resp_pr[i];
if (minpr[1]>resp_pi[i])
minpr[1]=resp_pi[i];
}
resp=non_conj_transpose(resp);
if ((option[0]=='m') || (option[0]=='f'))
{
conv=num2str(PNo);
for (k=4;k<7;k++)
mtn[k]=conv[k-4];
for (k=0;k<7;k++)
fn[k+ulen]=mtn[k];
mxSetName(resp,mtrname);
fp= matOpen(filename,"w");
matPutMatrix(fp,resp);
matClose(fp);
}
}
lstart=-1;
for (i=0;i<numt;i++)
{
pvm_initsend(0);
pvm_pkint(&lstart,1,1);
pvm_send(tids[i],3);
}
pvm_exit();
reply.rsp=resp;
return reply;
}
/* MEX Gateway function */
#ifdef __STDC__
void mexFunction(
int nlhs,
Matrix *plhs[],
int nrhs,
Matrix *prhs[]
)
#else
mexFunction (nlhs,plhs,nrhs,prhs)
int nlhs,nrhs;
Matrix *plhs[],*prhs[];
#endif
{
Matrix *trans,*fla,*elem,*cent;
double *x,*y,*z,*t,*Apod,*excit,*temp,*SFD,*TSV;
int FS[2],EP[2],n,task;
char *opt,*dirpath,*delim,*usr_resdir;
struct ans mtr;
x=mxGetPr(prhs[0]);
FS[0]=mxGetM(prhs[0]);FS[1]=mxGetN(prhs[0]);
y=mxGetPr(prhs[1]);
z=mxGetPr(prhs[2]);
t=mxGetPr(prhs[3]);
elem=prhs[4];
EP[0]=mxGetM(prhs[4]);EP[1]=mxGetN(prhs[4]);
Apod=mxGetPr(prhs[5]);
excit=mxGetPr(prhs[6]);
SFD=mxGetPr(prhs[7]);
TSV=mxGetPr(prhs[8]);
trans=prhs[9];
fla=prhs[10];
n=mxGetN(prhs[11])+1;
opt=mxCalloc(n,sizeof(char));
mxGetString(prhs[11],opt,n);
n=mxGetN(prhs[12])+1;
dirpath=mxCalloc(n,sizeof(char));
mxGetString(prhs[12],dirpath,n);
n=mxGetN(prhs[13])+1;
delim=mxCalloc(n,sizeof(char));
mxGetString(prhs[13],delim,n);
/* dirpath was allocated to fit exactly; build the result in a new buffer */
usr_resdir=mxCalloc(strlen(dirpath)+strlen(delim)+1,sizeof(char));
strcpy(usr_resdir,dirpath);
strcat(usr_resdir,delim);
temp=mxGetPr(prhs[14]);
cent=prhs[15];
printf("-> Enter the number of tasks [<=32] : ");
scanf("%d",&task);
mtr=pec(x,y,z,t,FS,elem,EP,Apod,excit,SFD,TSV,trans,
fla,opt,usr_resdir,temp,cent,task);
plhs[0]=mtr.rsp;
plhs[1]=mtr.maxrp;
plhs[2]=mtr.minrp;
}
C.3 C-Routine ppec_slave.c
#include <stdio.h>
#include <math.h>
#include "pvm3.h"
#define PI 3.14159265358979
#define EPS 2.220446049250313e-16
#define INF 1e308
main()
{
int ptid,cc,id,i,mytid,k;
int length,FieldLength,finite,n,ElPt[2];
int lstr,lend,PNo,min_elem,max_elem;
double *x,*y,*z,*Time,*elem_pts,*Apodize;
double *local_pr,*local_pi,*SteerFocusDelay;
double t_ref,c,jomega,Weight,TStFo;
double PulseLength,TimeDel,maxt,mint,t_vector,r_attenu;
double w_pulse,x0,y0,z0;
char buf[10];
mytid=pvm_mytid();
ptid=pvm_parent();
cc=pvm_recv(ptid,-1);
pvm_bufinfo(cc,(int *)0,(int *)0,&id);
pvm_upkint(&lstr,1,1);
while(lstr != -1)
{
pvm_upkint(&lend,1,1);
length=lend-lstr+1;
pvm_upkint(&PNo,1,1);
pvm_upkint(&min_elem,1,1);
pvm_upkint(&max_elem,1,1);
pvm_upkint(&ElPt[0],2,1);
SteerFocusDelay=(double *)calloc(ElPt[1],sizeof(double));
pvm_upkdouble(SteerFocusDelay,ElPt[1],1);
Time=(double *)calloc(length,sizeof(double));
pvm_upkdouble(Time,length,1);
if (PNo==1)
{
pvm_upkint(&FieldLength,1,1);
pvm_upkint(&finite,1,1);
pvm_upkdouble(&t_ref,1,1);
pvm_upkdouble(&c,1,1);
pvm_upkdouble(&jomega,1,1);
pvm_upkdouble(&Weight,1,1);
pvm_upkdouble(&PulseLength,1,1);
elem_pts=(double *)calloc(ElPt[0]*ElPt[1],sizeof(double));
pvm_upkdouble(elem_pts,ElPt[0]*ElPt[1],1);
Apodize=(double *)calloc(ElPt[1],sizeof(double));
pvm_upkdouble(Apodize,ElPt[1],1);
x=(double *)calloc(length,sizeof(double));
pvm_upkdouble(x,length,1);
y=(double *)calloc(length,sizeof(double));
pvm_upkdouble(y,length,1);
z=(double *)calloc(length,sizeof(double));
pvm_upkdouble(z,length,1);
}
local_pr=(double *)calloc(length,sizeof(double));
local_pi=(double *)calloc(length,sizeof(double));
for (n=min_elem;n<=max_elem;n++)
{
for (i=0;i<length;i++)
{
x0=elem_pts[ElPt[0]*(n-1)];
y0=elem_pts[ElPt[0]*(n-1)+1];
z0=elem_pts[ElPt[0]*(n-1)+2];
TStFo=-SteerFocusDelay[n-1];
if (finite==1)
TimeDel=(sqrt((x[i]-x0)*(x[i]-x0)+
(y[i]-y0)*(y[i]-y0)+
(z[i]-z0)*(z[i]-z0)))/c;
else
TimeDel=(sqrt((x[i]-x0)*(x[i]-x0)*180/PI*180/PI+
(y[i]-y0)*(y[i]-y0)*180/PI*180/PI+
(z[i]-z0)*(z[i]-z0)))/c;
if ((PulseLength<HUGE_VAL) && (PulseLength>-HUGE_VAL))
{
maxt=TStFo+PulseLength/2;
mint=TStFo-PulseLength/2;
if (((TimeDel-Time[i]-EPS)<=maxt) &&
((TimeDel-Time[i]+EPS)>=mint))
{
t_vector=TimeDel-Time[i]-TStFo;
if (Weight==1)
w_pulse=(cos(PI*t_vector/PulseLength)
*cos(PI*t_vector/PulseLength));
else if (Weight==0)
w_pulse=1;
r_attenu=TimeDel/t_ref;
local_pr[i]=local_pr[i]+Apodize[n-1]
*w_pulse*cos(jomega*t_vector)/r_attenu;
local_pi[i]=local_pi[i]+Apodize[n-1]
*w_pulse*sin(jomega*t_vector)/r_attenu;
}
}
else
{
t_vector=TimeDel-Time[i]-TStFo;
r_attenu=TimeDel/t_ref;
local_pr[i]=local_pr[i]
+Apodize[n-1]*cos(jomega*t_vector)/r_attenu;
local_pi[i]=local_pi[i]
+Apodize[n-1]*sin(jomega*t_vector)/r_attenu;
}
}
}
pvm_initsend(0);
pvm_pkint(&lstr,1,1);
pvm_pkint(&lend,1,1);
pvm_pkdouble(local_pr,length,1);
pvm_pkdouble(local_pi,length,1);
pvm_send(ptid,4);
cfree(local_pr);cfree(local_pi);cfree(Time);
cfree(SteerFocusDelay);
cc=pvm_recv(ptid,-1);
pvm_bufinfo(cc,(int*)0,(int*)0,&id);
pvm_upkint(&lstr,1,1);
}
cfree(x);cfree(y);cfree(z);cfree(elem_pts);cfree(Apodize);
/* Time and SteerFocusDelay are already freed inside the loop */
pvm_exit();
}
MATLAB Routine for parallel network - queue.m
%
s=input('Have you saved your configuration in your CNF directory??? ','s');
if s~='y'
fprintf('Please save the configurations and continue..\n');
else
disp(['Available files in ',usr_cnfdir]);
disp(' ');
tmp_dir=pwd;
eval(['cd ',usr_cnfdir]);
eval(['dir *.cnf']);
eval(['cd ',tmp_dir]);
clear tmp_dir;
cnff= input('-> File name : ','s');
if isempty(findstr(cnff,'cnf'));
cnff=[cnff,'.cnf'];
end;
cnff=[usr_cnfdir,separator,cnff];
fn='';
while ( isempty(fn))
fn=input('Enter Job file Name : ','s');
if exist(fn)
fprintf('File Exists.');
hm=input('Overwrite it?? [y/n] : ','s');
if hm~='y'
fn='';
end
end
end
qname=input('Enter Queue(TEST/PARALLEL) Name [t/p] : ','s');
if qname=='p'
qname='PARALLEL';
else
qname='TEST';
end;
fnid=fopen(fn,'w');
fprintf(fnid,'#!/bin/ksh\n# @ input = /dev/null\n');
fprintf(fnid,'# @ output = program.$(Cluster).$(Process).out\n');
fprintf(fnid,'# @ error = program.$(Cluster).$(Process).err\n');
fprintf(fnid,'# @ class = %s\n\n',qname);
fprintf(fnid,'cd $HOME\n');
fprintf(fnid,'pvm hostfile <<EOF\nquit\nEOF\n\n');
fprintf(fnid,'cd $HOME/matlab\nmatlab <<EOF\nu \n');
fprintf(fnid,'[excitation, observasjon, media, transducer, flagg, ...\n');
fprintf(fnid,'option, x, y, z, t, elem_pts, centers, elecdelr, ...\n');
fprintf(fnid,'elecdelt, comment, ...\n');
fprintf(fnid,'pvec, plength, phase_err_ud, amp_ud, ...\n');
fprintf(fnid,'array_resp, max_resp, min_resp]= ...\n');
fprintf(fnid,'lusim(''%s'');\n',cnff);
fprintf(fnid,'[resp,max_resp,min_resp]=');
fprintf(fnid,'pecomp(excitation,transducer,media,observasjon, ...\n');
fprintf(fnid,'option,flagg,x,y,z,t,elem_pts,centers,config, ...\n');
fprintf(fnid,'plotfig,usr_resdir);\n');
numproc=input('Enter number of node processes required [<=32] : ');
if (numproc < 1)
numproc=1;
elseif (numproc >32)
numproc=32;
end;
fprintf(fnid,'%d\n',numproc);
if (flagg(1)==1) | ((option(1)~='m') & (option(1)~='f'))
fprintf('Fixed focus and Not Movie.. No f-number needed\n');
else
fnum=input('Enter f-number for dynamic aperture control: ');
fprintf(fnid,'%g\n',fnum);
end
fprintf(fnid,'save par_res array_resp max_resp min_resp\n');
fprintf(fnid,'quit\nEOF\n\npvm <<EOF\nhalt\nEOF\n\n# @ queue\n');
fclose(fnid);
sbt=input('Submit now?? [y/n] : ','s');
if sbt=='y'
eval(['!llsubmit ',fn]);
fprintf('Job Submitted\n');
end
end
C.4 MATLAB Routine for parallel network - retrieve.m
if ~exist('par_res.mat')
fprintf('Sorry, process not finished yet..!\n');
else
ask=input('Is the Configuration file loaded??[y/n] : ','s');
if ask == 'y'
load par_res;
usimcnf('update_calcresp',flagg,option);
else
fprintf('Load the configuration for this simulation first..\n');
end
end
C.5 MATLAB routine - peplot3d.m
%---------------------------------------------------------------
% Routine : peplot3d.m ( PE - Plot in 3D )
%
% Description : Plots the results from the PE- analysis
% of Volumes in 'plotfig' window
%
% Language : Matlab V4.1
%
% Written by : Department of Informatics, Kapila Epasinghe
%
% Version no. : 1.0 KE 20.01.96 First version
% 1.1 KE 01.10.96 Added matrix 'visualize'
% 1.2 KE/KI 01.08.97 Added gif animation
%
% Called by : usimcnf
% Calling : cleanscr,setcomap
%---------------------------------------------------------------
function peplot3d(resp, max_resp, min_resp, excitation,transducer, media, ...
observation, option, flagg, x,y,z,t, config, visualize, ...
plotfig, plotfigOrigHandle, usr_resdir)
% make up a vector of analysis times
if option(1) == 'f'
delta_t=1/(1e3*excitation(14)*observation(15));
TimeStepVector = [observation(4):delta_t:observation(8)];
TLen = length(TimeStepVector);
else
TLen = 1;
end
% INITIALIZATION
clc
cleanscr(plotfig, plotfigOrigHandle);
figure(plotfig);
MeshCont = 'Mesh';
setcomap(plotfig,5);
if (flagg(2) == 1),
n_elem = transducer(2)*transducer(6);
else
n_elem = transducer(2);
end
%
% determine computer-dependent path separator
%
csys=computer;
if (csys(1:2) == 'PC'),
d = '\';
else
d = '/';
end
% Set only envelope option for the time being
if (length(visualize)<=1)
clc
disp('Set the following plot options: ');
disp(' ');
plot_type = input('-> envelope / rf [e/r] ? ','s');
if all(plot_type ~= 'r')
plot_type = 'e'; % default
end
if all(plot_type == 'r')
plot_option = 'lin';
else
plot_option=input('-> linear /logarithmic [lin/log] ? ','s');
if all(plot_option ~= 'log')
plot_option = 'lin'; % default
end
end
else
fprintf('Plot Options set as :\n');
fprintf('\t 1. ');
if visualize(1)=='e'
fprintf('Envelope\n');
else
fprintf('rf\n');
end
fprintf('\t 2. ');
if visualize(2:4)=='lin'
fprintf('Linear\n');
else
fprintf('Logarithmic\n');
end
fprintf('\t 3. Colormap %s\n',visualize(5:10));
com=input('-> Accept ??? [y/n] : ','s');
if com=='n'
visualize=setvisu(option,flagg,visualize);
end;
plot_type=visualize(1);
plot_option=visualize(2:4);
end
gif_anim = 'n'; % default: only offered on Unix MATLAB 4 below
vers=version;
if vers(1)=='4'
if (csys(1:2) ~= 'PC') & (csys(1:2)~='MA')
gif_anim = input('-> make gif-animation [n/y] ? ','s');
disp(' ')
end
end
% The view option is not set; it is not very relevant for a 3D visualization.
tmp_view=3;
vtxt = [' View: 3D default'];
% Set the text variables
[Xtxt, Ytxt, Ztxt, Ttxt1, Ttxt2, Ttxt3] = ...
petxt(excitation, media, transducer, option, flagg, z,t, ...
plot_type,plot_option, MeshCont);
% Set axis, 3rd axis either time analysis or distance
if option(1) == '3' | option(1) == 'f'
Xx=[observation(1):1/(1e3*observation(15)):observation(5)]*1e3;
Yy=[observation(2):1/(1e3*observation(15)):observation(6)]*1e3;
if option(3) == 't'
Zz=[observation(4):1/(1e6*observation(15)):observation(8)]*1e6;
else
Zz=[observation(3):1/(1e3*observation(15)):observation(7)]*1e3;
end
end
% Get current axes position modify it (plot axes)
figure(plotfig);
handle1 = gca;
axis('off');
v1 = get(handle1,'Position');
v1(4) = v1(4) - 0.1;
set(handle1,'Position',v1);
% Make another axes for text (text axes)
v2= v1;
v2(1) = v2(1) - 0.1;
v2(2) = v2(2) + v2(4) + 0.05;
v2(4) = 0.1;
handle2= axes('Position',v2);
axis('off');
% Use of max min z not needed???
rmovie='n';
slcx=plt3d(rmovie,'X',Xx,Xx(length(Xx)),observation);
slcy=plt3d(rmovie,'Y',Yy,Yy(1),observation);
if option(1)=='3'
rmovie=input('-> Range Movie on Third axis ?? [y/n] :','s');
end
if option(3)=='t'
zsl=plt3d(rmovie,'T',Zz,Zz(length(Zz)),observation);
else
zsl=plt3d(rmovie,'Z',Zz,Zz(length(Zz)),observation);
end
if rmovie=='y'
TLen=length(zsl);
end
if (plot_type == 'e') & (plot_option == 'lin')
resp = abs(resp');
elseif (plot_type == 'e') & (plot_option == 'log')
resp = 20*log10(abs(resp')/n_elem +eps);
elseif (plot_type == 'r') & (plot_option == 'lin')
resp = real(resp');
end
zsize=size(Zz);
% add some text ...
text(0 , 1.0 , Ttxt1);
text(0 , 0.5 , Ttxt2);
tmp = [Ttxt3,vtxt];
text( 0 , 0.0 , tmp);
drawnow;
figure(plotfig);
axes(handle1);
if (TLen > 1)
M = moviein(TLen,handle1);
end
csys=computer;
if (csys(1:2) == 'PC')
d='\';
else
d='/';
end
if (length(visualize)>1)
% setcomap(plotfig,visualize(5:10));
colormap(visualize(5:10));
else
colormap(hot);
end
% Loop over TLen
% Make Java Imaging..!
%java=input('Do you want to plot images for Javascript??? [y/n] ','s');
%if java=='y'
% javapath=input('Where in www_docs shall I write the images?? ','s');
% javapath=[usr_resdir,'/../../../www_docs/',javapath,'/'];
%end
for PNo=1:TLen
if option(1)=='f'
if (PNo<10)
eval(['load ',usr_resdir,d,'Pict00',num2str(PNo),'.mat'])
eval(['resp = Pict00',num2str(PNo),';']);
elseif (PNo >= 10)&(PNo < 100)
eval(['load ',usr_resdir,d,'Pict0',num2str(PNo),'.mat'])
eval(['resp = Pict0',num2str(PNo),';']);
elseif (PNo >= 100)&(PNo < 1000)
eval(['load ',usr_resdir,d,'Pict',num2str(PNo),'.mat'])
eval(['resp = Pict',num2str(PNo),';'])
end
%
% set resp according to plot options
%
if (plot_type == 'e') & (plot_option == 'lin')
resp = abs(resp');
elseif (plot_type == 'e') & (plot_option == 'log')
resp = 20*log10(abs(resp')/n_elem +eps);
elseif (plot_type == 'r') & (plot_option == 'lin')
resp = real(resp');
end
end
if rmovie=='y'
slcz = zsl(PNo);
else
slcz = zsl;
end
if (TLen>1)
fprintf('plot number %3.0f of totally %3.0f plots\n',PNo,TLen);
end
if all(plot_option == 'log')
slice(Zz,Xx,Yy,clipl(resp,-50),slcz,slcx,slcy,zsize(2));
% if java=='y'
% fname=[javapath,'imag',num2str(PNo),'.gif'];
% printgif('-dgif8',fname);
% end
else
slice(Zz,Xx,Yy,resp,slcz,slcx,slcy,zsize(2));
% if java=='y'
% fname=[javapath,'imag',num2str(PNo),'.gif'];
% printgif('-dgif8',fname);
% end
end
shading flat;
M(:,PNo) = getframe;
%
% make gif files: gif.A1, gif.A2, ..., gif.B1, ...gif.Z9
%
if isempty(gif_anim)| (gif_anim == 'n')
gif_anim = 'n'; % default
else
ext=sprintf('%c%d',64+fix((10+PNo)/10),rem(PNo,10));
filename = ['gif.',ext];
printgif( '-dgif8', filename )
end
%end of loop over TLen
end
if (gif_anim == 'y')
fprintf('Starting gif animation\n');
eval(['!gpgconvert'])
fprintf('Ending gif animation\n');
end
%set axis and view
xlabel(Xtxt)
ylabel(Ytxt)
zlabel(Ztxt)
%
% Play movie
%
if (option(1)=='f') | (rmovie=='y')
more = 'y';
while(more =='y')
clc;
disp('Movie Options');
disp(' ');
n=input('-> Enter number of movie loops [1..100] : ');
if isempty(n)
n=10;
end
fps=input('-> Enter speed in frames/sec [1..20] : ');
if isempty(fps)
fps= 8;
end
% make up a vector that plays the movie with a break after each loop.
% To maintain speed, this is done with a vector N to specify the order
% of the frames
%
last_picture=size(M,2);
N=[n,1:last_picture ,ones(1,fps)*last_picture]; % break for 1 sec
disp(' ');
disp('Movie is playing ...');
figure(plotfig);
movie(M,N,fps);
disp(' ')
more = input('-> More movie ??? [y/n] : ','s');
end
%
% save movie matrix ?
%
% if (input('-> Do you want to save the movie [y/n] ? ','s') == 'y')
% [fname,path]=uiputfile('lastone.mvi','Save movie as ...');
% if ~(fname(1) == 0)
% eval(['save ',path,fname,' M N']);
% disp(' ');
% disp(['Saved movie as ',path,fname]);
% end
% end
% mpgwrite(M,hot,'testmeg.mpeg');
end
C.6 MATLAB routine - plt3d.m
%---------------------------------------------------------------
% Routine : plt3d.m ( PE - Plot in 3D )
%
% Description : Accepts data for slice movie in peplot3d.m
%
%
% Language : Matlab V4.2c-1
%
% Written by : Department of Informatics, Kapila Epasinghe
%
% Version no. : 1.0 KE 20.01.96 First version
%
%
% Called by : peplot3d.m
% Calling :
%---------------------------------------------------------------
function res_vec = plt3d(mv,ax,vect,sl_vec,observation)
if mv=='y'
fprintf('Range for Axis %s : %d to %d\n',ax,vect(1),vect(length(vect)));
start=input('Enter start value :');
if isempty(start) | (start<vect(1)) | (start>vect(length(vect)))
start=vect(1);
end
finish=input('Enter end value :');
if isempty(finish) | (finish<vect(1)) | (finish>vect(length(vect)))
finish=vect(length(vect));
end
res_vec=[start:1/observation(15):finish];
else
disp([' ']);
disp(['Observation slices for ',ax,' Axis']);
mod_sl=input('Modify slices [y/n] : ','s');
if isempty(mod_sl) | (mod_sl~='y')
mod_sl='n';
end
if mod_sl == 'y'
bkgrd=input('-> Background display ?? [y/n] :','s');
if bkgrd=='n'
sl_vec=[];
end
fprintf('Range for the axis %s: %d to %d\n',ax,vect(1),vect(length(vect)));
num=input('Enter the number of slices(default 0, max 5) : ');
if isempty(num) | num>5 | num<0
num=0;
end
for k=1:num
temp_str=['Axis ',ax,' slice ',num2str(k),': Enter value ->'];
tem_val= input(temp_str);
while (tem_val<vect(1)) | (tem_val>vect(length(vect))) | ...
any(sl_vec==tem_val)
temp_str=['Axis ',ax,' slice ',num2str(k),': Re-enter value ->'];
tem_val=input(temp_str);
end
sl_vec=[tem_val sl_vec];
end
end
res_vec=sl_vec;
end