1
An FPGA Implementation of the Two-Dimensional Finite-Difference
Time-Domain (FDTD) AlgorithmWang Chen
Panos Kosmas Miriam Leeser
Carey Rappaport
Northeastern UniversityBoston, MA
2
FDTD Algorithm and Implementation
? Finite Difference Time-Domain
?Method for solving Maxwell’s equations
?Used for buried object detection
? Hardware Implementation
?3D to 2D model simplification
?Data dependency analysis
?Fixed-point quantization
3
Finite-Difference Time-Domain Method
Maxwell’s Equations?A direct time-domain solution of Maxwell's equations
?Accurate and flexible for solving electromagnetic problems
?Discretize time and electromagnetic space
4
FDTD Method (cont’d)
One FDTD Equation
Y-AxisX-Axis
Z-A
xis
Ez
Hx
Ez
Ez
Hz
Ex
Ex
Ex
Ey
Ey
Ey
Hy
?Z
?Y
? X
(i,j+ 1/2,k+1/2)
(i,j,k)
Yee Cell Taylor Series Expansion
Adjacent Cells
5
FDTD Applications
? Antenna Design
? Discrete Scattering Studies
? Medical Studies
?The study of the cell phone electromagnetic waves' effect on human brain
?The study of breast cancer detection using electromagnetic antenna
6
Buried Object Detection Forward Model
Buried Object Detection Model Space
Mine
X
Y
Z
Object X
Y
Z
Transmitting Antenna Receiving Antenna
Initialization
Excitation
Calculate E Field
ExteriorBoundaryConditions
End
Time over?
Calculate H Field
Yes
No, Go to NextTime Step
t = n + 0.5
t = n n = n + 1
7
FDTD Simulated Model Space
8
FDTD Simulated Model Space (cont’d)
9
Related Work? Software acceleration of FDTD?Parallel computers do not provide significant speedup
? FPGA implementations of FDTD?1D FDTD on hardware: architecture is too simple?Full 3D FDTD on hardware developed at UDel
? Design is slower than software: ? uses complex floating-point representation? no parallelism or pipelining
? Our 2D FDTD hardware implementation? 24 times speedup compare to 3.0G PC:
? fixed-point representation? expandable structure
10
3D to 2D Model SimplificationInitialization? Initialize parameters of model space
and time step? Build parameters of soil and buried
object? Load all the EM space data into memory
Excitation
Calculate E Field? Update Exs field? Update Eys field? Update Ezs field
End
Time over?
YesNo, Go to Next
Time Step
t = n + 0.5
t = n
n = n + 1
Calculate H Field? Update Hxs field? Update Hys field? Update Hzs field
Exterior BoundaryConditions
Boundary of EYXBoundary of EZXBoundary of EZY
Boundary of EXYBoundary of EXZBoundary of EYZ
Mine X
Y
Z
TransmittingAntenna
ReceivingAntennaSimplify
Mine
X
Y
Z
Initialization? Initialize parameters of model space
and time step? Build parameters of soil and buried
object? Load all the EM space data into memory
Excitation
Calculate E Field? Update Eys field
End
Time over?
YesNo, Go to Next
Time Step
t = n + 0.5
t = n
n = n + 1
Calculate H Field? Update Hxs field? Update Hzs field
Exterior BoundaryConditions
Boundary of EYX Boundary of EYZTransmitting
Antenna
ReceivingAntenna
11
Exterior Boundary Conditions
3D Model Space
2D Model Space
4 Edges
6 Faces and 12 Edges
Mur-type Absorbing Boundary Condition
12
Data Dependency Analysis
Mine
Time step
Memory Space forElectric Field Data
Memory Space forMagnetic Field Data
T-1
T
T-2
T-3
Se
qu
en
ce o
f the
pro
cess
ing
A
B
A
A
A
AB
A
BB
B
B
M cells
N c
ells
Initialization? Initialize parameters of model space
and time step? Build parameters of soil and buried
object? Load all the EM space data into memory
Excitation
Cal
cula
te E
ys F
ield
End
Time over?
YesNo, Go to Next
Time Step
n = n + 1
Exterior BoundaryConditions
Boundary of EYX Boundary of EYZ
Cal
cula
te H
zs F
ield
Cal
cula
te H
xs F
ield
2Rows
13
Hardware Acceleration? Smart memory interface
? Parallelism
? Pipelining
? Quantized fixed point representation ?Less area in datapath -- more parallelism
?Careful error analysis to ensure accurate results
S A AA . B BBBBBBBBBBBBBBBBBBBBBBBBB2 … 0 -1 …… -26
26 bits3 bits
14
Fixed-point quantization
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
24bits 25bits 26bits 27bits 28bits
Bit-width
Ave
rage
rel
ativ
e er
ror
(%)
Electric Field Value at R1Electric Field Value at R2
Magnetic Field Value at R1Magnetic Field Value at R2
Source Data
15
Design Flow
16
Firebird FPGA Board from Annapolis? A Xilinx VIRTEX-E XCV2000E
with 2.5 million system gates
? Processing clock up to 150MHzFDTD runs at 70 MHz
? Five independent memory banks (4 x 64-bit, 1 x 32-bit) 288Mbytes in total
? 6.6Gbytes/sec of memory bandwidth
? 3Gbytes/sec of I/O bandwidth
54%46%Percentage Used
868837Number Used
16019200Number Available
BlockRAMSlices
Utilization of Xilinx XCV2000E FPGA Chip
17
FDTD on Firebird Board
PCHOST
PCI BUS
FIREBIRD BOARD
FPGA
DESIGNOn-BoardMEMORY
On-BoardMEMORY
Me
mor
y In
terf
ace
Simulated Electromagnetic Space
ElectricField
PipelineModule
MagneticField
PipelineModule
BoundaryConditions Module
Memory in PC
Memory in PC
18
Memory Interface
FPGA CHIP
DESIGN
Ele
ctr
ica
lF
ield
Mo
du
le
Pip
eli
ne
d
Ma
gn
etic
Fie
ldM
od
ule
Pip
eli
ne
d
1EYS HZSHXS
2EYS HZSHXS
3EYS HZSHXS
0EYS HZSHXS
1EYS HZSHXS
2EYS HZSHXS
3EYS HZSHXS
0EYS HZSHXS
1EYS HZSHXS
2EYS HZSHXS
3EYS HZSHXS
0EYS HZSHXS
1EYS HZSHXS
2EYS HZSHXS
3EYS HZSHXS
0EYS HZSHXS
1EYS HZSHXS
2EYS HZSHXS
3EYS HZSHXS
0EYS HZSHXS
1EYS HZSHXS
2EYS HZSHXS
3EYS HZSHXS
0EYS HZSHXS
1EYS HZSHXS
2EYS HZSHXS
3EYS HZSHXS
0EYS HZSHXS
1EYS HZSHXS
2EYS HZSHXS
3EYS HZSHXS
0EYS HZSHXS
D D
CC
B
B
A A
Result
Bo
unda
ry
Result ON
-BO
AR
DM
EM
OR
IES
ON
-BO
AR
DM
EM
OR
IES
Result
? ??? ????
Input BlockRAMs Ouput BlockRAMs
19
Pipelining and Parallelism
DTin_2
DTin_1
-
x x
-+
Write to Eys
ReadHzs_A
ReadHzs_B
ReadHxs_B
ReadHxs_A
-
ReadEys_A
0
1
2
3
4
5
6
7
8
9
DTin_2
DTin_1
-
x x
+
-
Write to Hxs
ReadEysCo
ReadEysBoEzs
Ezs
-
ReadHxs_C
DTin_2
DTin_1
-
x x
+
-
Write to Hzs
EXS
EXSReadEysBo
ReadEysDo
-
ReadHzs_C
20
Data FlowElectric Field
On-boardMemory ?
Electric FieldOn-boardMemory ?
Magnetic FieldOn-boardMemory ?
Magnetic FieldOn-boardMemory ?
Pip
elin
e E
ys
Pip
elin
e H
xs
BlockRam
SourceAdder
Pip
elin
e H
zs
Memory Interface ModuleBlockRam
Memory Interface ModuleBlockRam
Pip
elin
eB
oun
dar
y
Source DataOn-board
Memory ?
BlockRam
21
Results and Performance
A Software Floating-point ~~ 25sFortran code at 440 MHz Sun Workstation
B Software Fixed-point ~~ 3.375sC code at 3.0 GHz PC
C Hardware ~~ 0.145sDesign working at 70MHz
Model space 100*100 cellsIterate 200 time steps
0
5
10
15
20
25
A B C
Performance Result
Exe
cutin
g T
ime
(Sec
ond)
22
Conclusions
? FPGA Implementation of FDTD exhibits significant speedup compared to software:24 times faster than 3GHz PC
? With larger FPGA, more parallelism will be available, hence more speedup
? Current design easily extendible to handle multiple types of materials, 3D space
23
Future Work? Upgrade curent design to handle multiple
types of materials? Upgrade to 3D model space?Add three more field updating algorithms:
same structure as the original three algorithms?Upgrade boundary condition updating
algorithm?Redesign memory interface
? Apply FDTD Hardware to other applications
Top Related