Acceleration of the Retinal Vascular Tracing Algorithm using FPGAs


Page 1:

Acceleration of the Retinal Vascular Tracing Algorithm using FPGAs

Miriam Leeser, Shawn Miller

Smart Camera: Provides Host PC with image data along with image processing results at frame rate with low latency

[Figure: block diagram of the FIREBIRD BOARD. Labels: Framegrabber; FPGA; Direction Filter Design; Data Packing Design; BlockRAM; Memory Switching Design (two blocks); Image Data MEMORY 0; Image Data MEMORY 1; Results MEMORY 2; Results MEMORY 3; PCI Bus; Host.]

Page 2:

Retinal Vascular Tracing Application

Goal: Detection and enhancement of the vascular structure of a patient’s retina from a live video feed

Latency and throughput requirements of real-time image processing cannot be provided by software on a general-purpose processor

Page 3:

Timing Issues

Return results for each pixel at frame rate of camera

Very Low Latency

• Surgical laser must be shut off immediately after detecting that it is aimed incorrectly

• Cannot tolerate 1 frame delay (33 msec at 30 frames/sec)

• Complex memory management required to achieve minimum latency

Latency currently on the order of 100 µsec

[Figure: storage of a 5x5 pixel image split across Memory 0 (pixels 0, 2, 4, ..., 24) and Memory 1 (pixels 1, 3, 5, ..., 23).]

Page 4:

Implementation

Hardware Acceleration

• Template responses are calculated in hardware in parallel

• All pixels in the image are processed

• Camera connected directly to the board (no host interaction)

• Only the results are sent to the host after they become available, and while new results are being calculated

Very Low Latency

• Surgical laser must be shut off immediately if we detect that it is aimed incorrectly

• Cannot tolerate a one frame delay

• Complex memory management scheme must be introduced

Algorithm

What does the algorithm do?

• Retinal vascular tracing: detection of blood vessels in images of the retina

• The algorithm finds blood vessels and traces out their structure

Where is the algorithm used?

• Processing live video of the patient retina during laser retinal surgery

• Highlighting the vascular structure helps the surgeon avoid damage

Why do we need to accelerate it?

• Current implementation: software on a general-purpose processor

• Images are 512x512 pixels, and need to be processed at frame rate
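To make the frame-rate requirement concrete, the following is a small worked calculation rather than anything taken from the poster. It assumes the 30 frames/sec rate quoted in the timing notes and a uniform pixel stream (no blanking intervals), and derives how many pixels per second must be processed and the average time available per pixel.

```python
# Throughput budget for 512x512 frames at 30 frames/sec.
# Assumption: pixels arrive at a uniform rate (blanking intervals ignored).
WIDTH, HEIGHT = 512, 512      # image size stated on the poster
FRAMES_PER_SEC = 30           # frame rate quoted in the timing notes

pixels_per_frame = WIDTH * HEIGHT
pixels_per_sec = pixels_per_frame * FRAMES_PER_SEC
frame_period_ms = 1000 / FRAMES_PER_SEC
ns_per_pixel = 1e9 / pixels_per_sec

print(f"pixels per frame : {pixels_per_frame}")
print(f"pixels per second: {pixels_per_sec}")
print(f"frame period     : {frame_period_ms:.1f} ms")
print(f"time per pixel   : {ns_per_pixel:.1f} ns")
```

Under these assumptions a result is needed for roughly 7.9 million pixels every second, which leaves on the order of a hundred nanoseconds per pixel.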

Acceleration of the Retinal Vascular Tracing Algorithm using FPGAs

Miriam Leeser, Shawn Miller, Badrinath Roysam, Charles Stewart, Ken Fritsche

This work was supported in part by CenSSIS, the Center for Subsurface Sensing and Imaging Systems, under the Engineering Research Centers Program of the National Science Foundation (Award Number EEC-9986821).

More Information

In proceedings: A. Can, H. Shen, J.N. Turner, H.L. Tanenbaum and B. Roysam, "Rapid Automated Tracing and Feature Extraction from Retinal Fundus Images Using Direct Exploratory Algorithms," IEEE Transactions on Information Technology in Biomedicine, June 1999.

On the web: http://www.ece.neu.edu/groups/rpl/projects/retinaltracing

Results

Original Image: Each pixel is passed through the design unaltered.

Direction: The direction template with the maximum response for each pixel. The direction is represented by a value between 0 and 15.

Response: The maximum response that led to the direction decision for each pixel.
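The Direction and Response outputs amount to picking, for each pixel, the template with the largest response. The sketch below shows one way to do that selection with a pairwise comparator reduction, loosely echoing the TEMPLATE_COMPARATOR labels in the poster's diagram. The function names, the tie rule, and the sample response values are all hypothetical; the actual template arithmetic is not given in the source and is not modeled here.

```python
def template_comparator(a, b):
    """Return the (direction, response) pair with the larger response.

    Ties keep the first input; this tie rule is an assumption.
    """
    return b if b[1] > a[1] else a


def max_direction(responses):
    """Reduce per-template responses to one (direction, response) pair.

    Candidates are compared two at a time, round by round, until a
    single winner remains.
    """
    candidates = list(enumerate(responses))
    while len(candidates) > 1:
        winners = [
            template_comparator(candidates[i], candidates[i + 1])
            for i in range(0, len(candidates) - 1, 2)
        ]
        if len(candidates) % 2:          # odd one out advances unchanged
            winners.append(candidates[-1])
        candidates = winners
    return candidates[0]


# Made-up responses for the 16 direction templates at one pixel.
responses = [3, 7, 1, 0, 12, 5, 9, 2, 4, 15, 6, 8, 0, 1, 11, 10]
direction, response = max_direction(responses)
print(direction, response)
```

For 16 inputs the reduction takes four rounds, and the winning direction always fits in the 0 to 15 range described above.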

Objective

To accelerate retinal vascular tracing by implementing computation of template responses in reconfigurable hardware.

Reconfigurable Hardware

Firebird reconfigurable computing engine from Annapolis Micro Systems

• 1 Xilinx VIRTEX E (XCV2000E) FPGA

• 5 Memory banks (4 x 64-bit, 1 x 32-bit)

• 5.4 Gbytes/sec of memory bandwidth

• 66 MHz/64-bit PCI interface to host
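As a rough sanity check on the board figures above, the following arithmetic compares the quoted on-board memory bandwidth with the peak rate of the PCI link. It assumes decimal units (1 Gbyte = 10^9 bytes) and that all five banks transfer a full word on every access; neither assumption is confirmed by the poster, and the variable names are illustrative only.

```python
# Figures quoted on the poster.
BANK_WIDTHS_BITS = [64, 64, 64, 64, 32]   # 4 x 64-bit, 1 x 32-bit
MEM_BANDWIDTH_BYTES_PER_SEC = 5.4e9       # assumes 1 Gbyte = 1e9 bytes
PCI_CLOCK_HZ = 66e6
PCI_WIDTH_BITS = 64

# Bytes moved if every bank transfers one full word at the same time.
bytes_per_access = sum(BANK_WIDTHS_BITS) // 8

# Access rate that would yield the quoted bandwidth (an inference, not a spec).
implied_access_rate_hz = MEM_BANDWIDTH_BYTES_PER_SEC / bytes_per_access

# Theoretical PCI peak, ignoring protocol overhead.
pci_peak_bytes_per_sec = PCI_CLOCK_HZ * PCI_WIDTH_BITS / 8

ratio = MEM_BANDWIDTH_BYTES_PER_SEC / pci_peak_bytes_per_sec

print(f"bytes per combined access: {bytes_per_access}")
print(f"implied access rate      : {implied_access_rate_hz / 1e6:.0f} MHz")
print(f"PCI peak                 : {pci_peak_bytes_per_sec / 1e6:.0f} Mbytes/sec")
print(f"memory / PCI ratio       : {ratio:.1f}x")
```

On these assumptions the on-board memories are about ten times faster than the best case for the host link.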

[Figure: block diagram of the FIREBIRD BOARD. Labels: Framegrabber (Dillon Eng.); FPGA; Direction Filter Design; Data Packing Design; BlockRAM; Memory Switching Design (two blocks); MEMORY 0 to MEMORY 3; PCI Bus; Host.]

[Figure: Direction Filter Design. Labels recovered from the diagram: four PARTIAL RESPONSE blocks; signals POS2, POS, NEG, NEG2; adders, subtractors and a less-than comparator; outputs RESULT and DIRECTION; eight RESPONSE blocks; seven TEMPLATE_COMPARATOR blocks with inputs TEMPLATE_A, TEMPLATE_B, RESPONSE_A, RESPONSE_B, a greater-than (gt) comparison, and outputs TEMPLATE and RESPONSE; INTERCONNECTION; registers REG0 to REG23; codes 000 to 111.]

Direction Templates

Conclusions

• Stand-alone camera outputs only image data.

• Our design outputs not only image data, but directional template responses as well.

• The cost of additional image processing is a latency on the order of 10^-4 seconds. This is a low cost when considering that at 30 frames/sec, a new frame of image data is introduced every 33 msec.

• The application for this project is Retinal Vascular Tracing, but the same method can be applied to any problem that requires real-time image processing.
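To put the latency figure from the conclusions in perspective, the short calculation below expresses 10^-4 seconds as a fraction of a frame period and as a number of image rows. It assumes 512x512 frames delivered at a uniform pixel rate with no blanking, which is a simplification not stated in the poster.

```python
LATENCY_SEC = 1e-4            # order-of-magnitude latency from the conclusions
FRAMES_PER_SEC = 30
WIDTH, HEIGHT = 512, 512      # image size stated in the Algorithm section

frame_period_sec = 1 / FRAMES_PER_SEC
latency_fraction = LATENCY_SEC / frame_period_sec

# Assumption: pixels arrive uniformly across the frame period.
pixel_rate = WIDTH * HEIGHT * FRAMES_PER_SEC
latency_in_pixels = LATENCY_SEC * pixel_rate
latency_in_rows = latency_in_pixels / WIDTH

print(f"latency as share of a frame: {latency_fraction:.1%}")
print(f"latency in pixel times     : {latency_in_pixels:.0f}")
print(f"latency in image rows      : {latency_in_rows:.2f}")
```

Under these assumptions the added delay corresponds to well under one percent of a frame, or on the order of one to two image rows.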

Memory Management

Problems

• Must continuously write new data from the camera to input memory while also reading the data to be processed

• Cannot read and write from one memory on the same clock cycle

• The image is stored row-wise, but must be read column-wise

Solution

• Store the image in "checkerboard" fashion in two memories. Every other pixel is stored in a different memory

• A column of data is read by alternating between the two memories on every clock cycle

[Figure: timing diagram for Input Memory 0 and Input Memory 1 against the Clock over Time. Legend: Writing data from camera; Reading data to be processed; Inactive.]

Checkerboard storage of a 5x5 image

Image row   Memory 0      Memory 1
0           0  2  4       1  3
1           6  8          5  7  9
2           10 12 14      11 13
3           16 18         15 17 19
4           20 22 24      21 23
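The checkerboard scheme can be sketched in a few lines. This is a minimal software model under stated assumptions, not the board's actual addressing logic: the bank is chosen as (row + column) mod 2, each memory is modeled as a dictionary keyed by pixel position, and the function names are hypothetical. Using row plus column, rather than the parity of the linear pixel index, keeps the alternation down a column working for even image widths such as 512 as well as for the 5x5 example.

```python
def bank_of(row, col):
    """Checkerboard rule: horizontally and vertically adjacent pixels
    fall in different memories."""
    return (row + col) % 2


def store_checkerboard(image):
    """Split a row-major image (list of rows) across two memories.

    Each memory is modeled as a dict mapping (row, col) to a pixel value;
    real hardware addressing is not modeled.
    """
    memories = [{}, {}]
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            memories[bank_of(r, c)][(r, c)] = pixel
    return memories


def read_column(memories, col, height):
    """Read one column top to bottom, noting which memory each access uses."""
    banks, pixels = [], []
    for r in range(height):
        b = bank_of(r, col)
        banks.append(b)
        pixels.append(memories[b][(r, col)])
    return banks, pixels


# 5x5 image whose pixel values are their row-major indices 0..24.
image = [[5 * r + c for c in range(5)] for r in range(5)]
memories = store_checkerboard(image)
banks, pixels = read_column(memories, col=1, height=5)

print(sorted(memories[0].values()))   # contents of Memory 0
print(sorted(memories[1].values()))   # contents of Memory 1
print(banks, pixels)                  # bank used and pixel read on each cycle
```

Reading column 1 touches the two memories alternately, so no memory is accessed on two consecutive cycles; the same alternation holds when writing along a row.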