Post on 11-Jan-2016
1 Electronics Lab, Physics Dept., Aristotle Univ. of Thessaloniki, Greece
2 Micro2Gen Ltd., NCSR Demokritos, Greece
17th IEEE International Conference on Electronics, Circuits, and Systems ICECS 2010 – Athens - Greece
Outline
› Motivation
› Canny Algorithm
› Proposed Canny Implementation
› Simulations – Results
› Conclusions
C.-L. Sotiropoulou – Real-Time Canny FPGA Implementation – AUTH-eLab 2
Necessity of Edge Detection
› First step in many computer vision algorithms
› Identification of sharp discontinuities in an image
Why use the Canny algorithm
› Good performance in images contaminated by noise
The need for Real-Time/High Throughput Implementation
› Camera resolutions have multiplied in recent years
› Real-time applications
The performance of modern FPGA devices
› Powerful, efficient, with available on-chip memory resources
Canny algorithm stages (calibrated image pixels → binary edges)
› Gaussian Smoothing (5x5 convolution): smoothing filter
› Sobel Edge Detection (3x3 convolution): calculation of gradient
› Non-Maximum Suppression: localization
› Double Thresholding (≥ Thigh, ≥ Tlow) and Hysteresis: elimination of spurious responses
Introduction of a 4-pixel parallel computation design
Pipelined architecture using on-chip BRAM memories for caching
Very efficient design with the same memory requirements as a design without parallelism
In addition, complex arithmetic operations were replaced with shifts and additions/subtractions
Pipeline stages and latencies
› Gaussian Smoothing computation: 2 lines + 1 word latency
› Sobel Gradient computation: 1 line + 1 word latency
› Non-Maximum Suppression computation: 1 line + 1 word latency
› Double Thresholding + Hysteresis 1st pass
› Hysteresis 2nd pass (with frame buffering)
Exploitation of minimum buffering before starting the computation
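As a rough illustration of the latencies listed above, the following sketch adds them up for a given image width. It assumes that the three stage latencies simply accumulate and that a word is 4 pixels, as in the rest of the design; the function and constant names are hypothetical. The result is expressed in 4-pixel words, since the slides do not state how many clock cycles a word takes.

```python
PIXELS_PER_WORD = 4  # the design groups pixels into 4-pixel words

# (stage name, lines of latency, extra words of latency), as listed on the slide
STAGE_LATENCY = [
    ("Gaussian smoothing", 2, 1),
    ("Sobel gradient", 1, 1),
    ("Non-maximum suppression", 1, 1),
]


def latency_in_words(image_width, stages=STAGE_LATENCY):
    """Accumulated latency, in 4-pixel words, before thresholding can start.

    Assumption: stage latencies add up and image_width is a multiple of 4.
    """
    if image_width % PIXELS_PER_WORD:
        raise ValueError("image width must be a multiple of 4 pixels")
    words_per_line = image_width // PIXELS_PER_WORD
    return sum(lines * words_per_line + words for _, lines, words in stages)


if __name__ == "__main__":
    # 4 lines + 3 words for a 512-pixel-wide image
    print(latency_in_words(512))
```

For a 512-pixel-wide image this gives 4 lines of 128 words plus 3 words, a small fraction of the 512 lines in a frame.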
Gaussian Smoothing: 5 x 5 convolution
› Introducing 4-pixel parallelism
› Substitution of the multiplications and divisions with shifts, additions and subtractions
25 pixels (one 5x5 block) → 40 pixels (four adjacent 5x5 blocks)
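The 4-pixel parallel convolution can be sketched in software as follows. The slides do not give the Gaussian coefficients, so this example assumes the common binomial kernel [1 4 6 4 1] x [1 4 6 4 1] (sum 256), which happens to map cleanly onto shifts and additions; the function names are hypothetical. Four adjacent 5x5 blocks overlap in a window of 5 rows by 8 columns, which is where the 40 pixels come from.

```python
def binomial5(a, b, c, d, e):
    """Compute 1*a + 4*b + 6*c + 4*d + 1*e with shifts and additions only."""
    return a + (b << 2) + (c << 2) + (c << 1) + (d << 2) + e


def gaussian_4px(window):
    """Smooth four adjacent pixels from a 5-row x 8-column window.

    Assumption: binomial 5x5 kernel with sum 256, so the final division
    is a right shift by 8 (with rounding).
    """
    # Vertical pass: one weighted sum per column, shared by all four outputs.
    cols = [binomial5(*(window[r][c] for r in range(5))) for c in range(8)]
    out = []
    for k in range(4):
        acc = binomial5(*cols[k:k + 5])   # horizontal pass for output k
        out.append((acc + 128) >> 8)      # divide by 256, rounding to nearest
    return out


if __name__ == "__main__":
    flat = [[100] * 8 for _ in range(5)]
    print(gaussian_4px(flat))  # a flat region stays flat
```

Sharing the eight column sums between the four outputs is one way to see why processing four pixels at once costs much less than four independent 5x5 convolutions.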
Figure: Gaussian smoothing block diagram. Input pixels and control signals enter a cache write manager that fills five cached picture lines; a cache read manager feeds 4 adjacent 5x5 pixel blocks to the Gaussian parallel processing block (four PUs); an output formatter produces the output pixels and output control signals, all under control and synchronization logic.
Exploitation of on-chip BRAMs
› Use of one BRAM to accommodate one image line, aligned in 4-pixel words
› Each BRAM has a size of (image width / 4) x 32 bits (grayscale)
Figure: pixels no. 1 to 12 of a cached line grouped into three 4-pixel words, with the 1st, 2nd, 3rd and 4th reads marked.
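The line-cache layout described above can be modelled with a few helper functions. This is a minimal sketch: the byte order inside a word (first pixel in the least significant byte) is an assumption, since the slides only state that four 8-bit grayscale pixels share one 32-bit word, and all names are hypothetical.

```python
def pack_word(p0, p1, p2, p3):
    """Pack four 8-bit pixels into one 32-bit word (p0 in the low byte: assumed order)."""
    for p in (p0, p1, p2, p3):
        if not 0 <= p <= 255:
            raise ValueError("pixels must be 8-bit grayscale values")
    return p0 | (p1 << 8) | (p2 << 16) | (p3 << 24)


def unpack_word(word):
    """Recover the four 8-bit pixels from a 32-bit word."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def line_to_words(line):
    """Split one image line into 4-pixel words, as stored in one BRAM."""
    if len(line) % 4:
        raise ValueError("line length must be a multiple of 4 pixels")
    return [pack_word(*line[i:i + 4]) for i in range(0, len(line), 4)]


def bram_bits(image_width):
    """Storage for one cached line: (image width / 4) words of 32 bits."""
    return (image_width // 4) * 32


if __name__ == "__main__":
    words = line_to_words(list(range(1, 13)))  # pixels 1..12 -> 3 words
    print([hex(w) for w in words], bram_bits(1280))
```

Because a word carries the same number of bits as its four pixels, the cache for one line needs exactly as many bits as a pixel-by-pixel cache would, which matches the claim that parallelism does not increase the memory requirement.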
Start of calculations
› As soon as the first 2 lines of the cache are filled
› All non-existing lines and columns of pixels necessary for the calculations are considered to be black
Figure: video frame
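Treating non-existing lines and columns as black amounts to zero padding at the frame border. A minimal software sketch of such a window fetch is shown here; the function name and the list-of-rows image representation are hypothetical choices for illustration.

```python
def fetch_window(image, top, left, rows, cols):
    """Return a rows x cols window whose top-left corner is (top, left).

    Pixels that fall outside the image are read as 0 (black).
    """
    height, width = len(image), len(image[0])
    return [
        [image[r][c] if 0 <= r < height and 0 <= c < width else 0
         for c in range(left, left + cols)]
        for r in range(top, top + rows)
    ]


if __name__ == "__main__":
    image = [[7] * 4 for _ in range(3)]
    # 5 x 8 window needed for the first four output pixels of the first line
    for row in fetch_window(image, -2, -2, 5, 8):
        print(row)
```

With this convention the first output line needs only the first two image lines below it to be available, which agrees with starting the 5x5 computation once 2 lines are cached.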
Sobel Edge Detection: 3 x 3 convolution
› Introducing 4-pixel parallelism
› Substitution of the multiplications and divisions with shifts, additions and subtractions
› Use of fixed point arithmetic
9 pixels (one 3x3 block) → 18 pixels (four adjacent 3x3 blocks)
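A software sketch of the 4-pixel Sobel step follows. The Sobel masks only contain the weights 1 and 2, so the multiplication reduces to a left shift by one. The gradient magnitude is approximated here as |Gx| + |Gy|, which is an assumption made for this example; the slides do not say how the magnitude is computed. Four adjacent 3x3 blocks fit in a window of 3 rows by 6 columns, which gives the 18 pixels.

```python
def sobel_4px(window):
    """Return (gx, gy, magnitude) for four adjacent pixels.

    window: 3 rows x 6 columns of pixel values.
    Assumption: magnitude approximated as |gx| + |gy|.
    """
    out = []
    for k in range(4):
        a, b, c = window[0][k:k + 3]
        d, _, f = window[1][k:k + 3]
        g, h, i = window[2][k:k + 3]
        # Weights of 2 are implemented as a shift by one bit.
        gx = (c + (f << 1) + i) - (a + (d << 1) + g)   # right column minus left
        gy = (g + (h << 1) + i) - (a + (b << 1) + c)   # bottom row minus top
        out.append((gx, gy, abs(gx) + abs(gy)))
    return out


if __name__ == "__main__":
    edge = [[0, 0, 0, 255, 255, 255] for _ in range(3)]  # vertical step edge
    print(sobel_4px(edge))
```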
Start of calculations
› As soon as the first line of the cache is filled
› All non-existing lines and columns of pixels necessary for the calculations are considered to be black
Figure: video frame
Non-Maximum Suppression
› Same principles as in Sobel Gradient Calculation
› Requires the 3 x 3 neighboring pixels
9 pixels (one 3x3 block) → 18 pixels (four adjacent 3x3 blocks)
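One possible way to perform non-maximum suppression on four pixels at a time without division or trigonometry is sketched here. The direction quantization (four directions, with tan 22.5° approximated as 13/32 and tan 67.5° as 77/32 so that only shifts and additions are needed) and the tie-breaking rule are assumptions made for this example, not details taken from the slides; all names are hypothetical.

```python
def times13(x):
    """13 * x using shifts and additions."""
    return (x << 3) + (x << 2) + x


def direction(gx, gy):
    """Quantize the gradient direction to a (row, column) neighbor offset."""
    ax, ay = abs(gx), abs(gy)
    if (ay << 5) <= times13(ax):                # |gy| <= ~0.406 |gx|
        return (0, 1)                           # compare left / right
    if (ay << 5) >= (ax << 6) + times13(ax):    # |gy| >= ~2.406 |gx|
        return (1, 0)                           # compare up / down
    return (1, 1) if (gx >= 0) == (gy >= 0) else (1, -1)


def nms_pixel(mag3x3, gx, gy):
    """Keep the centre magnitude only if it is a maximum along the gradient."""
    dr, dc = direction(gx, gy)
    centre = mag3x3[1][1]
    ahead = mag3x3[1 + dr][1 + dc]
    behind = mag3x3[1 - dr][1 - dc]
    return centre if centre >= ahead and centre > behind else 0


def nms_4px(mag_window, grads):
    """mag_window: 3 rows x 6 columns of magnitudes; grads: four (gx, gy) pairs."""
    return [
        nms_pixel([row[k:k + 3] for row in mag_window], gx, gy)
        for k, (gx, gy) in enumerate(grads)
    ]


if __name__ == "__main__":
    ridge = [[0, 5, 9, 5, 0, 0] for _ in range(3)]
    print(nms_4px(ridge, [(10, 0)] * 4))  # only the ridge pixel survives
```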
Double Thresholding is a double comparator
› No caching required for this stage
Hysteresis normally requires the 3 x 3 neighboring pixels
› Reduced to 4 neighboring pixels
› Second pass in the opposite direction
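The double comparator and the two-pass hysteresis can be sketched as follows. The slides do not say which 4 neighbors are used, so this example assumes the four already-visited neighbors in scan order (west, north-west, north, north-east for the forward pass, and their mirror images for the reverse pass); the label values and function names are hypothetical.

```python
NONE, WEAK, STRONG = 0, 1, 2

# Assumed neighbor sets: pixels already visited in each scan direction.
FORWARD = [(0, -1), (-1, -1), (-1, 0), (-1, 1)]
BACKWARD = [(0, 1), (1, 1), (1, 0), (1, -1)]


def double_threshold(mag, t_low, t_high):
    """Classify each magnitude with two comparisons (>= Thigh, >= Tlow)."""
    return [[STRONG if m >= t_high else WEAK if m >= t_low else NONE
             for m in row] for row in mag]


def hysteresis_pass(labels, offsets, reverse):
    """Promote weak pixels that touch a strong pixel among 4 neighbors (in place)."""
    h, w = len(labels), len(labels[0])
    rows = range(h - 1, -1, -1) if reverse else range(h)
    cols = range(w - 1, -1, -1) if reverse else range(w)
    for r in rows:
        for c in cols:
            if labels[r][c] != WEAK:
                continue
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and labels[rr][cc] == STRONG:
                    labels[r][c] = STRONG
                    break


def hysteresis(labels):
    """Forward pass, then a second pass in the opposite direction; returns binary edges."""
    labels = [row[:] for row in labels]
    hysteresis_pass(labels, FORWARD, reverse=False)
    hysteresis_pass(labels, BACKWARD, reverse=True)
    return [[1 if v == STRONG else 0 for v in row] for row in labels]


if __name__ == "__main__":
    labels = double_threshold([[0, 50, 50, 200, 50, 0]], t_low=40, t_high=100)
    print(hysteresis(labels))
```

In this sketch the forward pass alone would miss the weak pixels to the left of the strong one, and the reverse pass picks them up. Two raster passes do not follow every possible edge path, so the result can differ from a full connectivity search on unusual shapes.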
Figure: Hysteresis block diagram. Components: input pixel words and control signals; cache logic with one cached picture line; hysteresis comparisons and logic producing the resulting 8-bit data word; output formatter producing output pixel words and output control signals; control and synchronization logic.
Synthesis results for 3 different FPGAs (slices)

Device       Gauss   Sobel   NMS   Db_Thres   Hysteresis   Total   Total (%)   Frequency (MHz)
Spartan-3E   2613    1054    649   37         84           4200    28%         120.4
Spartan-6    2418    1391    651   36         126          4560    2%          201.4
Virtex-5     2409    1389    648   40         124          4553    6%          292.8
Timing results for Spartan-6

Image        Size (pixels)   Time (ms)
lena         512x512         0.66
HCLAChip     960x540         1.31
Disc-brake   1280x960        3.09
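The timing figures can be cross-checked with a little arithmetic. The sketch below converts each measurement into a pixel throughput and, assuming the timings were taken at the 201.4 MHz Spartan-6 synthesis frequency (the slides do not state the clock used for the timing runs), into an effective number of pixels per clock cycle. The names are hypothetical.

```python
SPARTAN6_CLOCK_MHZ = 201.4  # synthesis frequency; assumed to be the timing clock

# (image, width, height, time in ms) from the Spartan-6 timing results
TIMINGS = [
    ("lena", 512, 512, 0.66),
    ("HCLAChip", 960, 540, 1.31),
    ("Disc-brake", 1280, 960, 3.09),
]


def throughput_mpixels_per_s(width, height, time_ms):
    """Pixel throughput in Mpixel/s for one processed frame."""
    return width * height / (time_ms * 1e-3) / 1e6


if __name__ == "__main__":
    for name, w, h, t in TIMINGS:
        rate = throughput_mpixels_per_s(w, h, t)
        per_clock = rate / SPARTAN6_CLOCK_MHZ
        print(f"{name}: {rate:.1f} Mpixel/s, {per_clock:.2f} pixels per clock")
```

All three images come out between roughly 396 and 398 Mpixel/s, which is in line with the 396 frames per second quoted for 1 Mpixel images on the Spartan-6.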
Real-time Canny implementation
› Parallel architecture with 4-pixel calculation
› Increased throughput without increased need for memory resources
› 240 frames per second achieved for 1 Mpixel images on a Spartan-3E with 28% of the area
› 580 frames per second on a Virtex-5
› 396 frames per second on a Spartan-6
The research activities that led to these results were co-financed by Hellenic Funds and by the European Regional Development Fund (ERDF) under the Hellenic National Strategic Reference Framework (ESPA) 2007-2013, according to Contract no. MICRO2-49-project LoC.
Thank you very much for your attention!