
Mateusz Nawrocki

Design and Implementation of Algorithms for Vehicle Identification – License Plate Detection and Vehicle Identification

MASTER'S THESIS

submitted in fulfilment of the requirements for the academic degree

Diplom-Ingenieur

Programme: Computer Science (Informatik)

Alpen-Adria-Universität Klagenfurt

Faculty of Technical Sciences

Reviewer:

Univ.-Prof. Dr.-Ing. Kyandoghere Kyamakya

Institute: Institut für Intelligente Systemtechnologien

Date (month/year): 09/2007

Declaration of Honour for Master's Theses, Diploma Theses and Dissertations

I declare on my honour that I have prepared this scientific work independently and have myself carried out the activities directly connected with it. I further declare that I have used no aids other than those indicated. All formulations and concepts taken verbatim or in substance from printed or unprinted sources or from the Internet are cited according to the rules for scientific work and are identified by footnotes or other precise references.

The support received during the work, including significant advice from supervisors, is fully stated.

This scientific work has not yet been submitted to any other examination authority. It has been submitted in printed and electronic form. I confirm that the content of the digital version is completely identical to that of the printed version.

I am aware that a false declaration will have legal consequences.

(Signature) (Place, Date)

Contents

1 Introduction
  1.1 Definitions
  1.2 Motivation
  1.3 The aim and scope of the work
  1.4 Expected results
  1.5 Expected problems

2 State of the art
  2.1 Existing ALPR systems
    2.1.1 British national ALPR system
    2.1.2 Congestion charge system in London
    2.1.3 Other solutions
  2.2 Algorithms used in ALPR
    2.2.1 Localizing algorithms
    2.2.2 Segmentation algorithms
    2.2.3 OCR algorithms

3 System design
  3.1 General view of the system
  3.2 DSP optimizations
    3.2.1 Benefits from TMS320C6414 architectural features
    3.2.2 Examples of optimization used in the system

4 License plate localization
  4.1 Input data specification
  4.2 Algorithm
  4.3 Example

5 License plate segmentation
  5.1 Input data specification
  5.2 Algorithm
  5.3 Example

6 Character recognition
  6.1 Input data specification
  6.2 Algorithm
  6.3 Example

7 Tests
  7.1 Performance tests
    7.1.1 Tests for DSP optimization
    7.1.2 Overall module performance
  7.2 Quality tests
    7.2.1 License plate localization
    7.2.2 License plate segmentation
    7.2.3 Character recognition

8 Conclusions
  8.1 Successful issues
  8.2 Encountered problems
  8.3 Future work

List of Figures

1.1 Example of the license plate accepted by the system.

2.1 Example of the part of the Dutch license plate optimized for automatic recognition
2.2 The mobile camera in London Congestion Charge zone [Fou07]
2.3 The original image (left) and image processed by FIR edge filter (right). Source: [EKF06]
2.4 The sums in rows with the local minimums marked and the sums in columns (left). The localized license plate (right). Source: [EKF06]
2.5 The scheme of the adaptive thresholding algorithm. Source: [CCJ03]
2.6 The pixel's membership functions to the bright and dark fuzzy sets.
2.7 The horizontal reduction of the number plate. Source: [Mar07]
2.8 Character area basis feature map. Source: [KK03]
2.9 Background area basis feature map. Source: [KK03]
2.10 The scheme of the template matching algorithm. Source: [KK03]

3.1 The general view of the CamComp system. Author: Stanislaw Szczepanowski
3.2 Four bytes packed into general purpose register. Source: [Tex06]
3.3 packl4 intrinsic from the packXX4 family. Source: [Tex06]
3.4 Intrinsic for saturated packs – spack2. Source: [Tex06]

4.1 The example of input data for plate localization algorithm
4.2 Membership functions for different features used for plate evaluation.
4.3 Input image binarized with the different threshold values.
4.4 The binarized frame after application of Sobel operator.
4.5 The magnified license plate.
4.6 The correct localization of the license plate.
4.7 The false positive example.
4.8 The false negative example

5.1 The example of algorithm's input data.
5.2 The gray plate and the values in the reduced vector.
5.3 All minimums (green) and maximums (red)
5.4 The verified maximum values.
5.5 The bounding boxes of each object present in the license plate.
5.6 The final result of the segmentation algorithm.
5.7 The example of the blurred and skewed plate.
5.8 The color analysis error due to the poor lighting conditions.

6.1 The part of the “E” training set.
6.2 The example of the template.
6.3 The example of output of the recognition submodule.

7.1 The time of execution of different version of shrinking function.
7.2 The
7.3 The example of the incorrect localization of the license plate.
7.4 The plate as the object with the highest membership value, but the value is under the threshold – “plate not found” response returned.
7.5 The example of incorrect license plate segmentation
7.6 The example of incorrect license plate segmentation

List of Tables

6.1 The example of characters recognition for plate “K 944AE”
6.2 The example of characters recognition for plate “K 479DB”
6.3 The example of characters recognition for plate “K 7042H”

7.1 Different versions of scaling compared to the DSP optimized code. The numbers show how many percent the method is slower.
7.2 Different versions of RGB to grayscale conversion compared to the DSP optimized code. The numbers show how many percent the method is slower.
7.3 The results of the license plate localization quality test.
7.4 The results of the character recognition quality test.

List of Algorithms

1 The plate localization algorithm.
2 The plate segmentation algorithm
3 The exact localization of the character
4 The image by template multiplication procedure.
5 The template creation.

Listings

3.1 Unoptimized shrinking code.
3.2 Optimized shrinking code.
3.3 Unoptimized code of RGB to gray conversion.
3.4 Optimized code of RGB to gray conversion.

Abstract

The number of threats in today's world increases steadily, and governments in many countries try to counteract them. Homeland security takes up an ever larger share of their budgets. Not only are police forces strengthened, but many intelligent systems that collect data about citizens are also developed. Automatic license plate recognition systems are one example. They are widely discussed in the literature, with many relevant articles published around the world. In addition, today's processors are powerful enough to cope with approaches that were impossible to implement until quite recently. The ability to process enormous quantities of data makes it possible to build systems that can track and identify every car that passes by. This work is an effort to create such a system. The general idea behind it is to monitor the movement of each vehicle in the city. There are five goals to achieve:

1. Work in real time.

2. Autonomy of the system: no human operator is needed to assist the recognition process.

3. Identification of every vehicle that passes by.

4. Ability to work under different weather conditions.

5. Ability to identify plates from different countries.

Chapter 1

Introduction

This master's thesis describes a part of the CamComp system (see the Definitions section) and all tests that were made to estimate its reliability and efficiency. The introduction outlines the structure of the thesis. Moreover, it presents the motivation for starting research in this area, defines the most important terms and describes the scope and the aim of the work. Based on the scope, the author's expectations and the possible problems are outlined. Chapter 2 contains information about existing ALPR systems and presents some current approaches to the problems that arise in ALPR tasks. Chapter 3 contains a brief description of the CamComp system. It also shows how beneficial the DSP architecture can be for pattern recognition tasks: some examples of optimized code are shown and compared to unoptimized ANSI C code. Chapters 4, 5 and 6 present a detailed description of the consecutive steps of the algorithm. The description of each step consists of three subsections: input data, theoretical description and some examples showing it in action. Chapter 7 reports on the tests that were performed on the system. The last chapter, chapter 8, presents the final conclusions. Both successfully completed and abandoned or failed approaches are mentioned there. Finally, the current functionality and capability are presented, followed by the plans for future work.

1.1 Definitions

This section contains definitions of the most important terms and abbreviations used in this master's thesis.

OCR stands for Optical Character Recognition. Defined in [Fou07] as the set of techniques that transform images of handwritten or typewritten text, usually acquired by a scanner or camera, into machine-editable text.

Surveillance System a system that consists of a set of cameras and a monitoring site. Its main aim is to monitor the behavior of people or traffic in public places in order to ensure that social rules, speed limits, etc. are obeyed.

ALPR stands for Automatic License Plate Recognition. [Fou07] defines ALPR as a mass surveillance method that uses OCR to read vehicles' license plates. It is sometimes called ANPR (Automatic Number Plate Recognition).

DSP stands for Digital Signal Processor. The architecture and advantages of DSPs are described in [NST07].

CamComp the name of the surveillance system that is created within this and two other master's theses ([Szc07], [Tar07]). Its detailed description can be found in chapter 3.

1.2 Motivation

The number of threats in today's world increases steadily. Their variety is also huge: it ranges from pilferage to acts of terrorism. As a reaction, many countermeasures are taken. In most Polish cities surveillance systems are installed. They significantly decrease the number of acts of vandalism, robberies, car thefts, etc. On the other hand, many cameras in the city remain unused because the acquired image must be watched by an operator. The police do not have enough staff to observe everything simultaneously. Such systems have another disadvantage as well: storing the video stream from all cameras requires very large storage capacity.

A solution to this problem might be to replace the human operator with an intelligent system that continuously analyzes the acquired image and detects potentially dangerous situations. These can be archived and the relevant staff can be alerted. As far as car theft is concerned, a surveillance system without the disadvantages mentioned above can be developed. On the basis of the above considerations, the author claims that a system that only films passing vehicles has very limited efficiency. It takes many staff members a long time to localize a particular car. Moreover, it is almost impossible to monitor many suspected vehicles or to detect them in real time when they pass the camera.


Figure 1.1: Example of the license plate accepted by the system.

As a consequence, the ALPR tasks should be done automatically and locally at the camera site. Under those circumstances the personnel can be reduced to a few persons who administer the system. ALPR offers a variety of possibilities that are unavailable with human-controlled cameras, for example:

• storing all passing vehicles in the database with their photos and recognized license plate numbers,

• querying for the last position of a particular vehicle,

• detecting the current position of a suspected vehicle when it passes in front of a camera.

The proposed system is thus free from all the disadvantages mentioned before. It gives brand new possibilities to the police or homeland security agencies and requires less effort from employees and officers.

1.3 The aim and scope of the work

The aim of this master's thesis is to create a module of the CamComp system. This module receives a part of the acquired image as input and is intended to return the number read from the license plate. In general, the system is intended to recognize all types of license plates. Their variety is enormous: they have different shapes and colors, and letters can be arranged in more than one row. For example, in Poland and Austria license plates are white rectangles with black letters. Nevertheless, in both countries old plates (white letters on a black background) are still in use. In this master's thesis only single-row plates with black letters on a white background are regarded as “correct”. An example of such a plate can be seen in figure 1.1.

The module described in this master's thesis is split into three submodules:


Localization This submodule is responsible for the localization of the license plate within the color input image. It should pass the plate's rectangle to the next submodule or give a “Plate not found” response if there is no license plate present. The detailed description of this submodule can be found in chapter 4.

Segmentation This submodule gets as input the rectangular color (8-bit, three channels) image that contains only the license plate. Its aim is to extract all characters. All other objects, such as the blue country field or the crest present on Austrian plates, must be eliminated. This submodule gives the output “Cannot segment the plate” if it is impossible to find enough glyphs within the input object (this can happen if the object is not a plate or if the plate is dirty, blurred, scratched, etc.). For a correct license plate the output contains the list of glyphs that were found. This submodule is described in chapter 5.

Recognition The aim of this submodule is to classify the glyphs received from the segmentation process. The input in this case is a grayscale image (8-bit, one channel) containing only the glyph and its nearest surroundings. These surroundings must not contain elements of other glyphs. The submodule gives the assignment to a class as the output. In some cases a glyph can be assigned to several classes, but to no more than three. In such a case the classes are sorted in order of relevance, with the most probable first. This can happen with letters that are similar. For example, given the letter “F” as input, the classifier might regard it as similar not only to the “F” template, but also to “P” or “E”. The output of this submodule consists of three arrays of characters (primary, secondary, tertiary). The classes that best fit the given glyphs are placed in the primary array. If it is likely that a glyph also belongs to other classes, the relevant position in the secondary or even the tertiary array is filled in. Chapter 6 contains the extensive description of this submodule.

The second main aim of this master's thesis is to evaluate this system. The author intends to discover under which conditions it can provide reliable results, in other words, which factors improve or worsen the overall results. This evaluation should also show how the system can be improved.


1.4 Expected results

The fundamental expectation is to recognize most (over 50%) of the “correct plates” under favorable conditions (strong ambient light, a sharp image, an undamaged plate, a direct front view of the car). In addition, this recognition must be done in real time, simultaneously with other tasks such as those described in [Tar07]. A plate is regarded as recognized when, for every character, the correct class is present in the primary, secondary or tertiary output array.
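The recognition criterion stated above can be illustrated with a minimal sketch. The function name, the use of strings for the three output arrays and the "-" placeholder for an empty slot are assumptions made for this example only; they are not taken from the CamComp implementation.

```python
def plate_recognized(truth, primary, secondary, tertiary):
    """Return True if, at every position, the true character appears in
    the primary, secondary or tertiary candidate array."""
    if not (len(truth) == len(primary) == len(secondary) == len(tertiary)):
        return False
    return all(
        t in (p, s, r)
        for t, p, s, r in zip(truth, primary, secondary, tertiary)
    )


# Hypothetical classifier output; "-" marks an unused slot (assumption).
primary = "K944AF"
secondary = "----4E"
tertiary = "-----P"

# The last glyph is "E": it is missing from the primary array but present
# in the secondary one, so the plate still counts as recognized.
print(plate_recognized("K944AE", primary, secondary, tertiary))
```

Under this criterion a plate counts as recognized even when the best guess for a glyph is wrong, as long as the correct class is among the listed candidates.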

The author also expects that the created module should be capable of recognizing plates under conditions that differ from the ideal. The camera may not be placed exactly in front of the vehicle, or the plate may be lit by strong sunlight. Such factors will certainly make recognition harder and can lower the success rate.

1.5 Expected problems

When pattern recognition is used to recognize real-world objects, many problems usually arise. Every object is different and atypical in its own way. When it comes to ALPR, no two cars are identical, and the plate can be located almost anywhere on the vehicle.

In the CamComp system, problems may appear in many areas. Firstly, they may appear during acquisition. The first difficulty is to provide a sharp image. There is a tradeoff between shutter speed and aperture value. A narrow aperture ensures a large depth of field, but reduces the quantity of light on the camera's sensor, which enforces long shutter times. With such times, moving vehicles may be blurred. On the other hand, a wide aperture enables short shutter times, but reduces the depth of field, so that the image is sharp only at one particular distance. Another issue is sensor noise, which makes plates less distinct, especially under poor lighting conditions. What is more, strong sun reflections can result in sensor errors, seen as bright vertical stripes through the whole image.

Secondly, many problems may appear during the recognition phase. At first glance the most obvious one is the necessity of processing a large amount of data in real time. According to [NST07], the camera used can produce up to 15 color frames per second at a resolution of 1280 × 960. The system can therefore receive a huge amount of uncompressed data: at one byte per channel, 15 fps × 1280 × 960 × 3 channels ≈ 55 MB/s.
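The data rate quoted above can be checked with a few lines of arithmetic. The sketch assumes one byte per color channel, which matches the 8-bit, three-channel images mentioned earlier; the variable names are chosen for illustration only.

```python
FPS = 15                    # frames per second reported in [NST07]
WIDTH, HEIGHT = 1280, 960   # frame resolution in pixels
CHANNELS = 3                # color channels, one byte each (assumption)

bytes_per_frame = WIDTH * HEIGHT * CHANNELS
bytes_per_second = FPS * bytes_per_frame

print(bytes_per_frame)            # 3686400
print(bytes_per_second / 1e6)     # 55.296 (decimal megabytes per second)
print(bytes_per_second / 2**20)   # 52.734375 (binary megabytes per second)
```

Depending on whether decimal or binary megabytes are meant, the raw stream is roughly 53 to 55 MB/s, so the order of magnitude of about 50 MB/s holds either way.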

Another issue is the state of the license plate. It can be in the format accepted by the system, but have scratches, dents, or flies and mosquitoes splattered on it. The lack of a distinct border between the plate and the rest of the car can also be a worsening factor. It may cause problems with localization, because the system might find it difficult to distinguish the plate as a separate object on the car. Weather phenomena such as rain, snow or fog can reduce the reliability of the system.

Chapter 2

State of the art

This chapter describes a few ALPR systems that are currently in use and presents a few approaches that are used in this field. Due to growing computational power and hardware capabilities, ALPR is increasingly used in video surveillance systems. Such systems are already in operation, and new algorithms are continuously being invented. At the end of this chapter, proposals for other methods of vehicle identification are presented.

2.1 Existing ALPR systems

There are two main applications of ALPR systems: homeland security and toll collection. ALPR systems vary from simple devices that open a parking barrier only for authorized vehicles to complicated wide-area networks of sensors that cover a whole city or even a whole country. There are also ideas for making the recognition process easier. In some countries (e.g. the Netherlands) plates have been modified so that the glyphs are less similar to each other. A Dutch plate can be seen in figure 2.1. Similar letters like “P” and “R” might be misclassified. To make the difference between them more significant, they have gaps in different places.

Figure 2.1: Example of the part of the Dutch license plate optimized for automatic recognition


Under those circumstances, drivers try to counteract. There are some products, for example liquid sprays for covering the plate, that are advertised as a way to mislead automatic law enforcement systems. The author has seen a few such advertisements on TV and on the Internet.

2.1.1 British national ALPR system

This system consists of several thousand cameras located throughout the country. Its aim is to monitor the traffic and inform the police about all vehicles that could have been stolen or involved in crimes. Integration with the Police National Computer, the national information system that stores all available data on people, vehicles, property, crimes, etc., makes it possible to point out suspected vehicles. Criminals can therefore be located and captured more easily.

According to [Mat07], the first ALPR nodes were used in 1997 in London and were located at the edge of the City. The country-wide introduction of the system followed successful trials that took place in 23 police forces between June 2003 and June 2004, as reported in [Com05]. Development continues: according to the British Home Office, £32.5 million will be spent on ANPR technical development in the period 2005-2007. The system is believed to be a powerful countermeasure against criminals. Frank Whiteley, Chair of the ACPO ANPR Steering Group, says in [Com05]: “The police service is now integrating ANPR into its day to day activities as a mainstream policing tool”.

The current functionality of the system is impressive. [Mat07] reports that the system uses an Oracle database. The data for some special analyses are extracted to a Postgres database for performance reasons. The data are collected using dedicated police networks instead of the Internet. The recognition reliability reaches 95% for British plates; nevertheless, the photo of the plate is additionally stored in order to have 100% certainty and to evaluate the reliability accurately. The reliability is slightly worse for foreign license plates. All acquired data are stored for 2 years and can be used as evidence in court.

The benefits of the system can be seen, for example, in West Yorkshire. The local police report in [Pol07] that in 2006 the ALPR system led to 1,120 arrests, and 12,734 cars were stopped for inspection.

2.1.2 Congestion charge system in London

The previous system was created for law enforcement purposes. The aim of the one described in this section is toll collection for entering the center of London. According to [fLc07], the system consists of a network of still cameras located at every entrance and exit of the Congestion Charging zone. Additional mobile cameras are placed within the zone. Such a camera can be seen in figure 2.2.

Figure 2.2: The mobile camera in London Congestion Charge zone [Fou07]

The system allows users to pay for entry into the zone in advance or one day after. The fee for one day is £8. Weekly, monthly and annual subscriptions are also available. When a vehicle is detected within the zone, its license plate is checked against the database. If the fee has already been paid or the vehicle is exempt, the vehicle's image is deleted from the database on the spot. If the fee has not yet been paid, the plate's image is stored in the database until midnight of the following charging day. If the fee still has not been paid by then, the automatic recognition is verified manually by the staff and a “Penalty Charge Notice” is issued. The accuracy of the system is claimed to be higher than 90%, that is, fewer than 10% of the captured license plates are recognized incorrectly.

This system eliminates the necessity of building extra barriers, and no extra tickets or passes are required. It reduces costs, because the personnel need to intervene only when the owner of the vehicle has not paid. In addition, the database of plates seen in the center is also available to the anti-terrorist agencies, which brings security benefits.

2.1.3 Other solutions

ALPR systems are strongly criticized, mainly by social organizations that defend citizens' privacy. Furthermore, economic arguments are raised against ALPR systems. They are said to be expensive, and the opponents claim that they can be replaced by cheaper and more reliable solutions. According to them, RFID tags or “tag and beacon” systems would be such a solution.

2.2 Algorithms used in ALPR

This section presents different approaches to the problems addressed in this master's thesis. These problems are: license plate localization, segmentation and character recognition. The presented algorithms are intended to be as varied as possible in order to provide a wide view of this field of research.

2.2.1 Localizing algorithms

In this subsection three algorithms are presented. Each of them takes a different approach and relies on different features of the plates.

The first method presented here is the “Number Plate Localization on the Basis of Edge Finding” [EKF06]. It is based on the observation that the license plate is an area of the image with high contrast, usually between black and white or between black and yellow. The characters on the plate are organized in one row or a few rows. The authors limit the search area to the image rows that exhibit the biggest brightness variation. In the first step an edge finding algorithm is applied to the whole image to highlight the high-contrast areas characteristic of number plates. The authors consider band-pass filters (with both finite and infinite impulse response) to be the best for this purpose. In the end the finite impulse response (FIR) filter was chosen because it executes faster. This kind of filter produces a sum of the inputs multiplied by the relevant coefficients. Once the image is filtered (figure 2.3 (right)), each of its rows is summed. The plate's vertical position is established from the amplitude of the row sums: it is the place where the amplitude has the highest value. The upper and lower ends of the plate are located in the following way: beginning from the row with the maximal sum, the algorithm searches in both directions (up and down) for a row whose sum is half of the maximum. After finding such rows, it searches for the nearest local minimum. The upper and lower edges of the plate should be located near these minimums. To avoid getting stuck in a local minimum caused by image noise, a low-pass filter is applied to the vector of row sums. This procedure can be seen in figure 2.4 (left). The horizontal location of the plate is determined in a similar way to the vertical one, but the values are summed over columns. In this case several areas might be selected as possible locations of the plate, because the gaps between the letters are uniform areas. They are merged in the post-processing phase, where knowledge about the size and aspect ratio of the plate is used.
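The vertical localization step of this method can be sketched in a few functions. This is a minimal illustration and not the implementation from [EKF06]: the first-difference edge measure stands in for the band-pass FIR filter (whose coefficients are not given here), the moving average stands in for the low-pass filter, and all function names are assumptions.

```python
def row_edge_energy(image):
    """Sum of absolute differences between horizontally adjacent pixels in
    each row (a simple stand-in for the FIR edge filter)."""
    return [sum(abs(row[x + 1] - row[x]) for x in range(len(row) - 1))
            for row in image]


def smooth(values, radius=1):
    """Moving-average low-pass filter applied to the vector of row sums."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def walk_to_edge(sums, start, step):
    """From the peak row, move until the sum drops to half of the peak,
    then continue to the nearest local minimum."""
    half = sums[start] / 2
    i = start
    while 0 <= i + step < len(sums) and sums[i] > half:
        i += step
    while 0 <= i + step < len(sums) and sums[i + step] < sums[i]:
        i += step
    return i


def vertical_band(image):
    """Return (top, bottom) row indices of the most edge-rich band."""
    sums = smooth(row_edge_energy(image))
    peak = max(range(len(sums)), key=sums.__getitem__)
    return walk_to_edge(sums, peak, -1), walk_to_edge(sums, peak, +1)


# Synthetic 10 x 12 image: uniform gray with three high-contrast rows (4..6).
image = [[128] * 12 for _ in range(10)]
for y in (4, 5, 6):
    image[y] = [0, 255] * 6

print(vertical_band(image))
```

The returned band is slightly wider than the high-contrast rows because of the smoothing, which agrees with the remark that the plate edges lie near, not exactly at, the local minimums. The horizontal position could be found in the same way on column sums.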


Figure 2.3: The original image (left) and image processed by FIR edge filter (right). Source: [EKF06]

Figure 2.4: The sums in rows with the local minimums marked and the sums in columns (left). The localized license plate (right). Source: [EKF06]

The second method of locating the license plate is presented in [CCJ03]. It is completely different from the one presented above, as it is based on adaptive thresholding. To work correctly under a variety of lighting conditions, the authors decided to find the appropriate threshold iteratively. The scheme of the algorithm can be seen in figure 2.5. The main idea behind this approach is to find a threshold which, applied to the input image, creates uniform black or white areas. The only area that remains non-uniform is the license plate. The input data was a single 320 × 240 image.

At the very beginning the input image is preprocessed with a 3 × 3 median filter to reduce noise. The first step of the algorithm is to binarize image A to produce image B. The initial threshold can be set to

\[ T = G_{\max} - \frac{G_{\max} - G_{\min}}{3}, \]

where $G_{\min}$ and $G_{\max}$ are the minimal and maximal values within the image. The second step reduces the background disturbance and produces image C. This is done by recalculating the value of each pixel in the following way:

\[ C(i, j) = \begin{cases} |B(i, j) - B(i, j - 1)| & i = 0, \ldots, 319;\ j = 1, \ldots, 239 \\ B(i, 0) & j = 0. \end{cases} \]

Figure 2.5: The scheme of the adaptive thresholding algorithm. Source: [CCJ03]
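The first two steps can be sketched as follows. This is an illustrative sketch rather than the code from [CCJ03]: the image is stored as a list indexed as b[i][j] to mirror the notation of the formula, the comparison `>=` in the binarization is an assumption (the text does not say how ties are handled), and the function names are hypothetical.

```python
def initial_threshold(gray):
    """T = G_max - (G_max - G_min) / 3 over all pixel values."""
    g_min = min(min(line) for line in gray)
    g_max = max(max(line) for line in gray)
    return g_max - (g_max - g_min) / 3


def binarize(gray, t):
    """Image A -> image B: 1 where the pixel reaches the threshold, else 0."""
    return [[1 if v >= t else 0 for v in line] for line in gray]


def reduce_background(b):
    """Image B -> image C:
    C(i, j) = |B(i, j) - B(i, j-1)| for j >= 1 and C(i, 0) = B(i, 0)."""
    return [[line[0]] + [abs(line[j] - line[j - 1]) for j in range(1, len(line))]
            for line in b]


# Tiny example, indexed as gray[i][j].
gray = [[10, 10, 200, 200],
        [10, 220, 220, 10]]

t = initial_threshold(gray)   # 220 - (220 - 10) / 3 = 150.0
b = binarize(gray, t)         # [[0, 0, 1, 1], [0, 1, 1, 0]]
c = reduce_background(b)      # [[0, 0, 1, 0], [0, 1, 0, 1]]
print(t, b, c)
```

Only the transitions between dark and bright pixels survive in image C, which is why uniform background areas are mostly set to 0 after this step.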

After this operation most of the background is usually set to 0. The remaining characters consist of thin isolated vertical lines, while the background is irregular. For this reason a median filter with the mask $(1, 1, 1, 1, 1)^T$ is applied to image C, producing image D. The next (third) step is the vertical localization of the plate. Image D is projected onto a single column. This column is then scanned from the bottom to the top until a value greater than a constant $t$ is found. This is probably the bottom edge of the plate and it is labeled $P_b$. The scan then continues until a value less than $t$ is found. This may be the location of the plate's top and is labeled $P_h$. The height $H = P_b - P_h$ of the plate candidate must be between 10 and 30 pixels. If it is, the next step can be performed. Otherwise the scan of the column continues until it reaches the top of the image, which indicates failure (another threshold value $T$ must be tested). Horizontal localization is the fourth step. Image D is limited to the lines between $P_h$ and $P_b$, and its projection onto a single row is calculated. The horizontal position of the plate is established in a similar way to the vertical one. The only difference is that some short intervals of lower values are allowed, to account for the gaps between the characters. The left and the right edge are labeled $P_l$ and $P_r$ respectively. If the candidate region has a width $W = P_r - P_l$ between 40 and 90 pixels, the color verification can be done and a final decision is made.
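The vertical localization step can be sketched as a scan over the projection. This is a hedged illustration, not the code from [CCJ03]: here the image is stored as a list of rows with the top row first, the name `find_vertical_extent` is an assumption, and a value equal to the constant is treated as "not greater than" it.

```python
def find_vertical_extent(d, t, min_h=10, max_h=30):
    """Scan the projection of binary image d from bottom to top.

    d is a list of rows (top row first). Returns (p_h, p_b), where p_b is
    the first row from the bottom whose projection exceeds t and p_h is the
    next row above it whose projection does not (p_h is -1 if the band
    touches the top of the image). Returns None if no band has a height
    p_b - p_h within [min_h, max_h]."""
    profile = [sum(row) for row in d]   # projection onto a single column
    y = len(profile) - 1
    while y >= 0:
        while y >= 0 and profile[y] <= t:   # skip background rows
            y -= 1
        if y < 0:
            return None                      # reached the top: failure
        p_b = y
        while y >= 0 and profile[y] > t:    # walk through the candidate band
            y -= 1
        p_h = y
        if min_h <= p_b - p_h <= max_h:
            return p_h, p_b
    return None


# Synthetic 40 x 5 image: a 2-row noise band (rows 37..38) and a 15-row
# plate-like band (rows 20..34).
d = [[0] * 5 for _ in range(40)]
for row in list(range(20, 35)) + [37, 38]:
    d[row] = [1] * 5

print(find_vertical_extent(d, t=2))
```

In this example the noise band is rejected because its height is only 2, and the scan continues upward to the 15-row band. A `None` result corresponds to the failure case in which another threshold value must be tested.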

The third approach to the plate localization problem is described in [ZFMV97]. The main idea of this algorithm is to split the 768 × 576 image into 75 × 25 tiles. For each tile a fitness value is calculated, and the plate is localized within the tile that has the highest fitness value. The fitness calculation uses fuzzy set theory, proposed by L. Zadeh in [Zad65], which makes it possible to transform a description in natural language into a mathematical formula. Such a description is the entry point of this algorithm. The authors defined a license plate as an object with the following features: it is a bright area with dark regions, it is located in the middle or lower middle part of the image, it has a bright border, and its dimensions are 530 mm × 120 mm. This description was then transformed into a fuzzy system. Every feature was described with a membership value in the relevant fuzzy set. For example, the membership functions of a pixel in the bright and dark classes are shown in figure 2.6. This membership value is used for computing the lengths of the bright and dark pixel sequences. The length also has its own membership function. Such functions are also created for the horizontal and vertical position of the tile. The overall fitness of the tile is calculated as the product of the membership values for horizontal position, vertical position, average dark sequence length and average bright sequence length. The tile with the highest overall fitness value contains the license plate. The exact borders are established according to the feature “bright border”.

Figure 2.6: The pixel’s membership functions for the bright and dark fuzzy sets.
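The product rule for the tile fitness can be illustrated with a short C sketch. The structure and function names are hypothetical, and the four membership values are assumed to have been computed beforehand; the shapes of the membership functions themselves are not reproduced here.

```c
#include <stddef.h>

/* Membership values of one tile, each in the range [0, 1].
 * The field names are assumptions made for this illustration. */
typedef struct {
    double horizontal_position;
    double vertical_position;
    double dark_sequence_length;
    double bright_sequence_length;
} TileMembership;

/* Overall fitness: the product of the four membership values. */
double tile_fitness(const TileMembership *m)
{
    return m->horizontal_position * m->vertical_position *
           m->dark_sequence_length * m->bright_sequence_length;
}

/* Returns the index of the tile with the highest fitness
 * (0 when count is 0, so the caller should check count first). */
size_t best_tile(const TileMembership *tiles, size_t count)
{
    size_t best = 0, k;

    for (k = 1; k < count; k++)
        if (tile_fitness(&tiles[k]) > tile_fitness(&tiles[best]))
            best = k;
    return best;
}
```

Because the values are multiplied, a single feature with membership 0 rules the tile out completely, however well the other features match.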

There are further techniques beyond the three presented above, for example methods based on the Mean Shift procedure, as in [JZH05].

2.2.2 Segmentation algorithms

This stage of the number plate recognition process seems to be the easiest one, although many problems can appear in it. Plates can differ in format and color, and can have one or more rows. Most algorithms rely on the horizontal projection of the binarized plate image.

A good example of such an approach is presented in [Mar07]. It analyzes the horizontal projection of the binarized plate image; an example of this operation is presented in figure 2.7.

Figure 2.7: The horizontal reduction of the number plate. Source: [Mar07]

One can observe that there are many peaks within the projection. Some of them represent gaps between characters, some do not. For example, there is a peak inside the digit “0” as well as inside the letter “B”. It follows that the segmentation algorithm cannot be reduced to just finding the peaks. Its goal is to find all gaps between letters, and only them. [Mar07] proposes the following procedure. First, a few symbols need to be defined:

• v_m – the maximum value of the horizontal projection p_x(x).

• v_a – the average value of the horizontal projection p_x(x).

• v_b – the value used as a base for evaluating peak height, calculated as v_b = 2v_a − v_m.

• w – the width of the plate in pixels.

The algorithm iteratively searches for the highest peak in the projection p_x(x). If it is a gap, it is zeroed and the next peak is searched for. The following scheme presents the details of the algorithm:

1. Determine the index of the maximum value of the horizontal projection:

\[ x_m = \arg\max_{0 \le x < w} p_x(x) \]

2. Detect the left and right foot of the peak as:

\[ x_l = \max_{0 \le x \le x_m} \{ x \mid p_x(x) \le c_x \cdot p_x(x_m) \} \]

\[ x_r = \min_{x_m \le x < w} \{ x \mid p_x(x) \le c_x \cdot p_x(x_m) \} \]

3. Zero the horizontal projection p_x(x) on the interval ⟨x_l, x_r⟩.

4. If p_x(x_m) < c_w · v_m, go to step 7.

5. Divide the plate horizontally at the point x_m.

6. Go to step 1.

7. End.
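The scheme above can be sketched in C as follows. This is an illustrative reading of the steps, not the code from [Mar07]: the function name and signature are hypothetical, the values of c_x and c_w are not given in the text and are therefore passed in by the caller, and the peak height used in step 4 is read before the peak is zeroed in step 3.

```c
#include <stddef.h>

/* Finds split positions in the horizontal projection px[0..w-1].
 * cx and cw are the tuning constants c_x and c_w (values assumed by
 * the caller). Up to max_splits positions are written to splits and
 * their count is returned. px is modified: found peaks are zeroed. */
size_t find_splits(int *px, size_t w, double cx, double cw,
                   size_t *splits, size_t max_splits)
{
    size_t count = 0, x;
    int vm = 0;

    if (w == 0)
        return 0;
    for (x = 0; x < w; x++)             /* v_m: maximum of the projection */
        if (px[x] > vm)
            vm = px[x];

    while (count < max_splits) {
        size_t xm = 0, xl, xr;
        int peak;

        /* Step 1: index of the current maximum. */
        for (x = 1; x < w; x++)
            if (px[x] > px[xm])
                xm = x;
        peak = px[xm];

        /* Step 4, evaluated on the peak height before zeroing. */
        if (peak <= 0 || peak < cw * vm)
            break;

        /* Step 2: left and right foot of the peak. */
        xl = xm;
        while (xl > 0 && px[xl] > cx * peak)
            xl--;
        xr = xm;
        while (xr + 1 < w && px[xr] > cx * peak)
            xr++;

        /* Step 3: zero the projection on the interval <xl, xr>. */
        for (x = xl; x <= xr; x++)
            px[x] = 0;

        /* Step 5: the plate is divided at xm. */
        splits[count++] = xm;
    }
    return count;
}
```

The split positions are returned in order of decreasing peak height, so a caller that needs them from left to right would sort them afterwards.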

This type of algorithm is frequently used for license plate segmentation. There are many other approaches, for example the extraction of connected components used in [HP05]. This approach assumes that each character is represented by a single connected component. For high-quality, sharp images the assumption is satisfied. Problems may occur when the image is blurred or the plate is damaged: the glyphs can then be fragmented, or several of them can be merged into one connected component. Consequently some pre- and post-processing is needed.

2.2.3 OCR algorithms

OCR algorithms are not developed only for ALPR purposes. OCR is a dynamically developing research field, and it is also used, for example, in offices for reading scanned texts. Many different approaches have been invented.

Template matching is one of the easiest methods to understand. In general, it checks whether the given objects correspond to a particular pattern. In [KK03] a weighted template matching method is presented. This method consists of two stages: feature template matching and relative feature template matching. The first stage uses templates constructed from a statistical analysis of the original glyphs. The authors treat a region of a glyph as “important” if it is present only in this particular glyph or in very few others. If some part is common to many letters, it is treated as being of little importance. This can be seen in figures 2.8 and 2.9. The bright regions in the patterns denote the important regions, which are unique to the particular digits.

The candidate glyph is scaled and multiplied by the given feature maps for the background and foreground. The most probable glyphs are passed to the second stage, relative feature template matching. In this stage the candidate glyph is examined using relative weight templates created only for pairs of characters. They emphasize the differences between the pairs of characters that were chosen in the first stage. The scheme of the whole algorithm can be seen in figure 2.10.


Figure 2.8: Character area basis feature map. Source: [KK03]

Figure 2.9: Background area basis feature map. Source: [KK03]

Figure 2.10: The scheme of the template matching algorithm. Source: [KK03]

Another method used for OCR purposes is the Support Vector Machine (SVM). It establishes a separating hyperplane between two classes in the feature space. The following description comes from [ZH06].

Given the 2-class classification problem, let

\[ \Omega = \{ (x_1, y_1), (x_2, y_2), \dots, (x_n, y_n) \mid x_i \in R^d,\ y_i \in \{-1, 1\},\ i = 1, 2, \dots, n \} \]

be the set of input-output training pairs, where R^d is the input space, denoted by X. The SVM first projects the input vectors into the feature space by a function Φ : X → F. The linear classifier is described by the hyperplane’s normal vector w and offset b, which determine the hyperplane in the feature space:

\[ H_{w,b} : w^T x + b = 0 \]

where x, w ∈ F and b ∈ R. The SVM maximizes the margin between the positive input vectors (x_i with y_i = 1) and the negative input vectors (x_i with y_i = −1), i = 1, ..., n. This is equivalent to maximizing 2/||w||, where ||·|| denotes the norm. To determine the separating hyperplane that maximizes the margin, the following problem must be solved:

\[ \min \frac{1}{2} \|w\|^2 \quad \text{s.t.} \quad y_i (\langle w, x_i \rangle + b) \ge 1, \ \forall i \]

During classification, the class label is obtained according to:

\[ f_{H_{w,b}}(x) = \mathrm{sgn}(w^T \Phi(x) + b) \]

SVM can also be used in problems where the classes cannot be separated by a linear hyperplane; slack variables need to be used then. The SVM was initially designed for binary classification, as can easily be seen in the above description. When the problem has more decision classes, as in OCR, one of two extensions is used. Both of them decompose the multi-class problem into a set of binary problems. The first extension is called “one against all”. For each decision class, one SVM classifier is created that separates it from the other classes. The chosen class is the one with the highest output value. The second extension, “one against one”, applies a series of classifiers, one for each possible pair of decision classes. The output is the most frequently chosen class.
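The decision rule and the “one against all” extension can be sketched in C as follows. The sketch assumes already trained classifiers and an identity mapping Φ(x) = x (a purely linear SVM); the function names, the row-major weight layout and the convention that an output of exactly 0 is labeled +1 are assumptions made for this example.

```c
#include <stddef.h>

/* Raw classifier output w^T x + b for a d-dimensional input,
 * assuming the identity feature mapping. */
double svm_output(const double *w, double b, const double *x, size_t d)
{
    double sum = b;
    size_t k;

    for (k = 0; k < d; k++)
        sum += w[k] * x[k];
    return sum;
}

/* Binary decision: sgn(w^T x + b); zero is mapped to +1 here. */
int svm_classify(const double *w, double b, const double *x, size_t d)
{
    return svm_output(w, b, x, d) >= 0.0 ? 1 : -1;
}

/* "One against all": w_all holds one weight vector per class in
 * row-major order (classes x d), b_all one offset per class.
 * Returns the class whose classifier gives the highest output. */
size_t one_against_all(const double *w_all, const double *b_all,
                       size_t classes, const double *x, size_t d)
{
    size_t best = 0, c;
    double best_out = svm_output(w_all, b_all[0], x, d);

    for (c = 1; c < classes; c++) {
        double out = svm_output(w_all + c * d, b_all[c], x, d);
        if (out > best_out) {
            best_out = out;
            best = c;
        }
    }
    return best;
}
```

A “one against one” variant would instead run svm_classify for every pair of classes and count the votes for each class.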

Another method frequently used for OCR purposes is the Artificial Neural Network (ANN). It is inspired by the neural networks present, for example, in the human brain. ANNs are described in many sources, for example [Mar07].

Chapter 3

System design

This chapter presents the created system. As mentioned in chapter 1, it has been created within the scope of this master thesis and two others ([Tar07] and [Szc07]).

3.1 General view of the system

The CamComp system is intended to be a distributed network of cameras with a central administration point that enables an administrator to control the whole system. Each camera is equipped with an embedded computer that performs the ALPR tasks in real time. The nodes are able to communicate with each other, to exchange information about sought vehicles, and to alert the headquarters that such a vehicle has appeared. Another function is retrieving the past locations of a given car. The system stores in the database the image and the license number of each vehicle that passes by. A detailed description of the system functionality, with use cases and a user manual, can be found in [NST07]. The general system architecture is presented in figure 3.1.

The system can be divided into 3 functional blocks: motion detection and tracking, vehicle identification, and distributed data exchange/storage.

3.2 DSP optimizations

The detailed description of the system’s hardware platform is given in the appendix [NST07]. This section explains how the programmer can benefit from its features while writing pattern recognition algorithms. Many kinds of optimization are possible compared with standard ANSI C code running on the x86 platform.


Figure 3.1: The general view of the CamComp system. Author: Stanislaw Szczepanowski

According to [NST07], the TMS320C6414 processor has many features that enable the programmer to write very efficient programs. The ability to perform many operations in one cycle is invaluable for image operations. On the other hand, writing code optimized for the C6000 processor is not as easy as writing the same code in ANSI C. It requires additional knowledge of the processor’s architecture and its limitations. Old habits from coding on a standard PC sometimes get in the way or lead to bad solutions. There are also many ways to help the compiler optimize better. [Tex06] contains tips on how to write efficient code for the C6000 platform.

3.2.1 Benefits from TMS320C6414 architectural features

As mentioned before, the compiler cannot optimize standard C code to be as fast as it could be. If it could, one of the popular graphics libraries, for example OpenCV, could simply be compiled and used, and programming the DSP would be identical to programming a PC. Instead, the programmer has to keep the C6414 architecture in mind and code the algorithm in an efficient way. In the best case the input data and the algorithm itself are also fitted to the processor’s architecture. That means that the dimensions of the input image are divisible by some number, usually 8, 16, 32, etc. The algorithm should use data structures that can be easily processed by the processor. For example, the short type should be used wherever possible – multiplying short values is 5 times faster than multiplying int values. So it is sometimes better to split the algorithm into a few parts with narrower domains, so that the values stay within the short range. Moreover, the C6414 is a fixed-point processor – it has no hardware support for floating-point operations. They are emulated on the fixed-point hardware, which is extremely slow. The programmer should avoid such operations wherever possible.

[Tex06] gives some general advice on how to write efficient code. Texas Instruments proposes the following 3-step framework for developing efficient applications:

1. Develop standard C code, without any knowledge of the C6000. Use Code Composer Studio’s profiling tools to locate the inefficient areas of the C code. To improve them, proceed to step 2.

2. Use compiler intrinsics, the restrict keyword, and MUST_ITERATE pragmas to improve the efficiency. Profile the code. If it is not efficient enough, proceed to step 3.

3. Rewrite the time-critical parts of the program using linear assembly.

While working on this thesis only the first two steps were carried out. In all cases sufficient efficiency was achieved after applying the second step. In general, three techniques, apart from fitting the algorithms to the architecture, were used to improve the performance.

The first of them was the MUST_ITERATE pragma. According to [Tex05], it gives the compiler knowledge of specific properties of a loop. This helps the compiler to achieve better software pipelining, nested loop unrolling and smaller code size. The syntax of this pragma is as follows:

#pragma MUST_ITERATE(min, max, multiple)

min and max are the minimum and maximum number of loop iterations, which must be guaranteed by the programmer. The iteration count must be evenly divisible by multiple. All arguments are optional. One has to keep in mind that the results of the program may be unpredictable when one of the above parameters is specified incorrectly (e.g. the real number of iterations is less than min or greater than max). [Tex05] advises using the MUST_ITERATE pragma on every loop, because of the above advantages. Specifying the minimal iteration count enables the compiler to eliminate loop-bypassing code. The multiple helps to unroll the loop, that is, to perform more operations in one iteration and therefore to reduce the iteration count. What is more, after unrolling, memory accesses can be more efficient. The processor can, for example, fetch 4 bytes instead of 1 in a single iteration.
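A small example of how such a pragma might be attached to a loop is given below. The function sum_bytes and its iteration bounds are hypothetical; the pragma is wrapped in a check for the TI compiler’s predefined macro so that the same file also builds with other compilers, which would otherwise warn about an unknown pragma.

```c
/* Sums n bytes. The caller promises that 8 <= n <= 1024 and that n is
 * a multiple of 8; the pragma passes this promise on to the compiler.
 * If the promise is broken, the behaviour on the DSP is unpredictable. */
unsigned int sum_bytes(const unsigned char *data, int n)
{
    unsigned int sum = 0;
    int i;

#ifdef __TI_COMPILER_VERSION__
    #pragma MUST_ITERATE(8, 1024, 8)
#endif
    for (i = 0; i < n; i++)
        sum += data[i];

    return sum;
}
```

The result is the same with or without the pragma; only the generated code differs, since the compiler may drop the loop-bypassing check and unroll the loop by the stated multiple.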

The second optimization technique is the restrict keyword. It helps the compiler to determine memory dependencies. This keyword guarantees that a particular area of memory can be accessed only via the pointer qualified with this keyword.

void func1(int * restrict a, int * restrict b)
{
    /* func1's code here */
}

In the above example from [Tex05], the a and b arrays cannot overlap. The array pointed to by a can be accessed only using this pointer. This must also be guaranteed by the programmer – it is never checked. If this guarantee is violated, the result of the program is undefined.
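As a slightly fuller illustration, the following hypothetical function (not taken from [Tex05]) uses restrict on both of its array parameters. Because the compiler may assume that writes to dst never change src, it is free to load several source elements ahead of the stores.

```c
/* Adds gain * src[i] to dst[i] for i = 0..n-1.
 * The caller guarantees that src and dst do not overlap. */
void scale_add(const short *restrict src, short *restrict dst,
               int n, short gain)
{
    int i;

    for (i = 0; i < n; i++)
        dst[i] = (short)(dst[i] + gain * src[i]);
}
```

Calling scale_add with overlapping arrays, for example scale_add(buf, buf + 1, n, 2), would break the guarantee and give an undefined result.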

The last optimization technique described here is compiler intrinsics. They give access to low-level processor operations. [Tex06] describes intrinsics as “special functions that map directly to inlined C62x/C64x/C64x+/C67x instructions, to optimize your C/C++ code quickly. All instructions that are not easily expressed in C/C++ code are supported as intrinsics”. Their aim is to utilise the programmer’s knowledge of the data characteristics to optimize the processing. In other words, the programmer can explicitly specify how the data will be processed, which can give far better results than automatic optimization by the compiler. All intrinsics available in Code Composer Studio have a leading underscore and can be called like ordinary C/C++ functions. There are many groups of intrinsics: some perform arithmetic operations, some access data in memory, others swap halves of registers or pack and unpack data. A short description of operations on packed data is presented below. Intrinsics that can perform several operations in a single step operate on packed data. The idea behind this feature is very simple: general purpose 32-bit registers are used to store several pieces of data of smaller types, for example 2 short or 4 char values. Such a situation is shown in figure 3.2.

Many intrinsics for manipulating packed data are provided. The data can be loaded directly from memory (for example with _amem4(void*)), or can be packed from two other values in many ways, as in figure 3.3. Another issue is saturated packing. It works like normal packing when the value does not exceed the range of the destination data type. Otherwise the value is set to the maximal value for this type. Such saturated packing is presented in figure 3.4.


Figure 3.2: Four bytes packed into general purpose register. Source: [Tex06]

Figure 3.3: _packl4 intrinsic from the _packXX4 family. Source: [Tex06]

Figure 3.4: Intrinsic for saturated packs – _spack2. Source: [Tex06]


Once packed data has been prepared with the intrinsics described above, the computation can be done. There are many ways of operating on packed data. The following two intrinsics are good examples:

uint _avgu4(uint, uint) calculates the average value of each of 4 pairs of unsigned char values that are stored in unsigned int data types. The corresponding assembly instruction for the C6414 processor is AVGU4.

double _mpysu4(int src1, uint src2) performs 4 multiplications for 4 corresponding pairs of char values stored in an int data type. The result is stored as four 16-bit short values (in a 64-bit double) in order to support values that are out of the char range. The first argument contains 4 packed signed char values, while the corresponding values in the second are treated as unsigned. This intrinsic uses the MPYSU4 assembly instruction.

It is worth noting that processing multiple data with a single instruction restricts the input data. When a whole image must be processed using e.g. the _avgu4 intrinsic, its width should be divisible by 4. Unfortunately this is not always possible, or requires much effort to guarantee. In many cases the limitations are even stricter. For example, the function IMG_thr_gt2thr from the Image Processing Library, provided by Texas Instruments, performs simple image thresholding. To do it efficiently, it requires the input and output data to be double-word aligned and the number of rows to be a multiple of 16. This shows how important it is to consider the architecture while developing the algorithm.
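To make the idea of packed processing concrete, the following portable C function emulates what _avgu4 does to its two arguments, one byte lane at a time. It is only a reference model for experiments on a PC: the name avgu4_emul is hypothetical, and the rounding rule (a + b + 1) >> 1 is assumed here and should be checked against the instruction set reference.

```c
#include <stdint.h>

/* Emulates a packed average of four unsigned bytes.
 * Each 32-bit argument holds four 8-bit values; the result holds the
 * four rounded averages in the same byte positions. */
uint32_t avgu4_emul(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    int lane;

    for (lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;
        uint32_t y = (b >> (8 * lane)) & 0xFFu;

        /* Assumed rounding: halves are rounded up. */
        result |= ((x + y + 1u) >> 1) << (8 * lane);
    }
    return result;
}
```

On the DSP the four lanes are handled by one instruction, whereas this emulation needs a loop; that difference is exactly the gain that the intrinsic offers.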

3.2.2 Examples of optimization used in the system

During the development of the system two things cost the most effort. The first was to design reliable algorithms that give good results. The second, but no less important, was to implement these algorithms efficiently enough to work on the DSP in real time. This section presents two example functions that were optimized to run faster on the DSP using the techniques described in the previous section.

To provide high performance, many operations are performed on a shrunk image. This concerns motion detection as well as number plate localization. These operations do not need high accuracy – a difference of one or two pixels does not affect the final result. Every input frame is shrunk at the beginning, and the shrinking function is the only one performed on the full-size input image, so its speed is crucial for the efficiency of the whole system. Assuming an input video stream of 10 fps, there are only 100 ms for all operations, some of which may be complex. Tuning the shrinking functions was the first challenge to meet. There are actually 2 shrinking functions: one shrinks the image 5 times in each direction, whereas the second shrinks it 2 times in each direction. The optimization described here is illustrated using the function that shrinks the image by a factor of 2. Its parameter list, in both the optimized and the unoptimized case, is as follows:

unsigned char* yInput – the input image, with 8 bits per pixel. Note that both the input and the output image are one-dimensional arrays, so that each is stored in one piece in memory.

unsigned char* yOutput – the output image, with width and height equal to half of the input’s width and height.

int width – width of the input image in pixels. Must be a multiple of 64 and between 200 and 1000.

int height – height of the input image. Must be a multiple of 64 and greater than 256.

Listing 3.1: Unoptimized shrinking code.

void shrink2C(const unsigned char *restrict yInput,
              unsigned char *restrict yOutput, int width, int height)
{
    int i, j;

    /* Each output pixel is the average of a 2 x 2 block of input pixels. */
    for (i = 0; i < height; i += 2)
        for (j = 0; j < width; j += 2)
            yOutput[i * width / 4 + j / 2] =
                ((short)yInput[i * width + j]
                 + yInput[i * width + j + 1]
                 + yInput[(i + 1) * width + j]
                 + yInput[(i + 1) * width + j + 1]) / 4;
}

The unoptimized shrinking function shrink2C, shown in listing 3.1, accepts all even dimensions of the input image. It works as follows: for each pixel of the output image, the sum of the corresponding 2 × 2 area in the input image is calculated and divided by 4 to obtain the average value. The function processes one pixel of the output image (four pixels of the input image) in a single step of the inner loop. That requires three additions and one division. The total number of iterations is width × height/4. As a result, a total of width × height operations is needed for the whole image.

Listing 3.2: Optimized shrinking code.

void shrink2(const unsigned char *restrict yInput,
             unsigned char *restrict yOutput, int width, int height)
{
    int i, j;

    #pragma MUST_ITERATE(100, 500, 16)
    for (i = 0; i < height; i += 2)
    {
        #pragma MUST_ITERATE(32, 250, 8)
        for (j = 0; j < width; j += 8)
        {
            /* Vertical averages of rows i and i + 1, 4 pixels at a time. */
            unsigned int avg1 =
                _avgu4(_amem4_const(&yInput[i * width + j]),
                       _amem4_const(&yInput[(i + 1) * width + j]));

            unsigned int avg2 =
                _avgu4(_amem4_const(&yInput[i * width + j + 4]),
                       _amem4_const(&yInput[(i + 1) * width + j + 4]));

            /* Horizontal averages of neighboring bytes, packed into 4 output pixels. */
            _amem4(&yOutput[i * width / 4 + j / 2]) =
                _packh4(_avgu4(avg2, _swap4(avg2)),
                        _avgu4(avg1, _swap4(avg1)));
        }
    }
}

The optimized version of this function is presented in listing 3.2. It performs the same operation, but in a completely different way. It uses the SIMD extensions of the processor (for the meaning of the SIMD abbreviation see [NST07]), which implies a different processing structure. The function produces 4 output image pixels during a single inner loop iteration. Eight pairs of values from two rows (i and i+1) of the input image are taken into consideration during each iteration. The average value of each pair is calculated using two operations, each capable of processing four pairs at once (_avgu4). After that step, 16 pixels from two rows have been shrunk to one row containing 8 pixels. Then they have to be shrunk horizontally. The average cannot be computed directly with the same operation, because SIMD instructions operate only between two vectors, whereas here the average of neighboring values within one vector is needed. The solution is the _swap4 intrinsic. It swaps bytes 0 and 1, as well as bytes 2 and 3, inside a 4-byte unsigned integer. For example, given the integer 0123 the result would be 1032, where 0, 1, 2, 3 are byte numbers. To shrink the 8-pixel vector to a 4-pixel vector, it can be averaged with its swapped copy. This must be done in 2 steps, because only 4 bytes can be processed simultaneously. After this operation there are four pairs of average values in the 8-pixel vector. Every second of them must be taken into the final result. The _packXX4 intrinsic family is ideal for this purpose (one of its members was presented in figure 3.3). A single iteration requires 4 averaging operations, 2 byte swaps and 1 byte packing. That means 7 operations, which is more than in the classical approach, but one must not forget that 4 pixels of the output image are obtained. The total number of operations for the whole image is width × height × 7/16. In other words, 7 operations must be performed for every 16 input pixels. Compared to the unoptimized version, the optimized one is theoretically more than twice as fast.

In addition to these compiler intrinsics, MUST_ITERATE pragmas were added. Thanks to assumptions made during the implementation, the author knows the properties of the dimensions, e.g. that they are divisible by 64 and that the input image is larger than 300 × 300.
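The data flow of one inner-loop iteration can be followed on a PC with the portable model below. The helper functions are hypothetical stand-ins for the _avgu4, _swap4 and _packh4 intrinsics, written from the behaviour described in the text; the rounding rule (a + b + 1) >> 1 and the little-endian byte order of the loads are assumptions of this sketch.

```c
#include <stdint.h>

/* Packed average of four unsigned bytes (assumed rounding: halves up). */
static uint32_t avgu4_ref(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    int lane;

    for (lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;
        uint32_t y = (b >> (8 * lane)) & 0xFFu;
        r |= ((x + y + 1u) >> 1) << (8 * lane);
    }
    return r;
}

/* Swaps bytes 0 and 1, and bytes 2 and 3. */
static uint32_t swap4_ref(uint32_t v)
{
    return ((v & 0x00FF00FFu) << 8) | ((v >> 8) & 0x00FF00FFu);
}

/* Packs bytes 3 and 1 of hi into the upper half of the result and
 * bytes 3 and 1 of lo into the lower half. */
static uint32_t packh4_ref(uint32_t hi, uint32_t lo)
{
    return (hi & 0xFF000000u) | ((hi & 0x0000FF00u) << 8) |
           ((lo & 0xFF000000u) >> 16) | ((lo & 0x0000FF00u) >> 8);
}

/* Loads four bytes so that p[0] ends up in the lowest byte. */
static uint32_t load4(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* One inner-loop step: 8 pixels from each of two rows -> 4 output pixels. */
void shrink_block(const unsigned char *row0, const unsigned char *row1,
                  unsigned char *out)
{
    uint32_t avg1 = avgu4_ref(load4(row0), load4(row1));
    uint32_t avg2 = avgu4_ref(load4(row0 + 4), load4(row1 + 4));
    uint32_t packed = packh4_ref(avgu4_ref(avg2, swap4_ref(avg2)),
                                 avgu4_ref(avg1, swap4_ref(avg1)));
    int k;

    for (k = 0; k < 4; k++)
        out[k] = (unsigned char)((packed >> (8 * k)) & 0xFFu);
}
```

Under the assumed rounding, averaging in two stages can differ from the scalar (a + b + c + d) / 4 by one gray level: a 2 × 2 block containing the values 0, 1, 0, 0 yields 1 here and 0 in the scalar version. For the shrunk image, where one or two pixels of difference do not matter, this is harmless.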

The second example of C code optimization compares the optimized and unoptimized versions of a function that converts a 3-channel color image to a single-channel grayscale image. The function has the following parameters:

unsigned char* cRed, cGreen, cBlue – the 3 channels of the input image.

unsigned char* yOutput – the output single-channel image. It must be of the same size as the input image.

int width – width of the input and output image.

int height – height of the input and output image.

The unoptimized function is shown in listing 3.3. It works as follows: for each output pixel the sum of the 3 channels at the corresponding position is calculated. Then the sum is divided by 3 to obtain the average value. As a result there are 3 operations (2 additions and one division) per pixel. The total number of operations is width × height × 3.

The optimized version is presented in listing 3.4. 8 pixels are processed in each iteration, therefore the total number of pixels in the image must be a multiple of 8. As in the previous optimization, SIMD instructions are used here. This example also shows how modifying the algorithm can simplify the implementation and boost the performance.


Listing 3.3: Unoptimized code of RGB to gray conversion.

void rgbToGrayC(const unsigned char *restrict cRed,
                const unsigned char *restrict cGreen,
                const unsigned char *restrict cBlue,
                unsigned char *restrict yOutput, int width, int height)
{
    int i;

    /* Each output pixel is the plain average of the three channels. */
    for (i = 0; i < width * height; i++)
    {
        short sum = cRed[i];
        sum += cGreen[i];
        sum += cBlue[i];
        yOutput[i] = sum / 3;
    }
}

The output value is not the exact average of the three input channels. The red, green and blue channels have weights 0.25, 0.5 and 0.25 respectively. This makes it possible to perform all operations using only the _avgu4 intrinsic. What is more, in this case the image can be treated as a single vector, without regard to the placement of the pixels, unlike in the shrinking function.

The optimization is very simple here. First, 8 values from the red and blue channels are averaged (using _avgu4 twice). Then the result is averaged with the green channel in the same way. The output is stored in the array using the _itod intrinsic, which produces an 8-byte double value from two unsigned ints by placing one in the less significant and the other in the more significant half. Theoretically this function needs to perform 5 operations (4 averages and 1 byte packing) for every 8 pixels. The total number of operations needed for the image is width × height × 5/8. So it should theoretically be almost 5 times faster than the unoptimized version. To obtain the real performance, both versions were tested and compared.

The results of the performance test can be found in chapter 7. The methodology of this test was exactly the same as that of the previous one.
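The origin of the weights 0.25, 0.5 and 0.25 can be checked with a scalar model of the two averaging steps. The functions below are hypothetical, and the rounding (a + b + 1) >> 1 is an assumption about how the packed average rounds.

```c
/* Rounded average of two 8-bit values (assumed rounding: halves up). */
static unsigned char avg_u8(unsigned char a, unsigned char b)
{
    return (unsigned char)((a + b + 1) >> 1);
}

/* Gray value as computed by two successive averages:
 * first red with blue, then the result with green.
 * Ignoring rounding this equals 0.25 * r + 0.5 * g + 0.25 * b. */
unsigned char gray_avg(unsigned char r, unsigned char g, unsigned char b)
{
    return avg_u8(avg_u8(r, b), g);
}
```

For example, r = 100, g = 200, b = 100 gives avg(100, 100) = 100 and then avg(100, 200) = 150, which matches 0.25 · 100 + 0.5 · 200 + 0.25 · 100 = 150. Because each step rounds, the result can be slightly above the exact weighted sum: r = 255, g = 0, b = 0 gives 64 instead of 63.75.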


Listing 3.4: Optimized code of RGB to gray conversion.

void rgbToGrayAvg(const unsigned char *restrict cRed,
                  const unsigned char *restrict cGreen,
                  const unsigned char *restrict cBlue,
                  unsigned char *restrict yOutput, int width, int height)
{
    int i;

    #pragma MUST_ITERATE(6144, 153600)
    for (i = 0; i < width * height; i += 8)
    {
        unsigned int a3_a2, a1_a0;
        unsigned int b3_b2, b1_b0;
        unsigned int c3_c2, c1_c0;

        /* Load 8 red and 8 blue values and average them. */
        a3_a2 = _hi(_amemd8_const(&cRed[i]));
        a1_a0 = _lo(_amemd8_const(&cRed[i]));

        b3_b2 = _hi(_amemd8_const(&cBlue[i]));
        b1_b0 = _lo(_amemd8_const(&cBlue[i]));

        c3_c2 = _avgu4(a3_a2, b3_b2);
        c1_c0 = _avgu4(a1_a0, b1_b0);

        /* Average the result with 8 green values. */
        b3_b2 = _hi(_amemd8_const(&cGreen[i]));
        b1_b0 = _lo(_amemd8_const(&cGreen[i]));

        a3_a2 = _avgu4(c3_c2, b3_b2);
        a1_a0 = _avgu4(c1_c0, b1_b0);

        /* Store 8 output pixels at once. */
        _amemd8(&yOutput[i]) = _itod(a3_a2, a1_a0);
    }
}

Chapter 4

License plate localization

This chapter contains a detailed description of the algorithm used in the system to find the license plate within the input image.

4.1 Input data specification

This part of the ALPR process is the first step of vehicle identification. It gets its input data from the motion detection and tracking part described in [Tar07]. The input consists of an image that contains the detected and tracked object, which should be a vehicle with a visible license plate. The input image has the following features and constraints:

• RGB – 3 channels, 8 bits per channel

• Width and height must be greater than 400 pixels.

• Width and height must be a multiple of 64.

As written in the introduction, there is a huge variety of license plate types. Only white, single-row plates are accepted here. What is more, the plate must have a distinctive border that isolates it from the rest of the image; otherwise the plate’s rectangle cannot be distinguished from the background. Additionally, the plate has to be evenly lit. If only half of it is lit by strong sunlight and the rest is in shadow, there is only a slim chance that the plate will be found. Example data can be found in figure 4.1. When the algorithm is able to localize the license plate, it should pass it to the segmentation module. There is another possibility: the tracked object may not be a vehicle, or may be a vehicle that has lost its license plate. In such cases the algorithm should return a “plate not found” response and cancel further computation.


Figure 4.1: The example of input data for the plate localization algorithm

4.2 Algorithm

The algorithm combines two approaches: [CCJ03] and [ZFMV97]. From the first one the idea of adaptive thresholding is taken. The second provides the idea of using fuzzy membership functions and the way of making the final decision. The whole algorithm works iteratively; each iteration applies the search mechanism to the image binarized with a different threshold value. The algorithm’s pseudocode is presented in algorithm 1.

A few comments need to be added here. At the very beginning the input image is converted from RGB to 256-level grayscale. The color information is never used in this phase. Next, the image is shrunk to half of its initial width and height. Both operations are done to improve performance – there are 4 times fewer pixels to process and each is defined by only 1 value instead of 3. The algorithm works in the main loop (line 5), which iterates until a candidate whose evaluation is greater than the minimal acceptable value is found. The evaluation is a floating-point value in the range 0 to 1. To create the candidates, connected components in the binary image are extracted. Connected components that consist of more than MIN_PIXEL_COUNT set pixels are evaluated. The loop also terminates when there is no such candidate but all possible threshold values have been used. All constants were established experimentally. It is very common in machine vision tasks that there is a tradeoff between reliability and performance. The values that influence thresholding are a good example here.


Algorithm 1 The plate localization algorithm.

Require: The image of the tracked object
Ensure: The image of the localized license plate, or a “plate not found” response if there is no plate within the image

 1: grayImage ← rgb2Gray(inputImage)
 2: shrinkedImage ← shrinkImage(grayImage)
 3: bestEval ← 0
 4: threshold ← INITIAL_THRESHOLD
 5: while threshold > MIN_THRESHOLD and bestEval < MIN_EVAL do
 6:   binaryImage ← threshold(shrinkedImage, threshold)
 7:   candidates ← findConnectedComponents(binaryImage)
 8:   for all candidate ∈ candidates such that pixelCount(candidate) > MIN_PIXEL_COUNT do
 9:     eval ← evaluate(candidate)
10:     if eval > bestEval then
11:       bestEval ← eval
12:       bestCandidate ← candidate
13:     end if
14:   end for
15:   threshold ← threshold − THRESHOLD_STEP
16: end while
17: if bestEval ≥ MIN_EVAL then
18:   return subImage(inputImage, ROI(bestCandidate))
19: else
20:   return “PlateNotFound”
21: end if


When the number of threshold values increases (by expanding the range using MIN_THRESHOLD or INITIAL_THRESHOLD, or by decreasing THRESHOLD_STEP), the probability of finding a number plate that is present within the image also increases. At the same time the number of main loop iterations increases, which results in a longer execution time. During the experiments these values were set as follows: MIN_THRESHOLD = 50; INITIAL_THRESHOLD = 150; THRESHOLD_STEP = 20. MIN_EVAL was set to 0.8. Such a value makes it very likely that the chosen candidate is the license plate, and it terminates the processing before MIN_THRESHOLD is reached, which boosts performance. The last parameter is MIN_PIXEL_COUNT. Only connected components within the binary image that consist of more than 200 pixels are evaluated. There is no need to evaluate smaller components, because plates smaller than 30 × 10 are hard to identify.
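The effect of these constants on the main loop can be seen in the following C sketch of the threshold sweep. It is a simplified model, not the system’s code: binarization, component extraction and evaluation are hidden behind a hypothetical callback that returns the best candidate evaluation for a given threshold, and demo_eval is only a stand-in used to exercise the loop.

```c
#define INITIAL_THRESHOLD 150
#define MIN_THRESHOLD      50
#define THRESHOLD_STEP     20
#define MIN_EVAL          0.8

/* Hypothetical callback: best candidate evaluation (0..1) found in the
 * image binarized with the given threshold. */
typedef double (*BestEvalFn)(int threshold, void *context);

/* Runs the main loop of algorithm 1. Returns the threshold at which the
 * accepted candidate was found, or -1 for "plate not found".
 * The number of loop iterations is stored in *iterations. */
int sweep_thresholds(BestEvalFn best_eval_at, void *context, int *iterations)
{
    double best_eval = 0.0;
    int best_threshold = -1;
    int threshold = INITIAL_THRESHOLD;

    *iterations = 0;
    while (threshold > MIN_THRESHOLD && best_eval < MIN_EVAL) {
        double eval = best_eval_at(threshold, context);

        if (eval > best_eval) {
            best_eval = eval;
            best_threshold = threshold;
        }
        threshold -= THRESHOLD_STEP;
        (*iterations)++;
    }
    return best_eval >= MIN_EVAL ? best_threshold : -1;
}

/* Stand-in evaluator for demonstration only: pretends that a good
 * candidate appears once the threshold drops to *context or below. */
double demo_eval(int threshold, void *context)
{
    int good_below = *(const int *)context;

    return threshold <= good_below ? 0.9 : 0.3;
}
```

With the values from the text the loop tries at most the five thresholds 150, 130, 110, 90 and 70, and it stops earlier as soon as an evaluation of at least 0.8 is reached.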

Another main issue analyzed in this chapter is candidate evaluation. The description of the localization algorithm only mentions the evaluation, but it is an essential element, no less important than candidate extraction. It is easy to describe in natural language what a license plate looks like: it is a white rectangular board with black letters on it, and there is also a blue vertical stripe with the symbol of the country. The problem arises when such a description has to be applied to some object in the algorithm. The approach presented here uses the membership function defined in [Zad65]. It enables conversion from a natural language description to a function that calculates the degree to which an object belongs to some class. [Zad65] defines it in the following way: “Let X be a space of points (objects), with a generic element of X denoted by x. Thus, X = {x}. A fuzzy set (class) A in X is characterized by a membership (characteristic) function fA(x), which associates with each point in X a real number from interval [0, 1], with the value of fA(x) at x representing the “grade of membership” of x in A. Thus the nearer the value of fA(x) to unity, the higher the grade of membership of x in A.” For the set of license plates, the membership function indicates the degree of being a license plate. In this system 5 membership functions corresponding to 5 different features were defined. They determine, for each feature separately, the support for the candidate being a license plate. They can be seen in figure 4.2.

The shapes of the membership functions are justified as follows:

horizontal size (the top left diagram in figure 4.2) the argument here is the object width/image width ratio. The membership value equals 1 for arguments between 0.2 and 0.35. The upper value reflects that the car is almost always wider than three plate widths; the lower value comes from the observation that even when the car is not seen directly from the front, its total width very rarely exceeds 5 plate widths. For the example input data (figure 4.1) the ratio was 0.28, so almost exactly in the middle.

Figure 4.2: Membership functions for different features used for plate evaluation.

vertical size (the top right diagram in figure 4.2) the argument here is the object height/image height ratio. The situation is very similar to that of the horizontal size. The only difference is the smaller ratio: the plate’s height is almost 5 times smaller than its width, while the input image’s height is usually only a bit smaller than its width. This value for the example image is 0.08.

height/width ratio (the middle left diagram in figure 4.2) this feature is intended to select only the candidates that have the correct proportions. The dimensions of Polish license plates are 520 × 114 mm, those of Austrian plates 520 × 120 mm. The height/width ratios are 0.22 and 0.23, respectively. As one can see, the ratio can be only a little lower (0.18). The tolerance for a higher ratio is greater, because the plate might be narrowed by the perspective transformation. The example plate’s ratio is 0.24.

brightness (the middle right diagram in figure 4.2) according to the plate’s description in natural language, the plate is a white rectangle with black letters. To check this attribute, the argument of this membership function is calculated as p / (width × height), where p is the total number of pixels that form the candidate connected component, and width and height are the dimensions of its bounding rectangle. The membership value equals 1 between 0.6 and 0.9.

vertical transition count (the bottom diagram in figure 4.2) the calculation of this value is the most complicated one. A single row of the license plate’s image is very characteristic. Its main feature is rapid brightness changes, caused by the high contrast between the black letters and the white background. Another advantage of this feature is that it is typical almost only of license plates, which helps to discriminate other candidate objects. To extract this feature from the binarized image, the Sobel operator for vertical edges is used. It emphasizes the black to white (bw) and white to black (wb) transitions within the image; after applying the Sobel operator, the white pixels denote the wanted transitions. To obtain the maximal number of such transitions within the image, the filtered image is reduced to a single column that contains the number of white pixels in each row. The argument of the membership function is the maximal value in this column. One can observe that the membership equals 1 for the [30, 60] interval. A plate consists of about 5-7 characters. One character causes from 2 (e.g. “I”) to 8 (e.g. “W”, “M”) bw or wb transitions. Some transitions can be marked by the Sobel filter as 2 white pixels in a single row. A few false positive transitions can also be detected because of noise.
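The transition count can be illustrated with a small Python sketch. It assumes a binary image given as rows of 0 (black) and 1 (white), uses the standard 3 × 3 Sobel kernel for vertical edges, and treats every non-zero filter response as a white (transition) pixel; the last point and the function name are assumptions, since the text does not state how the filter output is binarized.

```python
# Standard Sobel kernel that responds to vertical edges.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))


def max_transition_count(binary):
    """binary: rows of 0 (black) / 1 (white).

    Returns the largest number of edge pixels found in any single row
    after filtering; border rows and columns are skipped."""
    height, width = len(binary), len(binary[0])
    best = 0
    for y in range(1, height - 1):
        count = 0
        for x in range(1, width - 1):
            response = sum(
                SOBEL_X[dy + 1][dx + 1] * binary[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            )
            if response != 0:  # assumption: any non-zero response is an edge
                count += 1
        best = max(best, count)
    return best
```

For a single black vertical stroke on a white background, each of the two transitions is marked by two pixels, so a row crossing the stroke contributes 4 to the count.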

The final evaluation is calculated as the product of all 5 membership functions, an approach taken from [ZFMV97]. The evaluation process also has one optimization. The most time consuming task is the computation of the membership value for the vertical transition count, so the evaluation algorithm tries to avoid performing it. To achieve this, the evaluation was split into two phases. In the first phase only the first 4 membership values are calculated. If their product is lower than the threshold value, 0 is returned as the overall evaluation; otherwise the vertical transition count needs to be calculated.
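A minimal sketch of the two-phase evaluation is given below, assuming trapezoidal membership functions. Only the plateaus for horizontal size (0.2 to 0.35), brightness (0.6 to 0.9) and transition count (30 to 60) come from the text; the outer feet of every trapezoid, the plateaus for vertical size and height/width ratio, the value of PRE_THRESHOLD and all names are illustrative guesses and would have to be read from figure 4.2 and the real system.

```python
def trapezoid(x, a, b, c, d):
    """Membership rising on [a, b], equal to 1 on [b, c], falling on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# (a, b, c, d) per feature; see the lead-in for which numbers are guesses.
CHEAP_FEATURES = {
    "horizontal_size": (0.10, 0.20, 0.35, 0.50),
    "vertical_size": (0.02, 0.05, 0.12, 0.20),
    "height_width_ratio": (0.18, 0.22, 0.30, 0.45),
    "brightness": (0.40, 0.60, 0.90, 1.00),
}
TRANSITIONS = (10, 30, 60, 90)
PRE_THRESHOLD = 0.5  # hypothetical cut-off for the first phase


def evaluate(features, count_transitions):
    """features: dict with the four cheap feature values.
    count_transitions: callable computing the expensive fifth feature."""
    score = 1.0
    for name, shape in CHEAP_FEATURES.items():
        score *= trapezoid(features[name], *shape)
    if score < PRE_THRESHOLD:
        return 0.0  # phase 1 failed: the expensive feature is never computed
    return score * trapezoid(count_transitions(), *TRANSITIONS)
```

Passing the transition count as a callable keeps the expensive Sobel step lazy, which mirrors the optimization described above.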

There was also an implementation problem to face. The C6414 processor does not support floating point operations, whereas the membership grade is a real value from the interval [0, 1]. To solve this problem the membership grade is expressed as an integer value from the [0, 100] interval. This is only an implementation issue; theoretically everything is as in the definition. There is no need for high accuracy: 101 different values are enough. An example multiplication of two membership values (e.g. 0.33 and 0.57) would be 33 ∗ 57/100 = 18. The exact value is 0.1881, but such an error is acceptable for this task.
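The integer representation can be reproduced in a few lines of Python. The helper names are hypothetical; the scale of 100 and the truncating division follow the example in the text.

```python
SCALE = 100  # membership grades are stored as integers in [0, 100]


def to_fixed(grade):
    """Convert a grade from [0, 1] to its integer representation."""
    return int(round(grade * SCALE))


def fixed_mul(a, b):
    """Product of two integer grades; the integer division truncates."""
    return a * b // SCALE
```

Here `fixed_mul(to_fixed(0.33), to_fixed(0.57))` evaluates 33 ∗ 57 = 1881 and truncates it to 18, i.e. 0.18 instead of the exact 0.1881.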

4.3 Example

This section illustrates what happens while the algorithm works. The main idea is adaptive thresholding, which can be seen in figure 4.3. Different phases of the algorithm are presented; for this purpose the example input data was used (figure 4.1). The first (top) frame in this figure presents the input image binarized with the initial threshold value (200 in this case). Most remarkably, no plate is visible, because the threshold is too high. Only sun reflections on the car are present. In such a case the algorithm proceeds to the next iteration. The middle frame presents the central value from the threshold range. The plate is visible and, what is also very important, consists of one connected component. This component can be extracted from the image and evaluated. The last (bottom) frame shows what happens if the threshold value is too low. The plate’s rectangle is merged with the background. There is in fact one big connected component and nothing can be extracted. Normally the algorithm stops when it reaches a correct threshold, like the one in the middle. It might also happen that there is no threshold value that isolates the plate from the background. This might occur when the plate is mounted without any outline directly on a white car, but such situations are very rare.

Another issue that is visualized here is the counting of vertical bw and wb transitions. This is the 5th membership function, which is calculated only when the four previous ones give a high enough membership value. An example of the plate after applying the vertical Sobel operator is presented in figure 4.5. This plate comes from the already familiar example input data. The Sobel operator was applied to the image binarized with the same threshold as the middle frame in figure 4.3. Almost every transition is found. Some of them produce two pixels as a result, some only one. As a result, a vertical line in a character most often produces a line with a width of 4 pixels. It might be interrupted at the places where lines meet, for example in the middle part of “K”. So the sum of each row will lie in the interval where the membership function has its maximal value. To provide an overall view, the whole frame after the Sobel operator is presented in figure 4.4. Let us have a closer look at it. At first sight the area where the license plate is present is the brightest place. The uniform white areas of the binarized frame, like the bonnet, lights or windscreen, are completely black. There are in fact a few places with a high intensity of vertical transitions, but they come from very small connected components or from components with bad proportions, so they are not even examined. To summarize, there are very few objects within the car or in its background that are able to reach a high overall evaluation.

Figure 4.3: Input image binarized with the different threshold values.

Figure 4.4: The binarized frame after application of Sobel operator.

Figure 4.5: The magnified license plate.

The final result of the algorithm is presented in figure 4.6. This was ideal input data for this algorithm and the plate was localized accurately and without any problems. It is a bright rectangle on a dark background, so there is a wide range of threshold values that enable extracting it. But some data can cause problems. The first example of such data is presented in figure 4.7. This is a screenshot from an early version of the system, with the initial membership functions. This problem was eliminated by reducing the maximal number of vertical transitions, but the example shows that there might be objects with characteristics similar to those of a license plate. Fortunately this happens very rarely.

Figure 4.6: The correct localization of the license plate.

Figure 4.7: The false positive example.

There is another problem that seems to be more serious: the plate might be darker, or might not have a distinctive border that separates it from the background. Such a case is shown in figure 4.8. There are a few factors that cause the algorithm to fail. The first one is very unfavorable lighting: light reflects in the neighborhood of the plate, which makes the plate darker than the background. In addition the plate itself is dirty and therefore dark. Its black border is also very thin and a little blurred. Such problems cannot be easily eliminated by manipulating parameters. The main idea of the algorithm needs to be altered to detect such cases.

Figure 4.8: The false negative example

Chapter 5

License plate segmentation

This chapter contains a detailed description of the approach used for license plate segmentation.


The input data for the segmentation algorithm is the result of the algorithm described in chapter 4. This is an image containing the license plate, with the following features:

• RGB – 3 channels, 8 bits per channel.

• Width should be 3-6 times greater than height.

• Dimensions do not need to be a multiple of any value.

Moreover, the plate should be evenly lit, because light reflections can cause segmentation errors. The localization module sometimes passes the plate with and sometimes without the blue country stripe. Both situations are acceptable. If the stripe is present, it is eliminated on the basis of the color information. An example of an input image can be seen in figure 5.1.

Figure 5.1: The example of algorithm’s input data.


5.2 Algorithm

The aim of the presented algorithm is to isolate the glyphs from the plate. What is more, some elements that are not characters, like the blue country stripe or the crest, must be eliminated. The main idea behind this algorithm is the assumption that an image column lying in the space between letters has the highest sum of pixel brightness. The algorithm’s pseudocode is given in algorithm 2.

Algorithm 2 The plate segmentation algorithm

Require: The image of license plate
Ensure: The list of bounding boxes of the glyphs.

 1: grayImage ← rgb2Gray(inputImage)
 2: separators ← ∅
 3: row ← reduce2Row(grayImage)
 4: maximums ← findLocalMaximums(row)
 5: separatorCount ← 0
 6: for all maximum ∈ maximums do
 7:     if isCorrectSeparator(maximum) then
 8:         separators[separatorCount] ← maximum
 9:         separatorCount ← separatorCount + 1
10:     end if
11: end for
12: letters ← ∅
13: for i = 0 to separatorCount − 2 do
14:     letterRectangle ← findLetterRect(separators[i], separators[i + 1])
15:     if isCorrectLetter(letterRectangle) then
16:         letters ← letters ∪ {letterRectangle}
17:     end if
18: end for
19: return letters

The first step of the algorithm is reducing the image of the plate to a single row: each value in this row is the sum of all pixels in the corresponding column. Then the local maximums are located (line 4). They are potential gaps between the letters. Each maximum is then examined to determine whether it is a gap between letters (lines 6-9); these gaps are named separators in the algorithm. A maximum has to fulfill two conditions to become a separator:

• only a limited number of pixels such that b ≥ min + (max − min) ∗ GRAY_DEVIATION can exist in the column corresponding to the maximum. b is the brightness of the pixel, min and max are the minimal and maximal pixel brightness within the whole image, and GRAY_DEVIATION is an experimentally established constant from the interval [0, 1].

• maximum − lMinimum > maxDifference ∗ PERCENTAGE and maximum − rMinimum > maxDifference ∗ PERCENTAGE, where lMinimum and rMinimum are the neighbor minimums located respectively on the left and right side of the maximum, maxDifference is the maximal difference between a neighbor minimum and maximum in the reduced image, and PERCENTAGE is a constant value from the interval [0, 1]. The aim of this condition is to eliminate local maximums that lie within the letters or within wide white fields, which can be present on plates that contain few characters.
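The column reduction and the second condition can be sketched as follows. The sketch makes several assumptions that are not in the text: PERCENTAGE is given an arbitrary value, the two ends of the reduced row are treated as minimums so that every maximum has a neighbor on both sides, and the first condition (the pixel count based on GRAY_DEVIATION) is left out. All function names are hypothetical.

```python
PERCENTAGE = 0.3  # assumed value; the text only says it lies in [0, 1]


def reduce_to_row(gray):
    """Sum every column of a grayscale image given as a list of rows."""
    return [sum(column) for column in zip(*gray)]


def local_extrema(row):
    """Indices of local maximums and minimums of the reduced row.

    Assumption: both ends of the row are counted as minimums."""
    maxima, minima = [], [0]
    for i in range(1, len(row) - 1):
        if row[i - 1] < row[i] >= row[i + 1]:
            maxima.append(i)
        elif row[i - 1] > row[i] <= row[i + 1]:
            minima.append(i)
    minima.append(len(row) - 1)
    return maxima, minima


def separators(row, percentage=PERCENTAGE):
    """Keep the maximums that rise far enough above both neighbor minimums."""
    maxima, minima = local_extrema(row)
    neighbours = []
    for m in maxima:
        left = max(i for i in minima if i < m)
        right = min(i for i in minima if i > m)
        neighbours.append((m, row[m] - row[left], row[m] - row[right]))
    max_difference = max((max(l, r) for _, l, r in neighbours), default=0)
    limit = max_difference * percentage
    return [m for m, l, r in neighbours if l > limit and r > limit]
```

For the reduced row `[0, 10, 2, 3, 2, 10, 0]` the small bump at index 3 rises only 1 above its neighbors, while the largest difference is 10, so it is rejected and only indices 1 and 5 remain as separators.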

The separators established in this way indicate the gaps between the objects present on the plate. The next step is to find the vertical location of each object. It is done at line 14 (function findLetterRect). The function works as follows: the left and right borders of the object’s rectangle are the separators passed to the function as arguments. To find the top and bottom borders, the function browses the lines from 0 to height/3 between the separators. The top border is simply the line with the maximal sum of pixel values. The bottom border is established in the same way, but the function browses the lines from 2 ∗ height/3 to height − 1. Each obtained rectangle contains one object from the license plate. It can be a character as well as the crest or the country stripe. To eliminate all objects but the characters, a last verification is performed at line 15. The function isCorrectLetter performs a color analysis of the given region (the bounding rectangle of the object). It calculates the so-called “colorfulness” of every pixel of the region in the following way:

colorfulness = |r − g|+ |r − b|+ |g − b|

where r, g and b denote the values of the red, green and blue channels respectively. The region is eliminated if more than half of its pixels have a colorfulness greater than some given threshold (a system parameter). All rectangles that have not been eliminated are passed to the recognition module.
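The color test can be written down directly. In this sketch the threshold value is an assumption (the text only calls it a system parameter) and the region is passed as a flat collection of RGB tuples for simplicity.

```python
COLORFULNESS_THRESHOLD = 90  # assumed value; a system parameter in the text


def colorfulness(r, g, b):
    """Sum of the absolute differences between the three channels."""
    return abs(r - g) + abs(r - b) + abs(g - b)


def is_correct_letter(region, threshold=COLORFULNESS_THRESHOLD):
    """region: iterable of (r, g, b) pixels from one bounding rectangle.

    Returns False when more than half of the pixels are colorful."""
    pixels = list(region)
    colorful = sum(1 for p in pixels if colorfulness(*p) > threshold)
    return colorful * 2 <= len(pixels)
```

A gray pixel such as (200, 200, 200) has colorfulness 0, while a saturated blue pixel (0, 0, 255) reaches 510, which is why the country stripe is rejected under good lighting.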

5.3 Example

This section shows how the plate is segmented step by step. At the end a few final results are presented. The first step, the license plate reduction, is presented in figure 5.2.

Figure 5.2: The gray plate and the values in the reduced vector.

The greatest values in the reduced plate occur between the glyphs. But local maximums are also present inside the characters, e.g. “9” or “4”. All maximums and minimums are shown in figure 5.3.

Figure 5.3: All minimums (green) and maximums (red)

The most important thing here is that maximums appear in all gaps between the letters. Of course there are more of them than there should be (e.g. between the blue stripe and “K”), but they are eliminated in the next steps. Figure 5.4 presents only the verified maximal values.

Figure 5.4: The verified maximum values.

Each object that is present on the plate lies between two separators and, what is more, it is not split. The next step, according to the algorithm description above, is determining the vertical location of the objects. This is presented in figure 5.5.

Figure 5.5: The bounding boxes of each object present in the license plate.

The result is quite accurate; only the blue country stripe has its top and bottom in random places, but this is not important, because it should be eliminated during the next step. The result of that step is presented in figure 5.6.

Figure 5.6: The final result of the segmentation algorithm.

In this case the color analysis worked correctly. All elements that are not characters were filtered out, which is marked by the black fields. The final result of this part is the list of regions containing the characters. A region is denoted by 4 numbers x, y, width, height, where x and y are the coordinates of the top left corner of the rectangle and width and height are its width and height in pixels.

The algorithm is also able to segment plates that are slightly skewed or blurred. The detected location of the characters varies and is very close to the real location (figure 5.7).

Figure 5.7: The example of the blurred and skewed plate.

Figure 5.8: The color analysis error due to the poor lighting conditions.

But there are a few possible sources of mistakes. The first one is a dark image: colors are then less vivid and the differences between the RGB channels become less significant. The colors become close to gray. This can result in misclassification, an example of which can be seen in figure 5.8.

Chapter 6

Character recognition

This chapter describes the last phase of the vehicle recognition process, the character recognition. The output of this algorithm is passed to the database.


The algorithm’s input is the grayscale image of the license plate and the list of bounding rectangles of all characters. These rectangles do not need to be minimal: some surrounding background is allowed as long as it contains no parts of other objects. Other objects, like smashed insects, are unwanted, but there is no possibility to avoid them. The input data has the following features and constraints:

• grayscale image (8-bit, single channel)

• 3-12 character bounding rectangles

• each rectangle consists of one character

The captured plate should be as big as possible; the lowest acceptable plate height is 20 pixels. Sharp characters ensure high recognition accuracy. A little skew is acceptable, because the training patterns also have some skew.

6.2 Algorithm

This algorithm consists of two phases. The first one is the exact localization of the character. It needs to be done because the segmentation submodule can create a bounding rectangle with some margin. Before the algorithm starts, the area within each rectangle is binarized, since the remaining computation requires only the binary image. The threshold is selected adaptively for each rectangle in the following way:

threshold = min + (max − min) ∗ THR_PERCENTAGE

where min and max are respectively the minimal and maximal pixel values within the character rectangle, and THR_PERCENTAGE is a system parameter, which was set to 0.5. This means that currently the threshold value is in the middle of the [min, max] interval. After the binarization the leftmost and rightmost columns and the topmost and bottommost lines that contain black pixels are located. An example, the localization of the vertical borders, is given in algorithm 3. The horizontal borders are established analogously.

Algorithm 3 The exact localization of the character

Require: The image containing one character
Ensure: The exact localization of the character

1: xLeft ← 0
2: while containsNoBlackPixels(columns[xLeft]) do
3:     xLeft ← xLeft + 1
4: end while
5: xRight ← width − 1
6: while containsNoBlackPixels(columns[xRight]) do
7:     xRight ← xRight − 1
8: end while
9: return xLeft, xRight

With the character exactly localized, the recognition (classification) part can begin. The algorithm used for that purpose is based on template matching. A similar approach is described in [KK03]. Here the one phase algorithm was used. In contrast to that approach, however, the output is different: more than one character can be returned as a result. The output can consist of at most 3 possible classes, as described in chapter 1.
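The binarization and the border search can be combined in a short Python sketch. It assumes that a pixel strictly below the threshold counts as black and adds a guard for rectangles without any black pixel, which the pseudocode does not handle; the function names are hypothetical.

```python
THR_PERCENTAGE = 0.5  # value reported in the text


def binarize(gray):
    """Return rows of booleans, True where the pixel is black (character)."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    threshold = lo + (hi - lo) * THR_PERCENTAGE
    # Assumption: pixels strictly darker than the threshold are black.
    return [[p < threshold for p in row] for row in gray]


def exact_bounds(black):
    """Smallest (x_left, x_right, y_top, y_bottom) enclosing all black pixels."""
    cols = [x for x in range(len(black[0])) if any(row[x] for row in black)]
    rows = [y for y, row in enumerate(black) if any(row)]
    if not cols:
        return None  # no black pixel at all: a guard the pseudocode lacks
    return cols[0], cols[-1], rows[0], rows[-1]
```

Calling `exact_bounds(binarize(rectangle))` on the pixels of one bounding rectangle yields the tight box that is then passed to the template multiplication.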

The recognition algorithm multiplies the binary image of the character by the templates of all characters. The template that gives the highest multiplication value is the most likely one. The features and construction procedure of the template will be described later. If other templates give values close to it, they can also be included in the result, but not more than two of them. The most serious problem to solve was that the template almost never has the same dimensions as the character’s image. The solution is very simple: for each element of the template, one pixel within the image is looked up. The multiplication procedure is given in algorithm 4.

Algorithm 4 The image by template multiplication procedure.

Require: The binary image of the character and the template.
Ensure: The integer value of multiplication.

 1: value ← 0
 2: for x = 0 to templateWidth − 1 do
 3:     for y = 0 to templateHeight − 1 do
 4:         i ← x ∗ characterWidth/templateWidth
 5:         j ← y ∗ characterHeight/templateHeight
 6:         value ← value + (255 − character[i, j]) ∗ template[x, y]
 7:         if 255 − character[i, j] = 0 and template[x, y] > 0 then
 8:             value ← value − template[x, y] ∗ 255
 9:         end if
10:     end for
11: end for
12: return value

This pseudocode shows how the template is multiplied by the letter’s image. A few things need to be commented. The procedure iterates through all elements of the template and for each of them finds the appropriate pixel within the letter’s image (lines 4-5). Then the element is multiplied by the pixel and the product is added to the value (line 6), which is returned at the end as the result. The expression 255 − character[i, j] is used because the negative of the image is needed: background pixels equal 255, whereas characters consist of pixels with value 0. From the algorithm’s point of view the letter’s elements should have a positive value (there is a letter), and the background should consist of 0s (there is no letter). The multiplication should also contribute a negative value where there is no part of the character but, according to the pattern, there should be one. This is achieved at lines 7-8.
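A Python version of the multiplication may make the index mapping easier to follow. It is a sketch under the reading given in the prose: the penalty is applied where the image shows background but the template expects part of the character. Images are stored as lists of rows, so the pixel is addressed as `character[j][i]`; the function name is hypothetical.

```python
def multiply(character, template):
    """character: rows of 0 (ink) / 255 (background); template: rows of ints.

    Returns the integer multiplication value; higher means a better match."""
    char_h, char_w = len(character), len(character[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    value = 0
    for x in range(tmpl_w):
        for y in range(tmpl_h):
            # Nearest-pixel lookup: scale template coordinates to the image.
            i = x * char_w // tmpl_w
            j = y * char_h // tmpl_h
            ink = 255 - character[j][i]  # negative: 255 for ink, 0 otherwise
            value += ink * template[y][x]
            if ink == 0 and template[y][x] > 0:
                # Background where the template expects part of the letter.
                value -= template[y][x] * 255
    return value
```

Because every template element looks up exactly one image pixel, the cost depends only on the template size, not on the size of the character image.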

Another issue is the creation of the template. It is created on the basis of characters isolated from photos of license plates. Several images containing the particular character are used to create the template. An example of a part of the training set can be found in figure 6.1. The template is a two dimensional array of integer values; its features are as follows:

• The values are from the range (−∞, 255] (in practice values lower than −1024 appear very rarely).

• The sum of all values equals 0 (in practice it may differ from 0 if the sum of all positive elements is not divisible by the number of all negative elements).

• The height of the pattern is fixed, while the width is calculated according to the training set, to maintain the aspect ratio.

Figure 6.1: The part of the “E” training set.

During the training process all examples are scaled to the given height. All of them are summed and then the result is divided by the number of examples. The next step is the calculation of the negative regions, the places where elements of the character should not appear. The procedure can be seen in algorithm 5. In the first step (lines 1-2) all training examples are scaled to an equal height, which is given as the HEIGHT parameter. It was set to 20 pixels in the system. The aspect ratio of every training example is maintained. The next step is the creation of an empty template, a 0-filled array of dimensions maxWidth × HEIGHT. Then each training example is added to the template. This is done in 3 loops (lines 7-14). Every element of the template is incremented by 255 − the value of the corresponding pixel in the currently processed training example. The subtraction is performed to set the black pixels to 255 and the white ones to 0. The template is now the sum of all used training examples. To normalize all its elements to the interval [0, 255], each of them is divided by the number of training examples (line 17). The positive part of the template is ready. The negative part, which indicates where elements of the character should not appear, is still to be created. Not all negative values are equal: the farther from the positive part (the possible letter), the lower the value should be. Moreover the sum of all values within the template should be as close to 0 as possible. To provide these attributes, the distance to the nearest positive value is calculated for each element which still equals 0. Then all positive values are summed and this sum is divided by the sum of the distances. As a result the unit corresponding to a distance of one pixel is obtained (line 20). With this unit calculated, the values of all negative elements can be calculated according to the distance to the nearest positive element. The effect of the processing is shown in figure 6.2. As one can see, the maximal values, which equal 255, are located in the middle of the “E” bars. Positive but smaller values are at the edges. The negative values represent the background. The lowest negative values are placed far away from the nearest positive one, which shows that the learning process works correctly.

Algorithm 5 The template creation.

Require: The set of training examples (bitmap images) for one character T.
Ensure: The template of the character.

 1: for all t ∈ T do
 2:     scale(t, HEIGHT)
 3:     binarize(t)
 4: end for
 5: maxWidth ← max over t ∈ T of width(t)
 6: template ← createEmptyTemplate(maxWidth, HEIGHT)
 7: for all t ∈ T do
 8:     margin ← (maxWidth − width(t))/2
 9:     for x = 0 to width(t) − 1 do
10:         for y = 0 to HEIGHT − 1 do
11:             template[x + margin, y] ← template[x + margin, y] + (255 − t[x, y])
12:         end for
13:     end for
14: end for
15: for x = 0 to maxWidth − 1 do
16:     for y = 0 to HEIGHT − 1 do
17:         template[x, y] ← template[x, y]/|T|
18:     end for
19: end for
20: unit ← valueSum(template)/distanceFromPositiveSum(template)
21: for x = 0 to maxWidth − 1 do
22:     for y = 0 to HEIGHT − 1 do
23:         if template[x, y] = 0 then
24:             template[x, y] ← −distanceFromPositive(x, y, template) ∗ unit
25:         end if
26:     end for
27: end for
28: return template

Figure 6.2: The example of the template.
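The template construction can be sketched in Python as shown below. Several points are assumptions rather than facts from the text: the examples are expected to be already scaled to a common height and binarized, the distance to the nearest positive element is measured as Manhattan distance, all divisions are integer divisions, and the nearest-positive search is a brute-force scan that is only suitable for small templates. The function name is hypothetical.

```python
def create_template(examples):
    """examples: equally tall binary images, rows of 0 (ink) / 255."""
    height = len(examples[0])
    max_width = max(len(e[0]) for e in examples)
    template = [[0] * max_width for _ in range(height)]
    for e in examples:
        margin = (max_width - len(e[0])) // 2  # center narrower examples
        for y in range(height):
            for x, pixel in enumerate(e[y]):
                template[y][x + margin] += 255 - pixel
    # Average so that the positive part lies in [0, 255].
    template = [[v // len(examples) for v in row] for row in template]

    positives = [(x, y) for y, row in enumerate(template)
                 for x, v in enumerate(row) if v > 0]
    if not positives:
        return template  # nothing to spread: guard not in the pseudocode

    def distance(x, y):
        # Assumption: Manhattan distance; the text does not name the metric.
        return min(abs(x - px) + abs(y - py) for px, py in positives)

    zeros = [(x, y) for y, row in enumerate(template)
             for x, v in enumerate(row) if v == 0]
    total_distance = sum(distance(x, y) for x, y in zeros)
    if total_distance:
        unit = sum(template[y][x] for x, y in positives) // total_distance
        for x, y in zeros:
            template[y][x] = -distance(x, y) * unit
    return template
```

For a single one-row example `[[255, 0, 255]]` the result is `[[-127, 255, -127]]`: the sum is 1 rather than 0 because 255 is not divisible by the total distance of 2, which matches the remark about divisibility in the feature list.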

The recognition is called for each character that was detected within the license plate. It always returns three pairs of values: the chosen character and the multiplication value for the corresponding pattern. The character recognition submodule that processes the list of characters decides whether to put one or more possible decisions into the output structure, which consists of the primary, secondary and tertiary arrays, as shown in figure 6.3. For candidate objects where half of the highest multiplication result is greater than all other results, only the position in the primary array is filled. Otherwise there can be up to three letters at one position, sorted according to their multiplication result (the highest is placed in the primary array, the lowest in the tertiary). The interpretation of the example output would be as follows: the classification of 3 objects (“4”, “4”, “A”) is done with high certainty. At the second position the most probable character is “9”, but it is also possible that it is “3” or, least probably, “8”. Similarly “K” and “E” have alternative values, but in this case only one template multiplication returned over half of the best value.

Figure 6.3: The example of output of the recognition submodule.
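The selection rule for the primary, secondary and tertiary entries can be expressed compactly. The sketch assumes that the multiplication values of all templates are available in a dictionary; the function name and the input format are hypothetical.

```python
def select_candidates(scores):
    """scores: dict mapping character -> multiplication value.

    Returns up to three characters: the best one, followed by those of the
    next two whose value exceeds half of the best value."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:3]
    best = scores[ranked[0]]
    return [ranked[0]] + [c for c in ranked[1:] if scores[c] > best / 2]
```

The first element of the returned list goes to the primary array, the second (if any) to the secondary and the third (if any) to the tertiary array.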

6.3 Example

This section shows examples from the working algorithm. The first one to be presented is the example from the segmentation part (figure 5.1). It was recognized as shown in table 6.1. The primary plate is correct. The number of detected characters is equal to the real count, which is the result of good segmentation. Moreover all characters were correctly classified. Most of them have only one possibility, so the secondary and tertiary plates are empty at their positions. Only the “K” has two other classification possibilities, “X” and “M”. These characters are in fact quite similar. The “E” character has only one alternative, “F”, which was very easy to predict, because the differences between these letters are very small.

K 9 4 4 A E
X         F
M

Table 6.1: The example of characters recognition for plate “K 944AE”

The next example, the dark plate from figure 5.8, shows how incorrect segmentation can influence the recognition. As discussed before, the blue country stripe was not eliminated, which resulted in passing it to the recognition module. The result is given in table 6.2. The first position is the classification of the country stripe, which should not appear here. The result is random: the three possibilities are not similar to each other. All other characters have been correctly classified in the primary array. But their result is not as certain as in the previous table. Such a situation might be the result of the poor lighting conditions.

W K 4 7 9 D B
3 X A 2 C E
T T B G

Table 6.2: The example of characters recognition for plate “K 479DB”

The hardest case presented here is the one from figure 5.7. The plate is skewed and blurred, which can have a very negative impact on the classification quality. As one can see in table 6.3, the plate has not been recognized correctly. In the primary plate only the characters “K”, “7”, “0” and “2” are correct. The situation is not bad in the case of “4”, because it appears in the secondary plate. The last position is an example of misclassification: “H” is not given as an alternative in the secondary or tertiary plate either.

K 7 0 A 2 M
X 1 8 4 N
L

Table 6.3: The example of characters recognition for plate “K 7042H”

Chapter 7

Tests

Two kinds of tests are presented here. In the first section the performance tests of the DSP optimizations are described. The second section presents the quality results from the working system; each of the three subsystems is evaluated.

7.1 Performance tests

7.1.1 Tests for DSP optimization

The discussion in chapter 3 gave only a theoretical view of the possible benefits of the optimizations. To evaluate the real execution time, tests were performed. The DSP platform was the one described in [NST07]. The PC used for comparison was a 1.5 GHz AMD Athlon 1800+ with 768 MB of DDR333 RAM. To obtain reliable results each function was called 1000 times. The average time of a single execution was calculated by dividing the total time by 1000.
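The measurement procedure itself is simple and can be illustrated in Python. This is only a sketch of the averaging scheme; the original tests ran C code on the DSP and on the PC, and the function name used here is hypothetical.

```python
import time


def average_time(function, *args, repetitions=1000):
    """Average wall-clock seconds per call over `repetitions` calls."""
    start = time.perf_counter()
    for _ in range(repetitions):
        function(*args)
    return (time.perf_counter() - start) / repetitions
```

Timing the whole batch and dividing afterwards keeps the overhead of reading the clock out of the individual measurements.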

The first test was the comparison of different versions of scaling function.The result can be seen at figure 7.1. The CO abbreviation stands for “Com-piler optimization”. This test was performed to estimate the possibilities ofoptimization on the DSP. The test data were 5 color bitmap images (sized2048×1024, 1536×768, 1024×512, 512×256 and 256×128). Each of themwas scaled to the dimensions of width/2×height/2 by the 5 types of scalingfunctions:

C code with compiler optimization ordinary C code, without any opti-mizations done to obtain the high performance on DSP. It was testedto be compared to the optimized code.

53

CHAPTER 7. TESTS 54

Figure 7.1: The time of execution of different version of shrinking function.

Intrinsics with compiler optimization the dedicated code written forthe DSP with the use of compiler intrinsics described at chapter 3.It is intended to be the fastest of all presented here. It was tested todetermine, how the programmer’s knowledge of the input data struc-ture, as well as the processor’s architecture can affect the performance.

C code without compiler optimization the same standard C code, butcompiled without any compiler optimization, to determine what accel-eration can be achieved with compiler optimization only.

Intrinsics without compiler optimization: the same code with intrinsics, but compiled without compiler optimization, for the same purpose as the unoptimized C code.

OpenCV: one of the most popular PC imaging libraries. It would be used if the hardware platform were an embedded PC. It was used for the comparison between the PC system and the DSP.

A few conclusions can be drawn here. Firstly, there is a linear dependency between the image's pixel count and the execution time. The DSP optimized code is the fastest one, which shows that the choice of hardware platform was correct. The difference between the DSP optimized code and all the others is shown in table 7.1. The value given in the table is the average difference between the DSP code with compiler intrinsics and the particular method. The value for the smallest image was omitted due to its high variation. As one can see, the compiler optimizer boosts the performance significantly, but for standard C it cannot produce a program as fast as the code that uses intrinsics and is optimized by the programmer. The ordinary C code with compiler optimization is about 23% slower than the code written especially for the TMS320C6414 processor. What is more, the third step, writing linear assembly, was not done here, so there is still some room for optimization. Still, the main conclusion from these results can be drawn: knowledge of the architecture and the use of optimizing techniques for the DSP help in creating an efficient system.

Table 7.1: Different versions of scaling compared to the DSP optimized code. The numbers show by how many percent each method is slower.

C code with CO            23.46%
C code without CO        829.67%
Intrinsics without CO    111.48%
OpenCV                    30.49%
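How a "percent slower" figure of this kind can be derived from raw timings is sketched below. The exact averaging used for the table is not stated in the text, so the sketch assumes a mean of per-image relative differences with the smallest image left out; the function names and the sample timings are hypothetical.

```python
def percent_slower(method_ms, reference_ms):
    """Relative slowdown of a method versus the reference, in percent.

    A negative result means the method is faster than the reference.
    """
    return (method_ms - reference_ms) / reference_ms * 100.0


def mean_percent_slower(method_times, reference_times, skip_smallest=True):
    """Average the per-image slowdown.

    Both lists are assumed to be ordered from the largest image to the
    smallest one, so the last entry is dropped when skip_smallest is set.
    """
    pairs = list(zip(method_times, reference_times))
    if skip_smallest:
        pairs = pairs[:-1]
    return sum(percent_slower(m, r) for m, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical timings in milliseconds for the five image sizes.
    reference = [10.0, 5.0, 2.5, 1.25, 1.0]
    method = [12.0, 6.0, 3.0, 1.5, 9.0]
    print(f"{mean_percent_slower(method, reference):.2f}% slower")
```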

The next function tested was the one that converts a color image into a grayscale one. It was described in chapter 3, together with the ways of optimizing it. The performance results are shown in figure 7.2.
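For orientation, a color to grayscale conversion of this kind can be sketched with integer arithmetic only. The weights below approximate the widely used ITU-R BT.601 luma coefficients (0.299, 0.587, 0.114) scaled to a sum of 256; this is an assumption made for illustration, and the weights and code actually used on the DSP are those given in chapter 3, not necessarily these.

```python
def rgb_to_gray(pixels):
    """Convert (R, G, B) tuples with 8-bit channels to 8-bit gray values.

    Fixed-point weights 77 + 150 + 29 = 256, so the division by 256 is a
    right shift by 8 bits and no floating-point arithmetic is needed.
    """
    return [(77 * r + 150 * g + 29 * b) >> 8 for r, g, b in pixels]


if __name__ == "__main__":
    print(rgb_to_gray([(255, 255, 255), (0, 0, 0), (0, 255, 0)]))
```

Integer weights of this form suit fixed-point processors, because the whole conversion reduces to multiplications, additions and one shift per pixel.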

The results are very similar to those of the previous test, with one exception: OpenCV is the fastest method. According to table 7.2, it works almost 18% faster than the optimized DSP algorithm. One cannot forget that the desktop PC used for the tests with OpenCV has a processor clocked 50% faster (1.5 GHz vs 1.0 GHz). Moreover, the DSP platform also has slower memory (ordinary SDR at 100 MHz). Embedded PC systems are typically much slower than the reference desktop computer, so this is a very good result for the embedded platform. For example, the Gumstix computer used in this system for communication and database purposes has a processor with a 200 MHz clock. OpenCV on such a system would be a few times slower. The other conclusions are very similar. The code written especially for the DSP is more efficient than the ordinary C code, even with the optimization level set to the highest one. In other words, the programmer needs to provide some optimization information to the compiler to obtain the best results.

Table 7.2: Different versions of RGB to grayscale conversion compared to the DSP optimized code. The numbers show by how many percent each method is slower; a negative value means the method is faster.

C code with CO            26.70%
C code without CO        760.82%
Intrinsics without CO    112.98%
OpenCV                   -17.85%

Figure 7.2: The

What is more, the above results of the DSP optimized code can still be improved, because not all of the optimization possibilities were exhausted.

7.1.2 Overall module performance

The overall performance of the identification module is a very important issue. As stated before, the system should work in real time. The frame rate should be no lower than 3 to 5 frames per second. Such a rate ensures that no vehicle is missed while it is in a convenient position for recognition. Given such a frame rate, the maximal time for the whole processing is about 300 ms. Tests were made in a real working environment. The camera was set by the road and the execution time was measured. During the whole test the module received 168 car images. The average execution time was 89.5 ms with a standard deviation of 61.7 ms. The longest execution lasted 321 ms. This is quite long, but it still allows more than 3 fps. One cannot forget that there are also other tasks present in the system, such as motion detection and tracking. In the worst case (321 ms), there is not much time left for them. The shortest run of the submodule lasted only 10 ms. This was the case of a small input image in which no license plate was found and the whole procedure quit after the first step.
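The relation between the measured execution times and the frame rate can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above (89.5 ms average, 321 ms worst case, a target of 3 fps); the function names are hypothetical.

```python
def max_fps(execution_ms):
    """Highest frame rate achievable if one frame takes execution_ms."""
    return 1000.0 / execution_ms


def remaining_budget_ms(execution_ms, target_fps):
    """Time left per frame for other tasks at the given target frame rate."""
    return 1000.0 / target_fps - execution_ms


if __name__ == "__main__":
    print(f"average case: {max_fps(89.5):.1f} fps")
    print(f"worst case:   {max_fps(321.0):.1f} fps")
    print(f"worst case slack at 3 fps: {remaining_budget_ms(321.0, 3):.1f} ms")
```

At 3 fps one frame has a budget of about 333 ms, so the worst case leaves roughly 12 ms for motion detection and tracking, while the average case leaves most of the budget free.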

7.2 Quality tests

The most interesting test is the quality test. The success rate is the most important feature of the system. One of the goals listed at the beginning was to reach the highest possible accuracy (the closer to 100%, the better). Each of the 3 described algorithms was tested and described separately. This enables a more detailed description of the system as a whole and demonstrates where most cases of incorrect operation occur. Apart from establishing the success rate, this chapter points out the causes of incorrect operation. This knowledge will be helpful in the further development of the system.

7.2.1 License plate localization

The whole process depends on good license plate localization at the beginning. This module should localize the plate rectangle or return the “plate not found” response. It was tested using 40 images acquired in different places and under different conditions. They were acquired using different devices: a Samsung Digimax L60 digital camera placed on a bridge over the motorway, a Sony XCD-SX910CR camera and an IQEye 702 closed-circuit camera. Such input data make it possible to examine the universality and robustness of the algorithm.

The test results are presented in table 7.3.

Table 7.3: The results of the license plate localization quality test.

Total number of images         40
Plate localized correctly      37   92.5%
Plate localized incorrectly     1    2.5%
Plate not found                 2    5%

Another issue is what can be called a useful plate: should all plates be localized, or only those that have a chance of being identified? The author decided to set a threshold for the membership value described in chapter 4. It helps to eliminate some objects that are not license plates or are of poor quality.

Figure 7.3: The example of the incorrect localization of the license plate.

Figure 7.4: The plate as the object with the highest membership value, but the value is under the threshold, so the “plate not found” response is returned.

As far as errors are concerned, it sometimes happens that even if a distinctive plate is present in the image, another object is chosen. Such a case is shown in figure 7.3. Because of the threshold mentioned before, it also happens that the plate is the object with the highest membership value within the image, but this value is under the threshold. Such a case can be seen in figure 7.4. Rejecting such a case is in fact not incorrect operation. One can say that the plate is present but was not found, so it is an obvious error. On the other hand, such a plate is completely useless and unreadable. Moreover, when the localization module returns the answer “plate not found”, it is very likely that it will get the next frame with the same vehicle from the tracking module. The plate may then be easier to locate and read. To conclude, the plate localization receives different images of the same vehicle a few times, so it is even better to reject doubtful cases and wait for better ones.
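The rejection rule discussed above can be sketched as follows: the candidate with the highest membership value is taken, and it is reported only if that value reaches the threshold. The function name `select_plate`, the candidate representation and the threshold value 0.6 are hypothetical; the membership function itself is the one from chapter 4 and is not reproduced here.

```python
def select_plate(candidates, threshold):
    """Pick the best plate candidate or report that no plate was found.

    candidates: list of (region, membership) pairs, membership in [0, 1].
    Returns the region with the highest membership value, or None
    ("plate not found") if the list is empty or the best value is
    below the threshold.
    """
    if not candidates:
        return None
    region, membership = max(candidates, key=lambda c: c[1])
    return region if membership >= threshold else None


if __name__ == "__main__":
    # Hypothetical candidates for two consecutive frames of one vehicle.
    frame_1 = [("bumper", 0.31), ("plate", 0.52)]
    frame_2 = [("bumper", 0.28), ("plate", 0.83)]
    for frame in (frame_1, frame_2):
        print(select_plate(frame, threshold=0.6))
```

In this hypothetical run the first frame is rejected and the second one yields the plate, which mirrors the argument that waiting for a better frame is preferable to passing on a doubtful region.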

7.2.2 License plate segmentation

The license plate segmentation is crucial for the recognition quality. It has to extract from the license plate the objects that are characters. Other objects, like the blue country stripe or the crest, must be eliminated. If errors occur at this stage, the plate cannot be read correctly: the number of characters can be incorrect, the crest can be passed to the character recognition, 2 characters can be merged, or one can be split. Such errors make it impossible for the recognition module to recognize all characters.

The test uses 16 different license plates that consist of 127 objects (characters, blue stripes and crests). An object was counted as successfully segmented when it was isolated and passed to the recognition in the case of characters, or when it was rejected in the case of other objects. Of the 127 objects, 113 were segmented successfully, which gives a success rate of 89%. Some plates were segmented correctly, but others caused problems. Figure 7.5 presents the omission of two characters. It happened because of the plate's skew. The dark area at the top of the plate caused the incorrect behavior. The algorithm did not treat the local maxima as the gaps between the letters.

Figure 7.5: The example of incorrect license plate segmentation

Figure 7.6: The example of incorrect license plate segmentation

In contrast, figure 7.6 presents a case of too many objects. In this case two errors occurred: the letter “U” was split into 2 parts, while a piece of shadow on the right was classified as a character. This situation was caused by nonuniform light. There is a light reflection on the right part of the plate, which was mentioned as a possible problem. In the same figure, the letter “L” was merged with the crest. The result of the recognition of course becomes unpredictable in such cases.

7.2.3 Character recognition

This is the final stage of the identification process. There is a good chance of a correct result if the two previous stages were completed successfully and correctly. The classifier is given an image with a single character and must classify it. To decrease the error count, the system allows the classifier to return at most 3 possibilities, sorted in order of probability.

The quality test for the classifier used characters manually extracted from the license plates. The total number of such characters was 74. Each of them was passed to the classifier. The results of the classification are given in table 7.4.

Table 7.4: The results of the character recognition quality test.

Total number of characters                         74   100%
Correct classification (only one possibility)      20    27%
Correct classification as the first possibility    52    70%
Correct classification as the second possibility   10    14%
Correct classification as the third possibility     3     4%
No correct classification present                   9    12%

As one can see, the total number of answers in which the correct classification is present reaches 88%. In 27% of cases the classifier did not have any doubt and gave only one, correct, classification. In 12% of answers none of the possibilities is correct. This usually happened when the character was rotated or a little blurred. There are many factors that can decrease the success rate of the recognition. The most significant of them is the plate's skew. This factor also disturbs the segmentation process, so this problem must be solved in the near future.
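The percentages quoted above follow directly from the counts in table 7.4. The short sketch below recomputes the cumulative rate of answers that contain the correct character within the first one, two and three possibilities; the function name `cumulative_rates` is hypothetical.

```python
def cumulative_rates(counts_by_rank, total):
    """Cumulative percentage of correct answers within the first k ranks."""
    rates = []
    running = 0
    for count in counts_by_rank:
        running += count
        rates.append(100.0 * running / total)
    return rates


if __name__ == "__main__":
    # Counts from table 7.4: correct as first, second and third possibility.
    counts = [52, 10, 3]
    total = 74
    for k, rate in enumerate(cumulative_rates(counts, total), start=1):
        print(f"correct within first {k}: {rate:.0f}%")
    print(f"no correct answer: {100.0 * (total - sum(counts)) / total:.0f}%")
```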

Chapter 8

Conclusions

This chapter presents the conclusions drawn after the implementation and tests of the system. It is the summary of the whole thesis: the early version of the system is evaluated and some possibilities of development are described.

8.1 Successful issues

The main success of this work is that the module described in this master thesis, integrated with the other system modules, is capable of correctly reading license plates when set near the road. All algorithms presented here are likely to function correctly if the image contains an evenly lit car with a visible license plate and letters that are not blurred.

As for the goals from the abstract, two of them, work in real time and autonomy, were completely fulfilled. The tests proved that the system is able to work on-line with a stream of video data. The frame rate drops to about 3 frames per second only in the case of the heaviest processing of the biggest input data. The relevant test was presented in chapter 7. As far as autonomy is concerned, the system does not need any assistance during operation. The only action required from the staff is to set the camera correctly near the road and to protect it from rain and acts of vandalism. It can also be powered from a battery, which makes it more mobile and enables its use in places that are remote from the electrical infrastructure.

The identification of every car that passes by was realized only partly. The current success rate of the system is too low to be certain that every vehicle is identified. The system, of course, tries to identify every object that is present in the input data. If the probability of correct character extraction is about 0.9 and the probability of correct recognition is also about 0.9, the total probability for a single character to be read and classified correctly is 0.81. If the plate consists of 6 characters, the probability of correct recognition of the whole plate, calculated using the binomial distribution (all 6 characters correct, 0.81^6), is 0.28. This is quite a low probability, and it shows how excellent all subparts must be to obtain reliable results. The total percentage of correctly recognized plates can be higher, because the test was made using vehicle images of varying quality. There were skewed plates, acquired from a large distance, that had a few recognition mistakes. There were also plates seen directly from the front that were read correctly. It shows that the algorithm is very sensitive to the acquisition quality.
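The estimate above can be reproduced with a short calculation. The sketch assumes, as the text implicitly does, that the stages and the individual characters fail independently of each other; the function names are hypothetical.

```python
def plate_read_probability(p_segment, p_recognize, n_chars):
    """Probability that all n_chars characters are segmented and recognized.

    Assumes independent stages and independent characters, so the
    per-character probability is a product and the plate probability
    is that product raised to the number of characters.
    """
    p_char = p_segment * p_recognize
    return p_char ** n_chars


def required_char_probability(target_plate_probability, n_chars):
    """Per-character probability needed to reach a target plate probability."""
    return target_plate_probability ** (1.0 / n_chars)


if __name__ == "__main__":
    print(f"{plate_read_probability(0.9, 0.9, 6):.2f}")
    print(f"{required_char_probability(0.9, 6):.4f}")
```

Under the same independence assumption, reading 90% of 6-character plates correctly would require each character to pass both stages with a combined probability of about 0.98, which illustrates how demanding the requirement on each subpart is.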

As far as weather conditions are concerned, the system was used on sunny and cloudy days. It was able to work in both conditions. Strong ambient light seems to be the most convenient for operation: there are no sun reflections and no high contrasts that make the acquisition difficult. The system was not tested during rain or snow, but such phenomena may negatively influence the recognition quality. There is still no case built to use the system in such weather conditions.

At the current development stage the system is capable of successfully identifying the vehicle when it is seen directly from the front and the plate orientation is very close to vertical; only a little skew or perspective is acceptable.

8.2 Encountered problems

A few of the approaches used failed during the implementation process. The first, which seemed promising, was localization with the use of the HSI (Hue, Saturation, Intensity) color space. The author assumed that the plate is an object of uniform color that can be localized using only the Hue component. This approach failed because the plate is white and there is no white hue in this space. The white color is completely independent of the hue; it is described by Saturation and Intensity. One license plate could contain several hues, so this approach was abandoned. It would be reliable for plates of some non-white color, for example yellow.
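The problem with the hue of white can be illustrated with Python's standard `colorsys` module. It offers the HLS space rather than HSI, so the sketch below uses HLS as a stand-in, which is an assumption; the two spaces share the relevant property that hue carries no information when the saturation is zero. The helper name `describe` is hypothetical.

```python
import colorsys


def describe(rgb):
    """Return (hue in degrees, lightness, saturation) for an 8-bit RGB color."""
    r, g, b = (c / 255.0 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return round(h * 360.0), round(l, 2), round(s, 2)


if __name__ == "__main__":
    for name, rgb in (("white", (255, 255, 255)),
                      ("gray", (128, 128, 128)),
                      ("yellow", (255, 255, 0))):
        print(name, describe(rgb))
```

For white and gray the saturation is 0 and the hue is reported as 0 only by convention, so a hue-based search cannot distinguish a white plate from any other achromatic surface. Yellow has a well-defined hue of 60 degrees, which is why the approach could work for yellow plates.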

Another approach that caused many problems was glyph extraction using connected components extraction. In the case of blurred or dark plates one glyph could be split, or several of them could be merged. This can in fact be a good approach if the plates are very distinctive, clean and not damaged.
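Why connected components are fragile on poor plates can be shown on a toy example. The sketch below is a generic 4-connected labeling on a tiny binary image, not the implementation that was tried in this work; the function name `count_components` and the test patterns are hypothetical.

```python
from collections import deque


def count_components(rows):
    """Count 4-connected groups of '#' pixels in equal-length strings."""
    height, width = len(rows), len(rows[0])
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if rows[y][x] != "#" or seen[y][x]:
                continue
            # New component found: flood it with a breadth-first search.
            count += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and rows[ny][nx] == "#" and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


if __name__ == "__main__":
    clean = ["#.#", "#.#", "#.#"]   # two separate strokes
    dirty = ["#.#", "###", "#.#"]   # one dirt pixel bridges the strokes
    broken = ["#.#", "..#", "#.#"]  # missing paint splits the left stroke
    print(count_components(clean), count_components(dirty),
          count_components(broken))
```

A single bridging pixel merges two glyphs into one component and a single missing pixel splits one glyph into two, which matches the failures observed on blurred, dark or damaged plates.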

Another thing that caused problems was the acquisition. The ideal image for vehicle identification should be bright and sharp, and the moving objects should not be blurred. These criteria conflict. For example, a bright image implies a long shutter time, while the demand that moving objects remain unblurred implies a short shutter time. Similar conflicts arise when it comes to the diaphragm value. There is an obvious tradeoff between all of the above requirements. All parameters need to be set according to the test environment: the lighting conditions and the desired depth of field. This must be done each time before the start. The shutter time can be set automatically, while the diaphragm and the focus must be set manually. This causes problems when the weather conditions change rapidly: staff intervention may become necessary.
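The shutter time side of this tradeoff can be given a rough order of magnitude. The sketch below estimates motion blur for an object moving parallel to the image plane; the speed, the shutter times and the image scale of 200 pixels per meter are hypothetical values chosen for illustration, not measurements from this system, and a vehicle approaching the camera head-on would blur less.

```python
def blur_pixels(speed_kmh, shutter_s, pixels_per_meter):
    """Approximate motion blur, in pixels, during one exposure.

    Assumes motion parallel to the image plane and a constant image
    scale (pixels per meter) at the distance of the object.
    """
    speed_m_s = speed_kmh / 3.6
    return speed_m_s * shutter_s * pixels_per_meter


if __name__ == "__main__":
    for shutter in (1 / 100, 1 / 1000):
        print(f"shutter {shutter:.3f} s: "
              f"{blur_pixels(50, shutter, 200):.1f} px of blur")
```

Under these assumptions a tenfold longer exposure gives a tenfold larger blur, so the exposure that brightens the image is the same one that smears the characters.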

As predicted at the beginning of the thesis, the quality of the license plates caused many problems. Many front license plates were dirty. The most common type of dirt was smashed insects. They lower the success rate of the recognition by merging several characters together or merging parts of one letter. The other type of dirt was dust, which reduced the contrast. It sometimes happened that the plate had scratches or that the paint was missing from some part of the letters, which made them impossible to read.

Many problems were caused by sun reflections. They resulted in camera sensor errors seen as white stripes across the whole image. It sometimes happened that such a stripe ran through the plate, which obstructed the recognition process. Another common problem caused by sunlight was shadow on the plate. It happened frequently around noon, when the sun was very high. In such cases the plate was usually not localized, or only the lit half was found.

After tests of the system in Poland, the author found that the system cannot cope with the recognition tasks as well as in the case of Austrian license plates. Polish plates have a slightly different mounting system, and the margins between the letters and the frame are different. This causes the system to fail more frequently. The designed and implemented algorithms need to be adjusted to Polish conditions.

8.3 Future work

During the implementation and tests of the system, many assumptions and approaches proved to be correct. As was easy to predict, there are some parts of the system that need to be perfected or even replaced.

The first issue, mentioned in the previous section, is adjusting the currently used algorithms to Polish plates. They need corrected input parameters, as well as some slight methodological changes.

The first step in the further development will be perspective correction. During the segmentation and recognition process most of the problems were caused by the skew of the plate, because the car was not always seen from the front. Skew elimination is therefore believed to significantly improve the recognition rate. To conclude, one additional submodule, perspective correction, is to be integrated into the identification system.

The character classifier is very vulnerable to noise and interference in the image. Unfortunately, such components of the input stream cannot be avoided. The template matching method cannot easily be improved to cope with such problems. The author intends to use a neural network classifier instead.

The previous paragraphs referred to the quality of the system. Further development also assumes adding new functionality, such as recognition of different types of plates, e.g. two-row plates or plates from other countries, to ensure that nobody is exempt from punishment.

Bibliography

[CCJ03] Guangzhi Cao, Jiaqian Chen, and Jingping Jiang. An adaptive approach to vehicle license plate localization. 2003.

[Com05] British Broadcasting Company. Police start camera scan network. http://news.bbc.co.uk/2/hi/uk_news/4372809.stm, 2005.

[EKF06] Balazs Enyedi, Lajos Konyha, and Kalman Fazekas. Real time number plate localization algorithms. Journal of ELECTRICAL ENGINEERING, 57:69–77, 2006.

[fLc07] Transport for London company. Congestion charge website. http://www.cclondon.com/, 2007.

[Fou07] Wikimedia Foundation. Wikipedia – the free encyclopedia. http://www.wikipedia.org, 2007.

[HP05] Leonard G. C. Hamey and Colin Priest. Automatic number plate recognition for australian conditions. dicta, 0:14, 2005.

[JZH05] Wenjing Jia, Huaifeng Zhang, and Xiangjian He. Mean shift for accurate number plate detection. Proceedings of the Third International Conference on Information Technology and Applications (ICITA’05), 2005.

[KK03] Mi-Ae Ko and Young-Mo Kim. License plate surveillance system using weighted template matching. aipr, 00:269, 2003.

[Mar07] Ondrej Martinsky. Algorithmic and Mathematical Principles of Automatic Number Plate Recognition Systems. PhD thesis, Brno University of Technology, 2007.

[Mat07] SA Mathieson. Worried about being watched? you already are. The guardian, 2007.

[NST07] Mateusz Nawrocki, Stanisław Szczepanowski, and Przemysław Taront. Camcomp – system documentation. Technical report, Poznan, Poland; Klagenfurt, Austria, 2007.

[Pol07] The West Yorkshire Police. The report about automatic number plate recognition. http://www.westyorkshire.police.uk/section-item.asp?sid=66&iid=747, 2007.

[Szc07] Stanisław Szczepanowski. Design, implementation and research of distributed system of identification and tracking of vehicles. Master’s thesis, Poznan University of Technology, Alpen-Adria University Klagenfurt, Poznan, Poland; Klagenfurt, Austria, 2007.

[Tar07] Przemysław Taront. Design and implementation of the algorithms for vehicle identification – foreground detection and object tracking. Master’s thesis, Poznan University of Technology, Alpen-Adria University Klagenfurt, Poznan, Poland; Klagenfurt, Austria, 2007.

[Tex05] Texas Instruments. TMS320C6000 Optimizing Compiler User’s Guide, July 2005.

[Tex06] Texas Instruments. TMS320C6000 Programmer’s Guide, March 2006.

[Zad65] Lotfi A. Zadeh. Fuzzy sets. Information and control, 0:16, 1965.

[ZFMV97] N. Zimic, J. Ficzko, M. Mraz, and J. Virant. The fuzzy logic approach to the car number plate locating problem. iis, 00:227, 1997.

[ZH06] Lihong Zheng and Xiangjian He. Number plate recognition based on support vector machines. avss, 0:13, 2006.
