A Practical Introduction to Computer Vision with OpenCVsetiawanhadi.unpad.ac.id/Welcome to Setiawan...

10Vision Problems

This final chapter presents a series of computer vision application problems which are possibleto solve using the theory (and practice) presented in this text. Images and videos for theseproblems are provided in the electronic resources accompanying the text.

10.1 Baby Food

On a production line cans of baby food are made by:

1. Fabricating the sides and lid of the can together but omitting the base of the can.2. Dropping a spoon into the empty upside-down can.3. Pouring the baby food (powder) into upside-down the can.4. Sealing the base onto the can.

You are asked to develop an inspection system looking at the can between steps 2 and 3 inorder to check that a single spoon has been placed into it (see Figure 10.1).

Figure 10.1 Sample baby food can images from the production just after the spoon has been droppedinto the can. Examples of no spoon, one spoon and two spoons are shown

A Practical Introduction to Computer Vision with OpenCV, First Edition. Kenneth Dawson-Howe.© 2014 John Wiley & Sons, Ltd. Published 2014 by John Wiley & Sons, Ltd.

190 A Practical Introduction to Computer Vision with OpenCV

10.2 Labels on Glue

On a production line for bottles of glue it is necessary to perform a number of inspection tests(see Figure 10.2):

1. Check that each bottle has a label.2. Check that the label on the bottle is straight.3. Check that the label is not torn or folded.

Figure 10.2 Sample images of bottles of glue are shown some without labels and some with crooked(or damaged labels). Note that these are not images from a real production line and in many cases thelabels have been manually damaged/removed

Vision Problems 191

10.3 O-rings

You are asked to detect defects (notches and breaks) in rubber O-rings as they pass before aninspection camera (see Figure 10.3).

Breaks:

Notches:

Good O-rings:

Figure 10.3 Sample images of good O-rings (i.e. with no defects; see top row), broken O-rings (middlerow) and O-rings with notches out of them (bottom row)


10.4 Staying in Lane

In order to assist drivers, you are asked to develop a system to ensure that they stay in lane,and raise an alarm if they start to drift into another lane (without indicating). Given a videofeed from a camera mounted on the dashboard of a car, looking forwards, you are askingto automatically detect the lines on each side of the lane of traffic that the car is in (seeFigure 10.4).

This is an area of significant research interest and there are other datasets available online:

1. PETS (Performance Evaluation of Tracking and Surveillance) 2001 dataset 5: 4 videos(2 looking forwards, 2 looking backwards) from a vehicle on a motorway. http://ftp.pets.rdg.ac.uk/PETS2001/DATASET5/

2. CamVid (Cambridge-driving Labeled Video Database): 4 videos looking forwards from avehicle http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/.

3. Daimler Pedestrian Detection Benchmark Dataset: A small number of long video sequenceslooking forwards from a vehicle. http://www.gavrila.net/Datasets/datasets.html

Figure 10.4 Sample images from the dashboard mounted camera

http://ftp.pets.rdg.ac.uk/PETS2001/DATASET5/

http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/

http://www.gavrila.net/Datasets/datasets.html

http://ftp.pets.rdg.ac.uk/PETS2001/DATASET5/

Vision Problems 193

10.5 Reading Notices

In order to facilitate auto-translation, you are asked to locate any text within an image andextract the text as a series of characters (see Figure 10.5). You may assume that the text willtake up a reasonable portion of the image and will be more-or-less the right way up!

Figure 10.5 Sample images of notices


10.6 Mailboxes

Given a live feed from a camera monitoring mailboxes, you are asked to determine if there isany post in each mailbox (see Figure 10.6). To make the processing a bit easier a pattern ofblack and white stripes has been put in each box. As people put mail in and take mail from themailboxes on a regular basis, your solution must be able to cope with occlusion of the cameraby moving people (without generating any false mail notifications).

Figure 10.6 Three sample frames from a video of the mailboxes, all empty (left), partly occluded(centre and with mail in three of the boxes (right)

Vision Problems 195

10.7 Abandoned and Removed Object Detection

You are asked to locate any abandoned objects and any removed objects (see Figures 10.7 and10.8).

This is an area of significant research interest and there are many datasets available online:

1. CAVIAR (Context Aware Vision using Image-based Active Recognition project): 5 videosin a lobby from above of various objects being abandoned and removed. http://homepages.inf.ed.ac.uk/rbf/CAVIAR/.

2. ViSOR (Video Surveillance Online Repository): 19 videos of abandoning and removingobjects in an outdoor car-park. http://www.openvisor.org/ and http://imagelab.ing.unimore.it/visor/

3. PETS (Performance Evaluation of Tracking and Surveillance) 2006 dataset: 7 left luggagedatasets of videos from 4 different cameras in a train station. http://www.cvg.rdg.ac.uk/PETS2006/data.html

4. PETS (Performance Evaluation of Tracking and Surveillance) 2007: 7 left luggage datasetsof videos from 4 different cameras in a busy airport terminal. http://pets2007.net/

Figure 10.7 Sample object abandonment

Figure 10.8 Sample object removal

http://homepages.inf.ed.ac.uk/rbf/CAVIAR/

http://www.openvisor.org/

http://imagelab.ing.unimore.it/visor/

http://www.cvg.rdg.ac.uk/PETS2006/data.html

http://pets2007.net/



http://www.cvg.rdg.ac.uk/PETS2006/data.html


10.8 Surveillance

You are asked to find and classify (as people, cars, etc.) all moving objects in a video sequencefrom a static camera (see Figure 10.9). You must be able to deal with environmental effects(such as changing lighting and the effects of wind, etc.).

This is an area of significant research interest and there are large numbers of datasetsavailable online. For example:

1. CAVIAR (Context Aware Vision using Image-based Active Recognition project): http://homepages.inf.ed.ac.uk/rbf/CAVIAR/.

2. ViSOR (Video Surveillance Online Repository): http://www.openvisor.org/ and http://imagelab.ing.unimore.it/visor/

3. PETS (Performance Evaluation of Tracking and Surveillance datasets: http://ftp.pets.rdg.ac.uk/

4. SPEVI (Surveillance Performance EValuation Initiative) datasets http://www.elec.qmul.ac.uk/staffinfo/andrea/spevi.html

5. ETISEO dataset from INRIA. http://www-sop.inria.fr/orion/ETISEO/6. OTCBVS dataset. http://www.cse.ohio-state.edu/otcbvs-bench/bench.html7. LIMU dataset from Kyushu University. http://limu.ait.kyushu-u.ac.jp/en/dataset/

Figure 10.9 Four frames from PETS 2001 (Dataset 1, Camera 1). Reproduced by permission of Dr.James Ferryman, University of Reading



http://www.openvisor.org/



http://ftp.pets.rdg.ac.uk/

http://www.elec.qmul.ac.uk/staffinfo/andrea/spevi.html

http://www-sop.inria.fr/orion/ETISEO/

http://www.cse.ohio-state.edu/otcbvs-bench/bench.html

http://limu.ait.kyushu-u.ac.jp/en/dataset/

http://ftp.pets.rdg.ac.uk/

http://www.elec.qmul.ac.uk/staffinfo/andrea/spevi.html

Vision Problems 197

10.9 Traffic Lights

You are asked to locate any traffic lights in a camera view and to determine what light is lit(green, amber or red) (see Figure 10.10).

There are videos, available online, taken from moving vehicles which can be used for trafficlight detection:

1. CamVid (Cambridge-driving Labeled Video Database): 4 videos looking forwards from avehicle http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/.

2. Daimler Pedestrian Detection Benchmark Dataset: A small number of long video sequenceslooking forwards from a vehicle. http://www.gavrila.net/Datasets/datasets.html

Figure 10.10 Three frames from a stationary camera at a traffic light controlled road junction

http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/

http://www.gavrila.net/Datasets/datasets.html


10.10 Real Time Face Tracking

Assuming that the cascade of Haar classifiers is too slow, propose a way of tracking a face in(close to) real time in a video (see Figure 10.11).

Figure 10.11 Sample frames from a video showing a face to be tracking

Vision Problems 199

10.11 Playing Pool

You are asked to locate the balls on a pool table so that their positions can be analysed forpossible shots (see Figure 10.12). Ball positions should be computed only when there is nomotion on the table and no one is occluding the table. Also, ball positions should be reportedrelative to the table (rather than the image).

Figure 10.12 Sample images from a video of a game of pool


10.12 Open Windows

You are asked to develop a system to determine what (top-hung) windows are open in imagesof a building. This is intended to aid in the efficient heating and ventilation of the building(see Figure 10.13).

Figure 10.13 Three images from a camera monitoring a set of windows

Vision Problems 201

10.13 Modelling Doors

Given a video feed from a surveillance camera, you are asked to locate any doors in theview. This can be done during a training phase either from video or still images. This shouldaid the system when attempting to understand the actions of people within the scene (seeFigure 10.14).

Figure 10.14 Three sample frames from a camera monitoring a doorway


10.14 Determining the Time from Analogue Clocks

You are asked to determine the time from an image of an analogue clock (see Figure 10.15).

Figure 10.15 Sample clock images

Vision Problems 203

10.15 Which Page

You are asked to determine which page (if any) out of a given text a reader is currentlyviewing so that augmented content can be added to the page through the use of a projector(see Figures 10.16 and 10.17).

Figure 10.16 Some of the provided sample page images

Figure 10.17 Sample views of the book while it is being read


10.16 Nut/Bolt/Washer Classification

You are asked to classify parts coming down a conveyor line as nuts, bolts or washers (orsomething unknown) (see Figure 10.18).

Figure 10.18 Sample views of the parts on the conveyor belt

Vision Problems 205

10.17 Road Sign Recognition

Assuming that you have sample (CAD) images of road signs such as those in Figure 10.19,you are asked to develop a system to automatically recognise these road signs from singleimages (taken from the dashboard of a car looking forwards). They will appear at differentsizes (as the car moves towards them), but should be facing directly towards the vehicle (seeFigure 10.20).

Figure 10.19 Sample CAD road signs

Figure 10.20 Sample images (containing road signs) taken looking forwards from a vehicle


10.18 License Plates

You are asked to recognise the number on license plates taken with a handheld camera.You are provided with example images of the digits in two forms (synthetic and real) (seeFigures 10.21, 10.22 and 10.23).

Figure 10.21 Sample synthetics digits

Figure 10.22 Sample digits from license plates

Figure 10.23 Sample images of car license plates

Vision Problems 207

10.19 Counting Bicycles

You are asked to locate any bicycles moving along a path (see Figure 10.24). Note that therecan be other bicycles or people walking, and that you must be able to cope with changes inthe lighting (as the sun goes in and out) and with the waves on the sea (which are constantlymoving).

Figure 10.24 Some frames from the sample bicycle video


10.20 Recognise Paintings

As part of an augmented reality system you are asked to automatically determine the identityof any painting in the view of an observer (camera) (see Figures 10.25 and 10.26).

Figure 10.25 Sample images from an observer in a gallery. Photographs taken by Dr. Kenneth Dawson-Howe at the National Gallery of Ireland

Figure 10.26 Sample known painting images. The painting are: (top left) ‘The Marriage Feast anCana’ (late 1660s) by Jan Havicksz. Steen, (top centre) ‘A Thunderstorm: The Frightened Wagoner’(1832) by James Arthur O’Connor, (top right) ‘Joseph Selling Corn in Egypt’ (1612) by Pieter Lastman,(bottom left) ‘A Sick Call’ (1863) by Matthew James Lawless, (bottom centre) ‘A View of the Rye Waternear Leixlip’ (1850s) by William Davis, and finally (bottom right) ‘The Arrival of the ‘Kattendijk’ atTexel’ (1702) by Ludolf Backhuysen I. Photographs taken by Dr. Kenneth Dawson-Howe at the NationalGallery of Ireland

A Practical Introduction to Computer Vision with OpenCVsetiawanhadi.unpad.ac.id/Welcome to Setiawan...

Documents

Transcript of A Practical Introduction to Computer Vision with OpenCVsetiawanhadi.unpad.ac.id/Welcome to Setiawan...