

National Conference on "Advanced Technologies in Computing and Networking" (ATCON-2015), Special Issue of International Journal of Electronics, Communication & Soft Computing Science and Engineering, ISSN: 2277-9477

49

Counting People Based on Dynamic Texture Background Model

Geeta L. Makhija, Dr. R. D. Raut, Dr. V. M. Thakare

Abstract: Conventional video surveillance systems have several shortcomings. First, target detection is inaccurate under varying illumination. Second, tracking multiple targets is difficult in a crowded scene. Third, it is difficult to separate the tracked targets within a merged image blob. Finally, inaccurate foreground detection reduces tracking efficiency and precision. In this paper, a fusion of temporal and texture background models and a multi-mode tracking scheme are proposed to address these problems. In addition, a people counting scheme based on the multi-mode multi-target tracking method is proposed for crowded scenes.

Key Words: Target contour extraction, template tiling, texture analysis

I. INTRODUCTION

People counting is a crucial and challenging problem in visual surveillance. An accurate, real-time estimate of the number of people in a shopping mall can provide valuable information for managers. Automatic monitoring of the number of people in public areas is also important for safety control and urban planning. In recent years this field has seen many advances, but the solutions carry restrictions: people must be moving, the background must be simple, or the image resolution must be high. Real scenes, however, include both moving and stationary people, the background may be complicated, and most videos in a visual surveillance system have relatively low resolution. Different poses, views and scales make the task harder still. In particular, spatial occlusions often occur in realistic crowded scenes, which increases the difficulty of accurate pedestrian counting [1], [2]. Some systems are based on optical barriers, which produce a high error rate both in false negatives, by failing to discriminate between people walking in parallel, and in false positives, by including objects such as bags and cases in the people count.

Several alternatives to these systems are based on computer vision. These are low-cost, non-intrusive systems that can resolve some of the problems mentioned earlier and yield a relatively high hit rate. In addition, some of these systems can track people crossing the camera's field of view, increasing robustness by taking several measurements of the same person in the video sequence [3]. Object detection and tracking have become central topics in computer vision research, and recent approaches have shown considerable progress under challenging situations such as changes in object appearance, scale, occlusions, and pose variations [4].

In this paper, to improve the accuracy of foreground detection, a dynamic texture model is used to eliminate false foregrounds. In general, the background texture can be modeled using the local binary pattern (LBP). A multi-mode multi-target tracking scheme is applied to handle the complex target tracking problem in crowded scenes.
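As an illustration of the LBP texture descriptor mentioned above, the following is a minimal sketch of the basic 8-neighbour operator on a grayscale image stored as a list of rows. The function names `lbp_code` and `lbp_histogram`, the clockwise neighbour order and the `>=` comparison are assumptions made for this example; the paper does not specify which LBP variant it uses.

```python
def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the interior pixel at (r, c)."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel
    # (the ordering is an assumption; any fixed order works).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram (256 bins) of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Because each bit depends only on the sign of a difference between a neighbour and the centre, adding a constant to every pixel leaves the code unchanged, which is the usual argument for using LBP under lighting changes.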

II. BACKGROUND

An effective method is developed for estimating the number of people and locating each individual in a low-resolution image of a complicated scene. In general, target detection is inaccurate under varying illumination or against a cluttered background. Light reflection and backlighting in particular can seriously degrade target detection. Here, a pixel-wise temporal probability background model and a voting rule are applied to segment foreground from background under varying illumination or background clutter. To improve the accuracy of foreground detection, the dynamic texture model is used to eliminate false foregrounds.
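The paper does not give the form of its temporal probability model or its voting rule, so the following is only a rough sketch under stated assumptions: each pixel keeps a histogram of quantized intensities over time, a pixel is a raw foreground candidate when the observed intensity bin has low empirical probability, and a 3x3 neighbourhood vote removes isolated candidates. The class name, `bins`, `min_prob` and `min_votes` are all hypothetical.

```python
class TemporalBackgroundModel:
    """Per-pixel intensity histogram used as an empirical probability model."""

    def __init__(self, height, width, bins=16, min_prob=0.2):
        self.bins = bins            # number of intensity bins (assumed)
        self.min_prob = min_prob    # below this probability: foreground
        self.frames = 0
        self.hist = [[[0] * bins for _ in range(width)]
                     for _ in range(height)]

    def _bin(self, value):
        # Map an 8-bit intensity (0..255) to a histogram bin.
        return min(value * self.bins // 256, self.bins - 1)

    def update(self, frame):
        """Accumulate one training frame into the per-pixel histograms."""
        for r, row in enumerate(frame):
            for c, value in enumerate(row):
                self.hist[r][c][self._bin(value)] += 1
        self.frames += 1

    def raw_foreground(self, frame):
        """Mark pixels whose intensity was rarely seen at that location."""
        if self.frames == 0:
            raise ValueError("model has not been trained on any frame")
        return [[1 if self.hist[r][c][self._bin(v)] / self.frames
                 < self.min_prob else 0
                 for c, v in enumerate(row)]
                for r, row in enumerate(frame)]


def vote(mask, min_votes=5):
    """Keep a pixel as foreground only if enough 3x3 neighbours agree."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            votes = sum(mask[rr][cc]
                        for rr in range(max(r - 1, 0), min(r + 2, h))
                        for cc in range(max(c - 1, 0), min(c + 2, w)))
            out[r][c] = 1 if votes >= min_votes else 0
    return out
```

In this sketch the vote acts as a spatial consistency check: a single flickering pixel is rejected, while the interior of a compact blob survives.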

III. PREVIOUS WORK DONE

Hou et al. [1] developed a new cluster model that is more accurate in both counting and detection than the Gaussian model. Their results show that the linear relationship is not very sensitive to the size of the disk. However, when the disk is too large, many areas without people are also identified as foreground pixels, which can result in false estimates.

Wang et al. [2] presented a robust pedestrian counting approach using group context. Compared with the single-frame approach, the method achieves better results, especially when severe occlusions occur in the scene, and it is more practicable in real-world scenes. The method can sometimes exclude noisy groups such as small masks caused by a defective background. However, under serious occlusions some individuals may be counted repeatedly.

Garcia et al. [3] proposed an extended modified condensation algorithm, based on the optical flow generated by the movement of people and the depth to the height of the system, as an estimation method for multiple people. The approach addresses occlusion between people and discriminates between people and objects such as shopping trolleys or bags in stores. Including different features relevant to people tracking, such as movement, size, and height, adapting the propagation and observation models in the particle filter, and following it with a clustering method provides sufficient accuracy and robustness to achieve high counting rates. The problem of occlusion is solved completely, but the possibility of extracting detailed information about people is drastically reduced. Errors are introduced by a lack of contrast between the floor and the person moving through the counting area, and by slow movement of people. The detection rate decreases as the ratio of people to area increases (when more than four people interact while crossing the counting area).

Lin et al. [4] proposed Invariant Hough Random Ferns (IHRF) for object detection and tracking. The approach delivers reliable results even if the object undergoes heavy deformations against a complex background, but the Hough voting step involves considerable computational effort.

Many methodologies have been proposed for object detection from a video sequence. Most of them use multiple techniques, and there are combinations and intersections among the different methodologies [5]. Joshi and Panchal [6] focused on detection schemes based on background subtraction because of their widespread use and the possibilities they offer for implementing a real-time object detection system. Compared to other common moving object detection algorithms, background subtraction segments foreground objects more accurately and detects foreground objects even if they are motionless [7].

Verma et al. [8] proposed a method that relies on a graph representation of moving objects, which allows a dynamic template of each moving object to be derived and maintained by enforcing temporal coherence. The inferred template, together with the graph representation, allows object trajectories to be characterized as an optimal path in a graph. The tracker can deal with partial occlusions and stop-and-go motion in very challenging situations.

Liu et al. [9] proposed a counting method based on detection and tracking. The method involves two steps. In the first step, a head-and-shoulder HOG feature is extracted to detect pedestrians. In the second step, a particle filter with a color-based appearance model is used to track pedestrians. The method shows great robustness under partial occlusion, crowding and background disturbance. With GPU acceleration, the method can run fully in real time and be embedded into real applications.

García et al. [10] present an application for counting people with a single fixed camera. The system distinguishes between people entering and people leaving the supervised area. The counter requires two steps: detection and tracking. Detection is based on finding people's heads by correlating the preprocessed image with several circular patterns. Tracking applies a Kalman filter to determine the trajectory of each candidate. Finally, the system updates the counters based on the direction of the trajectories.
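The Kalman-filter tracking and direction-based counting described for [10] can be illustrated with a small constant-velocity filter on one image coordinate. This is a sketch under assumptions, not the cited implementation: the class and function names, the noise values `q` and `r`, the initial covariance, and the convention that an increasing coordinate means "in" are all hypothetical.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and position measurements."""

    def __init__(self, pos, q=1e-2, r=1.0):
        self.pos, self.vel = float(pos), 0.0
        self.p = [[10.0, 0.0], [0.0, 10.0]]  # state covariance (assumed prior)
        self.q, self.r = q, r                 # process / measurement noise

    def predict(self, dt=1.0):
        # x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]].
        (p00, p01), (p10, p11) = self.p
        self.pos += dt * self.vel
        a = p00 + dt * p10
        b = p01 + dt * p11
        self.p = [[a + dt * b + self.q, b],
                  [p10 + dt * p11, p11 + self.q]]

    def update(self, z):
        # Measurement model H = [1, 0]: only the position is observed.
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.r                 # innovation variance
        k0, k1 = p00 / s, p10 / s        # Kalman gain
        y = z - self.pos                 # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.p = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]


def crossing_direction(track, line):
    """Return "in", "out" or None for one track crossing a counting line."""
    kf = ConstantVelocityKalman1D(track[0])
    prev = kf.pos
    for z in track[1:]:
        kf.predict()
        kf.update(z)
        if prev < line <= kf.pos:
            return "in"    # assumed convention: increasing coordinate
        if prev > line >= kf.pos:
            return "out"
        prev = kf.pos
    return None
```

A counter would then keep two totals and increment one of them each time `crossing_direction` reports a crossing for a finished track.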

IV. EXISTING METHODOLOGY

A) A neural network method based on EM algorithm

A neural network is used to estimate the number of people in real time. Given the estimated number of people, a human detection method based on the EM algorithm is applied in subsequent video processing. Clustering the KLT feature points inside a foreground mask reduces the need for an accurate foreground contour. Methods based on foreground segmentation are thereby extended to detect people who are moving only slightly. The new cluster model has been shown to be more accurate in both counting and detection than the Gaussian model.
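To make the clustering step concrete, the sketch below runs EM on 2-D feature points with a plain isotropic Gaussian per person, equal mixing weights and a fixed `sigma`. This is a stand-in chosen for brevity: the cited work reports that its own cluster model outperforms the Gaussian one, and that model is not reproduced here. The function name `em_cluster`, the deterministic initialisation and all parameter values are assumptions.

```python
import math


def em_cluster(points, k, iters=20, sigma=1.0):
    """Cluster 2-D points into k groups with a simplified Gaussian EM."""
    # Deterministic start: k points evenly spaced through the input list.
    step = max(len(points) // k, 1)
    means = [points[min(i * step, len(points) - 1)] for i in range(k)]
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each cluster for each point.
        resp = []
        for x, y in points:
            logs = [-((x - mx) ** 2 + (y - my) ** 2) / (2 * sigma ** 2)
                    for mx, my in means]
            top = max(logs)  # subtract the maximum for numerical stability
            ws = [math.exp(v - top) for v in logs]
            total = sum(ws)
            resp.append([w / total for w in ws])
        # M-step: move each mean to the responsibility-weighted centroid.
        for j in range(k):
            mass = sum(r[j] for r in resp)
            if mass > 0:
                means[j] = (
                    sum(r[j] * p[0] for r, p in zip(resp, points)) / mass,
                    sum(r[j] * p[1] for r, p in zip(resp, points)) / mass,
                )
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, labels
```

Here `k` plays the role of the people count supplied by the neural network, and each resulting mean is a candidate person location.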

B) Spatio-temporal Group Context along with MCMC algorithm

A robust pedestrian counting approach is presented using group context. A group correspondence matrix is built not only for detecting and tracking groups, but also for detecting group states, group relatives and group events. The group context is then modeled using the foreground masks of a group of interest and its relatives. A set of context masks is assembled to form the group context, comprising intra-group context and inter-group context, which serves as the spatio-temporal reference for a joint MAP estimation problem. The MCMC algorithm is employed to search for an optimal configuration set that matches the group context model.

C) Extended Condensation algorithm

The extended modified condensation algorithm is proposed, based on the optical flow generated by the movement of people and the depth to the height of the system, as an estimation method for multiple people. A modified K-means algorithm is used to provide a deterministic output. Including different features relevant to people tracking, such as movement, size, and height, adapting the propagation and observation models in the particle filter, and following it with a clustering method provides sufficient accuracy and robustness to achieve high counting rates.
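For orientation, one iteration of the basic condensation (particle filter) scheme can be sketched as select, predict, measure. The sketch tracks a single scalar position with Gaussian motion and measurement models; it does not include the optical-flow and height features, the extensions, or the K-means stage of [3]. The function names and the values of `motion_std` and `meas_std` are assumptions.

```python
import math
import random


def condensation_step(particles, weights, measurement, rng,
                      motion_std=1.0, meas_std=2.0):
    """Run one select / predict / measure cycle and return the new set."""
    # 1. Select: resample particles in proportion to their weights.
    chosen = rng.choices(particles, weights=weights, k=len(particles))
    # 2. Predict: propagate each particle with random motion noise.
    predicted = [p + rng.gauss(0.0, motion_std) for p in chosen]
    # 3. Measure: weight each particle by a Gaussian likelihood.
    new_w = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2)
             for p in predicted]
    total = sum(new_w)
    if total == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        new_w = [1.0 / len(predicted)] * len(predicted)
    else:
        new_w = [w / total for w in new_w]
    return predicted, new_w


def estimate(particles, weights):
    """Weighted mean of the particle set."""
    return sum(p * w for p, w in zip(particles, weights))
```

A typical use is to start from particles spread uniformly over the image coordinate with equal weights, call `condensation_step` once per frame with a seeded `random.Random`, and read the tracked position from `estimate`.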

V. ANALYSIS AND DISCUSSION

The neural network method based on the EM algorithm estimates the number of people and locates each individual in a low-resolution image of a complicated scene. An Expectation Maximization (EM) based method is developed along with postprocessing steps, and a new cluster model is used to represent each person in the scene. The threshold should be set so that people who move only slightly still show some scattered pixels while the noise is kept low. The result is an effective method for estimating the number of people and locating each individual even in a low-resolution image of a complicated scene. The new cluster model has been shown to be more accurate in both counting and detection than the Gaussian model. The results show that the linear relationship is not very sensitive to the size of the disk. However, when the disk is too large, many areas without people are also identified as foreground pixels, which can result in false estimates. The movement of non-human objects also produces wrong foreground pixels. At the end of the four-hour video, the ground truth of the number of people is relatively low. Hence, the movement of some boxes caused large error percentages.

A group context based Bayesian framework is proposed to address the problem of pedestrian counting. A group correspondence matrix is also presented to build the bidirectional correspondences between the groups in two consecutive frames. Compared with the single-frame approach, the method achieves better results, especially when severe occlusions occur in the scene, and it is more practicable in real-world scenes. The method still requires parameter tuning. The group context model can sometimes exclude noisy groups such as small masks caused by a defective background. However, under serious occlusions some individuals may be counted repeatedly.

The extended modified condensation algorithm addresses occlusion between people and discriminates between people and objects such as shopping trolleys or bags in stores. Including different features relevant to people tracking, such as movement, size, and height, adapting the propagation and observation models in the particle filter, and following it with a clustering method provides sufficient accuracy and robustness to achieve high counting rates. The parameters under consideration are the distance between cameras, the focal length of the sensors, the maximum distance parameter (dmax), the motion threshold (THRm), frames per second, image size, area covered, and system height. The problem of occlusion is solved completely, but the possibility of extracting detailed information about people is drastically reduced. Errors are introduced by a lack of contrast between the floor and the person moving through the counting area, and by slow movement of people. The detection rate decreases as the ratio of people to area increases (when more than four people interact while crossing the counting area).

VI. PROPOSED METHODOLOGY

The system block diagram is shown in Fig. 1, and the proposed people counting method is described in Fig. 2. By analyzing the target contour, the number of people inside the contour can be computed.

Fig. 3 shows the template tiling process, which estimates the number of people in a merged image blob. First, contour extraction is used to obtain the target contour shown in Fig. 3(c). After the target contour is obtained, a template tiling method is used to estimate the number of people in the target region. Second, a rectangular box with the average human height and width is matched against the object contour from the upper left to the lower right. Finally, the number of people in the merged image blob is estimated. The template tiling result is shown in Fig. 3(d).
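The tiling step described above can be sketched as a greedy scan over a binary foreground mask. The paper does not state its matching criterion, so this example assumes that a person-sized box is accepted when at least `min_fill` of its pixels are foreground and it does not overlap a box already placed; the function name, the one-pixel scan stride and the default `min_fill` are hypothetical.

```python
def count_people_by_tiling(mask, person_h, person_w, min_fill=0.6):
    """Estimate the number of people in a blob by tiling person-sized boxes.

    mask is a list of rows of 0/1 values; person_h and person_w are the
    average human height and width in pixels.
    """
    rows, cols = len(mask), len(mask[0])
    covered = [[False] * cols for _ in range(rows)]
    count = 0
    # Scan candidate boxes from the upper left to the lower right.
    for top in range(rows - person_h + 1):
        for left in range(cols - person_w + 1):
            cells = [(r, c)
                     for r in range(top, top + person_h)
                     for c in range(left, left + person_w)]
            # Skip boxes that overlap a tile placed earlier.
            if any(covered[r][c] for r, c in cells):
                continue
            fill = sum(mask[r][c] for r, c in cells) / len(cells)
            if fill >= min_fill:
                count += 1
                for r, c in cells:
                    covered[r][c] = True
    return count
```

For example, a solid blob 4 pixels tall and 4 pixels wide, tiled with a 4 by 2 template, yields an estimate of two people. The greedy scan is simple but order dependent, so a real system would likely refine the placement.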

Fig. 1. Block diagram for people counting based on texture analysis. Block labels: Input video image; Image rectification; Texture model (Local Binary Pattern); Foreground detection in temporal domain (Bayesian classifier); Foreground detection in spatial domain (dynamic texture modelling); Contour extraction and region tiling; People counting.

Fig. 2. Flowchart of people counting. Box labels: Target; Single/multiple target decision (branches: Single, Multiple); Counting people; Target contour extracting; Template tiling; Sum of people counting.

Fig. 3. Template tiling. (a) Original image. (b) Detected foreground. (c) Object contour. (d) Template tiling result.


VII. POSSIBLE OUTCOMES AND RESULTS

First, the proposed spatial-temporal probability model is applied to detect the moving targets. Finally, a people counting scheme with contour corner detection and template tiling is used to estimate the number of people in the crowd scene. The results are expected to be robust even when the lighting changes, and the targets in the scene may be tracked with the correct tracking modes. The method is therefore expected to be efficient, robust and accurate.

CONCLUSION

In this paper, the spatial-temporal probability background model and the texture background model are fused to detect foregrounds robustly even when the lighting changes seriously. By applying the template tiling method, the number of people in the crowd scene can be estimated. Future work needs to improve the foreground detection approach and to eliminate moving-object noise in the image.

REFERENCES

[1] Ya-Li Hou, Grantham K. H. Pang, "People Counting and Human Detection in a Challenging Situation", IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, Vol. 41, No. 1, pp. 24-33, January 2011.

[2] Jinqiao Wang, Wei Fu, Jinging Liu, Hanqing Lu, "Spatio-temporal Group Context for Pedestrian Counting", IEEE Transactions on Circuits and Systems for Video Technology, doi:10.1109/TCSVT.2014.2308616, 2014.

[3] Jorge García, Alfredo Gardel, Ignacio Bravo, José Luis Lázaro, and Miguel Martínez, "Tracking People Motion Based on Extended Condensation Algorithm", IEEE Transactions on Systems, Man, and Cybernetics: Systems, Vol. 43, No. 3, pp. 606-618, May 2013.

[4] Yimin Lin, Naiguang Lu, Xiaoping Lou, Fang Zou, Yanbin Yao, Zhaocai Du, "Invariant Hough Random Ferns for Object Detection and Tracking", Mathematical Problems in Engineering, Hindawi Publishing Corporation, Volume 2014, Article ID 513283, 20 pages, http://dx.doi.org/10.1155/2014/513283

[5] S. H. Shaikh, "Moving Object Detection Approaches, Challenges and Object Tracking", Springer Briefs in Computer Science, DOI 10.1007/978-3-319-07386-6_2, © The Author(s) 2014.

[6] Sneha S. Joshi, J. R. Panchal, "Background Subtraction Based Detection and Tracking of People in Video", International Journal of Innovative Science, Engineering & Technology, ISSN 2348-7968, Vol. 1, Issue 5, July 2014.

[7] Andrea Noreen D'silva, Nagalakshmi Shagoufta Taskeen, "Object Tracking Initialization Based on Automatic Moving Object Detection: A Survey Paper", Indian Streams Research Journal, ISSN 2230-7850, Volume 4, Issue 5, June 2014.

[8] Akancha Verma, Ayush Baranwal, Shivam Kejriwal, Shobhit Shubham Saxena, "Efficient tracking of image sequence inspection and moving object recognition", International Journal of Research (IJR), ISSN 2348-6848, Vol. 1, Issue 4, May 2014.

[9] Jingyu Liu, Jiazheng Liu, Mengyang Zhang, "A Detection and Tracking Based Method for Real-Time People Counting", IEEE, pp. 470-473, 2013.

[10] Jorge García, Alfredo Gardel, Ignacio Bravo, José Luis Lázaro, Miguel Martínez, and David Rodríguez, "Directional People Counter Based on Head Tracking", IEEE Transactions on Industrial Electronics, Vol. 60, No. 9, pp. 3991-4000, September 2013.

AUTHOR’S PROFILE

Geeta L. Makhija completed her B.E. degree in Computer Science and Engineering at Sant Gadge Baba Amravati University, Amravati, Maharashtra. She is pursuing a Master's degree in Computer Science and Information Technology at the P.G. Department of Computer Science and Engineering, S.G.B.A.U. Amravati. Her current research interest is focused on the area of Image Processing. (E-mail: [email protected]).

Dr. R. D. Raut
Dr. Mrs. Ranjana D. Raut graduated in Instrumentation and Control from the prestigious Government College of Engineering, Pune (COEP) in 1986. She was a board ranker of the SSC and HSSC Maharashtra Board and a recipient of National Merit Awards. She received her master's degree in Electronic Engineering with distinction and her Ph.D. in Electronics (Biomedical-Soft Computing) from SGB Amravati University. She taught in various engineering colleges for 9 years and has been an Associate Professor in the postgraduate Applied Electronics Department (Engg. & Tech.) for the past 20 years. She has published 48 research papers in highly indexed international journals. She is also in charge of the Central Instrumentation Research Cell and the Instrument Maintenance Facility center of SGB Amravati University. Her areas of research are Machine Intelligence, Soft Computing, Medical Imaging, wireless communication, and the design and development of low-cost health care devices in medical electronics. (E-mail: [email protected])

Dr. V. M. Thakare
Dr. Vilas M. Thakare is Professor and Head of the Post Graduate Department of Computer Science and Engg, Faculty of Engineering & Technology, SGB Amravati University, Amravati. He also works as a coordinator on a UGC-sponsored scheme of e-learning and m-learning specially designed for teaching and research. He holds a Ph.D. in Computer Science/Engg, completed his M.E. in 1989 and graduated in 1984-85. He has more than 27 years of experience in teaching and research. Throughout his teaching career he has taught more than 50 subjects in various UG and PG level courses. His PhD was in the area of robotics, AI and computer architecture. Five candidates have completed a PhD under his supervision and more than 8 are pursuing a PhD at national and international level. His areas of research are Computer Architectures, AI and IT. He has completed one UGC research project on "Development of ES for control of 4 legged robot device model", and one UGC research project is ongoing under the innovative scheme. At PG level he has guided more than 300 projects/dissertations. He has also successfully completed the software development and computerization of the Finance, Library, Exam, Admission Process, and Revaluation Process of Amravati University, as well as consultancy work for election data processing. He has also worked as a member of the Academic Council, a selection committee member of various other universities and the parent university, Member of the faculty of Engineering & Science, BOS (Comp. Sci.), Member of the IT Committee, Member of the Networking Committee, and Member of UGC, AICTE, NAAC, BUTR, ASU, DRC, RRC, SEC, CAS, NSD etc. committees.