Multi-Target Tracking Through Opportunistic Camera Control in a Resource Constrained Multimodal Sensor Network
Jayanth Nayak 1, Luis Gonzalez-Argueta 2, Bi Song 2, Amit Roy-Chowdhury 2, Ertem Tuncel 2
ICDSC'08 1
MULTI-TARGET TRACKING THROUGH OPPORTUNISTIC CAMERA CONTROL IN A RESOURCE CONSTRAINED MULTIMODAL SENSOR NETWORK
Jayanth Nayak 1, Luis Gonzalez-Argueta 2, Bi Song 2, Amit Roy-Chowdhury 2, Ertem Tuncel 2
Department of Electrical Engineering, University of California, Riverside
9/8/2008
Bourns College of Engineering, Information Processing Laboratory, www.ipl.ee.ucr.edu
Overview
Introduction
Problem Formulation
Audio And Video Processing
Camera Control Strategy
Computing Final Tracks Of All Targets
Experimental Results
Conclusion
Acknowledgements
Motivation
Obtaining multi-resolution video from a highly active environment requires a large number of cameras.
Disadvantages:
  Cost of buying, installing, and maintaining the cameras
  Bandwidth limitations
  Processing and storage
  Privacy
Our goal: minimize the number of cameras by using a control mechanism that directs their attention to the interesting parts of the scene.
Proposed Strategy
Audio sensors direct the pan/tilt/zoom of the camera to the location of the event.
Audio data intelligently turns on the camera and video data turns off the camera.
Audio and video data are fused to obtain tracks of all targets in the scene.
Example Scenario
Figure: An example scenario in which audio can be used to efficiently control two video cameras. There are four tracks that need to be inferred. The time instants of interest are indicated directly on the tracks, i.e., the initiation and end of each track, mergings, splittings, and crossovers. The mergings and crossovers are further emphasized by X. The two innermost tracks coincide over the entire time interval (t2, t3). The cameras C1 and C2 need to be panned, zoomed, and tilted based on their own output and that of the audio sensors a1, . . . , aM.
Relation To Previous Work
Previous work: fusion of simultaneous audio and video data.
  Ours: audio and video data are captured over disjoint time intervals.
Previous work: dense networks of vision sensors.
  Ours: to cover a large field, we focus on controlling a reduced set of vision sensors.
Our video and audio data are analyzed from dynamic scenes.
Problem Formulation
Audio sensors A = {a1, . . . , aM} are distributed across the ground plane R.
R is also observable from a set of controllable cameras C = {c1, . . . , cL}.
However, the entire region R may not be covered with one set of camera settings.
p-tracks: tracks belonging to targets
a-tracks: tracks obtained by clustering audio
Resolving p-track ambiguity
  Camera control
  Person matching
Tracking System Overview
Figure: Overall camera control system. Audio sensors A = {a1, . . . , aM} are distributed across regions Ri. The set of audio clusters is denoted by Bt, and Kt− represents the set of confirmed a-tracks estimated from observations before time t. P/T/Z cameras are denoted by C = {c1, . . . , cL}. Ground plane positions are denoted by O^t_k.
Processing Audio and Video
a-tracks are clusters of audio data that exceed an amplitude threshold
  Tracked using a Kalman filter
In video, people are detected using histograms of oriented gradients (HOG) and tracked using an auxiliary particle filter
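The audio half of this slide can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional ground-plane position, a constant-velocity motion model, and made-up noise values `q` and `r`. The names `ATrackKalman1D` and `track_audio` are hypothetical. Readings below the amplitude threshold are ignored, and the filter coasts on its prediction until the next loud reading.

```python
from dataclasses import dataclass, field


@dataclass
class ATrackKalman1D:
    """Constant-velocity Kalman filter over state [position, velocity]."""
    pos: float
    vel: float = 0.0
    # 2x2 state covariance, stored as nested lists.
    P: list = field(default_factory=lambda: [[1.0, 0.0], [0.0, 1.0]])
    q: float = 0.01   # assumed process noise added to the diagonal
    r: float = 0.25   # assumed variance of an audio position reading

    def predict(self, dt):
        # x <- F x and P <- F P F^T + qI, with F = [[1, dt], [0, 1]].
        (a, b), (c, d) = self.P
        self.pos += dt * self.vel
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]

    def update(self, z):
        # Measurement model: z observes position only (H = [1, 0]).
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r                 # innovation variance
        k0, k1 = p00 / s, p10 / s        # Kalman gain
        y = z - self.pos                 # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]


def track_audio(readings, amplitude_threshold=0.5, dt=1.0):
    """readings: iterable of (position, amplitude) pairs, one per time step."""
    kf, estimates = None, []
    for position, amplitude in readings:
        loud = amplitude >= amplitude_threshold
        if kf is None:
            if loud:                     # first loud reading starts the a-track
                kf = ATrackKalman1D(pos=position)
                estimates.append(kf.pos)
            continue
        kf.predict(dt)
        if loud:                         # quiet readings are skipped
            kf.update(position)
        estimates.append(kf.pos)
    return kf, estimates
```

A two-dimensional ground plane would need the same filter per axis or a four-dimensional state; the one-dimensional form matches the location-versus-time plots used later in the slides.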
Mapping From Image Plane to Ground Plane
Learned parameters are used to transform tracks from the image plane to the ground plane
Estimate the projective transformation matrix H during a calibration phase
Precompute H for each PTZ setting of each camera
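Applying a precomputed H to a tracked image point can be sketched as follows. The lookup table `H_TABLE`, its keys, and its matrix values are hypothetical placeholders rather than calibration results from the paper; only the homogeneous-coordinate arithmetic is standard.

```python
def image_to_ground(H, u, v):
    """Map image point (u, v) to the ground plane with a 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        # The point lies on the line that maps to infinity.
        raise ValueError("point maps to infinity under H")
    return (x / w, y / w)


# Hypothetical table: one H per (camera id, PTZ setting), filled in during
# the calibration phase. The values here are placeholders.
H_TABLE = {
    (1, "wide"): [[0.01, 0.0, -1.0],
                  [0.0, 0.01, -2.0],
                  [0.0, 0.0, 1.0]],
}


def track_to_ground(camera, ptz_setting, image_track):
    """Transform a list of image-plane points using the precomputed H."""
    H = H_TABLE[(camera, ptz_setting)]
    return [image_to_ground(H, u, v) for u, v in image_track]
```

Keying the table by PTZ setting reflects the slide's point that H is precomputed per setting, so no calibration is needed when a camera is re-aimed.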
Camera Control
Goal: avoid ambiguity, or disambiguate, when tracks
  are created or deleted
  intersect
  merge
Set pan/tilt/zoom parameters
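One way to turn these goals into camera requests is sketched below. It is an illustration only, not the decision rule used in the paper: it compares a-track positions at two consecutive time steps, and it flags two tracks as "close" (a possible intersection or merge) when they fall within an assumed `merge_radius`. The function name and the event labels are hypothetical.

```python
from itertools import combinations
from math import dist


def points_of_interest(prev_tracks, curr_tracks, merge_radius=1.0):
    """Return (event, ground-plane point) pairs that may warrant video.

    prev_tracks, curr_tracks: dicts mapping a-track id -> (x, y) position
    at the previous and current time step.
    """
    events = []
    # Tracks that appear or vanish between the two steps.
    for tid in sorted(curr_tracks.keys() - prev_tracks.keys()):
        events.append(("created", curr_tracks[tid]))
    for tid in sorted(prev_tracks.keys() - curr_tracks.keys()):
        events.append(("deleted", prev_tracks[tid]))
    # Pairs of current tracks that are about to intersect or merge.
    for a, b in combinations(sorted(curr_tracks), 2):
        pa, pb = curr_tracks[a], curr_tracks[b]
        if dist(pa, pb) <= merge_radius:
            midpoint = ((pa[0] + pb[0]) / 2, (pa[1] + pb[1]) / 2)
            events.append(("close", midpoint))
    return events
```

Each returned point could then serve as the point of interest x for which pan/tilt/zoom parameters are set.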
Setting Camera Parameters
Heuristic algorithm
Cover the ground plane by regions R^i_l
  R^i_l is in the field of view of camera Cl
  Camera parameters: (P^i_l, T^i_l, Z^i_l)
The tracking algorithm specifies a point of interest x from the last known a-track
If no camera is on, find the R^i_l containing x
Reassign a camera and set its parameters if x approaches the boundary of the current R^i_l
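The heuristic above can be sketched as follows. The slides do not state the shape of the regions or how "approaches the boundary" is measured, so this sketch assumes axis-aligned rectangular regions and a fixed `boundary_margin`. `Region` and `choose_region` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One ground-plane region R^i_l with the PTZ setting that covers it."""
    camera: int      # l, the camera that sees this region
    index: int       # i, the region index for that camera
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    ptz: tuple       # (pan, tilt, zoom) for this region

    def margin(self, x):
        """Distance from x to the nearest edge; negative if x is outside."""
        px, py = x
        return min(px - self.xmin, self.xmax - px,
                   py - self.ymin, self.ymax - py)


def choose_region(regions, x, current=None, boundary_margin=0.5):
    """Pick the region (and hence camera and PTZ setting) for point x."""
    # Keep the current assignment while x is comfortably inside it.
    if current is not None and current.margin(x) >= boundary_margin:
        return current
    # Otherwise take the region in which x lies deepest, if any contains it.
    best = max(regions, key=lambda r: r.margin(x))
    return best if best.margin(x) >= 0 else None
```

Choosing the region where x lies deepest is one simple tie-breaker for overlapping regions; it delays the next reassignment as long as possible.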
Camera Control Based on Track Trajectories
[Figure: six plots of location (meters) against time (seconds), one per track event: Intersection, Separation, Merger, Sudden Appearance, Undetected Disappearance, and Sudden Disappearance. One plot carries the annotation "Switch to video".]
Creating Final Tracks Of All Targets
Bipartite graph matching over a set of color histograms
We collect features as the target enters and exits the scene in video.
For every new a-track, features are collected from a small set of frames.
The weight of an edge is the distance between the observed video features.
Additionally, constraints from the audio data are imposed on the edge weights.
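The matching step can be illustrated with a small sketch. Several choices here are assumptions that the slides do not specify: the L1 distance between normalized color histograms, equal numbers of exit and entry nodes, and an audio constraint expressed as a set of permitted (exit, entry) pairs. The search is brute force over permutations, which is only practical for a handful of nodes. All names are hypothetical.

```python
from itertools import permutations
from math import inf


def hist_distance(h1, h2):
    """L1 distance between two normalized color histograms (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def match_tracks(exit_feats, entry_feats, allowed=None):
    """Minimum-weight perfect matching between exit and entry nodes.

    exit_feats, entry_feats: dicts mapping node name -> histogram, with
    equal sizes. allowed: optional set of (exit, entry) pairs that the
    audio tracks permit; any other pairing gets infinite weight.
    """
    exits, entries = sorted(exit_feats), sorted(entry_feats)
    best_cost, best = inf, None
    for perm in permutations(entries):
        cost = 0.0
        for e, n in zip(exits, perm):
            if allowed is not None and (e, n) not in allowed:
                cost = inf   # pairing contradicts the audio data
                break
            cost += hist_distance(exit_feats[e], entry_feats[n])
        if cost < best_cost:
            best_cost, best = cost, dict(zip(exits, perm))
    return best, best_cost
```

For larger graphs, the permutation search would be replaced by a polynomial-time assignment solver such as the Hungarian algorithm; the edge weights and the audio constraint would stay the same.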
Creating Final Tracks Using Bipartite Matching
[Figure: plots of location (meters) against time (seconds) with alternating audio and video segments, and bipartite graphs over nodes a to g, each node carrying entry (+) and exit (-) features.]
Tracking in Audio and Video: three tracks are recovered by matching every node (entry and exit from the scene) where video was captured.
Tracking in Audio Only: two tracks are recovered. However, red and green show the wrong path. Audio cannot disambiguate the individual targets once the clusters have merged.
Bipartite Graph Matching, shown with and without the audio constraint.
Experimental Results
[Figure: Inter P-Track Distance at a Merge Event]
[Figure: Inter P-Track Distance at a Crossover Event]
Experimental Results (Cont.)
Conclusion
Goal: minimize camera usage in a surveillance system
  Save power, bandwidth, storage, and money
  Alleviate privacy concerns
Proposed a probabilistic scheme for opportunistically deploying cameras in a multimodal network.
Showed detailed experimental results on real data collected in multimodal networks.
The final set of tracks is computed by bipartite matching.
Acknowledgements
This work was supported by Aware Building: ONR-N00014-07-C-0311 and NSF CNS 0551719.
Bi Song and Amit Roy-Chowdhury were additionally supported by NSF-ECCS 0622176 and ARO-W911NF-07-1-0485.
Thank You.
Questions?
Jayanth Nayak 1
[email protected]
Luis Gonzalez-Argueta 2, Bi Song 2, Amit Roy-Chowdhury 2, Ertem Tuncel 2
{largueta,bsong,amitrc,ertem}@ee.ucr.edu