Multimedia Event Detection Task
Time            Presentation
9:20 – 9:40     Task Overview (NIST)
9:40 – 10:00    Access to Audiovisual Media (AXES)
10:00 – 10:20   SRI International; Sarnoff Corporation (Aurora)
10:20 – 10:40   Break in the NIST West Square Cafeteria
10:40 – 11:00   Kitware Inc (GENIE)
11:00 – 11:20   Tokyo Institute of Technology; Canon Corporation (TokyoTechCanon)
11:20 – 11:40   SRI International (SESAME)
11:40 – 12:10   Discussion
2012 TRECVID Workshop: Multimedia Event Detection Task
Jonathan Fiscus, National Institute of Standards and Technology (NIST)
Martial Michel, Systems Plus Inc.
2012 TRECVID Workshop November 26, 2012 (Gaithersburg, Maryland)
Talk Outline
• MED Task Overview
• HAVIC Data Resources
• The 2012 MED Results
• Summary and What's Next
MED Task Definition
Given an event specified by an event kit, search multimedia recordings for the event:
1. determine a hard decision confidence threshold prior to search time,
2. assign a confidence score to each clip in the collection,
3. measure Content Description build time, and
4. measure the Event Agent execution time.
An MED Event
• is a complex activity occurring at a specific place and time;
• involves people interacting with other people and/or objects;
• consists of a number of human actions, processes, and activities that are loosely or tightly organized and that have significant temporal and semantic relationships to the overarching activity;
• is directly observable.
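Steps 1 and 2 of the task definition can be sketched as follows. This is a minimal illustration, not the evaluation's actual tooling: the clip IDs, the scores, and the convention that a score at or above the threshold counts as a detection are all assumptions.

```python
from typing import Dict, Tuple


def hard_decisions(scores: Dict[str, float], threshold: float) -> Dict[str, Tuple[float, bool]]:
    """Pair each clip's confidence score with a yes/no detection decision.

    Assumption: a clip is declared a detection when score >= threshold.
    """
    return {clip: (score, score >= threshold) for clip, score in scores.items()}


# The threshold is fixed before the search runs (step 1).
threshold = 0.5

# Hypothetical clip IDs and confidence scores (step 2).
scores = {"clip_001": 0.91, "clip_002": 0.12, "clip_003": 0.50}

decisions = hard_decisions(scores, threshold)
for clip, (score, detected) in decisions.items():
    print(clip, score, "YES" if detected else "NO")
```

Fixing the threshold before scoring is what makes the decision "hard": it cannot be tuned afterwards on the search collection.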
Rock Climbing Event Kit Text
Definition: One or more people climb up or across rock formations or artificial rock walls.
Explication: Rock climbing is a physically intense activity, where the goal is to reach the top or endpoint of a pre-defined route on a rock formation or artificial rock wall by finding a grip on the surface using hands and feet, and then pulling up using their arm and leg strength. …
Evidential Description:
• scene: outdoors in natural setting, indoors in rock climbing gym, or outdoors on a specially …
• objects/people: carabiners, rope, helmet, harness, rock formation, artificial rock wall, climbers
• activities: hooking rope to harness, moving hands and feet along side of rock face, grabbing rock …
• audio: carabiners clinking, climbers making comments on the difficulty of the climb, onlookers cheering on …
Illustrative Examples
• Positive instances of the event
• Clips "Related" to the event
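One way to hold an event kit in memory is sketched below. The `EventKit` class, its field names, and the clip file name are hypothetical and are not the format distributed with the evaluation; the text values are copied from the slide, with its elisions left in place.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EventKit:
    """Hypothetical container mirroring the parts of an event kit shown above."""
    name: str
    definition: str
    explication: str
    evidential_description: Dict[str, str]
    positive_clips: List[str] = field(default_factory=list)  # positive instances
    related_clips: List[str] = field(default_factory=list)   # "related" clips


rock_climbing = EventKit(
    name="Rock climbing",
    definition="One or more people climb up or across rock formations "
               "or artificial rock walls.",
    explication="Rock climbing is a physically intense activity, …",
    evidential_description={
        "scene": "outdoors in natural setting, indoors in rock climbing gym, …",
        "objects/people": "carabiners, rope, helmet, harness, rock formation, "
                          "artificial rock wall, climbers",
        "activities": "hooking rope to harness, moving hands and feet along "
                      "side of rock face, grabbing rock …",
        "audio": "carabiners clinking, climbers making comments on the "
                 "difficulty of the climb, onlookers cheering on …",
    },
    positive_clips=["example_positive.mp4"],  # hypothetical file name
)
```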
Positive Rock Climbing Video Example
MED Evaluation Conditions
• MED Tasks
  – Pre-Specified Event (PS): MED metadata generation optimized with knowledge of events
  – Ad-Hoc Event (AH): MED metadata generation complete before events revealed
• Event Agent Generation (EAG) Processing Types
  – Automatic EAG: No human interaction to build the event agent
  – Semi-Automatic EAG: Human guidance of event agent building
• Events Processed
  – MEDFull: Processing 20 PS events, 5 AH events
  – MEDPart: Processing a subset of the events
• Event Training Condition
  – EKFull: Use the event kit text and all supplied positive, near_miss, and related exemplars
  – EK10Ex: Use a 10-positive and 10-related clip subset (20 total) of EKFull
• Required Condition
  – PS, EKFull
The TRECVID MED 2012 Events
Pre-Specified Events
MED '11 Events:
• Changing a vehicle tire
• Getting a vehicle unstuck
• Grooming an animal
• Making a sandwich
• Parkour
• Repairing an appliance
• Working on a sewing project
• Birthday party
• Flash mob gathering
• Parade
New Events:
• Attempting a bike trick
• Cleaning an appliance
• Dog show
• Giving directions to a location
• Marriage proposal
• Renovating a home
• Rock climbing
• Town hall meeting
• Winning a race without a vehicle
• Working on a metal crafts project
Ad Hoc Events
New Events:
• Doing homework or studying
• Hide and seek
• Hiking
• Installing flooring
• Writing text
17 MED 2012 Finishers and Number of Runs
[Table: Organization, Team, and number of Pre-Specified and Ad-Hoc runs (EK10Ex, EKFull) for each of the 17 teams]
* MED ‘11 Participants
Data Collection & Annotation
• Team of 50 data scouts at the Linguistic Data Consortium
  – In-person training, regular team meetings, work remotely
• Custom GUI to search web for appropriate videos, then annotate their properties
• Two guiding annotation principles, plus corollary
  – Sufficient Evidence Rule: Video must contain sufficient evidence to decide that an event has occurred
  – Reasonable Viewer Rule: If according to a reasonable interpretation of the video the event must have occurred, then the clip is a positive instance of that event
  – Corollary: Not necessary for full process to be shown
• Scouts encouraged to seek out interesting, varied clips
Annotation and Preparation of Candidate Videos
• For each candidate video, scouts are required to
  – Watch clip in its entirety
  – Determine and verify the download URL
  – Screen for sensitive PII, objectionable content
• Collection strategies
  – Event specific: label with event status (positive, near miss, background)
  – Background clips: collected without regard to an event
• Downloaded videos processed to standardize data format and encoding
  – MPEG-4, H.264 video encoding, AAC audio encoding
  – Original video resolution and audio/video bitrates retained
HAVIC Data Resources
Collection                                       Video clips   Video duration
MED '12 Training
  MED '10                                              3,468        114 hours
  MED '11 DEV                                         10,403        324 hours
  MED '11 Eval                                        32,061        991 hours
  Transcription                                        1,498         45 hours
Progress Test Collection (used for MED '12-15)        98,117      3,722 hours
Total                                                144,049      5,151 hours
Decision Error Tradeoff (DET) Curves: ProbMiss vs. ProbFA
• Curve points, one per decision score $i$: $\left(P_{FA}(\mathrm{DecisionScore}_i),\ P_{Miss}(\mathrm{DecisionScore}_i)\right)$
• The Target Error Ratio (TER) Line: constant PMiss/PFA ratio (12.5)
• Analysis Point: (PFA, PMiss) at the Actual Detection Threshold
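The quantities on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the official scoring software: it assumes the usual definitions (PMiss is the fraction of target clips scored below the threshold, PFA is the fraction of non-target clips scored at or above it), and the scores and labels are made up.

```python
from typing import List, Tuple


def miss_fa_rates(scores: List[float], labels: List[bool], threshold: float) -> Tuple[float, float]:
    """Return (PMiss, PFA) at one threshold; labels are True for target clips."""
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    misses = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
    false_alarms = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    return misses / n_target, false_alarms / n_nontarget


def det_points(scores: List[float], labels: List[bool]) -> List[Tuple[float, float]]:
    """One (PFA, PMiss) point per distinct decision score used as the threshold."""
    points = []
    for t in sorted(set(scores)):
        p_miss, p_fa = miss_fa_rates(scores, labels, t)
        points.append((p_fa, p_miss))
    return points


# Hypothetical scores and ground truth: 3 target clips, 5 non-target clips.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
labels = [True, True, True, False, False, False, False, False]

# Analysis point at an (assumed) actual decision threshold of 0.5.
p_miss, p_fa = miss_fa_rates(scores, labels, threshold=0.5)

# Ratio to compare against the Target Error Ratio of 12.5 from the slide.
ratio = p_miss / p_fa if p_fa > 0 else float("inf")
curve = det_points(scores, labels)
```

A system whose analysis point sits on the TER line has a PMiss/PFA ratio of exactly 12.5; the ratio computed here shows which side of the line a given operating point falls on.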
Primary, Pre-Specified Event Systems: Event-Averaged PMiss and PFA at Actual Decision Threshold
[Figure: bar chart of EvAvg-PFA and EvAvg-PMiss (probability, 0 to 1) per system, sorted by PMiss. Systems in chart order: CMU, Sesame, BBNVISER, SRIAURORA, MediaMill, ECNU, Genie, TokyoTechCanon, IBMCU, AXES, DCU-iAD-CLARITY, UEC, OPU, VIREO, NII, NTT-NII, CERTH-ITI]
Primary, Ad-Hoc Event Systems: Event-Averaged PMiss and PFA at Actual Decision Threshold
[Figure: bar chart of EvAvg-PFA and EvAvg-PMiss (probability, 0 to 0.8) per system, sorted by PMiss. Systems in chart order: Sesame, CMU, BBNVISER, TokyoTechCanon, MediaMill, SRIAURORA, Genie, AXES, NTT-NII, DCU-iAD-CLARITY, IBMCU, UEC, OPU]
Pre-Specified Event Systems, EKFull vs. EK10Ex: Event-Averaged PMiss and PFA at Actual Decision Threshold
Metadata Generation Speed: Primary Pre-Specified Event Systems
[Figure panels: Realtime Factors; Single Core Realtime Factors]
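The slide does not define its real-time factors, so the sketch below rests on two assumptions: a real-time factor is processing time divided by the duration of the video processed, and the single-core figure scales that by the number of cores used. The processing time and core count are hypothetical; only the 3,722 hours of video comes from the talk.

```python
def realtime_factor(processing_hours: float, video_hours: float) -> float:
    """Assumed definition: hours of processing per hour of video (xRT)."""
    if video_hours <= 0:
        raise ValueError("video_hours must be positive")
    return processing_hours / video_hours


def single_core_realtime_factor(processing_hours: float, video_hours: float, cores: int) -> float:
    """Assumed core normalization: scale the real-time factor by the core count."""
    return realtime_factor(processing_hours, video_hours) * cores


VIDEO_HOURS = 3722.0       # size of the Progress Test Collection, from the talk
PROCESSING_HOURS = 1861.0  # hypothetical wall-clock time
CORES = 16                 # hypothetical core count

rt = realtime_factor(PROCESSING_HOURS, VIDEO_HOURS)
single_core_rt = single_core_realtime_factor(PROCESSING_HOURS, VIDEO_HOURS, CORES)
```

Scaling by core count ignores parallel efficiency, memory bandwidth, and differences between CPUs, so figures normalized this way are only approximate.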
Event Agent Execution Speed: Pre-Specified Event Systems
[Figure: Execution Real Time Factor (XRT) per system]
Summary
• 17 participants processed 3,722 hours of video searching for 20 Pre-Specified events
  – Reporting results on the Progress Set while protecting the data set's properties was challenging
    • Reporting threshold performance would give too much insight into the data set
  – Core-normalized runtimes should be viewed as approximate
• 13 participants participated in the Ad-Hoc Event Pilot
  – Encouraging results for the Ad-Hoc Events
• What's next?
  – The Progress set should not be accessible until next summer
    • Remove LDC disk drive from your system
    • Either: delete video and models, turn off disk drive(s), or remove drive(s)
  – Proposed task changes (NIST will present during the discussion period)
  – Next evaluation cycle will:
    • Retest the 25 MED '12 events
    • Add 10 new Pre-Specified events (Spring)
    • Add 10 new Ad-Hoc events (Fall)
Questions?
Backup Slides
Proposed 2013 MED Task Changes
• Discontinue event kit texts
• Define Negative clips as a unique set of file IDs
• Test MED systems on event exemplars
• Combine MER and MED into a single system output per clip