Transcript of ViPER Video Performance Evaluation Toolkit (viper-toolkit.sf.net).

Page 1:

ViPER: Video Performance Evaluation Toolkit

viper-toolkit.sf.net

Page 2:

Performance Evaluation

• Ideal:
  – Fully automated
  – Repeatable
  – Can be used to compare results without access to the product
  – Predictive validity
  – Useful for the task
  – General enough to cover any task

Page 3:

Reality

• Cannot fully automate for most domains.
  – Subjective
  – Objective
• Results from subjective studies cannot be easily extended, if at all.
• Ground truth is hard to gather and lossy, and evaluation metrics are hard to formulate.
• It is often difficult to determine what is really being measured.

Page 4:

The ViPER Toolkit

• Unified video performance evaluation resource, including:
  – ViPER-GT: a Java toolkit for marking up videos with truth data.
  – ViPER-PE: a command-line tool for comparing truth data to result data.
  – A set of scripts for running several sets of results with different options and generating graphs.

Page 5:

The Video Performance Evaluation Resource

[Diagram: the Ground Truth Editor produces Truth Data; several Video Analysis Algorithms each produce Result Data; the Performance Evaluation Tool combines Truth Data and Result Data under a Schema Mapping, applying Metrics and Filters to produce Evaluation Results.]

Page 6:

ViPER (Evaluator View)

[Diagram: Performance Evaluation Scripts drive the Performance Evaluation Tool over Truth Data and several sets of Result Data, applying a Schema Mapping per result set and templates for Metrics, Parameters, and Filters, producing one set of Evaluation Results per run.]

Page 7:

ViPER (Developer View)

[Diagram: the developer's view of the evaluation pipeline, centered on producing Result Data.]

Page 8:

ViPER File Format

• Represents data as a set of descriptors, which the user defines in a schema.
  – Each descriptor has a set of attributes, which may take on different values over the file.
  – Like a temporally qualified relational database for each media file, where each row is an instance of a descriptor.

Page 9:

ViPER File Format: Descriptors

• Descriptor types (a data-model sketch follows this list):
  – FILE (video-level information)
  – CONTENT (descriptors of the scene)
    • Static attribute values
    • Single instance of one type for any frame
  – OBJECT (descriptors of instances, including events)
    • Attributes are dynamic by default
    • Multiple instances can exist at a single frame
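A rough sketch of this data model in illustrative Python (the class and field names here are assumptions for exposition, not the actual ViPER API):

    from dataclasses import dataclass, field
    from enum import Enum

    class DescriptorType(Enum):
        FILE = "FILE"        # video-level information
        CONTENT = "CONTENT"  # scene-level; static attributes, one instance per frame
        OBJECT = "OBJECT"    # instance-level; dynamic attributes, many per frame

    @dataclass
    class Attribute:
        name: str
        # A dynamic attribute maps inclusive framespans to values; a static
        # attribute would hold one value for the descriptor's whole span.
        values_by_span: dict = field(default_factory=dict)  # {(first, last): value}

    @dataclass
    class Descriptor:
        dtype: DescriptorType
        name: str         # schema name, e.g. "Text" or "Shot-Change"
        framespan: tuple  # (first_frame, last_frame) the instance covers
        attributes: list = field(default_factory=list)

    # One "row" of the temporally qualified table: a Text object on frames
    # 10-80 whose (hypothetical) POSITION attribute changes value midway.
    text = Descriptor(DescriptorType.OBJECT, "Text", (10, 80),
                      [Attribute("POSITION", {(10, 40): (0, 0, 50, 20),
                                              (41, 80): (5, 5, 50, 20)})])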

Page 10:

Attributes

• Attribute types:
  – Strings, numbers, booleans, and enumerations
  – Shape types, including bounding boxes and polygons
  – Relations (foreign keys)

Page 11:

ViPER: Ground Truth Editing

viper-toolkit.sf.net

Page 12:

Ground Truth Editing

Page 13:

The Difficulty of Authoring Ground Truth

• Ground truth is tedious and time-consuming to edit.
• Ground truth is lossy.

Page 14:

A Generic Video Annotation Tool

• Lets the user specify the task and the interpretation.

• Provides a

Page 15:

Competition

• VideoAnnEx
  – IBM AlphaWorks MPEG-7 editor
• OntoLog (OWL)
  – Jon Heggland's RDF video ontology editor
• Informedia
  – CMU Digital Video Library
• PhotoStuff
  – Still-image annotation for the semantic web

Page 16:

Time Line View

• Provides summary of ground truth.
• Direct manipulation across frames.
• Feedback for indirect manipulation.

Page 17:

Time Line View

• Provides summary of ground truth.
• Direct manipulation.
  – Quick editing of activities, events, and other CONTENT descriptors.
  – Some ability to directly modify descriptors with dynamic attributes, if not the attribute values themselves.
• Feedback for indirect manipulation.
  – Makes massive changes easier to notice.

Page 18:

Enhanced Keyboard Editing

• Support for real-time mark-up of events and activities.
  – Keys for creating and deleting activities.
  – Keys for controlling the rate of display (jog dials).
• Enhanced mark-up of spatial data.
  – Keys for creating and editing a single descriptor's attributes.
• Overall attempt to minimize effort in a GOMS model.
  – Mouse events unnecessary except for polygon editing.

Page 19:

Enhanced Keyboard Editing: Real-time Example

• The user assigns keys to three content types; each key toggles its content type between off and on states.
• Forward/back keys decelerate/accelerate video playback, and may skip frames, rewind, etc.
  – In paused mode, space advances to the next frame.
• A USB jog dial might be useful.

Page 20:

Enhanced Keyboard Editing: Spatial Example

• Mode selection:
  – Control-d cycles through descriptor types.
  – Control-a cycles through attribute types.
  – Control-s cycles through available descriptors.
• Editing:
  – Control-n creates a new descriptor of the given descriptor type.
  – Control-f creates a new attribute of the given type if none exists.
  – Arrow keys move; arrow+modifier resizes.

Page 21:

Frame View

Page 22:

Schema Editor

Page 23:

ViPER-GT Internals

ViPER-GT: A Video Ground Truth Annotation Tool

[Architecture diagram: ViPER-GT is built on a Core GT API and the ViPER Metadata API (backed by Jena), with an AppLoader/plug-in manager hosting plug-ins, the Schema Editor, a pure Java MPEG decoder, and native decoders (VirtualDub, QuickTime) alongside JMF.]

Page 24:

Latest Version in Series

Page 25:

Latest Version in Series

• Schema editor.
• Timeline view.
• Supports undo/redo.
• New video annotation widget.

Page 26:

GTF Inputter (Original V-GT)

Page 27:

ViPER: Performance Evaluation

viper-toolkit.sf.net

Page 28:

Page 29:

PE Methodology

• Ground truth and results are represented by a set of descriptive records.
  – Target: an object or content record delineated temporally in the ground truth, along with a set of attributes (possibly spatial).
  – Candidate: an object or content record delineated temporally in the results, along with a set of attributes (possibly spatial).
• Requirements:
  – Matching records which are close enough to satisfy a given set of constraints on:
    • Temporal range
    • Spatial location of the object
    • Values of attributes in a datatype-specific parameter space

Page 30:

Detection and Localization

Detection: whether a target object or content record is properly identified.

Localization: how well the target is located.

• Simplest level:
  – A target is detected if its temporal range overlaps the temporal range of a single candidate (see the sketch after this list).
• Qualifiers and localization constraints:
  – Temporal overlap must meet a certain tolerance (% or #).
  – Spatial attributes must overlap within a tolerance on a frame-by-frame basis.
  – Non-spatial attributes must be within a given tolerance.

Page 31:

Temporal Localization

• Metrics (worked sketch below):
  – Overlap coefficient
    • # or % of target frames detected
  – Dice coefficient
    • # or % in common
    • Similarity measure
  – Extent coefficient
    • Deviation in the endpoint locations of the ranges

[Diagram: a target range compared against several candidate ranges.]
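With T the set of target frames and C the set of candidate frames, the overlap coefficient is |T ∩ C| / |T| and the Dice coefficient is 2|T ∩ C| / (|T| + |C|). A worked sketch (the extent formulation here is one plausible reading of "deviation in the endpoint locations"):

    def overlap_coefficient(target, candidate):
        # Fraction of target frames that the candidate detects.
        return len(target & candidate) / len(target)

    def dice_coefficient(target, candidate):
        # Symmetric similarity: 2|T ∩ C| / (|T| + |C|).
        return 2 * len(target & candidate) / (len(target) + len(candidate))

    def extent_deviation(target_span, candidate_span):
        # Total deviation, in frames, between corresponding range endpoints.
        (t0, t1), (c0, c1) = target_span, candidate_span
        return abs(t0 - c0) + abs(t1 - c1)

    t, c = set(range(10, 30)), set(range(20, 40))   # frames 10-29 vs. 20-39
    print(overlap_coefficient(t, c))                # 0.5: half the target is found
    print(dice_coefficient(t, c))                   # 0.5
    print(extent_deviation((10, 29), (20, 39)))     # 20 frames in total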

Page 32:

Attribute Localization

• Each datatype has its own metric (a bounding-box sketch follows the diagram):
  – Svalue: edit distance
  – Point: Euclidean distance
  – Bboxes, oboxes, circles: overlap and Dice coefficients
  – Bvalue, lvalue: exact match [0,1]
  – Remainder: absolute difference
• Object correspondence:
  – Optimal subset
• Temporal constraints:
  – Frame-by-frame tolerance
  – Virtual candidate

[Diagram: target and candidate object tracks.]
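For the box types, overlap and Dice are computed over areas rather than frame counts. A sketch for axis-aligned bounding boxes (oriented boxes and circles, which ViPER also handles, are omitted):

    def area(box):
        x, y, w, h = box
        return w * h

    def intersection_area(a, b):
        # Overlapping area of two axis-aligned (x, y, width, height) boxes.
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        w = min(ax + aw, bx + bw) - max(ax, bx)
        h = min(ay + ah, by + bh) - max(ay, by)
        return w * h if w > 0 and h > 0 else 0

    def box_dice(a, b):
        # Dice coefficient over areas: 2|A ∩ B| / (|A| + |B|).
        return 2 * intersection_area(a, b) / (area(a) + area(b))

    print(box_dice((0, 0, 10, 10), (5, 0, 10, 10)))  # 0.5: the boxes half-overlap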

Page 33:

Reporting Metrics

• Detections:
  – List of correct, missed, and false detections
  – Summary of absolute detection scores as a percentage
  – Summary of overall precision and recall (see the sketch below)
• Localization:
  – Optimal subset of matching frames
  – Frame-by-frame tolerance
  – Mean, median, SD, and maximum values reported
• Issues:
  – Many-to-one, many-to-many
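From those detection counts, the overall figures follow the standard definitions of precision and recall (a sketch with made-up counts):

    def precision_recall(correct, missed, false_detections):
        precision = correct / (correct + false_detections)  # fraction of output that is right
        recall = correct / (correct + missed)               # fraction of truth that is found
        return precision, recall

    print(precision_recall(correct=40, missed=10, false_detections=20))
    # (0.6666..., 0.8)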

Page 34:

Evaluation Using “Gtfc”

• Used to provide basic evaluation mechanisms.
  – Requires configuration:
    • Equivalence classes
    • Evaluation specification
  – Reports:
    • Attribute- and descriptor-level recall and precision

Page 35:

Evaluation Configuration

    #BEGIN_EQUIVALENCE
    DISSOLVE : FADE-IN FADE-OUT TRANSLATE
    #END_EQUIVALENCE

    #BEGIN_EVALUATION_LIST
    CONTENT Shot-Change
        TYPE: CUT FADE-IN FADE-OUT
    OBJECT Text
        TYPE: FULL OVERLAY SCENE
        *POSITION *CONTENT *MOTION
    #END_EVALUATION_LIST

Set up Equivalencies
Evaluate subsets of GT
Allow Selected Performance Measures

Page 36:

Video Evaluation

• Providing metrics to judge the correctness of:
  – Values of attributes
  – Ranges of frames (temporal)
  – Detection and localization of objects (spatial)
  – Moving objects (spatio-temporal)
• Degree of correctness related to:
  – Similarity of, or distance between, descriptors
  – Cost of transformation between result and ideal data
• Performance metrics reported as:
  – % of correct/incorrect instances

Page 37: (repeats Page 29.)

Page 38: (repeats Page 30.)

Page 39: (repeats Page 31.)

Page 40: (repeats Page 32.)

Page 41:

Metric and Tolerance Specification

• Specification in the evaluation parameter file:

    descriptor-type descriptor-name [METRIC TOLERANCE]
        attribute1 [METRIC TOLERANCE]
        attribute2 [METRIC TOLERANCE]
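For concreteness, a hypothetical instance of this grammar using metric names from the earlier slides (the exact keywords a real parameter file accepts may differ):

    OBJECT Text [dice 0.8]
        POSITION [overlap 0.5]
        CONTENT [dice 0.9]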

Page 42:

Match Scenarios

[Diagram: match scenarios between target and candidate ranges: a false detection, a missed detection, and two correct detections.]

Page 43:

Error Graphs

Page 44:

Localization Graphs

Page 45:

Enhanced Don't Care Example

• In an activity detection, certain segments are often more important than others:– The moment someone enters or exits the

scene.– The moment a thief grabs a bag.

• These segments might be marked up explicitly as part of an activity descriptor, and treated as important during the evaluation.

Page 46:

Enhanced Don't Care Regions

• For object evaluation, Don't Care currently applies only to entire descriptors.
  – It needs to apply to dynamic attributes at a per-frame level, as it does for framewise evaluations.
• Enhanced rules are needed for computing don't-care regions spatially and temporally, for example (see the sketch after this list):
  – The region of the body that is not part of the torso or head is unimportant.
  – Frames before this event are unimportant.
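A sketch of how per-frame don't-care marking could feed a framewise score, along the lines of the "frames before this event" example above (illustrative only, not ViPER-PE's implementation):

    def framewise_recall(target, candidate, dont_care):
        # Score only the frames that matter: don't-care frames are removed
        # before counting hits and misses.
        scored = target - dont_care
        return len(scored & candidate) / len(scored) if scored else 1.0

    truth = set(range(0, 100))     # activity spans frames 0-99 in ground truth
    output = set(range(20, 100))   # detector misses the first 20 frames
    dont_care = set(range(0, 20))  # frames before the key moment: unimportant
    print(framewise_recall(truth, output, dont_care))  # 1.0: the miss was don't-care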

Page 47:

Scripting ViPER

• RunEvaluation
  – Runs sets of comparisons with different input parameters.

Page 48: