www.petamedia.eu
IRP of Special Interest Group 2 - Leader: TU Berlin
Tools for Tag Generation
Introduction
The aim of this integrative research project (IRP) is the generation of tags and metadata using signal processing and/or users' annotations. This IRP deals with algorithms for key frame extraction and video shot clustering that enable users to tag videos more easily.
[Workflow diagram: Database → Annotation → Shot/subshot boundary detection → Visual quality → Key framing → Text detection / feature extraction → Video clustering; partners involved: TUB, TUD, EPFL, QMUL]
Integration activities
The IRP's main topic is the integration of different expertise in the areas of image search engines, audio/video signal processing, machine learning and text detection/recognition.
Preparations
The first activity was to set up a database of videos from the unstructured channel "Travel" on YouTube.com using the "NUE YouTube Downloader". This tool is also useful for other IRPs, e.g. "Social Media Acquisition".
Tool used for setting up the database
This common database, consisting of 100 videos and affiliated metadata (keywords, comments, user information etc.), was then annotated for shot boundaries.
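The shot-boundary annotation step above can also be approximated automatically. A minimal sketch, assuming frames arrive as flat lists of grayscale pixel values and comparing normalized intensity histograms of consecutive frames (all names and the threshold are illustrative, not the project's actual detector):

```python
def histogram(frame, bins=8, levels=256):
    """Build a normalized intensity histogram for a frame (flat list of pixels)."""
    counts = [0] * bins
    for px in frame:
        counts[px * bins // levels] += 1
    total = len(frame)
    return [c / total for c in counts]

def shot_boundaries(frames, threshold=0.5):
    """Flag a shot boundary wherever the L1 distance between consecutive
    frame histograms exceeds the threshold."""
    hists = [histogram(f) for f in frames]
    boundaries = []
    for i in range(1, len(hists)):
        dist = sum(abs(a - b) for a, b in zip(hists[i - 1], hists[i]))
        if dist > threshold:
            boundaries.append(i)  # new shot starts at frame i
    return boundaries

# Two dark frames followed by two bright frames -> one cut at index 2.
print(shot_boundaries([[10] * 100, [10] * 100, [240] * 100, [240] * 100]))  # [2]
```

Real detectors also handle gradual transitions (fades, dissolves), which a single-step threshold like this misses.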
Key Frame Extraction
Temporal video segmentation divides the video stream into a set of segments, from each of which one representative frame is extracted based on attention features. Key frame extraction methods are a simple yet effective form of summarizing a long video sequence and can be used for applications that only work on images, such as search engines (CBIR) or image clustering algorithms. These key frames can also be used for automatic or manual tagging, because they facilitate users' annotation.
Extracted key frames of a video sequence
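One common selection rule picks, per segment, the frame closest to the segment's mean feature vector. A sketch under that assumption (simple histogram features and L1 distance stand in for the attention features mentioned above):

```python
def key_frame(segment_hists):
    """Return the index of the frame whose feature vector is closest
    (L1 distance) to the segment's mean vector."""
    n = len(segment_hists)
    dims = len(segment_hists[0])
    mean = [sum(h[d] for h in segment_hists) / n for d in range(dims)]

    def dist(h):
        return sum(abs(a - b) for a, b in zip(h, mean))

    return min(range(n), key=lambda i: dist(segment_hists[i]))

# The middle frame sits closest to the segment average.
print(key_frame([[1.0, 0.0], [0.5, 0.5], [0.4, 0.6]]))  # 1
```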
Tag Generation
A topic of this IRP is the generation of tags. The following aspects have been considered:
• Quality Tags: Key frames are used to produce quality tags by no-reference video quality assessment.
Tags derived by no-reference video quality assessment (e.g. "good quality")
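A no-reference quality tag could, for instance, be derived from a crude sharpness proxy. This sketch uses the mean absolute horizontal gradient; the metric and threshold are illustrative stand-ins, not the project's actual assessment method:

```python
def sharpness(image):
    """Mean absolute horizontal gradient over a 2-D grayscale image,
    used as a crude no-reference sharpness proxy (blurry images score low)."""
    total, count = 0, 0
    for row in image:
        for x in range(1, len(row)):
            total += abs(row[x] - row[x - 1])
            count += 1
    return total / count

def quality_tag(image, threshold=10):
    """Turn the sharpness score into a coarse quality tag
    (the threshold is an arbitrary illustrative value)."""
    return "good quality" if sharpness(image) >= threshold else "low quality"

print(quality_tag([[0, 255, 0, 255]]))   # high-contrast edges -> "good quality"
print(quality_tag([[100, 102, 104]]))    # smooth gradient -> "low quality"
```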
• Tags derived from Text:
Recognizing text within video sequences is also a possibility for generating tags.
Another possibility to produce tags is to identify persons and locations by analyzing the sentence structure of affiliated descriptions and user comments.
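A toy version of this idea, assuming a simple capitalization heuristic rather than full sentence-structure analysis: runs of capitalized words that do not start a sentence become candidate person/location tags.

```python
import re

def candidate_tags(text):
    """Collect runs of mid-sentence capitalized words as candidate
    person/location tags -- a rough proper-noun heuristic."""
    tags = set()
    for sentence in re.split(r"[.!?]+\s*", text):
        words = [w.strip(",;:()") for w in sentence.split()]
        run = []
        for i, w in enumerate(words):
            if i > 0 and w[:1].isupper():
                run.append(w)          # extend the current proper-noun run
            else:
                if run:
                    tags.add(" ".join(run))
                run = []
        if run:
            tags.add(" ".join(run))
    return tags

print(candidate_tags("We visited the Taj Mahal in Agra."))
# {'Taj Mahal', 'Agra'}
```

Real systems would use a named-entity recognizer instead; this heuristic misfires on mid-sentence non-names and misses sentence-initial ones.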
• Tags generated by concept detectors (e.g. indoor/outdoor, face)
Clustering
A fundamental step in this video summarization is to create a similarity matrix and organize key frames into a tree structure using the ant-tree clustering method.
Tree structuring of video frames by ant-tree clustering
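The ant-tree method itself uses ant-inspired attach/drop rules; a heavily simplified stand-in, which just attaches each key frame to its most similar node already in the tree (falling back to the root below a similarity threshold), could be sketched as:

```python
def similarity(a, b):
    """Similarity in [0, 1] from the L1 distance between normalized
    feature vectors (each vector sums to 1, so the distance is at most 2)."""
    return 1 - sum(abs(x - y) for x, y in zip(a, b)) / 2

def build_tree(features, threshold=0.7):
    """Attach each key frame to its most similar existing tree node if
    the similarity passes the threshold, else to the root.  A simplified
    stand-in for ant-tree clustering; frame 0 serves as the root."""
    parent = {0: None}  # node index -> parent index
    for i in range(1, len(features)):
        best = max(parent, key=lambda j: similarity(features[i], features[j]))
        parent[i] = best if similarity(features[i], features[best]) >= threshold else 0
    return parent

# Frame 1 is near frame 0; frame 2 resembles nothing and falls back to the root.
print(build_tree([[1, 0], [0.9, 0.1], [0, 1]]))  # {0: None, 1: 0, 2: 0}
```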
Low-level features and tags of key frames are clustered to find related video content, and visualized using the FastMap algorithm applied to distance matrices. The propagation of widely shared tags within compact clusters is to be studied.
Clustered key frames to perform similarity search
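Propagating widely shared tags within a compact cluster could be sketched as a majority vote; the cluster format (frame id mapped to an existing tag or None) is an assumption for illustration:

```python
from collections import Counter

def propagate_tags(clusters):
    """Within each cluster, assign the most common existing tag to
    untagged key frames.  Clusters are dicts of {frame_id: tag or None}."""
    result = {}
    for cluster in clusters:
        counts = Counter(t for t in cluster.values() if t)
        majority = counts.most_common(1)[0][0] if counts else None
        result.update({f: t or majority for f, t in cluster.items()})
    return result

clusters = [{"a": "beach", "b": "beach", "c": None},
            {"d": "temple", "e": None}]
print(propagate_tags(clusters))
# {'a': 'beach', 'b': 'beach', 'c': 'beach', 'd': 'temple', 'e': 'temple'}
```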
Future Work
• Automatic ROI Image Tagging
A solution for automatic image tagging can be achieved by object duplicate detection in static images or key frames. The goal of detection is to propagate tags from annotated objects in a training set.
Tag propagation by object duplicate detection (from a user-annotated object, e.g. "Taj Mahal", to automatic image tagging)
• Subject Classification
Automatic subject tagging of video involves the assignment of a subject label to a video object or to a time point within a video object. The subject label reflects the semantic theme treated by the video; it reflects what the video is about rather than what is depicted in the visual channel.
• Semantic Key Frame Extraction
Semantic key frame extraction is the task of selecting one or more key frames to represent the intellectual content of a video or a given segment of a video stream.
• Visual Reranking to improve Video Retrieval
Low-level visual features will be exploited for improving semantic-theme-based retrieval of videos indexed using speech recognition transcripts of their spoken content.
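One simple reranking scheme blends the transcript-based retrieval score with visual similarity to the top text hit and re-sorts; the weighting, score ranges, and input format here are illustrative assumptions, not the planned method:

```python
def rerank(text_scores, visual_sims, alpha=0.7):
    """Blend speech-transcript retrieval scores with low-level visual
    similarity, then re-sort.  `text_scores` maps video id -> text score,
    `visual_sims` maps video id -> visual similarity to the top text hit."""
    blended = {v: alpha * s + (1 - alpha) * visual_sims.get(v, 0.0)
               for v, s in text_scores.items()}
    return sorted(blended, key=blended.get, reverse=True)

# v3's weak transcript match is rescued by strong visual similarity.
print(rerank({"v1": 0.9, "v2": 0.5, "v3": 0.45},
             {"v1": 1.0, "v2": 0.0, "v3": 0.9}))  # ['v1', 'v3', 'v2']
```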
Contact
Coordination: Pascal Kelm
Web: www.petamedia.eu
Email: {[email protected]}