Dipl.-Inf. Jörg Waitelonis
Hasso-Plattner-Institut for IT-Systems Engineering
University of Potsdam
Semantic Analysis and Search
Semantic Search Engine
Media Analysis
‣Structural Video Analysis
‣Intelligent Character Recognition
‣Face Detection & Clustering
‣Audio Mining
‣Visual Concept Detection
Semantic Analysis
‣Named Entity Recognition
‣Context Analysis
‣Semantic Annotation
Conceptual Workflow
Graphical User Interface
‣Faceted Search
‣Explorative Search
‣Fine-grained User Annotation
Distribution / Production
‣Media Asset Management
Digitization | Metadata | Rights
Why Semantics, of All Things?
Jaguar
Resolving ambiguities by considering context
Natural language is incredibly expressive AND ambiguous.
„Armstrong betrat als erster Mensch den Mond.“ ("Armstrong was the first human to set foot on the Moon.")
‣Context in the text
‣e.g., from ASR or OCR
‣Context in the image
‣e.g., from Visual Concept Detection
Context is what matters.
Named Entity Recognition
„Armstrong betrat als erster Mensch den Mond.“ ("Armstrong was the first human to set foot on the Moon.")
Armstrong Mensch Mond
George Armstrong Custer
Neil Armstrong
The Armstrong Twins
Armstrong, Florida
Armstrong, Ontario
Armstrong Automobile
Joe Armstrong
Armstrong County, Texas
Armstrong Gun
Craig Armstrong
Armstrong (lunar crater)
Louis Armstrong
Armstrong Tunnel
Louis Armstrong International Airport
Armstrong‘s Theorem
Sir Thomas Armstrong
Ian Armstrong
Human
Bill Mensch
Bob Mensch
David Mensch
Homer Mensch
Louise Mensch
Halber Mensch
Mensch ärgere Dich nicht
Mensch Computer
Peter van Mensch
Daniel Mensch
Mensch (album)
Der Mond (opera)
MOND
Mond Nickel Company
Brunner Mond
Bernard Mond
Peter Mond
Julian Mond
Ludwig Mond
Violet Mond
MOND Technologies
Robert Mond
Henry Mond
Alfred Mond
Chava Mond
Named Entity Recognition
Wikipedia infoboxes
The Semantic Wikipedia
Named Entity Recognition
Web of Data
Diagram: the entity Neil Armstrong is connected to the classes Astronaut, Science Occupation, Employment and Person through "is a" and "has a" relations.
Named Entity Recognition
„Armstrong betrat als erster Mensch den Mond.“ ("Armstrong was the first human to set foot on the Moon.")
Armstrong Mensch Mond
George Armstrong Custer
Neil Armstrong
Armstrong, Florida
Armstrong, Ontario
Armstrong Gun
Craig Armstrong
Armstrong (lunar crater)
Louis Armstrong
Sir Thomas Armstrong
Human
Bob Mensch
David Mensch
Homer Mensch
Louise Mensch
Halber Mensch
Mensch ärgere Dich nicht
Mensch Computer
Mensch (album)
Der Mond (opera)
Mond (Earth's moon)
Mond Nickel Company
Brunner Mond
Bernard Mond
Peter Mond
Julian Mond
Ludwig Mond
Henry Mond
Alfred Mond
Chava Mond
Time-Dependent Semantic Data
Diagram: along the video's time axis, Video Analysis / Metadata Extraction yields metadata; Entity Recognition and Entity Mapping relate it to external data, e.g., bibliographical, geographical, and encyclopedic data.
Context Definition
RDF graph to find relations between entities co-occurring in a text, maintaining the hypothesis that disambiguation of co-occurring elements in a text can be obtained by finding connected elements in an RDF graph [7]. To account for the special mix of non-textual data and static and user-generated metadata in audio-visual content, our novel approach combines the use of semantic technologies and Linked Data with linguistic methods.
III. METHOD
According to a study about the structure and characteristics of folksonomy tags [8], an average of 83% of user-generated tags are single terms. Also, an average of 82% of the reviewed tags are nouns. Based on these study results, we ignore tag practices such as camel case ("barackObama") and treat tags as subjects or categories describing a resource. As a tag could also be part of a group of nouns representing an entity or a name ("flying machine", "albert einstein"), the tags, which are stored as single words without any given order, have to be combined into term groups of two or more terms to find all appropriate entities. Hence, every tag or group of tags within a given context may represent a distinct entity. The term combination process and the subsequent mapping of terms and term groups to entities are described in sect. III-B.
To disambiguate ambiguous terms we combine two methods: a co-occurrence analysis of the context terms in Wikipedia articles and an analysis of the page link graph of the Wikipedia articles of the entity candidates. The scores of both analysis steps are combined into a total score.
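How the two analysis steps might be combined can be sketched as follows. This is a minimal illustration, not the authors' implementation: the source states only that both scores are combined into a total score, so the weighted sum, the weights `w_cooc` and `w_link`, both scoring functions, and all toy data below are assumptions made for the example.

```python
from typing import Dict, Set


def cooccurrence_score(candidate: str, context_terms: Set[str],
                       article_terms: Dict[str, Set[str]]) -> float:
    """Fraction of context terms found in the candidate's article (assumed measure)."""
    if not context_terms:
        return 0.0
    words = article_terms.get(candidate, set())
    return len(context_terms & words) / len(context_terms)


def link_score(candidate: str, other_candidates: Set[str],
               page_links: Dict[str, Set[str]]) -> float:
    """Fraction of the other candidates linked to this one in either direction (assumed measure)."""
    others = other_candidates - {candidate}
    if not others:
        return 0.0
    linked = {o for o in others
              if o in page_links.get(candidate, set())
              or candidate in page_links.get(o, set())}
    return len(linked) / len(others)


def total_score(candidate: str, context_terms: Set[str], other_candidates: Set[str],
                article_terms: Dict[str, Set[str]], page_links: Dict[str, Set[str]],
                w_cooc: float = 0.5, w_link: float = 0.5) -> float:
    """Weighted sum of both scores; the weighting scheme is hypothetical."""
    return (w_cooc * cooccurrence_score(candidate, context_terms, article_terms)
            + w_link * link_score(candidate, other_candidates, page_links))


# Toy data, invented for illustration only.
article_terms = {
    "Neil_Armstrong": {"astronaut", "apollo", "moon", "nasa"},
    "Louis_Armstrong": {"jazz", "trumpet", "singer"},
}
page_links = {
    "Neil_Armstrong": {"Moon", "Apollo_11"},
    "Louis_Armstrong": {"Jazz"},
}
context_terms = {"moon", "human"}
other_candidates = {"Moon"}

scores = {c: total_score(c, context_terms, other_candidates, article_terms, page_links)
          for c in article_terms}
best = max(scores, key=scores.get)
print(best, scores)
```

With this toy data the candidate that shares a context term and a page link with the other candidate ranks highest.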
A. Context Definition
Metadata exists in a certain context and has to be interpreted according to this context. For tags of audio-visual content we identified two dimensions:
• temporal dimension
• user-centered dimension
In the temporal dimension a context can be defined as the entire video, a segment, or a single timestamp in the video. The user-centered dimension classifies a context by how many users created the metadata concerned: only tags by a certain user, or all tags regardless of user. Fig. 1 shows the combinations of the two dimensions of contexts for metadata in audio-visual content and the interpretation regarding the significance of a context.
Audio-visual content also provides the opportunity to supply spatial information. Thus, tags in the same region of a video frame are considered as related to each other. In the current approach we did not consider this context dimension.
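The temporal and user-centered dimensions can be pictured as two optional filters over a list of tags. The following is a minimal sketch under stated assumptions: the `Tag` record, the `select_context` function, the user names, the timestamps, and the "telescope" tag are hypothetical and not taken from the described system; the spatial dimension is left out, as in the approach itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Tag:
    term: str
    user: str
    timestamp: float  # seconds from the start of the video


def select_context(tags: List[Tag],
                   segment: Optional[Tuple[float, float]] = None,
                   user: Optional[str] = None) -> List[str]:
    """Return the terms of one context.

    segment=None means the entire video, (t, t) a single timestamp,
    (start, end) a segment; user=None means tags of all users.
    """
    selected = []
    for tag in tags:
        if segment is not None and not (segment[0] <= tag.timestamp <= segment[1]):
            continue
        if user is not None and tag.user != user:
            continue
        selected.append(tag.term)
    return selected


# Hypothetical tags for illustration.
tags = [
    Tag("hubble", "alice", 120.0),
    Tag("spitzer", "alice", 120.0),
    Tag("water", "bob", 120.0),
    Tag("telescope", "alice", 300.0),
]

single_user_at_timestamp = select_context(tags, segment=(120.0, 120.0), user="alice")
all_users_whole_video = select_context(tags)
print(single_user_at_timestamp, all_users_whole_video)
```

The narrowest context (one user, one timestamp) and the widest (all users, entire video) are the two extremes of the combinations described above.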
To describe our approach we use a sample context of our test set (see sect. IV). This sample context is composed of tags by only one user at a certain timestamp in the video. The video containing this sample context is a presentation by Dr. Garik Israelian at the TED conference³ entitled "How spectroscopy could reveal alien life"⁴. Our sample context consists of the tags "hubble", "spitzer", "carbon", "dioxide", "methan", "co2", and "water".
Figure 1. Dimensions of context definition in audio-visual content
B. Preprocessing
Term Combination: Our combination algorithm takes all tags of a specified spatio-temporal context (at a certain timestamp/in a certain segment of a video, of a single URL/image) and generates every possible combination of at most three terms of the context in every possible order. In that way we make sure to reassemble groups of single terms that belong together. We chose to generate combinations of up to three words to make sure to also hit named entities consisting of more than two words, such as "public key cryptography" or "alberto santos dumont". About 90% of the DBpedia [9] labels consist of at most three words, but less than 5% consist of 4 words. Due to these numbers and performance issues we decided to limit the number of terms to be combined to three. In the remainder of this paper, "terms" refers to single terms as well as generated term groups. The number c of combinations of at most j out of n terms is calculated by
c = \sum_{k=1}^{j} \frac{n!}{(n-k)!}.
For our sample context containing 7 tags and at most 3 terms in a combination (j = 3), 259 combinations are generated.
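The term combination step and the count above can be reproduced with ordered selections (permutations) of one, two, and three tags. The sketch below is an illustration only; the function names `combine_terms` and `combination_count` and the choice to join terms with a space are assumptions, not details from the paper.

```python
from itertools import permutations
from math import factorial
from typing import List


def combine_terms(tags: List[str], max_len: int = 3) -> List[str]:
    """Generate every ordered group of 1..max_len distinct tags, joined by spaces."""
    groups = []
    for k in range(1, min(max_len, len(tags)) + 1):
        groups.extend(" ".join(p) for p in permutations(tags, k))
    return groups


def combination_count(n: int, j: int) -> int:
    """c = sum over k = 1..j of n! / (n - k)!"""
    return sum(factorial(n) // factorial(n - k) for k in range(1, min(j, n) + 1))


# Sample context from the text.
tags = ["hubble", "spitzer", "carbon", "dioxide", "methan", "co2", "water"]
groups = combine_terms(tags)
print(len(groups), combination_count(len(tags), 3))
```

For the seven sample tags this gives 7 + 42 + 210 = 259 term groups, among them "carbon dioxide", which is the kind of multi-word entity name the step is meant to recover.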
Term Mapping: The terms then have to be mapped to semantic entities. For our approach we use entities of the Linked Open Data Cloud [10], in particular of DBpedia, version 3.5.1.
DBpedia provides labels for the identification of distinct entities in 92 languages. We use English and German as well as Finnish labels, as we noticed that neither the English nor the German labels include important acronyms, but the Finnish language version does. As tagging users prefer to keep it simple and short [2], resources dealing with "Domain Name System" would rather be tagged with "DNS" than with "Domain Name System".
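The label lookup across several language versions might look like the sketch below. It is a toy illustration: the `LABELS` index, its entries, and the `map_terms` function are hypothetical stand-ins for a real DBpedia label dump, and the lookup is a plain case-insensitive string match. Redirect handling is not covered.

```python
from typing import Dict, Iterable, Set

# Hypothetical label index: language -> lower-cased label -> entity URI.
LABELS: Dict[str, Dict[str, str]] = {
    "en": {
        "carbon dioxide": "dbpedia:Carbon_dioxide",
        "water": "dbpedia:Water",
        "domain name system": "dbpedia:Domain_Name_System",
    },
    "de": {
        "kohlenstoffdioxid": "dbpedia:Carbon_dioxide",
        "wasser": "dbpedia:Water",
    },
    "fi": {
        "dns": "dbpedia:Domain_Name_System",  # acronym label, as an example
    },
}


def map_terms(terms: Iterable[str],
              languages: Iterable[str] = ("en", "de", "fi")) -> Dict[str, Set[str]]:
    """Map each term to the URIs whose label matches it in any selected language."""
    languages = tuple(languages)
    result: Dict[str, Set[str]] = {}
    for term in terms:
        key = term.strip().lower()
        uris = {LABELS[lang][key] for lang in languages if key in LABELS.get(lang, {})}
        if uris:
            result[term] = uris
    return result


mapped = map_terms(["carbon dioxide", "dioxide carbon", "water", "DNS"])
without_finnish = map_terms(["DNS"], languages=("en", "de"))
print(mapped, without_finnish)
```

In this toy index the acronym is found only when the Finnish labels are included, which mirrors the motivation given above for adding a third language.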
After simple string matching of the terms of the context to DBpedia URIs, the URIs are revised for redirects and
³ http://www.ted.com
⁴ http://yovisto.com/play/14415
User-centered Dimension
Temporal Dimension
Spatial Dimension
‣Different metadata sources differ in reliability
‣authoritative metadata (structured / unstructured)
‣analytical metadata (time- / position-related)
‣non-authoritative user-generated metadata (global and time- or position-related)
Entity-Based Annotation
‣spatial and temporal annotation with semantic entities
Entity-Based Search
Faceted Search
Link And Brush
Demo: http://mediaglobe.yovisto.com/mggui-dev2/
•A simple example:
I am looking for the book "Wem die Stunde schlägt" (For Whom the Bell Tolls) by Ernest Hemingway in its first German edition...
Not All Searching Is the Same
Wem die Stunde schlägt. - Ernest H E M I N G W A Y. (Stockholm usw., Bermann-Fischer Verlag, 1941) 560 S. 8“II 1, 2506, 34548
Not All Searching Is the Same
•...but what if you do not know exactly what you are looking for?
I enjoyed the book "Wem die Stunde schlägt" (For Whom the Bell Tolls) by Ernest Hemingway, and I am not sure what to read next...
Not All Searching Is the Same
• What if users do not know which search term to use?
• What if users are looking for more complex answers?
• What if they do not know the subject area they want to learn about (well)?
• What if they want to know which documents on a particular topic exist in a repository overall?
• ...browsing instead of searching
• ...finding something by chance
• ...serendipity
• ...getting an overview
Exploratory Search
dbpedia:For_Whom_the_Bell_Tolls
How should the semantic network around dbpedia:For_Whom_the_Bell_Tolls be searched?
http://dbpedia.org/page/For_Whom_the_Bell_Tolls
Exploratory Search
dbpedia-owl:author
dbpedia-owl:author
dbpedia-owl:author
dbpedia-owl:author
Exploratory Search
dbpedia-owl:author
dbpedia:Ernest_Hemingway
dbpedia:For_Whom_the_Bell_Tolls
dbpedia:Raymond_Carver
dbpedia-owl:influenced_by
dbpedia:Jack_Kerouac
dbpedia-owl:influenced_by
dbpedia-owl:influenced_by
dbpedia:Jerome_D._Salinger
Exploratory Search
dbpedia:Jack_Kerouac
dbpedia:Raymond_Carver
dbpedia:Jerome_D._Salinger
dbpedia-owl:notableWork dbpedia-owl:notableWork dbpedia-owl:notableWork
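The walk through the graph shown on the preceding slides (from a book to its author, to authors influenced by that author, to their notable works) can be sketched over a small in-memory triple list. This is an illustration under stated assumptions: the edge directions are read off the slides, the `example:Work_*` nodes are placeholders because the slides do not name the works, and `related_works` is a hypothetical helper, not part of Mediaglobe.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

TRIPLES: List[Triple] = [
    ("dbpedia:For_Whom_the_Bell_Tolls", "dbpedia-owl:author", "dbpedia:Ernest_Hemingway"),
    ("dbpedia:Raymond_Carver", "dbpedia-owl:influenced_by", "dbpedia:Ernest_Hemingway"),
    ("dbpedia:Jack_Kerouac", "dbpedia-owl:influenced_by", "dbpedia:Ernest_Hemingway"),
    ("dbpedia:Jerome_D._Salinger", "dbpedia-owl:influenced_by", "dbpedia:Ernest_Hemingway"),
    # Placeholder works: the slides show notableWork edges but not their targets.
    ("dbpedia:Jack_Kerouac", "dbpedia-owl:notableWork", "example:Work_A"),
    ("dbpedia:Raymond_Carver", "dbpedia-owl:notableWork", "example:Work_B"),
    ("dbpedia:Jerome_D._Salinger", "dbpedia-owl:notableWork", "example:Work_C"),
]


def objects(subject: str, predicate: str) -> Set[str]:
    """All o with (subject, predicate, o) in the graph."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate: str, obj: str) -> Set[str]:
    """All s with (s, predicate, obj) in the graph."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def related_works(book: str) -> List[str]:
    """Follow author -> influenced_by (backwards) -> notableWork from a book."""
    works: Set[str] = set()
    for author in objects(book, "dbpedia-owl:author"):
        for influenced in subjects("dbpedia-owl:influenced_by", author):
            works |= objects(influenced, "dbpedia-owl:notableWork")
    works.discard(book)
    return sorted(works)


suggestions = related_works("dbpedia:For_Whom_the_Bell_Tolls")
print(suggestions)
```

Against a real triple store the same path would typically be expressed as a query rather than a Python loop; the fixed three-hop path here is only one of many possible ways to explore the neighborhood of an entity.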
Exploratory Search
CONCLUSION
‣Mediaglobe enables semantic, entity-based search
‣Mediaglobe thereby outperforms traditional keyword-based search engines in precision and recall
‣Mediaglobe's semantic annotations enable:
‣novel recommender systems
‣e.g., as an extension of the search options
‣or as a basis for other content-sensitive services
‣interoperability with other systems through standards
‣new design possibilities for innovative user interfaces
Thank you!