Exploring Audiovisual Archives through Aligned Thesauri

Victor de Boer, Matthias Priem, Michiel Hildebrand, Nico Verplancke, Arjen de Vries and Johan Oomen


Page 1: Exploring Audiovisual Archives through Aligned Thesauri

Exploring Audiovisual Archives through Aligned Thesauri

Victor de Boer, Matthias Priem, Michiel Hildebrand, Nico Verplancke, Arjen de Vries and Johan Oomen

Page 2: Exploring Audiovisual Archives through Aligned Thesauri

CC-by-nc-nd https://www.flickr.com/photos/joinash/

The dangers of silos

Page 3: Exploring Audiovisual Archives through Aligned Thesauri

The modern archive

Page 4: Exploring Audiovisual Archives through Aligned Thesauri

...TO CONTEXT: MUTUALLY CONNECTED COLLECTIONS...

03-05-2023

Connecting collections: topics, people, genres, etc.

Catalogue

Photos

Wiki

Programme guides

Internal: Video hyperlinking

Page 5: Exploring Audiovisual Archives through Aligned Thesauri

External: Networked heritage

Links through vocabularies

Page 6: Exploring Audiovisual Archives through Aligned Thesauri

Case: Flemish Institute for Archiving (VIAA) and the Netherlands Institute for Sound and Vision (NISV)

• pipeline of a real-world, international use case that illustrates the end-user benefit of aligned SKOS thesauri;

• method and tools for converting XML thesauri to SKOS;

• CultuurLINK, an interactive tool for thesaurus alignment;

• application that enables cross-collection search and browsing using the aligned thesauri.

Page 7: Exploring Audiovisual Archives through Aligned Thesauri

Sound and Vision

Dutch AV heritage

> 1,000,000 hrs of TV (public broadcasters)

radio, music, docu, film, commercials, etc.

VIAA

Flemish archive, including Flemish broadcaster (VRT)

Page 8: Exploring Audiovisual Archives through Aligned Thesauri

Gemeenschappelijke Thesaurus Audiovisuele Archieven (GTAA)

184,484 terms (concepts, persons, geo, …)

19,695 terms in hierarchy

9 conceptSchemes

90,708 scopeNotes

33,542 relations

Published as SKOS Linked Open Data

http://gtaa.beeldengeluid.nl/

Page 9: Exploring Audiovisual Archives through Aligned Thesauri

VRT Thesaurus

100,000+ terms

Structured, but not SKOS yet

No concept schemes

Page 10: Exploring Audiovisual Archives through Aligned Thesauri


Mapped to SKOS; hierarchies mapped to skos:broader / skos:narrower

VRT Thesaurus
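As a rough illustration of the mapping step, the sketch below turns a small XML thesaurus into SKOS Turtle using only the Python standard library. It is not the skoscreator code: the input layout (`<term id=… parent=…>` with a `<label>` child), the `http://example.org/vrt/` namespace and the `@nl` language tag are all assumptions made for the example, since the slides do not show the actual VRT XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the real VRT XML layout is not shown in the slides.
SAMPLE_XML = """
<thesaurus>
  <term id="1"><label>sport</label></term>
  <term id="2" parent="1"><label>voetbal</label></term>
</thesaurus>
"""

BASE = "http://example.org/vrt/"  # placeholder namespace, not the real one


def xml_to_skos_turtle(xml_text, base=BASE):
    """Emit one skos:Concept per <term>, plus broader/narrower links."""
    root = ET.fromstring(xml_text.strip())
    out = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> .", ""]
    for term in root.iter("term"):
        uri = f"<{base}{term.get('id')}>"
        label = (term.findtext("label") or "").strip()
        # Escape backslashes and quotes so the Turtle literal stays valid.
        label = label.replace("\\", "\\\\").replace('"', '\\"')
        out.append(f"{uri} a skos:Concept ;")
        out.append(f'    skos:prefLabel "{label}"@nl .')
        parent = term.get("parent")
        if parent is not None:
            parent_uri = f"<{base}{parent}>"
            # The hierarchy is expressed in both directions.
            out.append(f"{uri} skos:broader {parent_uri} .")
            out.append(f"{parent_uri} skos:narrower {uri} .")
    return "\n".join(out)


if __name__ == "__main__":
    print(xml_to_skos_turtle(SAMPLE_XML))
```

Terms without a parent are the candidates for top concepts; attaching them to a concept scheme is left out here.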

Page 11: Exploring Audiovisual Archives through Aligned Thesauri

VRT Thesaurus102,172 terms97,744 in hierarchy4,429 top concepts212 scopeNotes6,828 relations

Conversion code available at https://github.com/viaacode/skoscreator

Triples available at http://semanticweb.cs.vu.nl/test

Page 12: Exploring Audiovisual Archives through Aligned Thesauri

Collections

VIAA

• Part of the VRT AV collection

• +/- 35,000 items (out of ~1 million)

• Annotated with VRT thesaurus

• Not publicly available

NISV

• Openimages.eu

• +/- 3,000 items out of 800K hrs

• Mostly news broadcasts

• Annotated with GTAA

• Publicly available (CC-by-SA)

Page 13: Exploring Audiovisual Archives through Aligned Thesauri

VRT Thesaurus GTAA

Page 14: Exploring Audiovisual Archives through Aligned Thesauri

ALIGNMENT

VRT Thesaurus GTAA

Page 15: Exploring Audiovisual Archives through Aligned Thesauri

‘Happy alignments are all alike; every unhappy alignment is unhappy in its own way’

Jacco van Ossenbruggen (with apologies to Tolstoy)

Page 16: Exploring Audiovisual Archives through Aligned Thesauri

http://cultuurlink.beeldengeluid.nl/

Semi-automatic SKOS vocabulary alignment service

Successor of EuropeanaConnect’s Amalgame

Users can upload vocabularies and match them with existing vocabularies.

Users can design, experiment with, and improve their alignment strategy

Matching, selecting, excluding, sampling, evaluating
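The matching and selecting steps can be pictured with a small sketch. This is not CultuurLINK's implementation: it shows one plausible building block, exact matching on normalized labels followed by keeping only unambiguous candidates, with ambiguous ones left for manual evaluation. The function names, the `vrt:`/`gtaa:` identifiers and the sample labels are hypothetical.

```python
import unicodedata


def normalize(label):
    """Lowercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def exact_label_matches(source, target):
    """Match step: source/target map concept id -> list of labels.

    Returns candidate (source_id, target_id) pairs whose labels are
    identical after normalization.
    """
    index = {}
    for uri, labels in target.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(uri)
    matches = []
    for uri, labels in source.items():
        hits = set()
        for label in labels:
            hits |= index.get(normalize(label), set())
        matches.extend((uri, tgt) for tgt in sorted(hits))
    return matches


def select_unambiguous(matches):
    """Select step: keep source concepts with exactly one candidate."""
    by_source = {}
    for src, tgt in matches:
        by_source.setdefault(src, []).append(tgt)
    return {src: tgts[0] for src, tgts in by_source.items() if len(tgts) == 1}


# Hypothetical sample data.
vrt = {"vrt:1": ["Café"], "vrt:2": ["Brussel"]}
gtaa = {"gtaa:a": ["cafe"], "gtaa:b": ["Brussel"], "gtaa:c": ["brussel "]}

candidates = exact_label_matches(vrt, gtaa)
accepted = select_unambiguous(candidates)
```

Here `vrt:2` has two candidates and is therefore excluded from `accepted`; in an interactive workflow such cases would go to sampling and evaluation rather than being discarded.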

Page 17: Exploring Audiovisual Archives through Aligned Thesauri

Example alignment strategy: Concepts

Page 18: Exploring Audiovisual Archives through Aligned Thesauri

Example alignment strategy: Persons

Page 19: Exploring Audiovisual Archives through Aligned Thesauri

Four strategies

Type        Nr of correspondences
Subjects    4,176
Names       2,197
Locations   4,011
Persons     11,265
Total       21,640

Page 20: Exploring Audiovisual Archives through Aligned Thesauri

ALIGNMENT

VRT Thesaurus GTAA

Page 21: Exploring Audiovisual Archives through Aligned Thesauri
Page 22: Exploring Audiovisual Archives through Aligned Thesauri

Demonstrator: Information Retrieval tool using Spinque search-by-strategy paradigm

No programming needed, just modelling the IR strategy

Keyword, vocabulary term or Related-Object search

Search on titles, description, vocabulary labels

Weight on collection (user-input)
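To make the collection weight concrete, here is a minimal sketch of keyword search over titles, descriptions and vocabulary labels, with a slider value that shifts the ranking between the two collections. It does not reproduce the Spinque strategy, which the slides do not detail: the scoring formula, the field names and the convention that the slider runs from 0.0 (VIAA only) to 1.0 (Open Images only) are assumptions.

```python
def text_score(query, item):
    """Count occurrences of query tokens in title, description and labels."""
    tokens = query.lower().split()
    text = " ".join([item["title"], item["description"], " ".join(item["terms"])])
    words = text.lower().split()
    return sum(words.count(token) for token in tokens)


def rank(query, items, slider):
    """Rank item ids by text score multiplied by the collection weight.

    slider is assumed to lie in [0.0, 1.0]: 0.0 favours VIAA only,
    1.0 favours Open Images only.
    """
    weights = {"VIAA": 1.0 - slider, "OpenImages": slider}
    scored = []
    for item in items:
        score = text_score(query, item) * weights[item["collection"]]
        if score > 0:
            scored.append((score, item["id"]))
    # Highest score first; ties broken by id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored]


# Hypothetical sample items.
ITEMS = [
    {"id": "v1", "collection": "VIAA", "title": "Verkiezingen 1985",
     "description": "", "terms": ["verkiezingen"]},
    {"id": "o1", "collection": "OpenImages", "title": "Verkiezingen",
     "description": "", "terms": []},
]
```

With the slider in the middle both collections contribute; moving it to either end drops the other collection's items from the result list.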

Page 23: Exploring Audiovisual Archives through Aligned Thesauri

Demonstrator

http://link.spinque.com/VIAA-1.0/

Page 24: Exploring Audiovisual Archives through Aligned Thesauri

Input for keyword search or thesaurus concepts

Search results

Collection indicator

Thesaurus terms associated with video. Terms may appear in one thesaurus or in both thesauri.

Thesaurus terms associated with retrieval results (grouped by type)

Slider used to indicate collection preference/weight

Per result, the thumbnail, title, description, identifier and thesaurus terms are shown

Page 25: Exploring Audiovisual Archives through Aligned Thesauri

The selected video appears in the search field.

Thesaurus terms associated with search results and selection.

Play screen

In this case, the user positioned the slider all the way to the right, indicating that he/she is interested in Open Images videos related to this VRT item.

List of OpenImages videos related to this VRT video. Matching terms are highlighted.
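The related-object search described on this slide relies on the alignment to cross from one thesaurus to the other. A minimal sketch, assuming the alignment is available as a plain mapping from VRT term ids to GTAA term ids (all identifiers below are made up): the terms of the selected VRT item are translated to GTAA terms, and Open Images items are ranked by how many of those terms they share. The shared terms are returned as well, since those are the ones a user interface would highlight.

```python
def related_items(source_terms, alignment, target_items):
    """Find target-collection items sharing aligned terms with a source item.

    source_terms: term ids attached to the selected item (e.g. VRT terms)
    alignment:    dict mapping source term id -> target term id
    target_items: dict mapping item id -> list of target term ids
    Returns (item_id, matching_terms) pairs, most shared terms first.
    """
    mapped = {alignment[t] for t in source_terms if t in alignment}
    results = []
    for item_id, terms in target_items.items():
        shared = mapped & set(terms)
        if shared:
            results.append((item_id, sorted(shared)))
    results.sort(key=lambda r: (-len(r[1]), r[0]))
    return results


# Hypothetical sample data.
ALIGNMENT = {"vrt:koning": "gtaa:koningen", "vrt:brussel": "gtaa:brussel"}
VRT_ITEM_TERMS = ["vrt:koning", "vrt:brussel", "vrt:onbekend"]
OPEN_IMAGES = {
    "oi:1": ["gtaa:brussel"],
    "oi:2": ["gtaa:koningen", "gtaa:brussel"],
    "oi:3": ["gtaa:sport"],
}

related = related_items(VRT_ITEM_TERMS, ALIGNMENT, OPEN_IMAGES)
```

Terms without a correspondence, such as `vrt:onbekend` here, simply do not contribute to the match.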

Page 26: Exploring Audiovisual Archives through Aligned Thesauri

Conclusions

Conversion of structured vocabularies to SKOS opens possibilities for connecting collections

Interactive alignment produces many useful links

Demonstrator shows possibilities of aligned collections

Demonstrator will be extended when more collections are available

> Complete NISV collection metadata (?)

> Complete VIAA collection metadata (?)