Content-based image retrieval integrated into Fedora
Pierre-Yves Burgi, IT Division, University of Geneva, Switzerland
Patrick Monbaron & Nastaran Fatemi, University of Applied Sciences, Yverdon, Switzerland
Presentation outline
- Context
- MPEG-7
- Data migration
- Indexing
- Image retrieval
- Demo
- Conclusions
- Perspectives
Context of the project
- Paradigm shift
- Migration of image collections from Oracle DB to Fedora
  - First step: data synchronization
  - Second step: user interface targeting data retrieval
  - Third step: user interface for ingesting images (through Valet)
- Why Fedora?
  - Conceptually rich (objects, datastreams, SOA, etc.)
  - Based on open standards (e.g. XML)
  - Convenient for adding datastreams such as MPEG-7
Old (relational) object model
Oracle Object Model
New object model
Fedora object model (Fedora's view)
Why MPEG-7? (and not DC)
- Makes migration from the DB easier
  - The DB's 21 fields can be mapped to MPEG-7
- Better suited to image description
  - dc:creator versus the Creator DS with <Role> & <Agent>
  - <VisualDescriptor>, <MediaFormat>, etc.
  - Exif metadata
  - Etc.
What is MPEG-7?
MPEG-7: feature extraction
- Color layout descriptor
- Edge histogram descriptor
- Scalable color descriptor
Caliph and Emir: http://sourceforge.net/projects/caliph-emir/
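To give a feel for what a visual descriptor captures, here is a minimal sketch of the first stage behind a color-layout style descriptor: the image is cut into an 8 x 8 grid and each cell is reduced to its average color in YCbCr. This is a simplification for illustration only; the MPEG-7 color layout descriptor goes on to apply a DCT and quantization, which are omitted here, and this is not the Caliph and Emir implementation. The function names and the list-of-rows image representation are assumptions.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component in 0..255


def rgb_to_ycbcr(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Full-range ITU-R BT.601 conversion (the one used by JPEG/JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def grid_colors(image: List[List[Pixel]], grid: int = 8) -> List[Tuple[float, float, float]]:
    """Average YCbCr color of each cell of a grid x grid partition, row by row.

    Simplified illustration: the DCT and quantization steps of the real
    color layout descriptor are deliberately left out.
    """
    height, width = len(image), len(image[0])
    if height < grid or width < grid:
        raise ValueError("image must be at least grid x grid pixels")
    cells = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(block)
            r = sum(p[0] for p in block) / n
            g = sum(p[1] for p in block) / n
            b = sum(p[2] for p in block) / n
            cells.append(rgb_to_ycbcr(r, g, b))
    return cells
```

Two images with a similar spatial distribution of colors yield similar lists of 64 cell values, which is what makes such a vector usable for matching.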
MPEG-7: encoding
MPEG-7: retrieving
- Match is expressed as a number
Data migration
[Diagram components: Oracle DB, record reading, XML, XSLT, MPEG-7 + FOXML, image, data migration, API-M, Fedora, file system, messaging, gSearch, XSLT on FOXML, Lucene, indexing, application server]
Data synchronisation
- 3 phases
  - Delete obsolete objects: "delete" in Fedora the records that no longer exist in the Oracle DB
  - Update objects: update in Fedora the objects corresponding to records modified in the Oracle DB
  - Create new objects: create in Fedora the objects corresponding to new records present in the Oracle DB
- Algorithm based on the date of the last batch
  - Date saved in a configuration file
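The three phases and the last-batch date can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the `Record` type, the function names, and the `[sync] last_batch` key in the configuration file are all hypothetical, and record and object ids are assumed to be directly comparable.

```python
import configparser
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Set, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical view of one Oracle record."""
    id: str
    created: datetime
    modified: datetime


def plan_sync(records: Iterable[Record], fedora_ids: Iterable[str],
              last_batch: datetime) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (to_delete, to_update, to_create) as sets of ids."""
    records = list(records)
    oracle_ids = {r.id for r in records}
    # Phase 1: objects in Fedora whose record no longer exists in Oracle.
    to_delete = set(fedora_ids) - oracle_ids
    # Phase 2: records that existed at the last batch and changed since.
    to_update = {r.id for r in records
                 if r.created <= last_batch < r.modified}
    # Phase 3: records created since the last batch.
    to_create = {r.id for r in records if r.created > last_batch}
    return to_delete, to_update, to_create


def load_last_batch(path: str, default: datetime = datetime.min) -> datetime:
    """Read the date of the last batch; fall back to `default` if absent."""
    cfg = configparser.ConfigParser()
    cfg.read(path)  # a missing file is silently ignored
    raw = cfg.get("sync", "last_batch", fallback=None)
    return datetime.fromisoformat(raw) if raw else default


def save_last_batch(path: str, when: datetime) -> None:
    """Persist the date of the batch that just completed."""
    cfg = configparser.ConfigParser()
    cfg["sync"] = {"last_batch": when.isoformat()}
    with open(path, "w") as fh:
        cfg.write(fh)
```

Saving the date only after a batch completes means a failed run is simply retried from the previous date on the next run.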
Migration: delete objects
1. Read the ids of all Oracle records
2. Read the ids of all Fedora objects
3. Compare both lists and "delete" from Fedora the elements missing in Oracle
Migration: create new objects
1. Read the date of the last object creation
2. Search for the ids of the elements created since the last update
3. For new objects: read the metadata and analyse the image
4. Migrate the data into XML, then into MPEG-7 and FOXML through XSLT
5. Ingest FOXML and images into Fedora
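As an illustration of what the FOXML step produces, the sketch below wraps an MPEG-7 description in a minimal FOXML 1.1 object as an inline XML datastream. It builds the document with Python's ElementTree rather than XSLT (the technique used in the project), purely because the Python standard library has no XSLT processor. The datastream ID `MPEG7`, the version ID, the labels, and the function name are assumptions, and a real ingest would need the full metadata and the image datastream as well.

```python
import xml.etree.ElementTree as ET

FOXML_NS = "info:fedora/fedora-system:def/foxml#"
MPEG7_NS = "urn:mpeg:mpeg7:schema:2001"
MODEL = "info:fedora/fedora-system:def/model#"

ET.register_namespace("foxml", FOXML_NS)
ET.register_namespace("mpeg7", MPEG7_NS)


def q(ns: str, tag: str) -> str:
    """Qualified tag name in ElementTree's {namespace}tag notation."""
    return f"{{{ns}}}{tag}"


def build_foxml(pid: str, label: str, mpeg7_root: ET.Element) -> ET.Element:
    """Wrap an MPEG-7 element in a minimal FOXML 1.1 digital object."""
    obj = ET.Element(q(FOXML_NS, "digitalObject"),
                     {"VERSION": "1.1", "PID": pid})
    props = ET.SubElement(obj, q(FOXML_NS, "objectProperties"))
    ET.SubElement(props, q(FOXML_NS, "property"),
                  {"NAME": MODEL + "state", "VALUE": "Active"})
    ET.SubElement(props, q(FOXML_NS, "property"),
                  {"NAME": MODEL + "label", "VALUE": label})
    # Inline XML datastream (CONTROL_GROUP "X"); the ID is hypothetical.
    ds = ET.SubElement(obj, q(FOXML_NS, "datastream"),
                       {"ID": "MPEG7", "STATE": "A", "CONTROL_GROUP": "X"})
    version = ET.SubElement(ds, q(FOXML_NS, "datastreamVersion"),
                            {"ID": "MPEG7.0", "MIMETYPE": "text/xml",
                             "LABEL": "MPEG-7 description"})
    content = ET.SubElement(version, q(FOXML_NS, "xmlContent"))
    content.append(mpeg7_root)
    return obj


if __name__ == "__main__":
    # Placeholder MPEG-7 root; a real description carries the metadata
    # and the visual descriptors.
    description = ET.Element(q(MPEG7_NS, "Mpeg7"))
    foxml = build_foxml("demo:1", "Sample image", description)
    print(ET.tostring(foxml, encoding="unicode"))
```

Keeping the MPEG-7 description in its own datastream is what lets the update phase replace it without touching the rest of the object.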
Migration: update objects
1. Read the date of the last object creation
2. Search for the ids of the elements updated since the last update
3. Read the metadata and analyse the image
4. Migrate the data into XML, then into MPEG-7
5. Ingest MPEG-7 into Fedora
Indexing
- Indexed:
  - DC
  - Textual MPEG-7 metadata
- Not indexed but saved in the index:
  - Technical MPEG-7 metadata
  - Image attributes
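The distinction between indexed fields and fields that are only saved in the index can be shown with a toy in-memory index. This is a conceptual sketch, not Lucene or gSearch code; the class, its methods, and the field names are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


class TinyIndex:
    """Toy index: some fields are searchable, others are only stored."""

    def __init__(self) -> None:
        self.postings = defaultdict(set)  # term -> ids of matching documents
        self.stored: Dict[str, Dict[str, str]] = {}  # id -> all saved fields

    def add(self, doc_id: str, indexed: Dict[str, str],
            stored_only: Dict[str, str]) -> None:
        # Indexed fields (e.g. DC, textual MPEG-7 metadata) are tokenized
        # and become searchable.
        for text in indexed.values():
            for term in text.lower().split():
                self.postings[term].add(doc_id)
        # Every field is saved, including those that are not searchable
        # (e.g. technical metadata, image attributes).
        self.stored[doc_id] = {**indexed, **stored_only}

    def search(self, term: str) -> List[str]:
        return sorted(self.postings.get(term.lower(), set()))


if __name__ == "__main__":
    index = TinyIndex()
    index.add("img:1",
              indexed={"title": "Old building in Geneva"},
              stored_only={"edge_histogram": "3 1 4 1 5"})
    print(index.search("building"))                 # found through the title
    print(index.search("3"))                        # stored-only values do not match
    print(index.stored["img:1"]["edge_histogram"])  # but can be read back by id
```

Stored-only fields cost no search-time work yet remain available when a hit is retrieved, which is what image matching needs.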
Indexing: indexing process within Lucene
[Diagram components: Oracle DB, record reading, XML, XSLT, MPEG-7 + FOXML, image, data migration, API-M, Fedora, file system, messaging, gSearch, XSLT on FOXML, indexing, Lucene, application server]
Image retrieval
[Diagram components: application server, user interface, request parsing, HTML page generation, XSLT, search by image matching, gSearch connector, gSearch, indexing, XSLT, Lucene, file system, Fedora, API-A, resolver]
Retrieval by image matching
1. Input: id of the reference image
2. Retrieve the attributes of this image from Lucene
3. Read all data from Lucene and convert it into XML
4. Match the attributes of each image against those of the reference and compute a matching score
5. Select the first 10 results
6. Produce XML output for display
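The matching and selection steps can be sketched as below. The sum of absolute differences (L1 distance) is used here only as a stand-in: the slides do not state which measure is applied to each descriptor. The function names, the dictionary of descriptors, and the choice to exclude the reference image from its own results are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences; 0 means identical descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_similarity(reference_id: str,
                       descriptors: Dict[str, Sequence[float]],
                       limit: int = 10) -> List[Tuple[str, float]]:
    """Score every image against the reference and keep the closest ones."""
    reference = descriptors[reference_id]
    scored = [(l1_distance(reference, vector), image_id)
              for image_id, vector in descriptors.items()
              if image_id != reference_id]  # assumption: skip the reference
    scored.sort()  # smallest distance first, ties broken by image id
    return [(image_id, score) for score, image_id in scored[:limit]]


if __name__ == "__main__":
    sample = {
        "ref": [1, 2, 3],
        "a": [1, 2, 4],
        "b": [5, 5, 5],
        "c": [1, 2, 3],
    }
    for image_id, score in rank_by_similarity("ref", sample):
        print(image_id, score)
```

Because every stored descriptor is compared with the reference, the cost grows linearly with the size of the collection.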
Demo: Photothèque
Conclusions
- Applying the MPEG-7 standard to retrieval by image matching gives satisfactory results … but this is not terribly semantic!
- For large image banks the indexing method might not be optimal (currently about 1,000 images)
- Tuning the parameters of the image attributes is not easy
- Image corpus too small to establish benchmarks
- Migration procedure from a DB to Fedora tested
- Data synchronization remains a difficulty
- Integration within Fedora is otherwise a success
- Culture change (paradigm shift) in progress
Perspectives
- Exploiting more of Fedora's features
  - Using a disseminator for watermarking images on the fly
  - Applying XACML policies
- Improving the user interface
  - Playing with search parameters (shape versus color)
- Applying the methodology to other image banks
  - Medicine, architecture, etc.
  - Other media (audio, video)
Questions
Complete report available in French at:
http://www.unige.ch/dinf/ntice/accueil/MembresProjet/PMonbaronRapportFinal.pdf
Keyword search: "building"
Retrieval results
Display of chosen image
Retrieval by image matching
Retrieval results (by contour matching)
Display of another chosen image
Retrieval results (by color matching)