Pierre-Yves Burgi IT Division, University of Geneva, Switzerland

Post on 21-Jan-2016



Content-based image retrieval integrated into Fedora

Pierre-Yves Burgi
IT Division, University of Geneva, Switzerland

Patrick Monbaron & Nastaran Fatemi
University of Applied Sciences, Yverdon, Switzerland

Presentation outline

Context
MPEG-7
Data migration
Indexing
Image retrieval
Demo
Conclusions
Perspectives

Context of the project

Paradigm shift
  Migration of image collections from Oracle DB to Fedora
  First step: data synchronization
  Second step: user interface targeting data retrieval
  Third step: user interface for ingesting images (through Valet)
Why Fedora?
  Conceptually rich (objects, datastreams, SOA, etc.)
  Based on open standards (e.g. XML)
  Convenient for adding datastreams such as MPEG-7

Old (relational) object model

[Figure: Oracle object model]

New object model

[Figure: Fedora object model (Fedora's view)]

Why MPEG-7? (and not DC)

Eases migration from the DB
  Mapping the DB's 21 fields to MPEG-7 is possible
Better suited to image description
  dc:creator versus the Creator DS with <Role> & <Agent>
  <VisualDescriptor>, <MediaFormat>, etc.
  Exif metadata
  Etc.
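To make the dc:creator versus Creator DS contrast concrete, here is a minimal Python sketch that builds both forms for a single record. The record and its field names are hypothetical. The MPEG-7 fragment is deliberately simplified (no namespace, no xsi:type) and has not been validated against the MPEG-7 schema.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical database record; the field names are assumptions for illustration.
record = {"given_name": "Jane", "family_name": "Doe", "role": "photographer"}


def dc_creator(rec):
    """Dublin Core: the creator is one flat string, with no room for a role."""
    element = ET.Element(f"{{{DC_NS}}}creator")
    element.text = f"{rec['given_name']} {rec['family_name']}"
    return element


def mpeg7_creator(rec):
    """Simplified MPEG-7 Creator: a role plus a structured agent name."""
    creator = ET.Element("Creator")
    role = ET.SubElement(creator, "Role")
    ET.SubElement(role, "Name").text = rec["role"]
    agent = ET.SubElement(creator, "Agent")
    name = ET.SubElement(agent, "Name")
    ET.SubElement(name, "GivenName").text = rec["given_name"]
    ET.SubElement(name, "FamilyName").text = rec["family_name"]
    return creator


print(ET.tostring(dc_creator(record), encoding="unicode"))
print(ET.tostring(mpeg7_creator(record), encoding="unicode"))
```

The structured form keeps the role and the parts of the name as separate elements, which is what allows a multi-field database row to be mapped without flattening it.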

What is MPEG-7?

MPEG-7: feature extraction

Color layout descriptor
Edge histogram descriptor
Scalable color descriptor

Caliph and Emir: http://sourceforge.net/projects/caliph-emir/
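As a rough illustration of what a layout-style descriptor captures, the sketch below computes the average color of each cell of an 8 x 8 grid laid over the image. This is only the first stage of a color layout descriptor: the later stages defined by the standard (color-space conversion, DCT, coefficient quantization) are omitted, and the code is not taken from Caliph and Emir. The function name and the pixel representation (rows of RGB tuples) are assumptions.

```python
def block_averages(pixels, grid=8):
    """Return the average (r, g, b) of each cell in a grid x grid partition.

    `pixels` is a list of rows; each row is a list of (r, g, b) tuples.
    Cells are returned row by row, left to right.
    """
    height, width = len(pixels), len(pixels[0])
    if height < grid or width < grid:
        raise ValueError("image must be at least grid x grid pixels")
    cells = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            count = len(block)
            cells.append(tuple(sum(p[c] for p in block) / count for c in range(3)))
    return cells


# A 16 x 16 test image: left half red, right half blue.
image = [[(255, 0, 0)] * 8 + [(0, 0, 255)] * 8 for _ in range(16)]
cells = block_averages(image)
print(len(cells), cells[0], cells[7])
```

Two images with similar spatial color distributions yield similar 64-cell vectors, which is the property that later makes a numeric comparison possible.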

MPEG-7: encoding

MPEG-7: retrieving

The match between two images is expressed as a number
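One simple way to turn two descriptors into a single number is a distance between their value vectors. The sketch below uses a plain L1 (sum of absolute differences) distance; this choice is an assumption for illustration, since MPEG-7 describes matching measures specific to each descriptor, and the sample histograms are made up.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length descriptor vectors.

    0 means the descriptors are identical; larger values mean a poorer match.
    """
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


# Made-up 4-bin histograms for a reference image and a candidate image.
reference = [4, 0, 2, 1]
candidate = [3, 1, 2, 0]
print(l1_distance(reference, candidate))  # prints 3
```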

Data migration

[Architecture diagram. Labels: Oracle DB, Record reading, XML, XSLT, FOXML, MPEG7 + FOXML, Image, Data migration, API-M, Fedora, File system, Messaging, gSearch, XSLT, Indexing, Lucene, Application server]

Data synchronization

3 phases
  Delete obsolete objects
    "Delete" in Fedora the objects whose records no longer exist in the Oracle DB
  Update objects
    Update in Fedora the objects whose records were modified in the Oracle DB
  Create new objects
    Create in Fedora the objects corresponding to the new records present in the Oracle DB
Algorithm based on the date of the last batch
  Date saved in a configuration file
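The date-based part of the algorithm can be sketched as follows: the date of the last batch is read from a configuration file, records are split into "new" and "updated" relative to that date, and the date is written back once the batch completes. The file layout (an INI file with a `sync` section and a `last_batch` key) and the record fields `created` and `modified` are assumptions; the slides do not specify them.

```python
import configparser
from datetime import datetime

SECTION = "sync"      # hypothetical section name
KEY = "last_batch"    # hypothetical key name


def load_last_batch(path):
    """Return the saved batch date, or datetime.min if none was saved yet."""
    config = configparser.ConfigParser()
    config.read(path)  # a missing file is silently ignored
    if config.has_option(SECTION, KEY):
        return datetime.fromisoformat(config.get(SECTION, KEY))
    return datetime.min


def save_last_batch(path, when):
    """Persist the date of the batch that just completed."""
    config = configparser.ConfigParser()
    config[SECTION] = {KEY: when.isoformat()}
    with open(path, "w") as handle:
        config.write(handle)


def changed_since(records, last_batch):
    """Split record IDs into (new, updated) relative to the last batch date."""
    new = [r["id"] for r in records if r["created"] > last_batch]
    updated = [r["id"] for r in records
               if r["created"] <= last_batch < r["modified"]]
    return new, updated
```

A batch would call `load_last_batch`, process the two lists returned by `changed_since`, and finish with `save_last_batch`, so that an interrupted run is simply repeated from the previous date.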

Migration: delete objects

Read the IDs of all Oracle records
Read the IDs of all Fedora objects
Compare the two lists and "delete" from Fedora the objects missing from Oracle
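The comparison of the two ID lists amounts to a set difference. The sketch below only computes which Fedora objects are affected; the identifiers are made up, and how the "delete" is then carried out in Fedora (flagging versus purging) is left open because the slides do not say.

```python
def ids_to_delete(oracle_ids, fedora_ids):
    """Return the IDs present in Fedora but no longer present in Oracle."""
    return sorted(set(fedora_ids) - set(oracle_ids))


# Made-up identifiers for illustration.
oracle_ids = ["img:1", "img:2", "img:4"]
fedora_ids = ["img:1", "img:2", "img:3"]
print(ids_to_delete(oracle_ids, fedora_ids))  # prints ['img:3']
```

Note that `img:4`, present only in Oracle, is not reported here: it belongs to the "create new objects" phase.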

Migration: create new objects

Read the date of the last object creation
Find the IDs of the records created since the last update
For new objects: read the metadata and analyse the image
Convert the data to XML, then to MPEG-7 and FOXML through XSLT
Ingest the FOXML and the images into Fedora
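The project performs the conversion with XSLT. Purely to show the shape of the result, the sketch below wraps an MPEG-7 description in a FOXML object as an inline XML datastream, using Python instead of XSLT. The element and attribute names follow my reading of FOXML 1.1 and have not been validated against its schema; the PID `demo:1`, the datastream ID `MPEG7`, and the labels are hypothetical.

```python
import xml.etree.ElementTree as ET

FOXML_NS = "info:fedora/fedora-system:def/foxml#"
MODEL = "info:fedora/fedora-system:def/model#"
ET.register_namespace("foxml", FOXML_NS)


def q(tag):
    """Qualify a tag name with the FOXML namespace."""
    return f"{{{FOXML_NS}}}{tag}"


def wrap_in_foxml(pid, label, mpeg7_root):
    """Build a FOXML object holding one inline datastream with the MPEG-7 XML."""
    obj = ET.Element(q("digitalObject"), {"VERSION": "1.1", "PID": pid})
    props = ET.SubElement(obj, q("objectProperties"))
    ET.SubElement(props, q("property"), {"NAME": MODEL + "state", "VALUE": "Active"})
    ET.SubElement(props, q("property"), {"NAME": MODEL + "label", "VALUE": label})
    # CONTROL_GROUP "X" marks inline XML content (assumed datastream ID: MPEG7).
    datastream = ET.SubElement(
        obj, q("datastream"), {"ID": "MPEG7", "STATE": "A", "CONTROL_GROUP": "X"})
    version = ET.SubElement(
        datastream, q("datastreamVersion"),
        {"ID": "MPEG7.0", "MIMETYPE": "text/xml", "LABEL": "MPEG-7 description"})
    ET.SubElement(version, q("xmlContent")).append(mpeg7_root)
    return obj


# Placeholder MPEG-7 root; a real description would carry the migrated metadata.
foxml = wrap_in_foxml("demo:1", "Sample image", ET.Element("Mpeg7"))
print(ET.tostring(foxml, encoding="unicode"))
```

Keeping the MPEG-7 description as its own datastream is what lets it be replaced on update without touching the rest of the object.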

Migration: update objects

Read the date of the last object creation
Find the IDs of the records updated since the last update
Read the metadata and analyse the image
Convert the data to XML and then to MPEG-7
Ingest the MPEG-7 into Fedora

Indexing

Indexed
  DC
  Textual MPEG-7 metadata
Not indexed but saved in the index
  Technical MPEG-7 metadata
  Image attributes
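The distinction between fields that are indexed and fields that are only saved in the index can be illustrated with a toy index. This is not the Lucene or gSearch API, only a sketch of the idea: textual metadata is tokenized and searchable, while image attributes are kept alongside the document and can be read back but not searched. The class name, field names, and sample values are assumptions.

```python
from collections import defaultdict


class TinyIndex:
    """Toy index: some text is searchable, other fields are stored only."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document IDs
        self.stored = {}                  # document ID -> stored-only fields

    def add(self, doc_id, indexed_text, stored_fields):
        # Indexed part: every lower-cased word points back to the document.
        for term in indexed_text.lower().split():
            self.postings[term].add(doc_id)
        # Stored-only part: retrievable by document ID, never searched.
        self.stored[doc_id] = dict(stored_fields)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), ()))


index = TinyIndex()
index.add("img:1", "Geneva cathedral building", {"edge_histogram": [2, 0, 1]})
print(index.search("building"))                 # prints ['img:1']
print(index.search("edge_histogram"))           # prints []
print(index.stored["img:1"]["edge_histogram"])  # prints [2, 0, 1]
```

Storing the image attributes next to the textual fields means that a matching step can fetch them from the index without going back to the repository.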

Indexing

Indexing process within Lucene

[Architecture diagram. Labels: Oracle DB, Record reading, XML, XSLT, FOXML, MPEG7 + FOXML, Image, Data migration, API-M, Fedora, File system, Messaging, gSearch, XSLT, Lucene, Application server]

Image retrieval

[Architecture diagram. Labels: User interface, Application server, Request parsing, HTML page generation, XSLT, Search by image matching, gSearch connector, gSearch, Indexing, XSLT, Lucene, FOXML, Fedora, API-A, Resolver, File system]

Retrieval by image matching

ID of the reference image
Retrieve the attributes of this image from Lucene
Match each image's attributes against the reference and compute a matching score
Selection (first 10)
Read all the data from Lucene and convert it to XML
XML output for display
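The matching and selection steps can be sketched as a small ranking function. It assumes that each image's attributes are available as numeric vectors keyed by descriptor name, that a lower score means a closer match, and that per-descriptor weights are combined linearly; none of these details are stated in the slides, and the L1 distance and sample data are illustrative only.

```python
def l1(a, b):
    """Sum of absolute differences between two descriptor vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_similarity(reference_id, attributes, weights, limit=10):
    """Return up to `limit` (image_id, score) pairs, best match first.

    `attributes` maps image ID -> {descriptor name -> vector}.
    `weights` maps descriptor name -> weight in the combined score.
    """
    reference = attributes[reference_id]
    scored = []
    for image_id, attrs in attributes.items():
        if image_id == reference_id:
            continue  # do not return the reference image itself
        score = sum(w * l1(reference[name], attrs[name])
                    for name, w in weights.items())
        scored.append((score, image_id))
    scored.sort()
    return [(image_id, score) for score, image_id in scored[:limit]]


# Made-up attribute vectors for three images.
attributes = {
    "ref": {"color": [1, 0, 0], "edges": [2, 2]},
    "a": {"color": [1, 0, 0], "edges": [2, 1]},
    "b": {"color": [0, 1, 0], "edges": [2, 2]},
}
print(rank_by_similarity("ref", attributes, {"color": 1.0, "edges": 1.0}))
print(rank_by_similarity("ref", attributes, {"color": 0.0, "edges": 1.0}))
```

With equal weights image `a` ranks first; with color switched off image `b` ranks first, which shows how the ordering depends on the chosen search parameters.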

Demo: Photothèque

Conclusions

Applying the MPEG-7 standard to retrieval by image matching gives satisfactory results ... but the matching is not very semantic!
For a large image bank, the indexing method might not be optimal (currently about 1,000 images)
Tuning the parameters of the image attributes is not easy
The image corpus is too small to establish benchmarks
The migration procedure from a DB to Fedora has been tested
  Data synchronization remains a difficulty
  Integration within Fedora is otherwise a success
Culture change (paradigm shift) in progress

Perspectives

Exploiting more of Fedora's features
  Using a disseminator to watermark images on the fly
  Applying XACML policies
Improving the user interface
  Experimenting with search parameters (shape versus color)
Applying the methodology to other image banks
  Medicine, architecture, etc.
  Other media (audio, video)

Questions

Complete report available in French at:

http://www.unige.ch/dinf/ntice/accueil/MembresProjet/PMonbaronRapportFinal.pdf

Keyword: "building"

Retrieval results

Display of the chosen image

Retrieval by image matching

Retrieval results (by contour matching)

Display of another chosen image

Retrieval results (by color matching)