Pierre-Yves Burgi IT Division, University of Geneva, Switzerland


Page 1: Content-based image retrieval integrated into Fedora

Pierre-Yves Burgi, IT Division, University of Geneva, Switzerland
Patrick Monbaron & Nastaran Fatemi, University of Applied Sciences, Yverdon, Switzerland

Page 2: Presentation outline

Context
MPEG-7
Data migration
Indexing
Image retrieval
Demo
Conclusions
Perspectives

Page 3: Context of the project

Paradigm shift
  Migration of image collections from Oracle DB to Fedora
  First step: data synchronization
  Second step: user interface targeting data retrieval
  Third step: user interface for ingesting images (through Valet)
Why Fedora?
  Conceptually rich (objects, datastreams, SOA, etc.)
  Based on open standards (e.g. XML)
  Convenient for adding datastreams such as MPEG-7

Page 4: Old (relational) object model

[Figure: Oracle object model]

Page 5: New object model

[Figure: Fedora object model, Fedora’s view]

Page 6: Why MPEG-7? (and not DC)

Makes migration from the DB easier
  A match from the DB’s 21 fields to MPEG-7 is possible
Fits image description better
  dc:creator versus DS Creator with <Role> & <Agent>
  <VisualDescriptor>, <MediaFormat>, etc.
  Exif metadata
  Etc.

Page 7: What is MPEG-7?

Page 8: MPEG-7: feature extraction

Color layout descriptor
Edge histogram descriptor
Scalable color descriptor

Caliph and Emir: http://sourceforge.net/projects/caliph-emir/
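
As a rough illustration of what a color layout descriptor starts from, the sketch below partitions an image into an 8 x 8 grid and averages the color of each cell. It is a simplification and not the Caliph and Emir code: the standard descriptor goes on to convert these 64 representative colors to YCbCr, apply a DCT and quantize the coefficients, none of which is shown here. The function name `grid_average` and the nested-list image format are assumptions made for the example.

```python
def grid_average(pixels, grid=8):
    """Average RGB color of each cell in a grid x grid partition.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    This covers only the first step of a color layout computation.
    """
    height, width = len(pixels), len(pixels[0])
    if height < grid or width < grid:
        raise ValueError("image must be at least grid x grid pixels")
    cells = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        row = []
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            count = len(block)
            # Mean of each channel over the pixels of this cell.
            row.append(tuple(sum(p[c] for p in block) / count for c in range(3)))
        cells.append(row)
    return cells


# Toy 16 x 16 image: left half red, right half blue.
image = [[(255, 0, 0)] * 8 + [(0, 0, 255)] * 8 for _ in range(16)]
layout = grid_average(image)
```

The resulting 8 x 8 table of colors is a compact summary of where colors sit in the image, which is what makes two layouts comparable.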

Page 9: MPEG-7: encoding

Page 10: MPEG-7: retrieving

Match is expressed as a number
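
One common way to express a match as a single number is a distance between two descriptor vectors, where 0 means identical and larger values mean less similar. The sketch below uses a plain Euclidean distance; the slides do not say which measure the project uses, so both the measure and the name `descriptor_distance` are assumptions.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


same = descriptor_distance([12, 7, 3], [12, 7, 3])
apart = descriptor_distance([0, 0], [3, 4])
```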

Page 11: Data migration

[Architecture diagram. Labels: Oracle DB, Record reading, XML, XSLT, FOXML, Image, Data migration, MPEG7 + FOXML, API-M, Fedora, File system, Messaging, gSearch, XSLT, Indexing, Lucene, Application server]

Page 12: Data synchronisation

3 phases
  Delete obsolete objects
    « Delete » in Fedora the records that no longer exist in the Oracle DB
  Update objects
    Update in Fedora the objects corresponding to records modified in the Oracle DB
  Create new objects
    Create in Fedora new objects corresponding to the new records present in the Oracle DB
Algorithm based on the date of the last batch
  Date saved in a configuration file
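
Keeping the date of the last batch in a configuration file could look like the minimal sketch below. The file layout (a `[sync]` section with a `last_batch` key), the function names and the fallback date are all assumptions; the slides only state that the date is saved in a configuration file.

```python
import configparser
from datetime import datetime


def read_last_batch(path, default=datetime(1970, 1, 1)):
    """Return the saved batch date, or `default` if none is stored yet."""
    config = configparser.ConfigParser()
    config.read(path)  # a missing file is silently ignored
    raw = config.get("sync", "last_batch", fallback=None)
    return datetime.fromisoformat(raw) if raw else default


def write_last_batch(path, when):
    """Persist the date of the batch that has just completed."""
    config = configparser.ConfigParser()
    config["sync"] = {"last_batch": when.isoformat()}
    with open(path, "w") as handle:
        config.write(handle)
```

A batch would call `read_last_batch` first, process everything newer than that date, and call `write_last_batch` only once the batch has succeeded.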

Page 13: Migration: delete objects

Read the ids of all Oracle records
Read the ids of all Fedora objects
Compare both lists and « delete » from Fedora the elements missing in Oracle
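
The comparison of the two id lists amounts to a set difference. A minimal sketch, with a hypothetical function name and made-up ids:

```python
def ids_to_delete(oracle_ids, fedora_ids):
    """Ids present in Fedora but no longer present in Oracle."""
    return sorted(set(fedora_ids) - set(oracle_ids))


obsolete = ids_to_delete(oracle_ids=[1, 2, 3], fedora_ids=[2, 3, 4, 5])
```

Here `obsolete` holds the ids 4 and 5, the objects that would be « deleted » from Fedora.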

Page 14: Migration: create new objects

Read the date of the last object creation
Search for the ids of the elements created since the last update
For new objects: read the metadata and analyse the image
Migrate the data into XML format and into MPEG-7 and FOXML through XSLT
Ingest FOXML and images into Fedora

Page 15: Migration: update objects

Read the date of the last object creation
Search for the ids of the elements updated since the last update
Read the metadata and analyse the image
Migrate the data into XML format and into MPEG-7
Ingest MPEG-7 into Fedora
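
The selection step shared by the create and update phases can be sketched as a date comparison against the last batch. The record fields `created` and `modified` are hypothetical; the slides do not describe the Oracle schema.

```python
from datetime import datetime


def split_changes(records, last_batch):
    """Split record ids into those to create and those to update in Fedora."""
    to_create, to_update = [], []
    for record in records:
        if record["created"] > last_batch:
            to_create.append(record["id"])   # new since the last batch
        elif record["modified"] > last_batch:
            to_update.append(record["id"])   # existed before, changed since
    return to_create, to_update


records = [
    {"id": 1, "created": datetime(2009, 1, 1), "modified": datetime(2009, 1, 1)},
    {"id": 2, "created": datetime(2009, 1, 1), "modified": datetime(2009, 6, 1)},
    {"id": 3, "created": datetime(2009, 6, 2), "modified": datetime(2009, 6, 2)},
]
new_ids, changed_ids = split_changes(records, last_batch=datetime(2009, 3, 1))
```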

Page 16: Indexing

Indexed
  DC
  Textual MPEG-7 metadata
Not indexed but saved in the index
  Technical MPEG-7 metadata
  Image attributes
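
The difference between indexed fields and fields that are only saved in the index can be illustrated with a toy structure: indexed text is tokenized into a term lookup, while stored-only values are kept with the document and returned on demand but cannot be searched. This is a pure-Python illustration, not the Lucene or gSearch API; the class name, field names and values are made up.

```python
from collections import defaultdict


class TinyIndex:
    """Toy index separating searchable fields from stored-only fields."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of matching documents
        self.stored = {}                  # id -> values kept but not searchable

    def add(self, doc_id, indexed, stored_only):
        for text in indexed.values():
            for token in text.lower().split():
                self.postings[token].add(doc_id)
        self.stored[doc_id] = dict(stored_only)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


index = TinyIndex()
index.add(
    "img:1",
    indexed={"dc.title": "University building", "mpeg7.text": "Geneva campus"},
    stored_only={"color_layout": "12 7 3 9", "format": "JPEG"},
)
hits = index.search("building")
```

A keyword query finds `img:1` through the indexed text, and the image attributes needed for matching are then read from the stored values of that document.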

Page 17: Indexing

Indexing process within Lucene

[Architecture diagram. Labels: Oracle DB, Record reading, XML, XSLT, FOXML, Image, Data migration, MPEG7 + FOXML, API-M, Fedora, File system, Messaging, FOXML, gSearch, XSLT, Indexing, Lucene, Application server]

Page 18: Image retrieval

[Architecture diagram. Labels: User interface, Application server, Request parsing, HTML page generation, XSLT, Search by image matching, gSearch connector, File system, gSearch, Indexing, XSLT, Lucene, Fedora, API-A, Resolver]

Page 19: Retrieval by image matching

Image reference id
Retrieve the attributes of this image from Lucene
Match the attributes of each image against those of the reference and compute a matching score
Selection (first 10)
Read all the data from Lucene and convert it into XML
XML output for display
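
The matching flow on this page can be sketched as a ranking: score every image against the reference and keep the ten best. The in-memory dictionary standing in for the attributes read from Lucene, the Euclidean score and the exclusion of the reference image from its own results are assumptions made for the example.

```python
import math


def rank_by_similarity(reference_id, attributes, limit=10):
    """Return up to `limit` (image_id, score) pairs, closest match first."""
    reference = attributes[reference_id]
    scored = []
    for image_id, vector in attributes.items():
        if image_id == reference_id:
            continue  # assumption: the reference is not listed among its matches
        scored.append((math.dist(reference, vector), image_id))
    scored.sort()  # a lower score means a closer match
    return [(image_id, score) for score, image_id in scored[:limit]]


attributes = {"a": [0, 0], "b": [1, 0], "c": [5, 5], "d": [0, 2]}
best = rank_by_similarity("a", attributes, limit=2)
```

Because every image is scored against the reference, the cost of one query grows linearly with the size of the collection.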

Page 20: Demo: Photothèque

Page 21: Conclusions

Applying the MPEG-7 standard to retrieval by image matching gives satisfactory results … but this is not terribly semantic!
For a large image data bank, the indexing method might not be optimal (currently about 1,000 images)
Tuning the parameters of the image attributes is not easy
The image corpus is too small to establish benchmarks
The migration procedure from a DB to Fedora has been tested
Data synchronization remains a difficulty
Integration within Fedora is otherwise a success
Culture change (paradigm shift) in progress

Page 22: Perspectives

Exploiting more of Fedora’s features
  Using a disseminator for watermarking images on the fly
  Applying XACML policies
Improving the user interface
  Play with search parameters (shape versus color)
Applying the methodology to other image data banks
  Medicine, architecture, etc.
  Other media (audio, video)

Page 23: Questions

Complete report available in French at:
http://www.unige.ch/dinf/ntice/accueil/MembresProjet/PMonbaronRapportFinal.pdf

Page 24: « building » (key word)

Page 25: Retrieval results

Page 26: Display of chosen image

Page 27: Retrieval by image matching

Page 28: Retrieval results (by contour matching)

Page 29: Display of another chosen image

Page 30: Retrieval results (by color matching)