“Inside the Bible”: Segmentation, annotation and retrieval for a new
browsing experience
Daniele Borghesani
International Doctorate school on Information and Communication Technologies
English for academic purposes I
Overview
1. Goals
2. Text segmentation
3. Picture segmentation
4. Results
5. Conclusions
Dataset description
• Holy Bible of Borso d’Este (1450-1471 AD)
• Illuminated manuscript
• Many illustrations (biblical episodes, animals, symbols, scenes of court life…)
• 1200+ high-resolution pages
Our project
[Diagram: project pipeline. Blocks: preprocessing, texture analysis, text recognition, illustrations classification (text; illustrations: decorated initials, decoration, picture), manual annotation, feature annotation, annotation database, images database, CBIR, user interface]
• Automatic analysis of Bible pages
• Extraction of valuable pictures
• Addition of translations, commentaries, references…
• Finally, a media station with an appealing user interface (museums)
Obscura HP Multi-Touch Video Wall
Text Segmentation
1. Block analysis with autocorrelation
2. Directional histogram
• Sum of pixels along each direction
3. Modeling with mixtures of Von Mises distributions
• Well suited to angular data
• Compact representation (5 values for a mixture of two Von Mises distributions)
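A minimal Python sketch of steps 1 and 2, assuming NumPy and a grayscale block stored as a 2-D array. The function names, the 180-angle resolution and the choice of summing autocorrelation values along lines through the zero-lag centre are illustrative assumptions, not the implementation used in the project.

```python
import numpy as np

def autocorrelation(block):
    """Circular 2-D autocorrelation of a grayscale block, computed via the FFT."""
    centered = block - block.mean()
    spectrum = np.fft.fft2(centered)
    acf = np.fft.ifft2(spectrum * np.conj(spectrum)).real
    return np.fft.fftshift(acf)  # move the zero-lag value to the centre

def directional_histogram(acf, n_angles=180):
    """Sum the autocorrelation along a line through the centre, one line per angle."""
    rows, cols = acf.shape
    cy, cx = rows // 2, cols // 2
    radius = min(cy, cx) - 1
    radii = np.arange(-radius, radius + 1)
    angles = np.deg2rad(np.arange(n_angles) * 180.0 / n_angles)
    hist = np.empty(n_angles)
    for k, theta in enumerate(angles):
        ys = np.rint(cy + radii * np.sin(theta)).astype(int)
        xs = np.rint(cx + radii * np.cos(theta)).astype(int)
        hist[k] = acf[ys, xs].sum()
    return hist

# Hypothetical test block: horizontal stripes, a crude stand-in for lines of text.
stripes = np.zeros((32, 32))
stripes[(np.arange(32) // 2) % 2 == 0, :] = 1.0
hist = directional_histogram(autocorrelation(stripes))
```

For the striped block the histogram reaches its maximum in the horizontal direction (angle index 0), which is the kind of dominant orientation one would expect from text lines.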
Text Segmentation
[Figure: two plots, one for a text block (“Text”) and one for a non-text block (“!Text”); x axis from 1 to 172, y axes from 0 to 0.5 and from 0 to 8]
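The five values of a mixture of two Von Mises distributions can be read as one mixing weight plus a mean direction and a concentration for each component. This reading and the parameter order used here are assumptions, since the slides do not list the values. A sketch with NumPy:

```python
import numpy as np

def von_mises_pdf(theta, mu, kappa):
    """Von Mises density: exp(kappa * cos(theta - mu)) / (2 * pi * I0(kappa))."""
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))

def mixture_pdf(theta, params):
    """Two-component mixture described by (weight, mu1, kappa1, mu2, kappa2)."""
    weight, mu1, kappa1, mu2, kappa2 = params
    return (weight * von_mises_pdf(theta, mu1, kappa1)
            + (1.0 - weight) * von_mises_pdf(theta, mu2, kappa2))

# Hypothetical parameters: a sharp peak at 0 rad and a broader one at pi/2 rad.
params = (0.7, 0.0, 8.0, np.pi / 2, 4.0)
theta = np.linspace(-np.pi, np.pi, 3600, endpoint=False)
density = mixture_pdf(theta, params)

# Numerical check that the mixture integrates to one over the circle.
area = density.sum() * (2.0 * np.pi / theta.size)
peak = theta[np.argmax(density)]
```

Fitting the five values to a measured directional histogram (for example with expectation-maximisation) is not shown here.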
Picture Segmentation
3. Preprocessing to focus on the most important blobs of pixels
(1) Original image
(2) Background suppression and labeling (fast)
(3) Morphology
(4) Blob filling
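The four panels can be sketched in Python with NumPy. Everything specific here is an assumption made for illustration: the light-background threshold, the minimum blob area, the 3x3 structuring element, and the simple flood-fill labeling (which is not the fast labeling mentioned on the slide).

```python
from collections import deque
import numpy as np

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def label_blobs(mask):
    """4-connected component labeling by breadth-first flood fill."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue
        count += 1
        labels[seed] = count
        queue = deque([seed])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = count
                    queue.append((ny, nx))
    return labels, count

def dilate(mask):
    """3x3 binary dilation: logical OR of the nine shifted copies."""
    padded = np.pad(mask, 1)
    rows, cols = mask.shape
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + rows, dx:dx + cols]
    return out

def close(mask):
    """Morphological closing: dilation followed by erosion."""
    return ~dilate(~dilate(mask))

def fill_holes(mask):
    """Fill background regions that are not connected to the image border."""
    labels, _ = label_blobs(~mask)
    border = np.concatenate([labels[0], labels[-1], labels[:, 0], labels[:, -1]])
    outside = np.isin(labels, border[border > 0])
    return ~outside

def important_blobs(gray, background_level=200, min_area=20):
    foreground = gray < background_level      # (2) background suppression
    labels, _ = label_blobs(foreground)       # (2) labeling
    areas = np.bincount(labels.ravel())
    areas[0] = 0                              # label 0 is the background
    kept = areas[labels] >= min_area          # drop small specks
    return fill_holes(close(kept))            # (3) morphology, (4) blob filling

# Hypothetical page: light background, one hollow dark square, one isolated speck.
page = np.full((40, 40), 255.0)
page[10:26, 10:26] = 50.0
page[12:24, 12:24] = 255.0
page[35, 35] = 50.0
mask = important_blobs(page)
```

On real pages, `scipy.ndimage.label`, `scipy.ndimage.binary_closing` and `scipy.ndimage.binary_fill_holes` provide the same operations far more efficiently.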
Picture Segmentation
4. Block analysis
a) SVM learning with a training set of positive and negative samples…
b) SVM classification on the pages…
Features: color (HSV and RGB histograms), texture (gradients), low-frequency coefficients
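A hedged sketch of the block analysis in Python with NumPy. The feature extractor covers an RGB histogram, gradient-based texture and low-frequency coefficients; the HSV histogram is omitted for brevity, and the use of the DCT for the low frequencies is an assumption. The slides do not state the SVM kernel or solver, so a linear SVM without bias, trained by batch subgradient descent on centered features, stands in for the real classifier.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0] /= np.sqrt(2.0)
    return m

def block_features(block, bins=4, n_low=3):
    """Feature vector for an RGB block (rows x cols x 3, values in 0..255)."""
    # Color: per-channel RGB histogram, each normalised to sum to one.
    color = [np.histogram(block[..., c], bins=bins, range=(0, 256))[0]
             / block[..., c].size for c in range(3)]
    gray = block.mean(axis=2)
    # Texture: mean absolute gradient along columns and rows.
    gy, gx = np.gradient(gray)
    texture = [np.abs(gx).mean() / 255.0, np.abs(gy).mean() / 255.0]
    # Low frequencies: top-left n_low x n_low DCT coefficients (assumed transform).
    rows, cols = gray.shape
    coeffs = dct_matrix(rows) @ (gray / 255.0) @ dct_matrix(cols).T
    low = coeffs[:n_low, :n_low].ravel() / np.sqrt(rows * cols)
    return np.concatenate([np.concatenate(color), texture, low])

def train_linear_svm(X, y, lam=0.01, lr=0.1, steps=200):
    """Batch subgradient descent on lam/2 * |w|^2 + mean hinge loss; y in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        violating = y * (X @ w) < 1
        grad = lam * w - (y[violating, None] * X[violating]).sum(axis=0) / len(y)
        w -= lr * grad
    return w

# a) Learning on hypothetical samples: colorful noise as positives ("picture"),
#    nearly flat parchment-colored blocks as negatives.
rng = np.random.default_rng(0)
pictures = [rng.integers(0, 256, size=(16, 16, 3)).astype(float) for _ in range(10)]
parchment = [np.array([225.0, 205.0, 165.0]) + rng.normal(0, 2, size=(16, 16, 3))
             for _ in range(10)]
X = np.array([block_features(b) for b in pictures + parchment])
y = np.array([1.0] * 10 + [-1.0] * 10)
center = X.mean(axis=0)
w = train_linear_svm(X - center, y)

# b) Classification: the sign of the decision value labels each block.
predictions = np.sign((X - center) @ w)
```

On real pages, each page would be tiled into blocks, `block_features` applied to every tile, and the sign of the decision value used to mark picture blocks.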
Results
Retrieval by similarity
Browsing with Sammon Mapping
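Both results rest on distances between feature vectors. The sketch below, in Python with NumPy, ranks items by Euclidean distance to a query and evaluates Sammon's stress, the quantity a Sammon mapping minimises when it lays items out in 2-D for browsing. The Euclidean metric and the function names are assumptions, and the optimisation of the layout itself is not shown.

```python
import numpy as np

def pairwise_distances(points):
    """Matrix of Euclidean distances between all rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def rank_by_similarity(features, query_index):
    """Indices of all other items, nearest to the query first."""
    distances = pairwise_distances(features)[query_index]
    order = np.argsort(distances, kind="stable")
    return order[order != query_index]

def sammon_stress(features, layout):
    """Sammon's stress between original distances and distances in the layout.

    Assumes all feature vectors are distinct (no zero original distance).
    """
    upper = np.triu_indices(len(features), k=1)
    d_high = pairwise_distances(features)[upper]
    d_low = pairwise_distances(layout)[upper]
    return ((d_high - d_low) ** 2 / d_high).sum() / d_high.sum()

# Hypothetical feature vectors that happen to lie in a plane.
features = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 2.0, 0.0],
                     [5.0, 5.0, 0.0]])
ranking = rank_by_similarity(features, 0)
stress_plane = sammon_stress(features, features[:, :2])  # distances preserved
stress_line = sammon_stress(features, features[:, :1])   # distances distorted
```

A layout that preserves all pairwise distances has zero stress; the more a layout distorts them, the larger the stress becomes.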
Conclusions
• We are studying a set of techniques to analyze the
Holy Bible of Borso d’Este
• Our goal is to produce a media station, available both locally
(museums) and remotely (web app), that lets people “touch” this
untouchable masterpiece
Thank you!