IMPACT Final Conference - NCSR - Border detection


Page 1: IMPACT Final Conference - NCSR - Border detection

IMPACT is supported by the European Community under the FP7 ICT Work Programme. The project is coordinated by the National Library of the Netherlands.

24-25 October 2011, London, UK

IMPACT Tools Developed by NCSR

IMPACT Final Conference 2011

B. Gatos
Computational Intelligence Laboratory
Institute of Informatics and Telecommunications
National Center for Scientific Research (NCSR) "Demokritos"
GR-153 10 Agia Paraskevi, Athens, Greece

Page 2: IMPACT Final Conference - NCSR - Border detection

Border_Detection_v4 [0|1] [infile] [outfile1] [outfile2]

parameter [0|1]: 0 -> border removal only, 1 -> border removal and page split

parameter [infile]: input filename (b/w or grayscale image)

parameters [outfile1] [outfile2]: output filenames (b/w or grayscale images)

+ web service implementation
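
The usage line above can be wrapped in a small helper when the tool is called from a script. This is a minimal sketch, not part of the tool itself: the helper name `build_command` and the file names are hypothetical, and it assumes all four arguments are always passed, since the slide does not say whether [outfile2] may be omitted in mode 0.

```python
def build_command(split_pages, infile, outfile1, outfile2,
                  executable="Border_Detection_v4"):
    """Return the argument list matching the usage line on the slide.

    split_pages=False -> mode 0 (border removal only)
    split_pages=True  -> mode 1 (border removal and page split)
    """
    mode = "1" if split_pages else "0"
    return [executable, mode, infile, outfile1, outfile2]


# Hypothetical file names, for illustration only.
cmd = build_command(True, "scan.tif", "left.tif", "right.tif")
print(" ".join(cmd))
```

If the executable is on the PATH, the resulting list can be handed to `subprocess.run(cmd, check=True)` from the Python standard library.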

Page 3: IMPACT Final Conference - NCSR - Border detection

We use projection profiles and a connected component labelling process to detect black borders

Signal cross-correlation is used in order to verify the detected noisy text areas
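
The two ingredients named above can be illustrated on a toy binary image. This is a sketch under stated assumptions, not the tool's implementation: the 0.8 threshold, the function names, and the use of a mean-centred, normalised correlation are all choices made for illustration, and the connected component labelling step is not shown.

```python
import math


def column_profile(image):
    """Vertical projection profile: number of black pixels (1s) per column."""
    return [sum(col) for col in zip(*image)]


def leading_border_width(profile, height, ratio=0.8):
    """Count leading columns whose black-pixel share is at least `ratio`.

    `ratio` is an illustrative threshold, not a value from the slides.
    """
    width = 0
    for count in profile:
        if count < ratio * height:
            break
        width += 1
    return width


def normalized_cross_correlation(a, b):
    """Mean-centred, normalised correlation of two equal-length signals."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0  # constant signals carry no information


# Toy 4x6 binary image: 1 = black, 0 = white; two black columns on the left.
image = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
]
profile = column_profile(image)
print(profile)
print(leading_border_width(profile, height=len(image)))
print(normalized_cross_correlation(image[0], image[3]))
```

A high correlation between neighbouring pixel rows or columns suggests a similar structure in both, which is one way such a signal could help decide whether a noisy region really contains text.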

Page 4: IMPACT Final Conference - NCSR - Border detection

We remove small noisy components

We detect vertical page zones based on vertical white run projections

Then, for every vertical zone we detect horizontal page zones based on horizontal white run projections.
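
The two-level zoning described above can be sketched as follows. This is one plausible reading, not the definition from the cited paper: here the white run projection of a row or column is the total length of its white runs that are at least `min_run` pixels long, and a zone boundary is any line that is entirely white. The names and the `min_run` value are hypothetical, and the removal of small noisy components is assumed to have been done already.

```python
def white_runs(line):
    """Lengths of maximal runs of white pixels (0s) in a row or column."""
    runs, current = [], 0
    for pixel in line:
        if pixel == 0:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def white_run_projection(lines, min_run):
    """For each line, sum the lengths of white runs of at least `min_run`."""
    result = []
    for line in lines:
        threshold = min(min_run, len(line))  # keep short lines meaningful
        result.append(sum(r for r in white_runs(line) if r >= threshold))
    return result


def zones(projection, full_length):
    """Index ranges (start, end) of consecutive lines that are not fully white."""
    result, start = [], None
    for i, value in enumerate(projection):
        if value < full_length and start is None:
            start = i
        elif value >= full_length and start is not None:
            result.append((start, i - 1))
            start = None
    if start is not None:
        result.append((start, len(projection) - 1))
    return result


def page_zones(image, min_run=2):
    """Vertical zones first, then horizontal zones inside each vertical zone.

    Returns (left, right, top, bottom) tuples with inclusive indices.
    """
    height = len(image)
    columns = list(zip(*image))
    found = []
    for left, right in zones(white_run_projection(columns, min_run), height):
        rows = [row[left:right + 1] for row in image]
        width = right - left + 1
        for top, bottom in zones(white_run_projection(rows, min_run), width):
            found.append((left, right, top, bottom))
    return found


# Toy double page: two black blocks separated by a white gutter (0 = white).
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(page_zones(image))
```

On the toy image the white gutter column separates two vertical zones, and each is then trimmed to the rows that contain black pixels.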

N. Stamatopoulos, B. Gatos, T. Georgiou, “Page frame detection for double page document images”, 9th IAPR International Workshop on Document Analysis Systems (DAS 2010), pp. 401-408, Cambridge, MA, USA, June 2010.

Page 5: IMPACT Final Conference - NCSR - Border detection

Page 6: IMPACT Final Conference - NCSR - Border detection

Page 7: IMPACT Final Conference - NCSR - Border detection

Page 8: IMPACT Final Conference - NCSR - Border detection

Page 9: IMPACT Final Conference - NCSR - Border detection

21709 images to test border removal

3003 newspaper images to test border removal

Av=4.3, Av=3.6

Rating scale, from 1 (Bad) to 5 (Good):

1. Final image almost destroyed!

2. Big part of text is missing.

3. Small part of text is missing.

4. All text is there, border not completely removed.

5. All text is there, border has been completely removed.

Page 10: IMPACT Final Conference - NCSR - Border detection

3009 images to test page split (results on 50%)

Av=3.3

Rating scale, from 1 (Bad) to 5 (Good):

1. Page split fails!

2. Page split with problems.

3. Page split is correct, large parts of noise remain or text is removed.

4. Page split is correct, small parts of noise remain or text is removed.

5. Page split is correct, only black noise has been removed.

Page 11: IMPACT Final Conference - NCSR - Border detection

Method        Measure   BL     BNE    BNF    BSB    JSI    NLB    ONB    TOTAL
              #images   3632   11126  12251  4784   4430   706    1789   38718
IMPACT        Prec (%)  99.49  99.89  98.88  98.10  98.91  99.86  97.29  99.08
              Rec (%)   98.83  99.26  99.40  96.07  99.06  99.73  97.82  98.79
              FM (%)    99.16  99.58  99.14  97.07  98.99  99.79  97.55  98.93
D.X. Le       Prec (%)  94.98  99.68  98.67  97.70  97.35  99.80  97.19  98.30
              Rec (%)   99.31  90.65  99.24  95.58  99.21  99.81  99.19  96.63
              FM (%)    97.10  94.95  98.26  96.63  98.27  99.80  98.18  97.30
BookRestorer  Prec (%)  91.13  96.88  98.08  97.29  94.50  99.79  95.12  96.47
              Rec (%)   99.56  91.57  99.77  97.43  99.40  99.85  99.61  97.06
              FM (%)    95.16  94.15  98.91  97.36  96.89  99.82  97.31  96.76
WiseBook      Prec (%)  86.93  88.57  91.20  95.76  90.69  99.46  80.37  90.20
              Rec (%)   98.37  99.47  99.10  96.40  97.29  98.45  98.63  98.56
              FM (%)    92.30  93.71  94.99  96.08  93.87  98.95  88.57  94.20
ScanFix       Prec (%)  81.65  92.87  91.29  95.62  91.00  99.24  84.52  91.17
              Rec (%)   94.97  98.66  98.66  97.81  95.66  80.81  96.98  97.46
              FM (%)    87.81  95.68  94.83  96.70  93.27  89.08  90.32  94.21

SET A: 38718 randomly selected historical images
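
The slides do not define FM. Assuming it is the standard F-measure, that is, the harmonic mean of precision and recall, the short sketch below reproduces reported values from the table to two decimals. The function name is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# IMPACT row of SET A: TOTAL column, then BL column.
print(round(f_measure(99.08, 98.79), 2))
print(round(f_measure(99.49, 98.83), 2))
```

These two calls give 98.93 and 99.16, matching the FM entries for IMPACT in the TOTAL and BL columns.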

Page 12: IMPACT Final Conference - NCSR - Border detection

SET B: 22383 images with noisy black border

Method        Measure   BL     BNE    BNF    BSB    JSI    NLB    ONB    TOTAL
              #images   1631   7543   7677   2417   1416   315    1384   22383
IMPACT        Prec (%)  98.94  99.88  98.29  96.86  98.01  99.98  96.63  98.62
              Rec (%)   98.18  99.27  99.26  93.14  99.15  99.87  98.24  98.46
              FM (%)    98.56  99.57  98.77  94.96  98.58  99.92  97.43  98.54
D.X. Le       Prec (%)  88.89  99.58  97.98  96.08  93.20  99.85  96.48  97.28
              Rec (%)   99.05  86.64  98.86  91.53  99.09  99.97  99.06  94.01
              FM (%)    93.70  92.66  98.42  93.75  96.05  99.91  97.75  95.62
BookRestorer  Prec (%)  80.30  95.46  97.00  95.27  84.26  99.83  93.76  94.11
              Rec (%)   99.36  93.02  99.68  95.22  99.56  99.96  99.62  96.92
              FM (%)    88.82  94.22  98.32  95.24  91.27  99.89  96.60  95.50
WiseBook      Prec (%)  70.98  83.24  86.10  92.24  72.44  99.18  74.77  83.32
              Rec (%)   99.38  99.49  99.58  95.19  98.36  98.61  99.09  98.94
              FM (%)    82.81  90.64  92.35  93.69  83.43  98.89  85.23  90.46
ScanFix       Prec (%)  59.23  89.57  86.23  91.96  73.38  99.04  80.14  85.00
              Rec (%)   95.42  98.78  99.03  96.54  98.55  80.61  97.67  98.04
              FM (%)    73.09  93.95  92.19  94.19  84.12  88.88  88.04  91.05

Page 13: IMPACT Final Conference - NCSR - Border detection

3009 images to test page split

Page 14: IMPACT Final Conference - NCSR - Border detection

458 images from BNF to test page split
