IMPACT Final Conference - NCSR - Character segmentation

IMPACT is supported by the European Community under the FP7 ICT Work Programme. The project is coordinated by the National Library of the Netherlands. 24-25 October 2011, London, UK IMPACT Tools Developed by NCSR IMPACT Final Conference 2011 B. Gatos Computational Intelligence Laboratory Institute of Informatics and Telecommunications National Center for Scientific Research ( NCSR) "Demokritos" GR-153 10 Agia Paraskevi, Athens, Greece


Page 1: IMPACT Final Conference - NCSR - Character segmentation

IMPACT is supported by the European Community under the FP7 ICT Work Programme. The project is coordinated by the National Library of the Netherlands.

24-25 October 2011, London, UK

IMPACT Tools Developed by NCSR

IMPACT Final Conference 2011

B. Gatos

Computational Intelligence Laboratory
Institute of Informatics and Telecommunications
National Center for Scientific Research (NCSR) "Demokritos"
GR-153 10 Agia Paraskevi, Athens, Greece

Page 2: IMPACT Final Conference - NCSR - Character segmentation


[Figure omitted; values shown on the slide: 0.21, 0.91]

Character_Segmentation_v3 [WordImageFilename] [XMLOutputFilename]

parameter [WordImageFilename]: an image containing a word

parameter [XMLOutputFilename]: several character segmentation variations, encoded following the XML schema of IBM used in TR3 (Adaptive OCR)
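As a rough illustration of the command line above, the tool could be driven from a script along these lines. This is only a sketch: the executable name is taken from the slide, but its location on disk, the file names `word.png` and `word_segmentation.xml`, and the helper functions `build_command` and `run_segmentation` are assumptions made for this example.

```python
import subprocess
from pathlib import Path
from typing import List

# Name as written on the slide; where the binary lives (and any file
# extension it needs on a given platform) is an assumption.
EXECUTABLE = "Character_Segmentation_v3"


def build_command(word_image: str, xml_output: str,
                  executable: str = EXECUTABLE) -> List[str]:
    """Return the argument list: executable, word image, XML output file."""
    return [executable, word_image, xml_output]


def run_segmentation(word_image: str, xml_output: str,
                     executable: str = EXECUTABLE) -> Path:
    """Run the tool on one word image and return the path of the XML output."""
    if not Path(word_image).is_file():
        raise FileNotFoundError(word_image)
    # check=True raises CalledProcessError if the tool exits with an error code.
    subprocess.run(build_command(word_image, xml_output, executable), check=True)
    return Path(xml_output)


# Hypothetical file names, shown only to illustrate the argument order.
print(build_command("word.png", "word_segmentation.xml"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.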


Page 3: IMPACT Final Conference - NCSR - Character segmentation


Merged characters

Broken characters

Overlapped characters

Noise


Page 4: IMPACT Final Conference - NCSR - Character segmentation


Calculation of the inner/outer skeleton

Classification of the skeleton parts

Detection of feature points

Construct all possible segmentation paths that result in characters whose width lies within the range [MinC*LettH, MaxC*LettH]. The following (MinC, MaxC) pairs are used: (0.3, 0.4), (0.4, 0.5), (0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 0.9). This yields several segmentation variations, with and without noise removal. Confidence is based on the difference between the average and the dominant height-to-width ratio.
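The width constraint can be sketched in a few lines. The code below is a simplified, one-dimensional illustration and not the skeleton-based method itself: it assumes that LettH denotes the letter height in pixels and that candidate cut positions (for example, derived from the feature points) are already available as x-coordinates. The names `width_limits` and `enumerate_segmentations` are hypothetical.

```python
from typing import List, Tuple

# (MinC, MaxC) pairs as listed on the slide.
PAIRS = [(0.3, 0.4), (0.4, 0.5), (0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 0.9)]


def width_limits(lett_h: float) -> List[Tuple[float, float]]:
    """Allowed character-width range [MinC*LettH, MaxC*LettH] for each pair."""
    return [(min_c * lett_h, max_c * lett_h) for min_c, max_c in PAIRS]


def enumerate_segmentations(cuts: List[int], word_width: int,
                            min_w: float, max_w: float) -> List[List[int]]:
    """List every subset of cut positions that splits [0, word_width] into
    pieces whose widths all lie in [min_w, max_w] (simplified 1-D stand-in
    for segmentation paths)."""
    positions = sorted({c for c in cuts if 0 < c < word_width}) + [word_width]
    results: List[List[int]] = []

    def extend(start: int, chosen: List[int]) -> None:
        if start == word_width:
            results.append(chosen[:-1])  # drop the right word boundary
            return
        for p in positions:
            if p > start and min_w <= p - start <= max_w:
                extend(p, chosen + [p])

    extend(0, [])
    return results


# One set of variations per (MinC, MaxC) pair, for an assumed letter height.
for min_w, max_w in width_limits(20):
    variations = enumerate_segmentations([5, 10, 15, 20, 25], 30, min_w, max_w)
    print(f"[{min_w:.0f}, {max_w:.0f}] -> {len(variations)} variation(s)")
```

Running the enumeration once per pair mirrors how the different (MinC, MaxC) settings produce several alternative segmentations of the same word.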

Segmentation variations encoded following the XML schema used in TR3

N. Nikolaou, M. Makridis, B. Gatos, N. Stamatopoulos, N. Papamarkos, "Segmentation of historical machine-printed documents using Adaptive Run Length Smoothing and skeleton segmentation paths", Image and Vision Computing, Vol. 28, Issue 4, pp. 590-604, 2010.

Page 5: IMPACT Final Conference - NCSR - Character segmentation


[Figure omitted; values shown on the slide: 0.61, 0.79, 0.85, 0.98, 0.94]


Page 6: IMPACT Final Conference - NCSR - Character segmentation


[Figure omitted; values shown on the slide: 0.83, 0.63, 0.73, 0.89, 0.90]


Page 7: IMPACT Final Conference - NCSR - Character segmentation


[Figure omitted; values shown on the slide: 0.61, 0.79, 0.94]

Evaluation of the result with the highest confidence
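Selecting the result to evaluate can be sketched as follows, assuming the three values on this slide (0.61, 0.79, 0.94) are the confidences of three segmentation variations. The `Variation` class, its labels, and `pick_highest_confidence` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variation:
    label: str         # hypothetical identifier for a segmentation variation
    confidence: float  # confidence value attached to the variation


def pick_highest_confidence(variations: List[Variation]) -> Variation:
    """Return the variation with the largest confidence value."""
    if not variations:
        raise ValueError("no segmentation variations given")
    return max(variations, key=lambda v: v.confidence)


# Values taken from the slide; the labels are made up for the example.
variations = [
    Variation("variation_1", 0.61),
    Variation("variation_2", 0.79),
    Variation("variation_3", 0.94),
]
best = pick_highest_confidence(variations)
print(best.label, best.confidence)
```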


Page 8: IMPACT Final Conference - NCSR - Character segmentation


[Figure omitted; values shown on the slide: 0.61, 0.79, 0.94]

Evaluation of the best possible result

