[IEEE 2008 IEEE/ACS International Conference on Computer Systems and Applications (AICCSA) - Doha,...


[IEEE 2008 IEEE/ACS International Conference on Computer Systems and Applications (AICCSA) - Doha, Qatar (2008.03.31-2008.04.4)]


Compression of the images of ancient Arab manuscript documents Based on segmentation

Walid Elloumi, Mohamed Chakroun, Moncef Charfi, Mohamed Adel Alimi

REGIM: Research Group on Intelligent Machines, University of Sfax, ENIS, BP W - 3038, Sfax, Tunisia

Abstract – This paper presents our contribution to the compression of images of old Arabic manuscripts. Our compression method is based on segmenting each image into three kinds of blocks: text, graphics and background; each kind of block then undergoes a different compression method. The work has two parts. First, we develop a feature extractor based on the wavelet transform and statistical characteristics, and we propose a segmentation algorithm based on the classification of old document images. Second, we present the study and implementation of a compression method using the wavelet transform. This method comprises three steps: compression of the text blocks, of the background blocks, and of the graphics blocks.

Key words- Wavelet, Image, Historical Document, color image Segmentation, Compression

1. Introduction

Ancient Arab manuscripts have numerous characteristics that prevent the use of traditional compression techniques for composite document images. Some of the problems are:

- Text appearing in the form of “waves”; the text on the back of the page shows through (overly acidic ink).
- Moisture stains absorbed by the paper, which make it illegible.
- Folds and tears.
- Variability of page layout and writing styles.
- Irregular spacing between lines.

All these defects cause a loss of the structural information of the documents, and thus make it harder to recognize their structure and therefore their content. The best document compression algorithms require that document images first be segmented into areas such as text, graphics, and background. In this article, we present a multi-layer compression algorithm for document images. The algorithm segments the document image into several classes and compresses each class with a specifically designed algorithm [3] [10] [11] [18] [19].

2. Segmentation

Our segmentation method works as follows [7] [8]:

- Segment the original image into two kinds of blocks, foreground and background, using multi-scale analysis [4] [9].

- Segment the foreground image into two kinds of blocks, text and graphics, using classification and statistical characteristics [5].

2.1. Separation between foreground and background of the original image

Based on the wavelet transform, the first system separates the foreground from the background of an Arab manuscript image.

It comprises three stages:

- The first separates the document image into foreground and background using blocks of 32x32 pixels.
- The second separates the foreground image obtained in the preceding stage into foreground and background using blocks of 4x4 pixels.
- The third and last stage corrects the badly classified blocks.

a) Step 1: Threshold separation of the foreground and background (block size 32x32 pixels)

This step consists of the following operations:

- Transform the color space of the original image from RGB to HSV [1] [2].
- Decompose the image into blocks of 32x32 pixels.
- Apply a wavelet decomposition to each block; we chose the Daubechies 4 (db4) wavelet with a decomposition level of 1.
- Linearize the HL (horizontal detail) and LH (vertical detail) blocks into a single vector V.
- Calculate the variance of the vector V.

The variance is defined as:

$$\operatorname{var}(V) = \frac{1}{n} \sum_{i=1}^{n} \left(V_i - \bar{V}\right)^2$$

With:
n: the length of the vector V
i: the index of each element of the vector V
V_i: element i of the vector V
\bar{V}: the average of the vector V

FIGURE 2 Level-1 decomposition with the Daubechies 4 wavelet

Sub-bands shown in FIGURE 2: LL, LH, HL, HH

978-1-4244-1968-5/08/$25.00 ©2008 IEEE


This stage of the algorithm separates the background blocks from the foreground blocks. If the variance of a block is lower than a threshold, the block is classified as background; otherwise it is classified as foreground [6]. We chose the threshold value based on preliminary experimental results.
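The block test of Step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a one-level Haar transform stands in for the db4 wavelet so that the example needs only the standard library, the V channel of HSV is assumed to be the channel analyzed, and the function names, the 4x4 demo block size and the threshold value are hypothetical.

```python
import colorsys
from statistics import pvariance


def value_channel(rgb_image):
    """Convert rows of (r, g, b) pixels in 0..255 to the V channel of HSV.
    Assumption: the paper does not say which HSV channel is analyzed."""
    return [[colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[2] for r, g, b in row]
            for row in rgb_image]


def haar_level1(block):
    """One-level 2D Haar transform (a stand-in for db4) of a block with even
    side lengths. Returns the sub-bands LL, LH, HL, HH as lists of rows."""
    half = len(block[0]) // 2
    # Filter along each row: pairwise averages (low) and differences (high).
    low = [[(r[2 * j] + r[2 * j + 1]) / 2 for j in range(half)] for r in block]
    high = [[(r[2 * j] - r[2 * j + 1]) / 2 for j in range(half)] for r in block]

    def along_columns(rows, sign):
        return [[(rows[2 * i][j] + sign * rows[2 * i + 1][j]) / 2
                 for j in range(len(rows[0]))]
                for i in range(len(rows) // 2)]

    # Sub-band naming conventions vary between texts; both detail bands
    # end up in the same vector V, so the choice does not affect the result.
    return (along_columns(low, 1), along_columns(low, -1),
            along_columns(high, 1), along_columns(high, -1))


def detail_variance(block):
    """Variance of the vector V built from the HL and LH coefficients."""
    _, lh, hl, _ = haar_level1(block)
    v = [c for band in (hl, lh) for row in band for c in row]
    return pvariance(v)  # (1/n) * sum((V_i - mean)^2)


def classify_blocks(channel, size, threshold):
    """Return a grid of booleans, one per block: True means foreground."""
    mask = []
    for top in range(0, len(channel) - size + 1, size):
        mask.append([
            detail_variance([row[left:left + size]
                             for row in channel[top:top + size]]) >= threshold
            for left in range(0, len(channel[0]) - size + 1, size)
        ])
    return mask


# Demo: the left half is uniform paper, the right half has vertical strokes.
image = [[(200, 200, 200)] * 4 + [(0, 0, 0), (255, 255, 255)] * 2
         for _ in range(8)]
mask = classify_blocks(value_channel(image), size=4, threshold=0.01)
```

In the demo, the uniform blocks have zero detail variance and fall below the threshold, while the striped blocks exceed it. A real run would use 32x32 blocks and a threshold tuned experimentally, as the text describes.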

FIGURE 3 Separation by threshold of the components foreground and background (size of blocks 32x32 pixels)

b) Step 2: Threshold separation of the foreground and background (block size 4x4 pixels)

This step repeats the operations of the first step with blocks of 4x4 pixels, in order to distinguish foreground from background more finely. It uses the following images from the preceding step: the foreground, the background, and the mask.

c) Step 3: Correction of the badly classified blocks

After the two preceding steps, we noticed that some blocks were badly classified: foreground blocks had been classified as background, and vice versa. This step corrects them.
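The text does not state which rule is used to correct badly classified blocks. As one possible illustration only, the sketch below flips a block whose label disagrees with every one of its neighbours in the block mask. The rule itself and the name `correct_isolated_blocks` are assumptions made for this example.

```python
def correct_isolated_blocks(mask):
    """Flip blocks whose label differs from all of their neighbours.

    `mask` is a grid of booleans (True = foreground). This isolated-block
    rule is a hypothetical stand-in for the paper's correction stage.
    """
    rows, cols = len(mask), len(mask[0])
    corrected = [row[:] for row in mask]
    for i in range(rows):
        for j in range(cols):
            # Collect the labels of the up-to-eight surrounding blocks.
            neighbours = [mask[a][b]
                          for a in range(max(0, i - 1), min(rows, i + 2))
                          for b in range(max(0, j - 1), min(cols, j + 2))
                          if (a, b) != (i, j)]
            if neighbours and all(n != mask[i][j] for n in neighbours):
                corrected[i][j] = not mask[i][j]
    return corrected


# A lone foreground block surrounded by background is relabelled.
noisy = [[False, False, False],
         [False, True, False],
         [False, False, False]]
cleaned = correct_isolated_blocks(noisy)
```

Decisions are taken on the original mask rather than on the partially corrected one, so the result does not depend on the scan order.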

Panels: foreground before correction, foreground after correction, background before correction, background after correction.

FIGURE 4 Final result of separation between background and foreground

2.2. Text/graphics separation based on statistical characteristics

This method separates the foreground image into two components: text and graphics. The separation is carried out with the FCM (fuzzy C-means) classifier.

The method consists of the following steps:

- Convert the color space of the original image from RGB to HSV.
- Develop a statistical feature extractor based on the mean and the standard deviation of each block.
- Use the extracted features to separate two classes, text and graphics, with the FCM classifier.
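The steps above can be sketched as follows: each block is reduced to a (mean, standard deviation) pair, and a small fuzzy C-means routine groups the pairs into two clusters. This is a minimal sketch assuming the standard FCM update rules with fuzzifier m = 2. The function names, the deterministic initialisation and the sample feature values are hypothetical, and the algorithm does not by itself say which cluster is text and which is graphics.

```python
import math
from statistics import mean, pstdev


def block_features(block):
    """Mean and standard deviation of the pixel values of one block."""
    pixels = [p for row in block for p in row]
    return (mean(pixels), pstdev(pixels))


def fuzzy_c_means(points, c=2, m=2.0, iterations=50):
    """Standard fuzzy C-means; returns (centres, memberships)."""
    # Deterministic start (an assumption): spread the initial centres
    # evenly over the sorted points.
    ordered = sorted(points)
    centres = [ordered[k * (len(ordered) - 1) // (c - 1)] for k in range(c)]
    memberships = []
    for _ in range(iterations):
        memberships = []
        for p in points:
            dists = [math.dist(p, v) for v in centres]
            if 0.0 in dists:
                # The point sits exactly on a centre: full membership there.
                hit = dists.index(0.0)
                u = [1.0 if i == hit else 0.0 for i in range(c)]
            else:
                weights = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(weights)
                u = [w / total for w in weights]
            memberships.append(u)
        # Each centre is the mean of the points weighted by membership**m.
        centres = [
            tuple(sum(u[i] ** m * p[d] for u, p in zip(memberships, points))
                  / sum(u[i] ** m for u in memberships)
                  for d in range(len(points[0])))
            for i in range(c)
        ]
    return centres, memberships


# Hypothetical (mean, std) features: three text-like, three graphics-like.
features = [(0.20, 0.30), (0.25, 0.28), (0.22, 0.33),
            (0.70, 0.05), (0.75, 0.08), (0.72, 0.06)]
centres, memberships = fuzzy_c_means(features)
labels = [max(range(2), key=u.__getitem__) for u in memberships]
```

Each block receives a membership degree for both clusters; taking the larger one gives a hard label when a binary text/graphics mask is needed.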

FIGURE 5 Result of separation between text and graphics (panels: Original Image, Text, Graphics)

FIGURE 6 Result of old document segmentation (panels: Original Image, Text, Background, Graphics)

3. Compression

3.1. Metric of compression

With the growing use of multimedia technologies, image compression must meet increasingly demanding performance requirements. Image compression techniques have recently undergone a true revolution: wavelet-based compression has appreciably reduced the quantity of information needed to store an image of very good quality. We use two compression metrics: Peak Signal-to-Noise Ratio (PSNR) and Mean Square Error (MSE) [15].

Flowchart of the 32x32 block separation: Original Image -> Decomposition into blocks of 32x32 pixels -> Wavelet decomposition (db4, level 1) of each block -> Linearization of blocks HL and LH -> Calculation of the variance of the resulting vector -> Var > threshold? (yes: Foreground; no: Background)


$$\mathrm{MSE} = \frac{1}{M \times N} \sum_{y=1}^{M} \sum_{x=1}^{N} \left(I'(x, y) - I(x, y)\right)^2$$

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}}\right)$$

I': image after compression
I: image before compression
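The two metrics can be computed as in the sketch below, assuming the standard definitions for 8-bit images: MSE is the mean of the squared pixel differences over the M x N image, and PSNR = 10 log10(255^2 / MSE). The function names are illustrative. These definitions agree with the MSE/PSNR pairs reported in the result tables if the PSNR values there are truncated to integers.

```python
import math


def mse(original, compressed):
    """Mean square error between two images given as lists of rows."""
    m, n = len(original), len(original[0])
    total = sum((compressed[y][x] - original[y][x]) ** 2
                for y in range(m) for x in range(n))
    return total / (m * n)


def psnr(mse_value, peak=255):
    """Peak signal-to-noise ratio in dB for a given MSE (8-bit peak by default)."""
    if mse_value == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse_value)


# A 2x2 image where one pixel changed by 2 grey levels: MSE = 4 / 4 = 1.
error = mse([[0, 0], [0, 0]], [[2, 0], [0, 0]])
# Reported pair for the background of Image 1: MSE 85, PSNR 28.
background_psnr = psnr(85)
```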

3.2. Our approach

Compressing the entire image at once has the major disadvantage of losing significant information, which degrades the document. We therefore compress the document “by part”: after segmentation, the document is decomposed into its three components (background, text and graphics), as shown in Figure 7. The compression of each component is based on wavelets, with a method specific to each part:

- The background is generally uniform, with only one color. It is compressed with loss by applying the Daubechies wavelet at level 5, which yields a better compression ratio.
- The text is compressed by applying the Haar wavelet at level 2. The original quality of the image is entirely preserved.
- The graphics are compressed with minimal loss by applying the Symmlet wavelet at level 2.

FIGURE 7 Our method of compression
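The path of one layer through this scheme (wavelet decomposition followed by the Huffman code stage named in the method diagram) can be sketched as below. This is an illustration under assumptions, not the authors' codec: it works on a single row of pixels with a level-2 Haar transform, discards detail coefficients below a hypothetical threshold, rounds the remaining coefficients to integers, and counts the bits of a Huffman encoding. All function names and parameter values are chosen for the example.

```python
import heapq
from collections import Counter


def haar_decompose(signal, levels):
    """Multi-level 1D Haar transform; the length must be divisible by 2**levels."""
    approx, details = list(signal), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / 2 for a, b in pairs])
        approx = [(a + b) / 2 for a, b in pairs]
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, starting from the coarsest detail band."""
    signal = list(approx)
    for band in reversed(details):
        signal = [v for s, d in zip(signal, band) for v in (s + d, s - d)]
    return signal


def drop_small_details(details, threshold):
    """Lossy step: zero every detail coefficient smaller than the threshold."""
    return [[d if abs(d) >= threshold else 0.0 for d in band] for band in details]


def huffman_code(symbols):
    """Build a prefix code {symbol: bit string} from symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie breaker, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


row = [100, 102, 101, 99, 20, 22, 200, 198]  # hypothetical pixel row
approx, details = haar_decompose(row, levels=2)
kept = drop_small_details(details, threshold=2)
symbols = [int(round(c)) for c in approx + [d for band in kept for d in band]]
code = huffman_code(symbols)
bits = sum(len(code[s]) for s in symbols)
restored = haar_reconstruct(approx, kept)
```

With a threshold of 0 the transform is exactly invertible, which corresponds to the lossless end of the trade-off. Raising the threshold produces more zero coefficients, so the Huffman stage spends fewer bits at the cost of a higher MSE.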

The following example applies our compression method to Image 1:

FIGURE 8 Compression of background (panels: Original Image 1, Background)

Size of original background = 2.25 MB
Size of compressed background = 12.6 KB
Compression ratio = 182:1

Component                          MSE    PSNR (dB)
Red component of background        85     28
Green component of background      85     28
Blue component of background       85     28

FIGURE 9 Compression of graphics level 2

Level 2:
Size of original graphics = 775 KB
Size of compressed graphics = 66.1 KB
Compression ratio = 11:1

Component                          MSE       PSNR (dB)
Red component of graphics          9.3913    38
Green component of graphics        7.3764    39
Blue component of graphics         9.9928    38

FIGURE 10 Compression of graphics level 3

Diagram of FIGURE 7: Original Image -> Segmentation -> Background (Wavelet Daubechies L 5), Text (Wavelet Haar L 2), Graphics (Wavelet Symmlet L 2) -> Huffman code -> Compressed Image


Level 3:
Size of original graphics = 775 KB
Size of compressed graphics = 19.2 KB
Compression ratio = 40.36:1

Component                          MSE        PSNR (dB)
Red component of graphics          19.0006    35
Green component of graphics        16.4454    36
Blue component of graphics         19.5043    35

We note that the compression ratio increases at level 3, but pixelation becomes visible. For this reason we chose level 2 for the graphics.

FIGURE 11 Compression of text

Size of original text = 327 KB
Size of compressed text = 2.84 KB
Compression ratio = 115:1

MSE       PSNR (dB)
6.3664    40
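The compression ratios reported for Image 1 can be checked from the listed sizes. The short sketch below assumes that the ratio is the original size divided by the compressed size and that 1 MB = 1024 KB. The second point is an assumption, but it is the reading that reproduces the 182:1 figure for the background. The names are illustrative.

```python
KB_PER_MB = 1024  # assumption: binary megabytes


def compression_ratio(original_kb, compressed_kb):
    """Original size divided by compressed size, both in KB."""
    return original_kb / compressed_kb


background = compression_ratio(2.25 * KB_PER_MB, 12.6)
graphics_level2 = compression_ratio(775, 66.1)
graphics_level3 = compression_ratio(775, 19.2)
text = compression_ratio(327, 2.84)
```

Under these assumptions the computed values are slightly above the reported integers for the background, the level-2 graphics and the text, which suggests that those ratios were truncated rather than rounded.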

We now consider a second example of an old Arab manuscript document (Image 2).

FIGURE 12 Segmentation of Image2 in three blocks

a) Compression of background

Size of original background = 1.47 MB
Size of compressed background = 8.67 KB
Compression ratio = 173:1

Component                          MSE       PSNR (dB)
Red component of background        0.3333    52
Green component of background      0.3333    52
Blue component of background       0.3333    52

b) Compression of graphics

Size of original graphics = 737 KB
Size of compressed graphics = 77.6 KB
Compression ratio = 9:1

Component                          MSE       PSNR (dB)
Red component of graphics          1.3851    46
Green component of graphics        0.6595    49
Blue component of graphics         0.4314    51

c) Compression of text

Size of original text = 457 KB
Size of compressed text = 54.2 KB
Compression ratio = 8:1

Component                          MSE       PSNR (dB)
Red component of text              1.1615    47
Green component of text            0.6020    50
Blue component of text             1.5535    46

3.3. Comparison between our method and JPEG2000

We observe that our method achieves a higher compression ratio than JPEG2000 while preserving high quality [12] [13] [14].

The following figures show the quality (PSNR) as a function of the compression ratio.

FIGURE 13 Comparison between our method and JPEG2000 for Image 1

Panel labels: Original Image, Graphics, Text, Background.

Plots (PSNR versus compression ratio): our approach, with compression ratios 11 (graphics), 115 (text) and 182 (background); JPEG 2000, with compression ratios 2 (graphics), 11 (text) and 71 (background).


FIGURE 14 Comparison between our method and JPEG2000 for Image 2

Conclusions:

We proposed a method for compressing images of old Arab manuscript documents based on segmentation:

- We segment the image into three areas: background, text, and graphics.
- We compress each area with a different method.

The segmentation relies on the wavelet transform, on classification by fuzzy C-means, and on statistical characteristics to separate the image into text, graphics and background. Each area can then be compressed with its own wavelet and decomposition level.

We also proposed a system for compressing composite documents that applies wavelets to the areas obtained by segmenting the document image, with a different algorithm for each component. The compression ratios obtained are encouraging: they reach 182 for the background, 11 for the graphics and 115 for the text.

References:

[1] M.W. Schwartz, W.B. Cowan and J.C. Beatty, “An experimental comparison of RGB, YIQ, LAB, HSV, and opponent color models”. ACM Transactions on Graphics, N° 6 Vol 2, 1987, pp. 123-158.

[2] M. Borsotti, P. Campadelli, and R. Schettini, “Quantitative evaluation of color image segmentation results”. Pattern Recognition Letters, 1998, pp. 741-747.

[3] H.D. Cheng, X.H. Jiang, Y. Sun and J. Wang, “Color image segmentation: advances and prospects”, Department of Computer Science, Utah State University, Logan, UT 84322-4205, USA 12 September 2001, pp. 2259-2281.

[4] W. Boussellaa, “Segmentation texte / graphique : Application au manuscrits Arabes Anciens”, CIFED 2006, pp. 139-144.

[5] S. Chuai-aree, “a statistical feature classification of text and image segmentation method”, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, Vol. 9, No. 6, 2001, pp. 661-671.

[6] C. Strouthopoulos, N. Papamarkos and A. Atsalakis, “Text extraction in complex color documents”, Pattern Recognition, Vol. 35, Issue 8, 2002, pp. 1743-1758.

[7] A. Carlos, B. Mello and D. Rafael, “Generation of Images of Historical Documents by Composition”, DocEng´02, November, 8, 2002, McLean, Virginia, USA.

[8] P. Lambert and H. Grecu, “A quick and coarse color image segmentation”, International Conference on Image Processing (ICIP) 2003, pp. 965-968.

[9] S. Zhixin and V. Govindaraju, “Historical Document Image Enhancement Using Background Light Intensity Normalization”, ICPR 2004.

[10] P. Haffner, L. Bottou, P.G. Howard and Y. LeCun, “Analyzing and compressing scanned documents for Internet distribution”, Scientific Literature Digital Library, 1999.

[11] H. Cheng and C.A. Bouman, “Multilayer document compression algorithm”, IEEE Image Processing, 1999, Vol. 1, pp. 244-248.

[12] M. W. Marcellin, M. Gormish, A. Bilgin, M. P. Boliek, “An Overview of JPEG2000”, Proc. of the Data Compression Conference, Snowbird, Utah, March 2000, pp. 523-544.

[13] M. W. Marcellin and D. S. Taubman, “JPEG2000: Image Compression Fundamentals, Standards, and Practice”, Kluwer International Series in Engineering and Computer Science, Secs 642.

[14] D. Taubman, “High performance scalable image compression with EBCOT”, IEEE Trans. On Image Processing, Vol. 9, No. 7, Jul. 2000, pp. 1158-1170.

[15] H.S. Malvar, “Fast Progressive Image Coding without Wavelets”, IEEE Data Compression Conference, Snowbird, Utah, March 2000.

[16] P.Y. Simard, H.S. Malvar and J. Rinker, “A foreground-background separation algorithm for image compression”, Data Compression Conference, 2004. Proceedings. DCC 2004.

[17] W. Boussellaa, A. Zahour, B. Taconet, A. Alimi, A. Benabdelhafid, “PRAAD: Preprocessing and Analysis Tool for Arabic Ancient Documents”, ICDAR 2007.

[18] M. Ben Halima, W. Boussellaa, M. Charfi, A. Alimi, “Restauration des images couleurs des documents arabes ancients basée sur les EPDs”, CIFED 2006, pp. 103-108.

[19] J. Marcelo Monte da Silva, R. Dueire Lins, “Color document synthesis as a compression strategy”, ICDAR 2007, pp. 466-470.

Plots of FIGURE 14 (PSNR versus compression ratio): our approach, with compression ratios 8 (text), 9 (graphics) and 173 (background); JPEG 2000, with compression ratios 1.7 (text), 1.8 (graphics) and 62 (background).