Compression of the image Adolf Knoll National Library of the Czech Republic.
Compression of the image
Adolf Knoll
National Library of the Czech Republic
General schemes for application of compression
The schemes adapt to the character of the represented objects:
- Bitonal image (1-bit, black-and-white)
- Colour photorealistic image
- Mixed document (the two above-mentioned components)
Trends
- Bitonal: from CCITT Fax Group 3 and 4 to JBIG variants
- Photorealistic
  - Lossless compression: PNG, TIFF/LZW
  - Lossy: from JPEG DCT to wavelet
- Mixed document: both applied (Mixed Raster Content, usually vertically)
How is it built into formats?
There have been attempts to accommodate it all in ISO TIFF (even JPEG, LZW, or PNG), but this is not sufficient because tools for conversion and display are lacking.
That is why other, more suitable formats are used: JPEG, PNG.
That is also why there is a lot of development in the area of mixed formats; they do not aim to become ISO standards.
Relevant directions
- Bitonal image: JBIG2 (ISO); no support (except Xerox), but many similar activities
- Photorealistic image: wavelet; JPEG2000 and many other non-ISO initiatives (WI, LWF, IW44, SID, Imagepower IW, …)
- Mixed content: DjVu, LDF, Imagepower MRC
Aims
- Image archiving: standardized archival format (TIFF, JPEG, PNG, …)
- Image delivery: more efficient modern format (JB2, MrSID, DjVu, LDF, …)
What will the relationship between the two be? It will be defined by the goal of the project.
Around compression
1. Pre-processing of the image
2. Compression
3. Encoding in a format
4. Decoding from the format
5. Decompression
6. Display or print-out
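The chain of steps around compression can be sketched roughly as follows. This is only an illustration: thresholding stands in for pre-processing, zlib (Deflate) stands in for the compression scheme, and the `TOY1` container with its width/height header is a made-up format, not one discussed in the slides.

```python
import struct
import zlib

def threshold(gray_rows, cutoff=128):
    """Pre-processing: turn 8-bit grey rows into 1-bit rows (1 = black)."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray_rows]

def pack_bits(rows):
    """Pack each row of 0/1 pixels into bytes, 8 pixels per byte, MSB first."""
    out = bytearray()
    for row in rows:
        for i in range(0, len(row), 8):
            chunk = row[i:i + 8]
            byte = 0
            for bit in chunk:
                byte = (byte << 1) | bit
            byte <<= 8 - len(chunk)  # pad the last byte of the row
            out.append(byte)
    return bytes(out)

def encode(rows):
    """Compression plus encoding in a hypothetical container format."""
    height, width = len(rows), len(rows[0])
    payload = zlib.compress(pack_bits(rows), 9)
    return b"TOY1" + struct.pack(">II", width, height) + payload

def decode(blob):
    """Decoding from the container plus decompression back to 0/1 rows."""
    if blob[:4] != b"TOY1":
        raise ValueError("not a TOY1 container")
    width, height = struct.unpack(">II", blob[4:12])
    packed = zlib.decompress(blob[12:])
    stride = (width + 7) // 8  # bytes per packed row
    rows = []
    for y in range(height):
        line = packed[y * stride:(y + 1) * stride]
        rows.append([(line[x // 8] >> (7 - x % 8)) & 1 for x in range(width)])
    return rows

gray = [[255] * 10, [0, 0, 255, 255, 0, 0, 255, 255, 0, 0], [255] * 10]
bitonal = threshold(gray)
restored = decode(encode(bitonal))
```

Because Deflate is lossless, `restored` equals `bitonal`; any loss in this sketch happens in the thresholding step, before compression.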
Pre-processing of the bitonal image - I
Efficient schemes build on dictionaries of pixel chunks (groups of pixels). For example, a page of text is an image that can be interpreted as several dozen images of letters. Each repeated occurrence of a letter can then be represented by its coordinates (x, y) and a reference to a dictionary that holds only one representation of similar letters (stored only once as a bitmap).
This method is called PATTERN MATCHING, but…
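A minimal sketch of the dictionary idea, assuming the page has already been segmented into letter bitmaps (the segmentation itself is not shown). The function names and the tiny 3x3 glyphs are invented for illustration, and only exactly identical bitmaps are merged here.

```python
def build_dictionary(symbols):
    """symbols: list of (x, y, bitmap); bitmap is a tuple of row strings.

    Each distinct bitmap is stored once; every occurrence becomes
    (x, y, index into the dictionary).
    """
    dictionary = []
    index_of = {}
    placements = []
    for x, y, bitmap in symbols:
        if bitmap not in index_of:
            index_of[bitmap] = len(dictionary)
            dictionary.append(bitmap)
        placements.append((x, y, index_of[bitmap]))
    return dictionary, placements

def render(dictionary, placements, width, height):
    """Rebuild the page by stamping dictionary bitmaps at their coordinates."""
    page = [[0] * width for _ in range(height)]
    for x, y, idx in placements:
        for dy, row in enumerate(dictionary[idx]):
            for dx, ch in enumerate(row):
                if ch == "#":
                    page[y + dy][x + dx] = 1
    return page

glyph_a = ("###", "#.#", "###")  # hypothetical letter shapes
glyph_b = ("#..", "#..", "###")
symbols = [(0, 0, glyph_a), (4, 0, glyph_b), (8, 0, glyph_a)]

dictionary, placements = build_dictionary(symbols)
page = render(dictionary, placements, width=11, height=3)
```

Three letters on the page need only two stored bitmaps; the third occurrence costs just its coordinates and an index.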
Pre-processing of the bitonal image - II
However, scanned texts contain a lot of noise in the individual pixel chunks that represent, for instance, letters.
Therefore, it is useful to reduce the differences between chunks that should be identified as identical:
- smoothing
- pixel flipping
- noise removal
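One simple form of pixel flipping can be sketched as below: a pixel is flipped when every one of its neighbours has the opposite value. This rule is an assumption chosen for brevity; real encoders use more refined criteria.

```python
def flip_isolated_pixels(rows):
    """Flip any pixel whose neighbours (up to 8) all have the opposite value."""
    height, width = len(rows), len(rows[0])
    cleaned = [row[:] for row in rows]
    for y in range(height):
        for x in range(width):
            neighbours = [
                rows[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            if neighbours and all(n != rows[y][x] for n in neighbours):
                cleaned[y][x] = 1 - rows[y][x]
    return cleaned

noisy = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # a single stray black pixel
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],  # a solid 2x2 block that should survive
]
cleaned = flip_isolated_pixels(noisy)
```

The stray pixel disappears while the 2x2 block is untouched, so two scans of the same letter are more likely to end up with identical bitmaps.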
Smoothing and pixel flipping
Problems in pattern matching
Česká republika
Low quality original and/or scan + inappropriate processing
Soft pattern matching
- Better use of dictionaries: a pixel chunk is replaced by a dictionary entry only where the threshold value for the chunk is satisfied
- If not, the whole small bitmap is stored
- Tuning of these mechanisms is key to the successful application of lossy compression to a bitonal image.
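The thresholding idea might look roughly like this. The sketch assumes the difference between two chunks is measured as the number of differing pixels (Hamming distance) and that an unmatched bitmap is simply added to the dictionary; both are simplifications, and the glyphs and the `max_diff` parameter are invented for illustration.

```python
def hamming(a, b):
    """Number of differing pixels between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def soft_match(symbols, max_diff):
    """Reuse a dictionary entry when it differs by at most max_diff pixels."""
    dictionary = []
    placements = []
    for x, y, bitmap in symbols:
        match = None
        for idx, entry in enumerate(dictionary):
            same_size = (len(entry) == len(bitmap)
                         and len(entry[0]) == len(bitmap[0]))
            if same_size and hamming(entry, bitmap) <= max_diff:
                match = idx
                break
        if match is None:  # threshold not satisfied: store the whole bitmap
            dictionary.append(bitmap)
            match = len(dictionary) - 1
        placements.append((x, y, match))
    return dictionary, placements

glyph_e = ("###", "#..", "###")        # hypothetical clean letter
glyph_e_noisy = ("###", "#.#", "###")  # same letter with one noisy pixel
glyph_c = ("###", "#..", "#..")        # a different letter, 2 pixels away
symbols = [(0, 0, glyph_e), (4, 0, glyph_e_noisy), (8, 0, glyph_c)]

strict_dict, _ = soft_match(symbols, max_diff=0)
tuned_dict, tuned_placements = soft_match(symbols, max_diff=1)
loose_dict, _ = soft_match(symbols, max_diff=2)
```

With `max_diff=0` nothing is shared, with `max_diff=1` the noisy letter reuses the clean one, and with `max_diff=2` the different letter is wrongly replaced as well, which is why tuning matters.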
How to know…
Libraries hold documents of varying quality, including very poor ones
These documents are more difficult to process than the good samples presented by software producers
Tests… tests… tests… on typical materials
Bitonal compression
- Lossless: LZW, PNG, …, CCITT Fax Group 3 and 4, JB2, JBIG, JBIG2, Algo Vision/Luratech (1-bit LDF component)
- Lossy modern schemes:
  - AT&T (Lizardtech) JB2: soft pattern matching
  - ImagePower Inc. JBIG2 (JB2): pattern matching only
  - Summus Inc. (Lightning Strike), ...
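As background for the fax-based schemes in the lossless group: CCITT Group 3 coding builds on run lengths of white and black pixels. The sketch below shows only the run-length step, following the convention that each row starts with a white run (possibly of length zero). The Huffman code tables and the two-dimensional modes of the real standards are left out.

```python
def run_lengths(row):
    """Encode a row of 0/1 pixels as alternating run lengths, white (0) first."""
    runs = []
    current, count = 0, 0
    for px in row:
        if px == current:
            count += 1
        else:
            runs.append(count)
            current, count = px, 1
    runs.append(count)
    return runs

def expand(runs):
    """Rebuild the row from alternating white/black run lengths."""
    row, colour = [], 0
    for n in runs:
        row.extend([colour] * n)
        colour = 1 - colour
    return row

row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
runs = run_lengths(row)             # [3, 2, 4, 1]
black_first = run_lengths([1, 1, 0])  # [0, 2, 1]
```

Long uniform runs, typical of scanned text on white paper, collapse into a handful of numbers, and the round trip through `expand` is lossless.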
GIF would be slightly worse than PNG
Květy české – 19th century Czech journal
Impact of the quality of digitized originals on performance of compression schemes
JB2
The most efficient compression scheme is JB2 from the DjVu format (AT&T).
It enables compression that is:
- lossless
- lossy
- aggressive, while preserving high quality
JB2 as a component part of the DjVu format
Several files can be merged and saved as one (as in PDF); they share a common dictionary, so the combined size is smaller than the sum of the individual files
Several files can be joined virtually (they are called one after another from the server)
Further advantages: display, references, OCR, … (DjVu plug-in)
Expensive or free software for Linux or Solaris
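The saving from a common dictionary can be illustrated by counting dictionary entries. In this toy sketch each page is reduced to a list of glyph labels, which is a strong simplification of real bitmap matching; the names and sample pages are hypothetical.

```python
def distinct_glyphs(pages):
    """Number of dictionary entries needed to cover all given pages."""
    return len({glyph for page in pages for glyph in page})

page_1 = ["a", "b", "c", "a"]
page_2 = ["a", "c", "d"]

# One dictionary per file versus one dictionary shared by the merged file.
separate_entries = distinct_glyphs([page_1]) + distinct_glyphs([page_2])
shared_entries = distinct_glyphs([page_1, page_2])
```

Here the two separate files need 6 entries in total, while the merged file needs only 4, because the letters the pages have in common are stored once.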
Samples and résumé
Monitor and test new approaches for image processing
They can be very suitable for:
- document delivery services
- image servers
- scanned content
Which formats to use for a bitonal image?
- If you have no special tools: GIF
- If you want smaller files, use PNG
- Both are recommended for the WWW
- However, TIFF/CCITT Fax Gr. 4 is better
- Use DjVu if you want very small files
Problems
Good image editing software does not support TIFF with Gr. 4 encoding
Display possible within normal Windows tools
GIF and PNG also support higher bit depths (8-bit; PNG also 24-bit); take care not to save a bi-level image at a higher bit depth
DjVu: the problem of authoring software still has to be solved
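To see why the bit depth warning matters, the uncompressed size of a page can be computed for each depth. The page dimensions below (roughly A4 at 300 dpi) are an assumed example, not a figure from the slides.

```python
def raw_size_bytes(width, height, bits_per_pixel):
    """Uncompressed size, with each row padded to a whole number of bytes."""
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height

# Assumed example: an A4 page at 300 dpi is about 2480 x 3508 pixels.
width, height = 2480, 3508
sizes = {depth: raw_size_bytes(width, height, depth) for depth in (1, 8, 24)}

for depth, size in sizes.items():
    print(f"{depth:>2}-bit: {size:>10} bytes")
```

The same bi-level page takes 8 times more raw data at 8 bits and 24 times more at 24 bits, and the compressor then has to win that overhead back.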
Lossy compression – bitonal image
Compression of colour images
- Lossless
  - LZW
    - GIF (8-bit only)
    - TIFF (5.0)
  - PNG
  - Wavelet
    - JPEG2000 (JP2)
    - …
- Lossy
  - DCT (JPEG)
  - Fractals
  - Wavelet
    - IW44
    - LWF, WI
    - JPEG2000 (JP2)
    - MrSID, …
Classical (LZW, RLE, DCT) versus wavelet approaches.
True colour image
DCT
wavelet
Testing compression efficiency
- Sample
- Reference
- Full-colour (JPEG, wavelet)
- 1-bit (establish thresholds: Paint Shop Pro, LuraWave)
- MRC (same sample: DjVu Solo)
Compression efficiency – bitonal image
Compression efficiency
True colour
How to apply compression?
It depends on the character of the objects in the image:
- Photorealistic image (JPEG, wavelet)
- Text and simple black-and-white graphics (Fax Group 4, JB2, …)
- Colour graphics (hard to compress with losses; lossless PNG or GIF is better; an application area for vector graphics: SVG)
- Mixed content (composed solutions: DjVu, LDF, …)
The most efficient solution
To segment images into two or more groups of objects:
1. Objects good for bitonal conversion
2. Objects good for true colour representation
To compress each group separately and then merge them into one format.
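The segment-then-merge idea can be sketched as below, assuming a plain global threshold decides what counts as a bitonal object. Commercial tools use far more elaborate segmentation, so the functions and the `cutoff` value here are purely illustrative.

```python
def segment(gray_rows, cutoff=96):
    """Split a grey page into a 1-bit foreground mask and a background layer."""
    mask = [[1 if px < cutoff else 0 for px in row] for row in gray_rows]
    background_pixels = [
        px
        for row, mask_row in zip(gray_rows, mask)
        for px, m in zip(row, mask_row)
        if not m
    ]
    # Fill the holes left by the text with the mean background value.
    fill = (sum(background_pixels) // len(background_pixels)
            if background_pixels else 255)
    background = [
        [fill if m else px for px, m in zip(row, mask_row)]
        for row, mask_row in zip(gray_rows, mask)
    ]
    return mask, background

def merge(mask, background, ink=0):
    """Recombine the layers: mask pixels are drawn in a single ink value."""
    return [
        [ink if m else px for px, m in zip(bg_row, mask_row)]
        for bg_row, mask_row in zip(background, mask)
    ]

page = [[200, 10, 210], [190, 20, 220]]
mask, background = segment(page)
rebuilt = merge(mask, background)
```

The mask can then go to a bitonal coder and the smooth background to a photographic one. The rebuilt page is not identical to the original, since the dark pixels come back as a single ink value.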
Horizontal segmentation/zoning
Horizontally:
- Text
- Graphics
- Photographs
Imagepower Inc.
Vertical segmentation/zoning
Vertically:
- Foreground
- Background
Lizardtech Inc. (AT&T), Luratech GmbH
DjVu, LDF
Comparison of DjVu and LDF
DjVu
- 6 layers
- Foreground: JB2, IW44
- Background: 4 layers, IW44
LDF
- 3 layers
- Foreground: LDF 1-bit Comp., LWF
- Background: 1 layer, LWF, JP2
Bitonal versus composed image
Grey level
Other DjVu properties
- Several images in one file (as in TIFF, PDF, LDF, …), using a common dictionary of pixel chunks
- Virtually: pages remain on the server and only the page that is requested is delivered
Multiresolution image
- MrSID
- Several images (up to 8) at various resolutions in one file
- Sample
- Efficient with an image server
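The notion of several resolutions of one image can be illustrated with a simple averaging pyramid. This is not how MrSID works internally (it is wavelet-based); the sketch only shows what a set of up to 8 resolution levels looks like, with invented function names.

```python
def halve(rows):
    """Average each 2x2 block into one pixel (odd edges are dropped)."""
    return [
        [
            (rows[y][x] + rows[y][x + 1]
             + rows[y + 1][x] + rows[y + 1][x + 1]) // 4
            for x in range(0, len(rows[0]) - 1, 2)
        ]
        for y in range(0, len(rows) - 1, 2)
    ]

def build_pyramid(rows, max_levels=8):
    """Return the full image followed by successively halved versions."""
    levels = [rows]
    while (len(levels) < max_levels
           and len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2):
        levels.append(halve(levels[-1]))
    return levels

image = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 12, 12],
    [4, 4, 12, 12],
]
pyramid = build_pyramid(image)
```

An image server can answer a request for an overview from a small level and fetch the full-resolution level only when the user zooms in.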
SAMPLES
Samples of various compression solutions