Entropy and some applications in image processing

Neucimar J. Leite
Institute of Computing

[email protected]

Outline

• Introduction
  – Intuitive understanding

• Entropy as global information

• Entropy as local information
  – edge detection, texture analysis

• Entropy as minimization/maximization constraints
  – global thresholding
  – deconvolution problem

Information Entropy (Shannon's entropy)

An information theory concept closely related to the following question:

- What is the minimum amount of data needed to represent a given information content?

• For images (compression problems):

- How little data is sufficient to completely describe an image without (much) loss of information?

Intuitive understanding:

- relates the amount of uncertainty about an event to a given probability distribution

Event: randomly draw a ball

[Urn examples: high uncertainty (entropy is maximum), low uncertainty, no uncertainty (entropy is minimum)]

Example 1:

Event: a coin flip, E = { heads, tails }

Probability: P(heads) = P(tails) = 1/2

self-information: -log2 P(E) = log2 2 = 1 bit

coding: heads → 0, tails → 1

self-information is inversely related to the probability of E

Self-information:

- Units of information used to represent an event E

Example 2:

Event: randomly draw one of 8 equally likely balls

P(ball_k) = 1/8, k = 1, ..., 8

self-information = -log2 P(E) = log2 8 = 3 bits
(the amount of information conveyed by the event E)

coding the balls (3 bits/ball): 000, 001, 010, 011, 100, 101, 110, 111

Entropy: average information

H = Σ_k P(ball_k) log2( 1/P(ball_k) ) = 8 × (1/8) × log2 8 = 3 bits/ball

Degree of information compression:

equal-length binary code: log2 8 = 3 bits/ball

C_R = (original code length) / (optimized code length),  R = 1 - 1/C_R

for independent data, the optimized code length is bounded below by the Entropy

medium uncertainty (5 red, 1 black, 1 blue, 1 green balls):

H = -( 5/8 log2(5/8) + 1/8 log2(1/8) + 1/8 log2(1/8) + 1/8 log2(1/8) ) = 1.54 bits/ball

no uncertainty (all balls the same color):

H = -1 × log2(1) = 0

equal-length code for the four colors (2 bits/ball): 00, 01, 10, 11

• 2 bits/ball > 1.54 bits/ball → code redundancy!
  C_R = 2/1.54 = 1.29 and R = 1 - 1/1.29 ≈ 22%

We need an encoding method for eliminating this code redundancy.
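As a quick check on these numbers, here is a minimal Python sketch (not part of the original slides) that computes the entropy of a discrete distribution and the resulting compression ratio and redundancy for the ball examples.

```python
# Minimal sketch: Shannon entropy of a discrete distribution, plus the
# compression ratio C_R and relative redundancy R = 1 - 1/C_R for a
# fixed-length 2-bit code (illustrative helper, not from the slides).
import math

def entropy(probs):
    """H = -sum p * log2(p), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([1/8] * 8))                   # 8 equally likely balls -> 3.0 bits/ball
H = entropy([5/8, 1/8, 1/8, 1/8])           # 5 red, 1 black, 1 blue, 1 green
print(round(H, 2))                          # -> 1.55 (the slides quote 1.54)
C_R = 2 / H                                 # 2-bit equal-length code vs. entropy
print(round(C_R, 2), round(1 - 1/C_R, 2))   # -> 1.29, 0.23 (roughly 22% redundancy)
```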


The Huffman encoding:

Ball     Probability   Reduction 1   Reduction 2
red      5/8   (1)     5/8  (1)      5/8  (1)
black    1/8   (00)    2/8  (01)     3/8  (0)
blue     1/8   (011)   1/8  (00)
green    1/8   (010)

variable-length code:

red   = 1
black = 00
blue  = 011
green = 010

average code length: L = (5/8)×1 + (1/8)×2 + (1/8)×3 + (1/8)×3 = 13/8 = 1.62 bits/ball

C_R = 2/1.62 = 1.23 and R = 1 - 1/1.23 ≈ 18.6%
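A small Huffman-coding sketch (an assumed implementation using Python's heapq, not the slides' table procedure) that reproduces the same code lengths and the 13/8 = 1.62 bits/ball average; ties among the 1/8 symbols may be broken differently, so the exact bit strings can vary while the lengths match.

```python
import heapq

def huffman_codes(probs):
    """probs: dict symbol -> probability; returns dict symbol -> bit string."""
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)               # two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"red": 5/8, "black": 1/8, "blue": 1/8, "green": 1/8}
codes = huffman_codes(probs)
avg = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, avg)   # one 1-bit, one 2-bit, two 3-bit codes -> 1.625 bits/ball
```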

512 x 512 8-bit image:

Entropy: 4.11 bits/pixel

After Huffman encoding: C_R = 8/4.14 = 1.93

Variable-length coding does not take advantage of the high pixel-to-pixel correlation in images: a pixel can be predicted from the values of its neighbors → more redundancy → lower entropy (bits/pixel)

Entropy: 7.45

After Huffman encoding: C_R = 8/7.46 = 1.07

Entropy: 7.35

After Huffman encoding: C_R = 8/7.39 = 1.08

Coding the interpixel differences to highlight these redundancies:

Entropy: 4.73 instead of 7.45

After Huffman encoding: C_R = 8/5.11 = 1.56 instead of 1.07

Entropy: 5.97 instead of 7.35

After Huffman encoding: C_R = 8/5.97 = 1.34 instead of 1.08
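The effect described above can be reproduced with a short numpy sketch (assumed helpers, not from the slides): the first-order entropy of the gray levels versus the entropy of the horizontal interpixel differences, which is typically much lower for natural images.

```python
import numpy as np

def image_entropy(values):
    """First-order entropy (bits/pixel) estimated from the histogram."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def difference_entropy(img):
    """Entropy of the horizontal pixel-to-pixel differences."""
    diff = np.diff(img.astype(np.int16), axis=1)   # int16 avoids uint8 wrap-around
    return image_entropy(diff)

# usage with any 8-bit grayscale image 'img' (2-D numpy array):
# print(image_entropy(img), difference_entropy(img))
```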

Entropy as local information: the edge detection example

Edge detection examples (convolution masks):

Roberts:
   -1  0        0 -1
    0  1        1  0

Sobel:
   -1 -2 -1     -1  0  1
    0  0  0     -2  0  2
    1  2  1     -1  0  1

Entropy-based edge detection

• Low entropy values → low frequencies → uniform image regions

• High entropy values → high frequencies → image edges

Binary entropy function:

Regions {R1, R2}:  P(R1) = p,  P(R2) = 1 - p

H(p) = -p log2(p) - (1 - p) log2(1 - p)

[Plot: entropy H versus p; H = 0 at p = 0 and p = 1, maximum H = 1.0 at p = 0.5]

Isotropic edge detection

H in a 3x3 neighborhood:

5x5 neighborhood:

7x7 neighborhood:

9x9 neighborhood:
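A sketch of this isotropic, entropy-based edge detector: the local entropy of the gray-level histogram is computed in an n x n sliding window (window size and border handling are illustrative choices, not specified in the slides).

```python
import numpy as np

def local_entropy(img, n=3):
    """Entropy of the gray-level histogram inside each n x n neighborhood."""
    pad = n // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            window = padded[i:i + n, j:j + n]
            _, counts = np.unique(window, return_counts=True)
            p = counts / counts.sum()
            out[i, j] = -(p * np.log2(p)).sum()
    return out   # high values ~ edges, low values ~ uniform regions
```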

Texture Analysis

• Similarity grouping based on brightness, color, slope, size, etc.

• The perceived patterns of lightness, directionality, coarseness, regularity, etc. can be used to describe and segment an image

Texture description: statistical approach

• Characterizes textures as smooth, coarse, periodic, etc.

- Based on the intensity histogram (probability density function)

Descriptors examples:

• Mean: a measure of average intensity

  m = Σ_{i=0}^{L-1} z_i p(z_i)

  z_i = random variable denoting gray levels
  p(z_i) = the intensity histogram in a region

• Other moments of different orders:

  μ_n(z) = Σ_{i=0}^{L-1} (z_i - m)^n p(z_i)

  - e.g., the standard deviation σ(z) = sqrt(μ_2(z)) = sqrt(σ²): a measure of average contrast

• Entropy: a measure of randomness

  e = -Σ_{i=0}^{L-1} p(z_i) log2 p(z_i)
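These three descriptors can be computed from a region's normalized histogram with a few lines of numpy; the sketch below is an illustrative helper, assuming an 8-bit region.

```python
import numpy as np

def texture_descriptors(region, levels=256):
    """Mean, standard deviation, and entropy from the region's histogram p(z_i)."""
    z = np.arange(levels)
    hist = np.bincount(region.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()                        # p(z_i)
    m = (z * p).sum()                            # average intensity
    sigma = np.sqrt(((z - m) ** 2 * p).sum())    # average contrast
    nz = p[p > 0]
    e = -(nz * np.log2(nz)).sum()                # randomness (entropy)
    return m, sigma, e
```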

Texture     Average intensity   Average contrast   Entropy
smooth      87.5                10.8               5.3
coarse      121.2               74.2               7.8
periodic    99.6                34.0               6.5

[Example textures: smooth, coarse, periodic]

Descriptors and segmentation:

?

Gray-level co-occurrence matrix: Haralick's descriptors

• Conveys information about the positions of pixels having similar gray level values.

Example image (gray levels):

  1 2 1 3 3 2 1
  2 3 2 2 2 1 1
  3 3 2 2 1 1 3
  1 3 1 2 1 1 3

Co-occurrence matrix Md(a,b), d = 1:

       0  1  2  3  4
  0    0  0  0  0  0
  1    0  4  2  1  0
  2    0  3  3  2  0
  3    0  1  2  3  0
  4    0  0  0  0  0

For the descriptor H:

  H = -Σ_i Σ_j Md[i,j] log Md[i,j]

  large empty spaces in M → little information content
  cluttered areas → large information content

Md = the probability that a pixel with gray level i will have a pixel with level j at a distance of d pixels away in a given direction

d = 2, horizontal direction
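A plain-numpy sketch of the co-occurrence matrix and its entropy descriptor H for the horizontal direction at distance d (an illustrative helper assuming an integer 8-bit image; the slides do not prescribe an implementation).

```python
import numpy as np

def glcm_entropy(img, d=1, levels=256):
    """H = -sum_{i,j} M_d[i,j] log2 M_d[i,j] for horizontal pairs at distance d."""
    M = np.zeros((levels, levels), dtype=float)
    for i, j in zip(img[:, :-d].ravel(), img[:, d:].ravel()):
        M[i, j] += 1                 # count the co-occurring pair (i, j)
    M /= M.sum()                     # normalize to joint probabilities
    nz = M[M > 0]
    return float(-(nz * np.log2(nz)).sum())
```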

Obviously, more complex texture analysis based on statistical descriptors should consider combinations of information related to image scale, moments, contrast, homogeneity, directionality, etc.

Entropy as minimization/maximization constraints

Global thresholding examples: the mean, histogram peaks

For images with levels 0-255:

The probability that a given pixel will have a value less than or equal to t is:

P_t = Σ_{i=0}^{t} p_i

Now considering:

Class A: { p_0/P_t, p_1/P_t, ..., p_t/P_t }

Class B: { p_{t+1}/(1-P_t), p_{t+2}/(1-P_t), ..., p_255/(1-P_t) }

The optimal threshold is the value of t that maximizes

H(t) = H_b(t) + H_w(t),

where

H_b(t) = -Σ_{i=0}^{t} (p_i/P_t) log( p_i/P_t )

H_w(t) = -Σ_{i=t+1}^{255} (p_i/(1-P_t)) log( p_i/(1-P_t) )
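A sketch of this maximum-entropy thresholding rule, exhaustively searching t over the normalized histogram (an assumed implementation, equivalent in spirit to the formulas above).

```python
import numpy as np

def max_entropy_threshold(img):
    """Return the t in [0, 255] that maximizes H(t) = Hb(t) + Hw(t)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(255):
        Pt = p[:t + 1].sum()
        if Pt <= 0 or Pt >= 1:
            continue
        a = p[:t + 1][p[:t + 1] > 0] / Pt          # class A distribution
        b = p[t + 1:][p[t + 1:] > 0] / (1 - Pt)    # class B distribution
        h = -(a * np.log2(a)).sum() - (b * np.log2(b)).sum()
        if h > best_h:
            best_t, best_h = t, h
    return best_t
```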

Examples:

Entropy as a fuzziness measure

In fuzzy set theory an element x belongs to a set S with a certain degree of membership p_x, defined by a membership function p_x(x)

Example of a membership function for a given threshold t:

p_x(x) = 1 - |x - μ_0| / C   if x ≤ t
p_x(x) = 1 - |x - μ_1| / C   if x > t

p_x(x) gives the degree to which x belongs to the object or the background, whose gray-level averages are μ_0 and μ_1, respectively.

How can the degree of fuzziness be measured?

Example: t = 0 for a binary image → fuzziness = 0

Using Shannon's function (for two classes):

H_f(p_x) = -p_x log(p_x) - (1 - p_x) log(1 - p_x),

the entropy of an entire fuzzy set of dimension M x N is

E(t) = (1/MN) Σ_i H_f(p_x(i)) hist(i)

and for segmentation purposes, the threshold t is chosen such that E(t) is minimum → t minimizes the fuzziness
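A sketch of this fuzziness-minimization threshold. The class averages μ0 and μ1 are computed from the histogram; the constant C = (max gray level - min gray level) is an assumed choice, since the slides leave it unspecified.

```python
import numpy as np

def min_fuzziness_threshold(img):
    """Return the t that minimizes the fuzzy entropy E(t)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    x = np.arange(256, dtype=float)
    C = float(img.max() - img.min())              # assumed normalization constant
    best_t, best_e = 0, np.inf
    for t in range(1, 255):
        w0, w1 = hist[:t + 1].sum(), hist[t + 1:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (x[:t + 1] * hist[:t + 1]).sum() / w0    # average below t
        mu1 = (x[t + 1:] * hist[t + 1:]).sum() / w1    # average above t
        px = np.where(x <= t, 1 - np.abs(x - mu0) / C, 1 - np.abs(x - mu1) / C)
        px = np.clip(px, 1e-9, 1 - 1e-9)
        Hf = -px * np.log2(px) - (1 - px) * np.log2(1 - px)   # Shannon function
        E = (Hf * hist).sum() / hist.sum()                    # E(t)
        if E < best_e:
            best_t, best_e = t, E
    return best_t
```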

Segmentation examples

Maximum Entropy Restoration: the deconvolution problem

The image degradation model:

g(x,y) = f(x,y) * h(x,y) + η(x,y)

[Block diagram: f(x,y) → h(x,y) → + (noise η) → degraded image g(x,y)]

The restoration problem:

Given g, h, and η, we can find an estimate f̂ such that the residual g - h * f̂ is consistent with the noise η.

• Since there may exist many functions f̂ satisfying the above constraint, we can consider entropy maximization as an additional constraint for an "optimum" restoration.
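One simple way to realize this idea (a rough sketch, not the slides' algorithm) is to descend the penalized objective ||h*f - g||² - λ·S(f), where S(f) = -Σ f log f is the image entropy; the step size, λ, the Gaussian PSF, and the stopping rule are illustrative assumptions.

```python
import numpy as np
from scipy.signal import fftconvolve

def max_entropy_restore(g, h, lam=0.01, step=0.1, n_iter=200, eps=1e-8):
    """Gradient descent on ||h*f - g||^2 - lam * S(f), with f kept positive."""
    f = np.clip(g.astype(float), eps, None)       # positive initial estimate
    h_flip = h[::-1, ::-1]                        # adjoint (correlation) kernel
    for _ in range(n_iter):
        residual = fftconvolve(f, h, mode="same") - g
        grad_data = 2.0 * fftconvolve(residual, h_flip, mode="same")
        grad_ent = lam * (np.log(f) + 1.0)        # gradient of -lam * S(f)
        f = np.clip(f - step * (grad_data + grad_ent), eps, None)
    return f

# usage: f_hat = max_entropy_restore(g, h) with g the degraded image and
# h a normalized estimate of the blurring PSF (e.g., a small Gaussian kernel).
```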

[Result images: original, degraded, and entropy-restored; comparison with other restoration methods: Wiener, Lucy-Richardson]

Conclusions

• Entropy has been extensively used in various image processing applications.

• Other examples concern distortion prediction, image evaluation, registration, multiscale analysis, high-level feature extraction and classification, etc.