
Steganalysis Against Difference Expansion Based Reversible Data Hiding Schemes for 2D Vector Maps Shangping Zhong, Bin Liao, Guolong Chen

International Journal of Advancements in Computing Technology, Volume 3, Number 3, April 2011

Steganalysis Against Difference Expansion Based Reversible Data Hiding Schemes for 2D Vector Maps

1 Shangping Zhong, 2 Bin Liao, 3 Guolong Chen

1, First Author and Corresponding Author Department of Computer Science and Technology, Fuzhou University, Fuzhou, China, 350108, [email protected]

2,3 Department of Computer Science and Technology, Fuzhou University, Fuzhou, China, 350108, [email protected]

doi:10.4156/ijact.vol3.issue3.6

Abstract

The difference expansion based (DE-based) reversible data hiding schemes suit different types of vector maps and show good performance in both capacity and invisibility. In addition, the schemes are strictly reversible. Potential applications of these data hiding schemes include map data authentication and secret communication. However, a loophole exists in the DE-based data hiding schemes. By analyzing the DE-based embedding behavior and the effect of embedding on the histogram of coordinate differences or Manhattan distance differences, this paper finds unusual gaps in the histogram which reveal the presence of secret data. Furthermore, by deriving and analyzing the coordinate or Manhattan distance difference histogram of a stego-map, we obtain a formula for the estimated embedding rate, and estimate the length of the hidden data by means of a Laplace curve fitting function. Computing results on practical map data show that this paper's scheme is applicable to vector maps represented by polygons or polygonal lines. Moreover, it is possible to extend this paper's scheme to other data sets, e.g., 3D polygonal meshes or images.

Keywords: Steganalysis, Difference Expansion Embedding Behavior, Histogram Analysis, Laplace Curve Fitting Function, Reversible Data Hiding.

1. Introduction

Nowadays, applications of 2D vector maps have been increasing rapidly. For the purposes of copyright protection, integrity authentication, or secret communication, the technique of data hiding has been introduced into 2D vector maps (e.g., [1]-[3]). However, due to the strict application requirements of vector maps, modifications to map data are generally undesired. Therefore, reversible schemes are more appropriate for hiding data in vector maps because the distortions can be removed after the hidden data have been extracted [4].

Although quite a few irreversible and reversible data hiding algorithms for images have been proposed (e.g., [5]-[11]), few works have focused on reversible data hiding algorithms for 2D vector maps. M. Voigt et al. [12] first proposed a method of reversibly hiding data in vector maps. They hide the data by modifying the integer discrete cosine transform (DCT) coefficients of the map coordinates. The distortion controlling mechanism in their scheme seems to be complex since the scheme is realized in the transform domain. XiaoTong Wang et al. [4] presented two reversible data-hiding schemes based on the idea of difference expansion (DE). The two proposed schemes suit different types of maps and show good performance in both capacity and invisibility. In addition, the two schemes are strictly reversible. But the robustness of the two schemes is very weak. They can only resist distortions with very low amplitudes, and they are fragile to map simplification and interpolation, which could destroy the synchronization of the data extraction. This low robustness limits the two schemes for robust watermarking applications. So, the potential applications of the two proposed schemes include map data authentication, secret communication, and so on. The difference expansion transform, invented by Tian [9], is an outstanding reversible data-hiding scheme in terms of high embedding capacity and low distortion in image quality. There are different variants or extensions of the DE-based reversible steganographic method (e.g., [8], [13]).




As we know, steganography is one of the important branches of data hiding. The purpose of steganography is to send secret messages under the cover of a carrier signal, i.e., secret communication. On the other hand, steganalysis is the set of techniques that aim to distinguish between cover-objects and stego-objects, or go one step further and estimate some parameters of the embedded message such as its length or location. Several approaches have been proposed to solve the image steganalysis problem and we can broadly classify them into the following groups [14]: supervised learning based steganalysis (e.g., [15]), blind identification based steganalysis (e.g., [16]), parametric statistical steganalysis (e.g., [17]), and hybrid techniques. Each of these methodologies has pros and cons. Therefore, it is up to the user (steganalyst) to choose an appropriate methodology [14]. To our knowledge, there is no work which focuses on steganalysis schemes against DE-based reversible steganographic methods.

In this paper, we propose a steganalysis method which attacks and successfully identifies the existence of embedding done by the two DE-based reversible data-hiding schemes for 2D vector maps [4]. Our steganalysis method can even estimate the payload size. In section 2 of the paper, the two DE-based reversible data-hiding schemes for 2D vector maps [4] are briefly reviewed. Our proposed steganalysis method is presented in section 3. In section 4, simulation results from the application of the proposed steganalysis are presented. The conclusion of the paper is given in section 5.

2. DE-based reversible data-hiding schemes for 2D vector maps

Of the two DE-based reversible data-hiding schemes for 2D vector maps [4], a coordinate-based scheme is proposed first. In this scheme, the correlation of map coordinates is utilized for data hiding. The hidden data are embedded based on the DE transform by changing the coordinate differences between the adjacent vertices of the map. To improve the capacity and invisibility in maps whose coordinates exhibit low correlation, the paper [4] further uses the Manhattan distances between adjacent vertices as the cover data to implement a distance-based scheme. The original map is first divided into embedding units by taking every three consecutive vertices as a unit. Within a unit, two Manhattan distances from the middle vertex to its two neighbor vertices are calculated by a set of invertible integer mappings. Then, a hidden bit can be embedded based on the DE transform by modifying the difference between the two Manhattan distances.

2.1. Basic idea of difference expansion(DE)

The basic idea of DE is to utilize the high correlation of the cover data. Given a pair of adjacent elements of highly correlated cover data (denoted by $x_1$ and $x_2$, which are both integers), an integer transform is defined to calculate their difference ($d$) and integer-mean ($m$), as shown in (1):

$$d = x_1 - x_2, \qquad m = \left\lfloor \frac{x_1 + x_2}{2} \right\rfloor \tag{1}$$

The transform is strictly invertible and (2) is its inverse transform:

$$x_1 = m + \left\lfloor \frac{d + 1}{2} \right\rfloor, \qquad x_2 = m - \left\lfloor \frac{d}{2} \right\rfloor \tag{2}$$




To hide data bits, $x_1$ and $x_2$ are first transformed into $d$ and $m$ by (1). The high correlation of the cover data means that the two elements $x_1$ and $x_2$ are generally very close (i.e., their difference $d$ is very small in most cases). So it is possible to provide $i$ bits for placing the hidden data by left shifting $d$ by $i$ bits (namely, expanding $d$ to $2^i$ times its value) while keeping the induced distortions acceptable. If the expanded difference carrying $i$ hidden bits is denoted as $d'$, the embedded elements $x_1'$ and $x_2'$ can be calculated from $d'$ and $m$ via (2).
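The transform pair (1)-(2) and the one-bit expansion can be sketched in a few lines. This is a minimal illustration, assuming a single hidden bit per pair (i.e., $d' = 2d + b$); the function names are hypothetical and not taken from [4] or [9].

```python
def de_forward(x1: int, x2: int) -> tuple[int, int]:
    """Equation (1): difference d and integer-mean m of an integer pair."""
    return x1 - x2, (x1 + x2) // 2


def de_inverse(d: int, m: int) -> tuple[int, int]:
    """Equation (2): rebuild the pair from d and m (Python // is a floor division)."""
    return m + (d + 1) // 2, m - d // 2


def de_embed(x1: int, x2: int, bit: int) -> tuple[int, int]:
    """Expand the difference by one bit position and place the hidden bit there."""
    d, m = de_forward(x1, x2)
    return de_inverse(2 * d + bit, m)


def de_extract(s1: int, s2: int) -> tuple[int, int, int]:
    """Return (hidden bit, original x1, original x2) from an embedded pair."""
    d_expanded, m = de_forward(s1, s2)
    bit = d_expanded % 2                      # the hidden bit is the LSB of d'
    x1, x2 = de_inverse(d_expanded // 2, m)   # the integer-mean m is unchanged by embedding
    return bit, x1, x2
```

For instance, `de_embed(7, 4, 1)` yields the pair `(9, 2)`, and `de_extract(9, 2)` restores the bit and the original pair.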

2.2. Reversible embedding scheme based on coordinates

The coordinates of a vector map should first be transformed to integers. Next, the transformed map is divided into $N$ vertex pairs. Every pair contains two adjacent vertices. For example, a map object (a polyline or a polygon) composed of vertices $\{v_1, v_2, v_3, v_4, \ldots\}$ will be divided into $\{(v_1, v_2), (v_3, v_4), \ldots\}$. Within a pair, the difference $d$ and the integer-mean $m$ of the two vertices are calculated for $x$ and $y$ respectively. The difference sequences $D_x, D_y$ and the integer-mean sequences $M_x, M_y$ can be denoted as follows:

$$D_x = \{d_x^1, d_x^2, \ldots, d_x^N\}, \qquad M_x = \{m_x^1, m_x^2, \ldots, m_x^N\}$$

$$D_y = \{d_y^1, d_y^2, \ldots, d_y^N\}, \qquad M_y = \{m_y^1, m_y^2, \ldots, m_y^N\}$$
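As a small illustration of this pairing step, the sketch below builds the four sequences from a list of integer vertices. It assumes vertices are given as `(x, y)` tuples and that an unpaired trailing vertex is simply skipped; both choices, and the function name, are assumptions made for the example.

```python
def difference_sequences(vertices):
    """Split vertices into consecutive pairs and return (D_x, M_x, D_y, M_y)."""
    d_x, m_x, d_y, m_y = [], [], [], []
    # Step by two so that the pairs are (v1, v2), (v3, v4), ...
    for i in range(0, len(vertices) - 1, 2):
        (x1, y1), (x2, y2) = vertices[i], vertices[i + 1]
        d_x.append(x1 - x2)
        m_x.append((x1 + x2) // 2)
        d_y.append(y1 - y2)
        m_y.append((y1 + y2) // 2)
    return d_x, m_x, d_y, m_y
```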

The selection of suitable elements for applying difference expansion in $D_x$ (or $D_y$) is based on an embedding condition (3), which is related to the precision tolerance of the original map. If the $i$-th vertex pair $(v_1^i, v_2^i)$ does not meet condition (3), a hidden bit will still be embedded into $(v_1^i, v_2^i)$ by replacing the LSB of $d_x^i$.

2 1 2 2ixd (3)

The scheme uses (3) to check the suitability of all $N$ vertex pairs and then generates an $N$-length flag $F$ to record the results: $F = \{f_i \mid f_i \in \{0, 1\},\ i = 1, \ldots, N\}$.

$f_i = 1$ means that the $i$-th vertex pair $(v_1^i, v_2^i)$ meets the embedding condition and a hidden bit will be embedded by expanding the difference $d_x^i$, whereas $f_i = 0$ means that $(v_1^i, v_2^i)$ does not meet the embedding condition, in which case the hidden bit is embedded by replacing the LSB of $d_x^i$. To ensure the reversibility of the scheme, the original LSB of $d_x^i$ should not be discarded. The scheme collects all replaced original LSBs into a bit sequence $L$ to avoid information loss. Both $F$ and $L$ are necessary for data recovery. As a result, they are embedded into the cover map as part of the hidden data.
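The two embedding branches and the bookkeeping needed for recovery can be sketched as follows. The bounds `lo` and `hi` stand in for the range allowed by condition (3) and are hypothetical parameters here; the function names are likewise illustrative only.

```python
def embed_difference(d, bit, lo, hi):
    """Return (stego difference, flag f_i, saved original LSB or None)."""
    if lo <= d <= hi:
        # f_i = 1: difference expansion, nothing extra has to be saved.
        return 2 * d + bit, 1, None
    # f_i = 0: LSB replacement, the original LSB goes into the sequence L.
    return 2 * (d // 2) + bit, 0, d % 2


def recover_difference(d_stego, flag, saved_lsb):
    """Return (original difference, hidden bit) using the flag and the saved LSB."""
    bit = d_stego % 2
    if flag == 1:
        return d_stego // 2, bit
    return 2 * (d_stego // 2) + saved_lsb, bit
```

With `lo, hi = -5, 5`, a difference of 3 is expanded to 7 when the bit is 1, while a difference of 9 falls outside the range and becomes 8 when the bit is 0, with its original LSB (1) saved.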




2.3. Reversible embedding scheme based on Manhattan distances

Firstly, the coordinates of a vector map should be transformed to integers. Every three consecutive vertices are then grouped as a unit. Namely, for an object composed of vertices $\{v_1, v_2, v_3, v_4, v_5, v_6, \ldots\}$, the divided object is $\{(v_1, v_2, v_3), (v_4, v_5, v_6), \ldots\}$. The resulting $N$ embedding units can be represented as $(v_1^i, v_2^i, v_3^i) = \{(x_1^i, y_1^i), (x_2^i, y_2^i), (x_3^i, y_3^i)\}$, $i \in \{1, 2, \ldots, N\}$. By taking the middle vertex $v_2^i$ as the origin $O(0, 0)$, a relative coordinate system $(x, y)$ can be constructed within a unit, and the coordinates of $v_1^i$ and $v_3^i$ can then be converted to relative values $(\bar{x}_1^i, \bar{y}_1^i)$ and $(\bar{x}_2^i, \bar{y}_2^i)$:

$$\bar{x}_1^i = x_1^i - x_2^i, \quad \bar{y}_1^i = y_1^i - y_2^i; \qquad \bar{x}_2^i = x_3^i - x_2^i, \quad \bar{y}_2^i = y_3^i - y_2^i$$

Next, the forward mapping $T_f$ in all quadrants is defined by a uniform reversible transform equation (4).

:2

f

l x y

T x yr

(4)

where $l$ is the Manhattan distance from a vertex to the origin.

Given an embedding unit $(v_1^i, v_2^i, v_3^i)$, a hidden bit can be embedded by the following steps. First, construct the relative coordinate plane and translate $v_1^i$ and $v_3^i$ to new coordinates (i.e., $(\bar{x}_1^i, \bar{y}_1^i)$ and $(\bar{x}_2^i, \bar{y}_2^i)$). Second, calculate the Manhattan distances $l_1^i$ and $l_2^i$ by the forward mapping (4) according to the regions where $v_1^i$ and $v_3^i$ are located. Then, perform the transform defined by (1) on $l_1^i$ and $l_2^i$, obtaining their difference $d^i$ and integer-mean $m^i$. Next, embed a hidden bit into $d^i$ by difference expansion or by replacing the LSB, obtaining the embedded difference $d'^i$. Then, transform $d'^i$ and $m^i$ back to the embedded Manhattan distances $l_1'^i$ and $l_2'^i$ by (2). Finally, calculate the modified relative coordinates $(\bar{x}_1'^i, \bar{y}_1'^i)$ and $(\bar{x}_2'^i, \bar{y}_2'^i)$ from $l_1'^i$ and $l_2'^i$, where the corresponding inverse transforms should be selected according to the regions in which $v_1^i$ and $v_3^i$ are located. The real coordinates of the embedded unit $(v_1'^i, v_2'^i, v_3'^i)$ are obtained by translating the coordinate system back to the real system.
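The first steps of this procedure (relative coordinates, Manhattan distances, and the transform (1) applied to the two distances) can be sketched as below. The sketch deliberately stops before the quadrant-dependent mapping back to coordinates, which is not reproduced here; the function name and the tuple layout are assumptions made for the example.

```python
def unit_distance_difference(v1, v2, v3):
    """For one embedding unit, return (l1, l2, d, m).

    l1 and l2 are the Manhattan distances from the middle vertex v2 to v1 and v3;
    d and m are their difference and integer-mean as in transform (1).
    """
    # Relative coordinates with the middle vertex as the origin.
    rx1, ry1 = v1[0] - v2[0], v1[1] - v2[1]
    rx2, ry2 = v3[0] - v2[0], v3[1] - v2[1]
    l1 = abs(rx1) + abs(ry1)
    l2 = abs(rx2) + abs(ry2)
    return l1, l2, l1 - l2, (l1 + l2) // 2
```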

Considering the above procedure, it could be generalized that an embedding unit should satisfy two conditions (5) and (6) if it is used for hiding data.

$$L_2(v_1^i, v_1'^i) \le \tau; \qquad L_2(v_3^i, v_3'^i) \le \tau \tag{5}$$

where $L_2(\cdot)$ is the Euclidean distance and $\tau$ is the precision tolerance of the map.

' '2 1, 0; 0

2 , 0r r

l lr r

(6)




where $l'$ is the modified Manhattan distance after data hiding.

3. Steganalysis against reversible DE-based data-hiding schemes for 2D vector maps

3.1. Analysis of DE-based embedding behavior

Let the histogram of $d$, which is the difference between the coordinates of two adjacent vertices or the difference between two Manhattan distances, be $h(d)$. Let $d_{\max} = \max\{d\}$ and $d_{\min} = \min\{d\}$. Generally speaking, in a normal vector map, the number of occurrences of a difference, $h(d)$, decreases with increasing $|d|$ in a macroscopically smooth fashion. Fig. 1 is a lake vector map with 100759 vertices and seven digits after the decimal point.

Figure 1. Original lake vector map

The histograms of x-coordinate differences, y-coordinate differences, and Manhattan distance differences for the lake vector map, with $|d| \in [0, 200]$, are shown in Fig. 2, Fig. 3, and Fig. 4. The coordinate or Manhattan distance differences approximately follow the Laplace distribution.

[Plot: x-axis "X-coordinate difference", -300 to 300; y-axis "Occurrence times", 0 to 1000]

Figure 2. Histogram of x-coordinate differences for the lake vector map

[Plot: x-axis "Y-coordinate difference", -300 to 300; y-axis "Occurrence times", 0 to 900]

Figure 3. Histogram of y-coordinate differences for the lake vector map

[Plot: x-axis "Manhattan distance difference", -300 to 300; y-axis "Occurrence times", 0 to 4500]

Figure 4. Histogram of Manhattan distance differences for the lake vector map

Figure 5. Embedding a hidden bit into d by applying difference expansion

Figure 6. Embedding a hidden bit into d by replacing LSB (where i is an integer)

The effect of embedding on the histogram of differences is analyzed as follows. Fig. 5 shows the scheme for embedding a hidden bit by applying difference expansion, and Fig. 6 shows the scheme for embedding a hidden bit by replacing the LSB. For a hidden bit $b$, difference expansion maps $d$ to $2d + b$, whereas LSB replacement maps $d$ to $2\lfloor d/2 \rfloor + b$. Let $\alpha$ be the ratio between the number of blocks containing secret bits and the total number of blocks. The data to be hidden can be viewed as a random bit stream since they are usually encrypted before embedding.

Let the difference expansion embedding condition be $L \le d \le H$ ($L < 0$ and $H > 0$), let the coordinate or Manhattan distance difference of the stego-map be $d'$, and let $j$ and $k$ be integers. Then, according to Fig. 5 and Fig. 6, we know the following. If

$$\begin{cases} L \le d' \le H, & \text{if } L = 2j,\ H = 2k+1 \\ L + 1 \le d' \le H, & \text{if } L = 2j+1,\ H = 2k+1 \\ L \le d' \le H - 1, & \text{if } L = 2j,\ H = 2k \\ L + 1 \le d' \le H - 1, & \text{if } L = 2j+1,\ H = 2k \end{cases}$$

then all such $d'$ were embedded only by applying difference expansion. If

$$\begin{cases} H + 1 \le d' \le 2H + 1, & \text{if } H = 2k+1 \\ H \le d' \le 2H + 1, & \text{if } H = 2k \end{cases} \quad \text{or} \quad \begin{cases} 2L \le d' \le L, & \text{if } L = 2j+1 \\ 2L \le d' \le L - 1, & \text{if } L = 2j \end{cases}$$

then some of the $d'$ were embedded by applying difference expansion and some were embedded by replacing the LSB. Finally, if $d' \ge 2H + 2$ or $d' \le 2L - 1$, then all such $d'$ were embedded only by replacing the LSB.
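The three regions can be checked by brute force. The sketch below assumes the two embedding rules $d \mapsto 2d + b$ (difference expansion) and $d \mapsto 2\lfloor d/2 \rfloor + b$ (LSB replacement); the function names and the `span` parameter are hypothetical.

```python
def stego_region(d_stego, lo, hi):
    """Classify a stego difference as 'DE', 'mixed' or 'LSB' for the condition lo <= d <= hi."""
    de_lo = lo if lo % 2 == 0 else lo + 1   # L = 2j keeps L, L = 2j+1 starts at L + 1
    de_hi = hi if hi % 2 == 1 else hi - 1   # H = 2k+1 keeps H, H = 2k ends at H - 1
    if de_lo <= d_stego <= de_hi:
        return "DE"
    if 2 * lo <= d_stego <= 2 * hi + 1:
        return "mixed"
    return "LSB"


def reachable_mechanisms(lo, hi, span=60):
    """Enumerate every cover difference and bit, and record how each stego value can arise."""
    reach = {}
    for d in range(-span, span + 1):
        for bit in (0, 1):
            if lo <= d <= hi:
                s, mech = 2 * d + bit, "DE"
            else:
                s, mech = 2 * (d // 2) + bit, "LSB"
            reach.setdefault(s, set()).add(mech)
    return reach
```

Comparing `stego_region` with the sets returned by `reachable_mechanisms` for a few parities of `lo` and `hi` gives a quick sanity check on the case analysis.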

From section II (A) of the paper, we know that in the reversible data hiding scheme based on coordinate difference (coordinate-based scheme), 2 1L and 2 2H (There is no harm in assuming that is a positive integer.).

In the reversible data hiding scheme based on Manhattan distance difference (distance-based scheme), according to Equations (5) and (6), we can work out the detailed embedding conditions (7)-(10) for most embedding units, which are related to the precision tolerance of the original map. Based on these embedding conditions, we can select the suitable units of the original map to be embedded by applying difference expansion.

Quadrant I ( 0, 0x y )

2 2

1 2 1 2

1 2 1 2

1 2 2 1 3 2 2 1

2 1 2 1, 0 02 2 , 0 0

i

i i i i

i i i

i i i i

i i i

d

r m d m r if r r

r m d m r if r r

(7)

Quadrant II ( 0, 0x y )

2 2

1 2 1 2

1 2 1 2

1 2 2 1 2 2 2 1

2 1 2 1, 0 02 2 , 0 0

i

i i i i

i i i

i i i i

i i i

d

r m d m r if r r

r m d m r if r r

(8)

Quadrant III ( 0, 0x y )

2 2

1 2 1 2

1 2 1 2

2 2 2 1 2 2 2 1

2 1 2 1, 0 02 2 , 0 0

i

i i i i

i i i

i i i i

i i i

d

r m d m r if r r

r m d m r if r r

(9)

Quadrant IV ( 0, 0x y )

2 2

1 2 1 2

1 2 1 2

1 2 2 1 2 2 2 1

2 1 2 1, 0 02 2 , 0 0

i

i i i i

i i i

i i i i

i i i

d

r m d m r if r r

r m d m r if r r

(10)




where 1 1 2 2

2

i i i i

i

x y x ym

, 1 11 2

i i

ix y

r

,and 2 22 2

i i

ix y

r

.

In addition, for most of embedding units, conditions (7~10) are approximately equivalent to

2 22 2 2 1 3 2 2 1

12

iL d H

(11)

when meets the condition (12):

0 , 00 , 0

i i i

i i i

B if A B

A if B A

(12)

where 21

1((2 1 ) 4)

8i

i iA r m , and 22

1((2 3 ) 4)

8i

i ir mB .

Actually, for practical vector maps, most embedding units meet both condition (11) and condition (12). Taking the lake vector map (Fig. 1) as an example, 91.657% of the units meet both conditions. So, there is no harm in using Equation (11) as the embedding condition in the distance-based scheme to analyze the embedding behavior.

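Thus, the coordinate or Manhattan distance difference histogram $h'(d)$ of the stego-map can be obtained as follows, where $\alpha$ is the ratio of blocks carrying secret bits, $i$ is an integer, $d_{\max} = \max\{d\}$, and $d_{\min} = \min\{d\}$:

$$h'(d) = \begin{cases} h'(2i) = (1-\alpha)\,h(2i) + \frac{1}{2}\alpha\,h(i) \\ h'(2i+1) = (1-\alpha)\,h(2i+1) + \frac{1}{2}\alpha\,h(i) \end{cases}, \quad d \in \begin{cases} [0, H], & H = 2k+1 \\ [0, H-1], & H = 2k \end{cases} \tag{13}$$

$$h'(H) = (1-\alpha)\,h(H) + \frac{1}{2}\alpha\left(h\!\left(\tfrac{1}{2}H\right) + h(H+1)\right), \quad H = 2k \tag{14}$$

$$h'(H+1) = \begin{cases} \left(1-\frac{1}{2}\alpha\right)h(H+1) + \frac{1}{2}\alpha\,h\!\left(\tfrac{1}{2}H\right), & H = 2k \\ \left(1-\frac{1}{2}\alpha\right)h(H+1) + \frac{1}{2}\alpha\left(h\!\left(\tfrac{1}{2}(H+1)\right) + h(H+2)\right), & H = 2k+1 \end{cases} \tag{15}$$

$$h'(d) = \begin{cases} h'(2i) = \left(1-\frac{1}{2}\alpha\right)h(2i) + \frac{1}{2}\alpha\left(h(2i+1) + h(i)\right) \\ h'(2i+1) = \left(1-\frac{1}{2}\alpha\right)h(2i+1) + \frac{1}{2}\alpha\left(h(2i) + h(i)\right) \end{cases}, \quad d \in [H+2,\ 2H+1] \tag{16}$$

$$h'(d) = \begin{cases} h'(2i) = \left(1-\frac{1}{2}\alpha\right)h(2i) + \frac{1}{2}\alpha\,h(2i+1) \\ h'(2i+1) = \left(1-\frac{1}{2}\alpha\right)h(2i+1) + \frac{1}{2}\alpha\,h(2i) \end{cases}, \quad d \in [2H+2,\ d_{\max}] \tag{17}$$

$$h'(-1) = (1-\alpha)\,h(-1) + \frac{1}{2}\alpha\,h(-1) \tag{18}$$

$$h'(d) = \begin{cases} h'(2i) = (1-\alpha)\,h(2i) + \frac{1}{2}\alpha\,h(i) \\ h'(2i+1) = (1-\alpha)\,h(2i+1) + \frac{1}{2}\alpha\,h(i) \end{cases}, \quad d \in \begin{cases} [L+1, -2], & L = 2j+1 \\ [L, -2], & L = 2j \end{cases} \tag{19}$$

$$h'(L) = (1-\alpha)\,h(L) + \frac{1}{2}\alpha\left(h(L-1) + h\!\left(\tfrac{1}{2}(L-1)\right)\right), \quad L = 2j+1 \tag{20}$$

$$h'(L-1) = \begin{cases} \left(1-\frac{1}{2}\alpha\right)h(L-1) + \frac{1}{2}\alpha\left(h(L-2) + h\!\left(\tfrac{1}{2}(L-2)\right)\right), & L = 2j \\ \left(1-\frac{1}{2}\alpha\right)h(L-1) + \frac{1}{2}\alpha\,h\!\left(\tfrac{1}{2}(L-1)\right), & L = 2j+1 \end{cases} \tag{21}$$

$$h'(d) = \begin{cases} h'(2i) = \left(1-\frac{1}{2}\alpha\right)h(2i) + \frac{1}{2}\alpha\left(h(2i+1) + h(i)\right) \\ h'(2i+1) = \left(1-\frac{1}{2}\alpha\right)h(2i+1) + \frac{1}{2}\alpha\left(h(2i) + h(i)\right) \end{cases}, \quad d \in [2L,\ L-2] \tag{22}$$

$$h'(d) = \begin{cases} h'(2i) = \left(1-\frac{1}{2}\alpha\right)h(2i) + \frac{1}{2}\alpha\,h(2i+1) \\ h'(2i+1) = \left(1-\frac{1}{2}\alpha\right)h(2i+1) + \frac{1}{2}\alpha\,h(2i) \end{cases}, \quad d \in [d_{\min},\ 2L-1] \tag{23}$$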
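All of these cases follow from one rule: a block is left unchanged with probability $1 - \alpha$, and otherwise its difference moves to one of two adjacent stego values with equal probability. The sketch below computes the expected stego histogram directly from that rule, which gives a way to check individual cases such as (14), (17), or (18). The function name is hypothetical, and the histogram is assumed to be a dictionary mapping each difference to its count.

```python
from collections import defaultdict


def expected_stego_histogram(h, alpha, lo, hi):
    """Expected stego histogram h' for a cover histogram h and the condition lo <= d <= hi."""
    out = defaultdict(float)
    for d, count in h.items():
        # Blocks that carry no secret bit keep their difference.
        out[d] += (1 - alpha) * count
        # Embedded blocks move to base or base + 1, each with probability 1/2.
        base = 2 * d if lo <= d <= hi else 2 * (d // 2)
        out[base] += 0.5 * alpha * count
        out[base + 1] += 0.5 * alpha * count
    return dict(out)
```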

3.2. Revealing the presence of secret data

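In order to reveal the presence of secret data, we consider

$$\begin{cases} \lim_{\alpha \to 1}\left(h'(H) - h'(H-1)\right), & H = 2k \\ \lim_{\alpha \to 1}\left(h'(H+1) - h'(H)\right), & H = 2k+1 \end{cases} \quad \text{and} \quad \lim_{\alpha \to 1}\left(h'(2H+1) - h'(2H+2)\right).$$

For the first pair of limits,

$$\begin{cases} \lim_{\alpha \to 1}\left(h'(H) - h'(H-1)\right) = \frac{1}{2}\left(h(k) + h(2k+1) - h(k-1)\right), & H = 2k \\ \lim_{\alpha \to 1}\left(h'(H+1) - h'(H)\right) = \frac{1}{2}\left(h(2k+2) + h(k+1) + h(2k+3) - h(k)\right), & H = 2k+1 \end{cases}$$

There is no harm in assuming that $h(k-1) \approx h(k) \approx h(k+1)$, so, when the embedding ratio $\alpha$ is close to 1,

$$\begin{cases} h'(H) > h'(H-1), & H = 2k \\ h'(H+1) > h'(H), & H = 2k+1 \end{cases} \tag{24}$$

Similarly,

$$\lim_{\alpha \to 1}\left(h'(2H+1) - h'(2H+2)\right) = \frac{1}{2}\left(h(2H+1) + h(2H) + h(H) - h(2H+2) - h(2H+3)\right).$$

There is no harm in assuming that $h(2H+1) \approx h(2H+3)$ and $h(2H) \approx h(2H+2)$, so, when $\alpha$ is close to 1,

$$h'(2H+1) > h'(2H+2) \tag{25}$$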

By the same principle, we can also obtain Equations (26) and (27):

$$\begin{cases} h'(L) < h'(L-1), & L = 2j \\ h'(L+1) < h'(L), & L = 2j+1 \end{cases} \tag{26}$$

$$h'(2L) > h'(2L-1) \tag{27}$$

Then, combining Equations (24)-(27) and Equations (13)-(23), we can easily see that there is a gap between $h'(H)$ and $h'(H-1)$ when $H = 2k$, or between $h'(H+1)$ and $h'(H)$ when $H = 2k+1$, or between $h'(2H+1)$ and $h'(2H+2)$, or between $h'(L)$ and $h'(L-1)$ when $L = 2j$, or between $h'(L+1)$ and $h'(L)$ when $L = 2j+1$, or between $h'(2L)$ and $h'(2L-1)$. For example, the x-coordinate difference, y-coordinate difference, and Manhattan distance difference histograms of the stego-map (Fig. 1 is the original vector map) containing embedded data in all usable blocks (i.e., $\alpha = 1$) and in 50% of the blocks (i.e., $\alpha = 0.5$), respectively, are shown in Figs. 7-12, with $|d| \in [0, 200]$ and $\tau = 11$ or $\tau = 5$.

[Plot: x-axis -300 to 300; y-axis "Occurrence times", 0 to 900]

Figure 7. Histogram of x-coordinate differences for the stego-map when $\alpha = 1$ and $\tau = 11$

[Plot: x-axis -300 to 300; y-axis "Occurrence times", 0 to 800]

Figure 8. Histogram of x-coordinate differences for the stego-map when $\alpha = 0.5$ and $\tau = 11$

[Plot: x-axis -300 to 300; y-axis "Occurrence times", 0 to 1000]

Figure 9. Histogram of y-coordinate differences for the stego-map when $\alpha = 1$ and $\tau = 11$

[Plot: x-axis -300 to 300; y-axis "Occurrence times", 0 to 800]

Figure 10. Histogram of y-coordinate differences for the stego-map when $\alpha = 0.5$ and $\tau = 11$

[Plot: x-axis -300 to 300; y-axis "Occurrence times", 0 to 2500]

Figure 11. Histogram of Manhattan distance differences for the stego-map when $\alpha = 1$ and $\tau = 5$

[Plot: x-axis -300 to 300; y-axis "Occurrence times", 0 to 3500]

Figure 12. Histogram of Manhattan distance differences for the stego-map when $\alpha = 0.5$ and $\tau = 5$

It is obvious that if a received map is clean, the histogram of coordinate or Manhattan distance differences should be approximately smooth, without pronounced gaps. On the other hand, if the received map contains embedded data, gaps will occur in the histogram of coordinate or Manhattan distance differences.
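One simple way to look for the gaps at $2H+1$ and $2L$ is to search for the largest outward drop between adjacent histogram values on each side of zero. The sketch below is a simplified stand-in for the thresholded test used in this paper, not a reproduction of it; the function name is hypothetical, and the histogram is assumed to be a dictionary from difference to count.

```python
def locate_gaps(h_stego):
    """Guess (H, L) from the largest outward drops in a stego difference histogram."""
    # Only compare neighbours that are both present in the histogram.
    pos = [d for d in h_stego if d > 0 and d + 1 in h_stego]
    neg = [d for d in h_stego if d < 0 and d - 1 in h_stego]
    # Largest drop h'(d) - h'(d+1) for d > 0 is taken as the gap at d = 2H + 1.
    d_pos = max(pos, key=lambda d: h_stego[d] - h_stego[d + 1])
    # Largest drop h'(d) - h'(d-1) for d < 0 is taken as the gap at d = 2L.
    d_neg = max(neg, key=lambda d: h_stego[d] - h_stego[d - 1])
    return (d_pos - 1) // 2, d_neg // 2
```

On a clean map the same search still returns some pair of values, so in practice the size of the drop has to be compared against the neighbouring differences before a gap is declared.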

Moreover, Equation (24) depends on the assumption $h(k-1) \approx h(k) \approx h(k+1)$, and for the distance-based scheme this assumption generally induces more error than for the coordinate-based scheme. The same holds for Equation (26).

Furthermore, for a stego-map, we do not know whether $H$ and $L$ are odd or even, but we can obtain $H$ and $L$ by studying the values of $h'(2H+1) - h'(2H+2)$ and $h'(2L) - h'(2L-1)$.

If

' '(2 1) (2 2) 0,

' '(2 1) (2 2)1

' '(2 2) (2 3)H

h H h H and

h H h H

h H h H

, and

' '(2 ) (2 1) 0,

' '(2 ) (2 1)' '(2 1) (2 2)

1L

h L h L and

h L h L

h L h L

, then two gaps are

found , and values of H and L can be gotten, where H and L are two thresholds.

3.3. Estimating the embedding rate and the length of hidden data

Furthermore, the embedding rate and the length of the hidden data can be estimated from the coordinate or Manhattan distance difference histogram of the received stego-map, $h'(d)$. By identifying the differences between two successive values in $h'(d)$ that are significantly larger than those between other adjacent values, a steganalyst can reliably find the values of $H$, $2H+1$, $L$, $2L$, and so on; then $\tau$ can be easily obtained. In addition, according to Equation (13), we have:

$$h'(1) - h'(0) = (1-\alpha)\left(h(1) - h(0)\right) = (1-\alpha)\left(h(1) - \frac{h'(0)}{1 - \frac{1}{2}\alpha}\right)$$

Because the coordinate or Manhattan distance differences approximately follow the Laplace distribution, there is no harm in assuming that $h(1) \approx h(-1)$, so, according to Equation (18), we have:

$$h'(1) - h'(0) \approx (1-\alpha)\left(\frac{h'(-1)}{1 - \frac{1}{2}\alpha} - \frac{h'(0)}{1 - \frac{1}{2}\alpha}\right) = \frac{(1-\alpha)\left(h'(-1) - h'(0)\right)}{1 - \frac{1}{2}\alpha}$$

Thus, the estimated embedding rate $\alpha^E$ can be obtained as follows:

$$\alpha^E = \frac{2\left(h'(1) - h'(-1)\right)}{h'(1) + h'(0) - 2h'(-1)} \tag{28}$$
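Formula (28) needs only three histogram values. A minimal sketch follows, assuming the stego histogram is a dictionary from difference to count; the function name is hypothetical.

```python
def estimate_embedding_rate(h_stego):
    """Estimated embedding rate from h'(-1), h'(0) and h'(1), following Formula (28)."""
    h1, h0, hm1 = h_stego[1], h_stego[0], h_stego[-1]
    denominator = h1 + h0 - 2 * hm1
    if denominator == 0:
        raise ValueError("h'(1) + h'(0) - 2h'(-1) is zero; the estimate is undefined")
    return 2 * (h1 - hm1) / denominator
```

As a worked check, take a symmetric cover histogram with $h(0) = 1000$ and $h(\pm 1) = 900$ and embed with $\alpha = 0.5$: Equations (13) and (18) give $h'(1) = 700$, $h'(0) = 750$, and $h'(-1) = 675$, and the formula returns $2 \cdot 25 / 100 = 0.5$.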

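Because:

① $h'(1) = (1-\alpha)h(1) + \frac{1}{2}\alpha h(0) \ge \left(1 - \frac{1}{2}\alpha\right)h(1) \approx \left(1 - \frac{1}{2}\alpha\right)h(-1) = h'(-1)$;

② $h'(0) = \left(1 - \frac{1}{2}\alpha\right)h(0) \ge \left(1 - \frac{1}{2}\alpha\right)h(-1) = h'(-1)$;

③ $2\left(h'(1) - h'(-1)\right) - \left(h'(1) + h'(0) - 2h'(-1)\right) = h'(1) - h'(0) = (1-\alpha)\left(h(1) - h(0)\right) \le 0$;

it is easy to see that:

$$0 \le \frac{2\left(h'(1) - h'(-1)\right)}{h'(1) + h'(0) - 2h'(-1)} \le 1.$$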

Let $Len_{DE}^E$ be the estimated length of the data embedded by applying difference expansion, i.e., the payload size; let $Len_{LSB}^E$ be the estimated length of the data embedded by replacing the LSB; and let $TE$ be the total number of suitable embedded elements. Then, we have:

$$Len_{DE}^E + Len_{LSB}^E = \alpha^E \cdot TE$$

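Additionally, according to Equations (13), (18), (19) and Equation (30), we can obtain the following, where $\tilde{h}(d)$ denotes the estimate of the original histogram $h(d)$. When $0 < \alpha < 1$,

$$\begin{cases} \tilde{h}(0) = \dfrac{h'(0)}{1 - \frac{1}{2}\alpha} \\[2ex] \tilde{h}(2i+1) = \dfrac{h'(2i+1) - \frac{1}{2}\alpha\,\tilde{h}(i)}{1-\alpha} \\[2ex] \tilde{h}(2i) = \dfrac{h'(2i) - \frac{1}{2}\alpha\,\tilde{h}(i)}{1-\alpha} \end{cases}, \quad i \in \begin{cases} \left[0, \frac{H-1}{2}\right], & H = 2k+1 \\ \left[0, \frac{H-2}{2}\right], & H = 2k \end{cases} \tag{29}$$

and

$$\begin{cases} \tilde{h}(-1) = \dfrac{h'(-1)}{1 - \frac{1}{2}\alpha} \\[2ex] \tilde{h}(2i+1) = \dfrac{h'(2i+1) - \frac{1}{2}\alpha\,\tilde{h}(i)}{1-\alpha} \\[2ex] \tilde{h}(2i) = \dfrac{h'(2i) - \frac{1}{2}\alpha\,\tilde{h}(i)}{1-\alpha} \end{cases}, \quad i \in \begin{cases} \left[\frac{L+1}{2}, -1\right], & L = 2j+1 \\ \left[\frac{L}{2}, -1\right], & L = 2j \end{cases}$$

When $\alpha = 1$,

$$\tilde{h}(d) = h(d) = 2h'(2d), \quad d \in \begin{cases} \left[0, \frac{H-1}{2}\right], & H = 2k+1 \\ \left[0, \frac{H-2}{2}\right], & H = 2k \end{cases} \tag{30}$$

and

$$\tilde{h}(d) = h(d) = 2h'(2d), \quad d \in \begin{cases} \left[\frac{L+1}{2}, -1\right], & L = 2j+1 \\ \left[\frac{L}{2}, -1\right], & L = 2j \end{cases}$$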
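The non-negative half of this recovery can be sketched as a short recursion: $\tilde{h}(0)$ is obtained first, and each later value only needs $\tilde{h}(i)$ for a smaller index $i$. The function name and the dictionary representation are assumptions made for the example, and `alpha` would in practice be the estimate from Formula (28).

```python
def recover_cover_histogram(h_stego, alpha, hi):
    """Estimate the cover histogram for d >= 0 from a stego histogram, per (29) and (30)."""
    i_max = (hi - 1) // 2 if hi % 2 == 1 else (hi - 2) // 2
    if alpha == 1:
        # Equation (30): every cover value is twice the stego value at 2d.
        return {d: 2 * h_stego[2 * d] for d in range(i_max + 1)}
    estimate = {0: h_stego[0] / (1 - 0.5 * alpha)}
    for i in range(i_max + 1):
        for d in (2 * i, 2 * i + 1):
            if d == 0:
                continue  # already handled by the closed form for d = 0
            estimate[d] = (h_stego[d] - 0.5 * alpha * estimate[i]) / (1 - alpha)
    return estimate
```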

From Equation (29) or (30), the coordinates of $n$ points can be obtained, where $n = \frac{H-1}{2} + 1$ if $H = 2k+1$ and $n = \frac{H-2}{2} + 1$ if $H = 2k$: $(d, \tilde{h}(d))$, $d = 0, 1, \ldots, n-1$. Then, using the least-squares method, we can get the Laplace curve fitting function (31):

$$F(x) = \frac{1}{2b}\exp\left(-\frac{|x - u|}{b}\right). \tag{31}$$
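A fit of this kind can be sketched with a linearised least-squares step. The sketch makes three simplifying assumptions that are not taken from the text: the location $u$ is fixed at 0, the amplitude is a free parameter $A$ instead of $1/(2b)$ (histogram counts are not normalised), and the fit is done on $\ln y$, which is not identical to least squares on the raw counts. Function names are hypothetical.

```python
import math


def fit_laplace_scale(points):
    """Fit y = A * exp(-x / b) to (x, y) points with x >= 0 by least squares on ln(y)."""
    xs = [x for x, y in points if y > 0]
    ys = [math.log(y) for x, y in points if y > 0]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("counts do not decay with x; no Laplace scale can be fitted")
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -1.0 / slope


def extrapolate_count(amplitude, b, d):
    """Floor of the fitted curve at difference d, used where direct recovery is unavailable."""
    return math.floor(amplitude * math.exp(-abs(d) / b))
```

Given the first few recovered points, `fit_laplace_scale` returns the amplitude and the scale $b$, and `extrapolate_count` then supplies values for the remaining differences up to $H$.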

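Through Function (31) and the floor operation, we can estimate:

$$\tilde{h}(d), \quad d \in \begin{cases} \left[\frac{H-1}{2} + 1,\ H\right], & H = 2k+1 \\ \left[\frac{H-2}{2} + 1,\ H\right], & H = 2k \end{cases}$$

By the same principle, we can also estimate:

$$\tilde{h}(d), \quad d \in \begin{cases} \left[L,\ \frac{L+1}{2} - 1\right], & L = 2j+1 \\ \left[L,\ \frac{L}{2} - 1\right], & L = 2j \end{cases}$$

Finally, $Len_{DE}^E$ and $Len_{LSB}^E$ are obtained as follows:

$$Len_{DE}^E = \alpha^E \sum_{d=L}^{H} \tilde{h}(d) \tag{32}$$

$$Len_{LSB}^E = \alpha^E \cdot TE - Len_{DE}^E \tag{33}$$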
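Formulas (32) and (33) reduce to a sum and a subtraction once the estimated histogram is available. A minimal sketch, with a hypothetical function name and the estimated histogram given as a dictionary:

```python
def estimate_lengths(alpha_est, h_cover_est, lo, hi, total_elements):
    """Return (Len_DE^E, Len_LSB^E) following Formulas (32) and (33)."""
    # Formula (32): expected number of expanded differences inside [L, H].
    len_de = alpha_est * sum(h_cover_est.get(d, 0) for d in range(lo, hi + 1))
    # Formula (33): the remaining embedded elements were handled by LSB replacement.
    len_lsb = alpha_est * total_elements - len_de
    return len_de, len_lsb
```

For example, with $\alpha^E = 0.5$, $TE = 45$, and 40 estimated cover differences inside $[L, H]$, the sketch returns 20 bits for difference expansion and 2.5 for LSB replacement.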

4. Experiments and results

A river map, a region map, two road maps, and the lake map (Fig. 1) are used as the original maps to test the steganalysis performance against the two reversible data hiding schemes. The river map, the region map, and the two road maps are shown in Fig. 13. Table 1 lists some features of the five maps. For the cases $\alpha = 1$ and $\alpha = 0.5$, the actual lengths of the hidden bit-sequences, $Len_{DE}$ and $Len_{LSB}$, and the estimated lengths $Len_{DE}^E$ (defined by Formula (32)) and $Len_{LSB}^E$ (defined by Formula (33)) are listed in Tables 2-7.

In Tables 2-7, the values of $\tau$ are chosen on the basis of the condition that $Len_{DE}$ must be greater than $\frac{1}{2}TE$ to be able to embed a non-zero payload [18]. Additionally, when the threshold used for the gap at $L$ and the threshold used for the gap at $H$ are both smaller than the maxima listed in the tables, two gaps in the difference histogram of the stego-map are found and the values of $H$ and $L$ can be determined. $\alpha^E$ is the estimated embedding rate defined by Formula (28).


Figure 13. Test maps. (a) River map; (b) Region map; (c) Road map1; (d) Road map2




Table 1. Features of original maps

| Original Maps | Scale | Amount of Vertices | Number of Digits after Decimal | Map Entity Type |
|---|---|---|---|---|
| River Map | 1:10000 | 186495 | 6 | Area |
| Region Map | 1:20000 | 23883 | 6 | Area |
| Road Map1 | 1:10000 | 27877 | 7 | Line |
| Road Map2 | 1:20000 | 70833 | 6 | Line |
| Lake Map | 1:10000 | 100759 | 7 | Area |

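Table 2. Actual and estimated parameters of embedded data using x-coordinate-based scheme when $\alpha = 1$

| Original Maps | $\tau$ | Max threshold ($L$ gap) | Max threshold ($H$ gap) | $\alpha^E$ | $Len_{DE}$ | $Len_{DE}^E$ | $Len_{LSB}$ | $Len_{LSB}^E$ |
|---|---|---|---|---|---|---|---|---|
| River Map | 11 | 3.04 | 6.71 | 0.9159 | 63950 | 60313 | 29297 | 27421 |
| Region Map | 15 | 10.00 | 2.67 | 1.0697 | 5417 | 5108 | 5238 | 5002 |
| Road Map1 | 35 | 1.52 | 6.00 | 1.0666 | 7400 | 7236 | 6539 | 6210 |
| Road Map2 | 15 | 1.79 | 3.00 | 1.0322 | 29923 | 29743 | 5496 | 5565 |
| Lake Map | 11 | 10.22 | 56.40 | 0.8957 | 26046 | 22701 | 24333 | 22423 |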

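Table 3. Actual and estimated parameters of embedded data using y-coordinate-based scheme when $\alpha = 1$

| Original Maps | $\tau$ | Max threshold ($L$ gap) | Max threshold ($H$ gap) | $\alpha^E$ | $Len_{DE}$ | $Len_{DE}^E$ | $Len_{LSB}$ | $Len_{LSB}^E$ |
|---|---|---|---|---|---|---|---|---|
| River Map | 11 | 18.21 | 34.00 | 0.9633 | 65906 | 65129 | 27341 | 27407 |
| Region Map | 15 | 1.60 | 3.82 | 0.9842 | 6513 | 6654 | 4142 | 4241 |
| Road Map1 | 35 | 3.66 | 2.20 | 1.1852 | 7679 | 7879 | 6260 | 6311 |
| Road Map2 | 15 | 8.67 | 5.25 | 1.0483 | 31950 | 31992 | 5224 | 5309 |
| Lake Map | 11 | 13.62 | 10.00 | 0.9031 | 26957 | 23313 | 23422 | 22184 |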

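Table 4. Actual and estimated parameters of embedded data using distance-based scheme when $\alpha = 1$

| Original Maps | $\tau$ | Max threshold ($L$ gap) | Max threshold ($H$ gap) | $\alpha^E$ | $Len_{DE}$ | $Len_{DE}^E$ | $Len_{LSB}$ | $Len_{LSB}^E$ |
|---|---|---|---|---|---|---|---|---|
| River Map | 5 | 7.25 | 15.23 | 0.9823 | 39829 | 37802 | 18396 | 17264 |
| Region Map | 5 | 1.87 | 15.00 | 1.1502 | 4757 | 4828 | 1758 | 1883 |
| Road Map1 | 25 | 1.78 | 1.57 | 0.9296 | 4937 | 4609 | 4156 | 3869 |
| Road Map2 | 5 | 2.07 | 3.25 | 0.9869 | 17537 | 16937 | 4551 | 4373 |
| Lake Map | 5 | 8.53 | 4.53 | 0.9385 | 21103 | 20018 | 10395 | 10121 |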




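Table 5. Actual and estimated parameters of embedded data using x-coordinate-based scheme when $\alpha = 0.5$

| Original Maps | $\tau$ | Max threshold ($L$ gap) | Max threshold ($H$ gap) | $\alpha^E$ | $Len_{DE}$ | $Len_{DE}^E$ | $Len_{LSB}$ | $Len_{LSB}^E$ |
|---|---|---|---|---|---|---|---|---|
| River Map | 11 | 6.10 | 2.61 | 0.6269 | 31931 | 32778 | 14616 | 15335 |
| Region Map | 15 | 3.66 | 1.69 | 0.6444 | 2738 | 2902 | 2710 | 2885 |
| Road Map1 | 35 | 1.50 | 2.33 | 0.3526 | 3754 | 2816 | 3309 | 2704 |
| Road Map2 | 15 | 1.38 | 4.00 | 0.6845 | 15081 | 17737 | 2809 | 3078 |
| Lake Map | 11 | 3.41 | 52.33 | 0.3963 | 13023 | 10066 | 12166 | 9899 |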

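Table 6. Actual and estimated parameters of embedded data using y-coordinate-based scheme when $\alpha = 0.5$

| Original Maps | $\tau$ | Max threshold ($L$ gap) | Max threshold ($H$ gap) | $\alpha^E$ | $Len_{DE}$ | $Len_{DE}^E$ | $Len_{LSB}$ | $Len_{LSB}^E$ |
|---|---|---|---|---|---|---|---|---|
| River Map | 11 | 15.78 | 13.20 | 0.6071 | 32778 | 34232 | 13540 | 14883 |
| Region Map | 15 | 1.45 | 2.77 | 0.5903 | 3217 | 3340 | 2231 | 2477 |
| Road Map1 | 35 | 2.40 | 1.46 | 0.2273 | 3777 | 2013 | 3102 | 1667 |
| Road Map2 | 15 | 5.50 | 3.51 | 0.3341 | 14959 | 10077 | 2571 | 1836 |
| Lake Map | 11 | 2.82 | 13.16 | 0.4034 | 13478 | 10410 | 11711 | 9912 |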

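Table 7. Actual and estimated parameters of embedded data using distance-based scheme when $\alpha = 0.5$

| Original Maps | $\tau$ | Max threshold ($L$ gap) | Max threshold ($H$ gap) | $\alpha^E$ | $Len_{DE}$ | $Len_{DE}^E$ | $Len_{LSB}$ | $Len_{LSB}^E$ |
|---|---|---|---|---|---|---|---|---|
| River Map | 5 | 2.93 | 10.91 | 0.5227 | 19712 | 19929 | 9161 | 9344 |
| Region Map | 5 | 2.20 | 13.00 | 0.6595 | 2466 | 2756 | 891 | 1018 |
| Road Map1 | 25 | 1.42 | 1.93 | 0.3006 | 2417 | 1847 | 2076 | 1536 |
| Road Map2 | 5 | 1.80 | 2.00 | 0.5380 | 8875 | 8739 | 2290 | 2112 |
| Lake Map | 5 | 4.36 | 8.25 | 0.5570 | 10569 | 10103 | 5164 | 5085 |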

As we know, the above estimation results are based on the assumption that the data to be hidden can be viewed as a random bit stream (i.e., the proportion of 0s and of 1s is 1/2 each), since they are usually encrypted before embedding. But unlike an image (e.g., $512 \times 512$), which has enough usable blocks for data embedding in a statistical sense, most practical maps do not have enough usable blocks to guarantee that the proportion of embedded 0s and 1s is 1/2. Thus, although the estimates in Tables 2-7 are reasonably accurate, some errors remain.

In addition, for the River Map and the Lake Map, which have a large number of vertices, if the thresholds used for the gaps at $L$ and $H$ are smaller than 2.5, then two gaps in the difference histogram of the stego-map are found and the values of $H$ and $L$ can be determined. But for the Region Map, Road Map1, and Road Map2, which have fewer vertices, these thresholds must be smaller than 1.4. So, we must choose proper thresholds for different stego-maps.

Finally, because no prior work focuses on steganalysis schemes against DE-based reversible steganographic methods, this paper cannot provide comparative results for the above experiments.




5. Conclusion

This paper focuses on a steganalysis scheme against the difference expansion based reversible data hiding schemes (coordinate-based scheme and distance-based scheme) for 2D vector maps. This paper's scheme is effective not only in revealing the presence of secret data, but also in estimating the embedding rate and the length of the hidden data. The following conclusions can be drawn from the above theoretical analysis and computing results: 1) maps having more usable embedding blocks yield higher steganalysis accuracy; 2) for different stego-maps, we must choose different proper thresholds for $H$ and $L$ to find gaps in the difference histograms of the stego-maps; 3) the idea of the proposed scheme is applicable to vector maps represented by polygons or polygonal lines. Moreover, it is possible to extend the scheme to other data sets, e.g., 3D polygonal meshes or images.

One of our future works is to propose a modified DE-based reversible data hiding scheme which avoids occurrence of the above-mentioned gaps in the difference histograms of stego-objects.

6. Acknowledgment

The authors would like to thank the reviewers for their careful reading and insightful comments. This work was supported by the Technology Innovation Platform Project of Fujian Province under Grant No. 2009J1007 and the Natural Science Foundation of Fujian Province under Grant No. 2010J01331.

7. References

[1] R. Ohbuchi, H. Ueda, and S. Endoh, "Robust watermarking of vector digital maps", in Proc. IEEE Int. Conf. Multimedia and Expo, Lausanne, Switzerland, vol. 1, Aug. 26–29, pp. 577–580, 2002.

[2] H. Gou and M. Wu, "Data hiding in curves with applications to map fingerprinting", IEEE Trans. Signal Process., vol. 53, no. 4, pp. 3988–4005, 2005.

[3] G. Schulz and M. Voigt, "A high capacity watermarking system for digital maps", in Proc. ACM Int. Workshop on Multimedia and Security, Magdeburg, Germany, pp. 180–186, 2004.

[4] XiaoTong Wang, ChengYong Shao, XiaoGang Xu and XiaMu Niu, "Reversible Data-Hiding Scheme for 2-D Vector Maps Based on Difference Expansion", IEEE Transactions on Information Forensics and Security, vol. 2, no. 3, pp. 311–320, 2007.

[5] Stuti Bazaj, Sachin Modi, Anand Mohan, S. P. Singh, "An Improved Algorithm for Data Hiding Using HH-subband Haar Wavelet Coefficients", IJACT, Vol. 2, No. 2, pp. 109–116, 2010.

[6] Samira Lagzian, Mohsen Soryani, Mahmood Fathy, "A New Robust Watermarking Scheme Based on RDWT-SVD", IJIIP, Vol. 2, No. 1, pp. 22–29, 2011.

[7] Hongyuan Li, Guangjie Liu, Yuewei Dai, Zhiquan Wang, "Copyright Protecting Using The Secure Visible Removable Watermarking In JPEG Compression", JDCTA, Vol. 4, No. 8, pp. 34–42, 2010.

[8] Yongjian Hu, Heung-Kyu Lee, Kaiying Chen, and Jianwei Li, "Difference Expansion Based Reversible Data Hiding Using Two Embedding Directions", IEEE Transactions on Multimedia, vol. 10, no. 8, pp. 1500–1512, 2008.

[9] J. Tian, "Reversible data embedding using a difference expansion", IEEE Trans. Circuits Syst. Video Technol., vol. 13, pp. 890–896, 2003.

[10] M. U. Celik, G. Sharma, A. M. Tekalp, and E. Saber, "Lossless generalized-LSB data embedding", IEEE Trans. Image Process., vol. 12, no. 2, pp. 157–160, 2005.

[11] Mohammad Athar Ali, Eran. A. Edirisinghe, "Reversible Watermarking using Differential Expansion on IPCM Macroblocks in H.264/AVC", JNIT, Vol. 2, No. 1, pp. 105–116, 2011.

[12] M. Voigt, B. Yang, and C. Busch, "Reversible watermarking of 2d-vector data", in Proc. ACM Int. Workshop on Multimedia and Security, Magdeburg, Germany, pp. 160–165, 2004.

[13] D. M. Thodi and J. J. Rodriguez, "Expansion embedding techniques for reversible watermarking", IEEE Trans. Image Process., vol. 16, no. 3, pp. 721–730, 2007.

[14] Chandramouli R, Subbalakshmi K P, "Current trends in steganalysis: a critical survey", In Proceeding of Eighth International Conference Control on Automation, Robotics and Vision, KunMing: Elseviser Press, pp. 964–967, 2004.

[15] I. Avcibas, N. Memon, and B. Sankur, "Steganalysis using image quality metrics", IEEE Trans. on Image Processing, vol. 12, no. 2, pp. 221–229, 2003.

[16] R. Chandramouli, "A mathematical framework for active steganalysis", ACM Multimedia Systems, vol. 9, no. 3, pp. 303–311, 2003.

[17] X. Zhang and S. Wang, "Vulnerability of pixel-value differencing steganography to histogram analysis and modification for enhanced security", Pattern Recognition Letters 25, pp. 331–339, 2004.

[18] A. M. Alattar, "Reversible watermark using the difference expansion of a generalized integer transform", IEEE Transactions on Image Processing, vol. 13, no. 8, pp. 1147–1156, 2004.
