Steganalysis by subtractive pixel adjacency matrix and dimensionality reduction
RESEARCH PAPER
SCIENCE CHINA Information Sciences
doi: 10.1007/s11432-013-4793-x
© Science China Press and Springer-Verlag Berlin Heidelberg 2013 info.scichina.com www.springerlink.com
ZHANG Hao∗, PING XiJian, XU ManKun & WANG Ran
Department of Signal and Information Processing, Zhengzhou Information Science and Technology Institute, Zhengzhou 450002, China
Received March 4, 2012; accepted November 22, 2012
Abstract Subtractive pixel adjacency matrix (SPAM) features, introduced by Pevny et al. as a type of
Markov chain features, are widely used for blind steganalysis in the spatial domain. In this paper, we present
three improvements to SPAM: 1) new features based on parallel subtractive pixels are added to the SPAM features, which refer only to collinear subtractive pixels; 2) features are extracted not only from the spatial image but also from its grayscale-inverted image, making the feature matrices symmetrical and reducing their dimensionality by about half; and 3) a new kind of adjacency matrix is used, reducing the dimensionality of the features by about 3/4. Experimental results show that these dimensionality reduction methods are very effective and that the proposed features outperform SPAM.
Keywords steganalysis, Markov chain, dimensionality reduction, LSB matching, YASS algorithm
Citation Zhang H, Ping X J, Xu M K, et al. Steganalysis by subtractive pixel adjacency matrix and dimension-
ality reduction. Sci China Inf Sci, 2013, 56, doi: 10.1007/s11432-013-4793-x
1 Introduction
Steganalysis aims to expose the presence of data hidden by steganography. Generally, steganalysis is
divided into targeted and blind steganalysis. While the former is used to detect a specific steganography
algorithm, the latter can detect a large number of algorithms and is more attractive in practice. Currently,
a great variety of features for blind steganalysis have been proposed. For instance, wavelet absolute moment
(WAM) features [1,2] are based on the wavelet coefficients’ prediction error, empirical characteristic
function moment features [3,4] are derived from the histogram characteristic function in the spatial or
frequency domains, and Markov chain features (MCFs) [5–7] are associated with joint and transition
probability matrices. As a member of the MCF family, SPAM (subtractive pixel adjacency matrix) [6] is
quite well known for its strong detection ability in the spatial domain. It outperforms both the targeted
steganalyzer ALE (amplitude of local extrema) [8] and the blind steganalyzer WAM on the LSB matching
(also known as ±1 embedding) algorithm, and is superior to the 274 dimensional merged features [9] in
detecting the “yet another steganographic scheme” YASS algorithm [10,11]. Recently, a collection of
MCFs [7] containing SPAM was utilized to detect the content-adaptive algorithm given by Pevny et al. [12]. The resulting steganalyzer, using a classifier [7] more efficient than support vector machines (SVMs), was shown to be very powerful.

∗Corresponding author (email: hao [email protected])
Although MCFs are very useful in steganalysis, their high dimensionality increases the complexity
of training. Consequently, we aim to design an efficient dimensionality reduction (DR) method in this
paper. Because SPAM has been widely used and can be regarded as a representation of MCFs, our
work focuses on improving SPAM. In fact, our methods can be generalized to other kinds of MCFs. By
analyzing SPAM, we find that the differences and transition probability are always computed in the same
direction. This means that SPAM only involves collinear subtractive pixels and the adjacent pixels have
a fixed transitive relationship. This discovery leads to two questions. First, why not change the direction
of transition probability? Second, what would happen if we used a transitive relationship different from that
used in SPAM? To answer the first question, we propose new features based on parallel subtractive pixels,
in which the directions of the differences and the transition probability are perpendicular to one another.
Experimental results in Section 3 demonstrate that this kind of feature is clearly useful for improving
detection ability. To answer the second question, we alter the original transitive relationship to create
a new adjacency matrix. It can be seen that this new adjacency matrix has more symmetric properties
than the previous one, and therefore, the dimensionality can be greatly reduced (see Subsection 2.3 for
the details).
The rest of the paper is organized as follows. Section 2 gives some necessary descriptions including
distribution behavior of adjacent pixels in difference images, expressions of the novel features and the pro-
cess of DR. The following section shows the detection results of the LSB matching and YASS algorithms.
Finally, Section 4 gives our conclusions and future work.
2 Features of adjacent subtractive pixels
In this section, we propose new features based on collinear and parallel subtractive pixels (CPSPs) which
can be seen as an improvement of SPAM. We start by investigating distribution behavior of adjacent
subtractive pixels in Subsection 2.1, and then give some definitions and declarations of novel features in
Subsection 2.2. Finally, DR processes for our features are described in detail.
2.1 Distribution behavior of adjacent subtractive pixels
For a 256-grayscale image $I$, let $D^d(I)$ be its difference image, where the direction $d \in \{0, 1, \ldots, 7\}$ represents an anti-clockwise angle of $d \times 45°$. Subtractive pixels are considered as vectors when referring to their location relationship. Examples of collinear subtractive pixels (CSPs) are $(D^0_{i,j-1}(I), D^0_{i,j}(I))$ and $(D^0_{i,j-1}(I), D^0_{i,j}(I), D^0_{i,j+1}(I))$, while examples of parallel subtractive pixels (PSPs) are $(D^0_{i-1,j}(I), D^0_{i,j}(I))$ and $(D^0_{i-1,j}(I), D^0_{i,j}(I), D^0_{i+1,j}(I))$. The joint distributions of three adjacent CSPs and PSPs are denoted by matrices $C^d_1$ and $C^d_2$ with a truncation threshold $T$. Inspired by the symmetrical distribution of subtractive pixels (usually modeled as a generalized Gaussian distribution), we explored the symmetry properties of $C^d_1$ and $C^d_2$ using their empirical distributions. Details of the image databases 1) used for the experiments are given below:
S1 CAMERA contains 3164 images with a fixed size of 512 × 512.
S2 NRCS contains 1500 images with sizes ranging from 2.3 to 6 Mpix.
S3 BOWS2 consists of 5000 images with a fixed size of 512 × 512.
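To make these definitions concrete, the $d = 0$ difference image and the CSP/PSP triples can be computed with a few lines of numpy. This is our own illustrative sketch, not the authors' code; the function names and the right-minus-left difference convention are assumptions.

```python
import numpy as np

def diff_image_h(img):
    """Horizontal (d = 0) difference image: D0[i, j] = I[i, j+1] - I[i, j].
    The pairing convention is an assumption; the paper only needs
    differences of neighboring pixels along direction d."""
    img = img.astype(np.int16)              # avoid uint8 wrap-around
    return img[:, 1:] - img[:, :-1]

def csp_triples(d0):
    """Collinear subtractive pixels: three horizontally adjacent differences."""
    return np.stack([d0[:, :-2], d0[:, 1:-1], d0[:, 2:]], axis=-1).reshape(-1, 3)

def psp_triples(d0):
    """Parallel subtractive pixels: three vertically adjacent differences
    taken from parallel rows of the same difference image."""
    return np.stack([d0[:-2, :], d0[1:-1, :], d0[2:, :]], axis=-1).reshape(-1, 3)

img = np.array([[10, 12, 11, 15],
                [ 9, 13, 14, 10],
                [11, 11, 12, 12]], dtype=np.uint8)
d0 = diff_image_h(img)
print(d0.tolist())               # [[2, -1, 4], [4, 1, -4], [0, 1, 0]]
print(csp_triples(d0).shape)     # (3, 3): three CSP triples
print(psp_triples(d0).shape)     # (3, 3): three PSP triples
```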
For each database, sample $C^d_i$ values were calculated and the average value of all the samples was taken as the empirical distribution $C^d_i$. In the following, we computed the relative error $E^d_i$ as

$$E^d_i(u, v, w) = \frac{C^d_i(u, v, w) - C^d_i(-u, -v, -w)}{C^d_i(u, v, w) + C^d_i(-u, -v, -w)}.$$
1) All image databases used in this study are available online. S1–http://www.adastral.ucl.ac.uk./gwendoer/steganalysis/;
S2–http://photogallery.nrcs.usda.gov/; S3–http://baws2.gipsa-lab.inpg.fr/.
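The empirical distribution $C^d_i$ and the relative error $E^d_i$ above can be sketched as follows (our own helper names, with a toy random sample standing in for the image databases). Flipping all three axes realizes the map $(u, v, w) \to (-u, -v, -w)$, which also makes the antisymmetry behind $\max E = -\min E$ directly checkable.

```python
import numpy as np

def joint_dist(triples, T):
    """Empirical joint distribution C(u, v, w) of three adjacent
    subtractive pixels, values truncated to [-T, T]."""
    t = np.clip(triples, -T, T) + T                 # shift indices to [0, 2T]
    hist = np.zeros((2*T + 1,) * 3)
    np.add.at(hist, (t[:, 0], t[:, 1], t[:, 2]), 1)
    return hist / hist.sum()

def relative_error(C):
    """E(u,v,w) = (C(u,v,w) - C(-u,-v,-w)) / (C(u,v,w) + C(-u,-v,-w))."""
    Cf = C[::-1, ::-1, ::-1]                        # C evaluated at (-u,-v,-w)
    with np.errstate(invalid="ignore", divide="ignore"):
        E = (C - Cf) / (C + Cf)
    return np.nan_to_num(E)                         # empty cells contribute 0

rng = np.random.default_rng(0)
triples = rng.integers(-4, 5, size=(10000, 3))      # toy stand-in sample
C = joint_dist(triples, T=3)
E = relative_error(C)
print(np.allclose(E, -E[::-1, ::-1, ::-1]))         # True: hence max E = -min E
```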
Table 1 Bounds of the relative error

         T    S1       S2       S3
E^0_1    1    0.0043   0.0158   0.0174
         2    0.0169   0.0261   0.0261
         3    0.0418   0.0407   0.0261
E^0_2    1    0.0031   0.0010   0.0042
         2    0.0066   0.0016   0.0042
         3    0.0107   0.0021   0.0042
It is easy to see that $\max_{u,v,w}\{E^d_i\} = -\min_{u,v,w}\{E^d_i\}$, and as space is limited, we only list values of $\max_{u,v,w}\{E^0_i\}$ (see Table 1).
From the results, it can be seen that the difference between $C^d_i(u, v, w)$ and $C^d_i(-u, -v, -w)$ is very small. Thus we assume that $C^d_i$ is symmetrical about the origin. This assumption implies that the CSPs (PSPs) from both $D^d(I)$ and $D^d(\bar{I})$ have the same distribution, where $\bar{I} = 255 - I$ denotes the grayscale-inverted image. In the following sections, we show that this assumption is useful for feature extraction and dimensionality reduction.
2.2 Introduction to CPSP features
Given a direction $d$, let $M^{i,d}(I)$ and $N^{i,d}(I)$ be the $i$th order feature matrices of CSPs and PSPs, respectively. We list several types of matrices to describe the MCFs of CSPs or PSPs. While types A and B are a pair of transition probability matrices, type C denotes the joint probability matrix. In more detail, the first order matrices of CSPs with direction $d = 0$ are

A: $M^{1,0}_{u,v}(I) = \Pr(D^0_{i,j}(I) = v \mid D^0_{i,j-1}(I) = u)$,
C: $M^{1,0}_{u,v}(I) = \Pr(D^0_{i,j}(I) = v,\ D^0_{i,j-1}(I) = u)$;

the first order matrices of PSPs are

A: $N^{1,0}_{u,v}(I) = \Pr(D^0_{i,j}(I) = v \mid D^0_{i-1,j}(I) = u)$,
C: $N^{1,0}_{u,v}(I) = \Pr(D^0_{i,j}(I) = v,\ D^0_{i-1,j}(I) = u)$;

the second order matrices of CSPs with direction $d = 0$ are

A: $M^{2,0}_{u,v,w}(I) = \Pr(D^0_{i,j+1}(I) = w \mid D^0_{i,j-1}(I) = u,\ D^0_{i,j}(I) = v)$,
B: $M^{2,0}_{u,v,w}(I) = \Pr(D^0_{i,j}(I) = v \mid D^0_{i,j-1}(I) = u,\ D^0_{i,j+1}(I) = w)$,
C: $M^{2,0}_{u,v,w}(I) = \Pr(D^0_{i,j-1}(I) = u,\ D^0_{i,j}(I) = v,\ D^0_{i,j+1}(I) = w)$;

and the second order matrices of PSPs are

A: $N^{2,0}_{u,v,w}(I) = \Pr(D^0_{i+1,j}(I) = w \mid D^0_{i-1,j}(I) = u,\ D^0_{i,j}(I) = v)$,
B: $N^{2,0}_{u,v,w}(I) = \Pr(D^0_{i,j}(I) = v \mid D^0_{i-1,j}(I) = u,\ D^0_{i+1,j}(I) = w)$,
C: $N^{2,0}_{u,v,w}(I) = \Pr(D^0_{i-1,j}(I) = u,\ D^0_{i,j}(I) = v,\ D^0_{i+1,j}(I) = w)$,

where $u, v, w \in \{-T, \ldots, T\}$. We can also define the matrices of other directions in the same way. To create the new features, we incorporate the matrices $M^{i,d}(I)$ ($N^{i,d}(I)$) and $M^{i,d}(\bar{I})$ ($N^{i,d}(\bar{I})$) based on the conclusion in Subsection 2.1 that the CSPs (PSPs) from both $D^d(I)$ and $D^d(\bar{I})$ have the same distribution, and the $i$th order CPSP features of each type are given by

$$F^{i,1} = \frac{1}{8} \sum_{d \in \{0,2,4,6\}} \left( M^{i,d}(I) + M^{i,d}(\bar{I}) \right), \qquad F^{i,2} = \frac{1}{8} \sum_{d \in \{1,3,5,7\}} \left( M^{i,d}(I) + M^{i,d}(\bar{I}) \right),$$
Figure 1 Graphical representation of (a) S1 and (b) S2.
$$G^{i,1} = \frac{1}{8} \sum_{d \in \{0,2,4,6\}} \left( N^{i,d}(I) + N^{i,d}(\bar{I}) \right), \qquad G^{i,2} = \frac{1}{8} \sum_{d \in \{1,3,5,7\}} \left( N^{i,d}(I) + N^{i,d}(\bar{I}) \right).$$
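The role of the grayscale-inverted image can be verified directly: since $\bar{I} = 255 - I$, every difference negates, so a type-C matrix of $\bar{I}$ is the origin-flip of that of $I$, and their sum is exactly symmetric about the origin. A minimal first-order, single-direction sketch (our own helper, not the authors' implementation):

```python
import numpy as np

def joint_matrix_h(img, T):
    """First-order type-C matrix for d = 0: joint histogram of the
    pair (D[i, j-1], D[i, j]) with truncation threshold T."""
    d = np.clip(img[:, 1:].astype(int) - img[:, :-1].astype(int), -T, T)
    u, v = d[:, :-1].ravel() + T, d[:, 1:].ravel() + T
    M = np.zeros((2*T + 1, 2*T + 1))
    np.add.at(M, (u, v), 1)
    return M / M.sum()

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
T = 3
M    = joint_matrix_h(img, T)
Mbar = joint_matrix_h(255 - img, T)   # grayscale-inverted image
F    = (M + Mbar) / 2                 # one term of the CPSP average

# Inversion negates every difference, so Mbar is M flipped about the
# origin, and the combined matrix F is symmetric: F(u, v) = F(-u, -v).
print(np.allclose(Mbar, M[::-1, ::-1]), np.allclose(F, F[::-1, ::-1]))   # True True
```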
By comparing the calculation of the CSP features ($[F^{i,1}, F^{i,2}]$) with that of the SPAM features [6], it is easy to confirm that the two are almost identical except for the addition of the grayscale-inverted image statistic. This subtle change gives CSP features a symmetrical data structure and hence a mechanism for dimensionality reduction; PSP features ($[G^{i,1}, G^{i,2}]$) have the same structure (see Subsection 2.3 for the details). Moreover, comparing CSP and PSP features shows that the two kinds of features capture the distributions of different neighboring pixels. This means that combining CSP and PSP features may detect more of the changes introduced by steganography and thereby improve steganalysis.
2.3 Dimensionality reduction
From observation of matrices of types B and C, we find that

1) $\{(M^{i,d}(I), M^{i,d}(\bar{I})), (N^{i,d}(I), N^{i,d}(\bar{I}))\}_d$ are symmetrical pairs of matrices about the origin;
2) $\{(M^{2,d}(I), M^{2,d+4}(I)), (N^{2,d}(I), N^{2,d+4}(I))\}_{d \in \{0,1,2,3\}}$ are symmetrical pairs of matrices about the line $(u, v) = (-w, 0)$ and the plane $u = w$, respectively.

We can deduce from these properties that

1) $F^{i,1}, F^{i,2}, G^{i,1}, G^{i,2}$ are all symmetrical about the origin;
2) $F^{2,1}, F^{2,2}, G^{2,1}, G^{2,2}$ are all symmetrical about the plane $u = w$ and the line $(u, v) = (-w, 0)$, respectively.

Now we introduce the DR process for the matrices. In the case $i = 1$, since $F^{i,1}, F^{i,2}, G^{i,1}, G^{i,2}$ are symmetrical about the origin $(u, v) = (0, 0)$, about half of the data can be omitted. The coordinate set of the remaining data (see Figure 1) is denoted by

$$S_1 = \{(u, v) \mid u \in \{-T, \ldots, -1\},\ v \in \{-T, \ldots, T\}\} \cup \{(0, v) \mid v \in \{0, \ldots, T\}\}.$$

In the case $i = 2$, since $F^{i,1}, F^{i,2}, G^{i,1}, G^{i,2}$ of types B and C have the symmetry centre $(u, v, w) = (0, 0, 0)$ and the symmetry plane $u = w$, we can delete the data for $v < 0$ and for $w < u$, respectively. Moreover, by using the symmetry axis $(u, v) = (-w, 0)$, the data for $\{(u, v, w) \mid v = 0,\ u < -w\}$ can be discarded. The coordinate set of the remaining data (see Figure 1) is denoted by

$$S_2 = \{(u, v, w) \mid u \leqslant w,\ v \geqslant 1,\ u, v, w \in \{-T, \ldots, T\}\} \cup \{(u, 0, w) \mid 0 \leqslant u \leqslant w,\ u, w \in \{0, \ldots, T\}\} \cup \{(u, 0, w) \mid -w \leqslant u \leqslant -1,\ -u, w \in \{1, \ldots, T\}\}.$$

Since the matrices of type A are merely symmetrical about the origin, the associated coordinate set is denoted by

$$\{(u, v, w) \mid v \geqslant 1,\ u, v, w \in \{-T, \ldots, T\}\} \cup \{(u, 0, w) \mid u \in \{-T, \ldots, -1\},\ w \in \{-T, \ldots, T\}\} \cup \{(0, 0, w) \mid w \in \{0, \ldots, T\}\}.$$
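The coordinate sets can be cross-checked against the per-matrix dimensions in Table 2 by brute-force enumeration. The sketch below is our own consistency check; it reads the (partly garbled) inequalities as $u \leqslant w$ and $v \geqslant 1$, which is the reading that makes the counts come out to $2T^2 + 2T + 1$ for $S_1$ and $(T + 1)(2T^2 + 2T + 1)$ for $S_2$.

```python
def card_S1(T):
    """|S1|: keep the half-plane u <= -1, plus the half-line u = 0, v >= 0."""
    R = range(-T, T + 1)
    return sum(1 for u in R for v in R
               if u <= -1 or (u == 0 and v >= 0))

def card_S2(T):
    """|S2| for type B/C second-order matrices (three disjoint parts)."""
    R = range(-T, T + 1)
    return sum(1 for u in R for v in R for w in R
               if (v >= 1 and u <= w)
               or (v == 0 and 0 <= u <= w)
               or (v == 0 and 1 <= -u <= w))

for T in (1, 2, 3, 4):
    assert card_S1(T) == 2*T*T + 2*T + 1            # Table 2: 1st order, after DR
    assert card_S2(T) == (T + 1)*(2*T*T + 2*T + 1)  # Table 2: 2nd order B/C, after DR
print("per-matrix dimensions match Table 2")
```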
Table 2 Dimensional comparison of CPSP features

Feature        DR        Dimensionality
1st            before    4(2T + 1)²
1st            after     4(2T² + 2T + 1)
2nd A, B, C    before    4(2T + 1)³
2nd A          after     4(T + 1)(4T² + 2T + 1)
2nd B, C       after     4(T + 1)(2T² + 2T + 1)
Table 3 LSB matching detection
S1-25% S1-50% S2-25% S2-50%
DA FP DA FP DA FP DA FP
1st SPAM 162D 86.64% 16.75% 94.07% 8.15% 74.04% 26.89% 90.60% 11.04%
1st CSP-A 82D 86.24% 16.05% 93.93% 8.15% 72.93% 29.85% 91.52% 9.93%
1st CPSP-A 164D 92.22% 7.76% 96.61% 5.86% 79.63% 20.30% 92.89% 7.41%
1st CPSP-C 164D 91.71% 8.71% 96.47% 3.65% 78.78% 23.48% 91.78% 11.11%
2nd SPAM 686D 90.70% 12.99% 97.86% 3.37% 79.33% 22.96% 92.04% 7.11%
2nd CSP-A 200D 91.26% 11.94% 97.91% 2.98% 81.89% 20.00% 93.85% 8.00%
2nd CPSP-A 688D 95.98% 3.30% 99.14% 0.98% 82.30% 19.56% 94.00% 5.63%
2nd CPSP-B 400D 95.54% 6.29% 98.65% 1.40% 81.52% 22.15% 93.22% 7.19%
2nd CPSP-C 400D 94.65% 7.51% 97.65% 3.23% 80.63% 20.07% 92.04% 11.41%
Since CPSP features consist of four matrices, the dimensionality will be four times the cardinality of
Si. Table 2 lists the dimensionality of the features before and after DR.
As T tends to infinity, the dimensionality of the 1st and 2nd order features of type A is reduced by about half, while that of the 2nd order features of types B and C is reduced by about 3/4. It
should be noted that all the CPSP features used in the following section are those after DR.
3 Experimental results
The steganalyzers were constructed by using SVMs with a Gaussian kernel. Ten percent of the carriers
and their corresponding stego images were randomly chosen for training. Detection accuracy (DA) was
used to evaluate the detection ability of the steganalyzer, while false positive (FP) rates associated with
the DA values were used as complementary results.
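The paper does not publish its training code; the sketch below shows how such a steganalyzer could be assembled with scikit-learn (an assumption on our part), following the stated protocol: a Gaussian (RBF) kernel SVM, 10% of the cover/stego pairs for training, and DA/FP as metrics. The synthetic features merely stand in for the CPSP vectors.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-ins for CPSP feature vectors (82-D, as in 1st CSP-A).
rng = np.random.default_rng(42)
cover = rng.normal(0.0, 1.0, size=(500, 82))
stego = rng.normal(0.3, 1.0, size=(500, 82))   # embedding shifts the statistics
X = np.vstack([cover, stego])
y = np.array([0] * 500 + [1] * 500)            # 0 = cover, 1 = stego

# 10% of the images for training, as in the paper's protocol.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=0.1, stratify=y, random_state=0)

clf = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)   # Gaussian kernel
pred = clf.predict(X_te)

da = (pred == y_te).mean()        # detection accuracy over all test images
fp = pred[y_te == 0].mean()       # fraction of covers flagged as stego
print(f"DA = {da:.2f}, FP = {fp:.2f}")
```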
3.1 LSB matching detection
We chose S1 and S2 as carrier sets. The stego sets were created by LSB matching with payloads of 0.25
bits per pixel (bpp) and 0.5 bpp, respectively. T was set to 4 for the first order features and 3 for the
second order ones. The DA values are listed in Table 3.
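The stego sets above can be reproduced in spirit with a small ±1 embedding routine. This is our own sketch of LSB matching, with one simplification flagged in the comments: saturated pixels are clamped rather than forced inward as a real embedder would do.

```python
import numpy as np

def lsb_matching(cover, payload_bpp, rng):
    """±1 embedding: where a message bit disagrees with a pixel's LSB,
    add or subtract 1 at random (clamped to [0, 255] -- a simplification)."""
    stego = cover.astype(np.int16)
    flat = stego.ravel()
    k = int(payload_bpp * flat.size)                  # number of message bits
    idx = rng.choice(flat.size, size=k, replace=False)
    bits = rng.integers(0, 2, size=k)
    need_change = (flat[idx] & 1) != bits             # LSB already correct? skip
    step = rng.choice([-1, 1], size=k) * need_change
    flat[idx] = np.clip(flat[idx] + step, 0, 255)
    return stego.astype(np.uint8)

rng = np.random.default_rng(7)
cover = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
stego = lsb_matching(cover, payload_bpp=0.25, rng=rng)

diff = stego.astype(int) - cover.astype(int)
print(set(np.unique(diff)) <= {-1, 0, 1})     # True: changes are only ±1
print((diff != 0).mean())                     # roughly payload/2 pixels change
```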
From these experimental results, the following conclusions can be reached. 1) Compared with SPAM,
CSP-A features have smaller dimensionality and comparable DA values, which implies that adding $\bar{I}$ to calculate the features does not reduce the detection ability. 2) CPSP-A features are superior to SPAM
and CSP-A features, so PSP features are useful for improving detection ability. 3) CPSP-B features have
smaller dimensionality than CPSP-A features and approximately the same detection ability. 4) Joint
probability features (type C) are inferior to the transition probability features (types A and B).
Table 4 YASS detection-DA
QFh 50 55 60 65 70 75
AP 0.0619 0.0671 0.0714 0.0765 0.0821 0.0877
1st SPAM 88.28% 86.26% 83.77% 80.32% 77.06% 72.24%
1st CSP-A 88.77% 86.97% 85.01% 81.19% 77.78% 73.14%
1st CPSP-A 90.22% 88.32% 86.73% 83.13% 78.44% 73.63%
1st CPSP-C 87.63% 85.57% 82.91% 80.78% 76.72% 73.40%
2nd SPAM 94.58% 92.73% 91.30% 89.71% 85.31% 80.29%
2nd CSP-A 95.15% 93.03% 92.80% 89.87% 87.10% 82.17%
2nd CPSP-A 95.51% 93.87% 92.57% 90.59% 88.59% 83.17%
2nd CPSP-B 95.09% 93.44% 93.08% 89.83% 87.10% 80.73%
2nd CPSP-C 92.53% 90.64% 90.04% 87.90% 85.97% 79.70%
Table 5 YASS detection-FP
QFh 50 55 60 65 70 75
1st SPAM 13.36% 14.04% 17.36% 20.98% 24.80% 29.09%
1st CSP-A 13.33% 14.07% 16.73% 20.62% 25.36% 28.36%
1st CPSP-A 12.22% 13.24% 14.40% 19.56% 21.44% 29.16%
1st CPSP-C 16.20% 15.93% 20.27% 22.49% 26.00% 28.36%
2nd SPAM 5.36% 7.71% 9.80% 12.20% 16.93% 20.56%
2nd CSP-A 5.20% 7.60% 8.73% 12.02% 14.78% 19.11%
2nd CPSP-A 4.87% 6.78% 7.80% 10.16% 12.49% 17.98%
2nd CPSP-B 5.91% 7.76% 7.60% 10.64% 13.64% 20.13%
2nd CPSP-C 10.11% 11.96% 11.73% 13.67% 16.24% 22.13%
Table 6 YASS detection (DA) with CPSP-B features
O T QFh=50 QFh=55 QFh=60 QFh=65 QFh=70 QFh=75
2 3 94.43% 94.07% 93.21% 90.59% 88.38% 82.27%
3 5 95.27% 94.82% 93.09% 91.88% 88.33% 84.21%
4 6 95.74% 95.20% 94.27% 92.82% 90.94% 86.92%
5 6 96.21% 95.42% 94.86% 93.22% 91.12% 88.48%
6 6 94.87% 93.81% 92.76% 91.02% 89.00% 85.54%
3.2 YASS detection
Images of set S3 were embedded with maximal-length random messages by the original YASS algorithm
[10] with six hiding quality factors (QFh) and an advertising quality factor (QFa) of 75. The average
payloads (AP) over the corpus of images are shown in Table 4. It should be noted that features were
extracted from decompressed images and the threshold T was set as described in Subsection 3.1. From
Table 5, we reach the same conclusions as in Subsection 3.1.
In addition, inspired by [7], we extracted CPSP features from higher-order (O ∈ {2, . . . , 6}) differences between neighboring pixels. The threshold T was chosen from {3, 4, 5, 6}. As CPSP-B features have
smaller dimensionality than CPSP-A features and approximately the same detection ability, we chose
CPSP-B features to detect YASS. We observe that the best threshold T corresponding to a given order
O is independent of parameter QFh. Consequently, the optimum T is listed with its DA value for each
order O. In Table 6, the steganalyzer with O = 5 and T = 6 gives the best detection performance.
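The higher-order differences borrowed from [7] can be obtained by iterating the first-order difference; the O-th order residual then corresponds to the alternating-sign binomial kernel (for O = 2, $x_{i,j-1} - 2x_{i,j} + x_{i,j+1}$). A small sketch of this construction, under our own reading of the higher-order scheme:

```python
import numpy as np

def higher_order_diff_h(img, order):
    """O-th order horizontal difference: apply the first-order difference
    D[i, j] = I[i, j+1] - I[i, j] repeatedly, O times. The equivalent
    kernel is the alternating-sign binomial row, e.g. [1, -2, 1] for O = 2."""
    d = img.astype(np.int32)
    for _ in range(order):
        d = d[:, 1:] - d[:, :-1]
    return d

img = np.array([[1, 4, 9, 16, 25]], dtype=np.uint8)   # perfect squares
print(higher_order_diff_h(img, 2).tolist())   # [[2, 2, 2]]: constant for x^2
print(higher_order_diff_h(img, 3).tolist())   # [[0, 0]]: vanishes for x^2
```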
4 Conclusions
In this paper, we proposed an improvement to the SPAM features introduced in [6]. By adding PSP features, the novel steganalyzer outperforms SPAM. In addition, by calculating features from the grayscale-inverted image and using a new adjacency matrix, the dimensionality of the proposed features can be greatly reduced. Extensive experiments show that these methods are very effective. Moreover, the techniques used in this paper can readily be generalized to other MCFs. In the future, we would like to study MCFs in the frequency domain; earlier work in this direction includes the 324-dimensional MCFs proposed in [5]. Investigating MCFs in the frequency domain could be useful for steganalysis of JPEG compressed images.
Acknowledgements
This work was supported by National Natural Science Foundation of China (Grant No. 60970142).
References
1 Lyu S, Farid H. Steganalysis using higher-order image statistics. IEEE Trans Inf Forensic Secur, 2006, 1: 111–119
2 Goljan M, Fridrich J, Holotyak T. New blind steganalysis and its implications. In: Proceedings of SPIE, Electronic
Imaging, Security, Steganography, and Watermarking of Multimedia Contents VIII, San Jose, 2006. 1–13
3 Harmsen J J, Pearlman W A. Steganalysis of additive noise modelable information hiding. In: Proceedings of SPIE,
Electronic Imaging, Security, Steganography, and Watermarking of Multimedia Contents VI, Santa Clara, 2003. 131–
142
4 Wang Y, Moulin P. Optimized feature extraction for learning-based image steganalysis. IEEE Trans Inf Forensic Secur,
2007, 2: 31–45
5 Shi Y Q, Chen C, Chen W. A Markov process based approach to effective attacking JPEG steganography. In:
Camenisch J L, Collberg C S, Johnson N F, et al., eds. Information Hiding, 8th International Workshop. Berlin:
Springer-Verlag, 2006. 249–264
6 Pevny T, Bas P, Fridrich J. Steganalysis by subtractive pixel adjacency matrix. IEEE Trans Inf Forensic Secur, 2010,
5: 215–224
7 Fridrich J, Kodovsky J, Holub V, et al. Steganalysis of content-adaptive steganography in spatial domain. In: Filler T,
Pevny T, Ker A, et al., eds. Information Hiding, 13th International Workshop. Berlin: Springer-Verlag, 2011. 101–116
8 Cancelli G, Doerr G, Cox I, et al. Detection of ±1 steganography based on the amplitude of histogram local extrema.
In: Proceedings of the 15th IEEE International Conference on Image Processing, San Diego, 2008. 12–15
9 Pevny T, Fridrich J. Merging Markov and DCT features for multi-class JPEG steganalysis. In: Proceedings of SPIE,
Electronic Imaging, Security, Steganography, and Watermarking of Multimedia Contents IX, San Jose, 2007. 1–14
10 Solanki K, Sarkar A, Manjunath B S. YASS: Yet another steganographic scheme that resists blind steganalysis. In:
Furon T, Cayre F, Doerr G, et al., eds. Information Hiding, 9th International Workshop. Berlin: Springer-Verlag,
2007. 16–31
11 Sarkar A, Solanki K, Manjunath B S. Further study on YASS: Steganography based on randomized embedding to resist
blind steganalysis. In: Proceedings of SPIE, Electronic Imaging, Security, Forensics, Steganography, and Watermarking
of Multimedia Contents X, San Jose, 2008. 16–31
12 Pevny T, Filler T, Bas P. Using high-dimensional image models to perform highly undetectable steganography. In:
Fong P, Bohme R, Safavi-Naini R, eds. Information Hiding, 12th International Workshop. Berlin: Springer-Verlag,
2010. 161–177