Steganalysis by subtractive pixel adjacency matrix and dimensionality reduction
RESEARCH PAPER
SCIENCE CHINA Information Sciences
doi: 10.1007/s11432-013-4793-x
© Science China Press and Springer-Verlag Berlin Heidelberg 2013 info.scichina.com www.springerlink.com
ZHANG Hao∗, PING XiJian, XU ManKun & WANG Ran
Department of Signal and Information Processing, Zhengzhou Information Science and Technology Institute, Zhengzhou 450002, China
Received March 4, 2012; accepted November 22, 2012
Abstract Subtractive pixel adjacency matrix (SPAM) features, introduced by Pevny et al. as a type of
Markov chain features, are widely used for blind steganalysis in the spatial domain. In this paper, we present
three improvements to SPAM: 1) new features based on parallel subtractive pixels are added to the SPAM features, which refer only to collinear subtractive pixels; 2) features are extracted not only from the spatial image but also from its grayscale-inverted image, making the feature matrices symmetrical and reducing their dimensionality by about half; and 3) a new kind of adjacency matrix is used, reducing the dimensionality of the features by about 3/4. Experimental results show that these dimensionality reduction methods are very effective and that the proposed features outperform SPAM.
Keywords steganalysis, Markov chain, dimensionality reduction, LSB matching, YASS algorithm
Citation Zhang H, Ping X J, Xu M K, et al. Steganalysis by subtractive pixel adjacency matrix and dimension-
ality reduction. Sci China Inf Sci, 2013, 56, doi: 10.1007/s11432-013-4793-x
1 Introduction
Steganalysis aims to expose the presence of data hidden by steganography. Generally, steganalysis is
divided into targeted and blind steganalysis. While the former is used to detect a specific steganography
algorithm, the latter can detect a large number of algorithms and is more attractive in practice. Currently,
a great variety of features for blind steganalysis have been proposed. For instance, wavelet absolute moment
(WAM) features [1,2] are based on the wavelet coefficients’ prediction error, empirical characteristic
function moment features [3,4] are derived from the histogram characteristic function in the spatial or
frequency domains, and Markov chain features (MCFs) [5–7] are associated with joint and transition
probability matrices. As a member of the MCF family, SPAM (subtractive pixel adjacency matrix) [6] is
quite well known for its strong detection ability in the spatial domain. It outperforms both the targeted
steganalyzer ALE (amplitude of local extrema) [8] and the blind steganalyzer WAM on the LSB matching
(also known as ±1 embedding) algorithm, and is superior to the 274 dimensional merged features [9] in
detecting the “yet another steganographic scheme” YASS algorithm [10,11]. Recently, a collection of
MCFs [7] containing SPAM was utilized to detect the content-adaptive algorithm given by Pevny et al. [12]. The resulting steganalyzer, using a classifier [7] more efficient than support vector machines (SVMs), was shown to be very powerful.

∗Corresponding author (email: hao [email protected])
Although MCFs are very useful in steganalysis, their high dimensionality increases the complexity
of training. Consequently, we aim to design an efficient dimensionality reduction (DR) method in this
paper. Because SPAM has been widely used and can be regarded as a representation of MCFs, our
work focuses on improving SPAM. In fact, our methods can be generalized to other kinds of MCFs. By
analyzing SPAM, we find that the differences and transition probability are always computed in the same
direction. This means that SPAM only involves collinear subtractive pixels and the adjacent pixels have
a fixed transitive relationship. This discovery leads to two questions. First, why not change the direction
of transition probability? Second, what would happen if we used a transitive relationship different from that
used in SPAM? To answer the first question, we propose new features based on parallel subtractive pixels,
in which the directions of the differences and the transition probability are perpendicular to one another.
Experimental results in Section 3 demonstrate that this kind of feature is clearly useful for improving
detection ability. To answer the second question, we alter the original transitive relationship to create
a new adjacency matrix. It can be seen that this new adjacency matrix has more symmetric properties
than the previous one, and therefore, the dimensionality can be greatly reduced (see Subsection 2.3 for
the details).
The rest of the paper is organized as follows. Section 2 gives some necessary descriptions including
distribution behavior of adjacent pixels in difference images, expressions of the novel features and the pro-
cess of DR. The following section shows the detection results of the LSB matching and YASS algorithms.
Finally, Section 4 gives our conclusions and future work.
2 Features of adjacent subtractive pixels
In this section, we propose new features based on collinear and parallel subtractive pixels (CPSPs) which
can be seen as an improvement of SPAM. We start by investigating distribution behavior of adjacent
subtractive pixels in Subsection 2.1, and then give some definitions and declarations of novel features in
Subsection 2.2. Finally, DR processes for our features are described in detail.
2.1 Distribution behavior of adjacent subtractive pixels
For a 256-grayscale image $I$, let $D^d(I)$ be its difference image, where the direction $d \in \{0, 1, \ldots, 7\}$ represents an anti-clockwise angle of $d \times 45°$. Subtractive pixels are considered as vectors when referring to their location relationship. Examples of collinear subtractive pixels (CSPs) are $(D^0_{i,j-1}(I), D^0_{i,j}(I))$ and $(D^0_{i,j-1}(I), D^0_{i,j}(I), D^0_{i,j+1}(I))$, while examples of parallel subtractive pixels (PSPs) are $(D^0_{i-1,j}(I), D^0_{i,j}(I))$ and $(D^0_{i-1,j}(I), D^0_{i,j}(I), D^0_{i+1,j}(I))$. The joint distributions of three adjacent CSPs and PSPs are denoted by matrices $C^d_1$ and $C^d_2$ with a truncation threshold $T$. Inspired by the symmetrical distribution of subtractive pixels (usually modeled as a generalized Gaussian distribution), we explored the symmetry properties of $C^d_1$ and $C^d_2$ using their empirical distributions. Details of the image databases 1) used for the experiments are given below:
S1 CAMERA contains 3164 images with a fixed size of 512 × 512.
S2 NRCS contains 1500 images with sizes ranging from 2.3 to 6 Mpix.
S3 BOWS2 consists of 5000 images with a fixed size of 512 × 512.
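To make these definitions concrete, the $d = 0$ difference image and the CSP/PSP triples can be computed with a few lines of numpy. This is our own illustrative sketch, not the authors' code; the function names and the right-minus-left difference convention are assumptions.

```python
import numpy as np

def diff_image_h(img):
    """Horizontal (d = 0) difference image: D0[i, j] = I[i, j+1] - I[i, j].
    The pairing convention is an assumption; the paper only needs
    differences of neighboring pixels along direction d."""
    img = img.astype(np.int16)              # avoid uint8 wrap-around
    return img[:, 1:] - img[:, :-1]

def csp_triples(d0):
    """Collinear subtractive pixels: three horizontally adjacent differences."""
    return np.stack([d0[:, :-2], d0[:, 1:-1], d0[:, 2:]], axis=-1).reshape(-1, 3)

def psp_triples(d0):
    """Parallel subtractive pixels: three vertically adjacent differences
    taken from parallel rows of the same difference image."""
    return np.stack([d0[:-2, :], d0[1:-1, :], d0[2:, :]], axis=-1).reshape(-1, 3)

img = np.array([[10, 12, 11, 15],
                [ 9, 13, 14, 10],
                [11, 11, 12, 12]], dtype=np.uint8)
d0 = diff_image_h(img)
print(d0.tolist())               # [[2, -1, 4], [4, 1, -4], [0, 1, 0]]
print(csp_triples(d0).shape)     # (3, 3): three CSP triples
print(psp_triples(d0).shape)     # (3, 3): three PSP triples
```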
For each database, sample $C^d_i$ values were calculated and the average value of all the samples was taken as the empirical distribution $C^d_i$. In the following, we computed the relative error $E^d_i$ as

$$E^d_i(u, v, w) = \frac{C^d_i(u, v, w) - C^d_i(-u, -v, -w)}{C^d_i(u, v, w) + C^d_i(-u, -v, -w)}.$$
1) All image databases used in this study are available online. S1–http://www.adastral.ucl.ac.uk./gwendoer/steganalysis/;
S2–http://photogallery.nrcs.usda.gov/; S3–http://baws2.gipsa-lab.inpg.fr/.
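The empirical distribution $C^d_i$ and the relative error $E^d_i$ above can be sketched as follows (our own helper names, with a toy random sample standing in for the image databases). Flipping all three axes realizes the map $(u, v, w) \to (-u, -v, -w)$, which also makes the antisymmetry behind $\max E = -\min E$ directly checkable.

```python
import numpy as np

def joint_dist(triples, T):
    """Empirical joint distribution C(u, v, w) of three adjacent
    subtractive pixels, values truncated to [-T, T]."""
    t = np.clip(triples, -T, T) + T                 # shift indices to [0, 2T]
    hist = np.zeros((2*T + 1,) * 3)
    np.add.at(hist, (t[:, 0], t[:, 1], t[:, 2]), 1)
    return hist / hist.sum()

def relative_error(C):
    """E(u,v,w) = (C(u,v,w) - C(-u,-v,-w)) / (C(u,v,w) + C(-u,-v,-w))."""
    Cf = C[::-1, ::-1, ::-1]                        # C evaluated at (-u,-v,-w)
    with np.errstate(invalid="ignore", divide="ignore"):
        E = (C - Cf) / (C + Cf)
    return np.nan_to_num(E)                         # empty cells contribute 0

rng = np.random.default_rng(0)
triples = rng.integers(-4, 5, size=(10000, 3))      # toy stand-in sample
C = joint_dist(triples, T=3)
E = relative_error(C)
print(np.allclose(E, -E[::-1, ::-1, ::-1]))         # True: hence max E = -min E
```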
Table 1 Bounds of the relative error

         T    S1       S2       S3
E^0_1    1    0.0043   0.0158   0.0174
         2    0.0169   0.0261   0.0261
         3    0.0418   0.0407   0.0261
E^0_2    1    0.0031   0.0010   0.0042
         2    0.0066   0.0016   0.0042
         3    0.0107   0.0021   0.0042
It is easy to see that $\max_{u,v,w}\{E^d_i\} = -\min_{u,v,w}\{E^d_i\}$, and as space is limited, we only list values of $\max_{u,v,w}\{E^0_i\}$ (see Table 1).
From the results, it can be seen that the difference between $C^d_i(u, v, w)$ and $C^d_i(-u, -v, -w)$ is very small. Thus we assume that $C^d_i$ is symmetrical about the origin. This assumption implies that the CSPs (PSPs) from both $D^d(I)$ and $D^d(\bar{I})$ have the same distribution, where $\bar{I} = 255 - I$ denotes the grayscale-inverted image. In the following sections, we show that this assumption is useful for feature extraction and dimensionality reduction.
2.2 Introduction to CPSP features
Given a direction $d$, let $M^{i,d}(I)$ and $N^{i,d}(I)$ be the $i$th order feature matrices of CSPs and PSPs, respectively. We list several types of matrices to describe the MCFs of CSPs or PSPs. While types A and B are a pair of transition probability matrices, type C denotes the joint probability matrix. In more detail, the first order matrices of CSPs with direction $d = 0$ are

A: $M^{1,0}_{u,v}(I) = \Pr(D^0_{i,j}(I) = v \mid D^0_{i,j-1}(I) = u)$,
C: $M^{1,0}_{u,v}(I) = \Pr(D^0_{i,j}(I) = v,\ D^0_{i,j-1}(I) = u)$;

the first order matrices of PSPs are

A: $N^{1,0}_{u,v}(I) = \Pr(D^0_{i,j}(I) = v \mid D^0_{i-1,j}(I) = u)$,
C: $N^{1,0}_{u,v}(I) = \Pr(D^0_{i,j}(I) = v,\ D^0_{i-1,j}(I) = u)$;

the second order matrices of CSPs with direction $d = 0$ are

A: $M^{2,0}_{u,v,w}(I) = \Pr(D^0_{i,j+1}(I) = w \mid D^0_{i,j-1}(I) = u,\ D^0_{i,j}(I) = v)$,
B: $M^{2,0}_{u,v,w}(I) = \Pr(D^0_{i,j}(I) = v \mid D^0_{i,j-1}(I) = u,\ D^0_{i,j+1}(I) = w)$,
C: $M^{2,0}_{u,v,w}(I) = \Pr(D^0_{i,j-1}(I) = u,\ D^0_{i,j}(I) = v,\ D^0_{i,j+1}(I) = w)$;

and the second order matrices of PSPs are

A: $N^{2,0}_{u,v,w}(I) = \Pr(D^0_{i+1,j}(I) = w \mid D^0_{i-1,j}(I) = u,\ D^0_{i,j}(I) = v)$,
B: $N^{2,0}_{u,v,w}(I) = \Pr(D^0_{i,j}(I) = v \mid D^0_{i-1,j}(I) = u,\ D^0_{i+1,j}(I) = w)$,
C: $N^{2,0}_{u,v,w}(I) = \Pr(D^0_{i-1,j}(I) = u,\ D^0_{i,j}(I) = v,\ D^0_{i+1,j}(I) = w)$,

where $u, v, w \in \{-T, \ldots, T\}$. We can also define the matrices of other directions in the same way. To create the new features, we incorporate the matrices $M^{i,d}(I)$ ($N^{i,d}(I)$) and $M^{i,d}(\bar{I})$ ($N^{i,d}(\bar{I})$) based on the conclusion in Subsection 2.1 that the CSPs (PSPs) from both $D^d(I)$ and $D^d(\bar{I})$ have the same distribution, and the $i$th order CPSP features of each type are given by

$$F^{i,1} = \frac{1}{8} \sum_{d \in \{0,2,4,6\}} \left( M^{i,d}(I) + M^{i,d}(\bar{I}) \right), \qquad F^{i,2} = \frac{1}{8} \sum_{d \in \{1,3,5,7\}} \left( M^{i,d}(I) + M^{i,d}(\bar{I}) \right),$$
Figure 1 Graphical representation of (a) S1 and (b) S2.
$$G^{i,1} = \frac{1}{8} \sum_{d \in \{0,2,4,6\}} \left( N^{i,d}(I) + N^{i,d}(\bar{I}) \right), \qquad G^{i,2} = \frac{1}{8} \sum_{d \in \{1,3,5,7\}} \left( N^{i,d}(I) + N^{i,d}(\bar{I}) \right).$$
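The role of the grayscale-inverted image can be verified directly: since $\bar{I} = 255 - I$, every difference negates, so a type-C matrix of $\bar{I}$ is the origin-flip of that of $I$, and their sum is exactly symmetric about the origin. A minimal first-order, single-direction sketch (our own helper, not the authors' implementation):

```python
import numpy as np

def joint_matrix_h(img, T):
    """First-order type-C matrix for d = 0: joint histogram of the
    pair (D[i, j-1], D[i, j]) with truncation threshold T."""
    d = np.clip(img[:, 1:].astype(int) - img[:, :-1].astype(int), -T, T)
    u, v = d[:, :-1].ravel() + T, d[:, 1:].ravel() + T
    M = np.zeros((2*T + 1, 2*T + 1))
    np.add.at(M, (u, v), 1)
    return M / M.sum()

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
T = 3
M    = joint_matrix_h(img, T)
Mbar = joint_matrix_h(255 - img, T)   # grayscale-inverted image
F    = (M + Mbar) / 2                 # one term of the CPSP average

# Inversion negates every difference, so Mbar is M flipped about the
# origin, and the combined matrix F is symmetric: F(u, v) = F(-u, -v).
print(np.allclose(Mbar, M[::-1, ::-1]), np.allclose(F, F[::-1, ::-1]))   # True True
```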
By comparing the calculation of the CSP features ($[F^{i,1}, F^{i,2}]$) with that of the SPAM features [6], it is easy to confirm that the two are almost identical except for the addition of the grayscale-inverted image statistic. This subtle change gives CSP features a symmetrical data structure and hence a mechanism for dimensionality reduction; PSP features ($[G^{i,1}, G^{i,2}]$) have the same structure (see Subsection 2.3 for the details). Moreover, comparing CSP and PSP features shows that the two kinds of features capture the distributions of different neighboring pixels. This means that combining CSP and PSP features may detect more of the changes introduced by steganography and thereby improve steganalysis.
2.3 Dimensionality reduction
From observation of matrices of types B and C, we find that

1) $\{(M^{i,d}(I), M^{i,d}(\bar{I})), (N^{i,d}(I), N^{i,d}(\bar{I}))\}_d$ are symmetrical pairs of matrices about the origin;
2) $\{(M^{2,d}(I), M^{2,d+4}(I)), (N^{2,d}(I), N^{2,d+4}(I))\}_{d \in \{0,1,2,3\}}$ are symmetrical pairs of matrices about the line $(u, v) = (-w, 0)$ and the plane $u = w$, respectively.

We can deduce from these properties that

1) $F^{i,1}, F^{i,2}, G^{i,1}, G^{i,2}$ are all symmetrical about the origin;
2) $F^{2,1}, F^{2,2}, G^{2,1}, G^{2,2}$ are all symmetrical about the plane $u = w$ and the line $(u, v) = (-w, 0)$, respectively.

Now we introduce the DR process for the matrices. In the case $i = 1$, since $F^{i,1}, F^{i,2}, G^{i,1}, G^{i,2}$ are symmetrical about the origin $(u, v) = (0, 0)$, about half of the data can be omitted. The coordinate set of the remaining data (see Figure 1) is denoted by

$$S_1 = \{(u, v) \mid u \in \{-T, \ldots, -1\},\ v \in \{-T, \ldots, T\}\} \cup \{(0, v) \mid v \in \{0, \ldots, T\}\}.$$

In the case $i = 2$, since $F^{i,1}, F^{i,2}, G^{i,1}, G^{i,2}$ of types B and C have the symmetry centre $(u, v, w) = (0, 0, 0)$ and the symmetry plane $u = w$, we can delete the data for $v < 0$ and for $w < u$, respectively. Moreover, by using the symmetry axis $(u, v) = (-w, 0)$, the data for $\{(u, v, w) \mid v = 0,\ u < -w\}$ can be discarded. The coordinate set of the remaining data (see Figure 1) is denoted by

$$S_2 = \{(u, v, w) \mid u \leqslant w,\ v \geqslant 1,\ u, v, w \in \{-T, \ldots, T\}\} \cup \{(u, 0, w) \mid 0 \leqslant u \leqslant w,\ u, w \in \{0, \ldots, T\}\} \cup \{(u, 0, w) \mid -w \leqslant u \leqslant -1,\ -u, w \in \{1, \ldots, T\}\}.$$

Since the matrices of type A are merely symmetrical about the origin, the associated coordinate set is denoted by

$$\{(u, v, w) \mid v \geqslant 1,\ u, v, w \in \{-T, \ldots, T\}\} \cup \{(u, 0, w) \mid u \in \{-T, \ldots, -1\},\ w \in \{-T, \ldots, T\}\} \cup \{(0, 0, w) \mid w \in \{0, \ldots, T\}\}.$$
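The coordinate sets can be cross-checked against the per-matrix dimensions in Table 2 by brute-force enumeration. The sketch below is our own consistency check; it reads the (partly garbled) inequalities as $u \leqslant w$ and $v \geqslant 1$, which is the reading that makes the counts come out to $2T^2 + 2T + 1$ for $S_1$ and $(T + 1)(2T^2 + 2T + 1)$ for $S_2$.

```python
def card_S1(T):
    """|S1|: keep the half-plane u <= -1, plus the half-line u = 0, v >= 0."""
    R = range(-T, T + 1)
    return sum(1 for u in R for v in R
               if u <= -1 or (u == 0 and v >= 0))

def card_S2(T):
    """|S2| for type B/C second-order matrices (three disjoint parts)."""
    R = range(-T, T + 1)
    return sum(1 for u in R for v in R for w in R
               if (v >= 1 and u <= w)
               or (v == 0 and 0 <= u <= w)
               or (v == 0 and 1 <= -u <= w))

for T in (1, 2, 3, 4):
    assert card_S1(T) == 2*T*T + 2*T + 1            # Table 2: 1st order, after DR
    assert card_S2(T) == (T + 1)*(2*T*T + 2*T + 1)  # Table 2: 2nd order B/C, after DR
print("per-matrix dimensions match Table 2")
```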
Table 2 Dimensional comparison of CPSP features

Feature        DR        Dimensionality
1st            before    4(2T + 1)²
1st            after     4(2T² + 2T + 1)
2nd A, B, C    before    4(2T + 1)³
2nd A          after     4(T + 1)(4T² + 2T + 1)
2nd B, C       after     4(T + 1)(2T² + 2T + 1)
Table 3 LSB matching detection
S1-25% S1-50% S2-25% S2-50%
DA FP DA FP DA FP DA FP
1st SPAM 162D 86.64% 16.75% 94.07% 8.15% 74.04% 26.89% 90.60% 11.04%
1st CSP-A 82D 86.24% 16.05% 93.93% 8.15% 72.93% 29.85% 91.52% 9.93%
1st CPSP-A 164D 92.22% 7.76% 96.61% 5.86% 79.63% 20.30% 92.89% 7.41%
1st CPSP-C 164D 91.71% 8.71% 96.47% 3.65% 78.78% 23.48% 91.78% 11.11%
2nd SPAM 686D 90.70% 12.99% 97.86% 3.37% 79.33% 22.96% 92.04% 7.11%
2nd CSP-A 200D 91.26% 11.94% 97.91% 2.98% 81.89% 20.00% 93.85% 8.00%
2nd CPSP-A 688D 95.98% 3.30% 99.14% 0.98% 82.30% 19.56% 94.00% 5.63%
2nd CPSP-B 400D 95.54% 6.29% 98.65% 1.40% 81.52% 22.15% 93.22% 7.19%
2nd CPSP-C 400D 94.65% 7.51% 97.65% 3.23% 80.63% 20.07% 92.04% 11.41%
Since CPSP features consist of four matrices, the dimensionality will be four times the cardinality of
Si. Table 2 lists the dimensionality of the features before and after DR.
As T tends to infinity, the dimensionality of the 1st and 2nd order features of type A is reduced by about half, while that of the 2nd order features of types B and C is reduced by about 3/4. It
should be noted that all the CPSP features used in the following section are those after DR.
3 Experimental results
The steganalyzers were constructed by using SVMs with a Gaussian kernel. Ten percent of the carriers
and their corresponding stego images were randomly chosen for training. Detection accuracy (DA) was
used to evaluate the detection ability of the steganalyzer, while false positive (FP) rates associated with
the DA values were used as complementary results.
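The paper does not publish its training code; the sketch below shows how such a steganalyzer could be assembled with scikit-learn (an assumption on our part), following the stated protocol: a Gaussian (RBF) kernel SVM, 10% of the cover/stego pairs for training, and DA/FP as metrics. The synthetic features merely stand in for the CPSP vectors.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-ins for CPSP feature vectors (82-D, as in 1st CSP-A).
rng = np.random.default_rng(42)
cover = rng.normal(0.0, 1.0, size=(500, 82))
stego = rng.normal(0.3, 1.0, size=(500, 82))   # embedding shifts the statistics
X = np.vstack([cover, stego])
y = np.array([0] * 500 + [1] * 500)            # 0 = cover, 1 = stego

# 10% of the images for training, as in the paper's protocol.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=0.1, stratify=y, random_state=0)

clf = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)   # Gaussian kernel
pred = clf.predict(X_te)

da = (pred == y_te).mean()        # detection accuracy over all test images
fp = pred[y_te == 0].mean()       # fraction of covers flagged as stego
print(f"DA = {da:.2f}, FP = {fp:.2f}")
```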
3.1 LSB matching detection
We chose S1 and S2 as carrier sets. The stego sets were created by LSB matching with payloads of 0.25
bits per pixel (bpp) and 0.5 bpp, respectively. T was set to 4 for the first order features and 3 for the
second order ones. The DA values are listed in Table 3.
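The stego sets above can be reproduced in spirit with a small ±1 embedding routine. This is our own sketch of LSB matching, with one simplification flagged in the comments: saturated pixels are clamped rather than forced inward as a real embedder would do.

```python
import numpy as np

def lsb_matching(cover, payload_bpp, rng):
    """±1 embedding: where a message bit disagrees with a pixel's LSB,
    add or subtract 1 at random (clamped to [0, 255] -- a simplification)."""
    stego = cover.astype(np.int16)
    flat = stego.ravel()
    k = int(payload_bpp * flat.size)                  # number of message bits
    idx = rng.choice(flat.size, size=k, replace=False)
    bits = rng.integers(0, 2, size=k)
    need_change = (flat[idx] & 1) != bits             # LSB already correct? skip
    step = rng.choice([-1, 1], size=k) * need_change
    flat[idx] = np.clip(flat[idx] + step, 0, 255)
    return stego.astype(np.uint8)

rng = np.random.default_rng(7)
cover = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
stego = lsb_matching(cover, payload_bpp=0.25, rng=rng)

diff = stego.astype(int) - cover.astype(int)
print(set(np.unique(diff)) <= {-1, 0, 1})     # True: changes are only ±1
print((diff != 0).mean())                     # roughly payload/2 pixels change
```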
From these experimental results, the following conclusions can be reached. 1) Compared with SPAM,
CSP-A features have smaller dimensionality and comparable DA values, which implies that adding $\bar{I}$ to calculate the features does not reduce the detection ability. 2) CPSP-A features are superior to SPAM
and CSP-A features, so PSP features are useful for improving detection ability. 3) CPSP-B features have
smaller dimensionality than CPSP-A features and approximately the same detection ability. 4) Joint
probability features (type C) are inferior to the transition probability features (types A and B).
Table 4 YASS detection-DA
QFh 50 55 60 65 70 75
AP 0.0619 0.0671 0.0714 0.0765 0.0821 0.0877
1st SPAM 88.28% 86.26% 83.77% 80.32% 77.06% 72.24%
1st CSP-A 88.77% 86.97% 85.01% 81.19% 77.78% 73.14%
1st CPSP-A 90.22% 88.32% 86.73% 83.13% 78.44% 73.63%
1st CPSP-C 87.63% 85.57% 82.91% 80.78% 76.72% 73.40%
2nd SPAM 94.58% 92.73% 91.30% 89.71% 85.31% 80.29%
2nd CSP-A 95.15% 93.03% 92.80% 89.87% 87.10% 82.17%
2nd CPSP-A 95.51% 93.87% 92.57% 90.59% 88.59% 83.17%
2nd CPSP-B 95.09% 93.44% 93.08% 89.83% 87.10% 80.73%
2nd CPSP-C 92.53% 90.64% 90.04% 87.90% 85.97% 79.70%
Table 5 YASS detection-FP
QFh 50 55 60 65 70 75
1st SPAM 13.36% 14.04% 17.36% 20.98% 24.80% 29.09%
1st CSP-A 13.33% 14.07% 16.73% 20.62% 25.36% 28.36%
1st CPSP-A 12.22% 13.24% 14.40% 19.56% 21.44% 29.16%
1st CPSP-C 16.20% 15.93% 20.27% 22.49% 26.00% 28.36%
2nd SPAM 5.36% 7.71% 9.80% 12.20% 16.93% 20.56%
2nd CSP-A 5.20% 7.60% 8.73% 12.02% 14.78% 19.11%
2nd CPSP-A 4.87% 6.78% 7.80% 10.16% 12.49% 17.98%
2nd CPSP-B 5.91% 7.76% 7.60% 10.64% 13.64% 20.13%
2nd CPSP-C 10.11% 11.96% 11.73% 13.67% 16.24% 22.13%
Table 6 YASS detection (DA) with CPSP-B features
O T QFh=50 QFh=55 QFh=60 QFh=65 QFh=70 QFh=75
2 3 94.43% 94.07% 93.21% 90.59% 88.38% 82.27%
3 5 95.27% 94.82% 93.09% 91.88% 88.33% 84.21%
4 6 95.74% 95.20% 94.27% 92.82% 90.94% 86.92%
5 6 96.21% 95.42% 94.86% 93.22% 91.12% 88.48%
6 6 94.87% 93.81% 92.76% 91.02% 89.00% 85.54%
3.2 YASS detection
Images of set S3 were embedded with maximal-length random messages by the original YASS algorithm
[10] with six hiding quality factors (QFh) and an advertising quality factor (QFa) of 75. The average
payloads (AP) over the corpus of images are shown in Table 4. It should be noted that features were
extracted from decompressed images and the threshold T was set as described in Subsection 3.1. From
Table 5, we reach the same conclusions as in Subsection 3.1.
In addition, inspired by [7], we extracted CPSP features from higher-order (O ∈ {2, . . . , 6}) differences between neighboring pixels. The threshold T was chosen from {3, 4, 5, 6}. As CPSP-B features have
smaller dimensionality than CPSP-A features and approximately the same detection ability, we chose
CPSP-B features to detect YASS. We observe that the best threshold T corresponding to a given order
O is independent of parameter QFh. Consequently, the optimum T is listed with its DA value for each
order O. In Table 6, the steganalyzer with O = 5 and T = 6 gives the best detection performance.
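The higher-order differences borrowed from [7] can be obtained by iterating the first-order difference; the O-th order residual then corresponds to the alternating-sign binomial kernel (for O = 2, $x_{i,j-1} - 2x_{i,j} + x_{i,j+1}$). A small sketch of this construction, under our own reading of the higher-order scheme:

```python
import numpy as np

def higher_order_diff_h(img, order):
    """O-th order horizontal difference: apply the first-order difference
    D[i, j] = I[i, j+1] - I[i, j] repeatedly, O times. The equivalent
    kernel is the alternating-sign binomial row, e.g. [1, -2, 1] for O = 2."""
    d = img.astype(np.int32)
    for _ in range(order):
        d = d[:, 1:] - d[:, :-1]
    return d

img = np.array([[1, 4, 9, 16, 25]], dtype=np.uint8)   # perfect squares
print(higher_order_diff_h(img, 2).tolist())   # [[2, 2, 2]]: constant for x^2
print(higher_order_diff_h(img, 3).tolist())   # [[0, 0]]: vanishes for x^2
```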
4 Conclusions
In this paper, we proposed an improvement to the SPAM features introduced in [6]. By adding PSP features, the novel steganalyzer outperforms SPAM. In addition, by calculating features from the grayscale-inverted image and using a new adjacency matrix, the dimensionality of the proposed features can be greatly reduced. Extensive experiments show that these methods are very effective. Moreover, the techniques used in this paper can readily be generalized to other MCFs. In the future, we would like to study MCFs in the frequency domain; earlier work in this direction includes the 324-dimensional MCFs proposed in [5]. Investigating MCFs in the frequency domain could be useful for steganalysis of JPEG compressed images.
Acknowledgements
This work was supported by National Natural Science Foundation of China (Grant No. 60970142).
References
1 Lyu S, Farid H. Steganalysis using higher-order image statistics. IEEE Trans Inf Forensic Secur, 2006, 1: 111–119
2 Goljan M, Fridrich J, Holotyak T. New blind steganalysis and its implications. In: Proceedings of SPIE, Electronic
Imaging, Security, Steganography, and Watermarking of Multimedia Contents VIII, San Jose, 2006. 1–13
3 Harmsen J J, Pearlman W A. Steganalysis of additive noise modelable information hiding. In: Proceedings of SPIE,
Electronic Imaging, Security, Steganography, and Watermarking of Multimedia Contents VI, Santa Clara, 2003. 131–
142
4 Wang Y, Moulin P. Optimized feature extraction for learning-based image steganalysis. IEEE Trans Inf Forensic Secur,
2007, 2: 31–45
5 Shi Y Q, Chen C, Chen W. A Markov process based approach to effective attacking JPEG steganography. In:
Camenisch J L, Collberg C S, Johnson N F, et al., eds. Information Hiding, 8th International Workshop. Berlin:
Springer-Verlag, 2006. 249–264
6 Pevny T, Bas P, Fridrich J. Steganalysis by subtractive pixel adjacency matrix. IEEE Trans Inf Forensic Secur, 2010,
5: 215–224
7 Fridrich J, Kodovsky J, Holub V, et al. Steganalysis of content-adaptive steganography in spatial domain. In: Filler T,
Pevny T, Ker A, et al., eds. Information Hiding, 13th International Workshop. Berlin: Springer-Verlag, 2011. 101–116
8 Cancelli G, Doerr G, Cox I, et al. Detection of ±1 steganography based on the amplitude of histogram local extrema.
In: Proceedings of the 15th IEEE International Conference on Image Processing, San Diego, 2008. 12–15
9 Pevny T, Fridrich J. Merging Markov and DCT features for multi-class JPEG steganalysis. In: Proceedings of SPIE,
Electronic Imaging, Security, Steganography, and Watermarking of Multimedia Contents IX, San Jose, 2007. 1–14
10 Solanki K, Sarkar A, Manjunath B S. YASS: Yet another steganographic scheme that resists blind steganalysis. In:
Furon T, Cayre F, Doerr G, et al., eds. Information Hiding, 9th International Workshop. Berlin: Springer-Verlag,
2007. 16–31
11 Sarkar A, Solanki K, Manjunath B S. Further study on YASS: Steganography based on randomized embedding to resist
blind steganalysis. In: Proceedings of SPIE, Electronic Imaging, Security, Forensics, Steganography, and Watermarking
of Multimedia Contents X, San Jose, 2008. 16–31
12 Pevny T, Filler T, Bas P. Using high-dimensional image models to perform highly undetectable steganography. In:
Fong P, Bohme R, Safavi-Naini R, eds. Information Hiding, 12th International Workshop. Berlin: Springer-Verlag,
2010. 161–177