[IEEE 2012 IEEE Electronics, Robotics and Automotive Mechanics Conference (CERMA) - Cuernavaca,...
Semi Fragile Watermarking System in Temporal Domain for PCM Audio Signals
Mario Gonzalez-Lee, Luis J. Morales-Mendoza, Rene F. Vazquez-Bautista, Efren Morales-Mendoza
FIEC Poza Rica, Universidad Veracruzana, Av. Venustiano Carranza S/N, Col. Revolucion, Poza Rica, Veracruz, Mexico
E-mail:[email protected]
Abstract
In this work, a watermark detection system for audio signals in the temporal domain is proposed. The watermark embedding algorithm is the multiplicative embedding rule, and the optimal detection equation is derived for these constraints using the maximum likelihood (ML) criterion. The optimal threshold equation is also derived using the Neyman-Pearson criterion; the resulting system is blind and has very low complexity. Computer simulations were performed by applying the proposed model to audio signals. The results show that the proposed system is semi-fragile: it is able to detect watermarks even if the watermarked object was severely attacked by noise or low pass filtering, among other attacks, but it is not capable of detecting watermarks under other attacks such as MP3 compression.
1 Introduction
Digital watermarking has been an active research field that has
produced very interesting approaches; however, a lot of work has
to be done before watermarking can be considered a solid discipline.
A very helpful approach is to establish analogies with the
well-established field of communications theory. In this context,
we can think of a watermark as a signal that propagates through a
communications channel. This channel can be modeled using a known
Probability Density Function (PDF).
In this work, we focus on the case of a watermark propagating
through a Laplacian channel and its application to audio
watermarking. A Laplacian channel is a channel that can be
statistically modeled using a Laplacian PDF.
Other authors have studied the application of the Laplacian channel
to audio watermarking. For example, in [1] a semi-blind
multiplicative watermarking approach for audio and speech signals
is presented. The detection of the watermark is accomplished by
using the optimal ML detector aided by channel side information
for Gaussian and Laplacian signals in a noisy environment. The
authors applied their scheme to speech and audio signals,
embedding the watermark in the low frequency components of the
host signal. In addition, the power of the watermark was
controlled to ensure inaudibility using the perceptual evaluation
of audio quality (PEAQ) and perceptual evaluation of speech
quality (PESQ) algorithms. However, a drawback of this proposal is
that it is semi-blind.
In [3] an algorithm for audio watermarking is proposed. The basic
idea of the algorithm is to change the length of the intervals
between salient points of the audio signal to embed data. The
authors propose several ideas for practical implementations that
can be used by other watermarking schemes as well. Their results
suggest that the algorithm is robust to common audio processing
operations, e.g. MP3 lossy compression, low pass filtering, and
time-scale modification. The watermarked signal is claimed to have
very high perceptual quality. The major drawback of this proposal
is its low bit embedding rate.
Another watermarking scheme is proposed in [4]; this system works
for both monophonic and stereophonic audio files. This method uses
MPEG 1 Layer 3 compression to determine where and how the embedded
mark must be introduced. The results show that the suggested
watermarking scheme is robust against the attacks of the audio
Stirmark benchmark and against compression attacks, using the
sound quality assessment material as the experimental corpus set.
The authors of [5] proposed an audio watermarking method for
copyright protection that does not use the original signal for
watermark detection. The analysis filterbank decomposition, the
psychoacoustic model and the empirical mode decomposition (EMD)
techniques are used. The algorithm proposed in that paper embeds
the watermark bits in the final residue of the subbands in the
transform domain. The authors claim that experimental results show
that the proposed blind watermarking scheme is robust against
2012 Ninth Electronics, Robotics and Automotive Mechanics Conference
978-0-7695-4878-4/12 $26.00 © 2012 IEEE
DOI 10.1109/CERMA.2012.26
Figure 1. Watermark propagation model. Blocks: User's Key, Watermark Generator, Audio Signal, Watermark Embedding, Channel (with Noise), Watermark Assessment.
MP3 compression and Gaussian noise attacks. A drawback of this
method is that it might not be robust to some other common attacks
such as band-pass filtering and cropping.
In this paper, we derive an optimal detector for audio signals
under the assumption that audio signals can be modeled as a
Laplacian channel. Unlike previous approaches, the proposed scheme
has very low complexity. Two facts support this claim. First, the
detector variables derived in this paper are much simpler than
those of the approaches discussed above. Second, both embedding
and detection are done in the temporal domain, so no transform
must be computed before embedding. Furthermore, the proposed
system is blind; that is to say, it does not need the original
audio signal.
In the next section, we present the watermark propagation model.
2 Watermarking Embedding Model
The watermarking model used in this work is presented in detail in
this section, together with the main properties of the variables
involved in the process and other pertinent considerations.
The watermarking model is shown in Fig. 1. We can identify the
main input variables: the cover, which is an audio signal that
will carry the watermark; a user's key, which is used to generate
a pseudo-random signal; and the embedding gain, which is related
to the embedding energy of the watermark.
In this work, a watermark is a binary signal $W = \{w_i\}$ with
$w_i \in \{-1, 1\}$. This watermark has zero mean and its variance
is $1$.
This watermark is embedded in the audio signal $X$ so that it is
not possible for any third party observer to assess whether the
watermark is present in the watermarked signal $X_w$ or not.
Ideally, the audio signal does not interfere with the watermark;
in practice, however, this is not true, since the embedding
process itself damages the watermark. In consequence, we model the
effects of the cover within the channel block, and attacks on the
watermark (attacks will be discussed later in this paper) are
modeled as noise in the channel during the propagation of the
watermark.

Figure 2. Watermark detection model. Blocks: User's Key, Watermark Generator, Watermarked Samples, Compute Detection Variable $\rho$, Compute Threshold $T$, comparison of $\rho$ against $T$, outputs Watermark Present / No Watermark Present.
Once the watermarked signal reaches the detector, the detector has
to assess the presence of the watermark, usually by computing a
statistic that measures the presence of the watermark in the
possibly watermarked audio signal and comparing it to a threshold
that is also computed. If the computed statistic surpasses the
threshold, the watermark is detected; otherwise, the watermark is
considered to be absent. The complete watermark propagation model
and a block diagram of the detection process are depicted in
Fig. 1 and Fig. 2, respectively.
The computed quantities are often known as the decision variable
$\rho$ and the decision threshold $T$, respectively.
The watermark propagation model just stated is the basis for
deriving the mathematical model introduced in this work. To
summarize the discussion so far: the watermark is a binary signal
that is transmitted through a communications channel, this channel
reflects the influence of the audio signal, and the attacks that
prevent the watermark from being detected are modeled as noise in
the channel.
3 Laplacian Channel Model
As stated in the last section, the audio signal interferes with
the watermark, effectively taking the role of the communication
channel. In this work, we are interested in channels that can be
accurately modeled with the Laplacian PDF, so we first review the
pertinent aspects of the Laplacian statistical model.
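The Laplacian PDF is defined as:
$$ f_X(x) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right) \tag{1} $$
for $-\infty < x < \infty$; here we denote a Laplacian PDF with
parameter $b$ as $\mathcal{L}(b)$.
The main statistics of the Laplacian PDF are its mean $\mu$, its
variance $\sigma^2$ and the shape parameter $b$; given that
$x \sim \mathcal{L}(b)$, those values are estimated as:
$$ \mu = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{2} $$
$$ \sigma^2 = 2b^2 \tag{3} $$
$$ b = \frac{1}{N} \sum_{i=1}^{N} |x_i - \mu| \tag{4} $$
for any sequence of length $N$.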
These statistics suffice for deriving the optimal decision
variables; in particular, one should keep in mind (4), which will
be very useful later in this paper.
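As a minimal sketch of how the estimates in (2) to (4) could be computed, the following Python snippet assumes the standard Laplacian parameterization, for which the variance equals $2b^2$. The function names and the synthetic test signal are illustrative assumptions and are not part of the original work.

```python
import random

def laplacian_stats(x):
    """Estimate mean, variance and parameter b of a Laplacian model, as in (2)-(4)."""
    n = len(x)
    mu = sum(x) / n                        # sample mean, Eq. (2)
    b = sum(abs(v - mu) for v in x) / n    # mean absolute deviation, Eq. (4)
    var = 2.0 * b * b                      # variance of a Laplacian with parameter b, Eq. (3)
    return mu, var, b

def laplacian_samples(n, b, seed=0):
    """Hypothetical helper: the difference of two i.i.d. exponentials is Laplacian."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b) for _ in range(n)]

# Synthetic stand-in for audio samples (assumption): zero mean, b = 0.05.
samples = laplacian_samples(50_000, b=0.05)
mu, var, b = laplacian_stats(samples)
print(f"mu={mu:.4f} var={var:.5f} b={b:.4f}")
```

With enough samples, the estimated `b` should approach the value used to generate the synthetic signal.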
4 Watermark Detection Variables for Laplacian Channels
In this section, the optimal decision variables are derived
considering a Laplacian channel and a multiplicative embedding
rule as the embedding algorithm.
4.1 Watermark Embedding Algorithm
The watermark embedding algorithm used in this paper is introduced
in this section. The multiplicative embedding rule is defined as:
$$ y_i = x_i (1 + \alpha w_i) \tag{5} $$
where $y_i$ is the watermarked sample, $x_i$ is the cover sample,
$w_i$ is the i-th watermark bit and $\alpha$ is the watermark
embedding gain, which controls the watermark energy and
robustness. Multiplicative rules exhibit several desirable
properties; the most important is the inherent masking effect,
which allows greater embedding strength while imperceptibility
holds.
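The multiplicative rule can be sketched in a few lines of Python. This is only an illustration: the key-seeded generator, the function names and the gain value 0.1 are assumptions made for the example, since the text does not specify how the pseudo-random signal is produced.

```python
import random

def generate_watermark(key, n):
    """Key-seeded pseudo-random watermark with bits in {-1, +1} (assumed generator)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed(x, w, alpha):
    """Multiplicative rule of Eq. (5): y_i = x_i * (1 + alpha * w_i)."""
    if len(x) != len(w):
        raise ValueError("cover and watermark must have the same length")
    return [xi * (1.0 + alpha * wi) for xi, wi in zip(x, w)]

# Toy cover samples and an illustrative gain; both are hypothetical values.
cover = [0.10, -0.25, 0.40, -0.05]
mark = generate_watermark(key=500, n=len(cover))
marked = embed(cover, mark, alpha=0.1)
print(marked)
```

Each sample is scaled up or down by a factor of $\alpha$, so loud samples receive a proportionally larger modification, which is the masking effect mentioned above.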
4.2 Optimal Detection of Watermarks
Now that the embedding algorithm has been stated, we can proceed
to derive the optimal detection variables: the detection variable
$\rho$, which is a measure of the presence of the watermark in a
given signal; and the threshold $T$, which provides a reference to
decide whether the signal is watermarked.
A handy approach to watermark detection is to estimate the gain
($\alpha$). Analyzing (5), we clearly have two possible outcomes
for an estimator $\hat{\alpha}$:
$$ \hat{\alpha} = \begin{cases} 0 & \text{if no watermark was embedded} \\ \alpha & \text{otherwise} \end{cases} \tag{6} $$
Thus, if we can estimate the value of $\alpha$ used during the
embedding process, then we can detect the watermark in the signal
under analysis.
Considering that an audio signal, and thus the channel, can be
statistically modeled as $\mathcal{L}(b)$ according to (1), we use
the ML criterion to derive the optimal detection variable. Given
that a vector of samples $\mathbf{y}$ is recovered at the
receiver, we have the following likelihood function:
$$ L(\mathbf{y}) = \prod_{i=1}^{N} \frac{1}{2b} \exp\left(-\frac{|x_i - \mu|}{b}\right) = \left(\frac{1}{2b}\right)^{N} \exp\left(-\sum_{i=1}^{N} \frac{|x_i - \mu|}{b}\right) \tag{7} $$
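Finding the maximum of (7) is the same problem as finding the
maximum of the exponent, so we have to maximize:
$$ \Lambda = -\sum_{i=1}^{N} \left|\frac{x_i - \mu}{b}\right| \tag{8} $$
Recalling that $x \sim \mathcal{L}(b)$, we have from (5):
$$ x_i = \frac{y_i}{1 + \alpha w_i} \tag{9} $$
And (8) becomes:
$$ \Lambda = -\sum_{i=1}^{N} \left|\frac{y_i - \mu(1 + \alpha w_i)}{b(1 + \alpha w_i)}\right| \tag{10} $$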
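Now, finding the maximum of the previous equation, we get:
$$ \frac{\partial \Lambda}{\partial \alpha} = -\sum_{i=1}^{N} \frac{\partial}{\partial \alpha} \left|\frac{y_i - \mu(1 + \alpha w_i)}{b(1 + \alpha w_i)}\right| \tag{11} $$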
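Resulting in the following equation:
$$ \frac{\partial \Lambda}{\partial \alpha} = \sum_{i=1}^{N} \frac{y_i - \mu(1 + \alpha w_i)}{\left|y_i - \mu(1 + \alpha w_i)\right|} \cdot \frac{y_i w_i}{b \left|1 + \alpha w_i\right| (1 + \alpha w_i)} \tag{12} $$
Since $w_i \in \{-1, 1\}$ and $0 < \alpha < 1$, $1 + \alpha w_i$ is
always positive, so we can get rid of the absolute value
operation. Taking the audio signal as zero mean ($\mu = 0$) and
using the Taylor series $(1 + x)^{-2} \approx 1 - 2x$ for
$|x| \ll 1$, we have:
$$ \frac{\partial \Lambda}{\partial \alpha} \approx \sum_{i=1}^{N} \frac{|y_i| w_i}{b} (1 - 2 \alpha w_i) \tag{13} $$
Equating (13) to zero and solving for the embedding gain $\alpha$
we get:
$$ \alpha = \frac{\sum_{i=1}^{N} |y_i| w_i}{2 \sum_{i=1}^{N} |y_i| w_i^2} \tag{14} $$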
And finally, since $w_i^2 = 1$ and $\rho \propto \alpha$, the
optimal detection variable is given by:
$$ \rho = \frac{1}{N \mu_y} \sum_{i=1}^{N} |y_i| w_i \tag{15} $$
where
$$ \mu_y = \frac{1}{N} \sum_{i=1}^{N} |y_i| \tag{16} $$
In order to properly detect the watermark, the decision variable
$\rho$ must be compared to a threshold $T$; a watermark is present
if $\rho > T$. The general threshold equation derived from the
Neyman-Pearson criterion in [2] is:
$$ T = \mu_\rho + \operatorname{erfc}^{-1}(2 P_{fp}) \sqrt{2 \sigma_\rho^2} \tag{17} $$
where $P_{fp}$ is the false positive probability, that is to say,
the probability that the system determines that a watermark is
present when actually no watermark was embedded, and
$\operatorname{erfc}^{-1}(\cdot)$ is the inverse complementary
error function.
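Computing both the expected value $\mu_\rho$ and the variance
$\sigma_\rho^2$ from (15), we get:
$$ \mu_\rho = \frac{1}{N \mu_y} \sum_{i=1}^{N} E[|y_i|] \, E[w_i] = 0 \tag{18} $$
$$ \sigma_\rho^2 = \frac{1}{N^2 \mu_y^2} \sum_{i=1}^{N} \sigma_{|y|}^2 \, \sigma_w^2 \tag{19} $$
Since the PDF of $|x|$ for a random variable $x$ distributed
according to $f_X(x)$ is $f_{|X|}(x) = 2 f_X(x)$ for $x \geq 0$
(because of its symmetry), we have for $\mathcal{L}(b)$ that
$f_{|X|}(x) = \frac{1}{b} e^{-x/b}$. Clearly it becomes an
exponential PDF, whose variance equals the square of its mean, and
thus the variance is $\mu_y^2$. Using (19) with $\sigma_w^2 = 1$,
we get:
$$ \sigma_\rho^2 = \frac{1}{N} \tag{20} $$
Finally, the threshold equation is:
$$ T = \operatorname{erfc}^{-1}(2 P_{fp}) \sqrt{\frac{2}{N}} \tag{21} $$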
This completes the derivation of the detection variables for the
detection model proposed in this paper. In the next section,
computer simulations are carried out.
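The detector derived in this section can be sketched as follows. The snippet computes the detection variable as the normalized correlation between $|y_i|$ and $w_i$, and the threshold as $\operatorname{erfc}^{-1}(2P_{fp})\sqrt{2/N}$, using the identity $\operatorname{erfc}^{-1}(2p) = \Phi^{-1}(1-p)/\sqrt{2}$ so that only the Python standard library is needed. The synthetic Laplacian signal, the key-seeded generator, the gain 0.1 and the false positive probability $10^{-6}$ are all assumptions chosen for illustration, not values taken from the paper.

```python
import random
from statistics import NormalDist

def watermark(key, n):
    """Key-seeded pseudo-random watermark with bits in {-1, +1} (assumed generator)."""
    return random.Random(key).choices((-1, 1), k=n)

def detection_variable(y, w):
    """Normalized correlation of |y_i| and w_i, with mu_y the mean of |y_i|."""
    n = len(y)
    mu_y = sum(abs(v) for v in y) / n
    return sum(abs(v) * wi for v, wi in zip(y, w)) / (n * mu_y)

def threshold(n, p_fp):
    """T = erfcinv(2 * p_fp) * sqrt(2 / n), written as Phi^-1(1 - p_fp) / sqrt(n)."""
    return NormalDist().inv_cdf(1.0 - p_fp) / n ** 0.5

# Synthetic stand-in for one audio block (assumption): zero-mean Laplacian samples.
n, alpha, p_fp = 96_000, 0.1, 1e-6
rng = random.Random(1)
x = [rng.expovariate(20.0) - rng.expovariate(20.0) for _ in range(n)]
w = watermark(500, n)
y = [xi * (1.0 + alpha * wi) for xi, wi in zip(x, w)]

t = threshold(n, p_fp)
rho_right = detection_variable(y, w)                   # key used for embedding
rho_wrong = detection_variable(y, watermark(501, n))   # a different key
print(f"T={t:.4f} rho(correct key)={rho_right:.4f} rho(other key)={rho_wrong:.4f}")
```

Note that the detector only needs the watermarked samples and the key, which is what makes the scheme blind.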
5 Computer Simulations
In this section we present simulation results that validate
that (15) and (21) provide an accurate watermark detection
model.
All tests were carried out under the following scenario: the
watermark was embedded in non-overlapping blocks with a length of
2 times the sampling frequency of the audio signal using (5).
Detection is made in the same block-wise fashion: $\rho$ and $T$
are computed for each block using (15) and (21), and the responses
for each block are accumulated and averaged. We let
$P_{fp}$ = ����. All audio signals used for our tests were
uncompressed 16-bit stereo WAV files with a 48000 Hz sampling
rate.
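The block-wise scenario described above might look like the sketch below. It assumes a single key-generated watermark as long as the signal that is sliced block by block (the text does not say whether each block reuses the same sequence), a reduced sampling rate to keep the example fast, and illustrative values for the gain and the false positive probability.

```python
import random
from statistics import NormalDist, fmean

def blockwise_detect(y, w, fs, p_fp):
    """Average the detection variable over non-overlapping blocks of 2*fs samples."""
    block = 2 * fs
    n_blocks = len(y) // block            # a trailing partial block is ignored
    if n_blocks == 0:
        raise ValueError("signal shorter than one block")
    # Every block has the same length, so the per-block threshold is constant.
    t_block = NormalDist().inv_cdf(1.0 - p_fp) / block ** 0.5
    rhos = []
    for k in range(n_blocks):
        yb = y[k * block:(k + 1) * block]
        wb = w[k * block:(k + 1) * block]
        mu_y = fmean(abs(v) for v in yb)
        rhos.append(sum(abs(v) * wi for v, wi in zip(yb, wb)) / (block * mu_y))
    return fmean(rhos), t_block

# Hypothetical test setup: 8 s of synthetic Laplacian "audio" at 8 kHz.
fs, seconds, alpha, p_fp = 8_000, 8, 0.1, 1e-6
rng = random.Random(7)
n = fs * seconds
x = [rng.expovariate(20.0) - rng.expovariate(20.0) for _ in range(n)]
w = random.Random(500).choices((-1, 1), k=n)
y = [xi * (1.0 + alpha * wi) for xi, wi in zip(x, w)]

rho, t = blockwise_detect(y, w, fs, p_fp)
print("watermark present" if rho > t else "no watermark present")
```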
In the next section, we present an evaluation of the performance
of the model.
5.1 Detector Performance
One of the most important evaluation parameters is the detection
behavior of a given model for an arbitrary set of different
watermarks. Ideally, the detector variable from (15) should give a
zero response for any watermark different from the embedded
watermark. In practice, this is not possible due to the fact that
the cover signal is always somewhat correlated with the watermark,
so the detector variable outputs a very low value for any
watermark different from the embedded one.
In a good detection model, only the watermark that was embedded in
the cover should cross the threshold value. Furthermore, this
response should be much larger than the response to any other
watermark; the smaller the response to watermarks different from
the one embedded, the better the model is.
In Fig. 3, we can see the computed detector variable. A gain value
of $\alpha$ = ��� was used to embed a watermark and ���� different
watermarks were tested. Only the watermark that was actually
embedded, in this case watermark number 500, crosses the computed
threshold (dotted line), whilst the other watermarks produce a
very low response from the detector variable, which is consistent
with the behavior expected from the derived detector.
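A scaled-down version of this experiment can be sketched as follows. The number of keys, the embedded key, the gain, the false positive probability and the synthetic Laplacian cover are all assumptions chosen so that the example runs quickly; they are not the values used in the paper.

```python
import random
from statistics import NormalDist

def watermark(key, n):
    """Key-seeded pseudo-random watermark with bits in {-1, +1} (assumed generator)."""
    return random.Random(key).choices((-1, 1), k=n)

def detection_variable(y, w):
    """Normalized correlation of |y_i| and w_i, with mu_y the mean of |y_i|."""
    n = len(y)
    mu_y = sum(abs(v) for v in y) / n
    return sum(abs(v) * wi for v, wi in zip(y, w)) / (n * mu_y)

# Hypothetical experiment size: 100 candidate keys, watermark embedded with key 50.
n, alpha, p_fp = 20_000, 0.1, 1e-6
embedded_key, n_keys = 50, 100

rng = random.Random(12345)
x = [rng.expovariate(20.0) - rng.expovariate(20.0) for _ in range(n)]
y = [xi * (1.0 + alpha * wi) for xi, wi in zip(x, watermark(embedded_key, n))]

t = NormalDist().inv_cdf(1.0 - p_fp) / n ** 0.5
responses = [detection_variable(y, watermark(k, n)) for k in range(n_keys)]
best_key = max(range(n_keys), key=responses.__getitem__)
crossing = [k for k, r in enumerate(responses) if r > t]
print(f"largest response at key {best_key}; keys above threshold: {crossing}")
```

Plotting `responses` against the key index would give a figure of the same kind as Fig. 3, with a single peak at the embedded key.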
Figure 3 shows the system performance; we can see that the system
performs as expected.
Figure 3. Detection variable response; the embedded watermark corresponds to key number 500.
In next section, we will carry out several performance
tests when the watermark is attacked.
5.2 Detection Performance Under Attacks to the Watermark
In this section, we present the system performance when the
watermarked audio signal is attacked, so that we can verify
whether the watermark is still detectable.
An attack on a watermark is either an intentional or an
unintentional signal processing operation that damages the
watermark. An intentional attack is carried out with the goal of
damaging the watermark in order to make the detector fail to
detect it; otherwise the attack is called unintentional.
Unintentional operations are carried out for different purposes,
for example, compression to reduce hard disk usage, which is not
in fact an effort to wipe the watermark.
In our first test, the watermarked audio signal was cropped in
such a way that ��� of the signal was discarded. The results can
be seen in Fig. 4. The system performs very well even with only a
few seconds of the audio signal. This is a consequence of the
block-wise approach: a block of two times the sampling rate spans
two seconds of the audio signal, so in order to crop a useful
piece of audio the attacker must preserve several blocks of the
watermarked audio.
In another test, additive white noise was added; the test
parameter is the amplitude of the noise. The watermark was
detected even at high noise levels. A plot of the detection
variable versus noise amplitude is shown in Fig. 5. One must note
that noise with amplitude ��� is very annoying, so the attacked
audio is worthless.
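A noise attack of this kind can be simulated with the sketch below. The noise distribution is not stated in the text, so uniform white noise in $[-A, A]$ is assumed here, together with a synthetic Laplacian cover and illustrative values for the gain, the noise amplitudes and the false positive probability.

```python
import random
from statistics import NormalDist

def detection_variable(y, w):
    """Normalized correlation of |y_i| and w_i, with mu_y the mean of |y_i|."""
    n = len(y)
    mu_y = sum(abs(v) for v in y) / n
    return sum(abs(v) * wi for v, wi in zip(y, w)) / (n * mu_y)

# Hypothetical setup: synthetic zero-mean Laplacian cover with parameter b = 0.05.
n, alpha, p_fp = 30_000, 0.1, 1e-6
cover_rng = random.Random(2024)
x = [cover_rng.expovariate(20.0) - cover_rng.expovariate(20.0) for _ in range(n)]
w = random.Random(500).choices((-1, 1), k=n)
y = [xi * (1.0 + alpha * wi) for xi, wi in zip(x, w)]
t = NormalDist().inv_cdf(1.0 - p_fp) / n ** 0.5

# Attack: add uniform white noise of increasing amplitude (assumed noise model).
noise_rng = random.Random(99)
results = {}
for amplitude in (0.0, 0.05, 0.2, 1.0):
    attacked = [v + noise_rng.uniform(-amplitude, amplitude) for v in y]
    results[amplitude] = detection_variable(attacked, w)

for amplitude, rho in results.items():
    print(f"amplitude={amplitude:<5} rho={rho:.4f} detected={rho > t}")
```

In this synthetic setting the detection variable shrinks as the noise amplitude grows relative to the signal level, which is the trend plotted in Fig. 5; the amplitude at which detection fails depends on the signal scale and block length, so the numbers here do not reproduce the paper's results.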
In the next test, the watermarked audio signal was low pass
filtered. Fig. 6 shows the performance of the system for various
cutoff frequencies. It was expected that the watermark could not
be detected for low cutoff frequencies, since a watermark is
mostly a signal made up of high frequencies. Again, the system
exhibits very good performance for most cases.

Figure 4. Detection performance under a cropping attack; successful watermark detection was achieved with only �� of the original audio signal.
However, the system was not able to detect watermarks under
certain attacks. Table 1 lists the attacks that make the watermark
undetectable and the related test parameters.

Table 1. Failed tests and their related experiment parameters.
Attack            Parameters
Bass Boost        Frequency = 200 Hz, 12 dB
MP3 Compression   Bitrate = 320 kbps
Change Speed      Percent change = �� faster
Change Tempo      Percent change = ��
In the next section, the conclusions of this work are discussed.
6 Conclusions
We have seen in this work that the proposed watermarking system
can detect watermarks even under some aggressive attacks such as
additive white noise, whilst it is not capable of detecting
watermarks under some light attacks such as a change of speed.
Figure 5. Detection variable $\rho$ versus noise amplitude in a white noise addition attack.
Although many approaches prefer to work in some transform domain,
claiming superior performance, this comes at the expense of the
many arithmetic operations needed to compute such transforms. In
our approach, the number of operations is reduced due to the low
complexity of the detector and because there is no need to apply
any transform to the data, since all processing is done in the
temporal domain.
Finally, these properties would help in developing real-time
applications. We must note in addition that the proposed approach
would make good use of available memory, especially in systems
with little memory available.
The simulations show that the detector variable performs very well
in detecting watermarks, except under the attacks listed in
Table 1.
Acknowledgements
The authors wish to thank the FIEC of the University of Veracruz
for its support of this work.
References
[1] M. Akhaee, N. Kalantari, and F. Marvasti. Robust multiplicative audio and speech watermarking using statistical modeling. In Communications, 2009. ICC ’09. IEEE International Conference on, pages 1-5, June 2009.
[2] M. Gonzalez-Lee. Marcas de Agua Digitales y sus Aplicaciones Practicas. PhD thesis, Instituto Politecnico Nacional.
[3] M. Mansour and A. Tewfik. Audio watermarking by time-scale modification. In Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP ’01). 2001 IEEE International Conference on, volume 3, pages 1353-1356, 2001.
[4] D. Megias, J. Herrera-Joancomarti, and J. Minguillon. A robust frequency domain audio watermarking scheme for monophonic and stereophonic PCM formats. In Euromicro Conference, 2004. Proceedings. 30th, pages 449-452, Aug.-3 Sept. 2004.
[5] L. Wang, S. Emmanuel, and M. Kankanhalli. EMD and psychoacoustic model based watermarking for audio. In Multimedia and Expo (ICME), 2010 IEEE International Conference on, pages 1427-1432, July 2010.

Figure 6. Detection variable $\rho$ versus cutoff frequency in a low pass filtering attack.