[IEEE 2012 IEEE Electronics, Robotics and Automotive Mechanics Conference (CERMA) - Cuernavaca,...



Page 1: [IEEE 2012 IEEE Electronics, Robotics and Automotive Mechanics Conference (CERMA) - Cuernavaca, Mexico (2012.11.19-2012.11.23)] 2012 IEEE Ninth Electronics, Robotics and Automotive

Semi Fragile Watermarking System in Temporal Domain for PCM Audio Signals

Mario Gonzalez-Lee, Luis J. Morales-Mendoza, Rene F. Vazquez-Bautista, Efren Morales-Mendoza

FIEC Poza Rica, Universidad Veracruzana, Av. Venustiano Carranza S/N, Col. Revolucion, Poza Rica, Veracruz, Mexico

E-mail: [email protected]

Abstract

In this work, a watermark detection system for audio signals in the temporal domain is proposed. The watermark is embedded with the multiplicative embedding rule, and the optimal detection equation is derived for this constraint using the maximum likelihood (ML) criterion. The optimal threshold equation is also derived using the Neyman-Pearson criterion; the resulting system is blind and has very low complexity. Computer simulations were performed by applying the proposed model to audio signals. The results show that the proposed system is semi-fragile: it is able to detect watermarks when the watermarked object has been severely attacked by noise or low-pass filtering, among other attacks, but it is not capable of detecting watermarks after other attacks such as MP3 compression.

1 Introduction

Digital watermarking has been an active research field that has produced very interesting approaches; however, a lot of work remains to be done before watermarking can be considered a solid discipline.

A very helpful approach is to establish analogies with the well-established field of communications theory. In this context, we can think of a watermark as a signal that propagates through a communications channel. This channel can be modeled using a known probability density function.

In this work, we focus on the case of a watermark propagating through a Laplacian channel and its application to audio watermarking. A Laplacian channel is a channel that can be statistically modeled using a Laplacian probability density function (PDF).

Other authors have studied the application of the Laplacian channel to audio watermarking. For example, a semi-blind multiplicative watermarking approach for audio and speech signals is presented in [1]. The watermark is detected using the optimal ML detector aided by channel side information for Gaussian and Laplacian signals in a noisy environment. The scheme was applied to speech and audio signals, and the algorithm operates on the low-frequency components of the host signal. In addition, the power of the watermark was controlled to keep it inaudible, using the perceptual evaluation of audio quality (PEAQ) and perceptual evaluation of speech quality (PESQ) algorithms. However, a drawback of this proposal is that it is semi-blind.

In [3], an algorithm for audio watermarking is proposed. The basic idea is to change the length of the intervals between salient points of the audio signal in order to embed data. The authors propose several ideas for practical implementation that can be used by other watermarking schemes as well. Their results suggest that the algorithm is robust to common audio processing operations, e.g. MP3 lossy compression, low-pass filtering, and time-scale modification. The watermarked signal is claimed to have very high perceptual quality. The major drawback of this proposal is its low bit embedding rate.

Another watermarking scheme is proposed in [4]; this system works for both monophonic and stereophonic audio files. The method uses MPEG 1 Layer 3 compression to determine where and how the embedded mark must be introduced. The results show that the suggested watermarking scheme is robust against the attacks of the audio Stirmark benchmark and against compression attacks, using the sound quality assessment material as the experimental corpus.

The authors of [5] proposed an audio watermarking method for copyright protection that does not require the original signal for watermark detection. It uses analysis filterbank decomposition, a psychoacoustic model and empirical mode decomposition (EMD). The algorithm embeds the watermark bits in the final residue of the subbands in the transform domain. The authors claim that experimental results show that the proposed blind watermarking scheme is robust against MP3 compression and Gaussian noise attacks. A drawback of this method is that it might not be robust to some other common attacks such as band-pass filtering and cropping.

2012 Ninth Electronics, Robotics and Automotive Mechanics Conference

978-0-7695-4878-4/12 $26.00 © 2012 IEEE

DOI 10.1109/CERMA.2012.26

Figure 1. Watermark propagation model. Blocks: user's key, watermark generator, audio signal, watermark embedding, channel (with noise), watermark assessment.

In this paper, we derive an optimal detector for audio signals under the assumption that audio signals can be modeled as a Laplacian channel. Unlike previous approaches, the proposed scheme has very low complexity. Two facts support this: first, the detector variables derived in this paper are much simpler than those of the approaches discussed above; second, both embedding and detection are done in the temporal domain, so no transform must be computed before embedding. Furthermore, the proposed system is blind, that is to say, it does not need the original audio signal.

In the next section, we present the watermark propagation model.

2 Watermarking Embedding Model

The watermarking model used in this work is presented in detail in this section, together with the main properties of the variables involved in the process and the pertinent considerations.

The watermarking model is shown in Fig. 1. We can identify the main input variables: the cover, which is the audio signal that will carry the watermark; a user's key, which is used to generate a pseudo-random signal; and the embedding gain, which is related to the embedding energy of the watermark.

In this work, a watermark is a binary signal $W = \{w_i\}$ with $w_i \in \{-1, 1\}$. This watermark has zero mean and its variance is 1.

This watermark is embedded in the audio signal $X$ so that it is not possible for any third-party observer to assess whether the watermark is present in the watermarked signal $X_w$ or not. Ideally, the audio signal does not interfere with the watermark; however, in practice this is not true, since the embedding process alone damages the watermark. In consequence, we model the effects of the cover within the channel block, and attacks on the watermark (discussed later in this paper) are modeled as noise in the channel during the propagation of the watermark.

Figure 2. Watermark detection model. Blocks: user's key, watermark generator, watermarked samples, compute detection variable $\rho$, compute threshold $T$, comparison of $\rho$ against $T$ with the outcomes "watermark present" and "no watermark present".

Once the watermark reaches the detector, the detector has to assess the presence of the watermark. It usually does so by computing a statistic that measures the presence of the watermark in the possibly watermarked audio signal and comparing it to a threshold that is also computed. If the computed statistic surpasses the threshold, the watermark is detected; otherwise, the watermark is considered to be absent. The complete watermark propagation model and a block diagram of the detection process are depicted in Fig. 1 and Fig. 2, respectively.

These two computed quantities are known as the decision variable $\rho$ and the decision threshold $T$, respectively.

The watermark propagation model just stated is the basis for the mathematical model introduced in this work. To summarize the discussion so far: the watermark is a binary signal transmitted through a communications channel, this channel reflects the influence of the audio signal, and the noise that prevents the watermark from being detected models the attacks on the watermark.

3 Laplacian Channel Model

As stated in the last section, the audio signal interferes with the watermark, effectively taking the role of the communications channel. In this work, we are interested in channels that can be accurately modeled with the Laplacian PDF, so we first review the pertinent aspects of the Laplacian statistical model.

The Laplacian PDF is defined as:

$$f_X(x) = \frac{1}{2b}\, e^{-\frac{|x|}{b}} \qquad (1)$$

for $-\infty < x < \infty$; here we denote a Laplacian PDF with parameter $b$ as $\mathcal{L}(b)$.

The main statistics of the Laplacian PDF are its mean $\mu$, its variance $\sigma^2$ and the scale parameter $b$. Given that $X \sim \mathcal{L}(b)$, these values are estimated as:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i \qquad (2)$$

$$\sigma^2 = 2b^2 \qquad (3)$$

$$b = \frac{1}{N}\sum_{i=1}^{N} |x_i| \qquad (4)$$

for any sequence of length $N$.

These statistics suffice for deriving the optimal decision variables. In particular, one should keep in mind (4), which will be very useful later in this paper.
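As a rough illustration of (2)-(4), the following Python sketch draws synthetic zero-mean Laplacian samples and estimates the statistics from them. The function names `sample_laplacian` and `estimate_stats`, the seed and the scale value 0.25 are assumptions made only for this example; in practice the samples of an audio signal would take the place of the synthetic data.

```python
import random


def sample_laplacian(n, b, seed=0):
    """Draw n zero-mean Laplacian samples with scale b.

    The difference of two independent exponential variables with mean b
    follows a Laplacian distribution with scale b.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b) for _ in range(n)]


def estimate_stats(x):
    """Return (mean, variance, scale) estimated as in (2), (3) and (4)."""
    n = len(x)
    mu = sum(x) / n                      # (2) sample mean
    b = sum(abs(v) for v in x) / n       # (4) scale from the mean absolute value
    var = 2.0 * b * b                    # (3) variance implied by the scale
    return mu, var, b


samples = sample_laplacian(50000, b=0.25, seed=1)
mu, var, b = estimate_stats(samples)
print(f"mean={mu:.4f} variance={var:.4f} b={b:.4f}")
```

With enough samples the estimated scale should land close to the value used to generate the data, which is the property the detector relies on when it estimates the scale directly from the received signal.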

4 Watermark Detection Variables for Laplacian Channels

In this section, the optimal decision variables are derived for a Laplacian channel with the multiplicative embedding rule as the embedding algorithm.

4.1 Watermark Embedding Algorithm

The watermark embedding algorithm used in this paper is introduced in this section. The multiplicative embedding rule is defined as:

$$y_i = x_i\,(1 + \alpha w_i) \qquad (5)$$

where $y_i$ is the $i$-th watermarked sample, $x_i$ is the $i$-th sample of the cover, $w_i$ is the $i$-th watermark bit and $\alpha$ is the watermark embedding gain, which controls the watermark energy and robustness. Multiplicative rules exhibit several desirable properties; the most important is the inherent masking effect, which allows greater embedding strength while imperceptibility holds.
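The embedding rule (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not specify how the user's key is turned into the pseudo-random sequence, so seeding Python's `random.Random` with the key is a stand-in, and the names `generate_watermark` and `embed` are hypothetical.

```python
import random


def generate_watermark(key, n):
    """Pseudo-random sequence of n values in {-1, +1}, seeded by the user's key.

    Assumption: any keyed generator producing equiprobable +/-1 values would do.
    """
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed(x, key, alpha):
    """Apply the multiplicative rule y_i = x_i * (1 + alpha * w_i) of (5)."""
    w = generate_watermark(key, len(x))
    return [xi * (1.0 + alpha * wi) for xi, wi in zip(x, w)]


cover = [0.5, -0.25, 0.0, 1.0]
marked = embed(cover, key=42, alpha=0.1)
print(marked)
```

Each sample is scaled by either 1 + alpha or 1 - alpha, so loud samples receive a larger absolute change than quiet ones, which is the masking effect mentioned above.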

4.2 Optimal Detection of Watermarks

Now that the embedding algorithm has been stated, we can proceed to derive the optimal detection variables: the detection variable $\rho$, which is a measure of the presence of the watermark in a given signal, and the threshold $T$, which provides the reference for deciding whether the signal is watermarked.

A handy approach to watermark detection is to estimate the gain $\alpha$. Analyzing (5), an estimator $\hat{\alpha}$ clearly has two possible outcomes:

$$\hat{\alpha} = \begin{cases} 0 & \text{if no watermark was embedded} \\ \alpha & \text{otherwise} \end{cases} \qquad (6)$$

Thus, if we can estimate the value of $\alpha$ used during the embedding process, we can detect the watermark in the signal under analysis.

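Considering that an audio signal, and thus the channel, can be statistically modeled as $\mathcal{L}(b)$ according to (1), we use the ML criterion to derive the optimal detection variable. Given that a vector of samples is recovered at the receiver, we have the following likelihood function for the underlying Laplacian samples $x_i$:

$$L(\mathbf{x}) = \prod_{i=1}^{N} \frac{1}{2b}\, e^{-\frac{|x_i|}{b}} = \left(\frac{1}{2b}\right)^{N} e^{-\frac{1}{b}\sum_{i=1}^{N}|x_i|} \qquad (7)$$

Finding the maximum of (7) is the same problem as finding the maximum of the exponent, so we have to maximize:

$$\Lambda = -\frac{1}{b}\sum_{i=1}^{N} |x_i| \qquad (8)$$

Recalling that $X \sim \mathcal{L}(b)$, we have from (5):

$$x_i = \frac{y_i}{1 + \alpha w_i} \qquad (9)$$

And (8) becomes:

$$\Lambda = -\frac{1}{b}\sum_{i=1}^{N} \left|\frac{y_i}{1 + \alpha w_i}\right| \qquad (10)$$

Now, finding the maximum of the previous equation with respect to $\alpha$, we set:

$$\frac{\partial \Lambda}{\partial \alpha} = -\frac{1}{b}\sum_{i=1}^{N} \frac{\partial}{\partial \alpha}\left|\frac{y_i}{1 + \alpha w_i}\right| = 0 \qquad (11)$$

Resulting in the following equation:

$$\frac{\partial \Lambda}{\partial \alpha} = \frac{1}{b}\sum_{i=1}^{N} \frac{|y_i|\, w_i\, (1 + \alpha w_i)}{|1 + \alpha w_i|^{3}} = 0 \qquad (12)$$

Since $w_i \in \{-1, 1\}$ and $0 < \alpha < 1$, the term $1 + \alpha w_i$ is always positive and we can get rid of the absolute value operation, which leaves the factor $1/(1 + \alpha w_i)^2$. Using the first-order Taylor series $\frac{1}{(1+x)^2} \approx 1 - 2x$ for $|x| \ll 1$, we have:

$$\frac{\partial \Lambda}{\partial \alpha} \approx \frac{1}{b}\sum_{i=1}^{N} |y_i|\, w_i\, (1 - 2\alpha w_i) = 0 \qquad (13)$$

Solving for the embedding gain $\alpha$ we get:

$$\alpha = \frac{\sum_{i=1}^{N} |y_i|\, w_i}{2\sum_{i=1}^{N} |y_i|\, w_i^2} \qquad (14)$$

And finally, since $w_i^2 = 1$, the optimal detection variable is given by:

$$\rho = \hat{\alpha} = \frac{1}{2 N b_y}\sum_{i=1}^{N} |y_i|\, w_i \qquad (15)$$

where

$$b_y = \frac{1}{N}\sum_{i=1}^{N} |y_i| \qquad (16)$$

which is the estimator (4) applied to the received samples.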
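A minimal sketch of the detection variable, assuming the reading $\rho = \frac{1}{2 N b_y}\sum_i |y_i| w_i$ for (15) with $b_y$ computed as in (16). The keyed generator, the synthetic Laplacian cover and the values of the gain and the keys are assumptions chosen only to make the example runnable; they are not taken from the paper.

```python
import random


def watermark(key, n):
    """Pseudo-random +/-1 sequence seeded by the user's key (assumed generator)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def detection_variable(y, w):
    """Detection variable rho as in (15), with b_y estimated as in (16)."""
    n = len(y)
    b_y = sum(abs(v) for v in y) / n                  # (16)
    if b_y == 0.0:
        return 0.0                                    # an all-zero input carries no evidence
    correlation = sum(abs(v) * wi for v, wi in zip(y, w))
    return correlation / (2.0 * n * b_y)              # (15)


# Synthetic demonstration: Laplacian cover, watermark embedded with key 500.
rng = random.Random(1)
n, alpha = 20000, 0.2
cover = [rng.expovariate(10.0) - rng.expovariate(10.0) for _ in range(n)]
marked = [x * (1.0 + alpha * wi) for x, wi in zip(cover, watermark(500, n))]

rho_right = detection_variable(marked, watermark(500, n))
rho_wrong = detection_variable(marked, watermark(123, n))
print(f"rho with the embedding key: {rho_right:.4f}")
print(f"rho with another key:       {rho_wrong:.4f}")
```

Only absolute values, one multiply-accumulate per sample and a final division are needed, and the original cover is never used, which illustrates the low complexity and the blind nature of the detector.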

In order to properly detect the watermark, the decision variable $\rho$ must be compared to a threshold $T$; a watermark is present if $\rho > T$. The general threshold equation derived from the Neyman-Pearson criterion in [2] is:

$$T = E[\rho] + \sqrt{2V[\rho]}\;\mathrm{erfc}^{-1}(2P_{fp}) \qquad (17)$$

where $P_{fp}$ is the false positive probability, that is to say, the probability that the system determines that a watermark is present when actually no watermark was embedded, and $\mathrm{erfc}^{-1}(\cdot)$ is the inverse complementary error function.

Computing both the expected value $E[\rho]$ and the variance $V[\rho]$ from (15) for a signal that does not carry the watermark under test, so that $w_i$ is independent of $y_i$, we get:

$$E[\rho] = \frac{1}{2 N b_y}\sum_{i=1}^{N} E\big[|y_i|\, w_i\big] = \frac{1}{2 N b_y}\sum_{i=1}^{N} E\big[|y_i|\big]\, E[w_i] = 0 \qquad (18)$$

$$V[\rho] = \frac{1}{4 N^2 b_y^2}\sum_{i=1}^{N} V\big[|y_i|\, w_i\big] \qquad (19)$$

Since the PDF of a random variable $X \sim \mathcal{L}(b)$ is symmetric, the PDF of $|X|$ is $f_{|X|}(x) = 2 f_X(x) = \frac{1}{b} e^{-x/b}$ for $x \geq 0$. Clearly it becomes an exponential PDF with mean $b$ and variance $b^2$, so $E[|y_i|^2] = 2b^2$ and, because $w_i$ has zero mean and unit variance, $V[|y_i|\, w_i] = 2b^2$. Using (19) with $b_y = b$, we get:

$$V[\rho] = \frac{1}{2N} \qquad (20)$$

Finally, the threshold equation is:

$$T = \frac{\mathrm{erfc}^{-1}(2P_{fp})}{\sqrt{N}} \qquad (21)$$

This completes the derivation of the detection variables for the detection model proposed in this paper. In the next section, computer simulations are carried out.
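The threshold can be evaluated with the Python standard library alone. The sketch below assumes the readings $T = E[\rho] + \sqrt{2V[\rho]}\,\mathrm{erfc}^{-1}(2P_{fp})$ for (17) and $T = \mathrm{erfc}^{-1}(2P_{fp})/\sqrt{N}$ for (21). Python has no built-in inverse complementary error function, so `erfc_inv` is a helper defined here through the standard normal quantile; the false positive probability of 1e-6 is an illustrative value, not the one used in the paper.

```python
import math
from statistics import NormalDist


def erfc_inv(p):
    """Inverse complementary error function for 0 < p < 2.

    Uses erfc(z) = 2 * Q(z * sqrt(2)), where Q is the standard normal tail,
    so erfc_inv(p) = -Phi^{-1}(p / 2) / sqrt(2).
    """
    return -NormalDist().inv_cdf(p / 2.0) / math.sqrt(2.0)


def general_threshold(mean, variance, p_fp):
    """General Neyman-Pearson threshold as in (17)."""
    return mean + math.sqrt(2.0 * variance) * erfc_inv(2.0 * p_fp)


def threshold(n, p_fp):
    """Closed-form threshold as in (21) for a block of n samples."""
    return erfc_inv(2.0 * p_fp) / math.sqrt(n)


block_len = 2 * 48000          # two times the 48000 Hz sampling rate
print(threshold(block_len, p_fp=1e-6))
```

The threshold depends only on the block length and the chosen false positive probability, so it can be computed once and reused for every block.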

5 Computer Simulations

In this section, we present simulation results that validate that (15) and (21) provide an accurate watermark detection model.

All tests were carried out under the following scenario: the watermark was embedded using (5) in non-overlapping blocks whose length is 2 times the sampling frequency of the audio signal. Detection is made in the same block-wise fashion: $\rho$ and $T$ are computed for each block using (15) and (21), and the responses for each block are accumulated and averaged. We let $P_{fp}$ = ����. All audio signals used in our tests were uncompressed 16-bit stereo WAV files with a 48000 Hz sampling rate.
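The block-wise scenario can be sketched as follows. Several details are assumptions, since the text does not state them: the same key-generated sequence is reused for every block, samples after the last full block are left unmarked and ignored by the detector, and the detection variable and threshold follow the readings $\rho = \frac{1}{2 N b_y}\sum_i |y_i| w_i$ and $T = \mathrm{erfc}^{-1}(2P_{fp})/\sqrt{N}$. The block length, gain, keys and false positive probability are illustrative, and a synthetic Laplacian signal stands in for the WAV data.

```python
import math
import random
from statistics import NormalDist


def watermark(key, n):
    """Pseudo-random +/-1 sequence seeded by the user's key (assumed generator)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed_blockwise(x, key, alpha, block_len):
    """Apply y_i = x_i * (1 + alpha * w_i) to each full, non-overlapping block."""
    w = watermark(key, block_len)
    y = list(x)                                   # the trailing partial block stays unmarked
    for start in range(0, len(x) - block_len + 1, block_len):
        for k in range(block_len):
            y[start + k] = x[start + k] * (1.0 + alpha * w[k])
    return y


def detect_blockwise(y, key, block_len, p_fp):
    """Average the per-block detection variable and compare it with the threshold."""
    if len(y) < block_len:
        raise ValueError("signal is shorter than one block")
    w = watermark(key, block_len)
    # erfc_inv(2 * p_fp) / sqrt(N), written with the standard normal quantile.
    t = -NormalDist().inv_cdf(p_fp) / math.sqrt(2.0 * block_len)
    rhos = []
    for start in range(0, len(y) - block_len + 1, block_len):
        block = y[start:start + block_len]
        b_y = sum(abs(v) for v in block) / block_len
        corr = sum(abs(v) * wk for v, wk in zip(block, w))
        rhos.append(corr / (2.0 * block_len * b_y) if b_y > 0.0 else 0.0)
    rho = sum(rhos) / len(rhos)
    return rho, t, rho > t


rng = random.Random(11)
block_len = 4000                                  # stands in for 2 x sampling rate
cover = [rng.expovariate(10.0) - rng.expovariate(10.0) for _ in range(5 * block_len + 123)]
marked = embed_blockwise(cover, key=500, alpha=0.2, block_len=block_len)
print(detect_blockwise(marked, key=500, block_len=block_len, p_fp=1e-6))
print(detect_blockwise(marked, key=123, block_len=block_len, p_fp=1e-6))
```

Averaging over blocks lowers the spread of the detection variable while the per-block threshold stays the same, so longer signals give the detector more margin.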

In the next section, we present an evaluation of the performance of the model.

5.1 Detector Performance

One of the most important evaluation parameters is the detection behavior of a given model for an arbitrary set of different watermarks. Ideally, the detector variable from (15) should give a zero response for any watermark different from the embedded one. In practice, this is not possible because the cover signal is always somewhat correlated with the watermark; instead, the detector variable outputs a very low value for any watermark different from the embedded one.

In a good detection model, only the watermark that was embedded in the cover should cross the threshold value. Furthermore, this response should be much larger than the response to any other watermark: the smaller the response to watermarks different from the embedded one, the better the model.

Fig. 3 shows the computed detector variable. A gain value of $\alpha$ = ��� was used to embed a watermark and ���� different watermarks were tested. Only the watermark that was actually embedded, in this case watermark number 500, crosses the computed threshold (dotted line), whilst the other watermarks produce a very low response from the detector variable. This is the behavior expected of a detector that is optimal under the ML criterion, so the system performs as expected.

Figure 3. Detection variable response; the embedded watermark corresponds to key number 500.

In the next section, we carry out several performance tests in which the watermark is attacked.

5.2 Detection Performance Under Attacks to the Watermark

In this section, we present the system performance when the watermarked audio signal is attacked, so that we can verify whether the watermark is still detectable.

An attack on a watermark is either an intentional or an unintentional signal processing operation that damages the watermark. An intentional attack is carried out with the goal of damaging the watermark so that the detector fails to detect it. Otherwise the attack is called unintentional: such operations are carried out for different purposes, for example compression to reduce hard disk usage, which is not in fact an effort to wipe the watermark.

In our first test, the watermarked audio signal was cropped in such a way that ��� of the signal was discarded. The results can be seen in Fig. 4. The system performs very well even when only a few seconds of the audio signal remain. This is a consequence of the block-wise approach: a block of two times the sampling rate spans about two seconds of audio, so in order to crop a useful piece of audio the attacker must preserve several blocks of the watermarked audio.

In another test, additive white noise was added to the watermarked signal; the test parameter is the amplitude of the noise. The watermark was detected even at high noise amplitudes. A plot of the detection variable versus noise amplitude is shown in Fig. 5. One must note that noise with amplitude ��� is very annoying, so the attacked audio is worthless.
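A noise attack of this kind can be imitated on synthetic data. In the sketch below the noise is assumed to be uniform in [-amplitude, amplitude], because the text does not state its distribution. The cover is a synthetic Laplacian signal, the detection variable follows the reading $\rho = \frac{1}{2 N b_y}\sum_i |y_i| w_i$, and the gain, key and amplitudes are illustrative values only.

```python
import random


def watermark(key, n):
    """Pseudo-random +/-1 sequence seeded by the user's key (assumed generator)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def detection_variable(y, w):
    """Detection variable rho computed from the received samples and the watermark."""
    n = len(y)
    b_y = sum(abs(v) for v in y) / n
    return sum(abs(v) * wi for v, wi in zip(y, w)) / (2.0 * n * b_y)


def add_white_noise(y, amplitude, seed=0):
    """Additive white noise, assumed uniform in [-amplitude, amplitude]."""
    rng = random.Random(seed)
    return [v + rng.uniform(-amplitude, amplitude) for v in y]


rng = random.Random(3)
n, alpha, key = 20000, 0.2, 500
w = watermark(key, n)
# Synthetic Laplacian cover (scale 0.1) with the watermark embedded multiplicatively.
marked = [(rng.expovariate(10.0) - rng.expovariate(10.0)) * (1.0 + alpha * wi) for wi in w]

for amplitude in (0.0, 0.05, 0.2, 2.0):
    rho = detection_variable(add_white_noise(marked, amplitude, seed=7), w)
    print(f"noise amplitude {amplitude:4.2f} -> rho = {rho:.4f}")
```

In this toy setting the detection variable shrinks as the noise grows and is close to zero once the noise is much louder than the signal, which is the kind of curve a plot of detection variable versus noise amplitude would show.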

In the next test, the watermarked audio signal was low-pass filtered. Fig. 6 shows the performance of the system for various cutoff frequencies. It was expected that the watermark would not be detectable for low cutoff frequencies, since a watermark is mostly made up of high frequencies. Again, the system exhibits very good performance in most cases.

Figure 4. Detection performance under a cropping attack; successful watermark detection was achieved with only �� of the original audio signal.

However, the system was not able to detect watermarks under certain attacks. Table 1 lists the attacks that make the watermark undetectable and the related test parameters.

Table 1. Failed tests and their related experiment parameters.

Attack              Parameters
Bass Boost          Frequency = 200 Hz, 12 dB
MP3 Compression     Bitrate = 320 kbps
Change Speed        Percent change = �� faster
Change Tempo        Percent change = ��

In the next section, the conclusions of this work are discussed.

6 Conclusions

We have seen in this work that the proposed watermarking system can detect watermarks even under some aggressive attacks, such as additive white noise, whilst it is not capable of detecting watermarks even after light attacks such as a change of speed.

Figure 5. Detection variable $\rho$ versus noise amplitude in a white noise addition attack.

Many approaches prefer to work in some transform domain, claiming superior performance, but this comes at the expense of the many arithmetic operations needed to compute such transforms. In our approach, the number of operations is reduced because of the low complexity of the detector and because there is no need to apply any transform to the data, since the whole process is done in the temporal domain.

Finally, these properties would help in developing real-time applications. In addition, we note that the proposed approach makes good use of the available memory, especially in systems with little memory.

In this way, we showed that the detector variable performs very well in detecting watermarks.

Acknowledgements

The authors wish to thank the FIEC of the Universidad Veracruzana for its support of this work.

References

[1] M. Akhaee, N. Kalantari, and F. Marvasti. Robust multiplicative audio and speech watermarking using statistical modeling. In Communications, 2009. ICC '09. IEEE International Conference on, pages 1-5, June 2009.

[2] M. Gonzalez-Lee. Marcas de Agua Digitales y sus Aplicaciones Practicas. PhD thesis, Instituto Politecnico Nacional.

[3] M. Mansour and A. Tewfik. Audio watermarking by time-scale modification. In Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on, volume 3, pages 1353-1356, 2001.

[4] D. Megias, J. Herrera-Joancomarti, and J. Minguillon. A robust frequency domain audio watermarking scheme for monophonic and stereophonic PCM formats. In Euromicro Conference, 2004. Proceedings. 30th, pages 449-452, Aug.-Sept. 2004.

[5] L. Wang, S. Emmanuel, and M. Kankanhalli. EMD and psychoacoustic model based watermarking for audio. In Multimedia and Expo (ICME), 2010 IEEE International Conference on, pages 1427-1432, July 2010.

Figure 6. Detection variable $\rho$ versus cutoff frequency in a low-pass filtering attack.