
Signal Processing in the Encrypted Domain

Guest Editors: Alessandro Piva and Stefan Katzenbeisser

EURASIP Journal on Information Security


Copyright © 2007 Hindawi Publishing Corporation. All rights reserved.

This is a special issue published in volume 2007 of “EURASIP Journal on Information Security.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Editor-in-Chief

Mauro Barni, University of Siena, Siena, Italy

Associate Editors

Jeffrey A. Bloom, USA
G. Doerr, UK
Jean-Luc Dugelay, France
T. Furon, France
Miroslav Goljan, USA
S. Katzenbeisser, The Netherlands
Hyoung Joong Kim, Korea
D. Kirovski, USA
Deepa Kundur, USA
E. Magli, Italy
Kivanc Mihcak, Turkey
Lawrence O’Gorman, USA
Fernando Perez-Gonzalez, Spain
A. Piva, Italy
Hans Georg Schaathun, UK
Martin Steinebach, Germany
Q. Sun, Singapore
W. Trappe, USA
C. Vielhauer, Germany
S. Voloshynovskiy, Switzerland
Andreas Westfeld, Germany


Contents

Signal Processing in the Encrypted Domain, Alessandro Piva and Stefan Katzenbeisser
Volume 2007, Article ID 82790, 1 page

A Survey of Homomorphic Encryption for Nonspecialists, Caroline Fontaine and Fabien Galand
Volume 2007, Article ID 13801, 10 pages

Secure Multiparty Computation between Distrusted Networks Terminals, S.-C. S. Cheung and Thinh Nguyen
Volume 2007, Article ID 51368, 10 pages

Protection and Retrieval of Encrypted Multimedia Content: When Cryptography Meets Signal Processing, Zekeriya Erkin, Alessandro Piva, Stefan Katzenbeisser, R. L. Lagendijk, Jamshid Shokrollahi, Gregory Neven, and Mauro Barni
Volume 2007, Article ID 78943, 20 pages

Oblivious Neural Network Computing via Homomorphic Encryption, C. Orlandi, A. Piva, and M. Barni
Volume 2007, Article ID 37343, 11 pages

Efficient Zero-Knowledge Watermark Detection with Improved Robustness to Sensitivity Attacks, Juan Ramon Troncoso-Pastoriza and Fernando Perez-Gonzalez
Volume 2007, Article ID 45731, 14 pages

Anonymous Fingerprinting with Robust QIM Watermarking Techniques, J. P. Prins, Z. Erkin, and R. L. Lagendijk
Volume 2007, Article ID 31340, 13 pages

Transmission Error and Compression Robustness of 2D Chaotic Map Image Encryption Schemes, Michael Gschwandtner, Andreas Uhl, and Peter Wild
Volume 2007, Article ID 48179, 16 pages


Hindawi Publishing Corporation
EURASIP Journal on Information Security
Volume 2007, Article ID 82790, 1 page
doi:10.1155/2007/82790

Editorial

Signal Processing in the Encrypted Domain

Alessandro Piva1 and Stefan Katzenbeisser2

1 Department of Electronics and Telecommunications, University of Florence, Via S. Marta 3, 50139 Firenze, Italy
2 Information & System Security Group, Philips Research Europe, High Tech Campus 34 MS 61, 5656 AE Eindhoven, The Netherlands

Correspondence should be addressed to Alessandro Piva, [email protected]

Received 31 December 2007; Accepted 31 December 2007

Copyright © 2007 A. Piva and S. Katzenbeisser. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Recent advances in digital signal processing have enabled a number of new services in various application domains, ranging from enhanced multimedia content production and distribution to advanced healthcare systems for continuous health monitoring. At the heart of these services lies the ability to securely manipulate “valuable” digital signals in order to satisfy security requirements such as intellectual property management, authenticity, privacy, and access control. Currently available technological solutions for “secure manipulation of signals” apply cryptographic primitives by building a secure layer on top of existing signal processing modules, able to protect them from leakage of critical information, assuming that the involved parties or devices trust each other. This implies that the cryptographic layer is used only to protect the data against access by unauthorized third parties or to provide authenticity. However, this is often not enough to ensure the security of the application, since the owner of the data may not trust the processing devices, or the actors that are required to manipulate them.

It is clear that the availability of signal processing algorithms that work directly on encrypted signals would be of great help for application scenarios where signals must be produced, processed, or exchanged securely.

Whereas the development of tools capable of processing encrypted signals may seem a formidable task, some recent, still scattered, studies, spanning from secure embedding and detection of digital watermarks and secure content distribution to compression of encrypted data and access to encrypted databases, have shown that performing signal processing operations on encrypted content is indeed possible.

We are delighted to present the first issue of a journal entirely devoted to signal processing in the encrypted domain. The issue contains both survey papers allowing the reader to become acquainted with this exciting field, and research papers discussing the latest developments.

The first part of the special issue contains three survey papers: Fontaine and Galand give an overview of homomorphic encryption, which is one of the key tools for signal processing in the encrypted domain, in their paper “A survey of homomorphic encryption for nonspecialists.” An introduction to the field of secure multiparty computation is provided by the paper “Secure multiparty computation between distrusted networks terminals” by Cheung and Nguyen. Finally, research in the area of signal processing under encryption is surveyed in the paper “Protection and retrieval of encrypted multimedia content: when cryptography meets signal processing” by Erkin et al.

The second part of the special issue contains four research papers. Orlandi et al. introduce the notion of oblivious computing with neural networks in the paper “Oblivious neural network computing via homomorphic encryption.” Troncoso-Pastoriza and Perez-Gonzalez present new protocols for zero-knowledge watermark detection in their paper “Efficient zero-knowledge watermark detection with improved robustness to sensitivity attacks.” Prins et al. show in their paper “Anonymous fingerprinting with robust QIM watermarking techniques” how advanced quantization-index-modulation watermarking schemes can be used in conjunction with buyer-seller watermarking protocols. Finally, Gschwandtner et al. explore properties of specialized image encryption schemes in their paper “Transmission error and compression robustness of 2D chaotic map image encryption schemes.”

Finally, we would like to thank all the authors, as well as all the reviewers, for their contribution to this issue. We hope that the readers will enjoy this special issue and that it encourages more colleagues to devote time to this novel and exciting field of research.

Alessandro Piva
Stefan Katzenbeisser


Hindawi Publishing Corporation
EURASIP Journal on Information Security
Volume 2007, Article ID 13801, 10 pages
doi:10.1155/2007/13801

Review Article

A Survey of Homomorphic Encryption for Nonspecialists

Caroline Fontaine and Fabien Galand

CNRS/IRISA-TEMICS, Campus de Beaulieu, 35042 Rennes Cedex, France

Correspondence should be addressed to Caroline Fontaine, [email protected]

Received 30 March 2007; Revised 10 July 2007; Accepted 24 October 2007

Recommended by Stefan Katzenbeisser

Processing encrypted signals requires special properties of the underlying encryption scheme. A possible choice is the use of homomorphic encryption. In this paper, we propose a selection of the most important available solutions, discussing their properties and limitations.

Copyright © 2007 C. Fontaine and F. Galand. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

The goal of encryption is to ensure the confidentiality of data in communication and storage processes. Recently, its use in constrained devices has led to the consideration of additional features, such as the ability to delegate computations to untrusted computers. For this purpose, we would like to give the untrusted computer only an encrypted version of the data to process. The computer will perform the computation on this encrypted data, hence without knowing anything about its real value. Finally, it will send back the result, and we will decrypt it. For coherence, the decrypted result has to be equal to the value that would have been computed from the original data. For this reason, the encryption scheme has to present a particular structure. Rivest et al. proposed in 1978 to solve this issue through homomorphic encryption [1]. Unfortunately, Brickell and Yacobi pointed out in [2] some security flaws in the first proposals of Rivest et al. Since this first attempt, many articles have proposed solutions dedicated to numerous application contexts: secret sharing schemes, threshold schemes (see, e.g., [3]), zero-knowledge proofs (see, e.g., [4]), oblivious transfer (see, e.g., [5]), commitment schemes (see, e.g., [3]), anonymity, privacy, electronic voting, electronic auctions, lottery protocols (see, e.g., [6]), protection of mobile agents (see, e.g., [7]), multiparty computation (see, e.g., [3]), mix-nets (see, e.g., [8, 9]), watermarking and fingerprinting protocols (see, e.g., [10–14]), and so forth.

The goal of this article is to provide nonspecialists with a survey of homomorphic encryption techniques. Section 2 recalls some basic concepts of cryptography and presents homomorphic encryption; it is particularly aimed at noncryptographers, providing guidelines about the main characteristics of encryption primitives: algorithms, performance, and security. Section 3 provides a survey of the homomorphic encryption schemes published so far and analyses their characteristics.

Most schemes we describe are based on mathematical notions the reader may not be familiar with. Where these notions can easily be introduced, we present them briefly. The reader may refer to [15] for more information concerning those we could not introduce properly, or for the algorithmic problems related to their computation.

Before going deeper into the subject, let us introduce some notation. The integer ℓ(x) denotes the number of bits constituting the binary expansion of x. As usual, Z_n will denote the set of integers modulo n, and Z*_n the set of its invertible elements.

2. TOWARDS HOMOMORPHIC ENCRYPTION

2.1. Basics about encryption

In this section, we will recall some important concepts concerning encryption schemes. For more precise information, the reader may refer to [16] or the more recent [17].

Encryption schemes are, first and foremost, designed to preserve confidentiality. According to Kerckhoffs’ principle (see [18, 19] for the original papers, or any book on cryptography), their security must not rely on the obfuscation of their code, but only on the secrecy of the decryption key. We can distinguish two kinds of encryption schemes: symmetric


and asymmetric ones. We will present them shortly and discuss their performance and security issues.

Symmetric encryption schemes

Here “symmetric” means that encryption and decryption are performed with the same key. Hence, the sender and the receiver have to agree on the key they will use before performing any secure communication. Therefore, it is not possible for two people who have never met to use such schemes directly. This also implies sharing a different key with everyone we want to communicate with. Nevertheless, symmetric schemes have the advantage of being really fast, and are used as often as possible. In this category, we can distinguish block ciphers (AES [20, 21])1 and stream ciphers (the one-time pad presented in Figure 1 [22], Snow 2.0 [23]),2 which are even faster.

Asymmetric encryption schemes

In contrast to the previous family, asymmetric schemes introduce a fundamental difference between the abilities to encrypt and to decrypt. The encryption key is public, while the decryption key remains private. When Bob wants to send an encrypted message to Alice, he uses her public key to encrypt the message. Alice will then use her private key to decrypt it. Such schemes are more functional than symmetric ones since there is no need for the sender and the receiver to agree on anything before the transaction. Moreover, they often provide more features. These schemes, however, have a big drawback: they are based on nontrivial mathematical computations, and are much slower than symmetric ones. The two most prominent examples, RSA [24] and ElGamal [25], are presented in Figures 2 and 3.

Performance issues

A block cipher like AES is typically 100 times faster than RSA encryption and 2000 times faster than RSA decryption, handling about 60 MB per second on a modest platform. Stream ciphers are even faster, some of them being able to encrypt/decrypt 100 MB per second or more.3 Thus, while encryption or decryption of the whole content of a DVD will take about a minute with a fast stream cipher, it is simply not realistic to use an asymmetric cipher in practice for such a huge amount of data, as it would require hours, or even days, to encrypt or decrypt.

Hence, in practice, it is usual to encrypt the data we wantto transmit with an efficient symmetric cipher. To provide

1 AES has been standardized; see http://csrc.nist.gov/groups/ST/toolkit/block ciphers.html for more details.

2 Snow 2.0 is included in the draft of Norm ISO/IEC 18033-4, http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=3997.

3 See, for example, http://www.ecrypt.eu.org/stream/perf/alpha/bench-marks/snow-2.0 for some benchmarks of Snow 2.0, or openssl for AES and RSA.

the receiver with the secret key needed to recover the data, the sender encrypts this key with an asymmetric cipher. Hence, the asymmetric cipher is used to encrypt only a short piece of data, while the symmetric one is used for the longer one. The sender and the receiver do not need to share anything before performing the encryption/decryption, as the symmetric key is transmitted with the help of the public key of the receiver. Proceeding this way, we combine the advantages of both: the efficiency of symmetric schemes and the functionality of asymmetric ones.
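The hybrid construction described above can be sketched as follows. This is a minimal, illustrative toy (the XOR "stream cipher", the one-byte key, and the tiny RSA parameters are all assumptions chosen for readability, and are NOT secure): the bulk data goes through a fast symmetric cipher, and only the short symmetric key goes through the asymmetric one.

```python
import secrets

def xor_stream(data: bytes, key: bytes) -> bytes:
    # Stand-in for a real stream cipher: XOR with a repeating key.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Toy RSA key: n = 61 * 59 = 3233, e = 17, d = e^-1 mod phi(n) = 2753.
n, e, d = 3233, 17, 2753

message = b"a long multimedia payload, encrypted symmetrically"
sym_key = secrets.token_bytes(1)         # short symmetric key (toy size)

bulk_ct = xor_stream(message, sym_key)   # fast, applied to all the data
key_ct = pow(sym_key[0], e, n)           # slow, applied only to the key

# Receiver side: recover the symmetric key asymmetrically, then the data.
recovered_key = bytes([pow(key_ct, d, n)])
assert xor_stream(bulk_ct, recovered_key) == message
```

The point is the asymmetry of cost: the slow modular exponentiation touches one byte, while the cheap symmetric pass touches the whole payload.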

Security issues

Security of encryption schemes was formalized for the first time by Shannon [26]. In his seminal paper, Shannon introduced the notion of perfect secrecy/unconditional security, which characterizes encryption schemes for which the knowledge of a ciphertext does not give any information either about the corresponding plaintext or about the key. He proved that the one-time pad is perfectly secure under some conditions, as explained in Figure 1. In fact, no other scheme, neither symmetric nor asymmetric, has been proved unconditionally secure. Hence, if we omit the one-time pad, any encryption scheme’s security is evaluated with regard to the computational power of the opponent. In the case of asymmetric schemes, we can rely on their mathematical structure to estimate their security level in a formal way. They are based on some well-identified mathematical problems which are hard to solve in general, but easy to solve for the one who knows the trapdoor, that is, the owner of the keys. Hence, it is easy for the owner of the keys to compute his/her private key, but no one else should be able to do so, as the knowledge of the public key should not endanger the private key. Through reductions, we can compare the security level of these schemes with the difficulty of solving these mathematical problems (factorizing large integers or computing a discrete logarithm in a large group), which are famous for their hardness. Proceeding this way, we obtain an estimate of the security level, which sometimes turns out to be optimistic. This estimation may not be sufficient, for several reasons. First, there may be other ways to break the system than solving the reference mathematical problem [27, 28]. Second, most security proofs are performed in an idealized model called the random oracle model, in which involved primitives, for example, hash functions, are considered truly random. This model has allowed the study of the security level of numerous asymmetric ciphers. Recent works show that we are now able to perform proofs in a more realistic model called the standard model. From [29] to [30], a lot of papers have compared these two models, discussing the gap between them. In parallel with this formal estimation of the security level, an empirical one is performed in any case, and new symmetric and asymmetric schemes are evaluated according to published attacks.

The framework of a security evaluation was stated by Shannon in 1949 [26]: all the considered messages are encrypted with the same key—so, for the same recipient—and the opponent’s challenge is to take advantage of all his observations to disclose the involved secret/private key.


Usually, to evaluate the attack capacity of the opponent, we distinguish among several contexts [31]: ciphertext-only attacks (where the opponent has access only to some ciphertexts), known-plaintext attacks (where the opponent has access to some pairs of corresponding plaintexts and ciphertexts), chosen-plaintext attacks (same as the previous, but the opponent can choose the plaintexts and get the corresponding ciphertexts), and chosen-ciphertext attacks (the opponent has access to a decryption oracle, behaving as a black box, that takes a ciphertext and outputs the corresponding plaintext). The first context is the most frequent in real life, and results from eavesdropping on the communication channel; it is the worst case for the opponent. The other cases may seem difficult to achieve, and may arise when the opponent has a more powerful position; he may, for example, have stolen some plaintexts, or an encryption engine. The “chosen” ones exist in adaptive versions, where the opponent can wait for a computation result before choosing the next input.

How do we choose the right scheme?

The right scheme is the one that fits your constraints best. By constraints, we mean constraints on time, memory, security, and so forth. The first two criteria are very important in highly constrained architectures, often encountered in very small devices (PDAs, smart cards, RFID tags, etc.). They are also important if we process a huge amount of data, or numerous data at the same time, for example, video streams. Some schemes such as AES or RSA are usually chosen because of their reputation, but it is important to note that new schemes are proposed each year. Indeed, it is necessary to maintain diversity in the proposals. First, this is necessary in order to be able to face new kinds of requirements. Second, for security reasons: having all the schemes rely on the same structure may lead to a disaster in case an attack breaks this structure. Hence, huge international projects have been funded to call for new proposals, with a fair evaluation to check their advantages and drawbacks, for example, RIPE, NESSIE,4 NIST’s call for the design of the AES,5 CRYPTREC,6 ECRYPT,7 and so forth.

2.2. Probabilistic encryption

The most well-known cryptosystems are deterministic: for a fixed encryption key, a given plaintext will always be encrypted into the same ciphertext. This may lead to some drawbacks. RSA is a good example to illustrate this point:

(i) particular plaintexts may be encrypted in an overly structured way: with RSA, messages 0 and 1 are always encrypted as 0 and 1, respectively;

(ii) it may be easy to compute partial information about the plaintext: with RSA, the ciphertext c leaks one bit

4 see http://www.cryptonessie.org.
5 see http://csrc.nist.gov and http://csrc.nist.gov/CryptoToolkit/aes.
6 see http://www.ipa.go.jp/security/enc/CRYPTREC/index-e.html.
7 see http://www.ecrypt.eu.org.

of information about the plaintext m, namely, the so-called Jacobi symbol;

(iii) when using a deterministic encryption scheme, it is easy to detect when the same message is sent twice under the same key.

So, in practice, we prefer encryption schemes to be probabilistic. In the case of symmetric schemes, we introduce a random vector into the encryption process (e.g., in the pseudorandom generator for stream ciphers, or in the operating mode for block ciphers), generally called the IV. This vector may be public, and transmitted as it is, without being encrypted, but it must be changed every time we encrypt a message. In the case of asymmetric ciphers, the security analysis is more mathematical, and we want the randomized schemes to remain analyzable in the same way as the deterministic schemes. Some adequate modes have been proposed to randomize already published deterministic schemes, such as the Optimal Asymmetric Encryption Padding (OAEP) for RSA (or any scheme based on a trapdoor one-way permutation) [33].8 Some new schemes, randomized by nature, have also been proposed [25, 34, 35] (see also Figures 3 and 4).

A simple consequence of this requirement to be probabilistic appears in the so-called expansion: since for each plaintext we require the existence of several possible ciphertexts, the number of ciphertexts is greater than the number of possible plaintexts. This means the ciphertexts cannot be as short as the plaintexts; they have to be strictly longer. The ratio between the lengths, in bits, of ciphertexts and plaintexts is called the expansion. Of course, this parameter is of practical importance. We will see in the sequel that efficient probabilistic encryption schemes have been proposed with an expansion less than 2 (e.g., Paillier’s scheme).
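Both properties, probabilistic encryption and the resulting ciphertext expansion, can be seen in a toy run of Paillier's scheme, mentioned above. The tiny primes below are an illustrative assumption (NOT secure); a real deployment uses primes of hundreds of digits. Ciphertexts live modulo n², so they are roughly twice as long as plaintexts, and the random value r makes two encryptions of the same message differ.

```python
import math
import random

# Toy Paillier: n = p*q, g = n + 1, ciphertexts live in Z*_{n^2}.
p, q = 7, 11
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = math.lcm(p - 1, q - 1)       # private: Carmichael's lambda(n)
mu = pow(lam, -1, n)               # valid precisely because g = n + 1

def L(x):
    return (x - 1) // n

def encrypt(m):
    # Fresh randomness r, coprime to n, on every call: probabilistic.
    r = random.choice([x for x in range(2, n) if math.gcd(x, n) == 1])
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Two encryptions of 42 are (almost surely) different ciphertexts,
# yet both decrypt correctly.
assert decrypt(encrypt(42)) == decrypt(encrypt(42)) == 42

# Paillier is also additively homomorphic: multiplying ciphertexts
# adds the underlying plaintexts modulo n.
assert decrypt((encrypt(30) * encrypt(12)) % n2) == 42
```

Plaintexts here are integers modulo n while ciphertexts are integers modulo n², which is exactly the expansion the paragraph above describes.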

2.3. Homomorphic encryption

We will present in this section the basic definitions related to homomorphic encryption. The state of the art will be given in Section 3.

The most common definition is the following. Let M (resp., C) denote the set of plaintexts (resp., ciphertexts). An encryption scheme is said to be homomorphic if for any given encryption key k the encryption function E satisfies

∀m1, m2 ∈ M,  E(m1 ⊙_M m2) ← E(m1) ⊙_C E(m2)  (1)

for some operators ⊙_M in M and ⊙_C in C, where ← means “can be directly computed from,” that is, without any intermediate decryption.

If (M, ⊙_M) and (C, ⊙_C) are groups, we have a group homomorphism. We say a scheme is additively homomorphic if we consider addition operators, and multiplicatively homomorphic if we consider multiplication operators.
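Equation (1) can be instantiated with textbook RSA, which is multiplicatively homomorphic: the product of two ciphertexts modulo n decrypts to the product of the plaintexts modulo n, with no intermediate decryption. The toy key below is an assumption for illustration only (NOT secure).

```python
# Toy RSA key: n = 61 * 59 = 3233, public exponent e, private exponent d.
n, e, d = 3233, 17, 2753

def E(m):              # encryption: m^e mod n
    return pow(m, e, n)

def D(c):              # decryption: c^d mod n
    return pow(c, d, n)

m1, m2 = 12, 9
# Operating on ciphertexts (product mod n) matches operating on
# plaintexts (product mod n) before encryption: equation (1) with
# both operators being multiplication.
assert D((E(m1) * E(m2)) % n) == (m1 * m2) % n   # both equal 108
```

Here ⊙_M and ⊙_C are both multiplication modulo n, so (M, ×) → (C, ×) is a group homomorphism in the sense just defined.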

Many such homomorphic schemes have been published, and they have been widely used in many applications. Note that

8 Note that there are a lot of more recent papers proposing variants or improvements of OAEP, but that is not our purpose here.


Prerequisite: Alice and Bob share a secret random keystream, say a binary one.
Goal: Alice can send an encrypted message to Bob, and Bob can send an encrypted message to Alice.
Principle: To encrypt a message, Alice (resp., Bob) XORs the plaintext and the keystream. To decrypt the received message, Bob (resp., Alice) applies XOR on the ciphertext and the keystream.
Security: This scheme has been shown to be unconditionally secure by Shannon [26] if and only if the keystream is truly random, has the same length as the plaintext, and is used only once. Thus, this scheme is used only for very critical situations for which these constraints may be managed, such as the red phone used by the USA and the USSR [32, pp. 715-716]. What we may use more commonly is a similar scheme, where the keystream is generated by a pseudorandom generator, initialized by the secret key shared by Alice and Bob. A lot of such stream ciphers have been proposed, and their security remains only empirical. Snow 2.0 is one of these.

Figure 1: One-time pad—1917 (used)/1926 (published [22]). Note that this scheme may be transposed in any group (G, +) other than ({0, 1}, XOR), encryption being related to addition of the keystream, while decryption consists in subtracting the keystream.
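The principle of Figure 1 is a few lines of code: encryption and decryption are the same XOR, and the keystream must be truly random, as long as the plaintext, and never reused.

```python
import secrets

# One-time pad over bytes: XOR the plaintext with a truly random
# keystream of the same length, used exactly once.
plaintext = b"attack at dawn"
keystream = secrets.token_bytes(len(plaintext))   # fresh, full-length

ciphertext = bytes(p ^ k for p, k in zip(plaintext, keystream))

# Decryption is the identical operation applied to the ciphertext.
recovered = bytes(c ^ k for c, k in zip(ciphertext, keystream))
assert recovered == plaintext
```

XOR here is addition in the group ({0, 1}, XOR), byte by byte; as the figure notes, the same construction works in any group, with decryption subtracting the keystream instead.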

Prerequisite: Alice computed a (public, private) key: an integer n = pq, where p and q are well-chosen large prime numbers, an integer e such that gcd(e, φ(n)) = 1, and an integer d which is the inverse of e modulo φ(n), that is, ed ≡ 1 mod φ(n); φ(n) denotes the Euler function, φ(n) = φ(pq) = (p − 1)(q − 1). Alice’s public key is (n, e), and her private key is d; p and q also have to be kept secret, but are no longer needed to process the data; they were only useful for Alice to compute d from e.
Goal: Anyone can send an encrypted message to Alice.
Principle: To send an encrypted version of the message m to Alice, Bob computes c = m^e mod n. To get back to the plaintext, Alice computes c^d mod n which, according to Euler’s theorem, is precisely equal to m.
Security: It is clear that if an opponent can factor n and recover p and q, he will be able to compute φ(n), then d, and will be able to decrypt Alice’s messages. So, the RSA problem (recovering m given c) is no harder than the factorization problem. It is not known whether the two problems are equivalent or not.

Figure 2: RSA—1978 [24].
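The key generation and encryption steps of Figure 2 can be replayed directly. The primes below are a small illustrative assumption (a real key uses primes of hundreds of digits, and textbook RSA without padding is NOT secure).

```python
import math

# Key generation, following Figure 2 with toy primes.
p, q = 61, 59
n = p * q                       # public modulus
phi = (p - 1) * (q - 1)         # Euler's function phi(n) = (p-1)(q-1)
e = 17                          # public exponent, must satisfy:
assert math.gcd(e, phi) == 1
d = pow(e, -1, phi)             # private exponent: e*d ≡ 1 (mod phi(n))

# Bob encrypts m for Alice; Alice decrypts with her private key d.
m = 65
c = pow(m, e, n)                # c = m^e mod n
assert pow(c, d, n) == m        # c^d mod n = m, by Euler's theorem
```

Note that after computing d, the primes p and q are no longer needed, exactly as the figure states.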

in some contexts it may be of great interest to have this property not only for one operator, but for two at the same time. Hence, we are also interested in the design of ring/algebraic homomorphisms. Such schemes would satisfy a relation of the form

∀m1, m2 ∈ M,  E(m1 +_M m2) ← E(m1) +_C E(m2),
              E(m1 ×_M m2) ← E(m1) ×_C E(m2).  (2)

As will be discussed further on, no convincing algebraic homomorphic encryption scheme has been found yet, and their design remains an open problem.

Less formally, these definitions mean that, for a fixed key k, it is equivalent to perform operations on the plaintexts before encryption, or on the corresponding ciphertexts after encryption. So we require a kind of commutativity between encryption and some data processing operations.

Of course, the schemes we will consider in the following have to be probabilistic ciphers, and we may consider E to behave in a probabilistic way in the above definitions.

2.4. New security considerations

Probabilistic encryption was introduced with a clear purpose: security. This requires properly defining different security levels. Semantic security was introduced in [34], at the same time as probabilistic encryption, in order to define what a strong security level could be, unavailable without probabilistic encryption. Roughly, a probabilistic encryption is semantically secure if the knowledge of a ciphertext does not provide any useful information on the plaintext to some hypothetical adversary having only a reasonably restricted computational power. More formally, for any function f and any plaintext m, and with only polynomial resources (that is, with algorithms whose time/space complexities vary as a polynomial function of the size of the inputs), the probability to guess f(m) (knowing f but not m) does not increase if the adversary knows a ciphertext corresponding to m. This might be thought of as a kind of perfect secrecy for the case when we only have polynomial resources.

Together with this strong requirement, the notion of polynomial security was defined: the adversary chooses two plaintexts, and we secretly choose one of them at random and provide the adversary with a corresponding ciphertext. The adversary, still with polynomial resources, must guess which plaintext we chose. If the best he can do is to achieve a probability 1/2 + ε of success, the encryption is said to be polynomially secure. Polynomial security is now known as indistinguishability of encryptions, following the terminology and definitions of Goldreich [36].

Quite amazingly, Goldwasser and Micali proved the equivalence between polynomial security and semantic security [34]; Goldreich extended these notions [36], preserving the equivalence. With this equivalence, it is easy to state that a deterministic asymmetric encryption scheme cannot be semantically secure, since it cannot be indistinguishable: the adversary knows the encryption function, and thus can compute the single ciphertext corresponding to each plaintext.


Prerequisite: Alice generated a (public, private) key: she first chose a large prime integer p, a generating element g of the cyclic group Z*_p, and considered q = p − 1, the order of the group; building her public key, she picked at random a ∈ Z_q and computed y_A = g^a in Z*_p, her public key being then (g, q, y_A); her private key is a.
Goal: Anyone can send an encrypted message to Alice.
Principle: To send an encrypted version of the message m to Alice, Bob picks at random k ∈ Z_q and computes (c1, c2) = (g^k, m·y_A^k) in Z*_p. To get back to the plaintext, Alice computes c2·(c1^a)^(−1) in Z*_p, which is precisely equal to m.
Security: The security of this scheme is related to the Diffie-Hellman problem: if we can solve it, then we can break ElGamal encryption. It is not known whether the two problems are equivalent or not. This scheme is IND-CPA.

Figure 3: ElGamal—1985 [25].
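A toy run of the scheme in Figure 3, with a tiny prime chosen for illustration (NOT secure; a real instantiation uses a group of cryptographic size):

```python
import random

# Toy ElGamal: p = 23 is prime, and g = 5 generates the cyclic
# group Z*_23, which has order q = p - 1 = 22.
p = 23
g, q = 5, p - 1

a = random.randrange(1, q)      # Alice's private key
yA = pow(g, a, p)               # public key component y_A = g^a mod p

m = 17                          # message, an element of Z*_p
k = random.randrange(1, q)      # Bob's fresh randomness per encryption
c1, c2 = pow(g, k, p), (m * pow(yA, k, p)) % p

# Alice decrypts: c2 * (c1^a)^(-1) mod p recovers m, because
# c1^a = g^(k*a) = yA^k, which cancels the mask on c2.
assert (c2 * pow(pow(c1, a, p), -1, p)) % p == m
```

Because k is drawn fresh for every message, two encryptions of the same m yield different pairs (c1, c2), which is what makes the scheme probabilistic.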

But with asymmetric encryption schemes, the adversary knows the whole encryption material E, involving both the encryption function and the encryption key. Thus, he can compute any pair (m, E(m)). Naor and Yung [37] and Rackoff and Simon [38] introduced different abilities, relying on the different contexts we discussed above. From the weakest to the strongest, we have the chosen-plaintext and nonadaptive chosen-ciphertext abilities, the strongest being the adaptive chosen-ciphertext one. This leads to the IND-CPA, IND-CCA1, and IND-CCA2 notions in the literature. IND stands for indistinguishability, whereas CPA and CCA are acronyms for chosen-plaintext attack and chosen-ciphertext attack. Finally, CCA1 refers to nonadaptive attacks, and CCA2 to adaptive ones. Considering the previous remarks on the ability for anyone to encrypt while using asymmetric schemes, the adversary always has the chosen-plaintext ability.

Another security requirement, termed nonmalleability, has also been introduced to complete the analysis. Given a ciphertext c = E(m), it should be hard for an opponent to produce a ciphertext c′ such that the corresponding plaintext m′, which is not necessarily known to the opponent, has some known relation with m. This notion was formalized differently by Dolev et al. [39, 40] and by Bellare et al. [41], both approaches being proved equivalent by Bellare and Sahai [42].

We will not detail the relations between all these different notions; the interested reader can refer to [41–43] for a comprehensive treatment. Basically, adaptive chosen-ciphertext indistinguishability, IND-CCA2, is the strongest requirement for an encryption scheme; in particular, it implies nonmalleability.

It should be emphasized that a homomorphic encryption scheme cannot have the nonmalleability property. With the notation of Section 2.3, knowing c, we can compute c′ = c ⊙_C c and deduce, by the homomorphic property, that c′ is a ciphertext of m′ = m ⊙_M m. According to the previous remark on adaptive chosen-ciphertext indistinguishability, a homomorphic encryption scheme therefore cannot reach the strongest security requirement. The highest security level it can reach is IND-CPA.
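This malleability argument is concrete: using textbook RSA as the homomorphic scheme (toy key, an illustrative assumption, NOT secure), an opponent who never decrypts anything can forge a fresh valid ciphertext whose plaintext is related to the original in a known way.

```python
# Toy RSA key: n = 61 * 59 = 3233.
n, e, d = 3233, 17, 2753

c = pow(6, e, n)                  # honest ciphertext of m = 6
c_forged = (c * c) % n            # opponent computes c' = c ⊙ c,
                                  # with no key and no decryption

# By the homomorphic property, c' is a valid encryption of m*m = 36.
assert pow(c_forged, d, n) == 36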

To conclude this section on security, and for the sake of completeness, we point out some security considerations about deterministic homomorphic encryption. First, it was proved that a deterministic homomorphic encryption scheme whose plaintext operation is a simple addition is insecure [44]. Second, Boneh and Lipton showed in 1996 that any deterministic algebraically homomorphic cryptosystem can be broken in subexponential time [45]. Note that this last point does not mean that deterministic algebraically homomorphic cryptosystems are insecure, but that one can find the plaintext from a ciphertext in subexponential time (which is still too long to be practical). For example, we know that the security of RSA encryption depends on factorization algorithms, and subexponential factorization algorithms are known; nevertheless, RSA is still considered strong enough.

3. HOMOMORPHIC ENCRYPTION: STATE OF THE ART

First of all, let us recall that both the RSA and ElGamal encryption schemes are multiplicatively homomorphic. The problem is that the original RSA, being deterministic, cannot achieve a security level of IND-CPA (which is the highest security level for homomorphic schemes, see Section 2.4); furthermore, its probabilistic variants, obtained through OAEP/OAEP+, are no longer homomorphic. In contrast to RSA, ElGamal offers the best security level for a homomorphic encryption scheme, as it has been shown to be IND-CPA. Moreover, it is interesting to notice that an additively homomorphic variant of ElGamal has also been proposed [48]. Compared with the original ElGamal, this variant also involves an element G (G may be equal to g) that generates (Z_q, +) with respect to the addition operation. To send an encrypted version of the message m to Alice, Bob picks at random k ∈ Z_q and computes (c_1, c_2) = (g^k, G^m y_A^k). To get back the plaintext, Alice computes c_2 (c_1^a)^{-1}, which is equal to G^m; then, she has to recover m in a second step. Note that this last decryption step is hard to achieve, and Alice has no other choice than brute-force search to get back m from G^m. It is also well known that ElGamal's construction works for any family of groups for which the discrete logarithm problem is considered intractable; for example, it may be instantiated over elliptic curves. Hence, ElGamal and its variants are interesting candidates for realistic homomorphic encryption schemes.
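As an illustration, here is a toy Python sketch of this additively homomorphic ElGamal variant (not from the paper; it uses an insecurely small prime q, works directly in Z*_q for simplicity, and takes G = g). It shows the brute-force final decryption step and the additive homomorphism:

```python
# Toy additively homomorphic ("exponential") ElGamal sketch.
import random

q = 467                          # toy prime; real parameters are much larger
g = 2                            # generator of Z_q^*
a = random.randrange(1, q - 1)   # Alice's private key
y = pow(g, a, q)                 # Alice's public key y_A = g^a mod q
G = g                            # message-encoding element (here G = g)

def encrypt(m):
    k = random.randrange(1, q - 1)
    return (pow(g, k, q), (pow(G, m, q) * pow(y, k, q)) % q)

def decrypt(c1, c2):
    Gm = (c2 * pow(pow(c1, a, q), -1, q)) % q   # c2 * (c1^a)^(-1) = G^m
    for m in range(q):                          # brute-force search for m
        if pow(G, m, q) == Gm:
            return m

# Componentwise multiplication of ciphertexts encrypts the sum m1 + m2:
c1a, c2a = encrypt(3)
c1b, c2b = encrypt(4)
print(decrypt((c1a * c1b) % q, (c2a * c2b) % q))   # 7
```

The for-loop in decrypt is exactly the brute-force recovery of m from G^m mentioned above, which is why this variant is only practical for small message spaces.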

We will now describe another important family of homomorphic encryption schemes, ranging from the first probabilistic system^9 proposed by Goldwasser and Micali in 1982

^9 To be more precise, the first published probabilistic public-key encryption scheme is due to McEliece [49], and the first with the homomorphic property is due to Goldwasser and Micali.


6 EURASIP Journal on Information Security

Prerequisite: Alice computed a (public, private) key: she first chose n = pq, p and q being large prime numbers, and g a quadratic nonresidue modulo n whose Jacobi symbol is 1; her public key is composed of n and g, and her private key is the factorization of n.

Goal: Anyone can send an encrypted message to Alice.
Principle: To encrypt a bit b, Bob picks at random an integer r ∈ Z*_n and computes c = g^b r^2 mod n (remark that c is a quadratic residue if and only if b = 0). To get back the plaintext, Alice determines whether c is a quadratic residue or not. To do so, she uses the property that the Jacobi symbol (c/p) is equal to (−1)^b. Please note that the scheme encrypts 1 bit of information, while its output is usually 1024 bits long!

Security: This scheme is the first one that was proved semantically secure against a passive adversary (under a computational assumption).

Figure 4: Goldwasser-Micali—1982 [34, 46].

Prerequisite: Alice computed a (public, private) key: she first chose an integer n = pq, p and q being two large prime numbers with n satisfying gcd(n, φ(n)) = 1, and considered the group G = Z*_{n^2}. She also considered g ∈ G of order n. Her public key is composed of n and g, and her private key consists in the factors of n.

Goal: Anyone can send a message to Alice.
Principle: To encrypt a message m ∈ Z_n, Bob picks at random an integer r ∈ Z*_n and computes c = g^m r^n mod n^2. To get back the plaintext, Alice computes the discrete logarithm of c^λ(n) mod n^2, obtaining mλ(n) ∈ Z_n, where λ(n) denotes the Carmichael function. Now, since gcd(λ(n), n) = 1, Alice easily computes λ(n)^{-1} mod n and gets m.

Security: This scheme is IND-CPA.

Figure 5: Paillier—1999 [47].

[34, 46] (described in Figure 4), to the famous Paillier encryption scheme [47] (described in Figure 5) and its improvements. Paillier's scheme and its variants are famous for their efficiency, but also because, like ElGamal, they achieve the highest security level for homomorphic encryption schemes. We will not discuss their mathematical considerations in detail, but will summarize their important parameters and properties.

(i) We begin with the rather simple scheme of Goldwasser-Micali [34, 46]. Besides some historical importance, this scheme had an important impact on later proposals: several other schemes, presented below, were obtained as generalizations of it. For these reasons, we provide a detailed description in Figure 4. Here, as for RSA, we use computations modulo n = pq, a product of two large primes. Encryption is simple, with a product and a square, whereas decryption is heavier, with an exponentiation; nevertheless, this step can be done in O(ℓ(p)^2), where ℓ(x) denotes the bit length of x. Unfortunately, this scheme presents a strong drawback, since its input consists of a single bit. First, this implies that encrypting k bits leads to a cost of O(k · ℓ(p)^2), which is not very efficient even if it is considered practical. The second consequence concerns the expansion: a single bit of plaintext is encrypted as an integer modulo n, that is, ℓ(n) bits. Thus, the expansion is really huge; this is the main drawback of this scheme.
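The scheme of Figure 4 can be sketched in a few lines of Python (not from the paper; the primes are far too small for any security, and the Legendre symbol (c/p) is evaluated via Euler's criterion):

```python
# Toy Goldwasser-Micali sketch; insecure parameters, for illustration only.
import math
import random

p, q = 7, 11      # Alice's secret primes (toy sizes)
n = p * q         # public modulus
g = 6             # quadratic nonresidue mod p and mod q (Jacobi symbol 1 mod n)

def encrypt(b):   # encrypt a single bit b
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, b, n) * r * r) % n

def decrypt(c):   # (c/p) = c^((p-1)/2) mod p equals 1 iff c is a residue, i.e. b = 0
    return 0 if pow(c, (p - 1) // 2, p) == 1 else 1

print([decrypt(encrypt(b)) for b in [0, 1, 1, 0]])   # [0, 1, 1, 0]
# Homomorphy: multiplying two ciphertexts XORs the plaintext bits.
print(decrypt((encrypt(1) * encrypt(1)) % n))        # 0
```

The huge expansion discussed above is visible here: each encrypted bit occupies a full element of Z_n.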

Before continuing our review, let us present the Goldwasser-Micali (GM) scheme from another point of view; this is required to understand how it has been generalized. The basic principle of GM is to partition a well-chosen subset of the integers modulo n into two secret parts, M0 and M1. Encryption then selects a random element of M_b to encrypt b, and decryption determines in which part the randomly selected element lies. The core point lies in the way the subset is chosen and partitioned into M0 and M1. GM uses group theory to achieve this: the subset is the group G of invertible integers modulo n with Jacobi symbol, with respect to n, equal to 1. The partition is generated by another group H ⊂ G, composed of the elements that are invertible modulo n with Jacobi symbol, with respect to a fixed factor of n, equal to 1; with these settings, it is possible to split G into two parts: H and G \ H.

The generalizations of Goldwasser-Micali play with these two groups; they try to find two groups G and H such that G can be split into k > 2 parts.

(ii) Benaloh's scheme [50] is a generalization of GM that can handle inputs of ℓ(k) bits, k being a prime satisfying some particular constraints. Encryption is similar to the previous scheme (encrypting a message m ∈ {0, ..., k − 1} means picking an integer r ∈ Z*_n and computing c = g^m r^k mod n), but decryption is more complex. The input and output sizes being, respectively, ℓ(k) and ℓ(n) bits, the expansion equals ℓ(n)/ℓ(k), which is better than in the GM case. Moreover, the encryption cost is not too high. Nevertheless, the decryption cost is estimated at O(√k ℓ(k)) for precomputation, and the same for each decryption. This implies that k has to be taken quite small, which limits the gain obtained on the expansion.
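A toy Python sketch of Benaloh's scheme (not from the paper) with the tiny block size k = 5, so the brute-force part of decryption is immediate; the parameters p = 11, q = 7 are an assumed illustrative choice satisfying the divisibility constraints on k:

```python
# Toy Benaloh sketch: block size k = 5, decryption by brute force over k classes.
import math
import random

p, q, k = 11, 7, 5          # k divides p-1; gcd(k, (p-1)/k) = gcd(k, q-1) = 1
n = p * q                   # 77
phi = (p - 1) * (q - 1)     # 60
g = 3                       # g^(phi/k) != 1 mod n, so g distinguishes the k classes

def encrypt(m):             # m in {0, ..., k-1}
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n) * pow(r, k, n)) % n

def decrypt(c):
    a = pow(c, phi // k, n)        # = (g^(phi/k))^m mod n, since r^phi = 1
    base = pow(g, phi // k, n)
    for m in range(k):             # brute-force the small message space
        if pow(base, m, n) == a:
            return m

# Additive homomorphy modulo k:
print(decrypt((encrypt(2) * encrypt(4)) % n))   # 1  (= 2 + 4 mod 5)
```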

(iii) Naccache-Stern [51] is an improvement of Benaloh's scheme. Considering a parameter k that can be greater than before, it leads to a smaller expansion. Note that the constraints on k are slightly different. The encryption


C. Fontaine and F. Galand 7

step is precisely the same as in Benaloh's scheme, but the decryption is different. To summarize, the expansion is still equal to ℓ(n)/ℓ(k), but the decryption cost is lower: O(ℓ(n)^5 log(ℓ(n))), and the authors claim it is reasonable to choose the parameters so as to get an expansion equal to 4.

(iv) In order to improve previous schemes, Okamoto and Uchiyama decided to change the base group G [52]. Considering n = p^2 q, p and q still being two large primes, and the group G = Z*_{p^2}, they achieve k = p. Thus, the expansion is equal to 3. As Paillier's scheme is an improvement of this one and will be fully described below, we do not discuss it in detail. Its advantage lies in the proof that its security is equivalent to the factorization of n. Unfortunately, a chosen-ciphertext attack has been proposed that leads to this factorization. This scheme was used to design the EPOC systems [53], currently submitted for the supplement P1363a to the IEEE Standard Specifications for Public-Key Cryptography (IEEE P1363). Note that earlier versions of EPOC were subject to security flaws, as pointed out in [54], due to a bad use of the scheme.

(v) One of the most well-known homomorphic encryption schemes is due to Paillier [47], and is described in Figure 5. It is an improvement of the previous one that decreases the expansion from 3 to 2. Paillier came back to n = pq, with gcd(n, φ(n)) = 1, but considered the group G = Z*_{n^2}; a proper choice of H led him to k = ℓ(n). The encryption cost is not too high. Decryption needs one exponentiation modulo n^2 to the power λ(n), and a multiplication modulo n. Paillier showed in his paper how to manage decryption efficiently through the Chinese Remainder Theorem. With smaller expansion and lower cost compared with the previous schemes, this scheme is really attractive. In 2002, Cramer and Shoup proposed a general approach to gain security against adaptive chosen-ciphertext attacks for certain cryptosystems with some particular algebraic properties [55]; applying it to Paillier's original scheme, they obtained a stronger variant. Bresson et al. proposed in [56] a slightly different version that may be more suitable for some applications.
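The core of Paillier's scheme (Figure 5) can be sketched in a few lines of Python. The primes are toy-sized, and we use the common choice g = 1 + n, an element of order n in Z*_{n^2}; this particular g, and the L-function trick replacing the discrete-logarithm step, are assumptions of this sketch rather than part of the figure:

```python
# Toy Paillier sketch; insecure parameters, for illustration only.
import math
import random

p, q = 11, 13
n = p * q                        # public modulus, with gcd(n, phi(n)) = 1
n2 = n * n
g = n + 1                        # element of order n in Z*_{n^2}
lam = math.lcm(p - 1, q - 1)     # Carmichael function lambda(n)

def L(u):                        # L(u) = (u - 1)/n, defined when u = 1 mod n
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)   # precomputed decryption constant

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Additive homomorphy: multiplying ciphertexts adds plaintexts modulo n.
print(decrypt((encrypt(17) * encrypt(25)) % n2))   # 42
```

The expansion of 2 mentioned above is visible directly: a plaintext in Z_n becomes a ciphertext in Z_{n^2}, twice as many bits.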

(vi) Damgard and Jurik proposed in [57] a generalization of Paillier's scheme to groups of the form Z*_{n^{s+1}} with s > 0. The larger s is, the smaller the expansion. Moreover, this scheme leads to a lot of applications, for example, adapting the size of the plaintexts, threshold cryptography, electronic voting, and so forth. To encrypt a message m ∈ Z_{n^s}, one picks r ∈ Z*_n at random and computes c = g^m r^{n^s} mod n^{s+1}. The authors show that if one can break the scheme for a given value s = σ, then one can break it for s = σ − 1. They also show that the semantic security of this scheme is equivalent to that of Paillier. To summarize, the expansion is 1 + 1/s, and hence can be close to 1 if s is sufficiently large. The ratio of the encryption cost of this scheme over Paillier's can be estimated as (1/6)s(s + 1)(s + 2); the same ratio for the decryption step equals (1/6)(s + 1)(s + 2). Note that even if this scheme improves on Paillier through its lower expansion, it remains more costly: if we want to encrypt or decrypt k blocks of ℓ(n) bits, running Paillier's scheme k times is less costly than running Damgard-Jurik's scheme once.

(vii) Galbraith proposed in [58] an adaptation of the previous scheme in the context of elliptic curves. Its expansion is equal to 3. The ratio of the encryption (resp., decryption) cost of this scheme in the case s = 1 over Paillier's can be estimated to be about 7 (resp., 14). But, in contrast to the previous scheme, the larger s is, the more the cost may decrease. Moreover, as in the case of Damgard-Jurik's scheme, the higher s is, the stronger the scheme.

(viii) Castagnos explored in [59, 60]^10 another improvement direction, considering quadratic field quotients. We have the same kind of structure regarding n^{s+1} as before, but in another context. To summarize, the expansion is 3, and the ratio of the encryption/decryption cost of this scheme in the case s = 1 over Paillier's can be estimated to be about 2 (plus two computations of Legendre symbols for the decryption step).

(ix) To close the survey of this family of schemes, let us mention the ElGamal-Paillier amalgam, which merges Paillier and the additively homomorphic variant of ElGamal. More precisely, it is based on Damgard-Jurik's (presented above) and Cramer-Shoup's [55] analyses and variants of Paillier's scheme, and was proposed by [9]. The goal was to gain the advantages of both schemes while minimizing their drawbacks. Preserving the notation of both the ElGamal and Paillier schemes, we describe the encryption in the particular case s = 1, which reduces Damgard-Jurik's variant to the original Paillier. To encrypt a message m ∈ Z_n, Bob picks at random an integer k and computes (c_1, c_2) = (g^k mod n, (1 + n)^m (y_A^k mod n)^n mod n^2).

Now that we have reviewed the two most famous families of homomorphic encryption schemes, we would like to mention a few research directions and challenges.

First, as we mentioned in Section 2.1, it is important to have different kinds of schemes, for both application and security purposes. One direction to design homomorphic schemes that are not directly related to the same mathematical problems as ElGamal or Paillier (and variants) is to consider the recent papers dealing with the Weil pairing. As this new direction is more and more promising in the design of asymmetric schemes, its investigation in the particular case of homomorphic ciphers is of interest. ElGamal may not be directly used in the Weil pairing setup, as the mathematical problem it is based on becomes easy there. A more promising direction is the use of the pairing-based scheme proposed by Boneh and Franklin [61] to obtain a secure homomorphic ID-based scheme (see [62] for the ability of such schemes to provide interesting new features).

A second interesting research direction lies in the area of symmetric encryption. As all the homomorphic encryption schemes we have mentioned so far are asymmetric, they are not as fast as symmetric ones could be. But homomorphy is easier to manage when mathematical operators are involved in the encryption process, which is usually not the case in symmetric schemes. Very few symmetric homomorphic schemes have been proposed, and most of them have been broken ([63] broken in [64, 65], [66] broken in [67]). Nevertheless, it may

^10 This scheme is mentioned in the conclusion of [59], and presented in more depth in [60], unfortunately in French.



be of interest to consider a simple generalization of the one-time pad, where bits are replaced by integers modulo n, as introduced by [68]. In terms of security, it has exactly the same properties as the one-time pad, that is, perfect secrecy if and only if the keystream is truly random, of the same length as the plaintext, and used only once. Here again, these requirements are too demanding in practice, and the keystream could be generated by a well-chosen pseudorandom generator (e.g., SNOW 2.0), decreasing security from unconditional to computational. Note that this scheme's homomorphy is a little bit fuzzy, as we have, for any pair of encryption keys (k1, k2),

∀ m1, m2 ∈ M,   E_{k1+k2}(m1 + m2) ←− E_{k1}(m1) + E_{k2}(m2).   (3)

This is the only example of a symmetric homomorphic encryption scheme that has not been cracked.
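A minimal sketch of this generalized one-time pad over Z_n, matching equation (3): adding ciphertexts produced under keys k1 and k2 yields a ciphertext of the summed plaintexts under the combined key k1 + k2. The keystream values here come from Python's random module purely for illustration; a real instantiation needs a truly random or cryptographic keystream, as discussed above:

```python
# Toy sketch of the additively homomorphic one-time pad over Z_n [68].
import random

n = 2**32                    # illustrative modulus

def encrypt(m, k):
    return (m + k) % n

def decrypt(c, k):
    return (c - k) % n

k1 = random.randrange(n)     # fresh one-time keys
k2 = random.randrange(n)
m1, m2 = 1000, 2345

# Sum of ciphertexts decrypts under the sum of the keys (equation (3)):
c = (encrypt(m1, k1) + encrypt(m2, k2)) % n
print(decrypt(c, (k1 + k2) % n))   # 3345
```

This additive structure is exactly what makes the scheme attractive for aggregation in sensor networks, the application targeted by [68].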

As for algebraic homomorphy, designing algebraically homomorphic encryption schemes is a real challenge today. Only a few have been proposed: by Fellows and Koblitz [69] (which can be considered neither secure nor efficient [70]), by Domingo-Ferrer [63, 66] (which has been broken [64, 65, 67]), and the construction studies of Rappe et al. [3]. No satisfactory solution has been proposed so far, and, as Boneh and Lipton conjectured that any algebraically homomorphic encryption would prove to be insecure [45], the question of their existence and design is still open.

4. CONCLUSION

We presented in this paper a state of the art on homomorphic encryption schemes, discussing their parameters, performances, and security issues. As we saw, these schemes are not well suited for every use, and their characteristics must be taken into account. Nowadays, such schemes are studied in wide application contexts, but designing more powerful and secure schemes remains a challenge for the cryptographic community. Their use in the signal processing community is quite new, and we hope this paper will serve as a guide for understanding their specificities, advantages, and limits.

ACKNOWLEDGMENTS

The authors are indebted to the referees for their fruitful comments concerning this manuscript, and to Fabien Laguillaumie and Guilhem Castagnos for discussions about the recent improvements in the field. They also thank all the people who took the time to read this manuscript and share their thoughts about it. Dr. C. Fontaine is supported (in part) by the European Commission through the IST Programme under Contract IST-2002-507932 ECRYPT.

REFERENCES

[1] R. Rivest, L. Adleman, and M. Dertouzos, “On data banks and privacy homomorphisms,” in Foundations of Secure Computation, pp. 169–177, Academic Press, 1978.

[2] E. Brickell and Y. Yacobi, “On privacy homomorphisms,” in Advances in Cryptology (EUROCRYPT ’87), vol. 304 of Lecture Notes in Computer Science, pp. 117–126, Springer, New York, NY, USA, 1987.

[3] D. Rappe, Homomorphic cryptosystems and their applications, Ph.D. thesis, University of Dortmund, Dortmund, Germany, 2004, http://www.rappe.de/doerte/Diss.pdf.

[4] R. Cramer and I. Damgard, “Zero-knowledge for finite field arithmetic, or: can zero-knowledge be for free?” in Advances in Cryptology (CRYPTO ’98), vol. 1462 of Lecture Notes in Computer Science, pp. 424–441, Springer, New York, NY, USA, 1998.

[5] H. Lipmaa, “Verifiable homomorphic oblivious transfer and private equality test,” in Advances in Cryptology (ASIACRYPT ’03), vol. 2894 of Lecture Notes in Computer Science, pp. 416–433, Springer, New York, NY, USA, 2003.

[6] P.-A. Fouque, G. Poupard, and J. Stern, “Sharing decryption in the context of voting or lotteries,” in Proceedings of the 4th International Conference on Financial Cryptography, vol. 1962 of Lecture Notes in Computer Science, pp. 90–104, Anguilla, British West Indies, 2000.

[7] T. Sander and C. Tschudin, “Protecting mobile agents against malicious hosts,” in Mobile Agents and Security, vol. 1419 of Lecture Notes in Computer Science, pp. 44–60, Springer, New York, NY, USA, 1998.

[8] P. Golle, M. Jakobsson, A. Juels, and P. Syverson, “Universal re-encryption for mixnets,” in Proceedings of the RSA Conference, Cryptographers’ Track (CT-RSA ’04), vol. 2964 of Lecture Notes in Computer Science, pp. 163–178, San Francisco, Calif, USA, 2004.

[9] I. Damgard and M. Jurik, “A length-flexible threshold cryptosystem with applications,” in Proceedings of the 8th Australian Conference on Information Security and Privacy (ACISP ’03), vol. 2727 of Lecture Notes in Computer Science, Wollongong, Australia, 2003.

[10] A. Adelsbach, S. Katzenbeisser, and A. Sadeghi, “Cryptology meets watermarking: detecting watermarks with minimal or zero-knowledge disclosures,” in Proceedings of the European Signal Processing Conference (EUSIPCO ’02), Toulouse, France, September 2002.

[11] B. Pfitzmann and M. Waidner, “Anonymous fingerprinting,” in Advances in Cryptology (EUROCRYPT ’97), vol. 1233 of Lecture Notes in Computer Science, pp. 88–102, Springer, New York, NY, USA, 1997.

[12] N. Memon and P. Wong, “A buyer-seller watermarking protocol,” IEEE Transactions on Image Processing, vol. 10, no. 4, pp. 643–649, 2001.

[13] C.-L. Lei, P.-L. Yu, P.-L. Tsai, and M.-H. Chan, “An efficient and anonymous buyer-seller watermarking protocol,” IEEE Transactions on Image Processing, vol. 13, no. 12, pp. 1618–1626, 2004.

[14] M. Kuribayashi and H. Tanaka, “Fingerprinting protocol for images based on additive homomorphic property,” IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2129–2139, 2005.

[15] V. Shoup, A Computational Introduction to Number Theory and Algebra, Cambridge University Press, 2005, http://www.shoup.net/ntb/.

[16] A. Menezes, P. van Oorschot, and S. Vanstone, Handbook of Applied Cryptography, CRC Press, 1997, http://www.cacr.math.uwaterloo.ca/hac/.

[17] H. van Tilborg, Ed., Encyclopedia of Cryptography and Security, Springer, New York, NY, USA, 2005.

[18] A. Kerckhoffs, “La cryptographie militaire (part I),” Journal des Sciences Militaires, vol. 9, no. 1, pp. 5–38, 1883.

[19] A. Kerckhoffs, “La cryptographie militaire (part II),” Journal des Sciences Militaires, vol. 9, no. 2, pp. 161–191, 1883.



[20] J. Daemen and V. Rijmen, “The block cipher Rijndael,” in Smart Card Research and Applications (CARDIS ’98), vol. 1820 of Lecture Notes in Computer Science, pp. 247–256, Springer, New York, NY, USA, 2000.

[21] J. Daemen and V. Rijmen, The Design of Rijndael: AES—the Advanced Encryption Standard, Information Security and Cryptography, Springer, New York, NY, USA, 2002.

[22] G. Vernam, “Cipher printing telegraph systems for secret wire and radio telegraphic communications,” Journal of the American Institute of Electrical Engineers, vol. 45, pp. 109–115, 1926.

[23] P. Ekdahl and T. Johansson, “A new version of the stream cipher SNOW,” in Selected Areas in Cryptography (SAC ’02), vol. 2595 of Lecture Notes in Computer Science, pp. 47–61, Springer, New York, NY, USA, 2002.

[24] R. Rivest, A. Shamir, and L. Adleman, “A method for obtaining digital signatures and public-key cryptosystems,” Communications of the ACM, vol. 21, no. 2, pp. 120–126, 1978.

[25] T. ElGamal, “A public key cryptosystem and a signature scheme based on discrete logarithms,” in Advances in Cryptology (CRYPTO ’84), vol. 196 of Lecture Notes in Computer Science, pp. 10–18, Springer, New York, NY, USA, 1985.

[26] C. Shannon, “Communication theory of secrecy systems,” Bell System Technical Journal, vol. 28, pp. 656–715, 1949.

[27] M. Ajtai and C. Dwork, “A public key cryptosystem with worst-case/average-case equivalence,” in Proceedings of the 29th ACM Symposium on Theory of Computing (STOC ’97), pp. 284–293, 1997.

[28] P. Nguyen and J. Stern, “Cryptanalysis of the Ajtai-Dwork cryptosystem,” in Advances in Cryptology (CRYPTO ’98), vol. 1462 of Lecture Notes in Computer Science, pp. 223–242, Springer, New York, NY, USA, 1999.

[29] R. Canetti, O. Goldreich, and S. Halevi, “The random oracle methodology, revisited,” in Proceedings of the 30th ACM Symposium on Theory of Computing (STOC ’98), pp. 209–218, Berkeley, Calif, USA, 1998.

[30] P. Paillier, “Impossibility proofs for RSA signatures in the standard model,” in Proceedings of the RSA Conference 2007, Cryptographers’ Track, vol. 4377 of Lecture Notes in Computer Science, pp. 31–48, San Francisco, Calif, USA, 2007.

[31] W. Diffie and M. Hellman, “New directions in cryptography,” IEEE Transactions on Information Theory, vol. 22, no. 6, pp. 644–654, 1976.

[32] D. Kahn, The Codebreakers: The Story of Secret Writing, Macmillan, New York, NY, USA, 1967.

[33] M. Bellare and P. Rogaway, “Optimal asymmetric encryption—how to encrypt with RSA,” in Advances in Cryptology (EUROCRYPT ’94), vol. 950 of Lecture Notes in Computer Science, pp. 92–111, Springer, New York, NY, USA, 1995.

[34] S. Goldwasser and S. Micali, “Probabilistic encryption & how to play mental poker keeping secret all partial information,” in Proceedings of the 14th ACM Symposium on the Theory of Computing (STOC ’82), pp. 365–377, New York, NY, USA, 1982.

[35] M. Blum and S. Goldwasser, “An efficient probabilistic public-key encryption scheme which hides all partial information,” in Advances in Cryptology (CRYPTO ’84), vol. 196 of Lecture Notes in Computer Science, pp. 289–299, Springer, New York, NY, USA, 1985.

[36] O. Goldreich, “A uniform complexity treatment of encryption and zero-knowledge,” Journal of Cryptology, vol. 6, no. 1, pp. 21–53, 1993.

[37] M. Naor and M. Yung, “Public-key cryptosystems provably secure against chosen ciphertext attacks,” in Proceedings of the 22nd ACM Annual Symposium on the Theory of Computing (STOC ’90), pp. 427–437, Baltimore, Md, USA, 1990.

[38] C. Rackoff and D. Simon, “Non-interactive zero-knowledge proof of knowledge and chosen ciphertext attack,” in Advances in Cryptology (CRYPTO ’91), vol. 576 of Lecture Notes in Computer Science, pp. 433–444, Springer, New York, NY, USA, 1991.

[39] D. Dolev, C. Dwork, and M. Naor, “Non-malleable cryptography,” in Proceedings of the 23rd ACM Annual Symposium on the Theory of Computing (STOC ’91), pp. 542–552, 1991.

[40] D. Dolev, C. Dwork, and M. Naor, “Non-malleable cryptography,” SIAM Journal on Computing, vol. 30, no. 2, pp. 391–437, 2000.

[41] M. Bellare, A. Desai, D. Pointcheval, and P. Rogaway, “Relations among notions of security for public-key encryption schemes,” in Advances in Cryptology (CRYPTO ’98), vol. 1462 of Lecture Notes in Computer Science, pp. 26–45, Springer, New York, NY, USA, 1998.

[42] M. Bellare and A. Sahai, “Non-malleable encryption: equivalence between two notions, and an indistinguishability-based characterization,” in Advances in Cryptology (CRYPTO ’99), vol. 1666 of Lecture Notes in Computer Science, pp. 519–536, Springer, New York, NY, USA, 1999.

[43] Y. Watanabe, J. Shikata, and H. Imai, “Equivalence between semantic security and indistinguishability against chosen ciphertext attacks,” in Public Key Cryptography (PKC ’03), vol. 2567 of Lecture Notes in Computer Science, pp. 71–84, Springer, New York, NY, USA, 2003.

[44] N. Ahituv, Y. Lapid, and S. Neumann, “Processing encrypted data,” Communications of the ACM, vol. 30, no. 9, pp. 777–780, 1987.

[45] D. Boneh and R. Lipton, “Algorithms for black box fields and their application to cryptography,” in Advances in Cryptology (CRYPTO ’96), vol. 1109 of Lecture Notes in Computer Science, pp. 283–297, Springer, New York, NY, USA, 1996.

[46] S. Goldwasser and S. Micali, “Probabilistic encryption,” Journal of Computer and System Sciences, vol. 28, no. 2, pp. 270–299, 1984.

[47] P. Paillier, “Public-key cryptosystems based on composite degree residuosity classes,” in Advances in Cryptology (EUROCRYPT ’99), vol. 1592 of Lecture Notes in Computer Science, pp. 223–238, Springer, New York, NY, USA, 1999.

[48] R. Cramer, R. Gennaro, and B. Schoenmakers, “A secure and optimally efficient multiauthority election scheme,” in Advances in Cryptology (EUROCRYPT ’97), vol. 1233 of Lecture Notes in Computer Science, pp. 103–118, Springer, New York, NY, USA, 1997.

[49] R. McEliece, “A public-key cryptosystem based on algebraic coding theory,” DSN Progress Report, Jet Propulsion Laboratory, 1978.

[50] J. Benaloh, Verifiable secret-ballot elections, Ph.D. thesis, Yale University, Department of Computer Science, New Haven, Conn, USA, 1988.

[51] D. Naccache and J. Stern, “A new public-key cryptosystem based on higher residues,” in Proceedings of the 5th ACM Conference on Computer and Communications Security, pp. 59–66, San Francisco, Calif, USA, November 1998.

[52] T. Okamoto and S. Uchiyama, “A new public-key cryptosystem as secure as factoring,” in Advances in Cryptology (EUROCRYPT ’98), vol. 1403 of Lecture Notes in Computer Science, pp. 308–318, Springer, New York, NY, USA, 1998.

[53] T. Okamoto, S. Uchiyama, and E. Fujisaki, “EPOC: efficient probabilistic public-key encryption,” Tech. Rep., 2000, proposal to IEEE P1363a, http://grouper.ieee.org/groups/1363/P1363a/draft.html.



[54] M. Joye, J.-J. Quisquater, and M. Yung, “On the power of misbehaving adversaries and security analysis of the original EPOC,” in Topics in Cryptology (CT-RSA 2001), vol. 2020 of Lecture Notes in Computer Science, Springer, New York, NY, USA, 2001.

[55] R. Cramer and V. Shoup, “Universal hash proofs and a paradigm for adaptive chosen ciphertext secure public-key encryption,” in Advances in Cryptology (EUROCRYPT ’02), vol. 2332 of Lecture Notes in Computer Science, pp. 45–64, Springer, New York, NY, USA, 2002.

[56] E. Bresson, D. Catalano, and D. Pointcheval, “A simple public-key cryptosystem with a double trapdoor decryption mechanism and its applications,” in Advances in Cryptology (ASIACRYPT ’03), vol. 2894 of Lecture Notes in Computer Science, pp. 37–54, Springer, New York, NY, USA, 2003.

[57] I. Damgard and M. Jurik, “A generalisation, a simplification and some applications of Paillier’s probabilistic public-key system,” in Proceedings of the 4th International Workshop on Practice and Theory in Public-Key Cryptography, vol. 1992 of Lecture Notes in Computer Science, pp. 119–136, Springer, New York, NY, USA, 2001.

[58] S. Galbraith, “Elliptic curve Paillier schemes,” Journal of Cryptology, vol. 15, no. 2, pp. 129–138, 2002.

[59] G. Castagnos, “An efficient probabilistic public-key cryptosystem over quadratic fields quotients,” 2007, Finite Fields and Their Applications, paper version in press, http://www.unilim.fr/pages perso/guilhem.castagnos/.

[60] G. Castagnos, Quelques schémas de cryptographie asymétrique probabiliste, Ph.D. thesis, Université de Limoges, 2006, http://www.unilim.fr/pages perso/guilhem.castagnos/.

[61] D. Boneh and M. Franklin, “Identity-based encryption from the Weil pairing,” in Advances in Cryptology (CRYPTO ’01), vol. 2139 of Lecture Notes in Computer Science, pp. 213–229, Springer, New York, NY, USA, 2001.

[62] D. Boneh, X. Boyen, and E.-J. Goh, “Hierarchical identity based encryption with constant size ciphertext,” in Advances in Cryptology (EUROCRYPT ’05), vol. 3494 of Lecture Notes in Computer Science, pp. 440–456, Springer, New York, NY, USA, 2005.

[63] J. Domingo-Ferrer, “A provably secure additive and multiplicative privacy homomorphism,” in Proceedings of the 5th International Conference on Information Security (ISC ’02), vol. 2433 of Lecture Notes in Computer Science, pp. 471–483, São Paulo, Brazil, 2002.

[64] D. Wagner, “Cryptanalysis of an algebraic privacy homomorphism,” in Proceedings of the 6th International Conference on Information Security (ISC ’03), vol. 2851 of Lecture Notes in Computer Science, Bristol, UK, 2003.

[65] F. Bao, “Cryptanalysis of a provable secure additive and multiplicative privacy homomorphism,” in International Workshop on Coding and Cryptography (WCC ’03), pp. 43–49, Versailles, France, 2003.

[66] J. Domingo-Ferrer, “A new privacy homomorphism and applications,” Information Processing Letters, vol. 60, no. 5, pp. 277–282, 1996.

[67] J. Cheon, W.-H. Kim, and H. Nam, “Known-plaintext cryptanalysis of the Domingo-Ferrer algebraic privacy homomorphism scheme,” Information Processing Letters, vol. 97, no. 3, pp. 118–123, 2006.

[68] C. Castelluccia, E. Mykletun, and G. Tsudik, “Efficient aggregation of encrypted data in wireless sensor networks,” in ACM/IEEE Mobile and Ubiquitous Systems: Networking and Services (MobiQuitous ’05), pp. 109–117, 2005.

[69] M. Fellows and N. Koblitz, “Combinatorial cryptosystems galore!,” in Contemporary Mathematics, vol. 168 of Finite Fields: Theory, Applications, and Algorithms, FQ2, pp. 51–61, 1993.

[70] L. Ly, A public-key cryptosystem based on Polly Cracker, Ph.D. thesis, Ruhr-Universität Bochum, Bochum, Germany, 2002.


Hindawi Publishing Corporation
EURASIP Journal on Information Security
Volume 2007, Article ID 51368, 10 pages
doi:10.1155/2007/51368

Research Article
Secure Multiparty Computation between Distrusted Networks Terminals

S.-C. S. Cheung1 and Thinh Nguyen2

1 Center for Visualization and Virtual Environments, Department of Electrical and Computer Engineering, University of Kentucky, Lexington, KY 40507, USA

2 School of Electrical Engineering and Computer Science, Oregon State University, 1148 Kelley Engineering Center, Corvallis, OR 97331-5501, USA

Correspondence should be addressed to S.-C. S. Cheung, [email protected]

Received 7 May 2007; Accepted 12 October 2007

Recommended by Stefan Katzenbeisser

One of the most important problems facing any distributed application over a heterogeneous network is the protection of private sensitive information in local terminals. A subfield of cryptography called secure multiparty computation (SMC) is the study of such distributed computation protocols that allow distrusted parties to perform joint computation without disclosing private data. SMC is increasingly used in diverse fields from data mining to computer vision. This paper provides a tutorial on SMC for nonexperts in cryptography and surveys some of the latest advances in this exciting area, including various schemes for reducing communication and computation complexity of SMC protocols, doubly homomorphic encryption, and private information retrieval.

Copyright © 2007 S.-C. S. Cheung and T. Nguyen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

The proliferation of capturing and storage devices as well as the ubiquitous presence of computer networks make sharing of data easier than ever. Such pervasive exchange of data, however, has increasingly raised questions on how sensitive and private information can be protected. For example, it is now commonplace to send private photographs or videos to the hundreds of online photoprocessing stores for storage, development, and enhancement like sharpening and red-eye removal. Few companies provide any protection of the personal pictures they receive. Hackers or employees of the store may steal the data for personal use or distribute them for personal gain without consent from the owner.

There are also security applications in which multiple parties need to collaborate with each other but do not want any of their own private data disclosed. Consider the following example: a law-enforcement agency wants to search for possible suspects in a surveillance video owned by private company A, using proprietary software developed by another private company B. The three parties involved all have information they do not want to share with each other: the criminal biometric database from law enforcement, the surveillance tape from company A, and the proprietary software from company B.

Encryption alone cannot provide adequate protection when performing the aforementioned applications. The encrypted data needs to be decrypted at the receiver for processing, and the raw data will then become vulnerable. Alternatively, the client can download the software and process her private data in a secure environment. This, however, runs the risk of having the proprietary technology of the software company pirated or reverse-engineered by hackers. The Trusted Computing (TC) platform may solve this problem by executing the software in a secure memory space of the client machine equipped with a cryptographic coprocessor [1]. Besides the high cost of overhauling the existing PC platform, the TC concept remains highly controversial due to its unbalanced protection of the software companies over the consumers [2].

The technical challenge to this problem lies in developing a joint computation and communication protocol to be executed among multiple distrusted network terminals without disclosing any private information. Such a protocol is called a secure multiparty computation (SMC) protocol and has been an active research area in cryptography for more than twenty years [3]. Recently, researchers in other disciplines such as signal processing and data mining have begun to use SMC to solve various practical problems. The goal of this paper is to provide a tutorial on the basic theory of SMC and to survey recent advances in this area.

2. PROBLEM FORMULATION

The basic framework of SMC is as follows: there are n parties P1, P2, . . . , Pn on a network who want to compute a joint function f(x1, x2, . . . , xn) based on private data xi owned by party Pi for i = 1, 2, . . . , n. The goal of SMC is that Pi will not learn anything about xj for j ≠ i beyond what can be inferred from her private data xi and the result of the computation f(x1, x2, . . . , xn). SMC can be trivially accomplished if there is a special server, trusted by every party with its private data, to carry out the computation. This is not a practical solution, as it is too costly to protect such a server. The objective of any SMC protocol is to emulate this ideal model as much as possible by using clever transformations to conceal the private data.

Almost all SMC protocols are classified based on their models of security and adversarial behaviors. The most commonly used security models are perfect security and computational security, which will be covered in Sections 3 and 4, respectively. Adversarial behaviors are broadly classified into two types: semihonest and malicious. A dishonest party is called semihonest if she follows the SMC protocol faithfully but attempts to find out about others' private data through the communication. A malicious party, on the other hand, will modify the protocol to gain extra information. We will focus primarily on semihonest adversaries but briefly describe how the protocols can be fortified to handle malicious adversaries.

We also assume that private data are elements from a finite field F and that the target function f(·) can be implemented as a combination of the field's addition and multiplication. This is a reasonably general computational model for two reasons: first, at the lowest level, any digital computing device can be modeled by setting F as the binary field with XOR as addition and AND as multiplication. Second, while most signal processing and scientific computation are described using real numbers, we can approximate the real numbers with a reasonably large finite field and estimate any analytical function using a truncated version of its power series expansion, which consists of only additions and multiplications.

3. SMC WITH PERFECT SECURITY

In this section, we discuss perfectly secure multiparty computation (PSMC), in which an adversary will learn nothing about the secret numbers of the honest parties no matter how computationally powerful the adversary is. The idea is that while the adversary may control a number of parties who receive messages from other honest senders, these messages provide no useful information about the secret numbers of the senders.

One of the basic tools used in PSMC is secret sharing. A t-out-of-m secret-sharing scheme breaks a secret number x into m shares r1, r2, . . . , rm such that x cannot be reconstructed unless an adversary obtains more than t − 1 shares, with t ≤ m. The importance of a secret-sharing scheme in PSMC is illustrated by the following example: in a 2-party secure computation of f(x1, x2), party Pi will use a 2-out-of-2 secret-sharing scheme to break xi into ri1 and ri2, and share rij with party Pj. Each party then computes the function using the shares received, resulting in y1 = f(r11, r21) at P1 and y2 = f(r12, r22) at P2. If the secret-sharing scheme is homomorphic under the function f(·), that is, if y1 and y2 are themselves secret shares of the desired value f(x1, x2), then f(x1, x2) can be easily computed by exchanging y1 and y2 between the two parties. Under our computational model, all SMC problems can be solved if the secret-sharing scheme is doubly homomorphic—it preserves both addition and multiplication. One such scheme was invented by Adi Shamir, which we will explain next [4].

In Shamir's secret-sharing scheme, a party hides her secret number x as the constant term of a secret polynomial g(z) of degree t − 1,

g(z) = a_{t−1} z^{t−1} + a_{t−2} z^{t−2} + · · · + a_1 z + x.   (1)

The coefficients a_1 to a_{t−1} are random coefficients distributed uniformly over the entire field. Given the polynomial g(z), the secret number x can be recovered by evaluating it at z = 0. The secret shares are computed by evaluating g(z) at z = 1, 2, . . . , m and are distributed to m other parties. It is assumed that each party knows the degree of g(z) and the value z at which her share is evaluated. We follow the convention that the share received by party Pi is evaluated at z = i.
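As a concrete illustration, here is a minimal Python sketch of the share-generation step just described; the prime p, the threshold t = 3, and the party count m = 5 are toy choices for the example, not values taken from the paper:

```python
import random

P = 2**31 - 1   # a toy prime, standing in for "a reasonably large finite field"
T, M = 3, 5     # threshold t and number of parties m

def make_shares(x, t=T, m=M, p=P):
    """Hide the secret x as g(0) of a random degree-(t-1) polynomial g(z)
    and return the shares g(1), ..., g(m) for parties P_1, ..., P_m."""
    coeffs = [x] + [random.randrange(p) for _ in range(t - 1)]  # constant term is x
    def g(z):
        return sum(a * pow(z, k, p) for k, a in enumerate(coeffs)) % p
    return [g(i) for i in range(1, m + 1)]  # party P_i receives g(i)

print(make_shares(42))  # five field elements; no single share reveals 42
```

With t = 1 the polynomial is constant and every share equals the secret; any larger t makes each individual share uniformly distributed over the field.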

If an adversary obtains any t shares g(z1), g(z2), . . . , g(zt) with zi ∈ {1, 2, . . . , m}, the adversary can then formulate the following polynomial ĝ(z):

ĝ(z) = ∑_{i=1}^{t} g(z_i) · ∏_{j=1, j≠i}^{t} (z − z_j) / ∏_{j=1, j≠i}^{t} (z_i − z_j).   (2)

We claim that ĝ(z) is identical to the secret polynomial g(z): first, the degree of ĝ(z) is t − 1, the same as that of g(z). Second, ĝ(z) = g(z) for z = z1, z2, . . . , zt because, when evaluating ĝ(z) at a particular z = zi, every term inside the summation in (2) goes to zero except the one that contains g(zi), which reduces to g(zi) as its multiplier becomes one. Consequently, the (t − 1)th-degree polynomial ĝ(z) − g(z) will have t roots. As the number of roots is higher than the degree, ĝ(z) − g(z) must be identically zero, that is, ĝ(z) ≡ g(z). As a result, the adversary can reconstruct the secret number x = ĝ(0).

On the other hand, the adversary will have no knowledge about x even if it possesses as many as t − 1 shares. This is because, for any arbitrary secret number x′, there exists a polynomial h(z) such that h(0) = x′ and h(zi) = g(zi) for i = 1, 2, . . . , t − 1. h(z) is given as follows, and its properties are similar to those of (2):

h(z) = x′ · ∏_{j=1}^{t−1} (z − z_j) / ∏_{j=1}^{t−1} (−z_j)  +  ∑_{i=1}^{t−1} g(z_i) · [ z ∏_{j=1, j≠i}^{t−1} (z − z_j) ] / [ z_i ∏_{j=1, j≠i}^{t−1} (z_i − z_j) ].   (3)
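Reconstruction from t shares—the interpolation (2) evaluated directly at z = 0 to recover x = g(0)—can be sketched in Python as follows (the prime and the polynomial are toy choices for illustration):

```python
P = 2**31 - 1  # toy prime field

def lagrange_at_zero(points, p=P):
    """Interpolate the unique degree-(t-1) polynomial through the t given
    (z_i, g(z_i)) pairs and evaluate it at z = 0, as in equation (2)."""
    acc = 0
    for zi, yi in points:
        num, den = 1, 1
        for zj, _ in points:
            if zj != zi:
                num = num * (0 - zj) % p
                den = den * (zi - zj) % p
        acc = (acc + yi * num * pow(den, -1, p)) % p  # pow(., -1, p) = modular inverse
    return acc

# Recover the secret 42 hidden in g(z) = 7z^2 + 3z + 42 from t = 3 of m = 5 shares.
g = lambda z: (7 * z * z + 3 * z + 42) % P
print(lagrange_at_zero([(1, g(1)), (3, g(3)), (5, g(5))]))  # 42
```

Any t of the m shares give the same answer; any t − 1 of them are consistent with every possible secret, which is exactly the indistinguishability argument of (3).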

Shamir's secret-sharing scheme is obviously homomorphic under addition: given two secret (t − 1)th-degree polynomials g(z) and h(z), the secret shares of g(z) + h(z) are simply the sums of their respective secret shares g(1) + h(1), g(2) + h(2), . . . , g(m) + h(m). Secrecy is also maintained, as the coefficients of g(z) + h(z), except for the constant term, which is the sum of the two secret numbers, are uniformly distributed, and no party can gain additional knowledge about others' secrets. On the other hand, the degree of the product polynomial g(z)h(z) increases to 2(t − 1). The locally computed shares g(1)h(1), g(2)h(2), . . . , g(m)h(m) cannot completely specify g(z)h(z) unless the number of shares m is strictly larger than 2(t − 1), or equivalently, t ≤ ⌈m/2⌉. Even if this condition is satisfied, a series of products can easily result in a polynomial with degree higher than m. Furthermore, the coefficients of the product polynomial are not entirely random; for example, they are related in such a way that the polynomial can be factored into the original polynomials. These problems can be solved by first assuming that t ≤ ⌈m/2⌉ and then replacing the product polynomial with a new (t − 1)th-degree polynomial as follows.
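The additive homomorphism can be checked end to end in a few lines of Python (toy prime field and parameters; the parties simply add their shares pointwise, and the sum of the two secrets pops out at z = 0):

```python
import random

p, t, m = 10**9 + 7, 3, 5  # toy prime field, threshold, and party count

def shares(x):
    """Shamir shares g(1), ..., g(m) of the secret x."""
    c = [x] + [random.randrange(p) for _ in range(t - 1)]
    return [sum(a * pow(z, k, p) for k, a in enumerate(c)) % p
            for z in range(1, m + 1)]

def recover(pts):
    """Lagrange interpolation at z = 0 from any t (z, share) pairs."""
    acc = 0
    for zi, yi in pts:
        w = 1
        for zj, _ in pts:
            if zj != zi:
                w = w * (0 - zj) % p * pow(zi - zj, -1, p) % p
        acc = (acc + yi * w) % p
    return acc

sx, sy = shares(11), shares(31)
summed = [(a + b) % p for a, b in zip(sx, sy)]  # each party adds her shares locally
print(recover(list(enumerate(summed, 1))[:t]))  # 42 = 11 + 31
```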

Pi first computes g(i)h(i) and then generates a random (t − 1)th-degree polynomial qi(z) with qi(0) = g(i)h(i). Again using the secret-sharing scheme, Pi sends the share qi(j) to party Pj for j = 1, 2, . . . , m. This step leaks no information about the local product g(i)h(i). In the final step, Pi computes di based on all the received shares qj(i) for j = 1, 2, . . . , m,

d_i = ∑_{j=1}^{m} γ_j q_j(i),   (4)

where the γ_j for j = 1, 2, . . . , m solve the following equation:

g(0)h(0) = ∑_{j=1}^{m} γ_j g(j)h(j).   (5)

Before explaining how Pi can solve (5) without knowing g(0)h(0) and g(j)h(j) for j ≠ i, we first note that the d_i for i = 1, 2, . . . , m are shares of a (t − 1)th-degree polynomial q(z) defined below:

q(z) = ∑_{j=1}^{m} γ_j q_j(z).   (6)

The coefficients of q(z) are uniformly random, as they are linear combinations of the uniformly distributed coefficients of the q_j(z)'s. Furthermore, its constant term is our target secret number g(0)h(0):

q(0) = ∑_{j=1}^{m} γ_j q_j(0) = ∑_{j=1}^{m} γ_j g(j)h(j) = g(0)h(0).   (7)

Figure 1: This diagram shows how three parties can share the secret g(0)h(0) based on the locally computed products g(1)h(1), g(2)h(2), and g(3)h(3). (In the figure, party i holds g(i)h(i), generates q_i(z) with q_i(0) = g(i)h(i), and distributes the shares q_i(1), q_i(2), q_i(3); each party j then computes q(j) = γ_1 q_1(j) + γ_2 q_2(j) + γ_3 q_3(j), and q(0) = γ_1 q(1) + γ_2 q(2) + γ_3 q(3) = g(0)h(0).)

The second-to-last equality holds because g(j)h(j) is the secret number hidden by the polynomial q_j(z). The last equality is based on (5). This implies that the d_i for i = 1, 2, . . . , m are secret shares of the scalar g(0)h(0). An example of the above protocol in a three-party situation is shown in Figure 1.

To address how each party can solve (5), we note that, based on our assumption t ≤ ⌈m/2⌉, the degree of the product polynomial g(z)h(z) is strictly smaller than the number of shares m. Let g(z)h(z) = a_{m−1} z^{m−1} + · · · + a_0. The coefficients a_i are completely determined by the values of g(z)h(z) at z = 1, 2, . . . , m. In other words, the following matrix equation has a unique solution:

        ⎡ 1^{m−1}  1^{m−2}  · · ·  1^0 ⎤ ⎡ a_{m−1} ⎤   ⎡ g(1)h(1) ⎤
 Va  =  ⎢ 2^{m−1}  2^{m−2}  · · ·  2^0 ⎥ ⎢ a_{m−2} ⎥ = ⎢ g(2)h(2) ⎥ .   (8)
        ⎢    ⋮        ⋮               ⋮ ⎥ ⎢    ⋮    ⎥   ⎢     ⋮    ⎥
        ⎣ m^{m−1}  m^{m−2}  · · ·  m^0 ⎦ ⎣   a_0   ⎦   ⎣ g(m)h(m) ⎦

The m × m invertible matrix V is called the Vandermonde matrix, and it is a constant matrix. Taking its inverse W = V^{−1} and considering the last-row entries W_{mi} for i = 1, 2, . . . , m, we have

∑_{i=1}^{m} W_{mi} g(i)h(i) = a_0 = g(0)h(0).   (9)

Comparing (9) with (5), we have W_{mi} = γ_i for i = 1, 2, . . . , m, which are constants.
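The whole degree-reduction protocol for one multiplication can be simulated in Python. One useful fact for the sketch: the last row of W = V^{−1} consists of exactly the Lagrange weights for interpolating at z = 0, which is how γ_i is computed below (toy field, t = 2, m = 3; all other names are introduced for the example):

```python
import random

p, t, m = 10**9 + 7, 2, 3  # degree-1 secret polynomials, 3 parties (t <= ceil(m/2))

def shares(x):
    """Shamir shares g(1), ..., g(m) of the secret x."""
    c = [x] + [random.randrange(p) for _ in range(t - 1)]
    return [sum(a * pow(z, k, p) for k, a in enumerate(c)) % p
            for z in range(1, m + 1)]

def recover(pts):
    """Lagrange interpolation at z = 0 from t (z, share) pairs."""
    acc = 0
    for zi, yi in pts:
        w = 1
        for zj, _ in pts:
            if zj != zi:
                w = w * (0 - zj) % p * pow(zi - zj, -1, p) % p
        acc = (acc + yi * w) % p
    return acc

def gamma(i):
    """W_mi: the Lagrange weight of point i for evaluation at z = 0."""
    w = 1
    for j in range(1, m + 1):
        if j != i:
            w = w * (0 - j) % p * pow(i - j, -1, p) % p
    return w

gx, hx = shares(6), shares(7)                # party i holds g(i) and h(i)
prod = [a * b % p for a, b in zip(gx, hx)]   # local products g(i)h(i)
q = [shares(v) for v in prod]                # party i reshares g(i)h(i) as q_i(z)
d = [sum(gamma(j + 1) * q[j][i] for j in range(m)) % p
     for i in range(m)]                      # equation (4): d_i = sum_j gamma_j q_j(i)
print(recover([(1, d[0]), (2, d[1])]))       # 42 = g(0)h(0) = 6 * 7
```

Any t of the new shares d_i reconstruct the product, and the resharing polynomials q_i(z) have fresh uniform coefficients, which is the point of the protocol.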

The condition t ≤ ⌈m/2⌉ on using Shamir's scheme in PSMC places a restriction on the number of dishonest parties tolerated—it implies that the number of honest parties must be a strict majority. In particular, we cannot use this scheme for a two-party SMC in which one party has to assume that the other party is dishonest. A surprising result in [5] shows that the condition t ≤ ⌈m/2⌉ is not a weakness of Shamir's scheme—in fact, except for certain trivial functions,¹ it is impossible to compute any f(x1, x2, . . . , xm) with perfect security if the number of dishonest parties equals or exceeds ⌈m/2⌉.

To conclude this section, we briefly describe how PSMC protocols can be modified to handle malicious parties. There are two types of disruption: first, a malicious party can output erroneous results, and second, she may perform an inconsistent secret-sharing scheme, such as evaluating the polynomial at random points. Provided the number of malicious parties is less than one third of the total number of parties, the first problem can be solved by replacing (2) with a robust extrapolation scheme based on Reed-Solomon codes [5]. This bound on the number of malicious parties can be raised to one half by combining interactive zero-knowledge proofs with a broadcast channel [6]. The second problem can be solved by using a verifiable secret-sharing (VSS) scheme, in which the sender provides auxiliary information so that the receivers can verify the consistency of their shares without gaining knowledge of the secret number [5].

4. SMC WITH COMPUTATIONAL SECURITY

It is unsatisfactory that the PSMC introduced in Section 3 cannot even provide secure two-party computation. Instead of relying on perfect security, modern cryptographic techniques primarily use the so-called computational security model. Under this model, secrets are protected by encoding them with a mathematical function whose inverse is difficult to compute without knowledge of a secret key. Such a function is called a one-way trapdoor function, and the concept is used in many public-key ciphers: a sender who wants to send a message m to party P will first compute a ciphertext c = E(m, k) based on the publicly known encryption algorithm E(·) and P's advertised public key k. The encryption algorithm acts as a one-way trapdoor function because a computationally bounded eavesdropper will not be able to recover m given only c and k. On the other hand, P can recover m by applying a decoding algorithm D(E(m, k), s) = m using her secret key s. Unlike perfectly secure protocols, in which the adversary simply does not have any information about the secret, the adversary in the computationally secure model is unable to decrypt the secret due to the computational burden of solving the inverse problem. Even though it is still a conjecture that true one-way trapdoor functions exist, and future computation platforms like the quantum computer may drastically change the landscape of these functions, many one-way function candidates exist and are routinely used in practical security systems.²

1 The exceptions are those functions that are separable, that is, f(x1, x2, . . . , xm) = f1(x1) f2(x2) · · · fm(xm).

2 A list of one-way function candidates can be found in [7, Chapter 1].

Table 1: OT table at P1.

Key      Value
0        −u
1        r11 − u
2        2 r11 − u
· · ·    · · ·
r22      r22 r11 − u
· · ·    · · ·
N − 2    (N − 2) r11 − u
N − 1    (N − 1) r11 − u

The most fundamental result in SMC is that it is possible to design general computationally secure multiparty computation (CSMC) protocols that handle an arbitrary number of dishonest parties [3]. In this section, we will discuss the basic construction of these protocols. Similar to Section 3, we consider the protocols for addition and multiplication in finite fields. We will concentrate on the canonical two-party case, but our construction can be easily extended to more than two parties. Our starting point for building general CSMC is a straightforward secret-sharing scheme: each secret number is simply broken down as a sum of two uniformly distributed random numbers: x1 = r11 + r12 and x2 = r21 + r22. Pi then sends rij to Pj for j ≠ i. This scheme is clearly homomorphic under addition

x1 + x2 = (r11 + r21) + (r12 + r22).   (10)
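A quick Python check of this additive scheme (a small toy prime N stands in for the finite field):

```python
import random

N = 65537  # toy prime, standing in for the N-element finite field

def split(x):
    """Break x into two uniformly distributed additive shares mod N."""
    r1 = random.randrange(N)
    return r1, (x - r1) % N

x1, x2 = 12, 30
r11, r12 = split(x1)    # P1 keeps r11 and sends r12 to P2
r21, r22 = split(x2)    # P2 keeps r22 and sends r21 to P1
s1 = (r11 + r21) % N    # computed locally by P1
s2 = (r12 + r22) % N    # computed locally by P2
print((s1 + s2) % N)    # 42 = x1 + x2, per equation (10)
```

Each party sees only one share of the other's secret, which is uniformly random on its own.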

Multiplication, on the other hand, introduces the cross-term r11 r22, which breaks the homomorphism:

x1 x2 = r11 r21 + r12 x2 + r11 r22.   (11)

While the first two terms can be locally computed by P1 and P2, respectively, it is impossible to compute the third term r11 r22 without having one party reveal her actual secret number to the other. In order to accomplish this under the computational security model, we will make use of a general cryptographic protocol called oblivious transfer (OT).

A 1-out-of-N OT protocol allows one party (the chooser) to read one entry from a table with N entries hosted by another party (the sender). Provided that both parties are computationally bounded, the OT protocol prevents the chooser from reading more than one entry and the sender from knowing the chooser's choice. We first show how the OT protocol can be used to break r11 r22 in (11) into random shares u and v such that r11 r22 = u + v. Assume our finite field has N elements. The sender P1 generates a random u and then creates a table T with the N entries shown in Table 1.³ Using the OT protocol, the chooser P2 selects the entry v = T(r22) = r22 r11 − u without letting P1 know her selection or inspecting any other entries in the table.
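The table construction itself is easy to verify in Python; the single line reading T[r22] below stands in for the OT protocol described next (the field size is a toy choice):

```python
import random

N = 101  # toy field size

r11 = random.randrange(N)  # held by the sender P1
r22 = random.randrange(N)  # held by the chooser P2

u = random.randrange(N)                    # P1's random output share
T = [(k * r11 - u) % N for k in range(N)]  # Table 1: T(k) = k*r11 - u

v = T[r22]  # in the real protocol, P2 reads exactly this one entry via OT
print((u + v) % N == (r11 * r22) % N)  # True: r11*r22 = u + v
```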

It remains to show how OT provides the security guarantee. A 1-out-of-N OT protocol consists of the following five steps.

(1) P1 sends N randomly generated public keys k0, k1, . . . , k_{N−1} to P2.

(2) P2 selects k_{r22} based on her secret number r22, encrypts her public key k′ using k_{r22}, and sends E(k′, k_{r22}) back to P1.

(3) As P1 does not know P2's key selection, P1 decodes the incoming message using all possible keys, or k′_i = D(E(k′, k_{r22}), s_i), with private keys s_i for i = 0, 1, . . . , N − 1. Only one of the k′_i (namely k′_{r22}) matches the real key k′, but P1 has no knowledge of which.

(4) P1 encrypts each table entry T(i) using k′_i and sends E(T(i), k′_i) for i = 0, 1, . . . , N − 1 to P2.

(5) P2 decrypts the r22th message using her private key s′: D(E(T(r22), k′_{r22}), s′) = T(r22), as k′_{r22} = k′ is the public key corresponding to the secret key s′. P2 then obtains her random share v = T(r22) = r22 r11 − u. Note that P2 will not be able to decrypt any other message E(T(i), k′_i) for i ≠ r22, as that requires knowledge of P1's secret key s_i.

3 The role of P1 and P2 can be interchanged with proper adjustment to the Table 1 entries.
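The five-step message flow can be mimicked in Python, but only the mechanics: XOR is used below as a stand-in for the public-key cipher E/D, and XOR has no trapdoor, so this toy provides no actual security—it merely shows that the chooser ends up with exactly T(r22):

```python
import random

# WARNING: XOR replaces real trapdoor encryption here; with XOR the chooser
# could derive every candidate key, so this illustrates the flow, not security.
N, BITS = 8, 32
rnd = lambda: random.getrandbits(BITS)

T = [rnd() for _ in range(N)]        # sender P1's table
r22 = 5                              # chooser P2's secret index

k = [rnd() for _ in range(N)]        # step 1: P1's N "public keys"
kp = rnd()                           # P2's own key k'
msg = kp ^ k[r22]                    # step 2: P2 "encrypts" k' under k_{r22}
kprime = [msg ^ ki for ki in k]      # step 3: P1 "decrypts" under every key
enc = [t ^ kc for t, kc in zip(T, kprime)]  # step 4: encrypt each table entry
out = enc[r22] ^ kp                  # step 5: P2 decrypts her entry with k'

print(out == T[r22])  # True
```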

It is clear from the above procedure that OT can accomplish a table lookup that is secure for both P1 and P2. As the definition of the table is arbitrary, OT can support secure two-party computation of any finite-field function. Following procedures similar to those in Section 3, the above construction can be extended using standard zero-knowledge proofs and verifiable secret-sharing schemes to handle malicious parties that do not follow the prescribed protocols [8, Chapter 7].

5. RECENT ADVANCES

In Sections 3 and 4, we presented the construction of general SMC protocols under the perfect security model and the computational security model. While most of these results were established in the 1980s, SMC continues to be a very active research area in cryptography, and its applications have begun to appear in many other disciplines. Recent advances focus on better understanding the security strength of individual protocols and their composition, improving CSMC protocols in terms of their computation complexity [9, 10] and communication cost [11–14], relating SMC to error-correcting coding [15, 16], and introducing SMC to a variety of applications [17–22]. The rigorous study of protocol security is beyond the scope of this paper, and thus we will focus on the remaining three topics.

5.1. Reduction of computation complexity and communication cost

Both the computation complexity and the communication cost of the 1-out-of-N OT protocol depend linearly on the size N of the sender's table that defines the function—it requires O(N) invocations of a public-key cipher and O(N) messages exchanged between the sender and the chooser. In many practical applications, the value of N can be very large. For example, computing a general function on 32-bit computers requires a table of N = 2^32, or more than four billion entries! This renders our basic version of OT hopelessly impractical. Improving the computation efficiency and reducing the communication requirement of OT and other CSMC protocols have thus become the focus of intensive research effort.

In [9], Naor and Pinkas showed that the 1-out-of-N OT protocol can be reduced to applying a 1-out-of-2 OT protocol log2 N times. The idea is that the two parties repeatedly use the 1-out-of-2 OT on individual bits of the binary representation of the chooser's secret number x2: in the ith round, the sender presents two keys K_{i0} and K_{i1} to the chooser, who chooses K_{i,x2[i]} based on x2[i], the ith bit of x2. The keys K_{i0} and K_{i1} for i = 1, 2, . . . , log2 N are used by the sender to encrypt the table entries T(k) using the binary representation of k as follows:

E(T(k)) = T(k) ⊕ ⨁_{i=1}^{log2 N} f(K_{i,k[i]}),   (12)

where k is a log2 N-bit number, f(s) is a random number generated from seed s, and ⊕ denotes XOR. The entire encrypted table is sent to the chooser. Since the chooser already knows K_{i,x2[i]} for i = 1, 2, . . . , log2 N, she can use them to decrypt E(T(x2)) as follows:

T(x2) = E(T(x2)) ⊕ ⨁_{i=1}^{log2 N} f(K_{i,x2[i]}).   (13)
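A Python sketch of the masking in (12) and the unmasking in (13); SHA-256 plays the role of the seeded generator f(s), and the per-bit 1-out-of-2 OTs are not re-implemented—the chooser is simply handed the log2 N keys matching the bits of her index:

```python
import hashlib
import random

LOGN = 4
N = 1 << LOGN
T = [random.getrandbits(32) for _ in range(N)]  # sender's table

def f(seed):
    """Pseudorandom value from a seed (stands in for f(s) in the text)."""
    return int.from_bytes(hashlib.sha256(seed.to_bytes(8, "big")).digest()[:4], "big")

def bit(k, i):
    return (k >> i) & 1

def xor_all(vals):
    r = 0
    for v in vals:
        r ^= v
    return r

# Per-round key pairs (K_i0, K_i1); the chooser would learn one key of each
# pair via a 1-out-of-2 OT on each bit of her index.
K = [(random.getrandbits(48), random.getrandbits(48)) for _ in range(LOGN)]

# Equation (12): mask every entry T(k) with the keys selected by k's bits.
enc = [T[k] ^ xor_all(f(K[i][bit(k, i)]) for i in range(LOGN)) for k in range(N)]

# Equation (13): the chooser strips the mask from her entry only.
x2 = 11
rec = enc[x2] ^ xor_all(f(K[i][bit(x2, i)]) for i in range(LOGN))
print(rec == T[x2])  # True
```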

The same authors further improved the computation complexity of the 1-out-of-2 OT protocol in [10]. They showed that it is possible to use one exponentiation, the most complex operation in a public-key cipher, for any number of simultaneous invocations of the 1-out-of-2 OT, at the cost of increased communication overhead. Their public-key cipher is based on the assumed difficulty of the Decisional Diffie-Hellman problem, whose encryption process enables the sender to prepare all her encrypted messages with one exponentiation without any loss of secrecy.

An aspect that the above algorithms do not address is the communication requirement of general CSMC protocols. There are three different facets to the communication problem. First, our basic version of the 1-out-of-N OT protocol requires the sender to send N random keys and N encrypted messages to the chooser. The random keys can be considered a setup cost, provided that the sender changes her random share u and the chooser changes her key k′ in every invocation of the protocol. However, it seems necessary to send the N encrypted messages every time, as the messages depend on u. A closer examination reveals that all the chooser needs is the one particular message that corresponds to her secret number. The entire set of N messages is sent simply to obfuscate her choice from the sender. This subproblem of obfuscating a selection from a public data collection is called private information retrieval (PIR). PIR has attracted much research interest lately and is treated in Section 5.2. It suffices to know that there are techniques that can reduce the communication cost from O(N) to O(log N) [23].

The second facet involves the communication cost of the original unsecured implementation of the target function. The CSMC protocols in Section 4 provide a systematic procedure to secure each addition and multiplication operation in the original implementation. However, not all operations need to be secured—local operations can be performed without any modification. As such, it is important to minimize the number of cross-party operations that need to be fortified with the OT protocol. Consider the following example: P1 and P2, each with n/2 secret numbers, want to find the median of the entire set of n numbers. The best known unsecured algorithm to find the median requires O(n) comparison operations. To make this algorithm secure, we can use the 1-out-of-N OT protocol to implement each comparison,⁴ resulting in a communication requirement of O(n log N). This, however, is not the optimal solution—a distributed median-finding algorithm requires much less communication [13]. The idea is to have P1 and P2 first compare their respective local medians. The party with the larger median can then discard the half of her local data larger than the local median—the global median cannot be in this portion of the local data, as the global median must be smaller than the larger of the two local medians. Following the same logic, the other party can discard the smaller half of her local data. The two parties again compare the local medians of the remaining data, repeating until exhaustion. Notice that all the local computation can be done without invocations of OT. As a result, this algorithm requires only O(log n) cross-party secure comparisons, which results in a communication cost of O(log n log N), a significant reduction from the naive implementation. In fact, it has been shown that if a communication-efficient unsecured implementation exists for a general function, we can always convert it into a secure one without much increase in communication [12].
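The discard-half recursion can be written out in Python. This sketch assumes equal-size lists whose length is a power of two with all elements distinct; the single comparison per round is the step that would run as a secure comparison:

```python
def joint_median(A, B):
    """k-th smallest of A ∪ B, where k = len(A) = len(B) is a power of two
    and all elements are distinct; uses O(log n) cross-party comparisons."""
    A, B = sorted(A), sorted(B)
    comparisons = 0
    while len(A) > 1:
        ma, mb = A[len(A) // 2 - 1], B[len(B) // 2 - 1]  # lower local medians
        comparisons += 1  # this comparison is the only cross-party step
        if ma < mb:
            A, B = A[len(A) // 2:], B[:len(B) // 2]  # drop A's lower, B's upper half
        else:
            A, B = A[:len(A) // 2], B[len(B) // 2:]
    comparisons += 1  # final secure comparison of the two survivors
    return min(A[0], B[0]), comparisons

print(joint_median([1, 4, 6, 8], [2, 3, 5, 7]))  # (4, 3): 4th smallest, 3 comparisons
```

All the sorting and discarding is local; only the median comparisons cross the party boundary, giving the O(log n) count claimed above.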

The final facet of the communication requirements has to do with the interactivity of the CSMC protocols. All the protocols introduced thus far require multiple rounds of communication between the parties. Such frequent interaction is undesirable in many applications, such as batch processing, in which one party needs to reuse the same secret information from another party many times, and asymmetric computation, in which a low-complexity client wants to leverage a sophisticated server to privately perform a complex computation. Earlier work in this area showed that one round of message exchange is indeed possible for secure computation of any function [11]. However, the length of the reply message depends on the complexity of the implementation of the function. As a result, the end receiver must devote much time to decoding the message even though the output can be as small as a binary decision. This problem can be resolved using a doubly homomorphic public-key encryption scheme, in which arbitrary computation can be done on the encrypted data without size expansion. It is an open problem in cryptography whether a doubly homomorphic encryption scheme exists. The closest scheme, which we explain next, can support an arbitrary number of additions and one multiplication on encrypted data [14].

The construction is based on two public-key ciphers defined on two different finite cyclic groups G and G1 of the same size n = q1 q2, where q1 and q2 are large private primes.

4 Secure comparison is also called the Secure Millionaire Problem, one of the earliest problems studied in the SMC literature [3].

These two groups are related by a special bilinear map e : G × G → G1 such that e(u^α, v^β) = e(u, v)^{αβ} for arbitrary u, v ∈ G and integers α, β.⁵ Furthermore, e(g, g) is a generator of G1 if g is a generator of G. The public keys for the cipher defined on G are a generator g and a random h = g^{αq2} for some α. The public keys for the cipher on G1 are g1 = e(g, g) and h1 = e(g, h) = g1^{αq2}. Given a message m, the sender generates a random integer r and computes the ciphertext C = g^m h^r ∈ G. To decrypt this ciphertext, the receiver first removes the random factor by raising C to the power of the private key q1:

C^{q1} = (g^m h^r)^{q1} = (g^{q1})^m g^{αq2 r q1} = (g^{q1})^m,   (14)

where we use the basic fact g^{q1 q2} = g^n = 1 from group theory. Provided that the message space is small enough, the receiver can then retrieve m by computing the discrete logarithm of C^{q1} base g^{q1}. The security of the cipher is based on the assumed hardness of the so-called subgroup decision problem, for which we refer the reader to the original paper [14]. We now focus on the homomorphic properties of this scheme. Given two ciphertexts C1 = g^{m1} h^{r1} and C2 = g^{m2} h^{r2}, it is easy to see that C1 C2 = g^{m1+m2} h^{r1+r2}, which is a ciphertext of the message m1 + m2. For multiplication, we apply the bilinear map e(·, ·) to C1 and C2:

e(C1, C2) = e(g^{m1} h^{r1}, g^{m2} h^{r2})
          = e(g^{m1+αq2 r1}, g^{m2+αq2 r2})
          = e(g, g)^{m1 m2 + αq2 (m1 r2 + m2 r1 + αq2 r1 r2)}
          = e(g, g)^{m1 m2} e(g, h)^{m1 r2 + m2 r1 + αq2 r1 r2}
          = g1^{m1 m2} h1^{r′}.   (15)

The last expression is clearly a ciphertext for m1 m2. Unfortunately, e(C1, C2) belongs to G1, not to G. This means that one cannot further combine it with other ciphertexts in G, and as such this scheme falls short of being a completely homomorphic encryption scheme.
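The decryption identity (14) and the additive property can be checked numerically with deliberately tiny parameters: p = 71, q1 = 5, q2 = 7, so the order-35 subgroup of Z_71* stands in for the pairing-friendly group G. These numbers are illustrative only, and the pairing-based multiplication in (15) is not attempted:

```python
import random

p, q1, q2 = 71, 5, 7     # toy primes only; p - 1 = 70 is divisible by n = 35
n = q1 * q2

# Find an element g of order exactly n in Z_p*.
g = next(a for a in range(2, p)
         if pow(a, n, p) == 1 and pow(a, q1, p) != 1 and pow(a, q2, p) != 1)

alpha = random.randrange(1, q1)
h = pow(g, alpha * q2, p)          # public key: (g, h); private key: q1

def enc(m):
    r = random.randrange(1, n)
    return pow(g, m, p) * pow(h, r, p) % p   # C = g^m h^r

def dec(C):
    # Equation (14): C^q1 = (g^q1)^m; a brute-force small-range discrete log.
    target, base = pow(C, q1, p), pow(g, q1, p)
    return next(m for m in range(q2) if pow(base, m, p) == target)

C1, C2 = enc(2), enc(3)
print(dec(C1 * C2 % p))  # 5: multiplying ciphertexts adds the plaintexts
```

Note how h^{q1} = g^{αq2q1} = g^{αn} = 1, which is exactly why raising to q1 strips the randomizer.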

5.2. Private information retrieval

Private information retrieval (PIR) protocols allow a party (a user) to select a record from a database owned by another party (a server) without the server knowing the selection of the user. PIR is a step in OT, as explained in Section 5.1. Unlike OT, PIR does not prevent the user from obtaining information about the collection beyond her choice. Due to its asymmetric protection, the paradigm of PIR is useful for the privacy protection of ordinary citizens when using search engines, shopping at online stores, participating in public surveys, and voting electronically. As we have seen in Section 5.1, the simplest form of PIR is to send the entire database to the user. This imposes a communication cost on the order of the size of the database. Recent advances in PIR protocols, however, show that the goal can be accomplished with a much smaller communication overhead.

5 An example of such a construction is based on the modified Weil pairing on the elliptic curve y^2 = x^3 + 1 defined over a finite field [14].

The problem of PIR was first proposed in the seminal paper by Chor et al. as follows [24]: the server has an n-bit binary string x, and a user wants to know x[i], the ith bit of x, without the server knowing about i. The first important result shown in [24] is that, under the perfect security model, it is impossible to send less data than the trivial solution of sending the entire x to the user. On the other hand, if identical databases are available at k ≥ 2 noncolluding servers, then perfect security can be achieved with a communication cost of O(n^{1/k}). Their results are based on the following basic two-server scheme that allows a user to privately obtain x[i] by receiving a single bit from each of the two servers. Let us denote

S ⊗ a =
    S ∪ {a},  if a ∉ S,
    S \ {a},  if a ∈ S.        (16)

The user first forms a set S by including each index j ∈ {1, 2, . . . , n} independently with probability 1/2. Next, the user computes S ⊗ i, where i is the desired index. The user then sends S to server one and S ⊗ i to server two. Upon receiving S, server one replies to the user with a single bit, namely the XOR of all the bits in the positions specified by S. Similarly, server two replies with a single bit, the XOR of all the bits in the positions specified by S ⊗ i. The user then computes x[i] by XORing the two bits received from the two servers. This scheme works because every position j ≠ i appears in both S and S ⊗ i or in neither, so the XOR of all such x[j]'s cancels to 0. On the other hand, i appears exactly once, in either S or S ⊗ i, so the XOR of the two replies equals x[i]. Provided the two servers do not collude, each server sees a uniformly random subset of positions and learns nothing about i. In this scheme, each server sends one bit to the user, but the user has to send an n-bit message6 to each server. Thus, the overall communication cost is still O(n). With minor modifications, this basic scheme can be extended to reduce the number of bits sent by the user to O(n^(1/k)) [24].
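The basic two-server scheme above can be sketched in a few lines of Python (an illustrative implementation, not from the paper; indexes are 0-based here, and `secrets.randbits` stands in for the user's coin flips):

```python
import secrets

def xor_bits(x, positions):
    """A server's reply: XOR of the database bits at the given positions."""
    r = 0
    for j in positions:
        r ^= x[j]
    return r

def pir_query(x, i):
    """Retrieve x[i] from two noncolluding servers holding the same database x."""
    n = len(x)
    # User: pick a uniformly random subset S of {0, ..., n-1}.
    S = {j for j in range(n) if secrets.randbits(1)}
    # S_i = S (x) {i}: flip the membership of the desired index i.
    S_i = S ^ {i}
    # Server one sees only S, server two sees only S_i; each replies one bit.
    b1 = xor_bits(x, S)
    b2 = xor_bits(x, S_i)
    # Every j != i contributes to both replies or to neither, so it cancels.
    return b1 ^ b2

database = [1, 0, 1, 1, 0, 0, 1, 0]
assert all(pir_query(database, i) == database[i] for i in range(len(database)))
```

Each invocation uses fresh randomness, so either server in isolation observes a uniformly random index set regardless of i.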

Recently, an interesting connection has been made between PIR and a special type of forward error-correcting codes (FEC) called locally decodable codes (LDC), which has created a flurry of interest in the information theory community [16]. FEC is used to combat transmission errors by adding redundancy to the transmitted data. Formally, the sender uses an encoding function C(·) to map an n-bit message x to an m-bit message C(x) with m > n, and then sends C(x) over a noisy channel. Upon receiving a string y, possibly different from C(x), the receiver attempts to recover x using a decoding algorithm D(y). In conventional FEC, it takes at least O(n) work to recover the n-bit x, since O(n) is required just to write x down. An LDC, on the other hand, allows the

6 The message is simply an n-bit string in which the ones indicate the selected positions.

user to inspect only a small fraction of C(x), say k ≪ n bits, in order to fully recover a specific bit x[i] of x. Furthermore, each bit of C(x) is equally likely to be used in a k-bit subset for recovering x[i]. As such, knowing which particular bits of C(x) are queried provides no information about which x[i] is being recovered. To see how an LDC is used for PIR, assume that each of the k servers stores the same m-bit codeword C(x), generated by applying an LDC encoding function to the n-bit database x. In order to retrieve x[i], the user sends q1, q2, . . . , qk ∈ {1, 2, . . . , m}, the locations of the bits of C(x) needed to recover x[i], to the k servers, respectively. Note that these locations depend only on i and the particular LDC used. Upon receiving qj, the jth server simply replies with C(x)[qj], for j = 1, 2, . . . , k. After gathering all k replies, the user runs the decoding algorithm to recover x[i]. Using this framework, the communication cost of the PIR system is k(l + log m), where l is the length of each server's reply; k log m and kl correspond to the user's and the servers' communication costs, respectively.

In fact, the two-server basic scheme introduced earlier can be viewed as using the Hadamard code in the LDC framework. The Hadamard code H(x) of an n-bit message x has 2^n bits. The kth bit of H(x), for k ∈ {0, 1, . . . , 2^n − 1}, is defined as follows:

H(x)[k] = ⊕_{j=1}^{n} x[j] k[j],        (17)

where k[j] denotes the jth bit of the binary representation of k.

To retrieve x[i] from the servers, the user first randomly picks an n-bit number k, and then sends k to server one and k ⊕ ei to server two, where ei is an n-bit number with a single one in the ith position. Upon receiving k and k ⊕ ei, servers one and two reply with H(x)[k] and H(x)[k ⊕ ei], respectively. The user can then decode x[i] by computing

H(x)[k] ⊕ H(x)[k ⊕ ei]
    = ( ⊕_{j=1, j≠i}^{n} x[j] k[j] ⊕ x[i] k[i] ) ⊕ ( ⊕_{j=1, j≠i}^{n} x[j] k[j] ⊕ x[i] (∼k[i]) )
    = x[i] (k[i] ⊕ ∼k[i]) = x[i].        (18)

The symbol ∼ denotes negation. This scheme is almost equivalent to the scheme by Chor et al., except that the XORs of all possible selections of bits of x are already contained in the Hadamard code H(x). We mention again that the communication cost of this scheme is O(n), due to the exponential code length of the Hadamard code. Nevertheless, the possibility of using better error-correcting codes in place of the Hadamard code opens many opportunities for new PIR schemes. PIR schemes based on Reed-Solomon codes and Reed-Muller codes can be found in [16]. The best published result on PIR uses an LDC to achieve a communication complexity of O(n^(10^−7)) with three noncolluding servers [25].

All of the above constructions provide PIR under the perfect security model. By making certain computational assumptions, PIR can also achieve sublinear communication complexity with only one database [23, 26]. We briefly review the scheme in [26] as follows: it is based on the assumed hardness of determining whether a number in a finite field


F is a quadratic residue; that is, without knowing the prime factorization of the field size N, it is difficult to compute the following predicate:

QR(u) =
    1,  if u = v^2 for some v ∈ F,
    0,  otherwise.        (19)

It is easy to see that QR(·) is homomorphic under multiplication, that is, QR(xy) = QR(x)QR(y). The basic principle of using QR to retrieve x[i] is straightforward: the user sends the server n numbers y1, . . . , yn ∈ F, all of them quadratic residues except yi, that is, QR(yj) = 1 for j ≠ i and QR(yi) = 0. The server then replies with m ∈ F computed as follows:

m := Π_{j=1}^{n} wj,  where  wj =
    yj,    if x[j] = 0,
    yj^2,  if x[j] = 1.        (20)

Since all yj's are quadratic residues except for yi, we have QR(wj) = 1 for j ≠ i and QR(wi) = x[i]. Combining this with the homomorphic property, we get the desired result QR(m) = QR(wi) = x[i]. This scheme, however, is very wasteful, as the user needs to send n log N bits. We can improve this by rearranging x as an s × t matrix M with s = n^((L−1)/L) and t = n^(1/L) for some integer L. Assume that x[i] is the entry at the ath row and the bth column of M. The user then sends the server yj, for j = 1, 2, . . . , t, all quadratic residues except for yb. The communication for this step is O(n^(1/L)). Using these t numbers, the server carries out a computation similar to (20) for each row of M, resulting in mk for k = 1, 2, . . . , s. Of all the mk's, the user needs only ma from the ath row, because it is sufficient to retrieve x[i] as QR(ma) = x[i]. Since each of the mk's is a log N-bit number, this is equivalent to carrying out the PIR procedure log N times, but this time the database size shrinks from n to s = n^((L−1)/L). This observation allows the same procedure to be applied recursively with exponentially decreasing communication cost. As a result, the communication is dominated by the first step, which is O(n^(1/L)), and we can make L as big as we want. Subsequent work by Cachin et al. showed that the communication cost can be further reduced to polylogarithmic complexity [23].
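The basic quadratic-residue protocol of (19) and (20) can be sketched as follows. This is a toy with small hardcoded primes: in the real scheme N is hundreds of digits long, only the user knows its factorization, and the marked non-residue is chosen with Jacobi symbol +1 so the server cannot distinguish it from the residues.

```python
import random

p, q = 103, 107          # toy primes known only to the user
N = p * q

def legendre(a, pr):
    # Euler's criterion: result is 1 iff a is a quadratic residue mod prime pr.
    return pow(a, (pr - 1) // 2, pr)

def is_qr(u):
    # Only the user, who knows p and q, can evaluate the predicate QR(u).
    return legendre(u % p, p) == 1 and legendre(u % q, q) == 1

def sample_qr():
    while True:
        r = random.randrange(2, N)
        if r % p and r % q:
            return (r * r) % N

def sample_qnr():
    # Non-residue mod both p and q: Jacobi symbol +1, so it "looks like" a QR.
    while True:
        z = random.randrange(2, N)
        if z % p and z % q and legendre(z % p, p) != 1 and legendre(z % q, q) != 1:
            return z

def server_reply(x, ys):
    # Equation (20): square y_j exactly when the database bit x[j] is 1.
    m = 1
    for xj, yj in zip(x, ys):
        w = yj if xj == 0 else (yj * yj) % N
        m = (m * w) % N
    return m

def retrieve(x, i):
    ys = [sample_qr() for _ in x]
    ys[i] = sample_qnr()             # the only (hidden) non-residue marks i
    m = server_reply(x, ys)
    return 1 if is_qr(m) else 0      # QR(m) = QR(w_i), which encodes x[i]

x = [1, 0, 0, 1, 1, 0]
assert all(retrieve(x, i) == x[i] for i in range(len(x)))
```

Note the decoding: if x[i] = 0, then wi = yi is a non-residue and so is m; if x[i] = 1, then wi = yi^2 is a residue and so is m.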

5.3. Practical applications of SMC

While the theoretical studies of SMC have advanced significantly in recent years, the development of practical applications using SMC has been slow. The data mining community was the first to bring SMC into practical use. The goal is to compute aggregate statistics over private data stored in distributed databases. Using the OT protocol as the core, different SMC protocols have been developed to construct linear algebra routines [27], median computation [13], decision trees [17], neural networks [19], and others. Even though these algorithms provide innovative implementations of many data mining schemes, their security relies on modular arithmetic operations on very large integers, which are computationally intensive. In a recent study on PIR, the authors of [28] showed that even with the most advanced CPUs, the modular arithmetic in the SMC protocol requires more time than simply sending the entire database through a typical broadband connection.

[Figure 2 here: a plot of the original signal together with P1's and P2's least-square estimates; horizontal axis 0 to 60 (sample index), vertical axis −150 to 250 (amplitude).]

Figure 2: Original signal and least-square estimates in secure inner product.

While an algorithm in a typical data mining application may need to handle millions of records on a daily basis, a real-time signal processing algorithm needs to handle millions of samples within milliseconds. Very efficient algorithms have recently been developed at the expense of privacy. The pioneering work by Avidan and Butman showed the feasibility of building a secure distributed face detector [20]. While keeping OT as the core, they provide an efficient implementation based on the assumption that certain visual features used in the detector are noninvertible and therefore do not leak important information about the images.

Another noteworthy scheme is a collection of statistical routines, developed in [18], that use linear subspace projection for privacy protection. We illustrate the idea with a simple inner product computation. Assume that two parties, P1 and P2, hold n-dimensional vectors x1 and x2, respectively. They both know an invertible matrix M and its inverse M^(−1). M is broken down into a top half T ∈ R^(⌈n/2⌉×n) and a bottom half B ∈ R^((n−⌈n/2⌉)×n), while M^(−1) is broken down into a left half L ∈ R^(n×⌈n/2⌉) and a right half R ∈ R^(n×(n−⌈n/2⌉)). The inner product x1^T x2 can then be decomposed as follows:

x1^T x2 = x1^T M^(−1) M x2 = x1^T L T x2 + x1^T R B x2.        (21)

P1 then sends x1^T R to P2, who computes x1^T R B x2, while P2 sends T x2 to P1 so that she can compute x1^T L T x2. P2 can then send his scalar to P1, or vice versa, to obtain the final answer. Neither party can recover the other's data, as the transmitted quantities x1^T R and T x2 are only (roughly) n/2-dimensional vectors. Using a randomly generated M and x1 = x2, Figure 2 shows the least-square estimates obtained by both parties from the received data. Following a similar approach, we have also developed secure two-party routines for linear filtering [21] and thresholding


[22]. Even though all of the above algorithms are computationally very efficient, they all leak private information to a certain degree and thus may not be suitable for applications that demand the utmost privacy and security.
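The decomposition in (21) can be sketched as a worked example in pure Python. One simplifying assumption here: we draw M orthogonal (a product of random Givens rotations) so that M^(−1) is simply M^T, whereas the scheme only requires M to be invertible.

```python
import math, random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def random_orthogonal(n):
    # Product of random Givens rotations; orthogonal, so M^{-1} = M^T.
    M = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(3 * n):
        i, j = random.sample(range(n), 2)
        t = random.uniform(0, 2 * math.pi)
        G = [[float(a == b) for b in range(n)] for a in range(n)]
        G[i][i] = G[j][j] = math.cos(t)
        G[i][j], G[j][i] = math.sin(t), -math.sin(t)
        M = matmul(G, M)
    return M

n = 6
x1 = [random.uniform(-1, 1) for _ in range(n)]   # P1's private vector
x2 = [random.uniform(-1, 1) for _ in range(n)]   # P2's private vector
M = random_orthogonal(n)
Minv = transpose(M)
h = (n + 1) // 2
T, B = M[:h], M[h:]                    # top/bottom halves of M
L = [row[:h] for row in Minv]          # left half of M^{-1}
R = [row[h:] for row in Minv]          # right half of M^{-1}

x1R = matvec(transpose(R), x1)         # P1 -> P2: x1^T R, an (n-h)-vector
Tx2 = matvec(T, x2)                    # P2 -> P1: T x2, an h-vector
s2 = sum(a * b for a, b in zip(x1R, matvec(B, x2)))              # P2: x1^T R B x2
s1 = sum(a * b for a, b in zip(matvec(transpose(L), x1), Tx2))   # P1: x1^T L T x2
assert abs((s1 + s2) - sum(a * b for a, b in zip(x1, x2))) < 1e-6
```

As the paper notes, each party sees only an (approximately) n/2-dimensional projection of the other's vector, which is why the protocol is efficient but leaks some information.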

6. CONCLUSIONS

In this article, we have briefly reviewed the foundations of SMC protocols and some of the latest developments. As we do not assume any background in cryptography, we focus on intuition rather than a rigorous treatment of the subject. Serious readers should consult the comprehensive text [8] and the collections of papers at specialized bibliography sites [29, 30]. As the demand for secure and privacy-enhancing applications is growing rapidly, we believe there is a great opportunity for researchers in diverse areas outside of cryptography to understand the concepts of SMC and to develop practical SMC protocols for their respective applications.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their constructive comments.

REFERENCES

[1] Trusted Computing Group, “TCG Specification Architecture Overview,” April 2004, https://www.trustedcomputinggroup.org.

[2] R. Anderson, “Trusted Computing Frequently Asked Questions,” August 2003, http://www.cl.cam.ac.uk/∼rja14/tcpa-faq.html.

[3] A. C. Yao, “Protocols for secure computations,” in Proceedings of the 23rd Annual IEEE Symposium on Foundations of Computer Science, pp. 160–164, Chicago, Ill, USA, November 1982.

[4] A. Shamir, “How to share a secret,” Communications of the ACM, vol. 22, no. 11, pp. 612–613, 1979.

[5] M. Ben-Or, S. Goldwasser, and A. Wigderson, “Completeness theorems for non-cryptographic fault-tolerant distributed computation,” in Proceedings of the 20th ACM Symposium on the Theory of Computing, pp. 1–10, Chicago, Ill, USA, May 1988.

[6] T. Rabin and M. Ben-Or, “Verifiable secret sharing and multiparty protocols with honest majority,” in Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pp. 73–85, Seattle, Wash, USA, May 1989.

[7] S. Goldwasser and M. Bellare, Lecture Notes on Cryptography, Massachusetts Institute of Technology, Cambridge, Mass, USA, 2001.

[8] O. Goldreich, Foundations of Cryptography: Volume II, Basic Applications, Cambridge University Press, Cambridge, UK, 2004.

[9] M. Naor and B. Pinkas, “Oblivious transfer and polynomial evaluation,” in Proceedings of the Annual ACM Symposium on Theory of Computing, pp. 245–254, Atlanta, Ga, USA, 1999.

[10] M. Naor and B. Pinkas, “Efficient oblivious transfer protocols,” in Proceedings of the SIAM Symposium on Discrete Algorithms (SODA ’01), pp. 448–457, Washington, DC, USA, 2001.

[11] C. Cachin, J. Camenisch, J. Kilian, and J. Muller, “One-round secure computation and secure autonomous mobile agents,” in Proceedings of the 27th International Colloquium on Automata, Languages and Programming, pp. 512–523, Geneva, Switzerland, July 2000.

[12] M. Naor and K. Nissim, “Communication complexity and secure function evaluation,” Electronic Colloquium on Computational Complexity, vol. 8, no. 62, 2001.

[13] G. Aggarwal, N. Mishra, and B. Pinkas, “Secure computation of the kth-ranked element,” in Proceedings of Advances in Cryptology: International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT ’04), vol. 3027 of Lecture Notes in Computer Science, pp. 40–55, 2004.

[14] D. Boneh, E.-J. Goh, and K. Nissim, “Evaluating 2-DNF formulas on ciphertexts,” in Proceedings of the Theory of Cryptography Conference 2005, vol. 3378 of Lecture Notes in Computer Science, pp. 325–341, Cambridge, Mass, USA, February 2005.

[15] W. Gasarch, “A survey on private information retrieval,” The Bulletin of the EATCS, vol. 82, pp. 72–107, 2004.

[16] L. Trevisan, “Some applications of coding theory in computational complexity,” Quaderni di Matematica, vol. 13, pp. 347–424, 2004.

[17] Y. Lindell and B. Pinkas, “Privacy preserving data mining,” Journal of Cryptology, vol. 15, no. 3, pp. 177–206, 2003.

[18] W. Du, Y. S. Han, and S. Chen, “Privacy-preserving multivariate statistical analysis: linear regression and classification,” in Proceedings of the 4th SIAM International Conference on Data Mining, pp. 222–233, Lake Buena Vista, Fla, USA, April 2004.

[19] Y.-C. Chang and C.-J. Lu, “Oblivious polynomial evaluation and oblivious neural learning,” Theoretical Computer Science, vol. 341, no. 1–3, pp. 39–54, 2005.

[20] S. Avidan and M. Butman, “Blind vision,” in Proceedings of the 9th European Conference on Computer Vision, vol. 3953 of Lecture Notes in Computer Science, pp. 1–13, Graz, Austria, May 2006.

[21] N. Hu and S.-C. Cheung, “Secure image filtering,” in Proceedings of the IEEE International Conference on Image Processing (ICIP ’06), Atlanta, Ga, USA, October 2006.

[22] N. Hu and S.-C. Cheung, “A new security model for secure thresholding,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’07), Honolulu, Hawaii, USA, April 2007.

[23] C. Cachin, S. Micali, and M. Stadler, “Computationally private information retrieval with polylogarithmic communication,” in Proceedings of Advances in Cryptology: International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT ’99), vol. 1592 of Lecture Notes in Computer Science, pp. 402–414, 1999.

[24] B. Chor, O. Goldreich, E. Kushilevitz, and M. Sudan, “Private information retrieval,” in Proceedings of the Annual Symposium on Foundations of Computer Science, pp. 41–50, October 1995.

[25] S. Yekhanin, “New locally decodable codes and private information retrieval schemes,” Tech. Rep. 127, Electronic Colloquium on Computational Complexity, 2006.

[26] E. Kushilevitz and R. Ostrovsky, “Replication is not needed: single database, computationally-private information retrieval,” in Proceedings of the Annual Symposium on Foundations of Computer Science, pp. 364–373, Miami Beach, Fla, USA, 1997.

[27] R. Cramer and I. Damgaard, “Secure distributed linear algebra in a constant number of rounds,” in Proceedings of the 21st Annual International Cryptology Conference (CRYPTO ’01), vol. 2139 of Lecture Notes in Computer Science, pp. 119–136, Santa Barbara, Calif, USA, August 2001.

[28] R. Sion and B. Carbunar, “On the computational practicality of private information retrieval,” in Proceedings of the 14th ISOC Network and Distributed Systems Security Symposium, San Diego, Calif, USA, February-March 2007.


[29] H. Lipmaa, “Oblivious Transfer or Private Information Retrieval,” University College London, http://www.adastral.ucl.ac.uk/∼helger/crypto/link/protocols/oblivious.php.

[30] K. Liu, “Privacy Preserving Data Mining Bibliography,” University of Maryland, Baltimore County, http://www.csee.umbc.edu/∼kunliu1/research/privacy review.html.


Hindawi Publishing Corporation
EURASIP Journal on Information Security
Volume 2007, Article ID 78943, 20 pages
doi:10.1155/2007/78943

Review Article
Protection and Retrieval of Encrypted Multimedia Content: When Cryptography Meets Signal Processing

Zekeriya Erkin,1 Alessandro Piva,2 Stefan Katzenbeisser,3 R. L. Lagendijk,1 Jamshid Shokrollahi,4 Gregory Neven,5 and Mauro Barni6

1 Electrical Engineering, Mathematics, and Computer Science Faculty, Delft University of Technology, 2628 CD, Delft, The Netherlands
2 Department of Electronics and Telecommunication, University of Florence, 50139 Florence, Italy
3 Information and System Security Group, Philips Research Europe, 5656 AE, Eindhoven, The Netherlands
4 Department of Electrical Engineering and Information Sciences, Ruhr-University Bochum, 44780 Bochum, Germany
5 Department of Electrical Engineering, Katholieke Universiteit Leuven, 3001 Leuven, Belgium
6 Department of Information Engineering, University of Siena, 53100 Siena, Italy

Correspondence should be addressed to Zekeriya Erkin, [email protected]

Received 3 October 2007; Revised 19 December 2007; Accepted 30 December 2007

Recommended by Fernando Perez-Gonzalez

The processing and encryption of multimedia content are generally considered sequential and independent operations. In certain multimedia content processing scenarios, it is, however, desirable to carry out processing directly on encrypted signals. The field of secure signal processing poses significant challenges for both signal processing and cryptography research; only a few ready-to-go, fully integrated solutions are available. This study first concisely summarizes cryptographic primitives used in existing solutions for processing of encrypted signals, and discusses the implications of the security requirements on these solutions. The study then continues to describe two domains in which secure signal processing has been taken up as a challenge, namely, analysis and retrieval of multimedia content, as well as multimedia content protection. In each domain, state-of-the-art algorithms are described. Finally, the study discusses the challenges and open issues in the field of secure signal processing.

Copyright © 2007 Zekeriya Erkin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

In the past few years, the processing of encrypted signals has emerged as a new and challenging research field. The combination of cryptographic techniques and signal processing is not new. So far, encryption was always considered as an add-on after signal manipulations had taken place (see Figure 1). For instance, when encrypting compressed multimedia signals such as audio, images, and video, the multimedia signals were first compressed using state-of-the-art compression techniques, and next the compressed bit stream was encrypted using a symmetric cryptosystem. Consequently, the bit stream must be decrypted before the multimedia signal can be decompressed. An example of this approach is JPSEC, the extension of the JPEG2000 image compression standard. This standard adds selective encryption to JPEG2000 bit streams in order to provide secure scalable streaming and secure transcoding [1].

In several application scenarios, however, it is desirable to carry out signal processing operations directly on encrypted signals. Such an approach is called secure signal processing, encrypted signal processing, or signal processing in the encrypted domain. For instance, given an encrypted image, can we calculate the mean value of the encrypted image pixels? On the one hand, the relevance of carrying out such signal manipulations, that is, the algorithm, directly on encrypted signals depends entirely on the security requirements of the application scenario under consideration. On the other hand, the particular implementation of the signal processing algorithm will be determined strongly by the possibilities and impossibilities of the cryptosystem employed. Finally, it is very likely that new requirements for cryptosystems will emerge from secure signal processing operations and applications. Hence, secure signal processing poses a joint challenge for both the signal processing and the cryptographic community.


[Figure 1 here: block diagram x(n) → Process (compress) → Encrypt → Channel → Decrypt → Process (decompress) → x(n).]

Figure 1: Separate processing and encryption of signals.

The security requirements of signal processing in encrypted domains depend strongly on the application considered. In this survey paper, we take an application-oriented view on secure signal processing and give an overview of published applications in which the secure processing of signal amplitudes plays an important role. In each application, we show how signal processing algorithms and cryptosystems are brought together. It is not the purpose of the paper to describe either the signal processing algorithms or the cryptosystems in great detail, but rather to focus on possibilities, impossibilities, and open issues in combining the two. The paper includes many references to literature that contains more elaborate signal processing algorithms and cryptosystem solutions for the given application scenarios. It is also crucial to state that the scenarios in this survey could be implemented more efficiently by using trusted third parties. However, it is not always easy to find trusted parties with high computational power, and even if one is found, it is not certain that it is applicable in these scenarios. Therefore, trusted parties either do not exist or play only a small role in the scenarios discussed in this paper.

In this paper, we survey applications that directly manipulate encrypted signals. When scanning the literature on secure signal processing, it becomes immediately clear that current secure signal processing applications and research can be roughly classified into two categories, namely, content retrieval and content protection. Although the security objectives of these application categories differ quite strongly, similar signal processing considerations and cryptographic approaches show up. The common cryptographic primitives are addressed in Section 2. This section also discusses the need for clearly identifying the security requirements of the signal processing operations in a given scenario. As we will see, many of the approaches to secure signal processing are based on homomorphic encryption, zero-knowledge proof protocols, commitment schemes, and multiparty computation. We will also show, towards the end of Section 2, that there is ample room for alternative approaches to secure signal processing. Section 3 surveys secure signal processing approaches that can be classified as “content retrieval,” among them secure clustering and recommendation problems. Section 4 discusses problems of content protection, such as secure watermark embedding and detection. Finally, Section 5 concludes this survey paper on secure protection and retrieval of encrypted multimedia content.

2. ENCRYPTION MEETS SIGNAL PROCESSING

2.1. Introduction

The capability to manipulate signals in their encrypted form rests largely on two assumptions about the encryption strategies used in all applications discussed. In the first place,

encryption is carried out independently on individual signal samples. As a consequence, individual signal samples can be identified in the encrypted version of the signal, allowing for processing of encrypted signals on a sample-by-sample basis. If we represent a one-dimensional (e.g., audio) signal X that consists of M samples as

X = [x1, x2, x3, . . . , xM−1, xM]^T,        (1)

where xi is the amplitude of the ith signal sample, then the encrypted version of X using key k is given as

Ek(X) = [Ek(x1), Ek(x2), Ek(x3), . . . , Ek(xM−1), Ek(xM)]^T.        (2)

Here the superscript “T” refers to vector transposition. Note that no explicit measures are taken to hide the temporal or spatial structure of the signal; however, the use of sophisticated encryption schemes that are semantically secure (such as the one in [2]) achieves this property automatically.

Secondly, only public key cryptosystems are used that have particular homomorphic properties. The homomorphic property that these public key cryptosystems provide will be concisely discussed in Section 2.2.1. In simple terms, the homomorphic property allows for carrying out additions or multiplications on signal amplitudes in the encrypted domain. Public key systems are based on the intractability of some computationally complex problems, such as

(i) the discrete logarithm in a finite field with a large (prime) number of elements (e.g., the ElGamal cryptosystem [3]);

(ii) factoring large composite numbers (e.g., the RSA cryptosystem [4]);

(iii) deciding whether a number is an nth power in ZN for large enough composite N (e.g., the Paillier cryptosystem [2]).

It is important to realize that public key cryptographic systems operate on very large algebraic structures. This means that signal amplitudes xi that were originally represented in 8 to 16 bits will require at least 512 or 1024 bits per signal sample in their encrypted form Ek(xi). This data expansion is usually not emphasized in the literature, but it may be an important hurdle for the practical applicability of secure signal processing solutions. In some cases, however, several signal samples can be packed into one encrypted value in order to reduce the size of the whole encrypted signal by a linear factor [5].
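The packing idea can be illustrated with plain bit-shifting. This is a generic sketch, not the specific scheme of [5]; it assumes the message space is large enough and that no per-sample computation overflows the 8-bit slots.

```python
# Pack several 8-bit samples into one plaintext integer so that a single
# (expensive) encryption covers many samples at once.
def pack(samples, bits=8):
    m = 0
    for s in reversed(samples):
        m = (m << bits) | s
    return m

def unpack(m, count, bits=8):
    mask = (1 << bits) - 1
    return [(m >> (bits * j)) & mask for j in range(count)]

samples = [17, 255, 0, 42]
assert unpack(pack(samples), len(samples)) == samples
```

With an additively homomorphic scheme, adding two packed ciphertexts adds the signals slot-wise, provided each slot leaves enough headroom for the carries.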

A characteristic of signal amplitudes xi is that they are usually within a limited range of values, due to the 8-to-16-bit amplitude representation format of sampled signals. If a deterministic encryption scheme were used, each signal amplitude would always give rise to the same encrypted value, making it easy for an adversary to infer information


Zekeriya Erkin et al. 3

Table 1: Some (probabilistic) encryption systems and their homomorphisms.

Encryption system                           f1(·, ·)          f2(·, ·)
Multiplicatively homomorphic ElGamal [3]    Multiplication    Multiplication
Additively homomorphic ElGamal [13]         Addition          Multiplication
Goldwasser-Micali [14]                      XOR               Multiplication
Benaloh [15]                                Addition          Multiplication
Naccache-Stern [16]                         Addition          Multiplication
Okamoto-Uchiyama [17]                       Addition          Multiplication
Paillier [2]                                Addition          Multiplication
Damgard-Jurik [18]                          Addition          Multiplication

about the signal. Consequently, probabilistic encryption has to be used, where each encryption uses a randomization or blinding factor, such that even if two signal samples xi and xj have the same amplitude, their encrypted values Epk[xi] and Epk[xj] will be different. Here, pk refers to the public key used when encrypting the signal amplitudes. Public key cryptosystems are constructed such that decryption uses only the private key sk, and decryption does not need the value of the randomization factor used in the encryption phase. All encryption schemes that achieve the desired strong notion of semantic security are necessarily probabilistic.

Cryptosystems operate on (positive) integer values in finite algebraic structures. Although sampled signal amplitudes are normally represented as 8-to-16-bit (integer) values when they are stored, played, or displayed, intermediate signal processing operations often involve noninteger signal amplitudes. Work-arounds for noninteger signal amplitudes may involve scaling signal amplitudes by constant factors (say, factors of 10 to 1000), but the unavoidable successive operations of rounding (quantization) and normalization by division pose significant challenges when they have to be carried out on encrypted signal amplitudes.
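The scaling work-around and its limits can be sketched in a toy example (the scale factor is an arbitrary illustrative choice, not one prescribed by any particular scheme):

```python
# Fixed-point work-around: scale real amplitudes by a constant factor and
# round to integers before encryption.
SCALE = 1000

def quantize(a):
    return round(a * SCALE)

x, y = 0.123, 2.5
qx, qy = quantize(x), quantize(y)

# Additions keep the scale factor consistent ...
assert qx + qy == quantize(x + y)
# ... but a product of two scaled values carries SCALE**2, and the rescaling
# division (with its rounding) cannot be evaluated on encrypted amplitudes
# without an extra interactive protocol step.
assert qx * qy == round(x * y * SCALE ** 2)
```

This is exactly the normalization problem mentioned above: after each multiplication the scale factor grows, and dividing it back out requires either decryption or a dedicated secure protocol.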

In Section 2.2, we first discuss four important cryptographic primitives that are used in many secure signal processing applications, namely, homomorphic encryption, zero-knowledge proof protocols, commitment schemes, and secure multiparty computation. In Section 2.3, we then consider the importance of scrutinizing the security requirements of the signal processing application. It is meaningless to speak about secure signal processing in a particular application if the security requirements are not specified. The security requirements as such will also determine the possibility or impossibility of applying the cryptographic primitives. As we will illustrate by examples (and also in more detail in the following sections), some application scenarios simply cannot be made secure, because of the inherent information leakage of the signal processing operation, because of the limitations of the cryptographic primitives to be used, or because of constraints on the number of interactions between the parties involved. Finally, in Section 2.4, we briefly discuss the combination of signal encryption and compression using an approach quite different from the ones discussed in Sections 3 and 4, namely, by exploiting the concept of coding with side information. We discuss this approach here to emphasize that although many of the currently existing application scenarios are built on the four cryptographic primitives discussed in Section 2.2, there is ample room for entirely different approaches to secure signal processing.

2.2. Cryptographic primitives

2.2.1. Homomorphic cryptosystems

Many signal processing operations are linear in nature. Linearity implies that multiplying and adding signal amplitudes are important operations. At the heart of many signal processing operations, such as linear filters and correlation evaluations, is the calculation of the inner product between two signals X and Y. If both signals (or segments of the signals) contain M samples, then the inner product is defined as

〈X, Y〉 = X^T Y = [x1, x2, . . . , xM] · [y1, y2, . . . , yM]^T = Σ_{i=1}^{M} xi yi.        (3)

This operation can be carried out directly on an encrypted signal X and a plain text signal Y if the encryption system used has the additive homomorphic property, as we discuss next.

Formally, a “public key” encryption system Epk(·) and its decryption Dsk(·) are homomorphic if those two functions are maps between the message group, with an operation f1(·, ·), and the encrypted group, with an operation f2(·, ·), such that, if x and y are taken from the message space of the encryption scheme, we have

f1(x, y) = Dsk(f2(Epk(x), Epk(y))).        (4)

For secure signal processing, multiplicative and additive homomorphisms are important. Table 1 gives an overview of encryption systems with an additive or multiplicative homomorphism. Note that these homomorphic operations apply in a modular domain (i.e., either in a finite field or in a ring ZN); thus, both addition and multiplication are taken modulo some fixed value. For signal processing applications, which usually require integer addition and multiplication, it is thus essential to choose the message space of the encryption scheme large enough so that overflows due to modular arithmetic are avoided when operations on encrypted data are performed.


Another important consideration is the representation of the individual signal samples. As encryption schemes usually operate in finite modular domains (and all messages to be encrypted must be represented in this domain), a mapping is required which quantizes real-valued signal amplitudes and translates the signal samples of X into a vector of modular numbers. In addition to the requirement that the computations must not overflow, special care must be taken to represent negative samples in a way which is compatible with the homomorphic operation offered by the cryptosystem. For the latter problem, depending on the algebraic structure of the cipher, one may either encode the negative value −x by the inverse element x^(−1) in the underlying algebra of the message space, or avoid negative numbers entirely by using a constant additive shift.
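As a small illustration of the inverse-element option for an additive message group, the following sketch (with an arbitrarily chosen modulus N; it assumes all magnitudes stay below N/2) maps a negative value −x to its additive inverse N − x and decodes results back to signed integers:

```python
# Hypothetical signed-value encoding for a modular message space: -x is
# represented by its additive inverse N - x (assumes |values| < N / 2).
N = 1_000_003  # illustrative modulus, not a real cryptosystem parameter

def encode(v):
    return v % N

def decode(m):
    return m - N if m > N // 2 else m

a, b = encode(-7), encode(30)
# Additions happen mod N; decoding recovers the signed result.
assert decode((a + b) % N) == 23
assert decode(encode(-450)) == -450
```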

In the context of the above inner product example, we require an additively homomorphic scheme (see Table 1). Hence, f1 is an addition, and f2 is a multiplication:

x + y = Dsk(Epk(x) · Epk(y)), (5)

or, equivalently,

Epk(x + y) = Epk(x) · Epk(y). (6)

Note that the latter equation also implies that

Epk(c · x) = (Epk(x))^c (7)

for every integer constant c. Thus, every additively homomorphic cryptosystem also allows one to multiply an encrypted value with a constant available or known as clear text.

The Paillier cryptosystem [2] provides the required homomorphism if both addition and multiplication are considered as modular. The encryption of a message m under the Paillier cryptosystem is defined as

Epk(m) = g^m r^N mod N^2, (8)

where N = pq, p and q are large prime numbers, g ∈ Z*_{N^2} is a generator whose order is a multiple of N, and r ∈ Z*_N is a random number (blinding factor). We then easily see that

Epk(x) Epk(y) = (g^x r_x^N)(g^y r_y^N) mod N^2 = g^{x+y} (r_x r_y)^N mod N^2 = Epk(x + y). (9)

Applying the additive homomorphic property of the Paillier encryption system, we can evaluate (3) under the assumption that X is an encrypted signal and Y is a plain text signal:

Epk(⟨X, Y⟩) = Epk(Σ_{i=1}^{M} xi yi) = ∏_{i=1}^{M} Epk(xi yi) = ∏_{i=1}^{M} Epk(xi)^{yi}. (10)

Here, we implicitly assume that xi, yi are represented as integers in the message space of the Paillier cryptosystem, that is, xi, yi ∈ ZN. However, (10) essentially shows that it is possible to compute an inner product directly in case one of the two vectors is encrypted. One takes the encrypted samples Epk(xi), raises them to the power of yi, and multiplies all obtained values. Obviously, the resulting number itself is also in encrypted form. To carry out further useful signal processing operations on the encrypted result, for instance, to compare it to a threshold, another cryptographic primitive is needed, namely, zero-knowledge proof protocols, which are discussed in the next section.
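To make the mechanics concrete, here is a toy Paillier implementation (with tiny primes chosen only for illustration; a real deployment uses moduli of 2048 bits or more) that evaluates both the additive homomorphism of (6) and the encrypted inner product of (10):

```python
import math, random

def keygen(p, q):
    # Toy Paillier key generation; p, q are small primes for illustration only.
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)   # valid since with g = n + 1, L(g^lam mod n^2) = lam mod n
    return (n, n + 1), (lam, mu)

def encrypt(pk, m):
    n, g = pk
    r = random.randrange(1, n)      # blinding factor, must be coprime to n
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(g, m, n * n) * pow(r, n, n * n) % (n * n)

def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

pk, sk = keygen(1019, 1021)
n2 = pk[0] ** 2

# Additive homomorphism, eq. (6): multiplying ciphertexts adds plaintexts.
assert decrypt(pk, sk, encrypt(pk, 42) * encrypt(pk, 58) % n2) == 100

# Encrypted inner product, eq. (10): Alice sends Epk(xi); Bob raises each
# ciphertext to yi and multiplies, obtaining Epk(<X, Y>) without learning X.
x = [3, 1, 4, 1]   # Alice's signal samples (to be encrypted)
y = [2, 7, 1, 8]   # Bob's plaintext weights
cx = [encrypt(pk, xi) for xi in x]
c_ip = 1
for ci, yi in zip(cx, y):
    c_ip = c_ip * pow(ci, yi, n2) % n2
assert decrypt(pk, sk, c_ip) == sum(xi * yi for xi, yi in zip(x, y))   # 25
```

The choice g = N + 1 is a standard simplification that makes decryption a single modular division; it is not required by (8) but satisfies its condition on the order of g.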

In this paper, we focus mainly on public-key encryption schemes, as almost all homomorphic encryption schemes belong to this family. The notable exception is the one-time pad (and derived stream ciphers), where messages taken from a finite group are blinded by a sequence of uniformly random group elements. Despite its computationally efficient encryption and decryption processes, the application of a one-time pad usually raises serious problems with regard to key distribution and management. Nevertheless, it may be used to temporarily blind intermediate values in larger communication protocols. Finally, it should be noted that some recent work in cryptography (like searchable encryption [6] and order-preserving encryption [7]) may also yield alternative ways for the encryption of signal samples. However, these approaches have not yet been studied in the context of media encryption.

To conclude this section, we observe that directly computing the inner product of two encrypted signals is not possible, since this would require a cryptographic system that has both multiplicative and additive (i.e., algebraic) homomorphism. Recent proposals in that direction like [8, 9] were later proven to be insecure [10, 11]. Therefore, no provably secure cryptographic system with these properties is known to date. The construction of an algebraic privacy homomorphism remains an open problem. Readers can refer to [12] for more details on homomorphic cryptosystems.

2.2.2. Zero-knowledge proof protocols

Zero-knowledge protocols are used to prove a certain statement or condition to a verifier, without revealing any “knowledge” to the verifier except the fact that the assertion is valid [19]. As a simple example, consider the case where the prover Peggy claims to have a way of factorizing large numbers. The verifier Victor will send her a large number and Peggy will send back the factors. Successful factorization of several large integers will decrease Victor’s doubt in the truth of Peggy’s claim. At the same time, Victor will learn no knowledge of the actual factorization method.

Although simple, the example shows an important property of zero-knowledge proof protocols, namely, that they are interactive in nature. The interaction should be such that with an increasing number of “rounds,” the probability that an adversary successfully proves an invalid claim decreases significantly. On the other hand, noninteractive protocols (based on the random oracle model) also do exist. A formal definition of interactive and noninteractive proof systems, such as zero-knowledge protocols, falls outside the scope of this paper, but can be found, for instance, in [19].

As an example of a commonly used zero-knowledge proof, consider the proof of knowing the discrete logarithm x of an element y to the base g in a finite field [20]. Having knowledge of the discrete logarithm x is of interest in some applications since if

y = g^x mod p, (11)

then, given p (a large prime number), g, and y, the calculation of the logarithm x is computationally infeasible. If Peggy (the prover) claims she knows the answer (i.e., the value of x), she can convince Victor (the verifier) of this knowledge without revealing the value of x by the following zero-knowledge protocol. Peggy picks a random number r ∈ Zp and computes t = g^r mod p. She then sends t to Victor. He picks a random challenge c ∈ Zp and sends this to Peggy. She computes s = r − cx mod p and sends this to Victor. He accepts Peggy’s knowledge of x if g^s y^c = t, since if Peggy indeed used the correct logarithm x in calculating the value of s, we have

g^s y^c mod p = g^{r−cx} (g^x)^c mod p = g^r mod p = t. (12)
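The exchange can be sketched in a few lines (toy parameters only; note that, strictly, the response s is reduced modulo p − 1, the order of the multiplicative group, so that the exponent arithmetic in the verification step is valid):

```python
import random

# Toy run of the interactive discrete-logarithm proof (Schnorr-style).
# Parameter sizes are illustrative, not secure.
p = 2027           # small prime, illustration only
g = 3
x = 1234           # Peggy's secret discrete logarithm
y = pow(g, x, p)   # public value, eq. (11)

r = random.randrange(1, p - 1)
t = pow(g, r, p)                 # commit: Peggy sends t = g^r mod p
c = random.randrange(1, p - 1)   # challenge: Victor sends random c
s = (r - c * x) % (p - 1)        # response: Peggy sends s = r - c*x mod (p - 1)

# Verification, eq. (12): g^s * y^c = g^(r - cx) * g^(cx) = g^r = t (mod p)
assert (pow(g, s, p) * pow(y, c, p)) % p == t
```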

In the literature, many different zero-knowledge proofs exist. We mention a number of them that are frequently used in secure signal processing:

(i) proof that an encrypted number is nonnegative [21];
(ii) proof that an encrypted number lies in a certain interval [22];
(iii) proof that the prover knows the plaintext x corresponding to the encryption E(x) [23];
(iv) proofs that committed values (see Section 2.2.3) satisfy certain algebraic relations [24].

In zero-knowledge protocols, it is sometimes necessary forthe prover to commit to a particular integer or bit value.Commitment schemes are discussed in the next section.

2.2.3. Commitment schemes

An integer or bit commitment scheme is a method that allows Alice to commit to a value while keeping it hidden from Bob, and while also preserving Alice’s ability to reveal the committed value later to Bob. A useful way to visualize a commitment scheme is to think of Alice as putting the value in a locked box and giving the box to Bob. The value in the box is hidden from Bob, who cannot open the lock (without the help of Alice); but since Bob has the box, the value inside cannot be changed by Alice; hence, Alice is “committed” to this value. At a later stage, Alice can “open” the box and reveal its content to Bob.

Commitment schemes can be built in a variety of ways. As an example, we review a well-known commitment scheme due to Pedersen [25]. We fix two large primes p and q such that q | (p − 1), and a generator g of the subgroup of order q of Z*_p. Furthermore, we set h = g^a mod p for some random secret a. The values p, q, g, and h are the public parameters of the commitment scheme. To commit to a value m, Alice chooses a random value r ∈ Zq and computes the commitment c = g^m h^r mod p. To open the commitment, Alice sends m and r to Bob, who verifies that the commitment c received previously indeed satisfies c = g^m h^r mod p. The scheme is hiding due to the random blinding factor r; furthermore, it is binding unless Alice is able to compute discrete logarithms.

For use in signal processing applications, commitment schemes that are additively homomorphic are of specific importance. As with homomorphic public key encryption schemes, knowledge of two commitments allows one to compute, without opening, a commitment of the sum of the two committed values. For example, the above-mentioned Pedersen commitment satisfies this property: given two commitments c1 = g^{m1} h^{r1} mod p and c2 = g^{m2} h^{r2} mod p of the numbers m1 and m2, a commitment c = g^{m1+m2} h^{r1+r2} mod p of m1 + m2 can be computed by multiplying the commitments: c = c1 c2 mod p. Note that the commitment c can be opened by providing the values m1 + m2 and r1 + r2. Again, the homomorphic property only supports additions. However, there are situations where it is not possible to prove a relation by mere additive homomorphism, as in proving that a committed value is the square of the value of another commitment. In such circumstances, zero-knowledge proofs can be used. In this case, the party which possesses the opening information of the commitments computes a commitment of the desired result, hands it to the other party, and proves in zero-knowledge that the commitment was actually computed in the correct manner. Among others, such zero-knowledge proofs exist for all polynomial relations between committed values [24].
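A minimal sketch of the Pedersen scheme and its additive homomorphism, with toy parameters (p = 2039 and q = 1019 satisfy q | (p − 1); g = 4 is a quadratic residue and hence generates the order-q subgroup; sizes are illustrative only):

```python
import random

# Toy Pedersen commitment over the order-q subgroup of Z_p*.
p, q = 2039, 1019
g = 4
a = random.randrange(2, q)   # discrete log of h; must remain unknown to Alice
h = pow(g, a, p)

def commit(m):
    r = random.randrange(1, q)                  # random blinding factor
    return pow(g, m, p) * pow(h, r, p) % p, r

def verify(c, m, r):
    return c == pow(g, m, p) * pow(h, r, p) % p

c1, r1 = commit(11)
c2, r2 = commit(31)
assert verify(c1, 11, r1)
# Additive homomorphism: the product of two commitments opens to the sum
# of the messages under the sum of the blinding factors.
assert verify(c1 * c2 % p, 11 + 31, r1 + r2)
```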

2.2.4. Secure multiparty computation

The goal of secure multiparty computation is to evaluate a public function f(x^(1), x^(2), . . . , x^(m)) based on the secret inputs x^(i), i = 1, 2, . . . , m, of m users, such that the users learn nothing except their own input and the final result. A simple example, called Yao’s millionaires’ problem, is the comparison of two (secret) numbers in order to determine if x^(1) > x^(2). In this case, the parties involved will only learn if their number is the largest, but nothing more than that.

There is a large body of literature on secure multiparty computation; for example, it is known [26] that any (computable) function can be evaluated securely in the multiparty setting by using a general circuit-based construction. However, the general constructions usually require a large number of interactive rounds and a huge communication complexity. For practical applications in the fields of distributed voting, private bidding and auctions, and private information retrieval, dedicated lightweight multiparty protocols have been developed. An example relevant to signal processing applications is the multiparty computation known as Bitrep, which finds the encryption of each bit in the binary representation of a number whose encryption under an additively homomorphic cryptosystem is given [27]. We refer the reader to [28] for an extensive summary of secure multiparty computation and to [29] for a brief introduction.

2.3. Importance of security requirements

Although the cryptographic primitives that we discussed in the previous section are useful for building secure signal processing solutions, it is important to realize that in each application the security requirements have to be made explicit right from the start. Without wishing to turn to formal definitions, we choose to motivate the importance of what to expect from secure signal processing with three simple yet illustrative two-party computation examples.

The first simple example is the encryption of a (say, audio) signal X that contains M samples. Due to the sample-by-sample encryption strategy as shown in (2), the encrypted signal Epk(X) will also contain M encrypted values. Hence, the size M of the plain text signal cannot be hidden by the approaches followed in secure signal processing surveyed in this paper.

In the second example, we consider the linear filtering of the signal X. In an (FIR) linear filter, the relation between the input signal amplitudes X and output signal amplitudes Y is entirely determined by the impulse response (h0, h1, . . . , hr) through the following convolution equation:

yi = h0 xi + h1 x_{i−1} + · · · + hr x_{i−r} = Σ_{k=0}^{r} hk x_{i−k}. (13)

Let us assume that we wish to compute this convolution in a secure way. The first party, Alice, has the signal X and the second party, Bob, has the impulse response (h0, h1, . . . , hr). Alice wishes to carry out the convolution (13) using Bob’s linear filter. However, both Bob and Alice wish to keep their data secret, that is, the impulse response and the input signal, respectively. Three different setups can now be envisioned.

(1) Alice encrypts the signal X under an additive homomorphic cryptosystem and sends the encrypted signal to Bob. Bob then evaluates the convolution (13) on the encrypted signal as follows:

EpkA(yi) = EpkA(Σ_{k=0}^{r} hk x_{i−k}) = ∏_{k=0}^{r} EpkA(hk x_{i−k}) = ∏_{k=0}^{r} EpkA(x_{i−k})^{hk}. (14)

Notice that the additive homomorphic property is used in the above equation and that, indeed, individually encrypted signal samples should be available to Bob. Also notice that the above evaluation is only possible if both X and (h0, h1, . . . , hr) are integer-valued, which is actually quite unlikely in practice. After computing (14), Bob sends the result back to Alice, who decrypts the signal using her private key to obtain the result Y. In this setup, Bob does not learn the output signal Y.

(2) Bob encrypts his impulse response (h0, h1, . . . , hr) under a homomorphic cryptosystem and sends the result to Alice. Alice then evaluates the convolution (13) using the encrypted impulse response as follows:

EpkB(yi) = EpkB(Σ_{k=0}^{r} hk x_{i−k}) = ∏_{k=0}^{r} EpkB(hk x_{i−k}) = ∏_{k=0}^{r} EpkB(hk)^{x_{i−k}}. (15)

Alice then sends the result to Bob, who decrypts to obtain the output signal Y. In this solution, Bob learns the output signal Y.

(3) Alice and Bob engage in a formal multiparty protocol, where the function f(x1, x2, . . . , xM, h0, h1, . . . , hr) is the convolution equation, Alice holds the signal values xi and Bob the impulse response hi as secret inputs. Both parties will learn the resulting output signal Y.

Unfortunately, none of the above three solutions really provides a solution to the secure computation of a convolution, due to inherent algorithm properties. For instance, in the first setup, Alice could send Bob a signal that consists of all-zero values and a single “one” value (a so-called “impulse signal”). After decrypting the result EpkA(yi) that she obtains from Bob, it is easy to see that Y is equal to (h0, h1, . . . , hr); hence Bob’s impulse response is subsequently known to Alice. Similar attacks can be formulated for the other two cases. In fact, even for an arbitrary input, both parties can learn the other’s input by a well-known signal processing procedure known as “deconvolution.” In conclusion, although in some cases there may be a need for the secure evaluation of convolutions, the inherent properties of the algorithm make secure computing in a two-party scenario meaningless. (Nevertheless, the protocols have value if used as building blocks in a larger application where the output signal Y is not revealed to the attacker.)
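The impulse attack is easy to demonstrate in the clear; the helper below is a hypothetical direct implementation of the FIR convolution (13), used only to show that a unit impulse returns the filter's impulse response verbatim:

```python
# Impulse attack on the first setup: convolving Alice's crafted impulse
# signal with Bob's filter reveals (h_0, ..., h_r) in the decrypted output.
def fir_filter(x, h):
    # y_i = sum_k h_k * x_{i-k}, with x taken as zero outside its support
    return [sum(h[k] * x[i - k] for k in range(len(h)) if 0 <= i - k < len(x))
            for i in range(len(x) + len(h) - 1)]

h = [3, 1, 4, 1, 5]        # Bob's secret impulse response (illustrative)
impulse = [1, 0, 0, 0, 0]  # Alice's crafted "impulse signal"
assert fir_filter(impulse, h) == [3, 1, 4, 1, 5, 0, 0, 0, 0]
```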

The third and final example is to threshold a signal’s (weighted) mean value in a secure way. The (secure) mean value computation is equivalent to the (secure) computation of the inner product of (3), with X the input signal and Y the weights that define how the mean value is calculated. In the most simple case, we have yi = 1 for all i, but other definitions are quite common. Let us assume that Alice wishes Bob to determine if the signal’s mean value is “critical,” for instance, above a certain threshold value Tc, without revealing X to Bob. Bob, on the other hand, does not want to reveal his expert knowledge, namely, the weights Y and the threshold Tc. Two possible solutions to this secure decision problem are the following.

(i) Use secure multiparty computation, where the function f(x1, x2, . . . , xM, y1, y2, . . . , yM, Tc) is a combination of the inner product and threshold comparison. Both parties will only learn if the mean value is critical or not.

(ii) Alice sends Bob the signal X under additively homomorphic encryption. Bob securely evaluates the inner product using (10). After encrypting Tc using Alice’s public key, Bob computes the (encrypted version of the) difference between the computed mean and the threshold Tc. Bob sends the result to Alice, who decrypts it using her secret key and checks if the value is larger or smaller than zero.

Figure 2: Compression of an encrypted signal from [30]. (The block diagram shows a message source that is first encrypted under a key and then compressed; the result travels over a public channel observed by an eavesdropper, while the key is delivered over a secure channel; at the receiver, decompression and decryption are performed jointly to obtain the reconstructed source.)

Although the operations performed are similar to those of the second example, in this example the processing is secure since Bob learns little about Alice’s signal and Alice learns little about Bob’s expert knowledge. In fact, in the first implementation, the entire signal processing operation is ultimately condensed into a single bit of information; the second implementation leaks more information, namely, the distance of the correlation value from the threshold. In both cases, the result represents a high information abstraction level, which is insufficient for launching successful signal processing-based attacks. In contrast, in the example based on (13), the signal processing operation led to an enormous amount of information, the entire output signal, being available to either party, making signal processing-based attacks quite easy.

As we will see in Sections 3 and 4, many of the two-party secure signal processing problems eventually include an information condensation step, such as (in the most extreme case) a binary decision. We postulate that for two-party linear signal processing operations in which the amount of plain text information after processing is of the same order of magnitude as before processing, no secure solutions exist purely based on the cryptographic primitives discussed in the previous section, due to inherent properties of the signal processing problems and the related application scenario. For that reason, entirely different approaches to secure signal processing are also of interest. Although few results can be found in the literature on approaches not using homomorphic encryption, zero-knowledge proofs, and multiparty computation protocols, the approach discussed in the next section may well show a possible direction for future developments.

2.4. Compression of encrypted signals

When transmitting signals that contain redundancy over an insecure and bandwidth-constrained channel, it is customary to first compress and then encrypt the signal. Using the principles of coding with side information, it is, however, also possible to interchange the order of (lossless) compression and encryption, that is, to compress encrypted signals [30].

The concept of swapping the order of compression and encryption is illustrated in Figure 2. A signal from the message source is first encrypted and then compressed. The compressor does not have access to the secret key used in the encryption. At the decoder, decompression and decryption are performed jointly. From classical information theory, it would seem that only minimal gain could be obtained, as the encrypted signal has maximal entropy, that is, no redundancy is left after encryption. However, the decoder can use the cryptographic key to decode and decrypt the compressed and encrypted bit stream. This brings opportunities for efficient compression of encrypted signals based on the principle of coding with side information. In [30], it was shown that neither compression performance nor security needs to be negatively impacted under some reasonable conditions.

In source coding with side information, the signal X is coded under the assumption that the decoder, but not the encoder, has statistically dependent information Y, called the side information, available. In conventional coding scenarios, the encoder would code the difference signal X − Y in some efficient way, but in source coding with side information this is impossible, since we assume that Y is only known at the decoder. In the Slepian-Wolf coding theory [31], the crucial observation is that the side information Y is regarded as a degraded version of X. The degradations are modeled as “noise” on the “virtual channel” between X and Y. The signal X can then be recovered from Y by the decoder if sufficient error-correcting information is transmitted over the channel. The required bit rate and amount of entropy are related as R ≥ H(X | Y). This shows that, at least theoretically, there is no loss in compression efficiency, since the lower bound H(X | Y) is identical to that of the scenario in which Y is available at the encoder. Extensions of the Slepian-Wolf theory exist for lossy source coding [32]. In all practical cases of interest, the information bits that are transmitted over the channel are parity bits or syndromes of channel coding methods such as Hamming, Turbo, or LDPC codes.

In the scheme depicted in Figure 2, we have a scenario similar to the above source coding with side information case. If we consider the encrypted signal Ek(X) at the input of the encoder, then we see that the decoder has the key k available, representing the “statistically dependent side information.” Hence, according to the Slepian-Wolf viewpoint, the encrypted signal Ek(X) can be compressed to a rate that is the same as if the key k were available during the source encoding process, that is, R ≥ H(Ek(X) | k) = H(X). This clearly says that the (lossless) coding of the encrypted signal Ek(X) should be possible with the same efficiency as the (lossless) coding of X. Hence, using the side information key k, the decoder can first recover Ek(X) from the compressed channel bit stream and subsequently decode Ek(X) into X.

A simple implementation of the above concept for a binary signal X uses a pseudorandomly generated key. The key k is in this case a binary signal K of the same dimension M as the signal X. The encrypted signal is computed as follows:

Ek(X) = X ⊕ K, that is, Ek(xi) = xi ⊕ ki, i = 1, 2, . . . , M. (16)

The encrypted signal Ek(X) is now input to a channel coding strategy, for instance, a Hamming code. The strength of the Hamming code depends on the dependency between Ek(X) and the side information K at the decoder. This strength obviously depends solely on the properties of the original signal X. This does, however, require the message source to inform the source encoder about the entropy H(X), which represents a small leak of information. The encoder calculates parity check bits over binary vectors of some length L created by concatenating L bits of the encrypted signal Ek(X), and sends only these parity check bits to the receiver.

The decoder recovers the encrypted signal by first appending the parity check bits to K, and then error correcting the resulting bit pattern. The success of this error correction step depends on the strength of the Hamming code; but, as mentioned, this strength has been chosen sufficiently large with regard to the “errors” in K at the decoding side. Notice that in this particular setup the “errors” represent the bits of the original signal X. If the error correction step is successful, the decoder obtains Ek(X), from which the decryption can straightforwardly take place:

X = Ek(X) ⊕ K, that is, xi = Ek(xi) ⊕ ki, i = 1, 2, . . . , M. (17)
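The scheme just described can be sketched end to end using the syndrome of a (7,4) Hamming code, under the assumption that each 7-bit block of X contains at most one '1' (so that the "virtual channel noise" between Ek(X) and K is within the code's correction capability):

```python
import random

# Toy Slepian-Wolf-style compression of an XOR-encrypted binary block.
def syndrome(block):
    # XOR of the (1-based) positions that hold a 1; 3 bits per 7-bit block.
    s = 0
    for j, b in enumerate(block, start=1):
        if b:
            s ^= j
    return s

key = [random.randrange(2) for _ in range(7)]   # pseudorandom key K
x = [0, 0, 1, 0, 0, 0, 0]                       # sparse plaintext block X
enc = [xi ^ ki for xi, ki in zip(x, key)]       # Ek(X) = X xor K, eq. (16)

sent = syndrome(enc)   # encoder transmits 3 syndrome bits instead of 7 bits

# Decoder: K is the side information. The syndrome is linear over GF(2),
# so sent ^ syndrome(key) = syndrome(X), which for a weight-1 block X is
# exactly the position of its single '1' bit.
pos = sent ^ syndrome(key)
recovered_enc = key[:]
if pos:
    recovered_enc[pos - 1] ^= 1                 # correct K toward Ek(X)
recovered_x = [ei ^ ki for ei, ki in zip(recovered_enc, key)]   # eq. (17)
assert recovered_x == x
```

The parameters and block size are illustrative; a practical system would use longer codes matched to H(X), as discussed above.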

The above example is too simple for any practical scenario for a number of reasons. In the first place, it uses only binary data, for instance, bit planes. More efficient coding can be obtained if the dependencies between bit planes are considered. This effectively requires an extension of the bit plane coding and encryption approach to coding and encryption of symbol values. Secondly, the decoder lacks a model of the dependencies in X. Soft decoders for Turbo or LDPC codes can exploit such message source models, yielding improved performance. Finally, the coding strategy is lossless. For most continuous or multilevel message sources, such as audio, images, and video, lossy compression is desirable.

3. ANALYSIS AND RETRIEVAL OF CONTENT

In today’s society, huge quantities of personal data are gathered from people and stored in databases for various purposes, ranging from medical research to online personalized applications. Sometimes, providers of these services may want to combine their data for research purposes. A classical example is the one where two medical institutions wish to perform joint research on the union of their patients’ data. Privacy issues are important in this scenario because the institutions need to preserve their private data during their cooperation. Lindell and Pinkas [33] and Agrawal and Srikant [34] proposed the notion of privacy preserving data mining, meaning the possibility to perform data analysis on distributed databases under some privacy constraints. Privacy preserving data mining [35–38] deals with mutually untrusting parties that on the one hand wish to cooperate to achieve a common goal but, on the other hand, are not willing to disclose their knowledge to each other.

There are several solutions that cope with exact matching of data in a secure way. However, in signal processing it is more common to perform inexact matching, that is, learning the distance between two signal values, rather than exact matching. Consider two signal values x1 and x2. Computing the distance between them, or checking if the distance is within a threshold, is important:

|x1 − x2| < ε. (18)

This comparison or fuzzy matching can be used in a variety of ways in signal processing. One example is quantizing data, which is of crucial importance for multimedia compression schemes. However, considering that these signal values are encrypted, and thus the ordering between them is totally destroyed, no efficient way is known to fuzzily compare two values.

In the following sections, we give a summary of techniques that focus on extracting some information from protected datasets. The selected studies mostly use homomorphic encryption, zero-knowledge proofs, and, sometimes, multiparty computations. As we will see, most solutions still require substantial improvements in communication and computation efficiency in order to make them applicable in practice. Therefore, the last section addresses a different approach that uses other means of preserving privacy, to show that further research on combining signal processing and cryptography may result in new approaches rather than using encryption schemes and protocols.

3.1. Clustering

Clustering is a well-studied combinatorial problem in data mining [39]. It deals with finding a structure in a collection of unlabeled data. One of the basic algorithms of clustering is the K-means algorithm, which partitions a dataset into K clusters with a minimum error. We review the K-means algorithm and its necessary computations, such as distance computation and finding the cluster centroid, and show that cryptographic protocols can be used to provide users’ privacy in clustering for certain scenarios.


Figure 3: Clustered dataset. Each object is a point in the 2-dimensional space. The K-means clustering algorithm assigns each object to the cluster with the smallest distance.

(1) Select K random objects representing the K initial centroids of the clusters.
(2) Assign each object to the cluster with the nearest centroid.
(3) Recalculate the centroids for each cluster.
(4) Repeat steps 2 and 3 until the centroids do not change or a certain threshold is achieved.

Algorithm 1: The K-means clustering algorithm.

3.1.1. K-means clustering algorithm

The K-means clustering algorithm partitions a dataset D of “objects,” such as signal values or features thereof, into K disjoint subsets, called clusters. Each cluster is represented by its center, which is the centroid of all objects in that subset.

As shown in Algorithm 1, the K-means algorithm is an iterative procedure that refines the cluster centroids until a predefined condition is reached. The algorithm first chooses K random points as the cluster centroids in the dataset D and assigns the objects to the closest cluster centroid. Then, the cluster centroid is recomputed with the recently assigned objects. When the iterative procedure reaches the termination condition, each data object is assigned to the closest cluster (Figure 3). Thus, to carry out the K-means algorithm, the following quantities need to be computed:

(i) the cluster centroid, or the mean of the data objects in that cluster,
(ii) the distance between an object and the cluster centroid,
(iii) the termination condition, which is a distance measurement compared to a threshold.
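For reference, a plaintext version of Algorithm 1 over 1-D points can be sketched as follows (data and parameters are illustrative; the secure protocol described next performs these same steps on shared values):

```python
import random

# Minimal plaintext K-means over 1-D points, mirroring Algorithm 1.
def kmeans(points, k, iters=20):
    centroids = random.sample(points, k)   # step 1: random initial centroids
    for _ in range(iters):                 # step 4: iterate steps 2 and 3
        clusters = [[] for _ in range(k)]
        for pt in points:                  # step 2: nearest-centroid assignment
            j = min(range(k), key=lambda j: (pt - centroids[j]) ** 2)
            clusters[j].append(pt)
        centroids = [sum(c) / len(c) if c else centroids[j]   # step 3: new means
                     for j, c in enumerate(clusters)]
    return sorted(centroids)

random.seed(1)
data = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]
print(kmeans(data, 2))   # two centroids, near 1.0 and 9.0
```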

Figure 4: Shared dataset on which the K-means algorithm is run. (The attribute columns of each object are split between data owned by Alice and data owned by Bob.)

In the following section, we describe a secure protocol that carries out the K-means algorithm on protected data objects.

3.1.2. Secure K-means clustering algorithm

Consider the scenario in which Alice and Bob want to apply the K-means algorithm on their joint datasets as shown in Figure 4, but at the same time want to keep their own datasets private. Jagannathan and Wright proposed a solution for this scenario in [40].

In the proposed method, both Alice and Bob get the final output, but the values computed in the intermediate steps are unknown to both parties. Therefore, the intermediate values, such as cluster centroids, are uniformly shared between Alice and Bob in such a way that for a value x, Alice gets a random share a and Bob gets another random share b, where (a + b) mod N = x and N is the size of the field in which all operations take place. Alice and Bob keep their private shares of the dataset secret.

The secure K-means clustering algorithm is separated into subprotocols in which Alice and Bob compute the following (Algorithm 2).

(i) Distance measurement and finding the closest cluster: the distance between each object and cluster centroid is computed by running a secure scalar product protocol by Goethals et al. [41]. The closest cluster centroid is determined by running Yao’s circuit evaluation protocol [42] on the shared data of Alice and Bob.

(ii) New cluster centroid: computing the new cluster centroid requires an average computation over the shared values of Alice and Bob. This function, of the form (a + b)/(m + n), can be computed by applying Yao’s protocol, where Alice knows a and m and Bob knows b and n.


Randomly select K objects from the dataset D as initial cluster centroids
Randomly share the cluster centroids between Alice and Bob
repeat
  for all objects dk in dataset D do
    Run the secure closest cluster protocol
    Assign dk to the closest cluster
  end for
  Alice and Bob compute the random shares for the new centroids of the clusters
until the cluster centroids are close to each other within an error of ε

Algorithm 2: Privacy preserving K-means clustering algorithm.

(iii) Termination condition: the termination condition of the algorithm is computed by running Yao’s circuit evaluation protocol [42].

The squared distance between an object Xi = (x_{i,1}, . . . , x_{i,M}) and a cluster centroid μj is given by the following equation:

(dist(Xi, μj))^2 = (x_{i,1} − μ_{j,1})^2 + (x_{i,2} − μ_{j,2})^2 + · · · + (x_{i,M} − μ_{j,M})^2. (19)

Considering that the cluster centroids are shared between Alice and Bob, (19) can be written as

(dist(Xi, μj))^2 = (xi,1 − (μAj,1 + μBj,1))^2 + · · · + (xi,M − (μAj,M + μBj,M))^2, (20)

where μAj is Alice's share and μBj is Bob's share such that the jth cluster centroid is μj = μAj + μBj. Then, (20) can be written as

(dist(Xi, μj))^2 = Σ_{k=1}^{M} x^2_{i,k} + Σ_{k=1}^{M} (μAj,k)^2 + Σ_{k=1}^{M} (μBj,k)^2 + 2 Σ_{k=1}^{M} μAj,k μBj,k − 2 Σ_{k=1}^{M} μAj,k xi,k − 2 Σ_{k=1}^{M} xi,k μBj,k. (21)

Equation (21) can be computed by Alice and Bob jointly. As the features in the first term are shared between them, Alice computes the sum over the components of her share while Bob computes the sum over the remaining components. The second and third terms can be computed by Alice and Bob individually, and the remaining terms are computed by running a secure scalar product protocol between Alice and Bob, much like the evaluation of (3) via the secure form of (10). Alice first encrypts her data EpkA(μAj) = (EpkA(μAj,1), . . . , EpkA(μAj,M)) and sends it to Bob, who computes the scalar product of this data with his own by using the additive homomorphic property of the encryption scheme as follows:

EpkA(μAj)^μBj = (EpkA(μAj,1)^μBj,1, . . . , EpkA(μAj,M)^μBj,M). (22)

Then, multiplying the encrypted components gives the encrypted scalar product of Alice's and Bob's data:

EpkA( Σ_{k=1}^{M} μAj,k μBj,k ) = Π_{k=1}^{M} EpkA(μAj,k)^μBj,k. (23)

The computed distances between the objects and the cluster centroids can later be the input to Yao's circuit evaluation protocol [42], in which the closest cluster centroid is determined. We refer readers to [41, 42] for further details on this part.
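The encrypted scalar product of (22)–(23) relies only on an additively homomorphic cryptosystem. The survey does not prescribe a particular scheme at this point, so the textbook Paillier sketch below, with deliberately small illustrative primes (real deployments need keys of 2048 bits or more), is an assumption chosen for demonstration:

```python
import math, secrets

# Minimal textbook Paillier with g = n + 1 (toy parameters)
p, q = 999983, 1000003
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                 # decryption constant, valid for g = n + 1

def encrypt(m):
    r = secrets.randbelow(n - 1) + 1
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n) * mu % n

# Alice encrypts her share vector; Bob raises each ciphertext to his own
# component (eq. (22)) and multiplies the results (eq. (23)), yielding an
# encryption of the scalar product of the two shares.
mu_A = [3, 1, 4]                     # Alice's share of a centroid (toy values)
mu_B = [2, 7, 1]                     # Bob's share
enc = [encrypt(a) for a in mu_A]
enc_prod = 1
for c, b in zip(enc, mu_B):
    enc_prod = (enc_prod * pow(c, b, n2)) % n2
assert decrypt(enc_prod) == sum(a * b for a, b in zip(mu_A, mu_B))
```

Raising a ciphertext to a plaintext power realizes (22), and multiplying the resulting ciphertexts realizes (23), precisely because Epk(a) · Epk(b) = Epk(a + b) in such a scheme.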

Once the distances and the closest clusters to the objects are determined, each object is labeled with the nearest cluster index. At the end of each iteration, it is necessary to compute the new cluster centroids. Alice computes the sum of the corresponding coordinates of all objects, sj, and the number of objects, nj, within each of the K clusters, for 1 ≤ j ≤ M. As shown in Figure 4, Alice has only some of the attributes of the objects, so she treats the missing values as zero. Bob applies the same procedure and determines the sum of coordinates tj and the number of objects mj in the clusters. Given sj, tj, nj, and mj, the jth component of the ith cluster centroid is

μi,j = (sj + tj) / (nj + mj). (24)

Since there are only four values, this equation can be computed efficiently by using Yao's circuit evaluation protocol [42] with Alice's shares sj and nj and Bob's shares tj and mj.

In the last step of the K-means algorithm, the iteration is terminated if there is no further improvement between the previous and current cluster centroids. To check this, a distance is computed between the previous and current cluster centroids. This is done in the same way as computing the distance between an object and a cluster centroid, except that this distance is additionally compared to a threshold value ε. Considering that the cluster centroids are shared between Alice and Bob, the result of the computation of the squared distance of the cluster centroids for the kth and (k + 1)th iterations is again a pair of random shares for Alice and Bob:

(dist(μA,k+1_j + μB,k+1_j, μA,k_j + μB,k_j))^2 = αj + βj, (25)

where αj and βj are the shares of Alice and Bob. Alice and Bob then apply Yao's protocol on their K-length vectors (α1, . . . , αK) and (β1, . . . , βK) to check whether αj + βj < ε for 1 ≤ j ≤ K.

3.2. Recommender systems

Recommender services play an important role in applications like e-commerce and direct recommendations for multimedia content. These services attempt to predict items that


Zekeriya Erkin et al. 11

a user may be interested in by applying a signal processing algorithm known as collaborative filtering to user preferences in order to find similar users who share the same taste (likes or dislikes). Once similar users are found, this information can be used in a variety of ways, such as recommending restaurants, hotels, books, audio, and video.

Recommender systems store user data, also known as preferences, on servers, and the collaborative filtering algorithms work on these stored preferences to generate recommendations. The amount of data collected from each user directly affects the accuracy of the predictions. There are two concerns in collecting information from the users in such systems. First, an ordinary system contains on the order of thousands of items, so it is not realistic for users to rate all of them. Second, users would not like to reveal too much privacy-sensitive information that can be used to track them.

The first problem, also known as the sparseness problem in datasets, is addressed for collaborative filtering algorithms in [43–45]. The second problem, user privacy, is of interest to this survey paper: users tend not to give much information about themselves out of privacy concerns, and yet they expect accurate recommendations that fit their taste. This tradeoff between privacy and accuracy leads us to an entirely new perspective on recommender systems. Namely, how can the privacy of the users be protected in recommender systems without losing too much accuracy?

We describe two solutions that address the problem of preserving the privacy of users in recommender systems. In the first approach, user privacy is protected by means of encryption, and the recommendations are generated by processing these encrypted preference values. In the second approach, the privacy of the users is protected without encryption, by means of perturbation of the user preference data.

3.2.1. Recommendations by partial SVD on encrypted preferences

Canny [46] addresses the user privacy problem in recommender systems and proposes to encrypt user preferences. Assume that the recommender system applies a collaborative filtering algorithm to a matrix P of users versus item ratings. Each row of this matrix represents the corresponding user's taste for the corresponding items. Canny proposes to use a collaborative filtering algorithm based on dimension reduction of P. In this way, an approximation matrix of the original preference matrix is obtained in a lower dimension that best represents the user taste for the overall system. When a new user enters the system, the recommendations are generated by simply reprojecting the user preference vector, which has many unrated items, onto the approximation matrix. As a result, a new vector is obtained which contains approximated values for the unrated items [43, 46].

The ratings in recommender systems are usually integer numbers within a small range, and items that are not rated are usually assigned a zero. To protect the privacy of the users, the user preference vector X = [x1, x2, . . . , xM] is encrypted individually as Epk(X). To reduce the dimension of the preference matrix P, singular value decomposition (SVD) is an option. The SVD allows P to be written as

P = U D V^T, (26)

where the columns of U are the left singular vectors, D is a diagonal matrix containing the singular values, and the rows of V^T are the right singular vectors.

Once the SVD of the preference matrix P is computed, an approximation matrix in a lower-dimensional subspace can be computed easily. Computing the SVD on a P that contains encrypted user preferences is, however, more complicated.

Computing the decomposition of the users' preference matrix requires sums of products of vectors. If the preference vector of each user is encrypted, there is no efficient way of computing sums of products of vectors, since this would require an algebraically homomorphic cryptosystem. Using secure multiparty computation protocols for this complex function is costly, considering the size of the circuit necessary for the operation.

Instead of a straightforward computation of the SVD, Canny [46] proposed to use an iterative approximation algorithm to obtain a partial decomposition of the user preference matrix. The conjugate gradient algorithm is an iterative procedure consisting merely of additions of vectors, which can be carried out on homomorphically encrypted user preference vectors. Each iteration of the protocol has two steps: users compute (1) their contribution to the current gradient and (2) scalar quantities for the optimization of the gradient. Both steps require only additions of vectors, so we only explain the first step.

For the first step of the iterations, each user computes his contribution Gk to the current gradient G by the following equation:

Gk = A Xk^T Xk (I − A^T A), (27)

where the matrix A is the approximation of the preference matrix P; it is initialized as a random matrix before the protocol starts. Each user encrypts his own gradient vector Gk with the public key of the user group, following Pedersen's threshold scheme [47], which uses the El Gamal cryptosystem modified to be additively homomorphic. All contributions from the users are then added up to form the encrypted gradient Epk(G) by using the additive homomorphic property of the cryptosystem:

Epk(G) = Epk( Σ_{k∈users} Gk ) = Π_{k∈users} Epk(Gk). (28)

The resulting vector Epk(G) is then jointly decrypted and used to update the approximation matrix A, which is publicly known and used to compute the new gradient for the next iteration.
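The aggregation in (28) can be sketched with an additively homomorphic El Gamal variant that encrypts g^m, so that multiplying ciphertexts adds the plaintexts. Pedersen's scheme [47] shares the private key among the users via a threshold scheme; the single-key version below is a simplification, and all parameters are toy values:

```python
import secrets

# Toy additively homomorphic El Gamal (illustrative parameters only)
p = 1000003                      # a prime modulus (toy size)
g = 2
x = secrets.randbelow(p - 2) + 1 # private key (held jointly via thresholds in [47])
h = pow(g, x, p)                 # public key

def encrypt(m):
    r = secrets.randbelow(p - 2) + 1
    return (pow(g, r, p), (pow(g, m, p) * pow(h, r, p)) % p)

def add(c1, c2):                 # homomorphic addition, as in eq. (28)
    return ((c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p)

def decrypt_small(c, bound=1000):
    gm = (c[1] * pow(c[0], p - 1 - x, p)) % p   # g^m = c2 / c1^x
    for m in range(bound):                       # discrete log by search (small m only)
        if pow(g, m, p) == gm:
            return m

# Each user's gradient contribution is encrypted; multiplying the
# ciphertexts yields an encryption of the summed gradient.
contributions = [5, 11, 42]
total = encrypt(contributions[0])
for m in contributions[1:]:
    total = add(total, encrypt(m))
assert decrypt_small(total) == 58
```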

Although the protocol is based on the addition of vectors, zero-knowledge proof protocols play an important role. The validity of the user inputs, that is, that the encrypted preference vector elements lie in a certain range, is verified by zero-knowledge proofs. Moreover, the partial encryption results


Figure 5: Privacy preserving collaborative filtering with user preference perturbation: users' original data is disguised before being stored in the central database on which collaborative filtering is run.

from the users are also proved valid by running a zero-knowledge proof protocol. Both groups of zero-knowledge proofs are checked by a subgroup of users whose majority is necessary for the validation.

Canny [48] also applies this approach to a different collaborative filtering method, namely, expectation maximization (EM) based factor analysis. Again, this algorithm involves simple iterative operations that can be implemented by vector additions. In both recommender system solutions, multiple iterations are necessary for the algorithm to converge, and in each iteration users need to participate in the cryptographic computations, such as joint decryption and zero-knowledge proofs for input validation. These computations are interactive, and thus it is imperative for the users to be online and synchronized.

3.2.2. Randomized perturbation to protect preferences

The previous section showed that homomorphic cryptosystems, zero-knowledge proof protocols, and secure multiparty computation play an important role in providing solutions for processing encrypted data. However, there are other ways to preserve privacy. In the following, we discuss preserving privacy in recommender systems by perturbation of user data.

The randomized perturbation technique was first introduced in privacy-preserving data mining by Agrawal and Srikant [34]. Polat and Du [49, 50] proposed to use this randomization-based technique in collaborative filtering. User privacy is protected by simply randomizing the user data, while certain computations on aggregate data can still be performed. The server then generates recommendations based on the blinded data but cannot derive the users' private information (Figure 5).

Consider the scalar product of two vectors X and Y. These vectors are blinded by R = [r1, . . . , rM] and S = [s1, . . . , sM] such that X′ = X + R and Y′ = Y + S. Here the ri's and si's are uniformly distributed random values with zero mean. The scalar product of X and Y can be estimated from X′ and Y′:

X′ · Y′ = Σ_{k=1}^{M} (xk yk + xk sk + rk yk + rk sk) ≈ Σ_{k=1}^{M} xk yk. (29)

Since R and S are independent of each other and of X and Y, we have Σ_{k=1}^{M} xk sk ≈ 0, Σ_{k=1}^{M} rk yk ≈ 0, and Σ_{k=1}^{M} rk sk ≈ 0.

Similarly, the sum of the elements of any vector A can be estimated from its randomized form A′. Polat and Du used these two approximations to develop a privacy-preserving collaborative filtering method [49, 50].
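A quick numeric check of (29), under assumed toy distributions (ratings uniform in [1, 5], blinding noise uniform in [−2, 2]): with zero-mean noise and enough items, the scalar product of the blinded vectors stays close to the true one.

```python
import random

rng = random.Random(42)
M = 10_000
X = [rng.uniform(1, 5) for _ in range(M)]   # user ratings (illustrative)
Y = [rng.uniform(1, 5) for _ in range(M)]
R = [rng.uniform(-2, 2) for _ in range(M)]  # zero-mean blinding noise
S = [rng.uniform(-2, 2) for _ in range(M)]

Xp = [x + r for x, r in zip(X, R)]          # X' = X + R
Yp = [y + s for y, s in zip(Y, S)]          # Y' = Y + S

true_dot = sum(x * y for x, y in zip(X, Y))
est_dot = sum(a * b for a, b in zip(Xp, Yp))
rel_err = abs(est_dot - true_dot) / true_dot
assert rel_err < 0.05                        # the aggregate estimate stays close
```

The cross terms of (29) are sums of many zero-mean values, so their contribution shrinks relative to the true scalar product as M grows; this is also why the method needs many users to work.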

This method works only if the number of users in the system is significantly large; only then can the computations based on aggregated data be performed with sufficient accuracy. Moreover, it is pointed out in [51, 52] that the idea of preserving privacy by adding random noise might not preserve privacy as much as originally believed: the user data can be reconstructed from the randomly perturbed user data matrix. The main limitation in the original work of Polat and Du is shown to be the item-invariant perturbation [53]. Therefore, Zhang et al. [53] propose a two-way communication perturbation scheme for collaborative filtering in which the server and the user communicate to determine perturbation guidance that is used to blind the user data before it is sent to the server. Notwithstanding these approaches, the security of schemes based on perturbation of data is not well understood.

4. CONTENT PROTECTION

4.1. Watermarking of content

In the past decade, content protection measures have been proposed based on digital watermarking technology. Digital watermarking [54, 55] allows hiding information in digital content such that it can be detected or extracted at a later moment in time by means of signal processing operations such as correlation. In this way, digital watermarking provides a communication channel multiplexed into the original content through which it is possible to transmit information. The type of information transmitted from sender to receiver depends on the application at hand. As an example, in a forensic tracing application, a watermark is used to embed a unique code into each copy of the content to be distributed, where the code links a copy either to a particular user or to a specific device. When unauthorized published content is found, the watermark allows tracing the user who redistributed the content.

Secure signal processing is needed in case watermark detection or embedding is done on untrusted devices; watermarking schemes usually rely on a symmetric key for both embedding and detection, which is critical to both the robustness and security of the watermark and thus needs to be protected.

For the application of secure signal processing in content protection, three categories can be identified, namely, distribution models, customer rights protection, and secure watermark detection. The first two categories are relevant to forensic tracing (fingerprinting) applications. In classical distribution models, the watermark embedding process is carried out by a trusted server before releasing the content to the user. However, this approach is not scalable, and in large-scale distribution systems the server may become overloaded. In addition, since point-to-point communication channels are



Figure 6: A digital watermarking model.

required, bandwidth requirements become prohibitive. A proposed solution is to use client-side watermark embedding. Since the client is untrusted, the watermark needs to be embedded without the client having access to the original content and watermark.

The customer's rights problem relates to the intrinsic ambiguity that arises when watermarks are embedded at the distribution server: a customer whose watermark has been found on unauthorized copies can claim that he has been framed by a malicious seller who inserted his identity as a watermark in an arbitrary object. The mere existence of this problem may discredit the accuracy of the forensic tracing architecture. Buyer-seller protocols have been designed to embed a watermark based on the encrypted identity of the buyer, making sure that the watermarked copy is available only to the buyer and not to the seller.

In the watermark detection process, a system has to prove to a verifier that a watermark is present in certain content. Proving the presence of such a watermark is usually done by revealing the required detection information to the verifying party. All current applications assume that the verifier is a trusted party. However, this is not always true, for instance, if the prover is a consumer device. A cheating verifier could exploit the knowledge acquired during watermark detection to break the security of the watermarking system. Cryptographic protocols, utilizing zero-knowledge proofs, have been constructed to mitigate this problem.

We first introduce a general digital watermarking model to define the notation used in the remainder of the section. As an example of a watermarking scheme, we describe the one proposed by Cox et al. [56], since this scheme is adopted in many content protection applications.

4.1.1. Watermarking model

Figure 6 shows a common model for a digital watermarking system [57]. The inputs of the system are the original host signal X and some application-dependent to-be-hidden information, here represented as a binary string B = [b1, b2, . . . , bL], with bi taking values in {0, 1}. The embedder inserts the watermark code B into the host signal to produce a watermarked signal Xw, usually making use of a secret key sk to control some parameters of the embedding process and allow the recovery of the watermark only to authorized users.

The watermark channel takes into account all processing operations and (intentional or unintentional) manipulations the watermarked content may undergo during distribution and use. As a result, the watermarked content Xw is modified into the “received” version X′. Based on X′, either a detector verifies the presence of a specific message given to it as input, thus answering only yes or no, or a decoder reads the (binary) information conveyed by the watermark. Detectors and decoders may need to know the original content X in order to retrieve the hidden information (non-blind detector/decoder), or they may not require the original content (blind or oblivious detector/decoder).

4.1.2. Watermarking algorithm

Watermark information is embedded into host signals by making imperceptible modifications to them. The modifications are such that they convey the to-be-hidden information B. The hidden information can be retrieved afterwards from the modified content by detecting the presence of these modifications. Embedding is achieved by modifying a set of features X = [x1, x2, . . . , xM]. In the simplest case, the features are signal amplitudes. In more complicated scenarios, the features can be DCT or wavelet coefficients. Several watermarking schemes use a spread-spectrum approach to code the to-be-hidden information B into W = [w1, w2, . . . , wM]. Typically, W is a realization of a normally distributed random signal with zero mean and unit variance.

The best-known spread-spectrum technique was proposed by Cox et al. [56]. The host signal is first transformed into a discrete cosine transform (DCT) representation. Next, the largest-magnitude DCT coefficients are selected, forming the set of features X. The multiplicative watermark embedding rule is defined as follows:

xw,i = xi + γ wi xi = xi (1 + γ wi), (30)

where xw,i is the ith component of the watermarked feature vector and γ is a scaling factor controlling the watermark strength. Finally, an inverse DCT transform yields the watermarked signal Xw.

To determine whether a given signal Y contains the watermark W, the decoder computes the DCT of Y, extracts the set X′ of largest DCT coefficients, and then computes the correlation ρX′W between the features X′ and the watermark W. If the correlation is larger than a threshold T, that is,

ρX′W = 〈X′, W〉 / √〈X′, X′〉 ≥ T, (31)

the watermark is considered present in Y.
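The embedding rule (30) and the detection statistic (31) can be sketched as follows. For brevity the DCT step is skipped and the scheme is run directly on a stand-in feature vector; M, γ, and the feature distribution are illustrative assumptions, not values from Cox et al.:

```python
import math, random

rng = random.Random(1)
M = 4096
X = [rng.uniform(10, 20) for _ in range(M)]   # stand-in for the largest DCT coefficients
W = [rng.gauss(0, 1) for _ in range(M)]       # N(0, 1) watermark
gamma = 0.2                                    # watermark strength (illustrative)

# Multiplicative embedding, eq. (30): x_w,i = x_i * (1 + gamma * w_i)
Xw = [x * (1 + gamma * w) for x, w in zip(X, W)]

def correlation(Xp, Wc):
    # Detection statistic of eq. (31): <X', W> / sqrt(<X', X'>)
    return sum(a * b for a, b in zip(Xp, Wc)) / math.sqrt(sum(a * a for a in Xp))

W2 = [rng.gauss(0, 1) for _ in range(M)]       # an unrelated watermark
assert correlation(Xw, W) > correlation(Xw, W2)  # the true mark correlates far higher
```

In a full implementation the detector would compare the statistic against a threshold T chosen from the desired false-positive rate.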


4.2. Client-side watermark embedding

Client-side watermark embedding systems transmit the same encrypted version of the original content to all clients, but a client-specific decryption key allows each client to decrypt the content and at the same time implicitly embed a watermark. When the client uses his key to decrypt the content, he obtains a uniquely watermarked version of the content. The security properties of the embedding scheme usually guarantee that obtaining either the watermark or the original content in the clear is of comparable hardness to removing the watermark from the personalized copy.

In the literature, several approaches for secure embedding can be found. In [58], a pseudorandom mask is blended over each frame of a video. Each client is given a different mask, which, when subtracted from the masked broadcast video, leaves an additive watermark in the content. The scheme is not very secure: since the same mask is used for all frames of a video, it can be estimated by averaging attacks.

In broadcast environments, stream switching [59, 60] can be performed. Two differently watermarked signals are chopped up into small chunks. Each chunk is encrypted with a different key. Clients are given different sets of decryption keys that allow them to selectively decrypt chunks of the two broadcast streams such that each client obtains the full stream decrypted. The way the full stream is composed out of the two broadcast versions encodes the watermark. This solution consumes considerable bandwidth, since the data to be broadcast to the clients is twice as large as the content itself.

A second solution involves partial encryption, for instance, encrypting the signs of the DCT coefficients of a signal [61]. Since the sign bits of DCT coefficients are perceptually significant, the partially encrypted version of the signal is heavily distorted. During decryption, each user has a different key that decrypts only a subset of these coefficients, so that some signs are left unchanged. This leaves a detectable fingerprint in the signal. A similar approach was used in [62] to obtain partial encryption-based secure embedding solutions for audiovisual content.

A third approach is represented by methods using a stream cipher that allows the use of multiple decryption keys which decrypt the same ciphertext to slightly different plaintexts. Again, the difference between the original and the decrypted content represents the embedded watermark. The first scheme following this approach was proposed by Anderson and Manifavas [63], who designed a special stream cipher, called Chameleon, which allows decrypting Chameleon-encrypted content in slightly different ways. During encryption, a key and a secure index generator are used to generate a sequence of indices, which are used to select four entries from a look-up table (LUT). These entries are XORed with the plaintext to form a word of the ciphertext. The decryption process is identical to encryption except for the use of a decryption LUT, which is obtained by properly inserting bit errors in some entries of the encryption LUT. Decryption superimposes these errors onto the content, thus leaving a unique watermark. Recently, Adelsbach et al. [64] and Celik et al. [65] proposed generalizations of

X

sk

Encryption

Enc LUT

Server

Wk

X′

sk

Dec LUT

Decryption XW

Clientk

Figure 7: Encryption and subsequent joint decryption and watermarking procedure proposed in [65].

Chameleon, suitable for embedding robust spread-spectrum watermarks. The schemes operate on LUTs composed of integers from Zp and replace the XOR operation with a (modular) addition.

In more detail, the secure embedding solution works as follows. The distribution server generates a long-term master encryption LUT E of size L, whose entries are properly generated random samples; E will be used to encrypt the content to be distributed to the clients. Next, for the kth client, the server generates a personalized watermark LUT Wk according to a desired probability distribution and builds a personalized decryption LUT Dk by combining the master LUT and the watermark LUT:

Dk[i] = −E[i] + Wk[i]. (32)

The personalized LUTs are then transmitted once to each client over a secure channel. Note that the generation of the LUTs is carried out just once, at the setup of the application. A content item X is encrypted by adding to it a pseudorandom sequence obtained by selecting entries of the encryption LUT with a secure pseudorandom sequence generator driven by a session key sk. Each client receives the encrypted content X′ along with the session key sk and decrypts it using entries of his/her personalized decryption LUT Dk (again chosen according to sk), with the final effect that a spread-spectrum watermark sequence is embedded into the decrypted content. This process is summarized in Figure 7. In detail, driven by the session key sk, a set of indices tij is generated, where 0 ≤ i ≤ M − 1, 0 ≤ j ≤ S − 1, and 0 ≤ tij ≤ L − 1. Each feature xi of the content is encrypted by adding S entries of the encryption LUT, obtaining the encrypted feature x′i as follows:

x′i = xi + Σ_{j=0}^{S−1} E[tij]. (33)

Joint decryption and watermarking is accomplished by reconstructing the same set of indices tij with the session key sk and adding S entries of the decryption LUT to each encrypted feature x′i:

xw,i = x′i + Σ_{j=0}^{S−1} Dk[tij] = xi + Σ_{j=0}^{S−1} Wk[tij] = xi + wi. (34)


4.3. Buyer seller protocols

Forensic tracing architectures which perform watermark embedding at the distribution server are vulnerable to a dishonest seller. The mere fact that a seller may fool a buyer may have an impact on the credibility of the whole tracing system. (Note that a seller may in fact have an incentive to fool a buyer: a seller who acts as an authorized reselling agent may be interested in distributing many copies of a work containing the fingerprint of a single buyer to avoid paying royalties to the author, by claiming that such copies were illegally distributed or sold by the buyer.)

A possible solution consists in resorting to a trusted third party (TTP), responsible for both embedding and detection of watermarks; however, such an approach is not feasible in practical applications because the TTP could easily become a bottleneck for the whole system. The buyer-seller protocol instead relies on cryptographic primitives to perform watermark embedding [66]; the protocol assures that the seller does not have access to the watermarked copy carrying the identity of the buyer, hence he cannot distribute or sell such copies. In spite of this, the seller can identify the buyer from whom unauthorized copies originated, and prove it by using a proper dispute resolution protocol.

We describe the protocol by Memon and Wong [66] in more detail. Let Alice be the seller, Bob the buyer, and WCA a trusted watermark certification authority in charge of generating legal watermarks and sending them to any buyer upon request. The protocol uses a public-key cryptosystem which is homomorphic with respect to the operation used in the watermark embedding equation (i.e., the cryptosystem will be multiplicatively homomorphic if watermark embedding is multiplicative, as in Cox's scheme); moreover, Alice and Bob possess pairs of public/private keys denoted by pkA, pkB (public keys) and skA, skB (private keys).

In the first part of the protocol, at the request of Bob, the WCA generates a valid watermark signal W and sends it back to Bob, encrypted with Bob's public key as EpkB(W), along with its digital signature SWCA(EpkB(W)), to prove that the watermark is valid.

Next, Bob sends EpkB(W) and SWCA(EpkB(W)) to Alice, so that Alice can verify that the encrypted watermark has been generated by the WCA. Alice performs two watermark embedding operations. First, she embeds (with any watermarking scheme) into the original content X a watermark V, which just conveys a distinct ID univocally identifying the transaction, obtaining the watermarked content X′. Next, a second watermark is built by using EpkB(W): Alice permutes the watermark components through a secret permutation σ:

σ(EpkB(W)) = EpkB(σ(W)), (35)

and inserts EpkB(σ(W)) into X′ directly in the encrypted domain, obtaining the final watermarked content X′′ in encrypted form; X′′ is thus unknown to her. This is possible due to the homomorphic property of the cipher:

EpkB(X′′) = EpkB(X′) · EpkB(σ(W)). (36)

When Bob receives EpkB(X′′), he decrypts it using his private key skB, thus obtaining X′′, in which the watermarks V and σ(W) are embedded. Note that Bob cannot read the watermark σ(W), since he does not know the permutation σ. The scheme is represented in Figure 8.
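The homomorphic step (36) can be illustrated with unpadded textbook RSA, which is multiplicatively homomorphic: E(a) · E(b) mod n = E(a · b mod n). The toy parameters and values below are assumptions for demonstration only; the real protocol applies this property to watermark components and features, and unpadded RSA is insecure in practice.

```python
# Schoolbook RSA with toy primes (illustrative only; never use unpadded RSA)
p, q = 999983, 1000003
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))

def enc(m):               # Bob's public-key encryption E_pkB
    return pow(m, e, n)

def dec(c):               # Bob's private-key decryption, known only to Bob
    return pow(c, d, n)

x_feat = 12345            # stand-in for a feature of the content X'
w_perm = 1007             # stand-in for a permuted watermark component

# The seller multiplies ciphertexts without ever seeing the plaintext result;
# only Bob can decrypt the watermarked feature, as in eq. (36).
c = (enc(x_feat) * enc(w_perm)) % n
assert dec(c) == (x_feat * w_perm) % n
```

This is why [66] pairs a multiplicative embedding rule such as Cox's with a multiplicatively homomorphic cipher: the product of ciphertexts embeds the watermark in the encrypted domain.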

To recover the identity of potential copyright violators, Alice first looks for the presence of V. Upon detection of an unauthorized copy of X, say Y, she can use the second watermark to effectively prove that the copy originated from Bob. To do so, Alice must reveal to the judge the permutation σ, the encrypted watermark EpkB(W), and SWCA(EpkB(W)). After verifying SWCA(EpkB(W)), the judge asks Bob to use his private key skB to compute and reveal W. Now it is possible to check Y for the presence of σ(W): if its presence is verified, then Bob is judged guilty; otherwise, Bob's innocence has been proven. Note that if σ(W) is found in Y, Bob cannot state that Y originated from Alice, since to do so Alice would have had to know either W, to insert it within the plain asset X, or skB, to decrypt EpkB(X′′) after the watermark was embedded in the encrypted domain.

As a particular implementation of the protocol, [66] proposed to use Cox's watermarking scheme and a multiplicatively homomorphic cipher (despite its deterministic nature, the authors use RSA). More secure and less complex implementations of the buyer-seller protocol have been proposed in [67–70].

4.4. Secure watermark detection

To tackle the problem of watermark detection in the presence of an untrusted verifier (to whom the watermark secrets cannot be disclosed), two approaches have been proposed. One approach, called asymmetric watermarking [71, 72], uses different keys for watermark embedding and detection: whereas a watermark is embedded using a private key, its presence can be detected using a public key. In such schemes, the knowledge of the public detection key must not enable an adversary to remove the embedded watermark; unfortunately, none of the proposed schemes is sufficiently robust against malicious attacks [73]. The other approach is zero-knowledge watermark detection.

Zero-knowledge watermark detection (ZKWD) uses a cryptographic protocol to wrap a standard symmetric watermark detection process. In general, a zero-knowledge watermark detection algorithm is an interactive proof system where a prover tries to convince a verifier that a digital content X′ is watermarked with a given watermark B without disclosing B. In contrast to the standard watermark detector, in ZKWD the verifier is given only properly encoded (or encrypted) versions of security-critical watermark parameters. Depending on the particular protocol, the watermark code, the watermarked object, a watermark key, or even the original unmarked object is available in encrypted form to the verifier. The prover runs the zero-knowledge watermark detector to demonstrate to the verifier that the encoded watermark is present in the object in question, without removing the encoding. A protocol run will not leak any information except for the unencoded inputs and the watermark presence detection result.

Early approaches to zero-knowledge watermark detection used permutations to conceal both the watermark and



Figure 8: The scheme of the Buyer Seller Protocol proposed in [66].

the object in which the watermark is to be detected [74]; the protocol assures that the permuted watermark is detected in the permuted content and that both the watermark and the object are permuted in the same manner. Craver [75] proposed to use ambiguity attacks as a central tool to construct zero-knowledge detectors; such attacks make it possible to compute a watermark that is detectable in a given content but has never been embedded there. To use ambiguity attacks in a secure detector, the real watermark is concealed within a number of fake marks. The prover has to show that there is a valid watermark in this list without revealing its position. Now, the adversary (equipped solely with a watermark detector) cannot decide which of the watermarks is not counterfeit. Removal of the watermark is thus made significantly more difficult.

Another proposal is to compute the watermark detection statistic in the encrypted domain (e.g., by using additive homomorphic public-key encryption schemes or commitments) and then use zero-knowledge proofs to convince the verifier that the detection statistic exceeds a fixed threshold. This approach was first proposed by Adelsbach and Sadeghi [76], who use a homomorphic commitment scheme to compute the detection statistic; the approach was later refined in [77].

Adelsbach and Sadeghi [76] propose a zero-knowledge protocol based on Cox's watermarking scheme. In contrast to the original algorithm, it is assumed that the watermark and DCT coefficients are integers and not real numbers (this can be achieved by appropriate quantization). Moreover, for efficiency reasons, the correlation computation in (31) is replaced by the detection criterion:

C := 〈X′, W〉² − 〈X′, X′〉 · T² = A² − B ≥ 0;    (37)

the latter detection criterion, where A := 〈X′, W〉 and B := 〈X′, X′〉 · T², is equivalent to the original one, provided that the factor A is positive.
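The equivalence can be checked numerically. The sketch below assumes that the original detector (31), not shown here, is the normalized-correlation test 〈X′, W〉/√〈X′, X′〉 ≥ T, which is consistent with the rearrangement in (37); the vector sizes and thresholds are arbitrary:

```python
import random

def detect_original(xp, w, T):
    # Assumed form of (31): normalized correlation <X', W> / sqrt(<X', X'>) >= T
    a = sum(x * y for x, y in zip(xp, w))
    return a / sum(x * x for x in xp) ** 0.5 >= T

def detect_integer(xp, w, T):
    # Integer-only criterion (37): C = A^2 - B >= 0, with A = <X', W> and B = <X', X'> * T^2
    a = sum(x * y for x, y in zip(xp, w))
    b = sum(x * x for x in xp) * T * T
    return a * a - b >= 0

random.seed(1)
for _ in range(1000):
    xp = [random.randint(-50, 50) for _ in range(16)]
    w = [random.randint(-10, 10) for _ in range(16)]
    T = random.randint(1, 5)
    a = sum(x * y for x, y in zip(xp, w))
    b = sum(x * x for x in xp) * T * T
    if a > 0 and a * a != b:  # equivalence requires A > 0; skip the exact boundary
        assert detect_original(xp, w, T) == detect_integer(xp, w, T)
```

Note that the integer form never divides or takes a square root, which is what makes it computable on quantized (and hence committable) values.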

The following zero-knowledge detection protocol has been designed to allow the prover to prove to a verifier that the watermark committed to in the commitment com(W) is present in the watermarked content X′, without revealing any information about W. In the protocol, the authors employ an additively homomorphic commitment scheme (such as the one proposed by Damgard and Fujisaki [78]). Let ppub, X′, com(W), T be the common inputs of prover and verifier and let psec be the private input of the prover. First, both prover and verifier select the watermarked features X′ and compute the value B of (37); the prover sends a commitment com(B) to the verifier and opens it immediately, allowing him to verify that the opened commitment contains the same value B he computed himself. Now both compute the commitment

com(A) = ∏_{i=1}^{M} com(w_i)^{x′_i}    (38)

by taking advantage of the homomorphic property of the commitment scheme. Subsequently, the prover proves in zero-knowledge that A ≥ 0. Next, the prover computes the value A², sends a commitment com(A²) to the verifier, and gives him a zero-knowledge proof to prove that com(A²) really contains the square of the value contained in com(A). Being convinced that com(A²) really contains the correctly computed value A², the two parties compute the commitment com(C) := com(A²)/com(B) on the value C. Finally, the prover proves to the verifier, with a proper zero-knowledge protocol, that com(C) ≥ 0. If this proof is accepted, then the detection algorithm ends with true; otherwise, with false.
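The commitment arithmetic underlying these steps can be sketched with a toy Pedersen-style commitment com(m) = g^m · h^r mod p. All parameters below (the prime, the generators, the watermark and feature values) are illustrative only, and the zero-knowledge subproofs (A ≥ 0, the squaring proof, C ≥ 0) are deliberately omitted; the sketch only shows how com(A) of (38) and com(C) = com(A²)/com(B) are formed:

```python
import random

# Toy Pedersen-style commitment over Z_p*; parameters are illustrative, not secure.
p = 2**61 - 1                 # a Mersenne prime, used as a toy modulus
g, h = 3, 7                   # assumed "independent" generators (fine for a sketch)

def commit(m, r):
    return (pow(g, m % (p - 1), p) * pow(h, r % (p - 1), p)) % p

def open_ok(c, m, r):         # opening = revealing (m, r); verifier recomputes
    return c == commit(m, r)

random.seed(0)
w  = [2, 6, 1, 9, 3]          # committed watermark (prover's secret)
xp = [2, 7, 1, 8, 2]          # watermarked features (public)
T  = 4                        # detection threshold

# Prover commits to each watermark coefficient.
rw = [random.randrange(p - 1) for _ in w]
cw = [commit(wi, ri) for wi, ri in zip(w, rw)]

# Both parties derive com(A) homomorphically as in (38): product of com(w_i)^{x'_i}.
cA = 1
for ci, xi in zip(cw, xp):
    cA = cA * pow(ci, xi, p) % p
A  = sum(wi * xi for wi, xi in zip(w, xp))
rA = sum(ri * xi for ri, xi in zip(rw, xp))
assert open_ok(cA, A, rA)     # com(A) indeed commits to <X', W>

# B is public; com(B) is opened immediately, so its randomness is known to both.
B   = sum(xi * xi for xi in xp) * T * T
rB  = random.randrange(p - 1)
cB  = commit(B, rB)

# Prover sends com(A^2) (the squaring would be proved in zero-knowledge).
rA2 = random.randrange(p - 1)
cA2 = commit(A * A, rA2)

# com(C) = com(A^2) / com(B) commits to C = A^2 - B.
cC = cA2 * pow(cB, -1, p) % p
assert open_ok(cC, A * A - B, rA2 - rB)
print("C =", A * A - B, "-> watermark present:", A * A - B >= 0)
```

The division of commitments works because exponents add under multiplication: cA2 · cB⁻¹ = g^(A²−B) · h^(rA2−rB) mod p, exactly a commitment to C.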

While early protocols addressed only correlation-based watermark detectors, the approach has recently been extended to Gaussian maximum likelihood detectors [79] and dither modulation watermarks [80, 81].

5. CONCLUSION AND DISCUSSION

The availability of signal processing algorithms that work directly on encrypted data would be of great help for application scenarios where "valuable" signals must be produced, processed, or exchanged in digital format. In this paper, we have broadly referred to this new class of techniques as signal processing in the encrypted domain. We have mainly reviewed the state of the art, describing the necessary properties of the cryptographic primitives and highlighting the limits of current solutions that have an impact on processing in the encrypted domain.

Zekeriya Erkin et al. 17

Concerning the use of cryptographic primitives for signal processing in the encrypted domain, we can observe that treating the digital content as binary data is not realistic and eliminates the possibility of further processing. Concerning the basic encryption primitives that make processing in the encrypted domain possible, for the particular case when it is necessary to compress an encrypted signal, a possibility is to resort to the theory of coding with side information; this primitive, however, seems to be applicable only to this kind of problem.

The general cryptographic tools that allow processing of encrypted signals are homomorphic cryptosystems, since they allow performing linear computations on the encrypted data. In order to implement the necessary signal processing operations, it seems crucial to have an algebraic cryptosystem. However, such a system does not exist and, despite the fact that there is no formal proof, it is widely believed that such a system would be insecure due to preserving too much structure. Yet, homomorphic cryptosystems are the key components in signal processing in the encrypted domain. Another property, important for signal processing in the encrypted domain, is probabilistic encryption: since signal samples are usually 8 or 16 bits in length, encrypting such values with a deterministic cryptosystem will result in recurring encrypted values, which significantly reduces the search space for brute-force attacks. A probabilistic scheme, which does not encrypt two equal plaintexts into the same ciphertext, eliminates this issue. However, once the data is encrypted, probabilistic encryption makes it impossible to check whether the encrypted value represents a valid input for the purposes of the subsequent processing. Similarly, the output of a function that is computed on encrypted data may need to be compared with another value. In such situations, cryptography provides a solution known as zero-knowledge proofs. Moreover, when a nonlinear function needs to be computed, homomorphic encryption cannot help; in such a case, it is possible to resort to interactive protocols (e.g., secure multiparty computation). The limit of these protocols is that a general solution is infeasible when the parties own huge quantities of data or the functions to be evaluated are complex, as happens in signal processing scenarios.
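Both properties (the additive homomorphism and probabilistic encryption) can be illustrated with a toy instance of the Paillier cryptosystem [2]; the primes and randomness below are for illustration only and are far too small to be secure:

```python
import math

# Toy Paillier cryptosystem: E(a) * E(b) = E(a + b) and E(a)^k = E(k * a) mod n^2.
p_, q_ = 1117, 1129                        # illustrative toy primes (not secure)
n = p_ * q_
n2 = n * n
g = n + 1                                  # a standard choice of generator
lam = math.lcm(p_ - 1, q_ - 1)

def enc(m, r):
    # Probabilistic encryption: the same m with a different r gives a different c.
    return pow(g, m, n2) * pow(r, n, n2) % n2

def dec(c):
    L = lambda x: (x - 1) // n             # the function L(x) = (x - 1) / n
    mu = pow(L(pow(g, lam, n2)), -1, n)
    return L(pow(c, lam, n2)) * mu % n

a, b, k = 123, 456, 7
ca, cb = enc(a, 2), enc(b, 3)
assert dec(ca) == a
assert dec(ca * cb % n2) == a + b          # product of ciphertexts -> sum of plaintexts
assert dec(pow(ca, k, n2)) == k * a        # exponentiation -> multiplication by a constant
assert enc(a, 2) != enc(a, 5)              # equal plaintexts, distinct ciphertexts
```

The last assertion is exactly the probabilistic property discussed above: an eavesdropper who sees two ciphertexts of the same 8-bit sample cannot tell that the underlying values are equal.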

Though the possibility of processing encrypted data was advanced several years ago, processing encrypted signals poses some new problems due to the peculiarities of signals with respect to other classes of data more commonly encountered in the cryptographic literature, for example, alphanumeric strings or bit sequences. One property of signals is that in many signal processing applications, there is interest in the way the signal varies with time rather than in the single values it assumes. Moreover, the arithmetic used to represent the signal samples has to be carefully taken into account. If the signal samples are represented by means of fixed-point arithmetic, we need to ensure that no overflow occurs; for signal processing in the encrypted domain, it is necessary that such a condition is ensured a priori by carefully constraining the properties of the signals we operate on and the type and number of operations we want to perform on them. Moreover, keeping the distinction between the integer and the fractional part of a number is a difficult task, given that, once again, it calls for the possibility of comparing encrypted numbers. If the signals are represented by means of floating point arithmetic, working in the encrypted domain is a very difficult task due to the necessity of implementing operations such as comparisons and right shifts in the encrypted domain, for which efficient (noninteractive) solutions are not known yet.
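The fixed-point constraint can be made concrete with a small sketch. The scaling factor S and the message-space size N_MSG below are assumptions chosen for illustration; the point is that the overflow bound is checked a priori, before any value is encrypted, because no comparison is available afterwards:

```python
# Fixed-point sketch: quantize real samples to integers and bound the result
# a priori so no overflow can occur in the (hypothetical) plaintext space.
S = 2 ** 10                      # scaling factor: 10 fractional bits (an assumption)
N_MSG = 2 ** 64                  # assumed size of the cryptosystem's message space

def q(x):
    return round(x * S)          # fixed-point encoding of a real value

signal = [0.25, -1.5, 3.125]     # toy input samples
taps = [0.5, 0.5]                # toy FIR filter coefficients

# Each product of two encoded values carries scale S*S, so the worst-case
# magnitude of one output sample is len(taps) * max|x| * max|h| * S^2.
bound = len(taps) * max(abs(x) for x in signal) * max(abs(h) for h in taps) * S * S
assert bound < N_MSG // 2        # verified before encryption; leave room for a sign

# A 2-tap filter on quantized values (the multiply-accumulate is exactly the
# kind of linear operation a homomorphic scheme can evaluate on ciphertexts).
y = [sum(q(signal[i - j]) * q(taps[j]) for j in range(len(taps)))
     for i in range(1, len(signal))]
assert [v / (S * S) for v in y] == [-0.625, 0.8125]   # exact: all values are dyadic
```

Rescaling by S² in the last line is only possible here because the values are in the clear; in the encrypted domain it would itself require a protocol, which is precisely the difficulty described above.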

ACKNOWLEDGMENTS

The work reported here has been funded in part by the European Community's Sixth Framework Programme under Grant no. 034238, SPEED project (Signal Processing in the Encrypted Domain). The work reported reflects only the authors' views; the European Community is not liable for any use that may be made of the information contained herein.

REFERENCES

[1] JPSEC, International standard, ISO/IEC 15444-8, 2007.

[2] P. Paillier, "Public-key cryptosystems based on composite degree residuosity classes," in Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT '99), vol. 1592 of Lecture Notes in Computer Science, pp. 223–238, Springer, Prague, Czech Republic, May 1999.

[3] T. ElGamal, "A public key cryptosystem and a signature scheme based on discrete logarithms," in Proceedings of the 4th Annual International Cryptology Conference (CRYPTO '84), vol. 196 of Lecture Notes in Computer Science, pp. 10–18, Springer, Santa Barbara, Calif, USA, August 1985.

[4] R. L. Rivest, A. Shamir, and L. Adleman, "A method for obtaining digital signatures and public-key cryptosystems," Communications of the ACM, vol. 21, no. 2, pp. 120–126, 1978.

[5] J. R. Troncoso-Pastoriza, S. Katzenbeisser, M. Celik, and A. Lemma, "A secure multidimensional point inclusion protocol," in Proceedings of the 9th Workshop on Multimedia & Security (MM&Sec '07), pp. 109–120, ACM Press, Dallas, Tex, USA, September 2007.

[6] D. Boneh, G. D. Crescenzo, R. Ostrovsky, and G. Persiano, "Public key encryption with keyword search," in Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT '04), vol. 3027 of Lecture Notes in Computer Science, pp. 506–522, Springer, Interlaken, Switzerland, May 2004.

[7] R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu, "Order preserving encryption for numeric data," in Proceedings of the ACM International Conference on Management of Data (SIGMOD '04), pp. 563–574, ACM Press, Paris, France, June 2004.

[8] J. Domingo-Ferrer, "A new privacy homomorphism and applications," Information Processing Letters, vol. 60, no. 5, pp. 277–282, 1996.

[9] J. Domingo-Ferrer, "A provably secure additive and multiplicative privacy homomorphism," in Proceedings of the 5th International Conference on Information Security (ISC '02), vol. 2433 of Lecture Notes in Computer Science, pp. 471–483, Springer, Sao Paulo, Brazil, September-October 2002.

[10] J. H. Cheon and H. S. Nam, "A cryptanalysis of the original Domingo-Ferrer's algebraic privacy homomorphism," Cryptology ePrint Archive, Report 2003/221, 2003.


[11] D. Wagner, "Cryptanalysis of an algebraic privacy homomorphism," in Proceedings of the 6th International Conference on Information Security (ISC '03), vol. 2851 of Lecture Notes in Computer Science, pp. 234–239, Bristol, UK, October 2003.

[12] C. Fontaine and F. Galand, "A survey of homomorphic encryption for nonspecialists," EURASIP Journal on Information Security, vol. 2007, Article ID 13801, 10 pages, 2007.

[13] B. Schoenmakers and P. Tuyls, "Practical two-party computation based on the conditional gate," in Proceedings of the 10th International Conference on the Theory and Application of Cryptology and Information Security (ASIACRYPT '04), vol. 3329 of Lecture Notes in Computer Science, pp. 119–136, Jeju Island, Korea, December 2004.

[14] S. Goldwasser and S. Micali, "Probabilistic encryption," Journal of Computer and System Sciences, vol. 28, no. 2, pp. 270–299, 1984.

[15] J. Benaloh, Verifiable secret-ballot elections, Ph.D. thesis, Department of Computer Science, Yale University, New Haven, Conn, USA, 1988.

[16] D. Naccache and J. Stern, "A new public key cryptosystem based on higher residues," in Proceedings of the 5th ACM Conference on Computer and Communications Security (CCS '98), pp. 59–66, San Francisco, Calif, USA, November 1998.

[17] T. Okamoto and S. Uchiyama, "A new public-key cryptosystem as secure as factoring," in Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT '98), vol. 1403 of Lecture Notes in Computer Science, pp. 308–318, Springer, Espoo, Finland, May-June 1998.

[18] I. Damgard and M. Jurik, "A generalisation, a simplification and some applications of Paillier's probabilistic public-key system," in Proceedings of the 4th International Workshop on Practice and Theory in Public Key Cryptosystems (PKC '01), vol. 1992 of Lecture Notes in Computer Science, pp. 119–136, Springer, Cheju Island, Korea, February 2001.

[19] O. Goldreich, Foundations of Cryptography I, Cambridge University Press, Cambridge, UK, 2001.

[20] C. P. Schnorr, "Efficient identification and signatures for smart cards," in Proceedings of the 9th Annual International Cryptology Conference (CRYPTO '89), vol. 435 of Lecture Notes in Computer Science, pp. 239–252, Springer, Santa Barbara, Calif, USA, August 1990.

[21] H. Lipmaa, "On diophantine complexity and statistical zero-knowledge arguments," in Proceedings of the 9th International Conference on the Theory and Application of Cryptology and Information Security (ASIACRYPT '03), vol. 2894 of Lecture Notes in Computer Science, pp. 398–415, Springer, Taipei, Taiwan, November-December 2003.

[22] F. Boudot, "Efficient proofs that a committed number lies in an interval," in Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT '00), vol. 1807 of Lecture Notes in Computer Science, pp. 431–444, Springer, Bruges, Belgium, May 2000.

[23] E. Fujisaki and T. Okamoto, "Statistical zero-knowledge protocols to prove modular polynomial relations," in Proceedings of the 17th Annual International Cryptology Conference (CRYPTO '97), vol. 1294 of Lecture Notes in Computer Science, pp. 16–30, Springer, Santa Barbara, Calif, USA, August 1997.

[24] J. Camenisch and M. Michels, "Proving in zero-knowledge that a number is the product of two safe primes," in Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT '99), vol. 1592 of Lecture Notes in Computer Science, pp. 107–122, Springer, Prague, Czech Republic, May 1999.

[25] T. Pedersen, "Non-interactive and information-theoretic secure verifiable secret sharing," in Proceedings of the 11th Annual International Cryptology Conference (CRYPTO '91), vol. 576 of Lecture Notes in Computer Science, pp. 129–140, Springer, Santa Barbara, Calif, USA, August 1992.

[26] A. C. Yao, "Protocols for secure computations," in Proceedings of the 23rd IEEE Symposium on Foundations of Computer Science (FOCS '82), pp. 160–164, Chicago, Ill, USA, November 1982.

[27] B. Schoenmakers and P. Tuyls, "Efficient binary conversion for Paillier encrypted values," in Proceedings of the 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT '06), vol. 4004 of Lecture Notes in Computer Science, pp. 522–537, Springer, St. Petersburg, Russia, May-June 2006.

[28] O. Goldreich, Foundations of Cryptography II, Cambridge University Press, Cambridge, UK, 2004.

[29] S.-C. S. Cheung and T. Nguyen, "Secure multiparty computation between distrusted networks terminals," EURASIP Journal on Information Security, vol. 2007, Article ID 51368, 10 pages, 2007.

[30] M. Johnson, P. Ishwar, V. Prabhakaran, D. Schonberg, and K. Ramchandran, "On compressing encrypted data," IEEE Transactions on Signal Processing, vol. 52, no. 10, pp. 2992–3006, 2004.

[31] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Transactions on Information Theory, vol. 19, pp. 471–480, 1973.

[32] S. Pradhan and K. Ramchandran, "Distributed source coding using syndromes (DISCUS): design and construction," IEEE Transactions on Information Theory, vol. 49, no. 3, pp. 626–643, 2003.

[33] Y. Lindell and B. Pinkas, "Privacy preserving data mining," in Proceedings of the 20th Annual International Cryptology Conference (CRYPTO '00), vol. 1880 of Lecture Notes in Computer Science, pp. 36–54, Santa Barbara, Calif, USA, August 2000.

[34] R. Agrawal and R. Srikant, "Privacy-preserving data mining," ACM SIGMOD Record, vol. 29, no. 2, pp. 439–450, 2000.

[35] C. Clifton, M. Kantarcioglu, J. Vaidya, X. Lin, and M. Y. Zhu, "Tools for privacy preserving distributed data mining," ACM SIGKDD Explorations Newsletter, vol. 4, no. 2, pp. 28–34, 2002.

[36] M. Kantarcioglu and J. Vaidya, "Privacy preserving naive Bayes classifier for horizontally partitioned data," in Proceedings of the IEEE Workshop on Privacy Preserving Data Mining (ICDM '03), pp. 3–9, Melbourne, Fla, USA, November 2003.

[37] B. Pinkas, "Cryptographic techniques for privacy-preserving data mining," ACM SIGKDD Explorations Newsletter, vol. 4, no. 2, pp. 12–19, 2002.

[38] V. S. Verykios, E. Bertino, I. N. Fovino, L. P. Provenza, Y. Saygin, and Y. Theodoridis, "State-of-the-art in privacy preserving data mining," ACM SIGMOD Record, vol. 33, no. 1, pp. 50–57, 2004.

[39] A. K. Jain, M. N. Murty, and P. J. Flynn, "Data clustering: a review," ACM Computing Surveys, vol. 31, no. 3, pp. 264–323, 1999.

[40] G. Jagannathan and R. N. Wright, "Privacy-preserving distributed k-means clustering over arbitrarily partitioned data," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '05), pp. 593–599, ACM Press, Chicago, Ill, USA, August 2005.

[41] B. Goethals, S. Laur, H. Lipmaa, and T. Mielikainen, "On secure scalar product computation for privacy-preserving data mining," in Proceedings of the 7th Annual International Conference in Information Security and Cryptology (ICISC '04), vol. 3506 of Lecture Notes in Computer Science, pp. 104–120, Springer, Seoul, Korea, December 2004.

[42] A. C. Yao, "How to generate and exchange secrets," in Proceedings of the 27th Annual Symposium on Foundations of Computer Science, pp. 162–167, Toronto, Ontario, Canada, October 1986.

[43] K. Goldberg, T. Roeder, D. Gupta, and C. Perkins, "Eigentaste: a constant time collaborative filtering algorithm," Information Retrieval, vol. 4, no. 2, pp. 133–151, 2001.

[44] J. D. M. Rennie and N. Srebro, "Fast maximum margin matrix factorization for collaborative prediction," in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), pp. 713–720, Bonn, Germany, August 2005.

[45] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, "Application of dimensionality reduction in recommender systems," in Proceedings of the Web Mining for E-Commerce—Challenges and Opportunities (WEBKDD '00), Boston, Mass, USA, August 2000.

[46] J. F. Canny, "Collaborative filtering with privacy," in Proceedings of the IEEE Symposium on Security and Privacy, pp. 45–57, Berkeley, Calif, USA, May 2002.

[47] T. Pedersen, "A threshold cryptosystem without a trusted party," in Proceedings of the Workshop on the Theory and Application of Cryptographic Techniques (EUROCRYPT '91), vol. 547 of Lecture Notes in Computer Science, pp. 522–526, Brighton, UK, April 1991.

[48] J. F. Canny, "Collaborative filtering with privacy via factor analysis," in Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '02), pp. 238–245, ACM Press, Tampere, Finland, August 2002.

[49] H. Polat and W. Du, "Privacy-preserving collaborative filtering using randomized perturbation techniques," in Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM '03), pp. 625–628, IEEE Computer Society, Melbourne, Fla, USA, November 2003.

[50] H. Polat and W. Du, "SVD-based collaborative filtering with privacy," in Proceedings of the 20th Annual ACM Symposium on Applied Computing (SAC '05), vol. 1, pp. 791–795, Santa Fe, NM, USA, March 2005.

[51] Z. Huang, W. Du, and B. Chen, "Deriving private information from randomized data," in Proceedings of the ACM International Conference on Management of Data (SIGMOD '05), pp. 37–48, ACM Press, Baltimore, Md, USA, June 2005.

[52] H. Kargupta, S. Datta, Q. Wang, and K. Sivakumar, "On the privacy preserving properties of random data perturbation techniques," in Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM '03), pp. 99–106, Melbourne, Fla, USA, November 2003.

[53] S. Zhang, J. Ford, and F. Makedon, "A privacy-preserving collaborative filtering scheme with two-way communication," in Proceedings of the 7th ACM Conference on Electronic Commerce (EC '06), pp. 316–323, Ann Arbor, Mich, USA, June 2006.

[54] M. Barni and F. Bartolini, Watermarking Systems Engineering: Enabling Digital Assets Security and Other Applications, Marcel Dekker, New York, NY, USA, 2004.

[55] I. J. Cox, M. L. Miller, and J. A. Bloom, Digital Watermarking, Morgan Kaufmann, San Francisco, Calif, USA, 2001.

[56] I. J. Cox, J. Kilian, F. T. Leighton, and T. Shamoon, "Secure spread spectrum watermarking for multimedia," IEEE Transactions on Image Processing, vol. 6, no. 12, pp. 1673–1687, 1997.

[57] M. Barni and F. Bartolini, "Data hiding for fighting piracy," IEEE Signal Processing Magazine, vol. 21, no. 2, pp. 28–39, 2004.

[58] S. Emmanuel and M. Kankanhalli, "Copyright protection for MPEG-2 compressed broadcast video," in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '01), pp. 206–209, Tokyo, Japan, August 2001.

[59] J. Crowcroft, C. Perkins, and I. Brown, "A method and apparatus for generating multiple watermarked copies of an information signal," WO Patent No. 00/56059, 2000.

[60] R. Parviainen and P. Parnes, "Large scale distributed watermarking of multicast media through encryption," in Proceedings of the IFIP TC6/TC11 International Conference on Communications and Multimedia Security Issues of the New Century, vol. 192, pp. 149–158, Darmstadt, Germany, May 2001.

[61] D. Kundur and K. Karthik, "Video fingerprinting and encryption principles for digital rights management," Proceedings of the IEEE, vol. 92, no. 6, pp. 918–932, 2004.

[62] A. Lemma, S. Katzenbeisser, M. Celik, and M. van der Veen, "Secure watermark embedding through partial encryption," in Proceedings of the 5th International Workshop on Digital Watermarking (IWDW '06), vol. 4283 of Lecture Notes in Computer Science, pp. 433–445, Jeju Island, Korea, November 2006.

[63] R. J. Anderson and C. Manifavas, "Chameleon—a new kind of stream cipher," in Proceedings of the 4th International Workshop on Fast Software Encryption (FSE '97), vol. 1267, pp. 107–113, Springer, Haifa, Israel, January 1997.

[64] A. Adelsbach, U. Huber, and A.-R. Sadeghi, "Fingercasting—joint fingerprinting and decryption of broadcast messages," in Proceedings of the 11th Australasian Conference on Information Security and Privacy (ACISP '06), vol. 4058 of Lecture Notes in Computer Science, pp. 136–147, Springer, Melbourne, Australia, July 2006.

[65] M. Celik, A. Lemma, S. Katzenbeisser, and M. van der Veen, "Secure embedding of spread spectrum watermarks using look-up tables," in Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), vol. 2, pp. 153–156, IEEE Press, Honolulu, Hawaii, USA, April 2007.

[66] N. Memon and P. W. Wong, "A buyer-seller watermarking protocol," IEEE Transactions on Image Processing, vol. 10, no. 4, pp. 643–649, 2001.

[67] F. Ahmed, F. Sattar, M. Y. Siyal, and D. Yu, "A secure watermarking scheme for buyer-seller identification and copyright protection," EURASIP Journal on Applied Signal Processing, vol. 2006, Article ID 56904, 15 pages, 2006.

[68] M. Kuribayashi and H. Tanaka, "Fingerprinting protocol for images based on additive homomorphic property," IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2129–2139, 2005.

[69] C.-L. Lei, P.-L. Yu, P.-L. Tsai, and M.-H. Chan, "An efficient and anonymous buyer-seller watermarking protocol," IEEE Transactions on Image Processing, vol. 13, no. 12, pp. 1618–1626, 2004.

[70] J. Zhang, W. Kou, and K. Fan, "Secure buyer-seller watermarking protocol," IEE Proceedings—Information Security, vol. 153, no. 1, pp. 15–18, 2006.

[71] J. J. Eggers, J. K. Su, and B. Girod, "Public key watermarking by eigenvectors of linear transforms," in Proceedings of the European Signal Processing Conference (EUSIPCO '00), Tampere, Finland, September 2000.

[72] T. Furon and P. Duhamel, "An asymmetric public detection watermarking technique," in Proceedings of the 3rd International Workshop on Information Hiding (IH '99), vol. 1768 of Lecture Notes in Computer Science, pp. 88–100, Springer, Dresden, Germany, September-October 2000.

[73] J. J. Eggers, J. K. Su, and B. Girod, "Asymmetric watermarking schemes," in Proceedings of the Sicherheit in Mediendaten, GMD Jahrestagung, Berlin, Germany, September 2000.

[74] S. A. Craver and S. Katzenbeisser, "Security analysis of public-key watermarking schemes," in Mathematics of Data/Image Coding, Compression, and Encryption IV, with Applications, vol. 4475 of Proceedings of SPIE, pp. 172–182, San Diego, Calif, USA, July 2001.

[75] S. Craver, "Zero knowledge watermark detection," in Proceedings of the 3rd International Workshop on Information Hiding (IH '99), vol. 1768 of Lecture Notes in Computer Science, pp. 101–116, Springer, Dresden, Germany, September-October 1999.

[76] A. Adelsbach and A.-R. Sadeghi, "Zero-knowledge watermark detection and proof of ownership," in Proceedings of the 4th International Workshop on Information Hiding (IH '01), vol. 2137 of Lecture Notes in Computer Science, pp. 273–288, Springer, Pittsburgh, Pa, USA, April 2001.

[77] A. Adelsbach, M. Rohe, and A.-R. Sadeghi, "Non-interactive watermark detection for a correlation-based watermarking scheme," in Proceedings of the 9th IFIP TC-6 TC-11 International Conference on Communications and Multimedia Security (CMS '05), vol. 3677 of Lecture Notes in Computer Science, pp. 129–139, Springer, Salzburg, Austria, September 2005.

[78] I. Damgard and E. Fujisaki, "A statistically-hiding integer commitment scheme based on groups with hidden order," in Proceedings of the 8th International Conference on the Theory and Application of Cryptology and Information Security (ASIACRYPT '02), Y. Zheng, Ed., vol. 2501 of Lecture Notes in Computer Science, pp. 125–142, Springer, Queenstown, New Zealand, December 2002.

[79] J. R. Troncoso-Pastoriza and F. Perez-Gonzalez, "Efficient non-interactive zero-knowledge watermark detector robust to sensitivity attacks," in Security, Steganography, and Watermarking of Multimedia Contents IX, P. W. Wong and E. J. Delp, Eds., vol. 6505 of Proceedings of SPIE, pp. 1–12, San Jose, Calif, USA, January 2007.

[80] M. Malkin and T. Kalker, "A cryptographic method for secure watermark detection," in Proceedings of the 8th International Workshop on Information Hiding (IH '06), vol. 4437 of Lecture Notes in Computer Science, pp. 26–41, Springer, Alexandria, Va, USA, July 2006.

[81] A. Piva, V. Cappellini, D. Corazzi, A. De Rosa, C. Orlandi, and M. Barni, "Zero-knowledge ST-DM watermarking," in Security, Steganography, and Watermarking of Multimedia Contents VIII, vol. 6072 of Proceedings of SPIE, pp. 291–301, San Jose, Calif, USA, January 2006.

Hindawi Publishing Corporation
EURASIP Journal on Information Security
Volume 2007, Article ID 37343, 11 pages
doi:10.1155/2007/37343

Research Article
Oblivious Neural Network Computing via Homomorphic Encryption

C. Orlandi,1 A. Piva,1 and M. Barni2

1 Department of Electronics and Telecommunications, University of Florence, Via S. Marta 3, 50139 Firenze, Italy
2 Department of Information Engineering, University of Siena, Via Roma 56, 53100 Siena, Italy

Correspondence should be addressed to C. Orlandi, [email protected]

Received 27 March 2007; Accepted 1 June 2007

Recommended by Stefan Katzenbeisser

The problem of secure data processing by means of a neural network (NN) is addressed. Secure processing refers to the possibility that the NN owner does not get any knowledge about the processed data since they are provided to him in encrypted format. At the same time, the NN itself is protected, given that its owner may not be willing to disclose the knowledge embedded within it. The considered level of protection ensures that the data provided to the network and the network weights and activation functions are kept secret. Particular attention is given to prevent any disclosure of information that could bring a malevolent user to get access to the NN secrets by properly inputting fake data to any point of the proposed protocol. With respect to previous works in this field, the interaction between the user and the NN owner is kept to a minimum with no resort to multiparty computation protocols.

Copyright © 2007 C. Orlandi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Recent advances in signal and information processing, together with the possibility of exchanging and transmitting data through flexible and ubiquitous transmission media such as the Internet and wireless networks, have opened the way towards a new kind of services whereby a provider sells its ability to process and interpret data remotely, for example, through a web service. Examples in this sense include interpretation of medical data for remote diagnosis, access to remote databases, processing of personal data, and processing of multimedia documents. In addition to technological developments in artificial intelligence, multimedia processing and data interpretation, and to an easy and cheap access to the communication channel, the above services call for the adoption of security measures that ensure that the information provided by the users and the knowledge made available by the service providers are adequately protected.

Most of the currently available solutions for secure manipulation of signals apply some cryptographic primitives on top of the signal processing modules, so as to prevent the leakage of critical information. In most cases, however, it is assumed that the involved parties trust each other, and thus the cryptographic layer is used only to protect the data against third parties. In the new application scenarios outlined above, however, this is only rarely the case, since the data owner usually does not trust the processing devices, or those actors required to manipulate the data. It is clear that the availability of signal processing algorithms that work directly on the encrypted data would represent a powerful solution to the security problems described above.

A fundamental brick of modern artificial intelligence theory is represented by neural networks (NNs), which, thanks to their approximation and generalization capabilities [1], are a universal tool enabling a great variety of applications. For this reason, in this paper we introduce a protocol whereby a user may ask a service provider to run a neural network on an input provided in encrypted format. The twofold goal is, on one side, to ensure that the data provided by the user, representing the input of the neural network, are adequately protected and, on the other side, to protect the knowledge (expertise) of the service provider embedded within the NN. It is worth pointing out that the scope of our protocol is not to preserve user anonymity. Specifically, the latter goal is achieved by protecting the weights of the network arcs, together with the parameters defining the neuron activation functions. The proposed protocol relies on homomorphic encryption principles (first introduced in [2]) whereby a few elementary operations can be performed directly in the encrypted domain. For those tasks that cannot be handled


by means of homomorphic encryption, a limited amount of interaction between the NN owner and the user is introduced; however, in contrast to previous works in the general area of privacy preserving data mining [3], the interaction is kept to a minimum and no resort to sophisticated multiparty computation protocols [4, 5] is made. Great attention is paid to avoiding any unnecessary disclosure of information, so that at the end of the protocol the user only knows the final NN output, whereas all internal computations are kept secret. In this way, the possibility for a malevolent user to provide a set of fake inputs properly selected to disclose the network secrets is prevented. A solution is also sketched that permits obfuscating the network topology; however, a deeper investigation in this direction is left for future research.

The rest of this paper is organized as follows. In Section 2, the prior art in the general field of privacy preserving and oblivious computing is reviewed, and the peculiarities of our novel protocol are discussed. In Section 3, the cryptographic primitives our scheme relies on are presented. The details of the protocol we propose for oblivious NN computing are described in Section 4, where a single perceptron is studied, and in Section 5, where the whole multilayer feedforward network is analyzed. Section 6 is devoted to the discussion raised by the necessity of approximating real numbers by integer values (given that the adopted cryptosystem works only with integer values, while NN computations are usually carried out by considering real numbers). Section 7 is devoted to the experimental results obtained by developing a distributed application that runs the protocol. Some concluding remarks are given in Section 8.

2. PRIOR ART

In modern society, great amounts of data are collected and stored by different entities. Some of these entities may benefit from cooperating with each other. For example, two medical institutions may want to perform joint research on their data; another example is a patient who needs a diagnosis from a medical institute that has the knowledge needed to perform the diagnosis. Of course, those entities want to get the maximum advantage from the cooperation, but they cannot (or do not want to) let the other party know the data they own. Usually they cannot disclose personal data due to privacy-related laws, and at the same time they would like to keep their knowledge for business reasons.

A trivial solution to protect the data owned by the participants in the computation consists in resorting to a trusted third party (TTP) that actually carries out the computation on the inputs received from the two parties and sends them the corresponding output. A privacy preserving protocol achieves the same goal without the participation of a TTP, in such a way that each player can only learn from the protocol execution the same information he/she could get from his/her own inputs and the output received from the TTP.

In 2000, two different papers proposed the notion of privacy preserving data mining, meaning the possibility to perform data analysis on a distributed database under some privacy constraints. Lindell and Pinkas [6] presented a way to securely and efficiently compute a decision tree using cryptographic protocols; at the same time, Agrawal and Srikant [7] presented another solution to the same problem using data randomization, that is, by adding noise to customers' data.

After the publication of these papers, the interest in privacy preserving cooperative computation has grown. In particular, several techniques from machine learning were converted to the multiparty scenario, where several parties contribute to some kind of computation while preserving the security of the data provided by each of them. Solutions for the following algorithms were proposed: decision trees [6], neural networks [8], SVM [9], naive Bayes classifiers [10], belief networks [11, 12], and clustering [13]. In all these works, we can identify two major scenarios: in the first one, Alice and Bob share a dataset and want to extract knowledge from it without revealing their own data (privacy preserving data mining). In the other scenario, the one considered in this paper, Alice owns her private data x, while Bob owns an evaluation function C (in most cases C is a classifier); Alice would like to have her data processed by Bob, but she does not want Bob to learn either her input or the output of the computation. At the same time, Bob does not want to reveal the exact form of C, representing his knowledge, since, for instance, he sells a classification service through the web (as in the remote medical diagnosis example). This second scenario is usually referred to as oblivious computing.

Cooperative privacy preserving computing is closely related to secure multiparty computation (SMC), that is, a scenario where Alice owns x, Bob owns y, and they want to compute a public function f(·) of their inputs without revealing them to each other. At the end of the protocol, Alice and Bob will learn nothing except f(x, y). The roots of SMC lie in a work by Yao [14] proposing a solution to the millionaire problem, in which two millionaires want to find out which of them is richer without revealing the amount of their wealth. Later on, Yao [15] presented a constant-round protocol for privately computing any probabilistic polynomial-time function. The main idea underlying this protocol is to express the function f as a circuit of logical gates, and then perform a secure computation for each gate. It is clear that this general solution is unfeasible in situations where the parties own huge quantities of data or the functions to be evaluated are complex.

After these early papers extensively relying on SMC, more efficient primitives for privacy preserving computing were developed, based on homomorphic encryption schemes [16], which permit carrying out a limited set of elementary operations, like additions or multiplications, in the encrypted domain. In this way, a typical scheme for privacy preserving computing consists in a first phase where each party performs the part of the computation that he can do by himself (possibly by relying on a suitable homomorphic cryptosystem). Then the interactive part of the protocol starts, with protocol designers trying to perform as much as they can in an efficient way. At the end, the operations for which an efficient protocol is not known (like division, maximum finding,

C. Orlandi et al. 3

etc.) are carried out by resorting to the general solution by Yao.

Previous works on privacy preserving NN computing are limited to the systems presented in [8, 17]. However, the first study resorts extensively to SMC for the computation of the nonlinear activation functions implemented in the neurons, and hence is rather cumbersome. On the other hand, the protocol proposed in [17] may leak some information at the intermediate stages of the computation; in fact, the output of all the intermediate neurons is made available to the data owner, making it rather easy for a malevolent user to disclose the NN weights by feeding each neuron with properly chosen inputs. This is not the case with our new protocol, which conceals all the intermediate NN computations and does not resort to SMC for the evaluation of the activation functions. In a nutshell, the owner of the NN (say Bob) performs all the linear computations in the encrypted domain and delegates the user (say Alice) to compute the nonlinear functions (threshold, sigmoid, etc.). Before doing so, however, Bob obfuscates the input of the activation functions so that Alice does not learn anything about what she is computing.

When designing an SMC protocol, it is necessary to take into account the possible behavior of the participants in the protocol. Cryptographic design usually considers two possible behaviors: a participant is defined semihonest or passive if he follows the protocol correctly, but tries to learn additional information by analyzing the messages exchanged during the protocol execution; he is defined malicious or active if he arbitrarily deviates from the protocol specifications. In this work, as in most of the protocols mentioned above, the semihonest model is adopted. Let us note, however, that a protocol secure for semihonest users can always be transformed into a protocol secure against malicious participants by requiring each party to use zero-knowledge protocols to guarantee that they are correctly following the specifications of the scheme.

3. CRYPTOGRAPHIC PRIMITIVES

In this section, the cryptographic primitives used to build the proposed protocol are described.

3.1. Homomorphic and probabilistic encryption

To implement our protocol, we need an efficient homomorphic and probabilistic public key encryption scheme.

Given a set of possible plaintexts M, a set of ciphertexts C, and a set of key pairs K = PK × SK (public keys and secret keys), a public key encryption scheme is a pair of functions Epk : M → C, Dsk : C → M such that Dsk(Epk(m)) = m (where m ∈ M) and such that, given a ciphertext c ∈ C, it is computationally unfeasible to determine m such that Epk(m) = c without knowing the secret key sk.

To perform linear computations (i.e., scalar products), we need an encryption scheme that satisfies the additive homomorphic property, according to which, given two plaintexts m1 and m2 and a constant value a, the following equalities are satisfied:

Dsk(Epk(m1) · Epk(m2)) = m1 + m2,
Dsk(Epk(m1)^a) = a · m1.  (1)

Another feature that we need is that the encryption scheme does not encrypt two equal plaintexts into the same ciphertext, since we have to encrypt a lot of 0s and 1s, given that the output of the thresholding and sigmoid activation functions is likely to be zero or one in most cases. For this purpose, we can define a scheme where the encryption function Epk is a function of both the secret message x and a random parameter r, such that if r1 ≠ r2 we have Epk(x, r1) ≠ Epk(x, r2) for every secret message x. Let c1 = Epk(x, r1) and c2 = Epk(x, r2); for a correct behavior we also need that Dsk(c1) = Dsk(c2) = x, that is, the decryption phase does not depend on the random parameter r. We will refer to a scheme that satisfies the above property as a probabilistic scheme. This idea was first introduced in [18]. Luckily, homomorphic and probabilistic encryption schemes do exist. Specifically, in our implementation we adopted the homomorphic and probabilistic scheme presented by Paillier in [16], and later modified by Damgård and Jurik in [19].

3.2. Paillier cryptosystem

The cryptosystem described in [16], usually referred to as the Paillier cryptosystem, is based on the problem of deciding whether a number is an nth residue modulo n^2. This problem is believed to be computationally hard in the cryptography community, and is related to the hardness of factoring n, if n is the product of two large primes.

Let us now explain what an nth residue is and how it can be used to encrypt data. The notation we use is the classic one, with n = pq indicating the product of two large primes, Z_n the set of the integers modulo n, and Z*_n the set of invertible elements modulo n, that is, all the integers that are relatively prime with n. As usual, the cardinality of the latter set is indicated by |Z*_n| and is equal to Euler's totient function φ(n).

Definition 1. z ∈ Z*_{n^2} is said to be an nth residue modulo n^2 if there exists a number y ∈ Z*_{n^2} such that z = y^n mod n^2.

Conjecture 1. The problem of deciding nth residuosity, that is, distinguishing nth residues from non-nth residues, is computationally hard.

The Paillier cryptosystem builds on the following facts from number theory.

(1) The map

εg : Z_n × Z*_n → Z*_{n^2},
(m, y) → g^m y^n mod n^2,  (2)

with g ∈ Z*_{n^2} an element whose order is a multiple of n, is a bijection.

(2) We define the class of c ∈ Z*_{n^2} as the unique m ∈ Z_n for which there exists y ∈ Z*_n such that c = g^m y^n mod n^2.

This is the ciphering function, where (g, n) represents the public key, m the plaintext, and c the ciphertext. Note that y can be randomly selected to obtain different values of c that belong to the same class. This ensures the probabilistic nature of the Paillier cryptosystem.

Let us now describe the deciphering phase, that is, how we can decide the class of c from the knowledge of the factorization of n.

(1) It is known that |Z*_n| = φ(n) = (p − 1)(q − 1) and |Z*_{n^2}| = φ(n^2) = nφ(n). Define λ(n) = lcm(p − 1, q − 1) (least common multiple).

(2) This leads to, for all x ∈ Z*_{n^2},

(i) x^{λ(n)} = 1 mod n,
(ii) x^{nλ(n)} = 1 mod n^2.  (3)

(3) From (ii), (x^{λ(n)})^n = 1 mod n^2, so x^{λ(n)} is an nth root of unity, and from (i) we learn that we can write it as 1 + an for some a ∈ Z_n. So g^{λ(n)} can be written as 1 + an mod n^2.

(4) Note that for every element of the form 1 + an it is true that (1 + an)^b mod n^2 = (1 + abn) mod n^2. So (g^{λ(n)})^m = (1 + amn) mod n^2.

(5) Consider c^{λ(n)} = g^{mλ(n)} y^{nλ(n)}: we have that g^{mλ(n)} = 1 + amn mod n^2 from (4) and y^{nλ(n)} = 1 mod n^2 from (ii). We obtain that c^{λ(n)} = 1 + amn mod n^2.

(6) So we can compute c^{λ(n)} = 1 + amn mod n^2 and g^{λ(n)} = 1 + an mod n^2.

(7) With the function L(x) = (x − 1)/n, by computing L(c^{λ(n)} mod n^2) and L(g^{λ(n)} mod n^2) we can simply recover am and a, and obtain am · a^{−1} = m mod n.

Note that this is only an effort to make the Paillier cryptosystem understandable using simple facts. For a complete treatment, we refer to the original paper [16] or to [20].

In summary, the encryption and decryption procedures are the following.

Setup

Select two big primes p, q. λ = lcm(p − 1, q − 1) is the private key. Let n = pq and g ∈ Z*_{n^2} an element of order αn for some α ≠ 0. (n, g) is the public key.

Encryption

Let m < n be the plaintext and r < n a random value. The encryption c of m is

c = Epk(m) = g^m r^n mod n^2.  (4)

Decryption

Let c < n^2 be the ciphertext. The plaintext m hidden in c is

m = Dsk(c) = L(c^λ mod n^2) / L(g^λ mod n^2) mod n,  (5)

where L(x) = (x − 1)/n.
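As a concrete illustration, the setup, encryption, and decryption steps above can be sketched in Python. The primes below are toy values chosen only for readability (a real deployment would use primes of at least 1024 bits each), and g is fixed to n + 1, one standard choice of an element whose order is a multiple of n:

```python
import math
import random

# Toy parameters for illustration only: a real deployment would use
# primes of at least 1024 bits each. These two primes are arbitrary
# illustrative choices.
p, q = 999_983, 1_000_003
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)       # private key: lambda = lcm(p-1, q-1)
g = n + 1                          # public element whose order is n

def L(x):
    return (x - 1) // n

def encrypt(m, r=None):
    # Probabilistic encryption (4): c = g^m * r^n mod n^2.
    # A fresh random r gives a different ciphertext for the same m.
    if r is None:
        while True:
            r = random.randrange(1, n)
            if math.gcd(r, n) == 1:
                break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

mu = pow(L(pow(g, lam, n2)), -1, n)  # precomputed 1 / L(g^lam) mod n

def decrypt(c):
    # Decryption (5): m = L(c^lam mod n^2) / L(g^lam mod n^2) mod n.
    return (L(pow(c, lam, n2)) * mu) % n

# Correctness, the homomorphic properties of equation (1), and the
# probabilistic property:
m1, m2, a = 123, 456, 7
assert decrypt(encrypt(m1)) == m1
assert decrypt(encrypt(m1) * encrypt(m2) % n2) == m1 + m2
assert decrypt(pow(encrypt(m1), a, n2)) == a * m1
assert encrypt(m1) != encrypt(m1)   # same plaintext, different ciphertexts
```

Note how the two homomorphic equalities of (1) fall out of ordinary modular arithmetic: multiplying ciphertexts adds plaintexts, and raising a ciphertext to a power multiplies the plaintext by that power.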

3.3. Generalized Paillier cryptosystem

In [19], the authors present a generalized and simplified version of the Paillier cryptosystem. This version is based on the complexity of deciding nth residuosity modulo n^{s+1}, and includes the original Paillier cryptosystem as a special case when s = 1.

The way it works is almost the same as in the original version, except that:

(i) the plaintexts are picked from Z_{n^s}, and the ciphertexts are in Z*_{n^{s+1}};
(ii) g is always set to 1 + n (which has order n^s);
(iii) the decryption phase is quite different.

The main advantage of this cryptosystem is that the only parameter to be fixed is n, while s can be adjusted according to the plaintext. In other words, unlike other cryptosystems, where one has to choose the plaintext m to be less than n, here one can choose an m of arbitrary size and then adjust s so that n^s > m; the only requirement on n is that it must be unfeasible to find its factorization.

The tradeoff between security and arithmetic precision is a crucial issue in secure signal processing applications. As we will describe later, a cryptosystem that offers the possibility to work with arbitrary precision allows us to neglect the fact that the cryptosystem works on integers in modular arithmetic; from now on, therefore, we will describe our protocol as if we had a homomorphic cryptosystem that works on real numbers approximated with arbitrary precision. A detailed discussion of this claim is given in Section 6.

3.4. Private scalar product protocol

A secure protocol for the scalar product allows Bob to compute an encrypted version of the scalar product ⟨·, ·⟩ between an encrypted vector c = Epk(x) given by Alice and a vector y owned by Bob. The protocol guarantees that Bob learns nothing about Alice's input, while Alice learns nothing except the output of the computation, namely an encrypted version of the scalar product that she can decrypt with her private key. As described in [21], many protocols have been proposed for this task.

Here we use a protocol based on an additively homomorphic encryption scheme (see Algorithm 1).

After receiving z, Alice can decrypt this value with her private key to discover the output of the computation. By using the notation (inputA; inputB) → (outputA; outputB), the above protocol can be written as (c = Epk(x); y) → (z = Epk(⟨x, y⟩); ∅), where ∅ denotes that Bob gets no output.

Input: (c = Epk(x); y)
Output: (z = Epk(⟨x, y⟩); ∅)
PSPP(c; y)
(1) Bob computes z = ∏_{i=1}^{N} c_i^{y_i}
(2) Bob sends z to Alice

Algorithm 1
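The single homomorphic step of Algorithm 1 can be sketched as follows; the compact Paillier helpers and the toy primes are illustrative assumptions, not part of the protocol specification:

```python
import math
import random

# Compact toy Paillier with small illustrative primes (see Section 3.2).
p, q = 999_983, 1_000_003
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)   # with g = n + 1, L(g^lam mod n^2) = lam

def enc(m):
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

def pspp(c, y):
    # Bob's side of Algorithm 1: z = prod_i c_i^{y_i} mod n^2 is an
    # encryption of <x, y>; Bob never sees x in the clear.
    z = 1
    for ci, yi in zip(c, y):
        z = z * pow(ci, yi, n2) % n2
    return z

x = [3, 1, 4, 1, 5]                  # Alice's private vector
y = [2, 7, 1, 8, 2]                  # Bob's private vector
c = [enc(xi) for xi in x]            # Alice sends Epk(x) to Bob
z = pspp(c, y)                       # Bob returns z to Alice
assert dec(z) == sum(a * b for a, b in zip(x, y))   # <x, y> = 35
```

Each factor c_i^{y_i} multiplies y_i into the ith plaintext, and the product of ciphertexts sums the results, which is exactly the additive property (1) applied term by term.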

It is worth observing that, though the above protocol is secure in a cryptographic sense, some knowledge about Bob's secrets is implicitly leaked through the output of the protocol itself. If, for instance, Alice can interact N times with Bob (where N = |x| = |y| is the size of the input vectors), she can completely recover Bob's vector, by simply setting the input of the ith iteration to the vector with all 0s and a 1 in the ith position, for i = 1, . . . , N. This observation does not contradict the cryptographic notion of secure multiparty computation, since a protocol is defined secure if what the parties learn during the protocol is only what they learn from the output of the computation. However, if we use the scalar product protocol described above to build more sophisticated protocols, we must be aware of this leakage of information. In the following, we will refer to this way of disclosing secret information as a sensitivity attack, after the name of a similar kind of attack usually encountered in watermarking applications [22, 23]. Note that the problems stemming from sensitivity attacks are often neglected in the privacy preserving computing literature.

3.5. Malleability

The homomorphic property that allows us to produce meaningful transformations of the plaintext by modifying the ciphertext also allows an attacker to exploit it for a malicious purpose.

In our application, one can imagine a competitor of Bob who wants to discredit Bob's ability to process data, and thus adds random noise to all data exchanged between Alice and Bob, making the output of the computation meaningless. Alice and Bob have no way to discover that such an attack was carried out, because if the attacker knows Alice's public key, he can transform the ciphertext in the same way that Bob can, so Alice cannot distinguish between honest homomorphic computations made by Bob and malicious manipulations of the ciphertext performed by the attacker. This is a well-known drawback of every protocol that uses homomorphic encryption to realize secure computation. This kind of attack is called a malleability attack [24]. To prevent attackers from maliciously manipulating the content of the messages exchanged between Alice and Bob, the protocol, like any other protocol based on homomorphic encryption, should be run over a secure channel.

4. PERCEPTRON

We are now ready to describe how to use the Paillier cryptosystem and the private scalar product protocol to build a protocol for oblivious neural network computation. We start by describing the instance of a single neuron, in order to clarify how the weighted sum followed by the activation function shaping the neuron can be securely computed.

Figure 1: A perceptron is a binary classifier that performs a weighted sum of the inputs x1, . . . , xm by means of the weights w1, . . . , wm, followed by an activation function usually implemented by a threshold operation.

A single neuron in a NN is usually referred to as a perceptron. A perceptron (see Figure 1) is a binary classifier that performs a weighted sum of the inputs x1, . . . , xm by means of the weights w1, . . . , wm, followed by an activation function (usually a threshold operation). So if y = ∑_{i=1}^{m} x_i w_i, the output of the perceptron will be

τ(y, δ) = { 1 if y ≥ δ,
            0 if y < δ.  (6)

We also address the case where the activation function is a sigmoid function; in this case the output of the perceptron is

σ(y, α) = 1 / (1 + e^{−αy}).  (7)

This function is widely used in feedforward multilayer NNs because of the following relation:

dσ(x, α)/dx = ασ(x, α)(1 − σ(x, α)),  (8)

which is easily computable and simplifies the execution of the backpropagation training algorithm [25].

In the proposed scenario, the data are distributed as follows: Alice owns her private input x, Bob owns the weights w, and at the end only Alice obtains the output. Alice will provide her vector in encrypted format (c = Epk(x)) and will receive the output in encrypted form. We already showed how to compute an encrypted version of y, the scalar product between x and w. Let us now describe how this computation can be linked to the activation function in order to obtain a secure protocol (c; w, δ) → (Epk(τ(⟨x, w⟩, δ)); ∅) in the case of a threshold activation function, or (c; w, α) → (Epk(σ(⟨x, w⟩, α)); ∅) in the case of the sigmoid activation function. In order to avoid any leakage of information, an obfuscation step is introduced to cover the scalar product and the parameters of the activation function.

Input: (c; w, δ)
Output: (Epk(τ(⟨x, w⟩, δ)); ∅)
PerceptronThreshold(c; w, δ)
(1) Bob computes y = ∏_{i=1}^{m} c_i^{w_i}
(2) Bob computes γ = (y · Epk(−δ))^a with a random a > 0
(3) Bob sends γ to Alice
(4) Alice's output is 1 if Dsk(γ) ≥ 0; else it is 0

Algorithm 2

4.1. Secure threshold evaluation

What we want here is that Alice discovers the output of the comparison without knowing the terms that are compared. Moreover, Bob cannot perform such a computation by himself, as thresholding is a highly nonlinear function, so homomorphic encryption cannot help here. The solution we propose is to obfuscate the terms of the comparison and give them to Alice in such a way that Alice can compute the correct output without knowing the real values of the input. To be specific, let us note that τ(y, δ) = τ(f(y − δ), 0) for every function f such that sign(f(x)) = sign(x). So Bob only needs to find a randomly chosen function, computable in the encrypted domain, that transforms y − δ into a value indistinguishable from a purely random value while keeping the sign unaltered. In our protocol, the adopted function is f(x) = ax with a > 0. Thanks to the homomorphic property of the cryptosystem, Bob can efficiently compute

Epk(⟨x, w⟩ − δ)^a ∼ Epk(a(⟨x, w⟩ − δ)),  (9)

where ∼ means that they contain the same plaintext. Next, Bob sends this encrypted value to Alice, who can decrypt the message and check whether a(⟨x, w⟩ − δ) ≥ 0. Obviously, this gives Alice no information on the true values of ⟨x, w⟩ and δ. In summary, the protocol for the secure evaluation of the perceptron is shown in Algorithm 2.
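A sketch of Algorithm 2 under the same toy Paillier parameters used earlier; the signed decoding (plaintexts above n/2 read as negatives) and the bound on the blinding factor a are implementation assumptions needed to make the sign test meaningful:

```python
import math
import random

# Toy Paillier parameters (illustrative primes only; see Section 3.2).
p, q = 999_983, 1_000_003
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)

def enc(m):
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2

def dec_signed(c):
    # Plaintexts above n/2 are read as negative numbers, so the sign of
    # a(<x, w> - delta) survives decryption. Assumption: all values stay
    # far below n/2 in magnitude, so no modular wraparound occurs.
    v = (pow(c, lam, n2) - 1) // n * mu % n
    return v - n if v > n // 2 else v

def perceptron_threshold(c, w, delta):
    # Bob's side of Algorithm 2.
    y = 1
    for ci, wi in zip(c, w):
        y = y * pow(ci, wi, n2) % n2          # y = Epk(<x, w>)
    a = random.randrange(1, 1000)             # random blinding factor a > 0
    return pow(y * enc(-delta) % n2, a, n2)   # Epk(a(<x, w> - delta))

x = [2, 5, 1]       # Alice's private input
w = [3, -1, 4]      # Bob's secret weights; here <x, w> = 5
delta = 4           # Bob's secret threshold
gamma = perceptron_threshold([enc(xi) for xi in x], w, delta)
output = 1 if dec_signed(gamma) >= 0 else 0   # Alice sees only the sign
assert output == 1
```

Alice learns only the sign of a(⟨x, w⟩ − δ); the random factor a hides the magnitude, which is exactly the obfuscation described above.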

4.2. Secure sigmoid evaluation

The main idea underlying the secure evaluation of the sigmoid function is similar to that used for thresholding. Even in this case, we note that σ(y, α) depends only on the product of its two inputs: if yα = y′α′, then σ(y, α) = σ(y′, α′). So what Bob can do to prevent Alice from discovering the output of the scalar product and the parameter α of the sigmoid is to give Alice the product of those values, which Bob can compute in the encrypted domain and which contains the same information as the output of the sigmoid function. In fact, as the sigmoid function can easily be inverted, the amount of information provided by σ(y, α) is the same as that provided by αy. The solution we propose, then, is the following: by exploiting again the homomorphic property of the cryptosystem, Bob computes Epk(y)^α ∼ Epk(αy). Alice can decrypt the received value and compute the output of the sigmoid function. The protocol for the sigmoid-shaped perceptron is shown in Algorithm 3.

Input: (c; w, α)
Output: (Epk(σ(⟨x, w⟩, α)); ∅)
PerceptronSigmoid(c; w, α)
(1) Bob computes y = ∏_{i=1}^{m} c_i^{w_i}
(2) Bob computes η = y^α
(3) Bob sends η to Alice
(4) Alice decrypts η and computes her output σ(Dsk(η), 1)

Algorithm 3
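The observation that σ depends on its inputs only through their product can be checked directly; the numeric values below are arbitrary:

```python
import math

def sigmoid(y, alpha):
    # Equation (7): sigma(y, alpha) = 1 / (1 + e^(-alpha * y))
    return 1.0 / (1.0 + math.exp(-alpha * y))

# sigma depends on (y, alpha) only through the product alpha * y, so
# revealing alpha * y to Alice discloses exactly as much as
# sigma(y, alpha) itself: she evaluates sigma(alpha * y, 1).
for y, alpha in [(0.8, 2.5), (-1.3, 0.7), (0.0, 4.0)]:
    assert abs(sigmoid(y, alpha) - sigmoid(alpha * y, 1.0)) < 1e-12
```

This is why step (4) of Algorithm 3 evaluates the sigmoid with parameter 1: Alice holds αy but can recover neither α nor y individually.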

4.3. Security against sensitivity attacks

Before passing to the description of the protocol for the computation of a whole NN, it is instructive to discuss the sensitivity attack at the perceptron level. Let us first consider the case of a threshold activation function: in this case, the perceptron is nothing but a classifier whose decision regions are separated by a hyperplane with coefficients given by the vector w. Even if Alice does not have access to the intermediate value ⟨x, w⟩, she can still infer some useful information about w by proceeding as follows. She feeds the perceptron with a set of random sequences until she finds two sequences lying in different decision regions, that is, for one sequence the output of the perceptron is one, while for the other it is zero. Then Alice applies a bisection algorithm to obtain a vector that lies on the border between the decision regions. By iterating the above procedure, Alice can easily find m points belonging to the hyperplane separating the two decision regions of the perceptron, hence she can infer the values of the m unknowns contained in w. In the case of a sigmoid activation function, the situation is even worse, since Alice only needs to observe m + 1 values of the product αy to determine the m + 1 unknowns (w1, w2, . . . , wm; α).

Note that it is impossible to prevent the sensitivity attacks described above by working at the perceptron level, since at the end of the protocol the output of the perceptron is the minimum amount of disclosed information. As will be outlined in the next section, this is not the case when we are interested in using the perceptron as an intermediate step of a larger neural network.

5. MULTILAYER FEEDFORWARD NETWORK

A multilayer feedforward network is composed of n layers, each having m_i neurons (i = 1, . . . , n). The network is then composed of N = ∑_{i=1}^{n} m_i neurons. Every neuron is identified by two indexes: the superscript refers to the layer the neuron belongs to, and the subscript to its position in the layer (e.g., w^2_3 indicates the weight vector of the third neuron in the second layer, while its components are referred to as w^2_{3,j}). An example of such a network is given in Figure 2. The input of each neuron in the ith layer is the weighted sum of the outputs of the neurons of the (i − 1)th layer. The input of the first layer of the NN is Alice's vector, while the output of the last layer is the desired output of the computation. Each neuron that is not an output neuron is called a hidden neuron.

Figure 2: This network has n = 3 layers. The network has three inputs, and all layers are composed of two neurons (m1 = m2 = m3 = 2). The network is thus composed of six neurons (N = 6). Let us note that the input neurons are not counted, as they do not perform computation. For the sake of simplicity, the weight vector of every neuron is represented inside the neuron, and not on the edges.

In addition to protecting the weights of the NN, as described in the previous section, the protocol is designed to also protect the output of each neuron. In fact, the simple composition of N privacy preserving perceptrons would disclose some side information (the output of the hidden neurons) that could be used by Alice to run a sensitivity attack at each NN node.

The solution adopted to solve this problem is that Bob does not delegate Alice to compute the real output of the hidden perceptrons, but an apparently random output, so that, as will be clarified later, the input of each neuron of the ith layer will not be directly the weighted sum of the outputs of the neurons of the (i − 1)th layer, but an obfuscation of them. To be specific, let us focus on the threshold activation function; in this case, every neuron will output a 0 or a 1. The threshold function is antisymmetric with respect to (0, 1/2), as shown in Figure 3. That is, we have that y ≥ δ ⇒ −y ≤ −δ, or equivalently:

τ(−y, −δ) = 1 − τ(y, δ).  (10)

Then, if Bob changes the sign of the inputs of the threshold with probability 0.5, he changes the output of the computation with the same probability, and Alice computes an apparently random output according to her view. She then encrypts this value and sends it to Bob, who can flip it again in the encrypted domain, so that the input to the next layer will be correct.

The sigmoid is also antisymmetric with respect to (0, 1/2), since we have that 1/(1 + e^{−αy}) = 1 − 1/(1 + e^{αy}), or equivalently:

σ(−y, α) = 1 − σ(y, α),  (11)

then if Bob flips the product inputs with probability 0.5, the sign of the value that Bob sends to Alice will again be apparently random. Alice will still be able to use this value to compute the output of the activation function, which will appear random to her. However, Bob can retrieve the correct output, since he knows whether he changed the sign of the inputs of the activation function or not. Note that Bob can flip the sign of one or both of the inputs of τ or σ in the encrypted domain, and he can also retrieve the real output while still working in the encrypted domain, since he can do this by means of simple linear operations (multiplication by 1 or −1 and subtractions).

Figure 3: Both the threshold and the sigmoid function are antisymmetric with respect to (0, 1/2), that is, τ(−y, −δ) = 1 − τ(y, δ) and σ(−y, α) = 1 − σ(y, α).
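Both antisymmetry relations, equations (10) and (11), can be verified numerically; the sample points below are arbitrary (note that (10) requires y ≠ δ, a tie that occurs with probability zero for real-valued inputs):

```python
import math

def tau(y, delta):
    # Equation (6): threshold activation.
    return 1 if y >= delta else 0

def sigmoid(y, alpha):
    # Equation (7): sigmoid activation.
    return 1.0 / (1.0 + math.exp(-alpha * y))

# Equation (10): flipping the signs of both threshold inputs flips the
# output bit (checked on points with y != delta; the boundary case
# y == delta has probability zero for real-valued inputs).
for y in (-3.0, -0.5, 0.0, 0.5, 3.0):
    for delta in (-1.0, 0.3, 2.0):
        assert tau(-y, -delta) == 1 - tau(y, delta)

# Equation (11): the sigmoid is antisymmetric about (0, 1/2).
for y in (-2.0, -0.1, 0.0, 1.5):
    assert abs(sigmoid(-y, 0.7) - (1.0 - sigmoid(y, 0.7))) < 1e-12
```

These identities are exactly what lets Bob flip signs before delegating the activation function to Alice and still recover the true output afterwards.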

5.1. Multilayer network with threshold

We are now ready to describe the final protocol for a multilayer feedforward neural network whose neurons use the threshold as activation function. The privacy preserving perceptron protocol presented before is extended by adding an input for Bob; using the following notation, given ξ ∈ {+, −}, we define

PerceptronThreshold(c; w, δ, ξ) → (Epk(τ(ξ⟨x, w⟩, ξδ)); ∅).

Alice's encrypted input vector will be the input for the first layer, that is, c^1 = c. With this new definition, we obtain the protocol shown in Algorithm 4.

To understand the security of the protocol, let us note that if Bob flips the sign of the input of the threshold with probability 1/2, Alice does not learn anything from the computation of the threshold function, hence achieving perfect security according to Shannon's definition. In fact, it is like performing a one-time pad on the neuron output bit. This is not true in the case of the sigmoid, for which an additional step must be added.
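The homomorphic flip-back that makes this one-time-pad trick reversible (step (8) of Algorithm 4, Epk(1) · c^{−1}) can be demonstrated with the same toy Paillier parameters used earlier; these small primes are purely illustrative:

```python
import math
import random

# Toy Paillier parameters (illustrative primes only; see Section 3.2).
p, q = 999_983, 1_000_003
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)

def enc(m):
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

# Step (8) of Algorithm 4: Bob undoes his own sign flip without ever
# decrypting. Epk(1) * Epk(b)^(-1) mod n^2 is an encryption of 1 - b,
# so a neuron output bit that was flipped on Alice's side is flipped
# back in the encrypted domain.
for b in (0, 1):
    flipped = enc(1) * pow(enc(b), -1, n2) % n2
    assert dec(flipped) == 1 - b
```

Inverting a ciphertext negates its plaintext, and multiplying by Epk(1) adds one, so the combination computes 1 − b, a purely linear operation as noted above.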

Input: (c; w^i_j, δ^i_j) with i = 1, . . . , n, j = 1, . . . , m_i
Output: (Epk(z); ∅), where z is the output of the last layer of the network
PPNNThreshold(c; w^i_j, δ^i_j)
(1) c^1 = c
(2) for i = 1 to n − 1
(3)   for j = 1 to m_i
(4)     Bob picks ξ ∈ {+, −} at random
(5)     (Epk(τ(ξ⟨x^i, w^i_j⟩, ξδ^i_j)); ∅) = PerceptronThreshold(c^i; w^i_j, δ^i_j, ξ)
(6)     Alice decrypts the encrypted output and computes the new input x^{i+1}_j, sending back to Bob c^{i+1}_j = Epk(x^{i+1}_j)
(7)     if ξ = “−”
(8)       Bob sets c^{i+1}_j = Epk(1) · (c^{i+1}_j)^{−1}
(9) // the last layer does not obfuscate the output
(10) for j = 1 to m_n
(11)   (z_j; ∅) = PerceptronThreshold(c^n; w^n_j, δ^n_j, +)

Algorithm 4

5.2. Multilayer network with sigmoid

Even in this case, we need to extend the perceptron protocol presented before by adding an input that allows Bob to flip the sigmoid input:

PerceptronSigmoid(c; w, α, ξ) → (Epk(σ(ξ⟨x, w⟩, α)); ∅).

At this point, we must consider that, while the threshold function gives only one bit of information, and the flipping operation carried out by Bob completely obfuscates Alice's view, the case of the sigmoid is quite different: if Bob flips the inputs with probability 0.5, Alice will not learn whether the input of the sigmoid was originally positive or negative, but she will learn the product ±αy. This was not a problem in the perceptron case, as knowing z or this product is equivalent (due to the invertibility of the sigmoid function). In the multilayer case, instead, it gives Alice more information than she needs, and this surplus of information could be used to perform a sensitivity attack.

Our idea to cope with this attack at the node level is to randomly scramble the order of the neurons in the layer for every execution of the protocol, except for the last one. If layer i has m_i neurons, we can scramble them in m_i! different ways. We will call π_{r_i} the random permutation used for layer i, depending on some random seed r_i (where i = 1, ..., n − 1), so that the protocol will have a further input r. Evidently, the presence of the scrambling operator prevents Alice from performing a successful sensitivity attack. In summary, the protocol for the evaluation of a multilayer network with sigmoid activation function, using the same notation as the threshold case, is shown in Algorithm 5.

5.3. Sensitivity attack

Before concluding this section, let us go back to the sensitivity attack. Given that the intermediate values of the computation are not revealed, a sensitivity attack is possible only at the whole-network level. In other words, Alice could consider the NN as a parametric function with the parameters corresponding to the NN weights, and apply a sensitivity attack to it. Very often, however, a multilayer feedforward NN implements a complicated, hard-to-invert function, so that discovering all the parameters of the network by considering it as a black box requires a very large number of interactions. To avoid this kind of attack, then, we can simply assume that Bob limits the number of queries that Alice can ask, or require that Alice pays an amount of money for each query.

5.4. Protecting the network topology

As a last requirement, Bob may desire that Alice does not learn anything about the NN topology. Though, strictly speaking, this is a very ambitious goal, Bob may distort Alice's perception of the NN by randomly adding some fake neurons to the hidden layers of the network, as shown in Figure 4. As the weights are kept secret, Bob can set the inbound weights of each fake neuron at random. At the same time, Bob has to reset the outbound weights, so that the fake neurons will not change the final result of the computation. The algorithms that we obtain by considering this last modification are equal to those described so far, the only difference being in the topology of Bob's NN. Note that for networks with sigmoid activation functions, adding fake neurons will also increase the number of random permutations that can be applied to avoid sensitivity attacks.
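The fact that zero outbound weights make the fake neurons inert is easy to verify on a toy network (the shapes and the sigmoid layer below are illustrative choices of ours, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(3)

def layer(x, W):
    # one fully connected layer with sigmoid activation
    return 1.0 / (1.0 + np.exp(-(W @ x)))

# real network: 4 inputs -> 3 hidden neurons -> 1 output
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(1, 3))
x = rng.normal(size=4)
out = layer(layer(x, W1), W2)

# Bob adds two fake hidden neurons: random inbound weights,
# zero outbound weights, so the final output cannot change
W1_obf = np.vstack([W1, rng.normal(size=(2, 4))])
W2_obf = np.hstack([W2, np.zeros((1, 2))])
out_obf = layer(layer(x, W1_obf), W2_obf)

assert np.allclose(out, out_obf)  # fake neurons are inert
```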

6. HANDLING NONINTEGER VALUES

At the end of Section 3.3 we made the assumption that the Paillier encryption scheme, notably the Damgard-Jurik extension, works properly on noninteger values and satisfies the additive homomorphic properties on such kind of data, in order to simplify the analysis reported in the subsequent sections. Rigorously speaking, this is not true. We now analyze more formally every step of the proposed protocols, showing that the assumption we made in Section 3.3 is a reasonable one.

To start with, let us remember that the Damgard-Jurik cryptosystem allows us to work on integers in the range {0, ..., n^s − 1}. First of all, we map, in the classic way, the positive numbers into {0, ..., (n^s − 1)/2} and the negative ones into {(n^s − 1)/2 + 1, ..., n^s − 1}, with −1 = n^s − 1. Then, given a real value x ∈ R, we can quantize it with a quantization factor Q,


C. Orlandi et al. 9

Input: (c; w_j^i, α_j^i, r) with i = 1···n, j = 1···m_i
Output: (E_pk(z); ∅), where z is the output of the last layer of the network

PPNNSigmoid(c; w_j^i, α_j^i, r)
(1) for i = 1 to n − 1
(2)   Bob permutes the neuron positions in layer i using the random permutation π_{r_i}
(3) // now the network is scrambled, and the protocol follows as before
(4) x^1 = x
(5) for i = 1 to n − 1
(6)   for j = 1 to m_i
(7)     Bob picks ξ ∈ {+, −} at random
(8)     (E_pk(σ(ξ⟨x^i, w_j^i⟩, α_j^i)); ∅) = PerceptronSigmoid(c^i; w_j^i, α_j^i, ξ)
(9)     Alice decrypts the encrypted output and computes the new input x^{i+1}, sending back to Bob c_j^{i+1} = E_pk(x_j^{i+1})
(10)    if ξ = "−"
(11)      Bob sets c_j^{i+1} = E_pk(1) · (c_j^{i+1})^{−1}
(12) // last layer does not obfuscate the output
(13) for j = 1 to m_n
(14)   (z_j; ∅) = PerceptronSigmoid(c^n; w_j^n, α_j^n, +)

Algorithm 5

Figure 4: To obfuscate the number and position of the hidden neurons, Bob randomly adds fake neurons to the NN. Fake neurons do not affect the output of the computation, as their outbound weights are set to 0. Inbound weights are dotted, as they are meaningless.

and approximate it as x̄ = ⌊x/Q⌉ ≈ x/Q for a sufficiently fine quantization factor. Clearly, the first homomorphic property still stands, that is,

D_sk(E_pk(x̄₁) · E_pk(x̄₂)) = x̄₁ + x̄₂ ≈ (x₁ + x₂)/Q.  (12)

This allows Bob to perform an arbitrary number of sums among ciphertexts. The second property also holds, but with a drawback. In fact,

D_sk(E_pk(x̄)^ā) = ā · x̄ ≈ a · x/Q².  (13)

The presence of the Q² factor has two important consequences:

(1) the size of the encrypted numbers grows exponentially with the number of multiplications;
(2) Bob must disclose to Alice the number of multiplications, so that Alice can compensate for the presence of the Q² factor.

The first drawback is addressed by the Damgard-Jurik cryptosystem, which allows us, by increasing s, to encrypt bigger numbers. The second one imposes a limit on the kind of secure computation that we can perform using the techniques proposed here.
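The quantized encoding and both homomorphic properties, including the Q² factor of (13), can be illustrated with a textbook Paillier implementation (the Damgard-Jurik case s = 1); the 128-bit toy primes and helper names below are our choices for the demo, far too small for real security:

```python
import math
import random

random.seed(42)

def is_prime(p, rounds=20):
    # Miller-Rabin primality test
    if p < 2 or p % 2 == 0:
        return p == 2
    d, r = p - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for _ in range(rounds):
        x = pow(random.randrange(2, p - 1), d, p)
        if x in (1, p - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, p)
            if x == p - 1:
                break
        else:
            return False
    return True

def rand_prime(bits):
    while True:
        p = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_prime(p):
            return p

# textbook Paillier: plaintexts in Z_n, ciphertexts in Z_{n^2}, g = n + 1
p, q = rand_prime(128), rand_prime(128)
n = p * q
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
mu = pow(lam, -1, n)  # valid decryption helper because g = n + 1

def enc(m):
    r = random.randrange(1, n)
    return pow(n + 1, m % n, n * n) * pow(r, n, n * n) % (n * n)

def dec(c):
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    return m if m <= n // 2 else m - n  # recover negative values

Q = 0.001                     # quantization factor
quant = lambda x: round(x / Q)

x1, x2, a = 3.25, -1.5, 2.0
# property (12): product of ciphertexts decrypts to the sum, scale 1/Q
assert abs(dec(enc(quant(x1)) * enc(quant(x2)) % (n * n)) * Q - (x1 + x2)) < 1e-9
# property (13): one multiplication introduces the 1/Q^2 scale
assert abs(dec(pow(enc(quant(x1)), quant(a), n * n)) * Q * Q - a * x1) < 1e-9
```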

We give here an upper bound for the biggest integer that can be encrypted, which forces us to select an appropriate parameter s for the Damgard-Jurik cryptosystem.

In the neural network protocol, the maximum number of multiplications performed on a quantized number is equal to two: the first in the scalar product protocol, and the second with a randomly selected number in the secure threshold evaluation or with the α parameter in the secure sigmoid evaluation. Assume that the random values and the α parameters are bounded by R.

Let X be an upper bound for the norm of Alice's input vector, and W an upper bound for the norm of the weight vectors. Every scalar product computed in the protocol is then bounded by |x| · |w| cos(xw) ≤ XW. Given a modulus n sufficiently large for security purposes, we have to select s such that

s ≥ ⌈ log_n (2XWR/Q²) ⌉,  (14)

where the factor 2 is due to the presence of both positive and negative values.
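As a quick sanity check of (14), the hypothetical helper below computes the smallest admissible s, taking n ≈ 2^modulus_bits; with a 1024-bit modulus and Q = 10⁻⁶, moderate bounds X, W, R give s = 1, consistent with the experimental setup of Section 7:

```python
import math

def required_s(modulus_bits, X, W, R, Q):
    # smallest s with n**s >= 2*X*W*R / Q**2, approximating n by 2**modulus_bits
    bound_bits = math.log2(2 * X * W * R / Q ** 2)
    return math.ceil(bound_bits / modulus_bits)

# 1024-bit n, inputs/weights/randomizers bounded by 100, Q = 1e-6
assert required_s(1024, 100, 100, 100, 1e-6) == 1
```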

Other solutions for working with noninteger values can be found in [8], where a protocol to evaluate a polynomial on floating-point numbers is defined (but the exponent must be chosen in advance), and in [26], where a sophisticated cryptosystem based on lattice properties, allowing computation with rational values, is presented (even in this case, however,


a bound exists on the number of multiplications that can be carried out to allow a correct decryption).

7. IMPLEMENTATION OF THE PROTOCOL

In this section, a practical implementation of the proposed protocol is described, and a case-study execution is analyzed, giving some numerical results in terms of the computational and bandwidth resources needed.

7.1. Client-server application

We developed a client-server application based on the Java remote method invocation technology.¹ The application, based on the implementation of the Damgard-Jurik cryptosystem available on Jurik's homepage,² is composed of two methods, one for the initialization of the protocol (where the public key and public parameters are chosen) and one for the evaluation of every layer of neurons.

7.2. Experimental data

From the UCI machine learning repository,³ we selected the data set chosen by Gorman and Sejnowski in their study about the classification of sonar signals by means of a neural network [27]. The task is to obtain a network able to discriminate between sonar signals bounced off a metal cylinder and those bounced off a roughly cylindrical rock. Following the authors' results, we trained a NN with 12 hidden neurons and sigmoid activation function with the standard backpropagation algorithm, obtaining an accuracy of 99.8% on the training set and 84.7% on the test set.

7.3. Experimental setup

To protect our network, we embedded it in a network made of 5 layers of 15 neurons each, obtaining a high level of security, as the ratio of real neurons to the total number is really low: 12/75 = 0.16. The public key n is 1024 bits long, and the s parameter has been set to 1, without any problem even for a very fine quantization factor Q = 10⁻⁶. We then initialized every fake neuron with connections from every input neuron, so that they look the same as the real ones, setting the weights of the connections at random. Finally, we deployed the application on two mid-level notebooks connected on a LAN.

The execution of the whole process took 11.7 seconds, of which 9.3 were spent on the server side, with a communication overhead of 76 kb. Let us note that no attempt was made to optimize the execution time and, as seen, the client computation is negligible. These results confirm the practical possibility of running a neural network on an input provided in encrypted format.

¹ http://java.sun.com/javase/technologies/core/basic/rmi
² http://www.daimi.au.dk/~jurik/research.html
³ http://www.ics.uci.edu/~mlearn/MLRepository.html

8. CONCLUSIONS

In artificial intelligence applications, it is of crucial importance that the owner of a specific expertise can be asked to apply its knowledge to process some data without violating the privacy of the data owner. In this framework, the possibility of processing data and signals directly in the encrypted domain is an invaluable tool, upon which secure and privacy-preserving protocols can be built. Given the central role that neural network computing plays in modern artificial intelligence applications, we devoted our attention to NN-based privacy-preserving computation, where the knowledge embedded in the NN as well as the data the NN operates on are protected. The proposed protocol relies on homomorphic encryption; for those tasks that cannot be handled by means of homomorphic encryption, a limited amount of interaction between the NN owner and the user is introduced; however, in contrast to previous works, the interaction is kept to a minimum, without resorting to multiparty computation protocols. Any unnecessary disclosure of information has been avoided, keeping all the internal computations secret, so that at the end of the protocol the user only knows the final output of the NN. Future research will focus on investigating the security of the network topology obfuscation proposed here, and on the design of more efficient obfuscation strategies. Moreover, the possibility of training the network in its encrypted form will also be studied.

ACKNOWLEDGMENTS

The work described in this paper has been supported in part by the European Commission through the IST Programme under Contract no. 034238-SPEED. The information in this document reflects only the authors' views, is provided as is, and no guarantee or warranty is given that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability.

REFERENCES

[1] K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.

[2] R. L. Rivest, L. Adleman, and M. L. Dertouzos, "On data banks and privacy homomorphisms," in Foundations of Secure Computation, pp. 169–178, Academic Press, New York, NY, USA, 1978.

[3] B. Pinkas, "Cryptographic techniques for privacy-preserving data mining," ACM SIGKDD Explorations Newsletter, vol. 4, no. 2, pp. 12–19, 2002.

[4] O. Goldreich, S. Micali, and A. Wigderson, "How to play any mental game or a completeness theorem for protocols with honest majority," in Proceedings of the 19th Annual ACM Symposium on Theory of Computing (STOC '87), pp. 218–229, ACM Press, New York, NY, USA, May 1987.

[5] D. Chaum, C. Crepeau, and I. Damgard, "Multiparty unconditionally secure protocols," in Proceedings of the 20th Annual ACM Symposium on Theory of Computing (STOC '88), pp. 11–19, ACM Press, Chicago, Ill, USA, May 1988.

[6] Y. Lindell and B. Pinkas, "Privacy preserving data mining," in Proceedings of the 20th Annual International Cryptology Conference on Advances in Cryptology (CRYPTO '00), vol. 1880 of Lecture Notes in Computer Science, pp. 36–54, Santa Barbara, Calif, USA, August 2000.

[7] R. Agrawal and R. Srikant, "Privacy-preserving data mining," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 439–450, ACM Press, Dallas, Tex, USA, May 2000.

[8] Y.-C. Chang and C.-J. Lu, "Oblivious polynomial evaluation and oblivious neural learning," Theoretical Computer Science, vol. 341, no. 1–3, pp. 39–54, 2005.

[9] S. Laur, H. Lipmaa, and T. Mielikainen, "Cryptographically private support vector machines," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '06), pp. 618–624, ACM Press, Philadelphia, Pa, USA, August 2006.

[10] M. Kantarcioglu and J. Vaidya, "Privacy preserving naive Bayes classifier for horizontally partitioned data," in Proceedings of the Workshop on Privacy Preserving Data Mining, Melbourne, Fla, USA, November 2003.

[11] Z. Yang and R. N. Wright, "Improved privacy-preserving Bayesian network parameter learning on vertically partitioned data," in Proceedings of the 21st International Conference on Data Engineering Workshops (ICDEW '05), p. 1196, IEEE Computer Society, Tokyo, Japan, April 2005.

[12] R. Wright and Z. Yang, "Privacy-preserving Bayesian network structure computation on distributed heterogeneous data," in Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '04), pp. 713–718, ACM Press, Seattle, Wash, USA, August 2004.

[13] G. Jagannathan and R. N. Wright, "Privacy-preserving distributed k-means clustering over arbitrarily partitioned data," in Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (KDD '05), pp. 593–599, ACM Press, Chicago, Ill, USA, August 2005.

[14] A. C. Yao, "Protocols for secure computations," in Proceedings of the 23rd Annual Symposium on Foundations of Computer Science, pp. 160–164, Chicago, Ill, USA, November 1982.

[15] A. Yao, "How to generate and exchange secrets," in Proceedings of the 27th Annual Symposium on Foundations of Computer Science (FOCS '86), pp. 162–167, Toronto, Ontario, Canada, October 1986.

[16] P. Paillier, "Public-key cryptosystems based on composite degree residuosity classes," in Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT '99), vol. 1592 of Lecture Notes in Computer Science, pp. 223–238, Springer, Prague, Czech Republic, May 1999.

[17] M. Barni, C. Orlandi, and A. Piva, "A privacy-preserving protocol for neural-network-based computation," in Proceedings of the 8th Multimedia and Security Workshop (MM & Sec '06), pp. 146–151, ACM Press, Geneva, Switzerland, September 2006.

[18] S. Goldwasser and S. Micali, "Probabilistic encryption," Journal of Computer and System Sciences, vol. 28, no. 2, pp. 270–299, 1984.

[19] I. Damgard and M. Jurik, "A generalisation, a simplification and some applications of Paillier's probabilistic public-key system," in Proceedings of the 4th International Workshop on Practice and Theory in Public Key Cryptography (PKC '01), pp. 119–136, Cheju Island, Korea, February 2001.

[20] D. Catalano, The bit security of Paillier's encryption scheme and a new, efficient, public key cryptosystem, Ph.D. thesis, Universita di Catania, Catania, Italy, 2002.

[21] B. Goethals, S. Laur, H. Lipmaa, and T. Mielikainen, "On private scalar product computation for privacy-preserving data mining," in Proceedings of the 7th Annual International Conference on Information Security and Cryptology (ICISC '04), pp. 104–120, Seoul, Korea, December 2004.

[22] I. J. Cox and J.-P. M. G. Linnartz, "Public watermarks and resistance to tampering," in Proceedings of the 4th IEEE International Conference on Image Processing (ICIP '97), vol. 3, pp. 3–6, Santa Barbara, Calif, USA, October 1997.

[23] T. Kalker, J.-P. M. G. Linnartz, and M. van Dijk, "Watermark estimation through detector analysis," in Proceedings of the IEEE International Conference on Image Processing (ICIP '98), vol. 1, pp. 425–429, Chicago, Ill, USA, October 1998.

[24] D. Dolev, C. Dwork, and M. Naor, "Nonmalleable cryptography," SIAM Journal on Computing, vol. 30, no. 2, pp. 391–437, 2000.

[25] T. M. Mitchell, Machine Learning, McGraw-Hill, New York, NY, USA, 1997.

[26] P.-A. Fouque, J. Stern, and J.-G. Wackers, "CryptoComputing with rationals," in Proceedings of the 6th International Conference on Financial Cryptography (FC '02), vol. 2357 of Lecture Notes in Computer Science, pp. 136–146, Southampton, Bermuda, March 2002.

[27] R. P. Gorman and T. J. Sejnowski, "Analysis of hidden units in a layered network trained to classify sonar targets," Neural Networks, vol. 1, no. 1, pp. 75–89, 1988.


Hindawi Publishing Corporation
EURASIP Journal on Information Security
Volume 2007, Article ID 45731, 14 pages
doi:10.1155/2007/45731

Research Article
Efficient Zero-Knowledge Watermark Detection with Improved Robustness to Sensitivity Attacks

Juan Ramon Troncoso-Pastoriza and Fernando Perez-Gonzalez

Signal Theory and Communications Department, University of Vigo, 36310 Vigo, Spain

Correspondence should be addressed to Juan Ramon Troncoso-Pastoriza, [email protected]

Received 28 February 2007; Revised 20 August 2007; Accepted 18 October 2007

Recommended by Stefan Katzenbeisser

Zero-knowledge watermark detectors presented to date are based on a linear correlation between the asset features and a given secret sequence. This detection function is susceptible to sensitivity attacks, against which zero-knowledge provides no protection. In this paper, an efficient zero-knowledge version of the generalized Gaussian maximum likelihood (ML) detector is introduced. This detector has shown an improved resilience against sensitivity attacks, which is empirically corroborated in the present work. Two versions of the zero-knowledge detector are presented; the first one makes use of two new zero-knowledge proofs for absolute value and square root calculation; the second is an improved version, applicable when the spreading sequence is binary, that has minimum communication complexity. Completeness, soundness, and zero-knowledge properties of the developed protocols are proved, and they are compared with previous zero-knowledge watermark detection protocols in terms of receiver operating characteristic, resistance to sensitivity attacks, and communication complexity.

Copyright © 2007 J. R. Troncoso-Pastoriza and F. Perez-Gonzalez. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Watermarking technology has emerged as a solution for authorship proofs or dispute resolving. In these applications, there are several requirements that watermarking schemes must fulfill: they must be imperceptible, robust to attacks that try to erase a legally inserted watermark or to embed an illegal watermark in some asset, and secure against the disclosure of information that could allow the breakage of the whole system by unauthorized parties.

The schemes that have been used up to now are symmetric, as they employ the same key for watermark embedding and watermark detection; thus, this key must be given to the party that runs the detector, which in most cases is not trusted. In order to satisfy the security requirements, two approaches have been proposed: the first one, called asymmetric watermarking, follows the paradigm of asymmetric cryptosystems, and employs different keys for embedding and detection; the second approach, zero-knowledge watermarking, makes use of zero-knowledge (ZK) protocols [1] in order to get a secure communication layer over a pre-existent symmetric protocol. In zero-knowledge watermark detection [2],

a prover P tries to demonstrate to a verifier V the presence of a watermark in a given asset. Commitment schemes [3] are used to conceal the secret information, so that detection is performed without providing to V any information additional to the presence of the watermark.

Nevertheless, such minimum disclosure of information still allows for blind sensitivity attacks [4], which have arisen as very harmful attacks for methods that present simple detection boundaries. The ZK detection protocols presented to date, by Adelsbach and Sadeghi [2] and Piva et al. [5], are based on correlation detectors, for which blind sensitivity attacks are especially efficient.

In this paper, a new zero-knowledge blind watermark detection protocol is presented; it is based on the spread-spectrum detector by Hernandez et al. [6], which is optimal for additive watermarking in generalized Gaussian distributed host features (e.g., AC DCT coefficients of images). The robustness to sensitivity attacks comes from the complexity of the detection boundary for certain shape factors. Thus, when combined with zero-knowledge, it becomes secure and robust. This protocol will be compared, in terms of performance and efficiency, with the previous ZK protocols based


on additive spread-spectrum and Spread-Transform Dither Modulation (ST-DM), and rewritten in a form that greatly improves its communication and computation complexity.

The rest of the paper is organized as follows. In Section 2, some basics about zero-knowledge and watermark detection are reviewed, and the three studied detectors are compared, pointing out the improved robustness of the GG detector against sensitivity attacks. In Section 3, the needed ZK subprotocols are enumerated, along with their communication complexity and a detailed description of the developed proofs. Sections 4 and 5 detail the complete detection protocol and the improved version for a binary antipodal spreading sequence. Section 6 presents the security analysis for these protocols; complexity and implementation concerns are discussed in Section 7. Finally, some conclusions are drawn in Section 8.

2. NOTATION AND PREVIOUS CONCEPTS

In this section, some of the concepts needed for the development of the studied protocols are briefly introduced. Boldface lower-case letters will denote column vectors of length L, whereas boldface capital letters are used for matrices, and scalar variables will be denoted by italicized letters. Upper-case calligraphic letters represent sets or parties participating in a protocol.

2.1. Cryptographic primitives

2.1.1. Commitment schemes

Commitment schemes [3] are cryptographic tools that, given a common public parameter par_com, allow one party of a protocol to choose a determined value m from a finite set M and commit to his choice C_m = Com(m, r, par_com), such that he cannot modify it during the rest of the protocol; the committed value is not disclosed to the other party, thanks to the randomization produced by r, which constitutes the secret information needed to open the commitment.

The required security properties that the commit function must fulfill are binding and hiding; the first one guarantees that, once a commitment C_m to a message m is produced, the committer cannot open it to a different message m′; the second one guarantees that the distributions of the commitments to different messages are indistinguishable, so a commitment does not reveal any information about the concealed message. Each of these properties can be achieved either computationally or in an information-theoretic sense, but the information-theoretic version cannot be obtained for both properties at the same time.

The commitment scheme used in the present work is the Damgard-Fujisaki scheme [7], which provides statistically hiding and computationally binding commitments, based on Abelian groups of hidden order. Given the security parameters F, B, T, and k, the common parameters are a modulus n (that can be obtained as an RSA modulus), such that the order of Z*_n can be upper bounded by 2^B, a generator h of a multiplicative subgroup of high order (the order must be F-rough) in Z*_n, and a value g = h^α, such that the committer

knows neither α nor the order of the subgroups. The commit function of a message x ∈ [−T, T] with a random value r ∈ [0, 2^(B+k)] takes the form C_x = g^x h^r mod n.

Additionally, this commitment scheme presents an additive homomorphism that allows computing the addition of two committed numbers (C_{x+y} = C_x · C_y mod n) and the product of a committed number and a public integer (C_{ax} = C_x^a mod n).
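Both homomorphisms can be checked directly on the commit function. The tiny parameters below are for illustration only; a real instantiation derives n as an RSA modulus and g = h^α as described above, with the committer ignorant of α and of the group order:

```python
# Toy Damgard-Fujisaki-style commitments: C_x = g^x * h^r mod n.
# n, h, and the exponent hidden inside g are illustrative values only.
n = 2581             # toy modulus (29 * 89); real schemes use RSA-size moduli
h = 4                # toy generator
g = pow(h, 1234, n)  # the committer must not know the discrete log of g

def commit(x, r):
    return pow(g, x, n) * pow(h, r, n) % n

x, rx = 7, 1021
y, ry = 12, 3344

# additive homomorphism: the product of commitments commits to the sum
assert commit(x, rx) * commit(y, ry) % n == commit(x + y, rx + ry)

# scalar homomorphism: exponentiation commits to the product with a public a
a = 5
assert pow(commit(x, rx), a, n) == commit(a * x, a * rx)
```

Note that the verifier never needs the openings (x, rx) to perform these operations; it manipulates the commitments alone.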

2.1.2. Interactive proof systems

Interactive proof systems were introduced by Goldwasser et al. [1]; they are two-party protocols in which a prover P tries to prove a statement x to a verifier V, and both can make random choices. The two main properties that an interactive protocol must satisfy are completeness and soundness; the first one guarantees that a correct prover P can prove all correct statements to a correct verifier V, and the second guarantees that a cheating prover P* will only succeed in proving a wrong statement with negligible probability.

A special class of interactive protocols are proofs of knowledge [8], in which the proved statement is the knowledge of a witness that makes a given binary relation output a true value, such that a probabilistic algorithm called a knowledge extractor exists that is able to output a witness for the common input x using any probabilistic polynomial-time prover P* as an oracle, in polynomial expected time (weak soundness).

2.1.3. Zero-knowledge protocols

In order for an interactive proof to be zero-knowledge [1], it must be such that the only knowledge disclosed to the verifier is the statement that is being proved. More formally, an interactive proof system (P, V) is statistically zero-knowledge if there exists a probabilistic polynomial algorithm (simulator) S_V such that the conversations produced by the real interaction between P and V are statistically indistinguishable from the outputs of S_V.

2.2. Blind watermark detection

Given a host signal x, a watermark w, and a pair of keys {K_emb, K_det} for embedding and detection (they are the same key in symmetric schemes), a digital blind watermark detection scheme consists of an embedder that outputs the watermarked signal y = Embed(x, w, K_emb) and a detector that takes as parameters a possibly attacked signal z = y + n, where n represents added noise, the watermark w, and the detection key K_det, and outputs a Boolean value indicating whether the signal z contains the watermark w, without using the original host data x.

Three detection algorithms will be compared in terms of their Receiver Operating Characteristic (ROC), namely, additive spread spectrum with a correlation-based detector (SS), spread-transform dither modulation without distortion compensation (ST-DM), and additive spread spectrum with a generalized Gaussian maximum likelihood (ML) detector (GG). In all of them, the host features x are considered



Figure 1: Block diagram of the watermark embedding process for ST-DM.

i.i.d. with variance σ²_X, the watermarked features are denoted by y = x + w, and z represents the input to the receiver, which may be corrupted with AWGN noise n, also considered i.i.d. with variance σ²_N. The binary hypothesis test that must be solved at the detector is

H0: z = x + n,
H1: z = x + w + n.  (1)

Table 1 summarizes the probabilities of false alarm (P_f) and missed detection (P_m) for the three detectors [9–11].

2.2.1. Additive spread spectrum with correlation-based detector

In SS, the watermark is generated as the product of a pseudorandom vector s, which we will consider a binary sequence with values {±1} (with norm ‖s‖² = L), and a perceptual mask α (assumed constant to simplify the analysis) that controls the tradeoff between imperceptibility and distortion (D_w = (1/L) Σ_{k=1}^{L} E{w²_k} = E{α²_k} = α²).

The maximum-likelihood detector for Gaussian distributed host features is a correlation-based detector:

r_z = (1/L) Σ_{k=1}^{L} z_k s_k ≷ η,  (2)

deciding H1 when r_z exceeds η and H0 otherwise, where η is a threshold that depends on the probabilities of false alarm (P_f) and missed detection (P_m), as indicated in Table 1.
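A numerical sketch of the SS embedder and the correlation detector of (2); the lengths, variances, and threshold below are arbitrary demo choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
L, alpha = 1000, 0.5                    # length and constant perceptual mask
s = rng.choice([-1.0, 1.0], size=L)     # binary spreading sequence, ||s||^2 = L
x = rng.normal(0.0, 1.0, size=L)        # Gaussian host features
y = x + alpha * s                       # additive SS embedding

def correlation_detect(z, s, eta):
    r_z = float(np.mean(z * s))         # statistic of eq. (2)
    return r_z > eta                    # H1 if above the threshold

eta = alpha / 2                         # threshold halfway between the means
assert correlation_detect(y, s, eta)        # watermarked signal: H1
assert not correlation_detect(x, s, eta)    # unmarked host: H0
```

With these parameters r_z concentrates around α under H1 and around 0 under H0, with standard deviation of order 1/√L, so the two hypotheses are well separated.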

2.2.2. Spread transform dither modulation

Given the host features x and the secret spreading sequence s, which will be considered here binary with values {±1}, the embedding of the watermark in ST-DM [12] (similar to quantized projection, QP [9, 10]) is done as indicated in Figure 1.

The host features x are correlated with the projection signal s, and the result (r_x) is quantized with a Euclidean scalar quantizer Q_Λ(·) of step Δ, which controls the distortion, and with centroids defined by the shifted lattice Λ ≜ ΔZ + Δ/2.


Figure 2: Block diagram of the watermark detection process for the GG detector.

Let ρ = Q_Λ(r_x) − r_x; then the watermarked vector is given by

y = x + w = x + (1/L) ρ s.  (3)

In order to detect the watermark, the host features, possibly degraded by AWGN noise n, are correlated with the spreading sequence s, and the resulting value r_z = Σ_{k=1}^{L} z_k s_k is quantized and compared to a threshold η to determine whether the watermark is present:

|Q_Λ(r_z) − r_z| ≶ η,  (4)

deciding H1 when the quantization error is below η and H0 otherwise.

Due to the Central Limit Theorem (CLT), the computed correlations can be accurately modeled by a Gaussian pdf.
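The embedding of (3) and the statistic of (4) can be checked in a few lines: in the absence of noise, the projection of the watermarked features lands exactly on a centroid, so the quantization error is (numerically) zero. The lengths and step size are demo choices:

```python
import numpy as np

rng = np.random.default_rng(1)
L, Delta = 1000, 1.0
s = rng.choice([-1.0, 1.0], size=L)     # binary spreading sequence
x = rng.normal(0.0, 1.0, size=L)        # host features

def q_lattice(r, Delta):
    # Euclidean scalar quantizer with centroids on Delta*Z + Delta/2
    return np.floor(r / Delta) * Delta + Delta / 2

r_x = float(x @ s)                      # projection onto s
rho = q_lattice(r_x, Delta) - r_x
y = x + rho * s / L                     # eq. (3); note ||s||^2 = L

r_z = float(y @ s)                      # r_z = r_x + rho = a centroid
err = abs(q_lattice(r_z, Delta) - r_z)  # statistic of eq. (4)
assert err < 1e-6                       # H1: error far below any sensible eta

r_host = float(x @ s)
err_host = abs(q_lattice(r_host, Delta) - r_host)
assert err_host <= Delta / 2            # unmarked error is only bounded by Delta/2
```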

2.2.3. Additive spread spectrum with generalized Gaussian features

Figure 2 shows the detection scheme for this case. The host features are assumed to be the DCT coefficients of an image, which justifies the generalized Gaussian model with the following pdf:

f_X(x) = A e^{−|βx|^c},
β = (1/σ) (Γ(3/c)/Γ(1/c))^{1/2},
A = βc / (2Γ(1/c)).  (5)

The embedding procedure is the same as the one described for SS. For detection, a preliminary perceptual analysis provides the estimation of the perceptual mask α that modulates the inserted secret sequence s. The parameters c and β are also estimated from the received features. The likelihood function for detection is

l(y) = Σ_k β^c (|Y_k|^c − |Y_k − α_k s_k|^c) ≷ η,  (6)

where η represents the threshold value used to make the decision (H1 when l(y) > η).


Table 1: Probabilities of false alarm (P_f) and missed detection (P_m) for the three studied detectors.

AddSS:
  P_f = Q(√L η / √(σ²_X + σ²_N))
  P_m = Q(√L (α − η) / √(σ²_X + σ²_N))

ST-DM:
  P_f = Σ_{i=−∞}^{∞} [Q((Δ(i + 1/2) − η) / √(L(σ²_X + σ²_N))) − Q((Δ(i + 1/2) + η) / √(L(σ²_X + σ²_N)))]
  P_m = 1 − Σ_{i=−∞}^{∞} [Q((iΔ − η) / (√L σ_N)) − Q((iΔ + η) / (√L σ_N))]

GG:
  P_f = Q((η + m₁)/σ₁)
  P_m = 1 − Q((η − m₁)/σ₁)

As shown in [6], the pdfs of l(Y) conditioned on hypotheses H0 and H1 are approximately Gaussian with the same variance σ²₁, and respective means −m₁ and m₁, which can be estimated from the watermarked image [6].

2.2.4. Comparison

The three detectors can be compared in terms of robustness through their Receiver Operating Characteristic (ROC), obtained from the formulas in Table 1. The correlation-based detector is only optimum when c = 2, and when c ≠ 2 the generalized Gaussian detector outperforms it; ST-DM can outperform both for a sufficiently high DWR (Data-to-Watermark Ratio, DWR = 10 log₁₀(σ²_X/σ²_W)), due to its host rejection capabilities. However, the performance of the generalized Gaussian detector and the ST-DM one are not far apart when c is near 1 and the DWR in the projected domain (DWR_p = DWR − 10 log₁₀ L) is low. Figure 3 shows a plot of the ROC for fixed DWR and WNR (Watermark-to-Noise Ratio, WNR = 10 log₁₀(σ²_W/σ²_N)), with a features shape parameter of c = 0.8, which has been chosen as an example of a relatively common value for the distribution of AC DCT coefficients of most images. It is remarkable that even when the exact c is not used, and it is below 1, the performance of the GG detector with c = 0.5 is much better than that of the correlation-based one, and its ROC remains near the ST-DM ROC.

Regarding the resilience against sensitivity attacks, it can be shown that the correlation-based detector and the ST-DM one make the watermarking scheme very easy to break when the attacker has access to the output of the detector, as the detection boundaries for both methods are just hyperplanes; Figure 4 shows the two-dimensional detection regions for each of the three methods. On the other hand, the detection function of the GG detector when c < 1 (Figure 4(c)) has the property that component-wise modifications produce bounded increments; that is, when modifying one component of the host signal Y, the increment produced in the likelihood function (6) is bounded by |α_k s_k|^c independently of the component |Y_k| if c < 1:

||Y_k|^c − |Y_k − α_k s_k|^c| ≤ |α_k s_k|^c.   (7)

This means that it is not possible to reach the boundary by modifying a single component (or a number N of components such that Σ_N |α_k s_k|^c is less than the gap to η), as opposed to a correlation detector, where making just one component large (or small) enough can move the signal out of the detection region. This property can make the task of finding a vector on the boundary, given only one marked signal, very difficult.
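The bound (7) follows from the subadditivity of t ↦ t^c for c ≤ 1 together with the triangle inequality. A quick numerical check (our own sketch, not part of the paper) confirms it on random inputs:

```python
import numpy as np

# Verify | |Y|^c - |Y - w|^c | <= |w|^c for c < 1 (Eq. (7), with w = alpha_k s_k)
rng = np.random.default_rng(0)
c = 0.5
Y = rng.normal(0.0, 10.0, 100000)
w = rng.normal(0.0, 1.0, 100000)
increments = np.abs(np.abs(Y)**c - np.abs(Y - w)**c)
print(np.all(increments <= np.abs(w)**c + 1e-12))  # True
```

The small tolerance only absorbs floating-point rounding; the inequality itself is exact.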

[Figure 3: Theoretical ROC curves (Pm versus Pf) for the studied detectors (Cox, ST-DM, GG with c = 1 and c = 0.5) under AWGN attacks, with DWR = 20 dB, WNR = 0 dB, L = 1000, and generalized Gaussian distributed host features with c = 0.8.]

In order to quantitatively compare the resilience of the three detectors against sensitivity attacks, we will take as robustness criterion the number of calls to the detector needed for reaching an attack distortion equal to that of the watermark (NWR = 0 dB). This choice is supported by the fact that for an initially nonmarked host x in which a watermark w has been inserted, yielding y, it is always possible to find a vector z on the boundary whose distortion with respect to y is less than the power of the watermark (e.g., taking the intersection between the detection boundary and the line that connects x and y). Thus, a sensitivity attack can always reach a point with NWR = 0 dB. In general, it is not guaranteed that an attack can reach a lower NWR. Furthermore, given that for blind detection the original nonmarked host is not known, imposing a more restrictive fidelity criterion for the attacker than for the embedder makes no sense. In light of the previous discussion, we can consider that a watermark has been effectively erased when a point z is found whose distortion with respect to y equals the power of the embedded watermark w; the number of iterations that a sensitivity attack needs to reach this point can thus be used for determining the robustness of the detector against the attack.

We have taken the Blind Newton Sensitivity Attack (BNSA [4]; an RRP-compliant description of BNSA can be found in [13]) as a powerful representative of sensitivity attacks, and simulated its execution against the three studied detectors. Each iteration of this algorithm calls the detector a number


J. R. Troncoso-Pastoriza and F. Perez-Gonzalez 5

[Figure 4: Two-dimensional detection boundaries for ST-DM (a), correlation-based detector (b), and GG detector (c).]

of times proportional to the number of dimensions of the involved signals. The results show that both ST-DM and the correlation detector are completely broken in just one iteration of the algorithm, independently of the dimensionality of the signals, so the attack needs O(L) calls to the detector in order to succeed (achieving not only a point with NWR < 0 dB, but also convergence to the nearest point on the boundary). This is due to their simple detection boundaries, which have a constant gradient. Figure 5 shows the NWR of the attack as a function of the number of calls to the detector, for the three detectors, using DWR = 16 dB and Pf = 10⁻⁴, as a result of averaging 100 random executions. The GG detector is used with two different shape factors, c = 0.5 and c = 1.5; the number of iterations needed to break the detector is in both cases larger than for the correlation detectors, due to the more involved detection boundary, but this effect is more evident when c < 1, the case in which the detector has the aforementioned property of bounded increments for component-wise modifications at the input.

The involved detection boundary of the generalized Gaussian ML detector also makes the number of iterations needed for achieving convergence grow with the dimensionality of the host. This means that the number of calls to the detector needed to reach a certain target distortion is not only higher for the GG detector, but also grows faster with the dimensionality of the host than for the other detectors (Figure 6), for fixed WNR and Pf. We have found empirically that the number of calls needed for reaching NWR = 0 dB is approximately O(L^1.5). Furthermore, if we took as robustness criterion the absolute convergence of the algorithm (not just reaching NWR = 0 dB), the advantage of the GG detector would be even greater, both in number of iterations and in number of calls to the detector; that is, while for the GG detector convergence is slowly achieved several iterations after reaching

[Figure 5: NWR for a sensitivity attack (BNSA) as a function of the number of calls to the detector, for the correlation detector (Cox), ST-DM, and generalized Gaussian (GG) with c = 0.5 and c = 1.5, for DWR = 16 dB, Pf = 10⁻⁴, and L = 8192.]

[Figure 6: Number of calls to the detector needed by a sensitivity attack (BNSA) to reach NWR = 0 dB, as a function of the dimensionality of the watermark, for the correlation detector (Cox), ST-DM, and generalized Gaussian (GG) with c = 0.5 and c = 1.5, for DWR = 16 dB and Pf = 10⁻⁴.]

NWR = 0 dB, for correlation detectors BNSA achieves bothNWR < 0 dB and convergence in just one iteration.

2.3. Zero-knowledge watermark detection

The use of zero-knowledge protocols in watermark detection was first proposed by Craver [14], and later formalized


by Adelsbach et al. [2, 15]. The formal definition of a zero-knowledge watermark detection scheme, instantiated for a blind detection mechanism, can be stated as follows.

Definition 1 (Zero-knowledge watermark detection). Given a secure commitment scheme with the operations Com() and Open(), a blind watermarking scheme with the operations Embed() and Detect(), the watermarked host data z, and the commitments C_w on the watermark and C_{K_w} on the key (for a keyed scheme), with their respective public parameters par_com = (par^w_com, par^{K_w}_com), a zero-knowledge blind watermark detection protocol for this watermarking scheme is a zero-knowledge proof of knowledge between a prover P and a verifier V where, on common input x := (z, C_w, C_{K_w}, par_com), P proves knowledge of a tuple aux = (w, K_w, r^w_com, r^{K_w}_com) such that

[(Open(C_w, w, r^w_com, par^w_com) = true) ∧ (Open(C_{K_w}, K_w, r^{K_w}_com, par^{K_w}_com) = true) ∧ (Detect(z, w, K_w) = true)].   (8)

Adelsbach and Sadeghi introduced in [2] a zero-knowledge watermark detection protocol for the Cox et al. [16] detection scheme, which consists of a normalized correlation detector for spread spectrum. In [17], they studied the communication complexity of the non-blind protocol, which is much less efficient than the blind one, due to the higher number of committed operations that must be undertaken. Later, Piva et al. also developed a ZK watermark detection protocol for ST-DM in [5].

3. ZERO-KNOWLEDGE SUBPROOFS

The proofs that are employed in the previous zero-knowledge detectors and in the generalized Gaussian one are shown in Table 2 with their respective communication complexity, calculated when applied to the Damgard-Fujisaki commitment scheme [7] as a function of the security parameters F, B, T, and k defined in Section 2.1.1.

The first five are already existing zero-knowledge proofs for the opening of a commitment [7] (PKop), the equality of two commitments [18] (PKeq), the square of a commitment [18] (PKsq), a commitment lying inside an interval [18] (PKint), and the nonnegativity of a commitment [19] (PK≥0).

All these proofs cover just simple operations; the lack of proofs for some other operations, such as the computation of the absolute value or the square root, both necessary for the first implementation of the GG ML detector, led us to develop the last two zero-knowledge proofs. PKsqrt is a proof that a committed integer is the rounded square root of another committed integer, and it is based on a mapping of quantized square roots into integers. PKabs allows the application of the absolute value operator to a committed number, without disclosing either the magnitude or the sign of that number. Both proofs are described in the following.

3.1. Zero-knowledge proof that a committed integer is the rounded square root of another committed integer

Adelsbach et al. presented in [20] a proof for a generic function approximation whose inverse can be efficiently proven, covering, for example, divisions and square roots. Here, we present a specific protocol for proving a rounded square root that follows a similar philosophy; we study its communication complexity and propose a mapping (presented in Appendix A) that makes it possible for this zero-knowledge protocol to prove the correct calculation of square roots on committed integers (not necessarily perfect squares):

PKsqrt[y, r₁, r₂ : C_y = g^y h^{r₁} mod n ∧ C_{ⁿ√y} = g^{ⁿ√y} h^{r₂} mod n].   (9)

Let C_y be the commitment to the integer whose square root must be calculated. The protocol that prover and verifier follow is the following.

(1) First, the prover calculates the value x = round(√y), its commitment C_x, and the commitment C_{x²} to its squared value, and sends both commitments and C_y to the verifier.

(2) The prover proves in zero-knowledge that C_{x²} contains the squared value of the integer hidden in C_x, through PK{x, r₁, r₂ : C_x = g^x h^{r₁} mod n ∧ C_{x²} = g^{x²} h^{r₂} mod n}.

(3) Then, the prover must prove that x² ∈ [y − x, y + x], using a modified version of Boudot's proof [18] with hidden interval, which consists of also considering randomness in the commitments of the interval limits calculated by both parties at the first step of the proof. Using this interval instead of the one indicated in Appendix A, zero values are also accepted with no ambiguity when the maximum allowable value for y is below the order of the group generated by g. The counterpart is that there are two possibilities for the square root of integers of the form k² + k, with k an integer, namely k and k + 1. The effect of this relaxation on the conditions imposed before is a small rise in the rounding error, which shrinks as k grows; if we take into account that the numbers considered integers are actually the quantization of real numbers using a step fixed by the precision of the system, the error is of the same order as this precision. Nevertheless, the need to work with null values without disclosing any information forces us to make this adaptation.

(4) Finally, it is necessary to prove that x ∈ [0, √m], where m is the order of the subgroup generated by g. If it is known (from the initialization of the commitment scheme) that log₂(m) = l, then proving that x ∈ [0, 2^{l/2−1}] is enough; if the working range for the committed integers is [−T, T], with T < √m (as it will be if the bit length of T is at most l/2 − 1), then it suffices to prove that x is in the working range: x ∈ [0, T].


Table 2: Zero-knowledge subproofs and their communication complexity (CompPK, in bits).

PKop[m, r : C_m = g^m h^r mod n]: 3|F| + |T| + 2B + 3k + 2
PKeq[m, r₁, r₂ : C⁽¹⁾_m = g₁^m h₁^{r₁} mod n ∧ C⁽²⁾_m = g₂^m h₂^{r₂} mod n]: 4|F| + |T| + 2B + 5k + 3
PKsq[m, r₁, r₂ : C_m = g₁^m h₁^{r₁} mod n ∧ C_{m²} = g₂^{m²} h₂^{r₂} mod n]: 4|F| + |T| + 3B + 5k + 3
PKint[m, r : C_m = g^m h^r mod n ∧ m ∈ [a, b]]: 25|F| + 5|T| + 10B + 27k + 2|n| + 20
PK≥0[m, r : C_m = g^m h^r mod n ∧ m ≥ 0]: 11|F| + 4|T| + 12B + 14k + 9
PKsqrt[m, r₁, r₂ : C_m = g^m h^{r₁} mod n ∧ C_{ⁿ√m} = g^{ⁿ√m} h^{r₂} mod n]: 48|F| + 9|T| + 18B + 53k + 6|n| + 39
PKabs[m, r₁, r₂ : C_m = g^m h^{r₁} mod n ∧ C_{|m|} = g^{|m|} h^{r₂} mod n]: 19|F| + 6|T| + 16B + 24k + 15

Claim 1. The presented interactive proof is computationally sound and statistically zero-knowledge in the random oracle model.

A sketch of the proof for this claim is given in Appendix C.

The communication complexity of this protocol is shownin Table 2.

3.2. Zero-knowledge proof that a committed integer is the absolute value of another committed integer

This proof is a zero-knowledge protocol that allows the application of the absolute value operator to a committed number, without disclosing either the magnitude or the sign of that number:

PKabs[x, r₁, r₂ : C_x = g₁^x h₁^{r₁} mod n ∧ C_{|x|} = g₂^{|x|} h₂^{r₂} mod n].   (10)

As in a residue group Z_q there is no notion of "sign," we use the commonly known mapping

sign(x) = 1 if x ∈ {0, ..., ⌊q/2⌋},   sign(x) = −1 if x ∈ {⌊q/2⌋ + 1, ..., q − 1};

taking into account that −x ≡ q − x mod q, the mapping is consistent.
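This sign convention, and the modular absolute value it induces, can be sketched in a few lines (our own illustration; `mod_sign` and `mod_abs` are hypothetical names, and q is a toy modulus):

```python
def mod_sign(x, q):
    # Values in {0, ..., floor(q/2)} are taken as "positive",
    # values in {floor(q/2)+1, ..., q-1} as "negative" (x stands for x - q).
    return 1 if x <= q // 2 else -1

def mod_abs(x, q):
    return x if mod_sign(x, q) == 1 else q - x

q = 101
assert mod_abs(5 % q, q) == 5     # a positive value is unchanged
assert mod_abs(-5 % q, q) == 5    # -5 is represented as 96, and |96| maps back to 5
```

Consistency here means mod_abs(x, q) == mod_abs(q − x, q) for every x, which is what the protocol below exploits when it commits to both x and −x.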

Let C_x = g₁^x h₁^{r₁} mod n be the commitment to a number x whose sign is not known by the verifier, and C_{|x|} = g₂^{|x|} h₂^{r₂} mod n the commitment to a number claimed to be the absolute value of x. The scheme of the protocol is as follows:

(1) both prover and verifier calculate the commitment to the opposite of x, with the help of the homomorphic properties of the commitment scheme:

C_{−x} = C_x^{−1};   (11)

(2) next, the prover demonstrates that the value hidden in C_{|x|} corresponds to the value hidden in one of the two commitments C_x, C_{−x}, using the ZK proof of knowledge described in Appendix B;

(3) finally, the prover demonstrates that the value hidden in C_{|x|} satisfies |x| ≥ 0, using the protocol proposed by Lipmaa [19].

Claim 2. The presented interactive proof is computationally sound and statistically zero-knowledge in the random oracle model.

A sketch of the proof for this claim can be found in Appendix C.

The communication complexity of this protocol is givenin Table 2.

4. ZERO-KNOWLEDGE GG WATERMARK DETECTOR

The zero-knowledge version of the generalized Gaussian detector conceals the secret pseudorandom signal s_k in commitments C_{s_k} under the Damgard-Fujisaki scheme [7]. The supposedly watermarked image Y_k is publicly available, so the perceptual analysis (α_k) and the extraction of the parameters β_k and c_k can be done in the public domain, as can the estimation of the threshold η for a given point on the ROC. In this first implementation, only the shape factors c = 1 and c = 0.5 are allowed, so the employed c_k will be the one nearest to the estimated shape factor. The target is to perform the calculation of the likelihood function

D = Σ_k β_k^{c_k} (|Y_k|^{c_k} − |A_k|^{c_k}),   with A_k = Y_k − α_k s_k and B_k = |A_k|^{c_k},   (12)

and the comparison with the threshold η, without disclosing s_k.

The protocol executed by prover and verifier so as to prove that the given image Y_k is watermarked with the sequence hidden in C_{s_k} is the following:

(1) prover and verifier calculate the commitment to A_k = Y_k − α_k s_k, applying the homomorphic property of the Damgard-Fujisaki scheme:

C_{A_k} = g^{Y_k} / C_{s_k}^{α_k};   (13)

(2) next, the prover generates a commitment C_{|A_k|} to the absolute value of A_k, sends it to the verifier, and proves in zero-knowledge that it hides the absolute value of the commitment C_{A_k}, through the developed proof PKabs (Section 3.2);

(3) if c = 1 (Laplacian features), then the operation |A_k|^c is not needed, so, just for the sake of notation, C_{B_k} = C_{|A_k|}. If c = 0.5, the rounded square root of


|A_k| must be calculated by the prover; he then generates the commitment C_{B_k} = C_{√|A_k|}, sends it to the verifier, and proves in zero-knowledge the validity of the square root calculation, through the proof PKsqrt (Section 3.1);

(4) both prover and verifier can independently calculate the values β_k^{c_k} and |Y_k|^{c_k}, and complete the committed calculation of the sum D = Σ_k β_k^{c_k} (|Y_k|^{c_k} − B_k), thanks to the homomorphic property of the used commitment scheme:

C_D = Π_k ( g^{|Y_k|^{c_k}} / C_{B_k} )^{β_k^{c_k}};   (14)

(5) finally, the prover must demonstrate in zero-knowledge that D > η, or equivalently, that D − η > 0, which can be done by running the proof of knowledge by Lipmaa [19] on C_th = C_D g^{−η}.
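The homomorphic step (1) can be illustrated with a toy discrete-log commitment of the form C(m, r) = g^m h^r mod p. This is our own sketch with insecure, hypothetical parameters (the paper uses the Damgard-Fujisaki integer commitment scheme, which we do not reproduce here); it only shows how C_{A_k} opens to A_k = Y_k − α_k s_k without revealing s_k:

```python
import random

# Toy Pedersen-style commitment C(m, r) = g^m h^r mod p. Illustrative only:
# p, g, h are hypothetical toy values, NOT secure parameters.
p = 2**127 - 1               # a Mersenne prime, used here as a toy modulus
g, h = 3, 5                  # toy "generators"

def commit(m, r):
    return pow(g, m, p) * pow(h, r, p) % p

# Step (1): from the commitment C_{s_k} alone, anyone can derive a commitment
# to A_k = Y_k - alpha_k * s_k, without learning s_k.
Y_k, alpha_k, s_k = 42, 3, -1
r = random.randrange(p)
C_sk = commit(s_k, r)
C_Ak = pow(g, Y_k, p) * pow(pow(C_sk, alpha_k, p), -1, p) % p   # Eq. (13)

# The prover can open C_Ak as A_k with randomness -alpha_k * r:
assert C_Ak == commit(Y_k - alpha_k * s_k, -alpha_k * r)
```

The identity behind the assertion is g^{Y_k} (g^{s_k} h^r)^{−α_k} = g^{Y_k − α_k s_k} h^{−α_k r}, which is exactly what (13) states.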

5. IMPROVED GG DETECTOR WITH BINARY ANTIPODAL SPREADING SEQUENCE (GGBA)

When the spreading sequence s_k is a binary antipodal sequence, so that it only takes values in {±s}, we can apply a trivial transformation to the detection function of the GG detector (6):

D = Σ_k β_k^{c_k} (|Y_k|^{c_k} − |Y_k − α_k s_k|^{c_k})

  = Σ_k β_k^{c_k} (|Y_k|^{c_k} − (|Y_k − α_k s|^{c_k} · 1_{{s}}(s_k) + |Y_k + α_k s|^{c_k} · 1_{{−s}}(s_k)))

  = Σ_k β_k^{c_k} (|Y_k|^{c_k} − (|Y_k − α_k s|^{c_k} · (1/2s)(s + s_k) + |Y_k + α_k s|^{c_k} · (1/2s)(s − s_k)))   (15)

  = Σ_k β_k^{c_k} (|Y_k|^{c_k} − (1/2)(|Y_k − sα_k|^{c_k} + |Y_k + sα_k|^{c_k})) − Σ_k (β_k^{c_k}/(2s)) (|Y_k − sα_k|^{c_k} − |Y_k + sα_k|^{c_k}) s_k

  = G − Σ_k H_k s_k,   (16)

where G denotes the first sum and H_k the factor multiplying s_k in the second.

In (15), we use the fact that s_k can only take the value s or −s in order to substitute the indicator functions 1_{{s}}(s_k) = (1/2s)(s + s_k) and 1_{{−s}}(s_k) = (1/2s)(s − s_k).
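The algebra in (15)-(16) is easy to check numerically. The following sketch (our own, with arbitrary synthetic parameters) verifies that the GGBA form G − Σ_k H_k s_k coincides with a direct evaluation of (6):

```python
import numpy as np

rng = np.random.default_rng(1)
L, s, c = 256, 1.0, 0.8
Y = rng.normal(0.0, 10.0, L)              # received features
alpha = rng.uniform(0.5, 1.5, L)          # perceptual mask
beta = rng.uniform(0.8, 1.2, L)           # GG scale parameters
sk = s * rng.choice([-1.0, 1.0], L)       # binary antipodal spreading sequence

# Direct evaluation of the detection function (6)
D_direct = np.sum(beta**c * (np.abs(Y)**c - np.abs(Y - alpha * sk)**c))

# GGBA form (16): clear-text terms G and H_k, then a correlation with s_k
G = np.sum(beta**c * (np.abs(Y)**c
                      - 0.5 * (np.abs(Y - s * alpha)**c + np.abs(Y + s * alpha)**c)))
H = beta**c / (2 * s) * (np.abs(Y - s * alpha)**c - np.abs(Y + s * alpha)**c)
D_ggba = G - np.sum(H * sk)

assert np.isclose(D_direct, D_ggba)
```

Note that G and H depend only on public quantities (Y, α, β, c, s), while the secret s_k enters only through the final linear correlation; this is what makes the homomorphic evaluation cheap.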

The factors termed G and H_k in (16) can be computed in the clear-text domain, working with floating-point precision arithmetic, and then have their commitments generated. This implies that all the nonlinear operations are transferred to the clear-text domain, greatly reducing the communication overhead, as will be shown in Section 7; only additions and multiplications must be performed in the encrypted domain, and they can be undertaken through the homomorphic properties of the commitment scheme. This transfer also diminishes the computational load, as clear-text operations are much more efficient than modular operations in a large ring.

The zero-knowledge protocol can be reduced to the following two steps.

(1) Prover and verifier homomorphically compute th = D − η:

C_th = g^{G−η} / Π_k C_{s_k}^{H_k}.   (17)

(2) The prover demonstrates the presence of the watermark by running the zero-knowledge proof that D − η > 0.

The number of proofs needed during the protocol is reduced to only one, which yields the aforementioned reduction in computation and communication complexity, with the additional advantage that this scheme can be applied to any value of the shape parameter c_k; it will therefore be preferred to the previous one unless s_k is not binary antipodal.

6. SECURITY ANALYSIS FOR THE GG DETECTION PROTOCOLS

After presenting the protocols for the zero-knowledge implementation of the generalized Gaussian ML detector, we can state the following theorem.

Theorem 1. The developed detection protocols for the generalized Gaussian detector are computationally sound and statistically zero-knowledge.

A sketch of the proof for this theorem can be found inAppendix C.

The reformulation of the generalized Gaussian protocol deserves two comments concerning security. The first one involves the nonlinear operations that were performed under encryption in Section 4, which are now transferred to the public clear-text domain. Although this could seem at first sight a knowledge leakage, it actually is not; all those operations can be performed with the same public parameters as in Section 4 in feasible time, so the parameters G and H_k that are publicly calculated in this protocol could also be obtained in the previous version, and their disclosure gives no extra knowledge.

The second comment deals with the correlation form of the reformulation, and its resilience to blind sensitivity attacks. Even though the operation performed in the encrypted domain is a correlation, the additive term G is what preserves the bounded-increment property, by virtue of which component-wise modifications of the input signal only produce bounded increments on the likelihood function:

−α^c ≤ |Y_k|^c − |Y_k − α s_k|^c ≤ α^c,   c < 1.   (18)

The result of the addition is not disclosed during the protocol; thus, the correlation cannot be known even though the term G is public, and the two terms cannot be decoupled. Consequently, no extra knowledge is learned from G, and the difficulty of finding points on the detection boundary (a necessary step for sensitivity attacks), as well as the shape of the detection regions, remains unaltered.

7. EFFICIENCY AND PRACTICAL IMPLEMENTATION

We will measure the efficiency of the developed protocols in terms of their communication complexity, as this parameter constitutes the bottleneck of the system, and it is easily quantifiable given the complexity measures calculated in the previous sections for each of the subprotocols.

Following the outline of the raw protocol (Section 4), a total of 2L commitments (each of length |n|) are exchanged, namely the L commitments that correspond to the secret pseudorandom sequence s and the L commitments to |A_k|, while in the GGBA detector (Section 5) only the L commitments to s are sent; the rest of the commitments are either calculated using homomorphic computation or already included in the complexity of the subprotocols.

Thus, the total communication complexity of the first scheme, for the detector applied to Laplacian distributed features (c = 1) and for c = 0.5, as well as the complexity of the improved GGBA detector, can be expressed as

CompZKWD_GG(c=1) = 2L|n| + L·(CompPKabs + CompPKop) + CompPK≥0,
CompZKWD_GG(c=0.5) = 2L|n| + L·(CompPKabs + CompPKop + CompPKsqrt) + CompPK≥0,
CompZKWD_GGBA = (L + 1)|n| + L·CompPKop + CompPK≥0.   (19)

In every calculation, L proofs of knowledge of the opening of the initial commitments have been added; even though they are not explicitly mentioned in the sketches of the protocols, they are needed to protect the verifier.

In order to reduce the total time spent during the interaction, it is possible to convert the whole protocol into a noninteractive one, following the procedure described in [21], keeping the condition that the parameters for the commitment scheme must not be chosen by the prover, or he would be able to fake all the proofs. In addition to the reduction in interaction time, the use of this technique also overcomes the necessity of an honest verifier that some subprotocols impose.

The calculated complexity for Piva et al.'s ST-DM detector and Adelsbach and Sadeghi's blind correlation-based detector is the following:

CompZKWD_STDM = (L + 1)|n| + L·CompPKop + CompPKint,
CompZKWD_SS = (L + 1)|n| + L·CompPKop + 2·CompPK≥0 + CompPKsq.   (20)

[Figure 7: Communication complexity in kB for the studied protocols (ST-DM, Cox, GG with c = 1, GG with c = 0.5, GGBA), as a function of the number of watermark coefficients.]

As a numeric example, Figure 7 compares the evolution of the communication complexity of every protocol using |F| = 80, |n| = 1024, B = 1024, T = 2^256, and k = 40, for growing L. All the protocols have complexity O(L). The two protocols for generalized Gaussian host features with c = 1 and c = 0.5 have a higher complexity, due to the operations that cannot be computed by making use of the homomorphic property of the commitment scheme (absolute value and square root). Nevertheless, their complexity is comparable to that of the zero-knowledge non-blind detection protocol developed by Adelsbach et al. [17].
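For reference, protocol lengths of the kind plotted in Figure 7 can be recomputed from Table 2 and (19)-(20). The sketch below is our own (assuming |T| = 256 bits for T = 2^256, and evaluating at L = 1000):

```python
# Subproof complexities from Table 2 (in bits), with the parameters of Fig. 7:
F, n, B, T, k = 80, 1024, 1024, 256, 40     # |F|, |n|, B, |T|, k

op   = 3*F + T + 2*B + 3*k + 2              # PKop
sq   = 4*F + T + 3*B + 5*k + 3              # PKsq
intv = 25*F + 5*T + 10*B + 27*k + 2*n + 20  # PKint
ge0  = 11*F + 4*T + 12*B + 14*k + 9         # PK>=0
sqrt = 48*F + 9*T + 18*B + 53*k + 6*n + 39  # PKsqrt
absv = 19*F + 6*T + 16*B + 24*k + 15        # PKabs

L = 1000
lengths = {                                  # Eqs. (19) and (20), in bits
    "GGBA":     (L + 1)*n + L*op + ge0,
    "STDM":     (L + 1)*n + L*op + intv,
    "Cox (SS)": (L + 1)*n + L*op + 2*ge0 + sq,
    "GG c=1":   2*L*n + L*(absv + op) + ge0,
    "GG c=0.5": 2*L*n + L*(absv + op + sqrt) + ge0,
}
for name, bits in lengths.items():
    print(f"{name:9s} {bits / 8 / 1024:9.1f} kB")
```

Consistently with Figure 7, the GGBA protocol comes out cheapest and the c = 0.5 protocol most expensive.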

On the other hand, the zero-knowledge GGBA detector achieves the lowest communication complexity of all the studied protocols, even lower than the previous correlation-based protocols, together with the increased protection against blind sensitivity attacks when c < 1 is used; this is the first benefit of the reformulated algorithm.

Furthermore, the communication complexity of the pro-tocol is constant if we discard the initial transmission of thecommitments for the spreading sequence and their corre-sponding proofs of opening; once this step is performed, theprotocol can be applied to several watermarked works forproving the presence of the same watermark with a (small)constant communication complexity.

Regarding computation complexity, the original detection algorithm (without the addition of the zero-knowledge protocol) for the generalized Gaussian is more expensive than ST-DM or Cox's (normalized) linear correlator, due to its nonlinear operations. The use of zero-knowledge produces an increase in computation complexity, since, in addition to the calculation and verification of the proofs, homomorphic computation involves modular products and exponentiations in a large ring, so clear-text operations have almost negligible complexity in comparison with encrypted operations.


The second benefit of the presented GGBA zero-knowledge protocol is that all the nonlinear operations are transferred from the encrypted domain (where they must be performed using proofs of knowledge) to the clear-text public domain; thus, all the operations that made the symmetric protocol more expensive than the correlation-based detectors can be neglected in comparison with the encrypted operations, so the computation complexity of the zero-knowledge GGBA protocol will be roughly the same as that of the correlation-based zero-knowledge detectors.

8. CONCLUSIONS

The presented zero-knowledge watermark detection protocol, based on the generalized Gaussian ML detector, outperforms the correlation-based zero-knowledge detectors implemented to date in terms of robustness against blind sensitivity attacks, while improving on the ROC of the correlation-based spread-spectrum detector with a performance that is near that of ST-DM.

If the employed spreading sequence is binary antipodal, the protocol can be restated in a much more efficient way, reaching a communication complexity that is even lower than that of the previous correlation-based protocols, while keeping its robustness against sensitivity attacks.

Two zero-knowledge proofs, for square root calculation and for the absolute value, have been presented. They serve as building blocks for the zero-knowledge implementation of the generalized Gaussian ML detector, and also allow for the encrypted execution of these two nonlinear operations in other high-level protocols.

Finally, the use of the technique shown in [21] makes the whole protocol noninteractive, so that it does not need an honest verifier to achieve the zero-knowledge property. In order to get protection against cheating provers, the proofs shown in [22] can be employed to prove some statistical properties of the inserted watermark, at the cost of an increase in communication complexity.

APPENDICES

A. MAPPING FOR ROUNDED SQUARE ROOT

Current cryptosystems are based on modular operations in a group of high order. Although simple operations like addition or multiplication have a direct mapping from quantized real numbers to modular arithmetic (provided that the number of elements in the used group is large enough to avoid the effect of the modulus), problems arise when trying to cope with noninteger operations, like divisions or square roots.

In the following, a mapping that represents quantizedsquare roots inside integers in the range {1, . . . ,n− 1} is pre-sented, and existence and uniqueness of the solutions for thismapping are derived. The target is to find which conditionsmust be satisfied by the input and the output to keep thisoperation secure when the arguments are concealed.

The mapping must be such that if y ∈ Z⁺ and x = √y ∈ R, then ⁿ√y := round(x). For this mapping to behave like the conventional square root for positive reals, it is necessary to bound the domain where it can be applied. The formalization of the mapping is as follows:

ⁿ√· : A = {y ∈ Z⁺ | y < n} → B = {x ∈ Z⁺ | x < round(√n)},
      y ↦ x = ⁿ√y = round(√y).   (A.1)

In order for this definition to be valid, and given that the elements with which this mapping works are just the representatives of the residue classes of Z_n in the interval {1, ..., n − 1}, we can state the following lemma.

Lemma 1 (Existence and uniqueness of a solution). A unique x ∈ [1, x_m] ∩ Z⁺ exists such that, for all y ∈ {1, ..., min(x_m² + x_m, n − 1)}, with x_m ≤ ⌊√n⌋ − 1,

x² mod n ∈ [y − x, y + x)_n,   x ≤ y,   (A.2)

where [·, ·)_n represents the modular reduction of the given interval.

Proof.

Existence. Given y ∈ Z⁺, its real square root admits a unique decomposition as an integer plus a decimal part:

√y = x + d,   x = round(√y) ∈ Z⁺,   d ∈ [−0.5, 0.5).   (A.3)

Squaring the previous expression, both sides of the equality must be integers, so

(√y)² = x² + d² + 2dx  ⟹  x² = y − 2dx − d²,   (A.4)

and taking into account that y is an integer, 2dx + d² must also be an integer, and it is bounded by

2dx + d² ∈ [−x + 0.25, x + 0.25)  ⟹  2dx + d² ∈ [−x + 1, x].   (A.5)

Substituting this last equation into the previous one gives the desired result:

x² ∈ [y − x, y + x − 1].   (A.6)

Thus, the modular reduction of x² is inside the modular reduction of the interval, and x exists.

Uniqueness. Here uniqueness is concerned with modular operations, and the possibility that the interval [y − x, y + x) includes integers outside the initial representing range {0, ..., n − 1}, which would result in ambiguities after applying the mod operator. In the following, all the operations are modular, and thus the mod operator is omitted. The intervals also represent their modular reduction.

The proof is by reductio ad absurdum. Let y ∈ {1, ..., x_m² + x_m}, and let x, x′ ∈ [1, x_m] ∩ Z⁺ be two different integers such that both fulfill x = ⁿ√y and x′ = ⁿ√y. This means that

x² ∈ [y − x, y + x) ∩ Z,
x′² ∈ [y − x′, y + x′) ∩ Z.   (A.7)

Combining the previous relations, x and x′ must be such that

x² − x′² ∈ (−x − x′, x + x′) ∩ Z.   (A.8)

Let us suppose, without loss of generality, that x > x′. If both x and x′ are less than x_m ≤ ⌊√n⌋ − 1, then their squares are below n, and they behave as if no modular operation were applied. Squares in Z can be represented by the following recursive formula:

y_k = k² = y_{k−1} + k + (k − 1)  ⟹  y_k − y_i = k² − i² = Σ_{l=1}^{k−i−1} 2(k − l) + k + i for k > i (and 0 for k = i),   (A.9)

which means that, in order for x² and x′² to be spaced less than x + x′ apart, the following inequality must be satisfied:

Σ_{l=1}^{x−x′−1} 2(x − l) + x + x′ < x + x′  ⟹  Σ_{l=1}^{x−x′−1} 2(x − l) < 0.   (A.10)

Thus, the only solution is x = x′.

If, on the other hand, x = x_m, taking into account that

x² ∈ [y − x, y + x − 1]  ⟺  y ∈ [x² − x + 1, x² + x],   (A.11)

there are two possibilities.

(1) y ∈ {x² − x + 1, ..., n − 1}: if x ≠ x′, then x′ < round(√n), so the range (x′² − x′, x′² + x′] cannot include y, and x is the only admissible solution.

(2) y ∈ {1, ..., x² + x − n}: this is only possible if x_m² + x_m > n; in such a case, given the condition imposed on x_m,

y ≤ x_m² + x_m − n ≤ ((√n)² − 1) + x_m − n = x_m − 1.   (A.12)

As x = x_m, this means that y < x, which violates one of the conditions established at the beginning.

One issue in the previous exposition is that it is possible that the mapping is not defined over the entire set {1, ..., n − 1}. Moreover, if the modulus is not public, the full working range is not known, and it becomes necessary to upper bound the integers with which the system will work. In this case, the upper bound can be set to y_m = x_m² + x_m, and the mapping can be applied to the full working range; furthermore, the condition x ≤ y can be eliminated, as x ∈ {1, ..., x_m} already guarantees that there is no ambiguity.

A similar reasoning can be applied when the working range includes negative numbers:

{−⌊n/2⌋, . . . , 0, . . . , ⌈n/2⌉ − 1}.    (A.13)

In this case, it is enough if x ∈ {1, . . . , round(√(n/2))} and y ∈ {1, . . . , ⌈n/2⌉ − 1}, as x^2 covers all the range of positive numbers in which y is included, and there are no ambiguities with the mod operation, as the overlap in intervals can only be produced with negative numbers, already discarded by the previous conditions.

Limiting the working range is the biggest issue of this method; with sequential modular additions and multiplications in Z_n, it is only needed that the result of applying the same sequence of operations (without applying the modulus) in Z belongs to the interval {1, . . . , n − 1} to reach the same value with modular operations. In the case of the defined square root, it is necessary that the operations made before applying a root also return a number inside the interval {1, . . . , n − 1}; it is not enough that the final result of the whole computation lies in this interval.
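The uniqueness property above can be checked numerically. The sketch below (illustrative Python with a hypothetical toy modulus; the mapping itself, x^2 ∈ [y − x, y + x) taken modulo n, is the one defined above) verifies that every y in the working range {1, . . . , x_m^2 + x_m} admits exactly one root x ∈ [1, x_m]:

```python
import math

n = 1019                   # toy modulus; any n works for this check
xm = math.isqrt(n) - 1     # x_m <= floor(sqrt(n)) - 1, as required above
ym = xm * xm + xm          # upper bound y_m of the working range

def roots(y):
    """All x in [1, x_m] whose square falls in [y - x, y + x) mod n."""
    return [x for x in range(1, xm + 1)
            if (x * x - (y - x)) % n < 2 * x]  # x^2 in [y-x, y+x) modulo n

# Every y in {1, ..., y_m} has exactly one admissible rounded square root.
assert all(len(roots(y)) == 1 for y in range(1, ym + 1))
print("unique for all y up to", ym)  # → unique for all y up to 930
```

The interval test uses the fact that x^2 ∈ [y − x, y + x) mod n is equivalent to (x^2 − (y − x)) mod n < 2x, since the interval has length 2x.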

B. ZERO-KNOWLEDGE PROOF THAT A COMMITMENT HIDES THE SAME VALUE AS ONE OF TWO GIVEN COMMITMENTS

This proof constitutes a mixture of a variation of the proof of equality of two commitments [18] and the technique shown in [23] to produce an OR proof through the application of secret sharing schemes.

Given three commitments C_{x_1} = g_1^{x_1} h_1^{r_1}, C_{x_2} = g_2^{x_2} h_2^{r_2}, and C_x = g^x h^r, the prover states that x = x_1 or that x = x_2. The notation used for the security parameters (B, T, k, F = C(k)) is the same as in Section 2.1.1; the structure of the proof is the following.

(1) Let us suppose that x_i = x, and x_j ≠ x, with i, j ∈ {1, 2}, i ≠ j. Then, for x_j, the prover must generate the values

W_{j1} = g_j^{u_j} h_j^{u_{j1}} C_{x_j}^{−e_j},
W_{j2} = g^{u_j} h^{u_{j2}} C_x^{−e_j},    (B.1)

such that e_j is a randomly chosen t-bit integer (e_j ∈ [0, C(k))), u_j is randomly chosen in [0, C(k)T 2^k), and u_{j1} and u_{j2} are randomly chosen in [0, C(k) 2^{B+2k}).

For x_i, the prover chooses at random y_i ∈ [1, C(k)T 2^k) and r_{i3}, r_{i4} ∈ [0, C(k) 2^{B+2k}), and constructs

W_{i1} = g_i^{y_i} h_i^{r_{i3}},
W_{i2} = g^{y_i} h^{r_{i4}}.    (B.2)

Then, the prover sends to the verifier the values W_{11}, W_{12}, W_{21}, W_{22}.

(2) The verifier generates a random t-bit number s ∈ [0, C(k)), and sends it to the prover.

Page 70: 6.pdf

12 EURASIP Journal on Information Security

(3) The prover calculates the remaining challenge by applying an XOR, e_i = e_j ⊕ s, and then generates the following values:

u_i = y_i + e_i x,
u_{i1} = r_{i3} + e_i r_i,
u_{i2} = r_{i4} + e_i r,    (B.3)

and sends to the verifier e_1, u_1, u_{11}, u_{12}, e_2, u_2, u_{21}, u_{22}.

(4) The verifier checks that the challenges e_1, e_2 are consistent with his random key s (s = e_1 ⊕ e_2), and then checks, for k = {1, 2}, the proofs

g_k^{u_k} h_k^{u_{k1}} C_{x_k}^{−e_k} = W_{k1},
g^{u_k} h^{u_{k2}} C_x^{−e_k} = W_{k2}.    (B.4)

The completeness of the proof follows from its definition: if one of the x_k is equal to x, then all the subproofs will succeed.

The soundness of the protocol resides in the key s, which is generated by the verifier. This protocol can be decomposed into two parts, each one consisting of the proof that x = x_i for one of the x_i. Both are based on a protocol that has been shown to be sound [18]. So, without access to e_i at the first stage, the only way for the prover to generate the correct values with nonnegligible probability is that x_i = x; if x_i ≠ x, he must generate e_i in advance to make the proof succeed. With this premise, one of the e_i must be fixed by the prover, and he indirectly commits to it in the first stage of the protocol; but the other value e_j is determined by e_i and by the random choice s of the verifier, so for the prover it is as random as s, guaranteeing that the second proof will only succeed with negligible probability when x_j ≠ x.

The protocol is witness hiding, due to the procedure followed in its construction [23]; thanks to the statistically hiding property of the commitments, all the values generated for the false proof will be indistinguishable from those of the true proof. Furthermore, the protocol is also zero-knowledge, as a simulator can be built that, given the random choices (s) of the verifier, can construct both proofs applying the same trick as for the false proof, and the distribution of the resulting commitments will be statistically indistinguishable from that of the real interactions. In fact, the original protocol was honest-verifier zero-knowledge, but adding the additional XOR on the verifier's random choice for the true proof makes the resulting value completely random as long as one of the parties is honest (it is like a fair coin flip), so the zero-knowledge property is gained in this process.
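The challenge-splitting mechanism (the prover fixes the simulated branch's challenge in advance, and the verifier's random s then forces the real branch's challenge through e_i = e_j ⊕ s) is the core of the OR construction of [23]. The following toy sketch illustrates the same mechanics on plain Schnorr discrete-log relations rather than the commitment relations used above; all parameters are hypothetical and far too small to be secure:

```python
import secrets

# Toy parameters (NOT secure): p = 2q + 1, g generates the order-q subgroup.
p, q, g = 2039, 1019, 4
T = 8  # challenge bit length

def schnorr_or_prove(w, known, y):
    """Prove knowledge of w with y[known] == g^w, hiding which branch is real."""
    other = 1 - known
    # Simulate the branch we cannot prove: pick its challenge and response first.
    e_sim = secrets.randbelow(2 ** T)
    z_sim = secrets.randbelow(q)
    A = [None, None]
    A[other] = (pow(g, z_sim, p) * pow(pow(y[other], e_sim, p), p - 2, p)) % p
    # Real branch: honest commitment.
    k = secrets.randbelow(q)
    A[known] = pow(g, k, p)
    # Verifier's random challenge s (generated locally for this demo).
    s = secrets.randbelow(2 ** T)
    # The real challenge is forced by s and the simulated one: e_known = s XOR e_sim.
    e = [None, None]
    e[other] = e_sim
    e[known] = s ^ e_sim
    z = [None, None]
    z[other] = z_sim
    z[known] = (k + e[known] * w) % q
    return s, e, z, A

def schnorr_or_verify(y, s, e, z, A):
    # Challenges must split the verifier's key, and both equations must hold.
    if e[0] ^ e[1] != s:
        return False
    return all(pow(g, z[b], p) == (A[b] * pow(y[b], e[b], p)) % p for b in range(2))

w = 123
y = [pow(g, w, p), pow(g, 777, p)]  # the prover only knows the exponent of y[0]
proof = schnorr_or_prove(w, 0, y)
print(schnorr_or_verify(y, *proof))  # True
```

The verifier learns that one of the two relations holds, but the simulated branch is statistically identical to the real one, mirroring the witness-hiding argument above.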

Applying the technique shown in [21], the previous protocol can be transformed into a noninteractive zero-knowledge proof of knowledge by using a hash function H, so that s = H(W_{11} ‖ W_{12} ‖ W_{21} ‖ W_{22}), and eliminating the transmission of W_{11}, W_{12}, W_{21}, W_{22}. This way, the verifier checks that

e_1 ⊕ e_2 = s = H(g_1^{u_1} h_1^{u_{11}} C_{x_1}^{−e_1} ‖ g^{u_1} h^{u_{12}} C_x^{−e_1} ‖ g_2^{u_2} h_2^{u_{21}} C_{x_2}^{−e_2} ‖ g^{u_2} h^{u_{22}} C_x^{−e_2}).    (B.5)

C. SECURITY PROOFS

In this appendix, we include sketches of the security proofs for the developed protocols.

C.1. Sketch of the proof for Claim 1

Completeness and soundness of the protocol in Section 3.1 rest upon the validity of the mapping of Appendix A.

Proof.

Completeness. If both prover and verifier behave according to the protocol in Section 3.1, then the verifier will accept all the subproofs and all its tests will succeed. If x is generated as the rounded square root of y, the square proof and both range proofs will be accepted because of the validity of the mapping of Appendix A and the completeness of these subproofs.

Soundness. Taking into account the consideration about integers of the form k^2 + k, the binding property of the commitment guarantees that the prover cannot open the generated C_x and C_{x^2} to incorrect values; thus, appealing to the uniqueness property of the mapping of Appendix A, the computational soundness of the range and squaring subproofs guarantees that a proof for a value that does not fulfill that mapping will only succeed with negligible probability.

Zero-knowledge. We can construct a simulator S_{V*} for the verifier's view of the interaction. S_{V*} must generate values C_x and C_{x^2} as commitments to random values, which will be statistically indistinguishable from the true commitments, due to the statistically hiding property of the commitment scheme. Furthermore, the statistical zero-knowledge property of the squaring and range subproofs guarantees that simulators for these proofs exist and generate the correct views; the generation of C_x and C_{x^2} does not affect these views, due to their indistinguishability with respect to the true commitments, and because the simulators do not need knowledge of the committed values in order to succeed.

C.2. Sketch of the proof for Claim 2

Proof.

Completeness. If both parties adhere to the protocol, then when C_{|x|} hides the absolute value of the number concealed in C_x, the protocol always succeeds due to the completeness of the OR proof and the nonnegativity proof.

Soundness. Due to the binding property of the commitments, the prover cannot open C_x and C_{|x|} to incorrect values. Furthermore, due to the soundness of the subproofs, if C_{|x|} hides a negative number, the proof in step (3) will fail, so the complete protocol will fail (except with negligible probability); on the other hand, if C_{|x|} does not hide a number with the same absolute value as the one hidden by C_x, the proof in step (2) will also fail (except with negligible probability). Thus, the whole protocol will only succeed for a non-valid input with a negligible probability given by the soundness error of the proofs in steps (2) and (3).

Zero-knowledge. We can construct a simulator S_{V*} such that the real interactions have a probability distribution indistinguishable from that of the outputs of the simulator. The

Page 71: 6.pdf

statistical zero-knowledge property of the OR and nonnegativity subproofs guarantees that simulators exist that can produce sequences that are statistically indistinguishable from these protocols' outputs, so the only quantity that the simulator S_{V*} has to produce is C_{−x}, whose true value can be generated directly from C_x due to the homomorphic property of the used commitment scheme. Thus, the whole protocol is statistically zero-knowledge.

C.3. Sketch of the proof for Theorem 1

Proof.

Completeness. Let us assume that both parties behave according to the protocol. The values C_{A_k} calculated by the correct prover and the correct verifier coincide. For correctly produced C_{|A_k|}, the completeness of the absolute value subproof guarantees the acceptance of the verifier; equally, the completeness of the rounded square root subproof guarantees the acceptance for a correctly calculated C_{B_k}. Next, the values of C_D computed by both parties coincide, and, finally, due to the completeness of the nonnegativity proof, the verifier will accept the whole proof in case the signal {Y_k} is inside the detection region. For the case of a binary antipodal spreading sequence (Section 5), if the values G, H_k, and C_{th} are correctly calculated, the completeness of the nonnegativity proof guarantees the acceptance when {Y_k} is inside the detection region. This concludes the completeness proof.

Soundness. The binding property of the commitments assures that the prover will not be able to open the commitments that he calculates (C_{A_k}, C_{|A_k|}, C_{B_k}, C_D, C_{th}) to wrong values. Furthermore, the statistical soundness of the used subproofs (absolute value, rounded square root, and nonnegativity) guarantees that an incorrect input in any of them will only succeed with negligible probability. This fact, together with the homomorphic properties of the commitments, which make it impossible for the prover to fake the arithmetic operations performed in parallel by the verifier, ensures that the probability that a signal {Y*_k} outside the detection region passes the proof is negligible.

Zero-knowledge. We can construct a simulator S_{V*} such that the real interactions have a probability distribution indistinguishable from that of the outputs of the simulator. The statistical zero-knowledge property of the absolute value, rounded square root, and nonnegativity subproofs guarantees the existence of simulators for their outputs; thus, S_{V*} can generate C_{A_k}, C_D, and C_{th} as in a real execution of the protocol, thanks to the homomorphic properties of the commitment scheme. On the other hand, it must generate C_{|A_k|} and C_{B_k} as commitments to random numbers; the statistical hiding property of the commitments guarantees that the distribution of these random commitments is indistinguishable from the true commitments. Furthermore, these generated values will not affect the indistinguishability of the simulators for the subproofs, as these simulators do not need knowledge of the committed values in order to succeed. Thus, the output of S_{V*} is indistinguishable from true interactions of an accepting protocol, and the whole protocol is statistically zero-knowledge.

ACKNOWLEDGMENTS

This work was partially funded by Xunta de Galicia under projects PGIDT04 TIC322013PR and PGIDT04 PXIC32202PM, Competitive Research Units Program Ref. 150/2006, MEC project DIPSTICK, Ref. TEC2004-02551/TCM, MEC FPU grant, Ref. AP2006-02580, and the European Commission through the IST Program under Contract IST-2002-507932 ECRYPT. ECRYPT disclaimer: the information in this paper is provided as is, and no guarantee or warranty is given or implied that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability. This work was partially presented at the ACM Multimedia and Security Workshop 2006 [24] and Electronic Imaging 2007 [25].

REFERENCES

[1] S. Goldwasser, S. Micali, and C. Rackoff, "The knowledge complexity of interactive proof systems," SIAM Journal on Computing, vol. 18, no. 1, pp. 186–208, 1989.

[2] A. Adelsbach and A.-R. Sadeghi, "Zero-knowledge watermark detection and proof of ownership," in Proceedings of the 4th International Workshop on Information Hiding (IH '01), vol. 2137 of Lecture Notes in Computer Science, pp. 273–288, Springer, Pittsburgh, Pa, USA, April 2001.

[3] I. Damgard, "Commitment schemes and zero-knowledge protocols," in Lectures on Data Security: Modern Cryptology in Theory and Practice, vol. 1561 of Lecture Notes in Computer Science, pp. 63–86, Springer, Aarhus, Denmark, July 1998.

[4] P. Comesana, L. Perez-Freire, and F. Perez-Gonzalez, "Blind Newton sensitivity attack," IEE Proceedings on Information Security, vol. 153, no. 3, pp. 115–125, 2006.

[5] A. Piva, V. Cappellini, D. Corazzi, A. De Rosa, C. Orlandi, and M. Barni, "Zero-knowledge ST-DM watermarking," in Security, Steganography, and Watermarking of Multimedia Contents VIII, E. J. Delp III and P. W. Wong, Eds., vol. 6072 of Proceedings of SPIE, pp. 1–11, San Jose, Calif, USA, January 2006.

[6] J. R. Hernandez, M. Amado, and F. Perez-Gonzalez, "DCT-domain watermarking techniques for still images: detector performance analysis and a new structure," IEEE Transactions on Image Processing, vol. 9, no. 1, pp. 55–68, 2000.

[7] I. Damgard and E. Fujisaki, "A statistically-hiding integer commitment scheme based on groups with hidden order," in Proceedings of the 8th International Conference on the Theory and Application of Cryptology and Information Security: Advances in Cryptology (ASIACRYPT '02), vol. 2501 of Lecture Notes in Computer Science, pp. 125–142, Springer, Queenstown, New Zealand, December 2002.

[8] M. Bellare and O. Goldreich, "On defining proofs of knowledge," in Proceedings of the 12th Annual International Cryptology Conference on Advances in Cryptology (CRYPTO '92), vol. 740 of Lecture Notes in Computer Science, pp. 390–420, Springer, Santa Barbara, Calif, USA, August 1992.

[9] L. Perez-Freire, P. Comesana, and F. Perez-Gonzalez, "Detection in quantization-based watermarking: performance and security issues," in Security, Steganography, and Watermarking of Multimedia Contents VII, E. J. Delp III and P. W. Wong, Eds., vol. 5681 of Proceedings of SPIE, pp. 721–733, San Jose, Calif, USA, January 2005.

[10] F. Perez-Gonzalez, F. Balado, and J. R. Hernandez Martin, "Performance analysis of existing and new methods for data hiding with known-host information in additive channels," IEEE Transactions on Signal Processing, vol. 51, no. 4, pp. 960–980, 2003.

Page 72: 6.pdf

[11] M. Barni and F. Bartolini, Watermarking Systems Engineering, Signal Processing and Communications, Marcel Dekker, New York, NY, USA, 2004.

[12] B. Chen and G. W. Wornell, "Quantization index modulation: a class of provably good methods for digital watermarking and information embedding," IEEE Transactions on Information Theory, vol. 47, no. 4, pp. 1423–1443, 2001.

[13] P. Comesana and F. Perez-Gonzalez, "Breaking the BOWS watermarking system: key guessing and sensitivity attacks," to appear in EURASIP Journal on Information Security.

[14] S. Craver, "Zero knowledge watermark detection," in Proceedings of the 3rd International Workshop on Information Hiding (IH '99), vol. 1768 of Lecture Notes in Computer Science, pp. 101–116, Springer, Dresden, Germany, September 2000.

[15] A. Adelsbach, S. Katzenbeisser, and A.-R. Sadeghi, "Watermark detection with zero-knowledge disclosure," Multimedia Systems, vol. 9, pp. 266–278, Springer, Berlin, Germany, 2003.

[16] I. J. Cox, J. Kilian, T. Leighton, and T. Shamoon, "A secure, robust watermark for multimedia," in Proceedings of the 1st International Workshop on Information Hiding (IH '96), vol. 1174 of Lecture Notes in Computer Science, pp. 185–206, Springer, Cambridge, UK, May-June 1996.

[17] A. Adelsbach, M. Rohe, and A.-R. Sadeghi, "Non-interactive watermark detection for a correlation-based watermarking scheme," in Proceedings of the 9th IFIP TC-6 TC-11 International Conference on Communications and Multimedia Security (CMS '05), vol. 3677 of Lecture Notes in Computer Science, pp. 129–139, Springer, Salzburg, Austria, September 2005.

[18] F. Boudot, "Efficient proofs that a committed number lies in an interval," in Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques: Advances in Cryptology (EUROCRYPT '00), vol. 1807 of Lecture Notes in Computer Science, pp. 431–444, Springer, Bruges, Belgium, May 2000.

[19] H. Lipmaa, "On diophantine complexity and statistical zero-knowledge arguments," in Proceedings of the 9th International Conference on the Theory and Application of Cryptology and Information Security: Advances in Cryptology (ASIACRYPT '03), vol. 2894 of Lecture Notes in Computer Science, pp. 398–415, Springer, Taipei, Taiwan, November-December 2003.

[20] A. Adelsbach, M. Rohe, and A.-R. Sadeghi, "Complementing zero-knowledge watermark detection: proving properties of embedded information without revealing it," Multimedia Systems, vol. 11, no. 2, pp. 143–158, 2005.

[21] M. Bellare and P. Rogaway, "Random oracles are practical: a paradigm for designing efficient protocols," in Proceedings of the 1st ACM Conference on Computer and Communications Security (CCS '93), pp. 62–73, ACM Press, Fairfax, Va, USA, November 1993.

[22] A. Adelsbach, M. Rohe, and A.-R. Sadeghi, "Overcoming the obstacles of zero-knowledge watermark detection," in Proceedings of the Workshop on Multimedia and Security (MM&Sec '04), pp. 46–54, Magdeburg, Germany, September 2004.

[23] R. Cramer, I. Damgard, and B. Schoenmakers, "Proofs of partial knowledge and simplified design of witness hiding protocols," in Proceedings of the 14th Annual International Cryptology Conference on Advances in Cryptology (CRYPTO '94), vol. 839 of Lecture Notes in Computer Science, pp. 174–187, Santa Barbara, Calif, USA, August 1994.

[24] J. R. Troncoso-Pastoriza and F. Perez-Gonzalez, "Zero-knowledge watermark detector robust to sensitivity attacks," in Proceedings of the 8th Workshop on Multimedia and Security (MM&Sec '06), pp. 97–107, Geneva, Switzerland, September 2006.

[25] J. R. Troncoso-Pastoriza and F. Perez-Gonzalez, "Efficient non-interactive zero-knowledge watermark detector robust to sensitivity attacks," in Security, Steganography, and Watermarking of Multimedia Contents IX, E. J. Delp III and P. W. Wong, Eds., vol. 6505 of Proceedings of SPIE, pp. 1–12, San Jose, Calif, USA, January 2007.

Page 73: 6.pdf

Hindawi Publishing Corporation
EURASIP Journal on Information Security
Volume 2007, Article ID 31340, 13 pages
doi:10.1155/2007/31340

Research Article
Anonymous Fingerprinting with Robust QIM Watermarking Techniques

J. P. Prins, Z. Erkin, and R. L. Lagendijk

Information and Communication Theory Group, Faculty of Electrical Engineering, Mathematics, and Computer Science, Delft University of Technology, 2628 Delft, The Netherlands

Correspondence should be addressed to Z. Erkin, [email protected]

Received 20 March 2007; Revised 4 July 2007; Accepted 8 October 2007

Recommended by A. Piva

Fingerprinting is an essential tool to deter legal buyers of digital content from illegal redistribution. In fingerprinting schemes, the merchant embeds the buyer's identity as a watermark into the content so that the merchant can retrieve the buyer's identity when he encounters a redistributed copy. To prevent the merchant from dishonestly embedding the buyer's identity multiple times, it is essential for the fingerprinting scheme to be anonymous. Kuribayashi and Tanaka, in 2005, proposed an anonymous fingerprinting scheme based on a homomorphic additive encryption scheme, which uses basic quantization index modulation (QIM) for embedding. In order for this scheme to provide sufficient security to the merchant, the buyer must be unable to remove the fingerprint without significantly degrading the purchased digital content. Unfortunately, QIM watermarks can be removed by simple attacks like amplitude scaling. Furthermore, the embedding positions can be retrieved by a single buyer, allowing for a locally targeted attack. In this paper, we use robust watermarking techniques within the anonymous fingerprinting approach proposed by Kuribayashi and Tanaka. We show that the properties of an additive homomorphic cryptosystem allow for creating anonymous fingerprinting schemes based on distortion-compensated QIM (DC-QIM) and rational dither modulation (RDM), improving the robustness of the embedded fingerprints. We evaluate the performance of the proposed anonymous fingerprinting schemes under additive-noise and amplitude-scaling attacks.

Copyright © 2007 J. P. Prins et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Intellectual property protection is a severe problem in today's digital world due to the ease of illegal redistribution through the Internet. As a countermeasure to deter people from illegally redistributing digital content such as audio, images, and video, a fingerprinting scheme embeds specific information related to the identity of the buyer by using watermarking techniques. In conventional fingerprinting schemes, this identity information is embedded into the digital data by the merchant and the fingerprinted copy is given to the buyer. When the merchant encounters redistributed copies of this fingerprinted content, he can retrieve the identity information of the buyer who (illegally) redistributed his copy. From the buyer's point of view, however, this scenario is unattractive because during the embedding procedure, the merchant obtains the identity information of the buyer. This enables a cheating merchant to embed the identity information of the buyer into any content without the buyer's consent and subsequently accuse the buyer of illegal redistribution.

To protect the identity of the buyer, anonymous fingerprinting schemes have been proposed [1, 2]. In [2], the buyer and the merchant follow an interactive embedding protocol, in which the identity information of the buyer remains unknown to the merchant. When the buyer wishes to purchase, for instance, an image, he registers himself to a registration centre and receives a proof of his identity with a signature of the registration centre. Then the buyer encrypts his identity and sends both the encrypted identity and the proof of identity to the merchant. The merchant checks the validity of the signature by using the public key of the registration centre. After the buyer convinces the merchant, through the provided identity proof, that the encrypted identity indeed contains the identity information of the buyer, the merchant embeds the identity information of the buyer into the (encrypted) image data by exploiting the homomorphic property of the cryptosystem. Then the encrypted fingerprinted image is sent to the buyer for decryption and future use.

Page 74: 6.pdf

In this scheme, the merchant can only retrieve the identity information of the buyer when it is detected in a copy of the fingerprinted image. This idea, first presented in [2], was constructed in [3, 4] using digital coins. In order to embed the identity information of the buyer, a single-bit commitment scheme with exclusive-or homomorphism is used, which allows for computing the encrypted XOR of two bits by multiplying their ciphertexts. In [5], Kuribayashi and Tanaka observe that this construction is not efficient because of the low enciphering rate. The single-bit commitment scheme can only contain one bit of information for a log2 n-bit ciphertext, where n is a product of two large primes.

In order to increase the enciphering rate, Kuribayashi and Tanaka suggested using a cryptosystem with a larger message space. They introduced an anonymous fingerprinting algorithm based on an additive homomorphic cryptosystem that allows for the addition of values in the plaintext domain by multiplying their corresponding ciphertexts. Consequently, Kuribayashi and Tanaka used a basic amplitude quantization-based scheme similar to the well-known quantization index modulation (QIM) scheme as the underlying watermarking scheme. Since QIM essentially modulates (integer-valued) quantization levels to embed information bits into a signal, QIM can elegantly be implemented in an additive homomorphic cryptosystem. However, QIM is a basic watermarking scheme that has limited robustness compared to other watermarking schemes. The embedding positions can easily be retrieved from an individual fingerprinted copy and are thus vulnerable to local attacks. Such attacks result in minimal overall signal degradation, while completely removing the fingerprint. Furthermore, QIM is vulnerable to simple, either malevolent or unintentional, global attacks such as randomization of the least significant bits, addition of noise, compression, and amplitude scaling.

In this paper, we use the ideas in [5] to build anonymous versions of state-of-the-art watermarking schemes, namely, distortion-compensated QIM (DC-QIM) [6] and rational dither modulation (RDM) [7]. By adapting these watermarking schemes to the anonymous fingerprinting protocol of Kuribayashi and Tanaka, we improve the robustness of the embedded fingerprints and, as a consequence, the merchant's security. As DC-QIM and RDM are based on subtractive-dither QIM (SD-QIM), they both hide the embedding locations from the buyer more effectively, preventing local, targeted attacks on the fingerprint. With respect to global attacks, like additive noise and amplitude scaling, RDM is provably equivalent in robustness, while DC-QIM is provably better in robustness against additive noise attacks. Furthermore, RDM improves the QIM scheme so that the fingerprint becomes robust to amplitude-scaling attacks.

The outline of this paper is as follows. In Section 2, we introduce the basic QIM watermarking scheme, as well as the additive homomorphic cryptosystem of Okamoto-Uchiyama [8], on which the approach in [5] is based. In Section 3, we review the anonymous fingerprinting scheme by Kurib-

Table 1: Table of symbols.

A.1. Cryptosystems

Symbol: Usage
p, q: Large primes of size k
n: Modulus
g: Generator
m: Message
c: Ciphertext
r, s ∈_R Z*_n: r and s are random blinding factors from Z*_n
E(m): Encryption (and integer rounding) of m
D(c): Decryption of ciphertext c

A.2. Watermarking and fingerprinting

Symbol: Usage
x/X: Original sample/original signal
y/Y: Watermarked sample/watermarked signal
z/Z: Received sample/received signal
w/W: Individual watermark bit/total watermark
d: Dither
Δ: Quantization step size
Q_Δ(·): Uniform quantizer with step size Δ
α: DC-QIM factor
ρ: Gain factor
c: Scaling factor used for rounding/reducing quantization step size
v(·): Function to normalize coefficients for RDM
id: Buyer identity

ayashi and Tanaka. In Section 4, we describe the proposed anonymous fingerprinting schemes using the subtractive-dither QIM, DC-QIM, and RDM watermarking schemes. Section 5 describes the experiments that evaluate the robustness of the proposed schemes compared to the original watermarking schemes. Section 6 discusses the security benefits of using specially constructed buyer ids. Conclusions are given in Section 7. A list of used symbols is provided in Table 1.

2. WATERMARKING AND ENCRYPTION PRELIMINARIES

2.1. Basic quantization-index modulation

Quantization-index modulation (QIM) is a relatively recent watermarking technique [6]. It has become popular because of its high watermarking capacity and ease of implementation. The basic quantization-index modulation algorithm embeds a watermark bit w by quantizing a single signal sample x, choosing between a quantizer with even or odd values depending on the binary value of w. These quantizers with a step size Δ ∈ N are denoted by Q_{Δ-even}(·) and Q_{Δ-odd}(·), respectively.

Figure 1 shows the input and output characteristics of the quantizer, where w ∈ {0, 1} denotes the message bit that is

Page 75: 6.pdf

J. P. Prins et al. 3

Figure 1: Quantizer input-output characteristics.

embedded into the host data. The watermarked signal sample y then is

y = { Q_{Δ-even}(x), if w = 0,
    { Q_{Δ-odd}(x),  if w = 1.    (1)

The quantizers Q_{Δ-even}(·) and Q_{Δ-odd}(·) are designed such that they avoid biasing the values of y; that is, the expected (average) values of x and y are identical. The trade-off between embedding distortion and robustness of QIM against additive noise attacks is controlled by the value of Δ. The detection algorithm requantizes the received signal sample z with both Q_{Δ-even}(·) and Q_{Δ-odd}(·). The detected bit w ∈ {0, 1} is determined by whichever quantized value, Q_{Δ-even}(z) or Q_{Δ-odd}(z), has the smallest distance to the received sample z.

This scheme of even and odd quantizers can also be implemented by using a single quantizer with a step size of 2Δ and subtracting/adding Δ when w = 1. Implementing the quantizer in this way allows for the implementation of the scheme in the encrypted domain, as was shown in [5].
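As an illustration (not the paper's implementation), the single-quantizer form of (1), a step-2Δ lattice shifted by Δ when w = 1, can be sketched as:

```python
import numpy as np

def qim_embed(x, w, delta):
    """Basic QIM: quantize with step 2*delta, shifting the lattice by delta when w == 1."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w)
    return 2 * delta * np.round((x - w * delta) / (2 * delta)) + w * delta

def qim_detect(z, delta):
    """Pick the bit whose lattice point lies closest to the received sample."""
    z = np.asarray(z, dtype=float)
    d0 = np.abs(z - qim_embed(z, np.zeros(z.shape, dtype=int), delta))
    d1 = np.abs(z - qim_embed(z, np.ones(z.shape, dtype=int), delta))
    return (d1 < d0).astype(int)

x = np.array([10.3, 27.8, 64.1, 99.6])
w = np.array([0, 1, 1, 0])
y = qim_embed(x, w, delta=4)
print(qim_detect(y + 1.2, delta=4))  # noise below delta/2: w is recovered, [0 1 1 0]
```

The detector survives any perturbation smaller than Δ/2 in magnitude, which is exactly the distortion-robustness trade-off controlled by Δ mentioned above.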

A serious drawback of basic QIM watermarking is its sensitivity to amplitude-scaling attacks [7], in which signal samples are multiplied by a gain factor ρ. If the gain factor ρ is constant for all samples, the attack is called a fixed-gain attack (FGA). In amplitude-scaling attacks, the detector does not possess the factor ρ, which causes a mismatch between the embedder and decoder quantization lattices, affecting the QIM-detector performance dramatically.

Another drawback of basic QIM is that the embedding positions can be retrieved from a single copy. The embedding positions are those signal values x_i that have been (heavily) quantized to Q_{Δ-even}(x_i) and Q_{Δ-odd}(x_i), and have a constant difference value equal to Δ, that is, the quantizer coarseness parameter. By constructing a high-resolution histogram, the buyer can easily observe the evenly spaced spikes of signal intensity values and identify, and thus attack, the embedding positions locally. This results in the removal of the fingerprint with little degradation to the overall signal.

2.2. Homomorphic encryption schemes

The idea of processing encrypted data was first suggested by Ahituv et al. in [9]. In their paper, the problem of decrypting data before applying arithmetic operations is addressed, and a new approach is described: processing data without decrypting it first.

Succeeding works showed that some asymmetric cryptosystems preserve structure, which allows for arithmetic operations to be performed on encrypted data. This structure-preserving property, called homomorphism, comes in two main types, namely, additive and multiplicative homomorphism. Using an additively homomorphic cryptosystem, performing a particular operation (e.g., multiplication) on encrypted data results in the addition of the plaintexts. Similarly, using a multiplicatively homomorphic cryptosystem, multiplying ciphertexts results in the multiplication of the plaintexts. Paillier [10], Okamoto-Uchiyama [8], and Goldwasser-Micali [11] are additively homomorphic cryptosystems, while RSA [12] and ElGamal [13] are multiplicatively homomorphic cryptosystems.

The anonymous fingerprinting scheme proposed in [5] is based on the addition of the fingerprint to the digital data, and hence an additive cryptosystem is used. Among the candidates, the Okamoto-Uchiyama cryptosystem is chosen for efficiency considerations [5]. In the next section, the Okamoto-Uchiyama cryptosystem is described. We observe, however, that the anonymous fingerprinting schemes proposed in this paper can easily be implemented by using other additively homomorphic cryptosystems. It is, however, required to have a sufficiently large message space to represent the signal samples. Further, the underlying security protocols, such as the proof protocol for validating the buyer identity, must be suitable for the chosen cryptosystem.

A requirement for the cryptosystem is that it is probabilistic in order to withstand chosen-plaintext attacks. Such attacks are easily performed in our scheme because individual signal samples are usually limited in value (e.g., 8 bit). If we were to use a nonprobabilistic cryptosystem, this would enable the buyer to construct a codebook of ciphertexts for all possible messages (in total, 2^8 = 256) using the public key and to decrypt through this codebook. Fortunately, probabilistic cryptosystems were introduced in [11]; these encrypt a single plaintext into one of 2^n possible ciphertexts, where n is a security parameter related to the size of the key. Which ciphertext the plaintext is encrypted to depends on a blinding factor r, which is usually chosen at random. Selecting different r's does not affect the decrypted plaintext. By having a multitude of ciphertexts for a single plaintext, the size of a codebook becomes 2^8 · 2^n, and thus impractically large, preventing such attacks. All the above-mentioned additively homomorphic encryption schemes (Paillier, Okamoto-Uchiyama, and Goldwasser-Micali) are probabilistic, and hence withstand chosen-plaintext attacks.

From Section 3 onwards, we compactly denote the encryption and the decryption of a message by E(m) and D(c), respectively, omitting the dependency on the random factor r. In the scope of this paper, an additively homomorphic cryptosystem will be used for encrypting signal samples


4 EURASIP Journal on Information Security

which do not necessarily need to be integer values. In this case, rounding to the nearest integer value precedes the encryption, and thus, in this paper, E(·) denotes both rounding and encryption.

2.2.1. Okamoto-Uchiyama cryptosystem

Okamoto and Uchiyama [8] proposed a semantically secure and probabilistic public-key cryptosystem based on composite numbers. Let n = p^2·q, where p and q are two prime numbers of length k bits, and let g be a generator such that the order of g^{p−1} mod p^2 is p. Another generator is defined as h = g^n mod n. In this scheme, the public key is pk = (n, g, h, k) and the secret key is sk = (p, q).

Encryption.

A message m (0 < m < 2^{k−1}) is encrypted as follows:

    c = E(m, r) = g^m · h^r mod n,    (2)

where r is a random number in Z_n^*.

Decryption.

Decryption of the ciphertext is defined as

    m = D(c) = [ L(c^{p−1} mod p^2) / L(g^{p−1} mod p^2) ] mod p,    (3)

where the function L(·) is

    L(u) = (u − 1) / p.    (4)

The Okamoto-Uchiyama cryptosystem has the additive homomorphic property that, given two encrypted messages E(m_1, r_1) and E(m_2, r_2), the following equality holds:

    E(m_1, r_1) × E(m_2, r_2) = g^{m_1} h^{r_1} × g^{m_2} h^{r_2} mod n
                              = g^{m_1 + m_2} h^{r_1 + r_2} mod n
                              = E(m_1 + m_2, r_1 + r_2).    (5)

Here, × denotes integer multiplication modulo n.

3. KURIBAYASHI AND TANAKA ANONYMOUS FINGERPRINTING PROTOCOL

The fingerprinting scheme in [5] is carried out between buyer and merchant, and has as its objective to anonymously embed the buyer's identity information into the merchant's data (e.g., an audio, image, or video signal). The buyer decomposes his l-bit identity W into bits as W = (w_0, w_1, ..., w_{l−1}). For applications such as embedding identity information in multimedia data, the value of l is typically between 32 and 128 (bits), which is sufficiently large to prevent the merchant from guessing valid buyer IDs. Where necessary, we assume that the probabilities P[w_j = 0] and P[w_j = 1] are equal. After decomposition of W into individual bits, the buyer encrypts each bit with his public key using the Okamoto-Uchiyama cryptosystem, so that E(W) = (E(w_0), E(w_1), ..., E(w_{l−1})). These encrypted values are sent to the merchant.

The merchant first quantizes the samples of the (audio, image, or video) signal that the buyer wishes to obtain, using a quantizer with coarseness 2Δ, that is, x' = Q_{2Δ}(x). Here, the quantizer step size Δ is a positive integer to ensure that the quantized value can be encrypted. He then encrypts all quantized signal samples x' with the public key of the buyer, yielding E(x'). The merchant selects watermark-embedding positions using a unique secret key that will later be used to extract the watermark from redistributed copies. In order to embed a single bit of information w_j into one of the quantized and encrypted values E(x') at a particular watermark-embedding position, the merchant performs the following operation:

    E(y) = E(x') × E(w_j)^Δ = E(x' + w_j·Δ).    (6)

The result is an encrypted and watermarked signal value y, as can readily be seen from the following relation:

    D(E(y)) = x' + w_j·Δ,

    y = { Q_{2Δ}(x),      if w_j = 0,
          Q_{2Δ}(x) + Δ,  if w_j = 1.    (7)

The encrypted signal, with the buyer's identity information embedded into it in the form of a watermark, is finally sent to the buyer. Obviously, only the buyer can decrypt the watermarked signal values.

In order for the system to be robust against local attacks, the relation between the buyer's identity-information bits w_j and the signal values y (audio samples, image or video pixels) into which the information bits are embedded should be kept secret from the buyer. Note that, as a consequence, all signal values x will have to be encrypted, including the ones that do not carry a bit w_j of the buyer's identity information, so as to hide these embedding positions.

Compared to the original QIM scheme in (1), the above watermarking scheme introduces a bias, as the expected (average) value of y is Δ/2 larger than that of x. This bias is introduced because Δ·w_j is always added to the quantized signal value x' and never subtracted. In order to avoid this undesirable side effect, either the even or the odd quantizer should be selected depending on the watermark bit w_j, as in (1). However, the merchant has only the encrypted version of each watermark bit w_j, which prevents him from deciding between the two quantizers. To overcome this problem, the merchant compares the signal values x and x', and depending on the result, the encrypted value of Δ·w_j is added or subtracted [5]. When x' is smaller than x, Δ·w_j is added; otherwise, it is subtracted. This procedure is now equivalent to (1) and thus effectively removes the bias. As the decision does not depend on the value of w_j, no information is leaked about the value of w_j. The resulting embedding procedure for identity-information bit w_j then becomes

    E(y) = { E(x') × E(w_j)^Δ,          if x ≥ Q_{2Δ}(x),
             E(x') × (E(w_j)^Δ)^{−1},   if x < Q_{2Δ}(x),    (8)


J. P. Prins et al. 5

where (·)^{−1} denotes the modular inverse in the cyclic group defined by the encryption scheme. When the buyer decrypts the received encrypted and watermarked signal values, he obtains the following result for the watermark-embedding positions:

    y = { x' + w_j·Δ,  if x ≥ Q_{2Δ}(x),
          x' − w_j·Δ,  if x < Q_{2Δ}(x).    (9)

For all other positions, the unwatermarked and unchanged, but encrypted and therefore rounded, signal values x are transmitted.
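The branching embedder of (8) and its decryption result (9) can be sketched in the encrypted domain. For brevity, the sketch below uses a toy Paillier cryptosystem instead of Okamoto-Uchiyama; as noted in Section 2.2, any additively homomorphic scheme with a large enough message space will do. All parameters and helper names are illustrative and insecure:

```python
import math, random

random.seed(2)
p, q = 311, 317                          # toy Paillier primes (illustrative only)
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
g = n + 1
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)

def enc(m):                              # Paillier: E(m) = g^m r^n mod n^2
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            return pow(g, m, n2) * pow(r, n, n2) % n2

def dec(c):                              # D(c) = L(c^lam mod n^2) * mu mod n
    return (pow(c, lam, n2) - 1) // n * mu % n

Delta = 8
Q2D = lambda x: 2 * Delta * round(x / (2 * Delta))   # Q_{2Δ}

def embed(x, Ew):                        # eq. (8): merchant sees only E(w_j)
    Exq = enc(Q2D(x))                    # E(x')
    EwD = pow(Ew, Delta, n2)             # E(w_j)^Δ = E(w_j · Δ)
    if x >= Q2D(x):
        return Exq * EwD % n2            # add Δ·w_j
    return Exq * pow(EwD, -1, n2) % n2   # subtract Δ·w_j via the modular inverse

for x in (123, 130):                     # Q_16(123) = Q_16(130) = 128
    for w in (0, 1):
        y = dec(embed(x, enc(w)))
        expected = Q2D(x) + Delta * w if x >= Q2D(x) else Q2D(x) - Delta * w
        assert y == expected             # matches eq. (9)
```

The merchant only multiplies ciphertexts and exponentiates by public constants, so at no point does he learn w_j, yet the decrypted result is exactly the unbiased watermarked value of (9).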

In the above embedding protocol, we have assumed that the buyer provides encrypted values of a valid binary decomposition (w_0, w_1, ..., w_{l−1}) of his identity information W to the merchant. Since, however, the decomposed bits of the buyer's identity information are encrypted, the merchant cannot easily check this assumption. In the original work by Kuribayashi and Tanaka [5], a registration center is used, which assures the legitimacy of the buyer. During the purchase, the merchant first confirms the identity of the buyer, and then the buyer proves the validity of the decomposed bits of his identity information by using zero-knowledge proof protocols. Since this procedure is entirely independent of the watermarking scheme, we refer to [5] for details on the identity and decomposition validation and on the security of this procedure, where it is given for the Okamoto-Uchiyama encryption scheme. The focus of this paper is on the application of the homomorphic embedding procedure described above to the more robust watermarking schemes of [6, 7].

4. ANONYMOUS FINGERPRINTING USING ADVANCED WATERMARKING SCHEMES

From the perspective of the merchant, the embedding of the buyer's identification information must be as robust as possible in order to withstand both malicious and benign signal-processing operations on the fingerprinted signal. If the buyer-ID embedding procedure is not robust, the buyer could remove the fingerprint either intentionally or unintentionally, and as a consequence, the merchant would lose his ability to trace illegally redistributed copies. The fingerprints embedded in the Kuribayashi and Tanaka (KT) anonymous fingerprinting protocol, described in Section 3, are known to be sensitive to a number of signal-processing operations, and are, in fact, relatively easy to remove through the attacks mentioned in Section 2.1. We propose to increase the robustness of the Kuribayashi and Tanaka anonymous fingerprinting protocol, as perceived by the merchant, by applying their approach to two advanced quantization-based watermarking schemes, namely, DC-QIM and RDM.

So far, we have embedded the bits of the identity information into signal values without specifying what these signal values actually are. In the rest of this paper, we will use block-DCT transform coefficients of images to embed the identity bits into. A particular block-DCT coefficient into which we embed an information bit w_j will be abstractly denoted by x_i. Of course, in actual images, x_i may be a particular DCT coefficient of a particular DCT block in the image.

Figure 2: Subtractive dither QIM. (Block diagram: the dither d_i is added to x_i, the sum is quantized by Q_{2Δ} with offset ±Δ·w_j, and d_i is subtracted again to yield y_i.)

The relation between the bits w_j and the watermark-embedding positions x_i is determined by a key known only to the merchant. In practical cases of interest, the number of candidate embedding positions is of the same order as the number of signal samples, whereas the number of information bits is typically between 32 and 128. For instance, for a 1024 × 1024 pixel image, the maximum number of possible embedding combinations for 128 bits of information is the binomial coefficient (1024^2 choose 128), which provides enough security. In the case of embedding the bits w_j into DCT coefficients, the number of possible embedding combinations will be smaller, depending on the DCT block size and on the number of DCT coefficients in one block that are (perceptually and qualitatively) suitable for embedding a watermark bit.

It is important to note that the goal for each watermarking scheme within the Kuribayashi-Tanaka protocol is to compute the encryption of watermarked coefficients y_i, while having available only the original signal values x_i, the encrypted bits E(w_j) of the buyer's decomposed identity, and the public key pk of the selected additively homomorphic encryption scheme. Once the buyer identification information is correctly embedded in the encrypted domain, the encrypted coefficients (i.e., the encrypted digital content) will be sent to the buyer, who can decrypt them with his private key to obtain correctly watermarked data. Since the information bits are embedded in the DCT domain, a trivial inverse DCT on the decrypted data is necessary as the last step to obtain the purchased digital image. Because this is most easily performed in the plaintext domain, we leave it to the buyer to perform this inverse DCT after decryption, much like JPEG decompression.

4.1. Subtractive dither quantization-index modulation

Fingerprints embedded by the basic QIM watermarking scheme used by Kuribayashi and Tanaka as described in Section 2.1 can be locally attacked because the buyer can find the embedding positions x_i without checking all possible (for instance, (1024^2 choose 128)) combinations. A common solution to this weakness of the basic QIM watermarking scheme is to add pseudorandom noise, usually called dither, to x_i before embedding an information bit w_j, and to subtract the dither after embedding. As a consequence, the quantization levels and their constant difference Δ can no longer be observed, making the separation between embedding positions x_i and nonembedding positions impossible. The resulting watermarking scheme, illustrated in Figure 2, is called subtractive dither QIM (SD-QIM).


Figure 3: Distortion-compensated QIM. (Block diagram: the fraction α·x_i passes through the SD-QIM branch (add d_i, quantize by Q_{2Δ} with offset ±Δ·w_j, subtract d_i), and the remaining fraction (1 − α)·x_i is added back to yield y_i.)

In QIM terminology, a small amount of dither d_i is added prior to quantizing the signal amplitude x_i to an odd or even value, depending on the information bit w_j. After quantization of x_i + d_i, the same amount of dither d_i is subtracted. It is desirable that the dither can be used in cooperation with the QIM uniform quantizers Q_{Δ-odd}(·) and Q_{Δ-even}(·), which use a quantization step size of 2Δ, as in basic QIM. It has been shown [14] that a suitable choice for the PDF of the random dither d_i is a uniform distribution on [−Δ, Δ].

In order to embed the buyer’s identity information bitE(wj) into coefficient xi using the Kuribayashi-Tanaka pro-tocol in combination with subtractive dither, we carry outthe following protocol.

(i) Add random dither di to the signal sample or coeffi-cient xi.

(ii) Quantize xi + di with a quantization coarseness of 2Δ,and encrypt the result using the buyer’s public key,yielding E(Q2Δ(xi + di)).

(iii) Multiply by E(wj)Δ or its modular inverse depending

on the value of xi + di, in order to achieve the desiredquantization level.

(iv) Encrypt the dither di to obtain E(di). Note that, sincedi ∈ R, the encryption operation includes modulon rounding to an integer. Multiply the result of theprevious step with the modular inverse of E(di) as soto implement the subtraction of the dither di fromQ2Δ(xi + di).

Summarizing the above protocol steps, we obtain

    E(t_i) = { E(Q_{2Δ}(x_i + d_i)) × E(w_j)^Δ,          if x_i ≥ Q_{2Δ}(x_i),
               E(Q_{2Δ}(x_i + d_i)) × (E(w_j)^Δ)^{−1},   if x_i < Q_{2Δ}(x_i),

    E(y_i) = E(t_i) × E(d_i)^{−1}.    (10)

After decryption, the buyer obtains the (DCT-transformed) image, into which his identity information is embedded in certain DCT coefficients y_i according to the following subtractive dither QIM scheme:

    y_i = { Q_{Δ-even}(x_i + d_i) − d_i,  if w_j = 0,
            Q_{Δ-odd}(x_i + d_i) − d_i,   if w_j = 1.    (11)

The above embedding procedure demonstrates the application of the Kuribayashi-Tanaka protocol to subtractive-dither QIM. The plaintext subtractive-dither QIM and the above Kuribayashi-Tanaka subtractive-dither QIM (KT SD-QIM) are equivalent, except for the rounding of the dither d_i to integers before encryption. How to limit the adverse effect of integer rounding will be addressed next.
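The plaintext side of the scheme, that is, SD-QIM embedding per (11) together with the keyed (dithered) detector, can be sketched as follows; the quantizer and detector implementations and all parameter values are our own:

```python
import random

random.seed(3)
Delta = 4.0
q_even = lambda u: 2 * Delta * round(u / (2 * Delta))                    # Q_{Δ-even}
q_odd  = lambda u: 2 * Delta * round((u - Delta) / (2 * Delta)) + Delta  # Q_{Δ-odd}

x    = [random.uniform(0, 255) for _ in range(2000)]
bits = [random.randint(0, 1) for _ in x]
dith = [random.uniform(-Delta, Delta) for _ in x]     # keyed pseudorandom dither

# Eq. (11): quantize x_i + d_i to an even/odd multiple of Δ, then subtract d_i.
y = [(q_odd if w else q_even)(xi + di) - di for xi, w, di in zip(x, bits, dith)]

# Detection (with the dither key): re-add d_i and read the level's parity.
det = [round((yi + di) / Delta) % 2 for yi, di in zip(y, dith)]
assert det == bits

# Unlike basic QIM, the marked values no longer sit on a bare Δ-grid,
# so the histogram attack of Section 2.1 finds no spikes.
on_grid = sum(abs(yi / Delta - round(yi / Delta)) < 1e-9 for yi in y)
assert on_grid < len(y) // 10
```

The second assertion illustrates why the dither hides the embedding positions: without the key, the watermarked values are indistinguishable from unmarked ones.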

Two improvements of (10) are desirable. In the first place, we can subtract d_i before encrypting Q_{2Δ}(x_i + d_i). This effectively removes the last protocol step, and hence eliminates an unnecessary encryption operation. The resulting scheme can then be rewritten as follows:

    E(y_i) = { E(Q_{2Δ}(x_i + d_i) − d_i) × E(w_j)^Δ,          if x_i ≥ Q_{2Δ}(x_i),
               E(Q_{2Δ}(x_i + d_i) − d_i) × (E(w_j)^Δ)^{−1},   if x_i < Q_{2Δ}(x_i).    (12)

The second improvement concerns the quantization operation. The quantizer not only rounds the signal amplitudes to predetermined (not necessarily integer) quantization levels, but it must also round the signal values or DCT coefficients x_i + d_i to integers because of the ensuing encryption operation. If the signal values or DCT coefficients x_i are sufficiently large, using integer-valued coefficients is not a restriction at all. For smaller values of x_i, however, using integer values may be too restrictive or may yield too large deviations between the results of (12) and (11).

We propose to circumvent this problem by scaling all coefficients x_i by a constant factor c before embedding. Scaling has little effect on the en-/decryption, as long as the samples are not scaled beyond the message-group size of the encryption scheme used. The message-group size is, however, usually very large because of encryption security requirements (typically > 2^512). As a consequence of scaling x_i, the dither d_i and all encrypted bits E(w_j) of the buyer's decomposed identity also have to be scaled by c. We note that scaling introduces extra computation. However, the dither can be scaled and subtracted before encryption, resulting in a very small increase in complexity. The scaling of the encrypted bits E(w_j) of the buyer's decomposed identity has to be taken into account in the protocol steps, which is relatively easy since the scaling can be combined with the multiplication of w_j by Δ. The resulting embedding equation can be summarized as follows:

    E(y_i) = { E(c·(Q_{2Δ}(x_i + d_i) − d_i)) × E(w_j)^Δ,          if x_i ≥ Q_{2Δ}(x_i),
               E(c·(Q_{2Δ}(x_i + d_i) − d_i)) × (E(w_j)^Δ)^{−1},   if x_i < Q_{2Δ}(x_i).    (13)

The scaling factor c has to be communicated to the buyer so that the buyer can rescale the entire image after decryption to the proper (original) intensity range.
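The benefit of the scaling factor can be illustrated numerically: encrypting round(c·v) instead of round(v), and rescaling by 1/c after decryption, bounds the per-coefficient rounding error by 1/(2c). A small sketch with illustrative parameters:

```python
import random

random.seed(4)
Delta = 4.0
Q2D = lambda u: 2 * Delta * round(u / (2 * Delta))    # Q_{2Δ}

x = [random.uniform(0, 255) for _ in range(5000)]
d = [random.uniform(-Delta, Delta) for _ in x]
v = [Q2D(xi + di) - di for xi, di in zip(x, d)]       # value to be encrypted in (12)/(13)

def max_rounding_error(c):
    # Encrypt round(c*v), decrypt, rescale by 1/c: the residual error per
    # coefficient is at most 1/(2c).
    return max(abs(round(c * vi) / c - vi) for vi in v)

errs = {c: max_rounding_error(c) for c in (1, 10, 100)}
assert errs[1] > errs[10] > errs[100]
assert errs[100] <= 0.5 / 100 + 1e-12
```

For c = 1 the error approaches half a quantization-independent unit, large relative to a small Δ; for c = 100 it is already negligible, consistent with the experimental behavior reported in Section 5.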

4.2. Distortion-Compensated QIM

Distortion-compensated QIM (DC-QIM) [6] is an extension of the subtractive dither QIM scheme described in the previous section. Rather than directly adding dither to and quantizing x_i, a fraction α·x_i is used in the SD-QIM procedure (see Figure 3). The information bits are embedded only in the fraction α·x_i, where α lies within the range [0, 1]. The remaining fraction (1 − α)·x_i is added back to the watermarked signal component α·x_i to form the final embedded coefficient y_i. The embedder chooses an appropriate value for α depending on the desired detection performance and robustness of DC-QIM; an often-selected value is, as in [15],

    α = σ_w^2 / (σ_w^2 + σ_n^2),    (14)

where σ_w^2 = Δ^2/3 is the variance of the watermark in the watermarked signal, and σ_n^2 is the variance of the noise or other degradation that an attacker applies in an attempt to render the watermark bits undetectable. Obviously, the standard SD-QIM scheme is optimal only if an attacker inserts little or no noise into the watermarked image since, for σ_n^2 → 0, we find α → 1. The difference in robustness between SD-QIM and DC-QIM becomes especially relevant if the variance of the attacker's noise becomes large relative to σ_w^2, that is, σ_n^2 → σ_w^2.

As the differences between the SD-QIM and DC-QIM watermarking schemes merely consist of plaintext multiplications and ciphertext additions, DC-QIM can also be achieved within the limitations of the additively homomorphic encryption scheme used by the Kuribayashi-Tanaka protocol. The basic embedding operations can now be written as follows:

    E(t_i) = { E(Q_{2Δ}(α·x_i + d_i) − d_i) × E(w_j)^Δ,          if α·x_i ≥ Q_{2Δ}(α·x_i),
               E(Q_{2Δ}(α·x_i + d_i) − d_i) × (E(w_j)^Δ)^{−1},   if α·x_i < Q_{2Δ}(α·x_i),

    E(y_i) = E(t_i) × E((1 − α)·x_i).    (15)

Equation (15) results in the following watermarked values y_i after decryption:

    t_i = { Q_{2Δ}(α·x_i + d_i) − d_i + w_j·Δ,  if α·x_i ≥ Q_{2Δ}(α·x_i),
            Q_{2Δ}(α·x_i + d_i) − d_i − w_j·Δ,  if α·x_i < Q_{2Δ}(α·x_i),

    y_i = t_i + (1 − α)·x_i.    (16)

The plaintext distortion-compensated QIM and the above Kuribayashi-Tanaka distortion-compensated QIM (KT DC-QIM) are equivalent, except again for the rounding of the real-valued dither d_i and of (1 − α)·x_i to integers before encryption.
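A plaintext sketch of (14) and (16), with illustrative parameter values of our own choosing, shows that α lies strictly between 0 and 1 for nonzero attack noise, tends to 1 as σ_n^2 → 0, and that the per-coefficient embedding distortion remains bounded by 2Δ:

```python
import random

random.seed(5)
Delta, sigma_n = 30.0, 15.0
sigma_w2 = Delta ** 2 / 3                     # watermark variance, σ_w^2 = Δ²/3
alpha = sigma_w2 / (sigma_w2 + sigma_n ** 2)  # eq. (14)

Q2D = lambda u: 2 * Delta * round(u / (2 * Delta))

def dc_qim(x, w, d):                          # eq. (16), plaintext version
    s = Delta if alpha * x >= Q2D(alpha * x) else -Delta
    t = Q2D(alpha * x + d) - d + s * w
    return t + (1 - alpha) * x

assert 0 < alpha < 1                          # noise present: partial compensation
assert sigma_w2 / (sigma_w2 + 1e-12) > 0.999  # σ_n² → 0 gives α → 1 (plain SD-QIM)

# |y - x| = |Q_{2Δ}(u) - u ± w·Δ| with u = α·x + d, hence bounded by 2Δ.
for _ in range(2000):
    x = random.uniform(0, 255)
    y = dc_qim(x, random.randint(0, 1), random.uniform(-Delta, Delta))
    assert abs(y - x) <= 2 * Delta + 1e-9
```

The uncompensated fraction (1 − α)·x_i is what buys the additive-noise robustness reported in Section 5.2.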

Similar to the subtractive dither QIM watermark algorithm, KT DC-QIM can be modified to subtract the dither before encryption, and to scale the signal values before encryption. Furthermore, the term (1 − α)·x_i can be added before encryption, further reducing the number of encryptions needed. The resulting KT DC-QIM embedding equations then become

    E(t_i) = { E(c·(Q_{2Δ}(α·x_i + d_i) − d_i)) × E(w_j)^Δ,          if α·x_i ≥ Q_{2Δ}(α·x_i),
               E(c·(Q_{2Δ}(α·x_i + d_i) − d_i)) × (E(w_j)^Δ)^{−1},   if α·x_i < Q_{2Δ}(α·x_i),

    E(y_i) = E(t_i) × E(c·(1 − α)·x_i).    (17)

Figure 4: Rational dither modulation. (Block diagram: x_i is divided by v(Y_{i−1}), computed from the L previous watermarked values via a delay Z^{−L}; the normalized value passes through the SD-QIM branch (add d_i, quantize by Q_{2Δ} with offset ±Δ·w_j, subtract d_i) and is multiplied by v(Y_{i−1}) to yield y_i.)

4.3. Rational dither modulation

DC-QIM provides a significant improvement in robustness compared to the basic QIM scheme. Nevertheless, the DC-QIM scheme is known to be very sensitive to gain or volumetric attacks, which simply scale the image intensities. Because the scaling factor c is used in SD-QIM and DC-QIM to reduce the sensitivity to integer rounding before encryption, the buyer has an excellent opportunity to perform a gain attack on the watermarked signal. The gain effect causes the quantization levels used at the detector to be misaligned with those embedded in the purchased and illegally distributed digital data, effectively making the retrieval of the watermarked identity bits impossible [16].

Perez-Gonzalez et al. [7] proposed applying QIM to ratios between signal samples so as to make the watermarking system robust against fixed-gain attacks. The resulting approach, known as rational dither modulation (RDM), is robust against both additive-noise and fixed-gain attacks. The RDM embedding scheme is illustrated in Figure 4. The robustness against fixed-gain attacks is achieved by normalizing the signal value (or DCT coefficient) x_i by v(Y_{i−1}), a function that combines the L previous watermarked signal values Y_{i−1} = (y_{i−1}, y_{i−2}, ..., y_{i−L}). An example for the function v(Y_{i−1}) is the Hölder vector norm, as suggested in [7]:

    v(Y_{i−1}) = ( (1/L) · Σ_{m=i−L}^{i−1} |y_m|^p )^{1/p}.    (18)

The SD-QIM watermark embedding then takes place on the normalized signal values x_i/v(Y_{i−1}), yielding

    y_i = { v(Y_{i−1}) · ( Q_{Δ-even}( x_i/v(Y_{i−1}) + d_i ) − d_i ),  if w_j = 0,
            v(Y_{i−1}) · ( Q_{Δ-odd}( x_i/v(Y_{i−1}) + d_i ) − d_i ),   if w_j = 1,    (19)

where the multiplication of the quantization result by v(Y_{i−1}) is required to scale the coefficients back to their original value range. Another way of viewing RDM is that it is equivalent to using SD-QIM with a signal-amplitude-dependent quantization coarseness v(Y_{i−1})·Δ.
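The RDM recursion of (18) and (19), and its invariance to a fixed-gain attack, can be sketched in plaintext as follows. The start-up history, signal model, and all parameter values are illustrative assumptions of ours (p = 2 in the Hölder norm):

```python
import random

random.seed(6)
Delta, L = 0.05, 25
q_even = lambda u: 2 * Delta * round(u / (2 * Delta))
q_odd  = lambda u: 2 * Delta * round((u - Delta) / (2 * Delta)) + Delta
v = lambda buf: (sum(ym * ym for ym in buf) / len(buf)) ** 0.5   # eq. (18), p = 2

x    = [random.uniform(50, 200) for _ in range(400)]
bits = [random.randint(0, 1) for _ in x]
dith = [random.uniform(-Delta, Delta) for _ in x]

buf, y = [100.0] * L, []                  # arbitrary start-up history (assumption)
for xi, w, di in zip(x, bits, dith):
    vi = v(buf)
    yi = vi * ((q_odd if w else q_even)(xi / vi + di) - di)      # eq. (19)
    y.append(yi)
    buf = buf[1:] + [yi]

def detect(received):
    hist, out = [100.0] * L, []
    for yi, di in zip(received, dith):
        out.append(round((yi / v(hist) + di) / Delta) % 2)
        hist = hist[1:] + [yi]
    return out

assert detect(y) == bits                  # no attack: perfect recovery
gained = [1.3 * yi for yi in y]           # fixed-gain attack, rho = 1.3
assert detect(gained)[L:] == bits[L:]     # invariant once the history is scaled
```

Because both y_i and v(Y_{i−1}) scale with the gain factor, their ratio, and hence the detected parity, is unchanged; only the first L samples, where the detector's start-up history is not yet scaled, may be affected.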

The normalization of x_i uses a function of (y_{i−1}, y_{i−2}, ..., y_{i−L}) rather than of (x_{i−1}, x_{i−2}, ..., x_{i−L}). The usage of v(Y_{i−1}) is preferable because only the watermarked values y_i are available during watermark detection. In the Kuribayashi-Tanaka protocol, however, the watermarked signal values or DCT coefficients y_i are only available to the merchant in encrypted form E(y_i). Unfortunately, the embedder cannot make use of v(Y_{i−1}) as a normalization factor, primarily because homomorphic division (and, for that matter, multiplication) of two encrypted values is not defined in an additively homomorphic encryption scheme. Also, the evaluation of the normalization function v(Y_{i−1}) (e.g., (18)) may not be computable on encrypted values.

Consequently, we have to use the original signal/coefficient values (x_{i−1}, x_{i−2}, ..., x_{i−L}), which have the same statistics as (y_{i−1}, y_{i−2}, ..., y_{i−L}) for sufficiently large values of L. Experimental results have shown that an appropriate value of L is 25. For this value of L, the detection results using normalization by v(X_{i−1}) are sufficiently close to the results based on normalization by v(Y_{i−1}).

Since RDM applies QIM to the ratio x_i/v(X_{i−1}), attention should be paid to the integer-rounding process. Since x_i/v(X_{i−1}) will usually be around (the real number) 1.0, rounding to an integer will almost always yield (the integer) 1, introducing unacceptably large watermarking distortions. Therefore, scaling the ratio by a factor c becomes essential in RDM. Furthermore, after quantization of the ratio x_i/v(X_{i−1}), the result needs to be multiplied by v(X_{i−1}). Thanks to the homomorphic property, this can be carried out in the encrypted domain by an exponentiation with v(X_{i−1}) in modulo arithmetic. To this end, v(X_{i−1}) obviously has to be an integer, requiring another rounding step. In case this rounding effect is severe, another scaling can be carried out on v(X_{i−1}). Since, in our experiments, this effect turned out to be negligible, we do not consider scaling of v(X_{i−1}) itself. We denote the rounded value of v(X_{i−1}) by v_int(X_{i−1}).

Using again the notation d_i for the uniformly distributed dither, the RDM embedding equations become

    E(t_i) = { E(c·(Q_{2Δ}( x_i/v_int(X_{i−1}) + d_i ) − d_i)) × E(w_j)^Δ,
                   if c·x_i/v_int(X_{i−1}) ≥ Q_{2Δ}( c·x_i/v_int(X_{i−1}) ),
               E(c·(Q_{2Δ}( x_i/v_int(X_{i−1}) + d_i ) − d_i)) × (E(w_j)^Δ)^{−1},
                   if c·x_i/v_int(X_{i−1}) < Q_{2Δ}( c·x_i/v_int(X_{i−1}) ),

    E(y_i) = E(t_i)^{v_int(X_{i−1})}.    (20)

With the above scheme, we have succeeded in adapting the RDM watermarking scheme, one of the most recent QIM watermarking approaches, to the constraints set by the Kuribayashi-Tanaka protocol.

5. EXPERIMENTAL VALIDATION

In this section, we experimentally compare the plaintext versions of the SD-QIM, DC-QIM, and RDM watermarking schemes with the proposed versions based on the Kuribayashi-Tanaka fingerprinting protocol. The buyer's

Table 2: Table of parameters.

Algorithm   Scaling factor           Quantization step size     Noise
SD-QIM      c = 1, 2, 5, 10, 100     Δ = k,   1 ≤ k ≤ 20        —
DC-QIM      c = 1, 10, 100           Δ = 5k,  1 ≤ k ≤ 20        σ_n = 15
RDM         c = 10                   Δ = k,   1 ≤ k ≤ 20        σ_n = 15
            c = 100                  Δ = k,   1 ≤ k ≤ 20
            c = 1000                 Δ = 8k,  1 ≤ k ≤ 20
            c = 10,000               Δ = 75k, 1 ≤ k ≤ 20

identity information will be embedded into the DC DCT coefficients of 8 × 8 blocks. Per image, we embed 64 bits of identity information into 64 DC DCT coefficients that are pseudorandomly selected based on a secret key known only to the merchant. In all experiments, we use the 256 × 256 pixel gray-valued Lena and Baboon images. Because of run-time efficiency and the availability of the necessary proofs, we selected the Okamoto-Uchiyama cryptosystem for all experiments, as in [5]. The Okamoto-Uchiyama cryptosystem has a smaller encryption rate compared to (generalized versions of) Paillier because of a smaller message space at the same security level. However, as signal values are usually sampled with 8-bit precision, a smaller message space is not a problem for our application, while the ciphertext size is reduced with the Okamoto-Uchiyama cryptosystem, resulting in lower overall computational complexity.

We not only compare the performance of the plaintext and ciphertext versions of the SD-QIM, DC-QIM, and RDM watermarking schemes, but also evaluate the effect of integer rounding and of the scaling parameter c on the performance. In our graphs, each point shown is based on 100 measurements, and each measurement is a complete, new iteration of the Kuribayashi-Tanaka protocol. A table of the parameters^1 used for the algorithms can be found in Table 2.

5.1. Subtractive dither QIM

An important performance measure of a watermarking scheme is the bit-error rate (BER) of the watermark detector as a function of the embedding strength of the watermark. The BER quantifies the probability P_e of incorrectly detecting a single bit of information. Usually, the buyer's identity information contains some form of channel coding, so that the buyer's identity can still be retrieved even if a few bits are incorrectly detected from the fingerprinted image; this is further discussed in Section 6.
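The BER of SD-QIM under an additive Gaussian noise attack can be estimated with a small Monte-Carlo sketch. This is not the paper's experimental code; the signal model, trial count, and parameter values are illustrative:

```python
import random

def sd_qim_ber(Delta, sigma_n, trials=4000, rng=random.Random(7)):
    """Monte-Carlo estimate of P_e for SD-QIM under additive Gaussian noise."""
    # Nearest even (par = 0) or odd (par = 1) multiple of Delta.
    q = lambda u, par: 2 * Delta * round((u - par * Delta) / (2 * Delta)) + par * Delta
    errors = 0
    for _ in range(trials):
        x = rng.uniform(0, 255)
        w = rng.randint(0, 1)
        d = rng.uniform(-Delta, Delta)
        y = q(x + d, w) - d                    # eq. (11)
        r = y + rng.gauss(0, sigma_n)          # attacker's additive noise
        errors += (round((r + d) / Delta) % 2) != w
    return errors / trials

weak, strong = sd_qim_ber(Delta=10, sigma_n=15), sd_qim_ber(Delta=60, sigma_n=15)
assert weak > 0.2     # WNR far below 0 dB: detection barely better than chance
assert strong < 0.1   # a larger Delta (higher WNR) drives the BER down
```

Increasing Δ raises the watermark power σ_w^2 = Δ²/3 and hence the WNR, pushing P_e down at the cost of a lower DWR; this is the trade-off the experiments below explore.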

In order to measure the distortion that the watermark introduces into the host signal, we use the document-to-watermark ratio (DWR):

    DWR = 10·log_10( σ_x^2 / σ_w^2 )  (dB).    (21)

^1 The code for the implementation can be found at http://ict.ewi.tudelft.nl.


Figure 5: SD-QIM bit-error rate (BER) P_e as a function of the document-to-watermark ratio (DWR) for the original SD-QIM scheme and KT SD-QIM with different scaling factors c = 1, 2, 5, 10, and 100, for (a) the Lena and (b) the Baboon images. Both panels plot P_e (10^−4 to 10^0, logarithmic scale) against the DWR in dB.

Here, σ_x^2 is the variance of the data into which the watermark is embedded, which, in our case, are the DC DCT coefficients of 8 × 8 blocks. Further, σ_w^2 is the variance of the distortion caused by the embedded watermark. Following [6], we equate σ_w^2 = Δ^2/3. The objective of a watermarking scheme is to achieve a low BER at a high DWR. The proper values for the DWR, and thus Δ, are application and data dependent. In this paper, we are not concerned with selecting a suitable value of Δ. We rather study the behavior of the BER as a function of the DWR for the plaintext and Kuribayashi-Tanaka versions of the SD-QIM watermarking scheme.

Figure 5 shows the BER-DWR relation for the two versions of the SD-QIM algorithm. The performance of the Kuribayashi-Tanaka version of the SD-QIM (KT SD-QIM) watermarking scheme is shown for several values of the scaling factor c. Although no deliberate attack is performed on the watermark, the inverse DCT transform and the consequent rounding to 8-bit pixel values introduce a distortion into the fingerprinted signal. The robustness of the watermarking scheme is sufficient, however, to result in no bit errors at a DWR of 31–34 dB. A peculiar effect is the increased robustness of the heavily rounded (i.e., scaling factor c = 1) KT SD-QIM compared to the original watermarking scheme. We believe that this behavior is caused by the distorting effect of the (inverse) DCT transform. By increasing the scaling factor c, we can approximate the performance of the original SD-QIM. The performance is already closely approximated with c = 100 in this instance, but in general, the application, the data, and the implementation of the DCT will determine which value of c is required to approximate the performance of the plaintext SD-QIM scheme.

5.2. Distortion-Compensated QIM

Figure 5 showed the BER in a scenario without any explicit attacks on the watermark. Distortion-compensated QIM can be used to provide optimal robustness against additive-noise attacks. Therefore, we show the performance of the Kuribayashi-Tanaka adaptation of DC-QIM and compare it with the original DC-QIM and the previously discussed SD-QIM. A measure of the amount of noise introduced relative to the strength of the watermark is the watermark-to-noise ratio (WNR):

    WNR = 10·log_10( σ_w^2 / σ_n^2 )  (dB).    (22)

Here, σ_n^2 is the variance of the additive zero-mean Gaussian noise that the attacker adds to the fingerprinted content. The value of α is chosen according to (14), so that the DC-QIM scheme is tuned for a specific additive-noise variance level. In all our experiments, we use σ_n = 15 and change the value of Δ = √3·σ_w so as to obtain a varying WNR.

Figure 6 shows the BER-WNR relation for SD-QIM and DC-QIM. We choose to fix the amount of additive noise instead of the DWR because we are interested in the effect that the scaling factor c has on the required embedding strength (i.e., the value of Δ and thus the watermark power), and not in a variable amount of additive noise. Therefore, Figure 6 cannot easily be compared to other literature on watermark robustness. As in our previous experiment, the watermark distortion is calculated using the expression σ_w^2 = Δ^2/3 [6].

As can be observed, the performance of DC-QIM under additive noise is better than that of SD-QIM, which is in accordance with [6]. We are mostly concerned with the comparison of the original version of the DC-QIM scheme and the


Figure 6: SD-QIM and DC-QIM bit-error rate (BER) as a function of the watermark-to-noise ratio (WNR) with additive noise (σ_n = 15) for the original SD-QIM and DC-QIM schemes and the KT SD-QIM and KT DC-QIM schemes with different scaling factors c, for (a) the Lena and (b) the Baboon images.

Kuribayashi-Tanaka adaptation of DC-QIM. As expected, the performance of the original DC-QIM scheme and the Kuribayashi-Tanaka adaptation of DC-QIM (KT DC-QIM) differ very little. The scaling factor c also has little effect on the BER. This can be explained by the fact that the additive noise dominates the errors caused by the integer rounding.

5.3. Rational dither modulation

Unlike the previous two watermarking schemes, rational dither modulation (RDM) depends on a sufficiently large scaling factor c in order to achieve a quantization coarseness Δ lower than 1. The scaling factor c determines the possible resolution of Δ. We are interested in seeing which resolution is required in order to achieve good performance. Although the results depend on the data and the strength of the added noise, the trend of these results will be observed for other cases and data as well because the signal coefficients xi are normalized before embedding.
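The RDM principle, quantizing each coefficient relative to a gain-invariant function of previously watermarked samples, can be sketched as follows. This is a simplified illustration of the idea in [7], with the mean absolute value of a short window as an assumed normalization function g; it is not the exact scheme evaluated in the paper.

```python
import random

def g_func(past):
    """Assumed normalization: mean absolute value of the previous samples."""
    return max(sum(abs(v) for v in past) / len(past), 1e-6) if past else 1.0

def rdm_embed(x, bits, delta, window=10):
    y = []
    for k, (xk, b) in enumerate(zip(x, bits)):
        g = g_func(y[max(0, k - window):k])
        # quantize x_k / g onto delta*Z (bit 0) or delta*Z + delta/2 (bit 1)
        q = round(xk / g / delta - b / 2.0) * delta + b * delta / 2.0
        y.append(q * g)
    return y

def rdm_detect(y, delta, window=10):
    bits = []
    for k, yk in enumerate(y):
        g = g_func(y[max(0, k - window):k])
        r = yk / g
        d0 = abs(r - round(r / delta) * delta)
        d1 = abs(r - delta / 2 - round(r / delta - 0.5) * delta)
        bits.append(0 if d0 <= d1 else 1)
    return bits

random.seed(2)
delta = 0.5
bits = [random.randint(0, 1) for _ in range(500)]
host = [random.gauss(0, 50) for _ in bits]
y = rdm_embed(host, bits, delta)
decoded = rdm_detect(y, delta)
attacked = rdm_detect([1.08 * v for v in y], delta)   # amplitude-scaling attack
print(decoded == bits, attacked[1:] == decoded[1:])
```

Because an amplitude-scaling attack multiplies both the sample and g by the same factor, the decoder input yk/g is unchanged (except for the very first sample, which has no history), which is the source of RDM's gain invariance.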

Figure 7 shows the bit error rate (BER) performance of RDM as a function of the watermark-to-noise ratio (WNR) for the plaintext and Kuribayashi-Tanaka versions of RDM. The different curves reflect different values for the scaling factor c. Because of the complexity of the analytical expression of the watermark distortion σw² in [7], we measured the watermark distortion directly from the data.

Figure 7 shows that the value of the scaling factor c determines which points of the Pe-WNR curve are attainable by the Kuribayashi-Tanaka RDM scheme. With a scaling factor c = 10, only WNRs of 12 dB or higher are reachable (see the "KT RDM, c = 10" curve in Figure 7, which starts at 12 dB), allowing for very little flexibility in choosing the optimal embedding strength for a specific application. A scaling factor of 100 performs much better, and a factor of 1000 approximates the original RDM closely.

Besides offering robustness to additive-noise attacks equivalent to that of SD-QIM, RDM is robust against amplitude-scaling attacks. Figure 8 shows the robustness of SD-QIM, DC-QIM, and RDM to an amplitude-scaling attack. SD-QIM and DC-QIM show a high vulnerability to amplitude-scaling attacks. At a small gain factor ρ of 1.05, approximately 50 percent of the buyer's identifying information cannot be retrieved correctly, while RDM is robust throughout the whole range of the gain factor. Although in theory RDM should not be affected by an amplitude-scaling attack at all, some bit errors start to show up at gain factors larger than 1.06. These are inherent to the 8-bit data-representation format, which easily overflows for large gain factors.

6. SECURITY ASPECTS OF BUYER IDENTITY

As fingerprint detection is a signal processing operation, detected fingerprints will usually be distorted even without attacks on the fingerprint by a malicious buyer, as discussed in Section 4. The fingerprint can, for instance, be distorted by perfectly legitimate signal-processing operations such as compression, the obligatory inverse DCT, and the consequent rounding. In this scenario, the merchant would normally not be able to present a perfectly retrieved buyer id. The registration center could accept merchant buyer id submissions that are similar to a correct buyer id. However, the security of the buyer depends on the inability of the merchant to guess a correct buyer id. To allow the merchant to submit similar


J. P. Prins et al. 11

[Figure: two plots of Pe versus WNR (dB) at σn = 15, panels (a) and (b).]

Figure 7: RDM bit error rate (BER) as a function of the watermark-to-noise ratio (WNR) with additive noise (σn = 15) for the original RDM scheme and the KT RDM scheme with different scaling factors c for (a) the Lena and (b) the Baboon image.

[Figure: plot of Pe versus gain factor ρ.]

Figure 8: KT bit error rate (BER) as a function of the gain factor (ρ) for the KT SD-QIM, KT DC-QIM, and KT RDM schemes with c = 1000. The DWR is fixed to 7.1 dB. Data points below a BER of 10⁻³ are plotted for visualization but are in reality 0.

buyer ids and for the registration center to accept these would thus harm the buyer's security.

By letting the registration center extend the buyer id with a forward-error-correcting scheme, the merchant can compensate for a small and fixed maximum number of bit errors in the buyer id. This is of course equivalent to increasing the size of the buyer id and allowing for a small number of bit errors at the registration center. This approach has the advantage that it moves the computational complexity of the error correction from the registration center to the merchant.
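As an illustration of such a forward-error-correcting extension, a classic Hamming(7,4) code corrects a single bit error per 7-bit block; this is a generic sketch, not the code used by any particular registration center.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits as a 7-bit Hamming codeword (parity at positions 1, 2, 4)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]          # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]          # covers positions 2, 3, 6, 7
    p3 = d[1] ^ d[2] ^ d[3]          # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def hamming74_decode(bits):
    """Correct up to one flipped bit, then return the 4 data bits."""
    b = list(bits)
    s1 = b[0] ^ b[2] ^ b[4] ^ b[6]
    s2 = b[1] ^ b[2] ^ b[5] ^ b[6]
    s3 = b[3] ^ b[4] ^ b[5] ^ b[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the error, 0 if none
    if syndrome:
        b[syndrome - 1] ^= 1
    return b[2] | (b[4] << 1) | (b[5] << 2) | (b[6] << 3)

chunk = 0b1011                          # a 4-bit chunk of a buyer id
code = hamming74_encode(chunk)
code[5] ^= 1                            # one bit error caused by attack/rounding
print(hamming74_decode(code) == chunk)  # → True
```

Stronger codes (e.g., BCH or Reed-Solomon) would follow the same pattern with a larger correctable-error budget, at the cost of a longer encoded buyer id.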

There is a choice to be made concerning the locations of the embedding positions for each buyer. The embedding positions can be changed for each buyer, but this would not provide any real benefit to the robustness of the total fingerprinting scheme other than that colluding buyers would have to compare their individual fingerprinted versions with a number of other versions in order to detect the embedding locations. If the embedding locations are identical for each fingerprinted copy, buyers who have located these embedding positions could publish them, and all buyers could then remove the fingerprint from their copies. Using unique embedding positions for each buyer has, however, a big disadvantage upon detection. As with any fingerprinting scheme, the merchant cannot know the used embedding positions before detection, as the detection procedure is the sole method to discriminate between copies. The unavailability of the embedding positions prevents the merchant from detecting the buyer id, resulting in a deadlock. In order to break this deadlock, the merchant could estimate the embedding positions by using a nonblind detection procedure (e.g., subtract the original image from the encountered image and thus find the most likely candidate embedding locations, as they will show a large difference to the original signal) or by embedding a pilot signal to identify the used embedding positions. However, this would be ineffective for copies that are heavily distorted by attacks. Another way to retrieve the correct buyer id is to let the merchant detect for all possible embedding locations and use a (soft) error-correction scheme to determine the most likely buyer id, based on the distance of the detected id from a valid codeword in the used error-correction scheme. This, however,



makes the complexity of the detection procedure linear in the number of buyers, as it has to be performed for each used combination of embedding positions.

Although dithering prevents an individual buyer from detecting the embedding positions, a coalition of buyers can collude to find them. By comparing different fingerprinted copies, the coalition can locate the differing samples and coefficients and, as the fingerprint embedding is the predominant cause of these differing samples, consequently the embedding positions. This vulnerability can be eliminated by constructing the buyer ids through the scheme of Boneh and Shaw [17], making them collusion secure. The collusion security of the scheme of Boneh and Shaw depends on generating buyer ids such that they have a number of identical bits wj for any colluding coalition of c buyers. Because these buyer id bits are identical, the coalition is not able to detect these embedded bits by comparing their individually fingerprinted copies. This does, however, require that the embedding positions are identical for each fingerprinted copy. Because the embedding positions for these bits cannot be determined, they are safe from targeted attacks and can therefore be detected correctly by the merchant even after the attack by the colluding buyer coalition. Constructing such a collusion-secure code for a large coalition entails a large increase in the buyer id length. As shown in [17], the length is O(c⁴ log(N/e) log(1/e)), where c is the number of colluding buyers, N is the total number of buyers, and e is the probability that the cheating buyer cannot be retrieved after a collusion attack. Because of the anonymity of the embedding procedure, the registration center will have to generate the collusion-secure buyer ids, as it is the only party the merchant trusts to generate a valid buyer id.
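To get a feeling for the size of such codes, the quoted length bound can be evaluated numerically; the function below omits the hidden constants of the O-notation, so it is an order-of-magnitude estimate only, not the exact Boneh-Shaw construction.

```python
import math

def bs_code_length(c, n_buyers, eps):
    """Order-of-magnitude estimate of the Boneh-Shaw code length
    O(c^4 log(N/e) log(1/e)); constant factors are omitted."""
    return c ** 4 * math.log(n_buyers / eps) * math.log(1 / eps)

# e.g., tolerating c = 5 colluders among N = 10^6 buyers with e = 10^-3:
print(f"~{bs_code_length(5, 10**6, 1e-3):,.0f} bits (up to constant factors)")
```

Even for modest coalition sizes the c⁴ term dominates, which is why the paper notes that such codes may not be practical in real applications.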

7. CONCLUSION

In conventional fingerprinting schemes, the buyer's identity is known to the merchant during embedding. This knowledge can easily be abused by a malicious merchant by creating fingerprinted copies containing this identity information without the buyer's consent. After distribution, the merchant can claim a license violation for this specific buyer. To deal with this problem, Kuribayashi and Tanaka proposed a reasonably efficient solution in [5] based on embedding the buyer identification information using additive homomorphic encryption schemes. The problem of the protocol proposed in [5] is the vulnerability of the underlying basic QIM watermarking scheme, which is fragile to simple attacks like amplitude scaling and allows for the detection of the embedding positions. Therefore, we have proposed to adapt the DC-QIM and RDM techniques to the anonymous fingerprinting scheme of Kuribayashi and Tanaka.

We have adapted the DC-QIM and RDM techniques, which, unlike basic QIM, hide the embedding locations because they are based on SD-QIM. They perform provably equivalent (RDM) or better (DC-QIM) than the watermarking scheme in the original work against additive-noise attacks. Furthermore, RDM provides robustness to amplitude-scaling attacks, the lack of which is a major drawback of the basic QIM scheme used in [5].

Although rounding errors can be made arbitrarily small through the use of scaling factors, the practical need for this, as shown in the experiments, is small. As integer quantization step sizes have to be used because of the homomorphic encryption scheme, the distortion introduced by the fingerprint embedding is usually larger than the distortion introduced by integer rounding. As a consequence, rounding with a scaling factor of one (i.e., no scaling) already gives acceptable performance. The scaling factor has its use, however, in increasing the effective quantizer resolution. Although this is of limited use for signals with a relatively large value range, it is essential for signals with a small value range, as is the case for RDM after normalization.

Due to attacks on the digital content or transmission errors, the identity information of the buyer can be extracted with bit errors. In that case, using error-correction codes can improve the ability of the merchant to recover the identity information. By letting the registration center select the buyer identity information, we can incorporate these error-correction capabilities or even provide a collusion-secure fingerprinting scheme. This greatly increases the length of the embedded buyer identification information and the complexity of constructing a valid identity at the registration center. Although this might not be practical in real applications, it provides a theoretical solution to the problem of collusion.

By adapting the DC-QIM and RDM watermarking schemes to the anonymous fingerprinting protocol of Kuribayashi and Tanaka, we increased the robustness of the embedded fingerprints while preserving the anonymity of the fingerprinting protocol. Consequently, the buyer's ability to successfully attack embedded fingerprints is reduced, increasing the deterrence to the illegal redistribution of digital content.

ACKNOWLEDGMENTS

The work described in this paper has been supported in part by the European Commission through the IST Programme under Contract no. 034238-SPEED. The information in this document reflects only the authors' views, is provided as is, and no guarantee or warranty is given that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability.

REFERENCES

[1] N. Memon and P. Wong, "A buyer-seller watermarking protocol," IEEE Transactions on Image Processing, vol. 10, no. 4, pp. 643–649, 2001.

[2] B. Pfitzmann and M. Waidner, "Anonymous fingerprinting," in International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT '97), vol. 1233, pp. 88–102, Konstanz, Germany, May 1997.

[3] B. Pfitzmann and A.-R. Sadeghi, "Coin-based anonymous fingerprinting," in International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT '99), vol. 1592, pp. 150–164, Prague, Czech Republic, May 1999.



[4] B. Pfitzmann and A.-R. Sadeghi, "Anonymous fingerprinting with direct non-repudiation," in Proceedings of the 6th International Conference on the Theory and Application of Cryptology and Information Security (ASIACRYPT '00), vol. 1976, pp. 401–414, Kyoto, Japan, December 2000.

[5] M. Kuribayashi and H. Tanaka, "Fingerprinting protocol for images based on additive homomorphic property," IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2129–2139, 2005.

[6] B. Chen and G. W. Wornell, "Quantization index modulation: a class of provably good methods for digital watermarking and information embedding," IEEE Transactions on Information Theory, vol. 47, no. 4, pp. 1423–1443, 2001.

[7] F. Perez-Gonzalez, C. Mosquera, M. Barni, and A. Abrardo, "Rational dither modulation: a high-rate data-hiding method invariant to gain attacks," IEEE Transactions on Signal Processing, vol. 53, no. 10, part 2, pp. 3960–3975, 2005.

[8] T. Okamoto and S. Uchiyama, "A new public-key cryptosystem as secure as factoring," in International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT '98), vol. 1403, pp. 308–318, Espoo, Finland, June 1998.

[9] N. Ahituv, Y. Lapid, and S. Neumann, "Processing encrypted data," Communications of the ACM, vol. 30, no. 9, pp. 777–780, 1987.

[10] P. Paillier, "Public-key cryptosystems based on composite degree residuosity classes," in International Conference on the Theory and Application of Cryptographic Techniques (EUROCRYPT '99), vol. 1592 of Lecture Notes in Computer Science, pp. 223–238, Springer, Prague, Czech Republic, May 1999.

[11] S. Goldwasser and S. Micali, "Probabilistic encryption," Journal of Computer and System Sciences, vol. 28, no. 2, pp. 270–299, 1984.

[12] R. L. Rivest, A. Shamir, and L. Adleman, "A method for obtaining digital signatures and public-key cryptosystems," Communications of the ACM, vol. 21, no. 2, pp. 120–126, 1978.

[13] T. ElGamal, "A public key cryptosystem and a signature scheme based on discrete logarithms," IEEE Transactions on Information Theory, vol. 31, no. 4, pp. 469–472, 1985.

[14] I. D. Shterev and R. L. Lagendijk, "Amplitude scale estimation for quantization-based watermarking," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4146–4155, 2006.

[15] M. Costa, "Writing on dirty paper," IEEE Transactions on Information Theory, vol. 29, no. 3, pp. 439–441, 1983.

[16] F. Bartolini, M. Barni, and A. Piva, "Performance analysis of ST-DM watermarking in presence of nonadditive attacks," IEEE Transactions on Signal Processing, vol. 52, no. 10, pp. 2965–2974, 2004.

[17] D. Boneh and J. Shaw, "Collusion-secure fingerprinting for digital data," IEEE Transactions on Information Theory, vol. 44, no. 5, pp. 1897–1905, 1998.


Hindawi Publishing Corporation
EURASIP Journal on Information Security
Volume 2007, Article ID 48179, 16 pages
doi:10.1155/2007/48179

Research Article
Transmission Error and Compression Robustness of 2D Chaotic Map Image Encryption Schemes

Michael Gschwandtner, Andreas Uhl, and Peter Wild

Department of Computer Sciences, Salzburg University, Jakob-Haringerstr. 2, 5020 Salzburg, Austria

Correspondence should be addressed to Andreas Uhl, [email protected]

Received 30 March 2007; Revised 10 July 2007; Accepted 3 September 2007

Recommended by Stefan Katzenbeisser

This paper analyzes the robustness properties of 2D chaotic map image encryption schemes. We investigate the behavior of such block ciphers under different channel error types and find the transmission error robustness to be highly dependent on the type of error occurring and to be very different from the effects seen when using traditional block ciphers like AES. Additionally, chaotic-mixing-based encryption schemes are shown to be robust to lossy compression as long as the security requirements are not too high. This property facilitates the application of these ciphers in scenarios where lossy compression is applied to encrypted material, which is impossible if traditional ciphers are employed. If high security is required, chaotic mixing loses its robustness to transmission errors and compression; still, the lower computational demand may be an argument in favor of chaotic mixing over traditional ciphers when visual data is to be encrypted.

Copyright © 2007 Michael Gschwandtner et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

A significant number of encryption schemes specifically tailored to visual data types have been proposed in the literature in recent years (see [9, 20] for extensive overviews). The most prominent reasons not to stick to classical full encryption employing traditional ciphers like AES [6] for such applications are the following:

(i) to reduce the computational effort (which is usually achieved by trading off security, as is the case in partial or soft encryption schemes);

(ii) to maintain bitstream compliance and associated functionalities like scalability (which is usually achieved by expensive parsing operations and marker avoidance strategies);

(iii) to achieve higher robustness against channel or storageerrors.

Using invertible two-dimensional chaotic maps (CMs) on a square to create symmetric block encryption schemes for visual data has been proposed [4, 8] mainly to serve the first purpose, that is, to create encryption schemes with low computational demand. CMs operate in the image domain, which means that in some sense bitstream compliance is not an issue; however, they cannot be combined in a straightforward manner with traditional compression techniques.

Compensating for errors in the transmission and/or storage of data, especially images, is fundamental to many applications. One example is digital video broadcast or RF transmissions, which are also prone to distortions from the atmosphere or interfering objects. On the one hand, effective error concealment techniques already exist for most current file formats, but when image data needs to be encrypted, these techniques only partly apply, since they usually depend on the data format, which is not accessible in encrypted form. On the other hand, error correction codes may be applied at the network protocol level or directly to the data, but these techniques exhibit several drawbacks which may not be acceptable in certain application scenarios.

(i) Processing overhead: applying error correction codes before transmission causes additional computational demand, which is not desired if the acquiring and sending device has limited processing capability (like any mobile device).

(ii) Data rate increase: error correction codes add redundancy to data; although this is done in a fairly efficient



manner, a data rate increase is inevitable. In the case of low-bandwidth network links (like any wireless network), this may not be desired.

One famous example of an application scenario of that type is RF surveillance cameras with their embedded processors, which are used to digitize the signal and encrypt it using state-of-the-art ciphers. If further error correction can be avoided, the remaining processing capacity (if any) can be used for image enhancement, and the higher network capacity allows better quality images to be transmitted. In this work we investigate a scenario where neither error concealment nor error correction techniques are applied; the encrypted visual data is transmitted as it is, due to the reasons outlined above.

Due to intrinsic properties (e.g., the avalanche effect) of cryptographically strong block ciphers (like AES), such techniques are very sensitive to channel errors. Single bits lost or destroyed in encrypted form cause large chunks of data to be lost. For example, it is well known that a single bit failure of AES-encrypted ciphertext destroys at least one whole block, plus further damage caused by the encryption mode architecture. Permutations have been suggested for use in time-critical applications since they exhibit significantly lower computational cost compared to other ciphers; however, this comes at a significantly reduced security level (this is the reason why applying permutations is said to be a type of "soft encryption"). Hybrid pay-TV technology has extensively used line permutations (e.g., in the Nagravision/Syster systems), and many other suggestions have been made to employ permutations in securing DCT-based [21, 22] or wavelet-based [14, 23] data formats. In addition to being very fast, permutations have been identified as a class of cryptographic techniques exhibiting extreme robustness in case transmission errors occur [19].
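The block-level avalanche behavior can be demonstrated with a toy 16-byte Feistel cipher built from a keyed hash; this is a stand-in for AES used for illustration only (AES behaves analogously at the block level), not a secure cipher.

```python
import hashlib

def _f(half, key):
    # Round function: keyed hash truncated to the 8-byte half-block size.
    return hashlib.sha256(key + half).digest()[:8]

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_block(block, keys):
    l, r = block[:8], block[8:]
    for k in keys:
        l, r = r, _xor(l, _f(r, k))
    return l + r

def decrypt_block(block, keys):
    # Same Feistel structure run with the round keys in reverse order.
    l, r = block[:8], block[8:]
    for k in reversed(keys):
        l, r = _xor(r, _f(l, k)), l
    return l + r

keys = [b"k1", b"k2", b"k3", b"k4"]
plain = bytes(range(16))                            # one 16-byte block
cipher = encrypt_block(plain, keys)
corrupted = bytes([cipher[0] ^ 0x01]) + cipher[1:]  # single channel bit error
garbled = decrypt_block(corrupted, keys)
print(sum(a != b for a, b in zip(plain, garbled)), "of 16 bytes damaged")
```

A pure pixel permutation under the same single bit error would instead damage exactly one pixel, which is the robustness property exploited by CM encryption.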

Bearing in mind that CM cryptosystems mainly rely on permutations makes them interesting candidates for use in error-prone environments. Taking this together with the very low computational complexity of these schemes, wireless and mobile environments could be potential application fields. While the expected conclusion that the higher security level of cryptographically strong ciphers implies higher sensitivity to errors compared to CM cryptosystems is nothing new, we investigate the impact of different error models on image quality to obtain a quantifiable tradeoff between security and transmission error robustness. The rise of wireless local area networks and their diversity of errors enforces the development of new transmission methods to achieve good quality of transmitted image data at a certain protection level.

Accepting the drawback of a possibly weaker protection mechanism, it may be possible to achieve better quality results in the decrypted image after transmission over noisy channels as compared to classical ciphers. In this work we compare the impact of different types of distortions of transmission links (i.e., channel errors) on the transmission of images using block cipher encryption with CM encryption (see Figure 1, part A).

Additionally (see Figure 1, part B), we focus on an issue that at first sight seems different from those discussed so far, but that is related to the CMs' robustness against a specific type of error (value errors): we investigate the lossy compression of encrypted visual material [10]. Clearly, data encrypted with classical ciphers cannot be compressed well: due to the statistical properties of encrypted data, no data reduction may be expected using lossless compression schemes, and lossy compression schemes cannot be employed since the reconstructed material cannot be decrypted any more due to compression artifacts. For these reasons, compression is always required to be performed prior to encryption when classical ciphers are used. However, for certain types of application scenarios it may be desirable to perform lossy compression after encryption (i.e., in the encrypted domain). CMs are shown to be able to provide this functionality to a certain extent due to their robustness to random value errors. We will experimentally evaluate different CM configurations with respect to the achievable compression rates and the quality of the decompressed and decrypted visual data.

A brief introduction to chaotic maps and their respective advantages and disadvantages as compared to classical ciphers is given in Section 2. The experimental setup and the image quality assessment methods used are presented in Section 3. Section 4 discusses the robustness properties of CM block ciphers with respect to different types of network errors and compares the results to the respective behavior of a classical block cipher (AES) in these environments. Section 5 discusses possible application scenarios requiring compression to be performed after encryption and provides experimental results evaluating JPEG compression, JPEG 2000 compression, and finally JPEG 2000 with wavelet packets, all applied with varying quality to CM encrypted data. Section 6 concludes the paper.

2. CHAOTIC MAP ENCRYPTION SCHEMES

Using CMs as a (mainly) permutation-based symmetric block cipher for visual data was introduced by Scharinger [17] and Fridrich [8]. CM encryption relies on the use of discrete versions of chaotic maps. The good diffusion properties of chaotic maps, such as the baker map or the cat map, soon attracted cryptographers. Turning a chaotic map into a symmetric block cipher requires three steps, as [8] points out.

(1) Generalization. Once the chaotic map is chosen, it is desirable to vary its behavior through parameters. These are part of the key of the cipher.

(2) Discretization. Since chaotic maps usually are not discrete, a way must be found to apply the map onto a finite square lattice of points that represent pixels in an invertible manner.

(3) Extension to 3D. As the resulting map after step two is a parameterized permutation, an additional mechanism is added to achieve substitution ciphers. This is usually done by introducing a position-dependent grey level alteration.

In most cases a final diffusion step is performed, often achieved by combining the data line- or column-wise with the output of a random number generator.



[Figure: flow chart. Sender: raw image data → CM/AES encryption → (part B only) JPEG/JPEG 2000 compression → channel distortion → receiver: JPEG/JPEG 2000 decompression → CM/AES decryption → distorted raw image data.]

Figure 1: Experimental setup examining (A) transmission error resistance and (B) lossy compression robustness of CM and AES encryption schemes.

The most famous example of a chaotic map is the standard baker map:

B : [0, 1]² −→ [0, 1]²,

B(x, y) = (2x, y/2)              if 0 ≤ x < 1/2,
B(x, y) = (2x − 1, (y + 1)/2)    if 1/2 ≤ x ≤ 1.        (1)

This corresponds geometrically to a division of the unit square into two rectangles [0, 1/2[ × [0, 1] and [1/2, 1] × [0, 1] that are stretched horizontally and contracted vertically. Such a scheme may easily be generalized using k vertical rectangles [Fi−1, Fi[ × [0, 1[, each having an individual width pi, such that Fi = p1 + ··· + pi, F0 = 0, Fk = 1. The vertical rectangle widths pi, as well as the number of iterations, are the introduced parameters. Another choice of a chaotic map is the Arnold cat map:

C : [0, 1]² −→ [0, 1]²,

C(x, y) = [1 1; 1 2] (x, y)ᵀ mod 1,        (2)

where x mod 1 denotes the fractional part of a real number x, obtained by subtracting or adding an appropriate integer. This chaotic map can be generalized using a matrix A with two integer parameters a, b such that det(A) = 1, as follows:

Cgen(x, y) = A (x, y)ᵀ mod 1,   A = [1 a; b ab + 1].        (3)

Now each generalized chaotic map needs to be modified to turn it into a bijective map on a square lattice of pixels. Let N := {0, . . . , N − 1}; the modification is to transform domain and codomain to N². Discretized versions should avoid floating point arithmetic in order to prevent an accumulation of errors. At the same time they need to preserve the sensitivity and mixing properties of their continuous counterparts. This challenge is quite ambitious, and many questions arise as to whether discrete chaotic maps really inherit all important aspects of chaos from their continuous versions. An important property of a discrete version F of a chaotic map f is

lim(N→∞) max(0 ≤ i, j < N) | f(i/N, j/N) − F(i, j) | = 0.        (4)

Discretizing a chaotic cat map is fairly simple and was introduced in [4]. Instead of using the fractional part of a real number, integer modulo arithmetic is adopted:

Cdisc : N² −→ N²,

Cdisc(x, y) = A (x, y)ᵀ mod N,   A = [1 a; b ab + 1].        (5)
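The discretized cat map of (5) can be sketched directly as a pixel permutation; since det(A) = 1, the map is bijective and the inverse matrix [ab + 1, −a; −b, 1] undoes it. This is an illustrative pure-Python sketch with arbitrary parameter values, not the paper's implementation.

```python
def cat_map_encrypt(img, a, b, iterations):
    """Apply (x, y) -> (x + a*y, b*x + (a*b + 1)*y) mod N, `iterations` times."""
    n = len(img)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + a * y) % n][(b * x + (a * b + 1) * y) % n] = img[x][y]
        img = out
    return img

def cat_map_decrypt(img, a, b, iterations):
    """Apply the inverse map (x, y) -> ((a*b + 1)*x - a*y, -b*x + y) mod N."""
    n = len(img)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[((a * b + 1) * x - a * y) % n][(-b * x + y) % n] = img[x][y]
        img = out
    return img

img = [[(17 * x + y) % 256 for y in range(64)] for x in range(64)]
scrambled = cat_map_encrypt(img, a=2, b=3, iterations=5)
restored = cat_map_decrypt(scrambled, a=2, b=3, iterations=5)
print(restored == img, scrambled != img)  # → True True
```

Note that the map only moves pixels; the histogram is untouched, which is exactly why the 3D extension below is needed.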

Finally, an extension to 3D is introduced that may be applied to any two-dimensional chaotic map. As all chaotic maps preserve the image histogram (and with it all corresponding statistical moments), a procedure that results in a uniform histogram after encryption is desired. The extension of a two-dimensional discrete chaotic map F : N² → N² to three dimensions consists of a position-dependent grey-level shift (assuming L grey levels, L := {0, . . . , L − 1}) at each level of iteration:

F3D : N² × L −→ N² × L,

F3D(i, j, gij) = (i′, j′, h(i, j, gij)),   with (i′, j′) = F(i, j).        (6)

The map h modifies the grey level of a pixel and is a function of the initial position and initial grey level of the pixel, that is, h(i, j, gij) = gij + h(i, j) mod L. There are various possible choices of h; we use h(i, j) = i·j.

Since chaotic maps after step two or three are bijections of a square lattice of pixels, an additional spreading of local information over the whole image is desirable. Otherwise the cipher is extremely vulnerable to known plaintext attacks, since each pixel in the encrypted image corresponds exactly to one pixel in the original. The diffusion step is often realized as a linewise process, for example,

v(i, j)∗ = v(i, j) + G(v(i, j − 1)∗) mod L,        (7)

where v(i, j) is the not-yet-modified pixel at position (i, j), v(i, j)∗ is the modified pixel at that position, and G is an arbitrarily chosen random lookup table.
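The linewise diffusion of (7) and its inverse can be sketched as follows. The key-dependent lookup table G is drawn here from a seeded pseudorandom generator, and chaining each line from a fixed initial value is an assumption of this sketch.

```python
import random

def _table(seed, L):
    # Key-dependent random lookup table G.
    rng = random.Random(seed)
    return [rng.randrange(L) for _ in range(L)]

def diffuse(img, seed, L=256):
    G = _table(seed, L)
    out = []
    for row in img:
        prev, new_row = 0, []          # each line chained from a fixed value
        for v in row:
            prev = (v + G[prev]) % L   # eq. (7)
            new_row.append(prev)
        out.append(new_row)
    return out

def undiffuse(img, seed, L=256):
    G = _table(seed, L)
    out = []
    for row in img:
        prev, new_row = 0, []
        for v in row:
            new_row.append((v - G[prev]) % L)
            prev = v                   # chain on the *diffused* value
        out.append(new_row)
    return out

img = [[(3 * i + 7 * j) % 256 for j in range(16)] for i in range(16)]
print(undiffuse(diffuse(img, seed=42), seed=42) == img)  # → True
```

A single changed pixel alters every subsequent pixel of its line after diffusion, which is precisely the error-spreading behavior discussed next.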

Concerning robustness against transmission errors, CMs are of course expected to be more robust when diffusion steps are avoided (compare the results). If local information is spread



Table 1: Cardinality of key spaces K(N).

                    N = 20   N = 25     N = 128   N = 512
Baker map keyset1   83343    571        10³¹      10¹²⁶
Baker map keyset2   524288   16777216   10³⁸      10¹⁵³
Cat map             400      625        16384     262144
AES128              10³⁸     10³⁸       10³⁸      10³⁸
AES256              10⁷⁷     10⁷⁷       10⁷⁷      10⁷⁷

during encryption, that is, in diffusion steps, a single pixel error in the encrypted image causes several pixel errors in the original image. For this reason, we investigate settings both with and without diffusion.

It should be clear that chaotic maps have different properties compared to conventional block ciphers. Typically, conventional block encryption schemes like AES work on block sizes of 128, 256, or 512 bits. The key space contains 2ⁿ elements, where n is the number of key bits, which usually equals the block size.

As the main operation of a CM is permutation, it operates on larger units, namely full (square) images. The smallest element to be permuted is a pixel. To encrypt an N × N image, N²! permutations exist. However, the key space available to parameterize the chaotic map is often orders of magnitude smaller. Another drawback is the dependency on image size. There are configurations where a small change in image size causes the key space to shrink dramatically (see keyset1 and keyset2 in Table 1). In Table 1, the cardinalities of the key spaces K(N) for the baker map, cat map, and AES are compared for a representative N × N grey-scale image. While the number of iterations and the parameters for the diffusion step are usually part of the key for chaotic encryption algorithms, they have been neglected for this comparison. It is evident that the key space, especially for smaller image sizes, is insufficient. In this case, or for problematic image sizes, padding should be used to prevent a guessing of all possible key combinations. At this point a main drawback of the cat map becomes evident: its parameters offer few combinations compared to other chaotic maps.
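The gap between the number of pixel permutations and the cat map key space can be checked numerically, assuming (as the cat map row of Table 1 suggests) that the parameters a, b each range over {0, . . . , N − 1}:

```python
import math

def catmap_keyspace(n):
    # (a, b) pairs with a, b in {0, ..., n-1}: n^2 keys.
    return n * n

def log10_permutations(n):
    # log10 of (n^2)!, the number of pixel permutations of an n x n image.
    return math.lgamma(n * n + 1) / math.log(10)

print(catmap_keyspace(128))                  # 16384, matching Table 1
print(f"~10^{log10_permutations(128):.0f} pixel permutations of a 128 x 128 image")
print(f"~10^{int(128 * math.log10(2))} AES-128 keys")
```

Even for a 128 × 128 image, the 2¹²⁸ ≈ 10³⁸ keys of AES-128 dwarf the cat map's 16384 parameter choices, while the space of all pixel permutations is astronomically larger than both.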

Chaotic maps are generally sensitive to initial conditions and parameters. But some discrete versions exhibit unexpected behavior when using similar keys. While classical encryption algorithms are sensitive to keys, chaotic maps such as the baker map exhibit a set of keys S(K) for each key K such that an image encrypted with K and decrypted using k ∈ S(K), k ≠ K, is close to its original. We get similar results when using keys that are derived from the original by replacing a large parameter by two smaller ones or merging two small parameters into a larger one. This has been observed by [8]. Accepting the drawback of a further limitation of the key space (the intruder may be content to find a key that produces acceptable approximations of original images and continue with refinement), this may also be seen as a feature of the encryption system. Transmission errors destroying single bits of the key do not necessarily lead to a fully destroyed decryption. Heuristics could produce a similar key that allows decryption at a low but probably sufficient quality.

Table 2: Tested image encryption algorithms for part A.

    Name        Description
    2DCatMap    Cat map
    2DBMap      Baker map
    3DCatMap    Cat map with 3D extension
    2DCatDiff   Cat map with diffusion step
    AES128ECB   AES using ECB on 128-bit blocks
    AES128CBC   Same as AES128ECB, using CBC

Table 3: Tested image encryption algorithms for part B.

    Name            Description
    2DCatMap5/7/10  Cat map with 5/7/10 iterations
    2DCatDiff5      Cat map with diffusion step and five iterations
    3DCatMap5       Cat map with 3D extension and five iterations
    2DBMap5/17      Baker map with 5/17 iterations

Table 4: Employed keys/parameters for experiments.

    Name          Value
    BakerMapKey1  192,32,32
    BakerMapKey2  32,64,32,16,32,32,16,8,8,8,8
    AES IV        10111213141516171819202122232425
    AESKey        000102030405060708090A0B0C0D0E0F
    CatMapKey     2,3,1,1

3. EXPERIMENTAL SETUP

We analyze both transmission error resistance (part A) and compression robustness (part B) of three different flavors of the chaotic Cat map algorithm, a simple 2D version of the Baker map, and AES using different block encryption modes (see Tables 2 and 3). All chaotic ciphers use 10 iteration rounds if not specified differently.

Since the number of iterations used in CM algorithms largely affects the distribution of distortions caused by lossy compression, we examine the impact of this parameter on image quality. The diffusion step has been excluded from all chaotic maps except CatDiff. All algorithms are applied to a set of 10 natural and 6 synthetic 256 × 256 images with 256 grey levels referenced in Figure 2 (only 13 of 16 pictures are shown due to copyright restrictions), using two sets of representative encryption keys (keyset2 represents a strong key, whereas keyset1 exhibits certain weaknesses with respect to security). The key parameters for the visual quality experiment are given in Table 4.

3.1. Setup

A flow chart illustrating the test procedure for both part A and part B is depicted in Figure 1. Recapitulating, the test procedure is as follows.

(i) Part A: transmission error robustness. After encryption, a specific type of error as introduced in Section 4.1 is applied to the encrypted image data. Finally, the image is decrypted and the result is compared to the original.


Michael Gschwandtner et al. 5

(ii) Part B: compression robustness. After encryption, three different compression algorithms (JPEG, JPEG 2000, and JPEG 2000 with wavelet packets) are applied to the encrypted image data. To assess the behavior of the described processing pipeline, the image is finally decompressed and decrypted, the result is compared to the original image, and the achieved compression ratio (using the encrypted image as reference) is recorded.

3.2. Image quality assessment

It is difficult to find reliable tools to measure the quality of distorted images. This is especially true in a low-quality scenario. Several metrics exist, such as the signal-to-noise ratio (SNR), peak SNR (PSNR), or mean-square error (MSE), which are frequently used to quantify distortions (see [3, 7]). Mao and Wu [11] propose a measure specifically tailored to encrypted imagery that separates the evaluation of luminance and edge information into a luminance similarity score (LSS) and an edge similarity score (ESS), reflecting properties of the human visual system. According to the authors, this measure is well suited for assessing the distortion of low-quality images. LSS behaves very similarly to PSNR. ESS is the more interesting part in the context of the survey presented here, as it reflects the extent of structural distortion. ESS is computed by block-based gradient comparison and ranges, with increasing similarity, between 0 and 1. However, reliable assessment of low-quality images should be made by human observers in a subjective rating, as this cannot be accomplished in a sensible way using the metrics above. Subjective visual assessment of transmissions yields a mean opinion score (MOS) [1] evaluating gradings of human observers according to strictly specified testing conditions. Such conditions are specified, for example, in [2] for the subjective assessment of the quality of television pictures. These methods can be extended to the assessment of images in general and are frequently adopted, such as in [5]. Recommendation ITU-R-BT500-11 [2] introduces both double stimulus (with reference picture) and single stimulus (without reference picture) assessment methods with a strictly defined testing environment, that is, quality and impairment scales, lighting conditions, and also restrictions regarding the selection of observers. We have decided to adopt only a subset of these features; in particular,

(i) we adopt a simultaneous double stimulus method (SDSCE) with reference and test pictures being shown at the same time;

(ii) we employ the specified five-grade quality scale (see Table 5).

Additionally, we conform to the specified condition that at least fifteen subjects, nonexperts, should be employed.

Since [2] specifies subjective video quality assessment methods, it should be noted that observers there evaluate the average quality of the frames displayed. In our case, still images are evaluated. Therefore, we let the observer vote for the average quality of three different test pictures (encrypted using the same algorithm, but different keys), with the respective originals being shown at the same time, that is, in one assessment step, using the quality levels introduced in Table 5.

Table 5: ITU-R-BT500-11 subjective quality rating scales.

    Quality  Description
    5        Excellent
    4        Good
    3        Fair
    2        Poor
    1        Bad

In the following section we give a short description of the observed results with respect to distortions. In order to complement the subjective ratings, we also report the reference PSNR value. While it is clear that in some cases further error correction by means of denoising might be useful and thus better results could be achieved, we do not concentrate on postprocessing techniques at this point.
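Since PSNR serves as the reference metric throughout, a minimal pure-Python sketch of its computation for 8-bit greyscale data may be helpful (the function and parameter names are our own):

```python
import math

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio between two equal-length
    8-bit greyscale pixel sequences, in dB."""
    if len(original) != len(distorted):
        raise ValueError("images must have the same size")
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

For identical images the MSE is zero and the PSNR is unbounded; for the heavily distorted images considered below, values around 8-14 dB indicate essentially destroyed content.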

4. TRANSMISSION ERROR ROBUSTNESS

In this section, our goal is to provide a comparison of two completely different block ciphers with respect to their behavior in the transmission of encrypted visual data over noisy channels. Therefore, this section introduces a set of distortion models we believe are practical and illustrative for applications.

4.1. Classification of used error models

Much work has already been done to classify transmission errors occurring in wireless data transmission, and a variety of sophisticated network simulators already exist. To focus on a generally applicable comparison of the two encryption mechanisms CM and AES, we arrange simulations that can be described by the following model: a sender S transmits a sequence s0, s1, s2, . . . , sn of n + 1 bytes over a lossy channel. A receiver R receives a sequence r0, r1, r2, . . . , rm of bytes that is possibly different from s0, s1, s2, . . . , sn. There are situations where n ≠ m. We identify two categories of observable errors.

(i) Value errors, where n = m and r0, r1, . . . , rn are derived from the original sequence by altering selected bytes. More formally, there exist a set A ⊂ {0, . . . , n} and an error function f such that for all i ∈ {0, . . . , n}

    ri = f(si) if i ∈ A;  ri = si otherwise.  (8)

Note that f may depend on additional random variables.

(ii) Buffer errors, where bytes are changed, inserted, removed, and possibly reordered. There exist a set A ⊂ {0, . . . , m} and an error function f such that a received


6 EURASIP Journal on Information Security

stream may be described as

    ∀ j ≤ m ∃ i ≤ n :  rj = f(si) if j ∈ A;  rj = si otherwise.  (9)

Various combinations of such errors can occur. However, to extend the observations to existing network behavior, it would be inevitable to model the characteristics of transmission packets and network protocols. We believe at this point that the introduced classes are sufficient to show the main differences between the two algorithms CM and AES. Another reason why further modeling is not adequate at this point is the following: close to error saturation, the category of error should be negligible, as many small buffer errors behave similarly to many value errors.

4.2. Value errors

Proceeding with the notion of an incoming distorted sequence r0, r1, . . . , rn, one can identify several different subsets A and functions f to model a value error.

(i) Static error

In this model every single byte is changed, that is, A = {0, . . . , n}. The change to each byte is quite simple: each byte gets logically ORed with a static byte b ∈ {0, . . . , 255}. For our experiments we have assigned to b the value 85. Thus, we have for all i ∈ {0, . . . , n}: ri = si OR b. This can be used to simulate defective bus lines which are permanently at a high error level.
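The static error model is easy to state in code; a minimal sketch with b = 85 as in the experiments (the function name is our own):

```python
def static_error(data, b=85):
    """Static value error: every transmitted byte is logically ORed
    with the fixed byte b (b = 85 = 0b01010101 in the experiments)."""
    return bytes(byte | b for byte in data)
```

Note that ORing with 0b01010101 forces every other bit high, so roughly half the bits of each byte are potentially altered.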

(ii) Random error and random Gaussian error

The most general error assumption is the selection of A using distribution functions. Having to transmit n bytes, for each byte si a specifically distributed random variable decides whether i ∈ A or i ∉ A, that is, whether it is transmitted correctly or not. The classes random error and random Gaussian error use the uniform distribution and the normal distribution for this selection, respectively. Let X ∼ U(0, 1) be a (standard, continuous) uniformly distributed random variable and let E ∼ UD(0, 255) denote a discrete uniformly distributed random variable; then a random error is defined for all i ∈ {0, . . . , n} by

    ri = Ei if Xi < p;  ri = si otherwise.  (10)

The choice of p ∈ [0, 1] influences the error rate and was selected to be p = 0.01 for our experiments. For the random Gaussian error the random variable X is chosen to be normally distributed, that is, X ∼ N(μ, σ²), and we define for all i ∈ {0, . . . , n}:

    ri = Ei if |Xi| > p;  ri = si otherwise.  (11)

The assignments for our experiments are as follows: μ = 0, σ = 1, p = 2.5. This error model is often used to simulate distortions in RF transmissions; moderate rain, for example, causes pixels in satellite TV transmissions to be distorted according to specific distribution functions.

Table 6: State transitions in the two-state model.

    Probability  State transition
    p            Stay in normal
    (1 − p)      Change to error
    q            Stay in error
    (1 − q)      Change to normal
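The random and random Gaussian error models can be sketched in a few lines of Python; the parameter defaults follow the values used in the experiments, and the function names are our own:

```python
import random

def random_error(data, p=0.01, rng=random):
    """Uniform value error: each byte is replaced by a uniformly
    drawn byte with probability p, as in equation (10)."""
    return bytes(rng.randrange(256) if rng.random() < p else b
                 for b in data)

def random_gaussian_error(data, p=2.5, mu=0.0, sigma=1.0, rng=random):
    """Gaussian value error: byte i is replaced when |X_i| > p for
    X_i ~ N(mu, sigma^2), as in equation (11).  With p = 2.5 roughly
    1.24% of bytes are hit."""
    return bytes(rng.randrange(256) if abs(rng.gauss(mu, sigma)) > p else b
                 for b in data)
```

Both functions preserve the stream length, which is what distinguishes value errors from the buffer errors introduced below.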

(iii) Random Markov chain

Similarly to the error models introduced before, this model assumes that a byte is overwritten by a random value if it is selected to contain an error. However, the decision whether a byte contains an error is made according to a two-state Markov chain.

Given two states (1 = error and 0 = normal), there are transition probabilities to stay in or change the current state. Transitions are handled as shown in Table 6. This model has frequently been adopted, especially for modeling errors in wireless transmission (see, e.g., [13]). Let X ∼ U(0, 1) and Y ∼ U(0, 1) be uniformly distributed random variables and let p, q ∈ [0, 1] denote the state-transition probabilities as introduced before. Then we formulate a state function returning the current state at time ti, with starting state I0 ∈ {0, 1}, as follows:

    I(t0) := I0,
    I(ti+1) := 1 if (I(ti) = 0 ∧ Xi > p) or (I(ti) = 1 ∧ Yi ≤ q);  0 otherwise.  (12)

Thus, if we again use E ∼ UD(0, 255), we have for all i ∈ {0, . . . , n}:

    ri = Ei if I(ti) = 1;  ri = si otherwise.  (13)

For the implemented error model we make the following assignments: p = 0.98, q = 0.03, I0 = 0.
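The two-state model can be sketched as follows (defaults follow the assignments above; the function name is our own):

```python
import random

def markov_error(data, p=0.98, q=0.03, state=0, rng=random):
    """Two-state Markov chain value error, as in equations (12)/(13):
    from 'normal' (0) the chain enters 'error' (1) with probability
    1 - p; from 'error' it stays there with probability q.  Bytes
    visited in the error state are overwritten by uniform random
    bytes."""
    out = bytearray()
    for b in data:
        if state == 0:
            state = 1 if rng.random() > p else 0
        else:
            state = 1 if rng.random() <= q else 0
        out.append(rng.randrange(256) if state == 1 else b)
    return bytes(out)
```

With p = 0.98 and q = 0.03 the chain enters the error state about 2% of the time and leaves it again almost immediately, producing short, loosely clustered bursts.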

4.3. Buffer errors

In contrast to value errors, representatives of the following type of errors correspond to distortions in packet-switched data networks. As single damaged bytes can be restored, for example, by employing error-correcting codes, the major problem here is the possible perturbation, replay, and loss of packets consisting of one or multiple bytes.

These errors are often simulated with special network simulators like ns2 (see http://www.isi.edu/nsnam/ns). Reference [12] shows that these errors happen in bursts


def random_buffer() {
    for (i = 0; i < Image.Length; i++) {
        if (randomDouble(0.0, 1.0) < p) {
            switch (mode) {
                case InsertBytes:
                    Image.InsertByte(i, randomInt(255));
                    i++;
                    break;
                case RemoveBytes:
                    Image.RemoveByte(i);
                    break;
            }
        }
    }
}

Algorithm 1: Pseudocode representation of the random buffer error algorithm with an error probability of p.

(subsequently). We do not consider errors in bursts, as this would make an assumption about the transmission channel, and in the encryption context "real random" errors are the worst-case scenario. As errors may occur inside the destroyed buffer and at the "error edges" (for block ciphers in chaining mode only), we can see that the impact of bursts is less severe, as there are fewer "error edges."

(i) Random buffer error

The simplest case is when the packet size is a single byte. To model a behavior where each sent byte may be lost, replicated, or perturbed in the final sequence, the corresponding actions are modeled as random variables. In our current implementation, only one type of error (addition or removal of a selected byte) per transmission is possible. The described simulation models errors appearing on serial transmission links where the sender and the receiver are slightly out of synchronization. Algorithm 1 is a simplified pseudocode representation of the implemented algorithm.

(ii) Random packet error

Compared to the random buffer error, the random packeterror represents an error which is more likely in current sys-tems. As practically any modern computer networks (wiredand wireless) are packet switched, packet loss errors, dupli-cated packets, or out-of-order packets of any common sizecan occur during transmissions. Simulation of packet loss(the most common error) is done by cutting out parts (con-sisting of an arbitrary number of bytes) of the encrypted im-age or overwriting them with a specified byte. The imple-mented algorithm is sketched in Algorithm 2.

def random_packet() {
    for (i = 0; i < Image.Length / 64; i++) {
        if (randomDouble(0.0, 1.0) < p) {
            switch (mode) {
                case LoseBytes:
                    Image.RemoveRange(i * 64, 64);
                    break;
                case ConcealBytes:
                    Image.SetRange(i * 64, 64, 0);
                    break;
            }
        }
    }
}

Algorithm 2: Pseudocode representation of the random packet error algorithm with an error probability of p.

4.4. Experiments

We show the mean opinion scores of 107 (90 male, 17 female) human observers for the test pictures Lena, Landscape, and Ossi together with the reference mean PSNR values in Table 7. The maximum absolute MOS distance between male and female observers is 0.26, and 0.19 for image-quality experts versus nonexperts. Especially for random packet errors, experts tend to grade AES and CM diffusion results better, while finding CM random Gaussian errors to be more bothersome.

As can be seen in Table 7, mean PSNR is a good indicator for MOS. Since subjective image assessments are time consuming (they cannot be automated), we analyze the complete test picture set in Figure 2 with respect to this quality metric.

It is clear that comparison results largely depend on the parameters of the error model, such as the error byte b for the static error or the error rate r. Figure 3 depicts exactly this relationship, comparing CM and AES error resilience against different error rates (the plots display average PSNR values of the images shown in Figure 2). Inspecting the mean PSNR curves, we can see that for all different types of errors, 2DCatMap and 2DBMap do not differ much, and neither do the AES encryption modes. The figure also illustrates the superiority of CMs in transmission error robustness for random errors. Interestingly, 3DCatMap also performs equivalently to the pure 2D case for value errors (compare also Table 6). The results for random buffer errors also indicate the superiority of CMs, but the low overall PSNR range obtained does not really lead to visually better results. For random buffer errors, 3DCatMap gives results equal to the 2DCatDiff variant, contrasting with the value error cases. For random packet errors, AES exhibits 1.5–2 dB higher mean PSNR values than standard 2D CM cryptosystems. It is


Table 7: Comparing AES and CM with respect to objective and subjective image quality using the Landscape, Lena, and Ossi test images (mean PSNR in dB / MOS).

    Algorithm    Static error   Random error   R. Gaussian err.  R. buffer error  R. packet error
                 PSNR   MOS     PSNR   MOS     PSNR   MOS        PSNR   MOS       PSNR   MOS
    Original     13.87  3.10    28.36  4.61    27.53  4.57       10.54  1.39      11.25  2.12
    2DCatMap     13.87  3.06    28.34  4.50    27.52  4.56        9.56  1.02       9.73  1.43
    2DBMap       13.87  3.07    28.47  4.57    27.37  4.58        9.60  1.00      10.13  1.13
    3DCatMap     14.74  2.78    28.43  4.53    27.59  4.56        8.47  1.00       8.92  1.17
    2DCatDiff     8.47  1.00    14.24  3.03    13.30  2.75        8.47  1.00       8.46  1.00
    AES128ECB     8.52  1.00    16.56  3.21    15.77  3.00        8.58  1.02      10.93  2.40
    AES128CBC     8.46  1.00    16.47  3.12    15.63  2.92        8.55  1.04      11.48  2.23

Figure 2: Test pictures for transmission errors and compression robustness: (a) Anton, (b) Building, (c) Cat, (d) Disney, (e) Fractal, (f) Gradient, (g) Grid, (h) Landscape, (i) Lena, (j) Pattern, (k) Niagara, (l) Tree, (m) Ossi.

also interesting to see that for AES, even at very low error rates starting at 4-5 percent, random errors cause at least as much damage to image quality as random packet errors. However, when error rates become very high, there is not much difference between any of the introduced error models.

4.4.1. Static error

For simulating the static error case, all bytes are ORed with b = 85 (Figures 4(a) and 4(b)). It is evident that the results for AES are unsatisfactory. As every byte of the encrypted image is changed, the decrypted image is entirely destroyed, resulting in a noise-type pattern. The distortion of the CM-encrypted image is exactly as significant as if the image had not been encrypted. The cause for the observable preservation of the original image is the fact that the simple 2D CM is solely a permutation. In contrast, the 3D CM includes an additional color shift depending on pixel positions. The 3D CM also handles this type of distortion well, whereas the added diffusion step destroys the result. The number of mutually dependent bits can be controlled with the number r of iteration rounds. If just a few rounds are used, an error does not spread over large parts of the image. Using many rounds, a single flipped bit causes the scrambling of the entire image.

4.4.2. Random error and random Gaussian error

As we expected, the random error and random Gaussian error show very similar results. When considering the properties of block ciphers, we can see that the alteration of a single byte destroys the encrypted block in ECB mode (and additionally a byte of the following block in CBC/CFB mode). This causes every error to destroy bs bytes (bs + 1 in CBC/CFB) in the decrypted image, where bs is the used block size (see Figure 5(b)). Further errors occurring in already destroyed blocks have no effect. This leads to a stronger impact on block ciphers when the error probability is small. When the error rate is high, this drawback is reduced, as more and more errors lie within the same damaged block. The CMs cope very well with this distortion type, since errors are not expanded and the result is again identical to the case where the image had not been encrypted (see Figure 5(a)). Again, applying diffusion is the exception, where degradation may become even more severe than in the AES cases.
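The byte-counting argument above can be illustrated with a small sketch. This is an idealized model of block-cipher error propagation, not actual AES, and the function name is our own:

```python
def destroyed_bytes(error_positions, image_size, bs=16, chaining=False):
    """Count how many plaintext bytes are garbled after decryption,
    given the positions of altered ciphertext bytes.  In ECB the
    whole bs-byte block is lost; with CBC/CFB chaining the byte at
    the same offset in the following block is additionally flipped."""
    bad = set()
    for pos in error_positions:
        block = pos // bs
        bad.update(range(block * bs, min((block + 1) * bs, image_size)))
        if chaining and pos + bs < image_size:
            bad.add(pos + bs)  # corresponding byte of the next block
    return len(bad)
```

A single error costs a full block (plus one byte under chaining), while a second error landing in the same block costs nothing extra, which is exactly why the block-cipher disadvantage shrinks at high error rates.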

4.4.3. Random buffer error

Using the random buffer error in the AES case, we observe the following phenomenon. Each time the encrypted blocks get synchronized with their respective original counterparts, the following blocks are decrypted correctly until the next error


Figure 3: Comparing AES and CM transmission error robustness against error rate: (a) random error, (b) random buffer error, (c) random packet error. [Each panel plots mean PSNR (dB) against error probability (%) for the CM variants (2DCatMap, 2DBMap, 3DCatMap, 2DCatDiff) and the AES encryption modes (AES128ECB, AES128CBC).]

(a) 2DCatMap (b) AES128ECB

Figure 4: Effect of static byte errors on Lena image.

occurs (see Figure 6(b)). If we use CBC or CFB, the block directly after the synchronization point SP is additionally destroyed. Of course, this analysis is only correct in case identical keys are employed for each block.

As we model only insertion or deletion of bytes, we reach an SP every blocksize (bs) errors. Each time an error occurs, we step either into an error phase, where every pixel is decrypted incorrectly, or into a normal phase, where pixels are decrypted correctly. Let us assume that for the number of errors e, the blocksize bs, and the image size is, the relation

    bs ≪ e ≪ is / bs  (14)

holds. Then we get approximately (bs − 1) times more error phases than normal phases. If the error rate exceeds the upper bound, the entire image is destroyed.
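The synchronization-point reasoning can be sketched numerically: each inserted (or deleted) byte shifts the block alignment by one, and decryption only succeeds while the cumulative shift is a multiple of bs. This is a simplified model under the identical-key assumption above, with names of our own choosing:

```python
def error_phase_fraction(error_positions, image_size, bs=16, shift=1):
    """Fraction of the stream decrypted in an 'error phase': each
    buffer error at the given byte position changes the cumulative
    alignment shift by `shift` (+1 insertion, -1 deletion); blocks
    are decrypted correctly only while the shift is 0 mod bs."""
    bad = 0
    offset = 0
    prev = 0
    for pos in sorted(error_positions):
        if offset % bs != 0:
            bad += pos - prev  # segment decrypted out of alignment
        offset += shift
        prev = pos
    if offset % bs != 0:
        bad += image_size - prev
    return bad / image_size
```

With errors spread evenly, bs − 1 of every bs inter-error segments are misaligned, matching the (bs − 1) : 1 ratio of error to normal phases derived above.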

The reason why CM-encrypted images are completely destroyed by the random buffer error (Figure 6(a)) is the inherent sensitivity with respect to initial conditions. In most cases, neighboring pixels in the encrypted image are far apart in the decrypted image. Every time an error occurs, the pixels are shifted by one and the decrypted pixels are completely out of place. For CM we cannot identify SPs.

4.4.4. Random packet error

For the random packet error we distinguish two different versions:

(1) the packet loss is detected and the space is padded with bytes;

(2) no detection of the packet loss is done.

As to the first version, we observe when using AES that the lost part plus bs (respectively 2 × bs) bytes are destroyed. With 2DCatMap and 3DCatMap only the amount of lost pixels is destroyed. This case corresponds to a value error occurring in bursts or a local static error, and the results obtained show the respective properties.

In the second case (which is covered in Table 7), CM has the same synchronization problems as with the random buffer error, which causes the image to be entirely degraded (Figure 7(a)). The impact on block ciphers depends on the size of the packet ps. If the equation

    ps mod bs = 0  (15)

holds, the error is compensated very well (shown in Figure 7(b); this block-type shift can be inverted very easily). The scrambled parts after the cut points amount to bs (respectively 2 × bs) bytes. If the packet size is different, only the


(a) 2DCatMap (b) AES128ECB

Figure 5: Effect of random byte errors on Lena image.

(a) 2DCatMap (b) AES128CBC

Figure 6: Effect of buffer errors on Lena image.

parts of the image lying between synchronization points andthe next error are decrypted correctly.

In normal packet-switched networks, the packets need identification numbers, and therefore lost packets can be detected. That is why the first case of random packet errors is the most likely to occur.

Overall, we have found excellent robustness of CM with respect to value errors, which results in significantly better behavior compared to classical block ciphers in such scenarios. However, CM cannot be said to be robust against transmission errors in general, since the robustness against buffer errors is extremely low due to the high sensitivity of these schemes to initial conditions. Depending on the target scenario, either CM or classical block ciphers may provide the better robustness properties.

5. COMPRESSION ROBUSTNESS

As already outlined in the introduction, classically encrypted images cannot be compressed well because of the typical properties of encryption algorithms. In particular, it is not possible to employ lossy compression schemes, since in this case potentially each byte of the encrypted image is changed (and most bytes in fact are), so that the decrypted image is entirely destroyed, resulting in a noise-type pattern. Therefore, in all applications involving compression and encryption, compression is performed prior to encryption.

On the other hand, application scenarios exist where compression of encrypted material is desirable. In such a scenario classical block or stream ciphers cannot be employed. For example, when dealing with video surveillance systems, concerns often arise about protecting the privacy of the recorded persons. People are afraid of what happens with recorded data that allows tracking a person's daily itineraries. A compromise to minimize the impact on personal privacy would be to continuously record and store the data but only view it if some criminal offense has taken place.

To ensure that data cannot be reviewed without authorization, it is transmitted and stored in encrypted form, and only few people have the authorization (i.e., the key material) to decrypt it.

The problem, as depicted in Figure 8, is the amount of memory needed to store the encrypted frames (due to hardware restrictions of the involved cameras, the data is transmitted in uncompressed form in many cases). For this reason, frames should be stored in compressed form only. When using block ciphers, the only way to do this would be the decryption, compression, and re-encryption of the frames. This would allow the administrator of the storage device to view and extract the video signal, which obviously threatens privacy. There are two practical solutions to this problem.

(1) Before the image is encrypted and transmitted, it is compressed. Besides the undesired additional computational demands on the camera system, this has further disadvantages, as transmission errors in compressed images usually have an even bigger impact without error concealment


(a) 2DCatMap (b) AES128CBC

Figure 7: Effect of packet errors on Lena image.

Figure 8: Privacy solution for surveillance systems. [Flow chart: the camera's acquired image is encrypted and sent over an insecure channel; (A) live observation: decryption by the observer; (B) criminal investigation: lossy compression into a database, followed by decompression, decryption, and viewing.]

strategies enabled. This strategy further increases the error rate induced by decrypting partially incorrect data. This is prohibitive in environments where the radio signal is easily distorted.

(2) The encrypted frames are compressed directly. In this manner, the key material does not have to be revealed when storing the visual data, thereby maintaining the privacy of the recorded persons. Figure 8 shows such a system. Clearly, in this scenario classical encryption cannot be applied. In the following we will investigate whether CM can be applied and which results in terms of quality and compression are to be expected.

A second example where compression of encrypted visual data is desirable is data transmission over heterogeneous networks, for example, a transition from wired to wireless networks with correspondingly decreasing bandwidth. Consider the transmission of uncompressed encrypted visual data in such an environment, as occurring in telemedicine or teleradiology: when changing from the wired network part to the wireless one, the data rate of the visual material has to be reduced to cope with the lower bandwidth available. Employing a classical encryption scheme, the data has to be decrypted, compressed, and re-encrypted, similar to the surveillance scenario described before. In the network scenario, these operations put significant computational load onto the network node in charge of the rate adaptation, and the key material needs to be provided to that network node, which is demanding in terms of key management. A solution where the encrypted material may be compressed directly is of course much more efficient. The classical approach to tackle this second scenario is to apply format-compliant encryption to a scalable or embedded bitstream like JPEG2000. While this approach solves the question of transcoding in the encrypted domain in the most elegant manner, the transmission error robustness problem as discussed for the surveillance scenario remains unsolved.

5.1. Experiments

Based on the observation of the excellent robustness of CM against value errors, these encryption schemes seem to be natural candidates to tolerate the application of compression directly in the encrypted domain, without the need for decryption and re-encryption. The reason is that compression artifacts caused by most lossy compression schemes may be modeled as random value errors (e.g., errors caused by the quantization of single coefficients in JPEG are propagated into the entire block due to the nature of the DCT). In the following, we experiment with applying lossy compression in the encrypted domain of CM.
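Why a permutation-only cipher tolerates this can be seen with a toy model: a bounded value error (here, plain quantization standing in for compression artifacts) applied to the permuted pixels remains bounded after the inverse permutation, merely relocated. A random permutation stands in for the Cat map here, and all names are illustrative:

```python
import random

def lossy(img, step=8):
    """Stand-in for compression artifacts: quantize each pixel to a
    multiple of `step`, i.e. a value error of at most step/2 levels."""
    return [min(255, step * round(v / step)) for v in img]

def max_decryption_error(img, seed=0):
    """Permute -> 'compress' -> inverse permute, and report the
    largest per-pixel error in the decrypted image."""
    rng = random.Random(seed)
    n = len(img)
    perm = list(range(n))
    rng.shuffle(perm)                        # permutation-only 'encryption'
    enc = [img[perm[i]] for i in range(n)]   # encrypted image
    comp = lossy(enc)                        # lossy step in the encrypted domain
    dec = [0] * n
    for i in range(n):
        dec[perm[i]] = comp[i]               # decryption: invert the permutation
    return max(abs(a - b) for a, b in zip(img, dec))
```

A block cipher, by contrast, turns the same bounded perturbation of the ciphertext into uniformly random plaintext blocks, which is why this experiment is only meaningful for permutation ciphers.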

5.1.1. JPEG-compression of CM encrypted images

Figures 9–14 show images where the encrypted data was lossily JPEG compressed [15], decompressed, and finally decrypted again. In these figures, we provide the quality factor q of the JPEG compression, the data size of the compressed image as a percentage (%) of the original image size, and the PSNR of the decompressed and decrypted image in dB.

In general, we observe quite unusual behavior of the CM encryption technique. The interesting fact is that despite the lossy compression, a CM-encrypted image can be decrypted


(a) q = 55: 36%, 23.4 dB (b) q = 45: 37%, 15.9 dB (c) q = 45: 37%, 9.2 dB

Figure 9: Cat map with 5 iterations (without extensions and using 3D and diffusion extensions, resp.), keyset2.

(a) q = 30: 29%, 18.9 dB (b) q = 20: 21%, 16.4 dB (c) q = 10: 13%, 14.5 dB

Figure 10: Cat Map with 5 iterations using different compression ratios on the Ossi image, keyset1.

quite well (depending on the compression rate, of course). As already mentioned, this is never the case if classical encryption is applied.

Figure 9 compares the application of the standard 2D Cat map without and with additional extensions to increase security (i.e., the 3D or diffusion extensions are employed additionally). At a fixed compression rate (slightly lower than 3), we obtain a somewhat noisy but clearly recognizable image in case no further extensions are used (Figure 9(a)). Applying the 3D extension to the standard Cat map (Figure 9(b)), we observe significant degradation of the decrypted image as compared to the standard Cat map with an identical number of iterations. However, the image content is still recognizable, which is no longer true in case the diffusion extension is used; see Figure 9(c). It is worth noticing that we obtain the same result, noise, no matter which compression rate or image quality is used in case the diffusion step is performed. Actually, this result is identical to the result if a cryptographically strong cipher like AES had been used instead of CatDiff.

The effect when the compression ratio is steadily increased is shown in Figure 10 on the Ossi test image. Lower data rates in compression increase the amount of noise in the decrypted images; however, even at a compression ratio of 5 (21%) the image is clearly recognizable, and the quality would be sufficient for a handheld phone or PDA display, for example (Figure 10(b)). Of course, higher compression ratios lead to even more severe degradations which are hardly acceptable for any application (e.g., compression ratio 7.5 in Figure 10(c)). However, higher compression ratios could be achieved with sensible quality using more advanced lossy compression schemes like JPEG2000 [18], for example.

Increasing the number of iterations to more than 5 does not affect the results of the Cat map for a sensible keyset (as used, e.g., in Figure 9). This is not true for the Baker map, as shown in Figure 11. When using 5 iterations, the compression result is significantly better as compared to the Cat

map case with the same data rate (compare Figure 11(a) to Figure 9(a)). The reason is displayed in Figure 11(b): using the Baker map with 5 iterations, we still recognize structures (horizontal areas of smoothly varying grey values in a single line) in the encrypted data, which means that mixing has not yet fulfilled its aim to a sufficient degree. On the one hand, this is good for compression, since errors are not propagated to a large extent; on the other hand, this threatens security, since the structures visible in the encrypted data can be used to derive the key data used in the encryption process.

Increasing the number of iterations (e.g., to 17 as shown in Figures 11(c) and 11(d)) significantly reduces the amount of visible structure. As expected, the compression results are now similar to the Cat map case using 5 iterations. With 20 iterations and more, no structures are visible any longer and the compression results are identical to the Cat map case.

In Figure 12 we give examples of the effects when pathological key material is used for encryption. When using keyset 1 for encryption with the Baker map (Figures 12(a) and 12(b)), the structures visible in the encrypted material are even clearer, and correspondingly the compression result is also superior to that of keyset 2 (Figure 11). With these settings, an even higher number of iterations is required to achieve reasonable security (which again destroys the advantage with respect to compression). Weak keys also exist for the Cat map. Figure 12(d) shows the encrypted data when 10 iterations are performed using keyset 1. In this case, even image content is revealed, and the key parameters can be reconstructed easily with a ciphertext-only attack. Correspondingly, the compression results are also much better as compared to the case when 5 iterations are applied (see Figure 9(a)). These parameters (weak keys) and the corresponding effects (reduced security) have been described in the literature on CM and, of course, have to be avoided in any application.


Michael Gschwandtner et al. 13

Figure 11: Baker map with varying numbers of iterations (5 and 17 iterations), keyset 2. (a) q = 70: 37%, 28.0 dB; (b) q = 70: encrypted; (c) q = 60: 36%, 24.9 dB; (d) q = 60: encrypted.

Figure 12: Baker map and Cat map with pathological keyset 1 (5 and 10 iterations). (a) q = 75: 36%, 30.9 dB; (b) q = 75: encrypted; (c) q = 70: 36%, 27.3 dB; (d) q = 70: encrypted.

Applying the Cat map with poor-quality keys shows another unique property. While increasing the number of iterations increases the security of the Baker map, as we have observed, the opposite can occur for the Cat map with specific keysets. Accordingly, the compression results are also better in this case for a higher number of iterations. Figure 13 shows the Ossi image when applying 7 and 10 iterations using keyset 1, while Figure 10(a) shows the case of 5 iterations. Fixing the data rate, the higher the number of iterations, the better the quality gets.


Figure 13: Cat map with 7–10 iterations on the Ossi image, keyset 1. (a) q = 30: 28%, 19.3 dB, 7 iterations; (b) q = 50: 29%, 23.4 dB, 10 iterations.

Figure 14: Cat map with 5–10 iterations on the Ossi image, keyset 1, encrypted domain. (a) q = 30, 5 iterations; (b) q = 30, 7 iterations; (c) q = 30, 10 iterations.

The reason for this effect is shown in Figure 14. The more iterations are applied, the more structural information becomes visible, and key information may be derived. As shown before for the Lena image, with 10 iterations image content is already revealed. Of course, due to the larger amount of coherent structure present in the encrypted domain (especially evident in Figure 14(c)), the corresponding compression can achieve better results.
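The weak-key behavior has a simple arithmetic explanation: the Cat map matrix has finite order modulo the image size N, so after that many iterations every pixel is back at its original position, and iteration counts near a multiple of this period undo most of the mixing. A sketch that computes this period for given (hypothetical) parameters p and q:

```python
def cat_map_period(p, q, n, limit=10000):
    """Smallest k with M^k = I mod n, where M = [[1, p], [q, p*q + 1]]
    is the Cat map matrix: after k iterations every pixel of an
    n x n image (n >= 2) returns to its original position."""
    a = [[1 % n, p % n], [q % n, (p * q + 1) % n]]
    m = [row[:] for row in a]  # m holds a^k at step k
    for k in range(1, limit + 1):
        if m == [[1, 0], [0, 1]]:
            return k
        # m <- m * a (2x2 matrix product mod n)
        m = [[sum(m[i][t] * a[t][j] for t in range(2)) % n
              for j in range(2)]
             for i in range(2)]
    return None  # period exceeds limit
```

For example, the classical parameters p = q = 1 give period 3 for N = 4 and period 10 for N = 5, so iteration counts just below a multiple of the period largely invert the permutation. Interactions of this kind between key parameters and image size are one way a keyset can become pathological, as observed for keyset 1 above.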

5.1.2. JPEG 2000 compression of CM-encrypted images

We have evaluated lossy compression not only with the JPEG algorithm but also with JPEG 2000 [18] and with JPEG 2000 using wavelet packet decomposition [16] and best basis selection, with log energy as cost function and full decomposition. Apart from providing visual evidence as shown in the preceding subsection, we have also conducted large-scale experiments using the images shown in Figure 2. Figure 15 shows averaged PSNR results for a decreasing amount of compression, comparing the PSNR quality of the original images to three variants of CMs. The results show that the choice of the algorithm has very little impact on the overall trend of our results. While diffusion entirely destroys robustness to lossy compression, 2D CMs (and, to some extent, 3D variants) exhibit a certain amount of robustness against all sorts of compression. While JPEG 2000 with classical pyramidal decomposition outperforms the JPEG results by up to 2 dB, the wavelet-packet-based technique performs only on a par with JPEG. It seems that the deep decomposition structures produced by the best basis search, caused by the noise in the subbands, tend to deteriorate the results.
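The PSNR values quoted throughout (e.g., 28.0 dB in Figure 11(a)) follow the usual definition for 8-bit images, 10·log10(255² / MSE); a minimal sketch:

```python
import math

def psnr(original, decoded, max_val=255):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences of 8-bit pixel values (flattened images)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / mse)
```

A uniform error of one gray level, for instance, gives 10·log10(255²) ≈ 48.1 dB, which puts the 19–31 dB figures reported for the decrypted images into perspective.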

In general, we observe a significant tradeoff between security and the visual quality of the compressed data when comparing the different settings investigated. Increasing the number of iterations up to a certain level increases security but decreases compression performance (this is especially true for the Baker map, which in general requires a higher number of iterations to achieve reasonable security). Of course, the computational effort increases as well.

We face an even more significant tradeoff when increasing security further: the 3D extensions already strongly decrease image quality, whereas diffusion entirely destroys the capability of compressing encrypted visual data. When the security level approaches that of cryptographically strong ciphers like AES, CMs no longer offer robustness against lossy compression either.

6. CONCLUSION

CM behaves differently with respect to robustness against transmission errors depending on the nature of the errors. Whereas CM has turned out to be extremely robust against value errors, the opposite is true for buffer errors. If pixel values change, the errors remain restricted to the affected pixels even after decryption, whereas missing or added pixels entirely destroy the synchronization of the CM schemes. The observed robustness against value errors also explains the unique ability to tolerate a medium amount of lossy compression, an exceptional property not found in other ciphers. Applying the Cat map with 5 iterations or the Baker map with 20 iterations provides a certain degree of security, and decrypted images show acceptable image quality even after significant JPEG compression.


[Figure 15 consists of three plots of PSNR (dB) versus file size (%), each comparing the original images with the 2D Cat map, 3D Cat map, and 2D CatDiff variants: (a) JPEG, (b) JPEG 2000, (c) JJ 2000 WP.]

Figure 15: Mean PSNR versus file size of 16 different test images under varying compression settings using JPEG, JPEG 2000, and JPEG 2000 compression with wavelet packets.

However, the statements about robustness only apply if CM is used without the diffusion step (i.e., in a less secure mode). If diffusion is added, robustness against transmission value errors and against compression is entirely lost. Even if only the 3D extension technique is used, robustness is significantly reduced.
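Why diffusion destroys this robustness can be seen from its decryption recurrence: each recovered pixel depends on the previously recovered one, so a single value error (exactly what lossy compression introduces) keeps propagating until the end of the scan. The additive chain below is an illustrative stand-in, not the paper's actual diffusion function:

```python
def diffuse(pixels, seed=0):
    """Toy diffusion pass: each ciphertext byte mixes in the previous
    plaintext byte (illustrative, not the paper's exact function)."""
    out, prev = [], seed
    for p in pixels:
        out.append((p + 3 * prev) % 256)
        prev = p
    return out

def undiffuse(pixels, seed=0):
    """Inverse pass: each plaintext byte is recovered from the
    previously *recovered* byte, so ciphertext errors keep propagating."""
    out, prev = [], seed
    for c in pixels:
        p = (c - 3 * prev) % 256
        out.append(p)
        prev = p
    return out

plain = [10, 20, 30, 40, 50]
cipher = diffuse(plain)
assert undiffuse(cipher) == plain  # error-free round trip

# A single value error in the ciphertext corrupts every pixel from the
# error position onwards after decryption:
noisy = cipher[:]
noisy[1] = (noisy[1] + 1) % 256
recovered = undiffuse(noisy)
```

With a pure permutation cipher, by contrast, the same value error would corrupt exactly one (relocated) pixel, which is why CM without diffusion tolerates moderate lossy compression while CM with diffusion does not.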

As long as a lower security level is acceptable (i.e., diffusion is omitted), classical block ciphers like AES may be efficiently complemented by CM block ciphers in the presence of value errors (the computational demand is much lower and the robustness to transmission value errors is higher). Also, lossy compression may be applied in the encrypted domain to a certain extent, which is not at all possible with classical ciphers. If high security is required, it is better to stick to classical block ciphers in any environment.

ACKNOWLEDGMENTS

This work has been partially supported by the Austrian Science Fund, projects nos. 15170 and 19159. The following pictures are licensed under Creative Commons: Figure 2(b) by Emmanuel SalA, Figure 2(c) by Michael Jastremski, Figure 2(d) by Natthawut Kulnirundorn, Figure 2(h) by Vinu Thomas, and Figure 2(k) by Scott Kinmartin.

REFERENCES

[1] “Methods for subjective determination of transmission quality,” ITU-T Recommendation P.800, 1996.

[2] “Methodology for the subjective assessment of the quality of television pictures,” ITU-R Recommendation BT.500-11, 2002.

[3] I. Avcibas, B. Sankur, and K. Sayood, “Statistical evaluation of image quality measures,” Journal of Electronic Imaging, vol. 11, no. 2, pp. 206–223, 2002.

[4] G. Chen, Y. Mao, and C. K. Chui, “A symmetric image encryption scheme based on 3D chaotic cat maps,” Chaos, Solitons and Fractals, vol. 21, no. 3, pp. 749–761, 2004.

[5] S.-G. Cho, Z. Bojkovic, D. Milovanovic, J. Lee, and J.-J. Hwang, “Image quality evaluation: JPEG 2000 versus intra-only H.264/AVC high profile,” Facta Universitatis, Nis, Series: Electronics and Energetics, vol. 20, no. 1, pp. 71–83, 2007.

[6] J. Daemen and V. Rijmen, The Design of Rijndael: AES—The Advanced Encryption Standard, Springer, New York, NY, USA, 2002.

[7] A. M. Eskicioglu, “Quality measurement for monochrome compressed images in the past 25 years,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’00), vol. 4, pp. 1907–1910, Istanbul, Turkey, June 2000.

[8] J. Fridrich, “Symmetric ciphers based on two-dimensional chaotic maps,” International Journal of Bifurcation and Chaos in Applied Sciences and Engineering, vol. 8, no. 6, pp. 1259–1284, 1998.

[9] B. Furht and D. Kirovski, Eds., Multimedia Security Handbook, CRC Press, Boca Raton, Fla, USA, 2005.

[10] M. Gschwandtner, A. Uhl, and P. Wild, “Compression of encrypted visual data,” in Proceedings of the 10th IFIP International Conference on Communications and Multimedia Security (CMS ’06), H. Leitold and E. Markatos, Eds., vol. 4237 of Lecture Notes in Computer Science, pp. 141–150, Springer, Crete, Greece, October 2006.

[11] Y. Mao and M. Wu, “Security evaluation for communication-friendly encryption of multimedia,” in Proceedings of International Conference on Image Processing (ICIP ’04), vol. 1, pp. 569–572, Singapore, October 2004.

[12] V. Markovski, F. Xue, and L. Trajkovic, “Simulation and analysis of packet loss in user datagram protocol transfers,” Journal of Supercomputing, vol. 20, no. 2, pp. 175–196, 2001.

[13] G. T. Nguyen, R. H. Katz, B. Noble, and M. Satyanarayanan, “A trace-based approach for modeling wireless channel behavior,” in Proceedings of the Winter Simulation Conference (WSC ’96), pp. 597–604, Coronado, Calif, USA, December 1996.

[14] R. Norcen and A. Uhl, “Encryption of wavelet-coded imagery using random permutations,” in Proceedings of International Conference on Image Processing (ICIP ’04), vol. 2, pp. 3431–3434, Singapore, October 2004.


[15] W. B. Pennebaker and J. L. Mitchell, JPEG—Still Image Compression Standard, Van Nostrand Reinhold, New York, NY, USA, 1993.

[16] M. Reisecker and A. Uhl, “Wavelet-packet subband structures in the evolution of the JPEG 2000 standard,” in Proceedings of the 6th Nordic Signal Processing Symposium (NORSIG ’04), vol. 46, pp. 97–100, Espoo, Finland, June 2004.

[17] J. Scharinger, “Fast encryption of image data using chaotic Kolmogorov flows,” Journal of Electronic Imaging, vol. 7, no. 2, pp. 318–325, 1998.

[18] D. Taubman and M. W. Marcellin, JPEG2000—Image Compression Fundamentals, Standards and Practice, Kluwer Academic, Boston, Mass, USA, 2002.

[19] A. S. Tosun and W. Feng, “On error preserving encryption algorithms for wireless video transmission,” in Proceedings of the ACM International Multimedia Conference and Exhibition, no. 4, pp. 302–308, Ottawa, Ontario, Canada, September–October 2001.

[20] A. Uhl and A. Pommer, Image and Video Encryption: From Digital Rights Management to Secured Personal Communication, vol. 15 of Advances in Information Security, Springer, New York, NY, USA, 2005.

[21] J. G. Wen, M. Severa, W. Zeng, M. H. Luttrell, and W. Jin, “A format-compliant configurable encryption framework for access control of video,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 6, pp. 545–557, 2002.

[22] W. Zeng, J. Wen, and M. Severa, “Fast self-synchronous content scrambling by spatially shuffling codewords of compressed bitstreams,” in Proceedings of International Conference on Image Processing (ICIP ’02), vol. 3, pp. 169–172, Rochester, NY, USA, September 2002.

[23] W. Zeng and S. Lei, “Efficient frequency domain selective scrambling of digital video,” IEEE Transactions on Multimedia, vol. 5, no. 1, pp. 118–129, 2003.