China Communications • May 2015 162

I. INTRODUCTION

In the past decade, wireless sensor networks (WSNs) have emerged as a premier research topic, spanning theoretical research and practical applications. Some in the research community regard them as one of the technologies that could save, or even change, the world. A WSN consists of a number of spatially distributed, autonomous, tiny wireless sensor devices that monitor physical or environmental conditions such as temperature, sound, humidity and pressure [1]. Special characteristics of WSNs, such as rapid deployment, self-organization and fault tolerance, make them widely used in many monitoring and tracking applications, including environmental monitoring, health and wellness monitoring, traffic tracking and battlefield surveillance. However, sensor nodes in typical wireless sensor networks are usually resource-constrained, which limits their applications to a certain extent.

In the process of gathering sensed data from multiple sensors to the base station, data generated by neighboring sensors is often redundant and highly correlated. Data aggregation is an effective method to combine these

Abstract: Wireless sensor networks (WSNs) consist of a large number of sensor nodes with limited power, computation, storage, sensing and communication capabilities. Data aggregation is a very important technique, designed to substantially reduce the communication overhead and energy expenditure of sensor nodes during data collection in a WSN. However, privacy preservation is especially challenging in data aggregation, where the aggregators need to perform aggregation operations on the sensing data they receive. We present a state-of-the-art survey of privacy-preserving data aggregation in WSNs. First, we classify the existing privacy-preserving data aggregation schemes into different categories by the core privacy-preserving techniques used in each scheme. We then compare and contrast different algorithms on the basis of performance measures such as privacy protection ability, communication consumption, power consumption and data accuracy. Furthermore, based on the existing work, we discuss a number of open issues which may be of interest to researchers for future work.

Keywords: wireless sensor networks; data aggregation; privacy-preserving

A Survey on the Privacy-Preserving Data Aggregation in Wireless Sensor Networks

Xu Jian1*, Yang Geng1,2, Chen Zhengyu3, Wang Qianqian4

1 Jiangsu High Technology Research Key Lab for WSNs, College of Computer Science & Technology, Nanjing University of Posts & Telecommunications, Nanjing 210046, China
2 Communication and Sensor Network Technology of Ministry of Education, Nanjing University of Posts & Telecommunications, Nanjing 210046, China
3 School of Electrical & Information Engineering, Jinling Institute of Technology, Nanjing 211169, China
4 School of Software Engineering, Jinling Institute of Technology, Nanjing 211169, China

SECURITY SCHEMES AND SOLUTIONS


Data Aggregation protocols are broadly divided into two categories: encrypted protocols and unencrypted protocols. We introduce some typical protocols in each category and make a comprehensive comparison between them. Finally, we discuss some interesting and challenging open issues in the field of privacy-preserving data aggregation for WSNs.

The rest of the paper is organized as follows. In Section II, we review privacy-preserving techniques in related fields. In Section III, we introduce some essential design principles of privacy-preserving data aggregation protocols for WSNs. In Sections IV and V, we classify and summarize the existing schemes, and introduce the typical protocols of each category. In Section VI, we evaluate and compare the performance of different privacy-preserving data aggregation techniques. Section VII outlines the open challenges for future research. Conclusions are drawn in Section VIII.

II. RELATED WORK

As mentioned in [7], there are two main categories of privacy-preserving techniques, protecting two types of private information: data-oriented and context-oriented privacy, as shown in Figure 1. Data-oriented techniques focus on the privacy of data collected from, or queries posted to, a WSN. Context-oriented techniques, on the other hand, concentrate on contextual information, such as the location and timing of traffic flows in WSNs.

Query privacy: The issue of query privacy has long been the object of intensive study in both traditional wireless sensor networks and two-tier wireless sensor networks. In traditional WSNs, private data query poses significant challenges to the design of a resource-constrained WSN [8-12].

Location privacy: There are two different types of location privacy techniques in WSNs, source location privacy and base station location privacy.

redundant data into high quality information in the intermediate nodes, resulting in conservation of energy and bandwidth [2]. Some important performance measures of data aggregation algorithms, such as network lifetime, data accuracy and latency, have been well defined and widely studied [3-5].

Another issue in WSNs is how to protect sensitive information while it is being collected, transmitted and analyzed. In many scenarios, such private information may include critical data collected by sensors and transmitted to a centralized data processing server. For instance, in Smart Homes, sensors might measure household details, such as power and water usage, in order to compute average trends and make local recommendations [6]. Privacy protection in WSNs has been extensively studied in areas such as location privacy, temporal privacy and data query. Nonetheless, the special data processing of data aggregation, in which the aggregators need to perform aggregation operations on sensing data, introduces unique challenges for privacy preservation. To address this issue, a number of protocols have been proposed by researchers. In this paper, we present a state-of-the-art survey of the most recent and important privacy-preserving data aggregation techniques in WSNs.

We first introduce the related work on privacy protection techniques in WSNs. Next, we explain some essential design principles of privacy-preserving data aggregation protocols for WSNs. Then, we classify the existing schemes into different categories by the core privacy-preserving techniques used in each scheme. In this paper, the Privacy-preserving

Fig. 1 Taxonomy of privacy-preserving techniques for WSNs (Privacy in WSNs divides into Data Privacy, covering Data Aggregation and Data Query, and Context Privacy, covering Location Privacy and Temporal Privacy)

This paper presents a state-of-the-art survey of privacy-preserving data aggregation in WSNs.


respectively.

Secondly, the privacy-preserving data aggregation protocols can be classified on the basis of network topology, which reflects the relationship between different nodes. In this type of classification, protocols are grouped into three classes: cluster, tree-structure and hybrid. For example, in a cluster-based WSN, each cluster has a cluster head (CH) responsible for collecting and aggregating sensing data within the cluster and then sending the aggregation results to the sink node. In a tree-based WSN, by contrast, data aggregation is always performed at the parent nodes, and the aggregation result is forwarded to the sink node layer by layer.

Bista et al. [27] described another classification according to the type of nodes in WSNs. The authors broadly divided privacy-preserving data aggregation protocols into two categories: homogeneous protocols and heterogeneous protocols. In a homogeneous WSN, each node has the same capabilities, and the aggregators are regular sensor nodes. In a heterogeneous WSN, on the other hand, which incorporates different types of sensor nodes (SNs) with different capabilities, powerful high-end sensors act as cluster heads [28].

However, the existing classifications do not reflect the distinguishing features of each protocol, especially the elementary operations performed during the aggregation process. In this paper, we first divide the privacy-preserving data aggregation protocols into two categories: encrypted protocols and unencrypted protocols. The protocols are then further classified by the core privacy-preserving techniques used in each scheme, such as hop-by-hop encryption, secure multi-party computation and privacy homomorphism, as shown in Figure 2.

IV. ENCRYPTED PRIVACY-PRESERVING DATA AGGREGATION

4.1 Hop-by-hop encryption based protocols

The concept of source location privacy (SLP) in relation to context privacy was first described by Ozturk et al. [13] based on the panda hunter game. The core techniques used in SLP solutions include random walk [13-16], geographic routing [17-19], dummy data sources [13,20-21] and so on. On the other hand, the location privacy of the base station requires the utmost protection due to its crucial position in the network [22-24].

Temporal privacy: Most temporal privacy protection solutions in WSNs are based on delaying packets at intermediary nodes. Two delay-based temporal privacy protection solutions were introduced in [25-26].

III. CLASSIFICATION OF PRIVATE DATA AGGREGATION IN WSNS

In this section, we introduce our taxonomy of privacy-preserving data aggregation techniques for WSNs.

The challenge of private data aggregation is how to perform efficient in-network aggregation at the intermediate nodes while still protecting the privacy of sensitive data. A number of privacy-preserving data aggregation techniques have been proposed in the literature, each adopting a different scheme to protect the privacy of the aggregated data. These techniques can be classified into different taxonomies according to different perspectives.

First, wireless sensor systems can be categorized according to their architecture, which can be centralized, hierarchical or distributed [1]. In the centralized architecture, the base station initiates the process of data collection from all the nodes, and monitors and controls the entire network. The distributed architecture focuses on the autonomy of the sensor node: each node can coordinate its actions, such as topology formation and data aggregation, based on knowledge that it acquires from a neighboring substation. Based on the network architecture, we can classify the existing privacy-preserving data aggregation protocols into three taxonomies


putation costs.

First, Bista et al. assume that every sensor node shares two types of key, as presented in [30]. The first is a pair-wise secret key shared with a Master Device (MD) in order to be a trusted member of a WSN. The second is a symmetric key shared with each sensor node lying on its aggregation tree in order to secure data transmissions. There are two important definitions in this protocol: masked value and customized data.

Masked value: The data sampled by a sensor node, combined with a private real number. For example, if the temperature reading of a sensor node is 30 and the private real number is 101, then the masked value is 30 + 101 = 131.

Customized data: A complex number of the form a + bi, where a is a masked value. For example, if 131 is the masked value and 60i is the imaginary part, the customized data is C = 131 + 60i.

In this scheme, encryption and decryption are done at each hop. When aggregators receive the encrypted data from their child sensor nodes, each aggregator first decrypts the data using the respective symmetric keys and adds the data, including its own, using the additive property of complex numbers to generate an intermediate result. The intermediate result is then encrypted and sent to the upper level towards the query server. Note that the "plaintext" decrypted by aggregators is customized data, so it is still difficult for an adversary to recover sensitive information even if data are overheard and decrypted.
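The masking and aggregation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node tuples, the helper names `customize` and `aggregate`, and the idea that the sink knows the sum of the private masks are hypothetical and are not taken from Bista et al.'s paper; the per-hop symmetric encryption is omitted.

```python
# Hypothetical per-node data: (sensor reading, private real mask, imaginary part).
nodes = [(30, 101, 60), (25, 77, 12), (28, 240, 35)]

def customize(reading, mask, imag):
    """Build the customized data C = (reading + mask) + imag*i."""
    return complex(reading + mask, imag)

def aggregate(values):
    """Aggregator step: combine customized data with complex addition."""
    total = 0j
    for v in values:
        total += v
    return total

customized = [customize(*n) for n in nodes]
agg = aggregate(customized)

# Assumption (not from the source): the sink knows the sum of the private
# masks and subtracts it from the real part to obtain the true sum.
mask_sum = sum(mask for _, mask, _ in nodes)
recovered_sum = int(agg.real) - mask_sum
print(customized[0], recovered_sum)
```

An intermediate node that decrypts a child's message only sees a customized value such as 131+60i, not the raw reading 30.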

4.1.2 Hilbert-curve based privacy-preserving data aggregation scheme

By using a Hilbert-curve technique [31] and seed exchanges among sensor nodes, Kim et al. [32] provided a novel algorithm, HAD, to enforce data privacy and data integrity for wireless sensor networks. The proposed scheme proceeds in three phases: a network construction phase, a data encryption

Because of its strong support for data authentication, integrity and confidentiality, hop-by-hop cryptography has been widely used to ensure privacy in communications in the presence of third parties. In traditional hop-by-hop encrypted data aggregation, encryption and decryption are done at each hop. This means that an aggregator has to decrypt each received message to plaintext, aggregate the plaintext according to the corresponding aggregation function and, finally, encrypt the aggregation result before forwarding it. As a result, if a node is compromised, the attacker can easily get hold of the sensitive data.

The challenge for hop-by-hop encrypted data aggregation in WSNs is to find a general privacy-preserving protocol which can provide not only secure and efficient hop-by-hop aggregation, but also end-to-end data concealment to protect privacy.

4.1.1 Bista et al.’s privacy-preserving data aggregation scheme

Bista et al. proposed a private data aggregation scheme for WSNs in [29]. This scheme uses complex numbers, which form an algebraic structure supporting arithmetic operations, to aggregate sensor data and hide it from other sensor nodes while it is aggregated and transmitted to the sink. The scheme not only ensures that no trend of a sensor node's private data is released to any other node, but also provides an efficient private data aggregation method in terms of communication and com-

Fig. 2 Classification of the existing privacy-preserving data aggregation protocols for WSNs (encrypted and unencrypted protocols). Categories and member schemes shown: hop-by-hop encryption based (Bista et al.'s scheme, Hilbert-curve based scheme); secure multi-party computation based (CPDA, iCPDA, ESPART, Jung et al.'s scheme); privacy homomorphism based, with symmetric PH (CDA, Castelluccia et al.'s scheme) and asymmetric PH (EC-EG, RCDA, CDAMA); data-slicing based (SMART, iPDA, EEHA, PEPDA); data perturbation based (KIPDA, GP2S, Zhang et al.'s scheme)


each sensor node sends the aggregated data to a parent node, where all the data from child nodes are merged with the parent's own encrypted data. The sink node aggregates the data of all sensor nodes in the network.

For aggregation, an intermediate node receives the data from its child nodes and re-encrypts it together with its own data before sending it towards the sink node. The pivotal operation of this step is re-encryption, in which the aggregator node performs decryption, aggregation and encryption in that order.

4.2 Secure multi-party computation based schemes

Secure multi-party computation (SMC) is a subfield of cryptography which was first formally introduced by Yao [34] in 1982. The goal of this field is to create methods that enable parties to jointly compute a function over their inputs while keeping these inputs private. Because it preserves the privacy of individual data, SMC is widely used for privacy preservation in data mining, database query, intrusion detection and so on. However, traditional SMC-based privacy-preserving schemes are usually computationally expensive, which makes them unsuitable for resource-constrained wireless sensor networks. To reduce the complexity and energy consumption of SMC, some modified secure multi-party computation schemes have been proposed in recent

phase, and a data transmission phase.

Network Construction Phase: In this phase, Kim et al. adopt a message flooding scheme as in [33] to let each node determine its sibling nodes, parent node and child nodes by sending broadcast messages. It is worth mentioning that, in this procedure, the scheme sets a maximum number of child nodes so as to avoid network imbalance.

Data Encryption Phase: After the sensor network is constructed, each node generates a random seed and exchanges it with its sibling nodes. The seed is used to hide the original data from an adversary. The original data is changed by subtracting some part of a seed value, which is sent to other nodes; some part of a seed value received from another node is also added. As a consequence, the sensed data is hidden among the members of the seed exchange group. After that, each node transforms the sensed data into a value by using its generated seed and the received seeds according to the following equation:

Processed value = Original value + Received seed - Generated seed

Figure 3 shows the transformed sensed data on each sensor node after exchanging seeds.
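A minimal Python sketch of the seed exchange may help here. It follows the node labels of Fig. 3 (node A ends up with a + Sc - Sa, node B with b + Sa - Sb, node C with c + Sb - Sc); the ring ordering, the seed range and the function name `seed_exchange` are assumptions made for illustration only.

```python
import secrets

def seed_exchange(readings):
    """Return the processed values and the seeds for one sibling group.

    Node i generates seeds[i], sends it to the next node in the ring, and
    receives seeds[i - 1] from the previous node (an assumed ordering).
    """
    n = len(readings)
    seeds = [secrets.randbelow(1000) for _ in range(n)]
    processed = [readings[i] + seeds[i - 1] - seeds[i] for i in range(n)]
    return processed, seeds

readings = [30, 25, 28]                # a, b, c held by nodes A, B, C
processed, seeds = seed_exchange(readings)
print(processed, sum(processed))
```

Each seed is added once and subtracted once, so the group sum is unchanged while the individual values are hidden from anyone who does not know the seeds.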

The changed value is encrypted with the Hilbert-curve algorithm. The Hilbert curve is a continuous fractal space-filling curve that gives a locality-preserving mapping between 1D and 2D space. To adapt the Hilbert curve to the algorithm, the authors assume that each sensor node transforms the one-dimensional sensed value into two-dimensional data. Here, the one-dimensional value is the aggregated value after applying the seed exchange algorithm for each node group. The two-dimensional data are the coordinates of the aggregated value along the Hilbert curve in a 2^n × 2^n matrix. By selecting the direction and the level of the Hilbert curve, the value can be encrypted as a tuple <key(d, l), x, y> using the two-dimensional data (x, y), the level l and the direction d.
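The 1D-to-2D mapping can be illustrated with the standard Hilbert index-to-coordinate conversion. The sketch below only shows the mapping for a given level; it does not model the direction d or the key(d, l) part of the tuple, and the function name `hilbert_d2xy` is an illustrative choice rather than anything defined by Kim et al.

```python
def hilbert_d2xy(level, d):
    """Map a 1D index d to (x, y) on a Hilbert curve over a 2^level x 2^level grid."""
    n = 1 << level
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                       # rotate/flip the current quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

level = 3                                  # 8 x 8 grid
points = [hilbert_d2xy(level, d) for d in range(4 ** level)]
print(points[:4])
```

Consecutive indices always land on neighboring cells, which is the locality property the text refers to.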

Data Transmission Phase: In this phase,

Fig. 3 Original data change by seed exchange among three nodes (after exchanging seeds Sa, Sb and Sc, node A holds a+Sc-Sa, node B holds b+Sa-Sb, and node C holds c+Sb-Sc)


constructed.

2) Calculation within Clusters: The second step of CPDA is the intermediate aggregation within clusters. To simplify the discussion, the authors use a simple scenario, based on the additive property of polynomials, to illustrate the message exchange among three nodes that obtains the desired sum without releasing individual private data. As shown in Figure 4, a cluster contains three members, A, B and C, and a, b and c represent the private data held by nodes A, B and C, respectively. Let A be the cluster leader, and let B and C be cluster members.

First, nodes within a cluster share common (non-private) knowledge of distinct non-zero numbers, referred to as seeds, x, y and z. Then node A calculates

$$v_A^A = a + r_1^A x + r_2^A x^2$$
$$v_B^A = a + r_1^A y + r_2^A y^2$$
$$v_C^A = a + r_1^A z + r_2^A z^2$$

where $r_1^A$ and $r_2^A$ are two random numbers generated by node A, and known only to node A.

Similarly, nodes B and C calculate $v_A^B$, $v_B^B$, $v_C^B$ and $v_A^C$, $v_B^C$, $v_C^C$ independently, and then each pair of nodes within the cluster exchanges the corresponding values. For example, after exchanging data, node B calculates a value $F_B$ as follows:

$$F_B = v_B^A + v_B^B + v_B^C = (a + b + c) + r_1 y + r_2 y^2$$

where $r_1 = r_1^A + r_1^B + r_1^C$ and $r_2 = r_2^A + r_2^B + r_2^C$.

Then nodes B and C send $F_B$ and $F_C$ to the cluster leader A (Figure 4). Node A, which computes $F_A$ in the same way, now holds three equations in the three unknowns (a+b+c), $r_1$ and $r_2$; because x, y and z are distinct, it can deduce the aggregate value (a+b+c). The cluster heads thus learn the sum of the private data of their cluster members. Nonetheless, the cluster heads cannot learn the private data of any individual member.
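The intra-cluster calculation can be checked numerically with a short Python sketch. It is a simplified illustration, not the protocol implementation: link encryption is omitted, the helper names `shares` and `solve_sum` are hypothetical, and the leader recovers the sum by evaluating the interpolating polynomial at 0 (equivalent to solving the three linear equations, which is possible because the seeds are distinct).

```python
from fractions import Fraction
import secrets

def shares(value, seeds):
    """One node's shares: value + r1*s + r2*s^2 for every seed s."""
    r1, r2 = secrets.randbelow(10**6), secrets.randbelow(10**6)
    return [value + r1 * s + r2 * s * s for s in seeds]

def solve_sum(seeds, F):
    """Recover S from F_j = S + r1*s_j + r2*s_j^2 by Lagrange interpolation at 0."""
    total = Fraction(0)
    for j, sj in enumerate(seeds):
        weight = Fraction(1)
        for k, sk in enumerate(seeds):
            if k != j:
                weight *= Fraction(-sk, sj - sk)
        total += weight * F[j]
    return total

seeds = [1, 2, 3]                      # public seeds x, y, z
private = [30, 25, 28]                 # a, b, c held by A, B, C
sent = [shares(v, seeds) for v in private]          # sent[i][j]: from node i to node j
F = [sum(sent[i][j] for i in range(3)) for j in range(3)]   # F_A, F_B, F_C
cluster_sum = solve_sum(seeds, F)
print(cluster_sum)
```

The random coefficients cancel out of the result, so the leader obtains exactly a + b + c without seeing b or c individually.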

3) Cluster Data Aggregation: In this step, each cluster leader routes the sum derived within its cluster back towards the query server through a TAG routing tree rooted at the server.

years. Sheikh et al. [35] proposed a k-secure sum protocol, which allows multiple cooperating parties to compute some function of their individual data without revealing the data to one another.

4.2.1 CPDA: cluster-based private data aggregation and its modification

Similar to Sheikh et al.'s work, He et al. [36] first introduced modified SMC to privacy-preserving data aggregation in WSNs, and proposed the Cluster-Based Private Data Aggregation (CPDA) scheme.

CPDA uses the random key distribution mechanism proposed in [37] to encrypt messages and thus prevent message eavesdropping attacks. The key distribution consists of three phases: key pre-distribution, shared-key discovery and path-key establishment. Based on this key distribution and management mechanism, CPDA consists of three phases: cluster formation, calculation of the aggregate results within clusters, and cluster data aggregation.

1) Formation of Clusters: The first step in CPDA is to construct clusters to perform intermediate aggregations. A query server Q triggers a query by a HELLO message. Upon receiving the HELLO message, a sensor node elects itself as a cluster leader with a probability pc, which is a preselected parameter for all nodes. If a node becomes a cluster leader, it forwards the HELLO message to its neighbors; otherwise, the node waits for a certain period of time to receive HELLO messages from its neighbors, and then decides to join one of the clusters by broadcasting a JOIN message. As this procedure goes on, multiple clusters are

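Fig. 4 Message exchange in CPDA (three panels for a cluster of nodes A, B and C: the shared seeds x, y and z; the encrypted shares exchanged between members, e.g. between B and C under the key kBC; and the values FB and FC sent to the cluster leader A)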


ing protocol which successfully guarantees data privacy even when all channels are subject to eavesdropping attacks and all communications throughout the aggregation are open to others.

Jung et al. employed two different models, the One Aggregator Model and the Participants Only Model, and presented two calculation protocols, the Product Protocol and the Sum Protocol, for preserving each individual's data privacy in each model. In this survey, we only introduce the product protocol.

In the Participants Only Model, the product protocol assumes that all participants together want to compute the value $f(x) = \prod_i x_i$ given their privately known values $x_i \in Z_p$. The basic idea of the protocol is to find random integers $R_i \in Z_p$ such that $\prod_i R_i = 1 \bmod p$, where user $p_i$ can compute the random number $R_i$ easily while it is computationally expensive for other participants to compute $R_i$. Communications in the Participants Only Model are shown in Figure 5.

Let $G_1 \subset Z_p$ be a cyclic multiplicative group of prime order p and $g_1$ be its generator. Then the protocol for the privacy-preserving product $\prod_i x_i$ has the following steps: Setup, Encrypt, and Product.

4.3 Privacy Homomorphism based privacy-preserving data aggregation

A Privacy Homomorphism (PH) is an encryp-

The random data mentioned in CPDA can be regarded as noise which helps to hide individual raw data items from the cluster head. Due to the cooperation of nodes in the cluster, the effect of the noise is eliminated, and the sum calculated is exactly the same as the original value.

Several improved schemes based on CPDA have been proposed [38-40]. In [38], He et al. address both the privacy of individual sensory data and the integrity of the aggregation result by proposing a protocol called iCPDA, which builds on the cluster-based private data aggregation protocol (CPDA). iCPDA enables peer monitoring thanks to the shared-medium nature of wireless communication: neighbors of an aggregator can overhear the aggregation results from the aggregator's children as well as the result sent from the aggregator to its parent. It is therefore possible for neighbors of the aggregator to detect whether the aggregator changes the intermediate result; a node that detects such misbehavior reports it to the base station. In this way, the integrity of the aggregation result can be protected efficiently. In our previous work, we presented an energy-saving and privacy-preserving data aggregation (ESPART) scheme [39], which performs well in both energy consumption and privacy-preserving efficacy, so that the lifetime of the network can be prolonged. Based on an analysis of CPDA, Jaydip et al. presented two modifications, one making the protocol more efficient and the other making it more secure [40].

4.2.2 Jung et al.’s privacy-preserving data aggregation scheme

Most of the existing SMC-based privacy-preserving schemes require an initialization phase during which participants request keys from key issuers via a secure channel. This could be a security hole, since the security of those schemes relies on the assumption that keys are disclosed to authorized participants only. Jung et al. [41] proposed a novel privacy-preserv-

Fig. 5 Communications in participants only model: (a) Setup, (b) Encrypt (participants P1, P2, ..., Pi-1, Pi, Pi+1, ..., Pn-1, Pn)


the encryption transformation involves some randomness that chooses the ciphertext corresponding to a given cleartext from a set of possible ciphertexts.

DF has both the additive and the multiplicative PH properties. For ciphertext multiplication, all terms are cross-multiplied in Zg, with a d1-degree term multiplied by a d2-degree term yielding a (d1+d2)-degree term; terms having the same degree are added up. DF is a symmetric algorithm that requires the same secret key for encryption and decryption. The aggregation is performed with a key that can be publicly known, i.e., the aggregator nodes do not need to be able to decrypt the encrypted messages. However, the same secret key must be used by every node in the network that needs to encrypt data.

4.3.1.2 Castelluccia et al.’s scheme

Castelluccia, Mykletun, and Tsudik [49] propose a simple and provably secure additively homomorphic stream cipher that allows efficient aggregation of encrypted data. The main idea of the scheme is to replace the exclusive-OR (XOR) operation typically found in stream ciphers with modular addition (+). Since this new cipher only uses modular additions (with very small moduli), it is very well suited for CPU-constrained devices.

It is assumed that 0 ≤ m < M. Due to the commutative property of addition, the above scheme is additively homomorphic. In fact, if c1 = Enc(m1, k1, M) and c2 = Enc(m2, k2, M), then c1 + c2 = Enc(m1 + m2, k1 + k2, M).

Note that if n different ciphers $c_i$ are added, then M must be larger than $\sum_{i=1}^{n} m_i$; otherwise, correctness is not provided. In fact, if $\sum_{i=1}^{n} m_i$ is larger than M, decryption will result in a value m' that is smaller than M. In practice, if $p = \max(m_i)$, then M should be selected as $M = 2^{\lceil \log_2(p \cdot n) \rceil}$.
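A minimal Python sketch of this additively homomorphic cipher follows. It assumes that Enc(m, k, M) = (m + k) mod M and Dec(c, k, M) = (c - k) mod M, which is the natural reading of "replace XOR with modular addition"; the function names are illustrative, and random numbers stand in for the per-node key stream.

```python
import secrets

def choose_modulus(p, n):
    """M = 2^ceil(log2(p*n)), computed with integer arithmetic."""
    return 1 << (p * n - 1).bit_length()

def enc(m, k, M):
    """Assumed encryption: modular addition of message and key stream value."""
    return (m + k) % M

def dec(c, k, M):
    """Assumed decryption: modular subtraction of the key stream value."""
    return (c - k) % M

readings = [17, 42, 99]                        # m_i, each at most p = 100
M = choose_modulus(100, len(readings))
keys = [secrets.randbelow(M) for _ in readings]   # stand-in for the key stream

ciphertexts = [enc(m, k, M) for m, k in zip(readings, keys)]
aggregate = sum(ciphertexts) % M               # done by aggregators, no keys needed
total = dec(aggregate, sum(keys) % M, M)       # done by the sink, which knows all keys
print(M, total)
```

The aggregator only adds ciphertexts; the sink removes the sum of the keys in a single subtraction.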

The key stream k can be generated by using a stream cipher, such as RC4, keyed with a node’s secret key si and a unique message ID. This secret key is precomputed and shared between the node and the sink, while the mes-

tion transformation that allows computation directly on encrypted data. Let Q and R denote two rings, + denote addition and × denote multiplication on both. Let K be the key space. We denote an encryption transformation E:K×Q→R and the corresponding decryption transformation D:K×R→Q.

Given a, b ∈ Q and k ∈ K, we term

$$a + b = D_k(E_k(a) \oplus E_k(b))$$

additively homomorphic and

$$a \times b = D_{f(k_1,k_2)}(E_{k_1}(a) \otimes E_{k_2}(b))$$

multiplicatively homomorphic.

The first work on PHs was done in a seminal paper by Rivest et al. [42]. In [43], Domingo-Ferrer presented an additive and multiplicative PH which is a symmetric scheme and secure against chosen ciphertext attacks.

The advantage of PH-based privacy-preserving data aggregation is obvious: it conceals sensed data end to end while still providing efficient in-network data aggregation. Cryptographic algorithms that support privacy homomorphism are divided into symmetric PH and asymmetric PH (public key homomorphism). In symmetric PH, nodes encrypt their sensed readings with a key shared with the base station, so only the base station can decrypt the data, which achieves end-to-end confidentiality. In asymmetric PH, nodes encrypt their sensed data with the base station's public key, so only the base station, which owns the private key, can decrypt the data.

4.3.1 Symmetric PH based schemes

Symmetric PH schemes require identical secret information for encryption and decryption. Well-known symmetric PH schemes include the DF scheme [44], the CDA scheme [45-47] and so on. Some investigators have also tried to combine different PH algorithms and proposed a hybrid symmetric PH approach [48].

4.3.1.1 CDA: Concealed Data aggregation

In [45-47], Girao et al. proposed the first efficient PH cryptographic system for WSNs, CDA, based on the symmetric PH scheme DF [44]. The PH is probabilistic, which means that


usage of aggregation functions is constrained (only additive operations are supported), so these schemes are ineffective if the base station wishes to query, for example, the maximum value of all sensing data. Second, the base station cannot verify the integrity and authenticity of the data via message digests or signatures attached to each sensing sample.

In [52], Chen et al. introduced a concept named Recoverable Concealed Data Aggregation (RCDA). In RCDA, a base station can recover each sensing datum generated by every sensor even if these data have been aggregated by cluster heads (aggregators). Thus, the base station can verify the integrity and authenticity of all sensing data and, of course, can perform any aggregation function on them. Furthermore, the authors proposed two RCDA schemes, named RCDA-HOMO and RCDA-HETE, designed for homogeneous and heterogeneous wireless sensor networks respectively.

In this survey, we only introduce the RCDA-HOMO scheme, which can be applied to both homogeneous and heterogeneous WSNs without modification. RCDA-HOMO is composed of four procedures: Setup, Encrypt-Sign, Aggregate, and Verify.

4.3.2.3 CDAMA: Concealed Data Aggregation Scheme for Multiple Applications

In previous studies, homomorphic encryption has been applied to conceal communication during aggregation, so that adversaries are not able to forge aggregated results by compromising them. However, these schemes do not suit multi-application environments. For example, smoke alarms and thermometer sensors may be deployed in the same environment. If we apply the conventional concealed data aggregation schemes mentioned above, the ciphertexts of different applications cannot be aggregated together; otherwise, the decrypted aggregated result will be incorrect. The only solution is to aggregate the ciphertexts of different applications separately. As a result, the transmission cost grows as the number of applications increases.

Therefore, Lin et al. [54] proposed a new

sage ID can either be included in the query from the sink or be derived from the time period in which the node is sending its values (assuming some form of synchronization).

4.3.2 Asymmetric PH based schemes

Due to the unique features of elliptic curve crypto-schemes, such as short ciphertexts, the smaller real estate required for hardware implementations, and a better security-per-bit ratio, most asymmetric PH data aggregation schemes are based on elliptic curve cryptography.

4.3.2.1 EC-EG: Elliptic Curve ElGamal encryption scheme

Mykletun et al. [50] proposed a Concealed Data aggregation scheme with the property of additive homomorphic encryption based on elliptic curve ElGamal (EC-EG) cryptosystem. In fact, EC-EG is equivalent to the traditional ElGamal encryption scheme [51] but trans-formed into an additive group. The key setup consists of choosing an elliptic curve E togeth-er with a prime p and generator G. Its security is based upon the Elliptic Curve Discrete Log Problem (ECDLP).

EC-EG is additively homomorphic, and ciphertexts are combined through addition. The summation of two EC-EG ciphertexts requires two point additions, namely one for each of the ciphertext components R and S.

The function map() maps the plaintext m to a curve point M. In Mykletun et al.’s design, map(m) = m*G, which satisfies the desired additive homomorphic property since map(m1+…+mn) = (m1+…+mn)*G = m1*G+…+mn*G = map(m1)+…+map(mn). The reverse function rmap() maps a given point M back to the original plaintext m. The reverse map can be computed with Pollard's method on elliptic curve cryptosystems.
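The additive homomorphism of EC-EG can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from [50]: the toy curve y^2 = x^3 + 2x + 2 over F_17 with generator G = (5, 1) of order 19, the private key value, the helper names, and the brute-force rmap() used in place of Pollard's method. Such a small curve offers no security.

```python
import random

# Toy parameters (assumed for illustration): y^2 = x^3 + 2x + 2 over F_17,
# generator G = (5, 1) of prime order 19. Far too small for real use.
P, A = 17, 2
G = (5, 1)
ORDER = 19

def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Scalar multiplication k*pt by double-and-add."""
    result = None
    while k > 0:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def neg(pt):
    return None if pt is None else (pt[0], -pt[1] % P)

def rmap(point):
    """Reverse map by exhaustive search (stand-in for Pollard's method)."""
    acc = None
    for m in range(ORDER):
        if acc == point:
            return m
        acc = add(acc, G)
    raise ValueError("point is not a multiple of G")

def encrypt(m, public_key):
    k = random.randrange(1, ORDER)
    return mul(k, G), add(mul(m, G), mul(k, public_key))  # (R, S)

def aggregate(c1, c2):
    # Two point additions: one for R, one for S.
    return add(c1[0], c2[0]), add(c1[1], c2[1])

def decrypt(private_key, cipher):
    r, s = cipher
    return rmap(add(s, neg(mul(private_key, r))))

private_key = 7                      # hypothetical base-station key
public_key = mul(private_key, G)
ciphers = [encrypt(m, public_key) for m in (3, 4, 5)]
total = ciphers[0]
for c in ciphers[1:]:
    total = aggregate(total, c)
print(decrypt(private_key, total))   # sum of the readings, valid while it stays below ORDER
```

The aggregator only performs point additions and never sees a plaintext; the sum must remain smaller than the order of G for rmap() to return it unambiguously.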

4.3.2.2 RCDA: Recoverable Concealed Data Aggregation

In the above PH-based schemes, the base station receives only the aggregated results. However, it brings two problems. First, the


aggregation. In our previous work [61], we proposed a precision-enhanced and encryption-mixed privacy-preserving data aggregation (PEPDA) scheme, which optimizes the SMART scheme with several factors to reduce the collision rate and to increase data aggregation accuracy. Besides, similar techniques are widely used in other fields; for example, [62] presented PriSense, a solution that provides privacy-preserving data aggregation in people-centric urban sensing systems based on the concept of data slicing and mixing.

4.4.1 SMART: Slice-Mix-AggRegaTe and iPDA

In the SMART [36] scheme, each node hides its private data by slicing it into pieces. It sends encrypted data slices to different intermediate aggregation nodes. After the pieces are received, intermediate nodes calculate intermediate aggregate values and further aggregate them to the sink.

Figure 6 gives a clear illustration of the basic idea of SMART for a sensor network with network size N = 5 and slicing size J = 2, in which each node keeps one piece and forwards the other to a 1-hop neighbor. In this example, the five sensors are denoted s1 to s5. Let dii be the piece of data kept by si, dij the piece of data transmitted from si to sj, and ri the data aggregated by si.
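The slice and mix steps can be sketched as follows. This is a simplified illustration rather than the protocol of [36]: link encryption of the slices is omitted, and the function names, the random range of the slices, and the ring topology are assumptions.

```python
import random

def slice_value(value, j, rng):
    """Split value into j integer pieces that sum to value."""
    pieces = [rng.randint(-100, 100) for _ in range(j - 1)]
    pieces.append(value - sum(pieces))
    return pieces

def slice_and_mix(readings, neighbors, j, rng):
    """readings: {node: value}; neighbors: {node: list of 1-hop neighbors}.
    Returns {node: r_i}, the intermediate result held by each node."""
    r = {node: 0 for node in readings}
    for node, value in readings.items():
        pieces = slice_value(value, j, rng)
        r[node] += pieces[0]                       # d_ii, kept locally
        targets = rng.sample(neighbors[node], j - 1)
        for target, piece in zip(targets, pieces[1:]):
            r[target] += piece                     # d_ij, sent to neighbor s_j
    return r

rng = random.Random(1)
readings = {1: 20, 2: 25, 3: 18, 4: 30, 5: 22}
ring = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
mixed = slice_and_mix(readings, ring, j=2, rng=rng)
# Aggregating the r_i up the tree gives the same sum as the raw readings.
print(sum(mixed.values()) == sum(readings.values()))
```

No single r_i reveals an individual reading, yet the SUM aggregate is preserved because every piece is counted exactly once.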

For SMART, the slicing technology leads to a large number of exchanged messages, which creates more opportunities for collisions to occur in the network. On the other hand, the cost is mainly on the power consumption due to the exchange of sliced data.

Thus, an integrity-protecting, private data aggregation (iPDA) scheme [58] was also proposed by He et al. to improve the integrity of the data based on the SMART scheme. This enhanced scheme also achieves data privacy for WSNs by using slicing and assembling technology. Meanwhile, data integrity is implemented by constructing two disjoint aggregation trees (Figure 7) to collect the data of interest and check for redundancy. During data transmission, the iPDA scheme makes

concealed data aggregation scheme, CDAMA, in which the ciphertexts from different applications can be encapsulated into “only” one ciphertext. Conversely, the base station can extract application-specific plaintexts via the corresponding secret keys. Basically, CDAMA is a modification of Boneh et al.’s [55] public-key PH encryption system, which integrates the Paillier [56] with the Okamoto-Uchiyama encryption schemes [57].

4.4 Data-slicing based schemes

As the old saying goes, ‘don’t put all your eggs in one basket’: when we want to protect the privacy of sensitive data, it is advisable to slice the data into blocks and transmit them over different channels.

He et al. [36] first introduced the data ‘slicing and assembling’ technique to protect privacy for data aggregation in WSNs (Slice-Mix-AggRegaTe, SMART). Then, to improve the integrity of the data based on the SMART scheme, He et al. proposed an integrity-protecting, private data-aggregation (iPDA) scheme [58]. However, the slicing technology in both the SMART and iPDA schemes leads to a large number of exchanged messages, which creates high communication overhead and computational requirements. In order to overcome these drawbacks, the ‘slicing and assembling’ technology has been modified in [59-62], but the emphasis in each case was different. The scheme proposed by Li et al. [59] achieves energy-efficient and high-accuracy data aggregation (EEHA) and preserves data privacy like the SMART scheme. It improves the performance of data aggregation by dividing the nodes into leaf nodes and intermediate nodes, whose operations are different. In [60], a random distribution was introduced into the ‘slicing and assembling’ technology to decide the number of slices, so that the number of pieces into which each node slices its private data is not fixed. In addition, the authors developed a data query mechanism to check the processing of data during the transmission to improve the accuracy of


There is the possibility that large pieces are sent out while a small piece is kept by the node itself; under this circumstance, if a collision occurs, most of the data will be lost. Therefore, aggregation accuracy will be affected. In order to improve the performance of the SMART scheme, we define a small data factor 𝐿 and send only small fragments to neighbors.

--Positive and Negative Factor: The aggregation accuracy can be improved if a negative piece is sent, because that increases the proportion of the data kept by the node itself and decreases the influence of data loss caused by collision.

--Compensation Factor: In the PEPDA scheme, an ACK message is sent to the neighbor to obtain the loss rate and calculate the compensation. That enables the node to know

each node send its private data to both the aggregation trees and achieve data aggregation separately. Comparing aggregation results from both aggregation trees, the base station can determine whether the aggregation result has been polluted by malicious nodes and, thereby, guarantee the integrity of the private data. The use of slicing technology and two aggregation trees in the iPDA scheme results in the exchange of more messages and higher communication overhead, and the aggregation accuracy is decreased because of the higher data collision rate.
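The integrity check at the base station can be sketched as a comparison of the two tree results. This is only an illustration under assumptions: the function names and the tolerance parameter threshold are hypothetical and are not specified in the text above, and the slicing step is left out.

```python
def tree_sum(readings, tampering=0):
    """Aggregate one tree; `tampering` models a value injected by a malicious node."""
    return sum(readings) + tampering

def base_station_accepts(sum_tree_a, sum_tree_b, threshold=0):
    """Accept the round only if both disjoint trees report (nearly) the same sum."""
    return abs(sum_tree_a - sum_tree_b) <= threshold

readings = [20, 25, 18, 30, 22]
honest = base_station_accepts(tree_sum(readings), tree_sum(readings))
polluted = base_station_accepts(tree_sum(readings), tree_sum(readings, tampering=40))
print(honest, polluted)
```

Because the trees are disjoint, a node that alters the aggregate on one tree cannot make the matching change on the other, so the two sums diverge.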

4.4.2 PEPDA: precision-enhanced and encryption-mixed privacy-preserving data aggregation

To the best of our knowledge, [61] provided the most comprehensive discussion about modification of SMART. In PEPDA, five factors are used to optimize the algorithms of data aggregation based on the SMART scheme, which are shown in Figure 8. In order to reduce the collision rate, a randomized time slot and a node choosing technique are developed, while to reduce collision loss, small data packet, positive and negative piece slicing, and compensation methods are presented.

-- Randomized Time Slot Factor: In order to reduce the collision rate, a random sending time schedule is used during collusion phase, instead of spontaneously sending sliced pieces at the same time.

--Partial Factor: Note that a node in the set of slice-failed nodes (SFN) cannot find enough neighbor nodes to send J−1 pieces to, so at least one edge must carry two pieces. As a remedy, we divide nodes into two subsets 𝑇 and 𝐹 based on i≥J−1, where i=1,2,…,N. The node set 𝐹 contains all SFN nodes, while the rest are in the node set 𝑇. Only nodes from the set 𝑇 participate in piece slicing and mixing. Therefore, communication overhead is cut down, and the collision rate and energy consumption are reduced.

--Small Data Factor: In SMART scheme, J-1 pieces will be forwarded to neighbors.

Fig.6 The basic idea of SMART: (a) Slicing, (b) Mixing, (c) Aggregating. In the example shown, r1=d11+d21, r2=d22+d12+d32, r3=d33+d43, r4=d44+d54, r5=d55.

Fig.7 Two disjoint aggregation trees rooted at the base station


specializes in maximum and minimum aggregation functions. KIPDA obfuscates sensitive measurements by hiding them among a set of camouflage values, enabling k-indistinguishability for data aggregation. Because the sensitive data are not encrypted, they are easily and efficiently aggregated with minimal in-network processing delay.

Let $U_i = \{v^i_1, v^i_2, \dots, v^i_n\}$ be the set of n values in the message set for node i, and $I = \{1, 2, \dots, n\}$ be the index set of $U_i$, $|I| = |U_i|$. The KIPDA scheme can be divided into four phases: pre-distribution, reporting, aggregating, and base station processing.

Step1. Pre-distribution

In the pre-distribution phase, the base station chooses a global secret set GSS, with $GSS \subset I$. This is the global secret information, and each node i keeps the elements of GSS in a set $NSS_i$ along with some noise indices drawn from the complement $\overline{GSS}$. Next, the base station determines $NSS_i^T$ and $NSS_i$ for each node i. $NSS_i^T$ denotes the index set of the real values in $U_i$ for node i, and $NSS_i^T \subset GSS$; $NSS_i$ includes all elements from GSS and a subset of elements from $\overline{GSS}$, so that $GSS \subset NSS_i$ and $\overline{GSS} \cap NSS_i \neq \emptyset$. As we can see from Figure 9, assume |I|=7, |GSS|=3, $|NSS_i|$=5, and GSS={2,4,6}; then we have

$\overline{GSS} = \{1, 3, 5, 7\}$;

$NSS_1^T = \{2\}$, $NSS_2^T = \{6\}$, $NSS_3^T = \{4\}$;

$NSS_1 = \{2, 4, 6, 1, 3\}$, $NSS_2 = \{2, 4, 6, 3, 7\}$, $NSS_3 = \{2, 4, 6, 1, 7\}$.

Step2. Reporting

In the reporting phase, each node i determines the values for the set Ui. The message set Ui contains the real values, the restricted camouflage values, and the unrestricted camouflage values. If sensed values have the range [dmin, dmax] and the real value is di, the restricted camouflage values are drawn from [dmin, di] for MAX aggregation, and from [di, dmax] for MIN aggregation, for node i. The unrestricted camouflage values are drawn from [dmin, dmax]. Node i places these values in Ui according to the corresponding positions and sends the message set Ui to its aggregator. Consider a case of MAX aggregation in Figure 9, and assume that the range of sensed data is [0,30] and that nodes 1, 2, and 3 have sensor readings 16, 28,

whether a piece has been received by a neighbor successfully and what the loss rate is; it can then compensate the aggregated data and forward the result upstream during the data aggregation phase.

4.5 Unencrypted privacy-preservation data aggregation

Based on different principles, various privacy-preserving techniques have been developed, such as distortion, encryption and limited distribution. Besides encryption, the other two techniques have also been applied to privacy preservation in data aggregation.

4.5.1 KIPDA

Data perturbation is a kind of technique which seeks to accomplish masking of individual confidential data elements while maintaining underlying aggregate relationships of the database. These techniques modify actual data values to ‘hide’ specific sensitive individual record information [63]. Groat et al. [64] presented a data perturbation based privacy-preserving aggregation method, KIPDA, which

Fig.8 Improvement outline of PEPDA (problem: low accuracy and high collision in the SMART scheme; remedies: reduce the collision rate through the randomized time slot factor and the partial factor, and reduce the collision loss through the small data factor, the positive and negative factor, and the compensation factor)

Fig.9 Data aggregation in KIPDA (nodes N1, N2 and N3 report to the base station; GSS={2,4,6})


a hybrid PHA scheme. In this survey, we only discuss the basic version, b-PHA. With b-PHA, a data-targeted query is performed in two steps: first, the distribution of all sensor readings (i.e., the sensor data histogram) is queried; second, the answer to the particular query is computed based on the histogram.
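The two-step query can be illustrated with a small sketch. It shows only the histogram-then-answer idea; the concealment of individual readings used by the actual scheme is omitted, and the bin layout, function names and the midpoint estimate are assumptions made for illustration.

```python
def build_histogram(readings, lo, hi, bins):
    """Step 1: count how many readings fall into each equal-width bin of [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in readings:
        idx = min(int((x - lo) / width), bins - 1)  # clamp x == hi into the last bin
        counts[idx] += 1
    return counts

def approx_sum(counts, lo, hi):
    """Step 2: estimate SUM by placing every reading at its bin midpoint."""
    width = (hi - lo) / len(counts)
    return sum(c * (lo + (i + 0.5) * width) for i, c in enumerate(counts))

def approx_max(counts, lo, hi):
    """Step 2: estimate MAX as the upper edge of the highest non-empty bin."""
    width = (hi - lo) / len(counts)
    top = max(i for i, c in enumerate(counts) if c)
    return lo + (top + 1) * width

readings = [16, 28, 12]
counts = build_histogram(readings, lo=0, hi=30, bins=6)
print(counts, approx_sum(counts, 0, 30), approx_max(counts, 0, 30))
```

The estimate of SUM differs from the true value (56 here) by at most half a bin width per reading, which is why histogram-based schemes trade accuracy for privacy.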

Based on the data concealment in histogram-based aggregation described above, another PHA scheme has been proposed in [68], which focuses on reconciling privacy preservation and intrusion detection in sensory data aggregation. The proposed scheme can detect ill-performed aggregation without knowing the actual content of sensory data, and therefore allows sensory data to be kept concealed.

V. COMPARISONS

In the previous sections, we presented a comprehensive survey of the existing techniques for each category of problems. In this section, we horizontally compare all privacy-preserving data aggregation techniques that have been reviewed in this paper. We use the following metrics to evaluate the performance of each scheme.

Privacy Protection Ability (PPA): Since privacy is one of the main design principles of a privacy-preserving data aggregation protocol, the first metric we use is data privacy, which refers to the degree of privacy protection provided by the reviewed techniques. For example, some techniques (e.g., all the PH-based schemes and KIPDA) can provide perfect protection for privacy, while the PPA of most encryption-based protocols depends on the security of the shared key.

Accuracy (ACU): The final decision at the sink is based on the aggregation result obtained from the BS. This metric is the deviation of the aggregated value from the real value of the sensor data. For example, PHA has low accuracy because it only presents an approximate data histogram, and CPDA’s accuracy depends on the data loss rate.

Communication cost (CMC): This is the number of messages generated in the given

and 12 respectively. In this way,

U1 = {2, 16, 15, 8, 23, 6, 16}

U2 = {29, 22, 16, 5, 9, 28, 26}

U3 = {8, 1, 18, 12, 21, 10, 3}

Step3. Aggregation

In the aggregation phase, each node i computes the MAX (or MIN) for each index $l \in \{1, 2, \dots, n\}$ over the values $v^j_l$ of all child nodes j, plus its own $v^i_l$ if it is also a sensing node. The aggregator i replaces the values $v^i_l$ in $U_i$ with the aggregated values ($v^i_l = \max(\text{or } \min)(v^h_l, v^i_l), \forall h$), and then passes the aggregated message set to its next hop. As shown in Figure 9, the aggregated message set is U1={29,22,18,12,23,28,26}.

Step4. Base Station Processing

The final aggregated message set, UΩ, arrives at the base station and the aggregated result is computed by selecting the MAX (or MIN) from the values indexed by GSS in UΩ, $\max(\min)_{k \in GSS}(v^f_k)$. Hence, the ultimate aggregation result in Figure 9 is MAX(22,12,28)=28.
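The aggregation and base station steps of the Figure 9 example can be reproduced with a short sketch. The message sets and GSS are the ones given in the text; the function names are illustrative, and indices are 1-based as in the paper.

```python
def aggregate(message_sets, op=max):
    """Step 3: position-wise MAX (or MIN) over all message sets."""
    return [op(column) for column in zip(*message_sets)]

def base_station_result(final_set, gss, op=max):
    """Step 4: apply MAX (or MIN) only to the positions indexed by GSS."""
    return op(final_set[k - 1] for k in gss)

U1 = [2, 16, 15, 8, 23, 6, 16]     # real reading 16 at index 2
U2 = [29, 22, 16, 5, 9, 28, 26]    # real reading 28 at index 6
U3 = [8, 1, 18, 12, 21, 10, 3]     # real reading 12 at index 4
GSS = {2, 4, 6}

final_set = aggregate([U1, U2, U3])
print(final_set)                            # [29, 22, 18, 12, 23, 28, 26]
print(base_station_result(final_set, GSS))  # 28
```

The aggregator works on plain values and cannot tell which positions carry real readings; only the base station, which knows GSS, discards the unrestricted camouflage positions (here 29, 18, 23 and 26).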

4.5.2 PHA

The differential privacy approach distorts data to preserve privacy by adding noise. Histograms, which provide useful summaries of a dataset, are a basic technique for achieving differential privacy [65-66].

Zhang et al. [67,68] proposed two perturbed histogram-based aggregation (PHA) schemes to achieve privacy-preserving data aggregation in WSNs.

The first scheme is GP2S: Generic Privacy-Preservation Solutions for approximate aggregation [67]. The basic idea of GP2S is to generalize the values of data transmitted in a WSN, such that although individual data content cannot be decrypted, the aggregator can still obtain an accurate estimate of the histogram of the data distribution, and thereby approximate the aggregates. The authors presented three versions of the protocol, developed step by step. The basic version of the PHA scheme is called b-PHA, and to reduce the bandwidth consumption Zhang et al. further designed advanced PHA schemes, namely a function-assisted PHA scheme and


The delay is lower in end-to-end encrypted data aggregation and higher in hop-by-hop encrypted data aggregation, due to the decryption at each aggregator.

Aggregation Function (AF): This metric indicates how many aggregation functions the aggregation scheme can support, among Sum, Average, Count, Standard deviation, Max, Min, Variance, Histogram and Median. There are two classes: Numerous and Few.

Data Integrity (DI): It guarantees that the data has not been altered during the transmission of data from the sensor node to the BS. If a protocol supports data integrity, its DI is Yes; otherwise the DI value is No. For example, the DI values of iPDA, iCPDA and RCDA are labeled as Yes, because all of them support data integrity. The DI value of SMART is No because it does not support this feature.

Data Recoverable (DR): This is a metric used to check whether a protocol allows the base station to recover each sensing datum generated by the sensors even if these data have been aggregated by aggregators. The DR values of RCDA and KIPDA are Yes be-

WSNs. To evaluate CMC, we use High, Medium and Low. The communication costs of protocols belong to High, Medium and Low when the number of messages generated (m) per sensor node is m ≥ 3, 3 > m > 1, and m = 1, respectively.
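The thresholds above translate directly into a small classification rule. The function name is illustrative, and rejecting m < 1 is an assumption, since the text does not define a level for that case.

```python
def cmc_level(m):
    """Map the number of messages generated per sensor node to a CMC level."""
    if m >= 3:
        return "High"
    if m > 1:
        return "Medium"
    if m == 1:
        return "Low"
    raise ValueError("CMC is not defined for m < 1")

print([cmc_level(m) for m in (1, 2, 3, 5)])
```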

Computation cost (CPC): This is the processing overhead incurred by the processor to achieve privacy-preserving data aggregation. The values are High, Medium and Low. High: a sensor node performs many encryption/decryption, arithmetic and other operations. Medium: a node performs a couple of encryption/decryption operations and some arithmetic operations. Low: a sensor node performs few arithmetic operations and one encryption or decryption.

Data pollution (DP): This is a logical operator for detecting the malicious modification of sensor data. If the malicious modification is detected, the metric gives Yes; otherwise it gives No. Because only the scheme in [68] supports this feature, it is labeled Yes.

Delay (DLY): This is the time taken to get the sensed data from the source to the sink.

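Table 1 Comparison result of privacy-preserving data aggregation protocols for WSNs

| Class | Technique | Protocol | PPA | ACU | CMC | CPC | DP | DLY | AF | DI | DR | MA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Encrypted protocols | Hop-by-hop encryption | Bista et al.’s scheme | H | H | L | L | N | L | F | N | N | N |
| Encrypted protocols | Hop-by-hop encryption | iHDA | H | H | L | M | N | L | F | Y | N | N |
| Encrypted protocols | Secure Multi-party Computation | CPDA | H | H | H | H | N | M | F | N | N | N |
| Encrypted protocols | Secure Multi-party Computation | iCPDA | H | H | H | H | N | M | F | Y | N | N |
| Encrypted protocols | Secure Multi-party Computation | ESPART | H | H | H | H | N | M | F | N | N | N |
| Encrypted protocols | Secure Multi-party Computation | Jung et al.’s scheme | H | M | M | M | N | M | N | N | N | N |
| Encrypted protocols | Privacy Homomorphism (Symmetric PH) | CDA | H | H | L | L | N | L | F | N | N | N |
| Encrypted protocols | Privacy Homomorphism (Symmetric PH) | Castelluccia et al.’s scheme | H | H | L | L | N | L | F | N | N | N |
| Encrypted protocols | Privacy Homomorphism (Asymmetric PH) | EC-EG | H | H | L | L | N | H | F | N | N | N |
| Encrypted protocols | Privacy Homomorphism (Asymmetric PH) | RCDA | H | H | M | H | Y | H | N | Y | Y | N |
| Encrypted protocols | Privacy Homomorphism (Asymmetric PH) | CDAMA | H | H | H | H | N | H | F | N | N | Y |
| Encrypted protocols | Data-Slicing | SMART | H | H | H | L | N | M | F | N | N | N |
| Encrypted protocols | Data-Slicing | iPDA | H | H | H | M | N | M | F | Y | N | N |
| Encrypted protocols | Data-Slicing | PEPDA | H | H | H | M | N | M | F | N | N | N |
| Unencrypted protocols | Data perturbation based | KIPDA | H | H | M | L | Y | L | N | Y | Y | N |
| Unencrypted protocols | Data perturbation based | GP2S | H | L | L | M | N | L | N | N | N | N |
| Unencrypted protocols | Data perturbation based | Zhang et al.’s scheme | H | L | L | M | Y | L | N | N | N | N |

Legend: H = High; M = Medium; L = Low; N = No; Y = Yes (in the AF column, N = Numerous and F = Few)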


other metrics. That means one of the main challenges for the future design of privacy-preserving data aggregation techniques is how to make a proper tradeoff between different metrics, such as the privacy protection ability, communication cost, computation cost, and aggregation accuracy.

Second, the detection of malicious modifications of sensor data on their way to the sink is crucial for achieving a true aggregation result. However, in most existing privacy-preserving data aggregation schemes, the actual content of sensory data must change its appearance (by encrypting, concealing, slicing or perturbing) to preserve privacy, and this data concealment feature may be abused by compromised sensors to modify or ill-process data without being caught. Hence, reconciling privacy preservation and intrusion detection is a challenge of primary importance.

Most solutions require an initialization phase during which participants request keys from key issuers via a secure channel. But in many real scenarios, such a secure channel does not exist. Accordingly, how to design efficient protocols without relying on a trusted authority and secure pair-wise communication channels remains an open problem.

Different applications raise different privacy-preserving requirements. Therefore, how to customize an appropriate privacy-preserving data aggregation scheme for various domains is the main emphasis of our further research.

VII. CONCLUSIONS

In this paper, we have presented a comprehensive survey on privacy-preserving data aggregation techniques in WSNs. In the literature, there are surveys related to privacy-preserving data aggregation in WSNs, such as the works of Bista et al. [27], Jose et al. [69] and Li et al. [7]; however, none of these works covers the broad set of studies in the state of the art regarding privacy-preserving data aggregation in WSNs as we do. We first discussed some of the related works based on different concepts

cause these two schemes support this feature.

Multiple Applications (MA): Most existing schemes were designed for single-application WSNs; in these protocols the ciphertexts of different applications cannot be aggregated together, otherwise the decrypted aggregated result will be incorrect. Only the CDAMA scheme can be used in a multi-application environment, so the MA value of CDAMA is Yes.

Based on the comparison results in Table 1, we make the following observations. First, the encrypted protocols must rely on some key distribution mechanism, and an efficient key establishment scheme is usually more complex to implement. For example, asymmetric PH-based schemes, such as RCDA and CDAMA, incur high transmission delay to distribute keys between each node and the BS. Second, both the secure multi-party computation based solutions and the data-slicing based solutions, which slice real data, often consume much more communication than solutions that conceal the data. Solutions such as CPDA and SMART require nodes to generate multiple messages and exchange them with neighbors, so the communication costs of these protocols are always high. Third, in hop-by-hop encryption based schemes, encryption and decryption are performed at each hop, so they inevitably consume more power than end-to-end solutions. Another observation is that almost all data aggregation techniques shown in Table 1 can provide perfect protection for the privacy of data collected from individual sensors. However, some solutions, especially the encrypted ones, require an essential precondition: secure pair-wise communication channels. In every case, the benefit of privacy protection usually comes at the cost of other metrics.

VI. OPEN PROBLEMS

In this section, we list some important open problems which need further investigation in future work.

As our analysis above shows, the benefit of privacy protection usually comes at the cost of


[5] Wu Yanwei, Li Xiangyang, Liu Yunhao, et al. Energy-Efficient Wake-Up Scheduling for Data Collection and Aggregation [J]. IEEE Transactions on Parallel and Distributed Systems, 2010, 21(2): 275–287.

[6] ARABO A, BROWN I, EL-MOUSSA F. Privacy in the Age of Mobility and Smart Devices in Smart Homes[C]// Proceedings of the 2012 International Conference on Privacy, Security, Risk and Trust (PASSAT’12), 2012:819–826.

[7] LI Na, ZHANG Nan, SAJAL K D, et al. Privacy preservation in wireless sensor networks: A state-of-the-art survey [J]. Ad Hoc Networks, 2009, 7: 1501-1514.

[8] CARBUNAR B, Yu Yang, Shi Larry, et al. Query privacy in wireless sensor networks[C] // Proceeding of the 4th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks, 2007 (SECON’07). San Diego, CA, 2007: 203–212.

[9] Li Xiangyang, TAEHO J. Search Me If You Can: Privacy-preserving Location Query Service[C]// Proceedings of the 32nd IEEE International Conference on Computer Communications (IEEE INFOCOM’13), 2013:2760–2768.

[10] TSOU Y, LU ChunShien, KOU Syyen. Privacy-and integrity-preserving range query in wireless sensor networks[C] // Proceedings of the 2012 IEEE Global Communications Conference (GLOBECOM’12), 2012:328–334.

[11] LIAO Xiaojing, LI Jianzhong. Privacy-preserving and Secure Top-k Query in Two-tier Wireless Sensor Network[C]// Proceedings of the 2012 IEEE Global Communications Conference (GLOBECOM’12), 2012:335–341.

[12] YI Yeqing, LI Rui, CHEN Fei. A Digital Watermarking Approach to Secure and Precise Range Query Processing in Sensor Networks[C]//Proceedings of the 32nd IEEE International Conference on Computer Communications (IEEE INFOCOM’13), 2013:1950–1958.

[13] OZTURK C, ZHANG Yanyong, TRAPPE W. Source-location privacy in energy-constrained sensor network routing[C]// Proceedings of the 2nd ACM workshop on Security of Ad hoc and Sensor Networks (SASN ’04), 2004:88–93.

[14] KAMAT P, ZHANG Yanyong, TRAPPE W, et al. Enhancing source-location privacy in sensor network routing[C]// Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS’05), 2005:599–608.

[15] LUO Xi, JI Xu, PARK M. Location privacy against traffic analysis attacks in wireless sensor networks[C]// Proceedings of 2010 International Conference on Information Science and Applications (ICISA’10), 2010:1–6.

[16] SHI Rui, GOSWAMI M, GAO Jie, et al. Is Random Walk Truly Memoryless - Traffic Analysis and Source Location Privacy under Random Walks[C]// Proceedings of the 32nd IEEE Inter-

such as location privacy, timing privacy, and query privacy. Then, we included a classification of the existing privacy-preserving data aggregation techniques based on the core techniques of the solutions. Next, we discussed and explained over 30 solutions that we found in the literature that provide privacy-preserving data aggregation. We also attempted to compare the existing techniques in terms of such metrics as privacy, accuracy, communication cost and aggregation function. Based on the comparison results, we presented an overview of the solutions, which type of privacy-preserving technique each adopts, and whether the solution is actually applicable for data aggregation in WSNs. Furthermore, based on the existing work, we listed a number of open issues which may be in need of future research.

We believe that our work can serve as a good starting point for researchers to design more effective and more available protocols for achieving privacy-preserving data aggregation in WSNs.

ACKNOWLEDGEMENTS

This work was supported in part by the National Natural Science Foundation of China (No. 61272084, 61202004), the Natural Science Foundation of Jiangsu Province (No. BK20130096), and the Project of Natural Science Research of Jiangsu University (No. 14KJB520031, No. 11KJA520002).

References

[1] AKYILDIZ I F, SU W, SANKARASUBRAMANIAM Y, et al. Wireless sensor networks: a survey [J]. Computer Networks, 2002, 38(4):393-422.

[2] RAMESH R, PRAMOD K., Varshney. Data aggregation techniques in sensor networks: A survey [J]. IEEE Comm. Surveys & Tutorials, 2006(8): 48-63.

[3] Kim J, Lin X, Shroff N B and Sinha P. Minimizing Delay and Maximizing Lifetime for Wireless Sensor Networks With Anycast [J]. IEEE/ACM Transaction on Networking, 2010, 18(2):515–528.

[4] XU Xiaohua, LI Xiangyang, MAO Xufang, et al. A Delay-Efficient Algorithm for Data Aggregation in Multi-hop Wireless Sensor Networks[J]. IEEE Transactions on Parallel and Distributed Systems, 2011, 22(1):163–175.


data aggregation scheme for protecting integrity in wireless sensor networks[C]// Proceedings of 10th IEEE International Conference on Computer and Information Technology, 2010: 2463–2470.

[30] BLA E-O, ZITTERBART M. An efficient key establishment scheme for secure aggregating sensor networks[C]//Proceedings of the 1st ACM Symposium on Information, Computer and Communications Security, 2006:303–310.

[31] BUTZ A R. Alternative algorithm for Hilbert’s space filling curve[J]. IEEE Transactions on Computers, 1971,20(4):424–426.

[32] KIM Y, LEE H, YOON M, et al. Hilbert-Curve Based Data Aggregation Scheme to Enforce Data Privacy and Data Integrity for Wireless Sensor Networks[J]. International Journal of Distributed Sensor Networks, 2013, Article ID 217876, 14 pages.

[33] PANTHACHAI Y, KEERATIWINTAKORN P. An energy model for transmission in Telos-based wireless sensor networks[C] // Proceedings of the International Joint Conference on Computer Science and Software Engineering (JCSSE ‘07), 2007.

[34] YAO A. Protocols for secure computations[C]// Proceedings of the 23rd Annual Symposium on Foundations of Computer Science, 1982:160-164.

[35] SHEIKH R, KUMMAR B, MISHRA D. Privacy preserving k secure sum protocol[J]. International Journal of Computer Science and Information Security, 2009,6(2):184-188.

[36] HE Wenbo, LIU Xue, Nguyen H, et al. Pda: Privacy-preserving data aggregation in wireless sensor networks[C]// Proceedings of the 26th IEEE International Conference on Computer Communications (IEEE INFOCOM’07). 2007:2045–2053.

[37] LAURENT E, GLIGOR D. A key-management scheme for distributed sensor networks[C]//Proceedings of the 9th ACM Conference on Computer and Communications Security, 2002:41–47.

[38] HE Wenbo, LIU Xue, Nguyen H, et al. A Cluster-based Protocol to Enforce Integrity and Preserve Privacy in Data Aggregation[C] //Proceedings of the 29th IEEE International Conference on Distributed Computing Systems Workshops, 2009:14-19.

[39] YANG Geng, WANG Anqi, CHEN Zhengyu, et al. An energy-saving privacy-preserving data aggregation algorithm[J]. Chinese Journal of Computers, 2011,34:792–800.

[40] SEN J. Secure and Privacy-Preserving Data Aggregation Protocols for Wireless Sensor Networks[J]. Cryptography and Security in Computing, 2012, 3:133-164.

[41] Taeho J, MAO Xufei, LI Xiangyang, et al. Privacy-Preserving Data Aggregation without Secure Channel: Multivariate Polynomial Evaluation[C]

national Conference on Computer Communications (IEEE INFOCOM’13), 2013:3021-3029.

[17] KARP B, KUNG H. Gpsr: greedy perimeter stateless routing for wireless networks[C]// Proceedings of the 6th Annual International Conference on Mobile Computing and Networking (MOBICOM’00), 2000:243–254.

[18] SHAIKH R, JAMEEL H, BRAIN J, et al. Achieving network level privacy in wireless sensor networks [J]. Sensors, 2010,10 (3):1447–1472.

[19] LIGHTFOOT L, LI Yun, REN Jian. Preserving source-location privacy in wireless sensor network using star routing[C]// Proceedings of 2010 IEEE Global Telecommunications Conference (GLOBECOM’10), 2010:1–5.

[20] SHAO Min, YANG Yi, ZHU Sencun, et al. Towards statistically strong source anonymity for sensor networks[C]// Proceedings of the 27th IEEE International Conference on Computer Communications (IEEE INFOCOM’08), 2008:51–55.

[21] ALOMAIR B, CLARK A, CUELLAR J, et al. Towards a statistical framework for source anonymity in sensor networks[J]. IEEE Transactions on Mobile Computing, 2013,12(2):248–260.

[22] YING Jian, CHEN Shigang, ZHANG Zhan, et al. Protecting receiver-location privacy in wireless sensor networks[C]// Proceedings of the 26th IEEE International Conference on Computer Communications (IEEE INFOCOM’07), 2007: 1955–1963.

[23] NEZHAD A, MAKRAKIS D, MIRI A. Anonymous topology discovery for multihop wireless sensor networks[C]// Proceedings of the 3rd ACM workshop on QoS and security for wireless and mobile networks (Q2SWinet ‘07), 2007:78–85.

[24] LI Xinfeng, WANG Xiaoyuan, ZHENG Nan, et al. Enhanced Location Privacy Protection of Base Station in Wireless Sensor Networks[C]// Proceedings of the 5th International Conference on Mobile Ad-hoc and Sensor Networks (MSN’09), 2009:457–464.

[25] HONG Xiaoyan, WANG Pu, KONG Jiejun, et al. Effective probabilistic approach protecting sensor traffic[C]// Proceedings of IEEE Military Communications Conference (MILCOM’05), 2005:169–175.

[26] PANDURANG K, XU Wenyuan, TRAPPE W. Temporal Privacy in Wireless Sensor Networks[C]// Proceedings of International Conference on Distributed Computing Systems. 2007:25-27.

[27] BISTA R, CHANG J W. Privacy-Preserving Data Aggregation Protocols for Wireless Sensor Networks: A Survey[J]. Sensors, 2010,10(5):4577-4601.

[28] CHEN C M, LIN Y H, LIN Y C, et al. RCDA: Recoverable Concealed Data Aggregation for Data Integrity in Wireless Sensor Networks[J]. IEEE Transactions on Parallel and Distributed Systems, 2012,23(4):727-734.

[29] BISTA R, YOO H K, CHANG J W. A new sensitive

// Proceedings of the 32nd IEEE International Conference on Computer Communications (IEEE INFOCOM’13), 2013:2634–2642.

[42] RIVEST R L, ADLEMAN L, DERTOUZOS M L. On data banks and privacy homomorphisms[J]. Foundations of Secure Computation (Academic Press, New York), 1978:169–179.

[43] FERRER J D. A new privacy homomorphism and applications[J]. Information Processing Letters, 1996,60(5):277–282.

[44] FERRER J D. A provably secure additive and multiplicative privacy homomorphism[C]// Proceedings of the 5th International Conference on Information Security, 2002:471–483.

[45] GIRAO J, SCHNEIDER M, WESTHOFF D. CDA: Concealed data aggregation in wireless sensor networks[C]// Proceedings of ACM Workshop on Wireless Security, 2004.

[46] GIRAO J, WESTHOFF D, SCHNEIDER M. CDA: Concealed data aggregation for reverse multicast traffic in wireless sensor networks[C]// Proceedings of IEEE International Conference on Communications (ICC’05), 2005:3044–3049.

[47] WESTHOFF D, GIRAO J, ACHARYA M. Concealed Data Aggregation for Reverse Multicast Traffic in Sensor Networks: Encryption, Key Distribution, and Routing Adaptation[J]. IEEE Transactions on Mobile Computing, 2006,5(10):1417–1431.

[48] PETER S, LANGENDORFER P, PIOTROWSKI K. On Concealed Data Aggregation for Wireless Sensor Networks[C]// Proceedings of 4th IEEE Conference on Consumer Communication and Networking (CCNC’07), 2007.

[49] CASTELLUCCIA C, MYKLETUN E, TSUDIK G. Efficient Aggregation of Encrypted Data in Wireless Sensor Networks[C]// Proceedings of the Second Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services (Mobiquitous’05), 2005:109–117.

[50] MYKLETUN E, GIRAO J, WESTHOFF D. Public Key Based Crypto schemes for Data Concealment in Wireless Sensor Networks[C]// Proceedings of the IEEE International Conference on Communications, 2006,5:2288–2295.

[51] ELGAMAL T. A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms[C]// Proceedings of the Annual International Cryptology Conference (CRYPTO’85), 1985,31(4):469–472.

[52] CHEN Chienming, LIN Yuehsun, LIN Yaching, et al. RCDA: Recoverable Concealed Data Aggregation for Data Integrity in Wireless Sensor Networks[J]. IEEE Transactions on Parallel and Distributed Systems, 2011,22(4):727–734.

[53] BONEH D, GENTRY C, LYNN B, et al. Aggregate and Verifiably Encrypted Signatures from Bilinear Maps[C]// Proceedings of the 22nd International Conference on Theory and Applications of Cryptographic Techniques (Eurocrypt), 2003:416–432.

[54] LIN Y, CHANG S, SUN H. CDAMA: Concealed Data Aggregation Scheme for Multiple Applications in Wireless Sensor Networks[J]. IEEE Transactions on Knowledge and Data Engineering, 2013,25(7):1471–1483.

[55] BONEH D, GOH E, NISSIM K. Evaluating 2-DNF Formulas on Ciphertexts[C]// Proceedings of the 2nd International Conference on Theory of Cryptography, 2005:325–341.

[56] PAILLIER P. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes[C]// Proceedings of the 17th International Conference on Theory and Application of Cryptographic Techniques, 1999:223–238.

[57] OKAMOTO T, UCHIYAMA S. A New Public-Key Cryptosystem as Secure as Factoring[C]// Proceedings of the International Conference on Theory and Application of Cryptographic Techniques, 1998:308–318.

[58] HE Wenbo, NGUYEN H, LIU Xue, et al. iPDA: An integrity-protecting private data aggregation scheme for wireless sensor networks[C]// Proceedings of the IEEE Military Communications Conference, 2008:1–7.

[59] LI Hongjuan, LIN Kai, LI Keqiu. Energy-efficient and high-accuracy secure data aggregation in wireless sensor networks[J]. Computer Communications, 2011,34(4):591–597.

[60] LIU Chenxu, LIU Yun, ZHANG Zhenjian, et al. High energy-efficient and privacy-preserving secure data aggregation for wireless sensor networks[J]. International Journal of Communication Systems, 2013,34(26):380–394.

[61] YANG Geng, LI Seng, XU Xiaolong, et al. Precision-enhanced and encryption-mixed privacy-preserving data aggregation in wireless sensor networks[J]. International Journal of Distributed Sensor Networks, 2013, Article ID 427275, 12 pages.

[62] SHI Jing, ZHANG Rui, LIU Yunzhong, et al. PriSense: Privacy-Preserving Data Aggregation in People-Centric Urban Sensing Systems[C]// Proceedings of the 29th Conference on Computer Communications (INFOCOM 2010), 2010:1–9.

[63] WILSON R, ROSEN P. Protecting Data through ‘Perturbation’ Techniques: The Impact on Knowledge Discovery in Databases[J]. Journal of Database Management, 2003,14(2):14–26.

[64] GROAT M, HE W, FORREST S. KIPDA: K-indistinguishable privacy-preserving data aggregation in wireless sensor networks[C]// Proceedings of the 30th IEEE International Conference on Computer Communications (IEEE INFOCOM’11), 2011:2024–2032.

[65] DWORK C. Differential Privacy: A Survey of Results[J]. TAMC 2008, LNCS 4978, 2008:1–19.

[66] XIONG Ping, ZHU Tianqing, WANG Xiaofeng. A Survey on Differential Privacy and Applications[J]. Chinese Journal of Computers, 2014,37(1):101–122.

[67] ZHANG Wensheng, WANG Chuang, FENG Taiming. GP2S: Generic Privacy-Preservation Solutions for Approximate Aggregation of Sensor Data[C]// Proceedings of the 6th Annual IEEE International Conference on Pervasive Computing and Communications (PerCom’08), 2008:179–184.

[68] WANG Chuang, WANG Guiling, ZHANG Wensheng, et al. Reconciling Privacy Preservation and Intrusion Detection in Sensory Data Aggregation[C]// Proceedings of the Mini-Conference at 30th IEEE International Conference on Computer Communications (IEEE INFOCOM’11), 2011:336–340.

[69] JOSE J, PRINCY M, JOSE J. Integrity Protecting and Privacy Preserving Data Aggregation Protocols in Wireless Sensor Networks: A Survey[J]. International Journal of Computer Network and Information Security, 2013,7:66–74.

Biographies

Xu Jian received the M.S. degree from Nanjing University of Posts & Telecommunications, China in 2008. He is currently a Ph.D. candidate in Information Security. His research interests include information security and privacy preservation. Email:[email protected].

Yang Geng received his Ph.D. degree in computer science from Laval University, Canada, in 1994 and did postdoctoral work at Montreal University, Canada, from 1994 to 1996. He is a professor at Nanjing University of Posts & Telecommunications, China. His research interests include network security, parallel and distributed computing, and mobile computing. Professor Yang is a senior member of the China Computer Federation, a member of the IEEE Computer Society, and a Standing Member of the Chinese Computer Education Society. Email:[email protected].

Chen Zhengyu received his M.S. degree from Nanjing University of Posts & Telecommunications, China in 2006. He is an associate professor at Jinling Institute of Technology. Currently, he is a Ph.D. candidate in Information Security. His research interests include information security and privacy preservation.

Wang Qianqian received the M.S. degree from Nanjing University of Posts & Telecommunications, China in 2007. She is currently a lecturer at Jinling Institute of Technology, China. Her research interests focus on wireless communications and information security.