Clustering Models in Secure Clustered Multiparty...

14
Clustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi * , Stelvio Cimato, and Ernesto Damiani Universit` a degli Studi di Milano Milan, Italy {sedigheh.abbasi, stelvio.cimato, ernesto.damiani}@unimi.it Abstract Secure Multiparty Computation (SMC) is well-known to have serious scalability issues. In fact, in spite of many significant achievements reported in SMC literature, SMC protocols are seldom efficiently scalable to real-world applications. In this paper we propose a new scheme, called Se- cure Clustered Multiparty Computation (SCMC) to improve the efficiency of SMC by reducing the number of participant through clustering. SCMC improves efficiency at the expense of privacy of participants with respect to other members of the same cluster. The paper analyzes different cluster- ing models and presents some protocols suitable for SCMC schemes. Keywords: Secure Multiparty Computation (SMC), Secure Clustered Multiparty Computation (SCMC), Trust-Based Intra-Cluster Computation 1 Introduction Secure Multiparty Computation (SMC) was originally introduced by Yao in 1982 as a way to compute a function whose arguments are partitioned between two participants that are not willing to share them. Afterwards, the problem was extended to the multiparty case and more general protocols were proposed [1, 2, 3]. Generally speaking, the term SMC now applies to any protocol in which a number of partici- pants need to jointly compute a function over their private inputs in such a way that no information other than the final result of the computation is inferable from the output. As a very significant result achieved over the years, it has been proven that any multiparty functionality can be securely computed: in the presence of any number of corrupted participants (assuming enhanced trapdoor permutations), allowing abort [1, 2]. Assuming an honest majority, full security is achievable (whenever private channels are available) [4, 3]. Despite these significant theoretical achievements, handling complexity of SMC is still an open issue. To the best of our knowledge, practical applications of SMC have been only sporadically reported in the literature. As the first large-scale practical experiment, SMC was used to implement a secure sugar-beet auction in Denmark [5]. Also, SMC has been utilized in supply chain management to solve a shared optimization problem in which the privacy of the users’ inputs is preserved [6]. However, in both cases a huge time complexity of the computations has been reported. Since the number of participants of a SMC protocol plays an important role in the efficiency of the protocol, a significant decrease in the number of participants would effectively improve the efficiency of SMC. This is in fact the main objective of a Secure Clustered Multiparty Computation (SCMC) protocol. Indeed, the general idea behind SCMC (as proposed in our previous work [7]) is to apply SMC protocol over a number of participants previously reduced by some clustering techniques. In other words, SCMC firstly clusters participants based on some reasonable criteria and then engages clusters’ representatives instead of individual participants in the SMC protocol. Since the representatives are supposed to take part in the SMC protocol on behalf of their cluster’s members, an additional computation phase is needed Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications, volume: 4, number: 2, pp. 63-76 * Corresponding author: Via Teresine 8, Crema, CR, Italy, 26013, Tel: +39-393-2317102 63

Transcript of Clustering Models in Secure Clustered Multiparty...

Page 1: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in Secure Clustered Multiparty Computation

Sedigheh Abbasi∗, Stelvio Cimato, and Ernesto DamianiUniversita degli Studi di Milano

Milan, Italy{sedigheh.abbasi, stelvio.cimato, ernesto.damiani}@unimi.it

Abstract

Secure Multiparty Computation (SMC) is well-known to have serious scalability issues. In fact,in spite of many significant achievements reported in SMC literature, SMC protocols are seldomefficiently scalable to real-world applications. In this paper we propose a new scheme, called Se-cure Clustered Multiparty Computation (SCMC) to improve the efficiency of SMC by reducing thenumber of participant through clustering. SCMC improves efficiency at the expense of privacy ofparticipants with respect to other members of the same cluster. The paper analyzes different cluster-ing models and presents some protocols suitable for SCMC schemes.

Keywords: Secure Multiparty Computation (SMC), Secure Clustered Multiparty Computation (SCMC),Trust-Based Intra-Cluster Computation

1 Introduction

Secure Multiparty Computation (SMC) was originally introduced by Yao in 1982 as a way to computea function whose arguments are partitioned between two participants that are not willing to share them.Afterwards, the problem was extended to the multiparty case and more general protocols were proposed[1, 2, 3]. Generally speaking, the term SMC now applies to any protocol in which a number of partici-pants need to jointly compute a function over their private inputs in such a way that no information otherthan the final result of the computation is inferable from the output. As a very significant result achievedover the years, it has been proven that any multiparty functionality can be securely computed: in thepresence of any number of corrupted participants (assuming enhanced trapdoor permutations), allowingabort [1, 2]. Assuming an honest majority, full security is achievable (whenever private channels areavailable) [4, 3]. Despite these significant theoretical achievements, handling complexity of SMC is stillan open issue. To the best of our knowledge, practical applications of SMC have been only sporadicallyreported in the literature. As the first large-scale practical experiment, SMC was used to implement asecure sugar-beet auction in Denmark [5]. Also, SMC has been utilized in supply chain management tosolve a shared optimization problem in which the privacy of the users’ inputs is preserved [6]. However,in both cases a huge time complexity of the computations has been reported.Since the number of participants of a SMC protocol plays an important role in the efficiency of theprotocol, a significant decrease in the number of participants would effectively improve the efficiency ofSMC. This is in fact the main objective of a Secure Clustered Multiparty Computation (SCMC) protocol.Indeed, the general idea behind SCMC (as proposed in our previous work [7]) is to apply SMC protocolover a number of participants previously reduced by some clustering techniques. In other words, SCMCfirstly clusters participants based on some reasonable criteria and then engages clusters’ representativesinstead of individual participants in the SMC protocol. Since the representatives are supposed to takepart in the SMC protocol on behalf of their cluster’s members, an additional computation phase is needed

Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications, volume: 4, number: 2, pp. 63-76∗Corresponding author: Via Teresine 8, Crema, CR, Italy, 26013, Tel: +39-393-2317102

63

Page 2: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

to aggregate the private inputs of the clusters’ members at their representatives’ sites before the start ofSMC protocol.Of course, the increase in SMC efficiency brought about by SCMC is achieved at the expense of pri-vacy of the participants; since intra-cluster computations will rely on a relaxed privacy model, while theSMC protocol will be executed over clusters’ representatives among which privacy assumptions will bepreserved. In this paper, we consider two different privacy models for the SCMC protocol: a very strictone among the clusters and a reasonably relaxed one inside the clusters. The subject of privacy will becarefully addressed in Section 3.The execution of a SCMC-enabled SMC protocol is divided into two main phases: Semi-Secure ClusteredComputation (SSCC) phase and SMC phase. All operations related to participants clustering, represen-tatives selection and private inputs aggregation (intra-cluster computation) are carried out in the SSCCphase. Some experimental results are shown that support our claim of an increased efficiency of SMCwith a controlled and acceptable loss of privacy. Thus, the SMC phase deals with the execution of a SMCprotocol using the clusters’ representatives as the participants of the protocol. In Section 2, some of theexisting SMC solutions will be reviewed; however, the details of this phase are outside the scope of ourour study.The main contribution of of this paper is to address the issues related to the participants’ clustering ina SCMC protocol. Clearly, these issues are independent of the specific SMC protocol that is run in thelast phase of the protocol. In Section 4, we analyze some of our proposed computational models adopt-able in SSCC phase describing a privacy-preserving clustering algorithm that is effectively applicable inthe context of SCMC and is proved secure against passive adversaries. In short, we assume honest-but-curious participants that faithfully follow the protocols while trying to gather information about the otherparticipants’ private inputs utilizing the intermediate messages they receive.

2 Related Work

From the first two-party protocol proposed by Yao, many efforts have been done to design efficient SMCprotocols. Early proposals (e.g. [1, 2]) were based on a complexity-theoretic assumption (the availabilityof a trapdoor permutation) that proved to be hardly achievable in practice. Endeavors toward eliminatingthis assumption have led to solutions utilizing a richer communication model: private communication(e.g. protocols proposed in [4, 3]). Such protocols does not rely on any cryptographic assumptions andare secure against active adversaries if they are less than third of all the participants).More recently, much research been targeted toward reducing communication complexity of generic SMCprotocols (e.g. [8, 9, 10, 11, 12]). A very important achievement of these studies, is SMC with linearcommunication complexity in the number of participants for passive adversaries. Assuming active ad-versaries, communication complexity is quadratic in the multiplicative depth of the circuit [11].Some of the works in the literature attempt to trade privacy for efficiency in different ways. For example,in [13] tradeoffs of information leakage and round complexity for two-party secure computation havebeen considered. In [14] some leakage of information has been allowed to compute some specific func-tions. Besides, in [15] a k-leaked security model has been developed which leads to design more efficientvariations of Yao’s protocol in face active adversaries. Our SCMC proposal can be also considered partof this category, since it allows some information leakage during the SSCC phase of the protocol andimproves efficiency by replacing some parts of expensive secure computation by a cheaper semi-secureone.In a SCMC protocol, the impact of the privacy leakage is reduced utilizing a trust-based clustering modelin which each participant consciously decides about the other participant(s) she trusts enough to shareher private input with during the intra-cluster computation.

64

Page 3: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

Our proposal is related to, but distinct from classic privacy-preserving clustering algorithms. In [16],randomization techniques have been used to address the problem of privacy-preserving clustering. Also,a privacy preserving protocol has been proposed in [17] to cluster objects of a vertically partitioneddataset. A lot of works also have focused on clustering over horizontally partitioned data. For example,Inan et al. in [18], have proposed a general technique to securely compute the similarity matrix of theobjects of a horizontally partitioned dataset using secure multiparty comparison protocols proposed fordifferent datatypes. Other researchers have proposed some improved variations of k-means algorithm(e.g. [19, 20, 21, 22]). Jagannathan et al. have also proposed a clustering protocol for arbitrarily parti-tioned data [23].We argue that some specific features of the SCMC clustering problem prevent these classic privacy-preserving clustering algorithms from being used in the context of SCMC. Generally speaking, the clus-tering problem of SCMC can be mapped to the privacy-preserving clustering over horizontally parti-tioned data. However, there are two important differences between our clustering problem and what isusually considered in the other clustering fields. Firstly, in SCMC every participant holds only a sin-gle row of the trust matrix; thereby, the local computations that are usually done at each site in thewell-known privacy preserving clustering algorithms in the literature, are not applicable to this case. Inparticular, the number of local private trust vectors is not enough to create local centroids (as for ex-ample proposed in [21]) and to be aggregated to preserve their secrecy. Secondly, the trust matrix byitself cannot be completely mapped to the concept of similarity matrix in the clustering literature. Thisis mainly because, unlike a similarity matrix, a trust matrix is not necessarily symmetric. Moreover, cre-ating a similarity matrix using the trust matrix, as done for example in [18], either leads to uncontrolledprivacy leakage during the intra-cluster computation, or imposes complexity overheads. Therefore, aspecifically-tailored clustering protocol is a preliminary revision of necessity. In our previous work [7],we proposed such a protocol. In this work we propose a new clustering protocol. All alternatives for theclustering models of SCMC will be investigated in Section 4.

3 Trust-Based Intra-Cluster Computation

As mentioned before, our approach to improve the efficiency of a secure multiparty computation relieson a tradeoff between privacy and efficiency that is achieved by deploying a relaxed privacy scheme inthe intra-cluster computation phase (semi-secure computation). The privacy model of a SCMC protocolconsists of two different schemes: a relaxed privacy scheme and a strict privacy scheme. The formeris in fact applied in the intra-cluster computation phase that results in a semi-secure computation, andthe latter corresponds to the privacy scheme adopted by the SMC protocol executed among the clusters’representatives in which no privacy leakage is usually allowed. Regardless of the privacy model of theinter-cluster computation phase, we define the relaxed privacy scheme adopted in the intra-cluster com-putation phase via a subset of cluster members, as follows:

Definition 1: For each participant pi a set Ri of participants (6= pi) is assigned to act as her repre-sentative in the intra-cluster computation phase. Ri is defined as Ri ⊆ {p j : pi trusts p j}.

Clearly, Ri = φ means that the participant does not share her private input with any other participant.In the ”relaxed” privacy model of our intra-cluster computation phase, privacy leakage is controlled bydirected trust relations among the participants. We consider two levels of relaxation: partial data sharingwhere the private input of a participant is shared among a set of her (preferably) trusted participants, and

65

Page 4: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

complete data sharing where participant pi completely shares her private input (without any privacyconcern) with one of the members of Ri. The amount of data to be shared during the intra-clustercomputation phase results in different clustering models.In turn, the clustering model selected for SCMC determines the algorithm used to cluster the participantsas well as its (semi-secure) intra-cluster computation protocol. Some clustering models will be discussedin Section 4.A trust management system is also required to compute and update pair-wise trust among participantsto be used as the input of the clustering algorithm. Several trust management systems available in theliterature (see for instance [24]) can be plugged in to our SCMC protocols. However, without gettinginto the details of trust management, the existence of a binary trust matrix (Tn×n, where n is the numberof participants) is assumed throughout this paper. Each element of the trust matrix (Ti j) is defined as0, if pi does not trust p j and 1, otherwise. Clearly, all the diagonal elements of the trust matrix are 1.Each row of the matrix corresponds to the trust vector of a participant that shows her trust beliefs aboutthe other participants. Due to the secrecy of the trust values, each trust vector is assumed to be privatelyowned by its corresponding participant.

4 Clustering Models in SCMC

We now propose two trust-based clustering models for the SCMC protocol. In the partially-shared modelno private input is provided to another participant in its plain form, but the intra-cluster privacy leakageis allowed in terms of secret sharing: each participant pi is assigned a set of representatives (Ri in theabove definition) with whom she secretly shares her private data during the intra-cluster computationphase. Such a set of representatives is called Board of Trustee (BoT). Therefore, the clustering problemin a partially-shared model consists of finding a set of clusters so that all the participants included ina cluster have a common BoT. As a result, the number of final clusters here is controlled by the totalnumber of common BoTs. After the intra-cluster computation phase of the protocol, all or one of theBoT members of each cluster takes part in the SMC phase using the aggregated result of the privateinputs of the participants included in their cluster. This model will be discussed in more detail in Section4.1.On the other hand, in the completely-shared model the role of trust is a central one. In this approach,each participant agrees to send her private input (in the plain form) to at most one of its trustees. Aselected trustee, which is called the leader, gets access to its follower’s private input in the intra-clustercomputation phase. Therefore, the clustering problem in such a model consists of finding a set of disjointhierarchies depicting the overall follower-leader relationships. In this model, the number of final clustersis equal to the number of disjoint hierarchies recognized and the root of each cluster (who correspondsto the most trusted participant of the cluster) will take part in the SMC phase using the result of theaggregation of all its cluster members’ private inputs. This model will be considered in detail in Section4.2.

4.1 Partially-Shared Clustering Model

In the partially-shared model clusters are created based on common BoTs. Therefore, our algorithmresults in a flat clustering in which participants are grouped based on their common BoTs.In order to compute a globally minimum set of clusters, it would be necessary to run multiple rounds of amultiparty privacy-preserving set intersection protocol (e.g. [25]) which would result in a very inefficientsolution. Instead, we propose a relatively fast protocol that leads to an acceptable cluster cardinality. Thisprotocol that we call clustering based on Dominant Trust Patterns (DTP-based clustering), follows the

66

Page 5: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

same strategy as classic k-means clustering in data mining literature to achieve an initial clustering andthen tries to stabilize it by selecting a BoT for each cluster.DTP-based clustering is run in two phases: pre-clustering and clustering stabilization.

4.1.1 Anonymously-Located Pre-Clustering

The first stage executes a privacy-preserving clustering protocol using the participants’ trust vectors asprivate inputs and clusters them based on the similarity of their trust patterns. During the execution ofthis phase, no participant is aware of the other participants’ relocation among the pre-clusters. In otherwords, all information about the intermediate states and the final state of a particular participant is onlyavailable to the participant itself.Our algorithm starts with k random pre-clusters 1 and iteratively calculates the index vector of each clus-ter based on the trust pattern of the majority 2 of the cluster members and (if necessary) relocates eachparticipant to another cluster whose index vector is more similar to the participant’s trust vector. Theindex vector of a cluster indicates the dominant trust pattern of the cluster members. This step terminateswhen a convergence is reached (i.e. no more influential relocation is possible). This is recognized by thetotal number of bits changed in the index vectors. Let us now describe the execution in detail.We assume that all participants are ordered along a ring. Also, we denote pre-clusters (groups 3) byG1,G2, · · · ,Gk. At each iteration of the execution, one of the participants in the ring starts the computa-tion as the starter 4 by running the following steps:

1. The starter generates k random integer vectors V1,V2, · · · ,Vk of dimension n (the number of partic-ipants) as well as k random integers r1,r2, · · · ,rk; so that the pair (Vi,ri) corresponds to the groupGi.

2. Assuming the starter belongs to the group G j, she adds its private trust vector to Vj and increasesr j by 1. Then, she sends all the vectors and integers to the next participant in the ring.

3. Each participant in the ring does the same operation the starter has done in step 2.

4. When the starter receives vectors and integers, she generates k index vectors (one for each of thegroups) in the following way:

(a) Subtracts the initial random values from the vectors and integers. Therefore, ri shows thetotal number of participants in Gi and each Vi j represents the number of participants in Gi

who trust p j.

(b) For each group Gi, computes an index vector Ii in which each element Ii j is filled by 0 ifVi j ≤

ri

2, and 1, otherwise.

5. The starter compares the new index vectors with the index vectors of the previous iteration (ifany). If the total difference δ (in the number of bits) is less than or equal to a predefined value ε

(i.e. δ ≤ ε) the starter broadcasts a termination message to all the participants and the algorithm

1Without loss of privacy, k is assumed to be known by all the participants.2The word ”majority” can be generalized to

where σ is defined as a relaxation factor.3To prevent any confusion between the pre-clusters of this phase and the final clusters, from now on we use the term ”group”for the pre-clusters of this phase and the term ”cluster” for final clusters.4Although, in this paper we assume an honest-but-curious setting, in order to balance the workload of the participants, we canassume a token that is initially given to one of the participants and at the end of each iteration it is passed to the next participantin the ring to determine the starter. However, using alternate starters would make the protocol more robust.

67

Page 6: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

is stopped. Otherwise, she broadcasts new index vectors to all the participants and sends the tokento the next participant in the ring.

6. Each participant updates its group name to the group whose index vector is more similar to itsprivate trust vector.

In this way, our pre-clustering results in k groups 5 of participants in such a way that the trust patternsof the participants in a group are more similar to each other than to the participants of the other groups.Also, each group is indicated by its index vector which shows the dominant trust pattern of the groupmembers. Figure 1 illustrates the execution of this algorithm over 20 participants (p0 to p19) using arandom trust matrix with k = 4 and ε = 0. As shown in the picture, the execution continues till there isno change in the total number of bits of the index vectors of all the groups.

It should be noted that at the end of anonymously-located pre-clustering each participant just knowsthe group name she belongs to (without having any information about the other participants grouping) aswell as the index vectors of all the groups. Using this information, clustering can be stabilized.

4.1.2 Clustering Stabilization

During this phase, clustering is stabilized by trying to assign a BoT to each group. Generally speak-ing, clustering effectiveness is strongly affected by the BoT selection. The effectiveness of a clusteringprotocol is evaluated by the following criteria:

• Collusion robustness. In a partially-shared clustering model, this is mostly related to the averageof mutual trusts among a BoT members. Clearly, the more the BoT members trust each other, theless the protocol is robust against collusion.

• Intra-cluster privacy leakage. Since, during the pre-clustering phase participants are clusteredbased on similarity in their trust patterns, different combinations of them (as a whole) may havedifferent levels of trustworthiness, even though all the BoT member candidates of a cluster aretrusted by the majority of the members. For example, assume a cluster consists of 10 participantswith two different BoT candidates of cardinality 2, say {p7, p9} and {p2, p4}. Since all candidateBoTs have been recognized trusted by the majority of the members, each of them is at least trustedby 6 members. However, if p7 and p9 are trusted by all 10 members and p2 and p4 are trustedby exactly 6 of them, selecting {p7, p9} results in no untrusted sharing, whereas the other optionleads to 8 untrusted shares inside the cluster. Obviously, the more untrusted sharing occurs insidea cluster, the privacy of the private inputs will probably be more at risk in the computation phase.

• Number of solitary clusters. In a SCMC-enabled SMC protocol solitary clusters should be avoided(since they increase the final number of clusters in an unpredictable way). Generally, solitaryclusters are formed due to the impossibility of assigning a BoT to one or more participants. Thereare mainly two reasons for this phenomenon: firstly, the existence of very suspicious participantswho trusts less than nBoT participants and secondly, the exclusive assignment of BoT members inthe cases where the trusted participants of different clusters are limited in number and similar toeach other. In both cases, solitary clusters are usually inevitable.

Unfortunately, it is impossible to bind all these optimizations together while keeping the privacy oftrust vectors. The main reason is their incompatibility with each other. Robustness against collusions, for

5It is also possible to have less than k groups if all the members of a cluster leave it during the algorithm. In this case, thatgroup is eliminated by the starter.

68

Page 7: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

Figure 1: The execution of the anonymously-located pre-clustering algorithm using a random trust matrix(n = 20, k = 4 and ε = 0).

example, requires to employ the more suspicious and yet more untrusted participants (Most UntrustedSuspicious First (MUSF) selection) in the BoTs. But, this strategy itself is more risky from the viewpointof privacy because it leads to an increase in the untrusted shares. Figure 4 shows the impact of this failedstrategy on the number of untrusted shares compared to naıve selection 6. Surprisingly, this approachdoes not show any improvement in the average number of mutual trusts among the BoT members neither.

6By naıve selection we mean selection without any criteria not necessarily random selection.

69

Page 8: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

The results are shown in Table 1.

Table 1: Impact of untrusted suspicious selection on the average number of mutual trusts among BoTmembers compared to the naıve selection (assuming k = 10 and nBoT = 5).

However, as Figure 2 illustrates, when the cardinality of BoT is between 60% to 80% of the averageof cluster cardinality, the probability of mutual trust is 0.25. Also the figures in the table show that inthe worst case where the cardinality of BoT is about 20% of the average of cluster population, this valueincreases to 0.27 and for more acceptable BoT cardinalities (∼ 10%) it is about 0.21.

Figure 2: The probability of mutual trusts among BoT members (assuming n = 500 and k = 10).

Our protocol does a good job in reducing untrusted shares. Clearly, the number of untrusted shares isreduced if the BoT members are selected on the basis of their number of votes among the group members.Since this information is available to the starter during the execution of pre-clustering, it can be madeavailable to all the participants. In this way BoT members can be selected more intelligently. Clearly,the clustering in which the BoT members are selected on the basis of their number of votes, results in aminimized number of untrusted shares (over a particular pre-clustering). Hence, we propose to utilize thevotes of group members to each BoT member candidate and select perform a prioritized selection basedon the number of votes.In order to reduce the number of solitary clusters, we avoid exclusive assignment of BoT members.Figure 3 depicts the results of applying naıve and the Vote-Prioritized (VP) selections on the exampleshown in Figure 1. In this example, the VP algorithm results in 3 untrusted shares while naıve selectioncauses 7. Note that the solitary cluster created by the vote-prioritized algorithm has not been caused by

70

Page 9: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

the lack of enough BoT members, but, it is due to the creation of a group with no member 7. Dashedarrows show multi-cooperative BoT members among different clusters.

Figure 3: Two possible clusterings resulted by naıve BoT selection (in the left) and vote-prioritizedselection (in the right) over the final grouping of the example shown in Figure 1 (assuming nBoT = 3).

Figure 4 illustrates the effect of VP selection on the total number of untrusted shares. Values on thevertical axis represent the total number of untrusted shares that has occurred in average for given numberof participants.

Our experiments show that assuming reasonable values for initial parameters (k, nBoT ), the finalnumber of solitary clusters is not a concern. In other words, the number of solitary clusters always tendsto zero by increasing the number of participants. However, for non-zero cases, VP selection improvesresults compared to the naıve selection. Table 2 shows the results. Solitary clusters formed by partici-pants having suspicious trust patterns can be reduced simply by including them as BoT member of otherclusters. Such a selection technique prioritizes the participants based on their untrustworthiness amongthe groups using the index vectors. This approach that we call Most Untrusted First (MUF) selectionreduces the final number of solitary clusters (see Table 2). However, MUF selection leads to many un-

7This situation can be happened in the other alternatives too.

71

Page 10: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

Figure 4: Clustering result of the prioritized BoT selection compared to the naıve selection (assumingk = 10 and nBoT = 5).

trusted shares as illustrated in Figure 4.Therefore, VP selection based on coincident BoTs leads to a clustering stabilization over the groupsformed by the pre-clustering algorithm with minimized number of untrusted shares and acceptable soli-tary clusters.

Table 2: The effect of different selection algorithms on the number of solitary clusters (assuming k = 10and nBoT = 5).

4.1.3 DTP-Based Clustering Analysis

DTP-based clustering using VP selection does not require any communication during the stabilizationphase. However, at each iteration of the pre-clustering phase, n−1 messages (each of the length k(n+1))are transmitted along the ring in addition to one broadcast message of length 2kn. The computation com-plexity of the protocol only consists of k(n+1) random number generations for the starter.The number of iterations of the pre-clustering phase of DTP-based protocol can be adjusted by the con-vergence factor (δ ). Figure 5 depicts the number of iterations as a function of the number of participants.Based on this figure, the number of iterations behaves nearly linearly. At least it can be said that the num-ber of iterations exhibits a slow growth when the number of participants increases.

From the viewpoint of privacy, our protocol produces encouraging anecdotal results on the numberof untrusted shares. As depicted in the figure, even with such a small cardinality for BoT, the number of

72

Page 11: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

Figure 5: Number of iterations in anonymously-located pre-clustering (assuming k = 10 and nBoT = 5).

untrusted shares remains acceptable. Even in the worst case the number of untrusted shares is always less

thann×nBoT −n2

BoT

2which is itself bounded by n× nBoT . The average case shows much better results

(compare 5111.34 with 12487.5 for n = 5000, k = 10 and nBoT = 5). Also, it is worth-noting that theuntrusted sharing by itself does not threaten the privacy of inputs because, the intra-cluster computationis based on secret sharing techniques. In other words, if the untrusted sharing couples with collusion,the privacy of private inputs could be compromised. Based on experimental results of the probability ofcollusion (Figure 2) this risk is justifiable.About the privacy of trust vectors, none of the vectors is revealed during the execution of our protocolbecause all messages consist only of random values. The only information revealed is the overall numberof votes that is broadcast to all the participants at the end of each iteration. This information correspondsto the total number of 1s in each column of the trust matrix.We remark that the simple secure sum protocol used in our solution can be improved by adopting someother protocol already proposed in the literature which has been proved to be collusion resistant (e.g.[26]).

4.2 Completely-Shared Clustering Model

In this model each participant agrees to send her private input (in plain form) to at most one of hertrustees. In our previous work [7] we considered this model in detail and proposed a consistent clusteringprotocol called Prioritized Voting Clustering (PVC).Unlike DTP-based protocol, PVC protocol results in a hierarchical clustering of the participants. So that,each hierarchy clearly defines the follower-leader relations which shows the flow of the follower’s privateinput to her leader during the intra-cluster computation phase. Utilizing one of the participants as theinductor and a Trusted Third Party (TTP) as well as two different public-key cryptosystems, the privacyof the trust vectors are preserved during the clustering: Participants take part in a sequence of electionsrun by the TTP to select a leader and at each iteration each participant just gets aware of her leader (if

73

Page 12: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

has been allocated any). Therefore, at the end of the protocol execution, none of the participants knowabout the other participants’ location in clustering. Compared to DTP-based clusteing, PVC protocol hasthree disadvantages:

• Utilizing a TTP. PVC protocol cannot be employed in the cases that does not access to a TTP.

• Hierarchical Structure. Due to the non-flat structure of clusters, the round complexity of the intra-cluster computation is higher.

• Inflexibility. Since, this protocol relies on the plain flow of private inputs, possible follower-leaderrelations are so strict. In other words, in spite of DTP-based clustering, a participant cannot beforced to include to a cluster without absolute trust to someone in that cluster. As a consequence,the number of final clusters cannot be guaranteed to be small enough. Whereas, in DTP-basedprotocol, as the results show, the number of final clusters is very close to the initial parameter k.

5 Conclusions and Outlook

In this paper we presented SCMC scheme aimed at improving the efficiency of SMC by reducing thenumber of participants through clustering. Also, we investigated two clustering models for SCMC: oneof the introduced modes corresponds to the idea of our previous work [7]. Our trust-based privacy pre-serving clustering protocol exhibits encouraging results from the viewpoint of efficiency. Also, privacyloss has been qualitatively evaluated.We are currently working toward a quantitative model for the assessment of privacy loss that will enablerisk-driven deployment of our clustering technique based on ”acceptable” levels of privacy loss.

Acknowledgment

This work has been partly supported by the EU-funded project ”CUMULUS” (Contract No. FP7-318580).

References[1] A. C. Yao, “Protocols for secure computations,” in Proc. of the 23rd Annual Symposium on Foundations of

Computer Science (SFCS’82), Chicago, Illinois, USA. IEEE, October 1982, pp. 160–164.[2] O. Goldreich, S. Micali, and A. Wigderson, “How to play any mental game,” in Proc. of the 19th Annual

ACM symposium on Theory of Computing (STOC’87), New York, New York, USA. ACM, May 1987, pp.218–229.

[3] D. Chaum, C. Crepeau, and I. Damgard, “Multiparty unconditionally secure protocols,” in Proc. of the 20thannual ACM symposium on Theory of Computing (STOC’88), Chicago, Illinois, USA. ACM, May 1988,pp. 11–19.

[4] M. Ben-Or, S. Goldwasser, and A. Wigderson, “Completeness theorems for non-cryptographic fault-tolerantdistributed computation,” in Proc. of the 20th annual ACM symposium on Theory of Computing (STOC’88),Chicago, Illinois, USA. ACM, May 1988, pp. 1–10.

[5] P. Bogetoft, D. L. Christensen, I. Damgard, M. Geisler, T. Jakobsen, M. Krøigaard, J. D. Nielsen, J. B.Nielsen, K. Nielsen, J. Pagter et al., “Secure multiparty computation goes live,” in Financial Cryptographyand Data Security: Revised Selected Papers from the 13th International Conference (FC’09), Accra Beach,Barbados, February 23-26, 2009, LNCS, R. Dingledine and P. Golle, Eds. Springer-Verlag, 2009, vol. 5629,pp. 325–343.

74

Page 13: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

[6] F. Kerschbaum, A. Schropfer, A. Zilli, R. Pibernik, O. Catrina, S. de Hoogh, B. Schoenmakers, S. Cimato,and E. Damiani, “Secure collaborative supply-chain management,” Computer, vol. 44, no. 9, pp. 38–43,2011.

[7] S. Abbasi, S. Cimato, and E. Damiani, “Toward secure clustered multi-party computation: A privacy-preserving clustering protocol,” in Proc. of the 2013 International Conference on Information and Com-munication Technology (ICT-EurAsia’13), Yogyakarta, Indonesia, LNCS. Springer-Verlag, March 2013,vol. 7804, pp. 447–452.

[8] D. Beaver, S. Micali, and P. Rogaway, “The round complexity of secure protocols,” in Proc. of the 22ndannual ACM symposium on Theory of Computing (STOC’90), Baltimore, Maryland, USA. ACM, May1990, pp. 503–513.

[9] R. Cramer, I. Damgard, S. Dziembowski, M. Hirt, and T. Rabin, “Efficient multiparty computations secureagainst an adaptive adversary,” in Proc. of the 1999 International Conference on the Theory and Applicationof Cryptographic Techniques (EUROCRYPT’99), Prague, Czech Republic, LNCS, vol. 1592. Springer-Verlag, May 1999, pp. 311–326.

[10] M. Hirt and U. Maurer, “Robustness for free in unconditional multi-party computation,” in Proc. of the 21stAnnual International Cryptology Conference (CRYPTO’01), Santa Barbara, California, USA, LNCS, vol.2139. Springer-Verlag, August 2001, pp. 101–118.

[11] I. Damgard and J. B. Nielsen, “Scalable and unconditionally secure multiparty computation,” in Prof. of the27th Annual International Cryptology Conference on Advances in Cryptology (CRYPTO’07), Santa Barbara,California, USA, LNCS, vol. 4622. Springer-Verlag, August 2007, pp. 572–590.

[12] Z. Beerliova-Trubiniova and M. Hirt, “Efficient multi-party computation with dispute control,” in Proc. ofthe 3rd conference on Theory of CryptographyTheory of Cryptography (TCC’06), New York, New York, USA,LNCS, vol. 3876. Springer-Verlag, March 2006, pp. 305–328.

[13] R. Bar-Yehuda, B. Chor, E. Kushilevitz, and A. Orlitsky, “Privacy, additional information and communica-tion,” IEEE Transactions on Information Theory, vol. 39, no. 6, pp. 1930–1943, 1993.

[14] R. Fagin, M. Naor, and P. Winkler, “Comparing information without leaking it,” Communications of theACM, vol. 39, no. 5, pp. 77–85, 1996.

[15] P. Mohassel and M. Franklin, “Efficiency tradeoffs for malicious two-party computation,” in Proc. of the 9thInternational Conference on Theory and Practice of Public-Key Cryptography (PKC’06), New York, NewYork, USA, LNCS, vol. 3958. Springer-Verlag, April 2006, pp. 458–473.

[16] S. R. Oliveira and O. R. Zaiane, “Privacy preserving clustering by data transformation,” in Proc. of the 18thBrazilian Symposium on Databases (SBBD’03), Manaus, Amazonas, Brazil, October 2003, pp. 304–318.

[17] J. Vaidya and C. Clifton, “Privacy-preserving k-means clustering over vertically partitioned data,” in Proc.of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’03),Washington, DC, USA. ACM, August 2003, pp. 206–215.

[18] A. Inan, S. V. Kaya, Y. Saygın, E. Savas, A. A. Hintoglu, and A. Levi, “Privacy preserving clustering onhorizontally partitioned data,” Data & Knowledge Engineering, vol. 63, no. 3, pp. 646–666, 2007.

[19] Y. Shen, J. Han, and H. Shan, “The research of privacy-preserving clustering algorithm,” in Proc. of the 3rdInternational Symposium on Intelligent Information Technology and Security Informatics (IITSI’10), Jing-gangshan, China. IEEE, April 2010, pp. 324–327.

[20] S. Jha, L. Kruger, and P. McDaniel, “Privacy preserving clustering,” in Proc. of the 10th European Sympo-sium On Research In Computer Security (ESORICS’05), Milan, Italy, LNCS, vol. 3679. Springer-Verlag,September 2005, pp. 397–417.

[21] G. Jagannathan, K. Pillaipakkamnatt, R. N. Wright, and D. Umano, “Communication-efficient privacy-preserving clustering,” Transactions on Data Privacy, vol. 3, no. 1, pp. 1–25, 2010.

[22] A. Xue, D. Jiang, S. Ju, W. Chen, and H. Ma, “Privacy-preserving hierarchical-k-means clustering on hori-zontally partitioned data,” International Journal of Distributed Sensor Networks, vol. 5, no. 1, January 2009.

[23] G. Jagannathan and R. N. Wright, “Privacy-preserving distributed k-means clustering over arbitrarily parti-tioned data,” in Proc. of the 11th ACM SIGKDD International Conference on Knowledge Discovery and DataMining (KDD’05), Chicago, Illinois, USA. ACM, August 2005, pp. 593–599.

[24] S. Weeks, “Understanding trust management systems,” in Proc. of the 2001 IEEE Symposium on Security

75

Page 14: Clustering Models in Secure Clustered Multiparty Computationisyou.info/jowua/papers/jowua-v4n2-4.pdfClustering Models in Secure Clustered Multiparty Computation Sedigheh Abbasi , Stelvio

Clustering Models in SCMC Abbasi, Cimato, and Damiani

and Privacy (S&P’01), Oakland, California, USA. IEEE, May 2001, pp. 94–105.[25] J. H. Cheon, S. Jarecki, and J. H. Seo, “Multi-party privacy-preserving set intersection with quasi-linear

complexity,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences,vol. 95, no. 8, pp. 1366–1378, 2012.

[26] Y. Zhu, L. Huang, W. Yang, and X. Yuan, “Efficient collusion-resisting secure sum protocol,” Chinese Journalof Electronics, vol. 20, pp. 407–413, 2011.

Sedigheh Abbasi Sedigheh Abbasi is a PhD student in Computer Science at the Uni-versita degli Studi di Milano, Italy. She received her M.A. in Computer Engineering(Software) from Shiraz University, Iran. She worked as a university instructor forfive years at Marvdasht Islamic Azad University, Iran. Currently, she is working onPrivacy and Efficiency issues in Secure Multiparty Computation. Her main researchinterests include cryptography, data outsourcing, trust and privacy.

Stelvio Cimato Stelvio Cimato is an Assistant Professor with the Dipartimento diInformatica of the Universita degli Studi di Milano. He got the Ph.D in ComputerScience at University of Bologna, Italy in 1999. His main research interests are inthe area of cryptography, network security, and Web applications. He has publishedseveral papers in the field and is active in the community, serving as member of theprogram committee of several international conferences in the area of cryptographyand data security.

Ernesto Damiani Ernesto Damiani is Full Professor at the Universita degli Studi diMilano, Italy, and Director of the University’s Ph.D. program in computer science. Heleads the SESAR lab whose researchers have been involved in several projects somefunded by the European community, some by the Italian Ministry of Research andsome by private companies, including Nokia Siemens, Cisco Systems, British Tele-com and Telecom italia. His main areas of interest include trust, reputation and riskmodels, service-oriented computing, processing of semi and unstructured information

(e.g., XML), models and platforms supporting open source development. He has published several booksand more than 250 papers and international patents. Prof. Damiani was the founding chair of the IEEEConference on Digital Ecosystems (IEEE-DEST) and the Vice-Chair of the IEEE Technical Committeeon Industrial Informatics. He is a Senior Member of the IEEE and ACM Distinguished Scientist, and arecipient of the IEEE Chester Hall Award.

76