
IMPLEMENTING EFFICIENT ROBUST PRIVATE SET INTERSECTION BASED ON HOMOMORPHIC ENCRYPTION

L. Ertaul, T. Dinesh

Abstract - Privately and efficiently computing the set intersection of two mutually mistrusting parties is an important procedure in the field of data mining [1]. Assuring robustness, that is, dealing with malicious parties while retaining protocol efficiency, is a well-known challenge. Here we implement the set intersection protocol using the contacts in the Android emulator.

I. INTRODUCTION

An efficient, robust two-party protocol for computing set intersection was introduced by Freedman, Nissim and Pinkas [2]. To solve this problem they present a protocol in which two possibly malicious parties, each holding a private input, compute the intersection of their inputs without leaking any other information. It is mainly used in the field of data mining [1]. Prototype applications involve sharing sensitive information securely in health and finance. In today's world privacy is a most important commodity, and the set intersection protocol is a privacy protocol [7]: it lets mutually suspicious parties share private data.

The paper is organized as follows. Section II lists applications of PSI. Section III describes the methods and techniques that are used in the protocol. Section IV gives the implemented protocol. Section V explains the protocol. Section VI describes the implementation of the protocol. Section VII reports the performance statistics. Section X gives the references.

II. APPLICATIONS OF PSI

There are many applications in which we use Private Set Intersection (PSI):

1. The USA government wants to ensure that the employees of its industrial contractor have no criminal background. Neither the government agency nor the contractor wants to reveal its information, but they want to know the intersection.
2. Two law enforcement bodies want to compare the terrorist lists in their databases. National privacy laws do not allow them to disclose the data, but according to the treaty they are allowed to share information on common interests.
3. Two real estate companies want to find double dealers, i.e. customers who sign contracts with both companies, and use PSI for this purpose.
4. A federal tax authority wants to identify tax evaders, so it wants to find the accounts that tax evaders hold with a foreign bank and obtain the account records. The bank does not want to disclose information about its account holders, and the tax authority does not want to reveal its list of suspects.
5. The Department of Homeland Security (DHS) wants to check its terrorist list against the passengers travelling on a flight operated by a foreign airline. Neither party is willing to reveal its information. If there is an intersection, DHS would not allow the flight to land.

Dr. L. Ertaul is with the Department of Math & Computer Science, California State University, Hayward, CA 94542, USA (email: [email protected]). T. Dinesh is with the Department of Computer Networks, California State University, Hayward, CA 94542, USA (email: [email protected]).

III. METHODOLOGIES & TECHNIQUES

First we consider the semi-honest protocol [3], which computes set intersection via polynomial evaluation. In this protocol the Server evaluates the encrypted polynomial of degree n on each of its inputs and sends the results back to the Client. To ensure security, the Client must verify whether the Server evaluated the polynomial correctly. For this, redundancy is used in the representation of the inputs [4]. The Server shares its input via Shamir secret sharing [5] and commits to the shares of its input. Because polynomials are closed under composition, the result is a valid secret sharing [5] of the output value. We check that the Server acted honestly by executing the cut-and-choose protocol [8], in which the Server opens k random shares of each input value. Because of the secret sharing scheme [5], opening these shares reveals no information about the input. The Client checks that the output shares lie on the same polynomial of degree d and reconstructs the secret, which corresponds to the Server's inputs. The computation must either complete correctly when the other party acts honestly or detect cheating, which leads us to use Lagrange interpolation [9] as an error detection code.

IV. PRIVATE SET INTERSECTION PROTOCOL

Now we will look at the protocol of the private set intersection.

1. The Client runs GEN($1^k$) to obtain a secret key sk and a public key pk for Homenc and sends pk to the Server [8].
2. The Client computes a polynomial $P(x) = x^n + a_{n-1}x^{n-1} + \dots + a_1 x + a_0$, whose degree n is the size of his input, over a finite field such that $P(x) = 0$ if and only if $x \in X$ [8].
3. The Client encrypts the coefficients of P, $b_i = \mathrm{ENC}(a_i)$, and sends them to the Server [8].
4. For each $y_j \in Y$ the Server S chooses a random value $r_j$ and constructs the function [8]

$$F(y_j) = \mathrm{ENC}_{pk_1}(r_j \cdot P(y_j) + y_j + 0) = \mathrm{ENC}_{pk_1}(0) \cdot \mathrm{ENC}_{pk_1}(y_j) \cdot \Big(\prod_{s} \mathrm{ENC}_{pk_1}(a_s)^{y_j^{s}}\Big)^{r_j} = \mathrm{ENC}_{pk_1}(0) \cdot \mathrm{ENC}_{pk_1}(y_j) \cdot \Big(\prod_{s} b_s^{\,y_j^{s}}\Big)^{r_j}$$

5. The Server replaces each of its inputs $y_j$ with new variables $c_l = y_j^{2^l}$ for $0 \le l \le \log n - 1$ and transforms the above function to [8]

$$F(y_j) = \mathrm{ENC}_{pk_1}(0) \cdot \mathrm{ENC}_{pk_1}(y_j) \cdot \Big(\prod_{s} b_s^{\,\prod_{l=0}^{\log n} (y_j^{2^l})^{s[l]}}\Big)^{r_j}$$

6. The Server shares each of his inputs $y \in Y$ with a polynomial $P_{y_j}$ and each of the random values $r_j$ with a polynomial $P_{r_j}$ [8].
7. Additionally the Server computes m random polynomials $P_{0,j}$ that have constant coefficient zero. These are used to "rerandomize" the output shares so that they give no information about the input [8].
8. Using all of the above shares and a random $r_{i,j}$ (to "rerandomize" the encryption) the Server computes shares of the values $F(y_j)$ [8]:

$$\mathrm{Out}_{i,j} = (F(y_j))(i) = \mathrm{ENC}_{pk_1}(0; r_{i,j}) \cdot \mathrm{ENC}_{pk_1}(P_{0,j}(i); 0) \cdot \mathrm{ENC}_{pk_1}(P_{y_j}(i); 0) \cdot \Big(\prod_{s} b_s^{\,\prod_{l=0}^{\log n} (P_{y_j^{2^l}}(i))^{s[l]}}\Big)^{P_{r_j}(i)}$$

and sends them to the Client.
9. The Client decrypts the values that he received from the Server, verifies that they are valid, and uses them to reconstruct the shared values. He concludes that the obtained values that are in his input set are the values in the intersection set [8].
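The core idea behind Steps 2 to 4 can be illustrated on plaintext values. The following is a minimal sketch, not the authors' implementation: it omits encryption, secret sharing and all robustness checks, and the class name, method names and the field modulus are hypothetical choices made only for this illustration. It shows that the plaintext r·P(y) + y equals y when y is in the Client's set and is a masked value otherwise.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Plaintext illustration only: the protocol computes the same value under encryption.
final class MaskedEvaluation {
    // Assumed toy field modulus (the prime 2^31 - 1); set elements are assumed to lie in [0, Q).
    static final BigInteger Q = BigInteger.valueOf(2147483647L);

    // P(y) = product over x in X of (y - x) mod Q, so P(y) = 0 exactly when y is in X.
    static BigInteger evalP(long[] clientSet, long y) {
        BigInteger acc = BigInteger.ONE;
        for (long x : clientSet) {
            acc = acc.multiply(BigInteger.valueOf(y - x)).mod(Q);
        }
        return acc;
    }

    // Plaintext underlying F(y): r * P(y) + y mod Q.
    static BigInteger masked(long[] clientSet, long y, BigInteger r) {
        return r.multiply(evalP(clientSet, y)).add(BigInteger.valueOf(y)).mod(Q);
    }

    // The Client keeps every received value that is also one of its own inputs.
    static List<Long> intersect(long[] clientSet, long[] serverSet, Random rnd) {
        List<Long> result = new ArrayList<>();
        for (long y : serverSet) {
            // Random nonzero mask r in [1, Q - 1].
            BigInteger r = new BigInteger(Q.bitLength(), rnd)
                    .mod(Q.subtract(BigInteger.ONE)).add(BigInteger.ONE);
            BigInteger v = masked(clientSet, y, r);
            for (long x : clientSet) {
                if (v.equals(BigInteger.valueOf(x))) {
                    result.add(x);
                    break;
                }
            }
        }
        return result;
    }
}
```

A masked value for an element outside the intersection could coincide with one of the Client's inputs only with probability of about |X|/Q, which is why a large field is used in practice.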

V. SET INTERSECTION PROTOCOL EXPLANATION

We now explain this protocol step by step.

Client side computation:

In Step 1 the Client generates three parameters (q, G, g) and, using these with Additive Elgamal Encryption [10], calculates the public key pk using the formula

$$pk = h = g^x \bmod q$$

where x is the secret key. In Step 2 a polynomial is constructed from the Client's input data set X: C computes a polynomial P(x) of degree n such that P(x) = 0 if x belongs to X, and the coefficients are taken from this polynomial. In Step 3 the coefficients are encrypted using Additive Elgamal Encryption [10] to obtain the ciphertext. The formula for calculating the ciphertext is

$$(c_1, c_2) = (g^y, h^y \cdot g^m)$$

where y is a random value and m is the coefficient being encrypted.
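Step 2, building the coefficients of P from the Client's inputs, can be sketched as follows. This is an illustrative sketch under assumptions: the class and method names are hypothetical, and the small modulus in the usage note is a toy value, not a parameter from the paper. The polynomial is expanded one root at a time by multiplying the running product by (x - root).

```java
import java.math.BigInteger;
import java.util.Arrays;

final class RootPolynomial {
    // Returns a_0..a_n (lowest degree first) of prod (x - root) mod q; a_n is always 1.
    static BigInteger[] fromRoots(long[] roots, BigInteger q) {
        BigInteger[] coeffs = { BigInteger.ONE }; // start from the constant polynomial 1
        for (long root : roots) {
            BigInteger negRoot = BigInteger.valueOf(root).negate();
            BigInteger[] next = new BigInteger[coeffs.length + 1];
            Arrays.fill(next, BigInteger.ZERO);
            for (int i = 0; i < coeffs.length; i++) {
                // Multiplying by x shifts coefficient i to position i + 1.
                next[i + 1] = next[i + 1].add(coeffs[i]).mod(q);
                // Multiplying by -root keeps it at position i.
                next[i] = next[i].add(coeffs[i].multiply(negRoot)).mod(q);
            }
            coeffs = next;
        }
        return coeffs;
    }

    // Horner evaluation of the polynomial at x, mod q.
    static BigInteger eval(BigInteger[] coeffs, BigInteger x, BigInteger q) {
        BigInteger acc = BigInteger.ZERO;
        for (int i = coeffs.length - 1; i >= 0; i--) {
            acc = acc.multiply(x).add(coeffs[i]).mod(q);
        }
        return acc;
    }
}
```

For example, with roots {2, 3} and the toy modulus 101 the coefficients are 6, 96, 1, that is x^2 - 5x + 6 reduced mod 101, and the polynomial evaluates to 0 at x = 2.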

Server side computation:

In Step 4 the Server evaluates the polynomial on its input set Y, using the formula

$$F(y_j) = \mathrm{ENC}_{pk_1}(0) \cdot \mathrm{ENC}_{pk_1}(y_j) \cdot \Big(\prod_{s} b_s^{\,y_j^{s}}\Big)^{r_j}$$

In Step 5 the Server replaces the old variables with new variables for the randomization, so that the input information is not revealed, using the formula

$$F(y_j) = \mathrm{ENC}_{pk_1}(0) \cdot \mathrm{ENC}_{pk_1}(y_j) \cdot \Big(\prod_{s} b_s^{\,\prod_{l=0}^{\log n} (y_j^{2^l})^{s[l]}}\Big)^{r_j}$$
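The substitution in Step 5 relies on writing each power of y as a product of repeated squarings selected by the bits s[l] of the exponent s. A minimal sketch of that identity on plaintext values follows; the class and method names are hypothetical, and the number of bits is passed in explicitly rather than derived from n.

```java
import java.math.BigInteger;

final class PowerByBits {
    // Returns c_l = y^(2^l) mod q for l = 0..bits-1, computed by repeated squaring.
    static BigInteger[] squarings(BigInteger y, int bits, BigInteger q) {
        BigInteger[] c = new BigInteger[bits];
        c[0] = y.mod(q);
        for (int l = 1; l < bits; l++) {
            c[l] = c[l - 1].multiply(c[l - 1]).mod(q);
        }
        return c;
    }

    // Rebuilds y^s mod q as the product of c_l over the set bits s[l] of s.
    // Assumes 0 <= s < 2^(c.length).
    static BigInteger powerFromBits(BigInteger[] c, int s, BigInteger q) {
        BigInteger acc = BigInteger.ONE;
        for (int l = 0; l < c.length; l++) {
            if (((s >> l) & 1) == 1) {
                acc = acc.multiply(c[l]).mod(q);
            }
        }
        return acc;
    }
}
```

With this representation every power of y up to the degree of the polynomial is expressed through only a logarithmic number of variables, which is what keeps the number of values the Server has to share small.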

In Step 6 the Server uses Shamir secret sharing [5] to calculate the shares $P_y$ and $P_r$ for each of its inputs y and random values r. The number of shares is given by the formula

$$10k(\log n + 1)$$

In Step 7 the shares are re-randomized, so that the Server makes sure that no information is given about the input. In Step 8 the output shares are calculated using the formula

$$\mathrm{Out}_{i,j} = (F(y_j))(i) = \mathrm{ENC}_{pk_1}(0; r_{i,j}) \cdot \mathrm{ENC}_{pk_1}(P_{0,j}(i); 0) \cdot \mathrm{ENC}_{pk_1}(P_{y_j}(i); 0) \cdot \Big(\prod_{s} b_s^{\,\prod_{l=0}^{\log n} (P_{y_j^{2^l}}(i))^{s[l]}}\Big)^{P_{r_j}(i)}$$

and these are sent to the Client. In Step 9 the Client decrypts the shares received from the Server using Additive Elgamal decryption and recombines the decrypted values using Lagrange interpolation [9] to find the intersection, with the reconstruction formula given under "Reconstruction of the shared values" below.

In this process the Client learns only the intersection, while the Server learns nothing about the contents of the Client's input data set.

A. Important design choices

1. Homomorphic encryption:

We use a homomorphic encryption scheme [6]. The plaintexts of the semantically secure encryption are elements of P with operation +, and the ciphertexts are elements of C with operation *. The homomorphic encryption ENC [6] has two properties.

Property 1 (Homomorphic Encryption) :

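$$\mathrm{ENC}(X, r_1) \cdot \mathrm{ENC}(Y, r_2) = \mathrm{ENC}(X + Y, r)$$

$$\mathrm{ENC}(X, r_3)^{\lambda} = \mathrm{ENC}(\lambda \cdot X, r')$$

Property 2

$$\mathrm{ENC}(X, r_1) \cdot \mathrm{ENC}(Y, r_2) = \mathrm{ENC}(X + Y, r_1 + r_2)$$

$$\mathrm{ENC}(X, r_3)^{\lambda} = \mathrm{ENC}(\lambda \cdot X, \lambda \cdot r_3)$$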

Additive El-Gamal Encryption [10]


GEN: on input $1^n$, generate (G, q, g), where q is prime, G is a cyclic group of order q and g is a generator of G. Choose a random $x \leftarrow Z_q$ and compute $h = g^x$. The public key is (G, q, g, h) and the private key is (G, q, g, x).

ENC: on input pk = (G, q, g, h) and a message $m \in Z_q$, choose a random $y \leftarrow Z_q$ and output the ciphertext $(g^y, h^y \cdot g^m)$.

DEC: on input sk = (G, q, g, x) and a ciphertext $(c_1, c_2)$, output $g^m = c_2 / c_1^x$.
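GEN, ENC and DEC, together with the homomorphic properties above, can be sketched in Java as follows. This is a toy sketch under assumptions, not the code used in the prototype: the group is the order-q subgroup of the integers modulo p = 2q + 1 with the insecure demonstration values p = 2039, q = 1019 and g = 4, and the class and method names are hypothetical. Because DEC yields g^m rather than m, the sketch recovers small messages by exhaustive search.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

final class AdditiveElGamal {
    // Toy parameters (assumptions for illustration, far too small to be secure).
    static final BigInteger P = BigInteger.valueOf(2039); // safe prime p = 2q + 1
    static final BigInteger Q = BigInteger.valueOf(1019); // prime order of the subgroup
    static final BigInteger G = BigInteger.valueOf(4);    // generator of the order-q subgroup
    static final SecureRandom RND = new SecureRandom();

    final BigInteger x; // secret key
    final BigInteger h; // public key h = g^x

    AdditiveElGamal() {
        x = randomExponent();
        h = G.modPow(x, P);
    }

    static BigInteger randomExponent() {
        return new BigInteger(Q.bitLength() + 8, RND).mod(Q);
    }

    // ENC: (c1, c2) = (g^y, h^y * g^m) for a fresh random y.
    BigInteger[] encrypt(BigInteger m) {
        BigInteger y = randomExponent();
        BigInteger c1 = G.modPow(y, P);
        BigInteger c2 = h.modPow(y, P).multiply(G.modPow(m.mod(Q), P)).mod(P);
        return new BigInteger[] { c1, c2 };
    }

    // DEC: g^m = c2 / c1^x, then a brute-force search for m in [0, q).
    BigInteger decrypt(BigInteger[] c) {
        BigInteger target = c[1].multiply(c[0].modPow(x, P).modInverse(P)).mod(P);
        BigInteger acc = BigInteger.ONE;
        for (long m = 0; m < Q.longValue(); m++) {
            if (acc.equals(target)) {
                return BigInteger.valueOf(m);
            }
            acc = acc.multiply(G).mod(P);
        }
        throw new IllegalStateException("plaintext not found");
    }

    // Multiplying ciphertexts componentwise adds the plaintexts.
    static BigInteger[] add(BigInteger[] a, BigInteger[] b) {
        return new BigInteger[] { a[0].multiply(b[0]).mod(P), a[1].multiply(b[1]).mod(P) };
    }

    // Raising a ciphertext to lambda multiplies the plaintext by lambda.
    static BigInteger[] scale(BigInteger[] a, BigInteger lambda) {
        return new BigInteger[] { a[0].modPow(lambda, P), a[1].modPow(lambda, P) };
    }
}
```

For instance, decrypting `add(encrypt(7), encrypt(5))` gives 12 and decrypting `scale(encrypt(7), 3)` gives 21, which are the two operations the Server needs in order to evaluate the encrypted polynomial.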

2. Input sharing via Shamir secret sharing scheme [5]

Here we share the function using the Shamir scheme [5]: we secret-share the arguments of the function and evaluate the function on the corresponding shares to obtain shares of the final output. We use this secret sharing [5] for the purpose of efficiency. We use the polynomial $f = a_n x^n + a_{n-1} x^{n-1} + \dots + a_0$. For given z and r we obtain shares of $g(z, r) = r \cdot f(z) + z$ by secret sharing z and r with random polynomials. The number of shares needed is $n \cdot k + k$.
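The idea of evaluating g on shares can be sketched on plaintext values as follows. This is an illustrative sketch under assumptions: the sharing polynomials are taken to have degree k with the secret as constant term, the field modulus 1019 is a toy value, and all names are hypothetical. Since the composed polynomial then has degree n·k + k, the sketch interpolates from n·k + k + 1 evaluation points.

```java
import java.math.BigInteger;

final class SharedEvaluation {
    static final BigInteger Q = BigInteger.valueOf(1019); // assumed toy prime field

    // Horner evaluation of a polynomial (coefficients lowest degree first) mod Q.
    static BigInteger eval(long[] coeffs, BigInteger x) {
        BigInteger acc = BigInteger.ZERO;
        for (int i = coeffs.length - 1; i >= 0; i--) {
            acc = acc.multiply(x).add(BigInteger.valueOf(coeffs[i])).mod(Q);
        }
        return acc;
    }

    // Lagrange interpolation at 0 through the points (xs[v], ys[v]).
    static BigInteger interpolateAtZero(long[] xs, BigInteger[] ys) {
        BigInteger sum = BigInteger.ZERO;
        for (int v = 0; v < xs.length; v++) {
            BigInteger num = BigInteger.ONE;
            BigInteger den = BigInteger.ONE;
            for (int u = 0; u < xs.length; u++) {
                if (u == v) continue;
                num = num.multiply(BigInteger.valueOf(xs[u])).mod(Q);
                den = den.multiply(BigInteger.valueOf(xs[u] - xs[v])).mod(Q);
            }
            sum = sum.add(ys[v].multiply(num).multiply(den.modInverse(Q))).mod(Q);
        }
        return sum;
    }

    // pz and pr are sharing polynomials with pz[0] = z and pr[0] = r.
    // Each share of g is computed from the shares of z and r at the same point.
    static BigInteger shareAndReconstruct(long[] f, long[] pz, long[] pr, int points) {
        long[] xs = new long[points];
        BigInteger[] out = new BigInteger[points];
        for (int i = 0; i < points; i++) {
            xs[i] = i + 1;
            BigInteger at = BigInteger.valueOf(xs[i]);
            BigInteger zShare = eval(pz, at);
            BigInteger rShare = eval(pr, at);
            out[i] = rShare.multiply(eval(f, zShare)).add(zShare).mod(Q);
        }
        return interpolateAtZero(xs, out);
    }
}
```

With f = (x - 3)(x - 5), z = 3 and r = 11, reconstructing from four shares returns 3, because f(z) = 0 removes the random mask.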

3. Reconstruction of the shared values

Here we describe how the Client reconstructs the shared values received from the Server and checks, using the shares $[C_{i,j}]_{1 \le i \le 10k(\lfloor \log n \rfloor + 1)}$, whether the input is in the particular set and in the intersection set. The Client first decrypts the shares to get the plaintext. The Client then uses the Lagrange polynomial [9] through some of the points to reconstruct the shared value

$$C'_{0,j} = L_j(0) = \sum_{v=1}^{1+k+k(\log n+1)} C'_{j,v}\,\ell_v(0)$$
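For reference, the Lagrange basis values at zero used in this sum can be written out explicitly. The evaluation points $x_u$ are an assumption of this note, since the text does not name them; taking $x_u = u$ gives

$$\ell_v(0) = \prod_{u \ne v} \frac{x_u}{x_u - x_v}$$

As a small worked example with three points $x = 1, 2, 3$: $\ell_1(0) = \frac{2 \cdot 3}{(2-1)(3-1)} = 3$, $\ell_2(0) = \frac{1 \cdot 3}{(1-2)(3-2)} = -3$ and $\ell_3(0) = \frac{1 \cdot 2}{(1-3)(2-3)} = 1$. For shares $7, 9, 11$ of the polynomial $5 + 2x$ this gives $3 \cdot 7 - 3 \cdot 9 + 1 \cdot 11 = 5$, the constant term. In the protocol the same computation is carried out in the finite field, with division replaced by multiplication with a modular inverse.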

VI. IMPLEMENTATION OF PRIVATE SET INTERSECTION

A. Prototype implementation

We now report on the design of the prototype implementation of the set intersection protocol. It is implemented in Java and the output is displayed in the Android emulator. Motivated by the design choices above, we use Additive Elgamal Encryption [10], Additive Elgamal Decryption, Shamir secret sharing [5] and the Lagrange interpolation scheme [9]. The development system has 8 GB of RAM and a Core i7 processor and runs Windows 8. We installed the Eclipse IDE to write the Java code and to run the Android emulator. The Android SDK tools are installed in Eclipse, and the latest Android version, 4.4.2, is installed to run the simulation. The Android emulator is equipped with an ARM v7 processor and 512 MB of RAM. We wrote the code in Java and ran it on Android: the code is packaged as an .apk and installed. We used the Contacts application to compute the intersection: we create contacts in the Android emulator, construct polynomials from the contacts, and use those polynomials to obtain the intersection. We started the project with socket programming, in which the Client and Server exchange messages. The Client sends the messages and the Server receives them in order to execute the instruction.

Fig. 1. Client and Server message passing

Here two emulators are created, named Client and Server. We then developed the Android application, in which the contacts are created using the phonebook (People) app that is available on the Android phone. On the Client side we took 5 contacts and mapped them to hash values, which are unique. These hash values are created using Java's built-in tool. The polynomial is then constructed from these hash values and expanded, and the coefficients of the resulting equation are taken. The obtained coefficients are encrypted using Additive Elgamal Encryption [10] and sent to the Server. This completes the Client side computation.


Fig. 2. Client side computation
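The mapping from contacts to hash values can be sketched as follows. The paper does not name the built-in tool it uses, so this sketch assumes `String.hashCode()` reduced into a prime field; the class name, method names and modulus are hypothetical. The matching is done on plaintext hashes here, only to show how a contact becomes a root of the Client's polynomial, and provides no privacy by itself.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class ContactHashing {
    // Assumed toy field modulus (the prime 2^31 - 1).
    static final BigInteger Q = BigInteger.valueOf(2147483647L);

    // Maps a contact name to a field element via String.hashCode() (an assumption).
    static BigInteger toFieldElement(String contact) {
        return BigInteger.valueOf(contact.hashCode()).mod(Q);
    }

    // Evaluates P(y) = prod (y - hash(x)) mod Q at the hash of the candidate contact.
    static BigInteger evalAt(List<String> clientContacts, String candidate) {
        BigInteger y = toFieldElement(candidate);
        BigInteger acc = BigInteger.ONE;
        for (String contact : clientContacts) {
            acc = acc.multiply(y.subtract(toFieldElement(contact))).mod(Q);
        }
        return acc;
    }

    // A candidate is common to both lists exactly when it is a root of P.
    static List<String> common(List<String> client, List<String> server) {
        List<String> result = new ArrayList<>();
        for (String candidate : server) {
            if (evalAt(client, candidate).signum() == 0) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

Note that `String.hashCode()` is a 32-bit hash and distinct strings can collide, so a larger hash would be preferable if uniqueness of the hash values matters.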

We now turn to the Server side computation. After the Client sends the encrypted coefficients, the Server receives them. On the Server 5 contacts are also created and mapped to hash values, using the same procedure as in the Client side computation. The Server then evaluates the polynomial on its input set, which gives the values F(yj). To calculate how many shares we need, we use the formula shown in Section V; here we need 20 shares. After that new variables are created by choosing random values, and the shares are shown in Fig. 5. Then a random polynomial is chosen and shares are calculated for that random polynomial. These shares are sent to the Client.

Fig.3. Server side welcome page

Fig.4. Server side computation

Fig.5. Server input shares


Fig.6. Server random polynomials

Fig.7. Shares sent to Client

After receiving the shares from the Server, the Client decrypts them using Additive Elgamal Decryption and then computes the intersection using Lagrange interpolation [9].

Fig.8. Client side decryption

VII. PERFORMANCE STATISTICS

Efficient Robust Private Set Intersection is run on both the Client and Server emulators. Computing the intersection of the Client and Server data sets took approximately 1.8 seconds. The Server takes a long time because it has to perform many computations, such as calculating the random polynomials and the randomized input shares that keep a third party from learning the shares, before sending the results to the Client. The Client side computation consists of generating the polynomial from the hash values and encrypting its coefficients before sending them to the Server. The performance looks good, as the protocol handled large data sets of contacts quickly. The graphs show that it can handle large amounts of data without noticeable delay and that the intersection can be found robustly and efficiently. The performance of the Client and Server sides is shown in the graphs below. On the Client side, the computation time increases as the number of contacts increases.


Fig.9. Performance of Client with number of contacts

On the Server side, the computation time increased exponentially with the number of contacts. Comparing the Client and Server computations, the Client side took more time because of the Lagrange [9] and Elgamal [10] operations performed during decryption to find the intersection.

Fig.10. Performance of Server with number of contacts

As the number of contacts increases, the overall performance of the protocol is as follows.

Fig.11. Efficient Robust Private Set Intersection protocol performance with number of contacts

VIII. ACKNOWLEDGEMENTS

I would like to thank Dr. Levent Ertaul for helping me with this project. With his guidance and encouragement I was able to complete the project on time. I especially thank him for his suggestions on how to write this paper.

IX. CONCLUSION

Privacy plays an important role in today's world. Without privacy there is no secrecy for data, so privacy is a precious commodity. In this paper, to protect privacy, I implemented the Efficient Robust Private Set Intersection protocol to find the intersection of data sets, and I evaluated the performance of the protocol. The protocol is also scalable, as I tested it with large data sets. I implemented the protocol on the Android platform, which shows that it can run on mobile phones.

X. REFERENCES

[1] Lindell, Y., Pinkas, B.: Privacy preserving data mining. Journal of Cryptology, 36–54 (2000)

[2] Freedman, M.J., Nissim, K., Pinkas, B.: Efficient private matching and set intersection. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 1–19. Springer, Heidelberg (2004)

[3] Freedman, M.J., Nissim, K., Pinkas, B.: Efficient private matching and set intersection. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 1–19. Springer, Heidelberg (2004)

[4] Choi, S., Dachman-Soled, D., Malkin, T., Wee, H.: Black-box construction of a non-malleable encryption scheme from any semantically secure one. In: Canetti, R. (ed.) TCC 2008. LNCS, vol. 4948, pp. 427–444. Springer, Heidelberg (2008)

[5] Shamir, A.: How to share a secret. Commun. ACM 22(11), 612–613 (1979)

[6] Katz, J., Lindell, Y.: Introduction to Modern Cryptography. Chapman & Hall/CRC Cryptography and Network Security Series. Chapman & Hall/CRC, Boca Raton (2007)

[7] De Cristofaro, E., Tsudik, G.: Practical private set intersection protocols with linear computational and bandwidth complexity

[8] Dachman-Soled, D., Malkin, T., Raykova, M., Yung, M.: Efficient robust private set intersection

[9] http://en.wikipedia.org/wiki/Lagrange_polynomial

[10] http://en.wikipedia.org/wiki/ElGamal_encryption