Page 1: Practical Techniques for Searches on Encrypted Data

Yongdae Kim (kyd@cs.umn.edu)

Written by Song, Wagner, Perrig

Page 2: Contents

- Introduction
- Basic Cryptography
- Schemes
  - Basic search
  - Controlled search
  - Hidden query
  - Final scheme
- Discussions
- Conclusion and open problems

Page 3: Introduction

- IEEE Symp. on Security and Privacy 2000
- I'm not an expert in databases, but…
- Desirable features
  - Encrypted data
  - Encrypted query
  - Encrypted result
  - Untrusted server

Page 4: Example

- Mail server
  - Fully trusted, i.e. the sys admin can read my e-mail
  - Can build secure storage
  - But need to sacrifice functionality
- Moving the computation to the data storage seems to be very difficult
  - For example, how to search encrypted data?

Page 5: Nice Features

- Provably secure
- Controlled searching: the untrusted server cannot search for a word without the owner's authorization
- Hidden queries: the user may ask the untrusted server to search for a secret word without revealing the word
- Fast and efficient
  - Does not rely on public-key algorithms
  - Based on a stream cipher

Page 6: Other Features

- Each document is divided up into "words"
  - Assume all words have the same length
  - Otherwise, pad or split them
- Certain computation on the ciphertext
- Search method
  - Indexing
    - Faster search, but advantageous mainly for read-only data
  - Sequential scan

Page 7: Basics

- Cryptography: the study of mathematical techniques related to aspects of information security such as confidentiality, data integrity, entity authentication, and data origin authentication.

[Figure: Alice, Bob, and Eve]

Page 8: Taxonomy of Cryptographic Primitives

Security primitives

- Unkeyed primitives
  - Arbitrary length hash functions
  - One-way permutations
  - Random sequences
- Symmetric-key primitives
  - Symmetric-key ciphers
    - Block ciphers
    - Stream ciphers
  - Arbitrary length hash functions (MACs)
  - Signatures
  - Pseudorandom sequences
  - Identification primitives
- Public-key primitives
  - Public-key ciphers
  - Signatures
  - Identification primitives

Page 9: Symmetric Key Encryption

- Encryption key and decryption key are the same (mostly)
  - E_K(M) = C
  - D_K(C) = M
- Ex. DES, AES, IDEA, …
- Fast
  - Based on simple operations (XOR, shift, substitute, rotate, …)
- How to share a key?

Page 10: Block/Stream ciphers

- Block cipher
  - Breaks up the plaintext into blocks of a fixed length, and then encrypts one block at a time.
- Stream cipher
  - Takes the plaintext string and produces a ciphertext string using a keystream
  - M ⊕ S = C, C ⊕ S = M
    - where S is a keystream and ⊕ is bit-wise exclusive-or
    - S is generated by a keystream generator or pseudo-random function
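
The XOR relation on this slide can be sketched in a few lines of Python. This is a minimal illustration and not a secure cipher: the `keystream` helper (SHA-256 over a key and a counter), the key, and the message are assumptions made for the example and do not come from the slides.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream generator (assumption): SHA-256 over key || counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    """Bit-wise exclusive-or of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

message = b"attack at dawn"                  # M
s = keystream(b"shared key", len(message))   # S
ciphertext = xor(message, s)                 # C = M xor S
recovered = xor(ciphertext, s)               # M = C xor S
```

Because XOR is its own inverse, the same keystream both encrypts and decrypts, which is why both parties only need to agree on the key that seeds the generator.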

Page 11: Hash function/MAC

- Hash function
  - Computationally efficient function mapping binary strings of arbitrary length to binary strings of some fixed length
- Cryptographic hash function
  - One-way, collision-free
- MAC (message authentication code)
  - Keyed hash function
  - Parties that share a key can check the integrity of data
  - MAC_K(M) = H(K_1 || H(K_2, M))
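
The nested MAC formula can be tried out as follows. This is a sketch under assumptions: H is taken to be SHA-256, the inner call H(K_2, M) is read as hashing the concatenation K_2 || M, and the keys and messages are made up for the example.

```python
import hashlib
import hmac

def nested_mac(k1: bytes, k2: bytes, message: bytes) -> bytes:
    """MAC_K(M) = H(K1 || H(K2 || M)) with H = SHA-256 (an assumed choice)."""
    inner = hashlib.sha256(k2 + message).digest()
    return hashlib.sha256(k1 + inner).digest()

def verify(k1: bytes, k2: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(nested_mac(k1, k2, message), tag)

k1, k2 = b"outer key", b"inner key"               # hypothetical shared keys
tag = nested_mac(k1, k2, b"pay Bob 10")
ok = verify(k1, k2, b"pay Bob 10", tag)           # unchanged message
tampered = verify(k1, k2, b"pay Bob 99", tag)     # modified message
```

For real use, Python's standard `hmac.new(key, msg, hashlib.sha256)` provides HMAC, a standardized construction of this nested kind, and is preferable to a hand-rolled version.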

Page 12: Notations

- S_i : i-th stream from stream cipher G, n-m bits
- W_i : i-th word, n bits
- C_i : i-th ciphertext, n bits
- ⊕ : bitwise exclusive-or
- F_k(x) : MAC of x using key k, m-bit output

Page 13: Scheme I: Basic scheme

- To search W
  - Alice reveals {k_i | where W may occur}
  - Bob checks if W ⊕ C_i is of the form <s, F_{k_i}(s)> for some s
- For unknown k_i, Bob knows nothing
- To search W, either
  - Alice reveals all k_i, or
  - Alice has to know where W may occur

[Figure: the plaintext W_i is XORed with the stream cipher output <S_i, F_{k_i}(S_i)> to produce the ciphertext]
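
A minimal sketch of Scheme I, assuming byte-sized parameters (n = 16 bytes, m = 4 bytes) instead of bits, truncated HMAC-SHA256 as a stand-in for both the MAC F and the stream cipher G, and made-up keys and words. None of these concrete choices come from the slides.

```python
import hashlib
import hmac

N, M = 16, 4   # word and MAC lengths in bytes (the slides count bits)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def prf(key, data, length):
    """Stand-in for the MAC F: truncated HMAC-SHA256 (assumption)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:length]

def pad(word):
    """Pad a word of at most N bytes to exactly N bytes."""
    return word.ljust(N, b"\0")

def encrypt(seed, keys, words):
    """C_i = W_i xor <S_i, F_{k_i}(S_i)>."""
    out = []
    for i, (k_i, w) in enumerate(zip(keys, words)):
        s_i = prf(seed, i.to_bytes(8, "big"), N - M)   # toy stream cipher G
        out.append(xor(pad(w), s_i + prf(k_i, s_i, M)))
    return out

def bob_check(c_i, k_i, word):
    """Is W xor C_i of the form <s, F_{k_i}(s)>?"""
    t = xor(pad(word), c_i)
    s, tag = t[:N - M], t[N - M:]
    return hmac.compare_digest(tag, prf(k_i, s, M))

words = [b"meet", b"me", b"at", b"noon"]
keys = [b"k0", b"k1", b"k2", b"k3"]             # one key per position
cipher = encrypt(b"seed", keys, words)
hit = bob_check(cipher[3], keys[3], b"noon")    # Alice revealed k_3
miss = bob_check(cipher[2], keys[2], b"noon")   # position 2 holds "at"
```

With an m-byte tag, a non-matching position passes the check only with probability about 2^(-8m), so a short MAC trades a few false positives for less structure in the ciphertext.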

Page 14: Scheme II: Controlled search

- Choose k_i = f_{k'}(W_i), where
  - k' is secret, never revealed
  - f is another MAC with output size = |k_i|
- Reveal only f_{k'}(W) and W
- Bob identifies only the locations where W occurs
- But reveals nothing on the locations i where W != W_i
- Still does not support hidden search
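
The controlled search can be sketched as follows, again with truncated HMAC-SHA256 standing in for F, f, and the stream cipher, byte-sized parameters, and hypothetical keys and words (all assumptions for illustration).

```python
import hashlib
import hmac

N, M = 16, 4   # word and MAC lengths in bytes (the slides count bits)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def prf(key, data, length):
    """Stand-in for the MACs F and f: truncated HMAC-SHA256 (assumption)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:length]

def pad(word):
    return word.ljust(N, b"\0")

def word_key(k_prime, word):
    """k_i = f_{k'}(W_i): the key depends on the word, not on the position."""
    return prf(k_prime, pad(word), 16)

def encrypt(seed, k_prime, words):
    out = []
    for i, w in enumerate(words):
        s_i = prf(seed, i.to_bytes(8, "big"), N - M)   # toy stream cipher G
        out.append(xor(pad(w), s_i + prf(word_key(k_prime, w), s_i, M)))
    return out

def bob_search(cipher, word, k):
    """Bob scans every position using only W and k = f_{k'}(W)."""
    hits = []
    for i, c_i in enumerate(cipher):
        t = xor(pad(word), c_i)
        if hmac.compare_digest(t[N - M:], prf(k, t[:N - M], M)):
            hits.append(i)
    return hits

k_prime = b"alice secret k'"                    # never leaves Alice
words = [b"meet", b"me", b"at", b"noon", b"meet"]
cipher = encrypt(b"seed", k_prime, words)
hits = bob_search(cipher, b"meet", word_key(k_prime, b"meet"))
```

Bob receives a key that is useful only for the word "meet", so he can locate that word everywhere without Alice telling him where to look, and he gains no key for any other word.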

Page 15: Scheme III: Hidden Searches

[Figure: the plaintext W_i is first encrypted as E_{k''}(W_i), then XORed with the stream cipher output <S_i, F_{k_i}(S_i)> to produce the ciphertext]

Page 16: Scheme III (Cont'd)

- Let X_i := E_{k''}(W_i)
- After the pre-encryption, Alice has X_1, …, X_l
- Same as before, C_i = X_i ⊕ T_i where
  - X_i = E_{k''}(W_i)
  - T_i = <S_i, F_{k_i}(S_i)>
- To search W, Alice queries (X, k) such that X := E_{k''}(W) and k := f_{k'}(X)
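
A sketch of the hidden query, assuming a toy 4-round Feistel network as the deterministic cipher E_{k''} (the Python standard library has no block cipher), truncated HMAC-SHA256 for F, f, and the stream cipher, and made-up keys. Only the search direction is shown here.

```python
import hashlib
import hmac

N, M = 16, 4   # word and MAC lengths in bytes (the slides count bits)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def prf(key, data, length):
    """Stand-in for F and f: truncated HMAC-SHA256 (assumption)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:length]

def E(key, word):
    """Toy deterministic cipher E_{k''}: 4-round Feistel network (assumption)."""
    block = word.ljust(N, b"\0")
    left, right = block[:N // 2], block[N // 2:]
    for rnd in range(4):
        left, right = right, xor(left, prf(key, bytes([rnd]) + right, N // 2))
    return left + right

def encrypt(seed, k_prime, k_enc, words):
    out = []
    for i, w in enumerate(words):
        x_i = E(k_enc, w)                                 # X_i = E_{k''}(W_i)
        k_i = prf(k_prime, x_i, 16)                       # k_i = f_{k'}(X_i)
        s_i = prf(seed, i.to_bytes(8, "big"), N - M)      # toy stream cipher G
        out.append(xor(x_i, s_i + prf(k_i, s_i, M)))      # C_i = X_i xor T_i
    return out

def bob_search(cipher, x, k):
    """Bob sees only (X, k), never the word W itself."""
    hits = []
    for i, c_i in enumerate(cipher):
        t = xor(x, c_i)
        if hmac.compare_digest(t[N - M:], prf(k, t[:N - M], M)):
            hits.append(i)
    return hits

k_prime, k_enc = b"k-prime", b"k-double-prime"            # hypothetical keys
words = [b"meet", b"me", b"at", b"noon"]
cipher = encrypt(b"seed", k_prime, k_enc, words)
x = E(k_enc, b"noon")                                     # X := E_{k''}(W)
hits = bob_search(cipher, x, prf(k_prime, x, 16))         # k := f_{k'}(X)
```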

Page 17: A problem of Scheme III

- Scheme III has a problem… Guess what?
- If Alice generates k_i = f_{k'}(E_{k''}(W_i)), she cannot recover the plaintext from the ciphertext.
  - C_i = X_i ⊕ T_i where T_i = <S_i, F_{k_i}(S_i)>
  - To compute X_i from C_i, we have to know T_i
  - S_i can be computed easily
  - How about F_{k_i}(S_i)?
  - The problem is k_i
  - To compute this, we have to know E_{k''}(W_i) for all i
- Oops! If you know all of these, why do you need search?

Page 18: Scheme IV: The Final Scheme

- Fix
  - X_i = E_{k''}(W_i) = <L_i, R_i> where |L_i| = n-m bits
  - T_i = <S_i, F_{k_i}(S_i)> where k_i = f_{k'}(L_i) instead of f_{k'}(X_i)
- Decryption now works: S_i recovers L_i from the first n-m bits of C_i, L_i gives k_i, and F_{k_i}(S_i) recovers R_i

Page 19: Scheme IV: The Final Picture

[Figure: the plaintext W_i is encrypted as E_{k''}(W_i) = <L_i, R_i>; the key k_i = f_{k'}(L_i) is derived from L_i; E_{k''}(W_i) is then XORed with the stream cipher output <S_i, F_{k_i}(S_i)> to produce the ciphertext]
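
The final scheme, including decryption, can be sketched end to end. The assumptions are the same as in any toy version: byte-sized parameters (n = 16, m = 4), a 4-round Feistel network as an invertible stand-in for E_{k''}, truncated HMAC-SHA256 for F, f, and the stream cipher, and hypothetical key names.

```python
import hashlib
import hmac

N, M = 16, 4   # word and MAC lengths in bytes (the slides count bits)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def prf(key, data, length):
    """Stand-in for F, f, and the stream cipher: truncated HMAC-SHA256."""
    return hmac.new(key, data, hashlib.sha256).digest()[:length]

def E(key, block):
    """Toy invertible cipher E_{k''}: 4-round Feistel network (assumption)."""
    l, r = block[:N // 2], block[N // 2:]
    for rnd in range(4):
        l, r = r, xor(l, prf(key, bytes([rnd]) + r, N // 2))
    return l + r

def D(key, block):
    """Inverse of E: undo the Feistel rounds in reverse order."""
    l, r = block[:N // 2], block[N // 2:]
    for rnd in reversed(range(4)):
        l, r = xor(r, prf(key, bytes([rnd]) + l, N // 2)), l
    return l + r

def encrypt_word(keys, i, word):
    k_enc, k_f, seed = keys
    x = E(k_enc, word.ljust(N, b"\0"))            # X_i = <L_i, R_i>
    k_i = prf(k_f, x[:N - M], 16)                 # k_i = f_{k'}(L_i)
    s = prf(seed, i.to_bytes(8, "big"), N - M)    # S_i
    return xor(x, s + prf(k_i, s, M))             # C_i = X_i xor T_i

def decrypt_word(keys, i, c):
    k_enc, k_f, seed = keys
    s = prf(seed, i.to_bytes(8, "big"), N - M)
    left = xor(c[:N - M], s)                      # L_i from S_i alone
    k_i = prf(k_f, left, 16)                      # then k_i from L_i
    right = xor(c[N - M:], prf(k_i, s, M))        # then R_i
    return D(k_enc, left + right).rstrip(b"\0")

def trapdoor(keys, word):
    """Alice's query (X, k) for the word W."""
    k_enc, k_f, _ = keys
    x = E(k_enc, word.ljust(N, b"\0"))
    return x, prf(k_f, x[:N - M], 16)

def server_match(c, x, k):
    """Bob's check: is C_i xor X of the form <s, F_k(s)>?"""
    t = xor(c, x)
    return hmac.compare_digest(t[N - M:], prf(k, t[:N - M], M))

keys = (b"k-double-prime", b"k-prime", b"stream seed")   # hypothetical keys
words = [b"meet", b"me", b"at", b"noon", b"meet"]
cipher = [encrypt_word(keys, i, w) for i, w in enumerate(words)]
x, k = trapdoor(keys, b"meet")
hits = [i for i, c in enumerate(cipher) if server_match(c, x, k)]
plain = [decrypt_word(keys, i, c) for i, c in enumerate(cipher)]
```

Deriving k_i from L_i only is what makes `decrypt_word` possible: Alice can peel off L_i with the keystream before she needs k_i, which was not the case when k_i depended on all of X_i.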

Page 20: Practical Considerations

- Alice needs to remember only one password k''
- Supporting more advanced queries
  - Boolean operations (W and W')
  - Proximity queries (W near W')
  - Phrase searches (W immediately precedes W')
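
One way such queries could be layered on top of single-word search is to combine the position lists returned for W and W'. The sketch below assumes each single-word search yields a list of matching word positions within one document; the function names and sample positions are hypothetical.

```python
def boolean_and(hits_w, hits_w2):
    """W and W': the document matches if both words occur somewhere."""
    return bool(hits_w) and bool(hits_w2)

def proximity(hits_w, hits_w2, distance):
    """W near W': some pair of occurrences at most `distance` words apart."""
    return any(abs(i - j) <= distance for i in hits_w for j in hits_w2)

def phrase(hits_w, hits_w2):
    """W immediately precedes W': W at position i and W' at position i + 1."""
    following = set(hits_w2)
    return [i for i in hits_w if i + 1 in following]

hits_w = [2, 9]      # positions returned by the search for W
hits_w2 = [3, 20]    # positions returned by the search for W'
both = boolean_and(hits_w, hits_w2)
near = proximity(hits_w, hits_w2, 2)
starts = phrase(hits_w, hits_w2)
```

Note that this combination reveals the positions of both words to whoever evaluates it, so it is only a starting point for thinking about these query types.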

Page 21: Dealing with variable length words

- Pick a long enough fixed-size block
  - A fixed padding is required
  - Inefficient in space
- Support variable length words with word length
  - Instead of W, use <l_W, W>
  - Move pointer bit by bit
  - Longer scan time, but efficient in space
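
The space trade-off between the two options can be made concrete on the plaintext encoding. This sketch assumes a 16-byte block and a one-byte length field, both chosen only for illustration, and it does not model the scan over the ciphertext.

```python
BLOCK = 16  # fixed block size in bytes (assumed)

def pad_fixed(word: bytes) -> bytes:
    """Option 1: pad every word to the same fixed-size block."""
    if len(word) > BLOCK:
        raise ValueError("word longer than the block; split it first")
    return word.ljust(BLOCK, b"\0")

def with_length(word: bytes) -> bytes:
    """Option 2: encode W as <l_W, W> with a one-byte length field."""
    if len(word) > 255:
        raise ValueError("word too long for a one-byte length field")
    return bytes([len(word)]) + word

def parse(stream: bytes) -> list:
    """Split a sequence of <l_W, W> records back into words."""
    words, pos = [], 0
    while pos < len(stream):
        length = stream[pos]
        words.append(stream[pos + 1:pos + 1 + length])
        pos += 1 + length
    return words

words = [b"a", b"search", b"cryptography"]
fixed = b"".join(pad_fixed(w) for w in words)      # 3 blocks of 16 bytes
packed = b"".join(with_length(w) for w in words)   # 1 + len(W) bytes each
```

Here the fixed-size encoding takes 48 bytes and the length-prefixed one 22 bytes. The price, as the slide notes, is that word boundaries are no longer aligned, so the scan has to try every offset.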

Page 22: Index-based Search

- For large database applications
- Index contains a list of keywords
  - Each keyword points to the documents containing it
- Methods
  - Encrypt keywords and leave pointers unencrypted
  - Encrypt pointers also
    - Alice queries an encrypted keyword, and Bob returns encrypted pointers
    - Alice needs to spend an extra round
- Update cost is expensive
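
A minimal sketch of the first method (encrypted keywords, unencrypted pointers). It assumes a deterministic HMAC-SHA256 token as a stand-in for keyword encryption, which is a substitution made for the example rather than the construction from the slides; the key and documents are hypothetical.

```python
import hashlib
import hmac

def token(key: bytes, keyword: str) -> str:
    """Deterministic keyword token; HMAC stands in for keyword encryption."""
    return hmac.new(key, keyword.encode(), hashlib.sha256).hexdigest()

def build_index(key: bytes, docs: dict) -> dict:
    """Map each keyword token to the ids of the documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(token(key, word), []).append(doc_id)
    return index

def lookup(index: dict, query_token: str) -> list:
    """Bob's side: return the pointers stored under the token."""
    return sorted(index.get(query_token, []))

key = b"alice index key"                           # known only to Alice
docs = {1: "budget review meeting", 2: "lunch plans", 3: "budget draft"}
index = build_index(key, docs)                     # stored by Bob
result = lookup(index, token(key, "budget"))
missing = lookup(index, token(key, "salary"))
```

Adding or changing a document means touching the index entry of every keyword in it, which illustrates why the update cost is high compared with a sequential scan.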

Page 23: Conclusion and Open Problems

- Pretty efficient
  - No public key operation
  - Small message expansion
- Interesting and useful
- Open problems
  - Searching "Record > 13" ?#^@*#^!
  - Searching "a[a-z]b": needs 26 queries
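
The count of 26 comes from expanding the character class into one exact-match query per letter, since the scheme only supports searching for a whole word. A small sketch, with a hypothetical helper name:

```python
import string

def expand_single_class(prefix: str, suffix: str) -> list:
    """Expand prefix + [a-z] + suffix into one exact-match query per letter."""
    return [prefix + c + suffix for c in string.ascii_lowercase]

queries = expand_single_class("a", "b")   # "aab", "abb", ..., "azb"
```

Each additional character class multiplies the number of queries by 26, so patterns with several classes become impractical quickly.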
