Post on 24-Dec-2015
Practical Techniques for Searches on Encrypted Data
Yongdae Kim
kyd@cs.umn.edu
Written by Song, Wagner, Perrig
Contents
Introduction
Basic Cryptography
Schemes
  Basic search
  Controlled search
  Hidden query
  Final scheme
Discussions
Conclusion and open problems
Introduction
IEEE Symp. on Security and Privacy 2000
I'm not an expert in databases, but…
Desirable features
  Encrypted data
  Encrypted query
  Encrypted result
  Untrusted server
Example
Mail server
  Fully trusted, i.e. the sys admin can read my e-mail
  Can build secure storage
  But need to sacrifice functionality
Moving the computation to the data storage seems to be very difficult
For example, how to search encrypted data?
Nice Features
Provably secure
Controlled searching: the untrusted server cannot search for a word without the owner's authorization
Hidden queries: the user may ask the untrusted server to search for a secret word without revealing the word
Fast and efficient
  Does not rely on public-key algorithms
  Based on a stream cipher
Other Features
Each document is divided up into "words"
  Assume all words have the same length
  Otherwise, pad or split them
Supports certain computations on the ciphertext
Search method
  Indexing
    Faster search
    But advantageous mainly for read-only data
  Sequential scan
Basics
Cryptography: the study of mathematical techniques related to aspects of information security such as confidentiality, data integrity, entity authentication, and data origin authentication.
[Figure: Alice, Bob, Eve]
Taxonomy of Cryptographic Primitives
Security primitives
  Unkeyed primitives
    Arbitrary length hash functions
    One-way permutations
    Random sequences
  Symmetric-key primitives
    Symmetric-key ciphers
      Block ciphers
      Stream ciphers
    Arbitrary length hash functions (MACs)
    Signatures
    Pseudorandom sequences
    Identification primitives
  Public-key primitives
    Public-key ciphers
    Signatures
    Identification primitives
Symmetric Key Encryption
Encryption key and decryption key are the same (mostly)
  $E_K(M) = C$
  $D_K(C) = M$
Ex. DES, AES, IDEA, …
Fast
Based on simple operations (XOR, shift, substitute, rotate, …)
How to share a key?
Block/Stream ciphers
Block cipher: breaks up the plaintext into blocks of a fixed length, and then encrypts one block at a time.
Stream cipher: takes the plaintext string and produces a ciphertext string using a keystream
  $M \oplus S = C$, $C \oplus S = M$
  where $S$ is a keystream and $\oplus$ is bit-wise exclusive-or
  $S$ is generated by a keystream generator or pseudo-random function
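The XOR relation on this slide can be tried out with a few lines of Python. This is only a sketch: the `keystream` function below is a hypothetical generator built from HMAC-SHA256 in counter mode, chosen because it needs nothing beyond the standard library. It is not the generator used in the paper.

```python
import hashlib
import hmac


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream generator (assumption): HMAC-SHA256 over a counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """Bit-wise exclusive-or of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


message = b"attack at dawn"
s = keystream(b"demo key", len(message))
ciphertext = xor(message, s)    # C = M xor S
recovered = xor(ciphertext, s)  # M = C xor S
```

The same keystream must never be reused for two different messages, since XORing the two ciphertexts would cancel it out.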
Hash function/MAC
Hash function: computationally efficient function mapping binary strings of arbitrary length to binary strings of some fixed length
Cryptographic hash function
  One-way, collision-free
MAC (Message authentication code)
  Keyed hash function
  Parties that share a key can check the integrity of data
  $\mathrm{MAC}_K(M) = H(K_1 \,\|\, H(K_2, M))$
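As a rough illustration of the nested construction on this slide, the sketch below assumes SHA-256 for $H$ and reads $H(K_2, M)$ as hashing the concatenation of $K_2$ and $M$. The names `nested_mac` and `verify` are made up for the example.

```python
import hashlib
import hmac


def nested_mac(k1: bytes, k2: bytes, message: bytes) -> bytes:
    """MAC_K(M) = H(K1 || H(K2 || M)), with SHA-256 assumed for H."""
    inner = hashlib.sha256(k2 + message).digest()
    return hashlib.sha256(k1 + inner).digest()


def verify(k1: bytes, k2: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(nested_mac(k1, k2, message), tag)


k1, k2 = b"outer key", b"inner key"
tag = nested_mac(k1, k2, b"pay Bob 10")
ok = verify(k1, k2, b"pay Bob 10", tag)        # unmodified message
tampered = verify(k1, k2, b"pay Bob 99", tag)  # modified message
```

In real code, Python's `hmac` module already provides the standardized HMAC construction, which is the safer choice than a hand-rolled variant like this one.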
Notations
$S_i$: i-th stream from stream cipher $G$, $n-m$ bits
$W_i$: i-th word, $n$ bits
$C_i$: i-th cipher text, $n$ bits
$\oplus$: bitwise exclusive-or
$F_k(x)$: MAC of $x$ using key $k$, $m$ bits output
Scheme I: Basic scheme
To search $W$
  Alice reveals $\{k_i \mid W \text{ may occur at position } i\}$
  Bob checks if $W \oplus C_i$ is of the form $\langle s, F_{k_i}(s) \rangle$ for some $s$
For unknown $k_i$, Bob knows nothing
To search $W$, either
  Alice reveals all $k_i$, or
  Alice has to know where $W$ may occur
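[Figure: the plaintext word $W_i$ is XORed with $\langle S_i, F_{k_i}(S_i) \rangle$, where $S_i$ comes from the stream cipher, to give the ciphertext $C_i$]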
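A minimal sketch of Scheme I follows, under loudly stated assumptions: lengths are in bytes rather than bits (`N = 16`, `M = 4`), and HMAC-SHA256 truncated to the needed length stands in for both the stream cipher $G$ and the MAC $F$. All function names are illustrative.

```python
import hashlib
import hmac

N, M = 16, 4  # word length n and MAC length m, in bytes here (assumed sizes)


def prf(key: bytes, data: bytes, out_len: int) -> bytes:
    """HMAC-SHA256 truncated to out_len bytes; stands in for G and F."""
    return hmac.new(key, data, hashlib.sha256).digest()[:out_len]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def pad(word: str) -> bytes:
    """Zero-pad a word to one N-byte block; longer words are rejected."""
    data = word.encode()
    if len(data) > N:
        raise ValueError("word too long for one block")
    return data.ljust(N, b"\0")


def stream_block(seed: bytes, i: int) -> bytes:
    """S_i: N - M pseudorandom bytes for position i."""
    return prf(seed, i.to_bytes(4, "big"), N - M)


def encrypt(words, seed, keys):
    """C_i = W_i xor <S_i, F_ki(S_i)>."""
    cipher = []
    for i, word in enumerate(words):
        s = stream_block(seed, i)
        cipher.append(xor(pad(word), s + prf(keys[i], s, M)))
    return cipher


def matches(c_i: bytes, word: str, k_i: bytes) -> bool:
    """Bob's check: is W xor C_i of the form <s, F_ki(s)>?"""
    t = xor(c_i, pad(word))
    s, tag = t[:N - M], t[N - M:]
    return hmac.compare_digest(tag, prf(k_i, s, M))


words = ["meet", "me", "at", "noon"]
seed = b"stream seed"
keys = [prf(b"key seed", i.to_bytes(4, "big"), 16) for i in range(len(words))]
cipher = encrypt(words, seed, keys)

# Alice reveals every k_i here, so Bob can test all positions for "at".
hits = [i for i, c in enumerate(cipher) if matches(c, "at", keys[i])]
```

With an `M`-byte tag, a non-matching position passes the check by chance with probability about $2^{-8M}$, so false positives are possible but rare.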
Scheme II: Controlled search
Choose $k_i = f_{k'}(W_i)$, where
  $k'$ is secret, never revealed
  $f$ is another MAC with output size $= |k_i|$
Reveal only $f_{k'}(W)$ and $W$
Bob identifies only the locations where $W$ occurs
But this reveals nothing about the locations $i$ where $W \neq W_i$
Still does not support hidden search
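The controlled-search idea can be sketched as follows. As before, this assumes byte-sized parameters and uses truncated HMAC-SHA256 as a stand-in for $G$, $F$ and $f$. The helper names (`word_key`, `trapdoor`, `search`) are hypothetical.

```python
import hashlib
import hmac

N, M = 16, 4  # word and MAC lengths in bytes (assumed sizes)


def prf(key: bytes, data: bytes, out_len: int) -> bytes:
    """HMAC-SHA256 truncated to out_len bytes; stands in for G, F and f."""
    return hmac.new(key, data, hashlib.sha256).digest()[:out_len]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def pad(word: str) -> bytes:
    data = word.encode()
    if len(data) > N:
        raise ValueError("word too long for one block")
    return data.ljust(N, b"\0")


def word_key(k_prime: bytes, word: str) -> bytes:
    """k_i = f_k'(W_i): the MAC key depends on the word, not the position."""
    return prf(k_prime, pad(word), 16)


def encrypt(words, seed, k_prime):
    cipher = []
    for i, word in enumerate(words):
        s = prf(seed, i.to_bytes(4, "big"), N - M)  # S_i
        tag = prf(word_key(k_prime, word), s, M)    # F_ki(S_i)
        cipher.append(xor(pad(word), s + tag))
    return cipher


def trapdoor(word: str, k_prime: bytes):
    """What Alice reveals to authorize one search: W and f_k'(W)."""
    return pad(word), word_key(k_prime, word)


def search(cipher, w_block: bytes, k: bytes):
    """Bob scans every position using the single revealed key."""
    hits = []
    for i, c in enumerate(cipher):
        t = xor(c, w_block)
        s, tag = t[:N - M], t[N - M:]
        if hmac.compare_digest(tag, prf(k, s, M)):
            hits.append(i)
    return hits


words = ["to", "be", "or", "not", "to", "be"]
k_prime = b"secret k'"
cipher = encrypt(words, b"stream seed", k_prime)
hits = search(cipher, *trapdoor("be", k_prime))
```

The key revealed for "be" is of no help in testing any other word, which is the sense in which the search is controlled. Bob still sees the word itself.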
Scheme III: Hidden Searches
[Figure: the plaintext word $W_i$ is first encrypted to $E_{k''}(W_i)$, then XORed with $\langle S_i, F_{k_i}(S_i) \rangle$, where $S_i$ comes from the stream cipher, to give the ciphertext]
Scheme III (Cont'd)
Let $X_i := E_{k''}(W_i)$
After the pre-encryption, Alice has $X_1, \ldots, X_l$
Same as before, $C_i = X_i \oplus T_i$ where
  $X_i = E_{k''}(W_i)$
  $T_i = \langle S_i, F_{k_i}(S_i) \rangle$
To search $W$, Alice queries $(X, k)$ such that $X := E_{k''}(W)$ and $k := f_{k'}(X)$
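A sketch of the hidden-search variant is below. Python's standard library has no block cipher, so the deterministic encryption $E$ is replaced by a toy 4-round Feistel permutation built from HMAC-SHA256. That substitution, the byte-sized parameters, and all names are assumptions made for illustration only.

```python
import hashlib
import hmac

N, M = 16, 4  # word and MAC lengths in bytes (assumed sizes)


def prf(key: bytes, data: bytes, out_len: int) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()[:out_len]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def pad(word: str) -> bytes:
    data = word.encode()
    if len(data) > N:
        raise ValueError("word too long for one block")
    return data.ljust(N, b"\0")


def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for a real block cipher."""
    half = N // 2
    left, right = block[:half], block[half:]
    for r in range(4):
        left, right = right, xor(left, prf(key, bytes([r]) + right, half))
    return left + right


def encrypt(words, seed, k1, k2):
    """C_i = X_i xor <S_i, F_ki(S_i)> with X_i = E_k''(W_i), k_i = f_k'(X_i)."""
    cipher = []
    for i, word in enumerate(words):
        x = E(k2, pad(word))
        k_i = prf(k1, x, 16)
        s = prf(seed, i.to_bytes(4, "big"), N - M)
        cipher.append(xor(x, s + prf(k_i, s, M)))
    return cipher


def query(word: str, k1: bytes, k2: bytes):
    """Alice's query (X, k): Bob sees E_k''(W), never W itself."""
    x = E(k2, pad(word))
    return x, prf(k1, x, 16)


def search(cipher, x: bytes, k: bytes):
    hits = []
    for i, c in enumerate(cipher):
        t = xor(c, x)
        s, tag = t[:N - M], t[N - M:]
        if hmac.compare_digest(tag, prf(k, s, M)):
            hits.append(i)
    return hits


words = ["to", "be", "or", "not", "to", "be"]
k1, k2 = b"secret k'", b"secret k''"
cipher = encrypt(words, b"stream seed", k1, k2)
x, k = query("be", k1, k2)
hits = search(cipher, x, k)
```

Bob's scanning code is unchanged from the controlled-search setting. Only the value he compares against has changed from the word to its encryption.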
A problem of Scheme III
Scheme III has a problem… Guess what?
If Alice generates $k_i = f_{k'}(E_{k''}(W_i))$, she cannot recover the plaintext from the ciphertext.
  $C_i = X_i \oplus T_i$ where $T_i = \langle S_i, F_{k_i}(S_i) \rangle$
To compute $X_i$ from $C_i$, we have to know $T_i$
  $S_i$ can be computed easily
  How about $F_{k_i}(S_i)$?
The problem is $k_i$
  To compute this, we have to know $E_{k''}(W_i)$ for all $i$
  Oops! If you know all of these, why do you need search?
Scheme IV: The Final Scheme
Fix
  $X_i = E_{k''}(W_i) = \langle L_i, R_i \rangle$ where $|L_i| = n-m$ bits
  $T_i = \langle S_i, F_{k_i}(S_i) \rangle$ where $k_i = f_{k'}(L_i)$ instead of $f_{k'}(X_i)$
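Scheme IV: The Final Picture
[Figure: the plaintext word $W_i$ is encrypted to $E_{k''}(W_i)$; its left part $L_i$ is fed to $f_{k'}$ to derive $k_i$; $\langle S_i, F_{k_i}(S_i) \rangle$, with $S_i$ from the stream cipher, is XORed with $E_{k''}(W_i)$ to give the ciphertext]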
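The sketch below puts the final scheme together, including the decryption step that the previous scheme could not do. The assumptions are the same as in any toy version: byte-sized parameters, truncated HMAC-SHA256 for $G$, $F$ and $f$, and a 4-round Feistel permutation standing in for the block cipher $E$. None of these choices come from the paper.

```python
import hashlib
import hmac

N, M = 16, 4  # word and MAC lengths in bytes (assumed sizes)


def prf(key, data, out_len):
    return hmac.new(key, data, hashlib.sha256).digest()[:out_len]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def pad(word):
    data = word.encode()
    if len(data) > N:
        raise ValueError("word too long for one block")
    return data.ljust(N, b"\0")


def E(key, block):
    """Toy Feistel permutation standing in for the block cipher E."""
    left, right = block[:N // 2], block[N // 2:]
    for r in range(4):
        left, right = right, xor(left, prf(key, bytes([r]) + right, N // 2))
    return left + right


def D(key, block):
    """Inverse of E: undo the Feistel rounds in reverse order."""
    left, right = block[:N // 2], block[N // 2:]
    for r in reversed(range(4)):
        left, right = xor(right, prf(key, bytes([r]) + left, N // 2)), left
    return left + right


def encrypt(words, seed, k1, k2):
    cipher = []
    for i, word in enumerate(words):
        x = E(k2, pad(word))                        # X_i = <L_i, R_i>
        k_i = prf(k1, x[:N - M], 16)                # k_i = f_k'(L_i)
        s = prf(seed, i.to_bytes(4, "big"), N - M)  # S_i
        cipher.append(xor(x, s + prf(k_i, s, M)))
    return cipher


def decrypt(cipher, seed, k1, k2):
    words = []
    for i, c in enumerate(cipher):
        s = prf(seed, i.to_bytes(4, "big"), N - M)
        left = xor(c[:N - M], s)                    # L_i needs only S_i
        k_i = prf(k1, left, 16)                     # so k_i is recoverable
        right = xor(c[N - M:], prf(k_i, s, M))      # R_i
        words.append(D(k2, left + right).rstrip(b"\0").decode())
    return words


def query(word, k1, k2):
    x = E(k2, pad(word))
    return x, prf(k1, x[:N - M], 16)


def search(cipher, x, k):
    hits = []
    for i, c in enumerate(cipher):
        t = xor(c, x)
        if hmac.compare_digest(t[N - M:], prf(k, t[:N - M], M)):
            hits.append(i)
    return hits


words = ["to", "be", "or", "not", "to", "be"]
seed, k1, k2 = b"stream seed", b"secret k'", b"secret k''"
cipher = encrypt(words, seed, k1, k2)
hits = search(cipher, *query("or", k1, k2))
plain = decrypt(cipher, seed, k1, k2)
```

Decryption works because $L_i$ can be recovered from the first $n-m$ bits of $C_i$ and $S_i$ alone, which is enough to rebuild $k_i$ and then strip the MAC part.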
Practical Considerations
Alice needs to remember only one password, $k''$
Supporting more advanced queries
  Boolean operations ($W$ and $W'$)
  Proximity queries ($W$ near $W'$)
  Phrase searches ($W$ immediately precedes $W'$)
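One possible way to build these queries, assuming each single-word search yields a list of matching word positions within a document, is to combine the position lists afterwards. The function names are hypothetical, and the slides do not say how the paper implements this. Note that Bob learns the match positions of both words in this approach.

```python
def both_present(hits_w, hits_w2) -> bool:
    """Boolean AND: both words occur somewhere in the document."""
    return bool(hits_w) and bool(hits_w2)


def phrase_hits(hits_w, hits_w2):
    """Positions i where W sits at i and W' at i + 1."""
    following = set(hits_w2)
    return [i for i in hits_w if i + 1 in following]


def near_hits(hits_w, hits_w2, window: int):
    """Pairs (i, j) where W and W' lie within `window` positions of each other."""
    return [(i, j) for i in hits_w for j in hits_w2 if 0 < abs(i - j) <= window]


# Example position lists, as two separate searches might return them.
hits_w = [3, 17]
hits_w2 = [4, 40]

has_both = both_present(hits_w, hits_w2)
phrases = phrase_hits(hits_w, hits_w2)
nearby = near_hits(hits_w, hits_w2, window=5)
```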
Dealing with variable length words
Pick a long enough fixed-size block
  A fixed padding is required
  Inefficient in space
Support variable length words with word length
  Instead of $W$, use $\langle l_W, W \rangle$
  Move pointer bit by bit
  Longer scan time, but efficient in space
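The two options can be sketched as simple encodings. Both helpers below are illustrative only: the block size of 8 bytes, the zero padding, and the one-byte length field are assumptions and are not taken from the paper.

```python
def to_blocks(word: str, n: int = 8):
    """Fixed-size option: split a word into n-byte blocks, zero-padding the last."""
    data = word.encode()
    chunks = [data[i:i + n] for i in range(0, len(data), n)] or [b""]
    return [chunk.ljust(n, b"\0") for chunk in chunks]


def length_prefixed(word: str) -> bytes:
    """Variable-length option: encode <l_W, W> with a one-byte length field."""
    data = word.encode()
    if len(data) > 255:
        raise ValueError("word too long for a one-byte length field")
    return bytes([len(data)]) + data


short = to_blocks("cat")
long_word = to_blocks("cryptography")
prefixed = length_prefixed("cat")
```

The fixed-size form wastes five of eight bytes on "cat", while the length-prefixed form spends one extra byte per word but forces the scan to consider every possible word boundary.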
Index-based Search
For large database applications
Index contains a list of keywords
  Each keyword points to documents containing it
Methods
  Encrypt keywords and leave pointers unencrypted
  Encrypt pointers also
    Alice queries an encrypted keyword, and Bob returns encrypted pointers
    Alice needs to spend an extra round
Update cost is expensive
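The second method (encrypted keywords and encrypted pointers) might look roughly like the sketch below. It assumes HMAC-SHA256 for the keyword labels and an HMAC-based keystream for the pointer lists. The index layout and every name here are made up for illustration.

```python
import hashlib
import hmac
import json


def prf(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def keystream(key: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += prf(key, counter.to_bytes(8, "big"))
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def label(key: bytes, word: str) -> bytes:
    """Encrypted keyword that Bob stores and Alice queries."""
    return prf(key, b"label:" + word.encode())


def pointer_pad(key: bytes, word: str, length: int) -> bytes:
    """Per-keyword keystream used to hide the pointer list."""
    return keystream(prf(key, b"ptr:" + word.encode()), length)


def build_index(docs, key: bytes):
    postings = {}
    for doc_id, words in docs.items():
        for word in set(words):
            postings.setdefault(word, []).append(doc_id)
    index = {}
    for word, ids in postings.items():
        blob = json.dumps(sorted(ids)).encode()
        index[label(key, word)] = xor(blob, pointer_pad(key, word, len(blob)))
    return index


def decrypt_pointers(key: bytes, word: str, blob: bytes):
    return json.loads(xor(blob, pointer_pad(key, word, len(blob))).decode())


docs = {1: ["alpha", "beta"], 2: ["beta", "gamma"]}
key = b"index key"
index = build_index(docs, key)                  # stored by Bob
blob = index.get(label(key, "beta"))            # round 1: Bob returns the blob
doc_ids = decrypt_pointers(key, "beta", blob)   # Alice decrypts, then fetches
```

Fetching the documents named in `doc_ids` is the extra round mentioned on the slide. Adding a document means rebuilding and re-encrypting the affected pointer lists, which is where the update cost comes from. In this sketch the blob length still leaks roughly how many documents contain the keyword.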
Conclusion and Open Problems
Pretty efficient
  No public key operation
  Small message expansion
Interesting, and useful
Open problems
  Searching "Record > 13" ?#^@*#^!
  Searching "a[a-z]b": needs 26 queries
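The count of 26 comes from the fact that only exact-word matching is available, so a one-character class has to be expanded into one query per letter. A tiny sketch, with a hypothetical helper name:

```python
import string


def expand_single_char_class(prefix: str, suffix: str):
    """Expand prefix + [a-z] + suffix into one exact-match query per letter."""
    return [prefix + letter + suffix for letter in string.ascii_lowercase]


queries = expand_single_char_class("a", "b")
```

Each of these words would then need its own search query, and the count multiplies for every additional character class in the pattern.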