CryptDB: A Practical Encrypted Relational DBMS
Raluca Ada Popa, Nickolai Zeldovich, and Hari Balakrishnan
MIT CSAIL
New England Database Summit 2011
Problem: data leaks from DBs
• Hackers
• Curious DB administrators
• Physical attacks
• Both on public clouds and private data centers
• Regulatory laws
Approach
Perform SQL query processing on encrypted data
• Client frontend (receives user queries): trusted; stores schema, master key; no query execution
• Database server: stores the database and processes SQL queries; not trusted to keep data private
1. Support standard SQL queries on encrypted data
2. Process queries completely at the DB server
3. No change to existing DBMS
Example
emp (rank, name, salary)
User query: SELECT * FROM emp WHERE salary = 100
Query sent by the frontend: SELECT * FROM table1 WHERE col1 = x5a8c34
[Figure: plaintext salaries 60, 100, 800, 100 at the frontend; the server's table holds only ciphertexts such as x5a8c34 and x638e54.]
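The rewriting step in the example can be sketched in a few lines of Python. This is a minimal illustration, not CryptDB's implementation: a keyed hash (HMAC) stands in for a deterministic encryption scheme, so the server can match equal values but the sketch cannot decrypt results. The names `MASTER_KEY`, `ANON`, `det_token`, and `rewrite_equality`, and the mapping of salary to `col3`, are assumptions made for this example. SQLite plays the role of the unmodified DBMS.

```python
import hashlib
import hmac
import sqlite3

MASTER_KEY = b"demo-master-key"              # hypothetical; known only to the frontend
ANON = {"emp": "table1", "salary": "col3"}   # hypothetical anonymized names


def det_token(column: str, value: int) -> str:
    """Keyed-hash stand-in for DET: equal plaintexts map to equal tokens."""
    column_key = hmac.new(MASTER_KEY, column.encode(), hashlib.sha256).digest()
    digest = hmac.new(column_key, str(value).encode(), hashlib.sha256).hexdigest()
    return "x" + digest[:16]


def rewrite_equality(table: str, column: str, value: int):
    """Rewrite `SELECT ... WHERE column = value` into its encrypted form."""
    sql = f"SELECT rowid FROM {ANON[table]} WHERE {ANON[column]} = ?"
    return sql, (det_token(column, value),)


# The "server": an unmodified SQL engine that only ever sees tokens.
server = sqlite3.connect(":memory:")
server.execute("CREATE TABLE table1 (col3 TEXT)")
for salary in (60, 100, 800, 100):
    server.execute("INSERT INTO table1 (col3) VALUES (?)",
                   (det_token("salary", salary),))

# The frontend rewrites the user's query; the server evaluates it on tokens.
sql, params = rewrite_equality("emp", "salary", 100)
matches = sorted(row[0] for row in server.execute(sql, params))
stored = [row[0] for row in server.execute("SELECT col3 FROM table1")]
print(sql, params, matches)
```

The server finds the two rows whose salary is 100 without ever holding the value 100 itself.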
Two techniques
1. SQL-aware encryption strategy
– Different encryption schemes provide different functionality
2. Adjustable query-based encryption
– Adapt encryption of data based on user queries
1. SQL-aware encryption
Privacy: highest at the top of the table

Scheme   Operation   Details
RND      None        AES in UFE
HOM      +, *        e.g., Paillier
DET      equality    AES in CTR
SEARCH   ILIKE       Song et al. ’00
JOIN     join        new
OPE      order       Boldyreva et al. ’09 (first practical implementation)

Equality: e.g., =, !=, GROUP BY, IN, COUNT, DISTINCT
Order: e.g., >, <, ORDER BY, SORT, MAX, MIN
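As one concrete instance of a scheme that "provides functionality", the HOM entry can be illustrated with a toy Paillier cryptosystem: multiplying ciphertexts adds the underlying plaintexts, so a server could compute a SUM without decrypting. This is a sketch under stated assumptions, not CryptDB's code: the primes are tiny and insecure, only addition is shown, and the function names (`keygen`, `encrypt`, `decrypt`, `server_sum`) are invented for this example. It needs Python 3.8+ for the modular inverse via `pow`.

```python
import math
import random


def keygen(p: int = 1789, q: int = 1861):
    """Toy key generation; real deployments use far larger primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                                # valid because g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Randomized encryption of m under public modulus n (generator g = n + 1)."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, secret, c: int) -> int:
    lam, mu = secret
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def server_sum(n: int, ciphertexts) -> int:
    """What an untrusted server would run: multiply ciphertexts, never decrypt."""
    total = 1  # 1 is a valid encryption of 0
    for c in ciphertexts:
        total = (total * c) % (n * n)
    return total


n, secret = keygen()
salaries = [60, 100, 800, 100]
encrypted = [encrypt(n, s) for s in salaries]
encrypted_total = server_sum(n, encrypted)
print(decrypt(n, secret, encrypted_total))
```

Only the holder of `secret` (the frontend in this setting) can turn `encrypted_total` back into the sum of the salaries.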
Onions of encryptions
Layers listed from outermost to innermost:
• Onion 1: RND, DET (with SEARCH), JOIN, any value
• Onion 2: RND, OPE, OPE-JOIN, any value
• Onion 3: HOM, int value
Each column has the same key in a given layer of an onion
2. Adjustable query-based encryption
• Start out the database with the most secure encryption scheme
• Adjust encryption dynamically
– Strip off levels of the onions: frontend gives key to server using a UDF
Example
emp (rank, name, salary)
User query: SELECT * FROM emp WHERE salary = 100000
Frontend strips the RND layer of onion 1 on the salary column, leaving DET:
UPDATE table1 SET col3onion1 = DecryptRND(key, col3onion1)
Frontend then sends the rewritten query:
SELECT * FROM table1 WHERE col3onion1 = x5a8c34
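The layer-stripping step can be sketched end to end with SQLite standing in for the DBMS. Only the UDF name `DecryptRND` and the column name `col3onion1` come from the slide; everything else is an assumption for illustration. In particular, the two layers use a toy hash-based XOR stream instead of AES (the deterministic layer reuses one keystream, which is insecure outside a demo), and the keys `K_RND` and `K_DET` are hypothetical.

```python
import hashlib
import os
import sqlite3


def _stream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter style (not a real cipher)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def enc_det(key: bytes, plaintext: bytes) -> bytes:
    """Inner layer: fixed IV, so equal plaintexts give equal ciphertexts."""
    return _xor(plaintext, _stream(key, b"det", len(plaintext)))


def enc_rnd(key: bytes, data: bytes) -> bytes:
    """Outer layer: fresh random IV, so equal inputs look unrelated."""
    iv = os.urandom(16)
    return iv + _xor(data, _stream(key, iv, len(data)))


def decrypt_rnd(key: bytes, blob: bytes) -> bytes:
    """Runs at the server as a UDF once the frontend hands over the RND key."""
    iv, body = blob[:16], blob[16:]
    return _xor(body, _stream(key, iv, len(body)))


K_RND, K_DET = b"rnd-layer-key", b"det-layer-key"   # hypothetical keys

server = sqlite3.connect(":memory:")
server.create_function("DecryptRND", 2, decrypt_rnd)
server.execute("CREATE TABLE table1 (col3onion1 BLOB)")
for salary in (60000, 100000, 800000, 100000):
    onion = enc_rnd(K_RND, enc_det(K_DET, str(salary).encode()))
    server.execute("INSERT INTO table1 VALUES (?)", (onion,))

select = "SELECT rowid FROM table1 WHERE col3onion1 = ?"
token = enc_det(K_DET, b"100000")

# At RND, all four ciphertexts differ and the equality query finds nothing.
distinct_before = len({r[0] for r in server.execute("SELECT col3onion1 FROM table1")})
before = server.execute(select, (token,)).fetchall()

# The frontend reveals the RND key; the server peels the layer in place.
server.execute("UPDATE table1 SET col3onion1 = DecryptRND(?, col3onion1)", (K_RND,))
after = sorted(r[0] for r in server.execute(select, (token,)))
print(distinct_before, before, after)
```

After the UPDATE the column sits at the deterministic layer, so later equality queries on it need no further decryption at the server.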
JOIN needs new crypto
Challenge: do not know which columns will be joined
• Client frontend gives the server a join key for Col1-Col2
• Data items not revealed, cannot join without join key
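One way to picture a join key is with keyed exponentiation: each column tags values under its own secret exponent, and the frontend releases a single number that lets the server convert Col1's tags into Col2's key. The slides do not spell out the construction, so this is only an illustrative stand-in using modular arithmetic with toy parameters. The names `join_tag`, `join_key`, and `adjust` are hypothetical, and Python 3.8+ is assumed for the modular inverse.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # a Mersenne prime; toy-sized group for illustration only


def keygen() -> int:
    """Pick a secret exponent that is invertible modulo P - 1."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def join_tag(key: int, value) -> int:
    """Tag = H(value)^key mod P; tags under different keys do not line up."""
    h = int.from_bytes(hashlib.sha256(str(value).encode()).digest(), "big")
    h = h % (P - 1) + 1            # map into 1..P-1
    return pow(h, key, P)


def join_key(k_from: int, k_to: int) -> int:
    """Computed by the frontend: exponent that re-keys tags from k_from to k_to."""
    return (k_to * pow(k_from, -1, P - 1)) % (P - 1)


def adjust(tag: int, delta: int) -> int:
    """Run by the server: re-key one tag without learning the value."""
    return pow(tag, delta, P)


k1, k2 = keygen(), keygen()
col1_values, col2_values = [1, 2, 3], [2, 3, 4]
col1 = [join_tag(k1, v) for v in col1_values]
col2 = [join_tag(k2, v) for v in col2_values]

# Only when a query joins Col1 with Col2 does the frontend release delta.
delta = join_key(k1, k2)
col1_adjusted = [adjust(t, delta) for t in col1]
joined = set(col1_adjusted) & set(col2)
print(len(joined))
```

Without `delta` the server holds two unrelated sets of tags; with it, matching rows can be found, but the tags still do not reveal the underlying data items.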
Further components
• Inserts, updates, deletes, nested queries
• Indexes
• Transactions, auto-increments
• Optimizations to speed up performance
• Not supported: A.a + A.b > B.c
Security converges…
• … to maximum privacy for query mix
• Onion levels stripped only when new operations needed
Steady State: no decryptions at server
Practical: typical SQL processing on enlarged tuples
Privacy Guarantees
Formal privacy definition and proof
Implications:
• Never reveal plaintext
• Server cannot compute unrequested queries requiring new relationships
For emp (rank, name, salary), what is revealed if query has:
• equality predicate on name: repeats
• order predicate on name: order
• aggregation on salary: nothing
• no filter on a column: nothing
Privacy (cont’d)
DB owner can specify minimum security level for some fields
CREATE TABLE emp (SSN text ≥ DET, name text, …)
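A minimum security level can be read as a floor below which the frontend refuses to strip a column's onion. The sketch below models that bookkeeping for onion 1 only. The class `Column`, the `REQUIRED` mapping, and the use of `PermissionError` are assumptions for illustration, not CryptDB's interface.

```python
ONION1 = ["RND", "DET", "JOIN"]   # outermost (most private) layer first
REQUIRED = {"none": "RND", "equality": "DET", "join": "JOIN"}


class Column:
    """Tracks the current layer of one column and its owner-specified floor."""

    def __init__(self, name: str, minimum: str = "JOIN"):
        self.name = name
        self.minimum = minimum    # e.g. "DET" for `SSN text ≥ DET`
        self.level = "RND"        # every column starts at the most secure layer

    def request(self, operation: str) -> str:
        needed = REQUIRED[operation]
        if ONION1.index(needed) > ONION1.index(self.minimum):
            raise PermissionError(
                f"{operation} on {self.name} would drop below {self.minimum}")
        # Strip layers only when the query needs a weaker one than the current.
        if ONION1.index(needed) > ONION1.index(self.level):
            self.level = needed
        return self.level


ssn = Column("SSN", minimum="DET")
name = Column("name")

ssn_level = ssn.request("equality")   # allowed: DET satisfies the floor
try:
    ssn.request("join")               # refused: JOIN is below DET
    blocked = False
except PermissionError:
    blocked = True

name_level = name.request("join")     # no floor set, so the onion is stripped
name_again = name.request("equality") # already low enough; nothing changes
print(ssn_level, blocked, name_level, name_again)
```

A column that is never queried stays at RND, which matches the idea that layers come off only when a new operation requires it.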
Implementation
• Frontend: accepts a Query through the SQL Interface, sends an Encrypted Query to the server, receives Encrypted Results, returns Results
• Server: Unmodified DBMS, CryptDB PK tables, CryptDB UDFs (user-defined functions)
Portability
• No change to the DBMS
• Should work on most SQL DBMS
• Ported CryptDB from Postgres to MySQL with 86 lines of code
– No change to MySQL
– Code changed was to connect to server, UDF declarations
Importance of adjustable query-based encryption to privacy
Adjustable encryption: steady state of columns for TPC-C
• 71% of columns remain encrypted with RND
In practice, we expect most sensitive fields to remain at RND or DET (e.g., credit cards)
Related work
Theoretical approaches [Gennaro et al., ’10]
– Inefficient
Search on encrypted data (e.g., [Chang, Mitzenmacher ’05], [Evdokimov, Guenther ’07])
– Restricted set of queries, inefficient
Systems proposals (e.g., [Hacigumus et al., ’02])
– Lower degree of security, rewrite the DBMS, client-side processing