Topics in Cryptography Lecture 7 Topic: Side Channels Lecturer: Moni Naor.
-
Upload
tabitha-warren -
Category
Documents
-
view
217 -
download
1
Transcript of Topics in Cryptography Lecture 7 Topic: Side Channels Lecturer: Moni Naor.
Topics in Cryptography
Lecture 7Topic: Side Channels
Lecturer: Moni Naor
Recap: chosen ciphertext security• Why chosen ciphertext/malleability matters• Taxonomy of Attacks and Security• Ideas for achieving CCA
– Redundancy + Verification• The NIZK approach• Simple scheme achieving CCA1
– Based on DDH– Modification achieving CCA2
• Chosen-Ciphertext Security via Correlated Products• CCA and IBE• Deniable Authentication
3
Adversarial ModelsSTANDARD MODEL: Abstract models of computation
Interactive Turing machines Private memory, randomness ...
Well-defined adversarial access Can model powerful attacks
REAL LIFE: Physical implementations leak information Adversarial access not always captured by
abstract models
Ek(m)
4
Adversarial Models
Ek(m)
Attacks in the standard model:
Chosen-plaintext attacks Chosen-ciphertext attacks Composition Self-referential encryption Circular encryption ....
Attacks outside the standard model:
Timing attacks [Kocher 96] Fault detection [BDL 97, BS 97] Power analysis [KJJ 99] Cache attacks [OST 05] Memory attacks [HSHCPCFAF 08] ...
5
Adversarial ModelsAttacks in the standard
model: Chosen-plaintext attacks Chosen-ciphertext attacks Composition Self-referential encryption Circular encryption ....
Attacks outside the standard model:
Timing attacks [Kocher 96] Fault detection [BDL 97, BS 97] Power analysis [KJJ 99] Cache attacks [OST 05] Memory attacks [HSHCPCFAF 08] Electromagnetic radiation analysis ...
Side channel:
Any information not captured by the abstract “standard” model
6
Countermeasures
GOALS Preserve functionality Security
Efficiency Generic methods
HARDWARE E.g., minimizing electromagnetic
leakage, “tamper-proof” devices,... Ad-hoc solutions Typically expensive or inefficient
SOFTWARE E.g., fixed timing (indep. of input),
oblivious RAM,... Many heuristics Require precise modeling
7
Thesis of this course
Many tools developed in the foundations of cryptography are
helpful for protecting against side-channel attacks
Proof by examples...
and not only at implementation time
Must incorporate side-channel attacks
in the design of systems
slide 9
Side Channels• Timing Attacks • Cache Attacks• Memory Attacks
slide 10
Timing Attacks• Kocher, Timing Attacks on Implementations of Diffie-
Hellman, RSA, DSS, and Other Systems, (CRYPTO 1996)
• Brumley and Boneh, Remote Timing Attacks Are Practical, (USENIX Security 2003)
Slides based on Vitaly Shmatikov
slide 11
Timing Attack• Basic idea: learn the system’s secret by observing
how long it takes to perform various computations• Typical goal: extract private key• Extremely powerful because isolation doesn’t help
– Victim could be remote– Victim could be inside its own virtual machine– Keys could be in tamper-proof storage or smartcard
• Attacker wins simply by measuring response times
RSA Cryptosystem• Key generation:
– Generate large primes P, Q– Compute N=PQ and (N)=(P-1)(Q-1)– Choose small e, relatively prime to (N)
• Typically, e=3 or e=216+1=65537
– Compute unique d such that ed = 1 mod (N)– Public key = (e,N); private key = d
• Encryption of m (simplified!): c = me mod N• Decryption of c: cd mod N = (me)d mod N =
m
Why?
RSA Decryption• RSA decryption: compute yx mod N
– A modular exponentiation operation• Naive algorithm: square and multiply:
w
1 0 1x
Basic Timing
This takes a whileto compute
This is instantaneous
Whether iteration takes a long timedepends on the kth bit of secret exponent
Old observation: timing depends on number of 1’s
If all multiplication take the same time: all you get
Not all multiplications were created equal
• Different timing given operands • Assumption/Heuristic: timings of subsequent
multiplications are independent– Given that we know the first k-1 bits of x– Given a guess for the kth bit of x– Time of remaining bits independentGiven measurement of total time can see whether there is
correlation between events: kth step is long Total time is long
Exact timing
Exact guess
Outline of Kocher’s Attack• Idea: guess some bits of the exponent;
– Predict how long decryption will take• If guess is correct, will observe correlation; if
incorrect, then prediction will look random– The more bits you already know, the stronger the signal,
thus easier to detect (error-correction property)• Start by guessing a few top bits, look at correlations
for each guess, pick the most promising candidate and continue
Works against systems under direct control
RSA in OpenSSL• OpenSSL: popular open-source toolkit
– mod_SSL (in Apache = 28% of HTTPS market)– stunnel (secure TCP/IP servers)– sNFS (secure NFS)– Many more applications
• Kocher’s attack doesn’t work against OpenSSL– Instead of square-and-multiply, OpenSSL uses CRT, sliding
windows and two different multiplication algorithms for modular exponentiation
• CRT = Chinese Remainder Theorem• Secret exponent is processed in chunks, not bit-by-bit
Chinese Remainder Theorem• n = n1n2…nk
where gcd(ni,nj)=1 when i j
• The system of congruences x = x1 mod n1 = … = xk mod nk
– Has a simultaneous solution x to all congruences – There exists exactly one solution x between 0 and n-1
• For RSA modulus N=PQ, to compute x mod N enough to know x mod P and x mod Q
Attack this computation in order to learn Q
RSA Decryption With CRT
• To decrypt c, need to compute m=cd mod N • Use Chinese Remainder Theorem
– d1 = d mod (P-1)
– d2 = d mod (P-1)
– qinv = Q-1 mod P– Compute m1 = cd1 mod P; m2 = cd2 mod Q
– Compute m = m2+(qinv*(m1-m2) mod P)*Q
these are precomputed
Operations Involved in Decryption
What is needed to compute cd mod Q and xy mod Q?
• Exponentiation– Sliding windows
• Multiplication routines– “Normal” - when operands have unequal length– Karatsuba - faster when operands have equal length
• Modular reduction– Montgomery reduction
nlog23
Time of these operations is input sensitive
Montgomery Reduction• Decryption requires computing m2 = cd2 mod Q
• Done by repeated multiplication– Simple: square and multiply (process d2 one bit at a time)
– More clever: sliding windows (process d2 in 5-bit blocks)
• In either case, many multiplications modulo Q• Multiplications use Montgomery reduction
– Pick some R = 2k
– To compute x ¢ y mod Q: convert x and y into their Montgomery form xR mod q and yR mod q
– Compute (xR * yR) * R-1 = zR mod q• Multiplication by R-1 can be done very efficiently
Avoid long divisions
R a power of 2
Schindler’s Observation• At the end of Montgomery reduction: if zR > Q,
then need to subtract Q– Probability of this extra step is proportional to c mod Q
• If c is close to Q, many subtractions will be done• If c mod Q = 0, very few subtractions
– Decryption will take longer as c gets closer to Q, then become fast as c passes a multiple of Q
• By playing with different values of c and observing how long decryption takes, attacker can guess Q!
If all other operations are fixed!
Value of ciphertext c
Decryption time
Q 2Q P
Reduction Timing Dependency
Integer Multiplication Routines
• 30-40% of OpenSSL running time is spent on integer multiplication
• If integers have the same number of words n, OpenSSL uses Karatsuba multiplication– Takes O(nlog23)
• If integers have unequal number of words n and m, OpenSSL uses normal multiplication– Takes O(nm)
slide 25
g<q g>q
Montgomery effect
Longer Shorter
Multiplication effect
Shorter Longer
g is the decryption value (same as c)Different effects… but one will always dominate!
Summary of Time Dependencies
slide 26
Decryption time #ReductionsMult routine
Value of ciphertext Q
0-1 Gap
Attack Is Binary Search
slide 27
• Initial guess g for Q between 2511 and 2512
• Try all possible guesses for the top few bits• Suppose we know i-1 top bits of Q. Goal: ith bit
– Set g =…known i-1 bits of Q …000000 – Set ghi=…known i-1 bits of Q …100000 (note: g<ghi)
• If g<Q<ghi then the ith bit of Q is 0
• If g<ghi<Q then the ith bit of Q is 1
• Goal: decide whether g<Q<ghi or g<ghi<Q
Attack Overview
slide 28
Two Possibilities for ghi
Decryption time #ReductionsMult routine
Value of ciphertext Q
g ghi?
ghi?Difference in decryption timesbetween g and ghi will be small
Difference in decryption timesbetween g and ghi will be large
slide 29
Timing Attack Details• What is “large” and “small”?
– Know from attacking previous bits• Decrypting just g does not work because of sliding
windows– Decrypt a neighborhood of values near g– Will increase difference between large and small values,
resulting in larger 0-1 gap• Attack required only 2 hours, about 1.4 million
queries to recover the private key– Only need to recover most significant half bits of q
g, g+1, …, g+
slide 30
The 0-1 Gap
Zero-one gap
slide 31
Extracting RSA Private Key
Montgomery reductiondominates
Multiplication routine dominates
zero-one gap
slide 32
Normal SSL Handshake
Regular clientSSL
server 1. ClientHello
2. ServerHello (send public key)
3. ClientKeyExchange(encrypted under public key)
Exchange data encrypted with new shared key
slide 33
Attacking SSL Handshake
SSL server
1. ClientHello
2. ServerHello (send public key)
Attacker
3. Record time t1
Send guess g or ghi4. Alert
5. Record time t2
Compute t2–t1
slide 34
Works On The Network
Similar timing onWAN vs. LAN
slide 35
Defenses• Require statically that all decryptions take the
same time– For example, always do the extra “dummy” reduction– … but what if compiler optimizes it away?
• Dynamically make all decryptions the same or multiples of the same time “quantum”– Now all decryptions have to be as slow as the slowest
decryption• Use RSA blinding
slide 36
RSA Blinding• Instead of decrypting ciphertext c, decrypt a
random ciphertext related to c– Choose random r 2 ZN
*
– Compute x’ = c ¢ re mod N– Decrypt x’ to obtain m’ =x’d
– Calculate original plaintext m = m’/r mod N• Since r is random, decryption time is independent
of ciphertext• 2-10% performance penalty
Can prepare ahead
slide 37
Blinding Works
Cache Attacks
• Cryptanalysis through Cache Address Leakage: Dag Arne Osvik, Adi Shamir , Eran Tromer
Slides based on Eran Tromer
Cache attacks• Pure software• No special privileges• No interaction with the cryptographic code • Very efficient
– full AES key extraction from Linux encrypted partition in 65 milliseconds)
• Compromise otherwise well-secured systems
• “Commoditize” side-channel attacks: – Easily deployed software breaks many common
systems
CPU core60% (until recently)
Main memory7-9%
Why cache?
cacheAnnual speedincrease:
Typicallatency:
50-150ns0.3ns → timing gap
Address leakageThe cache is a shared resource:• cache state affects, and is affected by, all processes,
leading to crosstalk between processes.• The cached data is subject to memory protection…
– Not attacked• But the “metadata” leaks information about memory
access patterns: Which addresses are being accessed.
Associative memory cacheD
RA
Mca
che
memory block(64 bytes)
cache line
(64 bytes)
cache set
(4 cache lines)
S-box tables in memoryD
RA
Mca
che
S-box
table
Detecting access to AES tablesDR
AM
cach
e
Atta
cker
mem
ory
S-box
table
Measurement technique
Two approaches to exploit Inter-process crosstalk:
• Measuring the effect of the cache on the encryption– Need precise timing
• Measuring the effect of the encryption on the cache
DR
AM
cach
e
T0Attacker
memory
1. Make sure the tables are cached
2. Evict one cache set
3. Time an encryption and see if it’s slow
Measuring effect of cache on encryption
Measurement technique
Two approaches to exploit Inter-process crosstalk:
• Measuring the effect of the cache on the encryption– Need precise timing
• Measuring the effect of the encryption on the cache
Measuring effect of encryption on cacheD
RA
Mca
che
Attacker
memory
1. Completely evict tables from cache
S-box
table
Measuring effect of encryption on cacheD
RA
Mca
che
Attacker
memory
1. Completely evict tables from cache
2. Trigger a single encryptionS-b
oxta
ble
Measuring effect of encryption on cacheD
RA
Mca
che
Attacker
memory 1. Completely
evict tables from cache
2. Trigger a single encryption
3. Access attacker memory again. See which cache sets are slow
S-box
table
Advantages of second method
• Yields more information (64) from a single encryption
• Insensitive to timing variance in encryption code path
• No real need to trigger the encryption – can wait until it happens by itself
char p[16], k[16]; // plaintext and keyint32 T0[256],T1[256],T2[256],T3[256]; // lookup tablesint32 Col[4]; // intermediate state
...
/* Round 1 */
Col[0] T0[p[ 0]©k[ 0]] T1[p[ 5]©k[ 5]] T2[p[10]©k[10]] T3[p[15]©k[15]];
Col[1] T0[p[ 4]©k[ 4]] T1[p[ 9]©k[ 9]] T2[p[14]©k[14]] T3[p[ 3]©k[ 3]];
Col[2] T0[p[ 8]©k[ 8]] T1[p[13]©k[13]] T2[p[ 2]©k[ 2]] T3[p[ 7]©k[ 7]];
Col[3] T0[p[12]©k[12]] T1[p[ 1]©k[ 1]] T2[p[ 6]©k[ 6]] T3[p[11]©k[11]];
A typical software implementation of AES
lookup index = plaintext key
Synchronous attack
• A software service performs AES encryption using a secret key.
• An attacker process runs on the same CPU.• The attacker process can somehow invoke the service
on known plaintext.
• Examples:– Encrypted disk partition + filesystem– IP/Sec, VPN
Synchronous attack on AES: Overview• Measure (possibly noisy) cache usage of many encryptions of known
plaintexts.• Guess the first key byte. For each hypothesis:
– For each sampled plaintext, predict which cache line is accessed by “T0[p[ 0]©k[ 0]]”
• Identify the hypothesis which yields maximal correlation between predictions and measurements.
• Proceed for the rest of the key bytes.• Practically, a few hundred samples suffice.Got 64 bits of the key (high nibble of each byte)!• Use these partial results to mount attack further AES rounds, exploiting
S-box nonlinearity.A few thousand samples for complete key recovery.
Protection: The Oblivious RAM Model
Oblivious Turing Machine:• At any point in time know where the heads are
– The access pattern is independent of the
• Important: to convert to circuits• Get good results for the Cook-Levin TheoremOblivious RAM• The access pattern is independent of the
– Probability distribution!
Suggested by Goldreich 1987
Model
CPUMain memory
Small private
memory
qi
M[qi]
Oblivious RAM Requirements
Any sequence of locations i1, i2, …induces a distribution on sequences of requests q1, q2…
• Functionality: should be able to figure out the original content
• Security: for any two sequence of locations i1, i2, … and i’1, i’2, … induced distributions of requests should be indistinguishable
Oblivious Ram Constructions
• Trivial: O(n) slowdown– O(log n) bits private memory
• Known: polylog slowdown [Goldreich-Ostrovsky 96]– O(log n) bits private memory
59
Memory Attacks [HSHCPCFAF 08] Concern: Not only computation leaks information Memory retains its content after power is lost
5 seconds
30 seconds
60 seconds
5 minutes
http://citp.princeton.edu/memory
60
Can use redundancy in round keys
Not only computation leaks information Memory retains its content after power is lost
Recover “noisy” keys Cold boot attacks Completely compromise popular disk encryption systems Reconstruct DES, AES, and RSA keys
http://citp.princeton.edu/memory
Memory content can even last for several minutes
Memory Attacks [HSHCPCFAF 08]
61
Public-Key EncryptionSemantic security [GM82] under CPA:For any m0 and m1 infeasible to distinguish Epk(m0) and Epk(m1)
(sk, pk)
pk
m0, m1
Output b’ Epk(mb)
b à {0,1}
62
Key-Leakage AttacksSemantic security with key leakage [AGV 09]:For any* leakage f(sk) and for any m0 and m1 infeasible to distinguish Epk(m0) and Epk(m1)
(sk, pk)
pk
f
Output b’
f(sk)
b à {0,1}
Clearly, cannot allow f(sk) that easily reveals sk For now f : SK ! {0,1}¸ for ¸ < |sk|
m0, m1
Epk(mb)
Akavia, Goldwasser and Vaikuntanathan
63
Is this the right model? Noisy leakage
as opposed to low-bandwidth leakage
Leakage of intermediate values Are intermediate values always erased? Key generation process Decryption process
Keys generated using a “weak” random source
Not a perfect model, but still a good starting point
Discuss extensions later on
64
What We Know A generic method for protecting against key-leakage attacks
Main building block: Hash Proof Systems [CS 02] Efficient instantiations
Based on decisional Diffie-Hellman, few exponentiations
Chosen-ciphertext key-leakage attacks A generic CPA-to-CCA transformation Efficient schemes
Extensions Noisy leakage Leakage of intermediate values Weak random sources
65
Outline of the Talk Some tools
The generic construction by examples A simple scheme: ¸ ¼ |sk|/2
Improved schemes: ¸ ¼ |sk|
Extensions of the model
Conclusions, further work, and some rest...
66
Min-EntropyProbability distribution X over {0,1}n
H1(X) = - log maxx Pr[X = x]
X is a k-source if H1(X) ¸ k (i.e., Pr[X = x] · 2-k for all x)
Represents the probability of the most likely value of X
¢(X,Y) = a|Pr[X=a] – Pr[Y=a]|Statistical distance:
67
ExtractorsUniversal procedure for “purifying” an imperfect source
Definition:
Ext: {0,1}n £ {0,1}d ! {0,1}ℓ is a (k,)-extractor if for any k-source X
¢(Ext(X, Ud), Uℓ) ·
d random bits
“seed”
EXT
k-source of length n
ℓ almost-uniform bits
x
s
68
Strong ExtractorsOutput looks random even after seeing the seed
Definition:
Ext: {0,1}n £ {0,1}d ! {0,1}ℓ is a (k,)-strong extractor if
Ext’(x, s) = s ◦ Ext(x,s)
is a (k, )-extractor
Leftover hash lemma [ILL 89]:Pairwise independent hash functions are strong extractors
Example: Ext(x, (a,b)) = first ℓ bits of ax+b over GF[2n] Output length ℓ = k – 2log(1/) Seed length d = 2n, almost pairwise independence d = O(log n + k)
69
Decisional Diffie-Hellman
gx
gyAlice Bob
Both parties compute K = gxy
DDH assumption:
(g, gx, gy, gxy) (g, gx, gy, gz)
for random x, y, z 2 Zq
(g1, g2, g1r, g2
r) (g1, g2, g1r1, g2
r2)
for random g1, g2 2 G and r, r1, r2 2 Zq
70
Outline of the Lecture Some tools
The generic construction by examples A simple scheme: ¸ ¼ |sk|/2
Improved schemes: ¸ ¼ |sk|
Extensions of the model
Conclusions, further work, and some rest...
71
G - group of order q Ext : G £ {0,1}d ! {0,1} - strong extractor
Choose g1, g2 2 G and x1, x2 2 Zq
Let h = g1x1 g2
x2
Output sk = (x1, x2) and pk = (g1, g2, h)
Key generation
A Simple Scheme
MAIN IDEA: Redundancy: any pk corresponds to many possible sk’s h=g1
x1 g2x2 reveals only log(q) bits of information on
sk=(x1,x2) Leakage of ¸ bits ) sk still has min-entropy log(q) - ¸
72
G - group of order q Ext : G £ {0,1}d ! {0,1} - strong extractor
Choose g1, g2 2 G and x1, x2 2 Zq
Let h = g1x1 g2
x2
Output sk = (x1, x2) and pk = (g1, g2, h)
Choose r 2 Zq and a seed s 2 {0,1}d
Output (g1r, g2
r, s, Ext(hr, s) © m)
Output e © Ext(u1x1 u2
x2, s)
Key generation
Encpk(m)
Decsk(u1, u2, s, e)
A Simple Scheme
u1x1 u2
x2 = g1rx1 g2
rx2 = (g1x1 g2
x2)r = hr
73
Theorem: The scheme is resilient to any leakage of ¸ ¼ log(q) bits
half the size of sk
A Simple Scheme
Proof by reduction:
Adversary for the encryption scheme
Distinguisher for decisional Diffie-Hellman