1
Generating Strong Keys from Biometric Data
Yevgeniy Dodis, New York U Leonid Reyzin, Boston U
Adam Smith, MIT
2
Keys & Passwords from Biometric Data
• Secure cryptographic keys: long, random strings
• Too hard to remember
• Popular solution: biometrics
  – Fingerprints, iris scans, voice prints, …
  – Same idea: personal information, e.g. SSN, Mom's maiden name, street where you grew up, …
• Many nice ideas in literature, somewhat ad hoc
3
This Work
Formal framework and general tools
for handling personal/biometric key material
• Secure password storage
• Key encapsulation (for any application)
4
Issues with Personal Info / Biometrics
Noise and human error · Not uniformly random · Should not be stored in the clear
• Fingerprint is variable
  – Finger orientation, cuts, scrapes
• Personal info subject to memory failures / format
  – Name of your first girl/boy friend: "Um… Catherine? Katharine? Kate? Sue?"
• Measured data will be "close" to the original in some metric (application-dependent)
5
Issues with Personal Info / Biometrics
Noise and human error · Not uniformly random · Should not be stored in the clear
• (Crypto keys should be random)
• Fingerprints are represented as a list of features
  – All fingers look similar
• Distribution is unknown
  – Ivanov is a rare last name… unless the first name is Sergei
6
Issues with Personal Info / Biometrics
Noise and human error · Not uniformly random · Should not be stored in the clear · Limited changes possible
• Theft is easy, and makes the info useless
  – Customer service representatives learn Mom's maiden name, Social Security Number, …
• Keys cannot be changed many times
  – 10 fingers, 2 eyes, 1 mother
7
Issues with Personal Info / Biometrics
Noise and human error · Not uniformly random · Should not be stored in the clear

How can we use this stuff as a password?
Do we want to use this stuff as a password? (too late…)
9
Example: Authentication
10
Authentication [EHMS,JW]
Alice → Server: "How do I know you're Alice?"
Solution #1: Store a copy of the password on the server
Problem: Password in the clear
[Diagram: server tests whether the fresh reading x' ≈ the stored x]
11
Authentication [EHMS,JW]
Alice → Server: "How do I know you're Alice?"
Solution #2: Store a hash of the password
Problem: No error tolerance
[Diagram: server stores H(x) and tests H(x') =? H(x)]
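The failure mode of Solution #2 is easy to demonstrate. A minimal Python sketch (the password strings and the use of SHA-256 are illustrative choices, not from the talk): hashing protects the stored secret but gives zero error tolerance, because any change to the input yields a completely different digest.

```python
import hashlib

def enroll(password: bytes) -> bytes:
    # the server stores only H(password), never the password itself
    return hashlib.sha256(password).digest()

def verify(stored: bytes, candidate: bytes) -> bool:
    return hashlib.sha256(candidate).digest() == stored

stored = enroll(b"fingerprint-reading-1")
print(verify(stored, b"fingerprint-reading-1"))  # True: exact match works
# A biometric is never read identically twice; a single differing byte
# produces an unrelated digest, so the honest user is rejected:
print(verify(stored, b"fingerprint-reading-2"))  # False
```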
13
Related Work
Basic set-up studied for quite a while:
• Davida, Frankel, Matt '98; Ellison, Hall, Milbert, Schneier '00
First abstractions:
• Juels, Wattenberg '99; Frykholm, Juels '01
  – Handling noisy data in the Hamming metric
• Juels, Sudan '02
  – Set difference metric
Provable security:
• Linnartz, Tuyls '03
  – Provable security for a specific distribution (multivariate Gaussian)
Acknowledgement: some slide ideas from Ari Juels
14
This Talk
Formal framework and general tools
for handling personal/biometric key material
15
Outline
Basic Setting: Password Authentication
Simple abstraction: Fuzzy Sketch
• Example: Hamming distance
• Fuzzy Sketch ⇒ Authentication
Other metrics
Stronger privacy? [DS]
17
Fuzzy Sketch
1. Error-correction: If x’ is “close” to x, then recover x
2. Secrecy: Given S(X), it’s hard to predict X
• Meaning of “close” depends on application
• How to measure hardness of prediction?
[Diagram: x → S → S(x); Recover(x', S(x)) → x]
18
Measuring Security
• X a random variable on {0,1}^n
• Probability of predicting X = max_x Pr[X = x]
• There are various ways to measure entropy…
• Min-entropy: H∞(X) = −log (max_x Pr[X = x])
  – Uniform on {0,1}^n: H∞(U_n) = n
• "Password has min-entropy t" means that the adversary's probability of guessing the password is 2^−t
• Passwords had better have high entropy!
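The definition above can be checked numerically. A small Python helper (the example distributions are invented for illustration):

```python
import math

def min_entropy(dist):
    """Min-entropy H_inf(X) = -log2(max_x Pr[X = x]) of a distribution
    given as a dict mapping outcomes to probabilities."""
    return -math.log2(max(dist.values()))

# Uniform on {0,1}^3: every outcome has probability 1/8, so H_inf = 3 bits.
uniform3 = {format(i, "03b"): 1 / 8 for i in range(8)}
print(min_entropy(uniform3))  # 3.0

# A skewed "password" distribution: the best guess succeeds with
# probability 1/2, so H_inf is only 1 bit despite three outcomes.
skewed = {"password": 0.5, "hunter2": 0.25, "letmein": 0.25}
print(min_entropy(skewed))  # 1.0
```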
19
Measuring Security
• X a random variable on {0,1}^n
• Probability of predicting X = max_x Pr[X = x]
• There are various ways to measure entropy…
• Min-entropy: H∞(X) = −log (max_x Pr[X = x])
• Conditional (average) min-entropy:
  H∞(X | Y) = −log ( prob. of predicting X given Y )
            = −log ( E_y { max_x Pr[X = x | Y = y] } )
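The conditional version can be computed the same way from a joint distribution (the joint distribution below is a made-up example):

```python
import math

def cond_min_entropy(joint):
    """Average min-entropy H_inf(X | Y) = -log2( E_y [ max_x Pr[X=x | Y=y] ] ),
    given a dict {(x, y): Pr[X=x, Y=y]}."""
    ys = {y for (_, y) in joint}
    guess = 0.0
    for y in ys:
        # Pr[Y=y] * max_x Pr[X=x | Y=y] = max_x Pr[X=x, Y=y]
        guess += max(p for (x, y2), p in joint.items() if y2 == y)
    return -math.log2(guess)

# X uniform on {00, 01, 10, 11}; Y = first bit of X.
# Given Y, the best guess for X succeeds with probability 1/2, so H_inf(X|Y) = 1.
joint = {("00", "0"): 0.25, ("01", "0"): 0.25,
         ("10", "1"): 0.25, ("11", "1"): 0.25}
print(cond_min_entropy(joint))  # 1.0
```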
20
Fuzzy Sketch
1. Error-correction: If x’ is “close” to x, then recover x
2. Secrecy: Given S(X), it’s hard to predict X
Goals:
– Minimize entropy loss: H∞(X) − H∞(X | S(X))
– Maximize error tolerance: how "far" x' can be from x
[Diagram: X → S → S(X); Recover(x', S(x)) → x]
21
Example: Code-Offset Construction [BBR88, Cré97, JW02]
22
Code-Offset Construction [BBR,Cré,JW]
• View password as an n-bit string: x ∈ {0,1}^n
• Error model: small number of flipped bits
• Hamming distance:
  d_H(x, x') = # of positions in which x, x' differ
• Main idea: non-conventional use of standard error-correcting codes
23
Code-Offset Construction [BBR,Cré,JW]
• Error-correcting code ECC: k bits → n bits
• Any two codewords differ in at least d bits
• S(x) = x ⊕ ECC(R), where R is a random string
  Equiv: S(x) = syndrome(x)
[Diagram: x offset by S(x) from the codeword ECC(R); nearby reading x' within distance d]
24
Recovery
• Error-correcting code ECC: k bits → n bits
• Any two codewords differ in at least d bits
• S(x) = x ⊕ ECC(R), where R is a random string
  Equiv: S(x) = syndrome(x)
• Given S(x) and x' close to x:
  – Compute x' ⊕ S(x)
  – Decode to get ECC(R)
  – Compute x = S(x) ⊕ ECC(R)
• Corrects d/2 errors
• How much entropy loss?
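The construction and its recovery procedure can be sketched in Python, with a toy repetition code standing in for the ECC (a real deployment would use a much less redundant code; the bit strings and parameters here are illustrative only):

```python
import secrets

R_REP = 3  # each message bit repeated 3 times; majority vote corrects 1 flip per group

def ecc_encode(msg_bits):
    # k bits -> n = k * R_REP bits; any two codewords differ in >= R_REP positions
    return [b for b in msg_bits for _ in range(R_REP)]

def ecc_decode(code_bits):
    # nearest-codeword decoding = majority vote within each group of R_REP bits
    return [1 if 2 * sum(code_bits[i:i + R_REP]) > R_REP else 0
            for i in range(0, len(code_bits), R_REP)]

def xor(a, b):
    return [u ^ v for u, v in zip(a, b)]

def sketch(x):
    """S(x) = x XOR ECC(R) for a fresh random k-bit string R."""
    r = [secrets.randbelow(2) for _ in range(len(x) // R_REP)]
    return xor(x, ecc_encode(r))

def recover(x_prime, s):
    """Recover x exactly from a close reading x' and the public sketch S(x)."""
    noisy = xor(x_prime, s)                   # = ECC(R) plus the errors in x'
    codeword = ecc_encode(ecc_decode(noisy))  # decode back to ECC(R)
    return xor(s, codeword)                   # = x

x = [1, 0, 1, 1, 0, 1]                        # enrolled reading (n = 6, k = 2)
s = sketch(x)
x_prime = x.copy()
x_prime[0] ^= 1                               # fresh reading with one bit flipped
print(recover(x_prime, s) == x)               # True
```

Note that with these toy parameters the sketch reveals n − k = 4 bits of redundancy, matching the entropy-loss bound derived on the next slide.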
25
Entropy Loss
• Error-correcting code ECC: k bits → n bits
• Any two codewords differ in at least d bits
• S(x) = x ⊕ ECC(R), where R is a random string

H∞(X | S(X))
  = H∞(X, R | S(X))
  ≥ H∞(X) + H∞(R) − |S(X)|
  = H∞(X) + k − n

Revealing n bits costs ≤ n bits of entropy
26
Entropy Loss
• Error-correcting code ECC: k bits → n bits
• Any two codewords differ in at least d bits
• S(x) = x ⊕ ECC(R), where R is a random string

H∞(X | S(X))
  = H∞(X, R | S(X))
  ≥ H∞(X) + H∞(R) − |S(X)|
  = H∞(X) + k − n

Entropy loss = n − k = redundancy of the code
27
Using Sketches for Authentication
Alice → Server: "How do I know you're Alice?"
Alice holds a fresh reading x'; Server stores S(x), Hash(x)
Verification: recover x from (x', S(x)), then test Hash(x) =? stored hash
Problem:
• Input to Hash should be uniformly random
• X is not uniform (especially given S(X))
29
Padding via Random Functions
Alice (holding x) → Server: "How do I know you're Alice?"
Server stores: S(x), F, Hash(F(x))
Solution: scramble the input with a "random" function F; store F
• Thm: a "2-universal" F is sufficient (e.g., a random linear map)
• Verification same as before
• Is the hash input random enough?
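A minimal sketch of this enrollment/verification flow, using a random linear map over GF(2) as the 2-universal family. The trivial identity sketch in the demo is a stand-in so the example runs on its own; a real deployment would plug in a fuzzy sketch/recover pair such as the code-offset construction from the previous slides.

```python
import hashlib
import secrets

def random_linear_map(n, m):
    """A uniformly random m-by-n matrix over GF(2); x -> Ax is a 2-universal family."""
    return [[secrets.randbelow(2) for _ in range(n)] for _ in range(m)]

def apply_map(A, x):
    # matrix-vector product over GF(2)
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in A]

def enroll(x, sketch_fn, m):
    """Server-side record: (S(x), F, Hash(F(x))) -- all three safe to store."""
    F = random_linear_map(len(x), m)
    digest = hashlib.sha256(bytes(apply_map(F, x))).digest()
    return sketch_fn(x), F, digest

def verify(x_prime, record, recover_fn):
    s, F, digest = record
    x = recover_fn(x_prime, s)   # error-correct the fresh reading first
    return hashlib.sha256(bytes(apply_map(F, x))).digest() == digest

# Demo with a trivial no-error sketch (illustrative placeholder only).
identity_sketch = lambda x: []
identity_recover = lambda x_prime, s: x_prime
x = [1, 0, 1, 1]
record = enroll(x, identity_sketch, m=2)
print(verify(x, record, identity_recover))  # True
```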
30
Padding via Random Functions
Alice (holding x) → Server: "How do I know you're Alice?"
Server stores: S(x), F, Hash(F(x))
Solution: scramble the input with a "random" function F; store F
• Secure as long as: EntropyLoss(S) + |Hash output| + 2 log(1/ε) ≤ H∞(X)
• Proof idea: (S(x), F, Hash(F(x))) ≈ (S(x), F, Hash(R))
• Similar to the "leftover hash lemma" / privacy amplification
31
Sketches and Authentication
• "Fuzzy Sketch" + Hashing solves authentication
• Assumption: X has high entropy
  – Necessary
  – X could be several passwords taken together
• "Hamming" errors can be handled with standard ECC
• Similar techniques imply one can use X as a key for any crypto application (e.g. encryption)
  – Covers several previously studied settings
32
Fuzzy Extractor
Gen(w) → (K, P)
  – P: public helper string, e.g. (S(w), F) — safely public
  – K: extracted key, random given P — used by some application
Rec(w', P) → K, for any reading w' close to w
33
Outline
Basic Setting: Password Authentication
Simple abstraction: Fuzzy Sketch
• Example: Hamming distance
• Fuzzy Sketch ⇒ Authentication
Other metrics
Stronger privacy?
34
Other metrics
• Variants of code-offset work for many metrics
  – "Transitivity" suffices [DRS]
  – Continuous normed spaces: Euclidean [LT, VTDL], Lp
• [DRS] New constructions
  – Set difference metric (introduced by [JS])
  – General technique via special embeddings: edit distance, permutation metric
• Real(er)-life metrics?
35
Why Set Difference? [EHMS,FJ,JS]
• Inputs: tiny subsets of a HUGE universe
• Some representations of personal/biometric data:
  – Fingerprints represented as a feature list (minutiae / ridge meetings)
  – List of favorite books
• X ⊆ {1, …, N}, #X = s
• d_S(X, Y) = ½ #(X Δ Y) — i.e., half the Hamming distance between the characteristic vectors in {0,1}^N
• Code-offset not good: an N-bit string is too long!
• Want: ~ s log N bits
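The correspondence between the two metrics can be checked directly (the set values and universe size below are illustrative):

```python
def set_diff_dist(X, Y):
    # d_S(X, Y) = (1/2) #(X symmetric-difference Y):
    # number of elements to replace to turn X into Y (for equal-size sets)
    return len(X ^ Y) // 2

def char_vector(X, N):
    # characteristic vector of X inside the universe {0, ..., N-1}
    return [1 if i in X else 0 for i in range(N)]

def hamming(a, b):
    return sum(u != v for u, v in zip(a, b))

X = {3, 17, 42, 99}
Y = {3, 17, 55, 99}
print(set_diff_dist(X, Y))                                # 1
print(hamming(char_vector(X, 128), char_vector(Y, 128)))  # 2 = #(X Δ Y)
# Code-offset over the N-bit characteristic vectors would work, but the
# sketch would be N bits long; for tiny s and huge N one wants ~ s log N.
```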
44
Edit Distance
• Strings of bits
• d(x,y) = number of insertions & deletions to go from x to y
• [ADGIR] No good standard embeddings into Hamming
• Shingling [Broder]: a "biometric" embedding into set difference
  Example input: 10101010101010101101010101011010101
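A toy version of Broder-style shingling (shingle length k = 4 chosen for illustration): a string maps to its set of length-k substrings, and a single insertion or deletion only perturbs the shingles overlapping the edited position — at most about 2k of them — so closeness in edit distance maps to closeness in set difference.

```python
def shingles(s, k=4):
    """Broder-style shingling: the set of all length-k substrings of s."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}

a = "10101010101010101101010101011010101"   # example string from the slide
b = a[:17] + a[18:]                         # one deletion (a single edit)
# Every shingle not overlapping the edited position survives, so a single
# insertion/deletion changes at most ~2k shingles:
print(len(shingles(a) ^ shingles(b)) <= 2 * 4)  # True
```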
45
Outline
Basic Setting: Password Authentication
Simple abstraction: Fuzzy Sketch
• Example: Hamming distance
• Fuzzy Sketch ⇒ Authentication
Other metrics
Stronger privacy?
46
Stronger Privacy?
• Previous notion: unpredictability
  – Can't guess X even after seeing the sketch
  – Sufficient for using X as a crypto key
• What about the privacy of X itself?
  – Do not want particular info about X leaked (say, the first 20 bits)
• Ideal notion, ignoring computational restrictions:
  X almost independent of S(X)
• Problem: some info must be leaked by S(X) (provably)
  – Mutual information I(X; S(X)) is large
• We want to ensure that "useful" information is hidden
47
Attempt: Hiding Useful Info [CMR,RW,DS]
Definition: S(X) hides all functions of X if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
Intuition: "A cannot guess g(X) given S(X)"
[Diagram: computing g(X) from S(X) is difficult]
Implies unpredictability
48
Attempt: Hiding Useful Info [CMR,RW,DS]
Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
• Non-computational version of semantic security for high-entropy input distributions
49
Hiding Useful Info
Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
• Can be achieved in the code-offset construction [DS03]
  – Use a pseudo-randomly chosen code
  – Need to keep decodability
  – Exact parameters still unknown
50
Technique: Equivalence to Extraction
Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
S() is an "extractor" if for all r.v.'s X₁, X₂ of min-entropy t: S(X₁) ≈ S(X₂)
Thm: S(X) hides all functions of X whenever H∞(X) ≥ t ⟺ S() is an "extractor" for r.v.'s of min-entropy t − 1
51
The right notion?
Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
• Is this strong enough?
  – Not secure if repeated: S(X), S'(X), S''(X) may reveal X!
  – What's the "right" notion?
  – What computational notions might we use? (see [CMR])
52
Outline
Basic Setting: Password Authentication
Simple abstraction: Fuzzy Sketch
• Example: Hamming distance
• Fuzzy Sketch ⇒ Authentication
Other metrics
Stronger privacy?
53
Conclusions
• Generic framework for turning noisy, non-uniform data into secure cryptographic keys
  – Only assume high entropy (necessary)
  – Abstraction and simplicity allow comparing schemes
  – Constructions for Hamming, set difference, and edit metrics
• Possible to have strong privacy of the data
• Future work
  – Real-life metrics (under way)
  – Other settings ([DV] use sketches in the bounded-storage model)
  – Better definitions
55
Questions?