1
Generating Strong Keys from Biometric Data
Yevgeniy Dodis, New York U Leonid Reyzin, Boston U
Adam Smith, MIT
2
Keys & Passwords from Biometric Data
• Secure cryptographic keys: long, random strings
• Too hard to remember
• Popular solution: biometrics
  – Fingerprints, iris scans, voice prints, …
  – Same idea: personal information, e.g. SSN, Mom's maiden name, street where you grew up, …
• Many nice ideas in literature, somewhat ad hoc
3
This Work
Formal framework and general tools
for handling personal/biometric key material
• Secure password storage
• Key encapsulation (for any application)
4
Issues with Personal Info / Biometrics
Noise and human error · Not uniformly random · Should not be stored in the clear
• Fingerprint is variable
  – Finger orientation, cuts, scrapes
• Personal info subject to memory failures / format
  – Name of your first girl/boy friend: "Um… Catherine? Katharine? Kate? Sue?"
• Measured data will be "close" to the original in some metric (application-dependent)
5
Issues with Personal Info / Biometrics
Noise and human error · Not uniformly random · Should not be stored in the clear
• (Crypto keys should be random)
• Fingerprints are represented as a list of features
  – All fingers look similar
• Distribution is unknown
  – Ivanov is a rare last name… unless the first name is Sergei
6
Issues with Personal Info / Biometrics
Noise and human error · Not uniformly random · Should not be stored in the clear · Limited changes possible
• Theft is easy, and makes the info useless
  – Customer service representatives learn Mom's maiden name, Social Security Number, …
• Keys cannot be changed many times
  – 10 fingers, 2 eyes, 1 mother
7
Issues with Personal Info / Biometrics
Noise and human error · Not uniformly random · Should not be stored in the clear

How can we use this stuff as a password?
Do we want to use this stuff as a password? (too late…)
9
Example: Authentication
10
Authentication [EHMS,JW]
Alice → Server: "How do I know you're Alice?"
Solution #1: Store a copy of the password on the server
Problem: Password in the clear
[Diagram: server tests whether the fresh reading x' ≈ the stored x]
11
Authentication [EHMS,JW]
Alice → Server: "How do I know you're Alice?"
Solution #2: Store a hash of the password
Problem: No error tolerance
[Diagram: server stores H(x) and tests H(x') =? H(x)]
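The failure mode of Solution #2 is easy to demonstrate. A minimal Python sketch (the password strings and the use of SHA-256 are illustrative choices, not from the talk): hashing protects the stored secret but gives zero error tolerance, because any change to the input yields a completely different digest.

```python
import hashlib

def enroll(password: bytes) -> bytes:
    # the server stores only H(password), never the password itself
    return hashlib.sha256(password).digest()

def verify(stored: bytes, candidate: bytes) -> bool:
    return hashlib.sha256(candidate).digest() == stored

stored = enroll(b"fingerprint-reading-1")
print(verify(stored, b"fingerprint-reading-1"))  # True: exact match works
# A biometric is never read identically twice; a single differing byte
# produces an unrelated digest, so the honest user is rejected:
print(verify(stored, b"fingerprint-reading-2"))  # False
```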
13
Related Work
Basic set-up studied for quite a while:
• Davida, Frankel, Matt '98; Ellison, Hall, Milbert, Schneier '00
First abstractions:
• Juels, Wattenberg '99; Frykholm, Juels '01
  – Handling noisy data in the Hamming metric
• Juels, Sudan '02
  – Set difference metric
Provable security:
• Linnartz, Tuyls '03
  – Provable security for a specific distribution (multivariate Gaussian)
Acknowledgement: some slide ideas from Ari Juels
14
This Talk
Formal framework and general tools
for handling personal/biometric key material
15
Outline
Basic Setting: Password Authentication
Simple abstraction: Fuzzy Sketch
• Example: Hamming distance
• Fuzzy Sketch ⇒ Authentication
Other metrics
Stronger privacy? [DS]
17
Fuzzy Sketch
1. Error-correction: If x’ is “close” to x, then recover x
2. Secrecy: Given S(X), it’s hard to predict X
• Meaning of “close” depends on application
• How to measure hardness of prediction?
[Diagram: x → S → S(x); Recover(x', S(x)) → x]
18
Measuring Security
• X a random variable on {0,1}^n
• Probability of predicting X = max_x Pr[X = x]
• There are various ways to measure entropy…
• Min-entropy: H∞(X) = −log (max_x Pr[X = x])
  – Uniform on {0,1}^n: H∞(U_n) = n
• "Password has min-entropy t" means that the adversary's probability of guessing the password is 2^−t
• Passwords had better have high entropy!
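The definition above can be checked numerically. A small Python helper (the example distributions are invented for illustration):

```python
import math

def min_entropy(dist):
    """Min-entropy H_inf(X) = -log2(max_x Pr[X = x]) of a distribution
    given as a dict mapping outcomes to probabilities."""
    return -math.log2(max(dist.values()))

# Uniform on {0,1}^3: every outcome has probability 1/8, so H_inf = 3 bits.
uniform3 = {format(i, "03b"): 1 / 8 for i in range(8)}
print(min_entropy(uniform3))  # 3.0

# A skewed "password" distribution: the best guess succeeds with
# probability 1/2, so H_inf is only 1 bit despite three outcomes.
skewed = {"password": 0.5, "hunter2": 0.25, "letmein": 0.25}
print(min_entropy(skewed))  # 1.0
```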
19
Measuring Security
• X a random variable on {0,1}^n
• Probability of predicting X = max_x Pr[X = x]
• There are various ways to measure entropy…
• Min-entropy: H∞(X) = −log (max_x Pr[X = x])
• Conditional (average) min-entropy:
  H∞(X | Y) = −log ( prob. of predicting X given Y )
            = −log ( E_y { max_x Pr[X = x | Y = y] } )
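The conditional version can be computed the same way from a joint distribution (the joint distribution below is a made-up example):

```python
import math

def cond_min_entropy(joint):
    """Average min-entropy H_inf(X | Y) = -log2( E_y [ max_x Pr[X=x | Y=y] ] ),
    given a dict {(x, y): Pr[X=x, Y=y]}."""
    ys = {y for (_, y) in joint}
    guess = 0.0
    for y in ys:
        # Pr[Y=y] * max_x Pr[X=x | Y=y] = max_x Pr[X=x, Y=y]
        guess += max(p for (x, y2), p in joint.items() if y2 == y)
    return -math.log2(guess)

# X uniform on {00, 01, 10, 11}; Y = first bit of X.
# Given Y, the best guess for X succeeds with probability 1/2, so H_inf(X|Y) = 1.
joint = {("00", "0"): 0.25, ("01", "0"): 0.25,
         ("10", "1"): 0.25, ("11", "1"): 0.25}
print(cond_min_entropy(joint))  # 1.0
```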
20
Fuzzy Sketch
1. Error-correction: If x’ is “close” to x, then recover x
2. Secrecy: Given S(X), it’s hard to predict X
Goals:
– Minimize entropy loss: H∞(X) − H∞(X | S(X))
– Maximize error tolerance: how "far" x' can be from x
[Diagram: X → S → S(X); Recover(x', S(x)) → x]
21
Example: Code-Offset Construction [BBR88, Cré97, JW02]
22
Code-Offset Construction [BBR,Cré,JW]
• View password as an n-bit string: x ∈ {0,1}^n
• Error model: small number of flipped bits
• Hamming distance:
  d_H(x, x') = # of positions in which x, x' differ
• Main idea: non-conventional use of standard error-correcting codes
23
Code-Offset Construction [BBR,Cré,JW]
• Error-correcting code ECC: k bits → n bits
• Any two codewords differ in at least d bits
• S(x) = x ⊕ ECC(R), where R is a random string
  Equiv: S(x) = syndrome(x)
[Diagram: x offset by S(x) from the codeword ECC(R); nearby reading x' within distance d]
24
Recovery
• Error-correcting code ECC: k bits → n bits
• Any two codewords differ in at least d bits
• S(x) = x ⊕ ECC(R), where R is a random string
  Equiv: S(x) = syndrome(x)
• Given S(x) and x' close to x:
  – Compute x' ⊕ S(x)
  – Decode to get ECC(R)
  – Compute x = S(x) ⊕ ECC(R)
• Corrects d/2 errors
• How much entropy loss?
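The construction and its recovery procedure can be sketched in Python, with a toy repetition code standing in for the ECC (a real deployment would use a much less redundant code; the bit strings and parameters here are illustrative only):

```python
import secrets

R_REP = 3  # each message bit repeated 3 times; majority vote corrects 1 flip per group

def ecc_encode(msg_bits):
    # k bits -> n = k * R_REP bits; any two codewords differ in >= R_REP positions
    return [b for b in msg_bits for _ in range(R_REP)]

def ecc_decode(code_bits):
    # nearest-codeword decoding = majority vote within each group of R_REP bits
    return [1 if 2 * sum(code_bits[i:i + R_REP]) > R_REP else 0
            for i in range(0, len(code_bits), R_REP)]

def xor(a, b):
    return [u ^ v for u, v in zip(a, b)]

def sketch(x):
    """S(x) = x XOR ECC(R) for a fresh random k-bit string R."""
    r = [secrets.randbelow(2) for _ in range(len(x) // R_REP)]
    return xor(x, ecc_encode(r))

def recover(x_prime, s):
    """Recover x exactly from a close reading x' and the public sketch S(x)."""
    noisy = xor(x_prime, s)                   # = ECC(R) plus the errors in x'
    codeword = ecc_encode(ecc_decode(noisy))  # decode back to ECC(R)
    return xor(s, codeword)                   # = x

x = [1, 0, 1, 1, 0, 1]                        # enrolled reading (n = 6, k = 2)
s = sketch(x)
x_prime = x.copy()
x_prime[0] ^= 1                               # fresh reading with one bit flipped
print(recover(x_prime, s) == x)               # True
```

Note that with these toy parameters the sketch reveals n − k = 4 bits of redundancy, matching the entropy-loss bound derived on the next slide.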
25
Entropy Loss
• Error-correcting code ECC: k bits → n bits
• Any two codewords differ in at least d bits
• S(x) = x ⊕ ECC(R), where R is a random string

H∞(X | S(X))
  = H∞(X, R | S(X))
  ≥ H∞(X) + H∞(R) − |S(X)|
  = H∞(X) + k − n

Revealing n bits costs ≤ n bits of entropy
26
Entropy Loss
• Error-correcting code ECC: k bits → n bits
• Any two codewords differ in at least d bits
• S(x) = x ⊕ ECC(R), where R is a random string

H∞(X | S(X))
  = H∞(X, R | S(X))
  ≥ H∞(X) + H∞(R) − |S(X)|
  = H∞(X) + k − n

Entropy loss = n − k = redundancy of the code
27
Using Sketches for Authentication
Alice → Server: "How do I know you're Alice?"
Alice holds a fresh reading x'; Server stores S(x), Hash(x)
Verification: recover x from (x', S(x)), then test Hash(x) =? stored hash
Problem:
• Input to Hash should be uniformly random
• X is not uniform (especially given S(X))
29
Padding via Random Functions
Alice (holding x) → Server: "How do I know you're Alice?"
Server stores: S(x), F, Hash(F(x))
Solution: scramble the input with a "random" function F; store F
• Thm: a "2-universal" F is sufficient (e.g., a random linear map)
• Verification same as before
• Is the hash input random enough?
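A minimal sketch of this enrollment/verification flow, using a random linear map over GF(2) as the 2-universal family. The trivial identity sketch in the demo is a stand-in so the example runs on its own; a real deployment would plug in a fuzzy sketch/recover pair such as the code-offset construction from the previous slides.

```python
import hashlib
import secrets

def random_linear_map(n, m):
    """A uniformly random m-by-n matrix over GF(2); x -> Ax is a 2-universal family."""
    return [[secrets.randbelow(2) for _ in range(n)] for _ in range(m)]

def apply_map(A, x):
    # matrix-vector product over GF(2)
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in A]

def enroll(x, sketch_fn, m):
    """Server-side record: (S(x), F, Hash(F(x))) -- all three safe to store."""
    F = random_linear_map(len(x), m)
    digest = hashlib.sha256(bytes(apply_map(F, x))).digest()
    return sketch_fn(x), F, digest

def verify(x_prime, record, recover_fn):
    s, F, digest = record
    x = recover_fn(x_prime, s)   # error-correct the fresh reading first
    return hashlib.sha256(bytes(apply_map(F, x))).digest() == digest

# Demo with a trivial no-error sketch (illustrative placeholder only).
identity_sketch = lambda x: []
identity_recover = lambda x_prime, s: x_prime
x = [1, 0, 1, 1]
record = enroll(x, identity_sketch, m=2)
print(verify(x, record, identity_recover))  # True
```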
30
Padding via Random Functions
Alice (holding x) → Server: "How do I know you're Alice?"
Server stores: S(x), F, Hash(F(x))
Solution: scramble the input with a "random" function F; store F
• Secure as long as: EntropyLoss(S) + |Hash output| + 2 log(1/ε) ≤ H∞(X)
• Proof idea: (S(x), F, Hash(F(x))) ≈ (S(x), F, Hash(R))
• Similar to the "leftover hash lemma" / privacy amplification
31
Sketches and Authentication
• "Fuzzy Sketch" + Hashing solves authentication
• Assumption: X has high entropy
  – Necessary
  – X could be several passwords taken together
• "Hamming" errors can be handled with standard ECC
• Similar techniques imply one can use X as a key for any crypto application (e.g. encryption)
  – Covers several previously studied settings
32
Fuzzy Extractor
Gen(w) → (K, P)
  – P: public helper string, e.g. (S(w), F) — safely public
  – K: extracted key, random given P — used by some application
Rec(w', P) → K, for any reading w' close to w
33
Outline
Basic Setting: Password Authentication
Simple abstraction: Fuzzy Sketch
• Example: Hamming distance
• Fuzzy Sketch ⇒ Authentication
Other metrics
Stronger privacy?
34
Other metrics
• Variants of code-offset work for many metrics
  – "Transitivity" suffices [DRS]
  – Continuous normed spaces: Euclidean [LT, VTDL], Lp
• [DRS] New constructions
  – Set difference metric (introduced by [JS])
  – General technique via special embeddings: edit distance, permutation metric
• Real(er)-life metrics?
35
Why Set Difference? [EHMS,FJ,JS]
• Inputs: tiny subsets of a HUGE universe
• Some representations of personal/biometric data:
  – Fingerprints represented as a feature list (minutiae / ridge meetings)
  – List of favorite books
• X ⊆ {1, …, N}, #X = s
• d_S(X, Y) = ½ #(X Δ Y) — i.e., half the Hamming distance between the characteristic vectors in {0,1}^N
• Code-offset not good: an N-bit string is too long!
• Want: ~ s log N bits
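The correspondence between the two metrics can be checked directly (the set values and universe size below are illustrative):

```python
def set_diff_dist(X, Y):
    # d_S(X, Y) = (1/2) #(X symmetric-difference Y):
    # number of elements to replace to turn X into Y (for equal-size sets)
    return len(X ^ Y) // 2

def char_vector(X, N):
    # characteristic vector of X inside the universe {0, ..., N-1}
    return [1 if i in X else 0 for i in range(N)]

def hamming(a, b):
    return sum(u != v for u, v in zip(a, b))

X = {3, 17, 42, 99}
Y = {3, 17, 55, 99}
print(set_diff_dist(X, Y))                                # 1
print(hamming(char_vector(X, 128), char_vector(Y, 128)))  # 2 = #(X Δ Y)
# Code-offset over the N-bit characteristic vectors would work, but the
# sketch would be N bits long; for tiny s and huge N one wants ~ s log N.
```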
44
Edit Distance
• Strings of bits
• d(x,y) = number of insertions & deletions to go from x to y
• [ADGIR] No good standard embeddings into Hamming
• Shingling [Broder]: a "biometric" embedding into set difference
  Example input: 10101010101010101101010101011010101
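A toy version of Broder-style shingling (shingle length k = 4 chosen for illustration): a string maps to its set of length-k substrings, and a single insertion or deletion only perturbs the shingles overlapping the edited position — at most about 2k of them — so closeness in edit distance maps to closeness in set difference.

```python
def shingles(s, k=4):
    """Broder-style shingling: the set of all length-k substrings of s."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}

a = "10101010101010101101010101011010101"   # example string from the slide
b = a[:17] + a[18:]                         # one deletion (a single edit)
# Every shingle not overlapping the edited position survives, so a single
# insertion/deletion changes at most ~2k shingles:
print(len(shingles(a) ^ shingles(b)) <= 2 * 4)  # True
```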
45
Outline
Basic Setting: Password Authentication
Simple abstraction: Fuzzy Sketch
• Example: Hamming distance
• Fuzzy Sketch ⇒ Authentication
Other metrics
Stronger privacy?
46
Stronger Privacy?
• Previous notion: unpredictability
  – Can't guess X even after seeing the sketch
  – Sufficient for using X as a crypto key
• What about the privacy of X itself?
  – Do not want particular info about X leaked (say, the first 20 bits)
• Ideal notion, ignoring computational restrictions:
  X almost independent of S(X)
• Problem: some info must be leaked by S(X) (provably)
  – Mutual information I(X; S(X)) is large
• We want to ensure that "useful" information is hidden
47
Attempt: Hiding Useful Info [CMR,RW,DS]
Definition: S(X) hides all functions of X if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
Intuition: "A cannot guess g(X) given S(X)"
[Diagram: computing g(X) from S(X) is difficult]
Implies unpredictability
48
Attempt: Hiding Useful Info [CMR,RW,DS]
Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
• Non-computational version of semantic security for high-entropy input distributions
49
Hiding Useful Info
Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
• Can be achieved in the code-offset construction [DS03]
  – Use a pseudo-randomly chosen code
  – Need to keep decodability
  – Exact parameters still unknown
50
Technique: Equivalence to Extraction
Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
S() is an "extractor" if for all r.v.'s X₁, X₂ of min-entropy t: S(X₁) ≈ S(X₂)
Thm: S(X) hides all functions of X whenever H∞(X) ≥ t ⟺ S() is an "extractor" for r.v.'s of min-entropy t − 1
51
The right notion?
Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g and all adversaries A,
  Pr[A(S(X)) = g(X)] − Pr[A(S(X')) = g(X)] < ε
where X, X' are independent, identically distributed random variables.
• Is this strong enough?
  – Not secure if repeated: S(X), S'(X), S''(X) may reveal X!
  – What's the "right" notion?
  – What computational notions might we use? (see [CMR])
52
Outline
Basic Setting: Password Authentication
Simple abstraction: Fuzzy Sketch
• Example: Hamming distance
• Fuzzy Sketch ⇒ Authentication
Other metrics
Stronger privacy?
53
Conclusions
• Generic framework for turning noisy, non-uniform data into secure cryptographic keys
  – Only assume high entropy (necessary)
  – Abstraction and simplicity allow comparing schemes
  – Constructions for Hamming, set difference, and edit metrics
• Possible to have strong privacy of the data
• Future work
  – Real-life metrics (under way)
  – Other settings ([DV] use sketches in the bounded-storage model)
  – Better definitions
55
Questions?