Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits...

38
Privacy-Preserving Machine Learning CS 760: Machine Learning Spring 2018 Mark Craven and David Page www.biostat.wisc.edu/~craven/cs760 1

Transcript of Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits...

Page 1: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Privacy-PreservingMachineLearning

CS760:MachineLearningSpring2018

MarkCravenandDavidPage

www.biostat.wisc.edu/~craven/cs760

1

Page 2: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

GoalsfortheLecture

• Youshouldunderstandthefollowingconcepts:

• publickeycryptography• linearlyhomomorphicencryption• fullyhomomorphicencryption• differentialprivacy• globalsensitivity• Laplacemechanism

• ThanksEricLantzandIreneGiacomelli!

2

Page 3: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

3

Page 4: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

4

Page 5: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

5

Page 6: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

6

Page 7: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

F(x)

7

Page 8: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

8

Page 9: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

F(x)

9

Page 10: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

F(x)

10

Page 11: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

F(x)

11

Page 12: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

12

Page 13: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

• Largedatabasesofpatientinformation• Regulationsandexpectationsofprivacy• Largepotentialgainsfromdatamining• Howtobalanceutilityandprivacy?

• Privacyapproaches• k-anonymity(Sweeney,2002),l-diversity(Machanavajjhala,2007),t-closeness(Li,2007)

• Homomorphicencryption• Differentialprivacy(Dwork,2006)

NeedforPrivacy

13

Page 14: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Recall:IWPCWarfarindosingalgorithm

• Overadozenreal-valuepredictiontechniqueswereused

• Linearregressionandsupportvectorregressionwerethebestperformers

5.6044-0.2614 Age in decades

+0.0087 Height in cm+0.0128 Weight in kg-0.8677 VKORC1A/G-1.6974 VKORC1 A/A-0.4854 VKORC1 genotype unknown-0.5211 CYP2C9 *1/*2-0.9357 CYP2C9 *1/*3-1.0616 CYP2C9 *2/*2-1.9206 CYP2C9 *2/*3-2.3312 CYP2C9 *3/*3-0.2188 CYP2C9 genotype unknown-0.1092 Asian race-0.2760 Black or African American-0.1032 Missing or Mixed race+1.1816 Enzyme inducer status-0.5503 Amiodarone status= square root of final dose

0 5 10 15

PGx

Clinical

Fixed

PredictionError(mg/wk)

14

Page 15: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Recall:Ridge Regression

Data point: (x, y ), x ∈ Rd and y ∈ R

Model: w ∈ Rd vector ofweights

8/ 29

Page 16: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Public-Key Encryption

9/ 29

sk → secret keypk → public key

Encryption:

Decryption:

Page 17: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Public-Key Encryption

sk → secret keypk → public key

Encryption: c = Encpk (m)c → hides m to everyone that does NOT have sk

Decryption:

m =hello! Enc c =6a7#87tpk

9/ 29

Page 18: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Public-Key Encryption

c =6a7#87tm =hello! Enc Dec hello!sk

9/ 29

sk → secret keypk → public key

Encryption: c = Encpk (m)c → hides m to everyone that does NOT have sk

Decryption:

c→revealsmtoeveryonethathas sk

pk

Page 19: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Linearly-Homomorphic Encryption

10/ 29

Addition of ciphertexts

Encpk(m1) 83 Encpk(m2) = Encpk(m1 + m2)

Multiplication of a ciphertext by a plaintext

m1 [SJ Encpk(m2) = Encpk(m1 × m2)

Page 20: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Linearly-Homomorphic Encryption

10/ 29

Addition of ciphertexts

Encpk(m1) 83 Encpk(m2) = Encpk(m1 + m2)

Multiplication of a ciphertext by a plaintext (m1 is public!)

M1 [SJ Encpk(m2) = Encpk(m1 × m2)

Page 21: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Linearly-Homomorphic Encryption

10/ 29

Addition of ciphertexts

Encpk(m1) 83 Encpk(m2) = Encpk(m1 + m2)

Multiplication of a ciphertext by a plaintext (m1 is public)

M1 [SJ Encpk(m2) = Encpk(m1 × m2)

Fully homomorphic requires multiplication analog of

and currently is much slower.

Database (DB): 105 × 102 real numbers in [−2000, 2000] with 3 digits inthe fractional part. Times using linearly-homomorphic encryption:- encrypt the DB: 40 minutes- sum of two DBs: 3 seconds- mult. by a constant: 25 mins

Page 22: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Illustration

...

D1

D2

D t

ML Engine

Crypto Provider

11/ 29

Page 23: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Illustration

...

D1

D2

D t

ML Engine

Crypto Provider

(sk,pk)

pk

pk

pk

pk

pk

pk

11/ 29

Page 24: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Illustration

...

D 1

D2

D t

ML Engine

Crypto Provider

(sk,pk)

pk

pk

pk

Encpk(D1)

Encpk(D2)

Encpk(D t )

11/ 29

Page 25: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Illustration

...

D 1

D2

D t

ML Engine

(sk,pk)

Crypto Provider

11/ 29

pk

pk

pk

Encpk (D 1)Encpk (D 2)

...Encpk(D t )

Page 26: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Illustration

ML Engine

Crypto Provider

w trained onD1 ∪ ···∪ D t

interactive protocol

11/ 29

Page 27: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Illustration

Interactive protocol:

1. the ML engine “masks inside the encryption” Encpk(D) → Encpk(D˜)

2. the crypto provider decrypts, gets D̃ and computes a “maskedmodel”, w̃

3. the ML engine computes the real model w from the masked one

Page 28: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Illustration

Results for seven UCI datasets (time in seconds):(phase 1 = encryption, phase 2 = interactive protocol)

n = training data (number of data points)d = number of features

Page 29: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

• Benefits• Highutility– becauseNoNoise!!!• Nooneseesdata“intheclear”

• Disadvantages• Models(orevenjustpredictions)maystillgiveawaymoreinformationabouttrainingexamples(e.g.,patients)thanaboutotherexamples(patients)

• Veryhigh(asofnow,completelyimpractical)runtimesforsomemethods(fullyhomomorphicencryption)

• Feasibleapproaches(e.g.,linearlyhomomorphicencryption)requirere-developingeachlearningalgorithm(e.g.,ridgeregression)fromscratchwithlimitedoperations

• Protectionsmaybelostif/whenQuantumComputersbecomeavailable

CommentsonHomomorphicEncryption

29

Page 30: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

JustReleasingaLearnedModel CanViolatePrivacy

• IWPCWarfarinModel• Canwepredictgenotypeoftrainingsetbetterthanothers?

30

Page 31: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

PrivacyBlueprint

31

Page 32: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

• Goal• Smalladdedriskofadversarylearning(private)informationaboutanindividualifhis/herdataintheprivatedatabaseversusnotinthedatabase

• Informally• Queryoutputdoesnotchangemuchbetweenneighboringdatabases• E.g.:whatisfractionofpeopleinclinicwithdiabetes?

DifferentialPrivacy(Dwork,2006)

32

Page 33: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

• Given• InputdatabaseD• Randomizedalgorithmf:D->Range(f)• f is(e ,d)-differentiallyprivateiff

• ForanyS Î Range(f)andD’whered(D,D’)=1• ε andd areprivacybudget

• Smallermeansmoreprivate

DifferentialPrivacyDefinition

33

Page 34: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

• Note:Definitionrequiresstochasticoutput… howtoachieve?

• Perturbation{LaplaceMechanism}(Dwork,2006)• Calculatecorrectanswerf(D)• Addnoisef(D)+h

• Soft-max{ExponentialMechanism}(McSherry andTalwar,2007)• Qualityfunctionq(D,s)• Exponentialweightingexp(e q(D,s))

• Inbothcases,noiseisproportionaltothesensitivity ofthefunction

ObtainingDifferentialPrivacy

34

Page 35: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

• Givenf :D ->R,globalsensitivityoff is

• Worstcase• Oncef andthedomainofD arechosen,globalsensitivityisfixed

GlobalSensitivity

35

Page 36: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

AddLaplaceNoise,μ=0,b afunctionofsensitivityandε

36

Page 37: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

Privacy-UtilityTradeoffforPrivateWarfarinModel

37

Page 38: Privacy-Preserving Machine Learningcraven/cs760/lectures/Privacy.pdf · 2018-04-30 · •Benefits •High utility –because No Noise!!! •No one sees data “in the clear” •Disadvantages

CommentsonDifferentialPrivacy

• Provableguarantees,regardlessofsideinformationadversaryhas

• Elegantformulationthatleadstomanyattractivealgorithms

• Hasinsightsforotherareassuchasfairness

• Poorintuitionforhowtoselectε

• Cankillutility(e.g.,accuracy,AUC)unlesswehaveverymanyexamples… sogoodfitforageofBigDatabutnotformediumdata

• Howtosetprivacybudget?IfreleaseDPdataset,canupdatewithnewreleasewithoutaddingtopreviousε,somustplanfarahead