Differential Privacy Analysis of Data Processing Workflows

20
Differential Privacy Analysis of Data Processing Workflows Marlon Dumas, Luciano García-Bañuelos, Peeter Laud

Transcript of Differential Privacy Analysis of Data Processing Workflows

Page 1: Differential Privacy Analysis of Data Processing Workflows

Differential Privacy Analysis of Data Processing Workflows

Marlon Dumas, Luciano García-Bañuelos, Peeter Laud

Page 2: Differential Privacy Analysis of Data Processing Workflows

Motivating example

Need to assign ship

Send ship assignment

Page 3: Differential Privacy Analysis of Data Processing Workflows

Conflicting goals: Privacy vs. Utility

We need to release aggregate information about data without leaking information about an individual involved in the incident

• Aggregate info: Number of crew members of nationality X in the ship

• Individual info: Is a particular crew member of nationality X?

Problem: Aggregate information may leak information on individuals

Number of crew members of nationality X in the ship,

Number of crew members of nationality X in the ship excluding Y

Page 4: Differential Privacy Analysis of Data Processing Workflows

Differential privacy (Dwork 2006)

K gives e-differential privacy if for all values of DB, DB’differing

in a single element, and all S in Range(K )

Pr[ K (DB) in S]

Pr[ K (DB’) in S]≤ eε ~ (1+ε)

Page 5: Differential Privacy Analysis of Data Processing Workflows

Source: Gerome Miklau and Michael Hay

Accuracy loss!

Differential privacy (Laplacian noise)

Page 6: Differential Privacy Analysis of Data Processing Workflows

NAPLES project’s goal

Develop theoretical foundations for implementing tools that

• Let one model stakeholders and flows in the Business Process Model and Notation

(BPMN)

• Find data leaks in these process models, taking into account the Privacy-Enhancing

Technologies used in the as-is models

• Quantify leakages using differential privacy

• Quantify accuracy loss

• Suggest relevant privacy-enhancing technologies to reduce privacy leaks

See http://pleak.io

Page 7: Differential Privacy Analysis of Data Processing Workflows

Usage scenarios

•Support privacy audit of existing systemWhat will each stakeholder of the System learn about a

private data object? e.g. with respect to differential privacy

•Build a new privacy-aware systemWhat will each stakeholder of the System learn about each

private data object?

Which Privacy Enhancing Technologies would help reducing the leakage?

Page 8: Differential Privacy Analysis of Data Processing Workflows

Architecture

Page 9: Differential Privacy Analysis of Data Processing Workflows

Adding privacy-enhancing technologies

Org

aniz

ation X

Em

erg

ency O

ffic

er

Dis

patc

her

CommunitiesRetrieve

CommunitiesRequiring

Intensive Care

Ships

ReservedFor

Evacuees

Adjust Ship toCommunityAssignment

Ship toCommunityAssignment

Communities ICUrequirements Ships

Org

aniz

ation

Y

Retrieve

Nearby Ships with

Spare Capacity &

Equipment

dp-taskdp-flow

Need to assign ship

Send ship assignment

(ships, disaster) -> {

avail_food = 0;

avail_ships = [];

for (ship in ships) do {

fuzzed_loc = ship.loc() + Lap2(3);

if (dist(fuzzed_loc, disaster.loc()) / ship.speed() <= 2

&& ship.cargo_type() == "food"

&& !ship.contains(dangerous_materials) ) {

avail_food += ship.cargo();

avail_ships.append({ship.name(), fuzzed_loc});

}

}

avail_food += Lap(2);

return (avail_food, avail_ships);

}

Page 10: Differential Privacy Analysis of Data Processing Workflows

Annotating the model with DF and sensitivity bounds

Org

aniz

ation X

Em

erg

ency O

ffic

er

Dis

patc

her

CommunitiesRetrieve

CommunitiesRequiring

Intensive Care

Ships

ReservedFor

Evacuees

Adjust Ship toCommunityAssignment

Ship toCommunityAssignment

Communities ICUrequirements Ships

Org

aniz

ation

Y

Retrieve

Nearby Ships with

Spare Capacity &

Equipment

Ship toCommunityAssignment

(∈:0.1,c:0.2)

(∈:0.1,c:0.2) (∈:0.1,c:0.2)

(∈:0.1,c:0.2)

(∈:0.1,c:0.2) (∈:0.1,c:0.2)

Need to assign ship

Send ship assignment

Page 11: Differential Privacy Analysis of Data Processing Workflows

Differential Privacy Disclosure (Roles)O

rganiz

atio

n X

Em

erg

ency

Offic

er

Dis

patc

her

CommunitiesRetrieve

CommunitiesRequiring

Intensive Care

Ships

ReservedFor

Evacuees

Adjust Ship toCommunityAssignment

Ship toCommunityAssignment

Communities ICUrequirements Ships

Org

aniz

atio

n Y

Retrieve

Nearby Ships with

Spare Capacity &

Equipment

Ship toCommunityAssignment

(∈:0.1,c:0.2)

(∈:0.1,c:0.2) (∈:0.1,c:0.2)

(∈:0.1,c:0.2)

(∈:0.1,c:0.2) (∈:0.1,c:0.2)

Need to assign ship

Send ship assignment

Page 12: Differential Privacy Analysis of Data Processing Workflows

Data processing workflows

Page 13: Differential Privacy Analysis of Data Processing Workflows

Model with DF/sensitivity bounds

Page 14: Differential Privacy Analysis of Data Processing Workflows

Generalized sensitivity• Generalized distances – any partial order with addition and least element

• has sensitivity

• Differential privacy is a specific case of generalized sensitivity

• Generalized sensitivity is composable, i.e.

Page 15: Differential Privacy Analysis of Data Processing Workflows

Model with DF/sensitivity bounds

Page 16: Differential Privacy Analysis of Data Processing Workflows

Model with DF/sensitivity bounds

o

s

i

min(0.2, 0.2 * 0.4)

Page 17: Differential Privacy Analysis of Data Processing Workflows

Result of analysis

Page 18: Differential Privacy Analysis of Data Processing Workflows

Differential Privacy Disclosure of a Data Source to a Party

r

Page 19: Differential Privacy Analysis of Data Processing Workflows

Outlook•Extend privacy analyzer to cover a broader class of BPMN

process models E.g. adding conditional branching

•Principles of program analysis for DP For arbitrary generalized metrics and sensitivities

•Defining a super set of BPMN, with ad-hoc constructs to model privacy related concerns (i.e. PA-BPMN)

•Building the PETs library & extend PA-BPMN to cater for other PETs besides differential privacy

Page 20: Differential Privacy Analysis of Data Processing Workflows

Research funded by DARPA (Brandeis program 2015-2019)

Thanks!