On Your Social Network De- anonymizablity: Quantification and Large Scale Evaluation with Seed...

On Your Social Network De-anonymizablity: Quantification and Large Scale Evaluation with Seed Knowledge NDSS 2015, Shouling Ji, Georgia Institute of Technology Fengli Zhang 11/4/2015


Page 1: On Your Social Network De- anonymizablity: Quantification and Large Scale Evaluation with Seed Knowledge NDSS 2015, Shouling Ji, Georgia Institute of Technology.

On Your Social Network De-anonymizablity: Quantification and Large Scale Evaluation with Seed Knowledge

NDSS 2015, Shouling Ji, Georgia Institute of Technology

Fengli Zhang

11/4/2015

Outline

• Introduction
• Motivation
• Contribution
• De-anonymization Quantification
• Evaluation
• Conclusion

Introduction

• As social networks have become deeply integrated into people’s lives, they produce a significant amount of social data that contains their users’ detailed personal information.
• To protect users’ privacy, data owners usually anonymize their data before it is shared, transferred, or published:
  - Naïve ID removal
  - K-anonymization
  - Differential privacy
• Existing anonymization schemes have vulnerabilities. Structure based de-anonymization attacks can effectively break the privacy of social networks using only the data’s structural information.

De-anonymization Attack

Motivation

• Question 1: Why are social networks vulnerable to structure based de-anonymization attacks?
• Question 2: How de-anonymizable is a social network?
• Question 3: How many users within a social network can be successfully de-anonymized?

Contributions

• Provide the first theoretical quantification of the perfect and partial de-anonymizability of social networks in general scenarios, where the social network can follow an arbitrary network model
• Implement the first large scale evaluation of the perfect and partial de-anonymizability of 24 real world social networks
• Find that, beyond the structural information associated with known seed users, the other structural information (the structural information among anonymized users) is also useful in improving structure based de-anonymization attacks

Data Model

• Anonymized data: G^a = (V^a, E^a)
• Auxiliary data: G^u = (V^u, E^u)
• De-anonymization scheme σ: a mapping from V^a to V^u, i.e., if i ∈ V^a, then σ(i) ∈ V^u
• Seed mapping S: S = {(i, σ(i)) | i ∈ V^a, σ(i) ∈ V^u}, Λ = |S|
• Conceptual underlying graph G
• Sampling rate s
• Measurement
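The data model above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the underlying graph G is drawn from an Erdős-Rényi model, and both the anonymized data and the auxiliary data are obtained by keeping each edge of G independently with probability s, which is one natural reading of the "sampling rate s" bullet. The function names (`er_graph`, `sample_edges`, `anonymize`) and the parameter values are hypothetical and do not come from the slides.

```python
import random
from itertools import combinations

def er_graph(n, p, rng):
    """Edge set of an Erdos-Renyi graph G(n, p) on nodes 0..n-1."""
    return {(u, v) for u, v in combinations(range(n), 2) if rng.random() < p}

def sample_edges(edges, s, rng):
    """Keep each edge independently with probability s (assumed sampling model)."""
    return {e for e in sorted(edges) if rng.random() < s}

def anonymize(edges, n, rng):
    """Relabel nodes with a random permutation.

    Returns the relabeled edge set and the ground-truth mapping
    (anonymized ID -> original ID), which an attacker would not know.
    """
    perm = list(range(n))
    rng.shuffle(perm)
    relabeled = {tuple(sorted((perm[u], perm[v]))) for u, v in edges}
    truth = {perm[u]: u for u in range(n)}
    return relabeled, truth

rng = random.Random(0)
n, p, s = 50, 0.2, 0.8          # illustrative values only
G = er_graph(n, p, rng)         # conceptual underlying graph
aux_edges = sample_edges(G, s, rng)                              # auxiliary data
anon_edges, truth = anonymize(sample_edges(G, s, rng), n, rng)   # anonymized data
```

A de-anonymization scheme σ is then any guess at `truth`, and the seed mapping S is the part of `truth` that the attacker already knows.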

System Model

• Edge difference between the anonymized data and the auxiliary data under σ
• For the mapping (i, σ(i) = j) ∈ σ
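The exact formula on this slide did not survive extraction, so the following is only one plausible way to measure an edge difference under a mapping σ, not necessarily the authors' definition. It counts the edges that appear in exactly one of the two graphs once the anonymized edges are translated through σ. The function name `edge_difference` and the toy graphs are hypothetical.

```python
def _norm(u, v):
    """Store an undirected edge in a canonical order."""
    return (u, v) if u <= v else (v, u)

def edge_difference(anon_edges, aux_edges, sigma):
    """Number of edges that disagree between the two graphs under sigma.

    Assumes sigma maps every anonymized node to a distinct auxiliary node.
    """
    aux = {_norm(u, v) for u, v in aux_edges}
    mapped = {_norm(sigma[u], sigma[v]) for u, v in anon_edges}
    return len(mapped ^ aux)   # symmetric difference

# Toy example: a path a-b-c versus a path 1-2-3.
anon = {("a", "b"), ("b", "c")}
aux = {(1, 2), (2, 3)}
correct = edge_difference(anon, aux, {"a": 1, "b": 2, "c": 3})   # 0
wrong = edge_difference(anon, aux, {"a": 2, "b": 1, "c": 3})     # 2
```

Under this measure the correct mapping gives no disagreement on the toy example, while swapping two users introduces two mismatched edges.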

De-anonymization Quantification

• Graph G: Erdős-Rényi (ER) model; general model
• Quantification
  - Seed based perfect de-anonymization
  - Structure based perfect de-anonymization

De-anonymization Quantification

Error Toleration Quantification

We define the anonymized data to be (1 − ϵ)-de-anonymizable if at least (1 − ϵ)n of its users are perfectly de-anonymizable. That is, at most ϵn incorrect de-anonymizations are allowed.
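The error tolerance condition can be made concrete with a small check. Note the assumption: the definition above is a theoretical property of the data, whereas this sketch only tests whether one particular attack output `sigma` stays within the ϵn error budget against a known ground truth. The function name and the toy data are hypothetical.

```python
def within_error_budget(sigma, truth, eps):
    """True if sigma mis-maps at most eps * n of the n users in truth.

    Users missing from sigma count as incorrect.
    """
    n = len(truth)
    wrong = sum(1 for i in truth if sigma.get(i) != truth[i])
    return wrong <= eps * n

# Toy example: 10 users, exactly one of them mapped incorrectly.
truth = {i: i + 100 for i in range(10)}
sigma = dict(truth)
sigma[0] = 999

ok_loose = within_error_budget(sigma, truth, eps=0.2)    # 1 <= 2.0
ok_tight = within_error_budget(sigma, truth, eps=0.05)   # 1 <= 0.5 fails
```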

Datasets

Setup

• Suffixes
  - -S: using seed information
  - -A or none: using overall structural information
  - e.g., Twitter-A, Twitter-S
• Seed mappings are chosen randomly
  - High-degree users are not given preference
  - Representing the general scenarios
• Two-part evaluation
  - Evaluation of perfect de-anonymizability
  - Evaluation of (1 − ϵ)-de-anonymizability
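Random seed selection as described in the setup can be sketched as follows. The sketch assumes that a ground-truth mapping is available to the experimenter and that seeds are drawn uniformly at random, so a user's degree plays no role. The helper `pick_seeds` and the example sizes are hypothetical, not taken from the evaluation.

```python
import random

def pick_seeds(truth, num_seeds, rng):
    """Draw num_seeds users uniformly at random and reveal their true mappings."""
    users = sorted(truth)                  # fixed order for reproducibility
    chosen = rng.sample(users, num_seeds)  # uniform, no degree preference
    return {i: truth[i] for i in chosen}

# Toy ground truth: anonymized ID i corresponds to auxiliary ID i + 100.
truth = {i: i + 100 for i in range(20)}
seeds = pick_seeds(truth, 5, random.Random(1))
```

Here `len(seeds)` plays the role of Λ, the number of seed mappings known to the attacker.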

Evaluation: Perfect De-anonymizability [1/3]

Evaluation: Perfect De-anonymizability [2/3]

Evaluation: Perfect De-anonymizability [3/3]

Evaluation: Partial De-anonymizability [1/2]

Evaluation: Partial De-anonymizability [2/2]

Evaluation Overview

Conclusion & Limitation

• Provides the theoretical foundation for existing de-anonymization attacks with seed information
• De-anonymization based on the overall structural information is more powerful, and it can perfectly de-anonymize a social network even without any seed information
• Does not specifically consider how to design structural data anonymization techniques to defend against such de-anonymization attacks
• Does not explicitly involve a noise model, because there is no proper scheme for adding noise while preserving data utility
