On Your Social Network De-anonymizablity: Quantification and Large Scale Evaluation with Seed Knowledge
NDSS 2015, Shouling Ji, Georgia Institute of Technology
Fengli Zhang
11/4/2015
Outline
• Introduction
• Motivation
• Contributions
• De-anonymization Quantification
• Evaluation
• Conclusion
Introduction
• As social networks have become deeply integrated into people’s lives, they produce a significant amount of social data containing their users’ detailed personal information
• To protect users’ privacy, data owners usually anonymize their data before it is shared, transferred, or published
 - Naïve ID removal, k-anonymization, differential privacy
• Existing anonymization schemes have vulnerabilities: structure based de-anonymization attacks can effectively break the privacy of social networks using only the data’s structural information
De-anonymization Attack
Motivation
• Question 1: Why are social networks vulnerable to structure based de-anonymization attacks?
• Question 2: How de-anonymizable is a social network?
• Question 3: How many users within a social network can be successfully de-anonymized?
Contributions
• First theoretical quantification of the perfect and partial de-anonymizability of social networks in general scenarios, where the social network can follow an arbitrary network model
• First large scale evaluation of the perfect and partial de-anonymizability of 24 real world social networks
• Finding: compared with the structural information associated with known seed users, the other structural information (the structural information among anonymized users) is also useful in improving structure based de-anonymization attacks
Data Model
• Anonymized data: G^a = (V^a, E^a)
• Auxiliary data: G^u = (V^u, E^u)
• De-anonymization scheme σ: a mapping such that if i ∈ V^a, then σ(i) ∈ V^u
• Seed mapping S: S = {(i, σ(i)) | i ∈ V^a, σ(i) ∈ V^u}, Λ = |S|
• Conceptual underlying graph (G)
• Sampling rate s
• Measurement
System Model
• Edge difference between the anonymized graph and the auxiliary graph under σ
• For the mapping (i, σ(i) = j) ∈ σ
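The data model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: it assumes the anonymized and auxiliary graphs are obtained by keeping each edge of the underlying graph G independently with probability s, and it assumes the edge difference under σ counts the edges that appear in one graph but not the other once anonymized users are relabeled through σ. The names `sample_graph`, `norm`, and `edge_difference` are hypothetical.

```python
import random

def sample_graph(edges, s, rng):
    """Keep each undirected edge independently with probability s (assumed sampling model)."""
    return {e for e in edges if rng.random() < s}

def norm(u, v):
    """Canonical (sorted) form of an undirected edge."""
    return (u, v) if u <= v else (v, u)

def edge_difference(anon_edges, aux_edges, sigma):
    """Count edges that disagree after mapping anonymized users through sigma (assumed definition)."""
    mapped = {norm(sigma[u], sigma[v]) for u, v in anon_edges}
    aux = {norm(u, v) for u, v in aux_edges}
    return len(mapped ^ aux)  # symmetric difference of the two edge sets

# Toy underlying graph G: a path on users 0..3
G = {(0, 1), (1, 2), (2, 3)}
rng = random.Random(0)
anon = sample_graph(G, 0.8, rng)  # stands in for the anonymized data
aux = sample_graph(G, 0.8, rng)   # stands in for the auxiliary data

identity = {i: i for i in range(4)}   # the correct mapping
swapped = {0: 2, 1: 1, 2: 0, 3: 3}    # a mapping that confuses users 0 and 2
print(edge_difference(anon, aux, identity), edge_difference(anon, aux, swapped))
```

When both graphs equal G, the correct mapping gives an edge difference of 0, while the swapped mapping gives 2. A structure based attack can use that gap to tell the mappings apart.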
De-anonymization Quantification
• Graph G: Erdős–Rényi (ER) model; general model
• Quantification
 - Seed based perfect de-anonymization
 - Structure based perfect de-anonymization
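For the ER case, a conceptual underlying graph can be generated as in the sketch below. It assumes the standard G(n, p) construction, in which each of the n(n − 1)/2 possible edges is present independently with probability p. The function name `er_graph` and the parameter values are illustrative only and are not taken from the slides.

```python
import itertools
import random

def er_graph(n, p, rng):
    """Return the edge set of an Erdos-Renyi G(n, p) graph on nodes 0..n-1."""
    return {
        (u, v)
        for u, v in itertools.combinations(range(n), 2)
        if rng.random() < p
    }

rng = random.Random(42)
n, p = 100, 0.05
G = er_graph(n, p, rng)
mean_degree = 2 * len(G) / n
# The mean degree should be close to p * (n - 1) for large n.
print(len(G), mean_degree)
```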
De-anonymization Quantification
Error Toleration Quantification
The anonymized data is (1 − ϵ)-de-anonymizable if at least (1 − ϵ)n of its n users are perfectly de-anonymizable; that is, at most ϵn incorrect de-anonymizations are allowed.
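The error-tolerance criterion can be illustrated with a small check on a concrete mapping. This sketch only counts incorrect mappings against a known ground truth; it does not reproduce the theoretical conditions under which a network is (1 − ϵ)-de-anonymizable. The function name `within_error_tolerance` and the toy data are hypothetical.

```python
def within_error_tolerance(sigma, truth, eps):
    """Return True if at most eps * n users are mapped incorrectly by sigma."""
    n = len(truth)
    wrong = sum(1 for user, target in truth.items() if sigma.get(user) != target)
    return wrong <= eps * n

# Ten users whose true identities are 100..109
truth = {i: 100 + i for i in range(10)}

# A mapping that gets users 0 and 1 wrong (their identities are swapped)
sigma = dict(truth)
sigma[0], sigma[1] = truth[1], truth[0]

print(within_error_tolerance(sigma, truth, 0.25))  # 2 errors <= 2.5 allowed
print(within_error_tolerance(sigma, truth, 0.10))  # 2 errors >  1.0 allowed
```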
Datasets
Setup
• Suffixes
 - -S: using seed information
 - -A or none: using overall structural information
 - e.g., Twitter-A, Twitter-S
• Seed mappings are chosen randomly
 - High-degree users are not given preference
 - Represents the general scenario
• Two-part evaluation
 - Evaluation of perfect de-anonymizability
 - Evaluation of (1 − ϵ)-de-anonymizability
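The random seed selection described in the setup could look like the sketch below. It assumes seeds are drawn uniformly at random without replacement, so a user's degree plays no role. The helper names `choose_seeds` and `seed_mapping` and the toy ground truth are hypothetical and not taken from the slides.

```python
import random

def choose_seeds(users, num_seeds, rng):
    """Pick seed users uniformly at random; degree is deliberately ignored."""
    return rng.sample(sorted(users), num_seeds)

def seed_mapping(seeds, truth):
    """Build the seed mapping S as {anonymized user: true identity}."""
    return {i: truth[i] for i in seeds}

# Toy ground truth: anonymized user i corresponds to auxiliary user "u<i>"
truth = {i: f"u{i}" for i in range(20)}
rng = random.Random(7)

seeds = choose_seeds(truth.keys(), 5, rng)
S = seed_mapping(seeds, truth)
print(len(S), S)  # the number of seeds, i.e. |S|, is 5
```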
Evaluation: Perfect De-anonymizability [1/3]
Evaluation: Perfect De-anonymizability [2/3]
Evaluation: Perfect De-anonymizability [3/3]
Evaluation: Partial De-anonymizability [1/2]
Evaluation: Partial De-anonymizability [2/2]
Evaluation Overview
Conclusion & Limitation
• Provides the theoretical foundation for existing de-anonymization attacks that use seed information
• De-anonymization based on the overall structural information is more powerful: it can perfectly de-anonymize a social network even without any seed information
• Does not specifically consider how to design structural data anonymization techniques to defend against such de-anonymization attacks
• Does not explicitly involve a noise model, because there is no proper scheme for adding noise while preserving data utility