TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.
![Page 1: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/1.jpg)
TOWARDS IDENTITY ANONYMIZATION ON GRAPHS
![Page 2: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/2.jpg)
INTRODUCTION
• Removing the identities of the nodes before publishing graph or social-network data does not guarantee privacy.
• The structure of the graph itself, and in its most basic form the degrees of the nodes, can reveal the identities of individuals.
• We call a graph k-degree anonymous if, for every node v, there exist at least k − 1 other nodes in the graph with the same degree as v.
![Page 3: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/3.jpg)
MOTIVATION
• Social networks, online communities, peer-to-peer file sharing and telecommunication systems can be modelled as complex graphs.
• These graphs are of significant importance in various application domains such as marketing, psychology and homeland security.
![Page 4: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/4.jpg)
MOTIVATION
• In a social network, nodes correspond to individuals or other social entities, and edges correspond to social relationships between them.
• http://www.yasiv.com/facebook
• https://apps.facebook.com/touchgraph
![Page 5: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/5.jpg)
THE CHALLENGE
How can we minimally modify the graph to protect the identity of each individual involved without losing the information it carries?
![Page 6: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/6.jpg)
CATEGORIES OF PRIVACY BREACHES IN SOCIAL NETWORKS
• Identity disclosure: the identity of the individual who is associated with the node is revealed.
• Link disclosure: sensitive relationships between two individuals are disclosed.
• Content disclosure: the privacy of the data associated with each node is breached, e.g., the email messages sent and/or received by the individuals in an email communication graph.
![Page 7: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/7.jpg)
NOTES
• k-anonymization is used for content disclosure.
• In this paper, we focus on identity disclosure.
![Page 8: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/8.jpg)
PROBLEM
• Given a graph G and an integer k, modify G via a set of edge-addition (or deletion) operations in order to construct a new k-degree anonymous graph, in which every node has the same degree as at least k − 1 other nodes.
![Page 9: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/9.jpg)
CHALLENGE
We want to preserve the utility of the original graph while at the same time satisfying the degree-anonymity constraint.
![Page 10: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/10.jpg)
PROBLEM DEFINITION
• The graph G(V, E)
• dG: the degree sequence of G, a vector of size n = |V|
• dG(i) is the degree of the i-th node of G
• Entries in d are ordered in decreasing order of the degrees they correspond to, that is, d(1) ≥ d(2) ≥ … ≥ d(n)
• d[i, j] is the subsequence of d that contains elements i, i + 1, …, j
![Page 11: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/11.jpg)
PROBLEM DEFINITION
• Definition 1
A vector of integers v is k-anonymous if every distinct value in v appears at least k times.
For example, the vector v = [5, 5, 3, 3, 2, 2, 2] is 2-anonymous.
• Definition 2
A graph G(V, E) is k-degree anonymous if the degree sequence of G, dG, is k-anonymous.
• This property prevents the reidentification of individuals by adversaries with a priori knowledge of the degree of certain nodes.
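The two definitions above translate almost directly into code. The following is a minimal sketch, not part of the original slides; the function names and the adjacency-dictionary representation of the graph are assumptions made for illustration.

```python
from collections import Counter


def is_k_anonymous(values, k):
    """Definition 1: every distinct value appears at least k times."""
    return all(count >= k for count in Counter(values).values())


def is_k_degree_anonymous(adjacency, k):
    """Definition 2: the degree sequence of the graph is k-anonymous.

    `adjacency` is a hypothetical representation: a dict mapping each
    node to the set of its neighbours in a simple undirected graph.
    """
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    return is_k_anonymous(degrees, k)


v = [5, 5, 3, 3, 2, 2, 2]
print(is_k_anonymous(v, 2))  # True: each of 5, 3, 2 appears at least twice
print(is_k_anonymous(v, 3))  # False: 5 and 3 appear only twice
```

Note that a k-anonymous vector is also k'-anonymous for every k' ≤ k, so the check is only informative for the largest k of interest.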
![Page 12: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/12.jpg)
PROBLEM DEFINITION
![Page 13: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/13.jpg)
PROBLEM 1
• The input: a graph G(V, E) and an integer k.
• The output: a k-degree anonymous graph Ĝ(V, Ê).
• We restrict the graph-modification operations to edge additions.
• The graph-anonymization cost should be minimized, that is, the distance between G and Ĝ should be as small as possible.
• =
![Page 14: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/14.jpg)
PROBLEM 1
• We can naturally relax this requirement to the one where Ê ∩ E ≈ E rather than Ê ∩ E = E.
• We call this Relaxed Graph Anonymization.
![Page 15: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/15.jpg)
THE APPROACH
• 1- Starting from d, we construct a new degree sequence d̂ which is k-anonymous and minimizes the cost.
• 2- Starting from d̂, we construct a graph Ĝ(V, Ê) such that dĜ = d̂ and Ê ∩ E = E.
![Page 16: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/16.jpg)
PROBLEM 2 FROM STEP 1 (DEGREE ANONYMIZATION)
• Input is d (the degree sequence of the graph G(V, E)) and an integer k.
• Output is a k-anonymous sequence d̂ such that L1(d̂ − d) is minimized.
![Page 17: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/17.jpg)
PROBLEM 3 FROM STEP 2 (GRAPH CONSTRUCTION)
• The inputs are a graph G(V, E) and a k-anonymous degree sequence d̂.
• The output is a graph Ĝ(V, Ê) such that dĜ = d̂ and Ê ∩ E = E, or Ê ∩ E ≈ E for the relaxed version.
![Page 18: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/18.jpg)
DEGREE ANONYMIZATION
• We can construct a set of dynamic-programming equations that solve the Degree Anonymization problem.
• The running time is O(n²).
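The equations on this slide did not survive extraction, so the following is only a sketch of one dynamic program that fits the description. It assumes that degrees may only be increased (edge additions), that the sorted sequence is split into consecutive groups of at least k entries, and that every entry in a group is raised to the group's largest degree. The names `anonymize_degrees_dp` and `group_cost` are hypothetical.

```python
from itertools import accumulate


def anonymize_degrees_dp(d, k):
    """Return (cost, anonymized sequence) for a non-increasing sequence d.

    Assumption: the cost is the L1 distance between the output and d,
    and degrees can only grow.
    """
    n = len(d)
    if n < k:
        raise ValueError("need at least k entries")
    prefix = [0] + list(accumulate(d))

    def group_cost(i, j):
        # Cost of raising d[i..j] (0-indexed, inclusive) to d[i].
        return d[i] * (j - i + 1) - (prefix[j + 1] - prefix[i])

    inf = float("inf")
    best = [inf] * (n + 1)  # best[i]: minimum cost for the first i entries
    cut = [0] * (n + 1)     # cut[i]: start index of the last group
    best[0] = 0
    for i in range(k, n + 1):
        # The last group is d[t..i-1]; either it is the only group (t = 0)
        # or at least k entries precede it and it has at least k entries.
        for t in [0] + list(range(k, i - k + 1)):
            cost = best[t] + group_cost(t, i - 1)
            if cost < best[i]:
                best[i], cut[i] = cost, t

    anonymized = list(d)
    i = n
    while i > 0:  # walk the cuts backwards to rebuild the groups
        t = cut[i]
        for j in range(t, i):
            anonymized[j] = d[t]
        i = t
    return best[n], anonymized


cost, d_hat = anonymize_degrees_dp([5, 3, 2, 2, 1, 1], k=2)
print(cost, d_hat)  # 2 [5, 5, 2, 2, 1, 1]
```

With prefix sums each group cost is evaluated in constant time, so the double loop matches the quadratic running time quoted above.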
![Page 19: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/19.jpg)
DEGREE ANONYMIZATION
• We can improve the running time of the DP algorithm from O(n²) to O(nk).
![Page 20: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/20.jpg)
DEGREE ANONYMIZATION
• For completeness, we also give a Greedy linear-time alternative algorithm for the Degree Anonymization problem.
• The Greedy algorithm first forms a group consisting of the first k highest-degree nodes. Then it checks whether it should merge the (k + 1)-th node into the previously formed group or start a new group at position (k + 1).
• To make this decision, the algorithm computes the following two costs:
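The two costs themselves are missing from the transcript. The sketch below assumes one plausible pair: the cost of merging the next node into the current group (raising it to the group's degree, plus the cost of the group that would then start one position later), against the cost of starting a new group of k nodes at that position. The names `greedy_anonymize` and `raise_cost` and the handling of the tail are assumptions, not the authors' exact procedure.

```python
def greedy_anonymize(d, k):
    """Greedy grouping of a non-increasing degree sequence d."""
    n = len(d)
    if n < k:
        raise ValueError("need at least k entries")

    def raise_cost(i, j):
        # Cost of raising d[i..j] (0-indexed, inclusive) to d[i].
        return sum(d[i] - d[l] for l in range(i, j + 1))

    anonymized = list(d)
    start = 0
    while start < n:
        if n - start < 2 * k:
            end = n  # too few entries for two groups: close with one group
        else:
            end = start + k
            while end < n:
                if n - end < k:
                    end = n  # the remainder cannot form its own group
                    break
                # Assumed cost of merging d[end] into the current group.
                c_merge = (d[start] - d[end]) + raise_cost(
                    end + 1, min(end + k, n - 1)
                )
                # Assumed cost of starting a new group at position end.
                c_new = raise_cost(end, end + k - 1)
                if c_merge > c_new:
                    break
                end += 1
        for j in range(start, end):
            anonymized[j] = d[start]
        start = end
    return anonymized


print(greedy_anonymize([5, 3, 2, 2, 1, 1], k=2))  # [5, 5, 2, 2, 1, 1]
```

Because each decision only looks at the next k entries, the result is not guaranteed to match the dynamic-programming optimum.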
![Page 21: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/21.jpg)
GRAPH CONSTRUCTION
• Inputs are the graph G(V, E) and the desired k-anonymous degree sequence d̂ (which is the output of the DP or Greedy algorithm)
• Output is a k-degree anonymous graph Ĝ(V, Ê) such that dĜ = d̂ and Ê ∩ E = E
![Page 22: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/22.jpg)
REALIZABILITY OF DEGREE SEQUENCE
• A degree sequence d, with d(1) ≥ … ≥ d(n), is called realizable if and only if there exists a simple graph whose nodes have precisely this sequence of degrees.
• Lemma 1: A degree sequence d with d(1) ≥ d(2) ≥ … ≥ d(i) ≥ … ≥ d(n) and Σd(i) even is realizable if and only if
$$\sum_{i=1}^{l} d(i) \le l(l-1) + \sum_{i=l+1}^{n} \min\{l, d(i)\}, \quad \text{for every } 1 \le l \le n-1.$$
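The condition of Lemma 1 (the classical Erdős–Gallai test) can be checked directly. This is a minimal sketch; the function name `is_realizable` is an assumption, and the loop also checks l = n, as in the standard statement of the theorem, so that degenerate inputs such as a single node with positive degree are rejected.

```python
def is_realizable(degrees):
    """Check whether a simple graph with this degree sequence exists."""
    d = sorted(degrees, reverse=True)
    n = len(d)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    for l in range(1, n + 1):
        lhs = sum(d[:l])                                  # sum of the l largest degrees
        rhs = l * (l - 1) + sum(min(l, x) for x in d[l:])
        if lhs > rhs:
            return False
    return True


print(is_realizable([3, 3, 2, 2, 2]))  # True
print(is_realizable([3, 3, 1, 1]))     # False: the condition fails for l = 2
```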
![Page 23: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/23.jpg)
THE CONSTRUCTGRAPH ALGORITHM:
Takes as input the desired degree
sequence d and outputs a graph
with exactly this degree sequence,
if such a graph exists.
Otherwise it outputs “No”.
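The algorithm's pseudocode is only available as a slide image, so the sketch below substitutes the well-known Havel–Hakimi construction: repeatedly take the node with the largest residual degree and connect it to the nodes with the next-largest residual degrees. The name `construct_graph` and the choice of returning `None` in place of “No” are assumptions for illustration.

```python
def construct_graph(d):
    """Return a set of edges realizing degree sequence d, or None ("No").

    Nodes are the indices 0..len(d)-1; each edge is a tuple (u, v), u < v.
    """
    n = len(d)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return None
    residual = list(d)
    edges = set()
    while True:
        v = max(range(n), key=lambda i: residual[i], default=None)
        if v is None or residual[v] == 0:
            return edges  # every residual degree is satisfied
        need = residual[v]
        residual[v] = 0
        # Candidate endpoints: other nodes that still need edges,
        # highest residual degree first.
        candidates = sorted(
            (u for u in range(n) if u != v and residual[u] > 0),
            key=lambda u: residual[u],
            reverse=True,
        )
        if len(candidates) < need:
            return None  # the sequence is not realizable
        for u in candidates[:need]:
            edges.add((min(u, v), max(u, v)))
            residual[u] -= 1


print(construct_graph([3, 3, 1, 1]))  # None
```

A node whose residual degree has been satisfied is never selected again, so the construction cannot create parallel edges or self-loops.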
![Page 24: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/24.jpg)
REALIZABILITY OF DEGREE SEQUENCES WITH CONSTRAINTS
• Notice that Lemma 1 is not directly applicable to the Graph Construction problem.
• This is because we also require that E ⊆ Ê.
We therefore want to devise an algorithm for constructing a degree-anonymous graph Ĝ which is a supergraph of G, if such a graph exists. We call this algorithm Supergraph; it is an extension of the ConstructGraph algorithm.
![Page 25: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/25.jpg)
THE PROBING SCHEME
• If the Supergraph algorithm returns a graph Ĝ, then we guarantee that the least number of edge additions has been made.
• If Supergraph returns “No” or “Unknown”, we are willing to tolerate some additional edge additions: the Probing scheme forces the Supergraph algorithm to output the desired k-degree anonymous graph at a small extra cost.
![Page 26: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/26.jpg)
THE PROBING SCHEME
![Page 27: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/27.jpg)
RELAXED GRAPH CONSTRUCTION
Most of the edges of the original graph appear in the degree-anonymous graph as well, but not necessarily all of them.
![Page 28: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/28.jpg)
RELAXED GRAPH CONSTRUCTION
• The Greedy_Swap algorithm
• It is a greedy heuristic that, given Ĝ0 and G, transforms Ĝ0 into Ĝ(V, Ê) with degree sequence dĜ = d̂ = dĜ0 and Ê ∩ E ≈ E.
• Here Ĝ0 is the output of the ConstructGraph algorithm. Although it is k-degree anonymous, its structure may be quite different from the original graph G(V, E).
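The core move of such a heuristic can be illustrated with a degree-preserving edge swap: two edges (i, j) and (l, m) are replaced by (i, l), (j, m) or by (i, m), (j, l), which leaves every node degree unchanged. The sketch below is an assumption about how one greedy step might look; it scans all edge pairs exhaustively, which the slides do not confirm, and the function names are hypothetical.

```python
def norm(u, v):
    """Store an undirected edge with its smaller endpoint first."""
    return (u, v) if u < v else (v, u)


def valid_swaps(edges, e1, e2):
    """Edge sets obtained by swapping the endpoints of e1 and e2."""
    (i, j), (l, m) = e1, e2
    if len({i, j, l, m}) < 4:
        return []  # a shared endpoint would create a self-loop
    results = []
    for a, b in (((i, l), (j, m)), ((i, m), (j, l))):
        a, b = norm(*a), norm(*b)
        if a in edges or b in edges:
            continue  # would create a parallel edge
        results.append((edges - {e1, e2}) | {a, b})
    return results


def greedy_swap_step(edges, original):
    """Return the swapped edge set that most increases the overlap with
    `original`, or None if no swap improves it."""
    best, best_overlap = None, len(edges & original)
    edge_list = sorted(edges)
    for x in range(len(edge_list)):
        for y in range(x + 1, len(edge_list)):
            for candidate in valid_swaps(edges, edge_list[x], edge_list[y]):
                overlap = len(candidate & original)
                if overlap > best_overlap:
                    best, best_overlap = candidate, overlap
    return best


original = {(0, 1), (2, 3)}
current = {(0, 2), (1, 3)}  # same degrees, no edge in common
print(greedy_swap_step(current, original) == original)  # True
```

Repeating the step until it returns `None` gives a simple hill-climbing loop. A practical version would likely sample edge pairs instead of enumerating all of them.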
![Page 29: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/29.jpg)
RELAXED GRAPH CONSTRUCTION
• The Greedy_Swap algorithm
![Page 30: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/30.jpg)
RELAXED GRAPH CONSTRUCTION
• The Priority algorithm
• It is a simple modification of the ConstructGraph algorithm that directly constructs degree-anonymous graphs with similarly high edge intersection with the original graph, without using Greedy_Swap.
• It gives priority to edges that already exist in the input graph G(V, E).
![Page 31: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/31.jpg)
RELAXED GRAPH CONSTRUCTION
![Page 32: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/32.jpg)
EXPERIMENTS
![Page 33: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/33.jpg)
EVALUATING DEGREE ANONYMIZATION ALGORITHMS
• The closer R is to 1, the better the performance of the Greedy algorithm.
![Page 34: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/34.jpg)
EVALUATING GRAPH CONSTRUCTION ALGORITHMS
Evaluating anonymization cost L1(dA − d)
The smaller the value of L1(dA − d), the better the qualitative performance of the algorithm.
![Page 35: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/35.jpg)
EVALUATING GRAPH CONSTRUCTION ALGORITHMS
Clustering Coefficient (CC):
We additionally compare the clustering
coefficients of the anonymized graphs
with the clustering coefficients of the
original graphs.
![Page 36: TOWARDS IDENTITY ANONYMIZATION ON GRAPHS. INTRODUCTION.](https://reader037.fdocuments.us/reader037/viewer/2022102818/56649dca5503460f94ac097b/html5/thumbnails/36.jpg)
EVALUATING GRAPH CONSTRUCTION ALGORITHMS
Average Path Length (APL):