Privacy Preservation for Data Streams
Feifei Li, Boston University
Joint work with: Jimeng Sun (CMU), Spiros Papadimitriou, George A. Mihaila, and Ioana Stanoi (IBM T.J. Watson Research Center)
Application (1)
[Diagram: Corp. A, Corp. B, and Corp. C each send their sensitive data through a step labeled P to Analytical Services.]
Analytical Services: finding trends, clusters, patterns, aggregations.
Application (2)
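[Diagram: Corp. A Information Hub sends its data through a step labeled P to Client A and Client B.]
Corp. A Information Hub: publish data as a service
Client A, Client B: subscribe to data to identify trends, patterns, classes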
Target Application
[Figure: four panels, stream 1 to stream 4, each plotting value against time.]
Identify trends
Cluster/classification
Problem Formulation
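[Figure: N streams A1, A2, ..., AN plotted against time, with the value $A^1_t$ of stream A1 marked at time t.]
$A_{N \times T}$ with $T \in [1, \infty)$; at each time t, $A_t \in \mathbb{R}^N$
$A^{*}_{N \times T} = A_{N \times T} + E_{N \times T}$
Online generated noise, one vector at a time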
Problem Formulation (continued)
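[Figure: the perturbed streams $A^{*}_{N \times T}$ are passed through a reconstruction R to obtain $\tilde{A}_{N \times T}$.]
$\min_R D(A_{N \times T}, \tilde{A}_{N \times T})$, offline and online
Given σ², obtain A* online such that D(A, A*) = σ², and for a given R, D(A, A~) is close to σ².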
Data Perturbation
[Figure: stream plots over time; random i.i.d. noise is added (+) to the streams.]
Random i.i.d. noise
i.i.d.: independent and identically distributed
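As a minimal sketch of the perturbation on this slide, the snippet below adds zero-mean Gaussian noise independently to every entry of each incoming vector. The Gaussian distribution, the helper names `perturb_iid` and `mean_squared_discrepancy`, and the per-entry averaging used to measure the added noise are assumptions made for illustration; the slides do not fix any of them.

```python
import random


def perturb_iid(a_t, sigma, rng):
    """Add independent zero-mean Gaussian noise (variance sigma**2) to each entry."""
    return [value + rng.gauss(0.0, sigma) for value in a_t]


def mean_squared_discrepancy(original, perturbed):
    """Average squared difference per entry (an assumed stand-in for D)."""
    total = 0.0
    count = 0
    for a_t, p_t in zip(original, perturbed):
        for a, p in zip(a_t, p_t):
            total += (a - p) ** 2
            count += 1
    return total / count


rng = random.Random(0)
# Three toy streams observed over 5000 time steps (made-up data).
stream = [[float(t), 2.0 * t, -1.0 * t] for t in range(5000)]
published = [perturb_iid(a_t, 0.5, rng) for a_t in stream]
measured = mean_squared_discrepancy(stream, published)
print(measured)  # expected to be near sigma**2 = 0.25
```

Because each vector is perturbed on its own, the noise can be generated online, one vector at a time.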
Principal Component Analysis: PCA
i.i.d. noise
Principal Component Analysis: PCA
Correlated Noise
PCA Based Data Reconstruction
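[Figure: the original point A, the perturbed point A* at distance σ² from it, and the reconstruction A~ on the principal direction. Labeled segments: added noise, removed noise, remaining noise, projection error.]
Added noise: utility
Remaining noise: privacy
A: Original Data
A*: Perturbed Data
A~: Reconstructed Data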
PCA Based Data Reconstruction
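[Figure: the original point A, the perturbed point A* at distance σ² from it, and the reconstruction A~ on the principal direction. Labeled segments: added noise, remaining noise, projection error.]
Correlated noise!
Added noise: utility
Remaining noise: privacy
A: Original Data
A*: Perturbed Data
A~: Reconstructed Data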
Data Perturbation: main idea
Observations
– The amount of random noise controls the privacy/utility tradeoff
– i.i.d. (independent and identically distributed) noise does not preserve privacy well enough
Lesson learned
– Noise should be correlated with the original data
• Z. Huang et al., SIGMOD 2005
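The observation above can be illustrated with a small two-dimensional experiment. This is a sketch under stated assumptions, not the authors' evaluation: the data are made-up zero-mean points on the line y = x, the adversary reconstructs by projecting onto the principal direction of the perturbed data, and all function names are hypothetical. With i.i.d. noise the projection removes roughly half of the added noise; with noise aligned to the data's own direction it removes essentially none.

```python
import math
import random


def principal_direction(points):
    """Unit vector along the major axis of the 2x2 scatter matrix (zero-mean data assumed)."""
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    syy = sum(y * y for _, y in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def project(points, u):
    """Project every point onto the line spanned by the unit vector u."""
    projected = []
    for x, y in points:
        c = x * u[0] + y * u[1]
        projected.append((c * u[0], c * u[1]))
    return projected


def mean_sq_error(a, b):
    return sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for p, q in zip(a, b)) / len(a)


def compare_noise_models(sigma=0.5, n=2000, seed=0):
    rng = random.Random(seed)
    original = [(s, s) for s in (rng.gauss(0.0, 3.0) for _ in range(n))]
    # i.i.d. noise: independent in each coordinate, total variance 2 * sigma**2.
    iid = [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in original]
    # Correlated noise: same total variance, but placed along the data direction.
    correlated = []
    for x, y in original:
        g = rng.gauss(0.0, sigma * math.sqrt(2.0)) / math.sqrt(2.0)
        correlated.append((x + g, y + g))
    results = {}
    for name, perturbed in (("iid", iid), ("correlated", correlated)):
        reconstructed = project(perturbed, principal_direction(perturbed))
        added = mean_sq_error(original, perturbed)
        remaining = mean_sq_error(original, reconstructed)
        results[name] = (added, remaining)
    return results


results = compare_noise_models()
for name, (added, remaining) in results.items():
    print(name, "added:", round(added, 3), "remaining:", round(remaining, 3))
```

The gap between added and remaining noise is what the adversary gains; correlated noise keeps that gap small.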
Challenge 1: Dynamic Correlation
Challenge 2: Dynamic Autocorrelation
Online Random Noise for Autocorrelation: Stock
State of the Art
Privacy Preservation
– Given a utility requirement, maximize the privacy
Existing Work (Z. Huang et al., SIGMOD 2005)
– Batch mode, static data
– And many other works (see our paper for a detailed literature review)
Adding Dynamic Correlated Noise
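[Diagram: three streams A1, A2, A3 feed the steps below.]
U (3x3): online estimation of principal components (S. Papadimitriou et al., VLDB 2005)
For each incoming vector At:
– Update U
– Generate noise Et distributed along U
– Add Et to At (+) and publish A~t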
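The loop on this slide (update U, generate noise along U, publish) can be sketched as follows. The tracker here is a simplified one-component recursive least-squares update chosen for brevity; it stands in for the online estimator cited on the slide and is not a reproduction of it. The class and function names, the initial direction, the small initial energy `d0`, and the Gaussian noise are all assumptions.

```python
import math
import random


class OneComponentTracker:
    """Tracks a single principal direction with a recursive least-squares style update."""

    def __init__(self, n, d0=1e-3):
        self.u = [1.0] + [0.0] * (n - 1)  # arbitrary starting direction
        self.d = d0                       # running energy of the projections

    def update(self, x):
        y = sum(ui * xi for ui, xi in zip(self.u, x))  # projection onto current estimate
        self.d += y * y
        gain = y / self.d
        # Move u toward x in proportion to the reconstruction residual x - y * u.
        self.u = [ui + gain * (xi - y * ui) for ui, xi in zip(self.u, x)]

    def direction(self):
        norm = math.sqrt(sum(ui * ui for ui in self.u))
        return [ui / norm for ui in self.u]


def publish_stream(vectors, sigma, seed=0):
    """For each vector: update the estimate, draw noise along it, publish the sum."""
    rng = random.Random(seed)
    tracker = OneComponentTracker(len(vectors[0]))
    published = []
    for a_t in vectors:
        tracker.update(a_t)
        u = tracker.direction()
        g = rng.gauss(0.0, sigma)
        e_t = [g * ui for ui in u]  # noise vector lies along the tracked direction
        published.append([ai + ei for ai, ei in zip(a_t, e_t)])
    return published, tracker.direction()


# Toy data: two streams that move together, plus a little measurement noise.
data_rng = random.Random(1)
data = []
for _ in range(500):
    s = data_rng.gauss(0.0, 1.0)
    data.append([s + data_rng.gauss(0.0, 0.05), s + data_rng.gauss(0.0, 0.05)])

published, u = publish_stream(data, sigma=0.3)
print(u)  # expected to be close to +/- (0.707, 0.707)
```

Only the current vector and the running estimate are kept in memory, which is what makes the scheme usable on a stream.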
Put it into Algorithm: Distribute Noise
k=3, U: eigenvectors, V: eigenvalues
Noise distributed in the principal components' subspace: $\sigma_1^2 = \sigma^2 \cdot V(1)/\lVert V \rVert$, $\sigma_2^2 = \sigma^2 \cdot V(2)/\lVert V \rVert$
Rotate back to data space ($U^T$), then add to At
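A minimal sketch of this step follows, assuming the noise budget σ² is split across the tracked components in proportion to their eigenvalues and that the normalizer is the sum of the eigenvalues, so the per-component variances add up to σ². The normalization, the Gaussian draws, and the function names are assumptions; `U` is taken to be a list of orthonormal direction vectors.

```python
import math
import random


def allocate_variance(eigenvalues, sigma2):
    """Split the total noise variance sigma2 in proportion to the eigenvalues."""
    total = sum(eigenvalues)
    return [sigma2 * lam / total for lam in eigenvalues]


def correlated_noise(U, eigenvalues, sigma2, rng):
    """Draw one coefficient per component, then rotate back to data space."""
    variances = allocate_variance(eigenvalues, sigma2)
    n = len(U[0])
    noise = [0.0] * n
    for u, var in zip(U, variances):
        coeff = rng.gauss(0.0, math.sqrt(var))
        for j in range(n):
            noise[j] += coeff * u[j]
    return noise


rng = random.Random(0)
r = 1.0 / math.sqrt(2.0)
U = [[r, r, 0.0], [0.0, 0.0, 1.0]]  # two orthonormal directions in 3-D (made up)
V = [3.0, 1.0]                      # their eigenvalues (made up)
a_t = [10.0, 10.0, 2.0]
e_t = correlated_noise(U, V, 2.0, rng)
published = [a + e for a, e in zip(a_t, e_t)]
print(allocate_variance(V, 2.0))  # [1.5, 0.5]
```

Components with larger eigenvalues receive more noise, so the noise follows the same directions as the data.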
Why is our algorithm better than the state of the art?
[Figure: two local principal components contrasted with the global principal component.]
Noise added along global PC (offline)
Removed noise by online reconstruction
Online Reconstruction vs. Offline Reconstruction
Choice of adversary:
– Offline reconstruction based on global principal components
– Online tracking of the principal components, followed by local reconstruction
– Please see the details in the paper
Tracking Autocorrelation
a = [1 2 3 4 5 6]^T
W = [w1; w2; w3; w4] =
1 2 3
2 3 4
3 4 5
4 5 6
[Figure labels: Time; h streams]
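The construction of W from a single stream can be sketched as a sliding window: each row holds h consecutive values, and consecutive rows shift by one time step. The function name `window_matrix` is hypothetical, and h = 3 is read off the example on the slide.

```python
def window_matrix(a, h):
    """Stack every length-h window of the sequence a as a row."""
    return [a[i:i + h] for i in range(len(a) - h + 1)]


W = window_matrix([1, 2, 3, 4, 5, 6], 3)
for row in W:
    print(row)
```

Treating the h columns of W as h separate streams lets the same correlation tracking be applied to the autocorrelation of one stream.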
Distribute Noise
W =
1 2 3
2 3 4
3 4 5
4 5 6
Avoid adding noise above the allowed threshold, while keeping the noise auto-correlated with the stream.
Idea: constrain the next k noise values based on the previous h-k noise values and the current estimate of U; this becomes a linear system.
Experiments
Three Real Data Streams
– Sensor streams, Lab: Light, Humidity, Volt, Temperature. 7712x198
– Chlorine environmental streams: 4310x166
– Stock streams: 8000x2
Perturbation vs. Reconstruction
Perturbation
– i.i.d-N: i.i.d. noise
– Offline-N: noise correlated with global principal components
– Online-N: SCAN (streaming correlated additive noise) / SACAN (streaming auto-correlated additive noise)
Reconstruction
– Baseline: take perturbed data as the reconstruction
– Offline-R: offline reconstruction based on global principal components
– Online-R: SCOR (streaming correlated online reconstruction) / SACOR (streaming auto-correlated online reconstruction)
Noise (discrepancy) is represented by its relative energy, as a percentage of the original data streams, i.e., D(A, A*)/||A||
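The relative-energy measure D(A, A*)/||A|| can be sketched as below. The slides do not spell out D or the norm, so this example assumes D is the squared Frobenius norm of the difference and that the denominator is the squared Frobenius norm of A; the function names are hypothetical.

```python
def frobenius_sq(matrix):
    """Sum of squared entries of a matrix given as a list of rows."""
    return sum(value * value for row in matrix for value in row)


def relative_discrepancy(original, other):
    """Energy of (original - other) as a fraction of the energy of original."""
    diff = [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(original, other)]
    return frobenius_sq(diff) / frobenius_sq(original)


A = [[3.0, 4.0]]
A_star = [[3.0, 4.5]]
ratio = relative_discrepancy(A, A_star)
print(f"{100.0 * ratio:.1f}% noise")
```

Under these assumptions, the Baseline row corresponds to calling `relative_discrepancy(A, A_star)` directly, while the other reconstruction rows would pass the reconstructed streams as the second argument.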
Reconstruction Error: Online-R vs. Offline-R
Online reconstruction achieves better accuracy, as it minimizes the projection error
10% noise, k=10
Reconstruction Error: vary k
1. Online reconstruction achieves better accuracy
2. Large k reduces projection error
Privacy vs. Discrepancy, online-R: Lab data
Privacy vs. Discrepancy, online-R: Chlorine
Online Random Noise for Autocorrelation: Chlorine
Online Random Noise for Autocorrelation: Stock
Privacy vs. Discrepancy: Online-R (Chlorine)
Privacy vs. Discrepancy: Online-R (Stock)
Running Time Analysis
Future Work
Combining correlation and autocorrelation
Other types of data streams beyond numeric data, such as categorical data
Questions
Thank you!