How Robust are Linear Sketches to Adaptive Inputs?
Moritz Hardt and David P. Woodruff, IBM Research Almaden
Two Aspects of Coping with Big Data
Efficiency: handle enormous inputs
Robustness: handle adverse conditions
Big Question: Can we have both?
Algorithmic paradigm: Linear Sketches
Linear map A
Applications: compressed sensing, data streams, distributed computation, ...
Unifying idea: a small number of linear measurements applied to the data
Data vector x ∈ Rⁿ
Output sketch y = Ax ∈ Rʳ, with r << n
For this talk: the output can be any (not necessarily efficient) function of y
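As a concrete toy illustration of the paradigm (a sketch, not the construction from the talk; the dimensions n and r below are arbitrary), a JL-style linear sketch can be simulated in a few lines of NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 10_000, 400            # ambient dimension vs. sketch dimension, r << n

# JL-style sketching matrix with i.i.d. N(0, 1/r) entries
A = rng.normal(scale=1.0 / np.sqrt(r), size=(r, n))

x = rng.normal(size=n)        # the data vector
y = A @ x                     # the sketch: r linear measurements of x

# |y|^2 concentrates around |x|^2, so the norm survives the compression
estimate = np.linalg.norm(y) ** 2
truth = np.linalg.norm(x) ** 2
rel_err = abs(estimate - truth) / truth
print(rel_err)                # small with high probability
```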
“For each” correctness
For each x: Pr{ Alg(x) correct } > 1 - 1/poly(n)
Pr over randomly chosen matrix A
Does this imply correctness on many inputs?
Only under a modeling assumption: inputs are non-adaptively chosen
Why not? There is no guarantee if input x2 depends on Alg(x1) for an earlier input x1
Example: Johnson-Lindenstrauss Sketch
• Goal: estimate |x|₂ from |Ax|₂
• JL sketch: if A is a k × n matrix of i.i.d. N(0, 1/k) random variables with k > log n, then Pr[ |Ax|² = (1 ± 1/2)|x|² ] > 1 - 1/poly(n)
• Attack:
  1. Query x = e_i and x = e_i + e_j for all standard unit vectors e_i and e_j; learn |A_i|², |A_j|², |A_i + A_j|², and hence learn ⟨A_i, A_j⟩
  2. Hence, learn AᵀA, and from it the kernel of A
  3. Query a vector x ∈ kernel(A)
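The attack above can be sketched numerically. This toy version assumes exact oracle access to |Ax|² (the real attack only receives (1 ± 1/2)-approximations, which suffices with more care); the tiny dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 3                      # tiny dimensions for illustration
A = rng.normal(scale=1.0 / np.sqrt(k), size=(k, n))

def sketch_norm_sq(x):           # the only access the attacker has
    return np.linalg.norm(A @ x) ** 2

# Step 1: query e_i and e_i + e_j to learn all inner products <A_i, A_j>
e = np.eye(n)
col_norms = np.array([sketch_norm_sq(e[i]) for i in range(n)])
G = np.zeros((n, n))             # will hold A^T A (Gram matrix of the columns)
for i in range(n):
    for j in range(n):
        if i == j:
            G[i, j] = col_norms[i]
        else:
            # |A(e_i + e_j)|^2 = |A_i|^2 + |A_j|^2 + 2<A_i, A_j>
            G[i, j] = (sketch_norm_sq(e[i] + e[j]) - col_norms[i] - col_norms[j]) / 2

# Step 2: the kernel of A is the null space of A^T A
eigvals, eigvecs = np.linalg.eigh(G)
x = eigvecs[:, 0]                # eigenvector for the smallest eigenvalue

# Step 3: x has unit norm, yet its sketch is (numerically) zero
print(np.linalg.norm(x), sketch_norm_sq(x))
```

So the sketch reports |Ax|² ≈ 0 on a unit-norm input, a total failure of the norm estimate.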
Benign/natural: monitor traffic using a sketch, re-route traffic based on the output; this affects future inputs.
Adversarial: DoS attack on a network monitoring unit.
Correlations arise in nearly any realistic setting
In this work: Broad impossibility results
Can we thwart the attack?
Can we prove correctness?
Benchmark Problem
GapNorm(B): given x ∈ Rⁿ, decide whether |x|₂ ≥ B (YES) or |x|₂ ≤ 1 (NO)
Easily solvable for B = 1 + ε using the “for each” guarantee, by a JL sketch with O(log n / ε²) rows.
Goal: Show impossibility for very basic problem.
Main Result
Theorem. For every B, given oracle access to a linear sketch of dimension r ≤ n - log(Bn), we can find in time poly(r, B) a distribution over inputs on which the sketch fails to solve GapNorm(B).
Corollary. Same result for any l_p norm.
Corollary. Same result even if the algorithm uses internal randomness on each query.
The attack is efficient (this rules out cryptographic defenses), and even a slightly non-trivial sketching dimension is impossible.
Application to Compressed Sensing
Theorem. No linear sketch with o(n/C²) rows guarantees l₂/l₂ sparse recovery with approximation factor C on a polynomial number of adaptively chosen inputs.
l₂/l₂ recovery: on input x, output x' for which |x - x'|₂ ≤ C · min over k-sparse y of |x - y|₂.
Note: impossible to achieve with a deterministic matrix A, but possible with the “for each” guarantee with r = O(k log(n/k)).
[Gilbert-Hemenway-Strauss-W-Wootters12] has some positive results
Outline
• Proof of Main Theorem for GapNorm
  – Proved using a “Reconstruction Attack”
• Sparse Recovery Result
  – By reduction from GapNorm
  – Not in this talk
Computational Model
Definition (Sketch). An r-dimensional sketch is any function f satisfying f(x) = f(P_U x) for some subspace U of dimension r.
The sketch has unbounded computational power on top of P_U x.
Why?
1. The sketches Ax and Uᵀx are equivalent, where Uᵀ has orthonormal rows and row-span(Uᵀ) = row-span(A)
2. The sketch Uᵀx is equivalent to P_U x = U Uᵀ x
Algorithm (Reconstruction Attack)
Input: oracle access to sketch f using unknown subspace U of dimension r
Put V0 = {0}, the zero-dimensional subspace
For t = 1 to r:
  (Correlation Finding) Find vectors x1, ..., xm weakly correlated with the unknown subspace U, orthogonal to Vt-1
  (Boosting) Find a single vector x strongly correlated with U, orthogonal to Vt-1
  (Progress) Put Vt = span{Vt-1, x}
Output: subspace Vr
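The outer loop can be sketched as follows. Here `find_correlated_vector` is a hypothetical stand-in for the Correlation Finding and Boosting steps: in this toy it simply returns a slightly noisy vector from a simulated U, orthogonalized against the maintained subspace, so only the Progress step is faithful to the algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 8, 3

# Hidden subspace U (orthonormal basis as columns); the attacker never sees it
Ub = np.linalg.qr(rng.normal(size=(n, r)))[0]

def find_correlated_vector(V):
    # Hypothetical stand-in for Correlation Finding + Boosting: a vector
    # strongly (but not perfectly) correlated with U, orthogonal to span(V)
    u = Ub @ rng.normal(size=r)            # a random vector inside U
    u = u + 1e-6 * rng.normal(size=n)      # boosting is only approximate
    u = u - V @ (V.T @ u)                  # orthogonalize against V_{t-1}
    return u / np.linalg.norm(u)

V = np.zeros((n, 0))
for t in range(r):                         # Progress: V grows one dim per round
    x = find_correlated_vector(V)
    V = np.hstack([V, x[:, None]])

# After r rounds, V (essentially) spans all of U
residual = np.linalg.norm(V - Ub @ (Ub.T @ V))
print(V.shape, residual)                   # tiny residual: columns lie almost inside U
```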
Conditional Expectation Lemma
Lemma. Given a d-dimensional sketch f, we can find using poly(d) queries a distribution g such that E[ |g|² | f(g) = 1 ] ≥ E[ |g|² ] + Δ  (“advantage over random”).
Moreover:
1. Δ ≥ poly(1/d)
2. g = N(0, σ)ⁿ for a carefully chosen σ unknown to the sketching algorithm
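The “advantage over random” can be illustrated by Monte Carlo with a toy monotone sketch. The threshold sketch `f` below and the fixed σ = 1 are illustrative assumptions, not the construction from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 20, 5
Ub = np.linalg.qr(rng.normal(size=(n, d)))[0]   # basis of a d-dim subspace U

# Toy d-dimensional sketch: depends on x only through P_U x, monotone in |P_U x|
def f(proj_norms_sq):
    return proj_norms_sq > d

g = rng.normal(size=(200_000, n))               # queries g ~ N(0, 1)^n
proj_norms_sq = ((g @ Ub) ** 2).sum(axis=1)     # |U^T g|^2, chi^2 with d d.o.f.
accepted = f(proj_norms_sq)
norms_sq = (g ** 2).sum(axis=1)

advantage = norms_sq[accepted].mean() - norms_sq.mean()
print(advantage)        # positive: conditioning on f(g) = 1 inflates E|g|^2
```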
Simplification
Fact: If g is Gaussian, then P_U g = U Uᵀ g is Gaussian as well
Hence, can think of query distribution as choosing random Gaussian g to be inside subspace U.
We drop the PU projection operator for notational simplicity.
The three step intuition
(Symmetry) Since the queries are random Gaussian inputs g with an unknown variance, by spherical symmetry the sketch f learns nothing more about the query distribution than the norm |g|.
(Averaging) If |g| is larger than expected, the sketch is “more likely” to output 1.
(Bayes) Hence, by a sort-of Bayes rule, conditioned on f(g) = 1 the expectation of |g| is likely to be larger.
Def. Let p(s) = Pr{ f(y) = 1 } for y in U uniformly random with |y|² = s.
Fact. If g is Gaussian with E|g|² = t, then Pr[ f(g) = 1 ] = ∫ p(s) v_t(s) ds, where v_t is the density of the χ²-distribution with expectation t and d degrees of freedom.
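A quick Monte Carlo check of the scaled χ² fact (here g is sampled directly inside the d-dimensional subspace, i.e. as P_U g, with σ chosen so that E|g|² = dσ² = t; the values of d and t are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
d, t = 6, 2.5
sigma = np.sqrt(t / d)               # so that E|g|^2 = d * sigma^2 = t

g = rng.normal(scale=sigma, size=(500_000, d))   # g restricted to U, dim d
norms_sq = (g ** 2).sum(axis=1)      # scaled chi^2 with d degrees of freedom
print(norms_sq.mean())               # concentrates near t
```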
[Figure: p(s) = Pr( f(y) = 1 ), for y in U uniformly random with |y|² = s, plotted against the norm s, with marks at d/n and Bd/n. By correctness of the sketch, p(s) ≈ 0 for small s and p(s) ≈ 1 for large s; between the thresholds ℓ and r the behavior is unknown.]
Sliding χ²-distributions
• Δ(s) = ∫_ℓ^r (s - t) v_t(s) dt
• Δ(s) < 0 unless s > r - O(r^{1/2} log r)
• ∫₀^∞ Δ(s) ds = ∫₀^∞ ∫_ℓ^r (s - t) v_t(s) dt ds = 0
Averaging Argument
• h(s) = E[ f(g_t) | |g_t|² = s ]
• Correctness: for small s, h(s) ≈ 0, while for large s, h(s) ≈ 1
• ∫₀^∞ h(s) Δ(s) ds = ∫₀^∞ ∫_ℓ^r h(s) (s - t) v_t(s) dt ds ≥ δ
• ∃ t so that ∫₀^∞ h(s) · s · v_t(s) ds ≥ t ∫₀^∞ h(s) v_t(s) ds + δ/(r - ℓ)
• For this t, E[ |g_t|² | f(g_t) = 1 ] ≥ t + Δ
Algorithm (Reconstruction Attack)
Input: oracle access to sketch f using unknown subspace U of dimension r
Put V0 = {0}, the zero-dimensional subspace
For t = 1 to r:
  (Correlation Finding) Find vectors x1, ..., xm weakly correlated with the unknown subspace U, orthogonal to Vt-1
  (Boosting) Find a single vector x strongly correlated with U, orthogonal to Vt-1
  (Progress) Put Vt = Vt-1 + span{x}
Output: subspace Vr
Boosting small correlations
M = the poly(r) × n matrix whose rows are the sampled vectors x1, ..., xm
1. Sample poly(r) vectors using the Conditional Expectation Lemma
2. Compute the top singular vector x of M
Lemma: |P_U x| > 1 - poly(1/r)
Proof: discretization + concentration
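The boosting step can be demonstrated with a toy one-dimensional U (the direction u, the signal level, and the dimensions below are arbitrary): each sample is only weakly correlated with u, but the top right singular vector of the stacked matrix is strongly correlated with it:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 50, 2000
u = rng.normal(size=n)
u /= np.linalg.norm(u)               # hidden direction (stand-in for U)

# Each row of M is weakly correlated with u: small signal plus full-rank noise
eps = 0.2
M = eps * rng.normal(size=(m, 1)) * u + rng.normal(size=(m, n)) / np.sqrt(n)

# Per-row correlation is weak, but the top right singular vector boosts it
row_corr = np.abs(M @ u) / np.linalg.norm(M, axis=1)
x = np.linalg.svd(M, full_matrices=False)[2][0]
print(row_corr.mean(), abs(x @ u))   # weak per row, strong after the SVD
```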
Implementation in poly(r) time
• W.l.o.g. we can assume n = r + O(log nB)
  – Restrict the host space to the first r + O(log nB) coordinates
• The matrix M is now O(r) × poly(r)
• The singular vector computation takes poly(r) time
Iterating previous steps
Generalize the Gaussian to a “subspace Gaussian”: a Gaussian vanishing on the maintained subspace Vt
Intuition: Each step reduces sketch dimension by one.
After r steps:
1. The sketch has no dimensions left!
2. The host space still has n - r ≥ log(nB) dimensions
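A “subspace Gaussian” in this sense can be sampled by projecting a standard Gaussian onto the orthogonal complement of Vt; by rotation invariance the result is again Gaussian. A minimal sketch (toy dimensions):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 10
V = np.linalg.qr(rng.normal(size=(n, 3)))[0]   # maintained subspace V_t (3 dims)

g = rng.normal(size=n)                         # standard Gaussian in R^n
g = g - V @ (V.T @ g)                          # project out V_t

print(np.linalg.norm(V.T @ g))                 # ~ 0: g vanishes on V_t
```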
Problem
The top singular vector is not exactly contained in U; formally, the sketch still has dimension r.
This can be fixed by adding a small amount of Gaussian noise to all coordinates.
Algorithm (Reconstruction Attack)
Input: oracle access to sketch f using unknown subspace U of dimension r
Put V0 = {0}, the zero-dimensional subspace
For t = 1 to r:
  (Correlation Finding) Find vectors x1, ..., xm weakly correlated with the unknown subspace U, orthogonal to Vt-1
  (Boosting) Find a single vector x strongly correlated with U, orthogonal to Vt-1
  (Progress) Put Vt = Vt-1 + span{x}
Output: subspace Vr
Open Problems
• The achievable polynomial dependence is still open
• Optimizing efficiency alone may lead to non-robust algorithms
  – What is the trade-off between robustness and efficiency in various settings of data analysis?