Longest Common Subsequence as Private Search. Payman Mohassel and Mark Gondree (U of Calgary, NPS).
1
Longest Common Subsequence as Private Search
Payman Mohassel and Mark Gondree
U of Calgary NPS
2
Longest Common Subsequence
A = a d e f t c g
B = c e d e t f g
LCS(A,B) = detg
• Strings A, B
  • |A| = n, |B| = m
• LCS(A,B)
  • Subsequence of A
  • Subsequence of B
  • Maximum length
3
Three Variants
• LCS length: 4 (length of an LCS)
• LCS string: “detg” (a string that is an LCS)
• LCS embed: ({2,3,5,7},{3,4,5,7}) (positions of an LCS in A and B)
A = a d e f t c g
B = c e d e t f g
4
Motivation
• Genomic computation
  • Measure of similarity between DNA sequences
• File comparison
  • Used in the diff program
• Many problems related to LCS
  • Edit distance (sequence alignment)
  • Shortest common supersequence
  • Longest increasing subsequence
• Beyond LCS
  • Problems with dynamic programming solutions
  • Problems formulated using DFAs
5
Privacy Concerns
• DNA data is sensitive
  • Patients’ privacy: health, background
• Many institutions hold genomic data
  • Governments
  • Pharmaceutical companies
  • Research institutes
• Privacy rules (e.g. HIPAA)
  • Can lead to fines or legal action
  • Focus on data access and sharing, not computation
6
Secure Two-Party LCS
Alice (input A) and Bob (input B) jointly compute LCS-Len(A,B)
• Atallah et al., WPES 03
• Jha et al., S&P 08
• With implementation
• Using secure computation techniques
• What if we output more than length?
7
Output More Than Length
• LCS string or embedding
  • More than one possible output
  • Can be exponentially many
  • Which one do we output?
• There are privacy concerns
  • Output can work as a “covert channel”
• Private search
  • Beimel et al., STOC 06, CRYPTO 07
  • Output sampling, equivalence protecting algorithms
8
Output Sampling Algorithms
• Choice of output leaks information
  • E.g. the lexicographically first LCS reveals that any string before the output is not an LCS
• Sample solution space randomly
  • Not possible for all problems
  • Stable matching
9
Output Sampling for LCS
• Fill in the LCS length table
• Fill in the LCS counting table
• Devise a randomized backtracking strategy using the tables
10
Step 1: LCS Length
LCS length table, rows “abfcead”, columns “afbcaaad”:

      a  f  b  c  a  a  a  d
  a   1  1  1  1  1  1  1  1
  b   1  1  2  2  2  2  2  2
  f   1  2  2  2  2  2  2  2
  c   1  2  2  3  3  3  3  3
  e   1  2  2  3  3  3  3  3
  a   1  2  2  3  4  4  4  4
  d   1  2  2  3  4  4  4  5

Example entries:
• Length of LCS between “afbcaa” and “abfc”: 3
• Between “afbcaaad” and “abfcead”: 5

• Dynamic programming
• O(mn) time
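The table-filling step can be sketched in a few lines of Python. This is the textbook LCS recurrence, not code from the talk; the function name `lcs_length_table` is chosen here for illustration.

```python
def lcs_length_table(a: str, b: str) -> list[list[int]]:
    """Return L where L[i][j] is the LCS length of a[:i] and b[:j]."""
    n, m = len(a), len(b)
    # Row 0 and column 0 stand for the empty prefix and stay at 0.
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                # Matching characters extend an LCS of the shorter prefixes.
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                # Otherwise drop the last character of one of the prefixes.
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L


table = lcs_length_table("abfcead", "afbcaaad")
print(table[4][6])  # "abfc" vs "afbcaa" -> 3
print(table[7][8])  # "abfcead" vs "afbcaaad" -> 5
```

The two nested loops visit each of the (n+1)(m+1) cells once, which is where the O(mn) bound comes from.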
11
Step 2: LCS Counting

LCS counting table (number of distinct LCS strings), rows “abfcead”, columns “afbcaaad”:

      a  f  b  c  a  a  a  d
  a   1  1  1  1  1  1  1  1
  b   1  1  1  1  1  1  1  1
  f   1  1  2  2  2  2  2  2
  c   1  1  2  2  2  2  2  2
  e   1  1  2  2  2  2  2  2
  a   1  1  2  2  2  2  2  2
  d   1  1  2  2  2  2  2  2

Example entries:
• Number of LCS strings between “afbcaa” and “abfce”: 2
• Between “afbcaaad” and “abfcead”: 2

• Greenberg, 2003
• O(mn log S) time
• S = number of solutions
• Different for string and embed
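A minimal sketch of the counting table for distinct LCS strings follows. It uses an inclusion-exclusion recurrence of the kind described for this problem in the literature; the function name `count_lcs_strings` is an assumption made here, and the embedding count needs a different recurrence that this sketch does not cover.

```python
def count_lcs_strings(a: str, b: str) -> list[list[int]]:
    """Return D where D[i][j] is the number of distinct LCS strings of a[:i], b[:j]."""
    n, m = len(a), len(b)
    L = [[0] * (m + 1) for _ in range(n + 1)]  # LCS length table
    D = [[1] * (m + 1) for _ in range(n + 1)]  # empty prefix: only the empty string
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                # Every LCS ends with this character; strip it off.
                L[i][j] = L[i - 1][j - 1] + 1
                D[i][j] = D[i - 1][j - 1]
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
                total = 0
                if L[i - 1][j] == L[i][j]:
                    total += D[i - 1][j]
                if L[i][j - 1] == L[i][j]:
                    total += D[i][j - 1]
                if L[i - 1][j - 1] == L[i][j]:
                    # Strings counted by both neighbours were added twice.
                    total -= D[i - 1][j - 1]
                D[i][j] = total
    return D


D = count_lcs_strings("abfcead", "afbcaaad")
print(D[5][6])  # "abfce" vs "afbcaa" -> 2
print(D[7][8])  # "abfcead" vs "afbcaaad" -> 2
```

Each cell holds an integer as large as S, so arithmetic on a cell costs about log S bit operations. That is one way to read the extra log S factor in the running time.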
12
Step 3: Randomized Backtracking
• Backtracking on the counting table
  • Create multiple choices at each step
• To get a uniformly random solution
  • Cover all possible solutions
  • No solution overlap between choices
  • Weighted coin-toss based on number of solutions in each choice
• LCS embed strategy: at (i,j)
  • A[i] and B[j] are in LCS
  • A[i] in LCS, B[j] not
  • B[j] in LCS, A[i] not
  • Neither one in LCS
• Different strategy for LCS string
• O(mn log S) time
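The idea of disjoint choices plus a weighted coin-toss can be illustrated with a simplified sampler for LCS embeddings. This is a sketch under stated assumptions, not the strategy from the talk. It splits the solutions at (i,j) into "A[i] unused" and "A[i] matched to B[k]" for each feasible k, rather than the four-way split listed above, and it runs in O(nm²) time rather than O(mn log S). All function names are hypothetical.

```python
import random


def lcs_length_table(a, b):
    n, m = len(a), len(b)
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L


def choices_at(a, b, L, E, i, j):
    """Disjoint choices at (i, j) as (weight, k); k is None when A[i] is unused."""
    out = []
    if L[i - 1][j] == L[i][j]:
        out.append((E[i - 1][j], None))
    for k in range(1, j + 1):
        if b[k - 1] == a[i - 1] and L[i - 1][k - 1] + 1 == L[i][j]:
            out.append((E[i - 1][k - 1], k))  # A[i] is matched to B[k]
    return out


def embedding_count_table(a, b, L):
    """E[i][j] = number of maximum-length embeddings within a[:i], b[:j]."""
    n, m = len(a), len(b)
    E = [[1] * (m + 1) for _ in range(n + 1)]  # empty prefix: one empty embedding
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = sum(w for w, _ in choices_at(a, b, L, E, i, j))
    return E


def sample_embedding(a, b, rng=random):
    """Return 1-based positions (in a, in b) of a uniformly random LCS embedding."""
    L = lcs_length_table(a, b)
    E = embedding_count_table(a, b, L)
    i, j = len(a), len(b)
    pos_a, pos_b = [], []
    while L[i][j] > 0:
        r = rng.randrange(E[i][j])  # weighted coin-toss over the choices
        for w, k in choices_at(a, b, L, E, i, j):
            if r < w:
                break
            r -= w
        if k is None:
            i -= 1
        else:
            pos_a.append(i)
            pos_b.append(k)
            i, j = i - 1, k - 1
    return pos_a[::-1], pos_b[::-1]


print(sample_embedding("adeftcg", "cedetfg"))
```

For A = adeftcg and B = cedetfg there are two maximum-length embeddings, ({2,3,5,7},{3,4,5,7}) and ({2,3,4,7},{3,4,6,7}), and each call returns one of them with probability 1/2. Python's `random` module is used only for illustration and is not a cryptographic source of randomness.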
13
Equivalence Protecting Algorithms
• What if the protocol is run multiple times?
  • Leaks additional information
• Protect privacy of inputs with the same set of solutions
14
Equivalence Classes
• LCS-Str(A,B): set of all LCS strings between A and B
• (A1,B1) ≈ (A2,B2) when LCS-Str(A1,B1) = LCS-Str(A2,B2)
• (A3,B3) ≉ (A2,B2) when LCS-Str(A3,B3) ≠ LCS-Str(A2,B2)
15
Equivalence Protecting Algorithms
• Samples LCS-Str(A,B) randomly
• Sampling is seeded
• For every (C,D) ≈ (A,B) use the same seed
Diagram: (A,B) and (C,D) → output1; (E,F) → output2
16
A Generic Construction
• Compute (Arep,Brep)
  • Representative of (A,B)’s equivalence class
  • Canonical representative algorithm
• Compute the seed s = F(Arep,Brep)
• Run an output sampling algorithm on (Arep,Brep) using s for randomness
Diagram (equivalence protecting algorithm): (A,B) → Canonical representative → (Arep,Brep); (Arep,Brep) → F → s; (Arep,Brep), s → Output sampling → Output
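The shape of the generic construction can be illustrated with a toy Python version. Several things here are assumptions made for the sketch and do not come from the talk: the canonical representative is simply the sorted set LCS-Str(A,B) (which can be exponentially large, so this is for illustration only), F is instantiated with HMAC-SHA256 under a hypothetical secret `key`, and the names `lcs_strings` and `equivalence_protecting_lcs` are invented.

```python
import hashlib
import hmac


def lcs_strings(a: str, b: str) -> set[str]:
    """All distinct LCS strings of a and b (may be exponentially many)."""
    prev = [{""} for _ in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [{""}]
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                s = {w + a[i - 1] for w in prev[j - 1]}
            else:
                up, left = prev[j], cur[j - 1]
                # All strings in one cell share the same length.
                len_up, len_left = len(next(iter(up))), len(next(iter(left)))
                if len_up > len_left:
                    s = up
                elif len_left > len_up:
                    s = left
                else:
                    s = up | left
            cur.append(s)
        prev = cur
    return prev[len(b)]


def equivalence_protecting_lcs(a: str, b: str, key: bytes) -> str:
    # Stand-in canonical representative: the sorted solution set itself.
    rep = sorted(lcs_strings(a, b))
    # Seed s = F(rep); F is HMAC-SHA256 in this sketch.
    seed = hmac.new(key, "\x00".join(rep).encode(), hashlib.sha256).digest()
    # Seeded output sampling: the seed picks one solution
    # (the modulo bias is negligible for a 256-bit value).
    return rep[int.from_bytes(seed, "big") % len(rep)]


key = b"hypothetical secret key"
print(equivalence_protecting_lcs("adeftcg", "cedetfg", key))
print(equivalence_protecting_lcs("cedetfg", "adeftcg", key))  # same output
```

The two calls use different inputs with the same solution set {“defg”, “detg”}, so they derive the same seed and return the same string. Repeated runs on equivalent inputs therefore reveal nothing beyond a single sample.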
17
Equivalence Protecting Algorithms
• Problem reduced to canonical representative algorithms
• LCS embed: direct solution
• LCS string: via reduction to DFAs
18
Step 1
(A,B) → LCS-to-DFA → MA,B
• MA,B is a DFA of size O(mn)
• L(MA,B) = LCS-Str(A,B)
• MA generates all subsequences of A
• MB generates all subsequences of B
• Combine and prune the DFAs
• Runs in O(mn) time
Equivalence protecting algorithm that outputs a word in L(MA,B)
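One way to realize "combine and prune" is sketched below, assuming a product of the two subsequence automata whose states are position pairs (i, j). A transition is kept only when it lies on a path of full LCS length. The function names and the pruning test via a suffix LCS table are choices made here, and this simple version does not attempt to meet the O(mn) time bound.

```python
def lcs_to_dfa(a: str, b: str):
    """Return (start, delta, accepting) for a DFA whose language is LCS-Str(a, b).

    A state (i, j) means: i characters of a and j characters of b are consumed.
    """
    n, m = len(a), len(b)
    # R[i][j] = LCS length of the suffixes a[i:] and b[j:].
    R = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                R[i][j] = R[i + 1][j + 1] + 1
            else:
                R[i][j] = max(R[i + 1][j], R[i][j + 1])
    start = (0, 0)
    delta, accepting = {}, set()
    seen, stack = {start}, [start]
    while stack:
        i, j = stack.pop()
        if R[i][j] == 0:
            accepting.add((i, j))  # nothing left to match: a full LCS was read
            continue
        for c in set(a[i:]) & set(b[j:]):
            # Subsequence-automaton step: jump past the next occurrence of c.
            nxt = (a.index(c, i) + 1, b.index(c, j) + 1)
            # Prune: keep the edge only if it stays on a maximum-length path.
            if R[nxt[0]][nxt[1]] + 1 == R[i][j]:
                delta[(i, j), c] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return start, delta, accepting


def accepts(dfa, word: str) -> bool:
    start, delta, accepting = dfa
    q = start
    for c in word:
        if (q, c) not in delta:
            return False
        q = delta[q, c]
    return q in accepting


dfa = lcs_to_dfa("adeftcg", "cedetfg")
print(accepts(dfa, "detg"), accepts(dfa, "defg"), accepts(dfa, "det"))
```

Since every state is a pair of positions, the automaton has at most (n+1)(m+1) states, in line with the O(mn) size stated above.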
19
Step 2
MA,B → DFA-Rep → Mrep
• For M, M’ where L(M) = L(M’): DFA-Rep(M) = DFA-Rep(M’)
• L(MA,B) = L(Mrep)
• Generate a minimal-state DFA, Mrep
• Myhill-Nerode theorem: Mrep is unique
• Runs in O(mn) time
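For an acyclic DFA such as MA,B, a canonical minimal form can be sketched by merging states with identical behaviour bottom-up and then renumbering the result in a fixed order. The sketch assumes the input is acyclic and has no dead states; the function name `dfa_rep` and the dictionary-based encoding are illustrative choices, not the authors' implementation.

```python
from collections import deque


def dfa_rep(start, delta, accepting):
    """Canonical minimal form of an acyclic DFA without dead states.

    delta maps state -> {symbol: next_state}. Returns (transitions, final) with
    states renumbered 0, 1, 2, ... in breadth-first order over sorted symbols,
    so DFAs with the same language yield identical results.
    """
    register = {}  # behaviour signature -> class id
    cls = {}       # original state -> class id

    def classify(q):
        if q not in cls:
            out = tuple(sorted((c, classify(t)) for c, t in delta.get(q, {}).items()))
            key = (q in accepting, out)
            cls[q] = register.setdefault(key, len(register))
        return cls[q]

    root = classify(start)
    # Transitions between merged classes.
    by_class = {}
    for q, k in cls.items():
        by_class[k] = (q in accepting, {c: cls[t] for c, t in delta.get(q, {}).items()})

    # Canonical numbering: breadth-first from the start class, symbols in sorted order.
    number = {root: 0}
    queue = deque([root])
    transitions, final = {}, set()
    while queue:
        k = queue.popleft()
        is_final, edges = by_class[k]
        if is_final:
            final.add(number[k])
        for c in sorted(edges):
            t = edges[c]
            if t not in number:
                number[t] = len(number)
                queue.append(t)
            transitions[(number[k], c)] = number[t]
    return transitions, final


# Two different DFAs for the language {"ab", "cb"}.
m1 = dfa_rep(0, {0: {"a": 1, "c": 2}, 1: {"b": 3}, 2: {"b": 4}}, {3, 4})
m2 = dfa_rep("s", {"s": {"a": "m", "c": "m"}, "m": {"b": "e"}}, {"e"})
print(m1 == m2)  # True
```

Equal outputs for equal languages are what the seed computation relies on: F is applied to this canonical form, so equivalent inputs yield the same seed.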
20
Equivalence Protecting Algorithm for LCS String
(A,B) → LCS-to-DFA → MA,B → DFA-Rep → Mrep
Mrep → F → seed
Mrep, seed → Sample a word in L(Mrep) → output
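The last box of the pipeline, sampling a word in L(Mrep) from a seed, can be sketched as a weighted walk through the DFA. Assumptions in this sketch: the DFA is acyclic, has a non-empty language and no dead states, and is encoded as a `(state, symbol) -> state` dictionary. Python's `random.Random` stands in for a cryptographic pseudorandom generator, and `sample_word` is a hypothetical name.

```python
import random


def sample_word(start, transitions, final, seed):
    """Return a uniformly distributed word of an acyclic DFA, determined by seed."""
    out = {}
    for (q, c), t in transitions.items():
        out.setdefault(q, []).append((c, t))

    count = {}

    def words_from(q):
        # Number of accepted words readable from state q.
        if q not in count:
            count[q] = (q in final) + sum(words_from(t) for _, t in out.get(q, []))
        return count[q]

    rng = random.Random(seed)
    q, word = start, []
    while True:
        r = rng.randrange(words_from(q))
        if q in final:
            if r == 0:
                return "".join(word)  # stop here with probability 1 / words_from(q)
            r -= 1
        # Follow an edge with probability proportional to the words behind it.
        for c, t in sorted(out.get(q, [])):
            if r < words_from(t):
                word.append(c)
                q = t
                break
            r -= words_from(t)


# Minimal DFA for {"defg", "detg"}, the LCS strings of "adeftcg" and "cedetfg".
transitions = {(0, "d"): 1, (1, "e"): 2, (2, "f"): 3, (2, "t"): 3, (3, "g"): 4}
print(sample_word(0, transitions, {4}, seed=2024))
print(sample_word(0, transitions, {4}, seed=2024))  # same seed, same word
```

Because the walk is driven only by the canonical DFA and the seed, any two inputs in the same equivalence class produce the same output word.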
21
Summary
• Private search algorithms for LCS
  • LCS string
  • LCS embed
• Matching best counting algorithms
  • O(mn log S) time
• Can be implemented in a variety of settings
• Private search algorithms for
  • DFA represented functionalities
  • Problems with D.P. solutions
22
Open Problems
• Privacy
  • Alternative privacy definitions
  • The setting can affect privacy concerns
  • Take into account input distribution
• Algorithms
  • Better private search for LCS
  • Computationally indistinguishable from uniform
  • Better counting for LCS